Updates, release notes, and technical write-ups about RawCull.
Blog
Release Notes
RawCull version release notes and changelogs.
Version 1.4.0
Version 1.4.0 - April 7, 2026
RawCull now supports a broader range of Sony full-frame camera bodies. See the table below for the bodies tested so far.
RawCull Changes Summary (v1.3.5 → v1.4.0)
- Sharpness scoring was significantly upgraded: added saliency-aware scoring, optional subject classification, subject-based filtering, configurable scoring parameters, and improved normalization/progress handling.
- New UI controls and views were added around scoring and stats:
  - new Scoring Parameters sheet
  - new Scan Summary/Statistics sheet
  - toolbar toggles for score badge and saliency badge
  - expanded culling/rating status display (including per-star counts)
- EXIF/metadata support expanded: now includes RAW compression type, size class (L/M/S), and pixel dimensions; these are surfaced in inspector UI.
- Sony compatibility work: maker-note parser now supports more Sony bodies and adds robust diagnostics/full-file fallback for focus metadata extraction.
- Performance/state management improvements: rating/tag caching for O(1) lookups, better cancellation behavior in scoring, memory warning refinements (soft/full), and various cleanup/refactors.
- Persistence/settings updates: settings now store badge visibility toggles and decode safely with defaults.
- Testing expanded with a large new ARW body compatibility diagnostic test suite plus concurrency/sendability test adjustments.
ARW body compatibility diagnostic
The following Sony bodies successfully extract EXIF, focus points, sharpness, and saliency, except for the ILCE-7RM5, which failed to extract saliency on one of its three files. The ILCE-1M2 is the only body tested across all three Sony RAW size variants (S/M/L). All files use compressed RAW, and every body was tested at its full-resolution L size; overall, resolutions range from 12.4 MP (ILCE-1M2, S size) to 60.2 MP (ILCE-7RM5). The ILCE-7M5 and ILCE-7RM5 are the next bodies to focus on, but I need more test ARW files to test them properly before officially declaring support for these two bodies.
| Camera Body | EXIF | FocusPt | Sharpness | Saliency | RAW Types | Dimensions |
|---|---|---|---|---|---|---|
| ILCE-1M2 | ✅ | ✅ | ✅ | ✅ | Compressed | 4320 × 2880 (12.4 MP, S), 5616 × 3744 (21.0 MP, M), 8640 × 5760 (49.8 MP, L) |
| ILCE-1 | ✅ | ✅ | ✅ | ✅ | Compressed | 8640 × 5760 (49.8 MP, L) |
| ILCE-7M5 | ✅ | ✅ | ✅ | ✅ | Compressed | 7008 × 4672 (32.7 MP, L) |
| ILCE-7RM5 | ✅ | ✅ | ✅ | ✅ | Compressed | 9504 × 6336 (60.2 MP, L) |
| ILCE-9M3 | ✅ | ✅ | ✅ | ✅ | Compressed | 6000 × 4000 (24.0 MP, L) |
Version 1.4.1
Version 1.4.1 - April 7, 2026
RawCull now supports a broader range of Sony full-frame camera bodies. See the table below for the bodies tested so far.
RawCull Changes Summary (v1.4.0 → v1.4.1)
Version 1.4.1 fixes an issue with compressed ARW files from the Sony A7V (ILCE-7M5).
ARW body compatibility diagnostic
The following Sony bodies successfully extract EXIF, focus points, sharpness, and saliency, except for the ILCE-7RM5, which failed to extract saliency on one of its three files. The ILCE-1M2 is the only body tested across all three Sony RAW size variants (S/M/L). All files use compressed RAW, and every body was tested at its full-resolution L size; overall, resolutions range from 12.4 MP (ILCE-1M2, S size) to 60.2 MP (ILCE-7RM5). The ILCE-7M5 and ILCE-7RM5 are the next bodies to focus on, but I need more test ARW files to test them properly before officially declaring support for these two bodies.
| Camera Body | EXIF | FocusPt | Sharpness | Saliency | RAW Types | Dimensions |
|---|---|---|---|---|---|---|
| ILCE-1M2 | ✅ | ✅ | ✅ | ✅ | Compressed | 4320 × 2880 (12.4 MP, S), 5616 × 3744 (21.0 MP, M), 8640 × 5760 (49.8 MP, L) |
| ILCE-1 | ✅ | ✅ | ✅ | ✅ | Compressed | 8640 × 5760 (49.8 MP, L) |
| ILCE-7M5 | ✅ | ✅ | ✅ | ✅ | Compressed | 7008 × 4672 (32.7 MP, L) |
| ILCE-7RM5 | ✅ | ✅ | ✅ | ✅ | Compressed | 9504 × 6336 (60.2 MP, L) |
| ILCE-9M3 | ✅ | ✅ | ✅ | ✅ | Compressed | 6000 × 4000 (24.0 MP, L) |
Version 1.3.5
Version 1.3.5 - April 4, 2026
April 4, 2026: RawCull appears to support a broader range of Sony full-frame camera bodies. I have tested ARW files from the Sony A7RV and the newly released Sony A7V. After Easter, I plan to obtain more full-frame ARW files from other Sony A bodies for further testing. Once those tests are complete, I will publish a full list of the Sony bodies that RawCull supports.
| Camera Body | Status |
|---|---|
| Sony A1 mk I and mk II | Verified |
| Sony A7RV | Tested and seems to work |
| Sony A7V | Tested and seems to work |
| Sony A7S mk III | Not verified |
| Sony A9 mk III | Not verified |
The culling process now supports a two-step approach, and users can choose the culling method they prefer. The primary culling view is the grid view. Users can still start rating immediately, but two new keystrokes, X for Reject and P for Pick, provide a binary selection. Rejected items are still marked in red.
After rejecting and picking, users can select all Picks (P) and continue by rating them from 2 to 5.
Additionally, P and X can be applied automatically after Sharpness Scoring: once scoring has finished, users can have picks and rejects rated automatically. The Focus Mask has also received a minor update.
After each rating, the selection automatically advances to the next thumbnail for efficient culling.
Version 1.3.3
Version 1.3.3 - April 2, 2026
RawCull is developed for culling Sony A1 mk I and mk II ARW files. However, I have recently tested RawCull on the Sony A7RV and the new Sony A7V, and all functions appear to work correctly on these models as well.
In this release, RawCull supports culling by color. Please refer to the Culling section in the documentation for further details. Additionally, updates have been made to the Focus Mask and pre-calibration in Sharpness Scoring.
When a thumbnail is selected and the user switches to Grid View, the Grid View automatically navigates to the selected image. A progress bar has also been added to the sharpness scoring function.
Double-clicking a thumbnail in Grid View opens the Zoom View, using either the extracted JPG file or the newly created thumbnail.
Version 1.2.8
Version 1.2.8 - March 30, 2026
RawCull is developed for culling Sony A1 mk I and mk II ARW files. However, I have recently tested RawCull on the Sony A7RV and the new Sony A7V, and all functions appear to work correctly on these models as well.
This release includes code cleanup and several minor bug fixes. A focus issue was also fixed: it previously took two clicks on the vertical or horizontal thumbnail row before the arrow keys could be used for navigation and tagging; now a single click is enough. In addition, the tag command is now a single keypress, t, which toggles the tag on or off.
Technical Deep Dives
Technical articles about RawCull’s implementation, architecture, and advanced concepts.
Swift Concurrency in RawCull
A summarized document about Concurrency in RawCull.
1 Why Concurrency Matters in RawCull
RawCull is a macOS photo-culling application that works with Sony A1 ARW raw files. A single RAW file from the A1 can be 50–80 MB. When you open a folder with hundreds of shots, the app must scan metadata, extract embedded JPEG previews, decode thumbnails, and manage a multi-gigabyte in-memory cache — all while keeping the UI perfectly fluid and responsive at 60 fps. Without concurrency that would be impossible.
RawCull is written in Swift 6, which has strict concurrency checking enabled by default. This means the compiler itself verifies thread safety at compile time. The project makes heavy use of Swift’s structured concurrency model: actors, async/await, task groups, and the MainActor.
Swift 6: Strict concurrency checking turns data-race warnings into hard compiler errors. Every type that crosses a concurrency boundary must be Sendable, and every mutable shared state must be isolated to an actor.
2 async / await — The Foundation
async/await is the cornerstone of Swift’s structured concurrency model, introduced in Swift 5.5 (WWDC 2021). An async function can suspend itself — yielding the underlying thread to other work — then resume where it left off when the result is ready. Unlike Grand Central Dispatch callbacks, the code reads top-to-bottom like ordinary synchronous code, which makes it far easier to reason about.
How it looks
// A normal synchronous function — blocks the calling thread the entire time
func loadImageBlocking(url: URL) -> NSImage? { ... }
// An async function — suspends while waiting; doesn't block any thread
func loadImageAsync(url: URL) async -> NSImage? {
// 'await' means: "pause here and let other work run until I'm done"
let data = await fetchDataFromDisk(url: url)
return NSImage(data: data)
}
// Calling an async function — you must also be in an async context
func showImage() async {
let image = await loadImageAsync(url: someURL) // suspends here
updateUI(image) // resumes on same actor
}
In RawCull, virtually every file-loading, cache-lookup, and thumbnail-generation operation is async. This keeps the main thread (and therefore the UI) always free.
3 Actors — Thread-Safe Isolated State
An actor is a reference type (like a class) that protects its mutable state with automatic mutual exclusion. Only one caller can execute inside an actor at a time. You don’t need locks, dispatch queues, or semaphores — the Swift runtime enforces the isolation. If you try to read an actor’s property from outside without await, the compiler refuses to compile.
The rule in one sentence: Every mutable stored property of an actor can be read and written only from code isolated to that actor. All other callers must await a method call to hop onto the actor.
RawCull’s actors at a glance
| Actor | File | Responsibility |
|---|---|---|
| ScanFiles | Actors/ScanFiles.swift | Scans a folder for ARW files, reads EXIF, extracts focus points |
| ScanAndCreateThumbnails | Actors/ScanAndCreateThumbnails.swift | Orchestrates bulk thumbnail creation with a concurrent task group |
| RequestThumbnail | Actors/RequestThumbnail.swift | On-demand thumbnail resolver (RAM → disk → extract) |
| ThumbnailLoader | Actors/ThumbnailLoader.swift | Rate-limits concurrent thumbnail requests using continuations |
| DiskCacheManager | Actors/DiskCacheManager.swift | Reads and writes JPEG thumbnails to/from the on-disk cache |
| SharedMemoryCache | Actors/SharedMemoryCache.swift | Singleton wrapping NSCache; manages memory pressure and config |
| ExtractAndSaveJPGs | Actors/ExtractAndSaveJPGs.swift | Extracts full-resolution JPEGs from ARW files in parallel |
| DiscoverFiles | Actors/DiscoverFiles.swift | Recursively enumerates .arw files in a directory |
| ActorCreateOutputforView | Actors/ActorCreateOutputforView.swift | Converts rsync output strings to RsyncOutputData structs |
A minimal actor example from the project
// From Actors/DiscoverFiles.swift
actor DiscoverFiles {
// @concurrent tells Swift: run this method on the cooperative thread pool,
// not on the actor's serial queue. Safe because the method only uses
// local variables — no actor state is touched.
@concurrent
nonisolated func discoverFiles(at catalogURL: URL, recursive: Bool) async -> [URL] {
await Task {
let supported: Set<String> = [SupportedFileType.arw.rawValue]
let fileManager = FileManager.default
var urls: [URL] = []
guard let enumerator = fileManager.enumerator(
at: catalogURL,
includingPropertiesForKeys: [.isRegularFileKey],
options: recursive ? [] : [.skipsSubdirectoryDescendants]
) else { return urls }
while let fileURL = enumerator.nextObject() as? URL {
if supported.contains(fileURL.pathExtension.lowercased()) {
urls.append(fileURL)
}
}
return urls
}.value
}
}
discoverFiles is both nonisolated and @concurrent. Because it never reads or writes any property of the actor, it does not need to run on the actor’s serial queue — Swift can run it on any available thread in the cooperative pool, improving throughput.
4 @MainActor — Protecting the UI Thread
The main thread in a macOS/iOS app is special: all UI rendering must happen there. Swift’s @MainActor annotation is a global actor that ensures any code it annotates runs exclusively on the main thread. This replaces the old pattern of DispatchQueue.main.async { ... } with something the compiler can verify.
RawCullViewModel — the whole class lives on @MainActor
// From Model/ViewModels/RawCullViewModel.swift
@Observable @MainActor // <-- every property and method is main-thread only
final class RawCullViewModel {
var files: [FileItem] = [] // Safe: only touched on main thread
var filteredFiles: [FileItem] = [] // Safe: same
var creatingthumbnails: Bool = false // Drives UI animations
func handleSourceChange(url: URL) async {
// 'async' lets the function suspend while waiting for actor work,
// but it always starts and ends on the main thread (because of @MainActor)
scanning = true
let scan = ScanFiles() // Create a ScanFiles actor
files = await scan.scanFiles(url: url) // Hop to ScanFiles actor, wait, return
// Back on main thread here — safe to update UI
scanning = false
}
}
When handleSourceChange calls await scan.scanFiles(...), the function suspends and releases the main thread. The thread is not blocked; it keeps processing other UI events. When the scan is done, Swift automatically resumes the function on the main actor before assigning to files. This is the key insight: @MainActor + async/await means you never have to manually dispatch back to the main thread.
ExecuteCopyFiles — another @MainActor class
// From Model/ParametersRsync/ExecuteCopyFiles.swift
@Observable @MainActor
final class ExecuteCopyFiles {
private func handleProcessTermination(
stringoutputfromrsync: [String]?,
hiddenID: Int?
) async {
let viewOutput = await ActorCreateOutputforView()
.createOutputForView(stringoutputfromrsync)
let result = CopyDataResult(output: stringoutputfromrsync,
viewOutput: viewOutput,
linesCount: stringoutputfromrsync?.count ?? 0)
onCompletion?(result)
// Ensure completion handler finishes before cleaning up resources
try? await Task.sleep(for: .milliseconds(10))
cleanup()
}
}
Why the sleep? The brief Task.sleep before cleanup() is an intentional concurrency fix. Without it, there was a race condition: the security-scoped resource access could be released before the onCompletion callback had finished using it.
Crossing the boundary with MainActor.run
// From Model/ViewModels/SettingsViewModel.swift
// nonisolated means this is accessible without an actor hop,
// but to safely READ the @Observable properties we still need
// to jump to the MainActor for just a moment.
nonisolated func asyncgetsettings() async -> SavedSettings {
await MainActor.run { // Hop to main thread, read, return
SavedSettings(
memoryCacheSizeMB: self.memoryCacheSizeMB,
thumbnailSizeGrid: self.thumbnailSizeGrid,
thumbnailSizePreview: self.thumbnailSizePreview,
thumbnailSizeFullSize: self.thumbnailSizeFullSize,
thumbnailCostPerPixel: self.thumbnailCostPerPixel,
thumbnailSizeGridView: self.thumbnailSizeGridView,
useThumbnailAsZoomPreview: self.useThumbnailAsZoomPreview
)
} // Back to the calling actor with a Sendable value type
}
This pattern — nonisolated async function + MainActor.run — is the standard way to safely read @Observable (main-thread) properties from background actors. SavedSettings is a plain Codable struct (a value type), so it is Sendable and safe to return across the actor boundary.
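The snapshot pattern can be tried in isolation with a small, self-contained sketch. The names DemoSettingsModel, SettingsSnapshot, and CacheConfigurator are hypothetical stand-ins, not RawCull types, and the default values are made up for illustration.

```swift
import Foundation

// Hypothetical stand-in for SavedSettings: a value type with Sendable fields.
struct SettingsSnapshot: Sendable, Equatable {
    let memoryCacheSizeMB: Int
    let thumbnailSizeGrid: Int
}

// Hypothetical stand-in for a main-actor view model.
@MainActor
final class DemoSettingsModel {
    var memoryCacheSizeMB = 5000
    var thumbnailSizeGrid = 100

    // Callable from any context; hops to the main actor only for the read.
    nonisolated func snapshot() async -> SettingsSnapshot {
        await MainActor.run {
            SettingsSnapshot(
                memoryCacheSizeMB: self.memoryCacheSizeMB,
                thumbnailSizeGrid: self.thumbnailSizeGrid
            )
        }
    }
}

// A background actor that needs settings but never touches UI state directly.
actor CacheConfigurator {
    func cacheLimitBytes(from model: DemoSettingsModel) async -> Int {
        let settings = await model.snapshot()
        return settings.memoryCacheSizeMB * 1024 * 1024
    }
}
```

Only the immutable snapshot crosses the actor boundary, so the background actor never holds a reference into mutable UI state.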
5 Task Groups — Parallel File Processing
When you have a collection of independent items to process, such as hundreds of RAW files, you want to process them in parallel, not one by one. Swift’s withTaskGroup (and its throwing counterpart withThrowingTaskGroup) lets you spawn many child tasks and collect their results. The group itself does not cap how many child tasks you add: the cooperative thread pool only bounds how many execute at the same instant, so any limit on tasks in flight has to be applied by the caller.
Thumbnail preloading with withTaskGroup
// From Actors/ScanAndCreateThumbnails.swift
func preloadCatalog(at catalogURL: URL, targetSize: Int) async -> Int {
await ensureReady()
cancelPreload() // Cancel any ongoing previous preload
let task = Task<Int, Never> {
successCount = 0
let urls = await DiscoverFiles().discoverFiles(at: catalogURL, recursive: false)
totalFilesToProcess = urls.count
return await withTaskGroup(of: Void.self) { group in
// Allow up to (CPU cores × 2) concurrent thumbnail jobs
let maxConcurrent = ProcessInfo.processInfo.activeProcessorCount * 2
for (index, url) in urls.enumerated() {
if Task.isCancelled {
group.cancelAll() // Propagate cancellation to child tasks
break
}
// Once we've queued maxConcurrent tasks, wait for one to finish
// before adding more — this is backpressure / throttling
if index >= maxConcurrent {
await group.next()
}
group.addTask {
await self.processSingleFile(url, targetSize: targetSize, itemIndex: index)
}
}
await group.waitForAll()
return successCount
}
}
preloadTask = task // Store so we can cancel it later
return await task.value
}
The maxConcurrent throttle is important: if you queued 2,000 tasks at once, Swift would create 2,000 concurrent tasks competing for CPU and disk I/O. Instead, RawCull keeps at most (active CPU cores × 2) tasks in flight at any one time. When one finishes (await group.next()), the loop adds the next one.
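The throttling idea can be factored into a small generic helper. This is a sketch under assumptions, not code from RawCull: forEachThrottled and its parameter names are hypothetical.

```swift
// Hypothetical helper: runs `work` over `items` with at most `limit`
// child tasks in flight at any time.
func forEachThrottled<Item: Sendable>(
    _ items: [Item],
    limit: Int,
    work: @escaping @Sendable (Item) async -> Void
) async {
    precondition(limit > 0, "limit must be positive")
    await withTaskGroup(of: Void.self) { group in
        for (index, item) in items.enumerated() {
            // Stop adding work once the surrounding task is cancelled.
            if Task.isCancelled { break }
            // The window is full: wait for one child to finish first.
            if index >= limit {
                _ = await group.next()
            }
            group.addTask { await work(item) }
        }
        // Drain the children that are still running.
        await group.waitForAll()
    }
}
```

Because each group.next() returns only after one child has fully completed, the number of added-but-unfinished children never exceeds limit.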
Parallel focus-point extraction in ScanFiles
// From Actors/ScanFiles.swift
private func extractNativeFocusPoints(from items: [FileItem]) async -> [DecodeFocusPoints]? {
let collected = await withTaskGroup(of: DecodeFocusPoints?.self) { group in
for item in items {
group.addTask {
// SonyMakerNoteParser.focusLocation is a pure function — no shared state
guard let location = SonyMakerNoteParser.focusLocation(from: item.url)
else { return nil }
return DecodeFocusPoints(
sourceFile: item.url.lastPathComponent,
focusLocation: location
)
}
}
// Collect results as tasks complete (order not guaranteed)
var results: [DecodeFocusPoints] = []
for await result in group {
if let r = result { results.append(r) }
}
return results
}
return collected.isEmpty ? nil : collected
}
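Because a task group yields results in completion order, callers that need input order have to restore it themselves. One common approach, sketched here with the hypothetical helper mapConcurrently, is to tag each result with its input index and sort at the end.

```swift
// Hypothetical helper: concurrent map that returns results in input order.
func mapConcurrently<Input: Sendable, Output: Sendable>(
    _ inputs: [Input],
    transform: @escaping @Sendable (Input) async -> Output
) async -> [Output] {
    await withTaskGroup(of: (Int, Output).self) { group in
        for (index, input) in inputs.enumerated() {
            // Each child returns its original position alongside the result.
            group.addTask { (index, await transform(input)) }
        }
        var indexed: [(Int, Output)] = []
        indexed.reserveCapacity(inputs.count)
        for await pair in group {
            indexed.append(pair) // arrives in completion order
        }
        // Restore input order before handing results back.
        return indexed.sorted { $0.0 < $1.0 }.map { $0.1 }
    }
}
```

For focus points keyed by file name the order may not matter, which is presumably why the code above simply appends results as they arrive.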
6 Task Cancellation — Cooperative, Not Forceful
Swift concurrency uses cooperative cancellation. You cannot forcefully kill a Task; instead, you call task.cancel() to set a cancellation flag, and the task’s code must periodically check Task.isCancelled and stop voluntarily. This is the correct pattern: clean shutdown instead of dangling resources.
How RawCull cancels thumbnail preloading
// From Model/ViewModels/RawCullViewModel.swift
func abort() {
// 1. Cancel the outer Task wrapper
preloadTask?.cancel()
preloadTask = nil
// 2. Tell the actor to cancel its internal Task too
if let actor = currentScanAndCreateThumbnailsActor {
Task { await actor.cancelPreload() }
}
currentScanAndCreateThumbnailsActor = nil
// 3. Cancel JPG extraction the same way
if let actor = currentExtractAndSaveJPGsActor {
Task { await actor.cancelExtractJPGSTask() }
}
currentExtractAndSaveJPGsActor = nil
creatingthumbnails = false
}
Checking cancellation inside the worker
// From Actors/ScanAndCreateThumbnails.swift
private func processSingleFile(_ url: URL, targetSize: Int, itemIndex: Int) async {
// Check before doing any I/O
if Task.isCancelled { return }
// Check RAM cache...
if let wrapper = SharedMemoryCache.shared.object(forKey: url as NSURL) { ... }
// Check again before slower disk operation
if Task.isCancelled { return }
// Load from disk cache...
if let diskImage = await diskCache.load(for: url) { ... }
// Check again before the most expensive operation: raw file extraction
if Task.isCancelled { return }
let cgImage = try await SonyThumbnailExtractor.extractSonyThumbnail(...)
}
Each Task.isCancelled guard cuts work short at logical checkpoints. The more expensive the upcoming operation, the more important the guard is. This gives smooth, instant response when the user switches to a different folder.
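In throwing functions the same checkpoints can be written with Task.checkCancellation(), which throws CancellationError instead of returning early. A minimal sketch, assuming a hypothetical worker named countProcessed where a short sleep stands in for real I/O:

```swift
// Hypothetical worker: processes `limit` items unless cancelled first.
func countProcessed(upTo limit: Int) async throws -> Int {
    var processed = 0
    for _ in 0..<limit {
        // Checkpoint: throws CancellationError if the task was cancelled.
        try Task.checkCancellation()
        // Stand-in for real I/O; Task.sleep also throws when cancelled.
        try await Task.sleep(nanoseconds: 1_000_000)
        processed += 1
    }
    return processed
}
```

A caller keeps the Task handle, calls cancel() on it when the user moves on, and treats CancellationError as a normal outcome rather than a failure.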
7 Task and Task.detached
Sometimes you want to start background work without awaiting its result — a fire-and-forget pattern. Swift provides two ways to do this:
- Task { ... }: inherits the current actor context and task priority. If called from @MainActor, its body also runs on @MainActor; an awaited call may execute elsewhere, but the body resumes on @MainActor afterwards.
- Task.detached { ... }: starts a completely independent task. It inherits no actor context and runs on the cooperative thread pool at the specified priority. Use this for genuinely background work that has no relationship to the calling context.
Saving to disk in the background (Task.detached)
// From Actors/ScanAndCreateThumbnails.swift
// We have a cgImage — encode it to Data INSIDE this actor
// before crossing any boundary. CGImage is NOT Sendable.
guard let jpegData = DiskCacheManager.jpegData(from: cgImage) else { return }
// Data IS Sendable — safe to pass to a detached task.
let dcache = diskCache // Capture the actor reference (actors are Sendable)
Task.detached(priority: .background) {
// Runs on a background thread — no actor context
await dcache.save(jpegData, for: url)
}
// We DON'T await this — the thumbnail is shown immediately
// while the disk write happens silently in the background.
This is a key pattern: encode the image to Data (a value type, Sendable) while still inside the actor that owns the CGImage. Only after encoding do we hand it off to a detached task. This avoids the Swift 6 compile error that would occur if we tried to send a CGImage across a task boundary.
UI callback fire-and-forget (Task on @MainActor)
// From Actors/ScanAndCreateThumbnails.swift
private func notifyFileHandler(_ count: Int) {
let handler = fileHandlers?.fileHandler
Task { @MainActor in handler?(count) }
// Creates a Task that runs on the main thread,
// but we immediately return without awaiting it.
// Thumbnail generation must NOT stall waiting for UI rendering.
}
8 SharedMemoryCache — A Singleton Actor
SharedMemoryCache is one of the most sophisticated concurrency designs in RawCull. It is a singleton actor that wraps NSCache (Apple’s automatic memory-evicting cache). It cleverly combines actor isolation for configuration with nonisolated access for the NSCache itself.
// From Actors/SharedMemoryCache.swift (simplified)
actor SharedMemoryCache {
// Singleton — accessible as SharedMemoryCache.shared from any context
nonisolated static let shared = SharedMemoryCache()
// ── Actor-isolated state (requires await to access) ──────────────────
private var _costPerPixel: Int = 4
private var savedSettings: SavedSettings?
private var setupTask: Task<Void, Never>?
private var memoryPressureSource: DispatchSourceMemoryPressure?
// ── Non-isolated state (no await needed) ─────────────────────────────
// NSCache is internally thread-safe, so we can safely bypass the
// actor's serialization for fast synchronous lookups.
nonisolated(unsafe) let memoryCache = NSCache<NSURL, DiscardableThumbnail>()
// Synchronous cache lookup — no 'await' required by callers
nonisolated func object(forKey key: NSURL) -> DiscardableThumbnail? {
memoryCache.object(forKey: key)
}
nonisolated func setObject(_ obj: DiscardableThumbnail, forKey key: NSURL, cost: Int) {
memoryCache.setObject(obj, forKey: key, cost: cost)
}
}
The key insight is the two-tier design. Configuration properties (cost per pixel, settings, memory pressure source) are actor-isolated and require await. But the hot-path NSCache operations (lookups and insertions) are nonisolated — they happen in every SwiftUI view that renders a thumbnail, and they must be fast. NSCache provides its own thread safety, so nonisolated(unsafe) is legitimate here.
Guarding against duplicate initialization with a stored Task
func ensureReady(config: CacheConfig? = nil) async {
// If setup is already in progress (or done), just wait for it to finish
if let task = setupTask {
return await task.value // Join the existing task — don't start a new one
}
// Start a new setup task — store it IMMEDIATELY before awaiting
let newTask = Task {
self.startMemoryPressureMonitoring()
let settings = await SettingsViewModel.shared.asyncgetsettings()
let config = self.calculateConfig(from: settings)
self.applyConfig(config)
}
// Storing BEFORE awaiting is critical: if another caller arrives during
// the await below, they'll find setupTask already set and join it.
setupTask = newTask
await newTask.value
}
Race condition fix: If you stored setupTask = newTask after await newTask.value, a second concurrent caller could find setupTask still nil and start a duplicate initialization. Storing it immediately after creation is the correct pattern.
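The join-the-existing-task pattern can be exercised on its own. In this sketch, OneTimeSetup and setupRuns are hypothetical names, and the sleep stands in for the real initialization work.

```swift
// Hypothetical actor: runs its setup once, no matter how many callers race.
actor OneTimeSetup {
    private var setupTask: Task<Void, Never>?
    private(set) var setupRuns = 0

    func ensureReady() async {
        // Setup already started (or finished): join it.
        if let task = setupTask {
            return await task.value
        }
        let newTask = Task {
            // Stand-in for slow initialization work.
            try? await Task.sleep(nanoseconds: 10_000_000)
            self.setupRuns += 1
        }
        // No suspension point between creating and storing the task,
        // so no other caller can slip in and start a duplicate.
        setupTask = newTask
        await newTask.value
    }
}
```

Calling ensureReady() from many concurrent tasks should leave setupRuns at 1, because every caller after the first finds the stored task and awaits it.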
Memory pressure monitoring with DispatchSource
private func startMemoryPressureMonitoring() {
let source = DispatchSource.makeMemoryPressureSource(
eventMask: .all, queue: .global(qos: .utility)
)
// When the OS fires a memory pressure event (on a GCD background queue),
// we create a Task to hop back onto the actor and respond.
source.setEventHandler { [weak self] in
guard let self else { return }
Task {
await self.handleMemoryPressureEvent()
}
}
source.resume()
memoryPressureSource = source
}
9 AsyncStream — Streaming Progress Updates
AsyncStream is Swift’s way to model a sequence of values that arrive over time — analogous to a Combine publisher or a Unix pipe, but using async/await. RawCull uses AsyncStream to stream progress updates from the rsync copy process to the UI.
// From Model/ParametersRsync/ExecuteCopyFiles.swift
// In init(): create an AsyncStream with its continuation
let (stream, continuation) = AsyncStream.makeStream(of: Int.self)
self.progressStream = stream // Consumer reads from this
self.progressContinuation = continuation // Producer writes to this
// ── Producer (inside streaming handler callback) ─────────────────────────
streamingHandlers = CreateStreamingHandlers().createHandlersWithCleanup(
fileHandler: { [weak self] count in
// Each time rsync reports a file, yield the count to the stream
self?.progressContinuation?.yield(count)
}
)
// ── Consumer (in a ViewModel or View) ────────────────────────────────────
if let stream = copyFiles.progressStream {
for await count in stream {
// 'await' suspends between each value — no busy-waiting
updateProgressBar(count)
}
// Loop exits naturally when continuation.finish() is called
}
// ── Cleanup (inside handleProcessTermination) ─────────────────────────────
progressContinuation?.finish() // Signals the consumer loop to exit
progressContinuation = nil
progressStream = nil
AsyncStream is ideal here: rsync is a long-running subprocess that emits a count each time it copies a file. The UI wants to see each update as it happens, without polling. When the process finishes, calling .finish() on the continuation terminates the for await loop cleanly.
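The producer and consumer roles can be seen in a self-contained sketch. makeProgressStream and collectProgress are hypothetical helpers; a plain loop stands in for the rsync callback.

```swift
// Hypothetical producer: yields 1...totalFiles, then finishes the stream.
func makeProgressStream(totalFiles: Int) -> AsyncStream<Int> {
    let (stream, continuation) = AsyncStream.makeStream(of: Int.self)
    Task {
        for copied in 0..<max(0, totalFiles) {
            continuation.yield(copied + 1) // one value per "copied file"
        }
        continuation.finish() // lets the consumer's for-await loop end
    }
    return stream
}

// Consumer: suspends between values and returns once the stream finishes.
func collectProgress(from stream: AsyncStream<Int>) async -> [Int] {
    var seen: [Int] = []
    for await count in stream {
        seen.append(count)
    }
    return seen
}
```

With the default unbounded buffering policy, values yielded before the consumer starts iterating are kept and delivered in order.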
10 CheckedContinuation — Bridging to the Semaphore World
Swift’s concurrency model doesn’t have a built-in semaphore. Instead, you use withCheckedContinuation (or its throwing variant) to suspend a task and resume it later from a completely different context. ThumbnailLoader uses this to build a rate limiter: a queue that allows at most 6 thumbnail loads to run concurrently.
// From Actors/ThumbnailLoader.swift
actor ThumbnailLoader {
static let shared = ThumbnailLoader()
private let maxConcurrent = 6
private var activeTasks = 0
private var pendingContinuations: [(id: UUID, continuation: CheckedContinuation<Void, Never>)] = []
private func acquireSlot() async {
if activeTasks < maxConcurrent {
activeTasks += 1
return // Slot available — proceed immediately
}
// No slot available — suspend this task and wait
let id = UUID()
await withTaskCancellationHandler {
await withCheckedContinuation { continuation in
// We are now suspended. Store the continuation.
// releaseSlot() will call continuation.resume() when a slot opens.
pendingContinuations.append((id: id, continuation: continuation))
}
activeTasks += 1
} onCancel: {
// If the task is cancelled while waiting, remove it from the queue
Task { await self.removeAndResumePendingContinuation(id: id) }
}
}
private func releaseSlot() {
activeTasks -= 1
if let next = pendingContinuations.first {
pendingContinuations.removeFirst()
next.continuation.resume() // Wake up the oldest waiting task
}
}
func thumbnailLoader(file: FileItem) async -> NSImage? {
await acquireSlot() // Wait for a free slot
defer { releaseSlot() } // Release when done (even on error)
guard !Task.isCancelled else { return nil }
// ... load thumbnail ...
}
}
withCheckedContinuation is Swift’s way to wrap callback-based or semaphore-based APIs into the async/await world. The “Checked” version adds runtime safety: resuming a continuation more than once traps with a clear error, and dropping a continuation without ever resuming it logs a runtime warning about the leak, so a task that would otherwise hang silently is at least reported. withTaskCancellationHandler ensures that if the task is cancelled while waiting for a slot, it cleans up gracefully.
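For comparison, here is a stripped-down limiter built on the same continuation idea. It is a sketch, not ThumbnailLoader: AsyncLimiter and withSlot are hypothetical names, and cancellation handling is deliberately left out to keep it short.

```swift
// Hypothetical actor: allows at most `limit` bodies to run at once.
actor AsyncLimiter {
    private let limit: Int
    private var active = 0
    private var waiters: [CheckedContinuation<Void, Never>] = []

    init(limit: Int) {
        precondition(limit > 0, "limit must be positive")
        self.limit = limit
    }

    private func acquire() async {
        if active < limit {
            active += 1
            return // slot available: proceed immediately
        }
        // No slot: park this caller until release() resumes it.
        await withCheckedContinuation { continuation in
            waiters.append(continuation)
        }
        // release() transferred its slot to us; `active` already counts it.
    }

    private func release() {
        if waiters.isEmpty {
            active -= 1
        } else {
            // Hand the slot straight to the oldest waiter.
            waiters.removeFirst().resume()
        }
    }

    func withSlot<T: Sendable>(_ body: @Sendable () async -> T) async -> T {
        await acquire()
        let result = await body()
        release()
        return result
    }
}
```

In this variant, release() transfers the slot directly to the oldest waiter instead of decrementing and re-incrementing the counter, so the active count stays at or below the limit even if a new caller arrives before the waiter resumes.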
11 The Thumbnail Pipeline — Putting It All Together
The thumbnail system is where all the concurrency patterns converge. Understanding this pipeline shows how each concept connects in practice.
Three-tier lookup strategy (RAM → Disk → Extract)
// From Actors/ScanAndCreateThumbnails.swift (resolveImage, simplified)
private func resolveImage(for url: URL, targetSize: Int) async throws -> CGImage {
// ── Tier A: RAM (synchronous — no await needed) ───────────────────────
// SharedMemoryCache.shared.object() is nonisolated — no actor hop.
if let wrapper = SharedMemoryCache.shared.object(forKey: url as NSURL),
wrapper.beginContentAccess() {
defer { wrapper.endContentAccess() }
return try nsImageToCGImage(wrapper.image) // Fastest path: ~μs
}
// ── Tier B: Disk cache (async — file I/O) ─────────────────────────────
if let diskImage = await diskCache.load(for: url) {
storeInMemoryCache(diskImage, for: url) // Promote to RAM
return try nsImageToCGImage(diskImage) // Fast: ~ms
}
// ── Tier C: In-flight deduplication ───────────────────────────────────
// If another caller is already generating this thumbnail, join that task
// instead of starting a duplicate.
if let existingTask = inflightTasks[url] {
let image = try await existingTask.value
return try nsImageToCGImage(image)
}
// ── Tier D: Extract from raw file ─────────────────────────────────────
let task = Task { () throws -> NSImage in
let cgImage = try await SonyThumbnailExtractor.extractSonyThumbnail(
from: url, maxDimension: CGFloat(targetSize), qualityCost: costPerPixel
)
let image = try cgImageToNormalizedNSImage(cgImage)
storeInMemoryCache(image, for: url)
// Encode to Data inside this actor, then fire off a background save
if let jpegData = DiskCacheManager.jpegData(from: cgImage) {
let dcache = diskCache // Capture the actor reference (actors are Sendable)
Task.detached(priority: .background) { await dcache.save(jpegData, for: url) }
}
inflightTasks[url] = nil
return image
}
inflightTasks[url] = task // Register so concurrent callers can join it
return try nsImageToCGImage(try await task.value)
}
Tier C is an elegant optimization called request coalescing. If the grid view shows 20 thumbnails and 5 of them are for the same URL (perhaps during a layout transition), only one extraction happens — the other 4 join the first task and share its result.
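Request coalescing can be demonstrated without any image code. In this sketch, CoalescingLoader and loadCount are hypothetical names, and the string length plays the role of the expensive extraction result.

```swift
// Hypothetical actor: concurrent requests for the same key share one load.
actor CoalescingLoader {
    private var inflight: [String: Task<Int, Never>] = [:]
    private(set) var loadCount = 0

    func value(for key: String) async -> Int {
        // Someone is already loading this key: join their task.
        if let existing = inflight[key] {
            return await existing.value
        }
        let task = Task { () -> Int in
            self.loadCount += 1
            // Stand-in for the expensive raw-file extraction.
            try? await Task.sleep(nanoseconds: 50_000_000)
            return key.count
        }
        // Registered before the first suspension point, so later callers see it.
        inflight[key] = task
        let result = await task.value
        inflight[key] = nil // allow a fresh load next time
        return result
    }
}
```

Five overlapping calls for the same key should trigger a single load; a call that arrives after the entry has been cleared starts a new one, which is the point where a real cache lookup would normally take over.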
12 CacheDelegate — Tracking Evictions with an Actor
NSCache can evict objects at any time (when memory gets tight). CacheDelegate conforms to NSCacheDelegate so it gets a callback when an eviction happens. The tricky part: NSCache invokes this callback on whichever thread triggers the eviction, not from any Swift actor. The solution is a private helper actor that owns the mutable counter.
// From Model/Cache/CacheDelegate.swift
final class CacheDelegate: NSObject, NSCacheDelegate, @unchecked Sendable {
nonisolated static let shared = CacheDelegate()
private let evictionCounter = EvictionCounter()
// Called by NSCache on its own internal thread
nonisolated func cache(_ cache: NSCache<AnyObject, AnyObject>,
willEvictObject obj: Any) {
if obj is DiscardableThumbnail {
Task {
let count = await evictionCounter.increment()
// log the count...
}
}
}
func getEvictionCount() async -> Int { await evictionCounter.getCount() }
func resetEvictionCount() async { await evictionCounter.reset() }
}
// A private actor that safely owns the mutable counter
private actor EvictionCounter {
private var count = 0
func increment() -> Int { count += 1; return count }
func getCount() -> Int { count }
func reset() { count = 0 }
}
EvictionCounter is a textbook use of an actor for the simplest possible case: protecting a single integer from concurrent writes. Before actors existed, you would use NSLock or DispatchQueue(label:) for this. The actor is cleaner, safer, and compiler-verified.
13 MemoryViewModel — Offloading Heavy Work from @MainActor
MemoryViewModel displays live memory statistics (total RAM, used RAM, app footprint). Getting these stats requires Mach kernel calls (vm_statistics64, task_vm_info) — synchronous system calls that block for a brief moment. If run directly on @MainActor, they would cause UI stutter.
// From Model/ViewModels/MemoryViewModel.swift
func updateMemoryStats() async {
// Step 1: Move the heavy work OFF the MainActor
let (total, used, app, threshold) = await Task.detached {
let total = ProcessInfo.processInfo.physicalMemory
let used = self.getUsedSystemMemory() // Blocking Mach call
let app = self.getAppMemory() // Blocking Mach call
let threshold = self.calculateMemoryPressureThreshold(total: total)
return (total, used, app, threshold)
}.value
// Step 2: Update @Observable properties back on MainActor
await MainActor.run {
self.totalMemory = total
self.usedMemory = used
self.appMemory = app
self.memoryPressureThreshold = threshold
}
}
// The Mach calls are nonisolated: they don't touch any actor state
private nonisolated func getUsedSystemMemory() -> UInt64 {
var stat = vm_statistics64()
// ... kernel call ...
return (wired + active + compressed) * pageSize
}
This pattern — Task.detached for blocking work, then MainActor.run to update observable state — is the canonical way to keep the UI thread responsive while doing expensive computation or I/O in a class that must also update the UI.
14 @concurrent and nonisolated
Two related annotations help you escape actor isolation when it is safe to do so, allowing more work to run in parallel.
nonisolated — opt out of the actor’s serial queue
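A synchronous nonisolated method on an actor can be called without await from outside the actor. The tradeoff: it must not read or write any actor-isolated mutable state. It is safe for pure computation or for accessing nonisolated(unsafe) properties. The example below is also async and @concurrent, so callers still await it, but it never occupies the actor.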
// From Actors/ScanFiles.swift
// sortFiles does not touch any actor property — it only works on
// the passed-in 'files' array (a value type, passed by copy).
@concurrent
nonisolated func sortFiles(
_ files: [FileItem],
by sortOrder: [some SortComparator<FileItem>],
searchText: String
) async -> [FileItem] {
let sorted = files.sorted(using: sortOrder)
return searchText.isEmpty ? sorted
: sorted.filter { $0.name.localizedCaseInsensitiveContains(searchText) }
}
@concurrent — run on the thread pool, not the actor queue
@concurrent is an attribute introduced in Swift 6.2 that says: “even though this method is on an actor, execute it on the cooperative thread pool, not on the actor’s serial queue.” It is useful for pure CPU work that doesn’t need actor isolation but lives on an actor for organizational reasons.
// From Actors/ActorCreateOutputforView.swift
actor ActorCreateOutputforView {
// Pure mapping: [String] → [RsyncOutputData]
@concurrent
nonisolated func createOutputForView(_ strings: [String]?) async -> [RsyncOutputData] {
guard let strings else { return [] }
return strings.map { RsyncOutputData(record: $0) }
}
}
15 Sendable — The Type-Safety Rule
A Sendable type can safely cross actor/task boundaries. Swift enforces this at compile time in Swift 6: if you try to send a non-Sendable value to a different isolation domain, the compiler rejects it. The most common pattern in RawCull is the CGImage-to-Data conversion before any boundary crossing.
// CGImage is NOT Sendable (it wraps a Core Foundation object)
// ❌ WRONG — Swift 6 compiler error
Task.detached {
await diskCache.save(cgImage, for: url) // Error: CGImage is not Sendable
}
// ✅ CORRECT — Encode to Data first, then cross the boundary
// Data is a struct (value type) — it IS Sendable.
if let jpegData = DiskCacheManager.jpegData(from: cgImage) {
let dcache = diskCache // Actor references are Sendable
Task.detached(priority: .background) {
await dcache.save(jpegData, for: url) // Data ✓, actor ref ✓
}
}
Value types (structs, enums) with only Sendable stored properties are automatically Sendable. SavedSettings, FileItem, ExifMetadata, CopyDataResult — all structs in RawCull — are Sendable for this reason. Actor references are also Sendable (the actor itself serializes access). Class instances are generally not Sendable unless annotated.
16 Bridging GCD and Swift Concurrency — Preventing Thread Pool Starvation
Both JPGSonyARWExtractor and SonyThumbnailExtractor are caseless enums — pure namespaces with no instance state — that perform CPU-intensive ImageIO work. They use a pattern that looks surprising at first: they explicitly dispatch to DispatchQueue.global inside a withCheckedContinuation. Understanding why reveals an important pitfall of Swift’s cooperative thread pool.
The problem: thread pool starvation
Swift’s cooperative thread pool has a limited number of threads — typically one per CPU core. When an async function calls a synchronous, blocking API (like CGImageSourceCreateWithURL or CGImageSourceCreateThumbnailAtIndex), that call does not suspend — it blocks the thread it is running on. If many tasks do this simultaneously, every thread in the pool can become occupied with blocked I/O, leaving no threads free to run other await continuations. The app effectively freezes. This is called thread pool starvation.
The fix is to deliberately hop off the cooperative thread pool and onto a GCD global queue — which has its own, much larger pool of threads — for the duration of the blocking call. When the GCD block finishes, it calls continuation.resume(), which re-queues the Swift task on the cooperative pool for the lightweight work that follows.
JPGSonyARWExtractor — withCheckedContinuation + GCD
// From Enum/JPGSonyARWExtractor.swift
// @preconcurrency suppresses Sendable errors for AppKit types (like NSImage)
// that predate Swift concurrency and are not formally Sendable.
@preconcurrency import AppKit
enum JPGSonyARWExtractor {
static func jpgSonyARWExtractor(
from arwURL: URL,
fullSize: Bool = false,
) async -> CGImage? {
return await withCheckedContinuation { continuation in
// Dispatch to GCD to prevent Thread Pool Starvation.
// CGImageSourceCreateWithURL and friends are synchronous and can
// block for tens of milliseconds on a large ARW file.
// Running them directly on the cooperative pool ties up a thread.
DispatchQueue.global(qos: .utility).async {
guard let imageSource = CGImageSourceCreateWithURL(arwURL as CFURL, nil) else {
continuation.resume(returning: nil)
return
}
// Scan all sub-images in the ARW container and find the largest JPEG preview
let imageCount = CGImageSourceGetCount(imageSource)
var targetIndex = -1
var targetWidth = 0
for index in 0 ..< imageCount {
guard let props = CGImageSourceCopyPropertiesAtIndex(imageSource, index, nil)
as? [CFString: Any] else { continue }
let hasJFIF = (props[kCGImagePropertyJFIFDictionary] as? [CFString: Any]) != nil
let tiffDict = props[kCGImagePropertyTIFFDictionary] as? [CFString: Any]
let compression = tiffDict?[kCGImagePropertyTIFFCompression] as? Int
let isJPEG = hasJFIF || (compression == 6) // TIFF compression 6 = JPEG
if let width = getWidth(from: props), isJPEG, width > targetWidth {
targetWidth = width
targetIndex = index
}
}
guard targetIndex != -1 else {
continuation.resume(returning: nil)
return
}
// Downsample in-place with ImageIO if the preview is larger than needed
let maxSize = CGFloat(fullSize ? 8640 : 4320)
let result: CGImage?
if CGFloat(targetWidth) > maxSize {
let options: [CFString: Any] = [
kCGImageSourceCreateThumbnailFromImageAlways: true,
kCGImageSourceCreateThumbnailWithTransform: true,
kCGImageSourceThumbnailMaxPixelSize: Int(maxSize),
]
result = CGImageSourceCreateThumbnailAtIndex(imageSource, targetIndex,
options as CFDictionary)
} else {
let options: [CFString: Any] = [
kCGImageSourceShouldCache: true,
kCGImageSourceShouldCacheImmediately: true,
]
result = CGImageSourceCreateImageAtIndex(imageSource, targetIndex,
options as CFDictionary)
}
// Hand the result back to the Swift async world
continuation.resume(returning: result)
}
}
}
}
The withCheckedContinuation call suspends the Swift task and stores its continuation. The GCD block then runs on a GCD worker thread — entirely outside the cooperative pool. When it calls continuation.resume(returning:), Swift schedules the task to resume, but only the lightweight resumption, not the expensive ImageIO work that has already completed on GCD.
SonyThumbnailExtractor — withCheckedThrowingContinuation + GCD
SonyThumbnailExtractor follows the same pattern but uses the throwing variant because the ImageIO operations can fail. The comment in the source file spells out a second important motivation beyond starvation:
// From Enum/SonyThumbnailExtractor.swift
enum SonyThumbnailExtractor {
static func extractSonyThumbnail(
from url: URL,
maxDimension: CGFloat,
qualityCost: Int = 4,
) async throws -> CGImage {
// We MUST explicitly hop off the current thread.
// Since we are an enum and static, we have no isolation of our own.
// If we don't do this, we run on the caller's thread (the Actor),
// causing serialization — only one extraction at a time.
try await withCheckedThrowingContinuation { continuation in
DispatchQueue.global(qos: .userInitiated).async {
do {
let image = try Self.extractSync(from: url,
maxDimension: maxDimension,
qualityCost: qualityCost)
continuation.resume(returning: image)
} catch {
continuation.resume(throwing: error) // Propagates to the call site
}
}
}
}
// All heavy ImageIO work lives in a private synchronous function,
// only ever called from the GCD block above
private nonisolated static func extractSync(
from url: URL,
maxDimension: CGFloat,
qualityCost: Int,
) throws -> CGImage {
let sourceOptions = [kCGImageSourceShouldCache: false] as CFDictionary
guard let source = CGImageSourceCreateWithURL(url as CFURL, sourceOptions)
else { throw ThumbnailError.invalidSource }
let thumbOptions: [CFString: Any] = [
kCGImageSourceCreateThumbnailFromImageAlways: true,
kCGImageSourceCreateThumbnailWithTransform: true,
kCGImageSourceThumbnailMaxPixelSize: maxDimension,
kCGImageSourceShouldCacheImmediately: true,
]
guard let raw = CGImageSourceCreateThumbnailAtIndex(source, 0,
thumbOptions as CFDictionary)
else { throw ThumbnailError.generationFailed }
return try rerender(raw, qualityCost: qualityCost)
}
// Re-renders into an sRGB CGContext to normalise colour space and apply
// the chosen interpolation quality
private nonisolated static func rerender(_ image: CGImage, qualityCost: Int) throws -> CGImage {
let quality: CGInterpolationQuality = switch qualityCost {
case 1...2: .low
case 3...4: .medium
default: .high
}
guard let colorSpace = CGColorSpace(name: CGColorSpace.sRGB)
else { throw ThumbnailError.contextCreationFailed }
let bitmapInfo = CGBitmapInfo(rawValue: CGImageAlphaInfo.premultipliedLast.rawValue)
guard let ctx = CGContext(data: nil, width: image.width, height: image.height,
bitsPerComponent: 8, bytesPerRow: 0,
space: colorSpace, bitmapInfo: bitmapInfo.rawValue)
else { throw ThumbnailError.contextCreationFailed }
ctx.interpolationQuality = quality
ctx.draw(image, in: CGRect(x: 0, y: 0, width: image.width, height: image.height))
guard let result = ctx.makeImage() else { throw ThumbnailError.generationFailed }
return result
}
}
The comment makes an important second point: “If we don’t do this, we run on the caller’s thread (the Actor), causing serialization.” Even without starvation, running the blocking work directly on the calling actor would mean the actor can only process one extraction at a time — because actors are serial. By dispatching to GCD immediately, the actor is freed to start the next request while GCD runs many extractions concurrently on its own thread pool.
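Both extractors share the same bridging shape, which can be factored into a small generic helper. This is a sketch under stated assumptions: `runBlocking` is a hypothetical name that does not appear in RawCull, and the default QoS is chosen only for illustration.

```swift
import Foundation

// Hypothetical helper: runs synchronous, blocking work on a GCD global queue
// and resumes the awaiting Swift task exactly once with the result or the error.
func runBlocking<T: Sendable>(
    qos: DispatchQoS.QoSClass = .userInitiated,
    _ work: @escaping @Sendable () throws -> T
) async throws -> T {
    try await withCheckedThrowingContinuation { continuation in
        DispatchQueue.global(qos: qos).async {
            do {
                let value = try work()
                continuation.resume(returning: value)
            } catch {
                continuation.resume(throwing: error)
            }
        }
    }
}
```

A call site would look like `let image = try await runBlocking { try decodeSomething() }`, where `decodeSomething` stands in for any blocking call. Every path through the GCD block resumes the continuation exactly once, which is the invariant the checked variant verifies at runtime.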
Why enums, not classes or actors?
Using a caseless enum signals that the type is a pure namespace: it has no instance state and cannot be instantiated. This means there is no actor isolation to reason about, every member must be static, and there is no instance self to capture. The Swift compiler never has to consider whether an instance of the type crosses an isolation boundary. It is the right choice for stateless utility code that performs only I/O and pure computation.
@preconcurrency import — suppressing legacy Sendable warnings
JPGSonyARWExtractor annotates its AppKit import with @preconcurrency:
@preconcurrency import AppKit
AppKit was written before Swift concurrency existed, so many of its types are not formally declared Sendable. In Swift 6, using them across concurrency boundaries would normally produce hard errors. @preconcurrency import tells the compiler to treat missing Sendable conformances from that module as warnings rather than errors — the sanctioned way to integrate legacy frameworks without turning off strict concurrency checking globally.
QoS choices — utility vs userInitiated
The two enums deliberately pick different GCD quality-of-service levels:
- JPGSonyARWExtractor uses .utility: extracting full-resolution previews for JPG export is a background batch job that can yield to foreground work without affecting perceived responsiveness.
- SonyThumbnailExtractor uses .userInitiated: thumbnail extraction is driven directly by the user scrolling the grid, so results need to appear quickly to keep the UI feeling snappy.
This mirrors the task priority system used within Swift concurrency itself (Task(priority: .background) vs .userInitiated), applied at the GCD layer where the blocking work actually lives.
17 Quick Reference
| Keyword / Pattern | What it does | Where in RawCull |
|---|---|---|
| async / await | Suspend without blocking; resume when ready | Everywhere (all I/O functions) |
| actor | Reference type with automatic mutual exclusion | ScanFiles, DiskCacheManager, ThumbnailLoader, … |
| @MainActor | Restrict execution to the main thread | RawCullViewModel, ExecuteCopyFiles |
| @Observable + @MainActor | SwiftUI-observable classes on main thread | RawCullViewModel, SettingsViewModel |
| withTaskGroup | Fan out many tasks in parallel, collect results | ScanFiles.scanFiles, ScanAndCreateThumbnails.preloadCatalog |
| Task { } | Fire-and-forget; inherits current actor | UI callbacks, rating updates, abort() |
| Task.detached { } | Fully independent background task | Disk-cache saves, MemoryViewModel stats |
| Task.isCancelled | Cooperative cancellation check | processSingleFile (multiple guard points) |
| task.cancel() | Request cooperative cancellation | RawCullViewModel.abort() |
| AsyncStream | Push-based sequence of values over time | ExecuteCopyFiles progress stream |
| CheckedContinuation (rate-limiter) | Suspend a task; resume it from another context | ThumbnailLoader.acquireSlot() |
| withCheckedContinuation + DispatchQueue.global | Escape the cooperative pool; prevent thread pool starvation | JPGSonyARWExtractor, SonyThumbnailExtractor |
| withCheckedThrowingContinuation | Throwing variant of continuation bridging | SonyThumbnailExtractor.extractSonyThumbnail |
| @preconcurrency import | Suppress Sendable errors for pre-concurrency frameworks | JPGSonyARWExtractor (AppKit) |
| nonisolated | Escape actor isolation for pure functions | ScanFiles.sortFiles, SettingsViewModel.asyncgetsettings |
| @concurrent | Run on thread pool, not actor queue | ScanFiles.sortFiles, ActorCreateOutputforView |
| nonisolated(unsafe) | Bypass isolation for externally thread-safe objects | SharedMemoryCache.memoryCache (NSCache) |
| MainActor.run { } | Hop to main thread for a block, then return | SettingsViewModel.asyncgetsettings, MemoryViewModel.updateMemoryStats |
| Sendable | Types safe to cross actor/task boundaries | SavedSettings, FileItem, Data (CGImage → Data encoding) |
RawCull — a macOS app by Thomas Evensen · Swift 6 strict concurrency · Apple Silicon · macOS 26 Tahoe
Sharpness Scoring
Updated Sharpness Scoring in RawCull
This document describes how RawCull computes sharpness scores, what each parameter does, and a boundary-value test procedure for validating parameter behaviour. It applies to version 1.3.7 of RawCull.
How the Scoring Pipeline Works
flowchart TD
A([ARW File]) --> B[Thumbnail decode\nthumbnailMaxPixelSize]
A --> C[Saliency detection\nApple Vision]
B --> D[Gaussian pre-blur\npreBlurRadius × isoFactor × resFactor]
D --> E[Laplacian Metal kernel\nfocusLaplacian]
E --> F[Amplify\nenergyMultiplier]
F --> G[Border suppression\nborderInsetFraction]
G --> H[Render to Float32 buffer]
H --> I[Full-frame samples\nborder-inset region]
C -->|bounding box ≥ 3% area| J[Salient-region samples\nVision bounding box]
H --> J
I --> K[p95 winsorized tail score\nfull-frame f]
J --> L[p95 winsorized tail score\nsubject s]
K --> M{Fusion}
L --> M
C -->|area| M
M --> N[finalScore]
N --> O([Badge: score / maxScore × 100])
style A fill:#2d2d2d,color:#fff
style O fill:#2d2d2d,color:#fff
Each ARW file goes through the following stages in detail:
1. Thumbnail decode
ImageIO extracts the embedded JPEG thumbnail at the requested pixel size (thumbnailMaxPixelSize). If no embedded thumbnail is available, the full RAW is decoded instead. The thumbnail is the only input to the scoring pipeline; when an embedded thumbnail exists, the full-resolution RAW pixel data (up to 61 MP) is never read for scoring.
2. Saliency detection
Apple Vision (VNGenerateAttentionBasedSaliencyImageRequest) analyses the thumbnail and returns the bounding boxes of visually salient objects. The boxes are unioned into a single region. If the union area is less than 3% of the frame, or if Vision finds nothing, saliency is discarded and only the full-frame path is used. This step runs in parallel with the image decode.
3. Gaussian pre-blur
A Gaussian blur is applied before the Laplacian. This is critical: without pre-blur, noise pixels generate false Laplacian responses and inflate scores on out-of-focus images. The effective blur radius is:
effectiveRadius = preBlurRadius × isoFactor × resFactor
isoFactor = clamp(sqrt(ISO / 400), 1.0, 3.0)
resFactor = clamp(sqrt(max(imageWidth, 512) / 512), 1.0, 3.0)
At ISO 400 isoFactor is 1.0, and for a 512 px thumbnail resFactor is 1.0 as well. At ISO 6400 the unclamped isoFactor is sqrt(16) = 4.0, which is capped at 3.0, so the blur automatically increases to suppress noise before the edge detector fires.
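The scaling rule is easy to check numerically. The sketch below transcribes the three formulas directly; the function names `clamp` and `effectiveBlurRadius` are illustrative and are not taken from RawCull's source.

```swift
// Illustrative transcription of the effective-radius formula; names are assumptions.
func clamp(_ x: Double, _ lower: Double, _ upper: Double) -> Double {
    min(max(x, lower), upper)
}

func effectiveBlurRadius(preBlurRadius: Double, iso: Double, imageWidth: Double) -> Double {
    // Noise grows with ISO, so the blur grows with sqrt(ISO / 400), limited to 1x...3x.
    let isoFactor = clamp((iso / 400).squareRoot(), 1.0, 3.0)
    // Larger thumbnails get a proportionally wider blur, also limited to 1x...3x.
    let resFactor = clamp((max(imageWidth, 512) / 512).squareRoot(), 1.0, 3.0)
    return preBlurRadius * isoFactor * resFactor
}
```

With these definitions, ISO 1600 on a 512 px thumbnail doubles the base radius, and any ISO at or above 3600 hits the 3.0 cap.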
4. Laplacian (Metal kernel)
A Laplacian-of-Gaussian edge magnitude is computed via the focusLaplacian Metal kernel. This measures the second derivative of luminance — the response is high where focus is sharp and near zero where the image is blurred. The output is amplified by energyMultiplier.
5. Border suppression
The outer borderInsetFraction of pixels on each edge is zeroed out (replaced with black) before any scoring or thresholding. This prevents the Gaussian pre-blur from generating an artificial sharp step at the image boundary that would otherwise appear as a bright rectangle in the focus mask and inflate scores.
6. Pixel sampling
The amplified Laplacian is rendered to a Float32 buffer. Two sample sets are collected:
- Full-frame samples: all pixels inside the border-inset region.
- Salient-region samples: pixels within the Vision bounding box, if available and containing at least 256 pixels.
7. Robust tail score (p95 winsorized mean)
Each sample set is scored independently:
- Find the 95th-percentile value (p95) — the threshold above which only the sharpest pixels sit.
- Find the 99.5th-percentile value (p99.5) — used as an upper clip to reduce the influence of extreme outliers.
- Return the mean of all values ≥ p95, clipped at p99.5.
This produces a score that reflects the sharpest 5% of pixels rather than average image sharpness, which is a better predictor of perceived in-focus quality.
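The three steps above can be sketched in a few lines. This is an approximation under assumptions: the text does not say how RawCull computes percentiles, so the sketch uses a nearest-rank percentile on a sorted copy, and `percentile` and `tailScore` are hypothetical names.

```swift
// Nearest-rank percentile on an already sorted array (an assumed definition).
// `perMille` is the percentile in tenths of a percent: 950 = p95, 995 = p99.5.
func percentile(_ sorted: [Float], perMille: Int) -> Float {
    precondition(!sorted.isEmpty, "percentile needs at least one sample")
    let rank = (perMille * sorted.count + 999) / 1000   // ceiling division, 1-based rank
    let index = min(max(rank, 1), sorted.count) - 1
    return sorted[index]
}

// Mean of all samples at or above p95, with each value clipped at p99.5.
func tailScore(_ samples: [Float]) -> Float {
    guard !samples.isEmpty else { return 0 }
    let sorted = samples.sorted()
    let p95 = percentile(sorted, perMille: 950)
    let p995 = percentile(sorted, perMille: 995)
    let tail = sorted.filter { $0 >= p95 }.map { min($0, p995) }
    return tail.reduce(0, +) / Float(tail.count)
}
```

Sorting the whole buffer keeps the sketch short; a production implementation working on large Float32 buffers might prefer a selection algorithm or a histogram instead.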
8. Score fusion
flowchart TD
A{Saliency\ndetected?} -->|Yes — area ≥ 3%| B[Compute salient score s]
A -->|No| C[Full-frame score f only]
B --> D{Both f and s\nproduced?}
D -->|Yes| E["blended = f × (1 − salientWeight) + s × salientWeight
sizeFactor = 1 + salientArea × subjectSizeFactor
finalScore = blended × sizeFactor"]
D -->|s only| F[finalScore = s]
C --> G[finalScore = f]
E --> H([Return finalScore])
F --> H
G --> H
9. Badge display
The badge shown on each thumbnail is score / maxScore × 100, normalised to the highest score in the current catalog. All comparisons are therefore relative within a session, not absolute.
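A minimal sketch of this normalisation follows. The function name `badgeValues` and the rounding to a whole number are assumptions made for illustration; the text only specifies the ratio score / maxScore × 100.

```swift
// Hypothetical sketch: converts raw scores into 0...100 badge values relative
// to the highest score in the catalog. Rounding to Int is an assumption.
func badgeValues(for scores: [Float]) -> [Int] {
    guard let maxScore = scores.max(), maxScore > 0 else {
        // Empty catalog or all-zero scores: nothing to normalise against.
        return scores.map { _ in 0 }
    }
    return scores.map { Int(($0 / maxScore * 100).rounded()) }
}
```

Because the divisor is the catalog maximum, adding or removing a very sharp frame changes every other badge, which is why badges are comparable only within one session.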
Parameters
Where each parameter applies
flowchart LR
subgraph Score ["Scoring pipeline"]
P1[thumbnailMaxPixelSize]
P2[borderInsetFraction]
P3[salientWeight]
P4[subjectSizeFactor]
P5[preBlurRadius]
P6[energyMultiplier]
end
subgraph Mask ["Focus mask overlay"]
P2
P5
P6
P7[threshold]
P8[erosionRadius]
P9[dilationRadius]
P10[featherRadius]
end
borderInsetFraction, preBlurRadius, and energyMultiplier affect both scoring and the focus mask overlay. Changes made in the zoom view’s Focus Mask Controls panel are immediately reflected in the next score run.
Scoring Resolution — thumbnailMaxPixelSize
| Setting | Value |
|---|---|
| Fast | 512 px |
| Medium | 768 px |
| Accurate | 1024 px |
| Default | 512 px |
The pixel size of the thumbnail decoded for scoring. A larger thumbnail contains finer spatial detail, which improves score accuracy — especially at high ISO where noise patterns can obscure real edges at 512 px. Each step roughly doubles decode and pipeline time.
Applies to: scoring only (the focus mask overlay always uses the full displayed image).
Border Inset — borderInsetFraction
| Bound | Value | Effect |
|---|---|---|
| Min | 0% | No border exclusion; Gaussian edge artefacts visible in mask and inflate scores |
| Default | 4% | Removes the typical blur-boundary band |
| Max | 10% | Removes a wide margin; subjects near the frame edge may be partially excluded |
The fraction of the image dimension excluded from each of the four edges. Prevents the Gaussian pre-blur from creating an artificial bright rectangle at the image boundary in both the focus mask and the score.
Applies to: scoring and focus mask overlay.
Subject Weight — salientWeight
| Bound | Value | Effect |
|---|---|---|
| Min | 0.0 | Entire score is the full-frame tail score; subject sharpness is ignored |
| Default | 0.75 | Subject region contributes 75%, full frame 25% |
| Max | 1.0 | Entire score is the salient-region score; if saliency fails, falls back to full-frame |
Controls how much the Vision-detected subject region drives the score versus the full frame. At 0.0, background texture (grass, foliage) dominates and scores cluster regardless of subject focus. At 1.0, background is entirely ignored.
Applies to: scoring only.
Subject Size Bonus — subjectSizeFactor
| Bound | Value | Effect |
|---|---|---|
| Min | 0.0 | No size bonus; equal focus quality scores identically regardless of subject distance |
| Default | 1.5 | Subject at 20% frame area → ×1.30 multiplier; 5% frame area → ×1.075 |
| Max | 3.0 | Subject at 20% frame area → ×1.60 multiplier; strong preference for close subjects |
Multiplies the fused score by 1 + salientArea × subjectSizeFactor. Gives a proportional bonus to images where the subject fills more of the frame — closer subjects score higher than equally-sharp distant ones.
Applies to: scoring only. Has no effect if saliency detection finds nothing (salientArea = 0).
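The fusion rule and the size bonus can be combined into one small function. This is a sketch of the formulas as described in this document, not RawCull's code: `FusionParameters` and `fuse` are hypothetical names, and the defaults mirror the documented default values.

```swift
// Hypothetical container for the two fusion parameters, using the documented defaults.
struct FusionParameters {
    var salientWeight: Float = 0.75
    var subjectSizeFactor: Float = 1.5
}

// Blends the full-frame and subject tail scores, then applies the size bonus.
// `salientArea` is the subject area as a fraction of the frame (0...1).
func fuse(fullFrame: Float?, salient: Float?, salientArea: Float,
          parameters p: FusionParameters = FusionParameters()) -> Float? {
    switch (fullFrame, salient) {
    case let (full?, subject?):
        let blended = full * (1 - p.salientWeight) + subject * p.salientWeight
        let sizeFactor = 1 + salientArea * p.subjectSizeFactor
        return blended * sizeFactor
    case (nil, let subject?):
        return subject            // only the subject score was produced
    case (let full?, nil):
        return full               // saliency discarded: full-frame fallback
    case (nil, nil):
        return nil
    }
}
```

For example, with the defaults, a full-frame score of 0.4, a subject score of 0.8, and a subject covering 20% of the frame give a blended score of 0.7 and a size factor of 1.30, so the final score is about 0.91.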
Pre-blur Radius — preBlurRadius
| Bound | Value | Effect |
|---|---|---|
| Min | 0.3 | Almost no blur; Laplacian fires on noise pixels as well as real edges |
| Default | 1.92 | Balanced noise suppression at ISO 400 |
| Max | 4.0 | Heavy blur; only very strong edges survive |
Base Gaussian blur radius applied before the Laplacian, automatically scaled upward at higher ISO and larger images. This is the single most influential parameter for whether noise or real sharpness drives the score.
Accessible via: Focus Mask Controls (zoom view) and Scoring Parameters sheet. Applies to: scoring and focus mask overlay.
Amplify — energyMultiplier
| Bound | Value | Effect |
|---|---|---|
| Min | 1.0 | No amplification; raw Laplacian values are very small |
| Default | 7.62 | Set automatically by calibration |
| Max | 20.0 | Strong amplification; subtle sharpness differences become large score gaps |
Multiplies the Laplacian output before scoring and thresholding. The calibration step (Re-score) automatically adjusts this so the p95 of the burst lands at 0.50 — manual adjustment is rarely needed.
Accessible via: Focus Mask Controls (zoom view). Applies to: scoring and focus mask overlay.
Threshold, Erosion, Dilation, Feather Radius
These four parameters affect only the focus mask visual overlay in the zoom view. They have no effect on sharpness scores.
Accessible via: Focus Mask Controls (zoom view).
Test Procedure
The goal is to confirm that each parameter produces the expected behaviour at its boundary values. Use a burst of 20–30 frames containing a clear subject (bird, animal) at varying distances and focus accuracy.
Before each test:
- Open the catalog, press Re-score to establish a calibrated baseline.
- Note which images score highest and lowest and what the score spread is.
- Adjust one parameter at a time, press Re-score after each change, and compare against the baseline.
Test 1 — Scoring Resolution
| Step | Setting | Expected result |
|---|---|---|
| 1a | 512 px | Baseline. At ISO 6400+, some noise-driven variation expected |
| 1b | 1024 px | Scores more stable; high-ISO images show less noise influence. Scoring takes ~3–4× longer |
| 1c | 512 px | Scores return to baseline behaviour |
Pass criteria: 1024 px produces smaller score gaps between frames that differ only in noise, and larger score gaps between frames that differ in actual focus quality.
Test 2 — Border Inset
| Step | Setting | Expected result |
|---|---|---|
| 2a | 0% | Focus mask shows a bright rectangle around the full image border |
| 2b | 4% | Border rectangle is absent from mask; scores slightly lower for images where background edges reached the frame edge |
| 2c | 10% | Border completely clean; verify subjects near the frame edge are not partially excluded by inspecting the focus mask |
Pass criteria: At 0% the mask shows an obvious border artefact. At 4% and 10% it is absent. Subject detail in the centre of the frame is unaffected at all three values.
Test 3 — Subject Weight
| Step | Setting | Expected result |
|---|---|---|
| 3a | 0.0 | Scores driven by full-frame background texture; frames with sharp foliage score high even if the subject is soft; score spread is narrow |
| 3b | 0.75 | Subject sharpness dominates; clearly in-focus frames rank above soft frames |
| 3c | 1.0 | Score is entirely the salient-region score; any frame with no detected subject must fall back gracefully to a non-zero score |
Pass criteria: At 0.0 scores cluster in a narrow band. At 0.75 and 1.0 there is clear separation between in-focus and soft frames. At 1.0 frames with no Vision detection produce a non-zero score (full-frame fallback).
Test 4 — Subject Size Bonus
| Step | Setting | Expected result |
|---|---|---|
| 4a | 0.0 | Close and distant frames with equivalent focus quality score similarly |
| 4b | 1.5 | The closest frame (largest subject) scores noticeably higher than the most distant frame even at similar focus quality |
| 4c | 3.0 | Size bonus is strong; rank order correlates with subject size; verify a clearly soft close frame does not rank above a perfectly sharp distant frame |
Pass criteria: At 0.0 rank order is determined by focus quality alone. At 1.5 and 3.0 close subjects score higher. At 3.0 the bonus should not cause obviously soft large-subject frames to outrank sharp small-subject frames.
Test 5 — Pre-blur Radius
| Step | Setting | Expected result |
|---|---|---|
| 5a | 0.3 | At high ISO, noise-dominated images score artificially high; focus mask shows noise grain highlighted as “sharp” |
| 5b | 1.92 | Balanced; noise suppressed, real edges preserved |
| 5c | 4.0 | Only very strong contrast edges highlighted; soft-but-textured backgrounds show little or no mask; score spread may narrow |
Pass criteria: At 0.3 with high-ISO files the focus mask shows widespread highlighting that does not correspond to genuine sharp edges. At 1.92 and above it is clean. At 4.0 the sharpest in-focus frame still scores highest.
Test 6 — Amplify (Energy Multiplier)
| Step | Setting | Expected result |
|---|---|---|
| 6a | 1.0 | Raw Laplacian values are tiny; most badges show low scores; little spread between frames |
| 6b | 7.62 (calibrated) | Good spread; badge scores use the full 0–100 range relative to the sharpest frame |
| 6c | 20.0 | Signal saturates; many frames score similarly near the top; focus mask nearly fully lit on any textured region |
Pass criteria: At 1.0 all scores are very low and compressed. At 20.0 scores are saturated and compressed at the top. At the calibrated default there is clear separation. Note: Re-score resets this to the calibrated value automatically.
Combined Boundary Test
To confirm parameters do not interact unexpectedly at extremes:
flowchart LR
A([Start]) --> B[Set ALL parameters\nto minimum values]
B --> C[Re-score]
C --> D{App stable?\nScores non-zero?\nMask renders?}
D -->|Fail| E([Log failure])
D -->|Pass| F[Set ALL parameters\nto maximum values]
F --> G[Re-score]
G --> H{App stable?\nScores non-zero?\nMask renders?}
H -->|Fail| E
H -->|Pass| I[Press Reset in\nScoring Parameters sheet]
I --> J[Re-score]
J --> K{Scores match\noriginal baseline?}
K -->|Fail| E
K -->|Pass| L([Pass])
Minimum values: thumbnailMaxPixelSize=512, borderInsetFraction=0%, salientWeight=0.0, subjectSizeFactor=0.0, preBlurRadius=0.3, energyMultiplier=1.0
Maximum values: thumbnailMaxPixelSize=1024, borderInsetFraction=10%, salientWeight=1.0, subjectSizeFactor=3.0, preBlurRadius=4.0, energyMultiplier=20.0
Threads
Concurrency Monitoring — ScanFiles
Analysis
The cooperative thread pool
Swift’s runtime manages a pool of threads capped at roughly the number of CPU cores. Many Apple Silicon M-series chips have 8–12 CPU cores in total, counting both performance and efficiency cores. The output shows a peak of ~10–11 concurrent tasks, which matches this ceiling. This is intentional: the runtime will not spin up 21 threads for 21 tasks because context-switching that many threads would cost more than the parallelism gains.
Why tasks start in a burst then level off
The first 11 tasks start almost simultaneously (active: 1 through active: 11 with no done lines in between). After that the log shows an interleaved pattern of start/done — a slot only opens for a new task when one finishes. This is the scheduler working correctly: it never idles a core while work is queued.
EXIF vs MakerNote — same shape, different reason
Both groups plateau at ~10 active tasks. For EXIF (CGImageSourceCreateWithURL) the bottleneck is reading up to 4 MB of RAW data from disk per file. For MakerNote (SonyMakerNoteParser) it is parsing binary data in memory. They look similar in the log because both are fast enough that the thread pool stays saturated throughout — no task finishes so quickly that it leaves a core idle for long.
What “active” actually measures
The counter tracks tasks that have started but not yet returned. Because these tasks are not async internally — there is no await inside extractExifData or focusLocation — they never suspend. They run to completion on one thread without yielding. This means “active” here equals “threads actually burning CPU right now”, not just “tasks in flight”. That is a heavier workload than async I/O tasks would be, and it is why the pool caps out quickly.
The _DSC8303.ARW anomaly on MakerNote
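It shows active: 7 on its done line instead of a clean countdown. The lock serialises decrement() and any concurrent increment(), so the counter itself stays consistent, but the order in which tasks reach the lock is non-deterministic. The most likely explanation for the stray value is that the log line is written after the lock is released, so a value captured at decrement time can be printed later than lines from tasks that finished after it. This is normal behaviour and does not indicate a problem with the lock.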
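A minimal sketch of the kind of lock-protected counter the log implies. The type name ActiveCounter and its method shapes are assumptions for illustration, not RawCull's actual code. It shows how a value can be captured while the lock is held and then logged afterwards, which is one way log lines can end up out of order.

```swift
import Foundation

/// Hypothetical counter, not RawCull's actual type.
final class ActiveCounter: @unchecked Sendable {
    private let lock = NSLock()
    private var value = 0

    /// Adds one and returns the new value, captured while the lock is held.
    func increment() -> Int {
        lock.lock()
        defer { lock.unlock() }
        value += 1
        return value
    }

    /// Subtracts one and returns the new value, captured while the lock is held.
    func decrement() -> Int {
        lock.lock()
        defer { lock.unlock() }
        value -= 1
        return value
    }
}

let counter = ActiveCounter()
let afterStart = counter.increment()
// The print below runs after the lock has been released, so another task
// could update the counter and log its own line before this one appears.
print("start: active: \(afterStart)")
let afterDone = counter.decrement()
print("done: active: \(afterDone)")
```

Because logging sits outside the critical section, the counter stays correct even when the printed sequence is not strictly monotonic.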
What would look different if something were wrong
- If active never exceeded 1–2 it would mean tasks were serialising somewhere, likely an actor hop or a lock held for too long inside the work itself.
- If active climbed to 21 and stayed there it would mean tasks were suspending (waiting on async I/O) rather than running, which is not the case here.
- If active went negative the counter would be broken. It does not, confirming the nonisolated(unsafe) + NSLock pattern is sound.
sortFiles
Confirmed running off the main thread (number = 2). The @concurrent nonisolated annotation is working as intended — sort work never blocks the main actor.
Summary
21 files, ~10x parallelism, zero idle time between task groups. The scan is as fast as the hardware allows for synchronous CPU and disk work. If memory pressure ever becomes a concern with larger libraries, concurrency could be capped by replacing withTaskGroup with a fixed-width sliding window pattern — but for this workload the current profile is optimal.
Console Output
[ScanFiles] scanFiles: 21 ARW files — starting EXIF task group
[ScanFiles] EXIF start (_DSC8583.ARW) — active: 2/21
[ScanFiles] EXIF start (_DSC8387.ARW) — active: 3/21
[ScanFiles] EXIF start (_DSC8390.ARW) — active: 1/21
[ScanFiles] EXIF start (_DSC8318.ARW) — active: 6/21
[ScanFiles] EXIF start (_DSC8690.ARW) — active: 4/21
[ScanFiles] EXIF start (_DSC8641.ARW) — active: 7/21
[ScanFiles] EXIF start (_DSC8440.ARW) — active: 8/21
[ScanFiles] EXIF start (_DSC8634.ARW) — active: 5/21
[ScanFiles] EXIF start (_DSC8303.ARW) — active: 9/21
[ScanFiles] EXIF start (_DSC8673.ARW) — active: 10/21
[ScanFiles] EXIF start (_DSC8470.ARW) — active: 11/21
[ScanFiles] EXIF done (_DSC8387.ARW) — active: 10
[ScanFiles] EXIF done (_DSC8641.ARW) — active: 10
[ScanFiles] EXIF done (_DSC8690.ARW) — active: 9
[ScanFiles] EXIF start (_DSC8305.ARW) — active: 11/21
[ScanFiles] EXIF start (_DSC8670.ARW) — active: 10/21
[ScanFiles] EXIF start (_DSC8499.ARW) — active: 11/21
[ScanFiles] EXIF done (_DSC8318.ARW) — active: 8
[ScanFiles] EXIF done (_DSC8673.ARW) — active: 9
[ScanFiles] EXIF start (_DSC8304.ARW) — active: 9/21
[ScanFiles] EXIF start (_DSC8500.ARW) — active: 10/21
[ScanFiles] EXIF done (_DSC8634.ARW) — active: 9
[ScanFiles] EXIF start (_DSC8313.ARW) — active: 10/21
[ScanFiles] EXIF done (_DSC8470.ARW) — active: 9
[ScanFiles] EXIF start (_DSC8406.ARW) — active: 10/21
[ScanFiles] EXIF done (_DSC8303.ARW) — active: 9
[ScanFiles] EXIF done (_DSC8440.ARW) — active: 8
[ScanFiles] EXIF start (_DSC8602.ARW) — active: 9/21
[ScanFiles] EXIF start (_DSC8603.ARW) — active: 10/21
[ScanFiles] EXIF done (_DSC8390.ARW) — active: 10
[ScanFiles] EXIF start (_DSC8589.ARW) — active: 10/21
[ScanFiles] EXIF done (_DSC8583.ARW) — active: 9
[ScanFiles] EXIF done (_DSC8500.ARW) — active: 9
[ScanFiles] EXIF done (_DSC8406.ARW) — active: 8
[ScanFiles] EXIF done (_DSC8499.ARW) — active: 7
[ScanFiles] EXIF done (_DSC8305.ARW) — active: 6
[ScanFiles] EXIF done (_DSC8304.ARW) — active: 5
[ScanFiles] EXIF done (_DSC8313.ARW) — active: 4
[ScanFiles] EXIF done (_DSC8602.ARW) — active: 3
[ScanFiles] EXIF done (_DSC8603.ARW) — active: 2
[ScanFiles] EXIF done (_DSC8670.ARW) — active: 1
[ScanFiles] EXIF done (_DSC8589.ARW) — active: 0
[ScanFiles] scanFiles: EXIF group complete — 21/21 items built
[ScanFiles] extractNativeFocusPoints: 21 files — starting MakerNote task group
[ScanFiles] MakerNote start (_DSC8387.ARW) — active: 1/21
[ScanFiles] MakerNote start (_DSC8641.ARW) — active: 2/21
[ScanFiles] MakerNote start (_DSC8690.ARW) — active: 3/21
[ScanFiles] MakerNote start (_DSC8318.ARW) — active: 4/21
[ScanFiles] MakerNote start (_DSC8673.ARW) — active: 5/21
[ScanFiles] MakerNote start (_DSC8634.ARW) — active: 6/21
[ScanFiles] MakerNote start (_DSC8470.ARW) — active: 7/21
[ScanFiles] MakerNote start (_DSC8303.ARW) — active: 8/21
[ScanFiles] MakerNote start (_DSC8440.ARW) — active: 9/21
[ScanFiles] MakerNote done (_DSC8634.ARW) — active: 8
[ScanFiles] MakerNote done (_DSC8318.ARW) — active: 7
[ScanFiles] MakerNote start (_DSC8583.ARW) — active: 9/21
[ScanFiles] MakerNote done (_DSC8470.ARW) — active: 8
[ScanFiles] MakerNote start (_DSC8500.ARW) — active: 8/21
[ScanFiles] MakerNote start (_DSC8406.ARW) — active: 9/21
[ScanFiles] MakerNote start (_DSC8390.ARW) — active: 8/21
[ScanFiles] MakerNote done (_DSC8440.ARW) — active: 8
[ScanFiles] MakerNote start (_DSC8305.ARW) — active: 10/21
[ScanFiles] MakerNote done (_DSC8387.ARW) — active: 8
[ScanFiles] MakerNote start (_DSC8304.ARW) — active: 9/21
[ScanFiles] MakerNote done (_DSC8690.ARW) — active: 7
[ScanFiles] MakerNote start (_DSC8499.ARW) — active: 9/21
[ScanFiles] MakerNote start (_DSC8313.ARW) — active: 8/21
[ScanFiles] MakerNote done (_DSC8500.ARW) — active: 9
[ScanFiles] MakerNote start (_DSC8602.ARW) — active: 8/21
[ScanFiles] MakerNote done (_DSC8641.ARW) — active: 6
[ScanFiles] MakerNote start (_DSC8603.ARW) — active: 7/21
[ScanFiles] MakerNote done (_DSC8583.ARW) — active: 8
[ScanFiles] MakerNote done (_DSC8673.ARW) — active: 7
[ScanFiles] MakerNote start (_DSC8670.ARW) — active: 8/21
[ScanFiles] MakerNote start (_DSC8589.ARW) — active: 9/21
[ScanFiles] MakerNote done (_DSC8602.ARW) — active: 8
[ScanFiles] MakerNote done (_DSC8305.ARW) — active: 7
[ScanFiles] MakerNote done (_DSC8390.ARW) — active: 7
[ScanFiles] MakerNote done (_DSC8406.ARW) — active: 6
[ScanFiles] MakerNote done (_DSC8304.ARW) — active: 5
[ScanFiles] MakerNote done (_DSC8313.ARW) — active: 4
[ScanFiles] MakerNote done (_DSC8499.ARW) — active: 3
[ScanFiles] MakerNote done (_DSC8303.ARW) — active: 7
[ScanFiles] MakerNote done (_DSC8603.ARW) — active: 2
[ScanFiles] MakerNote done (_DSC8670.ARW) — active: 1
[ScanFiles] MakerNote done (_DSC8589.ARW) — active: 0
[ScanFiles] extractNativeFocusPoints: MakerNote group complete — 21/21 found focus data
Finished scanning! Total files: 21
func sortFiles() NOT on main thread, currently on <NSThread: 0xc98ae8540>{number = 2, name = (null)}
Memory Cache
Cache System — RawCull
RawCull uses a three-layer cache to avoid repeated RAW decoding. Decoding an ARW file on demand is expensive — the three-layer approach ensures that most requests are served from RAM or disk rather than from source.
Layers (fastest to slowest):
- Memory cache: NSCache<NSURL, DiscardableThumbnail> in RAM
- Disk cache: JPEG files on disk in ~/Library/Caches/no.blogspot.RawCull/Thumbnails/
- Source decode: CGImageSourceCreateThumbnailAtIndex from the ARW file
The same cache stack is shared by two paths: the bulk preload flow (ScanAndCreateThumbnails) and on-demand per-file requests (RequestThumbnail).
1. Core Types
DiscardableThumbnail
DiscardableThumbnail is the in-memory cache entry. It wraps an NSImage and implements NSDiscardableContent so NSCache can manage it under memory pressure without immediately evicting objects that are currently in use.
final class DiscardableThumbnail: NSObject, NSDiscardableContent, @unchecked Sendable {
let image: NSImage
nonisolated let cost: Int
private let state = OSAllocatedUnfairLock(
initialState: (isDiscarded: false, accessCount: 0)
)
}
Cost calculation happens at initialization from the actual pixel dimensions of all image representations:
cost = (Σ rep.pixelsWide × rep.pixelsHigh × costPerPixel) × 1.1
- Iterates every NSImageRep in the image
- Falls back to logical image.size if no representations are present
- The 1.1 multiplier adds a 10% overhead for wrapper and metadata
- costPerPixel comes from SettingsViewModel.thumbnailCostPerPixel (default: 4, representing RGBA bytes per pixel)
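The cost formula can be sketched as a pure function. RepSize and estimatedCost are hypothetical names standing in for NSImageRep and the initializer logic, the fallback to image.size is not modeled, and integer arithmetic (× 11 / 10) replaces the × 1.1 multiplier to keep the result deterministic.

```swift
/// Pixel dimensions of one image representation
/// (hypothetical stand-in for NSImageRep).
struct RepSize {
    let pixelsWide: Int
    let pixelsHigh: Int
}

/// Estimated cache cost in bytes: summed pixel count times costPerPixel,
/// plus 10% overhead for wrapper and metadata.
func estimatedCost(reps: [RepSize], costPerPixel: Int = 4) -> Int {
    let pixelBytes = reps.reduce(0) { partial, rep in
        partial + rep.pixelsWide * rep.pixelsHigh * costPerPixel
    }
    return pixelBytes * 11 / 10
}

// A 1024 × 683 thumbnail at 4 bytes per pixel costs about 3.1 MB.
let previewCost = estimatedCost(reps: [RepSize(pixelsWide: 1024, pixelsHigh: 683)])
print(previewCost)
```

At that size, a 5000 MB limit holds roughly 1,700 entries, which is why the byte limit rather than the item count governs eviction.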
Thread safety uses OSAllocatedUnfairLock on a tuple (isDiscarded: Bool, accessCount: Int) to keep both fields consistent under concurrent access.
NSDiscardableContent protocol:
| Method | Behavior |
|---|---|
| beginContentAccess() -> Bool | Acquires lock, increments accessCount, returns false if already discarded |
| endContentAccess() | Acquires lock, decrements accessCount |
| discardContentIfPossible() | Acquires lock, marks isDiscarded = true only if accessCount == 0 |
| isContentDiscarded() -> Bool | Acquires lock, returns isDiscarded |
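The table can be read as a small state machine. The sketch below is an illustration under stated assumptions: DiscardState is a hypothetical type, NSLock is swapped in for OSAllocatedUnfairLock so the example has no platform-specific dependency, and the guard against a negative accessCount is an addition not described in the source.

```swift
import Foundation

/// Hypothetical stand-in for the lock-protected state inside DiscardableThumbnail.
final class DiscardState: @unchecked Sendable {
    private let lock = NSLock()
    private var isDiscarded = false
    private var accessCount = 0

    /// Returns false if the content is already discarded; otherwise pins it.
    func beginContentAccess() -> Bool {
        lock.lock(); defer { lock.unlock() }
        if isDiscarded { return false }
        accessCount += 1
        return true
    }

    /// Releases one pin (never drops below zero in this sketch).
    func endContentAccess() {
        lock.lock(); defer { lock.unlock() }
        if accessCount > 0 { accessCount -= 1 }
    }

    /// Discards only when nobody currently holds the content.
    func discardContentIfPossible() {
        lock.lock(); defer { lock.unlock() }
        if accessCount == 0 { isDiscarded = true }
    }

    func isContentDiscarded() -> Bool {
        lock.lock(); defer { lock.unlock() }
        return isDiscarded
    }
}
```

While a caller sits between beginContentAccess() and endContentAccess(), a discard request is a no-op, which is what keeps an image alive while it is being drawn.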
The correct access pattern for any caller:
if let wrapper = SharedMemoryCache.shared.object(forKey: url as NSURL),
wrapper.beginContentAccess() {
defer { wrapper.endContentAccess() }
use(wrapper.image)
} else {
// Cache miss or discarded — fall through to disk or source
}
CacheConfig
CacheConfig is an immutable value type passed to SharedMemoryCache at initialization or after a settings change:
struct CacheConfig {
nonisolated let totalCostLimit: Int // bytes
nonisolated let countLimit: Int
nonisolated var costPerPixel: Int?
static let production = CacheConfig(
totalCostLimit: 500 * 1024 * 1024, // 500 MB default
countLimit: 1000
)
static let testing = CacheConfig(
totalCostLimit: 100_000, // intentionally tiny
countLimit: 5
)
}
In production, totalCostLimit is overwritten from SettingsViewModel.memoryCacheSizeMB when applyConfig runs, and calculateConfig sets countLimit to 10,000. That count limit is intentionally very high, so under normal operation totalCostLimit is always the binding constraint.
CacheDelegate
CacheDelegate implements NSCacheDelegate and counts evictions via an isolated EvictionCounter actor:
final class CacheDelegate: NSObject, NSCacheDelegate, @unchecked Sendable {
nonisolated static let shared = CacheDelegate()
// NSCacheDelegate — called synchronously on NSCache's internal queue
func cache(_ cache: NSCache<AnyObject, AnyObject>, willEvictObject obj: Any) {
guard obj is DiscardableThumbnail else { return }
Task { await evictionCounter.increment() }
}
}
actor EvictionCounter {
private var count = 0
func increment() { count += 1 }
func getCount() -> Int { count }
func reset() { count = 0 }
}
The delegate does not affect eviction behavior — it only feeds the statistics system.
SharedMemoryCache (actor)
SharedMemoryCache is a global actor singleton that owns the NSCache, memory pressure monitoring, and cache statistics.
actor SharedMemoryCache {
nonisolated static let shared = SharedMemoryCache()
// nonisolated(unsafe) allows synchronous access from any context.
// NSCache itself is thread-safe; this is intentional and documented.
nonisolated(unsafe) let memoryCache = NSCache<NSURL, DiscardableThumbnail>()
nonisolated(unsafe) var currentPressureLevel: MemoryPressureLevel = .normal
private var _costPerPixel: Int = 4
private var diskCache: DiskCacheManager
private var memoryPressureSource: DispatchSourceMemoryPressure?
private var setupTask: Task<Void, Never>?
// Statistics (actor-isolated)
private var cacheMemory: Int = 0 // RAM hits
private var cacheDisk: Int = 0 // Disk hits
}
Synchronous accessors — nonisolated, callable from any context without await:
nonisolated func object(forKey key: NSURL) -> DiscardableThumbnail?
nonisolated func setObject(_ obj: DiscardableThumbnail, forKey key: NSURL, cost: Int)
nonisolated func removeAllObjects()
Initialization is gated by a setupTask so that concurrent callers to ensureReady() share a single initialization pass:
func ensureReady(config: CacheConfig? = nil) async {
if let existing = setupTask {
await existing.value
return
}
let task = Task { await self.setCacheCostsFromSavedSettings() }
setupTask = task
await task.value
startMemoryPressureMonitoring()
}
Configuration flow:
ensureReady()
-> setCacheCostsFromSavedSettings()
-> SettingsViewModel.shared.asyncgetsettings()
-> calculateConfig(from:)
-> applyConfig(_:)
calculateConfig converts settings to a CacheConfig:
- totalCostLimit = memoryCacheSizeMB × 1024 × 1024
- countLimit = 10,000 (intentionally very high; memory cost, not item count, is the real constraint)
- costPerPixel = thumbnailCostPerPixel
applyConfig applies the config to NSCache:
- memoryCache.totalCostLimit = config.totalCostLimit
- memoryCache.countLimit = config.countLimit
- memoryCache.delegate = CacheDelegate.shared
DiskCacheManager (actor)
DiskCacheManager stores JPEG thumbnails on disk and retrieves them on RAM cache misses.
actor DiskCacheManager {
private let cacheDirectory: URL
// ~/Library/Caches/no.blogspot.RawCull/Thumbnails/
}
Cache key generation — deterministic MD5 hash of the standardized source path:
func cacheURL(for sourceURL: URL) -> URL {
let standardized = sourceURL.standardizedFileURL.path
let hash = MD5(string: standardized) // hex string
return cacheDirectory.appendingPathComponent(hash + ".jpg")
}
Load — detached userInitiated priority task:
func load(for sourceURL: URL) async -> NSImage? {
let url = cacheURL(for: sourceURL)
return await Task.detached(priority: .userInitiated) {
guard FileManager.default.fileExists(atPath: url.path) else { return nil }
return NSImage(contentsOf: url)
}.value
}
Save — accepts pre-encoded Data (a Sendable type) to cross the actor boundary safely:
func save(_ jpegData: Data, for sourceURL: URL) async {
let url = cacheURL(for: sourceURL)
Task.detached(priority: .background) {
do {
try jpegData.write(to: url)
} catch {
// Log error
}
}
}
// Called inside the actor that owns the CGImage, before crossing actor boundaries
static nonisolated func jpegData(from cgImage: CGImage) -> Data? {
// CGImageDestination → JPEG quality 0.7
}
Cache maintenance:
| Method | Behavior |
|---|---|
| getDiskCacheSize() async -> Int | Sums totalFileAllocatedSize for all .jpg cache files |
| pruneCache(maxAgeInDays: Int = 30) async | Removes files with modification date older than threshold |
Both run in detached utility priority tasks.
2. Memory Pressure Handling
Memory pressure is monitored via DispatchSource.makeMemoryPressureSource:
func startMemoryPressureMonitoring() {
let source = DispatchSource.makeMemoryPressureSource(
eventMask: [.normal, .warning, .critical],
queue: .global(qos: .utility)
)
source.setEventHandler { [weak self] in
Task { await self?.handleMemoryPressureEvent() }
}
source.resume()
memoryPressureSource = source
}
Response by level:
| Level | Action |
|---|---|
| .normal | Log, update currentPressureLevel, notify fileHandlers.memorypressurewarning(false) |
| .warning | Reduce totalCostLimit to 60% of the current limit, notify fileHandlers.memorypressurewarning(true) |
| .critical | removeAllObjects(), set totalCostLimit to 50 MB, notify fileHandlers.memorypressurewarning(true) |
Important detail about warning compounding: the warning level calculates its reduction from the current limit, not the original configured limit. Repeated warning events compound:
Original: 5000 MB
After 1st warning: 3000 MB
After 2nd warning: 1800 MB
After 3rd warning: 1080 MB
The limit is only restored to the configured value when applyConfig() runs again — for example, on app start or after a settings change.
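The compounding can be checked with a few lines of arithmetic. limitAfterWarnings is a hypothetical helper written for this illustration; it works in whole megabytes and uses integer math, which may differ from how RawCull rounds internally.

```swift
/// Limit in MB after `warnings` consecutive warning events,
/// each reducing the current limit to 60% of its value.
func limitAfterWarnings(configuredMB: Int, warnings: Int) -> Int {
    var limit = configuredMB
    for _ in 0..<warnings {
        limit = limit * 60 / 100
    }
    return limit
}

for count in 0...3 {
    let limit = limitAfterWarnings(configuredMB: 5000, warnings: count)
    print("after \(count) warning(s): \(limit) MB")
}
```

Three warnings in a row leave about 22% of the configured limit, so a long session under sustained pressure can shrink the cache substantially until applyConfig() runs again.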
3. Cache Statistics
SharedMemoryCache tracks hits and evictions in actor-isolated counters:
- cacheMemory: incremented on every RAM hit (via updateCacheMemory())
- cacheDisk: incremented on every disk hit (via updateCacheDisk())
- Eviction count: tracked by CacheDelegate.EvictionCounter
getCacheStatistics() async -> CacheStatistics returns a snapshot:
struct CacheStatistics {
nonisolated let hits: Int
nonisolated let misses: Int
nonisolated let evictions: Int
nonisolated let hitRate: Double // (hits / (hits + misses)) * 100
}
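The hitRate comment translates directly into a small function. This is a sketch: the source does not say how hits and misses are derived from cacheMemory and cacheDisk, and returning 0 when there have been no requests is an assumption made here to avoid dividing by zero.

```swift
/// Hit rate as a percentage; returns 0 when nothing has been requested yet.
func hitRate(hits: Int, misses: Int) -> Double {
    let total = hits + misses
    guard total > 0 else { return 0 }
    return Double(hits) / Double(total) * 100
}

print(hitRate(hits: 3, misses: 1))
```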
clearCaches() async:
- Reads and logs final statistics
- memoryCache.removeAllObjects()
- diskCache.pruneCache(maxAgeInDays: 0), which prunes all files
- Resets cacheMemory, cacheDisk, and eviction count to 0
4. End-to-End Cache Flow
Request thumbnail for URL
│
├─ Check SharedMemoryCache.object(forKey:)
│ ├─ Hit: beginContentAccess() → use image → endContentAccess() → return
│ └─ Miss:
│ ├─ Check DiskCacheManager.load(for:)
│ │ ├─ Hit: wrap as DiscardableThumbnail → store in NSCache → return
│ │ └─ Miss:
│ │ ├─ SonyThumbnailExtractor.extractSonyThumbnail(from:maxDimension:qualityCost:)
│ │ ├─ Normalize CGImage → NSImage (JPEG-backed, quality 0.7)
│ │ ├─ Create DiscardableThumbnail → store in NSCache (with cost)
│ │ └─ Encode JPEG data → DiskCacheManager.save(_:for:) [detached, background]
└─ Return CGImage to caller
5. Settings That Affect Cache Behavior
Settings live in SettingsViewModel and are persisted to ~/Library/Application Support/RawCull/settings.json.
| Setting | Default | Effect |
|---|---|---|
| memoryCacheSizeMB | 5000 | NSCache.totalCostLimit = memoryCacheSizeMB × 1024 × 1024 |
| thumbnailCostPerPixel | 4 | Cost per pixel in DiscardableThumbnail.cost |
| thumbnailSizePreview | 1024 | Target size for bulk preload; affects entry cost |
| thumbnailSizeGrid | 100 | Grid thumbnail size |
| thumbnailSizeGridView | 400 | Grid View thumbnail size |
| thumbnailSizeFullSize | 8700 | Full-size zoom path upper bound |
SettingsViewModel.validateSettings() emits warnings if:
- memoryCacheSizeMB < 500
- memoryCacheSizeMB > 80% of available system memory
6. Cache Flow Diagram
flowchart TD
A[Thumbnail Requested] --> B{Memory Cache Hit?}
B -- Yes --> C[beginContentAccess]
C --> D[Return NSImage]
D --> E[endContentAccess]
B -- No --> F{Disk Cache Hit?}
F -- Yes --> G[Load JPEG from Disk]
G --> H[Wrap as DiscardableThumbnail]
H --> I[Store in NSCache with cost]
I --> D
F -- No --> J[SonyThumbnailExtractor]
J --> K[Normalize to JPEG-backed NSImage]
K --> L[Create DiscardableThumbnail]
L --> I
K --> M[Encode JPEG data]
M --> N[DiskCacheManager.save - detached background]
7. Memory Pressure Response Diagram
flowchart TD
A[DispatchSourceMemoryPressure event] --> B{Pressure Level?}
B -- normal --> C[Log + update currentPressureLevel]
C --> D[Notify UI: no warning]
B -- warning --> E[Reduce limit to 60% of current]
E --> F[Notify UI: warning]
B -- critical --> G[removeAllObjects]
G --> H[Set limit to 50 MB]
H --> I[Notify UI: warning]
Thumbnails
Thumbnails and Previews — RawCull
RawCull handles Sony ARW files through two distinct image paths:
- Generated thumbnails — fast, sized-down previews for browsing and culling, extracted with ImageIO and cached in RAM and on disk
- Embedded JPEG previews — full-resolution embedded JPEGs extracted from the ARW binary for high-quality inspection and export
Both paths integrate with the shared three-layer cache system: RAM → disk → source decode.
1. Thumbnail Sizes and Settings
All thumbnail dimensions are configurable via SettingsViewModel and persisted to ~/Library/Application Support/RawCull/settings.json.
| Setting | Default | Usage |
|---|---|---|
| thumbnailSizeGrid | 100 | Small thumbnails in grid list view |
| thumbnailSizeGridView | 400 | Thumbnails in the main grid view |
| thumbnailSizePreview | 1024 | Bulk preload target size |
| thumbnailSizeFullSize | 8700 | Upper bound for full-size zoom path |
| thumbnailCostPerPixel | 4 | RGBA bytes per pixel; drives cache cost calculation |
| useThumbnailAsZoomPreview | false | Reuse cached thumbnail instead of re-extracting for zoom |
All extraction uses max pixel size on the longest edge (kCGImageSourceThumbnailMaxPixelSize). Actual width and height depend on the source aspect ratio.
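To make the longest-edge rule concrete, here is a sketch of how output dimensions follow from the source aspect ratio. fittedSize is a hypothetical helper; rounding to the nearest pixel and never upscaling are assumptions of this sketch, and ImageIO's exact rounding may differ.

```swift
/// Size after fitting the longest edge to maxPixelSize, preserving aspect ratio.
/// Images already within the limit are returned unchanged.
func fittedSize(width: Int, height: Int, maxPixelSize: Int) -> (width: Int, height: Int) {
    let longest = max(width, height)
    guard longest > 0, longest > maxPixelSize else { return (width, height) }
    let scale = Double(maxPixelSize) / Double(longest)
    let scaledWidth = Int((Double(width) * scale).rounded())
    let scaledHeight = Int((Double(height) * scale).rounded())
    return (scaledWidth, scaledHeight)
}

// A 3:2 landscape frame and its portrait counterpart.
print(fittedSize(width: 6000, height: 4000, maxPixelSize: 1200))
print(fittedSize(width: 4000, height: 6000, maxPixelSize: 1200))
```

The same setting therefore yields different widths for landscape and portrait frames, which matters when estimating cache cost per entry.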
2. Generated Thumbnail Pipeline — SonyThumbnailExtractor
SonyThumbnailExtractor is a nonisolated static enum. Its extractSonyThumbnail method is the primary entry point for generating thumbnails from ARW files.
2.1 Async dispatch
To prevent blocking actor queues during CPU-intensive ImageIO work, extraction is dispatched to the global userInitiated GCD queue via withCheckedThrowingContinuation:
static func extractSonyThumbnail(
from url: URL,
maxDimension: CGFloat,
qualityCost: Int = 4
) async throws -> CGImage {
try await withCheckedThrowingContinuation { continuation in
DispatchQueue.global(qos: .userInitiated).async {
do {
let image = try Self.extractSync(from: url, maxDimension: maxDimension, qualityCost: qualityCost)
continuation.resume(returning: image)
} catch {
continuation.resume(throwing: error)
}
}
}
}
2.2 ImageIO extraction (extractSync)
extractSync is nonisolated and runs synchronously on the GCD thread:
let sourceOptions: [CFString: Any] = [kCGImageSourceShouldCache: false]
guard let source = CGImageSourceCreateWithURL(url as CFURL, sourceOptions as CFDictionary) else {
throw ThumbnailError.invalidSource
}
let thumbOptions: [CFString: Any] = [
kCGImageSourceCreateThumbnailFromImageAlways: true,
kCGImageSourceCreateThumbnailWithTransform: true,
kCGImageSourceThumbnailMaxPixelSize: maxDimension,
kCGImageSourceShouldCacheImmediately: true
]
guard let cgImage = CGImageSourceCreateThumbnailAtIndex(source, 0, thumbOptions as CFDictionary) else {
throw ThumbnailError.generationFailed
}
return try rerender(cgImage, qualityCost: qualityCost)
kCGImageSourceShouldCache: false on the source prevents ImageIO from caching the raw input; kCGImageSourceShouldCacheImmediately: true on the thumbnail options ensures the decoded output pixels are available immediately.
2.3 Re-rendering with interpolation quality (rerender)
After ImageIO decodes the thumbnail, rerender redraws it into a new CGContext. This applies controlled interpolation quality and normalizes the pixel format to sRGB premultipliedLast:
private static nonisolated func rerender(_ image: CGImage, qualityCost: Int) throws -> CGImage {
let quality: CGInterpolationQuality = switch qualityCost {
case 1...2: .low
case 3...4: .medium
default: .high
}
guard let colorSpace = CGColorSpace(name: CGColorSpace.sRGB),
let ctx = CGContext(
data: nil,
width: image.width,
height: image.height,
bitsPerComponent: 8,
bytesPerRow: 0,
space: colorSpace,
bitmapInfo: CGImageAlphaInfo.premultipliedLast.rawValue
) else {
throw ThumbnailError.contextCreationFailed
}
ctx.interpolationQuality = quality
ctx.draw(image, in: CGRect(x: 0, y: 0, width: image.width, height: image.height))
guard let rendered = ctx.makeImage() else {
throw ThumbnailError.generationFailed
}
return rendered
}
Interpolation quality mapping:
| thumbnailCostPerPixel | CGInterpolationQuality |
|---|---|
| 1–2 | .low — fastest, lowest quality |
| 3–4 | .medium — balanced (default) |
| 5+ | .high — best quality, highest CPU |
3. Thumbnail Normalization
Before storing in the cache, ScanAndCreateThumbnails normalizes the CGImage to an NSImage backed by a single JPEG representation. This ensures that the memory entry and the disk entry are consistent with each other.
func cgImageToNormalizedNSImage(_ cgImage: CGImage) throws -> NSImage {
// Encode to JPEG at quality 0.7
guard let jpegData = DiskCacheManager.jpegData(from: cgImage) else {
throw ThumbnailError.generationFailed
}
// Decode back to NSImage — now backed by exactly one NSBitmapImageRep
guard let image = NSImage(data: jpegData) else {
throw ThumbnailError.generationFailed
}
return image
}
The inverse direction (NSImage → CGImage) is also needed when promoting a disk-cached JPEG to the RAM cache:
func nsImageToCGImage(_ nsImage: NSImage) throws -> CGImage {
// Try direct CGImage extraction first
if let rep = nsImage.representations.first as? NSBitmapImageRep,
let cg = rep.cgImage { return cg }
// Fallback: TIFF round-trip
guard let tiffData = nsImage.tiffRepresentation,
let src = CGImageSourceCreateWithData(tiffData as CFData, nil),
let cg = CGImageSourceCreateImageAtIndex(src, 0, nil) else {
throw ThumbnailError.generationFailed
}
return cg
}
4. Preload Flow (Bulk) — ScanAndCreateThumbnails
When the user selects a catalog, RawCullViewModel.handleSourceChange(url:) triggers bulk preloading.
4.1 Concurrency model
preloadCatalog(at:targetSize:) uses a withTaskGroup bounded to ProcessInfo.processInfo.activeProcessorCount * 2 concurrent tasks. Back-pressure is applied with await group.next() before adding a new task when the limit is reached.
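A minimal sketch of a bounded task group with back-pressure, in the spirit of the description above. forEachBounded and its parameters are hypothetical names, not RawCull's API; the real preloadCatalog also handles caching, progress, and cancellation, which are left out here.

```swift
/// Runs `work` for every input with at most `maxConcurrent` child tasks in flight.
func forEachBounded<Input: Sendable>(
    _ inputs: [Input],
    maxConcurrent: Int,
    work: @escaping @Sendable (Input) async -> Void
) async {
    precondition(maxConcurrent > 0, "maxConcurrent must be positive")
    await withTaskGroup(of: Void.self) { group in
        var inFlight = 0
        for input in inputs {
            if inFlight >= maxConcurrent {
                // Back-pressure: wait for one child to finish before adding another.
                _ = await group.next()
                inFlight -= 1
            }
            group.addTask { await work(input) }
            inFlight += 1
        }
        // Drain the children that are still running.
        await group.waitForAll()
    }
}
```

A caller following the text would pass ProcessInfo.processInfo.activeProcessorCount * 2 as maxConcurrent. The limit bounds how many decoded images exist at once, which is the main reason to cap a group whose children allocate large buffers.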
4.2 Per-file processing
For each ARW file, processSingleFile(_:targetSize:itemIndex:) runs the three-tier cache lookup:
RAM cache → Disk cache → SonyThumbnailExtractor
Cancellation is checked with Task.isCancelled before and after every expensive operation. If cancelled mid-extraction, the result is discarded and no write occurs.
4.3 Caching after extraction
On a successful source extraction:
- cgImageToNormalizedNSImage(_:) produces a single-representation NSImage.
- storeInMemoryCache(_:for:) creates a DiscardableThumbnail using pixel-accurate cost and stores it in SharedMemoryCache.
- DiskCacheManager.jpegData(from:) encodes to JPEG (quality 0.7). This is called on the actor while the CGImage is still accessible.
- diskCache.save(_:for:) writes the data from a detached background task.
4.4 Request coalescing
If thumbnail(for:targetSize:) is called concurrently for the same URL while extraction is in progress, an inflightTasks: [URL: Task<CGImage, Error>] dictionary provides coalescing. Subsequent callers await the existing task rather than launching duplicate extraction work.
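Request coalescing with an in-flight dictionary can be sketched as follows. RequestCoalescer and its produce parameter are hypothetical; the value type is generic here so the example does not depend on CGImage, whereas the source keys a [URL: Task<CGImage, Error>] dictionary.

```swift
import Foundation

/// Hypothetical coalescer: concurrent requests for the same URL share one task.
actor RequestCoalescer<Value: Sendable> {
    private var inflight: [URL: Task<Value, Error>] = [:]

    func value(
        for url: URL,
        produce: @escaping @Sendable () async throws -> Value
    ) async throws -> Value {
        // A request for this URL is already running: await it instead of starting another.
        if let existing = inflight[url] {
            return try await existing.value
        }
        let task = Task { try await produce() }
        inflight[url] = task
        // Remove the entry once the work finishes, whether it succeeded or threw.
        defer { inflight[url] = nil }
        return try await task.value
    }
}
```

The check and the dictionary write happen without an intervening suspension point, so two callers cannot both miss the entry and start duplicate work.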
5. On-Demand Thumbnails — ThumbnailLoader + RequestThumbnail
UI elements (grid view, file list, inspector) request thumbnails through the on-demand path.
5.1 ThumbnailLoader (rate limiting)
ThumbnailLoader.shared is a global actor singleton that caps concurrent thumbnail loads at 6. Requests beyond this limit suspend via CheckedContinuation and queue in pendingContinuations. When a slot is released, the next waiting continuation is resumed. The target size passed to RequestThumbnail is thumbnailSizePreview (default 1024).
If a waiting task is cancelled before its slot becomes available, its continuation is removed from the queue by UUID so it is never spuriously resumed.
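The slot mechanism can be sketched as a small actor. SlotLimiter is a hypothetical type written for illustration, and it deliberately leaves out the cancellation path described above (removing a waiting continuation by UUID), so it should not be read as ThumbnailLoader's implementation.

```swift
/// Hypothetical limiter: at most `maxActive` holders at a time, others wait in FIFO order.
actor SlotLimiter {
    private let maxActive: Int
    private var active = 0
    private var waiters: [CheckedContinuation<Void, Never>] = []

    init(maxActive: Int) {
        self.maxActive = maxActive
    }

    /// Returns immediately if a slot is free; otherwise suspends until one is handed over.
    func acquire() async {
        if active < maxActive {
            active += 1
            return
        }
        await withCheckedContinuation { continuation in
            waiters.append(continuation)
        }
    }

    /// Frees a slot, or hands it directly to the oldest waiter.
    func release() {
        if waiters.isEmpty {
            active -= 1
        } else {
            // The slot changes owner, so `active` stays the same.
            waiters.removeFirst().resume()
        }
    }
}
```

Handing the slot to the waiter inside release(), rather than decrementing and letting the waiter re-check, avoids a window in which a newly arriving caller could take the slot ahead of the queue.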
5.2 RequestThumbnail (cache pipeline)
RequestThumbnail handles the actual per-file resolution for the on-demand path:
- ensureReady(): the setupTask gate ensures SharedMemoryCache is configured once.
- RAM cache lookup via SharedMemoryCache.object(forKey:) + beginContentAccess().
- Disk cache lookup via DiskCacheManager.load(for:).
- Extraction fallback via SonyThumbnailExtractor.extractSonyThumbnail(from:maxDimension:qualityCost:).
- Store in RAM cache.
- Disk save via a detached background task.
requestThumbnail(for:targetSize:) returns CGImage? for direct use by SwiftUI views.
6. Embedded JPEG Preview Extraction — JPGSonyARWExtractor
Embedded JPEG previews are distinct from generated thumbnails. They are the full-resolution previews baked into the ARW file by the camera, and are used for high-quality inspection and export.
JPGSonyARWExtractor is a nonisolated static enum dispatched to DispatchQueue.global(qos: .utility).
6.1 JPEG detection algorithm
ARW files contain multiple image sub-images. The extractor iterates all of them and identifies JPEG candidates by:
- Presence of kCGImagePropertyJFIFDictionary in image properties, or
- Compression value 6 (JPEG) in the kCGImagePropertyTIFFDictionary.
Among all JPEG candidates, the widest image is selected. Width is read from kCGImagePropertyPixelWidth, falling back to the TIFF or EXIF width dictionary entries.
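The selection rule can be sketched independently of ImageIO. SubImage is a hypothetical summary of the properties the extractor would read for each sub-image; the width fallbacks to the TIFF or EXIF entries are assumed to have been resolved before this step.

```swift
/// Hypothetical summary of one sub-image's properties.
struct SubImage {
    let index: Int
    let pixelWidth: Int
    let hasJFIFDictionary: Bool
    let tiffCompression: Int?
}

/// Returns the index of the widest JPEG candidate, or nil when there is none.
func widestJPEGIndex(in images: [SubImage]) -> Int? {
    let candidates = images.filter { $0.hasJFIFDictionary || $0.tiffCompression == 6 }
    return candidates.max(by: { $0.pixelWidth < $1.pixelWidth })?.index
}

// Illustrative values only: a small thumbnail, a mid-size preview,
// a full-size preview, and the RAW data itself (not a JPEG candidate).
let subImages = [
    SubImage(index: 0, pixelWidth: 160, hasJFIFDictionary: true, tiffCompression: nil),
    SubImage(index: 1, pixelWidth: 1616, hasJFIFDictionary: false, tiffCompression: 6),
    SubImage(index: 2, pixelWidth: 8640, hasJFIFDictionary: true, tiffCompression: nil),
    SubImage(index: 3, pixelWidth: 9000, hasJFIFDictionary: false, tiffCompression: nil)
]
print(widestJPEGIndex(in: subImages) ?? -1)
```

Filtering before taking the maximum matters: the widest sub-image overall is usually the RAW data, which is exactly what this path is trying to avoid decoding.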
6.2 Downsampling large previews
let maxThumbnailSize: CGFloat = fullSize ? 8640 : 4320
If the selected JPEG’s width exceeds maxThumbnailSize, it is downsampled using ImageIO:
let thumbOptions: [CFString: Any] = [
kCGImageSourceCreateThumbnailFromImageAlways: true,
kCGImageSourceCreateThumbnailWithTransform: true,
kCGImageSourceThumbnailMaxPixelSize: Int(maxThumbnailSize)
]
CGImageSourceCreateThumbnailAtIndex(source, index, thumbOptions as CFDictionary)
If the JPEG is already smaller than maxThumbnailSize, it is decoded at its original size with CGImageSourceCreateImageAtIndex.
7. JPEG Export — SaveJPGImage
SaveJPGImage.save(image:originalURL:) writes an extracted CGImage alongside the original ARW:
- Output path: originalURL with .arw extension replaced by .jpg
- Compression quality: 1.0 (maximum quality; JPEG encoding is still lossy)
- Format: JPEG via CGImageDestinationCreateWithURL + CGImageDestinationFinalize
The method is nonisolated and runs on the global queue via async, keeping actor queues clear of blocking I/O.
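Deriving the output path is a one-line URL transformation. jpegURL(forARW:) is a hypothetical helper shown for illustration; how SaveJPGImage actually builds the path is not shown in this document.

```swift
import Foundation

/// Sibling .jpg URL for an ARW file: same folder, same base name.
func jpegURL(forARW originalURL: URL) -> URL {
    originalURL.deletingPathExtension().appendingPathExtension("jpg")
}

let source = URL(fileURLWithPath: "/tmp/_DSC8303.ARW")
print(jpegURL(forARW: source).lastPathComponent)
```

Working on the URL rather than on the path string keeps the result correct when a folder name elsewhere in the path happens to contain ".arw".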
8. Error Handling
ThumbnailError defines three typed errors for the thumbnail pipeline:
enum ThumbnailError: Error, LocalizedError {
case invalidSource // CGImageSourceCreateWithURL returned nil
case generationFailed // CGImageSourceCreateThumbnailAtIndex or CGContext.makeImage returned nil
case contextCreationFailed // CGContext creation failed
}
All callers catch errors, log the failure with file path and description, and return nil to the consumer. A single corrupt or unreadable file does not interrupt bulk processing — the preload loop and extraction loop both continue with the next file.
9. Flow Diagrams
9.1 Bulk Preload (ScanAndCreateThumbnails)
flowchart TD
A[Catalog Selected] --> B[handleSourceChange — MainActor]
B --> C[ScanAndCreateThumbnails.preloadCatalog]
C --> D[ensureReady — SharedMemoryCache + settings]
D --> E[cancelPreload — cancel prior inner task]
E --> F[DiscoverFiles — enumerate ARW files]
F --> G[withTaskGroup — capped at processorCount × 2]
G --> H{RAM Cache Hit?}
H -- Yes --> P[Update progress / ETA]
H -- No --> I{Disk Cache Hit?}
I -- Yes --> J[Load JPEG from disk]
J --> K[Promote to RAM cache]
K --> P
I -- No --> L[SonyThumbnailExtractor — ImageIO decode]
L --> M[Rerender with interpolation quality]
M --> N[Normalize to JPEG-backed NSImage]
N --> O[Store in RAM cache]
O --> Q[Encode JPEG data]
Q --> R[DiskCacheManager.save — detached background]
R --> P
P --> S{More files?}
S -- Yes --> G
S -- No --> T[Return count to RawCullViewModel]
9.2 On-Demand Request (ThumbnailLoader + RequestThumbnail)
sequenceDiagram
participant UI
participant TL as ThumbnailLoader.shared
participant RT as RequestThumbnail
participant MC as SharedMemoryCache
participant DC as DiskCacheManager
participant EX as SonyThumbnailExtractor
UI->>TL: thumbnailLoader(file:)
TL->>TL: acquireSlot() — suspend if activeTasks ≥ 6
TL->>RT: requestThumbnail(for:targetSize:)
RT->>MC: object(forKey:) + beginContentAccess()
alt RAM hit
MC-->>RT: DiscardableThumbnail.image
RT-->>UI: CGImage
else RAM miss
RT->>DC: load(for:)
alt Disk hit
DC-->>RT: NSImage
RT->>MC: setObject(DiscardableThumbnail)
RT-->>UI: CGImage
else Disk miss
RT->>EX: extractSonyThumbnail(from:maxDimension:qualityCost:)
EX-->>RT: CGImage
RT->>MC: setObject(DiscardableThumbnail)
RT->>DC: save — detached background
RT-->>UI: CGImage
end
end
TL->>TL: releaseSlot() - resume next pending continuation
9.3 Embedded JPEG Extraction (JPGSonyARWExtractor)
flowchart TD
A[ARW file URL] --> B[CGImageSourceCreateWithURL]
B --> C[Iterate all sub-images]
C --> D{JPEG candidate?}
D -- JFIF dict present OR TIFF compression == 6 --> E[Record width]
D -- No --> C
E --> F{More images?}
F -- Yes --> C
F -- No --> G[Select widest JPEG candidate]
G --> H{Width > maxThumbnailSize?}
H -- Yes --> I[Downsample via kCGImageSourceThumbnailMaxPixelSize]
H -- No --> J[Decode at original size]
I --> K[Return CGImage]
J --> K
10. Settings Reference
| Setting | Default | Effect |
|---|---|---|
| memoryCacheSizeMB | 5000 | Sets NSCache.totalCostLimit |
| thumbnailCostPerPixel | 4 | Drives DiscardableThumbnail cost and interpolation quality |
| thumbnailSizePreview | 1024 | Bulk preload target size |
| thumbnailSizeGrid | 100 | Grid list thumbnail size |
| thumbnailSizeGridView | 400 | Main grid view thumbnail size |
| thumbnailSizeFullSize | 8700 | Full-size zoom path upper bound |
| useThumbnailAsZoomPreview | false | Skip re-extraction and use cached thumbnail for zoom |
Scan and Thumbnail Pipeline
RawCull — Scan and Thumbnail Pipeline
This document describes the complete execution flow from the moment a user opens a catalog folder to the point where all thumbnails are visible in the grid. It covers the actors involved, the data flow between them, the concurrency model, five performance bugs that were found and fixed, and the measured results on a real catalog of 809 Sony A1 ARW files stored on an external 800 MB/s SSD.
1. Overview
Opening a catalog triggers two parallel workstreams:
User opens folder
│
├─► ScanFiles.scanFiles() — discovers files, reads EXIF and focus points
│
└─► ScanAndCreateThumbnails
.preloadCatalog() — generates or loads thumbnails for every file
Both workstreams are Swift actors. Each uses a withTaskGroup internally to
process files concurrently. Both report progress back to the SwiftUI layer via
@MainActor callbacks.
2. Phase 1 — File scan (ScanFiles)
File: RawCull/Actors/ScanFiles.swift
2.1 Directory discovery
let contents = try FileManager.default.contentsOfDirectory(
at: url,
includingPropertiesForKeys: [.nameKey, .fileSizeKey, .contentTypeKey,
.contentModificationDateKey],
options: [.skipsHiddenFiles]
)
FileManager.contentsOfDirectory returns all entries in one call. File-system
metadata (name, size, type, modification date) is prefetched via
includingPropertiesForKeys — no per-file stat() calls are needed later.
2.2 Concurrent EXIF extraction (withTaskGroup)
await withTaskGroup(of: FileItem?.self) { group in
for fileURL in contents {
guard fileURL.pathExtension.lowercased() == "arw" else { continue }
let progress = onProgress
let count = discoveredCount
Task { @MainActor in progress?(count) } // fire-and-forget UI update
group.addTask {
let res = try? fileURL.resourceValues(forKeys: Set(keys))
let exif = self.extractExifData(from: fileURL) // nonisolated
return FileItem(url: fileURL, name: res?.name … exifData: exif)
}
}
…
}
For each ARW file a task is added to the group. The loop itself is non-blocking:
progress callbacks are fired to the main actor without await, so the loop
completes almost instantly and the task group fills up immediately.
extractExifData uses Apple’s ImageIO framework:
private nonisolated func extractExifData(from url: URL) -> ExifMetadata? {
guard let src = CGImageSourceCreateWithURL(url as CFURL, nil),
let props = CGImageSourceCopyPropertiesAtIndex(src, 0, nil) …
nonisolated is critical here: it tells Swift the method does not access any
actor-isolated state, so task group tasks can call it directly on the global
cooperative thread pool without hopping back to the ScanFiles actor’s serial
executor.
CGImageSourceCopyPropertiesAtIndex reads the TIFF/EXIF header from the file.
For a Sony ARW this is the first few kilobytes — not the full ~50 MB RAW image.
Measured throughput: ~2–3 ms per file. 809 files concurrently ≈ 3–4 seconds.
2.3 Concurrent focus point extraction
After the EXIF task group completes, focus points are extracted with a second concurrent task group:
private func extractNativeFocusPoints(from items: [FileItem]) async
-> [DecodeFocusPoints]?
{
await withTaskGroup(of: DecodeFocusPoints?.self) { group in
for item in items {
group.addTask {
guard let loc = SonyMakerNoteParser.focusLocation(from: item.url)
else { return nil }
return DecodeFocusPoints(sourceFile: item.url.lastPathComponent,
focusLocation: loc)
}
}
…
}
}
SonyMakerNoteParser.focusLocation is nonisolated static, so — like
extractExifData — task group tasks run it directly on the thread pool.
What SonyMakerNoteParser does
Sony ARW is TIFF-based. Focus location lives at:
TIFF IFD0 → ExifIFD (tag 0x8769) → MakerNote (tag 0x927C)
→ Sony MakerNote IFD → FocusLocation (tag 0x2027)
Tag 0x2027 is int16u[4] = [sensorWidth, sensorHeight, focusX, focusY]
in full sensor pixel coordinates. The parser navigates the TIFF IFD chain in
binary using only the bytes it needs.
Key implementation detail: the parser reads only the first 4 MB of the file:
guard let fh = try? FileHandle(forReadingFrom: url) else { return nil }
defer { try? fh.close() }
guard let data = try? fh.read(upToCount: 4 * 1024 * 1024) …
Sony ARW MakerNote metadata sits well within the first 1–2 MB of the file. Reading 4 MB is a conservative safe limit. The full RAW image data follows later in the file and is never touched.
Measured throughput: ~0.3–0.4 ms per file. 809 files concurrently ≈ < 1 second.
3. Phase 2 — Thumbnail generation (ScanAndCreateThumbnails)
File: RawCull/Actors/ScanAndCreateThumbnails.swift
3.1 Sliding-window task group
let maxConcurrent = ProcessInfo.processInfo.activeProcessorCount * 2
for (index, url) in urls.enumerated() {
if index >= maxConcurrent {
await group.next() // keep at most maxConcurrent in flight
}
group.addTask {
await self.processSingleFile(url, targetSize: targetSize, …)
}
}
On a Mac Mini M2 (10 reported cores), maxConcurrent = 20. The sliding window
ensures at most 20 files are being processed simultaneously, preventing memory
pressure from loading too many large images at once.
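The sliding window can be tried in isolation with a generic helper. This is a sketch, not RawCull code: `slidingWindowMap` and its parameters are hypothetical names, and unlike the loop above it collects results by index so that output order matches input order.

```swift
// Runs `work` over `inputs` with at most `maxConcurrent` child tasks in flight.
// Hypothetical helper for illustration only.
func slidingWindowMap<Input: Sendable, Output: Sendable>(
    _ inputs: [Input],
    maxConcurrent: Int,
    work: @escaping @Sendable (Input) async -> Output
) async -> [Output] {
    await withTaskGroup(of: (Int, Output).self) { group in
        var results = [Output?](repeating: nil, count: inputs.count)
        for (index, input) in inputs.enumerated() {
            // Once the window is full, wait for one task to finish
            // before adding the next one.
            if index >= maxConcurrent, let (i, value) = await group.next() {
                results[i] = value
            }
            group.addTask { (index, await work(input)) }
        }
        // Drain whatever is still in flight after the loop ends.
        for await (i, value) in group {
            results[i] = value
        }
        return results.compactMap { $0 }
    }
}
```

A call such as `await slidingWindowMap(urls, maxConcurrent: 20) { await decode($0) }` would keep at most 20 decodes running, where `decode` stands in for whatever per-file work is needed.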
3.2 Per-file processing (processSingleFile)
Each task follows a three-tier lookup:
A. RAM cache (NSCache) → microseconds, no I/O
B. Disk cache (JPEG) → ~1–5 ms, reads ~494 KB from internal SSD
C. RAW extraction → ~180–200 ms, decodes full ARW via ImageIO
A. RAM cache
SharedMemoryCache is a global actor wrapping NSCache. A cache hit is a
synchronous dictionary lookup — effectively free.
B. Disk cache
Thumbnails are stored as JPEG files at
~/Library/Caches/no.blogspot.RawCull/Thumbnails/. The filename is an MD5
hash of the source file’s absolute path. Each cached thumbnail is ~494 KB
(512 px longest edge, JPEG quality 0.7).
DiskCacheManager.load(for:) spawns a Task.detached for the file read,
releasing the actor during I/O.
After a first full scan, the disk cache is ~400 MB for 809 files.
C. RAW extraction
let cgImage = try await SonyThumbnailExtractor.extractSonyThumbnail(
from: url,
maxDimension: CGFloat(targetSize),
qualityCost: costPerPixel
)
SonyThumbnailExtractor hops immediately to DispatchQueue.global() so the
actor is not blocked during the ~180–200 ms decode:
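try await withCheckedThrowingContinuation { continuation in
    DispatchQueue.global(qos: .userInitiated).async {
        do {
            let image = try Self.extractSync(from: url, …)
            continuation.resume(returning: image)
        } catch {
            // Forward the decode error so the awaiting caller is always resumed.
            continuation.resume(throwing: error)
        }
    }
}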
Internally this calls CGImageSourceCreateThumbnailAtIndex which uses the
embedded JPEG preview inside the ARW where available, avoiding a full RAW
decode.
After extraction the thumbnail is:
- Stored in the RAM cache (NSCache) immediately.
- Encoded to JPEG data and written to the disk cache via a background Task.detached; this write does not block the thumbnail pipeline.
3.3 Progress notification (fire-and-forget)
After each file completes, the UI is notified:
private func notifyFileHandler(_ count: Int) {
let handler = fileHandlers?.fileHandler
Task { @MainActor in handler?(count) }
}
The Task { @MainActor in } delivers the update to SwiftUI without blocking
the current task. Thumbnail generation does not wait for the UI to finish
rendering before moving on to the next file.
4. Performance bugs found and fixed
Four of the five bugs (1, 2, 3 and 5) shared one root cause: work that should
have been concurrent was serialised on an actor, either by an await on a
@MainActor or actor-isolated method inside a hot loop, or by a synchronous loop
running on the actor itself. Bug 4 was different: an unbounded file read.
Swift 6’s SWIFT_DEFAULT_ACTOR_ISOLATION = MainActor makes the first kind easy
to introduce: a method that touches no actor state is still actor-isolated
by default and must be explicitly marked nonisolated to opt out.
Bug 1 — Thumbnail pipeline blocked on UI callbacks
File: ScanAndCreateThumbnails.processSingleFile
Symptom: 809 thumbnails generated at ~199 ms/file wall-clock despite a 20-slot concurrent task group.
Root cause:
// Before
await fileHandlers?.fileHandler(newCount)
await updateEstimatedTime(for: startTime, itemsProcessed: newCount)
fileHandlers.fileHandler is @MainActor. Every completed thumbnail called
await on it, suspending the task until SwiftUI finished rendering the updated
grid. The main actor processed these callbacks serially. With 809 files the
entire thumbnail pipeline serialised behind 809 sequential SwiftUI re-renders.
Measured cost: ~78 ms per file × 809 = ~63 seconds on second run (disk cache I/O itself is < 1 ms per file on the internal SSD).
Fix:
// After
private func notifyFileHandler(_ count: Int) {
let handler = fileHandlers?.fileHandler
Task { @MainActor in handler?(count) } // fire-and-forget
}
updateEstimatedTime was also made non-async with its estimatedTimeHandler
callback converted to the same fire-and-forget pattern.
Bug 2 — EXIF extraction serialised on actor
File: ScanFiles.extractExifData
Symptom: EXIF phase took ~60 seconds for 809 files despite withTaskGroup.
Root cause:
// Before — actor-isolated, called with await from task group tasks
private func extractExifData(from url: URL) -> ExifMetadata? { … }
// In task group:
let exifData = await self.extractExifData(from: fileURL) // hops to actor
The task group created 809 tasks, but every task immediately called
await self.extractExifData(…), which required hopping to the ScanFiles
actor’s serial executor. All 809 tasks queued behind the actor. Concurrency
was zero.
Measured cost: ~74 ms/file × 809 = ~60 seconds.
Fix:
// After — nonisolated, no actor hop required
private nonisolated func extractExifData(from url: URL) -> ExifMetadata? { … }
// In task group:
let exifData = self.extractExifData(from: fileURL) // no await
The four pure helper methods (formatShutterSpeed, formatFocalLength,
formatAperture, formatISO) were also marked nonisolated since they are
called from extractExifData.
Bug 3 — File discovery loop blocked on UI progress callbacks
File: ScanFiles.scanFiles
Symptom: After Bug 2 fix, scan still took ~53 seconds.
Root cause:
// Before
for fileURL in contents {
discoveredCount += 1
await onProgress?(discoveredCount) // @MainActor hop per file
group.addTask { … }
}
onProgress is @MainActor @Sendable. The for loop awaited it for every
file before adding the next task to the group. The task group was not the
bottleneck — the loop that built it was. 809 sequential main-actor round
trips, each waiting for a SwiftUI counter update, held the loop for ~65 ms
per iteration.
Measured cost: ~65 ms/file × 809 = ~53 seconds.
Fix:
// After
for fileURL in contents {
discoveredCount += 1
let progress = onProgress // capture closure (Sendable)
let count = discoveredCount // capture value (Int, not var)
Task { @MainActor in progress?(count) } // fire-and-forget
group.addTask { … }
}
The loop now completes in milliseconds. All 809 task group tasks are enqueued before the first one finishes, maximising parallelism.
Bug 4 — Focus point parser read the entire RAW file
File: SonyMakerNoteParser.focusLocation
Symptom: Focus point extraction added ~50 seconds to the scan.
Root cause:
// Before
guard let data = try? Data(contentsOf: url, options: .mappedIfSafe) …
On an external SSD, macOS cannot safely use mmap (the filesystem or volume
does not permit it), so Data(contentsOf:, options: .mappedIfSafe) silently
falls back to reading the entire file into memory. A Sony A1 ARW is ~50 MB.
50 MB × 809 files = ~40 GB total I/O
40 GB ÷ 800 MB/s = ~50 seconds
The parser only needs the TIFF IFD chain and MakerNote, which reside within the first 1–2 MB of a Sony ARW. The RAW image data that follows is never accessed.
Fix:
// After — reads only the first 4 MB
guard let fh = try? FileHandle(forReadingFrom: url) else { return nil }
defer { try? fh.close() }
guard let data = try? fh.read(upToCount: 4 * 1024 * 1024) …
FileHandle.read(upToCount:) issues a single bounded read regardless of
filesystem. 4 MB per file is a conservative limit well above the 1–2 MB
actually needed, and safe for all known Sony ARW structures.
4 MB × 809 files = ~3.2 GB total I/O
3.2 GB ÷ 800 MB/s = ~4 seconds (sequential upper bound)
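The before and after estimates use the same formula: file count times bytes read per file, divided by sustained throughput. A small sketch of that arithmetic follows; the function name is made up for illustration, and decimal megabytes are assumed, as in the figures above.

```swift
// Sequential transfer time in seconds for `fileCount` reads of
// `megabytesPerFile` each on a device sustaining `throughputMBps`.
func sequentialReadSeconds(fileCount: Int,
                           megabytesPerFile: Double,
                           throughputMBps: Double) -> Double {
    Double(fileCount) * megabytesPerFile / throughputMBps
}

let fullFileRead = sequentialReadSeconds(fileCount: 809, megabytesPerFile: 50, throughputMBps: 800)
let boundedRead = sequentialReadSeconds(fileCount: 809, megabytesPerFile: 4, throughputMBps: 800)
print(fullFileRead) // 50.5625, i.e. the ~50 s observed before the fix
print(boundedRead)  // 4.045, the sequential upper bound after the fix
```

The second figure is an upper bound because the files are read concurrently and the 4 MB limit is rarely the amount the parser actually needs.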
Bug 5 — Focus point extraction sequential
File: ScanFiles.extractNativeFocusPoints
Symptom: Even after Bug 4, focus extraction ran serially.
Root cause:
// Before — synchronous compactMap, runs on actor
private func extractNativeFocusPoints(from items: [FileItem]) -> [DecodeFocusPoints]? {
let parsed = items.compactMap { item in
guard let loc = SonyMakerNoteParser.focusLocation(from: item.url) …
}
…
}
A plain compactMap on 809 items processes them one at a time on the actor.
Fix:
// After — concurrent task group
private func extractNativeFocusPoints(from items: [FileItem]) async
-> [DecodeFocusPoints]?
{
await withTaskGroup(of: DecodeFocusPoints?.self) { group in
for item in items {
group.addTask {
guard let loc = SonyMakerNoteParser.focusLocation(from: item.url)
else { return nil }
return DecodeFocusPoints(…)
}
}
…
}
}
SonyMakerNoteParser.focusLocation is nonisolated static, so task group
tasks call it directly on the thread pool without touching the actor.
5. Measured results — 809 Sony A1 ARW files, external 800 MB/s SSD
| Operation | Before all fixes | After all fixes |
|---|---|---|
| EXIF extraction | ~60 s | ~3–4 s |
| Focus point extraction | ~50 s | < 1 s |
| Scan total (Phase 1) | ~60 s (sequential bottleneck) | ~6–7 s |
| Thumbnail generation, cold (Phase 2) | ~161 s | ~10–15 s |
| Thumbnail generation, cached (Phase 2) | ~63 s | ~instant |
| Total, first run | ~4 min | ~20 s |
| Total, second run | ~63 s | ~7 s |
6. Key architectural lesson
With SWIFT_DEFAULT_ACTOR_ISOLATION = MainActor, Swift 6 isolates every
declaration that has no explicit isolation to the main actor, and methods
declared inside an actor are always isolated to that actor. A method that is
pure and stateless (no reads or writes of actor-isolated properties) must
therefore be explicitly annotated nonisolated to run on the cooperative thread
pool. Without this annotation, task group tasks that call it will silently
queue on the actor’s serial executor, reducing a withTaskGroup to the
performance of a for loop.
The pattern to check in any actor with a task group:
// If this is called with `await self.method()` inside group.addTask { }:
private func method() -> T { … }
// ↑
// Does this method read or write any stored property of the actor?
// If NO → mark it nonisolated. The await and actor hop are unnecessary.
// If YES → it must remain isolated; redesign the data flow instead.
The same applies to @MainActor progress callbacks: never await them
inside a processing loop. Fire-and-forget with Task { @MainActor in } keeps
the pipeline moving and lets the UI update at its own pace.
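The nonisolated rule can be demonstrated with a small self-contained actor. Everything in this sketch is hypothetical (`CatalogScanner`, `checksum(of:)` and `scan(_:)` are not RawCull names); it only illustrates a pure helper being called from task group children without an actor hop.

```swift
actor CatalogScanner {
    private var scanned = 0   // actor-isolated state

    // Pure helper: reads no actor state, so it opts out of isolation and
    // runs on whichever thread the calling child task is already using.
    nonisolated func checksum(of text: String) -> Int {
        text.unicodeScalars.reduce(0) { $0 + Int($1.value) }
    }

    func scan(_ names: [String]) async -> [Int] {
        let sums = await withTaskGroup(of: Int.self) { group in
            for name in names {
                // No await here: the call is synchronous and never queues
                // on the actor's serial executor.
                group.addTask { self.checksum(of: name) }
            }
            var collected: [Int] = []
            for await sum in group { collected.append(sum) }
            return collected
        }
        scanned += sums.count   // isolated state is only touched on the actor
        return sums.sorted()
    }
}
```

Removing `nonisolated` from `checksum(of:)` would force the call inside `addTask` to become `await self.checksum(of:)`, which is the shape of Bug 2.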
Sony MakerNote Parser
Sony MakerNote Parser — Focus Point Extraction
Files covered:
- RawCull/Enum/SonyMakerNoteParser.swift
- RawCull/Model/ViewModels/FocusPointsModel.swift
- RawCull/Actors/ScanFiles.swift
- RawCull/Views/FocusPoints/FocusOverlayView.swift
Overview
RawCull extracts autofocus (AF) focus point coordinates directly from Sony ARW
raw files without requiring any external tools such as exiftool. The parser
targets the Sony A1 (ILCE-1) and A1 Mark II (ILCE-1M2) cameras.
The focus location is stored inside the Sony proprietary MakerNote block
embedded in the EXIF data of every ARW file. The SonyMakerNoteParser struct
navigates the binary TIFF structure to locate and decode this data.
ARW File Structure
Sony ARW is a TIFF-based format (typically little-endian). EXIF and MakerNote data are embedded within the standard TIFF IFD chain:
TIFF Header
└── IFD0
└── ExifIFD (tag 0x8769)
└── MakerNote (tag 0x927C)
└── Sony MakerNote IFD
└── FocusLocation (tag 0x2027)
Tag 0x2027 (FocusLocation) holds four int16u values:
| Index | Meaning |
|---|---|
| 0 | Image width (sensor pixels) |
| 1 | Image height (sensor pixels) |
| 2 | Focus point X coordinate |
| 3 | Focus point Y coordinate |
The origin is the top-left corner of the sensor. Values are already in full
sensor pixel space — no scaling is required. Tag 0x204a is a redundant copy
of the same data (within one pixel) and is used as a fallback.
Note: Tag 0x9400 (AFInfo) is an enciphered binary block and is not used for focus location.
SonyMakerNoteParser
The public interface is a single static method:
struct SonyMakerNoteParser {
/// Returns "width height x y" calibrated for the Sony A1 sensor.
nonisolated static func focusLocation(from url: URL) -> String? {
// Read only the first 4 MB. Sony ARW MakerNote metadata sits well
// within that range; loading the full ~50 MB RAW file is wasteful
// on external storage where mmap is unavailable and
// Data(contentsOf:) falls back to a full read.
guard let fh = try? FileHandle(forReadingFrom: url) else { return nil }
defer { try? fh.close() }
guard let data = try? fh.read(upToCount: 4 * 1024 * 1024),
let result = TIFFParser(data: data)?.parseSonyFocusLocation()
else { return nil }
return "\(result.width) \(result.height) \(result.x) \(result.y)"
}
}
Only the first 4 MB of the file is read. Sony ARW MakerNote metadata resides
within the first 1–2 MB of the file; the remainder is RAW image data that is
never needed for focus point extraction. Using FileHandle.read(upToCount:)
instead of Data(contentsOf:, options: .mappedIfSafe) avoids a full ~50 MB
file read on external storage where memory-mapping (mmap) is not available
and the system silently falls back to loading the entire file.
The result is a space-separated string: "width height x y".
TIFFParser — Binary Navigation
The private TIFFParser struct does all binary parsing work.
Byte Order Detection
init?(data: Data) {
guard data.count >= 8 else { return nil }
let b0 = data[0], b1 = data[1]
if b0 == 0x49 && b1 == 0x49 { le = true } // "II" — little-endian
else if b0 == 0x4D && b1 == 0x4D { le = false } // "MM" — big-endian
else { return nil }
self.data = data
}
Sony ARW files are little-endian (II), but the parser handles both byte
orders via readU16 and readU32 helpers.
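The readU16 and readU32 helpers are not reproduced in this write-up. A minimal sketch of what endian-aware readers of this kind could look like is shown below; `ByteReader` is a hypothetical stand-in, and both methods return optionals here, which may differ from the signatures used in the actual parser.

```swift
import Foundation

// Hypothetical stand-in for the parser's byte readers.
struct ByteReader {
    let data: Data
    let littleEndian: Bool

    // Reads an unsigned 16-bit value, or nil if the range is out of bounds.
    func readU16(at offset: Int) -> UInt16? {
        guard offset >= 0, offset + 2 <= data.count else { return nil }
        let b0 = UInt16(data[data.startIndex + offset])
        let b1 = UInt16(data[data.startIndex + offset + 1])
        return littleEndian ? (b1 << 8) | b0 : (b0 << 8) | b1
    }

    // Reads an unsigned 32-bit value, or nil if the range is out of bounds.
    func readU32(at offset: Int) -> UInt32? {
        guard offset >= 0, offset + 4 <= data.count else { return nil }
        let b = (0..<4).map { UInt32(data[data.startIndex + offset + $0]) }
        if littleEndian {
            return b[0] | (b[1] << 8) | (b[2] << 16) | (b[3] << 24)
        }
        return (b[0] << 24) | (b[1] << 16) | (b[2] << 8) | b[3]
    }
}
```

Indexing relative to `data.startIndex` keeps the readers correct when `data` is a slice of a larger buffer.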
Focus Location Navigation
func parseSonyFocusLocation() -> (width: Int, height: Int, x: Int, y: Int)? {
guard let ifd0 = readU32(at: 4).map(Int.init) else { return nil }
// Navigate: IFD0 → ExifIFD → MakerNote IFD
guard let exifIFD = subIFDOffset(in: ifd0, tag: 0x8769),
let (mnOffset, _) = tagDataRange(in: exifIFD, tag: 0x927C)
else { return nil }
let ifdStart = sonyIFDStart(at: mnOffset)
// Try tag 0x2027 first, fall back to 0x204a
let flTag: UInt16 = tagDataRange(in: ifdStart, tag: 0x2027) != nil
? 0x2027 : 0x204a
guard let (flOffset, flSize) = tagDataRange(in: ifdStart, tag: flTag),
flSize >= 8
else { return nil }
let width = Int(readU16(at: flOffset + 0))
let height = Int(readU16(at: flOffset + 2))
let x = Int(readU16(at: flOffset + 4))
let y = Int(readU16(at: flOffset + 6))
guard width > 0, height > 0, x > 0 || y > 0 else { return nil }
return (width, height, x, y)
}
The IFD0 offset is read from bytes 4–7 of the TIFF header (standard TIFF). The parser then follows each IFD pointer in sequence until the Sony MakerNote IFD is reached.
Sony MakerNote Header
Some Sony files prefix the MakerNote IFD with a 12-byte header: the ASCII
string "SONY DSC " (9 bytes) followed by 3 null bytes. The parser detects it
by comparing the first four raw bytes against "SONY" and, on a match, skips
all 12 bytes. Endian-aware integer reads are not used for ASCII magic:
private func sonyIFDStart(at offset: Int) -> Int {
guard offset + 12 <= data.count else { return offset }
// Check for "SONY DSC " ASCII prefix (9 bytes + 3 null pad = 12 bytes).
// Read raw bytes — do not use endian-aware readU32 for ASCII magic.
let isSony = data[offset] == 0x53 && // S
data[offset+1] == 0x4F && // O
data[offset+2] == 0x4E && // N
data[offset+3] == 0x59 // Y
return isSony ? offset + 12 : offset
}
IFD Entry Parsing
Each IFD entry is 12 bytes: 2 bytes tag, 2 bytes type, 4 bytes count, 4 bytes value/offset.
private func tagDataRange(in ifdOffset: Int, tag: UInt16)
-> (dataOffset: Int, byteCount: Int)?
{
let entryCount = Int(readU16(at: ifdOffset))
for i in 0 ..< entryCount {
let e = ifdOffset + 2 + i * 12
guard e + 12 <= data.count else { break }
if readU16(at: e) == tag {
let type = Int(readU16(at: e + 2))
let count = Int(readU32(at: e + 4) ?? 0)
let sizes = [0,1,1,2,4,8,1,1,2,4,8,4,8,4]
let bytes = count * (type < sizes.count ? sizes[type] : 1)
if bytes <= 4 { return (e + 8, bytes) } // inline value
guard let ptr = readU32(at: e + 8) else { return nil }
// A1 / A1 II MakerNote IFD entries use absolute file offsets
// (not relative to MakerNote start) per ExifTool ProcessExif.
return (Int(ptr), bytes)
}
}
return nil
}
Sony A1 and A1 II MakerNote IFD entries use absolute file offsets,
consistent with ExifTool’s ProcessExif behaviour. The type-size table is
indexed by TIFF type code: index 0 is an unused placeholder and indices 1–13
hold the byte sizes of the TIFF data types (BYTE through IFD).
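As a worked example of the size calculation: FocusLocation is int16u[4], which is TIFF type 3 (SHORT, 2 bytes) with a count of 4, so the value occupies 8 bytes and cannot be stored inline. The sketch below reuses the size table from tagDataRange; the two function names are hypothetical.

```swift
// Byte length of an IFD entry's value, using the same size table as above.
func ifdValueByteCount(type: Int, count: Int) -> Int {
    let sizes = [0, 1, 1, 2, 4, 8, 1, 1, 2, 4, 8, 4, 8, 4]
    let unit = (type >= 0 && type < sizes.count) ? sizes[type] : 1
    return count * unit
}

// Values of 4 bytes or fewer live in the entry's value field itself;
// anything larger is stored elsewhere and the field holds an offset.
func isStoredInline(type: Int, count: Int) -> Bool {
    ifdValueByteCount(type: type, count: count) <= 4
}

print(ifdValueByteCount(type: 3, count: 4)) // 8: FocusLocation, four SHORTs
print(isStoredInline(type: 3, count: 4))    // false: the entry holds an offset
print(isStoredInline(type: 3, count: 2))    // true: two SHORTs fit in 4 bytes
```

This is why parseSonyFocusLocation requires flSize >= 8 before reading the four values.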
Data Models
FocusPoint
The parsed string "width height x y" is converted into a typed FocusPoint
struct:
struct FocusPoint: Identifiable {
let sensorWidth: CGFloat
let sensorHeight: CGFloat
let x: CGFloat
let y: CGFloat
var normalizedX: CGFloat { x / sensorWidth }
var normalizedY: CGFloat { y / sensorHeight }
}
Normalized coordinates (0.0–1.0) are used for rendering, making the marker position independent of the display image resolution.
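The conversion from the "width height x y" string is not shown here. A minimal sketch of one way to do it follows; `ParsedFocus` is a hypothetical type that uses Double instead of CGFloat so it has no framework dependency, and the sample values are illustrative rather than taken from a real file.

```swift
// Hypothetical parser for the "width height x y" string; not RawCull's own code.
struct ParsedFocus {
    let sensorWidth: Double
    let sensorHeight: Double
    let x: Double
    let y: Double

    var normalizedX: Double { x / sensorWidth }
    var normalizedY: Double { y / sensorHeight }

    // Fails unless the string holds exactly four numbers and a non-zero size.
    init?(focusLocation: String) {
        let parts = focusLocation.split(separator: " ").compactMap { Double($0) }
        guard parts.count == 4, parts[0] > 0, parts[1] > 0 else { return nil }
        sensorWidth = parts[0]
        sensorHeight = parts[1]
        x = parts[2]
        y = parts[3]
    }
}

if let focus = ParsedFocus(focusLocation: "8640 5760 4320 1440") {
    print(focus.normalizedX, focus.normalizedY) // 0.5 0.25
}
```

A marker drawn at (normalizedX × viewWidth, normalizedY × viewHeight) then lands on the same subject detail at any display size.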
Integration in ScanFiles
Focus points are extracted during the catalog scan in ScanFiles using a
concurrent withTaskGroup, after the EXIF task group has completed:
decodedFocusPoints = await extractNativeFocusPoints(from: result)
?? decodeFocusPointsJSON(from: url)
Native extraction is attempted first. If no ARW files in the catalog yield a
result (e.g. non-A1 cameras or files captured before the feature was added),
the actor falls back to reading a focuspoints.json sidecar file from the
same directory.
private func extractNativeFocusPoints(from items: [FileItem]) async
-> [DecodeFocusPoints]?
{
let collected = await withTaskGroup(of: DecodeFocusPoints?.self) { group in
for item in items {
group.addTask {
guard let location = SonyMakerNoteParser.focusLocation(from: item.url)
else { return nil }
// sourceFile must equal file.name — getFocusPoints() matches
// on filename only
return DecodeFocusPoints(
sourceFile: item.url.lastPathComponent,
focusLocation: location
)
}
}
var results: [DecodeFocusPoints] = []
for await result in group {
if let r = result { results.append(r) }
}
return results
}
return collected.isEmpty ? nil : collected
}
SonyMakerNoteParser.focusLocation is nonisolated static, so task group
tasks invoke it directly on the cooperative thread pool without hopping back
to the ScanFiles actor’s serial executor. All files are parsed concurrently.
Visualization
Focus point markers are rendered as corner brackets over the image using a
custom SwiftUI Shape:
struct FocusPointMarker: Shape {
let normalizedX: CGFloat
let normalizedY: CGFloat
let boxSize: CGFloat
func path(in rect: CGRect) -> Path {
let cx = normalizedX * rect.width
let cy = normalizedY * rect.height
let half = boxSize / 2
let bracket = boxSize * 0.28
// Draws 8 corner bracket lines around the focus position
…
}
}
The marker size is user-adjustable (32–100 px) via a slider in
FocusPointControllerView.
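The body of path(in:) is elided above. The geometry of eight bracket lines can be sketched as pure arithmetic, assuming each corner of the box gets one horizontal and one vertical arm of length `bracket` pointing back along the box edge. That layout is an assumption made for illustration, and `Segment` and `bracketSegments` are hypothetical names.

```swift
// A line segment in view coordinates.
struct Segment {
    let x1, y1, x2, y2: Double
}

// Returns the 8 segments (2 per corner) of a bracket box centred on (cx, cy).
func bracketSegments(cx: Double, cy: Double, boxSize: Double) -> [Segment] {
    let half = boxSize / 2
    let bracket = boxSize * 0.28
    var segments: [Segment] = []
    // sx and sy select the corner: (-1, -1) is top-left, (+1, +1) bottom-right.
    for sy in [-1.0, 1.0] {
        for sx in [-1.0, 1.0] {
            let cornerX = cx + sx * half
            let cornerY = cy + sy * half
            // Horizontal arm, running from the corner toward the box middle.
            segments.append(Segment(x1: cornerX, y1: cornerY,
                                    x2: cornerX - sx * bracket, y2: cornerY))
            // Vertical arm, running from the corner toward the box middle.
            segments.append(Segment(x1: cornerX, y1: cornerY,
                                    x2: cornerX, y2: cornerY - sy * bracket))
        }
    }
    return segments
}
```

In a SwiftUI Shape each segment would become a move(to:) followed by an addLine(to:) on the Path.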
End-to-End Flow
flowchart TD
A[ARW file on disk] --> B[SonyMakerNoteParser.focusLocation]
B --> C[FileHandle.read — first 4 MB only]
C --> D[TIFFParser — detect byte order II/MM]
D --> E[Navigate IFD0 → ExifIFD 0x8769]
E --> F[Navigate to MakerNote 0x927C]
F --> G[Skip optional SONY DSC header 12 bytes]
G --> H{Tag 0x2027 present?}
H -- yes --> I[Read FocusLocation tag 0x2027]
H -- no --> J[Read fallback tag 0x204a]
I --> K[Decode int16u x4: width, height, x, y]
J --> K
K --> L[Return string: width height x y]
L --> M[ScanFiles.extractNativeFocusPoints — withTaskGroup]
M --> N[FocusPoint with normalizedX / normalizedY]
N --> O[FocusOverlayView - corner bracket marker on image]
Key Technical Points
| Topic | Detail |
|---|---|
| File format | Sony ARW is TIFF-based, typically little-endian |
| Focus tag | 0x2027 (FocusLocation), fallback 0x204a |
| Data format | int16u[4] — width, height, x, y in sensor pixels |
| File read | First 4 MB only via FileHandle — MakerNote metadata is within the first 1–2 MB |
| Pointer base | Sony A1 / A1 II MakerNote pointers are absolute file offsets |
| MakerNote header | Optional 12-byte "SONY DSC " prefix detected by raw byte comparison |
| Encrypted tag | 0x9400 (AFInfo) is enciphered and not used |
| Concurrency | extractNativeFocusPoints uses withTaskGroup; focusLocation is nonisolated static |
| Fallback | focuspoints.json sidecar used when native parsing yields no results |
| Coordinates | Origin top-left; normalized to 0.0–1.0 before rendering |
Concurrency model
Concurrency Model — RawCull
Files covered:
- RawCull/Model/ViewModels/RawCullViewModel.swift
- RawCull/Views/RawCullSidebarMainView/extension+RawCullView.swift
- RawCull/Actors/ScanFiles.swift
- RawCull/Actors/ScanAndCreateThumbnails.swift
- RawCull/Actors/ExtractAndSaveJPGs.swift
- RawCull/Actors/ThumbnailLoader.swift
- RawCull/Actors/RequestThumbnail.swift
- RawCull/Actors/SharedMemoryCache.swift
- RawCull/Actors/DiskCacheManager.swift
- RawCull/Actors/SaveJPGImage.swift
- RawCull/Enum/SonyThumbnailExtractor.swift
- RawCull/Enum/JPGSonyARWExtractor.swift
Overview
RawCull uses Swift structured concurrency (async/await, Task, TaskGroup, and actor) across four primary flows:
| Flow | Entry point | Core actor(s) | Purpose |
|---|---|---|---|
| Catalog scan | RawCullViewModel.handleSourceChange(url:) | ScanFiles | Scan ARW files, extract metadata, load focus points |
| Thumbnail preload | RawCullViewModel.handleSourceChange(url:) | ScanAndCreateThumbnails | Bulk-populate the thumbnail cache for a selected catalog |
| JPG extraction | extension+RawCullView.extractAllJPGS() | ExtractAndSaveJPGs | Extract embedded JPEG previews and save to disk |
| On-demand thumbnails | UI grid + detail views | ThumbnailLoader, RequestThumbnail | Rate-limited, cached per-file thumbnail retrieval |
The two long-running operations (thumbnail preload and JPG extraction) share a two-level task pattern:
- An outer Task created from the ViewModel or View layer.
- An inner Task stored inside the actor, which owns the real work and cancellation handle.
This split keeps UI responsive: handleSourceChange is @MainActor but async — when it awaits the outer Task, the main actor is free to handle other work while the task’s body runs on the ScanAndCreateThumbnails actor. The inner task runs heavy I/O and image work on actor and cooperative thread-pool queues. Cancellation requires calling both levels.
1. Catalog Scan — ScanFiles
1.1 Entry point
RawCullViewModel.handleSourceChange(url:) is @MainActor and is called whenever the user selects a new catalog. It triggers the scan before any thumbnail work starts.
1.2 Scan flow
ScanFiles.scanFiles(url:onProgress:) runs on the ScanFiles actor:
- Opens the directory with security-scoped resource access.
- Uses withTaskGroup to process all ARW files in parallel.
- For each file, a task reads URLResourceValues (name, size, content type, modification date) and calls extractExifData(from:).
- After the group finishes, resolves focus points via a two-stage fallback:
  - Native extraction first: extractNativeFocusPoints(from:) runs a withTaskGroup over all FileItems, calling SonyMakerNoteParser.focusLocation(from:) on each ARW file.
  - JSON fallback: if native extraction yields no results, decodeFocusPointsJSON(from:) reads focuspoints.json synchronously from the same directory.
- Returns [FileItem].
extractExifData(from:) reads EXIF data via CGImageSourceCopyPropertiesAtIndex and formats:
- Shutter speed (e.g., "1/1000" or "2.5s")
- Focal length (e.g., "50.0mm")
- Aperture (e.g., "ƒ/2.8")
- ISO (e.g., "ISO 400")
- Camera model (from TIFF dictionary)
- Lens model (from EXIF dictionary)
RawCullViewModel then calls ScanFiles.sortFiles(_:by:searchText:) (@concurrent nonisolated, runs on the cooperative thread pool), updates files and filteredFiles on the main actor, and maps decoded focus points to FocusPointsModel objects.
2. Thumbnail Preload — ScanAndCreateThumbnails
2.1 How the task starts
RawCullViewModel.handleSourceChange(url:) is the entry point (@MainActor).
Step-by-step:
- Skip duplicates: processedURLs: Set<URL> prevents re-processing a catalog URL already handled in this session.
- Fetch settings: SettingsViewModel.shared.asyncgetsettings() provides thumbnailSizePreview and thumbnailCostPerPixel.
- Build handlers: CreateFileHandlers().createFileHandlers(...) wires up four @MainActor @Sendable closures:
  - fileHandler(Int): progress count
  - maxfilesHandler(Int): total file count
  - estimatedTimeHandler(Int): ETA in seconds
  - memorypressurewarning(Bool): memory pressure state for UI
- Create actor: ScanAndCreateThumbnails() is instantiated and handlers injected.
- Store actor reference: currentScanAndCreateThumbnailsActor is set so abort() can reach it.
- Create outer Task on the ViewModel:
preloadTask = Task {
await scanAndCreateThumbnails.preloadCatalog(
at: url,
targetSize: thumbnailSizePreview
)
}
await preloadTask?.value
The await suspends handleSourceChange (freeing the main actor while the preload runs on the ScanAndCreateThumbnails actor) until the preload finishes or is cancelled.
2.2 Inside the actor
preloadCatalog(at:targetSize:) runs on the ScanAndCreateThumbnails actor:
- One-time setup: ensureReady() calls SharedMemoryCache.shared.ensureReady() and fetches settings via a setupTask gate (preventing duplicate initialization from concurrent callers).
- Cancel prior work: cancelPreload() cancels and nils any existing inner task.
- Discover files: Enumerate ARW files non-recursively via DiscoverFiles.
- Notify max: fileHandlers?.maxfilesHandler(files.count) updates the progress bar maximum.
- Create inner Task<Int, Never>: stored as preloadTask on the actor.
- Bounded withTaskGroup: caps parallelism at ProcessInfo.processInfo.activeProcessorCount * 2 using index-based back-pressure and per-iteration cancellation checks:
for (index, url) in urls.enumerated() {
if Task.isCancelled {
group.cancelAll()
break
}
if index >= maxConcurrent {
await group.next()
}
group.addTask {
await self.processSingleFile(url, targetSize: targetSize, itemIndex: index)
}
}
await group.waitForAll()
2.3 Per-file processing and cancellation points
processSingleFile(_:targetSize:itemIndex:) follows the three-tier cache lookup and checks Task.isCancelled at every expensive step:
| Step | Cancellation check | Action on cancel |
|---|---|---|
| Before RAM lookup | Task.isCancelled | Return immediately |
| After RAM hit confirmed | Task.isCancelled | Skip remaining work |
| Before disk lookup | Task.isCancelled | Return immediately |
| Before source extraction | Task.isCancelled | Return immediately |
| After extraction completes | Task.isCancelled | Skip caching and disk write |
On extraction success:
- Call cgImageToNormalizedNSImage(_:), which converts CGImage to an NSImage backed by a single JPEG representation (quality 0.7). This normalization ensures memory and disk representations are consistent.
- storeInMemoryCache(_:for:) creates DiscardableThumbnail with pixel-accurate cost and stores it in SharedMemoryCache.
- Encode jpegData and call diskCache.save(_:for:); this is a detached background task. The closure captures diskCache directly to avoid retaining the actor.
2.4 Request coalescing
ScanAndCreateThumbnails.thumbnail(for:targetSize:) exposes an async lookup for direct per-file requests. It calls resolveImage(for:targetSize:), which adds in-flight task coalescing via inflightTasks: [URL: Task<CGImage, Error>]:
- Check RAM cache.
- Check disk cache.
- If inflightTasks[url] exists, await it; multiple callers share the same work.
- Otherwise, create a new unstructured Task inside the actor, store it in inflightTasks, extract and cache the thumbnail, then remove the entry when done.
This prevents duplicate extraction work when multiple UI elements request the same file simultaneously.
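The coalescing idea can be reduced to a few lines. The sketch below is a simplified illustration under stated assumptions: `Coalescer`, `value(for:)` and `expensiveWork(_:)` are hypothetical names, the key is a String and the result an Int instead of URL and CGImage, and the RAM and disk cache checks are omitted.

```swift
actor Coalescer {
    private var inflight: [String: Task<Int, Never>] = [:]
    private(set) var workCount = 0   // how many times the real work ran

    func value(for key: String) async -> Int {
        // Someone is already computing this key: share their task.
        if let existing = inflight[key] {
            return await existing.value
        }
        // Register the task before the first suspension point so that
        // later callers for the same key find it.
        let task = Task { await self.expensiveWork(key) }
        inflight[key] = task
        let result = await task.value
        inflight[key] = nil
        return result
    }

    private func expensiveWork(_ key: String) async -> Int {
        workCount += 1
        try? await Task.sleep(nanoseconds: 50_000_000) // stand-in for extraction
        return key.count
    }
}
```

Because the dictionary entry is written synchronously on the actor before any await, two callers arriving together cannot both miss it.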
3. JPG Extraction — ExtractAndSaveJPGs
3.1 How the task starts
extension+RawCullView.extractAllJPGS() creates an unstructured outer task from the View layer:
Task {
viewModel.creatingthumbnails = true
let handlers = CreateFileHandlers().createFileHandlers(
fileHandler: viewModel.fileHandler,
maxfilesHandler: viewModel.maxfilesHandler,
estimatedTimeHandler: viewModel.estimatedTimeHandler,
memorypressurewarning: { _ in },
)
let extract = ExtractAndSaveJPGs()
await extract.setFileHandlers(handlers)
viewModel.currentExtractAndSaveJPGsActor = extract
guard let url = viewModel.selectedSource?.url else { return }
await extract.extractAndSaveAlljpgs(from: url)
viewModel.currentExtractAndSaveJPGsActor = nil
viewModel.creatingthumbnails = false
}
Unlike the preload flow, the outer task is not stored on the ViewModel. Cancellation is driven entirely through the actor reference via viewModel.abort().
3.2 Inside the actor
extractAndSaveAlljpgs(from:) mirrors the preload pattern exactly:
- Cancel any previous inner task via cancelExtractJPGSTask().
- Discover all ARW files (non-recursive).
- Create a Task<Int, Never> stored as extractJPEGSTask.
- Use withThrowingTaskGroup with activeProcessorCount * 2 concurrency cap and the same index-based back-pressure pattern as ScanAndCreateThumbnails (cancellation check + group.cancelAll(), index guard before group.next(), group.waitForAll() to drain).
- Call processSingleExtraction(_:itemIndex:) per file.
processSingleExtraction checks cancellation before and after JPGSonyARWExtractor.jpgSonyARWExtractor(from:fullSize:), then writes the result via SaveJPGImage().save(image:originalURL:).
SaveJPGImage.save is @concurrent nonisolated and runs on the cooperative thread pool. It:
- Replaces the .arw extension with .jpg
- Uses CGImageDestinationCreateWithURL with JPEG quality 1.0
- Logs success/failure with image dimensions and file paths
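A minimal sketch of such a save step, assuming macOS and the ImageIO framework. The function name saveAsJPEG and its parameters are hypothetical; this is not the body of SaveJPGImage.save, and logging is left out.

```swift
import Foundation
import CoreGraphics
import ImageIO
import UniformTypeIdentifiers

/// Hypothetical helper: writes `image` next to `originalURL`, swapping the extension for .jpg.
/// Returns the output URL on success, nil on failure.
func saveAsJPEG(_ image: CGImage, besideOriginal originalURL: URL, quality: Double = 1.0) -> URL? {
    let outputURL = originalURL.deletingPathExtension().appendingPathExtension("jpg")
    guard let destination = CGImageDestinationCreateWithURL(
        outputURL as CFURL,
        UTType.jpeg.identifier as CFString,
        1,      // number of images in the file
        nil
    ) else { return nil }
    // Quality 1.0 requests the least lossy JPEG encoding ImageIO offers.
    let options: [CFString: Any] = [kCGImageDestinationLossyCompressionQuality: quality]
    CGImageDestinationAddImage(destination, image, options as CFDictionary)
    // Finalize performs the actual encode and write.
    return CGImageDestinationFinalize(destination) ? outputURL : nil
}
```

The encode and the file write both happen inside CGImageDestinationFinalize, which is why this kind of call is kept away from actor executors.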
4. Rate-Limited On-Demand Loading
4.1 ThumbnailLoader
ThumbnailLoader is a shared actor that enforces a maximum of 6 concurrent thumbnail loads. Excess requests suspend via CheckedContinuation and are queued:
actor ThumbnailLoader {
    static let shared = ThumbnailLoader()
    private let maxConcurrent = 6
    private var activeTasks = 0
    private var pendingContinuations: [(id: UUID, continuation: CheckedContinuation<Void, Never>)] = []
}
acquireSlot() flow:
- If activeTasks < maxConcurrent: increment activeTasks, return immediately.
- Otherwise: call withCheckedContinuation { continuation in ... }, which suspends the caller.
- A cancellation handler is registered to remove the pending continuation by ID so it is never resumed after cancellation.
releaseSlot() flow:
- Decrement activeTasks.
- If pendingContinuations is non-empty, pop the first and resume() it.
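The acquire/release mechanics can be sketched as a small actor-based limiter. This is a simplified illustration, not ThumbnailLoader itself: SlotLimiter is a hypothetical name, cancellation handling is omitted, and release() hands the freed slot directly to the first waiter (one possible design choice) instead of decrementing and re-incrementing the counter.

```swift
/// Hypothetical sketch of a slot limiter built on CheckedContinuation.
actor SlotLimiter {
    private let maxConcurrent: Int
    private var active = 0
    private var waiters: [CheckedContinuation<Void, Never>] = []

    init(maxConcurrent: Int) {
        self.maxConcurrent = maxConcurrent
    }

    func acquire() async {
        if active < maxConcurrent {
            active += 1
            return
        }
        // No free slot: park the caller until release() resumes it.
        await withCheckedContinuation { continuation in
            waiters.append(continuation)
        }
        // The slot was handed over by release(), so `active` already counts this caller.
    }

    func release() {
        if waiters.isEmpty {
            active -= 1
        } else {
            // Hand the slot to the oldest waiter; `active` stays unchanged.
            waiters.removeFirst().resume()
        }
    }
}
```

Handing the slot over keeps the count from briefly dropping below the cap, so a newly arriving caller cannot take the slot before the resumed waiter runs.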
thumbnailLoader(file:) flow:
func thumbnailLoader(file: FileItem) async -> NSImage? {
    await acquireSlot()
    defer { releaseSlot() }
    guard !Task.isCancelled else { return nil }
    let settings = await getSettings()
    let cgThumb = await RequestThumbnail().requestThumbnail(
        for: file.url,
        targetSize: settings.thumbnailSizePreview
    )
    guard !Task.isCancelled else { return nil }
    if let cgThumb {
        return NSImage(cgImage: cgThumb, size: .zero)
    }
    return nil
}
Settings are cached on the actor to avoid repeated SettingsViewModel calls. The result is wrapped with NSImage(cgImage:size:), passing .zero as the size so the image takes the CGImage's pixel dimensions, before returning to the caller.
4.2 RequestThumbnail
RequestThumbnail handles the per-file cache resolution path for the on-demand flow:
- ensureReady(): same setupTask gate pattern as ScanAndCreateThumbnails.
- RAM cache lookup via SharedMemoryCache.object(forKey:); on hit, calls SharedMemoryCache.updateCacheMemory() for statistics.
- Disk cache lookup via DiskCacheManager.load(for:); on hit, calls SharedMemoryCache.updateCacheDisk() for statistics.
- Extraction fallback: SonyThumbnailExtractor.extractSonyThumbnail(from:maxDimension:qualityCost:).
- Store in RAM cache via storeInMemory(_:for:).
- Schedule disk save via a detached background task.
requestThumbnail(for:targetSize:) returns CGImage? for direct use by SwiftUI views. All errors are caught and logged; the method returns nil on failure.
nsImageToCGImage(_:) is async and tries cgImage(forProposedRect:) first; if that fails, it falls back to a TIFF round-trip on a Task.detached(priority: .utility) task to avoid blocking the actor with CPU-bound work.
5. Task Ownership and Handles
| Layer | Owner | Handle name | Type |
|---|---|---|---|
| Outer task (preload) | RawCullViewModel | preloadTask | Task<Void, Never>? |
| Inner task (preload) | ScanAndCreateThumbnails | preloadTask | Task<Int, Never>? |
| Outer task (extract) | View (extractAllJPGS) | not stored | Task<Void, Never> |
| Inner task (extract) | ExtractAndSaveJPGs | extractJPEGSTask | Task<Int, Never>? |
| Slot queue (on-demand) | ThumbnailLoader.shared | pendingContinuations | [(UUID, CheckedContinuation)] |
6. Cancellation
6.1 abort()
RawCullViewModel.abort() is the single cancellation entry point for user-initiated stops:
func abort() {
    preloadTask?.cancel()
    preloadTask = nil
    if let actor = currentScanAndCreateThumbnailsActor {
        Task { await actor.cancelPreload() }
    }
    currentScanAndCreateThumbnailsActor = nil
    if let actor = currentExtractAndSaveJPGsActor {
        Task { await actor.cancelExtractJPGSTask() }
    }
    currentExtractAndSaveJPGsActor = nil
    creatingthumbnails = false
}
6.2 Why both levels matter
Cancelling the outer Task propagates into child structured tasks, but does not automatically cancel the actor’s inner Task. The inner task is unstructured (Task { ... } created inside the actor) — it is not a child of the outer task. The actor-specific cancel methods (cancelPreload, cancelExtractJPGSTask) must be explicitly called to cancel the inner task and allow the withTaskGroup to unwind.
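The distinction can be demonstrated with a small, self-contained sketch. Worker, run(steps:), and cancelInner() are hypothetical names chosen for illustration; only the shape (an unstructured Task stored on an actor and awaited by the caller) mirrors the description above.

```swift
/// Hypothetical sketch: an actor that owns an unstructured inner task.
actor Worker {
    private var innerTask: Task<Int, Never>?

    /// Starts the inner task and waits for it. Returns the number of completed steps.
    func run(steps: Int) async -> Int {
        // Unstructured: NOT a child of whichever task called run(steps:).
        let task = Task<Int, Never> {
            var completed = 0
            for _ in 0..<steps {
                if Task.isCancelled { break }
                try? await Task.sleep(nanoseconds: 5_000_000)
                completed += 1
            }
            return completed
        }
        innerTask = task
        return await task.value
    }

    /// The explicit cancel path; without it the inner task runs to completion.
    func cancelInner() {
        innerTask?.cancel()
    }
}
```

Cancelling only the task that called run(steps:) leaves the inner loop running through every step, because the inner task never observes that cancellation. Calling cancelInner() is what makes Task.isCancelled become true inside the loop.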
6.3 ThumbnailLoader.cancelAll()
cancelAll() resumes all pending continuations immediately, unblocking any tasks waiting for a slot. This is called during teardown to prevent suspension leaks.
7. ETA Estimation
Both long-running actors compute a rolling ETA estimate:
Algorithm:
- Record a timestamp before each file starts processing.
- After completion, compute delta = now - lastItemTime.
- Append delta to the processingTimes array.
- Keep only the last 10 items in the array.
- After collecting minimumSamplesBeforeEstimation (10) items, calculate:
avgTime = sum(processingTimes) / processingTimes.count
remaining = (totalFiles - itemsProcessed) * avgTime
- Only update the displayed ETA if remaining < lastEstimatedSeconds; this prevents the counter from jumping upward when a slow file takes longer than expected.
| Actor | Minimum samples threshold |
|---|---|
| ScanAndCreateThumbnails | minimumSamplesBeforeEstimation = 10 |
| ExtractAndSaveJPGs | estimationStartIndex = 10 |
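The algorithm can be sketched as a small value type. RollingETA and its member names are hypothetical; the window size, the sample threshold, and the "only move downward" rule follow the description above, with the threshold applied to the number of processed items.

```swift
/// Hypothetical sketch of the rolling ETA estimate described above.
struct RollingETA {
    let windowSize: Int
    let minimumSamples: Int
    private(set) var processingTimes: [Double] = []
    private(set) var lastEstimatedSeconds: Double?

    init(windowSize: Int = 10, minimumSamples: Int = 10) {
        self.windowSize = windowSize
        self.minimumSamples = minimumSamples
    }

    /// Records one per-file duration (seconds) and returns the ETA to display, if any.
    @discardableResult
    mutating func record(duration: Double, itemsProcessed: Int, totalFiles: Int) -> Double? {
        processingTimes.append(duration)
        // Keep only the most recent `windowSize` samples.
        if processingTimes.count > windowSize {
            processingTimes.removeFirst(processingTimes.count - windowSize)
        }
        // No estimate until enough files have been processed.
        guard itemsProcessed >= minimumSamples else { return lastEstimatedSeconds }
        let avgTime = processingTimes.reduce(0, +) / Double(processingTimes.count)
        let remaining = Double(totalFiles - itemsProcessed) * avgTime
        // Only ever move the displayed value downward.
        if let shown = lastEstimatedSeconds, remaining >= shown { return shown }
        lastEstimatedSeconds = remaining
        return remaining
    }
}
```

With 100 files at one second each, the first estimate appears after the tenth file and reads 90 seconds. A single slow file afterwards raises the computed value but leaves the displayed one unchanged.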
8. Actor Isolation and Thread Safety
| Component | Isolation strategy |
|---|---|
| ScanAndCreateThumbnails, ExtractAndSaveJPGs, ScanFiles | All mutable state is actor-isolated; mutations only through actor methods |
| SharedMemoryCache | nonisolated(unsafe) for NSCache (thread-safe by design); all statistics and config remain actor-isolated |
| DiskCacheManager | Actor-isolates path calculation and coordination; actual file I/O runs in detached tasks |
| ThumbnailLoader | Actor-isolated slot counter and continuation queue |
| DiscardableThumbnail | @unchecked Sendable with OSAllocatedUnfairLock protecting (isDiscarded, accessCount) |
| CacheDelegate | @unchecked Sendable; willEvictObject is called synchronously by NSCache; increments are dispatched to an isolated EvictionCounter actor |
| RawCullViewModel | @MainActor; all UI state updates serialized on the main thread |
| SonyThumbnailExtractor, JPGSonyARWExtractor | nonisolated static methods dispatched to global GCD queues to prevent actor starvation |
| SaveJPGImage | actor with a single @concurrent nonisolated method, which runs on the cooperative thread pool, not the actor’s executor |
CPU-bound ImageIO and disk I/O work runs off-actor to keep the main thread and actor queues responsive.
9. Flow Diagrams
Thumbnail Preload — Two-Level Task Pattern
sequenceDiagram
participant VM as RawCullViewModel (MainActor)
participant A as ScanAndCreateThumbnails (actor)
participant MC as SharedMemoryCache
participant DC as DiskCacheManager
participant EX as SonyThumbnailExtractor
VM->>A: preloadCatalog(at:targetSize:)
Note over VM: outer Task stores preloadTask
A->>A: ensureReady() — setupTask gate
A->>A: cancelPreload() — cancel prior inner task
A->>A: create inner Task<Int,Never> — stored as preloadTask
loop withTaskGroup (capped at processorCount × 2)
A->>MC: object(forKey:) — RAM check
alt RAM hit
MC-->>A: DiscardableThumbnail
else RAM miss
A->>DC: load(for:) — disk check
alt Disk hit
DC-->>A: NSImage
A->>MC: setObject(...) — promote to RAM
else Disk miss
A->>EX: extractSonyThumbnail(from:maxDimension:qualityCost:)
EX-->>A: CGImage
A->>A: normalize to JPEG-backed NSImage
A->>MC: setObject(...)
A->>DC: save(jpegData:for:) — detached background
end
end
A->>VM: fileHandler(progress) via @MainActor closure
end
A-->>VM: return count
On-Demand Request
sequenceDiagram
participant UI
participant TL as ThumbnailLoader (actor)
participant RT as RequestThumbnail (actor)
participant MC as SharedMemoryCache
participant DC as DiskCacheManager
UI->>TL: thumbnailLoader(file:)
TL->>TL: acquireSlot() — suspend if activeTasks ≥ 6
TL->>RT: requestThumbnail(for:targetSize:)
RT->>MC: object(forKey:)
alt RAM hit
MC-->>RT: DiscardableThumbnail
RT-->>UI: CGImage
else RAM miss
RT->>DC: load(for:)
alt Disk hit
DC-->>RT: NSImage
RT->>MC: setObject(...)
RT-->>UI: CGImage
else Disk miss
RT->>RT: extractSonyThumbnail
RT->>MC: setObject(...)
RT->>DC: save — detached background
RT-->>UI: CGImage
end
end
TL->>TL: releaseSlot() - resume next pending continuation
10. Settings Reference
| Setting | Default | Effect |
|---|---|---|
| memoryCacheSizeMB | 5000 | Sets NSCache.totalCostLimit |
| thumbnailCostPerPixel | 4 | Cost per pixel in DiscardableThumbnail |
| thumbnailSizePreview | 1024 | Target size for bulk preload and on-demand loading via ThumbnailLoader |
| thumbnailSizeGrid | 100 | Grid thumbnail size |
| thumbnailSizeGridView | 400 | Grid View thumbnail size |
| thumbnailSizeFullSize | 8700 | Full-size zoom path upper bound |
| useThumbnailAsZoomPreview | false | Use cached thumbnail instead of re-extracting for zoom |
Synchronous Code
A Guide to Handling Heavy Synchronous Code in Swift Concurrency
This post explains why CPU-intensive synchronous code (such as image decoding via ImageIO) must be dispatched off the Swift Concurrency thread pool, and shows the correct patterns RawCull uses to do so.
DispatchQueue.global(qos:) — QoS Levels Compared
The key difference is priority and resource allocation by the system.
.userInitiated
- Priority: High (just below .userInteractive)
- Use case: Work the user directly triggered and is actively waiting for, e.g., loading a document they tapped, parsing data to display a screen
- Expected duration: Near-instantaneous to a few seconds
- System behavior: Gets more CPU time and higher thread priority — the system treats this as urgent
- Energy impact: Higher
.utility
- Priority: Low-medium
- Use case: Long-running work the user is aware of but not blocked by — e.g., downloading files, importing data, periodic syncs, progress-bar tasks
- Expected duration: Seconds to minutes
- System behavior: Balanced CPU/energy trade-off; the system throttles this more aggressively under load or low battery
- Energy impact: Lower (system may apply energy efficiency optimizations)
Quick Comparison
| | .userInitiated | .utility |
|---|---|---|
| Priority | High | Low-medium |
| User waiting? | Yes, directly | Aware but not blocked |
| Duration | < a few seconds | Seconds to minutes |
| CPU allocation | Aggressive | Conservative |
| Battery impact | Higher | Lower |
| Thread pool | Higher-priority threads | Lower-priority threads |
Rule of thumb
// User tapped "Load" and is staring at a spinner → userInitiated
DispatchQueue.global(qos: .userInitiated).async {
    let data = loadCriticalData()
}
// Background sync / download with a progress bar → utility
DispatchQueue.global(qos: .utility).async {
    downloadLargeFile()
}
If you use .userInitiated for everything, you waste battery and CPU on non-urgent work. If you use .utility for user-blocking tasks, the UI will feel sluggish because the system may deprioritize the work.
1. The Core Problem: The Swift Cooperative Thread Pool
To understand why heavy synchronous code breaks modern Swift, you have to understand the difference between older Apple code (Grand Central Dispatch / GCD) and new Swift Concurrency.
- GCD (DispatchQueue) uses a dynamic thread pool. If a thread gets blocked doing heavy work, GCD notices and spawns a new thread. This prevents deadlocks but causes Thread Explosion (which drains memory and battery).
- Swift Concurrency (async/await/Task) uses a fixed-size cooperative thread pool. It strictly limits the number of background threads to exactly the number of CPU cores your device has (e.g., 6 cores = exactly 6 threads). It will never spawn more.
Because there are so few threads, Swift relies on cooperation. When an async function hits an await, it says: “I’m pausing to wait for something. Take my thread and give it to another task!” This allows 6 threads to juggle thousands of concurrent tasks.
The “Choke” (Thread Pool Starvation)
If you run heavy synchronous code (code without await) on the Swift thread pool, it hijacks the thread and refuses to give it back.
If you request 6 heavy image extractions at the same time, all 6 Swift threads are occupied. Every other task that needs the cooperative pool stalls until an extraction finishes: network callbacks are delayed, and background tasks make no progress.
2. What exactly is “Blocking Synchronous Code”?
Synchronous code executes top-to-bottom without ever pausing (it lacks the await keyword). Blocking code is synchronous code that takes a “long time” to finish (usually >10–50 milliseconds), thereby holding a thread hostage.
The 3 Types of Blocking Code:
- Heavy CPU-Bound Work: Number crunching, image processing (CoreGraphics, ImageIO), video encoding, parsing massive JSON files.
- Synchronous I/O: Reading massive files synchronously (e.g., Data(contentsOf: URL)) or older synchronous database queries. The thread is completely frozen waiting for the hard drive.
- Locks and Semaphores: Using DispatchSemaphore.wait() or NSLock intentionally pauses a thread. Waiting on a semaphore for work produced by another task is unsafe inside Swift Concurrency. A lock held briefly around a small critical section is acceptable, but it must never be held across an await.
The Checklist to Identify Blocking Code:
Ask yourself these questions about a function:
- Does it lack the async keyword in its signature?
- Does it lack internal await calls (or await Task.yield())?
- Does it take more than a few milliseconds to run?
- Is it a “Black Box” from an Apple framework (like ImageIO) or C/C++?
If the answer to the first three is Yes, it is blocking synchronous code and does not belong on the Swift Concurrency thread pool. A Yes to the fourth means you cannot make it cooperative yourself.
3. The Traps: Why Task and Actor Don’t Fix It
It is highly intuitive to try and fix blocking code using modern Swift features. However, these common approaches are dangerous traps:
Trap 1: Using Task or Task.detached
// ❌ TRAP: Still causes Thread Pool Starvation!
func extract() async throws -> CGImage {
    return try await Task.detached {
        return try Self.extractSync() // Blocks one of the 6 Swift threads
    }.value
}
Task and Task.detached do not create new background threads. They simply place work onto that same fixed-size cooperative pool (6 threads in this example). It might seem to “work” if you only test one image at a time, but at scale it starves every other task in your app.
Trap 2: Putting it inside an actor
Actors process their work one-by-one to protect state. However, Actors do not have their own dedicated threads. They borrow threads from the cooperative pool. If you run heavy sync code inside an Actor, you cause a Double Whammy:
- Thread Pool Starvation: You choked one of the 6 Swift workers.
- Actor Starvation: The Actor is locked up and cannot process any other messages until the heavy work finishes.
Trap 3: Using nonisolated
Marking an Actor function as nonisolated just means “this doesn’t touch the Actor’s private state.” It prevents Actor Starvation, but the function still physically runs on the exact same 6-thread pool, causing Thread Pool Starvation.
4. The Correct Solution: The GCD Escape Hatch
Apple’s guidance is that heavy, blocking synchronous code you cannot modify should be moved off the cooperative pool, for example onto a Grand Central Dispatch (GCD) queue, and bridged back into Swift Concurrency with a continuation.
By wrapping the work in DispatchQueue.global().async and withCheckedThrowingContinuation, you push the heavy work out of Swift’s strict 6-thread pool and into GCD’s flexible thread pool (which is allowed to spin up extra threads).
This leaves the precious Swift Concurrency threads completely free to continue juggling all the other await tasks in your app.
Two functions in RawCull use DispatchQueue.global
extract JPGs from ARW files
static func extractEmbeddedPreview(
    from arwURL: URL,
    fullSize: Bool = false
) async -> CGImage? {
    let maxThumbnailSize: CGFloat = fullSize ? 8640 : 4320
    return await withCheckedContinuation { (continuation: CheckedContinuation<CGImage?, Never>) in
        // Dispatch to GCD to prevent Thread Pool Starvation
        DispatchQueue.global(qos: .utility).async {
            guard let imageSource = CGImageSourceCreateWithURL(arwURL as CFURL, nil) else {
                Logger.process.warning("PreviewExtractor: Failed to create image source")
                continuation.resume(returning: nil)
                return
            }
            let imageCount = CGImageSourceGetCount(imageSource)
            var targetIndex: Int = -1
            var targetWidth = 0
            // 1. Find the LARGEST JPEG available
            for index in 0 ..< imageCount {
                guard let properties = CGImageSourceCopyPropertiesAtIndex(
                    imageSource,
                    index,
                    nil
                ) as? [CFString: Any]
                else {
                    Logger.process.debugMessageOnly("enum: extractEmbeddedPreview(): Index \(index) - Failed to get properties")
                    continue
                }
                // An image counts as JPEG if it has a JFIF dictionary or TIFF compression type 6
                let hasJFIF = (properties[kCGImagePropertyJFIFDictionary] as? [CFString: Any]) != nil
                let tiffDict = properties[kCGImagePropertyTIFFDictionary] as? [CFString: Any]
                let compression = tiffDict?[kCGImagePropertyTIFFCompression] as? Int
                let isJPEG = hasJFIF || (compression == 6)
                if let width = getWidth(from: properties) {
                    if isJPEG, width > targetWidth {
                        targetWidth = width
                        targetIndex = index
                    }
                }
            }
            guard targetIndex != -1 else {
                Logger.process.warning("PreviewExtractor: No JPEG found in file")
                continuation.resume(returning: nil)
                return
            }
            let requiresDownsampling = CGFloat(targetWidth) > maxThumbnailSize
            let result: CGImage?
            // 2. Decode & Downsample using ImageIO directly
            if requiresDownsampling {
                Logger.process.info("PreviewExtractor: Native downsampling to \(maxThumbnailSize)px")
                // These options let ImageIO downsample during decode, so no separate resize step is needed
                let options: [CFString: Any] = [
                    kCGImageSourceCreateThumbnailFromImageAlways: true,
                    kCGImageSourceCreateThumbnailWithTransform: true,
                    kCGImageSourceThumbnailMaxPixelSize: Int(maxThumbnailSize)
                ]
                result = CGImageSourceCreateThumbnailAtIndex(imageSource, targetIndex, options as CFDictionary)
            } else {
                Logger.process.info("PreviewExtractor: Using original preview size (\(targetWidth)px)")
                // Standard decoding options: decode at original size and cache immediately
                let decodeOptions: [CFString: Any] = [
                    kCGImageSourceShouldCache: true,
                    kCGImageSourceShouldCacheImmediately: true
                ]
                result = CGImageSourceCreateImageAtIndex(imageSource, targetIndex, decodeOptions as CFDictionary)
            }
            continuation.resume(returning: result)
        }
    }
}
extract thumbnails
import AppKit
import Foundation
enum SonyThumbnailExtractor {
    /// Extract thumbnail using generic ImageIO framework.
    /// - Parameters:
    ///   - url: The URL of the RAW image file.
    ///   - maxDimension: Maximum pixel size for the longest edge of the thumbnail.
    ///   - qualityCost: Interpolation cost.
    /// - Returns: A `CGImage` thumbnail.
    static func extractSonyThumbnail(
        from url: URL,
        maxDimension: CGFloat,
        qualityCost: Int = 4
    ) async throws -> CGImage {
        // We MUST explicitly hop off the current thread.
        // Since we are an enum and static, we have no isolation of our own.
        // If we don't do this, we run on the caller's thread (the Actor), causing serialization.
        try await withCheckedThrowingContinuation { continuation in
            DispatchQueue.global(qos: .userInitiated).async {
                do {
                    let image = try Self.extractSync(
                        from: url,
                        maxDimension: maxDimension,
                        qualityCost: qualityCost
                    )
                    continuation.resume(returning: image)
                } catch {
                    continuation.resume(throwing: error)
                }
            }
        }
    }
    // extractSync(from:maxDimension:qualityCost:) is not shown in this excerpt.
}
5. The “Modern Swift” Alternative (If you own the code)
If extractSync was your own custom Swift code (and not an opaque framework like ImageIO), the truly “Modern Swift” way to fix it is to rewrite the synchronous loop to be cooperative.
You do this by sprinkling await Task.yield() inside heavy loops to voluntarily give the thread back:
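// Illustrative only: `rows` and `process` stand in for your own pixel data and per-row work.
func processRowsCooperatively<Row>(_ rows: [Row], process: (Row) -> Void) async {
    for (index, row) in rows.enumerated() {
        process(row)
        // Every few rows, pause and let another part of the app use the thread!
        if index % 10 == 0 {
            await Task.yield()
        }
    }
}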
If you can do this, you don’t need DispatchQueue! But if you are using black-box code that you can’t add await to, the GCD Escape Hatch is the correct, Apple-approved architecture.
Summary
Heavy synchronous code — especially CPU-bound ImageIO work — must never run directly on Swift’s cooperative thread pool. The GCD escape hatch (DispatchQueue.global + withCheckedContinuation) moves that work onto GCD’s flexible thread pool, leaving Swift Concurrency threads free. RawCull uses this pattern for both thumbnail extraction (userInitiated priority) and JPEG preview extraction (utility priority).
Security-Scoped URLs
Security-scoped URLs are a cornerstone of macOS App Sandbox security. RawCull uses them to gain persistent, user-approved access to source and destination folders while remaining fully sandbox-compliant. This article walks through exactly how the implementation works, tracing the code from user interaction through to file operations.
What Are Security-Scoped URLs?
A security-scoped URL is a special file URL that carries a cryptographic capability granted by macOS, representing explicit user consent to access a specific file or folder. Without it, a sandboxed app cannot read or write anything outside its own container.
Key properties:
- Created only from user-granted file access (file picker, drag-and-drop)
- Grants temporary access to files outside the app sandbox
- Must be explicitly activated (startAccessingSecurityScopedResource()) before use and deactivated (stopAccessingSecurityScopedResource()) after
- Can be serialized as a bookmark: a persistent token, stored in UserDefaults, that survives app restarts
Core API:
// Activate access: must be called before any file operations on the URL
let granted = url.startAccessingSecurityScopedResource() // returns Bool
// Deactivate: must always be paired with a successful start call
url.stopAccessingSecurityScopedResource()
// Serialize to persistent bookmark data
let bookmarkData = try url.bookmarkData(
    options: .withSecurityScope,
    includingResourceValuesForKeys: nil,
    relativeTo: nil
)
// Restore from bookmark (across app launches)
var isStale = false
let restoredURL = try URL(
    resolvingBookmarkData: bookmarkData,
    options: .withSecurityScope,
    relativeTo: nil,
    bookmarkDataIsStale: &isStale
)
Architecture in RawCull
RawCull’s security-scoped URL system has three distinct layers, each with a specific responsibility.
Layer 1 — Initial User Selection (OpencatalogView)
OpencatalogView presents the macOS folder picker using SwiftUI’s .fileImporter() modifier. When the user selects a folder, the resulting URL is a short-lived security-scoped URL. The view immediately converts it into a persistent bookmark.
File: RawCull/Views/CopyFiles/OpencatalogView.swift
.fileImporter(
    isPresented: $isImporting,
    allowedContentTypes: [.directory]
) { result in
    switch result {
    case .success(let url):
        // Activate access immediately: required to create a bookmark
        guard url.startAccessingSecurityScopedResource() else {
            Logger.process.errorMessageOnly("Failed to start accessing resource")
            return
        }
        // Store the path string for immediate UI use
        selecteditem = url.path
        // Serialize the URL to a persistent bookmark while access is active
        do {
            let bookmarkData = try url.bookmarkData(
                options: .withSecurityScope,
                includingResourceValuesForKeys: nil,
                relativeTo: nil
            )
            UserDefaults.standard.set(bookmarkData, forKey: bookmarkKey)
        } catch {
            Logger.process.warning("Could not create bookmark: \(error)")
        }
        // Release access; it will be reacquired via bookmark when needed
        url.stopAccessingSecurityScopedResource()
    case .failure(let error):
        Logger.process.errorMessageOnly("File picker error: \(error)")
    }
}
bookmarkKey is either "sourceBookmark" or "destBookmark" — the two folder roles in RawCull.
What this layer guarantees:
- Bookmark is created while access is still active (the only valid window for bookmark creation)
- Access is released immediately after — the bookmark takes over for future launches
- The path is captured before releasing access, so the UI can display it without holding an open security scope
Layer 2 — Bookmark Restoration (ExecuteCopyFiles)
When the user initiates a copy operation on a subsequent launch, ExecuteCopyFiles resolves the stored bookmarks back into live, access-granted URLs.
File: RawCull/Model/ParametersRsync/ExecuteCopyFiles.swift
func getAccessedURL(fromBookmarkKey key: String, fallbackPath: String) -> URL? {
    // Primary path: restore from persisted bookmark
    if let bookmarkData = UserDefaults.standard.data(forKey: key) {
        do {
            var isStale = false
            let url = try URL(
                resolvingBookmarkData: bookmarkData,
                options: .withSecurityScope,
                relativeTo: nil,
                bookmarkDataIsStale: &isStale
            )
            // Activate access on the resolved URL
            guard url.startAccessingSecurityScopedResource() else {
                Logger.process.errorMessageOnly("Failed to start accessing bookmark for \(key)")
                return tryFallbackPath(fallbackPath, key: key)
            }
            // Warn if the folder was moved (bookmark is stale)
            if isStale {
                Logger.process.warning("Bookmark is stale for \(key) - user may need to reselect")
            }
            return url // Caller is responsible for stopAccessingSecurityScopedResource()
        } catch {
            Logger.process.errorMessageOnly("Bookmark resolution failed for \(key): \(error)")
            return tryFallbackPath(fallbackPath, key: key)
        }
    }
    return tryFallbackPath(fallbackPath, key: key)
}
private func tryFallbackPath(_ fallbackPath: String, key: String) -> URL? {
    let fallbackURL = URL(fileURLWithPath: fallbackPath)
    guard fallbackURL.startAccessingSecurityScopedResource() else {
        Logger.process.errorMessageOnly("Failed to access fallback path for \(key)")
        return nil
    }
    return fallbackURL
}
The returned URL has startAccessingSecurityScopedResource() already called. The calling code in ExecuteCopyFiles is responsible for calling stopAccessingSecurityScopedResource() on each URL once the rsync operation completes.
What this layer handles:
- Normal case: bookmark resolves cleanly → URL returned with access active
- Stale bookmark: folder was moved → logged as warning, access still attempted
- Bookmark resolution throws: falls back to direct path access
- No bookmark stored at all: falls back to direct path access
Layer 3 — Scoped Access During File Operations (ScanFiles)
When scanning a directory for ARW files, the ScanFiles actor activates and deactivates security-scoped access for the duration of the scan only.
File: RawCull/Actors/ScanFiles.swift
actor ScanFiles {
    func scanFiles(url: URL, onProgress: @escaping (Double) -> Void) async -> [FileItem] {
        // Activate access for this URL
        guard url.startAccessingSecurityScopedResource() else {
            return []
        }
        // defer guarantees deactivation on every exit path, including early returns
        defer { url.stopAccessingSecurityScopedResource() }
        let manager = FileManager.default
        guard let contents = try? manager.contentsOfDirectory(
            at: url,
            includingPropertiesForKeys: [
                .fileSizeKey,
                .contentModificationDateKey,
                .typeIdentifierKey
            ],
            options: [.skipsHiddenFiles]
        ) else { return [] }
        return await processContents(contents, onProgress: onProgress)
    }
}
The defer pattern is critical here: it guarantees that stopAccessingSecurityScopedResource() is called on every exit path from the function, whether it completes normally or returns early. This prevents security-scoped resources from being “leaked” (left open indefinitely).
Actor isolation: Because ScanFiles is a Swift actor, access to its state is serialized by the runtime. Actors are reentrant, however, so a second scanFiles call can start while the first is suspended at an await. Each call balances its own start/stop pair, so overlapping scans remain safe.
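The start/defer/stop pairing can also be factored into a small helper. This is a hedged sketch for macOS: withSecurityScopedAccess is a hypothetical function that does not appear in RawCull, and folderURL below is only a placeholder for a URL obtained from a file picker or a resolved bookmark.

```swift
import Foundation

/// Hypothetical helper: runs `body` while security-scoped access to `url` is active.
/// Returns nil without running `body` if access could not be started.
func withSecurityScopedAccess<T>(to url: URL, _ body: (URL) throws -> T) rethrows -> T? {
    guard url.startAccessingSecurityScopedResource() else { return nil }
    // Runs on every exit path of this function: normal return or a thrown error.
    defer { url.stopAccessingSecurityScopedResource() }
    return try body(url)
}

// Usage: list a folder's contents, holding access only for the duration of the call.
// Placeholder URL; in a sandboxed app this would come from a picker or bookmark.
let folderURL = FileManager.default.temporaryDirectory
let fileNames = withSecurityScopedAccess(to: folderURL) { url in
    (try? FileManager.default.contentsOfDirectory(atPath: url.path)) ?? []
}
```

Keeping the start and stop calls in one function makes an unbalanced pair harder to write, at the cost of requiring all work to happen inside the closure.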
Global Access Tracking in RawCullViewModel
The main view model maintains a comprehensive registry of all URLs for which startAccessingSecurityScopedResource() has been called, ensuring nothing is left open when the app quits.
File: RawCull/Model/ViewModels/RawCullViewModel.swift
@Observable @MainActor
final class RawCullViewModel {
    private var securityScopedURLs: Set<URL> = []
    func trackSecurityScopedAccess(for url: URL) {
        securityScopedURLs.insert(url)
    }
    func stopSecurityScopedAccess(for url: URL) {
        guard securityScopedURLs.contains(url) else { return }
        url.stopAccessingSecurityScopedResource()
        securityScopedURLs.remove(url)
    }
    deinit {
        // Release all remaining security-scoped access on teardown
        for url in securityScopedURLs {
            url.stopAccessingSecurityScopedResource()
        }
    }
}
This acts as a safety net: even if a call path omits an explicit stop, deinit releases every URL still tracked when the view model is deallocated. Combined with defer in the actors, this gives double coverage against resource leaks.
End-to-End Flow
User selects destination folder via file picker
↓
OpencatalogView.fileImporter result handler
1. url.startAccessingSecurityScopedResource()
2. selecteditem = url.path (UI binding)
3. bookmarkData = try url.bookmarkData(options: .withSecurityScope)
4. UserDefaults.set(bookmarkData, forKey: "destBookmark")
5. url.stopAccessingSecurityScopedResource()
↓
[App may be quit and relaunched here]
↓
User initiates copy operation
↓
ExecuteCopyFiles.performCopyTask()
1. getAccessedURL(fromBookmarkKey: "sourceBookmark", ...)
→ resolves bookmark → startAccessingSecurityScopedResource() → returns URL
2. getAccessedURL(fromBookmarkKey: "destBookmark", ...)
→ resolves bookmark → startAccessingSecurityScopedResource() → returns URL
3. Builds rsync argument list using both paths
4. Spawns /usr/bin/rsync via RsyncProcessStreaming
5. After rsync completes:
sourceURL.stopAccessingSecurityScopedResource()
destURL.stopAccessingSecurityScopedResource()
↓
ScanFiles.scanFiles(url: sourceURL)
1. url.startAccessingSecurityScopedResource()
2. defer { url.stopAccessingSecurityScopedResource() }
3. FileManager.contentsOfDirectory(at: url, ...)
4. Returns [FileItem] ← defer fires here, access released
↓
RawCullViewModel.deinit (on app quit)
→ stopAccessingSecurityScopedResource() for any remaining tracked URLs
Security Model Summary
| Aspect | Implementation | Guarantee |
|---|---|---|
| User consent | File picker only — no programmatic path construction | App never accesses a folder the user did not explicitly choose |
| Persistence | Bookmark serialized to UserDefaults | User does not re-select folders on every launch |
| Minimal scope duration | defer and explicit stop calls bound access to the operation | Security-scoped access is held only as long as needed |
| Leak prevention | Set<URL> in view model + deinit cleanup | No access token outlives the app session |
| Stale bookmark detection | bookmarkDataIsStale checked on every resolve | User is informed if a folder has been moved |
| Fallback resilience | Direct path access if bookmark resolution fails | Graceful degradation, operation still attempted |
| Audit trail | OSLog records every start, stop, failure, and stale event | Security events are observable via Console.app |
Common Pitfalls (and How RawCull Avoids Them)
1. Forgetting to call startAccessingSecurityScopedResource() before file operations
→ RawCull guards every file operation with an explicit start call; failure returns nil or [] rather than crashing.
2. Not calling stopAccessingSecurityScopedResource() — leaking the scope
→ defer in actors and deinit in the view model provide two independent cleanup layers.
3. Creating a bookmark while access is not active
→ OpencatalogView always creates the bookmark inside the startAccessing… / stopAccessing… window.
4. Ignoring the isStale flag
→ RawCull logs a warning when bookmarkDataIsStale is true, making stale bookmarks visible in diagnostics.
5. Using the resolved URL after calling stop
→ The view model tracks active URLs and guards against double-stop via the contains check before removing from the set.
Compiling RawCull
Overview
There are three methods to compile RawCull: one without an Apple Developer account and two with an Apple Developer account. Regardless of the method used, it is straightforward to compile RawCull, as it is not dependent on any third-party code or library.
The easiest method is by using the included Makefile. The default make in /usr/bin/make does the job.
Compile by make
If you have an Apple Developer account, you should open the RawCull project and replace the Signing & Capabilities section with your own Apple Developer ID before using make and the procedure outlined below.
Running the default make target requires an app-specific password. Two make targets are available: one creates a release build of RawCull only, while the other produces a signed version packaged as a DMG file.
If only utilizing the make archive command, the application-specific password is not required, and it would suffice to update only the Signing & Capabilities section. The make archive command will likely still function even if set to Sign to Run Locally.
To create a DMG file, make depends on the create-dmg tool. The instructions for create-dmg are included in the Makefile. Ensure that the fork of create-dmg sits in the same parent directory as the fork of RawCull. Before using make, create and store an app-specific password.
The following procedure creates and stores an app-specific password:
- Visit appleid.apple.com and log in with your Apple ID.
- Navigate to the Sign-In and Security section and select App-Specific Passwords → Generate an App-Specific Password.
- Provide a label to help identify the purpose of the password (e.g., notarytool).
- Click Create. The password will be displayed once; copy it and store it securely.
After creating the app-specific password, execute the following command and follow the prompts:
xcrun notarytool store-credentials --apple-id "youremail@gmail.com" --team-id "A1B2C3D4E5"
- Replace youremail@gmail.com and A1B2C3D4E5 with your actual credentials.
Name the app-specific password RawCull (in appleid.apple.com) and set Profile name: RawCull when executing the above command.
The following dialog will appear:
This process stores your credentials securely in the Keychain. You reference these credentials later using a profile name.
Profile name:
RawCull
App-specific password for youremail@gmail.com:
Validating your credentials...
Success. Credentials validated.
Credentials saved to Keychain.
To use them, specify `--keychain-profile "RawCull"`
After completing the steps above, the following make commands are available from the root of RawCull's source directory:
- make: generates a signed and notarized DMG file containing the release version of RawCull.
- make archive: produces an unsigned release build with all debug information removed, placed in the build/ directory.
- make clean: deletes all build data.
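The prerequisites described above can be checked before invoking make. The following is a minimal sketch and is not part of RawCull's Makefile: the function name rawcull_preflight is hypothetical, and it assumes create-dmg was cloned under its default directory name next to the RawCull checkout.

```shell
#!/bin/sh
# Hypothetical preflight helper; not part of RawCull's Makefile.
# Usage: rawcull_preflight /path/to/RawCull
rawcull_preflight() {
    src_root="$1"
    status=0

    # make is run from the root of the source tree, where the Makefile lives.
    if [ ! -f "$src_root/Makefile" ]; then
        echo "missing: $src_root/Makefile"
        status=1
    fi

    # Assumption: create-dmg keeps its default clone directory name and
    # sits in the same parent directory as the RawCull checkout.
    if [ ! -d "$src_root/../create-dmg" ]; then
        echo "missing: create-dmg checkout next to $src_root"
        status=1
    fi

    if [ "$status" -eq 0 ]; then
        echo "ok: prerequisites found"
    fi
    return "$status"
}
```

Run from the source root, for example as `rawcull_preflight "$PWD" && make`. The sketch only checks the directory layout; it does not verify the signing setup or the stored RawCull keychain profile.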
Compile by Xcode
If you have an Apple Developer account, use your Apple Developer ID in Xcode.
Apple Developer account
Open the RawCull project in Xcode. Select the top level of the project, then open the Signing & Capabilities tab. Replace Team with your team.
No Apple Developer account
As above, but set Signing Certificate to Sign to Run Locally.
To compile or run
Use Xcode to run, debug, or build RawCull, whichever you need.