Release Notes

RawCull version release notes and changelogs.

Version 1.4.0

Version 1.4.0 - April 7, 2026

RawCull supports a broader range of Sony full-frame camera bodies. The table below shows what has been tested so far.

RawCull Changes Summary (v1.3.5 → v1.4.0)

  • Sharpness scoring was significantly upgraded: added saliency-aware scoring, optional subject classification, subject-based filtering, configurable scoring parameters, and improved normalization/progress handling.
  • New UI controls and views were added around scoring and stats:
    • new Scoring Parameters sheet
    • new Scan Summary/Statistics sheet
    • toolbar toggles for score badge and saliency badge
    • expanded culling/rating status display (including per-star counts).
  • EXIF/metadata support expanded: now includes RAW compression type, size class (L/M/S), and pixel dimensions; these are surfaced in inspector UI.
  • Sony compatibility work: maker-note parser now supports more Sony bodies and adds robust diagnostics/full-file fallback for focus metadata extraction.
  • Performance/state management improvements: rating/tag caching for O(1) lookups, better cancellation behavior in scoring, memory warning refinements (soft/full), and various cleanup/refactors.
  • Persistence/settings updates: settings now store badge visibility toggles and decode safely with defaults.
  • Testing expanded with a large new ARW body compatibility diagnostic test suite plus concurrency/sendability test adjustments.
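The "decode safely with defaults" behavior can be sketched roughly as follows. The field names here are illustrative, not RawCull's actual settings keys; the point is that keys missing from an older settings file fall back to defaults instead of failing the whole decode:

```swift
import Foundation

// Hypothetical subset of the persisted settings; names are illustrative.
struct BadgeSettings: Codable {
    var memoryCacheSizeMB: Int
    var showScoreBadge: Bool
    var showSaliencyBadge: Bool

    enum CodingKeys: String, CodingKey {
        case memoryCacheSizeMB, showScoreBadge, showSaliencyBadge
    }

    // "Decode safely with defaults": a key missing from an older
    // settings file falls back to a default instead of throwing.
    init(from decoder: Decoder) throws {
        let c = try decoder.container(keyedBy: CodingKeys.self)
        memoryCacheSizeMB = try c.decodeIfPresent(Int.self, forKey: .memoryCacheSizeMB) ?? 2048
        showScoreBadge = try c.decodeIfPresent(Bool.self, forKey: .showScoreBadge) ?? true
        showSaliencyBadge = try c.decodeIfPresent(Bool.self, forKey: .showSaliencyBadge) ?? false
    }
}

// A settings file written before the badge toggles existed:
let json = Data(#"{"memoryCacheSizeMB": 1024}"#.utf8)
let settings = try! JSONDecoder().decode(BadgeSettings.self, from: json)
```

Existing keys decode normally, while the two new badge toggles take their defaults.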

ARW body compatibility diagnostic

The following Sony bodies successfully extract EXIF, focus points, sharpness, and saliency; the one exception is the ILCE-7RM5, which failed to extract saliency on one of its three test files. The ILCE-1M2 is the only body tested across all three Sony RAW size variants (S/M/L). All files use compressed RAW, and every body delivers full-resolution L-size output; across the tested files, dimensions range from 12.4 MP (the ILCE-1M2 S-crop) to 60.2 MP (the ILCE-7RM5). The ILCE-7M5 and ILCE-7RM5 are the next bodies to focus on, but I depend on additional test ARW files before officially confirming support for these two bodies.

Camera Body   EXIF  FocusPt  Sharpness  Saliency  RAW Types   Dimensions
ILCE-1M2      ✓     ✓        ✓          ✓         Compressed  4320 × 2880 (12.4 MP, S), 5616 × 3744 (21.0 MP, M), 8640 × 5760 (49.8 MP, L)
ILCE-1        ✓     ✓        ✓          ✓         Compressed  8640 × 5760 (49.8 MP, L)
ILCE-7M5      ✓     ✓        ✓          ✓         Compressed  7008 × 4672 (32.7 MP, L)
ILCE-7RM5     ✓     ✓        ✓          ✓ (2/3)   Compressed  9504 × 6336 (60.2 MP, L)
ILCE-9M3      ✓     ✓        ✓          ✓         Compressed  6000 × 4000 (24.0 MP, L)

Version 1.4.1

Version 1.4.1 - April 7, 2026

RawCull supports a broader range of Sony full-frame camera bodies. The table below shows what has been tested so far.

RawCull Changes Summary (v1.4.0 → v1.4.1)

Version 1.4.1 fixes an issue with Sony A7V (ILCE-7M5) compressed ARW files.

ARW body compatibility diagnostic

The following Sony bodies successfully extract EXIF, focus points, sharpness, and saliency; the one exception is the ILCE-7RM5, which failed to extract saliency on one of its three test files. The ILCE-1M2 is the only body tested across all three Sony RAW size variants (S/M/L). All files use compressed RAW, and every body delivers full-resolution L-size output; across the tested files, dimensions range from 12.4 MP (the ILCE-1M2 S-crop) to 60.2 MP (the ILCE-7RM5). The ILCE-7M5 and ILCE-7RM5 are the next bodies to focus on, but I depend on additional test ARW files before officially confirming support for these two bodies.

Camera Body   EXIF  FocusPt  Sharpness  Saliency  RAW Types   Dimensions
ILCE-1M2      ✓     ✓        ✓          ✓         Compressed  4320 × 2880 (12.4 MP, S), 5616 × 3744 (21.0 MP, M), 8640 × 5760 (49.8 MP, L)
ILCE-1        ✓     ✓        ✓          ✓         Compressed  8640 × 5760 (49.8 MP, L)
ILCE-7M5      ✓     ✓        ✓          ✓         Compressed  7008 × 4672 (32.7 MP, L)
ILCE-7RM5     ✓     ✓        ✓          ✓ (2/3)   Compressed  9504 × 6336 (60.2 MP, L)
ILCE-9M3      ✓     ✓        ✓          ✓         Compressed  6000 × 4000 (24.0 MP, L)

Version 1.3.5

Version 1.3.5 - April 4, 2026

RawCull supports a broader range of Sony full-frame camera bodies than originally targeted. I have tested ARW files from the Sony A7RV and the newly released Sony A7V. After Easter, I plan to acquire additional full-frame ARW files from other Sony Alpha bodies for further testing. Once these tests are complete, I will compile a comprehensive list of the Sony bodies RawCull supports.

Camera Body             Status
Sony A1 mk I and mk II  Verified
Sony A7RV               Tested and seems to work
Sony A7V                Tested and seems to work
Sony A7S mk III         Not verified
Sony A9 mk III          Not verified

The culling process now supports a two-step approach, and users can select their preferred culling method. The primary culling view is the grid view. Users can start rating immediately, but two new keystrokes, X and P, have been introduced for Reject and Pick, providing a binary selection. Rejected items are still indicated in red.

After rejecting and picking, users can select all Picks (P) and proceed with rating them from 2 to 5.

Additionally, P and X can be applied automatically after a Sharpness Scoring run. The Focus Mask has also received a minor update. Following scoring, picked and rejected items can be rated automatically.

Furthermore, after each rating the selection automatically advances to the next thumbnail for efficient culling.

Version 1.3.3

Version 1.3.3 - April 2, 2026

RawCull is developed for culling Sony A1 mk I and mk II ARW files. However, I have recently tested RawCull on the Sony A7RV and the new Sony A7V, and all functions appear to work correctly on those models as well.

In this release, RawCull supports culling by color. Please refer to the Culling section in the documentation for further details. Additionally, updates have been made to the Focus Mask and pre-calibration in Sharpness Scoring.

When a thumbnail is selected and the user switches to Grid View, the Grid View automatically scrolls to the selected image. A progress bar has also been added to the sharpness-scoring function.

Double-clicking a thumbnail in Grid View opens the Zoom View, using either the extracted JPG file or the newly created thumbnail.

Version 1.2.8

Version 1.2.8 - March 30, 2026

RawCull is developed for culling Sony A1 mk I and mk II ARW files. However, I have recently tested RawCull on the Sony A7RV and the new Sony A7V, and all functions appear to work correctly on those models as well.

Code cleanup and several minor bug fixes. A focus issue was also addressed: previously, two clicks on the vertical or horizontal thumbnail row were needed before the arrow keys could be used for navigation and tagging; now a single click suffices. Furthermore, the tag command was changed to a single keypress of T, which toggles tagging on and off.

Technical Deep Dives

Technical articles about RawCull’s implementation, architecture, and advanced concepts.

Swift Concurrency in RawCull

A summary of how concurrency is used in RawCull.


1 Why Concurrency Matters in RawCull

RawCull is a macOS photo-culling application that works with Sony A1 ARW raw files. A single RAW file from the A1 can be 50–80 MB. When you open a folder with hundreds of shots, the app must scan metadata, extract embedded JPEG previews, decode thumbnails, and manage a multi-gigabyte in-memory cache — all while keeping the UI perfectly fluid and responsive at 60 fps. Without concurrency that would be impossible.

RawCull is written in Swift 6, which has strict concurrency checking enabled by default. This means the compiler itself verifies thread safety at compile time. The project makes heavy use of Swift’s structured concurrency model: actors, async/await, task groups, and the MainActor.

Swift 6: Strict concurrency checking turns data-race warnings into hard compiler errors. Every type that crosses a concurrency boundary must be Sendable, and all shared mutable state must be isolated to an actor.
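As a minimal illustration (the types here are invented for this example, not taken from RawCull), a value type whose stored properties are all Sendable can cross an actor boundary freely, whereas the compiler would reject a non-Sendable class in the same position:

```swift
import Foundation

// Illustrative types, not from RawCull. A struct made only of
// Sendable pieces (URL, Int) is itself Sendable.
struct ThumbnailRequest: Sendable {
    let url: URL
    let targetSize: Int
}

actor Worker {
    // Only one caller at a time executes inside the actor.
    func process(_ request: ThumbnailRequest) -> Int {
        request.targetSize * 2
    }
}

let worker = Worker()
let request = ThumbnailRequest(url: URL(fileURLWithPath: "/tmp/example.arw"),
                               targetSize: 256)
// Passing `request` across the actor boundary is legal because it is
// Sendable; a class holding mutable state here would fail to compile
// under Swift 6 strict concurrency checking.
let size = await worker.process(request)
```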


2 async / await — The Foundation

async/await is the cornerstone of Swift’s structured concurrency model, introduced in Swift 5.5 (WWDC 2021). An async function can suspend itself — yielding the underlying thread to other work — then resume where it left off when the result is ready. Unlike Grand Central Dispatch callbacks, the code reads top-to-bottom like ordinary synchronous code, which makes it far easier to reason about.

How it looks

// A normal synchronous function — blocks the calling thread the entire time
func loadImageBlocking(url: URL) -> NSImage? { ... }

// An async function — suspends while waiting; doesn't block any thread
func loadImageAsync(url: URL) async -> NSImage? {
    // 'await' means: "pause here and let other work run until I'm done"
    let data = await fetchDataFromDisk(url: url)
    return NSImage(data: data)
}

// Calling an async function — you must also be in an async context
func showImage() async {
    let image = await loadImageAsync(url: someURL)  // suspends here
    updateUI(image)                                  // resumes on same actor
}

In RawCull, virtually every file-loading, cache-lookup, and thumbnail-generation operation is async. This keeps the main thread (and therefore the UI) always free.


3 Actors — Thread-Safe Isolated State

An actor is a reference type (like a class) that protects its mutable state with automatic mutual exclusion. Only one caller can execute inside an actor at a time. You don’t need locks, dispatch queues, or semaphores — the Swift runtime enforces the isolation. If you try to read an actor’s property from outside without await, the compiler refuses to compile.

The rule in one sentence: Every stored property of an actor is only readable and writable from within that actor’s own methods. All other callers must await a method call to hop onto the actor.

RawCull’s actors at a glance

  • ScanFiles (Actors/ScanFiles.swift): Scans a folder for ARW files, reads EXIF, extracts focus points
  • ScanAndCreateThumbnails (Actors/ScanAndCreateThumbnails.swift): Orchestrates bulk thumbnail creation with a concurrent task group
  • RequestThumbnail (Actors/RequestThumbnail.swift): On-demand thumbnail resolver (RAM → disk → extract)
  • ThumbnailLoader (Actors/ThumbnailLoader.swift): Rate-limits concurrent thumbnail requests using continuations
  • DiskCacheManager (Actors/DiskCacheManager.swift): Reads and writes JPEG thumbnails to/from the on-disk cache
  • SharedMemoryCache (Actors/SharedMemoryCache.swift): Singleton wrapping NSCache; manages memory pressure and config
  • ExtractAndSaveJPGs (Actors/ExtractAndSaveJPGs.swift): Extracts full-resolution JPEGs from ARW files in parallel
  • DiscoverFiles (Actors/DiscoverFiles.swift): Recursively enumerates .arw files in a directory
  • ActorCreateOutputforView (Actors/ActorCreateOutputforView.swift): Converts rsync output strings to RsyncOutputData structs

A minimal actor example from the project

// From Actors/DiscoverFiles.swift
actor DiscoverFiles {

    // @concurrent tells Swift: run this method on the cooperative thread pool,
    // not on the actor's serial queue. Safe because the method only uses
    // local variables — no actor state is touched.
    @concurrent
    nonisolated func discoverFiles(at catalogURL: URL, recursive: Bool) async -> [URL] {
        await Task {
            let supported: Set<String> = [SupportedFileType.arw.rawValue]
            let fileManager = FileManager.default
            var urls: [URL] = []
            guard let enumerator = fileManager.enumerator(
                at: catalogURL,
                includingPropertiesForKeys: [.isRegularFileKey],
                options: recursive ? [] : [.skipsSubdirectoryDescendants]
            ) else { return urls }
            while let fileURL = enumerator.nextObject() as? URL {
                if supported.contains(fileURL.pathExtension.lowercased()) {
                    urls.append(fileURL)
                }
            }
            return urls
        }.value
    }
}

discoverFiles is both nonisolated and @concurrent. Because it never reads or writes any property of the actor, it does not need to run on the actor’s serial queue — Swift can run it on any available thread in the cooperative pool, improving throughput.


4 @MainActor — Protecting the UI Thread

The main thread in a macOS/iOS app is special: all UI rendering must happen there. Swift’s @MainActor annotation is a global actor that ensures any code it annotates runs exclusively on the main thread. This replaces the old pattern of DispatchQueue.main.async { ... } with something the compiler can verify.

RawCullViewModel — the whole class lives on @MainActor

// From Model/ViewModels/RawCullViewModel.swift

@Observable @MainActor         // <-- every property and method is main-thread only
final class RawCullViewModel {

    var files: [FileItem] = []           // Safe: only touched on main thread
    var filteredFiles: [FileItem] = []   // Safe: same
    var scanning: Bool = false           // Drives the scan progress indicator
    var creatingthumbnails: Bool = false // Drives UI animations

    func handleSourceChange(url: URL) async {
        // 'async' lets the function suspend while waiting for actor work,
        // but it always starts and ends on the main thread (because of @MainActor)
        scanning = true
        let scan = ScanFiles()           // Create a ScanFiles actor
        files = await scan.scanFiles(url: url)  // Hop to ScanFiles actor, wait, return
        // Back on main thread here — safe to update UI
        scanning = false
    }
}

When handleSourceChange calls await scan.scanFiles(...), the main thread suspends (it is not blocked — it continues to process other UI events). When the scan is done, Swift automatically resumes on the main thread before assigning to files. This is the key insight: @MainActor + async/await means you never have to manually dispatch back to the main thread.

ExecuteCopyFiles — another @MainActor class

// From Model/ParametersRsync/ExecuteCopyFiles.swift

@Observable @MainActor
final class ExecuteCopyFiles {

    private func handleProcessTermination(
        stringoutputfromrsync: [String]?,
        hiddenID: Int?
    ) async {
        let viewOutput = await ActorCreateOutputforView()
                                .createOutputForView(stringoutputfromrsync)

        let result = CopyDataResult(output: stringoutputfromrsync,
                                    viewOutput: viewOutput,
                                    linesCount: stringoutputfromrsync?.count ?? 0)
        onCompletion?(result)

        // Ensure completion handler finishes before cleaning up resources
        try? await Task.sleep(for: .milliseconds(10))
        cleanup()
    }
}

Why the sleep? The brief Task.sleep before cleanup() is an intentional concurrency fix. Without it, there was a race condition: the security-scoped resource access could be released before the onCompletion callback had finished using it.

Crossing the boundary with MainActor.run

// From Model/ViewModels/SettingsViewModel.swift

// nonisolated means this is accessible without an actor hop,
// but to safely READ the @Observable properties we still need
// to jump to the MainActor for just a moment.

nonisolated func asyncgetsettings() async -> SavedSettings {
    await MainActor.run {               // Hop to main thread, read, return
        SavedSettings(
            memoryCacheSizeMB: self.memoryCacheSizeMB,
            thumbnailSizeGrid: self.thumbnailSizeGrid,
            thumbnailSizePreview: self.thumbnailSizePreview,
            thumbnailSizeFullSize: self.thumbnailSizeFullSize,
            thumbnailCostPerPixel: self.thumbnailCostPerPixel,
            thumbnailSizeGridView: self.thumbnailSizeGridView,
            useThumbnailAsZoomPreview: self.useThumbnailAsZoomPreview
        )
    }   // Back to the calling actor with a Sendable value type
}

This pattern — nonisolated async function + MainActor.run — is the standard way to safely read @Observable (main-thread) properties from background actors. SavedSettings is a plain Codable struct (a value type), so it is Sendable and safe to return across the actor boundary.


5 Task Groups — Parallel File Processing

When you have a collection of independent items to process — like hundreds of RAW files — you want to process them in parallel, not one by one. Swift’s withTaskGroup (and its throwing counterpart withThrowingTaskGroup) lets you spawn many child tasks and collect their results. The child tasks are scheduled onto the cooperative thread pool, which has roughly one thread per CPU core, so the number of tasks actually executing at once stays bounded.

Thumbnail preloading with withTaskGroup

// From Actors/ScanAndCreateThumbnails.swift

func preloadCatalog(at catalogURL: URL, targetSize: Int) async -> Int {
    await ensureReady()
    cancelPreload()   // Cancel any ongoing previous preload

    let task = Task<Int, Never> {
        successCount = 0
        let urls = await DiscoverFiles().discoverFiles(at: catalogURL, recursive: false)
        totalFilesToProcess = urls.count

        return await withTaskGroup(of: Void.self) { group in
            // Allow up to (CPU cores × 2) concurrent thumbnail jobs
            let maxConcurrent = ProcessInfo.processInfo.activeProcessorCount * 2

            for (index, url) in urls.enumerated() {
                if Task.isCancelled {
                    group.cancelAll()   // Propagate cancellation to child tasks
                    break
                }

                // Once we've queued maxConcurrent tasks, wait for one to finish
                // before adding more — this is backpressure / throttling
                if index >= maxConcurrent {
                    await group.next()
                }

                group.addTask {
                    await self.processSingleFile(url, targetSize: targetSize, itemIndex: index)
                }
            }

            await group.waitForAll()
            return successCount
        }
    }

    preloadTask = task         // Store so we can cancel it later
    return await task.value
}

The maxConcurrent throttle is important: if you queued 2,000 tasks at once, Swift would create 2,000 concurrent tasks competing for CPU and disk I/O. Instead, RawCull keeps at most (active CPU cores × 2) tasks in flight at any one time. When one finishes (await group.next()), the loop adds the next one.
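The same throttling idea can be shown in a self-contained sketch (a hypothetical helper, not RawCull's code), with a tiny sleep standing in for thumbnail work:

```swift
// Process `count` items with at most `maxConcurrent` child tasks in
// flight, using the same "queue N, then await one before adding more" loop.
func processAll(count: Int, maxConcurrent: Int) async -> Int {
    await withTaskGroup(of: Int.self) { group in
        var completed = 0
        for index in 0..<count {
            // Backpressure: once maxConcurrent tasks are queued,
            // wait for one to finish before adding the next.
            if index >= maxConcurrent, let finished = await group.next() {
                completed += finished
            }
            group.addTask {
                try? await Task.sleep(for: .milliseconds(1))  // stand-in for real work
                return 1
            }
        }
        // Drain the remaining in-flight tasks.
        for await finished in group { completed += finished }
        return completed
    }
}

let total = await processAll(count: 50, maxConcurrent: 8)
```

Every item is processed exactly once, while at most eight tasks ever run concurrently.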

Parallel focus-point extraction in ScanFiles

// From Actors/ScanFiles.swift

private func extractNativeFocusPoints(from items: [FileItem]) async -> [DecodeFocusPoints]? {
    let collected = await withTaskGroup(of: DecodeFocusPoints?.self) { group in
        for item in items {
            group.addTask {
                // SonyMakerNoteParser.focusLocation is a pure function — no shared state
                guard let location = SonyMakerNoteParser.focusLocation(from: item.url)
                else { return nil }
                return DecodeFocusPoints(
                    sourceFile: item.url.lastPathComponent,
                    focusLocation: location
                )
            }
        }

        // Collect results as tasks complete (order not guaranteed)
        var results: [DecodeFocusPoints] = []
        for await result in group {
            if let r = result { results.append(r) }
        }
        return results
    }
    return collected.isEmpty ? nil : collected
}

6 Task Cancellation — Cooperative, Not Forceful

Swift concurrency uses cooperative cancellation. You cannot forcefully kill a Task; instead, you call task.cancel() to set a cancellation flag, and the task’s code must periodically check Task.isCancelled and stop voluntarily. This is the correct pattern: clean shutdown instead of dangling resources.

How RawCull cancels thumbnail preloading

// From Model/ViewModels/RawCullViewModel.swift

func abort() {
    // 1. Cancel the outer Task wrapper
    preloadTask?.cancel()
    preloadTask = nil

    // 2. Tell the actor to cancel its internal Task too
    if let actor = currentScanAndCreateThumbnailsActor {
        Task { await actor.cancelPreload() }
    }
    currentScanAndCreateThumbnailsActor = nil

    // 3. Cancel JPG extraction the same way
    if let actor = currentExtractAndSaveJPGsActor {
        Task { await actor.cancelExtractJPGSTask() }
    }
    currentExtractAndSaveJPGsActor = nil

    creatingthumbnails = false
}

Checking cancellation inside the worker

// From Actors/ScanAndCreateThumbnails.swift

private func processSingleFile(_ url: URL, targetSize: Int, itemIndex: Int) async {
    // Check before doing any I/O
    if Task.isCancelled { return }

    // Check RAM cache...
    if let wrapper = SharedMemoryCache.shared.object(forKey: url as NSURL) { ... }

    // Check again before slower disk operation
    if Task.isCancelled { return }

    // Load from disk cache...
    if let diskImage = await diskCache.load(for: url) { ... }

    // Check again before the most expensive operation: raw file extraction
    if Task.isCancelled { return }

    // (simplified excerpt — the function is non-throwing, so use try?)
    let cgImage = try? await SonyThumbnailExtractor.extractSonyThumbnail(...)
}

Each Task.isCancelled guard cuts work short at logical checkpoints. The more expensive the upcoming operation, the more important the guard is. This gives smooth, instant response when the user switches to a different folder.
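There is also a throwing variant, Task.checkCancellation(), which throws CancellationError instead of returning a Bool, and built-in suspension points such as Task.sleep are cancellation-aware themselves. A small self-contained sketch (not RawCull code):

```swift
// A slow step with an explicit cancellation checkpoint. Task.sleep
// itself throws CancellationError as soon as the task is cancelled,
// rather than sleeping to completion.
func slowStep() async throws -> Int {
    try Task.checkCancellation()            // cheap explicit checkpoint
    try await Task.sleep(for: .seconds(10)) // interrupted promptly on cancel
    return 42
}

let task = Task { try await slowStep() }
task.cancel()                        // cooperative: just sets the flag
let outcome = try? await task.value  // nil, because slowStep threw
```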


7 Task and Task.detached

Sometimes you want to start background work without awaiting its result — a fire-and-forget pattern. Swift provides two ways to do this:

  • Task { ... } — inherits the current actor context and task priority. If called from @MainActor, it also runs on @MainActor unless it awaits something that moves it elsewhere.
  • Task.detached { ... } — starts a completely independent task. It inherits no actor context and runs on the cooperative thread pool at the specified priority. Use this for genuinely background work that has no relationship to the calling context.

Saving to disk in the background (Task.detached)

// From Actors/ScanAndCreateThumbnails.swift

// We have a cgImage — encode it to Data INSIDE this actor
// before crossing any boundary. CGImage is NOT Sendable.
guard let jpegData = DiskCacheManager.jpegData(from: cgImage) else { return }
// Data IS Sendable — safe to pass to a detached task.

let dcache = diskCache   // Capture the actor reference (actors are Sendable)
Task.detached(priority: .background) {
    // Runs on a background thread — no actor context
    await dcache.save(jpegData, for: url)
}
// We DON'T await this — the thumbnail is shown immediately
// while the disk write happens silently in the background.

This is a key pattern: encode the image to Data (a value type, Sendable) while still inside the actor that owns the CGImage. Only after encoding do we hand it off to a detached task. This avoids the Swift 6 compile error that would occur if we tried to send a CGImage across a task boundary.

UI callback fire-and-forget (Task on @MainActor)

// From Actors/ScanAndCreateThumbnails.swift

private func notifyFileHandler(_ count: Int) {
    let handler = fileHandlers?.fileHandler
    Task { @MainActor in handler?(count) }
    // Creates a Task that runs on the main thread,
    // but we immediately return without awaiting it.
    // Thumbnail generation must NOT stall waiting for UI rendering.
}

8 SharedMemoryCache — A Singleton Actor

SharedMemoryCache is one of the most sophisticated concurrency designs in RawCull. It is a singleton actor that wraps NSCache (Apple’s automatic memory-evicting cache). It cleverly combines actor isolation for configuration with nonisolated access for the NSCache itself.

// From Actors/SharedMemoryCache.swift (simplified)

actor SharedMemoryCache {
    // Singleton — accessible as SharedMemoryCache.shared from any context
    nonisolated static let shared = SharedMemoryCache()

    // ── Actor-isolated state (requires await to access) ──────────────────
    private var _costPerPixel: Int = 4
    private var savedSettings: SavedSettings?
    private var setupTask: Task<Void, Never>?
    private var memoryPressureSource: DispatchSourceMemoryPressure?

    // ── Non-isolated state (no await needed) ─────────────────────────────
    // NSCache is internally thread-safe, so we can safely bypass the
    // actor's serialization for fast synchronous lookups.
    nonisolated(unsafe) let memoryCache = NSCache<NSURL, DiscardableThumbnail>()

    // Synchronous cache lookup — no 'await' required by callers
    nonisolated func object(forKey key: NSURL) -> DiscardableThumbnail? {
        memoryCache.object(forKey: key)
    }
    nonisolated func setObject(_ obj: DiscardableThumbnail, forKey key: NSURL, cost: Int) {
        memoryCache.setObject(obj, forKey: key, cost: cost)
    }
}

The key insight is the two-tier design. Configuration properties (cost per pixel, settings, memory pressure source) are actor-isolated and require await. But the hot-path NSCache operations (lookups and insertions) are nonisolated — they happen in every SwiftUI view that renders a thumbnail, and they must be fast. NSCache provides its own thread safety, so nonisolated(unsafe) is legitimate here.

Guarding against duplicate initialization with a stored Task

func ensureReady(config: CacheConfig? = nil) async {
    // If setup is already in progress (or done), just wait for it to finish
    if let task = setupTask {
        return await task.value   // Join the existing task — don't start a new one
    }

    // Start a new setup task — store it IMMEDIATELY before awaiting
    let newTask = Task {
        self.startMemoryPressureMonitoring()
        let settings = await SettingsViewModel.shared.asyncgetsettings()
        let config   = self.calculateConfig(from: settings)
        self.applyConfig(config)
    }

    // Storing BEFORE awaiting is critical: if another caller arrives during
    // the await below, they'll find setupTask already set and join it.
    setupTask = newTask
    await newTask.value
}

Race condition fix: If you stored setupTask = newTask after await newTask.value, a second concurrent caller could find setupTask still nil and start a duplicate initialization. Storing it immediately after creation is the correct pattern.

Memory pressure monitoring with DispatchSource

private func startMemoryPressureMonitoring() {
    let source = DispatchSource.makeMemoryPressureSource(
        eventMask: .all, queue: .global(qos: .utility)
    )

    // When the OS fires a memory pressure event (on a GCD background queue),
    // we create a Task to hop back onto the actor and respond.
    source.setEventHandler { [weak self] in
        guard let self else { return }
        Task {
            await self.handleMemoryPressureEvent()
        }
    }

    source.resume()
    memoryPressureSource = source
}

9 AsyncStream — Streaming Progress Updates

AsyncStream is Swift’s way to model a sequence of values that arrive over time — analogous to a Combine publisher or a Unix pipe, but using async/await. RawCull uses AsyncStream to stream progress updates from the rsync copy process to the UI.

// From Model/ParametersRsync/ExecuteCopyFiles.swift

// In init(): create an AsyncStream with its continuation
let (stream, continuation) = AsyncStream.makeStream(of: Int.self)
self.progressStream       = stream        // Consumer reads from this
self.progressContinuation = continuation  // Producer writes to this

// ── Producer (inside streaming handler callback) ─────────────────────────
streamingHandlers = CreateStreamingHandlers().createHandlersWithCleanup(
    fileHandler: { [weak self] count in
        // Each time rsync reports a file, yield the count to the stream
        self?.progressContinuation?.yield(count)
    }
)

// ── Consumer (in a ViewModel or View) ────────────────────────────────────
if let stream = copyFiles.progressStream {
    for await count in stream {
        // 'await' suspends between each value — no busy-waiting
        updateProgressBar(count)
    }
    // Loop exits naturally when continuation.finish() is called
}

// ── Cleanup (inside handleProcessTermination) ─────────────────────────────
progressContinuation?.finish()   // Signals the consumer loop to exit
progressContinuation = nil
progressStream = nil

AsyncStream is ideal here: rsync is a long-running subprocess that emits a count each time it copies a file. The UI wants to see each update as it happens, without polling. When the process finishes, calling .finish() on the continuation terminates the for await loop cleanly.
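The same producer/consumer mechanics in a minimal, self-contained form (the integers stand in for rsync's file counts):

```swift
let (stream, continuation) = AsyncStream.makeStream(of: Int.self)

// Producer: pretend three files are copied, then the process ends.
Task {
    for count in 1...3 {
        continuation.yield(count)
    }
    continuation.finish()   // terminates the consumer's for-await loop
}

// Consumer: suspends between values; no polling, no busy-waiting.
var received: [Int] = []
for await count in stream {
    received.append(count)
}
```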


10 CheckedContinuation — Bridging to the Semaphore World

Swift’s concurrency model doesn’t have a built-in semaphore. Instead, you use withCheckedContinuation (or its throwing variant) to suspend a task and resume it later from a completely different context. ThumbnailLoader uses this to build a rate-limiter — a queue that allows at most 6 concurrent thumbnail loads at once.

// From Actors/ThumbnailLoader.swift

actor ThumbnailLoader {
    static let shared = ThumbnailLoader()

    private let maxConcurrent = 6
    private var activeTasks  = 0
    private var pendingContinuations: [(id: UUID, continuation: CheckedContinuation<Void, Never>)] = []

    private func acquireSlot() async {
        if activeTasks < maxConcurrent {
            activeTasks += 1
            return   // Slot available — proceed immediately
        }

        // No slot available — suspend this task and wait
        let id = UUID()
        await withTaskCancellationHandler {
            await withCheckedContinuation { continuation in
                // We are now suspended. Store the continuation.
                // releaseSlot() will call continuation.resume() when a slot opens.
                pendingContinuations.append((id: id, continuation: continuation))
            }
            activeTasks += 1
        } onCancel: {
            // If the task is cancelled while waiting, remove it from the queue
            Task { await self.removeAndResumePendingContinuation(id: id) }
        }
    }

    private func releaseSlot() {
        activeTasks -= 1
        if let next = pendingContinuations.first {
            pendingContinuations.removeFirst()
            next.continuation.resume()   // Wake up the oldest waiting task
        }
    }

    func thumbnailLoader(file: FileItem) async -> NSImage? {
        await acquireSlot()              // Wait for a free slot
        defer { releaseSlot() }          // Release when done (even on error)

        guard !Task.isCancelled else { return nil }
        // ... load thumbnail ...
    }
}

withCheckedContinuation is Swift’s way to wrap callback-based or semaphore-based APIs into the async/await world. The “Checked” version adds runtime safety: if you forget to call resume() exactly once, the program crashes with a clear error rather than silently deadlocking. withTaskCancellationHandler ensures that if the task is cancelled while waiting for a slot, it cleans up gracefully.


11 The Thumbnail Pipeline — Putting It All Together

The thumbnail system is where all the concurrency patterns converge. Understanding this pipeline shows how each concept connects in practice.

Tiered lookup strategy (RAM → disk → in-flight task → extract)

// From Actors/ScanAndCreateThumbnails.swift (resolveImage, simplified)

private func resolveImage(for url: URL, targetSize: Int) async throws -> CGImage {

    // ── Tier A: RAM (synchronous — no await needed) ───────────────────────
    // SharedMemoryCache.shared.object() is nonisolated — no actor hop.
    if let wrapper = SharedMemoryCache.shared.object(forKey: url as NSURL),
       wrapper.beginContentAccess() {
        defer { wrapper.endContentAccess() }
        return try nsImageToCGImage(wrapper.image)   // Fastest path: ~μs
    }

    // ── Tier B: Disk cache (async — file I/O) ─────────────────────────────
    if let diskImage = await diskCache.load(for: url) {
        storeInMemoryCache(diskImage, for: url)      // Promote to RAM
        return try nsImageToCGImage(diskImage)        // Fast: ~ms
    }

    // ── Tier C: In-flight deduplication ───────────────────────────────────
    // If another caller is already generating this thumbnail, join that task
    // instead of starting a duplicate.
    if let existingTask = inflightTasks[url] {
        let image = try await existingTask.value
        return try nsImageToCGImage(image)
    }

    // ── Tier D: Extract from raw file ─────────────────────────────────────
    let task = Task { () throws -> NSImage in
        // Clear the in-flight entry when done — even if extraction throws —
        // so a failed extraction does not leave a stale task behind.
        defer { inflightTasks[url] = nil }

        let cgImage = try await SonyThumbnailExtractor.extractSonyThumbnail(
            from: url, maxDimension: CGFloat(targetSize), qualityCost: costPerPixel
        )
        let image = try cgImageToNormalizedNSImage(cgImage)
        storeInMemoryCache(image, for: url)

        // Encode to Data inside this actor, then fire off a background save.
        // Capture the actor reference locally so the detached task only
        // captures Sendable values.
        if let jpegData = DiskCacheManager.jpegData(from: cgImage) {
            let dcache = diskCache
            Task.detached(priority: .background) { await dcache.save(jpegData, for: url) }
        }
        return image
    }
    inflightTasks[url] = task     // Register so concurrent callers can join it
    return try nsImageToCGImage(try await task.value)
}

Tier C is an elegant optimization called request coalescing. If the grid view shows 20 thumbnails and 5 of them are for the same URL (perhaps during a layout transition), only one extraction happens — the other 4 join the first task and share its result.


12 CacheDelegate — Tracking Evictions with an Actor

NSCache can evict objects at any time (when memory gets tight). CacheDelegate conforms to NSCacheDelegate so it gets a callback when an eviction happens. The tricky part: this callback arrives on whatever internal thread NSCache is doing the eviction from — not on any Swift actor. The solution is a nested actor that owns the mutable counter.

// From Model/Cache/CacheDelegate.swift

final class CacheDelegate: NSObject, NSCacheDelegate, @unchecked Sendable {
    nonisolated static let shared = CacheDelegate()

    private let evictionCounter = EvictionCounter()

    // Called by NSCache on its own internal thread
    nonisolated func cache(_ cache: NSCache<AnyObject, AnyObject>,
                           willEvictObject obj: Any) {
        if obj is DiscardableThumbnail {
            Task {
                let count = await evictionCounter.increment()
                // log the count...
            }
        }
    }

    func getEvictionCount() async -> Int { await evictionCounter.getCount() }
    func resetEvictionCount() async      { await evictionCounter.reset()    }
}

// A private actor that safely owns the mutable counter
private actor EvictionCounter {
    private var count = 0
    func increment() -> Int { count += 1; return count }
    func getCount()  -> Int { count }
    func reset()             { count = 0 }
}

EvictionCounter is a textbook use of an actor for the simplest possible case: protecting a single integer from concurrent writes. Before actors existed, you would use NSLock or DispatchQueue(label:) for this. The actor is cleaner, safer, and compiler-verified.
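For comparison, a sketch of what the pre-actor equivalent looks like with NSLock (hypothetical code, not from RawCull — it exists only to show the ceremony the actor removes):

```swift
import Foundation

// Pre-Swift-concurrency version: manual locking, easy to get wrong.
final class LockedEvictionCounter: @unchecked Sendable {
    private let lock = NSLock()
    private var count = 0

    func increment() -> Int {
        lock.lock()
        defer { lock.unlock() }   // Forgetting this deadlocks the next caller
        count += 1
        return count
    }

    func getCount() -> Int {
        lock.lock()
        defer { lock.unlock() }
        return count
    }
}
```

Note the `@unchecked Sendable`: the compiler cannot verify the locking discipline, so correctness rests entirely on the programmer. The actor version gets the same guarantee verified by the compiler.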


13 MemoryViewModel — Offloading Heavy Work from @MainActor

MemoryViewModel displays live memory statistics (total RAM, used RAM, app footprint). Getting these stats requires Mach kernel calls (vm_statistics64, task_vm_info) — synchronous system calls that block for a brief moment. If run directly on @MainActor, they would cause UI stutter.

// From Model/ViewModels/MemoryViewModel.swift

func updateMemoryStats() async {
    // Step 1: Move the heavy work OFF the MainActor
    let (total, used, app, threshold) = await Task.detached {
        let total     = ProcessInfo.processInfo.physicalMemory
        let used      = self.getUsedSystemMemory()   // Blocking Mach call
        let app       = self.getAppMemory()           // Blocking Mach call
        let threshold = self.calculateMemoryPressureThreshold(total: total)
        return (total, used, app, threshold)
    }.value

    // Step 2: Update @Observable properties back on MainActor
    await MainActor.run {
        self.totalMemory            = total
        self.usedMemory             = used
        self.appMemory              = app
        self.memoryPressureThreshold = threshold
    }
}

// The Mach calls are nonisolated: they don't touch any actor state
private nonisolated func getUsedSystemMemory() -> UInt64 {
    var stat = vm_statistics64()
    // ... kernel call ...
    return (wired + active + compressed) * pageSize
}

This pattern — Task.detached for blocking work, then MainActor.run to update observable state — is the canonical way to keep the UI thread responsive while doing expensive computation or I/O in a class that must also update the UI.


14 @concurrent and nonisolated

Two related annotations help you escape actor isolation when it is safe to do so, allowing more work to run in parallel.

nonisolated — opt out of the actor’s serial queue

A nonisolated method on an actor can be called without await from outside the actor. The tradeoff: it must not read or write any actor-isolated property. It is safe for pure computation or for accessing nonisolated(unsafe) properties.

// From Actors/ScanFiles.swift

// sortFiles does not touch any actor property — it only works on
// the passed-in 'files' array (a value type, passed by copy).
@concurrent
nonisolated func sortFiles(
    _ files: [FileItem],
    by sortOrder: [some SortComparator<FileItem>],
    searchText: String
) async -> [FileItem] {
    let sorted = files.sorted(using: sortOrder)
    return searchText.isEmpty ? sorted
           : sorted.filter { $0.name.localizedCaseInsensitiveContains(searchText) }
}

@concurrent — run on the thread pool, not the actor queue

@concurrent is a Swift 6.2 annotation that says: “even though this method is on an actor, execute it on the cooperative thread pool, not on the actor’s serial executor.” It is useful for pure CPU work that doesn’t need actor isolation but lives on an actor for organizational reasons.

// From Actors/ActorCreateOutputforView.swift

actor ActorCreateOutputforView {
    // Pure mapping: [String] → [RsyncOutputData]
    @concurrent
    nonisolated func createOutputForView(_ strings: [String]?) async -> [RsyncOutputData] {
        guard let strings else { return [] }
        return strings.map { RsyncOutputData(record: $0) }
    }
}

15 Sendable — The Type-Safety Rule

A Sendable type can safely cross actor/task boundaries. Swift enforces this at compile time in Swift 6: if you try to send a non-Sendable value to a different isolation domain, the compiler rejects it. The most common pattern in RawCull is the CGImage-to-Data conversion before any boundary crossing.

// CGImage is NOT Sendable (it wraps a mutable Core Foundation object)

// ❌ WRONG — Swift 6 compiler error
Task.detached {
    await diskCache.save(cgImage, for: url)  // Error: CGImage is not Sendable
}

// ✅ CORRECT — Encode to Data first, then cross the boundary
// Data is a struct (value type) — it IS Sendable.
if let jpegData = DiskCacheManager.jpegData(from: cgImage) {
    let dcache = diskCache  // Actor references are Sendable
    Task.detached(priority: .background) {
        await dcache.save(jpegData, for: url)  // Data ✓, actor ref ✓
    }
}

Value types (structs, enums) with only Sendable stored properties are automatically Sendable. SavedSettings, FileItem, ExifMetadata, CopyDataResult — all structs in RawCull — are Sendable for this reason. Actor references are also Sendable (the actor itself serializes access). Class instances are generally not Sendable unless annotated.


16 Bridging GCD and Swift Concurrency — Preventing Thread Pool Starvation

Both JPGSonyARWExtractor and SonyThumbnailExtractor are caseless enums — pure namespaces with no instance state — that perform CPU-intensive ImageIO work. They use a pattern that looks surprising at first: they explicitly dispatch to DispatchQueue.global inside a withCheckedContinuation. Understanding why reveals an important pitfall of Swift’s cooperative thread pool.

The problem: thread pool starvation

Swift’s cooperative thread pool has a limited number of threads — typically one per CPU core. When an async function calls a synchronous, blocking API (like CGImageSourceCreateWithURL or CGImageSourceCreateThumbnailAtIndex), that call does not suspend — it blocks the thread it is running on. If many tasks do this simultaneously, every thread in the pool can become occupied with blocked I/O, leaving no threads free to run other await continuations. The app effectively freezes. This is called thread pool starvation.

The fix is to deliberately hop off the cooperative thread pool and onto a GCD global queue — which has its own, much larger pool of threads — for the duration of the blocking call. When the GCD block finishes, it calls continuation.resume(), which re-queues the Swift task on the cooperative pool for the lightweight work that follows.

JPGSonyARWExtractor — withCheckedContinuation + GCD

// From Enum/JPGSonyARWExtractor.swift

// @preconcurrency suppresses Sendable errors for AppKit types (like NSImage)
// that predate Swift concurrency and are not formally Sendable.
@preconcurrency import AppKit

enum JPGSonyARWExtractor {
    static func jpgSonyARWExtractor(
        from arwURL: URL,
        fullSize: Bool = false,
    ) async -> CGImage? {

        return await withCheckedContinuation { continuation in
            // Dispatch to GCD to prevent Thread Pool Starvation.
            // CGImageSourceCreateWithURL and friends are synchronous and can
            // block for tens of milliseconds on a large ARW file.
            // Running them directly on the cooperative pool ties up a thread.
            DispatchQueue.global(qos: .utility).async {

                guard let imageSource = CGImageSourceCreateWithURL(arwURL as CFURL, nil) else {
                    continuation.resume(returning: nil)
                    return
                }

                // Scan all sub-images in the ARW container and find the largest JPEG preview
                let imageCount = CGImageSourceGetCount(imageSource)
                var targetIndex = -1
                var targetWidth  = 0

                for index in 0 ..< imageCount {
                    guard let props = CGImageSourceCopyPropertiesAtIndex(imageSource, index, nil)
                            as? [CFString: Any] else { continue }

                    let hasJFIF     = (props[kCGImagePropertyJFIFDictionary] as? [CFString: Any]) != nil
                    let tiffDict    = props[kCGImagePropertyTIFFDictionary] as? [CFString: Any]
                    let compression = tiffDict?[kCGImagePropertyTIFFCompression] as? Int
                    let isJPEG      = hasJFIF || (compression == 6)  // TIFF compression 6 = JPEG

                    if let width = getWidth(from: props), isJPEG, width > targetWidth {
                        targetWidth = width
                        targetIndex = index
                    }
                }

                guard targetIndex != -1 else {
                    continuation.resume(returning: nil)
                    return
                }

                // Downsample in-place with ImageIO if the preview is larger than needed
                let maxSize = CGFloat(fullSize ? 8640 : 4320)
                let result: CGImage?

                if CGFloat(targetWidth) > maxSize {
                    let options: [CFString: Any] = [
                        kCGImageSourceCreateThumbnailFromImageAlways: true,
                        kCGImageSourceCreateThumbnailWithTransform:   true,
                        kCGImageSourceThumbnailMaxPixelSize:           Int(maxSize),
                    ]
                    result = CGImageSourceCreateThumbnailAtIndex(imageSource, targetIndex,
                                                                 options as CFDictionary)
                } else {
                    let options: [CFString: Any] = [
                        kCGImageSourceShouldCache:            true,
                        kCGImageSourceShouldCacheImmediately: true,
                    ]
                    result = CGImageSourceCreateImageAtIndex(imageSource, targetIndex,
                                                            options as CFDictionary)
                }

                // Hand the result back to the Swift async world
                continuation.resume(returning: result)
            }
        }
    }
}

The withCheckedContinuation call suspends the Swift task and stores its continuation. The GCD block then runs on a GCD worker thread — entirely outside the cooperative pool. When it calls continuation.resume(returning:), Swift schedules the task to resume, but only the lightweight resumption, not the expensive ImageIO work that has already completed on GCD.

SonyThumbnailExtractor — withCheckedThrowingContinuation + GCD

SonyThumbnailExtractor follows the same pattern but uses the throwing variant because the ImageIO operations can fail. The comment in the source file spells out a second important motivation beyond starvation:

// From Enum/SonyThumbnailExtractor.swift

enum SonyThumbnailExtractor {
    static func extractSonyThumbnail(
        from url: URL,
        maxDimension: CGFloat,
        qualityCost: Int = 4,
    ) async throws -> CGImage {

        // We MUST explicitly hop off the current thread.
        // Since we are an enum and static, we have no isolation of our own.
        // If we don't do this, we run on the caller's thread (the Actor),
        // causing serialization — only one extraction at a time.
        try await withCheckedThrowingContinuation { continuation in
            DispatchQueue.global(qos: .userInitiated).async {
                do {
                    let image = try Self.extractSync(from: url,
                                                    maxDimension: maxDimension,
                                                    qualityCost: qualityCost)
                    continuation.resume(returning: image)
                } catch {
                    continuation.resume(throwing: error)  // Propagates to the call site
                }
            }
        }
    }

    // All heavy ImageIO work lives in a private synchronous function,
    // only ever called from the GCD block above
    private nonisolated static func extractSync(
        from url: URL,
        maxDimension: CGFloat,
        qualityCost: Int,
    ) throws -> CGImage {
        let sourceOptions = [kCGImageSourceShouldCache: false] as CFDictionary
        guard let source = CGImageSourceCreateWithURL(url as CFURL, sourceOptions)
        else { throw ThumbnailError.invalidSource }

        let thumbOptions: [CFString: Any] = [
            kCGImageSourceCreateThumbnailFromImageAlways: true,
            kCGImageSourceCreateThumbnailWithTransform:   true,
            kCGImageSourceThumbnailMaxPixelSize:           maxDimension,
            kCGImageSourceShouldCacheImmediately:          true,
        ]
        guard let raw = CGImageSourceCreateThumbnailAtIndex(source, 0,
                                                            thumbOptions as CFDictionary)
        else { throw ThumbnailError.generationFailed }

        return try rerender(raw, qualityCost: qualityCost)
    }

    // Re-renders into an sRGB CGContext to normalise colour space and apply
    // the chosen interpolation quality
    private nonisolated static func rerender(_ image: CGImage, qualityCost: Int) throws -> CGImage {
        let quality: CGInterpolationQuality = switch qualityCost {
            case 1...2: .low
            case 3...4: .medium
            default:    .high
        }
        guard let colorSpace = CGColorSpace(name: CGColorSpace.sRGB)
        else { throw ThumbnailError.contextCreationFailed }

        let bitmapInfo = CGBitmapInfo(rawValue: CGImageAlphaInfo.premultipliedLast.rawValue)
        guard let ctx = CGContext(data: nil, width: image.width, height: image.height,
                                  bitsPerComponent: 8, bytesPerRow: 0,
                                  space: colorSpace, bitmapInfo: bitmapInfo.rawValue)
        else { throw ThumbnailError.contextCreationFailed }

        ctx.interpolationQuality = quality
        ctx.draw(image, in: CGRect(x: 0, y: 0, width: image.width, height: image.height))
        guard let result = ctx.makeImage() else { throw ThumbnailError.generationFailed }
        return result
    }
}

The comment makes an important second point: “If we don’t do this, we run on the caller’s thread (the Actor), causing serialization.” Even without starvation, running the blocking work directly on the calling actor would mean the actor can only process one extraction at a time — because actors are serial. By dispatching to GCD immediately, the actor is freed to start the next request while GCD runs many extractions concurrently on its own thread pool.

Why enums, not classes or actors?

Using a caseless enum signals that the type is a pure namespace — it has no instance state and cannot be instantiated. This means there is no actor isolation to reason about, every method is inherently static, and self does not exist. The Swift compiler never has to consider whether the type crosses an isolation boundary. It is the right choice for stateless utility code that performs only I/O and pure computation.
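A minimal sketch of the caseless-enum namespace pattern (hypothetical example, not from RawCull):

```swift
// A caseless enum is a pure namespace: no cases means no instances,
// so there is no state and no isolation to reason about.
enum MathUtils {
    static func clamp(_ x: Double, _ lo: Double, _ hi: Double) -> Double {
        min(max(x, lo), hi)
    }
}

// let m = MathUtils()   // Compile-time error: the type cannot be instantiated
```

Every member must be static, and `self` never exists — which is exactly the guarantee you want for stateless utility code.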

@preconcurrency import — suppressing legacy Sendable warnings

JPGSonyARWExtractor annotates its AppKit import with @preconcurrency:

@preconcurrency import AppKit

AppKit was written before Swift concurrency existed, so many of its types are not formally declared Sendable. In Swift 6, using them across concurrency boundaries would normally produce hard errors. @preconcurrency import tells the compiler to treat missing Sendable conformances from that module as warnings rather than errors — the sanctioned way to integrate legacy frameworks without turning off strict concurrency checking globally.

QoS choices — utility vs userInitiated

The two enums deliberately pick different GCD quality-of-service levels:

  • JPGSonyARWExtractor uses .utility — extracting full-resolution previews for JPG export is a background batch job that can yield to foreground work without affecting perceived responsiveness.
  • SonyThumbnailExtractor uses .userInitiated — thumbnail extraction is driven directly by the user scrolling the grid, so results need to appear quickly to keep the UI feeling snappy.

This mirrors the task priority system used within Swift concurrency itself (Task(priority: .background) vs .userInitiated), applied at the GCD layer where the blocking work actually lives.


17 Quick Reference

Keyword / Pattern | What it does | Where in RawCull
async / await | Suspend without blocking; resume when ready | Everywhere — all I/O functions
actor | Reference type with automatic mutual exclusion | ScanFiles, DiskCacheManager, ThumbnailLoader, …
@MainActor | Restrict execution to the main thread | RawCullViewModel, ExecuteCopyFiles
@Observable + @MainActor | SwiftUI-observable classes on the main thread | RawCullViewModel, SettingsViewModel
withTaskGroup | Fan out many tasks in parallel, collect results | ScanFiles.scanFiles, ScanAndCreateThumbnails.preloadCatalog
Task { } | Fire-and-forget; inherits the current actor | UI callbacks, rating updates, abort()
Task.detached { } | Fully independent background task | Disk-cache saves, MemoryViewModel stats
Task.isCancelled | Cooperative cancellation check | processSingleFile — multiple guard points
task.cancel() | Request cooperative cancellation | RawCullViewModel.abort()
AsyncStream | Push-based sequence of values over time | ExecuteCopyFiles progress stream
CheckedContinuation (rate-limiter) | Suspend a task; resume it from another context | ThumbnailLoader.acquireSlot()
withCheckedContinuation + DispatchQueue.global | Escape the cooperative pool; prevent thread pool starvation | JPGSonyARWExtractor, SonyThumbnailExtractor
withCheckedThrowingContinuation | Throwing variant of continuation bridging | SonyThumbnailExtractor.extractSonyThumbnail
@preconcurrency import | Suppress Sendable errors for pre-concurrency frameworks | JPGSonyARWExtractor (AppKit)
nonisolated | Escape actor isolation for pure functions | ScanFiles.sortFiles, SettingsViewModel.asyncgetsettings
@concurrent | Run on thread pool, not actor queue | ScanFiles.sortFiles, ActorCreateOutputforView
nonisolated(unsafe) | Bypass isolation for externally thread-safe objects | SharedMemoryCache.memoryCache (NSCache)
MainActor.run { } | Hop to main thread for a block, then return | SettingsViewModel.asyncgetsettings, MemoryViewModel.updateMemoryStats
Sendable | Types safe to cross actor/task boundaries | SavedSettings, FileItem, Data (CGImage → Data encoding)

RawCull — a macOS app by Thomas Evensen · Swift 6 strict concurrency · Apple Silicon · macOS 26 Tahoe

Sharpness Scoring

Updated Sharpness Scoring in RawCull

This document describes how RawCull computes sharpness scores, what each parameter does, and a boundary-value test procedure for validating parameter behaviour. It applies to version 1.3.7 of RawCull.


How the Scoring Pipeline Works

flowchart TD
    A([ARW File]) --> B[Thumbnail decode\nthumbnailMaxPixelSize]
    A --> C[Saliency detection\nApple Vision]

    B --> D[Gaussian pre-blur\npreBlurRadius × isoFactor × resFactor]
    D --> E[Laplacian Metal kernel\nfocusLaplacian]
    E --> F[Amplify\nenergyMultiplier]
    F --> G[Border suppression\nborderInsetFraction]
    G --> H[Render to Float32 buffer]

    H --> I[Full-frame samples\nborder-inset region]
    C -->|bounding box ≥ 3% area| J[Salient-region samples\nVision bounding box]
    H --> J

    I --> K[p95 winsorized tail score\nfull-frame  f]
    J --> L[p95 winsorized tail score\nsubject  s]

    K --> M{Fusion}
    L --> M
    C -->|area| M

    M --> N[finalScore]
    N --> O([Badge: score / maxScore × 100])

    style A fill:#2d2d2d,color:#fff
    style O fill:#2d2d2d,color:#fff

Each ARW file goes through the following stages in detail:

1. Thumbnail decode

ImageIO extracts the embedded JPEG thumbnail at the requested pixel size (thumbnailMaxPixelSize). If no embedded thumbnail is available, the full RAW is decoded instead. The thumbnail is the only input to the scoring pipeline — the full 61 MP RAW pixel data is never read for scoring.

2. Saliency detection

Apple Vision (VNGenerateAttentionBasedSaliencyImageRequest) analyses the thumbnail and returns the bounding boxes of visually salient objects. The boxes are unioned into a single region. If the union area is less than 3% of the frame, or if Vision finds nothing, saliency is discarded and only the full-frame path is used. This step runs in parallel with the image decode.

3. Gaussian pre-blur

A Gaussian blur is applied before the Laplacian. This is critical: without pre-blur, noise pixels generate false Laplacian responses and inflate scores on out-of-focus images. The effective blur radius is:

effectiveRadius = preBlurRadius × isoFactor × resFactor

isoFactor = clamp(sqrt(ISO / 400),             1.0, 3.0)
resFactor = clamp(sqrt(max(imageWidth, 512) / 512), 1.0, 3.0)

At ISO 400 both factors are 1.0. At ISO 6400, sqrt(6400 / 400) = 4.0, which the clamp caps at 3.0 — so the blur automatically increases to suppress noise before the edge detector fires.
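The scaling rule can be expressed as a small Swift function (a reconstruction of the formulas above, not RawCull’s actual implementation):

```swift
import Foundation

// Effective pre-blur radius, scaled up at higher ISO and larger images
// so noise is suppressed before the Laplacian fires.
func effectiveBlurRadius(preBlurRadius: Double, iso: Double, imageWidth: Double) -> Double {
    func clamp(_ x: Double, _ lo: Double, _ hi: Double) -> Double { min(max(x, lo), hi) }
    let isoFactor = clamp((iso / 400).squareRoot(), 1.0, 3.0)
    let resFactor = clamp((max(imageWidth, 512) / 512).squareRoot(), 1.0, 3.0)
    return preBlurRadius * isoFactor * resFactor
}
```

At ISO 400 and 512 px width both factors are exactly 1.0, so the effective radius equals preBlurRadius; at ISO 6400 the isoFactor hits the 3.0 cap.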

4. Laplacian (Metal kernel)

A Laplacian-of-Gaussian edge magnitude is computed via the focusLaplacian Metal kernel. This measures the second derivative of luminance — the response is high where focus is sharp and near zero where the image is blurred. The output is amplified by energyMultiplier.

5. Border suppression

The outer borderInsetFraction of pixels on each edge is zeroed out (replaced with black) before any scoring or thresholding. This prevents the Gaussian pre-blur from generating an artificial sharp step at the image boundary that would otherwise appear as a bright rectangle in the focus mask and inflate scores.

6. Pixel sampling

The amplified Laplacian is rendered to a Float32 buffer. Two sample sets are collected:

  • Full-frame samples: all pixels inside the border-inset region.
  • Salient-region samples: pixels within the Vision bounding box, if available and containing at least 256 pixels.

7. Robust tail score (p95 winsorized mean)

Each sample set is scored independently:

  1. Find the 95th-percentile value (p95) — the threshold above which only the sharpest pixels sit.
  2. Find the 99.5th-percentile value (p99.5) — used as an upper clip to reduce the influence of extreme outliers.
  3. Return the mean of all values ≥ p95, clipped at p99.5.

This produces a score that reflects the sharpest 5% of pixels rather than average image sharpness, which is a better predictor of perceived in-focus quality.
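A simplified Swift sketch of the tail score on a plain array (RawCull computes it over the Float32 Laplacian buffer; nearest-rank percentiles are used here for brevity, and the exact interpolation in the app may differ):

```swift
import Foundation

// Mean of the top-5% tail, with values clipped at the 99.5th percentile
// so a handful of extreme outliers cannot dominate the score.
func winsorizedTailScore(_ samples: [Float]) -> Float {
    guard !samples.isEmpty else { return 0 }
    let sorted = samples.sorted()
    func percentile(_ p: Double) -> Float {
        let rank = Int((p * Double(sorted.count - 1)).rounded())
        return sorted[rank]
    }
    let p95  = percentile(0.95)    // Threshold for the sharpest pixels
    let clip = percentile(0.995)   // Upper clip against outliers
    let tail = sorted.filter { $0 >= p95 }.map { min($0, clip) }
    return tail.reduce(0, +) / Float(tail.count)
}
```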

8. Score fusion

flowchart TD
    A{Saliency\ndetected?} -->|Yes — area ≥ 3%| B[Compute salient score s]
    A -->|No| C[Full-frame score f only]

    B --> D{Both f and s\nproduced?}
    D -->|Yes| E["blended = f × (1 − salientWeight) + s × salientWeight
sizeFactor = 1 + salientArea × subjectSizeFactor
finalScore = blended × sizeFactor"]
    D -->|s only| F[finalScore = s]
    C --> G[finalScore = f]

    E --> H([Return finalScore])
    F --> H
    G --> H
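The fusion rules in the diagram can be sketched as follows (a reconstruction from the formulas shown, not the shipped code; the rare s-only branch is collapsed into the general case):

```swift
// Fuse the full-frame score f with an optional salient-region score s.
func fuseScores(fullFrame f: Double,
                salient s: Double?,
                salientArea: Double,
                salientWeight: Double,
                subjectSizeFactor: Double) -> Double {
    guard let s else { return f }                         // No subject detected
    let blended = f * (1 - salientWeight) + s * salientWeight
    let sizeFactor = 1 + salientArea * subjectSizeFactor  // Bonus for large subjects
    return blended * sizeFactor
}
```

With the defaults (salientWeight 0.75, subjectSizeFactor 1.5) a subject covering 20% of the frame gets a ×1.30 size multiplier, matching the parameter table below.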

9. Badge display

The badge shown on each thumbnail is score / maxScore × 100, normalised to the highest score in the current catalog. All comparisons are therefore relative within a session, not absolute.
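A sketch of that normalisation (a hypothetical helper mirroring the formula above):

```swift
// Badge percentage: this image's score relative to the best in the catalog.
func badgeValue(score: Double, maxScore: Double) -> Int {
    guard maxScore > 0 else { return 0 }
    return Int((score / maxScore * 100).rounded())
}
```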


Parameters

Where each parameter applies

flowchart LR
    subgraph Score ["Scoring pipeline"]
        P1[thumbnailMaxPixelSize]
        P2[borderInsetFraction]
        P3[salientWeight]
        P4[subjectSizeFactor]
        P5[preBlurRadius]
        P6[energyMultiplier]
    end

    subgraph Mask ["Focus mask overlay"]
        P2
        P5
        P6
        P7[threshold]
        P8[erosionRadius]
        P9[dilationRadius]
        P10[featherRadius]
    end

borderInsetFraction, preBlurRadius, and energyMultiplier affect both scoring and the focus mask overlay. Changes made in the zoom view’s Focus Mask Controls panel are immediately reflected in the next score run.


Scoring Resolution — thumbnailMaxPixelSize

Setting | Value
Fast | 512 px
Medium | 768 px
Accurate | 1024 px
Default | 512 px

The pixel size of the thumbnail decoded for scoring. A larger thumbnail contains finer spatial detail, which improves score accuracy — especially at high ISO where noise patterns can obscure real edges at 512 px. Each step roughly doubles decode and pipeline time.

Applies to: scoring only (the focus mask overlay always uses the full displayed image).


Border Inset — borderInsetFraction

Bound | Value | Effect
Min | 0% | No border exclusion; Gaussian edge artefacts visible in mask and inflate scores
Default | 4% | Removes the typical blur-boundary band
Max | 10% | Removes a wide margin; subjects near the frame edge may be partially excluded

The fraction of the image dimension excluded from each of the four edges. Prevents the Gaussian pre-blur from creating an artificial bright rectangle at the image boundary in both the focus mask and the score.

Applies to: scoring and focus mask overlay.


Subject Weight — salientWeight

Bound | Value | Effect
Min | 0.0 | Entire score is the full-frame tail score; subject sharpness is ignored
Default | 0.75 | Subject region contributes 75%, full frame 25%
Max | 1.0 | Entire score is the salient-region score; if saliency fails, falls back to full-frame

Controls how much the Vision-detected subject region drives the score versus the full frame. At 0.0, background texture (grass, foliage) dominates and scores cluster regardless of subject focus. At 1.0, background is entirely ignored.

Applies to: scoring only.


Subject Size Bonus — subjectSizeFactor

Bound | Value | Effect
Min | 0.0 | No size bonus; equal focus quality scores identically regardless of subject distance
Default | 1.5 | Subject at 20% frame area → ×1.30 multiplier; 5% frame area → ×1.075
Max | 3.0 | Subject at 20% frame area → ×1.60 multiplier; strong preference for close subjects

Multiplies the fused score by 1 + salientArea × subjectSizeFactor. Gives a proportional bonus to images where the subject fills more of the frame — closer subjects score higher than equally-sharp distant ones.

Applies to: scoring only. Has no effect if saliency detection finds nothing (salientArea = 0).


Pre-blur Radius — preBlurRadius

Bound | Value | Effect
Min | 0.3 | Almost no blur; Laplacian fires on noise pixels as well as real edges
Default | 1.92 | Balanced noise suppression at ISO 400
Max | 4.0 | Heavy blur; only very strong edges survive

Base Gaussian blur radius applied before the Laplacian, automatically scaled upward at higher ISO and larger images. This is the single most influential parameter for whether noise or real sharpness drives the score.

Accessible via: Focus Mask Controls (zoom view) and Scoring Parameters sheet. Applies to: scoring and focus mask overlay.


Amplify — energyMultiplier

Bound | Value | Effect
Min | 1.0 | No amplification; raw Laplacian values are very small
Default | 7.62 | Set automatically by calibration
Max | 20.0 | Strong amplification; subtle sharpness differences become large score gaps

Multiplies the Laplacian output before scoring and thresholding. The calibration step (Re-score) automatically adjusts this so the p95 of the burst lands at 0.50 — manual adjustment is rarely needed.

Accessible via: Focus Mask Controls (zoom view). Applies to: scoring and focus mask overlay.
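The calibration idea can be sketched as a proportional rescale (an assumption for illustration — the document does not show Re-score’s actual logic):

```swift
// ASSUMPTION: Re-score rescales the multiplier so the measured burst p95
// lands on the 0.50 target, clamped to the parameter's UI bounds.
func calibratedEnergyMultiplier(current: Double,
                                measuredP95: Double,
                                target: Double = 0.50) -> Double {
    guard measuredP95 > 0 else { return current }
    return min(max(current * (target / measuredP95), 1.0), 20.0)
}
```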


Threshold, Erosion, Dilation, Feather Radius

These four parameters affect only the focus mask visual overlay in the zoom view. They have no effect on sharpness scores.

Accessible via: Focus Mask Controls (zoom view).


Test Procedure

The goal is to confirm that each parameter produces the expected behaviour at its boundary values. Use a burst of 20–30 frames containing a clear subject (bird, animal) at varying distances and focus accuracy.

Before each test:

  1. Open the catalog, press Re-score to establish a calibrated baseline.
  2. Note which images score highest and lowest and what the score spread is.
  3. Adjust one parameter at a time, press Re-score after each change, and compare against the baseline.

Test 1 — Scoring Resolution

Step | Setting | Expected result
1a | 512 px | Baseline. At ISO 6400+, some noise-driven variation expected
1b | 1024 px | Scores more stable; high-ISO images show less noise influence. Scoring takes ~3–4× longer
1c | 512 px | Scores return to baseline behaviour

Pass criteria: 1024 px produces smaller score gaps between frames that differ only in noise, and larger score gaps between frames that differ in actual focus quality.


Test 2 — Border Inset

Step | Setting | Expected result
2a | 0% | Focus mask shows a bright rectangle around the full image border
2b | 4% | Border rectangle is absent from mask; scores slightly lower for images where background edges reached the frame edge
2c | 10% | Border completely clean; verify subjects near the frame edge are not partially excluded by inspecting the focus mask

Pass criteria: At 0% the mask shows an obvious border artefact. At 4% and 10% it is absent. Subject detail in the centre of the frame is unaffected at all three values.


Test 3 — Subject Weight

Step | Setting | Expected result
3a | 0.0 | Scores driven by full-frame background texture; frames with sharp foliage score high even if the subject is soft; score spread is narrow
3b | 0.75 | Subject sharpness dominates; clearly in-focus frames rank above soft frames
3c | 1.0 | Score is entirely the salient-region score; any frame with no detected subject must fall back gracefully to a non-zero score

Pass criteria: At 0.0 scores cluster in a narrow band. At 0.75 and 1.0 there is clear separation between in-focus and soft frames. At 1.0 frames with no Vision detection produce a non-zero score (full-frame fallback).


Test 4 — Subject Size Bonus

Step | Setting | Expected result
4a | 0.0 | Close and distant frames with equivalent focus quality score similarly
4b | 1.5 | The closest frame (largest subject) scores noticeably higher than the most distant frame even at similar focus quality
4c | 3.0 | Size bonus is strong; rank order correlates with subject size; verify a perfectly sharp distant frame does not rank above a clearly soft close frame

Pass criteria: At 0.0 rank order is determined by focus quality alone. At 1.5 and 3.0 close subjects score higher. At 3.0 the bonus should not cause obviously soft large-subject frames to outrank sharp small-subject frames.


Test 5 — Pre-blur Radius

| Step | Setting | Expected result |
| --- | --- | --- |
| 5a | 0.3 | At high ISO, noise-dominated images score artificially high; focus mask shows noise grain highlighted as “sharp” |
| 5b | 1.92 | Balanced; noise suppressed, real edges preserved |
| 5c | 4.0 | Only very strong contrast edges highlighted; soft-but-textured backgrounds show little or no mask; score spread may narrow |

Pass criteria: At 0.3 with high-ISO files the focus mask shows widespread highlighting that does not correspond to genuine sharp edges. At 1.92 and above it is clean. At 4.0 the sharpest in-focus frame still scores highest.


Test 6 — Amplify (Energy Multiplier)

| Step | Setting | Expected result |
| --- | --- | --- |
| 6a | 1.0 | Raw Laplacian values are tiny; most badges show low scores; little spread between frames |
| 6b | 7.62 (calibrated) | Good spread; badge scores use the full 0–100 range relative to the sharpest frame |
| 6c | 20.0 | Signal saturates; many frames score similarly near the top; focus mask nearly fully lit on any textured region |

Pass criteria: At 1.0 all scores are very low and compressed. At 20.0 scores are saturated and compressed at the top. At the calibrated default there is clear separation. Note: Re-score resets this to the calibrated value automatically.


Combined Boundary Test

To confirm parameters do not interact unexpectedly at extremes:

flowchart LR
    A([Start]) --> B[Set ALL parameters\nto minimum values]
    B --> C[Re-score]
    C --> D{App stable?\nScores non-zero?\nMask renders?}
    D -->|Fail| E([Log failure])
    D -->|Pass| F[Set ALL parameters\nto maximum values]
    F --> G[Re-score]
    G --> H{App stable?\nScores non-zero?\nMask renders?}
    H -->|Fail| E
    H -->|Pass| I[Press Reset in\nScoring Parameters sheet]
    I --> J[Re-score]
    J --> K{Scores match\noriginal baseline?}
    K -->|Fail| E
    K -->|Pass| L([Pass])

Minimum values: thumbnailMaxPixelSize=512, borderInsetFraction=0%, salientWeight=0.0, subjectSizeFactor=0.0, preBlurRadius=0.3, energyMultiplier=1.0

Maximum values: thumbnailMaxPixelSize=1024, borderInsetFraction=10%, salientWeight=1.0, subjectSizeFactor=3.0, preBlurRadius=4.0, energyMultiplier=20.0

Threads

Concurrency Monitoring — ScanFiles

Analysis

The cooperative thread pool

Swift’s runtime manages a cooperative pool of threads capped at roughly the number of CPU cores. Apple Silicon M-series chips typically have 8–12 CPU cores, and the output shows a peak of ~10–11 concurrent tasks, which matches this ceiling. This is intentional — the runtime will not spin up 21 threads for 21 tasks, because context-switching that many threads would cost more than the extra parallelism would gain.

Why tasks start in a burst then level off

The first 11 tasks start almost simultaneously (active: 1 through active: 11 with no done lines in between). After that the log shows an interleaved pattern of start/done — a slot only opens for a new task when one finishes. This is the scheduler working correctly: it never idles a core while work is queued.

EXIF vs MakerNote — same shape, different reason

Both groups plateau at ~10 active tasks. For EXIF (CGImageSourceCreateWithURL) the bottleneck is reading up to 4 MB of RAW data from disk per file. For MakerNote (SonyMakerNoteParser) it is parsing binary data in memory. They look similar in the log because both are fast enough that the thread pool stays saturated throughout — no task finishes so quickly that it leaves a core idle for long.

What “active” actually measures

The counter tracks tasks that have started but not yet returned. Because these tasks are not async internally — there is no await inside extractExifData or focusLocation — they never suspend. They run to completion on one thread without yielding. This means “active” here equals “threads actually burning CPU right now”, not just “tasks in flight”. That is a heavier workload than async I/O tasks would be, and it is why the pool caps out quickly.

The _DSC8303.ARW anomaly on MakerNote

It shows active: 7 on its done line instead of a clean countdown. This happens because decrement() and another task’s increment() race at the exact same moment — the lock serialises them, but the order is non-deterministic. This is normal behaviour and confirms the lock is doing its job correctly.

What would look different if something were wrong

  • If active never exceeded 1–2 it would mean tasks were serialising somewhere — likely an actor hop or a lock held for too long inside the work itself.
  • If active climbed to 21 and stayed there it would mean tasks were suspending (waiting on async I/O) rather than running — not the case here.
  • If active went negative the counter would be broken — it does not, confirming the nonisolated(unsafe) + NSLock pattern is sound.
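The counter behind these logs can be sketched as follows — a minimal illustration of the NSLock-protected counter pattern described above; the class name is illustrative, not the app's actual type:

```swift
import Foundation

// Minimal sketch of the lock-guarded counter pattern.
// increment()/decrement() return the post-update value for logging.
final class ActiveCounter: @unchecked Sendable {
    private let lock = NSLock()
    private var value = 0

    @discardableResult
    func increment() -> Int {
        lock.lock(); defer { lock.unlock() }
        value += 1
        return value
    }

    @discardableResult
    func decrement() -> Int {
        lock.lock(); defer { lock.unlock() }
        value -= 1
        return value
    }
}
```

Each individual update is atomic, but the interleaving between tasks is not — which is exactly why a done line can print a value that reflects starts racing in between, as in the _DSC8303.ARW case.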

sortFiles

Confirmed running off the main thread (number = 2). The @concurrent nonisolated annotation is working as intended — sort work never blocks the main actor.


Summary

21 files, ~10x parallelism, zero idle time between task groups. The scan is as fast as the hardware allows for synchronous CPU and disk work. If memory pressure ever becomes a concern with larger libraries, concurrency could be capped by replacing withTaskGroup with a fixed-width sliding window pattern — but for this workload the current profile is optimal.


Console Output

[ScanFiles] scanFiles: 21 ARW files — starting EXIF task group
[ScanFiles] EXIF start (_DSC8583.ARW) — active: 2/21
[ScanFiles] EXIF start (_DSC8387.ARW) — active: 3/21
[ScanFiles] EXIF start (_DSC8390.ARW) — active: 1/21
[ScanFiles] EXIF start (_DSC8318.ARW) — active: 6/21
[ScanFiles] EXIF start (_DSC8690.ARW) — active: 4/21
[ScanFiles] EXIF start (_DSC8641.ARW) — active: 7/21
[ScanFiles] EXIF start (_DSC8440.ARW) — active: 8/21
[ScanFiles] EXIF start (_DSC8634.ARW) — active: 5/21
[ScanFiles] EXIF start (_DSC8303.ARW) — active: 9/21
[ScanFiles] EXIF start (_DSC8673.ARW) — active: 10/21
[ScanFiles] EXIF start (_DSC8470.ARW) — active: 11/21
[ScanFiles] EXIF done  (_DSC8387.ARW) — active: 10
[ScanFiles] EXIF done  (_DSC8641.ARW) — active: 10
[ScanFiles] EXIF done  (_DSC8690.ARW) — active: 9
[ScanFiles] EXIF start (_DSC8305.ARW) — active: 11/21
[ScanFiles] EXIF start (_DSC8670.ARW) — active: 10/21
[ScanFiles] EXIF start (_DSC8499.ARW) — active: 11/21
[ScanFiles] EXIF done  (_DSC8318.ARW) — active: 8
[ScanFiles] EXIF done  (_DSC8673.ARW) — active: 9
[ScanFiles] EXIF start (_DSC8304.ARW) — active: 9/21
[ScanFiles] EXIF start (_DSC8500.ARW) — active: 10/21
[ScanFiles] EXIF done  (_DSC8634.ARW) — active: 9
[ScanFiles] EXIF start (_DSC8313.ARW) — active: 10/21
[ScanFiles] EXIF done  (_DSC8470.ARW) — active: 9
[ScanFiles] EXIF start (_DSC8406.ARW) — active: 10/21
[ScanFiles] EXIF done  (_DSC8303.ARW) — active: 9
[ScanFiles] EXIF done  (_DSC8440.ARW) — active: 8
[ScanFiles] EXIF start (_DSC8602.ARW) — active: 9/21
[ScanFiles] EXIF start (_DSC8603.ARW) — active: 10/21
[ScanFiles] EXIF done  (_DSC8390.ARW) — active: 10
[ScanFiles] EXIF start (_DSC8589.ARW) — active: 10/21
[ScanFiles] EXIF done  (_DSC8583.ARW) — active: 9
[ScanFiles] EXIF done  (_DSC8500.ARW) — active: 9
[ScanFiles] EXIF done  (_DSC8406.ARW) — active: 8
[ScanFiles] EXIF done  (_DSC8499.ARW) — active: 7
[ScanFiles] EXIF done  (_DSC8305.ARW) — active: 6
[ScanFiles] EXIF done  (_DSC8304.ARW) — active: 5
[ScanFiles] EXIF done  (_DSC8313.ARW) — active: 4
[ScanFiles] EXIF done  (_DSC8602.ARW) — active: 3
[ScanFiles] EXIF done  (_DSC8603.ARW) — active: 2
[ScanFiles] EXIF done  (_DSC8670.ARW) — active: 1
[ScanFiles] EXIF done  (_DSC8589.ARW) — active: 0
[ScanFiles] scanFiles: EXIF group complete — 21/21 items built
[ScanFiles] extractNativeFocusPoints: 21 files — starting MakerNote task group
[ScanFiles] MakerNote start (_DSC8387.ARW) — active: 1/21
[ScanFiles] MakerNote start (_DSC8641.ARW) — active: 2/21
[ScanFiles] MakerNote start (_DSC8690.ARW) — active: 3/21
[ScanFiles] MakerNote start (_DSC8318.ARW) — active: 4/21
[ScanFiles] MakerNote start (_DSC8673.ARW) — active: 5/21
[ScanFiles] MakerNote start (_DSC8634.ARW) — active: 6/21
[ScanFiles] MakerNote start (_DSC8470.ARW) — active: 7/21
[ScanFiles] MakerNote start (_DSC8303.ARW) — active: 8/21
[ScanFiles] MakerNote start (_DSC8440.ARW) — active: 9/21
[ScanFiles] MakerNote done  (_DSC8634.ARW) — active: 8
[ScanFiles] MakerNote done  (_DSC8318.ARW) — active: 7
[ScanFiles] MakerNote start (_DSC8583.ARW) — active: 9/21
[ScanFiles] MakerNote done  (_DSC8470.ARW) — active: 8
[ScanFiles] MakerNote start (_DSC8500.ARW) — active: 8/21
[ScanFiles] MakerNote start (_DSC8406.ARW) — active: 9/21
[ScanFiles] MakerNote start (_DSC8390.ARW) — active: 8/21
[ScanFiles] MakerNote done  (_DSC8440.ARW) — active: 8
[ScanFiles] MakerNote start (_DSC8305.ARW) — active: 10/21
[ScanFiles] MakerNote done  (_DSC8387.ARW) — active: 8
[ScanFiles] MakerNote start (_DSC8304.ARW) — active: 9/21
[ScanFiles] MakerNote done  (_DSC8690.ARW) — active: 7
[ScanFiles] MakerNote start (_DSC8499.ARW) — active: 9/21
[ScanFiles] MakerNote start (_DSC8313.ARW) — active: 8/21
[ScanFiles] MakerNote done  (_DSC8500.ARW) — active: 9
[ScanFiles] MakerNote start (_DSC8602.ARW) — active: 8/21
[ScanFiles] MakerNote done  (_DSC8641.ARW) — active: 6
[ScanFiles] MakerNote start (_DSC8603.ARW) — active: 7/21
[ScanFiles] MakerNote done  (_DSC8583.ARW) — active: 8
[ScanFiles] MakerNote done  (_DSC8673.ARW) — active: 7
[ScanFiles] MakerNote start (_DSC8670.ARW) — active: 8/21
[ScanFiles] MakerNote start (_DSC8589.ARW) — active: 9/21
[ScanFiles] MakerNote done  (_DSC8602.ARW) — active: 8
[ScanFiles] MakerNote done  (_DSC8305.ARW) — active: 7
[ScanFiles] MakerNote done  (_DSC8390.ARW) — active: 7
[ScanFiles] MakerNote done  (_DSC8406.ARW) — active: 6
[ScanFiles] MakerNote done  (_DSC8304.ARW) — active: 5
[ScanFiles] MakerNote done  (_DSC8313.ARW) — active: 4
[ScanFiles] MakerNote done  (_DSC8499.ARW) — active: 3
[ScanFiles] MakerNote done  (_DSC8303.ARW) — active: 7
[ScanFiles] MakerNote done  (_DSC8603.ARW) — active: 2
[ScanFiles] MakerNote done  (_DSC8670.ARW) — active: 1
[ScanFiles] MakerNote done  (_DSC8589.ARW) — active: 0
[ScanFiles] extractNativeFocusPoints: MakerNote group complete — 21/21 found focus data
Finished scanning! Total files: 21
func sortFiles() NOT on main thread, currently on <NSThread: 0xc98ae8540>{number = 2, name = (null)}

Memory Cache

Cache System — RawCull

RawCull uses a three-layer cache to avoid repeated RAW decoding. Decoding an ARW file on demand is expensive — the three-layer approach ensures that most requests are served from RAM or disk rather than from source.

Layers (fastest to slowest):

  1. Memory cache — NSCache<NSURL, DiscardableThumbnail> in RAM
  2. Disk cache — JPEG files on disk in ~/Library/Caches/no.blogspot.RawCull/Thumbnails/
  3. Source decode — CGImageSourceCreateThumbnailAtIndex from the ARW file

The same cache stack is shared by two paths: the bulk preload flow (ScanAndCreateThumbnails) and on-demand per-file requests (RequestThumbnail).


1. Core Types

DiscardableThumbnail

DiscardableThumbnail is the in-memory cache entry. It wraps an NSImage and implements NSDiscardableContent so NSCache can manage it under memory pressure without immediately evicting objects that are currently in use.

final class DiscardableThumbnail: NSObject, NSDiscardableContent, @unchecked Sendable {
    let image: NSImage
    nonisolated let cost: Int
    private let state = OSAllocatedUnfairLock(
        initialState: (isDiscarded: false, accessCount: 0)
    )
}

Cost calculation happens at initialization from the actual pixel dimensions of all image representations:

cost = (Σ rep.pixelsWide × rep.pixelsHigh × costPerPixel) × 1.1
  • Iterates every NSImageRep in the image
  • Falls back to logical image.size if no representations are present
  • The 1.1 multiplier adds a 10% overhead for wrapper and metadata
  • costPerPixel comes from SettingsViewModel.thumbnailCostPerPixel (default: 4, representing RGBA bytes per pixel)
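As a sketch, that calculation might look like this (the free-function form is for illustration only; DiscardableThumbnail computes it in its initializer):

```swift
import AppKit

// Sketch of the pixel-accurate cost calculation described above.
func thumbnailCost(for image: NSImage, costPerPixel: Int = 4) -> Int {
    var pixels = 0
    for rep in image.representations {
        pixels += rep.pixelsWide * rep.pixelsHigh
    }
    if pixels == 0 {
        // Fall back to the logical size when no representations are present
        pixels = Int(image.size.width * image.size.height)
    }
    // 1.1 multiplier: 10% overhead for wrapper and metadata
    return Int(Double(pixels * costPerPixel) * 1.1)
}
```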

Thread safety uses OSAllocatedUnfairLock on a tuple (isDiscarded: Bool, accessCount: Int) to keep both fields consistent under concurrent access.

NSDiscardableContent protocol:

| Method | Behavior |
| --- | --- |
| beginContentAccess() -> Bool | Acquires lock, increments accessCount, returns false if already discarded |
| endContentAccess() | Acquires lock, decrements accessCount |
| discardContentIfPossible() | Acquires lock, marks isDiscarded = true only if accessCount == 0 |
| isContentDiscarded() -> Bool | Acquires lock, returns isDiscarded |
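Under the locked-tuple state shown earlier, the four methods might be implemented roughly like this (a sketch shown as an extension for brevity, not the app's verbatim code):

```swift
// Sketch: NSDiscardableContent over an OSAllocatedUnfairLock-guarded tuple.
extension DiscardableThumbnail {
    func beginContentAccess() -> Bool {
        state.withLock { s in
            guard !s.isDiscarded else { return false }
            s.accessCount += 1
            return true
        }
    }

    func endContentAccess() {
        state.withLock { $0.accessCount -= 1 }
    }

    func discardContentIfPossible() {
        // Only discard when no caller currently holds access
        state.withLock { s in
            if s.accessCount == 0 { s.isDiscarded = true }
        }
    }

    func isContentDiscarded() -> Bool {
        state.withLock { $0.isDiscarded }
    }
}
```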

The correct access pattern for any caller:

if let wrapper = SharedMemoryCache.shared.object(forKey: url as NSURL),
   wrapper.beginContentAccess() {
    defer { wrapper.endContentAccess() }
    use(wrapper.image)
} else {
    // Cache miss or discarded — fall through to disk or source
}

CacheConfig

CacheConfig is an immutable value type passed to SharedMemoryCache at initialization or after a settings change:

struct CacheConfig {
    nonisolated let totalCostLimit: Int   // bytes
    nonisolated let countLimit: Int
    nonisolated var costPerPixel: Int?

    static let production = CacheConfig(
        totalCostLimit: 500 * 1024 * 1024,  // 500 MB default
        countLimit: 1000
    )

    static let testing = CacheConfig(
        totalCostLimit: 100_000,            // intentionally tiny
        countLimit: 5
    )
}

In production, applyConfig overwrites totalCostLimit from SettingsViewModel.memoryCacheSizeMB. The countLimit applied by calculateConfig (10,000, superseding the production static's 1,000) is intentionally very high — under normal operation totalCostLimit is always the binding constraint.


CacheDelegate

CacheDelegate implements NSCacheDelegate and counts evictions via a dedicated EvictionCounter actor:

final class CacheDelegate: NSObject, NSCacheDelegate, @unchecked Sendable {
    nonisolated static let shared = CacheDelegate()

    // NSCacheDelegate — called synchronously on NSCache's internal queue
    func cache(_ cache: NSCache<AnyObject, AnyObject>, willEvictObject obj: Any) {
        guard obj is DiscardableThumbnail else { return }
        Task { await evictionCounter.increment() }
    }
}

actor EvictionCounter {
    private var count = 0
    func increment() { count += 1 }
    func getCount() -> Int { count }
    func reset() { count = 0 }
}

The delegate does not affect eviction behavior — it only feeds the statistics system.


SharedMemoryCache (actor)

SharedMemoryCache is a global actor singleton that owns the NSCache, memory pressure monitoring, and cache statistics.

actor SharedMemoryCache {
    nonisolated static let shared = SharedMemoryCache()

    // nonisolated(unsafe) allows synchronous access from any context.
    // NSCache itself is thread-safe; this is intentional and documented.
    nonisolated(unsafe) let memoryCache = NSCache<NSURL, DiscardableThumbnail>()
    nonisolated(unsafe) var currentPressureLevel: MemoryPressureLevel = .normal

    private var _costPerPixel: Int = 4
    private var diskCache: DiskCacheManager
    private var memoryPressureSource: DispatchSourceMemoryPressure?
    private var setupTask: Task<Void, Never>?

    // Statistics (actor-isolated)
    private var cacheMemory: Int = 0   // RAM hits
    private var cacheDisk: Int = 0     // Disk hits
}

Synchronous accessors are nonisolated, callable from any context without await:

nonisolated func object(forKey key: NSURL) -> DiscardableThumbnail?
nonisolated func setObject(_ obj: DiscardableThumbnail, forKey key: NSURL, cost: Int)
nonisolated func removeAllObjects()

Initialization is gated by a setupTask so that concurrent callers to ensureReady() share a single initialization pass:

func ensureReady(config: CacheConfig? = nil) async {
    if let existing = setupTask {
        await existing.value
        return
    }
    let task = Task { await self.setCacheCostsFromSavedSettings() }
    setupTask = task
    await task.value
    startMemoryPressureMonitoring()
}

Configuration flow:

ensureReady()
  -> setCacheCostsFromSavedSettings()
      -> SettingsViewModel.shared.asyncgetsettings()
          -> calculateConfig(from:)
              -> applyConfig(_:)

calculateConfig converts settings to a CacheConfig:

  • totalCostLimit = memoryCacheSizeMB × 1024 × 1024
  • countLimit = 10,000 (intentionally very high — memory cost, not item count, is the real constraint)
  • costPerPixel = thumbnailCostPerPixel

applyConfig applies the config to NSCache:

  • memoryCache.totalCostLimit = config.totalCostLimit
  • memoryCache.countLimit = config.countLimit
  • memoryCache.delegate = CacheDelegate.shared

DiskCacheManager (actor)

DiskCacheManager stores JPEG thumbnails on disk and retrieves them on RAM cache misses.

actor DiskCacheManager {
    private let cacheDirectory: URL
    // ~/Library/Caches/no.blogspot.RawCull/Thumbnails/
}

Cache key generation — deterministic MD5 hash of the standardized source path:

func cacheURL(for sourceURL: URL) -> URL {
    let standardized = sourceURL.standardizedFileURL.path
    let hash = MD5(string: standardized)   // hex string
    return cacheDirectory.appendingPathComponent(hash + ".jpg")
}

Load — detached userInitiated priority task:

func load(for sourceURL: URL) async -> NSImage? {
    let url = cacheURL(for: sourceURL)
    return await Task.detached(priority: .userInitiated) {
        guard FileManager.default.fileExists(atPath: url.path) else { return nil }
        return NSImage(contentsOf: url)
    }.value
}

Save — accepts pre-encoded Data (a Sendable type) to cross the actor boundary safely:

func save(_ jpegData: Data, for sourceURL: URL) async {
    let url = cacheURL(for: sourceURL)
    Task.detached(priority: .background) {
        do {
            try jpegData.write(to: url)
        } catch {
            // Log error
        }
    }
}

// Called inside the actor that owns the CGImage, before crossing actor boundaries
static nonisolated func jpegData(from cgImage: CGImage) -> Data? {
    // CGImageDestination → JPEG, quality 0.7
    // (UTType.jpeg requires import UniformTypeIdentifiers)
    let data = NSMutableData()
    guard let dest = CGImageDestinationCreateWithData(
        data as CFMutableData, UTType.jpeg.identifier as CFString, 1, nil
    ) else { return nil }
    let options: [CFString: Any] = [kCGImageDestinationLossyCompressionQuality: 0.7]
    CGImageDestinationAddImage(dest, cgImage, options as CFDictionary)
    return CGImageDestinationFinalize(dest) ? (data as Data) : nil
}

Cache maintenance:

| Method | Behavior |
| --- | --- |
| getDiskCacheSize() async -> Int | Sums totalFileAllocatedSize for all .jpg cache files |
| pruneCache(maxAgeInDays: Int = 30) async | Removes files with a modification date older than the threshold |

Both run in detached utility priority tasks.


2. Memory Pressure Handling

Memory pressure is monitored via DispatchSource.makeMemoryPressureSource:

func startMemoryPressureMonitoring() {
    let source = DispatchSource.makeMemoryPressureSource(
        eventMask: [.normal, .warning, .critical],
        queue: .global(qos: .utility)
    )
    source.setEventHandler { [weak self] in
        Task { await self?.handleMemoryPressureEvent() }
    }
    source.resume()
    memoryPressureSource = source
}

Response by level:

| Level | Action |
| --- | --- |
| .normal | Log, update currentPressureLevel, notify fileHandlers.memorypressurewarning(false) |
| .warning | Reduce totalCostLimit to 60% of the current limit, notify fileHandlers.memorypressurewarning(true) |
| .critical | removeAllObjects(), set totalCostLimit to 50 MB, notify fileHandlers.memorypressurewarning(true) |

Important detail about warning compounding: the warning level calculates its reduction from the current limit, not the original configured limit. Repeated warning events compound:

Original: 5000 MB
After 1st warning: 3000 MB
After 2nd warning: 1800 MB
After 3rd warning: 1080 MB

The limit is only restored to the configured value when applyConfig() runs again — for example, on app start or after a settings change.
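The decay is purely geometric; this standalone snippet reproduces the table above:

```swift
// Why repeated .warning events compound: each event scales the *current*
// limit by 0.6, not the originally configured limit.
var limitMB = 5000
for event in 1...3 {
    limitMB = Int(Double(limitMB) * 0.6)
    print("After warning \(event): \(limitMB) MB")   // 3000, 1800, 1080
}
```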


3. Cache Statistics

SharedMemoryCache tracks hits and evictions in actor-isolated counters:

  • cacheMemory — incremented on every RAM hit (via updateCacheMemory())
  • cacheDisk — incremented on every disk hit (via updateCacheDisk())
  • Eviction count — tracked by CacheDelegate.EvictionCounter

getCacheStatistics() async -> CacheStatistics returns a snapshot:

struct CacheStatistics {
    nonisolated let hits: Int
    nonisolated let misses: Int
    nonisolated let evictions: Int
    nonisolated let hitRate: Double   // (hits / (hits + misses)) * 100
}

clearCaches() async:

  1. Reads and logs final statistics
  2. memoryCache.removeAllObjects()
  3. diskCache.pruneCache(maxAgeInDays: 0) — prunes all files
  4. Resets cacheMemory, cacheDisk, and eviction count to 0

4. End-to-End Cache Flow

Request thumbnail for URL
│
├─ Check SharedMemoryCache.object(forKey:)
│   ├─ Hit: beginContentAccess() → use image → endContentAccess() → return
│   └─ Miss:
│       ├─ Check DiskCacheManager.load(for:)
│       │   ├─ Hit: wrap as DiscardableThumbnail → store in NSCache → return
│       │   └─ Miss:
│       │       ├─ SonyThumbnailExtractor.extractSonyThumbnail(from:maxDimension:qualityCost:)
│       │       ├─ Normalize CGImage → NSImage (JPEG-backed, quality 0.7)
│       │       ├─ Create DiscardableThumbnail → store in NSCache (with cost)
│       │       └─ Encode JPEG data → DiskCacheManager.save(_:for:) [detached, background]
└─ Return CGImage to caller

5. Settings That Affect Cache Behavior

Settings live in SettingsViewModel and are persisted to ~/Library/Application Support/RawCull/settings.json.

| Setting | Default | Effect |
| --- | --- | --- |
| memoryCacheSizeMB | 5000 | NSCache.totalCostLimit = memoryCacheSizeMB × 1024 × 1024 |
| thumbnailCostPerPixel | 4 | Cost per pixel in DiscardableThumbnail.cost |
| thumbnailSizePreview | 1024 | Target size for bulk preload; affects entry cost |
| thumbnailSizeGrid | 100 | Grid thumbnail size |
| thumbnailSizeGridView | 400 | Grid View thumbnail size |
| thumbnailSizeFullSize | 8700 | Full-size zoom path upper bound |

SettingsViewModel.validateSettings() emits warnings if:

  • memoryCacheSizeMB < 500
  • memoryCacheSizeMB > 80% of available system memory

6. Cache Flow Diagram

flowchart TD
    A[Thumbnail Requested] --> B{Memory Cache Hit?}
    B -- Yes --> C[beginContentAccess]
    C --> D[Return NSImage]
    D --> E[endContentAccess]
    B -- No --> F{Disk Cache Hit?}
    F -- Yes --> G[Load JPEG from Disk]
    G --> H[Wrap as DiscardableThumbnail]
    H --> I[Store in NSCache with cost]
    I --> D
    F -- No --> J[SonyThumbnailExtractor]
    J --> K[Normalize to JPEG-backed NSImage]
    K --> L[Create DiscardableThumbnail]
    L --> I
    K --> M[Encode JPEG data]
    M --> N[DiskCacheManager.save — detached background]

7. Memory Pressure Response Diagram

flowchart TD
    A[DispatchSourceMemoryPressure event] --> B{Pressure Level?}
    B -- normal --> C[Log + update currentPressureLevel]
    C --> D[Notify UI: no warning]
    B -- warning --> E[Reduce limit to 60% of current]
    E --> F[Notify UI: warning]
    B -- critical --> G[removeAllObjects]
    G --> H[Set limit to 50 MB]
    H --> I[Notify UI: warning]

Thumbnails

Thumbnails and Previews — RawCull

RawCull handles Sony ARW files through two distinct image paths:

  1. Generated thumbnails — fast, sized-down previews for browsing and culling, extracted with ImageIO and cached in RAM and on disk
  2. Embedded JPEG previews — full-resolution embedded JPEGs extracted from the ARW binary for high-quality inspection and export

Both paths integrate with the shared three-layer cache system: RAM → disk → source decode.


1. Thumbnail Sizes and Settings

All thumbnail dimensions are configurable via SettingsViewModel and persisted to ~/Library/Application Support/RawCull/settings.json.

| Setting | Default | Usage |
| --- | --- | --- |
| thumbnailSizeGrid | 100 | Small thumbnails in the grid list view |
| thumbnailSizeGridView | 400 | Thumbnails in the main grid view |
| thumbnailSizePreview | 1024 | Bulk preload target size |
| thumbnailSizeFullSize | 8700 | Upper bound for the full-size zoom path |
| thumbnailCostPerPixel | 4 | RGBA bytes per pixel — drives the cache cost calculation |
| useThumbnailAsZoomPreview | false | Reuse the cached thumbnail instead of re-extracting for zoom |

All extraction uses max pixel size on the longest edge (kCGImageSourceThumbnailMaxPixelSize). Actual width and height depend on the source aspect ratio.
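The scaling arithmetic can be illustrated as follows (a sketch; ImageIO's exact rounding may differ by a pixel):

```swift
// How a max-pixel-size constraint on the longest edge maps to output size.
func thumbnailDimensions(sourceW: Int, sourceH: Int, maxPixel: Int) -> (w: Int, h: Int) {
    let scale = Double(maxPixel) / Double(max(sourceW, sourceH))
    guard scale < 1 else { return (sourceW, sourceH) }   // never upscale
    return (Int((Double(sourceW) * scale).rounded()),
            Int((Double(sourceH) * scale).rounded()))
}
// e.g. a 9504×6336 (3:2) source with maxPixel 1024 → roughly 1024×683
```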


2. Generated Thumbnail Pipeline — SonyThumbnailExtractor

SonyThumbnailExtractor is a caseless enum with nonisolated static methods. Its extractSonyThumbnail method is the primary entry point for generating thumbnails from ARW files.

2.1 Async dispatch

To prevent blocking actor queues during CPU-intensive ImageIO work, extraction is dispatched to the global userInitiated GCD queue via withCheckedThrowingContinuation:

static func extractSonyThumbnail(
    from url: URL,
    maxDimension: CGFloat,
    qualityCost: Int = 4
) async throws -> CGImage {
    try await withCheckedThrowingContinuation { continuation in
        DispatchQueue.global(qos: .userInitiated).async {
            do {
                let image = try Self.extractSync(from: url, maxDimension: maxDimension, qualityCost: qualityCost)
                continuation.resume(returning: image)
            } catch {
                continuation.resume(throwing: error)
            }
        }
    }
}

2.2 ImageIO extraction (extractSync)

extractSync is nonisolated and runs synchronously on the GCD thread:

let sourceOptions: [CFString: Any] = [kCGImageSourceShouldCache: false]
guard let source = CGImageSourceCreateWithURL(url as CFURL, sourceOptions as CFDictionary) else {
    throw ThumbnailError.invalidSource
}

let thumbOptions: [CFString: Any] = [
    kCGImageSourceCreateThumbnailFromImageAlways: true,
    kCGImageSourceCreateThumbnailWithTransform:   true,
    kCGImageSourceThumbnailMaxPixelSize:          maxDimension,
    kCGImageSourceShouldCacheImmediately:         true
]

guard let cgImage = CGImageSourceCreateThumbnailAtIndex(source, 0, thumbOptions as CFDictionary) else {
    throw ThumbnailError.generationFailed
}
return try rerender(cgImage, qualityCost: qualityCost)

kCGImageSourceShouldCache: false on the source prevents ImageIO from caching the raw input; kCGImageSourceShouldCacheImmediately: true on the thumbnail options ensures the decoded output pixels are available immediately.

2.3 Re-rendering with interpolation quality (rerender)

After ImageIO decodes the thumbnail, rerender redraws it into a new CGContext. This applies controlled interpolation quality and normalizes the pixel format to sRGB premultipliedLast:

private static nonisolated func rerender(_ image: CGImage, qualityCost: Int) throws -> CGImage {
    let quality: CGInterpolationQuality = switch qualityCost {
        case 1...2: .low
        case 3...4: .medium
        default:    .high
    }
    guard let colorSpace = CGColorSpace(name: CGColorSpace.sRGB),
          let ctx = CGContext(
              data: nil,
              width: image.width,
              height: image.height,
              bitsPerComponent: 8,
              bytesPerRow: 0,
              space: colorSpace,
              bitmapInfo: CGImageAlphaInfo.premultipliedLast.rawValue
          ) else {
        throw ThumbnailError.contextCreationFailed
    }
    ctx.interpolationQuality = quality
    ctx.draw(image, in: CGRect(x: 0, y: 0, width: image.width, height: image.height))
    guard let rendered = ctx.makeImage() else {
        throw ThumbnailError.generationFailed
    }
    return rendered
}

Interpolation quality mapping:

| thumbnailCostPerPixel | CGInterpolationQuality |
| --- | --- |
| 1–2 | .low — fastest, lowest quality |
| 3–4 | .medium — balanced (default) |
| 5+ | .high — best quality, highest CPU |

3. Thumbnail Normalization

Before storing in the cache, ScanAndCreateThumbnails normalizes the CGImage to an NSImage backed by a single JPEG representation. This ensures that the memory entry and the disk entry are consistent with each other.

func cgImageToNormalizedNSImage(_ cgImage: CGImage) throws -> NSImage {
    // Encode to JPEG at quality 0.7
    guard let jpegData = DiskCacheManager.jpegData(from: cgImage) else {
        throw ThumbnailError.generationFailed
    }
    // Decode back to NSImage — now backed by exactly one NSBitmapImageRep
    guard let image = NSImage(data: jpegData) else {
        throw ThumbnailError.generationFailed
    }
    return image
}

The inverse direction (NSImage → CGImage) is also needed when promoting a disk-cached JPEG to the RAM cache:

func nsImageToCGImage(_ nsImage: NSImage) throws -> CGImage {
    // Try direct CGImage extraction first
    if let rep = nsImage.representations.first as? NSBitmapImageRep,
       let cg = rep.cgImage { return cg }
    // Fallback: TIFF round-trip
    guard let tiffData = nsImage.tiffRepresentation,
          let src = CGImageSourceCreateWithData(tiffData as CFData, nil),
          let cg = CGImageSourceCreateImageAtIndex(src, 0, nil) else {
        throw ThumbnailError.generationFailed
    }
    return cg
}

4. Preload Flow (Bulk) — ScanAndCreateThumbnails

When the user selects a catalog, RawCullViewModel.handleSourceChange(url:) triggers bulk preloading.

4.1 Concurrency model

preloadCatalog(at:targetSize:) uses a withTaskGroup bounded to ProcessInfo.processInfo.activeProcessorCount * 2 concurrent tasks. Back-pressure is applied with await group.next() before adding a new task when the limit is reached.
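The bounded pattern can be sketched like this (processFile is a hypothetical stand-in for the real processSingleFile(_:targetSize:itemIndex:)):

```swift
import Foundation

// Hypothetical stand-in for the real per-file work
func processFile(_ url: URL) async { /* decode, cache, report progress */ }

// Sketch of a bounded task group: at most `width` tasks in flight;
// group.next() frees a slot before another task is added.
func preload(urls: [URL]) async {
    let width = ProcessInfo.processInfo.activeProcessorCount * 2
    await withTaskGroup(of: Void.self) { group in
        var inFlight = 0
        for url in urls {
            if inFlight >= width {
                _ = await group.next()   // back-pressure: wait for one to finish
                inFlight -= 1
            }
            group.addTask { await processFile(url) }
            inFlight += 1
        }
        await group.waitForAll()
    }
}
```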

4.2 Per-file processing

For each ARW file, processSingleFile(_:targetSize:itemIndex:) runs the three-tier cache lookup:

RAM cache → Disk cache → SonyThumbnailExtractor

Cancellation is checked with Task.isCancelled before and after every expensive operation. If cancelled mid-extraction, the result is discarded and no write occurs.

4.3 Caching after extraction

On a successful source extraction:

  1. cgImageToNormalizedNSImage(_:) produces a single-representation NSImage.
  2. storeInMemoryCache(_:for:) creates a DiscardableThumbnail using pixel-accurate cost and stores it in SharedMemoryCache.
  3. DiskCacheManager.jpegData(from:) encodes to JPEG (quality 0.7) — this is called on the actor while the CGImage is still accessible.
  4. diskCache.save(_:for:) writes the data from a detached background task.

4.4 Request coalescing

If thumbnail(for:targetSize:) is called concurrently for the same URL while extraction is in progress, an inflightTasks: [URL: Task<CGImage, Error>] dictionary provides coalescing. Subsequent callers await the existing task rather than launching duplicate extraction work.
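A minimal sketch of that coalescing (the actor name is illustrative; the cache tiers and cancellation handling are omitted):

```swift
// Sketch: duplicate requests for the same URL await the in-flight task
// instead of launching a second extraction.
actor ThumbnailPipeline {
    private var inflightTasks: [URL: Task<CGImage, Error>] = [:]

    func thumbnail(for url: URL, targetSize: CGFloat) async throws -> CGImage {
        if let existing = inflightTasks[url] {
            return try await existing.value   // coalesce onto existing work
        }
        let task = Task {
            try await SonyThumbnailExtractor.extractSonyThumbnail(
                from: url, maxDimension: targetSize)
        }
        inflightTasks[url] = task
        defer { inflightTasks[url] = nil }
        return try await task.value
    }
}
```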


5. On-Demand Thumbnails — ThumbnailLoader + RequestThumbnail

UI elements (grid view, file list, inspector) request thumbnails through the on-demand path.

5.1 ThumbnailLoader (rate limiting)

ThumbnailLoader.shared is a global actor singleton that caps concurrent thumbnail loads at 6. Requests beyond this limit suspend via CheckedContinuation and queue in pendingContinuations. When a slot is released, the next waiting continuation is resumed. The target size passed to RequestThumbnail is thumbnailSizePreview (default 1024).

If a waiting task is cancelled before its slot becomes available, its continuation is removed from the queue by UUID so it is never spuriously resumed.
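The gate can be sketched as an actor like this (simplified: the cancellation path that removes a waiter by UUID is indicated but not fully wired up):

```swift
// Sketch of a continuation-based concurrency gate: at most
// maxConcurrent loads run at once; later callers suspend in FIFO order.
actor LoadGate {
    private let maxConcurrent = 6
    private var active = 0
    private var pending: [(id: UUID, cont: CheckedContinuation<Void, Never>)] = []

    func acquire() async {
        if active < maxConcurrent {
            active += 1
            return
        }
        let id = UUID()   // used by the (omitted) cancellation path
        await withCheckedContinuation { cont in
            pending.append((id: id, cont: cont))
        }
    }

    func release() {
        if !pending.isEmpty {
            // Hand the slot directly to the next waiter; `active` is unchanged.
            pending.removeFirst().cont.resume()
        } else {
            active -= 1
        }
    }
}
```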

5.2 RequestThumbnail (cache pipeline)

RequestThumbnail handles the actual per-file resolution for the on-demand path:

  1. ensureReady() — the setupTask gate ensures SharedMemoryCache is configured once.
  2. RAM cache lookup via SharedMemoryCache.object(forKey:) + beginContentAccess().
  3. Disk cache lookup via DiskCacheManager.load(for:).
  4. Extraction fallback via SonyThumbnailExtractor.extractSonyThumbnail(from:maxDimension:qualityCost:).
  5. Store in RAM cache.
  6. Disk save via a detached background task.

requestThumbnail(for:targetSize:) returns CGImage? for direct use by SwiftUI views.


6. Embedded JPEG Preview Extraction — JPGSonyARWExtractor

Embedded JPEG previews are distinct from generated thumbnails. They are the full-resolution previews baked into the ARW file by the camera, and are used for high-quality inspection and export.

JPGSonyARWExtractor is a caseless enum whose nonisolated static methods dispatch their work to DispatchQueue.global(qos: .utility).

6.1 JPEG detection algorithm

ARW files contain multiple sub-images. The extractor iterates all of them and identifies JPEG candidates by:

  1. Presence of kCGImagePropertyJFIFDictionary in image properties, or
  2. Compression value 6 (JPEG) in the kCGImagePropertyTIFFDictionary.

Among all JPEG candidates, the widest image is selected. Width is read from kCGImagePropertyPixelWidth, falling back to the TIFF or EXIF width dictionary entries.
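That selection can be sketched as follows (the function name is hypothetical; the TIFF/EXIF width fallback is omitted):

```swift
import ImageIO

// Sketch: pick the widest JPEG-compressed sub-image in an ARW source.
func widestJPEGIndex(in source: CGImageSource) -> Int? {
    var best: (index: Int, width: Int)?
    for index in 0..<CGImageSourceGetCount(source) {
        guard let props = CGImageSourceCopyPropertiesAtIndex(source, index, nil)
                as? [CFString: Any] else { continue }
        let tiff = props[kCGImagePropertyTIFFDictionary] as? [CFString: Any]
        let isJPEG = props[kCGImagePropertyJFIFDictionary] != nil
            || (tiff?[kCGImagePropertyTIFFCompression] as? Int) == 6
        guard isJPEG else { continue }
        let width = props[kCGImagePropertyPixelWidth] as? Int ?? 0
        if width > (best?.width ?? -1) {
            best = (index, width)
        }
    }
    return best?.index
}
```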

6.2 Downsampling large previews

let maxThumbnailSize: CGFloat = fullSize ? 8640 : 4320

If the selected JPEG’s width exceeds maxThumbnailSize, it is downsampled using ImageIO:

let thumbOptions: [CFString: Any] = [
    kCGImageSourceCreateThumbnailFromImageAlways: true,
    kCGImageSourceCreateThumbnailWithTransform:   true,
    kCGImageSourceThumbnailMaxPixelSize:          Int(maxThumbnailSize)
]
CGImageSourceCreateThumbnailAtIndex(source, index, thumbOptions as CFDictionary)

If the JPEG is already smaller than maxThumbnailSize, it is decoded at its original size with CGImageSourceCreateImageAtIndex.


7. JPEG Export — SaveJPGImage

SaveJPGImage.save(image:originalURL:) writes an extracted CGImage alongside the original ARW:

  • Output path: originalURL with .arw extension replaced by .jpg
  • Compression quality: 1.0 (maximum quality setting; JPEG encoding is still lossy, but loss is minimal)
  • Format: JPEG via CGImageDestinationCreateWithURL + CGImageDestinationFinalize

The method is nonisolated and dispatches its work asynchronously to a global queue, keeping actor executors clear of blocking I/O.
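
A minimal sketch of this export path, using Apple's ImageIO destination API. The function name and signature here are illustrative; the real SaveJPGImage also hops to a global queue before writing:

```swift
import ImageIO
import UniformTypeIdentifiers

/// Writes `image` as a JPEG next to the original ARW file.
func writeJPEG(_ image: CGImage, alongside originalURL: URL) -> Bool {
    // .arw → .jpg, same directory and base name.
    let jpgURL = originalURL.deletingPathExtension()
                            .appendingPathExtension("jpg")
    guard let dest = CGImageDestinationCreateWithURL(
        jpgURL as CFURL, UTType.jpeg.identifier as CFString, 1, nil)
    else { return false }

    // 1.0 = the maximum-quality setting documented above.
    let options = [kCGImageDestinationLossyCompressionQuality: 1.0] as CFDictionary
    CGImageDestinationAddImage(dest, image, options)
    return CGImageDestinationFinalize(dest)
}
```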


8. Error Handling

ThumbnailError defines three typed errors for the thumbnail pipeline:

enum ThumbnailError: Error, LocalizedError {
    case invalidSource          // CGImageSourceCreateWithURL returned nil
    case generationFailed       // CGImageSourceCreateThumbnailAtIndex or CGContext.makeImage returned nil
    case contextCreationFailed  // CGContext creation failed
}

All callers catch errors, log the failure with file path and description, and return nil to the consumer. A single corrupt or unreadable file does not interrupt bulk processing — the preload loop and extraction loop both continue with the next file.


9. Flow Diagrams

9.1 Bulk Preload (ScanAndCreateThumbnails)

flowchart TD
    A[Catalog Selected] --> B[handleSourceChange — MainActor]
    B --> C[ScanAndCreateThumbnails.preloadCatalog]
    C --> D[ensureReady — SharedMemoryCache + settings]
    D --> E[cancelPreload — cancel prior inner task]
    E --> F[DiscoverFiles — enumerate ARW files]
    F --> G[withTaskGroup — capped at processorCount × 2]
    G --> H{RAM Cache Hit?}
    H -- Yes --> P[Update progress / ETA]
    H -- No --> I{Disk Cache Hit?}
    I -- Yes --> J[Load JPEG from disk]
    J --> K[Promote to RAM cache]
    K --> P
    I -- No --> L[SonyThumbnailExtractor — ImageIO decode]
    L --> M[Rerender with interpolation quality]
    M --> N[Normalize to JPEG-backed NSImage]
    N --> O[Store in RAM cache]
    O --> Q[Encode JPEG data]
    Q --> R[DiskCacheManager.save — detached background]
    R --> P
    P --> S{More files?}
    S -- Yes --> G
    S -- No --> T[Return count to RawCullViewModel]

9.2 On-Demand Request (ThumbnailLoader + RequestThumbnail)

sequenceDiagram
    participant UI
    participant TL as ThumbnailLoader.shared
    participant RT as RequestThumbnail
    participant MC as SharedMemoryCache
    participant DC as DiskCacheManager
    participant EX as SonyThumbnailExtractor

    UI->>TL: thumbnailLoader(file:)
    TL->>TL: acquireSlot() — suspend if activeTasks ≥ 6
    TL->>RT: requestThumbnail(for:targetSize:)
    RT->>MC: object(forKey:) + beginContentAccess()
    alt RAM hit
        MC-->>RT: DiscardableThumbnail.image
        RT-->>UI: CGImage
    else RAM miss
        RT->>DC: load(for:)
        alt Disk hit
            DC-->>RT: NSImage
            RT->>MC: setObject(DiscardableThumbnail)
            RT-->>UI: CGImage
        else Disk miss
            RT->>EX: extractSonyThumbnail(from:maxDimension:qualityCost:)
            EX-->>RT: CGImage
            RT->>MC: setObject(DiscardableThumbnail)
            RT->>DC: save — detached background
            RT-->>UI: CGImage
        end
    end
    TL->>TL: releaseSlot() — resume next pending continuation

9.3 Embedded JPEG Extraction (JPGSonyARWExtractor)

flowchart TD
    A[ARW file URL] --> B[CGImageSourceCreateWithURL]
    B --> C[Iterate all sub-images]
    C --> D{JPEG candidate?}
    D -- JFIF dict present OR TIFF compression == 6 --> E[Record width]
    D -- No --> C
    E --> F{More images?}
    F -- Yes --> C
    F -- No --> G[Select widest JPEG candidate]
    G --> H{Width > maxThumbnailSize?}
    H -- Yes --> I[Downsample via kCGImageSourceThumbnailMaxPixelSize]
    H -- No --> J[Decode at original size]
    I --> K[Return CGImage]
    J --> K

10. Settings Reference

| Setting | Default | Effect |
| --- | --- | --- |
| memoryCacheSizeMB | 5000 | Sets NSCache.totalCostLimit |
| thumbnailCostPerPixel | 4 | Drives DiscardableThumbnail cost and interpolation quality |
| thumbnailSizePreview | 1024 | Bulk preload target size |
| thumbnailSizeGrid | 100 | Grid list thumbnail size |
| thumbnailSizeGridView | 400 | Main grid view thumbnail size |
| thumbnailSizeFullSize | 8700 | Full-size zoom path upper bound |
| useThumbnailAsZoomPreview | false | Skip re-extraction and use cached thumbnail for zoom |

Scan and Thumbnail Pipeline

RawCull — Scan and Thumbnail Pipeline

This document describes the complete execution flow from the moment a user opens a catalog folder to the point where all thumbnails are visible in the grid. It covers the actors involved, the data flow between them, the concurrency model, five performance bugs that were found and fixed, and the measured results on a real catalog of 809 Sony A1 ARW files stored on an external 800 MB/s SSD.


1. Overview

Opening a catalog triggers two parallel workstreams:

User opens folder
       │
       ├─► ScanFiles.scanFiles()          — discovers files, reads EXIF and focus points
       │
       └─► ScanAndCreateThumbnails
               .preloadCatalog()          — generates or loads thumbnails for every file

Both workstreams are Swift actors. Each uses a withTaskGroup internally to process files concurrently. Both report progress back to the SwiftUI layer via @MainActor callbacks.


2. Phase 1 — File scan (ScanFiles)

File: RawCull/Actors/ScanFiles.swift

2.1 Directory discovery

let contents = try FileManager.default.contentsOfDirectory(
    at: url,
    includingPropertiesForKeys: [.nameKey, .fileSizeKey, .contentTypeKey,
                                  .contentModificationDateKey],
    options: [.skipsHiddenFiles]
)

FileManager.contentsOfDirectory returns all entries in one call. File-system metadata (name, size, type, modification date) is prefetched via includingPropertiesForKeys — no per-file stat() calls are needed later.

2.2 Concurrent EXIF extraction (withTaskGroup)

await withTaskGroup(of: FileItem?.self) { group in
    for fileURL in contents {
        guard fileURL.pathExtension.lowercased() == "arw" else { continue }
        let progress = onProgress
        let count = discoveredCount
        Task { @MainActor in progress?(count) }   // fire-and-forget UI update
        group.addTask {
            let res  = try? fileURL.resourceValues(forKeys: Set(keys))
            let exif = self.extractExifData(from: fileURL)   // nonisolated
            return FileItem(url: fileURL, name: res?.name, exifData: exif)
        }
    }
    // FileItems are collected from the group with `for await`
}

For each ARW file a task is added to the group. The loop itself is non-blocking: progress callbacks are fired to the main actor without await, so the loop completes almost instantly and the task group fills up immediately.

extractExifData uses Apple’s ImageIO framework:

private nonisolated func extractExifData(from url: URL) -> ExifMetadata? {
    guard let src = CGImageSourceCreateWithURL(url as CFURL, nil),
          let props = CGImageSourceCopyPropertiesAtIndex(src, 0, nil)
              as? [CFString: Any]
    else { return nil }
    // ... reads the TIFF/EXIF dictionaries from props ...
}

nonisolated is critical here: it tells Swift the method does not access any actor-isolated state, so task group tasks can call it directly on the global cooperative thread pool without hopping back to the ScanFiles actor’s serial executor.

CGImageSourceCopyPropertiesAtIndex reads the TIFF/EXIF header from the file. For a Sony ARW this is the first few kilobytes — not the full ~50 MB RAW image.

Measured throughput: ~2–3 ms per file. 809 files concurrently ≈ 3–4 seconds.

2.3 Concurrent focus point extraction

After the EXIF task group completes, focus points are extracted with a second concurrent task group:

private func extractNativeFocusPoints(from items: [FileItem]) async
    -> [DecodeFocusPoints]?
{
    await withTaskGroup(of: DecodeFocusPoints?.self) { group in
        for item in items {
            group.addTask {
                guard let loc = SonyMakerNoteParser.focusLocation(from: item.url)
                else { return nil }
                return DecodeFocusPoints(sourceFile: item.url.lastPathComponent,
                                         focusLocation: loc)
            }
        }
        
    }
}

SonyMakerNoteParser.focusLocation is nonisolated static, so — like extractExifData — task group tasks run it directly on the thread pool.

What SonyMakerNoteParser does

Sony ARW is TIFF-based. Focus location lives at:

TIFF IFD0  →  ExifIFD (tag 0x8769)  →  MakerNote (tag 0x927C)
    →  Sony MakerNote IFD  →  FocusLocation (tag 0x2027)

Tag 0x2027 is int16u[4] = [sensorWidth, sensorHeight, focusX, focusY] in full sensor pixel coordinates. The parser navigates the TIFF IFD chain in binary using only the bytes it needs.

Key implementation detail: the parser reads only the first 4 MB of the file:

guard let fh = try? FileHandle(forReadingFrom: url) else { return nil }
defer { try? fh.close() }
guard let data = try? fh.read(upToCount: 4 * 1024 * 1024) else { return nil }

Sony ARW MakerNote metadata sits well within the first 1–2 MB of the file. Reading 4 MB is a conservative safe limit. The full RAW image data follows later in the file and is never touched.

Measured throughput: ~0.3–0.4 ms per file. 809 files concurrently ≈ < 1 second.


3. Phase 2 — Thumbnail generation (ScanAndCreateThumbnails)

File: RawCull/Actors/ScanAndCreateThumbnails.swift

3.1 Sliding-window task group

let maxConcurrent = ProcessInfo.processInfo.activeProcessorCount * 2

for (index, url) in urls.enumerated() {
    if index >= maxConcurrent {
        await group.next()        // keep at most maxConcurrent in flight
    }
    group.addTask {
        await self.processSingleFile(url, targetSize: targetSize)
    }
}

On a Mac Mini M2 (10 reported cores), maxConcurrent = 20. The sliding window ensures at most 20 files are being processed simultaneously, preventing memory pressure from loading too many large images at once.
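
The sliding-window shape can be generalised into a reusable helper — a sketch, not the actual ScanAndCreateThumbnails code: at most `limit` child tasks run at once, and a slot is freed before each new task is added.

```swift
import Foundation

func processBounded<T: Sendable>(
    _ urls: [URL],
    limit: Int = ProcessInfo.processInfo.activeProcessorCount * 2,
    work: @escaping @Sendable (URL) async -> T
) async -> [T] {
    await withTaskGroup(of: T.self) { group in
        var results: [T] = []
        for (index, url) in urls.enumerated() {
            if index >= limit, let done = await group.next() {
                results.append(done)   // wait for one slot before adding more
            }
            group.addTask { await work(url) }
        }
        // Drain the remaining in-flight tasks.
        for await done in group { results.append(done) }
        return results
    }
}
```

Because `group.next()` returns whichever task finishes first, a single slow file never stalls the window — only the total in-flight count is capped.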

3.2 Per-file processing (processSingleFile)

Each task follows a three-tier lookup:

A. RAM cache (NSCache)   →  microseconds, no I/O
B. Disk cache (JPEG)     →  ~1–5 ms, reads ~494 KB from internal SSD
C. RAW extraction        →  ~180–200 ms, decodes full ARW via ImageIO

A. RAM cache

SharedMemoryCache is a global actor wrapping NSCache. A cache hit is a synchronous dictionary lookup — effectively free.

B. Disk cache

Thumbnails are stored as JPEG files at ~/Library/Caches/no.blogspot.RawCull/Thumbnails/. The filename is an MD5 hash of the source file’s absolute path. Each cached thumbnail is ~494 KB (512 px longest edge, JPEG quality 0.7).

DiskCacheManager.load(for:) spawns a Task.detached for the file read, releasing the actor during I/O.

After a first full scan, the disk cache is ~400 MB for 809 files.
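
How an MD5-of-path cache key can be derived, as a sketch using CryptoKit's `Insecure.MD5` — the exact key derivation inside DiskCacheManager (casing, extension) may differ:

```swift
import CryptoKit
import Foundation

/// Deterministic cache filename for a source file: hex MD5 of the
/// absolute path plus a .jpg extension.
func cacheFileName(for sourceURL: URL) -> String {
    let digest = Insecure.MD5.hash(data: Data(sourceURL.path.utf8))
    return digest.map { String(format: "%02x", $0) }.joined() + ".jpg"
}
```

Hashing the absolute path means a moved or renamed catalog produces new keys; the old entries simply age out of the cache directory.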

C. RAW extraction

let cgImage = try await SonyThumbnailExtractor.extractSonyThumbnail(
    from: url,
    maxDimension: CGFloat(targetSize),
    qualityCost: costPerPixel
)

SonyThumbnailExtractor hops immediately to DispatchQueue.global() so the actor is not blocked during the ~180–200 ms decode:

try await withCheckedThrowingContinuation { continuation in
    DispatchQueue.global(qos: .userInitiated).async {
        do {
            let image = try Self.extractSync(from: url)
            continuation.resume(returning: image)
        } catch {
            continuation.resume(throwing: error)
        }
    }
}

Internally this calls CGImageSourceCreateThumbnailAtIndex which uses the embedded JPEG preview inside the ARW where available, avoiding a full RAW decode.

After extraction the thumbnail is:

  1. Stored in the RAM cache (NSCache) immediately.
  2. Encoded to JPEG data and written to the disk cache via a background Task.detached — this write does not block the thumbnail pipeline.

3.3 Progress notification (fire-and-forget)

After each file completes, the UI is notified:

private func notifyFileHandler(_ count: Int) {
    let handler = fileHandlers?.fileHandler
    Task { @MainActor in handler?(count) }
}

The Task { @MainActor in } delivers the update to SwiftUI without blocking the current task. Thumbnail generation does not wait for the UI to finish rendering before moving on to the next file.


4. Performance bugs found and fixed

All five bugs shared the same root cause: await on a @MainActor or actor-isolated method inside a hot loop, serialising work that should have been concurrent. Swift 6’s SWIFT_DEFAULT_ACTOR_ISOLATION = MainActor makes this easy to introduce — a method that touches no actor state is still actor-isolated by default and must be explicitly marked nonisolated to opt out.


Bug 1 — Thumbnail pipeline blocked on UI callbacks

File: ScanAndCreateThumbnails.processSingleFile

Symptom: 809 thumbnails generated at ~199 ms/file wall-clock despite a 20-slot concurrent task group.

Root cause:

// Before
await fileHandlers?.fileHandler(newCount)
await updateEstimatedTime(for: startTime, itemsProcessed: newCount)

fileHandlers.fileHandler is @MainActor. Every completed thumbnail called await on it, suspending the task until SwiftUI finished rendering the updated grid. The main actor processed these callbacks serially. With 809 files the entire thumbnail pipeline serialised behind 809 sequential SwiftUI re-renders.

Measured cost: ~78 ms per file × 809 = ~63 seconds on second run (disk cache I/O itself is < 1 ms per file on the internal SSD).

Fix:

// After
private func notifyFileHandler(_ count: Int) {
    let handler = fileHandlers?.fileHandler
    Task { @MainActor in handler?(count) }   // fire-and-forget
}

updateEstimatedTime was also made non-async with its estimatedTimeHandler callback converted to the same fire-and-forget pattern.


Bug 2 — EXIF extraction serialised on actor

File: ScanFiles.extractExifData

Symptom: EXIF phase took ~60 seconds for 809 files despite withTaskGroup.

Root cause:

// Before — actor-isolated, called with await from task group tasks
private func extractExifData(from url: URL) -> ExifMetadata? { /* ... */ }

// In task group:
let exifData = await self.extractExifData(from: fileURL)  // hops to actor

The task group created 809 tasks, but every task immediately called await self.extractExifData(…), which required hopping to the ScanFiles actor’s serial executor. All 809 tasks queued behind the actor. Concurrency was zero.

Measured cost: ~74 ms/file × 809 = ~60 seconds.

Fix:

// After — nonisolated, no actor hop required
private nonisolated func extractExifData(from url: URL) -> ExifMetadata? { /* ... */ }

// In task group:
let exifData = self.extractExifData(from: fileURL)   // no await

The four pure helper methods (formatShutterSpeed, formatFocalLength, formatAperture, formatISO) were also marked nonisolated since they are called from extractExifData.


Bug 3 — File discovery loop blocked on UI progress callbacks

File: ScanFiles.scanFiles

Symptom: After Bug 2 fix, scan still took ~53 seconds.

Root cause:

// Before
for fileURL in contents {
    discoveredCount += 1
    await onProgress?(discoveredCount)   // @MainActor hop per file
    group.addTask { /* ... */ }
}

onProgress is @MainActor @Sendable. The for loop awaited it for every file before adding the next task to the group. The task group was not the bottleneck — the loop that built it was. 809 sequential main-actor round trips, each waiting for a SwiftUI counter update, held the loop for ~65 ms per iteration.

Measured cost: ~65 ms/file × 809 = ~53 seconds.

Fix:

// After
for fileURL in contents {
    discoveredCount += 1
    let progress = onProgress        // capture closure (Sendable)
    let count    = discoveredCount   // capture value (Int, not var)
    Task { @MainActor in progress?(count) }   // fire-and-forget
    group.addTask { /* ... */ }
}

The loop now completes in milliseconds. All 809 task group tasks are enqueued before the first one finishes, maximising parallelism.


Bug 4 — Focus point parser read the entire RAW file

File: SonyMakerNoteParser.focusLocation

Symptom: Focus point extraction added ~50 seconds to the scan.

Root cause:

// Before
guard let data = try? Data(contentsOf: url, options: .mappedIfSafe) else { return nil }

On an external SSD, macOS cannot safely use mmap (the filesystem or volume does not permit it), so Data(contentsOf:options:) with .mappedIfSafe silently falls back to reading the entire file into memory. A Sony A1 ARW is ~50 MB.

50 MB × 809 files = ~40 GB total I/O
40 GB ÷ 800 MB/s  = ~50 seconds

The parser only needs the TIFF IFD chain and MakerNote, which reside within the first 1–2 MB of a Sony ARW. The RAW image data that follows is never accessed.

Fix:

// After — reads only the first 4 MB
guard let fh = try? FileHandle(forReadingFrom: url) else { return nil }
defer { try? fh.close() }
guard let data = try? fh.read(upToCount: 4 * 1024 * 1024) else { return nil }

FileHandle.read(upToCount:) issues a single bounded read regardless of filesystem. 4 MB per file is a conservative limit well above the 1–2 MB actually needed, and safe for all known Sony ARW structures.

4 MB × 809 files = ~3.2 GB total I/O
3.2 GB ÷ 800 MB/s = ~4 seconds (sequential upper bound)

Bug 5 — Focus point extraction sequential

File: ScanFiles.extractNativeFocusPoints

Symptom: Even after Bug 4, focus extraction ran serially.

Root cause:

// Before — synchronous compactMap, runs on actor
private func extractNativeFocusPoints(from items: [FileItem]) -> [DecodeFocusPoints]? {
    let parsed = items.compactMap { item -> DecodeFocusPoints? in
        guard let loc = SonyMakerNoteParser.focusLocation(from: item.url)
        else { return nil }
        return DecodeFocusPoints(sourceFile: item.url.lastPathComponent,
                                 focusLocation: loc)
    }
    return parsed.isEmpty ? nil : parsed
}

A plain compactMap on 809 items processes them one at a time on the actor.

Fix:

// After — concurrent task group
private func extractNativeFocusPoints(from items: [FileItem]) async
    -> [DecodeFocusPoints]?
{
    await withTaskGroup(of: DecodeFocusPoints?.self) { group in
        for item in items {
            group.addTask {
                guard let loc = SonyMakerNoteParser.focusLocation(from: item.url)
                else { return nil }
                return DecodeFocusPoints(sourceFile: item.url.lastPathComponent,
                                         focusLocation: loc)
            }
        }
        // non-nil results are collected with `for await result in group`
    }
}

SonyMakerNoteParser.focusLocation is nonisolated static, so task group tasks call it directly on the thread pool without touching the actor.


5. Measured results — 809 Sony A1 ARW files, external 800 MB/s SSD

| Operation | Before all fixes | After all fixes |
| --- | --- | --- |
| EXIF extraction | ~60 s | ~3–4 s |
| Focus point extraction | ~50 s | < 1 s |
| Scan total (Phase 1) | ~60 s (sequential bottleneck) | ~6–7 s |
| Thumbnail generation, cold (Phase 2) | ~161 s | ~10–15 s |
| Thumbnail generation, cached (Phase 2) | ~63 s | ~instant |
| Total, first run | ~4 min | ~20 s |
| Total, second run | ~63 s | ~7 s |

6. Key architectural lesson

Swift 6 with SWIFT_DEFAULT_ACTOR_ISOLATION = MainActor isolates all methods to the main actor by default. A method that is pure and stateless — no reads or writes of actor-isolated properties — must be explicitly annotated nonisolated to run on the cooperative thread pool. Without this annotation, task group tasks that call it will silently queue on the actor’s serial executor, reducing a withTaskGroup to the performance of a for loop.

The pattern to check in any actor with a task group:

// If this is called with `await self.method()` inside group.addTask { }:
private func method() -> T { /* ... */ }
//                    ↑
// Does this method read or write any stored property of the actor?
// If NO → mark it nonisolated. The await and actor hop are unnecessary.
// If YES → it must remain isolated; redesign the data flow instead.

The same applies to @MainActor progress callbacks: never await them inside a processing loop. Fire-and-forget with Task { @MainActor in } keeps the pipeline moving and lets the UI update at its own pace.
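
The rule can be illustrated with a toy actor — `parse` touches no actor state, so marking it nonisolated lets task group children run it directly on the cooperative pool, while `record` reads actor state and must stay isolated (names here are illustrative, not RawCull code):

```swift
import Foundation

actor Scanner {
    private var scanned = 0

    // Pure: no actor state touched → nonisolated, no hop, no await.
    nonisolated func parse(_ url: URL) -> String {
        url.deletingPathExtension().lastPathComponent
    }

    // Touches actor state → must remain isolated.
    func record() { scanned += 1 }

    func scanAll(_ urls: [URL]) async -> [String] {
        await withTaskGroup(of: String.self) { group in
            for url in urls {
                group.addTask { self.parse(url) }  // runs truly concurrently
            }
            var names: [String] = []
            for await name in group { names.append(name) }
            return names
        }
    }
}
```

If `parse` lost its `nonisolated` annotation, each `addTask` body would need `await self.parse(url)` and every child task would queue on the Scanner's serial executor — exactly the Bug 2 failure mode.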

Sony MakerNote Parser

Sony MakerNote Parser — Focus Point Extraction

Files covered:

  • RawCull/Enum/SonyMakerNoteParser.swift
  • RawCull/Model/ViewModels/FocusPointsModel.swift
  • RawCull/Actors/ScanFiles.swift
  • RawCull/Views/FocusPoints/FocusOverlayView.swift

Overview

RawCull extracts autofocus (AF) focus point coordinates directly from Sony ARW raw files without requiring any external tools such as exiftool. The parser targets the Sony A1 (ILCE-1) and A1 Mark II (ILCE-1M2) cameras.

The focus location is stored inside the Sony proprietary MakerNote block embedded in the EXIF data of every ARW file. The SonyMakerNoteParser struct navigates the binary TIFF structure to locate and decode this data.


ARW File Structure

Sony ARW is a TIFF-based format (typically little-endian). EXIF and MakerNote data are embedded within the standard TIFF IFD chain:

TIFF Header
  └── IFD0
        └── ExifIFD  (tag 0x8769)
              └── MakerNote  (tag 0x927C)
                    └── Sony MakerNote IFD
                          └── FocusLocation  (tag 0x2027)

Tag 0x2027 (FocusLocation) holds four int16u values:

| Index | Meaning |
| --- | --- |
| 0 | Image width (sensor pixels) |
| 1 | Image height (sensor pixels) |
| 2 | Focus point X coordinate |
| 3 | Focus point Y coordinate |

The origin is the top-left corner of the sensor. Values are already in full sensor pixel space — no scaling is required. Tag 0x204a is a redundant copy of the same data (within one pixel) and is used as a fallback.
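
Decoding the 8-byte payload is a straight read of four little-endian UInt16 values. A standalone sketch of the layout (the real parser reads these through its endian-aware helpers):

```swift
/// Decodes a tag 0x2027 payload: [width, height, x, y] as int16u,
/// little-endian, 8 bytes total.
func decodeFocusLocation(_ payload: [UInt8])
    -> (width: Int, height: Int, x: Int, y: Int)?
{
    guard payload.count >= 8 else { return nil }
    func u16(_ i: Int) -> Int { Int(payload[i]) | Int(payload[i + 1]) << 8 }
    return (u16(0), u16(2), u16(4), u16(6))
}

// Hypothetical payload: width 8640 (0x21C0), height 5760 (0x1680),
// focus at (4321, 2880) — bytes are low byte first:
let bytes: [UInt8] = [0xC0, 0x21, 0x80, 0x16, 0xE1, 0x10, 0x40, 0x0B]
// decodeFocusLocation(bytes) → (8640, 5760, 4321, 2880)
```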

Note: Tag 0x9400 (AFInfo) is an enciphered binary block and is not used for focus location.


SonyMakerNoteParser

The public interface is a single static method:

struct SonyMakerNoteParser {
    /// Returns "width height x y" calibrated for the Sony A1 sensor.
    nonisolated static func focusLocation(from url: URL) -> String? {
        // Read only the first 4 MB. Sony ARW MakerNote metadata sits well
        // within that range; loading the full ~50 MB RAW file is wasteful
        // on external storage where mmap is unavailable and
        // Data(contentsOf:) falls back to a full read.
        guard let fh = try? FileHandle(forReadingFrom: url) else { return nil }
        defer { try? fh.close() }
        guard let data = try? fh.read(upToCount: 4 * 1024 * 1024),
              let result = TIFFParser(data: data)?.parseSonyFocusLocation()
        else { return nil }
        return "\(result.width) \(result.height) \(result.x) \(result.y)"
    }
}

Only the first 4 MB of the file is read. Sony ARW MakerNote metadata resides within the first 1–2 MB of the file; the remainder is RAW image data that is never needed for focus point extraction. Using FileHandle.read(upToCount:) instead of Data(contentsOf:options:) with .mappedIfSafe avoids a full ~50 MB file read on external storage, where memory-mapping (mmap) is not available and the system silently falls back to loading the entire file.

The result is a space-separated string: "width height x y".


TIFFParser — Binary Navigation

The private TIFFParser struct does all binary parsing work.

Byte Order Detection

init?(data: Data) {
    guard data.count >= 8 else { return nil }
    let b0 = data[0], b1 = data[1]
    if b0 == 0x49 && b1 == 0x49 { le = true }        // "II" — little-endian
    else if b0 == 0x4D && b1 == 0x4D { le = false }  // "MM" — big-endian
    else { return nil }
    self.data = data
}

Sony ARW files are little-endian (II), but the parser handles both byte orders via readU16 and readU32 helpers.
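
The helpers themselves are not shown in the parser excerpt; a sketch with explicit bounds checks might look like this (the real readU16 and readU32 may differ in signature):

```swift
import Foundation

struct ByteReader {
    let data: Data
    let le: Bool   // true for "II" little-endian files

    func readU16(at offset: Int) -> UInt16? {
        guard offset >= 0, offset + 2 <= data.count else { return nil }
        let lo = UInt16(data[data.startIndex + offset])
        let hi = UInt16(data[data.startIndex + offset + 1])
        return le ? lo | hi << 8 : hi | lo << 8
    }

    func readU32(at offset: Int) -> UInt32? {
        // A 32-bit value is two 16-bit halves; which half is most
        // significant depends on byte order.
        guard let a = readU16(at: offset),
              let b = readU16(at: offset + 2) else { return nil }
        return le ? UInt32(a) | UInt32(b) << 16
                  : UInt32(b) | UInt32(a) << 16
    }
}
```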

Focus Location Navigation

func parseSonyFocusLocation() -> (width: Int, height: Int, x: Int, y: Int)? {
    guard let ifd0 = readU32(at: 4).map(Int.init) else { return nil }

    // Navigate: IFD0 → ExifIFD → MakerNote IFD
    guard let exifIFD = subIFDOffset(in: ifd0, tag: 0x8769),
          let (mnOffset, _) = tagDataRange(in: exifIFD, tag: 0x927C)
    else { return nil }

    let ifdStart = sonyIFDStart(at: mnOffset)

    // Try tag 0x2027 first, fall back to 0x204a
    let flTag: UInt16 = tagDataRange(in: ifdStart, tag: 0x2027) != nil
        ? 0x2027 : 0x204a
    guard let (flOffset, flSize) = tagDataRange(in: ifdStart, tag: flTag),
          flSize >= 8
    else { return nil }

    let width  = Int(readU16(at: flOffset + 0))
    let height = Int(readU16(at: flOffset + 2))
    let x      = Int(readU16(at: flOffset + 4))
    let y      = Int(readU16(at: flOffset + 6))

    guard width > 0, height > 0, x > 0 || y > 0 else { return nil }
    return (width, height, x, y)
}

The IFD0 offset is read from bytes 4–7 of the TIFF header (standard TIFF). The parser then follows each IFD pointer in sequence until the Sony MakerNote IFD is reached.

Sony MakerNote Header

Some Sony files prefix the MakerNote IFD with a 12-byte ASCII header "SONY DSC " (9 bytes) followed by 3 null bytes. The parser detects and skips it by checking the raw bytes directly — endian-aware integer reads are not used for ASCII magic:

private func sonyIFDStart(at offset: Int) -> Int {
    guard offset + 12 <= data.count else { return offset }
    // Check for "SONY DSC " ASCII prefix (9 bytes + 3 null pad = 12 bytes).
    // Read raw bytes — do not use endian-aware readU32 for ASCII magic.
    let isSony = data[offset]   == 0x53 &&  // S
                 data[offset+1] == 0x4F &&  // O
                 data[offset+2] == 0x4E &&  // N
                 data[offset+3] == 0x59     // Y
    return isSony ? offset + 12 : offset
}

IFD Entry Parsing

Each IFD entry is 12 bytes: 2 bytes tag, 2 bytes type, 4 bytes count, 4 bytes value/offset.

private func tagDataRange(in ifdOffset: Int, tag: UInt16)
    -> (dataOffset: Int, byteCount: Int)?
{
    let entryCount = Int(readU16(at: ifdOffset))
    for i in 0 ..< entryCount {
        let e = ifdOffset + 2 + i * 12
        guard e + 12 <= data.count else { break }
        if readU16(at: e) == tag {
            let type  = Int(readU16(at: e + 2))
            let count = Int(readU32(at: e + 4) ?? 0)
            let sizes = [0,1,1,2,4,8,1,1,2,4,8,4,8,4]
            let bytes = count * (type < sizes.count ? sizes[type] : 1)

            if bytes <= 4 { return (e + 8, bytes) }   // inline value
            guard let ptr = readU32(at: e + 8) else { return nil }
            // A1 / A1 II MakerNote IFD entries use absolute file offsets
            // (not relative to MakerNote start) per ExifTool ProcessExif.
            return (Int(ptr), bytes)
        }
    }
    return nil
}

Sony A1 and A1 II MakerNote IFD entries use absolute file offsets, consistent with ExifTool’s ProcessExif behaviour. The type-size table covers all 14 standard TIFF types (index 0–13).


Data Models

FocusPoint

The parsed string "width height x y" is converted into a typed FocusPoint struct:

struct FocusPoint: Identifiable {
    let sensorWidth: CGFloat
    let sensorHeight: CGFloat
    let x: CGFloat
    let y: CGFloat

    var normalizedX: CGFloat { x / sensorWidth }
    var normalizedY: CGFloat { y / sensorHeight }
}

Normalized coordinates (0.0–1.0) are used for rendering, making the marker position independent of the display image resolution.
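
Mapping a normalized point onto an aspect-fit image view involves compensating for letterboxing. The math below is an assumption about how FocusOverlayView might position the marker; the key point is that normalized coordinates survive any display resolution:

```swift
import CoreGraphics

/// Converts a normalized focus point (0.0–1.0) to a point in view
/// coordinates for an image displayed with aspect-fit scaling.
func markerPosition(normalizedX: CGFloat, normalizedY: CGFloat,
                    imageSize: CGSize, viewSize: CGSize) -> CGPoint {
    // Aspect-fit: scale by the limiting dimension.
    let scale = min(viewSize.width / imageSize.width,
                    viewSize.height / imageSize.height)
    let fitted = CGSize(width: imageSize.width * scale,
                        height: imageSize.height * scale)
    // Letterbox offset centers the fitted image in the view.
    let origin = CGPoint(x: (viewSize.width - fitted.width) / 2,
                         y: (viewSize.height - fitted.height) / 2)
    return CGPoint(x: origin.x + normalizedX * fitted.width,
                   y: origin.y + normalizedY * fitted.height)
}
```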


Integration in ScanFiles

Focus points are extracted during the catalog scan in ScanFiles using a concurrent withTaskGroup, after the EXIF task group has completed:

decodedFocusPoints = await extractNativeFocusPoints(from: result)
    ?? decodeFocusPointsJSON(from: url)

Native extraction is attempted first. If no ARW files in the catalog yield a result (e.g. non-A1 cameras or files captured before the feature was added), the actor falls back to reading a focuspoints.json sidecar file from the same directory.

private func extractNativeFocusPoints(from items: [FileItem]) async
    -> [DecodeFocusPoints]?
{
    let collected = await withTaskGroup(of: DecodeFocusPoints?.self) { group in
        for item in items {
            group.addTask {
                guard let location = SonyMakerNoteParser.focusLocation(from: item.url)
                else { return nil }
                // sourceFile must equal file.name — getFocusPoints() matches
                // on filename only
                return DecodeFocusPoints(
                    sourceFile: item.url.lastPathComponent,
                    focusLocation: location
                )
            }
        }
        var results: [DecodeFocusPoints] = []
        for await result in group {
            if let r = result { results.append(r) }
        }
        return results
    }
    return collected.isEmpty ? nil : collected
}

SonyMakerNoteParser.focusLocation is nonisolated static, so task group tasks invoke it directly on the cooperative thread pool without hopping back to the ScanFiles actor’s serial executor. All files are parsed concurrently.


Visualization

Focus point markers are rendered as corner brackets over the image using a custom SwiftUI Shape:

struct FocusPointMarker: Shape {
    let normalizedX: CGFloat
    let normalizedY: CGFloat
    let boxSize: CGFloat

    func path(in rect: CGRect) -> Path {
        let cx = normalizedX * rect.width
        let cy = normalizedY * rect.height
        let half = boxSize / 2
        let bracket = boxSize * 0.28
        // Draws 8 corner bracket lines around the focus position
        
    }
}

The marker size is user-adjustable (32–100 px) via a slider in FocusPointControllerView.


End-to-End Flow

flowchart TD
    A[ARW file on disk] --> B[SonyMakerNoteParser.focusLocation]
    B --> C[FileHandle.read — first 4 MB only]
    C --> D[TIFFParser — detect byte order II/MM]
    D --> E[Navigate IFD0 → ExifIFD 0x8769]
    E --> F[Navigate to MakerNote 0x927C]
    F --> G[Skip optional SONY DSC header 12 bytes]
    G --> H{Tag 0x2027 present?}
    H -- yes --> I[Read FocusLocation tag 0x2027]
    H -- no  --> J[Read fallback tag 0x204a]
    I --> K[Decode int16u x4: width, height, x, y]
    J --> K
    K --> L[Return string: width height x y]
    L --> M[ScanFiles.extractNativeFocusPoints — withTaskGroup]
    M --> N[FocusPoint with normalizedX / normalizedY]
    N --> O[FocusOverlayView — corner bracket marker on image]

Key Technical Points

| Topic | Detail |
| --- | --- |
| File format | Sony ARW is TIFF-based, typically little-endian |
| Focus tag | 0x2027 (FocusLocation), fallback 0x204a |
| Data format | int16u[4] — width, height, x, y in sensor pixels |
| File read | First 4 MB only via FileHandle — MakerNote metadata is within the first 1–2 MB |
| Pointer base | Sony A1 / A1 II MakerNote pointers are absolute file offsets |
| MakerNote header | Optional 12-byte "SONY DSC " prefix detected by raw byte comparison |
| Encrypted tag | 0x9400 (AFInfo) is enciphered and not used |
| Concurrency | extractNativeFocusPoints uses withTaskGroup; focusLocation is nonisolated static |
| Fallback | focuspoints.json sidecar used when native parsing yields no results |
| Coordinates | Origin top-left; normalized to 0.0–1.0 before rendering |

Concurrency model

Concurrency Model — RawCull

Files covered:

  • RawCull/Model/ViewModels/RawCullViewModel.swift
  • RawCull/Views/RawCullSidebarMainView/extension+RawCullView.swift
  • RawCull/Actors/ScanFiles.swift
  • RawCull/Actors/ScanAndCreateThumbnails.swift
  • RawCull/Actors/ExtractAndSaveJPGs.swift
  • RawCull/Actors/ThumbnailLoader.swift
  • RawCull/Actors/RequestThumbnail.swift
  • RawCull/Actors/SharedMemoryCache.swift
  • RawCull/Actors/DiskCacheManager.swift
  • RawCull/Actors/SaveJPGImage.swift
  • RawCull/Enum/SonyThumbnailExtractor.swift
  • RawCull/Enum/JPGSonyARWExtractor.swift

Overview

RawCull uses Swift structured concurrency (async/await, Task, TaskGroup, and actor) across four primary flows:

| Flow | Entry point | Core actor(s) | Purpose |
| --- | --- | --- | --- |
| Catalog scan | RawCullViewModel.handleSourceChange(url:) | ScanFiles | Scan ARW files, extract metadata, load focus points |
| Thumbnail preload | RawCullViewModel.handleSourceChange(url:) | ScanAndCreateThumbnails | Bulk-populate the thumbnail cache for a selected catalog |
| JPG extraction | extension+RawCullView.extractAllJPGS() | ExtractAndSaveJPGs | Extract embedded JPEG previews and save to disk |
| On-demand thumbnails | UI grid + detail views | ThumbnailLoader, RequestThumbnail | Rate-limited, cached per-file thumbnail retrieval |

The two long-running operations (thumbnail preload and JPG extraction) share a two-level task pattern:

  1. An outer Task created from the ViewModel or View layer.
  2. An inner Task stored inside the actor, which owns the real work and cancellation handle.

This split keeps UI responsive: handleSourceChange is @MainActor but async — when it awaits the outer Task, the main actor is free to handle other work while the task’s body runs on the ScanAndCreateThumbnails actor. The inner task runs heavy I/O and image work on actor and cooperative thread-pool queues. Cancellation requires calling both levels.
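
The two-level pattern can be sketched as follows. The names are illustrative, not the actual RawCull API; the point is that the actor owns the inner task so work can be cancelled from either side:

```swift
import Foundation

actor PreloadWorker {
    private var innerTask: Task<Int, Never>?

    func preload(_ urls: [URL]) async -> Int {
        innerTask?.cancel()                 // cancel any prior run first
        let task = Task {
            var done = 0
            for url in urls {
                if Task.isCancelled { break }
                _ = url                     // heavy per-file work goes here
                done += 1
            }
            return done
        }
        innerTask = task
        return await task.value
    }

    func cancelPreload() {
        innerTask?.cancel()
    }
}

// Caller (e.g. a ViewModel) keeps the outer handle:
//   outerTask = Task { count = await worker.preload(urls) }
// Full cancellation touches both levels:
//   outerTask?.cancel()
//   await worker.cancelPreload()
```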


1. Catalog Scan — ScanFiles

1.1 Entry point

RawCullViewModel.handleSourceChange(url:) is @MainActor and is called whenever the user selects a new catalog. It triggers the scan before any thumbnail work starts.

1.2 Scan flow

ScanFiles.scanFiles(url:onProgress:) runs on the ScanFiles actor:

  1. Opens the directory with security-scoped resource access.
  2. Uses withTaskGroup to process all ARW files in parallel.
  3. For each file, a task reads URLResourceValues (name, size, content type, modification date) and calls extractExifData(from:).
  4. After the group finishes, resolves focus points via a two-stage fallback:
    • Native extraction first: extractNativeFocusPoints(from:) runs a withTaskGroup over all FileItems, calling SonyMakerNoteParser.focusLocation(from:) on each ARW file.
    • JSON fallback: if native extraction yields no results, decodeFocusPointsJSON(from:) reads focuspoints.json synchronously from the same directory.
  5. Returns [FileItem].
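The fan-out in step 2 can be sketched like this; FileMeta and readMeta are hypothetical stand-ins for FileItem and the URLResourceValues/EXIF reads:

```swift
import Foundation

// Sketch of per-file parallel scanning with withTaskGroup.
// FileMeta/readMeta are hypothetical stand-ins for FileItem and
// extractExifData(from:).
struct FileMeta: Sendable {
    let name: String
    let size: Int
}

func readMeta(name: String) -> FileMeta {
    // Placeholder for reading URLResourceValues and EXIF properties.
    FileMeta(name: name, size: name.utf8.count)
}

func scan(names: [String]) async -> [FileMeta] {
    await withTaskGroup(of: FileMeta.self) { group in
        for name in names {
            group.addTask { readMeta(name: name) }  // one child task per file
        }
        var items: [FileMeta] = []
        for await meta in group {                   // collect in completion order
            items.append(meta)
        }
        return items
    }
}
```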

extractExifData(from:) reads EXIF data via CGImageSourceCopyPropertiesAtIndex and formats:

  • Shutter speed (e.g., "1/1000" or "2.5s")
  • Focal length (e.g., "50.0mm")
  • Aperture (e.g., "ƒ/2.8")
  • ISO (e.g., "ISO 400")
  • Camera model (from TIFF dictionary)
  • Lens model (from EXIF dictionary)

RawCullViewModel then calls ScanFiles.sortFiles(_:by:searchText:) (@concurrent nonisolated, runs on the cooperative thread pool), updates files and filteredFiles on the main actor, and maps decoded focus points to FocusPointsModel objects.


2. Thumbnail Preload — ScanAndCreateThumbnails

2.1 How the task starts

RawCullViewModel.handleSourceChange(url:) is the entry point (@MainActor).

Step-by-step:

  1. Skip duplicates: processedURLs: Set<URL> prevents re-processing a catalog URL already handled in this session.
  2. Fetch settings: SettingsViewModel.shared.asyncgetsettings() provides thumbnailSizePreview and thumbnailCostPerPixel.
  3. Build handlers: CreateFileHandlers().createFileHandlers(...) wires up four @MainActor @Sendable closures:
    • fileHandler(Int) — progress count
    • maxfilesHandler(Int) — total file count
    • estimatedTimeHandler(Int) — ETA in seconds
    • memorypressurewarning(Bool) — memory pressure state for UI
  4. Create actor: ScanAndCreateThumbnails() is instantiated and handlers injected.
  5. Store actor reference: currentScanAndCreateThumbnailsActor is set so abort() can reach it.
  6. Create outer Task on the ViewModel:
preloadTask = Task {
    await scanAndCreateThumbnails.preloadCatalog(
        at: url,
        targetSize: thumbnailSizePreview
    )
}
await preloadTask?.value

The await suspends handleSourceChange (freeing the main actor while the preload runs on the ScanAndCreateThumbnails actor) until the preload finishes or is cancelled.

2.2 Inside the actor

preloadCatalog(at:targetSize:) runs on the ScanAndCreateThumbnails actor:

  1. One-time setup: ensureReady() calls SharedMemoryCache.shared.ensureReady() and fetches settings via a setupTask gate (preventing duplicate initialization from concurrent callers).
  2. Cancel prior work: cancelPreload() cancels and nils any existing inner task.
  3. Discover files: Enumerate ARW files non-recursively via DiscoverFiles.
  4. Notify max: fileHandlers?.maxfilesHandler(files.count) updates the progress bar maximum.
  5. Create inner Task<Int, Never>: stored as preloadTask on the actor.
  6. Bounded withTaskGroup: caps parallelism at ProcessInfo.processInfo.activeProcessorCount * 2 using index-based back-pressure and per-iteration cancellation checks:
for (index, url) in urls.enumerated() {
    if Task.isCancelled {
        group.cancelAll()
        break
    }
    if index >= maxConcurrent {
        await group.next()
    }
    group.addTask {
        await self.processSingleFile(url, targetSize: targetSize, itemIndex: index)
    }
}
await group.waitForAll()

2.3 Per-file processing and cancellation points

processSingleFile(_:targetSize:itemIndex:) follows the three-tier cache lookup and checks Task.isCancelled at every expensive step:

Step | Cancellation check | Action on cancel
Before RAM lookup | Task.isCancelled | Return immediately
After RAM hit confirmed | Task.isCancelled | Skip remaining work
Before disk lookup | Task.isCancelled | Return immediately
Before source extraction | Task.isCancelled | Return immediately
After extraction completes | Task.isCancelled | Skip caching and disk write

On extraction success:

  1. Call cgImageToNormalizedNSImage(_:) — converts CGImage to an NSImage backed by a single JPEG representation (quality 0.7). This normalization ensures memory and disk representations are consistent.
  2. storeInMemoryCache(_:for:) — creates DiscardableThumbnail with pixel-accurate cost and stores in SharedMemoryCache.
  3. Encode jpegData and call diskCache.save(_:for:) — this is a detached background task. The closure captures diskCache directly to avoid retaining the actor.
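Step 3 can be sketched as follows. DiskCache and its save/load API are hypothetical, but the key point from the real code carries over: the detached closure captures the cache object, not the actor.

```swift
import Foundation

// DiskCache is a hypothetical stand-in for DiskCacheManager.
final class DiskCache: @unchecked Sendable {
    private let queue = DispatchQueue(label: "disk.cache")
    private var store: [String: Data] = [:]

    func save(_ data: Data, for key: String) {
        queue.sync { store[key] = data }         // serialized access to the store
    }

    func load(for key: String) -> Data? {
        queue.sync { store[key] }
    }
}

actor ThumbnailStore {
    let diskCache = DiskCache()

    func cacheToDisk(_ jpegData: Data, key: String) {
        let cache = diskCache                    // capture the cache, not `self`,
        Task.detached(priority: .background) {   // so the background write does
            cache.save(jpegData, for: key)       // not retain the actor
        }
    }
}
```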

2.4 Request coalescing

ScanAndCreateThumbnails.thumbnail(for:targetSize:) exposes an async lookup for direct per-file requests. It calls resolveImage(for:targetSize:), which adds in-flight task coalescing via inflightTasks: [URL: Task<CGImage, Error>]:

  1. Check RAM cache.
  2. Check disk cache.
  3. If inflightTasks[url] exists, await it — multiple callers share the same work.
  4. Otherwise, create a new unstructured Task inside the actor, store it in inflightTasks, extract and cache the thumbnail, then remove the entry when done.

This prevents duplicate extraction work when multiple UI elements request the same file simultaneously.
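The coalescing step can be sketched generically; names are illustrative, and the real map stores Task<CGImage, Error> keyed by URL:

```swift
import Foundation

// Hypothetical coalescer demonstrating the inflightTasks pattern.
actor Coalescer {
    private var inflight: [String: Task<Int, Never>] = [:]
    private(set) var producerRuns = 0            // how many times real work started

    func value(for key: String,
               produce: @escaping @Sendable () async -> Int) async -> Int {
        if let existing = inflight[key] {
            return await existing.value          // join the in-flight work
        }
        let task = Task { await produce() }
        inflight[key] = task                     // registered before any suspension
        producerRuns += 1
        let result = await task.value
        inflight[key] = nil                      // clear the entry when done
        return result
    }
}
```

Because the dictionary insert happens before the first suspension point, concurrent callers entering the actor afterwards always find and join the in-flight task.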


3. JPG Extraction — ExtractAndSaveJPGs

3.1 How the task starts

extension+RawCullView.extractAllJPGS() creates an unstructured outer task from the View layer:

Task {
    viewModel.creatingthumbnails = true

    let handlers = CreateFileHandlers().createFileHandlers(
        fileHandler: viewModel.fileHandler,
        maxfilesHandler: viewModel.maxfilesHandler,
        estimatedTimeHandler: viewModel.estimatedTimeHandler,
        memorypressurewarning: { _ in },
    )

    let extract = ExtractAndSaveJPGs()
    await extract.setFileHandlers(handlers)
    viewModel.currentExtractAndSaveJPGsActor = extract

    guard let url = viewModel.selectedSource?.url else { return }
    await extract.extractAndSaveAlljpgs(from: url)

    viewModel.currentExtractAndSaveJPGsActor = nil
    viewModel.creatingthumbnails = false
}

Unlike the preload flow, the outer task is not stored on the ViewModel. Cancellation is driven entirely through the actor reference via viewModel.abort().

3.2 Inside the actor

extractAndSaveAlljpgs(from:) mirrors the preload pattern exactly:

  1. Cancel any previous inner task via cancelExtractJPGSTask().
  2. Discover all ARW files (non-recursive).
  3. Create a Task<Int, Never> stored as extractJPEGSTask.
  4. Use withThrowingTaskGroup with activeProcessorCount * 2 concurrency cap and the same index-based back-pressure pattern as ScanAndCreateThumbnails (cancellation check + group.cancelAll(), index guard before group.next(), group.waitForAll() to drain).
  5. Call processSingleExtraction(_:itemIndex:) per file.

processSingleExtraction checks cancellation before and after JPGSonyARWExtractor.jpgSonyARWExtractor(from:fullSize:), then writes the result via SaveJPGImage().save(image:originalURL:).

SaveJPGImage.save is @concurrent nonisolated and runs on the cooperative thread pool. It:

  • Replaces the .arw extension with .jpg
  • Uses CGImageDestinationCreateWithURL with JPEG quality 1.0
  • Logs success/failure with image dimensions and file paths
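The path-derivation step can be sketched on its own (the CGImageDestination encode itself is ImageIO-specific and omitted here):

```swift
import Foundation

// Sketch of SaveJPGImage's output-path derivation: replace .arw with .jpg
// next to the original file. The ImageIO encode step is omitted.
func jpgDestination(forARW originalURL: URL) -> URL {
    originalURL
        .deletingPathExtension()                 // drop .arw
        .appendingPathExtension("jpg")           // add .jpg in the same directory
}
```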

4. Rate-Limited On-Demand Loading

4.1 ThumbnailLoader

ThumbnailLoader is a shared actor that enforces a maximum of 6 concurrent thumbnail loads. Excess requests suspend via CheckedContinuation and are queued:

actor ThumbnailLoader {
    static let shared = ThumbnailLoader()
    private let maxConcurrent = 6
    private var activeTasks = 0
    private var pendingContinuations: [(id: UUID, continuation: CheckedContinuation<Void, Never>)] = []
}

acquireSlot() flow:

  1. If activeTasks < maxConcurrent: increment activeTasks, return immediately.
  2. Otherwise: call withCheckedContinuation { continuation in ... } — this suspends the caller.
  3. A cancellation handler is registered to remove the pending continuation by ID so it is never resumed after cancellation.

releaseSlot() flow:

  1. Decrement activeTasks.
  2. If pendingContinuations is non-empty, pop the first and resume() it.
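The two flows above can be sketched together. This simplified limiter hands a released slot directly to the oldest waiter and omits the UUID-based cancellation bookkeeping of the real ThumbnailLoader:

```swift
import Foundation

// Simplified sketch of the slot-limiting pattern; not ThumbnailLoader's
// exact implementation.
actor SlotLimiter {
    private let maxConcurrent: Int
    private var activeTasks = 0
    private var pending: [CheckedContinuation<Void, Never>] = []

    init(maxConcurrent: Int = 6) { self.maxConcurrent = maxConcurrent }

    func acquireSlot() async {
        if activeTasks < maxConcurrent {
            activeTasks += 1                     // fast path: slot free
            return
        }
        await withCheckedContinuation { pending.append($0) }
        // Resumed by releaseSlot(); the slot count was handed over intact.
    }

    func releaseSlot() {
        if pending.isEmpty {
            activeTasks -= 1                     // no waiter: free the slot
        } else {
            pending.removeFirst().resume()       // hand the slot to a waiter
        }
    }

    func currentlyActive() -> Int { activeTasks }
}
```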

thumbnailLoader(file:) flow:

func thumbnailLoader(file: FileItem) async -> NSImage? {
    await acquireSlot()
    defer { releaseSlot() }
    guard !Task.isCancelled else { return nil }
    let settings = await getSettings()
    let cgThumb = await RequestThumbnail().requestThumbnail(
        for: file.url,
        targetSize: settings.thumbnailSizePreview
    )
    guard !Task.isCancelled else { return nil }
    if let cgThumb {
        return NSImage(cgImage: cgThumb, size: .zero)
    }
    return nil
}

Settings are cached on the actor to avoid repeated SettingsViewModel calls. The result is wrapped as NSImage(cgImage:size:.zero) before returning to the caller.

4.2 RequestThumbnail

RequestThumbnail handles the per-file cache resolution path for the on-demand flow:

  1. ensureReady() — same setupTask gate pattern as ScanAndCreateThumbnails.
  2. RAM cache lookup via SharedMemoryCache.object(forKey:); on hit, calls SharedMemoryCache.updateCacheMemory() for statistics.
  3. Disk cache lookup via DiskCacheManager.load(for:); on hit, calls SharedMemoryCache.updateCacheDisk() for statistics.
  4. Extraction fallback: SonyThumbnailExtractor.extractSonyThumbnail(from:maxDimension:qualityCost:).
  5. Store in RAM cache via storeInMemory(_:for:).
  6. Schedule disk save via a detached background task.

requestThumbnail(for:targetSize:) returns CGImage? for direct use by SwiftUI views. All errors are caught and logged; the method returns nil on failure.

nsImageToCGImage(_:) is async and tries cgImage(forProposedRect:) first; if that fails, it falls back to a TIFF round-trip on a Task.detached(priority: .utility) task to avoid blocking the actor with CPU-bound work.
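The offload pattern can be sketched generically; expensiveConvert below is a hypothetical stand-in for the TIFF round-trip, and the point is the detached utility-priority hop that keeps the calling actor's executor free:

```swift
import Foundation

// expensiveConvert stands in for CPU-bound conversion work.
func expensiveConvert(_ bytes: [UInt8]) -> Int {
    bytes.reduce(0) { $0 &+ Int($1) }            // placeholder CPU work
}

func convertOffActor(_ bytes: [UInt8]) async -> Int {
    await Task.detached(priority: .utility) {
        expensiveConvert(bytes)                  // runs outside the caller's
    }.value                                      // actor isolation
}
```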


5. Task Ownership and Handles

Layer | Owner | Handle name | Type
Outer task (preload) | RawCullViewModel | preloadTask | Task<Void, Never>?
Inner task (preload) | ScanAndCreateThumbnails | preloadTask | Task<Int, Never>?
Outer task (extract) | View (extractAllJPGS) | not stored | Task<Void, Never>
Inner task (extract) | ExtractAndSaveJPGs | extractJPEGSTask | Task<Int, Never>?
Slot queue (on-demand) | ThumbnailLoader.shared | pendingContinuations | [(UUID, CheckedContinuation)]

6. Cancellation

6.1 abort()

RawCullViewModel.abort() is the single cancellation entry point for user-initiated stops:

func abort() {
    preloadTask?.cancel()
    preloadTask = nil

    if let actor = currentScanAndCreateThumbnailsActor {
        Task { await actor.cancelPreload() }
    }
    currentScanAndCreateThumbnailsActor = nil

    if let actor = currentExtractAndSaveJPGsActor {
        Task { await actor.cancelExtractJPGSTask() }
    }
    currentExtractAndSaveJPGsActor = nil

    creatingthumbnails = false
}

6.2 Why both levels matter

Cancelling the outer Task propagates into child structured tasks, but does not automatically cancel the actor’s inner Task. The inner task is unstructured (Task { ... } created inside the actor) — it is not a child of the outer task. The actor-specific cancel methods (cancelPreload, cancelExtractJPGSTask) must be explicitly called to cancel the inner task and allow the withTaskGroup to unwind.
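The distinction can be demonstrated with a small sketch (Worker is hypothetical): cancelling the outer task that awaits the actor never reaches the unstructured inner task.

```swift
import Foundation

// Hypothetical Worker showing that an unstructured inner Task survives
// cancellation of the outer Task that awaits it.
actor Worker {
    private var inner: Task<Bool, Never>?

    func run() async -> Bool {
        let task = Task<Bool, Never> {           // unstructured: NOT a child task
            try? await Task.sleep(nanoseconds: 200_000_000)
            return Task.isCancelled              // did the inner task see cancel?
        }
        inner = task
        return await task.value                  // the outer caller awaits here
    }

    /// The explicit second-level cancel that abort() must invoke.
    func cancelInner() {
        inner?.cancel()
        inner = nil
    }
}
```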

6.3 ThumbnailLoader.cancelAll()

cancelAll() resumes all pending continuations immediately, unblocking any tasks waiting for a slot. This is called during teardown to prevent suspension leaks.


7. ETA Estimation

Both long-running actors compute a rolling ETA estimate:

Algorithm:

  1. Record a timestamp before each file starts processing.
  2. After completion, compute delta = now - lastItemTime.
  3. Append delta to processingTimes array.
  4. Keep only the last 10 items in the array.
  5. After collecting minimumSamplesBeforeEstimation (10) items, calculate:
avgTime = sum(processingTimes) / processingTimes.count
remaining = (totalFiles - itemsProcessed) * avgTime
  6. Only update the displayed ETA if remaining < lastEstimatedSeconds — this prevents the counter from jumping upward when a slow file takes longer than expected.

Actor | Minimum samples threshold
ScanAndCreateThumbnails | minimumSamplesBeforeEstimation = 10
ExtractAndSaveJPGs | estimationStartIndex = 10
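The algorithm can be sketched as a pure helper; the names approximate those used by the actors:

```swift
import Foundation

// Pure-function sketch of the rolling ETA estimate described above.
struct ETAEstimator {
    private var processingTimes: [Double] = []
    private var lastEstimatedSeconds = Int.max
    private let windowSize = 10
    private let minimumSamplesBeforeEstimation = 10

    /// Records one per-item duration and returns the ETA to display,
    /// or nil while too few samples have been collected.
    mutating func record(delta: Double, itemsProcessed: Int, totalFiles: Int) -> Int? {
        processingTimes.append(delta)
        if processingTimes.count > windowSize {
            processingTimes.removeFirst()        // keep only the last 10 samples
        }
        guard processingTimes.count >= minimumSamplesBeforeEstimation else { return nil }
        let avgTime = processingTimes.reduce(0, +) / Double(processingTimes.count)
        let remaining = Int(Double(totalFiles - itemsProcessed) * avgTime)
        if remaining < lastEstimatedSeconds {
            lastEstimatedSeconds = remaining     // the counter never jumps upward
        }
        return lastEstimatedSeconds
    }
}
```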

8. Actor Isolation and Thread Safety

Component | Isolation strategy
ScanAndCreateThumbnails, ExtractAndSaveJPGs, ScanFiles | All mutable state is actor-isolated; mutations only through actor methods
SharedMemoryCache | nonisolated(unsafe) for NSCache (thread-safe by design); all statistics and config remain actor-isolated
DiskCacheManager | Actor-isolates path calculation and coordination; actual file I/O runs in detached tasks
ThumbnailLoader | Actor-isolated slot counter and continuation queue
DiscardableThumbnail | @unchecked Sendable with OSAllocatedUnfairLock protecting (isDiscarded, accessCount)
CacheDelegate | @unchecked Sendable; willEvictObject is called synchronously by NSCache; increments are dispatched to an isolated EvictionCounter actor
RawCullViewModel | @MainActor — all UI state updates serialized on the main thread
SonyThumbnailExtractor, JPGSonyARWExtractor | nonisolated static methods dispatched to global GCD queues to prevent actor starvation
SaveJPGImage | actor with a single @concurrent nonisolated method — runs on the cooperative thread pool, not the actor’s executor

CPU-bound ImageIO and disk I/O work runs off-actor to keep the main thread and actor queues responsive.
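The DiscardableThumbnail row can be illustrated with a portable sketch. NSLock stands in here for OSAllocatedUnfairLock (which is Apple-platform-only), but the @unchecked Sendable shape is the same:

```swift
import Foundation

// Sketch of a lock-protected @unchecked Sendable resource, in the style
// of DiscardableThumbnail; names and API are illustrative.
final class CountedResource: @unchecked Sendable {
    private let lock = NSLock()
    private var isDiscarded = false
    private var accessCount = 0

    /// Marks the resource in use; returns false if it was already discarded.
    func beginAccess() -> Bool {
        lock.lock(); defer { lock.unlock() }
        guard !isDiscarded else { return false }
        accessCount += 1
        return true
    }

    func endAccess() {
        lock.lock(); defer { lock.unlock() }
        accessCount -= 1
    }

    /// Discards only when nothing is mid-access.
    func discardIfUnused() -> Bool {
        lock.lock(); defer { lock.unlock() }
        guard accessCount == 0 else { return false }
        isDiscarded = true
        return true
    }
}
```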


9. Flow Diagrams

Thumbnail Preload — Two-Level Task Pattern

sequenceDiagram
    participant VM as RawCullViewModel (MainActor)
    participant A as ScanAndCreateThumbnails (actor)
    participant MC as SharedMemoryCache
    participant DC as DiskCacheManager
    participant EX as SonyThumbnailExtractor

    VM->>A: preloadCatalog(at:targetSize:)
    Note over VM: outer Task stores preloadTask
    A->>A: ensureReady() — setupTask gate
    A->>A: cancelPreload() — cancel prior inner task
    A->>A: create inner Task<Int,Never> — stored as preloadTask
    loop withTaskGroup (capped at processorCount × 2)
        A->>MC: object(forKey:) — RAM check
        alt RAM hit
            MC-->>A: DiscardableThumbnail
        else RAM miss
            A->>DC: load(for:) — disk check
            alt Disk hit
                DC-->>A: NSImage
                A->>MC: setObject(...) — promote to RAM
            else Disk miss
                A->>EX: extractSonyThumbnail(from:maxDimension:qualityCost:)
                EX-->>A: CGImage
                A->>A: normalize to JPEG-backed NSImage
                A->>MC: setObject(...)
                A->>DC: save(jpegData:for:) — detached background
            end
        end
        A->>VM: fileHandler(progress) via @MainActor closure
    end
    A-->>VM: return count

On-Demand Request

sequenceDiagram
    participant UI
    participant TL as ThumbnailLoader (actor)
    participant RT as RequestThumbnail (actor)
    participant MC as SharedMemoryCache
    participant DC as DiskCacheManager

    UI->>TL: thumbnailLoader(file:)
    TL->>TL: acquireSlot() — suspend if activeTasks ≥ 6
    TL->>RT: requestThumbnail(for:targetSize:)
    RT->>MC: object(forKey:)
    alt RAM hit
        MC-->>RT: DiscardableThumbnail
        RT-->>UI: CGImage
    else RAM miss
        RT->>DC: load(for:)
        alt Disk hit
            DC-->>RT: NSImage
            RT->>MC: setObject(...)
            RT-->>UI: CGImage
        else Disk miss
            RT->>RT: extractSonyThumbnail
            RT->>MC: setObject(...)
            RT->>DC: save — detached background
            RT-->>UI: CGImage
        end
    end
    TL->>TL: releaseSlot() — resume next pending continuation

10. Settings Reference

Setting | Default | Effect
memoryCacheSizeMB | 5000 | Sets NSCache.totalCostLimit
thumbnailCostPerPixel | 4 | Cost per pixel in DiscardableThumbnail
thumbnailSizePreview | 1024 | Target size for bulk preload and on-demand loading via ThumbnailLoader
thumbnailSizeGrid | 100 | Grid thumbnail size
thumbnailSizeGridView | 400 | Grid View thumbnail size
thumbnailSizeFullSize | 8700 | Full-size zoom path upper bound
useThumbnailAsZoomPreview | false | Use cached thumbnail instead of re-extracting for zoom

Synchronous Code

A Guide to Handling Heavy Synchronous Code in Swift Concurrency

This post explains why CPU-intensive synchronous code (such as image decoding via ImageIO) must be dispatched off the Swift Concurrency thread pool, and shows the correct patterns RawCull uses to do so.

DispatchQueue.global(qos:) — QoS Levels Compared

The key difference is priority and resource allocation by the system.


.userInitiated

  • Priority: High (just below .userInteractive)
  • Use case: Work the user directly triggered and is actively waiting for — e.g., loading a document they tapped, parsing data to display a screen
  • Expected duration: Near-instantaneous to a few seconds
  • System behavior: Gets more CPU time and higher thread priority — the system treats this as urgent
  • Energy impact: Higher

.utility

  • Priority: Low-medium
  • Use case: Long-running work the user is aware of but not blocked by — e.g., downloading files, importing data, periodic syncs, progress-bar tasks
  • Expected duration: Seconds to minutes
  • System behavior: Balanced CPU/energy trade-off; the system throttles this more aggressively under load or low battery
  • Energy impact: Lower (system may apply energy efficiency optimizations)

Quick Comparison

Aspect | .userInitiated | .utility
Priority | High | Low-medium
User waiting? | Yes, directly | Aware but not blocked
Duration | < a few seconds | Seconds to minutes
CPU allocation | Aggressive | Conservative
Battery impact | Higher | Lower
Thread pool | Higher-priority threads | Lower-priority threads

Rule of thumb

// User tapped "Load" and is staring at a spinner → userInitiated
DispatchQueue.global(qos: .userInitiated).async {
    let data = loadCriticalData()
}

// Background sync / download with a progress bar → utility
DispatchQueue.global(qos: .utility).async {
    downloadLargeFile()
}

If you use .userInitiated for everything, you waste battery and CPU on non-urgent work. If you use .utility for user-blocking tasks, the UI will feel sluggish because the system may deprioritize the work.

1. The Core Problem: The Swift Cooperative Thread Pool

To understand why heavy synchronous code breaks modern Swift, you have to understand the difference between older Apple code (Grand Central Dispatch / GCD) and new Swift Concurrency.

  • GCD (DispatchQueue) uses a dynamic thread pool. If a thread gets blocked doing heavy work, GCD notices and spawns a new thread. This prevents deadlocks but causes Thread Explosion (which drains memory and battery).
  • Swift Concurrency (async/await/Task) uses a fixed-size cooperative thread pool. It strictly limits the number of background threads to exactly the number of CPU cores your device has (e.g., 6 cores = exactly 6 threads). It will never spawn more.

Because there are so few threads, Swift relies on cooperation. When an async function hits an await, it says: “I’m pausing to wait for something. Take my thread and give it to another task!” This allows 6 threads to juggle thousands of concurrent tasks.

The “Choke” (Thread Pool Starvation)

If you run heavy synchronous code (code without await) on the Swift thread pool, it hijacks the thread and refuses to give it back. If you request 6 heavy image extractions at the same time, all 6 Swift threads are paralyzed. Your entire app’s concurrency system freezes until an image finishes. Network requests halt, and background tasks deadlock.


2. What exactly is “Blocking Synchronous Code”?

Synchronous code executes top-to-bottom without ever pausing (it lacks the await keyword). Blocking code is synchronous code that takes a “long time” to finish (usually >10–50 milliseconds), thereby holding a thread hostage.

The 3 Types of Blocking Code:

  1. Heavy CPU-Bound Work: Number crunching, image processing (CoreGraphics, ImageIO), video encoding, parsing massive JSON files.
  2. Synchronous I/O: Reading massive files synchronously (e.g., Data(contentsOf: URL)) or older synchronous database queries. The thread is completely frozen waiting for the hard drive.
  3. Locks and Semaphores: DispatchSemaphore.wait() or NSLock.lock() intentionally pauses a thread. (Apple explicitly warns against blocking cooperative-pool threads this way; it breaks the runtime’s forward-progress guarantee.)

The Checklist to Identify Blocking Code:

Ask yourself these questions about a function:

  1. Does it lack the async keyword in its signature?
  2. Does it lack internal await calls (or await Task.yield())?
  3. Does it take more than a few milliseconds to run?
  4. Is it a “Black Box” from an Apple framework (like ImageIO) or C/C++?

If the answer to all four is Yes, it is blocking synchronous code and does not belong on the Swift Concurrency thread pool.


3. The Traps: Why Task and Actor Don’t Fix It

It is tempting to fix blocking code with modern Swift features alone. However, these common approaches are traps:

Trap 1: Using Task or Task.detached

// ❌ TRAP: Still causes Thread Pool Starvation!
func extract() async throws -> CGImage {
    return try await Task.detached {
        return try Self.extractSync() // Blocks one of the 6 Swift threads
    }.value
}

Task and Task.detached do not create new background threads. They simply place work onto that same strict 6-thread cooperative pool. It might seem to “work” if you only test one image at a time, but at scale, it will deadlock your app.

Trap 2: Putting it inside an actor

Actors process their work one-by-one to protect state. However, Actors do not have their own dedicated threads. They borrow threads from the cooperative pool. If you run heavy sync code inside an Actor, you cause a Double Whammy:

  1. Thread Pool Starvation: You choked one of the 6 Swift workers.
  2. Actor Starvation: The Actor is locked up and cannot process any other messages until the heavy work finishes.

Trap 3: Using nonisolated

Marking an Actor function as nonisolated just means “this doesn’t touch the Actor’s private state.” It prevents Actor Starvation, but the function still physically runs on the exact same 6-thread pool, causing Thread Pool Starvation.


4. The Correct Solution: The GCD Escape Hatch

Apple’s official stance is that if you have heavy, blocking synchronous code that you cannot modify, Grand Central Dispatch (GCD) is still the correct tool for the job.

By wrapping the work in DispatchQueue.global().async and withCheckedThrowingContinuation, you push the heavy work out of Swift’s strict 6-thread pool and into GCD’s flexible thread pool (which is allowed to spin up extra threads).

This leaves the precious Swift Concurrency threads completely free to continue juggling all the other await tasks in your app.

Two functions in RawCull use DispatchQueue.global:

extract JPGs from ARW files

static func extractEmbeddedPreview(
        from arwURL: URL,
        fullSize: Bool = false
    ) async -> CGImage? {
        let maxThumbnailSize: CGFloat = fullSize ? 8640 : 4320

        return await withCheckedContinuation { (continuation: CheckedContinuation<CGImage?, Never>) in
            // Dispatch to GCD to prevent Thread Pool Starvation
            DispatchQueue.global(qos: .utility).async {

                guard let imageSource = CGImageSourceCreateWithURL(arwURL as CFURL, nil) else {
                    Logger.process.warning("PreviewExtractor: Failed to create image source")
                    continuation.resume(returning: nil)
                    return
                }

                let imageCount = CGImageSourceGetCount(imageSource)
                var targetIndex: Int = -1
                var targetWidth = 0

                // 1. Find the LARGEST JPEG available
                for index in 0 ..< imageCount {
                    guard let properties = CGImageSourceCopyPropertiesAtIndex(
                        imageSource,
                        index,
                        nil
                    ) as? [CFString: Any]
                    else {
                        Logger.process.debugMessageOnly("enum: extractEmbeddedPreview(): Index \(index) - Failed to get properties")
                        continue
                    }

                    let hasJFIF = (properties[kCGImagePropertyJFIFDictionary] as? [CFString: Any]) != nil
                    let tiffDict = properties[kCGImagePropertyTIFFDictionary] as? [CFString: Any]
                    let compression = tiffDict?[kCGImagePropertyTIFFCompression] as? Int
                    let isJPEG = hasJFIF || (compression == 6)

                    if let width = getWidth(from: properties) {
                        if isJPEG, width > targetWidth {
                            targetWidth = width
                            targetIndex = index
                        }
                    }
                }

                guard targetIndex != -1 else {
                    Logger.process.warning("PreviewExtractor: No JPEG found in file")
                    continuation.resume(returning: nil)
                    return
                }

                let requiresDownsampling = CGFloat(targetWidth) > maxThumbnailSize
                let result: CGImage?

                // 2. Decode & Downsample using ImageIO directly
                if requiresDownsampling {
                    Logger.process.info("PreviewExtractor: Native downsampling to \(maxThumbnailSize)px")

                    // These options let ImageIO downsample natively, replacing a manual resize step
                    let options: [CFString: Any] = [
                        kCGImageSourceCreateThumbnailFromImageAlways: true,
                        kCGImageSourceCreateThumbnailWithTransform: true,
                        kCGImageSourceThumbnailMaxPixelSize: Int(maxThumbnailSize)
                    ]

                    result = CGImageSourceCreateThumbnailAtIndex(imageSource, targetIndex, options as CFDictionary)
                } else {
                    Logger.process.info("PreviewExtractor: Using original preview size (\(targetWidth)px)")

                    // Standard decoding options (no downsampling needed)
                    let decodeOptions: [CFString: Any] = [
                        kCGImageSourceShouldCache: true,
                        kCGImageSourceShouldCacheImmediately: true
                    ]

                    result = CGImageSourceCreateImageAtIndex(imageSource, targetIndex, decodeOptions as CFDictionary)
                }

                continuation.resume(returning: result)
            }
        }
    }

extract thumbnails

import AppKit
import Foundation

enum SonyThumbnailExtractor {
    /// Extract thumbnail using generic ImageIO framework.
    /// - Parameters:
    ///   - url: The URL of the RAW image file.
    ///   - maxDimension: Maximum pixel size for the longest edge of the thumbnail.
    ///   - qualityCost: Interpolation cost.
    /// - Returns: A `CGImage` thumbnail.
    static func extractSonyThumbnail(
        from url: URL,
        maxDimension: CGFloat,
        qualityCost: Int = 4
    ) async throws -> CGImage {
        // We MUST explicitly hop off the current thread.
        // Since we are an enum and static, we have no isolation of our own.
        // If we don't do this, we run on the caller's thread (the Actor), causing serialization.

        try await withCheckedThrowingContinuation { continuation in
            DispatchQueue.global(qos: .userInitiated).async {
                do {
                    let image = try Self.extractSync(
                        from: url,
                        maxDimension: maxDimension,
                        qualityCost: qualityCost
                    )
                    continuation.resume(returning: image)
                } catch {
                    continuation.resume(throwing: error)
                }
            }
        }
    }
}

5. The “Modern Swift” Alternative (If you own the code)

If extractSync was your own custom Swift code (and not an opaque framework like ImageIO), the truly “Modern Swift” way to fix it is to rewrite the synchronous loop to be cooperative.

You do this by sprinkling await Task.yield() inside heavy loops to voluntarily give the thread back:

func extractSyncCodeMadeAsync() async -> CGImage {
    for pixelRow in image {
        process(pixelRow)
        
        // Every few rows, pause and let another part of the app use the thread!
        if pixelRow.index % 10 == 0 {
            await Task.yield() 
        }
    }
}

If you can do this, you don’t need DispatchQueue! But if you are using black-box code that you can’t add await to, the GCD Escape Hatch is the correct, Apple-approved architecture.


Summary

Heavy synchronous code — especially CPU-bound ImageIO work — must never run directly on Swift’s cooperative thread pool. The GCD escape hatch (DispatchQueue.global + withCheckedContinuation) moves that work onto GCD’s flexible thread pool, leaving Swift Concurrency threads free. RawCull uses this pattern for both thumbnail extraction (userInitiated priority) and JPEG preview extraction (utility priority).

Security-Scoped URLs

Security-scoped URLs are a cornerstone of macOS App Sandbox security. RawCull uses them to gain persistent, user-approved access to source and destination folders while remaining fully sandbox-compliant. This article walks through exactly how the implementation works, tracing the code from user interaction through to file operations.


What Are Security-Scoped URLs?

A security-scoped URL is a special file URL that carries a cryptographic capability granted by macOS, representing explicit user consent to access a specific file or folder. Without it, a sandboxed app cannot read or write anything outside its own container.

Key properties:

  • Created only from user-granted file access (file picker, drag-and-drop)
  • Grants temporary access to files outside the app sandbox
  • Must be explicitly activated (startAccessingSecurityScopedResource()) before use and deactivated (stopAccessingSecurityScopedResource()) after
  • Can be serialized as a bookmark — a persistent token stored in UserDefaults that survives app restarts

Core API:

// Activate access — must be called before any file operations on the URL
let granted = url.startAccessingSecurityScopedResource()  // returns Bool

// Deactivate — must always be paired with a successful start call
url.stopAccessingSecurityScopedResource()

// Serialize to persistent bookmark data
let bookmarkData = try url.bookmarkData(
    options: .withSecurityScope,
    includingResourceValuesForKeys: nil,
    relativeTo: nil
)

// Restore from bookmark (across app launches)
var isStale = false
let restoredURL = try URL(
    resolvingBookmarkData: bookmarkData,
    options: .withSecurityScope,
    relativeTo: nil,
    bookmarkDataIsStale: &isStale
)

Architecture in RawCull

RawCull’s security-scoped URL system has three distinct layers, each with a specific responsibility.


Layer 1 — Initial User Selection (OpencatalogView)

OpencatalogView presents the macOS folder picker using SwiftUI’s .fileImporter() modifier. When the user selects a folder, the resulting URL is a short-lived security-scoped URL. The view immediately converts it into a persistent bookmark.

File: RawCull/Views/CopyFiles/OpencatalogView.swift

.fileImporter(
    isPresented: $isImporting,
    allowedContentTypes: [.directory]
) { result in
    switch result {
    case .success(let url):
        // Activate access immediately — required to create a bookmark
        guard url.startAccessingSecurityScopedResource() else {
            Logger.process.errorMessageOnly("Failed to start accessing resource")
            return
        }

        // Store the path string for immediate UI use
        selecteditem = url.path

        // Serialize the URL to a persistent bookmark while access is active
        do {
            let bookmarkData = try url.bookmarkData(
                options: .withSecurityScope,
                includingResourceValuesForKeys: nil,
                relativeTo: nil
            )
            UserDefaults.standard.set(bookmarkData, forKey: bookmarkKey)
        } catch {
            Logger.process.warning("Could not create bookmark: \(error)")
        }

        // Release access — will be reacquired via bookmark when needed
        url.stopAccessingSecurityScopedResource()

    case .failure(let error):
        Logger.process.errorMessageOnly("File picker error: \(error)")
    }
}

bookmarkKey is either "sourceBookmark" or "destBookmark" — the two folder roles in RawCull.

What this layer guarantees:

  • Bookmark is created while access is still active (the only valid window for bookmark creation)
  • Access is released immediately after — the bookmark takes over for future launches
  • The path is captured before releasing access, so the UI can display it without holding an open security scope

Layer 2 — Bookmark Restoration (ExecuteCopyFiles)

When the user initiates a copy operation on a subsequent launch, ExecuteCopyFiles resolves the stored bookmarks back into live, access-granted URLs.

File: RawCull/Model/ParametersRsync/ExecuteCopyFiles.swift

func getAccessedURL(fromBookmarkKey key: String, fallbackPath: String) -> URL? {
    // Primary path: restore from persisted bookmark
    if let bookmarkData = UserDefaults.standard.data(forKey: key) {
        do {
            var isStale = false

            let url = try URL(
                resolvingBookmarkData: bookmarkData,
                options: .withSecurityScope,
                relativeTo: nil,
                bookmarkDataIsStale: &isStale
            )

            // Activate access on the resolved URL
            guard url.startAccessingSecurityScopedResource() else {
                Logger.process.errorMessageOnly("Failed to start accessing bookmark for \(key)")
                return tryFallbackPath(fallbackPath, key: key)
            }

            // Warn if the folder was moved (bookmark is stale)
            if isStale {
                Logger.process.warning("Bookmark is stale for \(key) — user may need to reselect")
            }

            return url  // Caller is responsible for stopAccessingSecurityScopedResource()

        } catch {
            Logger.process.errorMessageOnly("Bookmark resolution failed for \(key): \(error)")
            return tryFallbackPath(fallbackPath, key: key)
        }
    }

    return tryFallbackPath(fallbackPath, key: key)
}

private func tryFallbackPath(_ fallbackPath: String, key: String) -> URL? {
    let fallbackURL = URL(fileURLWithPath: fallbackPath)
    guard fallbackURL.startAccessingSecurityScopedResource() else {
        Logger.process.errorMessageOnly("Failed to access fallback path for \(key)")
        return nil
    }
    return fallbackURL
}

The returned URL has startAccessingSecurityScopedResource() already called. The calling code in ExecuteCopyFiles is responsible for calling stopAccessingSecurityScopedResource() on each URL once the rsync operation completes.

What this layer handles:

  • Normal case: bookmark resolves cleanly → URL returned with access active
  • Stale bookmark: folder was moved → logged as warning, access still attempted
  • Bookmark resolution throws: falls back to direct path access
  • No bookmark stored at all: falls back to direct path access
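
A call site consuming getAccessedURL might pair each resolved URL with its stop call as sketched below. This is illustrative only: the function name, the fallback paths, and the surrounding structure are assumptions, not RawCull's exact code — only the contract (caller stops what getAccessedURL started) comes from the text above.

```swift
import Foundation

// Sketch of a caller honoring the Layer 2 contract: every URL returned by
// getAccessedURL(fromBookmarkKey:fallbackPath:) already has access active,
// so the caller must stop it exactly once when the operation completes.
// The function name and fallback paths here are hypothetical.
func performCopySketch(copyFiles: ExecuteCopyFiles) {
    guard let sourceURL = copyFiles.getAccessedURL(
        fromBookmarkKey: "sourceBookmark",
        fallbackPath: "/Volumes/CARD/DCIM"
    ) else { return }
    // defer pairs the successful start with exactly one stop,
    // even if destination resolution or the copy below fails
    defer { sourceURL.stopAccessingSecurityScopedResource() }

    guard let destURL = copyFiles.getAccessedURL(
        fromBookmarkKey: "destBookmark",
        fallbackPath: NSHomeDirectory() + "/Pictures"
    ) else { return }
    defer { destURL.stopAccessingSecurityScopedResource() }

    // ... build the rsync argument list from sourceURL.path and
    //     destURL.path and run the copy while both scopes are open ...
}
```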

Layer 3 — Scoped Access During File Operations (ScanFiles)

When scanning a directory for ARW files, the ScanFiles actor activates and deactivates security-scoped access for the duration of the scan only.

File: RawCull/Actors/ScanFiles.swift

actor ScanFiles {
    func scanFiles(url: URL, onProgress: @escaping (Double) -> Void) async -> [FileItem] {
        // Activate access for this URL
        guard url.startAccessingSecurityScopedResource() else {
            return []
        }
        // defer guarantees deactivation on every exit path, including the early returns below
        defer { url.stopAccessingSecurityScopedResource() }

        let manager = FileManager.default
        guard let contents = try? manager.contentsOfDirectory(
            at: url,
            includingPropertiesForKeys: [
                .fileSizeKey,
                .contentModificationDateKey,
                .typeIdentifierKey
            ],
            options: [.skipsHiddenFiles]
        ) else { return [] }

        return await processContents(contents, onProgress: onProgress)
    }
}

The defer pattern is critical here: it guarantees that stopAccessingSecurityScopedResource() is called on every exit path, whether the function completes normally or returns early at one of the guard statements. This prevents security-scoped resources from being “leaked” (left open indefinitely).

Actor isolation: Because ScanFiles is a Swift actor, all file operations on its state are serialized by the runtime — concurrent reads of the same directory cannot race each other.


Global Access Tracking in RawCullViewModel

The main view model maintains a comprehensive registry of all URLs for which startAccessingSecurityScopedResource() has been called, ensuring nothing is left open when the app quits.

File: RawCull/Model/ViewModels/RawCullViewModel.swift

@Observable @MainActor
final class RawCullViewModel {
    private var securityScopedURLs: Set<URL> = []

    func trackSecurityScopedAccess(for url: URL) {
        securityScopedURLs.insert(url)
    }

    func stopSecurityScopedAccess(for url: URL) {
        guard securityScopedURLs.contains(url) else { return }
        url.stopAccessingSecurityScopedResource()
        securityScopedURLs.remove(url)
    }

    deinit {
        // Release all remaining security-scoped access on teardown
        for url in securityScopedURLs {
            url.stopAccessingSecurityScopedResource()
        }
    }
}

This acts as a safety net: even if a call path omits an explicit stop, the deinit cleans up everything before the app exits. Combined with defer in the actors, this gives double coverage against resource leaks.
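
A call site that opens a scope manually might register it with the view model as sketched below. This is illustrative only; the surrounding context is assumed, and only the two tracking methods come from the code above.

```swift
// Illustrative sketch: registering a manually opened scope with the
// view model so the deinit safety net can release it if the explicit
// stop call is ever skipped on some code path.
if url.startAccessingSecurityScopedResource() {
    viewModel.trackSecurityScopedAccess(for: url)

    // ... perform file operations on url ...

    // The contains check inside stopSecurityScopedAccess(for:) makes
    // this safe even if another path already stopped this URL.
    viewModel.stopSecurityScopedAccess(for: url)
}
```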


End-to-End Flow

User selects destination folder via file picker
    ↓
OpencatalogView.fileImporter result handler
    1. url.startAccessingSecurityScopedResource()
    2. selecteditem = url.path                    (UI binding)
    3. bookmarkData = try url.bookmarkData(options: .withSecurityScope)
    4. UserDefaults.set(bookmarkData, forKey: "destBookmark")
    5. url.stopAccessingSecurityScopedResource()
    ↓
    [App may be quit and relaunched here]
    ↓
User initiates copy operation
    ↓
ExecuteCopyFiles.performCopyTask()
    1. getAccessedURL(fromBookmarkKey: "sourceBookmark", ...)
       → resolves bookmark → startAccessingSecurityScopedResource() → returns URL
    2. getAccessedURL(fromBookmarkKey: "destBookmark", ...)
       → resolves bookmark → startAccessingSecurityScopedResource() → returns URL
    3. Builds rsync argument list using both paths
    4. Spawns /usr/bin/rsync via RsyncProcessStreaming
    5. After rsync completes:
       sourceURL.stopAccessingSecurityScopedResource()
       destURL.stopAccessingSecurityScopedResource()
    ↓
ScanFiles.scanFiles(url: sourceURL)
    1. url.startAccessingSecurityScopedResource()
    2. defer { url.stopAccessingSecurityScopedResource() }
    3. FileManager.contentsOfDirectory(at: url, ...)
    4. Returns [FileItem]   ← defer fires here, access released
    ↓
RawCullViewModel.deinit (on app quit)
    → stopAccessingSecurityScopedResource() for any remaining tracked URLs

Security Model Summary

| Aspect | Implementation | Guarantee |
| --- | --- | --- |
| User consent | File picker only — no programmatic path construction | App never accesses a folder the user did not explicitly choose |
| Persistence | Bookmark serialized to UserDefaults | User does not re-select folders on every launch |
| Minimal scope duration | defer and explicit stop calls bound access to the operation | Security-scoped access is held only as long as needed |
| Leak prevention | Set<URL> in view model + deinit cleanup | No access token outlives the app session |
| Stale bookmark detection | bookmarkDataIsStale checked on every resolve | User is informed if a folder has been moved |
| Fallback resilience | Direct path access if bookmark resolution fails | Graceful degradation, operation still attempted |
| Audit trail | OSLog records every start, stop, failure, and stale event | Security events are observable via Console.app |

Common Pitfalls (and How RawCull Avoids Them)

1. Forgetting to call startAccessingSecurityScopedResource() before file operations → RawCull guards every file operation with an explicit start call; failure returns nil or [] rather than crashing.

2. Not calling stopAccessingSecurityScopedResource() — leaking the scope → defer in actors and deinit in the view model provide two independent cleanup layers.

3. Creating a bookmark while access is not active → OpencatalogView always creates the bookmark inside the startAccessing… / stopAccessing… window.

4. Ignoring the isStale flag → RawCull logs a warning when bookmarkDataIsStale is true, making stale bookmarks visible in diagnostics.

5. Using the resolved URL after calling stop → The view model tracks active URLs and guards against double-stop via the contains check before removing from the set.
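
These pitfalls can also be avoided structurally with a small wrapper. The helper below is not part of RawCull; it is a sketch of how the start/defer-stop pattern used in ScanFiles could be generalized into one reusable function.

```swift
import Foundation

// Generic helper (illustrative, not RawCull code): runs `body` with
// security-scoped access active, guaranteeing exactly one paired stop.
// Returns nil if access could not be activated.
func withSecurityScopedAccess<T>(to url: URL,
                                 _ body: (URL) throws -> T) rethrows -> T? {
    guard url.startAccessingSecurityScopedResource() else { return nil }
    // defer fires on every exit path, so the scope can never leak
    defer { url.stopAccessingSecurityScopedResource() }
    return try body(url)
}

// Usage: access is held only for the duration of the closure.
// let names = withSecurityScopedAccess(to: folderURL) { url in
//     try FileManager.default.contentsOfDirectory(atPath: url.path)
// }
```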

Compiling RawCull

Overview

The easiest way to compile RawCull is with the included Makefile. The default make at /usr/bin/make does the job.

Compile by make

If you have an Apple Developer account, you should open the RawCull project and replace the Signing & Capabilities section with your own Apple Developer ID before using make and the procedure outlined below.

Using the make command requires an app-specific password. Two make commands are available: one creates a release build of RawCull only, while the other produces a signed build packaged as a DMG file.

If you only use the make archive command, the app-specific password is not required, and updating the Signing & Capabilities section is sufficient. The make archive command will likely still work even with the certificate set to Sign to Run Locally.

To create a DMG file, make depends on the create-dmg tool. Instructions for create-dmg are included in the Makefile. Ensure that your fork of create-dmg sits in the same parent directory as your fork of RawCull. Before using make, create and store an app-specific password.

The following procedure creates and stores an app-specific password:

  1. Visit appleid.apple.com and log in with your Apple ID.
  2. Navigate to the Sign-In and Security section and select App-Specific Passwords → Generate an App-Specific Password.
  3. Provide a label to help identify the purpose of the password (e.g., notarytool).
  4. Click Create. The password will be displayed once; copy it and store it securely.

After creating the app-specific password, execute the following command and follow the prompts:

xcrun notarytool store-credentials --apple-id "youremail@gmail.com" --team-id "A1B2C3D4E5"

  • Replace youremail@gmail.com and A1B2C3D4E5 with your actual credentials.

Name the app-specific password RawCull (in appleid.apple.com) and set Profile name: RawCull when executing the above command.

The following output will appear:

This process stores your credentials securely in the Keychain. You reference these credentials later using a profile name.

Profile name:
RawCull
App-specific password for youremail@gmail.com: 
Validating your credentials...
Success. Credentials validated.
Credentials saved to Keychain.
To use them, specify `--keychain-profile "RawCull"`

Following the above steps, the following make commands are available from the root of RawCull’s source catalog:

  • make - generates a signed and notarized DMG file containing the release version of RawCull.
  • make archive - produces an unsigned release build with all debug information removed, placed in the build/ directory.
  • make clean - deletes all build data.

Compile by Xcode

If you have an Apple Developer account, use your Apple Developer ID in Xcode.

Apple Developer account

Open the RawCull project in Xcode. Choose the top level of the project and select the Signing & Capabilities tab. Replace Team with your own team.

No Apple Developer account

As above, but set the Signing Certificate to Sign to Run Locally.

To compile or run

Use Xcode to run, debug, or build, as you prefer.