CoreData in Multiple Threads

More and more, I come to the conclusion that there are no developers left at Apple who actually know how to use multiple threads. Or they do not trust the average developer to be capable of using threads. Think about how hard it is in Swift Concurrency to run code off the main thread. If you think “just spawn a Task { … }”, you might be surprised.
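A minimal sketch of that surprise, assuming a function annotated with @MainActor and compiler settings that do not make every declaration main-actor isolated by default. The names startWork() and isOnMainThread() are made up for illustration:

```swift
import Foundation

// Plain synchronous helper, so the thread check can be called from async code.
func isOnMainThread() -> Bool { Thread.isMainThread }

@MainActor
func startWork() {
    // Task { } inherits the actor of the context it is created in.
    // Created here, it is isolated to the main actor and runs on the main thread.
    Task {
        print("Task on main thread:", isOnMainThread())
    }

    // Task.detached does not inherit the actor; it runs on the
    // cooperative thread pool instead of the main thread.
    Task.detached {
        print("Task.detached on main thread:", isOnMainThread())
    }
}
```

Called from UI code, the first task still competes with the user interface for the main thread. Only the detached task actually leaves it.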

But Apple’s journey to separate you from threads goes back much further. In 2009, Apple unveiled “Grand Central Dispatch” (GCD), which was meant to simplify multi-threaded programs. It did, but it also replaced threads with queues. Queues can be serial or concurrent. All queues use a thread pool managed by GCD. “Managed by GCD” means exactly that: If you create a private serial queue, all work items scheduled on that queue are executed one after another, but not necessarily on the same thread. Each work item might run on a different thread. GCD provides guarantees for queues, not threads. Threads are considered an implementation detail.
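The ordering guarantee of a private serial queue can be sketched as follows. The queue label, the Recorder class and runInOrder(count:) are hypothetical names chosen for this example. The sketch only demonstrates the order of execution; it makes no statement about which thread runs each item:

```swift
import Dispatch

// Collects values. All access happens on the serial queue below,
// which is why marking the class @unchecked Sendable is acceptable here.
final class Recorder: @unchecked Sendable {
    var values: [Int] = []
}

func runInOrder(count: Int) -> [Int] {
    // A queue created without attributes is serial.
    let queue = DispatchQueue(label: "com.example.serial")
    let recorder = Recorder()

    for i in 0..<count {
        // Work items run one after another, in the order they were submitted.
        queue.async { recorder.values.append(i) }
    }

    // The sync block is enqueued behind all async items above,
    // so it sees every appended value.
    return queue.sync { recorder.values }
}

print(runInOrder(count: 5)) // [0, 1, 2, 3, 4]
```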

Not trusting average developers with threading is one thing, but I fear that Apple no longer has broad experience with multi-threading in its own developer force either. My latest project uses CoreData to store its data. As the data set of my two-person user base grew, I started to think about performance. Loading a huge number of items for the main user interface list could at some point in the future become noticeably slow.

To work around that, I wanted to use ContentUnavailableConfiguration while loading the data in the background. CoreData was released years before GCD but was updated in 2011 to support private queues (this requires some manual work, see perform(_:)). The NSPersistentContainer API that was added in 2016 finally provides a handy newBackgroundContext() method for easily creating contexts that perform all work on a private background queue. Done right, the main thread/queue, and therefore the user interface, stays free and responsive.

There’s one problem: You must not access any managed object you get from the background context from any other queue. Managed objects are not thread-safe, not even in the sense of “must be used by only one thread at any time”. Not even for reading. You cannot do this:

self.lockUI()
backgroundContext.perform {
   let fetchRequest = …
   if let items = try? backgroundContext.fetch(fetchRequest) as? [NamedItem] {
      DispatchQueue.main.sync {
         // Accessing .name in the following line is invalid!
         self.namesLabel.text = items.map { $0.name }.joined(separator: ", ")
         self.unlockUI()
      }
   }
}

While this is simple to work around in this made-up example (just build the string before dispatching to main), the restriction makes it nearly infeasible to use NSFetchedResultsController with a context that uses anything but the main queue.
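For the made-up example, the workaround could look roughly like this. It is a sketch under assumptions: NamedItem is declared here as a stand-in entity class with a non-optional name attribute, and loadNames(in:completion:) is a hypothetical helper, not part of CoreData:

```swift
import CoreData

// Stand-in for the NamedItem entity used in the example above (assumed shape).
final class NamedItem: NSManagedObject {
    @NSManaged var name: String
}

/// Fetches all NamedItem objects on the context's own queue, reduces them to a
/// plain String there, and hands only that value over to the main queue.
func loadNames(in backgroundContext: NSManagedObjectContext,
               completion: @escaping (String) -> Void) {
    backgroundContext.perform {
        let request = NSFetchRequest<NamedItem>(entityName: "NamedItem")
        let items = (try? backgroundContext.fetch(request)) ?? []

        // Managed objects are only touched here, on the context's queue.
        let names = items.map { $0.name }.joined(separator: ", ")

        DispatchQueue.main.async {
            // Only a String crosses the queue boundary, no managed object.
            completion(names)
        }
    }
}
```

The completion handler can then set the label text and unlock the user interface, since it never sees a managed object.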

I understand the reasons for this: Managed objects do not load all of their data, just the bare minimum, so accessing a property might trigger loading the missing data. Accessing name on the main queue could therefore trigger a load on the main queue. That would perform database access on a queue other than the dedicated private queue and, on top of that, annihilate the effort of moving the load to a background thread.

There should be some kind of preloadingFetch() that allows one to specify the key paths of the properties that should be loaded, similar to what CNContactFetchRequest does.
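The closest existing mechanism is probably a dictionary fetch. It is not the preloadingFetch() wished for above, because it yields plain dictionaries instead of managed objects, but those dictionaries can be handed to another queue. A minimal sketch, assuming an entity called "NamedItem" with a string attribute "name"; the helper loadNameValues(in:completion:) is hypothetical:

```swift
import CoreData

/// Fetches only the "name" attribute of every "NamedItem" row as plain
/// dictionaries and passes the extracted strings to the main queue.
func loadNameValues(in backgroundContext: NSManagedObjectContext,
                    completion: @escaping ([String]) -> Void) {
    backgroundContext.perform {
        let request = NSFetchRequest<NSDictionary>(entityName: "NamedItem")
        // Ask for dictionaries instead of managed objects ...
        request.resultType = .dictionaryResultType
        // ... and name the properties that should be loaded.
        request.propertiesToFetch = ["name"]

        let rows = (try? backgroundContext.fetch(request)) ?? []
        let names = rows.compactMap { $0["name"] as? String }

        DispatchQueue.main.async {
            completion(names)
        }
    }
}
```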

There is only one scenario that comes to my mind where the background queue of CoreData is useful: Importing or exporting data. Everything else has to be performed on the main queue.

Why did Apple fail to support this quite obvious use case? NSFetchedResultsController should accept background contexts, include prefetching information, and run performFetch() on the context’s queue. It should just work.

For the foreseeable future, I am safe, as my measurements showed that CoreData is fast enough for the given use case. In addition to that, device performance will probably increase faster than the data set size. With that in mind, I removed most uses of background contexts from the app (fixing several queue-crossing errors while doing so). The only background contexts that remain are used for import or export. In my case that’s feasible, but for a larger app with more users, CoreData’s lack of cross-thread support could easily be a showstopper.