Friday, 21 September 2018

Method dispatch mechanism

Method Dispatch is how a program selects which instructions to execute when invoking a method. It’s something that happens every time a method is called, and not something that you tend to think a lot about. Knowing how method dispatch works is vital when writing performant code, and can illuminate some of the confusing behavior found in Swift.
Compiled programming languages have three primary methods of dispatch at their disposal: direct dispatch, table dispatch, and message dispatch, which I explain below. Most languages support one or two of these. Java uses table dispatch by default, but you can opt into direct dispatch by using the final keyword. C++ uses direct dispatch by default, but you can opt into table dispatch by adding the virtual keyword. Objective-C always uses message dispatch, but allows developers to fall back to C in order to get the performance gains of direct dispatch. Swift has taken on the noble goal of supporting all three types of dispatch. This works remarkably well, but is a source of confusion to many developers, and is behind a number of gotchas that most Swift developers have encountered.

Types of Dispatch

The goal of dispatch is for the program to tell the CPU where in memory it can find the executable code for a particular method call. Before we delve into how Swift behaves, let’s look at each of the three types of method dispatch one by one. Each one has tradeoffs between execution performance and dynamic behavior.

Direct Dispatch

Direct dispatch is the fastest style of method dispatch. Not only does it result in the fewest number of assembly instructions, but the compiler can perform all sorts of smart tricks, like inlining code, and many more things which are outside of the scope of this document. This is often referred to as static dispatch.
However, direct dispatch is also the most restrictive from a programming point of view, and it is not dynamic enough to support subclassing.

Table Dispatch

Table dispatch is the most common implementation of dynamic behavior in compiled languages. It uses an array of function pointers, one for each method in the class declaration. Most languages refer to this as a “virtual table”; Swift uses a virtual table for classes and what it calls a “witness table” for protocol conformances. Every subclass has its own copy of the table, with a different function pointer for each method that the subclass has overridden. As subclasses add new methods, those methods are appended to the end of the array. The table is then consulted at runtime to determine which method to run.
As an example, consider these two classes:
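A minimal pair of classes consistent with the method names used in the rest of this section might look like the following. The returned strings are an assumption, added only so that the effect of the override can be observed.

```swift
class ParentClass {
    func method1() -> String { return "ParentClass.method1" }
    func method2() -> String { return "ParentClass.method2" }
}

class ChildClass: ParentClass {
    // Replaces the inherited entry for method2 in the child's table.
    override func method2() -> String { return "ChildClass.method2" }
    // A new method, appended after the inherited entries.
    func method3() -> String { return "ChildClass.method3" }
}
```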
In this scenario, the compiler will create two dispatch tables, one for ParentClass, and one for ChildClass:

A diagram showing the memory offsets for method1, method2, and method3 in ParentClass and ChildClass.
When a method is invoked, the process will:
  1. Read the dispatch table for the object 0xB00
  2. Read the function pointer at the index for the method. In this case, the method index for method2 is 1, so the address 0xB00 + 1 is read.
  3. Jump to the address 0x222
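The three steps above can be modelled by hand. This is only a toy sketch: real tables are emitted by the compiler and hold raw function pointers, whereas here each table is a Swift array of closures. The names Method, Object, invoke, parentTable, and childTable are hypothetical.

```swift
// A hand-rolled model of table dispatch, not what the compiler actually emits.
typealias Method = () -> String

// One table per class. Index 0 = method1, 1 = method2, 2 = method3.
let parentTable: [Method] = [
    { "ParentClass.method1" },
    { "ParentClass.method2" },
]

// The child starts from a copy of the parent's table, replaces the slot
// it overrides, and appends its new method at the end.
var childTable: [Method] = parentTable
childTable[1] = { "ChildClass.method2" }
childTable.append({ "ChildClass.method3" })

struct Object {
    let table: [Method] // every object refers to its class's table
}

func invoke(_ object: Object, methodIndex: Int) -> String {
    let table = object.table          // 1. read the dispatch table for the object
    let function = table[methodIndex] // 2. read the function pointer at the method's index
    return function()                 // 3. jump to that address
}

let childObject = Object(table: childTable)
print(invoke(childObject, methodIndex: 1)) // ChildClass.method2
```

Because new entries are only ever appended, index 1 means method2 in both tables, which is what lets a caller holding a ParentClass reference use the same index for either class.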
Table lookup is pretty simple, implementation-wise, and the performance characteristics are predictable. However, this method of dispatch is still slow compared to direct dispatch. At the instruction level, there are two additional reads and a jump, which contribute some overhead. The other reason it is considered slow is that the compiler can’t perform any optimizations based on what is occurring inside the method.
One drawback of this array-based implementation is that extensions cannot extend the dispatch table. Since subclasses add new methods to the end of the dispatch table, there’s no index at which an extension can safely add a function pointer. This swift-evolution post describes these limitations in more detail.

Message Dispatch

Message dispatch is the most dynamic method of invocation available. This is a cornerstone of Cocoa development, and is the machinery that enables features like KVO, UIAppearance, and Core Data. A key component to this functionality is that it allows developers to modify the dispatch behavior at runtime. Not only can method invocations be changed via swizzling, but objects can become different objects via isa-swizzling, allowing dispatch to be customized on an object-by-object basis.
As an example, consider these two classes:
Swift will model this hierarchy as a tree structure:
A diagram showing the tree structure that Swift uses to model the dispatch tables for a class and its subclass.
When a message is dispatched, the runtime will crawl the class hierarchy to determine which method to invoke. If this sounds slow, it is! However, this lookup is guarded by a fast cache layer that makes lookups almost as fast as table dispatch once the cache is warmed up. This only scratches the surface of message dispatch; this blog post is a great deep dive into all sorts of technical details.
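The hierarchy crawl and the cache can be illustrated with a toy model in plain Swift. This is a sketch under loud assumptions: ClassModel and send are hypothetical names, selectors are plain strings, and the real Objective-C runtime is considerably more involved.

```swift
// A toy model of message dispatch, not the Objective-C runtime itself.
final class ClassModel {
    let name: String
    let superclass: ClassModel?
    let methods: [String: () -> String]              // selector name -> implementation
    private(set) var cache: [String: () -> String] = [:]

    init(name: String, superclass: ClassModel?, methods: [String: () -> String]) {
        self.name = name
        self.superclass = superclass
        self.methods = methods
    }

    func send(_ selector: String) -> String? {
        // Fast path: an earlier lookup already resolved this selector.
        if let cached = cache[selector] { return cached() }
        // Slow path: crawl up the hierarchy until some class implements it.
        var current: ClassModel? = self
        while let cls = current {
            if let implementation = cls.methods[selector] {
                cache[selector] = implementation
                return implementation()
            }
            current = cls.superclass
        }
        return nil // nothing in the chain responds to the selector
    }
}

let parentModel = ClassModel(name: "ParentClass", superclass: nil, methods: [
    "method1": { "ParentClass.method1" },
    "method2": { "ParentClass.method2" },
])
let childModel = ClassModel(name: "ChildClass", superclass: parentModel, methods: [
    "method2": { "ChildClass.method2" },
    "method3": { "ChildClass.method3" },
])

print(childModel.send("method1") ?? "unrecognized selector") // ParentClass.method1
```

The first send of "method1" to the child walks up to the parent; later sends of the same selector are answered from the child's cache.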

Swift Method Dispatch

So, how does Swift dispatch methods? I haven’t found a succinct answer to this question, but here are four aspects that guide how dispatch is selected:
  • Declaration Location
  • Reference Type
  • Specified Behavior
  • Visibility Optimizations
Before I define these, it’s important to point out that Swift doesn’t really document the differences between when a table lookup is being used and when message dispatch is being used. The only promise that has been made is that the dynamic keyword will use message dispatch via the Objective-C runtime. Everything else I mention below, I have determined from looking at the behavior of Swift 3.0, and it is subject to change in future releases.

Location Matters

Swift has two locations where a method can be declared: inside the initial declaration of a type, and in an extension. Which of the two is used changes how the method is dispatched.
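A minimal example of the two locations, using the method names discussed in the text, might look like this. The class name MyClass and the returned strings are assumptions made for illustration.

```swift
class MyClass {
    // Declared inside the initial declaration of the type.
    func mainMethod() -> String { return "mainMethod" }
}

extension MyClass {
    // Declared in an extension of the same type.
    func extensionMethod() -> String { return "extensionMethod" }
}
```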
In the example above, mainMethod will use table dispatch, and extensionMethod will use direct dispatch. When I first discovered this, I was pretty surprised. It’s not clear or intuitive that these methods behave so differently. Below is a complete table of the types of dispatch selected based on the reference type and the declaration location:
A table showing the default method dispatch mechanisms used by Swift.
There are a few things to note here:
  • Value types always use direct dispatch. Nice and easy!
  • Extensions of protocols and classes use direct dispatch.
  • NSObject extensions use message dispatch.
  • NSObject uses table dispatch for methods inside the initial declaration!
  • Default implementations of methods in the initial protocol declaration use table dispatch.

Reference Type Matters

The type of the reference on which the method is invoked determines the dispatch rules. This seems obvious, but it is an important distinction to make. A common source of confusion is when a protocol extension and an object extension both implement the same method.
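A small sketch of that situation, using the names that appear in the discussion (MyProtocol and MyStruct are assumed type names, and the returned strings stand in for whatever the methods would do):

```swift
protocol MyProtocol {}

struct MyStruct: MyProtocol {}

extension MyStruct {
    func extensionMethod() -> String { return "In Struct" }
}

extension MyProtocol {
    func extensionMethod() -> String { return "In Protocol" }
}

let myStruct = MyStruct()
let proto: MyProtocol = myStruct // same value, viewed through the protocol type

print(myStruct.extensionMethod()) // In Struct
print(proto.extensionMethod())    // In Protocol
```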
Many people new to Swift expect proto.extensionMethod() to invoke the struct’s implementation. However, the reference type determines the dispatch selection, and the only implementation visible through the protocol type is the one in the protocol extension, which uses direct dispatch. If the declaration of extensionMethod is moved into the protocol declaration, table dispatch is used, and the struct’s implementation is invoked. Also, note that both declarations use direct dispatch, so the expected “override” behavior is just not possible, given direct dispatch semantics. This has caught many new Swift developers off guard, as it seems like expected behavior when coming from an Objective-C background.
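For comparison, here is a sketch of the variant in which extensionMethod is declared as a requirement in the protocol itself. The type names MyProtocol and MyStruct and the returned strings are again assumptions.

```swift
protocol MyProtocol {
    // Now a requirement, so calls through the protocol go via the witness table.
    func extensionMethod() -> String
}

extension MyProtocol {
    // Default implementation, used only when a conforming type provides none.
    func extensionMethod() -> String { return "In Protocol" }
}

struct MyStruct: MyProtocol {}

extension MyStruct {
    func extensionMethod() -> String { return "In Struct" }
}

let proto: MyProtocol = MyStruct()
print(proto.extensionMethod()) // In Struct
```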
There are a few bugs on the Swift JIRA, a lot of discussion on the swift-evolution mailing list, and a great blog post about this. However, it is the intended behavior, even though it is not well documented.

Specifying Dispatch Behavior

Swift also has a number of declaration modifiers that alter the dispatch behavior.

final

final enables direct dispatch on a method defined in a class. This keyword removes the possibility of any dynamic behavior. It can be used on any method, even in an extension where the dispatch would already be direct. This will also hide the method from the Objective-C runtime, and will not generate a selector.
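As a small illustration, with a hypothetical Counter class, a final method can sit next to an overridable one in the same class:

```swift
class Counter {
    private(set) var value = 0

    // Cannot be overridden, so the call target is known at compile time.
    final func increment() { value += 1 }

    // Still overridable by subclasses.
    func describe() -> String { return "value = \(value)" }
}

class VerboseCounter: Counter {
    override func describe() -> String { return "VerboseCounter(\(value))" }
    // Adding `override func increment() {}` here would be rejected by the
    // compiler, because increment() is final.
}

let counter: Counter = VerboseCounter()
counter.increment()
print(counter.describe()) // VerboseCounter(1)
```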

dynamic

dynamic enables message dispatch on a method defined in a class. It will also make the method available to the Objective-C runtime. To use dynamic, you must import Foundation, as this includes NSObject and the core of the Objective-C runtime. dynamic can be used to allow methods declared in extensions to be overridden. The dynamic keyword can be applied to both NSObject subclasses and direct Swift classes.

@objc & @nonobjc

@objc and @nonobjc alter how the method is seen by the Objective-C runtime. The most common use for @objc is to namespace the selector, like @objc(abc_methodName). @objc does not alter the dispatch selection; it just makes the method available to the Objective-C runtime. @nonobjc does alter the dispatch selection. It can be used to disable message dispatch since it does not add the method to the Objective-C runtime, which message dispatch relies on. I’m not sure if there is a difference from final, as the assembly looks the same in the use cases I’ve seen. I prefer to see final when reading code because it makes the intent more clear.

final @objc

It is also possible to mark a method as final and make the method available to message dispatch with @objc. This will cause invocations of the method to use direct dispatch, and will register the selector with the Objective-C runtime. This allows the method to respond to perform(selector) and other Objective-C features while giving you the performance of direct dispatch when invoking directly.

@inline

Swift also supports @inline, which provides a hint that the compiler can use to alter the direct dispatch. Interestingly, dynamic @inline(__always) func dynamicOrDirect() {} compiles! It does appear just to be a hint, as the assembly shows that the method will still use message dispatch. This feels like undefined behavior, and should be avoided.

Modifier Overview

A table showing the effect that modifiers have on Swift method dispatch.
If you are interested in seeing the assembly for some of the above examples, you can view it here.

Visibility Will Optimize

Swift will try to optimize method dispatch whenever it can. For instance, if you have a method that is never overridden, Swift will notice this and will use direct dispatch if it can. This optimization is great most of the time, but will often bite Cocoa programmers who are using the target/action pattern. For instance:
Here, the compiler will generate an error: Argument of '#selector' refers to a method that is not exposed to Objective-C. This makes sense when you remember that Swift is optimizing the method to use direct dispatch. The fix here is pretty easy: just add @objc or dynamic to the declaration to ensure that it stays visible to the Objective-C runtime. This type of error also occurs when using UIAppearance, which relies on proxy objects and NSInvocation.
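A minimal sketch of the fix follows. SignInController is a hypothetical stand-in for a view controller, and the whole listing is wrapped in a canImport(ObjectiveC) check because selectors only exist on platforms that ship the Objective-C runtime.

```swift
#if canImport(ObjectiveC)
import Foundation

// Hypothetical stand-in for a view controller with a private action method.
class SignInController: NSObject {
    // Removing @objc here makes the #selector expression below fail to
    // compile with the "not exposed to Objective-C" error.
    @objc private func signInAction() {}

    func makeAction() -> Selector {
        return #selector(signInAction)
    }
}

let action = SignInController().makeAction()
print(NSStringFromSelector(action)) // signInAction
#endif
```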
Another thing to be aware of when using more dynamic Foundation features is that this optimization can silently break KVO if you do not use the dynamic keyword. If a property is observed with KVO, and the property is upgraded to direct dispatch, the code will still compile, but the dynamically generated KVO method will not be triggered.
The Swift blog has a great article describing more details and the rationale behind these optimizations.

Dispatch Summary

That’s a lot of rules to remember, so here’s a summary of the dispatch rules above:

A table showing a summary of the interactions between reference types and modifiers and their effect on Swift method dispatch.

NSObject and the Loss of Dynamic Behavior

A number of Cocoa developers commented on the loss of dynamic behavior a while back. The conversation was very interesting and a lot of points came up. I hope to continue this argument and point out a few aspects of Swift’s dispatch behavior that I believe damage its dynamic behavior, and propose a solution.

Table Dispatch in NSObject

Above, I mentioned that methods defined inside the initial declaration of an NSObject subclass use table dispatch. I find this to be confusing, hard to explain, and in the end, it’s only a marginal performance improvement. In addition to this:
  • Most NSObject subclasses sit on top of a lot of objc_msgSend. I strongly doubt that any of these dispatch upgrades will result in a performance improvement on any Cocoa subclass in practice.
  • Most Swift NSObject subclasses make extensive use of extensions, which dodge this upgrade altogether.
In the end, it’s just another small detail that complicates the dispatch story.

Dispatch Upgrades Breaking NSObject Features

The visibility performance improvements are great, and I love how Swift is smart about upgrading dispatch when possible. However, having a theoretical performance boost in my UIView subclass color property breaking an established pattern in UIKit is damaging to the language.

NSObject as a Choice

Just as structs are a choice for static dispatch, it would be great to have NSObject be a choice for message dispatch. Right now, if you were to explain to a new Swift developer why something is an NSObject subclass, you’d have to explain Objective-C and the history of the language. There would be no reason to choose to subclass NSObject, other than inheriting an Objective-C code base.
Currently, the dispatch behavior of NSObject in Swift can best be described as “complicated,” which is less than ideal. I’d love to see this change: when you subclass NSObject, it should be a signal that you want fully dynamic message dispatch.

Implicit Dynamic Modification

Another possibility is that Swift could do a better job detecting when methods are used dynamically. I believe that it should be possible to detect which methods are referenced in #selector and #keyPath and automatically flag them as dynamic. This would remove the majority of the dynamic issues logged here, with the exception of UIAppearance, but maybe there’s another sort of compiler trick that could flag those methods as well.

Errors and Bugs

With a bit more understanding of Swift dispatch rules, let’s review a few more error scenarios that a Swift developer may encounter.

SR-584

This Swift bug is a “feature” of Swift’s dispatch rules. It revolves around the fact that methods defined in the initial declaration of an NSObject subclass use table dispatch, and methods defined in extensions use message dispatch. To describe the behavior, let’s create an object with a simple method:
The greetings(person:) method uses table dispatch to invoke sayHi(). This resolves as expected, and “Hello” is printed. Nothing too exciting here. Now, let’s subclass Person:
Notice that sayHi() is declared in an extension, meaning that the method will be invoked with message dispatch. When greetings(person:) is invoked, sayHi() is dispatched to the Person object via table dispatch. Since the MisunderstoodPerson override was added via message dispatch, the dispatch table for MisunderstoodPerson still has the Person implementation in it, and confusion ensues.
The workaround here is to make sure that the methods use the same dispatch mechanism. You could either add the dynamic keyword, or move the method implementation from an extension into the initial declaration.
Here, understanding Swift’s dispatch rules helps us make sense of this, even though Swift should be smart enough to resolve the situation for us.

SR-103

This Swift bug involves default implementations of methods defined in a protocol, and subclasses. To illustrate the issue, let’s define a protocol with a default implementation of the method.
Now, let’s define a class hierarchy that conforms to this protocol. Let’s create a Person class that conforms to the Greetable protocol, and a LoudPerson subclass that overrides the sayHi() function.
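Both steps can be sketched together as follows. The String return values and the greetings(greeter:) helper are assumptions, added so that the result of each call can be checked.

```swift
protocol Greetable {
    func sayHi() -> String
}

extension Greetable {
    // Default implementation of the protocol requirement.
    func sayHi() -> String { return "Hello" }
}

// Dispatches sayHi() through the Greetable protocol.
func greetings(greeter: Greetable) -> String {
    return greeter.sayHi()
}

class Person: Greetable {}

class LoudPerson: Person {
    // Not marked override: Person itself never declared sayHi().
    func sayHi() -> String { return "HELLO" }
}

print(greetings(greeter: LoudPerson())) // Hello
print(LoudPerson().sayHi())             // HELLO
```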
Notice that there’s no override keyword on sayHi() in LoudPerson. This is the only visible warning that things here may not work as expected. In this case, the LoudPerson class did not register the sayHi() function correctly in the Greetable witness table, and when sayHi() is dispatched through the Greetable protocol, the default implementation is used.
The workaround is to remember to provide implementations for all methods defined in the initial declaration of the protocol, even if a default implementation is provided. Or, you can declare the class as final to ensure that subclasses are not possible.
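A sketch of the first workaround, under the same assumptions (String return values and a greetings(greeter:) helper added for illustration):

```swift
protocol Greetable {
    func sayHi() -> String
}

extension Greetable {
    func sayHi() -> String { return "Hello" }
}

class Person: Greetable {
    // Restating the requirement in the class itself gives subclasses
    // something they can genuinely override.
    func sayHi() -> String { return "Hello" }
}

class LoudPerson: Person {
    override func sayHi() -> String { return "HELLO" }
}

func greetings(greeter: Greetable) -> String {
    return greeter.sayHi()
}

print(greetings(greeter: LoudPerson())) // HELLO
```

The compiler now requires the override keyword in LoudPerson, which is a useful signal that the subclass implementation is actually wired into dispatch.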
There was mention of work that Doug Gregor is doing that would implicitly redeclare the default protocol methods as class methods. This would fix the problem above and enable the expected override behavior.

Other bugs

Another bug that I thought I’d mention is SR-435. It involves two protocol extensions, where one extension is more specific than the other. The example in the bug shows one unconstrained extension, and one extension that is constrained to Equatable types. When the method is invoked inside a protocol, the more specific method is not called. I’m not sure whether this always occurs, but it seems important to keep an eye on.
If you are aware of any other Swift dispatch bugs, drop me a line and I’ll update this blog post.

Interesting Error

There’s an interesting compiler error message that provides a glimpse into the aspirations of Swift. As mentioned above, class extensions use direct dispatch. So what happens when you try to override a method declared in an extension?
The above code fails to compile with the error Declarations in extensions can not be overridden yet. Evidently, the Swift team has some plans to expand the basic table dispatch mechanism. Or maybe I’m trying to read tea leaves and it was just an optimistic choice of language!