Futures aren't ersatz threads

Marius Eriksen (marius@monkey.org)
02 Apr 2013

(Originally published at aboutwhichmorelater.tumblr.com.)

Concurrent programming is an increasingly important topic in the context of building modern distributed systems. Most such systems have a large amount of inherent concurrency. For example: to accommodate large corpus sizes, a search engine splits its index into many small pieces. In order to satisfy queries efficiently, it must issue requests to each of these shards concurrently.

Threads (or processes) are a common abstraction for programming concurrent systems: there are multiple threads of execution, each of which has its own stack. The system manages how these threads are mapped onto physical hardware, and how they are preempted and interleaved. This allows the operating system to suspend execution of a thread while it is waiting for I/O, letting other threads proceed. Traditionally, threads have had both high fixed costs (stacks must be allocated in increments of page sizes) as well as high context-switching costs.

The emergence of event-based programming addressed these issues: there is only one thread, with a run-loop, and I/O events are explicitly dispatched (e.g. data is ready to be read, a write has completed, etc.). While this reduces the cost of threads (there is only one stack, and the OS needn’t interleave your executions), the programming model is awkward: the programmer has to split the program up into pieces delimited by I/O operations. In C in particular, you have to declare a separate function for each of these pieces; the introduction to the libevent book has more details. This is also the model of node.js, though because JavaScript has first-class closures the problem is ameliorated somewhat: by nesting callbacks, you can easily share context. This improves brevity but not modularity: each sequence of actions is fixed in place, and errors must be handled very carefully. Nontrivial applications quickly become difficult to reason about.

The assumptions upon which event-based programming is based have largely withered: context switches are no longer very expensive, and memory has become plentiful. However, the degree to which our systems need to be concurrent has also increased: it is not uncommon to handle hundreds of thousands, if not millions, of operations simultaneously. The archetypal example of this sort of application is the so-called “long-polling” web server.

Languages like Haskell and Go implement lightweight threads in their runtimes, making it cheap to maintain millions of threads of execution, with the runtime itself multiplexing I/O operations. Go manages this by using segmented stacks: it allocates stack space as needed. This requires both the cooperation of the compiler and a different ABI, but that’s the beauty of being able to start over.

So, clearly, “events vs. threads” is a false dichotomy; the statement doesn’t even make much sense: it conflates two separate concerns. The two terms denote concrete programming models that differ in (1) how context is encoded (heap vs. stack) and (2) where multiplexing is done (in a library, or in the runtime/OS).

In Finagle and elsewhere, we use composable futures as the basis of our concurrent programming model (these are quite different from the venerable java.util.Future and its brethren). Futures, it is often argued, make up for the deficiencies of traditional “event programming”: callbacks are tedious, compromise modularity, and make for spaghetti-like code that is difficult to understand. Futures correct this by introducing constructs that make callbacks manageable.

This misses the point.

Futures offer a concurrent programming model that translates naturally to the kinds of concerns that arise when building distributed systems. Take remote procedure calls, for instance: RPCs are inherently asynchronous. Each component operates at its own speed, and components fail independently of one another. There is no shared clock and no shared bus, and RPCs usually have dramatically different latency characteristics from typical local operations. In fact, this holds for any sort of I/O, not just RPCs.

Futures model the real world truthfully. A Future[T] represents a T-typed result that may be delivered at some time in the future, or that may fail altogether. Like real-world asynchronous operations, a future is always in one of three states: incomplete, completed successfully, or completed with a failure.
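
To make the state space concrete, here is a minimal sketch using Scala’s standard-library futures (Finagle’s com.twitter.util.Future differs in detail but is similar in spirit). A future’s value is empty until the underlying operation resolves, one way or the other:

    import scala.concurrent.{Future, Promise}

    val p = Promise[Int]()          // controls the future's resolution
    val f: Future[Int] = p.future

    f.value                         // None: incomplete
    p.success(42)
    f.value                         // Some(Success(42)): completed successfully
    // Calling p.failure(new Exception("boom")) instead would yield
    // Some(Failure(...)): completed with a failure.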

Futures provide a firewall between synchronous and asynchronous operations. Because futures are typed differently than regular values (a value furnished by a future has type Future[T], not T), you can discern at a glance which operations imply asynchronous semantics. This allows you to reason more easily about the runtime behavior of your code: not only are such operations slower, they have very different failure semantics. Furthermore, futures are “tainting”: to incorporate the result of a future, you have to “lift” other values into it. That is, if you are computing anything that requires an asynchronous result, the operation itself becomes asynchronous, and its type must be Future[T]. The upshot is that semantics are witnessed by types; we can use the type system to model and enforce behavior. Put another way: we can use the compiler to guarantee certain behavior.
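
A sketch of this “tainting,” again with standard-library futures and hypothetical names:

    import scala.concurrent.Future
    import scala.concurrent.ExecutionContext.Implicits.global

    // Hypothetical asynchronous operation, standing in for an RPC.
    def countFollowers(user: String): Future[Int] = Future { 42 }

    val bonus: Int = 1    // an ordinary, synchronous value

    // The plain value is lifted into the future via map; because the
    // computation consumes an asynchronous result, its type is
    // Future[Int], not Int.
    val adjusted: Future[Int] = countFollowers("alice").map(_ + bonus)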

Futures compose well. Like physical asynchronous operations, futures are closed under composition. This is important: the composition of two asynchronous operations can in turn be described as an asynchronous operation. Futures work the same way: I might compose two futures sequentially (e.g. because there is a data dependency) or concurrently (because there is not), and both produce a new future. When composing futures, failure handling is baked in: a sequential composition short-circuits at the first failure, while in a concurrent composition any underlying failure also fails the composed operation.
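
Both modes of composition, sketched with standard-library futures and hypothetical operations:

    import scala.concurrent.Future
    import scala.concurrent.ExecutionContext.Implicits.global

    // Hypothetical asynchronous operations.
    def fetchUser(id: Long): Future[String] = Future { "alice" }
    def fetchTimeline(user: String): Future[Seq[String]] = Future { Seq("tweet") }
    def fetchAds(): Future[Seq[String]] = Future { Seq("ad") }

    // Sequential: the timeline depends on the user, so we flatMap. If
    // fetchUser fails, fetchTimeline never runs and the result fails.
    val timeline: Future[Seq[String]] = fetchUser(1L).flatMap(fetchTimeline)

    // Concurrent: there is no data dependency, so both proceed at once.
    // A failure in either one fails the composed future.
    val page: Future[(Seq[String], Seq[String])] = timeline.zip(fetchAds())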

Futures are persistent. A Future[T] always means the same thing and refers to the same underlying operation, no matter how it is later used. Stringing operations together (concurrently or sequentially) produces new futures with new semantics. This makes reasoning about futures very simple: you needn’t worry about how they might be used or modified elsewhere.
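
For instance (a sketch): deriving new futures from an existing one never changes what the original denotes:

    import scala.concurrent.Future
    import scala.concurrent.ExecutionContext.Implicits.global

    val count: Future[Int] = Future { 1 }

    // Each combinator returns a new future with its own semantics; count
    // itself is never modified, so it means the same thing everywhere.
    val doubled: Future[Int] = count.map(_ * 2)
    val nonNegative: Future[Int] = count.filter(_ >= 0)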

Futures liberate semantics from mechanics. By stringing futures together via the various combinators, the programmer is expressing the semantics of an operation, not its execution mechanics. In this example, adapted from the Finagle User’s Guide, we fetch all images in a web page:

    val allImages: Future[Seq[Image]] =
      fetchUrl(url) flatMap { bytes =>
        val fetches = findImageUrls(bytes) map { url =>
          fetchUrl(url)
        }
        Future.collect(fetches)
      }

This can be read thus:

allImages is the operation that first fetches url, and then, for each image URL found in the body of the fetched page, fetches that URL in turn; its result is the collection of all of those images.

The code is fairly close to the English description of what we want to do. We have described the operation, but not how to perform it. In the description, there is inherent concurrency: the subsequent image URLs may all be fetched at the same time; this translates naturally into the concurrent combinators used with Futures, in this case Future.collect. Failure handling is omitted from the description, and is also omitted from the code. In this case, it is clear that we cannot proceed if any operation fails, and those are also the semantics of the composite operation (allImages). The data dependencies, inherent to the problem being solved, are all-revealing.

This separation enhances the ease of reasoning: the meaning of the operation isn’t intertwined with instructions for how to go about executing it. No threads are spun up, no errors are explicitly handled, and no explicit coordination mechanism is required to gather the results of the image fetches. Instead, there are only data dependencies: to fetch all of the images, we need to fetch their URLs; to get the URLs, we need to fetch the web page.

This property also allows us to separately consider the runtime behavior of such operations. The Future implementation might choose to limit the number of concurrent operations, for example, or to exploit inherent locality (like biasing connection pools so that a given connection is always handled by one underlying thread). A library might thread through tracing data to diagnose operations. This is not hypothetical: Finagle does all of this, and it’s done entirely as a library without any special runtime support.

Q.E.D.?

Futures present a concurrent programming model that is appealing on its own. They should not be seen as a poor man’s version of threads.

Ultimately such abstractions exist in the service of writing robust, safe, performant, and modular software. Futures have served us very well in this regard at Twitter, and I think they’ll have a long shelf-life as a go-to abstraction for constructing concurrent systems. We’re in the Cambrian period of concurrency, and I predict futures will emerge as one of the survivors.

This isn’t to say that other models aren’t also good. I’m personally a huge fan of the Go/Haskell cheap-threads model: because the threads are built into the language and runtime, their usage is enforced. This is perhaps the main sticking point of futures: because they are implemented as third-party libraries, their usage can be spotty, and you may find yourself always writing little bridges or adapters. At Twitter, we’re lucky in that we have a common platform upon which our systems are built, so this is less of a problem. But this isn’t inherent to the model: C# and F# build it into the language and runtime, and Scala’s SIP-14 also promises to unify that ecosystem.