SOSP 2013 Trip report, and a note on systems research

Marius Eriksen (marius@monkey.org)
09 Nov 2013

The twenty-fourth ACM Symposium on Operating Systems Principles (SOSP) was held at the Nemacolin Woodlands resort in Western Pennsylvania. (Nemacolin is a luxury resort, and it wants you to know it. It’s sort of like a Disneyland rendition of the palace of Versailles.) SOSPs are usually held in slightly isolated settings. This forces everyone to stay together for the entire affair—an all-inclusive conference. It works: your time there is consumed by SOSP activities. There is one track of sessions during the day, immediately followed by receptions, poster sessions, BoFs, and WIPs—open bars abound—followed by banquet dinners. This is Woodstock for systems researchers.

On the Sunday before the main conference I attended the PLOS’13 workshop, where I presented a paper, Your Server as a Function (slides are available; the talk was not recorded), about some of the core abstractions and software systems that we use to construct our server software—i.e. our systems software stack. Russ Cox gave the workshop keynote, on Go. Russ did a great job motivating the philosophy behind Go’s type system, concurrency constructs, and its package system. I’ve had an eye on Go for a while: its focus on simplicity is a healthy response to the complexity creep that seems to plague just about every other “industry” language. I’m curious to see how well Go’s approach will hold up over time, and whether they truly manage to keep it simple in the face of larger code bases, greater demand for generic code, and a changing hardware and software landscape. Go’s designers have a healthy attitude, though. Anil Madhavapeddy maintained a liveblog of the entire event.
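
To give a flavor of the paper: its central abstraction is the service, which is simply a function from a request to a (future) reply, so that servers become values you can test, compose, and wrap. The paper’s code is in Scala with futures; the snippet below is only a rough Go-flavored sketch of the idea, with placeholder Request and Response types of my own.

    package main

    import (
        "context"
        "fmt"
    )

    // Placeholder types; a real system would use concrete RPC messages.
    type Request string
    type Response string

    // A service is just a function from a request to a reply. The paper
    // expresses asynchrony with futures; here context and error stand in.
    type Service func(ctx context.Context, req Request) (Response, error)

    func echo(ctx context.Context, req Request) (Response, error) {
        return Response("echo: " + string(req)), nil
    }

    func main() {
        var svc Service = echo
        rep, err := svc(context.Background(), "hello")
        fmt.Println(rep, err)
    }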

Monday through Wednesday featured the main conference. SOSP is a single track conference, and the talks are very well attended. There are lots of good questions—some are asked in anger!—and the whole affair is quite engaging. The talks were all well-rehearsed; there were few awkward pauses or speakers who stumbled.

My three favorite talks were also the best paper award winners.

“The Scalable Commutativity Rule: Designing Scalable Software for Multicore Processors” describes a tool, commuter, which uses symbolic execution on models (effectively, “pseudo-code” Python implementations of the underlying operations) to determine which operations and argument combinations commute—that is, which pairs of operations can be executed in either order with indistinguishable results. commuter allows you to statically analyze systems—more accurately, models of systems—and determine where operations may execute concurrently. The authors implement a filesystem to show the applicability of commuter.
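
To make the commutativity question concrete: two operations commute on a given state if running them in either order produces the same per-operation results and the same final state. The toy Go sketch below checks exactly that for a made-up “filesystem” model consisting only of file names; the real tool answers the question symbolically, over all states and arguments, against Python models.

    package main

    import (
        "fmt"
        "reflect"
    )

    // Toy model state: the set of existing file names.
    type State map[string]bool

    // An operation mutates the state and returns a result string.
    type Op struct {
        Name  string
        Apply func(State) string
    }

    func clone(s State) State {
        c := State{}
        for k, v := range s {
            c[k] = v
        }
        return c
    }

    // commutes reports whether a-then-b is indistinguishable from b-then-a:
    // same results and same final state.
    func commutes(init State, a, b Op) bool {
        s1 := clone(init)
        ra1 := a.Apply(s1)
        rb1 := b.Apply(s1)

        s2 := clone(init)
        rb2 := b.Apply(s2)
        ra2 := a.Apply(s2)

        return ra1 == ra2 && rb1 == rb2 && reflect.DeepEqual(s1, s2)
    }

    func create(name string) Op {
        return Op{"create(" + name + ")", func(s State) string {
            if s[name] {
                return "EEXIST"
            }
            s[name] = true
            return "ok"
        }}
    }

    func main() {
        fmt.Println(commutes(State{}, create("a"), create("b"))) // true: order is invisible
        fmt.Println(commutes(State{}, create("a"), create("a"))) // false: the second create fails
    }

The rule of the title is that whenever operations commute in this sense, a conflict-free (and therefore scalable) implementation of them exists; non-commutativity pinpoints where contention is unavoidable.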

“Towards Optimization-Safe Systems: Analyzing the Impact of Undefined Behavior” describes a tool, stack, which sniffs out where applications rely on behavior that is undefined in C. (Did you know that pointer arithmetic overflow is undefined? Me neither.) The authors created a model for understanding such unstable code, and developed a static checker to find it. Finally, they ran stack on a number of open source projects (Kerberos, Postgres, the Linux kernel) and found a large amount of unstable code. This was a fun presentation and paper.

Perhaps my favorite paper was “Naiad: a timely dataflow system.” Naiad is a low-latency dataflow system that admits cycles, the upshot of which is support for dataflow iteration. For example, you can mix computing pagerank with more standard mapreduce-like data processing. It presents a single, unified model for doing so. Naiad uses a vertex-communication model similar to Pregel, but without synchronous iteration: incoming messages are processed incrementally. Naiad maintains a versioning scheme and distinguishes loop ingress and egress nodes. The story for fault recovery is a bit weak—currently a synchronous recovery mechanism is employed—and may be its Achilles’ heel; I’m curious to see it develop.
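
For a sense of the programming model: a timely dataflow vertex reacts to incoming messages and to notifications that a logical timestamp is complete, and can in turn send messages and request further notifications. The paper presents this API in C#; the interfaces below are a loose Go transcription with simplified types, so take the details as approximate.

    package timely

    // A logical timestamp: an input epoch plus one counter per enclosing loop,
    // which is how cyclic dataflow distinguishes rounds of iteration.
    type Timestamp struct {
        Epoch    int
        Counters []int
    }

    type Message struct{ Payload []byte }
    type Edge int

    // A vertex reacts to messages and to completion notifications.
    type Vertex interface {
        OnRecv(e Edge, m Message, t Timestamp) // a message arrived bearing time t
        OnNotify(t Timestamp)                  // no more messages at or before t will arrive
    }

    // The runtime lets a vertex produce output and schedule notifications.
    type Context interface {
        SendBy(e Edge, m Message, t Timestamp) // send m along edge e at time t
        NotifyAt(t Timestamp)                  // request OnNotify once t is complete
    }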

I also enjoyed Gernot Heiser’s microkernel retrospective. It’s great to see such long and impactful lines of research, and better still to attempt to draw lessons from the process.

The first day of the conference was liveblogged.

A note on systems research

As an industry participant, I find it disappointing to see a number of very important problems given short shrift.

The distributed systems tooling gap. We have good tools for understanding the behavior of processes on a single system: debugging and diagnostic tools like DTrace and perf are among these. Equivalent tooling for distributed systems is weak to nonexistent. Google’s Dapper is one of the few published systems that attempt to tackle parts of this problem. In reality, almost all organizations that deploy sizable distributed systems have their own set of tools and systems—aggregation and display of system statistics, RPC tracing, alerting, etc.—but these are rather primitive and unwieldy compared to tools like DTrace and perf. I think there are a lot of important (and interesting!) problems to be solved here, but the area seems to be beneath the radar of the academy.
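
As a hypothetical illustration of the kind of primitive I mean, here is a minimal Dapper-style span in Go: a trace ID minted at the edge of the system, propagated through every downstream RPC, and recorded with timings so a single request can be reconstructed across machines. The names and shapes here are made up for illustration; they are not Dapper’s actual API.

    package trace

    import (
        "context"
        "crypto/rand"
        "encoding/hex"
        "time"
    )

    // A minimal span: which trace a piece of work belongs to, which span
    // caused it, and when it ran.
    type Span struct {
        TraceID  string
        SpanID   string
        ParentID string
        Name     string
        Start    time.Time
        End      time.Time
    }

    type ctxKey struct{}

    func newID() string {
        var b [8]byte
        if _, err := rand.Read(b[:]); err != nil {
            panic(err)
        }
        return hex.EncodeToString(b[:])
    }

    // StartSpan begins a child of whatever span is already in ctx; serializing
    // the trace and span IDs into outgoing RPCs is what ties hosts together.
    func StartSpan(ctx context.Context, name string) (context.Context, *Span) {
        s := &Span{SpanID: newID(), Name: name, Start: time.Now()}
        if parent, ok := ctx.Value(ctxKey{}).(*Span); ok {
            s.TraceID, s.ParentID = parent.TraceID, parent.SpanID
        } else {
            s.TraceID = newID()
        }
        return context.WithValue(ctx, ctxKey{}, s), s
    }

    // Finish stamps the end time; a real system would ship the finished span
    // to a collector for aggregation and display.
    func (s *Span) Finish() { s.End = time.Now() }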

Cost control and isolation in “the cloud”. Cost control is paramount in large systems deployments. It’s important to understand the cost structure of your system, and the trade-offs involved in optimizing it. This is especially important in large multi-tenant systems. Service-oriented architectures, where different uses (“features”) share common systems, exacerbate this further. We have good, coarse-grained isolation in operating systems—processes, virtual machines, containers, etc.—but vertical isolation in distributed systems is poor to nonexistent.

Distributed systems modularity. We reach for many tools when structuring software—classes, mixins, the ML module system, Go packages and interfaces, etc.—but in distributed systems we have far fewer, far more primitive ones. The state of the art is pretty much using some form of interface definition language (e.g. thrift, protobufs) to define module boundaries. These are then manually glued together. This is like having only C header files and dlopen(3). Worse, because services tend to follow organizational boundaries, and because it’s tedious to maintain a large number of systems, these interfaces tend to become kitchen sinks comprising everything a particular team is responsible for, whether or not these functionalities are even related. It’s considered good programming practice to focus on compositionality: build software out of small, well-defined modules that combine to give rise to other modules with different behaviors. This is simply too difficult to do in distributed systems. Why?
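
For contrast, here is the kind of compositionality we take for granted inside a single process, rendered in the service-as-a-function style from earlier: a behavior such as a timeout or a retry policy is a function from services to services, and stacking behaviors is plain function application. The Go types here are my own illustration, not something today’s IDLs give you across a service boundary.

    package main

    import (
        "context"
        "fmt"
        "time"
    )

    type Request string
    type Response string
    type Service func(ctx context.Context, req Request) (Response, error)

    // A filter transforms one service into another; this is the kind of
    // module boundary we would like distributed systems to have.
    type Filter func(Service) Service

    func WithTimeout(d time.Duration) Filter {
        return func(next Service) Service {
            return func(ctx context.Context, req Request) (Response, error) {
                ctx, cancel := context.WithTimeout(ctx, d)
                defer cancel()
                return next(ctx, req)
            }
        }
    }

    func WithRetries(n int) Filter {
        return func(next Service) Service {
            return func(ctx context.Context, req Request) (Response, error) {
                rep, err := next(ctx, req)
                for i := 1; i < n && err != nil; i++ {
                    rep, err = next(ctx, req)
                }
                return rep, err
            }
        }
    }

    func main() {
        var echo Service = func(ctx context.Context, req Request) (Response, error) {
            return Response("echo: " + string(req)), nil
        }
        // Behaviors compose by ordinary function application.
        svc := WithRetries(3)(WithTimeout(100 * time.Millisecond)(echo))
        fmt.Println(svc(context.Background(), "hello"))
    }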

Authentication. Kerberos showed us how to do authentication in distributed systems. It’s still a good model, but a lot has changed since 1988. In particular, service-oriented architectures make session authentication less useful: usually a request is performed on someone else’s behalf. Additionally, because central scheduling systems (e.g. Mesos, Omega) are actually responsible for provisioning and managing processes, there needs to be a chain of trust. Above all, such a system must be simple to operate and easy to understand and audit.