Twitter and the JVM

Oracle, November 5, 2012

Brandon Mitchell

Twitter Inc.

Marius Eriksen

Twitter Inc.

Use

The JVM is used throughout

Systems are written in both Scala and Java; most run with ParNew+CMS.

JVM development

We have a VM team (Brandon, Kaushik) contributing to OpenJDK.

Concerns

Garbage collection, garbage collection, garbage collection, garbage collection, garbage collection, garbage collection, garbage collection, garbage collecton, garbage collection, garbage collection, garbage collection, garbage collection, garbage collection, garbage collection, garbage collection, garbage collection, garbage collection, garbage collection, garbage collecton, garbage collection, garbage collection, garbage collection, garbage collection, garbage collection, garbage collection, garbage collection, garbage collection, garbage collection, garbage collection, garbage collecton, garbage collection, garbage collection, garbage collection, garbage collection, garbage collection, garbage collection, garbage collection, garbage collection, garbage collection, garbage collection, garbage collection, garbage collection, garbage collection, garbage collecton, garbage collection, garbage collection, garbage collection, garbage collection

Concerns (part 2)

Concerns (really)

We have good throughput (GC overhead is usually 0.2-0.5%).
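One rough way to estimate a figure like this from inside a running JVM is to divide the collection time reported by the standard GC management beans by the JVM uptime. The sketch below is illustrative only: the class name `GcOverhead` is hypothetical, and what each collector counts toward `getCollectionTime()` varies, so the result is an approximation rather than the measurement behind the number above.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcOverhead {
    /** Approximate fraction of JVM uptime spent in collections, as reported by the GC beans. */
    public static double overheadFraction() {
        long gcMillis = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 if the collector does not report a time
            if (t > 0) {
                gcMillis += t;
            }
        }
        long uptimeMillis = ManagementFactory.getRuntimeMXBean().getUptime();
        return uptimeMillis <= 0 ? 0.0 : (double) gcMillis / uptimeMillis;
    }

    public static void main(String[] args) {
        System.out.printf("GC overhead: %.3f%%%n", 100.0 * overheadFraction());
    }
}
```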

But tail latencies are very important. The effect is compounded in distributed systems.
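A small calculation shows why the effect compounds. Assume, purely for illustration, that a request fans out to n backend calls, that each call is independently slow with probability p (for example because its server is in a GC pause), and that the request is slow if any call is slow. Then the chance of a slow request is 1 - (1 - p)^n. With p = 1%, a fan-out of 100 makes roughly 63% of requests slow. The class name and the independence assumption are not from the talk.

```java
public class TailFanOut {
    /** Probability that at least one of n independent calls is slow, given per-call probability p. */
    public static double probAnySlow(double p, int n) {
        return 1.0 - Math.pow(1.0 - p, n);
    }

    public static void main(String[] args) {
        // Per-call slow probability of 1% is an illustrative assumption.
        for (int n : new int[] {1, 10, 100}) {
            System.out.printf("fan-out %3d: %.1f%% of requests see a slow call%n",
                    n, 100.0 * probAnySlow(0.01, n));
        }
    }
}
```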

Most of our systems are stateless.

Tooling

Google-perftools compatible tooling

Heap profilers

CPU and contention profilers

Accessible via HTTP on production systems.
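As a minimal sketch of the general pattern of exposing diagnostics over HTTP, the example below serves current heap usage as plain text using the JDK's built-in `com.sun.net.httpserver` server. It is a stand-in, not the tooling described above: it reports heap counters rather than a real heap or CPU profile, and the class name, the `/heap` path, and port 8080 are all hypothetical.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;

public class HeapEndpoint {
    /** Renders current heap usage as plain text, one counter per line. */
    public static String render() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        return "heap_used_bytes " + heap.getUsed() + "\n"
             + "heap_committed_bytes " + heap.getCommitted() + "\n"
             + "heap_max_bytes " + heap.getMax() + "\n";
    }

    /** Starts an HTTP server on the given port (0 picks a free port) serving /heap. */
    public static HttpServer start(int port) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress(port), 0);
        server.createContext("/heap", exchange -> {
            byte[] body = render().getBytes(StandardCharsets.UTF_8);
            exchange.getResponseHeaders().set("Content-Type", "text/plain; charset=utf-8");
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(body);
            }
        });
        server.start();
        return server;
    }

    public static void main(String[] args) throws IOException {
        HttpServer server = start(8080); // hypothetical port
        System.out.println("Serving http://localhost:" + server.getAddress().getPort() + "/heap");
    }
}
```

On a production system an endpoint like this would normally sit behind access control, since it exposes internal state.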

Systemic GC robustness

Still experimental:

RPC: building in GC avoidance

Speculative execution.
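One common form of speculative execution is the backup (or hedged) request: issue the call to one replica, and if it has not answered within a short delay, issue the same call to a second replica and take whichever answers first. The sketch below illustrates that idea with `CompletableFuture`; it is an assumption about what such a mechanism might look like, not the implementation discussed in the talk, and the class, method names, and delays are hypothetical.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

public class Speculative {
    // Daemon timer thread so the sketch does not keep the JVM alive.
    private static final ScheduledExecutorService TIMER =
        Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "speculative-timer");
            t.setDaemon(true);
            return t;
        });

    /**
     * Issues the primary call; if no result is available after delayMillis,
     * issues the backup call as well. The first successful result wins.
     * A failure is reported only once the backup has been tried and both calls have failed.
     */
    public static <T> CompletableFuture<T> call(
            Supplier<CompletableFuture<T>> primary,
            Supplier<CompletableFuture<T>> backup,
            long delayMillis) {
        CompletableFuture<T> result = new CompletableFuture<>();
        CompletableFuture<T> first = primary.get();
        first.whenComplete((v, e) -> {
            if (e == null) {
                result.complete(v);
            }
        });
        TIMER.schedule(() -> {
            if (!result.isDone()) {
                backup.get().whenComplete((v, e) -> {
                    if (e == null) {
                        result.complete(v);
                    } else {
                        // Backup failed: fail the result only if the primary fails too.
                        first.whenComplete((v2, e2) -> {
                            if (e2 != null) {
                                result.completeExceptionally(e2);
                            }
                        });
                    }
                });
            }
        }, delayMillis, TimeUnit.MILLISECONDS);
        return result;
    }

    /** Demo: a primary stalled for 500 ms (say, by a GC pause) and an immediate backup. */
    public static String demo() {
        Supplier<CompletableFuture<String>> slow = () -> CompletableFuture.supplyAsync(() -> {
            try {
                Thread.sleep(500);
            } catch (InterruptedException ex) {
                Thread.currentThread().interrupt();
            }
            return "primary";
        });
        Supplier<CompletableFuture<String>> fast = () -> CompletableFuture.completedFuture("backup");
        return call(slow, fast, 50).join();
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The delay trades extra load for lower tail latency: a short delay issues more duplicate calls, while a long delay lets more slow requests through. The sketch also assumes the call is safe to execute twice.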

Speculative future work

To minimize tail latency effects, we run with the largest heaps possible.

But we also want to share machines and colocate services (see: Mesos). Our cluster scheduler should be able to make latency/heap-size tradeoffs (e.g., to do better bin packing or to oversubscribe).

Thank you
