Blending learned and classical indexes to self-design
larger-than-memory cloud storage engines.
Key-value storage engines are the backbone of modern big data systems, yet today's engines rely on fixed, static designs that do not scale gracefully with growing data volumes and shifting workloads. As datasets grow, classical in-memory navigational structures (filters, indexes) consume linearly more memory, inflating cloud costs — sometimes by 5× — with about 80% of the budget going to purchasing extra RAM.
Limousine is a self-designing key-value storage engine that automatically morphs to the near-optimal architecture given a workload, a cloud budget, and a target performance. It extends and generalizes our earlier system Cosine by introducing a fundamentally new capability: blending learned and classical data structures within a single engine so that the system can scale to larger-than-memory datasets without exploding cloud costs.
By identifying the first principles of storage engine design as combinations of learned, classical, and hybrid ("clearned") components, Limousine constructs a massive design space of 10^48 possible storage engine designs across diverse hardware and three major cloud providers (AWS, GCP, and Azure). It then searches this space in seconds to find the best design for a given context, and automatically materializes the result as ready-to-use Rust code.
Like a limousine that blends luxury with utility, Limousine blends the best of learned indexes (memory-efficient, distribution-aware) with classical structures (robust writes, proven guarantees) — delivering both performance and cost scalability in one engine.
Limousine-designed engines outperform state-of-the-art systems
across diverse workloads, datasets, and cloud budgets.
Limousine identifies three families of navigational structures — classical, learned, and clearned (hybrids) — and combines them with diverse data layout and merge strategies to span an enormous space of possible engines.
Classical: Bloom filters, fence pointers, and B-Trees, proven navigational structures whose memory footprint grows linearly with data.
Learned: models that learn the data distribution to create compact, tailored indexes, trading write speed for memory savings.
Clearned: novel combinations that pair learned indexes with classical filters and buffers for balanced read-write performance.
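To make the learned family concrete, here is a minimal sketch (illustrative only, not Limousine's actual code) of the core learned-index idea: fit a model mapping keys to positions in a sorted array, record the worst-case prediction error during training, and answer lookups with a bounded local search around the prediction.

```rust
// A toy learned index over sorted u64 keys: a single linear model plus a
// bounded correction window. Real systems use piecewise models, but the
// memory argument is the same: O(1) model state instead of O(n) index nodes.
struct LearnedIndex {
    keys: Vec<u64>,
    slope: f64,
    intercept: f64,
    max_err: usize, // worst-case prediction error observed during training
}

impl LearnedIndex {
    fn train(keys: Vec<u64>) -> Self {
        // Fit position ~= slope * key + intercept from the endpoints
        // (a real engine would use least squares or piecewise segments).
        let n = keys.len();
        let (k0, k1) = (keys[0] as f64, keys[n - 1] as f64);
        let slope = (n as f64 - 1.0) / (k1 - k0).max(1.0);
        let intercept = -slope * k0;
        let predict = |key: u64| -> usize {
            ((slope * key as f64 + intercept).max(0.0) as usize).min(n - 1)
        };
        // The error bound guarantees lookups never miss a trained key.
        let max_err = keys
            .iter()
            .enumerate()
            .map(|(pos, &k)| (predict(k) as i64 - pos as i64).unsigned_abs() as usize)
            .max()
            .unwrap_or(0);
        LearnedIndex { keys, slope, intercept, max_err }
    }

    fn predict(&self, key: u64) -> usize {
        let p = self.slope * key as f64 + self.intercept;
        (p.max(0.0) as usize).min(self.keys.len() - 1)
    }

    fn get(&self, key: u64) -> Option<usize> {
        // Search only the [pred - max_err, pred + max_err] window.
        let pred = self.predict(key);
        let lo = pred.saturating_sub(self.max_err);
        let hi = (pred + self.max_err + 1).min(self.keys.len());
        self.keys[lo..hi].binary_search(&key).ok().map(|i| i + lo)
    }
}
```

A clearned component would pair such a model with a classical Bloom filter (to reject absent keys before probing) and a write buffer, which is the balance the hybrid family aims for.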
Once the optimal design is chosen, Limousine automatically generates production-ready Rust code.
Existing storage engines each support a single, fixed design — an LSM-tree in RocksDB, a B-tree in WiredTiger, a log-structured hash table in FASTER. This locks in their performance characteristics and makes them inherently sub-optimal for workloads they were not designed for.
Limousine breaks this constraint by formalizing the first principles of storage engine layout: how data is organized across levels, how buffers absorb writes, how navigational structures (indexes, filters) accelerate reads, and how merge policies maintain the structure over time. Each of these axes admits multiple choices — classical or learned — and their combinations yield trillions of valid engine designs that have never appeared in literature or industry.
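The combinatorial structure of such a space can be sketched as independent design axes whose choices multiply; the axis and variant names below are illustrative stand-ins, not Limousine's actual taxonomy.

```rust
// Hypothetical design axes: each engine design is one choice per axis,
// so the space size is the product of the per-axis choice counts.
#[derive(Debug, Clone, Copy)]
enum Layout { LsmTree, BTree, LogHash }

#[derive(Debug, Clone, Copy)]
enum Navigation { BloomAndFence, Learned, Clearned }

#[derive(Debug, Clone, Copy)]
enum MergePolicy { Leveling, Tiering, Lazy }

#[derive(Debug, Clone, Copy)]
struct EngineDesign {
    layout: Layout,
    nav: Navigation,
    merge: MergePolicy,
}

fn enumerate_designs() -> Vec<EngineDesign> {
    let mut out = Vec::new();
    for &layout in &[Layout::LsmTree, Layout::BTree, Layout::LogHash] {
        for &nav in &[Navigation::BloomAndFence, Navigation::Learned, Navigation::Clearned] {
            for &merge in &[MergePolicy::Leveling, MergePolicy::Tiering, MergePolicy::Lazy] {
                out.push(EngineDesign { layout, nav, merge });
            }
        }
    }
    out
}
```

With three toy choices per axis this gives 27 designs; Limousine's real axes (per-level layout, buffering, filters, merge timing, and more) are what multiply out to the full space.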
A key innovation is the lazy write algorithm for engines with learned components. Because learned indexes must be retrained after data changes, naively rebuilding them on every write is prohibitively expensive. Limousine introduces efficient lazy write strategies that amortize retraining cost while maintaining read accuracy, enabling holistic read-write performance optimization for the first time in engines that contain learned structures.
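The amortization idea behind lazy writes can be sketched as follows (a simplified assumption of the strategy, not Limousine's algorithm): recent writes land in a small delta buffer that is searched alongside the model, and the expensive retrain only fires once the buffer grows past a fraction of the indexed data.

```rust
// Toy lazy-write store: writes go to `delta`; the "model" over `sorted` is
// only rebuilt when delta exceeds retrain_ratio * |sorted|, so retrains are
// logarithmic in the number of writes instead of one per write.
struct LazyLearnedStore {
    sorted: Vec<u64>,   // data covered by the (conceptual) learned model
    delta: Vec<u64>,    // recent writes not yet absorbed by the model
    retrain_ratio: f64, // retrain when |delta| / |sorted| exceeds this
    retrains: usize,    // count of (expensive) retrains performed
}

impl LazyLearnedStore {
    fn new(retrain_ratio: f64) -> Self {
        LazyLearnedStore { sorted: Vec::new(), delta: Vec::new(), retrain_ratio, retrains: 0 }
    }

    fn put(&mut self, key: u64) {
        self.delta.push(key);
        let threshold = (self.sorted.len() as f64 * self.retrain_ratio).max(1.0);
        if self.delta.len() as f64 > threshold {
            // Merge and "retrain" (here just a re-sort; a real engine would
            // also refit the learned model over the merged run).
            self.sorted.append(&mut self.delta);
            self.sorted.sort_unstable();
            self.retrains += 1;
        }
    }

    fn contains(&self, key: u64) -> bool {
        self.sorted.binary_search(&key).is_ok() || self.delta.contains(&key)
    }
}
```

Reads stay accurate throughout because the delta buffer is always consulted; only the cost of refitting is deferred.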
Limousine uses a unified, distribution-aware IO model to accurately predict the cost and performance of any candidate design on any workload and hardware.
With 10^48 possible designs, there is no way to benchmark them all. We need analytical models that can evaluate any design in microseconds, not minutes.
Limousine's IO model captures the interplay between data distribution, storage layout, and cloud hardware to calculate latency and dollar cost for every query type — reads, writes, and range scans.
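As a flavor of what such closed-form models look like (an illustrative textbook-style term, not Limousine's actual model): in a leveled LSM-tree with Bloom filters, a point read costs one true IO plus one wasted IO for each false positive on the levels above the target.

```rust
// Illustrative IO model: expected storage IOs per point read in a leveled
// LSM-tree with per-level Bloom filters, and the resulting dollar cost
// under a provider-specific price per IO.
struct IoModel {
    levels: f64,          // number of LSM levels
    bloom_fpr: f64,       // false-positive rate of each level's filter
    cost_per_io_usd: f64, // cloud-provider price per storage IO
}

impl IoModel {
    // One true IO at the target level, plus a wasted IO per false
    // positive on each of the (levels - 1) levels above it.
    fn expected_read_ios(&self) -> f64 {
        1.0 + (self.levels - 1.0) * self.bloom_fpr
    }

    fn read_cost_usd(&self, num_reads: u64) -> f64 {
        self.expected_read_ios() * self.cost_per_io_usd * num_reads as f64
    }
}
```

Because every term is closed-form, evaluating a candidate design is a handful of arithmetic operations, which is what makes searching the full space in seconds feasible.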
The model enables Limousine to construct a navigable Pareto frontier of cost vs. performance, letting engineers pick the sweet spot for their budget and SLA requirements.
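Constructing such a frontier from scored candidates is a standard dominance sweep; here is a minimal sketch (assumed helper names, not Limousine's code): keep a design only if no other design is at least as cheap and at least as fast, with one dimension strictly better.

```rust
// Build a cost-vs-latency Pareto frontier: sort by cost (latency as a
// tiebreak), then sweep, keeping each design that improves on the best
// latency seen so far. Everything else is dominated.
#[derive(Debug, Clone, Copy)]
struct Candidate {
    cost_usd: f64,
    latency_ms: f64,
}

fn pareto_frontier(mut cands: Vec<Candidate>) -> Vec<Candidate> {
    cands.sort_by(|a, b| {
        a.cost_usd
            .partial_cmp(&b.cost_usd)
            .unwrap()
            .then(a.latency_ms.partial_cmp(&b.latency_ms).unwrap())
    });
    let mut frontier = Vec::new();
    let mut best_latency = f64::INFINITY;
    for c in cands {
        if c.latency_ms < best_latency {
            best_latency = c.latency_ms;
            frontier.push(c);
        }
    }
    frontier
}
```

An engineer then picks the frontier point matching their budget and SLA; any design off the frontier is strictly worse on at least one axis for no gain on the other.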