Modern, persistent key-value stores are used throughout numerous applications, from social networks to document stores. We show that the design space of modern key-value stores is complex and that existing designs are sub-optimal.
Typically, key-value stores use an LSM-tree as a storage backend. LSM-trees balance lookup and update performance, by (i) buffering updates, (ii) creating sorted runs, and (iii) periodically merge them forming a series of sorted runs with exponentially increasing size. Searching for a value requires to probe in the worst case all sorted runs, a cost that can modern key-value stores mitigate by employing (a) fence pointers and (b) Bloom filters, both maintained in main-memory. The above design elements of key-value stores exhibit an intrinsic trade-off between lookup cost, update cost, and main memory utilization. Existing designs enable a suboptimal and difficult to tune trade-off among these metrics.
Monkey is an LSM-based key-value store that is able to strike the best possible balance between the costs of updates and lookups for any given application. By mapping the possible design space, we show how to optimally navigate the key-value store design space holistically; Monkey co-tunes the merge policy, the size ratio between levels, the buffer size, and the Bloom filters’ false positive rates.
Monkey improves read performance by optimal allocation of memory to the bloom filters. The core insight is that the worst-case lookup cost is proportional to the sum of the false positive rates of the Bloom filters across all levels of the LSM-tree. Contrary to state of the art that assigns a fixed number of bits-per-element to all Bloom filters, Monkey allocates memory to filters across different levels so as to minimize the sum of their false positive rates.
Monkey improves the worst-case lookup cost by 50-90% and can also answer what-if questions about how changes in environmental parameters would impact performance and how to adapt the various design elements accordingly.
Click on one of the three configuration buttons below or enter a custom configuration to visualize the corresponding performance trade-off curves in Figure 1 and the detailed tuning of the resulting LSM-tree in Figure 2.
All three configurations correspond to 1 TB LSM-tree with 16 byte entries. The first two configurations reflect the default tuning of two state-of-the-art systems, LevelDB and Cassandra respectively. For each system the two lines shown in Figure 1 show the benefit it can achieve by integrating Monkey's memory allocation for Bloom filters. The third configuration is an edge-case where no memory is allocated to the Bloom filters, and so the trade-off curves for Monkey and state-of-the-art designs converge.
Click on one of the scenarios above and get ready for Monkey business.