Let us calculate

What if we could reason about the design space of data structures?

More Details

Data Structures are Everwhere

this is how any data driven subfield and industry
store and access data


There is no perfect data structure

~100 new data structures are designed every year due to continuous hardware and workload changes


The Data Calculator enables interactive and semi-automated design of data structures. It offers a set of fine-grained design primitives that capture the first principles of data layout design: how data structure nodes lay data out, and how they are positioned relative to each other. It allows of the computation of arbitrary data structure performance using learned cost models.
More details →

Step 1: Design Synthesis from First Principles


Nearly all new designs are about combining a small set of fundamental concepts in different ways or tunings. We are working towards mapping the design space of data structures to eventually find the first principles out of which all structures can be designed




Step 2: Learned Primitive Access Models


The Data Calculator contains a library of learned models that describe fundamental access patterns in blocks of data. They capture algorithmic, engineering, and hardware properties



Step 3: Algorithm and Cost Synthesis


For each operation in a workloads, the Data Calculator synthesizes the exact algorithm and its cost for the target hardware using an expert system. Based on the specification of the layout of each data structure node in the path of an operation it decides the best access pattern and based on the learned models it knows the expected cost




Accurate Cost Synthesis

10 million key value pairs, 100 point queries (skewed)

How to Use




~20 seconds, 10 million key value pairs, 100 point queries

Iteratively test diff combinations of design/workload/hardware




~30 minutes, >10 million key value pairs, mixed point and range queries

Automatically identify “the best design possible” to match a workload and hardware




Utilize design continuums and cross design spaces



Did you know?

There are more possible data structures than stars in the sky

The periodic table of data structures

Is this going to replace engineers?

Did the arithmetic calculator replace mathematicians?

Publications


2018

S. Idreos, K. Zoumpatianos, B. Hentschel, M. S. Kester, and D. Guo, “The Data Calculator: Data Structure Design and Cost Synthesis From First Principles, and Learned Cost Models,” in ACM SIGMOD International Conference on Management of Data.

N. Dayan and S. Idreos, “Dostoevsky: Better Space-Time Trade-Offs for LSM-Tree Based Key-Value Stores via Adaptive Removal of Superfluous Merging,” in ACM SIGMOD International Conference on Management of Data.

B. Hentschel, M. S. Kester, and S. Idreos, “Column Sketches: A Scan Accelerator for Rapid and Robust Predicate Evaluation,” in ACM SIGMOD International Conference on Management of Data



2017

N. Dayan, M. Athanassoulis, and S. Idreos, “Monkey: Optimal Navigable Key-Value Store,” in ACM SIGMOD International Conference on Management of Data.

M. S. Kester, M. Athanassoulis, and S. Idreos, “Access Path Selection in Main-Memory Optimized Data Systems: Should I Scan or Should I Probe?,” in ACM SIGMOD International Conference on Management of Data.



2016

Manos Athanassoulis, Stratos Idreos. “Design Tradeoffs of Data Access Methods,” In ACM SIGMOD International Conference on Management of Data.

M. Athanassoulis, Z. Yan, and S. Idreos, “UpBit: Scalable In-Memory Updatable Bitmap Indexing,” in ACM SIGMOD International Conference on Management of Data.

M. Athanassoulis, et al., “Designing Access Methods: The RUM Conjecture,” in International Conference on Extending Database Technology (EDBT).





Sponsors


Back to top