A half-century of database systems research has yielded numerous data system architectures, each optimized for a specific set of applications. Because these systems are designed along multiple dimensions, such as data layouts, storage architectures, or recovery strategies, application architects and software developers face a plethora of feature sets and design options to choose from. This vast design space is still growing as changes in hardware and applications introduce new concerns that warrant new techniques.
Today, matching a scientific or commercial application with its ideal data system is a time-consuming task that requires not only database expertise but also a willingness to compromise. Off-the-shelf solutions often provide only suboptimal performance, yet building a custom-tailored system for the task at hand is an expensive endeavor. Modifying an existing system is extremely complex given today’s monolithic implementations, while designing and building a new data system from scratch requires expertise and tens of person-years of effort.
Rather than chasing changes in workload and hardware by continually designing and implementing new systems from scratch, or forcing end-users to settle for suboptimal solutions, we envision self-designing data systems that smoothly and autonomously navigate the design space to quickly generate the optimal solution for a given application. Self-designing data systems would relieve both system designers and end-users of data management headaches, leading to greater productivity. Moreover, by synthesizing new solutions out of existing ones, a self-designing system may discover architectures that researchers would never have considered, mimicking the process that data system architects currently perform manually.
We are building an infrastructure that allows for design exploration and visualization of core system components. Designers can quickly and interactively design these components: they can easily combine design options, try out alternative designs at a fine granularity, get instant feedback on the impact of their design decisions, ask what-if design questions, get suggestions about good and bad designs, and even semi-automate the process of discovering entirely new and previously unexplored designs, that is, doing research.
We have summarized the design space of data structures in the periodic table of data structures, which categorizes design decisions: not only how they manifest in existing designs but also how they may be combined to create new and so far unknown designs.
We provide an interactive demo for users to both explore the possible design space of key-value data structures and interactively generate the data (expected performance properties of a design given a workload and hardware) in a matter of seconds.
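To make the idea of "expected performance properties of a design given a workload and hardware" concrete, here is a minimal sketch of a what-if comparison between two textbook key-value layouts. It is not the project's actual cost model: the names `Workload`, `Hardware`, `sorted_array_cost`, `append_log_cost`, and `best_design`, as well as the page-access cost formulas, are all illustrative assumptions.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    """Hypothetical workload description (an assumption for illustration)."""
    num_entries: int   # entries already stored
    point_reads: int   # number of point lookups to serve
    inserts: int       # number of inserts to serve


@dataclass(frozen=True)
class Hardware:
    """Hypothetical hardware description (an assumption for illustration)."""
    page_entries: int        # entries that fit in one page
    page_access_cost: float  # cost units per page access


def _pages(w: Workload, hw: Hardware) -> int:
    return math.ceil(w.num_entries / hw.page_entries)


def sorted_array_cost(w: Workload, hw: Hardware) -> float:
    """Toy model: binary search over pages; inserts shift half the pages."""
    pages = _pages(w, hw)
    read = max(1, math.ceil(math.log2(pages + 1)))
    insert = read + pages / 2
    return (w.point_reads * read + w.inserts * insert) * hw.page_access_cost


def append_log_cost(w: Workload, hw: Hardware) -> float:
    """Toy model: a read scans every page (worst case); an insert appends."""
    pages = _pages(w, hw)
    return (w.point_reads * pages + w.inserts * 1) * hw.page_access_cost


# Candidate designs considered by this sketch; a real design space is far larger.
DESIGNS = {
    "sorted_array": sorted_array_cost,
    "append_log": append_log_cost,
}


def best_design(w: Workload, hw: Hardware) -> str:
    """Return the name of the candidate with the lowest estimated cost."""
    return min(DESIGNS, key=lambda name: DESIGNS[name](w, hw))


hw = Hardware(page_entries=100, page_access_cost=1.0)
read_heavy = Workload(num_entries=1_000_000, point_reads=1000, inserts=10)
write_heavy = Workload(num_entries=1_000_000, point_reads=10, inserts=1000)

print(best_design(read_heavy, hw))
print(best_design(write_heavy, hw))
```

Even with only two candidates, the preferred design flips as the read/write mix changes, which is the kind of what-if question described above. A realistic model would account for many more design decisions and hardware properties.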
This project is supported by the National Science Foundation under Grant No. IIS-1452595. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.