CS265 Warm-Up Exercise

Project 0: Implementing a Hash Table

  1. Introduction

    This is a warm-up exercise for those interested in taking CS265.

    Students who do not feel comfortable with all the material discussed here should consider first taking CS165 or speaking to the teaching staff. Strong programming skills in a language other than C may also be sufficient.

    While the course will not focus on programming a system specifically there will be a strong technical component in each of the papers we read and the semester project will generally include a prototype built from scratch or through extension of the CS165 project.

    In addition, we provide some guidelines on tools and background that you should be familiar with prior to taking CS265. As a warm-up exercise we suggest building a simple hash table. A description of the task is included in Project 0. For most students this task should only take a day or so and for students with limited systems programming experience it will provide a glimpse of the expectations of the final project.

  2. Relevant Areas

    CS265 is a rigorous systems class. It includes a semester final research project that will require a strong background including experience in the following areas:

    1. C programming
    2. Data structures
    3. Algorithms
    4. Computer architecture

    These areas are essential for CS265 and some basic familiarity with all these areas is strongly advised. If you don't have some of the required background please speak with the teaching staff.


    C Programming

    The class uses the C language. This is because a low-level language is required to design and build efficient systems as we need to have full control of how we use memory and computational resources.

    The basic concepts of the C language are quite similar with other programming languages. Once you know how to write for loops and if statements you are good to go and then learning a new language is primarily about getting familiar with the syntax and its structure which is typically a trivial task.

    You can easily find several detailed tutorials online to get you started with basic C concepts.

    Here are some areas to focus on:

    • C syntax including pointer arithmetic/manipulation
    • Memory management (malloc/free)
    • File I/O
    • Recursion
    • Structs
    • Header files
    • pthreads (concurrency)
    • Synchronization Primitives (mutexes, condition variables, etc.)

    If you have previous experience with other languages but not with C, then most likely the new thing you need to get familiar with is pointer and memory management. Here is a nice tutorial on that:
    http://pw1.netcom.com/~tjensen/ptr/pointers.htm.

    If you do not have any familiarity with programming before, then you should not take CS265.


    Data Structures

    Data structures are essential in data systems. They drive the physical organization of data and the access methods we use to retrieve data. You will read about modern data structures used in data systems and may extend or develop your own in the final project. At a minimum you should have a good understanding of arrays, lists, and trees.

    There are two aspects when it comes to data structures. The first one has to do with the abstract design of data structures and their properties. The second one has to do with their proper implementation.

    If you have done C programming in the past, it is likely that you have implemented a few of the data structures we'll speak about.

    Also check the Project 0 that we provide as a warm-up programming task.


    Algorithms

    Algorithm design is essential in any area of computer science. Again we will will discuss the research impacts of several algorithms during the semester and think about how they influence or drive design decisions.

    A basic familiarity with algorithm design at the level of an introductory CS class is a fundamental. For example, do you have trouble understanding how a basic sort algorithm (e.g., quicksort) works or how a basic tree traversal happens? If yes, it is advisable that you refresh your basic algorithms skills and then consider enrolling in CS265.


    Computer Architecture

    Modern systems are designed to fully utilize modern hardware. This is essential in order to get the maximum efficiency (in terms of both speed and energy consumption).

    While CS165 teaches you about modern systems tailored for modern hardware, CS265 extends that knowledge by exploring the current research topics around these systems. A strong understanding of modern computer architecture will be helpful.

    Most students should be familiar with the basic properties of a multi-core CPU, CPU caches, memory, and disk. Hardware outside of this (SIMD, hardware accelerators, etc.) will be discussed as a group when appropriate.

  3. Student Categories

    Here we categorize students based on their past experience and give you suggestions on what to expect in CS265.

    Case 1 - Basic Computer Science: You have only taken an introductory computer science class such as CS50 or otherwise have only basic knowledge of computer science. If you fit in this category, you should not enroll for CS265. If you think that CS265 is still the correct class for you please speak to the teaching staff.

    Case 2 - Basic Systems: You have taken an introductory systems class before like, CS61. If you fit in this category, then this means that you already know a bit more about the 4 areas of the class than what a standard computer science introductory class gives you. However, you should still speak to the teaching staff if you have not taken CS165.

    Case 3 - Prior Systems Experience: You have taken another rigorous systems class before, like CS161 or CS165 or have done research in systems. If you are in this category then this means that you are very comfortable with system design. You will be OK in CS265.

    Case 4 - Familiarity with (nearly) all Areas: Another category includes students that do not have any systems experience but do have good understanding and knowledge of all or nearly all areas of the class. If you fall in this category you will be OK in CS265. It is advisable that you carefully go through the warm-up and the class material to see if there is an area you would like to work on prior to class but you will have time to do so during the semester as well.

  4. Tools

    As in most systems classes, a number of tools are particularly valuable in working with data systems. If you are not familiar with one or more of these tools you will want to review their usage or speak with the teaching staff. Here is a partial list of tools you will find helpful in systems development:

    gcc compiler to compile C (gcc.gnu.org)
    gdb to debug your system (www.gnu.org/software/gdb)
    valgrind to do memory profiling (valgrind.org)
    perf for all kinds of monitoring and performance counters (perf.wiki.kernel.org)

    git for version control (git-scm.com)

    You can find many detailed tutorials online for each of these tools.