Data Management

Spring 2015
Class: Tuesday and Thursday 10:30–12:15, SI 006
Instructor: Robert Soulé
TAs: Nosheen Zaza
Office hours: Tuesday 13:00-14:00
Final exam: 8:30am, June 9, SI 008

Overview

Databases are essential to applications in a wide variety of domains, including finance, health care, commerce, and telecommunications. In fact, most applications that people use on a day-to-day basis are backed by databases. This course provides a balanced introduction to relational database management systems from two perspectives: that of the database user, and that of the database implementor. On the user side, the course will cover modeling enterprise data with ER diagrams, the relational model, SQL, logical design with normalization, transaction processing, recovery, and concurrency. On the implementation side, the course will cover physical design, data storage, buffer management, indexing, and query execution. Time permitting, the course will briefly discuss some advanced topics such as processing unstructured or semi-structured data, graph databases, and NoSQL systems.

The course work will consist of implementing components of a relational database management system, exercises using an industrial DBMS, and data modeling and design problems. The implementation projects will involve a significant amount of programming in C++. You do not need to know C++ to take this course, but you should have a background in programming in either C or Java. If you don't have programming experience, you may want to consider another course.

Details

Textbooks

We rely on one textbook:

Additionally, a reference on C++ may be useful, such as one of the following:

Moodle

Please send class-related questions to the Discussions Forum on Moodle (unless, of course, they concern private rather than technical or organizational issues).

Grading Policy

15% for homework and 25% for projects; 20% for the mid-term; 40% for the final exam.

Academic Integrity

I encourage you to collaborate on homework assignments, but you must write up and turn in your own answers and clearly indicate who you collaborated with. If I detect any incidents of cheating, I will report them immediately to the department, and the assignment will be given a grade of 0.

Syllabus

Please be sure to regularly check this page for updates.

Feb 17
Introduction, C++ Crash Course
Feb 19
C++ Continued
  • Read Cow Book 1
Feb 24
Data Storage
Feb 26
Buffer Management
Mar 3
Project Practice
  • Bring your laptops to class
Mar 5
ER Model
Mar 10
Relational Model
Mar 12
Relational Algebra
Mar 17
SQL Data Manipulation Language
  • (10:30-11:30am) Attend Willy Zwaenepoel talk, room A23, Red building
Mar 19
No class. St. Giuseppe.
Mar 24
SQL Data Manipulation Language
  • Read Cow Book 5.0–5.6
  • Lecture notes
  • Example customer database
  • Join examples
  • Duplicates examples
  • SQL Zoo, a tutorial for SQL using various systems.
Mar 26
SQL Data Definition and Control Language
  • Read Cow Book 5.7–5.9
  • Notes on relational division.
  • SQLite queries for division example.
  • DDL examples with customer database
Mar 31
Logical Design with Normalization
  • Read Cow Book 19.1–19.6
  • Zvi Kedem's lecture notes.
Apr 2
Class cancelled.
Apr 7
No class. Easter break.
Apr 9
No class. Easter break.
Apr 14
Logical Design with Normalization II
  • Notes on finding a minimal cover.
Apr 16
Midterm
Apr 21
Tree-Structured Indexing
  • Read Cow Book 10
  • Lecture notes.
  • Notes on B-tree m parameter.
  • Illustrations of insertions and deletions.
Apr 23
Query Evaluation and Optimization
  • Read Cow Book 12.4–12.6, 14.4
  • Lecture notes.
  • B+ Tree project out
Apr 28
Transaction Processing (Recovery)
  • Zvi Kedem's lecture notes.
Apr 30
Transaction Processing (Recovery Continued. Concurrency)
May 5
Transaction Processing (Concurrency Continued)
  • Zvi Kedem's lecture notes.
May 7
NoSQL Concepts
  • Homework 4 out
  • Zvi Kedem's lecture notes.
May 12
PHP and Web Apps
  • Notes on running Apache on OSX.
  • Lecture notes.
May 14
No class. Ascension.
  • Homework 4 due
  • B+ Tree project due
May 19
PHP and Facebook API, JDBC
  • Last programming project out
  • Lecture notes.
  • Sample JDBC code.
May 21
Guest lecture by Daniele Sciascia.
  • Homework 5 out
May 26
Graph databases
  • Bugra Gedik's lecture slides
May 28
Final exam review
  • Homework 5 due (May 29th)
June 9
Final exam at 8:30am, Room SI-008

Tools and Resources

You may also find the following resources useful:

Acknowledgements

Much of the material in this course is based on a similar course taught at NYU by Zvi Kedem. The BadgerDB projects and codebase were developed by Jignesh M. Patel at the University of Wisconsin. The course website is based on the design by Robert Grimm.