Joint primary keys

Occasionally, we’ll find ourselves building bridge tables that map many to many observations. For instance, we might have a table of Students and a table of Classes. Students can take multiple classes and likewise, classes can enroll multiple students.

Thus, we’ll want to create a table of records that maps all unique (student_id, class_id) pairs. Empahsis on unique.

We might do that like so

from utils import connect_to_db

with connect_to_db('classes.db') as cursor:
    cursor.execute('drop table if exists enrollments')
    cursor.execute('''create table enrollments (
                      student_id INTEGER,
                      class_id INTEGER,
                      PRIMARY KEY (student_id, class_id)
                  )''')

and then dump a bunch of records into the table

with connect_to_db('classes.db') as cursor:
    cursor.execute('insert into enrollments values (1, 1)')
    cursor.execute('insert into enrollments values (1, 2)')
    cursor.execute('insert into enrollments values (1, 3)')
    cursor.execute('insert into enrollments values (2, 1)')
    cursor.execute('insert into enrollments values (2, 2)')
    cursor.execute('insert into enrollments values (2, 4)')

and this has the intended effect of barring duplicate entries

try:
    with connect_to_db('classes.db') as cursor:
        cursor.execute('insert into enrollments values (1, 1)')
except Exception as e:
    print(e)
UNIQUE constraint failed: enrollments.student_id, enrollments.class_id

However, selecting, we can see that the table only has two columns

with connect_to_db('classes.db') as cursor:
    cursor.execute('select * from enrollments')
    print(cursor.fetchall())
[(1, 1), (1, 2), (1, 3), (2, 1), (2, 2), (2, 4)]

It’s not hard to imagine wanting to join another table against this table on a student/course level– maybe for grades. In that case, we’ll want to join against one simple column on a primary key, not on two columns.

Thus, the preferred syntax here might be

with connect_to_db('classes.db') as cursor:
    cursor.execute('drop table if exists enrollments')
    cursor.execute('''create table enrollments (
                      id INTEGER PRIMARY KEY AUTOINCREMENT,
                      student_id INTEGER,
                      class_id INTEGER,
                      UNIQUE (student_id, class_id)
                  )''')

We need to be a little more deliberate about our insert statements

with connect_to_db('classes.db') as cursor:
    cursor.execute('insert into enrollments (student_id, class_id) values (1, 1)')
    cursor.execute('insert into enrollments (student_id, class_id) values (1, 2)')
    cursor.execute('insert into enrollments (student_id, class_id) values (1, 3)')
    cursor.execute('insert into enrollments (student_id, class_id) values (2, 1)')
    cursor.execute('insert into enrollments (student_id, class_id) values (2, 2)')
    cursor.execute('insert into enrollments (student_id, class_id) values (2, 4)')

But we get a nice, clean primary key in the left-most column

with connect_to_db('classes.db') as cursor:
    cursor.execute('select * from enrollments')
    print(cursor.fetchall())
[(1, 1, 1), (2, 1, 2), (3, 1, 3), (4, 2, 1), (5, 2, 2), (6, 2, 4)]
!rm classes.db