Joint primary keys
Occasionally, we’ll find ourselves building bridge tables that map many to many observations. For instance, we might have a table of Students and a table of Classes. Students can take multiple classes and likewise, classes can enroll multiple students.
Thus, we’ll want to create a table of records that maps all unique (student_id, class_id)
pairs. Empahsis on unique.
We might do that like so
from utils import connect_to_db
with connect_to_db('classes.db') as cursor:
cursor.execute('drop table if exists enrollments')
cursor.execute('''create table enrollments (
student_id INTEGER,
class_id INTEGER,
PRIMARY KEY (student_id, class_id)
)''')
and then dump a bunch of records into the table
with connect_to_db('classes.db') as cursor:
cursor.execute('insert into enrollments values (1, 1)')
cursor.execute('insert into enrollments values (1, 2)')
cursor.execute('insert into enrollments values (1, 3)')
cursor.execute('insert into enrollments values (2, 1)')
cursor.execute('insert into enrollments values (2, 2)')
cursor.execute('insert into enrollments values (2, 4)')
and this has the intended effect of barring duplicate entries
try:
with connect_to_db('classes.db') as cursor:
cursor.execute('insert into enrollments values (1, 1)')
except Exception as e:
print(e)
UNIQUE constraint failed: enrollments.student_id, enrollments.class_id
However, selecting, we can see that the table only has two columns
with connect_to_db('classes.db') as cursor:
cursor.execute('select * from enrollments')
print(cursor.fetchall())
[(1, 1), (1, 2), (1, 3), (2, 1), (2, 2), (2, 4)]
It’s not hard to imagine wanting to join another table against this table on a student/course level– maybe for grades. In that case, we’ll want to join against one simple column on a primary key, not on two columns.
Thus, the preferred syntax here might be
with connect_to_db('classes.db') as cursor:
cursor.execute('drop table if exists enrollments')
cursor.execute('''create table enrollments (
id INTEGER PRIMARY KEY AUTOINCREMENT,
student_id INTEGER,
class_id INTEGER,
UNIQUE (student_id, class_id)
)''')
We need to be a little more deliberate about our insert statements
with connect_to_db('classes.db') as cursor:
cursor.execute('insert into enrollments (student_id, class_id) values (1, 1)')
cursor.execute('insert into enrollments (student_id, class_id) values (1, 2)')
cursor.execute('insert into enrollments (student_id, class_id) values (1, 3)')
cursor.execute('insert into enrollments (student_id, class_id) values (2, 1)')
cursor.execute('insert into enrollments (student_id, class_id) values (2, 2)')
cursor.execute('insert into enrollments (student_id, class_id) values (2, 4)')
But we get a nice, clean primary key in the left-most column
with connect_to_db('classes.db') as cursor:
cursor.execute('select * from enrollments')
print(cursor.fetchall())
[(1, 1, 1), (2, 1, 2), (3, 1, 3), (4, 2, 1), (5, 2, 2), (6, 2, 4)]
!rm classes.db