Functional dependencies
In week 2, the term functional dependency was introduced in the discussion on keys. For the rest of this set of notes, you need to be clear about the meaning of this term, so if you need to refresh your memory you should review the week 2 notes now.
Recall that the purpose of a key is to uniquely identify a row in table. Choosing a good key is usually fairly easy for simple entities, but sometimes we may have a set of attributes that are not yet properly structured. Choosing a key in this situation is more difficult. It can help to draw out a dependency diagram so that you have a visual representation of the dependencies in your data. As an example, takes the data we would need to store about students and the grades they receive for their modules. The set of relevant attributes might be:
student_results(matric_no, first_name, last_name, programme, module_code, module_title, school, result)
Without splitting this set of attributes up, which could be used as a primary key? The technical answer is a minimal set of attributes that functionally determine the rest. In practice, it often helps to look at some example data to answer the question:
matric_no | first_name | last_name | programme | module_code | module_title | school | result |
---|---|---|---|---|---|---|---|
10001234 | Ed | Edwards | BEng | Computing | SET07101 | Agile App Development | Computing |
10001234 | Ed | Edwards | BEng | Computing | SET07102 | Software Development 1 | Computing |
10010123 | Jo | Jones | BIS | SET07101 | Agile App Development | Computing | 64 |
10010123 | Jo | Jones | BIS | SOE07101 | Business Skills | Business | 70 |
10104321 | Pete | Peters | Management | SOE07101 | Business Skills | Business | 70 |
10012222 | Pete | Peters | English | JAC07109 | Social Media | Arts | 68 |
Working through the example data, you should quickly find that there is no single attribute that can uniquely identify a row. To select an appropriate combination we can set out the structure as shown below with the various functional dependencies made explicit.
Every attribute is dependent either on matric_no, on module_code or on a combination of the two. Therefore the combination of those two attributes would make a good choice for the primary key for this set of attributes. This can be written:
student_results(matric_no, first_name, last_name, programme, module_code, module_title, school, result)