Relationships
Summarising what you know so far:
- Entities are important things that we need to store information about
- Similar entities can be grouped in entity types
- An entity type is represented in a database by a table
- Each row in a table represents one entity of that type
- The individual pieces of information about an entity are its attributes
- Attributes correspond to the columns in a database table
- Entities can have many different relationships with each other
- Relationships between entities are shown using an entity relationship diagram (ER diagram or ERD)
Real world situations are typically vague and poorly structured. Part of the job of a database designer is to use the simple structures available within a relational database to represent these quite complex situations. To bridge the gap between the capabilities of the relational model and the real world, we need to think about the different types of relationship that can exist between entities. We also need to know how to represent the kind of uncertainty that is found in the real world. We can do this by using some simple examples using the conventions of ER diagrams. As we go through, we will add more information to the diagram so that it represents the real situation more closely.
Multiplicity
First, take the example that was in this week's introduction:
This diagram adequately captures the idea that a STUDENT enrols on a PROGRAMME; however, that is clearly not sufficient. Many students will usually usually enrol on the same programme. We therefore need some indication on the diagram of the number of entities of each type that are involved in the relationship, and this is shown in the version below.
In ER diagrams, an asterisk (star) is interpreted as "many". Enrols_on can therefore be described as a "many-to-one" relationship, usually written *:1. Notice that the relationship has direction: enrols_on is equivalent to a relationship called is_populated_by in the opposite direction. We would call is_populated_by a 1:* relationship. A good rule of thumb is to name the relationship from the perspective of the entity at the "one" end. Most relationships will be 1:\ or *:1, but other types also exist:
1:1 (one-to-one): one member of staff is the leader of one programme
*:* (many-to-many): one student studies many modules
The specification of the number of entities involved in the relationship is known as its multiplicity; however, this is not yet the end of the story.
Notice that you must be careful when defining the cardinality of a relationship. The 1:1 relationship above is only correct if there is some rule that says a member of staff can only be leader of one programme at a time. If such a rule does not exist, one person could be the leader of many programmes, and the relationship is actually 1:*. This is a very common error, and to identify it, you need to think about the relationship in both directions.
Cardinality and optionality
The last diagram above seems correct... However, what about the new student who has not yet registered for any modules? Similarly, what about the new module which has not yet been run? What about the majority of staff who are not programme leaders? Clearly, we need to able to show whether a relationship is mandatory (it must always be true) or optional (it can sometimes be true). We can do this quite easily on our diagram by adding a little more information to the multiplicity labels:
The multiplicity labels now show a range of values of the form minimum .. maximum. The diagram above can therefore be read from left to right as
A member of staff leads zero or one programmes
This captures the idea that a member of staff does not have to be a programme leader, because that person can be linked to zero programmes via this relationship. Therefore, the relationship is optional. Notice that this is true if there is a rule that prevents a member of staff from leading more than one programme. The relationship can also be read in the other direction as
A programme is led by one and only one member of staff
This leaves no question of there being a programme with no programme leader, and the relationship is therefore mandatory in this direction. This is just one of the ways in which direction is important in defining relationships.
To summarise:
- A full definition of a relationship includes the number of participants - this is the cardinality of the relationship, and the options are 1:1, 1:* or *:*
- A relationship can be mandatory or optional, and each direction has to be considered independently
- Optionality can be expressed on an ER diagram by extending the cardinality labels
- Cardinality and optionality together make up the multiplicity of the relationship.
NB. Do not confuse the two types of shorthand:
The labels 1:*, *:1, 1:1 and *:* refer to cardinality only, and are applied to relationships. A multiplicity label of the form n..m describes one end of a relationship and includes information about cardinality and optionality.