Normalisation
Posted in Computing, DatabasesNormalisation is a process which improves the design of a relational database by removing duplicate data, keeping data consistent, making the structure flexible and allowing for complex queries. There are a number of stages and types of normalisation.
First normal form (1NF)
A table in first normal form contains no repeating attribute or groups of attributes. Converting an unnormalised database to 1NF typically involes splitting the database in two or more to remove the repeating attribute. For example a table may contain a list of shops together with the products they sell information about those products. In this case the product information will be repeated so it should be removed and put into a products table which can then be linked to the original shop table by means of a foreign key, e.g ProductID. This example highlights the weakness of the two table structure - as many shops will contain many products the shop data will also be repeated in the shop database with each product.
Second normal form (1NF)
A table in second normal form contains no partial key dependancies, this means that any attribute, excluding the primary key, which depends on only the primary key is removed. With the example above the shop information depends only on the primary key, e.g. ShopID. Hence the shop name and location should be removed and linked to the shop database with another foreign key.
Third normal form (3NF)
A table in third normal form is both in 2NF and contains no non-key dependancies. This means that any attributes that are not dependant of the primary are removed to another table so that they are.
Boyce-Codd normal form (BCNF)
A table is in Boyce-Codd normal form when all the determinants in the relation are candidate keys. A determinant is an attribute which another attribute is dependant on. A candidate key is an attribute which is unique such that it could be the primary key (i.e. it is unique to each row of the database). For example a student database may have two candidate keys - StudentNumber and CandidateNumber. Both are unique enough to the primary key but only one is chosen. The other is known as an alternate key.
If there is only one candidate key (which means it will be the primary key as well) then 3NF is the same as BCNF.