Database normalization

The guidelines for proper relational database design are laid out in the relational model. They are grouped into 5 groups called normal forms. The first normal form represents the lowest form of database normalization, the fifth represents the highest form of database normalization.

The normal forms are guidelines for good database design. You are not obliged to adhere to all five normal forms when designing a database. Nevertheless, you are advised to normalize you database to some extent, because normalization has some significant advantages in terms of the efficiency and maintainability of your database.

  • In a normalized database structure you can make complex data selections with relatively simple SQL queries.
  • Data integrity. A normalized database allows for reliable data storage. 
  • Database normalization avoids redundant (duplicate) storage of data. Data are always stored in only one location which makes it easy to insert, update or delete data. There is one exception to this rule. The keys themselves are stored in multiple locations, because they are copied as foreign keys to other tables. If you want to state it correctly you should say that logical data is not duplicated.
  • Scalability is the ability of a system to deal with future growth. For a database this means that it must still be able to perform quickly when the number of users and the amount of data grows. Scalability is a very important characteristic of any database model and for database management systems.

These are some of the general tasks that are associated with database normalization.

  • Ordering data in logical groups or sets.
  • Finding relationships between sets of data. You have seen an examples of the one-to-many relationship and the many-to-many relationship in this tutorial.
  • Minimizing data redundancy. In other words, making sure logical data is stored in only one location.

Very few databases adhere to all five normal forms presented in the relational model. Usually databases are normalized untill the second or third normal form. The fourth and fifth normal form are rarely used. I will therefore limit this database design tutorial to discuss only the first, second and third normal form.