The second normal form (2NF)

In order for a database to be normalized according to the second normal form it must first be normalized according to the rules of the first normal form. The second normal deals with data redundancy.

Data redundancy

Rule: Non-primary key fields must be dependent on the primary key.

That may sound a bit academic. What it means is that you must only store data in a table that is directly related and doesn't belong to another entity. Adhering to the second normal form is a matter of finding data that is often duplicated over multiple rows and that could belong to a different entity.

duplication accross rows

Duplication of data accross rows in the store field.

The table above could belong to a company that sells cars and has multiple stores in The Netherlands.

When looking at the table above you might see multiple examples of duplication accross rows. The brand field could be split of into a separate table. Also, the type field could be split of into a table that has a many-to-one relationship with the brand table, because a brand hans multiple types.

The store column contains the store where the car is currently located. Store is an obvious case of data redundancy and a good candidate for a separate entity that should be referenced from the car table with a foreign key relationship.

Below is an example of how you could better model the car situation by avoiding data redundancy in the car table.

separated entities

In the setup above the car table has a foreign key reference to the type and store tables. The brand column is gone, because brand is implicitely referenced through the type reference. When a type is referenced, a brand is also referenced, because a type belongs to a brand.

Data redundancy has been largely removed from the model. If you are strict perhaps you still aren't satisfied with this solution. What about the country_of_origin field in the brand table? There is no duplication yet, because there are only four brands from different countries. A strict database designer might split off the country names into a separate country table.

And even then you might still not be satisfied, because you could also split of the color field in the car table.

How strictly you design your tables is up to you and depends on the situation. If you are going to have huge quantities of cars in your system and you want to be able to search cars by color, it might be wise to split off colors into a separate table, so they are not duplicated.

There is another situation where you might want to split off the colors into a separate table. If you want to allow employees of the company to enter new cars and you want them to be able to choose a car color from a predefined list. In that case you want to store all possible colors in your database, even if there is no car with a certain color yet, you still want it to be present in the database, so the employee can select it. This is certainly a case where you must splitt of the colors into a separate table.