Database Normalization

Normalizing our database is important to minimize redundancy, and make sure that only related data is stored in each table, it also help to keep the referential integrity, avoid orphan data, for example.

I remember my college days, the database teacher taking a month to explain database normalization, he had a very low tone voice and after 10 minutes his voice became a confortably buzz, I fought so hard to kept me awake after working all day long in these huge tedious classes, all I thought was that I wanted to die, and kept wondering why I choose computer science. For a while I didn't liked database a lot. But some years after it I entered a company that have a code training, to prepare the employees for the market, and in 45 minutes I completely understood database normalization.

So let's begin to learn about database normalization.

The First Normal Form - The key

The First Normal Form (1FN) says that every record should have only one data related to the primary key. The solution is create a new table with the data that are duplicated and references it by a foreign key with a relationship one-to-many. See the picture below. The column language have multiple data and besides that whe have two columns that could be one (telephone1 and telephone2).

The Second Normal Form - The Whole Key

The second normal form (2FN) says that the table must be on the 1FN and we shouldn't have data that is not related to the primary key. Look the example. The column product_name is not related to the order. It have some troubles, imagine the product description changes, you need to search all the table orders looking the product description and change all of it. To fix this we create a new table, and references it with a one-to-one relationship.

In this image you could say value column is duplicated, why you kept it duplicated? And the answer is, if the product value changes over the time, I still can traceback all my sales, and the values ​​will match with the old records.

The Third Normal Form - Nothing but the key

The third normal form is the easiest one. The table should be in the 2FN and we should delete the columns that can be retrieved by the other columns. In this case we simply delete the column. I'll keep the same table as the 2FN the column total can be retrieved by muplipying the quantity by the value, so we just delete it.

Rogério Eduardo Pereira
25/02/2020