Know when to denormalize columns

Normalization is DB Design 101

Normalization is the process structuring a relational database to reduce redundancy and improve integrity. One of the steps is to remove any columns that might be derived from others. This removes the need to sync the values whenever the fields are changed.

There are, however, a few potential issues with this.

But normalization has a few issues

The first is performance. Take, for example, a situation where you have to export a list of orders, including the total value of each. To do this, you also have to add up all the order items to get the total. Furthermore, what if you also had to export the product and supplier names as well. When dealing with a substantial number of entries, this results in a meaningful performance penalty.

The second issue is that the related fields can change independently. When the user makes an order it has a commission rate attached to it. A normalized database would have a supplier that owns the order. The commission would be the commission set for that supplier. However, updating a supplier’s commission will update the commission of all the previously created orders. Similarly, an order item might only contain a reference to its product. Therefore, changing the product then makes all the previous order items wrong.

Lastly is convenience. For reports, in particular, having some derived columns saved makes it easier because you won’t have to use joins to calculate them.

So, you should consider when to copy fields and when normalized fields are suitable

Keep in mind if saving a related field, instead of a reference, is suitable based on the examples above.

Denormalize frequently accessed derived fields, when a historical entry is required, and when it is much more convenient to do calculations.

Normalization is DB Design 101

But normalization has a few issues

So, you should consider when to copy fields and when normalized fields are suitable

Here’s an example