Database Normalization and Denormalization: A Complete Guide

Understanding Normal Forms and When to Denormalize

Featured image



Overview

Today, we’ll explore the concepts and applications of database normalization and denormalization. In database design, normalization is the process of maintaining data consistency and minimizing redundancy, while denormalization involves storing some data redundantly to optimize performance. By appropriately utilizing these two concepts, we can create efficient and scalable database designs.



What is Normalization?

Normalization is the process of organizing data to reduce redundancy, maintain integrity, and ensure data consistency. This is achieved by converting data into specific formats called Normal Forms (NF).


Goals of Normalization


Advantages of Normalization


Disadvantages of Normalization



Normal Forms (NF)

Normalization progresses through stages: 1NF → 2NF → 3NF → BCNF → 4NF → 5NF. Generally, normalization is performed up to 3NF (Third Normal Form) or BCNF.


1️⃣ First Normal Form (1NF)

Requirements

Example


Before 1NF:
OrderID CustomerName Products
101 John Doe TV, Refrigerator
102 Jane Smith Washing Machine
After 1NF:
OrderID CustomerName Product
101 John Doe TV
101 John Doe Refrigerator
102 Jane Smith Washing Machine


2️⃣ Second Normal Form (2NF)

Requirements

Example


Before 2NF:
OrderID CustomerName CustomerAddress Product
101 John Doe Seoul TV
101 John Doe Seoul Refrigerator
102 Jane Smith Busan Washing Machine


After 2NF:
Customer Table:
CustomerName CustomerAddress
John Doe Seoul
Jane Smith Busan
Order Table:
OrderID CustomerName Product
101 John Doe TV
101 John Doe Refrigerator
102 Jane Smith Washing Machine

3️⃣ Third Normal Form (3NF)

Requirements

Example


Before 3NF:
OrderID CustomerName CustomerAddress CustomerGrade
101 John Doe Seoul VIP
102 Jane Smith Busan Regular


After 3NF:
Customer Table:
CustomerName CustomerAddress CustomerGrade
John Doe Seoul VIP
Jane Smith Busan Regular
Order Table:
OrderID CustomerName
101 John Doe
102 Jane Smith


BCNF (Boyce-Codd Normal Form)

Requirements

Example


Before BCNF:
DeptID Professor Course
CS01 Prof. Kim Database
CS01 Prof. Kim OS
CS02 Prof. Park Data Structures


After BCNF:


Professor-Department Table:
Professor DeptID
Prof. Kim CS01
Prof. Park CS02
Course Table:
DeptID Course
CS01 Database
CS01 OS
CS02 Data Structures


4NF (Fourth Normal Form)

Requirements

Example


Before 4NF:
StudentID Course Club
S001 Database Soccer
S001 OS Soccer
S001 Database Reading
S001 OS Reading


After 4NF:
Student-Course Table:
StudentID Course
S001 Database
S001 OS
Student-Club Table:
StudentID Club
S001 Soccer
S001 Reading


5NF (Fifth Normal Form)

Requirements

Example


Before 5NF:
StudentID InstructorID Course
S001 I100 Database
S002 I100 Database
S001 I200 OS
S002 I300 Data Structures


After 5NF:
Student-Instructor Table:
StudentID InstructorID
S001 I100
S002 I100
S001 I200
S002 I300
Instructor-Course Table:
InstructorID Course
I100 Database
I200 OS
I300 Data Structures



What is Denormalization?

Denormalization is the process of intentionally introducing redundancy into a database to improve performance. It involves storing some data redundantly to enhance query speed.


When to Apply Denormalization

  1. When JOIN operations cause performance degradation
  2. When read operations are frequent and write operations are rare
  3. When real-time response speed is critical


Example

Normalized Structure (Requires JOIN):

-- Customer Table
CREATE TABLE customers (
    customer_id INT PRIMARY KEY,
    customer_name VARCHAR(100)
);

-- Order Table
CREATE TABLE orders (
    order_id INT PRIMARY KEY,
    customer_id INT,
    product VARCHAR(100),
    FOREIGN KEY (customer_id) REFERENCES customers(customer_id)
);

Denormalized Structure (No JOIN needed):

CREATE TABLE orders (
    order_id INT PRIMARY KEY,
    customer_name VARCHAR(100),
    product VARCHAR(100)
);



Conclusion

Normalization and denormalization are core concepts in database design.


When to Normalize


When to Denormalize

The key is to find the right balance between normalization and denormalization based on your project’s requirements and characteristics.



References