Understanding Database Normalization in SQL with Example

Understanding Database Normalization in SQL with Example

01 May 2025
Beginner
56.1K Views
26 min read
Learn with an interactive course and practical hands-on labs

Free SQL Server Online Course with Certificate - Start Today

Normalization in SQL is the process of organizing data in a database to eliminate redundancy and ensure data integrity. It involves dividing large tables into smaller, related tables and defining relationships between them. The goal is to structure the data efficiently and reduce issues like data duplication and update anomalies.

In this SQL Server tutorial, we’ll break down the different normal forms (1NF, 2NF, 3NF, BCNF) with easy-to-follow examples. Whether you’re preparing for an interview or learning database design, understanding SQL normalization is essential for creating scalable and consistent databases. This guide is perfect for beginners who want to start a career in Data Science or Data Engineering

In this blog, we’ll explore:

  • What normalization is and why it matters
  • The different normal forms (1NF, 2NF, 3NF, BCNF) with real-world examples
  • Pros and cons of normalization, when to use it (and when not to)
  • Best practices for implementing normalization in SQL databases

What is Normalization in SQL?

Normalization in SQL is a method used to organize data in a clean and structured way by breaking big tables into smaller ones. This helps remove duplicate data and keeps everything more accurate. The goal of normalization in SQL is to make databases faster, easier to manage, and more reliable. It also helps avoid problems when adding or updating data.
Let’s take a brief look at some scenarios where normalization is often used.

Data integrity:

Data integrity in SQL means making sure the information in your database stays accurate, consistent, and reliable over time. It helps prevent errors, duplicates, or mismatched data by following proper rules and relationships. With strong data integrity, your SQL database runs smoother and delivers trusted results every time.

Efficiency querying:

Efficient querying in SQL becomes easier when data is well-structured. Normalization in SQL helps organize data to reduce duplication, making queries faster and more accurate. By using normalization in SQL, you improve performance and simplify complex searches across your database.

Storage optimization:

Storage optimization in SQL means using your database space wisely by reducing unnecessary or duplicate data. Normalization in SQL plays a key role here by breaking large tables into smaller ones, which helps save storage. With proper normalization in SQL, you not only save space but also keep your data clean and organized

Why is Normalization in SQL Important?

Normalization in SQL is important because it reduces data redundancy and improves data integrity. It also enhances query performance and makes database updates more efficient.
Here are a few important reasons why it's so essential:

Reduces redundancy:

Redundancy happens when the same data is stored in multiple places, leading to wasted space and possible errors. Normalization in SQL reduces redundancy by organizing data into smaller, related tables, making your database cleaner and more efficient.

Improves query performance:

Normalization in SQL helps speed up queries by breaking data into smaller, organized tables. This structure allows the database to search and retrieve information faster, improving overall performance and efficiency.

Minimizes update anomalies:

Normalization in SQL helps avoid issues that can happen when updating data, like conflicting or inconsistent records. By organizing data into separate, related tables, updates become safer and more accurate across the database.

Enhances data integrity:

Normalization in SQL ensures that your data stays accurate, consistent, and reliable by removing duplicates and enforcing clear relationships. This structure helps maintain trust in your data across all operations and updates.

What Causes the Need for Normalization?

The need for normalization in SQL arises from problems like data redundancy, inefficient queries, and challenges in maintaining consistency. When data is duplicated, it can lead to errors during updates and wasted storage, which normalization in SQL helps to eliminate.
Several factors contribute to the need for normalization, ranging from data redundancy to the complexity of managing relationships. Let’s dive into the details!

Insertion, deletion, and update anomalies:

Insertion, deletion, and update anomalies refer to problems that arise when managing data in a database. Without proper normalization in SQL, inserting, deleting, or updating data can lead to inconsistencies, errors, and redundant information. Normalization helps prevent these anomalies by organizing data in a way that maintains integrity and consistency.

Difficulty in managing relationships:

Difficulty in managing relationships occurs when data is poorly structured, making it hard to establish clear links between tables. Normalization in SQL simplifies this by breaking data into smaller, related tables, ensuring that relationships are easy to define, manage, and maintain.

Other factors that drive the need for normalization in SQL include partial dependencies and transitive dependencies. Partial dependencies can cause data redundancy and update anomalies, while transitive dependencies can lead to data inconsistencies. In the following sections, we’ll explore how to address these dependencies to achieve proper database normalization.

Different Types of Database Normalization in SQL?

The world's database community has developed a certain guideline for ensuring that the databases in our application should be normalized in a specific way. These are referred to as normal forms and are numbered from one, where the lowest form of normalization is referred to as the first normal form or 1NF, through five, which is the fifth normal form or 5NF, along with the BCNF, which is the Boyce-Codd Normal Form.

While working with real-time applications or scenarios, we will often see different forms of normalization such as 1NF, 2NF, and 3NF, along with the occasional 4NF and BCNF. But the fifth normal form (5NF) is very rarely seen and used in enterprise applications due to its complexity.

We organize the data into database tables by using the normal forms of rules or conditions. Normal forms help us to make a good database design. Generally, we organize the data up to the third normal form. We rarely use the fourth and fifth normal forms.

Normal Forms: Types of Normalization

To understand normal forms, consider the following unnormalized database table. Now we will normalize the data in the table below using normal forms.

Normal Forms: Types of Normalization

Read More: DBMS Interview Questions For Freshers

1. First Normal Form (1NF)

A database table is said to be in 1NF if it contains no repeating fields/columns. The process of converting the UNF table into 1NF is as follows:

  1. Separate the repeating fields into new database tables along with the key from the unnormalized database table.
  2. The primary key of new database tables may be a composite key.

1NF of the above UNF table is as follows:

First Normal Form (1NF)

2. Second Normal Form (2NF)

A database table is said to be in 2NF if it is in 1NF and contains only those fields/columns that are functionally dependent(which means the value of the field is determined by the value of another field(s)) on the primary key. In 2NF, we remove the partial dependencies of any non-key field.

The process of converting the database table into 2NF is as follows:

  1. Remove the partial dependencies (A type of functional dependency where a field is only functionally dependent on the part of the primary key) of any non-key field.
  2. If field B depends on field A, and vice versa. Also, for a given value of B, we have only one possible value of A and vice versa. Then we put the field B into a new database table where B will be the primary key and also marked as a foreign key in a parent table.

2NF of the above 1NF tables is as follows:

Second Normal Form (2NF)

As you can see in the above example, we have divided our 1NF table into two tables, viz. Table 1 and Table 2. Table 1 contains project information. Table 2 contains information on the employees assigned to the specific projects.

We have introduced a new column called project_code, which is the primary key for table 1. Records can be uniquely identified in Table 1 using the project code column itself.

Read More: SQL Interview Questions And Answers For Experienced

3. Third Normal Form (3NF)

A database table is said to be in 3NF if it is in 2NF and all non-key fields are dependent on the primary key. We can also say a table is in 3NF if it is in 2NF and no fields of the table are transitively functionally dependent on the primary key. The process of converting the table into 3NF is as follows:

  1. Remove the transitive dependencies(A type of functional dependency where a field is functionally dependent on the Field that is not the primary key. Hence, its value is determined indirectly by the primary key )
  2. Make a separate table for transitive-dependent Fields.

3NF of the above 2NF tables is as follows:

Third Normal Form (3NF)

4. Boyce Code Normal Form (BCNF)

A database table is said to be in BCNF if it is in 3NF and contains every determinant as a candidate key. The process of converting the table into BCNF is as follows:

  1. Remove the nontrivial functional dependency.
  2. Make a separate table for the determinants.

BCNF of the table below is as follows:

Boyce Code Normal Form (BCNF)

5. Fourth Normal Form (4NF)

A database table is said to be in 4NF if it is in BCNF and the primary key has a one-to-one relationship to all non-key fields. We can also say a table is in 4NF if it is in BCNF and contains no multi-valued dependencies. The process of converting the table into 4NF is as follows:

  1. Remove the multivalued dependency.
  2. Make a separate table for multivalued Fields.

4NF of the table below is as follows:

Fourth Normal Form (4NF)

6. Fifth Normal Form (5NF)

A database table is said to be in 5NF if it is in 4NF and contains no redundant values. We can also say a table is in 5NF if it is in 4NF and contains no join dependencies. The process of converting the table into 5NF is as follows:

  1. Remove the join dependency.
  2. Break the database table into smaller and smaller tables to remove all data redundancy.

5NF of the table below is as follows:

Fifth Normal Form (5NF)

Read More: SQL Server Interview Questions and Answers

Normalization in SQL With Real-World Examples

We’ve already introduced the different levels of data normalization in SQL. Now, let’s take a closer look at each level with clear examples and detailed explanations.

First Normal Form (1NF) Normalization:

First Normal Form (1NF) is the first step in organizing your database for clarity and efficiency. A table is in 1NF when each column holds only atomic (indivisible) values, meaning no lists or sets are allowed in a single cell. It also ensures that each column contains data of the same type and that there are no repeating groups. This makes the data easier to search, filter, and maintain over time.

Before Applying 1NF (Non-1NF Table)

CustomerIDNameProducts
1JohnLaptop, Mouse
2AliceMonitor
3BobKeyboard, Mouse, USB

After Applying 1NF (1NF-Compliant Table)

CustomerIDNameProduct
1JohnLaptop
1JohnMouse
2AliceMonitor
3BobKeyboard
3BobMouse
3BobUSB

Explanation:

First Normal Form (1NF) ensures that each column in a table contains atomic values, meaning each cell holds a single value and there are no repeating groups. By applying 1NF, data is organized into individual rows, improving consistency and making it easier to query.

Second Normal Form (2NF):

Second Normal Form (2NF) builds on the principles of First Normal Form (1NF) by addressing the issue of partial dependency. A table is in 2NF when it is already in 1NF, and every non-key column is fully dependent on the primary key. This means that no column should depend on just a part of the composite primary key if the primary key consists of multiple columns. By eliminating partial dependencies, 2NF ensures that the data is more structured, reducing redundancy and improving data integrity.

Before Applying 2NF (Non-2NF Table)

StudentIDCourseIDStudentNameCourseName
1101JohnMath
1102JohnScience
2101AliceMath

After Applying 2NF (2NF-Compliant Tables)

Student Table:

StudentIDStudentName
1John
2Alice

StudentCourse Table:

StudentIDCourseIDCourseName
1101Math
1102Science
2101Math

Explanation:

Second Normal Form (2NF) ensures that all non-key attributes are fully dependent on the entire primary key, not just part of it. A table is in 2NF if it’s in 1NF and has no partial dependencies, which helps remove redundant data in tables with composite keys.

Third Normal Form (3NF):

Third Normal Form (3NF) takes normalization a step further by removing transitive dependencies. A table is in 3NF if it is already in 2NF, and all non-key columns are directly dependent only on the primary key, not on other non-key columns. This helps avoid data duplication and maintains data integrity by ensuring that each piece of information is stored in one place only.

Before Applying 3NF (Non-3NF Table)

EmpIDEmpNameDeptNameDeptLocation
1AliceHRNew York
2BobITSan Francisco
3CharlieHRNew York

After Applying 3NF (3NF-Compliant Tables)

Employee Table:

EmpIDEmpNameDeptID
1Alice101
2Bob102
3Charlie101

Department Table:

DeptIDDeptNameDeptLocation
101HRNew York
102ITSan Francisco

Explanation:

Third Normal Form (3NF) eliminates transitive dependencies, ensuring that non-key attributes depend only on the primary key. A table is in 3NF if it’s in 2NF and all its attributes are functionally dependent only on the primary key not on other non-key attributes. This reduces data redundancy and improves data integrity.

Boyce-Codd Normal Form (BCNF):

Boyce-Codd Normal Form (BCNF) is an advanced version of the Third Normal Form (3NF) used to further eliminate redundancy in relational databases. A table is in BCNF if it is already in 3NF and every determinant is a candidate key. In other words, if a non-trivial functional dependency exists, the left-hand side must be a super key. BCNF ensures that no anomalies arise from overlapping candidate keys or dependencies that 3NF doesn't fully resolve.

Before Applying BCNF (Non-BCNF Table)

ProfessorSubjectDepartment
Dr. SmithDatabaseComputer Science
Dr. SmithAIComputer Science
Dr. JonesMathMathematics

After Applying BCNF (BCNF-Compliant Tables)

Professor Dept Table:

ProfessorDepartment
Dr. SmithComputer Science
Dr. JonesMathematics

Professor Subject Table:

ProfessorSubject
Dr. SmithDatabase
Dr. SmithAI
Dr. JonesMath

Explanation:

Boyce-Codd Normal Form (BCNF) is an advanced version of the Third Normal Form (3NF) that handles certain edge cases where 3NF doesn't fully eliminate redundancy. A table is in BCNF if it’s in 3NF and every determinant is a candidate key. This ensures stronger consistency by eliminating anomalies caused by functional dependencies.

Fourth Normal Form (4NF):

Fourth Normal Form (4NF) focuses on removing multi-valued dependencies from a table. A table is in 4NF if it is already in Boyce-Codd Normal Form (BCNF) and contains no more than one independent multi-valued dependency. This means that for a given primary key, a table should not store two or more independent sets of multi-valued facts, which can cause unnecessary repetition and data anomalies.

Before Applying 4NF (Non-4NF Table)

StudentCourseHobby
JohnMathCricket
JohnMathMusic
JohnScienceCricket
JohnScienceMusic

After Applying 4NF (4NF-Compliant Tables)

StudentCourses Table:

StudentCourse
JohnMath
JohnScience

StudentHobbies Table:

StudentHobby
JohnCricket
JohnMusic

Explanation:

Fourth Normal Form (4NF) removes multi-valued dependencies from a table while ensuring it’s already in Boyce-Codd Normal Form (BCNF). It prevents storing two or more independent sets of data in the same table that are only related by the primary key. This helps avoid redundant data and improves clarity in relationships.

Fifth Normal Form (5NF)

Fifth Normal Form (5NF), also known as Project-Join Normal Form (PJNF), deals with breaking down tables to eliminate redundancy caused by join dependencies. A table is in 5NF if it is already in 4NF and cannot be further decomposed into smaller tables without losing data or introducing incorrect combinations after performing joins. It ensures that every join dependency is a result of the candidate keys

Before Applying 5NF (Non-5NF Table)

ProductVendorRegion
PhoneVendor AUS
PhoneVendor AUK
PhoneVendor BUS
PhoneVendor BUK

After Applying 5NF (5NF-Compliant Tables)

ProductVendor Table:

ProductVendor
PhoneVendor A
PhoneVendor B

ProductRegion Table:

ProductRegion
PhoneUS
PhoneUK

VendorRegion Table:

VendorRegion
Vendor AUS
Vendor AUK
Vendor BUS
Vendor BUK

Explanation:

Fifth Normal Form (5NF) eliminates redundancy caused by complex join dependencies in a table. It ensures that data split across multiple tables can be joined back without loss or duplication. 5NF is useful when dealing with multiple many-to-many relationships.

Summary

Database designing is one of the critical tasks for the successful implementation of a database management system for the application that meets the data requirements of an enterprise application to be developed. Database normalization in DBMS is a crucial process that helps to create concrete database systems that are cost-effective while having better security models attached to them. The complete functional database dependencies are a very important component of the normalized data process. You can learn more through a Free SQL Server Online Course with a Certificate.

Let’s climb to the top test your knowledge and conquer every question

Q 1: What is the main purpose of normalization in SQL?

  • Reduce redundancy
  • Increase redundancy
  • Simplify complex queries
  • Increase performance

FAQs

Yes, because redundant data is removed and hence the database disk storage use becomes smaller.

Normalization in MySQL should be used to reduce data redundancy, improve data integrity, and facilitate data updates and modifications while ensuring scalability and flexibility in database design.

No, it increases the number of tables.

BCNF is an extension of 3NF. The primary difference is that it removes the transitive dependency from a relation.

Take our Sqlserver skill challenge to evaluate yourself!

In less than 5 minutes, with our skill challenge, you can identify your knowledge gaps and strengths in a given skill.

GET FREE CHALLENGE

Share Article
About Author
Shailendra Chauhan (Microsoft MVP, Founder & CEO at ScholarHat)

Shailendra Chauhan, Founder and CEO of ScholarHat by DotNetTricks, is a renowned expert in System Design, Software Architecture, Azure Cloud, .NET, Angular, React, Node.js, Microservices, DevOps, and Cross-Platform Mobile App Development. His skill set extends into emerging fields like Data Science, Python, Azure AI/ML, and Generative AI, making him a well-rounded expert who bridges traditional development frameworks with cutting-edge advancements. Recognized as a Microsoft Most Valuable Professional (MVP) for an impressive 9 consecutive years (2016–2024), he has consistently demonstrated excellence in delivering impactful solutions and inspiring learners.

Shailendra’s unique, hands-on training programs and bestselling books have empowered thousands of professionals to excel in their careers and crack tough interviews. A visionary leader, he continues to revolutionize technology education with his innovative approach.
Accept cookies & close this