16
MayUnderstanding Database Normalization in SQL with Example
Normalization in SQL is the process of organizing data in a database to eliminate redundancy and ensure data integrity. It involves dividing large tables into smaller, related tables and defining relationships between them. The goal is to structure the data efficiently and reduce issues like data duplication and update anomalies.
In this SQL Server tutorial, we’ll break down the different normal forms (1NF, 2NF, 3NF, BCNF) with easy-to-follow examples. Whether you’re preparing for an interview or learning database design, understanding SQL normalization is essential for creating scalable and consistent databases. This guide is perfect for beginners who want to start a career in Data Science or Data Engineering
In this blog, we’ll explore:
- What normalization is and why it matters
- The different normal forms (1NF, 2NF, 3NF, BCNF) with real-world examples
- Pros and cons of normalization, when to use it (and when not to)
- Best practices for implementing normalization in SQL databases
What is Normalization in SQL?
Data integrity:
Efficiency querying:
Storage optimization:
Why is Normalization in SQL Important?
Reduces redundancy:
Redundancy happens when the same data is stored in multiple places, leading to wasted space and possible errors. Normalization in SQL reduces redundancy by organizing data into smaller, related tables, making your database cleaner and more efficient.
Improves query performance:
Normalization in SQL helps speed up queries by breaking data into smaller, organized tables. This structure allows the database to search and retrieve information faster, improving overall performance and efficiency.
Minimizes update anomalies:
Normalization in SQL helps avoid issues that can happen when updating data, like conflicting or inconsistent records. By organizing data into separate, related tables, updates become safer and more accurate across the database.
Enhances data integrity:
Normalization in SQL ensures that your data stays accurate, consistent, and reliable by removing duplicates and enforcing clear relationships. This structure helps maintain trust in your data across all operations and updates.
What Causes the Need for Normalization?
Insertion, deletion, and update anomalies:
Insertion, deletion, and update anomalies refer to problems that arise when managing data in a database. Without proper normalization in SQL, inserting, deleting, or updating data can lead to inconsistencies, errors, and redundant information. Normalization helps prevent these anomalies by organizing data in a way that maintains integrity and consistency.
Difficulty in managing relationships:
Difficulty in managing relationships occurs when data is poorly structured, making it hard to establish clear links between tables. Normalization in SQL simplifies this by breaking data into smaller, related tables, ensuring that relationships are easy to define, manage, and maintain.
Other factors that drive the need for normalization in SQL include partial dependencies and transitive dependencies. Partial dependencies can cause data redundancy and update anomalies, while transitive dependencies can lead to data inconsistencies. In the following sections, we’ll explore how to address these dependencies to achieve proper database normalization.
Different Types of Database Normalization in SQL?
The world's database community has developed a certain guideline for ensuring that the databases in our application should be normalized in a specific way. These are referred to as normal forms and are numbered from one, where the lowest form of normalization is referred to as the first normal form or 1NF, through five, which is the fifth normal form or 5NF, along with the BCNF, which is the Boyce-Codd Normal Form.
While working with real-time applications or scenarios, we will often see different forms of normalization such as 1NF, 2NF, and 3NF, along with the occasional 4NF and BCNF. But the fifth normal form (5NF) is very rarely seen and used in enterprise applications due to its complexity.
We organize the data into database tables by using the normal forms of rules or conditions. Normal forms help us to make a good database design. Generally, we organize the data up to the third normal form. We rarely use the fourth and fifth normal forms.
To understand normal forms, consider the following unnormalized database table. Now we will normalize the data in the table below using normal forms.
Read More: DBMS Interview Questions For Freshers |
1. First Normal Form (1NF)
A database table is said to be in 1NF if it contains no repeating fields/columns. The process of converting the UNF table into 1NF is as follows:
- Separate the repeating fields into new database tables along with the key from the unnormalized database table.
The primary key of new database tables may be a composite key.
1NF of the above UNF table is as follows:
2. Second Normal Form (2NF)
A database table is said to be in 2NF if it is in 1NF and contains only those fields/columns that are functionally dependent(which means the value of the field is determined by the value of another field(s)) on the primary key. In 2NF, we remove the partial dependencies of any non-key field.
The process of converting the database table into 2NF is as follows:
- Remove the partial dependencies (A type of functional dependency where a field is only functionally dependent on the part of the primary key) of any non-key field.
- If field B depends on field A, and vice versa. Also, for a given value of B, we have only one possible value of A and vice versa. Then we put the field B into a new database table where B will be the primary key and also marked as a foreign key in a parent table.
2NF of the above 1NF tables is as follows:
As you can see in the above example, we have divided our 1NF table into two tables, viz. Table 1 and Table 2. Table 1 contains project information. Table 2 contains information on the employees assigned to the specific projects.
We have introduced a new column called project_code, which is the primary key for table 1. Records can be uniquely identified in Table 1 using the project code column itself.
Read More: SQL Interview Questions And Answers For Experienced |
3. Third Normal Form (3NF)
A database table is said to be in 3NF if it is in 2NF and all non-key fields are dependent on the primary key. We can also say a table is in 3NF if it is in 2NF and no fields of the table are transitively functionally dependent on the primary key. The process of converting the table into 3NF is as follows:
- Remove the transitive dependencies(A type of functional dependency where a field is functionally dependent on the Field that is not the primary key. Hence, its value is determined indirectly by the primary key )
Make a separate table for transitive-dependent Fields.
3NF of the above 2NF tables is as follows:
4. Boyce Code Normal Form (BCNF)
A database table is said to be in BCNF if it is in 3NF and contains every determinant as a candidate key. The process of converting the table into BCNF is as follows:
- Remove the nontrivial functional dependency.
- Make a separate table for the determinants.
BCNF of the table below is as follows:
5. Fourth Normal Form (4NF)
A database table is said to be in 4NF if it is in BCNF and the primary key has a one-to-one relationship to all non-key fields. We can also say a table is in 4NF if it is in BCNF and contains no multi-valued dependencies. The process of converting the table into 4NF is as follows:
- Remove the multivalued dependency.
Make a separate table for multivalued Fields.
4NF of the table below is as follows:
6. Fifth Normal Form (5NF)
A database table is said to be in 5NF if it is in 4NF and contains no redundant values. We can also say a table is in 5NF if it is in 4NF and contains no join dependencies. The process of converting the table into 5NF is as follows:
- Remove the join dependency.
Break the database table into smaller and smaller tables to remove all data redundancy.
5NF of the table below is as follows:
Read More: SQL Server Interview Questions and Answers |
Normalization in SQL With Real-World Examples
We’ve already introduced the different levels of data normalization in SQL. Now, let’s take a closer look at each level with clear examples and detailed explanations.
First Normal Form (1NF) Normalization:
Before Applying 1NF (Non-1NF Table)
CustomerID | Name | Products |
1 | John | Laptop, Mouse |
2 | Alice | Monitor |
3 | Bob | Keyboard, Mouse, USB |
After Applying 1NF (1NF-Compliant Table)
CustomerID | Name | Product |
1 | John | Laptop |
1 | John | Mouse |
2 | Alice | Monitor |
3 | Bob | Keyboard |
3 | Bob | Mouse |
3 | Bob | USB |
Explanation:
First Normal Form (1NF) ensures that each column in a table contains atomic values, meaning each cell holds a single value and there are no repeating groups. By applying 1NF, data is organized into individual rows, improving consistency and making it easier to query.
Second Normal Form (2NF):
Before Applying 2NF (Non-2NF Table)
StudentID | CourseID | StudentName | CourseName |
1 | 101 | John | Math |
1 | 102 | John | Science |
2 | 101 | Alice | Math |
After Applying 2NF (2NF-Compliant Tables)
Student Table:
StudentID | StudentName |
1 | John |
2 | Alice |
StudentCourse Table:
StudentID | CourseID | CourseName |
1 | 101 | Math |
1 | 102 | Science |
2 | 101 | Math |
Explanation:
Second Normal Form (2NF) ensures that all non-key attributes are fully dependent on the entire primary key, not just part of it. A table is in 2NF if it’s in 1NF and has no partial dependencies, which helps remove redundant data in tables with composite keys.
Third Normal Form (3NF):
Third Normal Form (3NF) takes normalization a step further by removing transitive dependencies. A table is in 3NF if it is already in 2NF, and all non-key columns are directly dependent only on the primary key, not on other non-key columns. This helps avoid data duplication and maintains data integrity by ensuring that each piece of information is stored in one place only.
Before Applying 3NF (Non-3NF Table)
EmpID | EmpName | DeptName | DeptLocation |
1 | Alice | HR | New York |
2 | Bob | IT | San Francisco |
3 | Charlie | HR | New York |
After Applying 3NF (3NF-Compliant Tables)
Employee Table:
EmpID | EmpName | DeptID |
1 | Alice | 101 |
2 | Bob | 102 |
3 | Charlie | 101 |
Department Table:
DeptID | DeptName | DeptLocation |
101 | HR | New York |
102 | IT | San Francisco |
Explanation:
Third Normal Form (3NF) eliminates transitive dependencies, ensuring that non-key attributes depend only on the primary key. A table is in 3NF if it’s in 2NF and all its attributes are functionally dependent only on the primary key not on other non-key attributes. This reduces data redundancy and improves data integrity.
Boyce-Codd Normal Form (BCNF):
Boyce-Codd Normal Form (BCNF) is an advanced version of the Third Normal Form (3NF) used to further eliminate redundancy in relational databases. A table is in BCNF if it is already in 3NF and every determinant is a candidate key. In other words, if a non-trivial functional dependency exists, the left-hand side must be a super key. BCNF ensures that no anomalies arise from overlapping candidate keys or dependencies that 3NF doesn't fully resolve.
Before Applying BCNF (Non-BCNF Table)
Professor | Subject | Department |
Dr. Smith | Database | Computer Science |
Dr. Smith | AI | Computer Science |
Dr. Jones | Math | Mathematics |
After Applying BCNF (BCNF-Compliant Tables)
Professor Dept Table:
Professor | Department |
Dr. Smith | Computer Science |
Dr. Jones | Mathematics |
Professor Subject Table:
Professor | Subject |
Dr. Smith | Database |
Dr. Smith | AI |
Dr. Jones | Math |
Explanation:
Boyce-Codd Normal Form (BCNF) is an advanced version of the Third Normal Form (3NF) that handles certain edge cases where 3NF doesn't fully eliminate redundancy. A table is in BCNF if it’s in 3NF and every determinant is a candidate key. This ensures stronger consistency by eliminating anomalies caused by functional dependencies.
Fourth Normal Form (4NF):
Fourth Normal Form (4NF) focuses on removing multi-valued dependencies from a table. A table is in 4NF if it is already in Boyce-Codd Normal Form (BCNF) and contains no more than one independent multi-valued dependency. This means that for a given primary key, a table should not store two or more independent sets of multi-valued facts, which can cause unnecessary repetition and data anomalies.
Before Applying 4NF (Non-4NF Table)
Student | Course | Hobby |
John | Math | Cricket |
John | Math | Music |
John | Science | Cricket |
John | Science | Music |
After Applying 4NF (4NF-Compliant Tables)
StudentCourses Table:
Student | Course |
John | Math |
John | Science |
StudentHobbies Table:
Student | Hobby |
John | Cricket |
John | Music |
Explanation:
Fourth Normal Form (4NF) removes multi-valued dependencies from a table while ensuring it’s already in Boyce-Codd Normal Form (BCNF). It prevents storing two or more independent sets of data in the same table that are only related by the primary key. This helps avoid redundant data and improves clarity in relationships.
Fifth Normal Form (5NF)
Fifth Normal Form (5NF), also known as Project-Join Normal Form (PJNF), deals with breaking down tables to eliminate redundancy caused by join dependencies. A table is in 5NF if it is already in 4NF and cannot be further decomposed into smaller tables without losing data or introducing incorrect combinations after performing joins. It ensures that every join dependency is a result of the candidate keys
Before Applying 5NF (Non-5NF Table)
Product | Vendor | Region |
Phone | Vendor A | US |
Phone | Vendor A | UK |
Phone | Vendor B | US |
Phone | Vendor B | UK |
After Applying 5NF (5NF-Compliant Tables)
ProductVendor Table:
Product | Vendor |
Phone | Vendor A |
Phone | Vendor B |
ProductRegion Table:
Product | Region |
Phone | US |
Phone | UK |
VendorRegion Table:
Vendor | Region |
Vendor A | US |
Vendor A | UK |
Vendor B | US |
Vendor B | UK |
Explanation:
Fifth Normal Form (5NF) eliminates redundancy caused by complex join dependencies in a table. It ensures that data split across multiple tables can be joined back without loss or duplication. 5NF is useful when dealing with multiple many-to-many relationships.
Database designing is one of the critical tasks for the successful implementation of a database management system for the application that meets the data requirements of an enterprise application to be developed. Database normalization in DBMS is a crucial process that helps to create concrete database systems that are cost-effective while having better security models attached to them. The complete functional database dependencies are a very important component of the normalized data process. You can learn more through a Free SQL Server Online Course with a Certificate.
Let’s climb to the top test your knowledge and conquer every question
Q 1: What is the main purpose of normalization in SQL?
FAQs
Take our Sqlserver skill challenge to evaluate yourself!

In less than 5 minutes, with our skill challenge, you can identify your knowledge gaps and strengths in a given skill.