In the context of relational databases, a functional dependency is a relationship between two or more attributes (or columns) in a table where the value of one attribute determines the value of another.

For example, if we have a table called "Students" with the columns "Student_ID," "First_Name," "Last_Name," and "Email," we can say that the attribute "Email" is functionally dependent on the attributes "First_Name" and "Last_Name." This is because the value of the "Email" attribute is determined by the values of the "First_Name" and "Last_Name" attributes.

In general, a functional dependency is represented as A -> B, where A is the determinant (or independent) attribute and B is the dependent attribute. This means that for any given value of A, there is only one possible value of B.

Functional dependencies are important in database design because they help ensure data integrity and minimize redundancy.

Types of Functional Dependencies

There are several types of functional dependencies in a database:

  1. Trivial Functional Dependency: A functional dependency A -> B is trivial if B is a subset of A. For example, in a table with columns "Student_ID," "First_Name," and "Last_Name," the functional dependency Student_ID, First_Name -> First_Name is trivial because First_Name is already a part of the determinant.

  2. Non-Trivial Functional Dependency: A functional dependency A -> B is non-trivial if B is not a subset of A. For example, in a table with columns "Student_ID," "First_Name," "Last_Name," and "Email," the functional dependency First_Name, Last_Name -> Email is non-trivial because the Email attribute is not a part of the determinant.

  3. Transitive Functional Dependency: A functional dependency A -> B and B -> C implies a transitive functional dependency A -> C. For example, in a table with columns "Student_ID," "First_Name," "Last_Name," and "City," the functional dependency Student_ID -> City and City -> State imply a transitive functional dependency Student_ID -> State.

  4. Partial Dependency: A functional dependency A -> B is a partial dependency if B is functionally dependent on a proper subset of A. For example, in a table with columns "Student_ID," "First_Name," "Last_Name," and "Email," the functional dependency Student_ID, First_Name, Last_Name -> Email is a partial dependency because the attribute Email is also functionally dependent on just the subset "Student_ID".

  5. Full Dependency: A functional dependency A -> B is a full dependency if B is functionally dependent on the entire set of A. For example, in a table with columns "Student_ID," "First_Name," "Last_Name," and "Email," the functional dependency Student_ID, First_Name, Last_Name -> Email is a full dependency because the attribute Email is functionally dependent on the entire set of "Student_ID", "First_Name", and "Last_Name".

Normalization and Its Forms

Normalization is the process of organizing data in a database to reduce data redundancy and dependency, making the database more efficient, easier to maintain, and less prone to errors.

First Normal Form (1NF)

A table is in 1NF if it has no repeating groups or arrays, meaning that each column of the table contains only atomic values. For example, consider the following table called "Student_Info":

Student_ID First_Name Last_Name Course_List
1 John Smith Math, Science, English
2 Jane Doe Math, Science, History, French

This table is not in 1NF because the "Course_List" column contains multiple values. We can normalize this table to 1NF by splitting the "Course_List" column into separate rows and creating a new table called "Course_Info":

Student_ID First_Name Last_Name
1 John Smith
2 Jane Doe
Student_ID Course
1 Math
1 Science
1 English
2 Math
2 Science
2 History
2 French

Second Normal Form (2NF)

A table is in 2NF if it is in 1NF and every non-key attribute is fully functionally dependent on the primary key. For example, consider the following table called "Order_Details":

Order_ID Product_ID Product_Name Product_Price Quantity
1001 P001 iPhone X 999 2
1001 P002 AirPods 149 3
1002 P001 iPhone X 999 1
1002 P003 MacBook Pro 1499 1

This table is not in 2NF because the "Product_Name" and "Product_Price" columns are dependent only on the "Product_ID" column, which is not a primary key. We can normalize this table to 2NF by creating a new table called "Product_Info":

Product_ID Product_Name Product_Price
P001 iPhone X 999
P002 AirPods 149
P003 MacBook Pro 1499
Order_ID Product_ID Quantity
1001 P001 2
1001 P002 3
1002 P001 1
1002 P003 1

Third Normal Form (3NF)

A table is in 3NF if it is in 2NF and has no transitive dependencies. Consider the following table called "Employee_Details":

Employee_ID Employee_Name Department_ID Manager_ID
E001 John Smith D001 M001
E002 Jane Doe D001 M001
E003 Bob Johnson D002 M002
Department_ID Department_Name Manager_ID
D001 Sales M001
D002 Marketing M002
Manager_ID Manager_Name
M001 Jack Johnson
M002 Sarah Thompson

This table is not in 3NF because the "Department_Name" and "Manager_Name" columns are not directly dependent on the primary key "Employee_ID". We can normalize this table to 3NF by creating new tables called "Department_Info" and "Manager_Info":

Employee_ID Employee_Name Department_ID
E001 John Smith D001
E002 Jane Doe D001
E003 Bob Johnson D002
Department_ID Manager_ID
D001 M001
D002 M002
Manager_ID Manager_Name
M001 Jack Johnson
M002 Sarah Thompson
Department_ID Department_Name
D001 Sales
D002 Marketing

This new design eliminates the transitive dependency and breaks down the original table into smaller, more manageable tables.

Boyce-Codd Normal Form (BCNF)

Boyce-Codd Normal Form (BCNF) is a stricter form of normalization that ensures that every determinant in a table is a candidate key. A determinant is any attribute or combination of attributes that determines the values of other attributes in a table. A candidate key is a minimal set of attributes that can uniquely identify each row in a table. A table is in BCNF if and only if every determinant is a candidate key.

For example, consider the following table called "Employee_Project":

Employee_ID Project_ID Project_Name Department_ID
E001 P001 Project A D001
E001 P002 Project B D001
E002 P001 Project A D001
E003 P003 Project C D002

In this table, the determinant is the combination of "Employee_ID" and "Project_ID", and it determines the values of "Project_Name" and "Department_ID". However, "Department_ID" is not dependent on the candidate key "Employee_ID" alone, but on the combination of "Employee_ID" and "Project_ID". Therefore, this table is not in BCNF.

To normalize this table to BCNF, we can create two new tables called "Employee_Project_Info" and "Department_Info":

Employee_ID Project_ID
E001 P001
E001 P002
E002 P001
E003 P003
Project_ID Project_Name
P001 Project A
P002 Project B
P003 Project C
Department_ID Project_ID
D001 P001
D001 P002
D002 P003

Now each table is in BCNF as each determinant is a candidate key. This design also eliminates data redundancy and dependency.

Difference Between 3NF and BCNF

Both Third Normal Form (3NF) and Boyce-Codd Normal Form (BCNF) aim to eliminate data redundancy and dependency. However, BCNF is a stricter form of normalization than 3NF.

3NF conditions:

  1. It is in second normal form (2NF)
  2. It does not have transitive dependencies

A transitive dependency is a functional dependency between two non-key attributes, where one non-key attribute depends on another non-key attribute, which in turn depends on the primary key.

Example — table not in 3NF:

Customer_ID Order_ID Order_Date Product_Name Product_Price
C001 O001 2021-01-01 Product A $10
C001 O001 2021-01-01 Product B $20
C002 O002 2021-01-02 Product C $30

In this table, "Product_Name" and "Product_Price" are dependent on "Order_ID", but "Order_ID" is not a candidate key. Therefore, this table is not in 3NF. Normalizing to 3NF:

Customer_ID Order_ID Order_Date
C001 O001 2021-01-01
C002 O002 2021-01-02
Order_ID Product_Name
O001 Product A
O001 Product B
O002 Product C
Product_Name Product_Price
Product A $10
Product B $20
Product C $30

BCNF condition:

  1. Every determinant is a candidate key

The main difference between 3NF and BCNF is that BCNF requires every determinant in a table to be a candidate key, whereas 3NF only requires that a table does not have transitive dependencies. BCNF is a stricter form of normalization than 3NF and leads to a more optimized database design.