In relation, a Functional Dependency P holds Q (P->Q) if two tuples having similar values of attributes for P also have similar values of attributes for Q i.e. P distinctively decides Q.
In a Database, Functional dependency performs assistance as a restriction between two sets of attributes.
A Candidate Key in a relation is a minimal set of attributes of which can be used to identify a tuple distinctively. A candidate key is a column or set of columns in a table or relation that can distinctly identify any database collection of record without indicating towards any other data. Each table may contain one or more candidate keys, but one candidate key is distinct in each table or relation, and it is called the primary key. The primary key is the best among the candidate keys which is usually used for identification.
The process of Normalization is to minimize redundancy from a database table or relation or a set of relations. Insertion, Deletion and Updation anomalies may cause by redundancy. So, it aids to reduce the redundancy in relations.
Normal forms are used to eliminate or minimize redundancy in database tables or relations. These Normal Forms are:
-
First Normal Form
-
Second Normal Form
-
Third Normal Form
-
Boyce-Codd Normal Form (BCNF)
Steps to follow to find the highest normal form of a relation
-
The first step is to find all feasible candidate keys of the relation and its attributes.
-
The second step is to organize into two categories all the attributes of the relation:
-
Prime attributes
-
Non-prime attributes
-
-
Third and the last step is to examine to determine for 1st normal form and then 2nd and so on. If the process is unsuccessful in satisfying nth normal form condition, then the highest normal form will be n-1.
Examples
Problem 1) Find the highest normal form of a relation R(P, Q, R, S, T) with Functional dependency set as (QR->S, PR->QT, Q->T).
Solution:
Step 1:
As the relation (PR)+ = (P, Q, R, S, T) is given, but not a single of its subset can determine all attributes of relation, So PR will be candidate key. P or R can’t be derived from any other attribute of the relation, so there will be only one candidate key (PR).
Step 2:
-
The attributes which are part of candidate key (P, R) are Prime attributes.
-
The others will be non-prime attributes (Q, R, S).
Step 3:
A Relational Database Management System does not enable multi-valued or composite attribute. So, the relation R(P, Q, R, S, T) is in 1st normal form.
Because QR->S is in 2nd normal form (QR is not a proper subset of candidate key PR) and PR->QT is in 2nd normal form (PR is candidate key) and Q->T is in 2nd normal form (Q is not a proper subset of candidate key PR).So, the relation is in 2nd normal form.
Because in QR->S (neither QR is a super key nor S is a prime attribute) and in Q->T (neither Q is a super key nor T is a prime attribute) but to satisfy 3rd normal form, either LHS of a Functional Dependency should be super key or RHS should be prime attribute. So, the relation is not in 3rd normal form.
So, the highest normal form of relation will be 2nd Normal form.
Problem 2) Find the highest normal form of a relation R (P, Q, R, S, T) with Functional Dependency set (Q->P, P->R, QR->S, PR->QT).
Solution:
Step 1:
As the relation (PR) + = (P, Q, R, S, T) is given, Q will be a candidate key. Q can be derived from PR using PR->Q (Decomposing PR->QT to PR->Q and PR->T). So PR will be super key but (R) + ={R} and (P) + = {P, R, Q, S, T}. So P (subset of PR) will be a candidate key.
So there will be two candidate keys {P, Q}.
Step 2:
-
The attributes which are part of candidate key (P, Q) are Prime attributes.
-
The others will be non-prime attributes (R, S, T).
Step 3:
A Relational Database Management System does not enable multi-valued or composite attribute. So, the relation R (P, Q, R, S, T) is in 1st normal form.
The relation is in 2nd normal form because Q->P is in 2nd normal form (Q is a super key) and P->R is in 2nd normal form (P is super key) and QR->S is in 2nd normal form (QR is a super key) and PR->QT is in 2nd normal form (PR is a super key).
Because LHS of all Functional Dependencies are super keys, the relation is in 3rd normal form.
The relation is in BCNF as all LHS of all Functional Dependencies are super keys.
So, the highest normal form is BCNF.
Problem 3) Find the highest normal form of a relation R (P, Q, R, S, T) with Functional Dependency set (P->S, Q->P, QR->S, PR->QT).
Solution:
Step 1:
As the relation (PR) + = (P, Q, R, S, T) is given, but not a single of its subset can determine all attributes of relation, so PR will be candidate key. P can be derived from Q, so we can replace P in PR by Q. So QR will also be a candidate key.
So the two candidate keys will be (PR, QR).
Step 2:
-
The attributes which are part of candidate key (P, Q, R) are Prime attributes.
-
The others will be non-prime attributes (S, T).
Step 3:
A Relational Database Management System does not enable multi-valued or composite attribute. So, the relation R (P, Q, R, S, T) is in 1st normal form.
Because P->S is partial dependency (P which is a subset of candidate key PR is determining non-prime attribute S), the relation is not in 2nd Normal form because the 2nd normal form does not enable partial dependency.
【数据库】How to find the highest normal form of a relation in DBMS?