Beruflich Dokumente
Kultur Dokumente
Functional Dependencies
Week 11
1
131-24-3650 Smethurst 35 5
10 40
2
Redundancy
When part of data can be derived from other parts, we say redundancy exists.
Example: the hrly_wage of Smiley can be derived from the hrly_wage of Attishoo because they have the same rating and we know rating determines hrly_wage.
Redundancy exists because of of the existence of integrity constraints (e.g., FD: R W).
3
What do we do?
Since constraints, in particular functional dependencies, cause problems, we need to study them, and understand when and how they cause redundancy. When redundancy exists, refinement is needed.
Main refinement technique: decomposition (replacing ABCD with, say, AB and BCD, or ACD and ABD).
Is there reason to decompose a relation? What problems (if any) does the decomposition cause?
5
L 48 22 35 35 35
R 8 8 5 5 8
L
W 10 10 7 7 10
R H 40 30 30 32 40
H 40 30 30 32 40
R W
123-22-3666 Attishoo
48 8 22 8 35 5 35 8
131-24-3650 Smethurst 35 5
8 10 5 7
Refining an ER Diagram
1st diagram translated: Workers(S,N,L,D,S) Departments(D,M,B)
Before:
name ssn Employees lot Works_In since did dname budget Departments
Suppose all workers in a dept are assigned the same After: lot: D L name ssn Can fine-tune this: Workers2(S,N,D,S) Employees Departments(D,M,B,L)
i.e., given any two tuples in r, if the X values agree, then the Y values must also agree. (X and Y are sets of attributes.)
Must be identified based on semantics of application. Given some instance r1 of R, we can check if it violates some FD f or not But we cannot tell if f holds over R by looking at an instance!
Notation: We will denote this relation schema by listing the attributes: SNLRWH
This is really the set of attributes {S,N,L,R,W,H}. Sometimes, we will refer to all attributes of a relation by using the relation name. (e.g., Hourly_Emps for SNLRWH) ssn is the key: S SNLRWH rating determines hrly_wages: R W
10
11
Violation of FD by a relation
The FD XY is NOT satisfied by a relation instance r if:
There exists a pair of tuples t1 and t2 in r such that
t1.X = t2.X but t1.Y t2.Y
i.e., we can find two tuples in r, such that X values agree, but Y values dont.
12
A 1 1 2 2
B 1 1 1 1
C 2 3 3 2
CB CAB BC BB AC B
13
In general,
X R means X is a (super)key.
Armstrongs axioms
Armstrongs axioms are sound and complete inference rules for FDs!
Sound: all the derived FDs (by using the axioms) are those logically implied by the given set Complete: all the logically implied (by the given set) FDs can be derived by using the axioms.
16
If X Y and Y Z, then X Z
B 1 1 1 1
C 2 3 3 2
17
Derive Union
Show that
If X Y and X Z, then X YZ
19
Derive Decomposition
Show that
20
Derivation Example
R = (A, B, C, G, H, I) F = {A B; A C; CG H; CG I; B H } some members of F+
A H
by transitivity from A B and B H
AG I
by augmenting A C with G, to get AG CG and then transitivity with CG I
CG HI
by augmenting CG I to infer CG CGI, and augmenting of CG H to infer CGI HI, and then transitivity
22
Example on Computing F+
F = {A B, B C, C D E } Step 1: For each f in F, apply reflexivity rule
We get: CD C; CD D Add them to F:
F = {A B, B C, C D E; CD C; CD D }
Step 3: Apply transitivity on pairs of fs Keep repeating You get the idea
24
C is the key: C CSJDPQV Project (jid) purchases each part using single contract:
JP C
JP C, C CSJDPQV imply JP CSJDPQV SD P implies SDJ JP SDJ JP, JP CSJDPQV imply SDJ CSJDPQV
25
Does F = {A B, B C, C D E } imply A E?
26
Computing X+
Input F (a set of FDs), and X (a set of attributes) Output: Result=X+ (under F) Method:
Step 1: Result :=X; Step 2: Take Y Z in F, and Y is in Result, do: Result := Result Z Repeat step 2 until Result cannot be changed and then output Result.
27
Step 1: Result = A Step 2: Consider A B, Result = AB Consider B C, Result = ABC CD is not in ABC, so stop Step 3: A+ = {ABC} E is NOT in A+, so A E is NOT in F+
28
Example of computing X+
F={A B, AC D, AB C} X=A Result should be X+=ABCD
29
Is AG a candidate key?
Is AG a super key?
Does AG R? == Is (AG)+ R
Computing closure of F
For each R, we find the closure +, and for each S 31 +, we output a functional dependency S.
Computing F+
Given F={ A B, B C}. Compute F+ (with attributes A, B, C). Step 2: Compute the attribute
Step 1: Construct an empty matrix, with all Possible combinations of attributes in the rows And columns
A A B C AB AC BC ABC B C AB AC BC ABC
closures for all attribute/ combination of attributes Attribute closure A+=? B+=? C+=? AB+=? AC+=? BC+=? ABC+=?
32
Computing F+
Given F={ A B, B C}. Compute F+ (with attributes A, B, C). Well do an example on A+. Step 1: Result = A Step 2: Consider A B, Result = AB = AB Consider B C, Result = AB C = ABC Step 3: A+ = {ABC}
33
Computing F+
Given F={ A B, B C}. Compute F+ (with attributes A, B, C). Step 2: Compute the attribute
Step 1: Construct an empty matrix, with all Possible combinations of attributes in the rows And columns
A A B C : B C AB AC BC ABC
closures for all attribute/ combination of attributes Attribute closure A+=ABC B+=? C+=? AB+=? AC+=? BC+=? ABC+=?
34
Step 3: Fill in the matrix using the results from Step 2. We have A+=ABC. Now fill in the row for A. Consider the first column. Is A part of A+? Yes, so check it. Is B part of A+? Yes, so check it and so on.
Computing F+
Given F={ A B, B C}. Compute F+ (with attributes A, B, C).
A B C AB AC BC ABC Attribute closure A+=ABC A B+=BC B C+=C C AB+=ABC AB AC+=ABC AC BC+=BC BC ABC+=ABC ABC An entry with means FD (the row) (the column) is in F+. 35 An entry gets when (the column) is in (the row)+
Computing F+
Step 4: Derive rules. A B C A B C AB AC BC ABC AB AC BC ABC ABC
Attribute closure
An entry with means FD (the row) (the column) is in F+. 36 An entry gets when (the column) is in (the row)+
For example, F1={A B, A C} is equivalent to F2={A BC} How to test? Two steps:
Every FD in F1 is in F2+ Every FD in F2 is in F1+
These two steps can use the algorithm (many times) for X+
37
Summary
Constraints give rise to redundancy
Three anomalies
Armstrongs Axioms
FD inference/derivation