Optimization Examples 3

12/13/09
Query Optimization Example

•  Sailors (sid, sname, ra5ng, age)
•  Boats(bid, bname, color)
•  Reserves(sid, bid, day, rname)
•  Query:
SELECT S.sid, S.sname, S.age
FROM Sailors S, Boats B, Reserves R
WHERE B.bid = R.rid AND B.bid = R.bid AND
B.color = “Red” AND S.age < 30;
•  Reserves has 1000 pages, 10 tuples/page

•  Sailors has 500 pages, 50 tuples/page
•  Boats has 160 pages, 10 tuples/page
•  Data is evenly distributed (assumption)
CPSC 421, 2009 1
Steps
•  Query op5miza5on steps:
–  Translate SQL Query to Rela5onal Algebra Query
–  Create a query tree
–  Create leY‐deep alterna5ve trees
–  Create query plans for our trees
–  Es5mate costs of plans
–  Pick best one
CPSC 421, 2009 2
1
12/13/09
Step 1
•  Translate SQL Query to Rela5onal Algebra Query
πsid, sname, age(σage<30(σcolor=“Red”(σbid=bid(B × σsid=sid(S × R)))))
CPSC 421, 2009 3
Step 2
•  Create a query tree πsid,sname,age
σage<30
σcolor=“Red”
σbid=bid
×
Boats σsid=sid
Note our query only involves joins
(as opposed to plain old cross-
products) … !
×
Lets draw the trees with joins! Reserves Sailors
CPSC 421, 2009 4
2
12/13/09
Step 3
•  Create leY‐deep alterna5ve trees
–  We replace cross products if they are really joins
–  How?
πsid,sname,age
•  Using RA equivalences!
σage<30
–  This is a leY‐deep plan with joins
σcolor=“Red”
⋈bid=bid
⋈sid=sid Boats
Reserves Sailors
CPSC 421, 2009 5
Step 3 – More Alternatives

•  Would we expect these to have diﬀerent costs over
previous alterna5ve?
σage<30 πsid,sname,age
πsid,sname,age
⋈bid=bid
σcolor=“Red”
⋈sid=sid σcolor=“Red”
No!
⋈bid=bid
Reserves σage<30 Boats
⋈sid=sid Boats
Sailors
Reserves Sailors Yes!
CPSC 421, 2009 6
3
12/13/09
Step 4 & 5 – Plans and Estimates

•  Lets create some plans and es5mates
Total cost: 160,160 + 1,002,000 = 1,162,160 I/Os !!!

πsid,sname,age (On-the-fly)
M + M*N = ?+?*500 = ???
•  How large is the result of the first join?
σcolor=“Red” ⋀ age<30 (On-the-fly)
•  Every reservation has a bid
•  bid is unique for each boat
•  So 10,000 tuples ⋈sid=sid (Page NL)
•  How many pages?
•  1 / ((1/10) + (1/10)) = 10/2 = 5 tuples/page
M + M*N = 2,000+2,000*500 = 1,002,000 Sailors
M + M*N = 160+160*1,000 = 160,160 ⋈bid=bid (Page NL)
(this is the cost of reading, not writing)
Boats Reserves
CPSC 421, 2009 7

•  Assume Boats has a clustered B+ Tree on bid (key)
•  Assume S has a clustered B+ Tree on sid (key)
Total cost: 5,160 + 8,500 = 13,660 I/Os !!!

Sort(M) + Sort(N) + M + N =
6,000 + 0 + 2,000 + 500 = 8,500
⋈sid=sid (Sort Merge)
Result must be sorted on sid:
•  2*2,000 + 2*1*2,000 = 6,000 I/Os
Sort(M) + Sort(R) + M + N =
Sailors
0 + 4,000 + 160 + 1,000 =
5,160
⋈bid=bid (Sort Merge)
Boats sorted on bid, Reserves isn’t
Assume buffers B = 50 * We are assuming merge (in Sort
Cost of sorting reserves: Boats Reserves Merge) takes M + N … this may
•  2*1,000 + 2*1*1,000 = 4,000 I/Os not always be the case. Why?
CPSC 421, 2009 8
4
12/13/09
Note on Sort Merge

This is a much better plan than our previous page nested
loops one
•  But it is an overestimate for Sort-Merge join!
–  In Pass 0 we read all pages then write all pages (2*M)
–  We do not need to read all pages if they are pipelined to
the next join (1*M)
–  In last merge pass, we don’t have to write the last set of
pages since we always pipeline them to the next operator
–  We also don’t have to read in the file again:
•  Sort(M) + Sort(N) if N needs to be sorted
•  Sort(M) + N if N is sorted
CPSC 421, 2009 9

•  Assume Boats has a clustered B+ Tree on bid (key)
•  Assume S has a clustered B+ Tree on sid (key)
Total cost: 3,160 + 4,500 = 7,660 I/Os !!!

Sort(M) + N = 4,000 + 500 = 4,500
⋈sid=sid (Sort Merge)

Result must be sorted on sid:
•  1*2,000 + 1*1*2,000 = 4,000 I/Os
Sailors
Sort(M) + N = 3,000 + 160 = 3,160 ⋈bid=bid (Sort Merge)
Boats sorted on bid, Reserves isn’t
Assume buffers B = 50 * We are assuming merge (in Sort
Cost of sorting reserves: Boats Reserves Merge) takes M + N … this may
•  2*1,000 + 1*1*1,000 = 3,000 I/Os not always be the case. Why?
CPSC 421, 2009 10
5
12/13/09

•  Lets try block nested loop join (B=50)
Total cost: 4,160 + 22,500 = 26,660 I/Os !!!

M + (M/(B-1))*N = 2,000 + 41*500 = 22,500 ⋈sid=sid (Block NL)
Sailors
M + (M/(B-1))*N = 160+4*1,000 = 4,160 ⋈bid=bid (Block NL)
Boats Reserves
CPSC 421, 2009 11

•  Can we do bejer?
Total cost: 5,410 + 11,500 = 16,910 I/Os !!!

M + (M/(B-1))*N = 1,250 + 26*160 = 5,410 ⋈bid=bid (Block NL)

How many pages in the answer?
1,250 … why?
Boats
M + (M/(B-1))*N = 1000+21*500 = 11,500 ⋈sid=sid (Block NL)
Reserves Sailors
CPSC 421, 2009 12
6
12/13/09

•  Can we do even bejer?
–  Assume Boats has a hash index on color
Total cost: 56 + 2,050 + 500 + 500 = 3,106 I/Os !!!
(less than sort-merge plan)
M + (M/(B-1))*N = 125 + 3*125 = 500 ⋈bid=bid (Block NL)

File Scan Sailors = 500
Assume 25% reduction factor
σage<30 (File Scan)
M + (M/(B-1))*N = 50 + 2*1000 = 2,050
Assume 1/16 fewer Reserves match boats ⋈sid=sid
So result is: 625 tuples at 5 tuples/page = 125 pages (Block NL)
Sailors
Boats has 1600 tuples
Assume 16 colors
Thus, 100 tuples are “Red”
σcolor=“Red” Reserves (Hash Index)
Assume scattered across 50 pages
And the hash index takes 6 pages
Cost = 56
Boats
CPSC 421, 2009 13

•  Can we s5ll do bejer?
–  Yes!
–  E.g., we could project on bid aYer ﬁrst join …
•  No5ce that by adding an index on color to Boats …
–  We reduced our best query 5me in half!
–  This is one reason we talk about query op5miza5on
–  It drives physical database design
–  Improving performance (by adding indexes, e.g.) should be
driving on how query op5miza5on works!
CPSC 421, 2009 14
7

Optimization Examples 3

Hochgeladen von

Dokumentinformationen

Copyright

Verfügbare Formate

Dieses Dokument teilen

Dokument teilen oder einbetten

Freigabeoptionen

Stufen Sie dieses Dokument als nützlich ein?

Sind diese Inhalte unangemessen?

Copyright:

Verfügbare Formate

Optimization Examples 3

Hochgeladen von

Copyright:

Verfügbare Formate

12/13/09

Query Optimization Example

• Reserves has 1000 pages, 10 tuples/page

πsid, sname, age(σage<30(σcolor=“Red”(σbid=bid(B × σsid=sid(S × R)))))

Step 3 – More Alternatives

Step 4 & 5 – Plans and Estimates

Total cost: 160,160 + 1,002,000 = 1,162,160 I/Os !!!

Step 4 & 5 – Plans and Estimates

σcolor=“Red” ⋀ age<30 (On-the-fly)

Note on Sort Merge

Step 4 & 5 – Plans and Estimates

σcolor=“Red” ⋀ age<30 (On-the-fly)

⋈sid=sid (Sort Merge)

Step 4 & 5 – Plans and Estimates

Total cost: 4,160 + 22,500 = 26,660 I/Os !!!

σcolor=“Red” ⋀ age<30 (On-the-fly)

M + (M/(B-1))*N = 2,000 + 41*500 = 22,500 ⋈sid=sid (Block NL)

Step 4 & 5 – Plans and Estimates

Total cost: 5,410 + 11,500 = 16,910 I/Os !!!

σcolor=“Red” ⋀ age<30 (On-the-fly)

M + (M/(B-1))*N = 1,250 + 26*160 = 5,410 ⋈bid=bid (Block NL)

Step 4 & 5 – Plans and Estimates

M + (M/(B-1))*N = 125 + 3*125 = 500 ⋈bid=bid (Block NL)

Step 4 & 5 – Plans and Estimates

Das könnte Ihnen auch gefallen

•  Reserves has 1000 pages, 10 tuples/page

M + (M/(B-1))N = 2,000 + 41500 = 22,500 ⋈sid=sid (Block NL)

M + (M/(B-1))N = 1,250 + 26160 = 5,410 ⋈bid=bid (Block NL)

M + (M/(B-1))N = 125 + 3125 = 500 ⋈bid=bid (Block NL)