
1

The Bare Basics


Storing Data on Disks and Files
Chapter 9
2
Disks and Files

DBMS stores information on (hard) disks.

This has major implications for DBMS design!

READ: transfer data from disk to main memory (RAM).

WRITE: transfer data from RAM to disk.

Both are high-cost operations, relative to in-memory operations, so must be planned carefully!
3
Why Not Store Everything in Main Memory?

Costs too much.

The same amount of money will buy you, say, either 128MB of RAM or 20GB of disk.

Main memory is volatile.

We want data to be saved between runs. (Obviously!)

Typical storage hierarchies:

Main memory (RAM) for currently used data (primary storage).

Disk for the main database (secondary storage).

Tapes for archiving older versions of data (tertiary storage).


4
Disks

Secondary storage device of choice.

Main advantage over tapes: random access vs. sequential.

Data is stored and retrieved in units called disk blocks or pages.

Unlike RAM, time to retrieve a disk page varies depending upon location on disk.

Therefore, relative placement of pages on disk has major impact on DBMS performance!
5
Components of a Disk

[Figure: anatomy of a disk, labelled: platters, spindle, disk head, arm assembly, arm movement, tracks, sector]

The platters spin (say, 90 rps).

The arm assembly is moved in or out to position a head on a desired track.

Tracks under heads make a cylinder (imaginary!).

Only one head reads/writes at any one time.

Block size is a multiple of sector size (which is fixed).
6
Accessing a Disk Page

Time to access (read/write) a disk block:

seek time (moving arms to position disk head on track)

rotational delay (waiting for block to rotate under head)

transfer time (actually moving data to/from disk surface)

Seek time and rotational delay dominate.

Seek time varies from about 1 to 20msec

Rotational delay varies from 0 to 10msec

Transfer rate is about 1msec per 4KB page

Lower I/O cost: reduce seek/rotation delays!
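To see why seek and rotational delay dominate, the three components can be added up for a random and a sequential read pattern. This is only a rough sketch: the averages are assumed to be the midpoints of the ranges quoted above, the function names are made up for illustration, and the sequential case assumes all pages can be read after a single seek and a single rotational wait.

```python
def access_time_ms(seek_ms, rotational_ms, pages, transfer_ms_per_page=1.0):
    """Time to read `pages` contiguous 4KB pages after one seek and one rotational wait."""
    return seek_ms + rotational_ms + pages * transfer_ms_per_page

# Assumed averages: midpoints of the ranges quoted on the slide.
AVG_SEEK_MS = (1 + 20) / 2        # 10.5 ms
AVG_ROTATION_MS = (0 + 10) / 2    # 5.0 ms

def random_read_ms(pages):
    """Every page pays its own seek and rotational delay."""
    return pages * access_time_ms(AVG_SEEK_MS, AVG_ROTATION_MS, 1)

def sequential_read_ms(pages):
    """One seek and one rotational delay, then the pages stream in order."""
    return access_time_ms(AVG_SEEK_MS, AVG_ROTATION_MS, pages)

print(random_read_ms(100))      # 1650.0
print(sequential_read_ms(100))  # 115.5
```

Under these assumptions, reading 100 scattered pages costs more than ten times as much as reading 100 adjacent ones, even though the transfer time is identical.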


7
Arranging Pages on Disk

"Next" block concept:

blocks on same track, followed by

blocks on same cylinder, followed by

blocks on adjacent cylinder

Blocks in a file should be arranged sequentially on disk (by "next"), to minimize seek and rotational delay.

For a sequential scan, pre-fetching several pages at a time is a big win!
8
RAID (Redundant Array of Independent Disks)

Disk Array: Arrangement of several disks that gives abstraction of a single large disk.

Goals: Increase performance and reliability.

Two main techniques:

Data striping:
Data is partitioned;
Size of a partition is called the striping unit.
Partitions are distributed over several disks.

Redundancy:
More disks => more reliable.
Redundant information allows reconstruction of data if a disk fails.
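Data striping can be illustrated with a small mapping function. The sketch below assumes the common round-robin placement of striping units across disks; the slide does not say how partitions are distributed, so the placement rule and the name `locate_block` are assumptions made for illustration.

```python
def locate_block(logical_block, num_disks, blocks_per_stripe_unit=1):
    """Map a logical block number to (disk index, block number on that disk)."""
    stripe_unit = logical_block // blocks_per_stripe_unit
    offset_in_unit = logical_block % blocks_per_stripe_unit
    disk = stripe_unit % num_disks                 # round-robin over the disks
    unit_on_disk = stripe_unit // num_disks        # how many units precede it on that disk
    return disk, unit_on_disk * blocks_per_stripe_unit + offset_in_unit

for block in range(8):
    print(block, locate_block(block, num_disks=4))
```

With four disks and a striping unit of one block, consecutive logical blocks land on different disks, so a large sequential read can be served by all four disks in parallel.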
9
RAID Levels

Level 0: No redundancy

Best write performance

Not best in reading. (Why?)

Level 1: Mirrored (two identical copies)

Each disk has a mirror image (check disk)

Parallel reads, a write involves two disks.

Maximum transfer rate = transfer rate of one disk
13
Disk Space Management

Lowest layer of DBMS software manages space on disk.

Higher levels call upon this layer to:

allocate/de-allocate a page

read/write a page

Higher levels don't need to know how this is done, or how free space is managed.
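The interface this layer exposes might look like the toy class below. It is a hypothetical in-memory stand-in, not an actual DBMS API: the class name, method names and the 4096-byte page size are all assumptions, and a real implementation would read and write blocks on a disk device.

```python
class DiskSpaceManager:
    """Toy in-memory stand-in for the lowest DBMS layer (hypothetical interface)."""
    PAGE_SIZE = 4096   # assumed page size

    def __init__(self):
        self._pages = {}       # page_id -> bytes
        self._free_ids = []    # de-allocated ids available for reuse
        self._next_id = 0

    def allocate_page(self):
        """Hand out a page id, reusing a de-allocated one when possible."""
        if self._free_ids:
            page_id = self._free_ids.pop()
        else:
            page_id = self._next_id
            self._next_id += 1
        self._pages[page_id] = bytes(self.PAGE_SIZE)
        return page_id

    def deallocate_page(self, page_id):
        del self._pages[page_id]
        self._free_ids.append(page_id)

    def read_page(self, page_id):
        return self._pages[page_id]

    def write_page(self, page_id, data):
        if len(data) != self.PAGE_SIZE:
            raise ValueError("data must be exactly one page long")
        self._pages[page_id] = bytes(data)
```

Callers only see page ids and page contents; how free space is tracked (here, a simple list of reusable ids) stays hidden behind the four methods.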
14
Buffer Management in a DBMS

Data must be in RAM for DBMS to operate on it.

Table of <frame#, pageid> pairs is maintained.

[Figure: page requests from higher levels arrive at the BUFFER POOL in MAIN MEMORY; each frame holds a disk page or is a free frame; pages move between the pool and the DB on DISK; choice of frame dictated by replacement policy]
15
When a Page is Requested ...

If requested page is not in buffer pool:

Choose a frame for replacement

If frame is dirty, write it to disk

Read requested page into chosen frame

Pin the page and return its address.

If requests can be predicted (e.g., sequential scans) pages can be pre-fetched (several pages at a time).
16
More on Buffer Management

Requestor of page must unpin it, and indicate whether page has been modified:

dirty bit is used for this.

Page in pool may be requested many times,

a pin count is used.

A page is a candidate for replacement iff pin count = 0.

CC & recovery may entail additional I/O when a frame is chosen for replacement. (Write-Ahead Log protocol; more later.)
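The request, pin, unpin and replacement steps can be sketched as a small buffer pool. This is a minimal illustration under stated assumptions, not how any particular DBMS implements it: `disk` is a plain dict standing in for the disk layer, LRU is chosen as the replacement policy, and the extra I/O required by Write-Ahead Logging is ignored. All class and attribute names are hypothetical.

```python
from collections import OrderedDict

class Frame:
    def __init__(self, data):
        self.data = data
        self.pin_count = 0
        self.dirty = False

class BufferPool:
    """Toy buffer manager; `disk` is a dict page_id -> data standing in for the disk."""

    def __init__(self, disk, num_frames):
        self.disk = disk
        self.num_frames = num_frames
        self.frames = OrderedDict()   # page_id -> Frame, least recently used first
        self.reads = 0
        self.writes = 0

    def pin(self, page_id):
        if page_id not in self.frames:
            if len(self.frames) >= self.num_frames:
                self._evict()
            self.frames[page_id] = Frame(self.disk[page_id])   # read page from disk
            self.reads += 1
        self.frames.move_to_end(page_id)      # mark as most recently used
        frame = self.frames[page_id]
        frame.pin_count += 1
        return frame

    def unpin(self, page_id, dirty=False):
        frame = self.frames[page_id]
        if frame.pin_count == 0:
            raise ValueError("page is not pinned")
        frame.pin_count -= 1
        frame.dirty = frame.dirty or dirty    # requestor reports modifications

    def _evict(self):
        # Only frames with pin count 0 are candidates for replacement.
        for victim_id, frame in self.frames.items():
            if frame.pin_count == 0:
                break
        else:
            raise RuntimeError("all frames are pinned")
        if frame.dirty:
            self.disk[victim_id] = frame.data   # write back before reusing the frame
            self.writes += 1
        del self.frames[victim_id]
```

A requestor would call `pin`, work on `frame.data`, then call `unpin(page_id, dirty=True)` if it changed the page. The dirty page is written back only when its frame is later chosen for replacement.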
17
Buffer Replacement Policy

Frame is chosen for replacement by a replacement policy:

Least-recently-used (LRU), Clock, MRU etc.

Policy can have big impact on # of I/Os; depends on access pattern.

Sequential flooding: Nasty situation caused by LRU + repeated sequential scans.

# buffer frames < # pages in file means each page request causes an I/O.

MRU much better in this situation (but not in all situations, of course).
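Sequential flooding is easy to reproduce in a small simulation. The sketch below counts page reads for repeated scans of a file that is one page larger than the buffer pool; the function name and the parameter values are chosen purely for illustration, and pinning is ignored.

```python
def count_ios(num_frames, num_pages, num_scans, policy):
    """Count page reads for repeated sequential scans; policy is "LRU" or "MRU"."""
    pool = []        # resident page ids, least recently used first
    ios = 0
    for _ in range(num_scans):
        for page in range(num_pages):
            if page in pool:
                pool.remove(page)            # hit: recency is refreshed below
            else:
                ios += 1                     # miss: page must be read from disk
                if len(pool) == num_frames:
                    # LRU evicts the oldest entry, MRU the newest.
                    pool.pop(0 if policy == "LRU" else -1)
            pool.append(page)                # page is now the most recently used
    return ios

print(count_ios(num_frames=10, num_pages=11, num_scans=5, policy="LRU"))
print(count_ios(num_frames=10, num_pages=11, num_scans=5, policy="MRU"))
```

With these parameters LRU always evicts exactly the page that the scan will need next, so all 55 requests cause an I/O, while MRU keeps most of the file resident and misses far less often.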
18
DBMS vs. OS File System

OS does disk space & buffer mgmt already! So why not let OS manage these tasks?

Differences in OS support: portability issues

Some limitations, e.g., files don't span multiple disk devices.

Buffer management in DBMS requires ability to:

pin a page in buffer pool,

force a page to disk (important for implementing CC & recovery),

adjust replacement policy, and pre-fetch pages based on access patterns in typical DB operations.
19
Structure of a DBMS

A typical DBMS has a layered architecture.

Disk: Storage hierarchy, RAID

Disk Space Management: Roles, Free blocks

Buffer Management: Buffer Pool, Replacement policy

Files and Access Methods: File organization (heap files, sorted file, indexes); File and page level storage (collection of pages or records)

[Figure: layered DBMS architecture, top to bottom: Query Optimization and Execution; Relational Operators; Files and Access Methods; Buffer Management; Disk Space Management; DB (Index Files, System Catalog, Data Files). These layers must consider concurrency control and recovery.]
20
Files of Records

Page or block is the granularity for doing I/O

Higher levels of DBMS operate on:

records, and

files composed of records.

FILE: A collection of pages, each containing a collection of records.

File must support:

insert/delete/modify record

read a particular record (specified using record id)

scan all records (possibly with some conditions on the records to be retrieved)
21
Unordered Files (Heap Files)

Simplest file structure contains records in no particular order.

As file grows and shrinks, disk pages are allocated and de-allocated.

To support record level operations, we must:

keep track of the pages in a file

keep track of free space on pages

keep track of the records on a page

There are many alternatives for keeping track of this.
22
Alternative 1: Heap File Implemented as List

Maintain a table containing pairs of: <heap_file_name, head_page_address>

Each page contains 2 "pointers" (rid) plus data.

[Figure: a header page heads two linked lists of data pages: full pages, and pages with free space]
23
Heap File Implemented as a List

Insert a new page into heap file

Disk manager adds a new free-space page to the linked list

Delete a page from heap file

Removed from the list

Disk manager deallocates it

Disadvantages:

If records are of variable length, (nearly) all pages will be in the free list.

Must retrieve and examine several pages to find one with enough space.
24
Alternative 2: Heap File Using Page Directory

In directory, each entry for a page includes number of free bytes on page.

The directory is a collection of pages (linked list implementation is just one alternative).

Much smaller than linked list of all HF pages!

[Figure: a DIRECTORY, starting at a header page, whose entries point to Data Page 1, Data Page 2, ..., Data Page N]
25
Alternative 2: Heap File Using a Page Directory

Advantage of Page Directory:

The size of directory is very small (much smaller than heap file).

Searching for space is very efficient, because free space is found without looking at actual heap data pages.
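The free-space search can be sketched with a directory that stores only the number of free bytes per page. The class below is a toy model with hypothetical names; a real directory would itself be stored on pages, and this sketch keeps it in a dict for brevity.

```python
class PageDirectory:
    """Directory with one entry per data page: page_id -> free bytes (toy model)."""

    def __init__(self, page_size=4096):
        self.page_size = page_size
        self.free_bytes = {}        # page_id -> number of free bytes on that page

    def add_page(self, page_id):
        self.free_bytes[page_id] = self.page_size

    def find_page_with_space(self, needed):
        """Return the first page with at least `needed` free bytes, or None."""
        for page_id, free in self.free_bytes.items():
            if free >= needed:
                return page_id
        return None

    def record_insert(self, page_id, record_size):
        """Update the directory entry after a record is placed on the page."""
        if self.free_bytes[page_id] < record_size:
            raise ValueError("not enough free space on page")
        self.free_bytes[page_id] -= record_size
```

`find_page_with_space` touches only directory entries, never the data pages themselves, which is the efficiency argument made above.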
26
Page Formats

Page: abstraction is used for I/O

Record: data granularity for higher level of DBMS

How to arrange records in pages?

Identify a record:
<page_id, slot_number>, where slot_number = rid
Most cases, use <page_id, slot_number> as rid.

Alternative approaches to manage slots on a page

How to support insert/deleting/searching?


27
Record Formats: Fixed Length Record

Information about field types same for all records in a file

Stored record format in system catalogs.

+ Finding i'th field does not require scan of record, just offset calculation.

[Figure: a record with fields F1 F2 F3 F4 of lengths L1 L2 L3 L4, starting at base address B; address of F3 = B+L1+L2]
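The offset calculation amounts to summing the lengths of the preceding fields. A minimal sketch follows; the function names and the example lengths are assumptions chosen for illustration.

```python
from itertools import accumulate

def field_offsets(field_lengths):
    """Offset of each field from the start of the record: 0, L1, L1+L2, ..."""
    return [0] + list(accumulate(field_lengths))[:-1]

def field_address(base, field_lengths, i):
    """Address of the i'th field (1-based), e.g. B + L1 + L2 for i = 3."""
    return base + field_offsets(field_lengths)[i - 1]

lengths = [4, 8, 2, 16]                   # assumed example values for L1..L4
print(field_address(1000, lengths, 3))    # 1012
```

Because the lengths are the same for every record in the file, the offsets can be computed once from the catalog and reused for all records.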
28
Page Formats: Fixed Length Records

Record id = <page id, slot #>.

Note: In first alternative, moving records for free space management changes rid; may not be acceptable if there are existing external references to the record that is moved.

[Figure: two page layouts. PACKED: slots 1 to N filled contiguously, followed by free space, with the number of records (N) stored on the page. UNPACKED, BITMAP: slots 1 to M, with a bitmap (M ... 3 2 1) marking which slots are occupied and the number of slots (M) stored on the page.]
29
Record Formats: Variable Length

Two alternative formats (# fields is fixed):

[Figure: first format, fields delimited by special symbols: a field count (4) followed by F1 $ F2 $ F3 $ F4 $; second format, an array of field offsets followed by F1 F2 F3 F4]

+ Second offers direct access to i'th field
+ efficient storage of nulls;
- small directory overhead.
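The second format can be sketched as a header of offsets followed by the field data. The layout details here are assumptions made for illustration: offsets are 2-byte little-endian integers, there is one extra offset marking the end of the last field, and a null is encoded as a zero-length field (so this toy cannot tell a null from an empty value).

```python
import struct

def encode_record(fields):
    """Encode fields (bytes, or None for null) as an offset array plus field data."""
    n = len(fields)
    header_size = 2 * (n + 1)                 # n + 1 two-byte offsets
    offsets, body, pos = [], b"", header_size
    for field in fields:
        offsets.append(pos)                   # where this field starts
        if field is not None:
            body += field
            pos += len(field)
    offsets.append(pos)                       # end of the last field
    return struct.pack(f"<{n + 1}H", *offsets) + body

def get_field(record, i):
    """Return the i'th field (0-based) without scanning the earlier fields."""
    start, end = struct.unpack_from("<2H", record, 2 * i)
    return None if start == end else record[start:end]

record = encode_record([b"ab", None, b"cde", b"f"])
print(get_field(record, 2))   # b'cde'
```

Accessing a field needs only two offset lookups, and a null costs no data bytes; the price is the offset array itself, which is the small directory overhead noted above.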
30
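Page Formats: Variable Length Records

Slot directory = {<record_offset, record_length>}

[Figure: page i holding records with Rid = (i,N), ..., Rid = (i,2), Rid = (i,1). The SLOT DIRECTORY has one entry per slot (N ... 2 1, shown with lengths 20, 16, 24), each giving the offset of the record from the start of the data area and its length, plus the number of slots (N) and a pointer to the start of free space.]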
31
Page Formats: Variable Length Records

Slot directory = {<record_offset, record_length>}

Dis/Advantages:
+ Moving: rid is not changed
+ Deletion: offset = -1 (rid changed? Can we delete slot? Why?)
+ Insertion: Reuse deleted slot. Only add a new slot if none is available.

Free space? Free space pointer? Recycle after deletion?
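One way to answer these questions is sketched below as a toy slotted page. It is an illustration under simplifying assumptions: the slot directory is kept in a Python list rather than inside the page bytes, so its space is not charged against the page, and all names are hypothetical.

```python
class SlottedPage:
    """Toy slotted page: slot directory of [offset, length]; offset -1 means deleted."""

    def __init__(self, size=4096):
        self.data = bytearray(size)
        self.slots = []          # slot number -> [record_offset, record_length]
        self.free_start = 0      # pointer to start of free space

    def insert(self, record):
        if self.free_start + len(record) > len(self.data):
            raise ValueError("page full (try compact() first)")
        offset = self.free_start
        self.data[offset:offset + len(record)] = record
        self.free_start += len(record)
        for slot_no, slot in enumerate(self.slots):
            if slot[0] == -1:                      # reuse a deleted slot
                self.slots[slot_no] = [offset, len(record)]
                return slot_no
        self.slots.append([offset, len(record)])   # none available: add a new slot
        return len(self.slots) - 1

    def get(self, slot_no):
        offset, length = self.slots[slot_no]
        if offset == -1:
            raise KeyError("slot is empty")
        return bytes(self.data[offset:offset + length])

    def delete(self, slot_no):
        self.slots[slot_no][0] = -1    # keep the slot so other slot numbers stay valid

    def compact(self):
        """Slide live records together; slot numbers (and hence rids) do not change."""
        packed, pos = bytearray(len(self.data)), 0
        for slot in self.slots:
            if slot[0] != -1:
                packed[pos:pos + slot[1]] = self.data[slot[0]:slot[0] + slot[1]]
                slot[0] = pos
                pos += slot[1]
        self.data, self.free_start = packed, pos
```

Deleting only marks the slot, because removing it would renumber the later slots and change their rids. The bytes of a deleted record are recycled when `compact` moves the surviving records, which changes their offsets but not their slot numbers.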
32
System Catalogs

Meta information stored in system catalogs.

For each index:

structure (e.g., B+ tree) and search key fields

For each relation:

name, file name, file structure (e.g., Heap file)

attribute name and type, for each attribute

index name, for each index

integrity constraints

For each view:

view name and definition

Plus statistics, authorization, buffer pool size, etc.

Catalogs are themselves stored as relations!


33
Attr_Cat(attr_name, rel_name, type, position)

attr_name   rel_name       type     position
attr_name   Attribute_Cat  string   1
rel_name    Attribute_Cat  string   2
type        Attribute_Cat  string   3
position    Attribute_Cat  integer  4
sid         Students       string   1
name        Students       string   2
login       Students       string   3
age         Students       integer  4
gpa         Students       real     5
fid         Faculty        string   1
fname       Faculty        string   2
sal         Faculty        real     3
34
Summary

Disks provide cheap, non-volatile storage.

Random access, but cost depends on location of page on disk

Important to arrange data sequentially to minimize seek and rotation delays.

Buffer manager brings pages into RAM.

Page stays in RAM until released by requestor.

Written to disk when frame chosen for replacement.

Frame to replace based on replacement policy.

Tries to pre-fetch several pages at a time.


35
More Summary

DBMS vs. OS File Support

DBMS needs features not found in many OSs.

forcing a page to disk

controlling the order of page writes to disk

files spanning disks

ability to control pre-fetching and page replacement policy based on predictable access patterns

Formats for Records and Pages:

Slotted page format: supports variable length records and allows records to move on page.

Variable length record format: field offset directory offers support for direct access to i'th field and null values.
36
Even More Summary

File layer keeps track of pages in a file, and supports abstraction of a collection of records.

Pages with free space identified using linked list or directory structure

Indexes support efficient retrieval of records based on the values in some fields.

Catalog relations store information about relations, indexes and views.

Information common to all records in collection.
