Main Memory Databases

Presented by Nate Harada

EECS 584 Fall 2014

Main Memory Database Systems: An Overview
Garcia-Molina and Salem, IEEE, 1992

H-Store: A High-Performance,
Distributed Main Memory
Transaction Processing System
Kallman et al., Brown, 2008
Jones et al., MIT
Hugg, Vertica
Abadi, Yale

Covering this Talk


Overview of Main Memory Databases
Specific Considerations
Concurrency Control
Commit Processing
Access Methods
Query Processing
Miscellaneous

Case Study: H-Store


The Landscape

*From Stonebraker's 2007 VLDB Presentation on H-Store

Solutions

*From Stonebraker's 2007 VLDB Presentation on H-Store

DBMS vs MMDB

Why Main Memory DB?


Main memory is faster
Disk access is traditionally the bottleneck
Random access is just as fast as sequential access

Main memory is simpler


Fast access simplifies concurrency
No caching

Why Not Main Memory DB?


Main memory is volatile
Data must somehow be recovered after a crash

Main memory is expensive


Limited to small databases (for now)

https://www.ic.gc.ca/eic/site/oca-bc.nsf/eng/ca02093.html

Concurrency Control
Transactions hold locks only briefly, so lock granules can be larger
Up to the whole database

Can change lock structure
Store lock info with the data itself instead of in a hash table
[Figure: a Lock Bit stored next to the File Data]
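
As a rough illustration of keeping lock information with the data, the sketch below puts a one-bit lock flag inside each record instead of in a separate lock table. The names `Record`, `try_lock`, and `unlock` are hypothetical, and the code is single-threaded; a real system would need an atomic test-and-set on the bit.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class Record:
    """A data item that carries its own lock bit (hypothetical layout)."""
    value: Any
    locked: bool = False  # the lock bit lives in the record itself


def try_lock(record: Record) -> bool:
    """Set the lock bit if it is clear; report whether the lock was obtained."""
    if record.locked:
        return False
    record.locked = True
    return True


def unlock(record: Record) -> None:
    """Clear the lock bit."""
    record.locked = False
```

Checking the lock costs one field read on a record the transaction is already touching, with no hash-table lookup.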

Commit Processing
Logging becomes a bottleneck if we write to disk
We could use stable main memory
Non-volatile RAM is just becoming available
Hold the log tail there and flush it to disk continuously

We could do group commits
Trade latency for throughput
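
A minimal sketch of group commit, assuming a fixed group size: commit records are buffered and written with a single flush once enough have accumulated. `GroupCommitLog` and its `flush` callback are hypothetical names; the callback stands in for one disk write.

```python
class GroupCommitLog:
    """Buffer commit records and flush them to stable storage in groups."""

    def __init__(self, group_size, flush):
        self.group_size = group_size
        self.flush = flush      # callable taking a list of records (one "disk write")
        self.pending = []       # commits waiting for the next flush
        self.flush_count = 0

    def commit(self, record):
        """Queue a commit record; return True if this call made the group durable."""
        self.pending.append(record)
        if len(self.pending) >= self.group_size:
            self.force()
            return True
        return False

    def force(self):
        """Flush whatever is pending, e.g. on a timer or at shutdown."""
        if self.pending:
            self.flush(list(self.pending))
            self.flush_count += 1
            self.pending.clear()
```

Each transaction waits until its group is flushed (higher latency), but the number of disk writes drops by roughly the group size (higher throughput).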

Recovery
How do we deal with crashes?

Recovery
Most MMDB systems dump to disk occasionally
Generally this is the entire database

Trade-off between frequent (up to date) and infrequent (good performance) dumps
Could also have multiple machines (redundancy)
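
One way to picture dump-based recovery is the toy sketch below. It assumes that updates made since the last dump are also logged, so recovery reloads the dump and replays the log. `TinyMMDB` is a hypothetical name, and the `checkpoint` and `log` fields stand in for files on disk.

```python
import copy


class TinyMMDB:
    """Toy in-memory store with a full dump and a log of later updates."""

    def __init__(self):
        self.data = {}         # volatile main-memory contents
        self.checkpoint = {}   # stands in for the on-disk dump
        self.log = []          # stands in for the on-disk log since the last dump

    def put(self, key, value):
        self.log.append((key, value))   # log first, then apply in memory
        self.data[key] = value

    def dump(self):
        """Write the entire database to 'disk' and truncate the log."""
        self.checkpoint = copy.deepcopy(self.data)
        self.log.clear()

    def crash(self):
        """Lose everything held in main memory."""
        self.data = {}

    def recover(self):
        """Reload the last dump, then replay the updates logged after it."""
        self.data = copy.deepcopy(self.checkpoint)
        for key, value in self.log:
            self.data[key] = value
```

Frequent dumps keep the log short and recovery fast; infrequent dumps cost less during normal operation but leave more log to replay.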

Access Methods
Data Representation
Indexes can store pointers to the data instead of the data itself
Pointers to communicate with client

Index Structures: T-Trees vs B-Trees
We can use deeper trees, since following a pointer is cheap in memory
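
For intuition, here is a simplified T-tree-style lookup: a binary tree whose nodes each hold a small sorted array of keys. The sketch builds a static tree from sorted keys and omits insertion, rebalancing, and the record pointers a real index would store; `TNode`, `build`, `search`, and the node capacity of 4 are illustrative assumptions.

```python
from bisect import bisect_left
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TNode:
    keys: List[int]                  # sorted keys held in this node
    left: Optional["TNode"] = None   # subtree with keys below keys[0]
    right: Optional["TNode"] = None  # subtree with keys above keys[-1]


def build(sorted_keys, node_capacity=4):
    """Build a static, balanced tree from keys that are already sorted."""
    chunks = [sorted_keys[i:i + node_capacity]
              for i in range(0, len(sorted_keys), node_capacity)]

    def make(lo, hi):
        if lo >= hi:
            return None
        mid = (lo + hi) // 2
        return TNode(chunks[mid], make(lo, mid), make(mid + 1, hi))

    return make(0, len(chunks))


def search(node, key):
    """Descend by comparing against each node's bounds, then search inside it."""
    while node is not None:
        if key < node.keys[0]:
            node = node.left
        elif key > node.keys[-1]:
            node = node.right
        else:
            i = bisect_left(node.keys, key)
            return node.keys[i] == key
    return False
```

The tree is deeper than a B-tree over the same keys, but each step is only a pointer dereference and two comparisons, which is cheap when everything is in memory.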

Query Processing
Disk access time is no longer important
Cost estimation is different

Sequential access is no longer important
Can create different data structures, e.g. DBGraph

Performance and scheduling of backups matter

Miscellaneous
Applications can be given actual memory positions for reads
Can even give direct access for writes
Dangerous!
H-Store solves this with precompiled procedures

How do we determine where to store items we've migrated to disk?
This issue has no traditional counterpart

H-Store

H-Store

ONE CPU

Architecture

Deploy Time
Procedures are compiled
Layout determined by administrator
Database optimized at deploy time

Runtime
Transactions
Initiated at one site; that site fulfills the transaction

Special Cases
Single-sited: Transaction runs on only one site
One-shot: Each query in transaction runs on only one site
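
The two special cases can be told apart mechanically once we know which site each query touches. The sketch below follows the slide's definitions only and assumes integer keys hashed to sites by modulo; `classify` and the "general" label for everything else are hypothetical.

```python
def classify(queries, n_sites):
    """Classify a transaction given the keys that each of its queries touches.

    `queries` is a non-empty list of key lists, one list per query.
    Keys are assumed to be integers assigned to sites by `key % n_sites`.
    """
    sites_per_query = [{key % n_sites for key in q} for q in queries]
    all_sites = set().union(*sites_per_query)
    if len(all_sites) == 1:
        return "single-sited"   # the whole transaction stays on one site
    if all(len(sites) == 1 for sites in sites_per_query):
        return "one-shot"       # each query stays on one site, but sites differ
    return "general"            # at least one query spans several sites
```

For example, with two sites, `classify([[1], [3]], 2)` is single-sited because both keys map to site 1, while `classify([[1], [2]], 2)` is one-shot.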

Single-Sited Transaction
[Figure sequence, one DATA site and a Client: Request, then Redirect, then Response]

Multi-Sited Transaction
[Figure: three DATA sites and a Client sending a Request]

Locking
H-Store has no locks; transactions execute one at a time on each site
Concurrency achieved by partitioning data across machines
We simply hope that a transaction doesn't need data on multiple partitions
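
A toy model of lock-free execution over partitions, assuming integer keys assigned to sites by modulo: each site owns its data and drains its queue one procedure at a time, so nothing on a site ever runs concurrently. `Site`, `Cluster`, and `deposit` are hypothetical names, not H-Store's API.

```python
from collections import deque


class Site:
    """One partition: private data plus a queue that is drained serially."""

    def __init__(self):
        self.data = {}
        self.queue = deque()

    def submit(self, procedure, *args):
        self.queue.append((procedure, args))

    def run(self):
        """Execute queued procedures one at a time; no locks are needed."""
        results = []
        while self.queue:
            procedure, args = self.queue.popleft()
            results.append(procedure(self.data, *args))
        return results


class Cluster:
    def __init__(self, n_sites):
        self.sites = [Site() for _ in range(n_sites)]

    def site_for(self, key):
        return self.sites[key % len(self.sites)]   # assumed partitioning rule

    def submit(self, key, procedure, *args):
        """Route a single-sited transaction to the site that owns `key`."""
        self.site_for(key).submit(procedure, *args)


def deposit(data, account, amount):
    """Example stored procedure: runs entirely against one site's data."""
    data[account] = data.get(account, 0) + amount
    return data[account]
```

Different sites can drain their queues in parallel because they share no data. A transaction that needs several partitions does not fit this model and requires coordination between sites.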

Multi-Sited Transaction
[Figure sequence, Client sending a Request: three DATA sites shown, then one]
