
Declarative Programming over Eventually Consistent Data Stores
Gowtham Kaki
KC Sivaramakrishnan
Suresh Jagannathan
[Figure: HTTP front ends, application servers and caches; the data store is relied on for consistency, integrity, durability, availability, etc.]
Account balances should be non-negative

Usernames should be unique

Only bona fide bids are accepted in an auction.

Application invariants

Strong consistency
Linearizability & Serializability

☐ Strongly consistent, but not “always on”


☐ Be “always on”, but no strong consistency

Eventual Consistency (convergence)
Session 1

//init balance = 0
deposit(100)
? ← get_balance()

Store consistency levels in eventually consistent data stores:
• Basic eventual
• Read-my-writes
• Causal
• Read committed
• Monotonic writes
• Bounded staleness
• Parallel Snapshot Isolation
Eventual Consistency
Replica 1: bal=100, Replica 2: bal=0
Session 1:
//init balance = 0
deposit(100)
0 ← get_balance()

Read-my-writes consistency
Replica 1: bal=100, Replica 2: bal=100
Session 1:
//init balance = 0
deposit(100)
100 ← get_balance()
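
One way to picture the difference, assuming a replica is modelled as the set of writes it has applied: under read-my-writes, a session's read may only be served by a replica that already holds every write the session issued. The function and variable names below are hypothetical and do not come from any store's API.

```python
def can_serve(replica_writes, session_writes):
    """Read-my-writes: the replica must contain all of the session's own writes."""
    return session_writes <= replica_writes

def balance(replica_writes):
    """Sum the deposits a replica has applied (initial balance is 0)."""
    return sum(amount for (_op_id, amount) in replica_writes)

session_writes = {("dep1", 100)}   # Session 1 issued deposit(100)
replica1 = {("dep1", 100)}         # has received the deposit
replica2 = set()                   # has not received it yet

# Plain eventual consistency may serve the read from the stale replica.
eventual_read = balance(replica2)

# Read-my-writes restricts the read to replicas holding the session's writes.
rmw_replicas = [r for r in (replica1, replica2) if can_serve(r, session_writes)]
rmw_read = balance(rmw_replicas[0])
```

A real store could equally wait until the chosen replica catches up; filtering replicas is only one possible strategy.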

Application operations: deposit(), withdraw(), tweet(), bid()
Application invariants


Can we automate the process of mapping application requirements to store consistency levels?

Our solution …

Classification Scheme
An algorithm to map application requirements to store consistency guarantees.
• Sound.
• Optimal

Application requirements
• Unique usernames.
• Non-negative balance.
• Bona fide bids.

Store consistency guarantees
• Read-my-writes consistency
• Causal consistency
• Read committed isolation level
• Repeatable read isolation level

Specification Language
A common medium to express both.


Prelims - System Model
[Figure: a replicated data store with replicas 1 … n. Each replica holds effects such as Deposit(200), Withdraw(20) and Withdraw(10); visibility (Vis) relates these effects to the getBalance operations that witness them. Sessions 1 … n issue operations such as v1 = getBalance(); … v2 = getBalance();, ordered by session order (SO).]
Specification Language
• Axiomatically capture set of valid executions
• Associate with each operation a single abstract effect
– Express relationship between effects
– Visibility (vis), Session order (so), Same object (sameobj)

Primitive relations

Happens-before
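
To make the derived relation concrete, the sketch below computes happens-before over a tiny finite execution. It assumes the common definition hb = (so ∪ vis)+, that is, the transitive closure of session order and visibility; the slide's own formula is not reproduced here. The effect names and the `transitive_closure` helper are illustrative only.

```python
def transitive_closure(pairs):
    """Return the transitive closure of a binary relation given as a set of pairs."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        for (x, y) in list(closure):
            for (y2, z) in list(closure):
                if y == y2 and (x, z) not in closure:
                    closure.add((x, z))
                    changed = True
    return closure

# Hypothetical execution: a and b are in one session, c is in another.
so = {("a", "b")}    # session order
vis = {("b", "c")}   # b's effect is visible to c

# Assumed definition: hb = (so ∪ vis)+
hb = transitive_closure(so | vis)
```

Here a happens before c even though neither primitive relation relates them directly.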
Replicated Bank Account (1)
balance >= 0 violated

Session 1: //init balance = 100
withdraw(70);  (a)

Session 2: //init balance = 100
withdraw(70);  (b)

[Figure: vis edges between effects a and b, one in each direction.]
Bank Account Contracts (2)
Session 1: deposit(100)  (a)
Session 2: withdraw(50)  (b), with a visible to b (vis)
Session 3: getbalance()  (c), with b visible to c (vis)

getbalance() returns 50 when a is also visible to c, and -50 when only b is visible.
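
The two possible results can be reproduced with a small sketch, assuming getbalance simply folds over whichever effects are visible to it. The `Effect` type and `get_balance` function are hypothetical and only illustrate the idea.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Effect:
    kind: str    # "deposit" or "withdraw"
    amount: int

def get_balance(visible_effects):
    """Fold the effects visible to a getbalance operation into a balance."""
    total = 0
    for e in visible_effects:
        total += e.amount if e.kind == "deposit" else -e.amount
    return total

a = Effect("deposit", 100)   # Session 1
b = Effect("withdraw", 50)   # Session 2, which saw a

sees_only_b = get_balance({b})       # b visible without a
sees_a_and_b = get_balance({a, b})   # b visible together with a
```

The negative result arises only when the withdraw is visible without the deposit it depended on.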
Capturing Store Consistency Levels

• Eventual Consistency
• Causal Consistency
• Strong Consistency
Classification Scheme

Decidable: automatically discharged with the help of the Z3 SMT solver.

Levels: Eventual Consistency (EC), Causal Consistency (CC), Strong Consistency (SC)
deposit → EC
withdraw → SC
getBalance → CC
Classification Scheme (2)
Transactions
• Generalizing to transactions is easy.
– Add single primitive relation - sametxn(a,b)
– Derived relation:

• Isolation guarantees of stores can now be specified.

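Read Committed
[Figure: Txn 1 (current) contains oper1(…) with effect a; Txn 2 (committed) contains oper2(…) with effect b and oper3(…) with effect c; vis edges connect a with b and with c.]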
Monotonic Atomic View

MAV semantics is useful for maintaining the integrity of foreign key constraints, materialized views and secondary updates [2]. In order to implement MAV, a store only needs to keep track of the set of transactions $S_t$ witnessed by the running transaction, and before performing an operation at some replica, ensure that the replica includes all the transactions in $S_t$. Hence, MAV is coordination-free. MAV semantics is captured with the following contract:

$$\psi_{mav} = \forall a, b, c, d.\ txn\{a, b\}\{c, d\} \wedge so(a, b) \wedge vis(c, a) \wedge sameobj(d, b) \Rightarrow vis(d, b)$$

whose semantics is illustrated in Figure 8(b).

Repeatable Read

ANSI RR semantics requires that the transaction witness a snapshot of the data store state. Importantly, this snapshot can be obtained from any replica, and hence RR is also coordination-free. An example for such a transaction is the totalBalance transaction. The semantics of RR is captured by the following contract:

$$\psi_{rr} = \forall a, b, c, d.\ txn\{a, b\}\{c, d\} \wedge vis(c, a) \wedge sameobj(d, b) \Rightarrow vis(d, b)$$

whose semantics is illustrated in Figure 8(c).
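
A contract of this shape can be checked by brute force on a small finite execution. The sketch below does so for the MAV contract, ∀a,b,c,d. txn{a,b}{c,d} ∧ so(a,b) ∧ vis(c,a) ∧ sameobj(d,b) ⇒ vis(d,b). It assumes txn{a,b}{c,d} means that a and b share one transaction, c and d share another, and the two transactions differ. The execution encoding (dictionaries and sets of pairs) is hypothetical; this is not how Quelea discharges contracts, which is done symbolically through an SMT solver.

```python
from itertools import product

def satisfies_mav(effects, txn_of, obj_of, so, vis):
    """Check the MAV contract over every 4-tuple of effects."""
    for a, b, c, d in product(effects, repeat=4):
        txn_ab_cd = (txn_of[a] == txn_of[b] and txn_of[c] == txn_of[d]
                     and txn_of[a] != txn_of[c])
        premise = (txn_ab_cd and (a, b) in so and (c, a) in vis
                   and obj_of[d] == obj_of[b])
        if premise and (d, b) not in vis:
            return False
    return True

# Hypothetical execution: T1 = {a, b} touches objects x then y;
# T2 = {c, d} touched x and y.
effects = ["a", "b", "c", "d"]
txn_of = {"a": "T1", "b": "T1", "c": "T2", "d": "T2"}
obj_of = {"a": "x", "b": "y", "c": "x", "d": "y"}
so = {("a", "b")}

# T1 sees only part of T2: a sees c, but b does not see d.
partial_view = satisfies_mav(effects, txn_of, obj_of, so, vis={("c", "a")})
# T1 sees all of T2's relevant effects.
atomic_view = satisfies_mav(effects, txn_of, obj_of, so,
                            vis={("c", "a"), ("d", "b")})
```

Dropping the `(a, b) in so` conjunct from the premise gives the corresponding check for the RR contract.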
• Haskell library for Eventually Consistent Data Stores (ECDS)
– Definition language: define operations and transactions on replicated data.
– Specification language: specify consistency and isolation requirements.

[Figure labels: DEFS, GHC, Quelea, Data Store]
Case Studies

Application      RUBiS                                   Twitter-lite
#Tables          6                                       5
#Operations      17                                      20
#Transactions    6                                       10
Invariants       e.g. See all bids placed in             Unique username
                 current session

Results of classification
#EC Ops          14                                      13
#CC Ops          2                                       6
#SC Ops          1                                       1
#RC Txns         4                                       6
#MAV Txns        2                                       3
#RR Txns         0                                       1
Evaluation
• Correctness with classification vs without classification
– How do they compare in terms of availability?
• NoRep: No Replication
• StrongRep: Strong Replication
• Experimental Setup:
– Amazon EC2; 5 replicas (StrongRep & Quelea); 1 replica (NoRep)
– Gradually increased # of concurrent clients from 128 to 1024.
Conclusion
• Quelea: a Haskell library for programming ECDS
– Automatic classification of operation and transaction contracts through an SMT solver
• Leveraging off-the-shelf ECDS
– Avoid re-engineering complex systems
– Makes it practical!

http://kcsrk.info/Quelea
Thank you!
State Summarization
• Summarization is essential to check the unbounded growth of the log.
• How is summarization done?
– Ask developer for summarization semantics.

– Replace (many) original effects with (few) summary effects.


[Figure 9: Implementation Model. Clients reach the business logic through a REST API; the business logic calls obj.oper(args) on the Quelea replicated store and receives res. The Quelea replicated store consists of a shim layer (RDTs) over an off-the-shelf distributed store, which it accesses with select and insert.]
Off-the-shelf distributed store:
• Failure handling (incl. Txns)
• Persistence (on-disk)
• Eventual consistency
Shim layer (RDTs):
• Soft-state (in-mem)
• Datatype operations
• Summarization
• Stronger consistency
Monotonic Reads

… like “monotonic reads” (roughly requiring that time doesn’t appear to go backward).

… So, if we want to build an available system providing the monotonic reads session guarantee, we can ensure that read operations only return writes when the writes are present on all servers.
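
The strategy in the quote can be sketched as follows, assuming each server is modelled as the set of writes it has received; a read then returns only the writes present on all servers. The `stable_writes` name and the server sets are hypothetical.

```python
def stable_writes(servers):
    """Return the writes that every server has received."""
    servers = list(servers)
    if not servers:
        return set()
    return set.intersection(*servers)

# Hypothetical state: w2 has not yet reached the third server.
servers = [{"w1", "w2"}, {"w1", "w2"}, {"w1"}]
first_read = stable_writes(servers)

# Once w2 has propagated everywhere, a later read includes it as well.
servers[2].add("w2")
second_read = stable_writes(servers)
```

As long as servers only gain writes, a later read never loses a write that an earlier read returned, whichever server answers it.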
• In the paper
– Stronger eventual consistency
– Highly available transaction support
– Summarization

… dependencies introduced between effects due to visibility, session and same transaction relations. Dependence tracking is similar to the techniques presented in [3] and [21]. Because Cassandra provides …
System Model
[Figure, shown in build steps: the Quelea data store with replicas R1, R2, …, Rn and sessions 1, 2, …, n. Labels: Alice, Bob; objects A and B; effects; session order.]
1. Every replica holds A → {d10, w2} and B → {d9}. A session issues B.deposit($5), producing effect d5, followed in session order by B.withdraw($6), producing effect w6.
2. The new effects reach different replicas first: R1 holds B → {d9, d5}, R2 holds B → {d9}, Rn holds B → {d9, w6}.
3. Eventually every replica holds B → {d3, d5, w6}.
System Model
[Figure: replica R1 of the Quelea data store holds A → {d10, w2} and B → {d3, d5, w6}. Labels: Alice, Bob.]

Session 1:
B.deposit($5)   (d5)
B.withdraw($6)  (w6)
……
v1 = B.getBalance()  (gb)

The effects in B's set at R1 are visible (vis) to gb, so v1 = $3 + $5 - $6 = $2.
System Model
[Figure: a replicated data store with replicas 1 … n holding effects such as Deposit(200), Withdraw(20), Withdraw(10) and Deposit(10). Sessions 1 … n issue operations such as getBalance; … withdraw(6);, ordered by session order (SO).]
Evaluation

Table 1: The distribution of classified contracts. #T refers to the number of tables in the application. The columns 4-6 (7-9) represent operations (transactions) assigned to this consistency (isolation) level.

Benchmark       LOC   #T   EC   CC   SC   RC   MAV   RR
LWW Reg         108   1    2    2    2    0    0     0
DynamoDB        126   1    3    1    2    0    0     0
Bank Account    155   1    1    1    1    1    0     1
Shopping List   140   1    2    1    1    0    0     0
Online store    340   4    9    1    0    2    0     1
RUBiS           640   6    14   2    1    4    2     0
Microblog       659   5    13   6    1    6    3     1

… summarization strategy. Suppose the original set of effects on an object are o1, o2 and o3. When summarized, the new effects yielded are n1 and n2. We first instantiate a summarization marker s, and similar to transaction marker, we do not insert it into the store immediately. We insert the new effects n1 and n2, with strong consistency, including s as a dependence. Since s is not yet in the store, the new effects are not made visible to the clients. Then we insert s with strong consistency, including the original effects o1, o2 and o3 as dependence. Strongly consistent insertions ensure that a shim layer node witnessing s on some object must also witness n1 and n2 on the same object. A shim layer node which witnesses all …
Replicated Bank Account (1)

Session 1:
  bal = A.getBalance();
  if (bal ≥ 70)
    A.withdraw(70);

Session 2:
  A.withdraw(70);
Replicated Bank Account (1)
Required invariant: balance >= 0

Session 1:
  100 = A.getBalance();    (a)
  if (100 ≥ 70)
    A.withdraw(70);        (b)

Session 2:
  A.withdraw(70);          (c)

[Figure: effects a and b in Session 1 are related by SO; a VIS edge involves effect c of Session 2.]
Transactions
• Allows composition of operations.
• Serializable transactions are unavailable
• Highly available transactions (HAT)
– Atomic, but relaxed Isolation.
– Isolation levels: read committed, repeatable read, monotonic atomic view, etc.
– Express foreign key constraints, secondary indexes etc.
• Choosing the correct isolation guarantee is an error-prone process
– Automate it through specifications and classification!
– sametxn(a,b)
Case Study - RUBiS
• An “e-Bay”-like auction site
• Application Invariants:
– Canceling a bid must not violate data integrity
– A bidder must see all bids placed in the current session
– …
Operations (17) Transactions (6)
StockItem newItem
RemoveItemFromStock OpenAuction
AddBid ConcludeAuction
… …

[Charts: operations by consistency level: EC 14, CC 2, SC 1; transactions by isolation level: RC 4, MAV 2.]
[Figure: a VIDEO_ID / COUNT table replicated at several sites connected over the Internet.]
☐ Strongly consistent, but not “always on”
☐ Be “always on”, but no strong consistency

Eventual Consistency (convergence)
