Apache Cassandra

An Overview

Copyright © 2013 Tech Mahindra. All rights reserved. 1


What is Apache Cassandra?

“Apache Cassandra is an open source, distributed, decentralized, elastically scalable, highly available, fault-tolerant, tuneably consistent, column-oriented database that bases its distribution design on Amazon’s Dynamo and its data model on Google’s Bigtable.”

Created at Facebook, it is now used at some of the most popular sites on the Web.


Why Cassandra?

Data volume, 6-fold growth in 4 years:
• 2006: 161 EB (322 million 500 GB drives)
• 2010: 988 EB (1.98 billion 500 GB drives)

Source: http://www.emc.com/collateral/analyst-reports/expanding-digital-idc-white-paper.pdf


Scalability and Big Data?

• YouTube serves 200 million videos every day
• Chevron accumulates 2 TB of data every day
• Indian telecom collects 155 TB of call data per month, and growing
• Google provisions 900,000 Android phones every day
• By 2015 there will be 2.5 billion email accounts
• By 2015 there will be 1 billion telecom subscribers in India
• Will an RDBMS ever scale to these ever-growing volumes?


RDBMS

• RDBMS: structured and organized data
• Structured Query Language (SQL)
• Data and its relationships are stored in separate tables
• Data Manipulation Language (DML), Data Definition Language (DDL)
• Tight consistency


SQL

• Specialized data structures (think B-trees)
• Shines with complicated queries
• Focus on fast query and analysis
• Not necessarily on large datasets


NOSQL

• Stands for Not Only SQL
• No declarative query language (recently evolving)
• No predefined schema
• Key-value pair storage, column store, document store, graph databases
• Eventual consistency rather than ACID properties
• Unstructured and unpredictable data
• Driven by the CAP theorem
• Prioritizes high performance, high availability and scalability


NOSQL Advantages & Disadvantages

• Advantages
  – High scalability
  – Distributed computing
  – Lower cost
  – Schema flexibility, semi-structured data
  – No complicated relationships
  – Object-oriented programming that is easy to use and flexible
• Disadvantages
  – No standardization
  – Limited query capabilities (so far)
  – Eventual consistency is not intuitive to program for


CAP Theorem

• Consistency:
  – If we write data to one node and read it from another node in a distributed system, the read returns what was written on the first node.
• Availability:
  – Each node of the distributed system should respond to a query unless it dies.
• Partition tolerance:
  – The system stays available and keeps operating seamlessly even when the network is partitioned (for example, nodes added to or removed from different data centers) or messages are lost over the network.


Selecting the DB type

• CA
  – To primarily support consistency and availability means that you’re likely using two-phase commit for distributed transactions. It means that the system will block when a network partition occurs, so it may be that your system is limited to a single data center cluster in an attempt to mitigate this. If your application needs only this level of scale, this is easy to manage and allows you to rely on familiar, simple structures.
• CP
  – To primarily support consistency and partition tolerance, you may try to advance your architecture by setting up data shards in order to scale. Your data will be consistent, but you still run the risk of some data becoming unavailable if nodes fail.
• AP
  – To primarily support availability and partition tolerance, your system may return inaccurate data, but the system will always be available, even in the face of network partitioning. DNS is perhaps the most popular example of a system that is massively scalable, highly available, and partition-tolerant.


BASE, an alternative to ACID

• ACID
  – Atomicity
  – Consistency
  – Isolation
  – Durability
  – All of the above, but not scalable
• BASE
  – Basically Available
  – Soft state
  – Eventual consistency
  – All of the above, but not strongly consistent


Enter Cassandra

• Amazon Dynamo
  – Consistent hashing
  – Partitioning
  – Replication
  – One-hop routing
• Google Bigtable
  – Column families
  – Memtables
  – SSTables


Distributed and Scalable

• Horizontal: commodity hardware, not specialized boxes
• All nodes are identical
• No master or single point of failure (SPOF)
• Adding nodes is simple
• Automatic cluster maintenance


Replication

• Replication factor
  – How many nodes the data is replicated on
• Consistency level
  – Zero, One, Quorum, All
• Sync or async for writes
• Reliability of reads
  – Read repair
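Read repair can be illustrated with a small sketch. This is not Cassandra's implementation: the `Versioned` class, the `read_with_repair` function, and the use of plain dictionaries as replicas are all assumptions made for illustration. The idea shown is that a read compares the copies held by the replicas, returns the one with the newest timestamp, and writes it back to any replica that is stale or missing the value.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Versioned:
    value: str
    timestamp: int  # write time; a larger number means a newer write


def read_with_repair(replicas: List[Dict[str, Versioned]], key: str) -> Optional[str]:
    """Return the newest value for key and copy it to stale replicas."""
    found = [replica[key] for replica in replicas if key in replica]
    if not found:
        return None
    newest = max(found, key=lambda v: v.timestamp)
    for replica in replicas:
        if replica.get(key) != newest:
            replica[key] = newest  # repair: overwrite missing or older copies
    return newest.value


# Three replicas (RF=3): one stale, one current, one that missed the write.
replicas = [
    {"user:1": Versioned("old@example.com", 1)},
    {"user:1": Versioned("new@example.com", 2)},
    {},
]
latest = read_with_repair(replicas, "user:1")
```

After the call, all three hypothetical replicas hold the version with timestamp 2.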


Ring Topology

• RF=3
• Conceptual ring
• One token per node
• Multiple ranges per node

[Figure: conceptual ring with nodes a, d, g, j]

Ring Topology

• RF=2
• Conceptual ring
• One token per node
• Multiple ranges per node

[Figure: conceptual ring with nodes a, d, g, j]

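The ring slides can be made concrete with a minimal sketch, under stated assumptions: MD5 stands in for the partitioner's hash, each node gets one token derived from its name, and replicas are simply the next RF nodes clockwise from the key's token (a single data center placement). The `Ring` class and `token` helper are hypothetical names, not Cassandra APIs.

```python
import bisect
import hashlib


def token(name: str) -> int:
    """Map a node name or row key to a position on the ring (MD5 is an assumption)."""
    return int(hashlib.md5(name.encode()).hexdigest(), 16)


class Ring:
    def __init__(self, nodes, rf):
        self.rf = rf
        # One token per node, kept sorted so the ring can be walked clockwise.
        self.tokens = sorted((token(n), n) for n in nodes)

    def replicas(self, key):
        """Return the RF nodes responsible for key, starting at the first
        node whose token is >= the key's token and wrapping around."""
        start = bisect.bisect_left(self.tokens, (token(key), ""))
        count = min(self.rf, len(self.tokens))
        return [self.tokens[(start + i) % len(self.tokens)][1] for i in range(count)]


nodes = ["a", "d", "g", "j"]
rf3 = Ring(nodes, rf=3).replicas("user:42")
rf2 = Ring(nodes, rf=2).replicas("user:42")
```

With the same nodes, lowering RF from 3 to 2 only drops the last replica in the clockwise walk, which is why each node ends up holding multiple ranges.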
New Node

• RF=3
• Token assignment
• Range adjustment
• Bootstrap
• Arrival only affects immediate neighbors

[Figure: ring with nodes a, d, g, j and new node m]

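The claim that a new node only affects its immediate neighbors can be checked with a small experiment. The sketch below tracks primary ownership only (it ignores the additional replicas that RF=3 implies) and assumes MD5-derived tokens; `token` and `owner` are hypothetical helpers.

```python
import bisect
import hashlib


def token(name: str) -> int:
    return int(hashlib.md5(name.encode()).hexdigest(), 16)


def owner(nodes, key):
    """Primary owner: the first node clockwise from the key's token."""
    ring = sorted((token(n), n) for n in nodes)
    i = bisect.bisect_left(ring, (token(key), ""))
    return ring[i % len(ring)][1]


keys = [f"key{i}" for i in range(1000)]
before = {k: owner(["a", "d", "g", "j"], k) for k in keys}
after = {k: owner(["a", "d", "g", "j", "m"], k) for k in keys}

moved = [k for k in keys if before[k] != after[k]]
donors = {before[k] for k in moved}  # nodes that gave up keys to the new node
```

Every key that moves ends up on the new node `m`, and all of them come from a single existing node (the one that follows `m` on the ring); the other nodes keep their keys.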
Ring Partition

• RF=3
• Node dies
• Available?
• Hinting
• Handoff
• Plan for this

[Figure: ring with nodes a, d, g, j]

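Hinting and handoff can be sketched as follows. This is a simplified illustration, not Cassandra's code: the `Node` class and the `write` and `replay_hints` functions are hypothetical. When a replica is down, the coordinator keeps a hint for it, and hands the write off once the replica is back.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.up = True
        self.data = {}
        self.hints = []  # (target node, key, value) writes held for down replicas


def write(coordinator, replicas, key, value):
    """Write to live replicas; keep a hint for each replica that is down."""
    for replica in replicas:
        if replica.up:
            replica.data[key] = value
        else:
            coordinator.hints.append((replica, key, value))


def replay_hints(coordinator):
    """Hand off stored hints to targets that are back up; keep the rest."""
    pending = []
    for target, key, value in coordinator.hints:
        if target.up:
            target.data[key] = value
        else:
            pending.append((target, key, value))
    coordinator.hints = pending


a, d, g = Node("a"), Node("d"), Node("g")
d.up = False                    # node d dies
write(a, [a, d, g], "k", "v1")  # a and g accept the write; a stores a hint for d
d.up = True                     # node d recovers
replay_hints(a)                 # d receives the write it missed
```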
Schema-free Sparse-table

• Flexible column naming
• You define the sort order
• A row is not required to have a specific column just because another row does


Data Model Concepts

• The Apache Cassandra data model has 4 main concepts
  – Cluster
  – Keyspace
  – Column family
    • A column family contains multiple columns referenced by a row key
  – Super column family


Cluster

• Cassandra is meant to run on a cluster
• Although Cassandra can run stand-alone, doing so defeats the purpose it is built for
• A cluster is arranged as a ring of nodes
• Clients send read/write requests to any node in the ring
• That node takes on the role of coordinator node and forwards the request to the node responsible for servicing it
• A partitioner decides which nodes store which rows
• A cluster is a container for keyspaces


Keyspace

• A keyspace is a namespace that groups multiple column families, typically one per application. It is the outermost container for data in Cassandra.
• The basic attributes that you can set per keyspace are:
  – Replication factor
    • The number of nodes that hold a copy of the data
  – Replica placement strategy
    • How the replicas are placed in the ring
    • There are different strategies:
      – SimpleStrategy (single data center)
      – NetworkTopologyStrategy (across data centers)


Column Family (Table)

• A column family is roughly analogous to a table in the relational model
• It is a container for a collection of rows
• Each row can have a different set of columns
• A column family can be one of two types
  – Static column family
    • Static set of columns
  – Dynamic column family
    • Can use application-supplied column names to store data


Column

• The column is the smallest increment of data in Cassandra
• It is a tuple containing a name, a value and a timestamp
• A column must have a name, and the name can be a static label (such as “name” or “email”) or it can be dynamically set when the column is created by your application


Super Column

• A Cassandra column family can contain either regular columns or super columns, which add another level of nesting to the regular column family structure
• Super columns consist of a (super) column name and an ordered map of sub-columns
• A super column can specify a comparator on both the super column name and the sub-column names


Bird’s Eye View



Data Model

• Keyspace
  • ColumnFamily
    • Row (indexed)
      • Key
      • Columns
        – Name (sorted)
        – Value
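The hierarchy above maps naturally onto nested dictionaries. The following is a minimal sketch for illustration only: `new_keyspace`, `put` and `get_row` are hypothetical helpers, not a client API. It assumes that each column stores a (value, timestamp) pair, that the write with the newest timestamp wins, and that columns are returned sorted by name.

```python
from collections import defaultdict


def new_keyspace():
    # column family name -> row key -> {column name: (value, timestamp)}
    return defaultdict(lambda: defaultdict(dict))


def put(keyspace, column_family, key, column, value, timestamp):
    """Store one column; the write with the newest timestamp wins."""
    row = keyspace[column_family][key]
    current = row.get(column)
    if current is None or timestamp >= current[1]:
        row[column] = (value, timestamp)


def get_row(keyspace, column_family, key):
    """Return the row's columns as (name, value) pairs sorted by column name."""
    row = keyspace.get(column_family, {}).get(key, {})
    return [(name, row[name][0]) for name in sorted(row)]


ks = new_keyspace()
put(ks, "users", "jbellis", "name", "J", 1)
put(ks, "users", "jbellis", "email", "jb@example.com", 1)
put(ks, "users", "alice", "name", "Alice", 1)  # no email column: rows are sparse
put(ks, "users", "jbellis", "email", "stale@example.com", 0)  # older write is ignored
```

The two rows in the same column family hold different sets of columns, which is the sparse-table behavior described earlier.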


Data Model

[Figure: a single column]


Data Model

[Figure: a single row]




Why Key-value Store?

• (Business) Key -> Value
• (twitter.com) Tweet ID -> information about the tweet
• (kayak.com) Flight number -> information about the flight, e.g., availability
• (yourbank.com) Account number -> information about the account
• (amazon.com) Item number -> information about the item
• Search is usually built on top of a key-value store


Isn’t that just a database?

• Yes
• Relational databases (RDBMSs) have been around for ages
• Data stored in tables
• Schema-based, i.e., structured tables
• Queried using SQL

SQL queries: SELECT user_id FROM users WHERE username = 'jbellis'


Cassandra Data Model

• Column families:
  – Like SQL tables
  – But may be unstructured (client-specified)
  – Can have index tables
• Hence “column-oriented databases”/“NoSQL”
  – No schemas
    • Some columns missing from some entries
  – “Not Only SQL”
  – Supports get(key) and put(key, value) operations
  – Often write-heavy workloads


Eventually Consistent

• CAP theorem
  – Consistency
  – Availability
  – Partition tolerance
• Choose two
  – Cassandra chooses A and P


Tunable Consistency

• Give up a little A and P to get more C
• Ratchet up the consistency level
• R + W > N -> strong consistency (R = replicas read, W = replicas written, N = replication factor)
• More to come
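The R + W > N rule is simple arithmetic, so it can be tabulated directly. The sketch below assumes N = 3 and the usual majority definition of a quorum; the function names are hypothetical.

```python
def quorum(n: int) -> int:
    """Smallest majority of n replicas."""
    return n // 2 + 1


def is_strongly_consistent(r: int, w: int, n: int) -> bool:
    """A read overlaps the latest write on at least one replica when R + W > N."""
    return r + w > n


N = 3  # replication factor (assumed for the example)
levels = {"ONE": 1, "QUORUM": quorum(N), "ALL": N}

# (read level, write level) -> whether the combination gives strong consistency
table = {
    (read_name, write_name): is_strongly_consistent(r, w, N)
    for read_name, r in levels.items()
    for write_name, w in levels.items()
}
```

With N = 3, reading and writing at QUORUM gives 2 + 2 > 3, so reads see the latest write, while ONE/ONE gives 1 + 1 = 2 and remains eventually consistent.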


Inserting: Overview

• Simple: put(key, col, value)
• Complex: put(key, [col:value, …, col:value])
• Batch: multiple keys


Inserting: Writes

• Commit log for durability
  – Configurable fsync
  – Sequential writes only
• Memtable: no disk access (no reads or seeks)
• SSTables are final (become read-only)
  – Indexes
  – Bloom filter
  – Raw data
• Bottom line: FAST!
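The write path can be sketched in a few lines. This is a toy model under stated assumptions, not Cassandra's storage engine: a Python list stands in for the append-only commit log file, a dict for the memtable, and immutable sorted tuples for SSTables. The `WritePath` class and its `flush_threshold` parameter are hypothetical.

```python
class WritePath:
    def __init__(self, flush_threshold=2):
        self.commit_log = []  # stands in for an append-only file on disk
        self.memtable = {}    # in-memory table, no disk reads or seeks
        self.sstables = []    # flushed tables: immutable, sorted by key
        self.flush_threshold = flush_threshold

    def put(self, key, value):
        self.commit_log.append((key, value))  # sequential append for durability
        self.memtable[key] = value
        if len(self.memtable) >= self.flush_threshold:
            self.flush()

    def flush(self):
        """Write the memtable out as a sorted, read-only SSTable."""
        self.sstables.append(tuple(sorted(self.memtable.items())))
        self.memtable = {}
        self.commit_log.clear()  # flushed data no longer needs the log

    def get(self, key):
        """Check the memtable first, then SSTables from newest to oldest."""
        if key in self.memtable:
            return self.memtable[key]
        for table in reversed(self.sstables):
            for k, v in table:
                if k == key:
                    return v
        return None


wp = WritePath(flush_threshold=2)
wp.put("b", 2)
wp.put("a", 1)  # second write reaches the threshold and triggers a flush
wp.put("a", 3)  # newer value lives in the memtable and shadows the SSTable
```

Writes never modify an existing SSTable; a newer value simply shadows the older one until the tables are merged.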


Querying: Overview

• You need a key or keys:
  – Single: key='a'
  – Range: key='a' through 'f'
• And columns to retrieve:
  – Slice: cols={bar through kite}
  – By name: key='b' cols={bar, cat, llama}
• Nothing like SQL “WHERE col='faz'”
  – But secondary indices are being worked on
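Because column names are kept sorted, a slice is a range lookup over the names. A minimal sketch, assuming a row is a plain dict of column name to value and that both slice bounds are inclusive; `slice_columns` and `columns_by_name` are hypothetical helpers.

```python
import bisect


def slice_columns(row, start, end):
    """Return (name, value) pairs for column names in [start, end], in sorted order."""
    names = sorted(row)
    lo = bisect.bisect_left(names, start)
    hi = bisect.bisect_right(names, end)
    return [(name, row[name]) for name in names[lo:hi]]


def columns_by_name(row, names):
    """Return the requested columns that exist in this row."""
    return [(name, row[name]) for name in names if name in row]


row = {"ant": 1, "bar": 2, "cat": 3, "kite": 4, "llama": 5}
sliced = slice_columns(row, "bar", "kite")
named = columns_by_name(row, ["bar", "cat", "llama", "zebra"])
```

Here `sliced` contains bar, cat and kite, and `named` skips `zebra` because the row has no such column.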


Querying: Reads

• Practically lock-free
• SSTable proliferation
• New in 0.6:
  – Row cache (avoids SSTable lookup, not write-through)
  – Key cache (avoids index scan)
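With many SSTables on disk, a read benefits from skipping the tables that cannot contain the key, which is the role of the Bloom filter stored with each SSTable. The sketch below is a generic Bloom filter, not Cassandra's implementation; the bit-array size, the number of hashes, and the use of SHA-256 are assumptions for illustration.

```python
import hashlib


class BloomFilter:
    def __init__(self, size_bits=1024, num_hashes=3):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, key):
        # Derive num_hashes bit positions by salting the key with the hash index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, key):
        for position in self._positions(key):
            self.bits |= 1 << position

    def might_contain(self, key):
        """False means the key is definitely absent; True means it may be present."""
        return all((self.bits >> position) & 1 for position in self._positions(key))


bloom = BloomFilter()
stored_keys = ["a", "b", "c"]
for key in stored_keys:
    bloom.add(key)
```

A read would consult `might_contain` for each SSTable and only open the tables that answer True; false positives cost an unnecessary lookup, but there are no false negatives.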


Practical Considerations
• Partitioner: random or order-preserving
  – Range queries
• Provisioning
  – Virtual or bare metal
  – Cluster size
• Data model
  – Think in terms of access
  – Giving up transactions, ad-hoc queries, arbitrary indexes and joins
    • (you may already do this with an RDBMS!)


Practical Considerations

• Wide rows
• Data life-span
• Cluster planning
  – Bootstrapping




Future Direction

• Vector clocks (server-side conflict resolution)
• Alter keyspaces/column families on a live cluster
• Compression
• Multi-tenant features
• Fewer memory restrictions


Wrapping Up

• Use Cassandra if you want/need
  – High write throughput
  – Near-linear scalability
  – Automated replication/fault tolerance
  – Can tolerate missing RDBMS features


Thank You!