ScaleDB Technical Presentation
Thursday April 17, 2008
ScaleDB for MySQL
Database
ScaleDB
What Makes ScaleDB Better?
• ScaleDB Advantages:
  • Performance: New indexing delivers dramatic performance improvement
  • Scalability: Designed for clustering with Plug-and-Cluster™ Architecture
Improving Performance
ScaleDB Indexing
Special-purpose Index Add-ons*: Hash, Bitmap, Aggregate, etc.
ScaleDB: Multi-Table Indexing
B-tree: Only indexes the data in tables
[Diagram: separate indexes, Index #1 to Index #5, over tables #1 to #5]
ScaleDB: Indexes the data and relationships
[Diagram: a single ScaleDB Index spanning tables #1 to #5]
Advantages:
• Faster
• Smaller
• Referential integrity
• More functionality
Describing Our Demo
• Scenario: Select information that is spread across 3 tables: Colleges, Students and Enrollment
• Relationships: Students are enrolled in courses within departments of colleges
• DDL Definitions
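The deck does not reproduce the DDL itself, so the following is only a sketch of what it might look like. The table and column names are taken from the demo query; the column types, primary keys and foreign keys are assumptions, and SQLite (through Python's `sqlite3` module) stands in for MySQL so the sketch can be run anywhere.

```python
import sqlite3

# Hypothetical DDL: names come from the demo query, while the types and
# key definitions are assumptions made for illustration only.
DDL = """
CREATE TABLE College (
    CollNo   INTEGER PRIMARY KEY,
    CollName TEXT NOT NULL
);
CREATE TABLE Student (
    CollNo    INTEGER NOT NULL REFERENCES College (CollNo),
    StudentNo TEXT    NOT NULL,
    StudName  TEXT    NOT NULL,
    PRIMARY KEY (CollNo, StudentNo)
);
CREATE TABLE Course (
    CollNo     INTEGER NOT NULL REFERENCES College (CollNo),
    DeptNo     INTEGER NOT NULL,
    CourseNum  INTEGER NOT NULL,
    CourseName TEXT    NOT NULL,
    PRIMARY KEY (CollNo, DeptNo, CourseNum)
);
CREATE TABLE Enrollment (
    CollNo    INTEGER NOT NULL,
    StudentNo TEXT    NOT NULL,
    DeptNo    INTEGER NOT NULL,
    CourseNum INTEGER NOT NULL,
    Grade     TEXT,
    FOREIGN KEY (CollNo, StudentNo)
        REFERENCES Student (CollNo, StudentNo),
    FOREIGN KEY (CollNo, DeptNo, CourseNum)
        REFERENCES Course (CollNo, DeptNo, CourseNum)
);
"""


def create_schema():
    """Create the assumed schema in an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(DDL)
    return conn


def table_names(conn):
    """Return the names of all tables, sorted alphabetically."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [row[0] for row in rows]
```

With foreign keys enabled, inserting a student for a college that does not exist is rejected, which is the referential integrity check the later slides refer to.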
The Query
SELECT c1.CollName, s.StudName, c2.CourseName, e.Grade
FROM College AS c1
  STRAIGHT_JOIN Student AS s
  STRAIGHT_JOIN Enrollment AS e
  STRAIGHT_JOIN Course AS c2
    ON ( c1.CollNo = s.CollNo AND
         s.CollNo = e.CollNo AND
         s.StudentNo = e.StudentNo AND
         e.CollNo = c2.CollNo AND
         e.DeptNo = c2.DeptNo AND
         e.CourseNum = c2.CourseNum )
WHERE c1.CollNo = X
  AND s.StudentNo = Y;
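To try a query of this shape without a MySQL server, it can be run against SQLite. This is a sketch under several assumptions: SQLite has no STRAIGHT_JOIN, so plain JOINs are used and the join order is left to SQLite; the schema is a guess based on the column names in the query; the course number and the course name 'Anatomy' are invented sample values. The placeholders X and Y become bound parameters.

```python
import sqlite3


def build_demo_db():
    """Build a tiny in-memory database with an assumed schema and sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE College (CollNo INTEGER PRIMARY KEY, CollName TEXT);
        CREATE TABLE Student (CollNo INTEGER, StudentNo TEXT, StudName TEXT,
                              PRIMARY KEY (CollNo, StudentNo));
        CREATE TABLE Course (CollNo INTEGER, DeptNo INTEGER, CourseNum INTEGER,
                             CourseName TEXT,
                             PRIMARY KEY (CollNo, DeptNo, CourseNum));
        CREATE TABLE Enrollment (CollNo INTEGER, StudentNo TEXT, DeptNo INTEGER,
                                 CourseNum INTEGER, Grade TEXT);
        INSERT INTO College VALUES (8, 'Medicine');
        INSERT INTO Student VALUES (8, '56-8037', 'Saul Goode');
        INSERT INTO Student VALUES (8, '56-8033', 'Mike Hogan');
        -- CourseNum 101 and the name 'Anatomy' are hypothetical values.
        INSERT INTO Course VALUES (8, 4455, 101, 'Anatomy');
        INSERT INTO Enrollment VALUES (8, '56-8037', 4455, 101, 'B+');
        INSERT INTO Enrollment VALUES (8, '56-8033', 4455, 101, 'C');
    """)
    return conn


# Same join conditions as the slide, written with standard JOIN ... ON.
QUERY = """
SELECT c1.CollName, s.StudName, c2.CourseName, e.Grade
FROM College AS c1
JOIN Student AS s
  ON c1.CollNo = s.CollNo
JOIN Enrollment AS e
  ON s.CollNo = e.CollNo AND s.StudentNo = e.StudentNo
JOIN Course AS c2
  ON e.CollNo = c2.CollNo AND e.DeptNo = c2.DeptNo
 AND e.CourseNum = c2.CourseNum
WHERE c1.CollNo = ? AND s.StudentNo = ?
"""


def run_query(conn, coll_no, student_no):
    """Run the four-table join for one college and one student."""
    return conn.execute(QUERY, (coll_no, student_no)).fetchall()
```

Calling `run_query(build_demo_db(), 8, '56-8037')` returns the single enrollment row for that student in the sample data.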
A Sample Scenario
• Scenario: I need information that is spread across 3 tables: Colleges, Students and Enrollment
• Options:
• #1: Conventional Joins
• #2: Materialized View
• #3: ScaleDB
[Diagram: three tables side by side, labeled Colleges, Students and Enrollment]
Colleges
Coll_ID#  Coll_Name   Coll_Budget  Coll_Description
0001      Amherst     $1,234,567   Nice place to visit
0002      Berkeley    $5,432,567   Sports not so good
0003      Harvard     $9,999,666   Cool logo
0004      Holy Cross  $3,234,567   Ugh Worcester
0005      MIT         $8,238,568   Serious work
0006      Cornell     $7,237,767   Jumpy students
0007      Stanford    $9,898,777   Pretty campus
0008      TCU         $5,987,004   In Texas
Column headers of the other two tables: Dept_ID#, Dept_Name, Coll_ID#, Dept_Budget and Course_ID#, Course_Name, Coll_ID#, Dept_ID#
The Query
SELECT c1.CollName, s.StudName, c2.CourseName, e.Grade
FROM College AS c1
  STRAIGHT_JOIN Student AS s
  STRAIGHT_JOIN Enrollment AS e
  STRAIGHT_JOIN Course AS c2
    ON ( c1.CollNo = s.CollNo AND
         s.CollNo = e.CollNo AND
         s.StudentNo = e.StudentNo AND
         e.CollNo = c2.CollNo AND
         e.DeptNo = c2.DeptNo AND
         e.CourseNum = c2.CourseNum )
WHERE c1.CollNo = X
  AND s.StudentNo = Y;
+-------+--------+----------------+---------+------+----------+
| table | type   | key            | key_len | rows | filtered |
+-------+--------+----------------+---------+------+----------+
| c1    | const  | PRIMARY        | 4       |    1 |   100.00 |
| s     | const  | PRIMARY        | 14      |    1 |   100.00 |
| e     | ref    | idx_EnrollStud | 4       |    3 |   100.00 |
| c2    | eq_ref | PRIMARY        | 17      |    1 |   100.00 |
+-------+--------+----------------+---------+------+----------+
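One common way to read a plan like this is to multiply the per-table `rows` estimates, which gives a rough count of the row combinations MySQL expects to examine. The sketch below only does that arithmetic on the values transcribed from the plan above; the function names are illustrative and not part of any MySQL API.

```python
from math import prod

# (table, access type, key, estimated rows), transcribed from the plan above.
EXPLAIN_ROWS = [
    ("c1", "const", "PRIMARY", 1),
    ("s", "const", "PRIMARY", 1),
    ("e", "ref", "idx_EnrollStud", 3),
    ("c2", "eq_ref", "PRIMARY", 1),
]


def estimated_row_combinations(plan):
    """Multiply the per-table row estimates of a step-wise join plan."""
    return prod(rows for _table, _access, _key, rows in plan)


def lookup_steps(plan):
    """Describe each step-wise lookup, one per joined table."""
    return [f"{table}: {access} via {key}" for table, access, key, _rows in plan]
```

Here the product is 1 × 1 × 3 × 1 = 3 row combinations, reached through four separate index lookups, one per table.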
Option #1: Conventional Joins
Query Result:
008 Medicine $5,987,004 In Texas | 56-8037 Saul Goode African American | 4455 B+ |
Option #2: Materialized View
Materialized View Indexes
Materialized View
Copies (and synchronizes) the data from individual tables into one massive view
Coll_ID#  Coll_Name    Coll_Budget  Coll_Description     Student_ID#  Student_Name  Student_Desc      Dept_ID#  Grade
001       Agriculture  $1,234,567   Nice place to visit  56-8033      Mike Hogan    Caucasian         3345      A
001       Agriculture  $1,234,567   Nice place to visit  56-8033      Mike Hogan    Caucasian         3235      B+
001       Agriculture  $1,234,567   Nice place to visit  56-8033      Mike Hogan    Caucasian         3245      A-
001       Agriculture  $1,234,567   Nice place to visit  56-8033      Mike Hogan    Caucasian         3245      B
001       Agriculture  $1,234,567   Nice place to visit  56-8033      Mike Hogan    Caucasian         3235      A+
001       Agriculture  $1,234,567   Nice place to visit  56-8034      Paul Martyn   Caucasian         3239      A-
…………
008       Medicine     $5,987,004   In Texas             56-8037      Saul Goode    African American  4455      B+
[Diagram: the view is drawn over the Colleges, Students and Enrollment source tables]
Query Result:
008 Medicine $5,987,004 In Texas | 56-8037 Saul Goode African American | 4455 B+ |
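The copy-and-synchronize cost of this option can be sketched as follows. SQLite has no materialized views, so an ordinary table filled from the join stands in for one; the table name `EnrollmentView`, the `refresh_view` helper and the simplified three-table schema are assumptions made for illustration.

```python
import sqlite3

# The join whose result is copied into the stand-in "materialized view".
JOINED = """
SELECT c.CollName, s.StudName, e.DeptNo, e.Grade
FROM College AS c
JOIN Student AS s ON s.CollNo = c.CollNo
JOIN Enrollment AS e ON e.CollNo = s.CollNo AND e.StudentNo = s.StudentNo
"""


def build_db():
    """Create the three source tables with one row each."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE College (CollNo INTEGER PRIMARY KEY, CollName TEXT);
        CREATE TABLE Student (CollNo INTEGER, StudentNo TEXT, StudName TEXT);
        CREATE TABLE Enrollment (CollNo INTEGER, StudentNo TEXT,
                                 DeptNo INTEGER, Grade TEXT);
        INSERT INTO College VALUES (8, 'Medicine');
        INSERT INTO Student VALUES (8, '56-8037', 'Saul Goode');
        INSERT INTO Enrollment VALUES (8, '56-8037', 4455, 'B+');
    """)
    return conn


def refresh_view(conn):
    """Rebuild the copied table from the source tables (the 'synchronize' step)."""
    conn.executescript(
        "DROP TABLE IF EXISTS EnrollmentView;"
        "CREATE TABLE EnrollmentView AS " + JOINED
    )


def view_grades(conn):
    """Read grades from the copy, with no join at query time."""
    return [row[0] for row in conn.execute("SELECT Grade FROM EnrollmentView")]


def base_grades(conn):
    """Read grades from the source table."""
    return [row[0] for row in conn.execute("SELECT Grade FROM Enrollment")]
```

Reads from the copy avoid the join, but after a change to `Enrollment` the copy keeps returning the old grade until `refresh_view` runs again, and the copy stores the college and student columns once per enrollment row.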
Option #3: ScaleDB
ScaleDB’s multi-table index is relationship-aware
[Diagram: ScaleDB Index tree: College -> Departments -> Courses; College -> Students -> Enrollment]
A Single Index Lookup
Colleges
Coll_ID#  Coll_Name     Coll_Budget  Coll_Description
001       Agriculture   $1,234,567   Nice place to visit
002       Arts          $5,432,567   Sports not so good
003       Business      $9,999,666   Cool logo
004       Education     $3,234,567   Ugh Worcester
005       Engineering   $8,238,568   Serious work
006       Law           $7,237,767   Jumpy students
007       Liberal Arts  $9,898,777   Pretty campus
008       Medicine      $5,987,004   In Texas
Students
Student_ID#  College_ID#  Student_Name   Student_Desc
56-8033      008          Mike Hogan     Caucasian
56-8045      008          Moshe Smith    Caucasian
56-8044      008          Sally Shadmon  Native American
56-8055      008          Billy Fleegle  African American
56-8037      008          Saul Goode     African American
56-8122      008          Tim Collins    Polynesian
56-8233      008          Sam Gee        Asian
56-8334      008          Rod Paulino    Asian
Enrollment
College_ID#  Dept_ID#  Student_ID#  Grade
008          4455      56-8037      B+
008          4455      56-8033      C
008          4455      56-8045      B+
008          4456      56-8044      A-
008          4456      56-8122      B-
008          4454      56-8233      C
008          4455      56-8334      F
008          4454      56-8055      D
Query Result:
008 Medicine $5,987,004 In Texas | 56-8037 Saul Goode African American | 4455 B+ |
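The deck does not describe how the multi-table index is laid out internally, so the following is only a toy illustration of the idea of a relationship-aware index, not ScaleDB's implementation. The class `MultiTableIndex` and its methods are hypothetical. It keeps each row once, links child rows to their parents by key, and answers the demo query from one structure.

```python
class MultiTableIndex:
    """Toy index that records rows and the parent/child links between them."""

    def __init__(self):
        self.colleges = {}     # coll_id -> college name
        self.students = {}     # (coll_id, student_id) -> student name
        self.enrollments = {}  # (coll_id, student_id) -> [(dept_id, grade)]

    def add_college(self, coll_id, name):
        self.colleges[coll_id] = name

    def add_student(self, coll_id, student_id, name):
        # Referential integrity: a student must hang off an existing college.
        if coll_id not in self.colleges:
            raise KeyError(f"unknown college {coll_id}")
        self.students[(coll_id, student_id)] = name
        self.enrollments[(coll_id, student_id)] = []

    def add_enrollment(self, coll_id, student_id, dept_id, grade):
        # Referential integrity: an enrollment must hang off an existing student.
        if (coll_id, student_id) not in self.students:
            raise KeyError(f"unknown student {student_id}")
        self.enrollments[(coll_id, student_id)].append((dept_id, grade))

    def lookup(self, coll_id, student_id):
        """Assemble college, student and enrollment data from one structure."""
        key = (coll_id, student_id)
        college = self.colleges[coll_id]
        student = self.students[key]
        return [(college, student, dept_id, grade)
                for dept_id, grade in self.enrollments[key]]


def demo_index():
    """Load a few rows taken from the tables on this slide."""
    index = MultiTableIndex()
    index.add_college("008", "Medicine")
    index.add_student("008", "56-8037", "Saul Goode")
    index.add_student("008", "56-8033", "Mike Hogan")
    index.add_enrollment("008", "56-8037", "4455", "B+")
    index.add_enrollment("008", "56-8033", "4455", "C")
    return index
```

`demo_index().lookup("008", "56-8037")` yields the Medicine / Saul Goode / 4455 / B+ combination shown in the query result, without copying the college or student columns per enrollment row.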
Building Relationships in ScaleDB
Create College
Create Department
- foreign key: College
Create Course
- foreign key: Department
Create Students
- foreign key: College
Create Enrollment
- foreign key: Students
Relationship creation is automated
[Diagram: College -> Departments -> Courses; College -> Students -> Enrollment]
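The foreign keys listed above are enough to derive the relationship tree mechanically. The sketch below shows one way that derivation could work; it is an illustration of the idea, not ScaleDB's code, and the names `FOREIGN_KEYS`, `build_children` and `path_to_root` are hypothetical.

```python
# Each table mapped to the parent named by its foreign key (None for the root),
# in the creation order given on the slide.
FOREIGN_KEYS = {
    "College": None,
    "Department": "College",
    "Course": "Department",
    "Students": "College",
    "Enrollment": "Students",
}


def build_children(foreign_keys):
    """Invert the foreign-key map into a parent -> children tree."""
    children = {table: [] for table in foreign_keys}
    for table, parent in foreign_keys.items():
        if parent is not None:
            children[parent].append(table)
    return children


def path_to_root(table, foreign_keys):
    """Follow foreign keys upward from a table to the root of the tree."""
    path = [table]
    while foreign_keys[path[-1]] is not None:
        path.append(foreign_keys[path[-1]])
    return path
```

For example, `path_to_root("Enrollment", FOREIGN_KEYS)` walks Enrollment, Students, College, which is the chain the demo query traverses.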
Pros & Cons of Each Method
Performance Variables
• Early performance benchmarks
• Used a vanilla scenario
• Our performance advantage increases with:
• Query/Schema Complexity
• Referential Integrity Checks
• Key Size
• Data Size/Number of Keys
• Performance Advantage: 2X – 20X+
MySQL Integration
• ScaleDB leverages its index to assemble data across tables without step-wise joins
• MySQL query optimizer sees multiple tables, so it calls for step-wise joins
• ScaleDB tells the query optimizer about joined tables; they are virtualized (built on the fly)
• When MySQL’s query optimizer recognizes ScaleDB, phantom tables
Improving Scalability
The Challenges of Scaling
• How do I partition data?
• Predict usage patterns, application evolution, data growth patterns…all are moving targets
• Avoid data skew: bottlenecks caused by frequently accessed data on just a few nodes
• Data shipping between nodes (2-phase commit)
• Searches outside the partition column require participation by all nodes
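The last point can be illustrated with a small sketch of hash partitioning. The cluster size, the choice of `StudentNo` as the partition column, and all function names are assumptions made for illustration; `zlib.crc32` is used only because it gives the same result on every run.

```python
import zlib

NUM_NODES = 4  # hypothetical cluster size


def node_for(student_no, num_nodes=NUM_NODES):
    """Map a partition-column value to a node with a deterministic hash."""
    return zlib.crc32(student_no.encode("utf-8")) % num_nodes


def partition(rows, num_nodes=NUM_NODES):
    """Distribute rows across nodes by hashing the StudentNo column."""
    nodes = [[] for _ in range(num_nodes)]
    for row in rows:
        nodes[node_for(row["StudentNo"], num_nodes)].append(row)
    return nodes


def find_by_student_no(nodes, student_no):
    """Search on the partition column: exactly one node is consulted."""
    target = node_for(student_no, len(nodes))
    matches = [r for r in nodes[target] if r["StudentNo"] == student_no]
    return matches, 1


def find_by_name(nodes, name):
    """Search on another column: every node has to take part."""
    matches = []
    for node_rows in nodes:
        matches.extend(r for r in node_rows if r["StudName"] == name)
    return matches, len(nodes)


ROWS = [
    {"StudentNo": "56-8033", "StudName": "Mike Hogan"},
    {"StudentNo": "56-8037", "StudName": "Saul Goode"},
    {"StudentNo": "56-8045", "StudName": "Moshe Smith"},
    {"StudentNo": "56-8122", "StudName": "Tim Collins"},
]
```

Both searches return the same row for Saul Goode, but the lookup by name reports `NUM_NODES` nodes consulted while the lookup by `StudentNo` reports one.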
ScaleDB’s Plug-and-Cluster™
• Cluster-ready solution, just plug in a server
• No need to partition the data
• Based on shared-everything architecture
• Found in the highest-end commercial databases
• Eliminates all of the data partitioning problems
ScaleDB Cluster
[Diagram: two Local Lock Managers connected to Shared Storage]
ScaleDB Cluster
[Diagram: Shared Storage]
Demo
In a nutshell…
MySQL + ScaleDB
MySQL
ScaleDB
Summary
• Revolutionary indexing solution delivers a quantum leap in performance & scalability
• Results:
  • Performance improvements of 2X and up
  • 7X smaller index size (average)
  • Stop jumping through hoops to avoid joins…FREE JOINS!
• Enables more complex applications, fresh data, lower TCO, superior scalability & performance
• We’re looking for appropriate beta testers