
Scaling Hibernate

Emmanuel Bernard - Max Ross

Emmanuel Bernard

Hibernate Search in Action

blog.emmanuelbernard.com

twitter.com/emmanuelbernard

Max Ross

Google App Engine

Hibernate Shards

What is scalability?

Users

Resource

Data

Uptime

How does Hibernate stand?

Limitations?

SQL optimizations

2nd level cache

Conversation

[Diagram: three Nodes, each with several Sessions and a 2nd level cache, and one DB]

Changes in mass

Bulk insert / update / delete

Stateless session


To Googolzillions and beyond

Googolzillion things? Who are you?

Social network

SaaS


Problem

Same data model

Too much load

Too much data

Too many lawyers


Separating customer data

Logical separation

All customers share tables

Manual or Hibernate Filter

[Diagram: one Application, one SessionFactory, one DB with one Schema]

One user per schema

One SessionFactory per schema

Rewrite SQL

[Diagram: one Application, one SessionFactory, one DB with two Schemas]

[Diagram: one Application, two SessionFactory instances, one DB with two Schemas]

Use database security

Map JAAS credentials to DB credentials

One connection (pool) per user

Oracle security

Oracle VPD

Application defines active user

Storing in multiple databases

SessionFactory == DB

Same schema across DBs

Expensive in RAM

Data isolated

Sharing state across SessionFactory instances is probably doable

How many customers per DB?

One

One per schema

Several per schema

Dispatch customer to the right SessionFactory
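
The dispatch step can be sketched in plain Java. This is only an illustration, not Hibernate API: `CustomerFactoryRegistry` and `factoryFor` are hypothetical names, and the type parameter `F` stands in for `org.hibernate.SessionFactory` so the sketch compiles without Hibernate on the classpath. Building each factory lazily on first use is an assumption made here, not something the slides prescribe.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/**
 * Hypothetical registry that hands out one factory per customer.
 * F stands in for org.hibernate.SessionFactory (assumption for this sketch).
 */
final class CustomerFactoryRegistry<F> {
    private final Map<String, F> factories = new ConcurrentHashMap<>();
    private final Function<String, F> builder;

    /** builder creates the factory for a customer id, e.g. from per-customer DB settings. */
    CustomerFactoryRegistry(Function<String, F> builder) {
        this.builder = builder;
    }

    /** Returns the factory for this customer, building it on first use. */
    F factoryFor(String customerId) {
        return factories.computeIfAbsent(customerId, builder);
    }

    /** Number of factories built so far. */
    int size() {
        return factories.size();
    }
}
```

A caller would resolve the customer from the request (for example from the authenticated user) and open its session from `factoryFor(customerId)`.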

Adjusting the application layer

Homogeneous nodes

[Diagram: three Applications, six SessionFactory instances, six connection pools, two DBs]

Memory

Too many connections

Slow to start

Specialized nodes

[Diagram: dispatch per user across four Applications, seven SessionFactory instances, seven connection pools, five DBs]

Load balancing rules

Easy scalability

Efficient resource-wise

What if you need to query all your data?

Hibernate Shards

Simplified Horizontal Partitioning

Separates app logic from federation logic

Standard Hibernate API

Unified view of your data

Shard Strategy

Federation logic is application specific

Selection

Resolution

Access

[Diagram: a Model Object with question marks pointing at Shard 1, Shard 2 and Shard 3]

Shard Selection

On which shard do we create the record?

Round robin

Capacity based

Attribute based

Performance based
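
As an illustration of the simplest option above, here is a minimal round-robin selector in plain Java. It does not implement the Hibernate Shards interfaces; `RoundRobinShardSelector` and `selectShardForNewObject` are hypothetical names, and shards are assumed to be identified by integers 0 to shardCount - 1.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical round-robin selector: new records rotate over shard ids 0..shardCount-1. */
final class RoundRobinShardSelector {
    private final int shardCount;
    private final AtomicInteger next = new AtomicInteger();

    RoundRobinShardSelector(int shardCount) {
        if (shardCount <= 0) {
            throw new IllegalArgumentException("shardCount must be positive");
        }
        this.shardCount = shardCount;
    }

    /** Picks the shard for a new record; this strategy ignores the object itself. */
    int selectShardForNewObject(Object newObject) {
        // floorMod keeps the result non-negative even after the counter overflows
        return Math.floorMod(next.getAndIncrement(), shardCount);
    }
}
```

An attribute-based strategy would inspect `newObject` instead (for example a customer or region field) and map that value to a shard.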

Shard Resolution

On which shard do we find the record?

Exhaustive search

Map ID ranges to shards

Distributed cache
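
Mapping ID ranges to shards can be sketched with a sorted map. This is a plain-Java illustration under stated assumptions, not the Hibernate Shards API: `IdRangeShardResolver`, `addRange` and `resolveShard` are hypothetical names, IDs are assumed to be `long` values, and each range is assumed to run from its first ID up to the start of the next range.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical resolver that maps contiguous id ranges to shard ids. */
final class IdRangeShardResolver {
    // key = first id of a range, value = shard holding ids from that key up to the next key
    private final TreeMap<Long, Integer> rangeStartToShard = new TreeMap<>();

    /** Declares that ids starting at firstId live on shardId, until the next declared range. */
    void addRange(long firstId, int shardId) {
        rangeStartToShard.put(firstId, shardId);
    }

    /** Finds the shard whose range contains id, without touching any database. */
    int resolveShard(long id) {
        Map.Entry<Long, Integer> range = rangeStartToShard.floorEntry(id);
        if (range == null) {
            throw new IllegalArgumentException("no shard covers id " + id);
        }
        return range.getValue();
    }
}
```

Compared with an exhaustive search, this resolves a record with one in-memory lookup, at the price of tying ID generation to the shard layout.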

Shard Access

How do we apply operations across shards?

Serially

In parallel (bring your own thread pool)

Hybrid
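
The serial and parallel options can be sketched as follows. This is a plain-Java illustration, not the Hibernate Shards API: `ShardAccess`, `serially` and `inParallel` are hypothetical names, `S` stands in for whatever represents one shard (for example a session), and the caller supplies the thread pool, as the slide suggests.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.function.Function;

/** Hypothetical helpers that run one operation on every shard and merge the results. */
final class ShardAccess {
    private ShardAccess() {}

    /** Visits shards one after another; simple, but latency adds up. */
    static <S, R> List<R> serially(List<S> shards, Function<S, List<R>> operation) {
        List<R> merged = new ArrayList<>();
        for (S shard : shards) {
            merged.addAll(operation.apply(shard));
        }
        return merged;
    }

    /** Visits all shards concurrently on a caller-supplied pool. */
    static <S, R> List<R> inParallel(List<S> shards, Function<S, List<R>> operation,
                                     ExecutorService pool) {
        List<Callable<List<R>>> tasks = new ArrayList<>();
        for (S shard : shards) {
            tasks.add(() -> operation.apply(shard));
        }
        List<R> merged = new ArrayList<>();
        try {
            // invokeAll returns futures in task order, so results stay in shard order
            for (Future<List<R>> future : pool.invokeAll(tasks)) {
                merged.addAll(future.get());
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while querying shards", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a shard operation failed", e.getCause());
        }
        return merged;
    }
}
```

Note that merging here is plain concatenation; sorted or aggregated queries would need an extra merge step across shards.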

Writing the app is the easy part

Operational challenges/risks are amplified

Virtual shards can help

Virtual Shards

[Diagram: Application, Sharded Session Factory, Virtual Shards 1 to 3, Physical Shard 1]

[Diagram: the same Application, Sharded Session Factory and Virtual Shards 1 to 3, now with Physical Shards 1 and 2]
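
The indirection behind virtual shards can be sketched as a lookup table. This is a plain-Java illustration with hypothetical names (`VirtualShardMap`, `assign`, `physicalShardFor`), not the Hibernate Shards configuration API. The assumption is that records are addressed by a virtual shard id that never changes, while the table decides which physical shard currently hosts it.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical table mapping stable virtual shard ids to current physical shards. */
final class VirtualShardMap {
    private final Map<Integer, Integer> virtualToPhysical = new HashMap<>();

    /** Places (or moves) a virtual shard onto a physical shard. */
    void assign(int virtualShardId, int physicalShardId) {
        virtualToPhysical.put(virtualShardId, physicalShardId);
    }

    /** Looks up where a virtual shard currently lives. */
    int physicalShardFor(int virtualShardId) {
        Integer physical = virtualToPhysical.get(virtualShardId);
        if (physical == null) {
            throw new IllegalArgumentException("unknown virtual shard " + virtualShardId);
        }
        return physical;
    }
}
```

Starting with three virtual shards on one physical shard and later calling `assign(3, 2)` mirrors the two diagrams above: the application keeps addressing virtual shard 3, and only the table (plus a data move) changes.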

Coming Soon

Static Data

Full-fledged ShardedQuery

JPA

Hibernate Search

Full-text search your domain objects

Hibernate + Lucene

Same programmatic model

Index synchronized

Human queries

Data set

Word centric

Typos / Synonyms

Relevance

SQL underperforms

Wildcard Table/Index full scan

Multiple joins

Relevance?

Customer DBA


Full-text search

Move load away from the DB

Replace or complement searches

Scalability: symmetric cluster

Distributed lock

Immediate visibility

Affects front end

[Diagram: two Hibernate + Hibernate Search nodes, each sending search requests and index updates to one shared Lucene Directory (Index), alongside the Database]

Scalability: asymmetric cluster

Search local / change sent to master

Asynchronous indexing (delay)

No front end extra cost / good scalability
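
The "change sent to master" flow can be sketched with an in-memory queue. This is a plain-Java illustration, not Hibernate Search configuration: `IndexChangeSender` and `MasterIndexer` are hypothetical names, a `BlockingQueue` stands in for the JMS queue, and a list of strings stands in for the Lucene index.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;

/** Slave side (hypothetical): never writes the index, only enqueues the change. */
final class IndexChangeSender {
    private final BlockingQueue<String> queue;

    IndexChangeSender(BlockingQueue<String> queue) {
        this.queue = queue;
    }

    /** Returns as soon as the change is queued, so the front end pays no indexing cost. */
    void send(String change) {
        queue.add(change);
    }
}

/** Master side (hypothetical): the only writer, applies queued changes in batches. */
final class MasterIndexer {
    private final BlockingQueue<String> queue;
    private final List<String> index = new ArrayList<>(); // stand-in for the Lucene index

    MasterIndexer(BlockingQueue<String> queue) {
        this.queue = queue;
    }

    /** Drains whatever is pending and applies it; returns the number of changes applied. */
    int applyPending() {
        List<String> batch = new ArrayList<>();
        queue.drainTo(batch);
        index.addAll(batch);
        return batch.size();
    }

    /** Snapshot of the changes applied so far. */
    List<String> indexedChanges() {
        return new ArrayList<>(index);
    }
}
```

Until the master runs `applyPending()` and the index is copied back, searches on the slaves do not see the change, which is the delay mentioned above.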

[Diagram: a Slave (Hibernate + Hibernate Search) answers search requests from its copy of the Lucene Directory (Index) and sends index update orders to a JMS queue; the Master (Hibernate + Hibernate Search) processes the queue, updates the master Lucene Directory (Index), and the index is copied back to the slaves; Database shown alongside]

Scalabilities (sic)

Hibernate a good citizen

Isolating customer data

Deal with multiple databases

Hibernate Shards

Hibernate Search

Q&A

For more information

Hibernate Search in Action

Java Persistence with Hibernate

Max’s podcasts

hibernate.org