
MicroStrategy PRIME

High Performance In-memory Analytics

Speaker Introduction
Bala Chandran Dir. Enterprise BI, MicroStrategy

15 years of experience implementing and designing Big Data and Analytics solutions

Hands-on experience with many MPP and in-memory systems

Ask questions: @BG_Chandran #MSTRPrime

High Performance Is No Longer A Nice To Have In Analytical Applications

Drivers Of High Performance

Users expect Google-like performance from analytic applications, especially on mobile devices

Exploding data volumes & variety require in-memory consolidation and aggregation

Modern analytical applications contain 100s of visualizations, distributed to 1000s of users daily

Users Expect Sub 3 Second Response From Applications

Strangeloopnetworks.com

Performance Directly Correlates to Revenue

Torbit.com

Google found that a 500ms slowdown equals a 20% decrease in ad revenue.

Amazon finds a 100ms slowdown can mean a 1% decrease in revenue.

Yahoo! found that a 400ms improvement translated to a 9% increase in traffic.

Mozilla mapped a 2.2s improvement to 60 million additional Firefox downloads.

http://blog.edgecast.com/post/42404930702/ecommerce-performance-website-speedimpacts-your

INTRODUCING MicroStrategy PRIME

PARALLEL: Linear scalability to 1,000s of CPUs

RELATIONAL: Flexible schema & partitioned data

IN-MEMORY: 3x to 10x faster, 7x to 20x more users

ENGINE: Tightly-coupled interactive exploration

MicroStrategy PRIME In Action At Facebook


"We have this thing that's running. It's one of the most amazing things I've seen. It's running against the entire Facebook user base, 1.1 billion users."

Guy Bayes, Head of Enterprise BI, Facebook

200+ petabytes of Hadoop source data

30+ terabytes analyzed in PRIME

200+ node cluster

3500+ cores

175 billion rows

Traditional Technologies Cannot Deliver Performance At High Scale

Custom Approaches Are Expensive And Risky

[Chart axes: User Scale vs. Data Scale. Technologies plotted: In-memory DBs, MPP Databases, Hadoop]

High Scale Information Driven Apps

Custom Development: Java + Transactional DB clusters + Web 2.0 + In-memory + BI Tools + ...

Expensive, Complex, Risky, Slow

MicroStrategy PRIME: Purpose-Built For Performance @ Scale

MicroStrategy PRIME: first out-of-the-box solution in the market

[Chart axes: User Scale vs. Data Scale. Technologies plotted: MicroStrategy PRIME, In-memory DBs, MPP Databases, Hadoop]


MicroStrategy PRIME: Interactive Big Data Exploration

Example Applications

CRM analysis across a large customer base

Interactive analysis: large clickstream data

Merchant analytics for a credit card issuer

Store manager application for a large chain

Application Characteristics

Large data volumes

Sub-3-second response time

Highly dimensional data

Complex dashboards with multiple visualizations

Highly interactive app with users filtering and slicing across many dimensions

Web & mobile deployments

Large user populations

MicroStrategy PRIME: 7x more users and 3x faster than the next best in-memory technology

3x faster

7x more users

Complex analytical dashboard

High user interactivity

200 GB data set with 50+ dimensions

Equivalent hardware configurations, 30 nodes

MicroStrategy PRIME is like In-Memory on steroids

               OLAP Services    PRIME
Architecture   SMP              MPP
Data Size      100 GB limit     No theoretical limit (tested to 4.6 TB)
Data Rows      2B limit         No theoretical limit (tested to 200B)
Load Rate      8 GB/hr          No theoretical limit (tested to 7 TB/hr)

MicroStrategy PRIME: World's First Technology to Combine 3 Key Breakthroughs

In-Memory Data Store

Massively Parallel Processing on Commodity HW

Look-Ahead Analytics: Integrated Data & Visualization Layers

Interactive Exploration of Terabyte Datasets by 100,000s of Users

The Evolution Of Storage


1. In-Memory Data Store: How Much Faster Is It?

Traditional disk speed is a banana slug with a top speed of 0.007 mph

In-memory is an F-18 Hornet with a max speed of 1,190 mph

RAM Prices Have Fallen Drastically


2. Massively Parallel Processing On Commodity Hardware

[Diagram: Traditional BI (query engines, shared memory, parallel execution bottleneck) vs. PRIME Parallel Execution (query engines, separate memory per engine, distributed data)]

Distribute data across 1000s of nodes

Parallel query execution and loading

Inexpensive commodity hardware
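
As a rough illustration of distributing data across nodes, the sketch below hash-partitions rows by a key so that each node holds a disjoint slice of the data. The `node_for` and `partition_rows` helpers, the node count, and the sample rows are all hypothetical; the slides do not describe how PRIME actually assigns data to nodes.

```python
import zlib
from collections import defaultdict


def node_for(key: str, num_nodes: int) -> int:
    """Map a partition key to a node index using a stable hash (CRC32)."""
    return zlib.crc32(key.encode("utf-8")) % num_nodes


def partition_rows(rows, key_field, num_nodes):
    """Group rows by the node that owns their partition key."""
    partitions = defaultdict(list)
    for row in rows:
        partitions[node_for(row[key_field], num_nodes)].append(row)
    return dict(partitions)


# Hypothetical sample data: every row for one customer lands on one node.
rows = [
    {"customer": "alice", "amount": 30},
    {"customer": "bob", "amount": 12},
    {"customer": "alice", "amount": 5},
    {"customer": "carol", "amount": 44},
]
partitions = partition_rows(rows, "customer", num_nodes=3)
```

Because the hash is stable, loading can proceed in parallel: any loader can compute the owning node for a row without coordinating with the others.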



2. Parallel Processing: Scaling The Solution

PRIME Parallel Efficiency

http://blog.delloem.com/2010/12/talking-hpc-with-sagiv-tech/image001/


Parallel Processing: Breaking The Problem Down

Vertical Scaling (Scale-up): Generally refers to adding more processors and RAM, or buying a more robust server.

Pros

Less power consumption / cooling

Less network hardware than scaling horizontally

Cons

More expensive

Greater risk of hardware failure

Limited upgradeability

Horizontal Scaling (Scale-out): Generally refers to adding more servers, each with fewer processors and less RAM.

Pros

More cost effective than scaling vertically

Easier to run fault-tolerance

Easy to upgrade

Cons

Bigger footprint in the data center

Higher utility cost (electricity and cooling)

Possible need for more networking equipment (switches/routers)

Data Movement: The Performance Killer

50% YoY growth

http://www.edn.com/design/communications-networking/4313434/The-evolution-tonetwork-flow-processing

90+% YoY growth

Oracle, 2012

Minimizing Data Movement: Bringing Query To The Data

[Diagram: PRIME Parallel Execution (query engines, parallel execution, separate memory per engine, distributed data)]

Query is partitioned and executed on the core where the data lives

Only summary information is sent across the network
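
The idea of sending only summaries across the network can be sketched as a two-phase aggregation: each partition reduces its own rows to a small summary, and only those summaries are combined centrally. This is a minimal single-machine simulation, not PRIME's implementation; the `local_summary` and `distributed_average` names, the thread pool standing in for nodes, and the sample data are assumptions.

```python
from concurrent.futures import ThreadPoolExecutor


def local_summary(partition):
    """Runs where the partition lives: reduce raw rows to (sum, count)."""
    total = sum(row["amount"] for row in partition)
    return total, len(partition)


def distributed_average(partitions):
    """Combine per-partition summaries; raw rows never leave their partition."""
    workers = max(1, len(partitions))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        summaries = list(pool.map(local_summary, partitions))
    total = sum(s for s, _ in summaries)
    count = sum(c for _, c in summaries)
    return total / count if count else 0.0


# Hypothetical data: three partitions, each standing in for one node.
partitions = [
    [{"amount": 10}, {"amount": 20}],
    [{"amount": 30}],
    [{"amount": 40}, {"amount": 50}, {"amount": 60}],
]
average = distributed_average(partitions)
```

Each partition contributes two numbers regardless of how many rows it holds, so the traffic between workers and the coordinator stays constant as the data grows.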


Commodity Hardware vs. Specialized Appliances

Example PRIME configuration

100 clusters of 2 worker nodes; 1 cluster of 20 master nodes

Each node: 16 cores, 144 GB RAM

Total: 1920 cores, ~17 TB RAM
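
The stated totals can be checked with quick arithmetic. In the sketch below the node count is derived from the totals (1920 cores at 16 cores per node), not from the cluster layout, which the slide does not break down further; the variable names are illustrative only.

```python
CORES_PER_NODE = 16      # from the slide
RAM_GB_PER_NODE = 144    # from the slide
TOTAL_CORES = 1920       # from the slide

# Node count implied by the stated core total (an inference, not a slide figure).
nodes = TOTAL_CORES // CORES_PER_NODE

# Aggregate RAM across those nodes, in GB and in decimal TB.
total_ram_gb = nodes * RAM_GB_PER_NODE
total_ram_tb = total_ram_gb / 1000

print(f"{nodes} nodes, {total_ram_gb} GB RAM (~{total_ram_tb:.1f} TB)")
```

The result, 120 nodes and 17,280 GB, is consistent with the "~17 TB RAM" figure.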


3. Look Ahead Analytics: Tightly Integrated Data & Visualization Layers

Traditional BI: visualization layer and data layer are loosely coupled

Data layer has no knowledge of analytics layer design

Connections optimized for the lowest common denominator

PRIME Look Ahead Analytics: visualization layer and data layer are tightly integrated

Tightly integrated layers enable optimization

Analytics layer globally optimizes queries sent to data based on data structures

Data layer looks ahead and plans based on knowledge of dashboard

[Diagram callouts: analytics layer optimizes queries for data; data layer analyzes dashboard and optimizes structures]

Taking Co-Location One Step Further: In-Process Analytics

Traditional BI: Even if you install BI and DB on the same server, they run in separate processes

MicroStrategy PRIME: Query Engine and Application Engine run in the same process (in-process analytics)

[Diagram labels: App Engine, Query Processing, Process 0, Process 1]

3. Look Ahead Analytics: The Secret Sauce

Typical PRIME Application

75+ data sets

Multiple views of similar data

Share joins, filters and cohorts

Look Ahead Analytics

Visualizations with identical information processed once

Filtering and cohorts processed once and reused, processed into machine code

Re-use of joined results for analytics with similar information

... many more
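
The "processed once and reused" idea can be sketched with a cached filter: two visualizations that share a cohort trigger only one evaluation of it. This is an illustrative sketch, not PRIME's mechanism; the `cohort` function, the sample rows, and the evaluation counter are hypothetical, and the compilation to machine code mentioned on the slide is not modeled.

```python
from functools import lru_cache

# Hypothetical rows: (region, channel, revenue).
ROWS = (
    ("east", "mobile", 120),
    ("west", "web", 80),
    ("east", "web", 50),
    ("west", "mobile", 30),
)

evaluations = {"count": 0}  # counts how often the filter actually runs


@lru_cache(maxsize=None)
def cohort(region: str) -> tuple:
    """Evaluate a filter once; later callers reuse the cached rows."""
    evaluations["count"] += 1
    return tuple(row for row in ROWS if row[0] == region)


def total_revenue(region: str) -> int:
    """First visualization: a single total for the cohort."""
    return sum(amount for _, _, amount in cohort(region))


def revenue_by_channel(region: str) -> dict:
    """Second visualization: the same cohort broken down by channel."""
    result = {}
    for _, channel, amount in cohort(region):
        result[channel] = result.get(channel, 0) + amount
    return result


east_total = total_revenue("east")
east_by_channel = revenue_by_channel("east")
```

Both visualizations read the "east" cohort, but the filter runs only once; with dozens of data sets sharing joins and filters, this kind of reuse removes a large share of repeated work.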


MicroStrategy PRIME in Action


MicroStrategy PRIME - Architecture

Web and mobile output

API

VISUALIZATION

API

Application Engines

Analytics Engines

Tightly coupled for minimal computational distance

Parallel query execution

Data partitioning within and across nodes

Optimized in-memory data structure

DATA (four partitions shown)

Commodity hardware

Parallel data loading

SOURCE DATA

MicroStrategy PRIME Co-exists With Existing Enterprise Databases

Does not replace databases

Functions as hot data layer for apps requiring high performance

Load from databases or directly from files and Hadoop

[Diagram labels: SOURCE DATA, Data Warehouse, MicroStrategy PRIME]

Thank You
@BG_Chandran #MSTRPrime
