
Event Driven Real Time Analytics

Jon Mead, Rittman Mead September 2012


T : +44 (0) 8446 697 995 or (888) 631 1410 (USA) E : enquiries@rittmanmead.com W: www.rittmanmead.com

Sunday, 30 September 12

Introductions
Jon Mead
  CEO/co-founder of...
Rittman Mead Consulting
  Oracle BI & DW Consultancy
  Gold Partner
  Long(est) running Oracle BI blog
  Annual BI Forum
  OBIEE Oracle Press book
Customer-facing
  FTSE listed
  UK based and leading
  Internet based
  Retail based


Agenda
Understanding the Project
  Legacy architecture
  Proposed architecture
  Reporting requirements
Technical Infrastructure
  Hardware and Software
Data Warehouse Architecture
  Adopting the Oracle reference architecture for real time
Design Challenges
  De-queuing
Operational
  ODI Logging
  Multi-threading and scalability
Further thoughts
  Middleware or memory based applications

The point of this presentation is to give you an idea of how to approach a real time event driven BI system using Oracle's current toolset.


Understanding the Project



Business Goal
Part of a major re-architecture program
  Covering ERP, CRM and BI

Driver: single view of customer
Delivered by: channel consolidation into single enterprise data warehouse

Data migration, Enterprise Architecture, Enterprise Service Bus

Real-time reporting, Legacy reporting, BAU reporting

Revenue and Profit, Liability and risk, Up/cross-sell


Legacy Architecture

Legacy architecture consisted of two completely separate systems
  Retail stored shop based transactions
  Online stored transactions generated online

[Diagram: Retail trading systems (x3) feed the Retail Data Warehouse (SQL Server 2005, 6TB) via a 24 hour batch (DTS). Online trading systems (x3) feed the Online Data Warehouse (SQL Server 2008, 3TB) via a 24 hour batch (DTS).]


Proposed Architecture
[Diagram: Enterprise Architecture. Components shown: Retail trading systems (x3) and Online trading systems (x3), each with a real-time feed; TIBCO queue; ODI real-time feeds for transactions and for reference data; Real Time Data Warehouse on Exadata, plus a DR Exadata; ODI data migration (once off) from the Retail Data Warehouse (SQL Server 2005, 6TB) and the Online Data Warehouse (SQL Server 2008, 3TB).]

Current state to future state includes a data migration



Reporting Requirements
Real time monitoring
  Risk and liability
  Profit and loss
Analytic reporting
  Consolidated analytics
  Legacy reporting
Operational reporting
  Detail level
  Support analytical reports
  Drill through

Need to understand the different drivers for each of these needs and the value provided by real time reporting during the running of high transaction events


Technical Infrastructure

Volumetrics
Initially processing data from 2500 shops, scaling to capacity
8 TB of migrated data
Processing 1.8 million transactions a day
Processing 4,000 reference data items a day
Approximately 9 million transaction rows being processed a day
All transactions read from a TIBCO queue
Approximately 200,000 reference data changes a day
30,834 transaction processing cycles a day (one every ~2.8s)
2,701 reference data cycles a day
680,000 recalculations a day
Online transactions will follow
  2 million transactions a day
  Comparable downstream figures
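
The daily volumes above can be turned into per-second and per-cycle averages with simple arithmetic. The sketch below uses only the figures quoted on this slide; the function and variable names are illustrative, and the results are daily averages, so they say nothing about peaks during high transaction events.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def per_second(events_per_day: float) -> float:
    """Average events per second over a full day."""
    return events_per_day / SECONDS_PER_DAY


def seconds_between(cycles_per_day: float) -> float:
    """Average gap in seconds between processing cycles."""
    return SECONDS_PER_DAY / cycles_per_day


# Figures quoted on the slide.
transactions_per_day = 1_800_000
transaction_rows_per_day = 9_000_000
transaction_cycles_per_day = 30_834

avg_tps = per_second(transactions_per_day)
cycle_interval = seconds_between(transaction_cycles_per_day)
rows_per_transaction = transaction_rows_per_day / transactions_per_day
transactions_per_cycle = transactions_per_day / transaction_cycles_per_day

print(f"average transactions per second: {avg_tps:.1f}")
print(f"seconds between cycles: {cycle_interval:.1f}")
print(f"rows per transaction: {rows_per_transaction:.1f}")
print(f"transactions per cycle: {transactions_per_cycle:.1f}")
```

On these numbers a processing cycle handles roughly 58 transactions on average, each producing about five rows.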


Exadata and ODI


Standard X2-2 Quarter Rack
  2 compute nodes
  All databases split across the nodes
The storage is configured in dual redundancy mode to offer about 9TB of usable space; however, we use a couple of TB of that for backups and archive redo logs
The flash storage has been set up as 250GB on each node as a local cache, with 110GB from each being used to provide a 160GB flash disc
The database version is 11.2.0.3, and the client has the tuning and diagnostics pack and Heterogeneous Services on top of the usual Exadata software set
Both the 11.2.0.2 and 11.2.0.3 Oracle Homes still exist
ODI Agents for UAT and PROD run off Node 2
Up to 30 ODI Agents run for PROD to get the speed needed to read off the TIBCO queues
Each agent runs with 512MB, with the calling agent running with 1GB


Exadata and ODI


[Diagram: compute nodes and ODI installs. Labels shown: Node 1 and Node 2; ODI agents (DEV/NFT); ODI agents (PROD, UAT) on Node 2; Oracle databases DEV01, DEV02, NFT, UAT and PROD; PROD and UAT ODI work schemas; UAT and PROD repositories; SSD.]


Currently only one Exadata server available, so shared platform


Data Warehouse Architecture



Key Drivers
Part of integrated Enterprise Architecture
The enterprise data model was designed and developed in Enterprise Architect by the middleware architects
The architects wanted to base the approach on the Oracle Reference Data Warehouse architecture
There were different reporting needs for real time and business as usual reporting
Write performance was likely to be as big a factor as read performance


Oracle Reference Architecture


Simplified view of Oracle's Data Warehouse Reference Architecture
Enterprise Architecture was XML based

[Diagram: Active Data Warehouse. Labels shown: Enterprise Architecture as the source; Staging, Foundation and Performance layers; Audit and Reconciliation; OBIEE on top, covering Analysis and Operational and real-time reporting.]


One of the design drivers was that the foundation layer reflected the enterprise data model


Limitations
The ODS would reflect the enterprise architecture
  Non-database centric view
  Considerable processing to get data into ODS
Data processing from Staging to Foundation was too complex to support SLAs
Performance layer was also to reflect existing, more data warehouse-style structures


Result was non-performant and unusable structures for real-time reporting


Alternative Architecture
Split the processing between real-time and BAU
  Process 1: Staging to Performance (real-time)
  Process 2: Staging to ODS to Foundation
Independent control of either process
Mechanism to handle peaks in data
Needed to ensure consistency between processes

[Diagram: Staging, Foundation and Performance layers. Labels shown: TIBCO; ODI real time ETL into Staging (STG_xxx tables and STG_CTL); ODI real time ETL into BET and BETSLIP decomposition and aggregate tables (SQL real time query); ODI micro batch ETL into BET and BETSLIP 3NF tables, and on into BET and BETSLIP dimension and fact tables (SQL, near real time); OBIEE (near real time).]


Design Challenges

De-queuing
Concern that ODI would not be able to de-queue
  A lot of fluctuations, depending on events
  XML messages were verbose
  Large amount of processing time for each batch of messages
Scalability provided by creating more agents
  What would the limitations be in terms of RAM?
  What would the limitations be in terms of connections?
  What would the limitations be in terms of management?


2 Queues
Queues are unstructured, so the data can arrive in any order
  Difficulty of processing business logic
  Timestamps not always accurate
  Keys not always present
One of the most challenging areas of the project
  Often need to do manual lookup of keys
Solution: recycle mechanism

[Diagram: two TIBCO queues (Transactions and Reference data) read by the real time ETL into the STG_xxx tables and STG_CTL, with a Recycle loop on the staging tables.]
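
The slides do not show how the recycle mechanism was built, so the following is only a minimal sketch of the general idea: a message whose reference key cannot yet be resolved is parked, then retried on a later cycle once the reference data has arrived. The class, the `ref_key` field and the sample key `EVT-9` are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class RecycleStage:
    """Toy staging area that parks messages with unresolved keys."""

    reference: dict = field(default_factory=dict)  # key -> reference row
    loaded: list = field(default_factory=list)     # messages loaded downstream
    recycle: list = field(default_factory=list)    # messages waiting for a key

    def add_reference(self, key, row):
        self.reference[key] = row

    def _process(self, messages):
        # Load messages whose key resolves; park the rest for the next cycle.
        for msg in messages:
            if msg["ref_key"] in self.reference:
                self.loaded.append(msg)
            else:
                self.recycle.append(msg)

    def run_cycle(self, new_messages):
        # Retry previously parked messages first, then handle new arrivals.
        pending, self.recycle = self.recycle, []
        self._process(pending + list(new_messages))


stage = RecycleStage()
stage.run_cycle([{"id": 1, "ref_key": "EVT-9"}])   # key unknown: parked
parked_after_first_cycle = len(stage.recycle)
stage.add_reference("EVT-9", {"name": "example event"})
stage.run_cycle([])                                 # retried and loaded
```

A real implementation would also need a limit on how long a message may be recycled before it is raised for the manual key lookup mentioned above.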


High number of writes


The real-time write process generated a very high number of writes
  Exadata optimised for bulk reads
  Contention for REDO logs (see also the ODI Logging)
Exadata configured for more of an OLTP system than Data Warehousing system
  However both share the same server

Resolution: lots of work by the DBAs to optimise database configuration

Isn't this a little bit like an OLTP system?


Operational Challenges

ODI Parser
XML Parser
  The XML data definition files were dynamically generated
  The current version of the XML Parser does not do a double pass of the definition file
  Any referenced complex definitions needed to be defined in the order they were accessed
  The software generating the XML definition files did not do this
Resolution
  Build a Java program to re-parse the XML data definition file and output a correctly ordered version
  This is a once per release process
  This behaviour is fixed in 11.1.1.7 of ODI (I think)
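
The reordering step amounts to a topological sort: each complex definition must appear after every definition it references. The project's Java program is not shown in the slides, so the sketch below only illustrates the ordering idea in Python using the standard library's `graphlib` (Python 3.9+). The definition names are hypothetical, and extracting the references from a real definition file is left out.

```python
from graphlib import TopologicalSorter


def order_definitions(references):
    """Return definition names so that referenced definitions come first.

    `references` maps each definition name to the set of definitions it uses.
    A circular reference raises graphlib.CycleError.
    """
    return list(TopologicalSorter(references).static_order())


# Hypothetical definitions: BetSlip uses Bet and Customer, Bet uses Selection.
references = {
    "BetSlip": {"Bet", "Customer"},
    "Bet": {"Selection"},
    "Selection": set(),
    "Customer": set(),
}

ordered = order_definitions(references)
print(ordered)
```

Writing the definitions back out in this order gives a single-pass parser everything it needs before each reference is encountered.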


De-queuing Performance
ODI struggled to keep up with, or fell behind, the queue at peak times
  Volumes of messages were not regular
We also found agents failing
  Hence we needed a resumption mechanism


Scaling agents
Because of the failing agents, we couldn't just increase their number
Set up parent agents
  One for each queue
  One for monitoring and maintenance scripts
Each parent agent ran a number of child agents
  Each child agent was actually two agents
  Second agent acted as redundancy
  Agents killed after 50 executions

[Diagram: queues Q1 and Q2 with parent agents P1 and P2, plus a monitoring and maintenance (M&M) parent; child agents labelled A1 to A4 and C1 to C6.]
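
The slides give the rules (paired child agents, the second acting as redundancy, agents killed after 50 executions) but not the failover logic, so the sketch below is one assumed interpretation: each child slot holds a primary and a standby, the primary retires after a fixed number of executions, and the standby is promoted while a fresh standby is started. The class names are hypothetical and nothing here calls ODI.

```python
class Agent:
    """Stand-in for one agent process with a limited working life."""

    def __init__(self, name, max_executions=50):
        self.name = name
        self.max_executions = max_executions
        self.executions = 0
        self.alive = True

    def execute(self, batch):
        if not self.alive:
            raise RuntimeError(f"{self.name} is not running")
        self.executions += 1
        if self.executions >= self.max_executions:
            self.alive = False  # retired after this execution completes
        return len(batch)


class ChildSlot:
    """A primary/standby pair of agents, as managed by a parent agent."""

    def __init__(self, name, max_executions=50):
        self.name = name
        self.max_executions = max_executions
        self.generation = 0
        self.primary = self._spawn()
        self.secondary = self._spawn()

    def _spawn(self):
        self.generation += 1
        return Agent(f"{self.name}-{self.generation}", self.max_executions)

    def execute(self, batch):
        if not self.primary.alive:
            # Promote the standby and start a fresh standby behind it.
            self.primary = self.secondary
            self.secondary = self._spawn()
        return self.primary.execute(batch)


slot = ChildSlot("A1")
for _ in range(120):
    slot.execute(["message"])
```

After 120 executions the slot has retired two agents and is on its third, with a fourth waiting as standby.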


Impacts of Multiple Agents


Memory
  Parent agent 1024MB
  Child agent 512MB
Total number of agents used
  3 parents
  18 child (primary) and 18 child (secondary)
  Total: 39 = 21504MB (approx 21GB)
However we didn't get anywhere near linear scaling
  Max TPS = 176
  Max queue TPS = 480
Second option is to increase the number of queues
  Split by functional area

Connections: every time an ODI agent read from the queue, a new connection was created and destroyed. There didn't seem to be any pooling.
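
The memory total above can be checked directly from the per-agent figures. In the sketch below the final ratio assumes that "Max TPS" is the rate the agents achieved and "Max queue TPS" is the peak rate of the queue; the slide does not spell this out.

```python
PARENT_MB = 1024
CHILD_MB = 512

parents = 3
children = 18 + 18  # primary plus secondary child agents

total_agents = parents + children
total_mb = parents * PARENT_MB + children * CHILD_MB
total_gb = total_mb / 1024

# Assumed reading: achieved throughput versus peak queue rate.
max_tps = 176
max_queue_tps = 480
throughput_ratio = max_tps / max_queue_tps

print(f"{total_agents} agents use {total_mb}MB (about {total_gb:.0f}GB)")
print(f"achieved/peak throughput: {throughput_ratio:.0%}")
```

On that reading the agents reached a little over a third of the peak queue rate despite using about 21GB of memory.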


ODI Logging
The ODI processes create 900GB of log files a day
ODI logging needs high IOPS
Exadata, by default, was not allocating enough IOPS resource to the ODI logging
ODI logging then becomes a limiting factor on the database performance
  Target is SNP_SESS_TASK_LOG
  Log writer process cannot keep up
The number of active processes means the database will be performing as hard as it can, and more activity will slow everything down


ODI does the same logging but the volume preserved is reduced with lower levels of logging. So in fact, lower levels of logging could be more IO demanding as more data is deleted.


SNP_SESS_TASK_LOG
The most demanding SQL statement on the system is, and always has been, the update to SNP_SESS_TASK_LOG
This table holds 3 CLOB columns
The update is "lazy": all columns are updated each time
Thus each update can potentially update:
  the table
  three CLOB indexes
  three CLOB tables
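
To make the cost of a "lazy" update concrete, the sketch below shows the general alternative: comparing the old and new row and updating only the columns that changed, so untouched CLOB columns are never rewritten. This illustrates the technique only and is not a setting available in ODI. The table name `task_log` and all column names are hypothetical, not the real SNP_SESS_TASK_LOG layout.

```python
def build_update(table, key_column, old_row, new_row):
    """Build an UPDATE that touches only the columns whose values changed.

    Returns (sql, params), or (None, ()) when nothing changed.
    """
    changed = [
        col for col in new_row
        if col != key_column and new_row[col] != old_row.get(col)
    ]
    if not changed:
        return None, ()
    set_clause = ", ".join(f"{col} = ?" for col in changed)
    sql = f"UPDATE {table} SET {set_clause} WHERE {key_column} = ?"
    params = tuple(new_row[col] for col in changed) + (new_row[key_column],)
    return sql, params


# Hypothetical row with three large text columns that do not change.
old = {"task_id": 7, "status": "R", "clob_a": "x", "clob_b": "y", "clob_c": "z"}
new = {**old, "status": "D"}

sql, params = build_update("task_log", "task_id", old, new)
print(sql, params)
```

Here a status change produces an update of one column rather than all five.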


ODI Logging Impact


Initially the system was throttled on the SNP_SESS_TASK_LOG
ODI IOPS demand was maxing out the physical disc IOPS capability of the box
Move SNP_SESS_TASK_LOG and SNP_SESS_TASK to a new ASM diskgroup created from the SSD storage in Exadata


Simple solution for the ODI Logging problem is to move the database to another server


ODI temporary tables


The ODI real-time processing creates a large number of I$ and other internal tables
Once the processing around these is complete, they are put in the Recycle Bin
The Recycle Bin becomes either full or unmanageable


There is a wider issue here that affects the scalability of the whole solution, discussed in the next slides


Future Challenges

Scalability
The main bottleneck we are experiencing is I/O
  High number of writes to the database
  ODI $ internal tables
  ODI logging
We should address this problem by making better use of memory
We also have a constraint on the amount of memory Exadata can provide the agents
  Any allocated memory has the opportunity cost of not being used by the database
We should also explore other logical approaches to solving this problem

Accessing data in memory reduces the I/O reading activity when querying the data, which provides faster and more predictable performance than disk


Process Flexibility
The source system splits the data into real time and batch
  ESB provides 2 separate queues
  The XSD is the same for both queues
The processing for both queues is constantly running in a loop
  The batch queue is much larger than the real time queue
  The foundation layer requires data from both
The data from each queue lands in the same stage tables, partitioned by the queue name
Entire process controlled by maintaining BATCH_IDs

[Diagram: the source systems feed splits into Non Critical Data and Real Time Data, each with its own DQ process and CDC process. Stage Schema: event stage tables, STG_ tables, and CTL_EVENT (batch_id, batch_type, ODS_processed, RTF_processed, STG_processed). ODS load processing leads to the foundation layer tables (Foundation Schema); real time processing leads to the reporting tables (Performance Schema).]
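
A control table like CTL_EVENT lets two independent loads work through the same staged batches at their own pace. The sketch below illustrates that pattern with an in-memory SQLite table so that it runs anywhere. The column names follow the diagram, but their exact meaning is assumed here: RTF_processed is taken to mark the real time load and ODS_processed the ODS and foundation load. The batch type values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE ctl_event (
        batch_id      INTEGER PRIMARY KEY,
        batch_type    TEXT    NOT NULL,
        stg_processed INTEGER NOT NULL DEFAULT 0,
        rtf_processed INTEGER NOT NULL DEFAULT 0,
        ods_processed INTEGER NOT NULL DEFAULT 0
    )
    """
)

# Only these flag columns may be interpolated into SQL below.
DOWNSTREAM_FLAGS = {"rtf_processed", "ods_processed"}


def register_batch(batch_id, batch_type):
    """Record a batch once its rows have landed in the stage tables."""
    conn.execute(
        "INSERT INTO ctl_event (batch_id, batch_type, stg_processed) "
        "VALUES (?, ?, 1)",
        (batch_id, batch_type),
    )


def pending(flag):
    """Staged batches that the given downstream process has not yet loaded."""
    if flag not in DOWNSTREAM_FLAGS:
        raise ValueError(f"unknown flag: {flag}")
    rows = conn.execute(
        f"SELECT batch_id FROM ctl_event "
        f"WHERE stg_processed = 1 AND {flag} = 0 ORDER BY batch_id"
    )
    return [row[0] for row in rows]


def mark_done(flag, batch_id):
    if flag not in DOWNSTREAM_FLAGS:
        raise ValueError(f"unknown flag: {flag}")
    conn.execute(
        f"UPDATE ctl_event SET {flag} = 1 WHERE batch_id = ?", (batch_id,)
    )


register_batch(1, "REALTIME")
register_batch(2, "BATCH")

# The real time load runs ahead; the ODS load has not started yet.
for batch_id in pending("rtf_processed"):
    mark_done("rtf_processed", batch_id)

print(pending("rtf_processed"), pending("ods_processed"))
```

Because each process tracks its own flag, either one can be paused or can fall behind during a peak without blocking the other, and the shared batch_id gives a point at which the two can be reconciled.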


In-Memory Processing
Will require re-writing of the Knowledge Modules
Should also persist connections
Option 1: Remove the writes to the $ tables and attempt to do more operations on-the-fly
  Potential loss of audit trail and reconciliation points
  Currently all outer joins are materialised
  Will need to perform
Option 2: Use In-Memory database?


Exalytics?


Conclusion

Conclusion
The Oracle Reference Data Warehouse architecture can support real time event driven ETL, however it may need modifications
ODI has some rough edges and kinks that need to be ironed out for it to act at this kind of enterprise level
Don't underestimate the effort of doing a data migration
It's important to understand the implications and differences of middleware centric data models and processing compared with database centric ones

The objectives of the project were achieved. The resulting data is being used on a daily basis and providing significant value to the organisation


Questions?
