
Data Warehousing

Teradata Aggregate Designer

By: Sam Tawfik, Product Marketing Manager, Teradata Corporation



Table of Contents
Executive Summary
Introduction
Problem Statement
Implications of MOLAP
Solution Overview
Solution Details
Product Overview
Product Details
Summary
Abbreviations
Requirements
References

Executive Summary

The Teradata Aggregate Designer is a tool that enables customers to maximize the performance and value of their enterprise business intelligence environments on the Teradata platform. Teradata Database offers a unique built-in capability, the Aggregate Join Index, that supports multi-dimensional business intelligence solutions. Aggregate Join Indexes perform common aggregations and calculations automatically as the data are loaded into the data warehouse, saving the business intelligence solutions significant aggregation time. The Teradata Aggregate Designer automates the design, recommendation, and creation of Aggregate Join Indexes in Teradata Database by leveraging implementation best practices.

EB-6110 > 0210


Introduction
Business intelligence (BI) applications are constantly increasing in scope, technological sophistication, and analytical power. Today more than ever, decision makers rely on BI tools and critical business data to monitor, analyze, and act upon the company's financial performance, operations, supply chain, compliance, risk management, and other key business functions. The volume and complexity of deep analytics required by modern organizations for reporting, auditing, planning, forecasting, and automating require optimal, state-of-the-art processing capabilities to deliver the right information to the right recipients at the right time. Teradata systems are purpose built to deliver high-performance, rich BI solutions powered by high-quality data and best-of-breed analytical tools. Teradata Aggregate Designer extends and enhances Teradata Database's native capabilities to optimize the performance of rich, multi-dimensional BI solutions.

Problem Statement
BI solutions typically consist of commercial or custom-built on-line analytical processing (OLAP) tools and a data source. Leading OLAP vendors, such as IBM Cognos, MicroStrategy, Microsoft, Oracle, SAP BusinessObjects, and SAS, enable users to perform a wide range of analytical tasks, from simple dimensional aggregation (slice and dice) to data mining and data modeling for complex analysis and advanced mathematical forecasting. In addition, these products provide rich presentation and delivery features that enable enterprise-wide deployment of business reports, dashboards, and scorecards. The unprecedented reach and sophistication of these tools allow users to obtain answers to highly relevant and specific business questions using intuitive graphical tools. At the core of these remarkable capabilities, there are vast repositories of data, usually in the form of a data warehouse.
Optimal performance in data access and processing is therefore critical to the success of enterprise BI solutions. The Teradata Database is designed specifically to address this challenge.
Figure 1 outlines the traditional OLAP implementation approach used by many BI tools to extract data (often summarized or aggregated results) from the data warehouse and store those data on a separate OLAP server. The dedicated OLAP server and the limited data set enable the OLAP solution to deliver the performance that is required by the users. This implementation approach is called multi-dimensional on-line analytical processing (MOLAP).
Figure 1. Traditional OLAP implementation. (Diagram labels: Business Intelligence Applications, OLAP Server, ETL.)



Implications of MOLAP
MOLAP-based BI solutions have many benefits, but there are negative implications for this approach as well. Some business implications include:
> Significant delays in making the business data available to business analysts, as the data must be extracted from the data warehouse, transformed, and loaded.
> Possible cube reliability challenges due to the frequent refresh jobs touching multiple systems and environments (data warehouse, data extracts, network, OLAP server, cube building), each of which can fail.
The time lag required to process the extraction, transformation, and loading of the data to OLAP cubes affects the timely availability of critical business data necessary to respond to vital business events, such as delayed shipments, regional events, or competitive threats. Because of the larger number of system components that must be orchestrated, there is a higher operational risk that the cube refresh workflow process can fail.

Limitations of MOLAP
Limitations of the MOLAP implementation approach include:
> Designed to work with aggregated data and not detailed data.
> Increased operational effort required to provide highly detailed data, causing high data movement volumes, with daily or intra-daily cube updates. In reality, this significantly limits the level of detail and cube refresh frequency that can be supported.
> Performance and levels of analytics are bound by the middle-tier server's hardware capabilities.
> Requires a middle-tier server to host OLAP cubes and perform analytics.
While performing analytics on aggregated data leads to increased performance, detail is sacrificed in favor of performance, significantly reducing the analytic range. Detailed data allow business users to answer progressively deeper and more specific questions to understand exceptions, temporary events, and ultimately, root causes. Accessing these detailed data requires a separate process for extracting the detailed data from the data warehouse. Most middle-tier analytical application servers are limited in processing power since they are not purpose built for massively parallel computing.
This severely limits scalability, performance, and ultimately, business value as it hinders the ability of users to gain access to key insights in a timely manner. The required middle-tier hardware and software represent additional costs of maintenance, operation, licensing, and support.

Solution Overview
IT organizations are now deploying BI solutions using the relational on-line analytical processing (ROLAP) approach. This approach is designed to access data stored in the data warehouse directly instead of accessing them from the OLAP cube stored on the middle-tier server. The ROLAP approach avoids messy data movement, makes all aggregated and detailed data available to BI analytical tools, and requires no middle tier other than to host the BI application itself. Most modern BI tools also support the hybrid on-line analytical processing (HOLAP) model. In this approach, most processing is supported by ROLAP, while certain parts of the solution rely on MOLAP capabilities. The intent of the hybrid approach is to optimally combine the fast response time typical of MOLAP for highly time-sensitive, usually tactical queries with the analytical depth and ability to drill down and drill through to detailed data through ROLAP. Many customers have implemented the ROLAP and/or HOLAP models to deliver successful BI solutions. These solutions eliminate the need to extract the data from the data warehouse while meeting the business response time requirements and leveraging the benefits of accessing aggregated as well as detailed data for better insight.



Solution Details
The ROLAP approach enables organizations to leverage best-of-breed BI tools and a powerful relational data warehouse such as the Teradata Database. Teradata Database is the ideal data warehouse to meet the scalability and performance requirements of BI solutions in highly demanding and competitive environments. Teradata Database is designed and optimized to support ROLAP analytics.
Attribute | MOLAP | HOLAP | ROLAP
Query Performance | Sub-seconds | Sub-seconds for cube; few seconds for detail | Few seconds
Breadth of Analytics | Limited to cube on the middle-tier server | Limited to cube on the middle-tier server and drill through | Almost unlimited
Depth of Analytics | Almost unlimited | Drill-down for detailed analysis | Detailed analysis
Data Freshness | Varies depending on cube refresh | Combination of cube data and current data | Current data
Cost of Implementation | High | Moderate | Low
Figure 2. Comparison of the three OLAP implementation approaches.

A key Teradata technology for supporting ROLAP and HOLAP solutions is the Aggregate Join Index (AJI). AJIs are join indexes that specify SUM or COUNT aggregate operations across one or more tables. AJIs perform aggregations automatically as the data are loaded into the data warehouse, resulting in much faster response at query time. This approach enables the BI solution to benefit from the quick response time while also allowing wider and deeper analysis of the detailed data. AJIs also eliminate the need to extract the data from the data warehouse in order to load them into a middle tier, as is usual with the MOLAP approach. AJIs are completely transparent to BI tools and client applications in general. AJIs are typically implemented manually to support specific analysis scenarios. Once defined, they require no further user or administrator maintenance. AJIs are automatically accessed by the Teradata optimizer to maximize performance in accessing and navigating relational data.
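The way an AJI trades load-time work for query-time speed can be illustrated with a small Python sketch. This is a conceptual toy, not Teradata's implementation: the class name `AggregateIndex`, the column names, and the sample rows are all hypothetical. It keeps SUM and COUNT per group as rows are loaded, so that an aggregate question is answered from the stored totals rather than by rescanning the detail rows.

```python
from collections import defaultdict


class AggregateIndex:
    """Toy stand-in for an aggregate index: keeps SUM and COUNT per group key."""

    def __init__(self, group_by, measure):
        self.group_by = group_by          # column names that form the group key
        self.measure = measure            # column whose SUM is maintained
        self.sums = defaultdict(float)
        self.counts = defaultdict(int)

    def on_load(self, row):
        """Update the stored aggregates as one detail row is loaded."""
        key = tuple(row[col] for col in self.group_by)
        self.sums[key] += row[self.measure]
        self.counts[key] += 1

    def total(self, *key):
        """Answer SUM for a group from stored totals (no scan of detail rows)."""
        return self.sums.get(tuple(key), 0.0)

    def count(self, *key):
        """Answer COUNT for a group from stored totals."""
        return self.counts.get(tuple(key), 0)


# Hypothetical detail rows arriving during a load.
incoming = [
    {"region": "West", "day": "2010-02-01", "amount": 100.0},
    {"region": "West", "day": "2010-02-01", "amount": 50.0},
    {"region": "East", "day": "2010-02-01", "amount": 75.0},
]

detail_rows = []
index = AggregateIndex(group_by=("region", "day"), measure="amount")
for row in incoming:
    detail_rows.append(row)   # the detail data stay available for drill-down
    index.on_load(row)        # the aggregate is maintained at load time

west_total = index.total("West", "2010-02-01")
west_scan = sum(r["amount"] for r in detail_rows if r["region"] == "West")
```

Both `west_total` and `west_scan` give the same answer; the difference is that the first reads one stored value while the second touches every detail row, which is the cost an aggregate index is meant to avoid.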

Product Overview
Optimal performance of implementations using the ROLAP or HOLAP models with Teradata Database involves database design considerations such as data loading, physical data modeling, and performance tuning. The goal in the data loading process (including both ETL and ELT) is to cleanse, standardize, and load the data into a normalized model to support reporting and analytics. Dimensional aggregates are usually built on these data to increase performance. Data modeling is a set of techniques aimed at designing optimal data structures to store business data. Best practices involve a progression of stages from logical to physical.

Third normal form (3NF) is the ideal data representation for decision support solutions because it is optimized for ad-hoc queries while providing maximum flexibility and minimal redundancy. Star and snowflake schemas are better suited for OLAP solutions because they are optimized to support predefined business questions, and they align intuitively with the way people think about business data. In Teradata Database, a Semantic Layer, normally built through views or materialized tables, is typically utilized to implement star and snowflake schemas to support OLAP. Currently, using AJIs optimally involves a multi-step manual process that includes validating the design requirements (loading process, semantic layer, and database considerations); capturing the OLAP attributes; and designing, building, testing, and deploying the identified AJIs in Teradata Database. The Teradata Aggregate Designer automates this process and makes it easier to take advantage of Teradata Database's built-in OLAP optimization features for faster response time and increased levels of analytics.

Product Details
The Teradata Aggregate Designer is a desktop administrative design-time productivity tool used by database administrators (DBAs) to automate the design, recommendation, and creation of AJIs in Teradata Database. The tool bridges the gap between the multidimensional BI environment and the relational database environment by helping DBAs create the recommended AJIs. AJIs improve the performance of BI requests based on the questions that can be asked of the relational dataset. Teradata Aggregate Designer takes the guesswork out of creating AJIs and allows DBAs to be more productive and accurate in their AJI designs. The tool also creates AJIs that are more likely to be hit by SQL statements from a BI tool.

Cube Schema Definition Capture
In order for the Teradata Aggregate Designer to know what AJIs to build, it must have information about common questions that users will ask of the relational dataset. Therefore, the first action the tool performs is to read a schema definition either from a partner BI tool or from the Teradata Schema Workbench. The tool analyzes this schema to understand the constructs of the multidimensional schema definition and breaks it down into measures, dimensions, hierarchies, and other dimensional objects. Schemas can either be provided to the tool as simple flat files, or the tool can integrate with the multidimensional engine via web services for seamless interoperability.

Database Validations
Once the schema has been consumed and parsed, the Teradata Aggregate Designer performs a series of validations to ensure the database is suitable for AJI creation. These validations include:
> Database elements in the schema are defined.
> Primary and foreign keys are NOT NULL.
> Primary keys are unique.
> Compression is not set on columns.
> Referential integrity is set.
If issues are identified, the tool provides specific instructions to the DBA about how to resolve any errors. These validations are important because schemas that fail to meet the specified conditions preclude either the appropriate creation and load of AJIs or their use by the query optimizer.

AJI Recommendations
The Teradata Aggregate Designer can be used in two different ways: Manual and Automated.
Manual Mode is targeted to expert users who are already familiar with AJIs and who know the exact AJI they want to build. They can leverage the GUI design tool to create the appropriate AJIs.
Automated Mode is targeted to novice users who are not experienced in creating AJIs. The AJI Advisor is used to recommend AJIs based on the dimensional model. The AJI Advisor feature leverages best-practice heuristics and algorithms to recommend the optimal AJIs to build. The AJI Advisor recommends two AJIs: a base and a broad AJI. A broad AJI joins to one or more dimension tables and aggregates to a higher level than was available in the Fact or Transaction table. A base AJI does not join to any dimension tables. It is an aggregate index that only aggregates rows on a Fact or Transaction table.
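The database validations listed in this section can be pictured as a set of checks over table metadata. The following Python sketch is only an illustration under stated assumptions: the `Column` and `Table` structures, the `validate_for_aji` function, and the sample tables are hypothetical and do not describe how the Teradata Aggregate Designer is implemented. The source does not say which columns the compression check applies to, so the sketch assumes all columns of each table.

```python
from dataclasses import dataclass, field


@dataclass
class Column:
    name: str
    nullable: bool = False
    compressed: bool = False


@dataclass
class Table:
    name: str
    columns: dict                     # column name -> Column
    primary_key: list                 # names of primary key columns
    pk_is_unique: bool = True
    foreign_keys: dict = field(default_factory=dict)  # column -> referenced table
    referential_integrity: bool = True


def validate_for_aji(schema_table_names, database):
    """Return a list of problems found; an empty list means every check passed."""
    problems = []
    for name in schema_table_names:
        table = database.get(name)
        if table is None:
            # Check: database elements in the schema are defined.
            problems.append(f"{name}: table is not defined in the database")
            continue
        for col_name in list(table.primary_key) + list(table.foreign_keys):
            column = table.columns.get(col_name)
            if column is None:
                problems.append(f"{name}.{col_name}: key column is not defined")
            elif column.nullable:
                # Check: primary and foreign keys are NOT NULL.
                problems.append(f"{name}.{col_name}: key column must be NOT NULL")
        if not table.pk_is_unique:
            # Check: primary keys are unique.
            problems.append(f"{name}: primary key is not unique")
        for column in table.columns.values():
            if column.compressed:
                # Check: compression is not set (assumed here to mean any column).
                problems.append(f"{name}.{column.name}: compression is set")
        if table.foreign_keys and not table.referential_integrity:
            # Check: referential integrity is set.
            problems.append(f"{name}: referential integrity is not set")
    return problems


# Hypothetical metadata for a small star schema.
database = {
    "sales_fact": Table(
        name="sales_fact",
        columns={
            "sale_id": Column("sale_id"),
            "store_id": Column("store_id", nullable=True),
            "amount": Column("amount", compressed=True),
        },
        primary_key=["sale_id"],
        foreign_keys={"store_id": "store_dim"},
    ),
    "store_dim": Table(
        name="store_dim",
        columns={"store_id": Column("store_id")},
        primary_key=["store_id"],
    ),
}

problems = validate_for_aji(["sales_fact", "store_dim", "time_dim"], database)
```

Returning a list of messages rather than stopping at the first failure mirrors the behavior described above, where the DBA receives specific instructions for each issue that must be resolved.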



Teradata Aggregate Designer features a Creation Services module used to create, edit, or delete an AJI (see Figure 3). The Creation Services module provides a GUI that allows the user to define the name of the AJI, select a predefined schema, select the dimensions to aggregate, and define the Teradata indexes. The Creation Services module also provides AJI storage cost estimates by calculating and displaying the AJI space requirements and overhead relative to the Fact Table estimates (see Figure 4). Finally, the Creation Services module automatically writes the AJI DDL statement, connects to the database, and executes the statement to create the AJI.
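To make the idea of generated AJI DDL and a relative storage cost more concrete, here is a hedged Python sketch. It is not the tool's actual generator or output: the functions `build_aji_ddl` and `space_cost_percent`, and every table, column, and index name, are hypothetical. The generated text follows the general shape of a Teradata CREATE JOIN INDEX statement with SUM and COUNT aggregates, but it has not been validated against any Teradata release, and the cost formula (AJI size as a percentage of fact table size) is an assumption based on the "Space Cost %" label in Figure 4.

```python
def build_aji_ddl(name, fact_table, measures, group_by, joins=(), primary_index=()):
    """Assemble illustrative DDL text for an aggregate join index.

    measures:      fact table columns to SUM
    group_by:      qualified column names to aggregate by
    joins:         (dimension_table, fact_column, dimension_column) tuples;
                   empty for a base AJI, non-empty for a broad AJI
    primary_index: unqualified column names for the index's primary index
    """
    select_items = list(group_by)
    select_items += [f"SUM({fact_table}.{m}) AS sum_{m}" for m in measures]
    select_items.append("COUNT(*) AS row_count")

    from_clause = fact_table
    for dim_table, fact_col, dim_col in joins:
        from_clause += (
            f"\n  INNER JOIN {dim_table}"
            f" ON {fact_table}.{fact_col} = {dim_table}.{dim_col}"
        )

    lines = [
        f"CREATE JOIN INDEX {name} AS",
        "SELECT " + ", ".join(select_items),
        f"FROM {from_clause}",
        "GROUP BY " + ", ".join(group_by),
    ]
    if primary_index:
        lines.append("PRIMARY INDEX (" + ", ".join(primary_index) + ")")
    return "\n".join(lines) + ";"


def space_cost_percent(aji_bytes, fact_bytes):
    """Assumed cost measure: estimated AJI size as a percentage of fact table size."""
    if fact_bytes <= 0:
        raise ValueError("fact table size estimate must be positive")
    return 100.0 * aji_bytes / fact_bytes


# Base AJI: aggregates the fact table only, with no dimension joins.
base_ddl = build_aji_ddl(
    name="sales_day_aji",
    fact_table="sales_fact",
    measures=["amount"],
    group_by=["sales_fact.sale_date"],
    primary_index=["sale_date"],
)

# Broad AJI: joins a dimension table and aggregates to a higher level (month).
broad_ddl = build_aji_ddl(
    name="sales_month_aji",
    fact_table="sales_fact",
    measures=["amount"],
    group_by=["time_dim.calendar_month"],
    joins=[("time_dim", "sale_date", "calendar_date")],
    primary_index=["calendar_month"],
)

cost = space_cost_percent(aji_bytes=2_000, fact_bytes=10_000)
print(base_ddl)
print(broad_ddl)
print(f"Space cost: {cost}%")
```

The single `joins` argument is what separates the two recommended index types in this sketch: leaving it empty yields a base AJI over the fact table alone, while supplying dimension joins yields a broad AJI at a higher aggregation level.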
Aggregation Levels
Dimension : Hierarchy | Level
Time : All Time | Day
Org : All Orgs | Sales Center
Business : All Business | Business Type
Channel : All Channels | Channel Type
Brand : All Brands | Brand
Product : All Product | Product
Figure 3. AJI Creation Services. (Screen panels: Aggregation Levels, AJI Options.)

Figure 4. AJI storage cost estimates. (Chart: Space Cost %, scale 0 to 100, for the selected AJIs.)

Summary
The Teradata Aggregate Designer simplifies and automates building and deploying cube-based aggregations in Teradata Database to accelerate ROLAP (and HOLAP) solutions. By bridging the gap between the multidimensional BI environment and the relational database environment, the tool helps DBAs create AJIs that improve the performance of BI tools based on the questions that can be asked of the relational dataset. The Teradata Aggregate Designer takes the guesswork out of creating AJIs and allows DBAs to be more productive and accurate in their AJI designs. The tool also creates AJIs that are more likely to be hit by SQL statements from a BI tool, ultimately providing maximum performance and optimal usage of resources.

Abbreviations
AJI (Aggregate Join Index): An AJI is an aggregate result set saved as an object in the database, and it is transparent to end users. It is leveraged automatically by the Teradata optimizer when a query plan contains matching columns and aggregates.
HOLAP (Hybrid On-Line Analytical Processing): HOLAP is an OLAP implementation approach that utilizes both MOLAP and ROLAP approaches to provide high performance for frequently accessed analytics while also providing the capability to drill down to detailed data or to drill across multiple dimensions.
MOLAP (Multidimensional On-Line Analytical Processing): MOLAP is an OLAP implementation approach that utilizes a middle-tier server for hosting and analyzing the BI solution's data.
ROLAP (Relational On-Line Analytical Processing): ROLAP is an OLAP implementation approach that relies on the data warehouse for hosting and analyzing the BI solution's data.



Teradata.com

Requirements
> Database Versions:
  Teradata Database version 12
  Teradata Database version 13
Desktop
> Client Operating System Platforms Supported:
  Windows XP Service Pack 3
  Windows 2003 Standard Service Pack 3
  Windows Vista
  Windows 2008 Standard
> Java Platform Standard Edition JRE 6
> Disk Space: Minimum 1GB, Recommended 10GB
> CPU: Minimum 1GHz (single-core), Recommended 1.5GHz (multi-core)
> RAM: Minimum 1GB, Recommended 2GB

References
Webinars available on Teradata Education Network (external) and Teradata University (Teradata associates only):
> TBIO Overview Webinar (course # 45863)
> Introduction to Data Modeling (course # 26369)
> Teradata Star-Schema Designs (course # 26536)
> OLAP Optimization with Teradata (course # 37317)
> Business Intelligence Concepts and Tools (course # 43022)
> Common Performance Considerations for Teradata and MicroStrategy (course # 37861)
> Improve your OLAP Environment with Cognos and Teradata (course # 37715)
> Improve your OLAP Environment with Microsoft and Teradata (course # 37686)
> Improve your OLAP Environment with Teradata and Oracle Essbase (course # 43892)

Other Teradata White Papers:
> Implementation AJI for ROLAP
  http://www.teradata.com/t/page/170888/?src=tdmo_rl&i=v07n04
> Improve Your OLAP Environment with Hyperion and Teradata
  http://www.teradata.com/t/page/166368/?src=tdmo_rl&i=v07n04
> Improve Your OLAP Environment with Microsoft and Teradata
  http://www.teradata.com/t/page/170195/?src=tdmo_rl&i=v07n04

About the Author
Sam Tawfik, Product Marketing Manager with Teradata Corporation, has an extensive background in data warehousing, enterprise application architecture, and systems integration. His broad technical experience includes large-scale data warehouse systems, business intelligence, application development, and Service Oriented Architecture. Sam's work experience includes software engineering, systems integration consulting, project management, product evangelism, marketing, and vendor and partner development. Prior to joining Teradata, he worked with BEA Systems and BearingPoint. He has more than 20 years of IT experience and received his undergraduate degree in Computer Science from California State University, Fullerton.

This document, which includes the information contained herein, is the exclusive property of Teradata Corporation. Any person is hereby authorized to view, copy, print, and distribute this document subject to the following conditions. This document may be used for non-commercial, informational purposes only and is provided on an AS-IS basis. Any copy of this document or portion thereof must include this copyright notice and all other restrictive legends appearing in this document. Note that any product, process, or technology described in this document may be the subject of other intellectual property rights reserved by Teradata and are not licensed hereunder. No license rights will be implied. Use, duplication, or disclosure by the United States government is subject to the restrictions set forth in DFARS 252.227-7013(c)(1)(ii) and FAR 52.227-19. Teradata, the Teradata logo, and Raising Intelligence are trademarks or registered trademarks of Teradata Corporation and/or its affiliates in the U.S. or worldwide. Cognos is a registered trademark of IBM. MicroStrategy is a registered trademark of MicroStrategy Incorporated. Microsoft is a registered trademark of Microsoft Corporation. Oracle is a registered trademark of Oracle Corporation and/or its affiliates. SAP is a registered trademark, and BusinessObjects is a trademark of SAP AG in Germany and in several other countries. SAS is a registered trademark of SAS Institute Inc. in the USA and other countries. Teradata continually improves products as new technologies and components become available. Teradata, therefore, reserves the right to change specifications without prior notice. All features, functions, and operations described herein may not be marketed in all parts of the world. Consult your Teradata representative or Teradata.com for more information. Copyright 2010 by Teradata Corporation All Rights Reserved. Produced in U.S.A.
