
OVUM OPINION

Big Data integration is the big deal in Informatica 9.1


Reference Code: OI00141-026 Publication Date: June 2011 Authors: Madan Sheina and Tony Baer

OVUM VIEW
Summary
The highlights of Informatica's recent 9.1 platform release target Big Data integration, self-service, upgraded data quality, master data management (MDM), and data service capabilities. It provides solid functional updates to what is already a rich and ever-broadening data integration platform. The Informatica platform already supported data movements with Hadoop through partnerships with Cloudera and EMC, but the new release adds direct, bidirectional connectivity between Informatica and Hadoop, tapping an emergent use case for customers seeking the raw power of this NoSQL target. The 9.1 release also adds new connectors to social networks, supporting the increasingly popular use case of social media analytics.

Big Data challenges play directly into Informatica's integration strengths


Big Data represents the confluence of more, and new/emerging, types of transaction and interaction data with demands for more scalable and quicker processing of that data. The issue is not so much the size of these traditionally siloed repositories of information, but the potential for understanding the relationships between them. This is where Informatica's competencies come into play. Combining traditional structured transactional information with unstructured interaction data generated by humans and the Internet (customer records, social media) and, increasingly, machines (sensor data, call detail records) is clearly the sweet spot. These types of interaction data have traditionally been difficult to access or process using conventional BI systems. The appeal of adding these new data types is to allow enterprises to achieve a more complete view of customers, with new insights into relationships and behaviors from social media data. That, of course, presents Informatica with an opportunity to apply its data integration, profiling, and quality know-how directly to Big Data sets and processing environments to enrich data sets as well as master data. Not surprisingly, Informatica is calling Big Data the next big growth opportunity for its business, with 9.1 the first stab of many. However, Ovum believes the focus on Big Data is a natural corollary to the company's last stated big growth opportunity: the Informatica cloud, as both a data source target and a platform on which to host its products. A big part of Big Data will be driven by enterprises seeking to build hybrid architectures that store and integrate data residing in on-premise systems and in the cloud.

Informatica PowerExchange provides the technical foundation for 9.1's Big Data play
Informatica supports Big Data in two ways, backing both Hadoop and non-Hadoop processing platforms, and it is doing so largely through its PowerExchange family of data access products. In May 2011 the company announced support for EMC Greenplum's distribution of the Hadoop file system. The 9.1 release builds on this by adding a new PowerExchange for Hadoop Distributed File System (HDFS) connectivity tool, which augments Big Data processing by moving enterprise data into Hadoop clustered environments for highly scalable parallel processing, and out to targets (such as data warehouses) for consumption and analysis. The benefit is being able to reuse existing Informatica development skills in Hadoop environments. This addresses a major gap identified in the Ovum report What is Big Data: The Big Architecture: the lack of skills for Hadoop, MapReduce, and related technologies is currently one of the biggest impediments to the adoption of NoSQL platforms.

In the next release, Informatica plans to build a more robust offering that includes a graphical integrated development environment (IDE) for Hadoop; codeless, metadata-driven development; the ability to prepare and integrate data directly inside Hadoop environments; and end-to-end metadata lineage across the Informatica, Hadoop, and target environments.

The 9.1 platform also includes a new set of connectors to various Big Data transactional systems to make it easier to meld structured transactional data with largely unstructured interaction data (including social media). Informatica already offers connectors to popular databases such as Oracle, DB2, Teradata, and IBM Netezza, and is planning to put purpose-built advanced SQL analytic databases onto its price list, including Teradata/Aster Data, EMC Greenplum, and HP Vertica. Informatica has taken the logical first step in supporting social network integration by adding connectors for the published Twitter, LinkedIn, and Facebook APIs.
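To make the skills gap concrete: the MapReduce programming model that Hadoop parallelizes across a cluster is unfamiliar to most SQL-trained developers. The sketch below is a minimal, single-process illustration of that model in Python, invented for this note; it is not Informatica or Hadoop code.

```python
from collections import defaultdict

def map_phase(records):
    """Map stage: emit (key, 1) pairs -- here, a word count over text records."""
    for record in records:
        for word in record.split():
            yield word.lower(), 1

def shuffle(pairs):
    """Shuffle/sort stage: group emitted values by key, as Hadoop does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(grouped):
    """Reduce stage: aggregate each key's values."""
    return {key: sum(values) for key, values in grouped.items()}

records = ["big data is big", "data integration"]
counts = reduce_phase(shuffle(map_phase(records)))
```

Hadoop distributes the map and reduce stages across cluster nodes; tools like Hive and Pig exist precisely to hide this model behind more familiar query languages, which is the comfort level the vendors are racing toward.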


Informatica has also enhanced its B2B Data Exchange Transformation product to make it easier to connect to other interaction data gleaned from call detail records (CDRs), device/sensor data, scientific data (genomic and pharmaceutical), and large image files (through managed file transfer). Although the initial set of social media adapters is prescriptive to certain sites, Ovum expects Informatica to eventually offer a software development kit (SDK) approach that provides flexible connectivity to broader social media data sources.

Informatica is not alone in providing support for loading data into and accessing data from Hadoop. The race is on to provide a standardized set of visual Hadoop-focused tools that build around pillars such as MapReduce and access and transformation languages such as Hive and Pig. The leader will be the one that makes the NoSQL environment comfortable enough for the SQL developer mainstream.
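The integration task these social connectors perform is essentially flattening semi-structured API payloads into rows a warehouse can consume. The following sketch shows that idea with a hypothetical payload; the field names are invented for illustration and do not match any real Twitter, LinkedIn, or Facebook API response.

```python
import json

# Hypothetical social-media payload; real social network API responses
# use different field names and structures.
payload = json.dumps({
    "posts": [
        {"user": {"id": "u1", "name": "Ann"}, "text": "Loving the new release", "likes": 3},
        {"user": {"id": "u2", "name": "Bob"}, "text": "Upgrade went smoothly", "likes": 1},
    ]
})

def flatten(raw):
    """Turn nested interaction data into flat records a target system can load."""
    rows = []
    for post in json.loads(raw)["posts"]:
        rows.append({
            "user_id": post["user"]["id"],
            "user_name": post["user"]["name"],
            "text": post["text"],
            "likes": post["likes"],
        })
    return rows

rows = flatten(payload)
```

An SDK approach, as Ovum anticipates, would let customers supply this payload-to-row logic themselves rather than waiting for a site-specific adapter.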

MDM gets tightened integration with the rest of the platform


Siperian is one of Informatica's more recent and watershed acquisitions, and one of the company's biggest challenges for this release was tighter integration with the MDM technology that came from it. This helps organizations to deliver "authoritative and trustworthy data." Informatica's first move, in the 9.0 release, was to allow customers to define data quality rules that could be applied to data integration. The 9.1 release further advances integration across the platform by allowing end users to reuse the same data quality rules in the MDM environment. Hence, data quality policies can be surfaced and reused across data profiling, data cleansing, and MDM as a single process. The key benefits are better governance (avoiding conflicting data quality rules being applied across systems) and safeguarding existing investments in data quality rule standardization and skills (allowing them to be retained and transferred over to the MDM environment).

Siperian provided a comprehensive multi-data-domain solution (customer, product, chart of accounts, location, etc.), but architecturally it was rigid. That has changed in 9.1, which supports multiple MDM deployment styles: registry, single-instance/consolidated hub, coexistence, analytical, transactional, or federated via the cloud or a service-oriented architecture. Further flexibility comes from added features that prevent duplicate master data from being created, make master data entity hierarchies and relationships more visible (within the Data Director tool), and enhance registry services for quicker on-boarding and updating of metadata (primarily through messaging) and more targeted master data search.
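The governance benefit of rule reuse can be made concrete with a small sketch: one data quality rule, defined once, applied both during cleansing and during MDM duplicate detection, so the two steps cannot disagree about what "the same value" means. Everything here is invented for illustration; these are not Informatica APIs.

```python
import re

def normalize_email(value):
    """A single data quality rule: trim, lowercase, and validate shape."""
    value = value.strip().lower()
    if not re.fullmatch(r"[^@\s]+@[^@\s]+\.[^@\s]+", value):
        raise ValueError(f"invalid email: {value!r}")
    return value

# Cleansing: apply the rule while loading records.
records = [{"id": 1, "email": " Ann@Example.COM "},
           {"id": 2, "email": "ann@example.com"}]
for r in records:
    r["email"] = normalize_email(r["email"])

# MDM matching: the same rule keys duplicate detection, so cleansing and
# mastering agree on what counts as the same email address.
seen = {}
duplicates = [r["id"] for r in records
              if seen.setdefault(r["email"], r["id"]) != r["id"]]
```

With separately maintained rules, the cleansing step might lowercase while the matching step does not, letting the duplicate above slip through; that is the conflicting-rules problem the shared-rule approach avoids.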

9.1 encourages users to be self-sufficient


This release also comes with a long list of functional upgrades across the staple tools of the Informatica suite, such as data quality, data profiling, federation and virtualization, application ILM, event processing, and low-latency messaging. There are simply too many to do each one justice in this research note. However, one common thread that stands out across many of these enhancements in 9.1 is a continued focus on self-service provisioning of (in Informatica parlance) "authoritative and trustworthy" data. Informatica has worked hard to make its core business more accessible to a broader, non-technical audience. This is a challenge, as data integration is a complicated IT task that has traditionally been the almost exclusive preserve of skilled DBAs and developers.

Notable functionality to support this accessibility initiative includes the introduction of so-called "proactive data quality assurance" services to identify data exceptions more quickly. This is based on a complex event processing (CEP)-like model, which allows ETL developers to use comparative profiling analysis to map data quality rules and logic against data profiles at early stages of the transformation pipeline, in order to prevent costly errors from surfacing downstream. The model works by dynamically generating and comparing profiles of data as it flows through the mapping pipeline. It also enables "top-down" validation of actual versus expected data in data integration projects, which is particularly useful when upgrading applications.

There is also a new interactive, self-service Data Integration Analyst workbench for data analysts and data stewards, which extends a similar capability introduced for data quality analysts in the 9.0 release. This workbench aims to empower non-technical users who are close to the business, and arguably have a better business understanding of the data, to define their own data integration mappings and routines without having to constantly toggle back to IT developers.
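The comparative profiling idea behind "proactive data quality assurance" can be sketched simply: profile a column before and after a transformation stage and flag drift early, rather than letting it surface downstream. The profile fields and thresholds below are assumptions for illustration, not Informatica's.

```python
def profile(values):
    """Generate a simple column profile: row count, nulls, distinct values."""
    non_null = [v for v in values if v is not None]
    return {
        "rows": len(values),
        "nulls": len(values) - len(non_null),
        "distinct": len(set(non_null)),
    }

def compare(before, after, max_new_nulls=0):
    """Compare stage profiles and surface exceptions as readable messages."""
    issues = []
    if after["rows"] != before["rows"]:
        issues.append("row count changed in transformation")
    if after["nulls"] - before["nulls"] > max_new_nulls:
        issues.append("transformation introduced nulls")
    return issues

source = ["DE", "FR", "FR", "US"]
# A (deliberately incomplete) lookup stage maps unknown codes to None.
lookup = {"DE": "Germany", "FR": "France"}
transformed = [lookup.get(code) for code in source]

issues = compare(profile(source), profile(transformed))
```

The incomplete lookup table is caught at the transformation stage, which is the "top-down" actual-versus-expected comparison the note describes.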
The creation and validation of source-to-target mappings is handled through a browser-based, guided interface that enables business analysts and data stewards to pinpoint data using business terms, define source-to-target mappings, selectively apply transform rules (including ETL and data quality) from a predefined inventory, validate the rules on the fly, and preview the results of their specifications. For example, analysts can find and navigate data sources and targets using metadata such as a business glossary or data lineage trails; specify, save, and share their own transformation logic with other analysts, projects, or both; and embed existing ETL mapping logic and data quality rules into their specification. The Data Integration Analyst tool then automatically generates the relevant PowerCenter or Informatica Data Services (IDS) transformation mapping logic, which can be deployed as virtualized SQL views, published web services, or batch ETL routines.
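The workbench's declarative style can be illustrated with a toy version: an analyst specifies source-to-target mappings with reusable rules, and an engine applies the specification to rows. The spec format and engine below are invented for this sketch; Informatica generates PowerCenter or IDS mapping logic, not Python.

```python
# A declarative mapping spec of the kind an analyst might define:
# each entry names a target field, a source field, and a reusable rule.
mapping = [
    {"target": "customer_name", "source": "name", "rule": str.title},
    {"target": "country",       "source": "ctry", "rule": str.upper},
    {"target": "revenue",       "source": "rev",  "rule": float},
]

def apply_mapping(row, spec):
    """Apply each mapping entry's rule to its source field, producing the target row."""
    return {m["target"]: m["rule"](row[m["source"]]) for m in spec}

source_row = {"name": "acme corp", "ctry": "de", "rev": "1200.50"}
target_row = apply_mapping(source_row, mapping)
```

The point of the design is that the analyst only edits the declarative spec (and can preview results on sample rows); how the mapping executes, as a SQL view, a web service, or a batch routine, is the engine's concern.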

9.1 adds greater project awareness to data virtualization


Another notable addition to 9.1 is so-called adaptive data services, which wrap project-specific context and intelligence into the data federation creation and delivery process. This allows data from single sources to be delivered to meet the business needs of all projects, without having to reinvent the wheel for every project, while ensuring consistency.

Informatica leverages this data virtualization solution as part of the overall platform to enable physical and virtual data integration, depending on business needs. Informatica calls this "multiprotocol data provisioning." It is technically an extension of Informatica's core data services architecture, and exposes data through SQL endpoints via ODBC or JDBC, as web services, or through PowerCenter as batch processes. The key benefit is governance, since the multi-provisioning is based on common logical data object and policy definitions.
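The governance argument is that all delivery protocols share one logical definition. A minimal sketch of that idea, using SQLite as a stand-in data source (the setup is invented for illustration; Informatica's actual endpoints are ODBC/JDBC, web services, and PowerCenter batch):

```python
import sqlite3

# One logical data object definition, shared by every delivery protocol.
LOGICAL_OBJECT = "SELECT id, name FROM customers WHERE active = 1 ORDER BY id"

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, name TEXT, active INTEGER)")
conn.executemany("INSERT INTO customers VALUES (?, ?, ?)",
                 [(1, "Ann", 1), (2, "Bob", 0), (3, "Cleo", 1)])

# Delivery 1: a virtual SQL view backed by the shared definition.
conn.execute(f"CREATE VIEW active_customers AS {LOGICAL_OBJECT}")
via_sql = conn.execute("SELECT id, name FROM active_customers").fetchall()

# Delivery 2: a batch extract reusing the identical definition, so both
# consumers see the same governed result set.
via_batch = [{"id": i, "name": n} for i, n in conn.execute(LOGICAL_OBJECT)]
```

Because both paths resolve to the same definition, a policy change (say, a stricter filter) made once in the logical object propagates to every consumer, which is the governance benefit claimed above.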

APPENDIX
Disclaimer
All Rights Reserved. No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form by any means, electronic, mechanical, photocopying, recording or otherwise, without the prior permission of the publisher, Ovum (a subsidiary company of Datamonitor plc). The facts of this report are believed to be correct at the time of publication but cannot be guaranteed. Please note that the findings, conclusions and recommendations that Ovum delivers will be based on information gathered in good faith from both primary and secondary sources, whose accuracy we are not always in a position to guarantee. As such Ovum can accept no liability whatever for actions taken based on any information that may subsequently prove to be incorrect.

