
White Paper
www.infosys.com
Abstract
Enterprise Business Intelligence (BI) solutions today analyze growing amounts of data. Increasingly, the data is
historical in nature, coming from within the enterprise as well as from external channels such as the Web, mobile, and
devices. This has driven data volumes to alarming levels. In traditional BI implementations, this
information explosion, along with increasing demands on computational power to process high volumes of data,
has been managed through expensive hardware and software upgrades. This is a highly inefficient way to meet the
demands of a growing business, and one that enterprises consider economically unfavorable.
With the global scale of operation of large enterprises, the need of the hour is to make information available to
partners, remotely located analysts, and managers who are on the move. This in turn places additional demands on
infrastructure and IT.
This white paper discusses how cloud computing can help address these challenges with its round-the-clock availability
and its dynamic, scalable nature. Cloud infrastructure is beneficial for offloading BI storage, long-running
processes, and erratic load behaviors. The solution proposed in this paper is an alternative BI
architecture that extends the existing BI infrastructure.
Business Intelligence Solutions on Windows Azure
- Sidharth Subhash Ghag
BI Process
Overview
Primarily, a BI solution has two parts: data storage and analysis. The stored raw data is an asset that needs to be cleansed and processed to
derive information for decision making. The information has to be presented to decision makers in an intuitive and highly interactive
manner, so that key strategic decisions can be made in the least possible time. BI relies on data warehousing (a data repository designed
to support an organization's decision making). Ineffectively managed data warehouses make it difficult for organizations to quickly extract
the data needed for analysis and practical decision-making.
The BI process can be represented using the following diagram:

Figure 1: BI Process
Online Transaction Data
Online transactional data (operational data) from multiple systems (finance, sales, and CRM) is extracted, processed to eliminate data
redundancy, and optimized for storage in a data warehouse. The purpose of creating a data warehouse is to bring information from
heterogeneous systems onto a common data storage platform.
Data Warehouse
A data warehouse is an independent master store of all the historical transactional data of an enterprise. Extracting transactional data from
multiple systems and then cleansing it for further analysis is the most important activity in establishing a data warehouse.
The process of accumulating data largely depends on the source systems from which the data is retrieved. Typically, this process
is customized to handle the multiple data sources and data rules, easing the transformation of data from disparate systems
into a single storage platform.
Data Marts
Although a data warehouse is a storehouse for voluminous data, it is difficult to process complex analytical queries or jobs directly off the data
warehouse. Thus, the data warehouse is broken down, logically or sometimes physically, into smaller analysis units called data marts. Data marts
can be conceptualized as units of data storage dedicated to a particular analysis, generated using specific filters and queries. Data marts
contain specialized multi-dimensional data structures called data cubes. Unlike relational database tables, which have only two dimensions
(row and column), a data cube has multiple dimensions.
Typical data mart queries include how grocery product sales fared over the last six months, or how a promotion performed over the last six
months in the southern region. Data marts are useful for such focused analysis.
Since the data warehouse is responsible for storing high volumes of historical and ever-growing data, a data warehouse solution should be
cost-effective, reliable, and always available to the other components for analysis and reporting.
Reports, Dashboards and Key Performance Indicators (KPIs)
Analysis is the process of slicing and dicing a set of information to interpret a pattern that can be used to explain an observed impact or to
support further planning. The analytics engine works on the data marts. The purpose of the analytics engine is to execute complex queries and present data with
multiple dimensions and measures. Dimensions and measures are key parameters in BI that help slice and dice information to make it more
precise for decision makers.
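To make dimensions and measures concrete, here is a minimal, self-contained Python sketch (not from the original paper) that slices an illustrative set of fact rows by two dimensions and aggregates a revenue measure; all data and field names are invented for illustration:

```python
from collections import defaultdict

# Illustrative fact rows: each carries dimension values (region, month)
# and a measure (revenue). Real data marts hold these in cube structures.
facts = [
    {"region": "South", "month": "2011-07", "revenue": 1200.0},
    {"region": "South", "month": "2011-08", "revenue": 950.0},
    {"region": "North", "month": "2011-07", "revenue": 700.0},
]

def slice_and_dice(rows, dimensions, measure):
    """Aggregate a measure over the chosen dimensions (one slice of the cube)."""
    totals = defaultdict(float)
    for row in rows:
        key = tuple(row[d] for d in dimensions)
        totals[key] += row[measure]
    return dict(totals)

# Sales by region and month, akin to the grocery/promotion queries above.
print(slice_and_dice(facts, ("region", "month"), "revenue"))
# Roll up to a single dimension by dropping 'month'.
print(slice_and_dice(facts, ("region",), "revenue"))
```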
Data presentation is a crucial component of analysis. The richer the presentation of the data, the easier it is for decision makers
to examine the information. The presentation layer delivers reports, KPI matrices, and dashboards to the end user for slicing and
dicing information. These rich reports also support what-if scenario analyses.
A BI system is an aggregation of multiple systems and sub-systems. Data storage, information slicing and dicing tools, and reporting or rich
visualization interfaces are some of the multiple sub-systems of any typical BI system. This peculiarity of structure and integration creates
inherent challenges. Let us look at the typical challenges faced by enterprises in implementing and using BI solutions.
BI Implementation Challenges
Intermittent demands for storage
Since the data warehouse is the backbone of the entire BI solution, it is important to manage it properly and keep
it running at all times. The data warehouse stores large datasets, and it is not possible to keep all of the data active
for on-demand analysis. In certain scenarios, historical data that has been inactive for some time may need to be
reactivated. Activating historical data involves obtaining the backup tapes, retrieving the data, and loading and fitting it into the currently
active data warehouse or data marts, none of which is simple. Even if such a situation arises only once a month, it still
consumes a considerable amount of IT operational resources. Storage demand increases with every such request, because activated
historical data adds to, rather than replaces, the currently active data. The need for extra storage capacity adds to the hardware investment
and the pressure of managing it.
Sub-optimal utilization of resources
As BI solutions remain in place for many years, it is highly likely that the number of users, the size of the storage, and the complexity of the
systems will all increase. Growth in users adds pressure on the scalability of a solution that may have been provisioned long ago.
Alternatively, an organization may have anticipated rapid growth in the number of users and planned its storage
and other infrastructure capacities upfront. In such cases, the system is likely to remain underutilized, losing
the opportunity to use the same investment elsewhere. Scalability is therefore crucial to both the utilization and the smooth
running of the system.
Lacking external dimension
On-premise BI solutions are mostly oriented around the transactional data of the enterprise. They lack the external dimensions and
measures of analysis that are important for strategic analysis. A combination of internal data, such as sales data, and external data, such as
government-collected data and industry trends, can provide better insight and support effective strategies.
External environmental data is available through different data marketplaces and can enhance the quality of analytics. The increasing
demand to factor external entities into the analysis adds pressure on the design and flexibility of BI solutions. Often,
enterprises end up developing their own components or smaller, independent BI solutions to accommodate these external entities.
Lacking multi-channel delivery capabilities
Most enterprises have a workforce spread all over the world. These geographically distributed stakeholders demand round-the-clock
availability and accessibility from any place. Enterprises that did not factor in this demand have ended up spending huge amounts of
money and resources to address it. The need to make data warehouses and BI solutions available over the Internet through multiple delivery
channels, such as RIA, services, mobile, and browsers, keeps increasing. This quick, easy, always-on accessibility gives enterprises an edge,
helping them collaborate better and make decisions quickly. Thus, it becomes essential for enterprises to make their BI platform
available over the Internet. This requirement not only demands additional infrastructure investment, but also adds
integration touch points.
Present-day businesses operate in highly dynamic environments influenced by factors such as changing business scenarios, evolving
compliance and governance processes, new integration requirements that add to system complexity, and increasing pressure
on systems to be responsive. These challenges multiply with the increasing demand for dynamism in business, processes, and
technologies. It is important for every enterprise to address these challenges and get the best results from its BI investment.
BI Solution Based on Cloud Computing
With more and more devices getting meshed and inter-connected on the information highway, the demand for data and everything related to it
will grow manifold. This information explosion will create the need for systems that can:
- Process large amounts of data efficiently and in near real-time
- Store the data flowing in from various systems and devices in storage units that can hold large volumes
The figure shown below depicts a typical information flow landscape of a large enterprise in the future. A BI solution thus has to meet the
high-volume requirements of an enterprise that constantly exchanges information with multiple stakeholders, systems, and devices as part
of its day-to-day operations.
Cloud computing, a new-generation platform for deploying and delivering software services, addresses the growth requirements
of an enterprise and the commonly faced BI challenges. The value proposition of cloud computing, which can address the needs of
the BI platform of the future, includes:
- The capability to process voluminous, rapidly growing data over the Internet
- Replication of machines, applications, and data storage across multiple instances to provide high availability
- Dynamic, elastic capability to scale infrastructure up and down within minutes
Improved Cost Efficiency
In terms of managing complexity and Total Cost of Ownership (TCO), cloud storage solutions are relatively more appealing than traditional
RDBMS data solutions, especially in a data warehouse scenario dealing with historic or inactive data. With cloud storage, data
can be kept active at all times, without requiring IT intervention to activate historical data. Thus, cloud storage addresses
the challenge of intermittent data storage access, particularly when there is an urgent need to reload historical data, say to meet
compliance-related queries.
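As an illustration of reloading archived data on demand, the sketch below uses the current Python SDK (azure-storage-blob); the paper's 2011-era stack would have used the .NET storage client instead, and the connection string, container, and blob names here are placeholders:

```python
from azure.storage.blob import BlobServiceClient

# Placeholder connection string; a real one comes from the Azure portal.
CONN_STR = "DefaultEndpointsProtocol=https;AccountName=...;AccountKey=..."

def reload_historical_partition(container: str, blob_name: str) -> bytes:
    """Fetch an archived data extract directly from cloud storage.

    With on-premise tape archives this step meant locating, mounting, and
    restoring media; against blob storage it is a single authenticated read.
    """
    service = BlobServiceClient.from_connection_string(CONN_STR)
    blob = service.get_blob_client(container=container, blob=blob_name)
    return blob.download_blob().readall()

# e.g. data = reload_historical_partition("dw-archive", "sales/2008-q3.csv")
```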
Figure 2: Typical Azure Business Intelligence Eco-System (content providers, field devices/appliances, delivery channels, regulatory agencies, partners, suppliers, and customers exchanging information with the enterprise's portal and reporting layer, transformation engine, analytical engine, and data warehouse (DW) across geographies)
Elastic and Scalable
A cloud-based solution offers users the capability to provision cloud resources such as computing, storage, and cache services
instantaneously. This infrastructure-level flexibility allows workload fluctuations, both planned and unplanned, to be handled elastically
without upfront investment planning. The elastic and scalable nature of the cloud, along with the pay-as-you-go model,
aligns well with enterprise needs, giving the business a more transparent and assured view of its IT resource consumption.
Interoperable
Since the cloud is available over the Internet and readily provides interoperable endpoints such as REST and SOAP, the architecture supports
easy integration with external services. Relatively easy and quick integration with externally available endpoints enables enterprises
to add external dimensions to their analysis. These rich sets of external dimensions give the enterprise a platform to factor additional
considerations into its analysis, be it competitor data, national/international growth data, neighborhood safety, climate effects, or new stores
or services in the neighborhood.
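A minimal sketch of pulling an external dataset over REST and attaching it to internal facts as an extra dimension; the endpoint URL, field names, and payload shape are all hypothetical assumptions for illustration:

```python
import json
import urllib.request

# Hypothetical REST endpoint of an external data provider; a real marketplace
# feed would have its own URL scheme and require an API key.
WEATHER_URL = "https://example.com/api/weather?region={region}"

def external_dimension(region: str) -> dict:
    """Fetch an external dataset over plain HTTP/REST and parse the JSON body."""
    with urllib.request.urlopen(WEATHER_URL.format(region=region)) as resp:
        return json.load(resp)

def enrich(fact_row: dict) -> dict:
    """Attach the external attributes to an internal fact row for analysis."""
    weather = external_dimension(fact_row["region"])
    return {**fact_row, "avg_temp": weather.get("avg_temp")}
```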
Available Anytime Anywhere
The cloud is ubiquitously available and can be accessed through standard HTTP protocols. Enterprises do not have to spend extra money or
resources to make the solution available over the Internet, and concerns such as provisioning and hardening become inconsequential.
The cloud helps enterprises support multiple delivery channels, allowing information to easily reach stakeholders including employees, mobile field
agents, and external partners.
Even as the cloud computing platform grows, vendors keep adding to the rich set of building blocks required to develop enterprise
applications on the cloud. The basic principle behind these building blocks is easy and quick integration. All vendors
are striving for open, interoperable integration standards, making it easier to use these enterprise application services on any cloud
platform. This also keeps systems agile enough to handle the changes demanded by dynamic business and technical needs.

These characteristics of the cloud computing platform make implementing large BI solutions possible in an easy and relatively
inexpensive manner. Cloud computing platforms are maturing, and cloud vendors are working hard to increase the functional and technical
richness of their offerings and drive innovation. These innovations will help enterprises manage better, make decisions more easily, and become
more competitive.
We will explore Microsoft Azure, a public cloud platform that offers Platform as a Service (PaaS), for developing the next-generation cloud-
based BI solution. PaaS offers hosted, scalable application servers with the necessary supporting services such as storage, security, and integration
infrastructure. A PaaS platform also provides development tools and application building blocks for developing custom solutions on the cloud.
Though we have selected PaaS for our proposed solution, there are two other cloud delivery models, Software as a Service (SaaS) and
Infrastructure as a Service (IaaS), which we discuss briefly later in this paper.
Azure-Based BI Solution
We will now attempt to explain a high-level design for a custom-built BI solution on Windows Azure.
Let us first get acquainted with the Azure terminology given in the following table:
Windows Azure: A cloud operating system platform that provides computing capability on the cloud
Azure Table Storage: An entity/key-value (tuple-store) service provided by Microsoft Azure to address large, structured, and scalable data storage
Azure Blob Storage: Large, scalable data storage made available by Microsoft Azure for unstructured data such as documents and media files
Azure Queue: A queue service offered by Microsoft Azure for message orchestration and asynchronous request processing
SQL Azure: A relational database capability, similar to SQL Server, made available by Microsoft Azure on the cloud
Web Role: A web server instance for running web applications, readily available at http/https endpoints. A web role is simply a web server provided by Microsoft Azure
Worker Role: A computing instance for executing long-running processes on Microsoft Azure
VM Role: A role used to run a virtual hard disk image, store that image in the cloud, and load and run it on demand. This role is highly suited for moving legacy applications to the cloud with minimal effort
AppFabric Service Bus: A service-bus-like messaging platform on the cloud that allows on-premise applications to be made available externally and to connect seamlessly with other systems
AppFabric Access Control Service (ACS): A claims-based authorization service that supports federated access to enterprise systems and services on the cloud. All authorization rules can be abstracted and managed from ACS, independently of the application, in a standards-oriented way
Windows Azure Data Marketplace: An information marketplace that acts as an external dataset provider, consumed by the BI stack to leverage external dimensioning metrics such as demographics, location, and other publicly available information to enrich analytical reporting
Windows Identity Foundation (WIF): An identity management framework that externalizes identity-related logic from an application. Federated single sign-on scenarios involving multiple stakeholders can be built on this framework. For the enterprise, it also helps integrate on-premise Active Directory-based authentication with the Azure-deployed application

High-Level Design for Custom-Built BI Solution on Azure
Owing to concerns around data privacy, security, and data ownership, enterprises have been cautious in adopting cloud computing. At the
same time, however, they have shown a keen interest in leveraging the value proposition offered by the cloud and the potential opportunity
it presents for growing their businesses.
Keeping these key aspects in mind, a hybrid BI solution is proposed to alleviate these enterprise challenges. As shown in the figure below, the
proposed solution divides the architecture into two distinct facets: on-premise components and cloud components.

Figure 3: High-Level Design for Custom-Built BI Solution on Azure

On-Premise Components
Data Cleansing and Profiling Agent
This agent is responsible for collating transactional and unstructured data from on-premise systems, cleansing the data, and uploading
it to a data warehouse built on Azure table storage. The component can be extended to cover disparate data sources such as Oracle,
SQL Server, mainframes, and Excel data. Cleansing and profiling are also configurable to handle business-specific rules, for example, that
soft-deleted data and transactional data not in the published state should not be uploaded.
The data transfer from the agent to the cloud happens over a secured channel. This agent is usually part of the Extract Transform Load
(ETL) component.
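A minimal sketch of such rule-driven filtering, in plain Python; the field names (is_deleted, state) and sample rows are assumptions matching the examples above, not part of any real agent:

```python
# Illustrative cleansing rules: drop soft-deleted rows and rows not yet
# in the 'published' state, as described in the text above.
RULES = [
    lambda row: not row.get("is_deleted", False),
    lambda row: row.get("state") == "published",
]

def cleanse(rows):
    """Yield only the rows that pass every configured business rule."""
    for row in rows:
        if all(rule(row) for rule in RULES):
            yield row

transactions = [
    {"id": 1, "state": "published", "is_deleted": False, "amount": 120.0},
    {"id": 2, "state": "draft", "is_deleted": False, "amount": 45.0},
    {"id": 3, "state": "published", "is_deleted": True, "amount": 80.0},
]
print(list(cleanse(transactions)))  # only row 1 survives
```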
Data Integration Layer
Based on the criticality of the information, an enterprise may have its structured data categorized into different levels. We will discuss data
integration approaches covering both mission-critical and non-mission-critical data.
Exposing master data on the cloud without having to upload it to cloud storage keeps data privacy
and ownership in the hands of the enterprise. This avoids the need to physically store confidential data, such as credit card details,
customer addresses, and employee salary information, on the cloud. The data is instead fetched from the enterprise as and
when required.
An on-premise component that forms part of the integration layer helps expose the master data to the cloud. Technically, this can
be achieved by leveraging the Azure AppFabric service bus. With its service virtualization capabilities, the Azure AppFabric service bus
allows on-premise components or services to be exposed on the cloud without physically moving the data outside the enterprise.
The AppFabric service bus provides a publicly accessible virtual endpoint on the cloud for any on-premise service endpoint it manages. The
communication channel between the Azure AppFabric service bus and the on-premise service can be secured at the transport level using
SSL, and at the message level using standard encryption and signing techniques.
To avoid the latency issues that can arise from the extra network hop between the on-premise and cloud
environments, distributed caching functionality can be implemented on the cloud. The analytical engine deployed on the cloud can
embed a caching component, such as the Azure AppFabric Cache, to cache regularly used master data and thereby reduce the effects
of latency.
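A minimal cache-aside sketch of this idea: an in-process dictionary stands in for a distributed cache such as AppFabric Cache, and a stub function stands in for the service-bus call back to the enterprise; the TTL and data are illustrative:

```python
import time

# In-process dict standing in for a distributed cache (e.g. AppFabric Cache).
_cache: dict = {}
TTL_SECONDS = 300

def fetch_master_data_from_enterprise(key: str) -> dict:
    """Stub for the service-bus call back to the on-premise master store."""
    return {"customer_id": key, "segment": "retail"}

def get_master_data(key: str) -> dict:
    """Cache-aside: serve from cache when fresh, otherwise fetch and store."""
    hit = _cache.get(key)
    if hit and time.time() - hit["at"] < TTL_SECONDS:
        return hit["value"]
    value = fetch_master_data_from_enterprise(key)  # the slow network hop
    _cache[key] = {"value": value, "at": time.time()}
    return value

print(get_master_data("C-42"))  # first call fetches, later calls hit cache
```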
Data integration achieved using service virtualization addresses data security concerns, but at the cost of performance. For non-critical
data, it is therefore advisable to transport the data and have it reside physically on the cloud, closer to the hosted application. This
can be achieved with existing data integration techniques such as ETL, Change Data Capture (CDC), and Enterprise Information
Integration (EII), implemented using a tool such as Microsoft's SQL Server Integration Services (SSIS).
Power Pivot
Power Pivot for Excel is a data analysis tool that delivers substantial computational power directly within MS Excel, a tool
with which users are already well acquainted. Power Pivot offers a user-friendly way to perform data analysis using familiar Excel features
such as the common MS Office user interface, PivotTable and PivotChart views, and slicers. Power Pivot lets users analyze data marts offline,
without being connected to the online data marts, enabling focused analysis that on-premise and on-the-move
analysts can perform at their own convenience.
ADFS 2.0
ADFS 2.0 is an identity provider service that enables an enterprise-level identity federation solution. It is built on Windows Identity
Foundation (WIF) and makes it easy to integrate web applications with authentication/authorization from on-premise Active Directory
user stores. The BI portal proposed here implements claims-based authentication using WIF and ADFS 2.0, allowing enterprise
users to log in to the system with their existing Active Directory credentials.
Azure Components
Cloud Data Warehouse
All the collated data uploaded by the cleansing and profiling agent is stored in Azure table storage. Azure table storage is highly
scalable and a good fit for persisting de-normalized data due to its entity-attribute-value (tuple-store) style of storage. No analytical
processing or advanced queries would be run directly on the data warehouse; hence, the cheaper Azure table storage is a
better option than relational stores such as SQL Azure. Azure storage, through blobs, can also persist metadata of the data
warehouse along with unstructured data such as files, documents, scanned images, and video files.
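To illustrate what a de-normalized warehouse row looks like as a table-storage entity, here is a sketch using the current Python SDK (azure-data-tables); the 2011-era .NET API differed, and the table name, partition key scheme, and fields are all assumptions:

```python
from azure.data.tables import TableServiceClient

# Placeholder connection string supplied by the Azure portal.
CONN_STR = "DefaultEndpointsProtocol=https;AccountName=...;AccountKey=..."

def upload_fact(service: TableServiceClient, row: dict) -> None:
    """Persist one de-normalized warehouse row as a table-storage entity.

    Every entity needs a PartitionKey and a RowKey; here we partition by
    sales region and month so that related rows can be queried together.
    """
    table = service.create_table_if_not_exists(table_name="SalesWarehouse")
    entity = {
        "PartitionKey": f"{row['region']}-{row['month']}",
        "RowKey": str(row["transaction_id"]),
        "revenue": row["revenue"],
    }
    table.upsert_entity(entity)

# service = TableServiceClient.from_connection_string(CONN_STR)
# upload_fact(service, {"transaction_id": 42, "region": "South",
#                       "month": "2011-07", "revenue": 120.0})
```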
The inexpensive storage capability delivered by table storage frees data warehouse administrators from having to deactivate historical data, a
practice often followed in earlier BI systems due to the capacity limitations of on-premise storage. The CAPEX spending normally
involved in expanding storage to meet enterprise growth is also eliminated. However, under the pay-as-you-use pricing model of Windows
Azure services, OPEX spending rises, though it tends to align more closely with the demands of the growing business.
A detailed assessment of the existing system, along with a year-on-year ROI analysis of the Azure platform, can provide a clear picture
of the overall savings and business value that can be realized.
Analytical Engine
The analytical engine is the most important component in the BI solution. The analytical engine:
- Prepares data required for focused analysis
- Applies algorithms for processing data based on different facts, measures, and dimensions
- Analyzes structured and unstructured information to reveal patterns and predict trends that are usually difficult to spot with the naked eye or through traditional reporting
- Identifies cases or exceptions in the data to isolate anomalies
As of now, SQL Server Analysis Services is not provided as part of the SQL Azure services. Hence, this custom component must be built
to provide analysis services, cube formation, and cube-querying functionality on SQL Azure.
In the proposed solution, the analytical engine has the following parts:
- Batch Process (Azure worker role): This Azure worker role is responsible for the creation of data marts and offline reports (a minimal sketch of the queue-driven pattern follows this list).
  - Data-Mart Processor: Responsible for creating new data marts (SQL Azure tables) from the data warehouse (Azure table storage) for focused analysis. The multiple requests submitted by analysts from the BI portal to create data marts are handled asynchronously as batch-processed requests, implemented using Azure queues.
  - Offline Report Generator: Responsible for periodically generating standard reports and storing them in Azure blobs so that they are readily available to the BI portal. This component generates standard reports as per the configuration stored in Azure table storage.
- Real-Time Analytics (Azure web role): This Azure web role is one of the most important components used for analysis. It is responsible for fetching data from the data marts and presenting it on the BI portal. Presentation of dynamic reports and KPI matrices on the BI portal, and generation of ad-hoc reports on existing data marts, are achieved through this component. It services analysis requests synchronously against the existing data marts, making real-time analysis possible.
- Data Marts: Since the proposed data warehouse is created using Azure table storage, which is based on an entity-value schema and is non-relational, we propose to create the data marts as SQL Azure tables. This is primarily so that existing analytical engines can leverage the premium RDBMS capabilities offered by SQL Azure on the cloud without any changes. SQL Azure is a relational database and makes it easy to fetch data using complex analytical queries. Power Pivot provides a quick and powerful analysis tool for use with SQL Azure. Moreover, the BI portal can generate the desired reports and analyses from SQL Azure.
- Application Data: Application data comprises the configuration and customization data required as part of the BI solution.
- SQL Azure Reporting Services Reports: As part of the BI solution, standard reports can be configured using SQL Azure Reporting Services (SARS) and made available from the BI portal.
- Standard Reports: As part of the BI solution, standard reports need to be generated on the data using specific dimensions and measures. These reports can be generated in a batch process to reduce latency and made available at all times. As explained previously, the batch analytics component running on the Azure worker role generates these reports periodically.
- BI Portal: This is the web portal hosted on an Azure web role. It interacts with the analytical engine to generate dashboards, ad-hoc reports, and visual analyses of data across multiple dimensions and measures. The BI portal is accessible everywhere over the Internet and is made available over multiple delivery channels including desktop, mobile, and PDAs.
- Windows Azure Data Marketplace Dataset (External Measures): The analytics engine can be configured to use specific datasets exposed by the Windows Azure Data Marketplace. These datasets are used as external measures, alongside the data mart measures, for analysis. Examples of such datasets include demographic information about customers, upcoming businesses/stores in nearby locations, and weather conditions impacting sales for a specific location.

Note: As of the Windows Azure 1.6 release (November 2011), running SSAS on Azure VM roles is not supported by Microsoft. Hence, until Microsoft recognizes SSAS as a first-class citizen of the cloud, we suggest using the data-mart processor approach.
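The batch-process component above hands data-mart requests from the BI portal to worker roles through a queue. Here is a minimal sketch of that producer/consumer pattern, using Python's in-process queue as a stand-in for an Azure queue and a thread as a stand-in for a worker role; all names are illustrative:

```python
import queue
import threading

# Stand-in for an Azure queue; a worker-role instance would poll the real
# cloud queue instead of this in-process one.
requests: queue.Queue = queue.Queue()

def submit_data_mart_request(name: str, filters: dict) -> None:
    """BI-portal side: enqueue the request and return immediately."""
    requests.put({"mart": name, "filters": filters})

def worker() -> None:
    """Worker-role side: drain requests and build each data mart in turn."""
    while True:
        req = requests.get()
        if req is None:  # shutdown sentinel
            break
        print(f"building data mart {req['mart']} with {req['filters']}")
        requests.task_done()

t = threading.Thread(target=worker)
t.start()
submit_data_mart_request("grocery_south", {"region": "South", "months": 6})
requests.put(None)
t.join()
```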
Design Considerations
- Geo-location and affinity group: Applications developed on Windows Azure can be deployed across multiple data centers located
around the world: South Central US, North Central US, West Europe, North Europe, East Asia, and Southeast Asia. The Windows Azure global
footprint is growing rapidly as Microsoft continues to build new data centers for Azure deployment. Selecting appropriate data
centers and creating an affinity group for deployment should be considered for the following reasons:
  - Regional Legislations/Regulations: These address regulatory requirements for deploying the application and its data within a specific geographical location. Some compliance requirements oblige organizations to keep their data geographically close to the region of business operations; these can be addressed by deploying the Azure application in an appropriate data center.
  - Performance: Data center proximity to end users helps reduce network latency and improves overall application performance. Creating an affinity group for application and data instances deploys these components within the same data center, bringing them closer together. Inter-process communication within the same affinity group is faster and improves application performance, especially when large data transfers are involved in activities such as reporting and data mart creation.
- Caching: Caching frequently used data, such as reference data and infrequently modified data, helps reduce data access calls and the
latency in serving requests. Moreover, since multiple roles run in the load-balanced Azure environment, a distributed caching system
such as Windows Azure AppFabric Caching or distributed Memcached should be considered.
- Partition keys for table storage: The partition keys used for the data warehouse should not create partitions so large that queries on Azure
become inefficient. Partition keys should be used in all queries for better performance.
- Communication security for data in transit: Transport-level security should be ensured using SSL. For highly confidential data,
message-level security, such as encryption and signatures, should also be considered.
- Processing Model: Business use-cases should be analyzed to choose the appropriate processing model, online or batch. Long-running
processes can be scaled effectively using the worker role approach for computation tasks. Message-queue-based asynchronous
processing also provides data and processing reliability.
- SQL Azure Partition: Where the data-mart size grows beyond the 150 GB limit of a single SQL Azure database instance, consider
horizontal partitioning of selected tables. High-growth tables are candidates for partitioning, using range-based keys or a hash of the keys
to identify a specific partition (a minimal sketch of the hash approach follows this list).
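As a sketch of the hash-of-keys approach mentioned above, the snippet below maps a business key to one of a fixed set of database shards; the shard count, key format, and connection strings are illustrative assumptions:

```python
import hashlib

SHARD_COUNT = 4  # number of SQL Azure databases the table is split across

def shard_for(key: str) -> int:
    """Map a business key to a shard by hashing.

    A stable hash keeps each key on the same database between calls;
    range-based keys (e.g. by date) are the alternative for high-growth tables.
    """
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % SHARD_COUNT

# Hypothetical per-shard connection strings.
connection_strings = [
    f"Server=shard{i}.database.windows.net;..." for i in range(SHARD_COUNT)
]
print(connection_strings[shard_for("customer-00042")])
```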
Other Cloud-Based BI Implementation Models
According to the US-based National Institute of Standards and Technology (NIST), the cloud comprises three service models: SaaS, PaaS, and
IaaS. The design of the cloud-hosted BI solution explained in this paper was made within the boundaries of the PaaS service model,
realized using Microsoft Azure. The other cloud models available for implementing BI solutions are as follows:
- SaaS: This is the highest abstraction of the cloud. In this model, a finished application or solution is offered as a service. It is akin to
a packaged product, with support for limited customization, offered through the cloud. Since it is a standard packaged solution,
enterprises may face limitations in mapping their unique customizations and heterogeneous data stores onto it.
SaaS may be a good offering for smaller organizations with limited BI needs.
- IaaS: This is the lowest abstraction of the cloud. In this model, vendors provide basic hardware and software infrastructure as a
service. Customers need to deploy their own software, from the operating system up to the end application. With this model,
enterprises have to address software licensing and deployment themselves, which limits the benefits of the cloud
computing platform.
Enterprises can select their cloud platform based on the criteria described in the figure below, driven by the factors that make business sense in
their respective domains.
Figure 4: Business Intelligence Platform Evaluation Model (evaluating on-premise, private cloud, and public IaaS/PaaS/SaaS options against selection criteria such as flexibility, ease of management of hardware, software, and infrastructure, control, functional richness, application building blocks, security and compliance, time to market, and QoS covering scalability, availability, reliability, and performance; the preferred procurement choice ranges from buy for on-premise and IaaS, through build for PaaS, to subscribe for SaaS)

The above evaluation model summarizes the business value realized in implementing a cloud-based BI solution on the different cloud service
models. A model of this nature can guide enterprises in selecting the most appropriate cloud service by mapping the expected outcomes
of their BI initiatives to the business value realized from the different cloud service options available.
Concerns About BI in Cloud/Azure
The cloud platform addresses most of the challenges enterprises face in implementing and managing a traditional on-premise BI solution.
However, there are a few concerns around using the cloud for BI solutions. These concerns are common to any cloud implementation
and are not specific to BI. Let us briefly discuss them from a BI cloud-adoption point of view. The most talked-about concern is
data security and compliance.
Enterprises have concerns about placing their confidential data on the cloud, where it is replicated onto multiple servers. Technically,
cloud platforms treat all data in a similar fashion, and that raises information security concerns. Practically addressing this problem
requires amending compliance rules to keep pace with the evolution of the technology. At the same time, cloud vendors need to
provide mechanisms that meet compliance requirements more effectively. Until then, a hybrid solution like the high-level design
proposed in this paper, wherein critical data is stored on-premise but exposed as a service for integration and aggregation
while transactional data is stored in the cloud, is an option worth exploring.
Conclusion
As cloud computing evolves and grows, it will bring several distinct changes. We foresee changes in compliance
requirements and a shift in mindset toward optimized use of cloud technology from a decision-support-system perspective. BI, as
elucidated, has a peculiar nature and needs a customized solution approach. An integrated BI solution combining on-premise
and cloud-based deployments is the most suitable option available, not only to realize the benefits of the cloud but also
to address enterprise concerns around it.
This paper has discussed in detail how Microsoft Azure can be a good fit for an enterprise looking to optimize and future-proof its BI
solution. It also envisages an integration pattern for hybrid cloud and on-premise solutions developed using Windows Azure.
This pattern is not limited to BI solutions; it can also be applied in multiple problem domains such as disaster recovery, data backup, seasonal
campaigning, and collaboration solutions. We hope to see broad interest in developing green-field BI solutions, migrating
existing BI solutions, or using the proposed aggregation design to implement solutions on Windows Azure.
References
http://www.powerpivot.com/
http://msdn.microsoft.com/en-us/security/aa570351.aspx



Acknowledgement
Sachin Kumar Sancheti, Technical Architect, for his immense contribution in preparing the initial draft and for the technical input provided
during his tenure in the organization.
Yogesh Bhatt, Principal Architect, Infosys Labs, and Sudhanshu Hate, Senior Technology Architect, Infosys Labs, for reviewing the paper.
About the Author
Sidharth Subhash Ghag (Sidharth_ghag@infosys.com) is a Senior Technology Architect with the Microsoft Technology Center (MTC) in Infosys.
With several years of software industry experience, he currently leads solutions in Microsoft Technologies in the area of Cloud computing.
He has also worked in the areas of SOA and service-enabling mainframe systems and on domains such as Finance, Utilities, and Transportation.
He has been instrumental in helping Infosys clients with service orientation of their legacy systems. Currently, he helps customers adopt
Cloud computing within their Enterprise. He has authored papers on Cloud computing and service-enabling mainframe systems. Sidharth
blogs at http://www.infosysblogs.com/cloudcomputing

About Infosys
Many of the world's most successful organizations rely on Infosys to deliver measurable business value. Infosys provides business consulting,
technology, engineering and outsourcing services to help clients in over 30 countries build tomorrow's enterprise.
For more information, contact askus@infosys.com | www.infosys.com

© 2012 Infosys Limited, Bangalore, India. Infosys believes the information in this publication is accurate as of its publication date; such information is subject to change without notice. Infosys acknowledges the proprietary rights of the trademarks and product names of other companies mentioned in this document.