
Meeting the Challenge of Big Data

Learn about the opportunities for harnessing big data. Gain insight into architecture and tools. See how to grow revenues, reduce costs, and gain competitive advantage.


Table of Contents

Chapter 1 - Spotlight on Big Data
Chapter 2 - Architecture
Chapter 3 - Acquire
Chapter 4 - Organize
Chapter 5 - Analyze
Chapter 6 - Decide
Chapter 7 - Big Data Software
Chapter 8 - Engineered Systems
Resources

Chapter 1 - Spotlight on Big Data

Business and government can never have too much information for making the organization more efficient, profitable, or productive. For that reason, organizations turned to powerful data stores, including very large databases (VLDBs), to meet their information storage and retrieval needs. Due to exponential data growth in recent years, we have embraced new storage technologies, and the enterprise database now shares the spotlight with complementary technologies for storing and managing big data.
There are four key characteristics that define big data: Volume, Velocity, Variety, and Value. Volume and velocity aren't necessarily new problems for IT managers; these issues are just amplified today. The distinguishing characteristics of big data that do create new problems are the variety and the low value density of the data. Big data comes in many different formats that go beyond the traditional transactional data formats. It is also typically very low density; one single observation on its own doesn't have a lot of value. However, when this data is aggregated and analyzed, meaningful trends can be identified.
Gartner Research Vice President Merv Adrian discusses drivers for big data

Exponential Data Growth


The global data explosion is driven in part by technology such as digital video and music, smartphones, and the growth of the internet. For example, clickstream data became available for hundreds of millions of internet users after the browser became a universal client. Social networks have grown so large that the scope of data mining activity now encompasses hundreds of millions of users. Smartphones that can provide information for location-based services will soon be in the hands of 1 billion users. There is useful information to be derived from these disparate sources, such as Web server logs, data streams from instruments, real-time trading data, blogs, and social media such as Twitter and Facebook.

Online or mobile financial transactions, social media traffic, and GPS coordinates now generate over 2.5 quintillion bytes [exabytes] ... every day.1
Applications and Benefits
Today the processing of terabyte-size, and even petabyte-size, data sets
is within the budget of many organizations due to inexpensive CPU
cycles and low-cost storage. That puts many organizations in a position
to benefit from big data.

Watch Video

Big Data, Big Impact: New Possibilities for International Development,


World Economic Forum


Big data enables an organization to gain a much greater understanding of its user and customer base, its operations and supply chain, even its competitive or regulatory environment. When handled correctly, big data will have a positive impact on the top line and bottom line, enabling better services and better decisions based on improved business intelligence. Organizations can analyze big data to develop and refine sophisticated predictive analytics that can reduce costs and deliver sustainable competitive advantage.

IDC expects the Big Data technology and services market to grow to $16.9 billion in 2015 with a compound annual growth rate (CAGR) of 40 percent.2

IDC

Perhaps because of the benefits and the useful applications for big data,
industry analysts have forecast rapid growth in the market for big data
technology and services.
Developing a big data strategy is complex, with different kinds of data, new use cases, and additional software. Above all, what's the value to the business?

See More

When organizations use big data to develop a better understanding of customers and users, it generates benefits that are seen across both industry and government. The retail industry, for example, generates data sets for clickstream monitoring, consumer sentiment analysis, and making recommendations when a customer is online. In financial services, enhanced knowledge of the customer enables fraud detection and prediction, as well as analysis of spending habits to increase profitability per customer. And in both public and private healthcare, big data is expected to deliver cost reductions and efficiencies that will also result in better patient care.

Watch Video

2 IDC Worldwide Big Data Technology and Services 2012-2015 Forecast, doc #233485, March 2012.

Chapter 2 - Architecture

Big data represents a sea change in the technology we draw upon for
making decisions. Organizations will integrate and analyze data from
diverse sources, complementing enterprise databases with data from
social media, video, smart mobile devices, and other sources. The
evolution of information architectures to include big data will likely
provide the foundation for a new generation of enterprise infrastructure.
To exploit these diverse sources of data for decision-making, an
organization must develop an effective strategy for acquiring, organizing,
and analyzing big data, using it to generate new insights about the
business and make better decisions.

See More

Each step in the process of refining big data has requirements that are
best served by matching the right hardware and software to the job at
hand. Existing data warehouse infrastructure can grow to meet both
the scale of big data and the different analytics needs. But handling the
initial acquisition and organization of the new data types will require new
software, most notably Apache Hadoop.
Hadoop contains two main components: the Hadoop Distributed File
System (HDFS) for data storage, and the MapReduce programming
framework that manages the processing of the data. The Hadoop tool
suite enables organizations to organize raw (often unstructured) data and
transform it so it can be loaded into data warehouses and data marts for
integrated analysis.
Hadoop lets a cluster or grid of computers tackle big data workloads
by enabling parallel processing of large data sets. It operates primarily
with HDFS, which is fault-tolerant and can scale out to many clusters
with thousands of nodes. Hadoop MapReduce also provides capabilities
for analysis operations on massive data sets using a large number of
processors. For example, researchers at Yahoo sorted a petabyte of data
in 16.25 hours running Hadoop MapReduce on a cluster of 3,800 nodes.
Although Hadoop MapReduce is well suited to problems with key/value data sets, it's not intended for operations that require complex data or transactions.
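The map-shuffle-reduce flow described above can be illustrated in miniature. The sketch below is a single-process Python analogue of the programming model, not Hadoop itself, using the canonical word-count example: the map phase emits key/value pairs, the framework groups them by key, and the reduce phase aggregates each group.

```python
from collections import defaultdict

def map_phase(records):
    """Map: emit (key, value) pairs -- here, (word, 1) for each word."""
    for record in records:
        for word in record.split():
            yield word.lower(), 1

def shuffle(pairs):
    """Shuffle: group all values by key, as the framework does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Reduce: aggregate the values for each key -- here, summing the counts."""
    return {key: sum(values) for key, values in groups.items()}

lines = ["big data big insight", "big data big value"]
counts = reduce_phase(shuffle(map_phase(lines)))
print(counts)  # {'big': 4, 'data': 2, 'insight': 1, 'value': 1}
```

On a real cluster, the map and reduce functions run in parallel on the nodes holding the data, which is what lets the same pattern scale to petabyte-size inputs.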
Hadoop is a core building block for most
big data architectures. It provides both data
acquisition and storage, and has three main
uses within organizations.
Watch Video

Chapter 3 - Acquire

Organizations need to choose the right storage technology for new data with a clear understanding of both the kind of data they plan to store and how they will use it. While there are many specialist storage technologies tuned for particular scenarios, there are two primary use cases to be aware of.

The sources of big data are numerous including both human- and
machine-generated data feeds. Acquisition of data from sources like
online activity, RFID, instrumentation, social media, clickstreams, and
trading systems is characterized by a large volume of transactions,
high velocity of data flow, and greater variety of data formats. Required
latency varies, from interactive systems that deliver a service and need
subsecond responses, to more batch-oriented systems that store data for
offline analysis later.
Gartner Research Vice President Merv
Adrian discusses big data technologies.
Watch Video

The diversity of content requires software to operate on structured and unstructured data, often in high-throughput scenarios. An effective big data solution must provide storage and processing capacity to collect, organize, and refine large volumes of data, even petabyte-size data sets.

Systems that are more batch-oriented with less stringent requirements


for response time, updates, and queries often use the Hadoop Distributed
File System (HDFS). Where time constraints are more stringent, with
applications needing subsecond query response times, or frequent updates
to existing data, some form of NoSQL database is usually required.
NoSQL emerged as companies such as Amazon, Google, LinkedIn, and Twitter struggled to deal with unprecedented data and operation volumes under tight latency constraints. Analyzing high-volume, real-time data, such as website clickstreams, can provide significant business advantage by harnessing unstructured and semistructured data sources to develop new business analysis models. Consequently, enterprises built upon a decade of research on distributed hash tables (DHTs) and utilized either conventional relational database systems or embedded key/value stores, such as Oracle's Berkeley DB, to develop highly available, distributed key-value stores.
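To make the key-value idea concrete, here is a toy Python sketch of a DHT-style store: each key is hashed to select the node that owns it, which is what lets these systems scale out. It is an illustration of the principle only, not the implementation of Berkeley DB or any other product.

```python
import hashlib

class TinyKeyValueStore:
    """Toy partitioned key-value store: keys are hashed to one of N 'nodes'
    (here just in-process dicts), the core idea behind DHT-based NoSQL stores."""

    def __init__(self, num_nodes=4):
        self.nodes = [{} for _ in range(num_nodes)]

    def _node_for(self, key):
        # Stable hash, so the same key always routes to the same node.
        digest = hashlib.md5(key.encode()).hexdigest()
        return self.nodes[int(digest, 16) % len(self.nodes)]

    def put(self, key, value):
        self._node_for(key)[key] = value

    def get(self, key, default=None):
        return self._node_for(key).get(key, default)

store = TinyKeyValueStore()
store.put("user:1001", {"name": "Ada", "clicks": 42})
print(store.get("user:1001"))  # {'name': 'Ada', 'clicks': 42}
```

A production store layers replication, consistent hashing (so few keys move when nodes join or leave), and persistence on top of this basic partitioning scheme; that is what delivers the predictable, subsecond latency described above.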
Organizations acquire and store a variety
of structured and unstructured information.
They must understand whether their use case
requires subsecond interactive response or
comprises somewhat slower batch operations.
Watch Video


Chapter 4 - Organize

Deriving value from big data is a multiphase process that takes raw data and refines it into useful information. Data acquisition, such as taking data from streams and social media feeds, is a precursor to transforming and organizing data to derive business value. Preprocessing is used to weed out less useful data and structure what is left for analysis. Because big data comes in many shapes, sizes, and formats, this transformation is an important prerequisite to moving the data into the analytics environment.

Developers today typically write custom Java code that, in conjunction with the MapReduce programming framework, processes and transforms the data on the node where it is stored. Data movement is therefore minimized, since only the final results of preprocessing are uploaded to the data warehouse.
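The weed-out-and-structure step can be sketched as a small transform. The record format, field names, and threshold below are hypothetical, chosen only to show the shape of a preprocessing pass that discards malformed or low-value events and emits uniform rows ready for loading.

```python
import json

def preprocess(raw_lines):
    """Filter raw event lines and flatten the survivors into uniform rows
    suitable for bulk loading into a warehouse table.
    The 'user'/'page'/'duration_ms' schema is illustrative only."""
    rows = []
    for line in raw_lines:
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # discard malformed input rather than failing the batch
        if event.get("duration_ms", 0) < 100:
            continue  # weed out low-value noise events below the threshold
        rows.append((event["user"], event["page"], event["duration_ms"]))
    return rows

raw = [
    '{"user": "u1", "page": "/home", "duration_ms": 2500}',
    'not json at all',
    '{"user": "u2", "page": "/ads", "duration_ms": 20}',
]
rows_out = preprocess(raw)
print(rows_out)  # [('u1', '/home', 2500)]
```

In a MapReduce job this logic would run inside the map function on each storage node, so only the small, structured result set crosses the network to the warehouse.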

Hadoop MapReduce can preprocess and


transform data for loading into an Oracle data
warehouse.
See More

By prepping data to load into Oracle Exadata Database Machine, we set


the stage for integrated analysis with traditional enterprise data.

After we've collected big data, we need to transform and organize it as a precursor for additional refinement with analytics.
Watch Video

The refining of big data enables it to be analyzed alongside your


enterprise data. After raw data has been acquired, using data stores such
as Hadoop Distributed File System (HDFS) or a NoSQL database, it can
be preprocessed for loading into an analytics environment, such as a
data warehouse running on Oracle Exadata Database Machine. This type
of workload is often handled using Apache Hadoop.


See More


Chapter 5 - Analyze

Advanced Analytics

Organizations have long derived useful information by building mathematical models and sifting through large volumes of data. Once refined, big data expands existing models and is a potentially rich new source of insight for business intelligence applications that use the data warehouse.
Big data analysis is different. See how it
can uncover why things happen and what
kind of new analytics tools and processes
supplement what you already have.
Watch Video
The data warehouse is key to big data analysis. While data comes from many sources, new insight comes from an integrated analysis of all data together. Hence, the modern data warehouse becomes a repository for the data summaries created by Hadoop as well as for more traditional enterprise data.
New data sources are different: the data itself is often less well understood, but may also be inherently less precise or only indirectly relevant to the problem. So, to derive value from big data, we must turn to an analysis process of iteration and refinement. Each iteration can either reveal new insight or simply enable an analyst to rule out a particular line of inquiry. Big data analysis is about uncovering new relationships rather than reporting on a well-understood data set.


While traditional analysis tools are still important, advanced analytics involving both statistical analysis and data mining are required to get the most out of big data. A large user community has turned to the open source R statistical programming language, which has been evolving since 1997. Very popular among analysts and data scientists, R is also widely used in the academic world, so there's a ready pool of trained R developers coming along.
One use of statistical techniques, called predictive analytics, has gained traction across multiple industries including finance, retail, insurance, healthcare, pharmaceuticals, and telecommunications. Predictive analytics can use customer data to build and optimize predictive models. Organizations are using predictors, for example, to guide marketing campaigns and make them more effective. The surge of interest in predictive analytics has been made possible by gains in computing horsepower. With today's tools, predictive analytics can create sophisticated models and execute a variety of scenarios across large sets of data.
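As a minimal illustration of the fit-then-predict workflow behind predictive analytics, the sketch below fits an ordinary least-squares line to a handful of hypothetical marketing observations and then scores a new scenario. Real predictive analytics uses far richer models, in R or in-database, but the basic workflow is the same.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept,
    the simplest possible predictive model."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    slope = num / den
    return slope, mean_y - slope * mean_x

# Hypothetical observations: marketing spend (thousands) vs. responses
spend = [1.0, 2.0, 3.0, 4.0]
responses = [12.0, 19.0, 31.0, 38.0]

slope, intercept = fit_line(spend, responses)
predicted = slope * 5.0 + intercept  # score a scenario outside the data
print(round(predicted, 1))  # 47.5
```

Each "iteration" of analysis described above corresponds to refitting such a model on refined data and checking whether the new predictor actually improves on the old one.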

See More


Chapter 6 - Decide

When we make decisions in today's world, awash in data, we can use powerful tools to distill data and present information, making for a more intelligent decision-making process. Using automated analysis, we can make decisions that are data driven. We can turn big data into actionable insight and, with the right technology, do it in real time.

See More

Visualizations and business intelligence dashboards are a powerful assist to


decision-making, particularly when dealing with massive amounts of data.
Statistical software is a key element of analytics, business intelligence, and
decision support. The Web interface for running scripts of the R statistical
analysis language can be integrated into dashboards, providing analysis
and streaming graphics for the decision-making process.

Decisions in Real Time


The volume and velocity of big data have put new emphasis on scalability
and performance of analytics and business intelligence tools. Improvements
in server capacity, high-speed interconnects, and network bandwidth have
contributed to the emergence of a new generation of software that provides
in-memory, in-database, and real-time analytics.
In-memory databases, for example, give us the capacity for real-time
decision-making. The 64-bit addressing capability of modern systems
means we can configure servers with a terabyte (TB) of memory. That
capacity means databases, some in excess of a billion rows, can be loaded
into memory to sustain high-performance, low-latency processing, which
results in faster decision-making.
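The capacity claim above is easy to check with quick arithmetic: a terabyte of memory spread over a billion rows leaves roughly a kilobyte per row, which is ample headroom for many in-memory tables. A back-of-the-envelope sketch:

```python
TIB = 2 ** 40            # one terabyte (tebibyte) of memory, in bytes
rows = 1_000_000_000     # a billion-row table

bytes_per_row = TIB / rows
print(f"{bytes_per_row:.0f} bytes available per row")  # 1100 bytes

# Conversely, at a compact 200 bytes per row, a 1 TB server could hold:
max_rows = TIB // 200
print(f"{max_rows:,} rows")  # 5,497,558,138 rows
```

The exact row budget depends on indexes, compression, and overhead, but the arithmetic shows why billion-row in-memory working sets became practical once 64-bit servers could address a terabyte of RAM.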

Firms that adopt data-driven decision-making have output and productivity that is 5-6% higher than what would be expected.1

1 Brynjolfsson, Hitt, and Kim, "Strength in Numbers: How Does Data-Driven Decision Making Affect Firm Performance?" (April 22, 2011).


Chapter 7 - Big Data Software

Oracle offers a powerful stack of software, including new functionality specifically designed to handle the new challenges of big data. All components can run on both Oracle engineered systems and customer-integrated hardware.

Oracle Endeca Information Discovery


Oracle Endeca Information Discovery is an enterprise data discovery
platform for advanced exploration and analysis of complex and varied
data. Information is loaded from disparate source systems and stored
in a faceted data model that dynamically supports changing data. This
integrated and enriched data is made available for search, discovery and
analysis via interactive and configurable applications. Oracle Endeca's intuitive interface empowers business users to easily explore big data to determine its potential value.

Oracle NoSQL Database


Applications with diverse architectures and performance requirements also have diverse requirements for data storage and retrieval. Many big data applications require a fast, stripped-down data store that supports interactive queries and updates on a large volume of data.
Oracle NoSQL Database can quickly acquire and organize schemaless, unstructured, or semistructured data. It is an always-available, distributed key-value data store with predictable latency and fast response to queries, supporting a wide range of interactive use cases. And it has a simple programming model, making it easy to integrate into new big data applications.


Get Fast Answers to New Questions with


Information Discovery
Watch Video

Oracle Data Integration


Oracle Data Integrator provides data extraction, loading, and transformation (E-LT) for Oracle Database, Oracle Applications, and third-party application sources. Oracle GoldenGate provides high-volume, real-time transformation and loading of data into a data warehouse or data mart. Together these products work with Oracle Big Data Connectors to provide a gateway to integrating big data. The big data explosion has added to the importance of these products, because big data is not useful if it's siloed.



Oracle Big Data Connectors

Oracle has developed a suite of software for easily integrating Oracle Database with Hadoop. Oracle Big Data Connectors are available with Oracle Big Data Appliance or as individual software products. They facilitate access to the Hadoop Distributed File System (HDFS) from Oracle Database and data loading into Oracle Database from Hadoop. They also provide a native R interface to HDFS and the MapReduce framework, and enable Oracle Data Integrator to generate Hadoop MapReduce programs.

Oracle Advanced Analytics

We often see big data and analytics used in the same sentence because technology gains have enabled us to analyze increasingly large data sets. Not the least of those gains is the capacity for Oracle Database to embed analytics in the database, an architectural solution that provides scalability, performance, and security. This architecture offloads analytics work from RAM-limited computers and puts analytics processing closer to the data. This eliminates unnecessary network round trips, leverages an enterprise-class database, and lowers hardware costs.

Gartner Research Vice President Merv


Adrian discusses integrating big data
into your data center
Watch Video


Oracle Advanced Analytics turns Oracle Database into a sophisticated analytics platform ready for big data analytics. It combines the capabilities of Oracle Data Mining with Oracle R Enterprise, an enhanced version of the open source R statistical programming language. Oracle Advanced Analytics eliminates the network latency that results from marshalling data between a database and external clients doing analytics processing. This can produce a 10x to 100x improvement in performance compared to processing outside the database. Encapsulating analytics logic in the database also exploits the database's multilevel security model and enables the database to manage real-time predictive models and results.


Chapter 8 - Engineered Systems

Oracle's software stack is the foundation for a powerful line of engineered systems that will help you quickly find new insights and unlock the value in big data.

Oracle's engineered systems enable organizations to deploy big data solutions as a complement to operational systems, data warehousing, analytics, and business intelligence processing. Engineered systems are preintegrated, and therefore easier to deploy and support, and they deliver optimized performance. They can be deployed alone or alongside existing infrastructure.

Oracle Big Data Appliance 3-D Demo


View 3-D Demo

Clearly, Oracle's release of Oracle Big Data Appliance signifies a full commitment to Hadoop as a first-class citizen of the Oracle data platform. Its price, $450,000 for 216 CPU cores backed by 648TB of storage and the same InfiniBand backplane used by Oracle Exadata and Oracle's other engineered systems, is definitely competitive.1

1 "Oracle mainstreams its Hadoop platform with Cloudera [Oracle Enterprise Manager] deal," Tony Baer, Ovum, January 2012.

Oracle Big Data Appliance is a comprehensive, enterprise-ready


combination of hardware and software that makes getting started with big
data easy and fast. It is designed to run both Hadoop and Oracle NoSQL
Database for data acquisition, and to run Hadoop MapReduce algorithms
to organize the data and load it into a data warehouse for integrated
analysis.



Customers discuss the benefits of Oracle


Exalytics

Oracle has partnered with Cloudera to include the Cloudera Distribution


as part of Oracle Big Data Appliance. This ensures that customers have
access to a fully integrated and supported distribution of Hadoop, which
has tens of thousands of nodes in production, speeding deployment and
reducing ownership costs.
Oracle Exadata Database Machine represents a leading-edge combination of hardware and software that is easy to deploy, completely scalable, secure, and redundant. Innovative technologies such as Exadata Smart Scan, Exadata Smart Flash Cache, and Hybrid Columnar Compression enable Exadata to deliver extreme performance for everything from data warehousing to online transaction processing to mixed workloads. Oracle Exadata uses a massively parallel architecture and a high-speed InfiniBand network to sustain high-bandwidth links between the database servers and storage servers, and also to other engineered systems like Oracle Big Data Appliance and Oracle Exalytics.
Infosys discusses the benefits of Oracle
Big Data Appliance
Watch Video

Oracle Exadata supports deployment of massive data warehouses and


the iterative analysis needed to uncover new relationships and develop
new insight. Once this new analysis is operationalized, it becomes
available to decision-makers who can act upon it and realize the
business value.


Watch Video

Oracle Exalytics In-Memory Machine is an integrated hardware and software solution that provides in-memory analytics for rapid decision-making without breaking the budget. It can be deployed to support demand forecasting, revenue and yield management, pricing, inventory management, and a myriad of other applications. Plus, it can be linked by a high-speed InfiniBand connection to a data warehouse on Oracle Exadata, providing real-time analytics for business intelligence applications accessing large data warehouses.
Oracle Exalytics In-Memory Machine delivers speed-of-thought analysis. And this fundamentally changes how you interact with your BI software, enabling you to get more out of your data and to generate more business value.

Conclusion
To derive real business value from Big Data, you need the right tools to
capture and organize a wide variety of data types from different sources,
and to be able to easily analyze it within the context of all your enterprise
data. Oracles engineered systems and complementary software provide
an end-to-end value chain to help you unlock the value of big data.


Resources

Videos
- Gartner Research Vice President Merv Adrian discusses drivers for big data
- Gartner Research Vice President Merv Adrian discusses big data technologies
- Gartner Research Vice President Merv Adrian discusses integrating big data into your data center
- Episode 1: Developing a Big Data Strategy
- Episode 2: Understanding the Basics of Hadoop
- Episode 3: Comparing HDFS and NoSQL
- Episode 4: Transforming and Organizing Data with Hadoop
- Episode 5: Using Statistical Analysis to Generate New Insight
- Land O'Lakes: Get Fast Answers to New Questions with Information Discovery
- Infosys discusses the benefits of Oracle Big Data Appliance
- Customers discuss the benefits of Oracle Exalytics

White Papers
- Big Data for the Enterprise
- Oracle Big Data Connectors
- Oracle NoSQL Database
- Oracle Data Mining
- Oracle Information Architecture: An Architect's Guide to Big Data
- Big Data and Enterprise Data: Bridging Two Worlds with Oracle Data Integration

Datasheets
- Oracle Big Data Appliance Datasheet
- Oracle Exadata Database Machine X2-8 Datasheet
- Oracle Exadata Database Machine X2-2 Datasheet
- Oracle Exalytics In-Memory Machine X2-4 Datasheet

Reports
- McKinsey: Big Data: The Next Frontier for Innovation, Competition, and Productivity
- IDC: Oracle's All-Out Assault on the Big Data Market: Offering Hadoop, R, Cubes, and Scalable IMDB in Familiar Packages
- Forrester: Oracle Exadata Raises the Bar on Database Appliances
- World Economic Forum: Big Data, Big Impact: New Possibilities for International Development
- Ovum report on analytics
- From Overload to Impact: An Industry Scorecard on Big Data Business Challenges

Web
- www.oracle.com/bigdata
- www.oracle.com/exadata
- www.oracle.com/exalytics

Podcasts
- Oracle NoSQL Database with Dave Segleau
- Big Data Panel Discussion
- Oracle and Cloudera with Mike Olson, CEO of Cloudera
- Understanding Big Data Analysis with the R Language

Blogs
- Oracle Big Data Platform
- Oracle R Enterprise
- Oracle NoSQL Database
- Oracle Database Insider

Social Media
- Follow Oracle Database on Facebook
- Follow Oracle Database on Twitter
- Follow Oracle Database on LinkedIn
- Follow Oracle Database on Google+

Copyright 2012, Oracle and/or its affiliates. All rights reserved.

Meeting the Challenge of Big Data

oracle.com/bigdata

E-Book Popup Pages

Meeting the Challenge of Big Data


Chapter 1 - Spotlight on Big Data

Big data enables an organization to gain a much greater understanding


of their user and customer base, their operations and supply chain,
even their competitive or regulatory environment. When handled
correctly, big data will have a positive impact on the top line and bottom
line, enabling better services and better decisions based on improved
business intelligence. Organizations can analyze big data to develop and
refine sophisticated predictive analytics that can reduce costs and deliver
sustainable competitive advantage.

IDC expects the Big Data technology and


services market to grow to $16.9 billion in 2015
with a compound annual growth rate (CAGR) of
40 percent.2
IDC

Perhaps because of the benefits and the useful applications for big data,
industry analysts have forecast rapid growth in the market for big data
technology and services.
Developing a big-data strategy is complex with different kinds of data, new-use
cases, and additional software. Above all,
whats the value to the business?

See More

When organizations use big data to develop a better understanding of


customers and users it generates benefits that are seen across both
industry and government. The retail industry, for example, generates
data sets for clickstream monitoring, consumer sentiment analysis,
and making recommendations when a customer is online. In financial
services, enhanced knowledge of the customer enables fraud detection
and prediction, as well as analysis of spending habits to increase
profitability per customer. And in both public and private healthcare, big
data is expected to deliver cost reductions and efficiencies that will also
result in better patient care.

Watch Video

IDC Worldwide Big Data Technology and Services 2012-2015 Forecast, doc
#233485, March 2012

Meeting the Challenge of Big Data


Chapter 2 - Architecture
Chapter 2

Architecture

Big data represents a sea change in the technology we draw upon for
making decisions. Organizations will integrate and analyze data from
diverse sources, complementing enterprise databases with data from
social media, video, smart mobile devices, and other sources. The
evolution of information architectures to include big data will likely
provide the foundation for a new generation of enterprise infrastructure.
To exploit these diverse sources of data for decision-making, an
organization must develop an effective strategy for acquiring, organizing,
and analyzing big data, using it to generate new insights about the
business and make better decisions.

See More

Each step in the process of refining big data has requirements that are
best served by matching the right hardware and software to the job at
hand. Existing data warehouse infrastructure can grow to meet both
the scale of big data and the different analytics needs. But handling the
initial acquisition and organization of the new data types will require new
software, most notably Apache Hadoop.
Hadoop contains two main components: the Hadoop Distributed File
System (HDFS) for data storage, and the MapReduce programming
framework that manages the processing of the data. The Hadoop tool
suite enables organizations to organize raw (often unstructured) data and
transform it so it can be loaded into data warehouses and data marts for
integrated analysis.
Hadoop lets a cluster or grid of computers tackle big data workloads
by enabling parallel processing of large data sets. It operates primarily
with HDFS, which is fault-tolerant and can scale out to many clusters
with thousands of nodes. Hadoop MapReduce also provides capabilities
for analysis operations on massive data sets using a large number of
processors. For example, researchers at Yahoo sorted a petabyte of data
in 16.25 hours running Hadoop MapReduce on a cluster of 3,800 nodes.
Although Hadoop MapReduce is well suited to problems with key/value
data sets, its not intended for operations that require complex data or
transactions.
Hadoop is a core building block for most
big data architectures. It provides both data
acquisition and storage, and has three main
uses within organizations.

Chapter 4 - Organize

Developers today typically create custom-written Java code that, in
conjunction with the MapReduce programming framework, processes and
transforms the data on the node where it is stored. Overall, data movement
is therefore minimized, since only the final results of preprocessing are
uploaded to the data warehouse.

Deriving value from big data is a multiphase process that takes raw data
and refines it into useful information. Data acquisition, such as taking
data from streams and social media feeds, is a precursor to transforming
and organizing data to derive business value. Pre-processing is used to
weed out less useful data and structure what is left for analysis. Because
big data comes in many shapes, sizes, and formats, this transformation
is an important prerequisite to moving the data into the analytics
environment.
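One common approach is Hadoop Streaming, which lets a script in any language act as the mapper: Hadoop pipes raw records to the script's standard input, one per line, and reads tab-separated key/value pairs back from standard output. The sketch below uses a hypothetical comma-separated log format and field names to show the "weed out and structure" step; the cleaning logic is written as a function so it can also run outside the cluster.

```python
# Sketch of a Hadoop Streaming-style mapper (the log format and field
# names are hypothetical). Hadoop Streaming feeds raw records to the
# mapper's stdin, one per line, and collects tab-separated key/value
# pairs from stdout; in a real job this logic would wrap sys.stdin.

def clean_record(line):
    """Return 'user_id<TAB>action' for well-formed CSV lines, else None."""
    parts = line.strip().split(",")
    if len(parts) != 3:                # weed out malformed records
        return None
    user_id, action, timestamp = parts
    if not user_id or not timestamp:   # drop records missing key fields
        return None
    return f"{user_id}\t{action}"

def run_mapper(stream):
    """Emit one cleaned, structured record per well-formed input line."""
    return [cleaned for raw in stream
            if (cleaned := clean_record(raw)) is not None]

raw_lines = ["u1,click,2014-06-01", "corrupt record", "u2,view,2014-06-01"]
for out in run_mapper(raw_lines):
    print(out)
```

The malformed middle record is silently dropped, and only structured, tab-delimited output remains for loading downstream.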

Hadoop MapReduce can preprocess and transform data for loading into an
Oracle data warehouse.

By prepping data to load into Oracle Exadata Database Machine, we set
the stage for integrated analysis with traditional enterprise data.

After we've collected big data, we need to transform and organize it as a
precursor for additional refinement with analytics.

The refining of big data enables it to be analyzed alongside your
enterprise data. After raw data has been acquired, using data stores such
as Hadoop Distributed File System (HDFS) or a NoSQL database, it can
be preprocessed for loading into an analytics environment, such as a
data warehouse running on Oracle Exadata Database Machine. This type
of workload is often handled using Apache Hadoop.



Chapter 5 - Analyze

Advanced Analytics

Organizations have long derived useful information by building
mathematical models and sifting through large volumes of data. Once
refined, big data expands existing models and is a potentially rich new
source of insight for business intelligence applications that use the
data warehouse.
Big data analysis is different. See how it
can uncover why things happen and what
kind of new analytics tools and processes
supplement what you already have.
The data warehouse is key to big data analysis. While data comes from
many sources, new insight comes from an integrated analysis of all data
together. Hence, the modern data warehouse now becomes a repository
for the data summaries created by Hadoop as well as more traditional
enterprise data.
New data sources are different: the data itself is often less well
understood, but may also be inherently less precise or only indirectly
relevant to the problem. So, to derive value from big data, we must turn to
an analysis process of iteration and refinement. Each iteration can either
reveal new insight, or simply enable an analyst to rule out a particular line
of inquiry. Big data analysis is about uncovering new relationships rather
than reporting on a well-understood data set.


While traditional analysis tools are still important, advanced analytics
involving both statistical analysis and data mining are required to get the
most out of big data. A large user community has turned to the open source
R statistical programming language, which has been evolving since 1997. Very
popular among analysts and data scientists, R is also widely used in the
academic world, so there's a ready pool of trained R developers coming
along.
One use of statistical techniques, called predictive analytics, has gained
traction across multiple industries, including finance, retail, insurance,
healthcare, pharmaceuticals, and telecommunications. Predictive analytics
exploits customer data to build and optimize predictive models.
Organizations are using predictors, for example, to guide marketing
campaigns and make them more effective. The surge of interest in
predictive analytics has been made possible by gains in computing
horsepower. With today's tools, predictive analytics can create
sophisticated models and execute a variety of scenarios across large
sets of data.
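The train-then-score workflow behind predictive analytics can be illustrated in miniature. The toy sketch below, using entirely hypothetical data, fits a one-variable least-squares model to historical observations and then scores a new case; production models are far richer and often built in R, but the mechanics are the same.

```python
# Toy predictive model (illustrative only, hypothetical data): fit a
# one-variable least-squares line to historical observations, then
# score a new case. Real predictive analytics uses far richer models,
# but the workflow is the same -- train on the past, predict the new.

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    b = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / \
        sum((x - mean_x) ** 2 for x in xs)
    a = mean_y - b * mean_x
    return a, b

# Hypothetical history: monthly site visits vs. customer spend.
visits = [1, 2, 3, 4, 5]
spend = [10.0, 20.0, 30.0, 40.0, 50.0]

a, b = fit_line(visits, spend)
predicted = a + b * 6          # score a new customer with 6 visits
print(round(predicted, 2))     # the history is perfectly linear, so 60.0
```

A marketing team might use such a score to decide which customers receive a campaign, which is exactly the campaign-targeting use case described above.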



Chapter 6 - Decide

When we make decisions in today's world awash in data, we can use
powerful tools to distill data and present information, making for a more
intelligent decision-making process. Using automated analysis, we can
make decisions that are data driven. We can turn big data into actionable
insight and, with the right technology, do it in real time.


Visualizations and business intelligence dashboards are a powerful assist to
decision-making, particularly when dealing with massive amounts of data.
Statistical software is a key element of analytics, business intelligence, and
decision support. The Web interface for running scripts of the R statistical
analysis language can be integrated into dashboards, providing analysis
and streaming graphics for the decision-making process.

Decisions in Real Time

The volume and velocity of big data have put new emphasis on scalability
and performance of analytics and business intelligence tools. Improvements
in server capacity, high-speed interconnects, and network bandwidth have
contributed to the emergence of a new generation of software that provides
in-memory, in-database, and real-time analytics.
In-memory databases, for example, give us the capacity for real-time
decision-making. The 64-bit addressing capability of modern systems
means we can configure servers with a terabyte (TB) of memory. That
capacity means databases, some in excess of a billion rows, can be loaded
into memory to sustain high-performance, low-latency processing, which
results in faster decision-making.
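As a small illustration of the in-memory idea, Python's standard sqlite3 module can create a database that lives entirely in RAM; the table, data, and decision rule below are hypothetical, and a production in-memory database is vastly more capable, but the principle is the same: queries never touch disk, so decisions come back with low latency.

```python
import sqlite3

# Illustrative sketch (hypothetical table and threshold): an in-memory
# SQLite database stands in for the in-memory engines described above.
# The data lives entirely in RAM, so lookups avoid disk I/O.

conn = sqlite3.connect(":memory:")          # nothing touches disk
conn.execute("CREATE TABLE events (user_id TEXT, score REAL)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [("u1", 0.9), ("u2", 0.4), ("u1", 0.7)],
)

# A real-time decision might threshold an aggregate computed in memory.
row = conn.execute(
    "SELECT AVG(score) FROM events WHERE user_id = ?", ("u1",)
).fetchone()
avg_score = row[0]
decision = "offer" if avg_score > 0.5 else "hold"
print(decision)   # u1's average score is 0.8, above the 0.5 threshold
```

Because every row is already resident in memory, the aggregate-and-decide step completes in microseconds rather than waiting on storage, which is what makes per-transaction, real-time decisions practical.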

"Firms that adopt data-driven decision-making have output and productivity
that is 5-6% higher than what would be expected."¹

¹ Brynjolfsson, Hitt, and Kim, "Strength in Numbers: How Does Data-Driven
Decision Making Affect Firm Performance?" (April 22, 2011).

