Table of Contents
Chapter 2 - Architecture
Chapter 3 - Acquire
Chapter 4 - Organize
Chapter 5 - Analyze
Chapter 6 - Decide
Resources
Business and government can never have too much information for
making the organization more efficient, profitable, or productive. For
that reason, organizations have turned to powerful data stores, including
very large databases (VLDBs), to meet their information storage and
retrieval needs. Due to exponential data growth in recent years, they have
embraced new storage technologies, and the enterprise database now
shares the spotlight with complementary technologies for storing and
managing big data.
There are four key characteristics that define big data: Volume, Velocity,
Variety, and Value. Volume and velocity aren't necessarily new problems
for IT managers; these issues are just amplified today. The distinguishing
characteristics of big data that do create new problems are the variety
and low-density value of the data. Big data comes in many different
formats that go beyond the traditional transactional data formats. It is
also typically very low density; one single observation on its own doesn't
have a lot of value. However, when this data is aggregated and analyzed,
meaningful trends can be identified.
Video: Gartner Research Vice President Merv Adrian discusses drivers for big data.
Perhaps because of the benefits and the useful applications for big data,
industry analysts have forecast rapid growth in the market for big data
technology and services.
Developing a big data strategy is complex, with different kinds of data, new
use cases, and additional software to weigh. Above all,
what's the value to the business?
IDC, Worldwide Big Data Technology and Services 2012-2015 Forecast, doc #233485, March 2012.
Architecture
Big data represents a sea change in the technology we draw upon for
making decisions. Organizations will integrate and analyze data from
diverse sources, complementing enterprise databases with data from
social media, video, smart mobile devices, and other sources. The
evolution of information architectures to include big data will likely
provide the foundation for a new generation of enterprise infrastructure.
To exploit these diverse sources of data for decision-making, an
organization must develop an effective strategy for acquiring, organizing,
and analyzing big data, using it to generate new insights about the
business and make better decisions.
Each step in the process of refining big data has requirements that are
best served by matching the right hardware and software to the job at
hand. Existing data warehouse infrastructure can grow to meet both
the scale of big data and the different analytics needs. But handling the
initial acquisition and organization of the new data types will require new
software, most notably Apache Hadoop.
Hadoop contains two main components: the Hadoop Distributed File
System (HDFS) for data storage, and the MapReduce programming
framework that manages the processing of the data. The Hadoop tool
suite enables organizations to organize raw (often unstructured) data and
transform it so it can be loaded into data warehouses and data marts for
integrated analysis.
Hadoop lets a cluster or grid of computers tackle big data workloads
by enabling parallel processing of large data sets. It operates primarily
with HDFS, which is fault-tolerant and can scale out to many clusters
with thousands of nodes. Hadoop MapReduce also provides capabilities
for analysis operations on massive data sets using a large number of
processors. For example, researchers at Yahoo sorted a petabyte of data
in 16.25 hours running Hadoop MapReduce on a cluster of 3,800 nodes.
Although Hadoop MapReduce is well suited to problems with key/value
data sets, it's not intended for operations that require complex data or
transactions.
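The map/shuffle/reduce flow that Hadoop manages can be sketched in miniature. The toy word count below is illustrative only: the three phases run serially in one Python process, whereas Hadoop distributes map and reduce tasks across a cluster, but it shows the key/value style of processing that MapReduce is suited to.

```python
from collections import defaultdict

def map_phase(lines):
    """Map: emit a (key, value) pair for each word seen."""
    for line in lines:
        for word in line.split():
            yield (word.lower(), 1)

def shuffle(pairs):
    """Shuffle: group all emitted values by key.

    Hadoop performs this grouping between the map and reduce phases.
    """
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Reduce: aggregate the grouped values for each key."""
    return {key: sum(values) for key, values in groups.items()}

lines = ["big data big value", "data in motion"]
counts = reduce_phase(shuffle(map_phase(lines)))
print(counts["big"])   # 2
print(counts["data"])  # 2
```

In Hadoop the same three roles are filled by user-supplied mapper and reducer functions, with HDFS holding the input and output and the framework handling the shuffle across nodes.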
Hadoop is a core building block for most big data architectures. It provides both data acquisition and storage, and has three main uses within organizations.
Acquire
Organizations need to choose the right storage technology for new data
with a clear understanding of both the kind of data they plan to store
as well as how they will use it. While there are many specialist storage
technologies tuned for particular scenarios, there are two primary use
cases to be aware of.
The sources of big data are numerous, including both human- and
machine-generated data feeds. Acquisition of data from sources like
online activity, RFID, instrumentation, social media, clickstreams, and
trading systems is characterized by a large volume of transactions,
high velocity of data flow, and greater variety of data formats. Required
latency varies, from interactive systems that deliver a service and need
subsecond responses, to more batch-oriented systems that store data for
offline analysis later.
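The two latency profiles described above can be sketched side by side. This is a hypothetical illustration (the class and field names are assumptions, and the flushed list merely stands in for files landed in a store like HDFS): interactive events get an immediate response, while batch-oriented events are buffered and written out in bulk for offline analysis.

```python
import json
import time

class EventCollector:
    """Toy sketch of two acquisition paths for incoming events."""

    def __init__(self, batch_size=100):
        self.batch_size = batch_size
        self.buffer = []
        self.flushed = []  # stands in for files written to bulk storage

    def handle_interactive(self, event):
        # Sub-second path: acknowledge the event right away.
        return {"ack": event["id"], "ts": time.time()}

    def handle_batch(self, event):
        # Batch path: accumulate events, then write them out in bulk.
        self.buffer.append(event)
        if len(self.buffer) >= self.batch_size:
            self.flushed.append(json.dumps(self.buffer))
            self.buffer = []

collector = EventCollector(batch_size=2)
collector.handle_batch({"id": 1, "src": "clickstream"})
collector.handle_batch({"id": 2, "src": "rfid"})
print(len(collector.flushed))  # 1 batch written, buffer now empty
```

In practice the batch path would append to a distributed file system and the interactive path would sit behind a low-latency store, but the split between the two pipelines is the point of the sketch.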
Video: Gartner Research Vice President Merv Adrian discusses big data technologies.
Organize
Deriving value from big data is a multiphase process that takes raw data
and refines it into useful information. Data acquisition, such as taking
data from streams and social media feeds, is a precursor to transforming
and organizing data to derive business value. Pre-processing is used to
weed out less useful data and structure what is left for analysis. Because
big data comes in many shapes, sizes, and formats, this transformation
is an important prerequisite to moving the data into the analytics
environment.
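A pre-processing pass of the kind described above can be sketched as a small filter-and-structure step. The record format and field names here are assumptions for illustration, not a real schema: unparsable or incomplete records are weeded out, and the rest are given a uniform structure ready to load into the analytics environment.

```python
import json

# Raw feed mixing well-formed events with noise, as arrives from
# streams and social media feeds.
raw_feed = [
    '{"user": "a1", "action": "view", "item": "X"}',
    'not valid json at all',            # noise to weed out
    '{"user": "b2", "action": "buy", "item": "Y"}',
    '{"user": "c3"}',                   # incomplete record, dropped
]

REQUIRED_FIELDS = {"user", "action", "item"}

def organize(records):
    """Drop unusable records and normalize the rest into rows."""
    structured = []
    for rec in records:
        try:
            data = json.loads(rec)
        except ValueError:
            continue  # weed out records that cannot be parsed
        if not REQUIRED_FIELDS <= data.keys():
            continue  # weed out records missing required fields
        structured.append((data["user"], data["action"], data["item"]))
    return structured

rows = organize(raw_feed)
print(rows)  # [('a1', 'view', 'X'), ('b2', 'buy', 'Y')]
```

The output rows now share one shape, which is what makes the subsequent load into a data warehouse or data mart straightforward.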
Analyze
Advanced Analytics
Decide
We often see big data and analytics used in the same sentence because
technology gains have enabled us to analyze increasingly large data sets.
Not the least of those gains is the capacity for Oracle Database to embed
analytics in the database, an architectural solution that provides scalability,
performance, and security. This architecture offloads analytics work from
RAM-limited computers and puts analytics processing closer to the data.
This eliminates unnecessary network round trips, leverages an enterprise-class database, and lowers hardware costs.
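The "move the processing closer to the data" idea can be illustrated in a few lines. This sketch uses Python's built-in sqlite3 as a stand-in for an enterprise database (the table and column names are invented for the example): rather than pulling every row into application RAM and aggregating there, the aggregation runs as SQL inside the database and only the small result set comes back.

```python
import sqlite3

# In-memory database standing in for an enterprise data store.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("east", 100.0), ("east", 250.0), ("west", 75.0)],
)

# In-database analytics: the SUM runs where the data lives, so only
# one small result set crosses the connection.
rows = conn.execute(
    "SELECT region, SUM(amount) FROM sales "
    "GROUP BY region ORDER BY region"
).fetchall()
print(rows)  # [('east', 350.0), ('west', 75.0)]
```

With a real enterprise database the same pattern avoids shipping millions of rows over the network to a RAM-limited client, which is the round-trip saving the paragraph above describes.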
Engineered Systems
Conclusion
To derive real business value from big data, you need the right tools to
capture and organize a wide variety of data types from different sources,
and to be able to easily analyze them within the context of all your enterprise
data. Oracle's engineered systems and complementary software provide
an end-to-end value chain to help you unlock the value of big data.
Resources
Videos
White Papers
Datasheets
Reports
Web
Podcasts
Blogs
Social Media
Episode 3: Comparing HDFS and NoSQL
From Overload to Impact: An Industry Scorecard on Big Data Business Challenges
Oracle R Enterprise
www.oracle.com/bigdata
www.oracle.com/exadata
www.oracle.com/exalytics