Beruflich Dokumente
Kultur Dokumente
SUBMITTED BY:
GARIMA SINGH(12609103) PRIYANKA GROVER(12609060)
11/25/2013
11/25/2013
Included in the Resource Manager is Scheduler, whose sole task is to allocate system resources to specific running applications (tasks), but it does not monitor or track the application status.
11/25/2013
11/25/2013
11/25/2013
11/25/2013
11/25/2013
11/25/2013
12
Reliable messaging: Even though workloads in Zookeeper are loosely coupled, you still have a need for communication between and among the nodes in the cluster specific to the distributed application.
11/25/2013
13
11/25/2013
14
11/25/2013
15
11/25/2013
16
Similarities between The Traditional Data Warehouse and The Big Data Warehouse
11/25/2013
20
Differences between The Traditional Data Warehouse and The Big Data Warehouse
The distributed computing model of big data will be essential to allowing the hybrid model to be operational. The big data analysis will be the primary focus of the efforts, while the traditional data warehouse will be used to add historical and transactional business context.
11/25/2013
21
11/25/2013
25
Overview
Using big data to get results Finding whats different with big data Exploring the challenges of analyzing big data Examining analytics tools for big data
11/25/2013
26
11/25/2013
27
Types of Analysis
When it comes to analytics, you might consider a range of possible kinds of analysis.
11/25/2013
29
Basic analytics
Basic analytics can be used to explore your data. This include simple visualizations or simple statistics. Basic analysis is often used when you have large amounts of disparate data. Slicing and dicing: Slicing and dicing refers to breaking down your data into smaller sets of data that are easier to explore. Basic monitoring: You might also want to monitor large volumes of data in real time. Anomaly identification: You might want to identify anomalies,
such as an event where the actual observation differs from what you expected, in your data because that may clue you in that something is going wrong with your organization, manufacturing 11/25/2013 Big data for dummies 30 process, and so on.
Advanced analytics
Advanced analytics provides algorithms for complex analysis of either structured or unstructured data. It includes sophisticated statistical models, machine learning, neural networks, text analytics and other advanced data-mining techniques. 11/25/2013 Big data for dummies
31
Operationalized analytics
When you operationalize analytics, you make them part of a business process. For example, a model could be built to predict customers who are good targets for up selling when they call into a call center. The call center agent, while on the phone with the customer, would receive a message on specific additional products to sell to this customer. The agent might not even know that a predictive model was working behind the scenes to make this recommendation. Big data for dummies 11/25/2013 33
Monetizing Analytics
Analytics can be used to optimize your business to create better decisions and drive bottom- and top-line revenue. Big data analytics can also be used to derive revenue above and beyond the insights for department or company. Assemble a unique data set that is valuable to other companies. For example, credit card providers take the data they assemble to offer value-added analytics products. 11/25/2013 Big data for dummies 34
11/25/2013
35
It can be dirty
The signal (usable information) may only be a tiny percent of the data; the noise is the rest. The signal-to-noise Being able to extract a tiny signal from noisy data is part of ratio can be low the benefit of big data analytics
It can be real-time
11/25/2013
Infrastructure Support
Integrate technologies needs to integrate new big data technologies with traditional technologies process all kinds of big data and make it consumable by traditional analytics An enterprise-hardened Hadoop system may be needed that can process/store/manage large amounts of data Data can be structured, semi-structured, or unstructured. A stream-computing capability may be needed to process data in motion that is continuously generated by sensors, smart devices, video, audio, and logs to support real-time decision making. You may need a solution optimized for operational or deep analytical workloads to store and manage the growing amounts of trusted data
Warehouse data:
11/25/2013
37
11/25/2013
38