
Big Data Concepts for Executives and Senior Management

Objective

Understand big data and how it can be applied to store, manage, process and analyze massive amounts of unstructured and poly-structured data
Explore the technologies underpinning big data, including Hadoop and NoSQL
Determine how big data systems can complement traditional data warehousing and business intelligence solutions and processes
Utilize big data to differentiate your business and provide better service to your customers
Examine case studies of how big data is influencing society and businesses

Topics

Understanding Big Data concepts
Developing the business case for a big data solution
Maintaining a technology ecosystem
Examining how big data is influencing society and businesses
The Emerging Role of a Data Scientist
Social Media, the Quest for Real-Time and the Future

Hadoop Concepts for Executives, Business Leaders, IT Managers, Technical Staff, Developers & Administrators

Objective

Gain an understanding of the Hadoop technology stack, including MapReduce, HDFS, Hive, Pig and HBase, with an initial introduction to Mahout and other common utilities.

What is Hadoop?
The essential components of a Hadoop-based data management solution
Pros and cons of implementing Hadoop
How does Hadoop fit into our existing environment and architecture?
The differences between various Hadoop distributions
Examine case studies of how big data is influencing society and businesses

Topics

Why Hadoop?
History & background
Real-world use cases and case studies
The Hadoop Platform
Introduction to MapReduce and the Hadoop Distributed File System (HDFS)
Data warehousing with Hive
Parallel processing with Pig
Data mining with Mahout
Data storage with HBase
Common utilities: Sqoop, Flume, Hue, Scribe, ZooKeeper, HCatalog
Hadoop distributions: Apache Foundation, Cloudera, Hortonworks, MapR

Hadoop for Developers

Objective

Write a MapReduce program using the Hadoop API
Utilize HDFS for effective loading and processing of data with the CLI and API
Understand best practices for building, debugging and optimizing Hadoop solutions
Use Pig, Hive, HBase and HCatalog effectively

Course Outline

Day 1
Overview
MapReduce Code
HDFS
MapReduce JobTracker, TaskTracker and Running Jobs

Day 2
MapReduce Combiner
MapReduce Partitioner
MapReduce Distributed Cache
MapReduce Streaming
MapReduce Data Handling

Day 3
Pig Intro
Pig Data Model
Pig Scripting Language
Hive - Part 1

Day 4
Hive - Part 2
HCatalog
HBase
Enterprise Integration
Future of Hadoop

Hadoop Administration

Objective

Utilize best practices for deploying Hadoop clusters
Determine hardware needs
Monitor Hadoop clusters
Recover from NameNode failure
Handle DataNode failures
Manage hardware upgrade processes, including node removal, configuration changes, node installation and rebalancing clusters
Manage log files
Install, configure, deploy, verify and maintain Hadoop clusters, including: MapReduce, HDFS, Pig, Hive (and MySQL), HBase (and ZooKeeper), HCatalog, Mahout

Day 1

Overview of Hadoop Cluster Hardware and Installation of HDFS and MapReduce
Rack Topology
Setting up a Multi-user Environment Using Schedulers
Hadoop Security with Kerberos
Logs and Log Rotation
Monitor, Maintain and Troubleshoot HDFS and MapReduce
NameNode Failure and Recovery
JobTracker Restarting

Day 2
Upgrade of Hardware Process
Rebalancing
Data Management
Install, Configure, Deploy and Verify Pig
Install, Configure, Deploy and Verify Hive
Install, Configure, Deploy and Verify MySQL
Install, Configure, Deploy and Verify HBase and ZooKeeper
Install, Configure, Deploy and Verify Other Hadoop Ecosystem Components (HCatalog, Mahout)