Hadoop Big Data Administration

3 Day Course

Course Objectives

Master Hadoop and the components of the Hadoop administration ecosystem
Plan Hadoop clusters, including installation and configuration of single-node and multi-node Hadoop clusters
Become proficient in HDFS and Sqoop through demos and hands-on lab exercises
Install and configure YARN while gaining an in-depth understanding of the MapReduce and YARN architecture
Become expert in recovering from node failures and troubleshooting common Hadoop cluster issues
Install and configure Hadoop ecosystem components such as Hive, Pig, Impala, Ganglia, Nagios, and Sqoop
Gain expertise in the setup, configuration, and management of security for Hadoop clusters using Kerberos

Audience
Systems administrators and IT managers
IT administrators and operators
IT Systems Engineer
Data Engineer
Data Analytics Administrator
Cloud Systems Administrator
Web Engineer

Prerequisites
Fundamental knowledge of any programming language and of the Linux environment. Participants
should know how to navigate and modify files within a Linux environment. Existing knowledge of
Hadoop and Java is not required.

7/23/2015

Course Outline
Module 1: Introduction to Big Data and Hadoop

Introduction to Big Data
Introduction to Hadoop
Why Hadoop
Hadoop & Traditional RDBMS
Components of Hadoop & Hadoop Architecture
History and uses of Hadoop

Module 2: Planning Hadoop Cluster

Overview of Hadoop Clusters
Planning your Hadoop Cluster
Overview of Hardware and other Network configurations
Network Topology for Hadoop Clusters
Overview of Cluster Management

Module 3: Hadoop Installation and Configuration

Overview of various deployment types
Installing and configuring Hadoop
Configuring a single node Hadoop Cluster
Configuring a multi node Hadoop Cluster
Checking the correctness of Hadoop installation
Demos:
o Install Ubuntu Server 12.04
o Install Hadoop 1.0 in Ubuntu Server 12.04
o Create a Clone of Hadoop Virtual Machine
o Perform Clustering of the Hadoop Environment
o Install Hadoop 2.0 in Ubuntu Server 12.04

Module 4: Advanced Cluster Configuration Features

Hadoop configuration overview and important configuration files
Configuration parameters and values
HDFS parameters
MapReduce parameters
Hadoop environment setup
Include and Exclude configuration files
Demo: Configuration Settings of Hadoop
Lab Exercise

Module 5: Hadoop Distributed File System

Introduction to HDFS
Overview of HDFS Architecture
Overview of HDFS Storage mechanisms
Overview of HDFS Rack
Writing and reading files from HDFS
Understanding the important commands of HDFS
Introduction to Sqoop
Installing and configuring Sqoop
Demos:
o Install Sqoop
o HDFS Demo
Lab Exercise

Module 6: Overview of MapReduce and YARN

Introduction to MapReduce
MapReduce Architecture and working with MapReduce
Development and Libraries of MapReduce
MapReduce component failures and recoveries
Introduction to YARN
YARN Architecture
Installing and configuring YARN
Working with YARN & YARN Web UI

Module 7: Important Hadoop Components

Understanding Hive
Installing and configuring Hive
Understanding Pig
Installing and configuring Pig
Understanding Impala
Installing and configuring Impala
Demos:
o Install Hive
o Install Pig
Lab Exercises

Module 8: Hadoop Administration and Maintenance

Namenode/Datanode directory structures and files
File system image and Edit log
The Checkpoint Procedure
Namenode failure and recovery procedure
Safe Mode
Metadata and Data backup
Potential problems and solutions / what to look for
Adding and removing nodes
Lab Exercise

Module 9: Hadoop Ecosystem Components

Ecosystem Component: Ganglia
o Install and Configure Ganglia on a Cluster
o Configure and Use Ganglia
o Use Ganglia for Graphs
Ecosystem Component: Nagios
o Nagios Concepts
o Install and Configure Nagios on Cluster
o Use Nagios for Sample Alerts and Monitoring
Ecosystem Component: Sqoop
o Install and Configure Sqoop on Cluster
o Import Data From Oracle/MySQL to Hive
Overview of Other Ecosystem Components:
o Oozie
o Avro
o Thrift
o REST
o Mahout
o Cassandra
o YARN
o MR2

Hadoop Security

Kerberos and Hadoop
Why Hadoop Security Is Important
Hadoop's Security System Concepts
What Kerberos Is and How It Works
Configuring Kerberos Security
Securing a Hadoop Cluster with Kerberos
Lab Exercise
