TRAINING SHEET

Administrator Training for Apache Hadoop

Take your knowledge to the next level with Cloudera's Apache Hadoop
Training and Certification

Cloudera University's three-day administrator training course for Apache Hadoop
provides System Administrators with a comprehensive understanding of all the
steps necessary to operate and manage Hadoop clusters. From installation
and configuration through load balancing and tuning your cluster, Cloudera's
Administration course has you covered.
Through lecture and interactive, hands-on exercises, attendees will cover topics
such as:
Introduction to Apache Hadoop and HDFS
Apache Hadoop architecture
Proper cluster configuration and deployment
Populating HDFS using Apache Sqoop
Management and monitoring tools
Job scheduling
Best practices for maintaining Apache Hadoop in Production
Installing and managing other Apache Hadoop projects
Diagnosing, tuning and solving Apache Hadoop issues

"Cloudera Administrator Training for Apache Hadoop helped me to advance
my use of Apache Hadoop and cultivate a better understanding of the platform's
inner workings. The course material, interactive labs and exercises really
helped cement together all the little bits and pieces that I had bumped into prior
to the class into a useful mental model of how Apache Hadoop works."
Eric Marshall,
Senior System Administrator

Audience

This course is intended for experienced developers who wish to write, maintain,
and/or optimize Apache Hadoop jobs. A background in Java is preferred, but
experience with other programming languages such as PHP, Python, or C# is
sufficient. Knowledge of algorithms and other computer science topics is a plus.
Upon completion of the course, attendees are able to attempt the Cloudera
Certified Developer for Apache Hadoop (CCDH) exam. Certification is a great
differentiator; it helps establish individuals as leaders in their field, providing
customers with tangible evidence of their skills.

Cloudera, Inc. 210 Portage Avenue, Palo Alto, CA 94306 USA | 1-888-789-1488 or 1-650-362-0488 | cloudera.com
© 2011 Cloudera, Inc. All rights reserved. Cloudera and the Cloudera logo are trademarks or registered trademarks of Cloudera Inc. in the USA and other countries. All other trademarks are the property of their
respective companies. Information is subject to change without notice.


Course Outline: Cloudera Administrator Training for Apache Hadoop
An Introduction To Hadoop And HDFS

o Why Hadoop?

o HDFS

o MapReduce

o Hive, Pig, HBase and other ecosystem projects

o Hands-On Exercise: Installing a pseudo-distributed cluster
Planning Your Hadoop Cluster

o General Planning Considerations

o Choosing The Right Hardware

o Node Topologies

o Choosing The Right Software
Deploying Your Cluster

o Installing Hadoop

o Using SCM Express for easy installation

o Typical Configuration Parameters

o Configuring Rack Awareness

o Using Configuration Management Tools

o Hands-On Exercise: Installing a Hadoop Cluster
Managing and Scheduling Jobs

o Starting and stopping MapReduce jobs

o Hands-On Exercise: Managing jobs

o The FIFO Scheduler

o The Fair Scheduler

o Hands-On Exercise: Using the FairScheduler
Cluster Maintenance

o Checking HDFS with fsck

o Hands-On Exercise: Breaking the Cluster

o Copying data with distcp

o Rebalancing cluster nodes

o Adding and removing cluster nodes

o Hands-On Exercise: Verifying the Cluster's Self-Healing Features

o Backup And Restore

o Upgrading and Migrating

o Hands-On Exercise: Backing Up and Restoring the NameNode Metadata
Cluster Monitoring, Troubleshooting and Optimizing

o Hadoop Log Files

o Using the NameNode and JobTracker Web UIs

o Interpreting Job Logs

o Monitoring with Ganglia

o Other monitoring tools

o General Optimization Tips

o Benchmarking Your Cluster
Populating HDFS From External Sources

o Using Sqoop

o Using Flume

o Best Practices for Data Ingestion
Installing And Managing Other Hadoop Projects

o Hive

o Pig

o HBase

o Hands-On Exercise: Configuring the Hive Shared Metastore
Cloudera Certified Administrator Exam

Cloudera Certified Administrator for Apache Hadoop (CCAH)

Establish yourself as a trusted and valuable resource by completing the online certification exam for Apache Hadoop
Administrators. The exam is demanding and is designed to test your fluency with concepts and terminology in the following areas:
Apache Hadoop Cluster Overview

Daemons and normal operation of an Apache Hadoop cluster, both in data storage and in data
processing. The current features of computing systems that motivate a system like Apache Hadoop

Apache Hadoop Cluster Planning

Principal points to consider in choosing the hardware and operating systems to host an Apache
Hadoop cluster

Apache Hadoop Cluster Management

Cluster handling of disk and machine failures. Regular tools for monitoring and managing the Apache
Hadoop file system

Job Scheduling

How the default scheduler and the fair scheduler handle the tasks in a mix of jobs running on a cluster

Monitoring and Logging

Basic functionality and features of Apache Hadoop's logging and monitoring systems

