
Toll Free- 1800-30000-893 | www.mtaeducation.in

Hadoop Syllabus
SUMMER TRAINING 2018

Instructor Information

Instructor: Certified Instructor
Email: mtaeducation@outlook.com
Office Location & Hours: Aspirevision Nodal Centers - 80 hrs

General Information

Description
Learn the concepts and implementation of Hadoop and Java programming, and take the first step on your journey to becoming a Hadoop Developer!

Expectations and Goals
This is a comprehensive Hadoop Big Data training course, designed by industry experts around current industry job requirements, that provides in-depth learning on Big Data and the Hadoop modules. It is an industry-recognized Big Data certification training course that combines the Hadoop developer, Hadoop administrator, and analytics training courses.

Course Materials

Required Materials
• Laptop
• 6+ GB RAM (8 GB recommended)

Optional Materials
• Internet Connection

Course Syllabus

Introduction to Big Data & Hadoop and its Ecosystem, MapReduce and HDFS

What is Big Data, Where does Hadoop fit in, Hadoop Distributed File System – Replication, Block Size, Secondary NameNode, High Availability, Understanding YARN – ResourceManager, NodeManager, Differences between Hadoop 1.x and 2.x
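
To make the block size and replication topics concrete, here is a minimal Python sketch of the storage arithmetic. It assumes the Hadoop 2.x defaults of a 128 MB block size and a replication factor of 3; the function name `hdfs_footprint` is hypothetical and not part of any Hadoop API.

```python
import math

def hdfs_footprint(file_size_mb, block_size_mb=128, replication=3):
    """Estimate how HDFS stores one file (illustrative helper, not a Hadoop API).

    Returns (number of blocks, raw cluster storage in MB).
    The last block only occupies the bytes it actually holds,
    so raw storage is simply file size times the replication factor.
    """
    blocks = math.ceil(file_size_mb / block_size_mb)
    raw_storage_mb = file_size_mb * replication
    return blocks, raw_storage_mb

# A 1000 MB file is split into 8 blocks (7 full, 1 partial)
# and consumes 3000 MB of raw disk across the cluster.
blocks, raw_mb = hdfs_footprint(1000)
print(blocks, raw_mb)
```

Changing `block_size_mb` or `replication` shows how these two settings trade NameNode metadata and disk usage against parallelism and fault tolerance.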

Hadoop Installation & Setup

Hadoop 2.x Cluster Architecture, Federation and High Availability, A Typical Production Cluster Setup, Hadoop Cluster Modes, Common Hadoop Shell Commands, Hadoop 2.x Configuration Files, Cloudera Single-Node Cluster

Core Java Fundamentals

Basic Overview, Classes & Objects Revisited, Inheritance, Interfaces

Deep Dive in MapReduce

How MapReduce Works, How the Reducer Works, How the Driver Works, Combiners, Partitioners, Input Formats, Output Formats, Shuffle and Sort, Map-Side Joins, Reduce-Side Joins, Distributed Cache.
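
The flow from mapper through partitioner, shuffle and sort, and reducer can be sketched in plain Python without a cluster. This is a simplified single-process simulation, not Hadoop code: every function name is hypothetical, and `zlib.crc32` stands in for the key hash that a real partitioner would use.

```python
import zlib
from collections import defaultdict

def map_phase(line):
    """Mapper: emit (word, 1) for every word in one input line."""
    for word in line.lower().split():
        yield word, 1

def partition(key, num_reducers):
    """Partitioner: pick a reducer from a stable hash of the key (assumed scheme)."""
    return zlib.crc32(key.encode("utf-8")) % num_reducers

def shuffle_and_sort(mapped_pairs, num_reducers):
    """Group values by key inside each partition, then sort the keys."""
    partitions = [defaultdict(list) for _ in range(num_reducers)]
    for key, value in mapped_pairs:
        partitions[partition(key, num_reducers)][key].append(value)
    return [sorted(p.items()) for p in partitions]

def reduce_phase(key, values):
    """Reducer: sum the counts collected for one word."""
    return key, sum(values)

def word_count(lines, num_reducers=2):
    """Driver: wire the phases together and merge the reducer outputs."""
    mapped = [pair for line in lines for pair in map_phase(line)]
    result = {}
    for part in shuffle_and_sort(mapped, num_reducers):
        for key, values in part:
            word, total = reduce_phase(key, values)
            result[word] = total
    return result

print(word_count(["big data big", "data hadoop"]))
```

Each word always lands in the same partition, so a single reducer sees all of its counts; that guarantee is what lets the reducers run independently.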

Linux Fundamentals

Basic Linux commands, understanding the Linux environment, exercises for practice

Lab Exercises:

Working with HDFS, Writing a WordCount Program, Writing a Custom Partitioner, MapReduce with a Combiner, Map-Side Join, Reduce-Side Joins, Running MapReduce in Local Job Runner Mode.
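
As a warm-up for the reduce-side join exercise, the idea can be sketched in plain Python: the map step tags each record with its source, the shuffle groups records by join key, and the reduce step pairs records from the two sources. The datasets, tags, and function name below are made up for illustration; this is an inner join simulated in one process, not Hadoop code.

```python
from collections import defaultdict

def reduce_side_join(customers, orders):
    """Inner-join (customer_id, name) with (customer_id, item) records."""
    # Map: tag every record with its source so the reducer can tell them apart.
    tagged = [(cid, ("C", name)) for cid, name in customers]
    tagged += [(cid, ("O", item)) for cid, item in orders]

    # Shuffle: bring all records that share a join key together.
    groups = defaultdict(list)
    for key, tagged_value in tagged:
        groups[key].append(tagged_value)

    # Reduce: for each key, pair every customer record with every order record.
    joined = []
    for key in sorted(groups):
        names = [value for tag, value in groups[key] if tag == "C"]
        items = [value for tag, value in groups[key] if tag == "O"]
        for name in names:
            for item in items:
                joined.append((key, name, item))
    return joined

customers = [(1, "Asha"), (2, "Ravi")]
orders = [(1, "laptop"), (1, "mouse"), (3, "desk")]
print(reduce_side_join(customers, orders))
```

Customer 2 has no orders and order 3 has no customer, so neither appears in the output. A map-side join avoids the shuffle entirely by loading the smaller dataset into memory on every mapper.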

Understanding Pig

A. Introduction to Pig: Understanding Apache Pig, its features, its various uses, and learning to interact with Pig
B. Deploying Pig for data analysis: The syntax of Pig Latin, the various definitions, data sort and filter, data types, deploying Pig for ETL, data loading, schema viewing, field definitions, commonly used functions
C. Pig for complex data processing: Various data types including nested and complex, processing data with Pig, grouped data iteration, practical exercise
D. Performing multi-dataset operations: Data set joining, data set splitting, various methods for data set combining, set operations, hands-on exercise

Understanding Hive

A. Hive Introduction: Understanding Hive, traditional database comparison with Hive, Pig and Hive comparison, storing data in Hive and Hive schema, Hive interaction and various use cases of Hive
B. Hive for relational data analysis: Understanding HiveQL, basic syntax, the various tables and databases, data types, data set joining, various built-in functions, running Hive queries from scripts and the shell
C. Data management with Hive: The various databases, creation of databases, data formats in Hive, data loading, changing databases and tables, storing query results, data access control, managing data with Hive
D. Hands-on Exercises – working with large data sets and extensive querying

Understanding Sqoop

Sqoop Installation and Basics, Importing Data from MySQL to HDFS, Advanced Imports, Real-Time Use Case, Exporting Data from HDFS to MySQL, Running Sqoop in Cloudera

Understanding Flume

Overview of Apache Flume, Physically distributed data sources, Changing structure of data, Closer look, Anatomy of Flume, Core concepts, Event, Clients, Agents, Source, Channels, Sinks, Interceptors, Channel selector, Sink processor, Data ingest, Agent pipeline, Transactional data exchange, Routing and replicating, Why channels?, Use case: Log aggregation, Adding a Flume agent, Handling a server farm, Data volume per agent, Example describing a single-node Flume deployment

Introduction to Impala & (Avro) Data Formats

Impala

A. Introduction to Impala: What is Impala?, How Impala Differs from Hive and Pig, How Impala Differs from Relational Databases, Limitations and Future Directions, Using the Impala Shell

B. Choosing the Best (Hive, Pig, Impala)

C. Modeling and Managing Data with Impala and Hive: Data Storage Overview, Creating Databases and Tables, Loading Data into Tables, HCatalog, Impala Metadata Caching

D. Data Partitioning: Partitioning Overview, Partitioning in Impala and Hive

(Avro) Data Format: Selecting a File Format, Tool Support for File Formats, Avro Schemas, Using Avro with Hive and Sqoop, Avro Schema Evolution, Compression

Apache HBase

What is HBase, Where does it fit, What is NoSQL, HBase Basics & Architecture, Creating Tables, Listing Tables, Enabling & Disabling Tables, Describe, Alter, and Drop Tables, Scan, Insert, Update, Read, and Delete Data

Apache Spark

A. Why Spark? Working with Spark and Hadoop Distributed File System: What is Spark, Comparison between Spark and Hadoop, Components of Spark
B. Running Spark on a Cluster, Writing Spark Applications using Java/Scala

ZooKeeper

ZooKeeper Introduction, ZooKeeper Use Cases, ZooKeeper Services, ZooKeeper Data Model

Oozie

Why Oozie?, Running an example, Oozie workflow engine, Word count example, Oozie job processing, Job submission

Quiz & Awards

Project

Additional Information and Resources

Students will be provided with the following credibility enhancement awards and documents:

✓ Microsoft International Certificate
✓ Aspirevision Tech Education Training Certificate
✓ Internship/Project Letter (on successful completion of project)
Awards & Certificates

Other Courses Offered by Aspirevision Tech Education
