Big Data Hadoop

Toll Free: 1800-30000-893 | www.mtaeducation.in
Hadoop Syllabus
SUMMER TRAINING 2018
Instructor Information
Instructor: Certified Instructor
Email: mtaeducation@outlook.com
Office Location & Hours: Aspirevision Nodal Centers, 80 hours
General Information
Description
Learn the concepts and implementation of Hadoop and Java programming, and take the first step on your journey to becoming a Hadoop Developer!
Expectations and Goals
This comprehensive Hadoop Big Data training course is designed by industry experts around current industry job requirements and provides in-depth learning on big data and the Hadoop modules. It is an industry-recognized Big Data certification training course that combines the Hadoop developer, Hadoop administrator, and analytics training courses.
Course Materials
Required Materials
• Laptop
• 6+ GB RAM (8 GB recommended)
Optional Materials
• Internet Connection
Course Syllabus
Introduction to Big Data & Hadoop and Its Ecosystem, MapReduce and HDFS
What is Big Data, Where does Hadoop fit in, Hadoop Distributed File System (Replication, Block Size, Secondary NameNode, High Availability), Understanding YARN (ResourceManager, NodeManager), Difference between Hadoop 1.x and 2.x
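To make the block size and replication topics concrete, the following is a small Python sketch of how a file's size translates into HDFS blocks and raw storage. It assumes the Hadoop 2.x defaults of a 128 MB block size and a replication factor of 3; the function name and the sample file size are illustrative only and are not part of the course material.

```python
import math

BLOCK_SIZE_MB = 128  # assumed: Hadoop 2.x default block size
REPLICATION = 3      # assumed: default replication factor


def hdfs_footprint(file_size_mb, block_size_mb=BLOCK_SIZE_MB, replication=REPLICATION):
    """Return (number of blocks, raw storage in MB across all replicas)."""
    # A file is split into fixed-size blocks; the last block may be partial.
    blocks = math.ceil(file_size_mb / block_size_mb)
    # A partial last block only occupies the bytes it holds, so raw storage
    # is the file size times the replication factor.
    raw_mb = file_size_mb * replication
    return blocks, raw_mb


blocks, raw_mb = hdfs_footprint(1000)
print(blocks, raw_mb)
```

Both values can be changed per cluster or per file, so the numbers here are only a starting point for the exercises.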
Hadoop Installation & Setup
Hadoop 2.x Cluster Architecture, Federation and High Availability, A Typical Production Cluster Setup, Hadoop Cluster Modes, Common Hadoop Shell Commands, Hadoop 2.x Configuration Files, Cloudera Single-Node Cluster
Core Java Fundamentals
Basic Overview, Classes & Objects Revisited, Inheritance, Interfaces
Deep Dive in MapReduce
How MapReduce Works, How Reducer Works, How Driver Works, Combiners, Partitioners, Input Formats, Output Formats, Shuffle and Sort, Map-Side Joins, Reduce-Side Joins, Distributed Cache.
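The flow from map through shuffle and sort to reduce can be sketched in plain Python without a cluster. This is an in-memory illustration only, not Hadoop's Java API; the function names and sample input are hypothetical.

```python
from collections import defaultdict


def mapper(line):
    """Map phase: emit a (word, 1) pair for every word in the line."""
    for word in line.lower().split():
        yield word, 1


def shuffle_and_sort(pairs):
    """Group values by key and return the groups in sorted key order."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return sorted(groups.items())


def reducer(key, values):
    """Reduce phase: sum the counts collected for one word."""
    return key, sum(values)


def word_count(lines):
    """Driver: wire the three phases together for a list of input lines."""
    mapped = [pair for line in lines for pair in mapper(line)]
    return [reducer(key, values) for key, values in shuffle_and_sort(mapped)]


result = word_count(["big data hadoop", "hadoop big"])
print(result)  # [('big', 2), ('data', 1), ('hadoop', 2)]
```

In a real job the framework performs the shuffle and sort between mapper and reducer tasks; here it is a single function so that the data movement stays visible.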
Linux Fundamentals
Basic Linux commands, understanding the Linux environment, exercises for practice
Lab Exercises
Working with HDFS, Writing a WordCount Program, Writing a Custom Partitioner, MapReduce with Combiner, Map-Side Join, Reduce-Side Joins, Running MapReduce in Local Job Runner Mode.
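As a warm-up for the combiner and custom partitioner exercises, the sketch below shows both ideas in plain Python. It is not Hadoop code: the first-letter routing rule, the function names, and the sample data are hypothetical, and keys are assumed to be non-empty strings.

```python
from collections import Counter


def combine(pairs):
    """Combiner: pre-aggregate (word, count) pairs from one mapper's output."""
    totals = Counter()
    for word, count in pairs:
        totals[word] += count
    return sorted(totals.items())


def first_letter_partitioner(key, num_reducers):
    """Hypothetical custom partitioner: pick a reducer from the key's first letter."""
    return (ord(key[0].lower()) - ord("a")) % num_reducers


mapper_output = [("hadoop", 1), ("apache", 1), ("hadoop", 1), ("cluster", 1)]
combined = combine(mapper_output)  # fewer pairs leave the mapper

# Route each combined pair to the reducer chosen by the partitioner.
partitions = {r: [] for r in range(2)}
for word, count in combined:
    partitions[first_letter_partitioner(word, 2)].append((word, count))
print(partitions)
```

The combiner reduces the data sent over the network, while the partitioner decides which reducer receives each key; every occurrence of a key must map to the same reducer for the final counts to be correct.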
Understanding Pig
A. Introduction to Pig: Understanding Apache Pig, its features and various uses, and learning to interact with Pig
B. Deploying Pig for data analysis: The syntax of Pig Latin, the various definitions, data sort and filter, data types, deploying Pig for ETL, data loading, schema viewing, field definitions, commonly used functions
C. Pig for complex data processing: Various data types including nested and complex, processing data with Pig, grouped data iteration, practical exercise
D. Performing multi-dataset operations: Data set joining, data set splitting, various methods for data set combining, set operations, hands-on exercise
Understanding Hive
A. Hive Introduction: Understanding Hive, comparison of Hive with traditional databases, Pig and Hive comparison, storing data in Hive and the Hive schema, interacting with Hive, various use cases of Hive
B. Hive for relational data analysis: Understanding HiveQL, basic syntax, the various tables and databases, data types, data set joining, various built-in functions, running Hive queries from scripts and the shell
C. Data management with Hive: The various databases, creation of databases, data formats in Hive, data loading, altering databases and tables, storing query results, data access control, managing data with Hive
D. Hands-on exercises: Working with large data sets and extensive querying
Understanding Sqoop
Sqoop Installation and Basics, Importing Data from MySQL to HDFS, Advanced Imports, Real-Time Use Case, Exporting Data from HDFS to MySQL, Running Sqoop in Cloudera
Understanding Flume
Overview of Apache Flume, Physically distributed data sources, Changing structure of data, Closer look, Anatomy of Flume, Core concepts, Event, Clients, Agents, Source, Channels, Sinks, Interceptors, Channel selector, Sink processor, Data ingest, Agent pipeline, Transactional data exchange, Routing and replicating, Why channels?, Use case: Log aggregation, Adding a Flume agent, Handling a server farm, Data volume per agent, Example describing a single-node Flume deployment
Introduction to Impala & (Avro) Data Formats
Impala
A. Introduction to Impala: What is Impala?, How Impala Differs from Hive and Pig, How Impala Differs from Relational Databases, Limitations and Future Directions, Using the Impala Shell
B. Choosing the Best (Hive, Pig, Impala)
C. Modeling and Managing Data with Impala and Hive: Data Storage Overview, Creating Databases and Tables, Loading Data into Tables, HCatalog, Impala Metadata Caching
D. Data Partitioning: Partitioning Overview, Partitioning in Impala and Hive
(Avro) Data Format: Selecting a File Format, Tool Support for File Formats, Avro Schemas, Using Avro with Hive and Sqoop, Avro Schema Evolution, Compression
Apache HBase
What is HBase, Where does it fit, What is NoSQL, HBase Basics & Architecture, Creating Tables, Listing Tables, Enabling & Disabling Tables, Describe, Alter and Drop Tables, Scan, Insert, Update, Read and Delete Data
Apache Spark
A. Why Spark? Working with Spark and the Hadoop Distributed File System: What is Spark, Comparison between Spark and Hadoop, Components of Spark
B. Running Spark on a Cluster, Writing Spark Applications using Java/Scala
ZooKeeper
ZooKeeper Introduction, ZooKeeper Use Cases, ZooKeeper Services, ZooKeeper Data Model
Oozie
Why Oozie?, Running an example, Oozie workflow engine, Word count example, Oozie job processing, Job submission
Quiz & Awards
Project
Additional Information and Resources
Awards & Certificates
Students will be provided with the following credibility-enhancement awards and documents:
✓ Microsoft International Certificate
✓ Aspirevision Tech Education Training Certificate
✓ Internship/Project Letter (on successful completion of the project)
Other Courses Offered by Aspirevision Tech Education