PEERS TECHNOLOGIES PVT LTD

IT TRAINING AND SERVICES


Regd. Office # 207, II floor, HUDA Maithrivanam, Ameerpet, Hyderabad 500 038.
040-40310000
E-mail: enq@peerstech.com; URL: www.peerstech.com


Hadoop Developer Module

Requirements: Core Java knowledge is preferable.
Hardware Requirements: Systems must have at least 2 GB RAM.
Contents:
VirtualBox/VMware
a. Basics
b. Installations
c. Backups
d. Snapshots
Linux
a. Basics
b. Installations
c. Commands
Hadoop
a. Why Hadoop?
b. Scaling
c. Distributed Framework
d. Hadoop vs. RDBMS
e. Brief history of Hadoop
Hadoop Setup
a. Pseudo-distributed mode
b. Cluster mode
c. IPv6
d. SSH
e. Installation of Java and Hadoop
f. Hadoop configuration
g. Hadoop processes (NN, SNN, JT, DN, TT)
h. Temporary directory
i. UI
j. Common errors when running a Hadoop cluster, and their solutions
HDFS: Hadoop Distributed File System
a. HDFS Design and Architecture
b. HDFS Concepts
c. Interacting with HDFS using the command line
d. Interacting with HDFS using Java APIs
e. Dataflow
f. Blocks
g. Replica
Hadoop Processes
a. NameNode
b. Secondary NameNode
c. JobTracker
d. TaskTracker
e. DataNode
MapReduce
a. Developing a MapReduce Application
b. Phases in the MapReduce Framework
c. MapReduce Input and Output Formats
d. Advanced Concepts
e. Sample Applications
f. Combiner
g. HAR
Joining datasets in MapReduce jobs
a. Map-side join
b. Reduce-Side join
MapReduce customization
a. Custom Input format class
b. Hash Partitioner
c. Custom Partitioner
d. Sorting techniques
e. Custom Output format class
Hadoop Programming Languages:
Pig
a. Introduction
b. Installation and Configuration
c. Interacting with HDFS using Pig
d. MapReduce Programs through Pig
e. Pig Commands
f. Loading, Filtering, Grouping
g. Data types, Operators
h. Joins, Groups
i. Sample programs in Pig
Hive
a. Basics
b. Installation and Configurations
c. Commands.
NoSQL Database Concepts
Specialties:
ETL tool (PDI) (Data Warehousing BI Tools)
a. Introduction
b. Creating RDBMS database
c. Establishing a connection between PDI and the RDBMS database
d. Creating data in Hadoop
e. Establishing a connection between PDI and Hadoop data
f. Summarization

HADOOP DEVELOPER OVERVIEW
Introduction
The Motivation for Hadoop
o Problems with traditional large-scale systems
o Requirements for a new approach
Hadoop: Basic Concepts
o An Overview of Hadoop
o The Hadoop Distributed File System
o Hands-On Exercise
o How MapReduce Works
o Hands-On Exercise
o Anatomy of a Hadoop Cluster
o Other Hadoop Ecosystem Components
Writing a MapReduce Program
o The MapReduce Flow
o Examining a Sample MapReduce Program
o Basic MapReduce API Concepts
o The Driver Code
o The Mapper
o The Reducer
o Hadoop's Streaming API
o Using Eclipse for Rapid Development
o Hands-on exercise
o The New MapReduce API
Delving Deeper Into The Hadoop API
o More about ToolRunner
o Testing with MRUnit
o Reducing Intermediate Data With Combiners
o The configure and close methods for Map/Reduce Setup and Teardown
o Writing Partitioners for Better Load Balancing
o Hands-On Exercise
o Directly Accessing HDFS
o Using the Distributed Cache
o Hands-On Exercise.
Common MapReduce Algorithms
o Sorting and Searching
o Indexing
o Machine Learning With Mahout
o Term Frequency-Inverse Document Frequency (TF-IDF)
o Word Co-Occurrence
o Hands-On Exercise.
Using HBase
o What is HBase?
o HBase Architecture
o HBase API
o Managing large data sets with HBase
o Using HBase in Hadoop applications
o Hands-on exercise.
Using Hive and Pig
o Hive Basics
o Pig Basics
o Hands-on exercise.
Practical Development Tips and Techniques
o Debugging MapReduce Code
o Using LocalJobRunner Mode For Easier Debugging
o Retrieving Job Information with Counters
o Logging
o Splittable File Formats
o Determining the Optimal Number of Reducers
o Map-Only MapReduce Jobs
o Hands-On Exercise.
More Advanced MapReduce Programming
o Custom Writables and WritableComparables
o Saving Binary Data using SequenceFiles and Avro Files
o Creating InputFormats and OutputFormats
o Hands-On Exercise
Joining Data Sets in MapReduce
o Map-Side Joins
o The Secondary Sort
o Reduce-Side Joins