Requirements:
- Core Java knowledge is preferable.

Hardware Requirements:
- Systems must have at least 2 GB of RAM.

Contents:

VirtualBox/VMware
  a. Basics
  b. Installations
  c. Backups
  d. Snapshots

Linux
  a. Basics
  b. Installations
  c. Commands

Hadoop
  a. Why Hadoop?
  b. Scaling
  c. Distributed Framework
  d. Hadoop vs. RDBMS
  e. Brief history of Hadoop

Setup Hadoop
  a. Pseudo mode
  b. Cluster mode
  c. IPv6
  d. SSH
  e. Installation of Java, Hadoop
  f. Configurations of Hadoop
  g. Hadoop Processes (NN, SNN, JT, DN, TT)
  h. Temporary directory
PEERS TECHNOLOGIES PVT LTD IT TRAINING AND SERVICES Regd. Office # 207, II floor, HUDA Maithrivanam, Ameerpet, Hyderabad 500 038. 040-40310000 E-mail: enq@peerstech.com; URL: www.peerstech.com
  i. UI
  j. Common errors when running a Hadoop cluster, and their solutions

HDFS - Hadoop Distributed File System
  a. HDFS Design and Architecture
  b. HDFS Concepts
  c. Interacting with HDFS using the command line
  d. Interacting with HDFS using Java APIs
  e. Dataflow
  f. Blocks
  g. Replicas

Hadoop Processes
  a. Name node
  b. Secondary name node
  c. Job tracker
  d. Task tracker
  e. Data node

MapReduce
  a. Developing a MapReduce Application
  b. Phases in the MapReduce Framework
  c. MapReduce Input and Output Formats
  d. Advanced Concepts
  e. Sample Applications
  f. Combiner
  g. HAR
  h. Joining datasets in MapReduce jobs
     a. Map-side join
     b. Reduce-side join
MapReduce customization
  a. Custom Input format class
  b. Hash Partitioner
  c. Custom Partitioner
  d. Sorting techniques
  e. Custom Output format class

Hadoop Programming Languages:

Pig
  a. Introduction
  b. Installation and Configuration
  c. Interacting with HDFS using Pig
  d. MapReduce Programs through Pig
  e. Pig Commands
  f. Loading, Filtering, Grouping
  g. Data types, Operators
  h. Joins, Groups
  i. Sample programs in Pig

Hive
  a. Basics
  b. Installation and Configurations
  c. Commands

NoSQL Database Concepts

Specialties: ETL tool (PDI) (Data Warehousing BI Tools)
  a. Introduction
  b. Creating an RDBMS database
  c. Establishing a connection between PDI and the RDBMS database
  d. Creating data in Hadoop
  e. Establishing a connection between PDI and Hadoop data
  f. Summarization
OVERVIEW: HADOOP DEVELOPER

Introduction

The Motivation for Hadoop
  o Problems with traditional large-scale systems
  o Requirements for a new approach

Hadoop: Basic Concepts
  o An Overview of Hadoop
  o The Hadoop Distributed File System
  o Hands-On Exercise
  o How MapReduce Works
  o Hands-On Exercise
  o Anatomy of a Hadoop Cluster
  o Other Hadoop Ecosystem Components

Writing a MapReduce Program
  o The MapReduce Flow
  o Examining a Sample MapReduce Program
  o Basic MapReduce API Concepts
  o The Driver Code
  o The Mapper
  o The Reducer
  o Hadoop's Streaming API
  o Using Eclipse for Rapid Development
  o Hands-On Exercise
  o The New MapReduce API
Delving Deeper Into The Hadoop API
  o More about ToolRunner
  o Testing with MRUnit
  o Reducing Intermediate Data With Combiners
  o The configure and close methods for Map/Reduce Setup and Teardown
  o Writing Partitioners for Better Load Balancing
  o Hands-On Exercise
  o Directly Accessing HDFS
  o Using the Distributed Cache
  o Hands-On Exercise

Common MapReduce Algorithms
  o Sorting and Searching
  o Indexing
  o Machine Learning With Mahout
  o Term Frequency - Inverse Document Frequency
  o Word Co-Occurrence
  o Hands-On Exercise

Using HBase
  o What is HBase?
  o HBase Architecture
  o HBase API
  o Managing large data sets with HBase
  o Using HBase in Hadoop applications
  o Hands-On Exercise

Using Hive and Pig
  o Hive Basics
  o Pig Basics
  o Hands-On Exercise
Practical Development Tips and Techniques
  o Debugging MapReduce Code
  o Using LocalJobRunner Mode For Easier Debugging
  o Retrieving Job Information with Counters
  o Logging
  o Splittable File Formats
  o Determining the Optimal Number of Reducers
  o Map-Only MapReduce Jobs
  o Hands-On Exercise

More Advanced MapReduce Programming
  o Custom Writables and WritableComparables
  o Saving Binary Data using SequenceFiles and Avro Files
  o Creating InputFormats and OutputFormats
  o Hands-On Exercise

Joining Data Sets in MapReduce
  o Map-Side Joins
  o The Secondary Sort
  o Reduce-Side Joins