HADOOP



Hadoop is a framework to store and process big data.
If the amount of data is more than terabytes, then such data is called big data.
Or
If the amount of data is beyond the storage and processing capabilities of a single physical machine, then it is called big data.
Hadoop can process
1. Unstructured data
ex: files, images, sentiment data maintained by social networking sites.
2. Semi-structured data
ex: XML files, Excel worksheets.
3. Structured data
ex: table information present in an RDBMS.
RDBMS systems can process at most terabytes of data. They are not designed to process big data.
In the case of an RDBMS, to process huge amounts of data we need specialised software and highly configured hardware (expensive).
We can overcome the above problems using Hadoop.
It is completely free to download (open source). To work with Hadoop we just need to maintain commodity systems (i.e., we need not use highly configured hardware).
Advantages of Hadoop:
1. Scalability.
2. Fault tolerance.
3. High availability.
Hadoop was developed by Doug Cutting.
Hadoop is not a technology name but a "toy name" (Cutting named it after his son's toy elephant).
Hadoop consists of
1. HDFS (Hadoop Distributed File System, which is implemented in Java).
2. MapReduce (a programming model to process the data present in HDFS).


HDFS:
It is also called a master-slave architecture.
In a Hadoop cluster there will be one main node called the NameNode (master). The other nodes which are connected to the NameNode are called DataNodes (slaves). At the time of copying a file into HDFS, the HDFS client (built-in framework classes) receives the command and forwards it to the NameNode. The NameNode checks whether the specified file already exists and also checks whether the user has sufficient privileges. If the file already exists, or the user does not have sufficient privileges, it raises an IOException. If the specified file does not exist, the entire file is divided into smaller chunks. Each chunk is called a block. The default size of each block is 64 MB, but it can be configured and the size can be increased. The maximum suggested size of a block is 128 MB. For example, a 200 MB file with the default 64 MB block size is stored as four blocks of 64, 64, 64, and 8 MB.
The NameNode decides where exactly each block has to be stored. If block1 is stored in DN1 then block2 can be stored in any other datanode.
Replication policy:
In Hadoop, by default each block is stored 3 times in the hadoop cluster, i.e., every file is stored 3 times in the hadoop cluster. This is called the replication policy.
Through the configuration files we can increase or decrease the replication factor value.
NOTE: Two replicas of the same block will never be stored in the same data node, but more than one block of the same file can be stored in the same data node.
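The replication factor can also be changed per file after it has been written, without editing the configuration files. A small illustrative example (the path satya/abc.txt is a placeholder, matching the sample paths used later in these notes):
hadoop fs -setrep 2 satya/abc.txt
This re-replicates the blocks of the given file to a factor of 2.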
Name node:
The NameNode is the master node, which is responsible for maintaining the metadata information about each file present in HDFS. That is, the name node maintains
1. the path of the file,
2. the number of blocks available for each file,
3. the addresses of the data nodes where exactly each block is present.
The above information is stored in the FSImage file & EditLog file.

Without the NameNode the file system cannot be used. If the name node goes down there is no way of reconstructing the files from the blocks on the data nodes, so the NameNode is the single point of failure in generation-one hadoop. In generation-one hadoop there is a secondary namenode. When the name node starts up, it merges the fsimage and edit log files to provide an up-to-date view of the file system metadata. The namenode then overwrites fsimage with the new HDFS state and begins a new edit log.
The checkpoint node periodically creates checkpoints of the namespace. It downloads fsimage and edits from the active namenode, then merges them locally and uploads the new image back to the active namenode.
The checkpoint node usually runs on a different machine than the namenode, as its memory requirements are of the same order as the namenode's. It keeps a copy of the merged namespace image, which can be used in the event of the namenode failing.
In generation-two hadoop there are two master nodes. If the active name node goes down, then the passive namenode automatically becomes the active namenode.
Big companies like Google and Facebook maintain clusters in the form of data centers. Data centers are usually in different locations like the US, Canada, UK, etc. A data center is a collection of racks (rack1, rack2, ...). A rack is nothing but a set of nodes (machines). These nodes are connected through a network, and each data centre is also connected to the others through a network. This entire setup is called the network topology.
Note: The main idea of the replication policy is that if the first replica of a block is stored in rack1 of data centre 1, then the second replica can be stored in another rack of the same data center or in some other data centre.

CAP Theorem:
C : Consistency
A : Availability
P : Partition Tolerance
Consistency:
Once we write anything into the hadoop cluster we can read it back at any moment of time without loss of data. Hadoop supports 100% consistency.
Availability:
There is no chance of failure (approximately 99% available).
Partition Tolerance:
Even if there are network breakdowns between two data centres, or any two racks, or any two nodes, we are still able to process the data present in the cluster.

MAP REDUCE
MapReduce is a programming model for data processing. It is also called a master & slave architecture. The JobTracker is the master process in MapReduce, whereas the TaskTracker is the slave. These are the processes which are used to process the data present in HDFS. The JobTracker runs on the name node machine, whereas the task trackers run on the data nodes.
Before the JobTracker chooses a task for a TaskTracker, the JobTracker must choose a job from the list of jobs which are submitted by multiple users from multiple data nodes. The JobTracker selects the highest-priority job from the list (the user can assign a priority to a job while submitting it to the cluster). The job tracker chooses a task from the job & assigns it to the TaskTrackers. The job tracker coordinates all the jobs it has assigned to the task trackers. TaskTrackers run the tasks and send heartbeats (progress reports) to the JobTracker. TaskTrackers run a simple loop that periodically sends heartbeat method calls to the JobTracker. Heartbeats tell the job tracker that a task tracker is alive. As part of the heartbeat, the task tracker will indicate whether it is ready to run a new task. If it is able to start a new task then the JobTracker will allocate a new task to the TaskTracker. The JobTracker keeps the overall progress of each job. If a task fails, then the job tracker can reschedule it on another task tracker.
TaskTrackers have a fixed number of slots for map tasks and for reduce tasks. These are set independently.
For ex:
A TaskTracker may be configured to run two map tasks and two reduce tasks simultaneously (the number of tasks depends on the amount of memory on the TaskTracker system). In the context of a given job, the default scheduler fills empty map task slots before reduce task slots. So if the TaskTracker has at least one empty map task slot, the JobTracker will select a map task; otherwise it will select a reduce task.
The Hadoop framework divides the input of a MapReduce job into fixed-size pieces called input splits. Hadoop creates one map task for each split. Each split will be divided into records (every row is a record). For every record one unique number is assigned; this number is called the offset. For example, if the first line of a split is "hello world" (11 characters plus the newline), the first record has offset 0 and the second record has offset 12. For each record in the split a user-defined function is called; that function's name is map(). Having many splits means the time to process each split is small compared to the time to process the whole input. For most jobs a good split size is the 64 MB default. Hadoop creates the map task on the data node where the input resides in HDFS; this is called the data locality optimization.
Map tasks write their o/p to the local disks, not to HDFS. Map output is intermediate output: it is processed again by reducers to produce the final output. Once the job is completed, the mapper output can be thrown away. If the node running a map task fails before the mapper output has been consumed by the reduce task, then the job tracker will automatically re-create the map task on another node.
The input to a single reduce task is normally the output from all mappers. In the present example we have a single reduce task that is fed by all of the map tasks. Therefore the sorted map outputs have to be transferred across the network to the node where the reduce task is running, where they are merged and then passed to the user-defined reduce function. The output of the reduce is normally stored in HDFS for reliability.
The number of reduce tasks is not governed by the size of the input but is specified independently. In "The Default MapReduce Job" you will see how to choose the number of reduce tasks for a given job. When there are multiple reducers, the map tasks partition their output, each creating one partition for each reduce task. There can be many keys (and their associated values) in each partition, but the records for every key are all in a single partition. The partitioning can be controlled by a user-defined partitioning function, but normally the default partitioner, which buckets keys using a hash function, works very well. It is also possible to have zero reduce tasks.
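For reference, the default partitioner's bucketing is only one line of code. The sketch below mirrors the logic of Hadoop's built-in HashPartitioner, written here for Text keys and LongWritable values to match the word count example later in these notes:

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Partitioner;
public class HashPartitionerSketch extends Partitioner<Text, LongWritable>
{
    @Override
    public int getPartition(Text key, LongWritable value, int numReduceTasks)
    {
        // mask the sign bit so the result is never negative, then bucket by hash
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }
}

Because the partition depends only on the key, all records for a given key land in the same partition, which is exactly the guarantee described above.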
Combiner Functions
Many MapReduce jobs are limited by the bandwidth available on the cluster, so it pays to minimize the data transferred between map and reduce tasks. Hadoop allows the user to specify a combiner function to be run on the map output; the combiner function's output forms the input to the reduce function. Since the combiner function is an optimization, Hadoop does not provide a guarantee of how many times it will call it for a particular map output record, if at all. In other words, calling the combiner function zero, one, or many times should produce the same output from the reducer. For example, if one map task emits (hadoop,1) three times, a summing combiner can fold these into a single (hadoop,3) before the shuffle; the reducer produces the same total either way.
Anatomy of a File Read:
The client opens the file it wishes to read by calling open() on the FileSystem object, which for HDFS is an instance of DistributedFileSystem (step 1). DistributedFileSystem calls the namenode, using RPC, to determine the locations of the blocks for the first few blocks in the file (step 2). For each block, the namenode returns the addresses of the datanodes that have a copy of that block. Furthermore, the datanodes are sorted according to their proximity to the client (according to the topology of the cluster's network; see "Network Topology and Hadoop"). If the client is itself a datanode (in the case of a MapReduce task, for instance), then it will read from the local datanode. The DistributedFileSystem returns an FSDataInputStream (an input stream that supports file seeks) to the client for it to read data from. FSDataInputStream in turn wraps a DFSInputStream, which manages the datanode and namenode I/O. The client then calls read() on the stream (step 3).
DFSInputStream, which has stored the datanode addresses for the first few blocks in the file, then connects to the first (closest) datanode for the first block in the file. Data is streamed from the datanode back to the client, which calls read() repeatedly on the stream (step 4). When the end of the block is reached, DFSInputStream will close the connection to the datanode, then find the best datanode for the next block (step 5). This happens transparently to the client, which from its point of view is just reading a continuous stream. Blocks are read in order, with the DFSInputStream opening new connections to datanodes as the client reads through the stream. It will also call the namenode to retrieve the datanode locations for the next batch of blocks as needed. When the client has finished reading, it calls close() on the FSDataInputStream (step 6). During reading, if the client encounters an error while communicating with a datanode, it will try the next closest one for that block. It will also remember datanodes that have failed so that it doesn't needlessly retry them for later blocks. The client also verifies checksums for the data transferred to it from the datanode. If a corrupted block is found, it is reported to the namenode before the client attempts to read a replica of the block from another datanode. One important aspect of this design is that the client contacts datanodes directly to retrieve data and is guided by the namenode to the best datanode for each block.
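The read path described above can be driven from a few lines of client code. A minimal sketch, assuming the cluster configuration (core-site.xml) is on the classpath and that the file firstdir/mat.doc from the commands section exists in HDFS (both assumptions are for illustration only):

import java.io.InputStream;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;
public class FileReadDemo
{
    public static void main(String[] args) throws Exception
    {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf); // DistributedFileSystem for HDFS
        InputStream in = null;
        try
        {
            in = fs.open(new Path("firstdir/mat.doc"));     // open() returns an FSDataInputStream (steps 1-2)
            IOUtils.copyBytes(in, System.out, 4096, false); // repeated read() calls on the stream (steps 3-5)
        }
        finally
        {
            IOUtils.closeStream(in);                        // close() on the stream (step 6)
        }
    }
}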
Anatomy of a File Write:
The client creates the file by calling create() on DistributedFileSystem (step 1). DistributedFileSystem makes an RPC call to the namenode to create a new file in the filesystem's namespace, with no blocks associated with it (step 2). The namenode performs various checks to make sure the file doesn't already exist and that the client has the right permissions to create the file. If these checks pass, the namenode makes a record of the new file; otherwise, file creation fails and the client is thrown an IOException. The DistributedFileSystem returns an FSDataOutputStream for the client to start writing data to. Just as in the read case, FSDataOutputStream wraps a DFSOutputStream, which handles communication with the datanodes and namenode. As the client writes data (step 3), DFSOutputStream splits it into packets, which it writes to an internal queue called the data queue. The data queue is consumed by the DataStreamer, whose responsibility it is to ask the namenode to allocate new blocks by picking a list of suitable datanodes to store the replicas. The list of datanodes forms a pipeline; we'll assume the replication level is 3, so there are three nodes in the pipeline. The DataStreamer streams the packets to the first datanode in the pipeline, which stores the packet and forwards it to the second datanode in the pipeline. Similarly, the second datanode stores the packet and forwards it to the third (and last) datanode in the pipeline (step 4). DFSOutputStream also maintains an internal queue of packets that are waiting to be acknowledged by datanodes, called the ack queue. A packet is removed from the ack queue only when it has been acknowledged by all the datanodes in the pipeline (step 5).
If a datanode fails while data is being written to it, then the following actions are taken, which are transparent to the client writing the data. First, the pipeline is closed, and any packets in the ack queue are added to the front of the data queue so that datanodes that are downstream from the failed node will not miss any packets. The current block on the good datanodes is given a new identity, which is communicated to the namenode, so that the partial block on the failed datanode will be deleted if the failed datanode recovers later on. The failed datanode is removed from the pipeline and the remainder of the block's data is written to the two good datanodes in the pipeline. The namenode notices that the block is under-replicated, and it arranges for a further replica to be created on another node. Subsequent blocks are then treated as normal.
It's possible, but unlikely, that multiple datanodes fail while a block is being written. As long as dfs.replication.min replicas (default one) are written, the write will succeed, and the block will be asynchronously replicated across the cluster until its target replication factor is reached (dfs.replication, which defaults to three). When the client has finished writing data, it calls close() on the stream (step 6). This action flushes all the remaining packets to the datanode pipeline and waits for acknowledgments before contacting the namenode to signal that the file is complete (step 7). The namenode already knows which blocks the file is made up of (via the DataStreamer asking for block allocations), so it only has to wait for blocks to be minimally replicated before returning successfully.
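The write path can be exercised the same way. A minimal sketch, again assuming the cluster configuration is on the classpath; the output path is illustrative:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
public class FileWriteDemo
{
    public static void main(String[] args) throws Exception
    {
        FileSystem fs = FileSystem.get(new Configuration());
        FSDataOutputStream out = fs.create(new Path("firstdir/demo.txt")); // create() (steps 1-2)
        out.write("hello hdfs\n".getBytes("UTF-8")); // data is split into packets on the data queue (steps 3-5)
        out.close(); // flush remaining packets, wait for acks, signal the namenode (steps 6-7)
    }
}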
Replica Placement: How does the namenode choose which datanodes to store replicas on? There's a tradeoff between reliability and write bandwidth and read bandwidth here. For example, placing all replicas on a single node incurs the lowest write bandwidth penalty, since the replication pipeline runs on a single node, but this offers no real redundancy (if the node fails, the data for that block is lost). Also, the read bandwidth is high for off-rack reads. At the other extreme, placing replicas in different data centers may maximize redundancy, but at the cost of bandwidth. Even in the same data center (which is what all Hadoop clusters to date have run in), there are a variety of placement strategies. Indeed, Hadoop changed its placement strategy in release 0.17.0 to one that helps keep a fairly even distribution of blocks across the cluster. (See "balancer" for details on keeping a cluster balanced.) Hadoop's strategy is to place the first replica on the same node as the client (for clients running outside the cluster, a node is chosen at random, although the system tries not to pick nodes that are too full or too busy).
The second replica is placed on a different rack from the first (off-rack), chosen at random. The third replica is placed on the same rack as the second, but on a different node, chosen at random. Further replicas are placed on random nodes on the cluster, although the system tries to avoid placing too many replicas on the same rack. Once the replica locations have been chosen, a pipeline is built, taking network topology into account. Overall, this strategy gives a good balance between reliability (blocks are stored on two racks), write bandwidth (writes only have to traverse a single network switch), read performance (there's a choice of two racks to read from), and block distribution across the cluster (clients only write a single block on the local rack).
OOP
Object: An object is defined by state and behaviour. State represents the physical appearance of an object. Behaviour represents the purpose of an object. In the case of OOP, state is nothing but application data; it can be maintained through member variables. Behaviour can be implemented through member functions.
Syntax to create an object:
class classname
{
    access-specifier datatype1 var1, var2, ...;
    access-specifier datatype2 var1, var2, ...;
    .
    .
    access-specifier returntype function1(params)
    {
        ------
    }
    access-specifier returntype function2(params)
    {
        ------
    }
}
Class:
class is a keyword used to bind variables and methods into a single unit.
Access Specifiers:
Used to set the scope of the variables. There are 3 types of access specifiers: public, private and protected. If any member is preceded by the public specifier then it can be accessed throughout the Java program. Every program requires one entry point; only then can end users access class properties, so this entry point should be public.
Private:
If any member is preceded by private then it can be accessed within the class, but not outside of the class.
All sensitive data like username, password, etc. must be declared under the private section.
A book library is a collection of racks, a rack is a collection of books, a book is a collection of pages, and a page is a collection of information.
Likewise, the Java library is a collection of packages, a package is a collection of classes, a class is a collection of methods, and a method is a collection of instructions.
Program
class empmaster
{
    int eid;
    String ename;
    float basic;
}
class testempmaster
{
    public static void main(String[] args)
    {
        empmaster e1, e2;
        e1 = new empmaster();
        e1.eid = 101;
        e1.ename = "ajay";
        e1.basic = 25000.60f;
        e2 = e1; // e2 now refers to the same object as e1
        System.out.println("eid=" + e2.eid);
        System.out.println("ename=" + e1.ename);
        System.out.println("basicsal=" + e1.basic);
    }
}
Constructor:
1) A constructor has the same name as the class.
2) It is very similar to a function.
3) It has no return type.
4) It is used to assign initial values to the member variables.
5) It is called by the Java runtime at the time of object creation.
Ex Program:
class empmaster
{
    int eid;
    String ename;
    float basic;
    empmaster()
    {
        eid = 101;
        ename = "ajay";
        basic = 20000.50f;
    }
    empmaster(int id, String name, float sal)
    {
        eid = id;
        ename = name;
        basic = sal;
    }
    void displayemp()
    {
        System.out.println("empid=" + eid);
        System.out.println("empname=" + ename);
        System.out.println("ebasic=" + basic);
    }
}
class testempmaster
{
    public static void main(String[] args)
    {
        empmaster e1, e2;
        e1 = new empmaster();
        e2 = new empmaster(102, "ajay", 3000.50f);
        e1.displayemp();
        e2.displayemp();
    }
}
Inheritance
It is the process of creating a new class, called the derived class, from an existing class, called the base class.
Advantages:
1. Code reusability
2. Extendability
3. Reliability
4. Better maintenance
Syntax:
class baseclass
{
    member variables
    +
    member functions
}
class subclassname extends baseclassname
{
    member variables
    +
    member functions
}
Types of inheritance
1. Single inheritance
2. Multilevel inheritance
3. Hierarchical inheritance
4. Multiple inheritance
5. Hybrid inheritance
1. Single inheritance:
If a class is derived from only one base class, then such inheritance is known as single inheritance.
Ex:
class c1
{
    int a, b;
}
class c2 extends c1
{
    int c, d;
}
c1 obj1 = new c1();
c2 obj2 = new c2();
2. Multilevel inheritance:
If there is more than one level of inheritance, then such a type of inheritance is called multilevel inheritance.
Ex:
class c1
{
    int a, b;
}
class c2 extends c1
{
    int c, d;
}
class c3 extends c2
{
    int f, g;
}
c1 obj1 = new c1();
c2 obj2 = new c2();
c3 obj3 = new c3();
3. Hierarchical inheritance:
If more than one class is derived from a single base class, it is called hierarchical inheritance.

        c1
      /  |  \
    c2  c3  c4

class c1
{
    int a, b;
}
class c2 extends c1
{
    int c, d;
}
class c3 extends c1
{
    int e, f;
}
class c4 extends c1
{
    int g, h;
}
c1 obj1 = new c1();
c2 obj2 = new c2();
c3 obj3 = new c3();
c4 obj4 = new c4();
Function overriding:
If a method is implemented in both the base class and the derived class with the same name and same signature, then the method in the derived class is said to override the method of the super class.
Ex:
class bc
{
    int i, j;
    public bc(int a, int b)
    {
        i = a;
        j = b;
    }
    public void show()
    {
        System.out.println("i=" + i);
        System.out.println("j=" + j);
    }
}
class dc extends bc
{
    int k;
    public dc(int x, int y, int z)
    {
        super(x, y);
        k = z;
    }
    public void show()
    {
        System.out.println("k=" + k);
    }
}
Super:
It is a built-in keyword. It contains the super class object reference.
Advantages:
1. Using the super keyword we can call a super class constructor from the derived class constructor.
2. We can call a super class method from the sub class.
3. We can access super class member variables from the sub class.
This:
It is also a built-in keyword; it contains the current class object reference.
Advantages:
Using "this" we can access the member variables of a class and we can call the member functions of a class.
In the above example the following statement will make a call to the show method of the "dc" class only:
obj.show();
Binding:
It is the process of creating a link between a function call and the function definition.
There are 2 types of binding:
1. Static binding
2. Dynamic binding
1. Static binding:
In the above example the link is created between obj.show() and its definition during compilation only.
2. Dynamic binding:
Rules to implement dynamic binding:
1. Both the sub class and base class must have a method with the same name and same signature.
2. The base class method must be abstract.
3. Both the base class method and derived class method must be called with respect to reference variables.
4. Before calling a particular class method using a base class reference variable, it must hold an appropriate object reference.
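A short usage sketch of these rules, reusing the bc and dc classes from the overriding example above (the class name BindingDemo is ours, added only for illustration):

class BindingDemo
{
    public static void main(String[] args)
    {
        bc ref = new dc(10, 20, 30); // base class reference holding a derived class object
        ref.show();                  // resolved at runtime: calls dc's show()
    }
}

The call is bound to dc's show() because the runtime type of the object, not the declared type of the reference, decides which override runs.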
Syntax:
access-specifier abstract returntype functionname(parameters);
Ex:
public abstract void area();
public abstract void peri();
Syntax:
access-specifier abstract class classname
{
    member variables
    +
    implemented methods
    +
    abstract methods
}
Ex:
abstract class shape
{
    float a;
    public void getdata()
    {
    }
    public abstract void area();
    public abstract void peri();
}
shape s;            // no error
// s = new shape(); // error
class circle extends shape
{
    public void area()
    {
    }
    public void peri()
    {
    }
}
If a method has no definition, such a method is called an abstract method.
If a class has at least one abstract method, then such a class is called an abstract class.
We can create reference variables of abstract class types.
Ex:
shape s;
We cannot create objects of abstract classes.
Ex:
s = new shape(); // error
Every abstract class requires a sub class. The sub class inherits the implemented methods and the abstract methods. In the sub class we must override all the abstract methods which are inherited from the abstract base class; otherwise the derived class also becomes an abstract class.
Ex:
abstract class drinkable
{
    public abstract void drink();
}
class tea extends drinkable
{
    public void drink()
    {
        System.out.println("here is your drink");
    }
}
class coffee extends drinkable
{
    public void drink()
    {
        System.out.println("here is your coffee");
    }
}
class softdrink extends drinkable
{
    public void drink()
    {
        System.out.println("here is your softdrink");
    }
}
class drinkabledemo
{
    public static void main(String[] args)
    {
        drinkable d;
        int token = Integer.parseInt(args[0]);
        switch (token)
        {
            case 1: d = new tea();
                    break;
            case 2: d = new coffee();
                    break;
            case 3: d = new softdrink();
                    break;
            default:
                    System.out.println("invalid choice");
                    return; // without this, d could be used uninitialized
        }
        d.drink();
    }
}
Save it:
C:\mat\drinkabledemo.java
C:\mat> javac drinkabledemo.java
C:\mat> java drinkabledemo 1
Interface:
Interfaces are syntactically very similar to classes.
interface is a keyword used to create an interface.
Syntax:
interface interfacename
{
    variables
    +
    methods without body
}
By default all the interface variables are final and static.
final is a built-in keyword; it is used to create constants.
If a variable is static, we can access it with respect to the interface name.
By default all the interface methods are abstract methods, and they are by default public.
Ex:
interface inf1
{
    int x = 5000;
    void f1();
    void f2();
}
Reference variables of an interface type can be created.
Ex: inf1 i;        // no error
Objects of an interface type cannot be created.
// i = new inf1(); // error
Every interface requires a subclass; in the subclass we must override all the body-less (abstract) methods of the interface which are inherited.
class demo implements inf1
{
    public void f1()
    {
    }
    public void f2()
    {
    }
}
Ex:
interface animal
{
    void move();
}
class cat implements animal
{
    public void move()
    {
        System.out.println("cat move");
    }
}
class dog implements animal
{
    public void move()
    {
        System.out.println("dog move");
    }
    public void bark()
    {
        System.out.println("dog bark");
    }
}
class animaldemo
{
    public static void main(String[] args)
    {
        animal a;
        String s = args[0];
        if (s.equalsIgnoreCase("cat"))
            a = new cat();
        else
            a = new dog();
        a.move();
        if (a instanceof dog)
        {
            // a.bark(); // error: bark() is not declared in animal
            dog d = (dog) a;
            d.bark();
        }
    }
}
Static type: the type of the reference variable.
Dynamic type: the runtime value (object) present in the reference.
Strings:
In Java a string is an object of type String class or StringBuffer class.
String object contents are not modifiable (immutable).
StringBuffer object contents are modifiable.
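A small sketch showing the difference (outputs noted in comments):

class StringMutabilityDemo
{
    public static void main(String[] args)
    {
        String s = "Hello";
        s.concat("World");                           // returns a new String; s itself is unchanged
        System.out.println(s);                       // o/p: Hello
        StringBuffer sb = new StringBuffer("Hello");
        sb.append("World");                          // modifies the same object in place
        System.out.println(sb);                      // o/p: HelloWorld
    }
}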
Constructors of String class:
1. String s1 = new String();
   s1 = "Hello";
   System.out.println(s1);
2. String s2 = new String("Hello");
   System.out.println(s2);
   o/p: Hello
3. char[] x = {'I', ' ', 'l', 'i', 'k', 'e', ' ', 'j', 'a', 'v', 'a'};
   String s3 = new String(x);
   System.out.println(s3);
   o/p: I like java
4. byte[] b = {65, 66, 67, 68, 69, 70};
   String s4 = new String(b);
   System.out.println(s4);
   o/p: ABCDEF
Methods:
1. public int length()
   ex: String s = "Hello";
   System.out.println(s.length());
   o/p: 5
2. public char charAt(int index)
   ex: String s = "Hadoop";
   System.out.println(s.charAt(2));
   o/p: d
3. public boolean equals(Object obj)
   ex: String s1 = "ABC";
   String s2 = "abc";
   System.out.println(s1.equals(s2));
   o/p: false
   In the above case the equals method performs a binary (case-sensitive) comparison, and here "ABC" is not equal to "abc".
4. public boolean equalsIgnoreCase(String obj)
   ex: System.out.println(s1.equalsIgnoreCase(s2));
   o/p: true
5. public String concat(String obj)
   ex: String s1 = "Hello ";
   String s2 = "Ajay";
   String s3 = s1.concat(s2);
   System.out.println(s3);
   o/p: Hello Ajay
6. public int compareTo(String obj)
   This method returns either
   1) > 0 (string1 > string2)
   2) < 0 (string1 < string2)
   3) == 0 (string1 == string2)
   ex: String s1 = "ABC";
   String s2 = "abc";
   int x = s1.compareTo(s2);
   if (x > 0)
       System.out.println("s1>s2");
   else if (x < 0)
       System.out.println("s1<s2");
   else
       System.out.println("s1==s2");
7. public String trim()
   ex: String s1 = "     Hello     ";
   System.out.println(s1.length());
   o/p: 15
   String s2 = s1.trim();
   System.out.println(s2.length());
   o/p: 5
   System.out.println((s1.trim()).length());
   o/p: 5
8. public String substring(int startindex, int endindex)
   String s = "I Like hadoop";
   System.out.println(s.substring(2, 6));
   o/p: Like
StringBuffer:
Constructors of StringBuffer:
1. StringBuffer sb = new StringBuffer();
   16 locations
   sb.append("Hadoop");
   System.out.println(sb);
   o/p: Hadoop
   For the above object the system allocates 16 contiguous character locations.
2. StringBuffer sb = new StringBuffer("Hello");
   21 locations
   (5 + 16)
   Length of the string + 16 extra locations.
3. public StringBuffer insert(int index, String substring)
   Eg: StringBuffer sb = new StringBuffer("I Hadoop");
   sb.insert(2, "like ");
   o/p: I like Hadoop
4. public StringBuffer append(String obj)
   Eg: StringBuffer sb = new StringBuffer("Hello ");
   sb.append("world");
   o/p: Hello world
String s = "hyd is capital of AP";
public boolean contains(CharSequence obj)
System.out.println(s.contains("capital"));
o/p: true
5. public String replaceAll(String s1, String s2)
   String s1 = s.replaceAll("hyd", "sec");
   System.out.println(s1);
   o/p: sec is capital of AP
StringTokenizer:
It is a built-in class which is present in the java.util package. It is used to break text into different parts. Each part is called a token. A token is a valid word.
Constructors of StringTokenizer:
1. StringTokenizer(String textToBeTokenized)
   Eg: StringTokenizer st = new StringTokenizer("I like Hadoop");
   In the above constructor the system uses white space (space, \n, \t) to break the text.
2. StringTokenizer(String textToBeTokenized, String delimiters)
   Any characters supplied in the delimiters string are used to break the text.
Methods:
1. public boolean hasMoreTokens()
   This method checks whether the next token is present or not. If the next token is present then it returns true, else false.
2. public String nextToken()
   It is used to read the next token from the list of tokens.
Example:

import java.util.*;
class Demo
{
    public static void main(String[] args)
    {
        String s = "This is an example of string tokens";
        StringTokenizer st = new StringTokenizer(s);
        while (st.hasMoreTokens())
        {
            System.out.println(st.nextToken());
        }
    }
}
To install hadoop, the steps are:
1) Install Linux (Ubuntu or CentOS).
2) Install Java.
3) Install Eclipse.
4) Install Hadoop.
5) Set the required configurations.
6) Start all five daemons.
Steps to install Hadoop
1) Download a new stable version of hadoop from the apache mirror.
   Ex: hadoop-1.0.4.tar.gz
2) Extract it under the home folder.
3) Set the hadoop path and java path in the .bashrc file.
   Ex:
   Open .bashrc from the terminal:
   > gedit ~/.bashrc
   Go to the end of the .bashrc file and set the hadoop path and java path:
   export HADOOP_HOME=/home/matuser/hadoop-1.0.4
   export JAVA_HOME=/usr/lib/jvm/java-6-openjdk
   export PATH=$HADOOP_HOME/bin:$JAVA_HOME/bin:$PATH
4) Set the java path in the hadoop-env.sh file:
   export JAVA_HOME=/usr/lib/jvm/java-6-openjdk
5) Set the configuration files.
Edit core-site.xml with the following properties:
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:9000</value>
</property>
</configuration>
fs.default.name: the property fs.default.name is an HDFS filesystem URI, whose host is the namenode's hostname or IP address and whose port is the port that the namenode will listen on for RPCs. If no port is specified, the default of 8020 is used.
Edit hdfs-site.xml with the following properties:
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.data.dir</name>
<value>/home/matuser/dfs/data</value>
</property>
<property>
<name>dfs.name.dir</name>
<value>/home/matuser/dfs/name</value>
</property>
</configuration>
dfs.replication: it is used to set the replication factor. In hadoop the default replication factor is 3; it can be increased or decreased. In pseudo-distributed mode the replication factor should be 1.
dfs.data.dir: this property specifies a list of the directories for a datanode to store its blocks. A data node round-robins writes between its storage directories.
dfs.name.dir: this property specifies a list of the directories where the namenode stores persistent filesystem metadata (the edit log and the filesystem image). A copy of each metadata file is stored in each directory for redundancy.
Edit mapred-site.xml with the following properties:
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>localhost:9001</value>
</property>
<property>
<name>mapred.local.dir</name>
<value>/home/matuser/mapred/local</value>
</property>
<property>
<name>mapred.system.dir</name>
<value>/home/matuser/mapred/system</value>
</property>
</configuration>
mapred.job.tracker: specifies the hostname or IP address and port that the job tracker will listen on; the default port is 8021.
mapred.local.dir: this property specifies a list of directories separated by commas. During a mapreduce job, intermediate data and working files are written to temporary local files. Since this data includes the potentially very large output of map tasks, you need to ensure that the mapred.local.dir property, which controls the location of local temporary storage, points to directories with enough space.
mapred.system.dir: mapreduce uses a distributed file system to share files (such as the job JAR file) with the tasktrackers that run the mapreduce tasks. This property is used to specify a directory where these files can be stored.
6) Start the five daemons (namenode, datanode, secondary namenode, jobtracker, tasktracker).
Note: before starting the daemons for the first time, format the namenode.
To format the namenode:
> hadoop namenode -format
To start all five daemons, run the following file (which is present in /home/matuser/hadoop-1.0.4/bin):
> start-all.sh
To check whether the five daemons are started or not, run the following command:
> jps
It shows the daemons which are started, as follows:
NameNode
DataNode
SecondaryNameNode
JobTracker
TaskTracker
Jps
To start only the namenode, datanode and secondary namenode, run the following file:
> start-dfs.sh
To start only the job tracker and task tracker, run the following file:
> start-mapred.sh
To stop all five daemons, run the following file:
> stop-all.sh
To stop only the name node, datanode and SNN, run the following file:
> stop-dfs.sh
To stop only the job tracker and task tracker, run the following file:
> stop-mapred.sh
Hadoop commands
chmod command:
It is used to provide permissions to a file or directory.
To provide read, write, and execute permissions, the respective values are 4, 2, 1.
4 read
2 write
1 execute
Syntax:
From the home folder type the following command:
hadoop fs -chmod 777 <path of the file/directory>
Ex: hadoop fs -chmod 777 satya/abc.txt
The above command provides read, write and execute permissions to the user, the group of users and other users.
Eg: hadoop fs -chmod 421 satya
The above command provides the following permissions:
User: only read (4)
Group of users: only write (2)
Others: only execute (1)

To format the name node
Syntax: hadoop namenode -format
Stop the name node temporarily:
Syntax:
hadoop dfsadmin -safemode enter
Start:
hadoop dfsadmin -safemode leave
To check the status of the name node, whether it is in safe mode or not:
hadoop dfsadmin -safemode get
To check / get blocks information:
hadoop fsck / -blocks
fsck: file system check-up / health check of the file system:
hadoop fsck /
To create a directory in hadoop
Syntax:
hadoop fs -mkdir <directory name>
Ex:
hadoop fs -mkdir firstdir
Ex:
hadoop fs -mkdir seconddir
To copy a file from the local hard disk to hadoop:
hadoop fs -copyFromLocal <source file path> <target file path>
Ex:
hadoop fs -copyFromLocal mat.doc firstdir
To copy a file from one hadoop directory to some other hadoop directory:
hadoop fs -cp <source path> <target path>
Ex:
hadoop fs -cp firstdir/mat.doc seconddir
To move a file from one hadoop directory to some other hadoop directory:
hadoop fs -mv <source path> <target path>
Ex:
hadoop fs -mv firstdir/mat.doc seconddir
To move a file from the local file system to a hadoop directory:
hadoop fs -moveFromLocal <source path> <target path>
Ex:
hadoop fs -moveFromLocal mat.doc firstdir
To display the contents of any file present in hadoop:
hadoop fs -cat <path of the file>
Ex:
hadoop fs -cat firstdir/mat.doc
To display the list of files and directories:
hadoop fs -ls <path of the directory>
Ex:
hadoop fs -ls firstdir
Steps to create a new project using Eclipse
1. Go to the Applications menu, select the Programming menu item, then select the Eclipse option.
Steps to implement an application using Eclipse:
Applications > Programming > Eclipse
Step 1: To start a new application:
Go to the File menu > New option > Java Project option.
Then one dialogue box will be opened.
Eg:
Project name: Wordcountdemo
Click on the Finish button.
Step 2: To add a class to the project, go to the Package Explorer, right-click on the project name (e.g. wordcountdemo/src) and from the pop-up menu select the New option > Class option.
Now it will ask for the class name.
Eg:
Class Name: Wordcountmapper
Click on the Finish button.
Steps to add the Hadoop jar files to the application:
Step 1: Go to the Package Explorer.
Right-click on the project.
Select Build Path.
Configure Build Path.
Then one dialogue box will be opened.
Then click on Add External JARs.
Select the hadoop folder from the dialog box, select all the jar files and click on the Open button.
Ex:
hadoop-ant-0.20.2-cdh3u5.jar
hadoop-core-0.20.2-cdh3u5.jar etc.
jar = Java Archive
After implementing the complete project it must be packaged into a jar file.
To create jar files:
Go to the Package Explorer.
Right-click on the java project.
Select the Export option.
Select the JAR file option.
Click on the Next button.
This path can be any existing path followed by the jar file name.
Ex:
/home/matuser/desktop/anyfilename.jar
Click on the Finish button.
Hadoop programs:
1. Write a program to display the properties present in the configuration files.
import java.util.Map.Entry;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.util.ToolRunner;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.conf.Configured;
public class ConfigurationPrinter extends Configured implements Tool
{
    private Configuration conf;
    static
    {
        Configuration.addDefaultResource("hdfs-default.xml");
        Configuration.addDefaultResource("hdfs-site.xml");
        Configuration.addDefaultResource("mapred-default.xml");
        Configuration.addDefaultResource("mapred-site.xml");
    }
    @Override
    public Configuration getConf()
    {
        return conf;
    }
    @Override
    public void setConf(Configuration conf)
    {
        this.conf = conf;
    }
    @Override
    public int run(String[] args) throws Exception
    {
        Configuration conf = getConf();
        for (Entry<String, String> entry : conf)
        {
            System.out.printf("%s=%s\n", entry.getKey(), entry.getValue());
        }
        return 0;
    }
    public static void main(String[] args) throws Exception
    {
        int exitcode = ToolRunner.run(new ConfigurationPrinter(), args);
        System.exit(exitcode);
    }
}
Configuration:
It is a predefined class which is present in the org.apache.hadoop.conf package. It is used to retrieve Configuration class properties & values. Each property is named by a String, & the type of the value may be one of several types (such as boolean, int, long, ...).
To read properties, create an object of type Configuration class.
Ex:
Configuration conf = new Configuration();
conf.addResource("xml file path");
addResource(): it is a member function of the Configuration class. It takes an xml file as a parameter.
Ex: conf.addResource("core-site.xml");
To get the value of a property:
Syntax:
variablename = conf.get("property name");
Configuration.addDefaultResource("core-site.xml");
addDefaultResource() is a static method of the Configuration class.
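A small sketch pulling these pieces together (the property name mat.lines.max is hypothetical, used only for illustration):

import org.apache.hadoop.conf.Configuration;
public class ConfDemo
{
    public static void main(String[] args)
    {
        Configuration conf = new Configuration();
        conf.addResource("core-site.xml");          // load properties from an xml file
        String uri = conf.get("fs.default.name");   // read a property as a String
        conf.setInt("mat.lines.max", 100);          // set a user-defined property (hypothetical name)
        int max = conf.getInt("mat.lines.max", 10); // read it back, with a default value of 10
        System.out.println(uri + " " + max);
    }
}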
Hadoop comes with a few helper classes for making it easier to run jobs from the command line. GenericOptionsParser is a class that interprets common Hadoop command-line options and sets them on a Configuration object for your application to use as desired. You don't usually use GenericOptionsParser directly, as it's more convenient to implement the Tool interface and run your application with the ToolRunner, which uses GenericOptionsParser internally:
public interface Tool extends Configurable {
    int run(String[] args) throws Exception;
}
The below example shows a very simple implementation of Tool for running a Hadoop MapReduce job:
public class WordCountConfigured extends Configured implements Tool {
    @Override
    public int run(String[] args) throws Exception {
        Configuration conf = getConf();
        return 0;
    }
    public static void main(String[] args) throws Exception {
        int exitCode = ToolRunner.run(new WordCountConfigured(), args);
        System.exit(exitCode);
    }
}
We make WordCountConfigured a subclass of Configured, which is an implementation of the Configurable interface. All implementations of Tool need to implement Configurable (since Tool extends it), and subclassing Configured is often the easiest way to achieve this. The run() method obtains the Configuration using Configurable's getConf() method and then iterates over it, printing each property to standard output.
WordCountConfigured's main() method does not invoke its own run() method directly. Instead we call ToolRunner's static run() method, which takes care of creating a Configuration object for the Tool before calling its run() method. ToolRunner also uses a GenericOptionsParser to pick up any standard options specified on the command line and set them on the Configuration instance. We can see the effect of picking up the properties specified in conf/hadoop-localhost.xml by running the following command:
hadoop WordCountConfigured -conf conf/hadoop-localhost.xml -D mapred.job.tracker=localhost:10011 -D mapred.reduce.tasks=n
Options specified with -D take priority over properties from the configuration files. This is very useful: you can put defaults into configuration files and then override them with the -D option as needed. A common example of this is setting the number of reducers for a MapReduce job via -D mapred.reduce.tasks=n. This will override the number of reducers set on the cluster, or set in any client-side configuration files. The other options that GenericOptionsParser and ToolRunner support are listed in the Hadoop documentation.
Configurable:
It is a predefined interface. It contains the following abstract methods:
1. Configuration getConf()
   It is used to get the configuration object.
2. void setConf(Configuration conf)
   It is used to set the configuration.
Configured:
It is a predefined class which implements the Configurable interface. The abstract methods of Configurable are already implemented in the Configured class (getConf() & setConf() are already implemented in the Configured class).
Tool:
It is a predefined interface. It is a sub-interface of the Configurable interface.
Syntax:
interface Tool extends Configurable
{
    int run(String[] args); // abstract method.
}
Note: the class which implements the Tool interface must override 3 abstract methods (getConf(), setConf(), run()).
In the above example the ConfigurationPrinter class implements the Tool interface, so it must override all 3 abstract methods; otherwise the ConfigurationPrinter class becomes an abstract class.
ToolRunner: it is a predefined class which contains the following static methods. run() is an overloaded method of ToolRunner:
static int run(Tool tool, String[] args);
tool should be an object of a class that implements the Tool interface.
Syntax:
static int run(Configuration conf, Tool tool, String[] args);
The above run method of the ToolRunner class will first call setConf() to set the configuration, and next it will make a call to the run method of the Tool interface to run the job.
MAP:
It is a predefined interface which is present in the java.util package.
Entry is an inner interface of Map which contains the following methods:
Syntax:
Object getKey();
Object getValue();
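A plain-Java sketch of the Entry methods, independent of Hadoop (the map contents are illustrative):

import java.util.HashMap;
import java.util.Map;
public class EntryDemo
{
    public static void main(String[] args)
    {
        Map<String, String> m = new HashMap<String, String>();
        m.put("fs.default.name", "hdfs://localhost:9000");
        for (Map.Entry<String, String> entry : m.entrySet())
        {
            System.out.println(entry.getKey() + "=" + entry.getValue()); // getKey() / getValue()
        }
    }
}

This is the same pattern the ConfigurationPrinter program uses when it iterates over a Configuration object.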
JAR: after implementing the entire program we have to create a jar file.
Steps to create a jar file:
1. Go to the package explorer, right-click on the project name and select the export option.
2. Select java > jar file and click on the next button.
3. jar file: path of the file (browse).
   Click on finish.
4. To run the application or a program, go to the terminal:
$ hadoop jar <jar file name> <main class name> <input file path> <output file path>
To execute the above example, i/p & o/p file paths are not required. So the command is:
$ hadoop jar /home/matuser/documents/configurationdemo.jar ConfigurationPrinter
SampleMapper.java
import java.io.IOException;
import org.apache.hadoop.io.*;
import org.apache.hadoop.mapreduce.Mapper;
public class SampleMapper extends Mapper<LongWritable, Text, Text, LongWritable>
{
    String msg;
    public void setup(Context context) throws IOException, InterruptedException
    {
        msg = "this is setup method of Mapper\n";
    }
    @Override
    protected void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException
    {
        msg = msg + "map method is called for " + value.toString() + "\n";
    }
    protected void cleanup(Context context) throws IOException, InterruptedException
    {
        msg = msg + "this is cleanup method of mapper\n";
        context.write(new Text(msg), new LongWritable(msg.length()));
    }
}
SampleReducer.java
import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;
public class SampleReducer extends Reducer<Text, LongWritable, Text, NullWritable>
{
    String msg;
    public void setup(Context context) throws IOException, InterruptedException
    {
        msg = "this is setup method of Reducer\n";
    }
    @Override
    protected void reduce(Text key, Iterable<LongWritable> value, Context context) throws IOException, InterruptedException
    {
        msg = key.toString() + msg + "this is reducer method\n";
    }
    protected void cleanup(Context context) throws IOException, InterruptedException
    {
        msg = msg + "this is clean up method of reducer\n";
        context.write(new Text(msg), NullWritable.get());
    }
}
SampleJob.java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.ToolRunner;
import org.apache.hadoop.util.Tool;
public class SampleJob implements Tool
{
    private Configuration conf;
    @Override
    public Configuration getConf()
    {
        return conf;
    }
    @Override
    public void setConf(Configuration conf)
    {
        this.conf = conf;
    }
    @Override
    public int run(String[] args) throws Exception
    {
        Job samplejob = new Job(getConf());
        samplejob.setJobName("mat word count");
        samplejob.setJarByClass(this.getClass());
        samplejob.setMapperClass(SampleMapper.class);
        samplejob.setReducerClass(SampleReducer.class);
        samplejob.setMapOutputKeyClass(Text.class);
        samplejob.setMapOutputValueClass(LongWritable.class);
        samplejob.setOutputKeyClass(Text.class);
        samplejob.setOutputValueClass(NullWritable.class);
        FileInputFormat.setInputPaths(samplejob, new Path(args[0]));
        FileOutputFormat.setOutputPath(samplejob, new Path(args[1]));
        return samplejob.waitForCompletion(true) == true ? 0 : 1;
    }
    public static void main(String[] args) throws Exception
    {
        ToolRunner.run(new Configuration(), new SampleJob(), args);
    }
}
WordcountMapper.java
import java.io.IOException;
import java.util.StringTokenizer;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.io.Text;
//import org.apache.hadoop.mapreduce.Counter;
public class WordCountMapper extends Mapper<LongWritable, Text, Text, LongWritable>
{
    private Text temp = new Text();
    private final static LongWritable one = new LongWritable(1);
    @Override
    protected void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException
    {
        String str = value.toString();
        StringTokenizer strtock = new StringTokenizer(str);
        while (strtock.hasMoreTokens())
        {
            temp.set(strtock.nextToken());
            context.write(temp, one); // emit (word, 1) for every token
        }
    }
}
WordCountReducer.java
import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.io.Text;
public class WordCountReducer extends Reducer<Text, LongWritable, Text, LongWritable>
{
    @Override
    protected void reduce(Text key, Iterable<LongWritable> values, Context context) throws IOException, InterruptedException
    {
        long sum = 0;
        for (LongWritable value : values) // iterate once over all counts for this key
        {
            sum += value.get();
        }
        context.write(key, new LongWritable(sum));
    }
}
Wordcountcombiner.java
import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.io.Text;
public class WordCountCombiner extends Reducer<Text, LongWritable, Text, LongWritable>
{
    @Override
    protected void reduce(Text key, Iterable<LongWritable> values, Context context) throws IOException, InterruptedException
    {
        long sum = 0;
        for (LongWritable value : values) // partial sum on the map side
        {
            sum += value.get();
        }
        context.write(key, new LongWritable(sum));
    }
}
WordcountPartitioner.java
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Partitioner;
public class WordCountPartitioner extends Partitioner<Text, LongWritable>
{
    @Override
    public int getPartition(Text key, LongWritable value, int noOfReducers)
    {
        String tempString = key.toString();
        // bucket words by their first letter (a-z), giving up to 26 partitions;
        // this is why the job below sets the number of reduce tasks to 26
        return (tempString.toLowerCase().charAt(0) - 'a') % noOfReducers;
    }
}
Wordcount Job
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
//import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
//import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.ToolRunner;
import org.apache.hadoop.util.Tool;
public class WordCountJob implements Tool
{
    private Configuration conf;
    @Override
    public Configuration getConf()
    {
        return conf;
    }
    @Override
    public void setConf(Configuration conf)
    {
        this.conf = conf;
    }
    @Override
    public int run(String[] args) throws Exception
    {
        Job wordcountjob = new Job(getConf());
        wordcountjob.setJobName("mat word count");
        wordcountjob.setJarByClass(this.getClass());
        wordcountjob.setCombinerClass(WordCountCombiner.class);
        wordcountjob.setMapperClass(WordCountMapper.class);
        wordcountjob.setReducerClass(WordCountReducer.class);
        wordcountjob.setNumReduceTasks(26);
        wordcountjob.setMapOutputKeyClass(Text.class);
        wordcountjob.setMapOutputValueClass(LongWritable.class);
        wordcountjob.setOutputKeyClass(Text.class);
        wordcountjob.setOutputValueClass(LongWritable.class);
        wordcountjob.setPartitionerClass(WordCountPartitioner.class);
        //wordcountjob.setInputFormatClass(TextInputFormat.class);
        //wordcountjob.setOutputFormatClass(TextOutputFormat.class);
        FileInputFormat.setInputPaths(wordcountjob, new Path(args[0]));
        FileOutputFormat.setOutputPath(wordcountjob, new Path(args[1]));
        return wordcountjob.waitForCompletion(true) == true ? 0 : 1;
    }
    public static void main(String[] args) throws Exception
    {
        ToolRunner.run(new Configuration(), new WordCountJob(), args);
    }
}
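A hypothetical invocation of the above job, assuming the project was exported as wordcount.jar and that an input directory firstdir already exists in HDFS (both names are illustrative):

$ hadoop jar /home/matuser/documents/wordcount.jar WordCountJob firstdir wordcountoutput

The output directory (wordcountoutput here) must not exist beforehand; with 26 reduce tasks the results appear in the files part-r-00000 through part-r-00025, one per reducer.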
How a map reduce program works:
MapReduce works by breaking the processing into two phases:
1) Map phase
2) Reduce phase
Each phase has key-value pairs as i/p & o/p. The input key-value pair types of the map phase are determined by the type of input format being used, whereas the rest of the key-value pair types may be chosen by the programmer. The programmer writes the required logic by overriding the map() and reduce() functions of the Mapper and Reducer abstract classes. Mapper is a generic type with four formal parameters which specify the input key, input value, o/p key and o/p value types of the map function. The Reducer is a generic class with four formal type parameters that specify the input key, input value, o/p key and o/p value types of the reducer function.
The reducer aggregates the map o/p, so one needs to remember that the map o/p key type and map o/p value type should match the reducer i/p key type and reducer i/p value type. Hadoop uses its own data types rather than using the Java data types. These data types are optimized for network serialization and are found in the org.apache.hadoop.io package; here LongWritable and Text correspond to Long and String respectively. For the above e.g., the input key is the line offset in the file & the value is the entire line.
The third parameter is the Context object, which allows the i/p and o/p from the task. It is supplied to the Mapper and Reducer.

In the above e.g., we take the value, which is the entire line, & convert it into a string using the toString() method of the Text class. We split the line into words using StringTokenizer, iterate over all the words in the line, & o/p each word and its count 1. Context has write() to write the map o/p. The map o/p is stored in the local file system where the mapper has run, and it is stored in sorted order, sorted on the key.
Note: If the job is a map-only job, the map o/p is not stored locally; it is directly stored in the file system configured in core-site.xml, which in most cases is HDFS.
Reducer program:
The WordCountReducer extends Reducer<key in, value in, key out, value out> with specific types, here Text, LongWritable, Text, LongWritable.
Note: The word count reducer i/p key type is Text & the value type is LongWritable, matching the o/p key type & value type of the WordCountMapper.
The reducer first copies all the map o/p and merges it. This process is called shuffle & sort. During this phase all the values corresponding to each key are aggregated. The reduce method is called for each key with all the values aggregated; in the reducer method we iterate over all the values, counting how many of them there are. For example, for the input "hello world hello" the mappers emit (hello,1), (world,1), (hello,1); after the shuffle the reducer sees (hello,[1,1]) and (world,[1]) and writes (hello,2) and (world,1). Finally we write the reduce method o/p, which is the word and its count, using the write method provided by the Context class. Finally we need to provide a driver class which submits the job to the hadoop cluster with our own Mapper & Reducer implementations.
Driver program:
We create a Job object with a Configuration object and the name of the job as the arguments. Configuration is an object through which hadoop reads all the configuration properties mentioned in core-site.xml, hdfs-site.xml and mapred-site.xml; by passing the conf object to the job object, it retrieves the data from all the configuration files during its life cycle. Users can define their own property with a value (the value could be of any type) using the setter methods provided by the Configuration object, like conf.setInt(name, value).
Hence the configuration is very useful if you need to pass a small piece of metadata to your tasks.
To retrieve the values from the task, use the Context object & get the configuration object using context.getConfiguration(), then get the required metadata.
The Job object has been specified with all the classes (Mapper, Reducer, ...). The input path is mentioned using the static method addInputPath() of the FileInputFormat class, having the job instance and the input directory as arguments. Similarly, the o/p directory is mentioned using the static method setOutputPath() provided by the FileOutputFormat class. Finally we submit the job to the cluster using the waitForCompletion() method and wait for the job's completion. Once the job completes fully, we find the o/p of the job in the o/p directory.
1 Bob ob/ect forms the specifications of the /ob and gi,es you control o,er how the /ob
is run.when we run this /ob on a hadoop clusterwe wil package the code in to B1( file
+which hadoop will distribute around the cluster-(ather then explicitly specify the name
of the B1( file. .e can pass a class in the /obs set4a!.yClass12 method which hadoop
will use to locate the rele,ant B1( files by looking for B1( file containing this class.
Input path is specified by calling the static addInp$tPat12 on #ileInp$t#o!mate! and
it can be a single filea directory +in this case it takes all file from the dir forms input
for all the files-or a file pattern +like X.text or X.doc-.1s the name suggests addInp$tPat12
can be called more than once to use input from multiple paths.The o8p path is specified by
static setO$tp$tPat12 on #ileO$tp$t#o!matte!.It specifies where exactly o8p file must be
written by the reducer.The directory should not exist before running the /ob because
Hadoop will complain and not run the /ob.This precaution is to pre,ent data loss
+it can be o,er written by o8p of a long /ob with another-. .e can specify the #ap and reducer types to
use the set%appe!Class12 and setRed$ce!Class12 methods.
The setO$tp$t/eyClass12 and setO$tp$t6al$eClass12 method control the o8p types for the
map+- and reduce+- functionswhich are often the same.If they are different the map o8p
Page 51 of 81
types can be set using the methods set%apO$tp$t7eyClass12 and
set%apO$tp$t6al$e7eyClass12 Then we are ready to run the /ob.The wait4or7ompletiton+-
method on /ob submits the /ob < wait for it to finish.
The return ,alue of the 5ait#o!Completion12 is a *oolean indicating success+true- or failure+false-
which we translate in to the program exit code ? or 1.
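Putting these calls together, a minimal word-count driver might look like the sketch below
(WordCountMapper and WordCountReducer are the classes discussed above):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCountDriver {
    public static void main(String[] args) throws Exception {
        Job job = new Job(new Configuration(), "word count");
        job.setJarByClass(WordCountDriver.class);      // locate the JAR containing this class
        job.setMapperClass(WordCountMapper.class);
        job.setReducerClass(WordCountReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));   // must not already exist
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}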
Job class
setMapOutputKeyClass
public void setMapOutputKeyClass(Class<?> theClass) throws IllegalStateException
Set the key class for the map output data. This allows the user to specify the map output key
class to be different than the final output key class.
Parameters:
theClass - the map output key class.
Throws:
IllegalStateException - if the job is submitted
setMapOutputValueClass
public void setMapOutputValueClass(Class<?> theClass) throws IllegalStateException
Set the value class for the map output data. This allows the user to specify the map output value
class to be different than the final output value class.
Parameters: theClass - the map output value class.
Throws: IllegalStateException - if the job is submitted
setOutputKeyClass
public void setOutputKeyClass(Class<?> theClass) throws IllegalStateException
Set the key class for the job output data.
Parameters:
theClass - the key class for the job output data.
Throws:
IllegalStateException - if the job is submitted
setOutputValueClass
public void setOutputValueClass(Class<?> theClass) throws IllegalStateException
Set the value class for job outputs.
Parameters: theClass - the value class for job outputs.
Throws: IllegalStateException - if the job is submitted
setJobName
public void setJobName(String name) throws IllegalStateException
Set the user-specified job name.
Parameters: name - the job's new name.
Throws: IllegalStateException - if the job is submitted
setSpeculativeExecution
public void setSpeculativeExecution(boolean speculativeExecution)
Turn speculative execution on or off for this job.
Parameters: speculativeExecution - true if speculative execution should be turned on, else false.
isComplete
public boolean isComplete() throws IOException
Check if the job is finished or not. This is a non-blocking call.
Returns: true if the job is complete, else false.
Throws: IOException
isSuccessful
public boolean isSuccessful() throws IOException
Check if the job completed successfully.
Returns: true if the job succeeded, else false.
Throws: IOException
killJob
public void killJob() throws IOException
Kill the running job. Blocks until all job tasks have been killed as well. If the job is no longer
running, it simply returns.
Throws: IOException
killTask
public void killTask(TaskAttemptID taskId) throws IOException
Kill the indicated task attempt.
Parameters:
taskId - the id of the task to be terminated.
Throws: IOException
submit
public void submit() throws IOException, InterruptedException, ClassNotFoundException
Submit the job to the cluster and return immediately.
Throws: IOException, InterruptedException, ClassNotFoundException
waitForCompletion
public boolean waitForCompletion(boolean verbose) throws IOException, InterruptedException,
ClassNotFoundException
Submit the job to the cluster and wait for it to finish.
Parameters:
verbose - print the progress to the user
Returns: true if the job succeeded
Throws: IOException - thrown if the communication with the JobTracker is lost,
InterruptedException, ClassNotFoundException
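For a non-blocking submission, submit() can be combined with isComplete() and isSuccessful(), e.g.
(a fragment, assuming a fully configured Job object as above):

job.submit();                    // returns immediately
while (!job.isComplete()) {      // non-blocking status check
    Thread.sleep(5000);          // poll every 5 seconds; other work could run here
}
System.out.println(job.isSuccessful() ? "job succeeded" : "job failed");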
setInputFormatClass
public void setInputFormatClass(Class<? extends InputFormat> cls) throws IllegalStateException
Set the InputFormat for the job.
Parameters: cls - the InputFormat to use
Throws: IllegalStateException - if the job is submitted
setOutputFormatClass
public void setOutputFormatClass(Class<? extends OutputFormat> cls) throws IllegalStateException
Set the OutputFormat for the job.
Parameters:
cls - the OutputFormat to use
Throws:
IllegalStateException - if the job is submitted
setMapperClass
public void setMapperClass(Class<? extends Mapper> cls)
throws IllegalStateException
Set the Mapper for the job.
Parameters: cls - the Mapper to use
Throws: IllegalStateException - if the job is submitted
setJarByClass
public void setJarByClass(Class<?> cls)
Set the Jar by finding where a given class came from.
Parameters: cls - the example class
getJar
public String getJar()
Get the pathname of the job's jar.
Overrides: getJar in class JobContext
Returns: the pathname
setCombinerClass
public void setCombinerClass(Class<? extends Reducer> cls) throws IllegalStateException
Set the combiner class for the job.
Parameters: cls - the combiner to use
Throws: IllegalStateException - if the job is submitted
setReducerClass
public void setReducerClass(Class<? extends Reducer> cls) throws IllegalStateException
Set the Reducer for the job.
Parameters: cls - the Reducer to use
Throws: IllegalStateException - if the job is submitted
setPartitionerClass
public void setPartitionerClass(Class<? extends Partitioner> cls)
throws IllegalStateException
Set the Partitioner for the job.
Parameters: cls - the Partitioner to use
Throws: IllegalStateException - if the job is submitted
setNumReduceTasks
public void setNumReduceTasks(int tasks) throws IllegalStateException
Set the number of reduce tasks for the job.
Parameters: tasks - the number of reduce tasks
Throws: IllegalStateException - if the job is submitted
Combiner
Combiners are used to reduce the amount of data being transferred over the network, i.e. to
minimize the data transferred from mapper to reducer, and so make optimum use of the network
bandwidth. Hadoop allows the user to specify a combiner function to run on the mapper output;
the combiner's o/p then becomes the i/p of the reducer. Combiners are treated as local reducers:
they consume the mapper o/p and run on the same machine where the mapper ran earlier. Hadoop does
not provide any guarantee on combiner function execution; the Hadoop framework may call the
combiner function zero or more times for a particular mapper o/p.
Let us imagine the o/p of two mappers looks like this:

    First mapper o/p            Second mapper o/p
    Hadoop,1                    Hadoop,1
    ...                         ...
    Hadoop,1   (2 times)        Hadoop,1   (12 times)
    is,1                        Mat,1
    ...                         ...
    is,1       (2 times)        Mat,1      (2 times)

We can use a combiner function like a reducer function and count the word frequency of the above
o/p:

    First combiner o/p          Second combiner o/p
    Hadoop,2                    Hadoop,12
    is,2                        Mat,2

This is what is given to the reducer as i/p. The combiner function does not replace the reduce()
function; a combiner is likewise implemented by extending the Reducer abstract class & overriding
the reduce() method.
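In the word-count case the reduce logic is associative and commutative, so the same class can
serve as both combiner and reducer (a fragment, assuming the WordCountReducer sketched earlier):

job.setCombinerClass(WordCountReducer.class);  // local reduce on each mapper's output
job.setReducerClass(WordCountReducer.class);   // final reduce across all mappers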
Partitioner
A partitioner in MapReduce partitions the key space. The partitioner is used to derive the
partition to which a key-value pair belongs; it partitions the keys of the intermediate
map outputs.
Number of partitions = number of reduce tasks
Partitioners run on the same machine where the mapper finished its execution earlier.
The entire mapper o/p is sent to the partitioner, and the partitioner forms (number of reduce
tasks) groups from the mapper o/p.
By default the Hadoop framework uses a hash based partitioner. This partitioner partitions the
keyspace using the hash code. The following is the logic the hash partitioner uses to determine
the reducer for a particular key:
return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
We can customize the partition logic. Hadoop provides a Partitioner abstract class with a single
method, which can be extended to write a custom partitioner:
public abstract class Partitioner<KEY, VALUE>
{
    public abstract int getPartition(KEY key, VALUE value, int numPartitions);
}
getPartition() returns the partition number for a given key.
Suppose in the word count example the requirement is that all words which start with 'a' should go
to one reducer & all words which start with 'b' should go to another reducer & so on. In that case
the number of reducers is '26'.
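A custom partitioner for that requirement could be sketched like this (the class name
AlphabetPartitioner is illustrative, and sending non-alphabetic words to partition 0 is an
assumption):

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Partitioner;

public class AlphabetPartitioner extends Partitioner<Text, IntWritable> {
    @Override
    public int getPartition(Text key, IntWritable value, int numPartitions) {
        String word = key.toString();
        if (word.isEmpty()) {
            return 0;
        }
        char first = Character.toLowerCase(word.charAt(0));
        if (first < 'a' || first > 'z') {
            return 0;                          // non-alphabetic words (assumption)
        }
        return (first - 'a') % numPartitions;  // 'a' -> 0, 'b' -> 1, ... with 26 reducers
    }
}

The driver would then register it with job.setPartitionerClass(AlphabetPartitioner.class) and
job.setNumReduceTasks(26).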
SampleJob.java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.ToolRunner;
import org.apache.hadoop.util.Tool;

public class SampleJob implements Tool {
    private Configuration conf;

    @Override
    public Configuration getConf() {
        return conf;
    }

    @Override
    public void setConf(Configuration conf) {
        this.conf = conf;
    }

    @Override
    public int run(String[] args) throws Exception {
        Job samplejob = new Job(getConf());
        samplejob.setJobName("mat word count");
        samplejob.setJarByClass(this.getClass());
        samplejob.setMapperClass(SampleMapper.class);
        samplejob.setReducerClass(SampleReducer.class);
        samplejob.setMapOutputKeyClass(Text.class);
        samplejob.setMapOutputValueClass(LongWritable.class);
        samplejob.setOutputKeyClass(Text.class);
        samplejob.setOutputValueClass(NullWritable.class);
        FileInputFormat.setInputPaths(samplejob, new Path(args[0]));
        FileOutputFormat.setOutputPath(samplejob, new Path(args[1]));
        return samplejob.waitForCompletion(true) ? 0 : 1;
    }

    public static void main(String[] args) throws Exception {
        ToolRunner.run(new Configuration(), new SampleJob(), args);
    }
}
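A driver like this is typically packaged into a JAR and launched from the command line, e.g. (the
JAR name and paths are illustrative):

hadoop jar samplejob.jar SampleJob /input/path /output/path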
FormatDataReducer.java
import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class FormatDataReducer extends Reducer<LongWritable, Text, LongWritable, Text> {
    long xadd = 1;   // running sequence number for the output keys

    public void reduce(LongWritable key, Iterable<Text> values, Context context)
            throws IOException, InterruptedException {
        LongWritable sKey = new LongWritable(0);
        Text txtr = new Text();
        for (Text val : values) {
            String mStr = val.toString();
            sKey.set(xadd++);
            txtr.set(mStr);
            context.write(sKey, txtr);
        }
    }
}
FormatJob.java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.ToolRunner;
import org.apache.hadoop.util.Tool;

public class FormatJob implements Tool {
    private Configuration conf;

    @Override
    public Configuration getConf() {
        return conf;
    }

    @Override
    public void setConf(Configuration conf) {
        this.conf = conf;
    }

    @Override
    public int run(String[] args) throws Exception {
        Job formatjob = new Job(getConf());
        formatjob.setJobName("mat sed count");
        formatjob.setJarByClass(this.getClass());
        formatjob.setMapperClass(FormatDataMapper.class);
        formatjob.setOutputKeyClass(LongWritable.class);
        formatjob.setOutputValueClass(Text.class);
        formatjob.setInputFormatClass(TextInputFormat.class);
        formatjob.setOutputFormatClass(TextOutputFormat.class);
        FileInputFormat.setInputPaths(formatjob, new Path(args[0]));
        FileOutputFormat.setOutputPath(formatjob, new Path(args[1]));
        return formatjob.waitForCompletion(true) ? 0 : -1;
    }

    public static void main(String[] args) throws Exception {
        Configuration conf1 = new Configuration();
        conf1.set("Batch_Id", args[2]);   // custom properties passed through to the tasks
        conf1.set("Run_Id", args[3]);
        ToolRunner.run(conf1, new FormatJob(), args);
    }
}
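The Batch_Id and Run_Id properties set in main() can be read back inside the tasks, e.g. in the
mapper's setup() (a fragment):

String batchId = context.getConfiguration().get("Batch_Id");
String runId   = context.getConfiguration().get("Run_Id");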
MaxLengthWordJob.java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.ToolRunner;
import org.apache.hadoop.util.Tool;

public class MaxLengthWordJob implements Tool {
    private Configuration conf;

    @Override
    public Configuration getConf() {
        return conf;
    }

    @Override
    public void setConf(Configuration conf) {
        this.conf = conf;
    }

    @Override
    public int run(String[] args) throws Exception {
        Job maxlengthwordjob = new Job(getConf());
        maxlengthwordjob.setJobName("mat maxlength word");
        maxlengthwordjob.setJarByClass(this.getClass());
        maxlengthwordjob.setMapperClass(MaxLengthWordMapper.class);
        maxlengthwordjob.setReducerClass(MaxLengthWordReducer.class);
        maxlengthwordjob.setMapOutputKeyClass(Text.class);
        maxlengthwordjob.setMapOutputValueClass(LongWritable.class);
        maxlengthwordjob.setOutputKeyClass(Text.class);
        // The reducer emits NullWritable values, so declare that here.
        maxlengthwordjob.setOutputValueClass(NullWritable.class);
        FileInputFormat.setInputPaths(maxlengthwordjob, new Path(args[0]));
        FileOutputFormat.setOutputPath(maxlengthwordjob, new Path(args[1]));
        return maxlengthwordjob.waitForCompletion(true) ? 0 : 1;
    }

    public static void main(String[] args) throws Exception {
        ToolRunner.run(new Configuration(), new MaxLengthWordJob(), args);
    }
}
MaxLengthWordMapper.java
import java.io.IOException;
import java.util.StringTokenizer;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.io.Text;

public class MaxLengthWordMapper extends Mapper<LongWritable, Text, Text, LongWritable> {
    String maxWord;

    @Override
    protected void setup(Context context) throws IOException, InterruptedException {
        maxWord = "";   // longest word seen so far in this map task
    }

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        String nextToken;
        StringTokenizer st = new StringTokenizer(value.toString());
        while (st.hasMoreTokens()) {
            nextToken = st.nextToken();
            if (nextToken.length() > maxWord.length()) {
                maxWord = nextToken;
            }
        }
    }

    @Override
    protected void cleanup(Context context) throws IOException, InterruptedException {
        // Emit only once per map task, after all input records have been seen.
        context.write(new Text(maxWord), new LongWritable(maxWord.length()));
    }
}
MaxLengthWordReducer.java
import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.io.Text;

public class MaxLengthWordReducer extends Reducer<Text, LongWritable, Text, NullWritable> {
    String maxWord;

    @Override
    protected void setup(Context context) throws IOException, InterruptedException {
        maxWord = "";
    }

    @Override
    protected void reduce(Text key, Iterable<LongWritable> value, Context context)
            throws IOException, InterruptedException {
        if (key.toString().length() > maxWord.length()) {
            maxWord = key.toString();
        }
    }

    @Override
    protected void cleanup(Context context) throws IOException, InterruptedException {
        // Write the single longest word once all keys have been processed.
        context.write(new Text(maxWord), NullWritable.get());
    }
}
PrimeJob.java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.ToolRunner;
import org.apache.hadoop.util.Tool;

public class PrimeJob implements Tool {
    private Configuration conf;

    @Override
    public Configuration getConf() {
        return conf;
    }

    @Override
    public void setConf(Configuration conf) {
        this.conf = conf;
    }

    @Override
    public int run(String[] args) throws Exception {
        Job primejob = new Job(getConf());
        primejob.setJobName("mat prime Numbers");
        primejob.setJarByClass(this.getClass());
        primejob.setMapperClass(PrimeMapperDemo.class);
        primejob.setMapOutputKeyClass(Text.class);
        primejob.setMapOutputValueClass(NullWritable.class);  // matches the mapper's output types
        primejob.setNumReduceTasks(0);   // map-only job: the mapper output is the final output
        primejob.setOutputKeyClass(Text.class);
        primejob.setOutputValueClass(NullWritable.class);
        FileInputFormat.setInputPaths(primejob, new Path(args[0]));
        FileOutputFormat.setOutputPath(primejob, new Path(args[1]));
        return primejob.waitForCompletion(true) ? 0 : 1;
    }

    public static void main(String[] args) throws Exception {
        ToolRunner.run(new Configuration(), new PrimeJob(), args);
    }
}
PrimeMapperDemo.java
import java.io.IOException;
import java.util.StringTokenizer;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.mapreduce.Mapper;

public class PrimeMapperDemo extends Mapper<LongWritable, Text, Text, NullWritable> {
    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        long n;
        int f;                        // flag: set to 1 when a divisor is found
        String str = value.toString();
        String val;
        StringTokenizer st = new StringTokenizer(str);
        while (st.hasMoreTokens()) {
            f = 0;
            val = st.nextToken();
            n = Long.parseLong(val);
            if (n == 1)
                continue;             // 1 is not prime
            for (int i = 2; i <= n / 2; i++) {
                if (n % i == 0) {     // found a divisor, so n is not prime
                    f = 1;
                    break;
                }
            }
            if (f == 0)
                context.write(new Text(val), NullWritable.get());
        }
    }
}
The Map Side:
When the map function starts producing output, it is not simply written to disk. The
process is more involved, and takes advantage of buffering writes in memory and doing some
presorting for efficiency reasons (the shuffle and sort in MapReduce). Each map task has a
circular memory buffer that it writes the output to. The buffer is 100 MB by default, a size
which can be tuned by changing the io.sort.mb property. When the contents of the buffer reach a
certain threshold size (io.sort.spill.percent, default 0.80 or 80%), a background thread will
start to spill the contents to disk. Map outputs will continue to be written to the buffer while
the spill takes place, but if the buffer fills up during this time, the map will block until the
spill is complete. (The term shuffle is actually imprecise, since in some contexts it refers to
only the part of the process where map outputs are fetched by reduce tasks. In this section we
take it to mean the whole process, from the point where a map produces output to where a reduce
consumes input.)
Spills are written in round-robin fashion to the directories specified by
the mapred.local.dir property, in a job-specific subdirectory. Before it writes to disk, the
thread first divides the data into partitions corresponding to the reducers that they will
ultimately be sent to. Within each partition, the background thread performs an in-memory sort by
key, and if there is a combiner function, it is run on the output of the sort. Each time the
memory buffer reaches the spill threshold, a new spill file is created, so after the map task has
written its last output record there could be several spill files. Before the task is finished,
the spill files are merged into a single partitioned and sorted output file. The configuration
property io.sort.factor controls the maximum number of streams to merge at once; the default is
10. If a combiner function has been specified and the number of spills is at least three (the
value of the min.num.spills.for.combine property), then the combiner is run before the output file
is written. Recall that combiners may be run repeatedly over the input without affecting the final
result. The point is that running combiners makes for a more compact map output, so there is less
data to write to local disk and to transfer to the reducer.
It is often a good idea to compress the map output as it is written to disk, since doing
so makes it faster to write to disk, saves disk space, and reduces the amount of data to transfer
to the reducer. By default the output is not compressed, but it is easy to enable by setting
mapred.compress.map.output to true. The compression library to use is specified by
mapred.map.output.compression.codec (see the discussion of compression formats elsewhere for more
detail). The output file's partitions are made available to the reducers over HTTP.
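Enabling map output compression in mapred-site.xml might look as follows (DefaultCodec is one
possible codec choice; others can be substituted):

<property>
  <name>mapred.compress.map.output</name>
  <value>true</value>
</property>
<property>
  <name>mapred.map.output.compression.codec</name>
  <value>org.apache.hadoop.io.compress.DefaultCodec</value>
</property>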
The number of worker threads used to serve the file partitions is controlled by the
tasktracker.http.threads property; this setting is per tasktracker, not per map task slot. The
default of 40 may need increasing for large clusters running large jobs.
The Reduce Side: Let's turn now to the reduce part of the process. The map
output file is sitting on the local disk of the tasktracker that ran the map task (note that
although map outputs always get written to the local disk of the map tasktracker, reduce outputs
may not be), but now it is needed by the tasktracker that is about to run the reduce task for the
partition. Furthermore, the reduce task needs the map output for its particular partition
from several map tasks across the cluster. The map tasks may finish at different times, so the
reduce task starts copying their outputs as soon as each completes. This is known as the copy
phase of the reduce task.
The reduce task has a small number of copier threads so that it can fetch map outputs in
parallel. The default is five threads, but this number can be changed by setting the
mapred.reduce.parallel.copies property.
How do reducers know which tasktrackers to fetch map output
from? As map tasks complete successfully, they notify their parent tasktracker
of the status update, which in turn notifies the jobtracker. These notifications are transmitted
over the heartbeat communication mechanism described earlier. Therefore, for a given job, the
jobtracker knows the mapping between map outputs and tasktrackers. A thread in the reducer
periodically asks the jobtracker for map output locations until it has retrieved them all.
Tasktrackers do not delete map outputs from disk as soon as the first reducer has retrieved them,
as the reducer may fail. Instead, they wait until they are told to delete them by the jobtracker,
which is after the job has completed.
The map outputs are copied to the reduce tasktracker's memory if they are small enough
(the buffer's size is controlled by mapred.job.shuffle.input.buffer.percent, which specifies the
proportion of the heap to use for this purpose); otherwise, they are copied to disk. When the
in-memory buffer reaches a threshold size (controlled by mapred.job.shuffle.merge.percent) or
reaches a threshold number of map outputs (mapred.inmem.merge.threshold), it is merged and spilled
to disk. As the copies accumulate on disk, a background thread merges them into larger, sorted
files.
This saves some time merging later on. Note that any map outputs that were compressed (by
the map task) have to be decompressed in memory in order to perform a merge on them. When all the
map outputs have been copied, the reduce task moves into the sort phase (which should properly be
called the merge phase, as the sorting was carried out on the map side), which merges the map
outputs, maintaining their sort ordering. This is done in rounds. For example, if there were 50
map outputs and the merge factor was 10 (the default, controlled by the io.sort.factor property,
just like in the map's merge), then there would be 5 rounds. Each round would merge 10 files into
one, so at the end there would be five intermediate files.
Rather than have a final round that merges these five files into a single sorted file, the
merge saves a trip to disk by directly feeding the reduce function in what is the last phase: the
reduce phase. This final merge can come from a mixture of in-memory and on-disk segments.
The number of files merged in each round is actually more subtle than this example
suggests. The goal is to merge the minimum number of files to get to the merge factor for the
final round. So if there were 40 files, the merge would not merge 10 files in each of the four
rounds to get 4 files. Instead, the first round would merge only 4 files, and the subsequent three
rounds would merge the full 10 files. The 4 merged files and the 6 (as yet unmerged) files make a
total of 10 files for the final round.
Note that this does not change the number of rounds; it's just an optimization
to minimize the amount of data that is written to disk, since the final round always merges
directly into the reduce. During the reduce phase, the reduce function is invoked for each key in
the sorted output. The output of this phase is written directly to the output filesystem,
typically HDFS. In the case of HDFS, since the tasktracker node is also running a datanode, the
first block replica will be written to the local disk.
Fair Scheduler:
Purpose
This document describes the Fair Scheduler, a pluggable MapReduce scheduler that provides a way to
share large clusters.
Introduction
Fair scheduling is a method of assigning resources to jobs such that all jobs get, on average, an
equal share of resources over time. When there is a single job running, that job uses the entire
cluster. When other jobs are submitted, task slots that free up are assigned to the new jobs, so
that each job gets roughly the same amount of CPU time. Unlike the default Hadoop scheduler, which
forms a queue of jobs, this lets short jobs finish in reasonable time while not starving long
jobs. It is also an easy way to share a cluster between multiple users. Fair sharing can also work
with job priorities - the priorities are used as weights to determine the fraction of total
compute time that each job gets.
The fair scheduler organizes jobs into pools and divides resources fairly between these pools. By
default, there is a separate pool for each user, so that each user gets an equal share of the
cluster. It is also possible to set a job's pool based on the user's Unix group or any jobconf
property. Within each pool, jobs can be scheduled using either fair sharing or first-in-first-out
(FIFO) scheduling.
In addition to providing fair sharing, the Fair Scheduler allows assigning guaranteed minimum
shares to pools, which is useful for ensuring that certain users, groups or production
applications always get sufficient resources. When a pool contains jobs, it gets at least its
minimum share, but when the pool does not need its full guaranteed share, the excess is split
between other pools.
If a pool's minimum share is not met for some period of time, the scheduler optionally
supports preemption of jobs in other pools. The pool will be allowed to kill tasks from other
pools to make room to run. Preemption can be used to guarantee that "production" jobs are not
starved while also allowing the Hadoop cluster to be used for experimental and research jobs. In
addition, a pool can also be allowed to preempt tasks if it is below half of its fair share for a
configurable timeout (generally set larger than the minimum share preemption timeout). When
choosing tasks to kill, the fair scheduler picks the most-recently-launched tasks from
over-allocated jobs, to minimize wasted computation. Preemption does not cause the preempted jobs
to fail, because Hadoop jobs tolerate losing tasks; it only makes them take longer to finish.
The Fair Scheduler can limit the number of concurrent running jobs per user and per pool. This can
be useful when a user must submit hundreds of jobs at once, or for ensuring that intermediate data
does not fill up disk space on a cluster when too many concurrent jobs are running. Setting job
limits causes jobs submitted beyond the limit to wait until some of the user/pool's earlier jobs
finish. Jobs to run from each user/pool are chosen in order of priority and then submit time.
Finally, the Fair Scheduler can limit the number of concurrent running tasks per pool. This can be
useful when jobs have a dependency on an external service like a database or web service that
could be overloaded if too many map or reduce tasks are run at once.
Installation
To run the fair scheduler in your Hadoop installation, you need to put it on the CLASSPATH. The
easiest way is to copy the hadoop-*-fairscheduler.jar from
HADOOP_HOME/build/contrib/fairscheduler to HADOOP_HOME/lib.
Alternatively you can modify HADOOP_CLASSPATH to include this jar,
in HADOOP_CONF_DIR/hadoop-env.sh.
You will also need to set the following property in the Hadoop config
file HADOOP_CONF_DIR/mapred-site.xml to have Hadoop use the fair scheduler:
<property>
  <name>mapred.jobtracker.taskScheduler</name>
  <value>org.apache.hadoop.mapred.FairScheduler</value>
</property>
Once you restart the cluster, you can check that the fair scheduler is running by going
to http://<jobtracker URL>/scheduler on the JobTracker's web UI. A "job scheduler administration"
page should be visible there. This page is described in the Administration section.
If you wish to compile the fair scheduler from source, run ant package in your HADOOP_HOME
directory. This will build build/contrib/fair-scheduler/hadoop-*-fairscheduler.jar.
Configuration
The Fair Scheduler contains configuration in two places -- algorithm parameters are set
in HADOOP_CONF_DIR/mapred-site.xml, while a separate XML file called the allocation file, located
by default in HADOOP_CONF_DIR/fair-scheduler.xml, is used to configure pools, minimum shares,
running job limits and preemption timeouts. The allocation file is reloaded periodically at
runtime, allowing you to change pool settings without restarting your Hadoop cluster.
For a minimal installation, to just get equal sharing between users, you will not need to edit the
allocation file.
Scheduler Parameters in mapred-site.xml
The following parameters can be set in mapred-site.xml to affect the behavior of the fair
scheduler:
Basic Parameters
mapred.fairscheduler.preemption
    Boolean property for enabling preemption. Default: false.
mapred.fairscheduler.pool
    Specify the pool that a job belongs in. If this is specified then
    mapred.fairscheduler.poolnameproperty is ignored.
mapred.fairscheduler.poolnameproperty
    Specify which jobconf property is used to determine the pool that a job belongs in. String,
    default: user.name (i.e. one pool for each user). Another useful value is
    mapred.job.queue.name, to use MapReduce's "queue" system for access control lists (see below).
    mapred.fairscheduler.poolnameproperty is used only for jobs in which
    mapred.fairscheduler.pool is not explicitly set.
mapred.fairscheduler.allow.undeclared.pools
    Boolean property for enabling job submission to pools not declared in the allocation file.
    Default: true.
mapred.fairscheduler.allocation.file
    Can be used to have the scheduler use a different allocation file than the default one
    (HADOOP_CONF_DIR/fair-scheduler.xml). Must be an absolute path to the allocation file.
Advanced Parameters
mapred.fairscheduler.sizebasedweight
    Take into account job sizes in calculating their weights for fair sharing. By default, weights
    are only based on job priorities. Setting this flag to true will make them based on the size
    of the job (number of tasks needed) as well, though not linearly (the weight will be
    proportional to the log of the number of tasks needed). This lets larger jobs get larger fair
    shares while still providing enough of a share to small jobs to let them finish fast. Boolean
    value, default: false.
mapred.fairscheduler.preemption.only.log
    This flag will cause the scheduler to run through the preemption calculations but simply log
    when it wishes to preempt a task, without actually preempting the task. Boolean property,
    default: false. This property can be useful for doing a "dry run" of preemption before
    enabling it, to make sure that you have not set timeouts too aggressively. You will see
    preemption log messages in your JobTracker's output log
    (HADOOP_LOG_DIR/hadoop-jobtracker-*.log). The messages look as follows:
    Should preempt 2 tasks for job_20090101337_0001: tasksDueToMinShare = 2, tasksDueToFairShare = 0
mapred.fairscheduler.update.interval
    Interval at which to update fair share calculations. The default of 500 ms works well for
    clusters with fewer than 500 nodes, but larger values reduce load on the JobTracker for larger
    clusters. Integer value in milliseconds, default: 500.
mapred.fairscheduler.preemption.interval
    Interval at which to check for tasks to preempt. The default of 15 s works well for timeouts
    on the order of minutes. It is not recommended to set timeouts much smaller than this amount,
    but you can use this value to make preemption computations run more often if you do set such
    timeouts. A value of less than 5 s will probably be too small, however, as it becomes less
    than the inter-heartbeat interval. Integer value in milliseconds, default: 15000.
mapred.fairscheduler.weightadjuster
    An extension point that lets you specify a class to adjust the weights of running jobs. This
    class should implement the WeightAdjuster interface. There is currently one example
    implementation - NewJobWeightBooster, which increases the weight of jobs for the first 5
    minutes of their lifetime to let short jobs finish faster. To use it, set the weightadjuster
    property to the full class name, org.apache.hadoop.mapred.NewJobWeightBooster.
    NewJobWeightBooster itself provides two parameters for setting the duration and boost factor:
    mapred.newjobweightbooster.factor - factor by which new jobs' weight should be boosted.
    Default is 3.
    mapred.newjobweightbooster.duration - boost duration in milliseconds. Default is 300000, for
    5 minutes.
mapred.fairscheduler.loadmanager
    An extension point that lets you specify a class that determines how many maps and reduces can
    run on a given TaskTracker. This class should implement the LoadManager interface. By default
    the task caps in the Hadoop config file are used, but this option could be used to make the
    load based on available memory and CPU utilization, for example.
mapred.fairscheduler.taskselector
    An extension point that lets you specify a class that determines which task from within a job
    to launch on a given tracker. This can be used to change either the locality policy (e.g. keep
    some jobs within a particular rack) or the speculative execution algorithm (select when to
    launch speculative tasks). The default implementation uses Hadoop's default algorithms from
    JobInProgress.
mapred.fairscheduler.eventlog.enabled
    Enable a detailed log of fair scheduler events, useful for debugging. This log is stored in
    HADOOP_LOG_DIR/fairscheduler. NOTICE: this setting is for experts only. Boolean value,
    default: false.
mapred.fairscheduler.dump.interval
    If using the event log, this is the interval at which to dump complete scheduler state (list
    of pools and jobs) to the log. NOTICE: this setting is for experts only. Integer value in
    milliseconds, default: 10000.
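For instance, turning preemption on and pointing the scheduler at a custom allocation file in
mapred-site.xml could look like this (the file path is illustrative):

<property>
  <name>mapred.fairscheduler.preemption</name>
  <value>true</value>
</property>
<property>
  <name>mapred.fairscheduler.allocation.file</name>
  <!-- illustrative absolute path -->
  <value>/etc/hadoop/fair-scheduler.xml</value>
</property>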
Allocation File (fair-scheduler.xml)
The allocation file configures minimum shares, running job limits, weights and preemption timeouts
for each pool. Only users/pools whose values differ from the defaults need to be explicitly
configured in this file. The allocation file is located in HADOOP_HOME/conf/fair-scheduler.xml. It
can contain the following types of elements:
pool elements, which configure each pool. These may contain the following sub-elements:
o minMaps and minReduces, to set the pool's minimum share of task slots.
o maxMaps and maxReduces, to set the pool's maximum concurrent task slots.
o schedulingMode, the pool's internal scheduling mode, which can be fair for fair sharing
or fifo for first-in-first-out.
o maxRunningJobs, to limit the number of jobs from the pool to run at once (defaults to
infinite).
o weight, to share the cluster non-proportionally with other pools. For example, a pool with
weight 2.0 will get a 2x higher share than other pools. The default weight is 1.0.
o minSharePreemptionTimeout, the number of seconds the pool will wait before killing
other pools' tasks if it is below its minimum share (defaults to infinite).
user elements, which may contain a maxRunningJobs element to limit jobs. Note that by default
there is a pool for each user, so per-user limits are not necessary.
poolMaxJobsDefault, which sets the default running job limit for any pools whose limit is not
specified.
userMaxJobsDefault, which sets the default running job limit for any users whose limit is not
specified.
defaultMinSharePreemptionTimeout, which sets the default minimum share preemption timeout
for any pools where it is not specified.
fairSharePreemptionTimeout, which sets the preemption timeout used when jobs are below half
their fair share.
defaultPoolSchedulingMode, which sets the default scheduling mode (fair or fifo) for pools
whose mode is not specified.
Pool and user elements are only required if you are setting non-default values for the pool/user.
That is, you do not need to declare all users and all pools in your config file before running the
fair scheduler. If a user or pool is not listed in the config file, the default values for limits,
preemption timeouts, etc. will be used.
An example allocation file is given below:
<?xml version="1.0"?>
<allocations>
  <pool name="sample_pool">
    <minMaps>5</minMaps>
    <minReduces>5</minReduces>
    <maxMaps>25</maxMaps>
    <maxReduces>25</maxReduces>
    <minSharePreemptionTimeout>300</minSharePreemptionTimeout>
  </pool>
  <user name="sample_user">
    <maxRunningJobs>6</maxRunningJobs>
  </user>
  <userMaxJobsDefault>3</userMaxJobsDefault>
  <fairSharePreemptionTimeout>600</fairSharePreemptionTimeout>
</allocations>
This example creates a pool sample_pool with a guarantee of 5 map slots and 5 reduce slots. The
pool also has a minimum share preemption timeout of 300 seconds (5 minutes), meaning that if it
does not get its guaranteed share within this time, it is allowed to kill tasks from other pools
to achieve its share. The pool has a cap of 25 map and 25 reduce slots, which means that once 25
tasks are running, no more will be scheduled even if the pool's fair share is higher. The example
also limits the number of running jobs per user to 3, except for sample_user, who can run 6 jobs
concurrently. Finally, the example sets a fair share preemption timeout of 600 seconds (10
minutes). If a job is below half its fair share for 10 minutes, it will be allowed to kill tasks
from other jobs to achieve its share. Note that the preemption settings require preemption to be
enabled in mapred-site.xml as described earlier.
Any pool not defined in the allocation file will have no guaranteed capacity and no preemption
timeout. Also, any pool or user with no max running jobs set in the file will be allowed to run an
unlimited number of jobs.
Access Control Lists (ACLs)
The fair scheduler can be used in tandem with the "queue" based access control system in MapReduce
to restrict which pools each user can access. To do this, first enable ACLs and set up some queues
as described in the MapReduce usage guide, then set the fair scheduler to use one pool per queue
by adding the following property in HADOOP_CONF_DIR/mapred-site.xml:
<property>
  <name>mapred.fairscheduler.poolnameproperty</name>
  <value>mapred.job.queue.name</value>
</property>
You can then set the minimum share, weight and internal scheduling mode for each pool as described
earlier. In addition, make sure that users submit jobs to the right queue by setting
the mapred.job.queue.name property in their jobs.
Administration
The fair scheduler provides support for administration at runtime through two mechanisms:
1. It is possible to modify minimum shares, limits, weights, preemption timeouts and pool
scheduling modes at runtime by editing the allocation file. The scheduler will reload this file
10-15 seconds after it sees that it was modified.
2. Current jobs, pools and fair shares can be examined through the JobTracker's web interface
at http://<JobTracker URL>/scheduler. On this interface, it is also possible to modify jobs'
priorities or move jobs from one pool to another and see the effects on the fair shares (this
requires JavaScript).
The following fields can be seen for each job on the web interface:
Submitted - date and time the job was submitted.
JobID, User, Name - job identifiers as on the standard web UI.
Pool - current pool of the job. Select another value to move the job to another pool.
Priority - current priority. Select another value to change the job's priority.
Maps/Reduces Finished: number of tasks finished / total tasks.
Maps/Reduces Running: tasks currently running.
Map/Reduce Fair Share: the average number of task slots that this job should have at any given
time according to fair sharing. The actual number of tasks will go up and down depending on how
much compute time the job has had, but on average it will get its fair share amount.
In addition, it is possible to view an "advanced" version of the web UI by going to
http://<JobTracker URL>/scheduler?advanced. This view shows two more columns:
Maps/Reduce Weight: weight of the job in the fair sharing calculations. This depends on priority,
and potentially also on job size and job age if the sizebasedweight and NewJobWeightBooster are
enabled.
Metrics
The fair scheduler can export metrics using the Hadoop metrics interface. This can be enabled by
adding an entry to hadoop-metrics.properties to enable the fairscheduler metrics context. For
example, to simply retain the metrics in memory so they may be viewed in the /metrics servlet:
fairscheduler.class=org.apache.hadoop.metrics.spi.NoEmitMetricsContext
Metrics are generated for each pool and job, and contain the same information that is visible on
the /scheduler web page.
Capacity Scheduler:
Overview
The CapacityScheduler is designed to run Hadoop Map-Reduce as a shared, multi-tenant cluster in an
operator-friendly manner, while maximizing the throughput and the utilization of the cluster while
running Map-Reduce applications.
Traditionally each organization has its own private set of compute resources that have sufficient
capacity to meet the organization's SLA under peak or near-peak conditions. This generally leads
to poor average utilization and the overhead of managing multiple independent clusters, one per
each organization. Sharing clusters between organizations is a cost-effective manner of running
large Hadoop installations, since this allows them to reap benefits of economies of scale without
creating private clusters. However, organizations are concerned about sharing a cluster because
they are worried about others using the resources that are critical for their SLAs.
The CapacityScheduler is designed to allow sharing a large cluster while giving each organization
a minimum capacity guarantee. The central idea is that the available resources in the Hadoop
Map-Reduce cluster are partitioned among multiple organizations who collectively fund the cluster
based on computing needs. There is an added benefit that an organization can access any excess
capacity not being used by others. This provides elasticity for the organizations in a
cost-effective manner.
Sharing clusters across organizations necessitates strong support for multi-tenancy, since each
organization must be guaranteed capacity and safe-guards to ensure the shared cluster is
impervious to a single rogue job or user. The CapacityScheduler provides a stringent set of limits
to ensure that a single job or user or queue cannot consume a disproportionate amount of resources
in the cluster. Also, the JobTracker of the cluster, in particular, is a precious resource, and
the CapacityScheduler provides limits on initialized/pending tasks and jobs from a single user and
queue to ensure fairness and stability of the cluster.
The primary abstraction provided by the CapacityScheduler is the concept of queues. These queues
are typically set up by administrators to reflect the economics of the shared cluster.
Features
The CapacityScheduler supports the following features:
Capacity Guarantees - Support for multiple queues, where a job is submitted to a queue. Queues are
allocated a fraction of the capacity of the grid, in the sense that a certain capacity of
resources will be at their disposal. All jobs submitted to a queue will have access to the
capacity allocated to the queue. Administrators can configure soft limits and optional hard limits
on the capacity allocated to each queue.
Security - Each queue has strict ACLs which control which users can submit jobs to individual
queues. Also, there are safe-guards to ensure that users cannot view and/or modify jobs from other
users if so desired. Also, per-queue and system administrator roles are supported.
Elasticity - Free resources can be allocated to any queue beyond its capacity. When there is
demand for these resources from queues running below capacity at a future point in time, as tasks
scheduled on these resources complete, they will be assigned to jobs on queues running below the
capacity. This ensures that resources are available in a predictable and elastic manner to queues,
thus preventing artificial silos of resources in the cluster, which helps utilization.
Multi-tenancy - A comprehensive set of limits is provided to prevent a single job, user or queue
from monopolizing resources of the queue or the cluster as a whole, to ensure that the system,
particularly the JobTracker, isn't overwhelmed by too many tasks or jobs.
Operability - The queue definitions and properties can be changed at runtime by administrators in
a secure manner to minimize disruption to users. Also, a console is provided for users and
administrators to view the current allocation of resources to various queues in the system.
Resource-based Scheduling - Support for resource-intensive jobs, wherein a job can optionally
specify higher resource-requirements than the default, thereby accommodating applications with
differing resource requirements. Currently, memory is the resource requirement supported.
Job Priorities - Queues optionally support job priorities (disabled by default). Within a queue,
jobs with higher priority will have access to the queue's resources before jobs with lower
priority. However, once a job is running, it will not be preempted for a higher priority job;
preemption is on the roadmap but is currently not supported.
Installation
The CapacityScheduler is available as a JAR file in the Hadoop tarball under the
contrib/capacity-scheduler directory. The name of the JAR file would be on the lines of
hadoop-capacity-scheduler-*.jar.
You can also build the Scheduler from source by executing ant package, in which case it would be
available under build/contrib/capacity-scheduler.
To run the CapacityScheduler in your Hadoop installation, you need to put it on the CLASSPATH. The
easiest way is to copy the hadoop-capacity-scheduler-*.jar to HADOOP_HOME/lib. Alternatively, you
can modify HADOOP_CLASSPATH to include this jar, in conf/hadoop-env.sh.
Configuration
Using the CapacityScheduler
To make the Hadoop framework use the CapacityScheduler, set up the following property in the site
configuration:
Property: mapred.jobtracker.taskScheduler
Value: org.apache.hadoop.mapred.CapacityTaskScheduler
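In mapred-site.xml this takes the usual property form:

<property>
  <name>mapred.jobtracker.taskScheduler</name>
  <value>org.apache.hadoop.mapred.CapacityTaskScheduler</value>
</property>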
Setting Up Queues
You can define multiple queues to which users can submit jobs with the CapacityScheduler. To
define multiple queues, you should use the mapred.queue.names property in conf/hadoop-site.xml.
The CapacityScheduler can be configured with several properties for each queue that control the
behavior of the Scheduler. This configuration is in conf/capacity-scheduler.xml.
You can also configure ACLs for controlling which users or groups have access to the queues in
conf/mapred-queue-acls.xml.
For more details, refer to the Cluster Setup documentation.
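For example, defining three queues might look like this in conf/hadoop-site.xml (the queue names
are illustrative):

<property>
  <name>mapred.queue.names</name>
  <value>default,queueA,queueB</value>
</property>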
Queue properties
Resource allocation
The properties defined for resource allocations to queues and their descriptions are listed below:
mapred.capacity-scheduler.queue.<queue-name>.capacity
    Percentage of the number of slots in the cluster that are made available for jobs in this
    queue. The sum of capacities for all queues should be less than or equal to 100.
mapred.capacity-scheduler.queue.<queue-name>.maximum-capacity
    maximum-capacity defines a limit beyond which a queue cannot use the capacity of the cluster.
    This provides a means to limit how much excess capacity a queue can use. By default, there is
    no limit. The maximum-capacity of a queue can only be greater than or equal to its minimum
    capacity. The default value of -1 implies a queue can use the complete capacity of the
    cluster. This property could be used to curtail certain jobs which are long running in nature
    from occupying more than a certain percentage of the cluster, which in the absence of
    pre-emption could lead to capacity guarantees of other queues being affected. One important
    thing to note is that maximum-capacity is a percentage, so based on the cluster's capacity it
    would change. So if a large number of nodes or racks get added to the cluster, the maximum
    capacity in absolute terms would increase accordingly.
mapred.capacity-scheduler.queue.<queue-name>.minimum-user-limit-percent
    Each queue enforces a limit on the percentage of resources allocated to a user at any given
    time, if there is competition for them. This user limit can vary between a minimum and maximum
    value. The former depends on the number of users who have submitted jobs, and the latter is
    set to this property value. For example, suppose the value of this property is 25. If two
    users have submitted jobs to a queue, no single user can use more than 50% of the queue
    resources. If a third user submits a job, no single user can use more than 33% of the queue
    resources. With 4 or more users, no user can use more than 25% of the queue's resources. A
    value of 100 implies no user limits are imposed.
mapred.capacity-scheduler.queue.<queue-name>.user-limit-factor
    The multiple of the queue capacity which can be configured to allow a single user to acquire
    more slots. By default this is set to 1, which ensures that a single user can never take more
    than the queue's configured capacity, irrespective of how idle the cluster is.
mapred.capacity-scheduler.queue.<queue-name>.supports-priority
    If true, priorities of jobs will be taken into account in scheduling decisions.
Job initialization
The Capacity Scheduler lazily initializes jobs before they are scheduled, to reduce the memory
footprint on the jobtracker. The following are the parameters by which you can control the
initialization of jobs per-queue:
mapred.capacity-scheduler.maximum-system-jobs
    Maximum number of jobs in the system which can be initialized concurrently by the
    CapacityScheduler. Individual queue limits on initialized jobs are directly proportional to
    their queue capacities.
mapred.capacity-scheduler.queue.<queue-name>.maximum-initialized-active-tasks
    The maximum number of tasks, across all jobs in the queue, which can be initialized
    concurrently. Once the queue's jobs exceed this limit, they will be queued on disk.
mapred.capacity-scheduler.queue.<queue-name>.maximum-initialized-active-tasks-per-user
    The maximum number of tasks per-user, across all of the user's jobs in the queue, which can be
    initialized concurrently. Once the user's jobs exceed this limit, they will be queued on disk.
mapred.capacity-scheduler.queue.<queue-name>.init-accept-jobs-factor
    The multiple of (maximum-system-jobs * queue-capacity) used to determine the number of jobs
    which are accepted by the scheduler. The default value is 10. If the number of jobs submitted
    to the queue exceeds this limit, job submissions are rejected.
Resource based scheduling
The CapacityScheduler supports scheduling of tasks on a TaskTracker (TT) based on a job's memory
requirements, in terms of RAM and Virtual Memory (VMEM), on the TT node. A TT is conceptually
composed of a fixed number of map and reduce slots with fixed slot size across the cluster. A job
can ask for one or more slots for each of its component map and/or reduce slots. If a task
consumes more memory than configured, the TT forcibly kills the task.
Currently the memory based scheduling is only supported on the Linux platform.
Additional scheduler-based config parameters are as follows:
mapred.cluster.map.memory.mb
    The size, in terms of virtual memory, of a single map slot in the Map-Reduce framework, used
    by the scheduler. A job can ask for multiple slots for a single map task via
    mapred.job.map.memory.mb, up to the limit specified by mapred.cluster.max.map.memory.mb, if
    the scheduler supports the feature. The value of -1 indicates that this feature is turned off.
mapred.cluster.reduce.memory.mb
    The size, in terms of virtual memory, of a single reduce slot in the Map-Reduce framework,
    used by the scheduler. A job can ask for multiple slots for a single reduce task via
    mapred.job.reduce.memory.mb, up to the limit specified by mapred.cluster.max.reduce.memory.mb,
    if the scheduler supports the feature. The value of -1 indicates that this feature is turned
    off.
mapred.cluster.max.map.memory.mb
    The maximum size, in terms of virtual memory, of a single map task launched by the Map-Reduce
    framework, used by the scheduler. A job can ask for multiple slots for a single map task via
    mapred.job.map.memory.mb, up to the limit specified by mapred.cluster.max.map.memory.mb, if
    the scheduler supports the feature. The value of -1 indicates that this feature is turned off.
mapred.cluster.max.reduce.memory.mb
    The maximum size, in terms of virtual memory, of a single reduce task launched by the
    Map-Reduce framework, used by the scheduler. A job can ask for multiple slots for a single
    reduce task via mapred.job.reduce.memory.mb, up to the limit specified by
    mapred.cluster.max.reduce.memory.mb, if the scheduler supports the feature. The value of -1
    indicates that this feature is turned off.
mapred.job.map.memory.mb
    The size, in terms of virtual memory, of a single map task for the job. A job can ask for
    multiple slots for a single map task, rounded up to the next multiple of
    mapred.cluster.map.memory.mb, up to the limit specified by mapred.cluster.max.map.memory.mb,
    if the scheduler supports the feature. The value of -1 indicates that this feature is turned
    off iff mapred.cluster.map.memory.mb is also turned off.
mapred.job.reduce.memory.mb
    The size, in terms of virtual memory, of a single reduce task for the job. A job can ask for
    multiple slots for a single reduce task, rounded up to the next multiple of
    mapred.cluster.reduce.memory.mb, up to the limit specified by
    mapred.cluster.max.reduce.memory.mb, if the scheduler supports the feature. The value of -1
    indicates that this feature is turned off iff mapred.cluster.reduce.memory.mb is also turned
    off.
Reviewing the configuration of the CapacityScheduler
Once the installation and configuration is completed, you can review it after starting the
MapReduce cluster from the admin UI.
Start the MapReduce cluster as usual.
Open the JobTracker web UI.
The queues you have configured should be listed under the Scheduling Information section of the
page.
The properties for the queues should be visible in the Scheduling Information column against
each queue.
The /scheduler web page should show the resource usages of individual queues.
Example
Here is a practical example for using the CapacityScheduler:
<?xml version="1.0"?>
<configuration>
  <!-- system limit, across all queues -->
  <property>
    <name>mapred.capacity-scheduler.maximum-system-jobs</name>
    <value>3000</value>
    <description>Maximum number of jobs in the system which can be initialized
    concurrently by the CapacityScheduler.
    </description>
  </property>
  <!-- queue: queueA -->
  <property>
    <name>mapred.capacity-scheduler.queue.queueA.capacity</name>
    <value>8</value>
  </property>
  <property>
    <name>mapred.capacity-scheduler.queue.queueA.supports-priority</name>
    <value>false</value>
  </property>
  <property>
    <name>mapred.capacity-scheduler.queue.queueA.minimum-user-limit-percent</name>
    <value>20</value>
  </property>
  <property>
    <name>mapred.capacity-scheduler.queue.queueA.user-limit-factor</name>
    <value>10</value>
  </property>
  <property>
    <name>mapred.capacity-scheduler.queue.queueA.maximum-initialized-active-tasks</name>
    <value>200000</value>
  </property>
  <property>
    <name>mapred.capacity-scheduler.queue.queueA.maximum-initialized-active-tasks-per-user</name>
    <value>100000</value>
  </property>
  <property>
    <name>mapred.capacity-scheduler.queue.queueA.init-accept-jobs-factor</name>
    <value>100</value>
  </property>
  <!-- queue: queueB -->
  <property>
    <name>mapred.capacity-scheduler.queue.queueB.capacity</name>
    <value>2</value>
  </property>
  <property>
    <name>mapred.capacity-scheduler.queue.queueB.supports-priority</name>
    <value>false</value>
  </property>
  <property>
    <name>mapred.capacity-scheduler.queue.queueB.minimum-user-limit-percent</name>
    <value>20</value>
  </property>
  <property>
    <name>mapred.capacity-scheduler.queue.queueB.user-limit-factor</name>
    <value>1</value>
  </property>
  <property>
    <name>mapred.capacity-scheduler.queue.queueB.maximum-initialized-active-tasks</name>
    <value>200000</value>
  </property>
  <property>
    <name>mapred.capacity-scheduler.queue.queueB.maximum-initialized-active-tasks-per-user</name>
    <value>100000</value>
  </property>
  <property>
    <name>mapred.capacity-scheduler.queue.queueB.init-accept-jobs-factor</name>
    <value>10</value>
  </property>
  <!-- queue: queueC -->
  <property>
    <name>mapred.capacity-scheduler.queue.queueC.capacity</name>
    <value>30</value>
  </property>
  <property>
    <name>mapred.capacity-scheduler.queue.queueC.supports-priority</name>
    <value>false</value>
  </property>
  <property>
    <name>mapred.capacity-scheduler.queue.queueC.minimum-user-limit-percent</name>
    <value>20</value>
  </property>
  <property>
    <name>mapred.capacity-scheduler.queue.queueC.user-limit-factor</name>
    <value>1</value>
  </property>
  <property>
    <name>mapred.capacity-scheduler.queue.queueC.maximum-initialized-active-tasks</name>
    <value>200000</value>
  </property>
  <property>
    <name>mapred.capacity-scheduler.queue.queueC.maximum-initialized-active-tasks-per-user</name>
    <value>100000</value>
  </property>
  <property>
    <name>mapred.capacity-scheduler.queue.queueC.init-accept-jobs-factor</name>
    <value>10</value>
  </property>
  <!-- queue: queueD -->
  <property>
    <name>mapred.capacity-scheduler.queue.queueD.capacity</name>
    <value>1</value>
  </property>
  <property>
    <name>mapred.capacity-scheduler.queue.queueD.supports-priority</name>
    <value>false</value>
  </property>
  <property>
    <name>mapred.capacity-scheduler.queue.queueD.minimum-user-limit-percent</name>
    <value>20</value>
  </property>
  <property>
    <name>mapred.capacity-scheduler.queue.queueD.user-limit-factor</name>
    <value>20</value>
  </property>
  <property>
    <name>mapred.capacity-scheduler.queue.queueD.maximum-initialized-active-tasks</name>
    <value>200000</value>
  </property>
  <property>
    <name>mapred.capacity-scheduler.queue.queueD.maximum-initialized-active-tasks-per-user</name>
    <value>100000</value>
  </property>
  <property>
    <name>mapred.capacity-scheduler.queue.queueD.init-accept-jobs-factor</name>
    <value>10</value>
  </property>
  <!-- queue: queueE -->
  <property>
    <name>mapred.capacity-scheduler.queue.queueE.capacity</name>
    <value>31</value>
  </property>
  <property>
    <name>mapred.capacity-scheduler.queue.queueE.supports-priority</name>
    <value>false</value>
  </property>
  <property>
    <name>mapred.capacity-scheduler.queue.queueE.minimum-user-limit-percent</name>
    <value>20</value>
  </property>
  <property>
    <name>mapred.capacity-scheduler.queue.queueE.user-limit-factor</name>
    <value>1</value>
  </property>
  <property>
    <name>mapred.capacity-scheduler.queue.queueE.maximum-initialized-active-tasks</name>
    <value>200000</value>
  </property>
  <property>
    <name>mapred.capacity-scheduler.queue.queueE.maximum-initialized-active-tasks-per-user</name>
    <value>100000</value>
  </property>
  <property>
    <name>mapred.capacity-scheduler.queue.queueE.init-accept-jobs-factor</name>
    <value>10</value>
  </property>
  <!-- queue: queueF -->
  <property>
    <name>mapred.capacity-scheduler.queue.queueF.capacity</name>
    <value>28</value>
  </property>
  <property>
    <name>mapred.capacity-scheduler.queue.queueF.supports-priority</name>
    <value>false</value>
  </property>
  <property>
    <name>mapred.capacity-scheduler.queue.queueF.minimum-user-limit-percent</name>
    <value>20</value>
  </property>
  <property>
    <name>mapred.capacity-scheduler.queue.queueF.user-limit-factor</name>
    <value>1</value>
  </property>
  <property>
    <name>mapred.capacity-scheduler.queue.queueF.maximum-initialized-active-tasks</name>
    <value>200000</value>
  </property>
  <property>
    <name>mapred.capacity-scheduler.queue.queueF.maximum-initialized-active-tasks-per-user</name>
    <value>100000</value>
  </property>
  <property>
    <name>mapred.capacity-scheduler.queue.queueF.init-accept-jobs-factor</name>
    <value>10</value>
  </property>
</configuration>