
Hadoop

Hadoop Architecture Overview


Hadoop is used to store and process huge data sets on a cluster of commodity hardware.
Let's take a look at the overall architecture of Hadoop.
The runtime environment consists of five layers, from bottom to top:

Cluster: the set of nodes, also called host machines. These host machines are
grouped into racks. This is the hardware layer of the Hadoop infrastructure.

YARN Infrastructure (Yet Another Resource Negotiator): the framework that provides
the computational resources, such as CPUs and memory, that applications need in
order to run. It has two important elements:
o the Resource Manager runs services such as the Resource Scheduler, which decides
how to assign the resources.
o the Node Manager (many per cluster) is the slave of the infrastructure. Each
Node Manager provides some resources to the cluster. Its resource capacity is
the amount of memory and the number of vcores. At run-time, the Resource
Scheduler decides how to use this capacity: a Container is a fraction of the
Node Manager capacity and is used by the client to run a program.

HDFS Federation: the framework that provides permanent, reliable and distributed
storage. It is typically used for storing inputs and outputs. It does not store
intermediate results.

MapReduce Framework: the software layer implementing the MapReduce paradigm.
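The relationship between a Node Manager's capacity and its containers can be illustrated with a small model. This is only a sketch: the `NodeManager` and `Container` classes below are hypothetical Python stand-ins, not YARN's own API, and the figures (8192 MB, 4 vcores) are made up for the example. In a real cluster the two capacity values are set per node through the `yarn.nodemanager.resource.memory-mb` and `yarn.nodemanager.resource.cpu-vcores` properties.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Container:
    """A slice of one Node Manager's capacity (hypothetical model)."""
    memory_mb: int
    vcores: int


@dataclass
class NodeManager:
    """Toy Node Manager: advertises a fixed capacity and tracks what is in use."""
    name: str
    memory_mb: int  # total memory offered to the cluster
    vcores: int     # total virtual cores offered to the cluster
    containers: list = field(default_factory=list)

    def free_memory_mb(self) -> int:
        return self.memory_mb - sum(c.memory_mb for c in self.containers)

    def free_vcores(self) -> int:
        return self.vcores - sum(c.vcores for c in self.containers)

    def allocate(self, memory_mb: int, vcores: int):
        """Carve out a container if enough capacity is left, else return None."""
        if memory_mb > self.free_memory_mb() or vcores > self.free_vcores():
            return None
        container = Container(memory_mb, vcores)
        self.containers.append(container)
        return container


# Illustrative values only.
node = NodeManager("node-1", memory_mb=8192, vcores=4)
first = node.allocate(memory_mb=2048, vcores=1)
second = node.allocate(memory_mb=8192, vcores=1)  # exceeds the remaining memory
print(first, second, node.free_memory_mb(), node.free_vcores())
```

The second request is refused because only part of the node's memory is still free. In YARN it is the Resource Scheduler, not the node itself, that makes this kind of decision.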

The YARN infrastructure provides resources for running an application, and the HDFS federation
provides storage. The two are decoupled and can be considered entirely independent of each
other. The MapReduce framework is only one of many possible frameworks that run on top of
YARN.
YARN: Application Startup
In YARN, at least three programs take part in starting an application:

the Job Submitter
the Resource Manager
the Node Manager

The entire process can be described as follows:


1. a client submits an application to the Resource Manager
2. the Resource Manager allocates a container
3. the Resource Manager contacts the related Node Manager
4. the Node Manager launches the container
5. the Container executes the Application Master
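The five startup steps can be traced with a toy simulation. The classes and method names here (`ResourceManager.submit`, `NodeManager.launch`) are hypothetical and exist only to make the order of the hand-offs visible. They are not the YARN client API, and the choice of node is deliberately trivial.

```python
class NodeManager:
    """Toy Node Manager that launches containers on request (hypothetical)."""

    def __init__(self, name):
        self.name = name

    def launch(self, container, log):
        log.append(f"4. {self.name} launches {container}")
        log.append(f"5. {container} executes the Application Master")
        return f"ApplicationMaster@{container}"


class ResourceManager:
    """Toy Resource Manager that allocates one container per submission."""

    def __init__(self, node_managers):
        self.node_managers = node_managers
        self.next_id = 1

    def submit(self, app_name, log):
        log.append(f"1. client submits {app_name} to the Resource Manager")
        container = f"container-{self.next_id}"
        self.next_id += 1
        log.append(f"2. Resource Manager allocates {container}")
        node = self.node_managers[0]  # placement policy is out of scope here
        log.append(f"3. Resource Manager contacts {node.name}")
        return node.launch(container, log)


log = []
rm = ResourceManager([NodeManager("node-1")])
app_master = rm.submit("wordcount", log)
print("\n".join(log))
```

Each log line corresponds to one numbered step, so the printed trace reads in the same order as the list.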

The Application Master is responsible for executing a single application. It requests containers
from the Resource Manager and runs specific programs on the containers it obtains. The
Application Master knows the application logic and is therefore framework-specific. The
MapReduce framework provides its own implementation of an Application Master.
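One way to picture "framework-specific" is a generic Application Master that knows how to ask for containers, with each framework supplying only its own task plan. The sketch below assumes a simplified, hypothetical interface (`request_containers`, `plan_tasks`, `run`). The real MapReduce Application Master is considerably more involved.

```python
from abc import ABC, abstractmethod


class ResourceManager:
    """Toy Resource Manager that hands out numbered containers (hypothetical)."""

    def __init__(self, available):
        self.available = available
        self.granted = 0

    def request_containers(self, count):
        grant = min(count, self.available)
        self.available -= grant
        ids = [f"container-{self.granted + i + 1}" for i in range(grant)]
        self.granted += grant
        return ids


class ApplicationMaster(ABC):
    """Generic part: obtain containers and assign one task to each."""

    def __init__(self, resource_manager):
        self.rm = resource_manager

    @abstractmethod
    def plan_tasks(self):
        """Framework-specific part: which programs must be run."""

    def run(self):
        tasks = self.plan_tasks()
        containers = self.rm.request_containers(len(tasks))
        # Tasks that received no container are simply left out in this sketch.
        return dict(zip(containers, tasks))


class MapReduceApplicationMaster(ApplicationMaster):
    """Illustrative MapReduce flavour: the plan is map tasks, then reduce tasks."""

    def __init__(self, resource_manager, num_maps, num_reduces):
        super().__init__(resource_manager)
        self.num_maps = num_maps
        self.num_reduces = num_reduces

    def plan_tasks(self):
        maps = [f"map-{i}" for i in range(self.num_maps)]
        reduces = [f"reduce-{i}" for i in range(self.num_reduces)]
        return maps + reduces


rm = ResourceManager(available=10)
assignment = MapReduceApplicationMaster(rm, num_maps=2, num_reduces=1).run()
print(assignment)
```

Another framework would subclass `ApplicationMaster` with a different `plan_tasks`, while the container negotiation stays the same.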
The Resource Manager itself may fail while running. YARN mitigates this by using Application
Masters: they reduce the load on the Resource Manager and make it quickly recoverable.
RS Trainings
HYD
Email: contact@rstrainings.com
Mobile No:9052699906
