
YARN: The Resource Manager for Hadoop
After this video you will be able to:
• Outline how YARN provides flexible resource management for a Hadoop cluster
• Explain how YARN extends Hadoop to enable multiple frameworks such as MapReduce, Giraph, Spark, and Flink
HDFS Cluster Utilization

Share Hadoop across applications

[Diagram: Hadoop ecosystem stack. Bottom layer: HDFS. Above it: YARN. Other components shown: Hive, Pig, Giraph, Spark, Storm, Flink, MapReduce, HBase, Cassandra, MongoDB, Zookeeper.]
Hadoop evolved over time!

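[Diagram: Hadoop 1.0 vs. Hadoop 2.0.
Hadoop 1.0: Hive, Pig, and others on top of MapReduce, on top of HDFS.
Hadoop 2.0: HDFS at the bottom with YARN above it; components shown: Hive, Pig, Giraph, Spark, Storm, Flink, MapReduce, HBase, Cassandra, MongoDB, Zookeeper.]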
Hadoop 1.0

• Only MapReduce jobs; other applications not supported
• Poor resource utilization

[Diagram: Hive, Pig, and others on top of MapReduce, on top of HDFS]
One dataset → many applications
Hadoop 1.0: MapReduce on top of HDFS
Hadoop 2.0: MapReduce, Spark, and others on top of YARN (Yet Another Resource Negotiator), on top of HDFS
Central Resource Manager = ultimate decision maker
Each machine gets a Node Manager

Resource Manager + Node Manager = data computation framework
Application Master = personal negotiator
• Negotiates resources from the Resource Manager
• Works with the Node Manager to get the job done

Container = a machine (more precisely, a bundle of resources such as CPU and memory on one machine)
Essential gears in the YARN engine

• Resource Manager
• Application Master
• Node Manager
• Container
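
How the four gears fit together can be sketched as a toy model. This is a minimal illustration, not YARN's actual API: the class names mirror the slide, while the method names (`launch`, `allocate`, `negotiate`), the first-fit placement rule, and the node sizes are assumptions made up for the example. A real Application Master also runs inside a container of its own, which the sketch leaves out.

```python
from dataclasses import dataclass, field


@dataclass
class Container:
    """A slice of one node's resources granted to an application."""
    node: str
    memory_mb: int
    vcores: int


@dataclass
class NodeManager:
    """Tracks the free resources of a single machine."""
    name: str
    free_memory_mb: int
    free_vcores: int

    def launch(self, memory_mb: int, vcores: int) -> Container:
        # Reserve the resources locally and hand back a container.
        self.free_memory_mb -= memory_mb
        self.free_vcores -= vcores
        return Container(self.name, memory_mb, vcores)


@dataclass
class ResourceManager:
    """Central decision maker: picks which node serves each request."""
    nodes: list

    def allocate(self, memory_mb: int, vcores: int):
        # Hypothetical first-fit rule: use the first node with enough room.
        for node in self.nodes:
            if node.free_memory_mb >= memory_mb and node.free_vcores >= vcores:
                return node.launch(memory_mb, vcores)
        return None  # No node can satisfy the request right now.


@dataclass
class ApplicationMaster:
    """Per-application negotiator that asks the ResourceManager for containers."""
    app_name: str
    containers: list = field(default_factory=list)

    def negotiate(self, rm: ResourceManager, count: int,
                  memory_mb: int, vcores: int) -> int:
        # Keep asking until the request is met or the cluster is full.
        for _ in range(count):
            container = rm.allocate(memory_mb, vcores)
            if container is None:
                break
            self.containers.append(container)
        return len(self.containers)


# Two applications from different frameworks share the same two-node cluster.
rm = ResourceManager([NodeManager("node1", 4096, 4),
                      NodeManager("node2", 4096, 4)])
spark_am = ApplicationMaster("spark-job")
mr_am = ApplicationMaster("mapreduce-job")

spark_granted = spark_am.negotiate(rm, count=3, memory_mb=2048, vcores=2)
mr_granted = mr_am.negotiate(rm, count=3, memory_mb=2048, vcores=2)

# spark-job is granted 3 containers; mapreduce-job gets 1 before the cluster is full.
print(spark_granted, mr_granted)
```

Each application has its own Application Master, but every request goes through the one Resource Manager, which is why it can act as the ultimate decision maker for the whole cluster.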
2X ↑ Jobs per day
2.5X ↑ Number of tasks from all jobs
2X ↑ CPU utilization

* Source: “Apache Hadoop YARN: Yet Another Resource Negotiator.” In Proceedings of the 4th Annual Symposium on Cloud Computing, 5:1–5:16. SOCC ’13.
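
What "utilization" means for a cluster can be made concrete with a small calculation: the share of total memory and virtual cores currently allocated to containers. The sketch below assumes a dictionary shaped like the cluster metrics payload of the ResourceManager REST API (`/ws/v1/cluster/metrics`). The helper name `utilization` is hypothetical, and the sample numbers are invented for illustration; they are not taken from the cited paper.

```python
def utilization(metrics: dict) -> dict:
    """Return memory and vcore utilization as fractions between 0 and 1."""
    m = metrics["clusterMetrics"]
    return {
        "memory": m["allocatedMB"] / m["totalMB"],
        "vcores": m["allocatedVirtualCores"] / m["totalVirtualCores"],
    }


# Invented sample payload (assumption): 6 GB of 8 GB and 6 of 8 vcores in use.
sample = {
    "clusterMetrics": {
        "allocatedMB": 6144,
        "totalMB": 8192,
        "allocatedVirtualCores": 6,
        "totalVirtualCores": 8,
    }
}

result = utilization(sample)
print(result)  # {'memory': 0.75, 'vcores': 0.75}
```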
YARN → More Applications

Apache Hama

and growing …
Data → Value

Many choices in Hadoop 2.0

One dataset → Many applications

Higher Resource Utilization → Lower Cost
