HADOOP
J. Sai Krishna and G. Sravya Lahari
2nd B.Tech (CSE)
K.O.R.M College of Engineering
Kadapa
Contents
1. Data trends in storing data.
2. Big data problems in the IT industry
3. Introduction to HADOOP
4. HDFS (Hadoop Distributed File System)
5. MapReduce
6. Prominent users of Hadoop.
7. Conclusion
A symbol (character, numeric, special character), or a group of them, is said to be data.
It may be visual, audio, scriptural, etc.
Big data
What is big data?
In IT, it is a collection
HADOOP
What is Hadoop?
It is open-source software written in Java.
The Hadoop software library is a framework that
1. Hadoop Common
It provides access to the filesystems
supported by Hadoop.
The Hadoop Common package contains the
necessary JAR files and scripts needed to
start Hadoop.
The package also provides source code,
documentation, and a contribution section
which includes projects from the Hadoop
Community (Avro, Cassandra, Chukwa,
HBase, Hive, Mahout, Pig, ZooKeeper).
MASTER NODE
Keeps ...
SLAVE NODES
Manage ... scheduling.
File access can be achieved through the native Java API or a client in the language of the user's choice (C++, Java, Python, PHP, Ruby, Erlang, Perl, Haskell, C#, Cocoa, Smalltalk, and OCaml).
HDFS cannot be directly mounted by an existing operating system.
It should be run on a UNIX or Linux system.
3. Hadoop MAPREDUCE SYSTEM
Map function
Reduce function
Input:
  We play tennis
  We love India
Map:
  (We, 1) (play, 1) (tennis, 1)
  (We, 1) (love, 1) (India, 1)
Reduce:
  We 2
  play 1
  tennis 1
  love 1
  India 1
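The word-count example can be sketched in a few lines of plain Python. This is only an illustration of the idea, not Hadoop's actual API: the names map_function, reduce_function and word_count are assumptions made for this sketch, and everything runs in a single process rather than on a cluster.

```python
from collections import defaultdict


def map_function(line):
    """Emit a (word, 1) pair for every word in one line of input."""
    return [(word, 1) for word in line.split()]


def reduce_function(word, counts):
    """Add up all the 1s that were emitted for a single word."""
    return word, sum(counts)


def word_count(lines):
    # Map: turn every input line into (word, 1) pairs.
    pairs = []
    for line in lines:
        pairs.extend(map_function(line))

    # Group the pairs by word so that each word reaches one reduce call.
    grouped = defaultdict(list)
    for word, one in pairs:
        grouped[word].append(one)

    # Reduce: sum the counts for each word.
    return dict(reduce_function(word, counts) for word, counts in grouped.items())


print(word_count(["We play tennis", "We love India"]))
# {'We': 2, 'play': 1, 'tennis': 1, 'love': 1, 'India': 1}
```

In a real Hadoop job the map calls would run on many nodes at once and the framework would do the grouping; here a dictionary stands in for that step.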
MapReduce - lifecycle
Input -> Splits -> Map function (Map phase) -> Reduce function (Reduce phase)
Prominent users of Hadoop
Adobe: 80-node system
eBay: 532-node cluster
Yahoo: cluster of about 4500 nodes
IIIT Hyderabad: 30-node cluster
Achievements
March 2011 - Apache Hadoop takes top
Conclusion:
It reduces traffic in capture, storage, search, sharing, analysis, and visualization.
A huge amount of data can be stored and large computations can be done within a single system, with full safety and security, at low cost.
Big data and big data solutions are among the burning issues in the present IT industry, so working on them will surely make you more valuable to it.
Thank you
Any queries?