BIG DATA
HADOOP
HDFS
MAPREDUCE
ALCHETRON
FEEDBACK
Q/A
To understand BIG DATA, we first have to understand data
STONE TABLETS
AS TIME PASSED, WE STARTED CREATING MORE DATA, AS YOU CAN SEE IN THIS PICTURE, WHICH IS 3,000-10,000 YEARS OLD
Johannes Gutenberg
100 crore (1 billion) books had been printed by the 18th century, and, my dear friends, you were still not born...
30 years of mobile technology
Technological change will be so rapid & exponential
With the invention of the internet, data creation explodes
SO WHAT IS BIG DATA??
Every day, we create 2.5 quintillion bytes of data, so much that 90% of the data in the world today has been created in the last two years alone. This data comes from everywhere: sensors used to gather climate information, posts to social media sites, digital pictures and videos, purchase transaction records, and cell phone GPS signals, to name a few.
HADOOP
Open Source Apache Project
Written in Java
Runs on Linux, Mac OS X, Windows, and Solaris
Commodity hardware
Contents
History of Hadoop
The current applications of Hadoop
Hadoop HDFS + MAP-REDUCE
Other Hadoop projects
History of Hadoop
Doug Cutting
Reads the Map-reduce paper, 2004
It is an important technique!
Extended Apache Nutch
Joins Yahoo! in 2006
Yahoo! became the primary contributor in 2006
Yahoo! deployed large-scale science clusters in 2007.
Tons of Yahoo! Research papers emerge at:
WWW
CIKM
SIGIR
HDFS
Map-Reduce Architecture
Map-reduce is basically a data processing engine
To understand it deeply, you should have experience coding in Java
Let's try to learn the architecture of map-reduce
An example
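Word count is the usual first example of map-reduce. The sketch below is not Hadoop code and does not use the Hadoop API; it is a small in-memory Python illustration, assuming the three classic phases (map, shuffle, reduce). The function names map_phase, shuffle_phase, reduce_phase and word_count, and the sample lines, are made up for this illustration.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Shuffle: group the emitted values by key (done by the framework in a real cluster)."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle and reduce in sequence on a list of text lines."""
    mapped = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    # Hypothetical sample input, standing in for lines of a file stored in HDFS.
    sample = ["big data needs hadoop", "hadoop stores big data in hdfs"]
    print(word_count(sample))
```

In a real cluster, many map tasks and reduce tasks would run in parallel on different machines; here everything runs in one process so that the data flow is easy to follow.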
BORED
ALMOST THERE
Zookeeper (initiated by Yahoo!)
Coordinating distributed systems
Nowadays
Who uses Hadoop?
Amazon/A9
Alchetron
Fox Interactive Media
Google
IBM
Facebook
Quantcast
Rackspace/Mailtrust
Veoh
Yahoo!
More at http://wiki.apache.org/hadoop/PoweredBy
Search
Index
When you visit Alchetron.com, you are interacting with data processed with Hadoop!!
Organizing data
Content
Filtering
References
For more information:
http://hadoop.apache.org/
http://developer.yahoo.com/hadoop/
http://alchetron.com/What-is-Big-data1530-W
http://alchetron.com/Big-Data-Hadoop260-W