LAB 1

Cloud Computing

Virtualization

Jinnah University for Women
Instructor Engr S M Asim Ali
TASK LIST

What is virtualization?
Show your understanding through two examples.
LAB 2
Cloud Computing
Services

TASK LIST

Which operating system would you prefer for creating a virtual
environment?
List the services of the Microsoft operating system or
Linux that support virtualization.
LAB 3
Cloud Computing
HADOOP as a tool for MAP REDUCE

TASK LIST

Introduction
Data Grid vs. Computing Grid
Grid Computing
Cloud Computing
Data Grid (HaDoop File System)
Computing Grid (Map Reduce)
Counting of Words
Conclusion
Motivation
Count how frequently each word appears in the corpus
MEDLINE (18 million texts)
I want to extend my research to another corpus
Need more computing resources
Data Grid vs. Computing Grid
Data Grid:
distributed data storage
controlled sharing and management of large amounts of
distributed data.
Computing Grid:
Parallel execution
divide pieces of a program among several computers

Data Grid + Computing Grid = Grid Computing

Grid Computing
[Figure: the grid, with a master and slaves]
Motivation: high performance, improved resource utilization
Aims to create the illusion of a simple yet powerful computer
out of a large number of heterogeneous systems
Tasks are submitted and distributed across the nodes of the grid
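
The master/slave distribution described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: worker threads on one machine stand in for the slave nodes of a grid, and the names `run_task` and `master` are hypothetical, not part of any grid framework.

```python
from concurrent.futures import ThreadPoolExecutor


def run_task(task):
    """Work done by one slave: here, sum one chunk of numbers."""
    return sum(task)


def master(data, n_slaves=4):
    """Split the data into chunks, distribute them to workers, combine the results."""
    # Ceiling division so every element lands in some chunk.
    chunk_size = max(1, -(-len(data) // n_slaves))
    tasks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    # Threads are an assumption standing in for separate grid nodes.
    with ThreadPoolExecutor(max_workers=n_slaves) as pool:
        partial_results = list(pool.map(run_task, tasks))
    return sum(partial_results)


total = master(list(range(1, 101)))
print(total)  # 5050
```

On a real grid the chunks would travel over the network to heterogeneous machines; the submit, distribute, and combine pattern is what the sketch is meant to show.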
Cloud Computing
"The interesting thing about cloud computing is that we've redefined
cloud computing to include everything that we already do."
Larry Ellison
during Oracle's Analyst Day

Cloud Computing
Pay-as-you-go
No initial investments
Reduced operation costs
Scalability
Availability
Cloud Computing - Open Issues
Bandwidth and latency
Lack of standards and portability
Black-box implementations
Security and lack of control
Immature tools and framework support
Legal issues (ownership, auditing, etc.)
Limited Service Level Agreements (SLAs)
Data Grid (Hadoop FS - Overview)
[Figure: Hadoop FS overview. Labels: Namenode (master node) with Metadata (Name, .., ..) and Index; Datanodes (slave nodes) with block ops; Client asking for a specific text; Caching of Data; Replication]
Data Grid (HDFS - Replication Data)
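
To make the idea of block replication concrete, here is a toy Python sketch. It assumes a simple round-robin placement and hypothetical datanode names (`dn1` to `dn4`); real HDFS uses its own placement policy, so this shows only the principle that each block is stored on several datanodes while the namenode keeps the index.

```python
def split_into_blocks(data, block_size):
    """Cut a file's bytes into fixed-size blocks (the last one may be shorter)."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def place_replicas(n_blocks, datanodes, replication=3):
    """Toy namenode index: map each block number to the datanodes holding a copy.

    Round-robin placement is an assumption made for illustration only.
    """
    if replication > len(datanodes):
        raise ValueError("replication factor exceeds number of datanodes")
    return {
        b: [datanodes[(b + r) % len(datanodes)] for r in range(replication)]
        for b in range(n_blocks)
    }


def readable_blocks(index, failed_node):
    """Blocks that still have at least one copy after one datanode fails."""
    return [b for b, nodes in index.items() if any(n != failed_node for n in nodes)]


blocks = split_into_blocks(b"abcdefghij", block_size=4)
index = place_replicas(len(blocks), ["dn1", "dn2", "dn3", "dn4"], replication=3)
print(index)
print(readable_blocks(index, failed_node="dn2"))
```

With three copies per block, losing any single datanode leaves every block readable, which is the point of replication.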
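Counting Words in Text Files
[Figure: word counting as split, map, and reduce operations]
Split-Operation: the input is split into four files
Map-Operation: countWords(File) runs once per file and yields per-file counts
      File 1   File 2   File 3   File 4
w1:     1        3        2        0
w2:     0        5        1        8
w3:     6        2        3        4
w4:     7        2        3        5
w5:     0        1        0        0
Reduce-Operation: the per-file counts are summed per word
w1: 6
w2: 14
w3: 15
w4: 17
w5: 1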
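
The word-count flow above can be sketched in plain Python. This is not Hadoop code: `count_words` and `merge_counts` are hypothetical names that play the roles of the map and reduce operations, and the splits are tiny in-memory strings rather than files. The second part re-adds the per-file counts from the figure; the assignment of rows to words is inferred from the totals.

```python
from collections import Counter
from functools import reduce


def count_words(text):
    """Map step: count the words in one split."""
    return Counter(text.lower().split())


def merge_counts(a, b):
    """Reduce step: add two partial counts together."""
    return a + b


# Split step: three small in-memory "files" (illustrative data only).
splits = ["w1 w3 w3", "w2 w1 w2", "w3 w2 w4"]
partial = [count_words(s) for s in splits]
totals = reduce(merge_counts, partial, Counter())
print(dict(totals))

# Per-file counts as read from the figure (row-to-word mapping is inferred).
figure_counts = {
    "w1": [1, 3, 2, 0],
    "w2": [0, 5, 1, 8],
    "w3": [6, 2, 3, 4],
    "w4": [7, 2, 3, 5],
    "w5": [0, 1, 0, 0],
}
figure_totals = {word: sum(counts) for word, counts in figure_counts.items()}
print(figure_totals)
```

Because the reduce step is a plain sum, the partial counts can be merged in any order, which is what lets the map calls run independently on different nodes.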
Advantages of Hadoop
Written purely in Java; requires installation of Cygwin under
Windows
Available under the Apache 2.0 license
Usually offers only one implementation for each of the
features of a grid framework
Can also use file systems other than Hadoop FS
Very flexible implementation of MapReduce
For the split operation, supports only FileSplit out of the box
Better suited for computations where
large data collections must be handled
the reduce operation is more than a simple aggregation of
the map output