
ISSN 1000-9825, CODEN RUXUEW

Journal of Software, Vol.20, No.5, May 2009, pp.1337-1348


doi: 10.3724/SP.J.1001.2009.03493
by Institute of Software, the Chinese Academy of Sciences. All rights reserved.

E-mail: jos@iscas.ac.cn
http://www.jos.org.cn
Tel/Fax: +86-10-62562563


Cloud Computing: System Instances and Current Research


CHEN Kang^1,2+, ZHENG Wei-Min^1,2

^1 (Tsinghua National Laboratory for Information Science and Technology, Tsinghua University, Beijing 100084, China)
^2 (Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China)

+ Corresponding author: E-mail: ck99@mails.tsinghua.edu.cn

Chen K, Zheng WM. Cloud computing: System instances and current research. Journal of Software, 2009,
20(5):1337-1348. http://www.jos.org.cn/1000-9825/3493.htm
Abstract: This paper surveys the technologies currently adopted in cloud computing and the systems deployed in
enterprises. Cloud computing can be viewed from two aspects. One is the cloud infrastructure, which is the
building block for upper-layer cloud applications. The other is the cloud application itself. This paper
focuses on the cloud infrastructure, covering both existing systems and current research; some attractive cloud
applications are also discussed. Cloud computing infrastructure has three distinct characteristics. First, it is built
on large-scale clusters that contain a large number of inexpensive PC servers. Second, applications are
co-designed with the underlying infrastructure so that computing resources can be utilized to the fullest. Third, the
reliability of the whole system is achieved by software running on redundant hardware rather than by
hardware alone. All of these technologies serve two important goals of distributed systems: high scalability and high
availability. Scalability means that the cloud infrastructure can be expanded to a very large scale, up to thousands of
nodes. Availability means that services remain available even when a considerable number of nodes fail. From this
paper, readers will gain an understanding of the current status of cloud computing as well as its future trends.
Key words: cloud computing; distributed infrastructure; distributed paradigm


Supported by the National Natural Science Foundation of China under Grant No.90718040; the National Basic
Research Program of China (973 Program) under Grant No.2007CB310900; the National High-Tech Research and
Development Plan of China (863 Program) under Grant No.2008AA01Z112
Received 2008-06-13; Accepted 2008-10-09



CLC number: TP393

Document code: A

IBM 2007 [1],.


,,
,. IBM
Cloud Computing[2]:
.
(provision)(configuration)(reconfigure)(deprovision).
.,
(SANs),,., Internet
.
.
.
:,, PC
;.,
,.
,,
, Web 2.0 .
. 3
:
1) .,
, x86 .
.
2) ,.,
,.,
.
3) ,.,
,.,,
.
,:.
,.
,.
,.
[3,4].,
.() 1998
.,.
3 , Google IBM


Amazon .,.

1998 , 2004 ,

[3,4].,
,.
,.,
,,
.
1 3 .,
PDA,.,
.,
,.,
, PC
,.

Fig.1 Architecture of transparent computing system
[Figure omitted. Three layers are shown: transparent clients (lightweight devices such as PC, PDA, smartphone); transparent network (Ethernet, CATV, 802.11, IEEE 1394, et al.); transparent servers (PC, servers, mainframe).]

, Linux Windows .
,. PC
,,.
.
,,.
,.

Google
Google [5], Google [6],

.Google 4 :Google File


System[7] , Google MapReduce [8],


Chubby[9] Google BigTable[10].


Google File System (GFS)
,,GFS Google [7].
4 :1) ,;2)
, G ,;3) ,
,;4) ,.
2 Google File System . 2 , GFS ,
., Linux ,
., 3 .
,.
,GFS .GFS Google ,
GFS . 1 000 , 300T ,
.
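Fig.2 Google File System architecture
[Figure omitted. An application calls the GFS client. The client sends (file name, chunk index) to the GFS master and receives (chunk handle, chunk locations); the master holds the file namespace (for example /foo/bar mapped to chunk 2ef0), sends instructions to the chunkservers and collects chunkserver state. The client then sends (chunk handle, byte range) to a GFS chunkserver and receives chunk data. Each chunkserver stores its chunks in a local Linux file system. The figure distinguishes control messages from data messages.]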
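The read path labelled in Fig.2 can be sketched in a few lines of Python. This is a minimal single-process illustration, not GFS code: the class names, the `write_file` helper, the handle format and the 8-byte chunk size are assumptions made for the example (the GFS paper [7] describes 64 MB chunks and three replicas by default).

```python
from dataclasses import dataclass, field

CHUNK_SIZE = 8  # bytes; tiny on purpose (assumption, GFS itself uses 64 MB chunks)


@dataclass
class ChunkServer:
    chunks: dict = field(default_factory=dict)  # chunk handle -> bytes

    def read(self, handle, start, end):
        """Return bytes [start, end) of one chunk."""
        return self.chunks[handle][start:end]


@dataclass
class Master:
    # (file name, chunk index) -> (chunk handle, [chunkserver ids])
    table: dict = field(default_factory=dict)

    def lookup(self, name, index):
        """Metadata only: the master never serves file data."""
        return self.table.get((name, index))


def write_file(master, servers, name, data, replicas=3):
    """Hypothetical helper: split data into chunks and replicate each chunk."""
    for index in range((len(data) + CHUNK_SIZE - 1) // CHUNK_SIZE):
        handle = f"{name}#{index}"
        piece = data[index * CHUNK_SIZE:(index + 1) * CHUNK_SIZE]
        locations = [(index + r) % len(servers) for r in range(replicas)]
        for sid in locations:
            servers[sid].chunks[handle] = piece
        master.table[(name, index)] = (handle, locations)


def read_file(master, servers, name, offset, length, down=()):
    """Client read: control messages to the master, data from a chunkserver."""
    out = b""
    while length > 0:
        index, start = divmod(offset, CHUNK_SIZE)
        entry = master.lookup(name, index)
        if entry is None:
            break  # offset is past the last chunk
        handle, locations = entry
        sid = next(s for s in locations if s not in down)  # any live replica
        piece = servers[sid].read(handle, start, min(CHUNK_SIZE, start + length))
        if not piece:
            break  # read past end of file
        out += piece
        offset += len(piece)
        length -= len(piece)
    return out


servers = [ChunkServer() for _ in range(4)]
master = Master()
write_file(master, servers, "/foo/bar", b"hello, google file system")
```

Calling `read_file(master, servers, "/foo/bar", 7, 6)` crosses a chunk boundary and still returns the contiguous bytes; passing `down={0, 1}` shows that a read succeeds as long as one replica of each chunk is reachable.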

MapReduce
Google MapReduce [8,11,12].
,,,.MapReduce Map()
Reduce(), Map Reduce
. MapReduce , MapReduce
..
3 map , 1( Key-Value )
MapReduce . MapReduce ,


Reduce . Reduce .
4 MapReduce , Map Reduce ,.
, key Reduce
.
map(String input_key, String input_value):
    // input_key: document name
    // input_value: document contents
    for each word w in input_value:
        EmitIntermediate(w, "1");

reduce(String output_key, Iterator intermediate_values):
    // output_key: a word
    // intermediate_values: a list of counts
    int result = 0;
    for each v in intermediate_values:
        result += ParseInt(v);
    Emit(AsString(result));

Fig.3 WordCount program using MapReduce framework
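For readers who want to run the WordCount program of Fig.3, the following Python sketch reproduces it in a single process. The `run_mapreduce` driver is a hypothetical stand-in for the framework (it is not Google's or Hadoop's API); it only shows the contract between the two user functions: Map emits key-value pairs, the framework groups them by key, and Reduce folds each group.

```python
from collections import defaultdict


def map_fn(doc_name, contents):
    """Map: emit (word, 1) for every word in one document."""
    for word in contents.split():
        yield word, 1


def reduce_fn(word, counts):
    """Reduce: sum all counts emitted for one word."""
    return word, sum(counts)


def run_mapreduce(documents, map_fn, reduce_fn):
    """Single-process stand-in for the framework: map, group by key, reduce."""
    groups = defaultdict(list)
    for name, contents in documents.items():
        for key, value in map_fn(name, contents):
            groups[key].append(value)
    return dict(reduce_fn(key, values) for key, values in sorted(groups.items()))


docs = {"a.txt": "the cloud the cluster", "b.txt": "the cloud"}
counts = run_mapreduce(docs, map_fn, reduce_fn)
```

In a real deployment the Map calls run on many nodes and the grouping step moves data across the network; the user-supplied functions stay the same.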

Fig.4 Execution steps of MapReduce processing programs (M stands for the execution of Map, R for the execution of Reduce)
[Figure omitted. The input is processed by Map tasks (M) into intermediate key-value pairs (k1:v, k2:v, k3:v, k4:v, k5:v, with some keys repeated). The pairs are grouped by key (k1:v,v,v,v; k2:v; k3:v,v; k4:v,v,v; k5:v), and each group is passed to Reduce (R) to produce the output.]
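The "group by key" step of Fig.4 requires that all pairs with the same key reach the same Reduce task. The MapReduce paper [8] describes a default partitioning function of the form hash(key) mod R; the sketch below illustrates that idea. The use of CRC32 as the hash and the function names are assumptions for the example, chosen so the result is deterministic across runs.

```python
import zlib
from collections import defaultdict


def partition(key, num_reducers):
    """Pick the Reduce task for a key; equal keys always map to the same task."""
    return zlib.crc32(key.encode("utf-8")) % num_reducers


def shuffle(intermediate, num_reducers):
    """Group intermediate (key, value) pairs by Reduce task, then by key."""
    tasks = [defaultdict(list) for _ in range(num_reducers)]
    for key, value in intermediate:
        tasks[partition(key, num_reducers)][key].append(value)
    return tasks


pairs = [("k1", 1), ("k2", 1), ("k1", 1), ("k3", 1), ("k4", 1), ("k1", 1)]
tasks = shuffle(pairs, num_reducers=2)
```

Each element of `tasks` is the input of one Reduce task: a mapping from key to the list of values emitted for it by all Map tasks.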

BigTable
Google , Google
BigTable[10].BigTable Search History,Maps,Orkut,RSS .
5 BigTable .,
.BigTable ,,.
Tablet. 6 BigTable .


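Fig.5 Data model of Google BigTable
[Figure omitted. Rows (for example www.cnn.com) and columns (for example Contents) identify a cell; each cell holds several versions of its value (html), distinguished by the timestamps t1, t2 and t3.]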
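The data model of Fig.5 amounts to a sorted map from (row, column, timestamp) to a value, as described in the Bigtable paper [10]. The following Python sketch is an in-memory illustration of that model only; the class and method names are hypothetical and nothing here reflects Bigtable's storage format or client API.

```python
from collections import defaultdict


class SparseTable:
    """Toy (row, column, timestamp) -> value map that keeps every version."""

    def __init__(self):
        self._cells = defaultdict(dict)  # (row, column) -> {timestamp: value}

    def put(self, row, column, timestamp, value):
        self._cells[(row, column)][timestamp] = value

    def get(self, row, column, timestamp=None):
        """Return the newest value at or before `timestamp` (latest if None)."""
        versions = self._cells.get((row, column), {})
        eligible = [t for t in versions if timestamp is None or t <= timestamp]
        return versions[max(eligible)] if eligible else None

    def scan(self, row_prefix):
        """Yield (row, column) keys in sorted row order, as a range scan would."""
        for row, column in sorted(self._cells):
            if row.startswith(row_prefix):
                yield row, column


table = SparseTable()
table.put("www.cnn.com", "contents", 1, "<html>v1")
table.put("www.cnn.com", "contents", 2, "<html>v2")
table.put("www.cnn.com", "contents", 3, "<html>v3")
```

Absent cells cost nothing in this representation, which is what makes the table sparse; keeping rows in sorted order is what allows contiguous row ranges to be split into tablets.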

Fig.6 Organization of BigTable System
[Figure omitted. A Bigtable client uses the Bigtable client library and calls Open() on a Bigtable cell. The cell consists of a Bigtable master (performs metadata operations and load balancing) and several Bigtable tablet servers (serve data). The cell relies on a cluster scheduling master (handles failover and monitoring), GFS (holds tablet data and logs) and a lock service (holds metadata, handles master election).]

,BigTable ,
.BigTable ,, Google
, Chubby.Chubby ,BigTable Chubby
, Chubby ,.BigTable
,., tablet (
)., tablet
.
Google 3 .Google ,
.[5] Google , x86
.Sawzall[13] MapReduce ,
.Chubby[9].,Chubby Paxos [14]


.Chubby .
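Fig.6 notes that the lock service handles master election. One common way to build election on an exclusive lock is for every candidate to try to acquire the same lock and for the holder to act as master. The Python sketch below illustrates that pattern with a toy in-memory service; the class, its methods and the lock path are hypothetical, and the replication that Chubby obtains from Paxos [9,14] is deliberately left out.

```python
class LockService:
    """Toy stand-in for a Chubby-like lock service (single process, no Paxos)."""

    def __init__(self):
        self._holders = {}  # lock path -> current holder

    def try_acquire(self, path, client):
        """Grant the exclusive lock if it is free; return the resulting holder."""
        return self._holders.setdefault(path, client)

    def release(self, path, client):
        """Release the lock, e.g. when the holder's session expires."""
        if self._holders.get(path) == client:
            del self._holders[path]

    def holder(self, path):
        return self._holders.get(path)


def elect(service, path, candidates):
    """Every candidate tries the same lock; whoever holds it is the master."""
    for candidate in candidates:
        service.try_acquire(path, candidate)
    return service.holder(path)


service = LockService()
LOCK = "/ls/cell/bigtable-master"  # hypothetical lock path
first = elect(service, LOCK, ["server-a", "server-b", "server-c"])
service.release(LOCK, first)  # simulate failure of the first master
second = elect(service, LOCK, ["server-b", "server-c"])
```

At most one candidate can hold the lock at a time, so there is never more than one master; when the holder fails and its lock is released, the remaining candidates compete again.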

IBM
IBM , Internet ,

. IBM , IBM
,. IBM Almaden ,
Xen[15,16] PowerVM[17] ,Linux Hadoop[18] (Google File System
MapReduce ).IBM x86 .

Fig.7 Architecture of IBM Blue Cloud
[Figure omitted. Several virtual machines, each with a Tivoli monitoring agent, run on a virtualization infrastructure based on open source Linux with Xen. A provisioning management stack provisions bare-metal machines and Xen VMs; it comprises Tivoli Provisioning Manager, an IBM Tivoli Monitoring machine, DB2 and WebSphere Application Server.]

,IBM Tivoli (Tivoli provisioning manager)IBM


Tivoli (IBM Tivoli monitoring)IBM WebSphere IBM DB2
. x86 ,
. Apache Hadoop
.Hadoop Google File System MapReduce .

[19].,
,. IBM p ,
LPAR(logic partition). CPU IBM Enterprise Workload Manager
.,.p
1/10 (CPU).Xen , Linux
.
,,,


.,:1)
[20,21],,;
2) ,,
,;3) ,,
;4) ,,
,
.

,
. Google File System
SAN.
,.,
,.,
,,.
.,, Google File System
, SAN .
,SAN ( Google File System),SAN
,,. Google File System
, SAN .,
,.

Amazon
Amazon ,, Amazon

.Amazon (elastic compute cloud,


EC2)[22],.Amazon
,(instance).
,,
..,
,.
Amazon Amazon (Amazon Web services).2006 3
,Amazon (simple storage service, S3), SOAP
. 2007 7 ,Amazon (simple queue service, SQS),
,,.Amazon
EBS(elastic block storage),.,Amazon
EC2 ,. 8 EC2 .
8 , SOAP over HTTPS Amazon
.,,
,(Amazon ).
.,.
,,.
,Amazon ,,.
:,.


,Amazon
., Amazon ,
,.
Fig.8 Usage model of Amazon elastic computing cloud
[Figure omitted. Cloud computing clients communicate via SOAP over HTTPS with the Amazon elastic computing cloud (EC2), which hosts EC2 instance 1, EC2 instance 2 and EC2 instance 3.]

,,(
Microsoft,)( Yahoo,Google ).,
,.
, Dryad .Amazon Dynamo ,
Ask.com Neptune .,
Dryad[23],.Dryad , MapReduce
. Dryad ,
.Amazon (,)
( Key,Value ), Dynamo[24]. Amazon ,
(,),.,
,,Dynamo ,
,.Dynamo P2P
,, Gossip
. Ask.com Neptune[25].
,.,
..
,,
MapReduce .Yahoo MapReduce , MapReduce
Merge , MapReduceMerge[26] .
Merge ,.Stanford MapReduce
[27], MapReduce ,


pthread . , MapReduce
,MapReduce ,.Wisconsin Cell
MapReduce [28]. Cell , 1 8 ,
. MapReduce Cell . ,Cell
MapReduce . MapReduce ,HP Sinfonia[29]
.Sinfornia ,(Compare,Read,Write).
,Compare ,,
,Read Write ,. Compare ,
,.,
,.
.,,.
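The three item kinds of a Sinfonia minitransaction mentioned above (Compare, Read, Write) can be illustrated with a small Python sketch: the writes are applied and the reads returned only if every compare item matches. This is a single-node toy under stated assumptions; the class and function names are hypothetical, and the real system [29] runs minitransactions across several memory nodes with a commit protocol that is not modelled here.

```python
import threading


class MemoryNode:
    """Toy single-node store that executes minitransactions atomically."""

    def __init__(self):
        self._data = {}
        self._lock = threading.Lock()

    def minitransaction(self, compare=(), read=(), write=()):
        """compare/write are (address, value) pairs; read is a list of addresses.

        Returns (committed, values_read). Writes are applied only if every
        compare item matches the stored value.
        """
        with self._lock:
            for address, expected in compare:
                if self._data.get(address) != expected:
                    return False, {}
            values = {address: self._data.get(address) for address in read}
            for address, value in write:
                self._data[address] = value
            return True, values


def compare_and_swap(node, address, old, new):
    """Compare-and-swap expressed as one minitransaction."""
    committed, _ = node.minitransaction(compare=[(address, old)],
                                        write=[(address, new)])
    return committed


node = MemoryNode()
node.minitransaction(write=[("x", 1)])
```

Because the compare, read and write items are all known before execution starts, the whole operation can be decided in one step, which is what lets higher-level primitives such as compare-and-swap be expressed as a single minitransaction.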

.,

,;,
,.,
,.,
3 ,,
,. 3
,,.
.,
,.Google
,, GWT(Google Web toolkit),Google App Engine Google Map
API ,Google .IBM
,.Amazon
,,. 1
.,,
,.
:,
;,. 1
,.,
, facebook ,,
.

,,

.,
,,.
,,
,.

Table 1 Features comparison among cloud computing systems

Compatibility with traditional software
    Transparent computing platform from Tsinghua University: Based on transparent management technology; completely compatible with current software. Current systems and software can run directly on top of the transparent computing platform.
    Google cloud computing infrastructure: The new network system is built from scratch; current software cannot run on the infrastructure. Not compatible.
    IBM BlueCloud product: Virtualization provided; can run traditional software as well as the new cloud computing interface for programming new applications.
    Amazon elastic computing cloud: Virtualization provided; can run traditional software.

System openness
    Transparent computing platform from Tsinghua University: Developed with private technologies.
    Google cloud computing infrastructure: Developed with private technologies.
    IBM BlueCloud product: Developed with open source technologies.
    Amazon elastic computing cloud: Combines open source and private technologies.

Adoption of system virtualization technology
    Transparent computing platform from Tsinghua University: Provides the runtime environment directly on bare-metal hardware; no overhead that might be introduced by virtualization.
    Google cloud computing infrastructure: No system virtualization technology adopted; only supports new applications.
    IBM BlueCloud product: Uses the open source virtualization software Xen, with virtualization overheads.
    Amazon elastic computing cloud: Uses the open source virtualization software Xen, with virtualization overheads.

Target users
    Transparent computing platform from Tsinghua University: For end users to use directly.
    Google cloud computing infrastructure: End users can use it directly; specific interfaces are also opened to developers for building new applications.
    IBM BlueCloud product: For developers.
    Amazon elastic computing cloud: For developers.

Programming support
    Transparent computing platform from Tsinghua University: No programming interface.
    Google cloud computing infrastructure: Specific network application programming interfaces are provided.
    IBM BlueCloud product: Local distributed application programming interface.
    Amazon elastic computing cloud: Network remote operation interface.

References:
[1] Sims K. IBM introduces ready-to-use cloud computing collaboration services get clients started with cloud computing. 2007. http://www-03.ibm.com/press/us/en/pressrelease/22613.wss
[2] Boss G, Malladi P, Quan D, Legregni L, Hall H. Cloud computing. IBM White Paper, 2007. http://download.boulder.ibm.com/ibmdl/pub/software/dw/wes/hipods/Cloud_computing_wp_final_8Oct.pdf
[3] Zhang YX, Zhou YZ. 4VP+: A novel meta OS approach for streaming programs in ubiquitous computing. In: Proc. of IEEE the 21st Int'l Conf. on Advanced Information Networking and Applications (AINA 2007). Los Alamitos: IEEE Computer Society, 2007. 394-403.
[4] Zhang YX, Zhou YZ. Transparent Computing: A new paradigm for pervasive computing. In: Ma JH, Jin H, Yang LT, Tsai JJP, eds. Proc. of the 3rd Int'l Conf. on Ubiquitous Intelligence and Computing (UIC 2006). Berlin, Heidelberg: Springer-Verlag, 2006. 1-11.
[5] Barroso LA, Dean J, Hölzle U. Web search for a planet: The Google cluster architecture. IEEE Micro, 2003,23(2):22-28.
[6] Brin S, Page L. The anatomy of a large-scale hypertextual Web search engine. Computer Networks, 1998,30(1-7):107-117.
[7] Ghemawat S, Gobioff H, Leung ST. The Google file system. In: Proc. of the 19th ACM Symp. on Operating Systems Principles. New York: ACM Press, 2003. 29-43.
[8] Dean J, Ghemawat S. MapReduce: Simplified data processing on large clusters. In: Proc. of the 6th Symp. on Operating System Design and Implementation. Berkeley: USENIX Association, 2004. 137-150.
[9] Burrows M. The chubby lock service for loosely-coupled distributed systems. In: Proc. of the 7th USENIX Symp. on Operating Systems Design and Implementation. Berkeley: USENIX Association, 2006. 335-350.
[10] Chang F, Dean J, Ghemawat S, Hsieh WC, Wallach DA, Burrows M, Chandra T, Fikes A, Gruber RE. Bigtable: A distributed storage system for structured data. In: Proc. of the 7th USENIX Symp. on Operating Systems Design and Implementation. Berkeley: USENIX Association, 2006. 205-218.
[11] Dean J, Ghemawat S. Distributed programming with Mapreduce. In: Oram A, Wilson G, eds. Beautiful Code. Sebastopol: O'Reilly Media, Inc., 2007. 371-384.
[12] Dean J, Ghemawat S. MapReduce: Simplified data processing on large clusters. Communications of the ACM, 2008,51(1):107-113.
[13] Pike R, Dorward S, Griesemer R, Quinlan S. Interpreting the data: Parallel analysis with Sawzall. Scientific Programming Journal, 2005,13(4):277-298.
[14] Lamport L. Paxos made simple. ACM SIGACT News, 2001,32(4):51-58.
[15] Barham P, Dragovic B, Fraser K, Hand S, Harris T, Ho A, Neugebauer R, Pratt I, Warfield A. Xen and the art of virtualization. In: Proc. of the 19th ACM Symp. on Operating Systems Principles. New York: Bolton Landing, 2003. 164-177.
[16] Citrix Systems. Citrix XenServer: Efficient virtual server software. XenSource Company. http://www.xensource.com/
[17] IBM. IBM virtualization. 2009. http://www.ibm.com/virtualization
[18] Apache. Apache hadoop. http://hadoop.apache.org/core/
[19] Smith JE, Nair R. Virtual Machines: Versatile Platforms for Systems and Processes. San Francisco: Morgan Kaufmann Publishers, 2005.
[20] Clark C, Fraser K, Hansen JG, Jul E, Pratt I, Warfield A. Live migration of virtual machines. In: Proc. of the 2nd Symp. on Networked Systems Design and Implementation. Berkeley: USENIX Association, 2005. 273-286.
[21] Nelson M, Lim BH, Hutchins G. Fast transparent migration for virtual machines. In: Proc. of the USENIX 2005 Annual Technical Conf. Berkeley: USENIX Association, 2005. 391-394.
[22] Amazon. Amazon elastic compute cloud (Amazon EC2). 2009. http://aws.amazon.com/ec2/
[23] Isard M, Budiu M, Yu Y, Birrell A, Fetterly D. Dryad: Distributed data-parallel programs from sequential building blocks. In: Proc. of the 2nd European Conf. on Computer Systems (EuroSys). 2007. 59-72.
[24] DeCandia G, Hastorun D, Jampani M, Kakulapati G, Lakshman A, Pilchin A, Sivasubramanian S, Vosshall P, Vogels W. Dynamo: Amazon's highly available key-value store. In: Proc. of the 21st ACM Symp. on Operating Systems Principles. New York: ACM Press, 2007. 205-220.
[25] Chu LK, Tang H, Yang T, Shen K. Optimizing data aggregation for cluster-based Internet services. In: Proc. of the ACM SIGPLAN Symp. on Principles and Practice of Parallel Programming. New York: ACM Press, 2003. 119-130.
[26] Yang HC, Dasdan A, Hsiao RL, Parker DS. Map-Reduce-Merge: Simplified relational data processing on large clusters. In: Proc. of the 2007 ACM SIGMOD Int'l Conf. on Management of Data. New York: ACM Press, 2007. 1029-1040.
[27] Ranger C, Raghuraman R, Penmetsa A, Bradski G, Kozyrakis C. Evaluating MapReduce for multi-core and multiprocessor systems. In: Proc. of the 13th Int'l Symp. on High-Performance Computer Architecture. Los Alamitos: IEEE Computer Society, 2007. 13-24.
[28] de Kruijf M, Sankaralingam K. MapReduce for the Cell B.E. architecture. Technical Report, CS-TR-2007-1625, University of Wisconsin Computer Sciences, 2007.
[29] Aguilera MK, Merchant A, Shah M, Veitch A, Karamanolis C. Sinfonia: A new paradigm for building scalable distributed systems. In: Proc. of the 21st ACM Symp. on Operating Systems Principles. New York: ACM Press, 2007. 159-174.

(1976),,,,,

(1946),,,,CCF

,.

,.
