E-mail: jos@iscas.ac.cn
http://www.jos.org.cn
Tel/Fax: +86-10-62562563
ZHENG Wei-Min1,2
(Tsinghua National Laboratory for Information Science and Technology, Tsinghua University, Beijing 100084, China)
(Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China)
Chen K, Zheng WM. Cloud computing: System instances and current research. Journal of Software, 2009,
20(5):1337-1348. http://www.jos.org.cn/1000-9825/3493.htm
Abstract:
This paper surveys the technologies currently adopted in cloud computing and the systems deployed in
enterprises. Cloud computing can be viewed from two aspects. One is the cloud infrastructure, which is the
building block for upper-layer cloud applications. The other is the cloud application itself. This paper
focuses on the cloud infrastructure, including existing systems and current research. Some attractive cloud
applications are also discussed. Cloud computing infrastructure has three distinct characteristics. First, the
infrastructure is built on large-scale clusters that contain a large number of cheap PC servers. Second,
applications are co-designed with the underlying infrastructure so that computing resources can be maximally
utilized. Third, the reliability of the whole system is achieved by software built on top of redundant hardware
rather than by hardware alone. All these technologies serve two important goals of distributed systems: high
scalability and high availability. Scalability means that the cloud infrastructure can be expanded to a very
large scale, even to thousands of nodes. Availability means that services remain available even when a
considerable number of nodes fail. From this paper, readers will grasp the current status of cloud computing
as well as its future trends.
Key words:
Supported by the National Natural Science Foundation of China under Grant No.90718040; the National Basic
Research Program of China (973 Program) under Grant No.2007CB310900; the National High-Tech Research and
Development Plan of China (863 Program) under Grant No.2008AA01Z112
Received 2008-06-13; Accepted 2008-10-09
CLC number: TP393
Document code: A
Amazon .,.
1998 , 2004 ,
[3,4].,
,.
,.,
,,
.
1 3 .,
PDA,.,
.,
,.,
, PC
,.
Fig.1 [Figure labels: transparent clients; transparent network (Ethernet, CATV, 802.11, IEEE 1394, et al.); transparent servers (PC, servers, mainframe)]
, Linux Windows .
,. PC
,,.
.
,,.
,.
Google
Google [5], Google [6],
Fig.2 [Figure labels: GFS client; GFS master with file namespace (/foo/bar, chunk 2ef0); GFS chunkservers; instructions to chunkserver; chunkserver state; chunk data; legend: data messages, control messages]
MapReduce
Google MapReduce [8,11,12].
,,,.MapReduce Map()
Reduce(), Map Reduce
. MapReduce , MapReduce
..
3 map , 1( Key-Value )
MapReduce . MapReduce ,
Reduce . Reduce .
4 MapReduce , Map Reduce ,.
, key Reduce
.
map(String input_key, String input_value):
    // input_key: document name
    // input_value: document contents
    for each word w in input_value:
        EmitIntermediate(w, "1");

reduce(String output_key, Iterator intermediate_values):
    // output_key: a word
    // intermediate_values: a list of counts
    int result = 0;
    for each v in intermediate_values:
        result += ParseInt(v);
    Emit(AsString(result));
Fig.3 MapReduce
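The pseudocode of Fig.3 can be tried out on a single machine. The following is a minimal sketch in Python, not Google's implementation: the driver function run_word_count, the in-memory grouping step and the sample documents are assumptions made for illustration, and no distribution or fault tolerance is modeled.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_fn(input_key: str, input_value: str) -> Iterator[Tuple[str, int]]:
    """input_key: document name; input_value: document contents."""
    for word in input_value.split():
        yield word, 1  # emit the intermediate pair (word, 1)


def reduce_fn(output_key: str, intermediate_values: Iterable[int]) -> int:
    """output_key: a word; intermediate_values: the counts emitted for it."""
    return sum(intermediate_values)


def run_word_count(documents: Dict[str, str]) -> Dict[str, int]:
    """Hypothetical single-process driver: map, group by key, then reduce."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for name, contents in documents.items():
        for key, value in map_fn(name, contents):
            grouped[key].append(value)
    return {key: reduce_fn(key, values) for key, values in grouped.items()}


counts = run_word_count({"a.txt": "the cat saw the dog", "b.txt": "the dog"})
print(counts)
```

Only map_fn and reduce_fn correspond to user code in Fig.3; everything in run_word_count stands in for work that the MapReduce runtime would perform across a cluster.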
Fig.4 [Figure labels: input; map tasks (M); intermediate key-value pairs (k1:v, k3:v, k4:v, k5:v, ...); group by key; grouped pairs (k1:v,v,v,v; k2:v; k3:v,v; k4:v,v,v; k5:v); reduce tasks (R); output]
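The "group by key" stage of Fig.4 requires that all intermediate pairs with the same key reach the same reduce task. The MapReduce paper [8] describes a default partitioning of the form hash(key) mod R. The sketch below illustrates that idea in Python under stated assumptions: the function names partition and shuffle, the use of CRC32 as the hash, and the sample pairs are chosen here for illustration only.

```python
import zlib
from collections import defaultdict
from typing import Dict, List, Tuple


def partition(key: str, num_reducers: int) -> int:
    """Map a key to a reduce task index as hash(key) mod R.

    CRC32 is an assumed stand-in hash; it is stable across processes,
    unlike Python's built-in hash() for strings.
    """
    return zlib.crc32(key.encode("utf-8")) % num_reducers


def shuffle(pairs: List[Tuple[str, int]], num_reducers: int) -> List[Dict[str, List[int]]]:
    """Group intermediate pairs by key inside the bucket of their reduce task."""
    buckets = [defaultdict(list) for _ in range(num_reducers)]
    for key, value in pairs:
        buckets[partition(key, num_reducers)][key].append(value)
    return [dict(bucket) for bucket in buckets]


pairs = [("k1", 1), ("k3", 1), ("k4", 1), ("k4", 1), ("k5", 1), ("k1", 1)]
buckets = shuffle(pairs, num_reducers=2)
for index, bucket in enumerate(buckets):
    print(index, bucket)
```

Because the bucket index depends only on the key, every value of a given key ends up in one bucket, which is the property the reduce stage relies on.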
BigTable
Google , Google
BigTable[10].BigTable Search History,Maps,Orkut,RSS .
5 BigTable .,
.BigTable ,,.
Tablet. 6 BigTable .
Fig.5 Google BigTable [Figure labels: rows (www.cnn.com); columns (contents); timestamps t1, t2, t3, each holding an html version]
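Fig.5 indexes each cell by row, column and timestamp. A minimal in-memory sketch of such a sparse, versioned map is given below. It is an illustration only: the class name SparseTable, its put and get methods, and the integer timestamps 1, 2, 3 standing for t1, t2, t3 are assumptions, not part of the BigTable API.

```python
from typing import Dict, Optional, Tuple


class SparseTable:
    """Toy model of a (row, column, timestamp) -> value map."""

    def __init__(self) -> None:
        # Only cells that were written are stored, so the table is sparse.
        self._cells: Dict[Tuple[str, str], Dict[int, str]] = {}

    def put(self, row: str, column: str, timestamp: int, value: str) -> None:
        self._cells.setdefault((row, column), {})[timestamp] = value

    def get(self, row: str, column: str, timestamp: Optional[int] = None) -> Optional[str]:
        """Return the newest version, or the newest one not after `timestamp`."""
        versions = self._cells.get((row, column))
        if not versions:
            return None
        eligible = [t for t in versions if timestamp is None or t <= timestamp]
        return versions[max(eligible)] if eligible else None


table = SparseTable()
table.put("www.cnn.com", "contents", 1, "<html>v1")
table.put("www.cnn.com", "contents", 2, "<html>v2")
table.put("www.cnn.com", "contents", 3, "<html>v3")
print(table.get("www.cnn.com", "contents"))
print(table.get("www.cnn.com", "contents", timestamp=2))
```

Keeping several timestamped versions per cell is what allows a reader to ask for the page contents as of an earlier time.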
Fig.6 BigTable [Figure labels: Bigtable client; Bigtable client library; Bigtable cell; Bigtable master; serves data (two instances); GFS; lock service (holds metadata, handles master-election); Open()]
,BigTable ,
.BigTable ,, Google
, Chubby.Chubby ,BigTable Chubby
, Chubby ,.BigTable
,., tablet (
)., tablet
.
Google 3 .Google ,
.[5] Google , x86
.Sawzall[13] MapReduce ,
.Chubby[9].,Chubby Paxos [14]
.Chubby .
IBM
IBM , Internet ,
. IBM , IBM
,. IBM Almaden ,
Xen[15,16] PowerVM[17] ,Linux Hadoop[18] (Google File System
MapReduce ).IBM x86 .
Fig.7 IBM [Figure labels: IBM Tivoli monitoring; machine; DB2; Tivoli provisioning manager; WebSphere application server]
.,:1)
[20,21],,;
2) ,,
,;3) ,,
;4) ,,
,
.
,
. Google File System
SAN.
,.,
,.,
,,.
.,, Google File System
, SAN .
,SAN ( Google File System),SAN
,,. Google File System
, SAN .,
,.
Amazon
Amazon ,, Amazon
,Amazon
., Amazon ,
,.
Fig.8 Amazon [Figure labels: Amazon elastic computing cloud (EC2); EC2 instance 2]
,,(
Microsoft,)( Yahoo,Google ).,
,.
, Dryad .Amazon Dynamo ,
Ask.com Neptune .,
Dryad[23],.Dryad , MapReduce
. Dryad ,
.Amazon (,)
( Key,Value ), Dynamo[24]. Amazon ,
(,),.,
,,Dynamo ,
,.Dynamo P2P
,, Gossip
. Ask.com Neptune[25].
,.,
..
,,
MapReduce .Yahoo MapReduce , MapReduce
Merge , MapReduceMerge[26] .
Merge ,.Stanford MapReduce
[27], MapReduce ,
pthread . , MapReduce
,MapReduce ,.Wisconsin Cell
MapReduce [28]. Cell , 1 8 ,
. MapReduce Cell . ,Cell
MapReduce . MapReduce ,HP Sinfonia[29]
.Sinfonia ,(Compare,Read,Write).
,Compare ,,
,Read Write ,. Compare ,
,.,
,.
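The three parts of a Sinfonia operation named above (Compare, Read, Write) can be illustrated with a small sketch. It assumes a single in-process dictionary as the store and string keys as addresses. The real system [29] works on memory nodes across machines and uses a commit protocol that is not modeled here, and the names Minitransaction and execute are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Minitransaction:
    compare_items: List[Tuple[str, bytes]] = field(default_factory=list)
    read_items: List[str] = field(default_factory=list)
    write_items: List[Tuple[str, bytes]] = field(default_factory=list)


def execute(store: Dict[str, bytes], tx: Minitransaction) -> Optional[Dict[str, Optional[bytes]]]:
    """Apply reads and writes only if every compare item matches the store."""
    for key, expected in tx.compare_items:
        if store.get(key) != expected:
            return None  # a comparison failed: abort without side effects
    result = {key: store.get(key) for key in tx.read_items}
    for key, value in tx.write_items:
        store[key] = value
    return result


# Usage: a compare-and-swap on "x" expressed as one minitransaction.
store = {"x": b"1"}
swap = Minitransaction(
    compare_items=[("x", b"1")],
    read_items=["x"],
    write_items=[("x", b"2")],
)
first = execute(store, swap)   # comparison holds, so the write is applied
second = execute(store, swap)  # "x" is now b"2", so the comparison fails
print(first, second, store)
```

The usage shows why the Compare part matters: the second attempt is rejected because the state it was conditioned on no longer holds.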
.,,.
.,
,;,
,.,
,.,
3 ,,
,. 3
,,.
.,
,.Google
,, GWT(Google Web toolkit),Google App Engine Google Map
API ,Google .IBM
,.Amazon
,,. 1
.,,
,.
:,
;,. 1
,.,
, facebook ,,
.
,,
.,
,,.
,,
,.
Table 1
Column headings: Transparent computing platform from Tsinghua University; Amazon elastic computing cloud
Row headings: Compatibility to traditional software; System openness; Adoption of system virtualization technology; Target users; Programming support
Cell entries:
- Based on the transparent management technology, completely compatible with current software. Current systems and software can run directly on top of the transparent computing platform
- Developed with private technologies
- No programming interface
- Virtualization provided, can run traditional software as well as new cloud computing interface for programming the new applications
- Virtualization provided, can run traditional software
- No system virtualization technology adopted, only supports new applications
- For developers
- Local distributed application programming interface
- Network remote operation interface
References:
[1] Sims K. IBM introduces ready-to-use cloud computing collaboration services get clients started with cloud computing. 2007. http://www-03.ibm.com/press/us/en/pressrelease/22613.wss
[2] Boss G, Malladi P, Quan D, Legregni L, Hall H. Cloud computing. IBM White Paper, 2007. http://download.boulder.ibm.com/ibmdl/pub/software/dw/wes/hipods/Cloud_computing_wp_final_8Oct.pdf
[3] Zhang YX, Zhou YZ. 4VP+: A novel meta OS approach for streaming programs in ubiquitous computing. In: Proc. of IEEE the 21st Int'l Conf. on Advanced Information Networking and Applications (AINA 2007). Los Alamitos: IEEE Computer Society, 2007. 394-403.
[4] Zhang YX, Zhou YZ. Transparent Computing: A new paradigm for pervasive computing. In: Ma JH, Jin H, Yang LT, Tsai JJP, eds. Proc. of the 3rd Int'l Conf. on Ubiquitous Intelligence and Computing (UIC 2006). Berlin, Heidelberg: Springer-Verlag, 2006. 1-11.
[5] Barroso LA, Dean J, Hölzle U. Web search for a planet: The Google cluster architecture. IEEE Micro, 2003,23(2):22-28.
[6] Brin S, Page L. The anatomy of a large-scale hypertextual Web search engine. Computer Networks, 1998,30(1-7):107-117.
[7] Ghemawat S, Gobioff H, Leung ST. The Google file system. In: Proc. of the 19th ACM Symp. on Operating Systems Principles. New York: ACM Press, 2003. 29-43.
[8] Dean J, Ghemawat S. MapReduce: Simplified data processing on large clusters. In: Proc. of the 6th Symp. on Operating System Design and Implementation. Berkeley: USENIX Association, 2004. 137-150.
[9] Burrows M. The Chubby lock service for loosely-coupled distributed systems. In: Proc. of the 7th USENIX Symp. on Operating Systems Design and Implementation. Berkeley: USENIX Association, 2006. 335-350.
[10] Chang F, Dean J, Ghemawat S, Hsieh WC, Wallach DA, Burrows M, Chandra T, Fikes A, Gruber RE. Bigtable: A distributed storage system for structured data. In: Proc. of the 7th USENIX Symp. on Operating Systems Design and Implementation. Berkeley: USENIX Association, 2006. 205-218.
[11] Dean J, Ghemawat S. Distributed programming with MapReduce. In: Oram A, Wilson G, eds. Beautiful Code. Sebastopol: O'Reilly Media, Inc., 2007. 371-384.
[12] Dean J, Ghemawat S. MapReduce: Simplified data processing on large clusters. Communications of the ACM, 2008,51(1):107-113.
[13] Pike R, Dorward S, Griesemer R, Quinlan S. Interpreting the data: Parallel analysis with Sawzall. Scientific Programming Journal, 2005,13(4):277-298.
[14]
[15] Barham P, Dragovic B, Fraser K, Hand S, Harris T, Ho A, Neugebauer R, Pratt I, Warfield A. Xen and the art of virtualization. In: Proc. of the 19th ACM Symp. on Operating Systems Principles. New York: Bolton Landing, 2003. 164-177.
[16] Citrix Systems. Citrix XenServer: Efficient virtual server software. XenSource Company. http://www.xensource.com/
[17]
[18]
[19] Smith JE, Nair R. Virtual Machines: Versatile Platforms for Systems and Processes. San Francisco: Morgan Kaufmann Publishers, 2005.
[20] Clark C, Fraser K, Hansen JG, Jul E, Pratt I, Warfield A. Live migration of virtual machines. In: Proc. of the 2nd Symp. on Networked Systems Design and Implementation. Berkeley: USENIX Association, 2005. 273-286.
[21] Nelson M, Lim BH, Hutchins G. Fast transparent migration for virtual machines. In: Proc. of the USENIX 2005 Annual Technical Conf. Berkeley: USENIX Association, 2005. 391-394.
[22]
[23] Isard M, Budiu M, Yu Y, Birrell A, Fetterly D. Dryad: Distributed data-parallel programs from sequential building blocks. In: Proc. of the 2nd European Conf. on Computer Systems (EuroSys). 2007. 59-72.
[24] DeCandia G, Hastorun D, Jampani M, Kakulapati G, Lakshman A, Pilchin A, Sivasubramanian S, Vosshall P, Vogels W. Dynamo: Amazon's highly available key-value store. In: Proc. of the 21st ACM Symp. on Operating Systems Principles. New York: ACM Press, 2007. 205-220.
[25] Chu LK, Tang H, Yang T, Shen K. Optimizing data aggregation for cluster-based Internet services. In: Proc. of the ACM SIGPLAN Symp. on Principles and Practice of Parallel Programming. New York: ACM Press, 2003. 119-130.
[26] Yang HC, Dasdan A, Hsiao RL, Parker DS. Map-Reduce-Merge: Simplified relational data processing on large clusters. In: Proc. of the 2007 ACM SIGMOD Int'l Conf. on Management of Data. New York: ACM Press, 2007. 1029-1040.
[27] Ranger C, Raghuraman R, Penmetsa A, Bradski G, Kozyrakis C. Evaluating MapReduce for multi-core and multiprocessor systems. In: Proc. of the 13th Int'l Symp. on High-Performance Computer Architecture. Los Alamitos: IEEE Computer Society, 2007. 13-24.
[28] de Kruijf M, Sankaralingam K. MapReduce for the Cell B.E. architecture. Technical Report, CS-TR-2007-1625, University of Wisconsin Computer Sciences, 2007.
[29] Aguilera MK, Merchant A, Shah M, Veitch A, Karamanolis C. Sinfonia: A new paradigm for building scalable distributed systems. In: Proc. of the 21st ACM Symp. on Operating Systems Principles. New York: ACM Press, 2007. 159-174.
(1976),,,,,
(1946),,,,CCF
,.
,.