

A Fault-tolerant Network Architecture for Modular Datacenter


Feng Huang, Xicheng Lu, Dongsheng Li and Yiming Zhang
National Lab for Parallel and Distributed Processing, School of Computer,
National University of Defense Technology
allyoume@163.com

Abstract
Modular datacenters (MDCs) use shipping containers as large pluggable building blocks to construct mega-datacenters, with each container encapsulating thousands of servers. The MDC's service-free model poses a stricter demand on the fault-tolerance of the modular datacenter network (MDCN). Based on the scale-out principle, in this paper we propose a novel hierarchical intra-container network for MDC, called SCautz, and design a fault-tolerant routing algorithm, called SCRouting+. SCRouting+ uses spare switches to bypass failed devices, so SCautz is able to retain its throughput for one-to-x traffic when failures occur, and its network performance degrades far more gracefully than the container's computation and storage capacity. Results from theoretical analysis and simulations show that SCautz is well suited to the intra-container network of an MDCN.

Keywords: modular datacenter network, service-free, fault-tolerance

1. Introduction
A modular datacenter (MDC) uses shipping containers as large pluggable building blocks for a mega-datacenter, and has been considered the next-generation datacenter solution. In an MDC, typically 1200-2500 servers are interconnected into a specific network infrastructure and then encapsulated in a standard 20- or 40-foot container [1, 2, 3, 4]. Once hooked up to power, cooling infrastructure and the Internet, an MDC can provide service at any location in the world. Containerization removes the need for incremental repair and lets an MDC work in a particular service-free mode. That is, a container as a whole is never repaired during its deployment lifetime (e.g., 3-5 years) [1, 2, 5]. As long as the performance of the entire container meets an engineered minimum criterion, there is no continuous component repair. Hardware vendors therefore scale out an MDC by over-provisioning redundant devices to absorb the increasing number of failures and so maintain the container's overall performance. This poses a much stricter demand on the fault-tolerance of the modular datacenter network (MDCN). As server and switch failures happen and accumulate, the unfixed faults damage the original intra-container network structure and reduce network capacity. As the crucial component of the MDC, the MDCN should retain as much network performance as possible despite its incomplete structure. Most importantly, the performance of the MDCN must degrade more gracefully than the MDC's computation and storage do, so that it does not become the fatal weakness that drives the container's overall performance below the threshold criterion ahead of time. However, opportunity coexists with challenge. Since the number of servers in a container is fixed during the lifetime, the container's moderate scale relaxes the


restriction on the scalability of the MDCN. Intra-container networks can therefore adopt complex topologies that might be considered unsuitable for traditional DCNs. In this work, we present SCautz, a novel hierarchical intra-container network structure: a Shipping Container Kautz network. SCautz comprises a base physical Kautz topology, built by interconnecting the servers' NIC ports, and a small number of redundant COTS (commodity off-the-shelf) switches. SCautz's base topology adopts the server-centric approach: servers take charge of routing traffic and work with the switches to bypass failed servers, achieving graceful performance degradation. The basic idea of SCautz is driven by the demand that the MDC's service-free mode places on the fault-tolerance of the MDCN, and its design is inspired by the scale-out principle in datacenter construction. Results from theoretical analysis and simulations show that SCautz is well suited to the MDCN for the following reasons. First, SCautz's base topology can offer network capacity as high as BCube's [5] for one-to-x (e.g., one-to-one, one-to-all) and all-to-all traffic. Second, we propose a fault-tolerant routing algorithm called SCRouting+, which leverages the switches and the peer servers connected to the same switch to bypass failed servers. SCautz can thus maintain its throughput for one-to-x traffic, while its network performance for all-to-all traffic degrades smoothly and much more slowly than the MDC's computation and storage capacity. Third, the extra cost of redundant switches is very low: theoretical analysis shows that a typical SCautz-based container with 1280 servers only needs 160 switches. The rest of the paper is organized as follows. Section 2 discusses related work. Section 3 presents the SCautz architecture. Section 4 designs the routing algorithms. Section 5 presents simulations and evaluates fault-tolerance, and Section 6 concludes the paper.

2. Related Work
As MDCs get popular, modular datacenter networks (MDCN) have attracted increasing interest from cloud providers, hardware vendors and academia. To address the fatal drawbacks of traditional networks in supporting cloud data-intensive computing, many novel datacenter networks (DCN) have been proposed. VL2 [6] and PortLand [7] organize switches into the more sophisticated Clos and fat-tree structures respectively, in which any two servers can communicate with each other at the maximum rate of their network-interface cards (NICs). Since their routing intelligence is placed on switches, VL2 and PortLand are switch-centric DCNs, while DCell [8], BCube and CamCube [9] are server-centric DCNs, with routing intelligence placed on servers. DCell proposes a new recursive structure for high scalability; BCube leverages low-end COTS switches to implement an intra-container network based on the Hypercube topology [10]; and CamCube designs a direct-connect 3D torus topology, as adopted by the Content Addressable Network (CAN) overlay [11]. Because its servers are always equipped with multiple NICs, a server-centric DCN is more effective than a switch-centric one in supporting data-intensive applications and dealing with failures. Moreover, since DCell has a network performance bottleneck at its lower levels and CamCube mainly studies the flexibility of its routing API for cloud applications, BCube offers the best uniform network capacity and the most graceful performance degradation among them.


In a server-centric MDCN, failures of both servers and switches decrease the overall performance of a container. For example, BCube's incomplete structure makes its throughput for one-to-x traffic patterns drop evidently, and its ABT (aggregate bottleneck throughput) for all-to-all traffic degrades faster than computation and storage do. Furthermore, switch failures decrease BCube's performance much more significantly: its ABT shrinks by more than 50% in the presence of 20% switch failures [5]. SCautz avoids these problems with a novel hierarchical network structure modeled on the undirected Kautz graph. Kautz achieves a near-optimal tradeoff between node degree and diameter, and has better bisection width and bottleneck degree. However, it was considered unsuitable for mega-datacenters, because it is hard to deploy incrementally without violating the original structure. For an MDCN, the number of servers in a container is fixed and the interior network does not change during the whole lifecycle, so this restriction no longer exists. Through simulations and comparisons, we show that SCautz is well suited to the MDCN.

3. SCautz Architecture
SCautz comprises two types of components: servers with multiple NICs and COTS switches. Servers interconnect their NICs to form a physical Kautz topology as SCautz's base network structure, denoted as UK(d,k). Switches use their low-speed (1 Gbps) ports to connect a specific number of servers, and reserve the high-speed (10 Gbps) ports for inter-container networks.

3.1. Preliminaries

To define the base undirected Kautz topology of SCautz, we first introduce the definition of the directed Kautz graph. Let Σ = {0, 1, ..., d} be an alphabet of d+1 letters, and let the identifier space of Kautz be the set of strings of length k over Σ in which consecutive letters are different.


Figure 1. The Kautz Graph K(2,2) and its Undirected Structure UK(2,2)

Definition 1 (Kautz graph [13]). The vertices and edges of the Kautz graph K(d,k) are:
V(K(d,k)) = { x1 x2 ... xk | xi ∈ Σ, xi ≠ xi+1 },
E(K(d,k)) = { (x1 x2 ... xk, x2 ... xk y) | y ∈ Σ, y ≠ xk }.


The Kautz graph K(d,k) is d-regular, its diameter is k, and it has (d+1)d^(k-1) vertices and (d+1)d^k edges. SCautz's base undirected Kautz structure UK(d,k) is obtained by omitting the directions of the edges while keeping both links between vertex pairs of the form (abab...), e.g., (01, 10), (21, 12) and (02, 20). So UK(d,k) is 2d-regular, unlike the general undirected Kautz graph. Figure 1 shows K(2,2) and UK(2,2).

3.2. SCautz Structure

The complete structure of SCautz with redundant switches is denoted as SCautz(d,k,t), or SCautz for short, and is defined as follows.

Definition 2. Let SCautz(d,k,t) be the complete SCautz network with base topology UK(d,k) and a redundant switch structure. It comprises nodes (servers), switches (of two types, S_left and S_right), clusters (C_left and C_right) and links, in which the links comprise the links directly connecting servers (the edges of UK(d,k)) and the links connecting servers to switches.

The nodes of SCautz(d,k,t) are exactly the vertices of K(d,k), and t is the length of a switch's identifier. Due to the different rules of organizing servers, the switches are divided into two categories: S_left and S_right. A server whose rightmost (resp. leftmost) substring of length t is identical to a certain switch's identifier connects to the corresponding S_right (resp. S_left) switch. So t determines the number of servers in one cluster (n = d^(k-t)) and the total number of switches (2(d+1)d^(t-1)). The n servers connected to one switch form a cluster; hence all clusters fall into two categories as well: the clusters around S_right (resp. S_left) switches are denoted C_right (resp. C_left), and each server is a member of one C_right and one C_left simultaneously. For example, in SCautz(2,4,2) the switch S_right = 10 connects four servers (1010, 2010, 0210 and 1210), building the cluster C_right = {**10}; the switch S_left = 02 connects four servers (0201, 0202, 0210 and 0212), building the cluster C_left = {02**}; and server 0210 is a member of both clusters {**10} and {02**}, as shown in Figure 2. In the rest of the paper we do not distinguish a switch S from its cluster C, and write C_right = S_right (resp. C_left = S_left) for short, e.g., C_right = S_right = 10. The links in SCautz include those building UK(d,k) and those connecting switches with their servers. All the links in SCautz are undirected and the physical cables are full-duplex, so each link can be defined in either of two equivalent directions.
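To make the identifier space and the cluster rules concrete, the following Python sketch (our own illustration, not code from the paper; the helper names are hypothetical) enumerates the vertices of the small instance UK(2,4) and the two cluster types of SCautz(2,4,2):

```python
from itertools import product

def kautz_vertices(d, k):
    """All Kautz identifiers: length-k strings over {0,...,d} whose consecutive letters differ."""
    return [''.join(map(str, s)) for s in product(range(d + 1), repeat=k)
            if all(s[i] != s[i + 1] for i in range(k - 1))]

def c_right(x, t):   # identifier of the S_right switch that x attaches to
    return x[-t:]

def c_left(x, t):    # identifier of the S_left switch that x attaches to
    return x[:t]

V = kautz_vertices(2, 4)                              # base UK(2,4) of SCautz(2,4,2)
print(len(V))                                         # 24 = (d+1)*d^(k-1)
print(sorted(x for x in V if c_right(x, 2) == '10'))  # ['0210', '1010', '1210', '2010']
print(sorted(x for x in V if c_left(x, 2) == '02'))   # ['0201', '0202', '0210', '0212']
```

The two printed clusters match the example above: they are exactly the servers attached to switches S_right = 10 and S_left = 02.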


Figure 2. The Cluster Structures of Two Types in SCautz(2,4,2)

If clusters are treated as virtual nodes and the duplicate links between pairs of clusters are temporarily ignored, we easily obtain the following Theorem 1 and prove it true according to Definition 2. SCautz(2,4,2) is shown in Figure 3, including the two full higher-level logical structures LK_right and LK_left and part of the corresponding physical structures of servers. Note that the arrows on the links in Figure 3 are only used to better exhibit SCautz's logical cluster structures.
Figure 3. The Two Full Higher-level Logical Structures and Partial Physical Structures of SCautz(2,4,2)


THEOREM 1. All the C_right clusters (or all the C_left clusters) form a logical Kautz structure, denoted as LK_right (or LK_left).

In SCautz, R1(X) and L1(X) represent the right-neighbors and left-neighbors of the server X, obtained by one L-shift and one R-shift operation respectively. The right-neighbor clusters and left-neighbor clusters of C_right and C_left are defined in Definition 3. C_right(X) (or C_left(X)) denotes the cluster that server X belongs to via its S_right (or S_left) switch, while P_right(X) (or P_left(X)) denotes the peer servers in the same cluster C_right(X) (or C_left(X)) as server X.

Definition 3. For any server X, the neighbor-clusters of C_right(X) and C_left(X) are the clusters containing the right-neighbors and left-neighbors of the servers of C_right(X) and C_left(X), respectively.

Therefore, a server as a member of C_right(X) has d right-neighbor clusters and one left-neighbor cluster, while as a member of C_left(X) it has d left-neighbor clusters and one right-neighbor cluster. Take 1210 in SCautz(2,4,2) as an example: as a member of C_right(1210) = 10, its right-neighbor clusters are 01 and 02 and its left-neighbor cluster is 21; the symmetric relations hold for it as a member of C_left(1210) = 12. Combining the hybrid structure of SCautz(d,k,t) and the above definitions, we obtain the following key properties about any server X, its clusters and their neighbors.

Property 1. Each server X in a cluster C_right has d right-neighbor servers, and these d servers are evenly distributed over d different right-neighbor clusters, one in each. Moreover, a cluster has d right-neighbor clusters, and the d^(k-t) servers in the cluster connect to m = d^(k-t-1) distinct servers in each right-neighbor cluster.

Property 2. Each server X has d left-neighbor servers, and these d servers all lie in the same cluster. Moreover, a cluster has d left-neighbor clusters, and the servers of the cluster whose letters in position k-t are identical connect, together, to all the servers in one left-neighbor cluster.
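Theorem 1 and Properties 1 and 2 can be checked by brute-force enumeration on the small SCautz(2,4,2) instance; the sketch below is our own verification under the definitions above, not code from the paper:

```python
from itertools import product

d, k, t = 2, 4, 2                                   # SCautz(2,4,2)
V = [''.join(map(str, s)) for s in product(range(d + 1), repeat=k)
     if all(s[i] != s[i + 1] for i in range(k - 1))]
R = lambda x: [x[1:] + str(y) for y in range(d + 1) if str(y) != x[-1]]  # one L-shift
L = lambda x: [str(y) + x[:-1] for y in range(d + 1) if str(y) != x[0]]  # one R-shift

for x in V:
    # Property 1: the d right-neighbors land in d distinct C_right clusters.
    assert len({r[-t:] for r in R(x)}) == d
    # Property 2: the d left-neighbors all share a single C_right cluster.
    assert len({l[-t:] for l in L(x)}) == 1

# Theorem 1: contracting every C_right cluster to one node yields the Kautz graph K(d,t).
clusters = {x[-t:] for x in V}
logical = {(x[-t:], r[-t:]) for x in V for r in R(x)}
kautz_dt = {(u, u[1:] + v) for u in clusters
            for v in map(str, range(d + 1)) if v != u[-1]}
assert logical == kautz_dt
print('Theorem 1 and Properties 1-2 hold for SCautz(2,4,2)')
```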


Therefore, we obtain the following lemmas.

Lemma 1. If t = k-1, then m = 1. That is, all the servers in a cluster connect with only one server in each right-neighbor cluster.

Lemma 2. If t ≤ k-2, then m = d^(k-t-1) ≥ d. Thus all the servers in one cluster connect with m distinct servers in each right-neighbor cluster.

Figure 4. The Cluster Interconnection Structures in SCautz(2,4,3) and SCautz(2,4,2)

Therefore, if t ≤ k-2, there are d^(k-t-1) node-disjoint paths and d^(k-t) edge-disjoint paths from a cluster to each of its d right-neighbor clusters. Take SCautz(2,4,3) and SCautz(2,4,2) as examples. For SCautz(2,4,3), shown in Figure 4, there are two servers in cluster 010 and their neighbor-servers are distributed in the two neighbor-clusters 101 and 102; but according to Lemma 1, the two servers connect to only one server in each neighbor-cluster. So, if server 0101 fails, all the links between clusters 010 and 101 are broken. For SCautz(2,4,2), according to Lemma 2, there are two node-disjoint paths and four edge-disjoint paths between neighboring clusters, so it is more reliable than SCautz(2,4,3). Thus, we always let t ≤ k-2 in this paper.

Lemma 3. There are d^(k-t-1) node-disjoint paths and d^(k-t) edge-disjoint paths from a cluster to each of its d left-neighbor clusters.
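The contrast between the two configurations can be reproduced by counting the physical links between a cluster and one of its right-neighbor clusters; the sketch below is our own check of the discussion above, not code from the paper:

```python
from itertools import product

def cluster_links(d, k, t, src, dst):
    """Physical links from servers of C_right cluster `src` to those of cluster `dst`."""
    V = [''.join(map(str, s)) for s in product(range(d + 1), repeat=k)
         if all(s[i] != s[i + 1] for i in range(k - 1))]
    R = lambda x: [x[1:] + str(y) for y in range(d + 1) if str(y) != x[-1]]
    links = [(x, r) for x in V if x[-t:] == src for r in R(x) if r[-t:] == dst]
    return len(links), len({r for _, r in links})  # (edges, distinct target servers)

print(cluster_links(2, 4, 3, '010', '101'))  # (2, 1): one shared target, fragile (Lemma 1)
print(cluster_links(2, 4, 2, '10', '01'))    # (4, 2): d^(k-t)=4 links to d^(k-t-1)=2 servers (Lemma 2)
```

The second count confirms the four edge-disjoint and two node-disjoint paths claimed for SCautz(2,4,2), while the first shows why a single failure (server 0101) disconnects clusters 010 and 101 in SCautz(2,4,3).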

It is easy to see that the logical Kautz structures LK_right and LK_left are isomorphic, so the corresponding properties about C_left(X) can be derived in the same way and are not listed here. SCautz is server-centric and its routing intelligence is implemented on servers. Considering the limited number of Ethernet NIC slots on servers and of low-speed ports on COTS switches, we pick SCautz(4,5,3) as a typical structure for MDCN. SCautz(4,5,3)


supports 1280 servers using only 160 COTS switches. Each server needs to be equipped with 10 Ethernet ports, of which 8 are used to construct the Kautz topology and 2 to connect to the two types of COTS switches. Multi-port (dual-port, quad-port) Ethernet NICs have become COTS components, and a COTS switch is generally equipped with tens (e.g., 24) of 1 GigE ports and several (e.g., 4) 10 GigE ports. SCautz uses the switches' 1 GigE ports to communicate with the servers in a cluster and reserves the high-speed 10 GigE ports for the inter-container network. Thus, SCautz is a practical approach for the intra-container network of an MDCN.
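These counts follow directly from the structure parameters; as a quick sanity check of the figures quoted above (our own arithmetic sketch):

```python
d, k, t = 4, 5, 3                         # the SCautz(4,5,3) container
servers     = (d + 1) * d ** (k - 1)      # 5 * 4^4 = 1280 servers
per_cluster = d ** (k - t)                # 4^2 = 16 servers per switch (fits 24 1 GigE ports)
switches    = 2 * (d + 1) * d ** (t - 1)  # 2 * 80 = 160 switches (S_left + S_right)
nic_ports   = 2 * d + 2                   # 8 Kautz links + one port per switch type = 10
print(servers, per_cluster, switches, nic_ports)   # 1280 16 160 10
```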

4. Routing in SCautz
According to SCautz's hierarchical structure, we propose a suite of routing algorithms that effectively utilize the redundant resources. In this section, we first introduce the regular routing methods in the Kautz graph, i.e., in a fault-free UK(d,k); then we analyze their disadvantages in dealing with node faults; finally, we present a fault-tolerant routing algorithm, SCRouting+, that achieves graceful performance degradation.

4.1. Routing in the Kautz Graph

UK(d,k) is a complete undirected Kautz structure. For the directed Kautz graph, Fiol [12] proposed a shortest-path routing algorithm from source X to destination Y using L-shift operations, defined in Definition 4 below: find the largest suffix of X that coincides with a prefix of Y; this substring is denoted the R-string. Then repeatedly take as the next hop H the identifier whose suffix coinciding with a prefix of Y is one letter longer than its predecessor's, until the destination Y is reached, and the R-path P_R(X,Y) is obtained. In the same way, the L-path P_L(X,Y) can be computed using R-shift operations.

Definition 4. Let L-shift and R-shift denote the shift operations on X = x1 x2 ... xk:
L(X) = x2 x3 ... xk y (y ≠ xk), R(X) = y x1 x2 ... x(k-1) (y ≠ x1).
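The following Python sketch (our own illustration; names such as r_path are hypothetical) implements the shift operations and Fiol's suffix-matching route construction, together with the length comparison that SCRouting, described next, uses to choose a direction:

```python
def overlap(x, y):
    """|R-string|: length of the largest suffix of x that coincides with a prefix of y."""
    for m in range(len(x), 0, -1):
        if x[-m:] == y[:m]:
            return m
    return 0

def r_path(x, y):
    """Fiol's routing by repeated L-shifts: each hop's matching suffix grows by one letter."""
    m, path = overlap(x, y), [x]
    for i in range(len(y) - m):
        x = x[1:] + y[m + i]                 # L-shift toward y
        path.append(x)
    return path

def l_path(x, y):
    """The symmetric route built with R-shifts, guided by the L-string."""
    m, path = overlap(y, x), [x]             # |L-string|: prefix of x = suffix of y
    for i in range(len(y) - m):
        x = y[len(y) - m - 1 - i] + x[:-1]   # R-shift toward y
        path.append(x)
    return path

def sc_routing(x, y):
    """SCRouting (next paragraph): follow the direction with the longer matched string."""
    return r_path(x, y) if overlap(x, y) > overlap(y, x) else l_path(x, y)

print(r_path('0212', '1201'))   # ['0212', '2120', '1201']
print(sc_routing('12', '01'))   # ['12', '01'] -- one R-shift beats two L-shifts
```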

Combining Fiol's [12] and Pradhan's [13] ideas, we design a routing algorithm for UK(d,k), called SCRouting. Let |R-string| and |L-string| denote the lengths of the R-string and the L-string. SCRouting first compares |R-string| with |L-string|: if |R-string| > |L-string|, the R-path P_R(X,Y) is picked by performing L-shifts; otherwise, the L-path P_L(X,Y) is picked to route packets.

4.2. Fault-tolerant Routing in SCautz

In UK(d,k), there are either d parallel R-paths or d parallel L-paths between any pair of servers. Generally, the Kautz graph uses one R-path (or L-path) for data transmission. If the path breaks down, it is discarded and replaced by another one of the remaining d-1 R-paths (or L-paths). The reason for not finding a sub-path that bypasses the failed links or nodes is that such a sub-path may need up to k hops. For example, if node 20 fails, the path 12->20->01 is no longer valid, and a new path 12->21->10->01 from 12 to 01 is computed instead, as shown in Figure 5. In this way, though the destination is still reachable, capacity shrinks: for one-to-one traffic, the spare paths are always longer than the primitive one, so the delay of single-path routing increases, and since only d-1 parallel paths are left, the throughput of multi-path routing decreases by 1/d. For one-to-x traffic, even a single link or server failure makes


all the paths through the failed element invalid, so network capacity and reliability degrade severely. To remedy these deficiencies, we propose a fault-tolerant routing algorithm, SCRouting+, based on SCautz's hybrid structure. It can handle faults in the paths generated by SCRouting in UK(d,k). SCRouting+ uses a surviving peer server in the same cluster as the unreachable one to bypass the failed link or server: for an R-path it utilizes a peer server in the corresponding C_right cluster, while for an L-path it utilizes one in the corresponding C_left cluster.

Figure 5. Fault-tolerant Routing in Kautz
Figure 6. SCRouting+ Fault-tolerant Routing in SCautz

Let Ri(X) (resp. Li(X)) represent the i-th right-neighbors (resp. left-neighbors) of X, reached by i L-shift (resp. R-shift) operations. For example, assuming i = 2, R2(X) means X's right-neighbors' right-neighbors, i.e., the second right-neighbors. Then Lemma 4 is obtained and easily proved.

Lemma 4. For servers X and Y in the logical LK_right whose m rightmost letters are identical, Ri(X) and Ri(Y) are in the same clusters (0 < i ≤ m). If their m rightmost letters are identical and their m+1 rightmost letters are different, then Ri(X) ≠ Ri(Y), in which 0 ≤ i < m. The same holds for Li(X) in the logical LK_left.

According to Lemma 4, when a server X detects that its next hop is unreachable, SCRouting+ picks an idle peer server P as the next hop instead, chosen from the peers whose suffix (or prefix, for L-paths) of length m coincides with X's while that of length m+1 does not. Then Rm(P) = Rm(X) (or Lm(P) = Lm(X)), so after reaching P through the cluster switch, the packet rejoins the original route. SCRouting+ thus bypasses the failed hop and reaches the hop after it. Moreover, the new fault-tolerant path is only one hop longer than the original one, and has no impact on the other parallel paths. For example, if server 2120 is down, the sub-path 0212->2120->1201 of a certain path becomes invalid. SCRouting+ constructs the sub-path 0212->1012->0120->1201 to bypass 2120, as shown in Figure 6, instead of computing a much longer replacement path with the regular method.
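The bypass rule can be sketched as a brute-force search over X's cluster peers (our own illustration of the SCRouting+ idea; names are hypothetical). On the example above it reproduces exactly the detour of Figure 6:

```python
from itertools import product

def kautz(d, k):
    return [''.join(map(str, s)) for s in product(range(d + 1), repeat=k)
            if all(s[i] != s[i + 1] for i in range(k - 1))]

def rnb(x, d):   # right-neighbors of x via one L-shift
    return [x[1:] + str(y) for y in range(d + 1) if str(y) != x[-1]]

def scrouting_plus_bypass(x, failed, nxt2, d=2, k=4, t=2):
    """When the next hop `failed` of the R-sub-path x -> failed -> nxt2 is down,
    hop through x's C_right switch to a peer p whose onward neighbor q is a live
    peer of `failed` and still reaches nxt2: one hop longer than the original."""
    for p in (s for s in kautz(d, k) if s[-t:] == x[-t:] and s != x):
        for q in rnb(p, d):
            if q != failed and q[-t:] == failed[-t:] and nxt2 in rnb(q, d):
                return [x, p, q, nxt2]
    return None

print(scrouting_plus_bypass('0212', '2120', '1201'))
# ['0212', '1012', '0120', '1201'] -- the detour shown in Figure 6
```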

5. Simulations
In this section, we conduct simulations to evaluate the fault-tolerance behavior of SCautz and SCRouting+. First, we analyze the performance of SCautz's base topology in handling various traffic patterns and compare the results with several representative BCube structures. Then we measure the performance decline of SCautz and BCube as failures happen and increase. In these simulations, we use SCautz(4,5,3) as a typical intra-container network for MDCN, whose base Kautz topology is UK(4,5) with t = 3. There are 1280 servers equipped with 5 dual-port NICs, and COTS switches with 24 1 GigE ports and 4 10 GigE ports. For comparison, we pick two full BCube structures (BCube(32,1) and BCube(4,4)) and one partial BCube(8,3) [5], in which the partial BCube(8,3) uses 2 complete BCube(8,2) structures with full layer-4 switches. All three BCubes thus contain 1024 servers, but with 64, 1280 and 704 switches in BCube(32,1), BCube(4,4) and BCube(8,3) respectively.

5.1. Performance of UK(4,5)

We assume the bandwidth of each server NIC port is 1 Gbps and intermediate servers relay traffic without delay. We summarize the key results in Table 1.
Table 1. Key Simulation Results of UK(4,5) and BCube

                    UK(4,5)    BCube(32,1)    BCube(4,4)    BCube(8,3)
 ave_path            4.38       1.94           3.75          3.51
 1-to-1 (Gbps)       4          2              5             4
 1-to-all (Gbps)     4          2              5             4
 ABT (Gbps)          1168.95    1057.03        1365.33       1170.29


From the simulations and comparisons, we see that UK(4,5) offers throughput for one-to-x traffic and for all-to-all traffic as high as BCube(8,3) does. Its ABT and per-server throughput are slightly lower than BCube(4,4)'s because of its longer average path length, which directly bounds the ABT. When computing path lengths for BCube, we treat the switches as dumb crossbars, as [5] does but unlike [14, 15], so the two hops traversing a switch are counted as one. In addition, BCube(4,4) needs nearly an order of magnitude more switches. The results illustrate that the base topology UK(4,5) alone effectively supports the various traffic patterns as well as BCube does when a container is fault-free.
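One consistent reading of Table 1 is that ABT equals the aggregate link bandwidth divided by the average path length; the check below is our own assumption, not a formula stated in the paper, and it reproduces the reported numbers up to rounding:

```python
# ABT read as aggregate link bandwidth / average path length (1 Gbps links,
# a switch traversal counted as one hop) -- our assumption, matching Table 1.
cases = {
    'UK(4,5)':     (1280 * 8 // 2, 4.38),  # 1280 servers x 8 Kautz ports, 2 ports per link
    'BCube(32,1)': (1024 * 2,      1.94),  # server-switch links = servers x NIC ports
    'BCube(4,4)':  (1024 * 5,      3.75),
    'BCube(8,3)':  (1024 * 4,      3.51),
}
for name, (links, avg_path) in cases.items():
    print(f'{name}: ABT ~ {links / avg_path:.2f} Gbps')
# 1168.95, 1055.67, 1365.33, 1166.86 -- Table 1's values up to rounding of ave_path
```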
5.2. Fault-tolerance Evaluation

Since either a link or a server failure makes one hop in a path unreachable, we assume all faults are caused by servers or switches, and server failures also reduce computation and storage capacity in our simulations. As shown in Figure 7, when one server failure happens, the per-server throughput of BCube(32,1), BCube(4,4) and BCube(8,3) drops by 50%, 20% and 25% respectively for one-to-x traffic. Using the switches, the SCRouting+ algorithm bypasses the failed server with one extra hop and keeps the original path valid, so SCautz retains the throughput of a fault-free container. In Figure 8, when 10% and 20% of servers fail, the overall computation capacity drops by 10% and 20% correspondingly, while BCube's ABT drops by 15.3% and 25.23%, shown by the polyline labeled BCube(8,3). In contrast, SCautz only loses 6.91% and 13.74% of its throughput respectively, degrading much more slowly than computation and storage. In addition, BCube's ABT shrinks by more than 50% when 20% of switches fail, while switch failures have no such impact on SCautz.

Figure 7. Throughput Degradation for One-to-one Traffic


Figure 8. ABT Degradation for all-to-all Traffic


5.3. Fault-tolerance Analysis

The above simulations show that SCautz is able to leverage its redundant switches to maintain per-server throughput for one-to-x traffic and to cut the ABT decrease to roughly half of BCube's, which evidently improves reliability. Switch faults have little impact on SCautz but make BCube's ABT drop sharply. This is because the switches in SCautz are mainly used to tolerate the increasing faults, while the switches in BCube sit between every pair of servers and participate in forwarding every network packet. This suggests an effective scheme for an SCautz-based container facing frequent and increasing failures: let the fault-free container run on SCautz's base topology, and then leverage the switches to tolerate faults as they accumulate. Thus, SCautz retains the merits of its original base structure and achieves graceful performance degradation.

6. Conclusion
The MDC's distinct service-free model poses a stricter demand on the fault-tolerance of the datacenter network. Following the scale-out design principle, we propose a novel hierarchical intra-container network structure for MDC, named SCautz. SCautz comprises a base physical Kautz topology and hundreds of redundant COTS switches. Its base topology, UK(d,k), effectively accelerates one-to-x traffic and offers high network throughput for all-to-all traffic, performing as well as BCube. Besides, each switch of the two types, together with a specific number of servers, forms a cluster, and the clusters build two higher-level logical Kautz structures. Thus, SCautz retains its throughput for one-to-x traffic in the presence of failures and achieves more graceful performance degradation, reducing the ABT decrease to about half of BCube's. In this paper, we have shown through theoretical analysis and simulation that SCautz meets the strict requirements of the MDCN. In future work, we will study how to design inter-container networks that interconnect SCautz-based containers into mega-datacenters. Moreover, we need to design novel load-balanced


routing algorithms to process the bursty network flows of data-intensive applications [16, 17], so that map-reduce-like applications do not miss their strict deadlines for fetching intermediate results from worker nodes [18].

Acknowledgements
This work is supported in part by the National Basic Research Program of China (973) under Grant No. 2011CB302600, the National Natural Science Foundation of China (NSFC) under Grant No. 60903205, the Foundation for the Author of National Excellent Doctoral Dissertations of PR China (FANEDD) under Grant No. 200953, and the Research Fund for the Doctoral Program of Higher Education (RFDP) under Grant No. 20094307110008.

References
[1] J. R. Hamilton, "An Architecture for Modular Data Centers", Proceedings of the Biennial Conference on Innovative Data Systems Research (CIDR), (2007) January 7-10, Asilomar, California, USA.
[2] K. V. Vishwanath, A. Greenberg and D. A. Reed, "Modular data centers: how to design them?", Proceedings of LSAP, (2009) June 10, Munich, Germany.
[3] A. B. Letaifa, A. Haji, M. Jebalia and S. Tabbane, "State of the Art and Research Challenges of new services architecture technologies: Virtualization, SOA and Cloud Computing", International Journal of Grid and Distributed Computing (IJGDC), 3, 68 (2010).
[4] P. Chakraborty, D. Bhattacharyya, N. Y. Sattarova and S. Bedaj, "Green computing: Practice of Efficient and Eco-Friendly Computing Resources", International Journal of Grid and Distributed Computing (IJGDC), 2, 33 (2009).
[5] C. Guo, G. Lu, et al., "BCube: a high performance, server-centric network architecture for modular datacenters", Proceedings of the ACM SIGCOMM conference on Data communication (SIGCOMM '09), (2009) August 17-21, Barcelona, Spain.
[6] A. Greenberg and J. R. Hamilton, "VL2: a scalable and flexible data center network", Proceedings of the ACM SIGCOMM conference on Data communication (SIGCOMM '09), (2009) August 17-21, Barcelona, Spain.
[7] R. N. Mysore, A. Pamboris, et al., "PortLand: a scalable fault-tolerant layer 2 data center network fabric", Proceedings of the ACM SIGCOMM conference on Data communication (SIGCOMM '09), (2009) August 17-21, Barcelona, Spain.
[8] C. Guo, H. Wu, et al., "DCell: a scalable and fault-tolerant network structure for data centers", Proceedings of the ACM SIGCOMM conference on Data communication (SIGCOMM '08), (2008) August 17-22, Seattle, Washington, USA.
[9] H. Abu-Libdeh, P. Costa, et al., "Symbiotic routing in future data centers", Proceedings of the ACM SIGCOMM conference (SIGCOMM '10), (2010) August 30 - September 3, New Delhi, India.
[10] H. Sim, J.-C. Oh and H.-O. Lee, "Multiple Reduced Hypercube (MRH): A New Interconnection Network Reducing Both Diameter and Edge of Hypercube", International Journal of Grid and Distributed Computing (IJGDC), 3, 19 (2010).
[11] M. O. Balitanas and T. Kim, "Using Incentives for Heterogeneous peer-to-peer Network", International Journal of Advanced Science and Technology (IJAST), 14, 23 (2010).
[12] M. A. Fiol and A. S. Llado, "The partial line digraph technique in the design of large interconnection networks", IEEE Trans. Computers, 41, 848 (1992).
[13] D. K. Pradhan and S. M. Reddy, "A fault-tolerant communication architecture for distributed systems", IEEE Trans. Computers, 31, 863 (1982).
[14] G. Praveen and P. Vijayrajan, "Analysis of Performance in the Virtual Machines Environment", International Journal of Advanced Science and Technology (IJAST), 32, 53 (2011).
[15] H. Wu, G. Lu, D. Li, et al., "MDCube: a high performance network structure for modular data center interconnection", Proceedings of CoNEXT '09, (2009) December 1-4, Rome, Italy.
[16] M. Al-Fares, S. Radhakrishnan, B. Raghavan, N. Huang and A. Vahdat, "Hedera: Dynamic Flow Scheduling for Data Center Networks", Proceedings of the 7th USENIX conference on Networked Systems Design and Implementation (NSDI '10), (2010).


[17] C. Raiciu, S. Barre, A. Greenhalgh, D. Wischik and M. Handley, "Improving datacenter performance and robustness with multipath TCP", Proceedings of the ACM SIGCOMM conference (SIGCOMM '11), (2011) August 15-19, Toronto, Ontario, Canada.
[18] C. Wilson and H. Ballani, "Better never than late: Meeting deadlines in datacenter networks", Proceedings of the ACM SIGCOMM conference (SIGCOMM '11), (2011) August 15-19, Toronto, Ontario, Canada.

Authors
Feng Huang

He received the B.Sc. degree (with honors) in computer science from the College of Computer, National University of Defense Technology (NUDT), Changsha, China, in 2001. He is now a Ph.D. student at the National Lab for Parallel and Distributed Processing, NUDT. His research interests include cloud computing, datacenter networks, grid computing, virtual machine technology and data-intensive applications.

