
Computer Networks 49 (2005) 84–102

www.elsevier.com/locate/comnet

Adaptive server selection for large scale interactive online games

Kang-Won Lee a,*, Bong-Jun Ko b, Seraphin Calo a

a IBM Thomas J. Watson Research Center, 19 Skyline Dr., Hawthorne, NY 10532, USA
b Columbia University, New York, NY 10027, USA

Available online 25 May 2005

Abstract

Large scale interactive online games aim to support a very large number of game players simultaneously. To support
hundreds of thousands of concurrent players, game providers have so far focused on developing highly scalable game
server architectures and extensible network infrastructures. Recently, distributed online games are beginning to
incorporate more interactive features and action sequences; thus, it becomes increasingly important to provision server
resources in an efficient manner to support real-time interaction between the users.
In this paper, we present a novel distributed algorithm to select game servers for a group of clients participating in a
large scale interactive online game session. The goal of server selection is to minimize the server resource usage while
satisfying a real-time delay constraint. We develop a synchronization delay model for interactive games and formulate
the server selection problem, and prove that the considered problem is NP-hard. The proposed algorithm, called
zoom-in–zoom-out, is adaptive to session dynamics (e.g., clients join) and lets the clients select appropriate servers in a
distributed manner such that the server resource is efficiently utilized. Using simulation, we study the performance
of the proposed algorithm and show that it is simple, yet effective in achieving its design goal. In particular, we show
that the performance of our algorithm is comparable to, or sometimes even better than, that of centralized greedy
algorithms, which require global information and extensive computations.
© 2005 Published by Elsevier B.V.

Keywords: Distributed systems; MMOG; On-line game infrastructure; Resource allocation; Distributed algorithm; Synchronization delay model

* Corresponding author. Tel.: +1 914 784 7228; fax: +1 914 784 6205.
E-mail address: kangwon@us.ibm.com (K.-W. Lee).

1389-1286/$ - see front matter © 2005 Published by Elsevier B.V.
doi:10.1016/j.comnet.2005.04.006

1. Introduction

Large scale interactive online games, such as Massively Multi-player Online Games (MMOG), aim to support a very large number of game players simultaneously. In practice, MMOG providers

are often required to support hundreds of thousands of geographically distributed users at the same time. For example, it has been reported that about 165,000 concurrent users play Lineage [1] in Taiwan [2]. To support such a large number of players, MMOG providers have so far focused on developing highly scalable game server architectures and supporting network infrastructures that span wide geographical regions while satisfying loose real-time requirements [3,4]. Recently, however, large scale online games are beginning to incorporate more interactive features and action sequences (e.g., first person player-to-player combat) to provide a realistic game experience [5]. Thus it becomes increasingly important to provision enough server resources to support real-time interaction between users.

There are several challenges in augmenting MMOGs with such interactive features. First, unlike in online first-person-shooting (FPS)-type games, where a small number of users that are geographically close to each other get assigned to the same game session, MMOGs must maintain a persistent virtual world view for a large number of game players that are distributed over the network. Second, the maintenance of a long-lived persistent world mandates a server-based game architecture, where clients interact with a central server that keeps track of the game states. However, it is well known that the conventional server–client architecture does not scale well as the number of clients increases [6]. To overcome this limitation, a mirrored server architecture has been proposed [7–9], where a set of distributed game servers are allocated and orchestrated to support a large number of distributed clients. In this architecture, the game servers are typically interconnected via well provisioned network links, and each game client is directed to connect to a nearby game server. The game servers simulate the game state according to the actions of the players and maintain a consistent game state among themselves so that all the clients can have a consistent view of the game world.

In parallel with this architectural development, proposals have been made to dynamically provision game servers on the fly exploiting emerging Grid technologies [10,11,3]. According to these on-demand game architectures, game servers can be dynamically provisioned to accommodate the capacity requirements as the user demand changes. In addition to efficient resource utilization, such an on-demand game architecture removes the burden of server and network management from the game service provider. Also, by multiplexing and sharing a large scale infrastructure across multiple games, the service provider can try out new titles without having to make a large initial investment in infrastructure [10]. Thus it now becomes an important issue to dynamically provision and utilize shared server resources in an efficient manner.

In this paper, we present a novel distributed algorithm that selects game servers for a group of clients participating in large scale interactive online games. The goal of the server selection algorithm is to select the minimum number of servers from all available servers, while satisfying the real-time delay constraint of the game. To solve this problem, we first develop a synchronization delay model for interactive distributed games, formulate the server selection problem, and prove that finding an optimal solution of this problem is NP-hard. We then propose a novel heuristic algorithm for the considered server selection problem that can be easily implemented in distributed environments. The proposed algorithm, called zoom-in–zoom-out, is adaptive to session dynamics (e.g., clients join) and lets the clients select appropriate servers in a distributed manner such that the server resource is efficiently utilized. Through simulation-based evaluation, we study the performance of the proposed algorithm with various zoom-in–zoom-out techniques and show that it is simple, yet effective in achieving its design goal. In particular, we show that the performance of our algorithm is comparable to, or sometimes even better than, that of centralized greedy algorithms, which require global information and excessive computation.

The remainder of this paper is organized as follows. Section 2 presents the synchronization delay model and the problem formulation. Section 3 presents the proposed server selection algorithm. Section 4 evaluates the performance of the proposed algorithm under varying parameters and in comparison with more expensive greedy algorithms. Section 5

presents an overview of related work, and finally Section 6 concludes the paper.

2. Problem formulation

2.1. System model

We assume a mirrored-server architecture, where multiple game servers are interconnected via well-provisioned links. In our model, a server refers to an entity that, together with the other servers, maintains a persistent virtual world and calculates the game state based on the events generated by the players. A client refers to an entity that renders and presents the game states to the player. We call the network interconnection between mirrored servers the server network. The server network is assumed to be private and dedicated to the game service provider.

We further assume that the virtual game world is divided into multiple regions, wherein relatively few game players (compared to the number of users in the entire game world) directly interact with each other. In general, a region is defined by the geography or architectural structure in the virtual world, and the game players in the same region may engage in combat or first person shooting actions. We call the persistent state of a region a session. A single session may span multiple game servers to accommodate geographically distant clients. Each client is assigned to one of the servers, called the contact server of the client, which is responsible for forwarding the client's action events to all the other servers participating in the same session. Upon receiving the clients' actions, each game server independently calculates a new game state, and sends the updated state to the directly connected clients. We note that this client and server relation effectively defines a mapping from the clients to the servers in which each client has exactly one contact server. We call this mapping the server allocation or just allocation. When a player moves from one region to another region, the client may need to hand off from the current contact server to another server, which hosts the session for the new region. Discovering and subscribing to a new server can be facilitated by techniques presented in [12–14].

For accurate game simulation and presentation, game events generated by the players must be time-stamped according to a global clock or must be serialized in a total order. In practice, however, having a global clock or determining a total order of game players' actions is infeasible. Instead, most practical game systems adopt a synchronization mechanism, which divides time into discrete slots and lets each server wait for events generated by remote clients to arrive before executing game simulation [15,16,7,17]. Depending on the scheme, inconsistencies due to late events are resolved by rolling back the game state, and lost events are dead-reckoned based on the action history. In this paper, we do not assume a particular synchronization mechanism. Instead we just assume that the synchronization delay must be greater than the latency for the events generated by clients to reach the farthest server, and for the results of game simulation to reach all the local clients. This condition must be met because we assume that each server simulates the game state independently, and for that it has to receive all the events from the farthest client before performing the simulation. Based on this condition, we define the synchronization delay as follows: the synchronization delay between a client and a server is the time difference between the instant that the client sends its player's actions and the instant that the client renders a new game state in response to the actions sent to the server. This synchronization delay depends on the network latency from the client to the server (upstream latency), processing time at the server, and the network latency from the server to the client (downstream latency).

To synchronize game play and interaction amongst all players participating in the same session, we must take into account synchronization delays between all clients in the session and the corresponding servers. More specifically, a game server must calculate a new game state after receiving the action events from the farthest client. Otherwise, the action from the farthest client will not be synchronized with others. Similarly, at the client side, a new game state should not be presented to the players until the same game state is delivered to the farthest client from the server.

Otherwise, the game simulation becomes unfair to the farthest clients [18].

2.2. Delay model and problem statement

Let $C_g$ denote the set of game clients that participate in a game session $g$ and let $S$ denote the set of available game servers. Servers in $S$ form an undirected connected graph, called the server network. Let $d_s(s_j, s_k)$ denote the shortest distance (or latency) between servers $s_j$ and $s_k$, and let $d_c(c_i, s_j)$ denote the distance between a client $c_i$ and a server $s_j$. Suppose client $c_i$ is mapped to server $s_j$ in some allocation $A_g$. Then we say server $s_j$ serves client $c_i$ under $A_g$. We define $C_{g,j} \subseteq C_g$ as the set of clients that are directly connected to the server $s_j$, i.e., $C_{g,j} = \{c_i \in C_g \mid s_j \text{ serves } c_i\}$. We also define the session server set $S(A_g)$ under the allocation $A_g$ as the set of servers that serve at least one client in $C_g$, i.e., $S(A_g) = \{s_j \in S \mid C_{g,j} \neq \emptyset\}$.

Let $D_u(c_i, s_k)$ be the upstream distance between a client $c_i$ and a server $s_k$. We note that $D_u(c_i, s_k)$ is defined not only for a client and a server that are directly connected, but also for a client and a server connected indirectly via some other server forwarding the client $c_i$'s actions to server $s_k$ through the shortest path in the server graph, i.e., $D_u(c_i, s_k) = d_c(c_i, s_j) + d_s(s_j, s_k)$, where $s_j$ is the contact server for $c_i$. On the other hand, the downstream distance from $s_j$ to a client $c_i \in C_{g,j}$ is just $d_c(s_j, c_i)$, as $c_i$ receives game states from its contact server $s_j$.

Fig. 1 presents an example. There are three game sessions: (i) session $g_1$ consisting of clients $c_1$, $c_2$, $c_3$, $c_7$, and $c_8$, represented in gray; (ii) session $g_2$ consisting of clients $c_4$ and $c_5$, represented by double lines; and (iii) session $g_3$ consisting of clients $c_6$ and $c_9$, represented in white. The server allocations for each session are $S(A_{g_1}) = \{s_1, s_2, s_5\}$, $S(A_{g_2}) = \{s_2, s_4\}$, and $S(A_{g_3}) = \{s_3, s_4\}$. To illustrate the latency calculation, take the events generated by client $c_5$ in session $g_2$. In this case, the upstream latency from $c_5$ to $s_2$ is $D_u(c_5, s_2) = d_c(c_5, s_4) + d_s(s_4, s_2)$. The downstream latency from $s_4$ to $c_5$ is $d_c(s_4, c_5)$.

Fig. 1. Example of game server architecture with three sessions.

We now consider the overall synchronization delay for a session. As previously discussed, for a server to simulate the game state in a fair manner it must wait until the action data from all the clients in the session arrive. In other words, a server $s_j$ must wait for $\max_{c_i \in C_g} D_u(c_i, s_j)$ before game simulation. Then it processes the action data and sends out updates to all the clients it serves. This game state update takes up to $\max_{c_i \in C_{g,j}} d_c(s_j, c_i)$ for the farthest client. Since this condition must hold for all servers, the overall session synchronization delay $D_s(A)$ under allocation $A$ can be succinctly written as follows:

$$D_s(A) = \max_{s_j \in S(A)} \left( \max_{c_i \in C_g} D_u(c_i, s_j) + \max_{c_i \in C_{g,j}} d_c(c_i, s_j) \right).$$

Based on this session synchronization model, we now formulate our server selection problem. Our goal in this paper is to find a server allocation that minimizes the number of servers allocated to a session, while satisfying a given synchronization delay requirement.¹ Restated formally, the problem is: given a network topology consisting of a set of servers $S$, a set of clients $C$, a set of links between the servers and the clients that are annotated with latency between the nodes, and a real-time delay requirement $D$ for a game session, find a server allocation $A_{\min}$ that minimizes $|S(A)|$ subject to $D_s(A) \le D$ and $|S(A)| \ge 1$. This problem is NP-hard, as an efficient solution to this problem can effectively solve other NP-hard problems.

¹ In this paper, we only consider the server selection problem for a single game session. The problem of optimally allocating servers for multiple dynamic game sessions is a very challenging problem of its own and we leave it as a future research topic.

Theorem 1. The minimum game server allocation problem is NP-hard.

Proof. Consider, in our model, the case when the latency between every pair of servers is zero. Further assume that the latency between any client and the closest server to it is $D/2$, where $D$ is the sync-delay bound in our model. Then the optimization goal in this specific case would be to minimize the number of servers subject to the condition that every client is within distance $D/2$ from its contact server, to which it is allocated. This is exactly a set-covering problem, in which a set is defined by the set of clients that are within the distance $D/2$ from each server. Since our problem generalizes the set-covering problem, which is a well-known NP-hard problem, our problem is also NP-hard. □

3. Server selection algorithm

3.1. Impact of server allocation

In this section, we first analyze the impact of server allocation using an example. Fig. 2 illustrates a simple example with three servers, $s_1$, $s_2$, and $s_3$, and two clients, $c_1$ and $c_2$. Fig. 2(a) shows a network configuration, where the solid lines represent the latency between the servers in the server network, and the dashed lines represent the latency between each server–client pair.

Fig. 2. Examples of game server allocation. (a) The topology, (b) allocation A1, (c) allocation A2 and (d) allocation A3.

Fig. 2(b)–(d) shows three different server allocations, $A_1$, $A_2$, and $A_3$, where allocated session servers are marked in gray. According to the session delay model, the overall delay for allocation $A_1$ is 10, with both maximum upstream and downstream delays being 5 (between $c_2$ and $s_1$). Allocating $s_2$ instead of $s_1$ as the session server (allocation $A_2$) reduces the synchronization delay to 6 (upstream/downstream 3 each). Note that, though the end-to-end distances between the clients are the same in $A_1$ and $A_2$, the synchronization delay is reduced by placing the server at the center of the network. Finally, if we allocate two servers at the edges of the server network as in allocation $A_3$, the synchronization delay decreases to 4, with the upstream latency being 3 (e.g., along the path $c_1$–$s_1$–$s_2$–$s_3$) and the downstream latency reduced to 1. In this case, the reduction comes from placing the servers near the clients at the expense of using two servers.

From this example, we can draw the following intuition to design our server selection algorithm:

• If we had to choose only one server to minimize the overall synchronization delay, then it would be optimal to select a server that minimizes the maximum distance (not the average) to all clients. In graph theory, such a node is called the center of a network.² In this paper, we call it a core server.
• If it is faster to forward packets via the contact server over the server network than to send packets to a remote server directly over the public Internet, allocating servers near the “edge” of the server network (and close to the clients) reduces the overall synchronization delay. However, this comes at the expense of increasing the number of servers in the session.

² Formally, the eccentricity of the vertex v in a graph G is the maximum distance from v to any other vertex. The center of a graph G is the set of vertices of eccentricity equal to the radius, where the radius of G is the minimum eccentricity among the vertices of G.
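To make the session delay model and the notion of a core server concrete, the following Python sketch evaluates the session synchronization delay for the three allocations discussed for Fig. 2 and picks the server with the smallest maximum distance to the clients. The link latencies of Fig. 2(a) are not given in the text, so the values below are hypothetical, chosen only to be consistent with the delays quoted above (10, 6, and 4). The function names `session_sync_delay` and `core_server` and the dictionary-based latency tables are likewise illustrative assumptions, not part of the paper.

```python
def session_sync_delay(alloc, d_c, d_s):
    """Session synchronization delay for an allocation (client -> contact server).

    d_c[(client, server)] is a client-server latency and
    d_s[(server, server)] is the shortest latency inside the server network.
    """
    worst = 0
    for s in set(alloc.values()):
        # Upstream: events of every client reach s through that client's contact server.
        up = max(d_c[(c, k)] + d_s[(k, s)] for c, k in alloc.items())
        # Downstream: s sends the new state only to the clients it serves directly.
        down = max(d_c[(c, k)] for c, k in alloc.items() if k == s)
        worst = max(worst, up + down)
    return worst


def core_server(clients, servers, d_c):
    """Server that minimizes the maximum (not the average) distance to all clients."""
    return min(servers, key=lambda s: max(d_c[(c, s)] for c in clients))


# Hypothetical latencies: servers s1 - s2 - s3 on a line with unit-latency links.
pos = {"s1": 0, "s2": 1, "s3": 2}
d_s = {(a, b): abs(pos[a] - pos[b]) for a in pos for b in pos}
d_c = {("c1", "s1"): 1, ("c1", "s2"): 3, ("c1", "s3"): 5,
       ("c2", "s1"): 5, ("c2", "s2"): 3, ("c2", "s3"): 1}

A1 = {"c1": "s1", "c2": "s1"}   # single server at one edge
A2 = {"c1": "s2", "c2": "s2"}   # single server at the center
A3 = {"c1": "s1", "c2": "s3"}   # two edge servers

for name, alloc in (("A1", A1), ("A2", A2), ("A3", A3)):
    print(name, session_sync_delay(alloc, d_c, d_s))
print("core:", core_server(["c1", "c2"], ["s1", "s2", "s3"], d_c))
```

Under these assumed latencies the sketch reproduces the delays 10, 6, and 4 for the three allocations and selects `s2` as the core server. Server processing time is left out, as in the displayed delay formula.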

Based on these observations, we design a distributed server selection algorithm as follows.

3.2. Server selection algorithm

We design our server selection algorithm based on the observation presented in the previous section: the synchronization delay of a session is small when clients are connected to the “edge” servers, and it increases as the clients are connected to servers near the core server. Since our goal is to reduce the number of session servers, at a high level, our algorithm attempts to reduce the “diameter” of the network of session servers while keeping the synchronization delay below the bound, $D$. To this end, each client visits a sequence of servers that are successively closer to the core server. This procedure has the effect of increasing the chance for more clients to share a common server, thereby decreasing the total number of session servers, at the expense of increased synchronization delay.

Given a set of clients $C$, a set of servers $S$, the link configuration, and the delay requirement $D$, our algorithm performs the following procedure.

Step 1: Initially allocate each client $c_i \in C$ to the closest server in $S$.³ Denote this initial contact server of $c_i$ by $s_0(c_i)$. This produces an initial allocation $A$. We assume that the session has been provisioned to satisfy the delay requirement $D$, i.e., $D_s(A) \le D$. Otherwise, the violation is marked, a high level session regrouping module is called, and the algorithm exits.

Step 2: Find the core server, $s^* \in S$, of the session that minimizes the maximum distance to the clients.⁴ Finding the core of a network in a distributed system has many interesting applications (e.g., core-based multicast tree construction), and thus has been studied in various contexts. In this paper we employ a tournament-based method studied in [19].

Step 3 (Zoom-In): At each client $c_i$, do the following. Migrate $c_i$ from the current contact server, $s_k(c_i)$, to another server, $s_{k+1}(c_i)$, that is further from $c_i$ and closer to $s^*$ than $s_k(c_i)$. Using this as a new contact server $s_{k+1}(c_i)$, calculate the session synchronization delay. If the delay is within the bound $D$, then mark the server $s_{k+1}(c_i)$ to be the “new contact server”, and notify all clients in the session of the successful migration. When notified of a successful migration of some other client, mark the current server to be the “new contact server”. Repeat this step for all clients until all clients arrive at the core server $s^*$, at which point each client migrates to the “new contact server”.

Step 4 (Zoom-Out): At each client $c_i$, do the following. Let $S'$ be the current set of contact servers for all clients. Then among the servers in $S'$ that are further from the core server $s^*$ than the current contact server, find the closest server to $c_i$. Then probe the session synchronization delay to see whether it would still be within the bound $D$ if the client migrated to this new server. If yes, $c_i$ migrates to the new server. Otherwise, take no action. Repeat this step for all clients until no client can migrate to a server in $S'$ that is further from $s^*$ than the current server $s(c_i)$.

To ensure correctness, we assume the existence of a mechanism to enforce that clients do not perform Steps 3 and 4 concurrently. In practice, such a serialization can be obtained by using a token-passing scheme amongst clients.
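Steps 1 to 4 can be sketched as a sequential Python simulation. This is only one possible reading of the procedure, under several assumptions that are not in the paper: clients are processed one at a time in list order (standing in for the token-passing serialization), every client sees all servers (a full search), a client with no qualifying next server jumps directly to the core server, and the set of current contact servers in Step 4 is recomputed for each client. The names `sync_delay` and `zizo` and the example latencies are hypothetical; the latencies are chosen to be consistent with the delays quoted for Fig. 2.

```python
def sync_delay(alloc, d_c, d_s):
    """Session synchronization delay for alloc: client -> contact server."""
    worst = 0
    for s in set(alloc.values()):
        up = max(d_c[c, k] + d_s[k, s] for c, k in alloc.items())
        down = max(d_c[c, k] for c, k in alloc.items() if k == s)
        worst = max(worst, up + down)
    return worst


def zizo(clients, servers, d_c, d_s, bound):
    # Step 1: each client starts at its closest server.
    alloc = {c: min(servers, key=lambda s: d_c[c, s]) for c in clients}
    if sync_delay(alloc, d_c, d_s) > bound:
        return None  # the paper defers this case to a session regrouping module
    # Step 2: the core server minimizes the maximum distance to the clients.
    core = min(servers, key=lambda s: max(d_c[c, s] for c in clients))

    # Step 3 (zoom-in): walk every client to the core, remembering the last
    # joint placement that met the bound (the "new contact server" marks).
    probe, best = dict(alloc), dict(alloc)
    moved = True
    while moved:
        moved = False
        for c in clients:
            cur = probe[c]
            if cur == core:
                continue
            cands = [s for s in servers
                     if d_s[s, core] < d_s[cur, core] and d_c[c, s] > d_c[c, cur]]
            # Assumption: jump straight to the core when no such server exists.
            probe[c] = min(cands, key=lambda s: d_c[c, s]) if cands else core
            moved = True
            if sync_delay(probe, d_c, d_s) <= bound:
                best = dict(probe)
    alloc = best

    # Step 4 (zoom-out): move outward, but only onto servers already in use.
    changed = True
    while changed:
        changed = False
        for c in clients:
            in_use = set(alloc.values())
            cands = [s for s in servers
                     if s in in_use and d_s[s, core] > d_s[alloc[c], core]]
            if not cands:
                continue
            trial = {**alloc, c: min(cands, key=lambda s: d_c[c, s])}
            if sync_delay(trial, d_c, d_s) <= bound:
                alloc, changed = trial, True
    return alloc


# Hypothetical three-server line topology with two clients.
clients, servers = ["c1", "c2"], ["s1", "s2", "s3"]
pos = {"s1": 0, "s2": 1, "s3": 2}
d_s = {(a, b): abs(pos[a] - pos[b]) for a in servers for b in servers}
d_c = {("c1", "s1"): 1, ("c1", "s2"): 3, ("c1", "s3"): 5,
       ("c2", "s1"): 5, ("c2", "s2"): 3, ("c2", "s3"): 1}
print(zizo(clients, servers, d_c, d_s, bound=6))
print(zizo(clients, servers, d_c, d_s, bound=4))
```

With a bound of 6 both clients end up on the core server `s2`, so a single server suffices. With a bound of 4 no inward move is feasible and the two edge servers are kept. Both loops terminate because each zoom-in move strictly decreases, and each zoom-out move strictly increases, a client's server-network distance to the core.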
³ This can be done by roundtrip-time measurement based on a game server directory in the region.

⁴ The core is not necessarily the same server for multiple game sessions since the core is determined by the distribution of the clients in a particular game session.

The procedure in Step 3 has the effect of moving a cluster of the session servers toward the core server; hence we call it zoom-in. With this procedure, the number of session servers tends to decrease, but the overall synchronization delay tends to increase because the latency between clients and their contact servers increases. This zoom-in

process terminates when all clients have reached the core server $s^*$. At this point, each client $c_i$ connects to the last server in the migration procedure that keeps the overall synchronization delay below the bound.

Note that, in the zoom-in procedure, we attempt to move all clients up to the core server even after the synchronization delay exceeds the delay bound. This is because the synchronization delay does not always increase monotonically, but sometimes decreases as we move clients from the edge to the core. Such fluctuations in the synchronization delay are frequent, particularly when the location of a new contact server deviates from the shortest path from the client to the core server. Therefore, if we terminated the zoom-in process when we first encountered an increase in the synchronization delay, the process might be trapped in a local minimum. By iterating the zoom-in step all the way up to the core server, the client can explore the entire search space and avoid stopping at local minima.

By performing the procedure in Step 4, we seek to further reduce the number of session servers by “freeing up” some servers among those obtained in Step 3 that do not contribute to decreasing the synchronization delay. This is possible because after the zoom-in procedure, servers near the core server are most likely to have been selected, but they may be removed without increasing the synchronization delay. This process is called zoom-out since it tends to remove servers from the core and proceed to the outside, i.e., to the boundary of the session server cluster. Note that, in the zoom-out process, since the clients attempt to migrate to another server among the set of servers that are currently selected, Step 4 can only reduce the number of session servers. If all clients could connect to the core server in Step 3, this is an optimal selection and we do not need to perform Step 4.

Note that, in practice, the information shared by the session clients can be exchanged through the “server network” rather than maintaining a separate channel for signaling between the clients. For instance, in Step 3 of the algorithm, the notification of a successful migration of a client can be sent to all the servers in the corresponding session. More specifically, when a client successfully migrates to a new server, that server notifies all the other session servers in $S(A_g)$ of this migration. Then, each session server, in turn, informs its clients that they can mark the current server as the new contact server. This approach is more scalable than having each client individually contact all the session clients, since the number of session servers is typically small compared to that of clients, and we can minimize the session state information maintained by each client.

Fig. 3 illustrates the above procedure through an example. In Fig. 3(a), each of five clients, $c_0, \ldots, c_4$, is initially allocated to the closest server. In this allocation, we have five servers participating in the game session of the clients. Now, in Fig. 3(b), each client incrementally probes servers that are closer to the core server; for example, client $c_2$ migrates twice toward the core server as long as the delay bound is satisfied. As a result of the zoom-in procedure, three servers near the core server are selected. Now, clients further seek to reduce the number of session servers by moving out from the core server. For example, in Fig. 3(c), two clients, $c_1$ and $c_2$, were able to migrate to session servers that are closer to them than the core server, with $s^*$ no longer being allocated to any client. The final server allocation is shown in Fig. 3(d), with only two servers being selected by the procedure.

In what follows, we present several issues in implementing this algorithm in a scalable and efficient manner.

3.3. Implementing the selection algorithm

3.3.1. Minimizing migration overhead

The first issue concerns game session migration. Although a game architecture based on on-demand technologies enables our dynamic server selection, migrating a game session from one server to another is a relatively expensive procedure. As a result, having a client probe a new server and try to migrate the game session in each zoom-in and zoom-out step is not a good idea. We address this issue by handling the zoom-in and zoom-out

steps in the control and management plane, not in the actual game session plane. To support this operation, a server that has been probed by a client runs a simulation of the synchronization delay computation and maintains a virtual state. The clients keep probing the next possible server following the server selection algorithm until they reach a steady state. Only when the entire zoom-in and zoom-out process terminates do the clients coordinate and move the game session to the newly allocated servers. In this way, we can minimize the management and session migration cost. In addition, we note that the search process need not occur every time there is a change in the session. A practical implementation may control the frequency of server search procedures by specifying a threshold or a damping factor. For example, the proposed server selection algorithm may be triggered only when a certain number of servers have been added to the session.

Fig. 3. ZIZO algorithm. (a) Initial allocation, (b) Zoom-In, (c) Zoom-Out and (d) final allocation.

3.3.2. Scalable server search method

The performance of the “zoom-in” and “zoom-out” procedures in our proposed algorithm relies on the effectiveness of the server search method. We consider three types of search methods: (i) full search, (ii) partial search, and (iii) tree-based search.

In the full search method, each client searches for the next server to migrate to among all servers in $S$. This may be costly if $S$ is large, but will provide the most accurate result.

In the partial search method, when we discover a core server $s^*$, each server discovers up to $k$ closest servers that are closer to $s^*$ than itself, and up to $k$ closest servers that are further from $s^*$, where $k < |S|$. When a client searches for the next server, it considers only the servers in these sets as the candidates. In order to provide an estimate of the distance from its clients to the other servers, each server evaluates the distances to the other servers

based on the distance in the public network domain, on which the clients' data travel to the servers.

The tree-based method is similar in spirit to the partial search method in that each client utilizes the set of servers discovered by the current server in searching for the next server. The difference is that each server, $s_i$, selects only one server, $p(s_i)$, which is the closest to itself among those closer to the core server $s^*$. Then $s_i$ registers $p(s_i)$ as its parent server, and informs $p(s_i)$ that it has done so. This mechanism effectively forms a core-based tree, with $s^*$ being the root of the tree, along which each client selects the next server when it migrates from one server to another. More specifically, when zooming in, each client migrates to the parent server of the current server; hence there is no need to “search” for the next server. On the other hand, when clients zoom out, they search for the next server that is further from the core server among the children nodes of the current server in the server tree.

The partial search and the tree-based method enable the clients to narrow the set of candidate servers within which the clients search for the next server at each step. They are an approximation based on the hypothesis that when a client selects the next server, it will most likely be on the shortest path from the client to the core server.

The above three search methods represent different levels of trade-off between the search space and the cost: the full search method will examine all possible alternatives at high complexity, whereas the tree-based method will consider only those servers that are on the shortest path, but does not incur search cost. We expect that the partial search method will perform similarly to the full search method as we increase $k$.

3.3.3. Coordinating client migration

Next, we consider two different types of client migration strategies for the zoom-in and zoom-out process. In a naive implementation, each client may individually probe candidate servers and may make a decision on migration. In this case, it is possible that, while some clients connected to a

overall delay constraints. Therefore this uncoordinated migration of clients may result in a suboptimal result by temporarily increasing the number of servers allocated to the session. An alternative approach we consider in this paper is to coordinate the migration of all clients on the same server simultaneously, and move them only when all the clients can migrate to new servers.

3.3.4. Server capacity

The delay model presented in Section 2 and our base-line algorithm in Section 3.2 do not take into account the processing time at the server. However, if a server needs to handle a lot of game clients, its processing time can be a bottleneck in overall latency in practice. While we mostly leave this issue as a future research topic (e.g., incorporating the impact of the number of clients in a server into the delay model), we take a simple heuristic approach to handle this case in this paper.

When the number of clients at a server is a concern, we impose an upper bound on the number of clients that can be connected to a server, up to which we assume the processing time remains constant. Then we slightly modify our algorithm to deal with the bounded capacity of the servers as follows. When a client performs the zoom-in and zoom-out process, it first searches for the best server to migrate to; however, if the server has already reached its bound on the number of clients, then it attempts to migrate to the next best server, and so on, until it finds a server that can accommodate more clients. More specifically, in Step 3 of the algorithm, if a client $c_i$ is currently connected to a server $s(c_i)$, it first attempts to migrate to the closest server among the servers that are closer to the core server and further from $c_i$. If this server is full, $c_i$ attempts to migrate to the next closest server, and so on. During a zoom-out process, the client searches for a server to migrate to in a similar manner, but only among the set of current contact servers. We evaluate this “capacitated” version of the algorithm in Section 4.

3.3.5. Overhead of the algorithm
server could change their contact servers, others Finally, we study the computational overhead
may fail to migrate because it would violate the of the proposed algorithm. Overall we observe that
the computation overhead of the algorithm incurred at each client is moderate. Step 1 requires a communication overhead of O(|S|) for each client for exact computation. In most practical situations, however, it will be O(l), where l ≪ |S| is the number of servers in the client's region. Step 2 requires a total of log_2 |S| tournaments for the entire session. Step 3 requires either O(1) or O(k) depending on the type of the zoom-in algorithm (see Section 3.3). Step 4 requires O(m), where m ≪ |S| is the number of session servers.

In the next section, we present the effectiveness of the design alternatives considered above using a simulation-based study.

4. Performance evaluation

In this section, we evaluate the performance of the proposed algorithm through simulations. For simulation, we use a synthetic, two-level, hierarchical topology generated by the BRITE topology generator [20]. Specifically, we generate 50 AS domains derived from the Barabási–Albert model [21], and for each AS domain, we then generate 100 nodes connected by the Waxman model. Out of these 5000 nodes generated (50 AS × 100 nodes per AS), we randomly select 100 servers and 50 clients to participate in the same game session for each simulation scenario. We then measure the performance of the server selection schemes by averaging the results obtained from 50 simulation runs using 50 different sets of servers and clients selected randomly. We emphasize that we select a relatively small number of clients (with respect to that of the servers) in the simulation since our goal is to evaluate the performance of the algorithm for a “single” session. In reality, however, there will be many such sessions and the total number of clients will be much larger than that of servers.

To emulate a well-provisioned server network with little congestion, the inter-server latency is set to be smaller than the client-to-server latency. In particular, we study the cases when the latency between two servers is 10%, 30%, and 50% of the latency in the underlying topology produced by the topology generator. We denote this latency reduction factor by a parameter f (= 0.1, 0.3, 0.5).

We also use some abbreviated notations to represent different types of client migration approaches during the Zoom-In and Zoom-Out process as explained in Section 3.3. The first set of indices represents the way clients select the next server to migrate to:

• Tree: Tree-based search (clients migrate along the shortest-path tree to the core server).
• Part: Partial search (clients migrate by searching in the candidate server set).
• Full: Full search (clients migrate by searching for the next server among “all” servers).

Particularly in the partial search cases, we denote by k the maximum number of candidate servers given by the current server.

The second set of indices represents two different ways of moving clients from one server to another:

• Un: Uncoordinated migration (each client migrates to a new server individually).
• Co: Coordinated migration (all clients connected to a server migrate simultaneously).

We have also evaluated the effect of the migration order of the clients by comparing several approaches, e.g., round-robin, random order, and furthest-client-first (from the core). Overall, we find that the order in which clients migrate does not affect the end result of the simulation. In this section, we present the case of the round-robin migration.

For evaluation, we compare the performance of the proposed algorithm with two centralized server selection algorithms, namely, greedy server selection and client clustering algorithms. In what follows, we briefly describe these centralized algorithms before showing the performance results of the proposed scheme.

4.1. Reference centralized algorithms

The greedy selection algorithm attempts to incrementally add a new server to the session server
set by exhaustively searching for a server that, when added, results in the smallest synchronization delay. More specifically, at each step of the greedy selection, we calculate the synchronization delay when a non-session server has been temporarily added to the session server set and each client is connected to the closest session server. After evaluating all non-session servers, we select a server that produces the lowest synchronization delay when added. We keep adding servers to the session server set in this manner until the synchronization delay becomes smaller than the bound. This algorithm is similar to the “greedy algorithm” presented in [22] in the context of the Web server replica placement problem (footnote 5). In [22], the greedy algorithm has been shown to perform close to the optimal solution.

The second reference algorithm, called the client clustering algorithm, also attempts to select servers in a greedy manner. However, this algorithm differs from the first in that, while the greedy algorithm keeps adding a server that would minimize the synchronization delay, the clustering algorithm adds the server that would maximize the number of clients that will be served by it. Conceptually, let us assume that each server is assigned a disk of radius x. We say the server “covers” the clients that are within the radius from the server, i.e., the clients for which the server–client distance is no larger than x: d_c(c_i, s_j) ≤ x. Using this model, we select a server that covers the largest number of clients that are not yet covered by any server in the session server set. We keep adding servers until all clients are covered. Now we repeat this process for all possible disk sizes for each server, and then select the smallest session server set for which the synchronization delay is within bound (footnote 6). For the set-covering problem, this selection strategy is known to have an O(1 + log N) performance bound [23].

Note that these centralized algorithms are not only impractical to apply in a distributed networking environment, but are also computationally expensive: the greedy selection must evaluate the new synchronization delay for each of the servers that have not yet been added to the session server set, and the clustering algorithm must select the session server set for all possible values of server disk size.

Footnote 5. The goal of the server replica placement problem in [22] is to choose M replicas among N potential locations to minimize the client's access latency.

Footnote 6. If the current disk radius is x, we increase the radius by the minimum of the distance from each server to its closest client that is not yet covered by a server.

4.2. Number of servers selected

Fig. 4 compares the number of servers allocated by the centralized algorithms and the proposed algorithm with full server search when the inter-server latency is relatively small (f = 0.1 in Fig. 4(a)), moderate (f = 0.3 in Fig. 4(b)), and large (f = 0.5 in Fig. 4(c)). The y-axis represents the number of servers selected by each algorithm and the x-axis represents the delay bound D. In general, we observe that, when the full server search method is employed, our proposed algorithm performs as well as, or sometimes even better than, the centralized algorithms (when f = 0.5 and D < 30). As the delay bound increases, all algorithms find that fewer servers can meet the delay requirement. In particular, when the delay bound is sufficiently large, we note that all the algorithms can find the optimal server allocation, which is to allocate only one server.

When the inter-server latency is relatively large, more servers are needed to satisfy the synchronization delay bound, particularly when the delay bound is tight. This is because the inter-server latency becomes the bottleneck and therefore it is better for the clients to connect to nearby servers when the delay bound is small. On the other hand, if the server-to-server delay is relatively small, it leaves some “slack” in the client-to-server latency, and it allows the clients to connect to relatively distant servers. As a result, the number of servers required to meet the delay bound is reduced. Note that our proposed algorithm sometimes finds better solutions than the centralized algorithms do (especially when the delay bound is tight). We observe this to be the case because the centralized algorithms tend to select the servers near the center of the network and keep them in the selected set, while our algorithm can remove redundant servers near the center during the zoom-out process. We

Fig. 4. The number of servers selected: full search vs centralized, server capacity unbounded. (a) f = 0.1, (b) f = 0.3 and (c) f = 0.5. [Plots omitted: # of servers selected (y-axis) versus delay bound in msec (x-axis) for ZIZO-Full-Un, ZIZO-Full-Co, Greedy, and Clustering.]
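
The two reference algorithms described in Section 4.1 can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the `dist` table keyed by (client, server) pairs, and the `sync_delay` callback are hypothetical, and `sync_delay` stands in for the synchronization delay model of Section 2, which is not reproduced here. The clustering sketch also simplifies the disk-growing rule of footnote 6 by trying every distinct client-to-server distance as a common radius.

```python
from typing import Callable, Dict, Iterable, List, Optional, Set, Tuple

# Hypothetical inputs: dist[(client, server)] is a client-to-server latency,
# and sync_delay(session) returns the synchronization delay obtained when every
# client connects to its closest server in `session` (a stand-in for the
# delay model of Section 2).
Dist = Dict[Tuple[str, str], float]
DelayFn = Callable[[Set[str]], float]


def greedy_selection(servers: List[str], sync_delay: DelayFn, bound: float) -> Set[str]:
    """Repeatedly add the non-session server that gives the lowest delay."""
    session: Set[str] = set()
    while len(session) < len(servers):
        # Tentatively add each non-session server and keep the best one.
        best = min((s for s in servers if s not in session),
                   key=lambda s: sync_delay(session | {s}))
        session.add(best)
        if sync_delay(session) <= bound:  # stop once the delay is within the bound
            break
    return session


def cover_with_radius(clients: Iterable[str], servers: List[str],
                      dist: Dist, radius: float) -> Set[str]:
    """Greedy set cover: pick the server covering the most uncovered clients."""
    uncovered = set(clients)
    session: Set[str] = set()
    while uncovered:
        def newly_covered(s: str) -> Set[str]:
            return {c for c in uncovered if dist[c, s] <= radius}

        remaining = [s for s in servers if s not in session]
        best = max(remaining, key=lambda s: len(newly_covered(s)), default=None)
        if best is None or not newly_covered(best):
            raise ValueError("radius too small to cover every client")
        uncovered -= newly_covered(best)
        session.add(best)
    return session


def clustering_selection(clients: List[str], servers: List[str], dist: Dist,
                         sync_delay: DelayFn, bound: float) -> Optional[Set[str]]:
    """Try every candidate radius; keep the smallest cover that meets the bound."""
    best: Optional[Set[str]] = None
    for radius in sorted(set(dist.values())):
        try:
            cover = cover_with_radius(clients, servers, dist, radius)
        except ValueError:
            continue  # this radius leaves some client uncovered
        if sync_delay(cover) <= bound and (best is None or len(cover) < len(best)):
            best = cover
    return best
```

Because `greedy_selection` never removes a server once it has been added, an early pick near the center of the network stays in the result, which is the behavior Section 4.2 attributes to the centralized algorithms.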

verify the effectiveness of the zoom-out in Section 4.4.

We further notice that the coordinated client migration offers little benefit. This observation implies that the independent decision made by each client results in as good a server allocation as the case with explicit client coordination.

In Fig. 5, we evaluate the effectiveness of the partial server search mechanisms with different values of k, i.e., the maximum number of servers

Fig. 5. The number of servers selected: partial search, server capacity unbounded, f = 0.3. (a) Uncoordinated migration and (b) coordinated migration. [Plots omitted: # of servers selected versus delay bound (msec) for ZIZO-Tree, ZIZO-Part (k = 5, 10, 20), and ZIZO-Full, in the -Un variants for panel (a) and the -Co variants for panel (b).]
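
The tree-based search compared in Fig. 5 relies on the parent relation p(s_i) described in Section 3.3. A minimal sketch of how such a core-based tree might be built and used is given below; the function names and the `dist` latency callback are assumptions made for illustration, and the delay-bound checks of the full algorithm are left out.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical symmetric server-to-server latency function.
Latency = Callable[[str, str], float]


def build_core_tree(servers: List[str], core: str,
                    dist: Latency) -> Dict[str, Optional[str]]:
    """Map each server s to p(s): the closest server among those closer to the core."""
    parent: Dict[str, Optional[str]] = {core: None}
    for s in servers:
        if s == core:
            continue
        closer = [t for t in servers if t != s and dist(t, core) < dist(s, core)]
        # Fall back to the core itself when no server is strictly closer.
        parent[s] = min(closer or [core], key=lambda t: dist(s, t))
    return parent


def zoom_in_target(parent: Dict[str, Optional[str]], current: str) -> Optional[str]:
    """Tree-based zoom-in: move to the parent, so no search is needed."""
    return parent[current]


def zoom_out_candidates(parent: Dict[str, Optional[str]], current: str) -> List[str]:
    """Tree-based zoom-out: only children of the current server are searched."""
    return [s for s, p in parent.items() if p == current]
```

The partial and full search variants would replace `zoom_in_target` with a search over k candidates or over all servers, which is where the O(1) versus O(k) cost of Step 3 comes from.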

that each client searches, when f is 0.3. Fig. 5(a) shows the case when clients migrate individually, while Fig. 5(b) shows coordinated migration. We observe that, as k increases, the performance of partial search becomes close to that of full search in both figures. In particular, searching for the next server among (at most) 5 candidate servers produces a significant improvement over the tree-based migration, and k = 20 (out of 100 servers) results in a performance that is very close to the full search case. From this result, we can conclude that it is worthwhile for each client to search for the next server from a small set of candidate servers instead of searching from a tree.

So far we have seen the case when there is no bound on the server capacity. However, in practice, each server will be limited to handling up to a certain number of clients. Fig. 6 shows the case where the maximum number of clients that each server can handle is 10. In this case, our algorithm first searches for the best server to migrate to; however, if the server is full, it searches for the next best server, and so on, until it finds a server that can accommodate more clients. From the figure, the performance of the algorithm improves as we expand the search space for each client by increasing k. However, when the number of candidate servers is very small, as in the tree-based migration method, the algorithm performs quite poorly. In particular, even when the delay bound is sufficiently large, the tree-based algorithms fail to reach the optimal number of servers (which is five servers). This is because migration along a fixed server tree severely limits the search space for each client, and the chance of the same server being contacted as a candidate server for migration is relatively high. However, we find that the partial search with large k and the full search can find the optimal selection.

Fig. 6. The number of servers selected: bounded server capacity, f = 0.3. [Plot omitted: # of servers versus delay bound (msec) for ZIZO-Tree, ZIZO-Part (k = 5, 10), and ZIZO-Full in the -Un and -Co variants.]

4.3. Number of migrations

Fig. 7(a) and (b) plot the average and maximum number of migrations of 50 clients, respectively, with different types of ZIZO algorithms. Each point in the figures is obtained by averaging over the values resulting from 50 simulation runs. Recall that these migrations do not happen in the data

Fig. 7. The number of server migrations of clients: server capacity unbounded, f = 0.3. (a) Average number of migrations per client and (b) maximum number of migrations. [Plots omitted: (a) # of migrations per client and (b) max # of migrations, for ZIZO-Full, ZIZO-Part (k = 10), and ZIZO-Tree in the -Un and -Co variants; x-axis labeled "# of clients" with ticks from 25 to 50.]
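
The capacitated variant of Section 3.3.4, evaluated in Fig. 6, can be illustrated with a short sketch of the zoom-in target choice. The names below are hypothetical, ranking candidates by their latency from the client is an assumption (the text says only "closest"), and the delay-bound check that accompanies a real migration is omitted.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical latency function over node names (clients and servers alike).
Latency = Callable[[str, str], float]


def pick_zoom_in_target(client: str, current: str, servers: List[str], core: str,
                        dist: Latency, load: Dict[str, int],
                        capacity: int) -> Optional[str]:
    """Return the best non-full server for a zoom-in step, or None if all are full."""
    # Candidates are closer to the core than the current server
    # and further from the client than the current server.
    candidates = [s for s in servers
                  if dist(s, core) < dist(current, core)
                  and dist(client, s) > dist(client, current)]
    # Try the closest candidate first, then the next closest, and so on.
    for s in sorted(candidates, key=lambda s: dist(client, s)):
        if load.get(s, 0) < capacity:
            return s
    return None
```

With a capacity of 10 and 50 clients, as in the Fig. 6 scenario, no feasible assignment can use fewer than five servers, which is why five is the optimum quoted in the text.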

session domain but in the control domain. Therefore, the average number of migrations indicates the computation overhead and the message exchange overhead of each algorithm rather than the actual session migration overhead. On the other hand, values in Fig. 7(b) (the number of server migrations of the client that visited the most servers) provide a rough indication of the time required to create a session, since a game session can be started after all clients have finished the migration steps and selected the servers to connect to.

In the figure, while the number of required migration steps is slightly larger when the delay bound is smaller, it is somewhat insensitive to the difference in the synchronization delay. This is because, during the zoom-in procedure, all clients first attempt to migrate up to the core server (regardless of the sync-delay bound), and relatively few client migrations occur in the zoom-out procedure.

When we compare different migration/searching strategies, in general, the coordinated client migration methods incur fewer migration steps than the uncoordinated ones, and the tree-based search scheme results in fewer migrations than the search-based ones. This result is expected because, in the tree-based search algorithms, there are only a few servers that each client can choose from. On the other hand, we observe that the non-tree search algorithms do not incur an excessive number of migrations either. The difference between the two is only a constant factor and does not change much with the simulation parameter. In particular, the coordinated migration with partial search over a moderate number of candidate servers (e.g., ZIZO-Part-Co (k = 10)) appears to strike a good balance between the complexity and the performance observed in Fig. 5.

4.4. Impact of zoom-out

In Fig. 8, we quantify the additional benefit obtained by the zoom-out procedure, compared to the case when only zoom-in has been performed. Here we plot the ratio of the number of servers after performing the whole zoom-in–zoom-out algorithm to the best case when all clients have migrated up to the core server without zoom-out. We observe that, when the delay bound is tight (25–35 ms), 10–20% of unnecessary servers are further removed from the selected server set by zooming out clients outward from the core server. This result partially accounts for the sub-par performance of the greedy algorithms presented in Fig. 4.

Fig. 8. Further benefit of Zoom-Out: server capacity unbounded, f = 0.3. [Plot omitted: ratio "# of servers-ZIZO / # of servers-ZI" versus delay bound (msec) for ZIZO-Tree, ZIZO-Part (k = 10), and ZIZO-Full in the -Un and -Co variants.]

4.5. Adaptivity to incremental client join

In this section, we evaluate the adaptivity of the proposed algorithm by considering the case when the clients incrementally join the gaming network. We handle this incremental growth of the game session as follows. First, when a new client arrives, this client connects to the closest server, and probes the synchronization delay. If the synchronization delay is larger than the delay bound, the new client is notified of the delay violation and rejected. Otherwise, it is accepted and the client connects to the closest server. We then perform our proposed algorithm periodically after every p successful joins to the session in an effort to reduce the number of session servers. With this approach, the number of servers would temporarily increase as more clients join until the server selection algorithm is performed in the next round. Thus we expect a tradeoff between the frequency of running the algorithm (i.e., how many times the clients migrate servers overall) and how well the temporary growth of the set of session servers is contained.

Fig. 9. Number of servers selected when clients join incrementally: p = 10, unbounded server capacity, f = 0.3. (a) Delay bound = 25 ms and (b) delay bound = 35 ms. [Plots omitted: # of session servers versus # of clients for ZIZO-Full, ZIZO-Part (k = 10), and ZIZO-Tree in the -Un and -Co variants.]
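
The incremental join procedure of Section 4.5, with the reconfiguration period p used in Fig. 9, can be sketched as below. All names are hypothetical: `closest_server`, `sync_delay`, and `reconfigure` are caller-supplied stand-ins for the closest-server lookup, the delay model of Section 2, and one run of the zoom-in–zoom-out algorithm, respectively.

```python
from typing import Callable, Dict, Iterable, List, Tuple

Assignment = Dict[str, str]  # client -> contact server


def handle_joins(arrivals: Iterable[str],
                 closest_server: Callable[[str], str],
                 sync_delay: Callable[[Assignment], float],
                 bound: float,
                 p: int,
                 reconfigure: Callable[[Assignment], Assignment],
                 ) -> Tuple[Assignment, List[str]]:
    """Admit clients one by one and reconfigure after every p successful joins."""
    assignment: Assignment = {}
    rejected: List[str] = []
    accepted = 0
    for client in arrivals:
        # The new client tentatively connects to its closest server.
        trial = dict(assignment)
        trial[client] = closest_server(client)
        if sync_delay(trial) > bound:
            rejected.append(client)  # delay bound violated: reject the join
            continue
        assignment = trial
        accepted += 1
        if accepted % p == 0:
            # Periodic server selection to shrink the session server set.
            assignment = reconfigure(assignment)
    return assignment, rejected
```

A threshold-based trigger, as mentioned in the paper's footnote on alternatives to periodic selection, would replace the `accepted % p == 0` test with a check on the number of distinct servers in `assignment`.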

Fig. 10. Number of migrations per client when clients join incrementally: p = 10, unbounded server capacity, f = 0.3. (a) Delay bound = 25 ms and (b) delay bound = 35 ms. [Plots omitted: # of migrations per client versus # of clients for ZIZO-Full, ZIZO-Part (k = 10), and ZIZO-Tree in the -Un and -Co variants.]

In this section, we show the results when the reconfiguration period p is 10 (footnote 7).

Footnote 7. An alternative approach to periodic server selection would be to have a threshold in the number of servers and to trigger the ZIZO process whenever the size (or the increase in size) of the session server set exceeds the threshold. The net effect of the periodic method and the threshold-based method is similar. Thus we only present the periodic recomputation scenario in this paper.

In Fig. 9, we show the number of session servers allocated as new clients join the game session, when the delay bound is relatively tight (Fig. 9(a)) and moderate (Fig. 9(b)). The x-axis of the figures represents the number of clients that have joined thus far. We see that, as more clients join the game, the number of selected servers temporarily grows, and then drops significantly at the point that the algorithm is performed. When we compare these results to those in Fig. 5, which presents the case where all the clients are given initially, we can see that running the algorithm “on-line” produces results as efficient as running it “off-line”, since the number of servers allocated when the last client has finally joined is close to the corresponding values in Fig. 5.

Fig. 10 plots the average number of migration steps performed by clients. Here we observe that the number of migrations per client tends to decrease as the game session grows. In particular, when the delay bound is moderate (Fig. 10(b)), the overhead of the session change monotonically decreases as more clients join. This is because, when a game has a loose delay bound, the server selection by the existing clients tends to form a cluster near the center of the network as more
clients join, and therefore, there is not much need for the clients to move their servers to find a better selection.

We conclude from these results that, when the game session grows incrementally, our algorithm is effective and scalable in terms of the server selection result and the overhead for session reconfiguration and convergence.

5. Related work

In the context of real-time group communication, our problem shares some similarity with the delay-constrained multicast tree construction problem [24,25]. The goal of the delay-constrained multicast tree construction problem is to construct a lowest-cost multicast tree with an additional constraint on the upper bound of end-to-end delay between the sender and the receivers. Our work differs in that the delay present in server–client gaming networks is not just an end-to-end delay but a nonlinear combination of client-to-server and server-to-server delay as given in our delay model.

Our work is related to the dual of the minimum K-center problem [26], in which the number of servers is to be minimized when the maximum distance between clients and their nearest servers is given. In the server selection problem we have considered, we must take the inter-server delay into account as well. A (log(N) + 1)-approximation bound is known for that problem, and we are currently investigating the possibility of finding a formal approximation bound for our problem.

Our server selection algorithm has been developed for a mirrored server architecture. Recently, the mirrored server architecture has been proposed for distributed gaming [7–9] in order to overcome the limited scalability of the classic client–server architecture. The mirrored server architecture provides many benefits of peer-to-peer architecture without burdening the clients with game simulation and administrative functions. To find the closest server among the mirrors, a client can use ping or an Internet distance service to find servers in its proximity. In the mirrored server architecture, the servers are connected via a well provisioned private network to ensure low latency between them. And since the server network is privately owned, advanced features such as IP multicast can be leveraged to facilitate efficient event delivery.

In large scale games, the virtual game world is divided into multiple regions or zones [12–14] and only the users in the same zone directly interact with each other. Therefore, it is important to logically group the users in the same zone together and process the events and actions between them. To support clustering of clients and communication between them, one can leverage pub-sub technologies and distributed hashing. In [12], the authors proposed to use a distributed hash table (DHT) for rendezvous to find a server in a mirrored server architecture. In [13], the authors construct a Voronoi diagram based on the coordinates of each node in the virtual world and maintain peer-to-peer connections between the nodes to handle the dynamics of the game play. In [14], the authors proposed a range query mechanism that can support multiple attributes and explicit load balancing. Such algorithms can be used for communications in a game session once the server allocation has been resolved by using our algorithm.

For efficiency, in the mirrored server architecture, each game server simulates the game state independently based on the game events it receives. Since there is a delay in event delivery in distributed game environments, the game states at the servers must be synchronized with each other. For this, many synchronization algorithms have been proposed. One of the simplest approaches is lockstep synchronization. In lockstep synchronization, as the name suggests, all the events in the system must be acknowledged by all members and processed in lockstep. Since events are processed in sequence, there is no inconsistency in the game state. However, since it cannot guarantee that the game will play at the wall clock rate, lockstep synchronization is not applicable to online games. Another simple mechanism, called bucket synchronization [16], delays the game simulation for a fixed time period, or bucket, to prevent misordering of events due to network delay. Bucket synchronization is widely used in distributed games for its simplicity. However, its
performance and user experience are highly sensitive to the bucket size.

Optimistic synchronization algorithms try to reduce the synchronization delay by simulating game play with minimal delay and rolling back the game state when an inconsistency has been detected. Time warp synchronization and trailing state synchronization (TSS) [7] are examples of the optimistic algorithm. In time warp synchronization, a snapshot of the game state is taken before execution of each event. When a late event arrives which requires re-computation, the game state gets rolled back to one of the previous snapshots and game simulation gets re-executed with the new events. TSS maintains multiple game states with different synchronization delays instead of keeping snapshots. Each state processes the events after its own delay and the “leading” state (the state with the smallest delay) rolls back to one of the “trailing” states if an inconsistency has been discovered. Since it does not need to maintain many snapshots, it is more efficient in terms of space complexity.

Another interesting issue in distributed gaming environments is to simulate the game play in a fair manner. This problem arises from the fact that different clients may experience different latency to the server, and the game players who are closer to the server can react quickly before other players who are connected over delayed links, and thus have an unfair advantage. To address this issue, Guo et al. have proposed a proxy-based algorithm to ensure so-called fair order service [27]. The main concept of this algorithm is to have a client side proxy, which marks the reaction time of the user, and forwards this information to the server. When the server receives messages, it processes them considering both the reaction time and the arrival time of the messages. The server selection algorithm proposed in this paper does not depend on a particular synchronization technique. We just assume that all servers must receive data from all clients for computing a new game state, which is a necessary condition for most synchronization models.

Shaikh et al. [10] and Saha et al. [11] have proposed an on-line games hosting platform by developing middleware based on existing grid components. The proposed gaming architecture is based on a utility computing model to support multiple games by leveraging off-the-shelf software components that were originally developed for business applications. For example, they use IBM Tivoli Intelligent Orchestrator (TIO) [28] as the provisioning module for game servers. In addition to performing addition and removal of servers according to the workload dynamics, the provisioning module performs administrative tasks such as (a) new game deployment, (b) directory management of games and players, (c) player redirection to server, and (d) game content distribution. In the commercial world, Butterfly.net [3] is implementing the idea of building a scalable gaming infrastructure based on grid computing technology. However, both these works focus on architectural issues, leaving such issues as server allocation unaddressed.

6. Conclusion

In this paper, we present a novel distributed algorithm that dynamically selects game servers for a group of game clients participating in large scale interactive online games. The goal of server selection is to minimize server resource usage while satisfying the real-time delay constraint. The proposed algorithm, called zoom-in–zoom-out, is adaptive to session dynamics and lets the clients select appropriate servers in a distributed manner such that the number of servers used by the game session is significantly reduced compared to the case when clients select the closest servers. We have considered various zoom-in techniques that the clients can implement, namely (i) tree-based search, (ii) partial server search, and (iii) full server search. We have also considered two different types of client migration: (i) coordinated migration and (ii) uncoordinated migration. Using simulation, we have shown that the proposed zoom-in–zoom-out algorithm results in comparable or better performance than the centralized greedy algorithms, which are based on global information and expensive computation. We also show that, in practice, a partial server search algorithm with uncoordinated client migration strikes a good balance between the performance
and complexity. In particular, we have shown that this scheme can find the optimal solution in many cases and performs close to the full search algorithm. Furthermore, we have shown that the zoom-out process is essential for reducing the number of servers, especially when the delay bound is tight. In the scenarios we have considered, the zoom-out process can save up to 20% of server resources. Finally, we have shown that our proposed algorithm is not only effective for a static configuration but is also adaptive to the dynamics in the client set.

References

[1] Lineage. Available from: <http://www.lineage.net/>.
[2] NCsoft. Available from: <http://www.ncsoft.net/kor/ncnewsdata/View.asp>.
[3] Butterfly.net. Available from: <http://www.butterfly.net/>.
[4] Terazona. Available from: <http://www.zona.net/>.
[5] PlanetSide. Available from: <http://www.planetside.com/>.
[6] J.D. Pellegrino, C. Dovrolis, Bandwidth requirement and state consistency in three multiplayer game architectures, in: Proceedings of NetGames, 2003.
[7] E. Cronin, B. Filstrup, A. Kurc, A Distributed Multiplayer Game Server System, Technical Report, University of Michigan, May 2001.
[16] C. Diot, L. Gautier, A distributed architecture for multiplayer interactive applications on the Internet, IEEE Networks Magazine 13 (4) (1999) 6–15.
[17] Y.-J. Lin, S.P.K. Guo, Sync-MS: synchronized messaging service for real-time multi-player distributed games, in: Proceedings of IEEE International Conference on Network Protocols (ICNP), 2002.
[18] P. Bettner, M. Terrano, 1500 Archers on a 28.8: network programming in Age of Empires and beyond, in: Game Developers Conference, 2001.
[19] D.G. Thaler, C.V. Ravishankar, Distributed center-location algorithms, IEEE Journal on Selected Areas in Communications 15 (3) (1997) 291–303.
[20] BRITE. Available from: <http://www.cs.bu.edu/brite/>.
[21] A.L. Barabási, R. Albert, Emergence of scaling in random networks, Science (1999) 509–512.
[22] L. Qiu, V. Padmanabham, G. Voelker, On the placement of web server replicas, in: Proceedings of 20th IEEE INFOCOM, 2001.
[23] T.H. Cormen, C.E. Leiserson, R.L. Rivest, C. Stein, Introduction to algorithms, second ed., MIT Press, 2001.
[24] M. Parsa, Q. Zhu, J.J. Garcia-Luna-Aceves, An iterative algorithm for delay-constrained minimum-cost multicasting, IEEE/ACM Transactions on Networking 6 (4) (1998) 461–474.
[25] H.F. Salama, D.S. Reeves, Y. Viniotis, An Efficient Delay-Constrained Minimum Spanning Tree Heuristic, TR-96/46, Technical Report, North Carolina State University, 1996.
[26] J. Bar-Ilan, D. Peleg, Approximation algorithms for selecting network centers, in: Proceedings of 2nd Workshop on Algorithms and Data Structures, Lecture Notes in Computer Science, vol. 519, 1991, pp. 343–354.
[8] D. Bauer, S. Rooney, P. Scotton, Network infrastructure [27] K. Guo, S. Mukerjee, S. Rangrajan, S. Paul, A fair
for massively distributed games, in: Proceedings of Net- message exchange framework for distributed multi-player
Games, 2002. games, in: Proceedings of NetGames, 2003.
[9] M. Mauve, S. Fischer, J. Widmer, A generic proxy system [28] IBM, IBM Tivoli Intelligent Orchestrator. Available
for networked computer games, in: Proceedings of Net- from: <http://www-306.ibm.com/software/tivoli/products/
Games, 2002. intell-orch/>.
[10] A. Shaikh, S. Sahu, M. Rosu, M. Shea, D. Saha,
Implementation of a service platform for online games,
in: Proceedings of NetGames, 2004. Dr. Kang-Won Lee was born in Seoul,
[11] D. Saha, S. Sahu, A. Shaikh, A service platform for Korea. He received a B.S. in 1992 a
OnLine games, in: Proceedings of NetGames, 2003. M.S. in 1994 from Seoul National
[12] T. Iimura, H. Hazeyama, Y. Kadobayashi, Zoned feder- University in computer engineering
ation of game servers: a peer-to-peer approach to scalable and a Ph.D. in 2000 from the Univer-
multi-player online games, in: Proceedings of NetGame, sity of Illinois at Urbana–Champaign
2004. (UIUC) in Computer Science. He
[13] S.-Y. Hu, G.-M. Liao, Scalable peer-to-peer networked served the Republic of Korea Air
virtual environment, in: Proceedings of NetGame, Force during 1994–1995 and worked as
2004. a research assistant for the TIMELY
[14] A.R. Bharambe, M. Agrawal, S. Seshan, Mercury: sup- research group at the Center for Reli-
porting scalable multi-attribute range queries, in: Proceed- able and High Performance Computing, Urbana IL, during
ings of ACM SIGCOMM, 2004. 1996–2000. In 2000, he joined IBM T. J. Watson Research
[15] L. Gautier, C. Diot, J. Kurose, End-to-end transmission Center, Hawthorne, NY as a research staff member. His
control mechanisms for multiparty interactive applica- research interest is in computer networks, large scale distributed
tions on the Internet, in: Proceedings of IEEE Infocom, systems, and policy-based system management. He was awar-
1999. ded a Magna Cum Laude from Seoul National University in
1992, the Korean Government Overseas Scholarship from the Ministry of Education in 1996, the KFSA Scholarship from the Korea Foundation for Advanced Studies in 1997, the C.W. Gear Outstanding Graduate Student Award from the Computer Science Department at UIUC in 1999, and the Best Student Paper Award from Packet Video Workshop in 2000. Recently, he received the IBM Research Division Award for contributing to the architecture of IBM Autonomic Computing Policy Infrastructure and Products.

Bong-Jun Ko received the B.S. and M.S. degrees in electrical engineering from Seoul National University, Korea, and is currently working toward a Ph.D. degree in electrical engineering at Columbia University, New York. He worked as a research engineer at LG Electronics in Korea before joining Columbia University, and has interned at IBM T. J. Watson Research Lab, NY. His research interests include large-scale distributed systems and mobile and wireless networks. He received the IEEE ICNP Best Paper Award in 2003.

Dr. Seraphin Calo is a Research Staff Member at IBM Research and currently manages the Policy Technologies group within that organization. He received the M.S., M.A., and Ph.D. degrees in electrical engineering from Princeton University, Princeton, New Jersey. He has worked, published, and managed research projects in a number of technical areas, including queueing theory, data communications networks, multi-access protocols, expert systems, and complex systems management. He has been very active in international conferences, particularly in the systems management and policy areas. His recent involvements include serving on the Organizing Committee of Policy 2004 (IEEE 5th International Workshop on Policies for Distributed Systems and Networks) and serving as the General Chair of IM 2005 (The Ninth IFIP/IEEE International Symposium on Integrated Network Management). He has authored more than 50 technical papers and has several United States patents (three issued and four pending). He has received two IBM Research Division awards and two IBM Invention Achievement awards. His current research interests include distributed applications, services management, and policy based computing.
