
IEEE TRANSACTIONS ON COMPUTERS, VOL. 52, NO. 1, JANUARY 2003

General Models and a Reduction Design Technique for FPGA Switch Box Designs
Hongbing Fan, Jiping Liu, and Yu-Liang Wu, Member, IEEE
Abstract: An FPGA switch box is said to be hyper-universal if it is detailed-routable for any set of multipin nets specifying a routing requirement over the switch box. Compared with the known universal switch modules, where only 2-pin nets are considered, the hyper-universal switch box model is more general and powerful. This paper studies the generic problem and proposes a systematic design methodology for hyper-universal $(k, W)$-switch boxes, where $k$ is the number of sides and $W$ is the number of terminals on each side. We formulate this hyper-universal $(k, W)$-switch box design problem as a $k$-partite graph design problem and propose an efficient reduction design technique. Applying this technique, we can design hyper-universal $(k, W)$-switch boxes with $O(W)$ switches for any fixed $k$. For illustration, we provide optimum hyper-universal $(2, W)$- and $(3, W)$-switch boxes and a hyper-universal $(4, W)$-switch box whose switch number is quite close to the lower bound $6W$, the number used in a well-known commercial design without hyper-universal routability. We also conclude that the proposed reduction method can yield an efficient detailed routing algorithm for any given routing requirement.
Index Terms: FPGA, switch box, global routing, detailed routing, hyper-universal, optimum design, reduction technique.

1 INTRODUCTION

FIELD Programmable Gate Arrays (FPGAs), a kind of Very Large Scale Integrated (VLSI) circuit, consist of arrays of prefabricated functional blocks and wire segments with user-programmability of logic and routing resources. Because of their fast turn-around time and economical manufacturing cost for low-volume designs, FPGAs have been used in a great variety of digital equipment. FPGA technologies are commonly classified into three major categories: 1) Look-Up-Table (LUT), SRAM-based; 2) multiplexer, channel-organized, antifuse-based; and 3) LPLD, EPROM-based. The switch box design problems considered in this paper are related to the LUT and SRAM-based two-dimensional FPGA (2D-FPGA) architecture [4], [17].
A typical 2D-FPGA architecture is a symmetric-array
model as shown in Fig. 1 [4], [2], [1], [6], [7]. The functional
blocks (or logic cells) are marked by L, which are separated
by vertical and horizontal channels. There are W (called
channel density) prefabricated parallel wire segments
running between each pair of adjacent L-cells in both
vertical and horizontal channels. The wire segments in a
vertical (or horizontal) channel are aligned into W vertical
(or horizontal) tracks; each track within a channel is
assigned an integer in $\{1, \ldots, W\}$ as its track ID. There are
connection boxes (C-box, C) in the channel between
adjacent L-cells. A switch box (S-box, S), located at each intersection of a vertical and horizontal channel, contains programmable switches to connect wire segments running from its surrounding C-boxes.

• H. Fan is with the Department of Computer Science, University of Victoria, Victoria, BC, Canada V8W 3P6. E-mail: hfan@uvic.ca.
• J. Liu is with the Department of Mathematics and Computer Science, University of Lethbridge, Lethbridge, AB, Canada T1K 3M4. E-mail: liu@cs.uleth.ca.
• Y.-L. Wu is with the Department of Computer Science and Engineering, Chinese University of Hong Kong, Shatin, NT Hong Kong. E-mail: ylw@cse.cuhk.edu.hk.
Manuscript received 1 June 2000; revised 18 Dec. 2001; accepted 25 Mar. 2002.
For information on obtaining reprints of this article, please send e-mail to: tc@computer.org, and reference IEEECS Log Number 112221.
0018-9340/03/$17.00 © 2003 IEEE

In FPGA implementations, certain logic pins are grouped
and connected by chosen wire segments and switches in
C-boxes and S-boxes to form signal nets. This process is
called a routing. Conventionally, the routing process is
divided into two steps, a global routing and a detailed routing.
A global router decides the connection topology of all nets,
which is usually a set of paths or trees and called a global
routing. A detailed router decides which wire segments and
switches in C-boxes and S-boxes are used to implement the
connections indicated by a global routing; the result is
called a detailed routing.
Routability and area efficiency are two important issues in
the architectural design of 2D-FPGAs. Routability is measured by the probability of completing a detailed routing; the
higher the better. Area efficiency is characterized mainly by
the number of switches in the switch boxes; the smaller the
better. However, high routability and high area efficiency are
two conflicting goals. It is easy to see that an FPGA with
complete routability in C-boxes and S-boxes, namely, having
switches between all possible connections, will have the
highest routability under the same channel density. However, it will have the lowest area efficiency and becomes
impractical. In the past, much research has been done on
FPGA architectures, switch modules, and routing algorithms
for improvements on routability and/or area efficiency [11],
[4], [5], [2], [6], [9], [1], [10], [13], [14], [15], [16].
Rose and Brown [11] first investigated the impact of
C-box and S-box designs on routability and observed that
the S-box structure is quite crucial to the routability. In their
work, the flexibility of a switch block, $F_s$, is defined as the maximum number of switches a terminal in an S-box can be connected to. It was concluded that $F_s = 3$ or $4$ [11], [4], [2] can result in a sufficiently high routability, which is considered a reasonable balance between routability and area efficiency. Two kinds of S-boxes with $F_s = 3$ were adopted by Rose and Brown [11] and Xilinx [17].

Published by the IEEE Computer Society

Fig. 1. The architecture of a 2D-FPGA.

Observing that different S-box designs with the same
flexibility can have different routability and FPGAs with
S-boxes of high local routability tend to have high global
routability, Chang et al. [6] proposed the model of universal
switch module. A switch module M with W terminals on
each side is said to be universal if any set of 2-pin nets
satisfying the dimension constraint (i.e., the number of nets
on each side of M is at most W ) is simultaneously routable
through M. The symmetric 4-side switch module, which has
flexibility 3 and 6W switches, was designed and proven to
be optimal and universal. It was also shown in their work
that a universal switch module can accommodate 25 percent
more routing instances than that of the Xilinx XC4000-type
S-box using the same number of switches. The result is very
interesting since it points out that, besides the number of
switches equipped, the internal design of the S-box also
plays an important role in the routability. This result
motivates us to further investigate the optimum design
problem of S-boxes in this paper.
In conventional 2D-FPGAs, an S-box has $k$ sides and $W$ terminals on each side, where $2 \le k \le 4$. Moreover, research on three-dimensional, six-way mesh, eight-way mesh, and some other tree-structured FPGA architectures has raised interest in switch boxes with $k$ ($\ge 5$) sides and $W$ terminals on each side ($(k, W)$-SB for short). Shyu et al. [12] investigated the generic universal $(k, W)$ switch box design problem and generalized the symmetric switch module for any pair $(k, W)$.
However, as the models proposed in previous work considered only the special case of 2-pin net routings, in this paper we further generalize the model to cover all possible routing cases. We define an S-box to be hyper-universal if it is routable for every set of multipin nets satisfying the dimension constraint. The difference between universal and hyper-universal boxes is that the former is routable for 2-pin net routing requirements, while the latter can cater to all multipin net routing scenarios. It is obvious that a hyper-universal S-box is also universal, while the reverse is not true; we provide a counterexample in Section 5. Therefore, the routability of a hyper-universal S-box should be superior to that of a universal S-box of the same size. Despite the powerful potential applications of hyper-universal S-boxes, this ideal model has rarely been addressed or studied in the past. The only known trivial model that can yield hyper-universal routability would have a switch number in the range of $O(W^2)$ [16], which is clearly beyond any practical acceptability. In this paper, we explore certain powerful decomposition properties of the S-box designs. Applying these decomposition properties, we develop an efficient reduction scheme that can easily produce designs of hyper-universal $(k, W)$-SBs with $O(W)$ switches. Based on our reduction design scheme, we design optimum hyper-universal $(2, W)$-SBs and $(3, W)$-SBs, and a near optimum $(4, W)$-SB, which uses only a few more switches than a known non-hyper-universal commercial model that uses $6W$ switches.
The goal of this paper is to provide full coverage of graph models and reduction design theory for general hyper-universal $(k, W)$-SBs. This paper is organized as follows: Section 2 gives graph models and related graph design problems associated with the switch box designs. In Section 3, we first describe a decomposition theorem of global routings and then present a general reduction design scheme for hyper-universal S-boxes and show that it can derive a hyper-universal $(k, W)$-SB with $O(W)$ switches. Section 4 presents optimum hyper-universal $(2, W)$-SBs and $(3, W)$-SBs, and Section 5 gives various designs of hyper-universal $(4, W)$-SBs using the reduction design technique. Our conclusions follow in Section 6.

2 DEFINITIONS AND PROBLEMS

Graph models for routing requirements and switch boxes play an important role in our design technique. In this section, we model a routing requirement as a collection of subsets, a switch box as a graph with terminals as vertices and switches as edges, and a detailed routing as a spanning forest. With these formulations, the switch box design problem becomes a graph design problem. We will mostly use the terminology and notation of graph theory from [3]. Let $G = (V(G), E(G))$ be a simple graph, where $V(G)$ is the vertex set and $E(G)$ is the edge set. We denote by $|V(G)|$ (or $|G|$) and $|E(G)|$ the number of vertices and edges in $G$, respectively. For $S \subseteq V(G)$, $G[S]$ denotes the subgraph of $G$ induced by $S$. $v_1 v_2 \ldots v_t$ denotes the path consisting of consecutive vertices $v_1, v_2, \ldots, v_t$.

Fig. 2. Examples of a global routing and a detailed routing.

A $(k, W)$-SB can be naturally represented by a graph. Given a $(k, W)$-SB, denote the $j$th terminal on side $i$ by a vertex $v_{i,j}$ and a switch connecting $v_{i,j}$ and $v_{i',j'}$ by an edge $v_{i,j} v_{i',j'}$. Thus, a $(k, W)$-SB corresponds to a $k$-partite graph $(V_1, \ldots, V_k; E)$ with vertex partition $V_1, \ldots, V_k$, where $V_i = \{v_{i,j} \mid j = 1, \ldots, W\}$, $i = 1, \ldots, k$. We refer to such a graph as a $(k, W)$-graph or a $(k, W)$-SB. Two $(k, W)$-graphs are isomorphic if there is an isomorphism which preserves the vertex partitions.

We next model the routing requirement for a $(k, W)$-SB, namely, the global routing through the S-box. Label the sides of the S-box by $1, 2, \ldots, k$, respectively. Then, a multipin net in a routing requirement can be represented by a subset of $\{1, 2, \ldots, k\}$. For example, a net that connects three terminals on sides 1, 2, and 3 can be represented by $\{1, 2, 3\}$. Thus, a routing requirement for the $(k, W)$-SB can be represented as a collection (multiset) of subsets (also called nets) of $\{1, 2, \ldots, k\}$ such that each $i \in \{1, 2, \ldots, k\}$ is contained in no more than $W$ subsets in the collection (channel density constraint). For the sake of regularity, we add some singletons (nets of size 1) to the collection such that each element appears $W$ times and call it a global routing (GR for short). Formally, we define a global routing as follows.
Definition 1. A collection $\{N_i \mid i = 1, \ldots, v\}$ of subsets of $\{1, 2, \ldots, k\}$ is said to be a $k$-way global routing if there exists an integer $d$ such that $\sum_{i=1}^{v} |N_i \cap \{j\}| = d$ for all $j = 1, \ldots, k$. $d$ is called the density of the $k$-way global routing. A $k$-way global routing of density $d$ is abbreviated as a $(k, d)$-GR. A $(k, d)$-GR is said to be $r$-bounded if the size (number of pins) of each net within the global routing is at most $r$.

Let $R$ be a $(k, d)$-GR and $R'$ be a subcollection of $R$. If $R'$ is a $(k, d')$-GR with $d' < d$, then $R'$ is said to be a subglobal routing of $R$; $R$ is a minimal global routing (MGR) if it does not contain a subglobal routing.

We can view a $(k, d)$-GR $R$ as a $d$-regular hypergraph with vertex set $\{1, \ldots, k\}$ and edge set $R$.

We next define the detailed routing of a $(k, d)$-GR in a $(k, W)$-SB, where $d \le W$.

Definition 2. Let $G$ be a $(k, W)$-graph with partition $V_1, \ldots, V_k$ and $R = \{N_i \mid i = 1, \ldots, l\}$ be a $(k, d)$-GR. A detailed routing of $R$ in $G$ is a set of mutually vertex-disjoint subgraphs $\{T(N_i) \mid i = 1, \ldots, l\}$ of $G$ satisfying: 1) $T(N_i)$ is a tree of $|N_i|$ vertices and 2) $|V_j \cap V(T(N_i))| = 1$ if $j \in N_i$, for $i = 1, \ldots, l$. $T(N_i)$ is called a detailed routing of $N_i$.
Fig. 2a shows an example of global routing through a $(4, 4)$-SB. The corresponding $(4, 4)$-GR is $\{\{1, 2\}, \{1, 2, 4\}, \{1, 3\}, \{1, 4\}, \{2, 3, 4\}, \{2, 3\}, \{3, 4\}\}$. Fig. 2b is the hypergraph representation of the $(4, 4)$-GR. Fig. 2c and Fig. 2d give a $(4, 4)$-SB and a detailed routing of the $(4, 4)$-GR in the $(4, 4)$-SB.
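
The density condition of Definition 1 is easy to check mechanically. The following is a minimal sketch, assuming nets are represented as Python sets of side labels and a global routing as a list of such sets; the function name `density` is illustrative and not taken from the paper.

```python
from collections import Counter

def density(nets, k):
    """Return d if every side 1..k lies in exactly d nets, else None."""
    counts = Counter(side for net in nets for side in net)
    values = {counts[side] for side in range(1, k + 1)}
    return values.pop() if len(values) == 1 else None

# The (4, 4)-GR of Fig. 2a, stored as a list because a GR is a multiset of nets.
fig2_gr = [{1, 2}, {1, 2, 4}, {1, 3}, {1, 4}, {2, 3, 4}, {2, 3}, {3, 4}]
print(density(fig2_gr, 4))  # each of the four sides appears in 4 nets
```

A return value of `None` signals that the collection is not a global routing, for instance because singleton nets still have to be added to reach a common density.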

Fig. 3. The decomposition of global routings.

With the graph models, we can precisely define the hyper-universal switch boxes as follows:

Definition 3. A switch box is a hyper-universal $(k, W)$-SB ($(k, W)$-HUSB for short) if it accommodates a detailed routing for each $(k, W)$-GR. A hyper-universal $(k, W)$-SB is called an optimum $(k, W)$-HUSB if it has the least number of switches among all $(k, W)$-HUSBs. Denote by $e(k, W)$ the number of switches in an optimum $(k, W)$-HUSB.

Problem 1. For any given $k$, determine $e(k, W)$ and find an optimum $(k, W)$-HUSB for each $W \ge 1$.

Problem 2. For a given $(k, W)$-HUSB, find an efficient algorithm that determines a detailed routing for any given $(k, W)$-GR.

From the definition of hyper-universal switch boxes, we can see that a $(k, W)$-HUSB has the highest routing capacity among all $(k, W)$-SBs. In fact, we can introduce a new parameter $r$ ($1 \le r \le k$) to classify $(k, W)$-SBs by their routing capacities: A $(k, W)$-SB is called $r$-universal if it is routable for every $r$-bounded $(k, W)$-GR. It is clear that an $(r+1)$-universal $(k, W)$-SB must be an $r$-universal $(k, W)$-SB; a $k$-universal $(k, W)$-SB is just a hyper-universal $(k, W)$-SB; and 2-universal $(k, W)$-SBs are the same class as the universal $(k, W)$ switch modules (or blocks) in [6], [12].

3 A REDUCTION DESIGN TECHNIQUE FOR HUSBS

In this section, we develop a reduction design scheme for general hyper-universal S-boxes. This scheme enables us to design $(k, W)$-HUSBs with $O(W)$ switches and efficient detailed routing algorithms.


3.1 Decomposition Properties and HUSB Designs

Our approach is based on a decomposition property of global routings. By the definition of minimal global routings, we know that a $k$-way global routing can always be decomposed into $k$-way minimal global routings. For example, the $(4, 4)$-GR $\{\{1, 2\}, \{1, 2, 4\}, \{1, 4\}, \{2, 3, 4\}, \{1, 3\}, \{2, 3\}, \{3, 4\}\}$ is a disjoint union of the minimal 4-way global routings $\{\{1, 2\}, \{3, 4\}\}$, $\{\{1, 4\}, \{2, 3\}\}$, and $\{\{1, 2, 4\}, \{1, 3\}, \{2, 3, 4\}\}$. Fig. 3 illustrates the decomposition in hypergraph form. The following decomposition theorem was proven in [8].

Lemma 1. For any integer $k \ge 2$, there exists a unique integer $f(k)$ such that every $k$-way global routing can be decomposed into the disjoint union of minimal $k$-way global routings of densities at most $f(k)$. In particular, $f(k) = k - 1$ for $k = 2, 3, 4$.
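
The example decomposition can be checked by brute force. The sketch below assumes the same set-based representation of nets; `is_minimal` is an illustrative helper that tests every proper nonempty subcollection, which is only practical for the small routings considered here.

```python
from collections import Counter
from itertools import combinations

def density(nets, k):
    """Return d if every side 1..k lies in exactly d nets, else None."""
    counts = Counter(side for net in nets for side in net)
    values = {counts[side] for side in range(1, k + 1)}
    return values.pop() if len(values) == 1 else None

def is_minimal(nets, k):
    """True if nets is a GR and no proper nonempty subcollection is a GR."""
    if not nets or density(nets, k) is None:
        return False
    for size in range(1, len(nets)):
        for idx in combinations(range(len(nets)), size):
            if density([nets[i] for i in idx], k) is not None:
                return False
    return True

parts = [
    [{1, 2}, {3, 4}],
    [{1, 4}, {2, 3}],
    [{1, 2, 4}, {1, 3}, {2, 3, 4}],
]
whole = [{1, 2}, {1, 2, 4}, {1, 4}, {2, 3, 4}, {1, 3}, {2, 3}, {3, 4}]

# The three parts are minimal, and together they use exactly the nets of `whole`.
same_nets = (Counter(frozenset(n) for p in parts for n in p)
             == Counter(frozenset(n) for n in whole))
print(all(is_minimal(p, 4) for p in parts), same_nets)
```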

Let $k$ be an integer with $k \ge 2$ and $\{r_1, \ldots, r_t\}$ be the set consisting of all densities of $k$-way MGRs. Then, $r_i \le f(k)$ and $t$ is uniquely determined by $k$, where $f(k)$ is the function defined in Lemma 1. Define $p(k)$ to be the least common multiple of $r_1, \ldots, r_t$.

Lemma 2. If $r_1 x_1 + \cdots + r_t x_t \ge t\,p(k) - t + 1$, where $x_1, \ldots, x_t$ are nonnegative integers, then there are integers $0 \le y_i \le x_i$, $i = 1, \ldots, t$, such that $r_1 y_1 + \cdots + r_t y_t = p(k)$.
Proof. By the generalized pigeonhole principle, there is an integer $1 \le i \le t$ such that $r_i x_i \ge p(k)$. Since $r_i$ divides $p(k)$, there is an integer $y_i \le x_i$ such that $r_i y_i = p(k)$. For any $j \ne i$, let $y_j = 0$. Then, $0 \le y_l \le x_l$ for $l = 1, \ldots, t$ and $r_1 y_1 + \cdots + r_t y_t = p(k)$. □
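
The proof of Lemma 2 is constructive, and the construction can be sketched as follows. The function name `lemma2_witness` and the list-based interface are assumptions made for illustration; `math.lcm` with several arguments requires Python 3.9 or later.

```python
from math import lcm

def lemma2_witness(r, x):
    """Return y with 0 <= y[i] <= x[i] and sum(r[i] * y[i]) == lcm(r),
    or None if the hypothesis of Lemma 2 does not hold for (r, x)."""
    p, t = lcm(*r), len(r)
    if sum(ri * xi for ri, xi in zip(r, x)) < t * p - t + 1:
        return None                      # hypothesis of Lemma 2 not met
    for i, (ri, xi) in enumerate(zip(r, x)):
        if ri * xi >= p:                 # pigeonhole guarantees such an i
            y = [0] * t
            y[i] = p // ri               # ri divides p, so ri * y[i] == p
            return y

# Densities 1, 2, 3 give p = 6 and threshold 3 * 6 - 3 + 1 = 16.
print(lemma2_witness([1, 2, 3], [0, 8, 0]))  # -> [0, 3, 0]
```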
The following theorem and its proof give a general solution for $(k, W)$-HUSB designs.

Theorem 3. For any fixed positive integer $k$, $e(k, W) = O(W)$.

Proof. Let $h(k) = t\,p(k) - t + 1$ and let $r$ be the largest integer such that $W = p(k) q + r$ and $r < h(k)$. Consider the $k$-partite graph $F(k, W)$ consisting of $q$ vertex-disjoint copies of the complete $k$-partite graph $K_{k,p(k)}$ with $p(k)$ vertices in each part, and a vertex-disjoint complete $k$-partite graph $K_{k,r}$ with $r$ vertices in each part. We show that $F(k, W)$ is a $(k, W)$-HUSB.

For any $(k, W)$-GR $R$, by Lemma 1, $R$ can be decomposed into the union of minimal $(k, r_i)$-GRs, where the $r_i$ are defined as above. By recursively applying Lemma 2, we can regroup these minimal $k$-way global routings into $q$ $(k, p(k))$-GRs and one $(k, r)$-GR. Obviously, each $(k, p(k))$-GR has a detailed routing in $K_{k,p(k)}$ and the $(k, r)$-GR has a detailed routing in $K_{k,r}$. Hence, $F(k, W)$ is a $(k, W)$-HUSB. Its number of edges is

$$
\begin{aligned}
|E(F(k, W))| &= q\,k(k-1)\,p(k)^2/2 + k(k-1)\,r^2/2 \\
&= \bigl(q\,p(k)^2 + r^2\bigr)\,k(k-1)/2 \\
&= \bigl((W - r)\,p(k) + r^2\bigr)\,k(k-1)/2 \\
&\le \bigl(W\,p(k) + h(k)^2\bigr)\,k(k-1)/2 \\
&= O(W).
\end{aligned}
$$

It follows that $e(k, W) = O(W)$ for a fixed $k$. □
Note that if we replace $K_{k,p(k)}$ and $K_{k,r}$ by optimum or near optimum $(k, p(k))$-HUSB and $(k, r)$-HUSB, respectively, the resulting design will be optimum or very close to an optimum design. We refer to the designs of $(k, p(k))$-HUSBs and $(k, r_i)$-HUSBs ($i = 1, \ldots, t$) as $k$-side elementary designs.

3.2 The Reduction Design Technique

The proof of Theorem 3 suggests a powerful technique for hyper-universal S-box designs. For a fixed $k$ and any $W$, since a hyper-universal $(k, W)$-SB must accommodate all $(k, W)$-GRs and a $(k, W)$-GR can be decomposed into a set of $k$-way MGRs, we can first design a few elementary $k$-side HUSBs (with smaller densities) corresponding to those decomposed subglobal routings and then combine them into the desired $(k, W)$-HUSBs. This property enables us to reduce the difficult HUSB design problem to some easier ones and solve their detailed routing problems individually.
Reduction design scheme for $(k, W)$-HUSBs:
I. Find $f(k)$ and all $k$-way minimal global routings. The existence of $f(k)$ and the finiteness of the number of minimal $k$-way global routings are guaranteed by Lemma 1.
II. Determine $p$ and $r$ such that $W = pq + r$, with $p$ and $r$ as small as possible, so that any $(k, W)$-GR is a union of $q$ subglobal routings of density $p$ and one subglobal routing of density $r$. Theorem 3 shows that $p$ and $r$ exist and $p \le p(k)$, $r < h(k)$. (Note that, as $k$ is fixed, each $W$ corresponds to a unique $r$; we may have more than one $r$ as $W$ varies, but there are finitely many such $r$ over all $W$.)
III. Design a $(k, p)$-HUSB $S_1$ and a $(k, r)$-HUSB $S_2$ with the number of switches as small as possible. Set up a database recording the detailed routing of each $(k, p)$-GR and $(k, r)$-GR in $S_1$ and $S_2$, respectively. Then, combine (by disjoint union) $q$ copies of $S_1$ and one copy of $S_2$, obtaining a $(k, W)$-HUSB $S$.
IV. Find a detailed routing in $S$ for any given $(k, W)$-GR using the database created in III. The detailed routing can be done by the following algorithm.
Let $MGR$ be the collection of minimal $k$-way global routings, and $R = \{N_1, \ldots, N_l\}$ be a given $(k, W)$-GR. Do the following:
(a) Decompose $R$ into minimal global routings: Let $\mathcal{M}$ be an empty collection. Repeat Step 1 to Step 3 until $R = \emptyset$.
Step 1 Pick any $M \in MGR$ and let $T_1, \ldots, T_m$ be all the different nets in $M$. For $i = 1$ to $m$, let $t_i$ be the number of replications of $T_i$ in $M$ and set $x_i = 0$.
Step 2 For $i = 1$ to $l$, $j = 1$ to $m$, if $N_i = T_j$, set $x_j = x_j + 1$.
Step 3 If $x_j \ge t_j$ for all $j = 1$ to $m$, then set $\mathcal{M} = \mathcal{M} \cup \{M\}$ and $R = R - M$; else, set $MGR = MGR - \{M\}$.
(b) Combine the minimal global routings in $\mathcal{M}$ to form one $(k, r)$-GR and several $(k, p)$-GRs.
(c) Using the database produced in III, find a detailed routing of each $(k, p)$-GR in one copy of $S_1$ and a detailed routing for the $(k, r)$-GR in $S_2$.
We point out that the above detailed routing algorithm is
polynomially bounded in terms of W . Meanwhile, the above
reduction design technique can be modified to design general
universal switch boxes and other r-universal switch boxes.
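
Step IV(a) can be sketched in a few lines for $k = 3$. The list `MGR_3` below is an assumption drawn from the case analysis in the proof of Theorem 5 (five minimal routings of density 1 and one of density 2); the function names are illustrative. The greedy loop mirrors Steps 1 to 3: a minimal routing is removed from $R$ while it fits and is discarded from the candidate list once it no longer does.

```python
from collections import Counter

def as_multiset(nets):
    """Represent a collection of nets as a Counter of frozensets."""
    return Counter(frozenset(net) for net in nets)

# Minimal 3-way global routings (assumed complete, per the proof of Theorem 5).
MGR_3 = [
    [{1, 2, 3}],
    [{1, 2}, {3}],
    [{1, 3}, {2}],
    [{2, 3}, {1}],
    [{1}, {2}, {3}],
    [{1, 2}, {1, 3}, {2, 3}],
]

def decompose(routing, mgrs):
    """Greedily peel minimal global routings off `routing` (step IV(a))."""
    remaining = as_multiset(routing)
    candidates = [as_multiset(m) for m in mgrs]
    found = []
    while sum(remaining.values()) > 0:
        if not candidates:
            raise ValueError("routing is not a union of the given MGRs")
        m = candidates[0]
        if all(remaining[net] >= cnt for net, cnt in m.items()):
            remaining -= m           # R = R - M
            found.append(m)          # record M in the decomposition
        else:
            candidates.pop(0)        # M cannot fit again, since R only shrinks
    return found

routing = [{1, 2}, {1, 3}, {2, 3}, {1, 2, 3}, {1}, {2, 3}]   # a (3, 4)-GR
pieces = decompose(routing, MGR_3)
print(len(pieces))  # -> 3
```

The greedy choice is safe because removing any contained minimal global routing from a global routing leaves a global routing of lower density, which by Lemma 1 again decomposes.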

4 OPTIMUM $(k, W)$-HUSBS FOR $k = 2, 3$

We can apply the reduction technique developed in Section 3 to design optimum $(2, W)$-HUSBs and $(3, W)$-HUSBs easily. Here, however, we use different methods, as this will help us in designing near optimum $(4, W)$-HUSBs.
The basic method is, first, to give a lower bound for $e(k, W)$ and, then, to find a hyper-universal $k$-partite graph with the number of edges equal to the lower bound.

Theorem 4. $e(2, W) = W$ and the graph

$$G(2, W) = \bigl(\{v_{1,j} \mid j = 1, \ldots, W\}, \{v_{2,j} \mid j = 1, \ldots, W\}; \{v_{1,i} v_{2,i} \mid i = 1, \ldots, W\}\bigr)$$

is an optimum $(2, W)$-HUSB.

Proof. Let $G = (V_1, V_2; E)$ be an optimum $(2, W)$-HUSB, where $V_i = \{v_{i,j} \mid j = 1, \ldots, W\}$, $i = 1, 2$. Consider the $(2, W)$-GR $R = \{N_i = \{1, 2\} \mid i = 1, \ldots, W\}$. Then, each vertex in $V_1$ is incident with a tree in a detailed routing and each tree of a detailed routing must be an edge. It follows that the degree of each vertex is at least one and, therefore, $e(2, W) = |E| \ge W$. On the other hand, we show that the graph $G(2, W)$ contains a detailed routing for any $(2, W)$-GR $\{N_i \mid i = 1, \ldots, l\}$. Since $\sum_{i=1}^{l} |N_i \cap \{j\}| = W$, $j = 1, 2$, we may assume that $N_i = \{1, 2\}$ for $i = 1, \ldots, l_1$ for some $l_1$, $N_i = \{1\}$ for $i = l_1 + 1, \ldots, W$, and $N_i = \{2\}$ for $i = W + 1, \ldots, l$. Let $T(N_i) = (\{v_{1,i}, v_{2,i}\}, \{v_{1,i} v_{2,i}\})$ for $i = 1, \ldots, l_1$, $T(N_i) = (\{v_{1,i}\}, \emptyset)$ for $i = l_1 + 1, \ldots, W$, and $T(N_i) = (\{v_{2,\,i-W+l_1}\}, \emptyset)$ for $i = W + 1, \ldots, l$. Then, $\{T(N_i) \mid i = 1, \ldots, l\}$ is a detailed routing of $\{N_i \mid i = 1, \ldots, l\}$. This completes the proof of the theorem. □
Next, we investigate the $(3, W)$-HUSB design problem. Let $G(3, W)$ denote the 3-partite graph on $V_1, V_2, V_3$ with edge set $\{v_{i,j} v_{h,\,j+h-i-1} \mid j = 1, \ldots, W;\ 1 \le i < h \le 3\}$, where the second index is taken modulo $W$. Since $j + p \not\equiv j \pmod{W}$ when $0 < p < W$,

$$v_{1,1} v_{2,1} v_{3,1} v_{1,W} v_{2,W} v_{3,W} \ldots v_{1,t} v_{2,t} v_{3,t} v_{1,t-1} v_{2,t-1} v_{3,t-1} \ldots v_{1,2} v_{2,2} v_{3,2} v_{1,1}$$

is a Hamilton cycle of $G(3, W)$. Therefore, $G(3, W)$ is a $3W$-cycle.
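
The Hamilton cycle can be traced programmatically. In the sketch below, vertices are pairs `(side, track)` with tracks numbered $0, \ldots, W-1$ instead of $1, \ldots, W$, so that the modulo-$W$ arithmetic is direct; this renumbering and the function names are assumptions made for illustration.

```python
def g3_edges(W):
    """Edges of G(3, W): v_{i,j} is joined to v_{h,j+h-i-1} for i < h."""
    edges = set()
    for i in (1, 2, 3):
        for h in range(i + 1, 4):
            for j in range(W):
                edges.add(frozenset({(i, j), (h, (j + h - i - 1) % W)}))
    return edges

def hamilton_cycle(W):
    """The cycle listed in the text, visiting tracks 0, W-1, W-2, ..., 1."""
    cycle = []
    for step in range(W):
        track = (-step) % W
        cycle += [(1, track), (2, track), (3, track)]
    return cycle

W = 5
edges, cycle = g3_edges(W), hamilton_cycle(W)
closed = cycle + cycle[:1]
on_graph = all(frozenset(pair) in edges for pair in zip(closed, closed[1:]))
print(len(edges), len(set(cycle)), on_graph)  # -> 15 15 True
```

Since the graph has exactly $3W$ edges and the cycle uses $3W$ distinct edges, the graph coincides with the cycle.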
Theorem 5. $e(3, W) = 3W$ and $G(3, W)$ is an optimum $(3, W)$-HUSB.

Proof. Let $G = (V_1, V_2, V_3; E)$ be an optimum $(3, W)$-HUSB. Consider the $(3, W)$-GR $R_1 = \{N_i \mid i = 1, \ldots, 2W\}$ with $N_i = \{1, 2\}$ for $i = 1, \ldots, W$ and $N_i = \{3\}$ for $i = W + 1, \ldots, 2W$. Then, a detailed routing of $R_1$ in $G$ consists of $W$ independent edges joining vertices between $V_1$ and $V_2$. Therefore, $G$ contains at least $W$ edges with one end in $V_1$ and the other in $V_2$. Similarly, by choosing a $(3, W)$-GR $R_2 = \{N_i \mid i = 1, \ldots, 2W\}$ with $N_i = \{2, 3\}$ for $i = 1, \ldots, W$ and $N_i = \{1\}$ for $i = W + 1, \ldots, 2W$, and a $(3, W)$-GR $R_3 = \{N_i \mid i = 1, \ldots, 2W\}$ with $N_i = \{3, 1\}$ for $i = 1, \ldots, W$ and $N_i = \{2\}$ for $i = W + 1, \ldots, 2W$, we can show that, in $G$, there are at least $W$ independent edges between $V_2$ and $V_3$, and between $V_3$ and $V_1$, respectively. Therefore, $e(3, W) = |E(G)| \ge 3W$.

We prove $e(3, W) = 3W$ by showing that $G(3, W)$ is a $(3, W)$-HUSB.

Let $R = \{N_i \mid i = 1, \ldots, l\}$ be any $(3, W)$-GR. Then, $\sum_{i=1}^{l} |N_i \cap \{j\}| = W$, $j = 1, 2, 3$, by the definition. Since $N_i \subseteq \{1, 2, 3\}$ is a nonempty set, each $N_i$ is one of the sets

$$\{1, 2, 3\}, \{1, 2\}, \{2, 3\}, \{1, 3\}, \{1\}, \{2\}, \{3\}.$$

We claim that the nets in each $R$ can be ordered and the elements in each $N_i$ can be ordered so that 1, 2, 3 appear successively in cyclic order (we will say that $R$ can be ordered to satisfy the order property). For example, if $R = \{\{1, 2\}, \{2, 3\}, \{1, 3\}\}$, then $R$ can be ordered as $\{2, 3\}, \{1, 2\}, \{3, 1\}$ (or $\{1, 2\}, \{3, 1\}, \{2, 3\}$) to satisfy the required order property. We prove this claim by induction on $W$.

When $W = 1$, a $(3, 1)$-GR $R$ must be one of

$$\{\{1, 2, 3\}\},\ \{\{1, 2\}, \{3\}\},\ \{\{3, 1\}, \{2\}\},\ \{\{2, 3\}, \{1\}\},\ \{\{1\}, \{2\}, \{3\}\}.$$

Clearly, $R$ already satisfies the required order property.

We assume that the claim is true for any $(3, n)$-GR with $n \le W - 1$ and show that the claim is true for any $(3, W)$-GR $R$ with $W \ge 2$.

By Lemma 1, $R$ contains a $(3, 1)$-GR or a $(3, 2)$-GR.

Case 1. $R$ contains a $(3, 1)$-GR $R_1$.

Let $R' = R - R_1$. Then, $R'$ is a $(3, W - 1)$-GR. By the induction hypothesis, $R'$ can be ordered to satisfy the order property.

If $R_1 = \{\{1, 2, 3\}\}$, without loss of generality, we assume the element 1 is the first element in the order of $R'$; then simply putting $\{1, 2, 3\}$ in front of the ordered $R'$ will result in a desired order of $R$. The case of $R_1 = \{\{1\}, \{2\}, \{3\}\}$ can be ordered similarly.

Suppose that $R_1$ contains a net of two elements. W.l.o.g., let $R_1 = \{\{1, 2\}, \{3\}\}$. If 1 or 2, say 2, is the first element in the ordered $R'$, then the first net in the ordered $R'$ is either $\{2\}$ or $\{2, 3\}$. In the first case, the last element is 1, so we can move $\{2\}$ to the end of the ordered $R'$ to obtain a new desired order with 3 in the first position. In the latter case, we can move $\{2, 3\}$ to the end of the ordered $R'$ to obtain a new desired order with 1 in the first position. Therefore, we may assume that either 1 or 3 is in the first position in the ordered $R'$.

If 1 is the first element in the ordered $R'$, we put $\{1, 2\}, \{3\}$ in front of the ordered $R'$. If 3 is the first element in the order of $R'$, we put $\{3\}, \{1, 2\}$ in front of the ordered $R'$. In either case, we obtain a desired order of $R$.

Case 2. $R$ contains no $(3, 1)$-GR.

In this case, $R$ must contain

$$R_2 = \{\{1, 2\}, \{3, 1\}, \{2, 3\}\}.$$

If $R = R_2$, the claim is clearly true; otherwise, let $R' = R - R_2$. Then, $R'$ is a $(3, W - 2)$-GR with $W - 2 \ge 1$ and it can be ordered to satisfy the order property by the induction hypothesis. Furthermore, we can assume that 1 appears first in the ordered $R'$. Now, by simply putting $\{1, 2\}, \{3, 1\}, \{2, 3\}$ in front of the ordered $R'$, a desired order of $R$ will be formed.

We have shown that $R$ can be ordered such that 1, 2, 3 appear successively in cyclic order. Without loss of generality, we assume that the ordered sequence of the $N_i$ is $N_1, N_2, \ldots, N_l$ and 123 is the first segment in the ordering. Starting from $v_{1,1}$ of the Hamilton cycle

$$v_{1,1} v_{2,1} v_{3,1} v_{1,W} v_{2,W} v_{3,W} \ldots v_{1,t} v_{2,t} v_{3,t} v_{1,t-1} v_{2,t-1} v_{3,t-1} \ldots v_{1,2} v_{2,2} v_{3,2} v_{1,1},$$

we successively cut off the section with $|N_i|$ vertices as $T(N_i)$, moving along the Hamilton cycle. Since the Hamilton cycle has $3W$ vertices and $\sum_{i=1}^{l} |N_i| = 3W$, each $T(N_i)$ is well-defined for $i = 1, \ldots, l$. Since the sequence $N_1, N_2, \ldots, N_l$ generates a sequence $1, 2, 3, 1, 2, 3, \ldots, 1, 2, 3$, $T(N_i)$ is a path, and $|V(T(N_i)) \cap V_j| = 1$ if and only if $j \in N_i$. This implies that $\{T(N_i) \mid i = 1, \ldots, l\}$ is a detailed routing of $\{N_i \mid i = 1, \ldots, l\}$ in $G(3, W)$ and, therefore, $G(3, W)$ is a $(3, W)$-HUSB. Hence, $3W \le e(3, W) \le |E(G(3, W))| = 3W$, that is, $e(3, W) = 3W$. □

Remark.
1. There may be more than one optimum $(3, W)$-HUSB for some $W$. For example, the disjoint union of a 6-cycle and a 3-cycle is also an optimum $(3, 3)$-HUSB. But, a $3W$-cycle is always an optimum $(3, W)$-HUSB by Theorem 5. We also note that an optimum $(3, 4)$-HUSB must be a cycle.
2. The proof of Theorem 5 also gives an algorithm to find a detailed routing for a $(3, W)$-GR in $G(3, W)$. It is obvious that we can rearrange the subsets of the global routing in the order of required property in polynomial time (in terms of the number of subsets). Hence, we can find a detailed routing in polynomial time.
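
The routing procedure from the proof of Theorem 5 can be sketched as follows. For brevity, the ordering step uses a small backtracking search rather than the inductive rearrangement of the proof, so this sketch does not claim the polynomial bound of Remark 2; tracks are numbered $0, \ldots, W-1$, the input is assumed to be a valid $(3, W)$-GR, and all function names are illustrative.

```python
from collections import Counter

def nxt(side):
    """Successor of a side in the cyclic order 1 -> 2 -> 3 -> 1."""
    return side % 3 + 1

def order_nets(nets, start):
    """Order the nets (as tuples) so their sides read start, start+1, ... cyclically."""
    remaining = Counter(frozenset(n) for n in nets)

    def search(side):
        if sum(remaining.values()) == 0:
            return []
        options = [(side,), (side, nxt(side)), (side, nxt(side), nxt(nxt(side)))]
        for opt in options:
            key = frozenset(opt)
            if remaining[key] > 0:
                remaining[key] -= 1
                rest = search(nxt(opt[-1]))
                remaining[key] += 1
                if rest is not None:
                    return [opt] + rest
        return None

    return search(start)

def detailed_routing(nets, W):
    """Cut the Hamilton cycle of G(3, W) into one path per net."""
    cycle = [(s, (-step) % W) for step in range(W) for s in (1, 2, 3)]
    for start in (1, 2, 3):
        order = order_nets(nets, start)
        if order is not None:
            break
    else:
        return None
    pos, paths = start - 1, []           # begin at the vertex on side `start`
    for net in order:
        paths.append([cycle[(pos + i) % (3 * W)] for i in range(len(net))])
        pos += len(net)
    return paths

print(detailed_routing([{1, 3}, {2}], 1))  # -> [[(2, 0)], [(3, 0), (1, 0)]]
```

Each returned path consists of consecutive vertices of the Hamilton cycle, so it is a path in $G(3, W)$ that meets exactly the sides of its net.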

5 A NEAR OPTIMUM $(4, W)$-HUSB

In this section, we first give a short proof for a lower bound of $e(k, W)$ and then present classes of elementary designs for $(4, W)$-HUSBs with $W = 1, \ldots, 7$. Finally, we construct a $(4, W)$-HUSB using the reduction design scheme.


5.1 Lower Bound and Elementary Designs

Theorem 6.

$$e(k, W) \ge \frac{k(k-1)}{2}\,W.$$

Proof. For each pair $i, j$ with $1 \le i < j \le k$, let $R(i, j)$ be a $(k, W)$-GR with $W$ copies of $\{i, j\}$ and $W$ copies of $\{t\}$ for each $t \in \{1, \ldots, k\} \setminus \{i, j\}$. A detailed routing of $R(i, j)$ must contain at least $W$ edges between $V_i$ and $V_j$. Therefore, a $(k, W)$-HUSB contains at least $\binom{k}{2} W$ edges. It follows that $e(k, W) \ge \frac{k(k-1)}{2} W$. □
If $R$ is a $(k, W)$-GR, then $R$ restricted to any two sides and to any three sides is a $(2, W)$-GR and a $(3, W)$-GR, respectively. By Theorems 4 and 5, a $(k, W)$-HUSB on $V_1, \ldots, V_k$ satisfies that any pair $V_i, V_j$ induces a $(2, W)$-HUSB and any triple $V_i, V_j, V_p$ induces a $(3, W)$-HUSB. Therefore, it is desirable to design a $k$-partite graph $G$ on $V_1, \ldots, V_k$ such that $G[V_i \cup V_j]$ is a perfect matching and $G[V_i \cup V_j \cup V_p]$ is a cycle. We refer to such a graph as an $MH(k, W)$-graph. The existence of $MH(k, W)$-graphs is given by the following lemma.

Fig. 4. A directed graph used to design $MH(4, W)$ graphs.

For $k \ge 2$, $W \ge 1$, let $G(k, W)$ be the $k$-partite graph with vertex set $V_1, \ldots, V_k$ and edge set

$$\bigcup_{1 \le i < j \le k} \{v_{i,p} v_{j,\,p+j-i-1} \mid p = 1, \ldots, W\},$$

where the second index is taken modulo $W$.

Lemma 7. $G(k, W)$ is an $MH(k, W)$-graph for each pair of $k$ and $W$ with $k \ge 3$, $W \ge 1$.

Proof. Clearly, for any pair $i, j$ with $1 \le i < j \le k$, the edge set induced by $V_i \cup V_j$ is $\{v_{i,p} v_{j,\,p+j-i-1} \mid p = 1, \ldots, W\}$, which is a perfect matching between $V_i$ and $V_j$. For any triple $i, j, t$ with $1 \le i < j < t \le k$, the edge set induced by $V_i \cup V_j \cup V_t$ is

$$
\begin{aligned}
F = {} & \{v_{i,p} v_{j,\,p+j-i-1} \mid p = 1, \ldots, W\} \\
& \cup \{v_{j,p} v_{t,\,p+t-j-1} \mid p = 1, \ldots, W\} \\
& \cup \{v_{i,p} v_{t,\,p+t-i-1} \mid p = 1, \ldots, W\}.
\end{aligned}
$$

We notice that $v_{i,p}\, v_{j,\,p+j-i-1}\, v_{t,\,p+t-i-2}\, v_{i,\,p-1}$ is a path. Therefore, if there is a cycle of length $3n$ starting from $v_{i,p}$ for some $p$, then $p \equiv p - n \pmod{W}$. This implies that $n = W$ and, therefore, $F$ induces a cycle of length $3W$. □
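
Lemma 7 can be confirmed for small parameters by direct computation. The sketch below assumes tracks numbered $0, \ldots, W-1$ and checks the two defining properties of an $MH(k, W)$-graph: every pair of sides induces a perfect matching and every triple induces a single cycle on $3W$ vertices. The helper names are illustrative.

```python
from itertools import combinations

def g_edges(k, W):
    """Edges of G(k, W): v_{i,p} is joined to v_{j,p+j-i-1} for i < j."""
    edges = set()
    for i, j in combinations(range(1, k + 1), 2):
        for p in range(W):
            edges.add(frozenset({(i, p), (j, (p + j - i - 1) % W)}))
    return edges

def induced(edges, sides):
    """Edges with both endpoints on the given sides."""
    return [e for e in edges if all(v[0] in sides for v in e)]

def is_perfect_matching(edges, n_vertices):
    covered = [v for e in edges for v in e]
    return len(covered) == n_vertices == len(set(covered))

def is_single_cycle(edges, n_vertices):
    """2-regular and connected on n_vertices vertices."""
    adj = {}
    for e in edges:
        u, v = tuple(e)
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    if len(adj) != n_vertices or any(len(nb) != 2 for nb in adj.values()):
        return False
    stack, seen = [next(iter(adj))], set()
    while stack:
        u = stack.pop()
        if u not in seen:
            seen.add(u)
            stack.extend(adj[u] - seen)
    return len(seen) == n_vertices

def is_mh_graph(edges, k, W):
    sides = range(1, k + 1)
    pairs_ok = all(is_perfect_matching(induced(edges, set(s)), 2 * W)
                   for s in combinations(sides, 2))
    triples_ok = all(is_single_cycle(induced(edges, set(s)), 3 * W)
                     for s in combinations(sides, 3))
    return pairs_ok and triples_ok

print(is_mh_graph(g_edges(4, 5), 4, 5))  # -> True
```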
From now on, we will focus on the design of a $(4, W)$-HUSB. We will design another class of $MH(4, W)$-graphs to show that $MH(k, W)$-graphs are not unique. Let $D_4$ be an oriented $K_4$ (we choose $D_4$ as in Fig. 4). Then, we can construct a system of inequalities for $D_4$ as follows:

$$
\begin{cases}
p(x_{12} + x_{23} + x_{31}) \not\equiv 0 \pmod{W}, \\
p(x_{12} + x_{24} + x_{41}) \not\equiv 0 \pmod{W}, \\
p(x_{23} + x_{34} - x_{24}) \not\equiv 0 \pmod{W}, \\
p(x_{34} + x_{41} - x_{31}) \not\equiv 0 \pmod{W},
\end{cases}
$$

for $0 < p < W$. It is easy to see that $x_{12} = x_{23} = x_{34} = x_{41} = 0$, $x_{31} = x_{24} = 1$ is a solution to the system of inequalities. We replace $v_i$ in $D_4$ by $V_i$ for $i = 1, 2, 3, 4$ and let $v_{i,n} \in V_i$ join $v_{j,\,n+x_{ij}} \in V_j$ to obtain a desired graph. The formal definition is the following:

Let $V_i = \{v_{i,j} \mid j = 1, \ldots, W\}$, $i = 1, \ldots, 4$, and

$$
\begin{aligned}
E = {} & \{v_{i,j} v_{i+1,j} \mid i = 1, \ldots, 4;\ j = 1, \ldots, W\} \\
& \cup \{v_{1,j} v_{3,\,j-1} \mid j = 1, \ldots, W\} \cup \{v_{2,j} v_{4,\,j+1} \mid j = 1, \ldots, W\},
\end{aligned}
$$

where the first index is taken modulo 4 and the second index is taken modulo $W$. Then, $H(4, W) = (V_1, \ldots, V_4; E)$ is an $MH(4, W)$-graph. In Fig. 5, we draw $H(4, 2)$ and $H(4, 3)$ in two different ways. The right drawing gives a clear view of the structures of the designs. It is easy to see that $H(4, W)$ is a 3-regular Hamiltonian graph. Such a drawing is also helpful in the verification for hyper-universal switch boxes.

Fig. 5. Different drawings of $H(4, 2)$ and $H(4, 3)$.

We see that $|E(H(4, W))| = 6W$ and, by Theorem 6, $e(4, W) \ge 6W$. It is desirable to have $H(4, W)$ as a $(4, W)$-HUSB. Unfortunately, an $MH(4, W)$-graph might not be a $(4, W)$-HUSB, although $MH(3, W)$-graphs are optimum $(3, W)$-HUSBs by Theorem 5, and $G(4, 2) = H(4, 2)$ is an optimum $(4, 2)$-HUSB. We can verify that $H(4, 3)$ does not accommodate a detailed routing for the global routing

$$\{\{1, 2\}, \{1, 2\}, \{3, 4\}, \{3, 4\}, \{1, 3\}, \{2, 4\}\}.$$

Even though an $MH(4, W)$-graph may not be a $(4, W)$-HUSB, it can be used as an approximation to an optimum $(4, W)$-HUSB with weaker routability. On the other hand, we can obtain hyper-universal switch boxes by combining some $MH(4, W)$-graphs and adding a few extra edges. This method can be used to design the elementary universal switch boxes used in the reduction design scheme.
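
The non-routability claim for $H(4, 3)$ can be checked exhaustively. The sketch below assumes the offsets implied by the solution $x_{31} = x_{24} = 1$, that is, $v_{3,n}$ is joined to $v_{1,n+1}$ and $v_{2,n}$ to $v_{4,n+1}$, with tracks numbered $0, \ldots, W-1$. The search handles 2-pin nets only, which suffices for this particular global routing; the function names are illustrative.

```python
def h4_edges(W):
    """Edges of H(4, W) under the offset assumption stated above."""
    edges = set()
    for j in range(W):
        for i in (1, 2, 3, 4):
            edges.add(frozenset({(i, j), (i % 4 + 1, j)}))   # x_{i,i+1} = 0
        edges.add(frozenset({(1, j), (3, (j - 1) % W)}))     # x_{31} = 1
        edges.add(frozenset({(2, j), (4, (j + 1) % W)}))     # x_{24} = 1
    return edges

def route_two_pin(edges, nets, used=frozenset()):
    """Backtracking search for vertex-disjoint edges realizing 2-pin nets."""
    if not nets:
        return []
    a, b = sorted(nets[0])
    for e in edges:
        if {v[0] for v in e} == {a, b} and not (e & used):
            rest = route_two_pin(edges, nets[1:], used | e)
            if rest is not None:
                return [e] + rest
    return None

gr = [{1, 2}, {1, 2}, {3, 4}, {3, 4}, {1, 3}, {2, 4}]
print(route_two_pin(h4_edges(3), gr))  # -> None
```

The failure has a short explanation: the two $\{1, 2\}$ nets leave the same track $a$ free on sides 1 and 2, the two $\{3, 4\}$ nets leave the same track $b$ free on sides 3 and 4, and the remaining two nets would require both $b \equiv a - 1$ and $b \equiv a + 1 \pmod 3$.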

5.2 Minimal 4-Way Global Routings


We note that, in our reduction design scheme, we need to find all minimal 4-way global routings for the detailed routing algorithm and for the generation of all $(4, r)$-GRs which are used in the verification of an elementary $(4, r)$-HUSB design. To simplify the verification and detailed routings, we can restrict the global routings considered. A global routing is called primitive if it does not contain two unequal nets of size 1. If a $(k, W)$-GR $R$ is not primitive, then we can combine the unequal nets of size 1 into nets of size 2 to obtain a primitive $(k, W)$-GR $R'$. Any detailed routing of $R'$ will induce a detailed routing of $R$ by simply deleting the edges of those one-edge trees representing the nets of size 2 in $R'$ which are obtained by combining the unequal nets of size 1 in $R$. Therefore, to verify that a $(k, W)$-SB is hyper-universal, we only need to show that each primitive $(k, W)$-GR has a detailed routing in the $(k, W)$-SB.

Fig. 6. Primitive 4-way minimal global routings.
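The reduction to primitive global routings can be sketched as follows. This is a minimal illustration, not the authors' code: a global routing is modeled as a list of sets of side labels, `make_primitive` and `density` are hypothetical names, and the greedy pairing order (most frequent singleton sides first) is an arbitrary choice, since any pairing of unequal size-1 nets yields a primitive routing.

```python
from collections import Counter


def make_primitive(nets):
    """Merge unequal size-1 nets into size-2 nets until at most one side
    still has singleton nets. Returns (primitive routing, merged nets)."""
    singles = Counter(next(iter(n)) for n in nets if len(n) == 1)
    result = [set(n) for n in nets if len(n) != 1]
    merged = []
    while len(singles) >= 2:
        # Pick two different sides that still have singleton nets.
        (a, _), (b, _) = singles.most_common(2)
        merged.append({a, b})
        for side in (a, b):
            singles[side] -= 1
            if singles[side] == 0:
                del singles[side]
    result.extend(merged)
    # Any leftover singleton nets all lie on the same side.
    result.extend({s} for s, c in singles.items() for _ in range(c))
    return result, merged


def density(nets):
    """Number of nets containing each side."""
    return Counter(s for n in nets for s in n)


if __name__ == "__main__":
    R = [{1, 2}, {1}, {2}, {3, 4}, {3}, {4}]
    R_prime, merged = make_primitive(R)
    print(R_prime, merged)
```

The per-side densities are unchanged by the merge, so $R'$ is a global routing of the same size as $R$, and the list `merged` records exactly the one-edge trees whose edges are deleted to recover a detailed routing of $R$.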
It is obvious that a primitive global routing can be decomposed into the union of primitive minimal global routings. Therefore, we only need to find all primitive 4-way MGRs and use them to generate other primitive 4-way global routings. Since $f(4) = 3$, exhaustive searches can be done. Fig. 6 gives the complete set of primitive 4-way MGRs, where $R^i_{j,h}$ denotes the $h$th $(4, i)$-MGR of type $j$.
We note that $R^3_{1,1}$ has a density of 3 and cannot be decomposed into a $(4, 2)$-GR and a $(4, 1)$-GR. Therefore, the $(4, 3)$-symmetric switch module $M_{4,3}$, a universal switch box given in [6], [12], is not hyper-universal because $M_{4,3}$ is a union of $M_{4,2}$ and $M_{4,1}$.
It can also be seen that the densities of minimal 4-way global routings are 1, 2, and 3. Therefore, a $(4, W)$-GR can be decomposed into a union of minimal 4-way global routings of densities 1, 2, or 3. This fact will be used in our design.

5.3 The Design Procedure


Now, we are ready to apply the technique proposed in Section 3 to design a class of $(4, W)$-HUSBs.
Step I. By Lemma 1, we have $f(4) = 3$. Fig. 6 provides all primitive 4-way MGRs.
Step II. $p_k = 6$ as $\{r_1, r_2, r_3\} = \{1, 2, 3\}$. The integer $r$ could be 1, 2, 3, 4, 5, and 7. The value $r = 7$ is needed because, for $m = 6h + 1$, we may not be able to decompose every $(4, m)$-GR into $h$ $(4, 6)$-GRs together with a $(4, 1)$-GR, but we can always decompose a $(4, m)$-GR into $h - 1$ $(4, 6)$-GRs together with a $(4, 7)$-GR.
Step III. Let $G_1 = K_4$, $G_2 = H(4, 2)$, and let $G_3$, $G_4$, $G_5$, $G_6$, and $G_7$ be as in Fig. 7. Then, we have $|E(G_1)| = 6$, $|E(G_2)| = 12$, $|E(G_3)| = 20$, $|E(G_4)| = 26$, $|E(G_5)| = 32$, $|E(G_6)| = 40$, and $|E(G_7)| = 46$. We note that $G_3$ contains vertex disjoint subgraphs $G_1$ and $G_2$; $G_4$ contains two vertex disjoint $G_2$s; $G_5$ contains vertex disjoint subgraphs $G_2$ and $G_3$; $G_6$ contains three vertex disjoint $G_2$s; and $G_7$ contains vertex disjoint subgraphs $G_3$ and $G_4$. We have the following lemma.
Lemma 8. $G_i$ is a $(4, i)$-HUSB for $1 \le i \le 7$.
Proof. Let $R$ be a primitive $(4, i)$-GR. Then, $R$ is a disjoint union of $R^j_{h,p}$s defined above. It is easy to verify that $G_i$ is a $(4, i)$-HUSB for $i = 1$ and 2.
For $i = 3$, if $R = R^3_1$ or $R^3_{h,p}$ for some $h, p$, we can verify that $G_3$ contains a detailed routing of $R$. If $R$ is a union of a $(4, 1)$-GR and a $(4, 2)$-GR, then $G_3$ contains a detailed routing of $R$ since $G_3$ contains vertex disjoint subgraphs $G_1$ and $G_2$.
Let $i = 4$. If $R$ is a union of a $(4, 1)$-GR and a minimal $(4, 3)$-GR, then we can verify that $G_4$ contains a detailed routing of $R$. Suppose $R$ is a union of two $(4, 2)$-GRs. In this case, $G_4$ contains a detailed routing of $R$ as $G_4$ contains two vertex disjoint $G_2$s.
For $i = 5$, $R$ can always be decomposed into a $(4, 2)$-GR and a $(4, 3)$-GR. It is easy to see that $G_5$ contains a detailed routing for $R$ since $G_5$ contains vertex disjoint $G_2$ and $G_3$.
Let $i = 6$. If $R$ is a union of three $(4, 2)$-GRs, then $G_6$ contains a detailed routing of $R$ as $G_6$ contains three vertex disjoint $G_2$s. If $R$ is a union of two $(4, 3)$-GRs, then we can verify that $G_6$ contains a detailed routing of $R$.
Finally, let $i = 7$. In this case, $R$ can always be decomposed into a $(4, 3)$-GR and a $(4, 4)$-GR. $G_7$ contains a detailed routing of $R$ since $G_7$ contains vertex disjoint subgraphs $G_3$ and $G_4$. $\square$

Fig. 7. The elementary hyper-universal $(4, W)$-HUSBs.
Step IV. Construct a $(4, W)$-HUSB by combining the elementary hyper-universal designs $G_i$, $i = 1, 2, \ldots, 7$. Define $F(W)$ as the following graph:
\[
F(W) =
\begin{cases}
\text{disjoint union of } h\ G_6\text{s} & \text{if } W = 6h;\\
\text{disjoint union of } (h - 1)\ G_6\text{s and a } G_7 & \text{if } W = 6h + 1;\\
\text{disjoint union of } h\ G_6\text{s and a } G_2 & \text{if } W = 6h + 2;\\
\text{disjoint union of } h\ G_6\text{s and a } G_3 & \text{if } W = 6h + 3;\\
\text{disjoint union of } h\ G_6\text{s and a } G_4 & \text{if } W = 6h + 4;\\
\text{disjoint union of } h\ G_6\text{s and a } G_5 & \text{if } W = 6h + 5.
\end{cases}
\]
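The case analysis defining $F(W)$ amounts to a division by 6 with one special remainder. A minimal sketch, in which `f_components` is a hypothetical helper returning the indices $i$ of the elementary boxes $G_i$ that make up $F(W)$:

```python
def f_components(W):
    """Indices i of the elementary boxes G_i whose disjoint union is F(W)."""
    if W < 2:
        raise ValueError("F(W) is defined here for W > 1")
    h, i = divmod(W, 6)
    if i == 1:
        # W = 6h + 1: use (h - 1) copies of G_6 and one G_7 instead of a G_1.
        return [6] * (h - 1) + [7]
    # Otherwise h copies of G_6 plus one G_i (nothing extra when i = 0).
    return [6] * h + ([i] if i else [])


if __name__ == "__main__":
    for W in (2, 6, 7, 13, 17):
        print(W, f_components(W))
```

The indices always sum to $W$, so the components together provide exactly $W$ terminals on each side.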

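By the definition of $G_i$, $i = 1, \ldots, 7$, it is easy to see that the number of edges of $F(W)$, $W > 1$, is
\[
|F(W)| =
\begin{cases}
\frac{20}{3} W & \text{if } W \equiv 0 \pmod 6;\\
\frac{20}{3} W - \frac{2}{3} & \text{if } W \equiv 1 \pmod 6;\\
\frac{20}{3} W - \frac{4}{3} & \text{if } W \equiv 2 \pmod 6;\\
\frac{20}{3} W & \text{if } W \equiv 3 \pmod 6;\\
\frac{20}{3} W - \frac{2}{3} & \text{if } W \equiv 4 \pmod 6;\\
\frac{20}{3} W - \frac{4}{3} & \text{if } W \equiv 5 \pmod 6.
\end{cases}
\]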
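The closed form for $|F(W)|$ can be checked against a direct sum over the components of $F(W)$. The sketch below uses only the edge counts $|E(G_i)|$ quoted in Step III; `f_edges` and `closed_form` are illustrative names, and exact rational arithmetic is used to avoid rounding.

```python
from fractions import Fraction

EDGES = {2: 12, 3: 20, 4: 26, 5: 32, 6: 40, 7: 46}  # |E(G_i)| from Step III


def f_edges(W):
    """Edge count of F(W), summed over its elementary boxes (W > 1)."""
    h, i = divmod(W, 6)
    if i == 1:
        return (h - 1) * EDGES[6] + EDGES[7]
    return h * EDGES[6] + (EDGES[i] if i else 0)


def closed_form(W):
    """(20/3) W minus a correction of 0, 2/3 or 4/3 depending on W mod 6."""
    correction = {0: 0, 1: 2, 2: 4, 3: 0, 4: 2, 5: 4}[W % 6]
    return Fraction(20 * W - correction, 3)


if __name__ == "__main__":
    for W in range(2, 14):
        print(W, f_edges(W), closed_form(W))
```

For example, $W = 7$ gives a single $G_7$ with 46 edges, and $\frac{20}{3} \cdot 7 - \frac{2}{3} = 46$.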

Theorem 9. For $W > 1$, $F(W)$ is a $(4, W)$-HUSB.
Proof. If $W = 6h + 1$, then $h \ge 1$ and any $(4, 6h + 1)$-GR $R$ can be decomposed into a union of $h - 1$ $(4, 6)$-GRs and a $(4, 7)$-GR. (Sometimes, $R$ can be decomposed into $h$ $(4, 6)$-GRs and a $(4, 1)$-GR, but this does not always happen.) Now, it is easy to see that $F(6h + 1)$ contains a detailed routing of $R$.
Let $W = 6h + i$, where $i \ne 1$. Since the densities of minimal 4-way global routings are 1, 2, or 3, any $(4, 6h + i)$-GR $R$ can be decomposed into $h$ $(4, 6)$-GRs and a $(4, i)$-GR. Since $F(6h + i)$ is a disjoint union of $h$ $G_6$s and a $G_i$, and $G_6$ and $G_i$ are a $(4, 6)$-HUSB and a $(4, i)$-HUSB, respectively, by Lemma 8, $F(6h + i)$ contains a detailed routing of $R$. The proof of the theorem is complete. $\square$
Remark.
1. We note that we can apply the algorithm in Step IV of the reduction design technique in Section 3 to $k = 4$ to obtain an efficient detailed routing algorithm for any 4-way global routing in $F(W)$.
2. We see that $e(4, 1) = 6$ and $e(4, 2) = 12$, but the exact value of $e(4, W)$ with $W \ge 3$ is still unknown. From the results of Theorems 6 and 9, we have that $6W \le e(4, W) \le 6.\overline{6}\,W < 6.7W$.
3. The maximum degree of $F(W)$ is 4 for $W \ge 3$; therefore, the flexibility of the design is $F_s = 4$. We conjecture that no hyper-universal 4-way switch box has flexibility less than 4 when $W \ge 4$.
4. The switch box design technique described in Section 3 could also be applied to universal switch box designs, with global routings restricted to 2-bounded global routings. For example, to design 4-way USBs, we see that the only minimal 2-bounded 4-way global routings are of densities 1 and 2. Therefore, $p_4 = 2$. We can choose $r = 1$, as any 2-bounded $(4, W)$-GR is a union of $\lfloor W/2 \rfloor$ $(4, 2)$-GRs, together with $W - 2\lfloor W/2 \rfloor$ $(4, 1)$-GRs. Therefore,
\[
U(W) =
\begin{cases}
\text{disjoint union of } h\ G_2\text{s} & \text{if } W = 2h;\\
\text{disjoint union of } h\ G_2\text{s and a } G_1 & \text{if } W = 2h + 1;
\end{cases}
\]
is a universal $(4, W)$-SB and, moreover, it is optimum because $|U(W)| = 6W$ is a lower bound for universal $(4, W)$-SBs. In fact, $U(W)$ is isomorphic to the symmetric switch modules $M_{4,W}$ given in [6], [12].
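The switch counts in the remarks can be compared numerically. The following sketch is an illustration using only the edge counts stated in the text ($|E(G_1)| = 6$, $|E(G_2)| = 12$, and the counts for $G_3$ to $G_7$ from Step III); `u_edges` and `f_edges` are hypothetical helper names.

```python
def u_edges(W):
    """Edges of U(W): floor(W/2) copies of G_2, plus one G_1 when W is odd."""
    h, odd = divmod(W, 2)
    return 12 * h + 6 * odd


def f_edges(W):
    """Edges of F(W) for W > 1, using |E(G_i)| from Step III."""
    sizes = {0: 0, 2: 12, 3: 20, 4: 26, 5: 32}
    h, i = divmod(W, 6)
    if i == 1:
        return 40 * (h - 1) + 46
    return 40 * h + sizes[i]


if __name__ == "__main__":
    for W in range(2, 14):
        overhead = f_edges(W) - u_edges(W)
        print(W, u_edges(W), f_edges(W), overhead)
```

Since $|U(W)| = 6W$ and $|F(W)| \le \frac{20}{3} W$, the hyper-universal design $F(W)$ uses at most $\frac{10}{9}$ times as many switches as the optimum universal design, that is, an overhead of at most one ninth.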

CONCLUSIONS

We have studied the general $k$-way hyper-universal switch module for multipin net routings of FPGAs and obtained several results. First, we have generalized the previously known universal switch modules to hyper-universal switch modules by also including multipin net routing cases. Second, we have formulated precise combinatorial abstractions for global and detailed routings through switch boxes, which makes it possible to apply theories and techniques from graph theory and combinatorics to this complicated problem. Third, as a result of our modeling and analysis, we have found a powerful decomposition property of the problem, which, in turn, is applied in developing a simple yet powerful reduction scheme that can produce hyper-universal $(k, W)$-switch boxes with a low $O(W)$ number of switches for any fixed $k$. Finally, our design methodology at the same time yields efficient detailed routing algorithms. For illustration, we give the first known results of optimum 2-way and 3-way hyper-universal switch boxes, and some 4-way hyper-universal switch boxes with numbers of switches very close to that of a non-hyper-universal commercial model.

ACKNOWLEDGMENTS
This research was partially supported by the Natural
Sciences and Engineering Research Council of Canada.

REFERENCES
[1] M.J. Alexander and G. Robins, "New Performance FPGA Routing Algorithms," Proc. Design Automation Conf., pp. 562-567, 1995.
[2] V. Betz, J. Rose, and A. Marquardt, Architecture and CAD for Deep-Submicron FPGAs. Boston: Kluwer Academic, 1999.
[3] J.A. Bondy and U.S.R. Murty, Graph Theory with Applications. London: Macmillan Press, 1976.
[4] S. Brown, R.J. Francis, J. Rose, and Z.G. Vranesic, Field-Programmable Gate Arrays. Boston: Kluwer Academic, 1992.
[5] S. Brown, J. Rose, and Z.G. Vranesic, "A Detailed Router for Field-Programmable Gate Arrays," IEEE Trans. Computer-Aided Design, vol. 11, pp. 620-628, May 1992.
[6] Y.W. Chang, D.F. Wong, and C.K. Wong, "Universal Switch Modules for FPGA Design," ACM Trans. Design Automation of Electronic Systems, vol. 1, no. 1, pp. 80-101, Jan. 1996.
[7] A. Corp, The Maximalist Handbook, 1990.
[8] H. Fan, P. Haxell, and J. Liu, "The Global Routing - A Combinatorial Design Problem," submitted.
[9] Y.S. Lee and C.H. Wu, "A Performance and Routability Driven Router for FPGAs Considering Path Delays," Proc. Design Automation Conf., pp. 557-561, 1995.
[10] J.F. Pan, Y.L. Wu, G. Yan, and C.K. Wong, "On the Optimal Four-Way Switch Box Routing Structures of FPGA Greedy Routing Architectures," Integration, the VLSI J., vol. 25, pp. 137-159, 1998.
[11] J. Rose and S. Brown, "Flexibility of Interconnection Structures for Field-Programmable Gate Arrays," IEEE J. Solid-State Circuits, vol. 26, no. 3, pp. 277-282, 1991.
[12] M. Shyu, G.M. Wu, Y.D. Chang, and Y.W. Chang, "Generic Universal Switch Blocks," IEEE Trans. Computers, vol. 49, no. 4, pp. 348-359, Apr. 2000.
[13] Y.L. Wu and D. Chang, "On NP-Completeness of 2-D FPGA Routing Architectures and a Novel Solution," Proc. Int'l Conf. Computer-Aided-Design, pp. 362-366, 1994.
[14] Y.L. Wu and M. Marek-Sadowska, "Routing for Array Type FPGAs," IEEE Trans. Computer-Aided Design of Integrated Circuits and Systems, vol. 16, no. 5, pp. 506-518, May 1997.
[15] Y.L. Wu, S. Tsukiyama, and M. Marek-Sadowska, "On Computational Complexity of a Detailed Routing Problem in Two-Dimensional FPGAs," Proc. Fourth Great Lakes Symp. VLSI, Mar. 1994.
[16] Y.L. Wu, S. Tsukiyama, and M. Marek-Sadowska, "Graph Based Analysis of 2-D FPGA Routing," IEEE Trans. Computer-Aided Design, vol. 15, no. 1, pp. 33-44, 1996.
[17] Xilinx, Inc., The Programmable Logic Data Book, 1994.

Hongbing Fan received the BS degree in mathematics from Shandong University in 1982 and the PhD degree in operational research and control theory from the Institute of Mathematics at the same university in 1990. He joined the Mathematics Department at Shandong University in 1990 and served as an associate professor. He worked with the Department of Computer Science and Engineering at the Chinese University as a research associate in 1998. He is currently studying and doing research with the Department of Computer Science at the University of Victoria. His research interests are combinatorial algorithms and complexity, graph theory, and various topics in VLSI design and operations research.
Jiping Liu received the BS degree and MS
degree in mathematics from Shandong University
in 1982 and 1986, respectively, and the PhD
degree in combinatorics and graph theory from
Simon Fraser University in 1992. He worked as a
lecturer at Shandong University from 1982 to
1987. He held various positions as a postdoctoral
fellow and a research associate before he joined
the University of Lethbridge in 1995. He is now an
associate professor of mathematics and computer science. His research interests are in various optimum design
problems from VLSI, algorithms and complexities, graph theory, including
graph decompositions, factorizations, colorings, graph domination, and
Hamiltonian properties of certain graphs.
Yu-Liang (David) Wu received the BS and
MS degrees in computer science from Florida
International University of Miami in 1983 and
1984, respectively. He received the PhD
degree in electrical and computer engineering
from the University of California at Santa
Barbara in 1994. He worked at Internet
Systems Corp., AT&T Bell Labs, Amdahl
Corporation, and Cadence Design Systems
Inc. before he joined the Chinese University
of Hong Kong in January 1996. His current research interests
mostly relate to optimization of logic and physical design automation
of VLSI circuits and FPGA related CAD tool designs and
architectural analysis/optimization. He is a member of the IEEE.
