I. INTRODUCTION
Efficient user scheduling methods for linearly precoded
multiuser MIMO channels with multiple antennas at the base
station and multiple antennas at each user have captured much
research attention recently. It is known that the capacity of
a multiuser MIMO broadcast channel (BC) can be achieved
through the use of dirty paper coding (DPC) [1], [2]. However,
DPC is highly nonlinear and very complex to implement in
practice. Therefore, reduced-complexity precoding methods
that mitigate multiuser interference (MUI) are of interest. Such methods include zero-forcing beamforming
(ZFB) [3] for systems with single-antenna users, and block
diagonalization (BD) [4] and successive zero-forcing (SZF)
[5] for systems with multiple-antenna users.
In particular, BD is a technique that completely nulls the
interference between users. However, this nulling operation
imposes a constraint that the total number of receive antennas
be no larger than the number of transmit antennas. This also
yields a reduction in the performance and number of users that
Our work made use of the infrastructure and computational resources
of AICT (Academic Information and Communication Technologies) at the
University of Alberta. The authors also gratefully acknowledge funding
for this research provided by TRLabs, Huawei Technologies, the Rohit
Sharma Professorship, the Natural Sciences and Engineering Research Council
(NSERC) of Canada, the Alberta Informatics Circle of Research Excellence
(iCORE), and the Alberta Ingenuity Fund.
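[Equation (1), received signal of user $k$; surviving terms: $\sum_{j=1} W_j s_j + n_k$.]
[Equation (2), received signal under SZF; surviving terms: separate sums over $i > j$ and $i < j$ of $W_{(i)} s_{(i)}$.]
[Equation (3), the aggregate channel matrix $\bar{H}_{j-1}$.]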
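SZF of $K$ users' channels is possible$^1$ if $M_T > \mathrm{rank}(\bar{H}_{K-1})$.
Let us denote the SVD of (3) as
$$\bar{H}_{j-1} = \bar{U}_{j-1} \bar{\Sigma}_{j-1} \bar{V}_{j-1}^H = \bar{U}_{j-1} \bar{\Sigma}_{j-1} [\bar{V}^1_{j-1} \;\; \bar{V}^0_{j-1}]^H \qquad (4)$$
where $\bar{V}^0_{j-1} \in \mathbb{C}^{M_T \times (M_T - \mathrm{rank}(\bar{H}_{j-1}))}$ holds the $M_T - \mathrm{rank}(\bar{H}_{j-1})$
right singular vectors defining the null-space basis of $\bar{H}_{j-1}$.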
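As a minimal sketch of how the null-space basis in (4) might be computed numerically, the NumPy snippet below takes a full SVD of a stacked channel matrix and keeps the right singular vectors beyond the numerical rank. The function name `null_space_basis`, the tolerance, and the antenna counts are illustrative assumptions, not values from the paper.

```python
import numpy as np

def null_space_basis(H_bar, tol=1e-10):
    """Return orthonormal columns spanning the null space of H_bar."""
    # Full SVD: H_bar = U @ diag(s) @ Vh, where Vh is M_T x M_T.
    _, s, Vh = np.linalg.svd(H_bar, full_matrices=True)
    # Numerical rank: singular values above a relative tolerance (assumed).
    rank = int(np.sum(s > tol * s.max())) if s.size else 0
    # Rows of Vh past the rank are the conjugated null-space vectors.
    return Vh[rank:].conj().T

rng = np.random.default_rng(0)
M_T, N = 6, 2  # hypothetical: 6 transmit antennas, 2 antennas per user
H1 = rng.standard_normal((N, M_T)) + 1j * rng.standard_normal((N, M_T))
H2 = rng.standard_normal((N, M_T)) + 1j * rng.standard_normal((N, M_T))
H_bar = np.vstack([H1, H2])    # aggregate channel of the first two users
V0 = null_space_basis(H_bar)   # M_T x (M_T - rank) null-space basis
residual = np.linalg.norm(H_bar @ V0)
```

Any precoder built as `V0 @ X` then produces no signal at the users whose channels are stacked in `H_bar`, which is the property the subspace constraint relies on.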
The precoding matrix of the $j$th user, $W_{(j)}$, is constrained to
lie in the subspace defined by $\bar{V}^0_{j-1}$. Hence, the third term in
(2) is canceled by the subspace constraint on the design of the
precoding matrices for users $i > j$. Then, (2) reduces to
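[Equation (5) and the expressions that follow it are not recoverable from the source. Surviving fragments: a sum over $i < j$ in (5), a power constraint involving $\mathrm{Tr}(Q_j)$ and $P$, and an index $i = 1, 2, \ldots, K!$.]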
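TABLE I
SIMPLIFIED GREEDY USER SCHEDULING ALGORITHM FOR SZF
1. $i \leftarrow 1$; $U = \{1, 2, \ldots, K\}$; $U_s = \{\}$.
   Select a user $u_1$ such that $u_1 = \arg\max_{k \in U} \|H_k\|_F^2$.
[b.] Find $U_i = \{ k \in U_{i-1},\ k \notin U_s \mid \|H_k \bar{V}^1_i\|_F / (\|H_k\|_F \|\bar{V}^1_i\|_F) < [\ldots] \}$.
   If $|U_i| = 0$, [...]
   Select a user such that
   $u_i = \arg\max_{k \in U} \|H_k \bar{V}^0_i\|_F^2$ if $i = 2$,
   $u_i = \arg\max_{k \in U} [\ldots]$ otherwise, where the lost expression involves $\|H_k \bar{V}^0_i\|_F^2$ and $\sum_{j=2}^{i-1} \|H_k \bar{V}^0_j\|_F^2$.

[Figure: example chromosomes in panels (a)-(d), written as head || tail, with the head labeled "Scheduled users" and the tail labeled "Encoding order": 1000000010 || 01, 1000000000 || 10, 1000000100 || 10, 1000000100 || 10, 0101000000 || 10, 0101000010 || 01, 1101000010 || 00, 1001000000 || 10.]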
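The greedy idea can be illustrated with a deliberately simplified sketch. It assumes that the first user is the one with the largest squared Frobenius norm and that each later user maximizes the squared Frobenius norm of its channel projected onto the null space of the users already selected. The thresholding step and the modified metric for later iterations in Table I are omitted, so this is not the paper's exact algorithm; `greedy_schedule` and `null_space_basis` are hypothetical names.

```python
import numpy as np

def null_space_basis(H_bar, tol=1e-10):
    """Orthonormal basis of the null space of H_bar via a full SVD."""
    _, s, Vh = np.linalg.svd(H_bar, full_matrices=True)
    rank = int(np.sum(s > tol * s.max()))
    return Vh[rank:].conj().T

def greedy_schedule(channels, max_users):
    """Simplified greedy selection (assumed metric, no threshold step)."""
    remaining = set(range(len(channels)))
    # First user: largest squared Frobenius norm of its own channel.
    first = max(remaining,
                key=lambda k: np.linalg.norm(channels[k], "fro") ** 2)
    selected = [first]
    remaining.discard(first)
    while len(selected) < max_users and remaining:
        # Null space of the aggregate channel of the selected users.
        V0 = null_space_basis(np.vstack([channels[k] for k in selected]))
        if V0.shape[1] == 0:
            break  # no spatial dimensions left for another user
        nxt = max(remaining,
                  key=lambda k: np.linalg.norm(channels[k] @ V0, "fro") ** 2)
        selected.append(nxt)
        remaining.discard(nxt)
    return selected

rng = np.random.default_rng(4)
K, N, M_T = 8, 2, 4  # hypothetical system size
channels = [rng.standard_normal((N, M_T)) + 1j * rng.standard_normal((N, M_T))
            for _ in range(K)]
chosen = greedy_schedule(channels, max_users=2)
```

The returned list doubles as an encoding order in this sketch, since each user is chosen relative to the users picked before it.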
B. Genetic Algorithm
Genetic algorithms to some degree mimic breeding in biological systems. Potential solutions to an optimization problem
are encoded in a set, or population, of data structures known
as chromosomes. The chromosomes crossbreed, mutate, and
evolve towards the optimal solution over several iterations, or
generations, of the algorithm. The fittest chromosomes,
as defined by the value of the utility function for the solution
they represent, are the most likely to pass on their solution
parameters to the next generation. In the case of scheduling,
those parameters include which users to schedule and the order
in which to encode the users' data. The utility function for
user scheduling with the GA in this work is the achievable
sum rate of SZF (7). The operation of the genetic algorithm
is described briefly below. More details of the operation are
included in [10].
Initialization: A set of $N_p$ chromosomes is initialized at
random. Each chromosome consists of two parts: the head
of the chromosome is a $K$-bit vector that indicates which
users are scheduled, and the tail contains $K_0 \log_2(K_0)$
bits indicating the encoding order of the scheduled users.
A 1 in position $k$ of the head denotes that user $k$ is
scheduled, and a 0 that it is not; the head is constrained
to have between 1 and $K_0$ 1s, since at most $K_0$ users can
be scheduled simultaneously. The $n$th group of $\log_2(K_0)$
bits in the tail denotes the relative encoding order of the $n$th
scheduled user (i.e., the $n$th 1 in the head); each of these
groups must have a unique value.
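The head/tail encoding described above can be made concrete with a small decoder. This is a sketch under stated assumptions: chromosomes are represented as strings of "0" and "1", $K_0$ is a power of two so that $\log_2(K_0)$ is an integer, and users are numbered from 1. The function name `decode_chromosome` is hypothetical.

```python
from math import log2

def decode_chromosome(head, tail, k0):
    """Return the scheduled users (1-indexed) sorted by encoding order."""
    bits_per_user = int(log2(k0))  # assumes k0 is a power of two
    # Position k of the head holds a "1" if user k is scheduled.
    users = [k + 1 for k, bit in enumerate(head) if bit == "1"]
    if not 1 <= len(users) <= k0:
        raise ValueError("head must contain between 1 and k0 ones")
    ranks = []
    for n in range(len(users)):
        # The nth tail group belongs to the nth scheduled user.
        group = tail[n * bits_per_user:(n + 1) * bits_per_user]
        ranks.append(int(group, 2) if group else 0)
    if len(set(ranks)) != len(ranks):
        raise ValueError("encoding-order values must be unique")
    return [user for _, user in sorted(zip(ranks, users))]

order_a = decode_chromosome("1000000010", "01", k0=2)
order_b = decode_chromosome("1000000010", "10", k0=2)
```

With $K = 10$ and $K_0 = 2$, both calls schedule users 1 and 9; the first encodes user 1 before user 9, and the second reverses that order.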
Selection: Two chromosomes are selected from the population
with probability $p_i = G_i / \sum_n G_n$, where $G_i$ is the utility function value of the solution represented by chromosome
$i$, i.e., its fitness. These chromosomes are known as parents.
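Fitness-proportional selection of this kind is often called roulette-wheel selection. The sketch below uses only the Python standard library; `select_parents` is a hypothetical name, and drawing the two parents independently (so the same chromosome may be picked twice) is an assumption the text does not settle.

```python
import random

def select_parents(population, fitness, rng=random):
    """Pick two parents with probability p_i = G_i / sum_n G_n."""
    total = sum(fitness)
    weights = [g / total for g in fitness]
    # random.choices draws with replacement according to the weights.
    return rng.choices(population, weights=weights, k=2)

rng = random.Random(0)
population = ["1000000010|01", "0101000000|10"]  # hypothetical chromosomes
fitness = [1.0, 3.0]                             # hypothetical sum rates
draws = [select_parents(population, fitness, rng) for _ in range(5000)]
share = sum(pair.count(population[1]) for pair in draws) / (2 * len(draws))
```

Here `share` estimates how often the fitter chromosome is drawn, which should approach 3/4 for the fitness values chosen.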
Breeding: A uniformly random position is defined within
the chromosome, and then the two selected parents swap all
bits after that point to form two child chromosomes. This
crossover operation occurs with probability $p_c = 1$. Next, the
children undergo the mutation operation, for which each bit
in the children has a probability $p_m = 1/(\beta_1 + \beta_2 \sigma_G / \mu_G)$ of
being toggled, where $\mu_G$ and $\sigma_G$ are the mean and standard
deviation of the current population's fitness before selection,
and $\beta_1$ and $\beta_2$ can be chosen anywhere on the line segment
$\beta_1 + 0.15 \beta_2 = (K K_0 / 7.5773)^{1.2071}$, $\beta_1 \geq 1.1$, $\beta_2 \geq 3$, as
adapted from [12]. Finally, if the child chromosomes represent solutions that violate the constraints (i.e., too many or too few
users scheduled, or non-unique encoding order values), the
chromosome is corrected to meet the constraints. Generally,
1s are toggled at random in the head until the number of
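The crossover and mutation operations in the breeding step can be sketched as follows, using strings of "0" and "1" as chromosomes. The mutation probability `p_m` is passed in rather than computed, the cut point is restricted so that both parents contribute at least one bit, and the constraint-repair step is left out because its description is incomplete here; all names are illustrative.

```python
import random

def crossover(parent_a, parent_b, rng):
    """Single-point crossover: swap all bits after a random position."""
    # Cut point chosen so each child keeps at least one bit of each parent
    # (an assumption; the text only asks for a uniformly random position).
    point = rng.randrange(1, len(parent_a))
    child_a = parent_a[:point] + parent_b[point:]
    child_b = parent_b[:point] + parent_a[point:]
    return child_a, child_b

def mutate(chromosome, p_m, rng):
    """Toggle each bit independently with probability p_m."""
    flipped = {"0": "1", "1": "0"}
    return "".join(flipped[b] if rng.random() < p_m else b
                   for b in chromosome)

rng = random.Random(0)
parent_a = "100000001001"  # hypothetical head (10 bits) + tail (2 bits)
parent_b = "010100000010"
child_a, child_b = crossover(parent_a, parent_b, rng)
mutant = mutate(child_a, p_m=0.05, rng=rng)
```

Because crossover only exchanges bits between the parents, each bit position of the two children holds the same pair of values as in the two parents; any new bit values come from mutation.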
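[Equation (9) is only partially recoverable: $4KM_TN + \sum_{i=2} [\,16M_TN(i-1) + 32M_TN(i-1)\,]$, with the exponents and the upper summation limit lost.]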
by $G_i = H_i (I + \sum_{j \neq i} H_j^H P_j H_j)^{-1/2}$, which involves matrix
additions and multiplications, and an inverse square root. The
block-diagonal matrix formed by these $G_i$ is waterfilled in
order to obtain covariance matrices $S_i$, which are in turn
used to update each $P_i$ for the next iteration. Of all the
calculations, the most complex is the inverse square root
of an $M_T \times M_T$ matrix for each of the $K_0$ users during
each iteration. Thus, the complexity of Step 1 is $O(K_0 M_T^3)$
flops per iteration, multiplied by the number of iterations required for the
algorithm to converge. From the figures in [11] and from our
own simulations, 3-5 iterations are generally enough for the
algorithm to converge to less than 1% error in the DPC sum
rate, which is sufficient for scheduling purposes. Hence, the
overall complexity of Step 1 is $O(K_0 M_T^3)$.
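The effective-channel computation that dominates Step 1 can be sketched in NumPy as below. The inverse square root is taken through an eigendecomposition of the Hermitian matrix, which is one common choice and not necessarily the one the authors used; the waterfilling update itself is omitted. Function names and the system size are assumptions for illustration.

```python
import numpy as np

def inv_sqrt_hermitian(M):
    """Inverse square root of a Hermitian positive-definite matrix."""
    w, U = np.linalg.eigh(M)
    # U diag(w^(-1/2)) U^H, scaling the columns of U by 1/sqrt(w).
    return (U / np.sqrt(w)) @ U.conj().T

def effective_channels(H, P):
    """G_i = H_i (I + sum_{j != i} H_j^H P_j H_j)^(-1/2) for every user."""
    M_T = H[0].shape[1]
    terms = [Hj.conj().T @ Pj @ Hj for Hj, Pj in zip(H, P)]
    total = sum(terms)
    # Subtracting user i's own term leaves the sum over j != i.
    return [Hi @ inv_sqrt_hermitian(np.eye(M_T) + total - t)
            for Hi, t in zip(H, terms)]

rng = np.random.default_rng(1)
K, N, M_T = 3, 2, 4  # hypothetical: 3 users, 2 receive and 4 transmit antennas
H = [rng.standard_normal((N, M_T)) + 1j * rng.standard_normal((N, M_T))
     for _ in range(K)]
P = [np.eye(N) for _ in range(K)]  # placeholder MAC covariance matrices
G = effective_channels(H, P)
```

Each call performs one $M_T \times M_T$ eigendecomposition per user, which is the cubic-cost operation counted in the complexity estimate.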
Step 2: The DPC covariance matrices $\Sigma_j$ are determined
successively (assuming user 1 is encoded first on the MAC) by calculating
$A_j = I + H_j (\sum_{i=1}^{j-1} \Sigma_i) H_j^H$,
$B_j = I + \sum_{i=j+1}^{K_0} H_i^H P_i H_i$, and
$\Sigma_j = B_j^{-1/2} F_j G_j^H A_j^{1/2} P_j A_j^{1/2} G_j F_j^H B_j^{-1/2}$. This involves
matrix sums and multiplications, square roots, and an SVD
to find $F_j$ and $G_j$ via $B_j^{-1/2} H_j^H A_j^{-1/2} = F_j \Lambda_j G_j^H$. As with
Step 1, the most complex operation is the inverse square
root of the $M_T \times M_T$ matrices $B_j$ for $K_0$ users. Thus, the
complexity of Step 2 is also $O(K_0 M_T^3)$.
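The successive transformation in Step 2 can be sketched as follows, assuming all users have the same number of antennas and that a thin SVD is used so that the middle factor is square. Matrix powers are computed through an eigendecomposition; the helper names `herm_power` and `mac_to_bc` and the random test data are illustrative assumptions.

```python
import numpy as np

def herm_power(M, p):
    """M**p for a Hermitian positive-definite matrix M."""
    w, U = np.linalg.eigh(M)
    return (U * w ** p) @ U.conj().T

def mac_to_bc(H, P):
    """Map MAC covariances P_j to downlink covariances Sigma_j."""
    K = len(H)
    N, M_T = H[0].shape
    zero = np.zeros((M_T, M_T), dtype=complex)
    Sigma = []
    for j in range(K):
        # A_j uses the Sigma_i already found; B_j uses the later P_i.
        A = np.eye(N) + H[j] @ sum(Sigma, zero) @ H[j].conj().T
        B = np.eye(M_T) + sum((H[i].conj().T @ P[i] @ H[i]
                               for i in range(j + 1, K)), zero)
        B_inv_half = herm_power(B, -0.5)
        E = B_inv_half @ H[j].conj().T @ herm_power(A, -0.5)
        # Thin SVD: E = F @ diag(s) @ Gh, with Gh playing the role of G^H.
        F, _, Gh = np.linalg.svd(E, full_matrices=False)
        X = herm_power(A, 0.5) @ P[j] @ herm_power(A, 0.5)
        Sigma.append(B_inv_half @ F @ Gh @ X @ Gh.conj().T
                     @ F.conj().T @ B_inv_half)
    return Sigma

rng = np.random.default_rng(2)
K, N, M_T = 3, 2, 4  # hypothetical system size
H = [rng.standard_normal((N, M_T)) + 1j * rng.standard_normal((N, M_T))
     for _ in range(K)]
P = []
for _ in range(K):
    X0 = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))
    P.append(X0 @ X0.conj().T)  # random positive semidefinite covariance
Sigma = mac_to_bc(H, P)
```

A useful sanity check on such a sketch is that the sum rate computed from the $P_j$ on the MAC matches the sum rate computed from the resulting $\Sigma_j$ on the downlink.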
Step 3: To convert the DPC matrices $\Sigma_j$ to SZF covariance
matrices $Q_j$, for each user $j$, the null space basis vectors
$\bar{V}^0_j$ for the aggregate channel matrix of the previous
$j-1$ encoded users are found through an SVD or a QR
decomposition; $\bar{V}^0_1 = I$. For users 1 through $K_0 - 1$, the
SZF matrices are found as $Q_j = \bar{V}^0_j \bar{V}^{0H}_j \Sigma_j \bar{V}^0_j \bar{V}^{0H}_j$. For
the final user $K_0$, first a temporary covariance matrix
$\tilde{Q}_{K_0}$ is found by waterfilling over an effective channel
matrix $H_{eff} = (I + H_{K_0} (\sum_{j=1}^{K_0-1} Q_j) H_{K_0}^H)^{-1/2} H_{K_0} \bar{V}^0_{K_0} \bar{V}^{0H}_{K_0}$
with power constraint $P - \sum_{j=1}^{K_0-1} \mathrm{Tr}(Q_j)$, then
$Q_{K_0} = \bar{V}^0_{K_0} \bar{V}^{0H}_{K_0} \tilde{Q}_{K_0} \bar{V}^0_{K_0} \bar{V}^{0H}_{K_0}$. Over all users, the calculation
of the null space vectors is $O(K_0^2 M_T^2 N - K_0^3 M_T N^2 + K_0^4 N^3)$,
while the matrix multiplications for the $Q_j$ matrices are
$O(K_0 M_T^3)$. With $K_0 = M_T / N$, the above terms are all
about $O(K_0 M_T^3)$, which is therefore the overall complexity
of Step 3.
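The projection part of Step 3 can be sketched as below. For brevity this sketch applies the projection $Q_j = \bar{V}^0_j \bar{V}^{0H}_j \Sigma_j \bar{V}^0_j \bar{V}^{0H}_j$ to every user, including the last one, and leaves out the waterfilling over leftover power; the placeholder covariances and all names are assumptions.

```python
import numpy as np

def null_space_basis(H_bar, tol=1e-10):
    """Orthonormal basis of the null space of H_bar via a full SVD."""
    _, s, Vh = np.linalg.svd(H_bar, full_matrices=True)
    rank = int(np.sum(s > tol * s.max()))
    return Vh[rank:].conj().T

def project_to_szf(H, Sigma):
    """Project each Sigma_j onto the null space of users 1..j-1."""
    M_T = H[0].shape[1]
    Q = []
    for j, S in enumerate(Sigma):
        # The first encoded user has no constraint, so its basis is I.
        V0 = np.eye(M_T) if j == 0 else null_space_basis(np.vstack(H[:j]))
        proj = V0 @ V0.conj().T  # orthogonal projector onto the null space
        Q.append(proj @ S @ proj)
    return Q

rng = np.random.default_rng(3)
K, N, M_T = 3, 2, 6  # hypothetical system size
H = [rng.standard_normal((N, M_T)) + 1j * rng.standard_normal((N, M_T))
     for _ in range(K)]
Sigma = [np.eye(M_T) for _ in range(K)]  # placeholder DPC covariances
Q = project_to_szf(H, Sigma)
```

Since the projection can only remove power, the traces of the $Q_j$ never exceed those of the $\Sigma_j$, which is why some power is left over for the final user in the procedure described above.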
Step 4: In calculating the sum rate (7), each user requires
2 determinant calculations (except for the first, where an
identity matrix is in the denominator). The sum of $Q_j$
matrices is updated once per user at a cost of $2M_T^2$ flops. With
the sum calculated, each determinant value requires a total of
$8M_T^2 N + 8M_T N^2 + N + \frac{8}{3}N^3 + 6N$ flops for the matrix
multiplications and the determinant value calculation. Lastly,
2 flops are required per user to multiply and divide all the
real determinant values together. Thus, the total complexity
of Step 4 is $(2K_0 - 1)(8M_T^2 N + 8M_T N^2 + \frac{8}{3}N^3 + 7N) +
2K_0 M_T^2 + 2K_0$, which is $O(K_0 M_T^2 N) \approx O(M_T^3)$.
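The determinant-ratio structure described in Step 4 can be sketched as follows. The sketch assumes that the users are listed in encoding order and that each user's rate is the log-ratio of a determinant including its own covariance to one including only the covariances of earlier users, which is consistent with the two determinants per user mentioned above but is not copied from (7). The function name `szf_sum_rate` is hypothetical.

```python
import numpy as np

def szf_sum_rate(H, Q):
    """Sum rate in bits per channel use from running sums of the Q_j."""
    M_T = H[0].shape[1]
    running = np.zeros((M_T, M_T), dtype=complex)
    rate = 0.0
    for Hj, Qj in zip(H, Q):
        N = Hj.shape[0]
        # Denominator: covariances of users encoded before user j.
        _, den = np.linalg.slogdet(np.eye(N) + Hj @ running @ Hj.conj().T)
        running = running + Qj  # one sum update per user
        # Numerator: the same sum with user j's own covariance added.
        _, num = np.linalg.slogdet(np.eye(N) + Hj @ running @ Hj.conj().T)
        rate += (num - den) / np.log(2)
    return rate

rng = np.random.default_rng(5)
N, M_T = 2, 4  # hypothetical system size
H1 = rng.standard_normal((N, M_T)) + 1j * rng.standard_normal((N, M_T))
Q1 = np.eye(M_T) / M_T  # unit power spread evenly over the antennas
single_user_rate = szf_sum_rate([H1], [Q1])
```

For the first user the running sum is zero, so the denominator is the determinant of an identity matrix, matching the exception noted in the text.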
[Figures: two plots, each with panels (a) SNR = 5 dB and (b) SNR = 10 dB; horizontal axis: number of users (K), from 10 to 100; curves: exhaustive search, proposed genetic algorithm, proposed greedy algorithm.]