
IEEE TRANSACTIONS ON COMMUNICATIONS, ACCEPTED FOR PUBLICATION 1

A MMSE Vector Precoding With Block Diagonalization for Multiuser MIMO Downlink
Jungyong Park, Student Member, IEEE, Byungju Lee, Student Member, IEEE,
and Byonghyo Shim, Senior Member, IEEE
Abstract: The block diagonalization (BD) algorithm is a generalization of channel inversion that converts the multiuser multi-input multi-output (MIMO) broadcast channel into a set of single-user MIMO channels without inter-user interference. In this paper, we combine the BD technique with minimum mean square error vector precoding (MMSE-VP) to achieve further gain in performance with minimal computational overhead. Two key ingredients that make our approach effective are the QR decomposition based block diagonalization and the joint optimization of transmitter and receiver parameters in the MMSE sense. In fact, by jointly optimizing the precoded signal vector and the perturbation vector at the transmitter and receiver, we pursue an optimal balance between residual interference mitigation and noise enhancement suppression. From the sum rate analysis as well as bit error rate simulations (both uncoded and coded cases) in a realistic multiuser MIMO downlink, we show that the proposed BD-MVP brings substantial performance gain over existing multiuser MIMO algorithms.
Index Terms: Multiuser MIMO, block diagonalization, QR decomposition, MMSE vector precoding, Cholesky factorization.
I. INTRODUCTION
In the multiuser multiple-input multiple-output (MIMO) downlink system, a basestation with multiple antennas transmits data streams to multiple mobile users, each with one or more receive antennas. Because co-channel interference is hard to manage at the receiver and, more importantly, because the capacity of the interference channel can be made equal to the capacity without interference, pre-cancellation of the interference via precoding at the transmitter has received much attention in recent years. It is now well known that the capacity region of the multiuser MIMO downlink can be achieved by dirty paper coding (DPC) [1]–[8]. However, the shortcoming of DPC is that its implementation incurs the high computational cost of successive encoding and decoding. Recently, as a way to reduce the complexity of DPC while preserving a large fraction of the DPC capacity, block diagonalization (BD) has been proposed [9]. The main advantage of the BD technique
Paper approved by D. J. Love, the Editor for MIMO and Adaptive Tech-
niques of the IEEE Communications Society. Manuscript received November
3, 2010; revised August 4, 2011.
This work is supported by the Basic Science Research Program through the
National Research Foundation (NRF) funded by the Ministry of Education,
Science and Technology (MEST) (No. 2011-0012525) and the second Brain
Korea 21 project.
The authors are with the School of Information and Communication, Korea
University, Anam-dong, Sungbuk-gu, Seoul, Korea, 136-713 (e-mail: {jypark,
bjlee}@isl.korea.ac.kr, bshim@korea.ac.kr).
Digital Object Identifier 10.1109/TCOMM.2011.01.xxxxx
is that the transmit precoding matrix assures zero inter-user interference in the received signal. Extensions of the BD that employ water-filling or vector perturbation on top of the BD [9], [10] have also been suggested to fill the gap between the DPC and BD.
In this paper, we propose a technique that combines the QR decomposition based BD and minimum mean square error vector precoding (MMSE-VP) [11] to pursue further gain in performance. By combining the MMSE-VP with the BD, the proposed method, henceforth referred to as the BD-MVP, mitigates the noise enhancement effect, thereby achieving considerable gain in performance (both capacity and bit error rate). Unlike the conventional VP scheme, which controls parameters at the transmitter only, the MMSE-VP technique jointly optimizes parameters at the transmitter and receiver to pursue a balance between residual interference mitigation and noise enhancement suppression. Moreover, owing to the QR decomposition based block diagonalization, the dimension of the system being processed is reduced, and hence the proposed BD-MVP brings an additional complexity benefit over the SVD based BD. In a nutshell, the proposed approach improves the quality of service (QoS) of mobile users in multiuser MIMO environments at moderate computational cost. In fact, from the sum rate analysis and BER simulation results, we show that the proposed BD-MVP outperforms the BD algorithm and its variants (BD-VP and BD-WF) [10] as well as Tomlinson-Harashima precoding (THP) based schemes (BD-GMD and BD-UCD) [12].
The rest of this paper is organized as follows. In Section II,
we describe the system model and the summary of the
BD technique and its extensions (BD-WF and BD-VP). In
Section III, we present the proposed BD-MVP method. We
provide the simulation results in Section IV and conclude in
Section V.
We briefly summarize the notation used in this paper. We employ uppercase boldface letters for matrices and lowercase boldface letters for vectors. The superscripts $(\cdot)^H$ and $(\cdot)^T$ denote conjugate transpose and transpose, respectively. $\mathrm{diag}(\cdot)$ is a diagonal matrix whose nonzero elements exist only on the main diagonal. $\|\cdot\|$ indicates the $L_2$-norm of a vector. $\mathcal{CN}(m, \sigma^2)$ denotes a complex Gaussian random variable with mean $m$ and variance $\sigma^2$.
II. MULTIUSER MIMO DOWNLINK
In this section, we present the system model for the mul-
tiuser MIMO downlink and then review the BD and BD-VP
techniques.
0090-6778/11$25.00 © 2011 IEEE
A. System Model
In a multiuser MIMO downlink, a basestation equipped with $N_T$ antennas transmits information to $K$ mobile users. Each mobile user has $N_{R,j}$ receive antennas and thus the total number of receive antennas is $N_R = \sum_{j=1}^{K} N_{R,j}$. In the sequel, we use the notation $\{N_{R,1}, \ldots, N_{R,K}\} \times N_T$ to describe this multiuser system. For each channel realization, we assume that a block of data symbols with length $N_B$ is transmitted.
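Under the flat fading channel assumption, the total received signal vector $\mathbf{y}[n] = [\mathbf{y}_1[n]^T\ \mathbf{y}_2[n]^T \cdots \mathbf{y}_K[n]^T]^T$ is given by
$$\mathbf{y}[n] = \mathbf{H}\tilde{\mathbf{x}}[n] + \mathbf{w}[n], \quad n = 1, 2, \ldots, N_B, \tag{1}$$
where $\mathbf{H} = [\mathbf{H}_1^T\ \mathbf{H}_2^T \cdots \mathbf{H}_K^T]^T$ is the $N_R \times N_T$ composite channel matrix and $\tilde{\mathbf{x}}[n] = \mathbf{F}\mathbf{x}[n]$ is the precoded signal vector transmitted at the basestation, where $\mathbf{F} = [\mathbf{F}_1\ \mathbf{F}_2 \cdots \mathbf{F}_K]$ and $\mathbf{x}[n] = [\mathbf{x}_1[n]^T\ \mathbf{x}_2[n]^T \cdots \mathbf{x}_K[n]^T]^T$ are the $N_T \times N_R$ precoding matrix and the $N_R \times 1$ transmit signal vector, respectively, and $\mathbf{w}[n] = [\mathbf{w}_1[n]^T\ \mathbf{w}_2[n]^T \cdots \mathbf{w}_K[n]^T]^T$ is the complex Gaussian noise vector $\left(\mathbf{w}[n] \sim \mathcal{CN}(\mathbf{0}, \sigma_w^2\mathbf{I})\right)$. In this setup, the received signal of user $j$ can be expressed as
$$\mathbf{y}_j[n] = \mathbf{H}_j\mathbf{F}_j\mathbf{x}_j[n] + \mathbf{H}_j\sum_{l=1,\, l\neq j}^{K}\mathbf{F}_l\mathbf{x}_l[n] + \mathbf{w}_j[n]. \tag{2}$$
Note that the first and second terms on the right side of (2) are the desired signal and the multiuser interference, respectively.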
B. BD Algorithm
The BD algorithm is an extension of the channel inversion
for receivers with multiple antennas [9]. The key feature of
the BD is to employ the precoding matrix F annihilating all
multiuser interference. The precoding constraint for achieving
this goal is
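$$\mathbf{H}_i\mathbf{F}_j = \mathbf{0} \quad \text{for } i = 1, \ldots, j-1, j+1, \ldots, K. \tag{3}$$
Clearly, $\mathbf{F}_j$ should lie in the null space of $\tilde{\mathbf{H}}_j = [\mathbf{H}_1^T \cdots \mathbf{H}_{j-1}^T\ \mathbf{H}_{j+1}^T \cdots \mathbf{H}_K^T]^T$. From the SVD of $\tilde{\mathbf{H}}_j$,
$$\tilde{\mathbf{H}}_j = \tilde{\mathbf{U}}_j\tilde{\boldsymbol{\Lambda}}_j[\tilde{\mathbf{V}}_j^{(1)}\ \tilde{\mathbf{V}}_j^{(0)}]^H \tag{4}$$
where $\tilde{\mathbf{U}}_j$ and $\tilde{\boldsymbol{\Lambda}}_j$ are the matrices of the left singular vectors and the ordered singular values of $\tilde{\mathbf{H}}_j$, respectively, and $\tilde{\mathbf{V}}_j^{(1)}$ and $\tilde{\mathbf{V}}_j^{(0)}$ are the right singular matrices corresponding to the nonzero singular values and the zero singular values, respectively. It is clear that $\tilde{\mathbf{V}}_j^{(0)}$, the right singular matrix corresponding to the zero singular values, forms an orthogonal basis for the null space of $\tilde{\mathbf{H}}_j$. Thus, $\tilde{\mathbf{V}}_j^{(0)}$ becomes the right choice for the precoding matrix $\mathbf{F}_j$ and hence (2) becomes
$$\mathbf{y}_j[n] = \mathbf{H}_j\mathbf{F}_j\mathbf{x}_j[n] + \mathbf{w}_j[n]. \tag{5}$$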
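The null-space construction in (3)-(5) can be checked numerically. The following is a minimal NumPy sketch, not the authors' implementation: the function name `bd_precoders_svd`, the rank tolerance, and the randomly drawn $\{2, 2\} \times 4$ channel are assumptions made for illustration.

```python
import numpy as np

def bd_precoders_svd(H_users, tol=1e-10):
    """Return one null-space precoder F_j per user (SVD based BD)."""
    precoders = []
    for j in range(len(H_users)):
        # Stack the channels of all users except j (the interference matrix).
        H_tilde = np.vstack([H for i, H in enumerate(H_users) if i != j])
        _, s, Vh = np.linalg.svd(H_tilde, full_matrices=True)
        rank = int(np.sum(s > tol))
        # Right singular vectors beyond the rank span the null space.
        precoders.append(Vh[rank:].conj().T)
    return precoders

rng = np.random.default_rng(0)
N_T, N_Rj, K = 4, 2, 2  # illustrative {2, 2} x 4 system
H_users = [
    (rng.standard_normal((N_Rj, N_T)) + 1j * rng.standard_normal((N_Rj, N_T))) / np.sqrt(2)
    for _ in range(K)
]
F = bd_precoders_svd(H_users)

# Largest residual inter-user leakage ||H_i F_j|| over all i != j.
leak = max(
    np.linalg.norm(H_users[i] @ F[j])
    for i in range(K) for j in range(K) if i != j
)
```

With these precoders the cross terms in (2) vanish up to numerical precision, so each user sees only its own effective channel $\mathbf{H}_j\mathbf{F}_j$.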
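In order to maximize the achievable sum rate of the BD, the water-filling algorithm can be additionally incorporated [13]. Since $\mathbf{H}\mathbf{F}$ has a block diagonal structure, it is possible to perform the SVD for each user's block matrix $\mathbf{H}_j\mathbf{F}_j$ ($j = 1, \ldots, K$). Denoting this effective channel matrix as $\mathbf{H}_{\mathrm{eff},j} = \mathbf{H}_j\mathbf{F}_j = \mathbf{H}_j\tilde{\mathbf{V}}_j^{(0)}$, the SVD of $\mathbf{H}_{\mathrm{eff},j}$ becomes
$$\mathbf{H}_{\mathrm{eff},j} = \mathbf{U}_j[\boldsymbol{\Lambda}_j\ \mathbf{0}][\mathbf{V}_j^{(1)}\ \mathbf{V}_j^{(0)}]^H. \tag{6}$$
Since each column of $\mathbf{V}_j^{(1)}$ corresponds to the singular vector of a nonzero singular value in $\boldsymbol{\Lambda}_j$, by multiplying $\mathbf{V}_j^{(1)}$ to the right side of $\mathbf{H}_{\mathrm{eff},j}$, we obtain $\mathbf{H}_{\mathrm{eff},j}\mathbf{V}_j^{(1)} = \mathbf{U}_j\boldsymbol{\Lambda}_j$. Further, after identifying and removing $\mathbf{U}_j$ (see footnote 1), we can find a power loading matrix $\mathbf{D}_j$ maximizing the information rate of user $j$. In summary, the achievable sum rate of the BD-WF is [9]
$$R_{\text{BD-WF}} = \max_{\mathbf{D}:\,\mathrm{Tr}(\mathbf{D}) \le P_T} \log\det\left(\mathbf{I} + \frac{\boldsymbol{\Lambda}^2\mathbf{D}}{\sigma_w^2}\right) \tag{7}$$
where $\mathbf{D} = \mathrm{diag}(\mathbf{D}_1, \ldots, \mathbf{D}_K)$ is determined by water-filling under the power constraint $P_T$.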
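The power loading in (7) is the classical water-filling allocation. A minimal sketch is given below, assuming strictly positive per-stream gains $\lambda_k^2/\sigma_w^2$; the function name `water_filling` and the example gains are hypothetical and chosen only to illustrate the procedure.

```python
import numpy as np

def water_filling(gains, P_T):
    """Maximize sum(log2(1 + gains * d)) subject to sum(d) = P_T and d >= 0.

    gains: per-stream values lambda_k^2 / sigma_w^2 (assumed strictly positive).
    """
    gains = np.asarray(gains, dtype=float)
    order = np.argsort(gains)[::-1]          # strongest stream first
    g_sorted = gains[order]
    d = np.zeros_like(gains)
    for m in range(len(gains), 0, -1):
        # Water level when only the m strongest streams are active.
        mu = (P_T + np.sum(1.0 / g_sorted[:m])) / m
        alloc = mu - 1.0 / g_sorted[:m]
        if alloc[-1] >= 0:                   # weakest active stream still gets power
            d[order[:m]] = alloc
            break
    return d

gains = np.array([4.0, 1.0, 0.1])            # illustrative values
d = water_filling(gains, P_T=1.0)
rate = np.sum(np.log2(1.0 + gains * d))
```

In this example the weakest stream receives no power, which is the usual behavior of water-filling at low total power.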
C. BD-VP Algorithm
Although the BD-WF is optimal in maximizing the sum rate of the BD algorithm, delivery of the precoding matrix index or assignment of a dedicated pilot for each user to estimate the effective channel $\mathbf{H}_{\mathrm{eff},j}$ might be burdensome. The BD-VP, introduced to avoid this load, employs the vector perturbation technique [14], [15] on top of the BD to compensate for the transmit power suppression caused by the BD. Specifically, the transmitted signal of user $j$ is
$$\mathbf{x}_j[n] = \frac{1}{\sqrt{\gamma_j}}\mathbf{H}_{\mathrm{eff},j}^{-1}\tilde{\mathbf{s}}_j[n] = \frac{1}{\sqrt{\gamma_j}}\mathbf{H}_{\mathrm{eff},j}^{-1}(\mathbf{s}_j[n] + \tau\mathbf{p}_j[n]) \tag{8}$$
where $\tilde{\mathbf{s}}_j[n] = \mathbf{s}_j[n] + \tau\mathbf{p}_j[n]$, $\mathbf{s}_j[n]$ and $\tau$ are the transmit signal vector of user $j$ and a pre-defined positive real number, respectively, and $\gamma_j = \|\mathbf{H}_{\mathrm{eff},j}^{-1}(\mathbf{s}_j[n] + \tau\mathbf{p}_j[n])\|^2$ (see footnote 2). Note that $\mathbf{p}_j[n]$ is chosen to minimize $\gamma_j$ so that
$$\mathbf{p}_j[n] = \arg\min_{\mathbf{p}'_j[n] \in \mathbb{Z}^{N_{R,j}}} \left\|\mathbf{H}_{\mathrm{eff},j}^{-1}(\mathbf{s}_j[n] + \tau\mathbf{p}'_j[n])\right\|^2 \tag{9}$$
where $\mathbb{Z}$ is the integer set and $N_{R,j}$ is the number of transmit streams for each user. Since the perturbation vector $\mathbf{p}_j[n]$ is chosen from the set of $N_{R,j}$-dimensional lattice points, considerable reduction of $\gamma_j$ can be achieved, which helps to improve the effective SNR of the received signal. In fact, it is shown that the gap in sum rate between the BD-VP and BD-WF is negligible in the high SNR regime [10]. Note that the achievable rate of the BD-VP is given by
$$R_{\text{BD-VP}} = \sum_{j=1}^{K}\sum_{k=1}^{N_{R,j}} \log_2(1 + \mathrm{SNR}_{j,k}) = \sum_{j=1}^{K}\sum_{k=1}^{N_{R,j}} \log_2\left(1 + \frac{\rho\lambda_k^2}{N_{R,j}}\right) \tag{10}$$
1 Note that $\mathbf{U}_j$ can be identified once the precoded effective channel $\mathbf{H}_{\mathrm{eff},j}$ is estimated. $\mathbf{H}_{\mathrm{eff},j}$ can be estimated from either the dedicated pilot signal, the so-called demodulation reference signal (DM-RS) assigned to each user, or the combination of the precoding matrix $\mathbf{F}_j$ and the estimated channel $\mathbf{H}_j$.
2 In principle, $\gamma_j$ should be delivered to the receiver in the BD-VP technique. However, since the performance difference between $\gamma_j$ and $E[\gamma_j]$ is negligible [15], sporadic use of $E[\gamma_j]$ is more beneficial in practice.
PARK et al.: A MMSE VECTOR PRECODING WITH BLOCK DIAGONALIZATION FOR MULTIUSER MIMO DOWNLINK 3
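[Figure: block diagram. Transmitter: the original symbol $\mathbf{s}_j[n]$ and the sphere encoder output $\tau\mathbf{p}_j[n]$ (entries drawn from {-1, 0, 1}) form $\mathbf{d}_j[n]$, which passes through $\mathbf{A}_j$ and $\mathbf{F}_j$. Channel: $\mathbf{H}$ with additive noise $\mathbf{w}_j[n]$ (effective channel $\mathbf{H}_{\mathrm{eff},j}$ with input $\mathbf{x}_j[n]$). Receiver: $\mathbf{B}_j^H$, scaling by $g_j$, and the modulo operation produce $\hat{\mathbf{d}}_j[n]$ and $\hat{\mathbf{s}}_j[n]$.]
Fig. 1. The transceiver structure of the proposed BD-MVP technique.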
where $\rho = P_T/\sigma_w^2$ and $\lambda_k$ ($k = 1, \ldots, N_{R,j}$) is the singular value of $\mathbf{H}_{\mathrm{eff},j}$.
From the implementation perspective, the integer least squares problem in (9) is processed at the transmitter and solved by a closest lattice point search, referred to as the sphere encoder [15]. At the receiver, the received signal vector of user $j$ is modeled as
$$\mathbf{y}_j[n] = \frac{1}{\sqrt{\gamma_j}}\tilde{\mathbf{s}}_j[n] + \mathbf{w}_j[n] = \frac{1}{\sqrt{\gamma_j}}(\mathbf{s}_j[n] + \tau\mathbf{p}_j[n]) + \mathbf{w}_j[n]. \tag{11}$$
To remove $\tau\mathbf{p}_j[n]$, we can employ the modulo operation, and hence $\mathrm{mod}(\sqrt{\gamma_j}\,\mathbf{y}_j[n], \tau)$ becomes a sufficient statistic for the symbol slicing.
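To make the perturbation search and the modulo recovery concrete, the sketch below replaces the sphere encoder with an exhaustive search over a small candidate set. This is an illustrative stand-in, not the sphere encoder of [15]: the function names, the Gaussian-integer candidate set built from {-1, 0, 1}, the QPSK symbols, and the value $\tau = 4$ are all assumptions.

```python
import itertools
import numpy as np

def vp_search(H_eff, s, tau, cands=(-1, 0, 1)):
    """Exhaustive stand-in for the sphere encoder: minimize gamma over a small lattice patch."""
    H_inv = np.linalg.inv(H_eff)
    gauss = [a + 1j * b for a in cands for b in cands]   # candidate complex integers
    best_p, best_gamma = None, np.inf
    for p in itertools.product(gauss, repeat=len(s)):
        p = np.array(p)
        gamma = np.linalg.norm(H_inv @ (s + tau * p)) ** 2
        if gamma < best_gamma:
            best_p, best_gamma = p, gamma
    return best_p, best_gamma

def modulo(y, tau):
    """Fold real and imaginary parts into [-tau/2, tau/2)."""
    fold = lambda v: v - tau * np.floor(v / tau + 0.5)
    return fold(y.real) + 1j * fold(y.imag)

rng = np.random.default_rng(1)
H_eff = (rng.standard_normal((2, 2)) + 1j * rng.standard_normal((2, 2))) / np.sqrt(2)
s = np.array([1 + 1j, -1 + 1j])      # QPSK symbols (illustrative)
tau = 4.0                            # assumed modulo base for this constellation

p, gamma = vp_search(H_eff, s, tau)
x = np.linalg.inv(H_eff) @ (s + tau * p) / np.sqrt(gamma)   # unit-power transmit vector
y = H_eff @ x                                               # noiseless reception
s_hat = modulo(np.sqrt(gamma) * y, tau)
gamma_unperturbed = np.linalg.norm(np.linalg.inv(H_eff) @ s) ** 2
```

Because the zero perturbation is among the candidates, the selected $\gamma_j$ can never exceed the unperturbed value, and in the noiseless case the modulo operation returns the original symbols.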
III. BD-MVP
In this section, we present the proposed BD-MVP algo-
rithm that combines the BD and MMSE-VP. While the main
feature of the BD-VP is to annihilate inter-stream interference
completely, the aim of the proposed BD-MVP is to allow a
small inter-stream interference for optimal balance between
the residual interference mitigation and noise enhancement
suppression. We first provide the motivation of our work and
then discuss details of the proposed algorithm.
A. The BD-MVP
Although the BD-VP outperforms the BD, the mobile user suffers from the noise enhancement effect due to the zero forcing nature of the algorithm. An idea motivated by this observation is to reduce the noise enhancement effect by employing the QR decomposition based BD together with the MMSE-VP [11]. Towards this end, we pursue a tradeoff between residual interference mitigation and noise enhancement suppression in the MMSE sense. In a nutshell, the key distinctions of the proposed method from the BD-VP are 1) elimination of multiuser interference by the QR decomposition based BD and 2) per-user optimization of the perturbation vector using the MMSE criterion.
First, as a computationally efficient alternative to the SVD based BD, the proposed BD-MVP exploits the QR decomposition of the pseudo inverse of the composite channel matrix $\mathbf{H}$ [16]. To be specific, while the original BD employs the SVD of $\tilde{\mathbf{H}}_j$ to find the null space of $\tilde{\mathbf{H}}_j$, the BD-MVP exploits the pseudo inverse of $\mathbf{H}$, i.e., $\hat{\mathbf{H}} = \mathbf{H}^H(\mathbf{H}\mathbf{H}^H)^{-1}$. Denoting $\hat{\mathbf{H}} = [\hat{\mathbf{H}}_1\ \hat{\mathbf{H}}_2 \cdots \hat{\mathbf{H}}_K]$, we perform the QR decomposition for each submatrix of $\hat{\mathbf{H}}$ (i.e., $\hat{\mathbf{H}}_j = \hat{\mathbf{Q}}_j\hat{\mathbf{R}}_j$). Then, the columns of $\hat{\mathbf{Q}}_j$ are orthonormal and lie in the null space of $\tilde{\mathbf{H}}_j$ (see Appendix I).
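The QR based construction can be sketched in a few lines of NumPy. This is a minimal illustration under the assumption that $\mathbf{H}\mathbf{H}^H$ is invertible ($N_T \ge N_R$ with a full-rank channel); the function name `bd_precoders_qr` and the random $\{2, 2\} \times 4$ channel are hypothetical.

```python
import numpy as np

def bd_precoders_qr(H_users):
    """QR based BD: orthonormalize each column block of the right pseudo-inverse of H."""
    H = np.vstack(H_users)
    H_pinv = H.conj().T @ np.linalg.inv(H @ H.conj().T)   # H^H (H H^H)^{-1}
    precoders, col = [], 0
    for H_j in H_users:
        n = H_j.shape[0]
        Q, _ = np.linalg.qr(H_pinv[:, col:col + n])       # reduced QR of the j-th block
        precoders.append(Q)
        col += n
    return precoders

rng = np.random.default_rng(2)
H_users = [
    (rng.standard_normal((2, 4)) + 1j * rng.standard_normal((2, 4))) / np.sqrt(2)
    for _ in range(2)
]
F_qr = bd_precoders_qr(H_users)

# Residual leakage of user j's precoder into every other user's channel.
leak_qr = max(
    np.linalg.norm(H_users[i] @ F_qr[j])
    for i in range(2) for j in range(2) if i != j
)
H_eff = [H_users[j] @ F_qr[j] for j in range(2)]          # square effective channels
```

Each effective channel $\mathbf{H}_{\mathrm{eff},j} = \mathbf{H}_j\hat{\mathbf{Q}}_j$ is an $N_{R,j} \times N_{R,j}$ matrix, which is the reduced dimension that the per-user MMSE-VP step operates on.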
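Second, the MMSE-VP is employed to find the optimal perturbation vector and the power constraint factor. Denoting the desired symbol as $\mathbf{d}_j[n]$ and the received symbol as $\hat{\mathbf{d}}_j[n]$ (see Fig. 1), the total MSE for the whole block can be expressed as
$$\begin{aligned}
\varepsilon(\mathbf{p}_j[n], \mathbf{x}_j[n], g_j)
&= \sum_{n=1}^{N_B} E\left\|\hat{\mathbf{d}}_j[n] - \mathbf{d}_j[n]\right\|^2 \\
&= \sum_{n=1}^{N_B} E\left\|g_j\mathbf{B}_j^H(\mathbf{H}_{\mathrm{eff},j}\mathbf{x}_j[n] + \mathbf{w}_j[n]) - \mathbf{d}_j[n]\right\|^2 \\
&= \sum_{n=1}^{N_B} E\Big(\big(g_j\mathbf{B}_j^H(\mathbf{H}_{\mathrm{eff},j}\mathbf{x}_j[n] + \mathbf{w}_j[n]) - \mathbf{d}_j[n]\big)^H \big(g_j\mathbf{B}_j^H(\mathbf{H}_{\mathrm{eff},j}\mathbf{x}_j[n] + \mathbf{w}_j[n]) - \mathbf{d}_j[n]\big)\Big) \\
&= \sum_{n=1}^{N_B} \Big(g_j^2\mathbf{x}_j^H[n]\mathbf{H}_{\mathrm{eff},j}^H\mathbf{H}_{\mathrm{eff},j}\mathbf{x}_j[n] - g_j\mathbf{d}_j^H[n]\mathbf{B}_j^H\mathbf{H}_{\mathrm{eff},j}\mathbf{x}_j[n] + g_j^2\,\mathrm{tr}(\mathbf{R}_n) \\
&\qquad\quad - g_j\mathbf{x}_j^H[n]\mathbf{H}_{\mathrm{eff},j}^H\mathbf{B}_j\mathbf{d}_j[n] + \mathbf{d}_j^H[n]\mathbf{d}_j[n]\Big)
\end{aligned} \tag{12}$$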
where $\mathbf{B}_j^H$ is the receive equalization matrix for user $j$. Note that when $\mathbf{B}_j^H = \mathbf{I}_{N_{R,j}}$, this joint optimization problem turns into the transmitter optimization problem. Since $\mathbf{H}_{\mathrm{eff},j}$, $\mathbf{d}_j[n]$, and $\mathrm{tr}(\mathbf{R}_n)$ are given, our goal is to find $\mathbf{p}_j[n]$, $\mathbf{x}_j[n]$, and $g_j$ minimizing the MSE under the power constraint. The corresponding optimization problem can be expressed as
$$\{\mathbf{p}_j[n], \mathbf{x}_j[n], g_j\} = \arg\min\ \varepsilon(\mathbf{p}_j[n], \mathbf{x}_j[n], g_j) \quad \text{subject to} \quad \sum_{n=1}^{N_B}\mathbf{x}_j^H[n]\mathbf{x}_j[n] = P_T. \tag{13}$$
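This problem can be solved by the method of Lagrange multipliers [17]. One can show that the resulting transmit signal $\mathbf{x}_j[n]$ is given by (see Appendix II)
$$\mathbf{x}_j[n] = \mathbf{A}_j\mathbf{d}_j[n] = \frac{1}{g_j}\mathbf{H}_{\mathrm{eff},j}^H(\mathbf{H}_{\mathrm{eff},j}\mathbf{H}_{\mathrm{eff},j}^H + \xi\mathbf{I})^{-1}\mathbf{B}_j\mathbf{d}_j[n]. \tag{14}$$
Further, using the matrix inversion lemma, $\mathbf{x}_j[n] = \frac{1}{g_j}(\mathbf{H}_{\mathrm{eff},j}^H\mathbf{H}_{\mathrm{eff},j} + \xi\mathbf{I})^{-1}\mathbf{H}_{\mathrm{eff},j}^H\mathbf{B}_j\mathbf{d}_j[n]$ and hence the power constraint factor $g_j$ is given by
$$g_j = \sqrt{\frac{1}{P_T}\sum_{n=1}^{N_B}\mathbf{d}_j^H[n]\mathbf{B}_j^H\mathbf{H}_{\mathrm{eff},j}(\mathbf{H}'_{\mathrm{eff},j})^{-2}\mathbf{H}_{\mathrm{eff},j}^H\mathbf{B}_j\mathbf{d}_j[n]} \tag{15}$$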
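where $\mathbf{H}'_{\mathrm{eff},j} = \mathbf{H}_{\mathrm{eff},j}\mathbf{H}_{\mathrm{eff},j}^H + \xi\mathbf{I}$ and $\xi = N_B\,\mathrm{tr}(\mathbf{R}_n)/P_T$. Once $\mathbf{x}_j[n]$ and $g_j$ are available, what remains is to find the optimal perturbation vector $\mathbf{p}_j[n]$. By plugging $\mathbf{x}_j[n]$ and $g_j$ into (12), one can show that (see Appendix III)
$$\mathbf{p}_j[n] = \arg\min_{\mathbf{p}'_j[n]} \left\|\mathbf{L}_j\mathbf{B}_j(\mathbf{s}_j[n] + \tau\mathbf{p}'_j[n])\right\|^2 \tag{16}$$
where we employ the Cholesky factorization $(\mathbf{H}_{\mathrm{eff},j}\mathbf{H}_{\mathrm{eff},j}^H + \xi\mathbf{I})^{-1} = \mathbf{L}_j^H\mathbf{L}_j$ ($\mathbf{L}_j$ is the lower triangular matrix). Similar to the BD-VP, the perturbation vector $\mathbf{p}_j[n]$ can be found by the sphere encoder.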
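At the receiver, the signal vector $\mathbf{y}_j[n]$ is
$$\begin{aligned}
\mathbf{y}_j[n] &= \mathbf{H}_{\mathrm{eff},j}\mathbf{x}_j[n] + \mathbf{w}_j[n] \\
&= \frac{1}{g_j}\mathbf{H}_{\mathrm{eff},j}\mathbf{H}_{\mathrm{eff},j}^H(\mathbf{H}'_{\mathrm{eff},j})^{-1}\mathbf{B}_j(\mathbf{s}_j[n] + \tau\mathbf{p}_j[n]) + \mathbf{w}_j[n] \\
&\approx \frac{1}{g_j}\mathbf{B}_j(\mathbf{s}_j[n] + \tau\mathbf{p}_j[n]) + \mathbf{w}'_j[n].
\end{aligned} \tag{17}$$
Note that $\mathbf{w}'_j$ contains the residual interference introduced by the regularization factor $\xi\mathbf{I}$ as well as the Gaussian noise $\mathbf{w}_j$. Since $\tau$ is known to the receiver, the effect of $\tau\mathbf{p}_j[n]$ can be removed using a simple matrix multiplication (by $g_j\mathbf{B}_j^H$) followed by the modulo operation. The proposed algorithm is summarized in Appendix IV.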
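The per-user MMSE-VP step (regularization factor, Cholesky factor, perturbation search, transmit vector, and power constraint factor) can be sketched as follows. This is an illustrative sketch only: it assumes $\mathbf{B}_j = \mathbf{I}$, $\mathbf{R}_n = \sigma_w^2\mathbf{I}$, and an exhaustive search over a small candidate set in place of the sphere encoder; the function name `mmse_vp_block` and all numeric values are hypothetical. Note also that NumPy returns a lower triangular factor $\mathbf{C}$ with $\mathbf{C}\mathbf{C}^H$, so the factor used here is $\mathbf{C}^H$, which satisfies the same identity $\mathbf{L}^H\mathbf{L} = (\mathbf{H}_{\mathrm{eff},j}\mathbf{H}_{\mathrm{eff},j}^H + \xi\mathbf{I})^{-1}$ and yields the same cost.

```python
import itertools
import numpy as np

def mmse_vp_block(H_eff, S, tau, sigma2_w, P_T, cands=(-1, 0, 1)):
    """MMSE-VP for one user and one block; S is N_R,j x N_B, B_j = I is assumed."""
    n_r, n_b = S.shape
    xi = n_b * n_r * sigma2_w / P_T                  # xi = N_B tr(R_n) / P_T with R_n = sigma2_w * I
    M_inv = np.linalg.inv(H_eff @ H_eff.conj().T + xi * np.eye(n_r))
    L = np.linalg.cholesky(M_inv).conj().T           # M_inv = L^H L
    gauss = [a + 1j * b for a in cands for b in cands]
    grid = np.array(list(itertools.product(gauss, repeat=n_r))).T   # candidates as columns
    D = np.empty_like(S, dtype=complex)
    for n in range(n_b):
        trial = S[:, [n]] + tau * grid               # every perturbed symbol vector
        cost = np.sum(np.abs(L @ trial) ** 2, axis=0)
        D[:, n] = trial[:, np.argmin(cost)]
    X = H_eff.conj().T @ M_inv @ D                   # unscaled transmit vectors
    g = np.sqrt(np.sum(np.abs(X) ** 2) / P_T)        # enforces the block power constraint
    return X / g, g, D, xi

rng = np.random.default_rng(3)
H_eff = (rng.standard_normal((2, 2)) + 1j * rng.standard_normal((2, 2))) / np.sqrt(2)
S = rng.choice([-1.0, 1.0], size=(2, 4)) + 1j * rng.choice([-1.0, 1.0], size=(2, 4))
X, g, D, xi = mmse_vp_block(H_eff, S, tau=4.0, sigma2_w=0.1, P_T=1.0)
block_power = np.sum(np.abs(X) ** 2)
```

Unlike the zero forcing case, $\mathbf{H}_{\mathrm{eff},j}\mathbf{x}_j[n]$ is not exactly proportional to $\mathbf{d}_j[n]$ here; the small mismatch is the residual interference absorbed into $\mathbf{w}'_j$ in (17).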
B. BD-MVP With Geometric Mean Decomposition
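In this subsection, we consider the optimization problem of obtaining the unitary receive equalization matrix $\mathbf{B}_j^H$. The geometric mean decomposition (GMD) [18], [19], developed for this purpose, partitions the channel matrix $\mathbf{H}_{\mathrm{eff},j}$ such that
$$\mathbf{H}_{\mathrm{eff},j} = \mathbf{Q}_j\mathbf{R}_j\mathbf{P}_j^H \tag{18}$$
where $\mathbf{Q}_j$ and $\mathbf{P}_j$ are unitary matrices and $\mathbf{R}_j$ is a real upper triangular matrix with identical diagonal elements. We can choose the unitary matrix $\mathbf{B}_j = \mathbf{Q}_j$ to equalize the effective channel of user $j$. Then, the transmit signal and the power constraint factor are given by
$$\mathbf{x}_j[n] = \frac{1}{g_j}\mathbf{P}_j\mathbf{R}_j^H\left(\mathbf{R}_j\mathbf{R}_j^H + \xi\mathbf{I}\right)^{-1}\mathbf{d}_j[n], \tag{19}$$
$$g_j = \sqrt{\frac{1}{P_T}\sum_{n=1}^{N_B}\left\|\mathbf{R}_j^H(\mathbf{R}_j\mathbf{R}_j^H + \xi\mathbf{I})^{-1}\mathbf{d}_j[n]\right\|^2}, \tag{20}$$
and the total MSE becomes
$$\varepsilon = \xi\sum_{n=1}^{N_B}\mathbf{d}_j^H[n]\left(\mathbf{R}_j\mathbf{R}_j^H + \xi\mathbf{I}\right)^{-1}\mathbf{d}_j[n]. \tag{21}$$
Using the Cholesky factorization $\left(\mathbf{R}_j\mathbf{R}_j^H + \xi\mathbf{I}\right)^{-1} = \mathbf{T}_j^H\mathbf{T}_j$, (21) can be simplified to
$$\varepsilon = \xi\sum_{n=1}^{N_B}\left\|\mathbf{T}_j(\mathbf{s}_j[n] + \tau\mathbf{p}_j[n])\right\|^2. \tag{22}$$
In order to minimize the total MSE, each independent term should be minimized and hence $\mathbf{p}_j[n]$ is chosen as
$$\mathbf{p}_j[n] = \arg\min_{\mathbf{p}'_j[n]}\left\|\mathbf{T}_j(\mathbf{s}_j[n] + \tau\mathbf{p}'_j[n])\right\|^2. \tag{23}$$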
C. The Sum Rate of the BD-MVP
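The sum rate of the proposed method is equivalent to the sum of the single-user rates by virtue of the fact that the precoding matrix $\mathbf{F}_j$ annihilates multiuser interference. Recalling that the transmit and received signal vectors of user $j$ are $\mathbf{x}_j[n] = \frac{1}{g_j}\mathbf{H}_{\mathrm{eff},j}^H(\mathbf{H}_{\mathrm{eff},j}\mathbf{H}_{\mathrm{eff},j}^H + \xi\mathbf{I})^{-1}\mathbf{B}_j\mathbf{d}_j[n]$ and $\mathbf{y}_j[n] = \mathbf{H}_{\mathrm{eff},j}\mathbf{x}_j[n] + \mathbf{w}_j[n]$, respectively, and also using the eigenvalue decomposition $\mathbf{H}_{\mathrm{eff},j}\mathbf{H}_{\mathrm{eff},j}^H = \mathbf{Q}\boldsymbol{\Lambda}\mathbf{Q}^H$, $\mathbf{H}_{\mathrm{eff},j}\mathbf{x}_j[n]$ becomes
$$\mathbf{H}_{\mathrm{eff},j}\mathbf{x}_j[n] = \frac{1}{g_j}\mathbf{Q}\boldsymbol{\Sigma}\mathbf{Q}^H\mathbf{B}_j\mathbf{d}_j[n] \tag{24}$$
where $\mathbf{Q}$ is the unitary matrix and $\boldsymbol{\Sigma}$ is the diagonal matrix whose $j$th element is $\frac{\lambda_j}{\lambda_j + \xi}$ ($\lambda_j$ is the $j$th diagonal element of $\boldsymbol{\Lambda}$).
In (24), the $k$th entry of user $j$ is
$$[\mathbf{H}_{\mathrm{eff},j}\mathbf{x}_j[n]]_k = \frac{1}{g_j}
\begin{bmatrix} q_{k,1}\frac{\lambda_1}{\lambda_1+\xi} & \cdots & q_{k,N_{R,j}}\frac{\lambda_{N_{R,j}}}{\lambda_{N_{R,j}}+\xi} \end{bmatrix}
\begin{bmatrix} q_{1,1}^H & \cdots & q_{N_{R,j},1}^H \\ \vdots & \ddots & \vdots \\ q_{1,N_{R,j}}^H & \cdots & q_{N_{R,j},N_{R,j}}^H \end{bmatrix}
\begin{bmatrix} c_{j,1}[n] \\ \vdots \\ c_{j,N_{R,j}}[n] \end{bmatrix} \tag{25}$$
where $q_{m,n}$ is the $(m, n)$th entry of $\mathbf{Q}$ and $c_{j,i}[n]$ is the $i$th entry of $\mathbf{B}_j\mathbf{d}_j[n]$. In (25), we see that $\frac{1}{g_j}\left(\sum_{l=1}^{N_{R,j}}\frac{\lambda_l}{\lambda_l+\xi}|q_{k,l}|^2\right)c_{j,k}[n]$ is the desired term and all of the rest (containing $c_{j,l}[n]$, $l \neq k$) are interference. From this perspective, the $k$th received signal for user $j$ can be expressed as
$$y_{j,k}[n] = \frac{1}{g_j}\left(\sum_{l=1}^{N_{R,j}}\frac{\lambda_l}{\lambda_l+\xi}|q_{k,l}|^2\right)c_{j,k}[n] + z_{j,k}[n] \tag{26}$$
where $z_{j,k}[n] = \frac{1}{g_j}\sum_{m=1,\, m\neq k}^{N_{R,j}}\left(\sum_{l=1}^{N_{R,j}}\frac{\lambda_l}{\lambda_l+\xi}q_{k,l}q_{m,l}^H\right)c_{j,m}[n] + w_{j,k}[n]$ contains the inter-stream interference and the additive noise $w_{j,k}$. Therefore, the SINR of each stream of user $j$ becomes
$$\mathrm{SINR}_{j,k} = \sum_{n=1}^{N_B}\frac{\|s\,c_{j,k}[n]\|^2}{\|t\,c_{j,m}[n]\|^2 + \|w_{j,k}[n]\|^2} \tag{27}$$
where $s = \frac{1}{g_j}\left(\sum_{l=1}^{N_{R,j}}\frac{\lambda_l}{\lambda_l+\xi}|q_{k,l}|^2\right)$ and $t = \frac{1}{g_j}\sum_{m=1,\, m\neq k}^{N_{R,j}}\left(\sum_{l=1}^{N_{R,j}}\frac{\lambda_l}{\lambda_l+\xi}q_{k,l}q_{m,l}^H\right)$, and the corresponding sum rate becomes
$$R_{\text{BD-MVP}} = \sum_{j=1}^{K}\sum_{k=1}^{N_{R,j}}\log_2\left(1 + \mathrm{SINR}_{j,k}\right). \tag{28}$$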
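The split of (24) into a desired term and inter-stream interference can be illustrated numerically. The sketch below is not the paper's exact SINR expression: it assumes independent streams with a common average symbol power `sym_power`, so that the interference power is the sum of the squared off-diagonal gains; the function name `stream_sinr` and all values are hypothetical.

```python
import numpy as np

def stream_sinr(H_eff, xi, g, sigma2_w, sym_power=1.0):
    """Per-stream SINR from the gain matrix (1/g) Q Sigma Q^H (averaging assumption)."""
    lam, Q = np.linalg.eigh(H_eff @ H_eff.conj().T)       # H_eff H_eff^H = Q Lambda Q^H
    G = (Q * (lam / (lam + xi))) @ Q.conj().T / g         # scales column l of Q by lam_l/(lam_l+xi)
    desired = np.abs(np.diag(G)) ** 2 * sym_power
    total = np.sum(np.abs(G) ** 2, axis=1) * sym_power
    return desired / (total - desired + sigma2_w)

rng = np.random.default_rng(4)
H_eff = (rng.standard_normal((2, 2)) + 1j * rng.standard_normal((2, 2))) / np.sqrt(2)

sinr_zf = stream_sinr(H_eff, xi=0.0, g=2.0, sigma2_w=0.1)    # xi = 0: no inter-stream leakage
sinr_mmse = stream_sinr(H_eff, xi=0.2, g=2.0, sigma2_w=0.1)  # xi > 0: small leakage appears
rate_user = np.sum(np.log2(1.0 + sinr_mmse))
```

With $\xi = 0$ the gain matrix reduces to $\mathbf{I}/g_j$, so every stream sees the same SINR $1/(g_j^2\sigma_w^2)$; with $\xi > 0$ off-diagonal terms appear and act as the inter-stream interference in $z_{j,k}[n]$. In practice $g_j$ itself depends on $\xi$, which this fixed-$g$ illustration does not capture.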
D. MSE Performance Between the BD-VP and BD-MVP
In this subsection, we provide the MSE analysis of the BD-VP and BD-MVP algorithms. Since the optimization problem of the conventional VP can be interpreted as a zero forcing ($\xi = 0$) MMSE-VP problem [11], we can compare the MSE of the two schemes. Moreover, due to the fact that the minimum MSE of the BD-VP and BD-MVP equals $\gamma_j$ and $g_j$, respectively, the MSE of an instantaneous block for both
TABLE I
COMPLEXITY COMPARISON BETWEEN THE BD-VP AND BD-MVP
Operations | BD-VP | BD-MVP
Annihilation of multiuser interference | BD is done by the SVD, $C_{\mathrm{SVD}} = K\left(4N_T^2N_R + 13(N_R - N_{R,j})^3\right)$ | BD is achieved by the QR decomposition, $C_{\mathrm{QR}} = \frac{11}{3}N_T^3 + \frac{5}{3}N_T^2 + KN_{R,j}^2\left(N_T - \frac{1}{3}N_{R,j}\right)$
Computation of perturbation vector | SE employing zero forcing criterion with complexity $C_{\text{VP-SE}}$ | SE exploiting MMSE criterion with complexity $C_{\text{MVP-SE}}$ ($\approx C_{\text{VP-SE}}$)
Receiver equalizer operation | - | $\mathbf{B}_j^H$ multiplication is required for each user
Modulo operation | Remove $\tau\mathbf{p}_j$ to map the received signal into the original vector | Same as BD-VP

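$\gamma_j$ and $g_j$ can be compared without loss of generality. First, $\gamma_j$ of the BD-VP algorithm is expressed as
$$\begin{aligned}
\gamma_j &\propto \left\|\mathbf{H}_{\mathrm{eff},j}^H(\mathbf{H}_{\mathrm{eff},j}\mathbf{H}_{\mathrm{eff},j}^H)^{-1}(\mathbf{s}_j + \tau\mathbf{p}_j)\right\|^2 \\
&= (\mathbf{s}_j + \tau\mathbf{p}_j)^H(\mathbf{H}_{\mathrm{eff},j}\mathbf{H}_{\mathrm{eff},j}^H)^{-1}(\mathbf{s}_j + \tau\mathbf{p}_j) \\
&= (\mathbf{s}_j + \tau\mathbf{p}_j)^H\mathbf{Q}\boldsymbol{\Lambda}^{-1}\mathbf{Q}^H(\mathbf{s}_j + \tau\mathbf{p}_j) \\
&= \sum_{l=1}^{N_{R,j}}\frac{1}{\lambda_l}\left|\mathbf{q}_l^H(\mathbf{s}_j + \tau\mathbf{p}_j)\right|^2.
\end{aligned} \tag{29}$$
In a similar way, $g_j$ of the BD-MVP scheme is
$$\begin{aligned}
g_j &\propto \left\|\mathbf{L}_j(\mathbf{s}_j + \tau\mathbf{p}_j)\right\|^2 \\
&= (\mathbf{s}_j + \tau\mathbf{p}_j)^H(\mathbf{H}_{\mathrm{eff},j}\mathbf{H}_{\mathrm{eff},j}^H + \xi\mathbf{I})^{-1}(\mathbf{s}_j + \tau\mathbf{p}_j) \\
&= (\mathbf{s}_j + \tau\mathbf{p}_j)^H\mathbf{Q}(\boldsymbol{\Lambda} + \xi\mathbf{I})^{-1}\mathbf{Q}^H(\mathbf{s}_j + \tau\mathbf{p}_j) \\
&= \sum_{l=1}^{N_{R,j}}\frac{1}{\lambda_l + \xi}\left|\mathbf{q}_l^H(\mathbf{s}_j + \tau\mathbf{p}_j)\right|^2.
\end{aligned} \tag{30}$$
Due to the inclusion of $\xi$ ($> 0$) in the denominator, it is clear that
$$\gamma_j - g_j > 0. \tag{31}$$
Note that since the total MSE of the BD-VP is the sum of each individual block MSE ($\mathrm{MSE}_{\text{BD-VP}} = \sum_{j=1}^{K}\gamma_j$) and the same is true for the BD-MVP ($\mathrm{MSE}_{\text{BD-MVP}} = \sum_{j=1}^{K}g_j$), we conclude that the BD-MVP outperforms the BD-VP in the MSE sense. Note also that we only consider the case where $\mathbf{B}_j^H = \mathbf{I}_{N_{R,j}}$ in the $g_j$ computation, so that further improvement in the MSE can be achieved using the GMD based receive equalization.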
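The ordering of the two quadratic forms in (29) and (30) can be checked with a few lines of NumPy. This is only a numerical illustration for one randomly drawn channel and one fixed perturbed vector; the variable names and the value of $\xi$ are assumptions.

```python
import numpy as np

rng = np.random.default_rng(5)
H_eff = (rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))) / np.sqrt(2)
v = rng.standard_normal(3) + 1j * rng.standard_normal(3)   # stands in for s_j + tau * p_j
xi = 0.3

G = H_eff @ H_eff.conj().T
lam, Q = np.linalg.eigh(G)                  # G = Q diag(lam) Q^H
proj = np.abs(Q.conj().T @ v) ** 2          # |q_l^H v|^2 for each eigenvector q_l

zf_form = np.sum(proj / lam)                # last line of (29)
mmse_form = np.sum(proj / (lam + xi))       # last line of (30)

# Direct evaluation of the quadratic forms for comparison.
zf_direct = (v.conj() @ np.linalg.inv(G) @ v).real
mmse_direct = (v.conj() @ np.linalg.inv(G + xi * np.eye(3)) @ v).real
```

Every term of the second sum is smaller than the matching term of the first because $\lambda_l + \xi > \lambda_l$, which is the termwise argument behind (31).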
E. Comments on Complexity
In this subsection, we discuss the complexity of the BD-MVP and BD-VP. Since the receiver operations of the BD-VP and BD-MVP (the modulo operation and receive equalization) are simple, we focus only on the computations associated with transmitter precoding. In our analysis, we measure the complexity of the algorithm by the number of floating point operations (flops). The major operations of the proposed method and the BD-VP include 1) annihilation of multiuser interference using $\mathbf{F}_j$ and 2) per-user optimization of the perturbation vector $\mathbf{p}_j$.
The computational complexities of the SVD for the BD-VP, $C_{\mathrm{SVD}}$, and of the QR decomposition for the BD-MVP, $C_{\mathrm{QR}}$, are [20]
$C_{\mathrm{SVD}}$: $K\left(4N_T^2N_R + 13(N_R - N_{R,j})^3\right)$.
$C_{\mathrm{QR}}$: $\frac{11}{3}N_T^3 + \frac{5}{3}N_T^2$ for channel inversion and $KN_{R,j}^2\left(N_T - \frac{1}{3}N_{R,j}\right)$ for orthogonalization.
[Figure: sum rate (bits/sec/Hz) versus SNR (dB). Curves: DPC, BD, BD-WF, BD-VP [10], BD-GMD [21], BD-UCD [21], BD-MVP.]
Fig. 2. Achievable sum rates of DPC, conventional BD algorithms (BD, BD-WF and BD-VP), THP based schemes (BD-GMD and BD-UCD), and the proposed BD-MVP.
Compared to the SVD based BD, the QR based BD has smaller computational complexity since the QR based orthogonalization of the $N_T \times N_{R,j}$ matrix $\hat{\mathbf{H}}_j$ is simpler than the SVD operation on the $(N_R - N_{R,j}) \times N_T$ matrix $\tilde{\mathbf{H}}_j$. For example, the required flops of the BD-VP and BD-MVP for the $\{4, 4\} \times 8$ system are 2688 and 2137, respectively. In fact, the computational benefit of the BD-MVP over the BD-VP grows with $N_T$ and $K$, mainly because the SVD operates in a higher dimension.
Next, we investigate the complexity of perturbation vector
computation. Due to the fact that the vector precoding is
implemented using the sphere encoding, the computational
complexity associated with this operation is nondeterministic
[21]. The lower bound on the complexity of the sphere
encoding C
SE
is given by [22]
C
SE


N
s
1

1
(32)
where is the modulation order, N
s
is the number of search
dimension, and is the complexity component given by =
1
2
_
1 +
4(1)
3
2
SNR
_
where is a constant. Noting that the
dominant factor of C
SE
is N
s
and both schemes have the same
N
s
, we expect that the complexity of the BD-VP and BD-MVP
would be more or less similar. We summarize the complexity
analysis of this subsection in Table I.
[Figure: uncoded BER versus SNR (dB). Curves: BD-VP [10], BD-GMD [21], BD-UCD [21], BD-MVP, BD-MVP w/ GMD.]
Fig. 3. Uncoded BER performance for $\{4, 4\} \times 8$ multiuser MIMO systems.
IV. SIMULATION RESULTS AND DISCUSSIONS
In this section, we compare the performance of the proposed BD-MVP method with the BD, BD-VP, and BD-WF, as well as THP based schemes (BD-GMD and BD-UCD). While the BD-GMD generates subchannels with identical SNRs using the zero forcing based GMD, the BD-UCD employs the MMSE extension of the uniform channel decomposition (UCD) to obtain further gain in performance (readers are referred to [12] for more details).
The simulation setup is based on 16-QAM transmission. We assume that the elements of each user's channel matrix are i.i.d. zero mean complex Gaussian random variables with unit variance, which almost surely ensures that each user's channel matrix has full rank ($\mathrm{rank}(\mathbf{H}_j) = \min(N_{R,j}, N_T)$). In order to obtain a comprehensive picture of theoretical and practical performance, we measure the sum rate and the bit error rate (BER) (both uncoded and coded).
In Fig. 2, we plot the sum rate for the $\{2, 2\} \times 4$ system as a function of SNR. Note that the achievable rate of the DPC is [3]–[5]
$$R_{\mathrm{DPC}} = \sup_{\mathbf{D} \in \mathcal{A}} \log\left|\mathbf{I} + \mathbf{H}^H\mathbf{D}\mathbf{H}\right| \tag{33}$$
where $\mathcal{A}$ is the set of nonnegative diagonal matrices $\mathbf{D}$ satisfying $\mathrm{tr}(\mathbf{D}) \le P_T$. Generally, we observe that the sum rates of the BD-UCD, BD-GMD, BD-WF, and BD-MVP are not much different from the sum rate of the DPC. In particular, the sum rate of the BD-MVP approaches the sum rate of the DPC as the SNR increases. However, the BD algorithm leaves a nonnegligible gap to the sum rate of the DPC.
In Fig. 3, we plot the uncoded BER for the $\{4, 4\} \times 8$ systems. We observe that the BD-MVP outperforms the BD-VP and BD-UCD, resulting in about 0.8 dB and 1.5 dB gain at BER $= 10^{-3}$, respectively. We also observe that the BD-MVP with receive equalization provides additional gain (around 0.3 dB at BER $= 10^{-3}$) over the BD-MVP without equalization.
In order to assess the computational complexity of the algorithms under test, we examine the running time of each scheme using the same set of symbols under the same environment. The program is written in Matlab and the
[Figure: coded BER versus SNR (dB). Curves: BD-UCD [21], BD-VP [10], BD-MVP.]
Fig. 4. Coded BER performance for $\{4, 4\} \times 8$ systems with 16-QAM modulation (Turbo encoding with rate $r = 1/2$).
results are obtained as an average time (in seconds) for $10^4$ symbols running on a dual-core processor in a Windows 7 environment. As summarized in Table II, the BD-GMD achieves the lowest complexity among all candidates by virtue of the fact that only a small number of computations is needed for the THP modulo operation. Due to the considerable number of computations associated with the sphere encoding, we see that the running times of the BD-MVP and BD-VP are 3.3 and 3.9 times higher than that of the BD-GMD, respectively. Since the number of iterations for performing water-filling is substantial, the BD-UCD incurs the highest computational complexity.
Finally, in order to observe whether this gain carries over to the gain after decoding, we check the coded BER performance. Fig. 4 illustrates the coded BER performance of the proposed method with half-rate ($r = 1/2$) turbo encoded data. The standard Gray mapping is used to map the information bits to 16-QAM symbols. As an encoder, we use the universal mobile telecommunications system (UMTS) standard turbo code with feedforward polynomial $1 + D + D^3$ and feedback polynomial $1 + D^2 + D^3$ [23]. The size of codebook is set to 4000. Similar to [15], $y'_{j,\mathrm{re}}[n] = y_{j,\mathrm{re}}[n] - (\tau/g_j)p_{j,\mathrm{re}}[n]$ is employed to compute the likelihood function
$$p(y'_{j,\mathrm{re}}[n] \mid s_{j,\mathrm{re}}) = \sum_{m=-\infty}^{\infty}\frac{1}{\sqrt{2\pi\sigma^2}}\exp\left\{-\frac{\left(y'_{j,\mathrm{re}}[n] - (s_{j,\mathrm{re}} - \tau m)/g_j\right)^2}{2\sigma^2}\right\}, \quad y'_{j,\mathrm{re}}[n] \in \left[-\frac{\tau}{2g_j}, \frac{\tau}{2g_j}\right). \tag{34}$$
The imaginary part is handled in the same way.
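A truncated version of the likelihood in (34) can be sketched as follows. The function name `vp_likelihood`, the truncation depth `m_max`, and the numeric values ($\tau = 8$, $g_j = 1$, $\sigma^2 = 0.5$) are assumptions for illustration only.

```python
import numpy as np

def vp_likelihood(y, s, tau, g, sigma2, m_max=2):
    """Likelihood of the real part y given symbol s, keeping 2*m_max + 1 replicas."""
    m = np.arange(-m_max, m_max + 1)
    centers = (s - tau * m) / g                      # replicas of s shifted by multiples of tau
    terms = np.exp(-(y - centers) ** 2 / (2.0 * sigma2))
    return np.sum(terms) / np.sqrt(2.0 * np.pi * sigma2)

tau, g, sigma2 = 8.0, 1.0, 0.5
like_near = vp_likelihood(0.9, 1.0, tau, g, sigma2)     # observation close to s = 1
like_far = vp_likelihood(0.9, -3.0, tau, g, sigma2)     # competing symbol s = -3
like_long = vp_likelihood(0.9, 1.0, tau, g, sigma2, m_max=10)
```

For these values the replicas with $|m| \ge 1$ lie many standard deviations away from the observation, so keeping only a few terms changes the result by a negligible amount.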
In our simulation, we replace the infinite sum in (34) with a sum of a few terms on both sides of $m = 0$ since the performance loss relative to the infinite sum is negligible. From Fig. 4, we observe that the proposed method outperforms the BD-VP and BD-UCD. In particular, the gain of the BD-MVP over the BD-VP and BD-UCD in the waterfall region is around 2 dB and 4 dB, respectively. Since the performance requirement of a coded system typically lies in the waterfall regime, the proposed method can bring a substantial
TABLE II
TIME COMPLEXITY FOR $\{4, 4\} \times 8$ SYSTEM (RUNNING TIME FOR $10^4$ SYMBOLS)
Algorithm | BD-MVP | BD-VP | BD-GMD | BD-UCD
Running time (sec) | 4.0531 | 4.8308 | 1.2125 | 19.2539
reduction on the transmitter power over the BD-UCD and
BD-VP. Clearly, we conclude that the proposed scheme is
competitive in both practical and theoretical perspectives.
V. CONCLUSION
In this paper, we investigated an approach based on block
diagonalization (BD) achieving robustness of multiuser MIMO
downlink. Motivated by the fact that mobile users suffer from
noise enhancement effect due to the zero forcing nature of
the BD algorithm, we combined the BD with the MMSE-
VP to achieve the balance between residual interference
mitigation and noise enhancement suppression. We could
observe from the sum rate analysis as well as practical BER
performance (both coded and uncoded) that the proposed
BD-MVP approach uniformly outperforms the BD algorithm
and its variants. Future study needs to be directed towards
the investigation of the performance when the channel state
information at the transmitter (CSIT) is imperfect.
APPENDIX I
QR DECOMPOSITION BASED BD
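The pseudo inverse of the $N_R \times N_T$ composite channel matrix $\mathbf{H} = [\mathbf{H}_1^T\ \mathbf{H}_2^T \cdots \mathbf{H}_K^T]^T$ is $\hat{\mathbf{H}} = \mathbf{H}^H(\mathbf{H}\mathbf{H}^H)^{-1} = [\hat{\mathbf{H}}_1\ \hat{\mathbf{H}}_2 \cdots \hat{\mathbf{H}}_K]$, where $\hat{\mathbf{H}}_j$ is an $N_T \times N_{R,j}$ matrix. Then, one can show that
$$\mathbf{H}\hat{\mathbf{H}} = \begin{bmatrix}\mathbf{H}_1 \\ \vdots \\ \mathbf{H}_K\end{bmatrix}\begin{bmatrix}\hat{\mathbf{H}}_1 & \cdots & \hat{\mathbf{H}}_K\end{bmatrix} \tag{35}$$
$$= \begin{bmatrix}\mathbf{H}_1\hat{\mathbf{H}}_1 & \cdots & \mathbf{H}_1\hat{\mathbf{H}}_K \\ \vdots & \ddots & \vdots \\ \mathbf{H}_K\hat{\mathbf{H}}_1 & \cdots & \mathbf{H}_K\hat{\mathbf{H}}_K\end{bmatrix} = \begin{bmatrix}\mathbf{I}_{N_{R,1}} & & \mathbf{0} \\ & \ddots & \\ \mathbf{0} & & \mathbf{I}_{N_{R,K}}\end{bmatrix} = \mathbf{I}_{N_R}. \tag{36}$$
Clearly, $\mathbf{H}_j\hat{\mathbf{H}}_k = \mathbf{0}$ when $j \neq k$. By defining $\tilde{\mathbf{H}}_j = [\mathbf{H}_1^T \cdots \mathbf{H}_{j-1}^T\ \mathbf{H}_{j+1}^T \cdots \mathbf{H}_K^T]^T$, the zero inter-user interference constraint is satisfied as $\tilde{\mathbf{H}}_j\hat{\mathbf{H}}_j = \mathbf{0}$. Now consider the QR decomposition of $\hat{\mathbf{H}}_j$
$$\hat{\mathbf{H}}_j = \hat{\mathbf{Q}}_j\hat{\mathbf{R}}_j \quad \text{for } j = 1, \ldots, K \tag{37}$$
where $\hat{\mathbf{R}}_j$ is an $N_{R,j} \times N_{R,j}$ upper triangular matrix and $\hat{\mathbf{Q}}_j$ is an $N_T \times N_{R,j}$ matrix whose columns form an orthonormal basis for the column space of $\hat{\mathbf{H}}_j$. From the zero inter-user interference constraint, we have $\tilde{\mathbf{H}}_j\hat{\mathbf{Q}}_j\hat{\mathbf{R}}_j = \mathbf{0}$. Since $\hat{\mathbf{R}}_j$ is invertible, it follows that $\tilde{\mathbf{H}}_j\hat{\mathbf{Q}}_j = \mathbf{0}$.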
APPENDIX II
PROOF OF JOINT OPTIMIZATION PROBLEM
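The Lagrangian function to solve the joint optimization problem is
$$f(\mathbf{p}_j[n], \mathbf{x}_j[n], g_j, \mu) = \varepsilon(\mathbf{p}_j[n], \mathbf{x}_j[n], g_j) + \mu\left(\sum_{n=1}^{N_B}\mathbf{x}_j^H[n]\mathbf{x}_j[n] - P_T\right). \tag{38}$$
By equating the derivatives of $f(\cdot)$ with respect to $\mathbf{x}_j[n]$, $g_j$ and $\mu$ to zero, we obtain the following equations
$$\frac{\partial f(\cdot)}{\partial\mathbf{x}_j[n]} = g_j^2\mathbf{x}_j^H[n]\mathbf{H}_{\mathrm{eff},j}^H\mathbf{H}_{\mathrm{eff},j} - g_j\mathbf{d}_j^H[n]\mathbf{B}_j^H\mathbf{H}_{\mathrm{eff},j} + \mu\mathbf{x}_j^H[n] = \mathbf{0}, \tag{39}$$
$$\frac{\partial f(\cdot)}{\partial g_j} = \sum_{n=1}^{N_B}\Big\{2g_j\mathbf{x}_j^H[n]\mathbf{H}_{\mathrm{eff},j}^H\mathbf{H}_{\mathrm{eff},j}\mathbf{x}_j[n] - \mathbf{d}_j^H[n]\mathbf{B}_j^H\mathbf{H}_{\mathrm{eff},j}\mathbf{x}_j[n] + 2g_j\,\mathrm{tr}(\mathbf{R}_n) - \mathbf{x}_j^H[n]\mathbf{H}_{\mathrm{eff},j}^H\mathbf{B}_j\mathbf{d}_j[n]\Big\} = 0, \tag{40}$$
$$\frac{\partial f(\cdot)}{\partial\mu} = \sum_{n=1}^{N_B}\mathbf{x}_j^H[n]\mathbf{x}_j[n] - P_T = 0. \tag{41}$$
Using the matrix inversion lemma, (39) becomes
$$\mathbf{x}_j[n] = \frac{1}{g_j}\left(\mathbf{H}_{\mathrm{eff},j}^H\mathbf{H}_{\mathrm{eff},j} + \frac{\mu}{g_j^2}\mathbf{I}\right)^{-1}\mathbf{H}_{\mathrm{eff},j}^H\mathbf{B}_j\mathbf{d}_j[n] = \frac{1}{g_j}\mathbf{H}_{\mathrm{eff},j}^H\left(\mathbf{H}_{\mathrm{eff},j}\mathbf{H}_{\mathrm{eff},j}^H + \frac{\mu}{g_j^2}\mathbf{I}\right)^{-1}\mathbf{B}_j\mathbf{d}_j[n]. \tag{42}$$
By multiplying $\frac{\mathbf{x}_j[n]}{g_j}$ to the right side of (39), we have
$$\mathbf{d}_j^H[n]\mathbf{B}_j^H\mathbf{H}_{\mathrm{eff},j}\mathbf{x}_j[n] = g_j\mathbf{x}_j^H[n]\mathbf{H}_{\mathrm{eff},j}^H\mathbf{H}_{\mathrm{eff},j}\mathbf{x}_j[n] + \frac{\mu}{g_j}\mathbf{x}_j^H[n]\mathbf{x}_j[n]. \tag{43}$$
Plugging (43) into (40), we have
$$\sum_{n=1}^{N_B}\left(-2\frac{\mu}{g_j}\mathbf{x}_j^H[n]\mathbf{x}_j[n] + 2g_j\,\mathrm{tr}(\mathbf{R}_n)\right) = 0. \tag{44}$$
After some manipulation, we have
$$\frac{\mu}{g_j^2} = \frac{\mathrm{tr}(\mathbf{R}_n)}{\frac{1}{N_B}\sum_{n=1}^{N_B}\mathbf{x}_j^H[n]\mathbf{x}_j[n]} = \frac{N_B\,\mathrm{tr}(\mathbf{R}_n)}{P_T}. \tag{45}$$
Let $\xi = \frac{N_B\,\mathrm{tr}(\mathbf{R}_n)}{P_T}$, then (42) can be rewritten as
$$\mathbf{x}_j[n] = \frac{1}{g_j}(\mathbf{H}_{\mathrm{eff},j}^H\mathbf{H}_{\mathrm{eff},j} + \xi\mathbf{I})^{-1}\mathbf{H}_{\mathrm{eff},j}^H\mathbf{B}_j\mathbf{d}_j[n] = \frac{1}{g_j}\mathbf{H}_{\mathrm{eff},j}^H(\mathbf{H}_{\mathrm{eff},j}\mathbf{H}_{\mathrm{eff},j}^H + \xi\mathbf{I})^{-1}\mathbf{B}_j\mathbf{d}_j[n]. \tag{46}$$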
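The equality of the two forms in (46) rests on a standard consequence of the matrix inversion lemma. The short NumPy check below verifies it for a randomly drawn matrix; a non-square matrix is used on purpose to show that the identity does not depend on the shape, and all names and values are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(6)
H = (rng.standard_normal((2, 4)) + 1j * rng.standard_normal((2, 4))) / np.sqrt(2)
xi = 0.3

# (H^H H + xi I)^{-1} H^H, with a 4 x 4 inverse.
left = np.linalg.inv(H.conj().T @ H + xi * np.eye(4)) @ H.conj().T
# H^H (H H^H + xi I)^{-1}, with a 2 x 2 inverse.
right = H.conj().T @ np.linalg.inv(H @ H.conj().T + xi * np.eye(2))

gap = np.max(np.abs(left - right))
```

Both expressions agree to numerical precision, so whichever form involves the smaller inverse can be used in practice.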
APPENDIX III
PROOF OF (16)
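Plugging $\mathbf{x}_j[n]$ in (14) and $g_j$ in (15) into (12), and after some manipulation, we have
$$
\begin{aligned}
&\varepsilon(\mathbf{p}_j[n] \mid \mathbf{x}_j[n], g_j) \\
&= \sum_{n=1}^{N_B}\Big( g_j^2\,\mathbf{x}_j^H[n]\mathbf{H}_{\mathrm{eff},j}^H\mathbf{H}_{\mathrm{eff},j}\mathbf{x}_j[n] - g_j\,\mathbf{d}_j^H[n]\mathbf{B}_j^H\mathbf{H}_{\mathrm{eff},j}\mathbf{x}_j[n] \\
&\qquad\quad + g_j^2\,\mathrm{tr}(\mathbf{R}_n) - g_j\,\mathbf{x}_j^H[n]\mathbf{H}_{\mathrm{eff},j}^H\mathbf{B}_j\mathbf{d}_j[n] + \mathbf{d}_j^H[n]\mathbf{d}_j[n] \Big) \\
&= \sum_{n=1}^{N_B}\Big( \mathbf{d}_j^H[n]\mathbf{B}_j^H(\mathbf{H}_{\mathrm{eff},j}\mathbf{H}_{\mathrm{eff},j}^H + \xi\mathbf{I})^{-1}\mathbf{H}_{\mathrm{eff},j}\mathbf{H}_{\mathrm{eff},j}^H\mathbf{H}_{\mathrm{eff},j}\mathbf{H}_{\mathrm{eff},j}^H(\mathbf{H}_{\mathrm{eff},j}\mathbf{H}_{\mathrm{eff},j}^H + \xi\mathbf{I})^{-1}\mathbf{B}_j\mathbf{d}_j[n] + \mathbf{d}_j^H[n]\mathbf{d}_j[n] \\
&\qquad\quad - \mathbf{d}_j^H[n]\mathbf{B}_j^H\mathbf{H}_{\mathrm{eff},j}\mathbf{H}_{\mathrm{eff},j}^H(\mathbf{H}_{\mathrm{eff},j}\mathbf{H}_{\mathrm{eff},j}^H + \xi\mathbf{I})^{-1}\mathbf{B}_j\mathbf{d}_j[n] \\
&\qquad\quad - \mathbf{d}_j^H[n]\mathbf{B}_j^H(\mathbf{H}_{\mathrm{eff},j}\mathbf{H}_{\mathrm{eff},j}^H + \xi\mathbf{I})^{-1}\mathbf{H}_{\mathrm{eff},j}\mathbf{H}_{\mathrm{eff},j}^H\mathbf{B}_j\mathbf{d}_j[n] \\
&\qquad\quad + \xi\,\mathbf{d}_j^H[n]\mathbf{B}_j^H(\mathbf{H}_{\mathrm{eff},j}\mathbf{H}_{\mathrm{eff},j}^H + \xi\mathbf{I})^{-1}\mathbf{H}_{\mathrm{eff},j}\mathbf{H}_{\mathrm{eff},j}^H(\mathbf{H}_{\mathrm{eff},j}\mathbf{H}_{\mathrm{eff},j}^H + \xi\mathbf{I})^{-1}\mathbf{B}_j\mathbf{d}_j[n] \Big) \\
&= \sum_{n=1}^{N_B}\mathbf{d}_j^H[n]\mathbf{B}_j^H\Big( (\mathbf{H}_{\mathrm{eff},j}\mathbf{H}_{\mathrm{eff},j}^H + \xi\mathbf{I})^{-1}\mathbf{H}_{\mathrm{eff},j}\mathbf{H}_{\mathrm{eff},j}^H\mathbf{H}_{\mathrm{eff},j}\mathbf{H}_{\mathrm{eff},j}^H(\mathbf{H}_{\mathrm{eff},j}\mathbf{H}_{\mathrm{eff},j}^H + \xi\mathbf{I})^{-1} \\
&\qquad\quad + \mathbf{I} - \mathbf{H}_{\mathrm{eff},j}\mathbf{H}_{\mathrm{eff},j}^H(\mathbf{H}_{\mathrm{eff},j}\mathbf{H}_{\mathrm{eff},j}^H + \xi\mathbf{I})^{-1} - (\mathbf{H}_{\mathrm{eff},j}\mathbf{H}_{\mathrm{eff},j}^H + \xi\mathbf{I})^{-1}\mathbf{H}_{\mathrm{eff},j}\mathbf{H}_{\mathrm{eff},j}^H \\
&\qquad\quad + \xi(\mathbf{H}_{\mathrm{eff},j}\mathbf{H}_{\mathrm{eff},j}^H + \xi\mathbf{I})^{-1}\mathbf{H}_{\mathrm{eff},j}\mathbf{H}_{\mathrm{eff},j}^H(\mathbf{H}_{\mathrm{eff},j}\mathbf{H}_{\mathrm{eff},j}^H + \xi\mathbf{I})^{-1} \Big)\mathbf{B}_j\mathbf{d}_j[n] \\
&= \sum_{n=1}^{N_B}\mathbf{d}_j^H[n]\mathbf{B}_j^H\Big( (\mathbf{H}_{\mathrm{eff},j}\mathbf{H}_{\mathrm{eff},j}^H + \xi\mathbf{I})^{-1}\mathbf{H}_{\mathrm{eff},j}\mathbf{H}_{\mathrm{eff},j}^H\mathbf{H}_{\mathrm{eff},j}\mathbf{H}_{\mathrm{eff},j}^H(\mathbf{H}_{\mathrm{eff},j}\mathbf{H}_{\mathrm{eff},j}^H + \xi\mathbf{I})^{-1} \\
&\qquad\quad - \mathbf{I} + \mathbf{I} - \mathbf{H}_{\mathrm{eff},j}\mathbf{H}_{\mathrm{eff},j}^H(\mathbf{H}_{\mathrm{eff},j}\mathbf{H}_{\mathrm{eff},j}^H + \xi\mathbf{I})^{-1} + \mathbf{I} - (\mathbf{H}_{\mathrm{eff},j}\mathbf{H}_{\mathrm{eff},j}^H + \xi\mathbf{I})^{-1}\mathbf{H}_{\mathrm{eff},j}\mathbf{H}_{\mathrm{eff},j}^H \\
&\qquad\quad + \xi(\mathbf{H}_{\mathrm{eff},j}\mathbf{H}_{\mathrm{eff},j}^H + \xi\mathbf{I})^{-1}\mathbf{H}_{\mathrm{eff},j}\mathbf{H}_{\mathrm{eff},j}^H(\mathbf{H}_{\mathrm{eff},j}\mathbf{H}_{\mathrm{eff},j}^H + \xi\mathbf{I})^{-1} \Big)\mathbf{B}_j\mathbf{d}_j[n] \\
&= \sum_{n=1}^{N_B}\mathbf{d}_j^H[n]\mathbf{B}_j^H\Big( (\mathbf{H}_{\mathrm{eff},j}\mathbf{H}_{\mathrm{eff},j}^H + \xi\mathbf{I})^{-1}\mathbf{H}_{\mathrm{eff},j}\mathbf{H}_{\mathrm{eff},j}^H\mathbf{H}_{\mathrm{eff},j}\mathbf{H}_{\mathrm{eff},j}^H(\mathbf{H}_{\mathrm{eff},j}\mathbf{H}_{\mathrm{eff},j}^H + \xi\mathbf{I})^{-1} \\
&\qquad\quad - \mathbf{I} + 2\xi(\mathbf{H}_{\mathrm{eff},j}\mathbf{H}_{\mathrm{eff},j}^H + \xi\mathbf{I})^{-1} \\
&\qquad\quad + \xi(\mathbf{H}_{\mathrm{eff},j}\mathbf{H}_{\mathrm{eff},j}^H + \xi\mathbf{I})^{-1}\mathbf{H}_{\mathrm{eff},j}\mathbf{H}_{\mathrm{eff},j}^H(\mathbf{H}_{\mathrm{eff},j}\mathbf{H}_{\mathrm{eff},j}^H + \xi\mathbf{I})^{-1} \Big)\mathbf{B}_j\mathbf{d}_j[n] \\
&= \sum_{n=1}^{N_B}\mathbf{d}_j^H[n]\mathbf{B}_j^H\Big( \xi(\mathbf{H}_{\mathrm{eff},j}\mathbf{H}_{\mathrm{eff},j}^H + \xi\mathbf{I})^{-1} \Big)\mathbf{B}_j\mathbf{d}_j[n],
\end{aligned} \tag{47}
$$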
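where $\mathbf{I} - (\mathbf{H}_{\mathrm{eff},j}\mathbf{H}_{\mathrm{eff},j}^H + \xi\mathbf{I})^{-1}\mathbf{H}_{\mathrm{eff},j}\mathbf{H}_{\mathrm{eff},j}^H = \xi(\mathbf{H}_{\mathrm{eff},j}\mathbf{H}_{\mathrm{eff},j}^H + \xi\mathbf{I})^{-1}$ follows from the matrix inversion lemma. Employing the Cholesky factorization $(\mathbf{H}_{\mathrm{eff},j}\mathbf{H}_{\mathrm{eff},j}^H + \xi\mathbf{I})^{-1} = \mathbf{L}_j^H\mathbf{L}_j$ and also noting that $\mathbf{d}_j[n] = \mathbf{s}_j[n] + \mathbf{p}_j[n]$, (47) becomes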
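$$
\begin{aligned}
\varepsilon(\mathbf{p}_j[n] \mid \mathbf{x}_j[n], g_j) &= \xi\sum_{n=1}^{N_B}\mathbf{d}_j^H[n]\mathbf{B}_j^H\mathbf{L}_j^H\mathbf{L}_j\mathbf{B}_j\mathbf{d}_j[n] \\
&= \xi\sum_{n=1}^{N_B}\left\|\mathbf{L}_j\mathbf{B}_j(\mathbf{s}_j[n] + \mathbf{p}_j[n])\right\|^2.
\end{aligned} \tag{48}
$$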
In summary, the optimization problem to find the perturbation vector $\mathbf{p}_j[n]$ is expressed as
$$
\mathbf{p}_j[n] = \arg\min_{\mathbf{p}'_j[n]} \left\|\mathbf{L}_j\mathbf{B}_j(\mathbf{s}_j[n] + \mathbf{p}'_j[n])\right\|^2. \tag{49}
$$
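The chain of equalities from the first line of (47) to the Cholesky form in (48) can be sanity-checked on random data. This is a sketch under stated assumptions, not the authors' code: the sizes, the power budget `P_T`, and the noise trace `tr_Rn` are made-up values, and $\mathbf{B}_j$ is replaced by a random unitary matrix, since the step that combines $\mathbf{d}_j^H[n]\mathbf{d}_j[n]$ with the other terms relies on $\mathbf{B}_j^H\mathbf{B}_j = \mathbf{I}$. The regularizer is $\xi = N_B\,\mathrm{tr}(\mathbf{R}_n)/P_T$, and the positive factor $\xi$ multiplying the sum does not change the minimizer in (49).

```python
import numpy as np

rng = np.random.default_rng(1)
n_j, N_B = 3, 4                  # assumed toy sizes
P_T, tr_Rn = 2.0, 0.3            # assumed power budget and noise trace tr(R_n)
xi = N_B * tr_Rn / P_T

def crandn(*shape):
    """Complex Gaussian helper used only for this demo."""
    return rng.standard_normal(shape) + 1j * rng.standard_normal(shape)

H = crandn(n_j, n_j)                     # stand-in for H_eff,j
B, _ = np.linalg.qr(crandn(n_j, n_j))    # unitary stand-in for B_j
D = crandn(n_j, N_B)                     # columns play the role of d_j[n]

A = np.linalg.inv(H @ H.conj().T + xi * np.eye(n_j))
X_unscaled = H.conj().T @ A @ B @ D
g = np.sqrt(np.sum(np.abs(X_unscaled) ** 2) / P_T)   # enforces the constraint (41)
X = X_unscaled / g

# First line of (47): with unitary B the signal terms equal ||B d - g H x||^2.
mse_direct = np.sum(np.abs(B @ D - g * H @ X) ** 2) + N_B * g**2 * tr_Rn

# Last line of (47): xi * sum_n d^H B^H A B d.
BD = B @ D
mse_closed = xi * np.real(np.sum(BD.conj() * (A @ BD)))

# (48): numpy returns a lower-triangular C with A = C C^H, so L = C^H gives A = L^H L.
L = np.linalg.cholesky(A).conj().T
mse_cholesky = xi * np.sum(np.abs(L @ BD) ** 2)

print(mse_direct, mse_closed, mse_cholesky)
```

The three printed values should agree up to rounding, which is the content of (47) and (48).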
APPENDIX IV
SUMMARY OF THE BD-MVP ALGORITHM
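Input: $\mathbf{H}$, $\mathbf{s}[n]$
Output: $\mathbf{p}[n]$
Variables:
1) $j$ denotes the $j$th user being examined
2) $n$ denotes the $n$th block being examined
Step 1: (QR-based BD step)
In order to compute the nullspace of $\tilde{\mathbf{H}}_j$, we define the pseudo-inverse of the channel matrix $\mathbf{H}$ as
$$
\bar{\mathbf{H}} = \mathbf{H}^H(\mathbf{H}\mathbf{H}^H)^{-1} = [\bar{\mathbf{H}}_1 \cdots \bar{\mathbf{H}}_j \cdots \bar{\mathbf{H}}_K]
$$
for $j = 1 : K$
    $\bar{\mathbf{H}}_j = \bar{\mathbf{Q}}_j\bar{\mathbf{R}}_j$
    $\mathbf{F}_j = \bar{\mathbf{Q}}_j$
    $\mathbf{H}_{\mathrm{eff},j} = \mathbf{H}_j\mathbf{F}_j$
end
Step 2: (MMSE-VP step)
for $j = 1 : K$
    factorize $(\mathbf{H}_{\mathrm{eff},j}\mathbf{H}_{\mathrm{eff},j}^H + \xi\mathbf{I}_{n_j})^{-1} = \mathbf{L}_j^H\mathbf{L}_j$
    for $n = 1, \ldots, N_B$
        $\mathbf{p}_j[n] = \arg\min_{\mathbf{p}'_j[n]} \|\mathbf{L}_j\mathbf{B}_j(\mathbf{s}_j[n] + \mathbf{p}'_j[n])\|^2$
        $\mathbf{x}_j[n] = \mathbf{H}_{\mathrm{eff},j}^H(\mathbf{H}_{\mathrm{eff},j}\mathbf{H}_{\mathrm{eff},j}^H + \xi\mathbf{I}_{n_j})^{-1}\mathbf{B}_j\mathbf{d}_j[n]$
        $\phantom{\mathbf{x}_j[n]} = \mathbf{H}_{\mathrm{eff},j}^H(\mathbf{H}_{\mathrm{eff},j}\mathbf{H}_{\mathrm{eff},j}^H + \xi\mathbf{I}_{n_j})^{-1}\mathbf{B}_j(\mathbf{s}_j[n] + \mathbf{p}_j[n])$
    end
    $g_j = \sqrt{\frac{1}{P_T}\sum_{n=1}^{N_B}\mathbf{d}_j^H[n]\mathbf{B}_j^H\mathbf{H}_{\mathrm{eff},j}(\mathbf{H}_{\mathrm{eff},j}^H\mathbf{H}_{\mathrm{eff},j} + \xi\mathbf{I}_{n_j})^{-2}\mathbf{H}_{\mathrm{eff},j}^H\mathbf{B}_j\mathbf{d}_j[n]}$
    for $n = 1, \ldots, N_B$
        $\mathbf{x}_j[n] \leftarrow \frac{1}{g_j}\mathbf{x}_j[n]$
    end
end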
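The two steps above can be sketched in NumPy as follows. This is an illustrative sketch under several assumptions, not the authors' implementation: the perturbation search is a brute-force scan over a small hypothetical candidate set (each entry of the perturbation is `tau * (a + jb)` with `a`, `b` in {-1, 0, 1}) instead of a sphere encoder, `B` is passed in by the caller and set to the identity in the demo, and the values of `xi`, `P_T`, and `tau` are made up. The function names are hypothetical as well.

```python
import itertools
import numpy as np

def qr_block_diagonalize(H, n_rx):
    """Step 1: precoders F_j and effective channels H_eff,j for all users."""
    H_pinv = H.conj().T @ np.linalg.inv(H @ H.conj().T)
    F, H_eff, start = [], [], 0
    for n_j in n_rx:
        rows = slice(start, start + n_j)
        Q, _ = np.linalg.qr(H_pinv[:, rows])   # thin QR of the j-th column block
        F.append(Q)
        H_eff.append(H[rows, :] @ Q)
        start += n_j
    return F, H_eff

def mmse_vp_user(H_eff, B, S, xi, P_T, tau):
    """Step 2 for one user; the columns of S are the symbol vectors s_j[n]."""
    n_j, N_B = S.shape
    A = np.linalg.inv(H_eff @ H_eff.conj().T + xi * np.eye(n_j))
    L = np.linalg.cholesky(A).conj().T         # A = L^H L
    # Assumed toy search set; a real system would use a sphere encoder.
    offsets = [tau * complex(a, b) for a in (-1, 0, 1) for b in (-1, 0, 1)]
    candidates = [np.array(c) for c in itertools.product(offsets, repeat=n_j)]
    P = np.zeros_like(S)
    for n in range(N_B):
        costs = [np.linalg.norm(L @ B @ (S[:, n] + p)) ** 2 for p in candidates]
        P[:, n] = candidates[int(np.argmin(costs))]
    X = H_eff.conj().T @ A @ B @ (S + P)       # unscaled transmit vectors
    g = np.sqrt(np.sum(np.abs(X) ** 2) / P_T)  # power normalization
    return P, X / g, g

# Demo with two users, two receive antennas each, four transmit antennas.
rng = np.random.default_rng(2)
n_rx, n_tx, N_B = [2, 2], 4, 3
H = rng.standard_normal((4, n_tx)) + 1j * rng.standard_normal((4, n_tx))
F, H_eff = qr_block_diagonalize(H, n_rx)
leakage = np.linalg.norm(H[0:2, :] @ F[1])     # user 1's channel through user 2's precoder

# Unnormalized QPSK symbols for user 1.
S = rng.choice([-1.0, 1.0], (2, N_B)) + 1j * rng.choice([-1.0, 1.0], (2, N_B))
P, X, g = mmse_vp_user(H_eff[0], np.eye(2), S, xi=0.2, P_T=1.0, tau=4.0)
print(leakage < 1e-9, np.isclose(np.sum(np.abs(X) ** 2), 1.0))
```

The leakage check reflects why the QR of the pseudo-inverse blocks works: since $\mathbf{H}\bar{\mathbf{H}} = \mathbf{I}$, each $\bar{\mathbf{Q}}_j$ lies in the nullspace of the other users' channels. The exhaustive scan grows as $9^{n_j}$ candidates per block, so it is only practical for very small $n_j$.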
REFERENCES
[1] I. E. Telatar, "Capacity of multi-antenna Gaussian channels," Eur. Trans.
Telecommun., vol. 10, pp. 585–595, Nov./Dec. 1999.
[2] H. Weingarten, Y. Steinberg, and S. Shamai, "The capacity region of the
Gaussian MIMO broadcast channel," IEEE Trans. Inf. Theory, vol. 52,
pp. 3936–3964, Sep. 2006.
[3] S. Vishwanath, N. Jindal, and A. Goldsmith, "Duality, achievable rates,
and sum-rate capacity of Gaussian MIMO broadcast channels," IEEE
Trans. Inf. Theory, vol. 49, pp. 2658–2668, Oct. 2003.
[4] W. Yu and J. M. Cioffi, "Sum capacity of Gaussian vector broadcast
channels," IEEE Trans. Inf. Theory, vol. 50, pp. 1875–1892, Sep. 2004.
[5] P. Viswanath and D. N. C. Tse, "Sum capacity of the vector Gaussian
broadcast channel and uplink-downlink duality," IEEE Trans. Inf. Theory,
vol. 49, pp. 1912–1921, Aug. 2003.
[6] G. Caire and S. Shamai, "On the achievable throughput of a multiantenna
Gaussian broadcast channel," IEEE Trans. Inf. Theory, vol. 49,
pp. 1691–1706, July 2003.
[7] S. I. Gelfand and M. S. Pinsker, "Coding for channel with side information,"
Problemi Peredachi Informatsii, vol. 9, no. 1, pp. 19–31, 1980.
[8] M. Costa, "Writing on dirty paper," IEEE Trans. Inf. Theory, vol. 29,
pp. 439–441, May 1983.
[9] Q. H. Spencer, A. L. Swindlehurst, and M. Haardt, "Zero-forcing methods
for downlink spatial multiplexing in multiuser MIMO channels," IEEE
Trans. Signal Process., vol. 52, pp. 461–471, Feb. 2004.
[10] C. B. Chae, S. Shim, and R. W. Heath Jr., "Block diagonalized
vector perturbation for multiuser MIMO systems," IEEE Trans. Wireless
Commun., vol. 7, pp. 4051–4057, Nov. 2008.
[11] D. Schmidt, M. Joham, and W. Utschick, "Minimum mean square error
vector precoding," Eur. Trans. Telecommun., vol. 19, pp. 219–231, Mar.
2007.
[12] S. Lin, W. L. Ho, and Y. C. Liang, "Block diagonal geometric mean
decomposition (BD-GMD) for MIMO broadcast channels," IEEE Trans.
Wireless Commun., vol. 7, no. 7, pp. 2778–2789, July 2008.
[13] T. M. Cover and J. A. Thomas, Elements of Information Theory. Wiley,
1991.
[14] C. B. Peel, B. M. Hochwald, and A. L. Swindlehurst, "A vector-perturbation
technique for near-capacity multiantenna multiuser communication -
Part I: Channel inversion and regularization," IEEE Trans.
Commun., vol. 53, pp. 195–202, Jan. 2005.
[15] B. M. Hochwald, C. B. Peel, and A. L. Swindlehurst, "A vector-perturbation
technique for near-capacity multiantenna multiuser communication -
Part II: Perturbation," IEEE Trans. Commun., vol. 53,
pp. 537–544, Mar. 2005.
[16] H. Sung, S. Lee, and I. Lee, "Generalized channel inversion methods
for multiuser MIMO systems," IEEE Trans. Commun., vol. 57,
pp. 3489–3499, Nov. 2009.
[17] S. Boyd and L. Vandenberghe, Convex Optimization. Cambridge
University Press, 2004.
[18] Y. Jiang, J. Li, and W. W. Hager, "Joint transceiver design for MIMO
communications using geometric mean decomposition," IEEE Trans.
Signal Process., vol. 53, no. 10, pp. 3791–3803, Oct. 2005.
[19] F. Liu, L. Jiang, and C. He, "Joint MMSE vector precoding based on
GMD method for MIMO systems," IEICE Trans. Commun., vol. 90,
no. 9, pp. 2617–2620, Sep. 2007.
[20] G. H. Golub and C. F. Van Loan, Matrix Computations, 3rd ed. The Johns
Hopkins University Press, 1989.
[21] B. Shim and I. Kang, "Sphere decoding with a probabilistic tree
pruning," IEEE Trans. Signal Process., vol. 56, no. 10, pp. 4867–4878,
Oct. 2008.
[22] J. Jaldén and B. Ottersten, "On the complexity of sphere decoding in
digital communication," IEEE Trans. Signal Process., vol. 53, no. 4,
pp. 1474–1484, Apr. 2005.
[23] 3rd Generation Partnership Project, "Technical specification group radio
access network; physical layer (Release 7)," Tech. Spec. 25.201 V7.2.0,
2007. [Online]. Available:
http://www.3gpp.org/ftp/Specs/archive/25_series/25.201/25201-720.zip
Jungyong Park (S'09) received the B.S. degree in
information and communication engineering from
Dongguk University, Seoul, Korea, in 2008, and the
M.S. degree from the School of Information and
Communication, Korea University, Seoul, Korea, in
2010, where he is currently working toward the
Ph.D. degree. His research interests include wireless
communications and information theory.
Byungju Lee (S'09) received the B.S. and M.S.
degrees from the School of Information and Communication,
Korea University, Seoul, Korea, in 2008
and 2010, respectively, where he is currently working toward the
Ph.D. degree. His research interests include wireless
communications and information theory.
Byonghyo Shim (SM'09) received the B.S. and
M.S. degrees in control and instrumentation engineering
(currently electrical engineering) from Seoul
National University, Korea, in 1995 and 1997, respectively,
and the M.S. degree in mathematics and
the Ph.D. degree in electrical and computer engineering
from the University of Illinois at Urbana-Champaign,
in 2004 and 2005, respectively.
From 1997 to 2000, he was with the Department
of Electronics Engineering at the Korean Air Force
Academy as an Officer (First Lieutenant) and an
Academic Full-time Instructor. From 2005 to 2007, he was with Qualcomm
Inc., San Diego, CA, working on CDMA systems with an emphasis on
next-generation UMTS receiver design. Since September 2007, he has been with
the School of Information and Communication, Korea University, where he is
now an Associate Professor. His research interests include signal processing
for wireless communications, compressive sensing, applied linear algebra, and
information theory.
Dr. Shim was the recipient of the 2005 M. E. Van Valkenburg research
award from the ECE department of the University of Illinois and the 2010
Haedong award from IEEK.
