
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/LWC.2017.2761876, IEEE Wireless Communications Letters

Fast Converging Weighted Neumann Series Precoding for Massive MIMO Systems
Betty Nagy, Maha Elsabrouty and Salwa Elramly

B. Nagy is with the Department of Physics and Applied Mathematics and S. Elramly is with the Department of Electronics and Communications Engineering, Faculty of Engineering, Ain Shams University, Cairo 11571, Egypt (e-mail: {betty.nagy,salwa_elramly}
M. Elsabrouty is with the Department of Electronics and Communications Engineering, Egypt-Japan University of Science and Technology (E-JUST), Borg El-Arab 21934, Alexandria, Egypt (e-mail:
2162-2337 (c) 2017 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See for more information.

Abstract—The Neumann series (NS) expansion-based precoder in massive multiple-input multiple-output (MIMO) systems suffers from slow convergence. To solve this problem, this letter proposes a weighted NS expansion precoder. The weights are designed to minimize the error between the exact inverse and the weighted NS inverse. The optimal weights are derived analytically. Moreover, an approximation of these optimal weights is proposed, based on the properties of large Wishart matrices, which avoids recomputing the weights. The weighted NS precoder provides near-optimal performance with only four weighted NS terms and has lower complexity than recently proposed approximate precoders.

Index Terms—Massive MIMO, Matrix inversion, Neumann series, Optimization.

I. INTRODUCTION
Massive MIMO is one of the most promising technologies for 5th generation (5G) wireless communication systems [1] due to its exceptional array gain. Marzetta showed in [2] that as the number of antenna elements grows large, the effects of uncorrelated noise, fast fading and intra-cell interference decrease. Thus, linear precoding schemes can achieve near-optimal performance [3].
Linear precoders like Zero Forcing (ZF) [3], Regularized Zero Forcing (RZF) [4] and Minimum Mean Square Error (MMSE) [3] require the inversion of the channel Gram matrix of all users. With the large Gram matrix that arises when massive MIMO serves a large number of users, matrix inversion is an important practical problem that affects the precoder design and performance. A good precoder requires a matrix inversion approximation that combines low complexity with good accuracy. There are several approaches to obtaining the inverse of a large matrix. The first approach, the direct method, decomposes the matrix to be inverted into a product of simple matrices, as in Cholesky factorization [5] and QR decomposition. This approach suffers from high complexity and requires special arithmetic units [6]. The second approach, the indirect or iterative methods, treats the inversion problem as a system of linear equations and solves it iteratively. Examples of the second approach are the Gauss-Seidel [7], Conjugate Gradient [8] and Symmetric Successive Over-Relaxation (SSOR) [9] methods. Although the iterative approach gives a more accurate approximation than the direct approach [5], it suffers from larger delays. The third approach expands the inverse of a matrix into a series of matrix-vector multiplications, as in the Neumann Series (NS) expansion [6] and the truncated polynomial expansion (TPE) [4]. The main advantage of the NS expansion is its simple implementation. However, it suffers from slow convergence.
This letter aims at speeding up the convergence of the NS expansion-based precoder. Firstly, weights are introduced to the terms of the NS expansion. Secondly, an optimization problem whose objective is to minimize the error between the exact matrix inverse and the weighted NS inverse is presented and solved analytically. Finally, an approximation to the optimal weights is proposed based on the properties of large Wishart matrices. Consequently, the computation of the optimal weights is insensitive to instantaneous channel realizations. Hence, it can be done once during system setup, and the system complexity is equivalent to that of conventional NS-based precoding. Simulation results show that the weighted NS precoder with four expanded series terms performs very close to optimal exact inversion. Moreover, the complexity analysis provides a practical condition under which the weighted NS has lower complexity than competing techniques. For practical parameter values in 5G systems, this condition is satisfied most of the time, except in the case of a large number of users in high mobility.
It must be noted that optimizing the coefficients of series expansion terms was applied in [4] and [10]. However, [4] targeted increasing the system throughput, and the added complexity could not be reduced to that of the original TPE. In contrast, in the approach proposed here the weights target the problem of slow convergence in NS-based precoding. The approximate weights developed using the properties of large Wishart matrices not only provide faster convergence than the unweighted case but also come at no extra computational complexity. The work in [10] aimed to simplify block diagonalization precoding for multiple-antenna users. The authors in [10] used another definition of the NS expansion and obtained an approximate expression for the optimal weights. This letter, on the other hand, proposes a simplified ZF precoder for single-antenna users. According to [1], the diagonal-based NS expansion used here is more robust against unknown channel distributions. Moreover, both exact and approximate optimal weights are presented.
Notations: Lower-case and upper-case boldface letters denote vectors and matrices. $(\cdot)^T$, $(\cdot)^H$, $(\cdot)^{\dagger}$, $\mathrm{tr}(\cdot)$, $(\cdot)^{-1}$ and $\|\cdot\|_f$ represent the transpose, conjugate transpose, pseudoinverse, trace, inverse and Frobenius norm, respectively, and $\binom{n}{k}$ denotes
the binomial coefficient. $L_n^{\alpha}(v)$ denotes the generalized Laguerre polynomial of degree $n$ with parameter $\alpha$, and $v_1, v_2, \ldots, v_n$ are its roots. Finally, $\mathcal{CN}(\mu, \sigma^2\mathbf{I}_K)$ denotes the circularly symmetric complex Gaussian distribution with mean $\mu$ and covariance matrix $\sigma^2\mathbf{I}_K$, where $\mathbf{I}_K$ is the identity matrix of size $K$.

II. SYSTEM MODEL
A massive MIMO system is considered where the base station is equipped with $M$ antennas and serves $K$ users simultaneously. The resulting transmitted signal vector after precoding is denoted by $\mathbf{x} = [x_1, \ldots, x_M]^T \in \mathbb{C}^{M\times 1}$. The signals received by all users are denoted by the vector $\mathbf{y} = [y_1, \ldots, y_K]^T \in \mathbb{C}^{K\times 1}$, which can be represented as follows:
$$\mathbf{y} = \sqrt{\rho}\,\mathbf{H}\mathbf{x} + \mathbf{n} \tag{1}$$
where $\rho$ is the signal-to-noise ratio (SNR) in the downlink, $\mathbf{H} \in \mathbb{C}^{K\times M}$ is the channel matrix, which is assumed to be Rayleigh fading, and $\mathbf{n} \in \mathbb{C}^{K\times 1}$ denotes the additive white Gaussian noise vector, which follows the distribution $\mathcal{CN}(\mathbf{0}_{K\times 1}, \mathbf{I}_K)$. The vector $\mathbf{x}$, which is precoded using the ZF precoder, is defined as:
$$\mathbf{x} = \beta\mathbf{H}^{\dagger}\mathbf{s} = \beta\mathbf{H}^H(\mathbf{H}\mathbf{H}^H)^{-1}\mathbf{s} = \beta\mathbf{H}^H\mathbf{G}^{-1}\mathbf{s} \tag{2}$$
where $\mathbf{s} \in \mathbb{C}^{K\times 1}$ is the symbol vector for the $K$ users, $\mathbf{G}$ is the Gram matrix defined as $\mathbf{H}\mathbf{H}^H$ and $\beta$ is a normalization parameter defined as $\sqrt{K/\mathrm{tr}(\mathbf{G}^{-1})}$. It is clear from (2) that the ZF precoder requires the computation of $\mathbf{G}^{-1}$.
1) NS-based ZF precoding: The Neumann series expansion [6] applies when the matrix $\mathbf{G}$, whose inverse needs to be computed, satisfies the following condition:
$$\lim_{n\to\infty}(\mathbf{I}_K - \mathbf{D}\mathbf{G})^n \approx \mathbf{0}_K \tag{3}$$
where $\mathbf{D}$ is a $K \times K$ preconditioning matrix. The inversion of the $K \times K$ matrix $\mathbf{G}$ is then transformed into a sum of matrix polynomials as follows:
$$\mathbf{G}^{-1} = \sum_{n=1}^{\infty}(\mathbf{I}_K - \mathbf{D}\mathbf{G})^{n-1}\mathbf{D} \approx \sum_{n=1}^{N}(\mathbf{I}_K - \mathbf{D}\mathbf{G})^{n-1}\mathbf{D} \tag{4}$$
where $N$ is the number of expanded terms. The preconditioning matrix $\mathbf{D}$ can be defined as [6]:
$$\mathbf{D} = \mathrm{diag}\left(\frac{1}{G_{1,1}}, \ldots, \frac{1}{G_{k,k}}, \ldots, \frac{1}{G_{K,K}}\right) \tag{5}$$
where $G_{k,k}$ is the $k$th diagonal element of the matrix $\mathbf{G}$.

III. PROPOSED WEIGHTED NS PRECODING
The NS-based precoder suffers from slow convergence. To solve this problem, the inverse of $\mathbf{G}$ in (4) is modified to:
$$\mathbf{G}^{-1} \approx \hat{\mathbf{G}}^{-1} = \sum_{n=1}^{N} A_{n-1}(\mathbf{I}_K - \mathbf{D}\mathbf{G})^{n-1}\mathbf{D} \tag{6}$$
where $A_0, \ldots, A_{N-1}$ are weights introduced to the terms of the NS expansion to enhance its convergence.
1) Problem formulation: The objective is to minimize the error between the exact matrix inverse $\mathbf{G}^{-1}$ and the approximate matrix inverse $\hat{\mathbf{G}}^{-1}$ obtained from the weighted NS expansion in (6). The optimization problem can be expressed as:
$$\min_{A_0, \ldots, A_{N-1}} P = \min_{A_0, \ldots, A_{N-1}} \left\|\mathbf{G}\hat{\mathbf{G}}^{-1} - \mathbf{I}_K\right\|_f^2 \tag{7}$$
2) Optimal weights: Theorem 1. The optimal weights $A^{*}_{n-1}$ that minimize the error between $\mathbf{G}^{-1}$ and $\hat{\mathbf{G}}^{-1}$ are the solution of the following system of linear equations:
$$\mathbf{Z}\mathbf{a} = \mathbf{b} \tag{8}$$
where $\mathbf{a} = [A_0, \ldots, A_{N-1}]^T$, and the $(n, m)$th element of the matrix $\mathbf{Z} \in \mathbb{C}^{N\times N}$ and the $n$th element of the vector $\mathbf{b} \in \mathbb{C}^{N\times 1}$ are defined respectively as follows:
$$[\mathbf{Z}]_{n,m} = \sum_{x=0}^{n-1}\binom{n-1}{x}(-1)^x \sum_{y=0}^{m-1}\binom{m-1}{y}(-1)^y\, \mathrm{tr}\left\{(\mathbf{G}\mathbf{D})^{x+1}(\mathbf{D}\mathbf{G})^{y+1} + (\mathbf{G}\mathbf{D})^{y+1}(\mathbf{D}\mathbf{G})^{x+1}\right\}$$
$$[\mathbf{b}]_n = 2\sum_{x=0}^{n-1}\binom{n-1}{x}(-1)^x\, \mathrm{tr}\left\{(\mathbf{G}\mathbf{D})^{x+1}\right\} \tag{9}$$
Proof: given in the Appendix.
Lemma 1. In a massive MIMO system with $M$ antennas and $K$ users, the optimal weights of the optimized NS-based precoding can be approximated as a function of $M$ and $K$ only, independent of the instantaneous channel realizations.
Proof: For independent and identically distributed (i.i.d.) channels in massive MIMO, $\mathbf{D} \approx \frac{1}{M}\mathbf{I}_K$ [9]. Thus, $\mathbf{G}\mathbf{D} = \mathbf{D}\mathbf{G} = \frac{1}{M}\mathbf{G}$ and, as a result, $\mathrm{tr}\{(\mathbf{G}\mathbf{D})^{x+1}(\mathbf{D}\mathbf{G})^{y+1} + (\mathbf{G}\mathbf{D})^{y+1}(\mathbf{D}\mathbf{G})^{x+1}\} = 2\,\mathrm{tr}\{(\frac{1}{M}\mathbf{G})^{x+y+2}\}$. Then, (9) can be simplified using Vandermonde's identity [11] as follows, where the common factor of 2 on both sides of (8) has been dropped:
$$[\mathbf{Z}]_{n,m} = \sum_{r=0}^{n+m-2}\binom{n+m-2}{r}(-1)^r\, \mathrm{tr}\left\{\left(\tfrac{1}{M}\mathbf{G}\right)^{r+2}\right\}$$
$$[\mathbf{b}]_n = \sum_{r=0}^{n-1}\binom{n-1}{r}(-1)^r\, \mathrm{tr}\left\{\left(\tfrac{1}{M}\mathbf{G}\right)^{r+1}\right\} \tag{10}$$
Furthermore, the trace of a matrix power is $\mathrm{tr}(\mathbf{X}^a) = \sum_{k=1}^{K}\lambda_k^a$, where $\lambda_k$ is the $k$th eigenvalue of the matrix $\mathbf{X}$ [12]. Therefore, (10) is modified to:
$$[\mathbf{Z}]_{n,m} = \sum_{r=0}^{n+m-2}\binom{n+m-2}{r}(-1)^r\left(\sum_{k=1}^{K}\lambda_k^{r+2}\right)$$
$$[\mathbf{b}]_n = \sum_{r=0}^{n-1}\binom{n-1}{r}(-1)^r\left(\sum_{k=1}^{K}\lambda_k^{r+1}\right) \tag{11}$$
where $\lambda_k$ is the $k$th eigenvalue of the matrix $\frac{1}{M}\mathbf{G}$.
In the case of i.i.d. channels whose elements have zero mean and unit variance, the matrix $\frac{1}{M}\mathbf{G}$ in (11) follows the distribution of Wishart matrices. Consequently, the Marchenko-Pastur law of large Wishart matrices can be applied [13]. As a result, the eigenvalues $\lambda_1, \ldots, \lambda_K$ converge to a non-random limiting distribution. In [14], it has been proved that as $M$ and $K$ grow large, $\lambda_1, \ldots, \lambda_K$ are approximated by the roots of a generalized Laguerre polynomial, which depend only on the dimensions of the massive MIMO system $(M, K)$.
Thus, the $(n, m)$th element of the matrix $\mathbf{Z}$ and the $n$th element of the vector $\mathbf{b}$ in (11) can be approximated by:
$$[\mathbf{Z}]_{n,m} \approx \sum_{r=0}^{n+m-2}\binom{n+m-2}{r}(-1)^r\left(\sum_{k=1}^{K}v_k^{r+2}\right)$$
$$[\mathbf{b}]_n \approx \sum_{r=0}^{n-1}\binom{n-1}{r}(-1)^r\left(\sum_{k=1}^{K}v_k^{r+1}\right) \tag{12}$$
where $v_1, \ldots, v_K$ are the roots of the generalized Laguerre polynomial $L_K^{M-K+1}(Mv)$, with $L_K^{M-K+1}(Mv) = \sum_{l=0}^{K}(-1)^l\binom{M+1}{K-l}\frac{(Mv)^l}{l!}$ [15].
Accordingly, solving the set of equations in (8) with the entries in (12) yields the optimal weights needed for the weighted NS expansion, computed once for every massive MIMO system. Although this approximation induces some error, it is acceptable in i.i.d. channels and in correlated channels with low to moderate correlation factor values, as shown in Section IV.

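The approximate weights of Lemma 1 depend only on $(M, K)$, so they can be precomputed. The sketch below is one hedged way to evaluate (12) with NumPy only. The Laguerre parameter $M-K+1$ is taken from the text; computing the roots as eigenvalues of the symmetric tridiagonal Jacobi matrix (the Golub-Welsch approach) is an implementation choice made here, not something stated in the letter, and the names `laguerre_roots`, `approx_system` and `approx_weights` are hypothetical.

```python
import numpy as np
from math import comb


def laguerre_roots(K, alpha):
    """Roots of the generalized Laguerre polynomial L_K^alpha via its Jacobi matrix."""
    i = np.arange(K)
    diag = 2.0 * i + alpha + 1.0              # recurrence diagonal: 2i + alpha + 1
    off = np.sqrt(i[1:] * (i[1:] + alpha))    # off-diagonal: sqrt(i (i + alpha))
    J = np.diag(diag) + np.diag(off, 1) + np.diag(off, -1)
    return np.linalg.eigvalsh(J)


def approx_system(M, K, N):
    """Build Z and b of (12) from the roots v_k of L_K^{M-K+1}(M v)."""
    v = laguerre_roots(K, M - K + 1) / M      # roots in the variable v
    p = [np.sum(v ** j) for j in range(2 * N + 1)]   # p[j] = sum_k v_k^j
    Z = np.array([[sum(comb(n + m - 2, r) * (-1) ** r * p[r + 2]
                       for r in range(n + m - 1))
                   for m in range(1, N + 1)]
                  for n in range(1, N + 1)])
    b = np.array([sum(comb(n - 1, r) * (-1) ** r * p[r + 1] for r in range(n))
                  for n in range(1, N + 1)])
    return Z, b, v


def approx_weights(M, K, N):
    """Solve Z a = b (8) with the channel-independent entries of (12)."""
    Z, b, _ = approx_system(M, K, N)
    return np.linalg.solve(Z, b)


if __name__ == "__main__":
    print(approx_weights(256, 32, 4))         # weights for the 256 x 32 system
```

Since the result depends only on the dimensions, a system could compute these weights once at setup and reuse them for every channel realization, which is the property the lemma is after.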

IV. SIMULATION RESULTS
In this section, we compare the performance of the proposed weighted NS-based precoder with the exact matrix inversion precoder and the NS-based precoder [6]. The performance measures in Figure 1 are the bit error rate (BER) and the average achievable rate per user $r_k$, which is defined as [4]:
$$r_k = \frac{1}{K}\sum_{k=1}^{K}\mathbb{E}\left[\log_2(1 + \mathrm{SINR}_k)\right] \tag{13}$$
where the expectation is taken over different channel realizations. The signal-to-interference-plus-noise ratio (SINR) of the $k$th user follows [7]:
$$\mathrm{SINR}_k = \frac{\|w_{k,k}\|^2}{\sum_{z\neq k}\|w_{k,z}\|^2 + 1} \tag{14}$$
where $w_{k,z}$ is the element in the $k$th row and the $z$th column of the precoding matrix $\mathbf{W} = \beta\mathbf{H}\mathbf{H}^H\mathbf{G}^{-1}$.

[Figure 1: two panels, BER versus SNR (dB) and average rate per user (bit/sec/Hz) versus SNR (dB); curves are labeled (i, j) as defined below.]
Fig. 1. Performance of different matrix inverse approximations for 256 × 32 ZF precoder.

The modulation scheme is 64-QAM. The configuration of the massive MIMO system is set to $M \times K = 256 \times 32$. All simulations were run over 1000 channel realizations, with the optimal weights computed only once by solving the set of equations defined in (8) and (12); the resulting weights are then used in the weighted NS precoder defined in (2) and (6). Figure 1 evaluates the weighted NS precoder in i.i.d. channels and correlated channels. The exponential correlation model with correlation factor $a$ [4] has been used for comparison. The symbol $(i, j)$ used in Figure 1 means that the matrix inversion method is determined by $i$ and the value of the correlation factor $a$ is $j$. The values $i = 1, 2, 3$ represent exact inversion, the NS inverse in [6] and the proposed weighted NS inverse in (6), respectively. Note that $a = 0$ is the case of i.i.d. channels. The number of expanded NS terms is set to 4. Figure 1 shows that in correlated channels with $a = 0.1$ the performance is very close to that in i.i.d. channels. Although the performance of the NS-based precoder in [6] may degrade in correlated channels when the correlation factor increases from $a = 0.1$ to $a = 0.3$, the weighted NS inverse speeds up the convergence and achieves almost the same BER and average rate as the exact inversion with only 4 expanded terms.

V. COMPLEXITY ANALYSIS
The main advantage of the approximate weights is that they are calculated once and do not affect the total complexity of the system. Hence, the complexity of the weighted NS-based precoder is the same as that of the conventional NS-based precoder. Before the comparison, we need to differentiate between two ways of implementing the precoder in (2): one computes $\mathbf{G}^{-1}$ and the other computes $\mathbf{G}^{-1}\mathbf{s}$. The first is computed once per channel coherence time, while the second is recomputed for every new symbol vector $\mathbf{s}$ sent during the channel coherence time. Consequently, the complexity of the second type needs to be multiplied by the number of symbol vectors per channel coherence time, which is denoted by $N_{spc}$.

TABLE I
Matrix Inversion Approximation Technique | Complexity
Weighted NS [9] | $2K^3 + MK\,N_{spc}$
SSOR [9] | $(10K^2 + 3K + MK)\,N_{spc}$
TPE [4] | $(12MK + 4M - 2K)\,N_{spc}$

In Table I, the complexity of the weighted NS precoder is compared with the complexity of the SSOR precoder [9] and the approximate complexity of the TPE precoder [4]. Note that Table I is concerned only with the computational complexity of the multiplication and addition operations in computing $\mathbf{G}^{-1}\mathbf{s}$. There is also the multiplication by $\beta\mathbf{H}^H$ shown in (2); since it is common to all the methods, it does not affect the comparison and is excluded. From [4] and [9], the three methods show near-optimal performance at $N = 4$, so the complexity is calculated at $N = 4$. Table I shows that three parameters affect the complexity: $K$, $M$ and $N_{spc}$. The weighted NS complexity is lower than or equal to that of the SSOR in [9] and the TPE in [4] when $2K^3 \le 10K^2 N_{spc} < 12MK\,N_{spc}$. These conditions reduce to:
$$K \le 5N_{spc} \tag{15}$$
In the case of TDD operation in 5G [16], $N_{spc} = \gamma^{DL}(\tau_c - \tau_p)$, where $\gamma^{DL}$ is the fraction of the payload used by the downlink, $\tau_c$ is the number of transmission symbols that fit into a coherence interval (ranging from 200 for highway-velocity users in urban environments to $10^4$ for low-mobility users with short delay spread at 2 GHz carrier frequencies) and $\tau_p$ is the number of symbols allocated for pilot sequence transmission ($\tau_p = \eta K$, where $\eta \ge 1$). For low-mobility users, with $\tau_c = 10^4$, $\tau_p = \eta K$ and $\gamma^{DL} = 0.5$, the inequality in (15) is always satisfied, i.e. the weighted NS has lower complexity than TPE [4] and SSOR [9]. For high-velocity users, substituting $\tau_c = 200$, $\tau_p = \eta K$ and $\gamma^{DL} = 0.5$ in (15) gives:
$$1 \le \eta \le \left(\frac{200}{K} - \frac{2}{5}\right) \tag{16}$$
The weighted NS-based precoding has equal or lower complexity when the number of pilot sequences $\eta$ satisfies (16). Note that (16) tends to fail when the number of users $K$ increases such that the condition $\eta \ge 1$ cannot be satisfied.

VI. CONCLUSION
This letter proposes a weighted NS precoder for massive MIMO systems to speed up the convergence of the NS precoder. The optimal weights were derived analytically and
approximated based on the asymptotic properties of large Wishart matrices. Simulation results showed that the weighted NS precoder converges markedly faster to the exact inverse precoder and that the approximate weights are insensitive to channel realizations in i.i.d. channels and in correlated channels with a low to moderate correlation factor. Moreover, when compared in terms of complexity with state-of-the-art precoders at the same performance, the weighted NS precoder has lower complexity except in the case of a very large number of users in high-mobility environments.

APPENDIX
• Simplifying (7) using the identity $\|\mathbf{X}\|_f^2 = \mathrm{tr}(\mathbf{X}\mathbf{X}^H)$:
$$P = \mathrm{tr}\left[\left(\mathbf{G}\sum_{n=1}^{N}A_{n-1}(\mathbf{I}_K - \mathbf{D}\mathbf{G})^{n-1}\mathbf{D} - \mathbf{I}_K\right) \times \left(\mathbf{G}\sum_{m=1}^{N}A_{m-1}(\mathbf{I}_K - \mathbf{D}\mathbf{G})^{m-1}\mathbf{D} - \mathbf{I}_K\right)^H\right] \tag{17}$$
• Using the binomial theorem $(\mathbf{I}_K - \mathbf{D}\mathbf{G})^{n-1} = \sum_{x=0}^{n-1}(-1)^x\binom{n-1}{x}(\mathbf{D}\mathbf{G})^x$, (17) can be modified to:
$$P = \mathrm{tr}\left[\left(\sum_{n=1}^{N}A_{n-1}\sum_{x=0}^{n-1}\binom{n-1}{x}(-1)^x\,\mathbf{G}(\mathbf{D}\mathbf{G})^x\mathbf{D} - \mathbf{I}_K\right) \times \left(\sum_{m=1}^{N}A_{m-1}\sum_{y=0}^{m-1}\binom{m-1}{y}(-1)^y\,\mathbf{G}(\mathbf{D}\mathbf{G})^y\mathbf{D} - \mathbf{I}_K\right)^H\right]$$
$$= \mathrm{tr}\left[\left(\sum_{n=1}^{N}A_{n-1}\sum_{x=0}^{n-1}\binom{n-1}{x}(-1)^x(\mathbf{G}\mathbf{D})^{x+1} - \mathbf{I}_K\right) \times \left(\sum_{m=1}^{N}A_{m-1}\sum_{y=0}^{m-1}\binom{m-1}{y}(-1)^y(\mathbf{G}\mathbf{D})^{y+1} - \mathbf{I}_K\right)^H\right] \tag{18}$$
• Expanding the two brackets in (18) using the fact that the trace and the Hermitian operators are distributive over addition:
$$P = \sum_{n=1,m=1}^{N}\left[\sum_{x=0}^{n-1}\binom{n-1}{x}(-1)^x\sum_{y=0}^{m-1}\binom{m-1}{y}(-1)^y\,\mathrm{tr}\left[(\mathbf{G}\mathbf{D})^{x+1}\left((\mathbf{G}\mathbf{D})^{y+1}\right)^H\right]A_{n-1}A_{m-1}\right] + \mathrm{tr}\left[\mathbf{I}_K\right]$$
$$- \sum_{m=1}^{N}\left[\sum_{y=0}^{m-1}\binom{m-1}{y}(-1)^y\,\mathrm{tr}\left[\left((\mathbf{G}\mathbf{D})^{y+1}\right)^H\right]A_{m-1}\right] - \sum_{n=1}^{N}\left[\sum_{x=0}^{n-1}\binom{n-1}{x}(-1)^x\,\mathrm{tr}\left[(\mathbf{G}\mathbf{D})^{x+1}\right]A_{n-1}\right] \tag{19}$$
• From (2) and (5), $\mathbf{G}^H = \mathbf{G}$ and $\mathbf{D}^H = \mathbf{D}$. Thus, $(\mathbf{G}\mathbf{D})^H = \mathbf{D}^H\mathbf{G}^H = \mathbf{D}\mathbf{G}$ and $\mathrm{tr}\{(\mathbf{G}\mathbf{D})^{x+1}((\mathbf{G}\mathbf{D})^{y+1})^H\} = \mathrm{tr}\{(\mathbf{G}\mathbf{D})^{x+1}(\mathbf{D}\mathbf{G})^{y+1}\}$. The trace operator is invariant under cyclic permutation [12], so $\mathrm{tr}\{(\mathbf{D}\mathbf{G})^{y+1}\} = \mathrm{tr}\{(\mathbf{G}\mathbf{D})^{y+1}\}$. Hence:
$$P = \sum_{n=1,m=1}^{N}\left[\sum_{x=0}^{n-1}\binom{n-1}{x}(-1)^x\sum_{y=0}^{m-1}\binom{m-1}{y}(-1)^y\,\mathrm{tr}\left[(\mathbf{G}\mathbf{D})^{x+1}(\mathbf{D}\mathbf{G})^{y+1}\right]A_{n-1}A_{m-1}\right] + K$$
$$- \sum_{m=1}^{N}\left[\sum_{y=0}^{m-1}\binom{m-1}{y}(-1)^y\,\mathrm{tr}\left[(\mathbf{G}\mathbf{D})^{y+1}\right]A_{m-1}\right] - \sum_{n=1}^{N}\left[\sum_{x=0}^{n-1}\binom{n-1}{x}(-1)^x\,\mathrm{tr}\left[(\mathbf{G}\mathbf{D})^{x+1}\right]A_{n-1}\right] \tag{20}$$
• Differentiating $P$ in (20) w.r.t. $A_{n-1}$:
$$\frac{\partial P}{\partial A_{n-1}} = \sum_{m=1}^{N}\left[\sum_{x=0}^{n-1}\binom{n-1}{x}(-1)^x\sum_{y=0}^{m-1}\binom{m-1}{y}(-1)^y\,\mathrm{tr}\left[(\mathbf{G}\mathbf{D})^{x+1}(\mathbf{D}\mathbf{G})^{y+1}\right]A_{m-1}\right]$$
$$+ \sum_{m=1}^{N}\left[\sum_{x=0}^{m-1}\binom{m-1}{x}(-1)^x\sum_{y=0}^{n-1}\binom{n-1}{y}(-1)^y\,\mathrm{tr}\left[(\mathbf{G}\mathbf{D})^{x+1}(\mathbf{D}\mathbf{G})^{y+1}\right]A_{m-1}\right] - 2\sum_{x=0}^{n-1}\binom{n-1}{x}(-1)^x\,\mathrm{tr}\left[(\mathbf{G}\mathbf{D})^{x+1}\right] \quad \forall n = 1, \ldots, N. \tag{21}$$
• Minimizing $P$ w.r.t. $A_{n-1}$ by setting $\frac{\partial P}{\partial A_{n-1}} = 0$ in (21):
$$\sum_{m=1}^{N}\left[\sum_{x=0}^{n-1}\binom{n-1}{x}(-1)^x\sum_{y=0}^{m-1}\binom{m-1}{y}(-1)^y\,\mathrm{tr}\left[(\mathbf{G}\mathbf{D})^{x+1}(\mathbf{D}\mathbf{G})^{y+1} + (\mathbf{G}\mathbf{D})^{y+1}(\mathbf{D}\mathbf{G})^{x+1}\right]\right]A_{m-1} = 2\sum_{x=0}^{n-1}\binom{n-1}{x}(-1)^x\,\mathrm{tr}\left[(\mathbf{G}\mathbf{D})^{x+1}\right] \quad \forall n = 1, \ldots, N. \tag{22}$$
• Rewriting (22) in matrix form, we get the set of equations in (8) and (9), whose solution gives the optimal weights.

REFERENCES
[1] F. Rusek et al., “Scaling up MIMO: Opportunities and challenges with very large arrays,” IEEE Signal Process. Mag., vol. 30, no. 1, pp. 40–60, 2013.
[2] T. L. Marzetta, “Noncooperative cellular wireless with unlimited numbers of base station antennas,” IEEE Trans. Wireless Commun., vol. 9, no. 11, pp. 3590–3600, 2010.
[3] X. Gao et al., “Linear pre-coding performance in measured very-large MIMO channels,” in Veh. Technology Conf. (VTC Fall). IEEE, 2011, pp. 1–5.
[4] A. Mueller et al., “Linear precoding based on polynomial expansion: Reducing complexity in massive MIMO,” EURASIP Journal on Wireless Commun. and Netw., vol. 2016, no. 1, p. 63, 2016.
[5] M. Ylinen et al., “Direct versus iterative methods for fixed-point implementation of matrix inversion,” in Proc. of the Int. Symp. on Circuits and Systems ISCAS, vol. 3. IEEE, 2004.
[6] H. Prabhu et al., “Approximative matrix inverse computations for very-large MIMO and applications to linear pre-coding systems,” in Wireless Commun. and Netw. Conf. (WCNC). IEEE, 2013, pp. 2710–2715.
[7] X. Gao et al., “Capacity-approaching linear precoding with low-complexity for large-scale MIMO systems,” in (ICC) Int. Conf. on Commun. IEEE, 2015, pp. 1577–1582.
[8] B. Yin et al., “Conjugate gradient-based soft-output detection and precoding in massive MIMO systems,” in Global Commun. Conf. (GLOBECOM). IEEE, 2014, pp. 3696–3701.
[9] T. Xie et al., “Low-complexity SSOR-based precoding for massive MIMO systems,” IEEE Commun. Lett., vol. 20, no. 4, pp. 744–747,
[10] W. Zhang et al., “Simplified matrix polynomial-aided block diagonalization precoding for massive MIMO systems,” in Sensor Array and Multichannel Signal Process. Workshop (SAM). IEEE, 2016, pp. 1–5.
[11] J. Riordan, Combinatorial identities. Wiley New York, 1968.
[12] K. B. Petersen et al., “The matrix cookbook,” Technical University of Denmark, vol. 7, p. 15, 2008.
[13] A. M. Tulino et al., “Random matrix theory and wireless communications,” Found. and Trends in Commun. and Inform. Theory, vol. 1, no. 1, pp. 1–182, 2004.
[14] H. Dette, “Strong approximation of eigenvalues of large dimensional Wishart matrices by roots of generalized Laguerre polynomials,” J. of Approximation Theory, vol. 118, no. 2, pp. 290–304, 2002.
[15] M. Abramowitz and I. A. Stegun, Handbook of mathematical functions: with formulas, graphs, and mathematical tables. Courier Corporation, 1964, vol. 55.
[16] W. Xiang et al., 5G Mobile Communications. Springer, 2017.
