Interference Reduction in Multi-Cell Massive MIMO Systems I: Large-Scale Fading Precoding and Decoding
Article in IEEE Transactions on Information Theory · November 2014
DOI: 10.1109/TIT.2018.2853733 · Source: arXiv
3 authors, including: Alexei Ashikhmin (Nokia Bell Labs) and Thomas Marzetta (New York University, Tandon School of Engineering)
Interference Reduction in Multi-Cell Massive MIMO Systems I: Large-Scale Fading Precoding and Decoding
Alexei Ashikhmin (1), Thomas L. Marzetta (1), and Liangbin Li (2)
(1) Bell Laboratories Alcatel-Lucent, 600 Mountain Ave, Murray Hill, NJ 07974.
(2) University of California, Irvine, CA 92617.
arXiv:1411.4182v1 [cs.IT] 15 Nov 2014
Abstract: A wireless massive MIMO system entails a large number (tens or hundreds) of base station antennas serving a much smaller number of users, with large gains in spectral efficiency and energy efficiency compared with conventional MIMO technology. Until recently it was believed that in multi-cellular massive MIMO systems, even in the asymptotic regime, as the number of service antennas tends to infinity, the performance is limited by directed inter-cellular interference. This interference results from unavoidable re-use of reverse-link training sequences (pilot contamination) by users in different cells.

We devise a new concept that leads to the effective elimination of inter-cell interference in massive MIMO systems. This is achieved by outer multi-cellular precoding, which we call Large-Scale Fading Precoding (LSFP). The main idea of LSFP is that each base station linearly combines messages aimed at users from different cells that re-use the same training sequence. Crucially, the combining coefficients depend only on the slow-fading coefficients between the users and the base stations. Each base station independently transmits its LSFP-combined symbols using conventional linear precoding that is based on estimated fast-fading coefficients. Further, we derive estimates for downlink and uplink SINRs and capacity lower bounds for the case of massive MIMO systems with LSFP and a finite number of base station antennas.

I. INTRODUCTION

Multiple-input multiple-output (MIMO) technology has been a subject of intensive study during the last two decades. This technology became a part of many wireless standards since it can significantly improve the efficiency and reliability of wireless systems. Initially, research in this area was focused on the point-to-point communication scenario, in which two devices equipped with multiple antennas communicate with each other. In recent years, the focus has shifted to multi-user multiple-input multiple-output (MU-MIMO) systems, in which a base station is equipped with multiple antennas and simultaneously serves a multiplicity of autonomous users. These users can be cheap single-antenna devices and most of the expensive equipment is needed only at base stations. Another advantage of MU-MIMO systems is their high multi-user diversity, which makes the system performance more robust to the propagation environment than in the point-to-point MIMO case. As a result, MU-MIMO has become an integral part of communications standards, such as 802.11 (WiFi), 802.16 (WiMAX), LTE, and is progressively being deployed throughout the world.

In most modern MU-MIMO systems, base stations have only a few, typically fewer than 10, antennas, which results in relatively modest spectral efficiency and precludes a rapid increase in data rates, as well as the higher user density required in next-generation cellular networks. This, along with the GreenTouch initiative to decrease the power consumption in communication networks, motivated extensive research on massive MIMO systems, where each base station is equipped with a significantly larger number of antennas, e.g., 64 or more.

In [1] and [3], Marzetta used asymptotic arguments based on random matrix theory to show that the effects of additive noise and intra-cell interference, and the required transmitted energy per bit, vanish as the number of base station antennas grows to infinity. Furthermore, simple linear signal processing approaches, such as matched filter precoding/detection, can be used to achieve these advantages.

Another important advantage of massive MIMO systems is their energy efficiency. In [4], it is shown that the transmit power scales down linearly with the number of antennas. The high energy efficiency of massive MIMO systems is very important since large energy consumption can be one of the main technical issues in future wireless networks [5], [6].

Because of the above advantages, in recent years massive MIMO systems have attracted significant attention from the research community. Signal processing, information-theoretic properties, optimization of parameters, and other aspects of massive MIMO systems have become subjects of intensive study. The articles [7], [8], and [9] present a good introduction to this area, including fundamental information-theoretic limits, antenna and propagation aspects, design of precoder/decoder, and other technical issues.

In this work, we consider one of the most important and interesting problems of massive MIMO systems: pilot contamination. In [3] (see also [10] and [11]), Marzetta derived estimates for SINR values in a non-cooperative cellular network in the asymptotic regime when the number of base station antennas tends to infinity. He showed that in this regime not all interference vanishes, and therefore, SINR does not grow indefinitely. The reason for this is that, unless all users in the network use mutually orthogonal training sequences (pilots), the training sequences transmitted by different users contaminate each other. As a result, the estimates of channel
state information made at a base station are biased toward users located in other cells. This effect is called the pilot contamination problem. Pilot contamination causes inter-cell interference that is proportional to the number of base station antennas.

A number of techniques were proposed for mitigating pilot contamination. The numerical results obtained in [10], [11], and [13] show that these techniques provide breakthrough data transmission rates for noncooperative cellular networks. Advanced network MIMO systems that allow some collaboration between base stations were proposed recently in [14]. Unfortunately, in all these techniques SINR values remain finite and do not grow indefinitely with the number of base station antennas $M$.

In [15] and [16], the authors proposed massive MIMO systems with limited collaboration between base stations and an outer multi-cellular precoding. This outer-cellular precoding is based only on the use of large-scale fading coefficients between base stations and users. Since large-scale fading coefficients do not depend on antenna index and frequency, the number of them is relatively small: only one coefficient for each pair of a base station and a user. In addition, these coefficients change slowly over time. Thus, the proposed outer-cellular precoding does not require extensive traffic between base stations and/or a network controller. In the asymptotic regime, as $M$ tends to infinity, this outer-cellular precoding allows one to construct interference and noise free multi-cell massive MIMO systems with frequency reuse one. In this work, we present these results in full detail with rigorous proofs and extend them to the real-life scenario when the number of base station antennas is finite.

In [15] and [16], the main goal for designing the outer-cellular precoding was cancellation of the interference caused by pilot contamination, and therefore we called it pilot contamination precoding. It turned out, however, that in the regime of a finite number of antennas, other sources of interference, not caused by pilot contamination, cannot be ignored. The outer-cellular precoding allows one to efficiently mitigate these sources of interference. For this reason, we decided that large-scale fading precoding is a better name for it.

The paper is organized as follows. First, in Section II, we describe our system model, network assumptions, and TDD protocol. Then, in Section III-A, we explain the pilot contamination problem and why inter-cell interference does not vanish as the number of base station antennas grows. In Sections III-B and III-C we propose large-scale fading precoding and decoding protocols and show that in the asymptotic regime, as the number of base station antennas tends to infinity, these protocols allow construction of interference and noise free massive MIMO systems. In Section III-D, we briefly outline an approach for estimation of large-scale fading coefficients. In Section IV, we analyze LSFP and LSFD in the regime of a finite number of base station antennas. In particular, in Sections IV-A and IV-E, we derive estimates for downlink and uplink SINRs and the capacity of massive MIMO systems with large-scale precoding and decoding respectively. Further, in IV-B, we consider two cases: when LSFP is not used, and when Zero-Forcing LSFP is used. In IV-C, we formulate an optimization problem for designing efficient LSFP and in IV-D, we present simulation results for Zero-Forcing LSFP for different numbers of base station antennas.

II. SYSTEM MODEL

We consider a two-dimensional hexagonal cellular network composed of $L$ hexagonal cells with one base station and $K$ users in each cell. Each base station is equipped with $M$ omnidirectional antennas and each user has a single omnidirectional antenna. We further assume that Orthogonal Frequency-Division Multiplexing (OFDM) is used and that in a given subcarrier we have a flat-fading channel.

The throughput of 95% of users can be improved by using a massive MIMO system with frequency reuse factor greater than one [3]. Alternatively one can improve the throughput by using a pilot reuse factor greater than one [17]. For instance, a pilot reuse factor of seven allows assigning orthogonal pilots to users in two concentric rings of cells, at the expense of a longer training time. Similarly, the results presented in this work can be further improved by applying a frequency reuse factor or pilot reuse factor different from one. However, to make the presentation shorter, we will consider only the case of frequency reuse and pilot reuse one. Thus the entire frequency band is used for downlink and uplink transmissions by all base stations and all users, and the length of the pilot sequences is equal to the number of users in one cell.

A. Channel Model

For a given subcarrier, we denote by

$$ g_{mj}^{[kl]} = \sqrt{\beta_j^{[kl]}}\; h_{mj}^{[kl]} \tag{1} $$

the channel (propagation) coefficient between the $m$-th antenna of the $j$-th base station and the $k$-th terminal of the $l$-th cell, Fig. 1. The first factor in (1) is determined by the large-scale fading coefficient $\beta_j^{[kl]} \in \mathbb{R}^+$ and the second factor is the small-scale fading coefficient $h_{mj}^{[kl]} \in \mathbb{C}$.

A large-scale fading coefficient depends on the shadowing and distance between the corresponding user and base station. Typically, the distance between a user and base station is significantly larger than the distance between base station antennas. For this reason, the standard assumption is that the large-scale fading coefficients do not depend on the antenna index $m$ of a given base station. We also assume that these coefficients are constant across the used frequency band, i.e., that they do not depend on OFDM subcarrier index. Thus there is only one large-scale fading coefficient for each pair of a user and a base station. A detailed model for the large-scale fading coefficients will be given in Part II of the paper [2].

In contrast, the small-scale fading coefficients depend on both the antenna and subcarrier indices. Hence, if $N$ is the number of OFDM subcarriers, then for each pair of a user and base station there are about $NM$ small-scale fading coefficients. In what follows, we consider only one OFDM subcarrier and so we do not write the subcarrier index for small-scale fading coefficients. For small-scale fading coefficients,
we assume the Rayleigh fading model, i.e., $h_{mj}^{[kl]} \sim \mathcal{CN}(0, 1)$ and for any $(m, j, k, l) \neq (u, i, n, v)$ the coefficients $h_{mj}^{[kl]}$ and $h_{ui}^{[nv]}$ are independent. From the above assumptions it follows that for any $(j, k, l) \neq (i, n, v)$ the vectors $h_j^{[kl]}$ and $h_i^{[nv]}$ (similarly $g_j^{[kl]}$ and $g_i^{[nv]}$) are independent.

Fig. 1. The channel coefficient $g_{mj}^{[kl]}$ between the $m$-th antenna of the $j$-th cell and the $k$-th terminal in the $l$-th cell

The small-scale fading coefficients between the $j$-th base station and the $k$-th user in the $l$-th cell form the small-scale fading vector

$$ h_j^{[kl]} = \left( h_{1j}^{[kl]}, h_{2j}^{[kl]}, \ldots, h_{Mj}^{[kl]} \right)^T \in \mathbb{C}^{M \times 1}. $$

The channel coefficients between the $j$-th base station and the $k$-th user in the $l$-th cell form the channel vector

$$ g_j^{[kl]} = \left( g_{1j}^{[kl]}, g_{2j}^{[kl]}, \ldots, g_{Mj}^{[kl]} \right)^T = \sqrt{\beta_j^{[kl]}}\; h_j^{[kl]} \in \mathbb{C}^{M \times 1}. $$

Since small-scale fading coefficients are i.i.d., we have $h_j^{[kl]} \sim \mathcal{CN}(0, I_M)$ and $g_j^{[kl]} \sim \mathcal{CN}(0, \beta_j^{[kl]} I_M)$.

We further assume a block fading model, that is, that small-scale coefficients $h_j^{[kl]}$ stay constant during coherence blocks of $T$ OFDM symbols. The small-scale fading coefficients in different coherence blocks are assumed to be independent. Similarly, we assume that large-scale fading coefficients $\beta_j^{[kl]}$ stay constant during large-scale coherence blocks of $T_\beta$ OFDM symbols. Typically $T_\beta$ is significantly larger than $T$ (at the end of Section III we discuss this in more detail). For different large-scale coherence blocks the coefficients $\beta_j^{[kl]}$ are assumed to be independent.

Since large-scale fading coefficients stay constant for long coherence blocks and the number of these coefficients is relatively small (for each pair of base station and mobile there is only one large-scale fading coefficient), throughout the paper we use the following

Network Assumptions I:
1) We assume that the $j$-th base station can accurately estimate and track large-scale fading coefficients $\beta_j^{[kl]}$ with $k = 1, \ldots, K$ and $l = 1, \ldots, L$.
2) If $\epsilon_j^{[kj]}$ is a quantity that depends only on large-scale fading coefficients we assume that the $j$-th base station has means to forward it to the $k$-th user in the $j$-th cell.

Throughout the paper, in our analysis of communication protocols we will not take into account the resources needed for implementing the above assumptions.

Finally, we assume reciprocity between uplink and downlink channels, i.e., $\beta_j^{[kl]}$ and $h_{mj}^{[kl]}$ are the same for these channels. The reciprocity, up to high accuracy, can be achieved with proper calibration of hardware components, e.g., the transmit power amplifier and the receive low-noise amplifier.

B. Time-Division Duplexing Protocol

We consider a wireless network, where each cell has $K$ users enumerated by index $k$ from 1 to $K$. In each cell, the same set of $K$ orthonormal training sequences $r^{[k]} \in \mathbb{C}^{\tau \times 1}$ ($r^{[k]\dagger} r^{[i]} = \delta_{ki}$) is assigned to the users. The sequence $r^{[k]}$ is assigned to the $k$-th user. Since small-scale fading vectors are mutually independent for different coherence blocks, we have to assume that $\tau < T$. Since the number of orthogonal $\tau$-tuples cannot exceed $\tau$, we also have $K \leq \tau$.

The assumption that the same set of training sequences is used in all cells is justified as follows. If users move fast, the coherence block is short, that is, $T$ is small. Hence $\tau$, the training time, should also be small. Therefore, it is reasonable to assume that we can assign orthogonal training sequences to users within one cell, but there are not enough orthogonal training sequences for users from different cells. Thus, we have to reuse the same training sequences in all cells.

There is an alternative scheme in which different sets of training sequences are used in different cells. Thus, in the $l$-th cell the orthonormal sequences $r_l^{[k]} \in \mathbb{C}^{\tau \times 1}$, $k = 1, \ldots, K$, are used. In this case, however, the training sequences from different cells are still nonorthogonal, that is, for generic $r_l^{[k]}$ and $r_n^{[i]}$, we have $r_l^{[k]\dagger} r_n^{[i]} > 0$. Our estimates show that such a scheme would achieve performance similar to the scheme with the same set of training sequences in all cells. At the same time, the analysis becomes more complex. For this reason, in this paper, we do not consider the alternative scheme, which, however, could be done in future work.

The Time-Division Duplexing (TDD) protocol consists of six steps. The last four steps, which are conducted during each coherence block, are shown in Fig. 2.

Time-Division Duplexing Protocol
1) In the beginning of each large-scale coherence block (of duration $T_\beta$ OFDM symbols) the $j$-th base station estimates the large-scale fading coefficients $\beta_j^{[kl]}$, $k = 1, \ldots, K$, $l = 1, \ldots, L$.
2) Next, the $j$-th base station transmits to the $K$ mobiles located in the $j$-th cell the quantities

$$ \epsilon_j^{[kj]} = \frac{\sqrt{M \rho_f \rho_r \tau}\; \beta_j^{[kj]}}{\left(1 + \sum_{s=1}^{L} \rho_r \tau \beta_j^{[ks]}\right)^{1/2}}, \quad k = 1, \ldots, K. $$

3) All users synchronously transmit their uplink signals $x^{[kj]}$, $k = 1, \ldots, K$, $j = 1, \ldots, L$.
4) All users synchronously transmit training sequences $r^{[k]}$.
5) Base stations process received uplink signals and training sequences. In particular, each base station estimates the channel vector between itself and the users located within the same cell, and further performs decoding and precoding of uplink and downlink signals, respectively.
6) All base stations synchronously transmit their downlink signals $x_j$, $j = 1, \ldots, L$.
The End

Fig. 2. Coherence block of T = 11 OFDM symbols

In what follows, we consider these steps in detail. It is convenient to start with Step 3. During Step 3 of the TDD protocol, the $j$-th base station receives an uplink data signal of the form

$$ y_j = \sqrt{\rho_r} \sum_{k=1}^{K} \sum_{l=1}^{L} g_j^{[kl]} x^{[kl]} + w_j \in \mathbb{C}^{M \times 1}, \tag{2} $$

where $x^{[kl]}$ is the uplink signal of the $k$-th user located in the $l$-th cell, $\rho_r$ is the reverse link transmit power, and $w_j \sim \mathcal{CN}(0, I_M)$ is the additive noise. We assume that all users have the same transmit power.

In Step 4, the $j$-th base station receives training signals, which can be written as a matrix $Y_j \in \mathbb{C}^{M \times \tau}$ of the form

$$ Y_j = \sqrt{\rho_r \tau} \sum_{k=1}^{K} \sum_{l=1}^{L} g_j^{[kl]} r^{[k]\dagger} + W_j, $$

where $W_j \in \mathbb{C}^{M \times \tau}$ is the additive white Gaussian noise matrix with i.i.d. $\mathcal{CN}(0, 1)$ entries.

In Step 5, the $j$-th base station uses the fact that the training sequences are orthogonal to obtain the MMSE estimate of the channel vectors $g_j^{[kl]}$ as (see for example [18, Chapter 12])

$$ \hat{g}_j^{[kj]} = Y_j\, \theta_j^{[kj]} r^{[k]} = \theta_j^{[kj]} \sum_{s=1}^{L} \sqrt{\rho_r \tau}\; g_j^{[ks]} + \hat{w}_j^{[kj]}, \tag{3} $$

where

$$ \theta_j^{[kj]} = \frac{\sqrt{\rho_r \tau}\; \beta_j^{[kj]}}{1 + \sum_{s=1}^{L} \rho_r \tau \beta_j^{[ks]}}, \quad \text{and} \quad \hat{w}_j^{[kj]} \sim \mathcal{CN}\left(0, \theta_j^{[kj]2} I_M\right). $$

According to our Network Assumptions, base stations have access to $\beta_j^{[ks]}$ and therefore are capable of finding $\hat{g}_j^{[kj]}$.

Note that the vector $\hat{g}_j^{[kj]}$ has the distribution

$$ \hat{g}_j^{[kj]} \sim \mathcal{CN}\left(0,\; \frac{\rho_r \tau \beta_j^{[kj]2}}{1 + \sum_{s=1}^{L} \rho_r \tau \beta_j^{[ks]}}\, I_M \right). \tag{4} $$

Further, the $j$-th base station uses the estimates $\hat{g}_j^{[kj]}$ for decoding the transmitted uplink signals $x^{[kj]}$, $k = 1, \ldots, K$, and forming precoding beamforming vectors for downlink transmission. The decoding and precoding can be implemented in several possible ways, e.g., zero-forcing, MMSE, or matched filtering. In this work we always assume that the base station uses matched filtering for decoding and conjugate precoding for downlink transmission. Thus, it gets an estimate of the uplink signal $x^{[kj]}$ as

$$ \begin{aligned} \hat{x}^{[kj]} = \hat{g}_j^{[kj]\dagger} y_j = {} & \sqrt{\rho_r}\, \hat{g}_j^{[kj]\dagger} g_j^{[kj]} x^{[kj]} + \sum_{\substack{n=1 \\ n \neq k}}^{K} \sqrt{\rho_r}\, \hat{g}_j^{[kj]\dagger} g_j^{[nj]} x^{[nj]} \\ & + \sum_{\substack{l=1 \\ l \neq j}}^{L} \sum_{n=1}^{K} \sqrt{\rho_r}\, \hat{g}_j^{[kj]\dagger} g_j^{[nl]} x^{[nl]} + \hat{g}_j^{[kj]\dagger} w^{[kj]}. \end{aligned} \tag{5} $$

For the downlink transmission, the $j$-th base station forms conjugate precoding beamforming vectors as

$$ u_j^{[k]} = \frac{\hat{g}_j^{[kj]\dagger}}{\lambda_j^{[kj]}}, \quad k = 1, \ldots, K, \tag{6} $$

where $\lambda_j^{[kj]2} = E[\|\hat{g}_j^{[kj]}\|^2]$ is the normalization factor, which according to (4), is equal to

$$ \lambda_j^{[kj]2} = M \cdot \frac{\rho_r \tau \beta_j^{[kj]2}}{1 + \sum_{s=1}^{L} \rho_r \tau \beta_j^{[ks]}}. \tag{7} $$

The $j$-th base station next forms the $M$-dimensional vector

$$ x_j = \sqrt{\rho_f} \sum_{k=1}^{K} u_j^{[k]} s^{[kj]}, $$

where $s^{[kj]}$ is the downlink data signal intended for the $k$-th user located in the $j$-th cell, and $\rho_f$ denotes the forward link transmit power used by the $j$-th base station. We assume that all base stations use the same transmit power.

Finally, in Step 6, the $j$-th base station transmits from its $M$ antennas the vector $x_j$.

The $k$-th mobile in the $j$-th cell receives the signal

$$ \begin{aligned} y^{[kj]} & = \sum_{l=1}^{L} \sqrt{\rho_f}\, x_l\, g_l^{[kj]} + w^{[kj]} \\ & = \sqrt{\rho_f}\, \frac{\hat{g}_j^{[kj]\dagger}}{\lambda_j^{[kj]}}\, g_j^{[kj]} s^{[kj]} + \sum_{\substack{n=1 \\ n \neq k}}^{K} \sqrt{\rho_f}\, \frac{\hat{g}_j^{[nj]\dagger}}{\lambda_j^{[nj]}}\, s^{[nj]} g_j^{[kj]} \\ & \quad + \sum_{\substack{l=1 \\ l \neq j}}^{L} \sum_{n=1}^{K} \sqrt{\rho_f}\, \frac{\hat{g}_l^{[nl]\dagger}}{\lambda_l^{[nl]}}\, s^{[nl]} g_l^{[kj]} + w^{[kj]}. \end{aligned} \tag{8} $$

Note that the effective channel $\frac{\hat{g}_j^{[kj]\dagger}}{\lambda_j^{[kj]}} g_j^{[kj]}$, on average, has zero phase. The quantity

$$ \epsilon_j^{[kj]} = E\left[ \frac{\hat{g}_j^{[kj]\dagger}}{\lambda_j^{[kj]}}\, g_j^{[kj]} \right] = \frac{\sqrt{M \rho_f \rho_r \tau}\; \beta_j^{[kj]}}{\left(1 + \sum_{s=1}^{L} \rho_r \tau \beta_j^{[ks]}\right)^{1/2}}, \quad k = 1, \ldots, K, $$

which the $j$-th base station transmits to the $k$-th user, tells the user the expected value of the power of the effective channel. So the user can use the following simple detector

$$ \hat{s}^{[kj]} = \frac{y^{[kj]}}{\sqrt{\rho_f}\; \epsilon_j^{[kj]}} $$
for estimating the signal $s^{[kj]}$. Alternatively, instead of transmitting $\epsilon_j^{[kj]}$ at step 4 of the TDD protocol, base stations can send downlink training sequences that would allow users to estimate their effective channels. We do not elaborate on these possible details of the TDD protocol.

Note also that Steps 1 and 2 should be conducted only once during each large-scale coherence block. As we noted in Section II-A, we do not take into account the resources needed for this.

III. LARGE-SCALE FADING PRECODING AND INTERFERENCE FREE LSAS

A. Pilot Contamination

From (5) and (8) it follows that the uplink and downlink SINRs can be written in the form presented at the top of the next page.

In [3] and [10], [11] the authors considered LSASs in the asymptotic regime when the number of base station antennas $M$ tends to infinity. The following results were obtained.

Theorem 1: The downlink SINR of the $k$-th terminal in the $j$-th cell for precoding (6) converges to the following limit:

$$ \lim_{M \to \infty} \mathrm{SINR}_D^{[kj]} \overset{a.s.}{=} \frac{\beta_j^{[kj]2} / \eta_j^{[k]2}}{\sum_{l=1,\, l \neq j}^{L} \beta_l^{[kj]2} / \eta_l^{[k]2}}, \tag{9} $$

where

$$ \eta_l^{[k]} = \left( 1 + \sum_{s=1}^{L} \rho_r \tau \beta_l^{[ks]} \right)^{1/2}. $$

To give an intuitive explanation of this result we recall that according to the strong law of large numbers we have the following lemma.

Lemma 2: Let $x, y \in \mathbb{C}^{M \times 1}$ be two independent vectors with distribution $\mathcal{CN}(0, \nu I_M)$. Then

$$ \lim_{M \to \infty} \frac{x^\dagger x}{M} \overset{a.s.}{=} \nu \quad \text{and} \quad \lim_{M \to \infty} \frac{x^\dagger y}{M} \overset{a.s.}{=} 0. \tag{10} $$

The signal $y^{[kj]}$ received by the $k$-th user in the $j$-th cell is defined in (8) and the estimates $\hat{g}_l^{[nl]}$ are defined in (3). It is easy to see that all terms of $\hat{g}_l^{[nl]}$, $n \neq k$, are independent of $g_l^{[kj]}$ and therefore, according to Lemma 2 we have

$$ \lim_{M \to \infty} \frac{1}{M}\, \hat{g}_l^{[nl]\dagger} g_l^{[kj]} \overset{a.s.}{=} 0. $$

Hence the contribution to the interference of the product $\hat{g}_l^{[nl]\dagger} g_l^{[kj]}$ vanishes as $M$ grows. At the same time the estimate $\hat{g}_l^{[kl]}$ contains $g_l^{[kj]}$ as a term. The reason for this is that the $k$-th users in cells $j$ and $l$ are using the same training sequence $r^{[k]}$. Thus the product

$$ \frac{1}{M}\, \hat{g}_l^{[kl]\dagger} g_l^{[kj]} $$

almost surely converges to some finite value and therefore it makes a nonvanishing contribution to the interference. This effect is called pilot contamination. It is shown in [3] and [10], [11] that pilot contamination is the only source of interference in the regime of infinitely large $M$.

Similar analysis of the uplink transmission leads to the following result.

Theorem 3: The uplink SINR of the $k$-th terminal in the $j$-th cell for decoding (5) converges to the following limit:

$$ \lim_{M \to \infty} \mathrm{SINR}_U^{[kj]} \overset{a.s.}{=} \frac{\beta_j^{[kj]2}}{\sum_{l=1,\, l \neq j}^{L} \beta_j^{[kl]2}}. \tag{11} $$

The natural question is whether one can mitigate the pilot contamination and further increase SINRs beyond the results obtained in Theorems 1 and 3. Below we briefly discuss some approaches for this.

One possibility is to use frequency reuse for avoiding interference between adjacent cells [3]. Though frequency reuse reduces the bandwidth, overall it allows increasing SINRs for most of the users.

Another possibility is to optimize transmit powers. In this case, the $j$-th base station uses the power $\rho_f^{[kj]}$ for transmitting to the $k$-th user in the $j$-th cell and transmits the vector

$$ x_j = \sum_{k=1}^{K} \sqrt{\rho_f^{[kj]}}\; u_j^{[k]} s^{[kj]}. $$

Similarly, in the uplink transmission, the $k$-th user in the $j$-th cell will use the transmit power $\rho_r^{[kj]}$. Then, the SINR expressions (9) and (11) will be respectively

$$ \lim_{M \to \infty} \mathrm{SINR}_D^{[kj]} \overset{a.s.}{=} \frac{\rho_f^{[kj]} \beta_j^{[kj]2} / \eta_j^{[k]2}}{\sum_{l=1,\, l \neq j}^{L} \rho_f^{[kl]} \beta_l^{[kj]2} / \eta_l^{[k]2}}, \quad \text{and} \quad \lim_{M \to \infty} \mathrm{SINR}_U^{[kj]} \overset{a.s.}{=} \frac{\rho_r^{[kj]} \beta_j^{[kj]2}}{\sum_{l=1,\, l \neq j}^{L} \rho_r^{[kl]} \beta_j^{[kl]2}}. $$

It is also possible to use a modified TDD protocol in which users from different cells transmit pilot sequences asynchronously according to the time-shifted protocol proposed in [12], [10], [11]. For sufficiently large $M$ the time-shifted protocol gives a significant increase in downlink and uplink SINRs.

One more possibility is to replace conjugate precoding (6) with another linear precoding that mitigates the pilot contamination effect [19].

In all the above techniques, however, downlink and uplink SINRs approach some finite limits as $M$ tends to infinity. In other words, SINRs do not grow with $M$.

To obtain SINRs that grow along with $M$, one may try to use the network MIMO approach (see for example [20], [21], [22] and references within). Network MIMO assumes that the $j$-th base station estimates the coefficients $\beta_j^{[kl]}$ and $h_{mj}^{[kl]}$ for all $k = 1, \ldots, K$ and $l = 1, \ldots, L$, and sends them to other base stations. After that, all base stations start to behave as one super large antenna array. This approach, however, seems to be infeasible for the following reasons.

First, the number of small-scale fading coefficients $h_{mj}^{[kl]}$ is proportional to $M$. Thus, in the asymptotic regime, as $M$ tends to infinity, the needed traffic between base stations also grows infinitely and network MIMO becomes infeasible.

Even in the case of a finite $M$ the needed traffic is tremendously large. Indeed, the small-scale fading coefficients
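$$ \mathrm{SINR}_U^{[kj]} = \frac{\rho_r \left| \hat{g}_j^{[kj]\dagger} g_j^{[kj]} \right|^2}{\sum_{\substack{n=1 \\ n \neq k}}^{K} \rho_r \left| \hat{g}_j^{[kj]\dagger} g_j^{[nj]} \right|^2 + \sum_{\substack{l=1 \\ l \neq j}}^{L} \sum_{n=1}^{K} \rho_r \left| \hat{g}_j^{[kj]\dagger} g_j^{[nl]} \right|^2 + \mathrm{Var}\left[ \hat{g}_j^{[kj]\dagger} w^{[kj]} \right]} $$

$$ \mathrm{SINR}_D^{[kj]} = \frac{\frac{\rho_f}{\lambda_j^{[kj]2}} \left| \hat{g}_j^{[kj]\dagger} g_j^{[kj]} \right|^2}{\sum_{\substack{n=1 \\ n \neq k}}^{K} \frac{\rho_f}{\lambda_j^{[nj]2}} \left| \hat{g}_j^{[nj]\dagger} g_j^{[kj]} \right|^2 + \sum_{\substack{l=1 \\ l \neq j}}^{L} \sum_{n=1}^{K} \frac{\rho_f}{\lambda_l^{[nl]2}} \left| \hat{g}_l^{[nl]\dagger} g_l^{[kj]} \right|^2 + \mathrm{Var}\left[ w^{[kj]} \right]} $$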
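The uplink expression above can be checked numerically against the limit in Theorem 3. The following is a minimal NumPy sketch, not the authors' simulation setup: the function names `crandn`, `uplink_sinr`, and `uplink_sinr_limit` are hypothetical, the observed base station and user are fixed to index 0, the pilot length is taken as $\tau = K$, and the noise term is evaluated conditionally on the channel estimate as $\|\hat{g}\|^2$.

```python
import numpy as np

def crandn(rng, *shape):
    """Draw i.i.d. CN(0, 1) samples of the given shape."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

def uplink_sinr(beta, M, rho_r=1.0, rng=None):
    """One-draw uplink SINR of user 0 of cell 0, seen at base station 0.

    beta[l, n] is the large-scale coefficient between base station 0 and
    user n of cell l (an illustrative layout, not taken from the paper).
    """
    rng = np.random.default_rng() if rng is None else rng
    L, K = beta.shape
    tau = K  # assumption: pilot length equals the number of users per cell
    # Channel vectors g = sqrt(beta) * h with h ~ CN(0, I_M), as in (1)
    g = np.sqrt(beta)[:, :, None] * crandn(rng, L, K, M)
    # De-spread pilot 0: users 0 of all cells share it (pilot contamination)
    y_pilot = np.sqrt(rho_r * tau) * g[:, 0, :].sum(axis=0) + crandn(rng, M)
    theta = np.sqrt(rho_r * tau) * beta[0, 0] / (1 + rho_r * tau * beta[:, 0].sum())
    g_hat = theta * y_pilot  # MMSE estimate, as in (3)
    # rho_r * |g_hat^H g|^2 for every user in the network, shape (L, K)
    gains = rho_r * np.abs(g.conj() @ g_hat) ** 2
    signal = gains[0, 0]
    noise = np.vdot(g_hat, g_hat).real  # variance of g_hat^H w given g_hat
    return signal / (gains.sum() - signal + noise)

def uplink_sinr_limit(beta):
    """Asymptotic limit (11) for user 0 of cell 0."""
    return beta[0, 0] ** 2 / (beta[1:, 0] ** 2).sum()

if __name__ == "__main__":
    beta = np.array([[1.0, 0.8], [0.1, 0.2], [0.1, 0.05]])
    rng = np.random.default_rng(0)
    for M in (100, 10_000):
        print(M, uplink_sinr(beta, M, rng=rng), uplink_sinr_limit(beta))
```

With these assumed coefficients the empirical SINR should stay below the limit for moderate $M$ and approach it as $M$ grows, which illustrates the saturation caused by the users that share pilot 0.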
depend on frequency. The typical assumption is that small-scale fading coefficients $h_{mj}^{[kl]}$ of OFDM tones $i$ and $i + \Delta$ are considered to be independent random variables when $\Delta$ is the coherence bandwidth, which typically is between 10 and 20 subcarriers. Thus, if $M = 100$ and the total number of OFDM subcarriers is, say, $N = 1400$, and $\Delta = 14$, then the $j$-th base station needs to transmit to other base stations $NM/\Delta \cdot K(L-1) = 10000K(L-1)$ small-scale fading coefficients for given $k$ and $l$. Note that the small-scale fading coefficients substantially change as soon as a mobile moves a quarter of the wavelength. Taking all this into account we conclude that the needed traffic between base stations is hardly feasible.

The second reason is even more fundamental. Since users in different cells reuse the same pilot sequences, the $j$-th base station is not capable of obtaining independent estimates for the coefficients $h_{mj}^{[kj]}$ and $h_{mj}^{[kl]}$, since the $k$-th users in cells $j$ and $l$ use the same pilot sequence $r^{[k]}$. Thus, the standard network MIMO approach is not applicable even if we ignore the traffic problem.

One possible conclusion of the above arguments can be that in both noncooperative LSASs and LSASs with cooperation (like network MIMO), SINRs do not grow with $M$ beyond certain limits. In this paper, we disprove the above statement. We demonstrate that limited cooperation between base stations allows us to completely resolve the pilot contamination problem and to construct interference and noise free LSASs with infinite downlink and uplink SINRs.

B. Large-Scale Fading Precoding

We start with changing the Network Assumptions defined in Section II.

Network Assumptions II
1) We assume that the $j$-th base station can accurately estimate and track large-scale fading coefficients $\beta_j^{[kl]}$, $k = 1, \ldots, K$ and $l = 1, \ldots, L$.
2) If $\epsilon_j^{[kj]}$ is a quantity that depends only on large-scale fading coefficients we assume that the $j$-th base station can forward it to the $k$-th user in the $j$-th cell.
3) We assume that all base stations are connected to a network controller (as shown in Fig. 3) and that the large-scale fading coefficients $\beta_j^{[kl]}$ are accessible to the network controller.
4) We assume that all downlink signals $s^{[kj]}$ with $j = 1, \ldots, L$, $k = 1, \ldots, K$, are accessible to all base stations.

Remark 1: We assume that the network controller and base stations have access to all, across the entire network, $\beta_j^{[kl]}$ and $s^{[kj]}$ respectively, only in order to obtain a simple theoretical model. In Part II of the paper [2], we will replace these assumptions with more realistic ones.

Remark 2: We would like to point out that the large-scale fading coefficients $\beta_j^{[kl]}$ are relatively easy to estimate and track. One possible approach for this is outlined at the end of this section.

Below we describe the Large-Scale Fading Precoding protocol for interference mitigation in LSASs. Originally we designed this protocol in [15] and [16] for canceling the directed interference caused by pilot contamination and called it Pilot Contamination Precoding (PCP). Recently, however, we came to the conclusion that the name Large-Scale Fading Precoding better reflects the idea of this protocol.

Large-Scale Fading Precoding (LSFP)
1) In the beginning of each large-scale coherence block (of duration $T_\beta$ OFDM symbols) the $j$-th base station estimates the large-scale fading coefficients $\beta_j^{[kl]}$, $k = 1, \ldots, K$, $l = 1, \ldots, L$, and sends them to the network controller.
2) The network controller computes the $L \times L$ LSFP precoding matrices

$$ \Phi^{[k]} = \begin{pmatrix} \phi_1^{[k]} \\ \phi_2^{[k]} \\ \vdots \\ \phi_L^{[k]} \end{pmatrix}, \quad k = 1, \ldots, K, $$

as functions of $\beta_j^{[kl]}$, $j, l = 1, \ldots, L$, so that

$$ \| \phi_l^{[k]} \|^2 \leq 1, \tag{12} $$

and sends the rows $\phi_j^{[k]}$, $k = 1, \ldots, K$, to the $j$-th base station.
3) The network controller computes the quantities

$$ \epsilon_j^{[kj]} = \sqrt{\rho_f}\, M \rho_r \tau \sum_{l=1}^{L} \frac{\beta_l^{[kj]} \beta_l^{[kl]} \phi_l^{[kj]}}{\left(1 + \sum_{s=1}^{L} \rho_r \tau \beta_l^{[ks]}\right) \lambda_l^{[kl]}}, \quad k = 1, \ldots, K, $$

(where $\phi_j^{[kl]}$ is the $(j, l)$ entry of $\Phi^{[k]}$) and sends them to the $j$-th base station, which sends them further to the corresponding users located in the $j$-th cell.
4) The $j$-th base station conducts large-scale fading pre-
coding. Namely, it computes signals the k-th user in the j-th cell receives the signal y [kj] it
 [k1]  can estimate the signal s[kj] as
s
 s[k2]  y [kj]
[k]
cj = φ[k] ŝ[kj] = .
 ..  , k = 1, K. (13) [kj]
 
j  j
. 
s[kL] [kj]
Alternatively, instead of sending j base stations can
send downlink training sequences.
(Since $\mathrm{Var}[s^{[kj]}] = 1$, the constraint (12) implies that $\mathrm{Var}[c_j^{[k]}] = 1$.)
5) The $j$-th base station obtains the MMSE estimates $\hat g_j^{[kj]}$, $k = 1, \ldots, K$, according to (3).
6) The $j$-th base station performs small-scale fading precoding, namely it forms conjugate precoding beamforming vectors
$$u_j^{[k]} = \frac{\hat g_j^{[kj]\dagger}}{\lambda_j^{[kj]}}, \quad k = 1, \ldots, K,$$
and transmits from its $M$ antennas the vector
$$x_j = \sqrt{\rho_f} \sum_{k=1}^{K} u_j^{[k]} c_j^{[k]}.$$
(Note that other types of small-scale precoding can be used at this step. For instance, the vectors $u_j^{[k]}$ can be formed with the help of $M$-dimensional zero-forcing precoding.)
The End
The block diagram of this protocol is shown in Fig. 3.

• The estimates $\hat g_j^{[kl]}$ are computed one time for each coherence block of small-scale coefficients, that is once every $T$ OFDM symbols.
• The quantity $c_j^{[k]}$ and the vector $x_j$ are computed for each channel use (each downlink OFDM symbol).

Now we show that an appropriate choice of LSFP precoding matrices $\Phi^{[k]}$ allows one to completely cancel the interference and noise as $M$ tends to infinity. Let us define $L \times L$ matrices composed of large-scale fading coefficients
$$B_D^{[k]} = \begin{pmatrix} \beta_1^{[k1]}/\eta_1^{[k]} & \cdots & \beta_L^{[k1]}/\eta_L^{[k]} \\ \vdots & \ddots & \vdots \\ \beta_1^{[kL]}/\eta_1^{[k]} & \cdots & \beta_L^{[kL]}/\eta_L^{[k]} \end{pmatrix}. \tag{14}$$
For $B_D^{[k]}$ of full rank we define Zero-Forcing LSFP (ZF-LSFP) as the above LSFP with
$$\Phi^{[k]} = \sqrt{\rho_A}\, B_D^{[k]\,-1}, \quad k = 1, \ldots, K, \tag{15}$$
where $\rho_A$ is a normalization factor to ensure the constraint (12).
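As an illustration of definitions (14) and (15), the following minimal sketch builds $B_D^{[k]}$ from given coefficients and checks that $B_D^{[k]} \Phi^{[k]} = \sqrt{\rho_A}\, I$. The function names, the array layout (`beta_k[j][l]` standing for $\beta_l^{[kj]}$, `eta_k[l]` for $\eta_l^{[k]}$), and the numerical values are assumptions made for this example only. The factor $\rho_A$ is treated as a given input, since the constraint (12) that determines it is not restated here.

```python
import math

def invert(matrix):
    """Gauss-Jordan inverse of a small real square matrix (list of lists)."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        # Partial pivoting: bring the largest remaining entry onto the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is not of full rank")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def matmul(a, b):
    """Product of two small matrices given as lists of lists."""
    return [[sum(a[i][t] * b[t][j] for t in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def zf_lsfp(beta_k, eta_k, rho_a):
    """Return (B_D, Phi) for one pilot index k, following (14) and (15).

    beta_k[j][l] is assumed to hold beta_l^{[kj]} and eta_k[l] to hold eta_l^{[k]}.
    """
    size = len(beta_k)
    b_d = [[beta_k[j][l] / eta_k[l] for l in range(size)] for j in range(size)]
    phi = [[math.sqrt(rho_a) * x for x in row] for row in invert(b_d)]
    return b_d, phi

# Hypothetical coefficients for L = 3 cells (illustrative values only).
beta_k = [[1.0, 0.3, 0.1], [0.2, 1.0, 0.25], [0.15, 0.2, 1.0]]
eta_k = [1.5, 1.4, 1.6]
b_d, phi = zf_lsfp(beta_k, eta_k, rho_a=0.5)
for row in matmul(b_d, phi):
    print([round(x, 6) for x in row])
```

The product is a scaled identity, which is the algebraic reason why the cross-cell terms vanish once the normalized effective channel matrix converges to a multiple of $B_D^{[k]}$.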
For analysis of ZF-LSFP it is convenient to make a small
[kj]
modification of the LSFP assuming that at Step 3 j =
s[kl] p
Large−scale
User 1 M ρ f ρr ρA τ .
precoding User 1
Let us assume now that signals $s^{[kj]}$ are taken from a signal constellation $R = \{r_1, \ldots, r_N\}$ according to some probability mass function (PMF) $P_S(\cdot)$. Define further the entropy
$$H(s^{[kj]}) = -\sum_{s \in R} P_S(s) \log P_S(s).$$
Assume also that all $|r_j|$, $r_j \in R$, are finite.
The $k$-th user in the $j$-th cell receives the signal
$$y^{[kj]} = \sum_{l=1}^{L} \sum_{n=1}^{K} \frac{\sqrt{\rho_f}}{\lambda_l^{[nl]}}\, \hat g_l^{[nl]\dagger} g_l^{[kj]} c_l^{[n]} + w^{[kj]}. \tag{16}$$

[Figure 3: block diagram. Labels: Network Controller (LSFP Computation, input $\beta_j^{[kl]}$); BS 1, ..., BS L, each with large-scale precoding blocks for Users 1, ..., K followed by a small-scale precoding block; Users 1, ..., K in Cell 1, ..., Cell L.]
Fig. 3. System diagram for the LSFP. Each BS performs two levels of precoding. Multi-cell cooperation is based on the large-scale fading coefficients only. Each BS also performs local precoding using estimates of $M$-dimensional fast fading vectors.

Let the user use the following simple detection method for the transmitted signal $s^{[kj]}$:
$$\hat s_M^{[kj]} = \frac{y^{[kj]}}{\sqrt{M \rho_f \rho_r \rho_A \tau}}. \tag{17}$$
The following theorem shows that in the asymptotic regime, as $M \to \infty$, the ZF-LSFP allows one to reliably transmit signals from an arbitrarily large $R$ and therefore it provides infinite capacity.
Theorem 4: For ZF-LSFP we have
$$\lim_{M \to \infty} I(\hat s_M^{[kj]}; s^{[kj]}) = H(s^{[kj]}), \quad k = 1, \ldots, K, \; j = 1, \ldots, L. \tag{18}$$

We would like to point out the following things about this algorithm.
• No exchange of small-scale fading coefficients $h_{mj}^{[kl]}$ between base stations and the network controller is required (in contrast to network MIMO).
• Steps 1-3 are conducted once every large-scale coherence block, which is typically about 40 times longer than the coherence blocks of small-scale coefficients.
[kj]
• The purpose of quantities j is to send to the cor-
To prove this theorem we need the following lemma. Let
responding users the expected powers of their effective
channels (see (33) and (34) in Section IV for details). If sM = aM · s + wM ,
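where $s$ is a random signal from $R$ and $a_M$ and $w_M$ are random variables (not necessarily independent) such that
$$\lim_{M \to \infty} a_M \overset{a.s.}{=} 1 \quad \text{and} \quad \lim_{M \to \infty} w_M \overset{a.s.}{=} 0. \tag{19}$$
Lemma 5:
$$\lim_{M \to \infty} I(s_M; s) = H(s).$$
A proof of this lemma is in Appendix A.
We will also need the following well known fact. Let $\phi$ and $\beta$ be some constants and $a_M$ and $b_M$ be random variables such that
$$\lim_{M \to \infty} a_M \overset{a.s.}{=} a \quad \text{and} \quad \lim_{M \to \infty} b_M \overset{a.s.}{=} b,$$
where $a$ and $b$ are some constants. Then
$$\lim_{M \to \infty} (a_M \phi + b_M \beta) \overset{a.s.}{=} a\phi + b\beta. \tag{20}$$
Indeed, let $(\Omega, F, P)$ be the probability space on which $a_M$ and $b_M$ are defined. Let further
$$A = \{\omega \in \Omega : \lim_{M \to \infty} a_M(\omega) = a\},$$
$$B = \{\omega \in \Omega : \lim_{M \to \infty} b_M(\omega) = b\}, \text{ and}$$
$$C = \{\omega \in \Omega : \lim_{M \to \infty} a_M(\omega)\phi + b_M(\omega)\beta = a\phi + b\beta\}.$$
Then $A \cap B \subseteq C$. So we have
$$\Pr(C) \ge \Pr(A \cap B) = \Pr((A^c \cup B^c)^c) = 1 - \Pr(A^c \cup B^c) \ge 1 - \Pr(A^c) - \Pr(B^c) = 1.$$
Proof: (Theorem 4) The signal $y^{[kj]}$ defined in (16) can be written in the form:
$$y^{[kj]} = \sum_{l=1}^{L} \frac{\sqrt{\rho_f}}{\lambda_l^{[kl]}}\, \hat g_l^{[kl]\dagger} g_l^{[kj]} c_l^{[k]} + \sum_{l=1}^{L} \sum_{\substack{n=1 \\ n \ne k}}^{K} \frac{\sqrt{\rho_f}}{\lambda_l^{[nl]}}\, \hat g_l^{[nl]\dagger} g_l^{[kj]} c_l^{[n]} + w^{[kj]}. \tag{21}$$
Let us denote the factors in front of $c_l^{[k]}$ in the first sum by
$$f_l^{[kj]} = \frac{\sqrt{\rho_f}}{\lambda_l^{[kl]}}\, \hat g_l^{[kl]\dagger} g_l^{[kj]} = \frac{\sqrt{\rho_f}}{\lambda_l^{[kl]}}\, \theta_l^{[kl]} \left( \sqrt{\rho_r \tau} \sum_{s=1}^{L} g_l^{[ks]\dagger} + \hat w_l^{[kl]\dagger} \right) g_l^{[kj]}.$$
Opening the parentheses and applying Lemma 2 to each term of the above expression, we obtain
$$\lim_{M \to \infty} \frac{1}{\sqrt{M}} f_l^{[kj]} \overset{a.s.}{=} \frac{\sqrt{\rho_f \rho_r \tau}}{\eta_l^{[k]}}\, \beta_l^{[kj]}. \tag{22}$$
Denote further the terms in the second sum of (21) by
$$q_{nl}^{[kj]} = \frac{\sqrt{\rho_f}}{\lambda_l^{[nl]}}\, \hat g_l^{[nl]\dagger} g_l^{[kj]} c_l^{[n]} = \frac{\sqrt{\rho_f}}{\lambda_l^{[nl]}}\, \theta_l^{[nl]} \left( \sqrt{\rho_r \tau} \sum_{s=1}^{L} g_l^{[ns]\dagger} + \hat w_l^{[nl]\dagger} \right) g_l^{[kj]} c_l^{[n]}.$$
Again, using Lemma 2, we obtain
$$\lim_{M \to \infty} \frac{1}{\sqrt{M}} q_{nl}^{[kj]} \overset{a.s.}{=} 0. \tag{23}$$
For $k = 1, \ldots, K$, denote
$$F^{[k]} = \begin{pmatrix} f_1^{[k1]} & \cdots & f_L^{[k1]} \\ \vdots & \ddots & \vdots \\ f_1^{[kL]} & \cdots & f_L^{[kL]} \end{pmatrix},$$
$$y^{[k]} = \begin{pmatrix} y^{[k1]} \\ \vdots \\ y^{[kL]} \end{pmatrix}, \quad c^{[k]} = \begin{pmatrix} c_1^{[k]} \\ \vdots \\ c_L^{[k]} \end{pmatrix}, \quad s^{[k]} = \begin{pmatrix} s^{[k1]} \\ \vdots \\ s^{[kL]} \end{pmatrix},$$
and
$$q^{[k]} = \begin{pmatrix} \sum_{l=1}^{L} \sum_{n=1, n \ne k}^{K} q_{nl}^{[k1]} \\ \vdots \\ \sum_{l=1}^{L} \sum_{n=1, n \ne k}^{K} q_{nl}^{[kL]} \end{pmatrix}, \quad w^{[k]} = \begin{pmatrix} w^{[k1]} \\ \vdots \\ w^{[kL]} \end{pmatrix}.$$
With this notation we have
$$y^{[k]} = F^{[k]} c^{[k]} + q^{[k]} + w^{[k]} = F^{[k]} \Phi^{[k]} s^{[k]} + q^{[k]} + w^{[k]}. \tag{24}$$
Let $V^{[k]} = F^{[k]} \Phi^{[k]}$. According to (22), the entries of $F^{[k]}$, normalized by $1/\sqrt{M}$, almost surely converge to the corresponding entries of the matrix $\sqrt{\rho_f \rho_r \tau}\, B_D^{[k]}$. From this, (20), and from the definition of $\Phi^{[k]}$ it follows that the entries $v_{im}^{[k]}$ of $V^{[k]}$ have the property:
$$\lim_{M \to \infty} \frac{1}{\sqrt{M}} v_{im}^{[k]} \overset{a.s.}{=} \delta_{im} \sqrt{\rho_f \rho_r \rho_A \tau}, \quad i, m = 1, \ldots, L.$$
From (24) and (17) we have
$$\hat s_M^{[kj]} = \frac{1}{\sqrt{M}} \frac{1}{\sqrt{\rho_f \rho_r \rho_A \tau}} \Big( v_{jj}^{[k]} s^{[kj]} + \sum_{\substack{l=1 \\ l \ne j}}^{L} v_{jl}^{[k]} s^{[kl]} + \sum_{l=1}^{L} \sum_{\substack{n=1 \\ n \ne k}}^{K} q_{nl}^{[kj]} + w^{[kj]} \Big).$$
Applying now Lemma 5 to the above expression we obtain
$$\lim_{M \to \infty} I(\hat s_M^{[kj]}; s^{[kj]}) = H(s^{[kj]}).$$

Thus, under our network assumptions defined in Section III, we constructed a noise-free and interference-free multi-cell LSAS with frequency reuse 1. In such an LSAS the size of the modulation $R$ can be chosen arbitrarily large, and therefore the LSAS can achieve an arbitrarily large capacity.

C. Large-Scale Fading Decoding
With the same Network Assumptions II we define the following protocol for uplink data transmission.
Large-Scale Fading Decoding
1) The $j$-th base station estimates the large-scale fading coefficients $\beta_j^{[kn]}$, $k = 1, \ldots, K$, $n = 1, \ldots, L$, and sends them to the network controller.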
2) The controller computes $L \times L$ decoding matrices
$$\Omega^{[k]} = \begin{pmatrix} \omega_1^{[k]} \\ \omega_2^{[k]} \\ \vdots \\ \omega_L^{[k]} \end{pmatrix}, \quad k = 1, \ldots, K,$$
as functions of $\beta_j^{[kl]}$, $j, l = 1, \ldots, L$.
3) The $j$-th base station computes the MMSE estimates $\hat g_j^{[kj]}$, $k = 1, \ldots, K$, according to (3).
4) The $j$-th base station receives the vector $y_j$ defined in (2) and computes the matched filtering estimates
$$\tilde x^{[kj]} = \hat g_j^{[kj]\dagger} y_j, \quad k = 1, \ldots, K,$$
of the uplink signals $x^{[kj]}$ (other options, for instance an $M$-dimensional Zero Forcing or MMSE receiver, are possible here) and sends them to the network controller.
5) The network controller computes the following estimates of $x^{[kj]}$:
$$\hat x_M^{[kj]} = \frac{1}{M \theta_j^{[kj]} \sqrt{\rho_r \tau}}\, \omega_j^{[k]} \begin{pmatrix} \tilde x^{[k1]} \\ \vdots \\ \tilde x^{[kL]} \end{pmatrix}, \quad k = 1, \ldots, K.$$
The End
We would like to point out the following things.
• Unlike the network MIMO approach, small-scale fading coefficients $h_{mj}^{[kj]}$ are used locally by the $j$-th base station and are not sent to the network controller.
• Steps 1 and 2 are conducted only one time every large-scale coherence block, that is once every $T_\beta$ OFDM symbols.
• The estimates $\hat g_j^{[kl]}$ at Step 3 are computed one time for each coherence block, that is once every $T$ OFDM symbols.
• Steps 4 and 5 are conducted for each channel use (each OFDM symbol).
Let
$$B_U^{[k]} = \begin{pmatrix} \beta_1^{[k1]} & \cdots & \beta_1^{[kL]} \\ \vdots & \ddots & \vdots \\ \beta_L^{[k1]} & \cdots & \beta_L^{[kL]} \end{pmatrix}.$$
Similar to the downlink case, we define Zero-Forcing LSFD as LSFD with
$$\Omega^{[k]} = B_U^{[k]\,-1}, \quad k = 1, \ldots, K. \tag{25}$$
We again assume that signals $x^{[kj]}$ belong to a signal constellation $R = \{r_1, \ldots, r_N\}$ with PMF $P_S(\cdot)$.
Theorem 6: For Large-Scale Fading Decoding with the zero-forcing matrices (25), we have
$$\lim_{M \to \infty} I(\hat x_M^{[kj]}; x^{[kj]}) = H(x^{[kj]}).$$
Proof: Using (2) and (3), we get
$$\tilde x^{[kj]} = \hat g_j^{[kj]\dagger} y_j = \sum_{l=1}^{L} \underbrace{\theta_j^{[kj]} \sqrt{\rho_r \tau} \sum_{s=1}^{L} g_j^{[ks]\dagger} g_j^{[kl]}}_{f_j^{[kl]}}\, x^{[kl]} + \underbrace{\theta_j^{[kj]} \Big( \sum_{s=1}^{L} \sqrt{\rho_r \tau}\, g_j^{[ks]} + \hat w_j^{[kj]} \Big)^{\dagger} \Big( \sum_{l=1}^{L} \sum_{\substack{n=1 \\ n \ne k}}^{K} \sqrt{\rho_r}\, g_j^{[nl]} x^{[nl]} + w_j \Big)}_{q^{[kj]}}.$$
Applying Lemma 1 we obtain
$$\lim_{M \to \infty} \frac{1}{M} f_j^{[kl]} \overset{a.s.}{=} \theta_j^{[kj]} \sqrt{\rho_r \tau}\, \beta_j^{[kl]}, \quad \text{and} \quad \lim_{M \to \infty} \frac{1}{M} q^{[kj]} \overset{a.s.}{=} 0.$$
Denote
$$F^{[k]} = \begin{pmatrix} f_1^{[k1]} & \cdots & f_1^{[kL]} \\ \vdots & \ddots & \vdots \\ f_L^{[k1]} & \cdots & f_L^{[kL]} \end{pmatrix},$$
and
$$\tilde x^{[k]} = \begin{pmatrix} \tilde x^{[k1]} \\ \vdots \\ \tilde x^{[kL]} \end{pmatrix}, \quad x^{[k]} = \begin{pmatrix} x^{[k1]} \\ \vdots \\ x^{[kL]} \end{pmatrix}, \quad q^{[k]} = \begin{pmatrix} q^{[k1]} \\ \vdots \\ q^{[kL]} \end{pmatrix}.$$
With this notation we have
$$\tilde x^{[k]} = F^{[k]} x^{[k]} + q^{[k]}.$$
Hence
$$\begin{pmatrix} \hat x_M^{[k1]} \\ \vdots \\ \hat x_M^{[kL]} \end{pmatrix} = \frac{1}{M \theta_j^{[kj]} \sqrt{\rho_r \tau}}\, \Omega^{[k]} \tilde x^{[k]} = \frac{1}{M \theta_j^{[kj]} \sqrt{\rho_r \tau}} \left( \Omega^{[k]} F^{[k]} x^{[k]} + \Omega^{[k]} q^{[k]} \right). \tag{26}$$
Note that as $M$ grows the matrix $\frac{1}{M} F^{[k]}$ almost surely converges to the matrix $\theta_j^{[kj]} \sqrt{\rho_r \tau}\, B_U^{[k]}$.
Let $V^{[k]} = \Omega^{[k]} F^{[k]}$. Applying Lemma 2 and (20) to the entries of $V^{[k]}$, we get
$$\lim_{M \to \infty} \frac{1}{M \theta_j^{[kj]} \sqrt{\rho_r \tau}}\, v_{im}^{[k]} \overset{a.s.}{=} \delta_{im}, \quad i, m = 1, \ldots, L.$$
From Lemma 2 and (20) it also follows that each entry of the vector $\Omega^{[k]} q^{[k]}$, after the same normalization, almost surely converges to 0. Now applying Lemma 5 to the entries of the vector (26) we obtain
$$\lim_{M \to \infty} I(\hat x_M^{[kj]}; x^{[kj]}) = H(x^{[kj]}).$$

Again we obtained a noise-free and interference-free multi-cell LSAS, that is an LSAS with arbitrarily large capacity, with frequency reuse 1.
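The asymptotic statement above can be illustrated by a small Monte Carlo experiment. The sketch below is only a toy model under stated assumptions: a single pilot index $k$, one user per cell ($K = 1$), real BPSK symbols, i.i.d. $CN(0,1)$ small-scale fading, and the uplink model $y_j = \sqrt{\rho_r} \sum_l g_j^{[kl]} x^{[kl]} + w_j$. Each matched-filter output is normalized by $M \theta_j^{[kj]} \sqrt{\rho_r \tau} \sqrt{\rho_r}$ (the last factor comes from the assumed uplink model and equals 1 for $\rho_r = 1$), and the zero-forcing step (25) is applied by solving $B_U^{[k]} \hat x = z$. All function names and numerical values are hypothetical.

```python
import math
import random

def cn_vector(m, variance, rng):
    """Vector of m i.i.d. circularly symmetric complex Gaussians CN(0, variance)."""
    s = math.sqrt(variance / 2.0)
    return [complex(rng.gauss(0.0, s), rng.gauss(0.0, s)) for _ in range(m)]

def inner(a, b):
    """Inner product a^dagger b."""
    return sum(x.conjugate() * y for x, y in zip(a, b))

def solve(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(matrix)
    aug = [[complex(v) for v in matrix[i]] + [complex(rhs[i])] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    x = [0j] * n
    for i in reversed(range(n)):
        acc = aug[i][n] - sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = acc / aug[i][i]
    return x

def simulate_zf_lsfd(beta, symbols, m, rho_r=1.0, tau=10, seed=0):
    """beta[j][l] is assumed to hold beta_j^{[kl]} (base station j, user of cell l)."""
    rng = random.Random(seed)
    cells = len(beta)
    z = []
    for j in range(cells):
        g = [cn_vector(m, beta[j][l], rng) for l in range(cells)]
        w_pilot = cn_vector(m, 1.0, rng)
        w_data = cn_vector(m, 1.0, rng)
        theta = math.sqrt(rho_r * tau) * beta[j][j] / (1.0 + rho_r * tau * sum(beta[j]))
        # Pilot-contaminated channel estimate: all users sharing pilot k add up.
        g_hat = [theta * (math.sqrt(rho_r * tau) * sum(g[s][a] for s in range(cells))
                          + w_pilot[a]) for a in range(m)]
        # Uplink data signal received at base station j.
        y = [math.sqrt(rho_r) * sum(g[l][a] * symbols[l] for l in range(cells))
             + w_data[a] for a in range(m)]
        x_tilde = inner(g_hat, y)
        z.append(x_tilde / (m * theta * math.sqrt(rho_r * tau) * math.sqrt(rho_r)))
    # Zero-forcing LSFD: apply the inverse of B_U to the normalized outputs.
    return solve(beta, z)

beta = [[1.0, 0.2, 0.1], [0.25, 1.0, 0.2], [0.1, 0.3, 1.0]]
symbols = [1.0, -1.0, 1.0]
x_hat = simulate_zf_lsfd(beta, symbols, m=4000)
print([round(abs(e - s), 3) for e, s in zip(x_hat, symbols)])
```

The printed residuals shrink as `m` grows, in line with the almost sure convergence used in the proof; for small `m` the zero-forcing step can amplify the remaining noise, which is the finite-$M$ effect studied in Section IV.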
D. Estimation of Large-Scale Fading Coefficients
In this subsection we outline one possible algorithm for estimation of large-scale fading coefficients.
First, we recall that the coefficients $\beta_j^{[kl]}$ depend neither on antenna indices nor on OFDM tone indices. Thus, between any given base station and a mobile, there is only one large-scale fading coefficient. Second, these coefficients change only when a mobile significantly changes its geographical location. The standard assumption is that within a radius of 10 wavelengths the large-scale fading coefficients are approximately constant. In contrast, small-scale fading coefficients change significantly as soon as a user moves by a quarter of the wavelength. Thus, large-scale fading coefficients change about 40 times slower than small-scale fading coefficients.
Let us enumerate all users across the entire network by integers from 1 to $LK$. Denote by $\beta_j^{[i]}$ and $h_j^{[i]}$ the large-scale fading coefficient and fast fading vector between the $i$-th user and the $j$-th base station. Let further $v^{[r]}$, $r = 1, \ldots, \mu$, be a set of mutually orthogonal $\mu$-tuples of norm 1. Since the coefficient $\beta_j^{[i]}$ does not depend on OFDM tone indices, it is enough if the $i$-th user transmits a training sequence, say $v^{[1]}$, in only one OFDM tone. We assume that no other user transmits $v^{[1]}$ in this OFDM tone and that the users that transmit $v^{[r]}$, $r \ge 2$, in this OFDM tone are enumerated by $2, \ldots, \mu$. In this OFDM tone the $j$-th base station receives the signal
$$Y_j = \sqrt{\rho_r \mu} \sqrt{\beta_j^{[i]}}\, h_j^{[i]} v^{[1]\dagger} + \sqrt{\rho_r \mu} \sum_{k=2}^{\mu} \sqrt{\beta_j^{[k]}}\, h_j^{[k]} v^{[k]\dagger} + W_j,$$
where $W_j \in \mathbb{C}^{M \times \mu}$ is the additive white Gaussian noise matrix with i.i.d. $CN(0, 1)$ entries. In order to estimate $\beta_j^{[i]}$, the $j$-th base station first computes
$$y = Y_j v^{[1]} = \sqrt{\rho_r \mu} \sqrt{\beta_j^{[i]}}\, h_j^{[i]} + \hat w_j^{[i]},$$
and further
$$\hat\beta_j^{[i]} = \frac{1}{M \rho_r \mu}\, y^{\dagger} y - \frac{1}{\rho_r \mu}.$$
Taking into account that $\hat w_j^{[i]} \sim CN(0, I_M)$ and using Lemma 2, after simple computations, we obtain
$$\lim_{M \to \infty} \hat\beta_j^{[i]} \overset{a.s.}{=} \beta_j^{[i]}.$$
Training sequences transmitted in different OFDM tones do not interfere with each other. Thus if $N$ is the number of OFDM tones, the above approach allows us to have $N\mu$ non-interfering training sequences. For example, if $N = 1400$, $\mu = 8$, and $K = 20$, this approach allows one to estimate large-scale fading coefficients in a network of $L = N\mu/K = 560$ cells. In real-life applications we need to estimate large-scale coefficients only of the users located in neighboring cells, and users located far away from each other can reuse the same OFDM tone and training sequence.

IV. FINITE M ANALYSIS
In this section we derive SINR expressions for downlink and uplink LSFPs in which the number $M$ of base station antennas appears as a parameter. We do not assume ZF-LSFP; instead we consider generic LSFPs in which the matrices $\Phi^{[k]}$ and $\Omega^{[k]}$ can have arbitrary entries, with the constraints that the transmit powers of each base station and each user, on average, do not exceed $\rho_f$ and $\rho_r$ respectively.
First, we would like to note that though the $j$-th base station needs only the MMSE estimates $\hat g_j^{[kj]}$, $k = 1, \ldots, K$, it can also compute the MMSE estimates of the channel vectors $\hat g_j^{[kl]}$, $k = 1, \ldots, K$, $l = 1, \ldots, L$, between itself and all the users in the network as
$$\hat g_j^{[kl]} = Y_j (\theta_j^{[kl]} r^{[k]}) = \theta_j^{[kl]} \sqrt{\rho_r \tau} \sum_{s=1}^{L} g_j^{[ks]} + \hat w_j^{[kl]}, \tag{27}$$
where
$$\theta_j^{[kl]} = \frac{\sqrt{\rho_r \tau}\, \beta_j^{[kl]}}{1 + \rho_r \tau \sum_{s=1}^{L} \beta_j^{[ks]}},$$
and $\hat w_j^{[kl]} \sim CN(0, \theta_j^{[kl]2} I_M)$. Let
$$\tilde g_j^{[kl]} = g_j^{[kl]} - \hat g_j^{[kl]}$$
be the estimation error.
The following properties of $\hat g_j^{[kl]}$ and $\tilde g_j^{[kl]}$ are either well known (see for example [18]) or can be easily derived from their definitions.
1) If $(j, k) \ne (j', k')$ then the vectors $\hat g_j^{[kl]}$ and $\hat g_{j'}^{[k's]}$ are independent for any $l$ and $s$.
2) The vectors $\hat g_j^{[kl]}$ and $\tilde g_i^{[ns]}$ are uncorrelated for any indices $j, k, l, i, n$, and $s$.
3) The vectors $\hat g_j^{[kl]}$ and $\tilde g_j^{[kl]}$ have the following distributions:
$$\hat g_j^{[kl]} \sim CN\left( 0, \frac{\rho_r \tau \beta_j^{[kl]2}}{1 + \sum_{s=1}^{L} \rho_r \tau \beta_j^{[ks]}}\, I_M \right), \tag{28}$$
and
$$\tilde g_j^{[kl]} \sim CN\left( 0, \left( \beta_j^{[kl]} - \frac{\rho_r \tau \beta_j^{[kl]2}}{1 + \sum_{s=1}^{L} \rho_r \tau \beta_j^{[ks]}} \right) I_M \right). \tag{29}$$
4) It is not difficult to show that
$$E[\hat g_j^{[kl]} \hat g_j^{[kj]\dagger}] = \frac{\rho_r \tau \beta_j^{[kl]} \beta_j^{[kj]}}{1 + \sum_{s=1}^{L} \rho_r \tau \beta_j^{[ks]}}\, I_M. \tag{30}$$

A. Performance of Large-Scale Fading Precoding with Finite M
Denote the $(j, v)$ entry of the matrix $\Phi^{[k]}$ by $\phi_j^{[kv]}$. It will be convenient for us to assume that in the LSFP protocol the normalization coefficients $1/\lambda_j^{[kj]}$, used in Step 6 of LSFP, are absorbed into the $\phi_j^{[kv]}$. In other words, we replace $\phi_j^{[kv]}$ with $\phi_j^{[kv]}/\lambda_j^{[kj]}$ and in Step 6 use $u_j^{[k]} = \hat g_j^{[kj]\dagger}$. This allows us to shorten the notation.
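The large-scale fading estimator of Section III-D, $\hat\beta = y^{\dagger} y/(M \rho_r \mu) - 1/(\rho_r \mu)$, can be checked numerically. The sketch below is a toy simulation under stated assumptions: it draws the projected observation $y = \sqrt{\rho_r \mu \beta}\, h + \hat w$ directly, with i.i.d. $CN(0,1)$ entries in $h$ and $\hat w$, instead of simulating the full matrix $Y_j$. The function names and parameter values are hypothetical. The second helper reproduces the pilot-reuse arithmetic $L = N\mu/K$ from the text.

```python
import math
import random

def estimate_beta(beta, m, rho_r=1.0, mu=8, seed=0):
    """Estimate one large-scale fading coefficient from an M-antenna observation."""
    rng = random.Random(seed)
    s = math.sqrt(0.5)  # per-component standard deviation of a CN(0, 1) entry
    gain = math.sqrt(rho_r * mu * beta)
    energy = 0.0
    for _ in range(m):
        h = complex(rng.gauss(0.0, s), rng.gauss(0.0, s))  # small-scale fading
        w = complex(rng.gauss(0.0, s), rng.gauss(0.0, s))  # projected noise
        y = gain * h + w
        energy += abs(y) ** 2  # accumulates y^dagger y
    return energy / (m * rho_r * mu) - 1.0 / (rho_r * mu)

def max_cells(num_tones, mu, users_per_cell):
    """Number of cells served by num_tones * mu non-interfering training sequences."""
    return (num_tones * mu) // users_per_cell

for m in (100, 1000, 20000):
    print(m, round(estimate_beta(0.5, m), 4))
print(max_cells(1400, 8, 20))
```

The estimate fluctuates around the true value with a spread that decreases roughly as $1/\sqrt{M}$, which is why the subtraction of the noise term $1/(\rho_r \mu)$ matters most when $\rho_r \mu \beta$ is small.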
According to the LSFP protocol, the $k$-th terminal in the $l$-th cell receives the signal

To make notation shorter, we denote the six terms (five sums and $w^{[kl]}$) in the above expression by $T_0, \ldots, T_5$, respectively.
L First, we note that these terms are mutually uncorrelated.
√ X [kl] [ns]
Indeed, since s[kl] is independent of any ĝi , we have
y [kl] = ρf xj gj + w[kl]
j=1
E[T0† T1 ]
L X
K
†
[nj] [n] [kl] L X
L
X
= ĝj cj gj + w[kl] . (31) † X [kl]† [kl] [kj]† [kl]
j=1 n=1
=E[s[kl] s[kl] ]ρf φj φi E[ĝj ĝj ]†
j=1 i=1
Taking into account that [ki]† [kl] [ki]† [kl]
[kl] [kl] [kl]
· E[(ĝi ĝi − E[ĝi ĝi ])] = 0.
gj = ĝj + g̃j , and
L Since s[kl] is independent of s[nv] if (k, l) 6= (n, v) we get
[k] [kv] [kv]
X
cj = φj s , (32) E[T0† T2 ] = 0, E[T0† T3 ] = 0, E[T1† T2 ] = 0,
v=1
E[T1† T3 ] = 0, E[T2† T3 ] = 0.
and replacing the random variable in front of s[kl] with its
[kl] [nv]
expected value we obtain: Since g̃j is uncorrelated with any ĝi , it is not difficult to
y [kl] check that T4 is uncorrelated with T0 , T1 , T2 , and T3 . Finally,
L K L K
T5 = w[kl] is independent from all other terms. Thus, we can
√ X X [nj]† [kl] [n] √ X X [nj]† [kl] [n] rewrite (33) as
= ρf ĝj ĝj cj + ρf ĝj g̃j cj
j=1 n=1 j=1 n=1 L
√ X [kl] [kj]† [kl] [kl]
+w [kl] y [kl] = s[kl] ρf φj E[ĝj ĝj ] + wef f ,
j=1
L where the effective noise has the variance

√ X [kl] [kj]† [kl]
=s[kl] ρf φj ĝj ĝj [kl]
j=1 Var[wef f ] = Var[T1 ] + Var[T2 ] + Var[T3 ] + Var[T4 ] + Var[T5 ].
L L
√ X X [kv] [kj]† [kl] Using (30), we obtain
+ ρf s[kv] φj ĝj ĝj
L
v=1 j=1 √ X [kl] [kj]† [kl]
v6=l
ρf φj E[ĝj ĝj ]
K
L X K
L X
√ X [nj]† [kl] [n] √ X [nj]† [kl] [n] j=1
+ ρf ĝj ĝj cj + ρf ĝj g̃j cj L [kl] [kj]
j=1 n=1 j=1 n=1 √ X βj βj [kl]
n6=k = ρf M ρ r τ PL φ ,
[ks] j
(34)
[kl] j=1 1 + s=1 ρr τ βj
+w
[kl]
which defines the quantity l in Step 3 of the LSFP protocol.
[kl]
According to the protocol the l-th base station sends l to
L [kl]
√ X [kl] [kj]† [kl] the corresponding user. So the user can use l to detect signal
= s[kl] ρf φj E[ĝj ĝj ] [kl]
j=1 s[kl] . Note that l depends only on the statistical parameters
| {z } of the channel and not on instantaneous channel realizations.
T0 According to [23, Theorem 1], the worst-case uncorrelated
L additive noise is independent Gaussian noise with the same
√ [kl]
X [kl] [kj]† [kl] [kj]† [kl]
variance. Hence the downlink rate R[kl] can be lower bounded
+ ρf s φj (ĝj ĝj − E[ĝj ĝj ])
j=1 as follows

[kl]
R[kl] = I(y [kl] ; s[kl] l )
| {z }
T1
L [kl]
!
√ X
[kv]
X [kv] [kj]† [kl] |l |2
+ ρf s φj ĝj ĝj ≥ log2 1+ .
Var[T1 ] + Var[T2 ] + Var[T3 ] + Var[T4 ] + Var[T5 ]
v=1 j=1
|
v6=l
{z } (35)
T2
L X
K Now we proceed with finding the variances Var[Tj ], j =
√ X [nj]† [kl] [n] 1, 5.
+ ρf ĝj ĝj cj
j=1 n=1 The term T1 is caused by the channel uncertainty. The
n6=k
| {z } k-th user located in the l-th cell† does not know the actual
[kj] [kl]
T3 value of the effective channel ĝj ĝj , but only the expected
L X
K [kj]† [kl]
√ X [nj]† [kl] [n] [kl] value E[ĝj ĝj ]. So the difference, uncertainty, between the
+ ρf ĝj g̃j cj +w
|{z} . (33)
actual and expected values of the effective channel contributes
j=1 n=1 T5
| {z } to the interference. To estimate the variance of T1 we first
T4 note that the signals s[kl] and s[nv] are independent if (k, l) 6=
(n, v). Next, we note that for any 1 ≤ k, n ≤ K, and 1 ≤ Using (37) and (30), we obtain
[kl] [nv]
l, v ≤ L, the vectors ĝj and ĝi are independent if j 6= i.
[kj]† [kl] [kl]† [kj]
Taking this into account we obtain E[ĝj ĝj ĝj ĝj ]
[kj]† [kl] [kj]† [kl] 2
Var[T1 ] =Var[ĝj ĝj ] + |E[ĝj ĝj ]|
L X
L [kl]2 [kj]2
[kl] [kl]† [kj]† [kl] [kj]† [kl] ρr τ β j ρr τ β j
† X h
=E[s[kl] s[kl] ] φj φi E (ĝj ĝj − E[ĝj ĝj ]) =M · ·
PL [ks] PL [ks]
j=1 i=1 1 + s=1 ρr τ βj 1 + s=1 ρr τ βj
[ki]† [kl] [ki]† [kl] †
i !2
·(ĝi ĝi − E[ĝi ĝi ]) [kl] [kj]
M ρ r τ β j βj
L
+ PL [ks]
.
X [kl] [kj]† [kl] 1 + s=1 ρr τ βj
= |φj |2 Var[ĝj ĝj ].
j=1
From this, we have
[kj]† [kl]
For computing Var[ĝj ĝj ] we note that according to (27)
[kj] [kl] Var[T2 ]
ĝj is proportional to ĝj , namely 2
L M ρr τ βj[kj] βj[kl]
L X
[kj] [kv]
X
[kj] θj [kl] = PL φ
[ks] j
ĝj = [kl]
ĝj . (36) v=1 j=1 1 +

s=1 ρr τ βj
θj v6=l
L L [kl]2 [kj]2
Let z = (z1 , . . . , zM )> ∼ CN (0, IM ). It is well know that
X X ρr τ β j ρr τ βj [kv]
+ M PL [ks] PL |φ |2 .
[ks] j
j=1 1 + s=1 ρr τ βj 1 + s=1 ρr τ βj
zi† zi
v=1
zi ∼ CN (0, 1) and ∼ Γ(3, 1), v6=l
(39)
and therefore Var[zi† zi ] = 1. Using this, (36), and (28), we get
h
[kj]† [kl]
i Let us consider now the term T3 , which is caused by
Var ĝj ĝj the nonorthogonality of channel vectors. In the asymptotic
[kl]2 [kj]2 regime, as M tends to infinity, the normalized inner-product
ρr τ β j ρr τ β j [kl] [nj]
= PL [ks] PL [ks]
Var[z† z] of vectors ĝj and ĝj almost surely converges to zero. For
1 + s=1 ρr τ βj 1 + s=1 ρr τ βj finite M , however, this is not the case and T3 may significantly
[kl]2 [kj]2 contribute to the interference. From (32), we have
ρr τ β j ρr τ β j
= PL [ks] PL [ks]
· M. (37)
1 + s=1 ρr τ βj 1 + s=1 ρr τ βj L X
L L
[n] [nv] [nu]† † [nv] 2
X X
E[|cj |2 ] = φj φj E[s[nv] s[nu] ] = |φj | .
Thus v=1 u=1 v=1
Var[T1 ] [nj] [kj]

[kl]2 [kj]2
Using this, (30), and the fact that ĝj and gj are uncorre-
L
X [kl] 2
ρr τ β j ρr τ β j lated, we obtain
=M φj PL [ks] PL [ks]
.
j=1 1+ s=1 ρr τ βj 1+ s=1 ρr τ βj
(38) Var[T3 ]
L X
K
Next, we consider the term T2 . This term is caused by the [nj]† [kl] [kl]† [nj] [n]
X
= E[ĝj ĝj ĝj ĝj ] · E[|cj |2 ]
pilot contamination effect. Since the k-th users in cells j and j=1 n=1
[kj] n6=k
l use the same training sequence the vector ĝj is correlated L X
K
[kl] [nj] [nj]† [kl] [kl]† [n]
with the vector ĝj even if l 6= j. Taking into account the
X
= E[Tr(ĝj ĝj ĝj ĝj )] · E[|cj |2 ]
[nv]
same facts about signals s[nv] and vectors ĝi that we used j=1 n=1
n6=k
in the derivation of Var[T1 ], we obtain L X
K
[nj] [nj]† [kl] [kl]†

[n]
X
Var[T2 ] = Tr E[ĝj ĝj ]E[ĝj ĝj )] · E[|cj |2 ]
j=1 n=1
l L X
L n6=k
† † † †
[kv] [kv] [kj] [kl] [kl] [ki]
X X
= E[s[kv] s[kv] ] φj φi E[ĝj ĝj ĝi ĝi ] L X
X K
ρr τ β j
[kl]2
ρr τ β j
[nj]2
v=1
v6=l
j=1 i=1 =M PL [ks]
· PL [ks]
j=1 n=1 1 + s=1 ρr τ βj 1 + s=1 ρr τ βj
l X
L X
L n6=k
[kv] [kv]† [kj]† [kl] [kl]† [ki]
X
= φj φi E[ĝj ĝj ]E[ĝi ĝi ] L
[kv] 2
X
v=1
v6=l
j=1 i=1
i6=j
· |φj | .
v=1
l X
L
[kv] 2 [kj]† [kl] [kl]† [kj]
X
+ |φj | E[ĝj ĝj ĝj ĝj ].
Term T4 is caused by estimation errors of channel vectors.
v=1 j=1 [kl] [nv]
v6=l Since ĝj is uncorrelated with any g̃i , using (27) and (29)
we obtain and
L 2
[kl]
L X
K ρf ρr τ X βv
[nj] [nj]† [kl] [kl]† I1 = ,

[n]
X
Var[T4 ] = Tr E[ĝj ĝj ]E[g̃j g̃j ] · E[|cj |2 ] M v=1 1 +
PL [ks]
s=1 ρr τ βv
j=1 n=1 v6=l
2
[nj]
 
L X
K [kl]2 L K
ρr τ β j ρf X X βj [kl]
βj[kl]
X
=M − I2 = PL [ns]
βj
M j=1 n=1 1 +

PL [ks]
j=1 n=1 1+ s=1 ρr τ βj s=1 ρr τ βj
L P L [ns]
[nj]2
ρr τ β j L X 1 + s=1 ρr τ βj
[kv] 2 .
X
· PL [ks]
· |φj | . [nj]
ρr τ β j
1+ s=1 ρr τ βj v=1
v=1
Let now M → ∞. In this case we get
Finally, the power of additive noise w[kl] is Var[T5 ] = [kl] PL [ks]
[kl] β /(1 + ρr τ β l )
Var[w[kl] ] = 1. lim SINRD,N O LSF P = PL l [kl]2
s=1
PL [ks]
.
M →∞
Now, after some computations, we obtain from (35) the v6=l βv /(1 + s=1 ρr τ β v )
following theorem. Thus we again obtained (9) and again we see that for very large
Theorem 7: If the conjugate beamforming precoding is number of base station antennas the interference is completely
used in Step 6 of LSFP then the downlink transmission rate defined by the pilot contamination effect.
R[kl] is lower bounded by Let us consider ZF-LSFP. Let
[kj]
[kl] [kl] [k] βj
RD ≥ log2 (1 + SINRD ), k = 1, K, l = 1, L, µj = PL [ks]
1+ s=1 ρr τ β j
where and define L × L matrix
[kl] [kj]
2 
[k1] [k] [k1] [k]

PL βj βj [kl] β1 µ1 ... βL µL
ρf M 2 ρ2r τ 2 j=1 [ks] φj
. ..
PL
1+ s=1 ρr τ βj
B[k] = 
 
 ..
[kl] . (42)
SINRD = , (40) .
M 2 I1 + M I2 + 1

[kL] [k] [kL] [k]
β1 µ1 ... βL µL
and
Let further
√ −1

L X L [kl] [kj]
2 Φ[k] = ρA B[k] , k = 1, K,
β β

j j [kv]
X
I1 = ρf ρ2r τ 2

PL φ
[ks] j where ρA is a normalization factor to insure the constraint
v=1 j=1 1 + s=1 ρr τ βj

v6=l
(12). With such Φ[k] the numerator of (40) and I1 are
and ρf M 2 ρ2r ρA τ 2 and I1 = 0.
L X
K [nj]2 l
In the asymptotic regime, as M → ∞ we get
βj [kl] [nv]
ρf M 2 ρ2r τ 2
X X
I2 = ρf ρr τ β j |φj |2 . [kl]
1+
Pl [ns] lim SINRD,ZF −LSF P = lim = ∞.
j=1 n=1 s=1 ρr τ βj v=1 M →∞ M →∞ M I2
So, similar to Theorem 4, we obtained that in the asymptotic
It is instructive to apply Theorem 7 for the cases when LSFP
regime ZF-LSFP allows achieving infinite SINRs for all users.
is not used and when ZF-LSFP is used.
Remark 3: The matrix (42) has a slightly different form
compared with (14) because we assumed that the user knows
only the expected gain of the effective channel, that is the
B. No LSFP and ZF-LSFP [kl]
quantity l . In contrast, in Section III it is implicitly assumed
If we do not use LSFP then the matrices Φ[k] are diagonal. that the user knows the actual value of the effective channels
[kl] [kl]
Hence, taking into account (30) and the power constraint ĝj gj .
[kj] [k] 2 [kj] [kj] [kj] 2
E[||ĝj cj || ] = E[||ĝj φj s || ] = 1, C. Optimization Problem
[kl]
We can slightly simplify notation if we replace φj with
we conclude that
√ [kj]
PL [ns] 1/2 [kl] ρr τ β j [kl]
[kl] (1 + s=1 ρr τ β j ) αj = φ .
φj = δjl , (41) PL [ks] j
√ [kj] 1 + s=1 ρr τ βj
M ρr τ β j
This replacement does not cause any problems since all the
where δjl is Kronecker’s delta. The numerator of (40) will quantities used in the above expression are assumed being
have the form known at the j-th base station. After this replacement we get
P 2
[kl]2 L [kl] [kl]
βl M ρf ρr τ j=1 βj αj
ρf M ρ r τ PL [ks]
[kl]
SINRD = , (43)
1+ s=1 ρr τ β l M J1 + J2 + 1/M
where
$$J_1 = \rho_f \rho_r \tau \sum_{\substack{v=1 \\ v \ne l}}^{L} \left| \sum_{j=1}^{L} \beta_j^{[kl]} \alpha_j^{[kv]} \right|^2,$$
and
$$J_2 = \rho_f \sum_{j=1}^{L} \sum_{n=1}^{K} \beta_j^{[kl]} \Big( 1 + \rho_r \tau \sum_{s=1}^{L} \beta_j^{[ns]} \Big) \sum_{v=1}^{L} |\alpha_j^{[nv]}|^2.$$
With this notation the average power of the $j$-th base station is the following function of the large-scale fading coefficients and the LSFP coefficients:
$$\gamma_j = \sum_{k=1}^{K} E[\|\hat g_j^{[kj]} c_j^{[k]}\|^2] = M \sum_{k=1}^{K} \Big( 1 + \rho_r \tau \sum_{s=1}^{L} \beta_j^{[ks]} \Big) \Big( \sum_{v=1}^{L} |\alpha_j^{[kv]}|^2 \Big). \tag{44}$$
Using (44) one may formulate different optimization problems with base station power constraints. In particular, in Part II of this paper [2] we consider the following problem
$$\max_{\alpha_j^{[nr]},\; n = 1, \ldots, K,\; j, r = 1, \ldots, L} \;\; \min_{k = 1, \ldots, K,\; l = 1, \ldots, L} \mathrm{SINR}_D^{[kl]}$$
subject to the constraints
$$\gamma_j \le 1, \quad j = 1, \ldots, L.$$

D. First Simulation Results
Since ZF-LSFP allows achieving infinite SINRs when $M \to \infty$, it is natural to ask what it gives us when $M$ is finite. In order to answer this question we generate random large-scale fading coefficients and use Theorem 7 for finding the corresponding downlink rates $R_D^{[kl]}$ when LSFP is not used (No LSFP) and when ZF-LSFP is used. In Fig. 4 and Fig. 5, we present simulation results for the CDFs of $R_D^{[kl]}$, $k = 1, \ldots, K$, $l = 1, \ldots, L$, and the CDF of the minimum rate $\min_{k,l} R_D^{[kl]}$, respectively. We plot achievable rates and CDF on the horizontal and vertical axes, respectively. In these simulations, we assumed $K = 10$. It can be observed from both figures that by increasing the number of antennas we significantly improve the performance of ZF-LSFP. In Fig. 4, ZF-LSFP achieves a 5%-outage rate around $10^{-6}$ bits per channel use at $M = 100$; the achievable rate is improved to around $10^{-2}$ bits per channel use with $M = 10^6$ antennas. On the other hand, the achievable rate of No LSFP saturates when $M$ becomes very large.
According to Fig. 4, when we consider the rates of all users, the 5%-outage rate of No LSFP is larger than the 5%-outage rate of ZF-LSFP at $M = 100$ and $M = 10^4$, and only at $M = 10^5$ does ZF-LSFP start outperforming No LSFP. In the case of the CDFs of minimum rates $\min_{k,l} R_D^{[kl]}$ shown in Fig. 5, ZF-LSFP has better performance than No LSFP already at $M = 10^4$, which is still a very large number of antennas.
Thus we conclude that in a regime with a practical number of antennas, e.g. $M = 100$ and smaller, ZF-LSFP performs worse than the No LSFP approach. It is natural to ask whether an LSFP different from ZF-LSFP can give any gain in such a scenario, or whether LSFP is only a theoretical tool for investigation of the asymptotic case $M \to \infty$. We answer this question in Part II of the paper [2]. We will show that a properly designed LSFP gives a very significant gain over other methods of interference reduction.

[Figure 4: CDF (vertical axis, 0 to 1) versus $R_D^{[kl]}$ in bits/channel use (horizontal axis, logarithmic scale). Curves: 1. ZF-LSFP, $M = 100$; 2. No LSFP, $M = 100$; 3. ZF-LSFP, $M = 10^4$; 4. No LSFP, $M = 10^4$; 5. ZF-LSFP, $M = 10^6$; 6. No LSFP, $M = 10^6$.]
Fig. 4. The CDF of the achievable rate $R_D^{[kl]} = \log(1 + \mathrm{SINR}^{[kl]})$ for two existing methods with different number of antennas.

[Figure 5: CDF (vertical axis, 0 to 1) versus $\min_{k,l} R_D^{[kl]}$ in bits/channel use (horizontal axis, logarithmic scale). Curves: 1. ZF-LSFP, $M = 100$; 2. No LSFP, $M = 100$; 3. ZF-LSFP, $M = 10^4$; 4. No LSFP, $M = 10^4$; 5. ZF-LSFP, $M = 10^6$; 6. No LSFP, $M = 10^6$.]
Fig. 5. The CDF of the achievable rate $R_D^{[kl]} = \log(1 + \mathrm{SINR}^{[kl]})$ for two existing methods with different number of antennas.

E. Performance of Large-Scale Fading Decoding with Finite M
Denote the $(l, v)$-th entry of the matrix $\Omega^{[k]}$ by $\omega_l^{[kv]}$. Similar to the downlink case we obtain the following theorem.
Theorem 8:
$$R_U^{[kl]} \ge \log_2(1 + \mathrm{SINR}_U^{[kl]}), \quad k = 1, \ldots, K, \; l = 1, \ldots, L,$$
where
$$\mathrm{SINR}_U^{[kl]} = \frac{M^2 \rho_r^3 \tau^2 \left| \sum_{v=1}^{L} \frac{\beta_v^{[kv]} \beta_v^{[kl]}}{1 + \sum_{s=1}^{L} \rho_r \tau \beta_v^{[ks]}}\, \omega_l^{[kv]} \right|^2}{M^2 I_1 + M I_2}, \tag{45}$$
and It is easy to see that

L
L X
2 a.s.
[kv] [kj]
X βv βv
[kv] lim xM = x,
I1 =ρ3r τ 2 ω , M →∞

PL [ks] l
j=1

v=1 1+ ρr τ β v
s=1

where x is a random variable defined by
j6=l
1
 
L 2 L X K
[kv]
X [kv] 2 ρr τ β v X Pr(x = −1) = Pr(x = 1) = .
I2 = |ωl | PL [ks]
1 + ρr βv[nj]  . 2
v=1 1 + s=1 ρr τ βv j=1 n=1 At the same time, in the case of the binary entropy, we have
Before presenting a formal proof of this theorem we would H2 (x) = −
X
Pr(x) log2 Pr(x) = 1,
like to note that the important distinction of this result from
[kv] [kl] x∈{−1,1}
Theorem 40. The coefficients ωl appear only in SINRU
[nj]
and not in any other SINRU . Thus optimal LSFP coefficients while H2 (xM ) = 2 for any M and therefore
[kv]
ωl , v = 1, L, can be chosen independently for each user, lim H2 (xM ) = 2.
which significantly simplify search of optimal coefficients. M →∞
We also would like to note that we can slightly simplify Thus we have
[kl] [kv]
expression for SINRU if replace ωl with
H2 ( lim xM ) 6= lim H2 (xM ).
√ [kv] M →∞ M →∞
[kv]∗ ρr τ M β v [kv]
ωl = PL ω .
[ks] l
This example shows that the statement of Lemma 5 is not
1 + s=1 ρr τ βv trivial and indeed needs a proof.
In this case we get Let us define for some positive φ and β the events AM ,
2 WM by aM ∈ [1 − φ, 1 + φ] and wM ∈ [−β, β] respectively.
[kl] [kv]∗
P
L
M ρ2r τ v=1 βv ωl We would like to remind that
[kl]

SINRU = , (46) a.s. a.s.
M J1 + J 2 lim aM = 1 and lim wM = 0
M →∞ M →∞
where 2

L X L imply that
∗

[kj] [kv]
X
J1 = ρ2r τ βv ωl

lim Pr(AcM ) = 0, and (47)

j=1 v=1 M →∞
j6=l
c
lim Pr(WM ) = 0. (48)
and M →∞
L L L X
K We also have
[kv]∗ 2
X X X
J2 = |ωl | (1 + ρr τ βv[ks] )(1 + ρr βv[nj] ). \
v=1 s=1 j=1 n=1
lim Pr(AM WM ) = 1. (49)
M →∞
Finally, it is instructive to consider the expression (46) for Indeed
the cases of NO LSFP and ZF-LSFP. If we do not use LSFP \ [
[kv] [kv]
then ωl = δlv . Substituting such ωl = δlv into (46), in the Pr(AM WM ) = Pr((AcM WM c c
) )
regime M → ∞ we get for the expression (11) from Section
[
[kv]
=1 − Pr(AcM WM c
) ≥ 1 − Pr(AcM ) − Pr(WM
c
) = 1.
III-A. In the case of ZF-LSFP we chose ωl according to
(25). Using these coefficients in (46) we get the result of In a similar way one can show that
Theorem 6 from Section III. \
c
lim Pr(AM WM ) = 0. (50)
The proof is similar to the proof of Theorem 7. However, as M →∞
[kv] [kl]
we noted above, the LSFP coefficients ωl appear in SINRU Proof: (Lemma 5) Below we show that for any given ∆,
in a different way than in Theorem 40. So it is important to by choosing sufficiently large M , we can make I(sM ; s) being
carefully track all indices involved into the computations. The arbitrary close to H(s).
proof is in Appendix B. To make notation short we assume that N is even and the
signals R are evenly spaced on the real line that is
V. A PPENDIX A
R = {−N/2 · ∆, −(N/2 − 1)∆, . . . , (N/2 − 1)∆, N/2 · ∆}.
Let R = {r1 , . . . , rN } be a constellation of signals such
that mini,j dist(ri , rj ) ≥ ∆ for some positive real ∆ and N Generalizations for unevenly spaced signals and for complex
is an arbitrary integer. signals are straightforward.
Before presenting a proof of Lemma 5, we would like to For a given sM , let qM ∈ R be such that
note that when we deal with mutual information or entropy
dist(qM , sM ) ≤ dist(s0 , sM ) for any s0 ∈ R \ qM .
functions we not always can replace a random variable, say
xM , with its limit value. For example, let xM be defined by In other words qM is obtained by demodulation of sM with
respect to R. According to the data processing inequality we
Pr(xM = −1 − 1/M ) = Pr(xM = −1 + 1/M )
have
= Pr(xM = 1 − 1/M ) = Pr(xM = 1 + 1/M ) = 1/4. I(qM , s) ≤ I(sM , s) ≤ H(s). (51)
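The counterexample with $x_M$ (entropy does not commute with the limit) is easy to verify numerically. The sketch below is a minimal check; the helper names are hypothetical, and probabilities of coinciding support points are accumulated so that the degenerate case $M = 1$ is also handled correctly.

```python
import math
from collections import defaultdict

def entropy_bits(pmf):
    """Shannon entropy in bits of a PMF given as {value: probability}."""
    return -sum(p * math.log2(p) for p in pmf.values() if p > 0.0)

def pmf_x_m(m):
    """PMF of x_M: mass 1/4 on each of -1 - 1/M, -1 + 1/M, 1 - 1/M, 1 + 1/M."""
    pmf = defaultdict(float)
    for value in (-1 - 1 / m, -1 + 1 / m, 1 - 1 / m, 1 + 1 / m):
        pmf[value] += 0.25  # accumulate in case two support points coincide
    return dict(pmf)

# PMF of the almost sure limit x: mass 1/2 on each of -1 and +1.
pmf_limit = {-1.0: 0.5, 1.0: 0.5}

for m in (2, 10, 1000):
    print(m, entropy_bits(pmf_x_m(m)))
print("limit", entropy_bits(pmf_limit))
```

For every $M \ge 2$ the four support points are distinct, so the entropy of $x_M$ stays at 2 bits although the support collapses onto $\{-1, 1\}$, whose entropy is 1 bit.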
Next,

$$I(q_M; s) = H(s) - H(s \mid q_M).$$

Denote by $P_Q(\cdot)$ the PMF of $q_M$. Then

$$\begin{aligned}
H(s \mid q_M) &= -\sum_{q_M \in R} P_Q(q_M) \sum_{s \in R} P_{S|Q}(s \mid q_M) \log P_{S|Q}(s \mid q_M) \\
&= -\sum_{q_M \in R} P_Q(q_M) \Big[ P_{S|Q}(s = q_M \mid q_M) \log P_{S|Q}(s = q_M \mid q_M) \\
&\qquad\qquad + \sum_{s' \in R \setminus q_M} P_{S|Q}(s' \mid q_M) \log P_{S|Q}(s' \mid q_M) \Big].
\end{aligned}$$

The conditional probability $P_{S|Q}(s = q_M \mid q_M)$ can be written as

$$P_{S|Q}(s = q_M \mid q_M) = \frac{P_S(q_M) P_{Q|S}(q_M \mid s = q_M)}{P_S(q_M) P_{Q|S}(q_M \mid s = q_M) + \sum_{s' \in R \setminus q_M} P_S(s') P_{Q|S}(q_M \mid s')}.$$

First, we estimate $P_S(s') P_{Q|S}(q_M \mid s')$, $s' \neq q_M$. From the definition of $q_M$ it follows that

$$P_{Q|S}(q_M \mid s') = \Pr\big(a_M s' + w_M \in (q_M - \Delta/2,\, q_M + \Delta/2)\big).$$

Let us assume that $q_M = (n-1)\Delta$ and $s' = n\Delta$ for some positive integer $n$. If $a_M \ge 1 - \frac{1}{4n}$, then in order to have

$$a_M s' + w_M \in [q_M - \Delta/2,\, q_M + \Delta/2] = [(n-1)\Delta - \Delta/2,\, (n-1)\Delta + \Delta/2],$$

we need that $w_M < -\Delta/4$. Similarly, if $a_M \le 1 + \frac{1}{4n}$ we have to have $w_M > -7\Delta/4$. Hence, if $a_M \in [1 - \frac{1}{4n},\, 1 + \frac{1}{4n}]$ then $w_M$ cannot be outside of the interval $(-7\Delta/4, -\Delta/4)$. Thus

$$\begin{aligned}
P_{Q|S}(q_M = (n-1)\Delta \mid s' = n\Delta) &\le \Pr\big(a_M \in [1 - \tfrac{1}{4n},\, 1 + \tfrac{1}{4n}] \text{ and } w_M \in (-7\Delta/4, -\Delta/4)\big) \\
&\quad + \Pr\big(a_M \notin [1 - \tfrac{1}{4n},\, 1 + \tfrac{1}{4n}]\big).
\end{aligned}$$

Applying now (50) and (47) to the above terms, we get

$$\lim_{M \to \infty} P_{Q|S}(q_M = (n-1)\Delta \mid s' = n\Delta) = 0.$$

For any other $s' \in R \setminus q_M$ we can use similar arguments that lead to

$$\lim_{M \to \infty} P_{Q|S}(q_M \mid s') = 0.$$

Hence, taking into account that the number of terms in the sum is countable, we get

$$\lim_{M \to \infty} \sum_{s' \in R \setminus q_M} P_S(s') P_{Q|S}(q_M \mid s') = 0. \tag{52}$$

Using the same type of arguments we can show that

$$\begin{aligned}
P_{Q|S}(q_M \mid s = q_M) &= \Pr\big(a_M q_M + w_M \in (q_M - \Delta/2,\, q_M + \Delta/2)\big) \\
&\ge \Pr\big(a_M \in [1 - \tfrac{1}{4n},\, 1 + \tfrac{1}{4n}] \text{ and } w_M \in [-\Delta/4, \Delta/4]\big).
\end{aligned}$$

Applying (49) to this lower bound we conclude that it converges to 1. Thus

$$\lim_{M \to \infty} P_{Q|S}(q_M \mid s = q_M) = 1. \tag{53}$$

From (53) and (52) we get

$$\lim_{M \to \infty} P_{S|Q}(s = q_M \mid q_M) = 1,$$

and further

$$\lim_{M \to \infty} P_{S|Q}(s = q_M \mid q_M) \log P_{S|Q}(s = q_M \mid q_M) = 0. \tag{54}$$

For $s' \neq q_M$ we have

$$P_{S|Q}(s' \mid q_M) = \frac{P_S(s') P_{Q|S}(q_M \mid s')}{P_S(q_M) P_{Q|S}(q_M \mid s = q_M) + \sum_{s'' \in R \setminus q_M} P_S(s'') P_{Q|S}(q_M \mid s'')}.$$

Using (52) and (53) we obtain

$$\lim_{M \to \infty} P_{S|Q}(s' \mid q_M) = 0,$$

and further

$$\lim_{M \to \infty} P_{S|Q}(s' \mid q_M) \log P_{S|Q}(s' \mid q_M) = 0. \tag{55}$$

From (54) and (55) it follows that

$$\lim_{M \to \infty} H(s \mid q_M) = 0 \quad \text{and} \quad \lim_{M \to \infty} I(q_M; s) = H(s),$$

which, together with (51), finishes the proof.

VI. APPENDIX B

Proof of Theorem 8: The $l$-th base station receives the signal

$$y_l = \sqrt{\rho_r} \sum_{j=1}^{L} \sum_{n=1}^{K} g_l^{[nj]} x^{[nj]} + w_l.$$

After applying the matched filter $\hat{g}_l^{[kl]}$ it gets

$$\tilde{x}^{[kl]} = \hat{g}_l^{[kl]\dagger} y_l = \sqrt{\rho_r} \sum_{j=1}^{L} \sum_{n=1}^{K} \hat{g}_l^{[kl]\dagger} g_l^{[nj]} x^{[nj]} + \hat{g}_l^{[kl]\dagger} w_l.$$
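The received-signal model and the matched filter above can be sketched in a few lines of Python. This is a hedged illustration, not the authors' simulation setup: the i.i.d. $\mathcal{CN}(0, \beta)$ channel entries with a single common $\beta$, the four-point unit-energy symbol alphabet, the function names, and in particular the use of the true channel in place of the estimate $\hat{g}_l^{[kl]}$ are all simplifying assumptions. Under these assumptions the normalized matched-filter output concentrates around the transmitted symbol as $M$ grows; with pilot-contaminated estimates, as analyzed in the paper, this would not hold.

```python
import math
import random


def crandn(rng, var=1.0):
    """One sample of a circularly symmetric complex Gaussian CN(0, var)."""
    s = math.sqrt(var / 2.0)
    return complex(rng.gauss(0.0, s), rng.gauss(0.0, s))


def received_signal(g_l, x, rho_r, rng):
    """y_l = sqrt(rho_r) * sum over (n, j) of g_l[(n, j)] * x[(n, j)] + w_l."""
    M = len(next(iter(g_l.values())))
    amp = math.sqrt(rho_r)
    y = [crandn(rng) for _ in range(M)]  # additive noise w_l ~ CN(0, I_M)
    for key, g in g_l.items():
        for m in range(M):
            y[m] += amp * g[m] * x[key]
    return y


def matched_filter(g_hat, y):
    """x_tilde = g_hat^dagger y (the filter vector is conjugated)."""
    return sum(gh.conjugate() * ym for gh, ym in zip(g_hat, y))


def demo(M, L=2, K=3, rho_r=1.0, beta=1.0, seed=1):
    """Return (normalized matched-filter output, transmitted symbol) for user (0, 0)."""
    rng = random.Random(seed)
    keys = [(n, j) for j in range(L) for n in range(K)]
    # Assumption: i.i.d. CN(0, beta) channel entries with one common beta.
    g_l = {key: [crandn(rng, beta) for _ in range(M)] for key in keys}
    x = {key: rng.choice([1, -1, 1j, -1j]) for key in keys}  # unit-energy symbols
    y = received_signal(g_l, x, rho_r, rng)
    # Assumption: perfect channel knowledge stands in for the estimate.
    x_tilde = matched_filter(g_l[(0, 0)], y)
    return x_tilde / (M * beta * math.sqrt(rho_r)), x[(0, 0)]
```

For example, `demo(4000)` returns a normalized output close to the transmitted symbol, while small values of `M` leave visible residual interference from the other users.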
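Here $w_l \sim \mathcal{CN}(0, I_M)$. The network controller collects these estimates and applies pilot contamination decoding:

$$\begin{aligned}
\hat{x}^{[kl]} &= \sum_{v=1}^{L} \omega_l^{[kv]} \tilde{x}^{[kv]} \\
&= \sqrt{\rho_r} \sum_{v=1}^{L} \sum_{j=1}^{L} \sum_{n=1}^{K} \omega_l^{[kv]} \hat{g}_v^{[kv]\dagger} g_v^{[nj]} x^{[nj]} + \sum_{v=1}^{L} \omega_l^{[kv]} \hat{g}_v^{[kv]\dagger} w_v \\
&= \sqrt{\rho_r} \sum_{v=1}^{L} \sum_{j=1}^{L} \sum_{n=1}^{K} \omega_l^{[kv]} \hat{g}_v^{[kv]\dagger} \hat{g}_v^{[nj]} x^{[nj]} + \sqrt{\rho_r} \sum_{v=1}^{L} \sum_{j=1}^{L} \sum_{n=1}^{K} \omega_l^{[kv]} \hat{g}_v^{[kv]\dagger} \tilde{g}_v^{[nj]} x^{[nj]} + \sum_{v=1}^{L} \omega_l^{[kv]} \hat{g}_v^{[kv]\dagger} w_v \\
&= s^{[kl]} \sqrt{\rho_r} \sum_{v=1}^{L} \omega_l^{[kv]} \hat{g}_v^{[kv]\dagger} \hat{g}_v^{[kl]} + \sqrt{\rho_r} \sum_{v=1}^{L} \sum_{\substack{j=1 \\ j \neq l}}^{L} \omega_l^{[kv]} \hat{g}_v^{[kv]\dagger} \hat{g}_v^{[kj]} s^{[kj]} \\
&\quad + \sqrt{\rho_r} \sum_{v=1}^{L} \sum_{j=1}^{L} \sum_{\substack{n=1 \\ n \neq k}}^{K} \omega_l^{[kv]} \hat{g}_v^{[kv]\dagger} \hat{g}_v^{[nj]} s^{[nj]} + \sqrt{\rho_r} \sum_{v=1}^{L} \sum_{j=1}^{L} \sum_{n=1}^{K} \omega_l^{[kv]} \hat{g}_v^{[kv]\dagger} \tilde{g}_v^{[nj]} x^{[nj]} + \sum_{v=1}^{L} \omega_l^{[kv]} \hat{g}_v^{[kv]\dagger} w_v \\
&= s^{[kl]} \sqrt{\rho_r} \sum_{v=1}^{L} \omega_l^{[kv]} E[\hat{g}_v^{[kv]\dagger} \hat{g}_v^{[kl]}] + s^{[kl]} \sqrt{\rho_r} \sum_{v=1}^{L} \omega_l^{[kv]} \big( \hat{g}_v^{[kv]\dagger} \hat{g}_v^{[kl]} - E[\hat{g}_v^{[kv]\dagger} \hat{g}_v^{[kl]}] \big) \\
&\quad + \sqrt{\rho_r} \sum_{v=1}^{L} \sum_{\substack{j=1 \\ j \neq l}}^{L} \omega_l^{[kv]} \hat{g}_v^{[kv]\dagger} \hat{g}_v^{[kj]} s^{[kj]} + \sqrt{\rho_r} \sum_{v=1}^{L} \sum_{j=1}^{L} \sum_{\substack{n=1 \\ n \neq k}}^{K} \omega_l^{[kv]} \hat{g}_v^{[kv]\dagger} \hat{g}_v^{[nj]} s^{[nj]} \\
&\quad + \sqrt{\rho_r} \sum_{v=1}^{L} \sum_{j=1}^{L} \sum_{n=1}^{K} \omega_l^{[kv]} \hat{g}_v^{[kv]\dagger} \tilde{g}_v^{[nj]} x^{[nj]} + \sum_{v=1}^{L} \omega_l^{[kv]} \hat{g}_v^{[kv]\dagger} w_v.
\end{aligned} \tag{56}$$

Denote the terms of this expression by $Q_0, \ldots, Q_5$. Similar to the downlink case, it is not difficult to prove that these terms are mutually uncorrelated. Hence we can rewrite (56) in the form:

$$\hat{x}^{[kl]} = s^{[kl]} \sqrt{\rho_r} \sum_{v=1}^{L} \omega_l^{[kv]} E[\hat{g}_v^{[kv]\dagger} \hat{g}_v^{[kl]}] + w_{\mathrm{eff}}^{[kl]},$$

and

$$\mathrm{Var}[w_{\mathrm{eff}}^{[kl]}] = \mathrm{Var}[Q_1] + \mathrm{Var}[Q_2] + \mathrm{Var}[Q_3] + \mathrm{Var}[Q_4] + \mathrm{Var}[Q_5].$$

Again using [23, Theorem 1] we obtain the following lower bound on the uplink rate $R_U^{[kl]}$:

$$\begin{aligned}
R_U^{[kl]} &= I\Big( \hat{x}^{[kl]}; y_l \,\Big|\, \sqrt{\rho_r} \sum_{v=1}^{L} \omega_l^{[kv]} E[\hat{g}_v^{[kv]\dagger} \hat{g}_v^{[kl]}] \Big) \\
&\ge \log_2 \left( 1 + \frac{\rho_r \Big| \sum_{v=1}^{L} \omega_l^{[kv]} E[\hat{g}_v^{[kv]\dagger} \hat{g}_v^{[kl]}] \Big|^2}{\mathrm{Var}[Q_1] + \mathrm{Var}[Q_2] + \mathrm{Var}[Q_3] + \mathrm{Var}[Q_4] + \mathrm{Var}[Q_5]} \right).
\end{aligned}$$

Below we find the variances of $Q_1, \ldots, Q_5$.

The term $Q_1$ is caused by the uncertainty of the $l$-th base station about the effective channel $\hat{g}_l^{[kl]\dagger} \hat{g}_l^{[kl]}$. Since $\hat{g}_v^{[ks]}$ and $\hat{g}_r^{[ni]}$ are uncorrelated for any $k, s, n, i$, using (37), we get

$$\begin{aligned}
\mathrm{Var}[Q_1] &= \rho_r E[|s^{[kl]}|^2] \sum_{v=1}^{L} |\omega_l^{[kv]}|^2 \, \mathrm{Var}[\hat{g}_v^{[kv]\dagger} \hat{g}_v^{[kl]}] \\
&= \rho_r M \sum_{v=1}^{L} |\omega_l^{[kv]}|^2 \cdot \frac{\rho_r \tau (\beta_v^{[kv]})^2}{1 + \sum_{s=1}^{L} \rho_r \tau \beta_v^{[ks]}} \cdot \frac{\rho_r \tau (\beta_v^{[kl]})^2}{1 + \sum_{s=1}^{L} \rho_r \tau \beta_v^{[ks]}}.
\end{aligned}$$

Term $Q_2$ is caused by the pilot contamination. In order to find $\mathrm{Var}[Q_2]$ we note that $s^{[nj]}$ and $s^{[mv]}$ are independent if $(n, j) \neq (m, v)$ and that $\hat{g}_v^{[nj]}$ and $\hat{g}_v^{[mv]}$ are uncorrelated if $n \neq m$. Now, using (30) and (37), we obtain

$$\begin{aligned}
\mathrm{Var}[Q_2] &= \rho_r \sum_{v=1}^{L} \sum_{\substack{j=1 \\ j \neq l}}^{L} \sum_{\substack{r=1 \\ r \neq v}}^{L} \omega_l^{[kv]} \omega_l^{[kr]\dagger} E[\hat{g}_v^{[kv]\dagger} \hat{g}_v^{[kj]}] \, E[\hat{g}_r^{[kj]\dagger} \hat{g}_r^{[kr]}] \\
&\quad + \rho_r \sum_{v=1}^{L} \sum_{\substack{j=1 \\ j \neq l}}^{L} |\omega_l^{[kv]}|^2 E[\hat{g}_v^{[kv]\dagger} \hat{g}_v^{[kj]} \hat{g}_v^{[kj]\dagger} \hat{g}_v^{[kv]}] \\
&= \rho_r \sum_{\substack{j=1 \\ j \neq l}}^{L} \left| \sum_{v=1}^{L} \omega_l^{[kv]} \frac{M \rho_r \tau \beta_v^{[kv]} \beta_v^{[kj]}}{1 + \sum_{s=1}^{L} \rho_r \tau \beta_v^{[ks]}} \right|^2 \\
&\quad + \rho_r M \sum_{\substack{j=1 \\ j \neq l}}^{L} \sum_{v=1}^{L} \frac{\rho_r \tau (\beta_v^{[kj]})^2}{1 + \sum_{s=1}^{L} \rho_r \tau \beta_v^{[ks]}} \cdot \frac{\rho_r \tau (\beta_v^{[kv]})^2}{1 + \sum_{s=1}^{L} \rho_r \tau \beta_v^{[ks]}} \cdot |\omega_l^{[kv]}|^2.
\end{aligned}$$

Consider term $Q_3$. In the asymptotic regime, as $M$ tends to infinity, the normalized inner product of $\hat{g}_v^{[kv]}$ and $\hat{g}_v^{[nj]}$ almost surely tends to zero, since for $n \neq k$ these vectors are independent. For finite $M$, however, we cannot ignore the interference caused by $Q_3$. To compute $\mathrm{Var}[Q_3]$ we use the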
same fact as we used for $Q_2$ and, using (30), obtain

$$\begin{aligned}
\mathrm{Var}[Q_3] &= \rho_r \sum_{v=1}^{L} \sum_{j=1}^{L} \sum_{\substack{n=1 \\ n \neq k}}^{K} |\omega_l^{[kv]}|^2 \, \mathrm{Tr}\big( E[\hat{g}_v^{[kv]} \hat{g}_v^{[kv]\dagger}] \, E[\hat{g}_v^{[nj]} \hat{g}_v^{[nj]\dagger}] \big) \cdot E[|s^{[nj]}|^2] \\
&= \rho_r M \sum_{v=1}^{L} \sum_{j=1}^{L} \sum_{\substack{n=1 \\ n \neq k}}^{K} |\omega_l^{[kv]}|^2 \frac{\rho_r \tau (\beta_v^{[kv]})^2}{1 + \sum_{s=1}^{L} \rho_r \tau \beta_v^{[ks]}} \cdot \frac{\rho_r \tau (\beta_v^{[nj]})^2}{1 + \sum_{s=1}^{L} \rho_r \tau \beta_v^{[ns]}}.
\end{aligned}$$

Term $Q_4$ is caused by the estimation error $\tilde{g}_v^{[nj]}$. Taking into account that in the case of MMSE estimation the estimate $\hat{g}_v^{[nj]}$ and the estimation error $\tilde{g}_v^{[nj]}$ are uncorrelated, and using (30) and (29), we obtain

$$\begin{aligned}
\mathrm{Var}[Q_4] &= \rho_r \sum_{v=1}^{L} \sum_{j=1}^{L} \sum_{n=1}^{K} |\omega_l^{[kv]}|^2 \, \mathrm{Tr}\big( E[\hat{g}_v^{[kv]} \hat{g}_v^{[kv]\dagger}] \, E[\tilde{g}_v^{[nj]} \tilde{g}_v^{[nj]\dagger}] \big) \cdot E[|s^{[nj]}|^2] \\
&= \rho_r M \sum_{v=1}^{L} \sum_{j=1}^{L} \sum_{n=1}^{K} |\omega_l^{[kv]}|^2 \frac{\rho_r \tau (\beta_v^{[kv]})^2}{1 + \sum_{s=1}^{L} \rho_r \tau \beta_v^{[ks]}} \cdot \left( \beta_v^{[nj]} - \frac{\rho_r \tau (\beta_v^{[nj]})^2}{1 + \sum_{s=1}^{L} \rho_r \tau \beta_v^{[ns]}} \right).
\end{aligned}$$

Finally, term $Q_5$ is caused by the additive noise at the receivers of base stations. Using the same arguments as above, we obtain

$$\begin{aligned}
\mathrm{Var}[Q_5] &= \sum_{v=1}^{L} |\omega_l^{[kv]}|^2 \, \mathrm{Tr}\big( E[\hat{g}_v^{[kv]} \hat{g}_v^{[kv]\dagger}] \, E[w_v w_v^{\dagger}] \big) \\
&= M \sum_{v=1}^{L} |\omega_l^{[kv]}|^2 \frac{\rho_r \tau (\beta_v^{[kv]})^2}{1 + \sum_{s=1}^{L} \rho_r \tau \beta_v^{[ks]}}.
\end{aligned}$$

Combining the obtained results finishes the proof.

Acknowledgement: The authors would like to thank Carl Nuzman for his help with the proof of Lemma 5.

REFERENCES

[1] T. L. Marzetta, "Multi-cellular wireless with base stations employing unlimited numbers of antennas," in Proc. UCSD Inf. Theory Applicat. Workshop, Feb. 2010.
[2] L. Li, A. Ashikhmin, and T. Marzetta, "Interference Reduction in Massive MIMO Systems II: Downlink Analysis for a Finite Number of Antennas," submitted to IEEE Trans. on Information Theory.
[3] T. L. Marzetta, "Noncooperative Cellular Wireless with Unlimited Numbers of Base Station Antennas," IEEE Trans. on Wireless Communications, 9, pp. 3590–3600, 2010.
[4] H. Q. Ngo, E. G. Larsson, and T. L. Marzetta, "Energy and spectral efficiency of very large multiuser MIMO systems," IEEE Trans. on Communications, 61, pp. 1436–1449, 2013.
[5] G. Y. Li, Z.-K. Xu, C. Xiong, C.-Y. Yang, S.-Q. Zhang, Y. Chen, and S.-G. Xu, "Energy-efficient wireless communications: Tutorial, survey, and open issues," IEEE Wireless Commun. Mag., 18, pp. 28–35, 2011.
[6] C. Xiong, G. Y. Li, S. Zhang, Y. Chen, and S. Xu, "Energy- and spectral-efficiency tradeoff in downlink OFDMA networks," IEEE Trans. Wireless Commun., 10, pp. 3874–3886, 2011.
[7] F. Rusek, D. Persson, B. K. Lau, E. G. Larsson, T. L. Marzetta, O. Edfors, and F. Tufvesson, "Scaling up MIMO: Opportunities and challenges with very large arrays," IEEE Signal Process. Mag., 30, pp. 40–46, 2013.
[8] E. G. Larsson, F. Tufvesson, O. Edfors, and T. L. Marzetta, "Massive MIMO for next generation wireless systems," IEEE Commun. Mag., 52, pp. 186–195, 2014.
[9] L. Lu, G. Y. Li, A. L. Swindlehurst, A. Ashikhmin, and R. Zhang, "An Overview of Massive MIMO: Benefits and Challenges," IEEE Journal of Selected Topics in Signal Processing, 8, pp. 742–758, 2014.
[10] F. Fernandes, A. Ashikhmin, and T. L. Marzetta, "Interference Reduction on Cellular Networks with Large Antenna Arrays," IEEE International Conference on Communications (ICC), Ottawa, Canada, 2012.
[11] F. Fernandes, A. Ashikhmin, and T. L. Marzetta, "Inter-Cell Interference in Noncooperative TDD Large Scale Antenna Systems," IEEE Journal on Selected Areas in Communications, 31, pp. 192–201, 2013.
[12] K. Appaiah, A. Ashikhmin, and T. L. Marzetta, "Pilot Contamination Reduction in Multi-User TDD Systems," International Conference on Communications, pp. 1–5, 2010.
[13] J. Hoydis, S. ten Brink, and M. Debbah, "Massive-MIMO: How Many Antennas do We Need," arXiv:1107.1709v2.
[14] H. Huh, G. Caire, H. C. Papadopoulos, and S. A. Ramprashad, "Achieving 'Massive-MIMO' Spectral Efficiency with a Not-so-Large Number of Antennas," arXiv:1107.3862v2.
[15] A. Ashikhmin and T. Marzetta, "Large-Scale Antenna Method And Apparatus Of Wireless Communication With Suppression Of Intercell Interference," US8774146 patent, filed on Dec. 19, 2011, issued on July 8, 2014.
[16] A. Ashikhmin and T. L. Marzetta, "Pilot contamination precoding in multi-cell large scale antenna systems," Proceedings of 2012 IEEE International Symposium on Information Theory (ISIT), pp. 1137–1141, 2012.
[17] H. Yang and T. L. Marzetta, "Total energy efficiency of cellular large scale antenna system multiple access mobile networks," IEEE Online Conference on Green Communications (GreenCom), pp. 27–32, 2013.
[18] S. M. Kay, Fundamentals of Statistical Signal Processing. I: Estimation Theory, Prentice Hall PTR, 1993.
[19] J. Jose, A. Ashikhmin, T. L. Marzetta, and S. Vishwanath, "Pilot Contamination and Precoding in Multi-Cell TDD Systems," IEEE Transactions on Wireless Communications, vol. 10, pp. 2640–2651, 2011.
[20] M. K. Karakayali, G. J. Foschini, R. A. Valenzuela, and R. D. Yates, "On the maximum common rate achievable in a coordinated network," IEEE International Conf. of Commun., vol. 9, pp. 4333–4338, Jun. 2006.
[21] M. K. Karakayali, G. J. Foschini, and R. A. Valenzuela, "Network coordination for spectrally efficient communications in cellular systems," IEEE Trans. Wireless Commun., vol. 13, no. 4, pp. 56–61, Aug. 2006.
[22] G. J. Foschini, M. K. Karakayali, and R. A. Valenzuela, "Coordinating multiple antenna cellular networks to achieve enormous spectral efficiency," IEE Proc. Commun., vol. 153, no. 4, pp. 548–555, Aug. 2006.
[23] B. Hassibi and B. M. Hochwald, "How much training is needed in multiple-antenna wireless links?" IEEE Trans. Inf. Theory, vol. 49, pp. 951–963, Apr. 2003.
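As a numerical companion to the proof of Theorem 8 in Appendix B, the following Python sketch evaluates the uplink rate lower bound from the closed-form variances of $Q_1, \ldots, Q_5$. It is a minimal sketch under stated assumptions rather than the authors' code: it assumes unit-power symbols, takes the per-antenna variance of the estimate of $g_v^{[nj]}$ to be $\rho_r \tau (\beta_v^{[nj]})^2 / (1 + \sum_s \rho_r \tau \beta_v^{[ns]})$, and uses a hypothetical data layout in which `beta[v][n][j]` holds $\beta_v^{[nj]}$ and `omega[v]` holds $\omega_l^{[kv]}$. The function name and argument order are likewise illustrative.

```python
import math


def uplink_rate_lower_bound(beta, omega, k, l, M, rho_r, tau):
    """Lower bound (bits/use) on the uplink rate of user k in cell l.

    beta[v][n][j] : large-scale fading from user n of cell j to base station v
    omega[v]      : decoding coefficient applied to the output of base station v
    """
    L, K = len(beta), len(beta[0])

    def denom(v, n):
        return 1.0 + sum(rho_r * tau * beta[v][n][s] for s in range(L))

    def est_var(v, n, j):
        # Assumed per-antenna variance of the estimate of g_v^{[nj]}.
        return rho_r * tau * beta[v][n][j] ** 2 / denom(v, n)

    def mean_ip(v, j):
        # E[g_hat_v^{[kv] dagger} g_hat_v^{[kj]}] for users sharing pilot k.
        return M * rho_r * tau * beta[v][k][v] * beta[v][k][j] / denom(v, k)

    cells = range(L)
    w2 = [abs(w) ** 2 for w in omega]
    signal = rho_r * abs(sum(omega[v] * mean_ip(v, l) for v in cells)) ** 2

    q1 = rho_r * M * sum(w2[v] * est_var(v, k, v) * est_var(v, k, l) for v in cells)
    q2 = rho_r * sum(abs(sum(omega[v] * mean_ip(v, j) for v in cells)) ** 2
                     for j in cells if j != l)
    q2 += rho_r * M * sum(w2[v] * est_var(v, k, j) * est_var(v, k, v)
                          for j in cells if j != l for v in cells)
    q3 = rho_r * M * sum(w2[v] * est_var(v, k, v) * est_var(v, n, j)
                         for v in cells for j in cells for n in range(K) if n != k)
    q4 = rho_r * M * sum(w2[v] * est_var(v, k, v) * (beta[v][n][j] - est_var(v, n, j))
                         for v in cells for j in cells for n in range(K))
    q5 = M * sum(w2[v] * est_var(v, k, v) for v in cells)

    return math.log2(1.0 + signal / (q1 + q2 + q3 + q4 + q5))
```

As a sanity check of the sketch, a single cell with one user, $\beta = 1$, $\rho_r = \tau = 1$, and $M = 100$ gives an effective SINR of 25, which agrees with the familiar single-cell matched-filter bound $M \rho_r \gamma / (\rho_r \beta + 1)$ with $\gamma = 1/2$. Adding a second cell that reuses the same pilot lowers the bound, as the pilot contamination term $Q_2$ predicts.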