Wireline
Multiple-Input / Multiple-Output
Systems
carried out for the purpose of obtaining the academic degree
of Doctor of Technical Sciences (Doktor der technischen Wissenschaften)
submitted to the
Vienna University of Technology (Technische Universität Wien)
Faculty of Electrical Engineering and Information Technology
by
Supervisor
Examiner
Georg Tauböck
Wireline
Multiple-Input / Multiple-Output
Systems
September 2006
1st edition
All rights reserved
Copyright © 2006 Georg Tauböck
Publisher: Forschungszentrum Telekommunikation Wien
Printed in Austria
ISBN 3902477067
KURZFASSUNG
In recent years, the idea arose to use conventional telephone lines for high-rate data transmission. It turned out that the existing telephone lines support fast data transmission, provided they are not too long. In the relevant literature, the corresponding digital transmission methods are grouped under the term xDSL (Digital Subscriber Line) transmission.
The reason that the use of existing telephone lines is restricted to shorter cable lengths is that, in most practically deployed topologies, the individual twisted pairs are grouped into cable bundles, so that two twisted pairs lying close to each other disturb one another through the electromagnetic field. In the corresponding literature, this effect is referred to as crosstalk. Depending on
which is then specialized in a further step in order to obtain the variances of the real and imaginary parts of the noise and the correlations between the real and imaginary parts of the noise at fixed frequencies / subcarriers. By means of eigenvalue decompositions, it is possible to determine the eccentricities and the rotations of the noise ellipses. It turns out that the rotation angles are independent of the noise characteristics at the receiver input. They depend exclusively on the number of the considered subcarrier (subcarrier frequency). Furthermore, it is shown that different noise variances and correlations do not occur if the noise (at the receiver input) is white. For colored noise, however, different noise variances and correlations do occur, and one would actually have to use rotated rectangular signal constellations instead of the usual (square) QAM constellations. Otherwise, one has to accept a capacity loss and an increased symbol error probability. We calculate both quantities and show how existing loading algorithms have to be modified in order to obtain the optimum constellation parameters.
We likewise carry out a detailed interference analysis for a DMT system. We consider the case in which the impulse response exceeds the cyclic prefix on both sides, which leads to precursors and postcursors from both neighboring DMT symbols (intersymbol interference) and to intercarrier interference. We derive closed-form formulas for both contributions and study their statistical properties. It turns out that both interference contributions are complex-valued random vectors with equal first and second moments and a nonvanishing pseudocovariance matrix.
We also show how the obtained noise and interference results can be used for the design of time-domain equalizers.
In a second step, we generalize the noise and interference results from DMT to MIMO DMT. Again, it is possible to obtain closed-form solutions, even under very general assumptions regarding the correlations between the different twisted pairs of a cable bundle.
ABSTRACT
In recent years, the idea of using existing telephone lines for high-data-rate transmission took shape. It turned out that existing telephone lines can support high data rates, provided that the cables are not too long. In the corresponding literature, these transmission methods are referred to as xDSL (Digital Subscriber Line) transmission.
The reason that the use of existing telephone lines is limited to shorter loop lengths is that the various twisted pairs are bundled together in cable bundles in most practical topologies. It is quite clear that two loops that are close to each other disturb each other because of the induced electromagnetic field. In the literature, this effect is called crosstalk. Hence, there is ongoing research on reducing the performance degradation introduced by crosstalk. There are several issues regarding crosstalk and, therefore, different approaches to this problem. Note that crosstalk is subdivided into two types, i.e., Near-End Crosstalk (NEXT) and Far-End Crosstalk (FEXT), depending on the location the crosstalk originates from. Throughout this manuscript, we consider only crosstalk that stems from the far-end side (FEXT).
We will assume that transmission over the individual loops is performed via
Discrete Multitone modulation (DMT), which is the modulation scheme used in
ADSL and VDSL. Since we are considering simultaneous transmission over several
loops in a cable bundle, the whole transmission system can be regarded as a Multiple-Input / Multiple-Output (MIMO) system, and we will refer to the overall modulation scheme as the MIMO DMT modulation scheme.
The present work gives a comprehensive treatment of Multiple-Input / Multiple-Output Discrete Multitone (MIMO DMT) transmission. We show that such
a transmission scenario can be modeled by a complex vector channel, i.e., by a
deterministic complex matrix and by a complex noise vector.
We develop a theory for complex random vectors that takes into account rotationally variant random vectors, and is therefore of great importance for our
purpose, since complex random vectors (in DMT and MIMO DMT) have a nonvanishing pseudocovariance matrix in general (as we will also show in this manuscript). We prove a Generalized Maximum Entropy Theorem, that includes the
pseudocovariance matrix in its entropy inequality and therefore tightens the upper bound for rotationally variant random vectors. We show that the additional
correction term is independent of the specific probability distribution of the considered random vector. Furthermore, we obtain several capacity results for the
complex vector channel considered that take into account the pseudocovariance matrix. We show that a nonvanishing pseudocovariance matrix (of the noise) increases capacity and calculate the capacity loss if it is erroneously assumed that the pseudocovariance matrix is the zero matrix. Note also that we derive a criterion for a matrix to be a pseudocovariance matrix. This generalizes the well-known
criterion that a matrix is a covariance matrix of a certain random vector if and only
if it is symmetric / Hermitian and nonnegative definite.
We perform a detailed noise analysis for a DMT system and show that the
noise vector at the input of the Decision Device is rotationally variant in general.
We calculate the corresponding covariance matrix and pseudocovariance matrix,
which is then specialized in order to obtain the noise variances of real and imaginary part and to obtain the correlations between real and imaginary part for a fixed
frequency / subcarrier. Via eigenvalue decompositions, we are able to determine
the eccentricities and the rotations of the noise ellipses. It turns out that the rotation
angles are independent of the actual noise characteristics. They only depend on the
number of the considered subcarrier. Furthermore, it is shown that different noise
variances and correlations of real and imaginary part do not occur in the presence
of white noise (at the input of the receiver). For colored noise, they do occur, and
one has to use rotated rectangular constellations instead of the common (square)
QAM constellations. Otherwise, one has to accept a capacity loss and increased
symbol error probability. We calculate both quantities. Furthermore, we show
how to modify the existing bit-loading algorithms in order to obtain the optimum
constellation parameters.
We also perform a detailed interference analysis for a DMT system. We consider the case when the channel impulse response exceeds the Cyclic Prefix on both sides, which yields precursors and postcursors from both neighboring DMT symbols (intersymbol interference) and also intercarrier interference. We derive closed-form formulas for both contributions and consider their statistical properties as
well. It turns out that both interference contributions are complex random vectors
with equal first and second order moments and a nonvanishing pseudocovariance
matrix.
We also show how the noise and interference results obtained can be utilized
for the design of Time Domain Equalizers.
In a second step, we generalize the noise and interference results from DMT to
the MIMO DMT case. Again, it is possible to obtain closed-form solutions, even
for very general assumptions with respect to correlations across the various loops
of the cable bundle.
We present the general form of a transmission scheme that is suited to the
MIMO DMT channel and is based on so-called joint processing functions. It allows the use of Single-Input / Single-Output (SISO) codes, and we introduce the (sum-) capacity as a performance measure.
We deal with transmission schemes whose joint processing functions are based
on the Singular Value Decomposition (SVD) of the channel matrix. We show that
the optimum joint processing function can be obtained by means of the SVD. Furthermore, we study low(er)-complexity variations and discuss their performance.
ACKNOWLEDGEMENT
I would like to thank Prof. Johannes Huber and Prof. Johann Weinrichter for their
support that goes far beyond what I would have expected.
I am grateful to all my colleagues at the Telecommunications Research Center
Vienna (ftw.), especially to Jossy Sayir for his continuous (in fact, continuous and
not piecewise continuous) assistance and encouragement, to Werner Henkel for his
support in every way, and to Driton Statovci for our fruitful discussions concerning
the practical aspects of my work. The collaboration with them was a constant
source of new ideas and entertaining hours. The professional, inspiring, and open
work environment at ftw., shaped by Markus Kommenda and Horst Rode, provided
the basis for the work on this thesis.
I would like to thank my wife, my family, and my friends for their continuous sympathy during my research adventure, and especially my mother for being such an enthusiastic grandmother with helping hands whenever the father is busy writing his thesis.
I would like to dedicate this work to my daughter Maria Shirin, who is the smiling sun in my life.
CONTENTS
1. Introduction
2. The Subscriber-Line Network
   2.1 Discrete Multitone Modulation
   2.2 Discrete Multitone Modulation on a Cable Bundle
   2.3 The Channel Model

Appendix

B. Simulation Scenarios
   B.1 Scenario 1
       B.1.1 Transmission Medium
       B.1.2 DMT Parameters
       B.1.3 Noise Model
   B.2 Scenario 2
       B.2.1 Transmission Medium
       B.2.2 DMT Parameters
       B.2.3 Noise Model
   B.3 Scenario 3
       B.3.1 Transmission Medium
       B.3.2 DMT Parameters
       B.3.3 Noise Model
1. INTRODUCTION
on the location the crosstalk originates from. Throughout this manuscript, we will assume that we know how to cope with NEXT and consider only the crosstalk that stems from the far-end side (FEXT). We want to emphasize that NEXT is still a topic of ongoing research and that our assumption is introduced to simplify matters by considering only one crosstalk source.
One way to deal with this problem is based on the viewpoint that considers crosstalk not as a disturbance but, instead, as part of the channel. In the present work, we will pursue this idea and develop several methods for communication over channels with crosstalk. We also want to emphasize that the resulting performance gains are in line with Information Theory, which has led to the fundamental result that reliable communication is only possible below a certain data rate threshold (the capacity) for a given (physical) channel. If crosstalk is considered as part of the channel instead of as a disturbance source, this corresponds to a change of the channel. Thus, Information Theory provides us with a new data rate threshold.
In order to minimize the costs, we will assume that transmission over the individual loops is performed via Discrete Multitone modulation (DMT), which is the
modulation scheme used in ADSL and VDSL. Note that existing technologies are
usually cheaper than new technologies. Hence, a main part of this manuscript deals
with DMT, working out aspects not known before. We will show that there can be situations (not only in theory but also in practice) where these aspects have an enormous influence on the transmission performance and should be taken into account.
The present work is structured as follows:
1. The first chapter contains this introduction.
2. The second chapter considers the Subscriber-Line Network and presents the fundamentals of Discrete Multitone modulation (DMT). Furthermore, we consider transmission over cable bundles and show how the crosstalk can be modeled in order to reflect the properties of the real world. If each loop is equipped with DMT transmission, the crosstalk is transformed according to the DMT modulation scheme. We will show that DMT transmission over cable bundles can be described as a complex-valued vector channel, i.e., as a complex matrix-vector product plus an additive complex-valued noise vector. Since a vector channel has several inputs and outputs (the elements of the vectors), we refer to such a scenario as a Multiple-Input / Multiple-Output (MIMO) system.
3. Since we are dealing with complex-valued vectors, Chapter 3 presents a theory of complex random vectors. Note that most literature about complex random vectors deals only with a subclass, the so-called rotationally invariant complex random vectors. We will show in Chapter 4 that the complex random vectors we are looking at are not elements of this subclass. Instead, they are rotationally variant, so that it makes sense to develop this theory. Specifically, we generalize the Maximum Entropy Theorem to rotationally variant complex random vectors, and then use it to obtain certain capacity results. With the introduction of a rotationally invariant analogue of a complex random vector, we are able to show that the additional correction term in the Generalized Maximum Entropy Theorem is independent of the actual distribution of the considered complex random vector. We want to emphasize that it is not widely known that these complex random vectors are rotationally variant in general, and we will calculate the capacity loss that must be accepted if this fact is neglected (using very general assumptions without making use of any specific DMT properties).
4. As already mentioned, it is shown in Chapter 4 that we have to deal with rotationally variant random vectors. To be more precise, we will show in this chapter that the noise vector (at the input of the Decision Device) has this property (in general), and we will look at the implications resulting from this observation. It turns out that colored noise at the input of the DMT receiver makes the noise vector at the input of the Decision Device rotationally variant. In order to cope with rotationally variant noise, one should apply rotated rectangular constellations instead of the common square QAM constellations, and we will derive analytical formulas for the (optimum) parameters / shape of these constellations. In case this is not done, one must accept a capacity loss and a higher symbol error probability. Both quantities will be calculated.
Finally, in this chapter, we consider the problem of determining the Time Domain Equalizer (TDE) coefficients. If the TDE coefficients are not suitably adapted, there will be intersymbol and intercarrier interference. We will obtain closed-form solutions for both interference contributions and study their statistical properties. Again, it turns out that the interference is represented by a rotationally variant complex random vector, which is a second argument for the use of rotated rectangular signal constellations.
5. The first part of Chapter 5 generalizes the results of Chapter 4 about noise and interference to the MIMO case, i.e., instead of looking at one single loop equipped with DMT transmission, we look at the whole cable bundle. This is of great importance, since these results are fundamental for the design of low-complexity MIMO Time Domain Equalizers. This design is much more difficult than for the single-loop case.
The second part presents the general form of a transmission scheme that can cope with crosstalk. It is based on so-called joint processing functions and allows the use of conventional Single-Input / Single-Output (SISO) codes, so that we can again resort to existing technologies for this part of the transmission scheme.
The third section deals with transmission schemes whose joint processing
functions are based on the Singular Value Decomposition (SVD) [19] of
the channel matrix. We will show that we can obtain the optimum joint
processing functions by means of the SVD. Furthermore, we study low(er)-complexity transmission variants and discuss their performance. To obtain quantitative results, we perform simulations of these transmission schemes with realistic (practically used) parameters and compare the various methods.
The final section of this chapter presents the UP MIMO scheme, a scheme that was originally designed by the author for wireless transmission and that also has applications in wireline transmission. Specifically, it can be used to reduce the computational complexity at the transmitter side (but not at the receiver side). We will treat various aspects of this scheme.
6. The last chapter presents the overall conclusions and gives an outlook on
possible further developments.
2. THE SUBSCRIBER-LINE NETWORK

In this chapter we will first review one of the most important modulation schemes
used in the subscriber line network. This modulation scheme is called Discrete
Multitone modulation (DMT) and is currently used in the ADSL [25, 26, 28-30] and VDSL [9-12, 27] standards. It allows an efficient implementation and exhibits
performance near capacity. Our main focus in this chapter is to find a compact
mathematical description for the relationship between the data to be transmitted
and the received data.
Secondly, we will extend these considerations to DMT transmission over cable bundles. It is well known that if several modems transmit over a cable bundle simultaneously, each modem disturbs the others, so that we will encounter severe performance degradations. In the literature, cf. [48], these interference mechanisms are called Near-End Crosstalk (NEXT) and Far-End Crosstalk (FEXT). In this work we will mainly focus on FEXT, since there exist several techniques [48] to compensate for NEXT, whereas FEXT cancellation methods are still a topic of research and development. Again, it is our goal to find a compact model for the input-output behavior.
It will turn out that both scenarios can be described essentially in the same way,
which puts us in a position to analyze and optimize both systems using the same
framework.
[Fig. 2.1: DMT: the transmission system. Blocks: Source (bit rate 1/T_b), Mapping, Conjugate Complex Extension, IFFT (length N), Add Cyclic Prefix, Parallel → Serial, D/A Converter, Transmit Filter / Driver Stage (analog signal), Channel (Twisted Pair) with additive Noise, Receive Filter (Lowpass), A/D Converter, Serial → Parallel, Remove Cyclic Prefix, FFT (length N), Remove Conjugate Complex Extension, Frequency Domain Equalizer, Decision Device, Inverse Mapping, Sink.]
The Source emits a sequence of bits that are converted in the block Mapping into a sequence of complex-valued symbol vectors of dimension N/2 + 1, the so-called DMT symbols. Note that there is the additional constraint that the first and last elements of the vectors (the DMT symbols) have to be real-valued. Each symbol element of the vector is chosen according to optimized discrete symbol constellations. The optimization is usually carried out in a pre-transmission phase in which channel and noise characteristics are measured, so that high performance is guaranteed during transmission and the required (signal power) constraints (cf. [9-12, 25-30]) are fulfilled. In the next two blocks, Conjugate Complex Extension and IFFT, each complex vector of dimension N/2 + 1 is extended to a complex vector of dimension N, with the property that one half of the vector is a conjugated version of the other half, and passed through an inverse discrete Fourier transform of length N. Note that the vectors at the output of the inverse discrete Fourier transform, which is implemented using fast Fourier algorithms, are now real-valued vectors. The block Add Cyclic Prefix takes the last p elements of each vector and produces a new (N + p)-dimensional vector, which is obtained by stacking these p elements and the original N-dimensional vector. In the next step (block Parallel → Serial), all elements of these vectors are put together into a sequence of numbers, which are then (block D/A Converter) transformed into the analog domain, passed through a transmit filter, and transmitted over the twisted pair (block Transmit Filter / Driver Stage).
The received signal is passed through a Receive Filter (Lowpass) and converted back to the digital domain (block A/D Converter). It goes through an adaptive filter called Time Domain Equalizer, and the symbols are then stacked into a sequence of (N + p)-dimensional vectors (block Serial → Parallel). The first p elements of each vector are removed (block Remove Cyclic Prefix, which is naturally implemented together with the block Serial → Parallel in practice), and for the resulting N-dimensional vectors a Fast Fourier Transform (FFT) is computed. The first N/2 + 1 elements of each vector (block Remove Conjugate Complex Extension) are multiplied with N/2 + 1 complex scalars (block Frequency Domain Equalizer), which are calculated in the pre-transmission phase, so that the distortions introduced by the channel are compensated. The Decision Device maps back onto the signal constellations, and the block Inverse Mapping in turn produces a bitstream, which is (hopefully) equal to the bitstream generated by the Source.
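The transmit and receive chains just described can be sketched numerically. The following is a minimal NumPy illustration of our own (ideal, noiseless channel; the parameter values N = 16 and p = 4 are arbitrary and not taken from any standard):

```python
import numpy as np

N, p = 16, 4                       # FFT length and cyclic prefix length

# DMT symbol: N/2 + 1 complex subcarriers; carriers 0 and N/2 must be real
e = np.zeros(N // 2 + 1, dtype=complex)
e[1:N // 2] = np.exp(1j * 2 * np.pi * np.random.rand(N // 2 - 1))
e[0], e[N // 2] = 1.0, -1.0

# Conjugate Complex Extension: build the Hermitian-symmetric vector c
c = np.concatenate([e, np.conj(e[-2:0:-1])])        # length N

# IFFT -> real-valued samples, then Add Cyclic Prefix (last p samples first)
a = np.fft.ifft(c)
assert np.allclose(a.imag, 0)                       # Hermitian symmetry => real
q = np.concatenate([a[-p:], a])                     # length N + p

# Receiver (ideal channel): Remove Cyclic Prefix, FFT, drop redundant half
b = q[p:]
f = np.fft.fft(b)[:N // 2 + 1]
assert np.allclose(f, e)                            # data recovered
```

With a nontrivial channel, each FFT output would additionally be scaled by the channel transfer function at that subcarrier, which is exactly what the Frequency Domain Equalizer later undoes.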
Observe the relation between the DMT symbol rate 1/T_s and the channel symbol rate 1/T, cf. also Figure 2.1, given by

    1/T = (N + p)/T_s .    (2.1)
In order to analyze and optimize this system, we will build up a mathematical model for the input-output behavior. We start with the part of the transmission system that models the channel, as shown in Figure 2.2.

[Fig. 2.2: DMT: the channel. Blocks: D/A Converter, Transmit Filter / Driver Stage, Channel (Twisted Pair) with additive Noise s, Receive Filter (Lowpass), A/D Converter; input signal t, output signal r.]

All involved signals are real-valued, discrete-time signals, and we assume that the input-output behavior is modeled by a linear, time-invariant convolution with a real-valued impulse response g = [g(n)]_{n = -∞, ..., ∞} plus an additive noise term s, i.e., r = g * t + s,

    r(n) = \sum_{k=-\infty}^{\infty} g(k) t(n - k) + s(n).    (2.2)
Next, we will include the so-called Time Domain Equalizer (TDE) in our model, cf. Figure 2.3. Again, we model its behavior by a convolution with a real-valued, time-discrete impulse response e = [e(n)]_{n = -∞, ..., ∞}, i.e., u = e * r,

    u(n) = \sum_{k=-\infty}^{\infty} e(k) r(n - k).    (2.3)

With the overall channel impulse response h = e * g and the filtered noise z = e * s, we obtain

    u(n) = \sum_{k=-\infty}^{\infty} h(k) t(n - k) + z(n).    (2.4)
It can be seen from Figure 2.3 that the TDE is an adaptive filter. Its coefficients are determined in the pre-transmission phase so that the overall channel impulse response h has a length shorter than or equal to p + 1, p being the length of the Cyclic Prefix, cf. Figure 2.1, or, to be more precise, they are determined such that

    h(n) = 0,   n < 0 or n > p.    (2.5)

Fig. 2.3: DMT: The Channel and the Time Domain Equalizer (adaptive FIR filter).
Note that for practically occurring channel impulse responses g, the resulting filter e will always be noncausal and therefore not implementable. Since a simple delay in the receiver solves this problem, assumption (2.5) will be maintained for simplicity.
We will address the issue of designing the coefficients of the TDE in more detail (including an analysis of the case that (2.5) holds only approximately) later in Section 4.4.
Due to assumption (2.5), the infinite sum in (2.4) is replaced by a finite sum, so that (2.4) simplifies to
    u(n) = \sum_{k=0}^{p} h(k) t(n - k) + z(n).    (2.6)
With the definitions

    v_{n_0} = [z(n_0), z(n_0 + 1), ..., z(n_0 + N - 1)]^T ∈ R^N,

    b_{n_0} = [u(n_0), u(n_0 + 1), ..., u(n_0 + N - 1)]^T ∈ R^N,

    q_{n_0} = [t(n_0 - p), ..., t(n_0 - 1), t(n_0), t(n_0 + 1), ..., t(n_0 + N - 1)]^T ∈ R^{N+p},

and the Toeplitz matrix

    H = [ h(p)  h(p-1)  ...  h(0)   0     ...    0
           0     h(p)   ...  h(1)  h(0)   ...    0
           :            ...               ...    :
           0      0     ...  h(p)  h(p-1) ...  h(0) ] ∈ R^{N × (N+p)},

we have

    b_{n_0} = H q_{n_0} + v_{n_0}.    (2.7)
From Figure 2.4, it can be seen that (2.7) describes the input-output behavior between the input of the Parallel → Serial block and the output of the Remove Cyclic Prefix block.
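The matrix-vector form (2.7) can be checked against the direct convolution (2.6). The following is an illustrative sketch of our own (arbitrary taps h, random input t, noise omitted), showing that the banded Toeplitz matrix H applied to the stacked vector q reproduces the outputs u(n_0), ..., u(n_0 + N - 1):

```python
import numpy as np

N, p = 8, 3
h = np.array([1.0, -0.5, 0.25, 0.1])        # h(0), ..., h(p): length p + 1
t = np.random.randn(N + p)                  # t(n0 - p), ..., t(n0 + N - 1)

# Build H in R^{N x (N+p)}: row i holds h(p), ..., h(0) starting at column i
H = np.zeros((N, N + p))
for i in range(N):
    H[i, i:i + p + 1] = h[::-1]             # h(p) h(p-1) ... h(0)

b = H @ t                                   # noise-free b_{n0} = H q_{n0}

# Compare with the direct convolution u(n) = sum_k h(k) t(n - k)
u = np.convolve(t, h)[p:p + N]              # outputs at n0, ..., n0 + N - 1
assert np.allclose(b, u)
```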
Let a_{n_0} denote the N-dimensional input vector of the Add Cyclic Prefix block, cf. also Figure 2.4. In this block, the last p elements of the vector are stacked together with all N elements of the vector and are output as an (N + p)-dimensional vector. We can write this operation as a matrix-vector product, i.e., q_{n_0} = R a_{n_0}, with

    R = [ 0  I_p ; I_N ] ∈ R^{(N+p) × N}.    (2.8)

We assume

    N > p.    (2.9)
Fig. 2.4: DMT: The Channel, the Time Domain Equalizer, the Parallel / Serial Conversion, and the Cyclic Prefix. [Vector sequences ..., a_{n_0}, a_{n_0+N+p}, ... at the Add Cyclic Prefix input; ..., q_{n_0}, q_{n_0+N+p}, ... after it; ..., b_{n_0}, b_{n_0+N+p}, ... at the Remove Cyclic Prefix output.]
Removing the cyclic prefix at the receiver and combining (2.5)-(2.8), the resulting input-output behavior is b_{n_0} = H_cyc a_{n_0} + v_{n_0}, with

    H_cyc = H R = [ h((i - j) mod N) ]_{i,j = 0, ..., N-1} ∈ R^{N × N}

(where h(n) = 0 for n < 0 and n > p), which shows that the transposed matrix H_cyc^T is a cyclic matrix: each row can be obtained by cyclic permutations of the other rows. It is well known [13, 59] that cyclic matrices can be diagonalized by means of the discrete Fourier transform (DFT) and the inverse discrete Fourier transform (IDFT). Let
    F = (1/√N) [ e^{-j(2π/N)kl} ]_{k,l = 0, ..., N-1}   and   F^{-1} = (1/√N) [ e^{j(2π/N)kl} ]_{k,l = 0, ..., N-1}    (2.10)

denote the DFT matrix and the IDFT matrix, respectively, and let

    H(z) = \sum_{n=-\infty}^{\infty} h(n) z^{-n} = \sum_{n=0}^{p} h(n) z^{-n}    (2.11)

denote the z-transform of h. Then,

    F^{-1} H_cyc^T F = diag( H(e^{j(2π/N)·0}), H(e^{j(2π/N)·1}), ..., H(e^{j(2π/N)·(N-1)}) ),

and transposing this equation (using F = F^T and F^{-1} = (F^{-1})^T),

    F H_cyc F^{-1} = diag( H(e^{j(2π/N)·0}), H(e^{j(2π/N)·1}), ..., H(e^{j(2π/N)·(N-1)}) ).    (2.12)
Note that the block IFFT in the transmitter performs a multiplication with F^{-1} and the block FFT in the receiver performs a multiplication with F using efficient fast Fourier algorithms, cf. Figure 2.5, such that H_cyc is diagonalized and hence intersymbol interference (ISI) is avoided.
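The diagonalization (2.12) is easy to verify numerically. The following sketch (our own illustration, with an arbitrary shortened impulse response) builds the circulant matrix explicitly and compares F H_cyc F^{-1} with the diagonal of transfer function samples:

```python
import numpy as np

N, p = 8, 3
h = np.array([1.0, -0.5, 0.25, 0.1])            # h(0..p), zero elsewhere

# Circulant matrix: H_cyc[i, j] = h((i - j) mod N)
h_ext = np.concatenate([h, np.zeros(N - (p + 1))])
Hcyc = np.array([[h_ext[(i - j) % N] for j in range(N)] for i in range(N)])

# F Hcyc F^{-1} should be diagonal with entries H(e^{j 2 pi n / N})
F = np.fft.fft(np.eye(N)) / np.sqrt(N)          # unitary DFT matrix
D = F @ Hcyc @ np.linalg.inv(F)
eigs = np.fft.fft(h_ext)                        # H(z) on the DFT grid

assert np.allclose(np.diag(D), eigs)
assert np.allclose(D, np.diag(eigs))            # off-diagonal terms vanish
```

The eigenvalues come out as the DFT of the (zero-padded) impulse response, i.e., exactly the samples H(e^{j(2π/N)n}) appearing in (2.12).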
Fig. 2.5: DMT: The Channel, the Time Domain Equalizer, the Parallel / Serial Conversion, the Cyclic Prefix, the (I)FFT, the Conjugate Complex Extension, and the Frequency Domain Equalizer. [Vector sequences ..., e_{n_0}, ... at the Conjugate Complex Extension input; ..., c_{n_0}, ... at the IFFT input; ..., d_{n_0}, ... at the FFT output; ..., f_{n_0}, ... at the Remove Conjugate Complex Extension output; ..., m_{n_0}, ... at the Frequency Domain Equalizer output.]
The blocks Conjugate Complex Extension and Remove Conjugate Complex Extension have the task of achieving a real-valued transmit signal. This is the case if the output vector a_{n_0} of the IDFT is real-valued. In order to guarantee this constraint, the input vector of the IDFT has to fulfill a Hermitian symmetry condition, i.e., the vector (for even N)

    c_{n_0} = [c_{n_0}(0), ..., c_{n_0}(N/2), ..., c_{n_0}(N - 1)]^T   with   a_{n_0} = F^{-1} c_{n_0}

has to satisfy [13, 59]

    c_{n_0}(n) = \overline{c_{n_0}(N - n)},   n = 1, ..., N/2 - 1,   and   c_{n_0}(0), c_{n_0}(N/2) ∈ R,    (2.13)

where the overbar denotes complex conjugation. The block Conjugate Complex Extension takes its (N/2 + 1)-dimensional complex input vector,

    e_{n_0} = [e_{n_0}(0), ..., e_{n_0}(N/2)]^T,

and produces

    c_{n_0} = [e_{n_0}(0), ..., e_{n_0}(N/2), \overline{e_{n_0}(N/2 - 1)}, ..., \overline{e_{n_0}(1)}]^T.    (2.14)
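The symmetry condition (2.13) and the extension (2.14) can be checked in a few lines. This is our own illustration; the example vector e is arbitrary, with real first and last entries as required:

```python
import numpy as np

N = 8
e = np.array([1.0, 2 - 1j, -0.5 + 0.3j, 1j, -2.0])   # N/2 + 1 entries; e(0), e(N/2) real

c = np.concatenate([e, np.conj(e[-2:0:-1])])          # (2.14): conjugate extension
# (2.13): c(n) equals the conjugate of c(N - n)
assert all(np.isclose(c[n], np.conj(c[N - n])) for n in range(1, N // 2))

a = np.fft.ifft(c)                                    # a_{n0} = F^{-1} c_{n0}
assert np.allclose(a.imag, 0)                         # real-valued transmit vector
```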
On the other hand, the block Remove Conjugate Complex Extension inverts the operation of the block Conjugate Complex Extension. Mathematically, this can be written as a matrix-vector product, i.e., f_{n_0} = E d_{n_0}, with

    E = [ I_{N/2+1}  0 ] ∈ R^{(N/2+1) × N}    (2.15)

and d_{n_0} being the output vector of the DFT, d_{n_0} = F b_{n_0}. For the nomenclature of the previously described input / output vectors, we also refer to Figure 2.5. Finally, with w_{n_0} = F v_{n_0},

    f_{n_0} = E d_{n_0} = E F b_{n_0} = E F (H_cyc a_{n_0} + v_{n_0}) = D e_{n_0} + E w_{n_0},    (2.16)

where

    D = diag( H(e^{j(2π/N)·0}), H(e^{j(2π/N)·1}), ..., H(e^{j(2π/N)·(N/2)}) ).
Let (·)^† be defined as

    (·)^† : C → C,   z ↦ z^† = { 1/z,  z ≠ 0;   0,  z = 0 },

and let D^† denote the (Moore-Penrose) pseudo-inverse [19] of D, i.e.,

    D^† = diag( H(e^{j(2π/N)·0})^†, H(e^{j(2π/N)·1})^†, ..., H(e^{j(2π/N)·(N/2)})^† ).

The Frequency Domain Equalizer (FDE), cf. also Figure 2.5, performs a multiplication of its input vector with the matrix D^†, i.e., m_{n_0} = D^† f_{n_0}, so that we finally obtain

    m_{n_0} = e_{n_0} + D^† E w_{n_0},    (2.17)

with the additional requirement that e_{n_0}(n) = 0 if H(e^{j(2π/N)n}) = 0.
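Equation (2.17) can be illustrated end to end. The following is our own noiseless sketch; the channel taps are arbitrary and shorter than the cyclic prefix, and the per-subcarrier division implements the pseudo-inverse convention z^† = 1/z for z ≠ 0 and 0^† = 0:

```python
import numpy as np

N, p = 16, 4
h = np.array([1.0, 0.4, -0.2])                   # channel, length <= p + 1

# QPSK-like data on subcarriers 1 .. N/2 - 1 (carriers 0 and N/2 kept at 0)
e = np.zeros(N // 2 + 1, dtype=complex)
e[1:N // 2] = (np.random.randint(0, 2, N // 2 - 1) * 2 - 1) + \
              1j * (np.random.randint(0, 2, N // 2 - 1) * 2 - 1)

c = np.concatenate([e, np.conj(e[-2:0:-1])])     # Hermitian extension
a = np.fft.ifft(c).real
q = np.concatenate([a[-p:], a])                  # add cyclic prefix

r = np.convolve(q, h)[:N + p]                    # channel, no noise

b = r[p:]                                        # remove cyclic prefix
f = np.fft.fft(b)[:N // 2 + 1]                   # f = D e (cf. (2.16))

Hfreq = np.fft.fft(np.concatenate([h, np.zeros(N - len(h))]))[:N // 2 + 1]
Hdag = np.where(Hfreq == 0, 0, 1 / np.where(Hfreq == 0, 1, Hfreq))  # z -> z^dagger
m = Hdag * f                                     # frequency domain equalizer

assert np.allclose(m[1:N // 2], e[1:N // 2])     # (2.17) with zero noise
```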
The block Mapping, see Figure 2.1, maps the bitstream onto complex symbols according to specific constellations (usually QAM constellations). In principle, we are free to design these constellations, except that the constellations corresponding to subcarriers 0 and N/2 have to be real-valued (they correspond to e_{n_0}(0) and e_{n_0}(N/2)), and that e_{n_0}(n) = 0 if H(e^{j(2π/N)n}) = 0. However, there is another requirement that we have to take care of. We cannot transmit with arbitrarily high power, as this would violate the standards [9-12, 25-30] and is also impossible from an engineering point of view. In order to model this power constraint, we will assume that the average sum power of the vector e_{n_0} is smaller than or equal to a certain number S_DMT. This power constraint now translates into a constraint on the possible signal constellations. We also want to emphasize that this model is only an approximation of the power constraint required by the standards [9-12, 25-30] for a real system, but it allows an analysis of the system on the one hand and is also accurate to some extent on the other hand. Note that in practical systems, there is an additional constraint on the power spectral density of the transmit signal.
There remain two blocks to explain. The Decision Device maps (rounds) the noisy elements of the vectors m_{n_0} back onto the constellations, whereas the block Inverse Mapping is an exact inversion of the block Mapping and produces a bitstream, which is hopefully equal to the bitstream emitted by the Source.
To be more precise: the vector e_{n_0} is modeled as a random vector having a correlation matrix [34] with a trace smaller than or equal to S_DMT.
In fact, it is not the fiber that is expensive but the equipment for modulating optical signals and, of course, the laying of the fiber. Fiber itself is cheaper than copper.
17
1
2
FEXT
K
NEXT
1
2
Noise
18
[Figure: MIMO DMT transmission over a cable bundle: K DMT Modulators and Transmit Units send the vector sequences $\dots, e^{\langle k\rangle}_{n_0}, e^{\langle k\rangle}_{n_0+N+p}, \dots$, $k = 1, \dots, K$; FEXT and noise are added in the channel; K Receive Units and DMT Demodulators deliver $\dots, f^{\langle k\rangle}_{n_0}, f^{\langle k\rangle}_{n_0+N+p}, \dots$]
[Figure: DMT Modulator: conjugate complex extension, IFFT (length N), parallel/serial conversion, and addition of the cyclic prefix (length p), producing the transmit signal $t^{\langle k\rangle}$.]
[Figure: DMT Demodulator: removal of the cyclic prefix (length p), serial/parallel conversion, FFT (length N), and removal of the conjugate complex extension.]
[Figure: Transmit Unit: D/A converter, transmit filter, and driver stage, producing $t^{\langle k\rangle}$; Receive Unit: receive filter, lowpass, and A/D converter, producing $u^{\langle k\rangle}$ from $r^{\langle k\rangle}$.]
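The modulator chain just described can be summarized in a few lines. The block below is a simplified sketch (the function name `dmt_modulate` is hypothetical, and scaling and serial-conversion details of a standardized modulator are ignored); it shows how the conjugate complex extension makes the IFFT output real-valued before the cyclic prefix is prepended.

```python
import numpy as np

def dmt_modulate(e, p):
    """Map N/2+1 complex symbols to one real DMT block with cyclic prefix.

    e[0] and e[N/2] must be real-valued; p is the cyclic prefix length.
    """
    N = 2 * (len(e) - 1)
    # conjugate complex extension: build the spectrum of a real signal
    spec = np.concatenate([e, np.conj(e[-2:0:-1])])
    t = np.fft.ifft(spec)
    assert np.allclose(t.imag, 0.0)     # real by construction
    t = t.real
    return np.concatenate([t[-p:], t])  # prepend cyclic prefix

e = np.array([1.0, 2 - 1j, 0.5 + 0.3j, -1.0])  # N = 6, subcarriers 0..3
block = dmt_modulate(e, p=2)
print(len(block))   # 8  (N + p samples)
```

The cyclic prefix is a copy of the last p samples of the block, which is exactly what turns the linear channel convolution into a circular one once the prefix is removed at the receiver.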
[Figure: the K Transmit Units emit $t^{\langle k\rangle}$ into the cable bundle, where FEXT and noise are added; the Receive Units observe $r^{\langle k\rangle}$, and the Time Domain Equalizers produce $u^{\langle k\rangle}$.]
Fig. 2.9: MIMO DMT: The Channel and the Time Domain Equalizers.
fusion between a signal on the kth loop and the kth power of a signal. The input-output behavior is approximated by linear and time-invariant convolutions of the input signals $t^{\langle l\rangle} = \left(t^{\langle l\rangle}(n)\right)_{n=-\infty,\dots,\infty}$ with real-valued impulse responses $g^{\langle kl\rangle} = \left(g^{\langle kl\rangle}(n)\right)_{n=-\infty,\dots,\infty}$, plus additive noise terms $s^{\langle k\rangle} = \left(s^{\langle k\rangle}(n)\right)_{n=-\infty,\dots,\infty}$. Note that $k, l = 1, \dots, K$. In contrast to the single-loop scenario, the kth received signal $r^{\langle k\rangle} = \left(r^{\langle k\rangle}(n)\right)_{n=-\infty,\dots,\infty}$ does not only depend on the kth transmit signal $t^{\langle k\rangle}$. It also depends (linearly) on all other transmit signals $t^{\langle l\rangle}$, $l \neq k$, i.e.,
$$r^{\langle k\rangle}(n) = \sum_{l=1}^{K} \sum_{m=-\infty}^{\infty} g^{\langle kl\rangle}(m)\, t^{\langle l\rangle}(n-m) + s^{\langle k\rangle}(n). \tag{2.18}$$
The Time Domain Equalizers (TDEs) have a similar function as in the single-loop case. Again, it is their task to shorten the impulse responses, and their behavior is modeled by
$$u^{\langle k\rangle}(n) = \sum_{m=-\infty}^{\infty} e^{\langle k\rangle}(m)\, r^{\langle k\rangle}(n-m). \tag{2.19}$$
volution of $e^{\langle k\rangle}$ and $g^{\langle kl\rangle}$, $h^{\langle kl\rangle} = e^{\langle k\rangle} * g^{\langle kl\rangle}$, and let $z^{\langle k\rangle} = \left(z^{\langle k\rangle}(n)\right)_{n=-\infty,\dots,\infty}$ denote the convolution of $e^{\langle k\rangle}$ and $s^{\langle k\rangle}$, $z^{\langle k\rangle} = e^{\langle k\rangle} * s^{\langle k\rangle}$. Then, we obtain an input-output relation as
$$u^{\langle k\rangle}(n) = \sum_{l=1}^{K} \sum_{m=-\infty}^{\infty} h^{\langle kl\rangle}(m)\, t^{\langle l\rangle}(n-m) + z^{\langle k\rangle}(n), \tag{2.20}$$
where, after successful shortening, the effective impulse responses satisfy
$$h^{\langle kl\rangle}(n) \approx 0 \quad \text{for } n < 0 \text{ or } n > p. \tag{2.21}$$
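The input-output relation (2.20) can be sketched numerically: each equalized signal is the superposition of K convolutions plus filtered noise. The impulse responses and signals below are random placeholders, not measured cable data.

```python
import numpy as np

rng = np.random.default_rng(1)
K, L = 3, 5                       # number of loops, response length (placeholders)
t = rng.standard_normal((K, 64))                   # transmit signals t^<l>
h = rng.standard_normal((K, K, L))                 # shortened responses h^<kl>
z = 0.01 * rng.standard_normal((K, 64 + L - 1))    # equalized noise z^<k>

# u^<k>(n) = sum_l (h^<kl> * t^<l>)(n) + z^<k>(n), cf. (2.20)
u = np.array([
    sum(np.convolve(h[k, l], t[l]) for l in range(K)) + z[k]
    for k in range(K)
])
print(u.shape)   # (3, 68)
```

Setting the off-diagonal responses `h[k, l]` (with `l != k`) to zero removes the FEXT terms and decouples the loops.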
Note that the calculation of the TDE coefficients is a nontrivial problem, since the impulse response of the kth TDE, $e^{\langle k\rangle}$, has to shorten all $g^{\langle kl\rangle}$, $l = 1, \dots, K$, simultaneously. In compact notation, (2.20) reads
$$u^{\langle k\rangle} = \sum_{l=1}^{K} h^{\langle kl\rangle} * t^{\langle l\rangle} + z^{\langle k\rangle}. \tag{2.22}$$
We can therefore interchange the sum in (2.22) with the operation of the DMT demodulator and apply the analysis of single-loop DMT (Section 2.1) to each element of the sum. The input-output behavior of the whole MIMO DMT system is then obtained by summation. Let
$$\dots, e^{\langle k\rangle}_{n_0}, e^{\langle k\rangle}_{n_0+N+p}, \dots \in \mathbb{C}^{\frac{N}{2}+1} \qquad \text{and} \qquad \dots, f^{\langle k\rangle}_{n_0}, f^{\langle k\rangle}_{n_0+N+p}, \dots \in \mathbb{C}^{\frac{N}{2}+1}$$
denote the input vector sequence of the kth modulator and the output vector sequence of the kth demodulator, respectively, and define
$$v^{\langle k\rangle}_{n_0} = \begin{bmatrix} z^{\langle k\rangle}(n_0) \\ z^{\langle k\rangle}(n_0+1) \\ \vdots \\ z^{\langle k\rangle}(n_0+N-1) \end{bmatrix} \in \mathbb{R}^{N}.$$
Then the MIMO DMT input-output behavior can be immediately derived from equation (2.16) as
$$f^{\langle k\rangle}_{n_0} = \sum_{l=1}^{K} D^{\langle kl\rangle}\, e^{\langle l\rangle}_{n_0} + E\, w^{\langle k\rangle}_{n_0}, \tag{2.23}$$
with $w^{\langle k\rangle}_{n_0} = F v^{\langle k\rangle}_{n_0}$ as in (2.16),
$$D^{\langle kl\rangle} = \begin{bmatrix} H^{\langle kl\rangle}\!\left(e^{j\frac{2\pi}{N}0}\right) & & 0 \\ & \ddots & \\ 0 & & H^{\langle kl\rangle}\!\left(e^{j\frac{2\pi}{N}\frac{N}{2}}\right) \end{bmatrix},$$
where
$$H^{\langle kl\rangle}(z) = \sum_{n=0}^{p} h^{\langle kl\rangle}(n)\, z^{-n} \tag{2.24}$$
denotes the Z-transform of $h^{\langle kl\rangle}$, and E and F are defined in (2.15) and (2.10), respectively. With equation (2.23), we have found an expression that, in principle, can be used as a starting point for further analysis and optimization. Unfortunately, we have to deal with a set of K equations, since k takes values from 1 to K. For simplicity (and also for notational reasons) we will rewrite equation (2.23) in the following way. Let
$$x_{n_0} = \begin{bmatrix} e^{\langle 1\rangle}_{n_0}(0) \\ \vdots \\ e^{\langle K\rangle}_{n_0}(0) \\ e^{\langle 1\rangle}_{n_0}(1) \\ \vdots \\ e^{\langle K\rangle}_{n_0}(1) \\ \vdots \\ e^{\langle 1\rangle}_{n_0}\!\left(\frac{N}{2}\right) \\ \vdots \\ e^{\langle K\rangle}_{n_0}\!\left(\frac{N}{2}\right) \end{bmatrix}, \qquad y_{n_0} = \begin{bmatrix} f^{\langle 1\rangle}_{n_0}(0) \\ \vdots \\ f^{\langle K\rangle}_{n_0}(0) \\ f^{\langle 1\rangle}_{n_0}(1) \\ \vdots \\ f^{\langle K\rangle}_{n_0}(1) \\ \vdots \\ f^{\langle 1\rangle}_{n_0}\!\left(\frac{N}{2}\right) \\ \vdots \\ f^{\langle K\rangle}_{n_0}\!\left(\frac{N}{2}\right) \end{bmatrix}, \qquad \text{and} \qquad n_{n_0} = \begin{bmatrix} w^{\langle 1\rangle}_{n_0}(0) \\ \vdots \\ w^{\langle K\rangle}_{n_0}(0) \\ w^{\langle 1\rangle}_{n_0}(1) \\ \vdots \\ w^{\langle K\rangle}_{n_0}(1) \\ \vdots \\ w^{\langle 1\rangle}_{n_0}\!\left(\frac{N}{2}\right) \\ \vdots \\ w^{\langle K\rangle}_{n_0}\!\left(\frac{N}{2}\right) \end{bmatrix} \in \mathbb{C}^{\left(\frac{N}{2}+1\right)K}$$
denote stacked versions of the input, output, and noise vectors, respectively, and let
$$H_{\text{MIMO DMT}} = \begin{bmatrix} H(0) & & & 0 \\ & H(1) & & \\ & & \ddots & \\ 0 & & & H\!\left(\frac{N}{2}\right) \end{bmatrix} \in \mathbb{C}^{\left(\frac{N}{2}+1\right)K \times \left(\frac{N}{2}+1\right)K}$$
with
$$H(n) = \begin{bmatrix} H^{\langle 11\rangle}\!\left(e^{j\frac{2\pi}{N}n}\right) & \cdots & H^{\langle 1K\rangle}\!\left(e^{j\frac{2\pi}{N}n}\right) \\ \vdots & \ddots & \vdots \\ H^{\langle K1\rangle}\!\left(e^{j\frac{2\pi}{N}n}\right) & \cdots & H^{\langle KK\rangle}\!\left(e^{j\frac{2\pi}{N}n}\right) \end{bmatrix} \in \mathbb{C}^{K\times K}, \tag{2.25}$$
so that the input-output behavior reads $y_{n_0} = H_{\text{MIMO DMT}}\, x_{n_0} + n_{n_0}$,
and we have found a compact mathematical description of the input-output behavior for synchronized DMT transmission over cable bundles. Like in the single-loop case, we have to impose a power constraint. Instead of having power constraints for each loop, we will assume that the average sum power⁶ of the vector $x_{n_0}$ is less than or equal to a certain number $S_{\text{MIMO DMT}}$. Mathematically, this is easier to handle, and if we set $S_{\text{MIMO DMT}} = K S_{\text{DMT}}$, it can be shown by simulation that in real systems, this relaxed power constraint is sufficient to guarantee a correct power distribution on the various loops.
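The block-diagonal structure of $H_{\text{MIMO DMT}}$ can be assembled directly: one $K \times K$ FEXT matrix per subcarrier, placed on the block diagonal. The sketch below uses random placeholder impulse responses and a hypothetical helper name `H_tone`.

```python
import numpy as np

rng = np.random.default_rng(2)
K, N, p = 2, 8, 3
h = rng.standard_normal((K, K, p + 1))     # placeholder responses h^<kl>

def H_tone(n):
    """K x K matrix H(n): Z-transforms (2.24) evaluated at z = e^{j 2 pi n / N}."""
    z = np.exp(1j * 2 * np.pi * n / N)
    powers = z ** -np.arange(p + 1)         # z^0, z^-1, ..., z^-p
    return h @ powers                       # contracts the last axis of h

M = (N // 2 + 1) * K
H_mimo = np.zeros((M, M), dtype=complex)
for n in range(N // 2 + 1):
    H_mimo[n * K:(n + 1) * K, n * K:(n + 1) * K] = H_tone(n)

print(H_mimo.shape)        # (10, 10)
```

Off-diagonal blocks are zero because synchronized DMT confines FEXT to within each subcarrier; the off-diagonal entries of each $K \times K$ block carry the crosstalk.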
$$y = Ax + n, \tag{2.26}$$
where $y \in \mathbb{C}^r$ and $x \in \mathbb{C}^t$ denote the received and transmitted vectors, respectively, $A$ is a deterministic $r \times t$ complex matrix, the channel matrix, and $n \in \mathbb{C}^r$ is the noise vector. The transmitter is constrained in its total power⁷ to S,
$$\mathrm{E}\{x^H x\} \le S, \tag{2.27}$$
or, equivalently,
$$\operatorname{tr}\left(\mathrm{E}\{xx^H\}\right) \le S. \tag{2.28}$$
Observe that all occurring vectors are modeled as random vectors, so that the expectation operation $\mathrm{E}\{\cdot\}$ in equations (2.27) and (2.28) is well defined. We will discuss related definitions and properties of complex random vectors in Section 3.1.
Furthermore, we want to emphasize that the channel matrix as well as the noise characteristics are assumed to be known at the receiver and the transmitter. This means that a real system has to determine (estimate) these parameters in a pre-transmission phase before the effective information transmission starts.
If we compare the results obtained for DMT transmission over one single loop, cf. equations (2.16) and (2.17), with this channel model, we conclude that A is a diagonal matrix and we have $r = t = \frac{N}{2} + 1$ in this case. The Frequency Domain Equalizer (FDE) even yields an identity channel matrix.
In the MIMO DMT case, cf. equation (2.25), the channel matrix is no longer a diagonal matrix. It is a block diagonal matrix. All submatrices on the diagonal have dimension $K \times K$, whereas the whole matrix has dimensions $r = t = \left(\frac{N}{2}+1\right)K$. Observe that there is no FEXT if and only if A is a diagonal matrix.
⁶ To be more precise: the vector $x_{n_0}$ is modeled as a random vector having a correlation matrix [34] with a trace less than or equal to $S_{\text{MIMO DMT}}$.
⁷ The superscript H denotes Hermitian transposition, i.e., transposition and complex conjugation.
$$\breve{x} = \begin{bmatrix} \Re\{x\} \\ \Im\{x\} \end{bmatrix} \in \mathbb{R}^{2t}, \qquad \breve{y} = \begin{bmatrix} \Re\{y\} \\ \Im\{y\} \end{bmatrix} \in \mathbb{R}^{2r}, \qquad \breve{n} = \begin{bmatrix} \Re\{n\} \\ \Im\{n\} \end{bmatrix} \in \mathbb{R}^{2r},$$
and
$$\breve{A} = \begin{bmatrix} \Re\{A\} & -\Im\{A\} \\ \Im\{A\} & \Re\{A\} \end{bmatrix} \in \mathbb{R}^{2r\times 2t}, \tag{2.29}$$
so that the power constraints (2.27) and (2.28) become
$$\mathrm{E}\{\breve{x}^T \breve{x}\} \le S \tag{2.30}$$
and
$$\operatorname{tr}\left(\mathrm{E}\{\breve{x}\breve{x}^T\}\right) \le S, \tag{2.31}$$
respectively. Note that the real and complex channel descriptions are fully equivalent.
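This equivalence is easy to verify numerically: stacking real and imaginary parts and applying the structured matrix of (2.29) reproduces the complex matrix-vector product. The matrices below are random placeholders.

```python
import numpy as np

rng = np.random.default_rng(3)
r, t = 4, 3
A = rng.standard_normal((r, t)) + 1j * rng.standard_normal((r, t))
x = rng.standard_normal(t) + 1j * rng.standard_normal(t)

def stack(v):                      # v -> [Re v; Im v]
    return np.concatenate([v.real, v.imag])

A_real = np.block([[A.real, -A.imag],
                   [A.imag,  A.real]])    # cf. (2.29)

print(np.allclose(A_real @ stack(x), stack(A @ x)))   # True
```

The structured matrix doubles the dimensions but carries exactly the same information as the complex channel matrix.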
Since the noise is modeled as a random vector, the noise characteristics are specified by a (multivariate) joint probability density function (p.d.f.), if it exists. If it is not explicitly mentioned otherwise, we will assume throughout this manuscript that $\breve{n}$ is distributed according to a $2r$-dimensional, zero-mean, Gaussian distribution [34], i.e., its p.d.f. is given by
$$f_{\breve{n}}(\breve{x}) = \frac{1}{\sqrt{\det\left(2\pi C_{\breve{n}}\right)}}\, e^{-\frac{1}{2}\left(\breve{x}-\mu_{\breve{n}}\right)^T C_{\breve{n}}^{-1}\left(\breve{x}-\mu_{\breve{n}}\right)}, \tag{2.32}$$
where $\mu_{\breve{n}} = \mathrm{E}\{\breve{n}\} = 0 \in \mathbb{R}^{2r}$ denotes the zero expectation / mean vector, and $C_{\breve{n}} = \mathrm{E}\{(\breve{n}-\mu_{\breve{n}})(\breve{n}-\mu_{\breve{n}})^T\} = \mathrm{E}\{\breve{n}\breve{n}^T\} \in \mathbb{R}^{2r\times 2r}$ denotes the covariance matrix. As already mentioned, we will assume that the covariance matrix $C_{\breve{n}}$ is known both at the receiver and the transmitter, since the covariance matrix specifies the probability distribution of a real-valued, zero-mean, Gaussian random vector completely.
If we deal with the complex channel model, cf. equation (2.26), we have to consider a complex random noise vector as well. It is well known [41] that knowledge of the covariance matrix of a complex-valued, zero-mean, Gaussian random vector is not sufficient for a complete statistical description. Only for a subclass, the so-called rotationally invariant random vectors, does the covariance matrix fully specify the p.d.f. We will treat the theory of complex random vectors in Section 3.1.
In this chapter, it is our goal to calculate the capacity [6] for channels of the form (2.26), (2.27), and (2.28). Since this is the complex channel representation, we have to deal with complex random vectors. As already mentioned in Section 2.3, for rotationally variant Gaussian complex random vectors, mean and covariance matrix are not sufficient for a statistical description. The knowledge of the so-called pseudocovariance matrix¹ is also mandatory [41]. Therefore, the first section (Section 3.1) of this chapter is dedicated to the theory of complex random vectors and presents the main definitions and theorems. Furthermore, the concept of entropy is extended from the real to the complex case, and a Generalized Maximum Entropy Theorem is proved, cf. also [57]. In contrast to previous work [41, 58], this theorem also takes into account the pseudocovariance matrix and strengthens the results known before. With the introduction of a rotationally invariant analogon of a complex random vector, we are able to show that the main statement is independent of the probability distribution of the considered complex random vector.
The reason why we give such a detailed introduction to complex random vectors is that we will show in Section 4.1 that the noise vector $Ew_{n_0}$ in (2.16), and in turn the noise vector $n_{n_0}$ in (2.25), are rotationally variant in general, and therefore the pseudocovariance matrices of these noise vectors should be taken into account. In practical systems, as well as in the literature, this fact is simply neglected, resulting in an unnecessary performance loss.
We will calculate the capacity for three cases: first, for rotationally invariant noise, and second, for rotationally variant noise, where the pseudocovariance matrix is taken into account. For the third case, see also [55], we assume that the noise is rotationally variant but it is erroneously believed to be rotationally invariant, so that the pseudocovariance matrix is neglected. This results in a decreased capacity, and we will calculate the capacity loss. For related literature we also refer to [33, 39, 62].
¹ Note that the add-on pseudo does not tell us anything about the definition of this matrix. This is probably one of the reasons why other appellations, such as complementary covariance matrix (see [34]), are circulating in the literature as well. None of these appellations is really satisfactory in our opinion. In fact, this matrix is the cross-covariance matrix [34] of the considered random vector and its complex conjugate. However, in the subsequent sections we will maintain the widely used terminology pseudocovariance matrix.
$$x = \Re\{x\} + j\,\Im\{x\}, \tag{3.1}$$
where the real and imaginary parts, $\Re\{x\}$ and $\Im\{x\}$, are real random vectors. The expectation (mean) vector of a real random vector is naturally generalized to the complex case as $\mu_x = \mathrm{E}\{x\} = \mathrm{E}\{\Re\{x\}\} + j\,\mathrm{E}\{\Im\{x\}\}$. The statistical properties of $x = \Re\{x\} + j\,\Im\{x\}$ are determined by the joint probability density function (p.d.f.) of the real random vector $\breve{x} \in \mathbb{R}^{2n}$ consisting of its real and imaginary parts, $\breve{x} = \begin{bmatrix} \Re\{x\} \\ \Im\{x\} \end{bmatrix}$. A complex random vector x is said to be Gaussian if the real random vector $\breve{x}$ is Gaussian. Thus, to specify the distribution of a complex Gaussian random vector x, it is necessary to specify the expectation vector $\mu_{\breve{x}} = \mathrm{E}\{\breve{x}\} \in \mathbb{R}^{2n}$ (or, equivalently, $\mu_x \in \mathbb{C}^n$) and the covariance matrix of $\breve{x}$, namely,
$$C_{\breve{x}} = \mathrm{E}\{(\breve{x}-\mu_{\breve{x}})(\breve{x}-\mu_{\breve{x}})^H\} = \mathrm{E}\{(\breve{x}-\mu_{\breve{x}})(\breve{x}-\mu_{\breve{x}})^T\} \in \mathbb{R}^{2n\times 2n}. \tag{3.2}$$
The covariance matrix and the pseudocovariance matrix of a complex random vector $x \in \mathbb{C}^n$ are defined as
$$C_x = \mathrm{E}\{(x-\mu_x)(x-\mu_x)^H\} \tag{3.3}$$
and
$$P_x = \mathrm{E}\{(x-\mu_x)(x-\mu_x)^T\}, \tag{3.4}$$
respectively. A direct calculation yields the decomposition
$$C_{\breve{x}} = \frac{1}{2}\underbrace{\begin{bmatrix} \Re\{C_x\} & -\Im\{C_x\} \\ \Im\{C_x\} & \Re\{C_x\} \end{bmatrix}}_{\breve{C}_x} + \frac{1}{2}\underbrace{\begin{bmatrix} \Re\{P_x\} & \Im\{P_x\} \\ \Im\{P_x\} & -\Re\{P_x\} \end{bmatrix}}_{\widetilde{P}_x}. \tag{3.5}$$
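Decomposition (3.5) is an algebraic identity between second moments, so it holds exactly even for sample estimates. The following sketch checks it on random, rotationally variant samples (the sampling model is a placeholder).

```python
import numpy as np

rng = np.random.default_rng(4)
n, m = 3, 1000
# rotationally variant zero-mean samples: correlate real and imaginary parts
G = rng.standard_normal((2 * n, 2 * n))
xr = G @ rng.standard_normal((2 * n, m))    # stacked real samples [Re; Im]
x = xr[:n] + 1j * xr[n:]

C_breve = (xr @ xr.T) / m                   # covariance of [Re x; Im x]
C = (x @ x.conj().T) / m                    # C_x
P = (x @ x.T) / m                           # P_x

lhs = C_breve
rhs = 0.5 * np.block([[C.real, -C.imag], [C.imag, C.real]]) \
    + 0.5 * np.block([[P.real,  P.imag], [P.imag, -P.real]])
print(np.allclose(lhs, rhs))    # True
```

The second term vanishes exactly when P = 0, i.e., for rotationally invariant vectors, recovering $C_{\breve{x}} = \frac{1}{2}\breve{C}_x$.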
For an affinely transformed random vector $y = Ax + b$ we have
$$C_y = A\,C_x\,A^H \qquad \text{and} \qquad P_y = A\,P_x\,A^T,$$
so that, for a rotationally invariant x,
$$P_{Ax+b} = 0.$$
(note that $\breve{x}$ and $\breve{A}$ can be easily distinguished by considering their domains):
Definition 3.4
$$\breve{\ }: \mathbb{C}^n \to \mathbb{R}^{2n}, \qquad x \mapsto \breve{x} = \begin{bmatrix} \Re\{x\} \\ \Im\{x\} \end{bmatrix},$$
$$\breve{\ }: \mathbb{C}^{n\times m} \to \mathbb{R}^{2n\times 2m}, \qquad A \mapsto \breve{A} = \begin{bmatrix} \Re\{A\} & -\Im\{A\} \\ \Im\{A\} & \Re\{A\} \end{bmatrix}, \qquad \text{and}$$
$$\widetilde{\ }: \mathbb{C}^{n\times m} \to \mathbb{R}^{2n\times 2m}, \qquad A \mapsto \widetilde{A} = \begin{bmatrix} \Re\{A\} & \Im\{A\} \\ \Im\{A\} & -\Re\{A\} \end{bmatrix},$$
so that, for a rotationally invariant random vector, $C_{\breve{x}} = \frac{1}{2}\breve{C}_x$.
These mappings have some remarkable properties, as stated in the next lemmas, propositions, and theorems.²
² Note that we will usually apply the mapping $\breve{A}$ to a covariance matrix C and the mapping $\widetilde{A}$ to a pseudocovariance matrix P (despite the general results about the mapping properties).
Lemma 3.5 The mappings of Definition 3.4 have the following properties:
1. $C = AB \iff \breve{C} = \breve{A}\breve{B}$
2. $C = A + B \iff \breve{C} = \breve{A} + \breve{B}$
3. $C = A^H \iff \breve{C} = \breve{A}^T$
4. $C = A^{-1} \iff \breve{C} = \breve{A}^{-1}$
5. $\det \breve{A} = \left|\det(A)\right|^2 = \det\left(AA^H\right)$, $A \in \mathbb{C}^{n\times n}$
6. $z = x + y \iff \breve{z} = \breve{x} + \breve{y}$
7. $y = Ax \iff \breve{y} = \breve{A}\breve{x}$
8. $\Re\{x^H y\} = \breve{x}^T\breve{y}$
9. $U \in \mathbb{C}^{n\times n}$ is unitary $\iff \breve{U} \in \mathbb{R}^{2n\times 2n}$ is orthonormal
10. $C \in \mathbb{C}^{n\times n}$ is nonnegative definite $\iff \breve{C} \in \mathbb{R}^{2n\times 2n}$ is nonnegative definite
Proof. [58]. The first three statements are immediate. The fourth statement follows from the first and the fact that $\breve{I_n} = I_{2n}$, where $I_n$ denotes the $n\times n$ identity matrix. The fifth statement follows from
$$\det \breve{A} = \det\left(\begin{bmatrix} I_n & jI_n \\ 0 & I_n \end{bmatrix}\breve{A}\begin{bmatrix} I_n & -jI_n \\ 0 & I_n \end{bmatrix}\right) = \det\begin{bmatrix} A & 0 \\ \Im\{A\} & A^* \end{bmatrix} = \det(A)\det(A^*).$$
The statements 6-9 are immediate. The last statement follows from
$$\breve{x}^T\breve{C}\breve{x} = \Re\{x^H C x\} = x^H C x \ge 0,$$
where the statements 7 and 8 were used.
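Two of the less obvious statements of Lemma 3.5, the product rule and the determinant identity, can be spot-checked numerically on random matrices:

```python
import numpy as np

rng = np.random.default_rng(5)
n = 4

def breve(A):
    """The mapping A -> [[Re A, -Im A], [Im A, Re A]] of Definition 3.4."""
    return np.block([[A.real, -A.imag], [A.imag, A.real]])

A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
B = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))

print(np.allclose(breve(A @ B), breve(A) @ breve(B)))                  # True
print(np.allclose(np.linalg.det(breve(A)), abs(np.linalg.det(A))**2))  # True
```

In particular, $\det \breve{A}$ is always nonnegative, which is consistent with statement 5.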
The following lemma provides product rules that involve both the mappings $\breve{A}$ and $\widetilde{A}$:³
Lemma 3.6
1. $C = AB^* \iff \breve{C} = \widetilde{A}\widetilde{B}$
2. $C = AB^* \iff \widetilde{C} = \widetilde{A}\breve{B}$
3. $C = AB \iff \widetilde{C} = \breve{A}\widetilde{B}$
Proof. The first two statements follow from
$$AB^* = \Re\{A\}\Re\{B\} + \Im\{A\}\Im\{B\} + j\left(\Im\{A\}\Re\{B\} - \Re\{A\}\Im\{B\}\right)$$
and
$$\widetilde{A}\breve{B} = \begin{bmatrix} \Re\{A\}\Re\{B\} + \Im\{A\}\Im\{B\} & \Im\{A\}\Re\{B\} - \Re\{A\}\Im\{B\} \\ \Im\{A\}\Re\{B\} - \Re\{A\}\Im\{B\} & -\Re\{A\}\Re\{B\} - \Im\{A\}\Im\{B\} \end{bmatrix},$$
together with the analogous calculation for $\widetilde{A}\widetilde{B}$. The third statement follows in the same manner.
³ In contrast to Lemma 3.5, where the mappings $\breve{x}$ and $\breve{A}$ were studied.
Proposition 3.7 If $\lambda$ is an eigenvalue of $\widetilde{A}$, then $-\lambda$ is an eigenvalue of $\widetilde{A}$ as well.
Proof. We have
$$\begin{bmatrix} 0 & I_n \\ -I_n & 0 \end{bmatrix}^{-1}\widetilde{A}\begin{bmatrix} 0 & I_n \\ -I_n & 0 \end{bmatrix} = \begin{bmatrix} 0 & -I_n \\ I_n & 0 \end{bmatrix}\begin{bmatrix} \Re\{A\} & \Im\{A\} \\ \Im\{A\} & -\Re\{A\} \end{bmatrix}\begin{bmatrix} 0 & I_n \\ -I_n & 0 \end{bmatrix} = -\widetilde{A},$$
and therefore,
$$\det\left(\widetilde{A} - \lambda I_{2n}\right) = (-1)^{2n}\det\left(\widetilde{A} + \lambda I_{2n}\right) = \det\left(\widetilde{A} + \lambda I_{2n}\right),$$
which implies the statement.
For a symmetric (not Hermitian) $A \in \mathbb{C}^{n\times n}$, i.e., $A^T = A$, we are now able to express all eigenvalues of $\widetilde{A}$ in terms of the original matrix A:
Theorem 3.8 Let $A \in \mathbb{C}^{n\times n}$ denote a symmetric matrix, i.e., $A^T = A$. Then all eigenvalues of $\widetilde{A}$ are real-valued, and the nonnegative eigenvalues⁴ of $\widetilde{A}$ are the singular values [19] of A.
Proof. Since A is a symmetric matrix, $\Re\{A\}$ and $\Im\{A\}$ are also symmetric matrices. Therefore $\widetilde{A}$ is a symmetric matrix as well, which implies that its eigenvalues are real-valued.
Due to the first statement of Lemma 3.6 (setting $B = A^T = A$) we have
$$\widetilde{A}\,\widetilde{A} = \breve{AA^H},$$
and consequently,
$$\det\left(\widetilde{A}\,\widetilde{A} - \lambda I_{2n}\right) = \left|\det\left(AA^H - \lambda I_n\right)\right|^2.$$
The eigenvalues of $\widetilde{A}\,\widetilde{A}$ are the squared eigenvalues of $\widetilde{A}$ and are equal to the eigenvalues of $AA^H$, which are the squared singular values of A.
⁴ According to Proposition 3.7 we then have knowledge of all (also the negative) eigenvalues.
The previous result gives us knowledge about the eigenvalues of $\widetilde{A}$ for a symmetric matrix A. But what about its eigenvectors? Can we obtain an analogous result? This question is mainly of theoretical interest, since Theorem 3.8 together with Proposition 3.7 is sufficient to prove the subsequent results. However, in order to get a deeper understanding of the underlying structure, we will present another proof of Theorem 3.8, which gives insight into the eigenvector problem as well. The proof is strongly related to the not so well known Takagi factorization [24] that applies to symmetric complex matrices:
Theorem 3.9 (Takagi factorization) Let $A \in \mathbb{C}^{n\times n}$ denote a symmetric matrix, i.e., $A^T = A$. Then there exist a unitary matrix $Q \in \mathbb{C}^{n\times n}$ and a real-valued, diagonal matrix $\Sigma \in \mathbb{R}^{n\times n}$ with nonnegative entries, i.e.,
$$\Sigma = \begin{bmatrix} \sigma_1 & & 0 \\ & \ddots & \\ 0 & & \sigma_n \end{bmatrix}, \qquad \sigma_i \ge 0,\ i = 1, \dots, n,$$
such that
$$A = Q\Sigma Q^T. \tag{3.7}$$
Since (3.7) implies $AA^H = Q\Sigma^2 Q^H$, the diagonal entries of $\Sigma$ are the singular values of A.
Corollary 3.10 Let $A \in \mathbb{C}^{n\times n}$ denote a symmetric matrix, i.e., $A^T = A$. Then
$$\widetilde{A} = \breve{Q}\,\widetilde{\Sigma}\,\breve{Q}^T, \tag{3.8}$$
where Q and $\Sigma$ are as in Theorem 3.9; in particular, the diagonal entries of $\Sigma$ are the singular values of A.
Proof.
$$\widetilde{A} = \widetilde{Q\Sigma Q^T} = \breve{Q}\,\widetilde{\Sigma Q^T} \tag{3.9}$$
$$\phantom{\widetilde{A}} = \breve{Q}\,\widetilde{\Sigma}\,\breve{Q^H} \tag{3.10}$$
$$\phantom{\widetilde{A}} = \breve{Q}\,\widetilde{\Sigma}\,\breve{Q}^T, \tag{3.11}$$
where (3.9) and (3.10) are consequences of statements 3 and 2 of Lemma 3.6, respectively, and (3.11) follows from the third statement of Lemma 3.5.
We want to emphasize that factorization (3.8) represents the eigenvalue decomposition of $\widetilde{A}$. Note that the diagonal eigenvalue matrix $\widetilde{\Sigma}$ has the structure
$$\widetilde{\Sigma} = \begin{bmatrix} \Sigma & 0 \\ 0 & -\Sigma \end{bmatrix} = \begin{bmatrix} \sigma_1 & & & & \\ & \ddots & & & \\ & & \sigma_n & & \\ & & & -\sigma_1 & \\ & & & & \ddots \end{bmatrix},$$
with the $\sigma_i$ being the singular values of A. It is very remarkable that the orthonormal eigenvector matrix is of the form $\breve{Q}$ for a unitary matrix Q.
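Takagi's factorization is not available as a library routine in NumPy, but for generic matrices (distinct, nonzero singular values) it can be computed from the eigendecomposition of $AA^H$ by fixing the phase of each eigenvector. The following is one such construction, a sketch rather than a numerically hardened implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

def takagi(A):
    """Takagi factorization A = Q diag(s) Q^T of a complex symmetric A.
    Sketch for the generic case of distinct, nonzero singular values."""
    n = A.shape[0]
    w, U = np.linalg.eigh(A @ A.conj().T)     # eigenvalues of A A^H, ascending
    s = np.sqrt(np.clip(w[::-1], 0.0, None))  # singular values, descending
    U = U[:, ::-1]
    Q = np.zeros_like(U)
    for i in range(n):
        u = U[:, i]
        v = A @ u.conj()                      # A u* = s_i e^{j theta} u
        phase = (u.conj() @ v) / s[i]         # e^{j theta}
        Q[:, i] = np.sqrt(phase) * u          # absorb half of the phase
    return Q, s

M = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
A = M + M.T                                   # complex symmetric: A^T = A
Q, s = takagi(A)
print(np.allclose(Q @ np.diag(s) @ Q.T, A))      # True
print(np.allclose(Q.conj().T @ Q, np.eye(4)))    # True
```

The key step is that, for symmetric A, the antilinear map $u \mapsto Au^*$ leaves each eigenspace of $AA^H$ invariant, so rotating each eigenvector by half the resulting phase yields the Takagi vectors.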
Corollary 3.11 Let $P_x = \mathrm{E}\{(x-\mu_x)(x-\mu_x)^T\}$ denote the pseudocovariance matrix of a complex random vector $x \in \mathbb{C}^n$. Then there exist a unitary matrix $Q_x \in \mathbb{C}^{n\times n}$ and a real-valued, diagonal matrix $\Sigma_x \in \mathbb{R}^{n\times n}$ with nonnegative entries, such that
$$\widetilde{P}_x = \breve{Q}_x\,\widetilde{\Sigma}_x\,\breve{Q}_x^T. \tag{3.12}$$
The diagonal entries of $\Sigma_x$ are the singular values of $P_x$.
Proof. $P_x \in \mathbb{C}^{n\times n}$ and $P_x^T = P_x$.
Until now, we were able to express the eigenvalues of $\widetilde{P}_x$ in terms of $P_x$. We will see later that it is very important to find a similar relation for the determinant of $C_{\breve{x}}$. Due to decomposition (3.5), we can expect that $\det\left(C_{\breve{x}}\right)$ depends on the covariance matrix $C_x$ and the pseudocovariance matrix $P_x$. In order to solve this problem, we need the following
Definition 3.12 A matrix $B \in \mathbb{C}^{n\times n}$ is called generalized Cholesky factor of a positive definite Hermitian matrix $A \in \mathbb{C}^{n\times n}$ if it satisfies the equation
$$A = BB^H.$$
Lemma 3.14 Suppose the complex random vector $x \in \mathbb{C}^n$ has a nonsingular covariance matrix $C_x$ and a pseudocovariance matrix $P_x$, and let $B_x$ denote a generalized Cholesky factor of $C_x$. Then
$$\det\left(C_{\breve{x}}\right) = 2^{-2n}\left(\det C_x\right)^2 \prod_{\lambda_i > 0}\left(1 - \lambda_i^2\right),$$
where the $\lambda_i$ are the singular values of $B_x^{-1}P_xB_x^{-T}$.
Proof. With $y = B_x^{-1}x$ we have $C_{\breve{x}} = \breve{B}_x\, C_{\breve{y}}\, \breve{B}_x^T$, and therefore
$$\det\left(C_{\breve{x}}\right) = \det\left(\breve{B}_x\, C_{\breve{y}}\, \breve{B}_x^T\right) = 2^{-2n}\left(\det C_x\right)^2 \det\left(I_{2n} + \widetilde{P}_y\right),$$
and, applying Corollary 3.11,
$$\det\left(C_{\breve{x}}\right) = 2^{-2n}\left(\det C_x\right)^2 \det\left(I_{2n} + \widetilde{\Sigma}_y\right) = 2^{-2n}\left(\det C_x\right)^2 \prod_{\lambda_i > 0}\left(1 - \lambda_i^2\right),$$
where the product is over all positive eigenvalues (counted with multiplicity) of $\widetilde{P}_{B_x^{-1}x}$, or, equivalently (Corollary 3.11), over all (nonzero) singular values of $P_{B_x^{-1}x}$.
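The determinant formula is again an exact algebraic identity between second moments, so it can be verified directly on sample estimates (the sampling model below is a placeholder):

```python
import numpy as np

rng = np.random.default_rng(6)
n, m = 3, 1000
M = rng.standard_normal((2 * n, 2 * n))
xr = M @ rng.standard_normal((2 * n, m))    # stacked real samples [Re; Im]
x = xr[:n] + 1j * xr[n:]

C_breve = (xr @ xr.T) / m
C = (x @ x.conj().T) / m                    # covariance C_x (Hermitian pd)
P = (x @ x.T) / m                           # pseudocovariance P_x

B = np.linalg.cholesky(C)                   # generalized Cholesky factor
lam = np.linalg.svd(np.linalg.inv(B) @ P @ np.linalg.inv(B.T),
                    compute_uv=False)       # singular values lambda_i
lhs = np.linalg.det(C_breve)
rhs = 2.0**(-2 * n) * np.linalg.det(C).real**2 * np.prod(1 - lam**2)
print(np.allclose(lhs, rhs))    # True
```

Note that `np.linalg.inv(B.T)` is the plain (unconjugated) transpose inverse $B^{-T}$, matching the formula.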
Furthermore, the concept of generalized Cholesky factors enables us to formulate a criterion for a matrix to be a pseudocovariance matrix:
Theorem 3.15 Let $C \in \mathbb{C}^{n\times n}$ be a Hermitian positive definite matrix and B a generalized Cholesky factor of C. C and a matrix $P \in \mathbb{C}^{n\times n}$ are covariance matrix and pseudocovariance matrix of a complex random vector, respectively, if and only if P is symmetric and the singular values of $B^{-1}PB^{-T}$ are less than or equal to 1.
Proof. We will first prove the necessary condition, i.e., that P is symmetric and that the singular values of $B^{-1}PB^{-T}$ are less than or equal to 1, if C and P are covariance and pseudocovariance matrix of a complex random vector, respectively. Let us take a look at the linearly transformed random vector $y = B^{-1}x$, where x denotes the complex random vector with covariance matrix C and pseudocovariance matrix P. Obviously, y has a covariance matrix $C_y = I_n$ and a pseudocovariance matrix $P_y = P_{B^{-1}x} = B^{-1}PB^{-T}$. From
$$C_{\breve{y}} = \frac{1}{2}\breve{C}_y + \frac{1}{2}\widetilde{P}_y = \frac{1}{2}\left(I_{2n} + \widetilde{B^{-1}PB^{-T}}\right)$$
and the fact that $C_{\breve{y}}$ is nonnegative definite, we conclude, using Corollary 3.11, that the singular values of $B^{-1}PB^{-T}$ are less than or equal to 1. Note that a pseudocovariance matrix is symmetric by definition.
In order to prove that P being symmetric and the singular values of $B^{-1}PB^{-T}$ being less than or equal to 1 is also a sufficient condition for C and P to be covariance and pseudocovariance matrix of a complex random vector, respectively, we
have to construct a complex random vector with covariance matrix C and pseudocovariance matrix P. Define $C_1 = I_n$ and $P_1 = B^{-1}PB^{-T}$, and consider
$$\frac{1}{2}\breve{C}_1 + \frac{1}{2}\widetilde{P}_1 = \frac{1}{2}\left(I_{2n} + \widetilde{B^{-1}PB^{-T}}\right),$$
which is a symmetric and nonnegative definite (apply Corollary 3.11) matrix. Therefore, there exists⁵ a complex random vector $y \in \mathbb{C}^n$, such that
$$\breve{y} = \begin{bmatrix} \Re\{y\} \\ \Im\{y\} \end{bmatrix} \in \mathbb{R}^{2n}$$
has the covariance matrix $C_{\breve{y}} = \frac{1}{2}\breve{C}_1 + \frac{1}{2}\widetilde{P}_1$. By construction, y has covariance matrix $C_y = C_1$ and pseudocovariance matrix $P_y = P_1$ (cf. decomposition (3.5)). The complex random vector $x = By$ has a covariance matrix $C_x = C$ and a pseudocovariance matrix $P_x = P$.
Corollary 3.16 Suppose the complex random vector $x \in \mathbb{C}^n$ has a nonsingular covariance matrix $C_x$ and a pseudocovariance matrix $P_x$. Then $C_{\breve{x}}$ is nonsingular if and only if all singular values of $B_x^{-1}P_xB_x^{-T}$, $B_x$ being a generalized Cholesky factor of $C_x$, are smaller than 1.
Proof. Apply Theorem 3.15 and Lemma 3.14.
We conclude this subsection with the following useful theorem regarding complex random vectors of dimension 1:
Theorem 3.17 Let $x \in \mathbb{C}$ denote a 1-dimensional random vector with the 1-dimensional covariance and pseudocovariance⁶ matrices
$$C_x = [C_x] \qquad \text{and} \qquad P_x = [P_x] = \left[r_x e^{j\varphi_x}\right], \qquad r_x, \varphi_x \in \mathbb{R},$$
respectively. Then the covariance matrix of the equivalent 2-dimensional real random vector $\breve{x}$ is given by
$$C_{\breve{x}} = \frac{1}{2}\begin{bmatrix} C_x + r_x\cos\varphi_x & r_x\sin\varphi_x \\ r_x\sin\varphi_x & C_x - r_x\cos\varphi_x \end{bmatrix} \tag{3.13}$$
and has an eigenvalue decomposition
$$C_{\breve{x}} = U\begin{bmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{bmatrix}U^T \tag{3.14}$$
with
$$\lambda_1 = \frac{C_x + r_x}{2}, \qquad \lambda_2 = \frac{C_x - r_x}{2}, \tag{3.15}$$
and
$$U = \begin{bmatrix} \cos\frac{\varphi_x}{2} & -\sin\frac{\varphi_x}{2} \\ \sin\frac{\varphi_x}{2} & \cos\frac{\varphi_x}{2} \end{bmatrix}. \tag{3.16}$$
Proof. (3.13) is a consequence of (3.5). For (3.14) with (3.15) and (3.16), observe that
$$C_{\breve{x}} = \frac{1}{2}\begin{bmatrix} C_x & 0 \\ 0 & C_x \end{bmatrix} + \frac{r_x}{2}\underbrace{\begin{bmatrix} \cos\varphi_x & \sin\varphi_x \\ \sin\varphi_x & -\cos\varphi_x \end{bmatrix}}_{U\begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}U^T}.$$
Note that we do not require $r_x \ge 0$, i.e., we explicitly allow ambiguities of $\varphi_x$ in this relaxed polar-coordinates representation of $P_x$.
⁵ In fact, there are infinitely many such vectors, since covariance and pseudocovariance matrices are invariant under (deterministic, vector-valued) translations of random vectors.
⁶ For rotationally invariant vectors, $\varphi_x$ can be any number.
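Theorem 3.17 can be checked directly: for a scalar x with covariance $C_x$ and pseudocovariance $r e^{j\varphi}$, the covariance of $[\Re x;\ \Im x]$ has eigenvalues $(C_x \pm r)/2$, and the noise ellipse's principal axis is rotated by $\varphi/2$. The example values below are arbitrary.

```python
import numpy as np

C, r, phi = 2.0, 0.8, 1.1          # example values with r < C
P = r * np.exp(1j * phi)

C_breve = 0.5 * np.array([[C + P.real,  P.imag],
                          [P.imag,      C - P.real]])   # cf. (3.13)
w = np.linalg.eigvalsh(C_breve)
print(np.allclose(np.sort(w), np.sort([(C - r) / 2, (C + r) / 2])))  # True

# eigenvector matrix: rotation by half the pseudocovariance phase, cf. (3.16)
U = np.array([[np.cos(phi / 2), -np.sin(phi / 2)],
              [np.sin(phi / 2),  np.cos(phi / 2)]])
D = np.diag([(C + r) / 2, (C - r) / 2])
print(np.allclose(U @ D @ U.T, C_breve))   # True
```

This is the mechanism behind the noise-ellipse rotation angles discussed in this work: they are set by the phase of the pseudocovariance, not by the noise power.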
$$h(x) = h(\breve{x}) = -\int_{\mathrm{supp}\{f_{\breve{x}}\}} f_{\breve{x}}(\breve{x})\,\log f_{\breve{x}}(\breve{x})\,d\breve{x}. \tag{3.18}$$
Let $\breve{g}$ denote a zero-mean, Gaussian distributed random vector with covariance matrix $C_{\breve{x}}$. Observe that, writing $\breve{x} = [\breve{x}(1)\ \cdots\ \breve{x}(2n)]^T$,
$$\int_{\mathbb{R}^{2n}} f_{\breve{x}}(\breve{x})\,\breve{x}(i)\,\breve{x}(j)\,d\breve{x} = \int_{\mathbb{R}^{2n}} f_{\breve{g}}(\breve{x})\,\breve{x}(i)\,\breve{x}(j)\,d\breve{x}, \qquad 1 \le i, j \le 2n,$$
since $\breve{x}$ and $\breve{g}$ have the same second-order moments.
Then,
$$\begin{aligned} h(\breve{x}) - h(\breve{g}) &= -\int_{\mathrm{supp}\{f_{\breve{x}}\}} f_{\breve{x}}(\breve{x})\log f_{\breve{x}}(\breve{x})\,d\breve{x} + \int_{\mathbb{R}^{2n}} f_{\breve{g}}(\breve{x})\log f_{\breve{g}}(\breve{x})\,d\breve{x} \\ &= -\int_{\mathrm{supp}\{f_{\breve{x}}\}} f_{\breve{x}}(\breve{x})\log f_{\breve{x}}(\breve{x})\,d\breve{x} + \int_{\mathrm{supp}\{f_{\breve{x}}\}} f_{\breve{x}}(\breve{x})\log f_{\breve{g}}(\breve{x})\,d\breve{x} \\ &= \int_{\mathrm{supp}\{f_{\breve{x}}\}} f_{\breve{x}}(\breve{x})\log\frac{f_{\breve{g}}(\breve{x})}{f_{\breve{x}}(\breve{x})}\,d\breve{x} \\ &\le \frac{1}{\ln 2}\int_{\mathrm{supp}\{f_{\breve{x}}\}} f_{\breve{x}}(\breve{x})\left(\frac{f_{\breve{g}}(\breve{x})}{f_{\breve{x}}(\breve{x})} - 1\right)d\breve{x} \\ &\le \frac{1}{\ln 2}\left(\int_{\mathrm{supp}\{f_{\breve{x}}\}} f_{\breve{g}}(\breve{x})\,d\breve{x} - 1\right) \\ &\le \frac{1}{\ln 2}(1 - 1) = 0, \end{aligned}$$
so that $h(\breve{x}) \le h(\breve{g})$. Moreover, using Lemma 3.14,
$$\begin{aligned} h(\breve{g}) = \frac{1}{2}\log\det\left(2\pi e\,C_{\breve{x}}\right) &= \frac{1}{2}\log\left((2\pi e)^{2n}\right) + \frac{1}{2}\log\det C_{\breve{x}} \\ &= n\log(2\pi e) + \frac{1}{2}\log 2^{-2n} + \frac{1}{2}\log\left(\det C_x\right)^2 + \frac{1}{2}\sum_i \log\left(1-\lambda_i^2\right) \\ &= \log\det\left(\pi e\,C_x\right) + \underbrace{\frac{1}{2}\sum_i \log\left(1-\lambda_i^2\right)}_{\le 0}, \end{aligned}$$
where the $\lambda_i$ are the singular values of $B_x^{-1}P_xB_x^{-T}$. This, together with Definition 3.18, Theorem 3.20, and equation (3.17), implies the theorem.
Note that Theorem 3.19 is a corollary of Theorem 3.21. Therefore, Theorem
3.21 is really a generalization of Theorem 3.19.
The Generalized Maximum Entropy Theorem (Theorem 3.21) compares two complex random vectors: the original one and its Gaussian distributed counterpart, i.e., a complex random vector with the same covariance matrix and pseudocovariance matrix but with a Gaussian distribution. The differential entropy of this Gaussian random vector is equal to the differential entropy of a Gaussian random vector that is rotationally invariant.
Definition 3.22 A mapping
$$\mathcal{A}: \mathbb{C}^n \to \mathbb{C}^m, \qquad x \mapsto \mathcal{A}(x),$$
is called widely affine transformation or widely affine mapping if there exist two matrices $A_1, A_2 \in \mathbb{C}^{m\times n}$ and a vector $b \in \mathbb{C}^m$, such that
$$\mathcal{A}(x) = A_1 x + A_2 x^* + b, \qquad x \in \mathbb{C}^n.$$
Note that this definition is a modification of the definition of a widely linear transformation / mapping that can be found in the literature (see e.g. [33, 44]). The difference is the additional translation vector b.
A widely affine transformation can be equivalently described by an affine transformation on the real and imaginary part level:
Theorem 3.23 A mapping
$$\mathcal{A}: \mathbb{C}^n \to \mathbb{C}^m, \qquad x \mapsto \mathcal{A}(x),$$
is a widely affine transformation if and only if there exist a matrix $A \in \mathbb{R}^{2m\times 2n}$ and a vector $b \in \mathbb{C}^m$, such that
$$\breve{\mathcal{A}(x)} = A\breve{x} + \breve{b}, \qquad x \in \mathbb{C}^n.$$
Proof. We have
$$\breve{\mathcal{A}(x)} = \breve{A_1x} + \breve{A_2x^*} + \breve{b} = \breve{A}_1\breve{x} + \breve{A}_2\begin{bmatrix} I_n & 0 \\ 0 & -I_n \end{bmatrix}\breve{x} + \breve{b} = \breve{A}_1\breve{x} + \widetilde{A}_2\breve{x} + \breve{b} \tag{3.19}$$
$$= \underbrace{\begin{bmatrix} \Re\{A_1 + A_2\} & \Im\{A_2 - A_1\} \\ \Im\{A_1 + A_2\} & \Re\{A_1 - A_2\} \end{bmatrix}}_{A}\breve{x} + \breve{b}, \tag{3.20}$$
which not only shows how to calculate A from $A_1$ and $A_2$, i.e., from (3.19),
$$A = \breve{A}_1 + \widetilde{A}_2,$$
but, conversely, also how to calculate $A_1$ and $A_2$ from A, i.e., from (3.20),
$$\begin{aligned} \Re\{A_1\} &= \frac{1}{2}\Re\{A_1 + A_2\} + \frac{1}{2}\Re\{A_1 - A_2\}, \\ \Im\{A_1\} &= \frac{1}{2}\Im\{A_1 + A_2\} - \frac{1}{2}\Im\{A_2 - A_1\}, \\ \Re\{A_2\} &= \frac{1}{2}\Re\{A_1 + A_2\} - \frac{1}{2}\Re\{A_1 - A_2\}, \\ \Im\{A_2\} &= \frac{1}{2}\Im\{A_1 + A_2\} + \frac{1}{2}\Im\{A_2 - A_1\}, \end{aligned}$$
where the blocks of A are identified according to (3.20). The vector b of the present theorem is identical to the vector b of Definition 3.22.
Now we return to our original problem. Suppose we are given a rotationally variant random vector. Is it possible to associate with it, in a canonical way, a rotationally invariant random vector that behaves like the given random vector with the exception of being rotationally invariant? It is natural to demand that the associated random vector must have a distribution law that is as close as possible to the distribution law of the given random vector. For a given Gaussian random vector everything is clear: the associated random vector will be Gaussian as well and is fully specified by the mean vector, the covariance matrix, and rotational invariance. But what about a random vector that is not Gaussian?
Let us look again at the Gaussian case. On the real and imaginary part level we deal with two Gaussian random vectors with the same mean vector but with different covariance matrices (in the complex domain both random vectors have the same covariance matrix, but one random vector has a vanishing pseudocovariance matrix whereas the other has a nonvanishing pseudocovariance matrix). Recalling that affine transformations of Gaussian random vectors are again Gaussian (see e.g. [34]), we can view one vector as an affine transformed version of the other. It is a consequence of the theory of generalized Cholesky factors that it is possible, cf. also Theorem 3.27, to construct an affine transformation (on the real and imaginary part level, or a widely affine transformation on the complex level) that transforms one covariance matrix into another. This observation, together with the fact that a (widely) affine transformation preserves the character of a random vector also in the non-Gaussian case, is used for the definition of a canonically associated rotationally invariant random vector to a given rotationally variant random vector:
Definition 3.24 Let $y \in \mathbb{C}^n$ denote a complex random vector with mean vector $\mu_y$, covariance matrix $C_y$, and pseudocovariance matrix $P_y$. A complex random vector $x \in \mathbb{C}^n$ is called rotationally invariant analogon of y if its mean vector, covariance matrix, and pseudocovariance matrix satisfy
$$\mu_x = \mu_y, \qquad C_x = C_y, \qquad P_x = 0,$$
and if there exists a widely affine transformation
$$\mathcal{A}: \mathbb{C}^n \to \mathbb{C}^n, \qquad x \mapsto \mathcal{A}(x),$$
such that $\mathcal{A}(x)$ and y are identically distributed.
In the following we will deal with the existence of a rotationally invariant analogon, i.e., with the existence of the widely affine transformation of Definition 3.24.
First of all observe that invertible affine transformations do not change any existence statements regarding rotationally invariant analogons:
Proposition 3.25 Suppose the random vector x C n is a rotationally invariant
analogon of the random vector y C n . Then, for any invertible matrix M
C nn and any vector c C n (both deterministic), the random vector Mx + c is
a rotationally invariant analogon of the random vector My + c.
Proof. We have, cf. Proposition 3.1,
$$\mu_{Mx+c} = M\mu_x + c = M\mu_y + c = \mu_{My+c},$$
$$C_{Mx+c} = MC_xM^H = MC_yM^H = C_{My+c},$$
$$P_{Mx+c} = MP_xM^T = 0,$$
so that it remains to show the existence of the widely affine transformation of Definition 3.24. From Definition 3.24, there exists a widely affine transformation
$$\mathcal{A}: \mathbb{C}^n \to \mathbb{C}^n, \qquad x \mapsto \mathcal{A}(x),$$
such that $\mathcal{A}(x)$ and y are identically distributed. Then the mapping
$$\mathcal{A}_0: \mathbb{C}^n \to \mathbb{C}^n,$$
defined by
$$\mathcal{A}_0(x_0) = M\mathcal{A}\!\left(M^{-1}x_0 - M^{-1}c\right) + c, \qquad x_0 \in \mathbb{C}^n,$$
is widely affine, and $\mathcal{A}_0(Mx+c)$ and $My+c$ are identically distributed.
Suppose the complex random vector $y \in \mathbb{C}^n$ has covariance matrix and pseudocovariance matrix
$$C_y = I_n \qquad \text{and} \qquad P_y = \begin{bmatrix} \pi_1 & & 0 \\ & \ddots & \\ 0 & & \pi_n \end{bmatrix} \quad \text{with } 0 \le \pi_i < 1,$$
respectively. Then there exists a rotationally invariant analogon x of y.
Proof. Due to decomposition (3.5), the covariance matrix of $\breve{y}$ is
$$C_{\breve{y}} = \frac{1}{2}\begin{bmatrix} 1+\pi_1 & & & & & \\ & \ddots & & & & \\ & & 1+\pi_n & & & \\ & & & 1-\pi_1 & & \\ & & & & \ddots & \\ & & & & & 1-\pi_n \end{bmatrix}.$$
Define
$$A = \begin{bmatrix} \sqrt{1+\pi_1} & & & & & \\ & \ddots & & & & \\ & & \sqrt{1+\pi_n} & & & \\ & & & \sqrt{1-\pi_1} & & \\ & & & & \ddots & \\ & & & & & \sqrt{1-\pi_n} \end{bmatrix}$$
and the random vector x via $\breve{x} = A^{-1}\left(\breve{y} - \breve{\mu}_y\right) + \breve{\mu}_y$. Then $C_{\breve{x}} = A^{-1}C_{\breve{y}}A^{-T} = \frac{1}{2}I_{2n}$, i.e., x is rotationally invariant with $\mu_x = \mu_y$, $C_x = I_n = C_y$, and $P_x = 0$. Since $\breve{y} = A\left(\breve{x} - \breve{\mu}_y\right) + \breve{\mu}_y$ corresponds to a widely affine transformation (Theorem 3.23), x is a rotationally invariant analogon of y.
Note that $\pi_i < 1$ is mandatory for the validity of this proof, because otherwise the inverse $A^{-1}$ would not exist. However, if we are given a rotationally invariant random vector x (with $C_{\breve{x}} = \frac{1}{2}I_{2n}$) and consider the random vector y that is obtained from x via the same widely affine transformation $y = \mathcal{A}(x)$ as in the proof, then x is a rotationally invariant analogon of y also in the case $\pi_i = 1$ ($i \in \{1, \dots, n\}$). This fact is very natural and is the reason why the widely affine transformation of Definition 3.24 is chosen to be a mapping from the rotationally invariant random vector to the rotationally variant random vector and not from the rotationally variant random vector to the rotationally invariant one.
With the introduced framework we are now in the position to formulate the promised theorem that quantifies the deviation of the differential entropy of a rotationally variant random vector from the differential entropy of its rotationally invariant analogon¹¹ independently of the actual probability distribution:
Theorem 3.28 Suppose the complex random vector $y \in \mathbb{C}^n$ has a nonsingular covariance matrix $C_y$ and a pseudocovariance matrix $P_y$. Let $B_y$ be a generalized Cholesky factor of $C_y$ and let $\lambda_i$ denote the singular values of $B_y^{-1}P_yB_y^{-T}$, which must be smaller than 1. Furthermore, let x denote a rotationally invariant analogon of y (which exists according to Theorem 3.27). Then the differential entropy of y satisfies
$$h(y) = h(x) + \underbrace{\frac{1}{2}\sum_i \log\left(1-\lambda_i^2\right)}_{\le 0}.$$
Proof. From Definition 3.24 and Theorem 3.23 we know that there exist a deterministic matrix $A \in \mathbb{R}^{2n\times 2n}$ and a deterministic vector $b \in \mathbb{C}^n$, such that $\breve{y}$ and $A\breve{x} + \breve{b}$ are identically distributed. This implies
$$C_{\breve{y}} = AC_{\breve{x}}A^T,$$
so that, applying Lemma 3.14 to both sides and using $C_x = C_y$ together with $P_x = 0$,
$$2^{-2n}\left(\det C_y\right)^2 \prod_i \left(1-\lambda_i^2\right) = 2^{-2n}\left(\det C_x\right)^2 \left(\det A\right)^2,$$
i.e.,
$$\left|\det A\right| = \sqrt{\prod_i \left(1-\lambda_i^2\right)}.$$
The final result follows from the well known (see e.g. [6]) transformation rule
$$h\!\left(A\breve{x} + \breve{b}\right) = \log\left|\det A\right| + h\!\left(\breve{x}\right)$$
and the definition of the differential entropy for complex random vectors (Definition 3.18).
¹¹ To be more precise, this should read "... differential entropies of all of its rotationally invariant analogons ...", but it is a consequence of Theorem 3.28 that these entropies are all equal.
Note that we can reobtain the Generalized Maximum Entropy Theorem (Theorem
3.21) from the previous theorem, if we apply the conventional Maximum Entropy
Theorem (Theorem 3.19) to the rotationally invariant analogon x of y.
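For Gaussian vectors, both entropies in Theorem 3.28 are available in closed form, so the stated entropy difference can be verified numerically. The sampling model below is a placeholder; `h_real` is a hypothetical helper computing the Gaussian differential entropy in bits.

```python
import numpy as np

rng = np.random.default_rng(7)
n, m = 3, 1000
M = rng.standard_normal((2 * n, 2 * n))
yr = M @ rng.standard_normal((2 * n, m))     # rotationally variant samples
y = yr[:n] + 1j * yr[n:]

C_breve = (yr @ yr.T) / m
C = (y @ y.conj().T) / m
P = (y @ y.T) / m

def h_real(Cov):
    """Differential entropy (bits) of a real Gaussian with covariance Cov."""
    return 0.5 * np.log2(np.linalg.det(2 * np.pi * np.e * Cov))

h_y = h_real(C_breve)                         # h(y) = h(y_breve)
h_x = h_real(0.5 * np.block([[C.real, -C.imag],   # analogon: same C, P = 0
                             [C.imag,  C.real]]))

B = np.linalg.cholesky(C)
lam = np.linalg.svd(np.linalg.inv(B) @ P @ np.linalg.inv(B.T),
                    compute_uv=False)
print(np.allclose(h_y, h_x + 0.5 * np.sum(np.log2(1 - lam**2))))  # True
```

Because the sample covariance of the stacked vector is nonsingular, Corollary 3.16 guarantees that all singular values are below 1 and the logarithms are finite.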
Using the inequality $\ln x \le x - 1$, one easily finds a lower and an upper bound for the difference between the differential entropy of a random vector and the differential entropy of (all of) its rotationally invariant analogon(s):
Corollary 3.29 Suppose the complex random vector $y \in \mathbb{C}^n$ has a nonsingular covariance matrix $C_y$ and a pseudocovariance matrix $P_y$. Let $B_y$ be a generalized Cholesky factor of $C_y$ and let $d_{\max}$ denote the largest eigenvalue of $P_{y'}P_{y'}^H$ ($y' = B_y^{-1}y$), which must be smaller than 1. Furthermore, let x denote a rotationally invariant analogon of y. Then,
$$\frac{\operatorname{tr}\left(P_{y'}P_{y'}^H\right)}{2\ln 2} \le h(x) - h(y) \le \frac{\operatorname{tr}\left(P_{y'}P_{y'}^H\right)}{2\ln 2\,\left(1 - d_{\max}\right)}.$$
Observe that the deviation of the differential entropy from the ideal rotationally invariant differential entropy is approximately determined by the trace and the largest eigenvalue of $P_{y'}P_{y'}^H$.
In the previous theorems and corollaries, we have the requirement that the covariance matrix is nonsingular and that the singular (eigen-) values of a certain matrix are smaller than one. These assumptions are mandatory for the validity of the statements. They are not merely technical assumptions. They ensure that the occurring expressions have finite values. Otherwise, the considered random vector, or at least a transformed version of it, will have deterministic real and / or imaginary parts of certain elements (almost everywhere). Even if we allowed infinite values, we could not guarantee that the statements remain valid. Note that we can easily construct two random vectors, one being Gaussian and the other being non-Gaussian, that have deterministic real and / or imaginary parts of certain elements, so that their differential entropies are both $-\infty$, i.e., both entropies are equal. Hence, this can serve as a counterexample for the (Generalized) Maximum Entropy Theorem, if the mentioned requirements are loosened and infinite values are allowed.
We conclude this subsection with the extension of the definition of conditional differential entropy to the complex case:
Definition 3.30 Let x and y denote two complex random vectors with an existing joint p.d.f. $f_{\breve{x},\breve{y}}$ of $\breve{x}$ and $\breve{y}$. Then the conditional differential entropy $h(y|x)$ is defined as
$$h(y|x) = h(\breve{y}|\breve{x}) = -\int f_{\breve{x},\breve{y}}(\breve{x},\breve{y})\,\log\frac{f_{\breve{x},\breve{y}}(\breve{x},\breve{y})}{f_{\breve{x}}(\breve{x})}\,d\breve{x}\,d\breve{y},$$
with the marginal p.d.f. of $\breve{x}$,
$$f_{\breve{x}}(\breve{x}) = \int f_{\breve{x},\breve{y}}(\breve{x},\breve{y})\,d\breve{y}.$$
Definition 3.31 (Normed Space) A norm on a vector space V is a real-valued function $\|\cdot\|$ on V such that, for all vectors $x, y \in V$ and every scalar $\alpha$,
$$\|x\| \ge 0, \tag{3.21}$$
$$\|x\| = 0 \iff x = 0, \tag{3.22}$$
$$\|\alpha x\| = |\alpha|\,\|x\|, \tag{3.23}$$
$$\|x + y\| \le \|x\| + \|y\|. \tag{3.24}$$
Furthermore,
$$d(x, y) = \|x - y\|, \qquad x, y \in V, \tag{3.25}$$
defines a metric on V and is called the metric induced by the norm. The normed space just defined is denoted by $(V, \|\cdot\|)$ or simply by V.
Definition 3.32 (Inner Product Space, Hilbert Space) An inner product space (or pre-Hilbert space) is a vector space V with an inner product defined on V. A Hilbert space is a complete inner product space (complete in the metric defined by the inner product; cf. (3.32), below). Here, an inner product on V is a mapping of $V \times V$ into the scalar field F of V; i.e., with every pair of vectors x and y there is associated a scalar which is written
$$\langle x, y\rangle$$
and is called the inner product of x and y, such that for all vectors x, y, z and scalars $\alpha$, we have
$$\langle x + y, z\rangle = \langle x, z\rangle + \langle y, z\rangle, \tag{3.26}$$
$$\langle \alpha x, y\rangle = \alpha\langle x, y\rangle, \tag{3.27}$$
$$\langle x, y\rangle = \langle y, x\rangle^*, \tag{3.28}$$
$$\langle x, x\rangle \ge 0, \tag{3.29}$$
$$\langle x, x\rangle = 0 \iff x = 0. \tag{3.30}$$
An inner product on V defines a norm on V given by
$$\|x\| = \sqrt{\langle x, x\rangle} \tag{3.31}$$
and a metric on V given by
$$d(x, y) = \|x - y\| = \sqrt{\langle x - y, x - y\rangle}. \tag{3.32}$$
Hence, inner product spaces are normed spaces, and Hilbert spaces are Banach spaces.
A well known example of a Hilbert space (and in turn of a Banach space) is the vector space of all n-dimensional complex vectors, i.e., the space $V = \mathbb{C}^n$, equipped with the inner product
$$\langle x, y\rangle_2 = y^H x. \tag{3.33}$$
We will call the induced norm defined by (3.31) the Euclidean (vector) norm and denote it by¹²
$$\|x\|_2 = \sqrt{x^H x}. \tag{3.35}$$
Theorem 3.34 The vector space V (V1 , V2 ) of all bounded linear operators from
a normed space V1 into a normed space V2 is itself a normed space with norm
defined by
    ‖T‖ = sup_{x ∈ V₁ \ {0}} ‖Tx‖ / ‖x‖ = sup_{x ∈ V₁, ‖x‖ = 1} ‖Tx‖.    (3.36)
Proof. The proof is straightforward by checking the axioms (3.21), (3.22), (3.23),
and (3.24) of Definition 3.31.
Let us consider again the Hilbert / Banach spaces (C^n, ‖·‖₂) and (C^m, ‖·‖₂).
The vector space V(C^m, C^n) of all bounded linear operators from C^m into C^n is
isomorphic to the space of all n × m-dimensional complex matrices, i.e.,

    V(C^m, C^n) ≅ C^{n×m}.    (3.37)

Hence, (C^{n×m}, ‖·‖₂) is a normed space. We will call the corresponding (induced)
norm, defined in (3.36), the Euclidean matrix norm. That is, for any matrix
A ∈ C^{n×m}, we have to calculate

    ‖A‖₂ = sup_{‖x‖₂ = 1} ‖Ax‖₂.    (3.38)
The question remains whether it is possible to find a simpler rule for calculating
the Euclidean matrix norm of a given matrix. But before we address this issue, we
will first present some important properties of the norm of a normed / inner product
space.
Lemma 3.35 Let T, T₁, and T₂ denote bounded linear operators between matching
normed / inner product spaces. Then,

    ‖Tx‖ ≤ ‖T‖ ‖x‖  ∀x,        (3.39)
    ‖T₁T₂‖ ≤ ‖T₁‖ ‖T₂‖.        (3.40)

If ⟨Tx, y⟩ = ⟨x, T⁻¹y⟩ ∀x, y, then

    ‖Tx‖ = ‖x‖  ∀x  and  ‖T‖ = 1.    (3.41)
Proof. The properties are immediate consequences of the definition of the (induced) norm in (3.36).
We now return to our examples, the Hilbert spaces (C^n, ‖·‖₂) and (C^m, ‖·‖₂),
and the induced normed space (C^{n×m}, ‖·‖₂) (it can be shown that it is even a Banach space [31]) with Euclidean matrix norm, which has to be calculated according
to (3.38). The following theorem establishes a relation between the Euclidean
matrix norm of a matrix and its singular values, and provides us with a second
possibility for computing the Euclidean matrix norm.
¹²Note that we write ‖·‖₂ not only for the Euclidean vector norm, but also for the (induced)
Euclidean matrix norm.
Theorem 3.36 Let A ∈ C^{n×m} denote a complex matrix and σ_max denote its
greatest singular value. Then,

    ‖A‖₂ = σ_max.    (3.42)

Proof. Let A = UΣV^H denote a singular value decomposition of A, with unitary
matrices U and V and the rectangular diagonal matrix¹³ Σ containing the singular
values in descending order¹⁴. Since

    ⟨Ux, y⟩₂ = y^H U x = (U⁻¹y)^H x = ⟨x, U⁻¹y⟩₂  ∀x, y

for any unitary matrix U, we have (cf. Lemma 3.35)

    ‖U‖₂ = 1  and  ‖Ux‖₂ = ‖x‖₂  ∀x ∈ C^n,    (3.43)
    ‖V‖₂ = 1  and  ‖Vy‖₂ = ‖y‖₂  ∀y ∈ C^m,    (3.44)

and, therefore,

    ‖A‖₂ = sup_{‖x‖₂=1} ‖Ax‖₂ = sup_{‖x‖₂=1} ‖UΣV^H x‖₂ = sup_{‖x‖₂=1} ‖ΣV^H x‖₂    (3.45)
         = sup_{‖Vy‖₂=1} ‖ΣV^H (Vy)‖₂ = sup_{‖y‖₂=1} ‖Σy‖₂ = ‖Σ‖₂ = σ_max.
We want to emphasize that this result is the reason why we introduced the concept of normed spaces (Banach spaces) and inner product spaces (Hilbert spaces).
It was already mentioned that statements like "the singular values of the matrix
B⁻¹PB⁻ᵀ have to be smaller than 1" and sometimes also "the singular values of
the matrix B⁻¹PB⁻ᵀ have to be smaller than or equal to 1" are mandatory for the
validity of Theorems 3.15 and 3.21, and Corollary 3.16. According to Theorem
3.36, we can replace these statements by ‖B⁻¹PB⁻ᵀ‖₂ < 1 and ‖B⁻¹PB⁻ᵀ‖₂ ≤ 1,
respectively.
¹³A rectangular diagonal matrix is a matrix for which all entries with different column and row
indices are 0.
¹⁴U and V can be chosen to guarantee this.
Lemma 3.37 Let C = BB^H be a nonsingular covariance matrix with generalized
Cholesky factor B, and let P be a matrix of compatible dimensions. If

    ‖P‖₂ < 1/‖C⁻¹‖₂,  we have  ‖B⁻¹PB⁻ᵀ‖₂ < 1,

and if

    ‖P‖₂ ≤ 1/‖C⁻¹‖₂,  we have  ‖B⁻¹PB⁻ᵀ‖₂ ≤ 1.    (3.46)

Note that

    1/‖C⁻¹‖₂ = σ_min,    (3.47)

where σ_min denotes the smallest singular value of C, cf. also Theorem 3.36.
Therefore, the sufficient conditions of Lemma 3.37 can also be formulated as "if
the greatest singular value of P is smaller than (or equal to) the smallest singular
value of C, then . . .".
3.2 Capacity
3.2.1 Rotationally Invariant Noise
Now we return to our channel model as presented in Section 2.3. We have
y = Ax + n,
(3.48)
where y ∈ C^r and x ∈ C^t denote the received and transmitted vectors, respectively. A is a deterministic r × t complex matrix, and n ∈ C^r is zero-mean
complex Gaussian noise. We assume that the covariance matrix C_n = E{nn^H}
is known and nonsingular, and that the noise vector is rotationally invariant, i.e.,
the pseudocovariance matrix P_n = E{nn^T} is the zero matrix. This is the usual
assumption in the literature [13, 58] as well as in practice, although a vanishing pseudocovariance matrix occurs only in some special cases. We will derive a criterion for
the noise vector to be (essentially) rotationally invariant in Section 4.1 (for the special cases given by equations (2.16) and (2.25)). The specification of the channel
model is completed by the power constraint,
    E{x^H x} = tr( E{xx^H} ) ≤ S,    (3.49)
and the assumption that the channel is memoryless, i.e., that the noise vectors for
different channel uses are independent. We want to emphasize that this is only
an approximation because in general, we can expect correlations / dependencies
between the noise vectors of different channel uses. For mathematical reasons, we
stick to this simplification. Note that such dependencies possibly increase capacity
but make an analytical derivation less tractable.
The mutual information I(x; y) between transmit and receive vector is defined
as
    I(x; y) = h(y) − h(y|x),    (3.50)

and can therefore be written as

    I(x; y) = h(y) − h(y|x) = h(y) − h(Ax + n | x) = h(y) − h(n | x)
            = h(Ax + n) − h(n),    (3.51)

since n and x are independent.
The capacity C of the channel is defined as the maximum of the mutual information over all possible transmit random vectors x satisfying the power constraint
(3.49), i.e., we have to choose the distribution (the p.d.f.) of x (of x
) that maximizes equation (3.50) but also fulfills the power constraint. Mathematically, this
maximization can be written as15
    C = max_{f_x : tr(E{xx^H}) ≤ S} { I(x; y) },    (3.52)

i.e., since h(n) does not depend on the distribution of x, the maximizing input is

    x_max = arg max_{f_x : tr(E{xx^H}) ≤ S} { h(Ax + n) },    (3.53)

and y_max = A x_max + n. Obviously,

    tr( E{x_max x_max^H} ) = tr( C_{x_max} ) + tr( x̄_max x̄_max^H ) ≤ S,

where the second term satisfies tr( x̄_max x̄_max^H ) ≥ 0, so that we may restrict
the search to zero-mean random vectors. Furthermore,

    C_{y_max} = A C_{x_max} A^H + C_n,
    P_{y_max} = A P_{x_max} A^T + P_n = A P_{x_max} A^T,    (3.54)

since the noise is rotationally invariant (P_n = 0).

¹⁵If the unit of entropy is [bit], the unit of capacity is [bit / channel use].
Applying the Maximum Entropy Theorem, the maximizing random vector is zero-mean complex Gaussian, so that the maximization problem reduces to

    C = max_{C_x : tr(C_x) ≤ S} [ log det( A C_x A^H + C_n ) − log det( C_n ) ],    (3.55)

    C_{x_max} = arg max_{C_x : tr(C_x) ≤ S} log det( A C_x A^H + C_n ),
where Cx is the covariance matrix of x, i.e., the maximization goes over all nonnegative definite Hermitian matrices with a trace smaller or equal to S.
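As a numerical illustration (the matrices below are randomly generated, assumed for the example only), the bracket of (3.55) can be evaluated via log-determinants, and it agrees with the whitened, diagonalized form (3.56) derived next:

```python
import numpy as np

rng = np.random.default_rng(1)
r = t = 3
S = 10.0
A = rng.normal(size=(r, t)) + 1j * rng.normal(size=(r, t))
Bn = rng.normal(size=(r, r)) + 1j * rng.normal(size=(r, r))
Cn = Bn @ Bn.conj().T                      # nonsingular noise covariance

def objective(Cx):
    """log det(A Cx A^H + Cn) - log det(Cn), the bracket of (3.55), in nats."""
    return (np.linalg.slogdet(A @ Cx @ A.conj().T + Cn)[1]
            - np.linalg.slogdet(Cn)[1])

Cx = (S / t) * np.eye(t)                   # a trial allocation with tr(Cx) = S
I_direct = objective(Cx)

# Whitened form: with Bn^{-1} A = U D V^H and Ca = V^H Cx V, the same value
# equals log det(D Ca D^T + I_r), cf. (3.56).
U, d, Vh = np.linalg.svd(np.linalg.solve(Bn, A))
Ca = Vh @ Cx @ Vh.conj().T
I_whitened = np.linalg.slogdet(np.diag(d) @ Ca @ np.diag(d) + np.eye(r))[1]

assert I_direct > 0 and np.isclose(I_direct, I_whitened)
```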
Let B_n denote a generalized Cholesky factor of C_n, i.e.,

    C_n = B_n B_n^H,

and let

    B_n^{-1} A = U D V^H

denote a singular value decomposition with unitary U, unitary V, and the rectangular diagonal matrix D with diagonal entries d_i, i = 1, . . . , s, s = min(r, t). Then,

    log det( A C_x A^H + C_n ) = log det( A C_x A^H + B_n B_n^H )
      = log det( B_n ( B_n^{-1} A C_x A^H B_n^{-H} + I_r ) B_n^H )
      = log det( B_n^{-1} A C_x A^H B_n^{-H} + I_r ) + log det( B_n ) + log det( B_n^H )
      = log det( D V^H C_x V D^T + I_r ) + log det( C_n ),

and, with C_x = V C_a V^H (a = V^H x), the maximization problem (3.55) is equivalent to

    C = max_{C_a : tr(C_a) ≤ S} log det( D C_a D^T + I_r ).    (3.56)
[Fig. 3.1: Water Filling: the powers c_{a_max,i} fill the noise-to-gain profile 1/d_i²
up to the Water Level L.]

The maximizing C_{a_max} is diagonal, with diagonal entries given by Water
Filling¹⁹, i.e.,

    c_{a_max,i} = ( L − 1/d_i² )⁺   for i ≤ s and d_i ≠ 0,
    c_{a_max,i} = 0                 for i ≤ s and d_i = 0,    (3.58)
    c_{a_max,i} = 0                 for i > s,

where the Water Level L is chosen to satisfy tr( C_{a_max} ) = Σ_{i=1}^t c_{a_max,i} = S.
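The Water Filling allocation (3.58) can be computed with a simple bisection on the Water Level L. The following sketch (illustrative gain values, not from the thesis) also evaluates the resulting capacity (3.59) in nats:

```python
import numpy as np

def water_filling(d, S, tol=1e-12):
    """Power allocation (3.58): c_i = max(L - 1/d_i^2, 0) on subchannels with
    gains d_i > 0, the water level L chosen so that sum(c_i) = S."""
    inv = 1.0 / np.asarray(d, dtype=float) ** 2
    lo, hi = inv.min(), inv.max() + S      # allocated power is 0 at lo, >= S at hi
    while hi - lo > tol:
        L = 0.5 * (lo + hi)
        if np.maximum(L - inv, 0.0).sum() > S:
            hi = L                         # water level too high
        else:
            lo = L
    L = 0.5 * (lo + hi)
    return np.maximum(L - inv, 0.0), L

d = np.array([2.0, 1.0, 0.5])              # example subchannel gains
c, L = water_filling(d, S=3.0)
assert np.isclose(c.sum(), 3.0)            # power constraint met
assert c[2] == 0.0                         # weakest subchannel stays unused here

# Capacity (3.59): sum of (log(L d_i^2))^+ over active subchannels.
C = np.sum(np.log(np.maximum(L * d ** 2, 1.0)))
assert C > 0
```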
The corresponding maximum mutual information (capacity) is given by

    C = Σ_{i : d_i ≠ 0} ( log( L d_i² ) )⁺.    (3.59)

¹⁹This is the general, well known Water Filling illustration, which does not assume a special
(descending) ordering of {d_i : i = 1, . . . , s}, also cf. Footnote 14 in this chapter.

3.2.2 Rotationally Variant Noise

Again, we consider the channel model

    y = Ax + n,    (3.61)
where y ∈ C^r and x ∈ C^t denote the received and transmitted vectors, respectively. A is a deterministic r × t complex matrix, and n ∈ C^r is zero-mean complex
Gaussian noise. Again, we assume that the covariance matrix C_n = E{nn^H} is
not the zero matrix20 . Similarly to the previous subsection, we restrict the set of
possible input vectors according to the power constraint,
tr E{xxH } S,
(3.62)
and we assume that the channel is memoryless, i.e., the noise vectors for different
channel uses are independent.
In contrast to the case of rotationally invariant noise (cf. Subsection 3.2.1), we
make two additional (technical) assumptions in order to simplify the analysis, so
that it is tractable at all. We assume:
1. t ≥ r. For our purposes, this restriction is not substantial, since, for our
practical channels (2.16) and (2.25), we have t = r.
2. A high signal-to-noise ratio (SNR). Note that it is not quite clear what we
mean by high SNR at the moment, because its definition is not straightforward for a channel of the form (3.61). We will give a precise definition later
in this subsection. However, for transmission over cable (bundles) this high
SNR assumption is usually fulfilled.
Again, the mutual information I(x; y) between transmit and receive vector can
be written as
    I(x; y) = h(y) − h(y|x) = h(Ax + n) − h(n),    (3.63)

such that the capacity of the channel

    C = max_{f_x : tr(E{xx^H}) ≤ S} { I(x; y) }    (3.64)

is obtained as

    C = max_{f_x : tr(E{xx^H}) ≤ S} { h(Ax + n) } − h(n),    (3.65)

maximized by

    x_max = arg max_{f_x : tr(E{xx^H}) ≤ S} { h(Ax + n) }    (3.66)

and

    y_max = A x_max + n.
²⁰Note that the analysis of this subsection is valid for a vanishing pseudocovariance matrix as
well.
Obviously,

    tr( E{x_max x_max^H} ) = tr( C_{x_max} ) + tr( x̄_max x̄_max^H ) ≤ S,

where tr( x̄_max x̄_max^H ) ≥ 0, and

    C_{y_max} = A C_{x_max} A^H + C_n,
    P_{y_max} = A P_{x_max} A^T + P_n.

Applying the Generalized Maximum Entropy Theorem (Theorem 3.21), we conclude that y_max is Gaussian. We can then restrict our search for the maximizing
random vector to zero-mean, Gaussian, complex random vectors x, so that the
maximization problem (3.65) is simplified to

    C = max_{C_x, P_x : tr(C_x) ≤ S} [ log det( A C_x A^H + C_n ) + (1/2) Σ_i log( 1 − λ_i² ) ]
        + r log(πe) − h(n),    (3.67)

where the λ_i denote the singular values of

    B_y^{-1} P_y B_y^{-T} = B_y^{-1} ( A P_x A^T + P_n ) B_y^{-T},

which must be smaller than 1, B_y being a generalized Cholesky factor of

    C_y = A C_x A^H + C_n.

Hence, the maximization is over all nonnegative definite Hermitian matrices C_x
with trace smaller or equal to S and over all symmetric matrices P_x, such that
{C_x, P_x} is a valid pair of covariance and pseudocovariance matrix according to
the criterion presented in Theorem 3.15.

Observe that the term

    (1/2) Σ_i log( 1 − λ_i² )    (3.68)

of (3.67) depends on the pseudocovariance matrix P_x and on the covariance matrix C_x via the generalized Cholesky factor B_y, but is smaller or equal to 0. So, if
it is possible to take the covariance matrix C_x that maximizes the term

    log det( A C_x A^H + C_n )    (3.69)

and at the same time choose the pseudocovariance matrix P_x such that the term
(3.68) vanishes, then we have found the solution of (3.67).
We will show in the following that our technical assumptions are sufficient for
finding the maximum using this argument.
The first observation we make, if we look at (3.69), is that we have already
found the maximizing covariance matrix in the previous subsection, cf. (3.55). Let
Bn denote a generalized Cholesky factor of Cn , let
    B_n^{-1} A = U D V^H    (3.70)

denote a singular value decomposition, and set

    C_{x_max} = V C_{a_max} V^H,    (3.71)

where C_{a_max} is a diagonal matrix with diagonal entries obtained by Water Filling, i.e.,

    c_{a_max,i} = ( L − 1/d_i² )⁺   for i ≤ r and d_i ≠ 0,
    c_{a_max,i} = 0                 for i ≤ r and d_i = 0,    (3.72)
    c_{a_max,i} = 0                 for i > r,

where the Water Level L is chosen to satisfy tr( C_{a_max} ) = Σ_{i=1}^t c_{a_max,i} = S.
Furthermore,

    max_{C_x, P_x : tr(C_x) ≤ S} log det( A C_x A^H + C_n )
        = Σ_{i : d_i ≠ 0} ( log( L d_i² ) )⁺ + log det( C_n ).    (3.73)
Next, we will state our high SNR assumption more precisely: we will assume
in the following that the Water Level L is lower bounded by
    L ≥ 2/d_min²,  d_min = min_{i : d_i ≠ 0} d_i > 0.    (3.74)

Comparing this with (3.72), as illustrated in Figure 3.1, we conclude that this is
equivalent to having a signal-to-noise ratio on every virtual subchannel²¹ greater
or equal to 1, i.e.,

    c_{a_max,i} / (1/d_i²) ≥ 1,  i = 1, . . . , r.    (3.75)

²¹By virtual subchannel, we mean the scalar channels obtained by the diagonalization via the
SVD. So every virtual subchannel is used for communications.
Under the high SNR assumption, all virtual subchannels are active, i.e.,

    c_{a_max,i} = L − 1/d_i²,  i = 1, . . . , r,    (3.76)

and

    max_{C_x, P_x : tr(C_x) ≤ S} log det( A C_x A^H + C_n )
        = Σ_{i=1}^r log( L d_i² ) + log det( C_n ),    (3.77)

with Water Level

    L = S/r + (1/r) Σ_{i=1}^r 1/d_i².    (3.78)
We want to emphasize that (3.78) enables us to formulate the high SNR assumption
(3.74) as a condition for the signal power S as well. Using the Euclidean matrix
norm, cf. Subsection 3.1.3, to replace 1/d_min², and the trace operator tr(·) to replace
Σ_{i=1}^r 1/d_i², it is even possible (not shown here) to express this condition in
terms of the channel matrix A and the noise covariance matrix C_n. For the
pseudocovariance matrix we set, with

    D̃ = diag( 1/d_1, . . . , 1/d_r ) ∈ R^{t×r},    (3.79)

    P_{x_max} = −A⁺ P_n (A⁺)^T = −V D̃ U^{-1} B_n^{-1} P_n B_n^{-T} U^{-T} D̃^T V^T,    (3.80)

where A⁺ denotes the (Moore-Penrose) pseudo inverse [19] of A, such that
    P_{y_max} = A P_{x_max} A^T + P_n
      = −B_n U D V^H · V D̃ U^{-1} B_n^{-1} P_n B_n^{-T} U^{-T} D̃^T V^T · V* D^T U^T B_n^T + P_n    (3.81)
      = −B_n U (D D̃) U^{-1} B_n^{-1} P_n B_n^{-T} U^{-T} (D D̃)^T U^T B_n^T + P_n
      = −B_n U I_r U^{-1} B_n^{-1} P_n B_n^{-T} U^{-T} I_r U^T B_n^T + P_n
      = −P_n + P_n = 0.    (3.82)
This yields

    B_{y_max}^{-1} P_{y_max} B_{y_max}^{-T} = 0,

i.e., all singular values λ_i vanish, and, consequently,

    (1/2) Σ_i log( 1 − λ_i² ) = 0.
and, furthermore, the high SNR assumption (3.74) implies

    ‖D̃‖₂² = 1/d_min² ≤ L/2,    (3.83)

and, since the smallest nonzero diagonal entry of C_{a_max} is L − 1/d_min² ≥ L/2,

    L/2 ≤ 1/‖C_{x_max}^{-1}‖₂,    (3.84)

where (3.47) was used. This, together with Lemma 3.35, yields

    ‖P_{x_max}‖₂ = ‖V D̃ U^{-1} B_n^{-1} P_n B_n^{-T} U^{-T} D̃^T V^T‖₂    (3.85)
                ≤ ‖D̃‖₂² ‖B_n^{-1} P_n B_n^{-T}‖₂,

and, according to (3.46) and Theorem 3.15 (‖B_n^{-1} P_n B_n^{-T}‖₂ ≤ 1),

    ‖P_{x_max}‖₂ ≤ 1/‖C_{x_max}^{-1}‖₂.    (3.86)
Applying Lemma 3.37 to Theorem 3.15, we have proven that our choice of
{Cxmax , Pxmax } is indeed a valid pair of covariance and pseudocovariance matrix.
Finally, the capacity (see also (3.67) and the Generalized Maximum Entropy
Theorem 3.21) is obtained as
    C = Σ_{i=1}^r log( L d_i² ) + log det( C_n ) + r log(πe) − h(n)
      = Σ_{i=1}^r log( L d_i² ) − (1/2) Σ_i log( 1 − λ̃_i² ),    (3.87)

where the λ̃_i denote the singular values of B_n^{-1} P_n B_n^{-T}.

3.2.3 Capacity Loss

We now compare two types of transmission: a first type that utilizes the knowledge of the noise pseudocovariance matrix P_n, and a second type that neglects it,
i.e., chooses a rotationally invariant transmit vector. The difference of the resulting
maximum mutual informations is the capacity loss

    ΔC = C_utilize Pn − C_neglect Pn.    (3.88)
We want to mention that we could equivalently call this quantity capacity gain.
Note that most conventional schemes do not make use of the pseudocovariance
matrix. Hence, if we compare the performance of a scheme that utilizes Pn with
a conventional scheme, we will encounter a gain and not a loss. However, capacity results are theoretical performance results of optimum schemes (of course,
optimum with respect to a certain optimality criterion). A scheme that neglects
the pseudocovariance matrix is certainly not optimum, if we apply the optimality
criterion that takes into account the pseudocovariance matrix. In our opinion it is
more appropriate to compare a scheme with an optimum scheme, and not with a
certain suboptimum scheme, and therefore we speak of a capacity loss instead of
a capacity gain.
The previous Subsection 3.2.2 was dedicated to the first type of transmission,
i.e., we determined the capacity
    C_utilize Pn = Σ_{i=1}^r log( L d_i² ) − (1/2) Σ_i log( 1 − λ̃_i² ),    (3.89)

with the λ̃_i denoting the singular values of B_n^{-1} P_n B_n^{-T}. For the second type
of transmission, the pseudocovariance matrix is neglected, i.e., the transmit vector
is chosen rotationally invariant,

    P_{x_max} = 0,    (3.90)

while the covariance matrix C_{x_max} is still obtained by Water Filling, and

    y_max = A x_max + n.    (3.91)
Applying the Generalized Maximum Entropy Theorem (Theorem 3.21), the capacity can be written as
    C_neglect Pn = I(x_max; y_max)    (3.92)
      = h( A x_max + n ) − h(n)
      = log det( C_{y_max} ) − log det( C_n )
        + (1/2) Σ_i log( 1 − λ_i² ) − (1/2) Σ_i log( 1 − λ̃_i² ),

where the λ_i denote the singular values of B_{y_max}^{-1} P_n B_{y_max}^{-T}. Here, the
Water Filling covariance matrix is

    C_{x_max} = V diag( L − 1/d_1², . . . , L − 1/d_r², 0, . . . , 0 ) V^H,    (3.93)
and, furthermore,

    C_{y_max} = A C_{x_max} A^H + B_n B_n^H    (3.94)
      = B_n ( B_n^{-1} A C_{x_max} A^H B_n^{-H} + I_r ) B_n^H
      = B_n U ( D diag( L − 1/d_1², . . . , L − 1/d_r², 0, . . . , 0 ) D^H + I_r ) U^H B_n^H
      = B_n U ( L D D^H − I_r + I_r ) U^H B_n^H
      = L B_n U D D^H U^H B_n^H.    (3.95)
Hence, a generalized Cholesky factor of C_{y_max} is

    B_{y_max} = √L B_n U diag( d_1, . . . , d_r ),    (3.96)

and

    ‖B_{y_max}^{-1} B_n‖₂ = (1/√L) ‖diag( 1/d_1, . . . , 1/d_r )‖₂ = 1/(√L d_min) ≤ 1/√2,    (3.97)

where (3.83) was used. This enables us to calculate

    ‖B_{y_max}^{-1} P_n B_{y_max}^{-T}‖₂ = ‖ B_{y_max}^{-1} B_n · B_n^{-1} P_n B_n^{-T} · B_n^T B_{y_max}^{-T} ‖₂    (3.98)
      ≤ ‖B_{y_max}^{-1} B_n‖₂² ‖B_n^{-1} P_n B_n^{-T}‖₂ ≤ 1/2,

where Theorem 3.15 and (3.46) applied to n were used. Applying Theorem 3.36,
we conclude that all singular values λ_i are even smaller than 1/2. As was also the
case in Subsection 3.2.2, a singular value λ_i equal to 1 would result in an infinite
capacity, as can be shown by using the equivalent real channel model of doubled
dimension defined in (2.29), (2.30), and (2.31), and using Corollary 3.16.
Inserting (3.77) into (3.92), we finally obtain the capacity
    C_neglect Pn = Σ_{i=1}^r log( L d_i² ) + (1/2) Σ_i log( 1 − λ_i² ) − (1/2) Σ_i log( 1 − λ̃_i² ),    (3.99)

such that the capacity loss becomes

    ΔC = C_utilize Pn − C_neglect Pn = −(1/2) Σ_i log( 1 − λ_i² ),    (3.100)
[Fig. 3.2: Distribution of noise and signal power over real and imaginary part:
(a) noise power distribution, (b) Water Filling neglecting P_n, (c) Water Filling
utilizing P_n.]
where the λ_i denote the singular values of B_{y_max}^{-1} P_n B_{y_max}^{-T}, B_{y_max} being a generalized
Cholesky factor of C_{y_max} = A C_{x_max} A^H + C_n. Note that the capacity loss lies
within the range
    0 ≤ ΔC ≤ r log( 2/√3 ),    (3.101)

which follows from (3.98) and

    −(1/2) Σ_{i=1}^r log( 1 − λ_i² ) ≤ −(r/2) log( 1 − 1/4 ) = (r/2) log( 4/3 ) = r log( 2/√3 ).

To illustrate these effects, consider the scalar channel

    y = x + n,  x, y, n ∈ C,    (3.102)

where the noise has a known real valued covariance matrix and a known real valued
pseudocovariance matrix (of dimension 1 × 1)

    C_n = [A]  and  P_n = [B],  A ≥ B > 0.    (3.103)
The covariance matrix of the equivalent real valued noise vector of doubled dimension is then

    C_ñ = (1/2) [ A + B   0 ;  0   A − B ].    (3.104)
Figure 3.2 (a) shows how the noise power is subdivided into real and imaginary
part. If it is erroneously assumed that the pseudocovariance matrix is the (all)
zero matrix, the conventional Water Filling algorithm distributes the same signal
power onto the real and imaginary part as it is illustrated in Figure 3.2 (b). But
this is obviously not the optimum solution that maximizes the mutual information
in case of a nonvanishing pseudocovariance matrix. The optimum distribution
of the signal power to the real and imaginary part that achieves capacity is shown
in Figure 3.2 (c). Note that this solution is given by Water Filling on a real and
imaginary part level. The difference of mutual information corresponding to Figure
3.2 (b) and (c) is exactly the capacity loss we are looking at in this subsection.
In Section 2.3, we came to the conclusion that we deal with a channel model of the
form
y = Ax + n,
(4.1)
where y ∈ C^r and x ∈ C^t denote the received and transmitted vectors, respectively. A is the channel matrix and n ∈ C^r is the noise vector. In order to obtain
capacity results, cf. Section 3.2, it is not sufficient to know the channel matrix
A. It is also necessary to know the statistical properties of the noise vector n. To
be more precise, we need the covariance matrix Cn = E{nnH } and the pseudocovariance matrix Pn = E{nnT }. For the DMT system, this means that we have
to calculate covariance and pseudocovariance matrix of the vector Ewn0 in (2.16).
We will show in this chapter that the noise is rotationally variant in general, a fact
that is simply neglected in practical systems as well as in literature. Applying the
results of Subsection 3.2.3, we will calculate the resulting capacity loss. Furthermore, beyond capacity considerations, we also have a look at (uncoded) symbol
error probability. We will show that rotated rectangular constellations are more
appropriate than the common (quadratic) QAM constellations, and we will derive
formulas for the rotation angles and constellation sizes / densities. Finally, we will
show that this is of greatest importance if the noise at the input of the receiver is
colored, which will be the case in practice.
We also want to emphasize that the statements are also valid for MIMO DMT,
cf. Subsection 5.1.1. Note that the noise vector n_{n_0} in (2.25) is then clearly a
rotationally variant complex random vector in general (this follows from the DMT case).
It was already mentioned in Section 2.1 that the Time Domain Equalizer has
the task to shorten the impulse response to a length shorter or equal to p + 1, p
being the length of the Cyclic Prefix. If this holds only approximately, one has to
accept intersymbol interference (ISI) and intercarrier interference (ICI), which can
be regarded as additional noise sources. In Section 4.4, we outline the fundamental
properties of the underlying mechanisms. The presented results are an extension of
the results of [42] to post- and precursors from both neighboring frames, relying
on a different derivation. Furthermore, we show that ISI and ICI are rotationally
variant in general and have equal first and second order moments.
For most of the other material presented in this chapter, we also refer to [56].
¹Note that this process is usually zero-mean in our application. However, there is the possibility
that there are other applications where a discrete-time, real valued, wide-sense stationary random
process that is not zero-mean is passed (blockwise) through a DFT. We can still apply our analysis
to such situations.
²Again, the superscript * denotes complex conjugation, which is of course redundant for real
valued random processes. Since we are also dealing with complex valued random processes later on,
we write it here for completeness.
³Crosstalk is colored noise and the time domain equalizer transforms white noise into colored
noise as well.
random vector, its real and imaginary part vectors have⁴ identical (auto)covariance
matrices and a skew-symmetric [24] cross-covariance matrix [34]. As an immediate
consequence, real and imaginary part of an element of this random vector have
identical variances and are uncorrelated. Using the theory of proper random vectors
developed in [41], it can be easily seen that the random vectors at the output of the
DFT are rotationally variant except for the case when the input random vectors are
constant with probability 1, i.e., roughly speaking, they are deterministic⁵. This
suggests that at the output of the DFT (and also at the input of the decision device),
real and imaginary part at certain frequencies have different variances and / or are
correlated in general.
Remark: In the case of passband Orthogonal Frequency Division Multiplexing
(OFDM), the situation is different. At the input of the receiver, the demodulation of
the signal (passband to baseband conversion) requires the calculation of an analytical signal. It is shown in [43] that the analytical signal of any stationary signal is
rotationally invariant. It follows that all considered random vectors are rotationally
invariant as well.
In the following, we analyze the complex random vectors

    w_{n_0} = [ w_{n_0}(0), . . . , w_{n_0}(N−1) ]^T

at the output of the DFT (of even length N ≥ 2) analytically. Let us recall (see
Figure 4.1) that the real (DFT-)input random vector

    v_{n_0} = [ v_{n_0}(0), . . . , v_{n_0}(N−1) ]^T

is part⁶ of a discrete-time, real valued, wide-sense stationary random process
z = [z(n)]_{n=−∞,...,+∞} with mean μ_z and autocorrelation function R_z(n), i.e.,

    v_{n_0}(n) = z(n_0 + n),  n = 0, . . . , N−1.

⁴In fact, these properties can be equivalently used for the definition of rotational invariance.
Wooding [63] and Goodman [20] were apparently the first to deal with random vectors satisfying
these conditions.
⁵That is the only situation when a real random vector happens to be rotationally invariant.
⁶According to the first block of the receiver (see Figure 4.1).
[Fig. 4.1: DMT receiver: remove Cyclic Prefix (length p), serial / parallel conversion, DFT (length N), Frequency Domain Equalization, Decision Device.]

The DFT output samples are

    w_{n_0}(k) = (1/√N) Σ_{n=0}^{N−1} v_{n_0}(n) e^{−j(2π/N)nk}    (4.2)
               = (1/√N) Σ_{n=0}^{N−1} z(n_0 + n) e^{−j(2π/N)nk},  k = 0, . . . , N−1,
and

    μ_w(k) = E{w_{n_0}(k)} = (1/√N) Σ_{n=0}^{N−1} μ_z e^{−j(2π/N)nk}
           = √N μ_z  for k = 0,  and  0  for k = 1, . . . , N−1,

such that the mean vector

    μ_w = [ μ_w(0), . . . , μ_w(N−1) ]^T    (4.3)

is real and does not depend on n_0. With
    Q_w(k, l) = E{ w_{n_0}(k) w_{n_0}(l) }
      = (1/N) Σ_{n=0}^{N−1} Σ_{m=0}^{N−1} E{ z(n_0 + n) z(n_0 + m) } e^{−j(2π/N)(mk + nl)}
      = (1/N) Σ_{n=0}^{N−1} Σ_{m=0}^{N−1} R_z(n − m) e^{−j(2π/N)(mk + nl)},    (4.4)
69
N 1
s=n
@
@
R
A1
A2
1
0
N 1
@
@
R
@
I
@
N 2
A3
s=n+1N
m
n=s+N 1
1N
(4.5)
(4.6)
The next step is to simplify the expression for Qw (k, l). The idea is to reorder
the terms of the double sum, so that only one sum remains (after some calculations). We have
    Q_w(k, l) = (1/N) Σ_{n=0}^{N−1} Σ_{m=0}^{N−1} R_z(n − m) e^{−j(2π/N)(mk + nl)}
              = (1/N) Σ_{n=0}^{N−1} Σ_{s=n+1−N}^{n} R_z(s) e^{−j(2π/N)(nk + nl − sk)},    (4.7)

⁸Again, no dependency on n_0.
where the index change s = n − m has been performed, such that the summation
over m is replaced by a summation over s. Next, we interchange the two sums.
Due to the dependence on n in the inner sum, we have to investigate the summation
over (n, s) in some more detail. Figure 4.2 shows the effective pairs that are used
in the two sums. They are denoted by the areas A1, A2, and A3. Hence,

    Q_w(k, l) = (1/N) Σ_{n=0}^{N−1} R_z(0) e^{−j(2π/N)n(k+l)}                         (A1)
      + (1/N) Σ_{s=1}^{N−1} Σ_{n=s}^{N−1} R_z(s) e^{−j(2π/N)(nk + nl − sk)}           (A2)    (4.8)
      + (1/N) Σ_{s=1−N}^{−1} Σ_{n=0}^{s+N−1} R_z(s) e^{−j(2π/N)(nk + nl − sk)},       (A3)
and, furthermore,

    Q_w(k, l) = (1/N) Σ_{n=0}^{N−1} R_z(0) e^{−j(2π/N)n(k+l)}
      + (1/N) Σ_{s=1}^{N−1} Σ_{t=0}^{N−1−s} R_z(s) e^{−j(2π/N)(tk + tl + sl)}
      + (1/N) Σ_{s=1}^{N−1} Σ_{n=0}^{N−1−s} R_z(−s) e^{−j(2π/N)(nk + nl + sk)},    (4.9)

where the index change t = n − s has been performed for term (A2) in (4.8), such
that its summation over n is replaced by a summation over t, and in term (A3)
of (4.8) s has been replaced by −s. Since R_z(s) is an even function, we obtain
(writing t instead of n)

    Q_w(k, l) = (1/N) R_z(0) Σ_{t=0}^{N−1} e^{−j(2π/N)t(k+l)}    (4.10)
      + (1/N) Σ_{s=1}^{N−1} R_z(s) Σ_{t=0}^{N−1−s} ( e^{−j(2π/N)(tk + tl + sl)} + e^{−j(2π/N)(tk + tl + sk)} ).
Using the geometric sum formula

    Σ_{t=0}^{M−1} a^t = M  for a = 1,  and  (1 − a^M)/(1 − a)  for a ≠ 1,    (4.11)

(4.10) further simplifies⁹ to

⁹After a long but straightforward calculation. For the general case (MIMO case), this calculation
is carried out in full detail in Subsection 5.1.1.
    Q_w(k, l) =    (4.12)
      R_z(0) + 2 Σ_{s=1}^{N−1} R_z(s) (1 − s/N) cos( (2π/N)ks ),
          if k + l = 0 or k + l = N;
      −(1/N) ( cot( (π/N)(k + l) ) + j ) Σ_{s=1}^{N−1} R_z(s) ( sin( (2π/N)ks ) + sin( (2π/N)ls ) ),
          if k + l ≠ 0 and k + l ≠ N,

where the prefactor of the second case equals

    −(1/N) ( cot( (π/N)(k + l) ) + j ) = −e^{j(π/N)(k+l)} / ( N sin( (π/N)(k + l) ) ).
4.1.3 Frequency Dependent Noise Analysis
In this section, we use the results of the previous section to calculate variances and
covariances of the real and imaginary parts of the noise at certain frequencies, i.e.,
we calculate covariance matrices of the real 2-dimensional random vectors

    [ ℜ{w_{n_0}(k)}, ℑ{w_{n_0}(k)} ]^T,  k = 0, . . . , N−1,
where the frequency index k is considered as a fixed parameter. Splitting
w_{n_0}(k) = ℜ{w_{n_0}(k)} + jℑ{w_{n_0}(k)} = a_{n_0}(k) + j b_{n_0}(k) into real
and imaginary part, one immediately (cf. also [41]) finds the relations

    Cov{ a_{n_0}(k), a_{n_0}(l) } = (1/2) ℜ{ C_w(k, l) + P_w(k, l) },    (4.13)
    Cov{ b_{n_0}(k), b_{n_0}(l) } = (1/2) ℜ{ C_w(k, l) − P_w(k, l) },    (4.14)
    Cov{ a_{n_0}(k), b_{n_0}(l) } = (1/2) ℑ{ P_w(k, l) − C_w(k, l) },

and, in particular, at a fixed frequency k,

    σ_a²(k) = (1/2) ℜ{ Q_w(k, −k) + Q_w(k, k) − 2 (μ_w(k))² },
    σ_b²(k) = (1/2) ℜ{ Q_w(k, −k) − Q_w(k, k) },    (4.15)
    σ_{a,b}(k) = (1/2) ℑ{ Q_w(k, k) − Q_w(k, −k) }.
Specializing (4.12), we obtain

    Q_w(k, −k) = R_z(0) + 2 Σ_{s=1}^{N−1} R_z(s) (1 − s/N) cos( (2π/N)ks ),    (4.16)

    Q_w(k, k) =
      R_z(0) + 2 Σ_{s=1}^{N−1} R_z(s) (1 − s/N) cos( (2π/N)ks ),  k = 0 or k = N/2;
      −(2/N) ( cot( (2π/N)k ) + j ) Σ_{s=1}^{N−1} R_z(s) sin( (2π/N)ks ),
          k ≠ 0 and k ≠ N/2.    (4.17)

This yields

    σ_a²(0) = R_z(0) + 2 Σ_{s=1}^{N−1} R_z(s) (1 − s/N) − N μ_z²,
    σ_b²(0) = σ_{a,b}(0) = 0,

    σ_a²(N/2) = R_z(0) + 2 Σ_{s=1}^{N−1} R_z(s) (1 − s/N) (−1)^s,
    σ_b²(N/2) = σ_{a,b}(N/2) = 0,

and, for k ≠ 0 and k ≠ N/2,
as
N 1
s
Rz (0) X
2
+
Rz (s) 1
ks
=
cos
2
N
N
s=1
N 1
X
cot 2
2
Nk
Rz (s) sin
ks ,
N
N
s=1
N 1
Rz (0) X
s
2
2
b (k) =
+
Rz (s) 1
ks +
(4.18)
cos
2
N
N
s=1
N 1
X
cot 2
2
Nk
+
Rz (s) sin
ks ,
N
N
s=1
N 1
1 X
2
a,b (k) =
Rz (s) sin
ks .
N
N
a2 (k)
s=1
Note that the noise variances (and thus the noise powers) for real and imaginary
part at certain frequencies are different in general. Due to the properties of the DFT
of a real vector, cf. for example [59], this is not surprising for k = 0 and k = N2 .
Equation (4.18) shows that this also happens for k 6= 0 and k 6= N2 and that there
are statistical dependencies between real and imaginary part of the noise samples.
In the following subsection, we will study this effect in more detail.
4.1.4 Powers and Statistical Dependencies of Real and Imaginary Part of the
Noise

Let us consider the covariance matrices

    C_{a,b}(k) = [ σ_a²(k)   σ_{a,b}(k) ;  σ_{a,b}(k)   σ_b²(k) ]    (4.19)

of the real 2-dimensional random vectors [ a_{n_0}(k), b_{n_0}(k) ]^T, k = 1, . . . , N/2 − 1.
Their eigenvalue decompositions read

    C_{a,b}(k) = U_{a,b}(k) Λ_{a,b}(k) U_{a,b}(k)^T,
    Λ_{a,b}(k) = [ λ₁(k)  0 ;  0  λ₂(k) ],    (4.20)

with the eigenvalues given by
1 (k) =
N 1
Rz (0) X
s
2
+
Rz (s) 1
ks
cos
2
N
N
s=1
N 1
2 (k) =
X
1
2
Rz (s) sin
N sin N k s=1
N 1
Rz (0) X
s
2
+
Rz (s) 1
ks +
cos
2
N
N
s=1
2
ks ,
N
(4.21)
N
1
X
1
2
+
ks ,
Rz (s) sin
N
N sin 2
N k s=1
cos N
k sin N
k
Ua,b (k) =
.
sin N
k
cos N
k
(4.22)
Note that the eigenvalue difference can be expressed as

    λ₂(k) − λ₁(k) = ( 2/(N sin( (2π/N)k )) ) Σ_{s=1}^{N−1} R_z(s) sin( (2π/N)ks )
                  = −( 2/(N sin( (2π/N)k )) ) ℑ{ Σ_{s=1}^{N−1} R_z(s) e^{−j(2π/N)ks} },    (4.23)
and, using the properties of the DFT / IDFT, cf. for example [59], we are now able
to state the following theorem:
Theorem 4.1 Correlations and variance differences between real and imaginary
part of the noise at the output of the DFT do not occur at any frequency k =
1, . . . , N2 1 if and only if the autocorrelation function Rz (n) satisfies the relation
    R_z(n) = R_z(N − n),  n = 1, . . . , N − 1.    (4.24)
Note that the same theorem holds essentially for the noise at the input of the
decision device, because the frequency domain equalizer performs simple multiplications with complex numbers for each frequency, which corresponds to rotations
(phase) and scalings (absolute value) in the complex plane (for each frequency
separately).
Theorem 4.1 tells us that correlations and / or variance differences occur only
in the presence of colored noise. This is not a big surprise: the cyclic prefix is
not able to transform the linear convolution that models the noise correlations into
a cyclic convolution. All effects of the linear convolution remain visible at the
output of the DFT. It is obvious that in practical systems, the condition (4.24) on
the autocorrelation function10 is never fulfilled, i.e., there will be (at least) one
n {1, . . . , N 1} with Rz (n) 6= Rz (N n). So one has to be aware that the
noise powers for real and imaginary part are different.
¹⁰An inspection of (4.24) shows that this condition is equivalent to the requirement that the
autocorrelation function shifted to the left by N/2 is an even function within the interval
[−(N/2 − 1), +(N/2 − 1)]. Do not mix this up with the fact that any nonshifted autocorrelation
function is an even function.
[Fig. 4.3: Rotated rectangular constellation matched to the noise ellipse, with
point spacings V1 and V2 along the two axes.]

The orientation of the (rotated rectangular) constellations is determined by the
eigenvector matrices¹¹, cf. (4.22), and the rotations introduced by the Frequency
Domain Equalizer, whereas size and density are determined by the eigenvalues, cf.
(4.21), and the scalings introduced by the Frequency Domain Equalizer.
Theorem 4.2 The angles of the noise rotations at the output of the DFT,

    φ(k) = (π/N) k,  k = 1, . . . , N/2 − 1,    (4.25)

are independent of the mean μ_z and of the autocorrelation function R_z(n) of the
noise at the output of the Time Domain Equalizer (TDE).
To get a better picture of the deviations from common quadratic QAM constellations for the (ideal) rotated rectangular constellations, we have to look at the
relative differences

    d(k) = ( λ₂(k) − λ₁(k) ) / ( λ₁(k) + λ₂(k) ) = ( λ₂(k) − λ₁(k) ) / ( σ_a²(k) + σ_b²(k) ),    (4.26)

¹¹These matrices are rotation matrices and describe the noise rotations.
[Fig. 4.4: Relative differences d(k) over the subcarrier index k for q = 2, q = 70,
and q = 254. The noise is zero mean filtered white Gaussian noise with variance 1.]
which can be viewed as a measure for the eccentricity of the noise ellipses and in
turn for the nonsquareness of the rotated rectangular constellations. Note that the
scaling factor that is introduced by the Frequency Domain Equalizer is canceled
out in the normalized expression (4.26).
From (4.21), we can obtain the traces

    λ₂(k) + λ₁(k) = R_z(0) + 2 Σ_{s=1}^{N−1} R_z(s) (1 − s/N) cos( (2π/N)ks ).

Simulation Results
Simulation Results
From Figure 4.4, one can see that significant differences between ideal rotated
rectangular constellations and the quadratic QAM constellations used may occur.
The noise is zero mean filtered white Gaussian noise with filter impulse response12
¹²This is an artificially constructed example that merely shows the existence of extreme cases.
Note that if d(k) = 1 for a certain frequency k, the (optimum) rotated rectangular constellation at
this frequency k collapses into a constellation whose symbols are lying on a straight line.
(q is a fixed parameter)

    g_q(n) = 1 − cos( (2π/N)n ) cos( (2π/N)qn )  for n = 0, . . . , N−1,
    g_q(n) = 0  otherwise.    (4.27)

Note that 1 is the maximum value d(k) can take (in this case: λ₁(k) = 0 or
λ₂(k) = 0).
Asymptotic Analysis

In the following, we assume that the random noise process at the input of the receiver is zero mean and has an autocorrelation function

    R_z(n) = 1  for n = 0,  1/2  for n = ±1,  0  otherwise.    (4.28)
The corresponding relative (eigenvalue) difference at frequency N/2 − 1 is given by

    d(N/2 − 1) = 1 / ( N − (N − 1) cos( 2π/N ) ),    (4.29)

which tends to 1 as N → ∞, since N − (N − 1) cos(2π/N) → 1.
For the capacity loss calculation, we specialize the channel model of Section 2.3
to the diagonal channel matrix

    A = diag( H( e^{j(2π/N)·1} ), . . . , H( e^{j(2π/N)(N/2−1)} ) ),    (4.30)

H(z) being the Z-transform of the impulse response h. It remains to specify the
covariance matrix and the pseudocovariance matrix of the complex noise vector
n. An inspection of (4.3), (4.5), (4.6) and (4.12) shows that this has been already
done in Section 4.1. However, these formulas are too complicated to serve as a
base for an analytical expression for the capacity loss. Instead, we will use an
approximation that only takes into account dependencies and power differences
of real and imaginary part at the individual frequencies and neglects cross-correlations
between different frequencies (subcarriers). Of course, the random noise vector
is assumed to be Gaussian distributed. Using (the notation of) (4.14) or (4.19),
we can write the covariance matrix of the equivalent real valued noise vector ñ of
doubled dimension N − 2 as
    C_ñ = [ diag( σ_a²(1), . . . , σ_a²(N/2 − 1) )      diag( σ_{a,b}(1), . . . , σ_{a,b}(N/2 − 1) ) ;
            diag( σ_{a,b}(1), . . . , σ_{a,b}(N/2 − 1) )  diag( σ_b²(1), . . . , σ_b²(N/2 − 1) ) ],    (4.31)

which corresponds, for the complex noise vector n, to

    ℜ{C_n} = diag( σ_a²(1) + σ_b²(1), . . . , σ_a²(N/2 − 1) + σ_b²(N/2 − 1) ),  ℑ{C_n} = 0,
    ℜ{P_n} = diag( σ_a²(1) − σ_b²(1), . . . , σ_a²(N/2 − 1) − σ_b²(N/2 − 1) ),
    ℑ{P_n} = 2 diag( σ_{a,b}(1), . . . , σ_{a,b}(N/2 − 1) ).    (4.32)
In terms of the eigenvalues and rotation angles, cf. (4.20)-(4.22),

    C_n = diag( λ₁(1) + λ₂(1), . . . , λ₁(N/2 − 1) + λ₂(N/2 − 1) ),
    P_n = diag( (λ₁(1) − λ₂(1)) e^{j(2π/N)·1}, . . . , (λ₁(N/2 − 1) − λ₂(N/2 − 1)) e^{j(2π/N)(N/2−1)} ),    (4.33)

such that we are now in the position to specialize the results of Subsection 3.2.3.
With the generalized Cholesky factor of C_n,

    B_n = diag( √(λ₁(1) + λ₂(1)), . . . , √(λ₁(N/2 − 1) + λ₂(N/2 − 1)) ),    (4.34)
we calculate
\[ B_n^{-1}A = \operatorname{diag}\left(\frac{H\big(e^{j\frac{2\pi}{N}1}\big)}{\sqrt{\lambda_1(1)+\lambda_2(1)}},\ \dots,\ \frac{H\big(e^{j\frac{2\pi}{N}(\frac{N}{2}-1)}\big)}{\sqrt{\lambda_1(\frac{N}{2}-1)+\lambda_2(\frac{N}{2}-1)}}\right) = Q\,\Sigma\,V^H, \tag{4.35} \]
where \( Q = \operatorname{diag}\big(e^{j\arg H(e^{j\frac{2\pi}{N}1})}, \dots, e^{j\arg H(e^{j\frac{2\pi}{N}(\frac{N}{2}-1)})}\big) \) collects the phase factors, \( \Sigma = \operatorname{diag}\Big(\frac{|H(e^{j\frac{2\pi}{N}1})|}{\sqrt{\lambda_1(1)+\lambda_2(1)}}, \dots, \frac{|H(e^{j\frac{2\pi}{N}(\frac{N}{2}-1)})|}{\sqrt{\lambda_1(\frac{N}{2}-1)+\lambda_2(\frac{N}{2}-1)}}\Big) \) collects the magnitudes, and \( V^H = Q^H Q \).
\[ C_{y_{\max}} = L\,\operatorname{diag}\Big(\big|H\big(e^{j\frac{2\pi}{N}1}\big)\big|^2,\ \dots,\ \big|H\big(e^{j\frac{2\pi}{N}(\frac{N}{2}-1)}\big)\big|^2\Big), \tag{4.36} \]
L being the Water Level, and, furthermore,
\[ B_{y_{\max}} = \sqrt{L}\,\operatorname{diag}\Big(\big|H\big(e^{j\frac{2\pi}{N}1}\big)\big|,\ \dots,\ \big|H\big(e^{j\frac{2\pi}{N}(\frac{N}{2}-1)}\big)\big|\Big), \tag{4.37} \]
such that
\[ B_{y_{\max}}^{-1} P_n B_{y_{\max}}^{-T} = \frac{1}{L}\operatorname{diag}\left(e^{-j\frac{2\pi}{N}1}\,\frac{\lambda_1(1)-\lambda_2(1)}{\big|H\big(e^{j\frac{2\pi}{N}1}\big)\big|^2},\ \dots,\ e^{-j\frac{2\pi}{N}(\frac{N}{2}-1)}\,\frac{\lambda_1(\frac{N}{2}-1)-\lambda_2(\frac{N}{2}-1)}{\big|H\big(e^{j\frac{2\pi}{N}(\frac{N}{2}-1)}\big)\big|^2}\right). \tag{4.38} \]
Fig. 4.5: Normalized capacity loss ΔC/C_{Rot Rect} [%] in terms of loop length [km] (with respect to the capacity C_{Rot Rect} that assumes that correlations and variance differences of real and imaginary part of the noise are utilized), cf. also Section B.1 in the Appendix.
\[ \lambda_k = \frac{\lambda_1(k)-\lambda_2(k)}{L\,\big|H\big(e^{j\frac{2\pi}{N}k}\big)\big|^2}, \qquad k = 1, \dots, \frac{N}{2}-1. \tag{4.39} \]
Inserting the λ_k into (3.100),¹³ we finally end up with the following (approximate) expression for the capacity loss [bit / channel use],
\[ \Delta C \approx -\frac{1}{2(N+p)}\sum_{k=1}^{\frac{N}{2}-1}\log_2\!\left(1-\frac{\big(\lambda_1(k)-\lambda_2(k)\big)^2}{L^2\,\big|H\big(e^{j\frac{2\pi}{N}k}\big)\big|^4}\right), \tag{4.40} \]
where p denotes the length of the Cyclic Prefix. Note that the division by N + p originates from the serial-to-parallel conversion, the conjugate complex extension, and the addition of the Cyclic Prefix (compare also with (2.1)).
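As a rough numerical illustration of (4.39) and (4.40), the following sketch evaluates the capacity-loss approximation for an entirely synthetic eigenvalue profile and channel magnitude; all parameter values below are assumptions, not measured ADSL data:

```python
import numpy as np

# Capacity-loss approximation (4.40) for toy eigenvalue profiles and channel.
N, p = 512, 32
k = np.arange(1, N // 2)
H2 = 1.0 / (1.0 + (k / 100.0) ** 2)              # assumed |H(e^{j 2 pi k / N})|^2
lam1 = 1e-3 * (1.2 + np.cos(2 * np.pi * k / N))  # assumed larger noise eigenvalue
lam2 = 0.1 * lam1                                # strongly rotationally variant noise
L = 1.0                                          # water level (assumed)
eps = (lam1 - lam2) / (L * H2)                   # the lambda_k of eq. (4.39)
dC = -np.sum(np.log2(1.0 - eps ** 2)) / (2 * (N + p))   # eq. (4.40), bit/channel use
assert 0 < dC < 1                                # a small but positive capacity loss
```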
We performed some simulations in order to demonstrate the capacity loss. We used the parameters of a real ADSL scenario, i.e., a DFT length of N = 512, a subcarrier spacing of 4312.5 Hz, and a transmit power of 100 mW. The transfer functions of the loops were obtained by measurements of Austrian cables. The noise
¹³ Do not mix up the λ_k of Subsection 3.2.3 with λ_1(k) and λ_2(k) of Section 4.1.
t_1 V_1(k) + t_2 V_2(k):
\[ S_i(k) = \frac{2}{M_i(k)}\sum_{m=1}^{M_i(k)/2}(2m-1)^2\big(V_i(k)\big)^2 = \frac{2\big(V_i(k)\big)^2}{M_i(k)}\sum_{m=1}^{M_i(k)/2}\big(4m^2-4m+1\big) = \frac{1}{3}\big(V_i(k)\big)^2\Big(\big(M_i(k)\big)^2-1\Big), \tag{4.42} \]
where \( \sum_{m=1}^{M} m = \frac{M(M+1)}{2} \) and \( \sum_{m=1}^{M} m^2 = \frac{M(M+1)(2M+1)}{6} \) were used. If the number of signal points M_i(k) is not too small, we can approximate the obtained
signal power by
\[ S_i(k) \approx \frac{1}{3}\big(V_i(k)\big)^2\big(M_i(k)\big)^2. \tag{4.43} \]
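The closed form (4.42) and the approximation (4.43) are easy to verify by brute force; the amplitude V and the number of levels M below are arbitrary assumed values:

```python
import numpy as np

# Verify S_i = (1/3) V^2 (M^2 - 1) for an M-level axis with
# levels +/-V, +/-3V, ..., +/-(M-1)V (V and M are assumed values).
V, M = 0.7, 8
levels = V * (2 * np.arange(1, M // 2 + 1) - 1)          # V, 3V, ..., (M-1)V
S_bruteforce = np.mean(np.concatenate([levels, -levels]) ** 2)
S_formula = (V ** 2) * (M ** 2 - 1) / 3                  # eq. (4.42)
S_approx = (V ** 2) * M ** 2 / 3                         # eq. (4.43)
assert np.isclose(S_bruteforce, S_formula)
assert abs(S_approx - S_formula) / S_formula < 0.02      # good already for M = 8
```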
Next, we assume that a certain conventional bit-loading algorithm assigns a square QAM constellation to the k-th subcarrier. This implies that the gain factors and the numbers of signal points in real and imaginary part direction, respectively, are equal, see (4.41), i.e.,
\[ S_i(k) = \frac{S_{all}(k)}{2} \approx \frac{1}{3}\big(V(k)\big)^2 M_{all}(k), \tag{4.44} \]
the approximation being good, provided that the number of signal points M_{all}(k) is
not too small. At the input of the decision device, the eigenvalues of the covariance matrices of real and imaginary parts of the noise at the individual frequencies are given by (see also (2.16))
\[ \lambda_1(k) = \big|H\big(e^{j\frac{2\pi}{N}k}\big)\big|^{-2}\,\bar\lambda_1(k), \qquad \lambda_2(k) = \big|H\big(e^{j\frac{2\pi}{N}k}\big)\big|^{-2}\,\bar\lambda_2(k), \tag{4.45} \]
where \( \bar\lambda_1(k), \bar\lambda_2(k) \) denote the corresponding eigenvalues at the output of the DFT.
In the following, we will assume λ_1(k) > λ_2(k) without loss of generality. In the presence of Gaussian noise, the symbol error probabilities can be approximated by (see Figure 4.6)
\[ P_{Square\ QAM}(k) \approx 2Q\!\left(\sqrt{\frac{(V(k))^2}{\lambda_1(k)}}\right) + 2Q\!\left(\sqrt{\frac{(V(k))^2}{\lambda_2(k)}}\right) - 4Q\!\left(\sqrt{\frac{(V(k))^2}{\lambda_1(k)}}\right) Q\!\left(\sqrt{\frac{(V(k))^2}{\lambda_2(k)}}\right) \approx 2Q\!\left(\sqrt{\frac{(V(k))^2}{\lambda_1(k)}}\right), \tag{4.46} \]
where the Q-function is defined as
\[ Q(x) = \frac{1}{\sqrt{2\pi}}\int_x^{\infty} e^{-\frac{t^2}{2}}\,dt. \tag{4.47} \]
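The approximations (4.46) and the bounds (4.48) can be illustrated numerically; Q is expressed through the complementary error function, Q(x) = erfc(x/√2)/2, and the values of (V(k))², λ_1(k), λ_2(k) are assumptions:

```python
import math

def Q(x):
    # Gaussian tail, eq. (4.47): Q(x) = (1/sqrt(2*pi)) * int_x^inf exp(-t^2/2) dt
    return 0.5 * math.erfc(x / math.sqrt(2.0))

V2, lam1, lam2 = 1.0, 0.30, 0.02        # (V(k))^2 and eigenvalues, lam1 > lam2 (assumed)
a1 = math.sqrt(V2 / lam1)
a2 = math.sqrt(V2 / lam2)
P_exact = 2 * Q(a1) + 2 * Q(a2) - 4 * Q(a1) * Q(a2)   # eq. (4.46), first form
P_approx = 2 * Q(a1)                                  # dominant-term approximation
lower = 2 * Q(math.sqrt(2 * V2 / lam1))               # rotation-independent bounds,
upper = 2 * Q(math.sqrt(V2 / (2 * lam1)))             # cf. eq. (4.48)
assert lower < P_approx < upper
assert abs(P_exact - P_approx) < 1e-6                 # second term is negligible here
```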
Fig. 4.6: Decision region of an interior point in a square QAM constellation (bounded by solid line) and lower and upper bound areas (bounded by dashed lines). The edges of these lower and upper bound areas are chosen to be parallel to the eigenvectors of the noise and are margined by the inscribed circle and circumcircle of the square decision region. This is the reason for the factors 1/√2 and √2.
Fig. 4.7: The function Q(x).
Note that the first approximation in (4.46) originates from the fact that only interior points of the QAM constellation are considered and that the noise rotations are not taken into account. Observe that the noise rotations at the input of the decision device (in contrast to the noise rotations at the output of the DFT) are not only determined by Theorem 4.2: they also depend on arg{H(e^{j\frac{2\pi}{N}k})}. The second approximation is due to the fact that Q is a strictly monotonically decreasing function having values close to zero for the arguments under consideration, cf. also Figure 4.7, or, to say it differently, the expression on the right-hand side of (4.46) is dominated by the first term. We can see from Figure 4.6 that the decision region always contains, and is contained in, a square area whose edges are parallel to the eigenvectors of the noise. This enables us to determine a lower and an upper bound for the symbol error probability which are valid for all possible rotation angles, i.e.,
\[ 2Q\!\left(\sqrt{\frac{2\,(V(k))^2}{\lambda_1(k)}}\right) \;\lesssim\; P_{Square\ QAM}(k) \;\lesssim\; 2Q\!\left(\sqrt{\frac{(V(k))^2}{2\,\lambda_1(k)}}\right). \tag{4.48} \]
Again, these inequalities are only approximations, as indicated by the special "less than or approximately equal" symbol, because boundary points of the QAM constellation are not incorporated in the formulas and some Q-function expressions are approximated by the zero value. But we want to emphasize that these approximations are quite good and do not affect the validity of the inequalities significantly.
Inserting (4.44), we can summarize the results obtained for the symbol error probability for a square QAM constellation under the presence of rotationally variant noise as follows,
\[ P_{Square\ QAM}(k) \approx 2Q\!\left(\sqrt{\frac{3\,c\,S_{all}(k)}{2\,M_{all}(k)\,\lambda_1(k)}}\right), \tag{4.49} \]
where c = 1 is an approximation which neglects the noise rotations (but not the power differences), and c = 2 and c = 1/2 yield lower and upper bounds, respectively, which are valid for all possible rotation angles.
We continue our analysis by considering (optimized) rotated rectangular constellations. First of all, observe that we can restrict ourselves to the case where the noise eigenvectors are parallel to the constellation edges, because the rotation angles are a priori known from Theorem 4.2 and the additive contribution of arg{H(e^{j\frac{2\pi}{N}k})}. Note that there is no additional estimation effort for these parameters, because H(e^{j\frac{2\pi}{N}k}) has to be estimated anyway in order to guarantee that the Frequency Domain Equalizer works properly. We will calculate the reduced symbol error probability that occurs if one applies an optimized (with respect to symbol error probability) rotated rectangular constellation which has the same number M_{all}(k) of signal points, i.e., it serves the same data rate, and uses the same signal power S_{all}(k) (compare with (2.27) or (2.28)).
Similarly to the square QAM case, the symbol error probabilities (for the following analysis we also refer to [13]) can be approximated by
\[ P_{Rot\ Rect}(k) \approx 2Q\!\left(\sqrt{\frac{(V_1(k))^2}{\lambda_1(k)}}\right) + 2Q\!\left(\sqrt{\frac{(V_2(k))^2}{\lambda_2(k)}}\right) \approx 2Q\!\left(\sqrt{\frac{3S_1(k)}{(M_1(k))^2\lambda_1(k)}}\right) + 2Q\!\left(\sqrt{\frac{3S_2(k)}{(M_2(k))^2\lambda_2(k)}}\right), \tag{4.50} \]
where (4.43) was inserted. As already mentioned, the approximations are valid for all rotation angles, and neglect only boundary points of the constellation and some small values of the Q-function. According to (4.43), we also assume that the number of signal points is not too small.
In order to determine the optimum constellation parameters, we have to minimize the function
\[ F\big(S_1(k), S_2(k), M_1(k), M_2(k)\big) = 2Q\!\left(\sqrt{\frac{3S_1(k)}{(M_1(k))^2\lambda_1(k)}}\right) + 2Q\!\left(\sqrt{\frac{3S_2(k)}{(M_2(k))^2\lambda_2(k)}}\right) \tag{4.51} \]
under the side constraints
\[ S_1(k) + S_2(k) = S_{all}(k), \qquad M_1(k)\,M_2(k) = M_{all}(k), \qquad S_i(k) \ge 0, \ M_i(k) \ \text{even}, \quad i = 1, 2. \tag{4.52} \]
Defining
\[ f(x, y) = 2Q\big(\sqrt{xy}\big) + 2Q\!\left(\sqrt{(x_0 - x)\,\frac{y_0}{y}}\right), \tag{4.53} \]
we have
\[ F\Big(S_1(k),\ S_{all}(k) - S_1(k),\ M_1(k),\ \frac{M_{all}(k)}{M_1(k)}\Big) = f(x, y), \tag{4.54} \]
if
\[ x = S_1(k), \qquad y = \frac{3}{(M_1(k))^2\lambda_1(k)}, \qquad x_0 = S_{all}(k), \qquad y_0 = \frac{9}{(M_{all}(k))^2\lambda_1(k)\lambda_2(k)}, \tag{4.55} \]
such that we can find the minimum of F(·,·,·,·) (under the required side constraints) by minimizing f(·,·). Note that y is regarded as a continuous, real-valued parameter during the minimization process; this conflicts with the requirement that M_1(k) is an even number. But we will stick to this simplification and will fulfill the requirement by rounding to an even number after having found the continuous minimum. Using
\[ \frac{d}{dx}\,Q\big(\sqrt{x}\big) = -\frac{1}{2\sqrt{2\pi}}\,\frac{e^{-\frac{x}{2}}}{\sqrt{x}}, \tag{4.56} \]
we obtain
\[ f_x(x,y) = \frac{\partial}{\partial x}f(x,y) = \frac{1}{\sqrt{2\pi}}\left(\sqrt{\frac{y_0}{(x_0-x)\,y}}\; e^{-\frac{(x_0-x)y_0}{2y}} \;-\; \sqrt{\frac{y}{x}}\; e^{-\frac{xy}{2}}\right), \]
\[ f_y(x,y) = \frac{\partial}{\partial y}f(x,y) = \frac{1}{\sqrt{2\pi}}\left(\sqrt{\frac{(x_0-x)\,y_0}{y^3}}\; e^{-\frac{(x_0-x)y_0}{2y}} \;-\; \sqrt{\frac{x}{y}}\; e^{-\frac{xy}{2}}\right). \tag{4.57} \]
88
Setting
x f (x, y)
y f (x, y)
= fx (x, y) = 0,
(4.58)
= fy (x, y) = 0,
yields
q
xy
2
xy
xy
e 2
xy
which implies
(x0 x) yy0
y
(x0 x) y0
qe
(x0 x) yy0
e
y
(x0 x) y0
y0
,
y2
x0 x y0
,
x y2
x0 x
= 1,
x
(4.59)
(4.60)
and furthermore
\[ x = \frac{x_0}{2}. \tag{4.61} \]
Substituting this back into (4.59), we obtain the following equation for y,
\[ \frac{y_0}{y}\, e^{-\frac{x_0}{2}\frac{y_0}{y}} = y\, e^{-\frac{x_0}{2}y}, \tag{4.62} \]
with the solution
\[ y = \sqrt{y_0}. \tag{4.63} \]
Note that it can happen that this solution of (4.62) is not the unique solution. The question whether it is unique or not depends on the parameters x_0 and y_0. To see this, observe that (4.62) can be written as
\[ g_{x_0}\!\left(\frac{y_0}{y}\right) = g_{x_0}(y) \tag{4.64} \]
with the function
\[ g_{x_0}(y) = y\, e^{-\frac{x_0}{2}y}, \tag{4.65} \]
which is not one-to-one (also called not injective) in general. However, we will not consider other possible solutions. Instead, we will show that the solution
\[ (x, y) = \left(\frac{x_0}{2}, \sqrt{y_0}\right) \tag{4.66} \]
is indeed a relative minimum.
The second-order partial derivatives at this point are obtained as
\[ f_{xx}\!\left(\frac{x_0}{2}, \sqrt{y_0}\right) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{x_0\sqrt{y_0}}{4}} \left(\frac{2}{x_0}\right)^{\!\frac{3}{2}} y_0^{\frac{1}{4}} \left(\frac{x_0\sqrt{y_0}}{2} + 1\right), \]
\[ f_{xy}\!\left(\frac{x_0}{2}, \sqrt{y_0}\right) = f_{yx}\!\left(\frac{x_0}{2}, \sqrt{y_0}\right) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{x_0\sqrt{y_0}}{4}} \sqrt{\frac{2}{x_0\sqrt{y_0}}} \left(\frac{x_0\sqrt{y_0}}{2} - 1\right), \]
\[ f_{yy}\!\left(\frac{x_0}{2}, \sqrt{y_0}\right) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{x_0\sqrt{y_0}}{4}} \sqrt{\frac{x_0}{2}}\; y_0^{-\frac{3}{4}} \left(\frac{x_0\sqrt{y_0}}{2} - 1\right), \tag{4.67} \]
such that the determinant of the Hessian S(x_0/2, √y_0) is obtained as
\[ \det S\!\left(\frac{x_0}{2}, \sqrt{y_0}\right) = \frac{1}{\pi}\, e^{-\frac{x_0\sqrt{y_0}}{2}}\, \frac{2}{x_0\sqrt{y_0}} \left(\frac{x_0\sqrt{y_0}}{2} - 1\right). \tag{4.68} \]
It is a well-known result of Differential Calculus, see e.g. [7, 8], that S(x_0/2, √y_0) being positive definite is sufficient for (x, y) = (x_0/2, √y_0) being a relative minimum of f(x, y). It is also shown in [7, 8] that a positive definite S(x_0/2, √y_0) is equivalent to
\[ f_{xx}\!\left(\frac{x_0}{2}, \sqrt{y_0}\right) > 0 \tag{4.69} \]
and
\[ \det S\!\left(\frac{x_0}{2}, \sqrt{y_0}\right) > 0. \tag{4.70} \]
Comparing (4.69) with (4.67), we conclude that the first condition (4.69) is always fulfilled. According to (4.68), (4.70) is equivalent to
\[ \frac{x_0}{2}\sqrt{y_0} > 1, \tag{4.71} \]
i.e., in the original parameters,
\[ \frac{3\,S_{all}(k)}{2\,M_{all}(k)\sqrt{\lambda_1(k)\lambda_2(k)}} > 1. \tag{4.72} \]
It is clear that this condition is usually fulfilled in practice, cf. (4.75) and Figure 4.7. Note that we are operating in a range (especially in wireline transmission) where the (symbol) error probabilities are small.
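A quick numerical sketch supports the conclusion that (x, y) = (x_0/2, √y_0) is a relative minimum of f whenever (x_0/2)√y_0 > 1; the parameter values below are assumptions:

```python
import math

def Qf(x):
    # Gaussian tail, eq. (4.47)
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def f(x, y, x0, y0):
    # eq. (4.53): f(x, y) = 2Q(sqrt(x*y)) + 2Q(sqrt((x0 - x)*y0/y))
    return 2 * Qf(math.sqrt(x * y)) + 2 * Qf(math.sqrt((x0 - x) * y0 / y))

x0, y0 = 10.0, 4.0                    # assumed values with (x0/2)*sqrt(y0) = 10 > 1
xs, ys = x0 / 2, math.sqrt(y0)        # candidate minimum, eqs. (4.61) and (4.63)
f0 = f(xs, ys, x0, y0)
# The candidate should not be beaten by small perturbations in any direction:
for dx in (-0.1, 0.1):
    for dy in (-0.1, 0.1):
        assert f(xs + dx, ys + dy, x0, y0) >= f0
```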
We also want to emphasize that, from a practical point of view, the previous proof that the solution obtained is indeed a minimum could just as well be omitted, because we will show in the following that the symbol error probability is substantially reduced when using rotated rectangular constellations with the obtained parameters. This is obviously sufficient for application purposes. On the other hand, we have shown that rotated rectangular constellations with these parameters are really the optimum rotated rectangular constellations with respect to symbol error probability.
Substituting (4.55) back into (4.61) and (4.63) and making use of the side constraints (4.52), we finally come to the conclusion that a bit-loading algorithm that takes into account power differences and statistical dependencies between real and imaginary parts of the noise distributes the same signal power onto the two axes of the rotated rectangular constellations, i.e.,
\[ S_1(k) = S_2(k) = \frac{S_{all}(k)}{2}. \tag{4.73} \]
Furthermore, the optimum numbers of signal points in the two directions are derived as (applying (4.45): the eigenvalue ratio is the same at the output of the DFT and at the input of the decision device)
\[ M_1(k) = \sqrt{M_{all}(k)}\left(\frac{\lambda_2(k)}{\lambda_1(k)}\right)^{\!\frac{1}{4}}, \qquad M_2(k) = \sqrt{M_{all}(k)}\left(\frac{\lambda_1(k)}{\lambda_2(k)}\right)^{\!\frac{1}{4}}, \tag{4.74} \]
which, of course, have to be rounded to even numbers. Using (4.42), the gain factors V_i(k) are determined as well. The symbol error probabilities (4.50) are calculated as
\[ P_{Rot\ Rect}(k) \approx 4Q\!\left(\sqrt{\frac{3\,S_{all}(k)}{2\,M_{all}(k)\sqrt{\lambda_1(k)\lambda_2(k)}}}\right), \tag{4.75} \]
so that we obtain¹⁴ (approximate) SNR gains of
\[ G(k) = \frac{SNR_{Rot\ Rect}(k)}{SNR_{Square\ QAM}(k)} = \frac{1}{c}\sqrt{\frac{\lambda_1(k)}{\lambda_2(k)}}, \tag{4.76} \]
where c has the same meaning as in equation (4.49) and is explained below equation (4.49). Finally, we can express the gains in terms of the relative (eigenvalue) differences, cf. (4.26), as
\[ G(k) \approx \frac{1}{c}\sqrt{\frac{1 + d(k)}{1 - d(k)}}. \tag{4.77} \]
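The equivalence of (4.76) and (4.77) follows directly from the definition of d(k); a short numerical sketch with assumed eigenvalues:

```python
import math

# SNR gain of the optimized rotated rectangular constellation over square QAM
# for an illustrative eigenvalue pair (lam1, lam2 are assumed values).
lam1, lam2 = 0.30, 0.02
d = (lam1 - lam2) / (lam1 + lam2)              # relative eigenvalue difference
c = 1.0                                        # neglect rotations, as in (4.49)
G_from_lams = (1 / c) * math.sqrt(lam1 / lam2)              # eq. (4.76)
G_from_d = (1 / c) * math.sqrt((1 + d) / (1 - d))           # eq. (4.77)
assert math.isclose(G_from_lams, G_from_d)
G_dB = 10 * math.log10(G_from_lams)            # gain expressed in dB
```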
We want to emphasize that in our example of Section 4.2, cf. also Section B.1
in the Appendix, the modulus of the relative differences is close to 1 for almost
all frequencies (see Figure 4.8), so that the overall SNR gain is very high, without
much influence from the value of c, cf. Figure 4.9.
14
SNR .
Fig. 4.8: Relative differences d(k). Real ADSL scenario with narrowband interference, cf. also Section B.1 in the Appendix.
Fig. 4.9: SNR gains G(k) [dB] (c = 1). Real ADSL scenario with narrowband interference, cf. also Section B.1 in the Appendix.
The SNR gains are independent of the channel transfer function and of the signal power, and thus of the loop length, which is not the case for the capacity loss. Furthermore, the previous example shows that the use of rotated rectangular constellations is much more effective in reducing the (uncoded) symbol error probability than in increasing capacity. Note that statements about capacity always assume an optimum coding strategy, which is not usually applicable in practice. For practical encoders/decoders, the overall gain will be somewhere in between. It depends on the ability of the code to use the safer transmission in one direction (corresponding to one eigenvector) to correct the more frequent errors in the other direction (corresponding to the other eigenvector). The effort required to adapt the coding strategy is much higher than for implementing rotated rectangular constellations.
\[ h(n) = 0, \qquad n < 0 \ \text{or}\ n > p. \tag{4.78} \]
In the following, we are interested in what happens if this constraint (4.78) holds only approximately. To analyze the effects occurring, we relax (4.78) to
\[ h(n) = 0, \qquad n < -l^- \ \text{or}\ n > l^+. \tag{4.79} \]
Again, the filter e (the TDE) will always be noncausal and therefore not implementable for practically occurring channel impulse responses g. Since a simple delay in the receiver solves this problem, we will stick to (4.79) for the sake of simplicity. We want to emphasize that this model together with a variable (adaptive) delay includes the case when the evaluation frame in the receiver is moved to a certain extent in order to maximize performance. Furthermore, the main parts of this section are not limited to wireline DMT transmission. The derivation and (most of) the results remain valid for wireless OFDM transmission as well.
Due to our new assumption (4.79), the input-output relationship (2.6) has to be modified, i.e.,
\[ u(n_0+n) = \sum_{k=-l^-}^{l^+} h(k)\,t(n_0+n-k) + z(n_0+n) \tag{4.80} \]
\[ = \sum_{k=0}^{p} h(k)\,t(n_0+n-k) + z(n_0+n) + \sum_{k=-l^-}^{-1} h(k)\,t(n_0+n-k) + \sum_{k=p+1}^{l^+} h(k)\,t(n_0+n-k). \]
Fig. 4.10: Frame structure: previous, present, and next DMT symbol, each consisting of the Cyclic Prefix (CP) and the data part; the vectors a_{n_0-(N+p)}, a_{n_0}, q_{n_0-(N+p)}, q_{n_0}, b_{n_0-(N+p)}, b_{n_0}, and the samples t(n_0-p), …, t(n_0+N-1) and u(n_0), …, u(n_0+N-1).
\[ d_{n_0} = \big(d_{n_0}(0),\ \dots,\ d_{n_0}(N-1)\big)^T \tag{4.81} \]
with
\[ d_{n_0}(m) = \frac{1}{\sqrt N}\sum_{n=0}^{N-1}\sum_{k=0}^{p} h(k)\,t(n_0+n-k)\,e^{-j\frac{2\pi}{N}nm} \tag{4.82} \]
\[ \quad + \frac{1}{\sqrt N}\sum_{n=0}^{N-1} z(n_0+n)\,e^{-j\frac{2\pi}{N}nm} \tag{4.83} \]
\[ \quad + \frac{1}{\sqrt N}\sum_{n=0}^{N-1}\sum_{k=-l^-}^{-1} h(k)\,t(n_0+n-k)\,e^{-j\frac{2\pi}{N}nm} \tag{4.84} \]
\[ \quad + \frac{1}{\sqrt N}\sum_{n=0}^{N-1}\sum_{k=p+1}^{l^+} h(k)\,t(n_0+n-k)\,e^{-j\frac{2\pi}{N}nm}, \tag{4.85} \]
with
\[ b_{n_0} = \big(u(n_0),\ u(n_0+1),\ \dots,\ u(n_0+N-1)\big)^T. \]
For an illustration of our nomenclature, we also refer to Figure 4.10. The first term (4.82) is the "good" term that describes the influence of that part of the impulse response h that does not exceed the length of the Cyclic Prefix; it has already been discussed in detail in Section 2.1. The second term (4.83) originates from the (background) noise and has been analyzed in Sections 4.1, 4.2 and 4.3. The third and fourth terms (4.84) and (4.85), respectively, result from the weakening of (4.78) to (4.79). In the following, we will consider these two expressions, which cause, as it is called, interference on the m-th subcarrier, i.e.,
\[ i^-_{n_0}(m) = \frac{1}{\sqrt N}\sum_{n=0}^{N-1}\sum_{k=-l^-}^{-1} h(k)\,t(n_0+n-k)\,e^{-j\frac{2\pi}{N}nm}, \tag{4.86} \]
\[ i^+_{n_0}(m) = \frac{1}{\sqrt N}\sum_{n=0}^{N-1}\sum_{k=p+1}^{l^+} h(k)\,t(n_0+n-k)\,e^{-j\frac{2\pi}{N}nm}, \tag{4.87} \]
m = 0, …, N-1.
First of all, observe that i^-_{n_0}(m) has contributions from t(n_0+1), …, t(n_0+N-1) (elements of q_{n_0}) and t(n_0+N), …, t(n_0+N+l^- -1) (elements of the following symbol(s)), whereas i^+_{n_0}(m) has contributions from t(n_0-l^+), …, t(n_0-p-1) (elements of …, q_{n_0-2(N+p)}, q_{n_0-(N+p)}) and t(n_0-p), …, t(n_0+N-p-2) (elements of q_{n_0}),
cf. also Figure 4.10. Hence, both terms depend on the present DMT symbol (also called frame). This effect is called intercarrier interference (ICI). The dependence on the preceding and following DMT symbols (frames) is called intersymbol interference (ISI). More specifically, i^-_{n_0}(m) contains precursors from the following (future) symbols and i^+_{n_0}(m) contains postcursors from the preceding (past) symbols.
Fig. 4.11: Summation area in (4.88). Note that the area structure is slightly different for l^+ ≥ N + p, since then l^+ - p - 1 ≥ N - 1, but the basic idea behind the interchange of the two sums remains the same.
We have
\[ i^+_{n_0}(m) = \frac{1}{\sqrt N}\sum_{n=0}^{N-1}\sum_{k=p+1}^{l^+} h(k)\,t(n_0+n-k)\,e^{-j\frac{2\pi}{N}nm} = \frac{1}{\sqrt N}\sum_{n=0}^{N-1}\ \sum_{l=p+1-n}^{l^+-n} h(l+n)\,t(n_0-l)\,e^{-j\frac{2\pi}{N}nm}, \tag{4.88} \]
where the index change l = k - n has been performed, such that the summation over k is replaced by a summation over l. Next, we interchange the two sums. Due to the dependence of the inner sum's limits on n, we have to investigate the summation over (n, l) in some more detail. Figure 4.11 shows the effective pairs that are used in the two sums. They are denoted by the areas A_1 and A_2. In addition to interchanging the two sums, we sum over area A_1, add the sum over areas A_2 + A_3, and then
subtract the sum over A_3, i.e.,
\[ i^+_{n_0}(m) = \frac{1}{\sqrt N}\left[\ \sum_{l=p+1}^{l^+}\ \sum_{n=0}^{l^+-l}\ (A_1) \;+\; \sum_{l=p+1-N}^{p}\ \sum_{n=p+1-l}^{l^+-l}\ (A_2+A_3) \;-\; \sum_{l=p+1-N}^{l^+-N}\ \sum_{n=N}^{l^+-l}\ (A_3)\right] h(l+n)\,t(n_0-l)\,e^{-j\frac{2\pi}{N}nm}. \tag{4.89} \]
Another index change, i.e., k = n + l, such that the summation over n is replaced by a summation over k, yields
\[ i^+_{n_0}(m) = \frac{1}{\sqrt N}\sum_{l=p+1}^{l^+} t(n_0-l)\,e^{j\frac{2\pi}{N}lm}\sum_{k=l}^{l^+} h(k)\,e^{-j\frac{2\pi}{N}km} \]
\[ \quad + \frac{1}{\sqrt N}\sum_{l=p+1-N}^{p} t(n_0-l)\,e^{j\frac{2\pi}{N}lm}\sum_{k=p+1}^{l^+} h(k)\,e^{-j\frac{2\pi}{N}km} \tag{4.90} \]
\[ \quad - \frac{1}{\sqrt N}\sum_{l=p+1}^{l^+} t(n_0+N-l)\,e^{j\frac{2\pi}{N}lm}\sum_{k=l}^{l^+} h(k)\,e^{-j\frac{2\pi}{N}km}, \]
where in the last term the summation index was additionally shifted by N, so that t(n_0-l) became t(n_0+N-l).
Replacing
\[ \frac{1}{\sqrt N}\sum_{l=p+1-N}^{p} t(n_0-l)\,e^{j\frac{2\pi}{N}lm} = \frac{1}{\sqrt N}\sum_{l=-p}^{N-p-1} t(n_0+l)\,e^{-j\frac{2\pi}{N}lm} \]
\[ = \frac{1}{\sqrt N}\sum_{l=0}^{N-p-1} t(n_0+l)\,e^{-j\frac{2\pi}{N}lm} + \frac{1}{\sqrt N}\sum_{l=-p}^{-1} t(n_0+l)\,e^{-j\frac{2\pi}{N}lm} \]
\[ = \frac{1}{\sqrt N}\sum_{l=0}^{N-p-1} t(n_0+l)\,e^{-j\frac{2\pi}{N}lm} + \frac{1}{\sqrt N}\sum_{l=-p}^{-1} t(n_0+N+l)\,e^{-j\frac{2\pi}{N}lm} \]
\[ = \frac{1}{\sqrt N}\sum_{l=0}^{N-p-1} t(n_0+l)\,e^{-j\frac{2\pi}{N}lm} + \frac{1}{\sqrt N}\sum_{l=N-p}^{N-1} t(n_0+l)\,e^{-j\frac{2\pi}{N}lm} \]
\[ = \frac{1}{\sqrt N}\sum_{l=0}^{N-1} t(n_0+l)\,e^{-j\frac{2\pi}{N}lm} = \frac{1}{\sqrt N}\sum_{l=0}^{N-1} a_{n_0}(l)\,e^{-j\frac{2\pi}{N}lm} = c_{n_0}(m), \tag{4.91} \]
where the Cyclic Prefix, t(n_0+l) = t(n_0+N+l), l = -p, …, -1, was utilized and the nomenclature of Section 2.1, in particular a_{n_0} = F^{-1}c_{n_0}, was taken into account (cf. also Figure 4.10), we obtain
\[ i^+_{n_0}(m) = c_{n_0}(m)\sum_{k=p+1}^{l^+} h(k)\,e^{-j\frac{2\pi}{N}km} \tag{4.92} \]
\[ \quad + \frac{1}{\sqrt N}\sum_{l=p+1}^{l^+} t(n_0-l)\,\underbrace{e^{j\frac{2\pi}{N}lm}\sum_{k=l}^{l^+} h(k)\,e^{-j\frac{2\pi}{N}km}}_{H_l^+\left(e^{j\frac{2\pi}{N}m}\right)} \tag{4.93} \]
\[ \quad - \frac{1}{\sqrt N}\sum_{l=p+1}^{l^+} t(n_0+N-l)\,\underbrace{e^{j\frac{2\pi}{N}lm}\sum_{k=l}^{l^+} h(k)\,e^{-j\frac{2\pi}{N}km}}_{H_l^+\left(e^{j\frac{2\pi}{N}m}\right)}, \tag{4.94} \]
where
\[ e^{j\frac{2\pi}{N}lm}\sum_{k=l}^{l^+} h(k)\,e^{-j\frac{2\pi}{N}km} = \sum_{k=0}^{l^+-l} h(k+l)\,e^{-j\frac{2\pi}{N}km} = \sum_{k=0}^{\infty} h(k+l)\,e^{-j\frac{2\pi}{N}km} = H_l^+\!\left(e^{j\frac{2\pi}{N}m}\right) \tag{4.95} \]
denotes the one-sided Z^+-transform of (the tail of) the impulse response h shifted to the left by the (nonnegative) integer number l,
\[ H_l^+(z) = \sum_{k=0}^{\infty} h(k+l)\,z^{-k}, \tag{4.96} \]
evaluated at e^{j\frac{2\pi}{N}m}. Note that the first term (4.92) is a contribution to the channel matrix D of (2.16), i.e., instead of
\[ \sum_{k=0}^{p} h(k)\,e^{-j\frac{2\pi}{N}km}, \]
we now have
\[ \sum_{k=0}^{l^+} h(k)\,e^{-j\frac{2\pi}{N}km} \tag{4.97} \]
as diagonal entries. The second term (4.93) causes intersymbol interference (ISI). To be more precise, it expresses how postcursors from previous symbols disturb the transmission. Finally, the third term (4.94) describes intercarrier interference (ICI), provided that l^+ is not too large. As long¹⁵ as l^+ ≤ p + N, only the present symbol has influence. Otherwise, additional ISI, again postcursors from preceding symbol(s), has to be accepted.
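The identity (4.95) behind the underbraced terms can be verified for a random finite impulse response; the sizes N, p, l^+ below are assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)
N, p, lplus = 16, 3, 9
h = rng.standard_normal(lplus + 1)        # h(0..l+), h(k) = 0 elsewhere
m = np.arange(N)
z = np.exp(1j * 2 * np.pi * m / N)        # e^{j 2 pi m / N} for all subcarriers
for l in range(p + 1, lplus + 1):
    # left-hand side: e^{j 2 pi l m / N} * sum_{k=l}^{l+} h(k) e^{-j 2 pi k m / N}
    lhs = z ** l * sum(h[k] * z ** (-k) for k in range(l, lplus + 1))
    # right-hand side: H_l^+(e^{j 2 pi m / N}) = sum_{k>=0} h(k+l) e^{-j 2 pi k m / N}
    Hl_plus = sum(h[k + l] * z ** (-k) for k in range(0, lplus + 1 - l))
    assert np.allclose(lhs, Hl_plus)
```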
We continue our analysis with
\[ i^-_{n_0}(m) = \frac{1}{\sqrt N}\sum_{n=0}^{N-1}\sum_{k=-l^-}^{-1} h(k)\,t(n_0+n-k)\,e^{-j\frac{2\pi}{N}nm} = \frac{1}{\sqrt N}\sum_{n=0}^{N-1}\ \sum_{l=-l^--n}^{-1-n} h(l+n)\,t(n_0-l)\,e^{-j\frac{2\pi}{N}nm}, \tag{4.98} \]
where the index change l = k - n has been performed, such that the summation over k is replaced by a summation over l. Next, we interchange the two sums. Due to the dependence of the inner sum's limits on n, we have to investigate the summation over (n, l) in some more detail. Figure 4.12 shows the effective pairs that are used in the two sums. They are denoted by the areas A_1 and A_2. In addition to interchanging the two sums, we sum over area A_1, add the sum over areas A_2 + A_3, and then subtract the sum over A_3, i.e.,
\[ i^-_{n_0}(m) = \frac{1}{\sqrt N}\left[\ \sum_{l=-l^--N+1}^{-N}\ \sum_{n=-l^--l}^{N-1}\ (A_1) \;+\; \sum_{l=-N+1}^{0}\ \sum_{n=-l^--l}^{-1-l}\ (A_2+A_3) \;-\; \sum_{l=-l^-+1}^{0}\ \sum_{n=-l^--l}^{-1}\ (A_3)\right] h(l+n)\,t(n_0-l)\,e^{-j\frac{2\pi}{N}nm}. \tag{4.99} \]
Again, another index change, i.e., k = n + l, such that the summation over n is replaced by a summation over k, yields
Fig. 4.12: Summation area in (4.98). Note that the area structure is slightly different for l^- ≥ N, but the basic idea behind the interchange of the two sums remains the same.
\[ i^-_{n_0}(m) = \frac{1}{\sqrt N}\sum_{l=0}^{l^--1} t(n_0+N+l)\,e^{-j\frac{2\pi}{N}lm}\sum_{k=-l^-}^{-1-l} h(k)\,e^{-j\frac{2\pi}{N}km} \tag{4.100} \]
\[ \quad + c_{n_0}(m)\sum_{k=-l^-}^{-1} h(k)\,e^{-j\frac{2\pi}{N}km} \]
\[ \quad - \frac{1}{\sqrt N}\sum_{l=0}^{l^--1} t(n_0+l)\,e^{-j\frac{2\pi}{N}lm}\sum_{k=-l^-}^{-1-l} h(k)\,e^{-j\frac{2\pi}{N}km}, \]
where again the nomenclature of Section 2.1 was used (cf. also Figure 4.10). Let
\[ e^{-j\frac{2\pi}{N}lm}\sum_{k=-l^-}^{-1-l} h(k)\,e^{-j\frac{2\pi}{N}km} = \sum_{k=-l^-+l}^{-1} h(k-l)\,e^{-j\frac{2\pi}{N}km} = \sum_{k=-\infty}^{-1} h(k-l)\,e^{-j\frac{2\pi}{N}km} = H_l^-\!\left(e^{j\frac{2\pi}{N}m}\right) \tag{4.101} \]
denote the one-sided Z^--transform of (the beginning of) the impulse response h shifted to the right by the (nonnegative) integer number l,
\[ H_l^-(z) = \sum_{k=-\infty}^{-1} h(k-l)\,z^{-k}, \tag{4.102} \]
evaluated at e^{j\frac{2\pi}{N}m}. Then,
\[ i^-_{n_0}(m) = c_{n_0}(m)\sum_{k=-l^-}^{-1} h(k)\,e^{-j\frac{2\pi}{N}km} \tag{4.103} \]
\[ \quad + \frac{1}{\sqrt N}\sum_{l=0}^{l^--1} t(n_0+N+l)\,H_l^-\!\left(e^{j\frac{2\pi}{N}m}\right) \tag{4.104} \]
\[ \quad - \frac{1}{\sqrt N}\sum_{l=0}^{l^--1} t(n_0+l)\,H_l^-\!\left(e^{j\frac{2\pi}{N}m}\right). \tag{4.105} \]
Again, the first term (4.103) is a contribution to the channel matrix D of (2.16), i.e., instead of
\[ \sum_{k=0}^{l^+} h(k)\,e^{-j\frac{2\pi}{N}km}, \]
we now have
\[ \sum_{k=-l^-}^{l^+} h(k)\,e^{-j\frac{2\pi}{N}km} \tag{4.106} \]
as diagonal entries. The Z-transform notation
\[ H(z) = \sum_{k=-\infty}^{\infty} h(k)\,z^{-k} = \sum_{k=-l^-}^{l^+} h(k)\,z^{-k} \tag{4.107} \]
of the channel matrix D is thus unaffected; only
its entries have changed. The second term (4.104) causes intersymbol interference (ISI). To be more precise, it expresses how precursors from following (future) symbols disturb the transmission. Finally, the third term (4.105) describes intercarrier interference (ICI), provided that l^- is not too large. As long¹⁶ as l^- ≤ N, only the present symbol has influence. Otherwise, additional ISI, again precursors from following symbol(s), has to be accepted.
Finally, we can express the overall interference on the m-th subcarrier introduced by (4.79), including both the + and the - contribution, as
\[ i_{n_0}(m) = \frac{1}{\sqrt N}\sum_{l=p+1}^{l^+} t(n_0-l)\,H_l^+\!\left(e^{j\frac{2\pi}{N}m}\right) \qquad \text{(postcursor ISI from previous symbol(s))} \]
\[ \quad - \frac{1}{\sqrt N}\sum_{l=p+1}^{l^+} t(n_0+N-l)\,H_l^+\!\left(e^{j\frac{2\pi}{N}m}\right) \qquad \text{(ICI, if } l^+ \le p+N\text{)} \tag{4.108} \]
\[ \quad - \frac{1}{\sqrt N}\sum_{l=0}^{l^--1} t(n_0+l)\,H_l^-\!\left(e^{j\frac{2\pi}{N}m}\right) \qquad \text{(ICI, if } l^- \le N\text{)} \]
\[ \quad + \frac{1}{\sqrt N}\sum_{l=0}^{l^--1} t(n_0+N+l)\,H_l^-\!\left(e^{j\frac{2\pi}{N}m}\right), \qquad \text{(precursor ISI from next symbol)} \]
m = 0, …, N-1.
We want to emphasize again that a too long impulse response according to (4.79) also changes the values of the channel matrix D, but not its Z-transform notation.
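A central step of the derivation above was the Cyclic Prefix identity (4.91); the following sketch verifies it against the DFT for random frame data (the sizes N and p are assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
N, p = 16, 4
a = rng.standard_normal(N)                  # frame a_{n0}(l) = t(n0 + l)
t = np.concatenate([a[-p:], a])             # prepend Cyclic Prefix: t(n0-p .. n0+N-1)
# c_{n0}(m) = (1/sqrt(N)) * sum_l a(l) e^{-j 2 pi l m / N}  (unitary DFT output)
c = np.fft.fft(a) / np.sqrt(N)
# Left-hand side of (4.91): (1/sqrt(N)) * sum_{l=p+1-N}^{p} t(n0 - l) e^{j 2 pi l m / N}
m = np.arange(N)
lhs = np.zeros(N, dtype=complex)
for l in range(p + 1 - N, p + 1):
    # t(n0 - l): its index relative to t[0] = t(n0 - p) is p - l
    lhs += t[p - l] * np.exp(1j * 2 * np.pi * l * m / N)
lhs /= np.sqrt(N)
assert np.allclose(lhs, c)
```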
definition, identical to other elements. Note that two identical random variables are never uncorrelated. It is obvious that the interference is zero-mean as well.
Assuming¹⁹
\[ l^+ + l^- \le N, \tag{4.109} \]
we have, according to (4.108),
\[ \mathrm{E}\big\{i_{n_0}(n)\,i_{n_0}^*(m)\big\} = \frac{\sigma_t^2}{N}\sum_{l=p+1}^{l^+} H_l^+\!\big(e^{j\frac{2\pi}{N}n}\big)\Big(H_l^+\!\big(e^{j\frac{2\pi}{N}m}\big)\Big)^{\!*} \quad \text{(ISI)} \]
\[ \quad + \frac{\sigma_t^2}{N}\sum_{l=p+1}^{l^+} H_l^+\!\big(e^{j\frac{2\pi}{N}n}\big)\Big(H_l^+\!\big(e^{j\frac{2\pi}{N}m}\big)\Big)^{\!*} \quad \text{(ICI)} \]
\[ \quad + \frac{\sigma_t^2}{N}\sum_{l=0}^{l^--1} H_l^-\!\big(e^{j\frac{2\pi}{N}n}\big)\Big(H_l^-\!\big(e^{j\frac{2\pi}{N}m}\big)\Big)^{\!*} \quad \text{(ICI)} \]
\[ \quad + \frac{\sigma_t^2}{N}\sum_{l=0}^{l^--1} H_l^-\!\big(e^{j\frac{2\pi}{N}n}\big)\Big(H_l^-\!\big(e^{j\frac{2\pi}{N}m}\big)\Big)^{\!*} \quad \text{(ISI)} \]
and
\[ \mathrm{E}\big\{i_{n_0}(n)\,i_{n_0}(m)\big\} = \frac{\sigma_t^2}{N}\sum_{l=p+1}^{l^+} H_l^+\!\big(e^{j\frac{2\pi}{N}n}\big)\,H_l^+\!\big(e^{j\frac{2\pi}{N}m}\big) \quad \text{(ISI)} \]
\[ \quad + \frac{\sigma_t^2}{N}\sum_{l=p+1}^{l^+} H_l^+\!\big(e^{j\frac{2\pi}{N}n}\big)\,H_l^+\!\big(e^{j\frac{2\pi}{N}m}\big) \quad \text{(ICI)} \]
\[ \quad + \frac{\sigma_t^2}{N}\sum_{l=0}^{l^--1} H_l^-\!\big(e^{j\frac{2\pi}{N}n}\big)\,H_l^-\!\big(e^{j\frac{2\pi}{N}m}\big) \quad \text{(ICI)} \]
\[ \quad + \frac{\sigma_t^2}{N}\sum_{l=0}^{l^--1} H_l^-\!\big(e^{j\frac{2\pi}{N}n}\big)\,H_l^-\!\big(e^{j\frac{2\pi}{N}m}\big) \quad \text{(ISI)}, \]
such that we can write the covariance matrix and the pseudocovariance matrix of the interference vector
\[ i_{n_0} = \big(i_{n_0}(0),\ \dots,\ i_{n_0}(N-1)\big)^T \]
as²⁰
\[ C_i = \mathrm{E}\{i_{n_0} i_{n_0}^H\} = C_{i_{ISI}} + C_{i_{ICI}}, \tag{4.110} \]
¹⁹ For wireline transmission and typical FFT lengths, even this is usually true.
²⁰ There is no dependence on n_0 anymore.
where C_{i_{ISI}} and C_{i_{ICI}} denote the covariance matrices of ISI and ICI, respectively, satisfying
\[ C_{i_{ISI}} = C_{i_{ICI}}, \tag{4.111} \]
and having entries
\[ C_{i_{ISI}}(n, m) = C_{i_{ICI}}(n, m) = \frac{\sigma_t^2}{N}\sum_{l=p+1}^{l^+} H_l^+\!\big(e^{j\frac{2\pi}{N}n}\big)\Big(H_l^+\!\big(e^{j\frac{2\pi}{N}m}\big)\Big)^{\!*} + \frac{\sigma_t^2}{N}\sum_{l=0}^{l^--1} H_l^-\!\big(e^{j\frac{2\pi}{N}n}\big)\Big(H_l^-\!\big(e^{j\frac{2\pi}{N}m}\big)\Big)^{\!*}, \tag{4.112} \]
and, similarly,
\[ P_i = \mathrm{E}\{i_{n_0} i_{n_0}^T\} = P_{i_{ISI}} + P_{i_{ICI}}, \tag{4.113} \]
where P_{i_{ISI}} and P_{i_{ICI}} denote the pseudocovariance matrices of ISI and ICI, respectively, satisfying
\[ P_{i_{ISI}} = P_{i_{ICI}}, \tag{4.114} \]
and having entries
\[ P_{i_{ISI}}(n, m) = P_{i_{ICI}}(n, m) = \frac{\sigma_t^2}{N}\sum_{l=p+1}^{l^+} H_l^+\!\big(e^{j\frac{2\pi}{N}n}\big)\,H_l^+\!\big(e^{j\frac{2\pi}{N}m}\big) + \frac{\sigma_t^2}{N}\sum_{l=0}^{l^--1} H_l^-\!\big(e^{j\frac{2\pi}{N}n}\big)\,H_l^-\!\big(e^{j\frac{2\pi}{N}m}\big). \tag{4.115} \]
Note the important result that ISI and ICI have the same statistics with respect to first- and second-order moments, and that both interference mechanisms are uncorrelated under the chosen assumptions, so that the statistics of the overall interference are obtained by a simple addition (or by multiplying one interference contribution by a factor of 2).
Furthermore, observe that intersymbol and intercarrier interference are rotationally variant in general, so that we can expect further performance improvements by using rotated rectangular constellations. As in Subsection 4.1.4, the computation of the rotation angles and of the other constellation parameters requires eigenvalue decompositions of the covariance matrices of real and imaginary part of the interference at the individual frequencies (subcarriers). Applying Theorem 3.17 to the individual elements of the interference vector,²¹ we can immediately express the rotation angles and the other parameters in terms of the diagonal elements of C_{i_{ISI}}, C_{i_{ICI}}, P_{i_{ISI}}, and P_{i_{ICI}}.
²¹ Again no dependence on n_0.
We also want to emphasize that if we are interested in the optimum constellation parameters for the overall system, considering both noise and interference, we have to apply Theorem 3.17 to the individual elements of the sum vector of noise and interference, which yields expressions depending on the diagonal elements of covariance and pseudocovariance matrix of this sum vector,
\[ C_{n+i} = C_n + C_{i_{ISI}} + C_{i_{ICI}}, \tag{4.116} \]
\[ P_{n+i} = P_n + P_{i_{ISI}} + P_{i_{ICI}}. \tag{4.117} \]
Here, E(z) denotes the Z-transform of e, the impulse response of the TDE, and G(z) denotes the Z-transform of the channel g, see Figure 2.2.
The noise covariance and pseudocovariance matrices are given by, cf. (4.33),
\[ C_n = \operatorname{diag}\Big(\sigma_a^2(0),\ \lambda_1(1)+\lambda_2(1),\ \dots,\ \lambda_1(\tfrac{N}{2}-1)+\lambda_2(\tfrac{N}{2}-1),\ \sigma_a^2(\tfrac{N}{2})\Big), \]
\[ P_n = \operatorname{diag}\Big(\sigma_a^2(0),\ e^{-j\frac{2\pi}{N}1}\big(\lambda_1(1)-\lambda_2(1)\big),\ \dots,\ e^{-j\frac{2\pi}{N}(\frac{N}{2}-1)}\big(\lambda_1(\tfrac{N}{2}-1)-\lambda_2(\tfrac{N}{2}-1)\big),\ \sigma_a^2(\tfrac{N}{2})\Big), \tag{4.118} \]
whose entries are dependent on the mean μ_z and on the autocorrelation function R_z of the noise at the output of the TDE via (4.17) and (4.21). This mean μ_z and autocorrelation function R_z in turn depend on the TDE according to
\[ \mu_z = \mu_s \sum_{k=-\infty}^{\infty} e(k), \tag{4.119} \]
\[ R_z = e' * e * R_s, \tag{4.120} \]
with²²
\[ e'(n) = e^*(-n) = e(-n), \qquad n \in \mathbb{Z}, \]
where μ_s denotes the mean²³ and R_s the autocorrelation function of the noise at the input of the receiver, respectively. Note that we have modeled the noise at the input of the receiver, s = [s(n)]_{n=-\infty,\dots,+\infty}, as a discrete-time, real-valued (due to baseband signalling), wide-sense stationary (not necessarily Gaussian) random process.
Finally, ISI and ICI have covariance and pseudocovariance matrices according to (4.112) and (4.115), whose entries are dependent on the impulse response e of
the TDE via
\[ H_l^+(z) = \sum_{k=-\infty}^{\infty} e(k)\,G_{l-k}^+(z), \qquad H_l^-(z) = \sum_{k=-\infty}^{\infty} e(k)\,G_{l+k}^-(z), \tag{4.121} \]
where G_l^+(z) and G_l^-(z) are defined as in (4.96) and (4.102) applied to the channel impulse response g, respectively. Note that l^+ and l^- need not be known beforehand, because (4.112) and (4.115) remain valid if these two parameters are chosen to be large, as long as (4.109) is satisfied.
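Relation (4.121) can be verified numerically for toy impulse responses g and e; all values below are assumptions:

```python
import numpy as np

rng = np.random.default_rng(2)
g = rng.standard_normal(6)     # channel impulse response g(0..5) (toy values)
e = rng.standard_normal(3)     # TDE impulse response e(0..2) (toy values)
h = np.convolve(g, e)          # overall impulse response h = g * e, h(0..7)

def plus_transform(x, l, z):
    # one-sided transform sum_{k>=0} x(k+l) z^{-k}, cf. (4.96), finite support
    return sum(x[k + l] * z ** (-k) for k in range(0, len(x) - l))

z = np.exp(1j * 2 * np.pi * 3 / 16)   # an arbitrary point on the unit circle
l = 4
lhs = plus_transform(h, l, z)                                     # H_l^+(z)
rhs = sum(e[k] * plus_transform(g, l - k, z) for k in range(len(e)))
assert np.isclose(lhs, rhs)           # H_l^+(z) = sum_k e(k) G^+_{l-k}(z)
```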
If we want to use capacity as an overall performance measure (similarly as in [22]), we can plug these matrices into the formulas of Section 3.2, and, concluding that capacity is a function of the filter coefficients of the TDE, we can maximize this function with respect to these parameters. Note that our analytical results provide us with an explicit relation between the capacity and e, so that, e.g., a conventional numerical maximization algorithm can be used to solve this maximization problem. Also, for the design of (practical) low-complexity TDE algorithms, the results obtained can be very useful.
At this point we will stop our considerations about the (design of the) Time
Domain Equalizer. We admit that this is possibly unsatisfactory for the reader
who is also interested in quantitative results. However, such results depend on
the chosen TDE algorithms, and in turn on the chosen design methods. A full
and meaningful analysis would need a lot of additional work and can be regarded
as a separate topic. This is beyond the scope of this manuscript. However, we
developed all analytical tools that are required for this research area and showed
that the utilization of the pseudocovariance matrix plays an important role in the
design and analysis of Time Domain Equalizers.
In Section 2.3, we came to the conclusion that we are dealing with a channel model of the form
\[ y = Ax + n, \tag{5.1} \]
where y ∈ ℂ^r and x ∈ ℂ^t denote the received and transmitted vectors, respectively, A is the channel matrix, and n ∈ ℂ^r is the noise vector. In order to obtain capacity results, cf. Section 3.2, and to develop efficient transmission schemes, it is not sufficient to know the channel matrix A. It is also necessary to know the statistical properties of the noise vector n. To be more precise, we need the covariance matrix C_n = E{nn^H} and the pseudocovariance matrix P_n = E{nn^T}.
For the MIMO DMT system, this means that we have to calculate covariance and
pseudocovariance matrix of the vector nn0 in (2.25). This is done in the first section of this chapter. We extend the results of the previous chapter to a very general
noise model at the input of the receivers, allowing correlations between the noise
signals of different receivers. Again, it will turn out that the noise is rotationally
variant in general.
In the same section, we deal with the problem of a Cyclic Prefix that is too
short. It was already mentioned in Section 2.2 that the design of Time Domain
Equalizers is critical in the MIMO case, since overdetermined problems have to
be solved. We will generalize the results of Section 4.4 and obtain closed form
formulas for the MIMO interference (not to be mixed up with crosstalk) and study
their statistical properties. We will show that the interference (in the cable bundle
case) is rotationally variant as well.
Finally in this section, we will show how the obtained noise and interference
results can be used to design a MIMO Time Domain Equalizer.
In the second section of this chapter, we present the general form of a transmission scheme that can cope with channels of the form of (5.1). It is based on so-called joint processing functions and allows the use of conventional Single-Input/Single-Output (SISO) codes.
The third section deals with transmission schemes whose joint processing functions are based on the Singular Value Decomposition (SVD) of the channel matrix, cf. also [36, 53, 54]. We will show that we can obtain the optimum joint processing functions by means of the SVD. Furthermore, we study low(er)-complexity variants and discuss their performance. To obtain quantitative results, we perform simulations with realistic (practically used) parameters and compare the various methods.
The final section presents the UP MIMO1 scheme, a scheme that was originally
designed by the author, see [2, 37], for wireless transmission, and that also has
applications in wireline transmission. Specifically, it can be used to reduce the
computational complexity at the transmitter side (but not at the receiver side). We
will treat various aspects of this scheme.
where k = 1, . . . , K denotes the receiver number. We will assume in the following that these signals are modeled as discretetime, real valued (due to baseband signalling), (pairwise) jointly widesense stationary (not necessarily Gaussian) random processes with given means2 and crosscorrelation (autocorrelation)
functions3 [34],
n
o
(5.2)
shki (n) = E shki (n) = shki ,
n
o
and n, m = , . . . , .
Note that this model incorporates dependencies between different (k1 ≠ k2) noise signals s^⟨k1⟩ and s^⟨k2⟩ via the cross-correlation function R_{s^⟨k1⟩,s^⟨k2⟩}. Consider, e.g., the case when there is one dominant noise source in a cable bundle that influences all of its loops. Of course, there is also the possibility that the disturbances are independent of each other. Or one can think of hybrid situations, where a noise source only influences some nearby loops but not loops that are far away. Our model includes all cases mentioned, so that we can assume that (5.2) covers a wide range of possible (colored) noise environments in a cable bundle.
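The way the TDE filters reshape these noise correlations, cf. the convolution relation (5.4) below, can be checked numerically. The following sketch uses hypothetical, freely chosen filter taps and correlation values (none of them from the text) and compares the convolution form e^⟨1⟩ ∗ ē^⟨2⟩ ∗ R_s against the defining double sum:

```python
import numpy as np

# Hypothetical TDE impulse responses and noise cross-correlation values.
e1 = np.array([1.0, -0.5, 0.25])          # e^<1>(n), n = 0, 1, 2
e2 = np.array([0.8, 0.3])                 # e^<2>(n), n = 0, 1
Rs = np.array([0.1, 0.4, 1.0, 0.4, 0.1])  # R_s(m), m = -2, ..., 2

def Rs_at(m):
    # R_s with finite support, zero outside m = -2..2
    return Rs[m + 2] if -2 <= m <= 2 else 0.0

def Rz(tau):
    # Direct double sum: R_z(tau) = sum_a sum_b e1(a) e2(b) R_s(tau - a + b)
    return sum(e1[a] * e2[b] * Rs_at(tau - a + b)
               for a in range(len(e1)) for b in range(len(e2)))

# Convolution form: e1 * (time-reversed e2) * R_s; np.convolve output at
# index i corresponds to lag i - (len(e2) - 1) - 2.
rhs = np.convolve(np.convolve(e1, e2[::-1]), Rs)
offset = -(len(e2) - 1) - 2
for i, val in enumerate(rhs):
    assert abs(Rz(i + offset) - val) < 1e-12
```

Both computations agree at every lag, which is exactly the content of the convolution relation for filtered wide-sense stationary processes.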
are also discrete-time, real-valued, (pairwise) jointly wide-sense stationary random processes with means and cross-correlation (autocorrelation) functions,

μ_{z^⟨k⟩} = μ_{s^⟨k⟩} Σ_{n=−∞}^{∞} e^⟨k⟩(n),    R_{z^⟨k1⟩,z^⟨k2⟩} = e^⟨k1⟩ ∗ ē^⟨k2⟩ ∗ R_{s^⟨k1⟩,s^⟨k2⟩},    k, k1, k2 = 1, . . . , K,    (5.3)

with the time-reversed impulse responses

ē^⟨k⟩(n) = e^⟨k⟩(−n),    n ∈ ℤ.    (5.4)

Accordingly, the covariance matrix C_{n_{n0}} and the pseudocovariance matrix P_{n_{n0}} of the stacked noise vector n_{n0} at the outputs of the DFTs    (5.5)
are composed of the blocks

C_{w^⟨k1⟩,w^⟨k2⟩},  P_{w^⟨k1⟩,w^⟨k2⟩},    k1, k2 = 1, . . . , K,    (5.6)
where w_{n0}^⟨k⟩ = F v_{n0}^⟨k⟩ denote the noise vectors at the outputs of the DFTs, while

v_{n0}^⟨k⟩ = [ z^⟨k⟩(n0), . . . , z^⟨k⟩(n0 + N − 1) ]^T ∈ ℝ^N
denote the noise vectors at the inputs of the DFTs. We want to emphasize that this approach describes C_{n_{n0}} and P_{n_{n0}} as a matrix of matrices according to (5.6). Similarly to Section 4.1, the l-th elements of the mean vectors (which do not depend on n0) are calculated as
μ_{w^⟨k⟩}(l) = √N μ_{z^⟨k⟩} for l = 0, and μ_{w^⟨k⟩}(l) = 0 for l = 1, . . . , N − 1,    (5.7)
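Relation (5.7) is easy to confirm with a unitary DFT matrix: replacing the random frame by its (constant) mean, the DFT output has a single nonzero entry √N μ at subcarrier 0. A minimal sketch, with hypothetical values for N and μ:

```python
import numpy as np

N, mu = 8, 0.7
# Unitary DFT matrix F(l, n) = exp(-2j*pi*l*n/N) / sqrt(N).
F = np.exp(-2j * np.pi * np.outer(np.arange(N), np.arange(N)) / N) / np.sqrt(N)
# Mean vector of the DFT output when the input frame equals its mean mu.
w_mean = F @ np.full(N, mu)
assert abs(w_mean[0] - np.sqrt(N) * mu) < 1e-12   # sqrt(N)*mu at l = 0
assert np.all(np.abs(w_mean[1:]) < 1e-12)         # zero at l = 1, ..., N-1
```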
and we can write the (l1, l2)-th elements of (5.6), again with no dependence on n0, as

C_{w^⟨k1⟩,w^⟨k2⟩}(l1, l2) = Q_{w^⟨k1⟩,w^⟨k2⟩}(l1, −l2) − μ_{w^⟨k1⟩}(l1) μ_{w^⟨k2⟩}(l2),    (5.8)
P_{w^⟨k1⟩,w^⟨k2⟩}(l1, l2) = Q_{w^⟨k1⟩,w^⟨k2⟩}(l1, l2) − μ_{w^⟨k1⟩}(l1) μ_{w^⟨k2⟩}(l2),    (5.9)
l1, l2 = 0, . . . , N − 1,

with

Q_{w^⟨k1⟩,w^⟨k2⟩}(l1, l2) = E{ w_{n0}^⟨k1⟩(l1) w_{n0}^⟨k2⟩(l2) }    (5.10)
= (1/N) Σ_{n1=0}^{N−1} Σ_{n2=0}^{N−1} E{ z^⟨k1⟩(n0 + n1) z^⟨k2⟩(n0 + n2) } e^{−j(2π/N)(n1 l1 + n2 l2)}
= (1/N) Σ_{n1=0}^{N−1} Σ_{n2=0}^{N−1} R_{z^⟨k1⟩,z^⟨k2⟩}(n1 − n2) e^{−j(2π/N)(n1 l1 + n2 l2)}.
The next step is to simplify the expression for Q_{w^⟨k1⟩,w^⟨k2⟩}(l1, l2). Again, the idea is to reorder the terms of the double sum, so that only one sum remains (after some calculations). We have

Q_{w^⟨k1⟩,w^⟨k2⟩}(l1, l2) = (1/N) Σ_{n1=0}^{N−1} Σ_{n2=0}^{N−1} R_{z^⟨k1⟩,z^⟨k2⟩}(n1 − n2) e^{−j(2π/N)(n1 l1 + n2 l2)}    (5.11)
= (1/N) Σ_{n1=0}^{N−1} Σ_{s=n1+1−N}^{n1} R_{z^⟨k1⟩,z^⟨k2⟩}(s) e^{−j(2π/N)(n1(l1+l2) − s l2)},
where the index change s = n1 − n2 has been performed, such that the summation over n2 is replaced by a summation over s. Next, we interchange the two sums. Due to the dependence of the inner summation limits on n1, we have to investigate the summation over (n1, s) in some more detail. Figure 5.1 shows the effective pairs that are used in the two sums. They are denoted by the areas A1, A2, and A3. Hence,
Q_{w^⟨k1⟩,w^⟨k2⟩}(l1, l2) = (1/N) Σ_{n1=0}^{N−1} R_{z^⟨k1⟩,z^⟨k2⟩}(0) e^{−j(2π/N) n1 (l1+l2)}    (A1)    (5.12)
+ (1/N) Σ_{s=1}^{N−1} Σ_{n1=s}^{N−1} R_{z^⟨k1⟩,z^⟨k2⟩}(s) e^{−j(2π/N)(n1 l1 + n1 l2 − s l2)}    (A2)
+ (1/N) Σ_{s=1−N}^{−1} Σ_{n1=0}^{s+N−1} R_{z^⟨k1⟩,z^⟨k2⟩}(s) e^{−j(2π/N)(n1 l1 + n1 l2 − s l2)}    (A3)
Fig. 5.1: Effective index pairs (n1, s), n1 = 0, . . . , N − 1, of the double sum in (5.12), partitioned into the areas A1 (s = 0), A2 (1 ≤ s ≤ n1, below the line s = n1), and A3 (n1 + 1 − N ≤ s ≤ −1, above the line s = n1 + 1 − N, equivalently n1 ≤ s + N − 1).
and, furthermore,

Q_{w^⟨k1⟩,w^⟨k2⟩}(l1, l2) = (1/N) Σ_{n1=0}^{N−1} R_{z^⟨k1⟩,z^⟨k2⟩}(0) e^{−j(2π/N) n1 (l1+l2)}    (5.13)
+ (1/N) Σ_{s=1}^{N−1} Σ_{t=0}^{N−1−s} R_{z^⟨k1⟩,z^⟨k2⟩}(s) e^{−j(2π/N)(t l1 + s l1 + t l2)}
+ (1/N) Σ_{s=1}^{N−1} Σ_{n1=0}^{N−1−s} R_{z^⟨k1⟩,z^⟨k2⟩}(−s) e^{−j(2π/N)(n1 l1 + n1 l2 + s l2)},
where the index change t = n1 − s has been performed for term (A2) in (5.12), such that its summation over n1 is replaced by a summation over t, and in term (A3) the summation variable s has been replaced by −s. Renaming n1 as t in (A3) and factoring out the terms that do not depend on t, we obtain
Q_{w^⟨k1⟩,w^⟨k2⟩}(l1, l2) = (1/N) R_{z^⟨k1⟩,z^⟨k2⟩}(0) Σ_{t=0}^{N−1} e^{−j(2π/N) t (l1+l2)}    (5.14)
+ (1/N) Σ_{s=1}^{N−1} R_{z^⟨k1⟩,z^⟨k2⟩}(s) Σ_{t=0}^{N−1−s} e^{−j(2π/N)(s l1 + t l1 + t l2)}
+ (1/N) Σ_{s=1}^{N−1} R_{z^⟨k1⟩,z^⟨k2⟩}(−s) Σ_{t=0}^{N−1−s} e^{−j(2π/N)(s l2 + t l1 + t l2)}.
Using the well-known geometric sum formula

Σ_{t=0}^{M−1} a^t = M for a = 1, and Σ_{t=0}^{M−1} a^t = (1 − a^M)/(1 − a) for a ≠ 1,    (5.15)

we obtain
Q_{w^⟨k1⟩,w^⟨k2⟩}(l1, l2) = R_{z^⟨k1⟩,z^⟨k2⟩}(0)
+ Σ_{s=1}^{N−1} (1 − s/N) [ R_{z^⟨k1⟩,z^⟨k2⟩}(s) e^{−j(2π/N) s l1} + R_{z^⟨k1⟩,z^⟨k2⟩}(−s) e^{j(2π/N) s l1} ],    (5.16)
for l1 + l2 = 0 or l1 + l2 = N,

and

Q_{w^⟨k1⟩,w^⟨k2⟩}(l1, l2) = ( e^{j(2π/2N)(l1+l2)} / ( 2jN sin( (2π/2N)(l1+l2) ) ) )
× Σ_{s=1}^{N−1} [ R_{z^⟨k1⟩,z^⟨k2⟩}(s) ( e^{−j(2π/N) s l1} − e^{j(2π/N) s l2} ) + R_{z^⟨k1⟩,z^⟨k2⟩}(−s) ( e^{−j(2π/N) s l2} − e^{j(2π/N) s l1} ) ],    (5.17)
for l1 + l2 ≠ 0 and l1 + l2 ≠ N.

To verify these expressions, suppose first that

l1 + l2 = 0 or l1 + l2 = N.    (5.18)
Then

Σ_{t=0}^{N−1} e^{−j(2π/N) t (l1+l2)} = N,

Σ_{t=0}^{N−1−s} e^{−j(2π/N)(s l1 + t l1 + t l2)} = e^{−j(2π/N) s l1} Σ_{t=0}^{N−1−s} e^{−j(2π/N) t (l1+l2)} = (N − s) e^{−j(2π/N) s l1},

Σ_{t=0}^{N−1−s} e^{−j(2π/N)(s l2 + t l1 + t l2)} = e^{−j(2π/N) s l2} Σ_{t=0}^{N−1−s} e^{−j(2π/N) t (l1+l2)} = (N − s) e^{−j(2π/N) s l2} = (N − s) e^{j(2π/N) s l1},

where in the last equation assumption (5.18) was used twice. Applied to (5.14), this yields (5.16). Suppose now we have

l1 + l2 ≠ 0 and l1 + l2 ≠ N.    (5.19)
Then, using (5.15),

Σ_{t=0}^{N−1} e^{−j(2π/N) t (l1+l2)} = ( 1 − e^{−j(2π/N) N (l1+l2)} ) / ( 1 − e^{−j(2π/N)(l1+l2)} ) = 0,

Σ_{t=0}^{N−1−s} e^{−j(2π/N)(s l1 + t l1 + t l2)} = e^{−j(2π/N) s l1} Σ_{t=0}^{N−1−s} e^{−j(2π/N) t (l1+l2)}
= e^{−j(2π/N) s l1} ( 1 − e^{−j(2π/N)(N−s)(l1+l2)} ) / ( 1 − e^{−j(2π/N)(l1+l2)} )
= e^{−j(2π/N) s l1} ( 1 − e^{j(2π/N) s (l1+l2)} ) / ( 1 − e^{−j(2π/N)(l1+l2)} )
= ( e^{−j(2π/N) s l1} − e^{j(2π/N) s l2} ) / ( 1 − e^{−j(2π/N)(l1+l2)} ),

and, similarly,

Σ_{t=0}^{N−1−s} e^{−j(2π/N)(s l2 + t l1 + t l2)} = e^{−j(2π/N) s l2} Σ_{t=0}^{N−1−s} e^{−j(2π/N) t (l1+l2)}
= ( e^{−j(2π/N) s l2} − e^{j(2π/N) s l1} ) / ( 1 − e^{−j(2π/N)(l1+l2)} ),
and, furthermore,

1 / ( 1 − e^{−j(2π/N)(l1+l2)} ) = e^{j(2π/2N)(l1+l2)} / ( e^{j(2π/2N)(l1+l2)} − e^{−j(2π/2N)(l1+l2)} )
= e^{j(2π/2N)(l1+l2)} / ( 2j sin( (2π/2N)(l1 + l2) ) ).

Applied to (5.14), these identities yield (5.17).
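The closed forms (5.16) and (5.17) can be checked numerically against the defining double sum (5.10). The sketch below uses arbitrary, hypothetical correlation values R(s) (no symmetry assumed) and compares both computations for all subcarrier pairs (l1, l2):

```python
import numpy as np

N = 8
rng = np.random.default_rng(0)
# Hypothetical correlation values R(s), s = -(N-1), ..., N-1.
R = {s: rng.standard_normal() for s in range(-(N - 1), N)}

def Q_direct(l1, l2):
    # Defining double sum, cf. (5.10)
    return sum(R[n1 - n2] * np.exp(-2j * np.pi * (n1 * l1 + n2 * l2) / N)
               for n1 in range(N) for n2 in range(N)) / N

def Q_closed(l1, l2):
    # Closed forms, cf. (5.16) and (5.17)
    if (l1 + l2) % N == 0:
        return R[0] + sum((1 - s / N) * (R[s] * np.exp(-2j * np.pi * s * l1 / N)
                                         + R[-s] * np.exp(2j * np.pi * s * l1 / N))
                          for s in range(1, N))
    phi = np.pi * (l1 + l2) / N
    pref = np.exp(1j * phi) / (2j * N * np.sin(phi))
    return pref * sum(
        R[s] * (np.exp(-2j * np.pi * s * l1 / N) - np.exp(2j * np.pi * s * l2 / N))
        + R[-s] * (np.exp(-2j * np.pi * s * l2 / N) - np.exp(2j * np.pi * s * l1 / N))
        for s in range(1, N))

for l1 in range(N):
    for l2 in range(N):
        assert abs(Q_direct(l1, l2) - Q_closed(l1, l2)) < 1e-9
```

The two branches of `Q_closed` correspond exactly to the case distinction (5.18) / (5.19).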
As already mentioned, (5.16) and (5.17) fully specify the covariance and the pseudocovariance matrix of n_{n0} in (2.25), and we have found the desired solution. However, if we are not interested in correlations between different frequencies (subcarriers), we can approximate the covariance and the pseudocovariance matrix by block diagonal matrices (an extension of the results obtained in Subsection 4.1.3 to the MIMO case), which is also in line with the block diagonal structure, see (2.25), of H_MIMO-DMT, i.e.,
C_{n_{n0}} ≈ C_{n,MIMO-DMT} = diag{ C_n(0), . . . , C_n(N/2) } ∈ ℂ^{((N/2)+1)K × ((N/2)+1)K},

P_{n_{n0}} ≈ P_{n,MIMO-DMT} = diag{ P_n(0), . . . , P_n(N/2) } ∈ ℂ^{((N/2)+1)K × ((N/2)+1)K},    (5.20)

with

C_n(l), P_n(l) ∈ ℂ^{K×K}.
Specializing (5.8) and (5.9) to l = l1 = l2, while applying (5.7), (5.16) and (5.17), we obtain the elements of C_n(l) and P_n(l) for the two real-valued subcarriers as

C_n^(0)(k1, k2) = P_n^(0)(k1, k2) = R_{z^⟨k1⟩,z^⟨k2⟩}(0) + Σ_{s=1}^{N−1} (1 − s/N) [ R_{z^⟨k1⟩,z^⟨k2⟩}(s) + R_{z^⟨k1⟩,z^⟨k2⟩}(−s) ] − N μ_{z^⟨k1⟩} μ_{z^⟨k2⟩},    (5.21)

C_n^(N/2)(k1, k2) = P_n^(N/2)(k1, k2) = R_{z^⟨k1⟩,z^⟨k2⟩}(0) + Σ_{s=1}^{N−1} (1 − s/N) (−1)^s [ R_{z^⟨k1⟩,z^⟨k2⟩}(s) + R_{z^⟨k1⟩,z^⟨k2⟩}(−s) ],
and for the complex-valued subcarriers l = 1, . . . , N/2 − 1 as

C_n^(l)(k1, k2) = R_{z^⟨k1⟩,z^⟨k2⟩}(0) + Σ_{s=1}^{N−1} (1 − s/N) [ R_{z^⟨k1⟩,z^⟨k2⟩}(s) e^{−j(2π/N) s l} + R_{z^⟨k1⟩,z^⟨k2⟩}(−s) e^{j(2π/N) s l} ],    (5.22)

P_n^(l)(k1, k2) = − ( e^{j(2π/N) l} / ( N sin( (2π/N) l ) ) ) Σ_{s=1}^{N−1} sin( (2π/N) s l ) [ R_{z^⟨k1⟩,z^⟨k2⟩}(s) + R_{z^⟨k1⟩,z^⟨k2⟩}(−s) ].
5.1.2 Interference
In this subsection, it is our goal to translate the results of Section 4.4 about inter-symbol and inter-carrier interference to the MIMO case. It was already mentioned in Section 2.2 that the TDEs have the task of shortening all (k, m = 1, . . . , K) channel impulse responses g^⟨km⟩, so that the resulting impulse responses have lengths shorter than or equal to p + 1, p being the length of the Cyclic Prefixes. In other words, the e^⟨k⟩ are chosen such that h^⟨km⟩ = e^⟨k⟩ ∗ g^⟨km⟩ satisfy

h^⟨km⟩(n) = 0,    n < 0 or n > p.    (5.23)
Note that the calculation of the TDE coefficients is a nontrivial problem, since the impulse response of the k-th TDE, e^⟨k⟩, has to shorten all g^⟨km⟩, m = 1, . . . , K, simultaneously; this can therefore be an overdetermined problem, which can then only be solved in an approximate sense. In order to analyze the occurring effects, we relax (5.23), similarly to Section 4.4, to

h^⟨km⟩(n) = 0,    n < −l^⟨km⟩− or n > p + l^⟨km⟩+.    (5.24)
Applying the results of the same section, we conclude that the notation of the block diagonal H_MIMO-DMT with its blocks H(n), using the Z-transforms H^⟨km⟩(z) of h^⟨km⟩, see (2.25), does not change, whereas the values of its entries change according to (5.24). Let t^⟨m⟩ denote the m-th transmit signal, such that we have at the output of the k-th TDE, cf. (2.22),

u^⟨k⟩(n) = Σ_{m=1}^{K} ( h^⟨km⟩ ∗ t^⟨m⟩ )(n) + z^⟨k⟩(n),    k = 1, . . . , K.    (5.25)
It is an immediate consequence of (4.108) that we can express the overall interference on the l-th subcarrier at the output of the k-th DFT (in the k-th receiver) as

i_{n0}^⟨k⟩(l) = (1/√N) Σ_{m=1}^{K} [ Σ_{n=p+1}^{p+l^⟨km⟩+} t^⟨m⟩(n0 − n) H_n^⟨km⟩+ e^{−j(2π/N) l n}    (ISI)
+ Σ_{n=p+1}^{p+l^⟨km⟩+} t^⟨m⟩(n0 + N − n) H_n^⟨km⟩+ e^{−j(2π/N) l n}    (ICI, if l^⟨km⟩+ ≤ p + N)
+ Σ_{n=0}^{l^⟨km⟩−−1} t^⟨m⟩(n0 + n) H_n^⟨km⟩− e^{−j(2π/N) l n}    (ICI, if l^⟨km⟩− ≤ N)
+ Σ_{n=0}^{l^⟨km⟩−−1} t^⟨m⟩(n0 + N + n) H_n^⟨km⟩− e^{−j(2π/N) l n} ],    (ISI)
k = 1, . . . , K,    (5.26)

where H_n^⟨km⟩+ and H_n^⟨km⟩− are defined as in (4.96) and (4.102) applied to the impulse responses h^⟨km⟩, respectively.
In the following, we are interested in the statistics of this interference. As usual, the m-th transmit signal t^⟨m⟩ is modeled as a discrete-time, real-valued (due to baseband signalling) random process. Extending the assumptions of Section 4.4 (and still maintaining their simplicity), we will assume that all elements of these processes are zero-mean and pairwise uncorrelated for different time instants (also across different processes), except of course for the elements of the Cyclic Prefixes, which are, by definition, identical to other elements. For fixed time indices n, the variances and correlations of the elements of the K transmit signals are given by the covariances

σ_{t^⟨m1⟩,t^⟨m2⟩} = E{ t^⟨m1⟩(n) t^⟨m2⟩(n) },    m1, m2 = 1, . . . , K,

which do not depend on these time indices n (i.e., they are the same for all time instants n). It is obvious that the interference is zero-mean as well. Assuming

l^⟨km⟩+ ≤ N+ and l^⟨km⟩− ≤ N− with N+ + N− = N,    (5.27)

we have, according to (5.26),

E{ i_{n0}^⟨k1⟩(l1) ( i_{n0}^⟨k2⟩(l2) )* } =
(1/N) Σ_{m1=1}^{K} Σ_{m2=1}^{K} σ_{t^⟨m1⟩,t^⟨m2⟩} Σ_{n=p+1}^{N+} H_n^⟨k1m1⟩+ e^{−j(2π/N) l1 n} ( H_n^⟨k2m2⟩+ )* e^{j(2π/N) l2 n}    (ISI)
+ (1/N) Σ_{m1=1}^{K} Σ_{m2=1}^{K} σ_{t^⟨m1⟩,t^⟨m2⟩} Σ_{n=p+1}^{N+} H_n^⟨k1m1⟩+ e^{−j(2π/N) l1 n} ( H_n^⟨k2m2⟩+ )* e^{j(2π/N) l2 n}    (ICI)
+ (1/N) Σ_{m1=1}^{K} Σ_{m2=1}^{K} σ_{t^⟨m1⟩,t^⟨m2⟩} Σ_{n=0}^{N−−1} H_n^⟨k1m1⟩− e^{−j(2π/N) l1 n} ( H_n^⟨k2m2⟩− )* e^{j(2π/N) l2 n}    (ICI)
+ (1/N) Σ_{m1=1}^{K} Σ_{m2=1}^{K} σ_{t^⟨m1⟩,t^⟨m2⟩} Σ_{n=0}^{N−−1} H_n^⟨k1m1⟩− e^{−j(2π/N) l1 n} ( H_n^⟨k2m2⟩− )* e^{j(2π/N) l2 n}    (ISI)
and

E{ i_{n0}^⟨k1⟩(l1) i_{n0}^⟨k2⟩(l2) } =
(1/N) Σ_{m1=1}^{K} Σ_{m2=1}^{K} σ_{t^⟨m1⟩,t^⟨m2⟩} Σ_{n=p+1}^{N+} H_n^⟨k1m1⟩+ e^{−j(2π/N) l1 n} H_n^⟨k2m2⟩+ e^{−j(2π/N) l2 n}    (ISI)
+ (1/N) Σ_{m1=1}^{K} Σ_{m2=1}^{K} σ_{t^⟨m1⟩,t^⟨m2⟩} Σ_{n=p+1}^{N+} H_n^⟨k1m1⟩+ e^{−j(2π/N) l1 n} H_n^⟨k2m2⟩+ e^{−j(2π/N) l2 n}    (ICI)
+ (1/N) Σ_{m1=1}^{K} Σ_{m2=1}^{K} σ_{t^⟨m1⟩,t^⟨m2⟩} Σ_{n=0}^{N−−1} H_n^⟨k1m1⟩− e^{−j(2π/N) l1 n} H_n^⟨k2m2⟩− e^{−j(2π/N) l2 n}    (ICI)
+ (1/N) Σ_{m1=1}^{K} Σ_{m2=1}^{K} σ_{t^⟨m1⟩,t^⟨m2⟩} Σ_{n=0}^{N−−1} H_n^⟨k1m1⟩− e^{−j(2π/N) l1 n} H_n^⟨k2m2⟩− e^{−j(2π/N) l2 n},    (ISI)
so that we obtain the cross-covariance matrices C_{i^⟨k1⟩,i^⟨k2⟩} of the interference vectors

i_{n0}^⟨k⟩ = [ i_{n0}^⟨k⟩(0), . . . , i_{n0}^⟨k⟩(N − 1) ]^T,    k = 1, . . . , K,

as

C_{i^⟨k1⟩,i^⟨k2⟩} = E{ i_{n0}^⟨k1⟩ ( i_{n0}^⟨k2⟩ )^H } = C^ISI_{i^⟨k1⟩,i^⟨k2⟩} + C^ICI_{i^⟨k1⟩,i^⟨k2⟩},    (5.28)

where C^ISI_{i^⟨k1⟩,i^⟨k2⟩} and C^ICI_{i^⟨k1⟩,i^⟨k2⟩} denote the cross-covariance matrices of ISI and ICI, respectively, with identical (l1, l2)-th elements

C^ISI_{i^⟨k1⟩,i^⟨k2⟩}(l1, l2) = C^ICI_{i^⟨k1⟩,i^⟨k2⟩}(l1, l2)    (5.29)
= (1/N) Σ_{m1=1}^{K} Σ_{m2=1}^{K} σ_{t^⟨m1⟩,t^⟨m2⟩} [ Σ_{n=p+1}^{N+} H_n^⟨k1m1⟩+ e^{−j(2π/N) l1 n} ( H_n^⟨k2m2⟩+ )* e^{j(2π/N) l2 n} + Σ_{n=0}^{N−−1} H_n^⟨k1m1⟩− e^{−j(2π/N) l1 n} ( H_n^⟨k2m2⟩− )* e^{j(2π/N) l2 n} ].    (5.30)
Similarly, we obtain the cross-pseudocovariance matrices of the interference vectors i_{n0}^⟨k⟩, k = 1, . . . , K, as

P_{i^⟨k1⟩,i^⟨k2⟩} = E{ i_{n0}^⟨k1⟩ ( i_{n0}^⟨k2⟩ )^T } = P^ISI_{i^⟨k1⟩,i^⟨k2⟩} + P^ICI_{i^⟨k1⟩,i^⟨k2⟩},    (5.31)

where P^ISI_{i^⟨k1⟩,i^⟨k2⟩} and P^ICI_{i^⟨k1⟩,i^⟨k2⟩} denote the cross-pseudocovariance matrices of ISI and ICI, respectively, with identical (l1, l2)-th elements

P^ISI_{i^⟨k1⟩,i^⟨k2⟩}(l1, l2) = P^ICI_{i^⟨k1⟩,i^⟨k2⟩}(l1, l2)    (5.32)
= (1/N) Σ_{m1=1}^{K} Σ_{m2=1}^{K} σ_{t^⟨m1⟩,t^⟨m2⟩} [ Σ_{n=p+1}^{N+} H_n^⟨k1m1⟩+ e^{−j(2π/N) l1 n} H_n^⟨k2m2⟩+ e^{−j(2π/N) l2 n} + Σ_{n=0}^{N−−1} H_n^⟨k1m1⟩− e^{−j(2π/N) l1 n} H_n^⟨k2m2⟩− e^{−j(2π/N) l2 n} ].    (5.33)
To be in line with (the special ordering of) (2.25), we have to consider the covariance matrix C_{i,MIMO-DMT} and the pseudocovariance matrix P_{i,MIMO-DMT} of the stacked interference vector

i_{MIMO-DMT,n0} = [ i_{n0}^⟨1⟩(0), . . . , i_{n0}^⟨K⟩(0), . . . , i_{n0}^⟨1⟩(N/2), . . . , i_{n0}^⟨K⟩(N/2) ]^T,    (5.34)

which are obtained from (5.28) and (5.31) by the corresponding permutations of rows and columns. Since noise and interference are uncorrelated, the total disturbance has covariance and pseudocovariance matrices

C_{n+i,MIMO-DMT} = C_{n,MIMO-DMT} + C_{i,MIMO-DMT},    P_{n+i,MIMO-DMT} = P_{n,MIMO-DMT} + P_{i,MIMO-DMT}.    (5.35)

Note, furthermore, that the entries of H_MIMO-DMT depend on the impulse responses of the TDEs via the Z-transform relation

H^⟨km⟩(z) = E^⟨k⟩(z) G^⟨km⟩(z),    k, m = 1, . . . , K,    (5.36)

where E^⟨k⟩(z) denotes the Z-transform of e^⟨k⟩, the impulse response of the k-th TDE, and G^⟨km⟩(z) denote the Z-transforms of all (cross-)channels g^⟨km⟩, see Figure 2.9.
The noise has a covariance and pseudocovariance matrix according to (5.7), (5.8), (5.9), (5.16) and (5.17), or, if we use an approximation that neglects cross-correlations between different frequencies (subcarriers), according to (5.20), (5.21) and (5.22). Observe that (5.3) connects those formulas to the impulse responses of the TDEs.
Finally, ISI and ICI have covariance and pseudocovariance matrices according to (5.30) and (5.33), whose entries depend on the impulse responses e^⟨k⟩ of the TDEs via
H_n^⟨km⟩+(z) = Σ_{l=−∞}^{∞} e^⟨k⟩(l) G_{n−l}^⟨km⟩+(z),    (5.37)
H_n^⟨km⟩−(z) = Σ_{l=−∞}^{∞} e^⟨k⟩(l) G_{n+l}^⟨km⟩−(z),

where G_n^⟨km⟩+(z) and G_n^⟨km⟩−(z) are defined as in (4.96) and (4.102) applied to the channel impulse responses g^⟨km⟩, respectively.
If we want to use capacity as an overall performance measure (similarly as in [22]), we can plug these matrices into the formulas of Section 3.2 and, concluding that capacity is a function of the filter coefficients of the TDEs, maximize this function with respect to these parameters. Note that our analytical results provide us with an explicit relation between the capacity and e^⟨k⟩, k = 1, . . . , K, so that, e.g., a conventional numerical maximization algorithm can be used to solve this maximization problem. Also, for the design of (practical) low-complexity MIMO TDE algorithms, the results obtained can be very useful.
Again, we will stop our considerations about the (design of) MIMO Time Domain Equalizers at this point. We admit that this is possibly unsatisfactory for the reader who is also interested in quantitative results. However, such results depend on the chosen MIMO TDE algorithms, and in turn on the chosen design methods. A full and meaningful analysis would require a lot of additional work and can be regarded as a separate topic; it is beyond the scope of this manuscript. However, we have developed all the analytical tools that are required for this research area and shown that the utilization of the pseudocovariance matrix plays an important role in the design and analysis of MIMO Time Domain Equalizers.
tr( E{ x x^H } ) ≤ S.    (5.40)
Since this is the complex channel model, we assume that we have knowledge of
both the covariance matrix Cn and the pseudocovariance matrix Pn . This can be
justified due to our results of Section 5.1. For notational simplicity, we also assume
that the remaining interference (if there is interference at all) is also incorporated
in this noise vector, i.e., in fact we deal with the covariance matrix Cn+i and the
pseudocovariance matrix Pn+i , cf. also (5.35). We can compute the capacity of
this channel by applying the results of Subsection 3.2.2. Of course, in order to
obtain this capacity, it is necessary to utilize the nonvanishing pseudocovariance
matrix.
Another, but equivalent (cf. Chapter 3), approach to the channel (5.38) and (5.39) / (5.40) is to consider the equivalent real channel model

ỹ = Ã x̃ + ñ    (5.41)

with

x̃ = [ ℜ{x} ; ℑ{x} ] ∈ ℝ^{2t},    ỹ = [ ℜ{y} ; ℑ{y} ] ∈ ℝ^{2r},    ñ = [ ℜ{n} ; ℑ{n} ] ∈ ℝ^{2r},

and

Ã = [ ℜ{A}  −ℑ{A} ; ℑ{A}  ℜ{A} ] ∈ ℝ^{2r×2t},
cf. also Section 3.1. The power constraints (5.39) and (5.40) translate to

E{ x̃^T x̃ } ≤ S    (5.42)

and

tr( E{ x̃ x̃^T } ) ≤ S,    (5.43)

respectively. The covariance matrix C_ñ ∈ ℝ^{2r×2r} of the real noise vector ñ is assumed to be nonsingular, since otherwise the capacity is either infinite or there is a zero-subchannel (the kernel [19] of Ã has a dimension ≥ 1, i.e., after a diagonalization of Ã, we obtain some vanishing diagonal elements, so that the subchannels corresponding to these elements cannot be used for transmission; these subchannels are called zero-subchannels).
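The construction of the equivalent real channel model (5.41) from a complex channel is easily illustrated. The following sketch (with hypothetical dimensions) checks that the stacked real model reproduces the complex input-output relation and that the power constraints coincide:

```python
import numpy as np

rng = np.random.default_rng(1)
r, t = 3, 2
# Hypothetical complex channel matrix and transmit vector.
A = rng.standard_normal((r, t)) + 1j * rng.standard_normal((r, t))
x = rng.standard_normal(t) + 1j * rng.standard_normal(t)

# Real composite quantities, cf. (5.41).
x_r = np.concatenate([x.real, x.imag])
A_r = np.block([[A.real, -A.imag],
                [A.imag,  A.real]])

# The stacked real model reproduces the complex channel (noise-free) ...
y = A @ x
assert np.allclose(A_r @ x_r, np.concatenate([y.real, y.imag]))
# ... and the transmit power is unchanged: ||x||^2 = ||x_r||^2.
assert np.isclose(np.vdot(x, x).real, x_r @ x_r)
```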
Note that we can expect (to some extent) that such modifications do not change the underlying concepts of the existing approaches, whereas MIMO (DMT) systems relying on the equivalent real channel model seem to be more revolutionary solutions.
We also want to emphasize that if the complex channel has a block diagonal structure, as, e.g., the channel in (2.25) with (5.20), (5.21) and (5.22), this block diagonal structure can be maintained in the equivalent real channel model by simple permutations of the matrices and vectors considered. To be more precise, the equivalent real channel model is in turn equivalent to another real(-valued) channel model (described in the following) which is related to the original (first) real channel model by permutations of the vectors and matrices and which has a block diagonal structure. One can obtain this block diagonal real channel model by considering the subchannels introduced by the submatrices of the block diagonal complex channel model and computing the equivalent real channel models for each of these subchannels. Note that a diagonal complex channel, cf. (2.16), is always a block diagonal channel (the submatrices are 1 × 1 matrices), so that the above statements remain true also for a diagonal complex channel.
It is the goal of the remainder of this chapter to develop transmission schemes for channels of the form (5.41), (5.42) and (5.43). It is important to observe that these channels have some remarkable properties, i.e., the occurring dimensions (2r and 2t) are even numbers and the channel matrices have the special structure

Ã = [ ℜ{A}  −ℑ{A} ; ℑ{A}  ℜ{A} ].

We want to emphasize that the transmission methods we will propose in the following do not rely on the mentioned properties, i.e., they are designed for channels of the form
y = A x + n,    (5.44)

where y ∈ ℝ^r and x ∈ ℝ^t denote the received and transmitted vectors, respectively, r and t being arbitrary natural numbers. A can be any deterministic r × t real-valued matrix, the channel matrix, and n ∈ ℝ^r is the zero-mean noise vector with known nonsingular covariance matrix C_n. The transmitter is constrained in its total power to S,

E{ x^T x } ≤ S,    (5.45)

or

tr( E{ x x^T } ) ≤ S.    (5.46)
Fig. 5.2: Joint en- and decoding of the elements of the transmit and receive vector, respectively.
Hence, we can apply transmission schemes that are developed for the general vector channel (5.44), (5.45) and (5.46) also to channels of the form (5.41), (5.42) and (5.43). Equations (5.44), (5.45) and (5.46) have the advantage of having a simple notation (no x̃ and Ã are necessary) and being very general (e.g., odd dimensions are allowed as well). This is the reason why we deal with this channel model in the following.
It is obvious that a rate close to capacity can only be achieved if coding is applied. Furthermore, it is in general not sufficient that each element of the transmit and receive vectors x and y is encoded and decoded, respectively, separately; in order to achieve (a rate close to) capacity, one has to encode all elements of x and, similarly, decode all elements of y jointly, see Figure 5.2.
There is a lot of work about codes for Single-Input / Single-Output (SISO) channels on the one hand; on the other hand, research about joint coding, cf. also so-called Space-Time Codes [52], has started not so long ago. Hence, what we propose subsequently is to decompose the joint encoder and the joint decoder into two parts. One performs joint processing, and this will be the part that we are looking at in the following; the other applies non-joint (SISO) en- and decoding algorithms, which can be chosen out of a vast pool of research results. This corresponds to a transmission scenario as depicted in Figure 5.3. For related literature we also refer to [17, 39].
Fig. 5.3: Joint processing of elements of transmit and receive vector according to (not necessarily linear) functions T and R, cf. (5.47), and non-joint coding; s denotes the number of en-/decoders and depends on T and R.
We call the (not necessarily linear) mappings

T : ℝ^s → ℝ^t,    R : ℝ^r → ℝ^s,    (5.47)

the joint processing functions, so that the system including this joint processing can be written as

r = R( A T(t) + n ).    (5.48)
Note that the parameter s depends on the joint processing functions T and R and is included in the model to cope, e.g., with a non-square channel matrix A. In this case s ≤ min{r, t}. Furthermore, if there is a zero-subchannel, cf. Footnote 9 in this chapter, this channel cannot be used for transmission and s has to be decreased.
We will assume that t = [t(1) . . . t(s)]^T is zero-mean and impose power constraints on its individual elements, i.e.,

E{t(i)} = 0 and E{(t(i))^2} = 1,    i = 1, . . . , s,    (5.49)

which is only a normalization and not a restriction, since any other mean and power distribution can be incorporated in the transmit function T as well. The reason for doing this is that we want to compare different transmission schemes defined by different joint processing functions T and R, and due to our normalization (5.49) fairness is guaranteed. Without this normalization the joint processing at the transmitter side according to Figure 5.3 would not be well defined, i.e., there would (could) still be some freedom in distributing the power onto the elements of t such that (5.45) / (5.46) is satisfied, and different distributions would (could) yield different performances.
Note that we do not allow the different encoders and decoders, respectively, to cooperate. Depending on the joint processing, this implies that an individual subchannel i ∈ {1, . . . , s}, corresponding to the i-th element of t and r, is not only disturbed by the noise n; there will also be impairments due to crosstalk from the other subchannels j ≠ i. However, we can compute the mutual information [6] as

I( r(i); t(i) ),    i = 1, . . . , s,    (5.50)

which can be maximized in order to calculate the capacity C(i) of the i-th subchannel. For simplicity, we will assume that all involved random variables are Gaussian distributed, especially also the crosstalk contributions, while maximizing (5.50). Finally, the (sum-) capacity (throughput) of the transmission scenario of Figure 5.3 is obtained by summation of the individual capacities,

C_{T,R} = Σ_{i=1}^{s} C(i).    (5.51)
Note that the joint processing functions T and R determine whether the maximizing input distribution of (5.50) is Gaussian or not, even if all other involved random variables are Gaussian. Nevertheless, we will assume in the following that the maximizing input distribution is Gaussian.
It is obvious that the (sum-) capacity of (5.51) is smaller than or equal to the capacity of (5.44) and (5.45) / (5.46) with the transmission scenario of Figure 5.2. Therefore, it will be our goal in the following to design the joint processing functions T and R in such a way that the (sum-) capacity of (5.51) is close (or even equal) to this capacity.
The simplest way to define the joint processing functions would be to use diagonal matrices T ∈ ℝ^{t×s} and R ∈ ℝ^{s×r} with the side constraint tr(TT^T) = S, i.e.,

T : ℝ^s → ℝ^t,    t ↦ x = T(t) = T t,    T diagonal, tr(TT^T) = S,    (5.52)
R : ℝ^r → ℝ^s,    y ↦ r = R(y) = R y,    R diagonal.

This configuration with s = r = t = NK, if all K loops and all N/2 − 1 complex-valued and the two real-valued subcarriers (at frequencies 0 and N/2) are used, essentially corresponds to conventional transmission over cable bundles, where the transmission is not coordinated and FEXT is regarded as noise. Later, in the simulation results, cf. Subsection 5.3.4, we will compare the proposed methods with this conventional approach.
More generally, we can use arbitrary matrices T ∈ ℝ^{t×s} and R ∈ ℝ^{s×r} as linear joint processing functions, i.e.,

T : ℝ^s → ℝ^t,    t ↦ x = T(t) = T t,    (5.53)
R : ℝ^r → ℝ^s,    y ↦ r = R(y) = R y.
We will define three transmission schemes by means of these matrices; the central part of the calculation rule is based in all three cases on the Singular Value Decomposition (SVD) [19]. The SVD will be used as a tool that is able to diagonalize an arbitrary (even rectangular) matrix. Note, however, that in the latter two cases we will omit full diagonalization for the benefit of a lower computational complexity.
For the first transmission scheme, let

B_n^{−1} A = U D V^T    (5.54)

denote the SVD of the whitened channel matrix, where B_n is a real-valued generalized Cholesky factor of C_n, U ∈ ℝ^{r×r} and V ∈ ℝ^{t×t} are orthonormal, and D ∈ ℝ^{r×t} is diagonal with nonnegative entries on the main diagonal (the singular values). So D has the same properties as in the complex case. Let
C_a = diag_{t×t}{ c_{a,1}, . . . , c_{a,t} }    (5.55)

denote a diagonal matrix that is obtained from D, again using the definition x^+ = max{0, x}, via Water-Filling as

c_{a,i} = (L − 1/d_i^2)^+ for i ≤ q, and c_{a,i} = 0 for i > q,

where the Water Level L is chosen to satisfy tr(C_a) = Σ_{i=1}^{t} c_{a,i} = S. (Note that there exists in fact a random vector a, i.e., a = B_a t, cf. (5.57), with this covariance matrix C_a.) Note that

B_a = diag_{t×s}{ √c_{a,1}, . . . , √c_{a,s} }.    (5.56)
Although we have C_a = B_a B_a^T, B_a is not a generalized Cholesky factor of C_a in general, since it can happen that C_a is singular and B_a a rectangular (not square) matrix.
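The Water-Filling allocation (5.55) can be computed, e.g., by bisection on the Water Level L, since the allocated power grows monotonically in L. This is one possible numerical sketch (the function name and the tolerance are our own choices, not from the text):

```python
import numpy as np

def water_filling(d, S, tol=1e-12):
    """Allocation c_i = (L - 1/d_i^2)^+ with sum_i c_i = S, cf. (5.55).

    d holds the (nonzero) singular values; the Water Level L is found
    by bisection on a bracket that is guaranteed to contain it."""
    inv = 1.0 / np.asarray(d, dtype=float) ** 2
    lo, hi = inv.min(), inv.max() + S          # bracket for the water level
    while hi - lo > tol:
        L = 0.5 * (lo + hi)
        if np.maximum(L - inv, 0.0).sum() > S:
            hi = L
        else:
            lo = L
    return np.maximum(0.5 * (lo + hi) - inv, 0.0)

c = water_filling(np.array([2.0, 1.0, 0.2]), S=1.0)
assert np.isclose(c.sum(), 1.0, atol=1e-8)   # power budget met
assert np.all(np.diff(c) <= 1e-12)           # weaker subchannels get less power
```

For the example values, the weakest subchannel (d = 0.2) receives no power at all, which is the clipping effect of the ( · )^+ operation.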
We define the transmit function and receive function to be the (here linear) functions

T : ℝ^s → ℝ^t,    t ↦ x = T(t) = V B_a t = T t,    (5.57)
R : ℝ^r → ℝ^s,    y ↦ r = R(y) = (D B_a)^+ U^T B_n^{−1} y = R y,
where (D B_a)^+ denotes the (Moore-Penrose) pseudo-inverse [19] of D B_a, i.e.,

(D B_a)^+ = [ diag{ 1/(d_1 √c_{a,1}), . . . , 1/(d_s √c_{a,s}) }  0 ] ∈ ℝ^{s×r}.    (5.58)

With this choice, the received vector becomes

r = (D B_a)^+ U^T B_n^{−1} A V B_a t + (D B_a)^+ U^T B_n^{−1} n = (D B_a)^+ D B_a t + (D B_a)^+ U^T B_n^{−1} n = t + m,    (5.59)
where the noise vector m = (D B_a)^+ U^T B_n^{−1} n has covariance matrix

C_m = (D B_a)^+ U^T B_n^{−1} C_n B_n^{−T} U ( (D B_a)^+ )^T = diag{ 1/(d_1^2 c_{a,1}), . . . , 1/(d_s^2 c_{a,s}) },    (5.60)

since B_n^{−1} C_n B_n^{−T} = I_r and U^T U = I_r.
Note that there is no remaining crosstalk, and we obtain the mutual information of the i-th subchannel, cf. (5.50), as

I( r(i); t(i) ) = (1/2) log( 1 + 1 / ( 1/(d_i^2 c_{a,i}) ) ) = (1/2) log( 1 + d_i^2 c_{a,i} ) = (1/2) ( log( L d_i^2 ) )^+,    (5.61)

and, furthermore,

C_{T,R} = Σ_{i=1}^{s} C(i) = Σ_{i=1}^{s} I( r(i); t(i) ) = Σ_{i=1}^{s} (1/2) ( log( L d_i^2 ) )^+.    (5.62)
i=1
It is a consequence of (5.49) and of the Gaussian assumption that we do not have the
freedom anymore to maximize (5.61) over the input distributions of the individual
subchannels.
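The full diagonalization scheme (5.54)-(5.60) can be illustrated end to end: whiten, take the SVD, water-fill, and verify that the overall channel RAT is the identity and that the sum capacity (5.62) equals ½ Σ log(L d_i²). A sketch with hypothetical dimensions and a power budget S chosen large enough that all subchannels are active:

```python
import numpy as np

rng = np.random.default_rng(2)
r = t = 3
S = 1e6                                   # large budget: all subchannels active
A = rng.standard_normal((r, t))
M = rng.standard_normal((r, r))
Cn = M @ M.T + r * np.eye(r)              # nonsingular noise covariance
Bn = np.linalg.cholesky(Cn)               # generalized Cholesky factor
Bn_inv = np.linalg.inv(Bn)

U, d, Vt = np.linalg.svd(Bn_inv @ A)      # whitened channel, cf. (5.54)

# Water filling: with every subchannel active the level has a closed form.
inv = 1.0 / d ** 2
L = (S + inv.sum()) / t
c = L - inv
assert np.all(c > 0)

Ba = np.diag(np.sqrt(c))                  # cf. (5.56), square since r = t
T = Vt.T @ Ba                             # transmit matrix, cf. (5.57)
R = np.diag(1.0 / (d * np.sqrt(c))) @ U.T @ Bn_inv

# Overall channel R A T is the identity: no remaining crosstalk, cf. (5.59).
assert np.allclose(R @ A @ T, np.eye(t))
# Sum capacity (5.62) equals the water-filling capacity (5.65).
assert np.isclose(0.5 * np.sum(np.log(L * d ** 2)),
                  0.5 * np.sum(np.log(1.0 + d ** 2 * c)))
```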
On the other hand, we can compute the capacity of the channel (5.44) and (5.45) / (5.46) with the transmission scenario depicted in Figure 5.2. Applying the Maximum Entropy Theorem for Real Random Vectors (Theorem 3.20), we conclude, following the same line of arguments as in Subsection 3.2.1, that this capacity is given by

C = max_{C_x : tr(C_x) ≤ S} [ (1/2) log det( A C_x A^T + C_n ) − (1/2) log det C_n ],    (5.63)

where C_x is the covariance matrix of x, i.e., the maximization goes over all nonnegative definite covariance matrices C_x with tr(C_x) ≤ S. We have
log det( A C_x A^T + C_n ) − log det C_n = log det( B_n^{−1} A C_x A^T B_n^{−T} + I_r )
= log det( D V^T C_x V D^T + I_r ),

and, with C_x = V C_a V^T (i.e., for a = V^T x with covariance matrix C_a), the maximization problem (5.63) is equivalent to

C = max_{C_a : tr(C_a) ≤ S} (1/2) log det( D C_a D^T + I_r ),    (5.64)

since tr(C_x) = tr(C_a). Proceeding as in [58], we find that the matrix C_a in (5.55) maximizes (5.64). The corresponding maximum mutual information (capacity) is given by

C = Σ_{i: d_i ≠ 0} (1/2) ( log( L d_i^2 ) )^+,    (5.65)
which coincides with the (sum-) capacity (5.62), so that the joint processing functions (5.57) are capacity-optimal. This full diagonalization scheme has an Online Complexity of

O_mult^T = O_mult^R = (N − 1) · 2K^2,    (5.66)
O_add^T = O_add^R = N K (2K − 1),

(real) multiplications and additions, respectively, which are quite large for typical ADSL or VDSL [9-12, 25-30] parameters. (Another measure would be to count the number of operations performed in the transmitter and receiver per transmitted bit of information; for the reason of simplicity we stick to our complexity definition.) Note that these operations have to be performed for each transmitted and received DMT symbol vector.
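The complexity figures above are simply operation counts of matrix-vector products per DMT symbol; the factor-of-u saving exploited by the later block-diagonal schemes can be sketched generically (the function names and sizes are ours, for illustration only):

```python
def dense_ops(n):
    # One n x n matrix-vector product: n^2 multiplications, n(n-1) additions.
    return n * n, n * (n - 1)

def block_diag_ops(n, u):
    # u diagonal blocks of size n/u each: the multiplication count drops by u.
    b = n // u
    return u * b * b, u * b * (b - 1)

m_full, a_full = dense_ops(1024)
m_blk, a_blk = block_diag_ops(1024, 2)
assert m_blk == m_full // 2          # factor-u saving in multiplications
assert a_blk < a_full
```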
To reduce this complexity, we modify the joint processing functions of (5.57) to

T : ℝ^s → ℝ^t,    t ↦ x = T(t) = √( S / tr( ṼB_a (ṼB_a)^T ) ) ṼB_a t = T t,    (5.67)
R : ℝ^r → ℝ^s,    y ↦ r = R(y) = D_R (D B_a)^+ Ũ^T B_n^{−1} y = R y,
where X̃ denotes the matrix in which half of the elements of X, namely the elements with the smallest absolute values, are set to zero. The diagonal matrix D_R is chosen such that the overall channel matrix RAT has diagonal elements equal to 1. If this is not possible, we have one (more) zero-subchannel(s), cf. Footnote 9 in this chapter, which can be neglected by decreasing s, the number of en-/decoders. Therefore, we maintain this assumption without loss of generality. Note that this scheme has an Online Complexity of
O_mult^T = O_mult^R = (N − 1) K^2,    (5.68)
O_add^T = O_add^R ≈ (N/2) K (2K − 1),

i.e., roughly half the Online Complexity of the full diagonalization scheme of Subsection 5.3.1.
An alternative way to reduce the complexity is to partition the channel matrix and the covariance matrix of the noise,

A = [ A^⟨11⟩ · · · A^⟨1u⟩ ; · · · ; A^⟨u1⟩ · · · A^⟨uu⟩ ] ∈ ℝ^{r×t},    A^⟨ij⟩ ∈ ℝ^{r_p × t_p},    (5.69)

C_n = [ C_n^⟨11⟩ · · · C_n^⟨1u⟩ ; · · · ; C_n^⟨u1⟩ · · · C_n^⟨uu⟩ ] ∈ ℝ^{r×r},    C_n^⟨ij⟩ ∈ ℝ^{r_p × r_p},    (5.70)

with

r_p = r/u and t_p = t/u.    (5.71)

Note that we partition the channel matrix and the covariance matrix of the noise into u^2 submatrices with dimensions (r_p × t_p) and (r_p × r_p), respectively. It is obvious that u, r_p and t_p have to be integer numbers. Hence, our freedom to select u is limited by the condition that r_p and t_p, according to (5.71), are integers.
If we neglect all off-diagonal submatrices and apply the results of Subsection 5.3.1 (again dropping the special ordering assumption of the singular values of the SVD [19], so that the block diagonal structure is maintained) to the block diagonal matrices

diag{ A^⟨11⟩, . . . , A^⟨uu⟩ }    and    diag{ C_n^⟨11⟩, . . . , C_n^⟨uu⟩ },    (5.72)

we obtain joint processing functions together with transmit and receive matrices T and R, respectively, as

T : ℝ^s → ℝ^t,    t ↦ x = T(t) = diag{ T^⟨11⟩, . . . , T^⟨uu⟩ } t,    (5.73)
R : ℝ^r → ℝ^s,    y ↦ r = R(y) = diag{ R^⟨11⟩, . . . , R^⟨uu⟩ } y,
with

T^⟨ii⟩ = V^⟨ii⟩ B_a^⟨ii⟩,    R^⟨ii⟩ = ( D^⟨ii⟩ B_a^⟨ii⟩ )^+ ( U^⟨ii⟩ )^T ( B_n^⟨ii⟩ )^{−1},    i = 1, . . . , u,    (5.74)
defined as in Subsection 5.3.1. For our MIMO DMT channel, this method yields an Online Complexity of

O_mult^T = O_mult^R = (N − 1) · 2K^2 / u,    (5.75)
O_add^T = O_add^R = N K ( 2K/u − 1 ),
which is about u times less than the complexity of the scheme of Subsection 5.3.1, which also makes use of the off-diagonal submatrices. We want to emphasize that different orderings of the elements of the input and output vector of the channel correspond to permutations of the columns and rows of the channel matrix A and of the covariance matrix C_n and yield different partitionings. Hence, it is possible to improve the performance of this scheme while maintaining its Online Complexity by optimizing with respect to the ordering of the elements of x, y (and n). Note also that the physical transmitters / receivers corresponding to different partitions (subsets) of the partitioning need not be co-located. Therefore, this method can be applied to distributed physical transmitter / receiver topologies as well. In the next subsection, we will calculate (by means of simulations) the (sum-) capacity C_{T,R} for u = 2, so that we have the same Online Complexity as for the approach of Subsection 5.3.2 and obtain comparable results.
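The diagonalization-of-subsets idea of (5.72)-(5.74) can be sketched numerically: neglect the off-diagonal submatrices, diagonalize each diagonal block by its own SVD, and observe that the diagonal blocks of the overall channel become identities while the neglected blocks remain as crosstalk. The sketch below assumes white noise (B_n = I), unit power loading, and hypothetical sizes for brevity:

```python
import numpy as np

rng = np.random.default_rng(3)
n, u = 4, 2
b = n // u                           # partition size r_p = t_p, cf. (5.71)
A = rng.standard_normal((n, n))

# Per-partition processing: SVD of each diagonal block only.
T = np.zeros((n, n))
R = np.zeros((n, n))
for i in range(u):
    sl = slice(i * b, (i + 1) * b)
    U, d, Vt = np.linalg.svd(A[sl, sl])
    T[sl, sl] = Vt.T
    R[sl, sl] = np.diag(1.0 / d) @ U.T

H = R @ A @ T
# Diagonal blocks are perfectly diagonalized ...
for i in range(u):
    sl = slice(i * b, (i + 1) * b)
    assert np.allclose(H[sl, sl], np.eye(b))
# ... while the neglected off-diagonal blocks remain as crosstalk.
assert not np.allclose(H[0:b, b:2 * b], 0.0)
```

The residual off-diagonal blocks are exactly the crosstalk that the simulations of the next subsection quantify.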
1. Background noise only. This scenario models the situation where an operator owns the whole cable bundle and is therefore able to perform joint processing over all loops in this cable bundle. The remaining noise is only background noise. The obtained results are depicted in Figure 5.4 (channel symbol rate 1/T = 2.208 megasymbols / second).
2. Typical noise environment in a cable bundle including crosstalk and background noise. For a full and detailed description of this simulation scenario we refer to Section B.3 in the Appendix. Note that this scenario models the situation where an operator does not own the whole cable bundle alone, so that he is not able to perform joint processing over all loops in this cable bundle. He can merely perform joint processing over a subset (here again K = 20 loops, to obtain comparable results) of the loops in the cable bundle and has to accept crosstalk from the neighboring loops that do not belong to his subset. Hence, this crosstalk has to be regarded as noise and is therefore included in the power spectral density (PSD) of the (stationary) noise process. The obtained results are depicted in Figure 5.5 (channel symbol rate 1/T = 2.208 megasymbols / second).
We start with the first case (only background noise), cf. Figure 5.4. First of
all, we observe that for loop lengths up to 3 km, the SVD scheme of Subsection
5.3.1 (full diagonalization) has a capacity that is (almost) twice the capacity of the
scheme that has no joint processing functions (so there is no cooperation between
the different transmitters / receivers and they encounter full FEXT). The second
observation is that the scheme of Subsection 5.3.2 (approximate diagonalization)
has a capacity that is within the range of the no joint processing function (NJPF)
scheme. Therefore, the idea to reduce complexity by simply setting small (half of
the) values equal to zero has a very negative impact on capacity. The other approach with the same reduced complexity, cf. Subsection 5.3.3 with u = 2, where
we looked at an optimized partitioning, has superior performance and approaches
the curve of the optimum SVD scheme with full diagonalization for loop lengths
above 4 km. Note also that all curves have the same asymptotic limit, because for
long loop lengths, the SNR is very low, so that the noise is extremely dominant and
any algorithm to reduce FEXT has very little impact on performance.
The simulation results for the second noise model are depicted in Figure 5.5. Again, we conclude that the scheme of Subsection 5.3.2 (approximate diagonalization) yields no capacity gain and is certainly not worth applying. However, the gain of the SVD scheme of Subsection 5.3.1 (full diagonalization) is highly reduced as well, and one must weigh whether this gain pays for the additional computational effort. On the other hand, this computational effort can be reduced, because the subsets scheme of Subsection 5.3.3 has the optimum capacity curve as well (for u = 2). The reason why the gain is smaller for all schemes compared to the first noise scenario is the higher noise power. Again, the noise is extremely dominant, and any algorithm to reduce FEXT has very little impact on performance.
Fig. 5.4: SVD-based joint processing functions and corresponding (sum-) capacities C_{T,R} versus loop length [km], for full diagonalization, diagonalization of subsets, approximate diagonalization, and no joint processing (−140 dBm/Hz background noise, cf. also Section B.2 in the Appendix).
Fig. 5.5: SVD-based joint processing functions and corresponding (sum-) capacities C_{T,R} versus loop length [km], for full diagonalization, diagonalization of subsets, approximate diagonalization, and no joint processing (typical noise environment in a cable bundle including crosstalk and background noise, cf. also Section B.3 in the Appendix).
The transmit joint processing function is again linear,

T : ℝ^s → ℝ^t,    t ↦ x = T(t) = T t,    (5.76)

where, at this point, the matrix T is not specified yet and is allowed to be arbitrary. Applying the QR decomposition [19], we decompose (for simplicity we will assume r ≥ s) the matrix B_n^{−1} A T, B_n being a real-valued generalized Cholesky factor of C_n, into a product of an orthonormal matrix Q ∈ ℝ^{r×r} with an upper triangular matrix J ∈ ℝ^{r×s}, i.e.,

Q J = B_n^{−1} A T.    (5.77)
The joint processing function at the receiver side is divided into two parts, i.e.,

R : R^r -> R^s,    R = R_2 ∘ R_1,    (5.78)

where the first part,

R_1 : R^r -> R^r,    y |-> q = R_1(y) = Q^T B_n^{-1} y,    (5.79)

performs multiplications by Q^T B_n^{-1}, so that we obtain an input-output relationship between the transmit vector t and the vector q at the output of the first part of the joint processing function R_1(·) as

q = Q^T B_n^{-1} y
  = Q^T B_n^{-1} (A x + n)
  = Q^T B_n^{-1} (A T t + n)
  = Q^T Q J t + Q^T B_n^{-1} n
  = J t + Q^T B_n^{-1} n    (5.80)

    [ J(1,1)  ...  J(1,s) ]
    [             ...     ]
  = [    0        J(s,s)  ] t + m,
    [    0   ...     0    ]

where m = Q^T B_n^{-1} n.
^15 At this point, this matrix is not specified yet and is allowed to be an arbitrary matrix.
The second part R_2(·) of the receiver joint processing function is then defined as follows,

R_2 : R^r -> R^s,    q = (q(1), ..., q(r))^T |-> r = (r(1), ..., r(s))^T = R_2(q),    (5.81)

via the recursion (Nulling and Cancelling)

r(s) = dec( q(s) / J(s,s) ),    (5.82)

r(k) = dec( ( q(k) - Σ_{i=k+1}^{s} J(k,i) r(i) ) / J(k,k) ),    k = s-1, ..., 1,    (5.83)

where dec(·) denotes the decision (slicing) operation. The corresponding (sum-)capacity is

C_{T,R} = (1/2) Σ_{i=1}^{s} log( 1 + (J(i,i))² ).    (5.84)
We want to emphasize that this receive joint processing function is very general
in the sense that if we use appropriate transmit matrices T, we obtain well known
MIMO schemes as special cases.
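To make the recursion concrete, consider the following minimal NumPy sketch (not taken from the thesis; the decision function dec(·) is modeled here simply as rounding to the integer grid, and the noise-free case with white noise, B_n = I, serves as a consistency check):

```python
import numpy as np

def nulling_cancelling(y, A, T, Bn, dec=np.round):
    # QJ = Bn^{-1} A T, cf. (5.77)
    Bn_inv = np.linalg.inv(Bn)
    Q, J = np.linalg.qr(Bn_inv @ A @ T)
    q = Q.T @ Bn_inv @ y                    # first part R1, cf. (5.79)
    s = J.shape[1]
    r = np.zeros(s)
    for k in range(s - 1, -1, -1):          # recursion (5.82)/(5.83)
        r[k] = dec((q[k] - J[k, k + 1:] @ r[k + 1:]) / J[k, k])
    return r

rng = np.random.default_rng(0)
s_dim = 4
A = rng.standard_normal((s_dim, s_dim))     # channel matrix (r = s for simplicity)
T = np.eye(s_dim)                           # transmit matrix, here the identity
Bn = np.eye(s_dim)                          # white noise: Bn = I
t = rng.integers(-3, 4, size=s_dim).astype(float)
y = A @ T @ t                               # noise-free received vector
print(nulling_cancelling(y, A, T, Bn))      # recovers t
```

In the noise-free case the recursion reproduces the transmitted vector exactly, because q = J t holds with equality.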
Consider, e.g., the case when we use a permutation matrix as transmit matrix, i.e., a matrix which contains exactly one 1 per row and column, while all other entries are zero. It can be shown that if we use a certain (optimized) permutation matrix, the scheme is equivalent to the V-BLAST scheme [15, 16, 18]. Note that V-BLAST performs an efficient algorithm in order to find the optimum permutation matrix. It is obvious that a multiplication by a permutation matrix requires no additional complexity, i.e., we have O^T_mult = O^T_add = 0 in this case.
Let us also discuss the case when we use the same transmit matrix as is used in the full diagonalization SVD scheme (Subsection 5.3.1). We have

Q J = B_n^{-1} A V B_a = U D V^T V B_a = U D B_a,    (5.85)

and, furthermore,

Q = U  and  J = D B_a,

since the columns of U are pairwise orthogonal and (D B_a) is a diagonal matrix. Hence, the whole recursion (5.83) collapses into independent decisions, and we have shown that the scheme is equivalent to the SVD scheme (full diagonalization) with this choice of transmit matrix.
Finally, for our MIMO DMT scheme (utilizing the block diagonal structure of (2.25)) we can compute the Online Complexity at the receiver side (increased compared with the full diagonalization SVD scheme) as

O^R_mult = (N/2) K (6K - 1) - 3K²,
O^R_add  = (N/2) 3K (2K - 1) - 3K²,    (5.86)
O^R_div  = N K.

The joint processing function we will choose at the transmitter side is based on a special algorithm [40] that decomposes any unitary (orthonormal) matrix into a product of basic rotation matrices. We will look at this algorithm in the following subsection.
The joint processing function we will choose at the transmitter side is based on
a special algorithm [40] that decomposes any unitary (orthonormal) matrix into a
product of basic rotation matrices. We will look at this algorithm in the following
subsection.
The basic rotation matrices U^{pq}(φ_{pq}, σ_{pq}) ∈ C^{n×n}, 1 <= p < q <= n, are defined element-wise as

                          | 1,                             i = k and i ≠ p, q,
                          | cos(φ_{pq}),                   i = k and i ∈ {p, q},
U^{pq}(φ_{pq}, σ_{pq})(i, k) = | e^{jσ_{pq}} sin(φ_{pq}),       i = p and k = q,    (5.87)
                          | -e^{-jσ_{pq}} sin(φ_{pq}),     i = q and k = p,
                          | 0,                             otherwise,

i.e., it is a rotation matrix. To illustrate such a U^{pq}, we have
                       [ cos φ_{13}               0   e^{jσ_{13}} sin φ_{13}   0 ]
                       [ 0                        1   0                        0 ]
U^{13}(φ_{13}, σ_{13}) = [ -e^{-jσ_{13}} sin φ_{13}  0   cos φ_{13}               0 ],    (5.88)
                       [ 0                        0   0                        1 ]
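The structure of (5.87) is easily checked numerically. The following small NumPy sketch (not from the thesis; 0-based indices, arbitrary parameter values) constructs a matrix of the form (5.88) and verifies that it is unitary:

```python
import numpy as np

def basic_rotation(n, p, q, phi, sigma):
    # U^{pq}(phi, sigma) with the entries of (5.87); p, q are 0-based here
    U = np.eye(n, dtype=complex)
    U[p, p] = U[q, q] = np.cos(phi)
    U[p, q] = np.exp(1j * sigma) * np.sin(phi)
    U[q, p] = -np.exp(-1j * sigma) * np.sin(phi)
    return U

U13 = basic_rotation(4, 0, 2, phi=0.3, sigma=0.7)    # the structure of (5.88)
print(np.allclose(U13.conj().T @ U13, np.eye(4)))    # unitary: True
```

Only the 2×2 block acting on coordinates p and q differs from the identity, which is why a multiplication by such a matrix is cheap.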
Theorem 5.1 Let U ∈ C^{n×n} denote a unitary matrix. Then there exist parameters α_1, ..., α_n and φ_{pq}, σ_{pq}, 1 <= p < q <= n, such that

U = diag{ e^{jα_1}, ..., e^{jα_n} } ∏_{p=n-1}^{1} ∏_{q=p+1}^{n} U^{pq}(φ_{pq}, σ_{pq}).^16    (5.89)
Proof. Can be found in [1, 40]. For completeness, we will present the proof in the following. Given an n-dimensional unitary matrix U, consider the unitary matrix U^⟨1⟩ = U U^{1n}(φ_{1n}, σ_{1n})^H obtained by post-multiplying U with the Hermitian transpose of U^{1n}(φ_{1n}, σ_{1n}). We choose the parameters φ_{1n} and σ_{1n} in such a way that U^⟨1⟩ satisfies the following two constraints:

1. U^⟨1⟩(1, n) = 0, i.e.,

        [ *  ...  *  0 ]
U^⟨1⟩ = [ *  ...  *  * ]
        [ .       .  . ]
        [ *  ...  *  * ],

2. either U^⟨1⟩(1, 1) = 0 or U^⟨1⟩(1, 1) ∈ {x ∈ R : x > 0}.

It can be verified that these conditions uniquely specify φ_{1n} ∈ [-π, π) and σ_{1n} ∈ [-π/2, π/2) (with the convention that we set σ_{1n} to zero in case it is indeterminate). Essentially, these constraints imply that we choose the parameters according to

tan φ_{1n} = ( U(1, n) / U(1, 1) ) e^{-jσ_{1n}} ∈ R.
^16 If n_1 > n_2, the product ∏_{n=n_1}^{n_2} is taken with the index n running in decreasing order.

In a second step, consider U^⟨2⟩ = U^⟨1⟩ U^{1,n-1}(φ_{1,n-1}, σ_{1,n-1})^H, where the parameters are chosen such that

1. U^⟨2⟩(1, n-1) = 0 (the zero at position (1, n) is preserved), i.e.,

        [ *  ...  *  0  0 ]
U^⟨2⟩ = [ *  ...  *  *  * ]
        [ .       .  .  . ]
        [ *  ...  *  *  * ],    (5.90)

2. either U^⟨2⟩(1, 1) = 0 or U^⟨2⟩(1, 1) ∈ {x ∈ R : x > 0}.

Again, these conditions uniquely specify the parameters (using the same convention). Continuing in this manner, we obtain
U^⟨n-1⟩ = U U^{1n}(φ_{1n}, σ_{1n})^H ... U^{12}(φ_{12}, σ_{12})^H,    (5.91)

or, equivalently,

U = U^⟨n-1⟩ U^{12}(φ_{12}, σ_{12}) ... U^{1n}(φ_{1n}, σ_{1n}).    (5.92)

Since U^⟨n-1⟩ is unitary and its first row vanishes except for the first entry, it has the form

          [ e^{jα_1}  0  ...  0 ]
U^⟨n-1⟩ = [ 0                   ]
          [ .          V        ],    (5.93)
          [ 0                   ]

where V is an (n-1)-dimensional unitary matrix. We can operate on V in a similar manner by using matrices U^{2k}(φ_{2k}, σ_{2k}) for k = n, ..., 3 and pass to an (n-2)-dimensional unitary matrix. Continuing this process, we pass to a 1-dimensional unitary matrix, which consists of just one entry e^{jα_n}. Therefore, by this process, we can parameterize the bounded and closed space of n-dimensional unitary matrices by

1 + 3 + 5 + ... + (2n - 1) = n²    (5.94)

real parameters.
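The elimination step of the proof can be illustrated numerically. The following minimal NumPy sketch (illustrative helper names, not from the thesis; a random unitary test matrix is built via QR) chooses φ and σ as described above and verifies that the (1, n) entry is zeroed while unitarity is preserved:

```python
import numpy as np

def basic_rotation(n, p, q, phi, sigma):
    # U^{pq}(phi, sigma) with the entries of (5.87); p, q are 0-based indices
    U = np.eye(n, dtype=complex)
    U[p, p] = U[q, q] = np.cos(phi)
    U[p, q] = np.exp(1j * sigma) * np.sin(phi)
    U[q, p] = -np.exp(-1j * sigma) * np.sin(phi)
    return U

def eliminate(U, p, q):
    # choose sigma so that U(p,q)/U(p,p) e^{-j sigma} is real, then phi via arctan
    sigma = np.angle(U[p, q]) - np.angle(U[p, p])
    phi = np.arctan2(np.abs(U[p, q]), np.abs(U[p, p]))
    return U @ basic_rotation(U.shape[0], p, q, phi, sigma).conj().T

# random unitary test matrix via QR of a complex Gaussian matrix
rng = np.random.default_rng(1)
G = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
U, _ = np.linalg.qr(G)

U1 = eliminate(U, 0, 3)           # post-multiply by U^{1n}(.)^H
print(abs(U1[0, 3]))              # numerically zero: the (1, n) entry is eliminated
```

Since only a unitary factor is applied, U1 remains unitary, and the remaining entries of the first row can be eliminated in the same way.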
After analyzing this algorithm, we come to the conclusion that a real-valued orthonormal matrix yields

α_n ∈ {0, π},    α_1 = ... = α_{n-1} = 0,    σ_{pq} = 0 for all p, q,    (5.95)

such that the basic unitary matrices U^{pq} obtained, and in turn the decomposition of Theorem 5.1, are real-valued.
Applying Theorem 5.1 to the matrix V^T obtained by the SVD of B_n^{-1} A gives

V^T = diag{ e^{jα_1}, ..., e^{jα_t} } ∏_{p=t-1}^{1} ∏_{q=p+1}^{t} U^{pq}(φ_{pq}, σ_{pq}),

where α_1 = ... = α_t = 0 without loss of generality, since these factors (if not already zero according to (5.95)) can be incorporated in the matrix U. Then,

V = ∏_{p=1}^{t-1} ∏_{q=t}^{p+1} U^{pq}(φ_{pq}, σ_{pq})^H.    (5.96)
Note that

U^{pq}(0, σ_{pq}) = I_t,    (5.97)

and

‖ U^{pq}(φ_{pq}, σ_{pq})^H - I_t ‖_2 = 2 | sin( φ_{pq} / 2 ) |,    (5.98)

which motivates us to set those basic rotation matrices equal to the identity matrix that have the smallest absolute angle values |φ_{pq}|. To put it in another way, given any number f ∈ {0, ..., t(t-1)/2}, we obtain a unitary matrix V_f ∈ R^{t×t} according to

V_f = ∏_{p=1}^{t-1} ∏_{q=t}^{p+1} U^{pq}(φ_{pq}, σ_{pq})^H,    (5.99)

where in the product only the f factors with the greatest absolute angle values |φ_{pq}| are kept, while all remaining factors are replaced by the identity matrix.
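The selection of the f most significant rotations can be made concrete for the real-valued case. Below is a small NumPy sketch (illustrative helper names; the example matrix is random, not from the thesis) that decomposes an orthonormal matrix into basic rotations along the lines of Theorem 5.1 with (5.95), and keeps only the factors with greatest |φ| as in (5.99):

```python
import numpy as np

def givens(n, p, q, phi):
    # real-valued basic rotation U^{pq}(phi, 0), cf. (5.87)
    G = np.eye(n)
    G[p, p] = G[q, q] = np.cos(phi)
    G[p, q] = np.sin(phi)
    G[q, p] = -np.sin(phi)
    return G

def decompose(Vt):
    # eliminate row by row as in the proof of Theorem 5.1 (real-valued case)
    n = Vt.shape[0]
    M = Vt.copy()
    factors = []
    for p in range(n - 1):
        for q in range(n - 1, p, -1):
            phi = np.arctan2(M[p, q], M[p, p])
            M = M @ givens(n, p, q, phi).T      # zeros the entry M[p, q]
            factors.append((p, q, phi))
    return factors, M                           # M is diagonal with entries +-1, cf. (5.95)

def truncate(factors, f):
    # keep only the f rotations with greatest |phi|; the rest become the identity
    order = sorted(range(len(factors)), key=lambda i: -abs(factors[i][2]))
    keep = set(order[:f])
    return [(p, q, phi if i in keep else 0.0) for i, (p, q, phi) in enumerate(factors)]

def rebuild(factors, D):
    V = D.copy()
    for p, q, phi in reversed(factors):
        V = V @ givens(D.shape[0], p, q, phi)
    return V

rng = np.random.default_rng(3)
Vt, _ = np.linalg.qr(rng.standard_normal((4, 4)))
factors, D = decompose(Vt)
print(np.allclose(rebuild(factors, D), Vt))                   # exact with all 6 rotations
print(np.linalg.norm(rebuild(truncate(factors, 2), D) - Vt))  # approximation error
```

With all t(t-1)/2 rotations the matrix is reproduced exactly; with fewer rotations the approximation error grows, but only the rotations with small |φ| (which are close to the identity by (5.98)) are discarded first.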
Furthermore, let B ∈ R^{t×t} denote a diagonal matrix with tr{B B^T} = S. We will define the transmit matrix to be

T = T_f = V_f B,    (5.101)

where B is optimized according to the Water-Filling rule applied to the diagonal elements of the matrix J′, J′ being the upper triangular matrix of the QR-decomposition of B_n^{-1} A V_f, i.e.,

Q′ J′ = B_n^{-1} A V_f.    (5.102)

Note the relationship between J′ and J, J being the upper triangular matrix of the QR-decomposition of B_n^{-1} A T, i.e.,

Q J = B_n^{-1} A V_f B = Q′ J′ B,    (5.103)

from which follows

Q = Q′  and  J = J′ B,    (5.104)

and, furthermore,

J(i, i) = J′(i, i) B(i, i),    i = 1, ..., s.    (5.105)
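The relationships (5.103)-(5.105) are easy to verify numerically; a short NumPy check (the random matrix M stands in for B_n^{-1} A V_f, and B is an arbitrary positive diagonal matrix):

```python
import numpy as np

# Scaling the columns by a positive diagonal B changes the QR factor J'
# into J = J'B while Q stays the same, cf. (5.102)-(5.104).
rng = np.random.default_rng(2)
M = rng.standard_normal((5, 5))            # stands in for Bn^{-1} A Vf
B = np.diag(rng.uniform(0.5, 2.0, 5))      # positive diagonal matrix

Qp, Jp = np.linalg.qr(M)                   # Q'J' = Bn^{-1} A Vf
Q, J = np.linalg.qr(M @ B)                 # QJ  = Bn^{-1} A Vf B

print(np.allclose(Q, Qp), np.allclose(J, Jp @ B))   # Q = Q' and J = J'B
```

The positive diagonal scaling only rescales the columns of the triangular factor, so the orthonormal factor is unaffected.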
Hence, the (sum-)capacity becomes

C_{T,R} = (1/2) Σ_{i=1}^{s} log( 1 + (B(i,i))² (J′(i,i))² )
        = (1/2) Σ_{i=1}^{s} [ log( L (J′(i,i))² ) ]⁺,    (5.106)

with

(B(i,i))² = [ L - 1 / (J′(i,i))² ]⁺,    i = 1, ..., s,

and with a Water Level L, chosen to satisfy Σ_{i=1}^{s} (B(i,i))² = S (here, [x]⁺ = max{x, 0}).
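The Water-Filling rule of (5.106) can be sketched as follows (a minimal NumPy implementation; the bisection search for the Water Level L is one of several possible choices and not prescribed by the thesis):

```python
import numpy as np

def water_filling(J_diag, S, tol=1e-12):
    # B(i,i)^2 = [L - 1/J'(i,i)^2]^+ with L such that sum_i B(i,i)^2 = S
    inv = 1.0 / np.asarray(J_diag, dtype=float) ** 2
    lo, hi = 0.0, S + inv.max()            # bracket: the sum is 0 at lo and >= S at hi
    while hi - lo > tol:
        L = 0.5 * (lo + hi)
        if np.maximum(L - inv, 0.0).sum() > S:
            hi = L
        else:
            lo = L
    L = 0.5 * (lo + hi)
    return np.maximum(L - inv, 0.0), L

power, L = water_filling([2.0, 1.0, 0.25], S=2.0)
print(power, power.sum())   # the weakest subchannel (J' = 0.25) receives zero power
```

The allocated powers sum to the budget S, and subchannels whose inverse gain exceeds the Water Level are switched off entirely.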
We want to emphasize that using optimized permutation matrices in the transmitter (similar to the BLAST approach) may enhance the performance even further. For our method, this means that we deal with transmit matrices of the form

T = T_f = V_f P B,    (5.107)

where P denotes a permutation matrix.
The transmit joint processing function,

t |-> x = T(t) = T_f t,    (5.109)

can be evaluated in two different ways during transmission.
The conventional way would be to perform multiplications of the vector t with the transmit matrix T_f, where T_f has been completely determined in the Startup phase before data transmission. This method would yield an Online Complexity of

O^T_mult = (N - 1) 2K²,    O^T_add = N K (2K - 1) - 2K²    (5.110)

for our MIMO DMT scheme (2.25), where its block diagonal structure has been utilized. It is equal to the transmit Online Complexity of the full diagonalization SVD scheme, cf. (5.66).
The other way to evaluate (5.109) is to make use of the factorization of T_f into a diagonal matrix B, possibly a permutation matrix P, and f basic rotation matrices according to (5.99). Observe that a multiplication by a real-valued basic rotation matrix U^{pq}(φ_{pq}, 0), cf. (5.87), requires 4 real-valued multiplications and 2 real-valued additions. Hence, if we make use of the block diagonal structure of our MIMO DMT scheme (2.25), and denote by f_0, f_{N/2} ∈ {0, ..., K(K-1)/2} and f_n ∈ {0, ..., K(2K-1)}, n = 1, ..., N/2 - 1, the numbers of basic rotation matrices used per subcarrier^17, we obtain

O^T_mult = 4f,    O^T_add = 2f,    (5.111)

with

f = Σ_{n=0}^{N/2} f_n ∈ { 0, ..., (N/2) K (2K - 1) - K² }.    (5.112)
Therefore, if we only make use of a small number f of rotation matrices, the computational effort required at the transmitter is very low. But note that if f exceeds a
^17 Note that there is one block per subcarrier / frequency (n denotes the subcarrier / frequency index).
certain threshold, the Online Complexity of the multiplications will be even higher than the corresponding quantity in (5.110). We can easily compute this threshold to be

(N/4) K (2K - 1) - K²/2,    (5.113)

which is exactly one half of all basic rotation matrices obtained from the decomposition of Theorem 5.1. The transmit Online Complexity of the required additions will never exceed its conventional counterpart, as can be seen by comparing (5.110) with (5.111) and (5.112). Note that this second approach will always yield a high transmit Online Complexity if we choose f so large that the performance is close to that of the (optimum) SVD scheme. In order to obtain a small complexity, we have to operate close to V-BLAST.
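For illustration, the trade-off can be evaluated for example parameter values (N = 512 and K = 4 are hypothetical here, not taken from the thesis; the expressions follow (5.110)-(5.113) as reconstructed above):

```python
# Transmit Online Complexity: conventional matrix multiplication vs. rotations
N, K = 512, 4                                          # example values only

mult_conventional = (N - 1) * 2 * K**2                 # cf. (5.110)
f_max = (N // 2) * K * (2 * K - 1) - K**2              # all basic rotations, cf. (5.112)
f_threshold = (N // 4) * K * (2 * K - 1) - K**2 // 2   # cf. (5.113)

print(f_threshold, f_max, mult_conventional)
print(2 * f_threshold == f_max)                        # the threshold is one half of f_max
```

For small f the rotation-based evaluation costs only 4f multiplications, far below the conventional count, while using all rotations would exceed it.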
5.4.4 Comments
As already mentioned in the beginning of this section, the UP MIMO scheme
was originally designed for wireless transmission in order to cope with different
amounts of channel state information at the transmitter side.
Whereas channel knowledge at the receiver side can be justified in practice by the
use of channel estimation and channel prediction techniques, channel knowledge
at the transmitter side can only be guaranteed if either reciprocity applies or there
is a backward channel. This implies that transmission schemes that utilize channel knowledge at the transmitter side are mostly of theoretical interest, since they
can yield upper bounds for achievable performances. However, there are some
scenarios, e.g. slowly varying channels, for which it would make sense to use
transmission schemes that require channel knowledge at the transmitter side.
In practice, we have the problem that we do not know in advance whether a
transceiver operates only in such a special environment, or if the environment may
suddenly change and channel knowledge at the transmitter side is lost. Consider
for example the situation where a user does not move during the first moments of
transmission, but, after a certain period of time, starts moving. Therefore, such
a transmission scheme is very inflexible and sensitive to (faster) variations of the
channel.
This was the original reason for designing the UP MIMO transmission scheme
[2, 37], since it can adapt to the current channel situation. According to (5.101) /
(5.107) with (5.99), the transmit matrix T can be described by a set of parameters,
i.e., the parameters of the basic rotation matrices and the diagonal elements of B.
Note that the transmitter must have knowledge of T but not of anything more. If
the channel is quasistatic, all parameters are retransmitted to the transmitter using
a backward channel, and full (SVD) performance is obtained. If the channel starts
to vary (not too fast), only the most important parameters (the ones that correspond
to the greatest absolute angle values) are retransmitted; and if the channel fades
really fast, no parameters are retransmitted (VBLAST performance). Note also
the possibility that there is only a lowrate backward channel, so that we can only
retransmit a few channel parameters. Hence, the main advantage of the (wireless)
UP MIMO scheme is its flexibility in dealing with channel fluctuations.
We also want to mention that the author proposed a differential variant of the
UP MIMO scheme as well in [2]. The basic idea behind this differential UP MIMO
scheme is that it applies Theorem 5.1 not to the optimum transmit matrix obtained
by the SVD, but to the changes, i.e., the quotient, of the optimum transmit matrices
for consecutive time instants. With this method, it is feasible to reduce the number
of retransmitted parameters even further (if the channel does not vary too fast).
We performed simulations of the UP MIMO scheme with the same parameters
as in Subsection 5.3.4. It turned out that the UP MIMO scheme yields the same
curves as the full diagonalization SVD scheme, even for small numbers f of rotation matrices used (and often even for f = 0). The reason for this is that we
have a high SNR for shorter loop lengths, so that even a simple channel matrix
inversion (zeroforcing) would have a performance almost identical to the SVD
scheme (full diagonalization). For such an operating range, we will mostly gain
by crosstalk (FEXT) removal, with very little influence of the noise. For longer
loop lengths, we already came to the conclusion, cf. Subsection 5.3.4, that joint
processing yields less benefit, so that the gain achieved by applying the UP MIMO
scheme is limited.
So, one might ask, why use the UP MIMO scheme at all for wireline transmission? (For wireless transmission, its advantages are unquestioned.) The answer
is that there are situations where the UP MIMO scheme has a better performance
than other schemes. We found out that if the noise is correlated across the various
loops, the UP MIMO scheme, and especially the use of some rotation matrices at
the transmitter side, can improve performance, provided that the SNR is not too
high. Hence, one good argument for applying the UP MIMO scheme is to make
the transmission robust against certain impairments by utilizing the information
available and still having a low Online Complexity at the transmitter side.
In [61], it is shown that a Decision Feedback structure at the receiver side has
its dual at the transmitter side, which is called Precoding, cf. also [14]. Since we
have a decision feedback structure according to (5.83), we can just as well think
of a UP MIMO precoding scheme, which benefits from the use of basic rotation
matrices.
Finally, note that our algorithm for parameter determination / selecting the basic rotation matrices is very simple, since it merely has to find the parameters with the greatest absolute values, cf. (5.99). However, it only takes into account the matrix V^T obtained by the SVD of

B_n^{-1} A = U D V^T,

ignoring any information contained in the matrix D. Note that the matrix U does not affect the upper triangular matrix in the QR-decomposition of B_n^{-1} A T. So one may look for an improved (and maybe more complex) algorithm that makes additional use of this matrix (i.e., the singular values of B_n^{-1} A).
in wireline transmission. On the other hand, the loss measured by (uncoded) symbol error probability can be quite large, so that we can expect enough benefit to
afford the additional effort required for implementation. Furthermore, we showed
how to modify the existing bitloading algorithms in order to obtain the optimum
constellation parameters.
We also performed a detailed interference analysis for a DMT system. We considered the case when the channel impulse response exceeds the Cyclic Prefix on
both sides, which yields precursors and postcursors from both neighboring DMT
symbols (intersymbol interference) and also intercarrier interference. We derived
closed form formulas for both contributions and considered their statistical properties as well. We came to the conclusion that both interference contributions are
complex random vectors with equal first and second order moments and a nonvanishing pseudocovariance matrix.
We also showed how the noise and interference results obtained can be utilized
for the design of Time Domain Equalizers.
In a second step, we generalized the noise and interference results from DMT
to the MIMO DMT case. Again, it was possible to obtain closed form solutions,
even for very general assumptions with respect to correlations across the various
loops of the cable bundle.
We presented the general form of a transmission scheme that is suited to the
MIMO DMT channel and is based on socalled joint processing functions. It allows
the use of SISO codes, and we introduced the (sum ) capacity as a performance
measure.
We dealt with transmission schemes whose joint processing functions were
based on the Singular Value Decomposition (SVD) of the channel matrix. We
showed that the optimum joint processing function can be obtained by means of
the SVD. Furthermore, we studied low(er)complexity variations and discussed
their performance. To obtain quantitative results, we performed simulations with
realistic (practically used) parameters and compared the various methods.
Finally, we presented the UP MIMO scheme, a scheme that was originally designed by the author for wireless transmission, and also has applications to wireline
transmission. Specifically, it can be used to reduce the computational complexity
at the transmitter side (but not at the receiver side). We treated various aspects of
this scheme.
We also want to mention potential areas for further research:
- The application of rotated and / or non-square constellations is not compliant with the xDSL standards, since it requires a modification of the transmitter. In order to utilize the non-vanishing pseudo-covariance matrix in a standard-compliant manner, one has to think of alternative solutions. A possible approach could be to adapt the decoding strategy (soft decoding) so that it makes use of the knowledge that transmission is more reliable, e.g., for the real part than for the imaginary part of the transmitted symbol.

- We explained how the noise and interference results obtained can be used for the design of Time Domain Equalizers, both in the DMT and MIMO DMT case. However, we did not present explicit algorithms, and this is certainly a very interesting application.

- We already mentioned that the UP MIMO scheme has potential for further improvements if it utilizes the singular values of the channel matrix. A low-complexity algorithm for parameter determination / selecting the basic rotation matrices that has better performance because it makes use of this data could be the goal of future research activities.

- Joint processing functions with small Online Complexity and good performance are of interest, as well as methods that do not require synchronization between the different loops.
APPENDIX
In this appendix we introduce the mathematical notation used in this work and
summarize the most important abbreviations.
The set of real numbers is written as R, and the set of complex numbers is written as

C := R + jR.    (A.1)

Similarly, the set of real-valued vectors of dimension n (real-valued n-tuples) is written as R^n, and the set of complex-valued vectors of dimension n (complex-valued n-tuples) is written as C^n := R^n + jR^n. Furthermore, the set of real-valued (n × m)-matrices is written as R^{n×m}, and the set of complex-valued (n × m)-matrices is written as C^{n×m} := R^{n×m} + jR^{n×m}. A boldface font is used to denote vectors (lowercase letters, e.g. x) and matrices (uppercase letters, e.g. A), in order to distinguish these objects from scalars. For complex vectors and matrices, the real and imaginary part (operators), Re{·} and Im{·}, respectively, are defined componentwise, such that we have x = Re{x} + j Im{x} for x ∈ C^n and A = Re{A} + j Im{A} for A ∈ C^{n×m}. Analogously to (A.1), complex conjugation is written as

x* = Re{x} - j Im{x}  and  A* = Re{A} - j Im{A}.    (A.2)
Transpose and Hermitian transpose of a vector / matrix are denoted by the superscripts T and H (in a boldface font), respectively. The inverse of a nonsingular square matrix A is denoted by A^{-1}, whereas the Moore-Penrose pseudo-inverse [19] of an arbitrary matrix A ∈ C^{n×m} or A ∈ R^{n×m} is denoted by A^+. Determinant and trace of a matrix A are denoted by det A and tr A, respectively.
A rectangular matrix is called diagonal if all entries with different column and row indices are 0.
By diag_{r×t}{d_1, ..., d_s} with s = min{r, t} we denote a complex (real) valued matrix with r rows and t columns for which all entries with different row and column indices are 0 and the entry with i-th row and i-th column index is equal to d_i (i = 1, ..., s).
The following table summarizes the most important symbols.

Symbol | Meaning
j | Imaginary number (j² = -1)
K | Number of (considered) loops in a cable bundle
p | Length of the Cyclic Prefix
N | DFT and IDFT length
F = 1/√N [ e^{-j(2π/N)kl} ]_{k,l=0,...,N-1} | DFT matrix
F^{-1} = 1/√N [ e^{j(2π/N)kl} ]_{k,l=0,...,N-1} | IDFT matrix
I_n | (n × n) identity matrix
E{·} | Expectation operator
x* | Complex conjugate of x, cf. (A.2)
C_x | Covariance matrix of x
P_x | Pseudo-covariance matrix of x
R_z | Correlation matrix of z
R_{z1,z2} | Cross-correlation matrix of z1 and z2
h(x) | Differential entropy of x
I(x; y) | Mutual information of x and y
C | Capacity
C_{T,R} | (Sum-)capacity achieved with the joint processing functions T and R
L | Water Level
d(k) | -
Q(x) = 1/√(2π) ∫_x^∞ e^{-t²/2} dt | Gaussian Q-function
H(z) = Σ_{k=-∞}^{+∞} h(k) z^{-k} | z-transform of the sequence h(k)
H_l^+(z) = Σ_{k=0}^{+∞} h(k+l) z^{-k} | Shifted causal part of H(z)
H_l^-(z) = Σ_{k=-∞}^{-1} h(k-l) z^{-k} | Shifted anticausal part of H(z)
T | Joint processing function at the transmitter side
R | Joint processing function at the receiver side
A.3 Abbreviations
The following table summarizes the acronyms used in this work.
Acronym | Meaning
ADSL | Asymmetric Digital Subscriber Line
CP | Cyclic Prefix
CSI | Channel State Information
DFE | Decision Feedback Equalization
DFT | Discrete Fourier Transform
DMT | Discrete Multitone Modulation
DSL | Digital Subscriber Line
FDE | Frequency Domain Equalizer
FEXT | Far-End Crosstalk
FFT | Fast Fourier Transform
IDFT | Inverse Discrete Fourier Transform
ICI | Intercarrier Interference
IFFT | Inverse Fast Fourier Transform
ISI | Intersymbol Interference
MIMO | Multiple-Input / Multiple-Output
MMSE | Minimum Mean Square Error
NEXT | Near-End Crosstalk
NJPF | No Joint Processing Function (scheme)
OFDM | Orthogonal Frequency Division Multiplexing
PSD | Power Spectral Density
QAM | Quadrature Amplitude Modulation
QR | Matrix decomposition (Gram-Schmidt orthogonalization)
SISO | Single-Input / Single-Output
SNR | Signal-to-Noise Ratio
SVD | Singular Value Decomposition
TDE | Time Domain Equalizer
UP MIMO | Unitary Parametrization Multiple-Input / Multiple-Output
V-BLAST | Vertical Bell Labs Layered Space-Time (detection algorithm)
VDSL | Very-high bit rate Digital Subscriber Line
xDSL | Acronym for all DSL systems
B. SIMULATION SCENARIOS
B.1 Scenario 1
For the used nomenclature we refer to Section 2.1.
Channel symbol rate: 1/T = 2.208 MHz
Fig. B.1: One-sided power spectral density (PSD) of the noise process s at the input of the receiver (PSD in dBm/Hz over frequency in kHz).
B.2 Scenario 2
For the used nomenclature we refer to Section 2.2.
Fig. B.2: One-sided power spectral densities (PSD) of all noise processes s⟨k⟩, k = 1, ..., K, at the input of the receivers (PSD in dBm/Hz over frequency in kHz).
DFT length: N = 512
Length of Cyclic Prefix: p = 32
Channel symbol rate: 1/T = 2.208 MHz
B.3 Scenario 3
For the used nomenclature we refer to Section 2.2.
Channel symbol rate: 1/T = 2.208 MHz
Fig. B.3: One-sided power spectral densities (PSD) of all noise processes s⟨k⟩, k = 1, ..., K, at the input of the receivers (PSD in dBm/Hz over frequency in kHz).
BIBLIOGRAPHY
[1] D. Agrawal, T. J. Richardson, R. Urbanke, Multiple Antenna Signal Constellations for Fading Channels, IEEE Transactions on Information Theory, vol. 47, no. 6, pp. 2618-2626, Sep. 2001.
[2] A. Burr, Y. Zacharov, H. Toeger, W. Qiu, M. Meurer, Ch. Stimming, A. Vanaev, G. Tauböck, J. Shen, H. Mai, Selected MIMO Techniques and their Performance, IST-2001-32125 FLOWS Deliverable D14, 2003.
[3] A. Busboom, G. Herrmann, R. Tzschoppe, J. B. Huber, IFC  Aktive
[27] ITU-T, Very high speed digital subscriber line foundation, ITU-T, G.993.1, Nov. 2001.
[28] ITU-T, Asymmetric Digital Subscriber Line ADSL Transceivers 2 (ADSL2), ITU-T, G.992.3, Jul. 2002.
[29] ITU-T, Splitterless Asymmetric Digital Subscriber Line ADSL Transceivers 2 (splitterless ADSL2), ITU-T, G.992.4, Jul. 2002.
[40] F. D. Murnaghan, The Unitary and Rotation Groups, vol. III of Lectures on
Applied Mathematics, Spartan, Washington, DC, 1962.
[41] F. D. Neeser, Communication Theory and Coding for Channels with Intersymbol Interference, PhD thesis, ETH Zürich, Switzerland, 1993.
[42] P. Ödling, W. Henkel, P. O. Börjesson, G. Tauböck, N. Petersson, et al., The Cyclic Prefix of OFDM/DMT - An Analysis, Proc. of the International Zürich Seminar on Broadband Communications, Zürich, Switzerland, Feb. 2002.
[43] B. Picinbono, Random Signals and Systems, Englewood Cliffs, NJ:
PrenticeHall, 1993.
[44] B. Picinbono, P. Chevalier, Widely Linear Estimation with Complex Data, IEEE Transactions on Signal Processing, vol. 43, pp. 2030-2033, Aug. 1995.
[45] G. G. Raleigh, J. M. Cioffi, Spatio-Temporal Coding for Wireless Communication, IEEE Transactions on Communications, pp. 357-366, March 1998.
[46] J.C. Roh, B.D. Rao, Channel Feedback Quantization Methods for MISO
and MIMO Systems, Proc. of IEEE Symposium on Personal, Indoor and Mobile Radio Communications, Barcelona, Spain, September 2004.
[47] J.C. Roh, B.D. Rao, Vector Quantization Techniques for MultipleAntenna
Channel Information Feedback, Proc. of International Conference on Signal Processing and Communications (SPCOM), Bangalore, India, December
2004.
[48] T. Starr, M. Sorbara, J. Cioffi, P. Silverman, Understanding Digital Subscriber Line Technology, Prentice Hall, 2003.
[49] D. Statovci, T. Nordström, Adaptive Subcarrier Allocation, Power Control, and Power Allocation for Multiuser FDD-DMT Systems, Proc. of IEEE International Conference on Communications, ICC 2004, Paris, France, June 2004.
[50] D. Statovci, T. Nordström, Adaptive Resource Allocation in Multiuser FDD-DMT Systems, Proc. of the 12th European Signal Processing Conference, EUSIPCO 2004, Vienna, Austria, Sept 7 - Sept 10, 2004.
[51] J. Stoer, R. Bulirsch, Introduction to Numerical Analysis, Third Edition, Springer, New York, NY, 2002.
[52] V. Tarokh, N. Seshadri, A. R. Calderbank, Space-time codes for high data rate wireless communication: performance criterion and code construction, IEEE Transactions on Information Theory, vol. 44, no. 2, pp. 744-765, March 1998.
BIOGRAPHY
Georg Tauböck was born in Mödling, Austria, in 1973. He received the Dipl.-Ing. degree in electrical engineering from Vienna University of Technology in 1999 and finished his studies in Violoncello with the Diploma Examination at the Conservatory of Vienna in 2000. He joined the Telecommunications Research Center Vienna (ftw.) in 1999, where he is still working as a researcher in the strategic I0 project. He is the author of several scientific papers, two book chapters, and one patent. His research interests include Multiple-Input / Multiple-Output (MIMO) systems, Discrete Multitone modulation (DMT), Orthogonal Frequency Division Multiplexing (OFDM), Information Theory, Free Probability Theory, Time-Frequency Analysis, and mathematics in general.