Final report
2E1367 - Project Course in Signal Processing and Digital Communication
Black Team
Jia Liu, Erik Bergenudd, Vinod Patmanathan, Romain Masson
Project 2005
Abstract
This report discusses the design and implementation of an OFDM modem for simplex communication between two PCs over a frequency-selective channel. First, a brief introduction
explains the background and the specification of the project. Then the report
deals with the system model: each block of the OFDM system is described (IFFT/FFT, cyclic
prefix, modulation/demodulation, channel estimation, bit loading). In the following section, the
system architecture is analysed, and the transmission protocol and the system parameters
are explained in detail. Then the DSP implementation is discussed. Finally, the results are
provided in the last chapter.
Acknowledgment
We would like to thank our project assistant, Xi Zhang, for his help during the course.
Contents
Abstract
Acknowledgment
1 Introduction
1.1 Background
1.2 Specification
1.3 Equipment
1.4 Project overview
2 System Model
2.1 Overview
2.1.1 Transmitter
2.1.2 Channel
2.1.3 Receiver
2.2 OFDM System
2.2.1 Evolution of OFDM
2.2.2 Introduction to OFDM
2.2.3 FFT and IFFT
2.2.4 Cyclic Prefix
2.2.5 Modulation and demodulation
2.2.6 Channel and Noise Estimation
2.2.7 Adaptive Bit Loading
3 System Architecture
3.1 Frame structure
3.1.1 Transmission Protocol
3.1.2 Training sequence
3.1.3 Length Frame
3.1.4 Pilots Frame
3.2 Convolution Encoder and Decoder
3.3 Mirror Operation
3.4 Upsampling and pulse shaping
3.5 Synchronization
3.6 Frequency offset
4 DSP implementation
4.1 Overview of the Board
4.2 Data Transfer
4.3 Ping Pong Buffering with Linked EDMA transfers
4.4 Implementation issues
4.5 DSP transmitter implementation
4.6 DSP Receiver Implementation
4.7 Changes compared to simulation
8 Future work
8.1 Increase the bit rate
8.2 Peak to average ratio
8.3 Improving the system
Conclusion
List of Figures
1.1 [caption missing]
2.1 OFDM system
2.2 Bits flow through an interleaver
2.3 Basis functions in OFDM system
2.4 OFDM system
2.5 Channel impulse response
2.6 Adding a cyclic prefix to a frame
2.7 Bits allocation in modulation
2.8 4-QAM (left) and 16-QAM (right)
2.9 Block type channel estimation
2.10 Adaptive bit loading
3.1 Preamble of transmission
3.2 Transmission protocol
3.3 Structure of the training sequence
3.4 A 1/2-rate convolution encoder
3.5 Trellis used in the Viterbi decoder
3.6 Mirror operation, producing a real valued output from IFFT
3.7 Impulse response of the root-raised-cosine filter
3.8 The frames "slide" in the received buffer because of the sampling clock difference
3.9 The synchronization method uses correlation between the training sequence and the output from the matched filter
3.10 The maximum of the correlation indicates where the training frame starts
3.11 Channel estimation compensates the rotation of the constellation due to frequency offset
4.1 [caption missing]
4.2 [caption missing]
4.3 [caption missing]
4.4 [caption missing]
4.5 [caption missing]
5.1 [caption missing]
5.2 [caption missing]
6.1 [caption missing]
6.2 Channel estimation in two different cases (noise free and noisy channel)
6.3 Bit allocation in two different cases (SNR = 30 dB, SNR = 6 dB)
6.4 BER vs SNR for the coded simulated system
7.1 Received signal
7.2 Signal after FFT
7.3 Constellation before channel compensation
7.4 Constellation after channel compensation
7.5 Received picture and results
Chapter 1
Introduction
1.1 Background
In a basic communication system, the data are modulated onto a single carrier frequency. The
available bandwidth is then totally occupied by each symbol. Such a system can suffer from
inter-symbol interference (ISI) when the channel is frequency selective. The basic idea of OFDM is
to divide the available spectrum into several orthogonal subchannels so that each narrowband
subchannel experiences almost flat fading. With OFDM, the subchannels can overlap
in the frequency domain, thus increasing the transmission rate. OFDM systems
have gained increased interest in recent years. OFDM is used in the European digital
broadcast radio system, as well as in wired environments such as asymmetric digital subscriber lines (ADSL). In digital subscriber lines (xDSL), the technique provides a high bit rate
over a twisted pair of wires. The project focuses on the latter application.
1.2 Specification
The goal of this project is to design and implement a point-to-point digital communication
link between two PCs over a channel with characteristics similar to those of a telephone line. A feedback
channel is used to report the channel estimate and to adapt the bit rate of each subcarrier. We provide
the possibility to transmit source files (pictures) and to display them on the screen at the
receiver. A graphical interface is implemented to control the system.
1.3 Equipment
Matlab, Code Composer Studio 1.12, Microsoft Visual C++, Frontpage, Illustrator 10.
1.4 Project overview
Our point-to-point communication system consists of two PCs, each equipped with a DSP
card. Although a full-duplex system would be used in practice, we consider simplex communication:
one of the PCs is the transmitter and the other is the receiver. The two PCs communicate
through a telephone line (modeled by emulation hardware in the project). Pictures are
transmitted and displayed on the receiver's screen. The receiver estimates the channel on each
carrier and feeds this information back to the transmitter through the feedback link. In practice the
feedback link has the same characteristics as the direct link. In this project, we use a simple
cable as the feedback link, since it is not the focus of the work.
[Figure 1.1: The transmitter PC and its DSP are connected to the receiver DSP and PC through the telephone line emulation; a simple cable serves as the feedback link.]
Chapter 2
System Model
2.1 Overview
[Figure 2.1: OFDM system. TX: convolutional encoder, interleaver, modulation, IFFT with complex mirror, CP, pulse shaping, D/A. Channel. RX: A/D, matched filter, CP removal, FFT, removal of the complex mirror, demodulation with channel compensation, deinterleaver, decoder. Channel estimation and bit loading are fed back from RX to TX.]
2.1.1 Transmitter
Convolutional encoder. To decrease the error rate of the system, a simple convolutional encoder of rate 1/2 is used for channel coding.
Interleaver. The interleaver rearranges the input data such that consecutive data are split
among different blocks. This is done to avoid bursts of errors. The interleaver can be viewed
as a matrix: the stream of bits fills the matrix row by row, and the bits then leave the
matrix column by column. The depth of the interleaver can be adjusted.
Modulation. A modulator transforms a set of bits into a complex number corresponding
to a signal constellation. The modulation order depends on the subcarrier. A subcarrier
[Figure 2.2: Bits flow through an interleaver, from input bits to output bits]
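The row-in, column-out behaviour of the interleaver, and the inverse operation at the receiver, can be sketched as follows. This is a minimal illustration only: the function names and the matrix dimensions are assumptions, not taken from the project code.

```python
def interleave(bits, rows, cols):
    """Fill a rows x cols matrix row by row, then read it column by column."""
    if len(bits) != rows * cols:
        raise ValueError("block length must equal rows * cols")
    matrix = [bits[r * cols:(r + 1) * cols] for r in range(rows)]
    return [matrix[r][c] for c in range(cols) for r in range(rows)]


def deinterleave(bits, rows, cols):
    """Inverse operation: fill column by column, then read row by row."""
    if len(bits) != rows * cols:
        raise ValueError("block length must equal rows * cols")
    columns = [bits[c * rows:(c + 1) * rows] for c in range(cols)]
    return [columns[c][r] for r in range(rows) for c in range(cols)]


block = [0, 1, 2, 3, 4, 5]          # labels instead of bits, to make the order visible
mixed = interleave(block, rows=2, cols=3)
restored = deinterleave(mixed, rows=2, cols=3)
```

With this layout, neighbouring inputs end up `rows` positions apart at the output, which is how a burst of channel errors gets spread over several code words.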
2.1.2 Channel
The channel must have the same characteristics as the twisted pair of wires found in the telephone
network. To achieve this, we use telephone line emulation hardware. We can also use
the adjustable filter ZePo and the noise generator, which is very useful
for testing the system performance.
2.1.3 Receiver
Demodulation. Symbols are transformed back to bits. The inverse of the estimated channel
response is used to compensate for the channel gain.
Deinterleaver (the inverse operation of interleaving). The stream of bits fills the matrix column
by column. Then, the bits leave the matrix row by row.
Convolutional decoder. The decoder uses the Viterbi algorithm to recover the
transmitted bits from the coded bits.
2.2 OFDM System
This section introduces OFDM and considers the key aspects of the system.
2.2.1 Evolution of OFDM
[Figure 2.3: Basis functions in OFDM system]
Today, OFDM has grown to be one of the most popular techniques for high-speed
communications.
2.2.2 Introduction to OFDM
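[Figure 2.4: OFDM system. Transmitter: bits, modulation, X(n), IFFT, x(k), CP, s(k). Channel. Receiver: r(k), CP removal, y(k), FFT, Y(n), channel compensation, demodulation, bits. Channel estimation and bit loading are fed back to the transmitter.]
2.2.3 FFT and IFFT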
The key components of an OFDM system are the inverse FFT at the transmitter and the FFT at the
receiver. These operations perform linear mappings between N complex data symbols and
N complex OFDM symbols, and make the system robust against fading multipath channels. The idea
is to transform the high data rate stream into N low data rate streams, each experiencing flat
fading during the transmission. Suppose the data set to be transmitted is
X(0), X(1), ..., X(N-1)
where N is the total number of sub-carriers. The discrete-time representation of the signal after
the IFFT is:
\[ x(n) = \frac{1}{\sqrt{N}} \sum_{k=0}^{N-1} X(k)\, e^{j 2\pi k n / N}, \quad n = 0, \dots, N-1 \tag{2.1} \]
At the receiver side, the data is recovered by performing the FFT on the received signal,
\[ Y(k) = \frac{1}{\sqrt{N}} \sum_{n=0}^{N-1} x(n)\, e^{-j 2\pi k n / N}, \quad k = 0, \dots, N-1 \tag{2.2} \]
An N-point FFT requires only on the order of N log(N) multiplications, which is much more computationally
efficient than an equivalent system with an equalizer in the time domain.
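As a quick numerical check of the IFFT/FFT pair, the sketch below sends one block of symbols through an inverse and a forward transform with no channel in between. It assumes the unitary scaling (a factor 1/sqrt(N) in both directions), which NumPy provides through `norm="ortho"`; the block size and the symbol alphabet are arbitrary choices for illustration.

```python
import numpy as np

N = 8                                     # number of sub-carriers (illustrative)
rng = np.random.default_rng(1)
# One QPSK-like data symbol per sub-carrier
X = rng.choice([-1.0, 1.0], N) + 1j * rng.choice([-1.0, 1.0], N)

x = np.fft.ifft(X, norm="ortho")          # transmitter: frequency -> time
Y = np.fft.fft(x, norm="ortho")           # receiver: time -> frequency

roundtrip_ok = np.allclose(Y, X)
energy_preserved = np.isclose(np.sum(np.abs(x) ** 2), np.sum(np.abs(X) ** 2))
```

With the unitary scaling the transform pair preserves the signal energy, which is convenient when an energy budget per sub-carrier is specified in the frequency domain.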
2.2.4 Cyclic Prefix
In an OFDM system, the channel has a finite impulse response. We denote by t_max the maximum
delay of all reflected paths of the transmitted OFDM signal, see Figure 2.5.
[Figure 2.5: Channel impulse response, extending from 0 to t_max]
[Figure 2.6: Adding a cyclic prefix to a frame. The CP precedes the frame.]
The duration t_c of the cyclic prefix is chosen such that
\[ t_c > t_{max} \tag{2.3} \]
The idea behind this is to convert the linear convolution (between signal and channel response) into a circular convolution. In this way, the FFT of the circularly convolved signals is equivalent
to a multiplication in the frequency domain. However, in order to preserve the orthogonality
property, t_max must not exceed the duration of the guard interval. As shown below, once
the above condition is satisfied, there is no ISI, since the previous symbol only affects
samples within [0, t_max]. Orthogonality is also maintained, so that there is
no ICI.
[Diagram: X(n), IFFT, x(k), CP insertion, s(k), channel, r(k), CP removal, y(k), FFT, Y(n)]
\[ r(k) = s(k) * h(k) + e(k) \tag{2.4} \]
\[ y(k) = x(k) \circledast h(k) + e(k), \quad 0 \le k \le N-1 \tag{2.5} \]
\[ Y(n) = \mathrm{DFT}(y(k)) \tag{2.6} \]
\[ Y(n) = X(n) H(n) + E(n) \tag{2.7} \]
where \circledast denotes circular convolution and E(n) = DFT(e(k)). Another advantage of the
cyclic prefix is that it serves as a guard between consecutive OFDM frames. This is similar
to adding guard bits, which means that the problem of inter-frame interference also
disappears.
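The effect of the cyclic prefix can be checked numerically. The sketch below is an illustration under assumed values (a 16-point frame, a prefix of 4 samples, an arbitrary 3-tap channel, no noise); it shows that after removing the prefix, each received sub-carrier is simply the transmitted symbol multiplied by the channel frequency response.

```python
import numpy as np

N, M = 16, 4                              # frame length and cyclic prefix length (assumed)
h = np.array([1.0, 0.5, 0.25])            # example channel, shorter than the prefix
rng = np.random.default_rng(0)
X = rng.choice([-1.0, 1.0], N) + 1j * rng.choice([-1.0, 1.0], N)

x = np.fft.ifft(X)                        # time-domain frame
s = np.concatenate([x[-M:], x])           # copy the last M samples in front of the frame
r = np.convolve(s, h)                     # linear convolution with the channel
y = r[M:M + N]                            # receiver drops the prefix, keeps N samples

Y = np.fft.fft(y)
H = np.fft.fft(h, N)                      # channel frequency response on the N sub-carriers
flat_per_subcarrier = np.allclose(Y, X * H)
```

If `M` is reduced below `len(h) - 1`, the equality no longer holds, which is the numerical counterpart of the condition that the prefix must be longer than the channel delay spread.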
2.2.5 Modulation and demodulation
Given the adaptive bit loading algorithm, the modulator takes a number of bits and an energy
value as input for each subcarrier. The output for one subcarrier is a constellation symbol
with the desired energy, corresponding to the number of bits at the input. The modulator
accepts 2, 4, 6 or 8 bits, which means that only QPSK,
16-QAM, 64-QAM and 256-QAM, respectively, are available for modulation on each subcarrier.
[Figure 2.7: Bits allocation in modulation. Bits from the input buffer are mapped to the symbols s1 to s7 with a per-subcarrier constellation (BPSK, QPSK or 16-QAM in the illustration).]
[Figure 2.8: 4-QAM (left) and 16-QAM (right)]
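A modulator of this kind can be sketched as a function that maps a group of bits to a point of a square QAM constellation and scales it to the requested energy. The bit-to-point mapping below (natural binary on each axis) is an assumption made for illustration; the report does not state which labelling was used.

```python
import math


def qam_symbol(bits, energy=1.0):
    """Map 2, 4, 6 or 8 bits to one square-QAM point with the given average energy."""
    b = len(bits)
    if b not in (2, 4, 6, 8):
        raise ValueError("only 2, 4, 6 or 8 bits per sub-carrier are supported")
    half = b // 2                          # bits per axis
    levels = 2 ** half                     # amplitude levels per axis

    def pam(chunk):
        index = int("".join(str(bit) for bit in chunk), 2)
        return 2 * index - (levels - 1)    # odd integers: ..., -3, -1, 1, 3, ...

    point = complex(pam(bits[:half]), pam(bits[half:]))
    mean_energy = 2 * (levels ** 2 - 1) / 3   # average energy of the unscaled grid
    return point * math.sqrt(energy / mean_energy)


qpsk_point = qam_symbol([0, 1], energy=1.0)
all_16qam = [qam_symbol([(i >> k) & 1 for k in range(4)], energy=2.0) for i in range(16)]
average_16qam_energy = sum(abs(p) ** 2 for p in all_16qam) / 16
```

Scaling by the average constellation energy means the `energy` argument can be taken directly from the bit loading algorithm for that sub-carrier.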
2.2.6 Channel and Noise Estimation
The frequency response of the channel has to be estimated in order to invert the effect of the non-selective
fading on each subcarrier. Further, given knowledge of the channel and of the noise variance, the
transmitter and receiver can determine the channel gain at each tone of the OFDM symbols, so
that the adaptive bit loading algorithm can calculate the optimal bit and energy
allocation. This algorithm is described in the next section.
Since the channel transfer function does not change very rapidly, a block-type channel estimation has been developed, in which pilot tones are inserted into all of the subcarriers of
an OFDM symbol, as shown in Figure 2.9. The noise variance is estimated from the empty pilot, i.e. a frame whose
symbols are all zero-valued.
If the channel is constant during the block, there will be little channel estimation error since
the pilots are sent on all carriers.
The pilots are inserted on all subcarriers with a specific period, and extracted after the DFT:
\[ Y_p(n) = X_p(n) H(n) + E(n), \quad n = 1, 2, \dots, N \tag{2.8} \]
where E(n) is zero-mean noise, independent both in time and in frequency due to the linear
properties of the FFT. The method we used to implement the channel estimation is based on the
forgetting factor technique:
\[ \hat{H}(n) = \frac{\alpha(n, L)}{\beta(n, L)}, \quad n = 1, \dots, 127 \tag{2.9} \]
where
\[ \alpha(n, L) = \lambda X_p^*(n, L) Y_p(n, L) + (1 - \lambda)\, \alpha(n, L-1) \tag{2.10} \]
\[ \beta(n, L) = \lambda |X_p(n, L)|^2 + (1 - \lambda)\, \beta(n, L-1) \tag{2.11} \]
Here L is the index of the pilot block and \lambda is the forgetting factor.
[Figure 2.9: Block type channel estimation. In the time-frequency grid of 127 subchannels, a pilot on every subchannel precedes each block of 20 data frames.]
The data symbols are then recovered by compensating with the channel estimate:
\[ \hat{X}(n) = \frac{Y(n)}{\hat{H}(n)} \tag{2.12} \]
The power spectrum of the white Gaussian noise added in the channel is estimated using the
empty pilot:
\[ \hat{\sigma}^2 = \frac{1}{N} \sum_{n=0}^{N-1} |Y_o(n)|^2 \tag{2.14} \]
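A forgetting-factor channel estimator and an empty-pilot noise estimate can be sketched as below. This is an assumed form, not the project code: the estimator keeps running averages of the pilot cross-product and of the pilot power, weights the newest pilot by a factor `lam`, and divides the two. The class name, `lam = 0.9`, and the test channel are all hypothetical.

```python
import numpy as np


class ChannelEstimator:
    """Per-subcarrier channel estimate with exponential forgetting (illustrative)."""

    def __init__(self, n_subcarriers, lam=0.9):
        self.lam = lam                                   # weight of the newest pilot
        self.num = np.zeros(n_subcarriers, dtype=complex)
        self.den = np.zeros(n_subcarriers)

    def update(self, pilot_tx, pilot_rx):
        """Blend in a new pilot frame (frequency domain) and return the estimate."""
        self.num = self.lam * np.conj(pilot_tx) * pilot_rx + (1 - self.lam) * self.num
        self.den = self.lam * np.abs(pilot_tx) ** 2 + (1 - self.lam) * self.den
        return self.num / self.den


def noise_power(empty_pilot_rx):
    """Average power received while the transmitter sends an all-zero frame."""
    return np.mean(np.abs(empty_pilot_rx) ** 2)


rng = np.random.default_rng(3)
n_sub = 127
true_h = rng.normal(size=n_sub) + 1j * rng.normal(size=n_sub)
pilot = rng.choice([-1.0, 1.0], n_sub).astype(complex)   # known pseudorandom pilot

est = ChannelEstimator(n_sub)
h_first = est.update(pilot, true_h * pilot)              # noise-free pilot
h_second = est.update(pilot, true_h * pilot)
sigma2 = noise_power(0.1 * np.ones(n_sub))
```

A value of `lam` close to 1 tracks a changing channel quickly, while a smaller value averages out more noise.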
2.2.7 Adaptive Bit Loading
The adaptive bit loading algorithm is an efficient technique to achieve power and rate optimization based on knowledge of the subchannel gains. A subchannel with higher SNR is
assigned more bits and energy than a subchannel with lower SNR.
There are two types of loading algorithms: those that try to maximize the data rate and those
that try to maximize performance at a given fixed data rate. The adaptive technique employed
in this system is the Rate-Adaptive (RA) loading criterion: a rate-adaptive loading procedure maximizes (or approximately maximizes) the number of bits per symbol subject to a fixed energy
constraint:
\[ \max_{E_n} \; b = \sum_{n=1}^{N} \frac{1}{2} \log_2\left(1 + \frac{E_n g_n}{\Gamma}\right) \tag{2.15} \]
subject to:
\[ N \bar{E}_x = \sum_{n=1}^{N} E_n \tag{2.16} \]
where b_n (the nth term of the sum) and E_n are the bit allocation and the energy for the nth subchannel, g_n is the
signal-to-noise ratio of the nth subchannel, \bar{E}_x is the
average energy per subchannel, and \Gamma is the gap (a parameter of the algorithm).
[Figure 2.10: Adaptive bit loading. Channel gain and the resulting bit loading (2, 4 or 6 bits) per subchannel.]
To initialize the bit allocation, the procedure is summarized as follows:
Compute the subchannel signal-to-noise ratios g_n.
Sort the subchannel SNRs from largest to smallest. Compute the number of usable
subchannels as
\[ N_{use} = N - \text{number of zero-gain subchannels} \tag{2.17} \]
Obtain the constant K and the energy in the worst subchannel from
\[ K = \frac{1}{N_{use}} \left( N \bar{E}_x + \Gamma \sum_{n=1}^{N_{use}} \frac{1}{g_n} \right) \tag{2.18} \]
\[ E_{n_{min}} = K - \frac{\Gamma}{g_{N_{use}}} \tag{2.19} \]
While E_{n_{min}} is negative, set N_{use} \leftarrow N_{use} - 1, eliminate the corresponding g_n term, and solve again.
Determine the bits and energy on the usable subchannels using
\[ E_i = K - \frac{\Gamma}{g_i}, \quad i = 1, \dots, N_{use} \tag{2.20} \]
\[ b(i) = \frac{1}{2} \log_2\left(\frac{K g_i}{\Gamma}\right), \quad i = 1, \dots, N_{use} \tag{2.21} \]
Return the values of the bit and energy allocation to the original index of the unsorted subchannels, and
assign 0 bits to zero-gain and eliminated subchannels.
Restrict b(i) to take the value 0, 2, 4, 6 or 8 (this corresponds to the available modulation orders).
Since only five different signal constellations are available in our system, we require each
subchannel to carry only 0, 2, 4, 6 or 8 bits. Other numbers of bits are not supported. To
take care of this, a restriction technique is used:
Quantize the number of bits per symbol to the nearest integer.
Set b_i = 8 if it is more than 8, and b_i = 0 if it is less than 1.
Round the value of b_i down to b_i - 1 if it is odd.
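The loading procedure and the restriction to 0, 2, 4, 6 or 8 bits can be sketched together as follows. This is a minimal illustration assuming the water-filling form E = K - gap/g for the energy and b = 0.5 * log2(K * g / gap) for the bits; the function names, the default gap of 1, and the example gains are hypothetical.

```python
import math


def restrict_bits(b):
    """Quantize to the nearest integer, clip to [0, 8], and round odd values down."""
    q = math.floor(b + 0.5)
    if q > 8:
        return 8
    if q < 1:
        return 0
    return q - 1 if q % 2 else q


def rate_adaptive_loading(gains, total_energy, gap=1.0):
    """Return (bits, energy) per subchannel for the given SNR gains g_n."""
    n = len(gains)
    # Usable subchannels, sorted from largest to smallest gain
    order = sorted((i for i in range(n) if gains[i] > 0),
                   key=lambda i: gains[i], reverse=True)
    n_use = len(order)
    k = 0.0
    while n_use > 0:
        k = (total_energy + gap * sum(1.0 / gains[i] for i in order[:n_use])) / n_use
        if k - gap / gains[order[n_use - 1]] >= 0:       # energy of the worst subchannel
            break
        n_use -= 1                                       # drop the worst and solve again
    bits = [0] * n
    energy = [0.0] * n
    for i in order[:n_use]:
        energy[i] = k - gap / gains[i]
        bits[i] = restrict_bits(0.5 * math.log2(k * gains[i] / gap))
    return bits, energy


example_gains = [100.0, 10.0, 0.0, 0.001]
bits, energy = rate_adaptive_loading(example_gains, total_energy=4.0)
```

In this example the zero-gain subchannel and the very weak one receive no bits, and the whole energy budget is shared between the two strong subchannels. The energies are not recomputed after the bit restriction; a fuller implementation might rescale them to match the quantized constellations.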
The advantage of OFDM is that each subchannel is relatively narrowband and experiences flat fading. However, a given subchannel may well have a low gain, resulting in a large BER.
A conventional system without bit loading has to settle for a low modulation order to keep the
error probability relatively low, at the cost of a low bit rate. It is therefore desirable to take advantage of the subchannels
with relatively high performance. This is the motivation for bit loading and adaptive modulation. The bit and energy allocation is a function of the channel properties and the noise PSD. Therefore,
a new adaptation must be performed each time the channel varies.
Chapter 3
System Architecture
3.1 Frame structure
Several kinds of frames are involved in this system. Besides the regular message frame, there are the
pilot frame, the empty pilot frame, the length frame and the training sequence, which are described in this
section. "Frame" here denotes an OFDM frame, as described previously, consisting of 256 OFDM
symbols and a cyclic prefix.
The following frame structure was used in the Matlab system model, but it had to be changed
when implementing the system on the DSPs in order to achieve a reliable system. The protocol of the real system is described
in the next chapter.
3.1.1 Transmission Protocol
Figure 3.1 shows the format of the OFDM preamble for transmission. The training sequence at the
beginning is used to synchronize and to estimate the channel. Since the length of the message is
required for decoding, a frame carrying the length is inserted into the pilot block
and transmitted with the most robust modulation, BPSK. Then a "zeros" frame is
transmitted in order to estimate the noise in the channel. Finally, the transmitter has to wait
until it gets the first feedback from the receiver, which is why it sends an empty frame before sending
the actual data.
[Figure 3.1: Preamble of transmission. Training sequence; length frame (15 bits); zeros frame (noise estimation); empty frame (wait for feedback); block of 20 data frames.]
and thereby lose synchronization, the modem transmits the message block based on the feedback
from the last pilot frame. The protocol is shown in Figure 3.2.
[Figure 3.2: Transmission protocol. TX sends pilot i, then pilot i+1; RX processes pilot i and receives block i using update i-1; updates i-1, i and i+1 are fed back.]
3.1.2 Training sequence
The training sequence is divided into two distinct parts. First, the transmitter sends a sine wave
followed by zero padding (without any upsampling or pulse shaping), and then it sends a pseudorandom sequence (the same as the pilot frame). At the receiver side, the correlation between the sine
wave (known at the receiver) and the content of the received buffer is computed continuously.
When the receiver detects the sine wave, it has an estimate of where the pseudorandom sequence begins. Synchronization is then performed to obtain the precise sampling time. The
synchronization method is explained in Section 3.5.
[Figure 3.3: Structure of the training sequence. Sine wave {1,-1}, zero padding (000...00), then the pilot frame: pseudorandom real values, complex mirror, IFFT, CP, pulse shaping.]
3.1.3 Length Frame
The receiver has to know where the message ends, so a length frame is transmitted to indicate
the message length. The length frame is mapped onto a BPSK constellation and then follows the same
steps as a message frame: symmetrical IFFT and cyclic prefix extension. To facilitate
reliable detection, only the first 15 bits of the length frame are used to represent the length; zero
bits are appended to fill the frame.
3.1.4 Pilots Frame
The pilot frame is used for synchronization and channel estimation, while the empty pilot frame is sent
to estimate the noise power. As described in Figure 3.3 (see Pilot frame), the pilot frame is
chosen as a random sequence, followed by symmetrical IFFT and cyclic prefix extension to form
an OFDM frame. The empty pilot frame is a zero-valued frame. Since the channel
is relatively stable, we assume that the time offset and the distortion caused by the channel are constant
during a block period. We therefore send the two types of pilot frame once per block during the data
transmission, and apply the resulting estimates to the whole block.
3.2 Convolution Encoder and Decoder
A simple convolutional encoder with code rate 1/2 is shown below. For every input
bit, the encoder outputs 2 bits, i.e. 100 % redundancy.
[Figure 3.4: A 1/2-rate convolution encoder, with one input and two outputs formed by modulo-2 adders]
[Figure 3.5: Trellis used in the Viterbi decoder, with branches labelled by the output pairs for in = 0 and in = 1]
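A rate-1/2 convolutional encoder can be sketched in a few lines. The generator polynomials used in the project are not recoverable from the text, so the sketch below assumes the common constraint-length-3 code with generators 7 and 5 (octal); treat both the polynomials and the function name as illustrative.

```python
def conv_encode(bits, g1=0b111, g2=0b101):
    """Rate-1/2 convolutional encoder with two memory elements (assumed generators)."""
    state = 0                                   # the two previous input bits
    coded = []
    for bit in bits:
        register = (bit << 2) | state           # [current, previous, one before]
        coded.append(bin(register & g1).count("1") % 2)   # output 1: modulo-2 sum
        coded.append(bin(register & g2).count("1") % 2)   # output 2: modulo-2 sum
        state = register >> 1                   # shift the register by one bit
    return coded


impulse_response = conv_encode([1, 0, 0])
codeword = conv_encode([1, 1, 0, 1])
```

Each input bit produces two output bits, so the coded stream is twice as long as the message, which matches the 100 % redundancy of a rate-1/2 code.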
3.3 Mirror Operation
As in the ADSL standards, and since the D/A converter of the DSP board can only handle real-valued
data, we perform a mirror operation before the IFFT. That is, we construct a conjugate-symmetric
signal in the frequency domain in order to get a real signal in the time domain.
[Diagram: complex mirror producing a real signal in the time domain]
Since we use 127 subcarriers, the output of the modulation block contains 127 complex
symbols. We compute their conjugates and map the IFFT input as illustrated in the figure
below.
[IFFT input, in order: null, X1, X2, ..., X127, null, X127*, X126*, ..., X3*, X2*, X1*; the IFFT output is real-valued]
Figure 3.6: Mirror operation, producing a real valued output from IFFT
Note that input 0 (DC) and input 128 are set to zero: we do not transmit
at zero frequency. A drawback of this operation is that it requires a larger IFFT, and hence more
computation. However, it does not degrade the system performance.
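The mirror operation can be checked numerically. The sketch below builds the 256-point IFFT input from 127 complex symbols, with zeros at inputs 0 and 128 and the conjugated symbols in reverse order in the upper half, and confirms that the IFFT output is real. The function name and the random test symbols are illustrative.

```python
import numpy as np


def mirror_ifft(symbols):
    """Build a conjugate-symmetric spectrum and return the real time-domain frame."""
    n = len(symbols)                               # 127 in the system described here
    spectrum = np.zeros(2 * (n + 1), dtype=complex)
    spectrum[1:n + 1] = symbols                    # X1 ... X127
    spectrum[n + 2:] = np.conj(symbols[::-1])      # X127* ... X1*
    frame = np.fft.ifft(spectrum)                  # inputs 0 and n+1 stay zero
    return frame


rng = np.random.default_rng(7)
symbols = rng.normal(size=127) + 1j * rng.normal(size=127)
frame = mirror_ifft(symbols)
is_real = np.allclose(frame.imag, 0.0)
recovered = np.fft.fft(frame.real)[1:128]          # receiver keeps the lower half only
```

At the receiver, only FFT outputs 1 to 127 need to be kept, since the upper half carries no additional information.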
3.4 Upsampling and pulse shaping
Before the digital-to-analog converter, the symbols are upsampled and pulse shaped. This is
done to permit synchronization at the receiver side. The upsampling operation consists
of inserting zeros between the symbols. In our system the upsampling rate is 4, which means
that we insert 3 zeros between consecutive symbols. The signal is then filtered by the pulse-shaping filter. In our
system, the impulse response of this filter is a root raised cosine of length 40, shown
in Figure 3.7. We first tried a rectangular pulse, but it was less efficient:
the frequency-domain properties of the root raised cosine are better suited to our system.
The frequency response of the root-raised-cosine filter is narrowband, and each symbol must be
transmitted through a narrow bandwidth.
[Figure 3.7: Impulse response of the root-raised-cosine filter]
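Upsampling by 4 followed by root-raised-cosine filtering can be sketched as below. The filter length (40 taps) and the upsampling rate come from the text; the roll-off factor `beta = 0.5` and the unit-energy normalization are assumptions, since the report does not state them.

```python
import numpy as np


def rrc_taps(num_taps=40, sps=4, beta=0.5):
    """Root-raised-cosine impulse response, sampled at sps samples per symbol."""
    t = (np.arange(num_taps) - (num_taps - 1) / 2) / sps   # time in symbol periods
    h = np.empty(num_taps)
    for i, ti in enumerate(t):
        if np.isclose(ti, 0.0):
            h[i] = 1 - beta + 4 * beta / np.pi
        elif np.isclose(abs(ti), 1 / (4 * beta)):
            h[i] = (beta / np.sqrt(2)) * ((1 + 2 / np.pi) * np.sin(np.pi / (4 * beta))
                                          + (1 - 2 / np.pi) * np.cos(np.pi / (4 * beta)))
        else:
            num = np.sin(np.pi * ti * (1 - beta)) + 4 * beta * ti * np.cos(np.pi * ti * (1 + beta))
            h[i] = num / (np.pi * ti * (1 - (4 * beta * ti) ** 2))
    return h / np.sqrt(np.sum(h ** 2))                     # normalize to unit energy


def upsample_and_shape(symbols, taps, sps=4):
    """Insert sps - 1 zeros between symbols, then filter with the pulse shape."""
    upsampled = np.zeros(len(symbols) * sps)
    upsampled[::sps] = symbols
    return upsampled, np.convolve(upsampled, taps)


taps = rrc_taps()
upsampled, shaped = upsample_and_shape(np.array([1.0, -1.0, 1.0, 1.0]), taps)
```

Using the same taps as the matched filter at the receiver gives an overall raised-cosine response, which is the usual reason for choosing the root-raised-cosine shape.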
3.5 Synchronization
[Figure: received buffer at the beginning, middle and end of transmission, showing the position of the frame]
Figure 3.8: The frames "slide" in the received buffer because of the sampling clock difference
The sampling time must therefore be estimated and compensated before removing the cyclic prefix
at the receiver, since otherwise the orthogonality between subcarriers is lost.
In wireless systems, synchronization has to be done for every frame, which is why
synchronization with the cyclic prefix is used there. However, this method is not very reliable. In our
case (a telephone line), the system is quite stable, so we can synchronize every 20 frames.
As we use a pilot frame to estimate the channel, the synchronization is done using the same
pilot.
[Plots: output of the matched filter (top) and training sequence (bottom)]
Figure 3.9: The synchronization method uses correlation between the training sequence and the
output from the matched filter
The synchronization algorithm used here is based on the complex-valued pilot frames.
During the pilot frames, the receiver knows what the transmitter is transmitting. Hence,
one possible way of recovering the symbol timing is to cross-correlate the complex-valued samples
after the matched filter with a locally generated, time-shifted replica of the pilot sequence. By
trying different time shifts in steps of T / Q, where Q is the number of samples per symbol,
the symbol timing can be found with a resolution of T / Q. Put into mathematical terms, if
mf(n) is the output of the matched filter, pilot(n) the pilot sequence of length L, and [t_start, t_end]
the search window, the timing can be found as:
\[ t_{samp} = \arg\max_{t} \sum_{i=0}^{L-1} \mathrm{pilot}(i)\, \mathrm{mf}(Q i + t), \quad t = t_{start}, \dots, t_{end} \]
[Plot: correlation versus sampling time]
Figure 3.10: The maximum of the correlation indicates where the training frame starts
The correlation properties of the training sequence are important as they affect the estimation
accuracy. Ideally, the autocorrelation function of the pilot frame should equal a delta pulse, i.e.,
zero correlation everywhere except at lag zero. Therefore, the pilot frame should be carefully
designed. However, it is simply chosen as a random sequence, since this usually works fairly well
when the length is large enough.
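The timing search can be sketched as a loop over candidate offsets that correlates the known pilot with every Q-th sample of the matched-filter output. The sketch assumes a real-valued pilot and an idealized, noise-free matched-filter output; the function name, the pilot length, and the offset used in the example are hypothetical.

```python
import numpy as np


def find_timing(mf, pilot, q, t_start, t_end):
    """Return the offset t in [t_start, t_end] that maximizes the pilot correlation."""
    idx = q * np.arange(len(pilot))            # one sample per symbol
    best_t, best_value = t_start, -np.inf
    for t in range(t_start, t_end + 1):
        value = np.dot(pilot, mf[idx + t])
        if value > best_value:
            best_t, best_value = t, value
    return best_t


rng = np.random.default_rng(5)
q = 4                                          # samples per symbol
pilot = rng.choice([-1.0, 1.0], 64)            # known pseudorandom pilot
true_offset = 13
mf = np.zeros(400)                             # idealized matched-filter output
mf[true_offset + q * np.arange(64)] = pilot

estimated_offset = find_timing(mf, pilot, q, t_start=0, t_end=40)
```

Because the pilot is pseudorandom, the correlation at wrong offsets stays well below the peak, which is the delta-like autocorrelation property discussed above.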
3.6 Frequency offset
Due to a slight difference between the clock frequencies, the system experiences a frequency
offset. This makes the angles of the constellation points change linearly over
time. If the variation is too fast, the demodulation becomes unreliable. Hence, for large files, we
have to compensate for the frequency offset. This is done using a forgetting factor close to 1.
When estimating the channel every 20 frames, we weight the current estimate more heavily than
the previous ones. In this way, the angle due to the frequency offset is re-estimated every 20 frames and
compensated throughout the transmission.
[Diagram: over time, channel estimation with a high forgetting factor compensates the rotation]
Figure 3.11: Channel estimation compensates the rotation of the constellation due to frequency
offset.
Chapter 4
DSP implementation
This section follows up from the previous and explains how the algorithms simulated in Matlab
were implemented on the DSP board. A brief overview of the DSP board used is presented first.
In the following sections we explain the implementation of the transmitter and receiver.
4.1 Overview of the Board
In this course we worked with the TI C6713 DSK (DSP Starter Kit). This board is equipped
with a 225 MHz TMS320C6713 floating-point DSP and has 265 Kbytes of internal memory. On-board
peripherals include two multi-channel buffered serial ports (McBSPs) and an enhanced
DMA controller (EDMA). It also has 16 Mbytes of on-board SDRAM.
With its significant amount of internal memory we managed to load our whole program
within it, without using any of the SDRAM. The SDRAM was used exclusively for the data to
be transmitted.
For signal transmission and audio processing, the DSP board is equipped with an on-board
codec called the AIC23. Codec stands for coder/decoder: the job of the AIC23 is to code analog
input samples into a digital format for the DSP to process, and then decode data coming out of
the DSP to generate the processed analog output. Digital data is sent to and from the codec on
McBSP1. We use both channels of the codec, for the transmitting part and the feedback part. The
signal is sampled as 16-bit elements at a rate of 96 kHz. To speed up the copying of data
between the CPU and the audio codec, the EDMA (Enhanced Direct Memory Access) controller
copies one frame between the codec and memory and then interrupts the DSP.
The DSK has 4 light emitting diodes (LEDs) and 4 DIP switches that allow users to interact
with programs through simple LED displays and user input on the switches.
The 6713 DSK includes a special device called a JTAG emulator on-board that can directly
access the register and memory state of the 6713 chip through a standardized JTAG interface
port. When a user wants to monitor the progress of his program, Code Composer (the development environment) sends commands to the emulator through its USB host interface to check on
any data the user is interested in. We also use this interface together with the GUI to download
the file from the host computer to the DSP board for transmission.
The DSP implementation closely followed the model implemented in Matlab. However, we had to
make sure that the system complied with the real-time constraints, and to verify
that the memory constraints were met. The program was written in C, and care had to
be taken to define the memory allocation precisely. Fortunately, the Code Composer Studio compiler performs
a lot of code optimization, so in our case we did not have to work too hard on optimizing the
code manually. We took steps to avoid calling functions within loops, as well as vector operations.
4.2 Data Transfer
Audio signals are transferred back and forth from the codec via McBSP1, a bi-directional serial
port. The EDMA is configured to take every 16-bit signed audio sample arriving on McBSP1
and store it in a buffer in memory to be processed by the DSP. At the same time the EDMA is
used to transfer data from memory to McBSP1 to be converted and transmitted.
The codec is configured and controlled via McBSP0, a second serial port. The commands
are used to configure parameters on the codec (sample rate, gain, audio path).
4.3
Using a single buffer for receiving and transmitting data can be tricky and timing-dependent
because new data constantly overwrites the data currently being transmitted. Ping-pong buffering
is a technique where two buffers are used for data transfer instead of only one. The EDMA is
configured in our case to fill the ping buffer first, and then the pong buffer. While the pong
buffer is being filled, the ping buffer can be processed with the knowledge that the current
transfer won't overwrite it. In our system we use ping and pong buffers for both transmitting
and receiving.
[Figure: ping-pong buffering. While the codec sends or copies data through one buffer (ping), the DSP processes the data in the other (pong); the roles are exchanged at the next interrupt.]
but only to signal to the processor that it can process the data. Hence, the only time constraint
is that audio data must be processed before the next buffer is filled.
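The buffer swap described in this section can be sketched as follows. This is a minimal simulation in Python, not the C code running on the DSP: the class name `PingPong` and the method `transfer_complete` are hypothetical, and the EDMA interrupt is modelled as an ordinary function call.

```python
class PingPong:
    """Two buffers: one is filled by the (simulated) EDMA, the other is processed."""

    def __init__(self, size):
        self.buffers = [[0] * size, [0] * size]  # index 0 = ping, 1 = pong
        self.filling = 0  # buffer currently owned by the simulated EDMA

    def transfer_complete(self, samples):
        """Simulate the EDMA interrupt: store a full block, then swap roles.

        Returns the buffer that is now safe to process.
        """
        self.buffers[self.filling][:] = samples
        ready = self.buffers[self.filling]
        self.filling ^= 1  # the EDMA moves on to the other buffer
        return ready


pp = PingPong(4)
first = pp.transfer_complete([1, 2, 3, 4])   # ping is full, pong is filled next
second = pp.transfer_complete([5, 6, 7, 8])  # pong is full, ping is filled next
```

The block returned by one call stays untouched during the following transfer, which is the property that removes the timing dependence of a single buffer.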
4.4 Implementation issues
Synchronization
The start index of a frame can be determined by calculating the cross-correlation of the
received signal and the known sinusoid. The receiver calculates the correlation with the received signal,
shifting in time, and keeps the maximum value and the corresponding index. If the maximum
value is higher than a predefined threshold, the sine is considered to be found. If no sine is
found, the frame is considered empty and noise estimation is performed. The first training
sequence starts at the sample right after the sine.
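A minimal sketch of this correlation search, written in Python for readability. The function name `find_sine`, the sinusoid length and period, and the threshold value are assumptions chosen for illustration; the real system uses its own parameters.

```python
import math


def find_sine(received, reference, threshold):
    """Slide the known sinusoid over the received samples.

    Returns the index of the first sample after the sinusoid, or None
    when the correlation peak does not exceed the threshold.
    """
    best_value, best_index = float("-inf"), None
    for start in range(len(received) - len(reference) + 1):
        corr = sum(r * s for r, s in zip(received[start:], reference))
        if corr > best_value:
            best_value, best_index = corr, start
    if best_index is None or best_value <= threshold:
        return None  # frame considered empty: usable for noise estimation
    return best_index + len(reference)


# Hypothetical reference: 32 samples of a sinusoid with a period of 8 samples.
reference = [math.sin(2 * math.pi * n / 8) for n in range(32)]
received = [0.0] * 50 + reference + [0.0] * 20
frame_start = find_sine(received, reference, threshold=8.0)
empty = find_sine([0.0] * 100, reference, threshold=8.0)
```

Here `frame_start` points at the sample right after the sinusoid, where the first training sequence would begin, and `empty` is `None` because no peak is found.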
Assembling frame
On the receiver side, the frame can start at any point in the buffer since it is not possible to
make the cards work completely synchronously. To solve this, the received part of the frame is
stored in a temporary buffer. When the next buffer is filled, the frame can be assembled and
processed. A semaphore is set to record that there is still unprocessed data in the receive
buffer.
[Figure: frame assembly. The first part of the frame, taken from one received buffer, and the second part, taken from the next received buffer, are joined in the temporary buffer.]
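The assembly step can be illustrated with a short Python sketch. It assumes, for illustration only, that a frame has the same length as a receive buffer; `assemble_frame` and the variable `pending` (which plays the role of the semaphore) are hypothetical names.

```python
def assemble_frame(temporary, next_buffer, frame_len):
    """Complete a frame whose first part was saved in the temporary buffer."""
    missing = frame_len - len(temporary)       # samples still to come
    frame = temporary + next_buffer[:missing]  # first part + second part
    leftover = next_buffer[missing:]           # unprocessed data stays behind
    return frame, leftover


buffer_a = list(range(0, 8))    # first receive buffer
buffer_b = list(range(8, 16))   # next receive buffer
start = 5                       # frame start found by the synchronization
temporary = buffer_a[start:]    # first part of the frame
frame, leftover = assemble_frame(temporary, buffer_b, frame_len=8)
pending = len(leftover) > 0     # "there is still unprocessed data"
```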
Memory considerations
For the most part, the computational load of this implementation is very low. The memory usage on the board is more limiting. Even though the physical amount of memory is sufficient,
there were problems with allocating memory inside functions. Only very small arrays could
be declared inside functions, and dynamic memory allocation did not work either. The solution
was to declare large buffers globally at the beginning and then reuse them. The modulation and demodulation functions in particular caused problems. The Matlab function for modulation used
complete predefined tables to encode the signal, but thanks to the Gray coding of the modulation
the complexity could be reduced significantly.
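One way a Gray-coded mapping can be computed without a lookup table is sketched below in Python. This is an illustration of the general idea, not necessarily the method used on the DSP: it handles one dimension (PAM) of a square QAM constellation, and the function names and the odd-integer amplitude levels are assumptions.

```python
def gray_to_index(gray):
    """Convert a Gray label to its position in the ordered constellation."""
    index = 0
    while gray:
        index ^= gray
        gray >>= 1
    return index


def pam_modulate(label, bits):
    """Map a Gray label of `bits` bits to one of 2**bits odd-integer levels."""
    return 2 * gray_to_index(label) - (2 ** bits - 1)


def pam_demodulate(level, bits):
    """Pick the nearest level and return its Gray label."""
    index = round((level + 2 ** bits - 1) / 2)
    index = min(max(index, 0), 2 ** bits - 1)  # clip to the outer points
    return index ^ (index >> 1)


# Labels of the 8 levels of an 8-PAM dimension, from lowest to highest level.
labels_in_order = [i ^ (i >> 1) for i in range(8)]
```

Neighbouring levels differ in exactly one bit, so the mapping only needs a few shifts and XOR operations instead of a table with one entry per constellation point.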
Feedback
The feedback is a basic communication system. The bit allocation is encoded as 4 bits per
sub-carrier and then modulated with QPSK. Each symbol is repeated twice. The transmission
is performed through a perfect channel using a coaxial cable.
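The feedback encoding can be sketched as follows in Python. The 4 bits per sub-carrier, the QPSK modulation and the twofold repetition come from the description above; the particular QPSK labelling and the way the receiver combines the two copies (by adding them) are assumptions made for this example.

```python
# Hypothetical Gray labelling: bit 0 selects the sign of the real part,
# bit 1 selects the sign of the imaginary part.
QPSK = {0: 1 + 1j, 1: -1 + 1j, 2: 1 - 1j, 3: -1 - 1j}


def encode_feedback(allocation):
    """4 bits per sub-carrier -> two QPSK symbols, each sent twice."""
    symbols = []
    for bits in allocation:
        for pair in (bits >> 2, bits & 0b11):
            symbols += [QPSK[pair]] * 2
    return symbols


def decode_feedback(symbols):
    """Add the two copies of each symbol, then decide on the signs."""
    allocation = []
    for i in range(0, len(symbols), 4):
        value = 0
        for k in (i, i + 2):
            s = symbols[k] + symbols[k + 1]
            pair = (0 if s.real >= 0 else 1) | (0 if s.imag >= 0 else 2)
            value = (value << 2) | pair
        allocation.append(value)
    return allocation


allocation = [0, 2, 4, 6, 6, 4, 15]
recovered = decode_feedback(encode_feedback(allocation))
```

With 127 sub-carriers this gives 127 x 2 x 2 = 508 QPSK symbols per feedback message.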
4.5
[Figure: transmitter flowchart. Initialization of buffers, codec and EDMA; synchronization frame; 10 training frames; wait for feedback (branches: if no feedback, if feedback); get the bit allocation; send 1 training frame; send 20 data frames; end.]
Based on the bit allocation, the right amount of data is extracted from the sequence to be
transmitted. Next it is modulated and the IFFT is performed. Then it is pulse shaped
and upsampled. It is then sent to the codec to be transmitted.
The previous step is repeated for 20 data frames, after which a training frame is sent.
After all the frames are sent an empty frame is transmitted and the program exits.
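The per-frame processing in the transmitter can be sketched in Python as below, using the model parameters listed in Chapter 6 (127 subchannels, 256-point IFFT, cyclic prefix of 32, upsampling by 4 with a rectangular pulse). Two points are assumptions of this sketch: the spectrum is made Hermitian-symmetric so that the time signal is real, and a plain inverse DFT written with the standard library stands in for the IFFT.

```python
import cmath

N_FFT, N_SUB, N_CP, UPSAMPLE = 256, 127, 32, 4


def idft(spectrum):
    """Plain inverse DFT (slow, but dependency-free)."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def ofdm_symbol(subcarrier_symbols):
    """Build one upsampled OFDM symbol from 127 constellation points."""
    spectrum = [0j] * N_FFT
    for k, value in enumerate(subcarrier_symbols, start=1):
        spectrum[k] = value
        spectrum[N_FFT - k] = value.conjugate()  # mirror -> real time signal
    time = [x.real for x in idft(spectrum)]
    with_prefix = time[-N_CP:] + time            # cyclic prefix
    # Rectangular pulse shaping: repeat every sample UPSAMPLE times.
    return [x for x in with_prefix for _ in range(UPSAMPLE)]


tx = ofdm_symbol([1 + 1j] * N_SUB)
```

Each call produces (256 + 32) x 4 = 1152 samples for the codec.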
4.6
[Figure: receiver flowchart. Initialization of buffers, codec and EDMA; receive synchronization frame and get the time index; estimate the channel with 10 pilots; receive 20 data frames; end.]
An empty frame is used to estimate the noise variance. Then the receiver performs correlation
detection on the sinusoid.
The received sinusoid is used to determine a rough index of the start of the frame.
10 training frames are used to estimate the channel, and a weighted average of past and
present estimates is used. They are also used to determine the correct index of the start of
the frame.
With the channel estimate the bit allocation is computed and sent to the transmitter,
along with a sinusoid for synchronization.
The receiver waits for the frame to be received and the bit allocation to be detected. Subsequently,
20 data frames are received.
After every 20 data frames, a training frame is received to correct for the offset
between the sampling clocks of the two DSPs. It is also used to update the channel
estimate.
After all the frames are received, the program exits.
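The channel estimation and its update from later training frames can be sketched in Python as follows. The per-subcarrier division by the known training symbol and the single weighting factor `weight` are assumptions for illustration; the text above only states that a weighted average of past and present estimates is used.

```python
def estimate_channel(received, pilot):
    """One-tap estimate per subcarrier: received symbol / known symbol."""
    return [r / p for r, p in zip(received, pilot)]


def update_estimate(previous, current, weight=0.5):
    """Weighted average of the past estimate and the present one.

    `weight` is a hypothetical tuning parameter (share of the new estimate).
    """
    if previous is None:
        return current
    return [weight * c + (1 - weight) * p for p, c in zip(previous, current)]


pilot = [1 + 0j, -1 + 0j, 1j]
channel = [0.5 + 0j, 0.5j, 2 + 0j]
received = [h * p for h, p in zip(channel, pilot)]  # noiseless example
estimate = update_estimate(None, estimate_channel(received, pilot))
```

Calling `update_estimate` again after each later training frame refines `estimate` without discarding the earlier measurements.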
4.7
To achieve a functional system, the original design had to be adapted, and some
simplifications were made to obtain a reliable modem. The differences between the
model and the real system are:
The convolutional coder and the interleaver have not been integrated. The functions were
too slow and did not meet the real-time deadlines.
The bit allocation is constant over the transmission. We only compute the bit allocation
at the beginning, send the result back to the transmitter and keep it for the whole
transmission. However, we still estimate the channel every 20 frames.
[Figure: timing of the transmission. TX labels: pilot, send block 1 of 20 frames, wait, send block 2 of 20 frames, pilot, bit allocation. RX labels: estimate the channel and compute the bit allocation, wait, receive block 1, channel est., receive block 2.]
Chapter 5
5.1
Real Time Data Exchange (RTDX) is a standard component of the TI DSP, which permits
users to transfer data between a Host and a target DSP without interfering with their applications. Since the DSP target can only handle integers, the picture to be sent is written to a
memory buffer as an array of integers within the transmitter RTDX Host Library. When the
RTDX Host Library receives a request for data from the target application, the data in the host
buffer is sent to the requested location on the target via the JTAG interface. The host notifies
the RTDX Target Library when the operation is complete. In a similar way, the receiver Host
records the data sent by the target into a memory buffer, and retrieves the picture from the received
integers. The Host can thus read or write the whole file directly from or to the memory of the
DSP board. Specific RTDX calls are inserted into the Host and Target applications to make use of
the RTDX Host Library and the RTDX Target Library.
5.2 Transmitter GUI
Once the user starts the Transmitter GUI program transmitter.exe, CCS is launched and
the DSP program "dsk_app.out" is loaded. When a picture is chosen, it is displayed in the
display area and written to a memory buffer as an array of integers. The user clicks the send
button to start the DSP program and the data transfer from Host to DSP. As shown in Figure
5.1, while the whole array of integers is transferred to the target DSP, the text-area window
becomes visible and displays "The DSP is processing!" to let the user know the status.
5.3 Receiver GUI
Similarly, the corresponding DSP program at the receiver is loaded once the user starts the
Receiver GUI program receiver.exe. When the user clicks the receive button, the DSP program is
launched to start receiving, and the data is transferred from DSP to Host once receiving is complete
on the DSP. It is important to make sure that the receiver is started before the transmission so that the
receiver is able to detect the beginning of the data. The Receiver GUI, see Figure 5.2, displays
the received image, the SNR and the bit allocation when the transmission is finished. There
is also a text-area window that displays "Data processing complete!" to indicate the status. The BER
is then calculated by comparing the received file with a copy of the original file.
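The file comparison amounts to counting differing bits. A minimal Python sketch, assuming both files are available as byte strings of equal length (the function name `bit_error_rate` is hypothetical):

```python
def bit_error_rate(sent: bytes, received: bytes) -> float:
    """Fraction of bits that differ between two equally long byte strings."""
    if len(sent) != len(received) or not sent:
        raise ValueError("inputs must be non-empty and of equal length")
    errors = sum(bin(a ^ b).count("1") for a, b in zip(sent, received))
    return errors / (8 * len(sent))


ber = bit_error_rate(b"\x00\xff", b"\x01\xff")  # one wrong bit out of 16
```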
Chapter 6
Simulation results
In this chapter, the performance of the system model is presented. We recall the parameters of
the Matlab model:
number of subchannels : 127
size of the IFFT/FFT : 256
length of the cyclic prefix : 32 symbols
upsampling factor : 4 , pulse shaping : rectangular
10 pilots sent at the beginning to estimate the channel
1 pilot sent every 20 frames to improve the estimation
We first show the performance of the channel estimation, then the bit loading results are
displayed and finally, the bit error probability as a function of the SNR is plotted. Here the
SNR corresponds to the symbol energy divided by the noise power.
$$\mathrm{SNR} = \frac{E_b}{\sigma_N^2}$$
6.1 Channel estimation
In the simulation, the channel is a low pass filter with additive white Gaussian noise. It is
illustrated in Figure 6.2. We provide a plot of the channel estimate in a noisy channel
(SNR = 6 dB) and in a noiseless channel. Note that the channel estimation is sensitive to
noise. In the simulation, the message is not very large, so little averaging takes place.
But since the files used in the real system are very large, and since the estimate
is improved every 20 frames, the channel estimation is efficient and gives good results.
6.2 Bit loading
The bit allocation depends on the SNR computed on each sub-channel. We provide the plot of
the bit allocation for two values of the SNR.
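As an illustration of how a per-subchannel SNR can be turned into a number of bits, the sketch below uses the standard gap approximation, b = floor(log2(1 + SNR / gap)). This is a swapped-in textbook rule, not necessarily the algorithm of Section 2.2.7; the gap of 9.8 dB and the cap of 6 bits (64-QAM) are assumed values.

```python
import math


def bit_allocation(snr_db, gap_db=9.8, max_bits=6):
    """Bits per subchannel from its SNR in dB (gap approximation).

    `gap_db` and `max_bits` are hypothetical design parameters.
    """
    bits = []
    for value in snr_db:
        snr = 10 ** ((value - gap_db) / 10)  # SNR reduced by the gap, linear
        bits.append(min(max_bits, int(math.log2(1 + snr))))
    return bits


allocation = bit_allocation([30.0, 20.0, 6.0])
```

Subchannels with a high SNR are capped at the largest constellation, while subchannels whose SNR is below the gap carry no bits at all.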
[Figure: channel model, a low pass filter with additive noise n(k).]
[Plot: two panels titled "Channel estimation", one for the noise free channel and one for the noisy channel (SNR = 6 dB).]
Figure 6.2: Channel estimation in two different cases (noise free and noisy channel)
[Plot: three panels, "Channel estimation", "Bit allocation" for SNR = 30 dB and "Bit allocation" for SNR = 6 dB.]
Figure 6.3: Bit allocation in two different cases (SNR=30dB, SNR =6dB)
6.3
In order to plot the bit error probability (BER) versus the SNR, we ran a Monte Carlo simulation. We
randomly generated the input to the simulation (a bit stream) and then transmitted this signal
through the system. At the very end, we computed the BER. We repeated this operation several
times for several values of the SNR to obtain statistics of the system. Our results are
provided in Figure 6.4. For a low SNR, errors occur very often. Indeed, the synchronization is
Chapter 7
7.1.2
7.1.3 Channel compensation
One problem when implementing the system was the frequency offset. As explained in Section
3.6, the angles of points in the constellation change linearly over time and we have to compensate
for this rotation. We provide the plot of a frame constellation before and after compensation.
The channel was flat, and the modulation scheme was 64-QAM in each subchannel.
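The idea of undoing a constellation rotation can be sketched in Python as below. It is a simplified illustration that assumes the rotation is the same for all points of a frame and is measured against known training symbols; the actual compensation scheme is the one described in Section 3.6, and the function names are hypothetical.

```python
import cmath


def phase_offset(received, known):
    """Common rotation angle between received and known symbols (radians)."""
    return cmath.phase(sum(r * k.conjugate() for r, k in zip(received, known)))


def derotate(symbols, angle):
    """Rotate the constellation back by the estimated angle."""
    return [s * cmath.exp(-1j * angle) for s in symbols]


known = [1 + 1j, -1 + 1j, -1 - 1j, 1 - 1j]
rotated = [k * cmath.exp(0.3j) for k in known]  # constellation turned by 0.3 rad
angle = phase_offset(rotated, known)
compensated = derotate(rotated, angle)
```

Because the angle grows linearly over time, the estimate has to be refreshed regularly, which is the role of the periodic training frames.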
7.2 Final results
We provide a picture of what we obtain at the receiver. The channel was noisy and
frequency selective. The bit loading algorithm is efficient and the bit error rate is low.
Chapter 8
Future work
This chapter contains a list of suggested future work.
8.1
The maximum bit rate of our system is 64 kbit/s, which is low for an OFDM system.
This is due to some of our choices in the system design. To increase this bit rate, we
could have changed some parameters: reduced the upsampling factor, used a
higher-order modulation scheme, or reduced the size of the cyclic prefix.
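The quoted figure can be checked roughly from the system parameters (96 kHz sampling, 256-point IFFT, cyclic prefix of 32, upsampling by 4, 127 subchannels). The Python sketch below assumes 64-QAM (6 bits) on every subchannel and ignores training frames; under these assumptions it gives about 63.5 kbit/s.

```python
SAMPLE_RATE = 96_000   # Hz, codec sampling rate
N_FFT = 256
N_CP = 32
N_SUB = 127
UPSAMPLE = 4
MAX_BITS = 6           # assumption: 64-QAM on every subchannel


def peak_bit_rate(upsample=UPSAMPLE, n_cp=N_CP, max_bits=MAX_BITS):
    """Peak bit rate in bit/s, ignoring training frames."""
    samples_per_symbol = (N_FFT + n_cp) * upsample
    symbols_per_second = SAMPLE_RATE / samples_per_symbol
    return N_SUB * max_bits * symbols_per_second


current = peak_bit_rate()
halved_upsampling = peak_bit_rate(upsample=2)
```

Halving the upsampling factor doubles the rate, which shows how strongly the parameters listed above limit the throughput.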
8.2
One major difficulty with OFDM is its large peak-to-average ratio (PAR). This means that
the OFDM signal has a large difference between the average signal power and the maximum
signal power. The large dynamic range of the OFDM signal can lead to problems when
converting from digital to analog. A D/A converter has both linear and non-linear regions,
where the non-linear regions occur for large output powers (i.e., near saturation). To reduce the
amount of distortion, the symbols need to stay in the linear region as much as possible. We
therefore have to lower the output power, which leads to inefficiency.
Methods to reduce the PAR can be implemented in an OFDM system. They impose
constraints on the modulation sequences and seem quite complex.
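To make the PAR concrete, the Python sketch below measures it for a sum of 127 equal-amplitude cosines, once with all phases aligned (the worst case) and once with random phases. The signal model and the function names are assumptions for illustration; it is a measurement of the problem, not a PAR reduction method.

```python
import math
import random


def multicarrier(phases, n=256):
    """Sum of unit-amplitude cosines on carriers 1..len(phases)."""
    return [
        sum(math.cos(2 * math.pi * k * t / n + p) for k, p in enumerate(phases, start=1))
        for t in range(n)
    ]


def par_db(samples):
    """Peak power divided by average power, in dB."""
    powers = [x * x for x in samples]
    return 10 * math.log10(max(powers) / (sum(powers) / len(powers)))


aligned = par_db(multicarrier([0.0] * 127))
random.seed(1)
scrambled = par_db(multicarrier([random.uniform(0, 2 * math.pi) for _ in range(127)]))
```

With aligned phases all carriers add up at t = 0, giving a PAR of 10 log10(254), about 24 dB; with random phases the peak is lower, but it is still well above the average power.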
8.3
Some blocks of the system can be improved. Algorithms such as synchronization, bit loading
and channel estimation can be improved and optimized to obtain better performance and a
faster system. Also, an algorithm to track the frequency offset could have been implemented.
Conclusion
The key building blocks of an OFDM modem conforming to the course requirements have been
designed and implemented. The functionality of the modem was verified at the simulation level,
and the system performance was measured under real operating conditions. We found that the
OFDM modem leads to better BER performance than conventional systems. Further, the OFDM
system with the adaptive algorithm outperforms OFDM systems with fixed modulation, giving
a higher bit rate. In conclusion, OFDM is a very promising technology, and a practical adaptive
rate algorithm serves well to improve performance.
Bibliography
[1] Advanced Digital Communication, John M. Cioffi, course reader in EE379C, Stanford University, 2000.
[2] High Performance OFDM Modem Robust Against Channel Imperfections, Shen et al.,
Project report, Royal Institute of Technology, 2002.
[3] Full Duplex OFDM Modem over a Frequency Selective Channel, Gustavsson et al., Project
report, Royal Institute of Technology, 2001.
[4] DSP baserat OFDM system över akustisk kanal, Bellander et al., Project report, Luleå
University of Technology, 1997.
[5] An Introduction to Orthogonal Frequency-Division Multiplexing, Edfors et al., Tech. report,
Luleå University of Technology, 1996.
[6] Estimation of Synchronization Parameters, Jan-Jaap van de Beek, Licentiate thesis, Luleå
University of Technology, 1996.
[7] Lecture notes from graduate course in OFDM, Katie Wilson, Royal Institute of Technology,
2003.
[8] J. G. Proakis and D. G. Manolakis, Digital Signal Processing, Prentice Hall, 1996.