
IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 64, NO. 6, MARCH 15, 2016, p. 1507

Low-Complexity Modem Design for GFDM


Arman Farhang, Member, IEEE, Nicola Marchetti, Senior Member, IEEE, and Linda E. Doyle, Senior Member, IEEE

Abstract: Due to its attractive properties, generalized frequency division multiplexing (GFDM) has recently been discussed as a candidate waveform for the fifth generation of wireless communication systems (5G). GFDM was introduced as a generalized form of the widely used orthogonal frequency division multiplexing (OFDM) modulation scheme and, since it uses only one cyclic prefix (CP) for a group of symbols rather than a CP per symbol, it is more bandwidth efficient than OFDM. In this paper, we propose novel modem structures for GFDM by taking advantage of the particular structure of the modulation matrix. Our proposed transmitter is based on sparsification of the modulation matrix through application of the fast Fourier transform (FFT) to reduce the implementation complexity. A unified demodulator structure for matched filter (MF), zero forcing (ZF), and minimum mean square error (MMSE) receivers is also derived. The proposed demodulation techniques harness the special block circulant property of the matrices involved in the demodulation stage to reduce the computational cost of the system implementation. We derive closed forms for the ZF and MMSE receiver filters. Additionally, our algorithms do not incur any performance loss, as they maintain the optimal performance. The computational costs of our proposed techniques are analyzed in detail and compared with the existing solutions that are known to have the lowest complexity. It is shown that a substantial reduction in computational complexity can be achieved through application of our structures.
Index Terms: 5G, GFDM, OFDM, FBMC, modem, MF, ZF, MMSE.

I. INTRODUCTION

OFDM has been the technology of choice in wired and wireless systems for years, [1]-[3]. The advent of the fifth generation of wireless communication systems (5G) and the associated focus on a wide range of applications, from those involving bursty machine-to-machine (M2M) like traffic to media-rich high bandwidth applications, has led to a requirement for new signaling techniques with better time and frequency containment than that of OFDM. Hence, a plethora of waveforms are coming under the microscope for analysis and investigation.
Manuscript received December 02, 2014; revised July 10, 2015, August 17, 2015, and September 30, 2015; accepted November 05, 2015. Date of publication November 20, 2015; date of current version February 11, 2016. The associate editor coordinating the review of this manuscript and approving it for publication was Dr. Ashish Pandharipande. This paper was supported by the Science Foundation Ireland (SFI) by Grant Number 13/RC/2077. CONNECT is funded under the SFI Research Centres Programme and is co-funded under the European Regional Development Fund.
The authors are with the CTVR/CONNECT, The Telecommunications Research Centre, Trinity College Dublin, Ireland, Dublin 2 (e-mail: farhanga@tcd.ie; marchetn@tcd.ie; ledoyle@tcd.ie).
Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TSP.2015.2502546

The limitations of OFDM are well documented. OFDM suffers from large out-of-band emissions, which not only have interference implications but can also reduce the potential for exploiting non-contiguous spectrum chunks through such techniques as carrier aggregation. For future high bandwidth applications this can be a major drawback. OFDM also has high sensitivity to synchronization errors, especially carrier frequency offset (CFO). As a case in point, in multiuser uplink scenarios where OFDMA is utilized, stringent synchronization is required in order to avoid the large amount of interference caused by multiple CFOs as well as timing offsets, which in turn imposes a great amount of overhead on the network. This overhead is not acceptable for lightweight M2M applications, for example. The presence of multiple Doppler shifts and propagation delays in the received uplink signal at the base station (BS) results in some residual synchronization errors and hence multiuser interference (MUI), [4]. The MUI problem can be tackled with a range of different solutions that are proposed in [5]-[7]. However, these lead to an increased receiver computational complexity. Thus, one of the main advantages of OFDM, i.e., its low complexity, is lost. The challenge therefore is to provide waveforms with more relaxed synchronization requirements and more localized signals in time and frequency to suit future 5G applications, without the penalty of a more complex transceiver.
There are many suggestions on the table as candidate waveforms [8]-[12]. In general, all of these signaling methods can be considered as filter bank multicarrier (FBMC) systems. They can be broadly broken into two categories, those with linear pulse shaping [11], [12] and those with circular pulse shaping, [8]-[10]. The former signals with linear pulse shaping have attractive spectral properties, [13]. In addition, these systems are resilient to timing as well as frequency errors. However, the ramp-up and ramp-down of their signal, which are due to the transient interval of the prototype filter, result in additional latency. In contrast, FBMC systems with circular pulse shaping remove the prototype filter transients thanks to their so called tail biting property, [8]. The waveform of interest in this paper is known as generalized frequency division multiplexing (GFDM) and it can be categorized as an FBMC system with circular pulse shaping. The focus of the paper, more specifically, is on the design of a low complexity modem structure for GFDM.
GFDM has attractive properties and as a result has recently received a great deal of attention. One of the main attractions of GFDM is that it is a generalized form of OFDM which preserves most of the advantageous properties of OFDM while addressing its limitations. As Datta and Fettweis have pointed out in [14], GFDM can provide a very low out-of-band radiation, which removes the limitations of OFDM for carrier aggregation. It is also more bandwidth efficient than OFDM since it uses only one cyclic prefix (CP) for a group of symbols in its

1053-587X 2015 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.
See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.


block rather than a CP per symbol as is the case in OFDM. Through circular filtering, GFDM removes the prototype filter transient intervals and hence the latency. Additionally, its special block structure makes it an attractive choice for low latency applications like IoT and M2M, [15]. Filtering the subcarriers using a well-designed prototype filter limits the intercarrier interference (ICI) to adjacent subcarriers only, which reduces the amount of leakage between subcarriers and increases the resiliency of the system to CFO as well as narrow band interference. In other words, GFDM is robust to synchronization errors. As Michailow et al. report in [15], GFDM is also a good match for multiple input multiple output (MIMO) systems.
The advantages of GFDM come at the expense of an increased bit error rate (BER) compared with OFDM. This degradation is due to the fact that GFDM is a non-orthogonal waveform. Consequently, non-orthogonality of the neighboring subcarriers and time slots results in self-interference. To tackle this self-interference, matched filter (MF), zero forcing (ZF) or minimum mean square error (MMSE) receivers can be derived [16]. Since the MF receiver cannot completely remove the ICI, the ZF receiver can be utilized. However, due to its noise enhancement problem, the ZF receiver incurs some BER performance loss. Thus, the MMSE approach can be chosen to reduce the noise enhancement effect and maximize the signal-to-interference plus noise ratio (SINR). As MF, ZF and MMSE receivers involve large matrix inversion and multiplication operations, they demand a large computational complexity that makes them inefficient for practical implementations. As an alternative solution, Datta et al., [17], take a time domain successive interference cancellation approach. This solution can completely remove the effect of the self-interference. However, it is a computationally exhaustive procedure. In a more recent work from the same research group, Gaspar et al., [18], take advantage of the sparsity of the pulse shaping filter in the frequency domain to perform the interference cancellation in the frequency domain and hence further reduce the computational complexity of the receiver. Even though the successive interference cancellation solutions of [17] and [18] can remove the self-interference, they can incur error propagation problems. Recently, Matthé et al., [19], have proposed a fast algorithm to calculate the ZF and MMSE receiver filters. Their approach is based on the Gabor transform structure of GFDM. Although matrix inversion is circumvented, multiplication of the ZF and MMSE matrices by the received signal is a bottleneck in this approach, as the matrix to vector multiplication is a computationally expensive operation. To reduce the implementation complexity of the ZF and MMSE demodulators, after efficient calculation of the filter coefficients using the technique in [19], the structure that is proposed in [15] can be utilized.
In this paper, we design a low complexity modem structure for GFDM and therefore improve on the existing approaches. The special structure of the modulation matrix is utilized to reduce the complexity of the transmitter. Compared with the existing GFDM transmitter [20], so far known to have the lowest complexity, our proposed transmitter structure is more computationally efficient. Based on the lessons that we learned from ICI cancellation in uplink OFDMA systems with interleaved subcarrier allocation, [6], we are able to substantially reduce the complexity of the ZF and MMSE receivers compared with the low complexity receiver structure that is proposed in [18]. To be more specific, we take advantage of block circulant matrices and some mathematical concepts discussed in [6] for the purpose of complexity reduction. We propose a unified structure for the MF, ZF and MMSE receivers. This unified receiver structure is beneficial as only the filter coefficients need to be changed for implementation of different receivers. These coefficients can be saved in memory and used as needed in different scenarios. For instance, the ZF receiver can be used instead of the MMSE one at high signal-to-noise ratios (SNRs). As our techniques are direct and no approximation is involved, our proposed receivers do not incur any performance loss compared with the optimal MF, ZF and MMSE receivers. Another advantage of our receiver structure with respect to interference cancellation receivers is that it is not iterative and hence the computations can run in parallel, which can in turn reduce the overall processing delay of the system. As our proposed modem structure is based on sparsification of the matrices that are involved, it also provides savings in the memory requirements of the system. It is worth mentioning that there are some similarities between our approach and the Zak transform, which is used in [19] to derive the ZF and MMSE filter coefficients, not their implementation. These similarities are in the utilization of the block Fourier transform matrices in the calculation of the Zak transform. However, in [19] no structure is proposed for implementation of the ZF and MMSE filters at the receiver. In contrast, the approach that we take in this paper is to make the matrices involved sparse by using block Fourier transform matrices and hence reduce the complexity of both the GFDM modulator and demodulator.
The rest of the paper is organized as follows.
Section II presents the GFDM system model. Sections III and
IV include the design and implementation of our proposed
GFDM transmitter and receiver structures, respectively. The
computational complexity of our modulator and demodulator
pair is analyzed in Section V. Performance optimality of the
proposed techniques is investigated in Section VI. Finally, the
conclusions are drawn in Section VII.
Notations: Matrices, vectors and scalar quantities are denoted by boldface uppercase, boldface lowercase and normal letters, respectively. $[\mathbf{A}]_{mn}$ and $[\mathbf{a}]_n$ represent the element in the $m$th row and $n$th column of $\mathbf{A}$ and the $n$th element of $\mathbf{a}$, respectively, and $\mathbf{A}^{-1}$ signifies the inverse of $\mathbf{A}$. $\mathbf{I}_M$ and $\mathbf{0}_M$ are the identity and zero matrices of the size $M \times M$, respectively. $\mathbf{D} = {\rm diag}(\mathbf{a})$ is a diagonal matrix whose diagonal elements are formed by the elements of the vector $\mathbf{a}$ and $\mathbf{C} = {\rm circ}\{\mathbf{a}\}$ is a circulant matrix whose first column is $\mathbf{a}$. The round-down operator $\lfloor \cdot \rfloor$ rounds the value inside to the nearest integer towards minus infinity. The superscripts $(\cdot)^{\rm T}$, $(\cdot)^{\rm H}$ and $(\cdot)^{*}$ indicate transpose, conjugate transpose and conjugate operations, respectively. Finally, $\delta(\cdot)$, $\circledast_N$ and $((\cdot))_N$ represent the Dirac delta function, $N$-point circular convolution and modulo-$N$ operations, respectively.
II. SYSTEM MODEL FOR GFDM
We consider a GFDM system with the total number of $N$ subcarriers that includes $M$ symbols in each block. In a GFDM block, $M$ symbols overlap in time. Therefore, we call $M$ the overlapping factor of the GFDM system. The $MN \times 1$ vector $\mathbf{d} = [\mathbf{d}_0^{\rm T}, \ldots, \mathbf{d}_{N-1}^{\rm T}]^{\rm T}$ contains the complex data symbols of the GFDM block, where the $M \times 1$ data vector $\mathbf{d}_i = [d_i(0), \ldots, d_i(M-1)]^{\rm T}$ contains the data symbols to be transmitted on the $i$th subcarrier. To put it differently, $d_i(m)$ is the data symbol to be transmitted at the $m$th time slot on the $i$th subcarrier. The data symbols are taken from a zero mean independent and identically distributed (i.i.d.) process with the variance of unity. In GFDM modulation, the data symbols to be transmitted on the $i$th subcarrier are first up-sampled by the factor of $N$ to form an impulse train

$$s_i(n) = \sum_{m=0}^{M-1} d_i(m)\,\delta(n - mN), \quad n = 0, \ldots, MN-1. \tag{1}$$

Then, $\mathbf{s}_i = [s_i(0), \ldots, s_i(MN-1)]^{\rm T}$ is circularly convolved with the prototype filter and up-converted to its corresponding subcarrier frequency. After performing the same procedure for all the subcarriers, the resulting signals are summed up to form the GFDM signal $x(n)$, [16],

$$x(n) = \sum_{i=0}^{N-1} \sum_{m=0}^{M-1} d_i(m)\, g\big(((n - mN))_{MN}\big)\, e^{j\frac{2\pi i n}{N}}, \tag{2}$$

where $g(\ell)$ is the $\ell$th coefficient of the prototype filter.
Putting together all the transmitter output samples in an $MN \times 1$ vector $\mathbf{x} = [x(0), \ldots, x(MN-1)]^{\rm T}$, the GFDM signal can be represented as the multiplication of a modulation matrix $\mathbf{A}$ of size $MN \times MN$ by the data vector $\mathbf{d}$, [16],

$$\mathbf{x} = \mathbf{A}\mathbf{d}. \tag{3}$$

The modulation matrix $\mathbf{A}$ encompasses all signal processing steps involved in modulation. Let $\mathbf{g} = [g(0), \ldots, g(MN-1)]^{\rm T}$ hold all the coefficients of the pulse shaping/prototype filter with the length $MN$. The elements of $\mathbf{A}$ can then be represented as

$$[\mathbf{A}]_{nm} = g\big(((n - mN))_{MN}\big)\, e^{j\frac{2\pi n}{N}\lfloor m/M \rfloor}, \quad n, m = 0, \ldots, MN-1. \tag{4}$$

Based on (2) to (4), the matrix $\mathbf{A}$ can be written as

$$\mathbf{A} = [\mathbf{G}, \mathbf{E}_1\mathbf{G}, \ldots, \mathbf{E}_{N-1}\mathbf{G}], \tag{5}$$

where $\mathbf{G}$ is an $MN \times M$ matrix whose first column contains the samples of the prototype filter and whose consecutive columns are copies of the previous column circularly shifted by $N$ samples. $\mathbf{E}_i = {\rm diag}\{[\mathbf{e}_i^{\rm T}, \ldots, \mathbf{e}_i^{\rm T}]^{\rm T}\}$ is an $MN \times MN$ diagonal matrix whose diagonal elements are comprised of $M$ concatenated copies of the vector $\mathbf{e}_i = [1, e^{j\frac{2\pi i}{N}}, \ldots, e^{j\frac{2\pi i}{N}(N-1)}]^{\rm T}$.
In GFDM systems, a CP which is longer than the channel delay spread is added to the beginning of the block to accommodate the channel transient period. This enables the MF and ZF receivers to use frequency domain equalization (FDE) to tackle the wireless channel impairments and hence reduce the channel equalization complexity. If $N_{\rm CP}$ is the CP length, the last $N_{\rm CP}$ elements of the vector $\mathbf{x}$ are appended to its beginning in order to form the transmitted signal vector $\mathbf{x}_{\rm CP}$ whose length is $MN + N_{\rm CP}$. Let $\mathbf{h} = [h_0, \ldots, h_{N_{\rm ch}-1}]^{\rm T}$ be the channel impulse response. Thus, the CP length $N_{\rm CP}$ needs to be longer than the channel length $N_{\rm ch}$. The received signal which has gone through the channel, after CP removal, can be shown as

$$\mathbf{y} = \mathbf{H}\mathbf{x} + \boldsymbol{\nu}, \tag{6}$$

where $\boldsymbol{\nu}$ is the complex additive white Gaussian noise (AWGN) vector, i.e., $\boldsymbol{\nu} \sim \mathcal{CN}(\mathbf{0}, \sigma_\nu^2\mathbf{I}_{MN})$, $\sigma_\nu^2$ is the noise variance, $\mathbf{H} = {\rm circ}\{\tilde{\mathbf{h}}\}$ and $\tilde{\mathbf{h}}$ is the zero padded version of $\mathbf{h}$ to have the same length as $\mathbf{x}$. Due to the fact that $\mathbf{H}$ is a circulant matrix, an FDE procedure can be performed to compensate for the multipath channel impairments. With the assumption of having perfect synchronization and channel estimates, the equalized signal can be obtained as

$$\mathbf{y}_{\rm e} = \mathbf{F}_{MN}^{\rm H}\, \boldsymbol{\Lambda}_h^{-1}\, \mathbf{F}_{MN}\, \mathbf{y}, \tag{7}$$

where $\mathbf{F}_{MN}$ is the $MN$-point normalized discrete Fourier transform (DFT) matrix and $\boldsymbol{\Lambda}_h^{-1}$ is a diagonal matrix whose diagonal elements are reciprocals of the elements of the vector obtained from taking the $MN$-point DFT of the zero padded version of $\mathbf{h}$, viz., $\tilde{\mathbf{h}}$. The vector $\mathbf{y}_{\rm e}$ is the output of the FDE block.
In order to suppress or remove the ICI due to non-orthogonality of the subcarriers and estimate the transmitted data vector from the equalized signal vector, three linear GFDM receivers, namely MF, ZF and MMSE detectors, are considered in this paper.
As was discussed in [16], the transmitted symbols can be recovered through matched filtering

$$\hat{\mathbf{d}}_{\rm MF} = \mathbf{A}^{\rm H}\mathbf{y}_{\rm e}. \tag{8}$$

However, the MF receiver cannot completely remove the ICI. Hence, the ZF solution can be utilized to completely eliminate the ICI that is caused by non-orthogonality of the subcarriers. The ZF estimate of the transmitted data vector can be found as

$$\hat{\mathbf{d}}_{\rm ZF} = \mathbf{A}^{-1}\mathbf{y}_{\rm e}. \tag{9}$$

Since $\mathbf{A}^{-1}$ can have large values, its multiplication by the noise component of $\mathbf{y}_{\rm e}$ can result in noise enhancement. This noise amplification problem can be taken care of by utilizing the MMSE receiver

$$\hat{\mathbf{d}}_{\rm MMSE} = \big(\mathbf{A}^{\rm H}\mathbf{H}^{\rm H}\mathbf{H}\mathbf{A} + \sigma_\nu^2\mathbf{I}_{MN}\big)^{-1}\mathbf{A}^{\rm H}\mathbf{H}^{\rm H}\mathbf{y}. \tag{10}$$

It is worth mentioning that due to the noise coloring effects, as opposed to the MF and ZF receivers in (8) and (9), respectively, the channel distortions cannot be compensated using (7) before the MMSE receiver. Hence, the channel matrix $\mathbf{H}$ is included in (10).
Fig. 1 depicts the baseband block diagram of a GFDM transceiver when we have perfect synchronization in time and frequency between the transmitter and receiver in an AWGN channel. Fig. 1 summarizes the modulation and demodulation process that is discussed above. It is worth mentioning that the coefficients applied in the transmitter branches are the prototype filter coefficients $g(\ell)$ for $\ell = 0, \ldots, MN-1$, and those applied in the receiver branches are the receiver filter coefficients, which can be taken from the coefficients of the MF, ZF or MMSE receiver filter in an AWGN channel. As was mentioned in Section I, GFDM is a type of filter bank multicarrier system with circular pulse shaping. Therefore, the GFDM transmitter and receiver can be thought of as a pair of synthesis and analysis filter banks, respectively.

Fig. 1. Baseband block diagram of a GFDM transceiver system in AWGN channel.

From (3) and (8) to (10), one realizes that the direct matrix multiplications and inversions that are involved demand a very large computational complexity, as all the matrices are of the size $MN \times MN$, with $MN$ being usually large, and such complexity may not be affordable for practical systems. Therefore, in the remainder of this paper, low complexity techniques will be proposed that can substantially reduce the computational cost of the synthesis and analysis filter banks that are shown in Fig. 1, while maintaining the optimal performance.
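As a small numerical illustration of the system model, the sketch below builds the modulation matrix element by element and checks that the product of that matrix with the data vector agrees with the filter bank description (up-sample, circularly filter, up-convert, sum). It is only a sketch under stated assumptions: the decaying exponential prototype filter, the QPSK data and the tiny sizes (4 subcarriers, 3 time slots) are hypothetical choices made for readability, and all function names are illustrative rather than taken from the paper.

```python
import cmath
import random

def prototype_filter(N, M, decay=0.6):
    """Hypothetical real prototype filter of length M*N (a decaying exponential)."""
    return [decay ** n for n in range(M * N)]

def qpsk_symbols(count, seed=0):
    """Unit-variance QPSK symbols from a seeded generator (illustrative data only)."""
    rng = random.Random(seed)
    s = 2 ** -0.5
    return [complex(rng.choice((-s, s)), rng.choice((-s, s))) for _ in range(count)]

def modulation_matrix(g, N, M):
    """Modulation matrix: column c = i*M + m carries subcarrier i, time slot m."""
    MN = M * N
    return [[g[(n - c * N) % MN] * cmath.exp(2j * cmath.pi * n * (c // M) / N)
             for c in range(MN)] for n in range(MN)]

def modulate_matrix(A, d):
    """Matrix form of the modulator: x = A d."""
    return [sum(a * s for a, s in zip(row, d)) for row in A]

def modulate_filter_bank(g, d, N, M):
    """Filter bank form: up-sample by N, circularly filter, up-convert, sum."""
    MN = M * N
    x = [0j] * MN
    for i in range(N):
        s = [0j] * MN
        for m in range(M):
            s[m * N] = d[i * M + m]                      # impulse train
        for n in range(MN):
            filtered = sum(s[l] * g[(n - l) % MN] for l in range(MN))
            x[n] += filtered * cmath.exp(2j * cmath.pi * i * n / N)
    return x

N, M = 4, 3
g = prototype_filter(N, M)
d = qpsk_symbols(N * M)
x_matrix = modulate_matrix(modulation_matrix(g, N, M), d)
x_bank = modulate_filter_bank(g, d, N, M)
max_diff = max(abs(a - b) for a, b in zip(x_matrix, x_bank))
print("largest difference between the two forms:", max_diff)
```

Both forms cost on the order of the square of the block length, which is the cost that the structures proposed in the following sections aim to avoid.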
III. PROPOSED GFDM TRANSMITTER
This section presents our proposed low complexity GFDM transmitter design and implementation. In the following subsections, we will show how the synthesis filter bank of Fig. 1 can be simplified to have a very low computational load.
A. GFDM Transmitter Design
Starting from (3), one can realize that direct multiplication of the matrix $\mathbf{A}$ by the data vector $\mathbf{d}$ is a complex operation which demands $(MN)^2$ complex multiplications. Therefore, complexity will be an issue for practical systems as the number of subcarriers and/or the parameter $M$ increases. Accordingly, a low complexity implementation technique for the GFDM transmitter has to be sought. To this end, (3) can be written as

$$\mathbf{x} = \mathbf{A}\mathbf{F}_b^{\rm H}\mathbf{F}_b\mathbf{d}, \tag{11}$$

where $\mathbf{F}_b$ is the $MN \times MN$ normalized block DFT matrix that includes $N^2$ submatrices $\boldsymbol{\Omega}_{ni} = \frac{1}{\sqrt{N}} e^{-j\frac{2\pi n i}{N}}\mathbf{I}_M$ and $n, i = 0, \ldots, N-1$. Validity of (11) is based on the fact that $\mathbf{F}_b^{\rm H}\mathbf{F}_b = \mathbf{I}_{MN}$. As is derived in Appendix A, the matrix resulting from multiplication of the block DFT matrix $\mathbf{F}_b$ by $\mathbf{A}^{\rm H}$ is sparse and is comprised of the prototype filter coefficients scaled by $\sqrt{N}$. From (11), it can be inferred that $\mathbf{A}\mathbf{F}_b^{\rm H}$ is also sparse since it is the conjugate transpose of $\mathbf{F}_b\mathbf{A}^{\rm H}$. Hence, our strategy allows us to make the matrix $\boldsymbol{\Gamma} = \mathbf{A}\mathbf{F}_b^{\rm H}$ sparse and real, as the prototype filter is usually chosen as a real filter1. Due to (11) and the definition of $\mathbf{F}_b$, $\mathbf{F}_b\mathbf{d}$ can be implemented by performing $M$ DFT operations of size $N$ on the data samples, i.e., one per GFDM symbol. Let $\tilde{\mathbf{d}} = \mathbf{F}_b\mathbf{d} = [\tilde{\mathbf{d}}_0^{\rm T}, \ldots, \tilde{\mathbf{d}}_{N-1}^{\rm T}]^{\rm T}$, where the $M \times 1$ vector $\tilde{\mathbf{d}}_i$ contains the $i$th output of each DFT block. Then (11) can be rearranged as

$$\mathbf{x} = \sum_{i=0}^{N-1} \boldsymbol{\Gamma}_i\tilde{\mathbf{d}}_i, \tag{12}$$

where $\boldsymbol{\Gamma} = [\boldsymbol{\Gamma}_0, \ldots, \boldsymbol{\Gamma}_{N-1}]$. As discussed in Appendix B, the $M \times MN$ matrices $\boldsymbol{\Gamma}_i^{\rm H}$ have only $M$ non-zero columns and the sets of those column indices are mutually exclusive with respect to each other. As a result, $\mathbf{x}_i = \boldsymbol{\Gamma}_i\tilde{\mathbf{d}}_i$ will be a sparse vector with only $M$ non-zero elements located on the positions $\kappa, \kappa + N, \ldots, \kappa + (M-1)N$. On the basis of the derivations that are presented in Appendix A, the non-zero elements of $\mathbf{x}_i$ can be obtained from the $M$-point circular convolution of $\tilde{\mathbf{d}}_i$ with the $\kappa$th polyphase component of the prototype filter that is scaled by $\sqrt{N}$. Therefore, defining the non-zero elements of $\mathbf{x}_i$ as the vector $\tilde{\mathbf{x}}_i$, we get

$$\tilde{\mathbf{x}}_i = \sqrt{N}\, \mathbf{g}_\kappa \circledast_M \tilde{\mathbf{d}}_i, \tag{13}$$

where $\kappa = ((-i))_N$ and $\mathbf{g}_\kappa = [g(\kappa), g(\kappa + N), \ldots, g(\kappa + (M-1)N)]^{\rm T}$.
B. GFDM Transmitter Implementation
In this subsection, implementation of the GFDM transmitter designed in Section III.A is discussed. From (11) to (13), GFDM modulation, based on our design, can be summarized into two steps.
1) $M$ number of $N$-point DFT operations, i.e., application of an $N$-point DFT to each individual GFDM symbol, which includes $N$ subcarriers. This can be efficiently implemented by taking advantage of the fast Fourier transform (FFT) algorithm.
2) $N$ number of $M$-point circular convolution operations.
Therefore, the first and second steps of our GFDM transmitter can be implemented by cascading the block diagrams shown in Fig. 2(a) and (b), respectively. The blocks P/S convert the parallel FFT outputs to serial streams. All the commutators shown in Fig. 2 turn counter clockwise. Both commutators located on the right hand side of Fig. 2(a) and (b) turn after one sample collection. However, the one located on the left hand side of (b) turns by one position after sending $M$ samples to each $M$-point circular convolution block.

Fig. 2. Concatenation of (a) and (b) shows the implementation of the proposed GFDM transmitter.

1 It is worth mentioning that GFDM is not limited to real-valued prototype filters. Additionally, both real-valued and complex-valued filters are applicable to the proposed solutions in this paper.
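The two steps above can be sketched numerically as follows. This is a minimal illustration, not the authors' implementation: it assumes the forward DFT uses the negative exponent with a normalization of one over the square root of the number of subcarriers, in which case the output of DFT bin `i` feeds the polyphase branch `kappa = (-i) mod N`. The exponential prototype filter, the integer-valued test data and all function names are hypothetical. A quadratic-time DFT is used purely for clarity where an FFT would be used in practice.

```python
import cmath

def dft(v, inverse=False):
    """Plain O(n^2) DFT; the inverse includes the 1/n factor."""
    n = len(v)
    sign = 1 if inverse else -1
    out = [sum(v[k] * cmath.exp(sign * 2j * cmath.pi * k * q / n) for k in range(n))
           for q in range(n)]
    return [o / n for o in out] if inverse else out

def circular_convolution(a, b):
    """M-point circular convolution of two equal-length lists."""
    M = len(a)
    return [sum(a[(r - m) % M] * b[m] for m in range(M)) for r in range(M)]

def modulate_direct(g, d, N, M):
    """Reference modulator: direct evaluation of the double sum over i and m."""
    MN = N * M
    return [sum(d[i * M + m] * g[(n - m * N) % MN] * cmath.exp(2j * cmath.pi * i * n / N)
                for i in range(N) for m in range(M)) for n in range(MN)]

def modulate_low_complexity(g, d, N, M):
    """Two-step modulator: M DFTs of size N, then N circular convolutions of size M."""
    scale = N ** 0.5
    # Step 1: one N-point DFT per time slot (GFDM symbol), normalized by 1/sqrt(N).
    d_tilde = [[0j] * M for _ in range(N)]
    for m in range(M):
        bins = dft([d[k * M + m] for k in range(N)])
        for i in range(N):
            d_tilde[i][m] = bins[i] / scale
    # Step 2: DFT bin i is filtered by polyphase component kappa of the prototype filter.
    x = [0j] * (N * M)
    for i in range(N):
        kappa = (-i) % N                 # follows from the assumed DFT sign convention
        branch = circular_convolution(g[kappa::N], d_tilde[i])
        for r in range(M):
            x[kappa + r * N] = scale * branch[r]
    return x

N, M = 4, 3
g = [0.6 ** n for n in range(N * M)]                        # hypothetical real filter
d = [complex((3 * c) % 5 - 2, (7 * c) % 3 - 1) for c in range(N * M)]
x_ref = modulate_direct(g, d, N, M)
x_fast = modulate_low_complexity(g, d, N, M)
tx_error = max(abs(a - b) for a, b in zip(x_ref, x_fast))
print("largest deviation from direct modulation:", tx_error)
```

The agreement is exact up to rounding, which reflects the fact that the two-step structure is a re-factorization of the modulation matrix rather than an approximation.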
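IV. PROPOSED GFDM RECEIVER
In this section, we derive low complexity ZF and MMSE receivers for GFDM systems. It is worth mentioning that our solutions are direct and hence the lower complexity of these receivers comes for free, as they do not result in any performance loss, thanks to the special structure of the matrix $\mathbf{A}^{\rm H}\mathbf{A}$. The characteristics of $\mathbf{A}^{\rm H}\mathbf{A}$ will be discussed in the next subsection and then we will derive our proposed receivers on the basis of those traits.
A. Block-Diagonalization of the Matrix $\mathbf{A}^{\rm H}\mathbf{A}$
The key idea behind our proposed GFDM receiver techniques is to take advantage of the particular structure of the matrix $\mathbf{A}^{\rm H}\mathbf{A}$, which is present in both ZF and MMSE receiver formulations. Using (5), one can calculate $\mathbf{A}^{\rm H}\mathbf{A}$ and find out that it has the following structure

$$\mathbf{A}^{\rm H}\mathbf{A} = \begin{bmatrix} \mathbf{G}^{\rm H}\mathbf{G} & \mathbf{G}^{\rm H}\mathbf{E}_1\mathbf{G} & \cdots & \mathbf{G}^{\rm H}\mathbf{E}_{N-1}\mathbf{G} \\ \mathbf{G}^{\rm H}\mathbf{E}_1^{\rm H}\mathbf{G} & \mathbf{G}^{\rm H}\mathbf{G} & \cdots & \mathbf{G}^{\rm H}\mathbf{E}_1^{\rm H}\mathbf{E}_{N-1}\mathbf{G} \\ \vdots & \vdots & \ddots & \vdots \\ \mathbf{G}^{\rm H}\mathbf{E}_{N-1}^{\rm H}\mathbf{G} & \mathbf{G}^{\rm H}\mathbf{E}_{N-1}^{\rm H}\mathbf{E}_1\mathbf{G} & \cdots & \mathbf{G}^{\rm H}\mathbf{G} \end{bmatrix}. \tag{14}$$

From the definition of the vector $\mathbf{e}_i$, it can be straightforwardly perceived that $\mathbf{E}_i^{\rm H}\mathbf{E}_k = \mathbf{E}_{((k-i))_N}$ and hence $\mathbf{G}^{\rm H}\mathbf{E}_i^{\rm H}\mathbf{E}_k\mathbf{G} = \mathbf{G}^{\rm H}\mathbf{E}_{((k-i))_N}\mathbf{G}$. Therefore, the block columns of $\mathbf{A}^{\rm H}\mathbf{A}$ as shown in (14) are circularly shifted with respect to each other. Accordingly, $\mathbf{A}^{\rm H}\mathbf{A}$ is a block-circulant matrix with blocks of size $M \times M$. Following a similar line of derivations as in [21] and [6], $\mathbf{A}^{\rm H}\mathbf{A}$ can be expanded as follows

$$\mathbf{A}^{\rm H}\mathbf{A} = \mathbf{F}_b^{\rm H}\mathbf{D}\mathbf{F}_b, \tag{15}$$

where $\mathbf{D} = {\rm diag}\{\mathbf{D}_0, \ldots, \mathbf{D}_{N-1}\}$ is an $MN \times MN$ block-diagonal matrix and the $\mathbf{D}_i$'s are $M \times M$ block matrices. From (15), $\mathbf{D}$ can be derived as

$$\mathbf{D} = \mathbf{F}_b\mathbf{A}^{\rm H}\mathbf{A}\mathbf{F}_b^{\rm H}. \tag{16}$$

As is explained in Appendix B, the $\mathbf{D}_i$'s can be derived from polyphase components of the prototype filter,

$$\mathbf{D}_i = N\,{\rm circ}\{\mathbf{g}_\kappa \circledast_M \tilde{\mathbf{g}}_\kappa\}, \tag{17}$$

where $\kappa = ((-i))_N$, $\mathbf{g}_\kappa = [g(\kappa), g(\kappa + N), \ldots, g(\kappa + (M-1)N)]^{\rm T}$ is the $\kappa$th polyphase component of $\mathbf{g}$ and $\tilde{\mathbf{g}}_\kappa = [g(\kappa), g(\kappa + (M-1)N), \ldots, g(\kappa + N)]^{\rm T}$ is its circularly folded version. As (17) highlights, the $\mathbf{D}_i$'s are all real and circulant matrices.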
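The block-diagonalization can be checked numerically on a small example. The sketch below forms the Gram matrix of a small modulation matrix, applies a block DFT matrix on both sides, and confirms that the off-diagonal blocks vanish while each diagonal block equals the number of subcarriers times the circular autocorrelation of one polyphase component of the filter. The sign convention of the block DFT (negative exponent), the resulting branch index `kappa = (-i) mod N`, the exponential prototype filter and all function names are assumptions made for this illustration.

```python
import cmath

def matmul(X, Y):
    """Dense matrix product of two lists of lists."""
    return [[sum(X[r][k] * Y[k][c] for k in range(len(Y))) for c in range(len(Y[0]))]
            for r in range(len(X))]

def hermitian(X):
    """Conjugate transpose."""
    return [[X[r][c].conjugate() for r in range(len(X))] for c in range(len(X[0]))]

def modulation_matrix(g, N, M):
    """Modulation matrix with column c = i*M + m (subcarrier i, time slot m)."""
    MN = M * N
    return [[g[(n - c * N) % MN] * cmath.exp(2j * cmath.pi * n * (c // M) / N)
             for c in range(MN)] for n in range(MN)]

def block_dft(N, M):
    """Block DFT matrix: N*N sub-blocks exp(-j 2 pi n i / N) / sqrt(N) times I_M."""
    MN = M * N
    F = [[0j] * MN for _ in range(MN)]
    for n in range(N):
        for i in range(N):
            w = cmath.exp(-2j * cmath.pi * n * i / N) / N ** 0.5
            for p in range(M):
                F[n * M + p][i * M + p] = w
    return F

def expected_block(g, N, M, i):
    """N times the circulant built from the circular autocorrelation of g_kappa."""
    gk = g[(-i) % N::N]
    return [[N * sum(gk[(r - p) % M] * gk[(r - q) % M] for r in range(M))
             for q in range(M)] for p in range(M)]

N, M = 4, 3
g = [0.6 ** n for n in range(M * N)]          # hypothetical real prototype filter
A = modulation_matrix(g, N, M)
Fb = block_dft(N, M)
D = matmul(matmul(Fb, matmul(hermitian(A), A)), hermitian(Fb))

# Largest entry outside the M x M diagonal blocks (should vanish up to rounding).
off_block = max(abs(D[r][c]) for r in range(M * N) for c in range(M * N)
                if r // M != c // M)
# Largest deviation of the diagonal blocks from the polyphase-based closed form.
on_block = max(abs(D[i * M + p][i * M + q] - expected_block(g, N, M, i)[p][q])
               for i in range(N) for p in range(M) for q in range(M))
print(off_block, on_block)
```

Because every diagonal block is circulant, inverting the whole matrix reduces to inverting small circulant matrices, each of which is diagonalized by an ordinary DFT of the same small size.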
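B. Low Complexity MF Receiver
Based on (8), direct implementation of the MF receiver involves a matrix to vector multiplication which has the computational cost of $(MN)^2$ complex multiplications. This procedure becomes highly complex for large values of $N$ and/or $M$, which is usually the case. As discussed in Appendix A, multiplication of $\mathbf{A}^{\rm H}$ by the block DFT matrix results in a sparse matrix. Due to the fact that $\mathbf{F}_b^{\rm H}\mathbf{F}_b = \mathbf{I}_{MN}$, similar to the transmitter in (11), (8) can be written as

$$\hat{\mathbf{d}}_{\rm MF} = \mathbf{F}_b^{\rm H}\mathbf{F}_b\mathbf{A}^{\rm H}\mathbf{y}_{\rm e} = \mathbf{F}_b^{\rm H}\boldsymbol{\Gamma}^{\rm H}\mathbf{y}_{\rm e}, \tag{18}$$

where $\boldsymbol{\Gamma}^{\rm H} = \mathbf{F}_b\mathbf{A}^{\rm H}$ is a sparse matrix with only $M^2N$ non-zero elements that are the scaled version of the prototype filter coefficients. The closed form of $\boldsymbol{\Gamma}^{\rm H}$ is derived in Appendix A and it is shown that the matrix is real valued and comprised of the prototype filter elements. Non-zero columns of the $M \times MN$ block matrices $\boldsymbol{\Gamma}_i^{\rm H}$ are circularly shifted copies of each other. Hence, multiplication of $\boldsymbol{\Gamma}_i^{\rm H}$ and $\mathbf{y}_{\rm e}$ is equivalent to the $M$-point circular convolution of $M$ equidistant elements of $\mathbf{y}_{\rm e}$ starting from the $\kappa$th position, with $\kappa = ((-i))_N$, and the circularly folded version of the $\kappa$th polyphase component of $\mathbf{g}$ scaled by $\sqrt{N}$, viz., $\sqrt{N}\tilde{\mathbf{g}}_\kappa$. Usually, the prototype filter coefficients are real-valued. Thus, $\boldsymbol{\Gamma}$ is real-valued. Multiplication of $\mathbf{F}_b^{\rm H}$ by the vector $\boldsymbol{\Gamma}^{\rm H}\mathbf{y}_{\rm e}$ can be implemented by applying $M$ number of $N$-point IDFT operations. Let $\bar{\mathbf{y}} = \boldsymbol{\Gamma}^{\rm H}\mathbf{y}_{\rm e} = [\bar{\mathbf{y}}_0^{\rm T}, \ldots, \bar{\mathbf{y}}_{N-1}^{\rm T}]^{\rm T}$ and $\mathbf{y}_{{\rm e},\kappa} = [y_{\rm e}(\kappa), y_{\rm e}(\kappa + N), \ldots, y_{\rm e}(\kappa + (M-1)N)]^{\rm T}$. Therefore, we have

$$\bar{\mathbf{y}}_i = \mathbf{w}_i^{\rm MF} \circledast_M \mathbf{y}_{{\rm e},\kappa}, \tag{19}$$

where $\mathbf{w}_i^{\rm MF} = \sqrt{N}\tilde{\mathbf{g}}_\kappa$. Finally, the MF estimates of $\mathbf{d}$ can be obtained as

$$\hat{\mathbf{d}}_{\rm MF} = \mathbf{F}_b^{\rm H}\bar{\mathbf{y}}. \tag{20}$$

C. Low Complexity ZF Receiver
Inserting (15) into (9), we get

$$\hat{\mathbf{d}}_{\rm ZF} = (\mathbf{A}^{\rm H}\mathbf{A})^{-1}\mathbf{A}^{\rm H}\mathbf{y}_{\rm e} = \mathbf{F}_b^{\rm H}\mathbf{D}^{-1}\mathbf{F}_b\mathbf{A}^{\rm H}\mathbf{y}_{\rm e}. \tag{21}$$

Multiplication of the matrix $\mathbf{A}^{\rm H}$ by the vector $\mathbf{y}_{\rm e}$ is the first source of computational burden in the ZF receiver, which has the computational cost of $(MN)^2$. However, this complexity can be reduced by taking advantage of the sparsity of the matrix $\boldsymbol{\Gamma}^{\rm H} = \mathbf{F}_b\mathbf{A}^{\rm H}$, as was suggested in the previous subsection. Equation (19) can be written as $\bar{\mathbf{y}}_i = \sqrt{N}\,\mathbf{C}_\kappa^{\rm T}\mathbf{y}_{{\rm e},\kappa}$, where $\mathbf{C}_\kappa = {\rm circ}\{\mathbf{g}_\kappa\}$. Let $\bar{\mathbf{d}}_{\rm ZF} = \mathbf{D}^{-1}\bar{\mathbf{y}} = [\bar{\mathbf{d}}_{{\rm ZF},0}^{\rm T}, \ldots, \bar{\mathbf{d}}_{{\rm ZF},N-1}^{\rm T}]^{\rm T}$, where

$$\bar{\mathbf{d}}_{{\rm ZF},i} = \mathbf{D}_i^{-1}\bar{\mathbf{y}}_i. \tag{22}$$

Therefore, from rearranging (17) as $\mathbf{D}_i = N\,\mathbf{C}_\kappa^{\rm T}\mathbf{C}_\kappa$ and inserting it into (22), we have

$$\bar{\mathbf{d}}_{{\rm ZF},i} = \frac{1}{\sqrt{N}}\mathbf{C}_\kappa^{-1}\mathbf{y}_{{\rm e},\kappa} = \mathbf{w}_i^{\rm ZF} \circledast_M \mathbf{y}_{{\rm e},\kappa}, \tag{23}$$

where $\mathbf{w}_i^{\rm ZF}$ includes the first column of the circulant matrix $\mathbf{C}_\kappa^{-1}$ scaled by $\frac{1}{\sqrt{N}}$. Due to the fact that the coefficients of the prototype filter are known, the vectors $\mathbf{w}_i^{\rm ZF}$ can be calculated offline. Additionally, since the prototype filter coefficients are real, the $\mathbf{w}_i^{\rm ZF}$'s are also real. From (23), one may realize that calculation of the vector $\bar{\mathbf{d}}_{\rm ZF}$ needs $N$ number of $M$-point circular convolutions. After acquiring $\bar{\mathbf{d}}_{\rm ZF}$, the ZF estimates of the transmitted symbols can be obtained as

$$\hat{\mathbf{d}}_{\rm ZF} = \mathbf{F}_b^{\rm H}\bar{\mathbf{d}}_{\rm ZF}. \tag{24}$$

As can be inferred from (24), finding $\hat{\mathbf{d}}_{\rm ZF}$ from $\bar{\mathbf{d}}_{\rm ZF}$ requires $M$ number of $N$-point inverse DFT (IDFT) operations.
D. Low Complexity MMSE Receiver for AWGN Channels
From (10), one may realize that in the presence of the channel matrix $\mathbf{H}$ in the equations, the matrix $\mathbf{A}^{\rm H}\mathbf{H}^{\rm H}\mathbf{H}\mathbf{A}$ is not block circulant, as opposed to $\mathbf{A}^{\rm H}\mathbf{A}$. As a result, a low complexity MMSE receiver using the DFT based matrix block diagonalization approach only exists for AWGN channels. Hence, in this paper, we limit ourselves to such channels for the MMSE based GFDM receiver design. It is worth noting that the MMSE receiver in an AWGN channel becomes relevant when the ZF receiver leads to a large amount of noise amplification. The MMSE estimate of the transmitted data block in the AWGN channel is simplified to

$$\hat{\mathbf{d}}_{\rm MMSE} = \big(\mathbf{A}^{\rm H}\mathbf{A} + \sigma_\nu^2\mathbf{I}_{MN}\big)^{-1}\mathbf{A}^{\rm H}\mathbf{y}. \tag{25}$$

Using (15) in (25) we get

$$\hat{\mathbf{d}}_{\rm MMSE} = \mathbf{F}_b^{\rm H}\big(\mathbf{D} + \sigma_\nu^2\mathbf{I}_{MN}\big)^{-1}\mathbf{F}_b\mathbf{A}^{\rm H}\mathbf{y} = \mathbf{F}_b^{\rm H}\tilde{\mathbf{D}}^{-1}\bar{\mathbf{y}}, \tag{26}$$

where $\tilde{\mathbf{D}} = \mathbf{D} + \sigma_\nu^2\mathbf{I}_{MN}$ and $\bar{\mathbf{y}} = \boldsymbol{\Gamma}^{\rm H}\mathbf{y}$. Recalling the circulant property of $\mathbf{D}_i$ from (17), it can be understood that $\tilde{\mathbf{D}}_i = \mathbf{D}_i + \sigma_\nu^2\mathbf{I}_M$ is also circulant and can be expanded as $\tilde{\mathbf{D}}_i = \mathbf{F}_M^{\rm H}\boldsymbol{\Lambda}_i\mathbf{F}_M$, where $\boldsymbol{\Lambda}_i$ is a real-valued diagonal matrix that holds the eigenvalues of $\tilde{\mathbf{D}}_i$2. Let $\bar{\mathbf{d}}_{\rm MMSE} = \tilde{\mathbf{D}}^{-1}\bar{\mathbf{y}}$, we can write

$$\bar{\mathbf{d}}_{{\rm MMSE},i} = \sqrt{N}\,\tilde{\mathbf{D}}_i^{-1}\mathbf{C}_\kappa^{\rm T}\mathbf{y}_\kappa = \mathbf{w}_i^{\rm MMSE} \circledast_M \mathbf{y}_\kappa, \tag{27}$$

where $\mathbf{y}_\kappa = [y(\kappa), y(\kappa + N), \ldots, y(\kappa + (M-1)N)]^{\rm T}$ and $\mathbf{w}_i^{\rm MMSE}$ includes the first column of the circulant matrix $\sqrt{N}\,\tilde{\mathbf{D}}_i^{-1}\mathbf{C}_\kappa^{\rm T}$. Since, in the MMSE receiver, the matrix $\tilde{\mathbf{D}}_i$ depends on $\sigma_\nu^2$ and the receiver cannot be simplified as in (19) or (23), the circular convolution of (27) needs to be calculated in the frequency domain, known as fast convolution, in order to have the lowest complexity. After obtaining $\bar{\mathbf{d}}_{\rm MMSE}$, the MMSE estimates of the transmitted symbols can be found as

$$\hat{\mathbf{d}}_{\rm MMSE} = \mathbf{F}_b^{\rm H}\bar{\mathbf{d}}_{\rm MMSE}. \tag{28}$$

E. Receiver Implementation
In this subsection, we present a unified implementation of the MF, ZF and MMSE receivers that we proposed in Sections IV.B, IV.C and IV.D. As Fig. 3 depicts, the proposed GFDM receivers can be implemented by cascading Fig. 3(a) and (b). It is worth mentioning that the commutator on the right hand side of Fig. 3(a) will turn by one position, in the clockwise direction, after collecting $M$ samples from the $i$th branch, i.e., the output vector of that branch. In the MF and ZF receivers, the branch filter vectors $\mathbf{w}_i$ are replaced by the $\mathbf{w}_i^{\rm MF}$'s and $\mathbf{w}_i^{\rm ZF}$'s, respectively, and in the MMSE receiver, they will be replaced by the $\mathbf{w}_i^{\rm MMSE}$'s3. Due to the fact that in the MF and ZF receivers the vectors $\mathbf{w}_i^{\rm MF}$ and $\mathbf{w}_i^{\rm ZF}$ are fixed and only depend on the prototype filter coefficients, they can be calculated offline and hence there is no need for their real-time calculation. However, in MMSE receivers, the vectors $\mathbf{w}_i^{\rm MMSE}$ depend on the signal to noise ratio and hence they should be calculated in real-time. As mentioned earlier in Section IV.D, circular convolutions in our MMSE receiver need to be performed by taking advantage of fast convolution to keep the complexity low.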
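The unified receiver can be sketched as one routine in which only the branch filter taps change between the MF, ZF and MMSE cases. The code below is an illustration under stated assumptions, not the authors' implementation: it works in a noiseless AWGN setting, applies every circular convolution in the frequency domain, uses a hypothetical exponential prototype filter (chosen because each of its polyphase components has a DFT with no zeros), and assumes the same DFT sign convention as a forward block DFT with a negative exponent, so that branch `i` reads the received samples at offset `(-i) mod N`. All names are illustrative.

```python
import cmath

def dft(v, inverse=False):
    """Plain O(n^2) DFT; the inverse includes the 1/n factor."""
    n = len(v)
    sign = 1 if inverse else -1
    out = [sum(v[k] * cmath.exp(sign * 2j * cmath.pi * k * q / n) for k in range(n))
           for q in range(n)]
    return [o / n for o in out] if inverse else out

def modulate_direct(g, d, N, M):
    """Reference GFDM modulator (double sum over subcarriers and time slots)."""
    MN = N * M
    return [sum(d[i * M + m] * g[(n - m * N) % MN] * cmath.exp(2j * cmath.pi * i * n / N)
                for i in range(N) for m in range(M)) for n in range(MN)]

def receiver_taps(g, N, M, kind, noise_var=0.0):
    """Frequency-domain taps of the N branch filters for a real prototype filter."""
    scale = N ** 0.5
    taps = []
    for i in range(N):
        G = dft(g[(-i) % N::N])          # DFT of the polyphase component
        if kind == "MF":
            taps.append([scale * Gf.conjugate() for Gf in G])
        elif kind == "ZF":
            taps.append([1.0 / (scale * Gf) for Gf in G])
        elif kind == "MMSE":
            taps.append([scale * Gf.conjugate() / (N * abs(Gf) ** 2 + noise_var)
                         for Gf in G])
        else:
            raise ValueError(kind)
    return taps

def demodulate(y, taps, N, M):
    """Stage (a): N circular convolutions of size M. Stage (b): M IDFTs of size N."""
    branches = []
    for i in range(N):
        Y = dft(y[(-i) % N::N])
        branches.append(dft([w * Yf for w, Yf in zip(taps[i], Y)], inverse=True))
    d_hat = [0j] * (N * M)
    for m in range(M):
        col = dft([branches[i][m] for i in range(N)], inverse=True)
        for k in range(N):
            d_hat[k * M + m] = col[k] * N ** 0.5     # normalized IDFT scaling
    return d_hat

N, M = 4, 3
g = [0.6 ** n for n in range(N * M)]
d = [complex((3 * c) % 5 - 2, (7 * c) % 3 - 1) for c in range(N * M)]
x = modulate_direct(g, d, N, M)                       # noiseless received block
d_mf = demodulate(x, receiver_taps(g, N, M, "MF"), N, M)
d_zf = demodulate(x, receiver_taps(g, N, M, "ZF"), N, M)
d_mmse0 = demodulate(x, receiver_taps(g, N, M, "MMSE", noise_var=0.0), N, M)
zf_error = max(abs(a - b) for a, b in zip(d_zf, d))
print("ZF recovery error without noise:", zf_error)
```

With the noise variance set to zero the MMSE taps coincide with the ZF taps, which is one way to see why the ZF receiver can stand in for the MMSE one at high SNR.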
V. COMPUTATIONAL COMPLEXITY
In this section, the computational complexity of our proposed GFDM transmitter and receiver structures is discussed and compared to the existing ones that are known to have the lowest complexity, [18], [20]. In both cases, a total number of $N$ subcarriers and an overlapping factor of $M$ are considered.

Fig. 3. Unified implementation of our proposed MF, ZF and MMSE-based GFDM receivers from cascading the block diagrams (a) and (b).

2 Since $\mathbf{g}_\kappa$ is a real vector and $\tilde{\mathbf{g}}_\kappa$ is the circularly folded version of $\mathbf{g}_\kappa$.
3 As mentioned earlier in Section IV.D, as opposed to our proposed MF and ZF receivers, the proposed MMSE receiver in this paper is only applicable to AWGN channels.

A. Transmitter Complexity
Table I presents the computational complexity of different GFDM transmitter implementations based on the number of complex multiplications (CMs).
As discussed in Section III.B, our proposed GFDM transmitter involves two steps. The first step includes $M$ number of $N$-point FFT operations, which requires $\frac{MN}{2}\log_2 N$ CMs. The second step needs $N$ number of $M$-point circular convolutions. Recalling (13), since the polyphase components of the prototype filter are real-valued vectors, one may realize that each $M$-point circular convolution demands $\frac{M^2}{2}$ number of CMs. If $M$ is a power of two, the complexity can be further reduced by performing the circular convolutions in the frequency domain. This is due to the fact that circular convolution in time is multiplication in the frequency domain. Thus, to perform each circular convolution, a pair of $M$-point FFT and IFFT blocks together with $M$ complex multiplications by the filter coefficients in the frequency domain are required. However, based on the results of [19], $M$ cannot take even values as the matrix $\mathbf{A}$ becomes singular.
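The operation counts discussed above can be tallied with a few lines of code. This is only a back-of-the-envelope sketch under explicit assumptions: a radix-2 FFT of size N is charged (N/2) log2(N) complex multiplications, a product of a real tap with a complex sample is charged half a complex multiplication, and the values N = 1024 and M = 7 are hypothetical illustration values, not figures taken from the paper's tables or plots.

```python
import math

def direct_cms(N, M):
    """Direct matrix-vector product with an (M*N) x (M*N) modulation matrix."""
    return (M * N) ** 2

def proposed_tx_cms(N, M):
    """M FFTs of size N plus N real-tap circular convolutions of size M.

    Assumptions: radix-2 FFT cost of (N/2)*log2(N) CMs, and half a CM for
    each real-by-complex product inside the circular convolutions.
    """
    fft_stage = M * (N / 2) * math.log2(N)
    conv_stage = N * (M ** 2) / 2
    return fft_stage + conv_stage

N, M = 1024, 7                       # hypothetical illustration values
direct = direct_cms(N, M)
proposed = proposed_tx_cms(N, M)
print(direct, proposed, direct / proposed)
```

For these illustrative values the direct product needs 51,380,224 CMs against 60,928 for the two-step structure, a ratio of roughly 840. The circular convolution term grows with the square of the overlapping factor, so it eventually dominates the FFT term as that factor increases.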
The complexity relationships that are presented in Table I are calculated and plotted in Fig. 4 for a fixed number of subcarriers $N$ with respect to different values of the overlapping factor $M$. As the authors of [20] suggest, $L = 2$ is chosen for calculating their GFDM transmitter complexity4. Due to the fact that direct multiplication of $\mathbf{A}$ by the data vector $\mathbf{d}$ demands a large number of CMs and is impractical, we do not present it in Fig. 4. To give a quantitative indication of the complexity reduction that our proposed transmitter provides compared with the direct computation of (3), in the same system setting as used for our other comparisons, a complexity reduction of around three orders of magnitude can be achieved. According to Fig. 4, for small values of $M$ our proposed transmitter structure has a complexity very close to that of OFDM5. However, as $M$ increases, the complexity of our transmitter increases at a higher pace than that of OFDM. This is due to the overhead of $\frac{M^2N}{2}$ number of CMs compared with OFDM. Compared with the transmitter structure that we are proposing in this paper, for small values of $M$ up to 11, the transmitter proposed in [20] demands about two times higher number of CMs. As $M$ increases, the complexity of our technique gets close to that of the one proposed in [20]. The GFDM transmitter of [20] is about 3 to 4 times more complex than OFDM.

TABLE I
COMPUTATIONAL COMPLEXITY OF DIFFERENT GFDM TRANSMITTER IMPLEMENTATIONS

Fig. 4. Computational complexity comparison of different GFDM transmitter techniques and the OFDM transmitter technique.

4 Parameter $L$ indicates the number of overlapping subcarriers, i.e., two adjacent subcarriers in GFDM transmission.
5 For the purpose of having a fair comparison between OFDM and different GFDM system implementations, an OFDM system transmitting $M$ concatenated symbols having $N$ subcarriers is considered in this study.


TABLE II
COMPUTATIONAL COMPLEXITY OF DIFFERENT GFDM RECEIVER TECHNIQUES

B. Receiver Complexity
Table II summarizes the computational complexity of different GFDM receivers in terms of the number of complex multiplications for two cases of AWGN and multipath channels. The
parameter is the number of iterations in the algorithm with interference cancellation and indicates the span of receiver filter
in the neighborhood of each subcarrier band⁶.
From Fig. 3, it can be understood that our proposed receivers
involve
and
numbers of -point circular convolutions
and -point IDFT operations, respectively. IDFT operations
can be efficiently implemented using the -point IFFT algorithm
which requires
CMs. As mentioned earlier, in the proposed MF and ZF receivers, the vectors have fixed values and
hence can be calculated and stored offline. Furthermore, 's are
real-valued vectors. Thus, the number of complex multiplications needed for number of -point circular convolutions is
.
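The two building blocks named above can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the function names `circular_convolve` and `radix2_fft_complex_mults` are hypothetical, the taps `h` stand in for the fixed real-valued filter vectors that are stored offline, and the operation count is the textbook radix-2 figure of (n/2) log2(n) complex multiplications, which is assumed here rather than taken from Table II.

```python
import numpy as np


def circular_convolve(x, h):
    """Direct circular convolution: y[n] = sum_k h[k] * x[(n - k) mod m]."""
    x = np.asarray(x, dtype=complex)
    h = np.asarray(h, dtype=float)  # real-valued taps, assumed fixed and precomputed
    m = len(x)
    y = np.zeros(m, dtype=complex)
    for n in range(m):
        for k in range(m):
            # real tap times complex sample: cheaper than a full complex product
            y[n] += h[k] * x[(n - k) % m]
    return y


def radix2_fft_complex_mults(n):
    """Textbook radix-2 count: (n / 2) * log2(n) complex multiplications."""
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of 2")
    return (n // 2) * (n.bit_length() - 1)
```

Because the taps are real, every product in the inner loop is a real-by-complex multiplication, which is the property the text relies on when counting the cost of the circular convolutions.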
In contrast to the MF and ZF receivers, in the MMSE receiver,
the vectors 's are not fixed and depend on the signal-to-noise
ratio (SNR). Hence, they need to be calculated in real-time. To
this end, as highlighted in Section IV.D, in AWGN channels,
those operations can be performed by using -point DFT and
IDFT operations. Due to the fact that
is a
real-valued diagonal matrix, its inversion and multiplication to
only needs CMs. The resulting diagonal matrix
is multiplied into an
vector which needs
CMs. Since
is not necessarily a power of 2, complexity
of -point DFT and IDFT operations in the implementation
of the circular convolutions is considered as
. Obviously, if
is a power of 2, a further complexity reduction by taking
advantage of FFT and IFFT algorithms is possible. Therefore,
the complexity of our proposed MMSE receiver only differs
from the MF and ZF ones in the implementation of the circular
convolution operations.
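A minimal sketch of the two operations discussed in this paragraph is given below, assuming NumPy. The helper names (`dft_matrix`, `circular_convolve_dft`, `apply_inverse_diagonal`) are hypothetical and do not come from the paper. The sketch uses explicit DFT matrices so that it works for any length, including lengths that are not a power of 2; a dense m-point DFT matrix product costs m² multiplications, which is why an FFT is preferable whenever the length allows it.

```python
import numpy as np


def dft_matrix(m):
    """m-by-m DFT matrix; valid for any m, not only powers of 2."""
    k = np.arange(m)
    return np.exp(-2j * np.pi * np.outer(k, k) / m)


def circular_convolve_dft(x, h):
    """Circular convolution as IDFT(DFT(x) * DFT(h)) using explicit matrices."""
    m = len(x)
    f = dft_matrix(m)
    spectrum = (f @ np.asarray(x)) * (f @ np.asarray(h))
    return np.conj(f) @ spectrum / m  # the IDFT matrix is conj(F) / m


def apply_inverse_diagonal(diag, v):
    """Compute inv(diag(diag)) @ v with one division per entry."""
    return np.asarray(v) / np.asarray(diag)
```

The last helper illustrates why a diagonal matrix is cheap to invert and apply: the cost grows linearly with the vector length instead of cubically as for a general matrix inverse.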
Table II also presents the complexity of the direct MF, ZF and
MMSE detection techniques, i.e., direct matrix multiplications
and solutions to (9) and (10), respectively. Those solutions
involve direct inversion of an
matrix which has
the complexity of
and two vector by matrix multiplications with the computational burden of
CMs. To
reduce receiver complexity, a number of solutions are proposed
in the literature, [15], [18], and their complexity is presented in
Table II. These solutions in essence are based on matrix sparsification techniques and their complexity depends on the choice
of prototype filter. However, the complexity of our proposed solutions is independent of the prototype filter choice.

⁶ ZF and MMSE filters for each subcarrier overlap with more than
only the two adjacent subcarriers in frequency domain, in contrast to the MF
filter, [15]. Based on the results of [15], depends on the choice of prototype
filter and can be as large as 16.
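To illustrate why exploiting circulant structure avoids a direct matrix inversion, the sketch below solves a linear system with a plain (scalar) circulant matrix through the FFT. This is a deliberate simplification and an assumption of this example: the paper works with block-circulant matrices and block diagonalization, whereas the hypothetical helpers `circulant` and `solve_circulant` below cover only the single-block case, where the DFT diagonalizes the matrix and its eigenvalues are the FFT of its first column.

```python
import numpy as np


def circulant(c):
    """Build the circulant matrix whose first column is c."""
    c = np.asarray(c)
    m = len(c)
    idx = (np.arange(m)[:, None] - np.arange(m)[None, :]) % m
    return c[idx]


def solve_circulant(c, b):
    """Solve circulant(c) @ x = b using the eigenvalues fft(c).

    Assumes circulant(c) is invertible, i.e. fft(c) has no zero entry.
    """
    return np.fft.ifft(np.fft.fft(b) / np.fft.fft(c))
```

A short usage example: with `c = [4, 1, 0, 0, 1]` the matrix is invertible, and `solve_circulant(c, b)` agrees with `np.linalg.solve(circulant(c), b)` up to rounding, while needing only FFTs and an elementwise division.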
The complexity formulas that are presented in Table II are
evaluated and plotted in Figs. 5 and 6 for different values of
overlapping factor
and
for the
receiver that is proposed in [18]. Based on the results of [18],
and
are considered for the SIC receiver.
and
are considered for the MF and ZF receivers proposed
in [15], respectively. For MMSE reception in AWGN channels,
the MMSE filter coefficients can be efficiently calculated using
the results of [19] and then the receiver structure of [15] with
can be exploited. Due to the fact that the complexity of
MF, ZF and MMSE receivers with direct matrix inversion and
multiplications is prohibitively high compared with other techniques (the difference is in the level of orders of magnitude),
they are not presented in Figs. 5 and 6. However, to quantify the
amount of complexity reduction that our proposed techniques
provide for AWGN channel, in the case of
and
, our proposed MF/ZF receiver is three orders of magnitude and the proposed MMSE receiver is six orders of magnitude simpler than the direct ones, respectively, in terms of the
required number of CMs. As Fig. 5 depicts, our proposed ZF receiver is around an order of magnitude and 2 to 5 times simpler
than the proposed receiver structures in [18] and [15], respectively. Our proposed MF receiver is around two times simpler
than the one in [15]. In addition, our proposed MMSE receiver
has around 2 to 3 times lower complexity than the ones in [18]
and [15], [19]. Apart from lower computational cost compared
with the existing receiver structures, our techniques maintain
the optimal ZF and MMSE performance as they are direct. Furthermore, their complexity is independent of the prototype filter
choice as opposed to the existing solutions. Finally, the ZF and
MMSE receivers that we are proposing are closer in complexity
to OFDM as compared to the receiver structures in [18] and [15]
that are over an order of magnitude more complex than OFDM.
Fig. 6 compares the complexity of our proposed MF and ZF
receivers with the ones proposed in [18] and [15] in the presence
of the multipath channel where the channel equalization
complexity is considered. As the figure depicts, our proposed
ZF receiver in this case is 4 and 1.5 times simpler than the SIC
receiver of [18] and the proposed structure in [15], respectively.
Finally, our proposed ZF receiver is only around 4 times more
complex than OFDM receiver in presence of multipath channel.

Fig. 5. Computational complexity comparison of different GFDM receiver techniques with respect to each other and that of OFDM receiver in AWGN channel.

Fig. 6. Computational complexity comparison of different GFDM receiver techniques with respect to each other and that of OFDM receiver in presence of multipath channel.

VI. NUMERICAL RESULTS
In this section, we present the bit error rate (BER) performance of our proposed ZF and MMSE techniques in presence of
AWGN and multipath channels. The multipath channel COST
207, [22], for typical urban area with 12 taps is considered. The
CP is chosen long enough to accommodate the wireless channel
delay spread. A root-raised cosine prototype filter with roll-off
factor of
is used in all the simulations. Each GFDM
data block is comprised of
subcarriers and
symbols. In all the simulations of this section, the proposed low
complexity transmitter is used for evaluating performance of the
proposed receivers. Direct implementation of the transmitter,
i.e., (3) is used for calculating the BER of the direct solutions.
Each point in our BER curves is calculated based on 10 000
simulation runs.
Performance of the proposed ZF and MMSE solutions in
AWGN channel is investigated and compared with that of the
direct solutions⁷ in Figs. 7 and 8. It is worth mentioning that uncoded 64-QAM modulation scheme is considered in these simulations. As the figures show, the proposed techniques provide
the optimal ZF and MMSE performance with orders of magnitude lower computational complexity than the direct solutions.
As mentioned in Section IV.D, the presence of the wireless
channel in (10) limits our proposed MMSE solution to AWGN
channels. Therefore, in Fig. 9, we evaluate the BER performance of our proposed ZF technique and compare it with the
direct ZF solution in presence of multipath channel. In the
BER results shown in Fig. 9, 16-QAM modulation scheme
with convolutional coding and the code rate of 1/2 is considered. Based on our results, the BER curve of the proposed ZF
technique coincides with that of the direct ZF. However, the
MMSE receiver is superior to the ZF one in terms of BER
performance. This is due to the noise amplification problem of
ZF receiver. Performance superiority of MMSE receiver over
the ZF one, i.e., around 2 dB, comes at the expense of a substantial
amount of computational burden. Direct MMSE receiver of
(10) is 7 orders of magnitude more complex than the proposed
ZF receiver.

⁷ Direct solution involves direct inversion and multiplication of the matrices
involved, i.e., direct calculation of (9) and (10).

Fig. 7. BER performance of our proposed ZF technique compared with the direct ZF solution for uncoded 64-QAM modulation scheme in AWGN channel.

Fig. 8. BER performance of our proposed MMSE technique compared with the direct MMSE solution for uncoded 64-QAM modulation scheme in AWGN channel.

Fig. 9. BER performance of our proposed ZF technique compared with the direct ZF and MMSE solutions for 16-QAM modulation scheme having convolutional coding with the code rate of 1/2 in presence of multipath channel.

VII. CONCLUSION
In this paper, we proposed low complexity modulation and
demodulation techniques for GFDM systems. The proposed
techniques exploit the special structure of the modulation
matrix to reduce the computational cost without incurring any
performance loss penalty. In our proposed transmitter, block
DFT and IDFT matrices were used to make the modulation
matrix sparse and hence reduce the computational burden. We
designed low complexity MF, ZF and MMSE demodulators by
block diagonalization of the matrices involved. It was shown
that through this block diagonalization, a substantial amount
of complexity reduction in the matrix inversion and multiplication operations can be achieved. A unified demodulator
structure based on MF, ZF and MMSE criteria was derived.
The closed form expressions for the ZF and MMSE receiver
filters were also obtained. We also analyzed and compared the
computational complexities of our techniques with the existing
ones known so far to have the lowest complexity. We have
shown that all the proposed techniques in this paper involve
lower computational cost than the existing low complexity
techniques [15], [18], [20]. For instance, over an order of
magnitude complexity reduction can be achieved through our
ZF receiver compared with the proposed technique in [18].
Such a substantial reduction in the amount of computations that
are involved makes our proposed modem structures attractive
for hardware implementation of the real time GFDM systems.

APPENDIX A
DERIVATION OF
The key idea in the derivation of
is based on the fact
that inner product of two complex exponential signals with different frequencies is zero.
(A.1)
From the definitions of
and
can be obtained as
where 's are
block matrices that can be mathematically shown as
(A.2)
where
. Based on the definition of
and (A.1)
we have
(A.3)
where
's are
vectors and
is a diagonal matrix whose main
diagonal elements are made up of
concatenated copies of the
vector . From (A.3) and (A.1), 's can be obtained as
(A.4)
Accordingly, it can be perceived that the block matrices 's
and hence the matrix are sparse. The matrix
has only
non-zero elements which are located on the circularly equidistant columns
. The elements of
two consecutive non-zero columns of
are circularly shifted
copies of each other. For instance, the second non-zero column
of
is a circularly shifted version of the first non-zero one by
one sample. From (A.4), the first non-zero column of
can be
derived as
which is the circularly folded version of the th polyphase component of the
prototype filter. One can further deduce that the matrix is a
real one consisting of the prototype filter coefficients.

APPENDIX B
CLOSED FORM DERIVATION OF
The polyphase components of the prototype filter
can be defined as the vectors
where
. As it is shown in
Appendix A,
is a sparse matrix with only
non-zero elements in each column. The elements of can be
mathematically represented as
(B.1)
where
is circularly folded version of
and
. From (B.1), it can be deduced that each
group of
consecutive rows of , i.e., 's, whose non-zero
elements are comprised of the elements of the vectors
's,
is mutually orthogonal to the other ones. This is due to the
fact that the sets of column indices of
's with non-zero
elements are mutually exclusive with respect to each other.
The block-diagonal matrix , as derived earlier in (16), can be
calculated as
which can be rearranged as
.
Due to orthogonality of
's with respect to each other,
i.e.,
, it can be discerned that
has a
block-diagonal structure. Based on (B.1), only equidistant
columns of
's with circular distance of
are non-zero
and two consecutive and non-zero columns are circularly
shifted copies of each other with one sample. As a case
in point, consider
and (B.1). Therefore, the elements
and
illustrate that the consecutive and non-zero columns of
are
circularly shifted versions of each other. Using (B.1), one can
conclude that the same property holds for the other non-zero
columns of
and all the other 's.
The goal here is to derive a closed form for .
can be obtained as
(B.2)
is an
matrix comprised of
submatrices
which are all zero except the ones located on the main diagonal,
i.e.,
. From (B.1), it can be understood that the first
non-zero columns of the matrices and
are equal to
and
, respectively and the rest of their non-zero columns
are circularly shifted versions of their first non-zero column. Removing zero columns of 's
(B.3)
where
and
are circulant matrices with the first columns
equal to
and
, respectively. Since
and
are
real and circulant,
is also a real and circulant matrix which
(B.4)

REFERENCES

[1] T. Starr, J. M. Cioffi, and P. J. Silverman, Understanding Digital Subscriber Line Technology. Upper Saddle River, NJ, USA: Prentice-Hall PTR, 1999.
[2] Y. G. Li and G. L. Stuber, Orthogonal Frequency Division Multiplexing for Wireless Communications. New York, NY, USA: Springer-Verlag, 2006.
[3] A. Farhang, M. Kakhki, and B. Farhang-Boroujeny, "Wavelet-OFDM versus filtered-OFDM in power line communication systems," in Proc. 5th Int. Symp. Telecommun. (IST), Dec. 2010, pp. 691–694.
[4] M. Morelli, C. C. J. Kuo, and M. O. Pun, "Synchronization techniques for orthogonal frequency division multiple access (OFDMA): A tutorial review," Proc. IEEE, vol. 95, no. 7, pp. 1394–1427, Jul. 2007.
[5] K. Lee, S.-R. Lee, S.-H. Moon, and I. Lee, "MMSE-based CFO compensation for uplink OFDMA systems with conjugate gradient," IEEE Trans. Wireless Commun., vol. 11, no. 8, pp. 2767–2775, Aug. 2012.
[6] A. Farhang, N. Marchetti, and L. Doyle, "Low complexity LS and MMSE based CFO compensation techniques for the uplink of OFDMA systems," in Proc. IEEE Int. Conf. Commun. (ICC), Jun. 2013, pp. 5748–5753.
[7] A. Farhang, A. J. Majid, N. Marchetti, L. E. Doyle, and B. Farhang-Boroujeny, "Interference localization for uplink OFDMA systems in presence of CFOs," in Proc. IEEE Wireless Commun. Netw. Conf. (WCNC), Apr. 2014, pp. 1030–1035.
[8] G. Fettweis, M. Krondorf, and S. Bittner, "GFDM - Generalized frequency division multiplexing," in Proc. IEEE Veh. Technol. Conf. (VTC Spring), Apr. 2009, pp. 1–4.
[9] A. Tonello and M. Girotto, "Cyclic block FMT modulation for broadband power line communications," in Proc. IEEE Int. Symp. Power Line Commun. Its Appl. (ISPLC), Mar. 2013, pp. 247–251.
[10] H. Lin and P. Siohan, "An advanced multi-carrier modulation for future radio systems," in Proc. IEEE Int. Conf. Acoust., Speech Signal Process. (ICASSP), May 2014, pp. 8097–8101.
[11] M. Renfors, J. Yli-Kaakinen, and F. Harris, "Analysis and design of efficient and flexible fast-convolution based multirate filter banks," IEEE Trans. Signal Process., vol. 62, no. 15, pp. 3768–3783, Aug. 2014.
[12] A. Farhang, N. Marchetti, L. Doyle, and B. Farhang-Boroujeny, "Filter bank multicarrier for massive MIMO," in Proc. IEEE Veh. Technol. Conf. (VTC Fall), Sep. 2014, pp. 1–7.
[13] B. Farhang-Boroujeny, "OFDM versus filter bank multicarrier," IEEE Signal Process. Mag., vol. 28, no. 3, pp. 92–112, 2011.
[14] R. Datta and G. Fettweis, "Improved ACLR by cancellation carrier insertion in GFDM based cognitive radios," in Proc. IEEE Veh. Technol. Conf. (VTC Spring), May 2014, pp. 1–5.
[15] N. Michailow, M. Matthe, I. Gaspar, A. Caldevilla, L. Mendes, A. Festag, and G. Fettweis, "Generalized frequency division multiplexing for 5th generation cellular networks," IEEE Trans. Commun., vol. 62, no. 9, pp. 3045–3061, 2014.
[16] N. Michailow, S. Krone, M. Lentmaier, and G. Fettweis, "Bit error rate performance of generalized frequency division multiplexing," in Proc. IEEE Veh. Technol. Conf. (VTC Fall), Sep. 2012, pp. 1–5.
[17] R. Datta, N. Michailow, M. Lentmaier, and G. Fettweis, "GFDM interference cancellation for flexible cognitive radio PHY design," in Proc. IEEE Veh. Technol. Conf. (VTC Fall), Sep. 2012, pp. 1–5.
[18] I. Gaspar, N. Michailow, A. Navarro, E. Ohlmer, S. Krone, and G. Fettweis, "Low complexity GFDM receiver based on sparse frequency domain processing," in Proc. IEEE Veh. Technol. Conf. (VTC Spring), Jun. 2013, pp. 1–6.
[19] M. Matthe, L. Mendes, and G. Fettweis, "Generalized frequency division multiplexing in a Gabor transform setting," IEEE Commun. Lett., vol. 18, no. 8, pp. 1379–1382, Aug. 2014.
[20] N. Michailow, I. Gaspar, S. Krone, M. Lentmaier, and G. Fettweis, "Generalized frequency division multiplexing: Analysis of an alternative multi-carrier technique for next generation cellular systems," in Proc. Int. Symp. Wireless Commun. Syst. (ISWCS), 2012, pp. 171–175.
[21] T. De Mazancourt and D. Gerlic, "The inverse of a block-circulant matrix," IEEE Trans. Antennas Propag., vol. 31, no. 5, pp. 808–810, Sep. 1983.
[22] Digital Land Mobile Radio Communications, Office for Off. Publ. Eur. Commun., Final Rep., Luxembourg, 1989, COST 207.


Arman Farhang (M'13) received the B.Sc. degree
in telecommunications engineering from Azad
University of Najafabad, Iran, in 2007. He received
the M.Sc. degree in telecommunications engineering
from Sadjad University of Technology, Mashhad,
Iran, in 2010.
Currently, he is pursuing the Ph.D. degree in the
Irish National Telecommunications Research Centre
(CTVR/CONNECT) at Trinity College Dublin,
Ireland. His research interests include wireless
communications, digital signal processing for communications, multiuser communications, and multicarrier systems.

Nicola Marchetti (M'13–SM'15) received the M.Sc.
degree in electronic engineering from the University
of Ferrara, Italy, in 2003. He received the Ph.D. degree in wireless communications in 2007, and also
received the M.Sc. degree in mathematics, in 2010,
both from Aalborg University, Denmark.
He is currently an Assistant Professor at Trinity
College Dublin, Ireland, where he holds the Ussher
Lectureship in Wireless Communications, and is a
member of the Irish National Telecommunications
Research Centre (CTVR/CONNECT). He worked
as a Research Assistant at the University of Ferrara during 2003–2004. He
then was a Ph.D. student during 2004–2007, and a Research and Teaching
Postdoctoral researcher during 2007–2010 at Aalborg University. His former
collaborations include research projects in cooperation with Samsung, Nokia
Siemens Networks, Huawei, Intel Mobile Communications, among others. His
research interests include: 5G Wireless Communication Systems, Cognitive
Radio and Dynamic Spectrum Access, Complex Systems Science, Integrated
Optical-Wireless Networks, Multiple Antenna Systems, Radio Resource Management, Small Cells and HetNets, and Waveforms. He has authored 60 refereed
journal and conference papers, holds two patents, and has written two books and
four book chapters.

Linda E. Doyle (SM'00) received the B.Sc. degree
in electrical engineering from University College
Cork, Ireland, in 1989, and the M.Sc. and Ph.D.
degrees from Trinity College Dublin, Ireland, in
1992 and 1996, respectively.
She is a Professor of Engineering and The Arts
at Trinity College, University of Dublin. She is the
Director of CTVR/CONNECT, an SFI Research
center, focused on future networks and communications. CTVR/CONNECT is headquartered in Trinity
College, and comprises 10 academic institutions in
total and has more than 40 industry partners. Her expertise is in the fields of
wireless communications, cognitive radio, reconfigurable networks, spectrum
management, and creative arts practices. She has published widely in these
domains and leads a large research team within CTVR/CONNECT.
Prof. Doyle is a member of the Ofcom Spectrum Advisory Board in the UK.
She is a Fellow of Trinity College Dublin.