
FPGA Implementation of a LDPC Decoder using a Reduced Complexity Message Passing Algorithm


Vikram Arkalgud Chandrasetty and Syed Mahfuzul Aziz
School of Electrical & Information Engineering, University of South Australia,
Mawson Lakes, SA 5095, Australia
Email: vikramac@ieee.org, mahfuz.aziz@unisa.edu.au



Abstract: In this paper, a simplified message passing
algorithm for decoding Low-Density Parity-Check (LDPC)
codes is proposed with a view to reduce the implementation
complexity. The algorithm is based on simple hard-decision
decoding techniques while utilizing the advantages of soft
channel information to improve decoder performance. It
has been validated through simulation using LDPC code
compliant with Wireless Local Area Network (WLAN
IEEE 802.11n) standard. The results show that the
proposed algorithm can achieve significant improvement in
bit error rate (BER) performance and average decoding
iterations compared to fully hard-decision based decoding
algorithms. The proposed algorithm has been implemented
and tested on Xilinx Virtex 5 FPGA. With significantly
reduced hardware resources, the implemented decoder can
achieve an average throughput of ~16.2 Gbps with a BER
performance of $10^{-5}$ at an $E_b/N_o$ of 6.25 dB.

Index Terms: Digital communication, error correction
coding, logic design, field programmable gate array

I. INTRODUCTION
Low-Density Parity-Check (LDPC) codes were
first proposed by Gallager in 1962 [1]. They were not
popular for a few decades after their introduction due to
high implementation complexity. However, they gained
popularity after they were formally re-introduced by MacKay
and Neal in 1997 [2]. It has been shown that LDPC
codes, when optimally designed, have the capability to
perform very close to the Shannon Limit [3]. LDPC
codes have several advantages over turbo codes,
including reduced implementation complexity, better bit
error rate (BER) performance at low signal to noise ratio
(SNR) and an inherent code structure that supports a high
degree of parallelism [4]. Hence LDPC codes have
become increasingly popular and have been adopted in
the latest generation of high data rate applications such as
WLAN [5], WiMax [6] and Digital Video Broadcasting -
Second Generation (DVB-S2) [7, 8].
A number of algorithms with varying complexity and
performance have been proposed for LDPC decoding [9-11].
However, achieving a balanced trade-off between
decoding performance (such as BER and number of
iterations) and implementation complexity remains a
challenge [12, 13]. The Sum-Product Algorithm
(SPA), which is based on soft-decision decoding, achieves
the best decoding performance but has very high complexity
[14]. Many modifications have been proposed to simplify
the node operations in SPA. The check nodes are
simplified by reducing the non-linear function to an
approximated quantization table [15, 16] and even to
logarithmic functions [17]. But the reduction in
implementation complexity achievable by using
quantization table or logarithmic functions appears to be
insignificant. The check node operation of SPA is
significantly simplified in the min-sum algorithm [14].
However, it requires high precision quantized messages
to be exchanged between the processing nodes to achieve
good BER performance. In contrast, the Bit-Flip
algorithm (BFA) [3], which is based on hard-decision
decoding, has the least complexity but suffers from poor
performance. A number of modifications have been
proposed to improve its performance [18-21]. The
improved bit-flipping technique presented in [18]
requires dynamic computation of probabilities for
bit-flipping at the variable node. Similarly, the weighted
bit-flip (WBF) based algorithms [19-21] require updating of
reliability values during the decoding process. Hence
these algorithms require relatively complex operations
compared to BFA and can achieve only modest
improvement in decoding performance.
In this paper, a low complexity LDPC decoding
algorithm is proposed that trades off implementation
complexity, relative to fully soft-decision based
algorithms (SPA), against decoding performance,
relative to fully hard-decision based
algorithms (BFA). The algorithm is based on a simple
hard-decision message passing technique to reduce the
complexity [22]. However, the variable node uses soft
inputs and performs a distinct operation to improve the
decoding performance. With a slight increase in
complexity of variable node operation, the proposed
algorithm not only improves the BER performance
compared to BFA, but also reduces the average iterations
required for decoding. The algorithm has been
implemented on FPGA and the LDPC decoder
performance is analyzed.
36 JOURNAL OF NETWORKS, VOL. 6, NO. 1, JANUARY 2011
2011 ACADEMY PUBLISHER
doi:10.4304/jnw.6.1.36-45
The rest of the paper is organized as follows: an
overview of LDPC decoding is provided in section II.
Section III presents LDPC decoding algorithms that
contrast in complexity and performance. Section IV
discusses the proposed algorithm and its node operations.
Section V provides performance simulation results of the
proposed algorithm. Finally, FPGA implementation
details, hardware performance results and analysis are
presented in section VI, followed by conclusions in
section VII.
II. OVERVIEW OF LDPC DECODING
LDPC codes belong to a class of block codes [23]. As
their name suggests, their parity-check matrix (H) contains
a very small number of non-zero elements. The
sparseness of H determines the decoding complexity and
the minimum distance of the code. Apart from the
requirement that the LDPC matrix be sparse, there is no
other difference between an LDPC code and any other
block code [24]. An LDPC matrix is described by various
parameters, which are briefly described here. An encoded
message, also known as a codeword, consists of the useful
information bits and the redundant bits. The code rate
is the ratio of the number of useful information
bits to the number of codeword bits. The number of
non-zero entries in each of the rows and columns of the H matrix
is collectively known as the degree distribution. An H matrix
is said to be regular if the degree distributions of its rows and
columns are uniform; otherwise it is irregular. The H
matrix can be represented as a graph called Tanner
graph. A cycle in the graph is a sequence of connected
nodes, which start and end at the same node. The girth or
the smallest cycle in the parity-check graph, significantly
contributes to the performance of the iterative decoding
algorithms [25]. A regular (3, 6) parity-check matrix with
10-bit code length is shown in Fig. 1 (a) and the
corresponding Tanner graph representation of the parity-
check matrix is shown in Fig. 1 (b).

(a) An LDPC matrix


(b) Tanner graph representation of LDPC matrix

Figure 1. A rate 1/2 (3, 6) 10-bit LDPC code
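The terms defined above (degree distribution, regularity and code rate) can be checked mechanically on a small matrix. The sketch below is only an illustration: the matrix it builds is a hypothetical (3, 6) regular 5 x 10 example, not the matrix of Fig. 1, and the function names are assumptions introduced here. The rate it reports is the design rate, which equals the true code rate only if the rows of H are linearly independent.

```python
from itertools import combinations


def build_example_h(num_checks=5, col_weight=3):
    """Build a hypothetical regular H (NOT the matrix of Fig. 1).

    Each column is one distinct col_weight-subset of the check rows, so with
    5 checks and weight 3 there are C(5,3) = 10 columns of weight 3 and every
    row has weight C(4,2) = 6.
    """
    subsets = list(combinations(range(num_checks), col_weight))
    h = [[0] * len(subsets) for _ in range(num_checks)]
    for col, rows in enumerate(subsets):
        for row in rows:
            h[row][col] = 1
    return h


def degree_distribution(h):
    """Return (row weights, column weights) of a binary matrix."""
    row_weights = [sum(row) for row in h]
    col_weights = [sum(col) for col in zip(*h)]
    return row_weights, col_weights


def design_rate(h):
    """Design rate (N - M) / N for an M x N parity-check matrix."""
    m, n = len(h), len(h[0])
    return (n - m) / n


H = build_example_h()
ROW_WEIGHTS, COL_WEIGHTS = degree_distribution(H)
IS_REGULAR = len(set(ROW_WEIGHTS)) == 1 and len(set(COL_WEIGHTS)) == 1

if __name__ == "__main__":
    print("row weights:", ROW_WEIGHTS)
    print("column weights:", COL_WEIGHTS)
    print("regular:", IS_REGULAR, "design rate:", design_rate(H))
```

For any (3, 6) regular code the design rate is 1 - 3/6 = 1/2, which is why the codes used in this paper are rate 1/2.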

The decoding of LDPC codes involves passing
messages between the nodes along the edges of the
Tanner graph. This class of decoding algorithms is often
called message passing algorithms [26]. Each of the
nodes in the Tanner graph works in isolation, using only the
information available along its connected edges.
These decoding algorithms pass
messages between the nodes a fixed number of times
or until the parity check is satisfied. Hence such algorithms are
also known as iterative algorithms [25]. LDPC decoding
algorithms generally operate by making either a hard
decision or a soft decision on the messages received from
the noisy channel. In the former case, a binary hard
decision is made on the data received from the channel
and then passed to the decoder, e.g. the Bit-Flip Algorithm
(BFA). In soft-decision based algorithms, the
input data to the decoder are the channel probabilities
represented as a logarithmic ratio, known as the
log-likelihood ratio (LLR). The messages passed between
the nodes in the decoder are also soft messages; e.g.
Belief Propagation based algorithms use soft LLR input
for decoding [27]. It is known that decoders using
soft-decision methods perform better than those using
hard decisions, due to their ability to correct errors based on
the bit probabilities [28].
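The difference between the two kinds of decoder input can be illustrated with a minimal sketch. It assumes BPSK with bit 0 mapped to +1 and bit 1 mapped to -1 over an AWGN channel with noise standard deviation sigma, for which the channel LLR is 2y/sigma^2; the function names are hypothetical and are not taken from the paper.

```python
def hard_decision(y):
    """Binary hard decision on a received sample (assumes 0 -> +1, 1 -> -1)."""
    return 0 if y >= 0 else 1


def channel_llr(y, sigma):
    """LLR ln(P(bit=0 | y) / P(bit=1 | y)) for BPSK over AWGN."""
    return 2.0 * y / (sigma ** 2)


if __name__ == "__main__":
    sigma = 0.5
    for y in (0.05, 0.9, -0.7):
        print(f"y = {y:+.2f}  hard = {hard_decision(y)}  LLR = {channel_llr(y, sigma):+.2f}")
```

The samples 0.05 and 0.9 give the same hard decision, but their LLRs differ by a factor of 18. A hard-decision decoder discards this reliability information, while a soft-decision decoder keeps it.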
III. LDPC DECODING ALGORITHMS
This section presents two algorithms at opposite ends of
the trade-off: the highly complex Sum-Product algorithm,
which achieves very good performance, and the low
complexity Bit-Flip algorithm, which suffers from poor
performance.

A. Sum-Product Algorithm
The Sum-Product algorithm for LDPC decoding is a
soft-decision message-passing algorithm that requires
LLR (intrinsic message) for variable node operations to
make decoding decisions. To begin with the decoding
process, the LLRs are passed over to the variable nodes.
The variable nodes (V) perform the sum operations on
the input LLRs, as in (1) and the computed (extrinsic)
messages are passed along the connected edges to the
check nodes (C) [14].


SPA Variable node operation:

$$V_i = \mathrm{LLR}_n + \sum_{j \neq i} C_j \qquad (1)$$
where, n = 1, 2, ..., number of variable nodes
i, j = 1, 2, ..., degree of variable node

The operation performed by the check nodes (C) is
given in (2) [14]. The output messages ($C_k$) are passed to
the respective variable nodes. The check nodes also
perform the parity check operation. This process is
repeated until the maximum number of iterations is reached
or the parity check is satisfied.


SPA Check node operation:
$$C_k = 2 \tanh^{-1}\left( \prod_{l \neq k} \tanh\left( \frac{V_l}{2} \right) \right) \qquad (2)$$
where, l, k = 1, 2, ..., degree of check node

It can be noted that the SPA requires multiple non-
linear operations in the check node and also requires high
precision extrinsic messages to be exchanged between the
nodes. This represents high computational complexity.
However, the SPA can achieve very good decoding
performance [14].
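The two SPA node updates can be sketched in floating point as follows. This is an illustrative reading of the standard sum-product rules (each outgoing message excludes the incoming message on the same edge), not the authors' implementation; the function names and the clipping constant are assumptions added to keep the inverse hyperbolic tangent finite.

```python
import math


def spa_variable_update(llr, check_msgs):
    """Variable node: channel LLR plus all incoming check messages except
    the one that arrived on the edge being answered."""
    total = llr + sum(check_msgs)
    return [total - c for c in check_msgs]


def spa_check_update(var_msgs, clip=0.999999):
    """Check node: 2 * atanh of the product of tanh(V/2) over the other edges."""
    tanhs = [math.tanh(v / 2.0) for v in var_msgs]
    out = []
    for k in range(len(tanhs)):
        prod = 1.0
        for l, t in enumerate(tanhs):
            if l != k:
                prod *= t
        # Assumption: clip to avoid atanh(+-1) = +-infinity for very large inputs.
        prod = max(-clip, min(clip, prod))
        out.append(2.0 * math.atanh(prod))
    return out


if __name__ == "__main__":
    print(spa_variable_update(1.0, [0.5, -0.2, 0.3]))
    print(spa_check_update([2.0, -2.0, 2.0]))
```

The check node needs one tanh per input and one atanh per output, all on multi-bit values, which is the non-linear cost referred to in the text.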

B. Bit-Flip Algorithm
The Bit-Flip algorithm is based on hard-decision
message-passing technique. A binary hard-decision is
done on the received channel data and then passed to the
decoder. The messages passed between the check node
and variable nodes are also single-bit hard-decision
binary values. The variable node (V) sends the bit
information to the connected check nodes (C) over the
edges. The check node performs a parity check operation
on the bits received from the variable nodes (Eq. 3) [18].
It sends the message back to the respective variable nodes
with a suggestion of the expected bit value for the parity
check to be satisfied.
BFA Check node operation:
$$C_k = V_1 \oplus V_2 \oplus \cdots \oplus V_l, \quad l \neq k \qquad (3)$$
where, l, k = 1, 2, ..., degree of check node

The variable node (V) receives a set of responses, i.e. the
suggested bit values, from the check nodes (C). Based on
the majority of the suggested bit values, the variable node
flips the current bit (Eq. 4) [18] or retains the original
value. This operation is repeated until the parity check is
satisfied or the maximum number of iterations is reached.

BFA Variable node operation:

$$V_n = \begin{cases} 1, & \text{if } \mathrm{majority}(C_i) = 1 \\ 0, & \text{if } \mathrm{majority}(C_i) = 0 \\ V_n, & \text{otherwise} \end{cases} \qquad (4)$$
where, n = 1, 2, ..., number of variable nodes
i = 1, 2, ..., degree of variable node

Clearly the BFA has simple check node and variable
node operations, thus making it a very low complexity
decoding algorithm compared to the SPA presented
above. But this advantage comes with a poor decoding
performance [18].
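A minimal sketch of the two BFA node operations is given below, assuming single-bit messages stored as Python integers 0 and 1. The function names are hypothetical. The check node uses the identity that the XOR of all the other bits equals the XOR of all bits XORed with the excluded bit, so one overall parity serves every outgoing message.

```python
def bfa_check_update(var_bits):
    """Return (suggested bit for each connected variable node, parity satisfied).

    The suggestion for edge k is the XOR of the bits on all other edges,
    i.e. the value bit k must take for the parity check to be satisfied.
    """
    total = 0
    for bit in var_bits:
        total ^= bit
    return [total ^ bit for bit in var_bits], total == 0


def bfa_variable_update(current_bit, check_bits):
    """Follow the majority of the suggestions; keep the current bit on a tie."""
    ones = sum(check_bits)
    zeros = len(check_bits) - ones
    if zeros > ones:
        return 0
    if ones > zeros:
        return 1
    return current_bit


if __name__ == "__main__":
    print(bfa_check_update([1, 0, 1, 1]))
    print(bfa_variable_update(1, [0, 0, 1]))
```

Both operations reduce to XOR gates and a small counter, which is why the hardware cost of BFA is so low.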


IV. SIMPLIFIED MESSAGE PASSING ALGORITHM
It is well known that SPA can achieve good decoding
performance [14] but with high implementation
complexity, and BFA has low implementation complexity
but poor performance [18]. The main aim of the proposed
Simplified Message Passing Algorithm (SMPA) is to
achieve a trade-off between the decoding performance
and implementation complexity of the above two
algorithms. The check node and variable node operations
of the proposed SMPA are presented next.

A. Check-Node operation
The complexity of a message passing algorithm
significantly depends on the quantization length of
extrinsic messages and the check node operation. These
aspects are particularly critical in case of hardware
implementation of large LDPC codes. In order to reduce
the complexity of SMPA, the check node consists of a
simple parity check operation (Eq. 5) requiring XOR
logic only, similar to BFA. However, the performance
improvement of SMPA over BFA is achieved from a
distinct variable node (V) operation.
SMPA Check node operation:
$$C_k = V_1 \oplus V_2 \oplus \cdots \oplus V_l, \quad l \neq k \qquad (5)$$
where, l, k = 1, 2, ..., degree of check node

Note that the stochastic [29] and binary message-passing
[30-32] based LDPC decoders also incorporate a similar
check node operation requiring simple XOR logic.
However, these techniques propose using serialized
messages between the variable and check nodes, and
therefore require extra hardware (e.g. FIFO) and
additional clock cycles that slow down the process
substantially. In comparison, the SMPA proposed in this
paper uses only single-bit messages, where serialization
is not required.
B. Variable-Node operation
A fully hard-decision based decoding algorithm suffers
from poor performance because of the hard-decision
intrinsic and extrinsic messages used in the decoding
process. In SMPA, the performance improvement is
achieved by using soft LLR input for decoding, like any
other soft-decision based algorithms. The variable node
(V) performs sum operation similar to SPA, but the
difference is that SMPA requires original LLR value only
at the beginning of the decoding cycle, as in (6). The
updated new LLR value is used in subsequent iterations,
as in (9). In the analysis of hard-decision based channels
presented in [33] the variable node operates directly on
the hard-decision bit received from the check nodes. In
contrast, the proposed SMPA maps the bit suggestion
from the check nodes (C) to an optimized integer constant
called Weight (W), which is determined from
simulations to achieve the best possible BER
performance. It is either added to or subtracted from the
current LLR value. For example, at a variable node if
binary '0' is received from the check node, it is mapped
to +W and if it is binary '1' it is mapped to -W, as in (7)
and (8) respectively. After this operation, a hard-decision
is performed on the updated LLR value and it is sent
across to the respective check nodes for parity check, as
in (10). The process is repeated until the parity check is
satisfied or the maximum number of iterations is reached.

SMPA variable node operation:
Initial (at iteration 0):
$$V_n = \mathrm{LLR} \qquad (6)$$
Subsequent iterations:
$$X_i = +W, \quad \text{if } C_i = 0 \qquad (7)$$
$$X_i = -W, \quad \text{if } C_i = 1 \qquad (8)$$
$$V_n = V_n + \sum_i X_i \qquad (9)$$
$$V_i = \mathrm{sign}(V_n - X_i) \qquad (10)$$
[sign: + maps to 0 and - maps to 1]
where, n = 1, 2, ..., number of variable nodes
i = 1, 2, ..., degree of variable node
W = optimized integer constant obtained from
simulations
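The SMPA variable node steps in (6) to (10) can be sketched as below. This is one reading of the equations, not the authors' code: the outgoing bit on edge i is taken as the sign of the updated value minus that edge's own contribution, a value of exactly zero is treated as positive (bit 0), and no saturation of the accumulated value is modelled. These three points, and the function names, are assumptions.

```python
def smpa_initial_messages(llr, degree):
    """Iteration 0: V_n = LLR, and the hard decision is sent on every edge."""
    bit = 0 if llr >= 0 else 1
    return llr, [bit] * degree


def smpa_variable_update(v_n, check_bits, weight):
    """One SMPA variable node update.

    check_bits: single-bit suggestions C_i from the connected check nodes.
    weight:     the integer constant W.
    Returns the updated V_n and the hard-decision bit sent on each edge.
    """
    # (7), (8): map each suggested bit to +W or -W.
    x = [weight if c == 0 else -weight for c in check_bits]
    # (9): accumulate into the stored soft value.
    v_new = v_n + sum(x)
    # (10): hard decision per edge; assumption: zero counts as positive.
    out_bits = [0 if (v_new - x_i) >= 0 else 1 for x_i in x]
    return v_new, out_bits


if __name__ == "__main__":
    v, bits = smpa_initial_messages(-1, 3)
    print(v, bits)
    print(smpa_variable_update(v, [0, 0, 1], weight=1))
```

In this example a weakly negative LLR of -1 receives two suggestions of 0 and one of 1 with W = 1. The stored value moves to 0, so the soft value records that the evidence is now balanced instead of flipping outright as a majority vote would.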

Note that in SMPA, the variable node performs
addition operations and uses mapping logic, and check
node performs simple XOR operation. Hence it can be
implemented using simple hardware blocks, such as
adders and Look-Up-Tables (LUT).
A comparison of check node and variable node
structures for the SMPA, SPA and BFA is shown in Fig.
2.

Figure 2. Comparison of decoding node structures for SMPA, SPA and
BFA
V. PERFORMANCE ANALYSIS
To evaluate the performance of the proposed
algorithm, simulations were carried out for two different
code lengths, 1000-bit and 648-bit. The latter is
compliant with WLAN (IEEE 802.11n) standard [5]. A
simulation model has been developed using the C
programming language in MatLab environment. The
LDPC codes were generated using Progressive Edge
Growth (PEG) based algorithm [34] and the simulations
were carried out assuming that the codewords were
Binary Phase Shift Keying (BPSK) modulated and passed
over an Additive White Gaussian Noise (AWGN)
channel [35]. Simulations were done with different LLR
precisions to study the effect on the decoding
performance.
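The channel model used in such simulations can be sketched as follows. The sketch assumes unit-energy BPSK symbols (bit 0 mapped to +1), for which the noise variance is 1 / (2 R Eb/No) with R the code rate. The quantizer is a symmetric, saturating, uniform quantizer with a step of 1.0, which is purely an assumption since the paper does not describe the quantizer design. All function names are hypothetical.

```python
import math
import random


def noise_sigma(ebno_db, rate):
    """Noise standard deviation for unit-energy BPSK at a given Eb/No (dB)."""
    ebno = 10.0 ** (ebno_db / 10.0)
    return math.sqrt(1.0 / (2.0 * rate * ebno))


def bpsk_awgn(bits, ebno_db, rate, rng):
    """Map bits to +1/-1 (0 -> +1) and add white Gaussian noise."""
    sigma = noise_sigma(ebno_db, rate)
    return [(1.0 - 2.0 * b) + rng.gauss(0.0, sigma) for b in bits], sigma


def quantize_llr(llr, bits, step=1.0):
    """Assumed quantizer: round to the nearest step and saturate symmetrically."""
    max_level = 2 ** (bits - 1) - 1
    level = int(round(llr / step))
    return max(-max_level, min(max_level, level))


if __name__ == "__main__":
    rng = random.Random(1)
    samples, sigma = bpsk_awgn([0] * 8, ebno_db=6.0, rate=0.5, rng=rng)
    llrs = [2.0 * y / sigma ** 2 for y in samples]
    print([quantize_llr(l, bits=4) for l in llrs])
```

With this quantizer a 3-bit LLR takes values from -3 to +3, a 4-bit LLR from -7 to +7 and a 5-bit LLR from -15 to +15.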

A. Estimation of W in SMPA
As stated previously it is necessary to use optimum
value of W in order to achieve the best possible BER
performance from the SMPA decoder (see Eq. 7 and Eq.
8). Monte Carlo simulations were carried out using
different LLR quantizations (3-bit, 4-bit and 5-bit) with
various values of W (1 to 5) and for different $E_b/N_o$
levels. Fig. 3 shows the simulation results only for
$E_b/N_o$ = 6 dB. A rate 1/2 (3, 6) 648-bit LDPC code with a
maximum of 10 iterations was used in the simulations.
From Fig. 3, it is clear that SMPA (3-bit) can achieve the
lowest BER at W=1, whereas SMPA (4-bit) has optimum
BER performance at both W=1 and W=2. The BER
performance of SMPA (5-bit) is almost constant over a
wide range of W, achieving the best performance at
W=2.


Figure 3. BER performance of SMPA over a range of W

B. Simulation Results
The LDPC code parameters and specifications used in
the performance simulations are as follows:
  Code lengths: 648-bit (WLAN) and 1000-bit
  Code structure: 1/2 rate and (3, 6) regular code
  LLR quantization for SMPA: 3-bit, 4-bit (Weight, W = 1) and 5-bit (Weight, W = 2)
  Maximum decoding iterations: 10
The BER and frame error rate (FER) performance
simulation results for the proposed SMPA are shown in
Fig. 4 and Fig. 5 respectively. Key features of these
results are summarized in Table I. Clearly the proposed
SMPA achieves better BER performance compared to
BFA. At a BER of $10^{-5}$, the improvement in SMPA when
using 3-bit LLR is 0.8 dB for 648-bit code (Fig. 4a) and
0.9 dB for 1000-bit code (Fig. 4b). Higher LLR
precisions (4-bit and 5-bit) in the variable node
operations improve the BER performance by at least 1.8
dB compared to BFA. The proposed SMPA improves the
frame error rate performance over BFA in a similar
fashion as observed from Fig. 5 and Table I. The
convergence rate of the various algorithms has been
assessed by analyzing the average number of decoding
iterations required by each algorithm, as shown in Fig. 6.
Clearly the proposed SMPA with 4-bit and 5-bit LLR
precisions requires far fewer iterations (higher
convergence rate) compared to BFA for all values of
$E_b/N_o$. Even with 3-bit LLR precision, SMPA requires
fewer or equal number of iterations compared to BFA for
a reasonably large range of $E_b/N_o$. Although the iteration
count for SPA is much lower than SMPA, each of the
iterations in SPA is likely to take significantly more
computation time due to the highly complex operations at
the variable and check nodes (see Eq. 1 and Eq. 2). In
contrast, the proposed SMPA has much simpler node
operations (see Eq. 5-10) and therefore incurs much
shorter iteration cycle time.
TABLE I.
COMPARISON OF BER AND FER PERFORMANCE OF THE ALGORITHMS
(Eb/No in dB at which the stated error rate is reached)

                     648-bit (WLAN)               1000-bit
Algorithm        BER of 10^-5  FER of 10^-2   BER of 10^-5  FER of 10^-2
SPA                  3.2           2.8            2.9           2.6
BFA                  8             7.1            7.9           7.1
SMPA, 3-bit          7.2           6.5            6.8           6.5
SMPA, 4-bit          6.2           5.6            5.9           5.6
SMPA, 5-bit          6.0           5.5            5.8           5.5



(a) 648-bit LDPC code

(b) 1000-bit LDPC code

Figure 4. BER performances for the SMPA, SPA and BFA



(a) 648-bit LDPC code


(b) 1000-bit LDPC code
Figure 5. FER performances for the SMPA, SPA and BFA

(a) 648-bit LDPC code



(b) 1000-bit LDPC code
Figure 6. Average decoding iterations for SMPA, SPA and BFA
VI. FPGA IMPLEMENTATION
Fully parallel implementation of decoders for large
LDPC codes has been problematic due to very large
amount of resources required. We seek to implement a
fully parallel LDPC decoder on FPGA based on the
proposed SMPA to judge the savings in hardware
resources. Implementing a fully parallel architecture will
also be useful to determine the feasibility of various
partially parallel architectures based on SMPA. From the
simulation results (Fig. 4 and Fig. 5), it is clear that using
4-bit SMPA over 3-bit provides significant improvement
in decoding performance, whereas using 5-bit SMPA
over 4-bit achieves little or negligible gain in
performance. Hence 4-bit LLR quantization is used for
the FPGA implementation. A 1/2-rate (3, 6) regular 648-bit
LDPC code that is compliant with the WLAN (IEEE
802.11n) standard was chosen to implement the decoder.

A. Design Procedure
A parameterized hardware model of the decoder was
developed using the Verilog Hardware Description
Language (HDL) and synthesized using Xilinx Synthesis
Tool. Behavioral and post-synthesis model simulations
were carried out using ModelSim. The block diagram of
the LDPC decoder as implemented is shown in Fig. 7.
Figure 7. Block diagram of the designed LDPC decoder

The decoder has a global Clock input and a
synchronous Reset input. The maximum permissible
number of iterations is determined by the value supplied
at the MaxIter input. This can be set to a value in the
range 0-15. When the Configure input is high, the
MaxIter value is read. The LLRs are fed into the
decoder using the Load control signal. The decoding
process is initiated by the Start signal. After the
decoding is completed, the Decoded Data can be
obtained when the DataOut Ready signal is asserted.
The receipt of data can be acknowledged via the
DataOut Ack signal to receive the next decoded bit. The
number of iterations used for decoding can be obtained
from the Used Iter port. The Decoder Status port
indicates the progress (Active/Idle) of the decoder.
Note that the LLRs are loaded serially (one at a time)
into the decoder. Similarly, the Decoded Data is latched
(read) bit by bit serially. This technique is used because
of the limited number of input/output ports available on
the FPGA. It also provides flexibility for implementing
LDPC decoders with variable code length without
modifying the port configuration.

B. Test Procedure
The LDPC decoder was implemented on a Xilinx
Virtex 5 FPGA (XC5VLX110T). A comprehensive
testing environment was developed to test the
implemented decoder. The test setup is shown in Fig. 8.
An RS232 transceiver module was embedded on the
FPGA along with the LDPC decoder module to interface
with the RS232 port of the PC [36]. MatLab was used to
communicate with the FPGA. A serial port
communication driver was developed using the C
programming language and executed in the MatLab
environment [37]. A maximum baud rate of 115200 bps
was used for the serial data communication. The LLRs
were generated as described in Section V and sent to the
FPGA along with the appropriate control signals. After

the decoding process, the decoded data received via the
same serial port was used to analyze the performance of
the decoder. The number of iterations used by the
decoder was also collected to estimate the average
throughput of the implemented decoder.


Figure 8. Block diagram of FPGA test setup

C. Implementation Results
The performance of the LDPC decoder implemented
on FPGA was analyzed and compared against the
software simulation results. The BER and FER
performance of the hardware decoder are shown in Fig. 9
and Fig. 10 respectively. The loss in BER performance
due to implementation is negligible. The loss in FER
performance is less than 0.1 dB ($E_b/N_o$). The average
number of iterations required by the hardware decoder
closely follows the average iterations predicted by the
software simulation model, as shown in Fig. 11.
The FPGA device utilization summary of the LDPC
decoder including the serial communication module is
shown in Table II. FPGA implementation results for BFA
and SPA suitable for comparison with SMPA were not
available in the literature. Hence, the results were
obtained from post-placement and routing (PAR) of the
design. Note that for SPA, only synthesis and mapping
were carried out, since the Xilinx tool failed to route the
design completely due to its huge complexity.


Figure 9. BER performance of the LDPC decoder from FPGA

Figure 10. FER performance of the LDPC decoder from FPGA

Figure 11. Average iterations of the LDPC decoder from FPGA
TABLE II.
FPGA DEVICE UTILIZATION SUMMARY

Resources                  SPA         SMPA        BFA
Device                     Xilinx Virtex 5 (XC5VLX110T-3FF1136), all designs
LDPC code                  1/2 rate (3, 6) regular 648-bit (WLAN), all designs
Intrinsic message (LLR)    4-bit       4-bit       1-bit
Extrinsic message          4-bit       1-bit       1-bit
Slices                     15684       4046        1396
LUTs                       58787       14239       3577
Registers                  12443       5963        2069
Clock                      128 MHz     188 MHz     190 MHz
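The relative hardware cost of the three designs can be read off Table II with a short calculation. The sketch below only restates the slice, LUT and register counts from the table and forms their ratios; the dictionary layout and function name are illustrative. The SPA figures come from synthesis and mapping only, as noted in the text.

```python
# Resource counts copied from Table II.
UTILIZATION = {
    "SPA": {"slices": 15684, "luts": 58787, "registers": 12443},
    "SMPA": {"slices": 4046, "luts": 14239, "registers": 5963},
    "BFA": {"slices": 1396, "luts": 3577, "registers": 2069},
}


def resource_ratio(design, baseline, resource):
    """Ratio of one design's resource count to a baseline design's count."""
    return UTILIZATION[design][resource] / UTILIZATION[baseline][resource]


if __name__ == "__main__":
    for resource in ("slices", "luts", "registers"):
        vs_spa = resource_ratio("SMPA", "SPA", resource)
        vs_bfa = resource_ratio("SMPA", "BFA", resource)
        print(f"{resource:9s}  SMPA/SPA = {vs_spa:.1%}   SMPA/BFA = {vs_bfa:.2f}x")
```

With these figures the SMPA decoder uses roughly a quarter of the slices and LUTs and just under half of the registers of the SPA design, and about three to four times the resources of the BFA design.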

The throughput of the decoder has been calculated
using the formula shown in (11) [38]. This calculation
excludes the loading time of the individual LLRs (before
starting the decoding process) and latching time of the
decoded data (after decoding is complete).


$$T = \frac{f_{max} \times \text{codelength} \times \text{rate}}{N_{it} \times (\text{clock cycles per iteration})} \qquad (11)$$
where $f_{max}$ is the maximum operating frequency of the
decoder obtained from FPGA implementation, $N_{it}$ is the
number of decoding iterations, and the remaining factor in
the denominator is the number of clock cycles required to
complete one iteration (equal to 1 for the SMPA decoder).
Using the average decoding iterations from Fig. 11, the
estimated average throughput of the LDPC decoder
implemented on FPGA is plotted in Fig. 12 for various
$E_b/N_o$. The proposed decoder has an average
throughput of ~16.2 Gbps at $E_b/N_o$ = 6.25 dB. Fig. 9
shows that the BER achieved at this $E_b/N_o$ is $10^{-5}$.
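The throughput estimate can be reproduced numerically. The sketch below reads the throughput formula as the clock frequency times the code length times the code rate, divided by the product of the iteration count and the clock cycles per iteration. It uses the 188 MHz SMPA clock from Table II, the 648-bit code and a rate of 1/2. The function names are hypothetical, and the iteration count is not read from Fig. 11 but solved for from the reported ~16.2 Gbps.

```python
def throughput_bps(f_max_hz, code_length, rate, n_iterations, cycles_per_iteration=1):
    """Decoder throughput in bits per second, excluding load and read-out time."""
    if n_iterations <= 0 or cycles_per_iteration <= 0:
        raise ValueError("iteration and cycle counts must be positive")
    return f_max_hz * code_length * rate / (n_iterations * cycles_per_iteration)


def iterations_for_throughput(f_max_hz, code_length, rate, target_bps, cycles_per_iteration=1):
    """Average iteration count implied by a given throughput."""
    return f_max_hz * code_length * rate / (target_bps * cycles_per_iteration)


# Upper bound: every codeword decoded in a single iteration.
PEAK_BPS = throughput_bps(188e6, 648, 0.5, 1)
# Average iterations implied by the reported ~16.2 Gbps.
IMPLIED_ITERATIONS = iterations_for_throughput(188e6, 648, 0.5, 16.2e9)

if __name__ == "__main__":
    print(f"single-iteration throughput: {PEAK_BPS / 1e9:.3f} Gbps")
    print(f"implied average iterations at 16.2 Gbps: {IMPLIED_ITERATIONS:.2f}")
```

Under these assumptions the reported throughput corresponds to an average of about 3.76 iterations per codeword at 6.25 dB, and the single-iteration upper bound is about 60.9 Gbps.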


Figure 12. Average throughput of the LDPC decoder from FPGA
VII. CONCLUSIONS
A simplified message passing algorithm for LDPC
decoding has been presented in this paper. The proposed
algorithm uses higher precision soft LLR-inputs for
variable node operations while passing only hard-
decision messages between the processing nodes. This
approach has led to improved BER and FER
performances compared to fully hard-decision based
solutions such as those based on the Bit Flip Algorithm
(BFA). The proposed algorithm also reduces the average
number of decoding iterations compared to BFA. The
algorithm has been verified through FPGA
implementation of an LDPC decoder that complies with
the Wireless LAN standard. The results show that the
decoder requires significantly reduced hardware
resources compared to the sum-product algorithm (SPA).
The decoder can achieve a massive throughput of ~16.2
Gbps, which is considerably higher than decoders based
on SPA. Although the proposed decoder requires more
hardware resources than the one based on the Bit Flip
Algorithm, it addresses the main weakness of the latter by
significantly improving the BER performance. The
hardware resource utilization results obtained from the
fully parallel implementation of the decoder presented in
this paper can be used to guide the design of partially-
parallel architectures for large codes to reduce the
hardware resource requirement even further.
ACKNOWLEDGMENT
The authors wish to acknowledge Dr Mark Ho of the
School of Electrical and Information Engineering,
University of South Australia, for his advice on carrying
out the performance simulations.
REFERENCES
[1] R. Gallager, "Low-density parity-check codes," IRE
Transactions on Information Theory, vol. 8, no. 1, pp. 21-
28, January 1962.
[2] D.J.C. MacKay and R.M. Neal, "Near Shannon limit
performance of low density parity check codes,"
Electronics Letters, vol. 33, no. 6, pp. 457-458, 13 March
1997.
[3] D.J.C. MacKay, "Good error-correcting codes based on
very sparse matrices," IEEE Transactions on Information
Theory, vol. 45, no. 2, pp. 399-431, March 1999.
[4] G.L.L. Nicolas Fau, "LDPC (Low Density Parity Check) -
A Better Coding Scheme for Wireless PHY Layers," Design
and Reuse Industry Article, 2008.
[5] IEEE Std. 802.11n, "Wireless LAN medium access
control (MAC) and physical layer (PHY) specifications:
enhancements for higher throughput", IEEE, September
2009.
[6] IEEE Standard 802.16e, "Air interface for fixed and
mobile broadband wireless access systems. Amendment
2: Physical and medium access control layers for
combined fixed and mobile operation in licensed bands",
IEEE, December 2005.
[7] A. Morello and V. Mignone, "DVB-S2: The Second
Generation Standard for Satellite Broad-Band Services,"
Proceedings of the IEEE, vol. 94, no. 1, pp. 210-227,
January 2006.
[8] Tetsuo Nozawa, "LDPC Adopted for Use in Comms,
Broadcasting, HDDs," Nikkei Electronics Asia, 2005.
[9] C. Winstead, V. Gaudet, A. Rapley, and C. Schlegel,
"Stochastic iterative decoders," Proceedings of the IEEE
International Symposium on Information Theory, pp.
1116-1120, 4-9 September 2005.
[10] Z. Cui and Z. Wang, "Improved low-complexity low-
density parity-check decoding," IET Communications,
vol. 2, no. 8, pp. 1061-1068, September 2008.
[11] G. Lechner, I. Land, and L. Rasmussen, "Decoding of
LDPC codes with binary vector messages and scalable
complexity," Proceedings of the International Symposium
on Turbo Codes and Related Topics, Lausanne, pp. 350-
355, 1-5 September 2008.
[12] E. Yeo and V. Anantharam, "Capacity Approaching
Codes, Iterative Decoding Architectures, and Their
Applications," IEEE Communications Magazine, vol. 41,
no. 8, pp. 132-140, 2003.
[13] S.M. Aziz and M.D. Pham, "Implementation of Low
Density Parity Check Decoders using a New High Level
Design Methodology," Journal of Computers, Academy
Publisher, vol. 5, no. 1, pp. 81-90, January 2010.
[14] A. Anastasopoulos, "A comparison between the sum-
product and the min-sum iterative detection algorithms
based on density evolution," Proceedings of the IEEE
Global Telecommunications Conference, San Antonio,
TX, pp. 1021-1025, 25-29 November 2001.
[15] J.H. Han and M.H. Sunwoo, "Simplified sum-product
algorithm using piecewise linear function approximation
for low complexity LDPC decoding," Proceedings of the
3rd International Conference on Ubiquitous Information
Management and Communication, Suwon, Korea, pp.
302-308, 2009.
[16] S. Papaharalabos, et al., "Modified sum-product
algorithms for decoding low-density parity-check codes,"
IET Communications, vol. 1, no. 3, pp. 294-300, June
2007.
[17] S. Papaharalabos and P.T. Mathiopoulos, "Simplified
sum-product algorithm for decoding LDPC codes with
optimal performance," Electronics Letters, vol. 45, no. 2,
pp. 116-117, 15 January 2009.
[18] N. Miladinovic and M.P.C. Fossorier, "Improved bit-
flipping decoding of low-density parity-check codes,"
IEEE Transactions on Information Theory, vol. 51, no. 4,
pp. 1594-1606, April 2005.
[19] Q. Dajun, J. Ming, Z. Chunming, and W. Xiaofu, "A
Modification to Weighted Bit-Flipping Decoding
Algorithm for LDPC Codes Based on Reliability
Adjustment," Proceedings of the IEEE International
Conference on Communications, Beijing, pp. 1161-1165,
19-23 May 2008.
[20] F. Guo and L. Hanzo, "Reliability ratio based weighted
bit-flipping decoding for low-density parity-check codes,"
Electronics Letters, vol. 40, no. 21, pp. 1356-1358, 14
October 2004.
[21] C.H. Lee and W. Wolf, "Implementation-efficient
reliability ratio based weighted bit-flipping decoding for
LDPC codes," Electronics Letters, vol. 41, no. 13, pp.
755-757, 23 June 2005.
[22] V.A. Chandrasetty and S.M. Aziz, "A reduced complexity
message passing algorithm with improved performance
for LDPC decoding," Proceedings of the 12th
International Conference on Computers and Information
Technology, Dhaka, pp. 19-24, 21-23 December 2009.
[23] D. Costello Jr, A. Pusane, S. Bates, and K. Zigangirov, "A
comparison between LDPC block and convolutional
codes," Proceedings of the Information Theory and
Applications Workshop, San Diego, USA, February 2006.
[24] S.J. Johnson, Introducing Low-Density Parity-Check
Codes, University of Newcastle, Australia, 2006.
[25] J.C. Moreira, Essentials of error-control coding, West
Sussex, England: John Wiley & Sons, 2006.
[26] T.J. Richardson and R.L. Urbanke, "The capacity of low-
density parity-check codes under message-passing
decoding," IEEE Transactions on Information Theory,
vol. 47, no. 2, pp. 599-618, February 2001.
[27] M.G. Luby, M.A. Shokrollahi, M. Mitzenmacher, and
D.A. Spielman, "Improved low-density parity-check
codes using irregular graphs and belief propagation,"
Proceedings of the IEEE International Symposium on
Information Theory, p. 117.
[28] M. Singh and I.J. Wassell, "Comparison between soft and
hard decision decoding using quaternary convolutional
encoders and the decomposed CPM model," Proceedings
of the IEEE VTS 53rd Vehicular Technology Conference,
Rhodes, pp. 1347-1351, 06-09 May 2001.
[29] S. Sharifi Tehrani, S. Mannor, and W.J. Gross, "Fully
Parallel Stochastic LDPC Decoders," IEEE Transactions
on Signal Processing, vol. 56, no. 11, pp. 5692-5703,
November 2008.
[30] N. Mobini, A.H. Banihashemi, and S. Hemati, "A
Differential Binary Message-Passing LDPC Decoder,"
Proceedings of the IEEE Global Telecommunications
Conference, Washington, DC, pp. 1561-1565, 26-30
November 2007.
[31] C. Chao-Yu, H. Qin, K. Jingyu, Z. Li, and L. Shu, "A
binary message-passing decoding algorithm for LDPC
codes," Proceedings of the 47th Annual Allerton
Conference on Communication, Control, and Computing,
Monticello, IL, pp. 424-430, 30 September-2 October 2009.
[32] H. Qin, K. Jingyu, Z. Li, L. Shu, and K. Abdel-Ghaffar,
"Two reliability-based iterative majority-logic decoding
algorithms for LDPC codes," IEEE Transactions on
Communications, vol. 57, no. 12, pp. 3597-3606,
December 2009.
[33] G. Lechner, T. Pedersen, and G. Kramer, "EXIT Chart
Analysis of Binary Message-Passing Decoders,"
Proceedings of the IEEE International Symposium on
Information Theory, Nice, pp. 871-875, 24-29 June 2007.
[34] X.Y. Hu, Software to Construct PEG LDPC code, 2008,
[cited May 2009]; Available from:
http://www.inference.phy.cam.ac.uk/mackay/PEG_ECC.html.
[35] J.G. Proakis, Digital communications, 5th ed, New York:
McGraw-Hill, 2008.
[36] V.A. Chandrasetty and S.R. Laddha, "A novel dual
processing architecture for implementation of motion
estimation unit of H.264 AVC on FPGA," Proceedings of
the IEEE Symposium on Industrial Electronics &
Applications, Kuala Lumpur, pp. 62-67, 4-6 October
2009.
[37] V.A. Chandrasetty and S.M. Aziz, "FPGA
Implementation of High Performance LDPC Decoder
Using Modified 2-Bit Min-Sum Algorithm," Proceedings
of the 2nd International Conference on Computer
Research and Development, Kuala Lumpur, pp. 881-885,
7-10 May 2010.
[38] R. Zarubica, S.G. Wilson, and E. Hall, "Multi-Gbps
FPGA-Based Low Density Parity Check (LDPC) Decoder
Design," Proceedings of the IEEE Global
Telecommunications Conference, Washington, DC, pp.
548-552, 26-30 November 2007.







Vikram Arkalgud Chandrasetty
received a Bachelor's degree in
Electronics and Communication
Engineering from Bangalore
University (India) in 2004 and a
Master's degree in VLSI System
Design from Coventry University
(UK) in 2008. He worked in the
Core Networks Division at Motorola
India as a Software Engineer (2005-2007), where he was part of
the billing and call processing R&D team of Motorola Soft-
Switch (MSS) for Mobile Switching Centres (MSC). He also
worked for SoftJin Technologies as a Senior Software Engineer
(2007-2008), focusing on Electronic Design Automation (EDA)
and FPGA applications design. He was involved in the design
and development of the Programmable Synthesis Engine (PSE) for
custom FPGA architectures and structured ASICs. He also
worked on software modelling and FPGA implementation of
motion estimation algorithms for H.264 Advanced Video Coding.
Mr. Vikram is currently working towards his doctoral thesis
at the School of Electrical and Information Engineering,
University of South Australia. He is exploring low complexity
algorithms for decoding LDPC codes and investigating efficient
architectures for hardware implementation. His research is
mainly focused on implementing high performance LDPC
decoders on reconfigurable devices.
44 JOURNAL OF NETWORKS, VOL. 6, NO. 1, JANUARY 2011
2011 ACADEMY PUBLISHER
Syed Mahfuzul Aziz received
Bachelor's and Master's degrees, both in
electrical & electronic engineering, from
Bangladesh University of Engineering &
Technology (BUET) in 1984 and 1986
respectively. He received a Ph.D. degree
in electronic engineering from the
University of Kent (UK) in 1993 and a
Graduate Certificate in higher education
from Queensland University of Technology, Australia in 2002.
He was a Professor in BUET until 1999, and led the
development of the teaching and research programs in
integrated circuit (IC) design in Bangladesh. He joined the
University of South Australia in 1999, where he is currently an
associate professor. In 1996, he was a visiting scholar at the
University of Texas at Austin, during which he spent time at Crystal
Semiconductor Corporation designing advanced CMOS
integrated circuits. He has been involved in numerous industry
projects in Australia and overseas, and has attracted funding
from reputed research organisations such as the Australian
Research Council (ARC), Australian Defence Science and
Technology Organisation (DSTO), and the Pork CRC
(Cooperative Research Centre), Australia. He has authored over
ninety refereed research papers. His research interests include
digital CMOS IC design and testability, modelling and FPGA
implementation of high performance processing systems,
biomedical engineering and engineering education.
Dr Aziz is a senior member of the IEEE and a member of
Engineers Australia. He has received numerous professional and
teaching awards including the Prime Minister's Award for
Australian University Teacher of the Year (2009). He has
served as a member of the program committees of many
international conferences. He reviews papers for the IEEE
Transactions on Computers and Electronics Letters (UK).

