C2006 Ieee PDF

Decoders for Low-Density Parity-Check Convolutional Codes with Large Memory
Stephen Bates*, Logan Gunthorpe*, Ali Emre Pusane†, Zhengang Chen*, Kamil Zigangirov† and Daniel J. Costello Jr.†
*Department of Electrical and Computer Engineering, University of Alberta, Edmonton, Canada.
†Department of Electrical Engineering, University of Notre Dame, Notre Dame, IN, USA.
Abstract- Low-density parity-check convolutional codes offer the same good error-correcting performance as low-density parity-check block codes while having the ability to encode and decode arbitrary lengths of data. This makes these codes well suited to certain applications, such as forward error control on packet switching networks. In this paper we propose a decoder architecture for low-density parity-check convolutional codes with very large memories. These codes have very good error-correcting properties and as such may be applicable in wireless sensor networks and space communication systems. We discuss a realization of this architecture for a (2048,3,6) code implemented on a field-programmable gate array.
Index Terms- Convolutional codes, Data communication, Error correction coding, High-speed integrated circuits.
0-7803-9390-2/06/$20.00 ©2006 IEEE
ISCAS 2006

I. INTRODUCTION
Low-Density Parity-Check (LDPC) codes were first proposed by Gallager in his Ph.D. thesis in 1960 and were then rediscovered by, among others, MacKay and Neal in 1996 [1]. Since their rediscovery LDPC codes have become very popular. One of the reasons for this is that LDPC Block Codes (LDPC-BCs) can perform very close to the Shannon limit [2]. Another reason is that a relatively simple message-passing decoding algorithm, called belief propagation, can be applied to these codes.

Research regarding LDPC codes has, to date, focused almost exclusively upon LDPC-BCs. However, Low-Density Parity-Check Convolutional Codes (LDPC-CCs) [3], [4] may be better suited to certain applications than their block code counterparts. This is because LDPC-CCs are able to encode and decode arbitrary lengths of data without the need to fragment them into fixed-sized blocks. Many packet switching networks, including those based on the Ethernet frame format, utilize a Protocol Data Unit (PDU) that can vary in size. For example in the IEEE 802.11 wireless standards the Ethernet frame can vary in size from 64 Bytes to 1518 Bytes.

In this paper we present an architecture for decoding LDPC-CCs with large memory. This architecture is based upon a processor design presented in [5]. In this paper we make two major extensions to that work.
* Rather than decoding the LDPC-CC with many processors, we use a single processor that operates upon a terminated sequence of data.
* We demonstrate that LDPC-CCs can be applied to finite lengths of data rather than an infinite stream.

To demonstrate the feasibility of our architecture we implement an encoder and decoder for a (2048,3,6) LDPC-CC on an Altera Stratix FPGA. This implementation achieves excellent Frame Error Rate (FER) performance and a moderate throughput using only a fraction of the FPGA resources. Therefore the architecture is suitable for low power, moderate bandwidth applications where good error correction capabilities are required. Such applications include mobile telephony data transfer, communication over wireless sensor networks, and space communications.

In Section II we present a brief overview of LDPC-CCs. In Section III we discuss the termination of data streams encoded with LDPC-CCs. In Section IV we demonstrate the performance of these codes via simulation. In Section V we present the architecture of the decoder and in Sections VI and VII we present synthesis and performance results for the implementation of such a decoder for a (2048,3,6) LDPC-CC.

II. AN OVERVIEW OF LDPC-CCs
LDPC-CCs¹ generate parity bits using only previous information bits and previously generated parity bits. For example the generation of a parity bit, $v(t)$, for a rate 1/2 $(m_s, J, K)$ LDPC-CC is

$$v(t) = \sum_{i=0}^{m_s} h_i^{(u)}(t)\,u(t-i) + \sum_{i=1}^{m_s} h_i^{(v)}(t)\,v(t-i). \tag{1}$$

The coefficients $h_i^{(u)}(t)$ and $h_i^{(v)}(t)$ are taken from the parity-check matrix, $H$, that defines the code. $m_s$ is denoted as the memory of the code, $J$ is the number of ones in each column of $H$ and $K$ is the number of ones in each row of $H$.

LDPC-CCs could be decoded using the Viterbi algorithm. However, given that their trellis consists of $2^{2m_s}$ states, such a decoder is infeasible. Instead, since the parity-check matrix of an LDPC-CC is designed to be sparse, iterative message-passing algorithms can be applied. LDPC-CCs are typically decoded with the same Belief Propagation (BP) algorithm used for LDPC-BCs [1].

¹The nomenclature used in this paper is based upon that presented in [3]. Please refer to that paper for a more complete overview of LDPC-CCs.
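As a rough illustration of the recursion in (1), the following Python sketch computes parity bits over GF(2). It is only a sketch under stated assumptions: the function name `encode_ldpc_cc` and the tap lists `h_u` and `h_v` are hypothetical, the taps are taken to be time-invariant (the codes in this paper are time-varying), and the toy taps in the usage note below do not correspond to the (2048,3,6) code.

```python
from typing import List, Sequence


def encode_ldpc_cc(u: Sequence[int],
                   h_u: Sequence[int],
                   h_v: Sequence[int]) -> List[int]:
    """Compute parity bits v(t) from information bits u(t) as in (1).

    Assumptions (not from the paper): taps are time-invariant, and the
    encoder starts in the all-zero state.
    h_u[i] multiplies u(t - i) for i = 0..m_s.
    h_v[i] multiplies v(t - i) for i = 1..m_s (h_v[0] is ignored).
    """
    m_s = len(h_u) - 1
    if len(h_v) != m_s + 1:
        raise ValueError("h_u and h_v must both have m_s + 1 entries")

    v: List[int] = []
    for t in range(len(u)):
        bit = 0
        for i in range(m_s + 1):
            if t - i < 0:
                break  # earlier symbols are zero (all-zero start state)
            bit ^= h_u[i] & u[t - i]          # information-bit term
            if i >= 1:
                bit ^= h_v[i] & v[t - i]      # fed-back parity term
        v.append(bit)
    return v
```

With the toy taps `h_u = [1, 0, 1]` and `h_v = [0, 1, 0]` (memory 2), the recursion reduces to v(t) = u(t) + u(t-2) + v(t-1) modulo 2, so an impulse input `[1, 0, 0, 0]` produces the parity sequence `[1, 1, 0, 0]`.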
III. LDPC-CC TERMINATION
As mentioned in Section II an LDPC-CC encoder can be thought of as a Finite State Machine (FSM). In that case, a termination sequence is defined as a sequence of information bits that takes the encoder from any given state back to the all-zero state². Before LDPC-CCs can be used in a communication system we need to determine whether termination sequences of reasonable length exist for all possible states. We also need to determine whether we can generate these sequences in an efficient manner. In [6] we showed that most LDPC-CCs can be terminated using a sequence of, at most, $m_s + \epsilon$ information bits ($\epsilon \ll m_s$). In [7] we showed how to construct simple VLSI circuits to generate termination sequences for a variety of LDPC-CCs.

Note that the termination sequence has an impact on the $E_b/N_0$ performance of an LDPC-CC. Since the sequence cannot convey information it reduces the overall rate of the code. Therefore if the information to be encoded is of length $L$ bits, the new code rate, $R'$, is given by

$$R' = \frac{L}{L + m_s}\,R. \tag{2}$$

Therefore any given BER will now occur at a higher $E_b/N_0$ than before. If $L$ is small with respect to $m_s$, the effect of termination is very large. However, if $L \gg m_s$, then the impact of termination is minimal.

²The all-zero state is the state in which the most recent $m_s$ information bits and parity bits are all zero.

Fig. 1. The simulated performance of the (2048,3,6) LDPC-CC versus a number of LDPC-BCs. The threshold for iterative decoding of (3,6) codes is also shown. [Horizontal axis: $E_b/N_0$ (dB).]

IV. LDPC-CC PERFORMANCE
As mentioned in Section III the error correcting capabilities of an LDPC-CC should increase as we increase $m_s$. In this paper we focus upon codes with much larger memories than those that have been implemented previously. We choose a (2048,3,6) LDPC-CC, generated using the method proposed in [8]. The simulated performance of this LDPC-CC versus several (3,6) LDPC-BCs is given in Fig. 1. The (2048,3,6) LDPC-CC outperforms a (100000,3,6) LDPC-BC, which suggests it is a very powerful code. However, what is not captured here is that the LDPC-CC requires up to 100 processors in order to achieve this performance. This is a serious issue since it renders previously reported LDPC-CC architectures such as [5] and [9] impractical. Therefore we propose a novel architecture for these longer memory codes.

V. DECODER ARCHITECTURE
We overcome the fact that LDPC-CCs with large memory require many processors by wrapping a terminated sequence (a frame) of data through a relatively small number (possibly one) of processors many times. We accomplish this by using a controller that manages the flow of frames through an array of processors (Fig. 2). The controller takes a single frame as an input and places it in memory. It then passes the frame through the processor array and the output is returned to memory. At that point the controller can either output the data or send it through the processor array again.

In this paper we use the Ethernet frame format as the basis for our finite data. We do this because Ethernet is one of the most prevalent frame formats in data communications. Since our frames are compliant with the Ethernet frame format, we check the 4 Byte Cyclic Redundancy Check (CRC) after every loop through the array. We can therefore determine if the packet has been decoded correctly and use this information to decide whether or not to send the data through the array again. The frame will loop through the decoder as many times as necessary until it passes the CRC check or a maximum number of iterations is reached.

We approached the design in such a way that the number of processors, maximum number of iterations and memory length of the decoder can be chosen as variables during synthesis. This allows us to test many different configurations with the code. We use the naming scheme "$m_s$ pxn" to designate a design using a memory of $m_s$ with p processors and a maximum of n iterations.

Upon implementation, we found it advantageous to use a single processor in the decoder. This results in a decoder that is very small in terms of area and power. Such a decoder could be integrated with an Ethernet MAC to produce a very low-power data-link layer solution. Also, with one processor a frame only goes through exactly as many processors as are needed to properly decode it, due to the intermediate verification of frames. This greatly improves the throughput of the decoder at higher SNR. However, multiprocessor designs will have better throughput at lower SNRs.

The architecture of the processor is identical to the memory-based design presented in [5] except for the fact that the approximation presented in [10] was used. The approximation gives almost the full performance of the BP algorithm with a much simpler implementation.
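The control flow described in Section V (pass the frame through the processor array, check the CRC, and either output the frame or loop again) can be sketched in Python as follows. This is a software illustration only, not the FPGA controller: the names `decode_frame`, `hard_decision`, `append_crc` and `crc_ok` are hypothetical, `processor` stands in for one pass through the processor array, the 4-byte CRC is modelled with `zlib.crc32` stored little-endian as a stand-in for the Ethernet frame check sequence, and a negative LLR is assumed to mean a 1 bit.

```python
import zlib
from typing import Callable, List, Tuple


def append_crc(payload: bytes) -> bytes:
    """Append a 4-byte CRC-32 (little-endian; an assumption of this sketch)."""
    return payload + zlib.crc32(payload).to_bytes(4, "little")


def crc_ok(frame: bytes) -> bool:
    """Check the trailing 4-byte CRC-32 of a frame."""
    if len(frame) < 4:
        return False
    return zlib.crc32(frame[:-4]) == int.from_bytes(frame[-4:], "little")


def hard_decision(llrs: List[float]) -> bytes:
    """Pack LLR signs into bytes, MSB first (negative LLR -> bit 1)."""
    bits = [1 if x < 0 else 0 for x in llrs]
    out = bytearray()
    for k in range(0, len(bits) - len(bits) % 8, 8):
        byte = 0
        for b in bits[k:k + 8]:
            byte = (byte << 1) | b
        out.append(byte)
    return bytes(out)


def decode_frame(llrs: List[float],
                 processor: Callable[[List[float]], List[float]],
                 max_iterations: int = 100) -> Tuple[bytes, int, bool]:
    """Loop a frame through `processor` until the CRC passes.

    Returns (frame bytes, number of passes used, CRC passed?).
    """
    for passes in range(1, max_iterations + 1):
        llrs = processor(llrs)          # one pass through the processor array
        frame = hard_decision(llrs)
        if crc_ok(frame):               # intermediate verification of the frame
            return frame, passes, True
    return hard_decision(llrs), max_iterations, False
```

Because the loop exits as soon as the CRC passes, the number of passes, and hence the decoding time per frame, shrinks when the channel is good, which is the behaviour the single-processor design relies on.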
Fig. 2. The system implemented on FPGA. This includes the LDPC-CC encoder, decoder, and noise generators. The decoder consists of memory, a processor array, a decoder controller, and a CRC evaluator. [Block labels: encoder, AWGN, memory, processor array, decoder controller, CRC evaluate. Signal labels: information bits, parity bits, information LLRs, parity LLRs, terminate, output valid, CRC result.]
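As a numerical aside on the rate loss in (2), the short Python sketch below evaluates the terminated code rate and the corresponding shift in $E_b/N_0$ for a given frame length. The function names are hypothetical, and the sketch assumes that the termination sequence is exactly $m_s$ bits long, that the base rate is 1/2, and that the whole frame counts as information bits; the shift is computed as $10\log_{10}(R/R')$, the usual consequence of spending the same energy on fewer information bits.

```python
import math


def terminated_rate(info_bits: int, m_s: int, base_rate: float = 0.5) -> float:
    """Effective rate R' = L / (L + m_s) * R, as in (2)."""
    return base_rate * info_bits / (info_bits + m_s)


def ebno_penalty_db(info_bits: int, m_s: int) -> float:
    """Shift in Eb/N0 (dB) caused by termination: 10*log10(R / R')."""
    return 10.0 * math.log10((info_bits + m_s) / info_bits)


if __name__ == "__main__":
    m_s = 2048
    for frame_bytes in (128, 8064):
        L = 8 * frame_bytes  # assumption: every frame bit is an information bit
        print(frame_bytes, "B:",
              round(terminated_rate(L, m_s), 4),
              round(ebno_penalty_db(L, m_s), 2), "dB")
```

Under these assumptions a 128 Byte frame with $m_s = 2048$ has an effective rate of 1/6 and a shift of about 4.8 dB, while an 8064 Byte frame loses only about 0.14 dB.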
VI. DECODER IMPLEMENTATION
The decoder was implemented for the (2048,3,6) LDPC-CC on an Altera Stratix FPGA. This is a mid-sized FPGA with on-chip RAM and dedicated arithmetic blocks. The architecture of each processor is based on the design presented in [5], which has been shown to be a very efficient architecture for FPGA implementations. The additional state machines and RAM mentioned in Section V were also implemented on the device. As can be seen from Table I, the decoder occupies only a few percent of the targeted FPGA. Therefore it is expected that an ASIC implementation of the same design would occupy very little area and consume very little power.

TABLE I
THE FPGA LDPC-CC DECODER WITH A PROCESSOR ARRAY OF SIZE ONE. THE WORD-LENGTH OF THE LLRS IS W = 8 BITS. THE FIGURES IN BRACKETS REFER TO UTILIZATION OF THAT RESOURCE ON THE FPGA TARGETED.

Parameter             Value
Device Name           Altera Stratix EP1S80
Logic Elements        4004 (5.05%)
Memory (bits)         131456 (1.63%)
Max Clock Freq.       60 MHz
Max x3 Clock Freq.    180 MHz

VII. DECODER PERFORMANCE
The decoder was tested using different frame sizes ranging from 128 Bytes to 8064 Bytes. This allows us to compare the performance for small and large frames. As noted in Section V the decoder looped the received data through the single processor until either the CRC was evaluated correctly or the maximum number of iterations was reached. For all three packet sizes the FER, average number of iterations, and average throughput were determined. The termination sequence was accounted for when determining the $E_b/N_0$ by using (2). As discussed in Section III this sequence has a large impact at small frame sizes. However, at very large packet sizes its impact is minimal.

Fig. 3. The FER performance of the decoder and several LDPC-BCs for a variety of frame sizes. [Horizontal axis: $E_b/N_0$ (dB). Curves: 2048 1x100 at 128B, 256B, 640B, 1280B and 8064B; (1000,3,6), (10000,3,6) and (100000,3,6) LDPC-BCs.]

In Fig. 3 the error correcting capabilities of the LDPC-CC can be seen. As long as a certain threshold in $E_b/N_0$ is reached, small increases in signal power lead to a large drop in FER. When the frame size is small, the overhead of termination has a large impact on the $E_b/N_0$, since the effective rate of the code is much less than 1/2. At very large frame sizes, this overhead is much smaller.

The simulated performance of three LDPC-BCs is also plotted. Our decoder significantly outperforms the (1000,3,6) LDPC-BC for most frame sizes. Also, for the largest frame size our decoder outperforms the (10000,3,6) LDPC-BC and performs within 0.15 dB of the (100000,3,6) LDPC-BC at a FER of $1 \times 10^{-3}$. This is a significant result since we conjecture our decoder is much simpler to implement than such large block-code decoders. We also conjecture that, based on the slopes of the curves, the 8064B case will outperform the (100000,3,6) LDPC-BC at higher $E_b/N_0$.

Fig. 4. The average number of processor iterations for a variety of frame sizes. [Horizontal axis: $E_b/N_0$ (dB).]

Fig. 4 shows that the average number of iterations through the single processor in the decoder reduces as the $E_b/N_0$ is increased. This has an impact on the throughput, since we can proceed to the next frame in a much shorter time (Fig. 5). The maximum throughput occurs when the frame size is large and the $E_b/N_0$ is high. In this case we can attain over 6 MBPS. Note that as the frame size is reduced, the potential throughput of the decoder is also decreased. This is due to the fact that the overhead of termination is a fixed number of bits and hence begins to dominate at small frame sizes.

Fig. 5. The average throughput for a variety of frame sizes. [Horizontal axis: $E_b/N_0$ (dB).]

VIII. CONCLUSIONS
In this paper we have presented an LDPC-CC decoder suitable for decoding codes with large memory. These codes can be applied where strong error correction is required and where throughput and latency are less of an issue. Such applications include data transmission on mobile telephony networks and space and satellite communications. The decoder can be efficiently implemented on a fraction of a modern FPGA and utilizes its on-chip memory. We propose that an ASIC implementation of this decoder could yield a very low power design. This is the subject of ongoing work.

We showed that an implementation of this code is capable of decoding 8064 Byte frames with an FER of about $1 \times 10^{-3}$ and a throughput of 1 MBPS at an $E_b/N_0$ of 1.37 dB. This includes the additional overhead of frame termination required by the LDPC-CC. However, the results do suggest that, to achieve optimum results, small frames should be avoided whenever possible.

IX. ACKNOWLEDGMENTS
This work was supported in part by an NSERC Discovery Grant, SRC Grant 1170.001, NSF Grant CCR02-05310 and NASA Grant NNG05GH73G.

REFERENCES
[1] C. Schlegel and L. Perez, Trellis and Turbo Coding. IEEE, 2004.
[2] S. Y. Chung, G. D. Forney, T. J. Richardson, and R. Urbanke, "On the design of low-density parity-check codes within 0.0045dB of the Shannon limit," IEEE Comm. Lett., vol. 5, no. 2, pp. 58-60, 2001.
[3] A. Jimenez-Felstrom and K. Sh. Zigangirov, "Time-varying periodic convolutional codes with low-density parity-check matrix," IEEE Trans. Information Theory, vol. 45, no. 6, pp. 2181-2191, September 1999.
[4] R. Tanner, D. Sridhara, A. Sridharan, T. Fuja, and D. Costello Jr., "LDPC block and convolutional codes based on circulant matrices," IEEE Trans. Information Theory, vol. 50, no. 12, pp. 2966-2984, December 2004.
[5] S. Bates and G. Block, "A memory based architecture for low-density parity-check convolutional decoders," in Proceedings of the IEEE Symposium on Circuits and Systems (ISCAS), Kobe, Japan, May 2005.
[6] Z. Chen, S. Bates, and X. Dong, "Low-density parity-check convolutional codes applied to packet based communication systems," in Proceedings of IEEE Global Telecommunications Conference (GLOBECOM), November 2005.
[7] S. Bates, D. Elliot, and R. Swamy, "Termination sequence generation circuits for low-density parity-check convolutional codes," Submitted to Circuits and Systems I, IEEE Transactions on, March 2005.
[8] A. Sridharan and D. Costello Jr., "A new construction method for low density parity check convolutional codes," in Proceedings of The IEEE Information Theory Workshop, October 2002, p. 212.
[9] R. Swamy, S. Bates, and T. L. Brandon, "Architectures for ASIC implementations of low-density parity-check convolutional encoders and decoders," in Proceedings of the IEEE Symposium on Circuits and Systems (ISCAS), Kobe, Japan, 2005.
[10] S. Howard, C. Schlegel, and V. Gaudet, "A degree-matched check node approximation for LDPC decoding," in Proceedings of the IEEE Intl. Symposium on Information Theory, ISIT, Adelaide, Australia, September 2005.
