
Background:

A Viterbi Decoder uses the Viterbi algorithm for decoding a bit stream that has been
encoded using forward error correction based on a convolutional code. The Viterbi
algorithm was developed by Andrew J. Viterbi in 1967. While working as a professor
teaching digital communications and information theory at the UCLA School of
Engineering and Applied Sciences, Viterbi wrote his most famous paper, "Error Bounds
for Convolutional Codes and an Asymptotically Optimum Decoding Algorithm," published
in IEEE Transactions on Information Theory in 1967. It was decades ahead of its time.
Computers then weren't advanced enough to apply the algorithm to decoding problems,
and it didn't find widespread application until digital and wireless communications hit the
airwaves.

Today, the Viterbi algorithm is widely used in error-correcting codes in cell phones,
dial-up modems, satellite and deep-space communications, 802.11 wireless LANs, speech
recognition systems, magnetic disk drives, and DNA research. Figure 1 shows the typical
operation of a Viterbi Decoder during wireless transmission.

Figure 1: Wireless Transmission of a Signal

Figure 2 shows the block diagram of a hardware implementation of a Viterbi Decoder
circuit. It consists of a Branch Metric Calculator (BMC), a Path Metric Unit (PMU), a
Traceback Unit (TBU), and a First-In Last-Out (FILO) buffer.

Figure 2: Block Diagram of the Viterbi Decoder[10]


Convolutional Encoder:
A Convolutional Encoder is a finite state machine and can be represented by the state
diagram shown in Figure 3. There are four possible states (00, 01, 10, and 11) that can be
reached in the Convolutional Encoder. The arrows in Figure 3 represent the transitions
from state to state. The numbers associated with each arrow are the input to transition
from one state to the next and the corresponding encoder output. They are in the form
OUTPUT/INPUT.

Figure 3: State Diagram[1]

A trellis diagram is a way to represent a state diagram over time. Figure 4 shows the
trellis diagram of the state diagram in Figure 3. There is one column of four dots for the
initial state of the encoder (t=0) and one for each time instant during the message. The
solid lines connecting dots in the diagram represent possible state transitions when the
input bit is a one. The dotted lines represent state transitions when the input bit is a zero.
These state transitions are summarized in Table 1.

Figure 4: Radix-2 Trellis Diagram[1]

                 Next State, If
Current State    Input = 0    Input = 1
00               00           10
01               00           10
10               01           11
11               01           11
Table 1: State Transition Table

The expanded version of the transition from one time instant to the next is shown
below (Figure 5). The two-bit numbers labeling the lines are the corresponding
Convolutional Encoder channel symbol outputs, or simply, the outputs of the
convolutional encoder. These channel symbol outputs are summarized in Table 2.

Figure 5: Convolutional Encoder Diagram[1]

                 Output Symbols, If
Current State    Input = 0    Input = 1
00               00           11
01               11           00
10               10           01
11               01           10
Table 2: Output Table

Convolutionally encoding the data is accomplished by using a 2-bit shift register (two D
flip-flops) and 2 exclusive OR gates, which perform modulo-two addition. In this
encoder, the input bit is stable during the encoder cycle. The encoder cycle starts when an
input clock rising edge occurs. When the input clock edge occurs, the output of the left-hand
flip-flop is clocked into the right-hand flip-flop, the previous input bit is clocked
into the left-hand flip-flop, and a new input bit becomes available. Then the outputs of the
upper and lower modulo-two adders become stable. The output selector (SEL A/B block)
cycles through two states: in the first state, it selects and outputs the output of the upper
modulo-two adder (MSB); in the second state, it selects and outputs the output of the
lower modulo-two adder (LSB).

The code rate of this encoder is 1/2, meaning that for every 1 bit input into the encoder,
the output is 2 bits. Our implementation of this is shown in the circuit in Figure 7.
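
The shift-register behaviour described above can be sketched in a few lines of Python. This is only an illustrative model, not the circuit in Figure 7: the function names are hypothetical, and the adder taps (upper adder = input XOR both flip-flops, lower adder = input XOR right-hand flip-flop) are an assumption inferred from Table 2.

```python
def step(state, bit):
    """One encoder cycle. `state` packs (left FF, right FF) as a 2-bit int.

    Assumed taps, inferred from Table 2:
      upper adder (MSB) = bit ^ left ^ right
      lower adder (LSB) = bit ^ right
    """
    left, right = state >> 1, state & 1
    symbol = ((bit ^ left ^ right) << 1) | (bit ^ right)
    next_state = (bit << 1) | left      # input -> left FF, left FF -> right FF
    return next_state, symbol


def encode(bits):
    """Rate-1/2 encoding: two output bits (MSB first) for every input bit."""
    state, out = 0, []
    for bit in bits:
        state, symbol = step(state, bit)
        out += [symbol >> 1, symbol & 1]
    return out


if __name__ == "__main__":
    print(encode([1, 0, 1, 1]))
```

Stepping every state with both input values reproduces the entries of Table 1 and Table 2 under these assumed taps.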

Figure 6: Convolutional Encoder Diagram[1]

Figure 7: Convolutional Encoder Schematic

Adding Error:
The Add Error block was designed to add errors to the encoded bit stream before it
reaches the decoder. To begin, we simply decided to flip two of the bits in the transfer
from the encoder to the decoding unit. To do this we installed a 2-1 multiplexer with an
inverter inline with the serial bit input. An inverter was also added after the multiplexer
to compensate for the fact that the multiplexers are inverting.

Timing logic was also designed to control this multiplexer. This control allows the
multiplexer to select the inverted input for one half of the main clock cycle. This is equal
to flipping one encoded bit because we get two encoded bits per main clock cycle. A set of
multiplexers and D flip-flops was used as a delay. We chose to flip two bits at about the
fourth and eighth clock cycles after the reset signal goes high. This delay can be changed
by altering the number of registers, as we did for the waveform in Figure 9. The registers
appeared to have high outputs when left uninitialized, so multiplexers were added to each
register to load low values into the registers. The reset signal generates one high value,
which controls the first multiplexer. A ripple-effect delay is then used, and the signal is
put into an AND gate with the clock to enable the multiplexer to invert a bit. Figure 8
shows the schematic of the Add Error circuit and Figure 9 shows the waveforms before
and after the Add Error block.
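
In software, the effect of the Add Error block amounts to inverting selected bits of the encoded stream. The sketch below is a behavioural stand-in only; the function name and the example positions are hypothetical and do not model the multiplexer timing logic.

```python
def add_error(encoded_bits, flip_positions):
    """Return a copy of the encoded stream with the listed bit positions inverted."""
    flips = set(flip_positions)
    return [bit ^ 1 if i in flips else bit for i, bit in enumerate(encoded_bits)]


if __name__ == "__main__":
    stream = [1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1]
    # Two encoded bits per main clock cycle, so cycle k starts at index 2*k.
    # Positions 8 and 16 (cycles 4 and 8) are illustrative choices only.
    print(add_error(stream, [8, 16]))
```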

Figure 8: Adding Error Schematic

Figure 9: Adding Error Waveforms

Branch Metric Calculator:
The BMC is the first block of the actual Viterbi Decoder. A BMC, also known as the
Hamming Distance Calculator, is used to calculate the branch metrics of a current state.
A branch metric can be defined as the cost or distance from one state to another.
For example:

If the current state = 00:

Distance(00, 00) = 00 = 0
Distance(00, 01) = 01 = 1
Distance(00, 10) = 01 = 1
Distance(00, 11) = 10 = 2

The input to the BMC is a stream of bits from the Convolutional Encoder. The output of
the BMC is four 2-bit numbers, consisting of the MSB and LSB of each state's (00, 01, 10,
11) branch metric. Figure 10 shows a block diagram of the BMC.

Figure 10: Block Diagram of BMC

Table 3 shows the truth table used to build the logic of the BMC. Figure 11 shows the
completed BMC circuit.

Input    Compare    Output (MSB, LSB)
00       00         00
01       00         01
10       00         01
11       00         10
00       01         01
01       01         00
10       01         10
11       01         01
00       10         01
01       10         10
10       10         00
11       10         01
00       11         10
01       11         01
10       11         01
11       11         00
Table 3: BMC Truth Table
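
Each row of Table 3 is the Hamming distance between the 2-bit input symbol and the 2-bit compare value. A minimal Python sketch of that calculation follows; the function names are illustrative and not taken from the schematic.

```python
def hamming2(a, b):
    """Hamming distance between two 2-bit values (0..3)."""
    diff = a ^ b
    return (diff >> 1) + (diff & 1)


def branch_metrics(symbol):
    """Return [bm1, bm2, bm3, bm4]: distances to compare values 00, 01, 10, 11."""
    return [hamming2(symbol, compare) for compare in range(4)]


if __name__ == "__main__":
    for symbol in range(4):
        print(format(symbol, "02b"), branch_metrics(symbol))
```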

Figure 11: BMC Schematic

Add, Compare, Select:


Since we are using 4 states to encode our signal, 4 ACS (Add, Compare, Select) units are
needed in order to keep track of the state metrics. A state metric can be defined as the
accumulated error metric, or the total branch metric through a path of the trellis
diagram. The algorithm updates the state metrics using an add-compare-select recursion:
the branch metrics are added to the state metrics of the previous time instant, and the
smaller of the two sums is selected to be the new state metric for each state. The
equations are as follows:

ACS00: sm1_n = min(sm1_{n-1} + bm1, sm2_{n-1} + bm4)    Eq. 1
ACS01: sm2_n = min(sm3_{n-1} + bm3, sm4_{n-1} + bm2)    Eq. 2
ACS10: sm3_n = min(sm1_{n-1} + bm4, sm2_{n-1} + bm1)    Eq. 3
ACS11: sm4_n = min(sm3_{n-1} + bm2, sm4_{n-1} + bm3)    Eq. 4

where:
bm1 = branch metric for state 00
bm2 = branch metric for state 01
bm3 = branch metric for state 10
bm4 = branch metric for state 11
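
As a quick check of the recursion, Eq. 1 through Eq. 4 can be written out directly in Python. This is a behavioural sketch with an illustrative function name; it ignores the 8-bit width of the hardware adders.

```python
def acs_step(sm, bm):
    """Apply Eq. 1-4 once.

    sm = [sm1, sm2, sm3, sm4]: previous state metrics for states 00, 01, 10, 11
    bm = [bm1, bm2, bm3, bm4]: branch metrics for the current received symbol
    """
    sm1, sm2, sm3, sm4 = sm
    bm1, bm2, bm3, bm4 = bm
    return [
        min(sm1 + bm1, sm2 + bm4),  # Eq. 1, ACS00
        min(sm3 + bm3, sm4 + bm2),  # Eq. 2, ACS01
        min(sm1 + bm4, sm2 + bm1),  # Eq. 3, ACS10
        min(sm3 + bm2, sm4 + bm3),  # Eq. 4, ACS11
    ]


if __name__ == "__main__":
    # Example: received symbol 11 gives branch metrics [2, 1, 1, 0].
    print(acs_step([0, 3, 2, 3], [2, 1, 1, 0]))
```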

The ACS consists of two 8-bit Carry Look-Ahead (CLA) adders, an 8-bit subtractor, three
inverter blocks, two 16-8 MUX blocks, and an 8-bit SR Register block. The CLA was
chosen for its speed. Looking at the ACS00 equation:

sm1_n = min(sm1_{n-1} + bm1, sm2_{n-1} + bm4)

There are two adders in this unit. The first adder adds the state metric of state 00 (sm1)
to the branch metric of state 00 (bm1). The second adder adds the state metric of state
01 (sm2) to the branch metric of state 11 (bm4). Another CLA is used to act as a
subtractor by inverting one of the inputs and setting the carry-in to one. This implements
a two's complement subtraction. The carry-out of the subtractor tells us which of
the two numbers is smaller. This output acts as the select for the 16-8 MUX block and
allows the smaller number through. Since the MUXes in the OSU Standard Library are
naturally inverting, an inverter block is needed to invert the output back to its original
state. This value needs to be stored in each ACS unit until the next clock cycle, when it
is fed back into the appropriate ACS units. The functionality of the SR Register block
is explained below.
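
The compare step can be illustrated with a small Python model of an adder used as a subtractor. This is a sketch under assumptions: the function name is hypothetical, and which operand is inverted (and therefore the polarity of the carry-out) is chosen here for illustration and may differ from the schematic.

```python
def select_smaller(a, b, width=8):
    """Compare two unsigned values with an adder wired as a subtractor.

    Computes a + (~b) + 1, i.e. a - b in two's complement. With this
    (assumed) wiring the carry-out is 1 when a >= b and 0 when a < b.
    """
    mask = (1 << width) - 1
    total = a + (~b & mask) + 1   # invert b, carry-in = 1
    carry_out = total >> width
    smaller = b if carry_out else a   # carry-out drives the MUX select
    return smaller, carry_out


if __name__ == "__main__":
    print(select_smaller(5, 3))
    print(select_smaller(3, 5))
```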

Because the ACS uses so many components, the critical path can be found here. The
critical path is through an adder, inverter block, subtractor, MUX block, inverter block,
and an SR Register block, followed by the MUX and inverter block that come after the register.

Stage          Gates on the path
Adder          2 XOR, 8 AND, 8 OR
Inverter       1 INV
Subtractor     2 XOR, 8 AND, 8 OR
MUX            1 MUX
Inverter       1 INV
SR Register    1 INV, 1 DFFSR
MUX            1 MUX
Inverter       1 INV

Using the worst case delays for each of these gates given in the functional description of
each gate provided by the OSU standard libraries:

AND   = 0.11 ns
INV   = 0.048 ns
MUX   = 0.13 ns
OR    = 0.12 ns
XOR   = 0.12 ns
DFFSR = 0.43 ns

The total time to traverse the critical path is then 5.042 ns. This critical path limits the
frequency of the clock. Using this time, our circuit should theoretically run at 198 MHz
(1 / 5.042 ns) without failing. Remember that this is the worst-case critical path time and
only an estimate.
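
The 5.042 ns figure can be reproduced by summing the gate delays stage by stage. The short Python check below uses the delays listed above and the gate counts from the critical-path table, assuming one INV and one DFFSR in the SR Register stage; the variable names are illustrative.

```python
DELAY_NS = {"AND": 0.11, "INV": 0.048, "MUX": 0.13,
            "OR": 0.12, "XOR": 0.12, "DFFSR": 0.43}

# (stage name, gates traversed in that stage)
CRITICAL_PATH = [
    ("Adder",       {"XOR": 2, "AND": 8, "OR": 8}),
    ("Inverter",    {"INV": 1}),
    ("Subtractor",  {"XOR": 2, "AND": 8, "OR": 8}),
    ("MUX",         {"MUX": 1}),
    ("Inverter",    {"INV": 1}),
    ("SR Register", {"INV": 1, "DFFSR": 1}),
    ("MUX",         {"MUX": 1}),
    ("Inverter",    {"INV": 1}),
]


def stage_delay(gates):
    """Worst-case delay of one stage in nanoseconds."""
    return sum(DELAY_NS[gate] * count for gate, count in gates.items())


total_ns = sum(stage_delay(gates) for _, gates in CRITICAL_PATH)
max_clock_mhz = 1e3 / total_ns   # 1 / (t in ns) expressed in MHz

if __name__ == "__main__":
    print(f"critical path: {total_ns:.3f} ns, max clock: {max_clock_mhz:.1f} MHz")
```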

The schematic for the ACS is shown in Figure 12.

Figure 12: ACS Schematic

An eight-bit register for accumulating values in the ACS was implemented. This register
had two purposes. The first was the capability to store the current state metric for one
clock cycle. The second purpose was the ability to clear the register at the beginning of
the input bit stream. To achieve this functionality a D flip-flop SR Register was used.
The D input and clock were used to load values and the Reset was used to clear all of the
registers. The set and reset are active low inputs. The set input was connected to VDD
because we had no use for it. The reset was inverted to create an active high input for the
reset to occur. The schematic for the SR-Register block is shown in Figure 13.

S    R    Qnext    Action
0    0    0        Hold State
0    1    1        Hold State
1    0    1        Toggle
1    1    0        Toggle
Table 4: SR Flip-Flop Truth Table

Figure 13: SR Flip-Flop Block Schematic

After the SR Register block in the ACS, there is a 16-8 MUX and an Inverter block. The
MUX block selects between the lowest state metric determined by the other components
of the ACS and all 1s. If the select on the MUX is 0, the lowest state metric is passed
through. If the select on the MUX is 1, the state metric output of that ACS is driven high.
This is needed because, at the beginning of an input bit stream, some of the states are not
valid until a certain number of clock cycles have elapsed. These scenarios are
discussed in the next section.

Reset Signals:
By examining the trellis diagram in Figure 14, we see that at t=0, state 00 is the only
valid state. This means that the following equation:

sm1_n = min(sm1_{n-1} + bm1, sm2_{n-1} + bm4)    Eq. 1

must always select sm1_{n-1} + bm1 as the smallest state metric because the other
state metric is not valid. This is accomplished by driving all state metric values except
that of state 00 high before t = 0. Driving these values high makes sm2_{n-1} high, which
causes ACS 00 to select sm1_{n-1} + bm1.

At t=1, states 00 and 10 are the only valid states. This means that the following equations:

sm1_n = min(sm1_{n-1} + bm1, sm2_{n-1} + bm4)    Eq. 1
sm3_n = min(sm1_{n-1} + bm4, sm2_{n-1} + bm1)    Eq. 3

must always select the state metric sm1_{n-1} + bm1 in Eq. 1 and sm1_{n-1} + bm4 in
Eq. 3 because sm2_{n-1} is not a valid state metric. To ensure that ACS 00 and ACS 10
always select the sum containing sm1_{n-1} as the smallest state metric, sm2_{n-1} is
temporarily set high.

At t=2, all states are now valid. This means that all state metric values are valid and can
be selected as the lowest in their respective equations (Eq. 1 through Eq. 4).
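
The start-up behaviour follows from which states feed each ACS unit in Eq. 1 through Eq. 4. The Python sketch below tracks only validity (not metric values), starting from state 00 as the sole valid state; the function name is illustrative.

```python
def valid_states_over_time(steps):
    """Track which of the states [00, 01, 10, 11] are reachable at each time.

    A state becomes valid when at least one of its two predecessor states
    (the ones appearing in its ACS equation) was valid one step earlier.
    """
    valid = [True, False, False, False]   # t = 0: only state 00
    history = [valid]
    for _ in range(steps):
        s00, s01, s10, s11 = valid
        valid = [
            s00 or s01,   # state 00 is fed by 00 and 01 (Eq. 1)
            s10 or s11,   # state 01 is fed by 10 and 11 (Eq. 2)
            s00 or s01,   # state 10 is fed by 00 and 01 (Eq. 3)
            s10 or s11,   # state 11 is fed by 10 and 11 (Eq. 4)
        ]
        history.append(valid)
    return history


if __name__ == "__main__":
    for t, valid in enumerate(valid_states_over_time(2)):
        print(f"t={t}:", valid)
```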

Figure 14: Trellis Diagram from t=0 to t=3[1]

The Reset signals are left up to the user to input into the Viterbi Decoder. The first reset
signal needs to be applied at the same time the input bit stream is applied, and it
should be high for one period of the clock. The second reset signal is just the first reset
signal delayed by one clock cycle. The third reset signal is reset signals 1 and 2 ORed
together to produce a high signal for two clock cycles. Figure 15 shows the reset signal
waveforms.
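
Sampled once per clock cycle, the three reset signals relate to each other as in the following Python sketch. The function name is hypothetical, and the lists stand for per-cycle logic levels rather than the actual waveforms in Figure 15.

```python
def reset_signals(n_cycles):
    """Return (reset1, reset2, reset3) as per-cycle lists of 0/1 levels."""
    reset1 = [1] + [0] * (n_cycles - 1)                # high for the first clock period
    reset2 = [0] + reset1[:-1]                         # reset1 delayed by one cycle
    reset3 = [a | b for a, b in zip(reset1, reset2)]   # high for two cycles
    return reset1, reset2, reset3


if __name__ == "__main__":
    for name, signal in zip(("reset1", "reset2", "reset3"), reset_signals(6)):
        print(name, signal)
```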

Figure 15: Reset Signal Waveforms

Traceback: Every clock cycle, the four state metric values accumulated by the
ACS blocks are stored into memory. This memory is part of the Traceback block. There
are 32 registers, each of which stores the four state metric values from the ACS blocks.
Although the ACS produces 8-bit values, we noticed that the accumulated values did not
need all eight bits, so we implemented the memory using only 4 bits for each state metric.
In order to properly correct for errors during transmission, all the state metric values
must be computed and stored. Once all four state metrics for the 16 inputs have been
computed, the traceback can begin.
In the traceback, we start at the last four state metrics, and the lowest of the four is
selected. If there is a tie, the top state, that is, the state closest to 00, is selected. From
that state, only two other states can be a possible transition (Figure 16). The lowest state
metric value of these two possible states is selected. Every time a low state metric value
is found, the corresponding state that produced that metric is saved for one clock cycle.
Depending on the next lowest state, a transition (1 or 0) from the current state to the
saved state can be found. That is the decoded output bit.
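
The procedure described above can be sketched end to end in Python. This is a simplified software model, not the register-level design: the function names are illustrative, the encoder taps are inferred from Table 2, and `BIG` is a stand-in for the "high" value given to states that are not yet valid. The sketch is only checked here on an error-free stream.

```python
BIG = 10**6  # stand-in for the "high" metric of not-yet-valid states

# PREDECESSORS[s] lists the two states that can transition into state s.
PREDECESSORS = {0b00: (0b00, 0b01), 0b01: (0b10, 0b11),
                0b10: (0b00, 0b01), 0b11: (0b10, 0b11)}


def encode(bits):
    """Rate-1/2 encoder returning one 2-bit symbol per input bit."""
    left = right = 0
    symbols = []
    for b in bits:
        symbols.append(((b ^ left ^ right) << 1) | (b ^ right))
        left, right = b, left
    return symbols


def state_metrics(symbols):
    """Run the ACS recursion (Eq. 1-4) and store the metrics of every stage."""
    sm = [0, BIG, BIG, BIG]   # only state 00 is valid at t = 0
    stored = []
    for sym in symbols:
        bm = [bin(sym ^ compare).count("1") for compare in range(4)]
        sm = [min(sm[0] + bm[0], sm[1] + bm[3]),
              min(sm[2] + bm[2], sm[3] + bm[1]),
              min(sm[0] + bm[3], sm[1] + bm[0]),
              min(sm[2] + bm[1], sm[3] + bm[2])]
        stored.append(sm)
    return stored


def traceback(stored):
    """Walk the stored metrics from the last stage back to the first."""
    # Lowest of all four at the last stage; min() keeps the first minimum,
    # so ties go to the state closest to 00.
    state = min(range(4), key=lambda s: stored[-1][s])
    reversed_bits = [state >> 1]            # decoded bit = MSB of the state
    for metrics in reversed(stored[:-1]):
        state = min(PREDECESSORS[state], key=lambda s: metrics[s])
        reversed_bits.append(state >> 1)
    return reversed_bits[::-1]              # undo the reversal (FILO role)


if __name__ == "__main__":
    message = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0, 0, 1, 0, 1]
    print(traceback(state_metrics(encode(message))) == message)
```

Choosing the predecessor by its stored state metric mirrors the description in this report; other traceback designs store the ACS decisions instead of the metrics.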

Figure 16: Transition Diagram (solid line = 1, dashed line = 0)[1]

Figure 17: Traceback Schematic

There are two rows of 16 registers in the Traceback unit shown in Figure 17. The input
into the Viterbi Decoder is 16-bits, so for the first 16 clock cycles, the state metric values
from the ACS block are stored in the first 16 memory blocks on the top row starting from
left to right. On the seventeenth clock cycle, new data needs to be decoded. These new
state metric values are then stored in the second row of 16 memory blocks starting from
left to right. At the same time these new state metrics are being stored, the first input
sequence is ready to be decoded.
The last state metric values stored (register 16 on the top row) are output to a bus.
The lowest state metric is determined and that state is saved. While this is happening, the
first state metric value of the next input sequence is being stored in register 18 (bottom
row, second in). On the next clock cycle, the second-to-last state metric value is output
to the bus. Depending on the previously saved state, the next lowest possible state metric
is determined. From these two states, a decoded bit can be determined (Table 5). This
process is repeated until all 16 bits have been decoded. New state metric values are then
written to the top row of registers while the bottom row is being decoded.

Previous State    Current State    Output
PV1   PV0         S1   S0
0     0           0    0           0
1     0           0    0           1
0     0           0    1           0
1     0           0    1           1
0     1           1    0           0
1     1           1    0           1
0     1           1    1           0
1     1           1    1           1
Table 5: Traceback Truth Table

As you can see from the table, the decoded output is just equal to PV1.

In order to determine which state has the lowest state metric, a Select Low State block
needs to be implemented.

When the traceback first starts, the lowest of all four states' state metric values needs
to be found. This is done by using three comparators and two MUX blocks. Each comparator
uses the bout bit of an adder to determine which state metric is larger. That bout bit is
then the select input to a MUX, which passes the smaller of the two. State metrics 11 and
10 are compared, and state metrics 01 and 00 are compared. The smaller numbers of the two
groups are then compared. This bout bit, along with the first two, is an input into some
logic that determines which state was the smallest. The schematic for this circuit is shown
in Figure 18.
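
The three-comparator arrangement can be modelled in Python as a two-level tournament. This is a behavioural sketch with illustrative names; strict less-than comparisons are an assumption chosen so that ties resolve toward the state closest to 00, as stated in the Traceback section.

```python
def select_low_of_4(sm00, sm01, sm10, sm11):
    """Return (state, metric) of the lowest state metric among four states."""
    # Comparator 1: state 11 against state 10.
    hi_state, hi_val = (0b11, sm11) if sm11 < sm10 else (0b10, sm10)
    # Comparator 2: state 01 against state 00.
    lo_state, lo_val = (0b01, sm01) if sm01 < sm00 else (0b00, sm00)
    # Comparator 3: the two winners against each other.
    return (hi_state, hi_val) if hi_val < lo_val else (lo_state, lo_val)


if __name__ == "__main__":
    print(select_low_of_4(3, 1, 2, 1))
```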

Figure 18: Select Low State of 4 Schematic

Once the lowest of the four state metric values is determined, the next lowest state
metric depends only on what the previous lowest state was. Another Select Low State
block was therefore needed, one that uses feedback (the previous state) and selects the
lowest state metric of only the two possible states. This was implemented using the block
diagram in Figure 19.

Figure 19: Select Low State of 2 Block Diagram

Figure 20: Select Low State of 4 Schematic

FILO Buffer:
The last part of the Viterbi Decoder is the First-In Last-Out (FILO) buffer. The Viterbi
Decoder must output the same 16-bit stream as the input 16-bit stream. The bits are
decoded from the end of the stream to the beginning, so a decoded bit cannot be output
right away or else the output would be in reverse order. The bits need to be stored in a
16-bit register, a FILO. The last bit into the register is output first. The first bit
into the register is output last. The schematic of the FILO is shown below in Figure 21.
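
Functionally, the FILO behaves like a fixed-depth stack. The Python sketch below illustrates the reordering only; the class and function names are hypothetical and the model says nothing about the register-level timing.

```python
class FiloBuffer:
    """First-in last-out buffer with a fixed depth (16 bits in this design)."""

    def __init__(self, depth=16):
        self.depth = depth
        self._bits = []

    def push(self, bit):
        if len(self._bits) >= self.depth:
            raise OverflowError("FILO is full")
        self._bits.append(bit)

    def pop(self):
        return self._bits.pop()   # the most recently stored bit comes out first


def reverse_block(bits, depth=16):
    """Push a block of traceback bits, then read them back in message order."""
    filo = FiloBuffer(depth)
    for bit in bits:
        filo.push(bit)
    return [filo.pop() for _ in range(len(bits))]


if __name__ == "__main__":
    print(reverse_block([1, 1, 0, 1, 0]))
```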

Figure 21: FILO Schematic

Overall System:
Before we began designing and testing the components of the Viterbi Decoder described
above, we created the flow diagram (Figure 22) of the entire circuit. This diagram shows
how the different blocks are connected to one another, the capacity of the connections,
and the inputs and outputs of each block.

Figure 22: Flow Diagram

When each block was designed, built, and tested, this flow diagram made the connection
from block to block rather straightforward. Figure 23 shows the completed schematic
circuit of the Viterbi Decoder.

Figure 23: Viterbi Decoder Schematic

Figure 24: Viterbi Decoder Schematic Waveforms

Figure 24 shows the waveforms of the Viterbi Decoder. The decoded signal, input signal,
and the clock are shown. After 48 clock cycles, the output becomes valid data. This
delay comes from the 16 clock cycles to store the state metric values in memory, the 16
clock cycles to decode the bits from bit 15 down to bit 0, and the 16 clock cycles to
reverse the decoded bits in the FILO.

The input signal to our Viterbi Decoder had to be variable, meaning not just alternating
ones and zeros, but a random combination of the two. To do this we used a pwlFile and a
pwlf source. The pwlFile consisted of points in time, each associated with a voltage level.
The pwlFile was then pointed to using a pwlf source. Figure 25 shows an example of a
pwlFile we used.
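
A file of time and voltage points like this can be generated from a bit list with a short script. The Python sketch below is illustrative only: the clock period, supply voltage, edge time, and the "time voltage" line layout are all assumptions, not values or a format taken from our pwlFile.

```python
def pwl_points(bits, period=10e-9, vdd=3.3, edge=0.1e-9):
    """Return (time, voltage) points for a bit stream, one bit per period.

    All default values are hypothetical. A pair of points is emitted at each
    bit change so the waveform ramps over `edge` seconds.
    """
    points = [(0.0, vdd * bits[0])]
    for i in range(1, len(bits)):
        if bits[i] != bits[i - 1]:
            t = i * period
            points.append((t, vdd * bits[i - 1]))      # hold the old level
            points.append((t + edge, vdd * bits[i]))   # reach the new level
    return points


def pwl_text(points):
    """Format the points as one 'time voltage' pair per line (assumed layout)."""
    return "\n".join(f"{t:.4e} {v:.3f}" for t, v in points)


if __name__ == "__main__":
    print(pwl_text(pwl_points([0, 1, 1, 0, 1])))
```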

Figure 25: Example pwlFile

Transistors:
Total # of Transistors = 24,350

Convolutional Encoder:
# of Transistors = 144
% of Total = 144/24,350 = 0.59%

Add Error:
# of Transistors = 298
% of Total = 298/24,350 = 1.22%

BMC:
# of Transistors = 50
% of Total = 50/24,350 = 0.21%

ACS:
# of Transistors = 1,474 x 4 = 5,896
% of Total = 5,896/24,350 = 24.2%

Traceback:
# of Transistors = 16,960
% of Total = 16,960/24,350 = 69.7%

FILO:
# of Transistors = 896
% of Total = 896/24,350 = 3.7%

Clock Speed:
As mentioned in the ACS section of this report, our theoretical clock speed was 198MHz.
This value was calculated from our critical path through the ACS. The delays for each
gate were estimated using the worst case for each. Further testing of our completed
circuit showed not only that it ran without incident at 198 MHz, but also that we could
increase the clock frequency up to 400 MHz without degrading the performance of the
circuit.

Conclusion: This was a very interesting and challenging project. It took a handful of
hours for us to read background information on the Viterbi Decoder and to understand its
functionality. After this, we constructed a circuit for the Encoder, BMC, and the ACS.
After some efficient troubleshooting, these circuits worked as required. We next
constructed a Select-Low-State block. At first we had it take the lowest of the four
accumulated state values. We did not get the correct output. After more troubleshooting,
we realized we needed to take the lower of the next possible two states. Then we saw
that the most significant bit of the output of the Select-Low-State correctly matched the
input. This eliminated a decoding logic circuit we made for a Traceback.
The next task was to add error. We built a block to invert two separate encoded
bits. This introduced our error to the received signal. To our dismay, the output no longer
matched the input. Our next step was to do more research about the Traceback. Our
previous attempt output bits as we moved forward in the trellis. Our next attempt would
store all of the accumulated state metric values, select the lowest of the four at the last
trellis stage, and trace back to the beginning of the trellis while outputting the transition
bits. These bits would then go into a FILO to reverse their order. This circuit took a lot
of troubleshooting. Without much thought, we had attached inverters to the outputs of our
buffers to correct for the natural inversion. After some confusion, we realized that by
doing so we created a path to VDD or ground instead of the high impedance the buffer is
intended to have. This was an easy fix and also lowered our transistor count. We simply
took off all the inverters and put them only at the end of the bus line.
Even now our new Traceback was not working. We figured out that the outputs of
the ACS were not matching the values stored in our memory. They were connected by
straight wires, so we knew the problem was most likely in clock timing. We thought that
the memory was loading at the same time as the ACS registers. So, if the ACS is in
transition at the time the memory loads it will load incorrect values. To fix this we just
put a clock buffer delay going to the clock on the Traceback memory. This corrected the
problem and our new Traceback output once again matched the input.

Again, we tried adding error. The decoded output now matched the input. Figure
24 shows the waveforms of our Viterbi Decoder design and proves that we have
implemented a working 4-state Viterbi Decoder.

References:

[1] Fleming, Chip, "A Tutorial on Convolutional Coding with Viterbi Decoding."
Spectrum Applications. 2 November 2006.
<http://home.netcom.com/~chip.f/viterbi/tutorial.html>.

[2] G. D. Forney, Jr., "Convolutional Codes II: Maximum-Likelihood Decoding,"
Information and Control, vol. 25, June, 1974, pp. 222-226.

[3] K. S. Gilhousen et al., "Coding Systems Study for High Data Rate Telemetry Links,"
Final Contract Report, N71-27786, Contract No. NAS2-6024, Linkabit Corporation, La
Jolla, CA, 1971.

[4] J. A. Heller and I. M. Jacobs, "Viterbi Decoding for Satellite and Space
Communications," IEEE Transactions on Communication Technology, vol. COM-19,
October, 1971, pp. 835-848.

[5] K. J. Larsen, "Short Convolutional Codes with Maximal Free Distance for Rates 1/2,
1/3, and 1/4," IEEE Transactions on Information Theory, vol. IT-19, May, 1973, pp. 371-
372.

[6] Lin, Ming-Bo, "New Path History Management Circuits for Viterbi Decoders," IEEE
Transactions on Communications, vol. 48, October, 2000, pp. 1605-1608.

[7] J. P. Odenwalder, "Optimum Decoding of Convolutional Codes," Ph.D. Dissertation,
Department of Systems Sciences, School of Engineering and Applied Sciences,
University of California at Los Angeles, 1970.

[8] J.M. Rabaey, A. Chandrakasan, and B. Nikolic, Digital Integrated Circuits, 2nd ed.
Upper Saddle River, New Jersey: Pearson Educational International, 2003.

[9] A. J. Viterbi, "Error Bounds for Convolutional Codes and an Asymptotically
Optimum Decoding Algorithm," IEEE Transactions on Information Theory, vol. IT-13,
April, 1967, pp. 260-269.

[10] Viterbi Decoder. In Wikipedia, The Free Encyclopedia. Retrieved May 4, 2007 from
<http://en.wikipedia.org/wiki/Viterbi_decoder>.

