Beruflich Dokumente
Kultur Dokumente
Abstract
1. Introduction
Driven by overwhelming design-time constraints,
standard-cell based synchronous design styles
supported by mature CAD design tools and a largely
automated flow dominate the ASSP and ASIC market
places. As device feature sizes shrink and process
variability increases, however, the reliance on a
global clock becomes increasingly difficult, yielding
far-from-optimal solutions. Because standard-cell
designs use very conservative circuit families and are
often over-designed to accommodate worst-case
variations, the performance and power gap between
2. Background
This section begins by describing the STFB and
SSTFB templates, followed by necessary background
on Turbo decoding.
S0
R0
L0
S0
S1
A
B
keeper
keeper
S1
R1
L1
R0
R1
A
B
keeper
Decoded
Data
Outer
SISO
DeInterleaver
Interleaver
Inner
SISO
Received
Data
Buffer
t=1
t=2
F00
F01
F02
F10
F11
t=k
F0k
F12
F1k
t = k+1
t = K-2
B0k+1 B0K-2
B1k+1 B
K-2
t = K-1
t=K
B0K-1
B0K
B1K-1
B1K
i, j
bk =1
BM ki , j = SI bk .
(3)
(4)
B ki = min B kj+1 + BM ki , j
i, j
i, j
bk =0
(5)
(2)
bk =1
20
40
60
80
100
120
140
# of Parallel Processors
250
300
400
500
T=
f K
(8)
3K
I2 p +
8M
Pipeline
Overhead
Inner SISO
Processing
Pipeline
Overhead
K/4M
Outer SISO
Processing
K/8M
4. Asynchronous Turbo
This section covers the asynchronous Turbo
design, including a brief description of the overall
design flow, the library development, gate-level
design, and physical design.
1-bit
adder
1-bit
adder
1-bit
adder
1-bit
adder
1-bit
adder
1-bit
adder
1-bit
adder
1-bit
adder
Fork
Fork
Fork
Fork
Fork
Fork
Fork
Fork
1-bit
adder
1-bit
adder
1-bit
adder
1-bit
adder
Merge
Output
B1
S0
L0
R0
XO
M0
A1
A0
B0
B1
S1
L1
X1
M1
R1
A1
A0
B0
S0
S1
A0 M0
M1
X0
B0
X1
A1 R0
R1
B1
6. Preliminary Comparisons
The previous two sections described the
synchronous baseline and asynchronous designs in
detail and showed their post P&R results. This section
compares the performance, area, throughput/area,
energy and power/area results for the two designs.
Because the SSTFB standard cell library has only one
size per cell, the number we present are conservative
and may improve if we developed and used more sizes
per cell.
Block
Size
(bits)
Async T
Sync T
(Mbps)
(Mbps)
512
383
768
Sync
area
(mm2)
T/area
ratio
418
415
11
27.06
3.91
1024
438
440
14.76
2.13
2048
471
519
9.84
1.28
4096
490
513
7.38
1.03
Energy per
block
(sync)
Energy per
block
(async)
Ratio
768
3.5E-05
2.84E-05
0.81
1024
2.40E-05
3.63E-05
1.5
2048
2.714E-05
6.73E-05
2.5
4096
4.11E-05
12.9E-05
3.1
10
Acknowledgements
We would like to thank the anonymous reviewers for
helpful comments on earlier drafts of this work. This
work was partially supported by NSF ITR Award No.
CCR 0086036.
References
[1] M. Ferretti and P. A. Beerel. High Performance
Asynchronous Design Using Single-Track Full-Buffer
Standard Cells, IEEE Journal of Solid-State Circuits,
Vol. 41, No. 6, pp. 1444-1454, June 2006.
[2] M. Ferretti and P. A. Beerel. Single-Track
Asynchronous Pipeline Templates using 1-of-N
Encoding, DATE02, Mar. 2002.
[3] P. Golani and P. A. Beerel. Back-Annotation in HighSpeed Asynchronous Design, Journal of Low power
Electronics, Vol. 2, pp. 37-44, 2006.
[4] P. Golani and P. A. Beerel. High Speed Noise Robust
Asynchronous Circuits, ISVLSI06, March, 2006.
Pages: 173-178
[5] P. A Beerel, A. Lines, M. Davies, N.-H. Kim. Slack
matching asynchronous designs. ASYNC06, March
2006.
[6] M. Ferretti. Single-track Asynchronous Pipeline
Template, Ph.D. Thesis, University of Southern
California, Aug., 2004.
[7] P. A. Beerel and K. M. Chugg. A Low Latency SISO
with Application to Broadband Turbo Decoding, IEEE
Journal on Selected Areas in Communications, Vol.
19 Issue 5, May, 2001.
[8] R. Manohar and A. J. Martin. Slack Elasticity in
Concurrent Computing. Proceedings of the Fourth
11