Beruflich Dokumente
Kultur Dokumente
Outline:
Retiming
Retiming and Resynthesis (RnR)
Resynthesis of Pipelines
Various Goals:
Outputs
Retiming
Problem
Solutions
Circuit Represetation
[Leiserson, Rose and Saxe (1983)]
Circuit representation: G(V,E,d,w)
V set of gates
E set of wires
d(v) = delay of gate/vertex v, (d(v)0)
w(e) = number of registers on edge e, (w(e)0)
Circuit Representation
Example: Correlator
+
Host
(x, y) = 1 if x=y
0 otherwise
7
0
0
2
Graph (Directed)
Circuit
Operation delay
Preliminaries
For a path p :
e0
ek 1
e1
v0 v1 L vk 1 vk
d ( p ) d (vi )
(includes endpoints)
i 0
k 1
w( p) w(ei )
Clock cycle
i 0
c max {d ( p )}
p: w ( p ) 0
0
0
Path with
0
0
2
w(p)=0
3
For correlator c = 13
6
Basic Operation
Movement of registers from input to output of a gate or
vice versa
Retime by -1
Retime by 1
Basic Operation
Thus in the example, r(u) = -1, r(v) = -1 results
in
0
0
0
2
0
0
1
1
u
0
c max {d ( p )}
p: wr ( p ) 0
is minimized
Retiming: 2 important matrices
Register weight matrix
W (u , v) min{w( p) : u
p
v}
Delay matrix
D(u, v) max{d ( p ) : u
p
v, w( p )
w(u, v)}
9
V0 V1 V2 V3
V0 V1 V2 V3
W = register path
weight matrix
(minimum # latches
on all paths between
u and v)
D = path delay matrix
(maximum delay on
all paths between
u and v)
0222
0000
0200
0220
0
13
10
7
V0
V1
V2
V3
v0 0
0
2
V1
V0
V1
V2
V3
0
3
V2
3
3
13
10
6
6
3
13
13
13
10
7
k 1
1 wr ( p) wr (ei )
i 0
k 1
w( p ) r (vk ) r (v0 )
w( p ) r (v) r (u )
11
V0 V1 V2 V3
V0 V1 V2 V3
D>7:
Legal: r(u)-r(v)w(e) r(u)-r(v)W(u,v)-1
r (v0 ) r (v1 ) 2
r (v1 ) r (v2 ) 0
r (v1 ) r (v3 ) 0
r (v2 ) r (v3 ) 0
r (v3 ) r (v0 ) 0
r (v0 ) r (v3 ) 1
r (v1 ) r (v0 ) 1
r (v1 ) r (v3 ) 1
r (v2 ) r (v0 ) 1
r (v2 ) r (v1 ) 1
r (v2 ) r (v3 ) 1
r (v3 ) r (v1 ) 1
r (v3 ) r (v2 ) 1
V0
V1
V2
V3
0222
0000
0200
0220
v0 0
0
13
10
7
3 6 13
3 6 13
13 3 10
10 13 7
7
0
0
2
v1
V2
12
V0
V1
V2
V3
r (v0 ) r (v1 ) 2
r (v1 ) r (v2 ) 0
r (v1 ) r (v3 ) 0
r (v2 ) r (v3 ) 0
r (v3 ) r (v0 ) 0
r (v0 ) r (v3 ) 1
r (v1 ) r (v0 ) 1
r (v1 ) r (v3 ) 1
r (v2 ) r (v0 ) 1
r (v2 ) r (v1 ) 1
r (v2 ) r (v3 ) 1
r (v3 ) r (v1 ) 1
r (v3 ) r (v2 ) 1
r(0)
-1
-1
2
0
-1
r(2)
-1
1
1
0,-1
1
r(1)
0,-1
r(3)
0
Constraint graph
13
Retiming
To find the minimum cycle time, do a binary search among the
0
v0 0
v1
V0 V1 V2 V3
V0
V1
V2
V3
3
V2
Retime
Host
0222
0000
0200
0220
Retimed correlator:
Clock cycle
= 3+3+7=13
log V ))
0
2
V0 V1 V2 V3
0
13
10
7
3
3
13
10
6
6
3
13
13
13
10
7
Host
Clock cycle = 7
14
V0
V1
V2
V3
E log
log V ))
+1
Critical path
15
( w(e) r (v) r (u ))
e:u v
w(e)
eE
(r (v) r (u ))
e:u v
N (r (v) r (u ))
u v
N aV r (v )
vV
where av is a constant.
16
a r (v )
vV
17
Nave approach: carve out combinational regions, do optimization on each region. Only local
gains made.
Can we do any better?
18
c
1
0
2
1
i
internal
peripheral
19
Extended Retiming
Move register to the periphery
Negative edge-weights permitted
-1
negative
latch
Peripheral Retiming
A retiming is called a peripheral retiming if it results in
all internal edges having zero weight
Peripheral edges can have negative weight
1
k k
peripheral
weights
21
Peripheral Retiming
A circuit that undergoes peripheral retiming followed by
a legal retiming, i.e. one that results in all weights 0
is functionally equivalent to the original circuit
Functional equivalence: equivalence of finite state
automata:
22
c
1
c
0
Peripheral retiming
1
j
a 0
b
0
0
c 0
d
o1
o2
23
inputs
rows:
columns: outputs
1
i
Wij
o2
o1
i j
w(e)
o1
I 1
j 0
k *
no path i j
same weight for all paths
o2
*
~
0
24
Example:
[ 1 0 * ] o1
o2
[*~0]
1
i
[0**]
o2
*
~
0
( [ * 0 * ]+1) & [ * 0 0 ]
=[*1*]&[*00]
=[*~0]
[10*]
([ 0 * * ]+1) & ([ * 0 * ]+0)
=[1**]&[*0*]
=[10*]
o1
I 1
j 0
k *
[*0*]
[**0]
26
Computing ,
Each constraint of the form I + j = Wij
Procedure
1.
2.
3.
4.
set 1 = 0
use first row to generate s
determine s from s
check for consistency
o
0/2
Example:
c
1/0
2/0
o
1/1 i 2
j 3
1/0
i
1 = 0
2 = 1
1 = 2
27
Computing ,
Example:
i
j
a 0
b
o1 o2
i 0 0
j 0 1
0
0
c 0
d
1+ 1 = 0
o1
o2
2+ 1 = 0
1+ 2 = 0 2+ 2 = 1
-----------------------------1- 2 = 0
1- 2 = -1
contradiction
28
29
cut
circuits
Note that both of the cut circuits can be peripherally
retimed, unlike the original.
30
31
FSM Optimization
X
Y
OUT
A
Break feedback
Y
OUT
B
32
FSM Optimization
Peripheral retiming
combinational
Y
OUT
B
X
Resynthesize
Y
OUT
B
33
FSM Optimization
Retime
X
Y
OUT
B
Reconnect
OUT
B
34
FSM Optimization
Original circuit
OUT
A
Resynthesized circuit
OUT
A
35
Resynthesis of Pipelines
Goal: Performance optimization of pipeline
circuits
Example: Pipeline circuit C
In
Ii
I1
Cn
Ci
C1
On
Oi
O1
In
Peripheral
Retiming
Cn
-(n -1)
Ii
Ci
-(i-1)
I1
C1
On
n -1
Oi
i-1
Combinational
circuit
O1
36
Resynthesis of Pipelines
Parameters:
A
R
c
C
Resynthesis of Pipelines
Problem Transformation:
C
P
peripheral retime
CC
stagei
C
(i 1) c R
stagei
C
(i 1) c A
stagei
P
stagei
P
38
Ii
I1
Cn
Ci
On
Oi
In
Peripheral
Retiming
Cn
On
(i-1)c+Ri
Ii
Ci
Oi
Combinational
circuit
(i-1)c+Ai
C1
O1
I1
C1
O1
Theoretical Contribution :
If P C(cp+) has a solution then retiming yields a solution to P P (cp+)
If there is a solution to PP (cp) then peripheral retiming yields a solution
to P C (cp+)
39