Sie sind auf Seite 1von 39

Retiming and Re-synthesis

Outline:
Retiming
Retiming and Resynthesis (RnR)
Resynthesis of Pipelines

Optimizing Sequential Circuits


by Retiming Netlist of Gates
Netlist of gates and registers:
Inputs

Various Goals:

Outputs

Reduce clock cycle time


Reduce area

Reduce number of latches

Retiming
Problem

Pure combinational optimization can be myopic since


relations across register boundaries are disregarded

Solutions

Retiming: Move register(s) so that

clock cycle decreases, or number of registers decreases and


input-output behavior is preserved

RnR: Combine retiming with combinational


optimization techniques
Move latches out of the way temporarily
optimize larger blocks of combinational

Circuit Represetation
[Leiserson, Rose and Saxe (1983)]
Circuit representation: G(V,E,d,w)

V set of gates
E set of wires
d(v) = delay of gate/vertex v, (d(v)0)
w(e) = number of registers on edge e, (w(e)0)

Circuit Representation
Example: Correlator
+

Host

(x, y) = 1 if x=y
0 otherwise

7
0

0
2

Graph (Directed)

Circuit

Every cycle in Graph has at least one register


i.e. no combinational loops.

Operation delay

Preliminaries
For a path p :

e0

ek 1

e1

v0 v1 L vk 1 vk

d ( p ) d (vi )

(includes endpoints)

i 0
k 1

w( p) w(ei )

Clock cycle

i 0

c max {d ( p )}
p: w ( p ) 0

0
0

Path with
0

0
2

w(p)=0
3

For correlator c = 13
6

Basic Operation
Movement of registers from input to output of a gate or
vice versa
Retime by -1

Retime by 1

Does not affect gate functionalities


A mathematical definition: retardation
r: V Z, an integer vertex labeling
wr(e) = w(e) + r(v) - r(u) for edge e = (u,v)

Basic Operation
Thus in the example, r(u) = -1, r(v) = -1 results
in
0
0

0
2

0
0

1
1

u
0

For a path p: st, wr(p) = w(p) + r(t) - r(s)


Retardation
r: VZ, an integer vertex labeling
wr(e) =w(e) + r(v) - r(u) for edge e= (u,v)
A retiming r is legal if wr(e) 0, eE

Retiming for minimum clock cycle


Problem Statement: (minimum cycle time)
Given G (V, E, d, w), find a legal retiming r so that

c max {d ( p )}
p: wr ( p ) 0

is minimized
Retiming: 2 important matrices
Register weight matrix

W (u , v) min{w( p) : u
p

v}

Delay matrix

D(u, v) max{d ( p ) : u
p

v, w( p )

w(u, v)}
9

Retiming for minimum clock cycle

V0 V1 V2 V3

V0 V1 V2 V3

W = register path
weight matrix
(minimum # latches
on all paths between
u and v)
D = path delay matrix
(maximum delay on
all paths between
u and v)

0222
0000
0200
0220

0
13
10
7

V0
V1
V2
V3

v0 0

0
2

V1

V0
V1
V2
V3

0
3

V2

3
3
13
10

6
6
3
13

13
13
10
7

c p, if d(p) then w(p) 1


10

Conditions for Retiming


Assume that we are asked to check if a retiming exists for a clock
cycle
Legal retiming: wr(e) 0 for all e. Hence
wr(e) = w(e) = r(v) - r(u) 0 or

r (u) - r (v) w (e)

For all paths p: u v such that d(p) , we require wr(p) 1


Thus

k 1

1 wr ( p) wr (ei )
i 0

k 1

[ w(ei ) r (vi 1 ) r (vi )]


i 0

w( p ) r (vk ) r (v0 )
w( p ) r (v) r (u )

Or take the least w(p) (tightest constraint) r(u)-r(v) W(u,v)-1


Note: this is independent of the path from u to v, so we just need to apply it
to u, v such that D(u,v)

11

Solving the constraints


All constraints in difference-of-2-variable form
Related to shortest path problem
W
Correlator: = 7

V0 V1 V2 V3

V0 V1 V2 V3

D>7:
Legal: r(u)-r(v)w(e) r(u)-r(v)W(u,v)-1

r (v0 ) r (v1 ) 2
r (v1 ) r (v2 ) 0
r (v1 ) r (v3 ) 0
r (v2 ) r (v3 ) 0
r (v3 ) r (v0 ) 0

r (v0 ) r (v3 ) 1
r (v1 ) r (v0 ) 1
r (v1 ) r (v3 ) 1
r (v2 ) r (v0 ) 1
r (v2 ) r (v1 ) 1
r (v2 ) r (v3 ) 1
r (v3 ) r (v1 ) 1
r (v3 ) r (v2 ) 1

V0
V1
V2
V3

0222
0000
0200
0220

v0 0

0
13
10
7

3 6 13
3 6 13
13 3 10
10 13 7

7
0

0
2

v1

V2

12

V0
V1
V2
V3

Solving the constraints


Do shortest path on constraint graph: (O(|V| 3 )).
A solution exists if and only if there exists no negative
weighted cycle.
D>7:
Legal: r(u)-r(v)w(e) r(u)-r(v)W(u,v)-1

r (v0 ) r (v1 ) 2
r (v1 ) r (v2 ) 0
r (v1 ) r (v3 ) 0
r (v2 ) r (v3 ) 0
r (v3 ) r (v0 ) 0

r (v0 ) r (v3 ) 1
r (v1 ) r (v0 ) 1
r (v1 ) r (v3 ) 1
r (v2 ) r (v0 ) 1
r (v2 ) r (v1 ) 1
r (v2 ) r (v3 ) 1
r (v3 ) r (v1 ) 1
r (v3 ) r (v2 ) 1

A solution is r(v0) = r(v3) = 0, r(v1) = r(v2) = -1

r(0)

-1

-1

2
0

-1

r(2)
-1

1
1

0,-1
1

r(1)

0,-1

r(3)
0

Constraint graph

13

Retiming
To find the minimum cycle time, do a binary search among the
0

v0 0

entries of the D matrix (0( V

v1

V0 V1 V2 V3

V0
V1
V2
V3

3
V2

Retime

Host

0222
0000
0200
0220

Retimed correlator:

Clock cycle
= 3+3+7=13

log V ))

0
2

V0 V1 V2 V3
0
13
10
7

3
3
13
10

6
6
3
13

13
13
10
7

Host

Clock cycle = 7

14

V0
V1
V2
V3

Retiming: 2 more algorithms


1. Relaxation based:

Repeatedly find critical path;


retime vertex at end of path by +1 (O(
(O( V

E log
log V ))

2. Also, Mixed Integer Linear Program formulation

+1

Critical path

15

Retiming for minimum area


(minimum # latches)
Goal: minimize number of registers used
min N r wr (e)
eE

( w(e) r (v) r (u ))

e:u v

w(e)
eE

(r (v) r (u ))

e:u v

N (r (v) r (u ))
u v

N r (v)(# fanin(v ) # fanout (v)


vV

N aV r (v )
vV

where av is a constant.

16

Minimum registers formulation


Minimize:

a r (v )
vV

Subject to: wr(e) =w(e) + r(v) - r(u) 0

Reducible to a flow problem

17

Retiming and resynthesis:


motivation
Goal: incorporate combinational optimization into sequential optimization

Nave approach: carve out combinational regions, do optimization on each region. Only local
gains made.
Can we do any better?

RnR: a new approach


Sentovich, Malik, Brayton andSangiovanni-Vincentelli (89)
3 step approach
1.
2.
3.

Move registers to boundary of circuit


Optimize network
Move registers back in an optimal way

18

RnR: circuit representation


Circuit representation: communication graph
internal/peripheral edges
edge-weight = register count

c
1

0
2

1
i

internal
peripheral

19

Extended Retiming
Move register to the periphery
Negative edge-weights permitted
-1

negative
latch

A negative latch has the interpretation that it


advances its output by 1 clock cycle instead of
delaying it
20

Peripheral Retiming
A retiming is called a peripheral retiming if it results in
all internal edges having zero weight
Peripheral edges can have negative weight
1

All internal edge


weights are 0
1

k k
peripheral
weights

21

Peripheral Retiming
A circuit that undergoes peripheral retiming followed by
a legal retiming, i.e. one that results in all weights 0
is functionally equivalent to the original circuit
Functional equivalence: equivalence of finite state
automata:

But we have to be careful about the initial conditions and


initializing sequences. The resulting circuit may only exhibit
equivalence after an appropriate delay. See [Singhal et.al
ICCAD 1995].
This holds even for regular retiming

22

Peripheral Retiming - an example

c
1

c
0

Peripheral retiming

Not possible for all circuits:


0
i
j

1
j

a 0
b

0
0

c 0
d

o1

o2

23

Path Weight Matrix (PWM)


Matrix W

inputs
rows:
columns: outputs

1
i

Wij

o2

o1

i j

w(e)

o1
I 1
j 0
k *

no path i j
same weight for all paths

if paths with different weights

o2
*
~
0
24

PWM and Peripheral Retiming


Satisfiable path weight matrix:
1. Wij ~, i, j and
2. i, j, such that Wij= I + j , Wij *
Peripheral retiming possible matrix is satisfiable
i, j specify registers on peripheral edge
some i, j can be negative
Complexity: linear in size of communication graph
25

Path Weight Matrix - generation


From inputs to outputs:

generate output columns of W.

Example:
[ 1 0 * ] o1

o2

[*~0]

Paths from inputs


to node

1
i

[0**]

o2
*
~
0

( [ * 0 * ]+1) & [ * 0 0 ]
=[*1*]&[*00]
=[*~0]

[10*]
([ 0 * * ]+1) & ([ * 0 * ]+0)
=[1**]&[*0*]
=[10*]

o1
I 1
j 0
k *

[*0*]

([ * 0 * ]+0) & ([ * * 0 ]+0)


=[*0*]&[**0]
=[*00]

[**0]

26

Computing ,
Each constraint of the form I + j = Wij
Procedure
1.
2.
3.
4.

set 1 = 0
use first row to generate s
determine s from s
check for consistency

o
0/2

Example:

c
1/0

2/0

o
1/1 i 2
j 3

1/0
i

1 = 0
2 = 1
1 = 2

27

Computing ,
Example:
i
j

a 0
b

o1 o2
i 0 0
j 0 1

0
0

c 0
d

1+ 1 = 0

o1

o2

2+ 1 = 0

1+ 2 = 0 2+ 2 = 1
-----------------------------1- 2 = 0
1- 2 = -1

contradiction
28

Optimizing Acyclic Sequential


Circuits
Acyclic circuits with satisfiable W

Do peripheral retiming putting i,j registers at the I/O


Resynthesize interior
Do a legal retiming (move registers in)
May not always be possible

Acyclic circuits with unsatisfiable W

Identify maximal sub-circuits with satisfiable W


Cut connections
Repeat previous procedure

29

Acyclic Sequential Circuits

cut
circuits
Note that both of the cut circuits can be peripherally
retimed, unlike the original.

30

Optimizing Cyclic Sequential


Circuits
Cyclic circuits

1. Make circuit acyclic by breaking cycles

Different feedback-cuts give different Ws

2. Find sub-circuits with satisfiable Ws


3. Repeat procedure

31

FSM Optimization
X
Y
OUT
A

Break feedback

Y
OUT
B

32

FSM Optimization
Peripheral retiming

combinational

Y
OUT
B

X
Resynthesize

Y
OUT
B

33

FSM Optimization
Retime
X

Y
OUT
B

Reconnect
OUT
B

34

FSM Optimization
Original circuit

OUT
A

Resynthesized circuit

OUT
A

35

Resynthesis of Pipelines
Goal: Performance optimization of pipeline
circuits
Example: Pipeline circuit C
In

Ii

I1

Cn

Ci

C1

On

Oi

O1

In
Peripheral
Retiming

Cn

-(n -1)

Ii

Ci

-(i-1)

I1

C1

On
n -1

Oi
i-1

Combinational
circuit

O1

36

Resynthesis of Pipelines
Parameters:

A
R
c
C

vector of arrival times for inputs


vector of required times for outputs
target clock cycle
circuit

Pipeline performance problem:


PP(CP,cP,AP,RP)
Combinational performance problem:
PC(CC,cC,AC,RC)
37

Resynthesis of Pipelines
Problem Transformation:

Relax PP(CP,cP,AP,RP) to P P(CP,cp+,AP,RP)


where = largest possible single gate delay
Convert: pipeline P P to combinational problem P C

C
P

peripheral retime

CC

stagei
C

(i 1) c R

stagei
C

(i 1) c A

stagei
P
stagei
P
38

Perfomance Synthesis of Pipeline


Arrival and Required Times for P C :
In

Ii

I1

Cn

Ci

On

Oi

In
Peripheral
Retiming

Cn

On
(i-1)c+Ri

Ii

Ci

Oi

Combinational
circuit

(i-1)c+Ai

C1

O1

I1

C1

O1

Theoretical Contribution :
If P C(cp+) has a solution then retiming yields a solution to P P (cp+)
If there is a solution to PP (cp) then peripheral retiming yields a solution
to P C (cp+)

39

Das könnte Ihnen auch gefallen