
Objective

- Need for CTS
- Revisit timing concepts and definitions
- Introduction to the various steps involved in clock tree synthesis

After completing this program, students will be familiar with the CTS flow, the challenges in CTS, and the interdependencies involved, and will be ready to synthesize a clock tree for any project.

Ph:08040788574

www.rvvlsi.com
RVVLSIConfidential

Design Status before CTS


- Placement completed
- Power and ground nets prerouted
- Estimated congestion acceptable
- Estimated timing acceptable (~0 ns slack)
- Estimated max cap/transition: no violations
- High-fanout nets (Reset, Scan Enable) synthesized with buffers
- Clocks are still not buffered


Clock Tree Synthesis


Basics
- The clock net is treated as ideal during synthesis.
- Logical proximity does not imply physical proximity of related flops.
- The clock net has a high fanout and needs to be handled accordingly.
- Interconnect delays degrade the quality of the clock signal.

CTS: Clock Tree Synthesis


- A clock tree is a network that delivers the clock to all of its sinks (binary tree, H-tree, etc.).
- The basic job of CTS is to build the interconnect that connects the system clocks to all the cells in the chip that use the clock.
- The primary task of CTS is to vary the routing paths and the placement of clocked cells and clock buffers to meet the clock tree targets.


Review Of Basic Terminology


- PLL
- Clock Period
- Clock Latency
- Source Latency
- Network Latency
- Clock Uncertainty
- Setup & Hold
- Constraints: Max Capacitance, Max Fanout, Max Transition
- Skew
- Global skew
- Local skew
- Useful skew

PLL : Phase Locked Loop


- Designers use a low-frequency crystal as an off-chip clock source.
- The reference clock enters the chip and drives the PLL.
- The PLL generates the desired stable frequency on the chip.
- The PLL drives the clock distribution network, and one of its outputs is used as feedback into the PLL.
- The main function of the PLL is to compare the reference clock with the distributed clock and match them with the help of the VCO (voltage-controlled oscillator), LF (loop filter) and PC (phase comparator).
- The PLL is very sensitive to noise, so its placement in a digital chip is critical.
- Power and ground lines need to be provided around the PLL.

Clock Latency
- Clock source latency is the delay from the clock source to the clock definition port of your design.
- Clock network latency is the delay from the clock definition port to a clock sink of your design. It is also known as insertion delay (the standard term).
[Figure: clock latency. The clock definition port CLK drives the clock sinks CLK_a and CLK_b through inverters (INV) and a buffer (BUF), each annotated Rise=7, Fall=4. Clock latency is measured from CLK to each sink: Tclk to Tclk_a, and Tclk to Tclk_b.]
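The latency through a chain like the one in the figure can be sketched as follows. This is a minimal illustration, not the tool's delay calculator: it assumes that Rise and Fall are the cell delays for a rising and a falling output transition, that wire delay is zero, and it uses a hypothetical chain of four inverters because the exact topology of the figure is not recoverable. The function name `chain_latency` is made up for this example.

```python
def chain_latency(cells, input_edge="rise"):
    """Sum the delays one clock edge sees while travelling through a chain.

    cells: list of (kind, rise_delay, fall_delay), kind is "INV" or "BUF".
           rise_delay / fall_delay apply to a rising / falling OUTPUT edge.
    input_edge: direction of the edge at the clock definition port.
    """
    edge = input_edge
    total = 0
    for kind, rise_delay, fall_delay in cells:
        if kind == "INV":
            # An inverter flips the edge direction at its output.
            edge = "fall" if edge == "rise" else "rise"
        total += rise_delay if edge == "rise" else fall_delay
    return total


# Hypothetical chain: four inverters, each with Rise=7 and Fall=4.
inverters = [("INV", 7, 4)] * 4
print(chain_latency(inverters, "rise"))         # 4 + 7 + 4 + 7 = 22
print(chain_latency([("BUF", 7, 4)], "rise"))   # 7
```

Because each inverter flips the edge, the rise and fall delays alternate along the chain, which is why an even number of identical inverters gives the same latency for both input edges.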


Clock Uncertainty
Clock uncertainty models the uncertainty in the clock network delay. Its main component is jitter.

- Jitter is the maximum difference in phase of a clock from one clock cycle to another.
- Jitter can move both the launch edge and the capture edge of the clock by the jitter amount.

Sources:
- PLL oscillation frequency
- Various noise sources, such as power supply noise

Two worst cases:
- Setup: the launch edge is delayed and the capture edge is early.
- Hold: the launch edge is early and the capture edge is delayed.
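The two worst cases can be made concrete with the usual static timing slack relations. The sketch below is an illustration under assumptions: it lumps the shift of both edges into a single `uncertainty` value, and every parameter name and number is hypothetical rather than taken from the slides.

```python
def setup_slack(period, launch_latency, capture_latency,
                clk_to_q, data_max, t_setup, uncertainty):
    """Setup check: uncertainty shrinks the time available for the data."""
    required = period + capture_latency - uncertainty - t_setup
    arrival = launch_latency + clk_to_q + data_max
    return required - arrival


def hold_slack(launch_latency, capture_latency,
               clk_to_q, data_min, t_hold, uncertainty):
    """Hold check: uncertainty extends the time the data must stay stable."""
    arrival = launch_latency + clk_to_q + data_min
    required = capture_latency + uncertainty + t_hold
    return arrival - required


# Hypothetical numbers, all in ns.
s = setup_slack(period=2.0, launch_latency=0.30, capture_latency=0.32,
                clk_to_q=0.10, data_max=1.50, t_setup=0.05, uncertainty=0.10)
h = hold_slack(launch_latency=0.30, capture_latency=0.32,
               clk_to_q=0.10, data_min=0.08, t_hold=0.03, uncertainty=0.10)
print(round(s, 3), round(h, 3))  # about 0.27 and 0.03
```

In both checks a larger uncertainty reduces the slack, which is why the jitter budget has to be settled before the clock tree targets are chosen.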


Clock Skew
Clock skew for a particular clock is the difference between the insertion delays of its sinks.

Sources of skew:
- Design variations: mismatch in buffer sizes, load sizes and interconnect lengths
- Process variations and temperature variations
- IR drop

Types: positive skew and negative skew.
Skew range: typically 5-10% of the clock period of your design.

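[Figure: Flop1 launches data through a combinational delay (Combo Delay) into Flop2. Both flops are clocked from CLK, and the clock path includes a Buffer & Wire Delay block.]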


Local Skew
- Local skew is the skew between two flops that interact logically with each other in the same clock domain.
- Here FF1 and FF2 interact with each other, so their local skew = 0.38 - 0.37 = 0.01 ns. FF3 and FF4 also interact with each other, so their local skew = 0.32 - 0.30 = 0.02 ns.
- Balancing for local skew takes more run time but uses fewer buffers in your design.
- The local skew target must be met.

[Figure: CLK drives four flops with clock latencies FF1 = 0.38 ns, FF2 = 0.37 ns, FF3 = 0.30 ns and FF4 = 0.32 ns. Other labels in the figure: Din, A, B, B_out, C_out.]


Global Skew
- Global skew is the difference between the longest and the shortest delay path from the clock definition point to the clock sinks within the same domain.
- Here the global skew = 0.38 - 0.30 = 0.08 ns.
- Balancing for global skew has a faster run time.
- Balancing for global skew adds more buffers to your design, so be cautious about it: it impacts the area constraints of your design and increases congestion.

[Figure: same circuit as on the Local Skew slide, with clock latencies FF1 = 0.38 ns, FF2 = 0.37 ns, FF3 = 0.30 ns and FF4 = 0.32 ns.]
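The local and global skew numbers quoted on these two slides can be reproduced with a few lines. This is only a sketch of the arithmetic; the function names are made up for the example, and the latencies and interacting pairs are the ones given in the slides.

```python
# Clock latencies from the figure, in ns.
latency_ns = {"FF1": 0.38, "FF2": 0.37, "FF3": 0.30, "FF4": 0.32}
# Flop pairs that exchange data, as stated on the Local Skew slide.
interacting_pairs = [("FF1", "FF2"), ("FF3", "FF4")]


def local_skew(latency, pair):
    """Skew between two flops that talk to each other."""
    a, b = pair
    return abs(latency[a] - latency[b])


def global_skew(latency):
    """Longest minus shortest latency over all sinks of the clock."""
    return max(latency.values()) - min(latency.values())


for pair in interacting_pairs:
    print(pair, round(local_skew(latency_ns, pair), 2))  # 0.01, then 0.02
print("global", round(global_skew(latency_ns), 2))       # 0.08
```

Global skew is an upper bound on every local skew, so meeting a global target is sufficient but, as the slides note, costs more buffers.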


Useful Skew
- The useful skew concept is used to fix setup violations.
- The same path must not then violate hold.
- It is a push/pull technique applied to the clock.
- Example: assume the path from Flop1 to Flop2 is failing setup by -1 ns and the path from Flop2 to Flop3 is passing setup by +1 ns. Think about how you would fix it.
- It is used very little in the industry these days, because it in turn affects the skew in your design.

[Figure: Flop1, Flop2 and Flop3 connected in series from the data input, all clocked by CLOCK.]
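One way to reason about the example is to push the clock of Flop2 later and see how each slack moves. The sketch below assumes ideal bookkeeping (the delay added on the Flop2 clock pin shifts each slack by exactly that amount); the hold slacks are hypothetical values, since the slide gives only the setup numbers.

```python
def apply_useful_skew(setup_in, hold_in, setup_out, hold_out, delay):
    """Delay the clock of the middle flop (Flop2) by `delay` ns.

    *_in  : slacks of the path captured by Flop2 (Flop1 -> Flop2)
    *_out : slacks of the path launched by Flop2 (Flop2 -> Flop3)
    """
    return {
        "setup_in": setup_in + delay,    # capture edge is later: more time
        "hold_in": hold_in - delay,      # but data must be held longer
        "setup_out": setup_out - delay,  # launch edge is later: less time
        "hold_out": hold_out + delay,
    }


# Setup slacks from the slide; hold slacks are assumed for illustration.
after = apply_useful_skew(setup_in=-1.0, hold_in=1.2,
                          setup_out=1.0, hold_out=0.1, delay=1.0)
print({name: round(value, 2) for name, value in after.items()})
```

With these numbers a 1 ns push brings both setup slacks to 0 ns, and it is only acceptable because the assumed hold slack on the Flop1 to Flop2 path stays positive. This is the "same path should not violate hold" condition.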


Setup & Hold


Setup time: for an edge-triggered sequential element, the time interval before the active clock edge during which the data must remain unchanged.

Hold time: the time interval after the active clock edge during which the data must remain unchanged.


CTS Specifications

Meet the buffering constraints:
- Maximum transition delay
- Maximum load capacitance
- Maximum fanout

Meet the clock tree targets:
- Maximum skew
- Min/max insertion delay
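The two groups of specifications can be pictured as one record that a finished clock tree is checked against. The sketch below is an assumption-laden illustration: the class, field names, units and numbers are all hypothetical and do not correspond to any tool's report format.

```python
from dataclasses import dataclass


@dataclass
class CtsTargets:
    # Buffering constraints
    max_transition: float       # ns
    max_capacitance: float      # pF
    max_fanout: int
    # Clock tree targets
    max_skew: float             # ns
    min_insertion_delay: float  # ns
    max_insertion_delay: float  # ns


def violations(targets, measured):
    """Return the names of the specs that the measured worst cases violate."""
    failed = []
    if measured["transition"] > targets.max_transition:
        failed.append("max_transition")
    if measured["capacitance"] > targets.max_capacitance:
        failed.append("max_capacitance")
    if measured["fanout"] > targets.max_fanout:
        failed.append("max_fanout")
    if measured["skew"] > targets.max_skew:
        failed.append("max_skew")
    delay = measured["insertion_delay"]
    if not targets.min_insertion_delay <= delay <= targets.max_insertion_delay:
        failed.append("insertion_delay")
    return failed


targets = CtsTargets(0.5, 0.2, 20, 0.1, 0.3, 1.0)
measured = {"transition": 0.45, "capacitance": 0.25, "fanout": 18,
            "skew": 0.08, "insertion_delay": 0.9}
print(violations(targets, measured))  # ['max_capacitance']
```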


CTS Specifications
Typical specs which are used to design the clock tree are:

- Clock skew
- Clock fanout
- Clock latency
- Buffer levels


CTS Flow

From Placement:
- Set Clock Common Options
- Synthesize the Clock Tree (CTS)
- Reconnect Scan Chains
- Enable Propagated Clocks
- Post-CTS Placement Optimization
- Optimize Timing Skew
- Optimize Timing (Useful Skew CTO)

To Routing


Clock Trees

Clock tree: the network that drives all the sink pins of a particular clock from the clock port is known as the clock tree for that clock.

Types:
- H-Tree
- Balanced Fanout Clock Tree
- Binary Clock Tree

Synthesizing the clock network for a particular clock is known as clock tree synthesis. It consists of:
- varying the routing paths
- placement of clock buffers/cells
- consideration of the specification

After CTS, you optimize the clock tree network to meet specific targets such as skew.


Clock Tree

A path from the clock source to flops

[Figure: a clock source connected to ten flops (FF).]


Balanced Fanout Clock Tree

A path from the clock source to clock sinks

[Figure: a clock source driving ten flops (FF) through a balanced fanout tree.]
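For a balanced fanout tree, the depth follows directly from the number of sinks and the fanout limit. The sketch below counts driving stages under the assumption that every driver (the clock source or a buffer) may drive at most `max_fanout` pins. The function name is made up, and the fanout limit of 4 is a hypothetical value; only the ten sinks come from the figure.

```python
def fanout_stages(num_sinks, max_fanout):
    """Smallest number of driving stages so that no driver exceeds max_fanout.

    One stage means the source drives the sinks directly, two stages mean
    one level of buffers between source and sinks, and so on.
    """
    if num_sinks < 1:
        raise ValueError("need at least one sink")
    if max_fanout < 2:
        raise ValueError("max_fanout must be at least 2")
    stages, reach = 1, max_fanout
    while reach < num_sinks:
        reach *= max_fanout  # each extra stage multiplies the reachable sinks
        stages += 1
    return stages


print(fanout_stages(10, 4))     # 2 stages, i.e. one level of buffers
print(fanout_stages(1000, 20))  # 3 stages, i.e. two levels of buffers
```

The number of buffer levels is the stage count minus one, which is the quantity a maximum buffer level setting would bound.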


Binary Clock Tree

A path from the clock source to clock sinks

[Figure: a clock source driving eight flops (FF) through a binary tree.]

H-Tree

[Figure: H-tree clock distribution with 4 points and with 16 points.]
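The 4-point and 16-point patterns are successive levels of the same recursive construction. The sketch below generates the tap points of an idealized H-tree and the wire length from the root to any tap. The function names, coordinates and sizes are hypothetical, and real H-trees must also account for blockages and buffer locations.

```python
def h_tree_taps(cx, cy, half, levels):
    """Return the tap points of an H-tree centred at (cx, cy).

    Each level places four child centres at (+-half, +-half) from the
    current centre and halves the arm length for the next level.
    """
    if levels == 0:
        return [(cx, cy)]
    taps = []
    for dx in (-half, half):
        for dy in (-half, half):
            taps.extend(h_tree_taps(cx + dx, cy + dy, half / 2, levels - 1))
    return taps


def root_to_tap_length(half, levels):
    """Manhattan wire length from the root to any tap (same for all taps)."""
    return sum(2 * half / 2**i for i in range(levels))


print(len(h_tree_taps(0.0, 0.0, 1.0, 1)))  # 4 points
print(len(h_tree_taps(0.0, 0.0, 1.0, 2)))  # 16 points
print(root_to_tap_length(1.0, 2))          # 3.0
```

Every tap is reached over the same wire length, which is why an H-tree gives nominally zero skew when the loads are identical.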


Clock Common Options


The Clock Common Options dialog is used to set the various clock tree options, as shown in the figure.

[Figure: Clock Common Options dialog.]


Clock Common Options Constraints


Max Trans/Cap/Fanout
- If specified in multiple places (Library, SDC, Default), the tool considers the smallest values.
- If the Astro default values are smaller than the SDC or Library values, verify with the vendor and then proceed.

Max Buffer Level
- The default is 20. Start with the default, and analyze the design properly before changing this number.

Max & Min Insertion Delay
- Use this to control the min and max insertion delay.
- By default, the insertion delay in the SDC has priority.

We can choose to ignore the SDC and Library constraints. If the settings are too tight, violations may be created.
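The "smallest value wins" rule for max transition, capacitance and fanout can be sketched as below. This only illustrates the rule as stated on the slide; it is not the tool's implementation, and the function name, source names and numbers are hypothetical.

```python
def effective_limit(**sources):
    """Return (value, source) of the tightest limit among the given sources.

    Pass one keyword per place the limit may be specified; use None for a
    source that does not specify it.
    """
    specified = {name: value for name, value in sources.items()
                 if value is not None}
    if not specified:
        raise ValueError("limit is not specified anywhere")
    tightest = min(specified, key=specified.get)
    return specified[tightest], tightest


# Hypothetical max transition limits in ns.
print(effective_limit(library=0.8, sdc=0.5, tool_default=None))  # (0.5, 'sdc')
```

Reporting which source supplied the winning value makes it easy to spot the case the slide warns about, where a tool default is tighter than the SDC or library value.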


Invoking Clock Tree Synthesis


Clock tree synthesis is invoked with the options shown in the figure.

[Figure: options for invoking clock tree synthesis.]


Clock Tree Begin and End


- The clock tree begins at the SDC-defined clock source.
- The clock tree passes through gating logic by default.
- The clock tree ends at Astro-defined stop pins.

Types of Astro-defined pins:
- Implicit pins: defined by the tool itself
- Explicit pins: defined by the user

Two types of stop pins:
- Sync pins: clock pins of sequential and macro cells
- Ignore pins: everything else

Each pin falls into one of the categories mentioned above.

[Figure: a clock tree from its start point at the clock source to flops FF1, FF2 and FF3, with gating logic on one branch.]


Sync pins and Ignore Pins


Sync pins:
- CTS optimizes for the buffering constraints (max tran/cap) and the clock tree targets (clock skew, insertion delay).

Ignore pins:
- CTS adds a small buffer to isolate all ignore pins, and ignores the buffering constraints and clock tree targets for them.

[Figure: the clock tree passes through gating logic by default and reaches the flops. Implicit IGNORE pins are marked. Labels in the figure: clock, Gated, FF1 to FF4, IP, clk_out.]
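The stop-pin rules can be summarized as a small decision function. This is a sketch of the classification as described in the slides, not of Astro's internals: the function name, the cell type strings and the override argument are hypothetical.

```python
def classify_stop_pin(cell_type, is_clock_pin, user_override=None):
    """Return (pin_type, how_defined) for a pin where the clock tree ends.

    cell_type: "sequential", "macro", or anything else (e.g. "port", "mux")
    user_override: "sync" or "ignore" when the user defines the pin explicitly
    """
    if user_override in ("sync", "ignore"):
        return user_override, "explicit"
    if is_clock_pin and cell_type in ("sequential", "macro"):
        return "sync", "implicit"
    # Everything else is an ignore pin by default.
    return "ignore", "implicit"


print(classify_stop_pin("sequential", True))          # ('sync', 'implicit')
print(classify_stop_pin("port", False))               # ('ignore', 'implicit')
print(classify_stop_pin("mux", False, "sync"))        # ('sync', 'explicit')
```

The last call corresponds to forcing CTS to balance a path that it would otherwise ignore, such as a mux select pin feeding an output port.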


Explicit Sync Pins


The design spec requires that the delays from the clock port, through the mux select pins, to all output ports be balanced.

With implicit ignore pins, CTS inserts an isolation buffer, and skew and insertion delay are ignored.

[Figure: the clock drives FF1 and FF2 and, through an isolation buffer, two muxes whose pins are marked as implicit ignore pins. A dummy MUX is used to match the delay.]

Adding Explicit Sync Pins

How can we force CTS to balance these delays?

Define an explicit sync pin. Skew and insertion delay are then optimized.

[Figure: the same circuit (FF1, FF2, isolation buffer, two muxes) with an explicit sync pin marked.]

Optimization Behaviour
[Figure: optimization behaviour. A clock defined with create_clock drives flops FF1 to FF5, with a create_generated_clock point and a combinational (combo) block in the network. Several branches are marked "to different clock domain or output port".]

Optimization during CTS

Buffer sizing and Gate sizing


[Figure: clock tree before and after buffer and gate sizing. Cells are labeled with drive strengths (2x, 3x, 4x).]

Optimization during CTS

Delay Insertion
[Figure: clock tree before and after delay insertion. Cells are labeled with drive strengths (3x, 4x), and a delay cell (DelayCell) is shown.]
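Delay insertion balances a tree by slowing down the fast branches until they match the slowest one. The sketch below computes the padding each branch would need. It assumes ideal delay cells of any size, and the function name, branch names and latencies (in ps) are hypothetical.

```python
def delay_padding(branch_latency):
    """Return the delay to add on each branch to match the slowest branch."""
    target = max(branch_latency.values())
    return {branch: target - latency
            for branch, latency in branch_latency.items()}


# Hypothetical branch latencies in picoseconds.
branches = {"branch_a": 380, "branch_b": 300, "branch_c": 370}
print(delay_padding(branches))
# {'branch_a': 0, 'branch_b': 80, 'branch_c': 10}
```

Padding lowers skew at the cost of extra cells and a larger minimum insertion delay, so it trades against the area and congestion concerns raised elsewhere in these slides.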

Optimization during CTS


Buffer and Gate Relocation
[Figure: clock tree before and after buffer and gate relocation. Cells are labeled with drive strengths (3x, 4x).]

Optimization during CTS

Dummy load
[Figure: clock tree before and after dummy load insertion. Cells are labeled with drive strengths (3x, 4x).]

Optimization during CTS


Level Adjustment
[Figure: clock tree before and after level adjustment. Cells are labeled with drive strengths (3x, 4x).]

Optimization during CTS


Reconfiguration
[Figure: clock tree before and after reconfiguration. Cells are labeled with drive strengths (3x, 4x).]

Reconnect the Scan Chains

- Scan chains were disconnected prior to placement to allow placement to focus on the functional paths.
- Reconnect the scan chains so that they are included in hold-time fixing during the next optimization step:
  - Same grouping of flops as traced prior to the disconnect
  - Different ordering, based on placement, to minimize routing resources
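Placement-based reordering can be illustrated with a simple nearest-neighbour pass over the flops of one chain. This is a greedy heuristic chosen for brevity, not the algorithm the tool uses; the function names, flop names and coordinates are hypothetical.

```python
def manhattan(p, q):
    return abs(p[0] - q[0]) + abs(p[1] - q[1])


def reorder_chain(flops, start):
    """Order the flops of one chain by repeatedly visiting the nearest one.

    flops: dict of flop name -> (x, y) placement location.
    The set of flops (the grouping) is unchanged; only the order changes.
    """
    remaining = dict(flops)
    here = remaining.pop(start)
    order = [start]
    while remaining:
        nearest = min(remaining, key=lambda name: manhattan(remaining[name], here))
        here = remaining.pop(nearest)
        order.append(nearest)
    return order


def chain_length(flops, order):
    """Total Manhattan wire length of the scan connections in this order."""
    return sum(manhattan(flops[a], flops[b]) for a, b in zip(order, order[1:]))


flops = {"ff_a": (0, 0), "ff_b": (10, 0), "ff_c": (1, 0), "ff_d": (9, 2)}
original = ["ff_a", "ff_b", "ff_c", "ff_d"]
reordered = reorder_chain(flops, "ff_a")
print(reordered)                                                     # ['ff_a', 'ff_c', 'ff_b', 'ff_d']
print(chain_length(flops, original), chain_length(flops, reordered))  # 29 13
```

The chain still contains the same four flops, but the scan wiring shrinks from 29 to 13 units in this example.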


Post CTS Placement Optimizations


- Stage: Post-CTS
- Effort: Medium
- Optimization tasks: also fix hold


Post CTS Placement Optimization : As needed


- It can be executed iteratively, if needed.
- If the violations are hard to fix, it is better to choose one option at a time with high effort, for example Fix Transition.


Effect of Post CTS Placement Optimization


- Fixing timing and max cap/tran violations through logic optimization and cell relocation may disturb the clock networks.
- Flops may be moved, which can affect the skew and insertion delay.
- Keep the size and location of the clock network cells fixed.


Clock Tree Analysis


- Dump the proper reports.
- Use the proper clock name.
- Check skew (both global and local).


Clock Tree Optimization


Further clock tree optimization can be run before routing, if needed.


Effects of CTS

- Clock buffers are added (lots of them!).
- Congestion may increase.
- Non-clock-tree cells may have been moved to less ideal locations.
- CTS can introduce new timing and max tran/cap violations.

* How can you improve congestion and timing?


Module Takeaway

After completing this program, students will be familiar with the CTS flow, the challenges in CTS, and the interdependencies involved, and will be ready to synthesize a clock tree for any design.


