
Application Note

Using the Clone Report


April, 2001 1.1 Magma Confidential

Introduction
The Blast Chip design system uses a unique constant delay methodology in which delays
through the cells of a design are fixed initially. The optimization process used by Blast
Chip then performs buffer or inverter insertion, gate sizing, and other logic restructuring
operations to meet timing goals.
Optimization is performed using SuperCell models, which characterize the timing
behavior of library cells. In well-designed libraries, where the delay of SuperCell models
scales linearly with the load, and assuming that every SuperCell model is continuously
sizable, it is theoretically possible to meet the delays computed by the constant delay
methodology regardless of the actual load after placement.
Real libraries frequently do not have cells with sizes that can drive all possible loads.
Therefore, it is possible that the largest available cell in the library is not capable of driving
a load that is necessary to meet timing objectives. Situations like this are referred to as
“load violations.”
This document describes how to use the report that is produced when you execute the
report clone command to identify load violations and how to interpret the report.

Using the Clone Report Magma Confidential 1


SuperCell Model
A large part of the flow in Blast Chip uses a SuperCell model as its timing model. A
SuperCell model is a delay model that captures a range of sizes for a particular function.
The SuperCell model delay depends on the transition at the input of a cell and the gain of
the cell. The gain of the cell is the ratio of the output load to the input capacitance. Hence,
the gain is a measure of the amplification of the signal. This amplification is not measured
in amplitude, but in terms of capacitance.
The reasoning behind this model is the assumption that each cell can be sized to suit its
load. For any load, there is an appropriate size, because transistor widths, output loads,
and input capacitances all scale linearly, while the ratio of output load to input capacitance
and the delay remain constant. Assuming sufficiently powerful continuous sizing, these
delays should be achievable, regardless of the actual load after placement.
The gain of a cell is composed of two parts: one part that is specific to the input-output
pair and the other part that is specific to the output (output gain). The gain of the cell is the
product of the two numbers. The first part of the gain is specific to the SuperCell model
and does not change as a result of optimization. The second part of the gain, called the
output gain, varies during the process of optimization.
The delay of a SuperCell model is expressed in terms of its gain and the input transition
time:
    delay = s * Tx + p + g * h

s     Is the slew sensitivity.
Tx    Is the input transition time or input slew.
p     Is a constant intrinsic delay.
h     Is the output gain.
g     Is the logical effort, the part of the gain that is specific to the input-output pair.
      For a complete description of the logical and electrical effort concepts, see the
      book Logical Effort by Sutherland, Sproull, and Harris.
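
As an illustration of the delay expression, the following Python sketch evaluates it for two output gains. The function name and all numeric values are hypothetical and do not come from any library. The delay terms are assumed to be in picoseconds.

```python
def supercell_delay(s, tx, p, g, h):
    """Evaluate delay = s * Tx + p + g * h for one SuperCell timing arc."""
    return s * tx + p + g * h


# Hypothetical coefficients for a single arc (assumed units: picoseconds).
S, TX, P, G = 0.25, 80.0, 30.0, 20.0

# A small output gain gives a fast cell; a large output gain gives a slow one.
tight_delay = supercell_delay(S, TX, P, G, h=1.0)
loose_delay = supercell_delay(S, TX, P, G, h=4.0)

print(tight_delay, loose_delay)  # 70.0 130.0
```

Only the g * h term changes between the two calls, which is why lowering the output gain is the lever the optimizer uses on critical paths.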

Methodology
The SuperCell model allows you to reason about delay in the absence of knowledge
about capacitance. Before placement, nothing is known about the actual wire lengths.
In this stage (fix time), you use the SuperCell model to get an estimate of the timing.
Because this model is both simpler and more accurate than using wire load models, the
use of SuperCell models speeds the timing optimization. Furthermore, as the SuperCell
models are continuous models, rather than discrete models, they lend themselves to
continuous optimization methods. Due to the relative smoothness of the state space,
such methods are less prone to getting caught in local minima.
After placement, the estimates of the wire capacitances become more accurate.
The accuracy of the SuperCell model is now increasingly dependent on the ability to find
the right size for each cell for the given load. Obviously, the continuous sizing assumption
is optimistic, and in real libraries there are only a limited number of sizes available.



Substituting the SuperCell model with real cells usually has a negative impact on the
slack. The degree of this impact depends on the selection of real cells that are available
in the library. First of all, there is a certain round-off effect, where a slightly larger or
smaller cell than desired is selected. Assuming that the cell drive strengths increase
by a factor of two from one size to the next, you should be able to select the correct drive
strength within an accuracy of 50 percent. If the network is sized near optimum, the delay
impact of such a deviation is much less than 50 percent. Generally, this effect is not very
large.
Much larger is the effect when no cell can be found that approximates the desired drive
strength. While long nets are a minority, they are commonly much longer than the
average net. The average net is less than 100 microns long, but a long net can easily be
2 mm long. (You need to break up nets longer than 2 mm because of RC delay.) Often the
largest size of a function can drive only up to 200 microns of wire, an order of magnitude
less than required. This problem, where the largest size in the library is smaller than the
required size, is a load violation.
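
The round-off effect and the load violation can be sketched in a few lines of Python. This is a minimal illustration, not the selection algorithm used by Blast Chip. It assumes that each library size is represented by its input capacitance, that sizes double from one to the next, and that the required size follows from gain = output load / input capacitance. All names and values are hypothetical.

```python
def required_input_cap(load, gain):
    """Size (as input capacitance) needed to drive `load` at the given gain."""
    return load / gain


def pick_size(required, sizes):
    """Return (chosen_size, is_load_violation) for a discrete library."""
    largest = max(sizes)
    if required > largest:
        # No cell is big enough: this is a load violation.
        return largest, True
    # Round-off: take the size closest to the requirement in ratio terms.
    chosen = min(sizes, key=lambda c: max(c / required, required / c))
    return chosen, False


# Hypothetical library: input capacitances in fF, doubling per drive strength.
SIZES = [1.0, 2.0, 4.0, 8.0]

short_net = pick_size(required_input_cap(load=12.0, gain=2.0), SIZES)
long_net = pick_size(required_input_cap(load=200.0, gain=2.0), SIZES)

print(short_net)  # (8.0, False)
print(long_net)   # (8.0, True)
```

The short net needs a size of 6 and is rounded to 8. The long net needs a size of 100, far beyond the largest cell, so it is reported as a load violation.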
Blast Chip contains many optimizations that attempt to deal with this situation. These
include electrical optimizations such as cloning and buffering, as well as structural
resynthesis of various forms. While these optimizations often remove or reduce
load violations, they are not always successful, because their applicability depends on a
number of circumstances.
Cloning is the copying of a cell, while distributing the fanouts between the original cell and
its copy. It is an important way to increase the effective drive strength of a cell—by
splitting its load between two copies. The advantage of this method is that it does not add
levels of logic.
Buffering is another optimization technique that can greatly increase the drive strength of
a cell, but it adds levels of logic and thus delay.
An alternative to cloning is using parallel cells. With parallel cells, a copy of the cell is
made using the same inputs. But the outputs of the cells are tied together, so they drive
the load together. The advantage is that you can use this method for single fanout nets,
where the load cannot easily be split. Furthermore, splitting the load in cloning often adds
wire. This can also be avoided with parallel cells.
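
The difference between cloning and parallel cells can be illustrated with a small Python sketch. It is a simplified model under stated assumptions: only pin loads are considered (the extra wire that cloning may add is ignored), the clone split is a simple greedy balance, and parallel cells are assumed to share the load equally. The function names and load values are hypothetical.

```python
def clone_split(sink_loads):
    """Greedily distribute sink loads over the original cell and one clone."""
    original, clone = [], []
    for load in sorted(sink_loads, reverse=True):
        # Give the next-largest sink to whichever copy is lighter so far.
        target = original if sum(original) <= sum(clone) else clone
        target.append(load)
    return original, clone


def parallel_share(total_load, copies):
    """Load seen by each of `copies` cells whose outputs are tied together."""
    return total_load / copies


multi_fanout = clone_split([30.0, 20.0, 20.0, 10.0])
single_fanout = clone_split([80.0])

print(multi_fanout)              # ([30.0, 10.0], [20.0, 20.0])
print(single_fanout)             # ([80.0], [])
print(parallel_share(80.0, 2))   # 40.0
```

With several sinks, cloning halves the load per copy. With a single sink the clone receives nothing, so only parallel cells reduce the load per cell.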
You can use logic restructuring to deal with load violations. Logic restructuring can
reduce the length of the critical paths. This allows you to accommodate the extra delay
associated with the load violation or allows the insertion of a buffer. You can use logic
remapping to substitute cells having a low maximum drive strength with other logic
functions having a higher maximum drive strength.

Clone Report
To aid in identifying these situations, Blast Chip provides the clone report. It is
called the clone report because cloning is one of the main methods of increasing drive
strength. However, unlike sizing and creating parallel cells, cloning is subject to a large
number of circumstantial constraints. The clone report lists all load violations, sorted by
size, and gives for each cell the size of the load violation and the reason it cannot be
cloned.



Configuring the Clone Report
You can configure the clone report by selecting the columns you want to appear. You can
see the available columns for the clone report by passing an invalid argument to the
config report clone command:
% config report clone x
TAB-1 ERROR: Invalid column identifier: x
TAB-3 Valid column identifiers are: PROBLEM PIN_NAME
NET_NAME CELL_NAME MODEL_NAME ENTITY_NAME PIN_LOAD
WIRE_LOAD TOTAL_LOAD TYPICAL_LOAD TYPICAL_RATIO
CAP_LIMIT CAP_RATIO CAP_SLACK VIOLATION CAUSE
FANOUT_LIMIT FANOUT_RATIO FANOUT_SLACK SLEW
SLEW_RATIO SLEW_SLACK

You can see the current settings by giving no argument:


% config report clone
PROBLEM VIOLATION:*.3 CAUSE WIRE_LOAD TOTAL_LOAD FANOUT SLACK
MODEL_NAME PIN_NAME:30 NET_NAME

These are the default settings.


You can select the columns to be shown in the clone report by using the config report
clone command with a list of desired columns. For example:
% config report clone {MODEL_NAME PROBLEM TYPICAL_RATIO}
To add the CELL_NAME column to the current set of columns, enter:
% config report clone "[config report clone] CELL_NAME"

Table 1 lists the column names that can be added to the clone report.

Table 1: Column Names and Definitions


Column Names Definition
PROBLEM This column tells you why the cell cannot be cloned. Removing the cause
might improve your results.
PIN_NAME Name of the output pin that drives the net.
NET_NAME Name of the net that has a load violation.
CELL_NAME Name of the cell that drives the net.
MODEL_NAME Name of the model that is the cell master.
ENTITY_NAME Name of the entity that contains the model.
PIN_LOAD Total capacitance of all pins connected to the net.
WIRE_LOAD Capacitive load of the wire that implements the net.
TOTAL_LOAD Total capacitance on the net, including wires and pins.

TYPICAL_LOAD Typical load of the output pin. The typical load is an estimated value of
the load that the pin should drive in a performance-critical path.
TYPICAL_RATIO Ratio of the total load to the typical load times the output gain. A value of
1 indicates a well-trimmed cell. A value larger than 1 indicates an
overloaded cell.
CAP_LIMIT Hard limit on the capacitance. This limit usually comes from the library.
CAP_RATIO Ratio of the actual capacitance to the capacitance limit.
CAP_SLACK Capacitance limit minus the actual capacitance. A negative
number constitutes a capacitance violation.
VIOLATION Worst (largest) of CAP_RATIO, TYPICAL_RATIO, FANOUT_RATIO, and
SLEW_RATIO, minus 1. A positive value indicates a violation.
CAUSE Reports what limit determined the VIOLATION number. Fanout Limit
means a fanout violation; Cap Limit means a capacitance violation; Rise
Limit and Fall Limit refer to transition time limits, and Overload means
that the cell is overloaded compared to the typical load (times gain).
DIAGNOSIS Description of the problem.
GAIN Output gain of the driving pin. A larger gain indicates looser timing and
results in a slower cell. Slowing down a cell allows it to drive more
capacitance.
SLACK Timing slack.
AREA Area of the driving cell.
NUM_OUT Number of output pins on the driving cell.
FANOUT Fanout load on the net.
FANIN Number of input pins on the driving cell.
FANOUT_LIMIT Fanout limit on the net.
FANOUT_RATIO Ratio of fanout load divided by the fanout limit. A number greater than 1
indicates a fanout violation.
FANOUT_SLACK Fanout limit minus the fanout load. A negative
number indicates a fanout violation.
SLEW Worst output transition time (worst of rise and fall).
SLEW_RATIO Ratio of the actual transition time to the transition time limit.
SLEW_SLACK Transition time limit minus the actual transition time.
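
The relationship between the ratio columns, VIOLATION, and CAUSE can be sketched as follows. This is an illustration based only on the definitions in Table 1, not the tool's implementation. It reads TYPICAL_RATIO as total load divided by (typical load times output gain), uses the total load as the actual capacitance, and collapses the Rise Limit and Fall Limit causes into a single hypothetical "Slew Limit" label. All input values are made up.

```python
def limit_ratios(total_load, typical_load, gain, cap_limit,
                 fanout, fanout_limit, slew, slew_limit):
    """Return the four ratio columns, keyed by a CAUSE-like label."""
    return {
        "Overload": total_load / (typical_load * gain),  # TYPICAL_RATIO
        "Cap Limit": total_load / cap_limit,             # CAP_RATIO
        "Fanout Limit": fanout / fanout_limit,           # FANOUT_RATIO
        "Slew Limit": slew / slew_limit,                 # SLEW_RATIO (simplified)
    }


def violation_and_cause(ratios):
    """VIOLATION is the worst ratio minus 1; CAUSE names that ratio."""
    cause = max(ratios, key=ratios.get)
    return ratios[cause] - 1.0, cause


# Hypothetical net: 100 fF total load on a cell with a 20 fF typical load.
example = limit_ratios(total_load=100.0, typical_load=20.0, gain=2.0,
                       cap_limit=200.0, fanout=8, fanout_limit=16,
                       slew=0.5, slew_limit=1.0)
violation, cause = violation_and_cause(example)

print(violation, cause)  # 1.5 Overload
```

A positive result means at least one ratio exceeds 1. In this example the cell is within its capacitance, fanout, and transition limits but is overloaded relative to its typical load.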



Interpreting the Clone Report
The following is an example of a clone report:
mantle[8]> report clone /work/ct/ct
######################################################################
# Mantle analysis report
# Command:
# report clone \
# /work/ct/ct
# Date: Fri Sep 29 15:15:19 2000
# Version: Mantle Version 2.0.31-sunos5_sun4
# Wire Capacitance Configuration:
# force wire model manhattan /work/ct/ct -auto all
# config delay manhattan computed
######################################################################

Problems
problem violation cause total load slack model pin
------------- --------- ---------- ---------- ----- --------- -----------------
Primary In 3.053 Cap Limit 243 167 * fa_ct_c_prefix_1
Too Much Wire 2.245 Overload 184 -49 SC_ND3B C43194_2/Z
Primary In 2.220 Cap Limit 193 97 * x86_inst_rot_ctl
Primary In 1.752 Cap Limit 165 -49 * w_profile_en
Too Much Wire 1.595 Overload 631 52 SC_O22AIH C43348/Z
May Clone 1.576 Overload 417 -45 SC_NR3H taxi_U514.C1_2/Z
Single Fanout 1.202 Overload 67 320 SC_NR3A C42293_3/Z
May Clone 1.154 Overload 367 96 SC_NR3H dec_U851.C1_3/Z
Single Fanout 1.108 Overload 61 74 SC_A2O1IA C41663_2/Z
May Clone 1.103 Overload 221 17 SC_INVA BW1_INV1695_7/Z
May Clone 1.085 Rise Limit 704 2524 SC_NR2B fp_smu_U338.C2/Z
Single Fanout 0.959 Overload 251 38 SC_ND2B gen_U1132.C1/Z
Primary In 0.898 Cap Limit 114 75 * c_misc_mark
Single Fanout 0.813 Overload 66 437 SC_NR3A C42299_3/Z
May Clone 0.793 Overload 373 313 SC_ND3H dec_U881.C1_2/Z
Too Much Wire 0.724 Overload 342 353 SC_ND3D C42856_2/Z
May Clone 0.693 Overload 162 163 SC_INR2B gen_U1081.C2/Z
May Clone 0.682 Overload 2027 64 SC_INVP C51540/Z
May Clone 0.679 Overload 465 192 SC_INR2H info_U109.C1/Z
May Clone 0.669 Overload 60 148 SC_A2O1IA C43378/Z
Too Much Wire 0.666 Overload 285 -49 SC_NR3H taxi_U580.C2_6/Z
Single Fanout 0.629 Overload 288 38 SC_NR3H C41051_12/Z
May Clone 0.622 Overload 105 255 SC_ND2A dec2_U263.C1/Z
May Clone 0.607 Overload 127 381 SC_INR2A reg_U253.C2_1/Z
Single Fanout 0.578 Overload 62 121 SC_ND2A C41666_4/Z
May Clone 0.569 Overload 217 189 SC_IND2B C41027/Z
Total wire length 1.36607 meter
Total wire load 292.851 pf
Total area 0.735351 sq mm
* * * Overload Summary * * *
Problem Count Average(%) Worst(%)
------------------ ----- ---------- --------
May Clone 782 20.9 157.6
Single Fanout 295 22.0 120.2
Don't Clone Entity 14 15.6 32.5
Primary Out 37 21.6 39.5
Primary In 469 22.3 305.3
Too Much Wire 160 23.7 224.5
------------------ ----- ---------- --------
Total 1757 21.7 305.3



The header lists how and when the command was invoked, including some of the settings
that affect the computation of values in the clone report. The first table can be
configured with the config report clone command, as described in “Configuring the Clone
Report.” By default, it lists 30 lines. It sorts the cells by the violation column, with the
worst violation at the top. After the table, the report gives the total wire length, wire load,
and cell area, as measured by the current wire load model (as set by the force wire model
command). The problems identified in the report are described in Table 2.

Table 2: Troubleshooting Problems in the Clone Report

Problem             Description                              Action
------------------  ---------------------------------------  ----------------------------------------
May Clone           Problem can probably be fixed by using   No action required.
                    run gate clone.
Don’t Clone Cell    Cannot clone because of a noclone        Use clear noclone on cell.
                    directive on the cell.
Don’t Clone Model   Cannot clone because of a noclone        Use clear noclone on model.
                    directive on the model.
Don’t Clone Entity  Cannot clone because of a noclone        Use clear noclone on entity.
                    directive on the entity.
Cell has InOut      Cannot clone because the cell has        Rearchitect the design to eliminate
                    bidirectional in-out pins.               bidirectional pins.
Net has InOut       Cannot clone because the net has InOut   Rearchitect the design to eliminate
                    (bidirectional) pins.                    bidirectional pins.
Multisource         Cannot clone because the net has         Rearchitect the design to eliminate
                    multiple sources.                        multisource nets.
Primary output      Cannot clone primary output.             Reduce loading on primary output. Relax
                                                             timing constraint on primary output.
Primary input       Cannot clone primary input.              Relax timing constraint on primary input.
                                                             Increase driving cell strength on input.
                                                             Relax capacitance limit on input.
Too Much Wire       Cloning results in duplication of a      Increase drive by using parallel cells.
                    large amount of wire. This cell is not   Increase force limit parallel on the
                    suitable for driving long wires.         model.
Hierarchical Net    Net topology is restricted by the        Flatten hierarchy.
                    hierarchy.
Output Net Kept     Output net topology may not be           Use clear keep of output net.
                    modified.
Not Bound           Cell is not bound to a model master.     Read in missing models and bind using
                                                             the run bind logical command.
No Sink             Output net has no fanout.                Connect dangling output net.
No Out Net          Output net is absent.                    Connect dangling output or remove cell.
Primitive           Primitive (unmapped) cell.               Use the run gate map command to map all
                                                             unmapped cells.
No Source           Overloaded net has no driver.            Connect dangling net.
Clock               Clock nets are handled separately by     No action required.
                    the clock router.

Overload Summary
The overload summary at the end of the report lists the total number of overloads, grouped
by problem. Similar tables are also printed by various optimization commands and are
primarily intended to monitor the performance of various algorithms. Each line shows the
number of overloaded cells, the average overload, and the worst overload.
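
The Total line of the summary can be cross-checked from the per-problem lines. The Python sketch below uses the figures from the sample report and assumes that the total average is a count-weighted mean of the per-problem averages. That assumption is consistent with the printed values but is not stated by the tool.

```python
# (problem, count, average %, worst %) as printed in the sample report.
SUMMARY = [
    ("May Clone", 782, 20.9, 157.6),
    ("Single Fanout", 295, 22.0, 120.2),
    ("Don't Clone Entity", 14, 15.6, 32.5),
    ("Primary Out", 37, 21.6, 39.5),
    ("Primary In", 469, 22.3, 305.3),
    ("Too Much Wire", 160, 23.7, 224.5),
]


def total_line(rows):
    """Recompute the Total row: summed count, weighted average, overall worst."""
    count = sum(n for _, n, _, _ in rows)
    average = sum(n * avg for _, n, avg, _ in rows) / count
    worst = max(w for _, _, _, w in rows)
    return count, round(average, 1), worst


print(total_line(SUMMARY))  # (1757, 21.7, 305.3)
```

The worst percentages also match the VIOLATION column of the listing multiplied by 100 (for example, 3.053 for the worst Primary In row and 305.3 in the summary).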

Gain Table
The following table is printed by the run gate trim command:
* Output Gain Distribution *
Gain Range Count Pct
------------------------------
0.52...1.00 580 5.5%
------------------------------
1.00...2.00 3229 30.6%
2.00...4.00 3974 37.7%
4.00...8.00 2693 25.5%
8.00..10.00 69 0.7%

You can use this table to estimate how timing critical the entire design is. It divides the
outputs into ranges by gain. Each range spans a factor of 2 in gain. The boundaries of
the ranges are 0.25, 0.5, 1, 2, 4, and 8, except at the two ends: the low end of the first
range shows the actual smallest output gain anywhere in the design, and the high end of
the last range shows the actual highest output gain anywhere in the design.
The Count column is the number of outputs with gain falling in this range, and the Pct
column shows this number as a percentage of the total number of outputs. You can
consider the ranges above the bar (gain below 1) critical, and those below the bar
non-critical.
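
The binning described above can be reproduced with a short Python sketch. It is an approximation of the table layout, not the code behind run gate trim. It assumes that each range includes its lower boundary, and the list of gains is hypothetical.

```python
import math
from collections import Counter


def gain_distribution(gains):
    """Return (low, high, count, percent, critical) per power-of-two range."""
    counts = Counter(math.floor(math.log2(g)) for g in gains)
    exponents = sorted(counts)
    rows = []
    for k in exponents:
        low, high = 2.0 ** k, 2.0 ** (k + 1)
        if k == exponents[0]:
            low = min(gains)    # first range starts at the smallest gain
        if k == exponents[-1]:
            high = max(gains)   # last range ends at the largest gain
        percent = 100.0 * counts[k] / len(gains)
        # Ranges that lie entirely below a gain of 1 are treated as critical.
        rows.append((low, high, counts[k], percent, k < 0))
    return rows


# Hypothetical output gains for six pins.
rows = gain_distribution([0.52, 0.8, 1.0, 1.5, 3.0, 9.0])
for low, high, count, percent, critical in rows:
    print(f"{low:5.2f}...{high:5.2f} {count:5d} {percent:5.1f}% "
          f"{'critical' if critical else ''}")
```

Summing the percentages of the critical rows gives the share of critical outputs, which is the figure used in the discussion of the example tables.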



For paths that have a small gain, buffering tends to make the path more critical. The
system has to rely mostly on aggressively sizing up the cells on these paths to meet
timing. If the library has no strong drive strengths for base functions and the only high
drive strengths are buffers, such paths are likely to fail timing after placement. In the
gain table above, a smallest gain of 0.52 with 5.5 percent of critical outputs indicates a
challenging design.
* Output Gain Distribution *
Gain Range Count Pct
------------------------------
1.20...2.00 1885 17.9%
2.00...4.00 4201 39.8%
4.00...8.00 4390 41.6%
8.00..10.00 69 0.7%

In the previous example, there are no critical pins. The smallest gain is 1.20, which is not
considered critical. You can expect meeting the timing on this design to be easy.
* Output Gain Distribution *
Gain Range Count Pct
------------------------------
0.20...0.25 1436 13.6%
0.25...0.50 1136 10.8%
0.50...1.00 2239 21.2%
------------------------------
1.00...2.00 3039 28.8%
2.00...4.00 1797 17.0%
4.00...8.00 829 7.9%
8.00..10.00 69 0.7%

In the previous example, the smallest gain is 0.2, which is an extremely small value (0.2 is
a hard limit in trim). Moreover, there are a large number of outputs in the smallest ranges.
This indicates an overconstrained design that will not meet timing after placement. You
might have an error in your timing constraints.



Library Remedies
Blast Chip automatically deals with many load violations as described. However, some
load violations will invariably remain. When these violations are small (about 2x), they are
usually not a serious problem. Large load violations are harder to resolve. While you can
sometimes reduce the load violation by improving the floorplan, or by reducing the load
on long wires, in general, you must seek most remedies in the library and preparation of
the library.
Modern 0.18 micron ASIC libraries have typical loads for most of the functional cells in the
range of 10 to 50 femtofarads. While the majority of wires are short and fall within this
range, long wires in modern ASICs represent loads of 500 femtofarads or more. Often
only buffers can drive such high loads. Yet on performance-critical paths, which already
have low output gains, inserting a buffer adds delay, so even lower output gains are
needed to meet timing. Hence, for such paths, the only solution is to increase the drive
strength of the functional gates.
Short of redesigning the library, there are two ways to increase the drive strength of
existing libraries: cloning and using parallel cells.

Cloning
Cloning is usually enabled for combinational cells, but is turned off for sequential cells.
The noclone restriction can be removed with the clear noclone command. Cloning
sequential cells can eliminate load violations, but can pose functional verification
challenges, depending on the methodology used for formal verification. Modern formal
verification tools can handle this simple sequential verification issue. Older tools,
however, report a functional mismatch because the number of sequential elements
differs. Cloning sequential cells can also pose a problem for the initialization of the
design, depending on the initialization methodology. Cloning sequential elements is safe
if the initialization is based on one of the following methods:
1. Set or reset signal to sequential elements.
2. Scan in initial state using scan chain.
3. Verify initialization using three valued (0, 1, X) simulation.

Using Parallel Cells


Using parallel cells is a unique method for increasing the performance of low-drive
libraries. Parallel cells are similar to cloned cells, except that the outputs are tied
together. Tying the outputs together allows otherwise indivisible loads to be driven by
multiple cells. Moreover, it avoids the duplication of global wires and reduces congestion.
You can specify the number of parallel cells using the force limit parallel command. This
command specifies how many parallel copies of the cell may be tied together.
Initialization of sequential cells can leave parallel sequential cells in different states,
causing a permanent short-circuit current. Initialize parallel sequential cells
carefully. Magma advises against the use of sequential parallel cells.



Copyright © 1999-2003 Magma Design Automation Incorporated

All rights reserved.


All contents of this document are protected by copyright. Except as specifically permitted
herein, no portion of this information may be reproduced in any form, or by any means,
without prior written permission from Magma Design Automation, Inc. End users are not
permitted to modify, distribute, publish, transmit or create derivative works of this material
for any public or commercial purposes.
