Guidelines For Early Power Analysis

Siddharth Guha & Kiran Vittal - Atrenta - February 11, 2013
While design sizes and complexities are increasing steadily, the power budget for electronic devices
is aggressively decreasing. This increased demand for low power design is driven by various factors.
First, wireless devices cannot afford high power consumption due to the limitations of battery
power. Second, even wired devices cannot afford high power consumption as the cooling costs are
significant. Additionally, in the last few years, government bodies, such as the European Union, have
recognized the need for energy efficient devices and have set strict regulations. So various forces
are now compelling the market to produce power-efficient electronic devices.
It is very important for system-on-a-chip (SoC) designers to understand power consumption early in
the design cycle to meet the desired power budget. However, one of the complexities involved is that
in the initial stages of SoC design not much information is available to accurately estimate power. As
the design progresses, power consumption becomes clearer with the availability of simulation
vectors, technology libraries and decisions taken for synthesis and routing. On the other hand, the
best time to optimize power is in the early stages of the design. The later it gets in the design flow,
the harder it gets to make changes to reduce power. One of the biggest challenges for the designer
is to have a set of tools and flows which can work right from the very early stage of the design
through the later stages in the flow. This article discusses some of the challenges of setting up such
a flow and shares five guidelines for early and accurate power analysis at the register transfer level
(RTL) of abstraction. The RTL abstraction for an SoC is developed during the early stages.
Guideline 1: Leverage design activity information
Any power analysis tool needs the toggle, or activity, information of the design. Simulation output
files, such as VCD and FSDB, contain detailed information on the switching activity of each net in the
design. Estimating power from these files is known as vector-based power estimation; it is very
accurate but time-consuming.
Vector-less power estimation, on the other hand, estimates power from probabilistic toggling
information. This approach is much faster but can also be less accurate.
Several case studies explain why probabilistic power estimation can be inaccurate, primarily because
the spatial and temporal correlation between signals is lost. However, the loss of accuracy is not
related to the signals alone.
Consider that you are estimating the power of a memory and have the activity and duty cycles for
each net connected to the memory. In the technology libraries, the power table for the memory is
described as follows:
/* DISABLED POWER */
internal_power() {
    related_pg_pin : "VDD" ;
    when : "(!BISTEA & !MEA & !DFTMASK) & !LS";
    rise_power(INPUT_BY_TRANS) {
        values ("0.342393, 0.342393, 0.342393, 0.342393, 0.342393");
    }
    …
}
/* WRITE_SLOW POWER */
internal_power() {
    when : "(!BISTEA & MEA & WEA & !DFTMASK & RMEA & RMA[0] & !RMA[1] & !RMA[2] & !RMA[3] & !LS)";
        values ("5.791451, 5.791451, 5.791451, 5.791451, 5.791451");
    }
    …
}
/* READ POWER */
internal_power() {
    when : "(!BISTEA & MEA & !WEA) & !DFTMASK & !RMEA & !LS";
    values ("5.067451, 5.067451, 5.067451, 5.067451, 5.067451");
}
The power for the memory varies significantly based on the different “when” conditions in the
library model. So even if we get an accurate toggle rate and duty cycle of all the nets in the design,
no simulation output will provide the duty cycle of these “when” conditions. This is because these
“when” conditions are not present as nets in the design. So even with a very detailed VCD file
for the design, the power analysis tool needs to perform an internal cycle-based simulation to
calculate power accurately.
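The point can be illustrated with a minimal Python sketch. The pin samples below are hypothetical, and the three conditions are simplified versions of the “when” expressions in the library excerpt. The sketch compares the duty cycle of each condition obtained by evaluating it cycle by cycle against an estimate built only from per-pin duty cycles, which has to assume the pins are independent.

```python
# Hypothetical per-cycle samples of three memory pins: (MEA, WEA, LS).
trace = [
    (1, 1, 0), (1, 1, 0), (1, 0, 0), (1, 0, 0),
    (0, 0, 0), (0, 0, 0), (0, 0, 0), (0, 0, 0),
]

# Simplified "when" conditions modelled on the library excerpt (assumption).
conditions = {
    "disabled": lambda mea, wea, ls: not mea and not ls,
    "write": lambda mea, wea, ls: mea and wea and not ls,
    "read": lambda mea, wea, ls: mea and not wea and not ls,
}


def cycle_based_duty(trace, conditions):
    """Fraction of cycles in which each condition evaluates to true."""
    return {
        name: sum(1 for sample in trace if cond(*sample)) / len(trace)
        for name, cond in conditions.items()
    }


def independent_duty(trace):
    """Estimate from per-pin duty cycles only, assuming independent pins."""
    n = len(trace)
    p_mea = sum(s[0] for s in trace) / n
    p_wea = sum(s[1] for s in trace) / n
    p_ls = sum(s[2] for s in trace) / n
    return {
        "disabled": (1 - p_mea) * (1 - p_ls),
        "write": p_mea * p_wea * (1 - p_ls),
        "read": p_mea * (1 - p_wea) * (1 - p_ls),
    }


exact = cycle_based_duty(trace, conditions)
approx = independent_duty(trace)
print(exact)
print(approx)
```

In this trace WEA is only ever high while MEA is high, so the independence assumption underestimates the write duty cycle and overestimates the read duty cycle, even though every per-pin duty cycle is exact.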
Adopting a hybrid approach
It is evident that performing a cycle-based evaluation of every condition in the power table of
every cell in a design does not scale to large SoCs. So instead of taking a purely
probabilistic approach, or a completely cycle-based approach, power analysis flows can take a hybrid
approach depending on the following factors:
1. Stage of the design, including the availability of the RTL or netlist, or of libraries for hard macros
2. Availability of simulation data
3. Design-specific data, such as memories, datapaths, analog cells, black boxes, etc.
Figure 1: A typical early stage IP sub-system block
Suppose we have a design at an early stage of RTL coding. As shown in Figure 1, there are 4 blocks:
Block A: This is an RTL block for which simulation data is available.
Block B: This is an RTL block for which no simulation data is available so far.
Block C: This is a black box for which the RTL is still not available, but the designer is aware of some
characteristics of this block.
Block D: This is a block consisting primarily of memories, and we have a simulation output file for
this block.
As we can see, there is a fair variation in the progress and the characteristics of each block. Also,
each block is at a different stage with respect to the availability of simulation data. So the early
power analysis flow should be able to handle the best information available.
Block A has RTL with simulation data. So the power analysis tool should be able to
accept a simulation file at the block level. Since this block is mostly standard cell logic, the power
analysis tool can consume VCD or FSDB data and convert it into toggle counts and duty cycles for
each net. This makes power estimation much faster than a cycle-based approach. The
error introduced by the loss of spatial and temporal correlation should not significantly affect the
accuracy of results for this kind of design.
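As a minimal sketch of that conversion, the Python function below reduces one net's value changes to a toggle count and a duty cycle. The input format, a time-sorted list of (time, value) pairs, is an assumption made for illustration; it is not the VCD or FSDB file format.

```python
def summarize_net(changes, end_time):
    """Reduce value changes on one net to (toggle count, duty cycle).

    changes: time-sorted list of (time, value) pairs with value 0 or 1;
             the first entry gives the initial value.
    end_time: time at which the simulation window ends.
    """
    toggles = 0
    high_time = 0
    # Pair every change with the next one; the last is paired with end_time.
    following = changes[1:] + [(end_time, None)]
    for (t, v), (t_next, v_next) in zip(changes, following):
        if v == 1:
            high_time += t_next - t
        if v_next is not None and v_next != v:
            toggles += 1
    window = end_time - changes[0][0]
    return toggles, high_time / window


# Hypothetical net: low at 0, rises at 10, falls at 30, rises again at 40.
changes = [(0, 0), (10, 1), (30, 0), (40, 1)]
toggles, duty = summarize_net(changes, end_time=50)
print(toggles, duty)
```

Only two numbers per net survive this reduction, which is why it is fast and also why the ordering of events between nets is lost.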
Block B is also an early stage RTL design, but simulation data is not yet available for it. At this
stage, the designer has some information about the critical signals, typically clocks and
control signals.
Here, we can specify the clock period and the activity information, or toggle rate, for the
critical signals.
It is often hard to specify the toggle rate for a signal internal to the design. However, even for
vector-less power estimation, capturing this information for such signals is important. So the flow
should allow toggle information to be specified on such signals. Typical examples are the clock
gating enables for blocks or registers.
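To see why the clock gating enable matters so much, consider this small Python sketch. It is a simplified model under stated assumptions, not the calculation of any particular tool: a free-running clock toggles twice per period, and a gated clock is assumed to toggle only during the fraction of cycles in which its enable is high.

```python
def clock_toggle_rate(period_ns):
    """Toggles per nanosecond of a free-running clock (two edges per period)."""
    if period_ns <= 0:
        raise ValueError("clock period must be positive")
    return 2.0 / period_ns


def gated_clock_toggle_rate(period_ns, enable_duty):
    """Approximate toggle rate downstream of a clock gate.

    enable_duty is the fraction of cycles in which the enable is high
    (a user-supplied, hypothetical value in a vector-less flow).
    """
    if not 0.0 <= enable_duty <= 1.0:
        raise ValueError("enable_duty must be between 0 and 1")
    return clock_toggle_rate(period_ns) * enable_duty


# Hypothetical block: 2 ns clock, enable active in 25% of the cycles.
rate = gated_clock_toggle_rate(period_ns=2.0, enable_duty=0.25)
print(rate)
```

Because every register behind the gate sees this reduced clock activity, an error in the assumed enable duty cycle scales the clock power of the whole gated region.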
Block C is a black box. There is no RTL information. So for such a case, the flow should be able to
capture coarse design information, as shown below, in an early power analysis tool such as Atrenta’s
SpyGlass® Power:
blackbox_power -instname block_c -equiv_nand2_count 3000 \
    -register_count 100 -activity 0.3 -clocks a1 a2 -clock_percentage 0.5 0.5
The above command in the power analysis tool specifies that the black box will contain 3,000 NAND
gate equivalent cells and 100 registers. Also, the average activity of this module will be 0.3. With
this information and technology libraries the flow can estimate the power of this black box.
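A rough idea of how such coarse data could turn into a power number is sketched below in Python. This is not the calculation SpyGlass Power performs. The per-gate energy, register clock energy and leakage defaults are hypothetical placeholders for values that would come from the technology library, the clock frequencies are invented, and treating the clock percentages as the share of the block driven by each clock is an assumption about the example.

```python
def blackbox_power_mw(nand2_count, register_count, activity,
                      clocks_mhz, clock_fractions,
                      gate_energy_pj=0.002,       # hypothetical energy per gate toggle
                      reg_clock_energy_pj=0.005,  # hypothetical clock-pin energy per cycle
                      gate_leakage_nw=1.0):       # hypothetical leakage per gate
    """Very coarse power estimate for a block with no RTL, in milliwatts."""
    # Effective clock frequency, weighted by the share of each clock (assumption).
    freq_mhz = sum(f * frac for f, frac in zip(clocks_mhz, clock_fractions))
    # pJ multiplied by MHz gives microwatts.
    logic_uw = activity * nand2_count * gate_energy_pj * freq_mhz
    register_uw = register_count * reg_clock_energy_pj * freq_mhz
    leakage_uw = nand2_count * gate_leakage_nw / 1000.0
    return (logic_uw + register_uw + leakage_uw) / 1000.0


# Gate count, register count and activity follow the example command;
# the two clock frequencies are hypothetical.
power_mw = blackbox_power_mw(
    nand2_count=3000, register_count=100, activity=0.3,
    clocks_mhz=[500.0, 250.0], clock_fractions=[0.5, 0.5],
)
print(round(power_mw, 4))
```

The value of such an estimate lies less in its absolute accuracy than in letting the block contribute a plausible number to the sub-system budget before its RTL exists.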
Block D contains many memories. Earlier in this section, we have seen that memories have a very
high variation of dynamic power based on different access operations like “read” and “write”. So for
this block, we need very accurate power estimation. A robust power estimation flow should be able
to distinguish such cells from the other logic in the design. Once it has identified them, it can
trace each “when” condition of those cells very accurately. This is time-consuming, so the key is to
identify the critical set of cells that will benefit most from such detailed cycle-based
evaluation.
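One possible selection rule, shown here as a Python sketch, is to flag cells whose state-dependent power varies widely across their “when” conditions. The ratio threshold and the standard cell entry are hypothetical; the memory values are those of the library excerpt earlier in this section.

```python
def needs_cycle_based(state_power, ratio_threshold=2.0):
    """True if power varies enough across states to justify detailed tracing."""
    values = list(state_power.values())
    lowest, highest = min(values), max(values)
    if lowest <= 0:
        # A zero-power state next to a non-zero one is an unbounded ratio.
        return highest > 0
    return highest / lowest >= ratio_threshold


cells = {
    # Values from the memory library excerpt.
    "mem0": {"disabled": 0.342393, "write_slow": 5.791451, "read": 5.067451},
    # Hypothetical standard cell with nearly state-independent power.
    "nand2_x1": {"A_rise": 0.0021, "B_rise": 0.0023},
}

selected = [name for name, table in cells.items() if needs_cycle_based(table)]
print(selected)
```

With these numbers the memory shows a ratio of roughly 17 between its write and disabled states and is selected, while the standard cell is left to the faster toggle-count approach.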
The power analysis flow should be able to consume these different types of activity information and
apply them based on design knowledge to estimate the power at an early stage in the design.
Guideline 2: Learn from an existing netlist design and apply it to the new RTL.
One of the key benefits of RTL power estimation is to get the power analysis early in the cycle. The
flow does not go through the complete back-end steps. However, a good power analysis flow should
be able to capture the intent of back-end analysis and apply it to the RTL. Scavenging an existing
prototype design netlist can provide good information to RTL analysis tools for accurate power
estimation as shown in Figure 2.
Many designs these days are derivative designs using the same technology node and libraries. In
these cases, parts of the design have already gone through back-end place and route. So when we
create a new design using existing blocks, the early power analysis flow should be able to capture
characteristics like capacitance, cell distribution, VT-mix, clock tree buffers, etc. It is important to
support a completely automated flow that scavenges the key attributes from the netlist and applies
them in RTL power estimation. At the same time, the flow should provide the flexibility for an
advanced user to fine-tune the scavenged data.
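As an illustration of what scavenging can mean in practice, the Python sketch below derives a VT-mix and a coarse cell distribution from the cell names of an existing netlist. The naming convention (a threshold-voltage suffix after the last underscore, and the `CKBUF` and `DFF` prefixes) is hypothetical; a real flow would read these attributes from the library rather than infer them from names.

```python
from collections import Counter


def scavenge(cell_names):
    """Return (VT-mix, cell-kind distribution) as fractions of all cells."""
    vt_counts = Counter()
    kind_counts = Counter()
    for cell in cell_names:
        # Hypothetical convention: "<base>_<VT flavor>", e.g. "NAND2_HVT".
        base, _, flavor = cell.rpartition("_")
        vt_counts[flavor] += 1
        if base.startswith("CKBUF"):
            kind_counts["clock_buffer"] += 1
        elif base.startswith("DFF"):
            kind_counts["register"] += 1
        else:
            kind_counts["logic"] += 1
    total = len(cell_names)
    vt_mix = {k: n / total for k, n in vt_counts.items()}
    kinds = {k: n / total for k, n in kind_counts.items()}
    return vt_mix, kinds


# Hypothetical cell list extracted from a placed-and-routed netlist.
netlist_cells = ["NAND2_HVT", "NAND2_HVT", "DFF_SVT", "CKBUF_LVT"]
vt_mix, kinds = scavenge(netlist_cells)
print(vt_mix)
print(kinds)
```

Ratios such as these can then be applied to the cells inferred from the new RTL, and an advanced user can override them where the new design is known to differ.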
Figure 2: Scavenging existing technology netlist for accurate RTL power analysis
The following factors affect the components of power in the early analysis flow. For each of them,
useful data can be brought into RTL power estimation for a new design from an existing netlist that
uses the same technology node and libraries:
1. The synthesis engine should be fast enough but relatively accurate to match the area
characteristics of actual implementation tools. Synthesis will have to use scan cells, as the final
power correlation is being done with scanned netlist design.
2. In general, power analysis tools use minimum area-based cell mapping and may pick cells that
have very low drive strengths, which can result in power discrepancies. To work
around this problem, use “dont_use” or “dont_touch” synthesis constraints on cells that have low
drive strengths.
3. The power analysis tool needs to account for the impact of clock buffers added to clock trees and
other buffers added to high-fanout nets.
4. In a few cases, libraries might have multiple power rails or blocks in the design that are in
switched off power domains. In some cases, you may have different libraries that are operating at
different voltages.
5. Clock power depends on the way clock gating is done in the design. By default, clock gating is not
done in an early power analysis tool and the flow needs to infer an existing clock gating
threshold.
Guideline 3: Do early physically-aware power estimation for timing sensitive designs
In advanced technology nodes, it is common that the overall power at RTL correlates well with the
final netlist power. However, the individual sub-components of leakage power, internal power or
combinational power do not match that of the final design. This is an inherent drawback of area-
based synthesis for early power estimation and hence requires a solution that considers physical and
timing constraints early at RTL to get more accurate results for power.
It is also important for the power analysis tool to read in the timing constraints in Synopsys Design
Constraints (SDC) format to improve the power estimation results. The tool should also be able to
take in physical libraries and perform timing optimization, fixes for design rule violations and
slew calculation. The flow should also support the use of different versions of libraries
(like nominal for power and worst for timing) for timing optimization and power computation.
Further, with smaller geometries, the interconnect capacitance is becoming more significant. Moreover,
many libraries do not have wire load models. In the absence of wire load models, a flow that has a
quick prototyping placement and floor plan module can extract fairly accurate wiring capacitances.
Timing and physical optimization steps are time-consuming, so the tool needs to trade off fast run
times at RTL against accuracy in power estimation.
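To give a feel for how a quick placement can stand in for a wire load model, the Python sketch below estimates a net's capacitance from the half-perimeter of the bounding box of its pins. Half-perimeter wire length is a common rough estimate and is used here only as an illustration; it is not described in this article as the method of any tool, and the capacitance per micron is a hypothetical placeholder for a technology value.

```python
def hpwl_um(pins):
    """Half-perimeter wire length of a net from its pin (x, y) positions in microns."""
    xs = [x for x, _ in pins]
    ys = [y for _, y in pins]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))


def wire_cap_ff(pins, cap_per_um_ff=0.2):
    """Estimated wire capacitance in fF; cap_per_um_ff is a hypothetical value."""
    return hpwl_um(pins) * cap_per_um_ff


# Hypothetical pin locations taken from a prototype placement.
net_pins = [(0.0, 0.0), (30.0, 10.0), (10.0, 40.0)]
print(hpwl_um(net_pins), wire_cap_ff(net_pins))
```

An estimate of this kind costs almost nothing per net, which is the sort of speed that keeps physically-aware analysis practical at RTL.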
Figure 3: Early physically-aware power analysis
Guideline 4: Perform early RTL scan power estimation
SoC designs have multiple scan chains, each of which may have several thousand flops. If all the
chains are run at the same time, power dissipation can be too high. Hence scan power is a key factor
in deciding chip packaging. The power grid is designed for a certain maximum power, based on the
normal operational mode. If the power during test mode is significantly higher, it may be necessary
to slow down the scan patterns or to test only certain blocks at a time. Both of these methods add
cost because of the longer test time.
So, estimating scan power early in the design phase is very important. If the estimated power is not
under budget, then one needs to explore options to reduce test mode power. Here are a couple of
options:
1. Run the tests at lower speed.
2. Some SoCs are designed with groups of scan chains. We can run one scan group at a time, which
reduces the power at the cost of an increase in the test time. We need to find the minimum number
and arrangement of scan groups that will meet the power limit.
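The second option is a packing problem. The Python sketch below uses a first-fit decreasing heuristic, which is a swapped-in technique chosen for illustration and does not guarantee the true minimum number of groups. The per-chain power estimates and the power limit are hypothetical.

```python
def group_scan_chains(chain_power, limit):
    """Greedily pack scan chains into groups whose total power stays within limit.

    chain_power: dict mapping chain name to its estimated scan power.
    Returns a list of groups, each a list of chain names.
    """
    groups = []  # each entry is [total power, [chain names]]
    # Place the most power-hungry chains first (first-fit decreasing).
    ordered = sorted(chain_power.items(), key=lambda kv: kv[1], reverse=True)
    for name, power in ordered:
        if power > limit:
            raise ValueError(f"chain {name} alone exceeds the power limit")
        for group in groups:
            if group[0] + power <= limit:
                group[0] += power
                group[1].append(name)
                break
        else:
            # No existing group has room, so open a new one.
            groups.append([power, [name]])
    return [names for _, names in groups]


# Hypothetical per-chain estimates (mW) from early RTL scan power analysis.
chains = {"c0": 40, "c1": 35, "c2": 30, "c3": 25, "c4": 20}
scan_groups = group_scan_chains(chains, limit=75)
print(scan_groups)
```

Each additional group lengthens the test, so running this kind of exploration on RTL estimates shows early whether the power limit forces a test time that the budget cannot absorb.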
For early scan power analysis, it is required to specify the activity of the signals during scan
operation. Automatic test pattern generation (ATPG) patterns can be viewed as statistically random,
so a typical activity value for an ATPG pattern would be 0.5, or 50%. In some cases, where the
designer uses low power ATPG to generate patterns, a lower activity value such as 0.3 or 0.1 can be
used for RTL scan power analysis.
We have done some experiments to compare the power numbers generated by SpyGlass using a
vector-less approach against a scan-inserted netlist using ATPG pattern VCD. We found that the
power numbers at RTL correlate to within 10% of the final netlist. In Figure 4, the blue line shows
the average power predicted by SpyGlass at RTL, before either the netlist or the ATPG patterns
existed. As you can see, except for the chain test, which is guaranteed to have the highest activity,
the average RTL prediction is very close to the actual value. This means that the RTL prediction can
be used to make tradeoffs about scan chain grouping and to ensure that no “surprise” comes from
excessive test mode power.
Figure 4: Test mode power estimated at RTL with a vector-less approach
Guideline 5: Leverage formal (LEC) tool’s match points for netlist power estimation
As the design progresses from RTL to a netlist, the flow should be able to adapt to the netlist for
power analysis. However, gate level simulation becomes available much later, and gate level
simulation VCD or FSDB files are huge. So a suitable flow estimates the gate level power with
RTL simulation files. This flow has its own challenges, because various name changes take place when
the RTL is synthesized to a gate level design. Module hierarchies may get flattened,
vectors may get bit blasted and design constraints may change the names of signals. This makes it
harder for a tool to automatically map the RTL simulation file information onto the gate level design.
The flow should be able to consume the match points report of a logical equivalence check (LEC)
tool to match the RTL and gate level register names as shown in Figure 5.
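The mapping step can be sketched in Python as follows. The report format (one RTL name and one gate level name per line, with `#` comment lines) and all signal names are hypothetical; real LEC tools write match points in their own formats, which a production flow would have to parse.

```python
def load_match_points(report_text):
    """Parse a simplified match-point report into an RTL-to-gate name map."""
    mapping = {}
    for line in report_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        rtl_name, gate_name = line.split()
        mapping[rtl_name] = gate_name
    return mapping


def map_activity(rtl_activity, mapping):
    """Move RTL toggle counts onto gate level names; report what did not match."""
    mapped, unmatched = {}, []
    for rtl_name, toggles in rtl_activity.items():
        if rtl_name in mapping:
            mapped[mapping[rtl_name]] = toggles
        else:
            unmatched.append(rtl_name)
    return mapped, unmatched


# Hypothetical report: a bit-blasted register in a flattened hierarchy.
report = """# rtl_name gate_name
top.u_core.state[0] top/u_core_state_reg_0_
top.u_core.state[1] top/u_core_state_reg_1_
"""
rtl_activity = {"top.u_core.state[0]": 12, "top.u_core.state[1]": 7, "top.u_core.tmp": 3}
gate_activity, unmatched = map_activity(rtl_activity, load_match_points(report))
print(gate_activity)
print(unmatched)
```

Signals that remain unmatched, such as combinational nets removed by synthesis, would need their activity derived from the matched registers rather than taken from the RTL simulation directly.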
Figure 5: Logical equivalency check (LEC) tool provides mapping information for gate level power
analysis based on RTL simulation data
Conclusion
This article shares five guidelines to perform an early power analysis with relevant and available
data at each stage to avoid last-minute surprises in the SoC design process. This set of guidelines is
applicable to any mobile or wired application and should be easy to adopt in any design
implementation flow.
About the authors
Siddharth Guha is a Senior Engineering Manager at Atrenta India. Siddharth holds
a bachelor’s degree in engineering from Netaji Subhas Institute of Technology
(NSIT), Delhi. Siddharth is primarily responsible for SpyGlass Power Estimation,
Reduction and SEC products. You can reach him at sid@atrenta.com.
Kiran Vittal is a Senior Director of Product Marketing at Atrenta, with over 23 years
of experience in EDA and semiconductor design. Prior to joining Atrenta, he held
engineering, field applications and product marketing positions at Synopsys Inc,
ViewLogic Inc and Mentor Graphics Inc. Vittal holds an MBA from Santa Clara
University and a bachelor's degree in electronics engineering from India. You can
reach him at kvittal@atrenta.com.