978-1-4244-6455-5/10/$26.00 ©2010 IEEE 488 11th Int’l Symposium on Quality Electronic Design
during full operations. We will address this problem using the clock gate synthesis algorithm during CTS.

Figure 2: Clock Tree Placement after CTS

2. Two Passes Low Power Clock Gate Synthesis Flow

In this section, we introduce the two-pass low power clock tree synthesis flow shown in Figure 3. The flow has been proven to work on a standard industrial design using an electronic design automation (EDA) tool. The low power clock gate synthesis algorithm is integrated into the clock tree synthesis flow. The algorithm is implemented in TCL, which can be easily integrated into the EDA tool.

[Figure 3 flowchart: RTL codes -> Logic Synthesis Flow -> Placement and Logic Optimization Flow -> First Pass Clock Tree Synthesis Flow -> Low Power Multiple Level Clock Gates Splitting Algorithm -> Second Pass Clock Tree Synthesis Flow -> Low Power Multiple Level Clock Gates Merging Algorithm -> Clock Tree Power: Achieved Lower Power? (No / Yes) -> End]

Figure 3: Two Passes Low Power Clock Tree Synthesis Flow

Starting from RTL coding, the normal SoC design flow runs logic synthesis to translate the SystemVerilog or VHDL behavioral model into a gate-level netlist. The gate-level netlist is optimized against the timing and area constraints to achieve the required design target. The optimized netlist is then placed on the floorplan for physical optimization.

Once the physical placement of the design is complete, clock tree synthesis is run to build the clock tree network that distributes the clock signal across the floorplan with low skew and low power. To obtain the optimum clock gate design, we implement a two-pass clock tree synthesis flow. During the first pass, the traditional clock tree synthesis methodology is used. The clock tree is built using the default design constraints, which yields the standard clock tree structure with the clock gates placed at their initial locations from the pre-CTS database.

We then apply our split clock gate algorithm, which splits the clock gates and moves the resulting gates to the locations nearest to their loads. The output of the algorithm is an engineering change order (ECO) script in TCL format that can be applied to the pre-CTS database. Because a low power design normally has a lot of aggressive clock gating, there may be multiple levels of clock gating along the clock path. The flow is able to split each level of clock gates accordingly.

The TCL scripts are then applied to the pre-CTS database, which gives a new pre-CTS database for the second iteration of clock tree synthesis. With the newly created clock gate locations, the design is run once again through the clock tree synthesis flow and optimization to obtain the final optimized clock tree design.

3. Multiple Levels Clock Gate Splitting

In this section, the multiple-level clock gate splitting algorithm is discussed. The clock gate splitting flow runs on the post-CTS database to find the optimum locations of the clock gates. The newly created clock gates and their locations are written out as an ECO in TCL format. This ECO TCL file is applied to the pre-CTS database for the second-pass clock tree synthesis.

The multiple-level clock gate splitting algorithm can be described as below.
Input: a digital circuit design.
Output: ECO changes in TCL format.
I. Clock Tree Tracing Algorithm – For each clock source, all clock gates are traced for connectivity and the level of each clock gate is stored.
II. Clock Gate Reverse Splitting Algorithm – For the clock source, starting from the last level (n) of the clock gate
a. For each clock gate
i. Split the clock gate to its last level buffers' locations.
ii. Find the fanout loads of all the last level buffers.
iii. Assign the fanout loads to the new clock gate’s hash.
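
Steps I and II.a.i to II.a.iii can be illustrated with a minimal sketch. It is written in Python rather than TCL purely for illustration and is not the authors' implementation. The netlist model (a `fanout` dictionary, a `kind` label per cell, a `location` per buffer), the function names, and the `_split` naming of new gates are all assumptions made for this example. A "last level buffer" is taken here to mean a buffer that directly drives at least one non-buffer load. Rewiring of upstream gates to the newly created gates and ECO generation are not modeled.

```python
def trace_clock_gates(fanout, kind, source):
    """Step I: walk the clock network and record the gating level of each clock gate."""
    levels = {}
    stack = [(source, 0)]
    while stack:
        cell, depth = stack.pop()
        for nxt in fanout.get(cell, []):
            # The level increases only when the path passes through a clock gate.
            d = depth + 1 if kind[nxt] == "icg" else depth
            if kind[nxt] == "icg":
                levels[nxt] = d
            stack.append((nxt, d))
    return levels


def last_level_buffers(fanout, kind, gate):
    """Buffers below `gate` that directly drive at least one non-buffer load."""
    found = []
    stack = list(fanout.get(gate, []))
    while stack:
        cell = stack.pop()
        if kind[cell] != "buf":
            continue  # stop at flip-flops and at nested clock gates
        children = fanout.get(cell, [])
        if any(kind[c] != "buf" for c in children):
            found.append(cell)
        stack.extend(children)
    return sorted(found)


def split_clock_gates(fanout, kind, location, source):
    """Step II.a: process gates from the last level back toward the source."""
    levels = trace_clock_gates(fanout, kind, source)
    new_gates = {}  # plays the role of the per-gate "hash" in the text
    for gate in sorted(levels, key=lambda g: (-levels[g], g)):
        for i, buf in enumerate(last_level_buffers(fanout, kind, gate)):
            loads = [c for c in fanout[buf] if kind[c] != "buf"]
            new_gates[f"{gate}_split{i}"] = {
                "level": levels[gate],
                "location": location[buf],  # i. new gate sits at the buffer location
                "loads": loads,             # ii./iii. fanout loads assigned to the new gate
            }
    return new_gates


# Hypothetical two-level example: g1 gates b1 and b2, and b2 drives a nested gate g2.
fanout = {"clk": ["g1"], "g1": ["b1", "b2"], "b1": ["ff1", "ff2"],
          "b2": ["g2"], "g2": ["b3"], "b3": ["ff3"]}
kind = {"clk": "source", "g1": "icg", "g2": "icg", "b1": "buf", "b2": "buf",
        "b3": "buf", "ff1": "ff", "ff2": "ff", "ff3": "ff"}
location = {"b1": (10, 20), "b2": (40, 20), "b3": (45, 60)}

new_gates = split_clock_gates(fanout, kind, location, "clk")
for name, info in new_gates.items():
    print(name, info)
```

In this example the level-2 gate `g2` is handled before the level-1 gate `g1`, mirroring the reverse order described in step II.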