An Industrial View of Electronic Design Automation


Don MacMillen, Member, IEEE, Michael Butts, Member, IEEE, Raul Camposano, Fellow, IEEE,
Dwight Hill, Senior Member, IEEE, and Thomas W. Williams, Fellow, IEEE

Abstract—The automation of the design of electronic systems and circuits [electronic design automation (EDA)] has a history of strong innovation. The EDA business has profoundly influenced the integrated circuit (IC) business and vice-versa. This paper reviews the technologies, algorithms, and methodologies that have been used in EDA tools and the business impact of these technologies. In particular, we will focus on four areas that have been key in defining the design methodologies over time: physical design, simulation/verification, synthesis, and test. We then look briefly into the future. Design will evolve toward more software programmability or some other kind of field configurability like field programmable gate arrays (FPGAs). We discuss the kinds of tool sets needed to support design in this environment.

Index Terms—Emulation, formal verification, high-level synthesis, logic synthesis, physical synthesis, placement, RTL synthesis, routing, simulation, simulation acceleration, testing, verification.

Manuscript received May 24, 2000. This paper was recommended by Associate Editor M. Pedram.
D. MacMillen is with Synopsys Inc., Mountain View, CA 94065 USA (e-mail: macd@synopsys.com).
M. Butts, R. Camposano, D. Hill, and T. W. Williams are with Synopsys Inc., Mountain View, CA 94065 USA.
Publisher Item Identifier S 0278-0070(00)10449-X. 0278–0070/00$10.00 © 2000 IEEE

I. INTRODUCTION

THE AUTOMATION of the design of electronic systems and circuits [electronic design automation (EDA)] has a history of strong innovation. The EDA business has profoundly influenced the integrated circuit (IC) business and vice-versa, e.g., in the areas of design methodology, verification, cell-based and (semi)-custom design, libraries, and intellectual property (IP). In the digital arena, the changing prevalent design methodology has characterized successive "eras."1

1 We do not attempt to reconstruct the history of EDA in detail. The sketch below is only meant to exemplify and summarize some of the major advancements that created sizeable markets.

• Hand Design: Shannon's [SHA38] work on Boolean algebra and McCluskey's [79] work on the minimization of combinational logic circuits form the basis for digital circuit design. Most work is done "by hand" until the late 1960s, essentially doing paper designs and cutting rubylith (a plastic film used to photographically reduce mask geometries) for mask generation.

• Automated Artwork and Circuit Simulation: In the early 1970s, companies such as Calma enter the market with digitizing systems to produce the artwork. Spice is used for circuit simulation. Automatic routing for printed circuit boards becomes available from Applicon, CV, Racal-Redac, and some companies such as IBM and WE develop captive IC routers. Calma's GDS II becomes the standard layout format.

• Schematic Capture and Logic Simulation: Workstation technologies combined with graphical user interfaces allow designers to directly draw their logic in schematic capture tools. Once entered, these designs could be simulated at the logic level for further development and debugging. The largest entrants at this juncture were Daisy, Mentor, and Valid.

• Placement & Routing: During the 1980s, powerful IC layout tools emerge: automatic placement & routing, design rule check, layout versus schematic comparison, etc. At the same time, the emergence of the hardware (HW) description languages Verilog and VHDL enables the widespread adoption of logic and RTL simulators. The main players in the commercial arena become Mentor and Cadence (ECAD/SDA/Gateway/Tangent/ASI/Valid).

• Synthesis: In 1987, commercial logic synthesis is launched, allowing designers to automatically map netlists into different cell libraries. During the 90s, synthesizing netlists from RTL descriptions, which are then placed and routed, becomes the mainstream design methodology. Structural test becomes widely adopted by the end of the decade. RTL simulation becomes the main vehicle for verifying digital systems. Formal verification and static timing analysis complement simulation. By the end of the 1990s, Cadence and Synopsys have become the main commercial players. The EDA market for ICs has grown to approximately $3B.

Also, the fabric of design, meaning the basic elements such as transistors and library cells, has evolved and become richer. Full custom design, cell based designs [mainly standard cells and gate arrays for application specific ICs (ASICs)] and FPGAs [or programmable logic devices (PLDs)] are the most prevalent fabrics today.

This paper reviews the technologies, algorithms, and methodologies that have been used in EDA tools. In particular, we will focus on four areas that have been key in defining the design methodologies over time: physical design, simulation/verification, synthesis, and test. These areas do not cover EDA exhaustively; i.e., we do not include physical analysis (extraction, LVS, transistor-level simulation, reliability analysis, etc.), because this is too diverse an area to allow a focused analysis within the space limits of this paper. We also exclude the emerging area of system-level design automation, because it is not a significant market today (although we believe it may become one).
We then look briefly into the future. Design will evolve toward more software (SW) programmability or some other kind of field configurability like FPGAs. The evidence for this exists today with the healthy growth of the FPGA market as well as the increased attention this topic now receives at technical conferences such as the Design Automation Conference. The reasons are manifold. The fixed cost of an IC (NRE, masks, etc.) and the rising costs of production lines require an ever-increasing volume. Complexity makes IC design more difficult and, consequently, the number of IC design starts is decreasing, currently at around 10 000 ASICs/Systems on a Chip (SoC) per year plus some 80 000 FPGA design starts/year. So there is an increasing pressure for designs to share a basic architecture, typically geared toward a specific application (which can be broad). Such a design is often referred to as a "platform." In this scenario, the number of platforms will probably grow into the hundreds or thousands. The number of different uses of a particular platform needs to be at least in the tens, which puts the total number of different IC "designs" (platforms × average platform use) in the 100 000 range. If the number of different platforms is higher than of the order of 1000, then it is probably more accurate to talk just about programmable SoC designs, but this does not change the following analysis.

The EDA market will fragment accordingly.

• A market of tools to design these platforms, targeting IC designers. These tools will need to master complexity, deal with deep submicrometer effects, allow for large distributed design teams, etc. Examples: Physical synthesis (synthesis + placement + routing), floorplanning, chip-level test, top-level chip routers, etc.

• A market of tools to use these platforms by System on a Chip designers. Typically, these tools will handle lower complexity, will exhibit a certain platform specificity and will require HW/SW codesign. Examples: Synthesis of FPGA blocks, compilers, HW/SW cosimulation including platform models (processors, DSP, memory), system-level design automation, etc.

This paper explores the technical and commercial past and touches on the likely future of each of the four areas—layout, synthesis, simulation/verification, and test—in this scenario.

II. PHYSICAL DESIGN

From both technical and financial perspectives physical design has been one of the most successful areas within EDA, accounting for roughly one-third of the EDA revenues in the year 2000. Why has physical design automation been so successful? There are many factors, but here are three based on the history of EDA and machine design: First, physical design is tedious and error prone. Early computer designers were comfortable designing schematics for complex machines, but the actual wiring process and fabrication was a distinctly unattractive job that required hours of exacting calculation and had no tolerance for error. And since physical design is normally driven by logic design, minor logical changes resulted in a restart of the physical process, which made the work even more frustrating. Second, the physical design of machines could be adjusted to make it more amenable to automation. That is, whereas the external behavior of a circuit is often specified by its application, its layout could be implemented in a range of ways, including some that were easy to automate. Finally, the physical design of computers was something that is relatively easy to represent in computers themselves. Although physical design data volume can be high, the precise nature of layout geometry and wiring plans can be represented and manipulated efficiently. In contrast, other design concepts such as functional requirements are much harder to codify.

Providing mechanized assistance to handle the sheer volume of design was probably the basis of the first successful EDA systems. To meet the demand, EDA systems were developed that could capture artwork and logic diagrams. Later, the EDA tools started adding value by auditing the designs for certain violations (such as shorts) and later still, by automatically producing the placement or routing the wires. The value of design audits is still a major contribution of physical EDA. In fact, today, physical verification of designs (e.g., extraction) amounts to about 45% of the market for physical EDA—the design creation process accounting for 55%. However, due to space limitations, this paper will focus on the design creation half.

A. Evolving Design Environment

The evolution of physical EDA tools ran in parallel with the evolution of the target process and the evolution of the available computing platforms. Defining the "best" target process has been a tug-of-war between the desire to simplify the problem in order to make the design and testing process easier, and the desire to squeeze the best possible performance out of each generation of technology. This battle is far from over, and the tradeoff of technology complexity versus speed of design will probably be a theme for the next generation of chip technologies. In addition to the evolution of the target technologies, each new generation of technologies has impacted EDA by providing a more capable platform for running CAD tools.

This section will discuss the co-evolution of EDA tools, the machines they ran on, and the technologies that they targeted. Because physical EDA is a large and diverse field, there are many taxonomies that could be applied to it. For the purposes of exposition, we will divide it roughly along the time line of technologies as they have had an economic impact in the EDA industry. Specifically, these would include interactive tools, automatic placement & routing, layout analysis, and general layout support. On the other hand, there have been physical design technologies which have been quite promising, and even quite effective within limited contexts, but which have not impacted the revenue numbers very much. The end of this section hypothesizes why these technologies have not yet had an economic impact.

B. Early Physical Design: Interactive Support

The earliest physical EDA systems supported the technology of the day, which was small and simple by today's standards but complex compared with the systems before it. Early electronic machines consisted of discrete components, typically bipolar transistors or very small-scale ICs packed with passive components onto a two-layer board. A rack of these boards was connected together via a backplane utilizing wirewrap interconnections. In this environment, the first physical EDA tools were basically electronic drawing systems that helped produce manufacturing drawings of the boards and backplanes, which technicians could then implement with hand tools.
In the late 1960s, the situation changed because the backend of physical design started to become automated. Prior to this, complex printed circuit board and IC masks were done directly on large sheets of paper or on rubylith. As long as the final design was being implemented by hand, the time to produce a new machine was long, and the size of circuits was limited. But around 1969, new machines were invented, including the Mann IC mask maker and the Gerber flatbed plotter, that could produce the final masks automatically. All at once the bottleneck moved from the fabrication side to the production of the design files, which were typically stored on large reels of magnetic tape.

Mechanizing the mask making, thus, created a demand for tools that could accelerate the intermediate steps of design. Part of this was the "capturing" of schematic and other design data. In those days, it was fashionable to model the design process as one that was external to the computing environment and, thus, the digitizing tablet was a device for "capturing" it and storing it into a computer memory. Applicon and ComputerVision developed the first board and chip designing graphics stations during the 1970s. These were commercially significant, and set the pace for EDA work to follow. Specifically, most EDA companies in the late 1970s and 1980s were seen as system suppliers that had to provide a combination of HW and SW. Daisy and Valid, for example, grew to large companies by developing and shipping workstations specifically intended for EDA. The capabilities of these early EDA workstations were limited in many ways, including both the algorithms applied and the HW that was available, which was typically limited to something like 64K words—hardly enough to boot today's machines. Eventually, general-purpose workstations were available on the open market that had equal and soon greater capabilities than the specialized boxes. This was the approach of Mentor and essentially of all successful physical EDA companies since then.

Physical EDA needs more than computing power: it requires graphical display, which was a rarity on computers in the late 1970s. At the time, most computer users were running mechanical paper terminals, or perhaps 80 character by 24-line CRT terminals. The advent of reasonably priced vector-graphics monitors (which continuously redraw a series of lines on the CRT from a display list), and of storage tube displays (which use a charge storage effect to maintain a glow at specific locations on the face of the CRT) opened the door to human interaction with two-dimensional physical design data. These technologies were commercially developed during the 1970s and became generally available by the late 1970s. At the time of their commercial introduction, some display designers were starting to develop "bit mapped" graphics displays, where each pixel on the CRT is continuously updated from a RAM memory. These were considered high-end machines because they required a huge amount of memory (as much as 128K bytes) and were, therefore, priced out of the range of ordinary users. Of course, the 1980s brought a collapse of memory prices and bit-mapped displays came to dominate the market. The EDA industry, which had served as an early leader in workstation development, was now a small fraction of machine sales. This killed the prospect for special EDA workstations, but meant that EDA SW developers could depend on a steady sequence of ever more capable workstations to run their SW-only products on.

C. Automatic Design

While interactive physical design was a success, automatic physical EDA in the late 1970s was just beginning to provide placement and routing functions. At the time, tools internal to the biggest companies, especially IBM, Bell Labs, and RCA, were probably the most advanced. The IBM tools naturally ran on mainframes, whereas other tools would run on either mainframes or the "minicomputers" of the day, which were 16-bit machines like PDP-11s or perhaps 24-bit machines such as Harris computers. The introduction of the "super minis," especially the DEC VAX 11-780 in 1977, probably did as much to accelerate computer-aided design as any other single event. With the advent of an almost unlimited (about 24 bit) virtual memory, it became possible to represent huge design databases (of perhaps 20K objects) in memory all at once! This freed up a lot of development effort that had gone into segmenting large design problems into small ones and meant that real designs could be handled within realistic running times on machines that small to medium companies, and universities, could afford.

In this context, the first commercial automatic physical design systems started demonstrating that they could compete with designs done by layout engineers. An open question at the time was: "What is the most appropriate IC layout style?" The popular book by Mead and Conway [MEA80] had opened up the problem of layout to masses of graduate students using only colored pencils, but the approach it advocated seemed ripe for automation. It was based on the specification of individual polygons in the CIF language, and relied heavily on programmable logic arrays (PLAs), which appeared to be the ideal way to specify complex behavior of circuits. After all, they were easy to design, had predictable interfaces, and could be modified easily. To support them, PLA EDA was developed (e.g., [39]). But the problems designing with PLAs soon exceeded their advantages: they did not scale to very large sizes and did not generally abut to make larger chips. Furthermore, the very EDA systems that optimized their area undermined their predictability: making a small change in the logic of a PLA could cause it to no longer fold, resulting in a much larger change in area. This often meant that it would no longer fit into the area allocated for it. Finally, other design styles were being developed that were just as easy to use but which scaled to much larger sizes. So, the EDA support for PLAs, per se, was never a major economic factor. (But the story of programmable logic is far from over, as will be discussed in the section EDA Futures.)
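To make the PLA style concrete, the sketch below evaluates a small PLA described by its AND-plane and OR-plane "personality." The particular product terms and outputs are hypothetical, chosen only for illustration, and are not taken from any of the tools mentioned above.

    # Minimal sketch of a PLA "personality": each product term lists, per input,
    # whether it needs a 1, a 0, or don't-care ('-'); each output ORs a subset of terms.
    # The term and output tables here are made up purely for illustration.
    AND_PLANE = ["1-0", "01-", "--1"]          # three product terms over inputs a, b, c
    OR_PLANE = {"f": [0, 2], "g": [1]}         # f = t0 OR t2, g = t1

    def eval_pla(inputs):                      # inputs: a string such as "101"
        terms = [all(p == '-' or p == x for p, x in zip(term, inputs))
                 for term in AND_PLANE]        # AND plane: evaluate every product term
        return {out: int(any(terms[i] for i in rows))   # OR plane: OR the selected terms
                for out, rows in OR_PLANE.items()}

    print(eval_pla("101"))                     # {'f': 1, 'g': 0}

Folding and area optimization, as described above, rearrange how such a personality maps onto silicon without changing the logic it encodes.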
While PLAs and stick figures were all the rage on campus, industry was experimenting with a completely different paradigm called "standard-cell" design, which greatly simplified the problem of placing and routing cells. Whereas custom ICs were laid out a single transistor at a time and each basic logic gate was fit into its context, the introduction of standard cells de-coupled the transistor-level design from the logic design, and defined a new level of abstraction. When the layout was made up of cells of a standard height that were designed to abut so that their power and ground lines matched, almost any placement you create was legal (even if it was not very efficient). Since the spacing of the rows was not determined until after the routing is done, almost any channel router would succeed in connecting 100% of the nets. Hence, it suddenly became possible to automatically produce large designs that were electrically and physically correct. Many decried the area efficiency of the early standard cells since the channel routes took more than half the area. And the lack of circuit tuning resulted in chips that were clearly running much slower than the "optimal" speed of the technology. Academics called for a generation of "tall thin" designers who could handle the architecture, logic, layout and electrical problems all at once. But such people were in short supply, and there was a huge demand for medium scale integration (MSI) and large scale integration (LSI) chips for the computer systems of the day (which typically required dozens of chip designs, even for a simple CPU). The obvious priority was for design efficiency and accuracy: most chips had to "work the first time." The benefits were so obvious that standard cell layouts became standard practice for a large percentage of design starts.

Following closely on the heels of standard cell methodology was the introduction of gate arrays. Whereas standard cells could reduce design time, gate arrays offered more: the reduction in the chip fabrication time. By putting even more constraints on the layout they reduced the number of masking steps needed to "turn around" a chip. Because all the cells were predesigned to fit onto the "master-slice," producing a valid placement was never an issue with gate-arrays. But because the area available for routing was fixed, not stretchable, most of the chips were routing limited. Thus the new problem, area routing, dominated the CAD support world for gate arrays.

Commercial automatic physical design tools started appearing in the early to mid 1980s. Silvar-Lisco, VRI, CADI, Daisy, SDA, ECAD, and others provided early silicon P&R tools, which included both a placement and a routing subsystem. The CalMP program from Silvar-Lisco achieved both technical and economic success, for example, around 1983. Early routing products were limited to two layers, which was fine for standard cells but not very effective for gate arrays. The tools that provided effective area routing with the three metal layers, therefore, came into high demand for gate array use. The Tancell and more especially the TanGate programs from Tangent were major players in placement and routing in the 1985–1986 timeframe. In addition to their multilayer capability, part of their success was due to the development of a standard ASCII format for libraries and design data, called LEF and DEF, respectively. Because these allowed vendors to specify their own physical libraries in a relatively understandable format, they became the de facto standards for the industry. The fact that they were, at that time, proprietary and not public formats slowed, but did not stop, their adoption by other EDA vendors. Eventually, the physical design tools from Tangent, SDA, and ECAD were placed under the Cadence umbrella, and Cadence came to dominate the market for physical EDA with its Cell3, Gate Ensemble, and later Silicon Ensemble products. Since the introduction of Cell3 there has been competition in this area, including the GARDS program from Silvar-Lisco, and later, tools from ArcSys (later called Avant!) and others. At the moment, Cadence and Avant! hold the dominant position in the market for physical P&R tools with their Silicon Ensemble and Apollo systems, respectively. Cadence continues to support and extend LEF and DEF in its tools, and has recently put these formats and related readers into an open source model.2

2 See http://www.OpenEda.org

Many of the first commercially successful placement algorithms were based on slicing tree placement, guided by relatively simple graph partitioning algorithms, such as Kernighan–Lin [64] and Fiduccia–Mattheyses [50]. These algorithms have the advantage that they deal with the entire netlist at the early steps, and could at least potentially produce a placement that reduced the number of wire crossings. But as the competition for better QOR heated up, and the compute power available to EDA increased during the early 1970s, it became attractive to throw more cycles at the placement problem in order to achieve a smaller die area. About this time, the nature of the Metropolis and Simulated Annealing algorithms [67] became much better understood from both a theoretical and practical viewpoint, and the mantle of highest quality placement was generally held by the annealing placers such as TimberWolf [99].

Because simulated annealing is relatively simple to understand and implement (at least in a naïve way) many students and companies implemented annealing engines. Various speed-up mechanisms were employed to varying degrees of success, and factors of 10 from one run to another were not uncommon. Early annealing engines required expert users to specify annealing "schedules" with starting "temperatures" sometimes in the units of "million degrees." Such homegrown heuristics reduced the accessibility of physical design to a small clique of experts who knew where all the buttons were. Gradually, the annealing technology became better understood, and put on firmer foundations, and more parameters were derived automatically based on metrics extracted from the design.
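As a concrete illustration of the annealing placers described above, the following sketch anneals a toy standard-cell placement by swapping cell positions under the classic Metropolis acceptance rule. It is a minimal sketch, not the TimberWolf algorithm itself: the netlist, the cost function (total half-perimeter wirelength), and the fixed cooling schedule are simplified assumptions introduced only for illustration.

    import math, random

    # Hypothetical toy netlist: nets are lists of cell names; a placement maps cell -> (x, y) slot.
    NETS = [["a", "b"], ["b", "c", "d"], ["a", "d"]]
    SLOTS = [(x, y) for y in range(2) for x in range(2)]          # a 2x2 grid of legal positions

    def wirelength(place):
        # Half-perimeter wirelength (HPWL) summed over all nets, a common placement cost.
        total = 0
        for net in NETS:
            xs = [place[c][0] for c in net]
            ys = [place[c][1] for c in net]
            total += (max(xs) - min(xs)) + (max(ys) - min(ys))
        return total

    def anneal(cells, temp=5.0, cooling=0.95, moves_per_temp=50, t_min=0.01):
        place = dict(zip(cells, random.sample(SLOTS, len(cells))))
        cost = wirelength(place)
        while temp > t_min:
            for _ in range(moves_per_temp):
                a, b = random.sample(cells, 2)                    # propose a pairwise swap
                place[a], place[b] = place[b], place[a]
                new_cost = wirelength(place)
                delta = new_cost - cost
                if delta <= 0 or random.random() < math.exp(-delta / temp):
                    cost = new_cost                               # accept: downhill always, uphill sometimes
                else:
                    place[a], place[b] = place[b], place[a]       # reject: undo the swap
            temp *= cooling                                       # cool according to the schedule
        return place, cost

    random.seed(1)
    print(anneal(["a", "b", "c", "d"]))

Production placers differ mainly in their move sets, incremental cost updates, and automatically derived schedules, which is exactly the evolution the text describes.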
But slicing placement was not dead—it gained new life with the advent of analytical algorithms that optimally solved the problem of ordering the cells within one or two dimensions. Because the classic partitioning algorithms tend to run very slowly and suboptimally on problems of more than a few thousand cells, the use of analytical guidance produced much higher quality much more quickly in systems like PROUD [117] and Gordian [68]. When commercialized by companies such as Cadence, they were marketed as Q for "quadratic" and for "quick" algorithms. They opened the door to placing designs of 100K cells in a reasonable time.

In more recent years, the competitive environment for placers has started to include claims of timing-driven placement, and synthesis-directed placement. Early timing-driven placement techniques worked by taking a small fixed set of net weights and tried to place the cells such that the high weighted nets were shorter [45], [46]. More recent work in the area has supported path-based timing constraints, with a fixed set of time critical paths. The current state of the art for timing-driven placement is to merge the placement algorithm with timing analysis, so that the user need only supply overall timing constraints and the system will derive the paths and the internal net priorities.

Synthesis directed placement (or placement directed synthesis, according to some) is a melding of placement operations and synthesis operations. The synthesis operations range from simple sizing of devices to complete resynthesis of logic starting from technology independent representations. Most commercial "placement" systems, including the current generation from Cadence and Avant!, have effective sizing and buffering operations built into them. These greatly improve the timing of the final circuit. Cadence, with Silicon Ensemble, and Avant!, with Apollo, are currently in the market leading tools. In a different approach, others are currently working to combine RTL synthesis operations such as resource sharing, core logic synthesis such as timing-driven structuring, and placement, all while using common timing models and a core timing engine. Although the economic impact of integrated synthesis with placement is small at this time, it seems likely that most commercial placement tools will have to have some kind of synthesis operations incorporated within them in order to remain competitive in the long run. The current market entrants into this new arena include Saturn, from Avant!, PKS from Cadence, Physical Compiler from Synopsys, Dolphin from Monterey Designs, Blast Fusion from Magma, and others. This is not to say that the technical approaches are the same for these products, only that they are all targeting the solution of the timing closure problem by some combination of synthesis with placement.

D. Routing

The first application of automatic routing techniques was to printed circuit boards, where automatic routing algorithms predated successful placement algorithms. There were at least three reasons for this: routing was a more tedious process than placement; minor changes in placement required redoing the routing; and the number of pins to be routed was an order of magnitude greater than the number of components to be placed. Early PC board routing EDA consisted mostly of in-house tools in the 1960s and 1970s running on mainframes and the minicomputers available at the time. Most of the practical routing algorithms were basically search algorithms based on either a breadth-first grid search [73] or line-probe search algorithms [59]. Because breadth-first search techniques require unacceptably long runtimes, many optimizations, such as the A* algorithm, have been used to speed up routers [58], [35].

Sometime in the 1970s the number of components on ICs started to exceed the number of components on PC boards. At this point the biggest challenges in physical EDA, and the biggest revenue, started to migrate to the silicon domain. The nature of the technology involved placing hundreds to thousands of relatively simple circuits. In this context, automatic placement had almost no value without automatic routing. Hence, there was a strong motivation to provide a solution that included both. As mentioned above, early automatic gate-level placement was often standard-cell based, which meant that channel routing could be applied. As defined in the early 1970s, channel routing was the problem of connecting a set of pins on either side of a parallel channel with the fewest tracks [42]. Because this problem was simple enough to model analytically, its definition led to a large number of algorithms and heuristics in the literature, and to a lesser extent, in the market [92].

Although standard cells connected by channel routes were effective, the advent of gate-array layout brought the need for a new generation of IC routers that could connect a set of points distributed across the surface of the die. Ironically, this was more like the original printed circuit board routing problem and could be attacked with similar methods, including both gridded and gridless routing—although most of the practical commercial routers have been gridded until fairly recently. But the classic routing techniques had a problem in that they tended to route one net at a time to completion, sometimes called sequential routing, before starting on the next. This tended to result in high completion rates for the nets at the beginning and low rates, or long paths, for nets at the end of the run. Addressing these quality-of-results problems has been a main focus of router work for more than a decade. There have been many significant advances, including a process of strategically sequencing the route with a global route run before the main routing [31], and many heuristics for selective rip-up and reroute done during or after the net-by-net routing [51]. Producing a competitive router today requires extensive experimentation, integration, and tuning of these processes. More detailed discussions on this and other issues in physical design, with references, can be found in [104].
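To make the gridded search routers described above concrete, the sketch below routes a single two-pin net on a small grid using a plain breadth-first (Lee-style) wave expansion and backtrace. The grid, obstacles, and pin locations are hypothetical; real routers add the cost functions, multiple layers, and the global-route and rip-up-and-reroute heuristics discussed above.

    from collections import deque

    def route_net(grid, src, dst):
        # Breadth-first (Lee-style) wave expansion over a 0/1 obstacle grid,
        # followed by a backtrace from target to source to recover the path.
        rows, cols = len(grid), len(grid[0])
        parent = {src: None}
        frontier = deque([src])
        while frontier:
            cell = frontier.popleft()
            if cell == dst:
                path = []
                while cell is not None:           # backtrace through recorded parents
                    path.append(cell)
                    cell = parent[cell]
                return path[::-1]
            r, c = cell
            for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                if 0 <= nr < rows and 0 <= nc < cols \
                   and grid[nr][nc] == 0 and (nr, nc) not in parent:
                    parent[(nr, nc)] = cell
                    frontier.append((nr, nc))
        return None                               # net is unroutable on this grid

    # Hypothetical 4x5 grid: 1 marks blocked cells (previously routed nets, macros, etc.).
    GRID = [[0, 0, 0, 0, 0],
            [0, 1, 1, 1, 0],
            [0, 0, 0, 1, 0],
            [1, 1, 0, 0, 0]]
    print(route_net(GRID, (0, 0), (3, 4)))

Routing nets one at a time with such a search is exactly the sequential-routing behavior whose completion-rate problems are described above.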
E. Critical Support Subsystems Within Place and Route

Practical layout systems depend on many specialized subsystems in order to complete the chip. However, it has not generally been commercially viable to build these as separate tools to link into another company's P&R system. Examples of such tools include clock tree synthesis, power/ground routing, filler cell insertion, and other functions. Some start-up companies have begun to offer such support as point tools, an example being specialized clock tree synthesis solutions with deliberate nonzero skew, but they have not yet had large economic impact.

F. Physical Design Tools with Smaller Economic Impact

In addition to the central place/route/verify business, many other physical design facilities and tools have been developed and many of these have been launched commercially. However, as of yet they have not had as much impact on the business side of EDA. Examples of these include automatic floorplanning tools, cell layout synthesizers, layout generators, and compactors. This section discusses some of the reasons behind the limited economic impact of these and similar tools.

Floorplanning is the process of organizing a die into large regions, and assigning major blocks to those regions. Very often, the blocks to be floorplanned are malleable, and part of the floorplanning process is reshaping the blocks to fit into the die efficiently. There are many constraints and cost metrics involved in floorplanning, some of which are easy to compute, such as the amount of wasted area lost by ill-fitting blocks. But other constraints are much more difficult to compute, such as the impact that routing the chip will have on its timing. Many academic papers and some commercial EDA systems have proposed solutions to floorplanning [90], and in some cases they have met with significant success within a specific environment. But the EDA industry overall has not seen a large revenue stream associated with floorplanning—it amounts to only about 2% of the physical IC EDA market. This may be partly due to the lack of successful tools in the area. But it is also due, at least in part, to the nature of the floorplanning problem: it is both tractable to human designers, and attractive to them. Generally, chip architects are not desperate to hand this problem to a machine since skilled designers can produce floorplans that are better than the best machine-produced floorplans, and can develop them within a matter of days. Hence, companies today seem unwilling to spend large sums for automatic floorplanning. As chips get larger, and design teams get more complex, floorplanning and hierarchical design in general may play a larger role, since the use of explicit hierarchical blocks provides an effective mechanism for insulating one complex subsystem from another.

Another area that has not had much commercial impact has been cell layout synthesis. This is the process of taking a transistor-level specification and producing detailed transistor-level layout. Many projects of this nature have been done in research [118] and inside of large companies, and a few start-ups have attempted to make this a commercial product. But so far, automatic cell layout has had little impact on the EDA economy. This may be because the total amount of transistor-level layout done in the industry is small and getting smaller, or it may be because the overhead of verifying circuits at the transistor level raises the effective cost too high. But in any case, the current economic impact of cell synthesis on EDA is small.

For people interested in full-custom layout, the use of layout generators was a promising area in the late 1980s and early 1990s. These tools provide a framework for algorithmic placement systems to be written by designers, and can provide amazing productivity gains within specific contexts. However, they tended to suffer from the syndrome of too many contributors, and too little demand for them. That is, many groups would write generators for such circuits as ALUs, but few designers would pick up someone else's generator and use it, since its detailed physical and electrical properties were not well understood. Also, generator development systems were targeted at a relatively small group of people who were experts in both layout and in programming, and these people could very often provide similar systems themselves. At the moment, these systems are not in wide use.

The final case example would be layout compaction tools. These were promoted starting in the 1980s as a way of improving the efficiency of physical design and enabling a design to be ported from one technology to another [120]. However, they suffered from several problems. First, they were targeted at the limited number of full custom layout designers. Second, they produced layouts that were generally correct in most rules, but often required tuning to understand some of the peculiar technology rules involved in the true detailed high-performance layout. Third, they were viewed with suspicion by the designers used to doing full custom layout interactively, and in the end were not widely accepted.

G. On the Future of Physical Design EDA

Each generation of technology has brought larger-scale problems, run on more powerful computer platforms, targeted at ever faster and more capable very large scale integration (VLSI) technologies. Each generation has traded off the raw capabilities of the underlying technology with the desire to get working designs out quickly. Such tradeoffs abstract some of the physics of the problem to allow complex effects to be safely ignored. Coupled with the increasing complexity of the fabrication process and the faster workstation, it seems as if the EDA developer's effort level has been roughly constant over the last 20 years. That is, as a problem such as metal migration becomes well understood, the next wrinkle in the technology (e.g., self-heating) is revealed. The overall effect is cumulative and practical EDA systems today tend to be large and grow roughly linearly with time.

On the other side of the balance, a greater number of systems than ever can be designed without the need for "mask programmed" silicon at all. The PLAs that represented the early attempts to automate chip design turned into PLDs and then CPLDs. Likewise, the gate-arrays used to reduce turnaround time now compete head-to-head with FPGAs. Even custom computer microprocessor chips, such as the Intel Pentium, now compete with SW emulation of the same instruction set, as in the TransMeta Crusoe processor. In addition, there is a new category of design, named "application specific programmable products" [38], which combine programmability into traditional application specific chip designs. This allows one mask set to support multiple users, even across multiple customers. Another example is the projections from FPGA vendors, such as the idea that within ten years, almost all designs will have some amount of programmable logic [93]. What does this mean for physical EDA? That depends on whether the EDA is for the chip designers or for the chip users.

The number of custom chip designers will probably remain relatively constant. This is because there are several factors that tend to oppose each other: the number of design starts is diminishing, but the complexity of each design is increasing. The cost of designing and testing each chip is large and growing, and so, the leverage that advanced EDA tools can offer continues to be significant. Hence, EDA tools for mask programmed chips that offer good performance with excellent reliability should probably have a steady, if not expanding, market.

III. SIMULATION AND VERIFICATION OF VLSI DESIGNS

Verification of design functionality for digital ICs has always been a huge and challenging problem, and will continue to remain so for the foreseeable future.

If we measure difficulty of verifying a design by the total possible states of its storage elements, which is two to the power of the number of elements, we face an explosion of two to the power of Moore's Law.3 In reality, it is not that bad. But long experience has clearly shown that verification work is a super-linear function of design size. A fair approximation is that doubling the gate count doubles the work per clock cycle, and that the extra complexity also at least doubles the number of cycles needed to get acceptable coverage [22], [112]. Thus, the verification problem is Moore's Law squared, that is doubling the exponent in an exponentially increasing function. Even if processor performance continues to increase according to Moore's Law (exponentially), design verification time continues to double every 18 months, all other things remaining equal.

3 Moore's Law, named after Gordon Moore of Intel, states that the density of circuits on an IC roughly doubles every 18 months.
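The growth argument above can be restated as a small worked calculation. The symbols below (gate count g, work per cycle w(g), cycle count c(g), and total effort T(g) = w(g) c(g)) are introduced here only to restate the paper's approximation and are not from the original text:

    \[
    w(2g) \approx 2\,w(g), \qquad c(2g) \approx 2\,c(g)
    \quad\Longrightarrow\quad
    T(2g) = w(2g)\,c(2g) \approx 4\,T(g),
    \]
    \[
    T(g) \propto g^{2}, \qquad g(t) \propto 2^{t/18\ \mathrm{months}}
    \quad\Longrightarrow\quad
    T(t) \propto 2^{2t/18\ \mathrm{months}}.
    \]

Dividing by processor throughput, which itself grows only as 2^{t/18 months}, leaves a wall-clock verification time that still doubles every 18 months, matching the statement above.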
Since “time to market” pressures do not allow verification time to double, all other things have not remained equal. Electronic CAD algorithms and methodologies have come to the rescue in three general ways: abstraction, acceleration and analysis. Over the last 20 years, mainstream verification has moved to continually higher levels of abstraction, from the gate level, to the register transfer level (RTL), and now into the system level. Likewise, timing resolution has moved from nanoseconds to clock cycles.4 Special-purpose HW has been applied to accelerate verification, by taking advantage of parallelism and specific dataflows and operations. Logic emulation and rapid prototyping use real programmable HW to run billions of cycles of design verification per day. Analysis, i.e., formal verification, can prove equivalence and properties of logic networks. Formal tools are limited in the depth of time they can address, compared with simulation and emulation, but their verification is logically complete in a way that simulation and emulation can never be.

4 Of course these days a nanosecond is a clock cycle …

Abstraction

Any level we choose to verify at involves some abstraction. Early on, logic designers found they could abstract above their transistor circuit designs and simulate at the logic level of gates and flip–flops, escaping from SPICE for functional verification. Logic simulators used in the earliest days of digital VLSI worked mainly at the gate level, fully modeled nominal timing at the nanosecond level, and used an event-driven algorithm.

A. Events

The event-driven algorithm was previously well known in the larger simulation world outside ECAD. It calculates each transition of a logic signal, called an event, locates it in time, and fans it out to its receivers. Compute time is only spent on processing events, not on gates and time steps without activity. Since no more than 10%–20% of the gates in a gate-level design are active on an average cycle, this is an efficient algorithm. It makes no assumptions about timing beyond gate delays, so it can correctly model asynchronous feedback, transparent latches, and multiple clock domains. Setup and hold time violations on storage elements are modeled and detectable. Event-driven logic simulation remains a widely used fundamental algorithm today.
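A minimal sketch of the event-driven loop just described is shown below: events are kept on a time-ordered wheel, and only the fanout of a signal that actually changes is re-evaluated. The two-gate netlist, delays, and stimulus are hypothetical; a production simulator adds the scheduling refinements, timing checks, and multivalued logic discussed in this section.

    import heapq

    # Hypothetical netlist: each gate has a function, input nets, an output net, and a delay.
    GATES = {
        "g1": {"fn": lambda a, b: a & b, "ins": ["x", "y"], "out": "n1", "delay": 2},
        "g2": {"fn": lambda a, b: a | b, "ins": ["n1", "z"], "out": "q", "delay": 1},
    }
    FANOUT = {"x": ["g1"], "y": ["g1"], "n1": ["g2"], "z": ["g2"]}

    def simulate(stimulus, end_time=20):
        values = {net: 0 for net in ("x", "y", "z", "n1", "q")}
        wheel = list(stimulus)                     # (time, net, value) events from the testbench
        heapq.heapify(wheel)
        while wheel and wheel[0][0] <= end_time:
            time, net, value = heapq.heappop(wheel)
            if values[net] == value:
                continue                           # no transition, so nothing to evaluate
            values[net] = value
            print(f"t={time}: {net} -> {value}")
            for gname in FANOUT.get(net, []):      # fan the event out to receiving gates
                g = GATES[gname]
                new = g["fn"](*(values[i] for i in g["ins"]))
                heapq.heappush(wheel, (time + g["delay"], g["out"], new))
        return values

    simulate([(0, "x", 1), (0, "y", 1), (5, "z", 1), (8, "x", 0)])

Note that compute time is spent only when a net changes, which is the property that makes the algorithm efficient on designs where only a fraction of the gates are active each cycle.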
1970s examples of such simulators include IBM's VMS [2], [26], TEGAS [115], and CADAT. They were developed for very large projects, such as mainframe computers, which were the earliest adopters of VLSI technology. VLSI gate arrays (i.e., ASICs) appeared in the early 1980s, replacing discrete MSI TTL implementations, which were verified by real-time prototyping, and corrected with jumper wires. With ASICs came a much broader demand for logic simulation in the design community.

Daisy Systems, Mentor Graphics, and Valid Logic, among others, became well established by putting schematic capture and event-driven logic simulation running on networked microprocessor-based workstations onto ASIC designers' desks.

B. Cycles

Soon a higher-abstraction alternative to the event-driven algorithm came into some use. Cycle-based simulation abstracts time up to the clock cycle level. Most of the work in event-driven simulation is event detection, timing, and scheduling, which involves a lot of data-dependent branching and pointer following through large data structures, which is difficult to speed up in a conventional processor. Gate evaluation takes very little time. In pure cycle-based simulation, event handling is dispensed with in exchange for evaluating all gates in a block every cycle. Preprocessing analyzes the gates between storage elements to find a correct order of evaluation. The simulator has few run-time data dependencies, usually only conditions that enable or disable whole blocks. Pipelined processors with large caches, which were becoming available in mainframes and workstations of the late 1980s, can run this algorithm very fast. Resulting performance is typically an order of magnitude greater than event-driven simulation on the same platform.

Unfortunately, not everyone can use pure cycle-based simulation. It fundamentally assumes that each gate is only active once per clock cycle, in rank order. Designs with asynchronous feedback, which includes transparent latches, or other asynchronous behavior, cannot be simulated this way. The assumption of a single clock cycle excludes designs with multiple independent clock domains, which are common in telecommunications and media processors. Multiple clocks in a common clock domain can be simulated by using subcycles of the largest common divisor of the clock periods, at a substantial speed penalty. Not surprisingly, the first cycle-based logic simulators were developed by large computer companies, such as IBM, which built large, complex processors with a single clock domain, and which could enforce a strict synchronous design methodology on their designers. IBM's 3081 mainframe project developed EFS, a cycle-based simulator, in the late 1970s [84], [45], [46]. Pure cycle-based simulation is widely used today on designs, mostly CPUs, which can accommodate its methodology constraints. It has become less attractive as the performance of event-driven simulators has steadily improved, partially because of their incorporation of cycle-based techniques.

Cycle-based simulation does not verify timing. In synchronous designs, static timing analysis is used instead. Analysis does a more thorough job of verifying timing than simulation, since it covers all paths without depending on stimulus for coverage. Timing analysis has the opposite disadvantage, reporting false paths, paths that can never occur in actual operation. Modern static timing analysis tools are much more capable of avoiding false paths than earlier analyzers. However, when a design has asynchronous logic, event-driven simulation with full timing is called for, at least on the asynchronous portion.

C. Models

Most designs contain large modules, which do not need verification, since they are standard parts or previously verified HDL cores, but which must be present for the rest of the design to be verified. Fortunately we can use models of these modules, which only provide the external behavior, while internal details are abstracted away. Models take a number of SW or HW forms, most of which emerged in the mid-1980s, particularly from logic automation and logic modeling systems.

Fully functional SW models completely represent the internal states and cycle-based behavior, as seen from the chip's pins. They can also provide visibility to internal registers through a programmer-friendly interface. Instruction set simulators abstract a processor further, to the instruction level, and can execute binary code at a very high speed. Often only the bus-level behavior is needed, so bus-functional models of chips and buses, which are controlled by transaction command scripts, are widely used. Nothing is as fast or accurate as the silicon itself. A HW modeler takes an actual chip and interfaces it to simulation.

All these modeling technologies connect to simulators through standard interfaces. Verilog's Programming Language Interface (PLI) is a standard model interface. PLI also makes it possible for the user to build custom utilities that can run in tandem with the simulator, to instrument it in arbitrary ways. This is how the precursors to the testbench language Vera were implemented, and many coverage tools have been written using PLI. It is a differentiating feature of SW versus HW-accelerated simulators.

D. Compiled Code

Simulators were originally built as looping interpreters, using static data structures to represent the logic and its interconnect, and an event wheel list structure to schedule events in time. Some early cycle-based simulators used a compiled code technique instead. The logic design is analyzed by a compiler, which constructs a program that, when executed, simulates the design according to the simulation algorithm. As in programming language compilers, the overhead of an interpreter is avoided, and some optimizations can be found at compile time which are not practical or possible at run time. This technique trades preprocessing time for run-time efficiency, which is usually a good tradeoff to make in logic verification. Hill's SABLE at Stanford [60], which simulated the ADLIB HDL [61], and IBM's CEFS of 1985 [1] were early examples of compiled code simulation. While originally considered synonymous with the cycle-based algorithm, event-driven simulators have also been implemented in compiled code to good effect. Chronologic's Verilog Compiled Simulator, which is now Synopsys VCS, was the first compiled code simulator for Verilog in 1993, and also the first native-code compiled code simulator.
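As an illustration of the compiled-code idea described above, the sketch below "compiles" a tiny levelized netlist into straight-line source code and executes it, so that each simulated cycle runs generated code rather than an interpreter loop walking data structures. The netlist and the code generator are hypothetical stand-ins for illustration, not the scheme used by any of the simulators named above.

    # Hypothetical two-gate netlist, already levelized (inputs evaluated before their fanout).
    NETLIST = [("n1", "x & y"), ("q", "n1 | z")]

    def compile_design(netlist):
        # Emit one straight-line function that evaluates every net once per call,
        # in rank order: the compiled-code counterpart of an interpreter's inner loop.
        lines = ["def cycle(x, y, z):"]
        lines += [f"    {net} = {expr}" for net, expr in netlist]
        lines.append("    return q")
        source = "\n".join(lines)
        namespace = {}
        exec(source, namespace)                  # "compile" step: build the simulator function
        return namespace["cycle"], source

    cycle, source = compile_design(NETLIST)
    print(source)                                # show the generated straight-line evaluator
    print(cycle(1, 1, 0))                        # one clock cycle of simulation -> 1

The preprocessing cost of generating and compiling the evaluator is paid once, while every simulated cycle benefits, which is the tradeoff the section describes.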
In the early 1980s, we can see the influence of process simulation languages and early HDLs in Cadat's DSL (Digital Stimulus Language), Daisy's stimulus language, and the stimulus constructs of HiLo and Silos. Also, we can see in the IC-Test language of John Newkirk and Rob Matthews at Stanford in 1981 features that became integral in modern testbench languages: de-coupling of timing and function, explicit binding of testbench and device pins, and the use of a standard, high-level language (C extended with programs).

In the late 1980s, Systems Science worked with Intel to develop a functional macro language, with timing abstracted in generators that were directly tied to specification sheets, capturing the notion of executable specifications. In the 1990s, Verilog and VHDL, as well as the standard languages Perl and C/C++, became widely used to develop testbenches, in spite of their focus on HW or SW description, not testbenches. Formal methods had been proposed to address testbench generation, and were sometimes oversold as universal panaceas, which they are not. Although there are many appealing elements in both standard languages and in some formal techniques, their case has been overstated [29], [30]. Still, we can see a big influence from both of those fields.

Verification problems exploded in the mid to late 90s, with companies finding that they were spending substantially more on verification than on design. The modern verification languages Vera, from Systems Science, now part of Synopsys, and Specman, from Verisity, appeared in response, developed from the start with a focus on functional verification.

Vera is an object-oriented, HW-aware language, which allows users to generate stimulus, check results, and measure coverage in a pragmatic fashion [4]. Some of the issues it has addressed include directed and random stimulus, automatic generation of stimulus (with specialized packet generation mechanisms and with nondeterministic automata), flexible abstractions tailored to check complex functional responses, ways of launching and synchronizing massive amounts of concurrent activity, high-level coverage (state, function, sequences, etc.), encapsulation of testbench IP, seamless interfacing with HDLs and with high-level languages such as C and Java for HW/SW co-simulation, and finally, specialized testbench debugging, as testbenches have become complex programs in themselves.

Modern testbench languages, such as Vera, have been adopted by many of the largest semiconductor and systems companies, and continue to mature and grow in their usage and capabilities.
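As an illustration of the ideas these languages package together (randomized stimulus, automatic response checking, and functional coverage), the following Python sketch exercises a hypothetical 8-bit adder. The names dut_add and reference_add are invented stand-ins for a design under test and its reference model; this is a sketch of the concepts only, not Vera syntax or any vendor API.

import random

def dut_add(a, b):
    # Stand-in for the design under test: an 8-bit adder with carry-out.
    s = a + b
    return s & 0xFF, (s >> 8) & 0x1

def reference_add(a, b):
    # Reference model used to check the DUT's responses.
    return (a + b) & 0xFF, 1 if a + b > 0xFF else 0

coverage = {"carry_out_0": 0, "carry_out_1": 0, "a_max": 0}

random.seed(1)
for _ in range(1000):
    a, b = random.randrange(256), random.randrange(256)      # random stimulus
    got = dut_add(a, b)
    assert got == reference_add(a, b), f"mismatch for {a}+{b}"  # response check
    coverage["carry_out_1" if got[1] else "carry_out_0"] += 1   # functional coverage
    coverage["a_max"] += (a == 0xFF)

print({name: count > 0 for name, count in coverage.items()})    # which bins were hit?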
H. Acceleration and Emulation

An accelerator implements a simulation algorithm in HW, gaining speedup from parallelism, at both the operation and processor levels, and from specialized operators and control. A logic emulator can plug the design into live real-time HW for verification by actual operation.

I. Event-Driven Accelerators

Event-driven HW accelerators have been with us almost as long as logic simulation itself. The first machine, built in the 1960s at Boeing by Angus McKay [11], [81], contained most elements found in event-driven accelerators ever since.

A HW time wheel schedules events, which are fanned out to their input primitives through one or more netlist tables. Hardware can directly evaluate a logic primitive, detect an output event, find its delay, and give it to the time wheel very quickly. Most timesteps have many independent events to process, so pipelining can take advantage of this low-level parallelism to raise gate evaluation speed to match the accelerator HW clock rate. Most accelerators also use the high-level parallelism available from locality of the circuit's topology, by having multiple parallel processors, which accelerate a partitioned design. Events across partition boundaries are communicated over a fast interconnect.
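A minimal software sketch of the same event-driven algorithm may make the mechanism concrete. Here a priority queue stands in for the hardware time wheel, and the three-gate netlist, delays, and input events are invented for illustration only.

import heapq

GATES = {
    "AND": lambda ins: int(all(ins)),
    "OR":  lambda ins: int(any(ins)),
    "INV": lambda ins: int(not ins[0]),
}

# net -> (gate type, input nets, delay); `values` holds the current net values.
netlist = {"n1": ("AND", ["a", "b"], 2), "out": ("INV", ["n1"], 1)}
values = {"a": 0, "b": 0, "n1": 0, "out": 1}
fanout = {}                                  # driver net -> driven gate outputs
for out_net, (_, ins, _) in netlist.items():
    for i in ins:
        fanout.setdefault(i, []).append(out_net)

wheel = [(0, "a", 1), (0, "b", 1), (5, "b", 0)]   # (time, net, new value)
heapq.heapify(wheel)

while wheel:
    t, net, val = heapq.heappop(wheel)
    if values[net] == val:
        continue                             # not a real event: value unchanged
    values[net] = val
    for g in fanout.get(net, []):            # fan the event out, evaluate gates
        gtype, ins, delay = netlist[g]
        new = GATES[gtype]([values[i] for i in ins])
        if new != values[g]:
            heapq.heappush(wheel, (t + delay, g, new))  # schedule output event
    print(f"t={t}: {net}={val}")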
Soon after EDA emerged on one-MIPS desktop workstations in the early 1980s, the need for acceleration was immediately apparent. Zycad, Silicon Solutions, Daisy, Valid, Ikos, and others responded with event-driven HW accelerators. Speeds between 0.1 million and 360 million gate evaluations per second and capacities up to 3.8 million gates were achieved by the mid-1980s [11]. This was one to two orders of magnitude faster and larger than the SW simulators of the day.

Mentor Graphics' 1986 accelerator, the Compute Engine [6], was a pipelined superscalar RISC attached processor, with large physical memory and HLL compilers. It foreshadowed the development of general-purpose workstations with pipelined superscalar RISC CPUs in the late 1980s and early 1990s. Their sharply increased performance and capacity greatly decreased the demand for HW accelerators, and most such offerings eventually disappeared. Today, Ikos remains the sole vendor of event-driven HW accelerators.

J. Cycle-Based Accelerators and Emulators

IBM developed several large-scale cycle-based simulation accelerators in the 1980s, first the Yorktown Simulation Engine (YSE) research project [41], followed by the Engineering Verification Engine (EVE) [9]. EVE could simulate up to two million gates at a peak of 2.2 billion gate evaluations per second, in pure cycle-based or unit-delay modes, or a mix of the two. It was widely used within IBM, especially for processor and controller-based systems, running actual SW on the accelerated design.

Similar technology became commercially available in the 1990s in logic emulator form, when Arkos introduced its system, and Quickturn introduced its CoBALT machine, a productization of IBM silicon and compiler technology directly descended from the YSE/EVE work. Today's CoBALT Plus is capable of speeds over 100 000 cycles per second on system designs up to 20 million gates, fast enough to be an in-circuit-connected logic emulator in live HW targets.

Cycle-based machines are subject to the same design constraints, detailed above, that apply to all implementations of the cycle-based algorithm, so a large fraction of design projects cannot use them.

K. Massively Parallel General-Purpose Processors

Another 1980s response to the need for accelerated performance was research into logic simulation on general-purpose massively parallel processors [107]. The result of this research
was mainly negative. In global-time event-driven logic simulation, enough high-level parallelism is available to keep up to about ten, but nowhere near 100, processors busy [8]. This is because logic designs have a very irregular topology. Events can immediately propagate very far across a logic design. Massively parallel processing succeeds on physical simulations, such as fluid flow and structural analysis, because the inherent nearest-neighbor communications in three-dimensional physical reality keep communications local and predictable. Also, parallelism in logic designs is very irregular in time, greatly concentrated at the beginning of a clock cycle. Many timesteps have few events to do in parallel. Distributed-time algorithms [28] were investigated, but the overhead required outweighed the gains [109]. Today, some simulation tools can take advantage of the modest multiprocessing widely available in modern workstations.

Much more common is task-level parallelism, which exists across a large collection of regression vector sets that must be simulated to validate design correctness after minor design revisions late in a project. Many projects have built "farms" or "ranches" of tens or hundreds of inexpensive networked workstations or PCs, which run conventional simulation tools in batch mode. In terms of cycles/s/dollar, this is a very successful acceleration technique, and it is probably the most commonly used one today. It is important to remember that this only works when task-level parallelism is available. When a single thread of verification is required, such as running OS and application SW and large data sets on the logic design, no task-level parallelism is available, and very high performance on a single task is still needed.

L. FPGA-Based Emulators

FPGAs emerged in the late 1980s as an implementation technology for digital logic. Most FPGAs are composed of small clusters of static random access memory, used as look-up tables to implement combinational logic, and flip–flops, in an array of programmable interconnect [19]. FPGAs have less capacity when used for modeling ASIC and full custom designs than when intentionally targeted, since the technology mapping and I/O pin utilization is far from ideal. FPGAs used for verification could hold 1K gates in the early 1990s, and have grown to 100K gates or more today.

Logic emulation systems soon took a large number of FPGAs and harnessed them as a logic verification tool [23], [97]. A compiler does emulation-specific HDL synthesis into FPGA primitives, runs timing analysis to avoid introducing hold-time violations, partitions the netlist into FPGAs and boards, observing capacity and pin constraints and critical timing paths, routes the inter-FPGA interconnect, and runs chip-level placement and routing on each FPGA. At run time, the FPGAs and interconnect are programmed, and a live, real-time emulation of the logic design results. Multi-MHz system clock rates are typically achieved. The emulated design may be plugged directly into a slowed-down version of the target HW, and operated with the real applications, OS, and data sets as stimulus. Alternatively, precompiled vectors may be used, for very large regression tests.

The most difficult technical challenge in logic emulation is interconnecting the FPGAs in a general-purpose, scalable, and affordable way, without wasting FPGA capacity or introducing excessive routing delay. Most commercial emulators, such as the post-1992 Quickturn systems, use a separate partial crossbar interconnect architecture, built with a hierarchy of crossbar routing chips [23]. Ikos emulators directly interconnect the FPGAs to one another in a grid, and time-multiplex signals over the pins synchronously with the design's clock [7]. Both interconnect architectures make the FPGAs' internal routing interdependent, so it is hard to incrementally change the design or insert probes without a major recompile.

Another major technical challenge is the logic emulation compiler [24]. Multilevel, multiway timing-driven partitioning is a large and NP-hard problem [5], which can take a long time to execute, and which is resistant to parallel acceleration. An emulation design database is required to map from HDL-level names to the FPGA HW for run time debugging. Often quite a lot of user intervention is required to achieve successful compilation, especially in identifying clock trees for timing analysis of designs involving many gated clocks.

Logic emulation, in both FPGA and cycle-based forms, is widely used today on major microprocessor development projects [52], multimedia, telecom, and networking designs. Interestingly, users have found as much value post-silicon as pretapeout, because of the visibility afforded by the emulator. System-level problems can be observed through the logic emulator's built-in logic analyzer, with access to design internals during actual operation, which is impractical with the real silicon. Engineering change orders can be proven to correct the problem before a respin of the silicon. However, the expense of logic emulation HW and SW, and the persistent difficulty of using it in the design cycle, often requiring full-time specialist staff, has confined logic emulation to those large projects that can afford it.

M. Analysis

Even the most powerful simulator or emulator can only increase confidence, not guarantee, that all the important states and sequences in a large design have been checked. Analysis resulting in proven conclusions is much to be desired. Formal verification tools can provide such analysis in many important cases. Most tools emerging from formal verification research have been either model checkers, which analyze a design to see if it satisfies properties defined by the designer, or equivalence checkers, which compare two designs to prove they are logically identical. So far, all practical formal verification has been at the functional level, dealing with time only at the cycle level, depending on use of static timing analysis alongside, as with cycle-based simulation.

N. Model Checkers

A model checker checks conditions expressed in temporal logic, called properties, against the set of reachable states in a logic design, to prove the properties to be true. Early research was based on explicit state representation of finite-state models (FSMs), which can only describe relatively small systems. But systems of any meaningful size have a staggering number of total states, requiring a better representation. The problem of state-space explosion has dominated research and limited the
scope of model checking tools [34]. Bryant's breakthrough development of ordered binary decision diagrams (OBDDs) in 1986 [20] opened the way for symbolic model checking.

Within a year, Pixley at Motorola, Kukula at IBM, McMillan at CMU, and Madre and Coudert at Bull all either proposed or developed model checking tools that used OBDDs to represent reachable state sets, and relationships between them, in a compact and canonical form. An important additional result of model checking can be specific counterexamples when a model fails to meet a condition [33], which can be used with simulation to identify and correct the error. Later university systems, notably SMV [82] and VIS [16], further developed model checking technology, and are used today [89].

Model checkers are usually limited to module-level analysis by the fact that consistently determining the reachable state set across more than several hundred registers is impractical, because the size of the OBDD representations explodes. They also require very intensive user training and support, to express temporal logic properties correctly. The accuracy of the verification depends on the correctness and completeness of the properties declared by the user. Model checkers have been used successfully to check protocol properties, such as cache coherency implementations, avoiding deadlock or live-lock, or safety-critical control systems, such as automotive airbags.
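The core computation is a reachable-state fixed point. The Python sketch below computes it explicitly for a tiny, hypothetical FSM and property; real model checkers represent the state sets and the transition relation symbolically with OBDDs rather than enumerating states, which is what makes much larger designs tractable.

def image(states, transition):
    """One-step image: all successors of the given state set."""
    return {t for s in states for t in transition.get(s, [])}

# Hypothetical FSM: states and their successors.
transition = {"IDLE": ["REQ"], "REQ": ["GRANT", "IDLE"], "GRANT": ["IDLE"]}
BAD = {"GRANT_AND_REQ"}            # states that would violate the property

reached = frontier = {"IDLE"}      # start from the initial state(s)
while frontier:                    # iterate until no new states appear
    frontier = image(frontier, transition) - reached
    reached |= frontier

print(sorted(reached))                                   # ['GRANT', 'IDLE', 'REQ']
print("property holds:", reached.isdisjoint(BAD))        # no bad state is reachable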
O. Equivalence Checkers

An equivalence checker formally compares two designs to prove whether they are logically equivalent. The process is ideal in principle, since no user-defined properties or testbenches are involved. Usually, an RTL version of the design is validated using simulation or emulation, establishing it as the golden version. This golden version serves as the reference for the functionality at the beginning of an equivalence checking methodology. Each subsequent version of the design is an implementation of the behavior specified in, and simulated from, the golden database. Once an implementation database has been proven equivalent, it can be used as a reference for any subsequent equivalence comparisons. For example, the flattened gate-level implementation version of a design can be proven equivalent to its hierarchical HDL source. Equivalence checking is also very powerful for validating very large combinational blocks that have a clear higher-level definition, such as multipliers.

Equivalence checkers work by applying formal mathematical algorithms to determine whether the logic function at each compare point in one design is equivalent to the matching compare point in the other design. To accomplish this, the tool segments the circuit into logic cones with boundaries represented by source points and end points, such as primary I/Os and registers. With the boundary points of each logic cone aligned between the two designs, solvers can use the functional descriptions of the two designs to determine their equivalence [113].
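For a single pair of aligned compare points, the check amounts to proving two cone functions identical over their shared boundary inputs. The sketch below does this by brute-force enumeration of a hypothetical three-input cone; commercial solvers use OBDDs, SAT-style reasoning, and specialized datapath algorithms instead of enumeration, but the question being answered is the same.

from itertools import product

def cone_golden(a, b, c):
    # Reference (golden) cone function over its boundary inputs.
    return (a and b) or (not a and c)

def cone_implementation(a, b, c):
    # Implementation cone: a redundant consensus term has been added.
    return (a and b) or ((not a) and c) or (b and c)

def cones_equivalent(f, g, n_inputs):
    # Exhaustive check over all boundary assignments (a brute-force "solver").
    return all(f(*bits) == g(*bits)
               for bits in product([False, True], repeat=n_inputs))

print(cones_equivalent(cone_golden, cone_implementation, 3))   # True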
The first equivalence checker was developed for the 3081 project at IBM in about 1980 [106]. Early equivalence checkers used Boolean representations, which were almost like OBDDs, but were not canonical. Bryant's development of OBDDs [20] was put to immediate use to provide a canonical form in research tools. Modern commercial equivalence checkers, Chrysalis in 1994 and Synopsys Formality in 1997, brought the technology to mainstream design projects. Formality's most important innovation was its use of a number of different solvers that are automatically chosen according to the analysis problem. It is based on OBDDs for complex control logic, while other algorithms are used for large combinational datapath elements like multipliers, to get results in reasonable time and space on designs with a wide variety of characteristics.

Both equivalence checkers and model checkers can be very demanding of compute time and memory space, and generally are not practical for more than a few hundred thousand gates at a time. Since nearly all large designs are modular to that level of granularity, analysis is usefully applied at the module level.

P. Future: SoC

Super-linear growth of verification work shows no sign of letting up, so long as Moore's Law holds up. The combination of abstraction, acceleration, and analysis has risen to meet the challenge so far. With modern simulation and formal verification tools, most chip development projects today succeed in getting correct first silicon. Larger chips and system designs succeed by adding acceleration or emulation, or by prototyping with FPGAs.

Systems on chip (SoC), which have recently become economically desirable to fabricate, pose the latest and greatest challenge, and will dominate verification issues in the near future. SoCs are doubly hard to verify. First, because, as before, their increased gate count requires more work per cycle and more cycles to get verification coverage. Second, because SoCs are systems, systems include processors, and processors run code. The problem of system-level verification is largely one of HW/SW co-verification. The embedded SW OS, drivers, and applications and the SoC HW must be verified together, by both HW and SW engineers, in the design cycle. This sharply increases the number of verification cycles needed each workday. A single thread of verification is required to run code, so no task-level parallelism is available. As before, this challenge is being met by a combination of abstraction, acceleration, and analysis.

Abstraction of system SW execution to the source code or instruction set levels was combined with HDL logic simulation, to form a HW/SW co-verification tool set, by Eagle Design Automation. The processor-memory part of the system is modeled by either an instruction set simulator, which can achieve hundreds of thousands of instruction cycles per second, or by direct execution of the embedded source code compiled to the verification workstation, which is even faster. The rest of the system HW resides in a conventional HDL logic simulator. The two are connected through the Eaglei system, which embeds the processor-memory SW execution within a Virtual Software Processor (VSP). The VSP interacts with the HW simulation through a bus-functional model of the processor bus. The VSP and logic simulator can be coupled together to follow the same simulation time steps, when SW is interacting with simulated HW, or uncoupled when SW is running on its own, which is most of the time, allowing full-speed execution of the SW. Coupling control can be static or dynamic, triggered by HW or SW activity. This way the HW-SW system can be verified through many millions of cycles in a reasonable time on a standard workstation [114].
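The coupling idea can be caricatured in a few lines of Python: a fast processor model runs decoupled from the HW simulator and synchronizes only when it touches simulated HW. The classes, the memory-mapped address, and the three-instruction "program" below are all hypothetical stand-ins and are not the Eaglei API.

class HwSim:
    def __init__(self):
        self.cycle, self.regs = 0, {0x10: 0xABCD}   # one memory-mapped register
    def bus_read(self, addr):
        self.cycle += 4                             # a bus transaction costs HW cycles
        return self.regs.get(addr, 0)

class SwModel:
    def __init__(self, hw):
        self.hw, self.executed = hw, 0
    def run(self, program):
        for op, arg in program:
            self.executed += 1                      # decoupled: full-speed SW execution
            if op == "LOAD_IO":                     # only HW accesses synchronize
                value = self.hw.bus_read(arg)
                print(f"sync at hw cycle {self.hw.cycle}: read {hex(value)}")

hw = HwSim()
SwModel(hw).run([("ADD", 1), ("LOAD_IO", 0x10), ("ADD", 2)])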
Moving HW descriptions above HDL to a system-level representation is another way to get verification performance by abstraction. Once system-level languages, such as SystemC, can be used with system-level synthesis tools on general-purpose logic designs as HDLs are today, that will enable a system-based design methodology that gets high verification performance by abstraction.

Verification power will continue to increase through incremental improvements in each of the abstraction, acceleration, and analysis technologies. More sophisticated HDL analysis and compilation techniques continue to improve HDL simulator performance. FPGA and compiler technology for acceleration continues to grow rapidly, benefiting from the widespread acceptance of FPGAs in HW products. New and better algorithms and implementations in formal verification can be expected, as this is an active research area in academia and industry.

We will see increasing synergy among abstraction, acceleration, and analysis technologies as well. Analysis techniques cover logic very thoroughly, but cannot go very deep in time, while simulation can cover long times without being complete. These complementary technologies can be combined to exploit the strengths of each, in both design verification and testbench generation. Work continues on new ways of applying FPGA acceleration technology to abstracted HDL and system-level simulation. We may even see FPGA acceleration of solvers in formal verification.

IV. SYNTHESIS

A. The Beginnings

Logic synthesis can trace its early beginnings to the work on switching theory. The early work of Quine [91] and McCluskey [79] addressed the simplification, or minimization, of Boolean functions. Since the most demanding logic design was encountered in the design of computers, it is natural that IBM has played a pivotal role in the invention of many algorithms and in the development of many in-house systems. MINI [62] was an early system that targeted two-level optimization using heuristic techniques. An early multiple-level logic synthesis system whose results found use in manufactured designs was IBM's LSS [37]. LSS was a rule-based system that first operated on a technology-independent representation of a circuit and then mapped and optimized the circuit further.

Work on LSS, as well as another IBM synthesis system, the Yorktown Silicon Compiler (YSC) [14], contributed in many ways to the start of the explosive growth in logic synthesis research that occurred during the 1980s. Not only through published papers, but also as a training ground for many researchers, IBM synthesis research gave the necessary background to a generation of investigators. Notable efforts in the universities include the University of Colorado at Boulder's BOLD system [57], Berkeley's MIS [13], and Socrates [54] from General Electric. For a comprehensive treatment, with references, see [40] and [43].

All of these systems were aimed at the optimization of multilevel logic, and the approaches they used could be roughly characterized as either algorithmic or rule-based (or both). The algorithmic approach was used in MIS, while Socrates, as well as the earlier LSS, used a rule-based approach. In either case, optimal solutions were, and still are, out of computational feasibility. The goal of the optimization systems then becomes one of matching or beating a human designer in a greatly accelerated fashion.

On the business side, we find the very first use of logic manipulation in the "second wave" of EDA companies. If the first wave of EDA companies was focused on physical design and verification, then the second wave was focused on logic design. The companies Daisy, Mentor, and Valid all offered schematic entry as a front end to logic simulation programs. While these offerings did not utilize logic optimization techniques, some of them did offer rudimentary logic manipulation such as "bubble pushing" (phase assignment) that could manipulate the logic in schematics.

B. Logic Optimization and Design Compiler

The mid to late 1980s saw the beginnings of companies that were focused on commercializing the advances of academic research in logic synthesis. The early entrants in this market were Trimeter, Silc, and Optimal Solutions Inc. (later renamed Synopsys). In what follows, we will concentrate on the developments in technology and business in the Design Compiler family of synthesis products.

When new techniques are applied to real-world design problems, opportunities for further innovation can become apparent, as necessity drives further innovation to meet the needs of the market. The early success of Design Compiler in this environment can be traced to a number of technical and business factors. One of the first is that the initial use of Design Compiler was for optimization of preexisting logic designs [98]. In this scenario, users could try out the optimization capabilities and make their designs both smaller and faster with no change to their design methodology. The impact on design methodology is a key factor in the acceptance of any new EDA tool. While this is understood much better today in the design and EDA communities, in the late 1980s it was not universally understood that this could be a make-or-break proposition for tool acceptance. A corollary to the optimization-only methodology was the use of Design Compiler as a porting and reoptimization tool that could move a design from one ASIC library or technology to another. It was probably in this fashion that Design Compiler was first most heavily used.

This initial usage scenario was enabled by a number of factors. On the business side, the availability of a large number of ASIC libraries was a key advantage. This was driven by a combination of early top-customer needs coupled with a keen internal focus on silicon vendors, with both support and tool development. Library Compiler was developed to support the encapsulation of all the pertinent data for logic synthesis. Initially, this contained information only about logic functionality and connectivity, area, and delay. Today, these libraries contain information about wire loads, both static and dynamic power consumption, and ever increasing amounts of physical information. Today, this is supported by the Liberty (.lib) and STAMP formats that are licensed through the TAP-IN program.

Another key for the early acceptance of Design Compiler was the availability of schematic generation. Without natural and
intuitive schematic generation, the initial acceptance of logic optimization by logic designers would have been significantly slowed. Logic designers at that time were most familiar with schematic entry and were accustomed to thinking in schematic terms. The output of optimization had to be presentable in that fashion so that the designers could convince themselves not only of the improved design characteristics, but more importantly, the correctness of the new design.

Perhaps the greatest reason for the initial success of Design Compiler was that it focused on performance-driven design. More precisely, the optimization goal was to meet the designer's timing constraints while minimizing area. By continuously changing the timing constraints, a whole family of solutions on the design tradeoff curve could be obtained. This design tradeoff curve plots the longest path delay on one axis versus the area required to implement the described functionality on the other. This curve is also known as the banana curve. Another term of art that has survived is "quality of results," most often abbreviated as "QoR." One tool's, or algorithm's, QoR was demonstrated to be superior if its banana curve was entirely below the banana curve results of any other approach.

There are two key technologies for achieving timing QoR for synthesis. The first is having an embedded, incremental static timing analysis capability. Without good timing, which includes a thorough understanding of clocking, delay modeling, interconnect delay estimation, and constraint management, it would be impossible to achieve high-quality designs. The static timing engine must be incremental to support the delay costing of potential optimization moves. It also must be fast and memory efficient, since it will be called many times in the middle of many synthesis procedures. But good static timing analysis also needs good timing modeling to obtain accurate results. It became clear in the late 1980s that simple linear delay models were no longer adequate to describe gate delay [78], [18]. To address this situation, a table look-up delay model was developed. This model, known as the nonlinear delay model (NLDM), modeled both the cell delay and the output slope as a function of the input slope and the capacitance on the output. These techniques, along with the development of the routing estimation technique known as wire load models (WLM), enabled the accurate estimation of arrival and required times for the synthesis procedures. The second key technology is timing-driven synthesis, which is a collection of techniques and algorithms. They include timing-driven structuring [119] in the technology-independent domain, as well as many techniques that operate on technology-mapped netlists. The general design of a commercial logic synthesis system is outlined in [96].
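A table look-up delay query of the kind the NLDM describes reduces to bilinear interpolation between characterized points. The sketch below shows the idea for one hypothetical timing arc; the slope and load breakpoints and the delay values are made-up illustrative numbers, not real library characterization data.

import bisect

slopes = [0.1, 0.5, 1.0]            # input slope breakpoints (ns)
loads  = [0.01, 0.05, 0.20]         # output load breakpoints (pF)
delay  = [[0.08, 0.15, 0.40],       # delay[i][j] at slopes[i], loads[j] (ns)
          [0.12, 0.20, 0.48],
          [0.20, 0.30, 0.60]]

def lookup(table, s, c):
    # Pick the bounding table segment (clamped at the edges), then interpolate.
    i = max(1, min(bisect.bisect_left(slopes, s), len(slopes) - 1))
    j = max(1, min(bisect.bisect_left(loads, c), len(loads) - 1))
    s0, s1, c0, c1 = slopes[i - 1], slopes[i], loads[j - 1], loads[j]
    ts, tc = (s - s0) / (s1 - s0), (c - c0) / (c1 - c0)   # interpolation weights
    top = table[i - 1][j - 1] * (1 - tc) + table[i - 1][j] * tc
    bot = table[i][j - 1] * (1 - tc) + table[i][j] * tc
    return top * (1 - ts) + bot * ts

print(round(lookup(delay, 0.3, 0.03), 4))   # delay for a mid-table query point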
The biggest competition for logic synthesis for a place in the design flow in the mid 1980s was the concept of "Silicon Compilers." By this we mean compilers that produced final layout from an initial specification. There was a lot of uncertainty in the market as to which technology would provide the most benefit. In the end, logic synthesis coupled with automatic place and route won almost all of the business for several reasons. First was that in most cases, the designs produced by silicon compilation were bigger and slower than those produced by synthesis and APR, resulting in uncompetitive designs. Second, silicon compilation did not have the universality of application that was provided for by a synthesis-based design flow. Silicon compilers were either tightly targeted on a specific design (i.e., producing an 8051) or were compiling to a fixed micro-architecture, again resulting in uncompetitive designs.

C. RTL Synthesis

The early Synopsys literature attempted to describe synthesis to a large audience of logic designers who, at that time, did not understand it, by the "equation"

Synthesis = Translation + Optimization.

What we have already described was the optimization term in this equation. We now turn our attention to the translation term.

Hardware description languages have been an active area of research for many years, as evidenced by the long history of CHDL [S10] (see also the comments on HDLs in the Verification section). However, it was the introduction of Verilog that enabled the HDL methodology to gain a foothold. Users were first attracted to Verilog, not because of the language, but because it was the fastest gate-level simulator available [98]. VHDL then was introduced in the mid 1980s from its genesis as a HDL based on the Ada syntax and designed under the VHSIC program [44]. A stumbling block to using these HDLs as synthesis input was that they were both designed as simulation languages. At that time, it was not obvious how to subset the languages and how to superimpose synthesis semantics onto them. In fact, there were many competing subsets and coding styles for both Verilog and VHDL. While most efforts relied on concurrent assignment statements in the HDLs to model the concurrency of HW, Synopsys pioneered the use of sequential statements to imply combinational and sequential logic. This had a pivotal impact, making HDLs more generally useful and, thus, accepted by the design community. This "Synopsys style" or "Synopsys subset," as it was called at the time, became the de facto standard, and its influence can be seen today in the recently approved IEEE 1076.6 standard for VHDL Register Transfer Level Synthesis.

Once designers started using these languages for gate-level simulation, they could then experiment with more SW-oriented language constructs such as "if … then … else" and "case" as well as the specific HW modeling statements that modeled synchronization by clock edges. Experimentation with RTL synthesis soon followed. But this change was effected only slowly at first, as designers gained confidence in the methodology. It was in this fashion that, together, Design Compiler and Verilog-XL were able to bootstrap the RTL methodology into the ASIC design community.

Most HDL synthesis research in the 1980s was focused on behavioral-level synthesis, not RTL synthesis. A notable exception was the BDSYN program at Berkeley [100]. In this approach, a rigid separation of data and control was relaxed to combine all the generated logic into one initial design netlist that could then be further optimized at both the RTL and logic level. Before any RTL optimizations, RTL synthesis starts with a fairly direct mapping from the HDL to the initial circuit. Flow control in the language text becomes data-steering logic (muxes) in the circuit. Iteration becomes completely unrolled and used for
building parallel HW. Clocking statements result in the inference of memory elements, and the bane of neophyte RTL designers continues to be the fact that incomplete assignments in branches of control flow will result in the inference of transparent latches. That this initial mapping is direct is evidenced by the ability of those skilled in the art to directly sketch synthesized circuits from Verilog text fragments [55]. There are a number of SW compiler-like optimizations that can be done during this initial phase as well, such as dead-code elimination and strength reduction on multiplication by powers of two. Generally, techniques that do not require timing information can also be performed at this time, such as using canonical signed digit representation to transform multiplication by a constant into a series of additions, subtractions, and bit shifts (another form of strength reduction).
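As one concrete example of such a timing-independent transformation, the sketch below recodes a constant into canonical signed-digit (CSD) form and expands the multiplication into shifts, additions, and subtractions. The constants chosen are arbitrary; a synthesis tool would of course emit hardware operators rather than Python arithmetic.

def csd(c):
    """Canonical signed-digit recoding of a nonnegative constant:
    a list of (shift, sign) terms with no two adjacent nonzero digits."""
    digits, shift = [], 0
    while c:
        if c & 1:
            d = 2 - (c & 3)      # +1 if the low bits are ...01, -1 if ...11
            c -= d
            digits.append((shift, d))
        c >>= 1
        shift += 1
    return digits

def multiply_by_constant(x, c):
    # x * c rewritten as shifts, additions, and subtractions.
    return sum(sign * (x << shift) for shift, sign in csd(c))

print(csd(231))                                   # 231 = 256 - 32 + 8 - 1
print(multiply_by_constant(13, 231), 13 * 231)    # both 3003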
After the initial circuit synthesis, there remain a number of RTL optimizations that can gain significant performance and area improvements. Since most of these involve design tradeoffs, it becomes necessary to perform them in the context of the synthesized logic. Therefore, an RTL optimization subsystem was built in the middle of the compile flow [56]. By utilizing the same static timing analysis, with bit-level information on both operator timing and area costs, effective algorithms can be constructed for timing-driven resource sharing, implementation selection (performance-area tradeoffs for RTL components), operator tree restructuring for minimum delay, and common subexpression elimination. It is crucial to have accurate bit-level information here, both for the RTL operators (such as additions and subtractions) and for the control signals. This is necessary because critical paths can be interleaved between control and data, and many RTL optimizations tend to move the control signals in the design. To support the accurate delay and area costing of RTL components, the initial DesignWare subsystem was constructed. This subsystem manages the building, costing, and modeling of all RTL operators and yields great flexibility and richness in developing libraries. Allowing the description of the RTL operators themselves to be expressed in an HDL, and then having the synthesis procedures called in a recursive fashion so that all timing information is calculated "in situ," gives this subsystem all the functionality and flexibility of the logic synthesis system. By adding the capability to specify any arbitrary function as an operator in the HDL source, these RTL optimizations can be performed even on user-defined and user-designed components.

Since that time, there has been a continuous stream of innovations and improvements in both logic and RTL synthesis. While less than a decade ago, modules in the range of 5000–10 000 gates used to take upwards of ten CPU hours, today, a top-down automated chip synthesis of one million gates is possible in under four hours running on four parallel CPUs. Also, it is now possible to simultaneously perform min/max delay optimization. That is, separate runs for maximum delay optimization (setup constraints), followed by minimum path (hold constraints) fixing, are no longer necessary. Timing QoR has improved dramatically as well. Techniques such as critical path resynthesis and a variety of transformations on technology-mapped circuits have yielded improvements of over 25% in timing QoR. Advances in RTL arithmetic optimization have used carry-save arithmetic to restructure operation trees for minimum delay [66]. This continual stream of innovation has enabled designs produced by modern synthesis tools to operate with clock frequencies in excess of 800 MHz using modern technologies.

The invention of OBDDs has benefited synthesis almost as much as verification. One early use at Synopsys was the very simple equivalence checking offered by the check_design command of Design Compiler. This functionality was based on an efficient implementation of an OBDD package [15]. While this simple command did not have the capacity, robustness, and debugging capability of the equivalence checking product Formality, it proved invaluable to the developers of logic and RTL optimization algorithms. Transforms and optimizations could be checked for correctness during development with the check_design command. Afterwards, these checks could be added to the regression test suite to ensure that bad logic bugs would not creep into the code base. Another use of OBDDs is in the representation and manipulation of control logic during RTL synthesis. In all of these applications, the size of the OBDDs is a critical issue. The invention of dynamic variable ordering and the sifting algorithm [95] has proven to be a key component in making OBDDs more generally useful.

D. Sequential Mapping and Optimization

Supporting the combinational logic synthesis is the synthesis and mapping of sequential elements. The initial HDL design description determines the type of memory element that will be inferred (i.e., a flip–flop or a transparent latch, active high or low). But modern ASIC libraries have a wide variety of library elements that can implement the necessary functionality. The choice of these elements can have an impact on both the area and the timing of the final design [72]. It is at this stage that design for testability (DFT) for sequential elements can be accomplished, by using scan-enabled flip–flops and automatically updating the design's timing constraints. Also, mapping to multibit registers can have positive effects during the place and route phase of design realization.

One sequential optimization technique that has held the promise of increased design performance has been retiming [74], [101]. But the road from theory to practical industrial use has taken longer than initially thought [102]. The stumbling block has been that most designers use a variety of complex sequential elements that were poorly handled by the classical retiming algorithm. These elements might offer asynchronous or synchronous reset inputs as well as a synchronous load enable input (also called a clock enable). These types can now be optimized using multiple-class retiming techniques [47].

E. Extending the Reach of Synthesis

We have already mentioned how scan flip–flops can be inserted during the sequential mapping phase to achieve a one-pass DFT design flow. There are other objectives that can be targeted as well as testability. Power-optimizing synthesis is one of these techniques. Power optimization can be performed at the gate level [S18], RT level, or the behavioral level. Clock gating is a well-known technique used to save on power consumption, but support is needed throughout the EDA tool chain
to achieve high productivity. Other techniques at the RT and behavioral level include operand isolation, which isolates the inputs of an operator by using control signals that indicate whether the operator's outputs will be needed. In this way, the logic that implements the operator is not subjected to changing inputs when the results are not needed. These techniques can net over 40% savings in power with a minimal impact on the design cycle and the design quality.

While the initial purpose of the DesignWare subsystem was to support RTL optimization techniques, it became quickly evident that the feature set enabled "soft-IP" design reuse. Today, the largest available library and the most widely reused components are in the DesignWare Foundation Library. This library is an extensive collection of technology-independent, preverified, virtual micro-architectural components. It contains elementary RTL components such as adders and multipliers with multiple implementations to be leveraged by the RTL optimization algorithms. Also included are more complex components such as the DW8051 microcontroller and the DWPCI bus interface. The benefits of design reuse are further enhanced by the availability of DesignWare Developer and other tools and methodologies [63] that support design reuse.

F. Behavioral Synthesis

Behavioral synthesis has long been a subject of academic research [80], [25], [40], [76]. The object of behavioral synthesis is twofold. First, by describing a system at a higher level of abstraction, there is a boost in design productivity. Second, because the design transformations are also at a higher level, a greater volume of the feasible design space can be explored, which should result in better designs. Synopsys first shipped Behavioral Compiler in November of 1994. There were a number of innovations embodied in Behavioral Compiler. First, a new technique for handling timing constraints, sequential operator modeling, prechaining operations, and hierarchical scheduling was developed with a method called "behavioral templates" [77]. Also, a consistent input/output (I/O) methodology was developed that introduced [69] the "cycle fixed," "super-state fixed," and "free float" I/O scheduling modes. These are necessary for understanding how to interface and verify a design generated by behavioral synthesis. Finally, the key to high-quality designs remains in accurate area and timing estimates. This is true both for the operators that will be implemented on scheduled, allocated, and bound HW processors, and also for early estimates of delays on control signals [83].

While behavioral synthesis has not achieved the ubiquity of logic synthesis, it has found leveraged use in the design of set-top boxes, ATM switches, mobile phones, wireless LANs, digital cameras, and a wide range of image processing HW. A recent white paper [BLA99] details one design team's experience in using Behavioral Compiler on a design of a packet engine with 20 000 (40 000 after M4 macro expansion) lines of behavioral Verilog. One constant theme is that it is a steep learning curve for an experienced RTL designer to use Behavioral Compiler, for thinking in terms of FSMs is ingrained and it takes a conscious effort to change those thought processes. Even with these challenges, hundreds of chips have been manufactured that have Behavioral Compiler synthesized modules.

G. Physical Synthesis

During the last few years of the millennium, one of the main challenges in ASIC implementation has been in the area of timing closure. Timing closure refers to the desirable property that the timing predicted by RTL and logic synthesis is actually achieved after the physical design phase. This is often not the case. The root of this divergence lies in the modeling of interconnect delay during synthesis. Accurate estimates of the interconnect delay are not available at this stage, so a statistical technique is used that indexes into a look-up table of capacitance estimates using both the net's pin count and the size of the block that contains the net. There is a fundamental mismatch between this estimation technique, which in effect relies on a biased average capacitance, and the fact that performance is dictated by the worst case delay.

There have been many proposals on how to address the timing closure problem. In [65], the observation is made that the traditional synthesis methodology can be maintained if a hierarchical design implementation is employed. In this approach, the chip is synthesized as a number of blocks below some threshold, usually quoted in the 50K to 100K gate range. Budgeting, planning, and implementation of the global interconnect for chips with 2000 to 3000 such blocks could prove to be challenging. Another approach is to use a constant-delay, or gain-based, approach for gate sizing after placement and routing [110]. This technique depends on the ability to continuously size gates and also on delay being a linear function of capacitive loading. Working against this are the modest number of discrete sizes in many ASIC libraries, the fact that wire resistance can be a significant contributor to the (nonlinear) wire delay, and the fact that the gate delays themselves can be nonlinear due to input slope dependence. Having said that, this approach can yield improvements in timing QoR when used with libraries that have a rich set of drive strengths.

The approach to timing closure that seems to hold the most promise is one where synthesis and placement are combined into a single implementation pass. This pass takes RTL design descriptions as input and produces an optimized and (legally) placed netlist as an output. In this fashion, the optimization routines can have access to the placement of the design, and this allows for accurate estimation of the final wire lengths and, hence, the wire delays.
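The estimation gap can be seen in a few lines of Python: a fanout-indexed wire-load table returns the same capacitance for every net of a given fanout, while a placement allows a per-net bounding-box (half-perimeter) estimate. The table entries, pin coordinates, and capacitance-per-millimeter figure below are illustrative assumptions only, not data from any real library or design.

WLM_CAP_BY_FANOUT = {1: 0.010, 2: 0.018, 3: 0.030, 4: 0.045}   # pF, statistical table

def wlm_cap(num_sinks):
    # Wire-load-model style estimate: indexed only by fanout.
    return WLM_CAP_BY_FANOUT.get(num_sinks, 0.045 + 0.01 * (num_sinks - 4))

def hpwl_cap(pins_mm, cap_per_mm=0.2):
    # Placement-aware estimate: half-perimeter of the pin bounding box.
    xs, ys = zip(*pins_mm)
    hpwl = (max(xs) - min(xs)) + (max(ys) - min(ys))
    return hpwl * cap_per_mm

# Same fanout, very different geometry: a short local net vs. a cross-block net.
local_net = [(0.10, 0.10), (0.15, 0.12), (0.12, 0.18)]
global_net = [(0.10, 0.10), (2.90, 0.12), (0.12, 2.80)]

for name, pins in [("local", local_net), ("global", global_net)]:
    print(name, "WLM:", round(wlm_cap(len(pins) - 1), 3),
          "placed HPWL:", round(hpwl_cap(pins), 3), "pF")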
H. Datapath Synthesis

Specialized techniques can often gain significant efficiencies in focused problem domains. Datapath design has traditionally been done manually by an expert designer. The techniques for building high-quality datapath designs are well understood and can be expressed in the form of rules that are used while constructing the design. Using these rules leads to a circuit that has good performance and area. This circuit can be post processed by a general-purpose logic optimization tool for incremental improvements. Providing a well-structured, high-quality input into
logic optimization will usually create a better circuit than providing functionally correct input that is not well structured.

A datapath synthesis tool like Synopsys's Module Compiler implements the datapath design rules as part of its synthesis engine. These rules dynamically configure the datapath design being synthesized based on the timing context of the surrounding logic. Hence, the initial circuit structure is not statically determined only by the functionality of the circuit. Initial circuit construction is done in a timing-driven fashion. Instead of depending upon a downstream logic synthesis tool to restructure the circuit based on timing constraints, this can allow the creation of higher quality designs with faster compile times, particularly when the designer knows what kind of structure is wanted. A key element of the constructive datapath synthesis technique is that the placement can also be constructively determined. Datapaths are often placed in a bit-sliced fashion that is also called tiling. In tiling, the cells are each assigned a row and column position, called their relative placement. The rows represent bit slices and the columns represent separate datapath functions. Specialized datapath placers read this tiling information and convert it into real, legal placement. Integrating datapath synthesis and physical synthesis leads to placed gates, simultaneously enabling structured placement and timing convergence.
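The relative-placement idea is simple to sketch: each cell carries a (row, column) position, rows are bit slices at a fixed pitch, and column origins are derived from the widest cell in each column. The cell names, widths, and row height below are hypothetical, and a real datapath placer must additionally legalize against the site grid and surrounding cells.

cells = {  # name: (row i.e. bit slice, column i.e. datapath function, width in um)
    "mux_b0": (0, 0, 2.0), "add_b0": (0, 1, 4.5), "reg_b0": (0, 2, 3.0),
    "mux_b1": (1, 0, 2.0), "add_b1": (1, 1, 4.5), "reg_b1": (1, 2, 3.0),
}
ROW_HEIGHT = 5.0  # um, assumed row pitch

ncols = 1 + max(col for _, col, _ in cells.values())
col_width = [max(w for _, c, w in cells.values() if c == col) for col in range(ncols)]
col_x = [sum(col_width[:c]) for c in range(ncols)]     # x origin of each column

placement = {name: (col_x[col], row * ROW_HEIGHT)
             for name, (row, col, _) in cells.items()}
for name in sorted(placement):
    print(name, placement[name])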
I. FPGA Synthesis

FPGA design presents a different paradigm for logic optimization techniques. Because the chip has already been designed and fabricated with dedicated resources, excess demand on any one of them (logic, routing, or switching) causes the design to become infeasible. Once a design fits onto a specific FPGA part, the challenge is to reach the performance goals. The FPGA technology-mapping problem and the ASIC technology-mapping problem are, at a basic level, very similar. However, the algorithms used to solve these two problems differ drastically in today's synthesis tools. The ASIC approach of using a library of logic gates is not practical for FPGA technologies, especially in LUT-based FPGAs (a LUT can be programmed to compute any function of its k inputs). FPGA-specific mapping algorithms have been developed to take advantage of the special characteristics of LUT FPGA and anti-fuse FPGA technologies to make them more efficient in run time and yield better quality of results as compared to the typical ASIC technology mapping algorithms.

There are also large challenges for performance-driven design for FPGAs. This is due to the limited selection of drive capabilities available to synthesis tools, inasmuch as much of the buffering is hidden in the routing architecture. Additionally, FPGAs have always experienced significant delay in the interconnect because of the interconnect's programmable nature. In a sense, challenges in ASIC synthesis have caught up to challenges in FPGA synthesis from a timing convergence standpoint. An interesting approach to FPGA implementation would be to leverage the new unified synthesis and placement solution developed for ASICs in the FPGA environment.

J. The Future

The future of synthesis is one in which it has to grow and stretch to meet the demands of the new millennium. This means coming to terms with the heterogeneous environment of system-on-a-chip designs. In this area, C/C++ with class libraries is emerging as the dominant language in which system descriptions are specified (such as SystemC [75]). A C/C++ synthesis tool will leverage state-of-the-art and mature behavioral and RTL synthesis and logic optimization capability. However, C/C++ presents some unique challenges for synthesis [53]. For example, synthesis of descriptions using pointers is difficult. Due to effects such as aliasing, analysis of pointers and where they point to is nontrivial. Another challenge is in extending the synthesizable subset to include object-oriented features like virtual functions, multiple inheritance, etc.

V. TESTING

The last 35 years have been a very exciting time for the people involved in testing. The objective of testing is, very simply, to efficiently identify networks with one or more defects, which make the network function incorrectly. Defects are introduced during the manufacturing process, creating connections where none were intended, breaking intended connections in one or more of the layers in the fabrication process, or resulting from changes in the process variables. These shorts, opens, and process variations can manifest themselves in so many different ways in the IC that it is impractical to create an exhaustive list of them, let alone generate tests for each of them. Thus, in the typical test methodology, an abstraction of the defects is used to generate and evaluate tests. The abstract forms of the defects are known as faults, and there are many different types of them. The most popular fault model used in the industry is the single stuck-at fault model, which assumes that a single net in the design is stuck at a logic value (zero or one). There should be no surprise that this fault model has been used over the years, as it epitomizes the reason why fault models exist, namely, efficiency. Stuck-at faults are easy to model and grow linearly with design size. Most of the years of testing of logic networks have been centered on testing for stuck-at faults [49] in sequential networks. This fault model is still widely used today even though we have made many technology changes. Initially, test patterns to detect these faults were generated by hand. In 1966, the first complete test generation algorithm was put forth by J. Paul Roth [94]. This algorithm was for combinational networks. The D-Algorithm was used for combinational logic when it was applicable, and modified in a way that it could be used in a limited scope for sequential logic. Test generation algorithms such as the D-Algorithm target faults one at a time and create tests that stimulate the faulty behavior in the circuit and observe it at a measurable point in the design. The D-Algorithm performs a methodical search over the input space for a solution to the problem. Other algorithms have been developed over the years that essentially do the same thing as the D-Algorithm with different ways of exhausting the search space. These algorithms are aided by numerous heuristics that guide the decisions of the algorithm to converge on a solution faster for the nominal fault being targeted. However, as an industry we started to have difficulty in generating tests for boards which had only 500 logic gates on them, with packages which had only one or two logic gates per module. Because of these difficulties,
a number of people started to look for different approaches to testing. It was clear to many that automatic test generation for sequential networks could not keep up with the rate of increasing network size. This rapid growth was tied to the very first days of the Moore's Law growth rate. However, the design community was hoping that some new test generation algorithm would be developed that would solve the sequential test generation problem and have the ability to grow with Moore's Law. To date, only very limited success has been obtained with sequential test generation tools. In addition to the need for a more powerful automatic test generation tool, there was also a need to have designs which did not result in race conditions at the tester. These race conditions, more often than not, resulted in a zero-yield situation (zero yield is the result of a difference between the expected test data response and the chip's actual response). The rapid growth in gate density and problems in production caused by race conditions resulted in many changes in the way designs were done. The first reference to the new approach to design was given by Kobayashi [70] in a one-page paper in Japanese. This work was not discovered in the rest of the world till much later. The first industrial users to publish an indication of a dramatic change in the way designs are done because of automatic test generation complexities were NEC in 1975 [122] and IBM in 1977 [48]. The core idea of this new approach was to make all memory elements, except RAMs or ROMs, part of a shift register. This would allow every memory element to be controllable and observable from the primary inputs and outputs of the package. Thus, the memory elements of the sequential network can be treated as if they were, in fact, real primary inputs and outputs. As a result, one could now take advantage of the D-Algorithm's unique ability to generate tests for combinational logic and be able to keep up with ever increasing gate counts. The NEC approach was called the SCAN approach, and the IBM approach, which was called level-sensitive scan design, got the shortened name of LSSD. This new approach of designing to accommodate real difficulties in test became known in general as the area of DFT. While these design techniques have been around for a number of years, it has not been until the last 5–7 years that the industry has started to do SCAN designs in earnest. Today we have a large portion of designs which employ SCAN. As a result, a number of the CAD vendors are offering SCAN insertion and automatic test pattern generation tools, which require SCAN structures in order to generate tests for very complex sequential designs and also to make the test patterns race free as well.
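The payoff of scan is that the logic between registers can be treated as a combinational block whose inputs and outputs are directly controllable and observable. The sketch below injects single stuck-at faults into a hypothetical three-gate cone and reports whether a given test vector detects them; the netlist, the fault sites, and the treatment of the scan flip–flop as a pseudo-primary input are invented for illustration only.

def cone(a, b, scan_ff, fault=None):
    """y = (a AND b) OR (NOT scan_ff); `fault` forces one internal net to 0 or 1."""
    nets = {}
    nets["n1"] = a & b
    nets["n2"] = 1 - scan_ff
    if fault:                              # inject the single stuck-at fault
        nets[fault[0]] = fault[1]
    nets["y"] = nets["n1"] | nets["n2"]
    return nets["y"]

# Scan lets us set the flip-flop value directly (shift it in) and observe y
# by capturing it back into the chain, so only the combinational cone is tested.
test_vector = {"a": 1, "b": 1, "scan_ff": 1}
good = cone(**test_vector)
for fault in [("n1", 0), ("n1", 1), ("n2", 1)]:
    faulty = cone(**test_vector, fault=fault)
    print(f"stuck-at {fault}: {'detected' if faulty != good else 'not detected'}")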
Today we are looking at an era before us which also has the gate count increasing at exactly the same rate, Moore's Law. The DFT technique of full scan, or a high percentage of scannable memory elements, seems to be commonplace in the industry. Furthermore, it is possible to have tests generated automatically for networks on the order of 10 000 000 gates!

However, there are significant changes brought on by the technology developments facing us today which have a direct influence on testing. As we continue scaling down into deep-submicrometer or nanometer technologies, some testing techniques, such as IDDQ testing [121], are losing their effectiveness, and other techniques, such as delay testing [85], [86], [71], are being called into use. In addition to the above-mentioned impacts

VI. SUMMARY

We have explored commercial EDA in the past millennium, particularly in the areas of physical design, synthesis, simulation and verification, and test. In the coming decade, the demand for ever-more complex electronic systems will continue to grow. As stated in Section I, flexibility and quick time to market in the form of reprogrammable/reconfigurable chips and systems will increase in importance. Already, reconfigurable logic cores for ASIC and SoC use have been announced, as well as a crop of reconfigurable processors. Even though programmable devices carry a higher cost in area, performance, and power than hard-wired ones, the overall technology growth will enable more programmability. For market segments where the demand for complexity and performance grows at a slower rate than what can be delivered by technology (as described in the scaling roadmap of the ITRS [105]), the emergence of platform-based reconfigurable design seems likely.

In this scenario, what is an appropriate EDA tool suite for implementing designs on existing platforms? Little architectural design is required, since the platform defines the basic architecture. At the highest level, system design tools can direct the partitioning of system functionality between SW and HW so that performance goals are met without overflowing any of the fixed resources. Hardware/SW co-verification environments will increase in importance. However, if sufficient facilities for internal debugging visibility are provided on the platform, then first silicon or even standard product chips can always be shipped. We will have come full circle back to pre-VLSI days, when verification was done in the end product form itself, in real-time HW.
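A rough, purely illustrative picture of the partitioning step mentioned above (with invented task names, profile numbers, and a single gate-count budget; real system-level tools use far richer cost models) is to rank candidate functions by the speedup they gain per unit of fixed HW resource and to move them into HW until the budget is exhausted:

    # Toy hardware/software partitioner (illustrative only): greedily move the
    # tasks with the best speedup-per-gate onto the platform's fixed HW
    # resources until the gate budget is exhausted; the rest stays in software.

    def partition(tasks, hw_gate_budget):
        """tasks: list of dicts with 'name', 'sw_time', 'hw_time', 'gates'."""
        hw, sw, gates_used = [], [], 0
        # Rank candidates by time saved per gate of reconfigurable fabric.
        ranked = sorted(tasks,
                        key=lambda t: (t["sw_time"] - t["hw_time"]) / t["gates"],
                        reverse=True)
        for t in ranked:
            if (t["hw_time"] < t["sw_time"]
                    and gates_used + t["gates"] <= hw_gate_budget):
                hw.append(t["name"])
                gates_used += t["gates"]
            else:
                sw.append(t["name"])
        return hw, sw, gates_used

    # Hypothetical profile data (times in ms, sizes in gates).
    tasks = [
        {"name": "fir_filter", "sw_time": 12.0, "hw_time": 0.8, "gates": 20_000},
        {"name": "viterbi",    "sw_time": 30.0, "hw_time": 2.5, "gates": 45_000},
        {"name": "parser",     "sw_time": 3.0,  "hw_time": 2.8, "gates": 15_000},
    ]
    print(partition(tasks, hw_gate_budget=50_000))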
The synthesis and physical design tools would not be fundamentally different from the FPGA implementation tools used today, albeit perhaps better at performance-driven design. The problems of mapping a large netlist onto an FPGA are similar to those of placing a gate array: more complex in that the routing is less easily abstracted, but simpler in that the underlying technology is hidden (i.e., it is always DRC correct, and there is no need to model via overlaps, etc.). An added challenge in platform design consists in matching the performance of the processor and ASIC portions with the performance of the reconfigurable portions. This performance matching will ultimately be reflected back into the implementation tools. In any case, the emergence of higher level tools to address complexity and time to market is likely to increase the available market for EDA greatly.
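The analogy with gate-array placement can be made concrete with a toy placer (invented netlist and grid, for illustration only): cells are assigned to a fixed array of identical, pre-verified sites, and pairwise swaps are accepted whenever they reduce an estimated wirelength, a greatly simplified relative of the iterative-improvement placers discussed earlier in this paper.

    # Toy placement sketch (illustrative only): assign cells of a small netlist
    # to a fixed grid of identical sites, then accept pairwise swaps that
    # reduce the total half-perimeter wirelength.

    import itertools, random

    def hpwl(nets, placement):
        """Sum of half-perimeter bounding boxes over all nets."""
        total = 0
        for net in nets:
            xs = [placement[c][0] for c in net]
            ys = [placement[c][1] for c in net]
            total += (max(xs) - min(xs)) + (max(ys) - min(ys))
        return total

    def place(cells, nets, rows, cols, passes=10, seed=0):
        random.seed(seed)
        sites = list(itertools.product(range(rows), range(cols)))
        placement = dict(zip(cells, random.sample(sites, len(cells))))
        for _ in range(passes):
            improved = False
            for a, b in itertools.combinations(cells, 2):
                before = hpwl(nets, placement)
                placement[a], placement[b] = placement[b], placement[a]
                if hpwl(nets, placement) < before:
                    improved = True        # keep the swap
                else:
                    placement[a], placement[b] = placement[b], placement[a]
            if not improved:
                break
        return placement

    # Hypothetical 4-cell netlist on a 2x2 array of logic sites.
    cells = ["u1", "u2", "u3", "u4"]
    nets = [("u1", "u2"), ("u2", "u3"), ("u3", "u4"), ("u4", "u1")]
    print(place(cells, nets, rows=2, cols=2))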
The EDA tool suite for the design of these platforms looks much like the tool suite of today, but with added functionality for dealing with scaling effects and ever-increasing complexity. The intellectual challenge of solving these problems remains the same. The nature of the market, however, could change significantly. For example, as mentioned above, mask verification SW today accounts for roughly half of the physical EDA revenue, but extraction and design rule checking are only necessary for the design of the platform, not for the programming of that platform. At the same time, mask verification will become significantly more complex and, hence, more time consuming. With a narrowed user base but more intensive tool use, the economics of commercial EDA for platform design are likely to change. In-house EDA development at the semiconductor vendors who become the winners in the platform sweepstakes is likely to continue in niche applications. We also expect that the semiconductor vendors and the EDA vendors will continue to form partnerships at an increased pace to tackle the complexities of fielding platform-based solutions. The electronics industry depends on it.

ACKNOWLEDGMENT

The authors would like to thank M. Burstein, D. Chapiro, T. Chatterjee, R. Damiano, B. Gregory, M. Laudes, T. Ma, J. C. Madre, G. Maturana, S. Meier, P. Moceyunas, N. Shenoy, and W. vanCleemput for help in the preparation of this paper.

REFERENCES

[1] D. F. Ackerman, M. H. Decker, J. J. Gosselin, K. M. Lasko, M. P. Mullen, R. E. Rosa, E. V. Valera, and B. Wile, "Simulation of IBM enterprise system/9000 models 820 and 900," IBM J. Res. Develop., vol. 36, no. 4, pp. 751–764, July 1992.
[2] P. N. Agnew and M. Kelly, "The VMS algorithm," IBM System Products Div. Lab., Endicott, NY, Tech. Rep. TR01.1338, 1970.
[3] S. B. Akers, "On the use of the linear assignment algorithm in module placement," in Proc. ACM/IEEE Design Automation Conf., 1981.
[4] S. Al-Ashari. (1999, Jan.) System verification from the ground up. Integrated Syst. Design [Online]. Available: http://www.isdmag.com/Editorial/1999/coverstory9901.html
[5] C. J. Alpert and A. B. Kahng. (1995) Recent developments in netlist partitioning: A survey. Integration: VLSI J. [Online], no. 1–2, pp. 1–81. Available: http://vlsicad.cs.ucla.edu/~cheese/survey.html
[6] C. Azar and M. Butts, "Hardware accelerator picks up the pace across full design cycle," Electron. Design, Oct. 17, 1985.
[7] J. Babb, R. Tessier, and A. Agarwal, "Virtual wires: Overcoming pin limitations in FPGA-based logic emulators," presented at the IEEE Workshop on FPGAs for Custom Computing Machines, Apr. 1993.
[8] M. L. Bailey and L. Snyder, "An empirical study of on-chip parallelism," in Proc. 25th Design Automation Conf.: IEEE, June 1988, pp. 160–165.
[9] D. K. Beece, G. Deibert, G. Papp, and F. Villante, "The IBM engineering verification engine," in Proc. 25th Design Automation Conf.: IEEE, June 1988, pp. 218–224.
[10] C. G. Bell and A. Newell, Computer Structures: Readings and Examples. New York: McGraw-Hill, 1971.
[11] T. Blank, "A survey of hardware accelerators used in computer-aided design," IEEE Design Test, pp. 21–39, Aug. 1984.
[12] J. P. Blanks, "Near-optimal placement using a quadratic objective function," in Proc. IEEE Design Automation Conf., 1985, pp. 609–615.
[13] R. Brayton, R. Rudell, A. Sangiovanni-Vincentelli, and A. Wang, "MIS: A multiple-level logic optimization system," IEEE Trans. Computer-Aided Design, vol. CAD-6, pp. 1062–1081, Nov. 1987.
[14] R. K. Brayton, R. Camposano, G. De Micheli, R. H. J. M. Otten, and J. T. J. van Eijndhoven, "The Yorktown silicon compiler system," in Silicon Compilation, D. Gajski, Ed. Reading, MA: Addison-Wesley, 1988.
[15] K. S. Brace, R. L. Rudell, and R. E. Bryant, "Efficient implementation of a BDD package," in Proc. ACM/IEEE Design Automation Conf., June 1990, pp. 40–45.
[16] R. K. Brayton, G. D. Hachtel, et al., "VIS: A system for verification and synthesis," UC Berkeley Electronics Research Laboratory, Tech. Rep. UCB/ERL M95/104, Dec. 1995.
[17] M. A. Breuer, "A class of min-cut placement algorithms," in Proc. ACM/IEEE Design Automation Conf., 1977.
[18] L. M. Brocco, S. P. McCormick, and J. Allen, "Macromodeling CMOS circuits for timing simulation," IEEE Trans. Computer-Aided Design, vol. 7, pp. 1237–1249, Dec. 1988.
[19] S. Brown, R. Francis, J. Rose, and Z. Vranesic, Field-Programmable Gate Arrays. Boston, MA: Kluwer, 1992.
[20] R. E. Bryant, "Graph-based algorithms for Boolean function manipulation," IEEE Trans. Comput., vol. C-35, pp. 677–691, Aug. 1986.
[21] M. Burstein and R. Pelavin, "Hierarchical channel router," in Proc. ACM/IEEE Design Automation Conf., 1977.
[22] M. Butts, "10×, 100×, and 1000× speed-ups in computers: What constraints change in CAD?," presented at Panel Session 2, IEEE/ACM Int. Conf. Computer-Aided Design, 1989.
[23] M. Butts, J. Batcheller, and J. Varghese, "An efficient logic emulation system," in Proc. IEEE Conf. Computer Design, Oct. 1992, pp. 138–141.
[24] M. Butts, "Emulators," in Encyclopedia of Electrical and Electronics Engineering. New York: Wiley, 1999.
[25] R. Camposano and W. Wolf, Eds., High-Level VLSI Synthesis. Boston, MA: Kluwer Academic, 1991.
[26] P. W. Case, M. Correia, W. Gianopulos, W. R. Heller, H. Ofek, T. C. Raymond, R. L. Simek, and C. B. Stieglitz, "Design automation in IBM," IBM J. Res. Develop., vol. 25, no. 5, pp. 631–645, Sept. 1981.
[27] CHDL, held every other year from 1973 to 1997.
[28] K. M. Chandy and J. Misra, "Asynchronous distributed simulation via a sequence of parallel computations," Commun. ACM, vol. 24, no. 11, pp. 198–206, Apr. 1981.
[29] D. Chapiro. (1999, May) Testbench vs. high-level languages for verification. Electron. Syst. Technol. Design [Online]. Available: http://www.computer-design.com/Editorial/1999/05/0599testbench_vs_high.html
[30] D. Chapiro, "Verification myths—Automating verification using testbench languages," Electron. Eng., Sept. 1999.
[31] K. A. Chen, M. Feuer, H. H. Khokhani, N. Nan, and S. Schmidt, "The chip layout problem: An automatic wiring procedure," in Proc. ACM/IEEE Design Automation Conf., 1977.
[32] D. R. Chowdhury, H. Chaudhry, and P. Eichenberger, "Mixed 2–4 state simulation in VCS," presented at the IEEE Int. Verilog Conf., Santa Clara, CA, 1997.
[33] E. M. Clarke, O. Grumberg, K. L. McMillan, and X. Zhao, "Efficient generation of counterexamples and witnesses in symbolic model checking," in Proc. 32nd Design Automation Conf.: IEEE, June 1995, pp. 427–432.
[34] E. M. Clarke and R. P. Kurshan, "Computer-aided verification," IEEE Spectrum, pp. 61–67, June 1996.
[35] G. W. Clow, "A global routing algorithm for general cells," in Proc. ACM/IEEE 21st Design Automation Conf., 1984, pp. 45–51.
[36] O. Coudert, R. Haddad, and S. Manne, "New algorithms for gate sizing: A comparative study," in Proc. 33rd ACM/IEEE Design Automation Conf., 1996, pp. 734–739.
[37] J. Darringer, W. Joyner, L. Berman, and L. Trevillyan, "LSS: Logic synthesis through local transformations," IBM J. Res. Develop., vol. 25, no. 4, pp. 272–280, July 1981.
[38] Dataquest, San Jose, CA, Application Specific Programmable Products: The Coming Revolution in System Integration, ASIC-WW-PP-9811.
[39] G. De Micheli and A. Sangiovanni-Vincentelli, "PLEASURE: A computer program for simple/multiple constrained/unconstrained folding of programmable logic arrays," in Proc. ACM/IEEE Design Automation Conf., 1983, pp. 530–537.
[40] G. De Micheli, Synthesis and Optimization of Digital Circuits. San Francisco, CA: McGraw-Hill, 1994.
[41] M. M. Denneau, "The Yorktown simulation engine," in Proc. 19th Design Automation Conf.: IEEE, June 1982.
[42] D. N. Deutsch, "A 'dogleg' channel router," in Proc. ACM/IEEE Design Automation Conf., 1976.
[43] S. Devadas, A. Ghosh, and K. Keutzer, Logic Synthesis. San Francisco, CA: McGraw-Hill, 1994.
[44] A. Dewey, "The VHSIC hardware description language (VHDL) program," in Proc. 21st Design Automation Conf.: IEEE, June 1984, pp. 556–557.
[45] A. E. Dunlop et al., "Chip layout optimization using critical path weighting," in Proc. ACM/IEEE Design Automation Conf., 1984, pp. 133–136.
[46] L. N. Dunn, "IBM's engineering design system support for VLSI design and verification," IEEE Design Test, Feb. 1984.
[47] K. Eckl, J. C. Madre, P. Zepter, and C. Legl, "A practical approach to multiple-class retiming," in Proc. 36th ACM/IEEE Design Automation Conf., June 1999, pp. 237–242.
[48] E. B. Eichelberger and T. W. Williams, "A logic design structure for LSI testability," in Proc. 14th Design Automation Conf., New Orleans, LA, June 1977, pp. 462–468.
[49] R. D. Eldred, "Test routines based on symbolic logical statements," presented at the Association, Aug. 11–13, 1958, pp. 33–36.
[50] C. M. Fiduccia and R. M. Mattheyses, "A linear-time heuristic for improving network partitions," in Proc. IEEE Design Automation Conf., 1982, pp. 175–181.
[51] R. Fisher, "A multipass, multialgorithm approach to PCB routing," in Proc. ACM/IEEE Design Automation Conf., 1978.
[52] J. Gateley et al., "UltraSPARC-I emulation," in Proc. 32nd Design Automation Conf.: IEEE, June 1995, pp. 13–18.
[53] A. Ghosh, S. Liao, and J. Kunkel, "Hardware synthesis from C/C++," presented at Design Automation and Test in Europe, Munich, Germany, 1999.
[54] D. Gregory, K. Bartlett, A. De Geus, and G. Hachtel, "Socrates: A system for automatically synthesizing and optimizing combinational logic," in Proc. 23rd Design Automation Conf., 1986, pp. 79–85.
[55] B. Gregory, "HDL synthesis pitfalls and how to avoid them," in Synopsys Methodology Notes, vol. 1. Mountain View, CA: Synopsys, Sept. 1991.
[56] B. Gregory, D. MacMillen, and D. Fogg, "ISIS: A system for performance-driven resource sharing," in Proc. 29th ACM/IEEE Design Automation Conf., 1992, pp. 285–290.
[57] G. Hachtel, M. Lightner, R. Jacoby, C. Morrison, and P. Moceyunas, "BOLD: The Boulder optimal logic design system," in Proc. Hawaii Int. Symp. Systems Sciences, 1988.
[58] F. O. Hadlock, "A shortest path algorithm for grid graphs," Networks, vol. 7, pp. 323–334, 1977.
[59] D. W. Hightower, "A solution to line-routing problems on the continuous plane," in Proc. IEEE Design Automation Conf., 1969.
[60] D. Hill and W. van Cleemput, "SABLE: A tool for generating structured, multilevel simulations," in Proc. 16th Design Automation Conf.: IEEE, June 1979.
[61] D. Hill and D. Coelho, Multi-Level Simulation for VLSI Design. Boston, MA: Kluwer Academic, 1986.
[62] S. Hong, R. Cain, and D. Ostapko, "MINI: A heuristic approach for logic minimization," IBM J. Res. Develop., vol. 18, pp. 443–458, Sept. 1974.
[63] M. Keating and P. Bricaud, Reuse Methodology Manual for System on a Chip, 2nd ed. Kluwer Academic, 1999.
[64] B. W. Kernighan and S. Lin, "An efficient heuristic procedure for partitioning graphs," Bell Syst. Tech. J., pp. 291–307, 1970.
[65] K. Keutzer and D. Sylvester, "Getting to the bottom of deep submicron," in Proc. IEEE/ACM Int. Conf. Computer-Aided Design, 1998.
[66] T. Kim, W. Jao, and S. Tjiang, "Arithmetic optimization using carry-save-adders," in Proc. ACM/IEEE 35th Design Automation Conf., 1998, pp. 433–438.
[67] S. Kirkpatrick, C. Gelatt, and M. Vecchi, "Optimization by simulated annealing," Science, vol. 220, no. 4598, p. 671, May 1983.
[68] J. M. Kleinhans et al., "Gordian: VLSI placement by quadratic programming and slicing optimization," IEEE Trans. Computer-Aided Design, vol. 10, pp. 356–365, Mar. 1991.
[69] D. Knapp, T. Ly, D. MacMillen, and R. Miller, "Behavioral synthesis methodology for HDL-based specification and validation," in Proc. 32nd ACM/IEEE Design Automation Conf., June 1995, pp. 286–291.
[70] A. Kobayashi, S. Matsue, and H. Shiba, "Flip-flop circuit with FLT capability" (in Japanese), in Proc. IECEO Conf., 1968, p. 692.
[71] B. Könemann et al., "Delay test: The next frontier for LSSD test systems," in Proc. Int. Test Conf., 1989, pp. 578–587.
[72] S. Krishnamoorthy and F. Mailhot, "Boolean matching of sequential elements," in Proc. 31st ACM/IEEE Design Automation Conf., 1994, pp. 691–697.
[73] C. Y. Lee, "An algorithm for path connections and its applications," IRE Trans. Electron. Comput., pp. 346–365, Sept. 1961.
[74] C. E. Leiserson and J. B. Saxe, "Retiming synchronous circuitry," Algorithmica, vol. 6, no. 1, pp. 5–35, 1991.
[75] S. Liao, S. Tjiang, and R. Gupta, "An efficient implementation of reactivity for modeling hardware," in Proc. ACM/IEEE 34th Design Automation Conf., 1997, pp. 70–75.
[76] Y.-L. Lin, "Recent developments in high-level synthesis" (survey paper), ACM Trans. Design Automation Electron. Syst., vol. 2, no. 1, pp. 2–21, Jan. 1997.
[77] T. Ly, D. Knapp, R. Miller, and D. MacMillen, "Scheduling using behavioral templates," in Proc. 32nd ACM/IEEE Design Automation Conf., June 1995, pp. 101–106.
[78] D. MacMillen and T. Schaefer, "A comparison of CMOS delay models," VLSI Technology, Internal Tech. Rep. (Design Technology Display File 48), 1988.
[79] E. McCluskey, "Minimization of Boolean functions," Bell Syst. Tech. J., vol. 35, pp. 1417–1444, Nov. 1956.
[80] M. C. McFarland, A. C. Parker, and R. Camposano, "The high-level synthesis of digital systems," Proc. IEEE, vol. 78, pp. 301–318, Feb. 1990.
[81] A. R. McKay, "Comment on computer-aided design: Simulation of digital design logic," IEEE Trans. Comput., vol. C-18, p. 862, Sept. 1969.
[82] K. L. McMillan, "Symbolic model checking: An approach to the state explosion problem," Ph.D. dissertation, Carnegie Mellon Univ., Pittsburgh, PA, 1992.
[83] R. Miller, D. MacMillen, T. Ly, and D. Knapp, "Behavioral synthesis links to logic synthesis," U.S. Patent 6 026 219, Feb. 15, 2000.
[84] G. J. Parasch and R. L. Price, "Development and application of a designer oriented cyclic simulator," in Proc. 13th Design Automation Conf.: IEEE, June 1976, pp. 48–53.
[85] E. S. Park, M. R. Mercer, and T. W. Williams, "Statistical delay fault coverage and defect level for delay faults," in Proc. 1988 Int. Test Conf., Washington, DC, Sept. 1988, pp. 492–499.
[86] E. S. Park, B. Underwood, and M. R. Mercer, "Delay testing quality in timing-optimized designs," in Proc. 1991 IEEE Int. Test Conf., Nashville, TN, Oct. 1991, pp. 897–905.
[87] G. Persky, D. N. Deutsch, and D. G. Schweikert, "LTX—A system for the directed automatic design of LSI circuits," in Proc. ACM/IEEE Design Automation Conf., 1976.
[88] E. B. Pitty, D. Martin, and H.-K. T. Ma, "A simulation-based protocol-driven scan test design rule checker," in Proc. Int. Test Conf., 1994, pp. 999–1006.
[89] C. Pixley, "Integrating model checking into the semiconductor design flow," Electron. Syst., pp. 67–74, Mar. 1999.
[90] B. T. Preas and W. M. van Cleemput, "Placement algorithms for arbitrarily shaped blocks," in Proc. IEEE Design Automation Conf., 1979.
[91] W. Quine, "The problem of simplifying truth functions," Amer. Math. Monthly, vol. 59, pp. 521–531, 1952.
[92] J. Reed, A. Sangiovanni-Vincentelli, and M. Santamauro, "A new symbolic channel router: YACR2," IEEE Trans. Computer-Aided Design, vol. 4, pp. 203–219, Mar. 1985.
[93] W. Roelandts, "DesignCon keynote speech," EE Times, Feb. 3, 2000.
[94] J. P. Roth, "Diagnosis of automata failures: A calculus and a method," IBM J. Res. Develop., vol. 10, no. 4, pp. 278–291, July 1966.
[95] R. Rudell, "Dynamic variable ordering for ordered binary decision diagrams," in Dig. Tech. Papers, IEEE/ACM Int. Conf. Computer-Aided Design, 1993, pp. 42–47.
[96] R. Rudell, "Tutorial: Design of a logic synthesis system," in Proc. 33rd ACM/IEEE Design Automation Conf., 1996, pp. 191–196.
[97] S. Sample, M. D'Amour, and T. Payne, "Apparatus for emulation of electronic hardware systems," U.S. Patent 5 109 353, 1992.
[98] M. Santarini, "RTL: Advanced by three companies," EE Times, CMP Media, Jan. 2000.
[99] C. Sechen and A. Sangiovanni-Vincentelli, "The TimberWolf placement and routing package," presented at the Custom Integrated Circuits Conf., Rochester, NY, May 1984.
[100] R. Segal, "Bdsyn: Logic description translator; BDSIM: Switch level simulator," Master's thesis, Univ. California, Berkeley, CA, May 1987.
[101] N. Shenoy and R. Rudell, "Efficient implementation of retiming," in Proc. IEEE/ACM Int. Conf. Computer-Aided Design, Nov. 1994, pp. 226–233.
[102] N. Shenoy, "Retiming: Theory and practice," Integration, VLSI J., vol. 22, 1997.
[103] N. Shenoy, M. Iyer, R. Damiano, K. Harer, H.-K. Ma, and P. Thilking, "A robust solution to the timing convergence problem in high-performance design," in Proc. IEEE Int. Conf. Computer Design, 1999, pp. 250–257.
[104] N. Sherwani, Algorithms for VLSI Physical Design Automation, 3rd ed. Boston, MA: Kluwer Academic, 1999.
[105] Semiconductor Industry Association (SIA), San Jose, CA, "The International Technology Roadmap for Semiconductors," 1999.
[106] G. L. Smith, R. J. Bahnsen, and H. Halliwell, "Boolean comparison of hardware and flowcharts," IBM J. Res. Develop., vol. 26, no. 1, pp. 106–116, Jan. 1982.
[107] R. J. Smith II, "Fundamentals of parallel logic simulation," in Proc. 23rd Design Automation Conf.: IEEE, June 1986, pp. 2–12.
[108] S. Smith and D. Black, "Pushing the limits with behavioral compiler," White Paper, Mountain View, CA, May 1999. Available: http://www.synopsys.com
[109] L. Soule and A. Gupta, "An evaluation of the Chandy–Misra–Bryant algorithm for digital logic simulation," ACM Trans. Modeling Comput. Simulat., vol. 1, no. 4, pp. 308–347, Oct. 1991.
[110] I. Sutherland, B. Sproull, and D. Harris, Logical Effort: Designing Fast CMOS Circuits. San Francisco, CA: Morgan Kaufmann, 1999.
[111] Synopsys Test Compiler Manual, Mountain View, CA, 1990.
[112] (1998) Multimillion-gate ASIC verification. Synopsys, Inc., Mountain View, CA. [Online]. Available: http://www.synopsys.com/products/staticverif/staticver_wp.html
[113] (1998) Formal verification equivalence checker methodology backgrounder. Synopsys, Inc., Mountain View, CA. [Online]. Available: http://www.synopsys.com/products/verification/formality_bgr.html
[114] (1999) Balancing your design cycle, a practical guide to HW/SW co-verification. Synopsys, Inc., Mountain View, CA. [Online]. Available: http://www.synopsys.com/products/hwsw/practical_guide.html
[115] S. A. Szygenda, "TEGAS2—Anatomy of a general purpose test generation and simulation system," in Proc. 9th Design Automation Conf.: IEEE, June 1972, pp. 117–127.
[116] D. E. Thomas and P. R. Moorby, The Verilog Hardware Description Language. Boston, MA: Kluwer Academic, 1991.
[117] R. Tsay, E. S. Kuh, and C. P. Hsu, "PROUD: A fast sea-of-gates placement algorithm," in Proc. IEEE Design Automation Conf., 1988, pp. 318–323.
[118] T. Uehara and W. M. van Cleemput, "Optimal layout of CMOS functional arrays," in Proc. IEEE Design Automation Conf., 1979.
[119] A. Wang, "Algorithms for multilevel logic optimization," Ph.D. dissertation, Univ. California, Berkeley, CA, 1989.
[120] N. Weste, "Virtual grid symbolic layout," in Proc. ACM/IEEE Design Automation Conf., 1981.
[121] T. W. Williams, R. H. Dennard, R. Kapur, M. R. Mercer, and W. Maly, "Iddq test: Sensitivity analysis of scaling," in Proc. Int. Test Conf., 1996, pp. 786–792.
[122] A. Yamada, N. Wakatsuki, T. Fukui, and S. Funatsu, "Automatic system-level test generation and fault location for large digital systems," in Proc. 12th Design Automation Conf., Boston, MA, June 1975, pp. 347–352.

Don MacMillen (M'83) received the B.S. degree in 1974 from Canisius College, Buffalo, NY, the M.S. degree in 1976 from the University of Rochester, Rochester, NY, and the Ph.D. degree in theoretical chemistry from the University of Rochester in 1980.
From 1980 to 1990, he worked at General Electric, Inmos Corp., Sierra Semiconductor, and VLSI Technology. His work included photolithography process research and development, process simulation, device modeling and development of device characterization SW, logic modeling, and RTL synthesis. In 1990, he joined Synopsys, Inc. and held various individual contributor and management positions in the Design Tools Group. From 1997 to 1998, he was Vice President of Software Development and later Vice President of Engineering at LightSpeed Semiconductor. Since September 1998, he has been the Vice President of the Advanced Technology Group at Synopsys. He has published technical papers in process development, device characterization, and RTL and high-level synthesis. His research interests include RTL synthesis, behavioral synthesis, and deep-submicrometer design challenges.
Dr. MacMillen has served on the Technical Program Committee, High-Level Synthesis, for the International Conference on Computer-Aided Design (ICCAD). He is a Member of the ACM.

Michael Butts (M'88) received the B.S. degree in electrical engineering in 1974 and the M.S. degree in electrical engineering and computer science in 1977, both from the Massachusetts Institute of Technology, Cambridge.
Since 1998, he has been Principal Engineer in the Large Systems Technology Group of Synopsys, Inc., Beaverton, OR, where his research interests are high-performance system verification and reconfigurable computing. Prior to that, he was with Quickturn Design Systems and Mentor Graphics. He has published four papers and holds 17 patents in logic emulation, reconfigurable computing, and FPGA architecture.
Mr. Butts has been a Technical Program Committee Member for the IEEE Symposium on Field-Programmable Custom Computing Machines, the ACM International Symposium on Field-Programmable Gate Arrays, and the International Conference on Field Programmable Logic and Applications.

Raul Camposano (M'92–SM'96–F'00) received the B.S.E.E. degree in 1977 and the diploma in electrical engineering in 1978 from the University of Chile, and the doctorate degree in computer science from the University of Karlsruhe, Karlsruhe, Germany, in 1981.
In 1982 and 1983, he was with the University of Siegen and Professor of Computer Science at the University del Norte in Antofagasta. From 1984 to 1986, he was with the Computer Science Research Laboratory at the University of Karlsruhe as a Researcher. Between 1986 and 1991, he was a Research Staff Member at the IBM T. J. Watson Research Center, where he led the project on high-level synthesis. From 1991 to 1994, he was a Professor at the University of Paderborn and the director of the System Design Technology department at GMD (German National Research Center for Computer Science). In 1994, he joined Synopsys, Inc. He is currently the Chief Technical Officer of Synopsys. His responsibilities include system design, physical synthesis, libraries, deep-submicrometer analysis, test, and formal verification. He has published over 50 technical papers and written/edited three books. His research interests include electronic systems design automation, high-level synthesis, and design technology for System on a Chip.
Dr. Camposano is also a member of the ACM and the IFIP WG10. He serves as an Associate Editor for the IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS and the Integration Journal. He is Editor in Chief of The International Journal on Design Automation for Embedded Systems.
Dwight Hill (S'74–M'76–SM'90) received the B.S. degree in computer engineering from the University of Illinois, Urbana, in 1975, and the M.S. and Ph.D. degrees in electrical engineering from Stanford University, Stanford, CA, in 1980. His thesis was a multilevel simulator called ADLIB/SABLE, which was a forerunner to VHDL.
After graduation, he went to work at AT&T Bell Labs, primarily at the Murray Hill, NJ, location. He has worked on VLSI layout, including layout synthesis and interactive CAD systems, and on FPGAs. He was responsible for the architecture of Lucent's "ORCA" FPGA. He has coauthored two books on VLSI CAD. He worked as Architect of the Chip Architect product, which was the first product in the Synopsys Physical Synthesis system. He holds six patents.
Dr. Hill has been active in many conferences and workshops, including ICCD, the Design Automation Conference, and VLSI 93. He was the Technical Chair of ISPD99 and the Program Chair of ISPD2000. He also serves on the Editorial Board of IEEE DESIGN AND TEST.
Thomas W. Williams (S’69–M’71–SM’85–F’89)
received the B.S.E.E. degree from Clarkson Uni-
versity, Potsdam, NY, the M.A. degree in pure
mathematics from the State University of New York
at Binghamton, and the Ph.D. degree in electrical en-
gineering from Colorado State University, Fort Collins.
He is a Chief Scientist at Synopsys, Inc., Mountain
View, CA. Prior to joining Synopsys, Inc., he was a
Senior Technical Staff Member with IBM Microelec-
tronics Division in Boulder, CO, as Manager of the
VLSI Design for Testability Group, which dealt with
DFT of IBM products. He is engaged in numerous professional activities. He
has written four book chapters and many papers on testing, edited a book and
recently co-authored another book entitled Structured Logic Testing (with E. B.
Eichelberger, E. Lindbloom, and J. A. Waicukauski). His research interests are
in design for testability (scan design and self-test), test generation, fault simu-
lation, synthesis and fault-tolerant computing.
Dr. Williams is the founder and chair of the annual IEEE Computer Society
Workshop on Design for Testability. He is the co-founder of the European Work-
shop on Design for Testability. He is also the Chair of the IEEE Technical Sub-
committee on Design for Testability. He has been a program committee member
of many conferences in the area of testing, as well as being a keynote or invited
speaker at a number of conferences, both in the U.S. and abroad. He was selected
as a Distinguished Visiting Speaker by the IEEE Computer Society from 1982
to 1985. He has been a Special-Issue Editor in the area of design for testability
for both the IEEE TRANSACTIONS ON COMPUTERS and the IEEE TRANSACTIONS
ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS. He has
received a number of best paper awards, including the 1987 Outstanding Paper
Award from the IEEE International Test Conference for his work in the area of
VLSI Self-Testing, (with W. Daehn, M. Gruetzner, and C. W. Starke), a 1987
Outstanding Paper Award from the CompEuro’87 for his work on Self-Test, a
1989 Outstanding Paper Award (Honorable Mention) from the IEEE Interna-
tional Test Conference for his work on AC Test Quality (with U. Park and M.
R. Mercer), and a 1991 Outstanding Paper Award from the ACM/IEEE Design
Automation Conference for his work in the area of Synthesis and Testing (with
B. Underwood and M. R. Mercer). He is an Adjunct Professor at the Univer-
sity of Colorado, Boulder, and in 1985 and 1997 he was a Guest Professor and
Robert Bosch Fellow at the Universität Hannover, Hannover, Germany. Dr.
Williams was named an IEEE Fellow in 1988, “for leadership and contribu-
tions to the area of design for testability.” In 1989, Dr. Williams and Dr. E. B.
Eichelberger shared the IEEE Computer Society W. Wallace McDowell Award
for Outstanding Contribution to the Computer Art, and was cited “for devel-
oping the level-sensitive scan technique of testing solid-state logic circuits and
for leading, defining, and promoting design for testability concepts.”