VLSI JAGRITI
A Monthly Magazine from JBTech INDIA
INSPIRE ASPIRE
“Teachers open the door, but you
must enter by yourself”
“If you are not failing you are
doing nothing”
“If you have made mistakes, there
is always another chance for you.
You may have a fresh start any
moment you choose, for this thing
we call 'failure' is not the falling
down, but the staying down”
Tech Byte
Basics of FPGA Design
FPGAs offer all of the features needed to implement most ...
Field-programmable gate arrays (FPGAs) arrived in 1984 as ...
[Figure: Programmable Interconnects]
On the other hand, HDLs represent a level of abstraction that can isolate designers
from the details of the hardware implementation. Schematic-based entry gives
designers much more visibility into the hardware. It’s a better method for those
who are hardware-oriented. The downside of schematic-based entry is that it makes
the design more difficult to modify or port to another FPGA.
A third option for design entry, state-machine entry,
works well for designers who can see their logic design as a series of states that the
system steps through. It shines when designing somewhat simple functions, often
in the area of system control, that can be clearly represented in visual formats. Tool
support for finite state-machine entry is limited, though. Some designers approach
the start of their design from a level of abstraction higher than HDLs, which is
algorithmic design using the C/C++ programming languages. A number of EDA
vendors have tool flows supporting this design style. Generally, algorithmic design
has been thought of as a tool for architectural exploration. But increasingly, as tool
flows emerge for C-level synthesis, it’s being accepted as a first step on the road to
hardware implementation.

After design entry, the design is simulated at the register-transfer level (RTL).
This is the first of several simulation stages, because the design must be simulated
at successive levels of abstraction as it moves down the chain toward physical
implementation on the FPGA itself. RTL simulation offers the highest performance in
terms of speed. As a result, designers can perform many simulation runs in an effort
to refine the logic. At this stage, FPGA development isn’t unlike software
development. Signals and variables are observed, procedures and functions traced,
and breakpoints set. The good news is that it’s a very fast simulation. But because
the design hasn’t yet been synthesized to gate level, properties such as timing and
resource usage are still unknowns.

The next step following RTL simulation is to convert the RTL representation of the
design into a bit-stream file that can be loaded onto the FPGA. The interim step is
FPGA synthesis, which translates the VHDL or Verilog code into a device netlist
format that can be understood by a bit-stream converter. The synthesis process can
be broken down into three steps. First, the HDL code is converted into device
netlist format. Then the resulting file is converted into a hexadecimal bit-stream
file, or .bit file. This step is necessary to change the list of required devices
and interconnects into hexadecimal bits to download to the FPGA. Lastly, the .bit
file is downloaded to the physical FPGA. This final step completes the FPGA
synthesis procedure by programming the design onto the physical FPGA.

It’s important to fully constrain designs before synthesis (Fig. 3). A constraint
file is an input to the synthesis process just as the RTL code itself. Constraints
can be applied globally or to specific portions of the design. The synthesis engine
uses these constraints to optimize the netlist. However, ...

Table 2: FPGA Usage

                  Emulation: 3%        Prototyping: 30%     Preproduction: 30%     Production: 37%
Time-to-market    Fairly high; fast    Fairly high; fast    Fairly high; fast      Fairly high; fast
                  compile times        compile times        compile times          compile times
Performance       Not stringent        Not stringent        Very critical          Very critical
Volume            Very low per         Low per              Moderately high per    High per
                  application          application          application            application

Following synthesis, device implementation begins. After netlist synthesis, the
design is automatically converted into the format supported internally by the FPGA
vendor’s place-and-route tools. Design rule checking and optimization is performed
on the incoming netlist and the software partitions the design onto the available
logic resources. Good partitioning is required to achieve high routing completion
and high performance.

Increasingly, FPGA designers are turning to floorplanning after synthesis and design
partitioning. FPGA floorplanners work from the netlist hierarchy as defined by the
RTL coding. Floorplanning can help if area is tight. When possible, it’s a good idea
to place critical logic in separate blocks.

After partitioning and floorplanning, the placement tool tries to place the logic
blocks to achieve efficient routing. The tool monitors routing length and track
congestion while placing the blocks. It may also track the absolute path delays to
meet the user’s timing constraints. Overall, the process mimics PCB place and route.
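As a rough illustration of the placement step described above, in which the tool
monitors routing length while placing logic blocks, the sketch below compares two
candidate placements. The article does not say which metric real tools use; the
half-perimeter wirelength (HPWL) estimate shown here is a common textbook proxy and
is an assumption, as are the block names, grid coordinates, and nets.

```python
# Minimal sketch: estimate routing length for a placement using
# half-perimeter wirelength (HPWL). All names and coordinates are hypothetical.

def hpwl(net, placement):
    """Half-perimeter of the bounding box around the blocks on one net."""
    xs = [placement[block][0] for block in net]
    ys = [placement[block][1] for block in net]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))

def total_wirelength(nets, placement):
    """Sum of HPWL over all nets; a lower total suggests easier routing."""
    return sum(hpwl(net, placement) for net in nets)

# Each net lists the blocks it connects.
nets = [("a", "b"), ("b", "c", "d"), ("a", "d")]

# Two candidate placements: block name -> (column, row) on the device grid.
spread_out = {"a": (0, 0), "b": (3, 0), "c": (0, 3), "d": (3, 3)}
compact = {"a": (0, 0), "b": (1, 0), "c": (1, 1), "d": (0, 1)}

cost_spread = total_wirelength(nets, spread_out)
cost_compact = total_wirelength(nets, compact)
print(cost_spread, cost_compact)
```

A placer would typically prefer the candidate with the lower estimate, then refine
it against congestion and timing, which this sketch does not model.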
Table 3: Advantages/Disadvantages of Various FPGA Technologies

Feature                                   SRAM                      Antifuse         Flash
Reprogrammable?                           Yes (in system)           No               Yes (in system or offline)
Reprogramming speed (including erasure)   Fast                      Not applicable   3X SRAM
Volatile?                                 Yes                       No               No (but can be if required)
External configuration file?              Yes                       No               No
Good for prototyping?                     Yes                       No               Yes
Instant on?                               No                        Yes              Yes
IP security                               Poor                      Very good        Very good
Size of configuration cell                Large (six transistors)   Very small       Small (two transistors)
Power consumption                         High                      Low              Medium
Radiation hardness                        No                        Yes              No

Functional simulation is performed after synthesis and before physical
implementation. This step ensures correct logic functionality. After implementation,
there’s a final verification step with full timing information. After placement and
routing, the logic and routing delays are back-annotated to the gate-level netlist
for this final simulation. At this point, simulation is a much longer process,
because timing is also a factor (Fig. 4). Often, designers substitute static timing
analysis for timing simulation. Static timing analysis calculates the timing of
combinational paths between registers and compares it against the designer’s timing
constraints.

3. Go With The Flow
The implementation flow for FPGAs begins with synthesis of the HDL design
description into a gate-level netlist. Accounting for user-defined design
constraints on area, power, and speed, the tool performs various optimizations
before creating the netlist that’s passed on to place-and-route tools.
[Figure 3: flow diagram with blocks labeled Verilog RTL, IP, HDL Simulator,
Synthesis, Place and Route, FPGA/PLD, FPGA Implement]

Once the design is successfully verified and found to meet timing, the final step is
to actually program the FPGA itself. At the completion of placement and routing, a
binary programming file is created. It’s used to configure the device. No matter
what the device’s underlying technology, the FPGA interconnect fabric has cells that
configure it to connect to the inputs and outputs of the logic blocks. In turn, the
cells configure those logic blocks to each other. Most programmable-logic
technologies, including the PROMs for SRAM-based FPGAs, require some sort of a
device programmer. Devices can also be programmed through their configuration ports
using a set of dedicated pins.

Modern FPGAs also incorporate a JTAG port that, happily, can be used for more than
boundary-scan testing. The JTAG port can be connected to the device’s internal SRAM
configuration-cell shift register, which in turn can be instructed to connect to the
chip’s JTAG scan chain.

David Maliniak, Electronic Design Automation Editor

If you’ve gotten this far with your design, chances are you have a finished FPGA.
There’s one more step to the process, however, which is to attach the device to a
printed-circuit board in a system. The appearance of 10-Gbit/s serial transmitters,
or I/Os, on the chip, coupled with packages containing as many as 1500 pins, makes
the interface between the FPGA and its intended system board a very sticky issue.
All too often, an FPGA is soldered to a pc board and it doesn’t function as expected
or, worse, it doesn’t function at all. That can be the result of errors caused by
manual placement of all those pins, not to mention the board-level timing issues
created by a complex FPGA.

More than ever, designers must strongly consider an integrated flow that takes them
from conception of the FPGA through board design. Such flows maintain complete
connectivity between the system-level design and the FPGA; they also do so between
design iterations. Not only do today’s integrated FPGA-to-board flows create the
schematic connectivity needed for verification and layout of the board, but they
also document which signal connections are made to which device pins and how these
map to the original board-level bus structures.

Integrated flows for FPGAs make sense in general, considering that FPGA vendors will
continue to introduce more complex, powerful, and economical devices over time. An
integrated third-party flow makes it easier to re-target a design to different
technologies from different vendors as conditions warrant.
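The static timing analysis step mentioned earlier in this article, which adds up
delays along combinational paths between registers and compares the result with the
timing constraint, can be sketched as a longest-path computation. This is a
simplified illustration and not how any particular vendor tool works: the node
names, the delay values in nanoseconds, and the clock periods are hypothetical, and
setup time, clock skew, and routing variation are ignored.

```python
# Minimal static-timing sketch: longest combinational path delay between
# registers, compared with a clock-period constraint. All values hypothetical.
from functools import lru_cache

# Directed acyclic graph: node -> list of (next_node, delay_ns) edges.
# "reg_a" and "reg_b" launch data; "reg_out" captures it.
edges = {
    "reg_a": [("and1", 1.2)],
    "reg_b": [("and1", 0.8), ("xor1", 1.5)],
    "and1": [("xor1", 2.0)],
    "xor1": [("reg_out", 2.5)],
    "reg_out": [],
}

@lru_cache(maxsize=None)
def worst_delay_to_end(node):
    """Longest delay from `node` to any endpoint (a node with no fan-out)."""
    fanout = edges[node]
    if not fanout:
        return 0.0
    return max(delay + worst_delay_to_end(nxt) for nxt, delay in fanout)

def slack(start, clock_period_ns):
    """Clock period minus worst path delay; negative means a violation."""
    return clock_period_ns - worst_delay_to_end(start)

critical = worst_delay_to_end("reg_a")
slack_a = slack("reg_a", clock_period_ns=10.0)
slack_b = slack("reg_b", clock_period_ns=5.0)
print(f"worst delay from reg_a: {critical:.1f} ns")
print(f"slack from reg_a at 10 ns: {slack_a:.1f} ns")
print(f"slack from reg_b at 5 ns: {slack_b:.1f} ns")
```

Because the check only walks the delay graph once, it is much cheaper than running
a timing simulation with stimulus, which is the trade-off the article points to.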
New Chips Don’t Suck (Power)
by Jim Turley, Embedded Technology Journal

Within a week, Intel and Freescale both announced new high-end embedded processors.
They’re both packed with multicore processors, DRAM controllers, and PCI Express
interfaces. But, for all their similarities, they couldn’t be more different.

In this corner, we have Freescale’s new P1022, the sixth member of the QorIQ family.
And in this corner, we have “Jasper Forest,” a mostly new family of chips from
Intel. Both are more power-efficient than their predecessors, though, in one case,
that’s not saying much. And both are well-supported with software and development
tools.

Freescale and PowerPC
If you’re not up on Freescale’s perverse brand-name strategy, QorIQ is the spiritual
successor to QUICC, the company’s old communications controllers. Years ago, the
QUICC name stood for “quad integrated communications controller” and thus made a
modicum of sense. When the QUICC family traded in its 68K processor core for a more
modern PowerPC processor core, the name changed to Power QUICC, which still made
some sense.

QorIQ (pronounced “core I.Q.”) trades on the company’s hard-won brand equity in the
letter Q, but otherwise makes little sense. Nevertheless, Freescale is pushing ahead
with numerous QorIQ family members, the P1022 and P1013 being the newest additions.

The P1022 takes a dual-core PowerPC and mates it to a dizzying array of
communications- and interface-related peripherals. As nice as it is to have a pair
of PowerPCs under harness, the real value of the P1022 is its peripheral mix. A set
of three (count ’em) PCI Express buses allows connection to pretty much anything
else in a typical system. Disks get their own dual-SATA interfaces, and memory is
handled through a DDR2/3 controller. Dual gigabit Ethernet ports handle networking,
while dual USB 2.0 controllers handle the slower stuff. Unusually, this chip has an
LCD controller, so a nice user interface would be simple to add. The P1013 is
identical to the P1022 but has only a single PowerPC processor core.

Even with all its goodies, the P1022 is only a midrange QorIQ chip. The existing
P2020 and P4080 chips have more performance, including quad-core centers, but are
also more power-hungry. Freescale pitches the P1022 (or any QorIQ chip with P1xxx in
its part number) as “balanced” or power-efficient variations. That is, they’ve got
high-end performance but with the edge taken off to reduce power consumption a bit.

The power efficiency comes from a few factors. First, the chip runs at “only”
600 MHz to 1 GHz, whereas other QorIQ devices are rated for 1 GHz and up. Second,
the chip is manufactured with 45nm silicon-on-insulator (SOI) process technology,
the current state of the art. And finally, the chip’s circuits are separated into
two power planes, dividing the silicon into areas that must stay awake and active
all the time and those that can go to sleep. By allowing big areas of the chip to
essentially switch off, Freescale slashes the passive leakage current in those
areas, a big problem for small-geometry semiconductors. Modern chips often leak as
much current as they actively dissipate, an unfortunate side effect of small
transistor geometry.

Intel and x86
For its part, Intel partially lifted the veil on a series of upcoming embedded x86
processors similar to Freescale’s new QorIQs. Codenamed “Jasper Forest,” the new
chips are based on the venerable x86 processor architecture. In this case, they’ll
use the “Nehalem” processor core design that appears in some newer Xeon chips.

Like most recent Intel designs, Nehalem emphasizes power efficiency over raw clock
speed. Intel likes to point out that Jasper Forest, when it arrives, will save 27
watts over today’s equivalent Xeon-based configuration, largely because of the more
efficient processor and the integrated I/O. In an interesting aside, an Intel
representative extrapolated billions of dollars of energy savings if all the world’s
embedded Xeon processors were replaced with Jasper Forest. Hey, dream big.

That comparison is a bit disingenuous, though, because Xeon is notoriously power
hungry. The heat sinks are typically bigger than the processor. Saying that Jasper
Forest consumes less power than a Xeon 5500 is like saying it’s a long walk to the
moon. Still, Jasper Forest promises to be in the same league as QorIQ with its
multicore heart and integrated peripherals. The decision may come down to whether
you prefer the PowerPC or x86 instruction set.

Clash of the Titans
Both chips support DDR3 directly; both have PCI Express controllers (although
Intel’s has 16 lanes to Freescale’s six); both have RAID disk controllers; both have
expected 10-year life spans. And both come from the biggest names in
microprocessors.

Intel isn’t giving away too many details of Jasper Forest just yet, so the chips may
come with Gigabit Ethernet, USB 2.0, or LCD controllers like QorIQ; we’ll have to
wait and see.

Jasper Forest’s Nehalem-based processor core is available in single-, dual-, and
quad-core configurations, whereas Freescale’s QorIQ chips are already available in
four- and eight-core versions. So that makes them twice as good, right? Plus,
several QorIQ chips are already shipping (although the P1022 itself isn’t due until
January), while the first member of the Jasper Forest family isn’t expected until
early next year. Advantage: Freescale.

On the “green” front, Freescale has Intel beat, hands down. The dual-core QorIQ
P1022 consumes about 3 watts, less than one-tenth of Jasper Forest’s estimated 35-65
watts for a roughly equivalent dual-core configuration. That x86 instruction set
exacts a heavy toll in power efficiency, even though both chip families are
fabricated in similar 45nm silicon.
Designing the Power of tomorrow
The training aims to provide a basic
understanding of integrated circuit design
through work on an industry-standard project
in front end, back end, embedded,
EDA, or DSP.
Benefits of Training
How to use FPGA (Spartan, Virtex).
From BASICs to ASICs.
From Gates to Microprocessor.
Will be able to understand IC Design.
Interaction with R&D Team.
info@jbtechindia.com www.jbtechindia.com
JBTech INDIA
VLSI Design Solutions & Project Training
Royal Krishna Apra Plaza, D-2, F-09, Alpha-I,
Commercial Belt, Greater Noida (U.P), INDIA
Tel: +91-0120-4213142, 09911676774
Email: info@jbtechindia.com
Website: www.jbtechindia.com