
Basic FPGA Architecture

© 2005 Xilinx, Inc. All Rights Reserved


Objectives
After completing this module, you will be
able to:
• Identify the basic architectural resources of the
Virtex™-II FPGA
• List the differences between the Virtex-II, Virtex-II
Pro, Spartan™-3, and Spartan-3E devices
• List the new and enhanced features of the new
Virtex-4 device family



For Academic Use Only
Outline
• Overview
• Slice Resources
• I/O Resources
• Memory and Clocking
• Spartan-3, Spartan-3E, and Virtex-II Pro Features
• Virtex-4 Features
• Summary
• Appendix
Overview
• All Xilinx FPGAs contain the same basic resources
– Slices (grouped into CLBs)
• Contain combinatorial logic and register resources
– IOBs
• Interface between the FPGA and the outside world
– Programmable interconnect
– Other resources
• Memory
• Multipliers
• Global clock buffers
• Boundary scan logic

Virtex-II Architecture
• The Virtex™-II architecture’s core voltage operates at 1.5V
[Figure: Virtex-II device layout showing I/O Blocks (IOBs), block SelectRAM™ resources, programmable interconnect, dedicated multipliers, Configurable Logic Blocks (CLBs), and clock management (DCMs, BUFGMUXes)]
Slices and CLBs
• Each Virtex-II CLB contains four slices
– Local routing provides feedback between slices in the same CLB, and it provides routing to neighboring CLBs
– A switch matrix provides access to general routing resources
[Figure: CLB containing slices S0 through S3, two BUFTs, the switch matrix, local routing, the SHIFT path, and the CIN/COUT carry connections]

Simplified Slice Structure
• Each slice has four outputs
– Two registered outputs, two non-registered outputs
– Two BUFTs associated with each CLB, accessible by all 16 CLB outputs
• Carry logic runs vertically, up only
– Two independent carry chains per CLB
[Figure: Slice 0 with two identical paths, each a LUT followed by carry logic and a register (D, CE, PRE, CLR, Q)]
Detailed Slice Structure
• The next few slides discuss the slice features
– LUTs
– MUXF5, MUXF6, MUXF7, MUXF8 (only the F5 and F6 MUX are shown in this diagram)
– Carry Logic
– MULT_ANDs
– Sequential Elements

Look-Up Tables
• Combinatorial logic is stored in Look-Up Tables (LUTs)
– Also called Function Generators (FGs)
– Capacity is limited by the number of inputs, not by the complexity
• Delay through the LUT is constant

[Figure: combinatorial logic with inputs A, B, C, D and output Z, implemented by the truth table below]

A B C D | Z
0 0 0 0 | 0
0 0 0 1 | 0
0 0 1 0 | 0
0 0 1 1 | 1
0 1 0 0 | 1
0 1 0 1 | 1
. . .
1 1 0 0 | 0
1 1 0 1 | 0
1 1 1 0 | 0
1 1 1 1 | 1
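
The point that LUT capacity depends on the number of inputs, not on the complexity of the function, can be illustrated with a small behavioral model. This is only a sketch: the function names and the choice of A as the most significant address bit are assumptions made for illustration, not a Xilinx convention.

```python
from typing import Callable


def build_lut4(func: Callable[[int, int, int, int], int]) -> int:
    """Pack the 16 outputs of a 4-input Boolean function into a 16-bit table."""
    table = 0
    for index in range(16):
        # Assumption for this sketch: A is the most significant address bit.
        a, b, c, d = (index >> 3) & 1, (index >> 2) & 1, (index >> 1) & 1, index & 1
        if func(a, b, c, d) & 1:
            table |= 1 << index
    return table


def read_lut4(table: int, a: int, b: int, c: int, d: int) -> int:
    """Look up Z with one indexed read, whatever function the table encodes."""
    index = (a << 3) | (b << 2) | (c << 1) | d
    return (table >> index) & 1


# A 4-input AND and a 4-input parity function both fit in one 16-bit table.
and4 = build_lut4(lambda a, b, c, d: a & b & c & d)
parity4 = build_lut4(lambda a, b, c, d: a ^ b ^ c ^ d)
```

Both tables are read the same way, which mirrors the constant delay through the LUT.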

Connecting Look-Up Tables
• MUXF5 combines LUTs in each slice
• MUXF6 combines slices S0 and S1
• MUXF6 combines slices S2 and S3
• MUXF7 combines the two MUXF6 outputs
• MUXF8 combines the two MUXF7 outputs (from the CLB above or below)
[Figure: CLB with slices S0 through S3, each containing an F5 MUX, plus the F6, F7, and F8 MUXes that combine their outputs]
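
A rough behavioral sketch of how a multiplexer tree turns several 4-input LUTs into one wider function is given below. One MUX level corresponds to MUXF5, two to MUXF6, three to MUXF7, and four to MUXF8. The helper names (`lut4`, `pack_tables`, `wide_function`) are hypothetical and the bit ordering is an assumption of this sketch.

```python
from typing import Callable, List


def lut4(table: int, address: int) -> int:
    """Read one bit from a 16-entry table using the low four address bits."""
    return (table >> (address & 0xF)) & 1


def pack_tables(func: Callable[[int], int], num_inputs: int) -> List[int]:
    """Split an n-input function (n >= 4) into 2**(n-4) sixteen-entry tables."""
    return [
        sum((func((block << 4) | i) & 1) << i for i in range(16))
        for block in range(1 << (num_inputs - 4))
    ]


def wide_function(tables: List[int], x: int) -> int:
    """Evaluate the wide function: LUTs first, then one 2:1 MUX level per extra input."""
    outputs = [lut4(table, x) for table in tables]  # all LUTs share inputs 0..3
    level = 0
    while len(outputs) > 1:
        select = (x >> (4 + level)) & 1  # input 4 drives level 0, input 5 level 1, ...
        outputs = [
            outputs[i + 1] if select else outputs[i]
            for i in range(0, len(outputs), 2)
        ]
        level += 1
    return outputs[0]


# Example: an 8-input function needs 16 LUTs and four MUX levels.
def divisible_by_7(x: int) -> int:
    return int(x % 7 == 0)


tables = pack_tables(divisible_by_7, 8)
```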

Fast Carry Logic
• Simple, fast, and complete arithmetic Logic
– Dedicated XOR gate for single-level sum completion
– Uses dedicated routing resources
– All synthesis tools can infer carry logic
[Figure: CLB with two carry chains running upward through the slices, one through S0 and S1 and one through S2 and S3; each chain enters at CIN and its COUT continues to S0 or to the CIN of S2 of the next CLB]
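
The following sketch models one carry chain bit by bit. In each bit position the LUT is assumed to compute the propagate signal (A XOR B), the carry multiplexer passes either the incoming carry or the operand bit, and the dedicated XOR gate completes the sum. It is a behavioral illustration only; `carry_chain_add` is a hypothetical name.

```python
from typing import Tuple


def carry_chain_add(a: int, b: int, width: int, carry_in: int = 0) -> Tuple[int, int]:
    """Add two unsigned values the way a LUT + carry MUX + XOR chain would."""
    carry = carry_in & 1
    total = 0
    for bit in range(width):
        a_bit = (a >> bit) & 1
        b_bit = (b >> bit) & 1
        propagate = a_bit ^ b_bit                # LUT output
        sum_bit = propagate ^ carry              # dedicated XOR gate
        carry = carry if propagate else a_bit    # carry multiplexer
        total |= sum_bit << bit
    return total, carry                          # carry is the chain's COUT
```

For example, `carry_chain_add(9, 8, 4)` returns `(1, 1)`: the 4-bit sum wraps to 1 and COUT is set.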

MULT_AND Gate
• Highly efficient multiply and add implementation
– Earlier FPGA architectures require two LUTs per bit to perform the multiplication and addition
– The MULT_AND gate enables an area reduction by performing the multiply and the add in one LUT per bit
[Figure: LUTs with inputs A and B, the MULT_AND gate, the CY_MUX (S, DI, CI, CO), and the CY_XOR that produce A x B]
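
As a hedged illustration of why one LUT per bit is enough, the sketch below adds a partial product to a running sum. The LUT is assumed to compute (a AND b) XOR acc, while the MULT_AND gate supplies the same a AND b term to the carry multiplexer's data input. The exact wiring in the device may differ; the function names are hypothetical and the model is unsigned.

```python
def add_partial_product(acc: int, a: int, b_bit: int, width: int) -> int:
    """Return acc + (a if b_bit else 0) using one LUT and one MULT_AND per bit."""
    carry = 0
    result = 0
    for i in range(width):
        a_i = (a >> i) & 1
        acc_i = (acc >> i) & 1
        mult_and = a_i & b_bit                  # MULT_AND gate, feeds DI
        lut = mult_and ^ acc_i                  # LUT output, drives S
        result |= (lut ^ carry) << i            # CY_XOR produces the sum bit
        carry = carry if lut else mult_and      # CY_MUX selects CI or DI
    return result | (carry << width)


def multiply(a: int, b: int, width: int) -> int:
    """Unsigned shift-and-add multiply built from add_partial_product rows."""
    acc = 0
    for j in range(width):
        row = add_partial_product(acc >> j, a, (b >> j) & 1, width)
        acc = (acc & ((1 << j) - 1)) | (row << j)
    return acc
```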

Flexible Sequential Elements
• Either flip-flops or latches
• Two in each slice; eight in each CLB
• Inputs come from LUTs or from an independent CLB input
• Separate set and reset controls
– Can be synchronous or asynchronous
• All controls are shared within a slice
– Control signals can be inverted locally within a slice
[Figure: register symbols FDRSE_1 (D, S, CE, R, Q), FDCPE (D, PRE, CE, CLR, Q), and LDCPE (D, PRE, CE, G, CLR, Q)]
Shift Register LUT (SRL16CE)
• Dynamically addressable serial shift registers
– Maximum delay of 16 clock cycles per LUT (128 per CLB)
– Cascadable to other LUTs or CLBs for longer shift registers
• Dedicated connection from Q15 to D input of the next SRL16CE
– Shift register length can be changed asynchronously by toggling address A
[Figure: LUT drawn as a chain of registers with D, CE, and CLK inputs; address A[3:0] selects which stage drives output Q, and Q15 is the cascade output]
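
A minimal behavioral model of the shift register LUT is sketched below. The class and method names are illustrative assumptions, not a vendor API. Address A selects a delay of A + 1 clock cycles, and reading a different address takes effect immediately because the read is not clocked.

```python
class SRL16CE:
    """Behavioral sketch of a 16-stage, dynamically addressable shift register."""

    def __init__(self, init: int = 0) -> None:
        self.bits = init & 0xFFFF        # bit 0 holds the most recent sample

    def clock(self, d: int, ce: int = 1) -> None:
        """Rising clock edge: shift in D when clock enable is active."""
        if ce:
            self.bits = ((self.bits << 1) | (d & 1)) & 0xFFFF

    def q(self, address: int) -> int:
        """Addressable output: the sample shifted in (address + 1) clocks ago."""
        return (self.bits >> (address & 0xF)) & 1

    @property
    def q15(self) -> int:
        """Cascade output, intended for the D input of the next stage."""
        return (self.bits >> 15) & 1


srl = SRL16CE()
samples = [1, 0, 1, 1, 0, 0, 1, 0]
for sample in samples:
    srl.clock(sample)
```

After these eight clocks, `srl.q(0)` is the last sample shifted in and `srl.q(7)` is the first.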
Shift Register LUT Example
• The SRL can be used to create a No Operation (NOP)
– This example uses 64 LUTs (8 CLBs) to replace 576 flip-flops (72 CLBs) and associated routing and delays
[Figure: two 64-bit paths of 12 cycles each. Upper path: Operation A (4 cycles), then Operation B (8 cycles). Lower path: Operation C (3 cycles), then Operation D - NOP (9 cycles). Paths are statically balanced.]
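
The resource figures in this example follow from simple arithmetic: a 64-bit bus delayed by 9 cycles needs 64 x 9 = 576 flip-flops, or one 16-deep shift register LUT per bit. The sketch below reproduces the numbers, using the eight LUTs and eight registers per CLB stated elsewhere in this module; the function name is hypothetical.

```python
import math


def delay_line_cost(width_bits: int, delay_cycles: int,
                    per_clb: int = 8, srl_depth: int = 16) -> dict:
    """Compare a flip-flop delay line with an SRL-based one."""
    flip_flops = width_bits * delay_cycles
    luts = width_bits * math.ceil(delay_cycles / srl_depth)
    return {
        "flip_flops": flip_flops,
        "flip_flop_clbs": math.ceil(flip_flops / per_clb),
        "srl_luts": luts,
        "srl_clbs": math.ceil(luts / per_clb),
    }


nop_cost = delay_line_cost(width_bits=64, delay_cycles=9)
```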

IOB Element
• Input path
– Two DDR registers
• Output path
– Two DDR registers
– Two 3-state enable DDR registers
• Separate clocks and clock enables for I and O
• Set and reset signals are shared
[Figure: IOB with input registers (ICK1, ICK2), 3-state registers and output registers (OCK1, OCK2), each output pair followed by a DDR MUX, all connected to the PAD]

SelectIO Standard
• Allows direct connections to external signals of
varied voltages and thresholds
– Optimizes the speed/noise tradeoff
– Saves having to place interface components onto your
board
• Differential signaling standards
– LVDS, BLVDS, ULVDS
– LDT
– LVPECL
• Single-ended I/O standards
– LVTTL, LVCMOS (3.3V, 2.5V, 1.8V, and 1.5V)
– PCI-X at 133 MHz, PCI (3.3V at 33 MHz and 66 MHz)
– GTL, GTLP
– and more!
Digitally Controlled Impedance (DCI)
• DCI provides
– Output drivers that match the impedance of the traces
– On-chip termination for receivers and transmitters
• DCI advantages
– Improves signal integrity by eliminating stub reflections
– Reduces board routing complexity and component count by eliminating external resistors
– Eliminates the effects of temperature, voltage, and process variations by using an internal feedback circuit

Other Virtex-II Features
• Distributed RAM and block RAM
– Distributed RAM uses the CLB resources (1 LUT = 16 RAM bits)
– Block RAM is a dedicated resource on the device (18-kb blocks)
• Dedicated 18 x 18 multipliers next to block RAMs
• Clock management resources
– Sixteen dedicated global clock multiplexers
– Digital Clock Managers (DCMs)

Distributed SelectRAM Resources
• Uses a LUT in a slice as memory
• Synchronous write
• Asynchronous read
– Accompanying flip-flops can be used to create synchronous read
• RAM and ROM are initialized during configuration
– Data can be written to RAM after configuration
• Emulated dual-port RAM
– One read/write port
– One read-only port
[Figure: RAM16X1S (D, WE, WCLK, A0 to A3, O), RAM32X1S (D, WE, WCLK, A0 to A4, O), and RAM16X1D (D, WE, WCLK, A0 to A3, SPO, DPRA0 to DPRA3, DPO), each built from the LUTs in a slice]
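
The read and write behavior listed above can be summarized in a small model of the dual-port primitive. This is a behavioral sketch with assumed method names; only the port names (D, WE, A, SPO, DPRA, DPO) come from the figure.

```python
class RAM16X1D:
    """Behavioral sketch of a 16 x 1 distributed RAM with an emulated second port."""

    def __init__(self, init: int = 0) -> None:
        self.mem = init & 0xFFFF         # contents loaded during configuration

    def write_clock(self, d: int, we: int, a: int) -> None:
        """Synchronous write: takes effect only on a WCLK edge with WE active."""
        if we:
            a &= 0xF
            self.mem = (self.mem & ~(1 << a)) | ((d & 1) << a)

    def spo(self, a: int) -> int:
        """Asynchronous read on the read/write port address A."""
        return (self.mem >> (a & 0xF)) & 1

    def dpo(self, dpra: int) -> int:
        """Asynchronous read on the read-only port address DPRA."""
        return (self.mem >> (dpra & 0xF)) & 1


ram = RAM16X1D(init=0x0001)
ram.write_clock(d=1, we=1, a=5)
ram.write_clock(d=1, we=0, a=6)   # WE inactive, so nothing is written
```

Registering the value returned by `spo` or `dpo` in a flip-flop would give the synchronous read mentioned in the bullets.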
Block SelectRAM Resources
• Up to 3.5 Mb of RAM in 18-kb blocks
– Synchronous read and write
• True dual-port memory
– Each port has synchronous read and write capability
– Different clocks for each port
• Supports initial values
• Synchronous reset on output latches
• Supports parity bits
[Figure: 18-kb block SelectRAM memory. Port A: DIA, DIPA, ADDRA, WEA, ENA, SSRA, CLKA, DOA, DOPA. Port B: DIB, DIPB, ADDRB, WEB, ENB, SSRB, CLKB, DOB, DOPB.]
Dedicated Multiplier Blocks
• 18-bit twos complement signed operation
• Optimized to implement Multiply and Accumulate functions
• Multipliers are physically located next to block SelectRAM™ memory
[Figure: 18 x 18 multiplier with inputs Data_A (18 bits) and Data_B (18 bits) and a 36-bit output; usable for 4 x 4, 8 x 8, 12 x 12, and 18 x 18 signed multiplication]
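
To make the twos complement behavior concrete, the sketch below models the arithmetic of a signed 18 x 18 multiply with a 36-bit result. The function names are hypothetical and the model covers arithmetic only, not timing or pipelining.

```python
def to_signed(value: int, bits: int) -> int:
    """Interpret the low `bits` of value as a twos complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def mult18x18(data_a: int, data_b: int) -> int:
    """Multiply two 18-bit signed operands and return the 36-bit output pattern."""
    product = to_signed(data_a, 18) * to_signed(data_b, 18)
    return product & ((1 << 36) - 1)
```

For instance, `mult18x18(0x3FFFF, 0x3FFFF)` is (-1) x (-1) and returns 1.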

Global Clock Routing Resources
• Sixteen dedicated global clock multiplexers
– Eight on the top-center of the die, eight on the bottom-center
– Driven by a clock input pad, a DCM, or local routing
• Global clock multiplexers provide the following:
– Traditional clock buffer (BUFG) function
– Global clock enable capability (BUFGCE)
– Glitch-free switching between clock signals (BUFGMUX)
• Up to eight clock nets can be used in each clock region of the device
– Each device contains four or more clock regions

Digital Clock Manager (DCM)
• Up to twelve DCMs per device
– Located on the top and bottom edges of the die
– Driven by clock input pads
• DCMs provide the following:
– Delay-Locked Loop (DLL)
– Digital Frequency Synthesizer (DFS)
– Digital Phase Shifter (DPS)
• Up to four outputs of each DCM can drive onto
global clock buffers
– All DCM outputs can drive general routing

Spartan-3 versus Virtex-II
• Lower cost
• Smaller process = lower core voltage
– 0.09 micron versus 0.15 micron
– Vccint = 1.2V versus 1.5V
• Different I/O standard support
– New standards: 1.2V LVCMOS, 1.8V HSTL, and SSTL
– Default is LVCMOS, versus LVTTL
• More I/O pins per package
• Only one-half of the slices support RAM or SRL16s (SLICEM)
• Fewer block RAMs and multiplier blocks
– Same size and functionality
• Eight global clock multiplexers
• Two or four DCM blocks
• No internal 3-state buffers
SLICEM and SLICEL
• Each Spartan™-3 CLB contains four slices
– Similar to the Virtex™-II
• Slices are grouped in pairs
– Left-hand SLICEM (Memory)
• LUTs can be configured as memory or SRL16
– Right-hand SLICEL (Logic)
• LUT can be used as logic only
[Figure: CLB with the left-hand SLICEM pair (slices X0Y0 and X0Y1) and the right-hand SLICEL pair (slices X1Y0 and X1Y1), the switch matrix, fast connects, SHIFTIN/SHIFTOUT, and CIN/COUT]
Spartan-3E Features
• More gates per I/O than Spartan-3
• Removed some I/O standards
– Higher-drive LVCMOS
– GTL, GTLP
– SSTL2_II
– HSTL_II_18, HSTL_I, HSTL_III
– LVDS_EXT, ULVDS
• DDR Cascade
– Internal data is presented on a single clock edge
• 16 BUFGMUXes on left and right sides
– Drive half the chip only
– In addition to eight global clocks
• Pipelined multipliers
• Additional configuration modes
– SPI, BPI
– Multi-Boot mode
Virtex-II Pro Features
• 0.13 micron process
• Up to 24 RocketIO™ Multi-Gigabit Transceiver (MGT)
blocks
– Serializer and deserializer (SERDES)
– Fibre Channel, Gigabit Ethernet, XAUI, Infiniband
compliant transceivers, and others
– 8-, 16-, and 32-bit selectable FPGA interface
– 8B/10B encoder and decoder
• PowerPC™ RISC processor blocks
– Thirty-two 32-bit General Purpose Registers (GPRs)
– Low power consumption: 0.9mW/MHz
– IBM CoreConnect bus architecture support

Virtex-4 Features
• New features
– Dedicated DSP blocks
– Phase-matched clock dividers (PMCD)
– SERDES built into the Virtex™-4 SelectIO™ standard
– Dynamic reconfiguration port (DRP)
• Enhanced features
– Block RAM can be configured as a FIFO
– Advanced clocking networks, including regional clock buffers and source-synchronous support
– 11.1 Gbps RocketIO™ Multi-Gigabit Transceiver (MGT)
blocks
– Enhanced PowerPC™ processor blocks

Review Questions
• List the primary slice features
• List the three ways a LUT can be configured

Answers
• List the primary slice features
– Look-up tables and function generators (two per slice,
eight per CLB)
– Registers (two per slice, eight per CLB)
– Dedicated multiplexers (MUXF5, MUXF6, MUXF7,
MUXF8)
– Carry logic
– MULT_AND gate
• List the three ways a LUT can be configured
– Combinatorial logic
– Shift register (SRL16CE)
– Distributed memory

Summary
• Slices contain LUTs, registers, and carry logic
– LUTs are connected with dedicated multiplexers and
carry logic
– LUTs can be configured as shift registers or memory
• IOBs contain DDR registers
• SelectIO™ standards and DCI enable direct
connection to multiple I/O standards while reducing
component count
• Virtex™-II memory resources include the following:
– Distributed SelectRAM™ resources and distributed
SelectROM (uses CLB LUTs)
– 18-kb block SelectRAM resources

Summary
• The Virtex™-II devices contain dedicated 18x18
multipliers next to each block SelectRAM™ resource
• Digital clock managers provide the following:
– Delay-Locked Loop (DLL)
– Digital Frequency Synthesizer (DFS)
– Digital Phase Shifter (DPS)

Where Can I Learn More?
• User Guides
– www.xilinx.com → Documentation → User Guides

• Application Notes
– www.xilinx.com → Documentation → Application
Notes

• Education resources
– Designing with the Virtex-4 Family course
– Spartan-3E Architecture free Recorded e-Learning

Double Data Rate Registers
• DDR registers can be clocked
– By Clock and NOT(Clock) if the duty cycle is 50/50
– By the CLK0 and CLK180 outputs of a DCM
• If D1 = “1” and D2 = “0”, the output is a copy of Clock
– Use this technique to generate a clock output that is synchronized to DDR output data
[Figure: FDDR. D1 and D2 feed two registers clocked by OCK1 and OCK2; a DDR MUX selects between them and drives the PAD through an OBUF.]
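
The clock-forwarding trick can be seen in a very small model of the output DDR register pair. The sketch assumes that D1 is driven during the High half of each clock period and D2 during the Low half; `fddr` is a hypothetical function name, not the primitive itself.

```python
from typing import Iterable, List, Tuple


def fddr(samples: Iterable[Tuple[int, int]]) -> List[int]:
    """Return the pad level for each half clock period.

    Each (d1, d2) pair is one clock period: d1 is driven while Clock is
    High (register clocked by OCK1), d2 while Clock is Low (OCK2).
    """
    pad = []
    for d1, d2 in samples:
        pad.extend([d1 & 1, d2 & 1])
    return pad


# Constant D1 = 1 and D2 = 0 reproduce the clock waveform on the pad.
forwarded_clock = fddr([(1, 0)] * 4)

# The same structure carries two data bits per clock period.
data = fddr([(1, 1), (0, 1), (0, 0), (1, 0)])
```

Because the forwarded clock and the data leave through identical register and multiplexer structures, their output delays match.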
Dual-Port Block RAM Configurations
• Configurations available on each port

Configuration   Depth   Data Bits   Parity Bits
16k x 1         16k     1           0
8k x 2          8k      2           0
4k x 4          4k      4           0
2k x 9          2k      8           1
1k x 18         1k      16          2
512 x 36        512     32          4

• Independent configurations on ports A and B
– Supports data-width conversion, including parity bits
[Figure: Port A configured with an 8-bit input, Port B configured with a 32-bit output]
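
The table rows can be derived from the block size: 16k data bits are divided by the port's data width, and one parity bit accompanies each full byte. The sketch below reproduces the rows and illustrates the 8-bit to 32-bit width conversion. The byte ordering used in `read_word32` (lowest address in the least significant byte) is an assumption of this sketch and should be checked against the device user guide; both function names are hypothetical.

```python
from typing import Tuple

DATA_BITS_PER_BLOCK = 16 * 1024   # an 18-kb block holds 16k data bits plus 2k parity bits


def port_config(data_bits: int) -> Tuple[int, int, int]:
    """Return (depth, data bits, parity bits) for one port width."""
    parity_bits = data_bits // 8            # one parity bit per full byte
    depth = DATA_BITS_PER_BLOCK // data_bits
    return depth, data_bits, parity_bits


def read_word32(byte_memory: bytes, word_address: int) -> int:
    """Read a 32-bit word from contents that were written as bytes."""
    chunk = byte_memory[4 * word_address: 4 * word_address + 4]
    return int.from_bytes(chunk, "little")  # assumed byte order


configs = [port_config(width) for width in (1, 2, 4, 8, 16, 32)]
word = read_word32(bytes([0x11, 0x22, 0x33, 0x44]), 0)
```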
Clock Buffer Configurations
• Clock buffer (BUFG)
– Low-skew clock distribution
• Clock enable buffer (BUFGCE)
– Holds the clock output Low when Clock Enable (CE) is inactive
– CE can be active-High or active-Low
– Changes in CE are only recognized when the clock input is Low to avoid glitches and short clock pulses
[Figure: BUFG symbol (I, O) and BUFGCE symbol (I, CE, O)]
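
The rule that CE changes are recognized only while the clock input is Low can be modeled on sampled waveforms. This is a behavioral sketch assuming an active-High CE and a buffer that starts disabled; `bufgce` is a hypothetical function name.

```python
from typing import List, Sequence


def bufgce(clock: Sequence[int], ce: Sequence[int]) -> List[int]:
    """Gate a sampled clock with CE, updating the enable only while I is Low."""
    enabled = 0                  # assumption: the buffer starts disabled
    output = []
    for clk, enable in zip(clock, ce):
        if clk == 0:
            enabled = enable     # CE is recognized only while the clock is Low
        output.append(clk & enabled)
    return output


clock = [0, 1, 0, 1, 0, 1, 0, 1]
ce    = [1, 1, 1, 0, 0, 0, 1, 1]
gated = bufgce(clock, ce)
```

In this trace CE drops while the clock is High (fourth sample), so that pulse completes in full and the output is held Low only from the next Low phase onward.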

Clock Buffer Configurations
• Clock multiplexer (BUFGMUX)
– Switches from one clock to another, glitch-free
– After a change on S, the BUFGMUX waits for the currently selected clock input to go Low
– The output is held Low until the newly selected clock goes Low, then switches
[Figure: BUFGMUX symbol (I0, I1, S, O) and a timing diagram of S, I0, I1, and O marking the "wait for low" and "switch" points]
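
The switching sequence described above can be sketched as a three-state machine operating on sampled waveforms. This is an illustrative model only: the state names and the function name `bufgmux` are assumptions, the input sequences must be non-empty, and further changes on S during a switch are ignored.

```python
from typing import List, Sequence


def bufgmux(i0: Sequence[int], i1: Sequence[int], s: Sequence[int]) -> List[int]:
    """Glitch-free clock multiplexer model on sampled clock levels."""
    selected = s[0]              # clock currently driving the output
    target = selected            # clock requested by the last change on S
    state = "RUN"
    output = []
    for c0, c1, select in zip(i0, i1, s):
        clocks = (c0, c1)
        if state == "RUN" and select != selected:
            state, target = "WAIT_OLD_LOW", select
        if state == "WAIT_OLD_LOW" and clocks[selected] == 0:
            state = "WAIT_NEW_LOW"           # old clock finished its High phase
        if state == "WAIT_NEW_LOW" and clocks[target] == 0:
            selected, state = target, "RUN"  # safe to switch: new clock is Low
        output.append(0 if state == "WAIT_NEW_LOW" else clocks[selected])
    return output


i0 = [0, 1, 0, 1, 0, 1, 0, 1, 0, 1]
i1 = [1, 1, 0, 0, 1, 1, 0, 0, 1, 1]
s  = [0, 0, 0, 1, 1, 1, 1, 1, 1, 1]
o = bufgmux(i0, i1, s)
```

In this trace S changes while I0 is High, so the current pulse completes, the output stays Low while I1 is High, and O follows I1 only after I1 has gone Low.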