• Course Overview
• Module 1: Introduction to Reconfigurable Computing
– General Purpose Computing [T1. Sec 1]
– Domain and Application specific processors [T1. Sec 2 & 3]
BITS Pilani, Pilani Campus
Today’s Lecture
When to use RC?
replacing/accelerating microprocessors
But, when should RC be used instead of alternative technologies?
Implementation Possibilities
Performance
[Figure: two speedup plots (y-axis: Speedup, 0 to 15); x-axis labels include Processor and FPGA]
• Hardware emulation
• Hardware testing under real
operation conditions
• Fast
• Accurate
• Allow several iterations
ITALTEL
FLEXBENCH
• Games,
• Internet Navigation system,
• Emergency Diagnostics
• Different standard protocols
• Monitoring
• Entertainment
• FPGAs are sensitive to SEUs (single event upsets) and SETs (single event transients) because the configuration memory of the chip can be affected, resulting in a permanent error.
• Causes include electromagnetic noise and radiation. In space applications in particular, cosmic rays can hit silicon surfaces and create high-density electron-hole pairs, which may lead to transient errors.
• Mitigation requires duplication or triplication of resources for combinational logic and parity checks for on-chip caches.
• Triple Modular Redundancy (TMR) with a voter circuit is a common approach: three identical hardware modules perform their operations in parallel and a voter selects the majority output.
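The TMR scheme above can be sketched in a few lines of Python. This is only an illustrative model, not a hardware description: the replicated `module` (an 8-bit incrementer) and the `upset_mask` used to inject a fault are hypothetical names chosen for the example.

```python
def majority_vote(a: int, b: int, c: int) -> int:
    """Bitwise 2-of-3 majority: each output bit follows at least two inputs."""
    return (a & b) | (a & c) | (b & c)


def module(x: int) -> int:
    """Hypothetical replicated module (assumption): an 8-bit incrementer."""
    return (x + 1) & 0xFF


def tmr(x: int, upset_mask: int = 0) -> int:
    """Run three copies of `module` and vote on their outputs.

    `upset_mask` flips bits in the first copy's result to model an upset
    in one of the three replicas.
    """
    r0 = module(x) ^ upset_mask  # possibly faulty copy
    r1 = module(x)
    r2 = module(x)
    return majority_vote(r0, r1, r2)


if __name__ == "__main__":
    print(tmr(41))                     # fault-free run
    print(tmr(41, upset_mask=0b1000))  # one replica corrupted, vote masks it
```

Because the voter takes a bitwise majority, a fault confined to one replica is masked; faults in two replicas at the same bit position are not.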
• Evolving paradigm
• RC requires more computing resources and has a larger area × power × time product than an ASIC
• But it offers faster execution times and a better power/performance ratio
• Fault tolerance
• Run time reconfiguration
• Adaptive
• Before PLD:
– Digital circuits available as SSI, MSI devices
– Logic determined at time of manufacture
– Cannot be changed later; fabricated in large volumes
– Shelves of documentation for all devices
– Did not meet the designer’s exact specifications
– Forced designers to use multiple devices to meet requirements
• After PLDs:
– Device supplied with no logic function programmed in device
– Quick design creation
– Allows designer to program PLD in whatever way the design requires
– Meets the designer’s exact specifications
– Multiple functions can be combined and programmed onto a single chip, so less board space is required
– In-system programmable: no need to remove the device from the board to change its program
– No worry about device obsolescence
– But requires specific tools and an understanding of the hardware architecture before programming
[Figure: 4 × 1 ROM with address lines A1 = X and A0 = Y and data output D0 = X xor Y]
A1 A0 (X Y)   D0
00            0
01            1
10            1
11            0
Address lines act as the inputs.
Minterms are formed with AND gates, and the appropriate minterms are then ORed to form the output.
The circuit requires four 2-input AND gates and one OR gate that can take up to four inputs.
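As a rough illustration of the minterm-then-OR structure, the sketch below models the XOR ROM in Python. The function name `xor_rom` and the list `contents` are assumptions made for the example; the stored bits follow the X xor Y truth table.

```python
def xor_rom(x: int, y: int) -> int:
    """Model a 4 x 1 ROM whose address lines are X (A1) and Y (A0)."""
    nx, ny = 1 - x, 1 - y
    # Four 2-input AND gates generate the minterms m0..m3.
    minterms = [nx & ny, nx & y, x & ny, x & y]
    # Stored bit for addresses 00, 01, 10, 11 (the XOR truth table).
    contents = [0, 1, 1, 0]
    # One OR gate combines the minterms whose stored bit is 1.
    d0 = 0
    for m, bit in zip(minterms, contents):
        d0 |= m & bit
    return d0


if __name__ == "__main__":
    for x in (0, 1):
        for y in (0, 1):
            print(x, y, xor_rom(x, y))
```

Exactly one minterm is active for any address, so the OR gate simply passes through the bit stored at that address.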
AND and OR Planes
An array of AND gates forms the AND-plane.
In the AND-plane, all eight minterms of the three inputs a, b, and c are generated.
The OR-plane uses only the minterms that are needed for the outputs of the circuit.
Not all generated minterms need to be used. For example, minterm 7 is generated in the AND-plane but not used in the OR-plane.
All NOR Implementation
• AND and OR gate implementations use more area and have longer delays than NAND and NOR implementations.
• Although NOR gates are used, the left plane is still called the AND-plane and the right plane is called the OR-plane.
• Hardware implementation with large fan-in and routing becomes difficult.
• Take, for example, a circuit with 16 inputs, which is common for combinational circuits. Such a circuit has 64K (2^16) minterms.
• In the AND-plane, wires from the circuit inputs must be routed to over 64,000 NOR gates.
• In the OR-plane, the NOR gates must be large enough for every minterm of the function (over 64,000 minterms) to reach their inputs.
• Such an implementation is very slow because of the long lines, and it takes too much space because it requires large gates.
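The following Python sketch illustrates why the planes keep their AND/OR names even when built from NOR gates. It is a model under stated assumptions: input `a` is taken as the most significant bit of the minterm index, and the final inverter on the output column is assumed, since the slides do not show it. The minterm set (2, 5, 6) is the one used for output y in this lecture.

```python
from itertools import product


def nor(*bits: int) -> int:
    """N-input NOR gate: 1 only when every input is 0."""
    return 0 if any(bits) else 1


def minterm(index: int, a: int, b: int, c: int) -> int:
    """AND-plane row built from one NOR gate.

    Feeding the complement of each variable that must be 1, and the true
    form of each variable that must be 0, makes the NOR behave like an AND.
    """
    literals = []
    for pos, value in enumerate((a, b, c)):
        want = (index >> (2 - pos)) & 1  # assumption: a is the MSB
        literals.append(1 - value if want else value)
    return nor(*literals)


def output_y(a: int, b: int, c: int, used=(2, 5, 6)) -> int:
    """OR-plane column: NOR of the used minterms, then an assumed inverter."""
    y_bar = nor(*(minterm(i, a, b, c) for i in used))
    return 1 - y_bar


if __name__ == "__main__":
    for a, b, c in product((0, 1), repeat=3):
        print(a, b, c, output_y(a, b, c))
    print("minterms for 16 inputs:", 2 ** 16)
```

The last line prints the minterm count for the 16-input case, which is the number of NOR gates the AND-plane would need.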
[Figures: Distributed NOR of the AND-plane; Distributed NOR Gate of Output y]
• Transistors for the implementation of
minterms in the AND-plane are fixed, but
in the OR-plane there are fusible
transistors on every output column for
every minterm of the AND-plane.
• For realization of a certain function on an
output of this array, transistors
corresponding to the used minterms are
kept, and the rest are blown to eliminate
contribution of the minterm to the output
function.
• For example, for output y, only transistors
on rows m2, m5, and m6 are connected
and the rest are fused off.
• The dots in the AND-plane indicate
permanent connections, and the crosses in
the OR-plane indicate programmable or
configurable connections
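The fuse programming described above can be modelled as a per-column list of kept or blown transistors. This is a minimal sketch: the names `program_column` and `read_output` are hypothetical, and input `a` is assumed to be the most significant bit of the minterm index.

```python
NUM_MINTERMS = 8  # three inputs a, b, c


def program_column(used_minterms):
    """Return fuse states for one output column.

    True means the transistor is kept, False means it has been blown.
    """
    return [i in used_minterms for i in range(NUM_MINTERMS)]


def read_output(fuses, a: int, b: int, c: int) -> int:
    """Evaluate one output column for inputs a, b, c."""
    # The fixed AND-plane asserts exactly one minterm row per input value.
    active = (a << 2) | (b << 1) | c  # assumption: a is the MSB
    # The row contributes to the output only if its transistor was kept.
    return int(fuses[active])


# Output y keeps the transistors on rows m2, m5, and m6.
y_fuses = program_column({2, 5, 6})

if __name__ == "__main__":
    for idx in range(NUM_MINTERMS):
        a, b, c = (idx >> 2) & 1, (idx >> 1) & 1, idx & 1
        print(f"m{idx}: y = {read_output(y_fuses, a, b, c)}")
```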
Memory View
PROM
• If we consider abc as the address inputs and wxyz as the data read from the address designated by abc, then the circuit can be regarded as a memory with an address space of 8 words and a data width of four bits.
• In this case, the fixed AND-plane becomes the memory
decoder, and the programmable OR-plane becomes the
memory array.
• Because this memory can only be read from and not easily
written into, it is referred to as Read Only Memory or ROM. The
basic ROM is a one-time programmable logic array.
• A programmable ROM (PROM) is a one-time programmable chip that, once programmed, cannot be erased or altered.
• In a PROM, all minterms in the AND-plane are generated, and
connections of all AND-plane outputs to OR-plane gate inputs
are in place.
• By applying a high voltage, transistors in the OR-plane that
correspond to the minterms that are not needed for a certain
output are burned out.
• A fresh PROM has all transistors in its OR-plane connected.
When programmed, some will be fused out permanently.
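A small Python model can make the memory view and the one-time nature of programming concrete. It is a sketch under assumptions: the class name `Prom` is hypothetical, a connected OR-plane transistor is modelled as a stored 1 (so a fresh device reads all ones), and `a` is taken as the most significant address bit.

```python
class Prom:
    """8-word by 4-bit PROM: fixed decoder plus a one-time programmable array."""

    def __init__(self, words: int = 8, width: int = 4):
        self.width = width
        # Fresh device: every OR-plane transistor is still connected.
        self.array = [(1 << width) - 1] * words

    def program(self, address: int, data: int) -> None:
        """Blow the fuses for every 0 bit in `data`.

        Blowing can only clear bits, so a bit that was already cleared
        stays cleared no matter what is written later.
        """
        self.array[address] &= data

    def read(self, a: int, b: int, c: int) -> int:
        """Decode abc (the fixed AND-plane) and return the stored word wxyz."""
        address = (a << 2) | (b << 1) | c  # assumption: a is the MSB
        return self.array[address]


if __name__ == "__main__":
    prom = Prom()
    print(f"{prom.read(1, 0, 1):04b}")  # fresh word
    prom.program(5, 0b0100)
    print(f"{prom.read(1, 0, 1):04b}")  # after programming
    prom.program(5, 0b1111)             # attempt to restore the blown bits
    print(f"{prom.read(1, 0, 1):04b}")  # unchanged
```

The `&=` in `program` is the design choice that captures "cannot be erased or altered": no write can turn a blown bit back into a connected one.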
Simple Programmable Logic Devices
[Figure: INPUT, programmable AND plane, programmable OR plane, OUTPUT]
X   F(X) = X^2   X (binary)   F(X) (binary)
0   0            000          000000
1   1            001          000001
2   4            010          000100
3   9            011          001001
4   16           100          010000
5   25           101          011001
6   36           110          100100
7   49           111          110001
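The table can be regenerated with a short Python sketch, which also checks two properties that follow directly from the stored words. The names `square_rom`, `f1_always_zero`, and `f0_equals_x0` are chosen for this example only.

```python
def square_rom():
    """Contents of an 8-word ROM storing F(X) = X^2 for a 3-bit X."""
    return [x * x for x in range(8)]


rom = square_rom()

# Output bit F1 (weight 2) is 0 in every word, so that column is never needed.
f1_always_zero = all(((word >> 1) & 1) == 0 for word in rom)

# Output bit F0 (weight 1) always equals input bit X0.
f0_equals_x0 = all((word & 1) == (x & 1) for x, word in enumerate(rom))

if __name__ == "__main__":
    for x, word in enumerate(rom):
        print(f"{x:03b} {word:06b}")
    print("F1 always 0:", f1_always_zero)
    print("F0 equals X0:", f0_equals_x0)
```

These two observations mean only four of the six output columns need to be stored in the array.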
[Figure: ROM implementation of F(X) = X^2. Inputs X2, X1, X0 drive a 3-to-8 decoder whose outputs 0 to 7 feed the array that produces F5 F4 F3 F2 F1 F0]
[Figure: the same ROM, annotated to show that F1 is always 0 (not used) and F0 = X0]
[Figure: PLA with inputs A, B, and C, a programmable AND plane, and a programmable OR plane]
• This is a 3 x 4 x 2 PLA (3 inputs, up to 4 product terms, and 2 outputs), ready to be programmed.
• The left part of the diagram replaces the decoder used in a ROM.
• Connections can be made in the “AND array” to produce four arbitrary products, instead of 8 minterms as with a ROM.
• Those products can then be summed together in the “OR array.”
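To make the two programmable planes concrete, here is a minimal Python model of a 3 x 4 x 2 PLA. The programmed functions (F1 = AB + A'C and F2 = AB + BC') are hypothetical examples, not taken from the slides, and the table names `AND_PLANE` and `OR_PLANE` are likewise assumptions.

```python
from itertools import product

# AND plane: one entry per product term, one slot per input (A, B, C).
# 1 = true literal connected, 0 = complemented literal connected,
# None = input not connected to this term.
AND_PLANE = [
    (1, 1, None),   # P0 = A B
    (0, None, 1),   # P1 = A' C
    (None, 1, 0),   # P2 = B C'
    None,           # P3 left unprogrammed in this example
]

# OR plane: for each output, the product terms that are summed.
OR_PLANE = [
    {0, 1},         # F1 = P0 + P1
    {0, 2},         # F2 = P0 + P2
]


def product_term(term, inputs) -> int:
    """Evaluate one AND-plane row; an unprogrammed row contributes 0 here."""
    if term is None:
        return 0
    return int(all(want is None or want == bit for want, bit in zip(term, inputs)))


def pla(a: int, b: int, c: int):
    """Return (F1, F2) for the programmed PLA."""
    products = [product_term(t, (a, b, c)) for t in AND_PLANE]
    return tuple(int(any(products[i] for i in rows)) for rows in OR_PLANE)


if __name__ == "__main__":
    for a, b, c in product((0, 1), repeat=3):
        print(a, b, c, pla(a, b, c))
```

Note that P0 is shared by both outputs, which is what distinguishes a PLA from a device whose OR connections are fixed.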
• The PAL is the opposite of the ROM, having a programmable set of ANDs
combined with fixed ORs.
• Disadvantage
– A ROM is guaranteed to implement any M functions of N inputs. A PAL may have too few inputs to its OR gates.
• Advantages
– For given internal complexity, a PAL can have larger N and M
– Some PALs have outputs that can be complemented, adding POS
functions
– No multilevel circuit implementations in ROM (without external
connections from output to input).
– PAL has outputs from OR terms as internal inputs to all AND
terms, making implementation of multi-level circuits easier.
F1 = A’B’ + C’
F2 = A’BC’ + AC + AB
F3 = AD + BD + F1 = AD + BD + A’B’ + C’
F4 = AB + CD + F1’ = AB + CD + (A’B’ + C’)’ = AB + CD + AC + BC
[Figure: PAL connection diagram for inputs A, B, C, D and outputs F1 to F4, with programmed connections marked X]
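The expansions of F3 and F4 can be checked exhaustively. The Python sketch below evaluates the equations once with F1 fed back as an internal input and once in fully expanded form; the function names `pal_outputs` and `expanded_outputs` are hypothetical.

```python
from itertools import product


def pal_outputs(a: int, b: int, c: int, d: int):
    """Evaluate F1..F4 with F1 fed back as an internal input to F3 and F4."""
    na, nb, nc = 1 - a, 1 - b, 1 - c
    f1 = (na & nb) | nc
    f2 = (na & b & nc) | (a & c) | (a & b)
    f3 = (a & d) | (b & d) | f1
    f4 = (a & b) | (c & d) | (1 - f1)
    return f1, f2, f3, f4


def expanded_outputs(a: int, b: int, c: int, d: int):
    """Evaluate F3 and F4 from their expanded sum-of-products forms."""
    na, nb, nc = 1 - a, 1 - b, 1 - c
    f3 = (a & d) | (b & d) | (na & nb) | nc
    f4 = (a & b) | (c & d) | (a & c) | (b & c)
    return f3, f4


if __name__ == "__main__":
    matches = all(
        pal_outputs(*bits)[2:] == expanded_outputs(*bits)
        for bits in product((0, 1), repeat=4)
    )
    print("expanded forms agree with feedback forms:", matches)
```

The F4 expansion relies on De Morgan's law: (A’B’ + C’)’ = (A + B)C = AC + BC.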