Fall 2013
Adapted from lecture notes by David Money Harris, CMOS VLSI Design
Memory: RAM, ROM, registers, FIFO, etc.
I/O
Control: finite state machine, PLA, random logic
Datapath
Interconnect: switches, arbiters, bus
Bit-Sliced Datapath
A datapath (or ALU) may consist of a number of arithmetic units that operate on data words of uniform width (e.g. 32 bits).
Arithmetic components often apply the same operation to each bit in the data word.
Bit-slicing is an efficient physical layout style in which an n-bit datapath is built by stacking n 1-bit datapaths.
Data buses run (mostly) in the horizontal direction.
Control runs (mostly) in the vertical direction.
[Figure: bit-sliced datapath. Labels: Bit 3, Registers, Adder, Shifter, Multiplexer, Multiplier, Logical, Data-in, Data-out, Control]
A  B  C  |  Cout  S
0  0  0  |  0     0
0  0  1  |  0     1
0  1  0  |  0     1
0  1  1  |  1     0
1  0  0  |  0     1
1  0  1  |  1     0
1  1  0  |  1     0
1  1  1  |  1     1
[Figure: full adder symbol (+) with outputs Cout and S]
36 transistors
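The truth table can be reproduced from the usual Boolean expressions for a full adder (sum as the parity of the inputs, carry as their majority). The sketch below is illustrative only; the function name `full_adder` is an assumption and does not come from the lecture notes.

```python
from itertools import product

def full_adder(a: int, b: int, c: int) -> tuple[int, int]:
    """Return (cout, s) for one-bit inputs a, b and carry-in c."""
    s = a ^ b ^ c                        # sum is the parity of the three inputs
    cout = (a & b) | (a & c) | (b & c)   # carry is the majority of the three inputs
    return cout, s

# Enumerate the eight input rows in the same order as the truth table.
rows = [(a, b, c, *full_adder(a, b, c)) for a, b, c in product((0, 1), repeat=3)]
for row in rows:
    print(*row)   # columns: A B C Cout S
```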
[Figure: four full adders in a chain, driven by Cin and producing sums S3, S2, S1, S0]
The critical path goes from Cin to Cout.
Worst-case delay is linear in the number of bits: $t_d(\text{N-bit adder}) = (N-1)\,t_{carry} + t_{sum}$
To minimize the delay, minimize $t_{carry}$, the delay from C to Cout in each full adder.
$t_{sum}$ (the delay from A, B, C to S) is negligible by comparison for large N.
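A minimal sketch of a carry chain of this kind and of the delay formula, assuming one full adder per bit with the carry passed from bit to bit. The names `ripple_add` and `ripple_delay` and the delay values in the last line are hypothetical, chosen only for illustration.

```python
def ripple_add(a: int, b: int, n: int, cin: int = 0) -> tuple[int, int]:
    """Add two n-bit words one bit at a time; return (sum, cout)."""
    carry, total = cin, 0
    for i in range(n):
        ai, bi = (a >> i) & 1, (b >> i) & 1
        total |= (ai ^ bi ^ carry) << i                   # sum bit i
        carry = (ai & bi) | (ai & carry) | (bi & carry)   # carry into bit i + 1
    return total, carry

def ripple_delay(n: int, t_carry: float, t_sum: float) -> float:
    """Worst-case delay t_d = (N - 1) * t_carry + t_sum."""
    return (n - 1) * t_carry + t_sum

# Hypothetical delays in arbitrary units: a 32-bit adder takes 31 carry delays plus one sum delay.
print(ripple_delay(32, t_carry=1.0, t_sum=2.0))
```

With a fixed per-bit carry delay, doubling N roughly doubles the worst-case delay, which is why the remaining designs concentrate on the carry path.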
28 transistors
[Figure: full adder cells (+) chained from Cin through carries C1, C2, C3 to Cout, producing sums S0 to S3]
24 transistors
Output inverters are removed.
The pMOS and nMOS networks are mirror images of each other rather than complementary; this simplifies layout and is enabled by the symmetry of the add operation.
Transistors now run vertically with horizontal poly.
Data travels from left to right; the carry propagates vertically from one bit to the next.
GPK Representation
Introduce new intermediate signals that describe full adder operation in terms of carry propagation
A  B  C  |  G  P  K  |  Cout  S
0  0  0  |  0  0  1  |  0     0
0  0  1  |  0  0  1  |  0     1
0  1  0  |  0  1  0  |  0     1
0  1  1  |  0  1  0  |  1     0
1  0  0  |  0  1  0  |  0     1
1  0  1  |  0  1  0  |  1     0
1  1  0  |  1  0  0  |  1     0
1  1  1  |  1  0  0  |  1     1

$G = A \cdot B$ (generate carry: Cout = 1 independent of C)
$P = A \oplus B$ (propagate carry: Cout = C)
$K = \overline{A} \cdot \overline{B}$ (kill carry: Cout = 0 independent of C)
Note that G, P and K are functions of A and B only, so they can be computed without waiting for C.
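As a quick check of the GPK description, the sketch below computes G, P and K from A and B alone and confirms, for all eight input rows, that exactly one of the three is active and that Cout and S follow from them together with C. The helper name `gpk` is an assumption made for this example.

```python
from itertools import product

def gpk(a: int, b: int) -> tuple[int, int, int]:
    """Generate, propagate and kill signals; functions of A and B only."""
    g = a & b               # generate: both inputs are 1
    p = a ^ b               # propagate: exactly one input is 1
    k = (1 - a) & (1 - b)   # kill: both inputs are 0
    return g, p, k

for a, b, c in product((0, 1), repeat=3):
    g, p, k = gpk(a, b)
    assert g + p + k == 1             # exactly one of G, P, K is active
    cout = g | (p & c)                # carry out: generated, or propagated from C
    s = p ^ c                         # sum bit
    assert 2 * cout + s == a + b + c  # agrees with ordinary addition
```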
GPK Representation
Can see the action of generate, propagate and kill operators in mirror adder:
[Figure: mirror adder schematic with transistor groups labeled Generate, Kill, "0"-Propagate and "1"-Propagate; inputs A, B, Ci; outputs Co, S]
[Figure: 4-bit adder with carry input Cin and carry output Cout]
1. Compute the bitwise generate, propagate (and kill) signals: $G_i = A_i \cdot B_i$, $P_i = A_i \oplus B_i$, $K_i = \overline{A_i} \cdot \overline{B_i}$
2. Use the PG(K) signals and Cin to determine the carry $C_i$ for each bit (and Cout)
3. Calculate the sums using $S_i = P_i \oplus C_i$
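The three steps can be sketched for whole n-bit words as follows. This is an illustration rather than the circuit itself: the name `pg_add` is an assumption, and the carries are computed with a simple loop, which corresponds to rippling the carry through the PG logic.

```python
def pg_add(a: int, b: int, n: int, cin: int = 0) -> tuple[int, int]:
    """Add two n-bit words using generate/propagate signals; return (sum, cout)."""
    # Step 1: bitwise generate and propagate signals.
    g = [(a >> i) & (b >> i) & 1 for i in range(n)]
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(n)]
    # Step 2: c[i] is the carry into bit i; c[n] is Cout.
    c = [cin]
    for i in range(n):
        c.append(g[i] | (p[i] & c[i]))
    # Step 3: sum bits S_i = P_i xor C_i.
    s = sum((p[i] ^ c[i]) << i for i in range(n))
    return s, c[n]

print(pg_add(0b1011, 0b0110, 4))   # 11 + 6 = 17, i.e. sum 0b0001 with Cout = 1
```

Only step 2 depends on Cin, so steps 1 and 3 add a fixed delay and the faster adders differ mainly in how step 2 is organised.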
Manchester Carry
Use transmission gates to provide carry propagation
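[Figure: dynamic and static versions of the Manchester carry circuit]
[Figure: carry chain modeled as an RC ladder; each stage contributes resistance R/2 and capacitance 9C]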
Using the Elmore delay model, the delay after n stages is $(9/4)\,n(n+1)\,RC$.
Delay increases quadratically with n.

n   total delay   delay of extra stage
1   4.5 RC        -
2   13.5 RC       9 RC
3   27 RC         13.5 RC
4   45 RC         18 RC
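The table values can be reproduced by summing, for each capacitor in the chain, the resistance between it and the driving end (the Elmore delay of an RC ladder). The sketch assumes every stage contributes R/2 and 9C, as in the R/2 and 9C stage labels; the function name `elmore_delay` is hypothetical.

```python
def elmore_delay(n: int, r: float = 0.5, c: float = 9.0) -> float:
    """Elmore delay of an n-stage RC ladder, in units of RC.

    r and c are the per-stage resistance and capacitance in units of R and C.
    """
    # Capacitor i (counting from 1) charges through the i resistors before it.
    return sum(i * r * c for i in range(1, n + 1))

totals = [elmore_delay(n) for n in range(1, 5)]        # total delay for n = 1..4
extras = [b - a for a, b in zip(totals, totals[1:])]   # cost of each added stage
print(totals)   # [4.5, 13.5, 27.0, 45.0]
print(extras)   # [9.0, 13.5, 18.0]
```

The sum is (9/2) * n(n+1)/2 = (9/4) * n(n+1), which matches the closed form, and each added stage costs more than the previous one.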
Carry-Bypass Adder
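[Figure: carry-bypass block with carry input Cin and carry output Cout]
If (P0 and P1 and P2 and P3) then Cout = Cin; otherwise use PG within the block.
In a large adder with many blocks, the bypass signal BP (taken here to be P0 and P1 and P2 and P3) is set up well before Cin arrives.
Also known as a carry-skip adder.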
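A behavioural sketch of the bypass rule, assuming the bypass signal is the AND of all propagate bits in a block and that a multiplexer picks between Cin and the rippled block carry. The names `bypass_block` and `bypass_add` and the default widths are assumptions. The model shows the logic only; the speed benefit comes from timing, which it does not capture.

```python
def bypass_block(a: int, b: int, cin: int, m: int = 4) -> tuple[int, int]:
    """One m-bit carry-bypass block; return (sum, cout)."""
    g = [(a >> i) & (b >> i) & 1 for i in range(m)]
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(m)]
    c = [cin]
    for i in range(m):                 # PG ripple inside the block
        c.append(g[i] | (p[i] & c[i]))
    bp = int(all(p))                   # bypass: every bit in the block propagates
    cout = cin if bp else c[m]         # multiplexer: skip the block or use its carry
    s = sum((p[i] ^ c[i]) << i for i in range(m))
    return s, cout

def bypass_add(a: int, b: int, n: int = 16, m: int = 4, cin: int = 0) -> tuple[int, int]:
    """Chain n/m bypass blocks; return (sum, cout)."""
    mask, total, carry = (1 << m) - 1, 0, cin
    for k in range(0, n, m):
        s, carry = bypass_block((a >> k) & mask, (b >> k) & mask, carry, m)
        total |= s << k
    return total, carry

print(bypass_add(0x0FFF, 0x0001))   # the carry ripples through block 0, skips block 1, and is absorbed in block 2
```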
[Figure: carry-bypass block showing internal carries C1, C2, C3]
[Figure: adder built from M-bit blocks, each with a carry propagation stage and a sum stage]
[Figure: delay td versus number of bits N for ripple and bypass adders]
Carry-Select Adder
For each M-bit block:
  Calculate the block carries for both Cin = 0 and Cin = 1.
  When Cin finally arrives, use a multiplexer to select the correct result.
[Figure: carry-select block. P, G setup feeds a "0" carry propagation chain and a "1" carry propagation chain; a multiplexer controlled by Co,k selects the carry vector and Co,k+M, followed by sum generation]
For the last block, the outputs of the "0" and "1" carry sections arrive well before the multiplexer select signal from the previous block.
[Figure: carry-select adder built from several blocks, each with Setup, "0" Carry, "1" Carry, Multiplexer and Sum Generation stages; carry input Ci,0, first sums S0-1; stage annotations (1), (3), (4)]
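The carry-select idea can be sketched behaviourally: each block computes both candidate results up front, and the incoming carry only selects between them. The function names, the block width and the use of a simple ripple inside each block are assumptions made for this example.

```python
def ripple(a: int, b: int, m: int, cin: int) -> tuple[int, int]:
    """m-bit ripple addition; return (sum, cout)."""
    carry, total = cin, 0
    for i in range(m):
        ai, bi = (a >> i) & 1, (b >> i) & 1
        total |= (ai ^ bi ^ carry) << i
        carry = (ai & bi) | (carry & (ai ^ bi))
    return total, carry

def carry_select_add(a: int, b: int, n: int = 16, m: int = 4, cin: int = 0) -> tuple[int, int]:
    """Carry-select addition over n/m blocks of m bits; return (sum, cout)."""
    mask, total, carry = (1 << m) - 1, 0, cin
    for k in range(0, n, m):
        ak, bk = (a >> k) & mask, (b >> k) & mask
        result0 = ripple(ak, bk, m, 0)   # candidate assuming block Cin = 0
        result1 = ripple(ak, bk, m, 1)   # candidate assuming block Cin = 1
        s, carry = result1 if carry else result0   # multiplexer on the real carry
        total |= s << k
    return total, carry

print(carry_select_add(0x00FF, 0x0001))   # 255 + 1 = 256, i.e. (256, 0)
```

In hardware the two candidates of every block are computed in parallel, so only the multiplexer delay accumulates from block to block; the sequential loop above does not model that timing.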
Tree Adders
For wide adders (N > 32 bits), the delay of carry-lookahead (bypass or select) adders is dominated by the delay of passing the carry through the lookahead stages (multiplexers). This delay can be reduced by recursively looking ahead across lookahead blocks, e.g.
  look ahead across 2-bit blocks to generate Cin to 4-bit blocks
  look ahead across 4-bit blocks to generate Cin to 8-bit blocks, etc.
[Figure: 16-bit tree adder. PG generation for bits 15 to 0, followed by group PG terms:
  15:14, 13:12, 11:10, 9:8, 7:6, 5:4, 3:2, 1:0
  15:12, 11:8, 7:4, 3:0
  15:8, 7:0
  15:0, 14:0, 13:0, 12:0, 11:0, 10:0, 9:0, 8:0, 7:0, 6:0, 5:0, 4:0, 3:0, 2:0, 1:0, 0:0
followed by sum calculation]
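To illustrate recursive lookahead, the sketch below combines group (G, P) pairs over spans that double at every level, so all carries are available after about log2(N) levels. It uses a Kogge-Stone style arrangement, chosen for brevity; this is one possible tree topology and not necessarily the one drawn in the figure. The name `prefix_add` is hypothetical.

```python
def prefix_add(a: int, b: int, n: int = 16, cin: int = 0) -> tuple[int, int]:
    """Add two n-bit words with a parallel-prefix carry tree; return (sum, cout)."""
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(n)]
    g = [(a >> i) & (b >> i) & 1 for i in range(n)]
    # Index 0 stands for Cin (treated as a generate); index i + 1 stands for bit i.
    G, P = [cin] + g, [0] + p
    dist = 1
    while dist < len(G):
        new_g, new_p = G[:], P[:]
        for i in range(dist, len(G)):
            # Merge the group ending at i with the group ending at i - dist.
            new_g[i] = G[i] | (P[i] & G[i - dist])
            new_p[i] = P[i] & P[i - dist]
        G, P = new_g, new_p
        dist *= 2                      # spans double at every level
    # G[i] is now the carry into bit i, and G[n] is Cout.
    s = sum((p[i] ^ G[i]) << i for i in range(n))
    return s, G[n]

print(prefix_add(0xFFFF, 0x0001))   # (0, 1): the carry crosses all 16 bits
```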
Subtraction
$A - B = A + (-B)$
$-B = \mathrm{NOT}(B) + 1$ (where $-B$ is the two's complement of B)
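A small sketch of subtraction through an adder, assuming n-bit words with wraparound: B is inverted bitwise and the +1 is supplied as a carry-in of 1. The function name `subtract` and the default width of 8 bits are assumptions.

```python
def subtract(a: int, b: int, n: int = 8) -> int:
    """Compute (a - b) modulo 2**n as a + NOT(b) + 1."""
    mask = (1 << n) - 1
    not_b = ~b & mask               # bitwise inverse of b within n bits
    return (a + not_b + 1) & mask   # the +1 acts as the adder's carry-in

print(subtract(12, 5))   # 7
print(subtract(5, 12))   # 249, the 8-bit two's complement encoding of -7
```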
Unsigned Multiplication
Example:
      1100   : 12 (base 10)   multiplicand
    x 0101   :  5 (base 10)   multiplier
      ----
      1100
     0000
    1100         partial products
   0000
  --------
  00111100   : 60 (base 10)   product

M x N-bit multiplication:
Produce N M-bit partial products.
Sum these to produce an (M+N)-bit product.
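The partial-product view can be sketched as a shift-and-add loop: each multiplier bit selects either the multiplicand or zero, shifted to that bit's weight. The function names are assumptions made for this example.

```python
def partial_products(multiplicand: int, multiplier: int, n: int) -> list[int]:
    """One shifted partial product per multiplier bit (n bits in the multiplier)."""
    return [(multiplicand if (multiplier >> j) & 1 else 0) << j for j in range(n)]

def multiply(multiplicand: int, multiplier: int, n: int) -> int:
    """Unsigned product as the sum of the partial products."""
    return sum(partial_products(multiplicand, multiplier, n))

pps = partial_products(0b1100, 0b0101, 4)
print([bin(pp) for pp in pps])                # ['0b1100', '0b0', '0b110000', '0b0']
print(bin(multiply(0b1100, 0b0101, 4)))       # 0b111100, i.e. 60
```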
Array Multiplier
[Figure: array multiplier; one block is labeled Fast Adder]
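For two's complement operands, writing the multiplier as x (N bits) and the multiplicand as y (M bits), the product expands as:

$$P = \left(-y_{M-1}\,2^{M-1} + \sum_{j=0}^{M-2} y_j\,2^{j}\right)\left(-x_{N-1}\,2^{N-1} + \sum_{i=0}^{N-2} x_i\,2^{i}\right)$$

$$P = \sum_{i=0}^{N-2}\sum_{j=0}^{M-2} x_i\,y_j\,2^{i+j} + x_{N-1}\,y_{M-1}\,2^{M+N-2} - \left(\sum_{i=0}^{N-2} x_i\,y_{M-1}\,2^{i+M-1} + \sum_{j=0}^{M-2} x_{N-1}\,y_j\,2^{j+N-1}\right)$$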
[Figure legend: multiplier cell with AND gate; multiplier cell with NAND gate; full adder]
Faster Multipliers
Multiplication is a key element in many DSP applications:
  Digital filters
  Transforms
  Modulation and correlation
Each faster-multiplier technique starts by examining the critical path and looking for ways to short-circuit the computation.
Each provides improved speed at the cost of area and power.