FIR filter using WT & CSAM

\[ y(n) = \sum_{i=0}^{M-1} c_i \, x(n-i) \]

[Figure: FIR filter block diagram with input X(n), coefficients C0, C1, ..., CM-2, CM-1, Z^-1 delay elements, and a chain of adders producing y(n).]
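The FIR sum y(n) = Σ c_i x(n−i) can be sketched in a few lines. The version below assumes the transposed form, in which every coefficient multiplies the same input sample x(n) and the products pass through an adder chain separated by Z^-1 registers; that structure is an assumption read from the block diagram, and the function names are illustrative only.

```python
def fir_direct(coeffs, samples):
    """Reference: y(n) = sum_i c_i * x(n - i), with x(n) = 0 for n < 0."""
    out = []
    for n in range(len(samples)):
        out.append(sum(c * samples[n - i]
                       for i, c in enumerate(coeffs) if n - i >= 0))
    return out


def fir_transposed(coeffs, samples):
    """Transposed form: all taps multiply the same x(n) (assumed structure)."""
    m = len(coeffs)
    regs = [0] * (m - 1)              # Z^-1 registers between the adders
    out = []
    for x in samples:
        products = [c * x for c in coeffs]   # one input, M products
        out.append(products[0] + (regs[0] if regs else 0))
        # Each register stores its tap product plus the next register's value.
        regs = [products[k + 1] + (regs[k + 1] if k + 1 < m - 1 else 0)
                for k in range(m - 1)]
    return out


print(fir_transposed([1, 2, 3], [1, 0, 0, 0]))  # impulse response: [1, 2, 3, 0]
```

Because all M products in one clock cycle share the operand x(n), this is the place where a multiplier that shares partial computations across coefficients can save work.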

Filter         FIR filter using      FIR filter using      FIR filter using
               Shared Multiplier     Wallace Tree          Carry Save Array

Clock Cycle    13 ns                 18 ns                 25 ns
Power          398.4 mW              412.2 mW              401.1 mW
Area           4.41 × 10^6 µm²       3.87 × 10^6 µm²       3.15 × 10^6 µm²

Source: Intel
• CMU library (0.35 µm technology)
• Power measured at a clock period of 25 ns

Prof. Kaushik Roy


@ Purdue Univ.
DCT: Shared Multiplier Application

DCT (Discrete Cosine Transform)

• The number of alphabets can be reduced by modifying the coefficients in the DCT matrix

• Only 1x & 3x are required for the Precomputer bank

• Performance and power improvement in the Precomputer bank and Select unit

• DCT with the modified coefficients generates acceptable image quality

Shared Multiplier Application

DCT (Discrete Cosine Transform)


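\[ X_{kl} = \frac{c(k)\,c(l)}{4} \sum_{i=0}^{7} \sum_{j=0}^{7} x_{ij} \cos\!\left(\frac{(2i+1)k\pi}{16}\right) \cos\!\left(\frac{(2j+1)l\pi}{16}\right), \quad k, l = 0, 1, \ldots, 7 \]

\[ c(k) = \begin{cases} \dfrac{1}{\sqrt{2}}, & k = 0 \\ 1, & \text{otherwise} \end{cases} \]

\[ T = \begin{bmatrix}
d & d & d & d & d & d & d & d \\
a & c & e & g & -g & -e & -c & -a \\
b & f & -f & -b & -b & -f & f & b \\
c & -g & -a & -e & e & a & g & -c \\
d & -d & -d & d & d & -d & -d & d \\
e & -a & g & c & -c & -g & a & -e \\
f & -b & b & -f & -f & b & -b & f \\
g & -e & c & -a & a & -c & e & -g
\end{bmatrix} \]

\[ Z = T x^t, \quad X = T Z^t \]

[Figure: Image data x_ij → Row DCT (Z = T x^t) → Transpose (Z → Z^t) → Column DCT (X = T Z^t) → DCT data X]

Note the symmetry of the DCT coefficient matrix.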
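As a numerical check, the sketch below compares the double-sum definition of the 8×8 DCT with the row-column form Z = T x^t, X = T Z^t. It assumes that the entries of T carry the scaling T[k][i] = (c(k)/2)·cos((2i+1)kπ/16), which is what makes the two passes reproduce the c(k)c(l)/4 factor; the helper names and the test block are illustrative.

```python
import math

N = 8


def c(k):
    """Normalization factor from the DCT definition."""
    return 1 / math.sqrt(2) if k == 0 else 1.0


# Assumed scaling: T[k][i] = (c(k) / 2) * cos((2i + 1) k pi / 16).
T = [[c(k) / 2 * math.cos((2 * i + 1) * k * math.pi / 16) for i in range(N)]
     for k in range(N)]


def transpose(m):
    return [list(col) for col in zip(*m)]


def matmul(p, q):
    qt = transpose(q)
    return [[sum(u * v for u, v in zip(row, col)) for col in qt] for row in p]


def dct2_direct(x):
    """Evaluate the double sum for X_kl term by term."""
    return [[c(k) * c(l) / 4 * sum(
                x[i][j]
                * math.cos((2 * i + 1) * k * math.pi / 16)
                * math.cos((2 * j + 1) * l * math.pi / 16)
                for i in range(N) for j in range(N))
             for l in range(N)] for k in range(N)]


def dct2_row_column(x):
    z = matmul(T, transpose(x))      # row DCT:    Z = T x^t
    return matmul(T, transpose(z))   # column DCT: X = T Z^t


# Arbitrary 8x8 test block (not real image data).
image = [[(7 * i + 3 * j) % 16 for j in range(N)] for i in range(N)]
direct = dct2_direct(image)
separable = dct2_row_column(image)
max_err = max(abs(direct[k][l] - separable[k][l])
              for k in range(N) for l in range(N))
print(max_err < 1e-9)
```

The symmetry of T shows up as T[k][7-i] = (-1)^k · T[k][i]: even rows are mirror-symmetric and odd rows are mirror-antisymmetric.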

DCT (Background)

Using the Symmetry of the DCT coefficient matrix, the matrix multiplication is simplified.

\[ Z = T x^t, \quad X = T Z^t \]

Even DCT

\[ \begin{bmatrix} z_0 \\ z_2 \\ z_4 \\ z_6 \end{bmatrix} =
\begin{bmatrix} d & d & d & d \\ b & f & -f & -b \\ d & -d & -d & d \\ f & -b & b & -f \end{bmatrix}
\begin{bmatrix} x_0 + x_7 \\ x_1 + x_6 \\ x_2 + x_5 \\ x_3 + x_4 \end{bmatrix} \]

Odd DCT

\[ \begin{bmatrix} z_1 \\ z_3 \\ z_5 \\ z_7 \end{bmatrix} =
\begin{bmatrix} a & c & e & g \\ c & -g & -a & -e \\ e & -a & g & c \\ g & -e & c & -a \end{bmatrix}
\begin{bmatrix} x_0 - x_7 \\ x_1 - x_6 \\ x_2 - x_5 \\ x_3 - x_4 \end{bmatrix} \]

[Figure: even/odd DCT data path. The input streams (…X10, X00) through (…X13, X03) enter Add units and the streams (…X14, X04) through (…X17, X07) enter Sub units; the results feed the Even DCT and Odd DCT coefficient rows, whose outputs go to the Transpose stage.]
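The even/odd split can be checked numerically. The sketch below forms the sums x_i + x_(7-i) and differences x_i - x_(7-i), applies two 4×4 matrices, and compares the result with the full 8-point transform. It assumes a..g = cos(nπ/16)/2 for n = 1..7 and the standard DCT sign pattern for both 4×4 matrices; the function names are illustrative.

```python
import math

# Assumed coefficient values: a..g = cos(n * pi / 16) / 2 for n = 1..7.
a, b, c, d, e, f, g = (0.5 * math.cos(n * math.pi / 16) for n in range(1, 8))

EVEN = [[d, d, d, d],
        [b, f, -f, -b],
        [d, -d, -d, d],
        [f, -b, b, -f]]

ODD = [[a, c, e, g],
       [c, -g, -a, -e],
       [e, -a, g, c],
       [g, -e, c, -a]]


def dct8_full(x):
    """8-point DCT straight from the cosine definition (64 multiplications)."""
    out = []
    for k in range(8):
        ck = 1 / math.sqrt(2) if k == 0 else 1.0
        out.append(ck / 2 * sum(x[i] * math.cos((2 * i + 1) * k * math.pi / 16)
                                for i in range(8)))
    return out


def dct8_even_odd(x):
    """Same transform via two 4x4 products (32 multiplications)."""
    sums = [x[i] + x[7 - i] for i in range(4)]
    diffs = [x[i] - x[7 - i] for i in range(4)]
    z = [0.0] * 8
    z[0::2] = [sum(m * v for m, v in zip(row, sums)) for row in EVEN]   # z0 z2 z4 z6
    z[1::2] = [sum(m * v for m, v in zip(row, diffs)) for row in ODD]   # z1 z3 z5 z7
    return z


x = [3, 1, 4, 1, 5, 9, 2, 6]   # arbitrary test vector
err = max(abs(p - q) for p, q in zip(dct8_full(x), dct8_even_odd(x)))
print(err < 1e-12)
```

The split halves the multiplications per 8-point transform from 64 to 32, at the cost of eight extra additions and subtractions on the inputs.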
DCT using Shared Multiplier

• The Shared Multiplier can be effectively used to implement matrix multiplication

Odd DCT

\[ \begin{bmatrix} z_1 \\ z_3 \\ z_5 \\ z_7 \end{bmatrix} =
\begin{bmatrix} a & c & e & g \\ c & -g & -a & -e \\ e & -a & g & c \\ g & -e & c & -a \end{bmatrix}
\begin{bmatrix} x_0 - x_7 \\ x_1 - x_6 \\ x_2 - x_5 \\ x_3 - x_4 \end{bmatrix} \]

[Figure: odd DCT using the Shared Multiplier. The input stream (…, X3-X4, X2-X5, X1-X6, X0-X7) enters the Precomputer, whose outputs feed four Select & Adders units, each followed by an Adder and a Z^-1 register. The coefficient streams into the four units are (…, g, e, c, a), (…, -e, -a, -g, c), (…, c, g, -a, e), and (…, -a, c, -e, g).]
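The following is a minimal sketch of how a precomputer plus select-and-add units could carry out a matrix-vector product such as the odd DCT. The slides do not spell out the internals, so several things here are assumptions: coefficients are treated as 8-bit magnitudes split into two 4-bit groups, the alphabets are the odd values 1 to 15, and the integer coefficients are placeholders rather than the values used in the slides.

```python
ALPHABETS = (1, 3, 5, 7, 9, 11, 13, 15)   # assumed: odd values covering any 4-bit group


def precompute(x, alphabets=ALPHABETS):
    """Precomputer bank: alphabet * x, computed once per input sample."""
    return {al: al * x for al in alphabets}


def split_group(group):
    """Write a nonzero 4-bit value as (odd alphabet, shift)."""
    shift = 0
    while group % 2 == 0:
        group >>= 1
        shift += 1
    return group, shift


def shared_multiply(coeff, bank, width=8):
    """Select & add: assemble coeff * x from shifted precomputed multiples."""
    total = 0
    mag = abs(coeff)
    for pos in range(0, width, 4):
        group = (mag >> pos) & 0xF
        if group:
            al, shift = split_group(group)
            total += bank[al] << (shift + pos)
    return -total if coeff < 0 else total


def matvec_shared(matrix, vec):
    """Each input sample is precomputed once and reused by every row."""
    acc = [0] * len(matrix)               # one accumulator (Adder + Z^-1) per row
    for i, x in enumerate(vec):
        bank = precompute(x)
        for r, row in enumerate(matrix):
            acc[r] += shared_multiply(row[i], bank)
    return acc


# Placeholder integer coefficients with the odd-DCT sign pattern (hypothetical values).
a, c, e, g = 100, 85, 57, 20
ODD = [[a, c, e, g], [c, -g, -a, -e], [e, -a, g, c], [g, -e, c, -a]]
diffs = [12, -7, 3, 40]                   # stands in for x0-x7, x1-x6, x2-x5, x3-x4
print(matvec_shared(ODD, diffs) ==
      [sum(m * v for m, v in zip(row, diffs)) for row in ODD])   # True
```

The point of the arrangement is that the multiples of each input are built once and shared by all four rows, so each row needs only selection, shifting, and addition.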

Approximation: Modification of DCT Coefficients

[Table: 8-bit DCT Coefficients]

• The number of alphabets can be reduced by modifying the coefficients in the DCT matrix.

• Only 1x & 3x are required for the Precomputer bank.

• Performance and power improvement in the Precomputer bank and Select unit.
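To make the alphabet reduction concrete, the sketch below lists which alphabets a set of coefficients would need from the Precomputer bank. It assumes 8-bit coefficient magnitudes split into two 4-bit groups, with each group reduced to an odd alphabet by removing trailing zero bits; both that scheme and the example bit patterns are illustrative assumptions, not the coefficient values from the slides.

```python
def required_alphabets(coeffs, width=8):
    """Return the set of odd multiples the precomputer bank must provide."""
    needed = set()
    for coeff in coeffs:
        mag = abs(coeff)
        for pos in range(0, width, 4):
            group = (mag >> pos) & 0xF
            while group and group % 2 == 0:   # a shift replaces each factor of 2
                group >>= 1
            if group:
                needed.add(group)
    return needed


# Hypothetical bit patterns chosen only to show the effect.
print(sorted(required_alphabets([0b01111011])))   # groups 7 and 11 -> [7, 11]
print(sorted(required_alphabets([0b01100011])))   # groups 6 (= 3 << 1) and 3 -> [3]
print(sorted(required_alphabets([0b00110001, 0b01100100])))   # -> [1, 3]
```

A coefficient set whose groups all reduce to 1 or 3 needs only the 1x and 3x multiples, which is the situation the modified coefficients aim for.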

DCT using Shared Multiplier
\[ X_{kl} = \frac{c(k)\,c(l)}{4} \sum_{i=0}^{7} \sum_{j=0}^{7} x_{ij} \cos\!\left(\frac{(2i+1)k\pi}{16}\right) \cos\!\left(\frac{(2j+1)l\pi}{16}\right) \]

\[ Z = T x^t, \quad X = T Z^t \]

Even DCT

\[ \begin{bmatrix} z_0 \\ z_2 \\ z_4 \\ z_6 \end{bmatrix} =
\begin{bmatrix} d & d & d & d \\ b & f & -f & -b \\ d & -d & -d & d \\ f & -b & b & -f \end{bmatrix}
\begin{bmatrix} x_0 + x_7 \\ x_1 + x_6 \\ x_2 + x_5 \\ x_3 + x_4 \end{bmatrix} \]

Odd DCT

\[ \begin{bmatrix} z_1 \\ z_3 \\ z_5 \\ z_7 \end{bmatrix} =
\begin{bmatrix} a & c & e & g \\ c & -g & -a & -e \\ e & -a & g & c \\ g & -e & c & -a \end{bmatrix}
\begin{bmatrix} x_0 - x_7 \\ x_1 - x_6 \\ x_2 - x_5 \\ x_3 - x_4 \end{bmatrix} \]

[Table: 8-bit DCT Coefficients]

• Only 1x & 3x are required in the modified 8-bit DCT coefficients

DCT using Shared Multiplier

[Figure: < DCT with original 8-bit coefficients > and < DCT with modified 8-bit coefficients >]

• DCT with the modified coefficients generates acceptable image quality

Shared Multiplier: Summary

• Reduces computational complexity

• Possible to trade off power/performance by judiciously selecting coefficients and alphabets
