Applications
Pramod Meher
Institute for Infocomm Research
Singapore
outline
trends in memory technology
memory-based computing: advantages and examples
DA-based computation for DSP applications
lookup-table design for constant multiplication
DA-based vs LUT-multiplier-based implementations
memory-based evaluation of nonlinear functions
conclusions
12/17/2010 Institute for Infocomm Research, Singapore
trends in memory technology
application-specific memories [1-4]
low-power memories for mobile devices and consumer products
high-speed memories for multimedia applications
wide-temperature memories for automotive
high-reliability memories for biomedical instruments
radiation-hardened memory for space applications
trends in memory technology
RAM-logic integration
several nonvolatile RAM types are emerging: ferroelectric RAM (FeRAM), magnetoresistive RAM (MRAM), and varieties of phase-change memory (PCM) [4-6]
the upcoming/new memories provide faster access and consume less power [4-6]
they can be embedded directly into the structure of microprocessors or integrated in the functional elements of dedicated processors [7]
trends in memory technology
memory placement [7-11]
the traditional concept of memory as a standalone subsystem is changing
it is embedded within the logic components
processor has been moved to memory, or memory has been moved to processor
the relocations result in higher bandwidth, lower power consumption and less access delay
memory-based computing?
a class of dedicated systems, where the computational functions are performed by lookup tables (LUTs) instead of actual calculations
close to human-like computing
simple to design, and more regular compared with the multiply-accumulate structures
have potential for high-throughput and reduced-latency implementation
involves less dynamic power consumption due to minimization of switching activities
memory-based computations: examples
inner-product computation using distributed arithmetic (DA) [12]
direct implementation of constant multiplications [13]
well suited for digital filtering and orthogonal transformations for digital signal processing
implementation of fixed and adaptive FIR filters and transforms
other applications: evaluation of trigonometric functions, sigmoid and other nonlinear functions
DA to calculate inner-product: example
$X = [X_0\ X_1\ X_2]^T$ and $A = [A_0,\ A_1,\ A_2]^T$: 3-point vectors. A is constant.
Let $X_0$, $X_1$ and $X_2$ be 4-bit integers:
$X_0 = [\,x_0(3)\ x_0(2)\ x_0(1)\ x_0(0)\,]$
$X_1 = [\,x_1(3)\ x_1(2)\ x_1(1)\ x_1(0)\,]$
$X_2 = [\,x_2(3)\ x_2(2)\ x_2(1)\ x_2(0)\,]$
inner-product of X and A: $A \cdot X = A_0 X_0 + A_1 X_1 + A_2 X_2$
with $P_i = A_0 x_0(i) + A_1 x_1(i) + A_2 x_2(i)$ denoting the partial sum over the ith bits of the inputs (i = 0, 1, 2, 3),
inner-product: $A \cdot X = P_0 + 2P_1 + 4P_2 + 8P_3$
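The bit-level regrouping behind DA can be checked numerically. The following is a minimal sketch, assuming unsigned 4-bit inputs; the function names and the sample values of A and X are hypothetical and chosen only for illustration.

```python
def bit_plane_sums(A, X, L):
    """Return [P_0, ..., P_{L-1}], where P_i = sum_k A[k] * (bit i of X[k])."""
    return [sum(a * ((x >> i) & 1) for a, x in zip(A, X)) for i in range(L)]


def da_inner_product(A, X, L):
    """Recombine the bit-plane sums with weights 2^i: P_0 + 2*P_1 + 4*P_2 + ..."""
    return sum(p << i for i, p in enumerate(bit_plane_sums(A, X, L)))


A = [3, 5, 7]    # hypothetical constant weights A_0, A_1, A_2
X = [9, 4, 13]   # hypothetical 4-bit unsigned inputs X_0, X_1, X_2

print(bit_plane_sums(A, X, 4))               # partial sums P_0 .. P_3
print(da_inner_product(A, X, 4))             # inner product via bit planes
print(sum(a * x for a, x in zip(A, X)))      # direct inner product, for comparison
```

Each partial sum depends only on one bit from every input, which is what allows it to be read from a table addressed by those bits.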
LUT for inner-product using DA [12]
x_2(i) x_1(i) x_0(i)   partial sum
0 0 0                  0
0 0 1                  A_0
0 1 0                  A_1
0 1 1                  A_1 + A_0
1 0 0                  A_2
1 0 1                  A_2 + A_0
1 1 0                  A_2 + A_1
1 1 1                  A_2 + A_1 + A_0
[Figure: bit-serial DA unit. The bit-vectors (x_2(i), x_1(i), x_0(i)), i = 0, ..., 3, drive a 3-to-8 line decoder that addresses the LUT; the LUT output is accumulated by an adder with shift-right feedback to produce the inner-product A.X]
2^N LUT words are required for an N-point inner-product. For N = 32, this exceeds 10^9 words!
For L-bit inputs, computation time = L cycles, with cycle time $T = T_{MEM} + T_{ADD} + T_{FF}$
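The table-driven computation can be sketched in software as one LUT read and one shift-add per cycle. This is an illustrative sketch only: it assumes unsigned inputs and processes the most significant bit first with a left shift, which is arithmetically equivalent to the shift-right accumulation in the figure but keeps the arithmetic in integers. The function names are hypothetical.

```python
def build_da_lut(A):
    """LUT of 2^N words: the word at address b is the sum of A[k] over set bits k of b."""
    N = len(A)
    return [sum(A[k] for k in range(N) if (addr >> k) & 1) for addr in range(1 << N)]


def da_bit_serial(lut, X, L):
    """Inner product of L-bit unsigned inputs X with the weights baked into lut."""
    acc = 0
    for i in reversed(range(L)):                      # one cycle per input bit
        # Address bit k is bit i of X[k], as in the (x_2(i), x_1(i), x_0(i)) column.
        addr = sum(((x >> i) & 1) << k for k, x in enumerate(X))
        acc = (acc << 1) + lut[addr]                  # shift-accumulate
    return acc


lut = build_da_lut([3, 5, 7])        # hypothetical weights A_0, A_1, A_2
print(lut)                           # 8 words for N = 3
print(da_bit_serial(lut, [9, 4, 13], 4))
print(1 << 32)                       # LUT words needed for N = 32
```

The last line illustrates why a single LUT does not scale: the word count doubles with every additional input point.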
LUT compaction for DA [12]
x_2(i) x_1(i) x_0(i)   conventional LUT content   OBC LUT content
0 0 0                  0                          -(A_2 + A_1 + A_0)
0 0 1                  A_0                        -(A_2 + A_1 - A_0)
0 1 0                  A_1                        -(A_2 - A_1 + A_0)
0 1 1                  A_1 + A_0                  -(A_2 - A_1 - A_0)
1 0 0                  A_2                        (A_2 - A_1 - A_0)
1 0 1                  A_2 + A_0                  (A_2 - A_1 + A_0)
1 1 0                  A_2 + A_1                  (A_2 + A_1 - A_0)
1 1 1                  A_2 + A_1 + A_0            (A_2 + A_1 + A_0)
Desired partial sum of product = [OBC value + (A_2 + A_1 + A_0)] / 2
Half of the LUT words are saved if OBC is used, because the OBC word at any address is the negative of the word at the bit-complemented address.
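The OBC relation can be verified exhaustively for a small case. The sketch below assumes integer weights; the function names and the sample weights are hypothetical.

```python
from itertools import product


def obc_word(A, bits):
    """OBC content: A[k] enters with sign +1 if its address bit is 1, else -1."""
    return sum(a if b else -a for a, b in zip(A, bits))


def partial_sum_from_obc(A, bits):
    """Recover the conventional partial sum as (OBC value + sum(A)) / 2."""
    return (obc_word(A, bits) + sum(A)) // 2   # always an exact division


A = [3, 5, 7]   # hypothetical weights A_0, A_1, A_2
for bits in product((0, 1), repeat=len(A)):
    conventional = sum(a * b for a, b in zip(A, bits))
    complement = tuple(1 - b for b in bits)
    print(bits, conventional, obc_word(A, bits), obc_word(A, complement))
```

Since the word for a complemented address is just the negated word, only one half of the table needs to be stored; the other half follows by a sign change.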
linear convolution / FIR filtering [13]
N-tap FIR filter equation:
$y[n] = h[0]\,x[n] + h[1]\,x[n-1] + \cdots + h[N-1]\,x[n-N+1]$
[Figure: direct-form FIR filter for N = 4: a delay line (D) holding x[n], x[n-1], x[n-2], x[n-3], multipliers by h[0], h[1], h[2], h[3], and an adder producing y[n]]
4-point inner-product. Weights are constant.
LUT for N = 4:
address   LUT content
0000      0
0001      h[0]
0010      h[1]
0011      h[1]+h[0]
0100      h[2]
0101      h[2]+h[0]
0110      h[2]+h[1]
0111      h[2]+h[1]+h[0]
1000      h[3]
1001      h[3]+h[0]
1010      h[3]+h[1]
1011      h[3]+h[1]+h[0]
1100      h[3]+h[2]
1101      h[3]+h[2]+h[0]
1110      h[3]+h[2]+h[1]
1111      h[3]+h[2]+h[1]+h[0]
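A DA-based FIR filter reuses the same LUT for every output sample, with the delay-line contents supplying the address bits. The sketch below is one possible software model, assuming unsigned L-bit samples and MSB-first accumulation; `fir_da` and the sample coefficients are hypothetical names and values.

```python
def fir_da(h, x, L):
    """Filter the unsigned L-bit samples x with constant taps h using a DA LUT."""
    N = len(h)
    # Address bit k selects h[k], matching the 16-word table for N = 4.
    lut = [sum(h[k] for k in range(N) if (addr >> k) & 1) for addr in range(1 << N)]
    window = [0] * N                      # x[n], x[n-1], ..., x[n-N+1]
    y = []
    for sample in x:
        window = [sample] + window[:-1]   # shift the delay line
        acc = 0
        for i in reversed(range(L)):      # L LUT reads per output sample
            addr = sum(((w >> i) & 1) << k for k, w in enumerate(window))
            acc = (acc << 1) + lut[addr]
        y.append(acc)
    return y


print(fir_da([1, 2, 3, 4], [1, 2, 3, 4, 5], 4))
```

No multiplier appears anywhere in the loop: each output costs L table reads and L shift-adds.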
DA-based adaptive filtering [14]
example: 4-tap FIR adaptive filter
[Figure: direct-form 4-tap FIR filter with inputs x[n], x[n-1], x[n-2], x[n-3] and weights h[0], h[1], h[2], h[3] producing y[n]; y[n] is combined with d[n] at an adder to form e[n], which drives the weight-update block]
4-point inner-product. Weights are not constant.
LUT for adaptive filter: example [14]
[Figure: two tables, each listing addresses and LUT values]
bits of the same place values of the filter coefficients are used as addresses
DA-based inner-product of long vectors
for N = MP:
$A \cdot X = \sum_{n=0}^{N-1} A_n X_n = \sum_{n=0}^{M-1} A_n X_n + \sum_{n=M}^{2M-1} A_n X_n + \cdots + \sum_{n=(P-1)M}^{PM-1} A_n X_n$
[Figure: Inner-Product Unit-1, Inner-Product Unit-2, ..., Inner-Product Unit-P, whose outputs are added to give the inner-product A.X]
P LUTs of 2^M words and (P-1) adders are required for an N-point inner-product.
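The saving from splitting a long inner-product across P small LUTs can be sketched as follows. This is an illustration under stated assumptions: unsigned L-bit inputs, N = MP with P segments of M points each, and MSB-first accumulation; `segmented_da` and the sample vectors are hypothetical.

```python
def segmented_da(A, X, L, M):
    """Inner product of N = M*P points using P LUTs of 2^M words each.

    Returns (inner product, total number of LUT words).
    """
    assert len(A) == len(X) and len(A) % M == 0
    P = len(A) // M
    luts = []
    for p in range(P):
        seg = A[p * M:(p + 1) * M]
        luts.append([sum(seg[k] for k in range(M) if (addr >> k) & 1)
                     for addr in range(1 << M)])
    acc = 0
    for i in reversed(range(L)):
        plane = 0
        for p in range(P):
            seg_x = X[p * M:(p + 1) * M]
            addr = sum(((x >> i) & 1) << k for k, x in enumerate(seg_x))
            plane += luts[p][addr]        # (P-1) additions combine the P units
        acc = (acc << 1) + plane
    return acc, sum(len(t) for t in luts)


A = [1, 2, 3, 4, 5, 6, 7, 8]   # hypothetical weights, N = 8
X = [3, 1, 4, 1, 5, 9, 2, 6]   # hypothetical 4-bit inputs
print(segmented_da(A, X, 4, 4))    # P = 2 LUTs of 16 words
print(1 << len(A))                 # words needed by a single LUT
```

For this example the two segment LUTs hold 32 words in total, against 256 words for a single 8-input LUT.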
large order FIR filter using DA [15]
[Figure: 1-D systolic array. An input shift-register holding x[n], x[n-1], ..., x[n-N+1] feeds a bit-serial word-parallel converter, which supplies M-bit address segments (b_n)_{i,j} to a row of P processing elements (PEs) followed by an output cell]
function of each PE: Xout ← Xin + Read_ROM(Yin)
function of the output cell: after initializing S and Count to 0, it shift-accumulates Xin into S (with a factor of 2 per cycle) while Count runs from 0 to L-1; when Count = L, it delivers Xout ← S and resets S and Count to 0
(b_n)_{i,j}: (j+1)th segment of the bit-vector of the ith bits of the input, for i = 0, ..., L-1 and j = 0, ..., P-1
large order FIR filter: a 2-D design [15]
[Figure: 2-D systolic array. Input samples pass through a bit-parallel word-serial converter into L rows; each row has a serial-in parallel-out shift-register feeding M-bit addresses to P PEs, and shift-add cells (SA) combine the row results into the output]
function of each PE: Xout ← Xin + Read_ROM(Yin)
function of each SA: Yout is formed from Xin and Yin by a shift (factor of 2) and an addition
circular convolution using DA [16]
circular convolution of two N-point sequences {x(n)} and {h(n)} is: [equation]
circular convolution for N = 4: [equation]
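As a reference for the DA designs that follow, here is a small sketch of circular convolution. It assumes the standard definition y(n) = sum over k of h(k) x((n - k) mod N), which may differ in indexing from the notation used on the slide; the function name and sample sequences are hypothetical.

```python
def circular_convolution(x, h):
    """N-point circular convolution, assuming y(n) = sum_k h(k) * x((n - k) mod N)."""
    N = len(x)
    assert len(h) == N
    return [sum(h[k] * x[(n - k) % N] for k in range(N)) for n in range(N)]


x = [1, 2, 3, 4]   # hypothetical input sequence
h = [1, 2, 3, 4]   # hypothetical constant sequence
print(circular_convolution(x, h))
print(circular_convolution(x, [0, 1, 0, 0]))   # a one-sample circular delay
```

Every output is an N-point inner-product of the constant sequence with a rotated copy of the input, so each output can be produced by the same DA LUT with a rotated address pattern.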
cyclic convolution using DA: a 2-D design [16]
[Figure: 2-D systolic array. Input samples pass through a bit-parallel word-serial converter; the first, second, ..., Lth bit-streams of the input sequence {x(n)} each enter a circularly right-shift buffer that feeds M-bit addresses to a row of PEs, and shift-add cells (SA) combine the row results into the output samples]
computation of sinusoidal transforms [17-20]
N-point sinusoidal transforms like the DFT, DCT and DHT are given by [equation], where the transform kernel is defined as [equation]
computation of N-point sinusoidal transforms involves multiplication of an N x N kernel matrix with N-point input vectors
this involves N inner-products of the N-point input vector with the rows of the kernel matrix
the matrix-vector product requires N inner-product computation units by the DA approach
for prime values of N, the N x N kernel matrix is transformed to an (N-1)-point cyclic convolution.
multiplication using lookup-table
LUT to multiply a 4-bit word X with a constant A:
address word X   product word
0000             0
0001             A
0010             2A
0011             3A
0100             4A
0101             5A
0110             6A
0111             7A
1000             8A
1001             9A
1010             10A
1011             11A
1100             12A
1101             13A
1110             14A
1111             15A
[Figure: the L-bit word X addresses an LUT of 2^L words, which outputs AX]
multiplication of an L-bit number X with a constant A requires an LUT of 2^L words
multiplication time = memory latency
LUT size increases exponentially with input size.
optimization for constant multiplications
odd-multiple storage (OMS) scheme
antisymmetric product coding (APC) scheme
input coding (IC) scheme
combined techniques
odd-multiple storage scheme [21]
conventional LUT:
address word   product word
0000           0
0001           A
0010           2A
0011           3A
0100           4A
0101           5A
0110           6A
0111           7A
1000           8A
1001           9A
1010           10A
1011           11A
1100           12A
1101           13A
1110           14A
1111           15A
OMS LUT:
address word   product word
0001           A
0011           3A
0101           5A
0111           7A
1001           9A
1011           11A
1101           13A
1111           15A
Only odd multiples of the constant are to be stored in the LUT.
Even multiples could be derived from the stored words.
Only half the number of product words are to be saved.
odd-multiple storage scheme [21]
a memory unit of (2^L)/2 words of (W+L)-bit width is used to store the odd multiples of the constant A.
a barrel shifter for producing a maximum of (L-1) left shifts is used to derive all the even multiples of A.
the L-bit input word is mapped to the (L-1)-bit address of the LUT by an encoder.
the control bits for the barrel shifter are derived by a control circuit to perform the necessary shifts of the LUT output.
a RESET signal is generated by the same control circuit to reset the LUT output when X = 0.
if only the magnitude part could be used as address, the LUT size is reduced to half.
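The roles of the encoder, the barrel shifter and the RESET signal can be modeled in a few lines. This is a behavioral sketch only, not the circuit from [21]: the address mapping (odd - 1) / 2, the function names and the sample constant are assumptions made for illustration.

```python
def build_oms_lut(A, L):
    """Store only the odd multiples A, 3A, ..., (2^L - 1)A: (2^L)/2 words."""
    return [(2 * j + 1) * A for j in range(1 << (L - 1))]


def oms_multiply(lut, X):
    """Multiply X by the constant baked into lut using odd-multiple storage."""
    if X == 0:
        return 0                           # RESET case: LUT output forced to zero
    shift = (X & -X).bit_length() - 1      # trailing zeros of X (at most L-1)
    odd = X >> shift                       # odd part of X
    address = (odd - 1) >> 1               # (L-1)-bit address, assumed encoding
    return lut[address] << shift           # barrel shifter restores even multiples


A, L = 23, 4                               # hypothetical constant and input width
lut = build_oms_lut(A, L)
print(len(lut))                            # 8 stored words instead of 16
print([oms_multiply(lut, X) for X in range(1 << L)])
```

For example, X = 12 is read as the stored word 3A shifted left by two positions.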
antisymmetric product coding [22]
instead of 32 words we need only 17 words to be stored in the LUT.
useful for high-precision multiplication and inner-product computation.
high-precision LUT-multiplier [22]
When the width of the input multiplicand X is large, direct implementation of an LUT-multiplier involves a very large LUT.
But the input word X could be decomposed into a certain number of segments or sub-words, X = (X_1, X_2, ..., X_T), and fed to separate LUTs.
The partial products pertaining to the different sub-words could be read from the LUTs and shift-added to obtain the product values.
[Figure: generalized architecture for high-precision LUT-based multiplier for L = S(T-1) + S]
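The sub-word decomposition can be sketched as below. The sketch makes simplifying assumptions that are not from the slides: all sub-words have the same width S, X is unsigned, and a single table content is shared by every sub-word position. The function name and the constant A = 37 are hypothetical.

```python
def segmented_lut_multiply(A, X, L, S):
    """Multiply an L-bit X by the constant A using 2^S-word LUT reads per sub-word.

    Returns (product, LUT words per sub-word table).
    """
    assert L % S == 0                          # simplification: equal-width sub-words
    lut = [A * v for v in range(1 << S)]       # same content serves each sub-word
    product = 0
    for t in range(L // S):
        sub = (X >> (t * S)) & ((1 << S) - 1)  # t-th sub-word, least significant first
        product += lut[sub] << (t * S)         # shift-add the partial product
    return product, len(lut)


X = 0b1011010111000111                         # a 16-bit input split into four 4-bit words
print(segmented_lut_multiply(37, X, 16, 4))
print(1 << 16)                                 # words needed by a single direct LUT
```

Four tables of 16 words replace one table of 65536 words, at the cost of three shift-add operations.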
input coding scheme: example [23]
X = (1 0 1 1 0 1 0 1 1 1 0 0 0 1 1 1).
We can decompose it into four words as
X = (1 0 1 1) (0 1 0 1) (1 1 0 0) (0 1 1 1).
input coding scheme: basic concepts
input coding scheme: a case for L=5
combining input coding with OMS
combining input coding with OMS
multiplier for L = 5
combining input coding with OMS
DA-LUT vs LUT-multiplier-based designs
each output of an N-tap FIR filter involves the computation of one N-point inner-product
one sample could be processed by the DA approach in each cycle using L LUTs of 2^N words and (L-1) adders
the LUT-multiplier-based approach requires N LUTs of 2^L words each and (N-1) adders to have the same throughput
for N = L and for the same throughput implementation, both the approaches have similar performance
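The resource counts quoted above can be tabulated for a few filter sizes. The helper names below are hypothetical, and the counts cover only LUT words and adders as stated on the slide, not word widths or control logic.

```python
def da_cost(N, L):
    """(LUT words, adders) for one output per cycle with DA: L LUTs of 2^N words."""
    return L * (1 << N), L - 1


def lut_multiplier_cost(N, L):
    """(LUT words, adders) for the same throughput: N LUTs of 2^L words."""
    return N * (1 << L), N - 1


for N, L in [(8, 8), (16, 8), (4, 16)]:
    print(N, L, da_cost(N, L), lut_multiplier_cost(N, L))
```

The two counts coincide when N = L; otherwise the approach whose exponent is smaller needs far fewer LUT words.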
LUT-multiplier-based FIR filter [21]
segmented memory core for N multiplications using OMS and APC [FIR 2010
[Figure: latency chart of the DA-based and LUT-multiplier-based FIR filter]
15% less area than the DA-based design for the same throughput rate.
LUT design for nonlinear functions [24]
Example: sigmoid function
For a range Δx of values of x, one value of tanh(x) needs to be stored.
The range Δx = 2δ, where δ is the maximum permissible value of error.
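The rule Δx = 2δ works because the slope of tanh(x) never exceeds 1: storing the value at the midpoint of each interval keeps the error within δ. The following sketch assumes uniform intervals, midpoint storage, odd symmetry for negative x and saturation beyond the table; these choices and all names are illustrative assumptions, not the optimized table of [24].

```python
import math


def build_tanh_lut(delta, x_max):
    """One stored tanh value per interval of width 2*delta on [0, x_max)."""
    step = 2.0 * delta
    n = math.ceil(x_max / step)
    return [math.tanh((i + 0.5) * step) for i in range(n)], step


def tanh_lookup(lut, step, x):
    """Read tanh(x) from the table, using odd symmetry and saturation."""
    i = min(int(abs(x) / step), len(lut) - 1)
    return math.copysign(lut[i], x)


delta = 0.01                                   # hypothetical permissible error
lut, step = build_tanh_lut(delta, 4.0)
grid = [i * 0.001 for i in range(-6000, 6001)]
max_err = max(abs(tanh_lookup(lut, step, x) - math.tanh(x)) for x in grid)
print(len(lut), max_err)
```

Because tanh flattens quickly, far fewer words would suffice with non-uniform intervals; the uniform table is only the simplest case.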
LUT design for nonlinear functions
conclusions
memory technology is growing quite fast, and efficient memories for different applications are emerging over the years
memory elements can be embedded directly into the structure of the microprocessor or integrated in the functional elements of dedicated processors.
the memory-based approach could be used for computation-intensive, frequently used DSP tools.
the DA approach as well as LUT-based multiplication could be used for memory-based implementation of digital filters
conclusions
both the approaches could be used for the computation of discrete sinusoidal transforms by transforming the kernel matrix to cyclic convolution form.
the DA approach could be used for reduced-hardware realization
when hardware is not a major constraint, LUT-based multipliers could be used for a simple and straightforward implementation of FIR filters
a new approach to reduction of LUT size for multiplication has been proposed recently, where the memory size is reduced significantly
LUTs could be designed for efficient evaluation of nonlinear functions, like sinusoidal and hyperbolic functions, logarithms and multiple-precision arithmetic.
references
[1] K. Itoh, S. Kimura, and T. Sakata, "VLSI memory technology: Current status and future trends," in Proc. 25th European Solid-State Circuits Conference, Sept. 1999, pp. 3-10.
[2] B. Prince, "Trends in scaled and nanotechnology memories," in Proc. IEEE 2004 Conference on Custom Integrated Circuits, Nov. 2005.
[3] R. Barth, "ITRS commodity memory roadmap," in Proc. International Workshop on Memory Technology, Design and Testing, July 2003, pp. 61-63.
[4] Kinam Kim, "Memory Technologies for Mobile Era," in Proc. Asian Solid-State Circuits Conference, Nov. 2005, pp. 7-11.
[5] International Technology Roadmap for Semiconductors. [Online]. Available: http://public.itrs.net/
[6] S. Lai, "Non-volatile memory technologies: The quest for ever lower cost," in Proc. IEEE International Electron Devices Meeting, Dec. 2008, pp. 1-6.
[7] D. G. Elliott, M. Stumm, W. M. Snelgrove, C. Cojocaru, and R. Mckenzie, "Computational RAM: implementing processors in memory," IEEE Trans. Design & Test of Computers, vol. 16, no. 1, pp. 32-41, Jan-Mar 1999.
[8] M. Wang, K. Suzuki, A. Sakai, W. Dai, "Memory and logic integration for System-in-a-Package," Proc. 4th International Conference on ASIC, Oct. 2001, pp. 843-847.
[9] T. Furuyama, "Trends and challenges of large scale embedded memories," in Proc. IEEE 2004 Conference on Custom Integrated Circuits, Oct. 2004, pp. 449-456.
[10] C. Trigas, S. Doll, J. Kruecken, "MRAM and Microprocessor System-In-Package: Technology Stepping Stone to Advanced Embedded Devices," IEEE Custom Integrated Circuits Conf, 2004, pp. 71-79.
[11] US Patent 5790839, "System integration of DRAM macros and logic cores in a single chip architecture."
[12] S. A. White, "Applications of the distributed arithmetic to digital signal processing: A tutorial review," IEEE ASSP Magazine, vol. 6, no. 3, pp. 5-19, July 1989.
[13] H.-R. Lee, C.-W. Jen, and C.-M. Liu, "On the design automation of the memory-based VLSI architectures for FIR filters," IEEE Trans. Consumer Electronics, vol. 39, no. 3, pp. 619-629, Aug. 1993.
[14] D. J. Allred, H. Yoo, V. Krishnan, W. Huang, D. V. Anderson, "LMS Adaptive Filters Using Distributed Arithmetic for High Throughput," IEEE Trans. Circuits & Systems-I, vol. 52, no. 7, pp. 1327-1337, July 2005.
[15] P. K. Meher, S. Chandrasekaran, and A. Amira, "FPGA Realization of FIR Filters by Efficient and Flexible Systolization Using Distributed Arithmetic," IEEE Trans. Signal Processing, pp. 3009-3017, July 2008.
[16] P. K. Meher, "Hardware-Efficient Systolization of DA-based Calculation of Finite Digital Convolution," IEEE Trans. Circuits & Systems-II, pp. 707-711, Aug 2006.
[17] J.-I. Guo, C.-M. Liu, and C.-W. Jen, "The efficient memory-based VLSI array design for DFT and DCT," IEEE Trans. Circuits and Syst. II: Analog and Digital Signal Process., vol. 39, no. 10, pp. 723-733, Oct. 1992.
[18] H.-C. Chen, J.-I. Guo, T.-S. Chang, and C.-W. Jen, "A memory-efficient realization of cyclic convolution and its application to discrete cosine transform," IEEE Trans. Circuits Syst. for Video Technol., vol. 15, no. 3, pp. 445-453, Mar. 2005.
[19] D. F. Chiper, M. N. S. Swamy, M. O. Ahmad, and T. Stouraitis, "Systolic algorithms and a memory-based design approach for a unified architecture for the computation of DCT/DST/IDCT/IDST," IEEE Trans. Circuits Syst.-I: Regular Papers, vol. 52, no. 6, pp. 1125-1137, Jun. 2005.
[20] P. K. Meher, J. C. Patra, and M. N. S. Swamy, "High-throughput memory-based architecture for DHT using a new convolutional formulation," IEEE Trans. Circuits Syst. II: Express Briefs, vol. 54, no. 7, pp. 606-610, July 2007.
[21] P. K. Meher, "New Approach to Look-up-Table Design and Memory-Based Realization of FIR Digital Filter," IEEE Trans. on Circuits & Systems-I, pp. 592-603, March 2010.
[22] P. K. Meher, "LUT Optimization for Memory-Based Computation," IEEE Trans. on Circuits & Systems-II, pp. 285-289, April 2010.
[23] P. K. Meher, "Novel Input Coding Technique for High-Precision LUT-Based Multiplication for DSP Applications," The 18th IEEE/IFIP International Conference on VLSI and System-on-Chip (VLSI-SoC 2010), pp. 201-206, Madrid, Spain, September 2010.
[24] P. K. Meher, "An Optimized Lookup-Table for the Evaluation of Sigmoid Function for Artificial Neural Networks," The 18th IEEE/IFIP International Conference on VLSI and System-on-Chip (VLSI-SoC 2010), pp. 91-95, Madrid, Spain, September 2010.