
Digital Signal Processor evolution

over the last 30 years


François Charlot
IEEE Senior Member
April 2010, revised March 2013
Presentation outline
DSP algorithms until the 80s
  Filters, Fast Fourier Transform, speech analysis and synthesis, GSM channel equalization
The first decade of single-chip DSPs
Early 90s: emerging DSP markets and enablers
The great divide: pervasive DSPs, mobile DSPs, high-performance DSPs
Some more DSP algorithms
  CELP, MP3, JPEG, MPEG-2
Digital Signal Processors, 90s-00s
  Pervasive DSPs: hybrid DSP/MCUs
  Mobile DSPs: TI, competition
  High-performance DSPs
  Specialized DSPs
  Configurable DSPs
  ARM attempts at DSP
  AnySP: the best mobile DSP?
  FPGAs
Comparison of DSP implementations
Some failed attempts
DSP market
Conclusion
DSP algorithms until the 80s
Basic DSP algorithms
The z-notation is used to represent a sampled signal and the operations performed on it.
Early DSPs were specified to execute the following algorithms efficiently:
  Finite-impulse response (FIR) filters
  Infinite-impulse response (IIR) filters
  Convolution and correlation
  Fast Fourier Transform (FFT)
FIR and IIR filters
FIR filter:
  N coefficients (filter order).
  Stable, can be linear phase. The impulse response dies out at sample N.
IIR filter:
  Much better roll-off than an FIR filter for a given order.
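The filter equations on this slide were images and are not present in the text. As a minimal sketch, assuming the usual direct-form definitions (FIR: y[n] = sum of b[k]·x[n-k]; IIR: the same plus feedback terms on past outputs), the two filter types can be illustrated as follows. The function names and coefficient values are illustrative only.

```python
def fir_filter(b, x):
    """Direct-form FIR: y[n] = sum_k b[k] * x[n-k]."""
    y = []
    for n in range(len(x)):
        acc = 0.0
        for k, bk in enumerate(b):
            if n - k >= 0:
                acc += bk * x[n - k]      # one multiply-accumulate (MAC) per tap
        y.append(acc)
    return y


def iir_filter(b, a, x):
    """Direct-form I IIR: a[0]*y[n] = sum_k b[k]*x[n-k] - sum_{k>=1} a[k]*y[n-k]."""
    y = []
    for n in range(len(x)):
        acc = 0.0
        for k, bk in enumerate(b):
            if n - k >= 0:
                acc += bk * x[n - k]
        for k in range(1, len(a)):
            if n - k >= 0:
                acc -= a[k] * y[n - k]    # feedback on previous outputs
        y.append(acc / a[0])
    return y


impulse = [1.0] + [0.0] * 7
# 4-tap moving average: the impulse response is exactly 4 samples long.
fir_response = fir_filter([0.25, 0.25, 0.25, 0.25], impulse)
# One-pole low-pass y[n] = 0.5*x[n] + 0.5*y[n-1]: the response decays but never reaches zero.
iir_response = iir_filter([0.5], [1.0, -0.5], impulse)
```

The FIR response stops after its last tap, while the single feedback coefficient of the IIR filter produces an infinitely long, geometrically decaying response.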
Definition of convolution
Definition:
  Combines an input signal with the impulse response of the system.
Image source: Wikipedia article on Convolution
Examples of convolutions
Amplification and attenuation:
Echo:
Derivative:
Integral:
Source: The Scientist and Engineer's Guide to Digital Signal Processing, www.dspguide.com
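The impulse responses for these four examples were shown as figures. The sketch below assumes the simplest discrete kernels for each case (a scaled impulse, an impulse plus a delayed attenuated copy, a first difference, and a running sum); the specific kernel values are assumptions chosen for illustration.

```python
def convolve(x, h):
    """Full linear convolution: y[i + j] accumulates x[i] * h[j]."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y


x = [1.0, 2.0, 3.0, 4.0]

amplified = convolve(x, [2.0])                        # gain of 2
echoed = convolve(x, [1.0, 0.0, 0.0, 0.5])            # copy delayed by 3 samples at half amplitude
difference = convolve(x, [1.0, -1.0])                 # discrete derivative (first difference)
running_sum = convolve(x, [1.0] * len(x))[:len(x)]    # discrete integral (truncated to input length)
```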
Examples of convolutions: filters
Examples of impulse responses of low-pass filters:
  Simplest recursive filter
  Reduces noise
  Maintains edge sharpness
  Separates bands of frequencies
Source: The Scientist and Engineer's Guide to Digital Signal Processing, www.dspguide.com
Some applications of convolution
Radars: analyzing the measured impulse response.
Digital filter design.
Long-distance phone calls: echo suppression.
  Create an impulse response that counteracts that of the reverberation.
Audio: apply the impulse response of a real environment to an audio signal.
Correlation
Correlation is identical to convolution, except that no signal is reversed.
Application examples:
  Correlation between transmitted and received radar signals.
  Autocorrelation: white noise removal, pitch or tempo estimation...
Source: The Scientist and Engineer's Guide to Digital Signal Processing, www.dspguide.com
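As an illustration of the radar case, the sketch below correlates a received signal with the transmitted pulse to find the echo delay, and checks that correlation equals convolution with the time-reversed pulse. The pulse shape, delay and attenuation are made-up values.

```python
def cross_correlate(x, t):
    """c[lag] = sum_n x[n + lag] * t[n], for every lag where t fits inside x."""
    return [sum(x[lag + n] * t[n] for n in range(len(t)))
            for lag in range(len(x) - len(t) + 1)]


def convolve(x, h):
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y


pulse = [1.0, 3.0, -2.0]                                        # transmitted pulse (hypothetical)
received = [0.0] * 5 + [0.5 * p for p in pulse] + [0.0] * 4     # echo: 5 samples late, half amplitude

corr = cross_correlate(received, pulse)
delay = max(range(len(corr)), key=lambda k: corr[k])            # lag of the correlation peak

# Same result through convolution with the reversed pulse (keeping only the fully overlapping part).
via_convolution = convolve(received, pulse[::-1])[len(pulse) - 1:len(pulse) - 1 + len(corr)]
```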
Fast Fourier Transform (1/2)
The Discrete Fourier Transform (DFT) converts a series of time-domain values (e.g., a sampled signal) into the frequency domain.
The FFT is an O(N log N) algorithm vs. O(N²) for the DFT.
Invented by Cooley and Tukey in 1965.
Most often it is used to divide an N-point FFT into two N/2-point ones (radix-2 FFT).
The basic element is a 2-point FFT called a butterfly. From the inputs $f_j$ and $f_{N/2+j}$ it computes
$$f_j + f_{N/2+j} \qquad\text{and}\qquad W^k \,(f_j - f_{N/2+j})$$
The $W_N$ are called the twiddle factors.
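The butterfly above (sum on one output, twiddled difference on the other) is the decimation-in-frequency form. A minimal recursive sketch, assuming the convention W_N = exp(-2πi/N) and a power-of-two length, compared against a direct O(N²) DFT:

```python
import cmath


def fft_dif(f):
    """Radix-2 decimation-in-frequency FFT; len(f) must be a power of two."""
    n = len(f)
    if n == 1:
        return list(f)
    half = n // 2
    # Butterflies: f_j + f_{N/2+j} and W_N^j * (f_j - f_{N/2+j})
    sums = [f[j] + f[j + half] for j in range(half)]
    diffs = [(f[j] - f[j + half]) * cmath.exp(-2j * cmath.pi * j / n)
             for j in range(half)]
    out = [0j] * n
    out[0::2] = fft_dif(sums)     # even-numbered frequency bins
    out[1::2] = fft_dif(diffs)    # odd-numbered frequency bins
    return out


def dft(f):
    """Direct O(N^2) DFT used as a reference."""
    n = len(f)
    return [sum(f[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


samples = [1.0, 2.0, 3.0, 4.0, 0.0, -1.0, -2.0, -3.0]
fast = fft_dif(samples)
slow = dft(samples)
```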
Fast Fourier Transform (2/2)
Diagram of an 8-point radix-2 FFT:
Note the pattern of the input sample indices.
Higher radices (4 or 8) speed up the computation at the expense of larger code size.
  ~25% fewer multiplications.
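The index pattern in a radix-2 diagram is bit-reversed order, which is why DSPs offer bit-reverse addressing. A small sketch of the permutation for 8 points follows; whether it applies to the inputs or the outputs depends on the FFT variant, so this only shows the ordering itself.

```python
def bit_reverse(index, bits):
    """Reverse the lowest `bits` bits of `index`."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)
        index >>= 1
    return result


# 8-point FFT: 3 address bits
order = [bit_reverse(i, 3) for i in range(8)]
```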
Applications of FFT
Spectral analysis of signals
  Needs samples over a full period of the signal.
  Needs a window (e.g., Hann) for non-periodic signals so as to attenuate the spectral side lobes.
Frequency response of a system
Alternative to deconvolution
  Determine the input from the impulse response and the output.
Less computationally intensive than convolution
  time domain -> FFT -> frequency domain (multiplication) -> IFFT -> time domain
[Figure: rectangularly windowed sinusoid]
Image source: Wikipedia article on Window functions and www.dspguide.com
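To illustrate why a window is needed, the sketch below analyzes a sinusoid that does not complete a whole number of periods in the frame, with and without a Hann window. The frame length and tone frequency are arbitrary choices, and a direct DFT is used for brevity.

```python
import cmath
import math


def dft_magnitudes(x):
    """Magnitude of the first N/2 DFT bins (direct computation)."""
    n = len(x)
    return [abs(sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n)))
            for k in range(n // 2)]


N = 64
# 10.5 cycles per frame: the frame boundaries cut the sinusoid mid-period.
tone = [math.sin(2 * math.pi * 10.5 * t / N) for t in range(N)]
hann = [0.5 - 0.5 * math.cos(2 * math.pi * t / N) for t in range(N)]

rect_spectrum = dft_magnitudes(tone)                                  # implicit rectangular window
hann_spectrum = dft_magnitudes([s * w for s, w in zip(tone, hann)])   # Hann-windowed

peak_bin = max(range(len(rect_spectrum)), key=lambda k: rect_spectrum[k])
```

Far from the peak (for example around bin 25), the Hann-windowed spectrum is much lower than the rectangular one, at the cost of a wider main lobe.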
Speech analysis and synthesis
The human speech production system is modeled as a linear filter excited by an impulse train (voiced speech) or white noise (unvoiced speech).
becomes...
(source-filter model of speech production)
Source: Charles A. Bouman, Digital Signal Processing with Applications, Purdue course ECE 438, available at cnx.org
Linear Predictive Coding
Linear Predictive Coding has been used since the mid-70s.
Analyze the signal over a time window during which the model parameters may be deemed constant (~20-25 ms).
Transmit the voiced/unvoiced state, the frequency value, and all parameters of the filter.
Some bits are more important than others, hence the need for channel encoding/decoding.
Examples:
  LPC-10 in the TI Speak and Spell (1978)
    ~1000 bits/second; pipelined hardwired implementation; Gene Frantz was one of the designers...
  2400 bits per second FS-1015 US DoD/NATO standard (1984)
GSM voice and channel codec
The full-rate GSM codec uses RPE-LTP-LPC:
  LPC for short-term prediction (8-stage filter).
  Long-Term Prediction:
    The LPC signal is reconstructed and correlated with the original one with a lag of 40-120.
    The maximum correlation is kept, then the gain is processed so as to get the loudness information.
  The residual signal is down-sampled and encoded using APCM.
  First demo on TMS320C50 at ICASSP in May 1989.
Channel encoding:
  260 bits / 20 ms from the codec, classified by importance.
  Parity bits and/or duplication by convolutional encoding translate to 378 bits per frame.
  Decoding through the Viterbi algorithm.
Trellis diagram
Example trellis diagram:
Path with an example message:
Source: Chip Fleming, Tutorial on Convolutional Coding with Viterbi Decoding, http://home.netcom.com/~chip.f/Viterbi.html
GSM channel equalization
Some training bits are added to the GSM frames to allow learning of the channel response:
Diagram of the GSM channel equalizer:
The Viterbi algorithm
  Add-compare-select:
    Accumulate path metrics
    Decode most likely path
Source: Prof. Vladimir Stojanovic, 6.973 Communication System Design course lecture notes, http://ocw.mit.edu
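The add-compare-select step can be sketched with a small hard-decision Viterbi decoder. This example assumes a rate-1/2, constraint-length-3 convolutional code with generators 7 and 5 (octal), a common textbook choice; it is not the GSM code itself, and the function names are illustrative.

```python
G = (0b111, 0b101)   # generator polynomials (7, 5 octal), assumed for illustration


def parity(v):
    return bin(v).count("1") & 1


def encode(bits):
    state = 0                                  # the two previous input bits
    out = []
    for b in bits:
        reg = (b << 2) | state                 # newest bit in the most significant position
        out.append((parity(reg & G[0]), parity(reg & G[1])))
        state = reg >> 1
    return out


def viterbi_decode(symbols):
    inf = float("inf")
    metrics = [0, inf, inf, inf]               # the encoder starts in state 0
    paths = [[], [], [], []]
    for r in symbols:
        new_metrics = [inf] * 4
        new_paths = [None] * 4
        for state in range(4):
            if metrics[state] == inf:
                continue
            for b in (0, 1):
                reg = (b << 2) | state
                expected = (parity(reg & G[0]), parity(reg & G[1]))
                branch = (expected[0] != r[0]) + (expected[1] != r[1])   # Hamming distance
                nxt = reg >> 1
                candidate = metrics[state] + branch      # add
                if candidate < new_metrics[nxt]:         # compare
                    new_metrics[nxt] = candidate         # select: keep the survivor path
                    new_paths[nxt] = paths[state] + [b]
        metrics, paths = new_metrics, new_paths
    best = min(range(4), key=lambda s: metrics[s])       # most likely path
    return paths[best]


message = [1, 0, 1, 1, 0, 0, 1, 0, 0, 0]
coded = encode(message)
corrupted = list(coded)
corrupted[3] = (corrupted[3][0] ^ 1, corrupted[3][1])    # one channel bit error
decoded = viterbi_decode(corrupted)
```

A real implementation would store only traceback decisions rather than whole paths; keeping full paths here favors readability.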
Number representation
16-bit integer: -32768..32767
Q15 notation scales -1..+1 up to -32768..+32767
  Limits the re-scaling needed during computations, since multiplications stay within the -1..+1 interval.
  Q15 x Q15 -> Q30
  Additions may require extra guard bits.
  May also feature automatic saturation.
The alternative is floating-point representation
  e.g., ±0.mantissa · 2^exp
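A minimal sketch of Q15 arithmetic with saturation, using plain Python integers to stand in for 16-bit registers. The helper names are illustrative.

```python
Q15_MAX, Q15_MIN = 32767, -32768


def saturate(v):
    """Clamp to the 16-bit range instead of wrapping around."""
    return max(Q15_MIN, min(Q15_MAX, v))


def to_q15(x):
    """Convert a real value in [-1, +1) to Q15."""
    return saturate(int(round(x * 32768)))


def q15_mul(a, b):
    """Q15 x Q15 gives a Q30 product; shifting right by 15 returns to Q15."""
    return saturate((a * b) >> 15)


def q15_add(a, b):
    """Without guard bits the sum can overflow, so it is saturated."""
    return saturate(a + b)


half = to_q15(0.5)
quarter = q15_mul(half, half)                         # 0.5 * 0.5 = 0.25
overflow = q15_add(to_q15(0.75), to_q15(0.75))        # 1.5 does not fit: saturates
corner = q15_mul(Q15_MIN, Q15_MIN)                    # (-1) * (-1) = +1 does not fit either
```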
The first decade of single-chip DSPs
Early times of DSP'ing
Time frame | Approach | Primary application | Enabling technologies
Early 1970s | Discrete logic | Non-real-time processing, simulation | Bipolar SSI, MSI; FFT algorithm (1965)
Late 1970s | Building block | Military radars, digital comm. | Single-chip bipolar multiplier; flash A/D
Early 1980s | Single-chip DSP µP | Telecom, control | µP architectures; NMOS/CMOS
Late 1980s | Function/application-specific chips | Computers, communication | Vector processing, parallel processing
Early 1990s | Multiprocessing | Video/image processing | Advanced multiprocessing; VLIW, MIMD, etc.
Late 1990s | Single-chip multiprocessing | Wireless telephony, Internet related | Low-power single-chip DSP; multiprocessing
Source: Prof. Kurt Keutzer, Berkeley course CS-252, 2000
DSP context in the early 80s
Digital telephony standards
  1980: studies on end-to-end fully digital link
  1984: G.721 (32 kb/s ADPCM)
  1984: CCITT Red Book (ISDN standard)
Performance of general-purpose processors:
  Intel 8086 (1978): 70/118-cycle MPY, 5-10 MHz
  Motorola 68000 (1979): 70-cycle MPY, 4-12.5 MHz
  Intel 80286 (1982): 24-cycle MPY, 6-12.5 MHz
DSP requirements, 80s to early 90s
Requirement | Architectural feature
Multiply-accumulate | Hardwired MAC
Scaling (fixed-point DSPs) | Shifters
z^-1 (delay line) | Circular buffers
FFT | Add-sub-mpy, bit-reverse
Keep MAC fully busy | 2 data accesses per cycle, 0-overhead software loops
Handling of data frames, bitwise convolution | Fast logical operations
Viterbi support | Add-compare-select
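The link between the z^-1 delay line and circular buffers can be sketched as follows: instead of physically moving every stored sample at each step, only an index moves, with modulo arithmetic standing in for the hardware circular addressing. The class name and coefficients are illustrative.

```python
class CircularFir:
    """FIR filter whose delay line is a circular buffer (no data movement)."""

    def __init__(self, coeffs):
        self.coeffs = list(coeffs)
        self.delay = [0.0] * len(coeffs)
        self.head = 0                          # index of the newest sample

    def step(self, sample):
        n = len(self.delay)
        self.head = (self.head - 1) % n        # modulo addressing: overwrite the oldest sample
        self.delay[self.head] = sample
        acc = 0.0
        idx = self.head
        for c in self.coeffs:                  # one MAC per tap
            acc += c * self.delay[idx]
            idx = (idx + 1) % n                # walk from newest to oldest sample
        return acc


fir = CircularFir([1.0, 2.0, 3.0])
impulse_response = [fir.step(s) for s in [1.0, 0.0, 0.0, 0.0, 0.0]]

fir2 = CircularFir([1.0, 2.0, 3.0])
step_response = [fir2.step(1.0) for _ in range(4)]
```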
The very first devices
Intel 2920 (1978): AD/DA, shift-and-add (no MPY), 192-word program memory, no program flow control. 4.5 µm NMOS.
AMI S2811 (1979): 16-bit ALU, 12x12 MPY, not a standalone µP. 4.5 µm VMOS.
AT&T DSP1 (1979): true DSP but not for the merchant market. 4.5 µm NMOS.
NEC 7720 (1980): 32-bit ALU, 16x16 MPY, dual accumulator, 4 MHz instruction rate. 3 µm NMOS.
Dates are those of first prototypes.
Enter the TI TMS32010...
1976: TMS9900 16-bit µP, ahead of its time.
1978: TI management realizes it is not making the needed headway in the µP market. Looks for the next big thing.
1978: Harvey Cragon suggests a DSP device.
1979: Intel 2920 paper for ISSCC convinces TI management to go ahead. Signal Processing Computer architecture defined. Dr Surendar Magar (DSP PhD) hired from Plessey UK.
Sep. 1981: start of first samples fabrication.
Feb. 1982: Magar presents paper at ISSCC.
Source: A.W. Leigh, The TMS32010: The DSP chip that changed the destiny of a semiconductor giant.
Some TMS32010 facts and figures
3 (2.7) µm NMOS, 57,000 transistors, 43.8 mm².
5 MHz instruction rate, 2-cycle MAC.
144-word RAM, 1.5 Kw ROM.
  RAM size fits a 64-point FFT.
Initially no hardware multiplier: Magar put it in in 1979.
Initially internal ROM only, 24-pin package.
  Microprocessor mode added early 1981 (TMS7000 feedback).
Features added to make the chip self-emulating (extra stack level...).
Initially was to be named TMS000 (from the TMS000).
  32010 made up from 32 (bits), 01 (first device in series) and 0 (extra performance, like TMS9900/TMS99000).
Block diagram and die photograph
TMS320C1x, C25, C5x features
Feature | C1x (1982) | C25 (1986) | C5x (1989)
MAC | 2 cycles @ 5 MHz | 1 cycle @ 10 MHz | 1 cycle @ 20 MHz
ALU | No guard bits, 1 acc | No guard bits, 1 acc | No guard bits, 1 acc + buffer
z^-1 (delay line) | Physical data move | Physical data move | 2 circular pointers
FFT | No bit reverse | Bit reverse | Bit reverse
Bit manipulation | ALU/accumulator | ALU/accumulator | Parallel logic unit
Memory | 4 Kw program, 144 w data RAM | 3 x 64 Kw space, 544 w data RAM | 3 x 64 Kw space, 1 Kw DARAM, 9 Kw SARAM
Addressing | 1 data access/cycle, 2 pointers + ARP, immediate: load/mpy | 2 data accesses/cycle, 8 pointers + ARP, immediate: all instr. | (same)
Program control | Branch on not zero | Repeat single instr. | Repeat block (1 level), delayed instructions, XC
Power modes | None | External bus hold (stops CPU) | Idle (CPU stopped), Idle2 (CPU + periph)
Clock | Ext. clock / 4 | Ext. clock / 4 | PLL up to x9
Main competition in the 80s
Feature | ADI 2101 (1988) | AT&T DSP16A (1988) | Motorola 56001 (1987)
Clock | 10 MHz | 20/30 MHz | 10 MHz
Data path | 16/40 bits, 2 acc | 16/36 bits, 2 acc | 24/56 bits, 2 acc, add-sub-mpy (FFT)
Memory | 2 x 16 Kw space, 2 Kw P/D DARAM, 1 Kw D SARAM | 64 Kw P/D space or 2 Kw RAM | 3 x 64 Kw P/X/Y space, 2 x 256 w X/Y RAM
Addressing | Circular pointers, bitrev | 1 circular pointer, no bitrev | Modulo addressing, bitrev
Program control | Hardware loops (4 levels), I-cache, all opcodes conditional | Hardware loop (15 words), I-cache | Hardware loops (stack)
What became of them:
  ADI: still being produced and sold (25 MHz), although not at a competitive price. Last family member (ADSP-21990) introduced in 2002.
  AT&T, then Lucent, then Agere, then LSI Logic: DSP products discontinued.
  Motorola: Symphony line of 24-bit audio DSPs (56xxx). New products introduced until 2009. Announced in 2012 that there would be no new standalone DSPs.
Floating-point vs. fixed-point
Wider dynamic range, same precision throughout the range.
Balance of cost, performance (cycle time), power consumption, ease of programming and time-to-market.
Floating-point DSPs are better suited to:
  High-end audio
  Medical
  Radar
  Industrial control and robotics
Floating-point DSPs in the 80s
Hitachi HD61810 (1982): very first one, 16-bit numbers (m12e4).
NEC 77230 (1985): first practical one; 32-bit storage, 55-bit mpy, 250 ns cycle time.
TMS320C30 (1988), AT&T DSP32C (1988), Motorola 96002 (1989).
Some TMS320C30 features:
  60 ns cycle time, 16 MW address space, 8 data registers, regular instruction set, instruction cache, DMA controller, guard bits...
Ease of programming: 'C30 vs. 'C50
Early 90s: The great divide
Emerging DSP markets and enablers
Multimedia, image encoding
  MP3: 1993, MPEG-2: 1996, MPEG-4: 1998
Mobile communications, base stations
  First GSM pilot network in 1991
Digital control, enabled by lower costs
  1994: TMS320C2x, ADI 210x: $10 (1k units)
Increased transistor budget
  TMS32010 (1982): 57,000 transistors
  TMS320C50 (1989): 1M transistors
  Cray-1A CPU (1976-1979) was 2.5M
Lower power consumption
  The 'C50 had 1.8x lower power (typ. @ 5 V) than the TMS32010 while featuring a 4x instruction rate.
The great divide
Pervasive DSPs
  Entry-level, variety of on-chip peripherals, general-purpose capabilities eliminating the MCU, low cost...
Mobile DSPs
  Mobile communications standards setting stringent requirements (interrupt response time, bit-exactness...), power consumption (energy/function, not power per MHz), digital consumer applications.
High-performance DSPs
  Multiple on-chip execution units, multiprocessing, VLIW, superscalar, large on-chip caches...
Other major DSP trends
Efficient C programming, hybrid MCU/DSP capabilities.
Emergence of IP vendors
  ARM, ARC (now Synopsys), DSP Group (now CEVA), Tensilica...
  Core-based design: cDSP, C2xLP, Carmel...
Specialized hard-wired blocks and accelerators
  MPEG-2/4...
  Hard disk drive read channel
  ADSL, xDSL
  Channel equalization (CDMA...)
  etc.
Software "ecosystem", development tools
A look at software for TI DSPs
1987: "Digital Signal Processing Applications with the TMS320 Family" textbook
1990: TI C compiler and source debugger, DSP Starter Kit (DSP, EPROM, audio AD/DA)
1991: TI sponsors the first Educators' Conference for DSP educators and researchers
1993: TMS320 Software Cooperative
1995: On-Line DSP Lab, WWW DSP hotline
1997: TI acquires Spectron Microsystems (SPOX [stdio, DSP math pkg], BIOSuite) and GO DSP (Code Composer Studio)
1998: Real-Time Data Exchange (RTDX)
1999: eXpressDSP (CCS, DSP/BIOS, APIs)
MATLAB/Simulink, SPW, VisSim/Embedded Controls Developer
Some more DSP algorithms: CELP
Codebook Excited Linear Predictive coding
  Analysis-by-synthesis
  Codebook contains random white noise sequences
  Use of a perceptual filter so that coding error noise falls below the speech spectral envelope.
Image sources: Jerry D. Gibson, Speech coding methods, standards, and applications (2005), www.data-compression.com
CELP variants
Required to reduce computational complexity.
VSELP (Vector Sum):
  Code vectors are linear combinations of a few basis vectors.
  Americas D-AMPS IS-54 (1990, 4 MIPS), RealAudio (1995)...
ACELP (Algebraic):
  Large codebook containing very few pulses of ±1 amplitude.
  GSM EFR (ETSI, 1995, 4 MIPS), AMR (3GPP, 1998, 4 MIPS)...
  G.729 (1995, 2-20 MIPS): adaptive codebook to encode past residuals (step 1), then fixed codebook.
Image source: Wikipedia article on CELP
MPEG-1/2 audio layer 3 (MP3, 1993)
23 MIPS encode, 2 MIPS decode ('C55x, 128 kbps)
[Encoder block diagram labels:]
  32 bands
  576 points
  1024 points
  Compares energy / human hearing thresholds
  Outputs Signal-to-Mask Ratio (SMR) for each band
  Inner loop (bit rate): quantization + Huffman coding
  Outer loop (noise): adjustment of offending sub-bands
Figures from the SPIRIT DSP implementation, www.spiritdsp.com
Still-image compression
Transform: 8x8 DCT
  Spatial domain (YUV components) to frequency domain
  Energy concentrated in low frequencies
  'C55x, CIF*, 30 fps: ~40 MHz, or ~10 MHz with hardware accelerator (HWA)
Quantization
  Reduces the number of bits per coefficient.
  Removes perceptually less significant data.
Run-length coding
Huffman coding
Still-image encoding alone achieves ~10:1 compression.
Full tutorial available at http://www.bdti.com/articles
* CIF: Common Intermediate Format, 352 x 288 pixels
Motion estimation
Intra- and predictive-coded frames.
Image divided into 16x16-pixel macroblocks.
Process:
  Search the reference frame for a 16x16 region matching the macroblock.
  Encode the motion vector.
  Encode the difference between predicted and actual macroblocks.
Cannot perform an exhaustive search.
  The motion estimation strategy is a key differentiator.
Sum of absolute differences.
  2/3 of encoder MIPS are spent there.
Full tutorial available at http://www.bdti.com/articles
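The search step can be sketched with a sum-of-absolute-differences (SAD) cost and a small full search. For brevity this example uses a 4x4 block and a synthetic frame rather than 16x16 macroblocks; real encoders avoid the exhaustive search shown here, as noted above.

```python
def sad(frame, block, top, left):
    """Sum of absolute differences between `block` and the frame region at (top, left)."""
    return sum(abs(frame[top + r][left + c] - block[r][c])
               for r in range(len(block)) for c in range(len(block[0])))


def full_search(ref, block, top, left, search_range):
    """Try every displacement within +/- search_range; return (cost, dy, dx) of the best match."""
    rows, cols = len(block), len(block[0])
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            r, c = top + dy, left + dx
            if r < 0 or c < 0 or r + rows > len(ref) or c + cols > len(ref[0]):
                continue                      # candidate falls outside the reference frame
            cost = sad(ref, block, r, c)
            if best is None or cost < best[0]:
                best = (cost, dy, dx)
    return best


# Synthetic 12x12 reference frame in which every pixel value is unique.
ref = [[r * 16 + c for c in range(12)] for r in range(12)]
# The current block sits at (4, 4) but its content comes from (5, 6) in the reference.
block = [row[6:10] for row in ref[5:9]]
cost, dy, dx = full_search(ref, block, top=4, left=4, search_range=3)
```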
Image post-processing
Removing artifacts:
  De-blocking (see images)
  De-ringing: remove distortions near edges of image features.
Color-space conversion:
  YUV to RGB
  Very intensive since performed at pixel level.
  e.g., YUV to RGB takes ~36 MIPS on a 'C55x (CIF, 30 fps)
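The per-pixel cost comes from a few multiplies, shifts and clamps for every pixel. The sketch below assumes the full-range BT.601 (JFIF-style) conversion with 8-bit fixed-point coefficients; the slides do not say which variant was benchmarked, so the constants are an assumption.

```python
def clamp8(v):
    return max(0, min(255, v))


def yuv_to_rgb(y, u, v):
    """Full-range BT.601 YUV (YCbCr) to RGB using integer coefficients scaled by 256."""
    d, e = u - 128, v - 128
    r = y + ((359 * e) >> 8)                  # 1.402 * 256 ~= 359
    g = y - ((88 * d + 183 * e) >> 8)         # 0.344 * 256 ~= 88, 0.714 * 256 ~= 183
    b = y + ((454 * d) >> 8)                  # 1.772 * 256 ~= 454
    return clamp8(r), clamp8(g), clamp8(b)


grey = yuv_to_rgb(128, 128, 128)
white = yuv_to_rgb(255, 128, 128)
red = yuv_to_rgb(76, 85, 255)                 # approximately pure red
```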
Control-oriented peripherals
Pulse Width Modulation
Dead-band control
Quadrature Encoder
Images source: C28x DSP Design Workshop, Texas Instruments, 2004
Benchmarking DSP performance
BDTI DSP Kernel benchmarks
  FIR, LMS, IIR, FFT, vector dot product/add/max, Viterbi, control, bit unpack
BDTI Communications benchmark (OFDM)
  IQ, FIR, FFT, slicer (FFT to QAM constellation), Viterbi
BDTI Video Encoder, Decoder and Kernel benchmarks
  Deblocking, DCT, motion compensation and estimation, image resize
Also at BDTI:
  Comparison of high-level synthesis on FPGA with RTL or 'C64x+
Digital Signal Processors 90s-00s
Pervasive DSPs
1988: TMS320C14
  UART, timers, capture inputs, compare outputs (PWM), EPROM
  Positioned towards motor control and automotive (ABS...)
1991: $5 for a TMS320C1x DSP (1 kunits)
'C2xx: C2xLP core-based, JTAG emulation
  1996: 6 products; flash; UART, timers, GPIO
  1997: 'F240 aimed at motor control
TI hybrid DSP/MCUs
1998: 'C27x
  Code efficiency: better density, better compiler target
    5-20% advantage over ARM7 in mass storage benchmarks
  4M 16-bit words address space, byte addressability, stack-based addressing
  Improved interrupt and context switch responses
  Memory-to-memory and register-to-register operations
  Real-time emulation: DMA, real-time data exchange...
2001: 'C28x
  'C2xx code compatible, 32-bit arithmetic, dual 16x16 MAC
  A/D converter, CAN, SPI, SCI, I²C, LIN, McBSP...
  High-resolution PWM (65 ps), quadrature encoder (QEP)...
  "Virtual floating point" (IQ Math), 1-to-1 C-to-assembly ratio
TI hybrid DSP/MCU current lineup
Not a DSP anymore!
  'C2000 appears in the TI MCU product tree.
  Floating-point products since 2008, 150-300 MHz
80+ code-compatible products
Entry price: $2.00 (1,000 units)
  40 MHz, 6/6 KB flash/RAM, 8 PWM, 2 MHz 12-bit ADC
  Similar price to ARM Cortex M3 microcontrollers (same features)
Applications:
  Digital motor control, automotive (hybrid, power steering, x-by-wire...), renewable energy, lighting, power line comms, precision sensing and control...
Competition: Analog 2199x, Freescale 5685x, boosted MCUs (Atmel, Infineon, Microchip dsPIC)
TI Mobile DSPs
Feature | C5x (1989) | C54x (1995) | C55x (2000)
ALU | No guard bits, 1 acc + buffer, pipelined MAC | 8 guard bits, 2 accumulators, dual 16-bit ALU, 32-bit operands, separate MAC | 4 accumulators, dual MAC, additional 16-bit ALU
Specialized functions | Parallel logic unit | Compare-Select-Store, exponent encoder | More orthogonal instr. set
Addressing | 1 address gen., 2+1 data read+write, 64 kw program | 2 address gen., 2+1 data read+write, up to 8 Mw program, parallel load/store, conditional store | 3 address gen., 3+2 data read+write, 16 MB program/data, more circular addr.
Program control | Repeat block (1 level), delayed instr., XC | (same) | Repeat block (3 levels), 64-byte instr. queue, speculative fetching
Power modes | Idle (CPU stopped), Idle2 (CPU + periph) | Idle3 (CPU + periph + PLL) | User-configurable idle domains
Other: up to 40% denser code
Mobile DSPs timeline
[Timeline chart, 1990 to 2005]
Texas Instruments: 'C5x, 'C54x, 'C55x
DSP Group/CEVA: Pine, Oak, Palm, X1620, X1622
AT&T/Lucent/Agere: DSP16A, 1610, 1617, 16210
StarCore: SC110, SC1200, SC2200
Motorola/Freescale: 56156
NEC: SPXK3/4, SPXK5
Infineon: Carmel 10xx
LSI Logic/VeriSilicon: ZSP400, ZSP500
STMicroelectronics: ST100, ST122
Analog Devices: ADSP-232xx, SoftFone (218x), SoftFone (BF5xx)
Today:
  Qualcomm: in-house DSP
  Intel: CEVA
  MediaTek: CEVA, but acquired Coresonic in 2012
  Japan: Tensilica
High performance DSPs
1987: TMS320C30 is 45 kgates plus memory.
Massively Parallel Processing (MPP) starts in the 80s.
  e.g. nCUBE 10 (1985, 1024 processors).
Some processors are designed with high-speed links.
  Inmos Transputer T212 (1984, 16-bit fixed point)
  TI TMS320C40 (1990, 32-bit fp)
  ADI ADSP-2106x "SHARC" (1994, 32-bit fp)
Early '90s: silicon budget starts allowing multiple execution units.
  TMS320C60 "Juggernaut": single-cycle complex multiply.
  Project redirected to the TMS320C6000.
High performance DSPs timeline
[Timeline chart, 1997 to 2010]
Texas Instruments, fixed point
  'C62x: 6 ALU, 2 MPY, 200 MHz, VLIW
  'C64x: 4 MAC/cycle, SIMD, compact instruction set, 400 MHz - 1 GHz
  'C64x+: 8 MAC/cycle, 20-30% more compact, 400-700 MHz, from $10
  Multicore: 3 x 1.2 GHz, 6 x 700 MHz
  'C66x fixed/floating point: 8 x 1.25 GHz
Texas Instruments, floating point
  'C67x: adds fp to MPY + 4 ALU, 150 MHz
  'C67x+: 150-350 MHz, $7
  'C674x, fixed/floating point: 200-450 MHz, $6
Motorola, fixed point (StarCore)
  MSC8101 (SC140): 4 MAC, VLIW, 300 MHz
  MSC71xx (SC1400): 200-300 MHz
  MSC8251: 1 GHz
  MSC8102: 4 x SC140, 300 MHz
  MSC8122: 4 x SC140, 500 MHz
  MSC81xx/82xx: 6 x SC3850, 1 GHz
  MSC8144: 4 x SC3400, 1 GHz
Analog Devices, floating point
  SHARC: 2 ALU, 2 MPY, SIMD, VLIW
  SHARC: up to 5 Mb RAM, 200-450 MHz
  TigerSHARC: 4-24 Mb eDRAM, 500-600 MHz
CEVA, fixed point
  CEVA-XC323: 32 MAC/cycle, 800 MHz
Specialized DSPs from TI
2001, TMS320DA250, digital audio
  'C55x @ 20 MHz, USB, MS/MMC/SD, LCD, I²C, SDRAM...
2001, TMS320DSC21/24, digital still camera
  'C54x @ 250 MHz + ARM7TDMI, SDRAM, NTSC/PAL, 2 Mpix...
2001, TMS320IP5472, IP telephony ('C54x + ARM7TDMI)
2003, TMS320DM310, digital media ('C54x + ARM925)
2003, TMS320DM642, DaVinci video/imaging, 'C64x
2008, TMS320TCI6487, wireless infrastructure, 3 x 'C64x+
Currently: DaVinci video/imaging, OMAP apps processors, KeyStone imaging/vision/high performance computing
Configurable DSPs: ARC
1999: ARC 3, 2 MAC/cycle, 40-bit registers, saturating arithmetic, XY memory, circular and bit-reverse addressing, zero-overhead looping, 40 MHz.
2002: ARCtangent-A5, ARCompact instruction set, targets VoIP, wireless baseband, digital imaging and audio.
2003: ARC600, 200 MHz, vertical applications (audio...).
2004: ARC700, 300 MHz, MMU, dynamic branch prediction...
2005: ARC 710D, 128-bit SIMD for video algorithms, 533 MHz.
2007: VRaptor media architecture. SIMD, coprocessors (ME, entropy enc/dec...).
2009: Virage Logic acquires ARC.
2010: Synopsys acquires Virage Logic.
700 million units shipped annually.
Configurable DSPs: Tensilica
2000: Vectra DSP extension to Xtensa III.
  200 MHz, 40-bit registers, 4 MAC/cycle, SIMD.
  FFT 1.8x faster than TI 'C55x, Viterbi butterfly in 2 cycles (TI: 4).
2002: Xtensa V / Vectra scores 1.6x higher than 'C62x on the EEMBC Telecom benchmark (both optimized).
2003: HiFi audio package (software codecs, new instructions).
2004: Xtensa/Vectra LX.
  FLIX (VLIW). 1.8x the BDTImark of 'C64x (8-MAC configuration, Viterbi + bit unpack instructions).
2005: video decoder package.
  H.264 D1 decoding at 30 fps in 10.5 mm² (130 nm).
2006: Diamond series of pre-configured processors.
  545CK DSP (8 MACs) displays twice the 'C64x+ BDTImark2000 score at the same MHz.
2009: ConnX Baseband Engine (Diamond 545CK, 16 MACs, new instructions).
2011: ConnX for baseband.
2012: HiFi 3 for high-end audio and voice processing.
Tensilica positions its products as customizable dataplane processors.
800 million units shipped annually. Claims license revenue bigger than CEVA's.
2013: announced they had shipped 2 billion cores. Cadence acquires Tensilica.
ARM attempts at DSP: Piccolo
1997: an ARM7 coprocessor adding 2/3 area.
In ARM's words, about the 'C52 performance.
  16x16 MAC, 48-bit accumulators, dual 16-bit instructions, saturation, hardware loop
Issues:
  Register file starvation
  No ARM/Piccolo signaling (need for data, interrupts...)
ARM attempts at DSP: v5TE
1999: 32x16 MAC, saturation. 30% area adder to v4T.
Speedup vs. v4T: FFT 20-40%, G.723 codec almost 2x.
No modulo or bit-reverse addressing, no hardware loop, no guard bits.
ARM946/966 (1999), ARM926 (2001), ARM968 (2004)
v5TE maintained in v6 and v7
MIPS | v5TE | C54x | C55x
Adaptive Echo Cancellation (64 ms) | 35.9 | 12.4 | 8.6
G.729AB codec | 40.3 | 12.2 | 10.3
Performance figures source: www.adaptivedigital.com
ARM attempts at DSP: OptimoDE
2003, acquisition of Adelante Technologies
VLIW with Custom Functional Units (CFUs) and configurable micro-architecture
  1. C code evaluated on pre-defined configurations
  2. Define CFUs
  3. Develop microcode using retargetable C compiler
Limited success: Thomson (video), LG (HDTV), Broadcom (networking, wireless), Phonak (hearing aids), Toshiba (portable).
Competition: ARC, Tensilica. Patent portfolio, development environment.
ARM Leuven office closed in 2009.
ARM attempts at DSP: NEON
2005, introduced on Cortex A8 (ARMv7-A).
MMX-like (DSP-capable), not DSP-like
  128-bit fixed/floating-point SIMD
  32 64-bit registers
  8/16/32/64-bit integers
  8/16-bit polynomials with 1-bit coefficients
  Single- and double-precision f.p., IEEE and fast
Competed with Intel Wireless MMX
  PXA27x, 2004, 312-624 MHz, 64-bit int. SIMD (16 registers)
Performance increase vs. ARMv5TE:
  MPEG4 decoding: x4.5
  GSM-AMR: x3
  MP3 decoding: x2.5
  FFT: x4
Comes at a cost:
  Cortex A9 core: 600 kgates
  Cortex A9 NEON: 500 kgates
Cortex A5 introduced in 2009 as a replacement for the ARM1176.
ARM attempts at DSP: Cortex M4
Introduced in 2010 as a Digital Signal Controller.
Adds single-cycle, v5TE-style instructions to the Cortex M3.
  32-bit MAC or dual 16-bit MAC
  Optional FPU, MPU*, NVIC*, WIC*.
But no bit-reverse or circular addressing, no hardware loops, no parallel load/store and ALU ops.
Typically 150 MHz (targets flash memory) but capable of 300 MHz (65 nm LP).
* MPU: Memory Protection Unit; NVIC: Nested Vectored Interrupt Controller; WIC: Wake-up Interrupt Controller
Performance comparison
[Bar chart: BDTImark/MHz, horizontal scale 0 to 17.5. Bar values are not recoverable from the text.]
Processors compared: Diamond 545CK (400 MHz), MSC8156 (1 GHz), 'C64x+ (1.2 GHz), CEVA X1620 (500 MHz), ZSP 500 (300 MHz), Cortex A8 (700 MHz), TigerSHARC (600 MHz), BF5xx Blackfin (750 MHz), 'C55x (300 MHz), ARM1176 (500 MHz), Marvell PXA27x (624 MHz), MIPS 24KE (500 MHz), Pentium III (1.4 GHz).
Source: www.bdti.com
AnySP: the best mobile DSP?
Data-level parallelism analysis for mobile signal processing algorithms:
Algorithm | SIMD workload (%) | Scalar workload (%) | Overhead workload (%) | SIMD width (elements) | Amount of thread-level parallelism
FFT / inverse FFT | 75 | 5 | 20 | 1024 | Low
Space-time block coding (STBC) | 81 | 5 | 14 | 4 | High
Low-density parity-check (LDPC) | 49 | 18 | 33 | 96 | Low
Deblocking filter | 72 | 13 | 15 | 8 | Medium
Intraprediction | 85 | 5 | 10 | 16 | Medium
Inverse transform | 80 | 5 | 15 | 8 | High
Motion compensation | 75 | 5 | 10 | 8 | High
Source: AnySP: Anytime Anywhere Anyway Signal Processing, Proc. of the International Symposium on Computer Architecture, June 2009, http://www.public.asu.edu/~chaitali/papers.html
AnySP: workload analysis
Multiple SIMD widths, substantial scalar and overhead loads
  Avoid fixed-width SIMD; improve scalar, address generation and data shuffling performance.
Register value lifetimes
  Bypass the register file whenever possible; split the register file into a small and a large region to optimize power consumption.
Instruction pair frequency
  Fuse the most frequent instruction pairs (drops the unneeded interim result).
Data reordering patterns
  All studied algorithms have a predefined set of swizzle patterns (<10).
AnySP: architecture
AnySP: results
90 nm core area: 25.2 mm² (est. 6.85 mm² in 45 nm)
100 Mbps high-mobility 4G wireless:
  90 nm, 1 V, 300 MHz: 1.3 W
  (est.) 45 nm, 0.8 V, 300 MHz: 850 mW, i.e. 1000 Mops/mW!
High-quality H.264 4CIF 30 fps decoding: 60 mW (90 nm)
FPGAs in digital signal processing
2004: FPGAs appeared in the EDN DSP Directory.
  Altera DSP Builder interfaces with MATLAB/Simulink; FIR and IIR MegaCores.
  Xilinx Virtex II: up to 566 18x18 multipliers. Pro version includes a PowerPC core. Licensable IPs (Viterbi, Turbo...).
  Support of MATLAB, Simulink and SPW.
Markets:
  Aerospace/defense, broadcast/video/imaging, wireless infrastructure.
2005: Xilinx introduces the XtremeDSP platform.
  Defines XtremeDSP slice building blocks (mpy, add/sub...).
  Low-cost development environment.
2011: Xilinx acquires AutoESL.
DSP/FPGA system partitioning
Altera's view of an OFDMA base station:
Source: Altera white paper WP-01043-1.0, October 2007
The DSP-centric view (MSC8156)
Source: MSC8156 Product Brief, Broadband Wireless Access DSP, Freescale, March 2010
Comparison of DSP implementations
Criteria, in column order: Area, Power, Flexibility, SW. Dev., Risk, Remote Upgrade
FPGA: - - - - + + - + + + +
Parallel Homogeneous: - 0 0 - - 0 +
Configurable CPU: 0 + - 0 - - 0
CPU + accelerators: + + + - - + - -
Single speed demon: + + - + + + + + +
Table shows the relative value of each implementation by criterion: from - - (poorest) through 0 (average) to + + (best).
Source: Microprocessor Report, "Mixed Architectures Dominate Consumer Sockets", Aug. 2008
Some failed attempts: Mwave
Mwave: virtual DSP (hard, soft, native).
  PC-based multimedia (modem, fax, voice, audio, video decompression...). Upgradeability.
  Was to be the DSP "killer application".
Initially a TI/IBM alliance.
  IBM design marketed by TI (TMS320M500, 1992).
  TI withdrew in 1994 when IBM decided to sell the silicon directly.
Mwave DSP: 16-bit data, 32-bit ALU/MPY, 25-33 MHz; ISA, MIDI and audio codec interfaces.
Dropped by IBM because of performance and compatibility issues. Significant support burden compared to off-the-shelf sound or modem cards.
Some failed attempts: TMS320C80
1994: Multimedia Video Processor
  Master Processor: 32-bit RISC, IEEE f.p., I/D caches. 100 Mflops.
  4 Parallel Processors: 32-bit DSP, 64-bit opcodes, I-cache, local RAM.
  Transfer controller (400 MB/s). Video controller (DRAM/VRAM).
  Dual display. 32-bit address space. Crossbar switch. 50 MHz. 2 BOPS.
Too complex to program efficiently.
  STI's Cell processor started shipping in 2006...
Karl Guttag:
  Video Display Controller ("sprite"): MSX, ColecoVision, TI-99/4...
  Video palettes, SDRAM, VRAM.
  TMS340 Graphics Signal Processor family.
  TI Fellow 11 years after graduating and joining TI.
  Left TI in 1997.
Some failed attempts: MSA @ Intel
Intel/ADI DSP/MCU architecture co-development since 2000.
ADI started marketing the Blackfin in 2001.
  OthelloOne, SoftFone, TTPCom GPRS stack (ADI, 2002) along with Intel's XScale or NeoMagic's application processors.
Intel: Micro Signal Architecture. PXA800F / Manitoba in 2003.
  0.13 µm
  312 MHz XScale
  104 MHz MSA
  4 MB flash
  512 KB SRAM
DSP market
The DSP-enabled silicon market is 10x the discrete DSP market:
  2008: $27.2B vs. $3.0B* (total semiconductor: $260B†)
The definition of what a DSP is varies widely:
  Forward Concepts: $3.0B vs. iSuppli: $5.8B† (2008).
Shrinking market for discrete DSPs†:
  DSPs: -8.2%/year over 2008-2014
  Total processors: +4.1%
  Total digital ICs: +4.5%
* Source: Forward Concepts, May 2009
† Source: iSuppli, May 2010
Conclusion
DSP technology has enabled many applications.
The size of this market has attracted many competitors with a broad set of optimized solutions.
  Low-performance, low-cost DSP against MCUs.
  Low-power, high-volume DSP against ASSP with configurable processors or licensable DSP cores.
    CEVA: Infineon, Broadcom, MediaTek, Spreadtrum, ST-Ericsson...
    Tensilica: NTT DoCoMo/NEC/Fujitsu/Panasonic.
  High-performance DSP against MCU+FPGA and ASSP with configurable processors or licensable DSP cores or DSP arrays.
    PicoChip, Tilera...
To probe further...
The Scientist and Engineer's Guide to Digital Signal Processing, www.dspguide.com
www.data-compression.com
Forward Concepts, www.fwdconcepts.com (since 1984)
Berkeley Design Technology, www.bdti.com (since 1991)
  Hosts the comp.dsp FAQ: www.bdti.com/Resources/Comp.DSP.FAQ
EDN DSP Directory, www.edn.com (yearly)
ieeexplore.ieee.org
www.datasheetarchive.com
This presentation is available on SlideShare.
