DSP Lecture 01

Chapter 1
Introduction
Chapter 1, Slide 1
Learning Objectives
Why process signals digitally?

Definition of a real-time application.
Why use Digital Signal Processing processors?
What are the typical DSP algorithms?
Parameters to consider when choosing a DSP
processor.
Programmable vs ASIC DSP.
Texas Instruments TMS320 family.
Present Day Applications
[Figure: DSP as a technology enabler for the application areas below]
Wireless / Cellular: Voice-band audio; RF codecs; Voltage regulation
Consumer Audio: Stereo A/D, D/A; PLL; Mixers
HDD: PRML read channel; MR pre-amp; Servo control; SCSI transceivers
Multimedia: Stereo audio; Imaging; Graphics palette; Voltage regulation
Automotive: Digital radio A/D/A; Active suspension; Voltage regulation
DTAD: Speech synthesizer; Mixed-signal processor
Why go digital?
Digital signal processing techniques are now so powerful that sometimes it is extremely difficult, if not impossible, for analogue signal processing to achieve similar performance.
Examples:
FIR filter with linear phase.
Adaptive filters.
Why go digital?
Analogue signal processing is achieved by using analogue components such as:
Resistors.
Capacitors.
Inductors.
The inherent tolerances of these components, together with temperature, voltage changes and mechanical vibrations, can dramatically affect the effectiveness of the analogue circuitry.
Why go digital?
With DSP it is easy to:
Change applications.
Correct applications.
Update applications.
Additionally, DSP reduces:
Noise susceptibility.
Chip count.
Development time.
Cost.
Power consumption.
Why NOT go digital?
High frequency signals cannot be processed digitally for two reasons:
Analog-to-Digital Converters (ADCs) cannot work fast enough.
The application can be too complex to be performed in real-time.
Real-time processing
DSP processors have to perform tasks in real-time, so how do we define real-time?
The definition of real-time depends on the
application.
Example: a 100-tap FIR filter is performed in real-time if the DSP can perform and complete the following operation between two samples:
$$y(n) = \sum_{k=0}^{99} a(k)\, x(n-k)$$
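To make the per-sample workload concrete, here is a minimal C sketch of the 100-tap sum above. The function name `fir_output` and the `history` buffer layout (newest sample first) are assumptions for illustration and do not come from the slides.

```c
#include <stddef.h>

#define NUM_TAPS 100  /* matches the 100-tap example in the text */

/* Compute one output sample y(n) = sum over k = 0..99 of a(k) * x(n-k).
 * history[] is a hypothetical buffer of the most recent inputs:
 * history[0] = x(n), history[1] = x(n-1), and so on. */
float fir_output(const float a[NUM_TAPS], const float history[NUM_TAPS])
{
    float y = 0.0f;
    for (size_t k = 0; k < NUM_TAPS; k++) {
        y += a[k] * history[k];  /* one multiply-accumulate per tap */
    }
    return y;
}
```

For the filter to run in real-time, all 100 multiply-accumulates (plus the buffer update, not shown here) must finish before the next sample arrives.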
Real-time processing
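[Figure: timeline between sample n and sample n+1; the Sample Time consists of the Processing Time followed by the Waiting Time]
We can say that we have a real-time application if:
$$\text{Waiting Time} \geq 0$$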
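The real-time condition can be checked numerically. The sketch below assumes the sample time is the reciprocal of the sampling rate and that the processing time is a cycle count divided by the CPU clock; the function names and parameters are hypothetical, not taken from the slides.

```c
#include <stdbool.h>

/* Waiting time per sample in seconds: sample time minus processing time.
 * Assumes processing time = cycles_per_sample / cpu_clock_hz. */
double waiting_time_s(double sample_rate_hz, double cycles_per_sample,
                      double cpu_clock_hz)
{
    double sample_time = 1.0 / sample_rate_hz;
    double processing_time = cycles_per_sample / cpu_clock_hz;
    return sample_time - processing_time;
}

/* The application is real-time when the waiting time is not negative. */
bool is_real_time(double sample_rate_hz, double cycles_per_sample,
                  double cpu_clock_hz)
{
    return waiting_time_s(sample_rate_hz, cycles_per_sample, cpu_clock_hz) >= 0.0;
}
```

For example, at a 48 kHz sampling rate and a 150 MHz clock, an algorithm needing 1000 cycles per sample leaves time to spare, while one needing 4000 cycles per sample does not meet the condition.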
Why do we need DSP processors?
Why not use a General Purpose Processor (GPP) such as a Pentium instead of a DSP processor?
What is the power consumption of a Pentium and a DSP processor?
What is the cost of a Pentium and a DSP processor?
Why do we need DSP processors?
Use a DSP processor when the following are required:
Cost saving.
Smaller size.
Low power consumption.
Processing of many high frequency signals in
real-time.
Use a GPP when the following are required:
Large memory.
Advanced operating systems.
What are the typical DSP algorithms?
The Sum of Products (SOP) is the key element in most DSP algorithms.
What Problem Are We Trying To Solve?
[Figure: digital sampling of an analog signal: x -> ADC -> DSP -> DAC]
Most DSP algorithms can be expressed with a multiply-accumulate (MAC):
$$Y = \sum_{i=1}^{count} a_i \, x_i$$
for (i = 0; i < count; i++) {
    sum += m[i] * n[i];
}
What does it take to do this fast and easy?

Fast MAC using only C

Multiply-Accumulate (MAC) in Natural C Code
for (i = 0; i < count; i++){
sum += m[i] * n[i]; }
Fastest Execution of MACs
The C6x roadmap ... from 200 to 2400 MMACs
Ease of C Programming
Even using natural C, the C6000 Architecture can perform 2 to 4 MACs per cycle
Compiler generates 80-100% efficient code
How does the C6000 achieve such performance from C?
'C6000 Architecture: Built for Speed
[Figure: C6000 CPU: memory; register files A0..A31 and B0..B31; functional units .D1, .D2, .M1, .M2, .L1, .L2, .S1, .S2; controller/decoder]
C6000 Compiler excels at Natural C
While dual-MAC speeds math intensive algorithms, the flexibility of 8 independent functional units allows the compiler to quickly perform other types of processing
All C6000 instructions are conditional, allowing efficient hardware pipelining
Instruction set and CPU hardware orthogonality allow the compiler to achieve 80-100% efficiency
Fastest MAC using Natural C

float mac(float *m, float *n, int count)
{
    int i;
    float sum = 0;  /* running sum of products */
    for (i = 0; i < count; i++)
        sum += m[i] * n[i];
    return sum;
}
;** --------------------------------------------------*
LOOP: ; PIPED LOOP KERNEL
         LDDW   .D1   A4++,A7:A6
||       LDDW   .D2   B4++,B7:B6
||       MPYSP  .M1X  A6,B6,A5
||       MPYSP  .M2X  A7,B7,B5
||       ADDSP  .L1   A5,A8,A8
||       ADDSP  .L2   B5,B8,B8
|| [A1]  B      .S2   LOOP
|| [A1]  SUB    .S1   A1,1,A1
;** --------------------------------------------------*
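The kernel listing above appears to keep two running sums (registers A8 and B8), each fed by its own multiplier, with every load fetching a pair of floats. A C sketch of that structure follows. It is one reading of the listing, not compiler output, and the name `mac_two_sums` is hypothetical.

```c
/* Dot product with two independent accumulators, mirroring the two
 * ADDSP chains in the kernel. The odd-count tail handling is an
 * assumption added so the sketch works for any count. */
float mac_two_sums(const float *m, const float *n, int count)
{
    float sum0 = 0.0f;  /* plays the role of A8 */
    float sum1 = 0.0f;  /* plays the role of B8 */
    int i;

    for (i = 0; i + 1 < count; i += 2) {
        sum0 += m[i] * n[i];
        sum1 += m[i + 1] * n[i + 1];
    }
    if (i < count) {
        sum0 += m[i] * n[i];  /* leftover element when count is odd */
    }
    return sum0 + sum1;
}
```

Splitting the sum this way lets the two multiply-add chains proceed in parallel. Because floating-point addition is not associative, the result can differ in the last bits from a single-accumulator loop.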
'C6000 System Block Diagram
[Figure: CPU (functional units .D1, .D2, .M1, .M2, .L1, .L2, .S1, .S2; Register Set A; Register Set B) connected by internal buses to internal memory, external memory and peripherals]
Looking at the internal buses ...
C6000 Internal Buses
[Figure: buses linking the CPU (PC, A regs, B regs) and the DMA to internal memory, external memory and peripherals]
Bus               Width
Program Addr      x32
Program Data      x256
Data Addr - T1    x32
Data Data - T1    x32/64
Data Addr - T2    x32
Data Data - T2    x32/64
DMA buses: DMA Addr - Read; DMA Data - Read; DMA Addr - Write; DMA Data - Write
[Figure: 'C6000 System Block Diagram (repeated)]
Next, the internal memory ...
C6711 Memory
[Figure: CPU with 4K Program Cache and 4K Data Cache, backed by 64K Prog / Data (Level 2) internal memory]
Address      Contents
0000_0000    64KB Internal (Prog / Data, Level 2)
0180_0000    On-chip Peripherals
8000_0000    128MB External
9000_0000    128MB External
A000_0000    128MB External
B000_0000    128MB External
FFFF_FFFF    (end of address space)
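As an illustration of how the memory map above might be used in software, the following C sketch classifies a 32-bit address into the internal region or one of the four external spaces. The enum and function names are hypothetical. The slide does not give the extent of the on-chip peripheral region starting at 0180_0000, so the sketch deliberately leaves that region unclassified.

```c
#include <stdint.h>

typedef enum {
    REGION_INTERNAL,  /* 64KB at 0000_0000 */
    REGION_EXT0,      /* 128MB at 8000_0000 */
    REGION_EXT1,      /* 128MB at 9000_0000 */
    REGION_EXT2,      /* 128MB at A000_0000 */
    REGION_EXT3,      /* 128MB at B000_0000 */
    REGION_OTHER      /* anything else, including on-chip peripherals */
} region_t;

region_t classify_address(uint32_t addr)
{
    const uint32_t internal_size = 64u * 1024u;
    const uint32_t ext_size = 128u * 1024u * 1024u;
    const uint32_t ext_base[4] = {
        0x80000000u, 0x90000000u, 0xA0000000u, 0xB0000000u
    };

    if (addr < internal_size) {
        return REGION_INTERNAL;
    }
    for (int i = 0; i < 4; i++) {
        /* Inside this space if the offset from its base is under 128MB. */
        if (addr >= ext_base[i] && addr - ext_base[i] < ext_size) {
            return (region_t)(REGION_EXT0 + i);
        }
    }
    return REGION_OTHER;
}
```

Note that the external spaces start 256MB apart while each holds 128MB, so an address such as 8800_0000 falls outside every listed region.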
[Figure: 'C6000 System Block Diagram (repeated)]
Looking at each peripheral ...
Hardware vs. Microcode multiplication
DSP processors are optimised to perform multiplication and addition operations.
Multiplication and addition are done in hardware and in one cycle.
Example: 4-bit multiply (unsigned).

Hardware:
      1011
    x 1110
  --------
  10011010

Microcode:
      1011
    x 1110
  --------
      0000     Cycle 1
     1011.     Cycle 2
    1011..     Cycle 3
   1011...     Cycle 4
  --------
  10011010     Cycle 5
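The microcode column corresponds to the classic shift-and-add method: one partial product per multiplier bit, then a final sum. A minimal C sketch is given below. The function name is hypothetical, and the loop only models the step-by-step procedure, not any particular processor's microcode.

```c
#include <stdint.h>

/* Unsigned 4-bit x 4-bit multiply by shift-and-add.
 * Each loop iteration handles one multiplier bit, like one row
 * of partial products in the worked example. */
uint8_t multiply_shift_add(uint8_t multiplicand, uint8_t multiplier)
{
    uint8_t product = 0;

    multiplicand &= 0x0F;  /* keep only the low 4 bits */
    multiplier &= 0x0F;

    for (int bit = 0; bit < 4; bit++) {
        if (multiplier & (1u << bit)) {
            product += (uint8_t)(multiplicand << bit);
        }
    }
    return product;  /* at most 15 * 15 = 225, fits in 8 bits */
}
```

With the operands from the example, `multiply_shift_add(0x0B, 0x0E)` gives 0x9A, that is 1011 x 1110 = 10011010 (11 x 14 = 154). A hardware multiplier produces the same result without the loop.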
Parameters to consider when choosing a DSP processor

Parameter                          TMS320C6211 (@150MHz)                 TMS320C6711 (@150MHz)
Arithmetic format                  32-bit                                32-bit
Extended floating point            N/A                                   64-bit
Extended Arithmetic                40-bit                                40-bit
Performance (peak)                 1200MIPS                              1200MFLOPS
Number of hardware multipliers     2 (16 x 16-bit) with 32-bit result    2 (32 x 32-bit) with 32 or 64-bit result
Number of registers                32                                    32
Internal L1 program memory cache   32K                                   32K
Internal L1 data memory cache      32K                                   32K
Internal L2 cache                  512K                                  512K

C6711 Datasheet: \Links\TMS320C6711.pdf
C6211 Datasheet: \Links\TMS320C6211.pdf
Parameters to consider when choosing a DSP processor (continued)

Parameter                                    TMS320C6211 (@150MHz)    TMS320C6711 (@150MHz)
I/O bandwidth: Serial Ports (number/speed)   2 x 75Mbps               2 x 75Mbps
DMA channels                                 16                       16
Multiprocessor support                       Not inherent             Not inherent
Supply voltage                               3.3V I/O, 1.8V Core      3.3V I/O, 1.8V Core
Power management                             Yes                      Yes
On-chip timers (number/width)                2 x 32-bit               2 x 32-bit
Cost                                         US$ 21.54                US$ 21.54
Package                                      256 Pin BGA              256 Pin BGA
External memory interface controller         Yes                      Yes
JTAG                                         Yes                      Yes
Floating vs. Fixed point processors
Applications which require the following need a floating-point processor:
High precision.
Wide dynamic range.
High signal-to-noise ratio.
Ease of use.
Drawbacks of floating-point processors:
Higher power consumption.
Can be more expensive.
Can be slower than fixed-point counterparts and larger in size.
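One reason floating point is associated with ease of use is that on a fixed-point device the programmer has to manage scaling and overflow by hand. The sketch below shows a sum of products in Q15 format (16-bit values representing fractions in [-1, 1)). The Q15 choice, the 64-bit accumulator and the function name `q15_dot` are assumptions for illustration, not details from the slides.

```c
#include <stdint.h>
#include <stddef.h>

/* Q15 dot product: each int16_t represents raw / 32768.
 * Products are in Q30, so the sum is shifted right by 15 to return
 * to Q15 and then saturated to the 16-bit range. */
int16_t q15_dot(const int16_t *a, const int16_t *x, size_t count)
{
    int64_t acc = 0;  /* wide accumulator to avoid intermediate overflow */

    for (size_t i = 0; i < count; i++) {
        acc += (int32_t)a[i] * (int32_t)x[i];
    }

    /* Right shift of a negative value is implementation-defined in C;
     * common compilers perform an arithmetic shift. */
    acc >>= 15;

    if (acc > INT16_MAX) {
        acc = INT16_MAX;  /* saturate instead of wrapping around */
    }
    if (acc < INT16_MIN) {
        acc = INT16_MIN;
    }
    return (int16_t)acc;
}
```

The floating-point `mac` function shown earlier needs none of this bookkeeping, which is the trade-off against the power and cost drawbacks listed above.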
Floating vs. Fixed point processors
It is the application that dictates which device and platform to use in order to achieve optimum performance at a low cost.
For educational purposes, use the floating-point device (C6711) as it can support both fixed and floating point operations.
General Purpose DSP vs. DSP in ASIC
Application Specific Integrated Circuits (ASICs) are semiconductors designed for dedicated functions.
The advantages and disadvantages of using ASICs are listed below:
Advantages
High throughput
Lower silicon area
Lower power consumption
Improved reliability
Reduction in system noise
Low overall system cost
Disadvantages
High investment cost
Less flexibility
Long time from design to market
General-purpose DSP market in 2003
System Considerations
Interfacing
Performance
Power
Size
Ease-of-Use
Programming
Interfacing
Debugging
Cost
Device cost
System cost
Development cost
Time to market
Integration
Memory
Peripherals
Texas Instruments TMS320 family
Different families and sub-families exist to support different markets.
C2000: Lowest Cost
Control Systems: Motor Control; Storage; Digital Ctrl Systems
C5000: Efficiency
Best MIPS per Watt / Dollar / Size: Wireless phones; Internet audio players; Digital still cameras; Modems; Telephony; VoIP
C6000: Performance & Best Ease-of-Use
Multi Channel and Multi Function App's: Comm Infrastructure; Wireless Base-stations; DSL; Imaging; Multi-media Servers; Video
Texas Instruments TMS320 family
Texas Instruments TMS320 family

TMS320C64x: The C64x fixed-point DSPs offer the industry's highest level of
performance to address the demands of the digital age. At clock rates of up
to 1 GHz, C64x DSPs can process information at rates up to 8000 MIPS with
costs as low as $19.95. In addition to a high clock rate, C64x DSPs can do
more work each cycle with built-in extensions. These extensions include new
instructions to accelerate performance in key application areas such as
digital communications infrastructure and video and image processing.
TMS320C62x: These first-generation fixed-point DSPs represent
breakthrough technology that enables new equipment and energizes
existing implementations for multi-channel, multi-function applications, such
as wireless base stations, remote access servers (RAS), digital subscriber
loop (xDSL) systems, personalized home security systems, advanced
imaging/biometrics, industrial scanners, precision instrumentation and multi-channel telephony systems.
TMS320C67x: For designers of high-precision applications, C67x floating-point DSPs offer the speed, precision, power savings and dynamic range to
meet a wide variety of design needs. These dynamic DSPs are the ideal
solution for demanding applications like audio, medical imaging,
instrumentation and automotive.
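The MIPS figures quoted for the C6000 devices are consistent with the eight functional units shown in the architecture slides, if one assumes the peak rate counts one instruction per unit per cycle. That assumption is not stated on the slides, so the following is a plausibility check rather than a definition:

```latex
\begin{align*}
\text{C64x:}  \quad & 8 \times 1000\ \text{MHz} = 8000\ \text{MIPS} \\
\text{C6211:} \quad & 8 \times 150\ \text{MHz} = 1200\ \text{MIPS}
\end{align*}
```

Peak figures of this kind are only reached when all eight units are kept busy every cycle.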
C6000 Roadmap
[Figure: C6000 roadmap, Performance versus Time, with object code software compatibility between generations. Legend: C62x/C64x/DM642: Fixed Point; C67x: Floating Point]
1st Generation: C6201, C6202, C6203, C6204, C6205, C6211, C6701, C6711, C6712, C6713
2nd Generation: C6411, C6412, C6414, C6415, C6416, DM642
Highest Performance: C64x 1.1 GHz; Multi-core DSP
C6000 Floating-Point
[Figure: floating-point Performance versus Time]
C3x devices (C30, C31, C32, C33), labelled 150 MFLOPS
C6712: 600 MFLOPS
C6711: 900 MFLOPS
C6701: 1 GFLOPS
C67x: 3 GFLOPS and beyond
TI Floating-Point Innovation
TI Floating Point - A History of Firsts:
C30 (1987): First commercially-successful floating-point DSP
C40 (1991): First floating-point DSP with multiprocessing support
C32 (1995): First $10 floating-point DSP
C6701 (1998): First 1-GFLOPS DSP
C33 (1999): First $5 floating-point DSP
C6711 (1999): First 2-level cache floating-point DSP
C6712 (2000): First to offer 600 MFLOPS for under $10
Useful Links
Selection Guide:
\Links\DSP Selection Guide.pdf
\Links\DSP Selection Guide.pdf (3Q 2004)

\Links\DSP Selection Guide.pdf (4Q 2004)
Looking for Literature on DSP?
A Simple Approach to Digital Signal Processing by Craig Marven and Gillian Ewers; ISBN 0-4711-5243-9
DSP Primer (Primer Series) by C. Britton Rorabaugh; ISBN 0-0705-4004-7
Understanding Digital Signal Processing by Richard G. Lyons; Prentice Hall; 2nd edition (March 15, 2004); ISBN 0-1310-8989-7
DSP First: A Multimedia Approach by James H. McClellan, Ronald W. Schafer, and Mark A. Yoder; ISBN 0-1324-3171-8
Looking for Books on C6000 DSP?
Digital Signal Processing Implementation using the TMS320C6000 DSP Platform by Naim Dahnoun; ISBN 0-201-61916-4
C6x-Based Digital Signal Processing by Nasser Kehtarnavaz and Burc Simsek; ISBN 0-13-088310-7
Real-Time Digital Signal Processing: Based on the TMS320C6000 by Nasser Kehtarnavaz; Newnes; Book & CD-Rom (July 14, 2004); ISBN 0-7506-7830-5
Digital Signal Processing and Applications with the C6713 and C6416 DSK (Topics in Digital Signal Processing) by Rulph Chassaing; Wiley-Interscience; Book & CD-Rom (December 3, 2004); ISBN 0-4716-9007-4
Real-Time Digital Signal Processing from Matlab to C with the TMS320C6x DSK by Thad B. Welch, Cameron Wright, and Michael Morrow; Book & CD-Rom (2006); ISBN 0-8493-7382-4
Chapter 1
Introduction
- End -