
Lecture 10a: Digital Signal Processors: A TI Architectural History

Collated by: Professor Kurt Keutzer, Computer Science 252, Spring 2000
With contributions from: Dr. Brock Barton and Clark Hise, TI; Dr. Surendar S. Magar, Berkeley Concept Research Corporation

DSP ARCHITECTURE EVOLUTION

[Figure: DSP architecture families plotted against time (1980, 1985, 1990, 1995), with application examples]

Architecture families: DSP Building Blocks & Bit Slice Processors (MUL, etc.); Function/Application Specific (MP); DSP µP and RISC (MP); µC and Analog
Legend: MUL = Multipliers; MP = Multiprocessors (Multi-Processing)
Application Examples: Video/Imaging, W-CDMA, Radars, Digital Radios, High-End Control, Modems, Voice Coding, Instruments, Low-End Modems, Industrial Control

DSP ARCHITECTURE Enabling Technologies

Time Frame    Approach                              Primary Application                   Enabling Technologies
Early 1970s   Discrete logic                        Non-real time processing, Simulation  Bipolar SSI, MSI; FFT algorithm
Late 1970s    Building block                        Military radars, Digital Comm.        Single chip bipolar multiplier; Flash A/D
Early 1980s   Single Chip DSP µP                    Telecom, Control                      µP architectures; NMOS/CMOS
Late 1980s    Function/Application specific chips   Computers, Communication              Vector processing; Parallel processing
Early 1990s   Multiprocessing                       Video/Image Processing                Advanced multiprocessing; VLIW, MIMD, etc.
Late 1990s    Single-chip multiprocessing           Wireless telephony, Internet related  Low power single-chip DSP; Multiprocessing

Texas Instruments TMS320 Family: Multiple DSP µP Generations

Uniprocessor Based (Harvard Architecture)

Device        First Sample  Bit Size     Clock speed (MHz)  Instruction Throughput  MAC execution (ns)  MOPS  Device density (# of transistors)
TMS32010      1982          16 integer   20                 5 MIPS                  400                 5     58,000 (3)
TMS320C25     1985          16 integer   40                 10 MIPS                 100                 20    160,000 (2)
TMS320C30     1988          32 flt. pt.  33                 17 MIPS                 60                  33    695,000 (1)
TMS320C50     1991          16 integer   57                 29 MIPS                 35                  60    1,000,000 (0.5)
TMS320C2XXX   1995          16 integer                      40 MIPS                 25                  80

Multiprocessor Based

Device        First Sample  Bit Size          Architecture
TMS320C80     1996          32 integer/flt.   MIMD
TMS320C62XX   1997          16 integer        VLIW
TMS320C67XX   1997          32 flt. pt.       VLIW

Performance figures listed for the multiprocessor based devices: 1600 MIPS; 5; 5; 2 GOPS; 120 MFLOP; 20 GOPS; 1 GFLOP

First Generation DSP µP Case Study

TMS32010 (Texas Instruments) - 1982

Features
- 200 ns instruction cycle (5 MIPS)
- 144 words (16 bit) on-chip data RAM
- 1.5K words (16 bit) on-chip program ROM - TMS32010
- External program memory expansion to a total of 4K words at full speed
- 16-bit instruction/data word
- Single cycle 32-bit ALU/accumulator
- Single cycle 16 x 16-bit multiply in 200 ns
- Two cycle MAC (5 MOPS)
- Zero to 15-bit barrel shifter
- Eight input and eight output channels

TMS32010 BLOCK DIAGRAM

TMS32010 Program Memory Maps

Microcomputer Mode (16-bit words)
  Address
  0         Reset 1st Word
  1         Reset 2nd Word
  2         Interrupt
            Internal Memory Space
  1525
            Internal Memory Space Reserved For Testing
  1536
            External Memory Space
  4095

Microprocessor Mode (16-bit words)
  Address
  0         Reset 1st Word
  1         Reset 2nd Word
  2         Interrupt
            External Memory Space
  4095

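Digital FIR Filter Implementation (Uniprocessor-Circular Buffer)

[Figure: coefficients a0, a1, ..., a n-2, a n-1 are multiplied with a circular buffer of samples X0, X1, X2, ..., Xn-1 and summed (+) into the accumulator (Acc). The 1st and 2nd cycles are shown with their Start and End positions ("Start each time here"). After each cycle: replace starting value with new value.]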
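The circular-buffer scheme in the figure can be sketched in a few lines of Python. This is a minimal illustration, not TMS32010 code: the class name `CircularFIR`, the helper `fir_direct`, and the use of floating-point samples are assumptions made for the example. Each call overwrites the oldest sample with the new one and then walks backwards through the buffer, wrapping modulo N.

```python
class CircularFIR:
    """FIR filter that keeps the last N input samples in a circular buffer."""

    def __init__(self, h):
        self.h = list(h)              # coefficients h(0) .. h(N-1)
        self.x = [0.0] * len(self.h)  # sample buffer, initially all zeros
        self.newest = 0               # index of the most recent sample

    def step(self, sample):
        n_taps = len(self.h)
        # The slot after the newest sample holds the oldest one: overwrite it.
        self.newest = (self.newest + 1) % n_taps
        self.x[self.newest] = sample
        acc = 0.0
        for k in range(n_taps):
            # x(n-k) sits k slots behind the newest sample, modulo N.
            acc += self.h[k] * self.x[(self.newest - k) % n_taps]
        return acc


def fir_direct(h, xs):
    """Reference: evaluate the FIR sum from the full input history."""
    out = []
    for n in range(len(xs)):
        out.append(sum(h[k] * xs[n - k] for k in range(len(h)) if n - k >= 0))
    return out


if __name__ == "__main__":
    fir = CircularFIR([0.5, 0.25, 0.125])
    print([fir.step(v) for v in [1, 2, 3, 4, 5]])
```

No samples are ever moved in memory; only the index `newest` advances, which is the point of the circular buffer.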

TMS32010 FIR FILTER PROGRAM: Indirect Addressing (Smaller Program Space)

Y(n) = x[n-(N-1)] . h(N-1) + x[n-(N-2)] . h(N-2) + ... + x(n) . h(0)

For N=50, Indirect Addressing: t = 42 µs (23.8 kHz)
For N=50, Direct Addressing: t = 21.6 µs (40.2 kHz)

TMS320C203/LC203 BLOCK DIAGRAM DSP Core Approach - 1995


Third Generation DSP µP Case Study: TMS320C30 - 1988

TMS320C30 Key Features
- 60 ns single-cycle instruction execution time
  - 33.3 MFLOPS (million floating-point operations per second)
  - 16.7 MIPS (million instructions per second)
- One 4K x 32-bit single-cycle dual-access on-chip ROM block
- Two 1K x 32-bit single-cycle dual-access on-chip RAM blocks
- 64 x 32-bit instruction cache
- 32-bit instruction and data words, 24-bit addresses
- 40/32-bit floating-point/integer multiplier and ALU
- 32-bit barrel shifter

Third Generation DSP µP Case Study: TMS320C30 - 1988

TMS320C30 Key Features (cont.)
- Eight extended precision registers (accumulators)
- Two address generators with eight auxiliary registers and two auxiliary register arithmetic units
- On-chip direct memory access (DMA) controller for concurrent I/O and CPU operation
- Parallel ALU and multiplier instructions
- Block repeat capability
- Interlocked instructions for multiprocessing support
- Two serial ports to support 8/16/32-bit transfers
- Two 32-bit timers
- 1 µ CDMOS Process

TMS320C30 BLOCK DIAGRAM


TMS320C3x CPU BLOCK DIAGRAM


TMS320C3x MEMORY BLOCK DIAGRAM


TMS320C30 Memory Organization

Microprocessor Mode
  0h - BFh             Interrupt locations & reserved (192), external STRB active
  C0h - 7FFFFFh        External STRB Active
  800000h - 801FFFh    Expansion Bus MSTRB Active (8K)
  802000h - 803FFFh    Reserved (8K)
  804000h - 805FFFh    Expansion Bus IOSTRB Active (8K)
  806000h - 807FFFh    Reserved (8K)
  808000h - 8097FFh    Peripheral Bus Memory Mapped Registers (Internal) (6K)
  809800h - 809BFFh    RAM Block 0 (1K) (Internal)
  809C00h - 809FFFh    RAM Block 1 (1K) (Internal)
  80A000h - 0FFFFFFh   External STRB Active

Microcomputer Mode
  0h - BFh             Interrupt locations & reserved (192)
  C0h - 0FFFh          ROM (Internal)
  1000h - 7FFFFFh      External STRB Active
  800000h - 801FFFh    Expansion Bus MSTRB Active (8K)
  802000h - 803FFFh    Reserved (8K)
  804000h - 805FFFh    Expansion Bus IOSTRB Active (8K)
  806000h - 807FFFh    Reserved (8K)
  808000h - 8097FFh    Peripheral Bus Memory Mapped Registers (Internal) (6K)
  809800h - 809BFFh    RAM Block 0 (1K) (Internal)
  809C00h - 809FFFh    RAM Block 1 (1K) (Internal)
  80A000h - 0FFFFFFh   External STRB Active
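The two maps differ only below 800000h, which a small address decoder makes easy to see. The sketch below is illustrative Python, not TI code. It assumes the boundaries printed as 80800h and 80A00h are 808000h and 80A000h (which matches the stated 6K and 1K sizes), and that 1000h - 7FFFFFh in microcomputer mode is external STRB space. The names `COMMON`, `MICROPROCESSOR`, `MICROCOMPUTER`, `region`, and `size_in_words` are hypothetical.

```python
# (start, end, description); end addresses are inclusive, in 32-bit words.
COMMON = [
    (0x800000, 0x801FFF, "Expansion Bus MSTRB Active"),
    (0x802000, 0x803FFF, "Reserved"),
    (0x804000, 0x805FFF, "Expansion Bus IOSTRB Active"),
    (0x806000, 0x807FFF, "Reserved"),
    (0x808000, 0x8097FF, "Peripheral Bus Memory Mapped Registers"),
    (0x809800, 0x809BFF, "RAM Block 0"),
    (0x809C00, 0x809FFF, "RAM Block 1"),
    (0x80A000, 0xFFFFFF, "External STRB Active"),
]

MICROPROCESSOR = [
    (0x000000, 0x0000BF, "Interrupt locations & reserved"),
    (0x0000C0, 0x7FFFFF, "External STRB Active"),
] + COMMON

MICROCOMPUTER = [
    (0x000000, 0x0000BF, "Interrupt locations & reserved"),
    (0x0000C0, 0x000FFF, "ROM"),
    (0x001000, 0x7FFFFF, "External STRB Active"),  # assumed, see lead-in
] + COMMON


def region(address, memory_map):
    """Return the description of the region that contains the address."""
    for start, end, name in memory_map:
        if start <= address <= end:
            return name
    raise ValueError("address outside the 24-bit address space")


def size_in_words(start, end):
    """Number of words in an inclusive address range."""
    return end - start + 1


if __name__ == "__main__":
    print(region(0x100, MICROPROCESSOR))  # External STRB Active
    print(region(0x100, MICROCOMPUTER))   # ROM
```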


TMS320C30 FIR FILTER PROGRAM

Y(n) = x[n-(N-1)] . h(N-1) + x[n-(N-2)] . h(N-2) + ... + x(n) . h(0)

For N=50, t = 3.6 µs (277 kHz)
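The quoted sample rate is the reciprocal of the time spent per output sample, and dividing that time by the instruction cycle gives the cycle budget per output. The helper names below are hypothetical, and the sketch assumes the time unit lost in extraction is microseconds. The 60 ns and 200 ns cycle times come from the TMS320C30 and TMS32010 feature lists.

```python
def max_sample_rate_khz(time_per_output_us):
    """Highest sample rate (kHz) if each output takes the given time (µs)."""
    return 1000.0 / time_per_output_us


def cycles_per_output(time_per_output_us, cycle_ns):
    """Instruction cycles available per output sample."""
    return time_per_output_us * 1000.0 / cycle_ns


if __name__ == "__main__":
    # TMS320C30, N = 50: 3.6 µs per output at a 60 ns cycle.
    print(round(max_sample_rate_khz(3.6), 1), round(cycles_per_output(3.6, 60)))
    # TMS32010, N = 50, indirect addressing: 42 µs per output at a 200 ns cycle.
    print(round(max_sample_rate_khz(42), 1), round(cycles_per_output(42, 200)))
```

Under these assumptions the C30 spends about 60 cycles on a 50-tap filter, close to one cycle per tap, while the TMS32010 with indirect addressing spends about 210.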


C54x Architecture


TMS320C54x Internal Block Diagram


Architecture optimized for DSP

#1: CPU designed for efficient DSP processing
  - MAC unit, 2 Accumulators, Additional Adder, Barrel Shifter

#2: Multiple busses for efficient data and program flow
  - Four busses and large on-chip memory that result in sustained performance near peak

#3: Highly tuned instruction set for powerful DSP computing
  - Sophisticated instructions that execute in fewer cycles, with less code and low power demands

Key #1: DSP engine

y = sum_{n=1}^{40} a_n * x_n

[Diagram: a and x feed the multiplier (MPY); the product feeds the adder (ADD), which accumulates y]

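Key #1: MAC Unit

MAC *AR2+, *AR3+, A

[Diagram: the multiplier (MPY) takes one operand from Data, Acc A, Temp, Coeff, or Prgm and the other from Data or Acc A, each with signed/unsigned (S/U) control. The Fractional Mode Bit acts on the product. The adder (ADD) combines the product with a value selected from the inputs labeled A, B, O and writes acc A or acc B.]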
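A rough software model of one `MAC *AR2+, *AR3+, A` step: multiply the two values addressed by the pointers, add the product to the accumulator, then post-increment both pointers. The Python below is a sketch under stated assumptions, not a cycle-accurate model. It assumes a 40-bit accumulator and that the fractional mode bit shifts the product left by one bit so that Q15 x Q15 lands in Q31 format. The helper names `to_signed` and `mac` are hypothetical.

```python
def to_signed(value, bits):
    """Wrap an integer into the signed two's-complement range of this width."""
    mask = (1 << bits) - 1
    value &= mask
    return value - (1 << bits) if value >> (bits - 1) else value


def mac(memory, ar2, ar3, acc_a, frct=False):
    """Model of MAC *AR2+, *AR3+, A.

    memory holds signed 16-bit integers; ar2 and ar3 are indices into it.
    Returns the post-incremented pointers and the new accumulator value.
    """
    product = memory[ar2] * memory[ar3]
    if frct:
        product <<= 1  # assumed fractional-mode behaviour: drop the extra sign bit
    acc_a = to_signed(acc_a + product, 40)  # assumed 40-bit accumulator, wraps
    return ar2 + 1, ar3 + 1, acc_a


if __name__ == "__main__":
    # Q15 data: 16384 is 0.5, 8192 is 0.25.
    memory = [16384, 8192, 16384, 16384]
    ar2, ar3, acc = 0, 2, 0
    for _ in range(2):
        ar2, ar3, acc = mac(memory, ar2, ar3, acc, frct=True)
    print(acc / 2**31)  # 0.5 * 0.5 + 0.25 * 0.5
```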

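Key #1: Accumulators + Adder

General-Purpose Math example: t = s+e-r

    LD   @s, A
    ADD  @e, A
    SUB  @r, A
    STL  A, @t

[Diagram: the ALU takes its operands through a MUX from the inputs labeled A, B, C, T, D and Shifter, and writes acc A or acc B; the MAC sits alongside. Busses shown: A Bus, B Bus, U Bus.]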

Key #1: Barrel shifter

    LD   @X, 16, A
    STH  @B, Y

[Diagram: inputs A, B, C, D feed the Barrel Shifter (-16 to +31); its output goes to the ALU over the S Bus and to memory over the E Bus]

Key #1: Temporary register

    LD   @x, T
    MPY  @a, A

For example: A = xa

[Diagram: the Temporary Register (T) is loaded from D or X and from the EXP Encoder (fed by A and B), and drives the T Bus to the MAC and ALU]

Key #2: Efficient data/program flow



Key #2: Multiple busses

MAC *AR2+, *AR3+, A

[Diagram: INTERNAL MEMORY and EXTERNAL MEMORY connect through MUXES to the P, D, C and E busses; the busses feed the T register and the Central Arithmetic Logic Unit (ALU, SHIFTER, MAC, accumulators A and B)]

Key #2: Pipeline

Prefetch (P) -> Fetch (F) -> Decode (D) -> Access (A) -> Read (R) -> Execute (E)

- Prefetch: Calculate address of instruction
- Fetch: Collect instruction
- Decode: Interpret instruction
- Access: Collect address of operand
- Read: Collect operand
- Execute: Perform operation

Key #2: Bus usage

[Diagram: CNTL, PC and ARs drive INTERNAL MEMORY and EXTERNAL MEMORY through MUXES onto the P, D, C and E busses, which feed the T register and the Central Arithmetic Logic Unit (MAC, accumulators A and B, ALU, SHIFTER)]

Key #2: Pipeline performance

CYCLES ->
P1  F1  D1  A1  R1  X1
    P2  F2  D2  A2  R2  X2
        P3  F3  D3  A3  R3  X3
            P4  F4  D4  A4  R4  X4
                P5  F5  D5  A5  R5  X5
                    P6  F6  D6  A6  R6  X6

Fully loaded pipeline: the cycle in which X1, R2, A3, D4, F5 and P6 are all active
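The staircase can be generated mechanically: instruction i enters stage s in cycle i + s. The sketch below is illustrative Python with hypothetical helper names (`pipeline_table`, `in_flight`). It uses the stage letters from the chart (X for Execute) and ignores stalls, so it shows only the ideal case.

```python
STAGES = ["P", "F", "D", "A", "R", "X"]


def pipeline_table(n_instructions, stages=STAGES):
    """Build one row per instruction; each row has one cell per cycle."""
    total_cycles = n_instructions + len(stages) - 1
    rows = []
    for i in range(n_instructions):
        row = [""] * total_cycles
        for s, name in enumerate(stages):
            row[i + s] = f"{name}{i + 1}"  # instruction i+1 is in stage s
        rows.append(row)
    return rows


def in_flight(rows, cycle):
    """Number of instructions occupying a stage in the given cycle (1-based)."""
    return sum(1 for row in rows if row[cycle - 1])


if __name__ == "__main__":
    table = pipeline_table(6)
    for row in table:
        print(" ".join(f"{cell:>3}" for cell in row))
    print("instructions in flight in cycle 6:", in_flight(table, 6))
```

With six stages, the pipeline is fully loaded from cycle 6 onward, and n instructions finish in n + 5 cycles instead of 6n.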

Key #3: Powerful instructions



Key #3: Advanced applications

Application             Instructions
Symmetric FIR filter    FIRS
Adaptive filtering      LMS
Polynomial evaluation   POLY
Code book search        STRCD, SACCD, SRCCD
Viterbi                 DADST, DSADT, CMPS
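For a symmetric filter, h(k) = h(N-1-k), so each pair of mirrored samples can be added first and multiplied once, which halves the multiply count. The Python sketch below shows only that arithmetic idea. It does not model the operands or timing of the FIRS instruction, and the function names are hypothetical.

```python
def fir_plain(h, window):
    """window[k] holds x(n-k); one multiply per tap."""
    return sum(hk * xk for hk, xk in zip(h, window))


def fir_symmetric(h_half, window):
    """Symmetric FIR with an even number of taps N = 2 * len(h_half).

    h_half holds h(0) .. h(N/2 - 1); the other half mirrors it.
    Mirrored samples are added first, then multiplied once per pair.
    """
    n = len(window)
    if n != 2 * len(h_half):
        raise ValueError("window must hold exactly 2 * len(h_half) samples")
    return sum(h_half[k] * (window[k] + window[n - 1 - k])
               for k in range(len(h_half)))


if __name__ == "__main__":
    h_half = [1, 2, 3]
    h_full = h_half + h_half[::-1]
    window = [4, 5, 6, 7, 8, 9]
    print(fir_plain(h_full, window), fir_symmetric(h_half, window))
```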

C62x Architecture


TMS320C6201 Revision 2

[Block diagram]
- Program Cache / Program Memory: 32-bit address, 256-bit data, 512K bits RAM
- Data Memory: 32-bit address; 8-, 16-, 32-bit data; 512K bits RAM
- C6201 CPU Megamodule
  - Program Fetch, Instruction Dispatch, Instruction Decode
  - Data Path 1: A Register File; L1, S1, M1, D1
  - Data Path 2: B Register File; D2, M2, S2, L2
  - Control Registers, Control Logic, Test, Emulation, Interrupts
- Peripherals: 4-channel DMA, Host Port Interface, Ext. Memory Interface, 2 Timers, 2 Multichannel buffered serial ports (T1/E1), Pwr Dwn

C6201 Internal Memory Architecture

- Separate Internal Program and Data Spaces
- Program
  - 16K 32-bit instructions (2K Fetch Packets)
  - 256-bit Fetch Width
  - Configurable as either
    - Direct Mapped Cache, Memory Mapped Program Memory
- Data
  - 32K x 16
  - Single Ported, Accessible by Both CPU Data Buses
  - 4 x 8K 16-bit Banks
    - 2 Possible Simultaneous Memory Accesses (4 Banks)
    - 4-Way Interleave, Banks and Interleave Minimize Access Conflicts
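Two accesses in the same cycle can proceed together only if they touch different banks. The Python sketch below assumes a simple low-order interleave, in which consecutive 16-bit halfwords fall in consecutive banks. The slide says only "4-way interleave", so the exact mapping is an assumption, as are the names `banks_touched` and `conflicts`.

```python
BANKS = 4
BANK_WIDTH_BYTES = 2  # each bank is 16 bits wide


def banks_touched(byte_address, size_bytes):
    """Set of banks used by an access of the given size (assumed interleave)."""
    first = byte_address // BANK_WIDTH_BYTES
    last = (byte_address + size_bytes - 1) // BANK_WIDTH_BYTES
    return {halfword % BANKS for halfword in range(first, last + 1)}


def conflicts(access_a, access_b):
    """True if two (byte_address, size_bytes) accesses share a bank."""
    return bool(banks_touched(*access_a) & banks_touched(*access_b))


if __name__ == "__main__":
    print(banks_touched(0, 4), banks_touched(4, 4))
    print(conflicts((0, 4), (4, 4)))  # adjacent words use different banks
    print(conflicts((0, 4), (8, 4)))  # words 8 bytes apart share banks
```

Under this mapping, two 32-bit accesses to adjacent words never collide, while accesses whose addresses differ by a multiple of 8 bytes always do.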

C62x Interrupts

- 12 Maskable Interrupts, Non-Maskable Interrupt (NMI)
- Interrupt Return Pointers (IRP, NRP)
- Fast Interrupt Handling
  - Branches Directly to 8-Instruction Service Fetch Packet
  - Can Branch out with no overhead for longer service
  - 7 Cycle Overhead: Time When No Code is Running
  - 12 Cycle Latency: Interrupt Response Time
- Interrupt Acknowledge (IACK) and Number (INUM) Signals
- Branch Delay Slots Protected From Interrupts
- Edge Triggered

C62x Datapaths

[Figure: register file A (Registers A0 - A15) with functional units L1, S1, M1, D1, and register file B (Registers B0 - B15) with functional units D2, M2, S2, L2. Cross paths 1X and 2X connect the two sides. Unit ports are labeled S1, S2 and D; the L and S units also carry SL and DL ports. Memory ports: DDATA_I1 and DDATA_I2 (load data), DDATA_O1 and DDATA_O2 (store data), DADR1 and DADR2 (address). Legend: Cross Paths; 40-bit Write Paths (8 MSBs); 40-bit Read Paths/Store Paths.]

Functional Units

- L-Unit (L1, L2)
  - 40-bit Integer ALU, Comparisons
  - Bit Counting, Normalization
- S-Unit (S1, S2)
  - 32-bit ALU, 40-bit Shifter
  - Bitfield Operations, Branching
- M-Unit (M1, M2)
  - 16 x 16 -> 32
- D-Unit (D1, D2)
  - 32-bit Add/Subtract
  - Address Calculations


C62x Instruction Packing: Advanced VLIW

- Fetch Packet
  - CPU fetches 8 instructions/cycle
- Execute Packet
  - CPU executes 1 to 8 instructions/cycle
  - Fetch packets can contain multiple execute packets
- Parallelism determined at compile / assembly time
- Examples
  - 1) 8 parallel instructions
  - 2) 8 serial instructions
  - 3) Mixed Serial/Parallel Groups
    - A // B
    - C
    - D
    - E // F // G // H
- Reduces Codesize, Number of Program Fetches, Power Consumption

[Figure: Examples 1 to 3 each show a fetch packet of eight instructions A B C D E F G H]
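One way to picture how a fetch packet splits into execute packets is a per-instruction "parallel" flag: a 1 means the next instruction issues in the same cycle. The slide does not show the encoding, so the flag (called `p_bit` here) and the function `execute_packets` are assumptions made for this Python sketch.

```python
def execute_packets(fetch_packet):
    """Split a fetch packet into execute packets.

    fetch_packet is a list of (name, p_bit) pairs; p_bit == 1 means the
    following instruction executes in parallel with this one.
    """
    packets, current = [], []
    for name, p_bit in fetch_packet:
        current.append(name)
        if p_bit == 0:
            packets.append(current)
            current = []
    if current:
        # This model simply closes the last packet at the fetch-packet boundary.
        packets.append(current)
    return packets


if __name__ == "__main__":
    names = "ABCDEFGH"
    parallel = list(zip(names, [1, 1, 1, 1, 1, 1, 1, 0]))  # Example 1
    serial = list(zip(names, [0] * 8))                     # Example 2
    mixed = list(zip(names, [1, 0, 0, 0, 1, 1, 1, 0]))     # Example 3
    for packet in (parallel, serial, mixed):
        print(execute_packets(packet))
```

Example 3 comes out as A // B, then C, then D, then E // F // G // H, which takes four cycles to issue instead of eight.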

C62x Pipeline Operation: Pipeline Phases

Fetch: PG PS PW PR   Decode: DP DC   Execute: E1 E2 E3 E4 E5

- Single-Cycle Throughput
- Operate in Lock Step
- Fetch
  - PG: Program Address Generate
  - PS: Program Address Send
  - PW: Program Access Ready Wait
  - PR: Program Fetch Packet Receive
- Decode
  - DP: Instruction Dispatch
  - DC: Instruction Decode
- Execute
  - E1 - E5: Execute 1 through Execute 5

Execute Packet 1  PG PS PW PR DP DC E1 E2 E3 E4 E5
Execute Packet 2     PG PS PW PR DP DC E1 E2 E3 E4 E5
Execute Packet 3        PG PS PW PR DP DC E1 E2 E3 E4 E5
Execute Packet 4           PG PS PW PR DP DC E1 E2 E3 E4 E5
Execute Packet 5              PG PS PW PR DP DC E1 E2 E3 E4 E5
Execute Packet 6                 PG PS PW PR DP DC E1 E2 E3 E4 E5
Execute Packet 7                    PG PS PW PR DP DC E1 E2 E3 E4 E5

C62x Pipeline Operation: Delay Slots

- Delay Slots: number of extra cycles until result is:
  - written to register file
  - available for use by a subsequent instruction
- Multi-cycle NOP instruction can fill delay slots while minimizing codesize impact

Instruction type    Execute phases                              Delay slots
Most Instructions   E1                                          No Delay
Integer Multiply    E1 E2                                       1 Delay Slot
Loads               E1 E2 E3 E4 E5                              4 Delay Slots
Branches            E1, then Branch Target PG PS PW PR DP DC E1 5 Delay Slots
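The delay-slot counts translate directly into a scheduling rule: a consumer must issue no earlier than the producer's issue cycle plus its delay slots plus one. The Python sketch below applies that rule to the C62x figures in the table. The dictionary keys and function names are hypothetical, and the model ignores everything except the issue cycle.

```python
DELAY_SLOTS = {
    "single_cycle": 0,  # most instructions
    "multiply": 1,      # integer multiply
    "load": 4,
    "branch": 5,
}


def result_ready_cycle(issue_cycle, kind):
    """First cycle in which a later instruction can use the result."""
    return issue_cycle + DELAY_SLOTS[kind] + 1


def nops_needed(producer_issue, kind, consumer_issue):
    """Cycles of NOP to insert before a consumer scheduled too early."""
    return max(0, result_ready_cycle(producer_issue, kind) - consumer_issue)


if __name__ == "__main__":
    # A load in cycle 0 followed immediately by its consumer needs NOP 4.
    print(nops_needed(0, "load", 1))
    # If two independent instructions fill cycles 1 and 2, only NOP 2 remains.
    print(nops_needed(0, "load", 3))
```

A compiler or assembly programmer tries to fill these cycles with independent work; the multi-cycle NOP covers whatever is left.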

C6000 Pipeline Operation Benefits

- Cycle Time
  - Allows 6 ns cycle time on C67x
  - Allows 5 ns cycle time & single cycle execution on C62x
- Parallelism
  - 8 new instructions can always be dispatched every cycle
- High Performance Internal Memory Access
  - Pipelined Program and Data Accesses
  - Two 32-bit Data Accesses/Cycle (C62x)
  - Two 64-bit Data Accesses/Cycle (C67x)
  - 256-bit Program Access/Cycle
- Good Compiler Target
  - Visible: No Variable-Length Pipeline Flow
  - Deterministic: Order and Time of Execution
  - Orthogonal: Independent Instructions

C6000 Instruction Set Features: Conditional Instructions

- All Instructions can be Conditional
  - A1, A2, B0, B1, B2 can be used as Conditions
  - Based on Zero or Non-Zero Value
  - Compare Instructions can allow other Conditions (<, >, etc.)
- Reduces Branching
- Increases Parallelism

C6000 Instruction Set Addressing Features

- Load-Store Architecture
- Two Addressing Units (D1, D2)
- Orthogonal
  - Any Register can be used for Addressing or Indexing
- Signed/Unsigned Byte, Half-Word, Word, Double-Word Addressable
- Register or 5-Bit Unsigned Constant Index
  - Indexes are Scaled by Type

C6000 Instruction Set Addressing Features

- Indirect Addressing Modes
  - Pre-Increment     *++R[index]
  - Post-Increment    *R++[index]
  - Pre-Decrement     *--R[index]
  - Post-Decrement    *R--[index]
  - Positive Offset   *+R[index]
  - Negative Offset   *-R[index]
- 15-bit Positive/Negative Constant Offset from Either B14 or B15
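The six modes differ in whether the offset is applied before or after the access and whether the base register is updated. The Python sketch below spells that out, with the index scaled by the access type as the addressing slides describe. The function name `effective_address` and the mode strings are chosen for this example, and the sketch assumes a plain linear address space.

```python
SCALE = {"byte": 1, "halfword": 2, "word": 4, "doubleword": 8}


def effective_address(mode, base, index, access="word"):
    """Return (address used for the access, new value of the base register)."""
    offset = index * SCALE[access]  # indexes are scaled by access type
    if mode == "*++R":   # pre-increment: update, then access
        return base + offset, base + offset
    if mode == "*R++":   # post-increment: access, then update
        return base, base + offset
    if mode == "*--R":   # pre-decrement
        return base - offset, base - offset
    if mode == "*R--":   # post-decrement
        return base, base - offset
    if mode == "*+R":    # positive offset, base unchanged
        return base + offset, base
    if mode == "*-R":    # negative offset, base unchanged
        return base - offset, base
    raise ValueError(f"unknown addressing mode: {mode}")


if __name__ == "__main__":
    for mode in ("*++R", "*R++", "*--R", "*R--", "*+R", "*-R"):
        address, new_base = effective_address(mode, 0x100, 1)
        print(f"{mode:5} access {address:#x}, base becomes {new_base:#x}")
```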

C6000 Instruction Set Addressing Features

- Circular Addressing
  - Fast and Low Cost: Power of 2 Sizes and Alignment
  - Up to 8 Different Pointers/Buffers, Up to 2 Different Buffer Sizes
- Dual Endian Support
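Restricting buffers to power-of-2 sizes and matching alignment is what makes circular addressing cheap: the wrap is a bit mask on the low address bits, with no compare or subtract. The Python sketch below is a minimal illustration. The function name `circular_add` is hypothetical, and the block size is passed as a plain byte count instead of through the processor's mode registers.

```python
def circular_add(pointer, increment, block_size):
    """Advance a pointer inside a power-of-2 sized, size-aligned buffer."""
    if block_size <= 0 or block_size & (block_size - 1):
        raise ValueError("block size must be a power of 2")
    mask = block_size - 1
    base = pointer & ~mask  # aligned start of the buffer: upper bits stay fixed
    return base | ((pointer + increment) & mask)  # only the low bits wrap


if __name__ == "__main__":
    # 32-byte buffer at 0x1000: stepping past the end wraps to the start.
    print(hex(circular_add(0x101C, 8, 32)))
    # Stepping backwards from the start wraps to the end.
    print(hex(circular_add(0x1000, -4, 32)))
```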

C67x Architecture


TMS320C6701 DSP Block Diagram

[Block diagram]
- Program Cache / Program Memory: 32-bit address, 256-bit data, 512K bits RAM
- Data Memory: 32-bit address; 8-, 16-, 32-bit data; 512K bits RAM
- C67x Floating-Point CPU Core
  - Program Fetch, Instruction Dispatch, Instruction Decode
  - Data Path 1: A Register File; L1, S1, M1, D1
  - Data Path 2: B Register File; D2, M2, S2, L2
  - Control Registers, Control Logic, Test, Emulation, Interrupts
- Peripherals: 4 Channel DMA, Host Port Interface, External Memory Interface, 2 Timers, 2 Multichannel buffered serial ports (T1/E1), Power Down

TMS320C6701 Advanced VLIW CPU (VelociTI™)

- 1 GFLOPS @ 167 MHz
  - 6-ns cycle time
  - 6 x 32-bit floating-point instructions/cycle
- Load store architecture
- 3.3-V I/Os, 1.8-V internal
- Single- and double-precision IEEE floating-point
- Dual data paths
  - 6 floating-point units / 8 x 32-bit instructions

TMS320C6701 Memory/Peripherals

- Same as C6201
- External interface supports
  - SDRAM, SRAM, SBSRAM
- 4-channel bootloading DMA
- 16-bit host port interface
- 1 Mbit on-chip SRAM
- 2 multichannel buffered serial ports (T1/E1)
- Pin compatible with C6201

TMS320C67x CPU Core

[Block diagram: C67x Floating-Point CPU Core]
- Program Fetch, Instruction Dispatch, Instruction Decode
- Data Path 1: A Register File; L1, S1, M1, D1
- Data Path 2: B Register File; D2, M2, S2, L2
- Control Registers, Control Logic, Test, Emulation, Interrupts
- Callouts: Arithmetic Logic Unit, Auxiliary Logic Unit, Multiplier Unit, each with Floating-Point Capabilities

C67x Interrupts

- 12 Maskable Interrupts
- Non-Maskable Interrupt (NMI)
- Interrupt Return Pointers (IRP, NRP)
- Fast Interrupt Handling
  - Branches Directly to 8-Instruction Service Fetch Packet
  - 7 Cycle Overhead: Time When No Code is Running
  - 12 Cycle Latency: Interrupt Response Time
- Interrupt Acknowledge (IACK) and Number (INUM) Signals
- Branch Delay Slots Protected From Interrupts
- Edge Triggered

C67x New Instructions

.L Unit (Floating Point Arithmetic Unit)
  ADDSP, ADDDP, SUBSP, SUBDP, INTSP, INTDP, SPINT, DPINT, SPTRUNC, DPTRUNC, DPSP

.M Unit (Floating Point Multiply Unit)
  MPYSP, MPYDP, MPYI, MPYID, MPY24, MPY24H

.S Unit (Floating Point Auxiliary Unit)
  ABSSP, ABSDP, CMPGTSP, CMPEQSP, CMPLTSP, CMPGTDP, CMPEQDP, CMPLTDP, RCPSP, RCPDP, RSQRSP, RSQRDP, SPDP

C67x Datapaths

- 2 Data Paths
- 8 Functional Units
  - Orthogonal/Independent
  - 2 Floating Point Multipliers
  - 2 Floating Point Arithmetic
  - 2 Floating Point Auxiliary
- Control
  - Independent
  - Up to 8 32-bit Instructions
- Registers
  - 2 Files
  - 32, 32-bit registers total
- Cross paths (1X, 2X)
- L-Unit (L1, L2)
  - Floating-Point, 40-bit Integer ALU
  - Bit Counting, Normalization
- S-Unit (S1, S2)
  - Floating Point Auxiliary Unit
  - 32-bit ALU/40-bit shifter
  - Bitfield Operations, Branching
- M-Unit (M1, M2)
  - Multiplier: Integer & Floating-Point
- D-Unit (D1, D2)
  - 32-bit add/subtract
  - Addr Calculations

[Figure: register files A0 - A15 and B0 - B15 with functional units L1, S1, M1, D1 and D2, M2, S2, L2, and cross paths 1X and 2X]

C67x Instruction Packing: Enhanced VLIW

- Fetch Packet
  - CPU fetches 8 instructions/cycle
- Execute Packet
  - CPU executes 1 to 8 instructions/cycle
  - Fetch packets can contain multiple execute packets
- Parallelism determined at compile/assembly time
- Examples
  - 1) 8 parallel instructions
  - 2) 8 serial instructions
  - 3) Mixed Serial/Parallel Groups
    - A // B
    - C
    - D
    - E // F // G // H
- Reduces
  - Codesize
  - Number of Program Fetches
  - Power Consumption

[Figure: Examples 1 to 3 each show a fetch packet of eight instructions A B C D E F G H]

C67x Pipeline Operation: Pipeline Phases

Fetch: PG PS PW PR   Decode: DP DC   Execute: E1 E2 E3 E4 E5 E6 E7 E8 E9 E10

- Operate in Lock Step
- Fetch
  - PG: Program Address Generate
  - PS: Program Address Send
  - PW: Program Access Ready Wait
  - PR: Program Fetch Packet Receive
- Decode
  - DP: Instruction Dispatch
  - DC: Instruction Decode
- Execute
  - E1 - E5: Execute 1 through Execute 5
  - E6 - E10: Double Precision Only

Execute Packet 1  PG PS PW PR DP DC E1 E2 E3 E4 E5 E6 E7 E8 E9 E10
Execute Packet 2     PG PS PW PR DP DC E1 E2 E3 E4 E5 E6 E7 E8 E9 E10
Execute Packet 3        PG PS PW PR DP DC E1 E2 E3 E4 E5 E6 E7 E8 E9 E10
Execute Packet 4           PG PS PW PR DP DC E1 E2 E3 E4 E5 E6 E7 E8 E9 E10
Execute Packet 5              PG PS PW PR DP DC E1 E2 E3 E4 E5 E6 E7 E8 E9 E10
Execute Packet 6                 PG PS PW PR DP DC E1 E2 E3 E4 E5 E6 E7 E8 E9 E10
Execute Packet 7                    PG PS PW PR DP DC E1 E2 E3 E4 E5 E6 E7 E8 E9 E10

C67x Pipeline Operation: Delay Slots

- Delay Slots: number of extra cycles until result is:
  - written to register file
  - available for use by a subsequent instruction
- Multi-cycle NOP instruction can fill delay slots while minimizing codesize impact

Instruction type    Execute phases                              Delay slots
Most Integer        E1                                          No Delay
Single-Precision    E1 E2 E3 E4                                 3 Delay Slots
Loads               E1 E2 E3 E4 E5                              4 Delay Slots
Branches            E1, then Branch Target PG PS PW PR DP DC E1 5 Delay Slots

C67x and C62x Commonality

- Driving commonality between C67x & C62x shortens C67x design time.
- Maintaining symmetry between datapaths shortens the C67x design time.

[Figure: side-by-side block diagrams of the C62x CPU and the C67x CPU]

C62x CPU
- Program Fetch & Dispatch; Decode; two Register files; Control Registers; Emulation
- M-Unit 1, M-Unit 2: Multiplier Unit
- D-Unit 1, D-Unit 2: Data Load/Store
- S-Unit 1, S-Unit 2: Auxiliary Logic Unit
- L-Unit 1, L-Unit 2: Arithmetic Logic Unit

C67x CPU
- Program Fetch & Dispatch; Decode; two Register files; Control Registers; Emulation
- M-Unit 1, M-Unit 2: Multiplier Unit with Floating Point
- D-Unit 1, D-Unit 2: Data Load/Store
- S-Unit 1, S-Unit 2: Auxiliary Logic Unit with Floating Point
- L-Unit 1, L-Unit 2: Arithmetic Logic Unit with Floating Point

TMS320C80 MIMD MULTIPROCESSOR Texas Instruments - 1996


Copyright 1999

