
SC554: Embedded Systems Design

Designing Custom Single-purpose Processors

Outline
Introduction
Combinational logic
Sequential logic
Custom single-purpose processor design

Embedded Systems Design: A Unified Hardware/Software Introduction, (c) 2000 Vahid/Givargis

Introduction
Processor
Digital circuit that performs computation tasks
Controller and datapath
General-purpose: variety of computation tasks
Single-purpose: one particular computation task
Custom single-purpose: non-standard task

A custom single-purpose processor may be
Fast, small, low power
But with high NRE cost, longer time-to-market, less flexibility

[Figure: Digital camera chip. Blocks: lens, CCD, A2D, CCD preprocessor, Pixel coprocessor, D2A, JPEG codec, Microcontroller, Multiplier/Accum, DMA controller, Display ctrl, Memory controller, ISA bus interface, UART, LCD ctrl]

Basic logic gates


Single-input gates

x    Driver    Inverter
     F = x     F = x'
0    0         1
1    1         0

Two-input gates

x y    AND       OR          XOR          NAND         NOR            XNOR
       F = xy    F = x + y   F = x ⊕ y    F = (xy)'    F = (x + y)'   F = (x ⊕ y)'
0 0    0         0           0            1            1              1
0 1    0         1           1            1            0              0
1 0    0         1           1            1            0              0
1 1    1         1           0            0            0              1
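The truth tables above can be reproduced with a few lines of Python. This is only an illustrative sketch: the `GATES` dictionary and the `truth_table` helper are names chosen here, not part of the slides, and bits are modelled as the integers 0 and 1.

```python
from itertools import product

# Each two-input gate maps bits x, y (0 or 1) to one output bit.
GATES = {
    "AND":  lambda x, y: x & y,
    "OR":   lambda x, y: x | y,
    "XOR":  lambda x, y: x ^ y,
    "NAND": lambda x, y: 1 - (x & y),
    "NOR":  lambda x, y: 1 - (x | y),
    "XNOR": lambda x, y: 1 - (x ^ y),
}

def truth_table(gate):
    """Return the F column for inputs (x, y) = 00, 01, 10, 11."""
    return [gate(x, y) for x, y in product((0, 1), repeat=2)]

for name, gate in GATES.items():
    print(f"{name:5s}", truth_table(gate))
```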


Combinational logic design


Specification
Describes the relation between inputs and outputs
Must often be refined until it is formal and complete
Outputs at time T depend only on inputs at time T

Truth tables
Are a common way of expressing binary functions

Synthesis and optimization
Transforms truth tables into logic equations
Manipulates equations to meet quality criteria such as area or timing

Combinational logic design


A) Problem description
y is 1 if a is 1, or if b and c are both 1.
z is 1 if b or c is 1, but not both, or if all three inputs are 1.

B) Truth table
Inputs     Outputs
a b c      y z
0 0 0      0 0
0 0 1      0 1
0 1 0      0 1
0 1 1      1 0
1 0 0      1 0
1 0 1      1 1
1 1 0      1 1
1 1 1      1 1

C) Output equations
y = a'bc + ab'c' + ab'c + abc' + abc
z = a'b'c + a'bc' + ab'c + abc' + abc

D) Minimized output equations (Karnaugh maps)

y       bc = 00  01  11  10
a = 0         0   0   1   0
a = 1         1   1   1   1

y = a + bc

z       bc = 00  01  11  10
a = 0         0   1   0   1
a = 1         0   1   1   1

z = ab + b'c + bc'

E) Logic gates
[Figure: gate-level circuit with inputs a, b, c and outputs y and z]
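As a quick sanity check, the minimized equations can be compared against the problem description for all eight input rows. The sketch below is illustrative only; `spec` and `minimized` are names introduced here, and z is taken as ab + b'c + bc', the form read off the K-map.

```python
from itertools import product

def spec(a, b, c):
    """Outputs (y, z) written directly from the problem description."""
    y = 1 if a == 1 or (b == 1 and c == 1) else 0
    z = 1 if (b != c) or (a == 1 and b == 1 and c == 1) else 0
    return y, z

def minimized(a, b, c):
    """Outputs (y, z) from the minimized equations."""
    inv = lambda v: 1 - v                        # the prime (NOT) operator
    y = a | (b & c)                              # y = a + bc
    z = (a & b) | (inv(b) & c) | (b & inv(c))    # z = ab + b'c + bc'
    return y, z

rows = list(product((0, 1), repeat=3))
mismatches = [r for r in rows if spec(*r) != minimized(*r)]
print("rows where the two forms disagree:", mismatches)
```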


Combinational components
n-bit, m x 1 Multiplexor
Inputs: I(m-1) ... I1 I0 (n bits each), select lines S0 ... S(log m)
Output: O (n bits)
O = I0 if S=0..00, I1 if S=0..01, ..., I(m-1) if S=1..11

log n x n Decoder
Inputs: I(log n -1) ... I0
Outputs: O(n-1) ... O1 O0
O0 = 1 if I=0..00, O1 = 1 if I=0..01, ..., O(n-1) = 1 if I=1..11
With enable input e: all Os are 0 if e=0

n-bit Adder
Inputs: A, B (n bits each)
Outputs: carry, sum (n bits)
sum = A+B (first n bits), carry = (n+1)th bit of A+B
With carry-in input Ci: sum = A + B + Ci

n-bit Comparator
Inputs: A, B (n bits each)
Outputs: less, equal, greater
less = 1 if A<B, equal = 1 if A=B, greater = 1 if A>B

n-bit, m-function ALU
Inputs: A, B (n bits each), select lines S0 ... S(log m)
Output: O (n bits)
O = A op B, where op is determined by S
May have status outputs carry, zero, etc.
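The behaviour of these components can be sketched as small Python functions. This is a behavioural model under stated assumptions, not a gate-level design: words are plain integers, the function names are chosen here for illustration, and the four ALU operations (add, subtract, AND, OR) are a hypothetical function set, since the slide leaves the operations open.

```python
def mux(inputs, s):
    """m x 1 multiplexor: O = I[s]."""
    return inputs[s]

def decoder(i, n, e=1):
    """log n x n decoder: returns [O0, ..., O(n-1)]; all 0 when e = 0."""
    return [1 if (e == 1 and k == i) else 0 for k in range(n)]

def adder(a, b, n, ci=0):
    """n-bit adder: returns (carry, sum) with sum truncated to n bits."""
    total = a + b + ci
    return (total >> n) & 1, total & ((1 << n) - 1)

def comparator(a, b):
    """n-bit comparator: returns (less, equal, greater)."""
    return int(a < b), int(a == b), int(a > b)

def alu(a, b, s, n):
    """n-bit ALU; the operation set below is an assumption for illustration."""
    ops = [a + b, a - b, a & b, a | b]
    return ops[s] & ((1 << n) - 1)

print(adder(9, 8, 4))      # 4-bit addition that overflows into carry
print(decoder(2, 4))       # output O2 is asserted
```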


Exercises
Design a 2-bit comparator (that compares two 2-bit words) with a single output less-than, using the combinational design technique. Start from a truth table, use K-maps to minimize logic and draw the final circuit.

Design a 3x8 decoder. Start from a truth table, use K-maps to minimize logic and draw the final circuit.

Sequential components
n-bit Register
Inputs: I (n bits), load, clear
Output: Q (n bits)
Q = 0 if clear=1, I if load=1 and clock=1, Q(previous) otherwise

n-bit Shift register
Inputs: I, shift
Output: Q
Q = lsb; content is shifted; I is stored in msb

n-bit Counter
Inputs: count, clear
Output: Q (n bits)
Q = 0 if clear=1, Q(prev)+1 if count=1 and clock=1
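A minimal behavioural sketch of the three components follows, with one `clock()` call standing for one clock edge. Several details are assumptions made here rather than statements from the slide: clear is applied on the clock edge and takes priority, the shift register shifts toward the lsb, and the counter wraps around after reaching its maximum value.

```python
class Register:
    """n-bit register: Q = 0 on clear, I on load, otherwise unchanged."""
    def __init__(self, n):
        self.mask = (1 << n) - 1
        self.q = 0

    def clock(self, i=0, load=0, clear=0):
        if clear:
            self.q = 0
        elif load:
            self.q = i & self.mask
        return self.q

class ShiftRegister:
    """n-bit shift register: Q is the lsb; I enters at the msb on a shift."""
    def __init__(self, n):
        self.n = n
        self.bits = 0

    def clock(self, i=0, shift=0):
        if shift:
            self.bits = (self.bits >> 1) | ((i & 1) << (self.n - 1))
        return self.bits & 1          # Q after the clock edge

class Counter:
    """n-bit counter: Q = 0 on clear, Q + 1 on count (wrap-around assumed)."""
    def __init__(self, n):
        self.mask = (1 << n) - 1
        self.q = 0

    def clock(self, count=0, clear=0):
        if clear:
            self.q = 0
        elif count:
            self.q = (self.q + 1) & self.mask
        return self.q

counter = Counter(2)
print([counter.clock(count=1) for _ in range(5)])
```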


Sequential logic design


Sequential elements are used to
Break complex computations into successive, well-defined stages
Store temporary results
Model and implement behaviours where the output at time T depends on the inputs at times Ti < T

This implies
Defining a base for discrete time
Electronic phenomena evolve in continuous time

Defining a suitable model of computation
Finite State Machines

Sequential logic design


Accounting for past input values can be done by
Explicitly storing the values
Infeasible when a system must consider long sequences

Defining the concept of state

A state
Condenses all the information regarding past inputs
Indicates the current operating condition of a system

A transition
Occurs as a consequence of an event
A change of the inputs or a clock event

May change the state and produce some output

Sequential logic design


Synchronous FSM model of computation
The number of states is finite
Transitions occur only on a clock event
May be subject to given conditions on inputs

Sequential logic design


Representations of FSM
Graphical: State diagram
Textual: State table

Sequential logic design


A) Problem description
You want to construct a clock divider. Slow down your preexisting clock so that you output a 1 for every four clock cycles.

B) State diagram
[Figure: four states 0, 1, 2, 3 with output x shown in each state. On a=1 the machine advances 0 -> 1 -> 2 -> 3 -> 0; on a=0 it stays in the current state.]

C) Implementation model
[Figure: combinational logic with inputs a, Q1, Q0 and outputs x, I1, I0; a state register holds Q1 Q0 and is loaded from I1 I0]

D) State table (Moore-type)
Inputs        Outputs
Q1 Q0 a       I1 I0   x
0  0  0       0  0    0
0  0  1       0  1
0  1  0       0  1    0
0  1  1       1  0
1  0  0       1  0    1
1  0  1       1  1
1  1  0       1  1    1
1  1  1       0  0

Given this implementation model, sequential logic design quickly reduces to combinational logic design
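The implementation model can be simulated directly from the state table: a lookup stands in for the combinational logic and a tuple stands in for the state register. This is a sketch under assumptions. The x column is taken as it appears in the state table above (one value per state: 0, 0, 1, 1), and the two next-state equations in the final check are derived here from the table, not taken from the slide.

```python
# (Q1, Q0, a) -> (I1, I0), copied from the state table.
STATE_TABLE = {
    (0, 0, 0): (0, 0), (0, 0, 1): (0, 1),
    (0, 1, 0): (0, 1), (0, 1, 1): (1, 0),
    (1, 0, 0): (1, 0), (1, 0, 1): (1, 1),
    (1, 1, 0): (1, 1), (1, 1, 1): (0, 0),
}
# Moore output: x depends on the state only.
X_OUTPUT = {(0, 0): 0, (0, 1): 0, (1, 0): 1, (1, 1): 1}

def simulate(a_values, state=(0, 0)):
    """Apply one value of input a per clock cycle; return (x per cycle, final state)."""
    xs = []
    for a in a_values:
        xs.append(X_OUTPUT[state])
        state = STATE_TABLE[state + (a,)]   # state register loads I1 I0
    return xs, state

xs, final_state = simulate([1] * 8)
print(xs, final_state)

# Next-state equations derived from the table (an illustration, not from the slide):
# I1 = Q1 xor (Q0 and a), I0 = Q0 xor a.
equations_match = all(
    STATE_TABLE[(q1, q0, a)] == (q1 ^ (q0 & a), q0 ^ a)
    for q1 in (0, 1) for q0 in (0, 1) for a in (0, 1)
)
```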

Exercise
Design a 3-bit counter that counts the following sequence: 1, 2, 4, 5, 7, 1, 2, etc. This counter has an output odd whose value is 1 when the current count value is odd. Use the sequential design technique we learned. Start from a state diagram, draw the state table, minimize the logic, and draw the final circuit.


Exercise
Four lights are connected to a decoder. Build a logic circuit that will blink the lights in the following order: 0, 2, 1, 3, 0, 2, ... Start from the state diagram, draw the state table, minimize the logic and draw the final circuit.

[Figure: a controller with outputs S1 and S2 feeding a decoder that drives the lights]

Custom single-purpose processor basic model


[Figure: controller and datapath. The controller receives external control inputs and datapath control outputs, and produces external control outputs and datapath control inputs. The datapath receives external data inputs and produces external data outputs.]

[Figure: a view inside the controller and datapath. Controller: next-state and control logic, state register. Datapath: registers, functional units.]

Finite State Machine with Data


An FSMD is composed of
A finite state machine: FSM
A datapath: DP

FSM
Controls the evolution of the system
Evolves based on events generated as results from the datapath

DP
Performs combinational operations
Data is fed to operators according to control signals generated by the finite state machine

Example: greatest common divisor


First create algorithm
Convert algorithm to "complex" state machine
Known as FSMD: finite-state machine with data
Can use templates to perform such conversion

(a) black-box view
[Figure: GCD block with inputs go_i, x_i, y_i and output d_o]

(b) desired functionality
0: int x, y;
1: while (1) {
2:   while (!go_i);
3:   x = x_i;
4:   y = y_i;
5:   while (x != y) {
6:     if (x < y)
7:       y = y - x;
       else
8:       x = x - y;
     }
9:   d_o = x;
   }

(c) state diagram
State   Action        Transitions
1:                    1 -> 2 (the !1 branch is never taken)
2:                    !go_i -> 2-J, !(!go_i) -> 3
2-J:                  -> 2
3:      x = x_i       -> 4
4:      y = y_i       -> 5
5:                    x!=y -> 6, !(x!=y) -> 9
6:                    x<y -> 7, !(x<y) -> 8
7:      y = y - x     -> 6-J
8:      x = x - y     -> 6-J
6-J:                  -> 5-J
5-J:                  -> 5
9:      d_o = x       -> 1-J
1-J:                  -> 1
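The FSMD can be stepped through in software, one state per loop iteration. The sketch below is an illustration only: `gcd_fsmd` is a name chosen here, go_i is held at 1, the transition structure follows the controller state table given later in the deck, and the run stops at state 1-J instead of looping forever. It assumes positive integer inputs, as the subtraction-based algorithm does not terminate if one input is 0.

```python
def gcd_fsmd(x_i, y_i):
    """Run one GCD computation; return (d_o, list of states visited)."""
    go_i = 1
    x = y = d_o = 0
    state, trace = "1", []
    while True:
        trace.append(state)
        if state == "1":
            state = "2"
        elif state == "2":
            state = "3" if go_i else "2-J"
        elif state == "2-J":
            state = "2"
        elif state == "3":
            x = x_i
            state = "4"
        elif state == "4":
            y = y_i
            state = "5"
        elif state == "5":
            state = "6" if x != y else "9"
        elif state == "6":
            state = "7" if x < y else "8"
        elif state == "7":
            y = y - x
            state = "6-J"
        elif state == "8":
            x = x - y
            state = "6-J"
        elif state == "6-J":
            state = "5-J"
        elif state == "5-J":
            state = "5"
        elif state == "9":
            d_o = x
            state = "1-J"
        else:                      # state "1-J": one computation finished
            return d_o, trace

d_o, trace = gcd_fsmd(42, 8)
print(d_o, len(trace))
```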


State diagram templates


Assignment statement
a = b
next statement
Template: a state that performs a = b, followed by the next statement

Loop statement
while (cond) { loop-body-statements }
next statement
Template: condition state C: goes to the loop-body-statements on cond and to the next statement on !cond; the loop body ends in join state J:, which returns to C:

Branch statement
if (c1) c1 stmts
else if c2 c2 stmts
else other stmts
next statement
Template: condition state C: goes to c1 stmts on c1, to c2 stmts on !c1*c2, and to the other stmts on !c1*!c2; all branches meet in join state J:, which leads to the next statement

Creating the datapath


Create a register for any declared variable
Create a functional unit for each arithmetic operation
Connect the ports, registers and functional units
Based on reads and writes
Use multiplexors for multiple sources to registers

Create unique identifier
for each datapath component control input and output

[Figure: the GCD state diagram next to the resulting datapath. Datapath: inputs x_i and y_i; two n-bit 2x1 multiplexors controlled by x_sel and y_sel; registers 0: x and 0: y loaded by x_ld and y_ld; comparator != (5: x!=y) producing x_neq_y; comparator < (6: x<y) producing x_lt_y; subtractor 8: x-y; subtractor 7: y-x; register 9: d loaded by d_ld and driving d_o]

Creating the controller's FSM

Same structure as FSMD
Replace complex actions or conditions with Boolean datapath configurations

Controller FSM
Encoding  State   Datapath configuration      Transitions
0000      1:                                  -> 2
0001      2:                                  !go_i -> 2-J, go_i -> 3
0010      2-J:                                -> 2
0011      3:      x_sel = 0, x_ld = 1         -> 4
0100      4:      y_sel = 0, y_ld = 1         -> 5
0101      5:                                  x_neq_y -> 6, !x_neq_y -> 9
0110      6:                                  x_lt_y -> 7, !x_lt_y -> 8
0111      7:      y_sel = 1, y_ld = 1         -> 6-J
1000      8:      x_sel = 1, x_ld = 1         -> 6-J
1001      6-J:                                -> 5-J
1010      5-J:                                -> 5
1011      9:      d_ld = 1                    -> 1-J
1100      1-J:                                -> 1

[Figure: datapath with inputs x_i and y_i, n-bit 2x1 multiplexors (x_sel, y_sel), registers 0: x and 0: y (x_ld, y_ld), comparators != (x_neq_y) and < (x_lt_y), subtractors 8: x-y and 7: y-x, and register 9: d (d_ld) driving d_o]

Splitting into a controller and datapath


[Figure: (a) Controller implementation model. Combinational logic with inputs go_i, x_neq_y, x_lt_y and Q3 Q2 Q1 Q0, and outputs x_sel, y_sel, x_ld, y_ld, d_ld and I3 I2 I1 I0; a state register holds Q3 Q2 Q1 Q0 and is loaded from I3 I2 I1 I0]

[Figure: controller FSM with state encodings 0000 (state 1) through 1100 (state 1-J); transitions are labelled with go_i, x_neq_y and x_lt_y, and states 3, 4, 7, 8 and 9 set x_sel, y_sel, x_ld, y_ld and d_ld]

[Figure: (b) Datapath. Inputs x_i and y_i; n-bit 2x1 multiplexors; registers 0: x, 0: y and 9: d; comparators != and <; subtractors 8: x-y and 7: y-x; output d_o]

Controller state table for the GCD example


Inputs                                    Outputs
Q3 Q2 Q1 Q0  x_neq_y  x_lt_y  go_i        I3 I2 I1 I0  x_sel  y_sel  x_ld  y_ld  d_ld
0  0  0  0   *        *       *           0  0  0  1   X      X      0     0     0
0  0  0  1   *        *       0           0  0  1  0   X      X      0     0     0
0  0  0  1   *        *       1           0  0  1  1   X      X      0     0     0
0  0  1  0   *        *       *           0  0  0  1   X      X      0     0     0
0  0  1  1   *        *       *           0  1  0  0   0      X      1     0     0
0  1  0  0   *        *       *           0  1  0  1   X      0      0     1     0
0  1  0  1   0        *       *           1  0  1  1   X      X      0     0     0
0  1  0  1   1        *       *           0  1  1  0   X      X      0     0     0
0  1  1  0   *        0       *           1  0  0  0   X      X      0     0     0
0  1  1  0   *        1       *           0  1  1  1   X      X      0     0     0
0  1  1  1   *        *       *           1  0  0  1   X      1      0     1     0
1  0  0  0   *        *       *           1  0  0  1   1      X      1     0     0
1  0  0  1   *        *       *           1  0  1  0   X      X      0     0     0
1  0  1  0   *        *       *           0  1  0  1   X      X      0     0     0
1  0  1  1   *        *       *           1  1  0  0   X      X      0     0     1
1  1  0  0   *        *       *           0  0  0  0   X      X      0     0     0
1  1  0  1   *        *       *           0  0  0  0   X      X      0     0     0
1  1  1  0   *        *       *           0  0  0  0   X      X      0     0     0
1  1  1  1   *        *       *           0  0  0  0   X      X      0     0     0

(* = don't-care input, X = don't-care output)
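To see the controller and datapath working together, the state table can be turned into a lookup function and connected to a software model of the datapath. This is a sketch under assumptions: `controller` and `run_gcd` are names introduced here, go_i is held at 1, all registers update together at the end of each simulated clock cycle, the unused codes 1101 to 1111 return to 0000 as in the table, and the run ends in the cycle where d_ld is asserted. Inputs are assumed to be positive integers.

```python
X = None  # don't-care output

def controller(q, x_neq_y, x_lt_y, go_i):
    """Return (next state, x_sel, y_sel, x_ld, y_ld, d_ld) for state code q."""
    table = {
        0b0000: (0b0001, X, X, 0, 0, 0),                            # 1:
        0b0001: (0b0011 if go_i else 0b0010, X, X, 0, 0, 0),        # 2:
        0b0010: (0b0001, X, X, 0, 0, 0),                            # 2-J:
        0b0011: (0b0100, 0, X, 1, 0, 0),                            # 3: x = x_i
        0b0100: (0b0101, X, 0, 0, 1, 0),                            # 4: y = y_i
        0b0101: (0b0110 if x_neq_y else 0b1011, X, X, 0, 0, 0),     # 5:
        0b0110: (0b0111 if x_lt_y else 0b1000, X, X, 0, 0, 0),      # 6:
        0b0111: (0b1001, X, 1, 0, 1, 0),                            # 7: y = y - x
        0b1000: (0b1001, 1, X, 1, 0, 0),                            # 8: x = x - y
        0b1001: (0b1010, X, X, 0, 0, 0),                            # 6-J:
        0b1010: (0b0101, X, X, 0, 0, 0),                            # 5-J:
        0b1011: (0b1100, X, X, 0, 0, 1),                            # 9: d_o = x
        0b1100: (0b0000, X, X, 0, 0, 0),                            # 1-J:
    }
    return table.get(q, (0b0000, X, X, 0, 0, 0))

def run_gcd(x_i, y_i, max_cycles=100_000):
    """Simulate clock cycles until d is loaded; return (d_o, cycles used)."""
    q, x, y = 0b0000, 0, 0
    for cycle in range(1, max_cycles + 1):
        # Status signals come from the comparators in the datapath.
        nxt, x_sel, y_sel, x_ld, y_ld, d_ld = controller(q, int(x != y), int(x < y), 1)
        if d_ld:
            return x, cycle                     # register d loads x, driving d_o
        # 2x1 multiplexors: select 0 picks the external input, 1 the subtractor.
        new_x = (x - y if x_sel else x_i) if x_ld else x
        new_y = (y - x if y_sel else y_i) if y_ld else y
        q, x, y = nxt, new_x, new_y             # clock edge
    raise RuntimeError("no result within max_cycles")

print(run_gcd(42, 8))
```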


Completing the GCD custom single-purpose processor design


We finished the datapath
We have a state table for the next-state and control logic
All that's left is combinational logic design

This is not an optimized design, but we see the basic steps

[Figure: a view inside the controller and datapath. Controller: next-state and control logic, state register. Datapath: registers, functional units.]

Exercise
Design a single-purpose processor that outputs Fibonacci numbers up to n places. Start with a function computing the desired result, translate it into a state diagram (FSMD), and sketch a probable datapath.

Design a circuit that does the matrix multiplication of matrices A and B. Matrix A is 3x2 and matrix B is 2x3.

Optimizing single-purpose processors


Optimization is the task of making design metric values the best possible

Optimization opportunities
original program
FSMD
datapath
FSM

Optimizing the original program


Analyze program attributes and look for areas of possible improvement

number of computations
Approximation
Complexity reduction

size of variable
Integer vs. floating point

time and space complexity

operations used
multiplication and division very expensive

Optimizing the original program (cont)


replace the subtraction operation(s) with modulo operation in order to speed up program

original program
0: int x, y;
1: while (1) {
2:   while (!go_i);
3:   x = x_i;
4:   y = y_i;
5:   while (x != y) {
6:     if (x < y)
7:       y = y - x;
       else
8:       x = x - y;
     }
9:   d_o = x;
   }

GCD(42, 8) - 9 iterations to complete the loop
x and y values evaluated as follows: (42, 8), (34, 8), (26, 8), (18, 8), (10, 8), (2, 8), (2, 6), (2, 4), (2, 2)

optimized program
0: int x, y, r;
1: while (1) {
2:   while (!go_i);
     // x must be the larger number
3:   if (x_i >= y_i) {
4:     x = x_i;
5:     y = y_i;
     }
6:   else {
7:     x = y_i;
8:     y = x_i;
     }
9:   while (y != 0) {
10:    r = x % y;
11:    x = y;
12:    y = r;
     }
13:  d_o = x;
   }

GCD(42, 8) - 3 iterations to complete the loop
x and y values evaluated as follows: (42, 8), (8, 2), (2, 0)
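The two traces can be reproduced with a short Python sketch of both loops. The function names are chosen here for illustration, and each function records the (x, y) pair seen at every evaluation of the loop condition, which is how the traces above are counted. Inputs are assumed to be positive integers.

```python
def gcd_subtract(x, y):
    """Original program: repeated subtraction. Returns (gcd, pairs evaluated)."""
    pairs = [(x, y)]
    while x != y:
        if x < y:
            y = y - x
        else:
            x = x - y
        pairs.append((x, y))
    return x, pairs

def gcd_modulo(x_i, y_i):
    """Optimized program: modulo. Returns (gcd, pairs evaluated)."""
    # x must be the larger number
    x, y = (x_i, y_i) if x_i >= y_i else (y_i, x_i)
    pairs = [(x, y)]
    while y != 0:
        r = x % y
        x = y
        y = r
        pairs.append((x, y))
    return x, pairs

for f in (gcd_subtract, gcd_modulo):
    result, pairs = f(42, 8)
    print(f.__name__, result, len(pairs), pairs)
```

The modulo version trades many cheap subtractions for a few expensive modulo operations, so whether it is faster in hardware depends on the cost of the divider.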


Optimizing the FSMD


Areas of possible improvements

merge states
states with constants on transitions can be eliminated, since the transition taken is already known
states with independent operations can be merged

separate states
states which require complex operations (a*b*c*d) can be broken into smaller states to reduce hardware size

scheduling
organization of the operations in time

Optimizing the FSMD (cont.)


original FSMD
[Figure: the 13-state GCD FSMD with states 1, 2, 2-J, 3, 4, 5, 6, 7, 8, 6-J, 5-J, 9 and 1-J]

Optimization steps
eliminate state 1: transitions have constant values
merge state 2 and state 2-J: no loop operation in between them
merge state 3 and state 4: assignment operations are independent of one another
merge state 5 and state 6: transitions from state 6 can be done in state 5
eliminate state 5-J and 6-J: transitions from each state can be done from state 7 and state 8, respectively
eliminate state 1-J: transition from state 1-J can be done directly from state 9

optimized FSMD
int x, y;
State   Action                  Transitions
2:                              !go_i -> 2, go_i -> 3
3:      x = x_i, y = y_i        -> 5
5:                              x<y -> 7, x>y -> 8, !(x!=y) -> 9
7:      y = y - x               -> 5
8:      x = x - y               -> 5
9:      d_o = x                 -> 2

Optimizing the datapath


Sharing of functional units
one-to-one mapping, as done previously, is not necessary
if the same operation occurs in different states, those occurrences can share a single functional unit

Multi-functional units
ALUs support a variety of operations; an ALU can be shared among operations occurring in different states (that is, at different moments in time)

Optimizing the FSM


State encoding
task of assigning a unique bit pattern to each state in an FSM
size of state register and combinational logic vary
can be treated as an ordering problem

State minimization
task of merging equivalent states into a single state
two states are equivalent if, for all possible input combinations, they generate the same outputs and transition to the same next state
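State minimization can be sketched with partition refinement, a standard textbook technique that is not described on the slide. It generalizes the definition above slightly: two states are grouped when they produce the same outputs and their next states fall in the same group, which also catches states whose successors are themselves equivalent. The function names and the four-state example machine below are hypothetical.

```python
def renumber(signature):
    """Replace each distinct signature by a small integer class id."""
    ids = {v: n for n, v in enumerate(sorted(set(signature.values())))}
    return {s: ids[v] for s, v in signature.items()}

def equivalent_classes(states, inputs, output, next_state):
    """Group equivalent states; output and next_state are keyed by (state, input)."""
    # Start by grouping states that produce identical outputs for every input.
    key = renumber({s: tuple(output[(s, i)] for i in inputs) for s in states})
    while True:
        # Split groups whose members lead to different groups.
        new_key = renumber({
            s: (key[s], tuple(key[next_state[(s, i)]] for i in inputs))
            for s in states
        })
        if len(set(new_key.values())) == len(set(key.values())):
            break                     # no group was split: partition is stable
        key = new_key
    classes = {}
    for s in states:
        classes.setdefault(key[s], []).append(s)
    return sorted(classes.values())

# Hypothetical machine with one input a; B and D behave identically.
states, inputs = ["A", "B", "C", "D"], [0, 1]
next_state = {("A", 0): "A", ("A", 1): "B", ("B", 0): "C", ("B", 1): "A",
              ("C", 0): "A", ("C", 1): "D", ("D", 0): "C", ("D", 1): "A"}
output = {("A", 0): 0, ("A", 1): 0, ("B", 0): 0, ("B", 1): 1,
          ("C", 0): 0, ("C", 1): 0, ("D", 0): 0, ("D", 1): 1}
classes = equivalent_classes(states, inputs, output, next_state)
print(classes)
```

In this example B and D match the slide's definition literally, while A and C are merged only because their a=1 successors (B and D) are equivalent to each other.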


Summary
Custom single-purpose processors
Straightforward design techniques
Can be built to execute algorithms
Typically start with FSMD
CAD tools can be of great assistance
