POKL.04.01.02-00-209/11
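[Figure: finite state machine. Input X and current state Qn feed the next-state function f (PLE), which produces Qn+1; the state memory M is clocked by CLK; the output function g (PLE) produces Y. The X input of g is present only for the Mealy type.]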
Microprogrammable Circuit
Reducing size of memory
In each state only one bit of X is considered
Requires modified coding for equivalent behavior
X input multiplexer
Size depends on width of Q vector
Modification of functionality forces rewiring of input
signals
Microprogrammable Circuit
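[Figure: microprogrammable circuit with an input multiplexer. The current state Qn selects a single input bit X[Qn]; f (PLE) computes Qn+1, M stores the state on CLK, and g (PLE) produces Y.]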
Microprogrammable Circuit
Independent selection of X input from Q
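[Figure: microprogrammable circuit with independent input selection. The stored word is {Qn, Sn}; Sn selects the input bit X[Sn]; f (PLE) computes {Qn+1, Sn+1}; g (PLE) produces Y; the memory M is clocked by CLK.]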
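[Figure: counter-based microprogrammed circuit. The counter CNT (load input LD, jump address A) addresses the ROM holding the microprogram (μProg); XSEL selects one bit of input X; the output register REG Y (load input LD) holds Y; CLK clocks the counter and the register.]
ROM: 〈INSTR, ARG〉
Instr  Arg 1  Arg 2
JMP    XSEL   A
ST     Y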
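The counter-plus-ROM idea can be tried out in software. The following is a minimal sketch, not the circuit from the slide: the tuple encoding of ROM words, the function name run_microprogram and the rule "JMP loads the counter when the selected input bit is 1" are assumptions made for illustration.

```python
# Hypothetical ROM word encoding (an assumption, not taken from the slides):
#   ("JMP", xsel, addr): load the counter with addr when input bit X[xsel] is 1,
#                        otherwise count up
#   ("ST", y, None):     store y in the output register, then count up
def run_microprogram(rom, inputs):
    """Run one ROM word per clock; inputs is a list of input bit tuples."""
    cnt = 0        # CNT: microprogram address counter
    reg_y = None   # REG Y: output register
    trace = []
    for x in inputs:
        instr, arg1, arg2 = rom[cnt]
        if instr == "JMP":
            cnt = arg2 if x[arg1] else (cnt + 1) % len(rom)
        elif instr == "ST":
            reg_y = arg1
            cnt = (cnt + 1) % len(rom)
        else:
            raise ValueError("unknown microinstruction: " + instr)
        trace.append((cnt, reg_y))
    return trace


# Example microprogram: wait until X[0] becomes 1, then set Y = 1.
# X[1] is assumed to be tied to 1, which gives an unconditional jump.
ROM = [
    ("ST", 0, None),   # 0: Y = 0
    ("JMP", 0, 3),     # 1: if X[0] then go to 3
    ("JMP", 1, 1),     # 2: go back to 1
    ("ST", 1, None),   # 3: Y = 1
    ("JMP", 1, 0),     # 4: restart
]
TRACE = run_microprogram(ROM, [(0, 1), (0, 1), (0, 1), (1, 1), (1, 1), (1, 1)])
```

Changing the behaviour only requires a new ROM content, which is the main attraction of the microprogrammed approach.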
Data Processing
How to implement complex operations with the use of a
Finite State Machine
Mixing processed data with control flow
Complex, cumbersome and inefficient
Idea – separation of data and control
Data path
Control path
Data Processing
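[Figure: data path and control path. Data path: input XD, registers RA and RB feeding the ALU, result register RX, output YD. Control path: counter CNT (signals JMP, LD, address A) addresses the ROM (μProg); the instruction decoder (Instr Dec) drives the data path; XSEL selects a bit of the control input XC; clock CLK.]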
First Microprocessor
Problem (after Charles Babbage)
Calculate n values of f(x) = x^2 with a given step h
Solution
Calculation based on progressive differences
$$f(x_0 + h) = f(x_0) + \Delta f(x_0) + \Delta^2 f(x_0) + \dots + \Delta^n f(x_0)$$
Difference definition
$$\Delta f(x_0) = f(x_0 + h) - f(x_0)$$
$$\Delta^2 f(x_0) = \Delta f(x_0 + h) - \Delta f(x_0) = f(x_0 + 2h) - 2 f(x_0 + h) + f(x_0)$$
$$\Delta^n f(x_0) = \Delta^{n-1} f(x_0 + h) - \Delta^{n-1} f(x_0)$$
First Microprocessor
Variables
f(x) = R0
Δf(x) = R1
Δ²f(x) = R2 – constant value (do we need a variable?)
n = R3 – counter
Calculations – arithmetic operations
Δf(xi + h) = Δf(xi) + Δ²f(xi)
f(xi + h) = f(xi) + Δf(xi)
n = n – 1 (decrement)
n ≠ 0 (repeat while n is nonzero)
First Microprocessor
Instruction set
1. LDI DST VAL
4. JNZ ADDR
5. IN DST - IOA
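[Figure: ALU. The adder ADD takes operand A and a second operand chosen by the ADD/DEC select signal: B (select 0) or the constant -1 (select 1). The zero detector Z drives the flag ZF.]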
First Microprocessor
Register file
Addressable set of registers – D type flip-flops
Instant access to 2 arguments
Selection of source for result written to selected register
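[Figure: register file. RES_SEL selects the result source RES from ALU, IN or IMM; WERES enables the write and ARES addresses the destination register; AARG1 and AARG2 address the two outputs ARG1 and ARG2.]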
First Microprocessor
Control unit
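Microprogrammed, counter based
Control signals decoding
Conditional jump
[Figure: control unit. The counter CNT addresses the ROM (μProg); the instruction decoder (Instr Dec) generates IMM, ARES, AARG1, AARG2, ADD/DEC, LD_Z, RES_SEL and WERES; the signals ZF, JZ and LD control loading of the counter; clock CLK.]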
First Microprocessor
And finally…
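[Figure: complete microprocessor. Register file (RES, ARG1, ARG2, with RES_SEL, WERES, ARES, AARG1, AARG2), ALU (ADD with the ADD/DEC operand select between ARG2 and -1, zero detector Z, flag ZF loaded by LD_Z), IO ports IN and OUT (WEIO, AIO), immediate value IMM, and the control unit (CNT, ROM with μProg, Instr Dec, LD, JZ), all clocked by CLK.]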
First Microprocessor
Program
INIT:
IN R0,F0 ;Initial value of f0
IN R1,dF0 ;Initial value of df0
IN R2,d2F0 ;Initial value of d2f0
IN R3,n ;Number of iterations
LOOP:
ADD R1,R1,R2 ;Calculate dfi = dfi-1 + d2fi
ADD R0,R0,R1 ;Calculate fi = fi-1 + dfi
OUT Fi,R0 ;Output result
DEC R3 ;Iterate given number of times
JNZ LOOP
…
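The loop of this program can be mirrored in a few lines. The following is a minimal sketch, assuming integer registers without overflow; the function name tabulate is hypothetical. Because the program updates R1 before adding it to R0, the initial value loaded into R1 has to be the backward difference f(x0) - f(x0 - h); this is an observation about the program order, not something stated on the slide.

```python
def tabulate(f0, df0, d2f0, n):
    """Mirror of the LOOP section; requires n >= 1, like the hardware loop."""
    r0, r1, r2, r3 = f0, df0, d2f0, n   # IN R0..R3
    out = []
    while True:
        r1 = r1 + r2        # ADD R1,R1,R2
        r0 = r0 + r1        # ADD R0,R0,R1
        out.append(r0)      # OUT Fi,R0
        r3 = r3 - 1         # DEC R3
        if r3 == 0:         # JNZ LOOP falls through when R3 reaches 0
            break
    return out


# f(x) = x^2, x0 = 0, h = 1:
# f0 = 0, df0 = f(0) - f(-1) = -1, d2f0 = 2 * h^2 = 2
SQUARES = tabulate(0, -1, 2, 4)
```

Only additions and a decrement are used, so the ALU with the ADD/DEC select is sufficient to tabulate the polynomial.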
Microcomputer structure
Bus architecture
μP Memory
AB Address Bus
DB Data Bus
CB Control Bus
Input Output
Microprocessor Z80
Microprocessor general architecture
8-bit data processing
16-bit address space
Limited support for 16-bit operations
Developed as successor of 8080 – 8085 line
Backward code compatible
Introduction of additional instructions
Microprocessor Z80
Microprocessor module
Bus architecture
Address and Data
Control bus
Memory and IO access
Access timing
Interrupt system
Bus request - Direct Memory Access
Bus Cycle
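Basic bus cycles
Instruction fetch
Normal
Interrupt
Data Read
Data Write
IO Read
IO Write
[Figure: timing diagram of the FETCH, MEMR, MEMW, IOR and IOW cycles. ADDR carries a 16-bit address in FETCH, MEMR and MEMW and an 8-bit address in IOR and IOW; D carries DIN, DIN, DOUT, DIN and DOUT respectively; control signals CLK, nRD, nWR, nMREQ, nIORQ, nM1, nWAIT.]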
Bus cycles
Basic bus cycles
[Figure: timing of the FETCH, MEMR, MEMW, IOR and IOW cycles; signals CLK, ADDR (16-bit for fetch and memory cycles, 8-bit for IO cycles), nRD, nWR, nMREQ, nIORQ, nM1, nWAIT.]
Bus cycles
Fetch vs Interrupt
[Figure: timing of the FETCH and INT cycles side by side; ADDR carries ADDR16 and D carries DIN in both; signals CLK, nRD, nWR, nMREQ, nIORQ, nM1, nWAIT.]
Memory access – read
Memory access – write
Bus Cycles – length control
[Figure: MEMR and MEMW cycles with memory timing parameters: tAA and tACS for read, tAA and tWP for write; signals nRD, nWR, nMREQ, nIORQ, nM1, nWAIT.]
Bus Cycles - program
System architecture
Organizing memory system
Memory map
Decoder design
Accommodating wait cycles
IO System
Special address space
Limited instruction set and addressing modes
Memory mapped system – benefits from the rich
set of memory access instructions
Connecting memory
RAM: MEM:0x0000 – 0xEFFF
Connecting IO devices
RAM MEM:0x0000 – 0xFFFF
Timer IO: 0x20-0x23
Connecting IO devices
RAM MEM:0x0000 – 0xEFFF
UART MEM: 0xFF00-0xFF07
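The address ranges listed above translate directly into decoder conditions on the upper address bits. The following is a minimal sketch of such a decoder; the function names decode_memory and decode_io are hypothetical, and combining RAM at 0x0000 to 0xEFFF, the memory-mapped UART and the IO-mapped timer into one system is an assumption.

```python
def decode_memory(addr):
    """Select a device for a memory cycle (nMREQ active)."""
    addr &= 0xFFFF
    if (addr & 0xF000) != 0xF000:      # A15..A12 not all ones: 0x0000-0xEFFF
        return "RAM"
    if (addr & 0xFFF8) == 0xFF00:      # A15..A3 fixed, A2..A0 select a register
        return "UART"
    return None                        # unmapped area


def decode_io(port):
    """Select a device for an IO cycle (nIORQ active, 8-bit address)."""
    port &= 0xFF
    if (port & 0xFC) == 0x20:          # A7..A2 fixed, A1..A0 select a register
        return "TIMER"
    return None
```

Each comparison against a masked constant corresponds to one AND gate (or one comparator) on the address lines, qualified by nMREQ or nIORQ.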
Instruction Set
Single address machine
Instruction set – basic set
Arithmetic instructions - ADD, SUB
Logic instructions - AND, OR, NOT
Data transfer instructions – MOV
Jump instructions - JMP, CALL, RET
Control Instructions – NOP, HALT, EI, DI etc.
Instruction Set - extensions
Z80 extended instructions
Indexed addressing instructions
Introduction of index registers IX and IY
String operations
Automatic repetition of instruction until condition met or
executed given number of times
Relative jumps
Reduced
Bit operations
Test, set and reset
Register organization
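Registers
General purpose
8 bit
16 bit pairs
Stack pointer
Program counter
Index registers
Interrupt
Refresh
[Figure: register set. A and F (pair PSW); general purpose registers B C (pair B), D E (pair D) and H L (pair H, memory pointer M); I (interrupt) and R (refresh); pointer registers IX, IY, SP and PC.]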
Addressing modes
Hardware assisted addressing modes
Immediate
Register
Direct addressing
Register indirect
Stack - LIFO
Advanced addressing modes
Implemented programmatically
Flag register
Bit:  7  6  5  4  3  2    1  0
Flag: S  Z  -  H  -  P/V  N  C
C – carry in/out
N – last operation was ADD/SUB (used by DAA)
P/V – parity or overflow
H – half carry
Z – zero result or bit transfer
S – sign of the result (copy of the result's MSB)
Flags highlighted in green on the original slide can be used as jump
conditions (the condition list names Z, C, P/V and S)
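How the individual flags follow from an operation can be shown on an 8-bit addition. The following is a minimal sketch using the bit positions from the table above; the function name add8 is hypothetical, P/V is computed as signed overflow (its meaning for arithmetic operations), and the two unused bits are left at 0 as a simplification.

```python
def add8(a, b, carry_in=0):
    """Return (result, F) for an 8-bit addition with optional carry in."""
    a &= 0xFF
    b &= 0xFF
    total = a + b + carry_in
    res = total & 0xFF
    s = (res >> 7) & 1                              # S: copy of the MSB
    z = 1 if res == 0 else 0                        # Z: zero result
    h = 1 if (a & 0xF) + (b & 0xF) + carry_in > 0xF else 0   # H: carry from bit 3
    v = 1 if (~(a ^ b) & (a ^ res) & 0x80) else 0   # P/V: signed overflow
    n = 0                                           # N: 0 after an addition
    c = 1 if total > 0xFF else 0                    # C: carry out of bit 7
    f = (s << 7) | (z << 6) | (h << 4) | (v << 2) | (n << 1) | c
    return res, f
```

For example, 0x7F + 0x01 gives 0x80 with S, H and P/V set, while 0xFF + 0x01 gives 0x00 with Z, H and C set.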
Data transfer
Register to register
MOV dst, src: bits 7..0 = 0 1 d d d s s s
Immediate to register
MVI dst, imm8: bits 7..0 = 0 0 d d d 1 1 0, followed by imm8
SUI imm8
ACI imm8
SBI imm8
Encoding, bits 7..0: 1 1 0 O CY 1 1 0, followed by imm8
16-bit addition
DAD rp HL ← HL + rp
Logic instructions
Register data
ANA reg
ORA reg
XRA reg
CMA
Encoding, bits 7..0: 1 0 1 O1 O0 s s s
O1 O0 operation: 00 – AND, 01 – XOR, 10 – OR, 11 – CMP
Immediate data
ANI imm8
ORI imm8
XRI imm8
Encoding, bits 7..0: 1 1 1 O1 O0 1 1 0, followed by imm8
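The bit-field layout can be checked by assembling a few opcodes by hand. The following is a minimal sketch based on the standard 8080 opcode map, in which the operation field O1 O0 is 00 for AND, 01 for XOR, 10 for OR and 11 for CMP; the table and function names are hypothetical.

```python
REG = {"B": 0, "C": 1, "D": 2, "E": 3, "H": 4, "L": 5, "M": 6, "A": 7}
LOGIC_OP = {"AND": 0b00, "XOR": 0b01, "OR": 0b10, "CMP": 0b11}


def encode_logic_reg(op, src):
    """Opcode 1 0 1 O1 O0 s s s for ANA, XRA, ORA and CMP with a register."""
    return 0b10100000 | (LOGIC_OP[op] << 3) | REG[src]


def encode_logic_imm(op, imm8):
    """Opcode 1 1 1 O1 O0 1 1 0 followed by the immediate byte."""
    return bytes([0b11100110 | (LOGIC_OP[op] << 3), imm8 & 0xFF])
```

For instance, encode_logic_reg("AND", "B") yields 0xA0, the opcode of ANA B.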
Jump Instructions
Simple jumps
JMP addr16 PC = addr16
Subroutine service
CALL addr16
RST n simplified call – restart PC = 8n
RET restore PC from stack
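Stack organization
PUSH rp
SP ← SP – 1
M(SP) ← rpH
SP ← SP – 1
M(SP) ← rpL
POP rp
rpL ← M(SP)
SP ← SP + 1
rpH ← M(SP)
SP ← SP + 1
Special reg. pair – PSW = {A, F}
[Figure: memory layout with program and data at the bottom, free space above them and the stack at the top, growing downward; after PUSH, rpL lies at SP and rpH at SP + 1.]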
Jump Instructions
Subroutine service details
CALL, RST
M(SP – 1) ← PCH
M(SP – 2) ← PCL
SP ← SP – 2
PC ← imm16 (CALL) or PC ← 8n (RST)
RET
PCL ← M(SP)
PCH ← M(SP + 1)
SP ← SP + 2
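The stack operations can be modelled with a byte array and a 16-bit stack pointer. The following is a minimal sketch, assuming the 8080 byte order in which the low byte ends up at the lower address, so that POP and RET read the low byte from M(SP); the class name Cpu and the initial SP value are hypothetical, and PC is assumed to already point at the instruction following CALL.

```python
class Cpu:
    def __init__(self, sp=0xF000):
        self.mem = bytearray(0x10000)
        self.sp = sp
        self.pc = 0

    def push(self, value):
        self.sp = (self.sp - 1) & 0xFFFF
        self.mem[self.sp] = (value >> 8) & 0xFF    # high byte first
        self.sp = (self.sp - 1) & 0xFFFF
        self.mem[self.sp] = value & 0xFF           # low byte at the new SP

    def pop(self):
        low = self.mem[self.sp]
        self.sp = (self.sp + 1) & 0xFFFF
        high = self.mem[self.sp]
        self.sp = (self.sp + 1) & 0xFFFF
        return (high << 8) | low

    def call(self, addr16):
        self.push(self.pc)       # save the return address
        self.pc = addr16 & 0xFFFF

    def rst(self, n):
        self.call(8 * n)         # simplified call: PC = 8n

    def ret(self):
        self.pc = self.pop()     # restore PC from the stack
```

A call followed by a return leaves both PC and SP exactly as they were, which is what makes nesting of subroutines possible.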
Jump Instructions – conditional
Condition description
Z – zero NZ – not zero
C – carry NC – not carry
PE – parity even PO – parity odd
P - plus M – minus
Condition merge
Jcc Simple jump
Ccc Call
Rcc Return
Jump Instructions – conditional
Instruction encoding
Bits 7..0: 1 1 f1 f0 fv j1 j0 0, followed by imm16L and imm16H for jumps and calls
f1,f0 – flag selection
fv – flag value (0/1)
j1,j0 – jump kind
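Decoding such an opcode is a matter of extracting three bit fields. The following is a minimal sketch based on the standard 8080 opcode map, where the three condition bits f1 f0 fv run from NZ (000) to M (111) and the kind field j1 j0 is 00 for return, 01 for jump and 10 for call; the names used are hypothetical.

```python
CONDITIONS = ["NZ", "Z", "NC", "C", "PO", "PE", "P", "M"]   # indexed by f1 f0 fv
KINDS = {0b00: "R", 0b01: "J", 0b10: "C"}                   # indexed by j1 j0


def decode_conditional(opcode):
    """Return a mnemonic such as 'JNZ', or None for any other opcode."""
    if (opcode & 0b11000001) != 0b11000000:   # bits 7,6 must be 1 and bit 0 must be 0
        return None
    kind = KINDS.get((opcode >> 1) & 0b11)
    if kind is None:                          # j1 j0 = 11 is not a conditional branch
        return None
    return kind + CONDITIONS[(opcode >> 3) & 0b111]
```

For example, decode_conditional(0xC2) gives "JNZ", while the unconditional JMP (0xC3) is rejected because bit 0 is set.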
Division
Exemplary operation
A = 5, B = 2 (A/B) → Q = 2, R = 1
B = 0 0 1 0
Step  R<B  Q        R        A
init       - - - -  0 0 0 0  0 1 0 1
1     1    - - - 0  0 0 0 0  1 0 1 -
2     1    - - 0 0  0 0 0 1  0 1 - -
3     0    - 0 0 1  0 0 1 0  1 - - -
4     1    0 0 1 0  0 0 0 1  - - - -
Division
Algorithm
A – dividend
B – divisor
C – counter
Q – quotient
R – remainder
START
C = 8, R = 0
Repeat:
{R,A} = {R,A} << 1
If R < B: Q = Q << 1
Otherwise: R = R – B, Q = (Q << 1) + 1
C = C – 1
Until C = 0
FINISH
Division – final implementation
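DIV_8: ; In: L = dividend, B = divisor. Out: E = quotient, H = remainder
MVI C,8 ; Number of bits
MVI H,0 ; R = 0
DV8_LP:
DAD H ; {R,A} = {R,A} << 1 (HL = HL + HL)
MOV A,H
SUB B ; A = R - B, CY = 1 when R < B
JC DV8_SKIP ; Do not subtract
MOV H,A ; R = R - B
DV8_SKIP:
CMC ; CY = next quotient bit
MOV A,E
RAL ; Q = (Q << 1) + CY
MOV E,A
DCR C
JNZ DV8_LP
RET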
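The shift-and-subtract algorithm can be cross-checked against ordinary integer division. The following is a minimal sketch of the same loop; the function name div8 is hypothetical, and unlike the 8-bit register H the variable r is not truncated, so the sketch stays correct for every nonzero divisor.

```python
def div8(a, b):
    """Divide the 8-bit value a by b; return (quotient, remainder)."""
    if b == 0:
        raise ZeroDivisionError("divisor must be nonzero")
    a &= 0xFF
    r = 0
    q = 0
    for _ in range(8):                 # C = 8 iterations
        ra = ((r << 8) | a) << 1       # {R,A} = {R,A} << 1
        r = ra >> 8
        a = ra & 0xFF
        if r < b:
            q = (q << 1) & 0xFF        # quotient bit 0
        else:
            r -= b                     # R = R - B
            q = ((q << 1) | 1) & 0xFF  # quotient bit 1
    return q, r
```

Running div8(5, 2) reproduces the worked example: quotient 2, remainder 1.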
Exchange array items
Array
Array begin - pointer
Item size
Accessed by index
Transforming array index to address (pointer)
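A_i = B + i · s_item
[Flowchart: START; the loop body ends with Ai = Ai + 1, Aj = Aj + 1, CNT = CNT – 1; repeat until CNT = 0; FINISH.]
MUL_9: ;HL = 9·DE
MOV H,D
MOV L,E
DAD H ;2·i
DAD H ;4·i
DAD H ;8·i
DAD D ;8·i + i
MUL_20: ;HL = 20·HL, uses DE
DAD H ;2·i
DAD H ;4·i
MOV E,L
MOV D,H ;DE = 4·i
DAD H ;8·i
DAD H ;16·i
DAD D ;16·i + 4·i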
Exchange array items
XCH:
;Calculate address of A[i] and A[j]
...
MVI C,SIZE ;Element size
XCH_LP:
MOV B,M ;TMP1 = M(HL), HL points to TAB(i)
LDAX D ;TMP2 = M(DE), DE points to TAB(j)
MOV M,A ;M(HL) = TMP2
MOV A,B
STAX D ;M(DE) = TMP1
INX H ;Ai = Ai + 1
INX D ;Aj = Aj + 1
DCR C ;CNT = CNT - 1
JNZ XCH_LP
RET
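The address calculation and the byte-by-byte exchange can be sketched at a higher level. The following is a minimal illustration on a byte array standing in for memory; the function names item_address and exchange are hypothetical, and the two items are assumed not to overlap.

```python
def item_address(base, index, item_size):
    """A_i = B + i * s_item"""
    return base + index * item_size


def exchange(mem, base, i, j, item_size):
    """Swap items i and j of an array stored in mem, one byte at a time."""
    ai = item_address(base, i, item_size)   # pointer to A[i] (HL in the routine)
    aj = item_address(base, j, item_size)   # pointer to A[j] (DE in the routine)
    for _ in range(item_size):              # CNT = item size
        mem[ai], mem[aj] = mem[aj], mem[ai]
        ai += 1                             # Ai = Ai + 1
        aj += 1                             # Aj = Aj + 1


MEM = bytearray(range(16))
exchange(MEM, 4, 0, 2, 3)   # array at address 4, 3-byte items, swap items 0 and 2
```

The multiplication by the item size is the part that the MUL_9 and MUL_20 style routines implement with shifts and additions.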
Subprogram arguments
Let's consider
subp arg1, arg2,…,argN
CPU Registers
Small number
Static allocation
Simple argument
Global variable access
Reentrant or recursive call ?
Common variables for each
instance of subprogram
[Figure: memory layout from top to bottom: stack, data, static subprogram area holding Arg. PPn to Arg. PP1, program (instructions).]
Subprogram arguments
How to create independent arguments for each
called instance?
Use of stack
Arguments are pushed on the stack before
subprogram call
Pushing on the stack does not require use of PUSH
instruction
The argument area is released by the caller just after returning from
the subprogram
Subprogram arguments
Detailed view of the stack after calling subprogram
[Figure: stack after calling the subprogram, from the beginning of the stack down to Arg 0, above the program (instructions).]
Subprogram arguments
Problem
[Figure: stack before and after the call; after the call SP points to Ret Adr Lo; data and program (instructions) lie below.]
lxi B,ARG1_A
push B
mvi C,ARG2_A
mvi B,0
push B
call SUBP
Subprogram arguments
Problem
Accessing arguments
[Figure: stack after the call, with Arg 0 Lo at SP+2, above the return address.]
Area allocation = SP modification
The area is allocated by the subprogram just after the call
Should be released before executing the RET instruction
[Figure: stack with local variables, Var 0 Hi at SP+1 and Var 0 Lo at SP (top of the stack); data and program (instructions) lie below.]
Local variables
Let's consider the following
var Var0 : word; Var1 : byte;
Requirements
[Figure: stack with Var 0 Hi at SP+1 and Var 0 Lo at SP (top of the stack), above the data area.]
PROLOG:
lxi H,-VARS_SIZE
dad SP
sphl
EPILOG:
lxi H,VARS_SIZE
dad SP
sphl
ret
Local variables
[Figure: stack with Var 0 Hi at SP+1, above the data and program (instructions) areas.]
Accessing variables
GET_V0:
lxi H,0 ;var offset
dad SP
mov C,M
inx H
mov B,M
PUT_V0:
lxi H,0 ;var offset
dad SP
mov M,C
inx H
mov M,B
Arguments and variables
Putting it all together
Stack frame
Arguments
Created and released by caller
Variables
Created and released by callee
[Figure: stack from its beginning to the top: variables of the calling subprogram, then Args, Ret Adr and Vars of Sub2 at the top of the stack; data and program (instructions) lie below.]
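The life cycle of a stack frame can be traced step by step in a small model. The following is a minimal sketch, assuming two 16-bit arguments, a 3-byte local area (one word and one byte, as in the Var0/Var1 example) and low-byte-first storage; the function names and all numeric values are hypothetical.

```python
def push16(mem, sp, value):
    """Push a 16-bit value; return the new SP (low byte at the new SP)."""
    sp = (sp - 1) & 0xFFFF
    mem[sp] = (value >> 8) & 0xFF
    sp = (sp - 1) & 0xFFFF
    mem[sp] = value & 0xFF
    return sp


def read16(mem, addr):
    return mem[addr] | (mem[addr + 1] << 8)


def write16(mem, addr, value):
    mem[addr] = value & 0xFF
    mem[addr + 1] = (value >> 8) & 0xFF


def frame_demo():
    mem = bytearray(0x10000)
    sp = 0xF000
    vars_size = 3                        # Var0: word, Var1: byte
    # Caller: push the arguments, then CALL pushes the return address
    sp = push16(mem, sp, 0x1111)         # first pushed argument
    sp = push16(mem, sp, 0x2222)         # last pushed argument
    sp = push16(mem, sp, 0x0105)         # return address
    # Callee prolog: allocate local variables by moving SP
    sp = (sp - vars_size) & 0xFFFF
    write16(mem, sp + 0, 0xBEEF)         # Var0 at offset 0 from SP
    var0 = read16(mem, sp + 0)
    last_arg = read16(mem, sp + vars_size + 2)    # skip locals and return address
    first_arg = read16(mem, sp + vars_size + 4)
    # Callee epilog: release locals, then RET pops the return address
    sp = (sp + vars_size) & 0xFFFF
    ret_addr = read16(mem, sp)
    sp = (sp + 2) & 0xFFFF
    # Caller: release the argument area
    sp = (sp + 4) & 0xFFFF
    return {"var0": var0, "first_arg": first_arg, "last_arg": last_arg,
            "ret_addr": ret_addr, "sp": sp}
```

Every offset is relative to the current SP, so each call instance gets its own arguments and variables, which is what reentrant and recursive calls need.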
Interrupt system
How to recognize external and internal events
requiring attention
State register polling
Time consuming
Complex and difficult to program
Interrupting main program - idea
What does microprocessor do ?
How to insert/inject an external instruction not coming from
main program ?
What kind of instruction should be inserted ?
Polling method
Start Main Loop
Init
Dev #0 Service
Dev #1 Service
Dev #2 Service
Instruction insertion
Inserting external instruction
INTERRUPT cycle instead of FETCH
Collect instruction to IR
Disable incrementing of PC until collecting entire instruction
(for multibyte instructions)
Disable interrupt system automatically
What kind of instruction can be inserted
Subroutine call instructions
Allow returning to main program and restoring its operation
Simplified RST n
Full CALL addr16
Interrupt fetch cycle
Comparison of FETCH and INTERRUPT cycles
[Figure: timing of the FETCH and INT cycles side by side; ADDR carries ADDR16 and D carries DIN in both; signals CLK, nRD, nWR, nMREQ, nIORQ, nM1, nWAIT.]
Interrupt cycle – CALL
Interrupt source recognition
Vectored system handled by the CPU
Instead of an interrupt instruction, the vector (device
identifier) is passed
Interrupt vector array – translates the identifier into the
interrupt service routine (ISR) address
Operations
Get vector in INTERRUPT cycle
Push return address
Get ISR address as {IREG, 2VECT}
Put ISR address into PC
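The address formation step can be illustrated numerically. The following is a minimal sketch that reads {IREG, 2VECT} as the address of a vector table entry holding the ISR address, low byte first, which is how Z80 interrupt mode 2 works; that reading, the function name isr_address and the sample values are assumptions.

```python
def isr_address(mem, ireg, vect):
    """Look up the ISR address for a device vector.

    The table entry address is formed from the I register (high byte)
    and 2 * vect (low byte); each entry is a 16-bit address.
    """
    entry = ((ireg & 0xFF) << 8) | ((2 * vect) & 0xFF)
    low = mem[entry]
    high = mem[(entry + 1) & 0xFFFF]
    return (high << 8) | low


MEM = bytearray(0x10000)
MEM[0x8006] = 0x34          # entry for vector 3: ISR at 0x1234
MEM[0x8007] = 0x12
ISR = isr_address(MEM, 0x80, 3)
```

Doubling the vector reserves two bytes per device, so one 256-byte page addressed by the I register holds up to 128 ISR addresses.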
Interrupt cycle - vector
Interrupt subroutine
Do not disturb normal operation of main program
Interrupt prolog
Save machine state
PUSH PSW
PUSH H
Interrupt epilog
Restore machine state, enable interrupt system
POP H
POP PSW
EI
RET
Complex interrupt system
Interrupts enabling
Global interrupt system control - EI, DI
Interrupt masking – enable notification routing from
particular device
What to do when more than one request is pending
Priority system
Interrupt nesting
Enable interrupt system while inside ISR
RISC basics
Constant length of instruction word
Makes instruction locations predictable
Instruction execution in overlapped mode – pipeline
architecture
Instruction execution stages
Instruction fetch
Decode
Arguments
Execute
Write back
Instruction Pipeline
Overlapped execution
Instruction execution time: 5 cycles
Instruction execution average time: 1 cycle
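The difference between the two figures is easy to quantify. The following is a minimal sketch, assuming an ideal five-stage pipeline without stalls or flushes; the function names are hypothetical.

```python
def pipeline_cycles(n_instructions, stages=5):
    """Total clock cycles for n instructions in an ideal pipeline."""
    if n_instructions <= 0:
        return 0
    # The first instruction needs `stages` cycles,
    # each following one completes one cycle later.
    return stages + n_instructions - 1


def average_cycles(n_instructions, stages=5):
    """Average cycles per instruction for a run of n instructions."""
    return pipeline_cycles(n_instructions, stages) / n_instructions
```

A single instruction still takes 5 cycles, but 1000 instructions take 1004 cycles, so the average approaches 1 cycle per instruction.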
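[Figure: timing chart of overlapped execution over clock periods t0 to t9.]
elementary operations
Inserting empty instructions
Architecture modification
Increase of complexity
[Figure: pipeline stages, including R15/PC + 1 (fetch), ARGS, EXE (ALU) and WB (REGS).]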
Data race condition
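Data path
Direct feedback from ALU
Extended decoding logic
[Figure: pipelined data path. An 8x16R register file (REG FILE) supplies ARG 1 and ARG 2 to the ALU; the result RES is fed back directly to the ALU inputs; data memory signals ADDR, ADATA, DODATA, DIDATA and WEDATA; pipeline registers INSTR, ARG, EXE and WB; TH_ID.]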
Performance
Jump instructions
EXE – disable instructions in pipe
WB – new address
Pipe flushing
Skip instruction
Conditional instruction execution
[Figure: pipeline stages FETCH (R15/PC + 1), DEC, ARGS, EXE (ALU), WB (REGS).]
Outline
How it was developed
Basic concepts
Architecture
Instruction List
Interface cycle
Development of ARM
ACORN
1982 BBC micro developed with use of 6502
Not enough computing power
Own processor development under code name
Acorn RISC Machine
High computing performance
Simplified hardware architecture
Combination of RISC – CISC architectures
Concepts
1980 Patterson, Ditzel introduction of RISC
Instruction decoding
hardwired logic – instead of microprogramming
One instruction per clock
Impossible to achieve, unless…
Pipeline execution
Pipeline increases hardware utilization
Reduces average instruction execution time
An instruction takes n cycles to complete, but n instructions
are processed at the same time
ARM – implemented features
Load-store architecture
Instruction constant length
Normal mode - 32 bit
Poor code density
THUMB mode - 16 bit
Instruction recoding into 16-bit word
Internally expanded by decoder into 32-bit instructions
Three address instructions
ARM not implemented features
Register window
Moving register window
Delayed jump execution
Single cycle instructions
Multi cycle instructions but still simple in use
ARM features
Construction simplicity (?)
Combines features of RISCs and CISCs
High performance
Small die size
Implements
Co-processor interfacing
Multitasking system support
ARM – block diagram
ARM – functional diagram
Programming model
Instruction length
Normal mode - 32 bit
THUMB mode – 16 bit
Data types
Byte (8 bit)
Half word (16 bit) address 2n
Word (32 bit) address 4n
Operation modes
Register file
Registers - THUMB
THUMB mapping
Status register - CPSR
CPSR – Current Program State Register
Registers – operation modes
Exceptions – start addresses
Exceptions
Save PC (R15) in link register (R14)
Copy CPSR to respective SPSR
Set flags according to exception
Write the exception start address to PC
Disable the interrupt system (FIQ, IRQ)
To avoid accidental execution before completing critical
operations of interrupt prolog or epilog
Exceptions
Restore PC (R15) from R14 with the appropriate offset
of 4 or 8 (depending on normal/THUMB mode)
Restore CPSR from SPSR
Re-enable interrupts
Instruction set
Condition field
Jump instructions
Calculation instruction
Implemented operations
Argument shifting
Assembler
MOV, MVN – single argument operations
<opcode>{cond}{S} Rd,<Op2>
CMP, CMN, TEQ, TST – only condition code generation
<opcode>{cond} Rn,<Op2>
AND, EOR, SUB, RSB, ADD, ADC, SBC, RSC, ORR, BIC
<opcode>{cond}{S} Rd,Rn,<Op2>
Where <Op2> is Rm{,<shift>} or <#expression>
{cond} – two character condition code
{S} – update condition codes
implied for CMP, CMN, TEQ, TST
Rd, Rn and Rm are expressions that evaluate to a register number
Examples
ADDEQ R2,R4,R5; if the Z flag is set make R2:=R4+R5
TEQS R4,#3 ; test R4 for equality with 3
; (the S is in fact redundant as the
; assembler inserts it automatically)
SUB R4,R5,R7,LSR R2
; logical right shift R7 by the number in
; the bottom byte of R2, subtract result
; from R5, and put the answer into R4
MOV PC,R14 ; return from subroutine
MOVS PC,R14 ; return from exception and restore CPSR
; from SPSR_mode
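The register-shifted operand in the SUB example can be evaluated by hand or with a short model. The following is a minimal sketch of that single instruction, assuming 32-bit registers and the behaviour described in the comment (shift amount taken from the bottom byte of the shift register); the function names lsr and sub_lsr are hypothetical, and condition codes are not modelled.

```python
MASK32 = 0xFFFFFFFF


def lsr(value, amount_reg):
    """Logical shift right by the amount in the bottom byte of amount_reg."""
    amount = amount_reg & 0xFF
    if amount >= 32:
        return 0                      # all bits shifted out
    return (value & MASK32) >> amount


def sub_lsr(rn, rm, rs):
    """Model of SUB Rd,Rn,Rm,LSR Rs: Rd = Rn - (Rm >> Rs), modulo 2^32."""
    return (rn - lsr(rm, rs)) & MASK32
```

With R5 = 100, R7 = 0x80 and R2 = 4, the shifted operand is 8 and the result written to R4 is 92.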
Status register save
Status register restore
Status register load
Multiplication
Data transfer LDR/STR
Block transfers LDM/STM
Memory access
ARM Core based…
Custom CPU…
Mentoring program - a recipe for efficient education at the Macrocourse on Automatic Control and Robotics, Electronics and
Telecommunication, and Computer Science offered by the Silesian University of Technology