ARM
System - On
TIMELINE (1/2)
TIMELINE (2/2)
THE ARM ARCHITECTURE
Small size
Low power consumption
Registers
user mode: CPSR
fiq mode: r8_fiq, r9_fiq, r10_fiq, r11_fiq, r12_fiq, r13_fiq, r14_fiq, SPSR_fiq
svc mode: r13_svc, r14_svc, SPSR_svc
abort mode: r13_abt, r14_abt, SPSR_abt
irq mode: r13_irq, r14_irq, SPSR_irq
undefined mode: r13_und, r14_und, SPSR_und
CPSR
ARM CPSR format
bits 31-28: N Z C V
bits 27-8: unused
bit 7: I
bit 6: F
bit 5: T
bits 4-0: mode
N: Negative
Z: Zero
C: Carry
V: Overflow
Q: Saturation (for enhanced DSP instructions)
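The CPSR layout above can be read field by field with shifts and masks. The following is a minimal Python sketch, not ARM-provided code. The bit positions follow the slide; the mode-field encodings in `MODES` are the standard ARM values and are an addition, since the slide does not list them. The function name `decode_cpsr` is hypothetical.

```python
# Mode encodings for CPSR bits 4-0 (standard ARM values, not shown on the slide).
MODES = {
    0b10000: "user",
    0b10001: "fiq",
    0b10010: "irq",
    0b10011: "svc",
    0b10111: "abort",
    0b11011: "undefined",
}

def decode_cpsr(cpsr):
    """Split a 32-bit CPSR value into the fields named on the slide."""
    return {
        "N": (cpsr >> 31) & 1,   # Negative
        "Z": (cpsr >> 30) & 1,   # Zero
        "C": (cpsr >> 29) & 1,   # Carry
        "V": (cpsr >> 28) & 1,   # Overflow
        "I": (cpsr >> 7) & 1,    # IRQ disable
        "F": (cpsr >> 6) & 1,    # FIQ disable
        "T": (cpsr >> 5) & 1,    # Thumb state
        "mode": MODES.get(cpsr & 0x1F, "other"),
    }

# Example value chosen for illustration: Z and C set, interrupts masked, svc mode.
fields = decode_cpsr(0x600000D3)
print(fields)
```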
Memory Organization
1 word = 32 bits
[Figure: byte-addressed memory drawn as 32-bit rows (bit 31 on the left, bit 0 on the right), with byte addresses up to 23 and example items word16, half-word14, half-word12, word8, byte6, half-word4, and byte]
Instruction Set
Data processing
Data transfer
Control flow
Supervisor mode
I/O System
Exceptions
Exceptions:
Interrupts
Supervisor Call
Traps
THE ARM INSTRUCTION SET
Arithmetic Operations
ADD r0, r1, r2 ; r0 := r1 + r2, flags not updated
ADDS r0, r1, r2 ; r0 := r1 + r2, flags updated
Logical Operations
AND r0, r1, r2 ; r0 := r1 AND r2
Register Movement
MOV r0, r2 ; r0 := r2
Comparison
CMP r1, r2 ; set flags on r1 - r2
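The difference between ADD and ADDS is only whether the N, Z, C and V flags are updated. As a rough illustration, the Python sketch below models a 32-bit ADDS and the flags it would produce; the function name `adds` and the dictionary representation of the flags are assumptions made for this example.

```python
MASK32 = 0xFFFFFFFF

def adds(a, b):
    """Model ADDS on two 32-bit values: return (result, flags)."""
    a &= MASK32
    b &= MASK32
    full = a + b                 # unbounded sum, used to detect carry out
    result = full & MASK32       # value written to the destination register
    flags = {
        "N": result >> 31,                      # bit 31 of the result
        "Z": int(result == 0),                  # result is zero
        "C": int(full > MASK32),                # unsigned carry out of bit 31
        # Signed overflow: both operands differ in sign from the result.
        "V": ((a ^ result) & (b ^ result)) >> 31,
    }
    return result, flags

print(adds(0x7FFFFFFF, 1))   # largest positive value plus one
print(adds(0xFFFFFFFF, 1))   # wraps around to zero
```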
Operands:
Immediate operands
ADD r3, r3, #1 ; r3 := r3 + 1
Multiplication:
MUL r4, r3, r2 ; r4 := r3 * r2
Examples
PRE:
r0 = 0x00000000
r1 = 0x00009000
mem32[0x00009000] = 0x01010101
mem32[0x00009004] = 0x02020202
LDR r0, [r1, #4]! ; pre-index with writeback
POST:
r0 = 0x02020202
r1 = 0x00009004
Examples
PRE:
r0 = 0x00000000
r1 = 0x00009000
mem32[0x00009000] = 0x01010101
mem32[0x00009004] = 0x02020202
LDR r0, [r1, #4] ; pre-index, base register unchanged
POST:
r0 = 0x02020202
r1 = 0x00009000
Examples
PRE:
r0 = 0x00000000
r1 = 0x00009000
mem32[0x00009000] = 0x01010101
mem32[0x00009004] = 0x02020202
LDR r0, [r1], #4 ; post-index
POST:
r0 = 0x01010101
r1 = 0x00009004
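The three examples above are consistent with the three single-register load addressing modes: pre-index with writeback, pre-index without writeback, and post-index, each with an offset of 4. The Python sketch below reproduces the register values under that assumption. The function names `ldr` and `run` and the mode strings are hypothetical, and memory is modelled as a dictionary of word addresses.

```python
def ldr(regs, mem, rd, rn, offset, mode):
    """Model a word load into rd using base register rn and an immediate offset."""
    base = regs[rn]
    if mode == "preindex_writeback":    # LDR rd, [rn, #offset]!
        addr = base + offset
        regs[rn] = addr                 # base register is updated
    elif mode == "preindex":            # LDR rd, [rn, #offset]
        addr = base + offset            # base register is left unchanged
    elif mode == "postindex":           # LDR rd, [rn], #offset
        addr = base                     # access uses the old base
        regs[rn] = base + offset        # base is updated afterwards
    else:
        raise ValueError("unknown addressing mode: " + mode)
    regs[rd] = mem[addr]
    return regs

def run(mode):
    """Apply one load to the PRE state used in the examples."""
    regs = {"r0": 0x00000000, "r1": 0x00009000}
    mem = {0x00009000: 0x01010101, 0x00009004: 0x02020202}
    return ldr(regs, mem, "r0", "r1", 4, mode)

for m in ("preindex_writeback", "preindex", "postindex"):
    post = run(m)
    print(m, hex(post["r0"]), hex(post["r1"]))
```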
Examples
PRE:
mem32[0x80018] = 0x03
mem32[0x80014] = 0x02
mem32[0x80010] = 0x01
r0 = 0x00080010
LDMIA r0!, {r1-r3}
POST:
r0 = 0x0008001c
r1 = 0x00000001
r2 = 0x00000002
r3 = 0x00000003
Examples
PRE:
mem32[0x8001c] = 0x04
mem32[0x80018] = 0x03
mem32[0x80014] = 0x02
mem32[0x80010] = 0x01
r0 = 0x00080010
LDMIB r0!, {r1-r3}
POST:
r0 = 0x0008001c
r1 = 0x00000002
r2 = 0x00000003
r3 = 0x00000004
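The LDMIA and LDMIB examples differ only in whether the address is incremented after or before each transfer. A minimal Python sketch of that behaviour follows; the function name `ldm`, its parameters, and the dictionary-based memory are assumptions for illustration, and only the increment-after (IA) and increment-before (IB) variants are modelled.

```python
def ldm(regs, mem, rn, reglist, mode, writeback=True):
    """Model LDMIA / LDMIB: load several registers from consecutive words."""
    if mode not in ("IA", "IB"):
        raise ValueError("only IA and IB are modelled")
    addr = regs[rn]
    for r in reglist:            # lowest-numbered register is loaded first
        if mode == "IB":
            addr += 4            # increment before the transfer
        regs[r] = mem[addr]
        if mode == "IA":
            addr += 4            # increment after the transfer
    if writeback:                # the "!" suffix updates the base register
        regs[rn] = addr
    return regs

MEM = {0x80010: 0x01, 0x80014: 0x02, 0x80018: 0x03, 0x8001C: 0x04}

ia = ldm({"r0": 0x00080010}, MEM, "r0", ["r1", "r2", "r3"], "IA")
ib = ldm({"r0": 0x00080010}, MEM, "r0", ["r1", "r2", "r3"], "IB")
print(ia)
print(ib)
```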
Conditional execution
Instructions can be executed conditionally, without branches.
CMP r2, r3 ; subtract (r2 - r3) and set flags
ADDGE r4, r5, r6 ; if r2 >= r3 (signed): r4 := r5 + r6
SUBLT r4, r5, r6 ; else (r2 < r3): r4 := r5 - r6
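The GE and LT conditions are evaluated from the flags that CMP sets: GE holds when N equals V, and LT when they differ. The Python sketch below models the CMP / ADDGE / SUBLT sequence under those rules. The helper names `cmp_flags`, `ge`, `lt` and `select` are hypothetical, and register values are passed in as plain integers.

```python
MASK32 = 0xFFFFFFFF

def cmp_flags(a, b):
    """Flags after CMP a, b, which computes a - b on 32-bit values."""
    a &= MASK32
    b &= MASK32
    result = (a - b) & MASK32
    return {
        "N": result >> 31,
        "Z": int(result == 0),
        "C": int(a >= b),                          # carry set means no borrow
        "V": ((a ^ b) & (a ^ result)) >> 31,       # signed overflow on subtract
    }

def ge(flags):
    return flags["N"] == flags["V"]    # signed greater than or equal

def lt(flags):
    return flags["N"] != flags["V"]    # signed less than

def select(r2, r3, r5, r6):
    """Return the r4 value produced by CMP r2, r3 / ADDGE / SUBLT."""
    flags = cmp_flags(r2, r3)
    r4 = None
    if ge(flags):
        r4 = (r5 + r6) & MASK32        # ADDGE r4, r5, r6
    if lt(flags):
        r4 = (r5 - r6) & MASK32        # SUBLT r4, r5, r6
    return r4

print(select(5, 3, 10, 4))
print(select(3, 5, 10, 4))
```

Exactly one of the two conditional instructions takes effect for any pair of inputs, which is why no branch is needed.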
Conditional execution mnemonics
Loop
loop
Example 1
        AREA    ARMex, CODE, READONLY   ; Name this block of code ARMex
        ENTRY                           ; Mark first instruction to execute
start
        MOV     r0, #10                 ; Set up parameters
        MOV     r1, #3
        ADD     r0, r0, r1              ; r0 = r0 + r1
stop
        MOV     r0, #0x18               ; angel_SWIreason_ReportException
        LDR     r1, =0x20026            ; ADP_Stopped_ApplicationExit
        SWI     0x123456                ; ARM semihosting SWI
        END                             ; Mark end of file
Example 2
        AREA    subrout, CODE, READONLY ; Name this block of code
        ENTRY                           ; Mark first instruction to execute
start
        MOV     r0, #10                 ; Set up parameters
        MOV     r1, #3
        BL      doadd                   ; Call subroutine
stop
        MOV     r0, #0x18               ; angel_SWIreason_ReportException
        LDR     r1, =0x20026            ; ADP_Stopped_ApplicationExit
        SWI     0x123456                ; ARM semihosting SWI
doadd
        ADD     r0, r0, r1              ; Subroutine code
        MOV     pc, lr                  ; Return from subroutine
        END                             ; Mark end of file
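The call and return in Example 2 rely on the link register: BL saves the address of the following instruction in lr, and MOV pc, lr jumps back to it. The Python sketch below traces that control flow on a simplified model. The tuple encoding of instructions, the `RET` and `STOP` pseudo-operations, and the use of list indices in place of byte addresses are all assumptions made for this illustration.

```python
# Each instruction occupies one list slot, so "pc + 1" stands in for the
# address of the next instruction (pc + 4 bytes in ARM state).
PROGRAM = [
    ("MOV", "r0", 10),            # 0: start
    ("MOV", "r1", 3),             # 1
    ("BL", "doadd"),              # 2: call subroutine
    ("STOP",),                    # 3: stop (semihosting exit in the example)
    ("ADD", "r0", "r0", "r1"),    # 4: doadd
    ("RET",),                     # 5: stands for MOV pc, lr
]
LABELS = {"start": 0, "stop": 3, "doadd": 4}

def run(program, labels):
    """Execute the toy program and return the final register values."""
    regs = {"r0": 0, "r1": 0, "lr": None}
    pc = labels["start"]
    while True:
        op, *args = program[pc]
        if op == "MOV":
            regs[args[0]] = args[1]
            pc += 1
        elif op == "ADD":
            regs[args[0]] = regs[args[1]] + regs[args[2]]
            pc += 1
        elif op == "BL":
            regs["lr"] = pc + 1       # save the return address in lr
            pc = labels[args[0]]      # branch to the subroutine
        elif op == "RET":
            pc = regs["lr"]           # return to the instruction after BL
        elif op == "STOP":
            return regs
        else:
            raise ValueError("unknown operation: " + op)

result = run(PROGRAM, LABELS)
print(result)
```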
3-stage pipeline (ARM7, 80 MHz)
Fetch
Decode
Execute
[Figure: ARM7 datapath. Recoverable labels: A[31:0], control, incrementer, PC, register bank, instruction decode, ALU bus, multiply register, barrel shifter, ALU]
Throughput: 1 instruction / cycle
Increase f_clk: logic simplification
Reduce CPI: reduce the number of multicycle instructions.
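Both levers named above act on the usual execution-time relation, time = instruction count x CPI / clock frequency. The short Python sketch below evaluates it; the function name `execution_time` and the example numbers (other than the 80 MHz and 150 MHz clock rates quoted on the pipeline slides) are hypothetical.

```python
def execution_time(n_inst, cpi, f_clk_hz):
    """Seconds needed to run n_inst instructions at the given CPI and clock."""
    return n_inst * cpi / f_clk_hz

# Hypothetical workload of 80 million instructions.
baseline = execution_time(80_000_000, 1.5, 80e6)      # 80 MHz clock, CPI 1.5
lower_cpi = execution_time(80_000_000, 1.0, 80e6)     # same clock, CPI reduced
faster_clock = execution_time(80_000_000, 1.5, 150e6) # same CPI, 150 MHz clock
print(baseline, lower_cpi, faster_clock)
```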
5-stage pipeline (ARM9, 150 MHz)
Fetch
Decode
Execute
Buffer / Data
Write Back
[Figure: ARM9 5-stage pipeline organization. Recoverable labels: fetch (I-cache, next pc, +4, pc + 4, pc + 8); instruction decode (I decode, register read, immediate fields, r15, LDM/STM, mux); execute (shift, ALU, mul, pre-index, post-index, forwarding paths; B, BL, MOV pc, SUBS pc); buffer/data (D-cache, load/store address, byte repl., rot/sgn ex, LDR pc); write-back (register write)]
ARCHITECTURAL SUPPORT FOR HIGH LEVEL LANGUAGES
Coprocessor interface
Load / store unit
Register bank (8 registers, 80 bit)
ALU (adder, mult, div)
[Figure: coprocessor organization. Recoverable labels: pipeline control, instruction issuer, load/store unit, coprocessor handshake, coprocessor interface, register bank, arithmetic unit (add, mult, div)]
APCS (1/2)
Loop
Loop
MOV pc, lr
APCS (2/2)
C code
void f1(int a)
{
    f2(a);
}
Assembly code
f1
[Figure: stack frame with offsets 16, 8, 4, 0 relative to the stack pointer]
THUMB PROGRAMMER'S MODEL
General information
Thumb objective:
Code density.
Thumb has a 16-bit instruction set.
A subset of the ARM instruction set is encoded in a 16-bit instruction space.
With appropriate use, great benefits can be achieved in terms of:
Power efficiency
Enhanced performance
Thumb registers
r0 to r7
Hi registers: r8 to r12, SP (r13), LR (r14), PC (r15)
CPSR
Thumb
ARM
If performance is critical:
ARM
Thumb
Example 3
Example 4
Implement the following pseudocode in ARM and Thumb assembly. Which is more efficient in terms of execution time, and which in terms of code size?
if r1 > r2 then
    r3 = r4 + r5
    r6 = r4 - r5
else
    r3 = r4 - r5
    r6 = r4 + r5
Example 5
ARCHITECTURAL SUPPORT FOR SYSTEM DEVELOPMENT
A basic ARM memory system
[Figure: the ARM core drives A[31:0] and D[31:0]. Control logic generates the write enables RAMwe0 to RAMwe3 and the output enable RAMoe. Four byte-wide SRAMs are addressed by A[n+2:2] and four byte-wide ROMs by A[m+2:2]; each device has a D[7:0] port wired to one byte lane: D[31:24], D[23:16], D[15:8], D[7:0]]
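With four byte-wide SRAMs sharing a word address, the control logic has to assert one write enable per byte lane being written. The Python sketch below shows one way this decoding could work. It assumes little-endian byte ordering and that RAMwe0 controls the D[7:0] lane, neither of which is stated on the slide; the function name `write_enables` is hypothetical.

```python
def write_enables(addr, size):
    """Return (sram_word_address, [we0, we1, we2, we3]) for an aligned write.

    size is the transfer width in bytes: 1 (byte), 2 (half-word) or 4 (word).
    """
    if size not in (1, 2, 4):
        raise ValueError("size must be 1, 2 or 4 bytes")
    if addr % size:
        raise ValueError("unaligned access")
    first_lane = addr & 3                  # A[1:0] selects the first byte lane
    lanes = [int(first_lane <= i < first_lane + size) for i in range(4)]
    return addr >> 2, lanes                # A[n+2:2] addresses every SRAM

print(write_enables(0x1001, 1))   # byte write
print(write_enables(0x1002, 2))   # half-word write
print(write_enables(0x1004, 4))   # word write
```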
AMBA (1/4)
AMBA objectives:
Technology independence
To encourage modular system design
AMBA (2/4)
AMBA (3/4)
AHB bus
Burst transaction
Split transaction
Data bus 64 to 128 bit
[Figure: AHB multiplexed interconnect. Masters 1 to 3 and slaves 1 to 3 are joined by address, write data, and read data paths, with an arbiter and a decoder]
AMBA (4/4)
[Figure (2/2): cached processor organization. Recoverable labels: I cache, ARM7TDMI, output buffer, register bank, input buffer, AMBA i/f, AMBA]
MEMORY HIERARCHY
Memory hierarchy
Moving down the table: larger size, lower speed.
Memory type       Size                 Speed
Registers         32 bit               A few nsec
On chip cache     8 to 32 kbytes       10 nsec
Off chip cache    100 to 200 kbytes    10 to 30 nsec
RAM               Mbytes               100 nsec
On chip memory
Cache types
Cache types:
Unified cache.
Separate instruction and data caches.
Replacement policy
-implementation
[Figure: cache organization in which a line of data is stored together with a tag of its memory address. Recoverable labels: tag RAM, data RAM, compare, mux, hit, data]
2-way set associative cache (1/3)
[Figure: the address indexes two tag RAM / data RAM pairs; each stored tag is compared with the address, and a mux selects the data on a hit]
Set selection:
Random allocation
Least recently used (LRU)
Round robin (cyclic)
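To make the set selection policies concrete, the Python sketch below models a small 2-way set-associative cache that uses round robin (cyclic) replacement. It tracks tags only, not data. The class name `TwoWayCache`, the default geometry (4 sets, 16-byte lines), and the example addresses are assumptions chosen for illustration.

```python
class TwoWayCache:
    """Tag-only model of a 2-way set-associative cache with cyclic replacement."""

    def __init__(self, num_sets=4, line_bytes=16):
        self.num_sets = num_sets
        self.line_bytes = line_bytes
        self.tags = [[None, None] for _ in range(num_sets)]  # two ways per set
        self.next_way = [0] * num_sets     # round robin pointer for each set

    def access(self, addr):
        """Return True on a hit; on a miss, allocate the line and return False."""
        line = addr // self.line_bytes
        index = line % self.num_sets       # selects the set
        tag = line // self.num_sets        # identifies the line within the set
        if tag in self.tags[index]:
            return True
        way = self.next_way[index]         # victim chosen cyclically
        self.tags[index][way] = tag
        self.next_way[index] = (way + 1) % 2
        return False

cache = TwoWayCache()
# Addresses 0x000, 0x040 and 0x080 all map to set 0 in this geometry.
trace = [cache.access(a) for a in (0x000, 0x040, 0x000, 0x080, 0x000, 0x080)]
print(trace)
```

In this trace the third line evicts the first even though it was used most recently, which is the kind of case where LRU would behave differently from round robin.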
[Figure: fully associative cache. Recoverable labels: tag CAM, data RAM, mux, hit, data]
Write strategies
Write through
All write operations are passed to main memory
Options
Physical cache                        Virtual cache
Unified instruction and data cache    Separate instruction and data caches
Direct-mapped RAM-RAM                 Set-associative RAM-RAM                 Fully associative CAM-RAM
Cyclic                                Random                                  LRU
Write-through                         Write-through with write buffer         Copy-back
Perfect cache performance
Cache form                    Performance
No cache                      1
Instruction-only cache        1.95
Instruction and data cache    2.5
Data-only cache               1.13
MMU (1/3)
MMU (2/3)
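[Figure: the logical address is added to a base to form the physical address and is compared against a limit (>?); a logical address above the limit raises an access fault]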
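The base and limit check in this figure can be written in a few lines. The Python sketch below is a minimal illustration of that scheme; the names `translate` and `AccessFault` and the example segment values are hypothetical.

```python
class AccessFault(Exception):
    """Raised when a logical address lies beyond the segment limit."""

def translate(logical, base, limit):
    """Map a logical address to a physical one using a base and a limit."""
    if logical > limit:                    # the ">?" comparison in the figure
        raise AccessFault(hex(logical))
    return base + logical                  # physical address

# Hypothetical segment: base 0x4000, highest valid offset 0xFF.
print(hex(translate(0x10, 0x4000, 0xFF)))
```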
MMU (3/3)
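[Figure: two-level page translation. The logical address is split at bits 22/21 and 12/11: the upper field indexes the page directory, the middle field indexes the page table, and the lower field selects the data within the page frame]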
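The field boundaries at bits 22/21 and 12/11 imply a 10-bit directory index, a 10-bit table index, and a 12-bit offset within a 4 kbyte page. The Python sketch below walks such a two-level structure. The page directory and page tables are modelled as nested dictionaries, and the names `split` and `translate` and the example mapping are assumptions made for this illustration.

```python
def split(logical):
    """Split a 32-bit logical address into (directory, table, offset) fields."""
    directory_index = (logical >> 22) & 0x3FF   # bits 31-22
    table_index = (logical >> 12) & 0x3FF       # bits 21-12
    offset = logical & 0xFFF                    # bits 11-0
    return directory_index, table_index, offset

def translate(logical, page_directory):
    """Walk the page directory and page table to find the physical address."""
    directory_index, table_index, offset = split(logical)
    page_table = page_directory[directory_index]   # KeyError models no mapping
    frame_base = page_table[table_index]           # start of the page frame
    return frame_base + offset

# Hypothetical mapping: directory entry 1, table entry 3, frame at 0x0009A000.
page_directory = {1: {3: 0x0009A000}}
print(split(0x00403ABC))
print(hex(translate(0x00403ABC, page_directory)))
```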
ARCHITECTURAL SUPPORT FOR OPERATING SYSTEMS
[Figure: example system-on-chip built around an ARM1136JF core and a bus matrix (8 AHBs, 64-bit ports 1 to 8, config port). Recoverable blocks: system control (external clock, watchdog, external reset and battery fail); ETM with trace port analyser; timers and RTC (PL031); VIC (PL192); 14 external interrupts; DMAC (PL080) with 8 external DMA requests; MPMC (PL176) for SDRAM and DDR; SMC (PL093) for static memory; CLCD (PL110) for a CLCD display; AHB/APB bridges; GPIO (PL061) with 32 GPIO lines; SSP (PL022); UART (PL011), 2x UARTs; SCI (PL131) for smart card (UICC compliant); one port unassigned]
CP15
Protection Unit
ARMULATOR (1/2)
ARMULATOR (2/2)
ARMULATOR TUTORIAL
CODEWARRIOR ENVIRONMENT