Shrikrishna
Knowledge Dissemination through Education & Training CDAC Knowledge Park Bangalore
Agenda
ARM History
RISC & CISC Design Philosophy
ARM Processor Fundamentals
ARM Modes of Operation
ARM Instruction Execution States
Pipelining in ARM Processors
Exceptions & Interrupts
ARM Processor Family Overview
ARM
Advanced RISC Machines
ARM History
ARM: Acorn RISC Machine (1983-1985)
Acorn Computers Limited, Cambridge, England
ARM History
Key component of many 32-bit embedded systems
Portable consumer devices
ARM1 prototype in 1985
One of ARM's most successful cores is the ARM7TDMI, which provides high code density and low power consumption
ARM Applications
Nokia N93
ARM Partnership
ARM Advantage
Registers
ARM has a load-store architecture
General-purpose registers can hold data or an address
Total of 37 registers, each 32 bits wide
There are 17 or 18 active registers
  16 data registers
  2 status registers (cpsr, plus an spsr in the modes that have one)
cpsr layout
Fields: Flags, Status, Extension, Control
Bits 31-28: N Z C V (condition flags)
Bits 7-6: I F (interrupt masks)
Bit 5: T (Thumb state)
Bits 4-0: Mode (processor mode)
Processor Modes
The processor mode determines which registers are active and the access rights to the cpsr register itself
Each mode is either privileged or nonprivileged
Privileged: Abort, Fast Interrupt Request, Interrupt Request, Supervisor, System, Undefined
Nonprivileged: User
Banked Registers
User and system mode registers: r0-r12, r13 (sp), r14 (lr), r15 (pc), cpsr
Saved program status registers: spsr_fiq, spsr_irq, spsr_svc, spsr_undef, spsr_abt
Banked Registers
Fast Interrupt Request mode banked registers: r8_fiq, r9_fiq, r10_fiq, r11_fiq, r12_fiq, r13_fiq, r14_fiq
Banked Registers
Banked registers are available only when the processor is in a particular mode
Every processor mode except user mode can change mode by writing directly to the mode bits of the cpsr
Banked registers are a subset of the main 16 registers
If we change processor mode, a banked register from the new mode replaces an existing register
Exceptions and interrupts cause a mode change
Example: changing from user mode to interrupt request mode
This change causes user registers r13 and r14 to be banked
The user registers are replaced with registers r13_irq and r14_irq
spsr_irq stores the cpsr of the previous mode
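The register counts quoted on the earlier slides (37 registers in total, 17 or 18 active) can be checked with a short tally. This is only a sketch: the dictionary layout and function names are illustrative, and it assumes the usual ARM arrangement in which irq, svc, abt and und each bank r13 and r14 while fiq banks r8 to r14.

```python
# Banked general-purpose registers per exception mode (assumed standard layout).
BANKED_GP = {
    "fiq": ["r8", "r9", "r10", "r11", "r12", "r13", "r14"],
    "irq": ["r13", "r14"],
    "svc": ["r13", "r14"],
    "abt": ["r13", "r14"],
    "und": ["r13", "r14"],
}

USER_GP_COUNT = 16  # r0-r15, shared by user and system mode


def total_registers():
    """Count every physical register: user set, banked set, cpsr and spsrs."""
    banked = sum(len(regs) for regs in BANKED_GP.values())
    status = 1 + len(BANKED_GP)  # one cpsr plus one spsr per exception mode
    return USER_GP_COUNT + banked + status


def active_registers(mode):
    """Registers visible in a given mode: 16 data registers plus status registers."""
    has_spsr = mode in BANKED_GP  # usr and sys have no spsr
    return USER_GP_COUNT + 1 + (1 if has_spsr else 0)


print(total_registers())        # 37
print(active_registers("usr"))  # 17
print(active_registers("irq"))  # 18
```

Banking does not change how many registers are visible at once; it only changes which physical register a name such as r13 refers to.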
Processor Mode

Mode                     Abbr   Privileged   Mode[4:0]
Abort                    abt    yes          10111
Fast Interrupt Request   fiq    yes          10001
Interrupt Request        irq    yes          10010
Supervisor               svc    yes          10011
System                   sys    yes          11111
Undefined                und    yes          11011
User                     usr    no           10000
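As a rough illustration of how the Mode[4:0] encoding in the table is used, the sketch below looks up the mode for a given cpsr value. The table contents come from the slide; the names `MODES` and `decode_mode` are made up for this example.

```python
# Mode[4:0] encodings from the table above: bits -> (abbreviation, name, privileged).
MODES = {
    0b10111: ("abt", "Abort", True),
    0b10001: ("fiq", "Fast Interrupt Request", True),
    0b10010: ("irq", "Interrupt Request", True),
    0b10011: ("svc", "Supervisor", True),
    0b11111: ("sys", "System", True),
    0b11011: ("und", "Undefined", True),
    0b10000: ("usr", "User", False),
}


def decode_mode(cpsr):
    """Return (abbreviation, name, privileged) for the mode bits of a cpsr value."""
    bits = cpsr & 0x1F  # the mode field is the low five bits
    if bits not in MODES:
        raise ValueError(f"unrecognised mode bits {bits:05b}")
    return MODES[bits]


print(decode_mode(0x00000010))  # ('usr', 'User', False)
print(decode_mode(0x000000D3))  # ('svc', 'Supervisor', True)
```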
Operation Modes

Mode                    Registers   CPSR[4:0]
User                    User        10000
FIQ                     _fiq        10001
IRQ                     _irq        10010
Supervisor              _svc        10011
Abort                   _abt        10111
Undefined Instruction   _und        11011
System                  User        11111
Processor Modes
User: Unprivileged mode in which most applications run
FIQ: Fast interrupt routines
IRQ: Interrupt routines
Supervisor: Entered on reset and when a software interrupt (SWI) instruction is executed
Abort: Entered when a data access or an instruction prefetch is aborted
Undefined: Entered when an undefined instruction is executed
System: Privileged mode that uses the user mode registers, intended for the operating system
Thumb (cpsr T = 1)
Jazelle (cpsr T = 0, J = 1)
  Instruction size: 8 bit
  Core instructions: Over 60% of the Java bytecodes are implemented in hardware; the rest are implemented in software
Interrupt Masks
Interrupt masks are used to stop specific interrupt requests from interrupting the processor
There are two interrupt request levels: IRQ and FIQ
The I bit masks IRQ when set to binary 1, and F bit masks FIQ when set to binary 1
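A minimal sketch of testing and setting the two mask bits on a cpsr value held as a plain integer. The bit positions (I = bit 7, F = bit 6) are the ones given on the cpsr slides; the helper names are invented for illustration.

```python
I_BIT = 1 << 7  # masks IRQ when set
F_BIT = 1 << 6  # masks FIQ when set


def irq_masked(cpsr):
    """True when IRQ requests are currently masked."""
    return bool(cpsr & I_BIT)


def fiq_masked(cpsr):
    """True when FIQ requests are currently masked."""
    return bool(cpsr & F_BIT)


def mask_interrupts(cpsr, irq=False, fiq=False):
    """Return a copy of cpsr with the requested mask bits set to 1."""
    if irq:
        cpsr |= I_BIT
    if fiq:
        cpsr |= F_BIT
    return cpsr


cpsr = 0x10                                  # user mode, nothing masked
cpsr = mask_interrupts(cpsr, irq=True)
print(hex(cpsr), irq_masked(cpsr), fiq_masked(cpsr))  # 0x90 True False
```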
Condition Flags
Flag   Flag name    Set when
Q      Saturation   The result causes an overflow and/or saturation
V      oVerflow     The result causes a signed overflow
C      Carry        The result causes an unsigned carry
Z      Zero         The result is zero, frequently used to indicate equality
N      Negative     Bit 31 of the result is a binary 1
Condition Flags
Condition flags are updated by comparisons and by the results of ALU operations that specify the S instruction suffix
For example, if a SUBS instruction results in a register value of zero, the Z flag in the cpsr is set
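The SUBS case can be modelled on 32-bit values to show how each flag is derived. This is a simplified sketch rather than a description of the hardware; the function name `subs` is illustrative. It relies on one detail not stated on the slide: for subtraction, ARM sets C when no borrow occurs.

```python
MASK32 = 0xFFFFFFFF


def subs(a, b):
    """Model SUBS: return (32-bit result, flags dict) for a - b."""
    a &= MASK32
    b &= MASK32
    result = (a - b) & MASK32
    flags = {
        "N": (result >> 31) & 1,           # bit 31 of the result
        "Z": int(result == 0),             # result is zero
        "C": int(a >= b),                  # set when the subtraction needs no borrow
        # signed overflow: operands differ in sign and the result's sign differs from a
        "V": (((a ^ b) & (a ^ result)) >> 31) & 1,
    }
    return result, flags


print(subs(1, 1))           # (0, {'N': 0, 'Z': 1, 'C': 1, 'V': 0})
print(subs(0, 1))           # (4294967295, {'N': 1, 'Z': 0, 'C': 0, 'V': 0})
print(subs(0x80000000, 1))  # (2147483647, {'N': 0, 'Z': 0, 'C': 1, 'V': 1})
```

The first call is the case described above: the result is zero, so Z is set.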
CPSR
Fields     Flags                Control
Bit        31 30 29 28 27 24    7 6 5   4:0
Value      0  0  1  0  0  0     0 1 0   10011
Function   n  z  C  v  q  j     i F t   svc

The Status and Extension fields lie between bits 24 and 7.
cpsr = nzCvqjiFt_SVC
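The notation in this example (upper case for a set bit, lower case for a clear bit, then the mode) can be produced mechanically from a cpsr value. The sketch below assumes the bit positions shown on the slides; `describe_cpsr` and the two lookup tables are hypothetical names.

```python
# (letter, bit position) in the order used by the notation on the slide.
FLAG_BITS = [("N", 31), ("Z", 30), ("C", 29), ("V", 28), ("Q", 27),
             ("J", 24), ("I", 7), ("F", 6), ("T", 5)]

MODE_NAMES = {
    0b10000: "USR", 0b10001: "FIQ", 0b10010: "IRQ", 0b10011: "SVC",
    0b10111: "ABT", 0b11011: "UND", 0b11111: "SYS",
}


def describe_cpsr(cpsr):
    """Format a cpsr value as e.g. 'nzCvqjiFt_SVC'."""
    letters = "".join(
        name if (cpsr >> bit) & 1 else name.lower()
        for name, bit in FLAG_BITS
    )
    mode_bits = cpsr & 0x1F
    if mode_bits not in MODE_NAMES:
        raise ValueError(f"unrecognised mode bits {mode_bits:05b}")
    return f"{letters}_{MODE_NAMES[mode_bits]}"


# C set (bit 29), F set (bit 6), mode 10011 (svc), as in the example above.
print(describe_cpsr(0x20000053))  # nzCvqjiFt_SVC
```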
CPSR
Bits 31-28: N Z C V
Bits 7-5: I F T
Bits 4-0: mode

Condition code flags: hold information about the most recently performed ALU operation
  N = Negative result from ALU
  Z = Zero result from ALU
  C = ALU operation Carried out
  V = ALU operation oVerflowed
Interrupt disable bits
  I = 1: Disables the IRQ
  F = 1: Disables the FIQ
Mode bits: set the processor operating mode
T Bit
Architecture xT only
T = 0: Processor in ARM state
T = 1: Processor in Thumb state
J bit
Architecture 5TEJ only
J = 1: Processor in Jazelle state
Mode bits
Specify the processor mode
Pipeline
A pipeline is the mechanism a RISC processor uses to execute instructions
Using a pipeline speeds up execution by fetching the next instruction while other instructions are being decoded and executed
  Fetch loads an instruction from memory
  Decode identifies the instruction to be executed
  Execute processes the instruction and writes the result back to a register
Filling the pipeline allows the core to execute an instruction every cycle
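The filling behaviour can be traced with a toy model of the three stages. It is a sketch under idealised assumptions (no stalls, no branches, one cycle per stage), and `pipeline_trace` is an illustrative name rather than anything defined by ARM.

```python
def pipeline_trace(instructions, stages=("fetch", "decode", "execute")):
    """Return one dict per clock cycle showing which instruction is in each stage."""
    depth = len(stages)
    cycles = len(instructions) + depth - 1  # cycles to fill, then drain, the pipeline
    trace = []
    for cycle in range(cycles):
        row = {}
        for position, stage in enumerate(stages):
            index = cycle - position  # instruction currently occupying this stage
            in_range = 0 <= index < len(instructions)
            row[stage] = instructions[index] if in_range else None
        trace.append(row)
    return trace


for cycle, row in enumerate(pipeline_trace(["ADD", "SUB", "CMP"]), start=1):
    print(cycle, row)
```

In this model the pipeline is full from the third cycle, and from then on one instruction reaches the execute stage every cycle.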
A higher operating frequency gives higher performance
Latency increases
Instruction throughput increases by around 13% with the 5-stage pipeline, compared with the ARM7
1.1 Dhrystone MIPS per MHz
Decode
The instruction is decoded and register operands read from the register file
Execute
An operand is shifted and the ALU result generated
Memory (Buffer/Data)
Data memory is accessed if required. Otherwise the ALU result is buffered for one clock cycle to give the same pipeline flow for all instructions
Write (Write-Back)
The results generated by the instruction are written back to the register file, including any data loaded from memory
Instruction throughput increases by around 34% with the 6-stage pipeline, compared with the ARM7
1.3 Dhrystone MIPS per MHz
Code written for the ARM7 will execute on the ARM9 and ARM10
[Figure: pipeline diagram showing the ADD and AND instructions in the decode and execute stages, and the cpsr]
Pipeline Characteristics
An instruction in the execute stage will complete even though an interrupt has been raised
The execution of a branch instruction, or branching by direct modification of the PC, causes the ARM core to flush its pipeline
ARM Exceptions
ARM supports a range of interrupts, traps, and supervisor calls, all grouped under the general heading of exceptions
Vector Addresses
Exception / Interrupt    Shorthand   Address      High Address
Reset                    RESET       0x00000000   0xffff0000
Undefined Instruction    UNDEF       0x00000004   0xffff0004
Software Interrupt       SWI         0x00000008   0xffff0008
Prefetch Abort           PABT        0x0000000c   0xffff000c
Data Abort               DABT        0x00000010   0xffff0010
Reserved                 -           0x00000014   0xffff0014
Interrupt Request        IRQ         0x00000018   0xffff0018
Fast Interrupt Request   FIQ         0x0000001c   0xffff001c
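Each vector sits at a fixed offset from one of two base addresses, so the table can be expressed as a small lookup. The sketch below uses the shorthand names from the slide as keys; `vector_address` and its `high_vectors` parameter are hypothetical names chosen for the example.

```python
# Offsets of each vector from the base of the vector table.
VECTOR_OFFSETS = {
    "RESET": 0x00,
    "UNDEF": 0x04,
    "SWI": 0x08,
    "PABT": 0x0C,
    "DABT": 0x10,
    # 0x14 is reserved
    "IRQ": 0x18,
    "FIQ": 0x1C,
}

LOW_BASE = 0x00000000
HIGH_BASE = 0xFFFF0000


def vector_address(exception, high_vectors=False):
    """Return the address the core branches to for the given exception."""
    base = HIGH_BASE if high_vectors else LOW_BASE
    return base + VECTOR_OFFSETS[exception]


print(hex(vector_address("IRQ")))                     # 0x18
print(hex(vector_address("FIQ", high_vectors=True)))  # 0xffff001c
```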
Exception Priorities
1. Reset (Highest Priority)
2. Data Abort
3. FIQ
4. IRQ
5. Prefetch Abort
6. SWI, Undefined
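When several exceptions are pending at once, the priority list decides which one is handled first. A minimal sketch of that selection, using the shorthand names from the vector table; `PRIORITY` and `highest_priority` are illustrative names, and real hardware resolves this internally rather than through a lookup like this.

```python
# Lower number means higher priority, following the list above.
PRIORITY = {
    "RESET": 1,
    "DABT": 2,
    "FIQ": 3,
    "IRQ": 4,
    "PABT": 5,
    "SWI": 6,
    "UNDEF": 6,
}


def highest_priority(pending):
    """Return the pending exception that would be taken first."""
    if not pending:
        raise ValueError("no exception pending")
    return min(pending, key=lambda name: PRIORITY[name])


print(highest_priority(["IRQ", "DABT", "FIQ"]))  # DABT
print(highest_priority(["IRQ", "PABT"]))         # IRQ
```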
Core Extensions
Standard components placed next to the ARM core
Improve performance, manage resources, provide extra functionality
Three hardware extensions
  Caches
  Memory Management
  Coprocessors
Caches
A cache is a block of fast memory placed between main memory and the core
A cache provides an overall increase in performance
ARM has two forms of cache
  Single unified cache for data and instructions
  Separate caches for data and instructions
Memory Management
The MMU is a class of processor hardware component that handles memory accesses requested by the CPU
The functions of an MMU include
  Translation of virtual addresses to physical addresses
  Memory protection
  Cache control
Coprocessors
Coprocessors can be attached to the ARM processor
A coprocessor is a secondary processor, often a separate chip, that takes over a specific part of the main processor's work, such as heavy calculation, relieving the CPU of some of its load and so enhancing the overall speed of the system
The ARM processor uses coprocessor 15 registers to control the cache, TCMs, and memory management
Description of cpsr
Parts   Bits   Architecture   Description
Mode    4:0    all            processor mode
T       5      ARMv4T         Thumb state
I & F   7:6    all            interrupt masks
J       24     ARMv5TEJ       Jazelle state
Q       27     ARMv5TE        condition flag
V       28     all            condition flag
C       29     all            condition flag
Z       30     all            condition flag
N       31     all            condition flag
                 ARM7          ARM9             ARM10           ARM11
Pipeline depth   three-stage   five-stage       six-stage       eight-stage
Typical MHz      80            150              260             335
mW/MHz           0.06          0.19 (+ cache)   0.5 (+ cache)   0.4 (+ cache)
MIPS/MHz         0.97          1.1              1.3             1.2
Architecture     Von Neumann   Harvard          Harvard         Harvard
Multiplier       8 x 32        8 x 32           16 x 32         16 x 32
Architecture Revisions
Revision   Example core implementation   ISA enhancement
ARMv1      ARM1                          First ARM processor; 26-bit addressing
ARMv2      ARM2                          32-bit multiplier; 32-bit coprocessor support
ARMv2a     ARM3                          On-chip cache; atomic swap instruction
ARMv3      ARM6 & ARM7DI                 32-bit addressing; separate cpsr & spsr; new modes UNDEF and ABORT; MMU support for virtual memory
ARMv3M     ARM7M                         Signed & unsigned long multiply
ARMv4      StrongARM                     Load-store instructions; new mode: System
Architecture Revisions
Revision   Example core implementation   ISA enhancement
ARMv4T     ARM7TDMI & ARM9T              Thumb
ARMv5TE    ARM9E & ARM10E                Superset of the ARMv4T; extra instructions added for changing state between ARM & Thumb; enhanced multiply instructions; extra DSP-type instructions; faster multiply accumulate
ARMv5TEJ   ARM7EJ & ARM926EJ             Java acceleration
ARMv6      ARM11                         New multimedia instructions
ARM Processors
ARM7 Family: ARM7EJ-S, ARM7TDMI, ARM7TDMI-S, ARM720T
ARM9/9E Families: ARM920T, ARM922T, ARM926EJ-S, ARM940T, ARM946E-S, ARM966E-S, ARM968E-S
Vector Floating Point Families: VFP10
ARM10 Family: ARM1020E, ARM1022E, ARM1026EJ-S
ARM11 Family: ARM1136J-S, ARM1136JF-S, ARM1156T2(F)-S, ARM1176JZ(F)-S, ARM11 MPCore
Cortex Family: Cortex-A8, Cortex-M1, Cortex-M3, Cortex-R4
Other Processors/Microarchitectures: StrongARM (DEC-Intel), XScale (Intel-Marvell Tech), Other
Cortex Family
ARM Cortex-A Series: application processors for complex OS and user applications
  ARM Cortex-A8, ARM Cortex-A9
Switching States
ARM to Thumb
Execute the BX instruction with state bit=1
Thumb to ARM
Execute the BX instruction with state bit=0
An interrupt or exception occurs
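The state bit used by BX is bit 0 of the branch target held in the operand register: it is copied into the T bit of the cpsr and cleared from the address that is actually loaded into the pc. A simplified model follows, ignoring alignment checks; the function names `bx` and `enter_exception_state` are illustrative.

```python
T_BIT = 1 << 5  # Thumb state bit in the cpsr


def bx(cpsr, target):
    """Model BX: return (new cpsr, new pc) for a branch to the register value target."""
    if target & 1:
        cpsr |= T_BIT            # state bit = 1: switch to (or stay in) Thumb state
    else:
        cpsr &= ~T_BIT           # state bit = 0: switch to (or stay in) ARM state
    pc = target & 0xFFFFFFFE     # bit 0 is not part of the branch address
    return cpsr & 0xFFFFFFFF, pc


def enter_exception_state(cpsr):
    """Exceptions are always taken in ARM state, so the T bit is cleared."""
    return cpsr & ~T_BIT & 0xFFFFFFFF


cpsr, pc = bx(0x10, 0x8001)      # ARM to Thumb
print(hex(cpsr), hex(pc))        # 0x30 0x8000
cpsr, pc = bx(cpsr, 0x9000)      # Thumb to ARM
print(hex(cpsr), hex(pc))        # 0x10 0x9000
```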
In today's systems, the key is not raw processor speed but total effective system performance and power consumption