
ARM Cortex-M

Leon Chen
ARM Taiwan, Senior FAE
Oct. 14, 2010

Agenda

1. Overview
   • Market challenges
   • Introducing the family
   • Common technology benefits
   • Spanning the applications
2. The processors
   • Cortex-M0
   • Cortex-M1
   • Cortex-M3
   • Cortex-M4
3. Fundamental technologies
   • Processor core
   • Thumb-2 instruction set
   • NVIC
   • CoreSight
   • Ecosystem and CMSIS
4. Development tools and ecosystem
5. Summary
Cortex-M processor family




Seamless embedded architecture
• Spanning cost and performance points

ARM Cortex-A Series: Applications processors for feature-rich OS and user applications
ARM Cortex-R Series: Embedded processors for real-time signal processing and control applications
ARM Cortex-M Series: Deeply embedded processors optimized for microcontroller and low-power applications

Market challenges

More features at lower cost
• Increasing connectivity (e.g. USB, Ethernet, 802.15, NFC)
• Drive for better code reuse
• Analog devices with increasing processing and communication

Energy efficiency
• Wireless sensors, motor control, metering

8/16-bit running out of performance headroom
• As complexity rises, so do frequency and memory requirements

Cortex-M processor solution

Energy efficiency
• Lower energy costs
  • Low power implementation
  • Sleep mode support
  • Wake-up Interrupt Controller
  • Increased intelligence at node

Ease of use
• Lower software costs
  • Broad tools and OS support
  • Binary compatible roadmap
  • Pure C target
  • CoreSight support

High performance
• Competitive products
  • 32-bit RISC architecture
  • High efficiency processor cores
  • Integrated Interrupt Controller (NVIC)

Reduced system cost
• Lower silicon costs
  • Thumb-2 code density
  • Area optimised designs

Spanning the application range

• Forget traditional 8/16/32-bit classifications
• Seamless architecture across all applications
• Every product optimised for ultra low power systems

Cortex-M0: 8/16-bit applications; lowest cost; optimised connectivity
Cortex-M3: 16/32-bit applications; performance efficiency; feature rich connectivity
Cortex-M4: 32-bit/DSC applications; MCU plus DSP; accelerated SIMD, FP & DSP

Standardization - driven by software reuse

#1 factor in choosing a processor is the software development tools available for it

[Chart: Factors considered most important when choosing a microprocessor, April 2005]

Cortex Microcontroller Standard (CMSIS)

• Cortex Microcontroller Software Interface Standard
• Abstraction layer for all Cortex-M processor based devices
• Developed in conjunction with silicon, tools and middleware partners

• Benefits to the embedded developer
  • Consistent software interfaces for silicon and middleware vendors
  • Simplifies re-use across Cortex-M processor-based devices
  • Reduces software development cost and time-to-market
  • Reduces learning curve for new Cortex microcontroller developers

Addressing proprietary MCU space

• Seamless architecture across all MCU and embedded applications: Cortex-M0, Cortex-M3, Cortex-M4

[Figure: proprietary architectures addressed across the Cortex-M0, Cortex-M3 and Cortex-M4 range: 8051, AVR8, Microchip PIC18, Freescale HCS08, Renesas H8, TI MSP430, Microchip PIC24, Infineon C166, Renesas SuperH, Atmel AVR32, Microchip PIC32, Microchip dsPIC, TI C2000, Freescale 58xxx]

Cortex-M processor industry adoption

• ARM Cortex-M3 processor momentum continues
  • 35+ licensees in applications from MCU, SoC, wireless sensor nodes
  • 140% CAGR in units shipped by vendors
• New ARM Cortex-M0 processor announced in 2009
  • More than 20 licensees already in MCU, mixed-signal and FSM replacement
• Cortex-M4 now also available
  • Released at the end of 2009 with five licensees already, including NXP, ST and TI

The Products

ARM Cortex-M0 processor

The smallest, lowest power ARM processor
• A third of the area and power of ARM7TDMI-S processor
• 12K gates, 47 µA/MHz on 180ULL in minimal configuration*
• 0.9 DMIPS/MHz performance

Significant advantages over 8/16-bit
• Longer battery life through energy efficiency
• Reduced system cost through code density
• Performance headroom for advanced features

Extends ARM architecture to new applications
• Ultra low-power MCU and mixed-signal devices
• Ideal sequencer or FSM replacement on SoC
• Binary and tools upwards compatible with Cortex-M3 processor

ARM Cortex-M1 - First for FPGA

• First ARM processor specifically optimized for FPGA
  • Small, high-frequency soft processor for low-cost volume FPGA
  • Upwards compatible with Cortex-M3 processor onwards on ASIC/ASSP/MCU
  • Capable of up to 200 MHz on fast FPGA device
  • Delivers up to 0.8 DMIPS/MHz efficiency from TCM
• Designed for synthesis on multiple FPGA types
  • Actel ProASIC3, Actel Igloo and Actel Fusion
  • Altera Cyclone-III, Altera Stratix-III
  • Xilinx Spartan-3, Xilinx Virtex-5
• 3 channels to market
  • Traditional ARM licensing
  • NEW: Altera/Arrow 1X design start
  • NEW: Freely available to Actel users

Cortex-M3 processor - technical excellence

• Optimal performance, high efficiency processor core: 1.25 DMIPS/MHz
• Rich, unified Thumb-2 high performance instruction set
• Integrated bus matrix for increased performance
• Advanced power management features and capabilities
• Fully configurable to balance features and silicon area
• Smallest code size and reduced memory requirements
• Low latency, integrated Nested Vectored Interrupt Controller (NVIC)
• Sophisticated debug and trace support
• Memory Protection Unit (MPU)
• Embedded Trace Macrocell (ETM)
• Fault Robust Interface

• Launched 2004
  • Broad adoption within microcontroller and embedded SoC markets
• Rev2 released in 2008 with many new power management and configuration capabilities

"...the Cortex-M3 processor will propel us again towards a breakthrough in performance, ease of use and quality, while also providing a competitive cost structure for our products. We feel that the Cortex-M3 processor will play an important role in accelerating the convergence of the MCU market."
Jim Nicholas, GM Microcontroller Division, ST

Cortex-M4 for digital signal control

MCU: Ease of use, C Programming, Interrupt handling, Ultra low power
DSP: Harvard architecture, Single cycle MAC, Floating Point, Barrel shifter

[Figure: the Cortex-M4 sits at the overlap of the MCU and DSP feature sets]
Cortex-M4 processor details

• ARMv7E-M architecture
  • Thumb-2 technology
  • ARMv6 SIMD and DSP
  • Single cycle MAC (up to 32 x 32 + 64 -> 64)
  • Optional decoupled single precision FPU
  • Integrated configurable NVIC
  • Compatible with Cortex-M3 processor
• Microarchitecture
  • 3-stage pipeline with branch speculation
  • 3x AMBA AHB-Lite bus interfaces
• Configurable for ultra low power
  • Deep Sleep Mode, Wakeup Interrupt Controller
  • Power down features for the optional Floating Point Unit
• Flexible configurations for wider applicability
  • Configurable Interrupt Controller (1-240 interrupts and priorities)
  • Optional Memory Protection Unit, optional debug and trace

[Block diagram: dotted boxes denote optional blocks]

Highest in-class efficiency

The Cortex-M4 is ~2X more efficient on most DSP tasks than leading 16- and 32-bit MCU devices with DSP extensions

[Chart: cycle counts on DSP tasks compared for a 16-bit MCU, a 32-bit MCU and the 32-bit Cortex-M4; smaller is better]

Fundamental Technologies

Instruction set architecture

• Thumb
  • 32-bit operations in 16-bit instructions
  • Introduced in ARM7TDMI processor (T stands for Thumb)
  • Subsequently supported in every ARM processor developed since
• Thumb-2
  • Enables a performance optimised blend of 16/32-bit instructions
  • All processor operations can be handled in Thumb state
  • Supported across the Cortex-M processor range
[Figure: Thumb instruction set upwards compatibility, from Thumb on ARM7 and ARM9 through Cortex-M0, Cortex-M3, Cortex-M4, Cortex-R4 and Cortex-A9]
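To make "32-bit operations in 16-bit instructions" concrete, the sketch below builds the 16-bit Thumb encoding of `MULS Rdm, Rn, Rdm`, the instruction used in the code density comparison later in this deck. The function name `thumb_encode_muls` is hypothetical and exists only for this illustration; the bit layout follows the 16-bit Thumb data-processing format and should be checked against the architecture reference manual before being relied on.

```c
#include <stdint.h>

/*
 * Illustrative helper (hypothetical name): encode the 16-bit Thumb
 * instruction "MULS Rdm, Rn, Rdm".
 *
 * Layout assumed here: 0100 0011 01 | Rn (3 bits) | Rdm (3 bits).
 * Only the low registers r0-r7 are encodable in this form.
 * Returns 0 when a register number is out of range.
 */
uint16_t thumb_encode_muls(unsigned rdm, unsigned rn)
{
    if (rdm > 7u || rn > 7u) {
        return 0;
    }
    return (uint16_t)(0x4340u | (rn << 3) | rdm);
}
```

The whole 32-bit multiply, with both operands and the destination named, fits in one halfword, which is where the 2-byte code size quoted for the Cortex-M multiply comes from.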

Nested Vectored Interrupt Controller

• Faster interrupt response
  • With less software effort
• ISR written directly in C

8051
1. SJMP/LJMP from vector table to handler
2. PUSH PSW
3. ORL PSW, #00001000b (to switch register bank)
4. Starting real handler code

Cortex-M
1. Starting real handler code

• Interrupt table is simply a set of pointers to C routines
• ISRs are standard C functions
• Integrated NVIC handles:
  • Saving corruptible registers
  • Exception prioritization
  • Exception nesting
[Figure label: Tail-chain]
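The idea that the interrupt table is "simply a set of pointers to C routines" can be sketched on a host machine as follows. This is a software model only: on a real Cortex-M the NVIC does the prioritization, register saving and tail-chaining in hardware, and the vector table also holds the initial stack pointer and system exceptions. All names here (`vector_table`, `irq_priority`, `pick_pending`, `service_pending`, the two example ISRs) are hypothetical.

```c
#include <stdint.h>

#define NUM_IRQS 4

typedef void (*isr_t)(void);

/* Counters touched by the example handlers. */
volatile uint32_t timer_ticks = 0;
volatile uint32_t uart_bytes = 0;

/* ISRs are ordinary C functions: no special prologue or epilogue. */
void timer_isr(void)   { timer_ticks++; }
void uart_isr(void)    { uart_bytes++; }
void default_isr(void) { }

/* The "vector table": one function pointer per interrupt number. */
const isr_t vector_table[NUM_IRQS] = {
    timer_isr, uart_isr, default_isr, default_isr
};

/* Lower value means higher priority, as on Cortex-M. */
const uint8_t irq_priority[NUM_IRQS] = { 1, 0, 3, 3 };

/* Return the highest-priority pending IRQ number, or -1 if none.
 * Ties go to the lower IRQ number. */
int pick_pending(uint32_t pending_mask)
{
    int best = -1;
    for (int irq = 0; irq < NUM_IRQS; ++irq) {
        if ((pending_mask & (1u << irq)) == 0u) {
            continue;
        }
        if (best < 0 || irq_priority[irq] < irq_priority[best]) {
            best = irq;
        }
    }
    return best;
}

/* Run every pending handler back to back in priority order,
 * loosely mirroring what tail-chaining achieves in hardware.
 * Returns the remaining pending mask (0 when all were serviced). */
uint32_t service_pending(uint32_t pending_mask)
{
    int irq;
    while ((irq = pick_pending(pending_mask)) >= 0) {
        vector_table[irq]();
        pending_mask &= ~(1u << irq);
    }
    return pending_mask;
}
```

With IRQ 0 and IRQ 1 both pending, the model services IRQ 1 first because its priority value is lower, then moves straight to IRQ 0 without returning to the interrupted code in between.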

Code density

Cortex-M shows smaller code size than 8/16-bit devices

Consider a 16-bit multiply operation
• Required for 10-bit ADC data filtering, encryption algorithms, audio

8-bit example (8051)

    MOV   A, XL    ; 2 bytes
    MOV   B, YL    ; 3 bytes
    MUL   AB       ; 1 byte
    MOV   R0, A    ; 1 byte
    MOV   R1, B    ; 3 bytes
    MOV   A, XL    ; 2 bytes
    MOV   B, YH    ; 3 bytes
    MUL   AB       ; 1 byte
    ADD   A, R1    ; 1 byte
    MOV   R1, A    ; 1 byte
    MOV   A, B     ; 2 bytes
    ADDC  A, #0    ; 2 bytes
    MOV   R2, A    ; 1 byte
    MOV   A, XH    ; 2 bytes
    MOV   B, YL    ; 3 bytes
    MUL   AB       ; 1 byte
    ADD   A, R1    ; 1 byte
    MOV   R1, A    ; 1 byte
    MOV   A, B     ; 2 bytes
    ADDC  A, R2    ; 1 byte
    MOV   R2, A    ; 1 byte
    MOV   A, XH    ; 2 bytes
    MOV   B, YH    ; 3 bytes
    MUL   AB       ; 1 byte
    ADD   A, R2    ; 1 byte
    MOV   R2, A    ; 1 byte
    MOV   A, B     ; 2 bytes
    ADDC  A, #0    ; 2 bytes
    MOV   R3, A    ; 1 byte

Time: 48 clock cycles*
Code size: 48 bytes

16-bit example

    MOV R1,&MulOp1
    MOV R2,&MulOp2
    MOV SumLo,R3
    MOV SumHi,R4
    (Memory mapped multiply unit)

Time: 8 clock cycles
Code size: 8 bytes

ARM Cortex-M

    MULS r0,r1,r0

Time: 1 clock cycle
Code size: 2 bytes

* 8051 needs at least one cycle per instruction byte fetch as it only has an 8-bit interface
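The comparison above can be mirrored in C. The sketch below shows the same 16 x 16 multiply written two ways: directly, which a compiler targeting Cortex-M would be expected (though not guaranteed) to turn into a single multiply instruction, and byte by byte, which is the work an 8-bit core has to spell out with four partial products. The function names are hypothetical and chosen only for this illustration.

```c
#include <stdint.h>

/* Direct form: one 32-bit multiply of two 16-bit operands.
 * The casts avoid signed overflow from integer promotion. */
uint32_t mul16_direct(uint16_t x, uint16_t y)
{
    return (uint32_t)x * (uint32_t)y;
}

/* Byte-wise form: four 8 x 8 partial products, shifted and summed,
 * following the structure of the 8051 listing. */
uint32_t mul16_bytewise(uint16_t x, uint16_t y)
{
    uint8_t xl = (uint8_t)(x & 0xFFu);
    uint8_t xh = (uint8_t)(x >> 8);
    uint8_t yl = (uint8_t)(y & 0xFFu);
    uint8_t yh = (uint8_t)(y >> 8);

    uint32_t result = (uint32_t)xl * yl;   /* XL * YL            */
    result += ((uint32_t)xl * yh) << 8;    /* XL * YH, one byte up */
    result += ((uint32_t)xh * yl) << 8;    /* XH * YL, one byte up */
    result += ((uint32_t)xh * yh) << 16;   /* XH * YH, two bytes up */
    return result;
}
```

Both functions return the same value for every input; the difference is only in how much code and how many cycles the target needs to get there.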

Cortex-M Processor Power Modes

Mode                    Power                     State
Active mode             Leakage + dynamic         Running Dhrystone 2.1 benchmark
Sleep mode              Leakage + some dynamic    Core clock gated, NVIC awake
Deep Sleep mode         Leakage only              Power still on, most clocks off
Deep Sleep mode (WIC)   State retention           Most power off, all clocks off
Power off               Zero power                Power off

[Chart: power consumption for Active, Sleep, Deep Sleep, Deep Sleep (WIC) and Power Off; not to scale]

Cortex-M low power technologies

• All Cortex-M processors are specifically designed for low power, with a range of complementary technologies including:
  • Integrated architectural clock gating
  • Sleep and deep sleep modes: put the processor into a low-power state with flexible software control
  • Sleep-on-exit interrupt handling: enables the processor to sleep whenever all outstanding interrupts are complete
  • Wakeup Interrupt Controller (WIC): enables advanced interrupt-controlled processing and nW power consumption in deep sleep mode with instant wakeup

Efficiency with µA to spare

Extremely low power leakage and operation
• 67 µA/MHz active, 7 nA state retention in full configuration*
• Reduced Flash access and no speculative fetches

Dramatic energy efficiency advantage over 8/16-bit
• 2-4x shorter duty cycle than MSP430 and PIC18**
• Working smarter, sleeping longer

Ideal in power optimised designs

[Chart: active power in µA/MHz (axis 50 to 300) for M0, PIC18L and MSP430, with the callout "Available for your device"; data from public websites]

* Using ARM Physical IP with PMK on 180ULL process at 1.8V - ARM Cortex-M0 in full configuration (32 interrupts, fast mul, debug)
** Based on benchmarks in public domain.

Low Power Sleep Mode Features

Sleep Now
• Immediate sleep mode entry
• WFI or WFE
• [Diagram: Active Mode -> Sleep Mode]

Sleep On Exit
• Automatic sleep mode entry on ISR service completion
• SLEEPONEXIT bit set
• [Diagram: Active Mode -> ISR -> Sleep Mode on ISR exit]

Deep Sleep
• Communicate to system that deeper sleep is possible
• SLEEPDEEP bit set, then WFI or WFE
• [Diagram: Active Mode -> System Level Sleep Possible]
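The three sleep behaviours come down to two control bits plus a wait instruction. The sketch below models only the bit manipulation, on a plain integer so that it runs on a host. On a real device the value would live in the System Control Register (CMSIS exposes it as `SCB->SCR`) and sleep would be entered with the `__WFI()` or `__WFE()` intrinsic; the bit positions used here (SLEEPONEXIT at bit 1, SLEEPDEEP at bit 2) follow the architecture documentation. The type `sleep_policy_t` and function `scr_for_policy` are hypothetical names for this illustration.

```c
#include <stdint.h>

/* System Control Register bit masks. */
#define SCR_SLEEPONEXIT (1u << 1)
#define SCR_SLEEPDEEP   (1u << 2)

typedef enum {
    SLEEP_NOW,      /* plain WFI/WFE, normal sleep          */
    SLEEP_ON_EXIT,  /* sleep automatically when an ISR ends */
    DEEP_SLEEP      /* signal that deeper sleep is possible */
} sleep_policy_t;

/* Compute the new register value for a policy, leaving all
 * unrelated bits of the current value untouched. */
uint32_t scr_for_policy(uint32_t scr, sleep_policy_t policy)
{
    scr &= ~(SCR_SLEEPONEXIT | SCR_SLEEPDEEP);
    switch (policy) {
    case SLEEP_ON_EXIT:
        scr |= SCR_SLEEPONEXIT;
        break;
    case DEEP_SLEEP:
        scr |= SCR_SLEEPDEEP;
        break;
    case SLEEP_NOW:
    default:
        break;
    }
    return scr;
}
```

On target hardware the returned value would be written back to the register before executing the wait instruction; how deep the resulting sleep actually is depends on the system-level power controller the silicon vendor has attached.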

32-bit Energy Efficiency Advantage

[Chart 1: power (mW) against time (%); 0.1% active duty at 9 mW, 99.9% sleep at 1 µW; average power = 13 µW]

[Chart 2: power (mW) against time (%); 0.05% active duty at 9 mW, 99.95% sleep at 1 µW; average power = 6.8 µW, 47% lower]
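The argument behind these charts is that average power is a duty-cycle-weighted mix of active and sleep power, so finishing the work in half the time roughly halves the average. A minimal sketch of that relation follows; the function name `average_power_uw` is hypothetical, and the model ignores wake-up overhead and anything else the charts may include.

```c
/*
 * Duty-cycle-weighted average power, all values in microwatts.
 * active_fraction is the share of time spent active (0.0 to 1.0).
 */
double average_power_uw(double active_uw, double sleep_uw,
                        double active_fraction)
{
    return active_fraction * active_uw
         + (1.0 - active_fraction) * sleep_uw;
}
```

With 9 mW active and 1 µW asleep, this simple model gives about 10 µW at 0.1% duty and about 5.5 µW at 0.05% duty, a reduction of roughly 45%. The averages quoted on the slide are somewhat higher, presumably because the charts account for details the extracted text does not capture, but the proportional saving is of the same order.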

Optimised Debug with 2-pin SWD

The recommended 2 pin debug solution:
• Optimised to access memory mapped debug devices
• Tested and supported with ARM deliverables
• Silicon proven since 2004
• Widely supported by a large debug tool ecosystem

Cortex-M0 integration
• Debug Access Port (DAP): debugger access to memory, peripherals and optional test logic
• Break Point Unit (BPU)
• Data Watchpoint (DWT)
• Core debug support (halt, single step, etc)

[Figure: Debugger (e.g. µVision) connected over USB to an In-Circuit Debugger (e.g. ULINK2), which connects over JTAG or Serial Wire to the DAP of the Cortex-M0 inside the microcontroller of the targeted embedded system; the bus links the processor core to ROM/Flash, SRAM, peripherals and additional test logic]

Cortex Microcontroller Software Interface Standard (CMSIS)

• Developed by ARM, silicon and software vendors
• Common interface to peripherals, real-time operating systems, and middleware components
• Defines the basic requirements to achieve software re-usability and portability

An Example AMBA AHB-Lite System

AHB-Lite
• Single master
• Simple slaves
• Single clock edge operation
• Uni-directional busses
• No tri-state signals
• Standard AHB modules can be used
• Good for synthesis
• Allows burst transfers
• Allows easier module design/debug
• Pipelined operation
• Very low gate count for simple systems
• No retry or split responses

[Figure: ARM processor on an AHB-Lite bus with on-chip RAM, ROM or Flash, an optional external memory interface and an APB bridge; the APB serves Keypad, Timers, GPIO and UART]

Quick start deliverables

• Example AMBA system provided in integration kit
  • Fully documented in integration and implementation guide
  • Includes example AMBA AHB-Lite single master, single layer interconnect
  • Includes example GPIO, zero wait-state SRAM/ROM ctrl, PMU components
  • Includes C source files for integration test

[Figure: example ARM Cortex-M0 AMBA system provided: ARM Cortex-M0 with interrupts, reset controller and PMU (power control interface) on an AMBA AHB-Lite interconnect with SRAM ctrl (on-chip RAM), ROM ctrl (on-chip ROM) and GPIO 0, GPIO 1, GPIO 2]

Example memory map

  0xFFFFFFFF
    AHB default slave
  0xF0001FFF
    Example ROM tables
  0xF0000000
    Reserved
  0xE0100000
    Private Peripheral Bus
  0xE0000000
    AHB default slave
  0x40001800
    GPIO 2
  0x40001000
    GPIO 1
  0x40000800
    GPIO 0
  0x40000000
    AHB default slave
  0x20100000
    Memory SRAM
  0x20000000
    AHB default slave
  0x00100000
    Memory ROM
  0x00000000
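An address decoder for a map like this one can be sketched as a small lookup table. The region boundaries below are read off the example memory map as it could be recovered from the slide, so they are an assumption rather than a specification: in particular, the GPIO blocks are taken to be 2 KB each, and anything not listed falls through to the AHB default slave. The names `region_t`, `example_map` and `decode_address` are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

typedef enum {
    REGION_ROM,
    REGION_SRAM,
    REGION_GPIO0,
    REGION_GPIO1,
    REGION_GPIO2,
    REGION_PPB,            /* Private Peripheral Bus */
    REGION_RESERVED,
    REGION_ROM_TABLES,
    REGION_DEFAULT_SLAVE   /* unmapped: handled by the AHB default slave */
} region_t;

typedef struct {
    uint32_t first;   /* first address in the region (inclusive) */
    uint32_t last;    /* last address in the region (inclusive)  */
    region_t region;
} map_entry_t;

/* Assumed boundaries, derived from the example memory map. */
static const map_entry_t example_map[] = {
    { 0x00000000u, 0x000FFFFFu, REGION_ROM        },
    { 0x20000000u, 0x200FFFFFu, REGION_SRAM       },
    { 0x40000000u, 0x400007FFu, REGION_GPIO0      },
    { 0x40000800u, 0x40000FFFu, REGION_GPIO1      },
    { 0x40001000u, 0x400017FFu, REGION_GPIO2      },
    { 0xE0000000u, 0xE00FFFFFu, REGION_PPB        },
    { 0xE0100000u, 0xEFFFFFFFu, REGION_RESERVED   },
    { 0xF0000000u, 0xF0001FFFu, REGION_ROM_TABLES },
};

/* Return the region that an address falls into. */
region_t decode_address(uint32_t addr)
{
    size_t count = sizeof example_map / sizeof example_map[0];
    for (size_t i = 0; i < count; ++i) {
        if (addr >= example_map[i].first && addr <= example_map[i].last) {
            return example_map[i].region;
        }
    }
    return REGION_DEFAULT_SLAVE;
}
```

A software model like this is handy for integration tests: it gives a quick check that a given access should reach a real slave rather than the default slave, before running the same access in simulation.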

Development Tools and Ecosystem

Cortex-M0 mbed Evaluation System

• Very low cost USB / evaluation board
• Plug it in: appears as USB disk linking to website
• No installation!
• Compile a program online
• Save to the board and program runs!
• "Hello World!" in 5 minutes

http://mbed.org/

ARM Development Products

• A summary of all ARM development tool products
• All are compatible with the Cortex-M family of processors

                        ASICs and ASSPs        MCUs and Smart cards
Software Development    RVDS 4.0, DS-5         MDK-ARM, RL-ARM
System Simulation       Fast Models            µVision simulator
Target Connection       RVI, RVT2, DSTREAM     ULINK2, ULINKPro
Hardware platforms      Eval boards & MPS      Boards

Keil Microcontroller Development Kit (MDK)

Microcontroller Prototyping System (MPS)

• FPGA hardware platform for Cortex-M
  • MPS includes Cortex-M3 and M0 images with a free upgrade to Cortex-M4
  • Evaluate Cortex-M without a full license
  • Save prototyping cost and time
• Comprehensive memory and peripheral subsystem
  • USB, Ethernet, DVI, MMC/SD, FlexRay/CAN
• User expandable
  • Use child board interface to develop custom IP or integrate proven third-party IP and peripheral systems
• Comprehensive development tools
  • Altera Quartus II (Web Edition)
  • Keil MDK-ARM (eval) and ULINK2

MicroLib optimized C libraries

MicroLib significantly reduces library size in embedded applications

Subset of standard RealView C Library
• Developed for embedded and memory constrained applications
• Optimized for embedded applications
• Minimal overhead for unused OS functionality
• Unused functions removed from memory footprint
• Faster system bring-up
• Most functions initialized at point of use

Up to 92% reduction in library code size
• Empty main
• Even more for Hello World using printf

MicroLib optimized for embedded

RealView MDK libraries reduce system code size by 50% to 90%

[Charts: Library Totals and RO Totals in bytes for ARM, Thumb, Thumb (M1) and Thumb2 builds, Standard against MicroLib, with 61% and 51% savings called out]

Processor   Object    Total           Standard   MicroLib   % saving
ARM7TDMI    ARM       Library Total   21,352     8,980      57%
                      RO Total        25,608     12,816     50%
ARM7TDMI    Thumb     Library Total   17,156     6,244      64%
                      RO Total        20,129     9,348      54%
Cortex-M1   Thumb     Library Total   16,452     5,996      63%
                      RO Total        19,472     9,016      54%
Cortex-M3   Thumb-2   Library Total   15,018     5,796      61%
                      RO Total        18,616     8,976      51%

Based on Dhrystone 2.1 Benchmark

ARM ecosystem for Cortex-M

Quality as well as quantity: many of these third parties identify ARM-related business as their largest growth driver, which means robust, supported solutions

[Partner logos grouped as: toolchain platforms, debuggers, operating systems, and onwards to modeling solutions]

Over 40,000 members on ARM-based discussion forums

Conclusions

• Cortex-M spans the spectrum of embedded applications
  • Hardware and software compatibility and reuse
• Drives up energy efficiency
  • Lower energy costs and longer battery life
• High performance
  • Enables competitive products
• Reduced system cost
  • Lower silicon costs and superior code density
