ARM Processor Architecture

ARM architecture forms the basis for every ARM processor. Over time, the ARM
architecture has evolved to include architectural features to meet the growing
demand for new functionality, high performance and the needs of new and
emerging markets. There are currently two ARMv8 profiles, the ARMv8-A
architecture profile for high performance markets such as mobile and enterprise,
and the ARMv8-R architecture profile for embedded applications in automotive and
industrial control.
The ARM architecture supports implementations across a wide range of
performance points, establishing it as the leading architecture in many market
segments. This range extends from very small implementations of ARM processors
to very efficient implementations of advanced designs using state of the art
micro-architecture techniques. Implementation size, performance, and low power
consumption are key attributes of the ARM architecture.

ARM developed architecture extensions to provide support for Java acceleration
(Jazelle), security (TrustZone), SIMD, and Advanced SIMD (NEON)
technologies. The ARMv8 architecture adds a Cryptographic extension as an
optional feature.

The ARM architecture is similar to a Reduced Instruction Set Computer (RISC)
architecture, as it incorporates these typical RISC architecture features:

A uniform register file load/store architecture, where data processing operates only
on register contents, not directly on memory contents.

Simple addressing modes, with all load/store addresses determined from register
contents and instruction fields only.

Enhancements to a basic RISC architecture enable ARM processors to achieve a
good balance of high performance, small code size, low power consumption and
small silicon area.

A64
A64 is a new 32-bit fixed length instruction set to support the AArch64 execution state. The
following is a summary of the A64 ISA features.

Clean decode table based on 5-bit register specifiers

Instruction semantics broadly the same as in AArch32

31 general purpose 64-bit registers accessible at all times

No modal banking of GP registers - Improved performance and energy

Program counter (PC) and Stack pointer (SP) not general purpose registers

Dedicated zero register available for most instructions
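The 5-bit register specifiers and the zero register listed above can be illustrated with a small decoder. This is a minimal sketch, not a full disassembler: it handles only the A64 ADD (shifted register) encoding, and the function names `decode_add_shifted_register`, `register_name` and `describe` are hypothetical helpers introduced for illustration.

```python
def decode_add_shifted_register(word: int) -> dict:
    """Split a 32-bit A64 ADD (shifted register) encoding into its fields."""
    if not 0 <= word <= 0xFFFFFFFF:
        raise ValueError("A64 instructions are exactly 32 bits wide")
    # op (bit 30) = 0, S (bit 29) = 0, bits [28:24] = 0b01011, bit 21 = 0
    if (word & 0x7F200000) != 0x0B000000:
        raise ValueError("not an ADD (shifted register) encoding")
    return {
        "sf": (word >> 31) & 0x1,      # 1 = 64-bit operands, 0 = 32-bit
        "shift": (word >> 22) & 0x3,   # 0 = LSL, 1 = LSR, 2 = ASR
        "rm": (word >> 16) & 0x1F,     # 5-bit register specifier
        "imm6": (word >> 10) & 0x3F,   # shift amount
        "rn": (word >> 5) & 0x1F,      # 5-bit register specifier
        "rd": word & 0x1F,             # 5-bit register specifier
    }


def register_name(index: int, sf: int) -> str:
    """Name a register; in this encoding, specifier 31 is the zero register."""
    prefix = "X" if sf else "W"
    return f"{prefix}ZR" if index == 31 else f"{prefix}{index}"


def describe(word: int) -> str:
    """Render the decoded instruction as assembly-like text."""
    f = decode_add_shifted_register(word)
    if f["shift"] == 3:
        raise ValueError("reserved shift type")
    text = "ADD {}, {}, {}".format(
        register_name(f["rd"], f["sf"]),
        register_name(f["rn"], f["sf"]),
        register_name(f["rm"], f["sf"]),
    )
    if f["imm6"]:
        text += ", {} #{}".format(("LSL", "LSR", "ASR")[f["shift"]], f["imm6"])
    return text


print(describe(0x8B020020))
```

Because every register field is five bits wide, 32 values are available: 31 general purpose registers plus one encoding that, for this instruction, reads as zero.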


Key differences from A32 are:

New instructions to support 64-bit operands. Most instructions can have 32-bit or 64-bit
arguments

Addresses assumed to be 64 bits in size. LP64 and LLP64 are the primary data models
targeted

Far fewer conditional instructions than in AArch32: conditional {branches, compares,
selects}

No arbitrary-length load/store multiple instructions; LDP/STP instructions for handling
pairs of registers are added in A64
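The two data models named above differ only in the width of the C `long` type. The sketch below records the conventional type widths for each model; the table `DATA_MODELS` and the helper `differing_types` are hypothetical names used for illustration, and the widths are the usual definitions of LP64 and LLP64 rather than anything stated in this text.

```python
# Conventional C type widths in bits for the two data models.
DATA_MODELS = {
    "LP64":  {"int": 32, "long": 64, "long long": 64, "pointer": 64},
    "LLP64": {"int": 32, "long": 32, "long long": 64, "pointer": 64},
}


def differing_types(model_a: str, model_b: str) -> list:
    """Return the C types whose width differs between two data models."""
    a, b = DATA_MODELS[model_a], DATA_MODELS[model_b]
    return [name for name in a if a[name] != b[name]]


print(differing_types("LP64", "LLP64"))
```

In both models pointers are 64 bits wide, which matches the assumption that addresses are 64 bits in size.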

Advanced SIMD and scalar floating-point support are semantically similar to the A32 support;
they share a floating-point/vector register file, V0 to V31. A64 provides three major functional
enhancements:

More 128 bit registers: 32 x 128 bit wide registers; can be viewed as 64-bit wide
registers

Advanced SIMD supports DP floating-point execution

Advanced SIMD supports full IEEE 754 execution: rounding modes, denormals, NaN
handling

There are some additional floating-point instructions for IEEE 754-2008:

MaxNum/MinNum instructions

Float to Integer conversions with RoundTiesAway
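The behaviour behind these two additions can be sketched in software. The following is a rough model of the IEEE 754-2008 operations, not of any specific instruction: `round_ties_away`, `max_num` and `min_num` are hypothetical helper names, signalling NaNs are not modelled, and infinities are rejected rather than saturated.

```python
import math


def round_ties_away(x: float) -> int:
    """Round to the nearest integer; exact halves go away from zero."""
    if math.isnan(x) or math.isinf(x):
        raise ValueError("this sketch only handles finite inputs")
    magnitude = abs(x)
    whole = math.floor(magnitude)
    # The fractional part of a binary float is exactly representable,
    # so this comparison does not suffer from rounding error.
    if magnitude - whole >= 0.5:
        whole += 1
    return -whole if x < 0 else whole


def max_num(a: float, b: float) -> float:
    """maxNum: if exactly one operand is a NaN, return the other operand."""
    if math.isnan(a):
        return b
    if math.isnan(b):
        return a
    return max(a, b)


def min_num(a: float, b: float) -> float:
    """minNum: if exactly one operand is a NaN, return the other operand."""
    if math.isnan(a):
        return b
    if math.isnan(b):
        return a
    return min(a, b)


# Python's built-in round() uses ties-to-even, so the results differ on halves.
print(round(2.5), round_ties_away(2.5))
print(max_num(float("nan"), 1.0))
```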


The register packing model in A64 is also different from A32:

All vector registers 128-bits wide, Vx[127:0] :

Double precision scalar floating point uses Vx[63:0]

Single precision scalar floating point uses Vx[31:0]
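The packing model above can be pictured by treating a vector register as a 128-bit value. This is a minimal sketch: `write_scalar` and `read_bits` are hypothetical helpers, the register is modelled as 16 little-endian bytes, and the sketch assumes the unused upper bits are zero.

```python
import struct


def write_scalar(value: float, precision: str) -> bytes:
    """Return a 16-byte image of a V register holding one scalar value."""
    fmt = {"single": "<f", "double": "<d"}[precision]
    payload = struct.pack(fmt, value)
    # Assumption of this sketch: bits above the scalar are zero.
    return payload + bytes(16 - len(payload))


def read_bits(register: bytes, high: int, low: int) -> int:
    """Extract bits [high:low] of the 128-bit register image."""
    value = int.from_bytes(register, "little")
    width = high - low + 1
    return (value >> low) & ((1 << width) - 1)


double_reg = write_scalar(1.0, "double")
single_reg = write_scalar(1.0, "single")
print(hex(read_bits(double_reg, 63, 0)))   # double precision lives in Vx[63:0]
print(hex(read_bits(single_reg, 31, 0)))   # single precision lives in Vx[31:0]
```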

A32
ARM, generically known as A32, is a fixed-length (32-bit) instruction set. It is the base 32-bit ISA
used in the ARMv4T, ARMv5TEJ and ARMv6 architectures. In these architectures it is used in
applications requiring high performance, or for handling hardware exceptions such as interrupts
and processor start-up.
The ARM ISA is also supported in the Cortex-A and Cortex-R profiles of the Cortex
architecture for performance critical applications, and for legacy code. Most of its functionality is
subsumed into the Thumb instruction set with the introduction of Thumb-2 technology. Thumb
(T32) benefits from improved code density.
ARM instructions are 32 bits wide, and are aligned on 4-byte boundaries.
Most ARM instructions can be "conditionalised" to only execute when previous instructions have
set a particular condition code. This means that instructions only have their normal effect on the
programmer's model operation, memory and coprocessors if the N, Z, C and V flags in the
Application Program Status Register satisfy a condition specified in the instruction. If the flags
do not satisfy this condition, the instruction acts as a NOP, that is, execution advances to the
next instruction as normal, including any relevant checks for exceptions being taken, but has no
other effect. This conditionalisation of instructions allows small sections of if and while
statements to be encoded without the use of branch instructions.
The condition codes are:

Condition Code    Meaning
N                 Negative condition code, set to 1 if result is negative
Z                 Zero condition code, set to 1 if the result of the instruction is 0
C                 Carry condition code, set to 1 if the instruction results in a carry condition
V                 Overflow condition code, set to 1 if the instruction results in an overflow condition
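As an illustration of how the four flags are derived and then tested, the sketch below models a 32-bit addition and a handful of condition checks. It is a simplified software model under stated assumptions: `add_with_flags` and `condition_passed` are hypothetical names, and the two-letter condition mnemonics (EQ, NE, CS, CC, MI, PL, VS, VC) are the standard ARM ones rather than something listed in this text.

```python
MASK32 = 0xFFFFFFFF


def add_with_flags(a: int, b: int):
    """Add two 32-bit values and return (result, flags) like a flag-setting ADD."""
    a &= MASK32
    b &= MASK32
    total = a + b
    result = total & MASK32
    flags = {
        "N": result >> 31,                            # top bit of the result
        "Z": int(result == 0),                        # result is zero
        "C": int(total > MASK32),                     # unsigned carry out
        # Signed overflow: both operands differ in sign from the result.
        "V": (((a ^ result) & (b ^ result)) >> 31) & 1,
    }
    return result, flags


def condition_passed(cond: str, flags: dict) -> bool:
    """Decide whether a conditional instruction would take effect."""
    checks = {
        "EQ": flags["Z"] == 1, "NE": flags["Z"] == 0,
        "CS": flags["C"] == 1, "CC": flags["C"] == 0,
        "MI": flags["N"] == 1, "PL": flags["N"] == 0,
        "VS": flags["V"] == 1, "VC": flags["V"] == 0,
    }
    return checks[cond]


result, flags = add_with_flags(0xFFFFFFFF, 1)
print(hex(result), flags)
# An instruction conditional on EQ takes effect; one conditional on NE acts as a NOP.
print(condition_passed("EQ", flags), condition_passed("NE", flags))
```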

T32
Cost-sensitive embedded control applications such as cell phones, disk drives, modems and
pagers are always looking for ways to achieve 32-bit performance and address space at
minimal cost with respect to memory footprint.
The Thumb (T32) instruction set provides a subset of the most commonly used 32-bit ARM
instructions which have been compressed into 16-bit wide opcodes. On execution, these 16-bit
instructions are decompressed transparently to full 32-bit ARM instructions in real time without
performance loss.
Thumb offers the designer:

Excellent code-density for minimal system memory size and cost


32-bit performance from 8- or 16-bit memory on an 8- or 16-bit bus for low system cost.

Plus the established ARM features

Industry-leading MIPS/Watt for maximum battery life and RISC performance

Small die size for integration and minimum chip cost

Global multi-partner sourcing for secure supply.


Designers can use both 16-bit Thumb and 32-bit ARM instruction sets and therefore have the
flexibility to emphasize performance or code size on a sub-routine level as their applications
require.
The Thumb ISA is widely supported by the ARM ecosystem, including a complete Windows
software development environment as well as development and evaluation cards.
Improved Code Density with Performance and Power Efficiency
Thumb-2 technology made Thumb a mixed (32- and 16-bit) length instruction set, and is the
instruction set common to all ARMv7 compliant ARM Cortex implementations. Thumb-2
provides enhanced levels of performance, energy efficiency, and code density for a wide range
of embedded applications.
The technology is backwards compatible with existing ARM and Thumb solutions, while
significantly extending the features available in the Thumb instruction set, allowing more of the
application to benefit from the best-in-class code density of Thumb. For performance-optimised
code, Thumb-2 technology uses 31 percent less memory to reduce system cost, while providing
up to 38 percent higher performance than existing high density code, which can be used to
prolong battery life or to enrich the product feature set.
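The mixed 16- and 32-bit encoding described in this section means a decoder has to work out each instruction's length from its first halfword. The sketch below shows the usual rule: a halfword whose top five bits are 0b11101, 0b11110 or 0b11111 begins a 32-bit instruction, and anything else is a complete 16-bit instruction. The function names are hypothetical, and the rule comes from the Thumb-2 encoding scheme rather than from this text.

```python
def thumb_instruction_length(first_halfword: int) -> int:
    """Return the instruction length in bytes, given its first 16-bit halfword."""
    if not 0 <= first_halfword <= 0xFFFF:
        raise ValueError("expected a 16-bit halfword")
    top5 = first_halfword >> 11
    return 4 if top5 in (0b11101, 0b11110, 0b11111) else 2


def instruction_lengths(halfwords: list) -> list:
    """Walk a halfword stream and return the length of each instruction."""
    lengths = []
    i = 0
    while i < len(halfwords):
        length = thumb_instruction_length(halfwords[i])
        if length == 4 and i + 1 >= len(halfwords):
            raise ValueError("stream ends in the middle of a 32-bit instruction")
        lengths.append(length)
        i += length // 2
    return lengths


# BX LR (16-bit), the two halves of a BL (32-bit), then NOP (16-bit).
print(instruction_lengths([0x4770, 0xF000, 0xF800, 0xBF00]))
```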

ARMv8-A Architecture
The ARMv8 architecture introduces 64-bit support to the ARM architecture with a focus on
power-efficient implementation while maintaining compatibility with existing 32-bit software. By
adopting a clean approach ARMv8-A processors extend the performance range available while
maintaining the low power consumption characteristics of the ARM processors that will power
tomorrow's most innovative and efficient devices. The current ARM processors supporting the
ARMv8-A architecture are the Cortex-A72, Cortex-A57 and Cortex-A53 processors.

Increased availability of larger registers for general purpose and media instructions, a greater
addressing range and cryptography instructions enable new categories of applications for
superphone and tablet computing, while bringing the ARM benefits of efficient design and low
power consumption to applications where 64-bit computing is already established, such as
servers and network infrastructure, promising to revolutionize the data center.
The ARMv8 architecture maintains compatibility with the comprehensive software ecosystem for
32-bit components. This enables a wealth of software optimized for existing ARM processors to
benefit from the enhanced performance of processors based on the ARMv8 architecture,
while the addition of 32-bit cryptographic instructions further enables optimization for
emerging requirements.
Developing the software to make best use of the new 64-bit capabilities requires the availability
of excellent tools, test platforms and key open source components. While developing the
architecture and the processors based on ARMv8-A, ARM has also ensured
that the essential tools for development are available to software developers today, enabling the
ARM software ecosystem to continue to innovate around the Architecture for the Digital World.
ARM DS-5 Development Studio
A comprehensive suite of development tools for all ARM processors, DS-5 Ultimate
Edition features the LLVM-based ARM Compiler 6 and ARMv8 Fixed Virtual Platform for
world-class software development on the ARMv8-A architecture.
Linaro
For GNU tools and Linux kernel support, pre-built versions are available through the Linaro
website: www.linaro.org/engineering/ARMv8
ARM Fast Models
Used in conjunction with DS-5 for ARMv8, ARM Fast Models can help developers debug,
analyse, and optimize their applications throughout the development cycle, providing a flexible
platform for software testing prior to the availability of silicon.
ARMv8 Foundation Model
To enable a broad community of developers, ARM is making available the ARMv8 Foundation
Model, based on ARM Fast Model technology. This provides the essentials needed to prove
software prior to readily available silicon platforms.

ARM Juno Development Platform


The Juno ARM Development Platform (ADP) is a software development platform for ARMv8-A.
It includes the Juno Versatile Express board and an ARMv8-A reference software port available
through Linaro. The Juno hardware provides software developers with an open, vendor neutral
ARMv8 development platform with Cortex-A57 and Cortex-A53 MPCore for ARMv8 big.LITTLE,
Mali-T624 for 3D graphics acceleration and GP-GPU compute, and an SoC architecture aligned
with Level 1 (Server) Base System Architecture.

ARMv8-A introduces 64-bit architecture support to the ARM architecture and includes:

64-bit general purpose registers, SP (stack pointer) and PC (program counter)

64-bit data processing and extended virtual addressing

Two main execution states:

AArch64 - The 64-bit execution state including exception model, memory model,
programmers' model and instruction set support for that state

AArch32 - The 32-bit execution state including exception model, memory model,
programmers' model and instruction set support for that state


The execution states support three key instruction sets:

A32 (or ARM): a 32-bit fixed length instruction set, enhanced through the different
architecture variants. Part of the 32-bit architecture execution environment now referred to as
AArch32.

T32 (Thumb): introduced as a 16-bit fixed-length instruction set, subsequently enhanced
to a mixed-length 16- and 32-bit instruction set on the introduction of Thumb-2 technology. Part
of the 32-bit architecture execution environment now referred to as AArch32.

A64: a fixed-length (32-bit) instruction set for the 64-bit AArch64 execution state that offers
similar functionality to the ARM and Thumb instruction sets. Introduced with ARMv8-A, it is the
AArch64 instruction set.
ARM ISAs are constantly improving to meet the increasing demands of leading edge
applications developers, while retaining the backwards compatibility necessary to protect
investment in software development. In ARMv8-A there are some additions to A32 and T32 to
maintain alignment with the A64 instruction set.

ARMv8-R Architecture
The ARMv8-R architecture significantly enhances ARM's real-time 32-bit processor solutions
with new features to expand their functionality and capability to meet rapidly evolving market
requirements. In particular, processors implementing the ARMv8-R architecture will be suitable
for the rapidly expanding number of safety-related applications in automotive and industrial
control.
The ARMv8-R architecture complements the ARMv8-A architecture and builds on the rich
heritage of the 32-bit ARMv7-R architecture used for the company's market-leading Cortex-R
series of real-time processors.
A key innovation within the ARMv8-R architecture is the introduction of a bare metal Hypervisor
mode which enables programmers to combine different operating systems, applications and
real-time tasks on a single processor whilst ensuring strict isolation between them. This
facilitates software consolidation and re-use which will accelerate time-to-market and reduce
development costs.

In addition, the ARMv8-R architecture will enable overall improvements in software quality and
will support increasingly sophisticated embedded programming techniques such as
model-based automated code generation.
The deployment of ARMv8-R architecture will reduce costs, increase efficiency and improve
performance of embedded systems to support emerging automotive applications such as
Advanced Driver Assistance Systems and vehicle-to-vehicle communications as well as factory
automation applications and Human-Machine interface. For example, a microcontroller
incorporating an ARMv8-R processor could host Linux for graphical management and
networking functions together with real-time operating system workloads such as motor control.
The ARMv8-R architecture also permits coexistence of both virtual memory and protected
memory systems on the same processor enabling an Operating System using memory
management, such as Linux, to be integrated with a Real Time Operating System.
Other ARMv8-R architecture features include:

Improved memory protection scheme which substantially reduces context switching time

ARM NEON advanced SIMD instructions for significantly improved radar and image
processing tasks

Instructions carried over from the ARMv8-A architecture such as CRC (Cyclic
Redundancy Check) for use in detecting the corruption of program code or data.
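The corruption check mentioned in the last item can be sketched in software. This example uses `zlib.crc32` from the Python standard library as a stand-in; it illustrates the general CRC technique and makes no claim about how the hardware instructions are used. The helper names `checksum` and `is_intact` are hypothetical.

```python
import zlib


def checksum(data: bytes) -> int:
    """Compute a 32-bit CRC over a block of code or data."""
    return zlib.crc32(data) & 0xFFFFFFFF


def is_intact(data: bytes, expected_crc: int) -> bool:
    """Return True if the block still matches the CRC recorded earlier."""
    return checksum(data) == expected_crc


image = b"123456789"
recorded = checksum(image)             # stored when the image is known to be good

corrupted = bytearray(image)
corrupted[0] ^= 0x01                   # flip a single bit

print(is_intact(image, recorded))
print(is_intact(bytes(corrupted), recorded))
```

A CRC of this kind detects any single-bit error, which is why the second check fails.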
In support of the introduction of the ARMv8-R architecture, ARM is working to ensure a robust
design ecosystem to support the new features. The DS-5 ARM tools and Fast Models already
support the ARMv8-A architecture, and support for the ARMv8-R architecture will be available to
lead partners in Q3 2014. In addition, timed models, automotive simulation system-level tools, and
mechanical and electronic modelling tools are being developed by ARM EDA partners in
advance of silicon.

CORTEX A
High-Performance Applications Processing
The ARM Cortex-A series of applications processors provide a range of solutions for
devices undertaking complex compute tasks, such as hosting a rich Operating
System (OS) platform, and supporting multiple software applications.
Cortex-A series processors scale efficiently across a range of the highest performing
consumer, embedded and enterprise devices. These include a spectrum of
smartphones, mobile computing platforms, digital TVs, set-top boxes, and rich IoT
devices through to enterprise networking, and server solutions. In an increasingly
energy-conscious business landscape, the power efficiency of Cortex-A processors
can provide significant advantages.

ARM's processors all share a commonly supported architecture and feature set,
ensuring compatibility across the range of instruction sets. The Cortex-A17
processor which was introduced last year, the mature Cortex-A15, the widely shipped
Cortex-A9, and high-efficiency Cortex-A7 and Cortex-A5 processors all use
the same ARMv7-A architecture, and therefore share full application compatibility,
including support for the traditional ARM, Thumb and high-performance Thumb-2
instruction sets. ARM also enables 64-bit computing with its ARMv8-A architecture
which is supported by the Cortex-A72, Cortex-A57 and Cortex-A53 processors. The
ARMv8-A architecture also has a specialized execution state allowing it to process
legacy ARM 32-bit applications. This provides an excellent path to upgrade for the
existing 32-bit ecosystem and ensures the 64-bit ecosystem is backwards
compatible.

High-performance cores such as the Cortex-A72, Cortex-A57, Cortex-A17 and
Cortex-A15 processors can be paired with architecturally aligned high efficiency
cores like the Cortex-A53 and Cortex-A7 processors in a big.LITTLE configuration
for ARMv8-A and ARMv7-A respectively. This power-optimization technology allows
the high-performance core to deliver peak-performance for intensive tasks such as
instant webpage loading while background processing is undertaken by high
efficiency cores. This is done seamlessly and is transparent to the applications and
middleware, resulting in significantly improved overall energy efficiency and an
exceptionally responsive user experience. High-performance Cortex-A processors
are ideally suited to enable servers to meet the growing need for performance
requirements delivered in ever more power-efficient solutions.

With the move to ever increasing driver assist, passive and active safety systems
and advanced driver interfaces, the compute power of vehicles is set to significantly
increase, and thus ARM Cortex-A processors are increasingly being designed in for
safety related automotive applications.

Cortex-A Series Characteristics
Cortex-A processors are specifically designed to execute complex functions and
applications such as those required by consumer devices like smartphones and
tablets. Their performance efficiency is also making them an increasingly popular
choice for servers and enterprise applications where large core clusters can be
combined for optimal solutions.

In consumer electronics, Cortex-A processors are ideal for providing fast and
immersive connected experiences. Their low-power architecture enables all-day
browsing, connectivity, console-quality gaming, technologies such as NEON and
support for the widest mobile app ecosystem. Across enterprise and networking
solutions, Cortex-A processors enable highly scalable solutions to match
performance requirements for more power-efficient package transfer, basestations,
edge routers and servers.

All Cortex-A based processors share a commonly supported architecture and feature
set, with each processor based on either the ARMv7-A or ARMv8-A architecture and
feature set. The ARMv8-A architecture has a 64-bit execution state and can also
support existing 32-bit applications. This backwards compatibility strengthens the
64-bit ecosystem. This commonality makes them the best solution for open platform
design where compatibility and portability of software between designs is of utmost
importance.

Cortex-A processors offer support for a range of full Operating Systems including
Linux, as well as others requiring a Memory Management Unit such as Android,
Chrome and MontaVista.

On top of the ARMv7-A and ARMv8-A architecture support, Cortex-A series
processors have been developed to run a number of architecture extensions to
provide support for security (TrustZone), SIMD, and Advanced SIMD (NEON)
technologies. Other extensions and technologies supported by Cortex-A series
processors include:

Instruction Set support - ARM, Thumb, Thumb-2, DSP


Advanced single and double-precision Floating Point support
Virtualization
Large Physical Address Extension (LPAE) addressing up to 1TB of physical memory
big.LITTLE processing
Jazelle
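The 1TB figure quoted for LPAE in the list above follows from simple address arithmetic. The sketch assumes LPAE extends physical addresses to 40 bits, which is not stated in this text, and `addressable_bytes` is a hypothetical helper.

```python
GIB = 1 << 30   # bytes in one gibibyte
TIB = 1 << 40   # bytes in one tebibyte


def addressable_bytes(address_bits: int) -> int:
    """Number of distinct byte addresses for a given address width."""
    return 1 << address_bits


print(addressable_bytes(32) // GIB, "GB with 32-bit physical addresses")
print(addressable_bytes(40) // TIB, "TB with 40-bit physical addresses")
```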
ARM has worked closely with its partners to bring the performance and energy
efficiency of Cortex-A series processors to Android devices. The vast ARM
ecosystem brings with it a deep wealth of mobile knowledge in both hardware and
software to maximize the benefits of the Android OS, and to ensure the best
possible experience for users. The combination of these benefits makes Android
better on ARM.

Multicore Technology
All ARMv7-A and ARMv8-A based processor cores featured in the current ARM
Processor Portfolio support ARM's multicore technologies.

Single to quad-core implementation for performance orientated applications


Supports symmetric and asymmetric OS implementations
Coherency throughout the processor exported to system via Accelerator Coherency
Port (ACP)
The big.LITTLE compatible processors extend multi-core coherence beyond the 1-4
core clusters with AMBA 4 ACE (AMBA Coherency Extension) and AMBA 5 CHI
(Coherent Hub Interface).

Industry Standard

The success of the Cortex-A processors is built on the innovation of ARM partners
who have licensed these processors and developed a wide array of success stories
in various markets.

CORTEX R
Ultimate Reliability for Embedded Real-Time Processing
The ARM Cortex-R real-time processors offer high-performance computing solutions for
embedded systems where reliability, high availability, fault tolerance, maintainability and
deterministic real-time responses are essential.
The Cortex-R series processors provide fast time-to-market through proven technology shipped
in billions of products, and leverage the vast ARM ecosystem and global, local language, 24/7
support services to ensure rapid and low-risk development.
Cortex-R series processors deliver fast and deterministic processing and high performance,
while meeting challenging real-time constraints in a range of situations. They combine these
features in a performance, power and area optimized package, making them the trusted choice
in reliable systems demanding high error-resistance.

Cortex-R Series Characteristics


Fundamental to the Cortex-R4, Cortex-R5 and Cortex-R7 processors are key features that are demanded
by deeply embedded and real-time markets such as automotive safety or wireless baseband, where
high-performance, real-time, safe and cost-effective processing is required.
High performance: Rapid execution of complex code and DSP functionality

High performance, high clock-frequency, deeply pipelined micro-architecture

Dual-core multi-processing (AMP/SMP) configurations

Hardware SIMD instructions for very high performance DSP and media functions

Real-time: Deterministic operation to ensure responsiveness and high throughput

Fast, bounded and deterministic interrupt response

Tightly Coupled Memories (TCM) local to the processor for fast-responding code/data

Low-Latency Interrupt Mode (LLIM) to accelerate interrupt entry

Reliable: Detects errors and maintains system operation

User and privileged software operating modes with Memory Protection Unit (MPU)

ECC and parity error detection/correction for Level-1 memory system and buses

Dual-Core Lock-Step (DCLS) redundant core configurations

Cost effective: Fast time-to-market and customizable features

Best-in-class energy and die area/cost efficiency

Configuration to include/exclude features to optimize power, performance and area

Fast development and testing with configurable debug breakpoints and watchpoints through
CoreSight debug access port with embedded trace module options

Industry Standard
ARM Cortex-R series processors set the industry standard for a wide range of deeply embedded
semiconductor application markets with a broad range of licensees throughout the worldwide
semiconductor industry. There are over 80 Cortex-R series licensees.

CORTEX M
Scalable and Low-Power Technology for all Embedded Applications
The ARM Cortex-M processor family is a range of scalable and compatible, energy efficient,
easy to use processors designed to help developers meet the needs of tomorrow's smart and
connected embedded applications. Those demands include delivering more features at a lower
cost, increasing connectivity, better code reuse and improved energy efficiency.
The Cortex-M family is optimized for cost and power sensitive MCU and mixed-signal devices
for applications such as Internet of Things, connectivity, motor control, smart metering, human
interface devices, automotive and industrial control systems, domestic household appliances,
consumer products and medical instrumentation.
More information on ARM embedded products and related resources is available in
the Embedded Group on ARM Connected Community.
Cortex-M Series Characteristics
Energy efficiency:
Power efficient 32-bit processors
Support for sleep modes
Low power design with further optimization packs available
Low power consumption enables longer battery life
Instructions to support sleep modes

Ease of use:
Program in C/C++, easy software reuse
Wide range of tools available
Standardize software framework (Co
Free DSP library

High performance:
Leading MCU performance
Instructions for bit manipulation
Low interrupt latency
Powerful DSP extensions and optional hardware Floating Point Unit

Feature rich:
Powerful Interrupt Control with NVIC
OS support features
Memory Protection Unit (MPU)
Comprehensive debug and reliability

Ecosystem:
Largest ecosystem in the industry
Several thousands of MCU catalog parts
Wide range of development suites
Wide range of middleware & RTOS

Reduced system size:
Low gate count
High code density reduces memory size
Smaller area reduces die cost
Smaller area reduces chip package size

Industry Standard
ARM Cortex-M processors are the most popular choice for embedded
applications, having been licensed to over 175 ARM partners and benefiting
from the widest third-party tools, RTOS and middleware support of any
architecture. Using a standard processor within a design allows ARM
partners to create devices with a consistent base while enabling them to
focus on creating superior device implementations.