A Programmable Revolution

A Compelling Alternative to Low Cost FPGAs

Introduction

FPGAs and CPLDs are used in many industries, covering a broad range of performance requirements, price points and power envelopes.

In the early days, FPGAs were used for prototyping ASICs and for high-end, low volume applications that could bear a high unit cost, such as the communications and defence sectors. Since then, FPGA vendors have driven down costs and power through rapid process migration, producing new lower cost and lower power device families to address new requirements.

Evolution

In many cases low end FPGA families are now considered for power and cost sensitive consumer and industrial applications. These sectors benefit sufficiently from the flexibility and time to market advantages offered by FPGAs to warrant the price premium of programmability.

From 2009 onwards, new entrants to the programmable silicon market are starting to win the hearts and minds of designers looking for the best possible mix of solution flexibility, price and performance. Some new players, such as SiliconBlue, Achronix and Tabula, have come up with new FPGA architectures. In parallel, other vendors such as Actel and Cypress have integrated FPGA fabrics with programmable analogue blocks and microcontrollers.

All of these efforts represent an evolution of the same FPGA concept.

Revolution: Now for the first time there is an all-digital flexible solution that will prove to be a better, cheaper, easier and lower power solution than an FPGA for many applications: XMOS.

XMOS XS1-L

The XMOS XS1-L family of devices is based on the XMOS XCore® processor, a 500 MIPS event-driven RISC processor with 100% deterministic operation, a 32x32 multiplier, programmable I/O and a host of other resources, all programmable entirely in C++, C and XC. XC includes extensions to C for concurrency, communications, and timed I/O operations.

XMOS devices can be used as a direct substitute for low cost SRAM based FPGAs. In other cases, they provide suitable replacements for some of the higher performance flash-based FPGAs.

Figure 1 shows the XS1-L family with respect to the price, capability (capacity and performance) and power consumption of various FPGA device families. For a large number of digital processing applications, the XS1-L outperforms Altera Cyclone III, Xilinx Spartan 3A FPGAs and equivalents on both price and power consumption.

Figure 1: XS1-L compared to popular FPGA families

A single core XS1-L device offers a capacity for general digital logic implementation roughly comparable to an FPGA having 7-20K logic elements (roughly 70K-200K ASIC gates).

2010-05-25 © 2010 XMOS Ltd www.xmos.com



There will always be a place for micropower programmable devices and for high-end, DSP and bandwidth intensive FPGAs such as Virtex or Stratix parts. For applications residing in the space in between, however, XMOS can improve development speed and lower costs and power consumption without compromising solution flexibility and programmability.

In addition, the XS1-L provides robust IP protection otherwise found only in flash-based FPGAs, whilst retaining performance much closer to that of an SRAM FPGA.

The rest of this paper describes how XMOS technology delivers a revolution in both the programmable silicon itself and the associated hardware design processes.

The XCore Processor

Instead of writing code in HDL to describe registers, gates and wires, designers who use XMOS technology write code in C, C++ or XC to implement deterministic processing functions, as shown in Figure 2.

Figure 2: Designing with the XCore

Parallelism

An XCore processor runs multiple real-time hardware threads simultaneously. Each thread has access to a dedicated set of general purpose registers, gets a guaranteed share of the processing power, and executes a program using common RISC-style instructions. Each thread can execute simple computational code, DSP code, control software (taking logic decisions, or executing a state machine) or handle I/O operations using intelligent I/O resources.

The eight hardware threads, generous MIPS, 100% deterministic architecture and intelligent I/O provide designers with the flexibility of HDL, while dramatically easing the design entry and verification tasks.

Threads, Memory and Channels

Threads can use channels to provide buffered, event-based communication between threads, allowing data exchange and synchronization using single cycle instructions. Alternatively, threads can share 64KB of on-chip SRAM memory to exchange data, using single cycle lock instructions to co-ordinate access.

This makes the implementation of lightweight protocol stacks that fit within the 64KB of memory (such as the microIP TCP/IP stack) essentially free when compared to an equivalent implementation in an FPGA. The FPGA version requires a soft core such as Xilinx's MicroBlaze and an external memory interface, which together would consume a large portion of the FPGA capacity, not to mention adding an external memory chip to the BOM cost.

Task            XMOS approach                             FPGA approach
Design Capture  High level, parallel C/XC code            HDL entry: always @(posedge clock)
Resources       Instructions, threads, channels, timers   Gates, LUTs, routing
DSP             Threads, 32x32 MAC                        HDL entry, embedded block multipliers

Table 1: FPGA Design Concepts and XMOS Equivalents

Time

Each XCore has ten configurable timers, which can be directly instantiated in XC and used to control program execution or I/O operations with a nominal resolution of 10ns.

I/O and Interfacing

Each XCore provides up to 64 GPIO that can be set and sampled in a single instruction via intelligent, autonomous I/O resources called Ports. Simple input and output instructions transfer data to or from I/O ports, as shown in Figure 3. More complex use of ports allows data to be serialized and de-serialized, enabling the processor to keep up with high-speed data streams. The ports can timestamp data, synchronize transfers with an external or internal clock, and schedule data to be input or output at specific times.
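The timed behaviour described above (timers with a nominal 10ns resolution, ports that schedule I/O at specific times) comes down to simple tick arithmetic. The following is a minimal sketch in plain C rather than XC. The helper names are hypothetical, and it assumes that one tick equals 10ns and that the timer is a free-running 32-bit counter that wraps around; neither the counter width nor these functions come from this paper.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: one timer tick per 10 ns, as implied by the
   "nominal resolution of 10ns" quoted in the Time section. */
#define TICK_NS 10u

/* Convert a delay in microseconds to timer ticks (hypothetical helper). */
static uint32_t us_to_ticks(uint32_t us)
{
    /* 1 us = 1000 ns = 100 ticks; reject inputs that would overflow 32 bits. */
    assert(us <= UINT32_MAX / (1000u / TICK_NS));
    return us * (1000u / TICK_NS);
}

/* Wrap-safe test for "has `now` reached `deadline`?".
   Assumption: the timer is a free-running 32-bit counter that wraps,
   so the comparison is done on the signed difference. */
static int deadline_reached(uint32_t now, uint32_t deadline)
{
    return (int32_t)(now - deadline) >= 0;
}
```

Under these assumptions a 250 microsecond delay is 25000 ticks, and the signed-difference comparison keeps working when the counter wraps between setting the deadline and checking it.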

XMOS, the XMOS logo and XCore are trademarks of XMOS Ltd
All other trademarks are the property of their respective owners.

    #include <xs1.h>

    out buffered port:1 outP = XS1_PORT_1B;  // 1-bit output port
    in buffered port:4 inP = XS1_PORT_4A;    // 4-bit input port
    clock ref = XS1_CLKBLK_REF;              // reference clock block

    int main(void) {
      int value;
      // Clock both ports from the reference clock; the output starts at 0
      configure_out_port_no_ready(outP, ref, 0);
      configure_in_port_no_ready(inP, ref);
      while (1) {
        inP :> value;      // sample the input port
        if (value > 9)
          outP <: 1;       // drive the output high
        else
          outP <: 0;       // drive the output low
      }
    }

Figure 3: XMOS Ports Use Example

Clock Blocks are used to select the internal XCore system clock, the timer reference clock, or an external clock connected via a 1-bit port to clock a given port. Clock blocks sample incoming external clocks and then provide a variety of conditioning options (for example, delaying the clock relative to the data associated with it).

Task             XMOS approach   FPGA approach
I/O Interfacing  Ports, timers   HDL entry
Clocking         Clock blocks    Clock Management Units

Table 2: XMOS and FPGA I/O Concepts

Event-Driven Processing

The XCore processor is event-driven. Threads waiting for events do not consume any processing resources. An event can be the completion of a communication or I/O operation, the release of a lock, or a timer reaching a programmed time. Threads can wait for any one of a set of events; the first event causes the thread to start in a single instruction.

The XS1-L XCore provides an Active Energy Conservation mode in which it automatically and instantly slows the XCore clock down to a user-specified speed whenever all threads are paused. The clock returns to its normal speed as soon as any thread has new work to do.

Selecting your programmable solution

Table 3 lists a range of application function examples and compares the utilization of XCore resources and FPGA logic elements required to implement the function.

                            XS1-L                            FPGA          ASIC
Function                    Threads  MIPS  Memory  GPIO      Logic Cells   Nand2 Gates
USB2 + 2EP                  5        400   30794   12        4400          44000
Ethernet MAC+MII            5        250   9982    14        3600          36000
TCP/IP (uip)                1        50    40000   0         6100 (1)      61000
S/PDIF                      2        100   5036    2         800           8000
I2C Master                  0.5      50    3044    2         700           7000
SDRAM Interface (D8, A14)   1        100   2974    30        1100          11000

Table 3: Application Function Examples

(1) Assumes a NIOS II and external memory interface are required for TCP/IP running in a Cyclone III device.

IP Protection

Each XCore has 8KB of secure one time programmable (OTP) memory, secure execution mode, the ability to load AES encrypted firmware, and the option to disable JTAG and external channel access to a secured XCore. This all adds up to a level of IP protection that cannot be matched by an SRAM FPGA.

Applications requiring robust IP protection are often forced to use a slower but more secure flash-based FPGA, which can lead to timing closure issues. XMOS XS1-L devices offer a way to meet security and performance requirements with minimal effort.
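The XS1-L columns of Table 3 can be used for a first-order check of whether a set of functions fits in one XCore. The sketch below is plain C and only illustrative: the struct and function names are hypothetical, the Memory column is assumed to be in bytes, and simply summing the columns against the budget quoted in this paper (eight threads, 500 MIPS, 64KB of SRAM) is an assumption about how the figures combine, not a method stated by the paper.

```c
#include <assert.h>
#include <stddef.h>

/* One row of Table 3, XS1-L columns only (hypothetical type). */
struct function_cost {
    const char *name;
    double threads;    /* hardware threads used (may be fractional) */
    unsigned mips;     /* MIPS consumed */
    unsigned memory;   /* memory footprint; assumed to be bytes */
};

/* Single-XCore budget as quoted in this paper. */
#define XCORE_THREADS 8.0
#define XCORE_MIPS    500u
#define XCORE_MEMORY  (64u * 1024u)

/* Return 1 if the summed costs fit in one XCore, otherwise 0. */
static int fits_in_one_xcore(const struct function_cost *f, size_t n)
{
    double threads = 0.0;
    unsigned mips = 0, memory = 0;

    assert(f != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        threads += f[i].threads;
        mips    += f[i].mips;
        memory  += f[i].memory;
    }
    return threads <= XCORE_THREADS
        && mips <= XCORE_MIPS
        && memory <= XCORE_MEMORY;
}
```

For instance, Ethernet MAC+MII, TCP/IP (uip) and I2C Master sum to 6.5 threads, 350 MIPS and 53026 memory units, which passes this check. USB2 + 2EP together with Ethernet MAC+MII needs 10 threads and does not.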


DSP

XS1-L devices offer easily accessible DSP functionality via their 500 MHz 32x32 multiplier, offering a sustained rate (including load/store operations) of 59 MMACS per XCore (119 MMACS peak), which is sufficient for many audio, signal control and lower end DSP tasks that need low cost and power per MMAC.

The low cost FPGA families such as Altera Cyclone III, on the other hand, offer tens or hundreds of embedded block multipliers, which can be ganged together to create multipliers of arbitrary width. When many of these are employed in parallel, an aggregate DSP processing capability can be built up far in excess of what the XS1-L can achieve.

Consequently the FPGA provides a significant advantage for high throughput image and video processing or telecommunications infrastructure processing. For many emerging applications (such as consumer and prosumer digital audio), however, moderate DSP needs are just one item on the list of requirements alongside flexible control, low cost and integration. For these types of applications XMOS is likely to offer the ideal solution, all programmable in a high-level language.

Solution Scaling

An application that does not fit in a single XCore may be easily spread across multiple cores by selecting the two-core XS1-L2 device. Alternatively, multiple XMOS devices can be connected together by asynchronous off-chip links that unify multiple XS1 processors into a single network, with communication mediated by channels.

High I/O Capability

For applications that require many hundreds of I/Os, a low cost FPGA is likely to be a preferable choice. Likewise, for very high speed native I/O capabilities such as LVDS, gigabit SERDES transceivers, SSTL2 or other exotic I/O technology, choose an FPGA. However, a large majority of applications are well served with single ended 3.3V I/O, making large amounts of high speed I/O an expensive and unneeded feature.

Soft Processors

For FPGA designs that need to employ a soft processor to implement a protocol stack, the issue becomes the amount of code memory required. For many simple protocol stacks, such as TCP/IP for simple web-servers and various I/O related standard and proprietary protocols, the 64KB of internal SRAM on the XCore is sufficient.

In these cases the XS1-L is the cost-effective choice. To achieve the above in an FPGA would require either:

• A gate hungry soft processor core and external memory interface plus external memory chip, all of which adds a sizeable penalty in device capacity, power consumption, I/O, BOM cost and board space.

• A soft processor core with additional logic cells used to implement a small code memory on the FPGA.

Many soft processor implementations may also find it impossible to achieve the clock speed required to meet processing requirements, leaving the designer to look for a product that integrates hardened 32-bit RISC cores with a suitable programmable fabric.

For applications that have code footprints well in excess of 64KB, an FPGA with external memory may be the only option.

Figure 4: Costs associated with Soft Core Usage in FPGAs
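To put the sustained 59 MMACS figure from the DSP section into audio terms, the sketch below works out how many multiply-accumulates are available per sample period. It is plain C with hypothetical helper names, and it assumes that a FIR filter costs one MAC per tap per channel per sample and that nothing else competes for the processor; it is a back-of-envelope estimate, not a benchmark.

```c
#include <assert.h>
#include <stdint.h>

/* Sustained multiply-accumulate rate per XCore quoted in the DSP section. */
#define SUSTAINED_MACS_PER_SEC 59000000u

/* MACs available in one sample period at the given sample rate. */
static uint32_t macs_per_sample(uint32_t sample_rate_hz)
{
    assert(sample_rate_hz > 0);
    return SUSTAINED_MACS_PER_SEC / sample_rate_hz;
}

/* Largest FIR length per channel that fits the budget, assuming
   one MAC per tap per channel per sample and no other workload. */
static uint32_t max_fir_taps(uint32_t sample_rate_hz, uint32_t channels)
{
    assert(channels > 0);
    return macs_per_sample(sample_rate_hz) / channels;
}
```

Under these assumptions a 48 kHz stream leaves about 1229 MACs per sample, or roughly 614 FIR taps per channel for stereo, which is consistent with the paper's positioning of the XS1-L for audio and lower end DSP tasks rather than video-rate processing.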


Design Flow

Figure 5 compares the standard FPGA design flow to the XMOS design flow. Overall, the XMOS design flow offers dramatically shorter iteration times and more straightforward design entry than the traditional FPGA flow.

Figure 5: XMOS and FPGA Design Flows Compared

Design Entry

Design entry is C++, C or XC using either the XDE graphical development environment or your favorite text editor. The XDE offers syntax highlighting and indenting, and the ability to compile, launch simulations and debug.

Design in a High Level Language

EDA vendors have expended significant efforts to bring the advantages of high level languages to FPGA design, and still have a long way to go to deliver practical hardware design flows using C and high level synthesis.

Designers using XMOS technology, on the other hand, immediately reap the productivity benefits of coding in a high level language, yet avoid the pitfalls of high level synthesis.

Ultra Fast Compilation

Even large XMOS programs compile and link in seconds, compared to the minutes or even hours required to complete a typical iteration of FPGA synthesis and place and route.

Application Timing Closure

The XS1-L implements parallelism using its instruction set and native resources, all of which reliably run at 500 MHz. Designers using XMOS have no need to check register to register timing paths across multiple design corners.

One of the most powerful attractions of the XMOS approach for FPGA designers is the ability to statically time paths through application code using the XMOS Timing Analyzer, which times critical application paths rather than register-to-register paths.

The Timing Analyzer achieves 100% coverage of enumerated constraints, unlike test-bench based simulation. For example, the Timing Analyzer can calculate the time in XCore cycles from a thread sampling a specific pattern on an input port to outputting a response on an output port. The result can be graphically displayed, highlighting the critical path through the code and automatically signing off against user specified timing constraints expressed as pragmas in the code or entered using the XTA GUI.

For FPGA designers to access similar functionality they must deploy property checkers and formal proof methods, which rapidly reach their limits on even moderately sized designs, and require specialist design knowledge to apply.

The Timing Analyzer offers a whole-application level timing capability that does not rely on time consuming dynamic simulation, which will be appreciated by software and hardware engineers alike.

Simulation

Designers have the option to run XCore simulations of their code, visualizing the results with the XMOS VCD waveform viewer and debugging and single stepping with the debugger, all built into the XDE graphical environment.

The signals displayed in the VCD viewer are a range of actual signals that exist within the XS1-L silicon, including program counters, port resource signals, timers, channels and thread status.

These simulations run an order of magnitude faster than a corresponding dynamic simulation in an event-driven HDL simulator. XSIM also provides a range of simple testbench plug-ins and an API for users to create more of their own.

Bitstream Generation

After the design is ready, firmware for downloading to configuration flash memories is easily generated with XFLASH, which includes provision for multiple boot images and Dynamic Field Upgrade (DFU).

XBURN can be used to burn parts of the code image and selected user encryption keys to the 8KB of OTP on chip, or just to set security options such as disabling JTAG debug access.

In System Debug

XMOS offers a typical processor debugging environment using XGDB (built on top of gdb, the GNU Debugger) and the XS1-L JTAG interface.

Debug iterations with XMOS tools only require a recompile and regeneration of firmware. FPGA designers must pre-select the nodes they wish to view and iterate through synthesis, place and route and timing analysis for each debug iteration.

PCB Design considerations

XMOS offers its processors in QFP, QFN and BGA packages, suitable for 2 layer and 4 layer PCB implementations.

In addition, the XS1-L parts require only two voltage supplies: a 3.3V or 2.5V supply for the I/O, and a 1V core voltage.

The various port/pin configurations that can be realized with the XS1-L also offer some late pin assignment flexibility, although not to the same fine degree offered by FPGAs.

Toolchain Simplicity and Platform Support

Full FPGA design tool chains from the FPGA vendors and/or third party EDA suppliers run to multiple gigabytes of data.

The XMOS tools typically only require about 200 megabytes and work out of the box on Windows, Linux and Mac platforms, allowing you to develop your applications on desktop PCs or notebooks.

Summary
XMOS offers a lower cost and more secure platform, with dramatically better time-to-market, than traditional SRAM and flash based FPGAs for programmable digital logic designs in the 70K to 400K gate range.
