FPGA Optimization (Optional)

Optimization techniques are important in FPGA programming. There are two main reasons for optimizing FPGA code. One reason is to optimize the code for speed. This is important when using high-frequency loop rates. Another reason to optimize FPGA code is for space utilization on the FPGA. Optimizing for size is important when FPGA code is extremely complex and is too large to fit on the FPGA.

Topics

A. Optimization Techniques
B. Benchmarking FPGA VIs
C. Basic Optimization Techniques
D. Architecture Optimizations
E. Advanced Optimizations

© National Instruments Corporation
CompactRIO and LabVIEW Fundamentals Course Manual

Lesson 9  FPGA Optimization (Optional)

A. Optimization Techniques

FPGA VIs are limited primarily in two areas:
• Speed: Relates to the actual execution speed of the FPGA gates. If the code does not execute at the desired loop rate, then you must modify the application so that the loop rate can be increased. These modifications can change the way you create and design the FPGA VI.
• Size: Relates to the actual amount of space the application uses on the FPGA. If the application requires more hardware components than are physically available on the FPGA, the compile fails and the application cannot run on the FPGA.

You can use several optimization techniques to turn a non-functioning or poorly functioning application into a workable solution. Table 9-1 lists some of the main optimization techniques and how they affect FPGA speed and size. These techniques work for CompactRIO targets as well as other National Instruments FPGA targets.
Table 9-1. Optimization Techniques

Optimization Technique | FPGA Speed | FPGA Size
Limit front panel objects
Use small data types
Avoid large VIs
Use non-reentrant subVIs
Use reentrant subVIs
Use parallel operations
Use single-cycle Timed Loops
Use pipelining
Use appropriate arbitration
The three levels of optimization techniques include:
• Basic optimization techniques that do not require major code changes
• Architecture optimizations that require modifications to code structure
• Advanced techniques that require detailed knowledge of FPGA architecture


B. Benchmarking FPGA VIs

To determine whether a change has affected the FPGA VI, you need a way to measure the execution speed and size requirements.

Loop Rate Benchmarks

Using tick counts is the best way to calculate loop rates, where one tick equals one clock cycle of the timebase (default 40 MHz). The simplest way to measure the number of ticks between one iteration and the next is to use the Tick Count VI with a shift register.

Place the Tick Count VI in the desired loop and wire its output to a shift register. Then compare the prior iteration's tick count to the current iteration's tick count by subtracting the prior value from the current value. On the output of the Subtract function, wire an indicator to see how much time elapsed between iterations.

Although this technique uses some FPGA space, it is typically used only during the development phase and deleted later. Because an FPGA runs the code in true parallel, adding the additional code has no significant effect on timing.
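
The tick arithmetic described above can be sketched on the host side in Python. This is only an illustration, not LabVIEW code: the function names are hypothetical, and the rollover handling assumes the tick counter is a free-running unsigned counter of a known bit width.

```python
CLOCK_HZ = 40_000_000  # default FPGA timebase stated in the text (40 MHz)


def elapsed_ticks(prev_tick: int, curr_tick: int, counter_bits: int = 32) -> int:
    """Ticks between two readings of a free-running counter.

    The modulo handles the case where the counter rolled over between
    the two readings (assumption: unsigned counter of `counter_bits` bits).
    """
    return (curr_tick - prev_tick) % (1 << counter_bits)


def loop_rate_hz(ticks: int, clock_hz: int = CLOCK_HZ) -> float:
    """Loop rate in Hz for a loop that takes `ticks` clock cycles per iteration."""
    return clock_hz / ticks


def loop_period_us(ticks: int, clock_hz: int = CLOCK_HZ) -> float:
    """Loop period in microseconds for the same measurement."""
    return ticks * 1_000_000 / clock_hz


if __name__ == "__main__":
    ticks = elapsed_ticks(100, 500)
    print(ticks, loop_rate_hz(ticks), loop_period_us(ticks))
```

A smaller counter width rolls over sooner, so the measured interval must stay below one full counter period for the subtraction to remain meaningful.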

Compile Report Benchmarks

After compiling an FPGA VI, LabVIEW displays a Compile Report with important information about the overall size and speed of the application.

VI Size

The Device Utilization Summary of the Successful Compile Report provides information about the number of SLICEs used. This metric is the most important measure of the size of the program in hardware. The design process should be iterative. During the design process, notice the increase in SLICEs used as the program gets larger.

VI Speed

The Successful Compile Report dialog box also contains information about the clock rates of the application.
• Requested Rate: Displays the clock rate at which the compiled FPGA VI runs. The default setting is 40 MHz.
• Theoretical Maximum: Displays the theoretical maximum compile rate for the FPGA VI. A Theoretical Maximum below the Requested Rate is a common problem when you use single-cycle Timed Loops.


If the Theoretical Maximum is slower than the Requested Rate, the compiler reports an error and the compile process stops. You must modify the application until the Theoretical Maximum is equal to or greater than the Requested Rate.

C. Basic Optimization Techniques

Basic FPGA optimization techniques are typically easy to implement, require no major changes in the code architecture, and often are FPGA programming best practices. The basic optimizations covered in this section primarily affect FPGA size.

Limit Front Panel Objects

Front panel objects consume a significant amount of space on the FPGA. Not only is space required to store the data itself, but a considerable amount of FPGA logic is required to implement the communication between the front panel object and the host VI.

When you transfer data across the bus between the FPGA and the RT host, it must be broken down into 32-bit packets. This limitation is due to the number of data transfer lines on the bus. Therefore, if you have a front panel element that uses more than 32 bits, that front panel element must be broken down into several 32-bit chunks and passed along the bus piecewise. This breaking down of the data reduces overall transfer speeds.

Use the following guidelines to optimize the FPGA code:
• Limit the use of arrays
• Use the bitpacking technique as an alternative to small arrays
• Use FIFO structures for data transfer

These techniques are necessary only for front panel objects used in the top-level FPGA VI; front panel objects in subVIs do not consume space on the FPGA.

Avoid Front Panel Arrays

Arrays and clusters that are greater than 32 bits in size require an extra copy on the FPGA to guarantee that all the data is read, and they consume a significant amount of space on the FPGA. Array elements are transferred to and from the FPGA individually, so limiting the use of arrays decreases the logic cells required to implement the data transfer.


If you use an array control on the front panel, you must limit the array size. You can select the size of the array by right-clicking the control and choosing the appropriate user menu selection. Control data is stored in the RAM of the FPGA. Because the RAM size is hardware limited, you must make sure that the array size in bytes does not exceed the RAM size of the hardware in use. A 1M gate FPGA has 81,920 bytes of RAM. A 3M gate FPGA has 196,608 bytes of RAM. If the size of the array is larger than the available RAM, the compile will fail.
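
The size check described above amounts to a multiplication and a comparison. The sketch below uses the two RAM figures quoted in the text; the function names and the element-size table are illustrative assumptions, and the check deliberately ignores any other RAM consumers or extra copies the compiler may create.

```python
# RAM sizes quoted in the text, in bytes.
FPGA_RAM_BYTES = {"1M gate": 81_920, "3M gate": 196_608}

# Bytes per element for common integer representations (assumed mapping).
ELEMENT_BYTES = {"I8": 1, "U8": 1, "I16": 2, "U16": 2, "I32": 4, "U32": 4}


def array_bytes(num_elements: int, dtype: str) -> int:
    """Raw storage needed for an array of `num_elements` values of type `dtype`."""
    return num_elements * ELEMENT_BYTES[dtype]


def fits_in_ram(num_elements: int, dtype: str, fpga: str) -> bool:
    """True if the raw array size does not exceed the RAM of the given FPGA."""
    return array_bytes(num_elements, dtype) <= FPGA_RAM_BYTES[fpga]


if __name__ == "__main__":
    # A hypothetical 30,000-element I32 array needs 120,000 bytes.
    print(array_bytes(30_000, "I32"))
    print(fits_in_ram(30_000, "I32", "1M gate"))  # False: exceeds 81,920 bytes
    print(fits_in_ram(30_000, "I32", "3M gate"))  # True: within 196,608 bytes
```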
The best way to transfer large sets of data from the FPGA to the host is with DMA (Direct Memory Access) FIFOs. Refer to Lesson 8, Data Transfer and Synchronization, for more information about DMA FIFOs.

Arrays consume significant space on the FPGA because each bit in the array uses a flip-flop on the FPGA.

When you wire an array as an input to an FPGA VI or function, the FPGA compiler creates the equivalent of a For Loop to process each element of the array in sequence. If you wire a cluster as an input to an FPGA VI or function, the FPGA compiler creates parallel logic for each element of the cluster. The relationship between arrays and clusters is recursive, such that if you wire a cluster of arrays as an input, the arrays are processed in parallel and the array elements are processed sequentially.

Note  Performing operations on arrays can limit the maximum top-level or derived clock rate. To maximize the FPGA clock rate, process single data points instead of arrays.

One of the most common uses of array controls and constants in the FPGA VI is to store large sets of waveform information. You can replace large front panel array controls with the Look-Up Table (LUT) Express VI. Look-up tables provide a general-purpose block of initialized memory, and should be used instead of arrays to store waveforms for signal generation, modeling nonlinear systems, or arithmetic computations.

Bitpack Boolean Logic

Bitpacking is a technique that combines many small pieces of data into a larger piece of data. This technique works because the communication between the FPGA target and real-time host is in 32-bit words. For example, you can store four 8-bit integers as one 32-bit integer. Instead of four separate buses to transfer the elements of data, only one bus is required.
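
As a rough illustration of the four-into-one example, the following Python sketch packs four unsigned 8-bit values into one 32-bit word and unpacks them again. The function names and the byte order (first value in the least significant byte) are assumptions made for the example, not something the manual specifies.

```python
def pack_u8x4(b0: int, b1: int, b2: int, b3: int) -> int:
    """Pack four unsigned 8-bit values into one 32-bit word (b0 in the low byte)."""
    for b in (b0, b1, b2, b3):
        if not 0 <= b <= 0xFF:
            raise ValueError(f"{b} does not fit in 8 bits")
    return b0 | (b1 << 8) | (b2 << 16) | (b3 << 24)


def unpack_u8x4(word: int) -> tuple:
    """Recover the four 8-bit values from a word produced by pack_u8x4."""
    return tuple((word >> shift) & 0xFF for shift in (0, 8, 16, 24))


if __name__ == "__main__":
    word = pack_u8x4(0x11, 0x22, 0x33, 0x44)
    print(hex(word))
    print(unpack_u8x4(word))
```

The host and the FPGA must agree on the byte order, because the packed word carries no information about which byte holds which value.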


Another example of bitpacking is to convert Boolean controls to a Boolean array. For example, you can convert eight Boolean controls into a single 8-bit unsigned integer, then use a Number to Boolean Array function to index it into individual components, as shown in Figure 9-1.
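
The same idea can be sketched in Python for eight Boolean values. The helper names are hypothetical, and the sketch assumes element 0 of the Boolean array corresponds to the least significant bit of the integer.

```python
def bools_to_u8(flags) -> int:
    """Combine eight Boolean values into one unsigned 8-bit integer.

    flags[0] becomes bit 0 (the least significant bit).
    """
    if len(flags) != 8:
        raise ValueError("exactly eight Boolean values are required")
    value = 0
    for bit, flag in enumerate(flags):
        if flag:
            value |= 1 << bit
    return value


def u8_to_bools(value: int) -> list:
    """Split an unsigned 8-bit integer back into eight Boolean values."""
    return [bool((value >> bit) & 1) for bit in range(8)]


if __name__ == "__main__":
    flags = [True, False, True, False, False, False, False, True]
    packed = bools_to_u8(flags)
    print(packed, u8_to_bools(packed) == flags)
```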
Figure 9-1. Convert Boolean Controls to a Boolean Array

Global Variables

You can use global variables scoped to the FPGA to optimize code for size. Using global variables in subVIs is especially effective if other VIs need to access the data. If a piece of data is not global, you must use logic to route it to other VIs, and each VI will require memory to store the data. To add a global variable to your project, select Global Variable from the Structures palette on the block diagram. Then link the global variable to the control or indicator on your front panel. You can also replace controls with constants to further reduce FPGA usage.

Use Small Data Types

Always select the smallest data type possible when using controls and constants on the FPGA. For example, the default data type of the Index Array function is a 32-bit integer. However, if your array will have no more than 256 elements, represent the index as an unsigned 8-bit integer.

When using a Case structure, only very rarely would you need more than 256 cases. If you have fewer than 256 elements in your Case structure, then use an 8-bit integer wired to the selector terminal instead of the default 32-bit integer value.

Similarly, if you are using the Loop Timer VI, Tick Count VI, or Wait VI, configure them to use the smallest Size of Internal Counter possible.
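
The rule of thumb above (256 elements fit an 8-bit index) generalizes to other widths. The helper below is a hypothetical sketch that picks the narrowest of the 8-, 16-, and 32-bit unsigned widths able to index a given number of elements.

```python
def smallest_unsigned_bits(element_count: int) -> int:
    """Narrowest unsigned width (8, 16, or 32 bits) that can index `element_count` items.

    Indices run from 0 to element_count - 1, so 256 elements still fit in 8 bits.
    """
    if element_count < 1:
        raise ValueError("element_count must be at least 1")
    for bits in (8, 16, 32):
        if element_count <= (1 << bits):
            return bits
    raise ValueError("element_count exceeds the 32-bit index range")


if __name__ == "__main__":
    for count in (200, 256, 257, 5000):
        print(count, smallest_unsigned_bits(count))
```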


Eliminate Coercion Dots

When a coercion dot displays, it means the compiler must determine the proper data type, which can lead to inefficient data conversion. Instead of leaving the coercion to chance, specify the desired data type and insert the appropriate coercion function when you see a coercion dot on your block diagram.

Avoid Large Functions

Some LabVIEW functions consume a significant amount of space on the FPGA, such as the Quotient & Remainder, Rotate 1D Array, and Scale By Power of 2 functions.
• Quotient & Remainder: This function consumes significant space on the FPGA. If you need to divide by a power of two, consider using the Scale By Power of 2 function with a negative constant wired to the n input.
  Often the numerator of the Quotient & Remainder function is tied to the iteration terminal and the remainder keeps a place within an array or look-up table. An alternative is to create code that increments on its own by using an Increment function tied to a shift register and setting the value back to zero when the count exceeds a specified value.
• Rotate 1D Array: If you wire a control to the n input, this function takes time proportional to the number of positions to be rotated, plus two clock cycles of overhead to enter and leave the function. However, if you wire a constant to the n input, this function takes negligible time to execute and consumes no space on the FPGA.
• Scale By Power of 2: If you wire a control to the n input, this function can consume significant space on the FPGA. However, if you wire a constant to the n input, the function consumes no space on the FPGA. By wiring a negative constant to the n input, this function has the same effect as dividing by that power of 2.
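
The equivalence between scaling by a negative power of 2 and integer division can be checked with a short Python sketch. The function below is a software stand-in, not the LabVIEW function itself; it assumes integer inputs, where scaling by 2^n reduces to a bit shift and the result rounds toward negative infinity.

```python
def scale_by_power_of_2(x: int, n: int) -> int:
    """Integer model of x * 2**n: left shift for n >= 0, right shift for n < 0."""
    if n >= 0:
        return x << n
    return x >> -n


if __name__ == "__main__":
    # Dividing by 4 is the same as scaling by 2**-2.
    print(scale_by_power_of_2(40, -2))
    # The shift agrees with floor division for every value tried here.
    print(all(scale_by_power_of_2(x, -k) == x // (1 << k)
              for x in range(-64, 64) for k in range(6)))
```

With a constant n the shift is just a fixed rewiring of bits, which is consistent with the statement that the function consumes no space when n is a constant.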

Note  You can optimize FPGA VIs by using constants instead of controls whenever possible. However, intermediary structures such as connector panes, loop tunnels, and shift registers can prevent LabVIEW from recognizing an input as a constant.


Optimize Comparisons

The Comparison functions require significant space on the FPGA. You might be able to save space on the FPGA by using the Number to Boolean Array function and the Compound Arithmetic function configured with the Or operation, as shown in Figure 9-2.

Figure 9-2. Using Boolean Operations to Optimize Comparisons
The code in the While Loop on the left is standard LabVIEW code that compares a number to 16. The code in the While Loop on the right achieves the same result, but uses approximately half the FPGA resources and executes nearly twice as fast.

If you use this technique, be aware that if you want to adjust the comparison value, you must significantly restructure your code to match the appropriate Boolean logic required to determine the desired results.

Note  It is easiest to use this technique to make comparisons to powers of two.
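
The figure is not reproduced here, so the following Python sketch shows only one plausible reading of the technique: for an unsigned 8-bit value, testing "greater than or equal to 16" is the same as OR-ing together bits 4 through 7. Both the choice of comparison (>=) and the 8-bit width are assumptions made for the example.

```python
def number_to_boolean_array(value: int, bits: int = 8) -> list:
    """Split an unsigned integer into Booleans, least significant bit first."""
    return [bool((value >> i) & 1) for i in range(bits)]


def greater_or_equal_16(value: int) -> bool:
    """True if an unsigned 8-bit value is >= 16, using only an OR of its upper bits."""
    bits = number_to_boolean_array(value)
    # 16 is 2**4, so any set bit at position 4 or above means value >= 16.
    return any(bits[4:])


if __name__ == "__main__":
    print(all(greater_or_equal_16(v) == (v >= 16) for v in range(256)))
```

The bit positions to OR together depend directly on the threshold, which is why changing the comparison value requires restructuring the logic.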

Using Reentrant and Non-Reentrant SubVIs

You can configure a subVI as a single instance shared among multiple callers, also known as a non-reentrant subVI. You also can configure a subVI as reentrant to allow parallel execution. By default, VIs created under an FPGA target are reentrant. To make a subVI non-reentrant, select Execution from the Category pull-down menu of the VI Properties dialog box and remove the checkmark from the Reentrant execution checkbox.

Some applications use shared resources that are accessed by multiple callers, such as functions or subVIs in an FPGA VI. Possible shared resources include digital output lines, analog lines, memory items, FIFOs, the interrupt line, local and global variables, and non-reentrant subVIs. Callers must wait in a queue to gain exclusive access to the resource. By contrast, an unshared resource is a portion of code dedicated to a specific caller.

If you use a non-reentrant subVI in an FPGA VI, only a single copy of the subVI becomes hardware and all callers share the hardware resource. This may decrease the execution speed of the FPGA code. If you use a reentrant subVI in an FPGA VI, each call of the subVI generates a dedicated area of code on the FPGA. For example, if you have five instances of an event counter configured as a reentrant subVI on the block diagram, LabVIEW implements five independent copies of the event counter hardware on the FPGA. However, if your event counter subVI is non-reentrant, LabVIEW implements only one event counter in hardware on the FPGA.

Note  Avoid using shared resources in reentrant subVIs. If you use a shared resource in a reentrant subVI, each instance of the subVI must use arbitration to access the shared resource, which can impede parallel execution.

Table 9-2 summarizes the typical advantages and disadvantages of non-reentrant and reentrant subVIs.

Table 9-2. Reentrant vs. Non-reentrant VIs

VI Type | FPGA Speed | FPGA Utilization
Non-reentrant | Slower. Each call to the subVI waits until the previous call ends. | Can be lower because only one instance of the subVI exists on the FPGA no matter how many times you use it. However, non-reentrant subVIs use FPGA resources for arbitration.
Reentrant (default for FPGA) | Faster. Multiple calls to the same subVI run in parallel. | Can be higher because each instance of the subVI on the block diagram uses space on the FPGA. However, reentrant subVIs do not use FPGA resources for arbitration.


Exercise 9-1  Optimize FPGA VI

Goal

Use basic FPGA optimizations to reduce the application size.

Scenario

You have been given an FPGA VI that is meant to output a tone that is dependent on the temperature input from an NI 9211. Unfortunately, the FPGA VI will not compile in its present state. Use each of the techniques learned so far in the course to optimize the FPGA VI so it can compile and use minimal space on the FPGA.

Design

The FPGA VI generates a sine wave on AO0 of the NI 9263 and changes the frequency of the sine wave by changing the timing between analog output updates.

The sine wave on AO0 creates a tone on the speaker on the Sound and Vibration Signal Simulator when connected to AUDIO IN CH0 with the Speaker turned ON.

The temperature is read by reading the thermocouple values from AI0 on the NI 9211.

Calculate the timing by taking the average of the temperature values and scaling that information using the empirically derived scaling factors to generate a time delay between samples of somewhere between 2000 and 120 ticks of the 40 MHz timebase, which generates an audible frequency between 400 and 6700 Hz.
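
The relationship between the tick delay and the tone frequency can be verified with a few lines of Python. The sketch uses the sine table described later in this exercise (5,000 elements holding 100 sine cycles, so 50 samples per cycle) and assumes the delay between analog output updates is exactly the stated number of ticks, ignoring any other loop overhead.

```python
CLOCK_HZ = 40_000_000       # 40 MHz timebase from the text
TABLE_ELEMENTS = 5000       # size of the sine table used in the exercise
SINE_CYCLES = 100           # sine cycles stored in the table
SAMPLES_PER_CYCLE = TABLE_ELEMENTS // SINE_CYCLES  # 50 samples per sine cycle


def tone_frequency_hz(delay_ticks: int) -> float:
    """Tone frequency when one sample is output every `delay_ticks` clock ticks."""
    update_rate_hz = CLOCK_HZ / delay_ticks
    return update_rate_hz / SAMPLES_PER_CYCLE


if __name__ == "__main__":
    print(tone_frequency_hz(2000))  # slowest update rate, lowest tone
    print(tone_frequency_hz(120))   # fastest update rate, highest tone
```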

Implementation

1. Find the appropriate scaling factors.
   ❑ Open the Temperature Monitor Project.
   ❑ Run the Simple Temperature Monitor RT Host VI.
   ❑ Do not touch the thermocouple.
   ❑ Record the approximate value of the Thermocouple Signal indicator:
     TC_Cold = __________

   ❑ Warm the thermocouple as much as possible by holding the thermocouple in your hand and record the approximate value of the Thermocouple Signal indicator:
     TC_Hot = __________
   ❑ Use the slope-intercept formula (y = mx + b) to find where the line will cross the Y-axis (b).
     m = (120 - 2000) / (TC_Hot - TC_Cold) = __________
   ❑ Plug the numbers back into the equation now that you know m and solve for b.
     2000 = m × TC_Cold + b  →  b = 2000 - m × TC_Cold  →  b = __________
     You will use b as your offset constant later in the exercise.
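
The slope and offset calculation can be sketched in Python as follows. The function name is hypothetical, and the thermocouple readings in the example are invented purely to show the arithmetic; use the values you recorded.

```python
def scaling_factors(tc_cold: float, tc_hot: float,
                    ticks_cold: float = 2000, ticks_hot: float = 120):
    """Return (m, b) of the line ticks = m * reading + b.

    The line passes through (tc_cold, ticks_cold) and (tc_hot, ticks_hot).
    """
    if tc_hot == tc_cold:
        raise ValueError("hot and cold readings must differ")
    m = (ticks_hot - ticks_cold) / (tc_hot - tc_cold)
    b = ticks_cold - m * tc_cold
    return m, b


if __name__ == "__main__":
    # Hypothetical readings, not measured values.
    m, b = scaling_factors(tc_cold=1000, tc_hot=1940)
    print(m, b)
    print(m * 1000 + b, m * 1940 + b)  # reproduces the 2000 and 120 tick endpoints
```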

   ❑ Stop the FPGA VI.
   ❑ Close the FPGA VI and the Temperature Monitor Project. Do not save any changes.

2. Create a new project.
   ❑ Select File»New and create an Empty Project.
   ❑ Right-click the project and select New»Targets and Devices to add the CompactRIO target to the project.
   ❑ Select Existing target or device and expand the Real-Time CompactRIO folder to find your device.
   ❑ Select your device and click OK to add the device.
   ❑ Select LabVIEW FPGA Interface in the Programming Mode dialog box.
   ❑ When prompted, click Discover to discover all modules.
   ❑ Save the project as <Exercises>\CompactRIO Fundamentals\Optimization\Optimization.lvproj.


3. Add the original files to the project.
   ❑ Right-click the FPGA target and select Add»File.
   ❑ Navigate to <Exercises>\CompactRIO Fundamentals\Optimization and add the following VIs to the FPGA target: FPGA Optimization - Original.vi and FPGA Scaling.vi.
   ❑ Save the project.

4. Open the FPGA Optimization - Original VI and look at the block diagram, shown in Figure 9-3.

Figure 9-3. FPGA Optimization - Original VI

Note  In the interest of time, do not compile this VI.

If you compiled this VI, you would see the following error reports:
• The compile report states that the compilation failed.

Figure 9-4. FPGA Optimization - Original VI Compilation Failure

• The Compile Server generates an error and aborts the compilation, which means that the application will not execute on the FPGA.

Figure 9-5. FPGA Optimization - Original VI Compile Server Error

When you review the error, you can see that the VI uses too much memory. You must review the code to see what is taking up too much memory on the FPGA.


Avoid Front Panel Arrays

When using a cRIO-9103 with a 3M gate FPGA, you are limited to less than 200 kB of memory to use for things such as front panel controls and indicators. The front panel of the FPGA Optimization - Original VI includes a Sinewave Array containing 5,000 I32 elements. This array alone uses 200 kB, so you must replace the Sinewave Array (5000 Elements) front panel control with more efficient equivalent code.

The Sinewave Array contains 5,000 elements; however, when you look closer, you can see that it contains 100 sine waves comprised of numbers between -128 and 127.

Because this is a relatively simple waveform, you can easily substitute the Sinewave Array (5000 elements) control with an equivalent Look-Up Table.

1. Replace the Sinewave Array with a Look-Up Table.
   ❑ Add a Look-Up Table 1D Express VI to the block diagram.
   ❑ The configuration dialog box opens. You configure the Look-Up Table to create 5,000 elements of I8 data with 100 cycles of sine wave data.

Figure 9-6. Configure Look-Up Table 1D Dialog Box

   ❑ Set the Number of elements to 5000 and set the Data type to I8.
   ❑ Click Define Table.

Figure 9-7. Define Table Dialog Box

   ❑ Click Define Segments to open the Configure Segment dialog box.

Figure 9-8. Configure Segment Dialog Box

   ❑ Set the Mode to sine wave and the Number of Cycles to 100.
   ❑ Make sure End Address is set to 4999.
   ❑ Click OK.
   ❑ Click OK in the Define Table dialog box to accept the settings.
   ❑ Click OK to close the Configure Look-Up Table 1D dialog box.
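
To see what a table with these settings contains, the Python sketch below generates 5,000 signed 8-bit values holding 100 sine cycles. It is only an approximation of what the Express VI produces: the amplitude of 127 and the rounding rule are assumptions, since the manual does not state how the VI scales the data.

```python
import math


def sine_lookup_table(elements: int = 5000, cycles: int = 100, amplitude: int = 127) -> list:
    """Integer sine table with `cycles` full periods spread over `elements` entries."""
    table = []
    for i in range(elements):
        # Reduce the phase with integer arithmetic so every cycle is identical.
        phase_fraction = ((i * cycles) % elements) / elements
        table.append(round(amplitude * math.sin(2 * math.pi * phase_fraction)))
    return table


if __name__ == "__main__":
    table = sine_lookup_table()
    print(len(table), min(table), max(table))
    print(table[:50] == table[50:100])  # each 50-sample cycle repeats exactly
```

Every value fits in an I8, so the table needs one byte per element instead of the four bytes per element used by the original I32 array.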

2. Wire the block diagram as shown in Figure 9-9.

Figure 9-9. FPGA Optimization VI with Array Deleted

Note  In the interest of time, do not compile this VI.

If you compile this VI, you will see the following report.

Figure 9-10. FPGA Optimization - Original VI Compile Report After Replacing the Array with a Look-Up Table

Note  Due to the nature of the FPGA compilation process, the compile does not always reach the same conclusion, so your results may differ slightly from the figures shown in the Compile Reports for this exercise.


Can you optimize the VI even more?

Eliminate Coercion Dots

Another area that can cause problems is allowing the VI to automatically coerce the values instead of specifying what type of data you would like to have. Small red coercion dots appear on the inputs to the Look-Up Table Express VI and the input to the Mod3/AO0 FPGA I/O Node. You should always explicitly define your data types whenever possible.

1. Set data type representation to reduce LabVIEW coercion.
   ❑ Right-click the terminals where the red coercion dots are located and select Create»Control. This control will not be connected, but you can use it to identify the expected data type for the input terminal.
   ❑ Right-click the wire between the Quotient & Remainder function and the Look-Up Table and select Insert»Numeric Palette»Conversion»U16.
   ❑ Right-click the wire between the Look-Up Table and the FPGA I/O Node and select Insert»Numeric Palette»Conversion»To Fixed-Point. The To Fixed-Point function requires that you wire a value to the fixed-point type input to determine the configuration of the resulting fixed-point value.
   ❑ Copy the fixed-point constant wired to the FPGA I/O Node to the right of the While Loop and wire it to the fixed-point type input of the To Fixed-Point function.
   ❑ Delete the controls you created to identify the expected data type.

   ❑ Right-click the wire between the Quotient & Remainder function and the FPGA Scaling VI and select Insert»Numeric Palette»Conversion»To Fixed-Point.
   ❑ Right-click the fixed-point type input of the To Fixed-Point function and select Create»Constant.
   ❑ Right-click the new constant and select Properties.
   ❑ On the Data Type tab, set the Word length to 26 bits and the Integer word length to 0 bits.
   ❑ Click OK.
   ❑ Save the VI.

2. Wire the block diagram as shown in Figure 9-11.

Figure 9-11. FPGA Optimization VI with Converted Values Block Diagram

Note  In the interest of time, do not compile this VI.

If you compile the VI now, you would see the following Compile Report.

Figure 9-12. FPGA Optimization VI with Converted Values


As you can see, this change reduced the size of our FPGA VI by 24 SLICEs.

Avoid Large Functions

One of the major optimizations that you can use is to replace any division where the denominator is a power of 2 with the Scale By Power of 2 function. While this function takes up minimal space on the FPGA when wired with a constant, it takes significant space if wired to a control.

1. Replace the Quotient & Remainder functions with the Scale By Power of 2 function.

Note  Because you are averaging four samples, you find the sum of the four samples and divide by four; however, dividing by four is the same as multiplying by 2^-2.

   ❑ Delete the Quotient & Remainder function, the associated constant, the To Long Integer function, and the To Fixed-Point function and its associated constant from between the Add VIs and the FPGA Scaling VI.
   ❑ Add a Scale By Power of 2 function to the block diagram.
   ❑ Wire the output of the Add VI to the x input.
   ❑ Create a constant for the n input terminal. Set the constant to a value of -2.
     Avoid using Scale By Power of 2 with a control for the n input. If you open the FPGA Scaling VI block diagram, you see that it uses the Scale By Power of 2 function; however, in the top-level application this is connected to the Slope control.
   ❑ Right-click the Slope control and select Change to Constant.
   ❑ In step 1 you touched the thermocouple to determine your slope. Enter the value into the constant.
     For example, if you had a TC_Cold = -20,000 and a TC_Hot = 20,000, then you would select a Slope = -32,768 to 32,767.
   ❑ Make sure that the Offset Constant is equal to the offset value (b) derived in step 1.
   ❑ Save the VI.


2. Wire the block diagram as shown in Figure 9-13.

Figure 9-13. Replace Quotient & Remainder Functions with the Scale By Power of 2 Function

Note  In the interest of time, do not compile this VI.

If you compile the VI now, you will see the following Compile Report.

Figure 9-14. Successful Compile Report After Replacing Quotient & Remainder Functions with Scale By Power of 2 Functions

As you can see in Figure 9-14, you eliminated another 106 SLICEs, which is almost 1% of the FPGA size.


By eliminating improper uses of the Quotient & Remainder function and by replacing controls connected to the Scale By Power of 2 function with constants, you significantly reduced the size of the FPGA VI.

But can you do better?

3. Modify the block diagram as shown in Figure 9-15.

There is another Quotient & Remainder function in your VI, but it is not dividing by a power of 2. This Quotient & Remainder function is used to increment the index based on the current iteration of the While Loop.

Using a Quotient & Remainder function is an inefficient way to increment values. Instead, you should replace that code with a shift register, an Increment function, a Select function, and a comparison to reset the number back to 0 if it exceeds a specified number.
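
The replacement logic can be modeled in Python to confirm that it produces the same index sequence as the remainder of the iteration count. The function name is hypothetical; the variable carried between calls plays the role of the shift register, and the table size of 5000 matches the Look-Up Table in this exercise.

```python
def next_index(index: int, table_size: int = 5000) -> int:
    """Increment the index and wrap to 0 when it reaches the table size.

    Models Increment -> Equal? -> Select feeding a shift register.
    """
    incremented = index + 1
    return 0 if incremented == table_size else incremented


if __name__ == "__main__":
    index = 0  # initial value of the shift register
    matches = True
    for iteration in range(12_000):
        # The wrapped counter should equal the remainder of iteration / 5000.
        matches = matches and (index == iteration % 5000)
        index = next_index(index)
    print(matches)
```

The counter needs only an adder and an equality test, whereas the remainder requires a full divider, which is where the space saving comes from.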

Figure 9-15. Replace Quotient & Remainder Function with Increment Values Block Diagram

   ❑ Delete the Quotient & Remainder function, associated constant, and Coerce to U16 function.
   ❑ Delete the tunnel on the Sequence structure coming from the iteration counter of the While Loop.
   ❑ Press <Ctrl-B> to delete broken wires.


   ❑ Right-click the address input for the Look-Up Table and create a U16 constant. Set the constant to a value of 5000 (number of elements in the Look-Up Table).
   ❑ Add a Select function, an Equal? function, and an Increment function to the block diagram.
   ❑ Add a shift register to the While Loop.
   ❑ Disconnect the U16 constant from the address input of the Look-Up Table VI.
   ❑ Initialize the shift register to a value of 0 by creating a U16 constant and wiring it as shown in Figure 9-15.
   ❑ Wire the block diagram as shown in Figure 9-15.
   ❑ Verify that there are no coercions of numerical data. If there are, set the associated data type appropriately.
   ❑ Save the VI.

Note  In the interest of time, do not compile this VI.

If you compile the VI now, you would see the following Compile Report.


Figure 9-16. Replace Quotient & Remainder Function with Increment Values Successful Compile Report


As you can see in Figure 9-16, you eliminated another 234 SLICEs, which is approximately 2% of the FPGA size.

By replacing improper uses of the Quotient & Remainder function with a more efficient means of incrementing values, you saved significant space on the FPGA.

At this point you have done most of the basic FPGA optimizations that are possible.

To get an idea of how quickly the VI is now executing, you should add some Tick Count VIs so you can see how long each iteration takes. Use this functionality to diagnose the problem when the behavior of the application is not what you expect.

Benchmark the Execution Speed

In this section, you modify the VI to resemble Figure 9-17.

1. Add Tick Count VIs.

Figure 9-17. Add Tick Counts to the FPGA Optimization VI Block Diagram

   ❑ Add a Tick Count VI to the block diagram.


   ❑ Configure the Tick Count VI as shown in Figure 9-18.

Figure 9-18. Configure Tick Count Dialog Box

   ❑ Create a copy of the Tick Count VI and place it in the initialization sequence.
   ❑ Create a shift register and wire the initialization Tick Count VI to the shift register.
   ❑ Add a Subtract function to find the difference between the current Tick Count and the prior Tick Count.
   ❑ Right-click the output of the Subtract function, create an indicator, and name it Actual Loop Rate (usec).
   ❑ Save the VI.

2. Compile the VI.
   ❑ Click the Run button to begin the compile.


   When compilation completes, you should get a Successful Compile Report similar to the one shown in Figure 9-19.

Figure 9-19. Successful Compile Report

   This code uses more FPGA space than it would if you had not added the tick count functionality; however, because the tick count is used only for testing purposes, you can delete it before you deploy the code.

3. Save the project.

Test

1. Run the FPGA Optimization VI to begin compiling the application.
2. Connect the AO0 BNC cable from the NI 9263 to the AUDIO IN 0 input on the Sound and Vibration Signal Simulator.
3. Set the Distortion to OFF and turn the Speaker switch to ON.
4. After the VI compiles successfully, touch the thermocouple tip and notice how the value of the Wait (Ticks) changes depending upon the temperature.

   Do you hear a changing tone from the speaker on the bottom left of the Sound and Vibration Signal Simulator?

Note  You should not hear a tone at this point.


5. Determine if the speed of the application is appropriate.
   ❑ What is the value of the Actual Loop Rate (usec) indicator?
   ❑ What rate does this represent in Hz?
   ❑ Why is it going so slow?
   ❑ Is the speed of the application limited by the efficiency of the code or by the hardware?

Figure 9-20. Basic FPGA Optimizations Front Panel

Challenge

The FPGA Optimization application made some significant improvements, but could you do even more?

The following list provides examples of possible improvements you could make.

1. Replace the Equal? function in the main-level VI and the In Range and Coerce function in the FPGA Scaling subVI with Boolean logic functions. To do this, you must convert the integer numbers to Boolean arrays and do bitwise logic on the resultant array.
2. Optimize your timebases by using only U8 or U16 numbers instead of U32 numbers.
3. Because the FPGA VI design is really a headless implementation, you could eliminate all controls or indicators from the application.
4. The Look-Up Table creates the same sine wave 100 times. Instead, you could generate one sine wave that is 50 samples long.


5. You can take this one step further by making the sample set 2^n long, so that you can use the more efficient comparison functions to create the increment code.
As you can see in Figure 9-21, using these optimization techniques, you can reduce the code even further and use approximately 1849 SLICES, which is 237 SLICES less than the most optimized format created in the main part of the exercise.
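To see why a 2^n table length simplifies the increment code, compare the two index-wrapping approaches below. This is a sketch in Python rather than LabVIEW, and the function names, table length, and amplitude are assumptions chosen for illustration. With a power-of-two length the wrap needs no magnitude comparison at all; it is one possible reading of the "more efficient" increment the text mentions.

```python
import math


def make_sine_table(n_samples, amplitude=32767):
    """Build one full sine cycle as signed integers (amplitude is an assumption)."""
    return [round(amplitude * math.sin(2 * math.pi * i / n_samples))
            for i in range(n_samples)]


def next_index_compare(index, length):
    """General wrap: works for any length but needs a magnitude comparison."""
    index += 1
    return 0 if index >= length else index


def next_index_mask(index, length):
    """Wrap for power-of-two lengths: add one, then keep only the low bits."""
    if length <= 0 or length & (length - 1) != 0:
        raise ValueError("length must be a power of two")
    return (index + 1) & (length - 1)
```

On an FPGA, the mask amounts to letting a fixed-width counter roll over, which typically costs less logic than comparing against an arbitrary length such as 50.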


Figure 9-21. Successful Compile Report with Challenge Optimizations Implemented

End of Exercise 9-1


D. Architecture Optimizations
There are several architecture-related FPGA optimizations. These are more advanced operations because they are application-specific and also require a design phase prior to implementation.
There are several important concepts to understand before learning more advanced optimization techniques. The most important concept is the enable chain. The enable chain is additional logic added to the FPGA code to guarantee that dataflow on the FPGA is consistent with the LabVIEW dataflow paradigm. The enable chain is a series of flip-flops, also known as registers, that run in parallel with the actual flow of data on the block diagram. A flip-flop holds a bit of data and outputs the data on clock edges.

Dataflow within the FPGA
LabVIEW executes code in a dataflow manner. Nodes execute when data is present on all inputs. When the node finishes execution, the outputs of the node pass data to the next node downstream. Figure 9-22 shows an example of the FPGA hardware required to implement a Boolean operation.

Figure 9-22. A LabVIEW Not VI and the Corresponding FPGA Logic that Implements the Not VI
LabVIEW code is transformed into FPGA logic in three sections: logic, synchronization, and the enable chain. The logic, shown in the upper third of Figure 9-22, corresponds to the actual LabVIEW operation. In this example, the LabVIEW code is a Not function, which corresponds to an inverter in the FPGA hardware. The synchronization register, shown in the middle of Figure 9-22, guarantees that data is output only on rising edges of the clock. The final portion of FPGA code that is generated from the LabVIEW code is the enable chain. The enable chain is an additional register that only outputs on the rising edge of the clock. The enable chain guarantees that the FPGA logic executes in the same order depicted on the block diagram.


Due to the enable chain overhead, each function or VI takes a minimum of one clock cycle. Some functions, such as analog input operations, can take hundreds of clock cycles depending upon the complexity of the operation and hardware limitations. A VI can run only as fast as the sum of the items in a combinatorial path.
One advantage of using an FPGA is that code can run in true parallel to another operation. Because you can create code in parallel, it is often best to design code so that as many parallel operations can take place as possible. This uses the same amount of FPGA space, and can increase the execution speed by reducing the combinatorial path size.

Parallel Operations
Parallel operations are a very powerful concept in current computer architecture. In a standard processor-based configuration, parallel operations are not truly parallel. In processor-based architectures, programs running on the processor are sliced into many fragments and are interleaved with code fragments of other processes. The operating system then decides which processes are the most important and schedules the fragments of code accordingly.
LabVIEW is one of the few programming languages that naturally lends itself to parallel processing because the compiler looks for separate sections of code and creates separate threads as needed. By using the parallel nature of graphical programming and the parallel implementation on the FPGA, you can separate your code into different segments, which can run in parallel and achieve a faster loop rate than when everything is in one process. As you develop your code, start thinking about logical places to create different segments of code.
In LabVIEW FPGA, you can consider separate loops in your top-level VI as separate processes running on their own dedicated processor. Because FPGA hardware allows you to execute code in true parallel, parallel operations on the FPGA are more deterministic and can run at faster loop rates when compared to a processor-based architecture. This is a great benefit for safety-critical applications and control applications.
To create parallel operations in LabVIEW, place multiple, independent While Loops on the block diagram. Figure 9-23 shows parallel loops for acquiring analog data at independent loop rates.


Figure 9-23. Two Parallel Loops with Different Data Sample Rates

You may lose the benefits of parallel operations if you share resources among parallel loops. Memory transfer mechanisms, such as FIFOs, the interrupt line, variables, and non-reentrant VIs affect the ability of the FPGA to execute code in true parallel.
Another advantage of running code in parallel is that it lets some sections of code run faster than other sections. As shown in Figure 9-24, one set of code in a loop can severely limit the speed of another piece of code. In this application the analog output runs 35 times slower than the digital input. This can become particularly critical if the code were arranged such that the digital line corresponded to an emergency stop switch and the recognition of the response had to happen immediately.

Figure 9-24. VI Speed Limited by the Rate of the Slower AO Function


In Figure 9-25 the code is broken into two parallel operations. The top section of code runs at the same analog output-limited rate of approximately 1 MHz. However, the code in the bottom parallel loop can run at a rate of 10 MHz or 10 times faster. Often when code is running too slowly it is necessary to separate the code into parallel operations to prevent unrelated processes from interfering with one another.
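The effect of separating the loops can be sketched with simple arithmetic. The Python below assumes a 40 MHz clock and hypothetical tick counts (40 ticks for the analog output, 4 ticks for the digital input) chosen only to reproduce the approximate 1 MHz and 10 MHz rates quoted above; real tick counts depend on the hardware.

```python
CLOCK_HZ = 40_000_000   # assumed FPGA clock
AO_TICKS = 40           # hypothetical cost of the analog output per iteration
DI_TICKS = 4            # hypothetical cost of the digital input per iteration


def rate_from_ticks(clock_hz, ticks_per_iteration):
    """Loop rate in hertz for a loop that needs the given number of clock ticks."""
    return clock_hz / ticks_per_iteration


# One shared loop: every iteration pays for both operations.
shared_rate = rate_from_ticks(CLOCK_HZ, AO_TICKS + DI_TICKS)

# Two parallel loops: each loop pays only for its own operation.
ao_rate = rate_from_ticks(CLOCK_HZ, AO_TICKS)   # 1 MHz
di_rate = rate_from_ticks(CLOCK_HZ, DI_TICKS)   # 10 MHz
```

In the shared loop the digital input can never run faster than the analog output allows; once separated, its rate depends only on its own tick count.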

Figure 9-25. DIO No Longer Limited by the Slower AO Function

Pipelining Techniques
Another important technique for improving FPGA performance is pipelining. Pipelining breaks up code within a loop so that operations are performed in different cycles of the same loop. The process of pipelining code begins with identifying combinatorial paths in your code. A combinatorial path is a set of logic between the output of one register and the input of another register. Because data in the registers is updated with every rising edge of the clock, if there are too many operations between two registers, a VI compilation may fail due to a timing error. Figure 9-26 shows an example of a combinatorial path inside a single-cycle Timed Loop.

Figure 9-26. VI with no Pipelining


Pipelining shortens the length between the output and input registers of a While Loop so that your VI meets timing requirements. You can use shift registers to run portions of your combinatorial path in different cycles of your loop. Pipelining is especially important in single-cycle Timed Loops where the entire path is required to execute in one clock cycle. Figure 9-27 illustrates a pipelined version of Figure 9-26.

Figure 9-27. Use Pipelining to Eliminate Combinatorial Path

Pipelining increases system latency because the input of a function is based on the output of a previous cycle of the loop. However, the latency disappears when the pipe is full. After only a few loop cycles, pipelined code is significantly more efficient than identical code in a normal loop. Figure 9-28 illustrates latency due to pipelining.
Figure 9-28. Increased Latency Due to Pipelining
After Clock Cycle 1, the output of subVI A is valid and the outputs of subVIs B and C are invalid. After Clock Cycle 2, subVIs A and B have valid output and subVI C has invalid output. After Clock Cycle 3 and all subsequent clock cycles, all output will be valid.
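The fill behavior described for subVIs A, B, and C can be modeled in a few lines of Python. This is only a behavioral sketch: the stage functions are arbitrary stand-ins for the subVIs, and `None` marks an output that is not yet valid.

```python
def run_pipeline(inputs, stages):
    """Simulate stages separated by registers.

    Returns one list per clock cycle holding each stage's output,
    with None where the stage has not yet received valid data.
    """
    regs = [None] * len(stages)
    history = []
    for x in inputs:
        # Each stage consumes what the previous stage latched on the last cycle.
        prev = [x] + regs[:-1]
        regs = [None if v is None else f(v) for f, v in zip(stages, prev)]
        history.append(list(regs))
    return history


# Arbitrary stand-ins for subVIs A, B, and C.
stage_a = lambda v: v + 1
stage_b = lambda v: v * 2
stage_c = lambda v: v - 3

history = run_pipeline([10, 20, 30, 40], [stage_a, stage_b, stage_c])
# Cycle 1: only A is valid; cycle 2: A and B; cycle 3 onward: all three.
```

The first valid output of stage C appears on the third cycle and corresponds to the first input, which is the latency the figure illustrates.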


Feedback Nodes
Feedback Nodes are identical in functionality to a shift register, and are often preferable from a user standpoint because they look similar to the initial code. Figure 9-29 shows an example of using Feedback Nodes instead of shift registers.

Figure 9-29. VI with Feedback Nodes
Feedback Nodes use a value wired to the initializer terminal as the initial value for the first iteration or execution. The Feedback Node then stores the previous iteration result for each subsequent execution. If you do not wire a value to the initializer terminal, the Feedback Node uses the default value for the data type and continues building on previous results in subsequent executions.
You can use a Feedback Node to implement a pipeline and reduce long combinatorial paths. When you use the Feedback Node inside a Case structure, the Feedback Node updates data only on clock cycles when the owning subdiagram executes.
The Feedback Node is implemented as a register and requires logic resources in proportion to the width of the data type. Using the initialization terminal slightly increases logic resource usage.

Drawbacks
When you implement a pipeline, the output of the final step lags behind the input by the number of steps in the pipeline, and the output is invalid for each clock cycle until the pipeline fills. The number of steps in a pipeline is called the pipeline depth, and the latency of a pipeline, measured in clock cycles, corresponds to its depth. For a pipeline of depth N, the result is invalid until the Nth loop iteration, and the output of each valid loop iteration lags behind the input by N-1 iterations.


Single-Cycle Timed Loop
The single-cycle Timed Loop is one of the most powerful constructs in FPGA programming. Code inside the single-cycle Timed Loop is more optimized, takes up less space on the FPGA, and executes faster than identical code in a standard While Loop. The single-cycle Timed Loop removes the enable chain from the loop to save space on the FPGA. Because all registers are removed, all operations in a single-cycle Timed Loop can complete in a single clock cycle. Furthermore, eliminating the enable chain overhead reduces the total space used on the FPGA because the flip-flops used for the enable chain are no longer required. The single-cycle Timed Loop is a great tool for safety-critical and control applications where fast loop rates are important.
Figure 9-30 shows identical code in a standard While Loop and single-cycle Timed Loop. The vertical lines indicate the end of a clock cycle. The code in the While Loop requires four clock cycles to execute, in addition to two clock cycles of loop overhead.

Figure 9-30. Single-Cycle Timed Loop and While Loop Comparison
Because the single-cycle Timed Loop executes in exactly one clock cycle, the clock period must be long enough to allow all the operations to complete in a single cycle. The clock frequency can technically be from 2.5 to 210 MHz; however, the faster the clock frequency, the fewer the computations possible in the single-cycle Timed Loop. It is usually not possible to know prior to compiling if your code will execute in a single-cycle Timed Loop. Some functions and structures, such as analog input I/O, loops, and interrupts, are too slow for a single-cycle Timed Loop and result in a broken Run arrow if you place them in a single-cycle Timed Loop. However, the Run arrow may be solid and compilation still fails because LabVIEW does not know the timing requirements of a chain of commands in the single-cycle Timed Loop until the compile is already run. If the code does not compile and the Run arrow is solid, the only other option to try is pipelining.
You can also use single-cycle Timed Loops to optimize code in your VI, even if you don't intend to immediately reiterate the code inside the loop. Figure 9-31 shows how to use a single-cycle Timed Loop to speed up a portion of code even though it is not meant to iterate more than once. Place as much code as possible inside a single-cycle Timed Loop and then wire a True constant to the loop-termination terminal so the single-cycle Timed Loop executes exactly once. Using the single-cycle Timed Loop removes the enable chain from the portion of the FPGA code inside the single-cycle Timed Loop.

Figure 9-31. Single-Cycle Timed Loop Used to Increase the Speed in a Portion of the Code


Combining Optimizations

Combinatorial Paths
A combinatorial path is the path through logic between the output of a register and the input of another register on an FPGA. A register stores data on an FPGA and updates the data on the rising edge of a clock. Long combinatorial paths take more time to execute and limit the maximum clock rate of the clock domain.
Long combinatorial paths are particularly a problem in single-cycle Timed Loops because the logic between the input register and the output register must execute within one period of the clock rate you specify. In the single-cycle Timed Loop, registers within and between components are removed, increasing the length of the combinatorial path between registers. If the code in a combinatorial path does not execute within a single clock cycle, LabVIEW returns a timing violation in the Compilation Failure dialog box.
Note Deeply nested Case structures also can cause LabVIEW to return a timing violation in the Compilation Failure dialog box.
To reduce the length of a combinatorial path, first simplify the logic as much as possible. Once you have reduced the logic to its simplest form, you can further reduce the length of a combinatorial path by dividing the logic into discrete steps and pipelining your design in the single-cycle Timed Loop.


Exercise 9-2 FPGA Architectural Optimizations

Goal
Use architectural optimizations on an FPGA VI to reduce the application size and increase application speed.

Scenario
You have been given an FPGA VI that outputs a tone dependent on the temperature input from an NI 9211. Although the code is optimized for speed, the timing in the FPGA VI prevents the expected output. Design and modify the application to create an audible and changing tone while still performing the same actions as originally desired. Use your knowledge of architectural optimizations to optimize both the speed and size of the FPGA VI and create the most efficient code possible.

Design
The FPGA VI generates a sine wave on AO0 of the NI 9263 and changes the frequency of the sine wave by changing the timing between analog output updates. The sine wave on AO0 creates a tone on the speaker on the Sound and Vibration Signal Simulator when connected to AUDIO IN CH0 with the Speaker turned ON.
The temperature is read by reading the Thermocouple values from AI0 on the NI 9211.
Calculate the timing by scaling the average temperature values by the empirically derived scaling factors to generate a time delay between samples. The time delay should be between 2,000 and 120 ticks of the 40 MHz timebase.
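The relationship between the time delay and the resulting tone can be worked out directly. The Python sketch below assumes that one sine cycle is 50 samples long, as suggested in the Challenge of Exercise 9-1, and ignores any extra loop overhead; both are assumptions rather than figures stated in this design.

```python
TIMEBASE_HZ = 40_000_000   # 40 MHz timebase from the design
SAMPLES_PER_CYCLE = 50     # assumption: one sine cycle is 50 samples long


def tone_frequency_hz(wait_ticks, samples_per_cycle=SAMPLES_PER_CYCLE):
    """Tone frequency when one sample is output every 'wait_ticks' ticks."""
    update_rate_hz = TIMEBASE_HZ / wait_ticks
    return update_rate_hz / samples_per_cycle


lowest_tone = tone_frequency_hz(2000)   # 400 Hz at the longest delay
highest_tone = tone_frequency_hz(120)   # about 6.7 kHz at the shortest delay
```

Under these assumptions the 2,000-tick delay gives a 400 Hz tone, which agrees with the tone expected from a cool thermocouple in the Testing section.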

Implementation
1. Open the Optimization project.
❑ Open <Exercises>\CompactRIO Fundamentals\Optimization\Optimization Project.lvproj you created in Exercise 9-1.
❑ Open the FPGA Optimization - Original VI.
In Exercise 9-1, the code was optimized for size, but the speed of the application was too slow because the NI 9211 has a maximum read rate of 14 S/s. This caused the entire loop to run at a maximum rate of 14 Hz, which was drastically lower than the desired rate.


Pipelining
You could reduce the combinatorial path by pipelining the thermocouple measurement and taking the average of the prior four iterations. This solution should allow you to use the parallel processing nature of the FPGA to execute the loop slightly faster.
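The pipelined average can be modeled as follows. In this Python sketch (class and method names are illustrative), a four-element history stands in for the expanded shift register, the sum is divided by 4 with a right shift in the spirit of the Scale By Power of 2 function, and the history is assumed to start at zero.

```python
from collections import deque


class FourSampleAverager:
    """Average of the four samples from prior iterations (pipelined by one step)."""

    def __init__(self):
        # Stands in for a shift register expanded to four elements, initialized to 0.
        self._history = deque([0, 0, 0, 0], maxlen=4)

    def step(self, sample):
        # Use only values stored on earlier iterations, then store the new sample.
        average = sum(self._history) >> 2   # divide by 4 via a right shift
        self._history.append(sample)
        return average


averager = FourSampleAverager()
averages = [averager.step(s) for s in (4, 8, 12, 16, 20)]
```

Because the newest sample does not feed the adders in the same iteration, the acquisition and the arithmetic no longer sit on one combinatorial path; the cost is that the average lags the input by one iteration.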
1. Pipeline the block diagram as shown in Figure 9-32.

Figure 9-32. FPGA Optimization Single Loop Pipeline Block Diagram

❑ Disconnect the Mod1/TC0 FPGA I/O Node from the first Add VI.
❑ Drag the left side of the associated shift register to expand it to four elements.
❑ Rewire the Add VIs to add all four prior iterations from the input shift register.
❑ Save the VI.

Note In the interest of time, do not compile this VI.

If you compile the VI you would see the following report.


[Compile report screenshot. Status: Compilation successful. Number of SLICEs: 2692 out of 14336 (18%).]

Figure 9-33. FPGA Optimization Single Loop Pipeline Successful Compile Report
Compared to the prior application, pipelining increased the number of SLICES; however, this application can now run slightly faster because of the decreased combinatorial path.
However, if you try to run this application you continue to see the same problem of the NI 9211 running at a rate of only 14 S/s. Although you can run a few clock ticks faster per iteration, that improvement is imperceptible because of the slow acquisition rate.

Figure 9-34. Left: Single Loop without Pipelining; Right: Single Loop with Pipelining

Notice that the second loop in Figure 9-34 is slightly faster.


This application does not produce the desired tone on the speaker because the output is still too slow to produce an audible tone.
Is there anything you can do to output a signal at a faster rate?

Use Parallel Loops
It should now be obvious that if you leave the NI 9211 acquisition in the same loop as the NI 9263 operations you will never get the speed you want. To get the speed required for your analog output you must put it in a separate loop from the analog input operations.
When you create separate loops, you must make sure that all the code related to the analog output loop remains in one loop and that the code related to the thermocouple input is placed in a different loop.
1. Create the block diagram shown in Figure 9-35 by separating the application into two loops.

Figure 9-35. FPGA Optimization Separate Loops Block Diagram

❑ Place a new While Loop and wire a False constant to the conditional terminal.


❑ Select the Averaging, FPGA Scaling, and Mod1/TC0 FPGA I/O Node and drag them into the new loop.
❑ Delete the shift registers from the Output Loop.
❑ Rename Actual Loop Rate (usec) to Output Loop Rate (usec).
❑ Add a new shift register to the Acquisition Loop and expand the input terminal to four terminals.

2. Rewire the code in the Acquisition Loop.
❑ Wire the Mod1/TC0 FPGA I/O Node to the output shift register.
❑ Connect the shift registers to the appropriate inputs on the Add VIs.

3. Measure the loop execution speed.
❑ Highlight the Tick Count, Subtract function, and Output Loop Rate (usec) indicator from the Output Loop by pressing <Shift> while clicking each new item.
❑ Press <Ctrl> and drag a copy of the code to the Acquisition Loop.
❑ Create a new shift register to retain the prior iteration tick count.
❑ Rename the Output Loop Rate (usec) 2 indicator to Acquisition Loop Rate (usec).

4. Wire the code as in Figure 9-35.
❑ Press <Ctrl-B> to delete any remaining broken wires.
❑ Save the VI.

5. Add inter-loop communication.
In the current VI, there is no way to change the loop speed for the output loop. The new loop rates are calculated in the acquisition loop, so we need a means of sharing the information between the two loops. The best way to share that data for our application will be to use a Local Variable for the Wait (Ticks) indicator.
❑ Right-click the Wait (Ticks) indicator in the Acquisition Loop and select Create»Local Variable.
❑ Place the local variable next to the input of the Loop Timer function in the Output Loop.

❑ Right-click the local variable, select Change to Read, then connect the Local Variable to the Count (Ticks) terminal on the Loop Timer function.
❑ Initialize the Local Variable.
- Place a copy of the local variable in the initialization sequence.
- Right-click the local variable and select Change to Write.
- Write an initial value of 2,000 to the local variable by wiring the existing numeric constant to the local variable.
❑ Save the VI.

Note In the interest of time, do not compile this VI.

If you compile this VI you will see the following report.


Figure 9-36. FPGA Optimizations Separate Loops Successful Compile Report
This application uses 138 more SLICES than the prior version, but most of that is related to the additional Tick Count VIs along with the additional code required for having multiple loops.
By separating the code into two different loops you get very different behavior from the application.
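The behavior of two loops that share a single value can be sketched as a small tick-based simulation. Everything in this Python example is hypothetical (the function name, the periods, and the wait values); it only illustrates that a slow acquisition loop can occasionally write a shared wait value while a fast output loop reads it on every iteration.

```python
def simulate_loops(total_ticks, acq_period_ticks, wait_values, initial_wait=2000):
    """Count iterations of a slow acquisition loop and a fast output loop.

    The acquisition loop writes a new wait value (the shared variable) each
    time it runs; the output loop reads it to schedule its next iteration.
    """
    if acq_period_ticks <= 0 or initial_wait <= 0:
        raise ValueError("periods must be positive")
    wait_ticks = initial_wait            # the shared "local variable"
    next_acq = acq_period_ticks
    next_out = wait_ticks
    acq_count = out_count = 0
    values = iter(wait_values)
    while min(next_acq, next_out) <= total_ticks:
        if next_acq <= next_out:
            # Acquisition loop: write a new value if one is available.
            wait_ticks = next(values, wait_ticks)
            acq_count += 1
            next_acq += acq_period_ticks
        else:
            # Output loop: read the shared value to time the next update.
            out_count += 1
            next_out += wait_ticks
    return acq_count, out_count


counts = simulate_loops(total_ticks=10_000, acq_period_ticks=4_000,
                        wait_values=[500], initial_wait=1000)
```

In this run the acquisition loop executes only twice while the output loop executes sixteen times, and the output loop speeds up as soon as the shared value drops from 1000 to 500 ticks.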


As you can see in Figure 9-37, you have two very different loop rates for the Acquisition Loop and for the Output Loop.

Figure 9-37. FPGA Optimization Separate Loops Front Panel

Because the Output Loop is no longer limited by the speed of the NI 9211 acquisition, this is the first application you have created that runs the Output Loop quickly enough to meet the initial design requirements of producing a tone on the speaker of the Sound and Vibration Signal Simulator.
But can you do better?

Single-Cycle Timed Loop
One technique that you could use to increase the execution speed of the application is to place some of the analysis functions (Averaging and Scaling) in a single-cycle Timed Loop so that the code executes as quickly as possible.
In this application, because the analysis is in parallel with the NI 9211 acquisition, using a single-cycle Timed Loop is a bit more than is really necessary, but it can decrease the size of your application because the single-cycle Timed Loop reduces the FPGA size by eliminating the enable chain overhead.


1. Place analysis functions in a single-cycle Timed Loop.
❑ Modify the block diagram to resemble Figure 9-38.

Figure 9-38. FPGA Optimization Single-Cycle Timed Loop Block Diagram

❑ Add a single-cycle Timed Loop around the averaging and scaling functionality.
Tip Press the <Ctrl> button and drag the While Loop to make room for the Timed Loop before placing it on the block diagram to avoid unusual stretching of the block diagram.
❑ Wire a True constant to the condition terminal of the single-cycle Timed Loop.
Note In the interest of time, do not compile this VI.
If you compile this VI you will see the following report.


Suggestions for eliminating the problem:
* For Timed Loops with timing violations:
  - Reduce long arithmetic/combinatorial paths
  - Use pipelining within Timed Loops
  - Reduce the number of nested Case structures
* Reduce clock rates if possible
* Reduce the amount of application logic to make routing easier
* Refer to the LabVIEW Help for more information about resolving compilation errors. Click the Help button to display the LabVIEW Help.

Figure 9-39. FPGA Optimization Single-Cycle Timed Loop Compilation Failure Dialog Box
The error occurred because the combinatorial path was too long for the single-cycle Timed Loop to execute within one 40 MHz clock cycle.
Is there anything you can do to allow the single-cycle Timed Loop to be fast enough to compile appropriately?

Combining Optimizations
1. Pipeline the code in the single-cycle Timed Loop.
One of the primary reasons that code will not compile within a single-cycle Timed Loop is that the combinatorial path is too long to complete in one clock cycle. The best way to get around this is to use pipelining to allow for faster cycle times.


Refer to the block diagram in Figure 9-40 to add pipelining.

Figure 9-40. FPGA Optimization Single-Cycle Timed Loop Pipelined Block Diagram

❑ Add a Feedback Node between the output of the Scale By Power of 2 function and the Average Temperature input of the FPGA Scaling VI.
❑ Right-click the wire and select Insert»All Palettes»Structures»Feedback Node.
❑ Save the VI.
2. Compile the VI.
❑ Click the Run button to compile the FPGA Optimization VI.
Note Do not make any changes to the VI after you have begun the compile process.


When compiling is complete you should receive a compile report similar to the one in Figure 9-41.

Figure 9-41. FPGA Optimization Single-Cycle Timed Loop Pipelined Successful Compile Report

As you can see, although you added some code by using a Feedback Node, the compilation is still 17 SLICES smaller than the last workable version of the code because you were able to eliminate some of the enable chain overhead by placing portions of the code in a single-cycle Timed Loop.

❑ Save the Project.

Testing
1. Verify that AO0 is connected to AUDIO IN CH0 and that the Speaker switch is set to ON.

2. When the compile is complete click the OK button on the Successful Compile Report dialog box to run the VI.

3. Change the pitch of the output tone.
❑ When the thermocouple connected to the 9211 is cool, the speaker should produce a low and slightly audible tone of about 400 Hz.


❑ Touch the tip of the thermocouple with your fingers or hand to increase the pitch of the tone.
❑ Release the thermocouple to allow the temperature and tone to drop back down.

4. Stop the VI by clicking the Abort button.

Challenge
As with any optimization you can always find additional ways to optimize the application. Use some of the following techniques to create an even more highly optimized application.
1. Eliminate the Tick Count VIs and any functions used to derive the Loop Rates.
2. Use the optimizations suggested in the Challenge section of Exercise 9-1 and apply them to this application.
By implementing these optimizations you can reduce the code even further to only use about 2431 SLICES, which is 382 SLICES (~3%) less than even the most optimized format created in the main part of the exercise while still retaining the same timing characteristics as the desired model.

[Compile report screenshot. Status: Compilation successful. Base clock: 40 MHz Onboard Clock.]

End of Exercise 9-2


E. Advanced Optimizations
Advanced optimization techniques are available for experienced users who are very familiar with FPGA programming. Major errors in the FPGA can result if advanced optimizations are done incorrectly.

Optimizing Arbitration
LabVIEW uses arbitration to manage shared resources on the FPGA. This ensures that only one caller accesses a resource at any given time. Removing arbitration saves significant space on the FPGA and can allow some FPGA I/O functions and FIFO operations to execute in one clock cycle. Refer to Lesson 7, Windows PC Host, for more information about arbitration.


Self Review: Quiz
1. Which of the following are FPGA optimization techniques?
a. Eliminate arrays on the front panel.
b. Decrease the block diagram size.
c. Pipeline large combinatorial paths.
d. Use the Scale By Power of 2 function with a control on the n input.
e. Replace all loops with single-cycle Timed Loops.
2. How does the single-cycle Timed Loop create a smaller FPGA footprint and execute within one clock tick?
a. By using other VI logic functions when they are not in use.
b. By eliminating the enable chain overhead.
c. By passing the data to the RT controller to process.
d. By skipping some functions and having incomplete functionality.


Self Review: Quiz Answers
1. Which of the following are FPGA optimization techniques?
a. Eliminate arrays on the front panel.
b. Decrease the block diagram size.
c. Pipeline large combinatorial paths.
d. Use the Scale By Power of 2 function with a control on the n input.
e. Replace all loops with single-cycle Timed Loops.
2. How does the single-cycle Timed Loop create a smaller FPGA footprint and execute within one clock tick?
a. By using other VI logic functions when they are not in use.
b. By eliminating the enable chain overhead.
c. By passing the data to the RT controller to process.
d. By skipping some functions and having incomplete functionality.
