You are on page 1of 3

21/9/2016 DeltaruleWikipedia,thefreeencyclopedia

Deltarule
FromWikipedia,thefreeencyclopedia

Inmachinelearning,thedeltaruleisagradientdescentlearningruleforupdatingtheweightsoftheinputsto
artificialneuronsinasinglelayerneuralnetwork.[1]Itisaspecialcaseofthemoregeneralbackpropagation
algorithm.Foraneuron withactivationfunction ,thedeltarulefor 's thweight isgivenby

where

isasmallconstantcalledlearningrate
istheneuron'sactivationfunction
isthetargetoutput
istheweightedsumoftheneuron'sinputs
istheactualoutput
isthe thinput.

Itholdsthat and .

Thedeltaruleiscommonlystatedinsimplifiedformforaneuronwithalinearactivationfunctionas

Whilethedeltaruleissimilartotheperceptron'supdaterule,thederivationisdifferent.Theperceptronusesthe
Heavisidestepfunctionastheactivationfunction ,andthatmeansthat doesnotexistatzero,andis
equaltozeroelsewhere,whichmakesthedirectapplicationofthedeltaruleimpossible.

Derivationofthedeltarule
Thedeltaruleisderivedbyattemptingtominimizetheerrorintheoutputoftheneuralnetworkthrough
gradientdescent.Theerrorforaneuralnetworkwith outputscanbemeasuredas

Inthiscase,wewishtomovethrough"weightspace"oftheneuron(thespaceofallpossiblevaluesofallofthe
neuron'sweights)inproportiontothegradientoftheerrorfunctionwithrespecttoeachweight.Inordertodo
that,wecalculatethepartialderivativeoftheerrorwithrespecttoeachweight.Forthe thweight,this
derivativecanbewrittenas

Becauseweareonlyconcerningourselveswiththe thneuron,wecansubstitutetheerrorformulaabovewhile
omittingthesummation:

https://en.wikipedia.org/wiki/Delta_rule 1/3
21/9/2016 DeltaruleWikipedia,thefreeencyclopedia

Nextweusethechainruletosplitthisintotwoderivatives:

Tofindtheleftderivative,wesimplyapplythegeneralpowerrule:

Tofindtherightderivative,weagainapplythechainrule,thistimedifferentiatingwithrespecttothetotalinput
to , :

Notethattheoutputofthe thneuron, ,isjusttheneuron'sactivationfunction appliedtotheneuron'sinput


.Wecanthereforewritethederivativeof withrespectto simplyas 'sfirstderivative:

Nextwerewrite inthelasttermasthesumoverall weightsofeachweight timesitscorresponding


input :

Becauseweareonlyconcernedwiththe thweight,theonlytermofthesummationthatisrelevantis .
Clearly,

givingusourfinalequationforthegradient:

Asnotedabove,gradientdescenttellsusthatourchangeforeachweightshouldbeproportionaltothegradient.
Choosingaproportionalityconstant andeliminatingtheminussigntoenableustomovetheweightinthe
negativedirectionofthegradienttominimizeerror,wearriveatourtargetequation:

Seealso
Stochasticgradientdescent
Backpropagation
RescorlaWagnermodeltheoriginofdeltarule

https://en.wikipedia.org/wiki/Delta_rule 2/3
21/9/2016 DeltaruleWikipedia,thefreeencyclopedia

References
1.Russell,Ingrid."TheDeltaRule".UniversityofHartford.Retrieved5November2012.

Retrievedfrom"https://en.wikipedia.org/w/index.php?title=Delta_rule&oldid=707692960"

Categories: Artificialneuralnetworks

Thispagewaslastmodifiedon1March2016,at07:05.
TextisavailableundertheCreativeCommonsAttributionShareAlikeLicenseadditionaltermsmay
apply.Byusingthissite,youagreetotheTermsofUseandPrivacyPolicy.Wikipediaisaregistered
trademarkoftheWikimediaFoundation,Inc.,anonprofitorganization.

https://en.wikipedia.org/wiki/Delta_rule 3/3