Learning from hints
In the form of prior information (invariance, symmetry) about the input–output map to be learned.
Learning Rate
Sometimes set inversely proportional to the square root of the number of synaptic connections (and always kept positive).
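The per-neuron rule above can be sketched as follows; the base rate and the fan-in values are illustrative assumptions, not from the slides:

```python
import numpy as np

# Per-neuron learning rates, inversely proportional to the square root
# of the number of incoming synaptic connections (fan-in).
# base_rate and the fan-ins below are assumed, for illustration only.
base_rate = 0.1
fan_ins = np.array([4, 16, 64])        # incoming connections per neuron
rates = base_rate / np.sqrt(fan_ins)   # positive, shrinks as fan-in grows
print(rates)
```

Neurons with many inputs receive a smaller rate, so all neurons learn at roughly comparable speeds.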
Manufacturing Training Data:
• Corrupting with noise (adding, multiplying, or convolving)
• Varying orientation (2-D rotation)
• Varying scale (size)
• Varying the average of an attribute (e.g., brightness)
• Adding high-frequency content (varying sharpness)
• Enveloping with a low-frequency variation
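Several of the augmentations above can be sketched in a few lines of NumPy; the toy 8×8 "image" and the noise/brightness constants are assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.random((8, 8))                      # toy 2-D "image" (assumed data)

noisy    = x + 0.05 * rng.standard_normal(x.shape)         # additive noise
gained   = x * (1.0 + 0.1 * rng.standard_normal(x.shape))  # multiplicative noise
rotated  = np.rot90(x)                      # vary orientation (2-D)
zoomed   = np.kron(x[2:6, 2:6], np.ones((2, 2)))  # crude crop-and-scale change
brighter = np.clip(x + 0.2, 0.0, 1.0)       # shift the average brightness

# Each variant is a new training example carrying the same label as x.
augmented = [noisy, gained, rotated, zoomed, brighter]
```

Every variant keeps the original label, multiplying the effective size of the training set.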
Back Propagation and Differentiation
A specific technique for implementing gradient descent in weight space.
Cost function for one example X, involving the W weights, averaged over the ‘N’ examples.
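In standard back-propagation notation (symbols assumed, not from the slide), the cost for one example and its average over the N training examples can be written as:

```latex
\mathcal{E}(n) \;=\; \frac{1}{2}\sum_{j} e_j^{2}(n),
\qquad
\mathcal{E}_{\mathrm{av}} \;=\; \frac{1}{N}\sum_{n=1}^{N}\mathcal{E}(n),
```

where e_j(n) is the error at output j on example n, and the weights W enter through the network outputs.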
Hessian and Online Learning
The Hessian is positive semi-definite; convergence is governed by the spread between its largest eigenvalue and its smallest nonzero eigenvalue.
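A minimal numerical illustration (the quadratic cost and its Hessian below are assumed for demonstration): for gradient descent on a quadratic, the step size must satisfy η < 2/λ_max, while progress along the shallow direction is limited by the smallest nonzero eigenvalue.

```python
import numpy as np

# Assumed toy quadratic cost E(w) = 0.5 * w^T H w with a fixed Hessian H.
H = np.array([[4.0, 0.0],
              [0.0, 0.25]])                 # eigenvalues 4 and 0.25

lam = np.linalg.eigvalsh(H)                 # ascending eigenvalues
lam_max, lam_min = lam[-1], lam[lam > 0][0]
print("eigenvalue spread:", lam_max / lam_min)   # 16.0

eta = 1.9 / lam_max                         # stable: eta < 2 / lam_max
w = np.array([1.0, 1.0])
for _ in range(200):
    w = w - eta * (H @ w)                   # gradient of the quadratic is H w
# the component along the small-eigenvalue axis decays slowest
```

A large eigenvalue spread forces a small step size and hence slow progress along the shallow directions.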
Utility (example): subtracting the means of the inputs (centering them) improves the Hessian's conditioning.
On Convergence
We have seen asymptotic convergence of LMS towards a local minimum.
The learning curve contains:
• Minimal loss (the approach towards a global/local minimum)
• Additional loss (fluctuation in the weight evolution)
• A time-dependent term
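Schematically, the three contributions above are often written as a decomposition of the expected loss at time t (this form and its symbols are an assumption, for illustration):

```latex
\mathbb{E}\,[E(t)] \;\approx\;
\underbrace{E_{\min}}_{\text{minimal loss}}
\;+\;
\underbrace{c\,\eta}_{\text{weight fluctuation}}
\;+\;
\underbrace{O(1/t)}_{\text{time-dependent term}}
```

The fluctuation term scales with the learning rate η, which motivates annealing η over time.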
Instantaneous Cost Function
The expected risk is the corresponding cost in the batch sense.
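One common way to relate the two (notation assumed): the expected risk is the instantaneous cost averaged over the data distribution, which a batch computation approximates:

```latex
J(\mathbf{w}) \;=\; \mathbb{E}_{(\mathbf{x},\,d)}\bigl[\,Q(\mathbf{x}, d; \mathbf{w})\,\bigr]
\;\approx\; \frac{1}{N}\sum_{n=1}^{N} Q(\mathbf{x}_n, d_n; \mathbf{w}),
```

where Q is the instantaneous cost evaluated on a single example.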
Evaluating the above over infinitesimal intervals, with the expected value of the gradient, gives a differential equation for the mean weight evolution; at the solution the expected gradient is 0.
Solution: if the learning rate η(t) is positive and η(t) → 0 as t → ∞, the weights converge.
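A common schedule with these properties (positive, decaying to zero) is the "search-then-converge" form η(t) = η₀ / (1 + t/τ); the constants and the toy 1-D LMS problem below are assumptions for illustration:

```python
import numpy as np

def eta(t, eta0=0.1, tau=100.0):
    """Search-then-converge schedule: positive for all t, -> 0 as t -> inf."""
    return eta0 / (1.0 + t / tau)

# Online LMS on a toy 1-D problem: estimate a scalar weight w_true from
# noisy samples d = w_true * x + noise (all constants here are assumed).
rng = np.random.default_rng(1)
w_true, w = 3.0, 0.0
for t in range(5000):
    x = rng.standard_normal()
    d = w_true * x + 0.1 * rng.standard_normal()
    e = d - w * x                   # instantaneous error
    w = w + eta(t) * e * x          # LMS update with the annealed rate

print(abs(w - w_true))              # small: the iterate settles near w_true
```

Early on the rate is large enough to search; later it shrinks, damping the weight fluctuations.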
Implementation
Adaptive control of the learning rate*: a fixed schedule is not suitable under nonstationarity; the control rule is generalized from the gradient g.
Assumed: a smooth form of the cost.
Solved to get the adaptive learning rate.
Smoothness!