CS545: Gradient Descent
Chuck Anderson
Fall, 2009

Outline
  Gradient Descent
  Parabola
  Examples in R

Finding Minimum of Parabola
Find the x that minimizes $f(x) = 1.2(x-2)^2 + 3.2$ or, said another way, find $\operatorname{argmin}_x f(x)$. How?
Yep. Take the derivative, set it equal to zero, and try to solve for x.
\[ f(x) = 1.2(x-2)^2 + 3.2 \]
\[ \frac{df(x)}{dx} = 1.2(2)(x-2) = 2.4(x-2) \]
\[ \frac{df(x)}{dx} = 0 = 2.4(x-2) \]
\[ x = 2 \]
[Figure: plot of 1.2(x - 2)^2 + 3.2 against x for x from 0 to 4, with the closed-form solution marked at x = 2.]
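
As a quick numerical check of the closed-form answer, the sketch below minimizes the same parabola with R's built-in `optimize()` over the interval from 0 to 4 used in the plot. The names `f` and `fit` are chosen here for illustration, and the default tolerance of `optimize()` is assumed to be good enough for this purpose.

```r
# The parabola from the slide: f(x) = 1.2 (x - 2)^2 + 3.2
f <- function(x) {
  1.2 * (x - 2)^2 + 3.2
}

# One-dimensional minimization over the plotted interval [0, 4]
fit <- optimize(f, interval = c(0, 4))

fit$minimum    # location of the minimum, close to the closed-form x = 2
fit$objective  # value at the minimum, close to f(2) = 3.2
```

This only confirms the answer for this one function; the derivative argument is what explains why x = 2 is the minimum.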

Gradient Descent
[Figure: the same parabola, 1.2(x - 2)^2 + 3.2 against x from 0 to 4, with the closed-form solution and the gradient descent steps marked.]

For a parabola, we can get there much faster if we also know the second derivative. What is it?
\[ \frac{df(x)}{dx} = f'(x) = 2.4(x-2) \]
\[ \frac{d^2 f(x)}{dx^2} = f''(x) = 2.4 \]
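
To see why the second derivative helps, here is a minimal sketch of a single Newton step, x - f'(x)/f''(x), on this parabola. Because f'' is the constant 2.4, one step lands on the minimum from any starting point. The names `grad`, `hess`, and `newtonStep` are illustrative choices, not taken from the course code.

```r
grad <- function(x) 2.4 * (x - 2)   # f'(x) for f(x) = 1.2 (x - 2)^2 + 3.2
hess <- function(x) 2.4             # f''(x), constant for a parabola

# One Newton step: scale the gradient by the inverse second derivative
newtonStep <- function(x) {
  x - grad(x) / hess(x)
}

xNew <- newtonStep(0.1)   # an arbitrary starting point; xNew is 2 up to rounding
```

For functions that are not quadratic the second derivative changes with x, so a single step would not be expected to reach the minimum.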

Approximating the Second Derivative
Say we have picked a direction, p, to go. Rather than compute the second derivative in that direction, we can approximate it with a difference of first derivatives:
\[ f''(x)\,p \approx \frac{f'(x + \sigma p) - f'(x)}{\sigma} \quad \text{for } 0 < \sigma \ll 1 \]
In practice, Møller found he had to modify this by adding $\lambda p$, where $\lambda$ is set to a value for which the resulting approximated second derivative is well behaved.
\[ f''(x)\,p \approx \frac{f'(x + \sigma p) - f'(x)}{\sigma} + \lambda p, \quad \text{for } 0 < \sigma \ll 1 \]
This gives us a way to scale the step size.
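
The finite-difference idea can be sketched in R for the parabola, where f'(x) = 2.4(x - 2) and the exact second derivative is 2.4. The function name `approxSecondDeriv`, the small step (called `sigma` here), the added scale (called `lambda` here), and their default values are assumptions made for illustration; they are not taken from Møller's paper or from gradientDescents.R.

```r
grad <- function(x) 2.4 * (x - 2)   # f'(x) for f(x) = 1.2 (x - 2)^2 + 3.2

# Finite-difference estimate of f''(x) p, plus the added lambda * p term
approxSecondDeriv <- function(x, p, sigma = 1e-4, lambda = 0) {
  (grad(x + sigma * p) - grad(x)) / sigma + lambda * p
}

x <- 0.1
p <- -grad(x)                                   # a descent direction at x
s <- approxSecondDeriv(x, p)                    # close to 2.4 * p
sDamped <- approxSecondDeriv(x, p, lambda = 1)  # close to (2.4 + 1) * p

# One way to use s: scale the step along p by (p . p) / (p . s)
alpha <- sum(p * p) / sum(p * s)
xNew <- x + alpha * p                           # close to 2 on this parabola
```

With `lambda = 0` the scaled step matches a Newton step on this parabola; a positive `lambda` makes the estimated curvature larger and so shortens the step.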

Picking a Good Direction

Parabola Example
f <- function(x) {
  1.2 * (x - 2)^2 + 3.2
}
### df/dx = 2.4(x-2), called as grad() in the examples that follow
grad <- function(x) 2.4 * (x - 2)

Steepest Descent
xs <- seq(0, 4, len = 20)
plot(xs, f(xs), type = "l", xlab = "x", ylab = expression(1.2(x-2)^2 + 3.2))
### df/dx = 2.4(x-2)
### df/dx = 0  ->  0 = 2.4x - 4.8  ->  x = 2
lines(c(2, 2), c(3, 8), col = "red", lty = 2)
text(2.1, 7, "Closed-form solution", col = "red", pos = 4)
### gradient descent
x <- 0.1
xtrace <- x
ftrace <- f(x)
stepFactor <- 0.6 ### try larger and smaller values (0.8 and 0.01)
for (step in 1:100) {
  x <- x - stepFactor * grad(x)
  xtrace <- c(xtrace, x)
  ftrace <- c(ftrace, f(x))
}
lines(xtrace, ftrace, type = "b", col = "blue")
text(0.5, 6, "Gradient Descent", col = "blue", pos = 4)
[Figure: the parabola 1.2(x - 2)^2 + 3.2 against x from 0 to 4, with the closed-form solution (red dashed line at x = 2) and the gradient descent trace (blue).]
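
To make the effect of the step factor concrete, the sketch below repeats the same update for several step factors and reports how far the final x is from 2. The helper `descend` is a hypothetical name introduced here. For this parabola each update multiplies the distance to the minimum by (1 - 2.4 * stepFactor), so any step factor between 0 and 2/2.4 (about 0.83) converges, values near either end of that range converge slowly, and larger values diverge.

```r
grad <- function(x) 2.4 * (x - 2)   # f'(x) for f(x) = 1.2 (x - 2)^2 + 3.2

# Run plain gradient descent for nSteps updates and return the final x
descend <- function(stepFactor, x = 0.1, nSteps = 100) {
  for (step in 1:nSteps) {
    x <- x - stepFactor * grad(x)
  }
  x
}

stepFactors <- c(0.01, 0.6, 0.8, 0.9)
finalX <- sapply(stepFactors, descend)

# Distance from the minimum after 100 steps, one row per step factor
data.frame(stepFactor = stepFactors, distanceFromMin = abs(finalX - 2))
```

The value 0.9 is included only to show a step factor just past the convergence limit; it is not one of the values suggested on the slide.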

Steepest Descent with gradientDescents.R
source("gradientDescents.R")
x <- 0.1
result <- steepest(x, f, grad, stepsize = 0.6, nIterations = 100, xtracep = TRUE, ftracep = TRUE)
plot(xs, f(xs), type = "l", xlab = "x", ylab = expression(1.2(x-2)^2 + 3.2))
lines(result$xtrace, result$ftrace, type = "b", col = "blue")
text(0.5, 6, "Gradient Descent with steepest()", col = "blue", pos = 4)
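
The file gradientDescents.R is not reproduced in these slides. As a rough guess at what a function with this calling convention could do, here is a hypothetical stand-in, named `steepestSketch` to make clear it is not the course's `steepest()`. It assumes the result is a list with `xtrace` and `ftrace` components, which matches how `result$xtrace` and `result$ftrace` are used above; the `x` and `f` components, the default values, and everything else are assumptions.

```r
# Hypothetical stand-in for steepest(); not the code from gradientDescents.R
steepestSketch <- function(x, f, grad, stepsize = 0.1, nIterations = 100,
                           xtracep = FALSE, ftracep = FALSE) {
  xtrace <- if (xtracep) x else NULL
  ftrace <- if (ftracep) f(x) else NULL
  for (i in seq_len(nIterations)) {
    x <- x - stepsize * grad(x)             # move against the gradient
    if (xtracep) xtrace <- c(xtrace, x)     # record the visited points
    if (ftracep) ftrace <- c(ftrace, f(x))  # record the function values
  }
  list(x = x, f = f(x), xtrace = xtrace, ftrace = ftrace)
}

f <- function(x) 1.2 * (x - 2)^2 + 3.2
grad <- function(x) 2.4 * (x - 2)
result <- steepestSketch(0.1, f, grad, stepsize = 0.6, nIterations = 100,
                         xtracep = TRUE, ftracep = TRUE)
```

The traces include the starting point, so 100 iterations give 101 entries in each trace.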

Steepest Descent scaled with Newton's Method

With Scaled Conjugate Gradient from gradientDescents.R
source("gradientDescents.R")
x <- 0.1
result <- scg(x, f, grad, nIterations = 100, xtracep = TRUE, ftracep = TRUE)
plot(xs, f(xs), type = "l", xlab = "x", ylab = expression(1.2(x-2)^2 + 3.2))
lines(result$xtrace, result$ftrace, type = "b", col = "blue")
text(0.5, 6, "Gradient Descent with scg()", col = "blue", pos = 4)

Results
[Figure: four panels, each plotting 1.2(x - 2)^2 + 3.2 against x for x from 0 to 4.]