The Conjugate Gradient Method
Tom Lyche
University of Oslo
Norway
The Conjugate Gradient Method p. 1/23
Plan for the day
The method
Algorithm
Implementation of test problems
Complexity
Derivation of the method
Convergence
The Conjugate Gradient Method p. 2/23
The Conjugate Gradient method
Restricted to positive definite systems: $Ax = b$, $A \in \mathbb{R}^{n,n}$ positive definite.
Generate $x_k$ by $x_{k+1} = x_k + \alpha_k p_k$, where
$p_k$ is a vector, the search direction,
$\alpha_k$ is a scalar determining the step length.
In general we find the exact solution in at most $n$ iterations.
For many problems the error becomes small after a few iterations.
Both a direct method and an iterative method.
Rate of convergence depends on the square root of the condition number.
The Conjugate Gradient Method p. 3/23
The name of the game
Conjugate means orthogonal; orthogonal gradients. But why gradients?
Consider minimizing the quadratic function $Q : \mathbb{R}^n \to \mathbb{R}$ given by $Q(x) := \frac{1}{2} x^T A x - x^T b$.
The minimum is obtained by setting the gradient equal to zero:
$$\nabla Q(x) = Ax - b = 0 \iff \text{linear system } Ax = b$$
Find the solution by solving $r = b - Ax = 0$.
The sequence $x_k$ is such that the residuals $r_k := b - Ax_k$ are orthogonal with respect to the usual inner product in $\mathbb{R}^n$.
The search directions are also orthogonal, but with respect to a different inner product.
The Conjugate Gradient Method p. 4/23
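Since $Q$ is quadratic, the identity $\nabla Q(x) = Ax - b$ can be checked numerically. A minimal pure-Python sketch (not part of the original notes), using the $2\times 2$ system from the example later in these slides:

```python
# Numerical check that the gradient of Q(x) = 1/2 x^T A x - x^T b equals Ax - b,
# using the 2x2 positive definite system from the example slide.
A = [[2.0, -1.0], [-1.0, 2.0]]
b = [1.0, 0.0]

def Q(x):
    Ax = [sum(A[i][j] * x[j] for j in range(2)) for i in range(2)]
    return 0.5 * sum(x[i] * Ax[i] for i in range(2)) - sum(x[i] * b[i] for i in range(2))

def grad_fd(x, eps=1e-6):
    # central finite differences; exact up to rounding since Q is quadratic
    g = []
    for i in range(2):
        xp, xm = list(x), list(x)
        xp[i] += eps
        xm[i] -= eps
        g.append((Q(xp) - Q(xm)) / (2 * eps))
    return g

x = [0.3, -0.7]
residual_grad = [sum(A[i][j] * x[j] for j in range(2)) - b[i] for i in range(2)]
print(grad_fd(x), residual_grad)   # the two vectors agree
```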
The algorithm
Start with some $x_0$. Set $p_0 = r_0 = b - Ax_0$.
For $k = 0, 1, 2, \ldots$
$$x_{k+1} = x_k + \alpha_k p_k, \qquad \alpha_k = \frac{r_k^T r_k}{p_k^T A p_k}$$
$$r_{k+1} = b - Ax_{k+1} = r_k - \alpha_k A p_k$$
$$p_{k+1} = r_{k+1} + \beta_k p_k, \qquad \beta_k = \frac{r_{k+1}^T r_{k+1}}{r_k^T r_k}$$
The Conjugate Gradient Method p. 5/23
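The three recurrences above translate directly into code. A self-contained pure-Python sketch (the MATLAB version appears later in these slides; names here are my own):

```python
def cg(A, b, x0, tol=1e-12, itmax=100):
    # Plain conjugate gradient following the three recurrences above.
    # A: SPD matrix as a list of lists; b, x0: lists. Returns (x, iterations).
    n = len(b)

    def matvec(v):
        return [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]

    x = list(x0)
    Ax = matvec(x)
    r = [b[i] - Ax[i] for i in range(n)]        # r_0 = b - A x_0
    p = list(r)                                  # p_0 = r_0
    rho = sum(ri * ri for ri in r)               # r_k^T r_k
    for k in range(itmax):
        if rho <= tol * tol:                     # ||r_k||_2 <= tol
            return x, k
        Ap = matvec(p)
        alpha = rho / sum(p[i] * Ap[i] for i in range(n))   # alpha_k
        x = [x[i] + alpha * p[i] for i in range(n)]          # x_{k+1}
        r = [r[i] - alpha * Ap[i] for i in range(n)]         # r_{k+1}
        rho, rho_prev = sum(ri * ri for ri in r), rho
        beta = rho / rho_prev                                # beta_k
        p = [r[i] + beta * p[i] for i in range(n)]           # p_{k+1}
    return x, itmax

# The 2x2 example from these slides: solution [2/3, 1/3] after two steps.
x, k = cg([[2.0, -1.0], [-1.0, 2.0]], [1.0, 0.0], [0.0, 0.0])
print(x, k)
```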
Example
$$\begin{bmatrix} 2 & -1 \\ -1 & 2 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 1 \\ 0 \end{bmatrix}$$
Start with $x_0 = 0$. $p_0 = r_0 = b = [1, 0]^T$.
$$\alpha_0 = \frac{r_0^T r_0}{p_0^T A p_0} = \frac{1}{2}, \quad x_1 = x_0 + \alpha_0 p_0 = \begin{bmatrix} 0 \\ 0 \end{bmatrix} + \frac{1}{2}\begin{bmatrix} 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 1/2 \\ 0 \end{bmatrix}$$
$$r_1 = r_0 - \alpha_0 A p_0 = \begin{bmatrix} 1 \\ 0 \end{bmatrix} - \frac{1}{2}\begin{bmatrix} 2 \\ -1 \end{bmatrix} = \begin{bmatrix} 0 \\ 1/2 \end{bmatrix}, \quad r_1^T r_0 = 0$$
$$\beta_0 = \frac{r_1^T r_1}{r_0^T r_0} = \frac{1}{4}, \quad p_1 = r_1 + \beta_0 p_0 = \begin{bmatrix} 0 \\ 1/2 \end{bmatrix} + \frac{1}{4}\begin{bmatrix} 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 1/4 \\ 1/2 \end{bmatrix}$$
$$\alpha_1 = \frac{r_1^T r_1}{p_1^T A p_1} = \frac{2}{3}, \quad x_2 = x_1 + \alpha_1 p_1 = \begin{bmatrix} 1/2 \\ 0 \end{bmatrix} + \frac{2}{3}\begin{bmatrix} 1/4 \\ 1/2 \end{bmatrix} = \begin{bmatrix} 2/3 \\ 1/3 \end{bmatrix}$$
$r_2 = 0$, exact solution.
The Conjugate Gradient Method p. 6/23
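The numbers above can be replayed in exact rational arithmetic. A throwaway pure-Python sketch (not part of the original notes):

```python
from fractions import Fraction as F
# Re-run the worked example in exact rational arithmetic to confirm each number.
A = [[F(2), F(-1)], [F(-1), F(2)]]
b = [F(1), F(0)]
mv = lambda v: [A[0][0]*v[0] + A[0][1]*v[1], A[1][0]*v[0] + A[1][1]*v[1]]
dot = lambda u, v: u[0]*v[0] + u[1]*v[1]

p0 = r0 = [F(1), F(0)]
alpha0 = dot(r0, r0) / dot(p0, mv(p0))                # 1/2
x1 = [alpha0 * p0[0], alpha0 * p0[1]]                 # [1/2, 0]
Ap0 = mv(p0)
r1 = [r0[0] - alpha0*Ap0[0], r0[1] - alpha0*Ap0[1]]   # [0, 1/2]
beta0 = dot(r1, r1) / dot(r0, r0)                     # 1/4
p1 = [r1[0] + beta0*p0[0], r1[1] + beta0*p0[1]]       # [1/4, 1/2]
alpha1 = dot(r1, r1) / dot(p1, mv(p1))                # 2/3
x2 = [x1[0] + alpha1*p1[0], x1[1] + alpha1*p1[1]]     # [2/3, 1/3]
Ax2 = mv(x2)
r2 = [b[0] - Ax2[0], b[1] - Ax2[1]]                   # [0, 0]: exact solution
print(alpha0, x1, r1, beta0, p1, alpha1, x2, r2)
```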
Exact method and iterative method
Orthogonality of the residuals implies that $x_m$ is equal to the solution $x$ of $Ax = b$ for some $m \le n$.
For if $x_k \ne x$ for all $k = 0, 1, \ldots, n-1$, then $r_k \ne 0$ for $k = 0, 1, \ldots, n-1$, so $\{r_0, \ldots, r_{n-1}\}$ is an orthogonal basis for $\mathbb{R}^n$. But then $r_n \in \mathbb{R}^n$ is orthogonal to all vectors in $\mathbb{R}^n$, so $r_n = 0$ and hence $x_n = x$.
So the conjugate gradient method finds the exact solution in at most $n$ iterations.
The convergence analysis shows that $\|x - x_k\|_A$ typically becomes small quite rapidly, and we can stop the iteration with $k$ much smaller than $n$.
It is this rapid convergence which makes the method interesting and in practice an iterative method.
The Conjugate Gradient Method p. 7/23
Conjugate Gradient Algorithm
[Conjugate Gradient Iteration] The positive definite linear system $Ax = b$ is solved by the conjugate gradient method. x is a starting vector for the iteration. The iteration is stopped when $\|r_k\|_2 / \|r_0\|_2 \le$ tol or $k >$ itmax. itm is the number of iterations used.

function [x, itm] = cg(A, b, x, tol, itmax)
  r = b - A*x; p = r; rho = r'*r;
  rho0 = rho;
  for k = 0:itmax
    if sqrt(rho/rho0) <= tol
      itm = k; return
    end
    t = A*p; a = rho/(p'*t);
    x = x + a*p; r = r - a*t;
    rhos = rho; rho = r'*r;
    p = r + (rho/rhos)*p;
  end
  itm = itmax + 1;
The Conjugate Gradient Method p. 8/23
A family of test problems
We can test the methods on the Kronecker sum matrix
$$A = C_1 \otimes I + I \otimes C_2 = \begin{bmatrix} C_1 & & & \\ & C_1 & & \\ & & \ddots & \\ & & & C_1 \end{bmatrix} + \begin{bmatrix} cI & bI & & \\ bI & cI & \ddots & \\ & \ddots & \ddots & bI \\ & & bI & cI \end{bmatrix},$$
where $C_1 = \mathrm{tridiag}_m(a, c, a)$ and $C_2 = \mathrm{tridiag}_m(b, c, b)$.
Positive definite if $c > 0$ and $c \ge |a| + |b|$.
The Conjugate Gradient Method p. 9/23
m = 3, n = 9
$$A = \begin{bmatrix}
2c & a & 0 & b & 0 & 0 & 0 & 0 & 0 \\
a & 2c & a & 0 & b & 0 & 0 & 0 & 0 \\
0 & a & 2c & 0 & 0 & b & 0 & 0 & 0 \\
b & 0 & 0 & 2c & a & 0 & b & 0 & 0 \\
0 & b & 0 & a & 2c & a & 0 & b & 0 \\
0 & 0 & b & 0 & a & 2c & 0 & 0 & b \\
0 & 0 & 0 & b & 0 & 0 & 2c & a & 0 \\
0 & 0 & 0 & 0 & b & 0 & a & 2c & a \\
0 & 0 & 0 & 0 & 0 & b & 0 & a & 2c
\end{bmatrix}$$
$b = a = -1$, $c = 2$: Poisson matrix
$b = a = 1/9$, $c = 5/18$: Averaging matrix
The Conjugate Gradient Method p. 10/23
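The $9 \times 9$ matrix above can be assembled from Kronecker products and checked against the displayed pattern. A pure-Python sketch (I use the ordering kron(I, C1) + kron(C2, I), which reproduces the displayed pattern under the standard Kronecker-product convention; the slide's $\otimes$ ordering is a notational choice for the same matrix):

```python
# Assemble the 9x9 matrix for m = 3 and check the pattern shown above:
# 2c on the diagonal, a inside each block, b between neighbouring blocks.
# Poisson case: a = b = -1, c = 2.
m = 3
a, b, c = -1.0, -1.0, 2.0

def tridiag(off, diag, size):
    return [[diag if i == j else off if abs(i - j) == 1 else 0.0
             for j in range(size)] for i in range(size)]

def kron(X, Y):
    p, q = len(X), len(Y)
    return [[X[i // q][j // q] * Y[i % q][j % q]
             for j in range(p * q)] for i in range(p * q)]

I = tridiag(0.0, 1.0, m)
C1 = tridiag(a, c, m)              # tridiag_m(a, c, a)
C2 = tridiag(b, c, m)              # tridiag_m(b, c, b)
K1 = kron(I, C1)                   # a-couplings inside each block
K2 = kron(C2, I)                   # b-couplings between blocks
A = [[K1[i][j] + K2[i][j] for j in range(m * m)] for i in range(m * m)]
for row in A:
    print(row)
```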
Averaging problem
$$\lambda_{j,k} = 2c + 2a \cos(j\pi h) + 2b \cos(k\pi h), \quad j, k = 1, 2, \ldots, m.$$
$a = b = 1/9$, $c = 5/18$
$$\lambda_{\max} = \frac{5}{9} + \frac{4}{9}\cos(\pi h), \qquad \lambda_{\min} = \frac{5}{9} - \frac{4}{9}\cos(\pi h)$$
$$\mathrm{cond}_2(A) = \frac{\lambda_{\max}}{\lambda_{\min}} = \frac{5 + 4\cos(\pi h)}{5 - 4\cos(\pi h)} < 9.$$
The Conjugate Gradient Method p. 11/23
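The bound $\mathrm{cond}_2(A) < 9$ can be confirmed by brute-force evaluation of the eigenvalue formula. A small sketch (not part of the original notes):

```python
import math
# Evaluate the eigenvalue formula for the averaging matrix numerically and
# confirm cond_2(A) stays below 9 no matter how fine the grid is.
a = b = 1.0 / 9.0
c = 5.0 / 18.0
for m in (4, 16, 64, 256):
    h = 1.0 / (m + 1)
    lams = [2*c + 2*a*math.cos(j*math.pi*h) + 2*b*math.cos(k*math.pi*h)
            for j in range(1, m + 1) for k in range(1, m + 1)]
    cond = max(lams) / min(lams)
    print(m, cond)   # increases toward, but never reaches, (5+4)/(5-4) = 9
```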
2D formulation for test problems
Let $V, R, P \in \mathbb{R}^{m,m}$ with $x = \mathrm{vec}(V)$, $r = \mathrm{vec}(R)$, $p = \mathrm{vec}(P)$.
$$Ax = b \iff DV + VE = h^2 F,$$
$$D = \mathrm{tridiag}(a, c, a) \in \mathbb{R}^{m,m}, \quad E = \mathrm{tridiag}(b, c, b) \in \mathbb{R}^{m,m}$$
$$Ap = \mathrm{vec}(DP + PE)$$
The Conjugate Gradient Method p. 12/23
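The identity $Ap = \mathrm{vec}(DP + PE)$ can be verified directly for a small grid. A pure-Python sketch (column-major, MATLAB-style vec; as on the previous test-problem slide, I build $A$ with the Kronecker ordering that matches the displayed block pattern):

```python
# Check Ap = vec(D P + P E) for m = 3 with column-major (MATLAB-style) vec
# and A built as kron(I, D) + kron(E, I). Entries are small integers times
# -1 and 2, so all arithmetic below is exact in floating point.
m = 3
a, b, c = -1.0, -1.0, 2.0

def tridiag(off, diag, size):
    return [[diag if i == j else off if abs(i - j) == 1 else 0.0
             for j in range(size)] for i in range(size)]

def kron(X, Y):
    p, q = len(X), len(Y)
    return [[X[i // q][j // q] * Y[i % q][j % q]
             for j in range(p * q)] for i in range(p * q)]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(m)) for j in range(m)]
            for i in range(m)]

D = tridiag(a, c, m)
E = tridiag(b, c, m)
I = tridiag(0.0, 1.0, m)
K1, K2 = kron(I, D), kron(E, I)
A = [[K1[i][j] + K2[i][j] for j in range(m * m)] for i in range(m * m)]

P = [[float(3 * i + j + 1) for j in range(m)] for i in range(m)]
vec = lambda M: [M[i][j] for j in range(m) for i in range(m)]  # stack columns
vp = vec(P)
Ap = [sum(A[i][j] * vp[j] for j in range(m * m)) for i in range(m * m)]
S, T = matmul(D, P), matmul(P, E)
DPPE = [[S[i][j] + T[i][j] for j in range(m)] for i in range(m)]
print(max(abs(Ap[i] - vec(DPPE)[i]) for i in range(m * m)))   # 0.0
```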
Testing
[Testing Conjugate Gradient] $A = \mathrm{trid}(a,c,a,m) \otimes I_m + I_m \otimes \mathrm{trid}(b,c,b,m) \in \mathbb{R}^{m^2, m^2}$

function [V, it] = cgtest(m, a, b, c, tol, itmax)
  h = 1/(m+1); R = h*h*ones(m);
  D = sparse(tridiagonal(a, c, a, m)); E = sparse(tridiagonal(b, c, b, m));
  V = zeros(m,m); P = R; rho = sum(sum(R.*R)); rho0 = rho;
  for k = 1:itmax
    if sqrt(rho/rho0) <= tol
      it = k; return
    end
    T = D*P + P*E; a = rho/sum(sum(P.*T)); V = V + a*P; R = R - a*T;
    rhos = rho; rho = sum(sum(R.*R)); P = R + (rho/rhos)*P;
  end
  it = itmax + 1;
The Conjugate Gradient Method p. 13/23
The Averaging Problem

n | 2 500 | 10 000 | 40 000 | 1 000 000 | 4 000 000
K |    22 |     22 |     21 |        21 |        20

Table 1: The number of iterations K for the averaging problem on a $\sqrt{n} \times \sqrt{n}$ grid. $x_0 = 0$, tol $= 10^{-8}$.
Both the condition number and the required number of iterations are independent of the size of the problem.
The convergence is quite rapid.
The Conjugate Gradient Method p. 14/23
Poisson Problem
$$\lambda_{j,k} = 2c + 2a \cos(j\pi h) + 2b \cos(k\pi h), \quad j, k = 1, 2, \ldots, m.$$
$a = b = -1$, $c = 2$
$$\lambda_{\max} = 4 + 4\cos(\pi h), \qquad \lambda_{\min} = 4 - 4\cos(\pi h)$$
$$\mathrm{cond}_2(A) = \frac{\lambda_{\max}}{\lambda_{\min}} = \frac{1 + \cos(\pi h)}{1 - \cos(\pi h)} = \mathrm{cond}_2(T),$$
the condition number of the 1D Poisson matrix $T = \mathrm{tridiag}(-1, 2, -1)$.
$$\mathrm{cond}_2(A) = O(n).$$
The Conjugate Gradient Method p. 15/23
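The $O(n)$ growth follows since $(1+\cos(\pi h))/(1-\cos(\pi h)) \approx 4/(\pi h)^2$ for small $h$, and $h = 1/(m+1)$, $n = m^2$. A quick numerical sketch (not part of the original notes):

```python
import math
# The Poisson condition number (1 + cos(pi h)) / (1 - cos(pi h)) grows like
# 4/(pi h)^2; with h = 1/(m+1) and n = m^2 this is O(n).
for m in (10, 20, 40):
    h = 1.0 / (m + 1)
    cond = (1 + math.cos(math.pi * h)) / (1 - math.cos(math.pi * h))
    print(m * m, cond, cond / (m * m))   # cond/n drifts down toward 4/pi^2 ~ 0.405
```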
The Poisson problem

n    | 2 500 | 10 000 | 40 000 | 160 000
K    |   140 |    294 |    587 |    1168
K/$\sqrt{n}$ |  2.80 |   2.94 |   2.94 |    2.92

The results show that K is much smaller than n and appears to be proportional to $\sqrt{n}$.
This is the same speed as for SOR, and we don't have to estimate any acceleration parameter!
For an orthogonal basis $\{v_1, \ldots, v_n\}$ of $\mathbb{R}^n$:
$$x = \sum_{i=1}^{n} \frac{\langle x, v_i\rangle}{\langle v_i, v_i\rangle} v_i. \quad (2)$$
The Conjugate Gradient Method p. 20/23
Derivation of CG
$Ax = b$, $A \in \mathbb{R}^{n,n}$ is pos. def., $x, b \in \mathbb{R}^n$
$(x, y) := x^T y$, $x, y \in \mathbb{R}^n$
$\langle x, y\rangle := x^T A y = (x, Ay) = (Ax, y)$
$\|x\|_A = \sqrt{x^T A x}$
$W_0 = \{0\}$, $W_1 = \mathrm{span}\{b\}$, $W_2 = \mathrm{span}\{b, Ab\}$,
$W_k = \mathrm{span}\{b, Ab, A^2 b, \ldots, A^{k-1} b\}$
$W_0 \subset W_1 \subset W_2 \subset \cdots \subset W_k$
$\dim(W_k) \le k$, $w \in W_k \Rightarrow Aw \in W_{k+1}$
$x_k \in W_k$, $\langle x_k - x, w\rangle = 0$ for all $w \in W_k$
$$p_0 = r_0 := b, \qquad p_j = r_j - \sum_{i=0}^{j-1} \frac{\langle r_j, p_i\rangle}{\langle p_i, p_i\rangle}\, p_i, \quad j = 1, \ldots, k.$$
The Conjugate Gradient Method p. 21/23
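The derivation rests on two orthogonality properties, which can be observed numerically by running the CG recurrences and testing all pairs. A self-contained sketch on a small SPD system of my choosing:

```python
# Check the two orthogonality claims behind the derivation on a 4x4 SPD
# system (second-difference matrix): the residuals r_k are orthogonal in the
# standard inner product (u, v) = u^T v, and the search directions p_k are
# orthogonal in the A-inner product <u, v> = u^T A v.
n = 4
A = [[2.0 if i == j else -1.0 if abs(i - j) == 1 else 0.0 for j in range(n)]
     for i in range(n)]
b = [1.0, 0.0, 0.0, 0.0]
mv = lambda v: [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]
dot = lambda u, v: sum(u[i] * v[i] for i in range(n))

x = [0.0] * n
r, p = list(b), list(b)
rs, ps = [list(r)], [list(p)]
for k in range(n - 1):
    Ap = mv(p)
    alpha = dot(r, r) / dot(p, Ap)
    x = [x[i] + alpha * p[i] for i in range(n)]
    rnew = [r[i] - alpha * Ap[i] for i in range(n)]
    beta = dot(rnew, rnew) / dot(r, r)
    p = [rnew[i] + beta * p[i] for i in range(n)]
    r = rnew
    rs.append(list(r))
    ps.append(list(p))

max_rr = max(abs(dot(rs[i], rs[j])) for i in range(n) for j in range(i))
max_pAp = max(abs(dot(ps[i], mv(ps[j]))) for i in range(n) for j in range(i))
print(max_rr, max_pAp)   # both at rounding level
```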
Convergence
Theorem 5. Suppose we apply the conjugate gradient method to a positive definite system $Ax = b$. Then the A-norms of the errors satisfy
$$\frac{\|x - x_k\|_A}{\|x - x_0\|_A} \le 2\left(\frac{\sqrt{\kappa} - 1}{\sqrt{\kappa} + 1}\right)^k, \quad \text{for } k \ge 0,$$
where $\kappa = \mathrm{cond}_2(A) = \lambda_{\max}/\lambda_{\min}$ is the 2-norm condition number of $A$.
This theorem explains what we observed in the previous section, namely that the number of iterations needed to make $\|x - x_k\|_A / \|x - x_0\|_A$ small is linked to $\sqrt{\kappa}$.
The Conjugate Gradient Method p. 23/23
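As a sanity check, the bound of Theorem 5 can be tested numerically. A sketch on the $4 \times 4$ second-difference matrix, whose exact solution and eigenvalues are known in closed form (my own choice of test system, not from the original notes):

```python
import math
# Numerical check of Theorem 5 on the 4x4 matrix T = tridiag(-1, 2, -1),
# whose eigenvalues 2 - 2cos(k*pi/(n+1)) give kappa exactly.
n = 4
A = [[2.0 if i == j else -1.0 if abs(i - j) == 1 else 0.0 for j in range(n)]
     for i in range(n)]
b = [1.0, 0.0, 0.0, 0.0]
xstar = [(n - i) / (n + 1) for i in range(n)]   # exact solution of A x = b

mv = lambda v: [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]
dot = lambda u, v: sum(u[i] * v[i] for i in range(n))
anorm = lambda v: math.sqrt(max(dot(v, mv(v)), 0.0))   # A-norm, guarded near 0

kappa = (2 - 2 * math.cos(n * math.pi / (n + 1))) / (2 - 2 * math.cos(math.pi / (n + 1)))
factor = (math.sqrt(kappa) - 1) / (math.sqrt(kappa) + 1)

x = [0.0] * n
r, p = list(b), list(b)
e0 = anorm([xstar[i] - x[i] for i in range(n)])
bound_holds = True
for k in range(1, n + 1):
    Ap = mv(p)
    alpha = dot(r, r) / dot(p, Ap)
    x = [x[i] + alpha * p[i] for i in range(n)]
    rnew = [r[i] - alpha * Ap[i] for i in range(n)]
    if dot(rnew, rnew) > 1e-28:          # skip p update once r vanishes
        beta = dot(rnew, rnew) / dot(r, r)
        p = [rnew[i] + beta * p[i] for i in range(n)]
    r = rnew
    ek = anorm([xstar[i] - x[i] for i in range(n)])
    bound_holds = bound_holds and ek / e0 <= 2 * factor**k + 1e-12
print(bound_holds)
```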