
The Conjugate Gradient Method

Tom Lyche
University of Oslo
Norway
Plan for the day
The method
Algorithm
Implementation of test problems
Complexity
Derivation of the method
Convergence
The Conjugate gradient method
Restricted to positive definite systems: Ax = b, A ∈ R^{n,n} positive definite.
Generate x_k by x_{k+1} = x_k + α_k p_k, where
  p_k is a vector, the search direction,
  α_k is a scalar determining the step length.
In general we find the exact solution in at most n iterations.
For many problems the error becomes small after a few iterations.
Both a direct method and an iterative method.
Rate of convergence depends on the square root of the condition number.
The name of the game
Conjugate means orthogonal; orthogonal gradients. But why gradients?
Consider minimizing the quadratic function Q : R^n → R given by Q(x) := (1/2) x^T A x − x^T b.
The minimum is obtained by setting the gradient equal to zero:
∇Q(x) = Ax − b = 0  ⇔  linear system Ax = b.
Find the solution by solving r = b − Ax = 0.
The sequence x_k is such that the residuals r_k := b − A x_k are orthogonal with respect to the usual inner product in R^n.
The search directions are also orthogonal, but with respect to a different inner product.
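As a small illustration (a sketch, not from the slides; the 2-by-2 matrix is just an arbitrary positive definite example), the minimizer of Q coincides with the solution of Ax = b:

A = [2 -1; -1 2]; b = [1; 0];                   % a small positive definite system
Q = @(x) 0.5*(x'*A*x) - x'*b;                   % the quadratic Q(x) = 1/2 x'Ax - x'b
xstar = A\b;                                    % direct solution, here [2/3; 1/3]
disp(norm(A*xstar - b))                         % ~0: the gradient of Q vanishes at xstar
disp(Q(xstar) <= Q(xstar + 0.1*randn(2,1)))     % 1: any perturbation increases Q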
The algorithm
Start with some x_0. Set p_0 = r_0 = b − A x_0.
For k = 0, 1, 2, . . .

x_{k+1} = x_k + α_k p_k,      α_k = (r_k^T r_k) / (p_k^T A p_k)

r_{k+1} = b − A x_{k+1} = r_k − α_k A p_k

p_{k+1} = r_{k+1} + β_k p_k,  β_k = (r_{k+1}^T r_{k+1}) / (r_k^T r_k)
Example
    [ 2  −1 ] [ x_1 ]   [ 1 ]
    [ −1  2 ] [ x_2 ] = [ 0 ]

Start with x_0 = 0.

p_0 = r_0 = b = [1, 0]^T

α_0 = (r_0^T r_0) / (p_0^T A p_0) = 1/2,   x_1 = x_0 + α_0 p_0 = [0, 0]^T + (1/2) [1, 0]^T = [1/2, 0]^T

r_1 = r_0 − α_0 A p_0 = [1, 0]^T − (1/2) [2, −1]^T = [0, 1/2]^T,   r_1^T r_0 = 0

β_0 = (r_1^T r_1) / (r_0^T r_0) = 1/4,   p_1 = r_1 + β_0 p_0 = [0, 1/2]^T + (1/4) [1, 0]^T = [1/4, 1/2]^T

α_1 = (r_1^T r_1) / (p_1^T A p_1) = 2/3,   x_2 = x_1 + α_1 p_1 = [1/2, 0]^T + (2/3) [1/4, 1/2]^T = [2/3, 1/3]^T

r_2 = 0, exact solution.
Exact method and iterative method
Orthogonality of the residuals implies that x_m is equal to the solution x of Ax = b for some m ≤ n.
For if x_k ≠ x for all k = 0, 1, . . . , n−1, then r_k ≠ 0 for k = 0, 1, . . . , n−1, and these residuals form an orthogonal basis for R^n. But then r_n ∈ R^n is orthogonal to all vectors in R^n, so r_n = 0 and hence x_n = x.
So the conjugate gradient method finds the exact solution in at most n iterations.
The convergence analysis shows that ‖x − x_k‖_A typically becomes small quite rapidly and we can stop the iteration with k much smaller than n.
It is this rapid convergence which makes the method interesting and in practice an iterative method.
Conjugate Gradient Algorithm
[Conjugate Gradient Iteration] The positive definite linear system Ax = b is solved by the conjugate gradient method. x is a starting vector for the iteration. The iteration is stopped when ‖r_k‖_2 / ‖r_0‖_2 ≤ tol or k > itmax. itm is the number of iterations used.

function [x, itm] = cg(A, b, x, tol, itmax)
r = b - A*x; p = r; rho = r'*r;       % initial residual and search direction
rho0 = rho;
for k = 0:itmax
    if sqrt(rho/rho0) <= tol          % relative residual small enough?
        itm = k; return
    end
    t = A*p; a = rho/(p'*t);          % step length alpha_k
    x = x + a*p; r = r - a*t;         % update iterate and residual
    rhos = rho; rho = r'*r;
    p = r + (rho/rhos)*p;             % new search direction, beta_k = rho/rhos
end
itm = itmax + 1;
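As a quick usage sketch (not part of the slides), the function above can be applied to the 2-by-2 system from the worked example:

A = [2 -1; -1 2]; b = [1; 0];
[x, itm] = cg(A, b, [0; 0], 1e-10, 20);
% expected: x close to [2/3; 1/3] and itm = 2, matching the worked example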
A family of test problems
We can test the methods on the Kronecker sum matrix

A = C_1 ⊗ I + I ⊗ C_2 = diag(C_1, C_1, . . . , C_1) + tridiag(bI, cI, bI)   (block form),

where C_1 = tridiag_m(a, c, a) and C_2 = tridiag_m(b, c, b).

Positive definite if c > 0 and c ≥ |a| + |b|.
m = 3, n = 9
A =
    [ 2c  a   0   b   0   0   0   0   0  ]
    [ a   2c  a   0   b   0   0   0   0  ]
    [ 0   a   2c  0   0   b   0   0   0  ]
    [ b   0   0   2c  a   0   b   0   0  ]
    [ 0   b   0   a   2c  a   0   b   0  ]
    [ 0   0   b   0   a   2c  0   0   b  ]
    [ 0   0   0   b   0   0   2c  a   0  ]
    [ 0   0   0   0   b   0   a   2c  a  ]
    [ 0   0   0   0   0   b   0   a   2c ]

b = a = −1, c = 2: Poisson matrix
b = a = 1/9, c = 5/18: Averaging matrix
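For reference, a minimal MATLAB sketch (not from the slides) that assembles this matrix with standard sparse functions; in MATLAB's kron convention the block layout above corresponds to kron(I, C_1) + kron(C_2, I):

m = 3; a = -1; b = -1; c = 2;                 % Poisson parameters
e = ones(m, 1);
C1 = spdiags([a*e, c*e, a*e], -1:1, m, m);    % tridiag_m(a, c, a)
C2 = spdiags([b*e, c*e, b*e], -1:1, m, m);    % tridiag_m(b, c, b)
A  = kron(speye(m), C1) + kron(C2, speye(m)); % 9-by-9 Kronecker sum
full(A)                                       % compare with the display above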
Averaging problem
λ_{jk} = 2c + 2a cos(jπh) + 2b cos(kπh),   j, k = 1, 2, . . . , m.

a = b = 1/9, c = 5/18

λ_max = 5/9 + (4/9) cos(πh),   λ_min = 5/9 − (4/9) cos(πh)

cond_2(A) = λ_max / λ_min = (5 + 4 cos(πh)) / (5 − 4 cos(πh)) ≤ 9,
since 0 < cos(πh) < 1.
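A quick numerical check of these formulas (a sketch, not from the slides, reusing the spdiags/kron assembly shown earlier):

m = 10; h = 1/(m+1); a = 1/9; b = 1/9; c = 5/18; e = ones(m, 1);
C1 = spdiags([a*e, c*e, a*e], -1:1, m, m);
C2 = spdiags([b*e, c*e, b*e], -1:1, m, m);
A  = kron(speye(m), C1) + kron(C2, speye(m));
lam = eig(full(A));
[max(lam), 5/9 + 4/9*cos(pi*h)]        % the two numbers should agree
[min(lam), 5/9 - 4/9*cos(pi*h)]
cond(full(A))                          % stays below 9 for every m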
2D formulation for test problems
Arrange the vectors as m-by-m matrices V, R, P with x = vec(V), r = vec(R), p = vec(P).

Ax = b  ⇔  DV + VE = h^2 F,
D = tridiag(a, c, a) ∈ R^{m,m},  E = tridiag(b, c, b) ∈ R^{m,m}

Ap = vec(DP + PE)
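The last identity is what makes the matrix-free implementation on the next slide possible; here is a small consistency check (a sketch, not from the slides):

m = 4; a = -1; b = -1; c = 2; e = ones(m, 1);
D = spdiags([a*e, c*e, a*e], -1:1, m, m);
E = spdiags([b*e, c*e, b*e], -1:1, m, m);
A = kron(speye(m), D) + kron(E, speye(m));
p = rand(m*m, 1); P = reshape(p, m, m);
disp(norm(A*p - reshape(D*P + P*E, [], 1)))   % ~0: A*p equals vec(D*P + P*E)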
Testing
[Testing Conjugate Gradient] A = trid(a, c, a, m) ⊗ I_m + I_m ⊗ trid(b, c, b, m) ∈ R^{m^2, m^2}

function [V, it] = cgtest(m, a, b, c, tol, itmax)
h = 1/(m+1); R = h*h*ones(m);             % initial residual b = h^2*F with F = ones, since V0 = 0
D = sparse(tridiagonal(a, c, a, m));      % tridiagonal(.) builds the tridiag_m matrices above
E = sparse(tridiagonal(b, c, b, m));
V = zeros(m, m); P = R; rho = sum(sum(R.*R)); rho0 = rho;
for k = 1:itmax
    if sqrt(rho/rho0) <= tol
        it = k; return
    end
    T = D*P + P*E;                        % matrix-free A*p
    alpha = rho/sum(sum(P.*T));           % step length (renamed from a to avoid the input parameter)
    V = V + alpha*P; R = R - alpha*T;
    rhos = rho; rho = sum(sum(R.*R));
    P = R + (rho/rhos)*P;
end
it = itmax + 1;
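A usage sketch (not from the slides; it assumes the course helper tridiagonal is on the path, or that it is replaced by the spdiags construction shown earlier): the first column of Table 1 on the next slide corresponds to

m = 50;                                   % n = m^2 = 2500 unknowns
[V, it] = cgtest(m, 1/9, 1/9, 5/18, 1e-8, 1000);
disp(it)                                  % about 22 iterations for the averaging problem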
The Averaging Problem
n   2 500    10 000    40 000    1 000 000    4 000 000
K   22       22        21        21           20

Table 1: The number of iterations K for the averaging problem on a √n × √n grid. x_0 = 0, tol = 10^{-8}.

Both the condition number and the required number of iterations are independent of the size of the problem.
The convergence is quite rapid.
Poisson Problem
λ_{jk} = 2c + 2a cos(jπh) + 2b cos(kπh),   j, k = 1, 2, . . . , m.

a = b = −1, c = 2

λ_max = 4 + 4 cos(πh),   λ_min = 4 − 4 cos(πh)

cond_2(A) = λ_max / λ_min = (1 + cos(πh)) / (1 − cos(πh)) = cond_2(T).

cond_2(A) = O(n).
The Poisson problem
n      2 500    10 000    40 000    160 000
K      140      294       587       1168
K/√n   1.86     1.87      1.86      1.85

Using CG in the form of Algorithm 8 with tol = 10^{-8} and x_0 = 0 we list K, the required number of iterations, and K/√n.

The results show that K is much smaller than n and appears to be proportional to √n.
This is the same speed as for SOR and we don't have to estimate any acceleration parameter!
√n is essentially the square root of the condition number of A.
Complexity
The work involved in each iteration is
1. one matrix times vector (t = Ap),
2. two inner products (p^T t and r^T r),
3. three vector-plus-scalar-times-vector operations (x = x + ap, r = r − at and p = r + (rho/rhos)p).
The dominating part of the computation is statement 1.
Note that for our test problems A only has O(5n) nonzero elements. Therefore, taking advantage of the sparseness of A we can compute t in O(n) flops. With such an implementation the total number of flops in one iteration is O(n).
More Complexity
How many flops do we need to solve the test problems by the conjugate gradient method to within a given tolerance?

Averaging problem: O(n) flops, i.e. O(1) iterations times O(n) flops per iteration. Optimal for a problem with n unknowns.
Same as SOR and better than the fast method based on FFT.

Discrete Poisson problem: O(n^{3/2}) flops, i.e. O(√n) iterations times O(n) flops per iteration.
Same as SOR and the fast method.

Cholesky Algorithm: O(n^2) flops both for averaging and Poisson.
Analysis and Derivation of the Method
Theorem 3 (Orthogonal Projection). Let S be a subspace of a finite dimensional real or complex inner product space (V, F, ⟨·, ·⟩). To each x ∈ V there is a unique vector p ∈ S such that

⟨x − p, s⟩ = 0, for all s ∈ S.   (1)

[Figure: the orthogonal projection p = P_S x of x onto the subspace S; the error x − p is orthogonal to S.]
Best Approximation
Theorem 4 (Best Approximation). Let S be a subspace of a finite dimensional real or complex inner product space (V, F, ⟨·, ·⟩). Let x ∈ V and p ∈ S. The following statements are equivalent:
1. ⟨x − p, s⟩ = 0, for all s ∈ S.
2. ‖x − s‖ > ‖x − p‖ for all s ∈ S with s ≠ p.

If (v_1, . . . , v_k) is an orthogonal basis for S then

p = Σ_{i=1}^{k} (⟨x, v_i⟩ / ⟨v_i, v_i⟩) v_i.   (2)
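A small numerical illustration of formula (2) (a sketch, not from the slides): build p from an orthogonal basis of S and check that x − p is orthogonal to S.

B = orth(randn(5, 3));              % columns: an orthonormal (hence orthogonal) basis of S
x = randn(5, 1);
p = zeros(5, 1);
for i = 1:3
    v = B(:, i);
    p = p + ((x'*v)/(v'*v)) * v;    % formula (2)
end
disp(norm(B'*(x - p)))              % ~0: x - p is orthogonal to every basis vector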
Derivation of CG
Ax = b,  A ∈ R^{n,n} positive definite,  x, b ∈ R^n

(x, y) := x^T y,  x, y ∈ R^n
⟨x, y⟩ := x^T A y = (x, Ay) = (Ax, y)
‖x‖_A = √(x^T A x)

W_0 = {0},  W_1 = span{b},  W_2 = span{b, Ab},
W_k = span{b, Ab, A^2 b, . . . , A^{k−1} b}
W_0 ⊂ W_1 ⊂ W_2 ⊂ · · · ⊂ W_k ⊂ · · ·
dim(W_k) ≤ k,   w ∈ W_k ⇒ Aw ∈ W_{k+1}

x_k ∈ W_k,   ⟨x_k − x, w⟩ = 0 for all w ∈ W_k

p_0 = r_0 := b,   p_j = r_j − Σ_{i=0}^{j−1} (⟨r_j, p_i⟩ / ⟨p_i, p_i⟩) p_i,   j = 1, . . . , k.
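As a numerical illustration (a sketch, not from the slides), one can check on a small random positive definite system that the CG residuals are orthogonal in the standard inner product and the search directions are orthogonal in the A-inner product:

n = 6; B = randn(n); A = B'*B + n*eye(n); b = randn(n, 1);  % random SPD system
x = zeros(n, 1); r = b; p = r; RS = r; PS = p;              % store residuals and directions
for k = 1:4
    t = A*p; alpha = (r'*r)/(p'*t);
    x = x + alpha*p; rnew = r - alpha*t;
    beta = (rnew'*rnew)/(r'*r);
    p = rnew + beta*p; r = rnew;
    RS = [RS, r]; PS = [PS, p];
end
disp(norm(triu(RS'*RS, 1)))     % ~0: (r_i, r_j) = 0 for i ~= j
disp(norm(triu(PS'*A*PS, 1)))   % ~0: <p_i, p_j> = p_i' * A * p_j = 0 for i ~= j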
Convergence
Theorem 5. Suppose we apply the conjugate gradient method to a positive definite system Ax = b. Then the A-norms of the errors satisfy

‖x − x_k‖_A / ‖x − x_0‖_A ≤ 2 ( (√κ − 1) / (√κ + 1) )^k,   for k ≥ 0,

where κ = cond_2(A) = λ_max / λ_min is the 2-norm condition number of A.

This theorem explains what we observed in the previous section, namely that the number of iterations is linked to √κ, the square root of the condition number of A. Indeed, the following corollary gives an upper bound for the number of iterations in terms of √κ.
Corollary 6. If for some ε > 0 we have k ≥ (1/2) √κ ln(2/ε), then

‖x − x_k‖_A / ‖x − x_0‖_A ≤ ε.
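For completeness, a short derivation of the corollary from Theorem 5 (a sketch, using the elementary inequality (1 − t)/(1 + t) ≤ e^{−2t} for 0 < t < 1):

With t = 1/√κ,
    (√κ − 1)/(√κ + 1) = (1 − t)/(1 + t) ≤ e^{−2t} = e^{−2/√κ},
so Theorem 5 gives
    ‖x − x_k‖_A / ‖x − x_0‖_A ≤ 2 e^{−2k/√κ}.
The right-hand side is ≤ ε as soon as 2k/√κ ≥ ln(2/ε), that is, for k ≥ (1/2) √κ ln(2/ε).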