
The Conjugate Gradient Method

Tom Lyche
University of Oslo
Norway
Plan for the day
The method
Algorithm
Implementation of test problems
Complexity
Derivation of the method
Convergence
The Conjugate gradient method
Restricted to positive definite systems: Ax = b, A ∈ R^{n,n} positive definite.
Generate x_k by x_{k+1} = x_k + α_k p_k, where
  p_k is a vector, the search direction,
  α_k is a scalar determining the step length.
In general we find the exact solution in at most n iterations.
For many problems the error becomes small after a few iterations.
Both a direct method and an iterative method.
Rate of convergence depends on the square root of the condition number.
The name of the game
Conjugate means orthogonal; orthogonal gradients. But why gradients?
Consider minimizing the quadratic function Q : R^n → R given by Q(x) := (1/2) x^T A x − x^T b.
The minimum is obtained by setting the gradient equal to zero:
∇Q(x) = Ax − b = 0  ⇔  linear system Ax = b.
Find the solution by solving r = b − Ax = 0.
The sequence x_k is such that the residuals r_k := b − A x_k are orthogonal with respect to the usual inner product in R^n.
The search directions are also orthogonal, but with respect to a different inner product.
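As a small illustration (a sketch, not from the slides; the 2-by-2 matrix is just an arbitrary positive definite example), the minimizer of Q coincides with the solution of Ax = b:

A = [2 -1; -1 2]; b = [1; 0];                   % a small positive definite system
Q = @(x) 0.5*(x'*A*x) - x'*b;                   % the quadratic Q(x) = 1/2 x'Ax - x'b
xstar = A\b;                                    % direct solution, here [2/3; 1/3]
disp(norm(A*xstar - b))                         % ~0: the gradient of Q vanishes at xstar
disp(Q(xstar) <= Q(xstar + 0.1*randn(2,1)))     % 1: any perturbation increases Q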
The algorithm
Start with some x_0. Set p_0 = r_0 = b − A x_0.
For k = 0, 1, 2, . . .

x_{k+1} = x_k + α_k p_k,      α_k = (r_k^T r_k) / (p_k^T A p_k)

r_{k+1} = b − A x_{k+1} = r_k − α_k A p_k

p_{k+1} = r_{k+1} + β_k p_k,  β_k = (r_{k+1}^T r_{k+1}) / (r_k^T r_k)
Example
    [ 2  −1 ] [ x_1 ]   [ 1 ]
    [ −1  2 ] [ x_2 ] = [ 0 ]

Start with x_0 = 0.

p_0 = r_0 = b = [1, 0]^T

α_0 = (r_0^T r_0) / (p_0^T A p_0) = 1/2,   x_1 = x_0 + α_0 p_0 = [0, 0]^T + (1/2) [1, 0]^T = [1/2, 0]^T

r_1 = r_0 − α_0 A p_0 = [1, 0]^T − (1/2) [2, −1]^T = [0, 1/2]^T,   r_1^T r_0 = 0

β_0 = (r_1^T r_1) / (r_0^T r_0) = 1/4,   p_1 = r_1 + β_0 p_0 = [0, 1/2]^T + (1/4) [1, 0]^T = [1/4, 1/2]^T

α_1 = (r_1^T r_1) / (p_1^T A p_1) = 2/3,   x_2 = x_1 + α_1 p_1 = [1/2, 0]^T + (2/3) [1/4, 1/2]^T = [2/3, 1/3]^T

r_2 = 0, exact solution.
Exact method and iterative method
Orthogonality of the residuals implies that x_m is equal to the solution x of Ax = b for some m ≤ n.
For if x_k ≠ x for all k = 0, 1, . . . , n−1, then r_k ≠ 0 for k = 0, 1, . . . , n−1, and these residuals form an orthogonal basis for R^n. But then r_n ∈ R^n is orthogonal to all vectors in R^n, so r_n = 0 and hence x_n = x.
So the conjugate gradient method finds the exact solution in at most n iterations.
The convergence analysis shows that ‖x − x_k‖_A typically becomes small quite rapidly and we can stop the iteration with k much smaller than n.
It is this rapid convergence which makes the method interesting and in practice an iterative method.
Conjugate Gradient Algorithm
[Conjugate Gradient Iteration] The positive definite linear system Ax = b is solved by the conjugate gradient method. x is a starting vector for the iteration. The iteration is stopped when ‖r_k‖_2 / ‖r_0‖_2 ≤ tol or k > itmax. itm is the number of iterations used.

function [x, itm] = cg(A, b, x, tol, itmax)
r = b - A*x; p = r; rho = r'*r;       % initial residual and search direction
rho0 = rho;
for k = 0:itmax
    if sqrt(rho/rho0) <= tol          % relative residual small enough?
        itm = k; return
    end
    t = A*p; a = rho/(p'*t);          % step length alpha_k
    x = x + a*p; r = r - a*t;         % update iterate and residual
    rhos = rho; rho = r'*r;
    p = r + (rho/rhos)*p;             % new search direction, beta_k = rho/rhos
end
itm = itmax + 1;
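As a quick usage sketch (not part of the slides), the function above can be applied to the 2-by-2 system from the worked example:

A = [2 -1; -1 2]; b = [1; 0];
[x, itm] = cg(A, b, [0; 0], 1e-10, 20);
% expected: x close to [2/3; 1/3] and itm = 2, matching the worked example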
A family of test problems
We can test the methods on the Kronecker sum matrix

A = C_1 ⊗ I + I ⊗ C_2 = diag(C_1, C_1, . . . , C_1) + tridiag(bI, cI, bI)   (block form),

where C_1 = tridiag_m(a, c, a) and C_2 = tridiag_m(b, c, b).

Positive definite if c > 0 and c ≥ |a| + |b|.
m = 3, n = 9
A =
    [ 2c  a   0   b   0   0   0   0   0  ]
    [ a   2c  a   0   b   0   0   0   0  ]
    [ 0   a   2c  0   0   b   0   0   0  ]
    [ b   0   0   2c  a   0   b   0   0  ]
    [ 0   b   0   a   2c  a   0   b   0  ]
    [ 0   0   b   0   a   2c  0   0   b  ]
    [ 0   0   0   b   0   0   2c  a   0  ]
    [ 0   0   0   0   b   0   a   2c  a  ]
    [ 0   0   0   0   0   b   0   a   2c ]

b = a = −1, c = 2: Poisson matrix
b = a = 1/9, c = 5/18: Averaging matrix
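For reference, a minimal MATLAB sketch (not from the slides) that assembles this matrix with standard sparse functions; in MATLAB's kron convention the block layout above corresponds to kron(I, C_1) + kron(C_2, I):

m = 3; a = -1; b = -1; c = 2;                 % Poisson parameters
e = ones(m, 1);
C1 = spdiags([a*e, c*e, a*e], -1:1, m, m);    % tridiag_m(a, c, a)
C2 = spdiags([b*e, c*e, b*e], -1:1, m, m);    % tridiag_m(b, c, b)
A  = kron(speye(m), C1) + kron(C2, speye(m)); % 9-by-9 Kronecker sum
full(A)                                       % compare with the display above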
Averaging problem
λ_{jk} = 2c + 2a cos(jπh) + 2b cos(kπh),   j, k = 1, 2, . . . , m.

a = b = 1/9, c = 5/18

λ_max = 5/9 + (4/9) cos(πh),   λ_min = 5/9 − (4/9) cos(πh)

cond_2(A) = λ_max / λ_min = (5 + 4 cos(πh)) / (5 − 4 cos(πh)) ≤ 9,
since 0 < cos(πh) < 1.
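A quick numerical check of these formulas (a sketch, not from the slides, reusing the spdiags/kron assembly shown earlier):

m = 10; h = 1/(m+1); a = 1/9; b = 1/9; c = 5/18; e = ones(m, 1);
C1 = spdiags([a*e, c*e, a*e], -1:1, m, m);
C2 = spdiags([b*e, c*e, b*e], -1:1, m, m);
A  = kron(speye(m), C1) + kron(C2, speye(m));
lam = eig(full(A));
[max(lam), 5/9 + 4/9*cos(pi*h)]        % the two numbers should agree
[min(lam), 5/9 - 4/9*cos(pi*h)]
cond(full(A))                          % stays below 9 for every m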
2D formulation for test problems
Arrange the vectors as m-by-m matrices V, R, P with x = vec(V), r = vec(R), p = vec(P).

Ax = b  ⇔  DV + VE = h^2 F,
D = tridiag(a, c, a) ∈ R^{m,m},  E = tridiag(b, c, b) ∈ R^{m,m}

Ap = vec(DP + PE)
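The last identity is what makes the matrix-free implementation on the next slide possible; here is a small consistency check (a sketch, not from the slides):

m = 4; a = -1; b = -1; c = 2; e = ones(m, 1);
D = spdiags([a*e, c*e, a*e], -1:1, m, m);
E = spdiags([b*e, c*e, b*e], -1:1, m, m);
A = kron(speye(m), D) + kron(E, speye(m));
p = rand(m*m, 1); P = reshape(p, m, m);
disp(norm(A*p - reshape(D*P + P*E, [], 1)))   % ~0: A*p equals vec(D*P + P*E)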
Testing
[Testing Conjugate Gradient] A = trid(a, c, a, m) ⊗ I_m + I_m ⊗ trid(b, c, b, m) ∈ R^{m^2, m^2}

function [V, it] = cgtest(m, a, b, c, tol, itmax)
h = 1/(m+1); R = h*h*ones(m);             % initial residual b = h^2*F with F = ones, since V0 = 0
D = sparse(tridiagonal(a, c, a, m));      % tridiagonal(.) builds the tridiag_m matrices above
E = sparse(tridiagonal(b, c, b, m));
V = zeros(m, m); P = R; rho = sum(sum(R.*R)); rho0 = rho;
for k = 1:itmax
    if sqrt(rho/rho0) <= tol
        it = k; return
    end
    T = D*P + P*E;                        % matrix-free A*p
    alpha = rho/sum(sum(P.*T));           % step length (renamed from a to avoid the input parameter)
    V = V + alpha*P; R = R - alpha*T;
    rhos = rho; rho = sum(sum(R.*R));
    P = R + (rho/rhos)*P;
end
it = itmax + 1;
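A usage sketch (not from the slides; it assumes the course helper tridiagonal is on the path, or that it is replaced by the spdiags construction shown earlier): the first column of Table 1 on the next slide corresponds to

m = 50;                                   % n = m^2 = 2500 unknowns
[V, it] = cgtest(m, 1/9, 1/9, 5/18, 1e-8, 1000);
disp(it)                                  % about 22 iterations for the averaging problem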
The Averaging Problem
n   2 500    10 000    40 000    1 000 000    4 000 000
K   22       22        21        21           20

Table 1: The number of iterations K for the averaging problem on a √n × √n grid. x_0 = 0, tol = 10^{-8}.

Both the condition number and the required number of iterations are independent of the size of the problem.
The convergence is quite rapid.
Poisson Problem
λ_{jk} = 2c + 2a cos(jπh) + 2b cos(kπh),   j, k = 1, 2, . . . , m.

a = b = −1, c = 2

λ_max = 4 + 4 cos(πh),   λ_min = 4 − 4 cos(πh)

cond_2(A) = λ_max / λ_min = (1 + cos(πh)) / (1 − cos(πh)) = cond_2(T).

cond_2(A) = O(n).
The Poisson problem
n      2 500    10 000    40 000    160 000
K      140      294       587       1168
K/√n   1.86     1.87      1.86      1.85

Using CG in the form of Algorithm 8 with tol = 10^{-8} and x_0 = 0 we list K, the required number of iterations, and K/√n.

The results show that K is much smaller than n and appears to be proportional to √n.
This is the same speed as for SOR and we don't have to estimate any acceleration parameter!
√n is essentially the square root of the condition number of A.
Complexity
The work involved in each iteration is
1. one matrix times vector (t = Ap),
2. two inner products (p^T t and r^T r),
3. three vector-plus-scalar-times-vector operations (x = x + ap, r = r − at and p = r + (rho/rhos)p).
The dominating part of the computation is statement 1.
Note that for our test problems A only has O(5n) nonzero elements. Therefore, taking advantage of the sparseness of A we can compute t in O(n) flops. With such an implementation the total number of flops in one iteration is O(n).
More Complexity
How many flops do we need to solve the test problems by the conjugate gradient method to within a given tolerance?

Averaging problem: O(n) flops, i.e. O(1) iterations times O(n) flops per iteration. Optimal for a problem with n unknowns.
Same as SOR and better than the fast method based on FFT.

Discrete Poisson problem: O(n^{3/2}) flops, i.e. O(√n) iterations times O(n) flops per iteration.
Same as SOR and the fast method.

Cholesky Algorithm: O(n^2) flops both for averaging and Poisson.
Analysis and Derivation of the Method
Theorem 3 (Orthogonal Projection). Let S be a subspace of a finite dimensional real or complex inner product space (V, F, ⟨·, ·⟩). To each x ∈ V there is a unique vector p ∈ S such that

⟨x − p, s⟩ = 0, for all s ∈ S.   (1)

[Figure: the orthogonal projection p = P_S x of x onto the subspace S; the error x − p is orthogonal to S.]
Best Approximation
Theorem 4 (Best Approximation). Let S be a subspace of a finite dimensional real or complex inner product space (V, F, ⟨·, ·⟩). Let x ∈ V and p ∈ S. The following statements are equivalent:
1. ⟨x − p, s⟩ = 0, for all s ∈ S.
2. ‖x − s‖ > ‖x − p‖ for all s ∈ S with s ≠ p.

If (v_1, . . . , v_k) is an orthogonal basis for S then

p = Σ_{i=1}^{k} (⟨x, v_i⟩ / ⟨v_i, v_i⟩) v_i.   (2)
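A small numerical illustration of formula (2) (a sketch, not from the slides): build p from an orthogonal basis of S and check that x − p is orthogonal to S.

B = orth(randn(5, 3));              % columns: an orthonormal (hence orthogonal) basis of S
x = randn(5, 1);
p = zeros(5, 1);
for i = 1:3
    v = B(:, i);
    p = p + ((x'*v)/(v'*v)) * v;    % formula (2)
end
disp(norm(B'*(x - p)))              % ~0: x - p is orthogonal to every basis vector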
Derivation of CG
Ax = b,  A ∈ R^{n,n} positive definite,  x, b ∈ R^n

(x, y) := x^T y,  x, y ∈ R^n
⟨x, y⟩ := x^T A y = (x, Ay) = (Ax, y)
‖x‖_A = √(x^T A x)

W_0 = {0},  W_1 = span{b},  W_2 = span{b, Ab},
W_k = span{b, Ab, A^2 b, . . . , A^{k−1} b}
W_0 ⊂ W_1 ⊂ W_2 ⊂ · · · ⊂ W_k ⊂ · · ·
dim(W_k) ≤ k,   w ∈ W_k ⇒ Aw ∈ W_{k+1}

x_k ∈ W_k,   ⟨x_k − x, w⟩ = 0 for all w ∈ W_k

p_0 = r_0 := b,   p_j = r_j − Σ_{i=0}^{j−1} (⟨r_j, p_i⟩ / ⟨p_i, p_i⟩) p_i,   j = 1, . . . , k.
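As a numerical illustration (a sketch, not from the slides), one can check on a small random positive definite system that the CG residuals are orthogonal in the standard inner product and the search directions are orthogonal in the A-inner product:

n = 6; B = randn(n); A = B'*B + n*eye(n); b = randn(n, 1);  % random SPD system
x = zeros(n, 1); r = b; p = r; RS = r; PS = p;              % store residuals and directions
for k = 1:4
    t = A*p; alpha = (r'*r)/(p'*t);
    x = x + alpha*p; rnew = r - alpha*t;
    beta = (rnew'*rnew)/(r'*r);
    p = rnew + beta*p; r = rnew;
    RS = [RS, r]; PS = [PS, p];
end
disp(norm(triu(RS'*RS, 1)))     % ~0: (r_i, r_j) = 0 for i ~= j
disp(norm(triu(PS'*A*PS, 1)))   % ~0: <p_i, p_j> = p_i' * A * p_j = 0 for i ~= j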
Convergence
Theorem 5. Suppose we apply the conjugate gradient method to a positive definite system Ax = b. Then the A-norms of the errors satisfy

‖x − x_k‖_A / ‖x − x_0‖_A ≤ 2 ( (√κ − 1) / (√κ + 1) )^k,   for k ≥ 0,

where κ = cond_2(A) = λ_max / λ_min is the 2-norm condition number of A.

This theorem explains what we observed in the previous section, namely that the number of iterations is linked to √κ, the square root of the condition number of A. Indeed, the following corollary gives an upper bound for the number of iterations in terms of √κ.
Corollary 6. If for some ε > 0 we have k ≥ (1/2) √κ ln(2/ε), then

‖x − x_k‖_A / ‖x − x_0‖_A ≤ ε.
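For completeness, a short derivation of the corollary from Theorem 5 (a sketch, using the elementary inequality (1 − t)/(1 + t) ≤ e^{−2t} for 0 < t < 1):

With t = 1/√κ,
    (√κ − 1)/(√κ + 1) = (1 − t)/(1 + t) ≤ e^{−2t} = e^{−2/√κ},
so Theorem 5 gives
    ‖x − x_k‖_A / ‖x − x_0‖_A ≤ 2 e^{−2k/√κ}.
The right-hand side is ≤ ε as soon as 2k/√κ ≥ ln(2/ε), that is, for k ≥ (1/2) √κ ln(2/ε).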