
PRECONDITIONING METHODS FOR
SOLUTION OF MATRICES
SEMINAR REPORT

Submitted in partial fulfilment of the requirements

of the degree of

Master of Technology

in

Thermal and Fluids Engineering

by

YASH K. MENON

(AM.EN.P2TF15014)

Supervisor:

Dr. JAYAKUMAR J S

Department of Mechanical Engineering, Amrita School of Engineering

Amritapuri

Amrita Vishwa Vidyapeetham

2016
Department of Mechanical Engineering
Amrita School of Engineering
Amritapuri

CERTIFICATE

This is to certify that the report entitled “PRECONDITIONING METHODS FOR
SOLUTION OF MATRICES” is a bonafide record of the seminar presented by YASH K.
MENON, AM.EN.P2TF15014 under my guidance towards partial fulfilment of the
requirements for the award of Master of Technology degree in Mechanical Engineering under
the specialization Thermal & Fluid Engineering of AMRITA VISHWA VIDYAPEETHAM.

Guide:

___________________
Dr. JAYAKUMAR J S
Professor
Department of Mechanical Engineering
Amrita School of Engineering
Amritapuri Campus

Acknowledgement

I express my sincere gratitude to my guide Dr. JAYAKUMAR J S, Professor of
Mechanical Engineering, for his guidance, support and timely suggestions in preparing this
report, without which my work would not have been accomplished.

I would also like to thank all the faculty members of the Mechanical Department and
my friends for their encouragement and support.

I would also like to thank my parents for their support, encouragement and blessings.
Above all I thank the AMMA (Sri Mata Amritanandamayi Devi) for giving me the courage
and confidence to complete this work.

YASH K. MENON
AM.EN.P2TF15014

List of Contents

Certificate ii
Acknowledgement iii
List of figures v
Abbreviations vi
Abstract vii

Chapter 1 Introduction 1
1.1 Aim and objective 1
1.2 Need for Preconditioning 1
1.3 Scope of study 2
1.4 Preconditioning 3
1.5 Methods for preconditioning 4

Chapter 2 Preconditioned Conjugate Gradient Method 5
2.1 Mathematical background 5
2.2 Description of the method 6
2.3 Deriving the Conjugate Gradient algorithm 7
2.4 Applying the CG algorithm to 2x2 SPD matrix 8
2.5 Numerical example for Conjugate Gradient method 10

Chapter 3 Preconditioned Krylov Subspace Method 13

Chapter 4 Conclusion 15

References 16

List of Figures

Fig 1.1 Example of Sparse matrix 2
Fig 1.2 Comparison between preconditioning and non-preconditioning 3
Fig 2.1 Searching direction of the CG algorithm 6
Fig 2.2 Algorithm for 2x2 matrix 9

Abbreviations

PDE Partial differential equation
CPU Central processing unit
CG Conjugate gradient
BFGS Broyden Fletcher Goldfarb Shanno
SPD Symmetric Positive Definite

Abstract

It is essential to understand the fundamentals of preconditioning methods for the
solution of matrices. With the realization that preconditioning is essential for the successful
use of iterative methods, research on preconditioners has moved to centre stage in recent
years. Preconditioning is a way of transforming a difficult problem into one that is easier to
solve. It is widely recognized that preconditioning is the most critical ingredient in the
development of efficient solvers for challenging problems in scientific computation, and that
the importance of preconditioning is destined to increase even further.

In recent years the focus of research on preconditioning has increased for the
following reasons: preconditioning methods are very useful and essential when solving
systems of equations with millions or billions of unknowns; they are essential for fast
convergence in a reasonable amount of time; and preconditioning techniques improve the
performance and reliability of different iterative methods.

The study aims at understanding the fundamentals of preconditioning methods for
the solution of matrices and the convergence criteria of some of the preconditioners by
taking up a problem, together with a basic survey of some preconditioning methods.

Chapter 1

Introduction

1.1 Aim and objective

To study the fundamentals of preconditioning methods for the solution of linear
systems of equations.

1.2 Need for Preconditioning of Matrices

Mathematical models play a very important role in the development of science and
technology. They are used in a large variety of applications, such as fluid flow, temperature
distribution, and air and water pollution. Physical experiments are difficult and costly;
therefore one approach is to convert the physical problem into a set of partial differential
equations (PDEs) using differential analysis. These PDEs are then solved using discrete
numerical methods. While solving these PDEs we usually obtain a system of linear equations
in which the number of unknowns can be very large. It has been observed that for many
practical 3D applications, the number of unknowns is in the range of millions or sometimes
billions. To determine the unknowns we express the system of linear equations in the
following matrix form:
Ax = b (1.1)
where
A is the coefficient matrix,
x is the column matrix (vector) of unknowns, and
b is the column matrix (vector) of constants.

It is observed that the coefficient matrix [A] formed during such computations is a
“sparse” matrix, that is, a matrix having a large number of rows and columns in which very
few elements are non-zero and the rest of the elements are zero.

Fig 1.1 Example of Sparse matrix [3]
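To make the notion of sparsity concrete, the following minimal sketch (not part of the
original report; it assumes Python with NumPy and SciPy) builds a small tridiagonal
coefficient matrix in compressed sparse row (CSR) format and shows that only the non-zero
entries are stored:

    import numpy as np
    from scipy.sparse import diags

    # A 1D Poisson-type coefficient matrix: tridiagonal, hence very sparse.
    n = 1000
    main = 2.0 * np.ones(n)
    off = -1.0 * np.ones(n - 1)
    A = diags([off, main, off], offsets=[-1, 0, 1], format="csr")

    print("matrix size:", A.shape)      # (1000, 1000), i.e. 1,000,000 entries
    print("stored non-zeros:", A.nnz)   # 2998 entries actually stored

For a matrix of this kind with millions of rows, storing and factorizing all entries densely
would be prohibitively expensive, which is one reason sparse storage and iterative solvers
are used.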


Elimination methods like Gauss elimination or Gauss-Jordan can be used to solve the
equation Ax = b, but such methods are practical only for small coefficient matrices. For a
sparse matrix having millions of rows and columns, elimination methods are not suitable
because during the elimination process the number of non-zero entries increases (fill-in);
this increases computer storage as well as CPU time and hence is costly. Therefore sparse
systems of linear equations are solved by iterative methods, as iterative methods require less
storage and computation time compared to direct methods. However, in some applications
the iterative method fails to reach the final solution, and hence preconditioning is necessary.

1.3 Scope of study

Preconditioning techniques have emerged as an essential part of successful and
efficient iterative solution of matrices. Preconditioning is a way of transforming a difficult
problem into one that is easier to solve. For linear systems comprising millions or billions of
equations in as many unknowns, iterative methods combined with a preconditioning
technique are the only practical option for obtaining the solution. It is widely recognized
that preconditioning is the most critical ingredient in the development of efficient solvers for
challenging problems in scientific computation, and that the importance of preconditioning
is destined to increase even further. Preconditioners are useful in iterative methods to solve
a linear system Ax = b for x, since the rate of convergence of most iterative linear solvers
increases as the condition number of the matrix is decreased by preconditioning.

1.4 Preconditioning

Preconditioning refers to transforming a system into another system with more
favorable properties for iterative solution. A preconditioner is a matrix that effects such a
transformation. Preconditioning is a procedure that modifies the given system of linear
equations Ax = b in such a way that we obtain an equivalent system Âx̂ = b̂ for which the
iterative method converges faster.

A standard approach is to use a non-singular matrix M and rewrite the system as
M⁻¹Ax = M⁻¹b (1.2)
Here M is the preconditioner. M needs to be chosen in such a way that the preconditioned
matrix
Â = M⁻¹A (1.3)
resembles the identity matrix, or is at least much better conditioned than A, as the small
sketch below illustrates.
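The following minimal sketch (not part of the original report; it assumes Python with
NumPy) applies the simplest such choice, the Jacobi preconditioner M = diag(A), to a badly
scaled SPD matrix and compares the condition numbers of A and M⁻¹A:

    import numpy as np

    # An SPD test matrix whose diagonal entries differ by orders of magnitude.
    A = np.diag([1.0, 10.0, 100.0, 1000.0]) + 0.5 * np.ones((4, 4))

    # Jacobi (diagonal) preconditioner: M = diag(A), so M^-1 is trivial to apply.
    M_inv = np.diag(1.0 / np.diag(A))

    print("cond(A)      =", np.linalg.cond(A))          # large
    print("cond(M^-1 A) =", np.linalg.cond(M_inv @ A))  # close to 1

In practice M⁻¹A is never formed explicitly; the iterative method only needs to apply M⁻¹ to
a vector, which for this diagonal choice is an element-wise division.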
The matrix M should have the following properties in order to be chosen as a preconditioner:
• It should be a good approximation of A, so that M⁻¹A resembles the identity matrix.
• The most important quality of a preconditioner is to reduce the spectral condition
number of the preconditioned matrix M⁻¹A.
• It should not require a large amount of storage.
Fig 1.2 below is an example of a preconditioner accelerating convergence. The original
function is a narrow valley, and the optimizer bounces off the walls of the valley while
approaching the solution. It finally arrives at the isoline f = 1, but needs 4 iterations to reach
it. After scaling, the curvature of the function becomes much milder, and only one step is
needed to move far below f = 1.

Fig 1.2 Comparison between preconditioning and non-preconditioning [1]

The example above uses a very basic optimization algorithm, the steepest descent method.
This can be seen by noticing that steps are always made in the direction of the antigradient,
without accumulating any information about the curvature of the function. Both L-BFGS and
CG would have started to turn the search direction toward the extremum right after the first
iteration. However, this does not mean that a good optimization algorithm does not need a
preconditioner: a good preconditioner can significantly (up to several times) speed up
optimization progress.
A preconditioner is needed if:
• variables have wildly different magnitudes (a thousand times and more),
• the function changes rapidly in some directions and slowly in others,
• you want to accelerate optimization.
Sometimes a preconditioner merely accelerates convergence, but in some difficult cases it is
impossible to solve the problem without good preconditioning. A small illustration of
diagonal scaling for such a narrow-valley function is sketched below.
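The narrow-valley behaviour described above can be reproduced with a few lines of code.
The following sketch (not from the original report; it assumes Python with NumPy) applies
steepest descent with exact line search to an ill-conditioned quadratic
f(x) = (1/2) x^T A x − b^T x, once on the original problem and once after symmetric diagonal
(Jacobi) scaling; the scaled problem needs far fewer iterations:

    import numpy as np

    # Ill-conditioned SPD quadratic: a long, narrow valley.
    A = np.array([[1.0, 0.9],
                  [0.9, 100.0]])
    b = np.array([1.0, 1.0])

    def steepest_descent(A, b, tol=1e-8, max_iter=10000):
        x = np.zeros_like(b)
        for k in range(max_iter):
            r = b - A @ x                      # negative gradient of f at x
            if np.linalg.norm(r) < tol:
                return x, k
            alpha = (r @ r) / (r @ (A @ r))    # exact line search for a quadratic
            x = x + alpha * r
        return x, max_iter

    x_plain, it_plain = steepest_descent(A, b)

    # Symmetric diagonal scaling: solve (D^-1/2 A D^-1/2) y = D^-1/2 b, then x = D^-1/2 y.
    D_inv_sqrt = np.diag(1.0 / np.sqrt(np.diag(A)))
    y, it_scaled = steepest_descent(D_inv_sqrt @ A @ D_inv_sqrt, D_inv_sqrt @ b)
    x_scaled = D_inv_sqrt @ y

    print("iterations without preconditioning:", it_plain)    # hundreds
    print("iterations with diagonal scaling:  ", it_scaled)   # a handful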

The ALGLIB package supports several preconditioners. ALGLIB is a numerical
analysis and data processing library; it includes direct and iterative linear solvers, nonlinear
solvers and optimizers.

1.5 Methods for Preconditioning

The following are some of the preconditioning methods:
• Preconditioned Conjugate Gradient
• Preconditioned Krylov Subspace Methods
Let us see these methods of preconditioning in detail in the subsequent chapters.

Chapter 2

Preconditioned Conjugate Gradient Method

2.1 Mathematical Background

Often, in real-life applications, the systems of equations that arise from particular science
problems cannot be solved quickly with direct methods; instead, sequences of approximate
solutions are needed to approach the exact solution, and most of the time these solutions are
found using iterative methods. [5]

Before 1952, two great mathematicians, Magnus R. Hestenes and Eduard Stiefel, were
working independently on developing an effective method for solving a system of n
simultaneous equations in n unknowns, especially when n is large. For effectiveness, they
were looking for a method that is simple; requires minimum storage space; converges rapidly
in a finite number of steps; is stable with respect to round-off errors; and, at each step of the
method, provides some information about the solution, with the current estimate better than
the previous one. Unfortunately, all these criteria are hard to meet, and in real life there is no
single best method that can solve all types of problems.

In 1952, Hestenes and Stiefel, joining their research results [6], discovered one of the
most powerful iterative methods for rapidly solving large linear systems of equations with
symmetric positive definite coefficient matrices: the conjugate gradient method or, simply,
the CG algorithm.

When CG is applied to Ax = b, where the matrix A has the properties mentioned above,
the solution is reached in at most n steps, where n is the dimension of the matrix of
coefficients; a maximum of the original data is preserved; the method is simple to code and
requires little storage space; each step gives an estimate better than the previous one; and the
method can be restarted at any step of the iteration.

The CG algorithm is a technique that does not require any computation of the
eigenvalues of the system, as other methods do, and can therefore be used as a more direct
method for finding the solution of Ax = b.[6, 7]

In this chapter we present the conjugate gradient algorithm and the derivation of its
formulas, along with an analysis of its rate of convergence, and illustrate it with a numerical
example.

2.2 Description of Method

The conjugate gradient method, or simply CG, is one of the most important algorithms
for solving large, sparse linear systems with symmetric positive definite (SPD) coefficient
matrices.

Throughout this chapter we will discuss some of the benefits and disadvantages of the
method. One of the main benefits is that it reaches the solution in at most n steps if the system
we are trying to solve has n linear equations with n unknowns; moreover, the method is easy
to code in Matlab and does not require a lot of memory. The rate of convergence of the
method is in general good, but becomes problematic for ill-conditioned matrices; in the
following chapter we will show how CG can be improved for this kind of matrix.

The CG algorithm, applied to the system Ax = b, starts with an initial guess of the
solution x0, with an initial residual r0, and with an initial search direction that is equal to the
initial residual: p0 = r0.

The idea behind the conjugate gradient method is that the residual rk = b − Axk is
orthogonal to the Krylov subspace generated by the previous residuals, and therefore each
residual is perpendicular to all the previous residuals. The residual is computed at each step.
The solution at the next step is found using a search direction that is only a linear combination
of the previous search directions, which for x1 is just a combination of the previous and the
current residual.

Fig 2.1 Searching direction of the CG algorithm[7]

Then, the solution at step k, xk, is just the previous iterate plus a constant times the last
search direction; the immediate benefit of the search directions is that there is no need to store
all the previous search directions, and, using the orthogonality of the residuals to these
previous search directions, the new search direction is linearly independent of the previous
ones. For the solution at the next step, a new search direction is computed, as well as a new
residual and new constants. The role of the constants is to give an optimal approximate
solution. [6]

A more visual explanation of how the CG algorithm approaches the exact solution is
given in Fig. 2.1.
The iterative formulas of CG are given below[6]:

Approximate solution: xk = xk-1 + αk pk-1

Residual: rk = rk-1 − αk A pk-1

Search direction: pk = rk + βk pk-1

Improvement at step k: βk = (rk^T rk) / (rk-1^T rk-1)

Step length: αk = (rk-1^T rk-1) / (pk-1^T A pk-1)

2.3 Deriving the Conjugate Gradient Algorithm

In this section we will derive the formulas of the conjugate gradient algorithm, starting
by minimizing the A-norm of the error at step k, ||ek||A. The error at step k is defined as
ek = x* − xk, where x* = A⁻¹b is the exact solution and xk is the solution at step k. We define
the A-norm of the error at step k to be ||ek||A = (ek^T A ek)^(1/2), and we introduce a new
function f(αk) = ||ek||A^2.

Since xk = xk-1 + αk pk-1, the error at step k, ek = x* − xk, is also equivalent to
ek = ek-1 − αk pk-1, where ek-1 = x* − xk-1 is the error at step k−1. Using this expression for
the error, f(αk) becomes

f(αk) = (ek-1 − αk pk-1)^T A (ek-1 − αk pk-1)

In order to determine whether the function has a minimum at αk, we need to calculate
the first and second derivatives of the function. Since A is symmetric, ek-1^T A pk-1 =
pk-1^T A ek-1, and we get

f'(αk) = 2 αk pk-1^T A pk-1 − 2 pk-1^T A ek-1

A being SPD implies f''(αk) = 2 pk-1^T A pk-1 > 0, and therefore the function f(αk) has a
minimum where f'(αk) = 0:

αk pk-1^T A pk-1 = pk-1^T A ek-1, so that αk = (pk-1^T A ek-1) / (pk-1^T A pk-1) (2.1)

Later, we will show that αk = (rk-1^T rk-1) / (pk-1^T A pk-1).
Next, we deduce the residual at the kth step: rk = b − Axk.

We know that xk = xk-1 + αk pk-1, so rk becomes

rk = b − A(xk-1 + αk pk-1), or
rk = b − Axk-1 − αk A pk-1, that is,
rk = rk-1 − αk A pk-1 (2.2)
The search direction pk is a linear combination of the previous search direction pk-1 and
the residual at the current step, rk:
pk = γk rk + βk pk-1, where γk = 1 because of normalization. Therefore, pk = rk + βk pk-1.

We say pk and pk-1 are conjugate with respect to A if pk-1^T A pk = 0. If we replace pk here
with the formula we found for it, this relation becomes

pk-1^T A (rk + βk pk-1) = 0

Multiplying out, it becomes pk-1^T A rk + βk pk-1^T A pk-1 = 0, and solving for βk:

βk = − (pk-1^T A rk) / (pk-1^T A pk-1) (2.3)

Later we will see the equivalent formula βk = (rk^T rk) / (rk-1^T rk-1).
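As a quick numerical sanity check of these formulas (not part of the original report; a sketch
assuming Python with NumPy), the following runs a few CG steps on a random SPD matrix
and verifies that the two expressions for αk (equation (2.1) and the rk-1^T rk-1 form) and for
βk (equation (2.3) and the rk^T rk form) agree:

    import numpy as np

    rng = np.random.default_rng(0)
    n = 6
    B = rng.standard_normal((n, n))
    A = B @ B.T + n * np.eye(n)          # a random SPD matrix
    b = rng.standard_normal(n)
    x_star = np.linalg.solve(A, b)       # exact solution, used only to form e_k

    x = np.zeros(n)
    r = b - A @ x
    p = r.copy()
    for k in range(3):
        e = x_star - x                                   # error e_{k-1}
        alpha_err = (p @ (A @ e)) / (p @ (A @ p))        # equation (2.1)
        alpha_res = (r @ r) / (p @ (A @ p))              # r^T r form
        assert np.isclose(alpha_err, alpha_res)

        x = x + alpha_res * p
        r_new = r - alpha_res * (A @ p)

        beta_conj = -(p @ (A @ r_new)) / (p @ (A @ p))   # equation (2.3)
        beta_res = (r_new @ r_new) / (r @ r)             # r^T r form
        assert np.isclose(beta_conj, beta_res)

        p = r_new + beta_res * p
        r = r_new

    print("alpha and beta formulas agree on this example")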

2.4 Applying the CG algorithm to 2 x 2 SPD matrix

In this section we will solve the system Ax = b using the conjugate gradient method,
for A a 2x2 symmetric positive definite matrix, and will show that the exact solution x̂ is
reached in two steps.
Consider the quadratic function defined in the previous section:

F(x) = (1/2) x^T A x − b^T x (2.4)
Then, for the symmetric matrix A, the gradient of this function is nothing else but Ax − b:

F'(x) = Ax − b (2.5)

Therefore, solving Ax = b is equivalent to finding the x that minimizes F(x); the solution of
Ax = b is a critical point of F(x). From multivariable calculus one knows that the gradient, a
vector field, always points in the direction of the greatest increase of a function. Therefore,
for every x, the gradient points in the direction of the steepest increase of F(x), and thus the
direction of steepest decrease is the direction opposite to F'(x).
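Equation (2.5) can be checked numerically with a finite-difference approximation of the
gradient (not part of the original report; a small sketch assuming Python with NumPy, using
an arbitrary SPD matrix chosen here only for illustration):

    import numpy as np

    A = np.array([[4.0, 1.0],
                  [1.0, 3.0]])       # an arbitrary symmetric positive definite matrix
    b = np.array([1.0, 2.0])

    def F(x):
        return 0.5 * x @ (A @ x) - b @ x

    x = np.array([2.0, 1.0])
    h = 1e-6
    # Central finite differences for each component of the gradient of F.
    grad_fd = np.array([(F(x + h * e) - F(x - h * e)) / (2 * h) for e in np.eye(2)])

    print(grad_fd)     # approximately [8., 3.] ...
    print(A @ x - b)   # ... which equals the exact gradient F'(x) = Ax - b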

Let x1 and x2 be the solutions returned by CG at the first and second iteration,
respectively; then the residual after the first iteration is r1 = Ax̂ − Ax1. As we have previously
mentioned, the idea behind the conjugate gradient method is to generate a series of
approximate solutions until we reach the one closest to the exact solution. The gradient
vector always points in the direction of the steepest increase of the quadratic function, and,
for reaching the exact solution, the previous residual (different from zero and called r1 in the
2x2 case) will be orthogonal to the search direction. Moreover, the quadratic function F(x) to
be minimized by the solution of CG will reach its minimum where the projection of the
gradient vector on the search line is zero.

In Fig. 2.2 we illustrate these steps by showing a contour plot with the gradient vectors
pointing in the direction of the steepest increase; the first step is taken along the initial
search line p0 and leads to x1; this x1 will hit the exact solution at the next step. Note that the
error e1 = x̂ − x1 will be A-orthogonal to p0.

We apply the CG method to the 2x2 SPD matrix by first checking that the residual
after the first iteration (r1 = Ax̂ − Ax1) is orthogonal to the initial residual r0, i.e. r1^T r0 = 0.
Since r0 = p0 and r1 = b − Ax1 = Ax̂ − Ax1, saying that the residuals are orthogonal is the same
as (x̂ − x1)^T A p0 = 0, which is nothing else than e1^T A p0 = 0. The gradient at the bottom-most
point of the contour plot being orthogonal to the gradient of the previous step implies that the
best approximation to the exact solution has been reached.

 For the 2x2 matrix, the solution is reached in two steps.

Fig 2.2 Algorithm for 2x2 matrix[7]

Algorithm:
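The algorithm listing from the original report is not reproduced here; the following is a
minimal sketch of the standard (unpreconditioned) CG algorithm in Python with NumPy,
consistent with the update formulas of Section 2.2:

    import numpy as np

    def conjugate_gradient(A, b, x0=None, tol=1e-10, max_iter=None):
        # Solve Ax = b for a symmetric positive definite matrix A.
        n = b.shape[0]
        x = np.zeros(n) if x0 is None else np.array(x0, dtype=float)
        if max_iter is None:
            max_iter = n                      # in exact arithmetic: at most n steps
        r = b - A @ x                         # initial residual r0
        p = r.copy()                          # initial search direction p0 = r0
        rs_old = r @ r
        for _ in range(max_iter):
            Ap = A @ p
            alpha = rs_old / (p @ Ap)         # step length alpha_k
            x = x + alpha * p                 # approximate solution x_k
            r = r - alpha * Ap                # residual r_k
            rs_new = r @ r
            if np.sqrt(rs_new) < tol:
                break
            beta = rs_new / rs_old            # improvement beta_k
            p = r + beta * p                  # search direction p_k
            rs_old = rs_new
        return x

In the preconditioned variant of this algorithm, one additionally solves Mz = r at each step
and uses z in place of r when forming the search direction and the scalars.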

2.5 Numerical example for conjugate gradient method

Consider the linear system Ax = b given by

We will perform two steps of the conjugate gradient method beginning with the initial guess
x0, in order to find an approximate solution to the system.


Solution
For reference, the exact solution is

Our first step is to calculate the residual vector r0 associated with x0. This residual is
computed from the formula r0 = b - Ax0, and in our case is equal to

Since this is the first iteration, we will use the residual vector r0 as our initial search direction
p0; the method of selecting pk will change in further iterations.
We now compute the scalar α0 using the relationship

We can now compute x1 using the formula

This result completes the first iteration, the result being an "improved" approximate solution
to the system, x1. We may now move on and compute the next residual vector r1 using the
formula

Our next step in the process is to compute the scalar β0 that will eventually be used to
determine the next search direction p1.

Now, using this scalar β0, we can compute the next search direction p1 using the relationship

We now compute the scalar α1 using our newly acquired p1 using the same method as that
used for α0.

Finally, we find x2 using the same method as that used to find x1.

The result, x2, is a "better" approximation to the system's solution than x1 and x0. If exact
arithmetic were to be used in this example instead of limited-precision, then the exact solution
would theoretically have been reached after n = 2 iterations (n being the order of the system).
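The specific matrices and intermediate vectors of this example are not reproduced above; the
following self-contained sketch (in Python with NumPy) carries out the same two CG steps
on an assumed 2x2 SPD textbook system, A = [[4, 1], [1, 3]], b = [1, 2], x0 = [2, 1], chosen
here only for illustration (the report's own numbers may differ):

    import numpy as np

    A = np.array([[4.0, 1.0],
                  [1.0, 3.0]])           # assumed SPD coefficient matrix
    b = np.array([1.0, 2.0])
    x0 = np.array([2.0, 1.0])            # assumed initial guess

    # First iteration
    r0 = b - A @ x0                      # initial residual
    p0 = r0                              # initial search direction
    alpha0 = (r0 @ r0) / (p0 @ (A @ p0))
    x1 = x0 + alpha0 * p0
    r1 = r0 - alpha0 * (A @ p0)

    # Second iteration
    beta0 = (r1 @ r1) / (r0 @ r0)
    p1 = r1 + beta0 * p0
    alpha1 = (r1 @ r1) / (p1 @ (A @ p1))
    x2 = x1 + alpha1 * p1

    print("x1 =", x1)
    print("x2 =", x2)
    print("exact solution =", np.linalg.solve(A, b))   # x2 agrees after n = 2 steps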

Chapter 3

Preconditioned Krylov Subspace Methods

The rate of convergence of a Krylov subspace method for a linear system Ax = b
depends on the condition number of the matrix [A]. Therefore, if we have a matrix [M] which
is a crude approximation to [A], M⁻¹A is closer to the identity than A is and should have a
smaller condition number, and it would be expected that a Krylov subspace method would
converge faster for the “preconditioned” system
M⁻¹Ax = M⁻¹b. (3.1)
For example, choosing M to be the diagonal part of A can be quite helpful. This is a
useful approach only if computing M⁻¹v for an arbitrary vector v is cheap. Such a matrix M is
called a preconditioner or, more precisely, a left preconditioner.
In the case of a right preconditioner, one solves
AM⁻¹u = b (3.2)
where x = M⁻¹u.
If M is available in factored form M = MLMR, one can use this as a split preconditioner by
solving
ML⁻¹AMR⁻¹u = ML⁻¹b (3.3)
where x = MR⁻¹u.

Note that M⁻¹A, AM⁻¹ and ML⁻¹AMR⁻¹ are similar matrices, so they have the same
eigenvalues.
For convenience write the given system as
A(x − x0) = r0 (3.4)
where r0 = b − Ax0,
so that x − x0 is the unknown, and consider a split preconditioner M = MLMR. A left
preconditioner is the case MR = I and a right preconditioner is the case ML = I. Then the
preconditioned system is
ML⁻¹AMR⁻¹u = ML⁻¹r0, where x = x0 + MR⁻¹u.

The Krylov algorithm is given below:

x0 = initial guess
r0 = b − Ax0
perform k iterations of the Krylov subspace method for (ML⁻¹AMR⁻¹)u = ML⁻¹r0 with u0 = 0
xk = x0 + MR⁻¹uk
Hence
xk ∈ x0 + MR⁻¹Kk(ML⁻¹AMR⁻¹, ML⁻¹r0) = x0 + Kk(M⁻¹A, M⁻¹r0),
and the effect of preconditioning is to change the Krylov subspace to Kk(M⁻¹A, M⁻¹r0). Thus,
it is only the product M = MLMR that determines the subspace. However, the actual splitting
affects the convergence test.
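To see a preconditioned Krylov method in action, the following is a minimal sketch (not from
the original report; it assumes Python with SciPy's sparse linear algebra module): GMRES is
applied to a sparse nonsymmetric system once without preconditioning and once with an
incomplete LU factorization used as the preconditioner, and the number of matrix-vector
products is compared. Note that SciPy's gmres expects the operator M to approximate A⁻¹,
i.e. M plays the role of M⁻¹ in equation (3.1).

    import numpy as np
    from scipy.sparse import diags
    from scipy.sparse.linalg import LinearOperator, gmres, spilu

    # A sparse, nonsymmetric tridiagonal test matrix (convection-diffusion-like).
    n = 200
    A = diags([-1.2, 2.5, -0.8], offsets=[-1, 0, 1], shape=(n, n), format="csc")
    b = np.ones(n)

    # Wrap A so that matrix-vector products can be counted.
    matvecs = {"count": 0}
    def counted_matvec(v):
        matvecs["count"] += 1
        return A @ v
    A_op = LinearOperator((n, n), matvec=counted_matvec)

    # GMRES without preconditioning.
    x_plain, info_plain = gmres(A_op, b)
    count_plain = matvecs["count"]

    # GMRES with an incomplete LU factorization of A as approximate-inverse preconditioner.
    ilu = spilu(A)                                   # ILU factors of A
    M = LinearOperator((n, n), matvec=ilu.solve)     # applies the approximate A^-1
    matvecs["count"] = 0
    x_prec, info_prec = gmres(A_op, b, M=M)
    count_prec = matvecs["count"]

    print("matrix-vector products without preconditioning:", count_plain)
    print("matrix-vector products with ILU preconditioning:", count_prec)

Because the ILU solve is applied inside each GMRES iteration, the preconditioned run
typically needs only a handful of matrix-vector products, whereas the unpreconditioned run
needs many more.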

Chapter 4

Conclusion

The development of efficient and reliable preconditioned iterative methods is the key
to the successful application of scientific computation to the solution of many large-scale
problems. Therefore it is not surprising that this research area continues to see a vigorous
level of activity. Although efficient preconditioners have been developed for certain classes
of problems, such as self-adjoint, second-order scalar elliptic PDEs with positive definite
operators, much work remains to be done for more complicated problems.

The type of preconditioning, or which preconditioning method to use, depends on
the choice of the iterative method, the problem characteristics, and so forth. In general, a
good preconditioner should meet the following requirements: the preconditioned system
should be easy to solve, and the preconditioner should be cheap to construct and apply. The
first property means that the preconditioned iteration should converge rapidly, while the
second ensures that each iteration is not too expensive. Notice that these two requirements
are in competition with each other, and it is necessary to strike a balance between the two
needs. With a good preconditioner, the computing time for the preconditioned iteration
should be significantly less than that for the unpreconditioned one.

Loss of accuracy and convergence was observed when the number of iterations of the
solution procedure was fixed instead of relating the stopping criterion to the precision of the
linear solution. Adequate preconditioning could significantly reduce the run time.
Preconditioners that are economical in terms of computation and communication costs were
suitable for the momentum equation system. Computationally expensive techniques such as
algebraic multigrid proved to be very efficient for solving the pressure correction equation.
Scalability up to hundreds of computing cores was observed when solving the momentum
equations due to low-cost preconditioning.

REFERENCES

[1] Y. Saad and M.H. Schultz, “A generalized minimal residual algorithm for solving
nonsymmetric linear systems”, SIAM J. Sci. Statist. Comput. 7:856–869, 1986.
[2] Michele Benzi, “Preconditioning Techniques for Large Linear Systems: A Survey”,
Journal of Computational Physics, 182:418–477, 2002.
[3] M.W. Benson, “Iterative Solution of Large Scale Linear Systems”, M.Sc. thesis, Lakehead
University, Thunder Bay, Ontario, 1973.
[4] Xiaoke Cui, “Approximate Generalized Inverse Preconditioning Methods for Least
Squares Problems”, The Graduate University for Advanced Studies, 2009.
[5] Dr. C. Vuik, “Iterative solution methods”, Research School for Fluid Mechanics, 2015.
[6] Auke van der Ploeg, “Preconditioning for sparse matrices with applications”,
University of Groningen, 1994.
[7] Elena Caraba, “Preconditioned Conjugate Gradient Algorithm”, Louisiana State
University, 2008.
