SOLUTION OF MATRICES
SEMINAR REPORT
of the degree of
Master of Technology
in
by
YASH K. MENON
(AM.EN.P2TF15014)
Supervisor:
Dr. JAYAKUMAR J S
Amritapuri
2016
Department of Mechanical Engineering
Amrita School of Engineering
Amritapuri
CERTIFICATE
Guide:
___________________
Dr. JAYAKUMAR J S
Professor
Department of Mechanical Engineering
Amrita School of Engineering
Amritapuri Campus
Acknowledgement
I would like to thank all the faculty members of the Mechanical Department and
my friends for their encouragement and support.
I would also like to thank my parents for their support, encouragement and blessings.
Above all, I thank AMMA (Sri Mata Amritanandamayi Devi) for giving me the courage
and confidence to complete this work.
YASH K. MENON
AM.EN.P2TF15014
List of Contents
Certificate
Acknowledgement
List of Figures
Abbreviations
Abstract
Chapter 1 Introduction
1.1 Aim and Objective
1.2 Need for Preconditioning
1.3 Scope of Study
1.4 Preconditioning
1.5 Methods for Preconditioning
Chapter 4 Conclusion
References
List of Figures
Abbreviations
Abstract
In recent years, research on preconditioning has received increasing attention for the
following reasons. Preconditioning methods are very useful, and indeed essential, when solving
systems of equations with millions or billions of unknowns, and they are needed to obtain
convergence in a reasonable amount of time. Preconditioning techniques
improve the performance and reliability of the different iterative methods.
Chapter 1
Introduction
Mathematical models play a very important role in the development of science and
technology. They are used in a large variety of applications, such as fluid flow, temperature
distribution, and air and water pollution. Physical experiments are difficult and costly;
therefore, one approach is to convert the physical problem into a set of partial differential equations
(PDEs) using differential analysis. These PDEs are then solved using discrete numerical
methods. Discretizing the PDEs usually yields a system of linear equations in which the
number of unknowns can be very large. It has been observed that for many practical 3D
applications, the number of unknowns is in the range of millions, or sometimes billions.
To obtain the unknowns, we express the system of linear equations in the following matrix form.
Ax = b (1.1)
where
A is the coefficient matrix,
x is the column matrix of unknowns,
b is the column matrix of constants.
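As a small illustration of this matrix form, the following sketch sets up a hypothetical 3x3 system and solves it directly with NumPy (the matrix values are made up for illustration):

```python
import numpy as np

# A hypothetical 3x3 system: A is the coefficient matrix,
# x the column matrix of unknowns, b the column matrix of constants.
A = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
b = np.array([1.0, 2.0, 3.0])

# Solve A x = b directly; iterative methods become necessary
# only when systems of this kind grow to millions of unknowns.
x = np.linalg.solve(A, b)
assert np.allclose(A @ x, b)   # the solution satisfies the system
```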
1.4 Preconditioning
A standard approach is to use a nonsingular matrix M and rewrite the system as
M^{-1}Ax = M^{-1}b (1.2)
Here M is the preconditioner. M needs to be chosen in such a way that the preconditioned matrix
Â = M^{-1}A (1.3)
has more favourable spectral properties than A itself.
The matrix M should have the following properties in order to be chosen as a preconditioner:
- It should be a proper approximation of A, so that M^{-1}A resembles the identity matrix.
- It should reduce the spectral condition number of the preconditioned matrix M^{-1}A; this is the most important quality of a preconditioner.
- It should not require a large amount of storage.
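A minimal numerical sketch of these properties, assuming NumPy and a made-up, badly scaled SPD matrix: a Jacobi (diagonal) preconditioner M = diag(A) is cheap to store and apply, and brings M^{-1}A much closer to the identity, which shows up as a far smaller condition number.

```python
import numpy as np

# Made-up, badly scaled SPD matrix (illustrative assumption).
A = np.diag([1.0, 10.0, 100.0, 1000.0]) + 0.1 * np.ones((4, 4))

# Jacobi preconditioner: M = diag(A); M^-1 needs only n storage entries.
M_inv = np.diag(1.0 / np.diag(A))

cond_A = np.linalg.cond(A)
cond_MA = np.linalg.cond(M_inv @ A)   # M^-1 A resembles the identity
print(cond_A, cond_MA)                # the second number is far smaller
```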
Fig. 1.1 below shows an example of a preconditioner accelerating convergence. The original
function is a narrow valley, and the optimizer bounces off the walls of the valley while
approaching the solution. It finally arrives at the isoline f = 1, but needs 4 iterations to
reach it. After scaling, the curvature of the function becomes much simpler, and only one
step is needed to move far below f = 1.
The example above uses a very basic optimization algorithm, the steepest descent method.
This can be seen from the fact that steps are always taken in the direction of the antigradient,
without accumulating information about the curvature of the function. Both L-BFGS and CG
would have started to turn the search direction toward the extremum right after the first iteration.
This does not mean, however, that a good optimization algorithm does not need a preconditioner:
a good preconditioner can speed up the optimization significantly, by up to several times.
We need a preconditioner if:
- the variables have wildly different magnitudes (a factor of a thousand or more);
- the function changes rapidly in some directions and slowly in others;
- we want to accelerate the optimization.
Sometimes a preconditioner merely accelerates convergence, but in some difficult cases it is
impossible to solve the problem without good preconditioning.
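These points can be illustrated with a small experiment (a sketch assuming NumPy; the badly scaled SPD matrix is made up): steepest descent with exact line search is run with and without a simple Jacobi (diagonal) preconditioner, and the preconditioned run needs far fewer iterations.

```python
import numpy as np

def steepest_descent(A, b, M_inv=None, tol=1e-8, max_iter=10_000):
    """Steepest descent with exact line search for SPD A.
    M_inv, if given, is an SPD approximate inverse preconditioner.
    Returns the number of iterations used (illustrative sketch)."""
    if M_inv is None:
        M_inv = np.eye(len(b))
    x = np.zeros_like(b)
    for k in range(max_iter):
        r = b - A @ x                       # residual = antigradient of F
        if np.linalg.norm(r) < tol:
            return k
        z = M_inv @ r                       # preconditioned direction
        x = x + (r @ z) / (z @ A @ z) * z   # exact line search step
    return max_iter

# Made-up SPD system whose variables have wildly different magnitudes.
A = np.diag([1.0, 100.0, 10_000.0]) + np.ones((3, 3))
b = np.array([1.0, 2.0, 3.0])

plain = steepest_descent(A, b)
jacobi = steepest_descent(A, b, M_inv=np.diag(1.0 / np.diag(A)))
print(plain, jacobi)   # Jacobi scaling needs far fewer iterations
```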
Chapter 2
The Conjugate Gradient Method
Often, in real-life applications, the systems of equations that arise from particular
scientific problems cannot be solved quickly with direct methods; instead, a sequence of
approximate solutions is needed to approach the exact solution, and most of the time these
solutions are found using iterative methods [5].
Before 1952, two great mathematicians, Magnus R. Hestenes and Eduard Stiefel, were
working independently on developing an effective method for solving a system of n
simultaneous equations in n unknowns, especially when n is large. For effectiveness, they
were looking for a method that is simple; requires minimal storage space; converges rapidly
in a finite number of steps; is stable with respect to round-off errors; and, at each step,
provides some information about the solution, with the current estimate
better than the previous one. Unfortunately, all these criteria are hard to meet, and in
practice there is no single best method that can solve all types of problems.
In 1952, Hestenes and Stiefel, joining their research results [6], discovered one of the
most powerful iterative methods for rapidly solving large linear systems of equations with
symmetric positive definite coefficient matrices: the conjugate gradient
method, or simply the CG algorithm.
The CG algorithm is a technique that does not require any computation of the
eigenvalues of the system, as some other methods do, and can therefore be used as a more direct
method for finding the solution of Ax = b [6, 7].
In this chapter we present the conjugate gradient algorithm and the derivation of its
formulas, along with an analysis of its rate of convergence, illustrated with a numerical
example.
2.2 Description of Method
The conjugate gradient method, or simply CG, is one of the most important algorithms
for solving large, sparse systems with symmetric positive definite (SPD) matrices.
Throughout this chapter we will discuss some of the benefits and disadvantages of the
method. One of the main benefits is that it reaches the solution in at most n steps if the system
we are trying to solve has n linear equations in n unknowns; moreover, the method is easy
to code in Matlab and does not require a lot of memory. The rate of convergence of the method
is generally good, but becomes problematic for ill-conditioned matrices; in the following
chapter we will show how CG can be improved for this kind of matrix.
The CG algorithm, applied to the system Ax = b, starts with an initial guess of the
solution x0, with an initial residual r0, and with an initial search direction that is equal to the
initial residual: p0 = r0.
The idea behind the conjugate gradient method is that the residual r_k = b - Ax_k is
orthogonal to the Krylov subspace generated by b, and therefore each residual is
perpendicular to all the previous residuals. The residual is computed at each step. The solution
at the next step is found using a search direction that is a linear combination of the
previous search directions; for x1 it is just a combination of the previous and the
current residual.
Then the solution at step k, x_k, is just the previous iterate plus a constant times the last
search direction. The immediate benefit of the search directions is that there is no need to store
all the previous search directions: using the orthogonality of the residuals to these previous
search directions, the new search direction is linearly independent of the previous ones. For the
solution at the next step, a new search direction is computed, as well as a new residual and
new constants. The role of the constants is to give an optimal approximate solution [6].
A more visual explanation of how the CG algorithm approaches the exact solution
is given in Fig. 2.1.
The iterative formulas of CG are given below [6]:
Improvement at step k: β_k = (r_k^T r_k) / (r_{k-1}^T r_{k-1})
Step length: α_k = (r_{k-1}^T r_{k-1}) / (p_{k-1}^T A p_{k-1})
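The formulas above can be assembled into a complete CG iteration. The following is an illustrative sketch in NumPy (the 2x2 SPD test matrix is made up), using exactly the step-length and improvement factors given above:

```python
import numpy as np

def conjugate_gradient(A, b, tol=1e-10, max_iter=None):
    """Plain CG for SPD A, following the formulas above:
    step length  alpha_k = r_{k-1}^T r_{k-1} / (p_{k-1}^T A p_{k-1})
    improvement  beta_k  = r_k^T r_k / (r_{k-1}^T r_{k-1})
    (illustrative sketch, no preconditioning)."""
    n = len(b)
    max_iter = max_iter or n
    x = np.zeros(n)
    r = b - A @ x          # initial residual r0
    p = r.copy()           # initial search direction p0 = r0
    rs_old = r @ r
    for _ in range(max_iter):
        Ap = A @ p
        alpha = rs_old / (p @ Ap)       # step length
        x += alpha * p                  # x_k = x_{k-1} + alpha_k p_{k-1}
        r -= alpha * Ap                 # r_k = r_{k-1} - alpha_k A p_{k-1}
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:
            break
        beta = rs_new / rs_old          # improvement factor
        p = r + beta * p                # new search direction
        rs_old = rs_new
    return x

A = np.array([[4.0, 1.0], [1.0, 3.0]])  # made-up SPD test matrix
b = np.array([1.0, 2.0])
x = conjugate_gradient(A, b)
assert np.allclose(A @ x, b)            # converges in at most n = 2 steps
```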
In this section we will derive the formulas of the conjugate gradient algorithm, starting
by minimizing the A-norm of the error at step k, ||e_k||_A. The error at step k is defined as
e_k = x* - x_k, where x* = A^{-1}b is the exact solution and x_k is the solution at step k. We define
the A-norm of the error at step k to be ||e_k||_A = sqrt(e_k^T A e_k), and we introduce a new function
f(α_k) = ||e_k||_A^2.
Since x_k = x_{k-1} + α_k p_{k-1}, the error satisfies e_k = e_{k-1} - α_k p_{k-1}.
In order to determine whether the function has a minimum or not at α_k, we need to
calculate the first and second derivatives of the function; since A is symmetric,
e_{k-1}^T A p_{k-1} = p_{k-1}^T A e_{k-1}, and we get:
f'(α_k) = 2 α_k p_{k-1}^T A p_{k-1} - 2 p_{k-1}^T A e_{k-1}
A being SPD implies f''(α_k) = 2 p_{k-1}^T A p_{k-1} > 0, and therefore the function f(α_k) has a minimum at the α_k satisfying
f'(α_k) = 0:
α_k p_{k-1}^T A p_{k-1} = p_{k-1}^T A e_{k-1}, so α_k = (p_{k-1}^T A e_{k-1}) / (p_{k-1}^T A p_{k-1}) (2.1)
Later, we will show that α_k = (r_{k-1}^T r_{k-1}) / (p_{k-1}^T A p_{k-1}).
Next, we deduce the residual at the kth step: r_k = b - A x_k. Then
r_k = b - A(x_{k-1} + α_k p_{k-1}), or
r_k = b - A x_{k-1} - α_k A p_{k-1}, giving
r_k = r_{k-1} - α_k A p_{k-1} (2.2)
The search direction p_k is a linear combination of the previous search direction p_{k-1} and the
residual at the current step, r_k:
p_k = γ_k r_k + β_k p_{k-1}, but γ_k = 1 because of normalization. Therefore, p_k = r_k + β_k p_{k-1}.
We say p_k and p_{k-1} are conjugate with respect to A if p_{k-1}^T A p_k = 0. If we replace p_k here with the
formula we found for it, this relation becomes:
p_{k-1}^T A (r_k + β_k p_{k-1}) = 0
β_k = - (p_{k-1}^T A r_k) / (p_{k-1}^T A p_{k-1}) (2.3)
Later we will see the equivalent formula β_k = (r_k^T r_k) / (r_{k-1}^T r_{k-1}).
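The equivalence of the two β_k formulas, and the orthogonality of successive residuals that underlies it, can be checked numerically. A sketch assuming NumPy, with a randomly generated SPD matrix:

```python
import numpy as np

# Numerical check (on a made-up SPD system) that the two expressions
# for beta_k agree:
#   -p_{k-1}^T A r_k / (p_{k-1}^T A p_{k-1})  ==  r_k^T r_k / (r_{k-1}^T r_{k-1})
rng = np.random.default_rng(0)
B = rng.standard_normal((5, 5))
A = B @ B.T + 5 * np.eye(5)             # SPD by construction
b = rng.standard_normal(5)

# One hand-rolled CG step from x0 = 0 (so r0 = b and p0 = r0).
r0 = b.copy()
p0 = r0.copy()
alpha1 = (r0 @ r0) / (p0 @ A @ p0)      # first step length
r1 = r0 - alpha1 * (A @ p0)             # residual recurrence (2.2)

beta_conjugacy = -(p0 @ A @ r1) / (p0 @ A @ p0)   # from p0^T A p1 = 0
beta_residuals = (r1 @ r1) / (r0 @ r0)            # simplified formula
assert np.isclose(beta_conjugacy, beta_residuals)
assert abs(r0 @ r1) < 1e-8                        # residuals are orthogonal
```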
In this section we will solve the system Ax = b using the conjugate gradient method,
for A a 2x2 symmetric positive definite matrix, and will show that the exact solution x̂ is
reached in two steps.
Consider the quadratic function defined in the previous section:
F(x) = (1/2) x^T A x - b^T x (2.4)
Then, for the symmetric matrix A, the gradient of this function is nothing else but Ax - b:
F'(x) = Ax - b (2.5)
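Equation (2.5) can be sanity-checked with a central finite difference. A small sketch assuming NumPy; the 2x2 SPD matrix and the test point are made up:

```python
import numpy as np

# Finite-difference check of (2.5): for symmetric A, the gradient of
# F(x) = 0.5 x^T A x - b^T x is F'(x) = A x - b. (Made-up 2x2 data.)
A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
F = lambda x: 0.5 * x @ A @ x - b @ x

x = np.array([0.3, -0.7])               # arbitrary test point
grad_exact = A @ x - b

h = 1e-6
grad_fd = np.array([
    (F(x + h * e) - F(x - h * e)) / (2 * h)   # central difference
    for e in np.eye(2)
])
assert np.allclose(grad_exact, grad_fd, atol=1e-6)
```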
Therefore, solving Ax = b is equivalent to finding the x that minimizes F(x); the solution of
Ax = b is a critical point of F(x). From multivariable calculus one knows that the gradient, a
vector field, always points in the direction of the greatest increase of a function.
Therefore, for every x, the gradient points in the direction of the steepest increase of F(x),
and the direction of steepest decrease is the direction opposite to F'(x).
Let x1 and x2 be the solutions returned by CG at the first and second iterations,
respectively; then the residual after the first iteration is r1 = A x̂ - A x1. As we have previously
mentioned, the idea behind the conjugate gradient method is to generate a series of
approximate solutions until we reach the one closest to the exact solution. The gradient
vector always points in the direction of the steepest increase of the quadratic function, and
for the exact solution to be reached, the previous residual (nonzero, and called r1 in the
2x2 case) must be orthogonal to the search direction. Moreover, the quadratic function F(x),
minimized by the solution of CG, reaches its minimum where the projection of the
gradient vector on the search line is zero.
In Fig. 2.2 we illustrate these steps with a contour plot whose gradient vectors
point in the direction of steepest increase; the first step, taken along the initial
search line p0, leads to x1, and from x1 the exact solution is hit at the next step. Note that the error
e1 = x̂ - x1 is A-orthogonal to p0.
We apply the CG method to the 2x2 SPD system by first checking that the residual
after the first iteration, r1 = A x̂ - A x1, is orthogonal to the initial residual r0: r1^T r0 = 0. Since
r0 = p0 and r1 = b - A x1 = A x̂ - A x1, saying the residuals are orthogonal is the same as
(x̂ - x1)^T A p0 = 0, which is nothing else than e1^T A p0 = 0. The gradient at the bottom-most point
of the contour plot being orthogonal to the gradient of the previous step implies that the best
approximation to the exact solution has been reached.
Algorithm:
We will perform two steps of the conjugate gradient method, beginning with an initial guess x0.
Our first step is to calculate the residual vector r0 associated with x0. This residual is
computed from the formula r0 = b - Ax0.
Since this is the first iteration, we will use the residual vector r0 as our initial search direction
p0; the method of selecting pk will change in further iterations.
We now compute the scalar α0 using the relationship α0 = (r0^T r0) / (p0^T A p0).
This result completes the first iteration, the result being an "improved" approximate solution
to the system, x1 = x0 + α0 p0. We may now move on and compute the next residual vector r1 using the
formula r1 = r0 - α0 A p0.
Our next step in the process is to compute the scalar β0 = (r1^T r1) / (r0^T r0) that will eventually be used to
determine the next search direction p1.
Now, using this scalar β0, we can compute the next search direction p1 = r1 + β0 p0.
We now compute the scalar α1 = (r1^T r1) / (p1^T A p1) using our newly acquired p1, by the same method as that
used for α0.
Finally, we find x2 = x1 + α1 p1 using the same method as that used to find x1.
The result, x2, is a "better" approximation to the system's solution than x1 and x0. If exact
arithmetic were used in this example instead of limited precision, then the exact solution
would theoretically be reached after n = 2 iterations (n being the order of the system).
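The two steps above can be carried out concretely. A sketch assuming NumPy; the 2x2 SPD matrix, right-hand side and initial guess are hypothetical stand-ins for the example's values:

```python
import numpy as np

# Two explicit CG steps on a hypothetical 2x2 SPD system.
A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
x0 = np.array([2.0, 1.0])               # arbitrary initial guess

r0 = b - A @ x0                          # initial residual
p0 = r0.copy()                           # first search direction p0 = r0

alpha0 = (r0 @ r0) / (p0 @ A @ p0)       # first step length
x1 = x0 + alpha0 * p0                    # "improved" approximation
r1 = r0 - alpha0 * (A @ p0)              # next residual

beta0 = (r1 @ r1) / (r0 @ r0)            # scalar for the next direction
p1 = r1 + beta0 * p0                     # second search direction

alpha1 = (r1 @ r1) / (p1 @ A @ p1)       # second step length
x2 = x1 + alpha1 * p1                    # solution after n = 2 steps

# Up to round-off, x2 matches the exact solution of A x = b.
assert np.allclose(x2, np.linalg.solve(A, b))
```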
Chapter 3
Preconditioned Systems
Note that M^{-1}A, AM^{-1} and M_L^{-1} A M_R^{-1} are similar matrices, so they have the same
eigenvalues.
For convenience, write the given system as
A(x - x0) = r0 (3.4)
where r0 = b - Ax0,
so that x - x0 is the unknown, and consider a split preconditioner M = M_L M_R. A left
preconditioner is the case M_R = I and a right preconditioner is the case M_L = I. Then the
preconditioned system is
M_L^{-1} A M_R^{-1} u = M_L^{-1} r0, where x = x0 + M_R^{-1} u
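The similarity claim above (M^{-1}A, AM^{-1} and M_L^{-1}AM_R^{-1} share the same eigenvalues) can be verified numerically. A sketch assuming NumPy, with a made-up SPD matrix and a Jacobi preconditioner split by Cholesky factorization, M = M_L M_R with M_R = M_L^T:

```python
import numpy as np

rng = np.random.default_rng(1)
B = rng.standard_normal((4, 4))
A = B @ B.T + 4 * np.eye(4)          # made-up SPD matrix
M = np.diag(np.diag(A))              # Jacobi preconditioner, SPD
ML = np.linalg.cholesky(M)           # M = ML @ ML.T, so MR = ML.T
MR = ML.T

# All three matrices are similar to the symmetric M^(-1/2) A M^(-1/2),
# so the eigenvalues are real; tiny imaginary round-off is dropped.
ev = lambda X: np.sort(np.linalg.eigvals(X).real)
e_left  = ev(np.linalg.solve(M, A))                        # M^-1 A
e_right = ev(A @ np.linalg.inv(M))                         # A M^-1
e_split = ev(np.linalg.solve(ML, A) @ np.linalg.inv(MR))   # ML^-1 A MR^-1

assert np.allclose(e_left, e_right)
assert np.allclose(e_left, e_split)
```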
Chapter 4
Conclusion
The development of efficient and reliable preconditioned iterative methods is the key
for the successful application of scientific computation to the solution of many large-scale
problems. Therefore it is not surprising that this research area continues to see a vigorous
level of activity. Efficient preconditioners have been developed for certain classes of
problems, such as self-adjoint, second-order scalar elliptic PDEs with a positive definite
operator, but much work remains to be done for more complicated problems.
Loss of accuracy and convergence was observed when the number of iterations of the
solution procedure was fixed instead of relating the stopping criterion to the precision of the
linear solution. Adequate preconditioning could significantly reduce the run time.
Preconditioners that are economical in terms of computation and communication costs were
suitable for the momentum equation system. Computationally expensive techniques such as
algebraic multigrid proved to be very efficient for solving the pressure correction equation.
Scalability up to hundreds of computing cores was observed when solving the momentum
equations due to low-cost preconditioning.
REFERENCES
[1] Y. Saad and M.H. Schultz, "A generalized minimal residual algorithm for solving
nonsymmetric linear systems", SIAM J. Sci. Statist. Comput., 7:856–869, 1986.
[2] M. Benzi, "Preconditioning Techniques for Large Linear Systems: A Survey",
Journal of Computational Physics, 182:418–477, 2002.
[3] M.W. Benson, "Iterative Solution of Large Scale Linear Systems", M.Sc. thesis, Lakehead
University, Thunder Bay, Ontario, 1973.
[4] X. Cui, "Approximate Generalized Inverse Preconditioning Methods for Least
Squares Problems", Ph.D. thesis, The Graduate University for Advanced Studies, 2009.
[5] C. Vuik, "Iterative Solution Methods", Research School for Fluid Mechanics, 2015.
[6] A. van der Ploeg, "Preconditioning for Sparse Matrices with Applications",
Ph.D. thesis, University of Groningen, 1994.
[7] E. Caraba, "Preconditioned Conjugate Gradient Algorithm", Louisiana State
University, 2008.