
An introduction to the Power Method and

(Shifted/Inverse) Power Method

Yu-Kai Hong
Department of Applied Mathematics, National University of Kaohsiung,
Kaohsiung 811, Taiwan. E-mail: a0934147@gmail.com

23 May 2007

Abstract

The power method is the simplest numerical method for the dominant eigenvalue and
eigenvector problem, and most algorithms for this problem build on it. The inverse
power method finds the smallest eigenvalue and its eigenvector; it derives from the
power method by applying the iteration to the inverse of the matrix. Given any
constant, the shifted power method finds the eigenvalue of the matrix nearest that
constant; it, too, derives from the power method, by shifting the eigenvalues of
the matrix in the iteration.

1 Power Method

Definition 1.1. If λ is an eigenvalue of the matrix A ∈ Mn×n (R) that is
larger in absolute value than any other eigenvalue, it is called the dominant
eigenvalue. An eigenvector V corresponding to the eigenvalue λ is called the
dominant eigenvector.

Definition 1.2. A vector V is said to be normalized if its norm is 1, that is,
‖V‖ = 1; such a vector is called a unit vector.
Any nonzero vector can be normalized by dividing it by its norm.

Theorem 1.1 (Power Method). Assume the matrix A ∈ Mn×n (R) has n dis-
tinct eigenvalues λ1, λ2, λ3, ..., λn and that they are ordered in decreasing mag-
nitude; that is,

|λ1| > |λ2| ≥ |λ3| ≥ ... ≥ |λn|

If X0 is chosen appropriately, then the sequences {Xk} and {Ck}
generated recursively by

Yk = A Xk
Preprint submitted to Elsevier Science 23 May 2007


where

Xk+1 = Yk / Ck

and

Ck = ‖Yk‖
will converge to the dominant eigenvector V1 and the dominant eigenvalue
λ1, respectively; that is,

lim_{k→∞} Xk = V1   and   lim_{k→∞} Ck = λ1

Proof. Since A has n distinct eigenvalues, there are n normalized eigenvectors
Vi corresponding to the λi, for i = 1, 2, ..., n, and the linearly independent set
{V1, V2, V3, ..., Vn} forms a basis for the n-dimensional space. Hence the starting
vector X0 can be expressed as the linear combination

X0 = b1 V1 + b2 V2 + b3 V3 + ... + bn−1 Vn−1 + bn Vn

Assume that X0 was chosen in such a manner that b1 is not zero (otherwise
the iteration cannot pick out the dominant eigenvalue and eigenvector).
Since AVi = λi Vi, we have

Y0 = A X0 = A(b1 V1 + b2 V2 + b3 V3 + ... + bn Vn)
   = b1 A V1 + b2 A V2 + b3 A V3 + ... + bn A Vn
   = b1 λ1 V1 + b2 λ2 V2 + b3 λ3 V3 + ... + bn λn Vn
   = λ1 ( b1 V1 + b2 (λ2/λ1) V2 + b3 (λ3/λ1) V3 + ... + bn (λn/λ1) Vn )

and

X1 = (λ1/C1) ( b1 V1 + b2 (λ2/λ1) V2 + b3 (λ3/λ1) V3 + ... + bn (λn/λ1) Vn )
After k iterations we arrive at

Yk−1 = A Xk−1
     = (λ1^{k−1} / (C1 C2 ... Ck−1)) A ( b1 V1 + b2 (λ2/λ1)^{k−1} V2 + b3 (λ3/λ1)^{k−1} V3 + ... + bn (λn/λ1)^{k−1} Vn )
     = (λ1^{k−1} / (C1 C2 ... Ck−1)) ( b1 A V1 + b2 (λ2/λ1)^{k−1} A V2 + b3 (λ3/λ1)^{k−1} A V3 + ... + bn (λn/λ1)^{k−1} A Vn )
     = (λ1^{k−1} / (C1 C2 ... Ck−1)) ( b1 λ1 V1 + b2 λ2 (λ2/λ1)^{k−1} V2 + b3 λ3 (λ3/λ1)^{k−1} V3 + ... + bn λn (λn/λ1)^{k−1} Vn )
     = (λ1^{k} / (C1 C2 ... Ck−1)) ( b1 V1 + b2 (λ2/λ1)^{k} V2 + b3 (λ3/λ1)^{k} V3 + ... + bn (λn/λ1)^{k} Vn )

and

Xk = (λ1^{k} / (C1 C2 ... Ck−1 Ck)) ( b1 V1 + b2 (λ2/λ1)^{k} V2 + b3 (λ3/λ1)^{k} V3 + ... + bn (λn/λ1)^{k} Vn )

Since |λ1| > |λ2| ≥ |λ3| ≥ ... ≥ |λn|, it follows that

lim_{k→∞} bi (λi/λ1)^k Vi = 0   for each i = 2, ..., n
Hence it follows that

lim_{k→∞} ‖Xk‖ = lim_{k→∞} ‖ (b1 λ1^k / (C1 C2 ... Ck−1 Ck)) V1 ‖ = lim_{k→∞} | b1 λ1^k / (C1 C2 ... Ck−1 Ck) | ‖V1‖

and since both Xk and V1 are normalized unit vectors,

‖Xk‖ = ‖V1‖ = 1

so that we have

lim_{k→∞} | b1 λ1^k / (C1 C2 ... Ck−1 Ck) | = lim_{k→∞} b1 λ1^k / (C1 C2 ... Ck−1 Ck) = 1

Therefore, the sequence of vectors converges to the dominant eigenvector:

lim_{k→∞} Xk = lim_{k→∞} (b1 λ1^k / (C1 C2 ... Ck−1 Ck)) V1 = V1

Replacing k with k − 1 in the limit above gives, in the same way,

lim_{k→∞} b1 λ1^{k−1} / (C1 C2 ... Ck−1) = 1

and dividing the two limits above, we have

lim_{k→∞} λ1/Ck = lim_{k→∞} [ b1 λ1^k / (C1 C2 ... Ck−1 Ck) ] / [ b1 λ1^{k−1} / (C1 C2 ... Ck−1) ] = 1

Therefore, the sequence of constants converges to the dominant eigenvalue:

λ1 = lim_{k→∞} Ck

and the proof of the theorem is complete.

In the equation

lim_{k→∞} bi (λi/λ1)^k Vi = 0   for each i = 2, ..., n

we see that the coefficient of Vi in Xk goes to zero in proportion to (λi/λ1)^k and
that the speed of convergence of {Xk} to V1 is governed by the term (λ2/λ1)^k.
Consequently, the rate of convergence is linear. Similarly, the convergence of
the sequence {Ck} to λ1 is also linear.

So if the eigenvalue λ2 is too close to the eigenvalue λ1, many iterations are
needed to evaluate the dominant eigenvalue and the dominant eigenvector.
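The iteration of Theorem 1.1 is short enough to sketch directly. The following is an illustrative Python sketch, not code from this paper; the function name, the stopping tolerance, and the choice of the Euclidean norm for ‖·‖ are assumptions of mine. Since Ck = ‖Yk‖ is nonnegative, the returned constant approximates |λ1|, which equals λ1 when the dominant eigenvalue is positive:

```python
import numpy as np

def power_method(A, x0, max_iter=500, tol=1e-12):
    """Power method: approximate the dominant eigenvalue (in magnitude)
    and a unit dominant eigenvector of A."""
    x = x0 / np.linalg.norm(x0)        # normalize the starting vector X_0
    c = 0.0
    for _ in range(max_iter):
        y = A @ x                      # Y_k = A X_k
        c_new = np.linalg.norm(y)      # C_k = ||Y_k||
        x = y / c_new                  # X_{k+1} = Y_k / C_k
        if abs(c_new - c) < tol:       # stop once C_k has settled
            break
        c = c_new
    return c_new, x

# Example: A has eigenvalues 3 and 1, so C_k approaches 3
A = np.array([[2.0, 1.0], [1.0, 2.0]])
c, x = power_method(A, np.array([1.0, 0.0]))
```

As in the proof, the starting vector must have a nonzero component b1 along V1; here X0 = (1, 0) does.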

2 Inverse Power Method

Theorem 2.1 (Inverse Eigenvalue). Suppose the matrix A ∈ Mn×n (R) is
invertible (that is, no eigenvalue of A is zero) and let λ be an eigenvalue of A
corresponding to the eigenvector V. Then λ−1 is an eigenvalue of A−1
corresponding to the eigenvector V.

Proof. Since λ is an eigenvalue of A corresponding to the eigenvector V, we
have AV = λV and A−1 A = I, so that

λ(A−1 V) = A−1 (λV) = A−1 (AV) = A−1 A V = IV = V

Since A is invertible, λ ≠ 0, and therefore

A−1 V = λ−1 V

So λ−1 is an eigenvalue of A−1 corresponding to the eigenvector V.

Theorem 2.2 (Inverse Power Method). Assume the matrix A ∈ Mn×n (R)
is invertible and has n distinct eigenvalues λ1, λ2, λ3, ..., λn and that they are
ordered in decreasing magnitude; that is,

|λ1| > |λ2| ≥ |λ3| ≥ ... ≥ |λn−1| > |λn| > 0

If X0 is chosen appropriately, then the sequences {Xk} and {Ck}
generated recursively by

Yk = A−1 Xk

where

Xk+1 = Yk / Ck

and

Ck = ‖Yk‖

will converge to the eigenvector Vn of the smallest eigenvalue and to the
reciprocal of the smallest eigenvalue, respectively; that is,

lim_{k→∞} Xk = Vn   and   lim_{k→∞} Ck = 1/λn

It is clear that if we want to evaluate the smallest eigenvalue and its eigenvector
of a matrix A with the power method, we should transform the matrix A
into a matrix B such that the smallest eigenvalue and eigenvector of A become the
dominant eigenvalue and eigenvector of B, and then apply the power method. By
Theorem 2.1, we take B = A−1 (supposing that A is invertible).

Since

|λ1| > |λ2| ≥ |λ3| ≥ ... ≥ |λn−1| > |λn| > 0

taking reciprocals reverses the ordering:

|1/λn| > |1/λn−1| ≥ |1/λn−2| ≥ ... ≥ |1/λ2| > |1/λ1| > 0

The speed of convergence of the inverse power method depends on the ratio

( (1/λn−1) / (1/λn) )^k = (λn/λn−1)^k

So if the eigenvalue λn is too close to the eigenvalue λn−1, many iterations are
needed to evaluate the dominant eigenvalue and the dominant eigenvector of
the matrix A−1.

In the numerical iteration

Yk = A−1 Xk

we do not have to evaluate the matrix A−1 explicitly; instead we solve the
linear system

A Yk = Xk

There are many numerical methods for solving linear systems.
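As a sketch of this idea (illustrative Python of my own, not from the paper), each step solves A Yk = Xk with a dense solver rather than forming A−1; since Ck is a norm, the value returned is the magnitude |λn| of the smallest eigenvalue:

```python
import numpy as np

def inverse_power_method(A, x0, max_iter=500, tol=1e-12):
    """Inverse power method: approximate the smallest-magnitude
    eigenvalue of an invertible A and its unit eigenvector."""
    x = x0 / np.linalg.norm(x0)
    c = 0.0
    for _ in range(max_iter):
        y = np.linalg.solve(A, x)    # solve A Y_k = X_k instead of forming A^{-1}
        c_new = np.linalg.norm(y)    # C_k approaches |1/lambda_n|
        x = y / c_new
        if abs(c_new - c) < tol:
            break
        c = c_new
    return 1.0 / c_new, x            # |lambda_n| = 1 / lim C_k

# Example: A has eigenvalues 3 and 1, so the method returns 1
A = np.array([[2.0, 1.0], [1.0, 2.0]])
lam, x = inverse_power_method(A, np.array([1.0, 0.0]))
```

For large sparse matrices one would replace `np.linalg.solve` with a factorization computed once and reused, since A is the same at every step.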

3 Shifted Power Method

Theorem 3.1 (Shifted Eigenvalue). Suppose the matrix A ∈ Mn×n (R) has
eigenvalue λ corresponding to the eigenvector V, and let α be any
constant. Then λ − α is an eigenvalue of the matrix A − αI corresponding
to the eigenvector V.

Proof. Since λ is an eigenvalue of A corresponding to the eigenvector V, we
have AV = λV, so

(A − αI)V = AV − αV = λV − αV = (λ − α)V

Hence λ − α is an eigenvalue of the matrix A − αI corresponding to the
eigenvector V.

From the above theorem, suppose the matrix A ∈ Mn×n (R) has n distinct
eigenvalues λ1, λ2, λ3, ..., λn, ordered in decreasing magnitude;
that is,

|λ1| > |λ2| ≥ |λ3| ≥ ... ≥ |λn−1| ≥ |λn| > 0

Let B = A − λ1 I. The eigenvalues of the matrix B are ordered as follows:

|λ2 − λ1| ≥ |λ3 − λ1| ≥ ... ≥ |λn−1 − λ1| > |λn − λ1| > |λ1 − λ1| = 0

If we suppose that the first inequality is strict, then

|λ2 − λ1| > |λ3 − λ1| ≥ ... ≥ |λn−1 − λ1| > |λn − λ1| > |λ1 − λ1| = 0

so we can apply the power method to the matrix B to find its dominant
eigenvalue μ = λ2 − λ1 and the eigenvector V; then λ2 = λ1 + μ and
V are the second dominant eigenvalue and eigenvector of the matrix A.
This method is known as the shifted power method.

Using the same method repeatedly, we can find the eigenvalues λ1, λ2, λ3, ..., λn and
the corresponding eigenvectors V1, V2, V3, ..., Vn of the matrix A step by step, but
each further eigenvalue and eigenvector requires the stronger condition

|λ1| > |λ2| > |λ3| > ... > |λn−1| > |λn|

and the accumulated errors grow larger and larger, so the power method is best
suited to finding the dominant eigenvalue and eigenvector.
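The step from λ1 to λ2 can be sketched as follows (illustrative Python of my own, not from the paper). Because Ck is a norm, the power method applied to the shifted matrix only yields |λ2 − λ1|; I therefore recover the signed quantity with a Rayleigh quotient, an addition not in the text, which is appropriate for the symmetric example below:

```python
import numpy as np

def power_method(A, x0, max_iter=500, tol=1e-12):
    """Power method as in Section 1: returns (||Y_k||, X_k)."""
    x = x0 / np.linalg.norm(x0)
    c = 0.0
    for _ in range(max_iter):
        y = A @ x
        c_new = np.linalg.norm(y)
        x = y / c_new
        if abs(c_new - c) < tol:
            break
        c = c_new
    return c_new, x

# Shifted power method: A has eigenvalues 3 and 1.
A = np.array([[2.0, 1.0], [1.0, 2.0]])
lam1, _ = power_method(A, np.array([1.0, 0.0]))   # dominant eigenvalue, about 3
B = A - lam1 * np.eye(2)                          # shifted matrix, eigenvalues about 0 and -2
_, v2 = power_method(B, np.array([1.0, 0.0]))     # dominant eigenvector of B
mu = v2 @ (B @ v2)                                # Rayleigh quotient: signed lambda_2 - lambda_1
lam2 = lam1 + mu                                  # second eigenvalue of A
```

Note that the error in lam1 propagates into B, which is one source of the growing errors mentioned above.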

Theorem 3.2. Suppose the matrix A ∈ Mn×n (R) has eigenvalue λ
corresponding to the eigenvector V, and let α be any constant that is not an
eigenvalue of A (so that A − αI is invertible). Then (λ − α)−1 is an eigenvalue
of the matrix (A − αI)−1 corresponding to the eigenvector V.

Proof. Since λ is an eigenvalue of A corresponding to the eigenvector V, we
have AV = λV and (A − αI)−1 (A − αI) = I, so

(λ − α)(A − αI)−1 V = (A − αI)−1 (λV − αV) = (A − αI)−1 (AV − αV) = (A − αI)−1 (A − αI)V = V

Since λ − α ≠ 0, dividing by λ − α gives

(A − αI)−1 V = (λ − α)−1 V

so that (λ − α)−1 is an eigenvalue of the matrix (A − αI)−1 corresponding to
the eigenvector V.

Theorem 3.3 (Shifted-Inverse Power Method). Assume the matrix A ∈
Mn×n (R) has n distinct eigenvalues λ1, λ2, λ3, ..., λn and that they are ordered
in decreasing magnitude; that is,

|λ1| > |λ2| ≥ |λ3| ≥ ... ≥ |λn−1| > |λn| > 0
Consider the eigenvalue λj; then a constant α can be chosen so that μ1 =
1/(λj − α) is the dominant eigenvalue of the matrix (A − αI)−1. Furthermore, if
X0 is chosen appropriately, then the sequences {Xk} and {Ck} are generated
recursively by

Yk = (A − αI)−1 Xk

and

Xk+1 = Yk / Ck

where

Ck = ‖Yk‖

and {Ck} will converge to the dominant eigenvalue μ1 = 1/(λj − α) of the matrix
(A − αI)−1. Finally, the corresponding eigenvalue of the matrix A is given by the
calculation

λj = 1/μ1 + α

Proof. Without loss of generality, we may assume that λ1 > λ2 > λ3 > ... > λn−1 > λn.
Select a constant α (α ≠ λj) that is closer to λj than to any of the other
eigenvalues of the matrix A; that is,

|λj − α| < |λi − α|   for each i = 1, ..., j − 1, j + 1, ..., n

According to Theorem 3.2, 1/(λj − α) is an eigenvalue of the matrix
(A − αI)−1 corresponding to the eigenvector Vj. Since

|λj − α| < |λi − α|   for each i = 1, ..., j − 1, j + 1, ..., n

taking reciprocals gives

|λj − α|−1 > |λi − α|−1   for each i = 1, ..., j − 1, j + 1, ..., n

so that μ1 = 1/(λj − α) is the dominant eigenvalue of the matrix (A − αI)−1,
and the calculation λj = 1/μ1 + α produces the desired eigenvalue of the
matrix A.

Remark: For practical implementations of the shifted-inverse power method,
a linear system solver is used to compute Yk = (A − αI)−1 Xk in each step by
solving the linear system (A − αI)Yk = Xk; evaluating the matrix (A − αI)−1
explicitly is far too expensive.
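Combining the shift with the linear solve gives the following illustrative sketch (my own Python, not code from the paper; since Ck is a norm and gives only |μ1|, I recover the signed eigenvalue of A with a Rayleigh quotient, an addition in place of the formula λj = 1/μ1 + α, which is appropriate for the symmetric example below):

```python
import numpy as np

def shifted_inverse_power_method(A, alpha, x0, max_iter=500, tol=1e-12):
    """Shifted-inverse power method: approximate the eigenvalue of A
    nearest to the shift alpha, and its unit eigenvector."""
    M = A - alpha * np.eye(A.shape[0])   # A - alpha I (alpha must not be an eigenvalue)
    x = x0 / np.linalg.norm(x0)
    c = 0.0
    for _ in range(max_iter):
        y = np.linalg.solve(M, x)        # solve (A - alpha I) Y_k = X_k
        c_new = np.linalg.norm(y)        # C_k approaches |mu_1| = 1/|lambda_j - alpha|
        x = y / c_new
        if abs(c_new - c) < tol:
            break
        c = c_new
    lam = x @ (A @ x)                    # Rayleigh quotient gives the signed eigenvalue
    return lam, x

# Example: eigenvalues of A are 3 and 1; the shift 0.8 is nearest to 1
A = np.array([[2.0, 1.0], [1.0, 2.0]])
lam, x = shifted_inverse_power_method(A, 0.8, np.array([1.0, 0.0]))
```

The closer α is chosen to λj, the smaller the ratio governing convergence becomes, so a good shift makes this iteration converge very quickly.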
