
Throughout this text, $V$ will be a vector space of finite dimension $n$ over a field $K$ and $T : V \to V$ will be a linear transformation.


1 Eigenvalues and Eigenvectors

A scalar $\lambda \in K$ is an eigenvalue of $T$ if there is a nonzero $v \in V$ such that $Tv = \lambda v$. In this case $v$ is called an eigenvector of $T$ corresponding to $\lambda$. Thus $\lambda \in K$ is an eigenvalue of $T$ if and only if $\ker(T - \lambda I) \neq 0$, and any nonzero element of this subspace is an eigenvector of $T$ corresponding to $\lambda$. Here $I$ denotes the identity mapping from $V$ to itself. Equivalently, $\lambda$ is an eigenvalue of $T$ if and only if $\det(T - \lambda I) = 0$. Therefore the eigenvalues of $T$ are exactly the roots in $K$ of the monic polynomial $\det(xI - T)$. This polynomial is called the characteristic polynomial of $T$ and is denoted by $c_T(x)$. Since the degree of $c_T(x)$ is $n$, the dimension of $V$, $T$ cannot have more than $n$ eigenvalues counted with multiplicities.

If $A \in K^{n \times n}$, then $A$ can be regarded as a linear mapping from $K^n$ to itself, and so the polynomial $c_A(x) = \det(xI_n - A)$ is the characteristic polynomial of the matrix $A$, and its roots in $K$ are the eigenvalues of $A$.

A subspace $W$ of $V$ is $T$-invariant if $T(W) \subseteq W$. The zero subspace and the full space are trivial examples of $T$-invariant subspaces. For an eigenvalue $\lambda$ of $T$ the subspace $E(\lambda) = \ker(T - \lambda I)$ is $T$-invariant and is called the eigenspace corresponding to $\lambda$. The dimension of $E(\lambda)$ is the geometric multiplicity of the eigenvalue $\lambda$, and the multiplicity of $\lambda$ as a root of the characteristic polynomial of $T$ is the algebraic multiplicity of $\lambda$.
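The following numerical sketch (assuming NumPy is available; the matrix is ours, chosen purely for illustration) computes the characteristic polynomial of a small matrix and checks that its roots are the eigenvalues.

```python
import numpy as np

# A concrete matrix used purely for illustration.
A = np.array([[2.0, 1.0, 0.0],
              [0.0, 2.0, 0.0],
              [0.0, 0.0, 3.0]])

# np.poly returns the coefficients of c_A(x) = det(xI - A), leading term first.
char_coeffs = np.poly(A)          # [1, -7, 16, -12]  <->  x^3 - 7x^2 + 16x - 12

# The eigenvalues of A are the roots of c_A(x): 2 (twice) and 3.
print(np.linalg.eigvals(A))       # [2. 2. 3.]
print(np.roots(char_coeffs))      # [3. 2. 2.] up to ordering
```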
1.1 (i). Let $W$ be a $T$-invariant subspace of $V$, and let $T_1$ and $T_2$ be the linear mappings induced by $T$ on $W$ and $V/W$ respectively. Then $c_T(x) = c_{T_1}(x)\, c_{T_2}(x)$.

(ii). Let $V = W_1 \oplus W_2$, where $W_1$ and $W_2$ are $T$-invariant subspaces, and let $T_1$ and $T_2$ be the linear mappings induced by $T$ on $W_1$ and $W_2$. Then $c_T(x) = c_{T_1}(x)\, c_{T_2}(x)$.
1.2 The geometric multiplicity of an eigenvalue $\lambda$ cannot exceed its algebraic multiplicity.

Proof. Let the geometric multiplicity of $\lambda$ be $m$. Then $\lambda$ has $m$ linearly independent eigenvectors $u_1, \dots, u_m$, that is, $Tu_i = \lambda u_i$ for $i = 1, \dots, m$. Extend this to a basis of $V$: $B = \{u_1, \dots, u_m, u_{m+1}, \dots, u_n\}$. Then
$$[T]_B = \begin{pmatrix} \lambda I_m & C \\ 0 & A \end{pmatrix}.$$
Therefore $c_T(x) = (x - \lambda)^m c_A(x)$, and the algebraic multiplicity of $\lambda$ is at least $m$.
Recall that $L(V)$, the set of linear operators on $V$, is a vector space over $K$ of dimension $n^2$. Thus, if $T$ is a linear operator on $V$, then $I = T^0, T, \dots, T^{n^2}$ are linearly dependent, that is, there are scalars $\alpha_0, \alpha_1, \dots, \alpha_{n^2}$, not all of which are zero, such that $\alpha_0 I + \alpha_1 T + \cdots + \alpha_{n^2} T^{n^2} = 0$ (in $L(V)$). Therefore, if $f(x) = \alpha_0 + \alpha_1 x + \cdots + \alpha_{n^2} x^{n^2}$, then $f(T) = 0$. This shows that $S = \{p(x) \in K[x] : p(T) = 0\}$ is a nonzero principal ideal of $K[x]$. The monic polynomial which generates this ideal is called the minimal polynomial of $T$. We will denote the minimal polynomial of $T$ by $m_T(x)$.
Also, for a fixed vector $y$ in $V$, the vectors $y, Ty, \dots, T^n y$ are linearly dependent, and so there are scalars $\alpha_0, \alpha_1, \dots, \alpha_n$ in $K$, not all of which are zero, such that $\alpha_0 y + \alpha_1 Ty + \cdots + \alpha_n T^n y = 0$. Similarly the ideal $\{p(x) \in K[x] : p(T)(y) = 0\}$ of $K[x]$ is nonzero and principal. The monic polynomial $m_T^y(x)$ which generates this ideal is called the annihilator of $y$ with respect to $T$. Clearly, $m_T^y(x)$ divides the minimal polynomial $m_T(x)$.
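The argument above is constructive: the minimal polynomial records the first linear dependence among $I, A, A^2, \dots$. A rough numerical sketch of this idea, assuming NumPy (the function name `minimal_polynomial` is ours, not a library routine):

```python
import numpy as np

def minimal_polynomial(A, tol=1e-10):
    """Return monic coefficients [a_0, ..., a_{d-1}, 1] of the minimal polynomial
    of A, found as the first linear dependence among vec(I), vec(A), vec(A^2), ..."""
    n = A.shape[0]
    powers = [np.eye(n).ravel()]                 # vec(A^0)
    for d in range(1, n * n + 1):
        powers.append(np.linalg.matrix_power(A, d).ravel())
        M = np.column_stack(powers[:-1])         # columns vec(I), ..., vec(A^{d-1})
        # Least-squares solve M c = -vec(A^d); a tiny residual means dependence.
        c, *_ = np.linalg.lstsq(M, -powers[-1], rcond=None)
        if np.linalg.norm(M @ c + powers[-1]) < tol:
            return np.append(c, 1.0)             # a_0 + a_1 x + ... + x^d
    raise RuntimeError("no dependence found")

A = np.array([[2.0, 1.0], [0.0, 2.0]])
print(minimal_polynomial(A))                     # [ 4. -4.  1.]  <->  (x - 2)^2
```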
1.3 Primary decomposition theorem. Let the minimal polynomial of $T$ in $K[x]$ be $m_T(x) = p_1(x)^{r_1} \cdots p_k(x)^{r_k}$, where $p_1(x), \dots, p_k(x)$ are distinct monic irreducible polynomials and $r_1, \dots, r_k$ are positive integers. Then
$$V = \ker p_1(T)^{r_1} \oplus \cdots \oplus \ker p_k(T)^{r_k},$$
a direct sum of $T$-invariant subspaces of $V$. If for each $i$, $T_i$ is the linear operator on $\ker p_i(T)^{r_i}$ induced by $T$, then the minimal polynomial of $T_i$ is $p_i(x)^{r_i}$.
Proof. For each $i$, let $q_i(x) = p_1(x)^{r_1} \cdots p_{i-1}(x)^{r_{i-1}} p_{i+1}(x)^{r_{i+1}} \cdots p_k(x)^{r_k} = m_T(x)/p_i(x)^{r_i}$. Then $\gcd(q_1(x), \dots, q_k(x)) = 1$, and so there are polynomials $s_1(x), \dots, s_k(x)$ in $K[x]$ such that $s_1(x)q_1(x) + \cdots + s_k(x)q_k(x) = 1$. Thus $s_1(T)q_1(T) + \cdots + s_k(T)q_k(T) = I$, and for any $v$ in $V$,
$$v = s_1(T)q_1(T)v + \cdots + s_k(T)q_k(T)v.$$
Since $p_i(T)^{r_i} s_i(T)q_i(T)v = s_i(T)m_T(T)v = 0$, we have $s_i(T)q_i(T)v \in \ker p_i(T)^{r_i}$. Therefore,
$$V = \ker p_1(T)^{r_1} + \cdots + \ker p_k(T)^{r_k}.$$

Next, to show that this sum is actually direct, assume that $v_1 + \cdots + v_k = 0$ with $v_i \in \ker p_i(T)^{r_i}$. Clearly, for $i \neq j$, $q_i(T)v_j = 0$. Now for any $i$,
$$v_i = -(v_1 + \cdots + v_{i-1} + v_{i+1} + \cdots + v_k),$$
and so $q_i(T)v_i = 0$. Also $p_i(T)^{r_i}v_i = 0$. Since $\gcd(q_i(x), p_i(x)^{r_i}) = 1$, there are $a_i(x), b_i(x) \in K[x]$ such that $a_i(x)q_i(x) + b_i(x)p_i(x)^{r_i} = 1$. Thus $v_i = a_i(T)q_i(T)v_i + b_i(T)p_i(T)^{r_i}v_i = 0$. Hence the sum is direct.

For the second part, note that $p_i(T)^{r_i}v = 0$ for all $v$ in $V_i = \ker p_i(T)^{r_i}$. Thus, if $m_i(x)$ is the minimal polynomial of $T_i$, then $m_i(x)$ divides $p_i(x)^{r_i}$. Since $p_i(x)$ is monic and irreducible, $m_i(x) = p_i(x)^s$ for some $s \leq r_i$. Now the polynomial $f(x) = q_i(x)p_i(x)^s$ is such that $f(T) = 0$. Therefore $m_T(x)$ divides $f(x)$. Hence $s = r_i$ and $m_i(x) = p_i(x)^{r_i}$.
Let $T \in L(V)$ and let $m_T(x)$ be as in 1.3. Then by the primary decomposition theorem it follows that if $T_i$ is the restriction of $T$ to the $T$-invariant subspace $V_i = \ker p_i(T)^{r_i}$, and $B_i$ is a basis of $V_i$, then $B = \bigcup_{i=1}^k B_i$ is a basis for $V$ and $[T]_B = \mathrm{diag}([T_1]_{B_1}, \dots, [T_k]_{B_k})$.
1.4 There exists a vector $y$ in $V$ such that $m_T^y(x) = m_T(x)$. In particular, $\deg m_T(x) \leq \dim V$.

Proof. Write $m_T(x) = p_1(x)^{r_1} \cdots p_k(x)^{r_k}$ as in the statement of the above proposition, and $V = \ker p_1(T)^{r_1} \oplus \cdots \oplus \ker p_k(T)^{r_k}$. Then for each $i$ there exists $y_i \in \ker p_i(T)^{r_i}$ such that $p_i(T)^{r_i}y_i = 0$ and $p_i(T)^{r_i - 1}y_i \neq 0$; otherwise $m_T(x)$ would not be the minimal polynomial. If $y = y_1 + \cdots + y_k$, then clearly $m_T^y(x) = m_T(x)$.
We now prove the Cayley-Hamilton theorem: if $T \in L(V)$, then $T$ satisfies its characteristic polynomial, that is, $c_T(T) = 0$; equivalently, the minimal polynomial divides the characteristic polynomial. For that we need the following result.

1.5 If $m_T(x) = p(x)^r$, where $p(x)$ is a monic irreducible polynomial of degree $m$, then $m$ divides $n$.
Proof. Let $y \in V$ be such that $m_T^y(x) = m_T(x)$. Then $y, Ty, \dots, T^{mr-1}y$ are linearly independent, and the subspace $W$ generated by these vectors is $T$-invariant and of dimension $mr$. If $V = W$, then $n = mr$ and the result is proved.

If $V \neq W$, then $\dim V/W = n - mr < n$. Suppose that the statement is true for all vector spaces of dimension at most $n - 1$. Let $T'$ be the linear operator on $V/W$ induced by $T$. Since the minimal polynomial of $T'$ divides $m_T(x)$, the minimal polynomial of $T'$ is $p(x)^t$ for some $t \leq r$. By the induction hypothesis, $m$ divides $n - mr$, and so $m$ divides $n$.
1.6 Cayley-Hamilton theorem. Let $V$ be a finite dimensional vector space over $K$ of dimension $n$. Let the minimal polynomial of $T$ be $m_T(x) = p_1(x)^{r_1} \cdots p_k(x)^{r_k}$, where each $p_i(x)$ is a monic irreducible polynomial of degree $m_i$, and let $\dim \ker p_i(T)^{r_i} = n_i$. Then $m_i$ divides $n_i$ and the characteristic polynomial is $c_T(x) = p_1(x)^{n_1/m_1} \cdots p_k(x)^{n_k/m_k}$. In particular, $m_T(x)$ divides $c_T(x)$.
Proof. First we consider the case $k = 1$, that is, $m_T(x) = p(x)^r$ with $\deg p(x) = m$. We must show that $c_T(x) = p(x)^{n/m}$. We prove this by induction on $n$. By 1.5, $n/m$ is a positive integer.

If $n = 1$, then $V = Kv$ for any $v \in V \setminus \{0\}$, and so $Tv = \lambda v$ for some $\lambda \in K$. Therefore $m_T(x) = x - \lambda = c_T(x)$. Assume that the hypothesis is true for all vector spaces of dimension at most $n - 1$. Let $\dim V = n$ and let $y \in V$ be such that $m_T^y(x) = m_T(x)$ (1.4). Then $y, Ty, \dots, T^{mr-1}y$ are linearly independent and $W = \langle y, Ty, \dots, T^{mr-1}y \rangle$ is a nonzero $T$-invariant subspace of dimension $mr$.

If $V = W$, then $n = mr$. Let $B = \{y_1 = y,\ y_2 = Ty,\ \dots,\ y_{mr} = T^{mr-1}y\}$ be an ordered basis of $V$ and let $p(x)^r = \alpha_0 + \alpha_1 x + \cdots + \alpha_{mr-1}x^{mr-1} + x^{mr}$. Then
$$Ty_1 = y_2,\quad Ty_2 = y_3,\quad \dots,\quad Ty_{mr-1} = y_{mr},$$
$$Ty_{mr} = T^{mr}y = (-\alpha_0)y_1 + (-\alpha_1)y_2 + \cdots + (-\alpha_{mr-1})y_{mr},$$
and so
$$[T]_B = \begin{pmatrix}
0 & 0 & \cdots & 0 & -\alpha_0 \\
1 & 0 & \cdots & 0 & -\alpha_1 \\
0 & 1 & \cdots & 0 & -\alpha_2 \\
\vdots & & \ddots & & \vdots \\
0 & 0 & \cdots & 1 & -\alpha_{mr-1}
\end{pmatrix},$$
the companion matrix of $p(x)^r$. Therefore $c_T(x) = p(x)^r = p(x)^{n/m}$.
If $W \neq V$, and if $T$ induces linear operators $T_1$ and $T_2$ on $W$ and $V/W$ respectively, then $c_T(x) = c_{T_1}(x)\, c_{T_2}(x)$. Since $\dim W = mr < n$, by the induction hypothesis $c_{T_1}(x) = p(x)^{mr/m}$. Also, as $\dim V/W = n - mr < n$, $c_{T_2}(x) = p(x)^{(n-mr)/m}$. Hence $c_T(x) = p(x)^{n/m}$.
Now consider the general case, $m_T(x) = p_1(x)^{r_1} \cdots p_k(x)^{r_k}$, where each $p_i(x)$ is a monic irreducible polynomial and each $r_i$ is a positive integer. The subspace $V_i = \ker p_i(T)^{r_i}$ is $T$-invariant. Let $T_i$ be the linear operator on $V_i$ induced by $T$. Then by 1.3 it follows that $V = V_1 \oplus \cdots \oplus V_k$ and $m_{T_i}(x) = p_i(x)^{r_i}$, $i = 1, \dots, k$. Also by 1.1 we have that $c_T(x) = c_{T_1}(x) \cdots c_{T_k}(x)$. By the above paragraph, $c_{T_i}(x) = p_i(x)^{n_i/m_i}$, where $n_i = \dim V_i$ and $m_i = \deg p_i(x)$. Therefore $c_T(x) = p_1(x)^{n_1/m_1} \cdots p_k(x)^{n_k/m_k}$.

Also $m_{T_i}(x) = p_i(x)^{r_i}$ on $V_i$, and so by 1.4 there exists $y_i \in V_i$ such that $m_{T_i}^{y_i}(x) = m_{T_i}(x)$. Hence $y_i, Ty_i, \dots, T^{m_ir_i - 1}y_i$ are linearly independent in $V_i$. Thus $m_ir_i \leq n_i$, and $r_i \leq n_i/m_i$. Hence $m_T(x)$ divides $c_T(x)$.
The above result shows that if, for a linear operator $T$, $c_T(x) = p_1(x)^{m_1} \cdots p_k(x)^{m_k}$, where $p_1(x), \dots, p_k(x)$ are distinct monic irreducible polynomials and $m_1, \dots, m_k$ are positive integers, then $c_T(T) = 0$ and $m_T(x) = p_1(x)^{r_1} \cdots p_k(x)^{r_k}$, where each $r_i$ is a positive integer and $r_i \leq m_i$. For example, if $V$ is a vector space over $\mathbb{R}$ of dimension 7 and, for a linear operator $T$ on $V$, $c_T(x) = (x^2+1)^2(x-1)^3$, then $m_T(x)$ is one of the following polynomials: $(x^2+1)(x-1)$, $(x^2+1)(x-1)^2$, $(x^2+1)(x-1)^3$, $(x^2+1)^2(x-1)$, $(x^2+1)^2(x-1)^2$ or $(x^2+1)^2(x-1)^3$. In particular, $\lambda$ is an eigenvalue of $T$ if and only if $\lambda$ is a root of its minimal polynomial.
We present another proof of the Cayley-Hamilton theorem. The advantage of the following proof is that it shows that the theorem is also true for matrices over a commutative ring with identity: just read $K$ as a commutative ring with identity in the following proof.

1.7 Let $A \in K^{n \times n}$. Then $c_A(A) = 0$.
Proof. Let $c_A(x) = a_0 + a_1 x + \cdots + a_{n-1}x^{n-1} + x^n$ and let $B(x)$ be the classical adjoint (adjugate) matrix of $xI_n - A$. The entries of $B(x)$ are polynomials in $x$ of degree at most $n - 1$, and so we can write $B(x) = B_0 + B_1 x + \cdots + B_{n-1}x^{n-1}$, where $B_i \in K^{n \times n}$ for all $i$. Thus, by the definition of the adjoint of a matrix, $(xI_n - A)B(x) = c_A(x)I_n$, that is,
$$(xI_n - A)(B_0 + B_1 x + \cdots + B_{n-1}x^{n-1}) = (a_0 + a_1 x + \cdots + a_{n-1}x^{n-1} + x^n)I_n.$$
Equating the coefficients of the powers of $x$ we obtain
$$-AB_0 = a_0 I_n,\qquad B_0 - AB_1 = a_1 I_n,\qquad \dots,\qquad B_{r-1} - AB_r = a_r I_n,\qquad \dots,\qquad B_{n-1} = I_n,$$
and now multiplying these equations by increasing powers of $A$,
$$-AB_0 = a_0 I_n,\qquad AB_0 - A^2B_1 = a_1 A,\qquad \dots,\qquad A^rB_{r-1} - A^{r+1}B_r = a_r A^r,\qquad \dots,\qquad A^nB_{n-1} = A^n.$$
On adding we obtain $0 = a_0 I_n + a_1 A + \cdots + a_{n-1}A^{n-1} + A^n$, that is, $c_A(A) = 0$.
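A quick numerical sanity check of the theorem, assuming NumPy: evaluate $c_A(A)$ for a random matrix by Horner's rule and confirm it is zero up to round-off.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))

# Coefficients of c_A(x), highest power first: [1, a_{n-1}, ..., a_0].
coeffs = np.poly(A)

# Evaluate c_A(A) = A^n + a_{n-1} A^{n-1} + ... + a_0 I by Horner's rule.
C = np.zeros_like(A)
for c in coeffs:
    C = C @ A + c * np.eye(4)

print(np.allclose(C, 0))  # True, up to floating-point round-off
```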
2 Diagonalizable and triangulable operators

2.1 Eigenvectors corresponding to distinct eigenvalues are linearly independent.

Proof. Let $\lambda_1, \dots, \lambda_k$ be distinct eigenvalues of $T$ and let $u_1, \dots, u_k$ be corresponding eigenvectors, $Tu_i = \lambda_i u_i$, $i = 1, \dots, k$. For each $i$, define
$$S_i = \frac{(T - \lambda_1 I)\cdots(T - \lambda_{i-1}I)(T - \lambda_{i+1}I)\cdots(T - \lambda_k I)}{(\lambda_i - \lambda_1)\cdots(\lambda_i - \lambda_{i-1})(\lambda_i - \lambda_{i+1})\cdots(\lambda_i - \lambda_k)}.$$
Then $S_i \in L(V)$, $S_iu_j = 0$ for $i \neq j$, and $S_iu_i = u_i$. Now if $\alpha_1 u_1 + \cdots + \alpha_k u_k = 0$, then $0 = S_i(\alpha_1 u_1 + \cdots + \alpha_k u_k) = \alpha_i u_i$, and so $\alpha_i = 0$.
A linear operator $T$ is diagonalizable if $V$ has a basis consisting of eigenvectors of $T$. Equivalently, $T$ is diagonalizable if there is a basis $B$ of $V$ such that $[T]_B$ is a diagonal matrix. It is immediate from the above statement that if $T$ has $n$ (the dimension of $V$) distinct eigenvalues, then $T$ is diagonalizable. An $n \times n$ matrix is diagonalizable if it is similar to a diagonal matrix.
2.2 Let $\lambda_1, \dots, \lambda_k$ be the distinct eigenvalues of $T \in L(V)$. Then the following statements are equivalent.
(i). $T$ is diagonalizable.
(ii). $m_T(x) = (x - \lambda_1)\cdots(x - \lambda_k)$.
(iii). $V = E(\lambda_1) \oplus \cdots \oplus E(\lambda_k)$, where $E(\lambda_i) = \ker(T - \lambda_i I)$.
(iv). $c_T(x)$ splits over $K$ and the geometric multiplicity of each eigenvalue is equal to its algebraic multiplicity.
Proof. (i) implies (ii) is clear. (ii) implies (iii) follows from the Primary Decomposition Theorem. Now assume (iii). Let $\dim E(\lambda_i) = n_i$ and let $B_i$ be an ordered basis of $E(\lambda_i)$. Each $E(\lambda_i)$ is a $T$-invariant subspace, and if $T_i$ is the restriction of $T$ to $E(\lambda_i)$, then $[T_i]_{B_i} = \lambda_i I_{n_i}$. Now $B = \bigcup_{i=1}^k B_i$ is a basis of $V$ and $[T]_B = \mathrm{diag}([T_1]_{B_1}, \dots, [T_k]_{B_k})$. Hence $c_T(x) = (x - \lambda_1)^{n_1}\cdots(x - \lambda_k)^{n_k}$, and so the geometric multiplicity of each eigenvalue equals its algebraic multiplicity. This proves that (iii) implies (iv). Finally, if (iv) holds then clearly $V$ has a basis consisting of eigenvectors of $T$.
If $A \in K^{n \times n}$, to decide whether $A$ is diagonalizable, first check whether the characteristic polynomial splits over $K$, and then check, for each root $\lambda$ of $c_A(x)$, that $\mathrm{nullity}(A - \lambda I)$ equals the algebraic multiplicity of $\lambda$. In this case, if we let $P$ be any matrix whose columns are $n$ linearly independent eigenvectors of $A$, then $P^{-1}AP$ is a diagonal matrix.
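A numerical sketch of this recipe, assuming NumPy; for a diagonalizable matrix, `np.linalg.eig` already returns such a $P$.

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])

# Columns of P are eigenvectors of A; w holds the eigenvalues (here 5 and 2).
w, P = np.linalg.eig(A)

# If P is invertible (A diagonalizable), P^{-1} A P is diagonal.
D = np.linalg.inv(P) @ A @ P
print(np.allclose(D, np.diag(w)))  # True
```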
A linear operator $T$ is triangulable if $V$ has an ordered basis $B$ such that $[T]_B$ is an upper triangular matrix. An $n \times n$ matrix $A$ is triangulable if $A$ is similar to an upper triangular matrix.
2.3 For $T \in L(V)$ the following statements are equivalent.
(i). $T$ is triangulable.
(ii). The characteristic polynomial $c_T(x)$ splits over $K$.
(iii). Every nonzero $T$-invariant subspace of $V$ contains an eigenvector of $T$.
Proof. (i) implies (ii) is obvious.

(ii) implies (iii). Let $W$ be any nonzero $T$-invariant subspace of $V$. If $T'$ is the restriction of $T$ to $W$, then $T' \in L(W)$ and $c_{T'}(x)$ divides $c_T(x)$, and so $c_{T'}(x)$ splits. If $\lambda \in K$ is a root of $c_{T'}(x)$ and $w \in W$ is an eigenvector of $T'$ corresponding to $\lambda$, then $Tw = T'w = \lambda w$. Hence $W$ contains an eigenvector of $T$.

(iii) implies (ii). Let $q(x) \in K[x]$ be a monic irreducible factor of $c_T(x)$. Then $W = \ker q(T)$ is a nonzero $T$-invariant subspace of $V$, and so contains an eigenvector of $T$. Let $w \in W$ be an eigenvector of $T$ corresponding to an eigenvalue $\lambda$. If $T'$ is the restriction of $T$ to $W$, then $q(T') = 0$. Since $T'w = Tw = \lambda w$, $x - \lambda$ divides $q(x)$. But $q(x)$ is monic irreducible, and so $q(x) = x - \lambda$. Hence $c_T(x)$ splits.

(ii) implies (i). By induction on $n$, the dimension of $V$. The statement is obviously true for $n = 1$. Assume that the statement is true for all vector spaces over $K$ of dimension at most $n - 1$. Let $V$ be a vector space over $K$ of dimension $n$ and $T \in L(V)$ such that $c_T(x)$ splits over $K$. Then $T$ has an eigenvalue $\lambda \in K$; let $w$ be a corresponding eigenvector. Then $W = \langle w \rangle$ is a $T$-invariant subspace of $V$ and $T$ induces a linear operator $\bar T$ on $V/W$. Since $c_{\bar T}(x)$ divides $c_T(x)$, $c_{\bar T}(x)$ splits. Now $V/W$ is an $(n-1)$-dimensional vector space over $K$, so by induction there is an ordered basis $\bar B = \{\bar x_2, \dots, \bar x_n\}$, $\bar x_i = x_i + W$, such that $[\bar T]_{\bar B}$ is an upper triangular matrix. Thus if $B = \{x_1 = w, x_2, \dots, x_n\}$ is an ordered basis of $V$, then $[T]_B$ is upper triangular.
Now we show how to transform a given matrix to an upper triangular matrix when its characteristic polynomial splits. Here we assume that the entries of the matrix are either from $\mathbb{R}$ or from $\mathbb{C}$; however, the discussion is valid for arbitrary fields with suitable modifications.

2.4 Let $A \in \mathbb{C}^{n \times n}$. Then there is an invertible matrix $P$ such that $P^{-1}AP$ is an upper triangular matrix. If all the eigenvalues of $A$ are real, then $P$ can be taken to be a matrix with real entries.
Proof. By induction on $n$. If $n = 1$ the statement is obvious. Assume that the statement is true for all matrices of order less than $n$. Let $\lambda$ be an eigenvalue of $A$ and $u$ a corresponding eigenvector. Let $P_1$ be an invertible matrix whose first column is $u$. Then the first column of $P_1^{-1}AP_1$ is $\lambda e_1$, and so
$$P_1^{-1}AP_1 = \begin{pmatrix} \lambda & * \\ 0 & A_1 \end{pmatrix}.$$
Here $*$ denotes entries of the matrix which are of no interest to us.

Now by the induction hypothesis there is an invertible matrix $S$ such that $S^{-1}A_1S$ is upper triangular. If $P_2 = \mathrm{diag}(1, S)$, then for $P = P_1P_2$,
$$P^{-1}AP = \begin{pmatrix} \lambda & * \\ 0 & S^{-1}A_1S \end{pmatrix},$$
which is an upper triangular matrix.
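In floating-point practice one usually obtains such a triangularization with a unitary $P$ via the Schur decomposition; a sketch assuming SciPy is available:

```python
import numpy as np
from scipy.linalg import schur

A = np.array([[0.0, -1.0],
              [1.0,  0.0]])

# T is upper triangular (complex Schur form) and Z is unitary with Z* A Z = T.
T, Z = schur(A, output='complex')
print(np.allclose(Z @ T @ Z.conj().T, A))  # True
```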
Note that in the above proof we can arrange the induction so that equal eigenvalues appear together on the diagonal of the triangular matrix. If $T \in L(V)$ and the characteristic polynomial splits, then by primary decomposition it follows that there is a basis $B$ such that the matrix of $T$ is a block diagonal matrix, each block being an upper triangular matrix with a single repeated diagonal entry. This can be achieved for matrices in the following manner.
2.5 Let $A \in \mathbb{C}^{n \times n}$, and let $c_A(x) = (x - \lambda_1)^{n_1}\cdots(x - \lambda_k)^{n_k}$, where the $\lambda_i$ are all distinct. Then $A$ is similar to the matrix $\mathrm{diag}(A_1, \dots, A_k)$, where each $A_i$ is an upper triangular $n_i \times n_i$ matrix with $\lambda_i$ on the diagonal.
Proof. By the above result, $A$ is similar to an upper triangular matrix, and we can assume that equal eigenvalues appear together on the diagonal. Now suppose that the $i$-th diagonal entry is $\lambda_i$ and the $j$-th diagonal entry is $\lambda_j$, with $\lambda_i \neq \lambda_j$ and $i < j$. Consider the product $(I - \alpha E_{ij})A(I + \alpha E_{ij})$, where $E_{ij} = e_ie_j^t$ and $\alpha$ is a scalar. Since $i < j$ and the product of upper triangular matrices is upper triangular, this matrix is upper triangular, and its $(i,j)$-th entry is
$$e_i^t(I - \alpha E_{ij})A(I + \alpha E_{ij})e_j = (e_i^t - \alpha e_j^t)A(e_j + \alpha e_i) = e_i^tAe_j + \alpha\,e_i^tAe_i - \alpha\,e_j^tAe_j - \alpha^2 e_j^tAe_i.$$
Since $A$ is upper triangular, $e_j^tAe_i = 0$ for $j > i$. Thus, if we want the $(i,j)$-th entry to be zero after this similarity transform, $\alpha$ should be such that $e_i^tAe_j + \alpha\,e_i^tAe_i - \alpha\,e_j^tAe_j = 0$, that is, $\alpha = e_i^tAe_j/(e_j^tAe_j - e_i^tAe_i)$.

Working through the remaining off-diagonal entries in a suitable order (the last row pair first, then moving up), and choosing a suitable value of $\alpha$ at each step, every entry whose row and column indices correspond to different eigenvalues can be made zero by similarity transformations of this kind. What remains is the required block diagonal form.
Let $S$ and $T$ be two diagonalizable operators. We say that $S$ and $T$ are simultaneously diagonalizable if there is a basis $B$ such that $[S]_B$ and $[T]_B$ are both diagonal matrices.
2.6 $S$ and $T$ are simultaneously diagonalizable if and only if $S$ and $T$ commute.

Proof. If $S$ and $T$ are simultaneously diagonalizable they clearly commute, since diagonal matrices commute. Conversely, assume that $S$ and $T$ are diagonalizable commuting operators. Let $\lambda_1, \dots, \lambda_k$ be the distinct eigenvalues of $S$, and let $B = \bigcup_{i=1}^k B_i$ be a basis of $V$ such that $B_i$ is a set of eigenvectors of $S$ corresponding to $\lambda_i$. Then $[S]_B = \mathrm{diag}(\lambda_1 I_{n_1}, \dots, \lambda_k I_{n_k})$. Write
$$[T]_B = \begin{pmatrix} A_{11} & \cdots & A_{1k} \\ \vdots & & \vdots \\ A_{k1} & \cdots & A_{kk} \end{pmatrix},$$
partitioned conformally with the blocks of $[S]_B$. Since these matrices commute, equating blocks gives $A_{ij} = 0$ for $i \neq j$. Thus $[T]_B = \mathrm{diag}(A_{11}, \dots, A_{kk})$. Therefore, if $V_i = \langle B_i \rangle$, then $V = V_1 \oplus \cdots \oplus V_k$; and if $T_i$ is the restriction of $T$ to $V_i$, then $[T_i]_{B_i} = A_{ii}$. This implies that $m_{T_i}(x)$ divides $m_T(x)$, and since $m_T(x)$ is a product of distinct linear factors (2.2), each $T_i$ is diagonalizable. Let $B_i'$ be a basis of $V_i$ consisting of eigenvectors of $T_i$. Then for $B' = \bigcup_{i=1}^k B_i'$ we have $[T]_{B'} = \mathrm{diag}([T_1]_{B_1'}, \dots, [T_k]_{B_k'})$, a diagonal matrix. Also, for each $x \in V_i$, $Sx = \lambda_i x$, so $[S]_{B'}$ is a diagonal matrix as well. Hence $S$ and $T$ are simultaneously diagonalizable.
Similarly we define simultaneously triangulable operators: there is a basis with respect to which the matrices of both operators are upper triangular.

2.7 If triangulable linear operators $S$ and $T$ commute, then they are simultaneously triangulable.
Proof. Let $\lambda$ be an eigenvalue of $T$ and let $U = \ker(T - \lambda I)$. Then for $u \in U$, $(T - \lambda I)Su = S(T - \lambda I)u = 0$. Thus $Su \in U$ and $U$ is $S$-invariant. Let $u_1$ be an eigenvector of $S$ in $U$. Then $Su_1 = \mu u_1$ for some scalar $\mu$ and $Tu_1 = \lambda u_1$. Now consider $V/U$ and the operators $S'$ and $T'$ induced by $S$ and $T$ on $V/U$; the result now follows by induction.
Let $T \in L(V)$ and let $m_T(x) = \prod_{i=1}^k (x - \lambda_i)^{m_i}$. Resolving into partial fractions,
$$\frac{1}{m_T(x)} = \sum_{i=1}^k \sum_{j=1}^{m_i} \frac{a_{ij}}{(x - \lambda_i)^j}.$$
Thus if $p_i(x) = \prod_{l=1,\,l\neq i}^{k}(x - \lambda_l)^{m_l}$, then
$$1 = \sum_{i=1}^k \Bigl(\sum_{j=1}^{m_i} a_{ij}(x - \lambda_i)^{m_i - j}\Bigr)p_i(x).$$
If $q_i(x) = \sum_{j=1}^{m_i} a_{ij}(x - \lambda_i)^{m_i - j}$, then
$$\sum_{i=1}^k p_i(x)q_i(x) = 1.$$
Define $E_i = p_i(T)q_i(T)$. Then for each $i$, $E_i \in L(V)$, and it is easy to see that this is a resolution of the identity, that is,
$$E_1 + \cdots + E_k = I,\qquad E_iE_j = 0 \text{ for } i \neq j,\qquad E_i^2 = E_i.$$
Since the $E_i$ are diagonalizable (being projections) and commuting, they are simultaneously diagonalizable. Let $B$ be such a basis. Then for each $i$,
$$[E_i]_B = \mathrm{diag}(0, \dots, 0, I_{n_i}, 0, \dots, 0),$$
where $n_i$ is the algebraic multiplicity of the eigenvalue $\lambda_i$. Let $D = \sum_{i=1}^k \lambda_iE_i$. Then $[D]_B$ is a diagonal matrix and so $D$ is diagonalizable. If $N = T - D$, then $N = \sum_{i=1}^k (T - \lambda_iI)E_i$. Since all these linear operators are polynomials in $T$, they all commute. Therefore, if $r = \max\{m_1, \dots, m_k\}$, we have $N^r = 0$. Hence we have the following.
2.8 Let $T$ be a triangulable operator on $V$. Then $T = D + N$, where $D$ is diagonalizable and $N$ is nilpotent with $DN = ND$. Moreover, $D$ and $N$ are uniquely determined by these properties.

Only the uniqueness part remains to be proved. Suppose that there are also $D'$ diagonalizable and $N'$ nilpotent with $D'N' = N'D'$ such that $T = D' + N'$. Then $D - D' = N' - N$. Now $D'T = D'(D' + N') = TD'$, and as $D$ is a polynomial in $T$, $DD' = D'D$. Similarly, $NN' = N'N$. This shows that $N$ and $N'$ are simultaneously triangulable, and so $N' - N$ is nilpotent. Also, as $D$ and $D'$ are commuting and diagonalizable, $D - D'$ is diagonalizable. But then $D - D' = N' - N$ implies that $D - D'$ is diagonalizable as well as nilpotent. Hence $D - D' = 0$, and consequently $N = N'$.
3 The Jordan form

Let $T \in L(V)$ and let $\lambda$ be an eigenvalue of $T$. For a positive integer $r$, the subspace $E_r(\lambda) = \ker(T - \lambda I)^r$ is called the generalized eigenspace of order $r$ associated with $\lambda$; $E_1(\lambda)$ is the eigenspace associated with $\lambda$. Since $V$ is finite dimensional, there is a positive integer $p$ such that
$$0 = E_0(\lambda) \subseteq E_1(\lambda) \subseteq \cdots \subseteq E_p(\lambda) = E_{p+1}(\lambda) = \cdots.$$
An element $x \in E_r(\lambda) \setminus E_{r-1}(\lambda)$ is called a generalized eigenvector of $T$ of order $r$ corresponding to $\lambda$. Clearly, if $x$ is a generalized eigenvector of order $r$ then $(T - \lambda I)x$ is a generalized eigenvector of order $r - 1$.
A sequence of nonzero vectors $x_1, \dots, x_k$ is called a Jordan chain of length $k$ associated with the eigenvalue $\lambda$ if
$$Tx_1 = \lambda x_1,\quad Tx_2 = \lambda x_2 + x_1,\quad \dots,\quad Tx_k = \lambda x_k + x_{k-1}.$$
3.1 A Jordan chain consists of linearly independent vectors.

Proof. Let $x_1, \dots, x_k$ be a Jordan chain for $T$ associated with the eigenvalue $\lambda$. Assume that $\alpha_1 x_1 + \cdots + \alpha_k x_k = 0$ and that $r$ is the largest index such that $\alpha_r \neq 0$. Clearly $r > 1$. Write $x_r = \sum_{i=1}^{r-1}(-\alpha_r^{-1}\alpha_i)x_i$ and operate with $(T - \lambda I)^{r-1}$ on both sides to get $x_1 = 0$, a contradiction.
The length of a Jordan chain cannot exceed the dimension of the space, and the subspace generated by a Jordan chain is $T$-invariant. If $B = \{x_1, \dots, x_k\}$ consists of a Jordan chain, $W = \langle B \rangle$, and $T'$ is the linear operator on $W$ induced by $T$, then
$$[T']_B = \begin{pmatrix}
\lambda & 1 & & \\
 & \lambda & \ddots & \\
 & & \ddots & 1 \\
 & & & \lambda
\end{pmatrix}.$$
This matrix is called the Jordan block of size $k$ associated with the eigenvalue $\lambda$, and we denote it by $J_k(\lambda)$. Note that $J_k(\lambda) - \lambda I_k$ is a nilpotent matrix of order $k$.

If $V$ has a basis which is a disjoint union of Jordan chains for $T$, then the matrix representation of $T$ with respect to this basis is a block diagonal matrix with Jordan blocks on the diagonal. Such a basis is called a Jordan basis of $V$ for $T$, and the corresponding matrix representation a Jordan canonical form for $T$.
3.2 Existence of the Jordan canonical form. If the characteristic polynomial of $T$ splits over $K$, then $V$ has a Jordan basis for $T$.

Proof. First assume that $T$ is nilpotent. We prove the result by induction on $n$. If $T = 0$, or in particular $n = 1$, then any basis of $V$ is a Jordan basis. Suppose that $T \neq 0$ and the statement holds for all vector spaces over $K$ of dimension less than $n$.

Since $T$ is nilpotent, $W = \mathrm{im}\,T$ is a proper $T$-invariant subspace of $V$. Let $T'$ be the restriction of $T$ to $W$. Then $T' \in L(W)$ and $\dim W < n$, so by the induction hypothesis $W$ has a Jordan basis $B'$ for $T'$. Let $B' = \bigcup_{i=1}^k B_i'$ be a disjoint union of Jordan chains, that is, $B_i' = \{x_{i1}, \dots, x_{in_i}\}$ with $T'x_{i1} = Tx_{i1} = 0$ and $T'x_{ij} = Tx_{ij} = x_{i\,j-1}$ for $j = 2, \dots, n_i$, $i = 1, \dots, k$.

Now $x_{11}, \dots, x_{k1}$ are linearly independent vectors of $\ker T$. Extend them to a basis of $\ker T$: $x_{11}, \dots, x_{k1}, y_1, \dots, y_q$, $q \geq 0$. Next, since each $x_{in_i} \in W = \mathrm{im}\,T$, choose $x_{i\,n_i+1} \in V$ such that $Tx_{i\,n_i+1} = x_{in_i}$. Now write $B = \bigcup_{i=1}^{k+q} B_i$, where $B_i = B_i' \cup \{x_{i\,n_i+1}\}$ for $i = 1, \dots, k$, and $B_{k+i} = \{y_i\}$ for $i = 1, \dots, q$. We now show that $B$ is a basis of $V$.

Clearly $|B| = |B'| + k + q = \dim \mathrm{im}\,T + \dim \ker T = \dim V = n$. Next, suppose
$$\sum_{i=1}^k \sum_{j=1}^{n_i+1} \alpha_{ij}x_{ij} + \sum_{r=1}^q \beta_r y_r = 0,$$
where $\alpha_{ij} \in K$ and $\beta_r \in K$. Operating with $T$ on both sides we get $\sum_{i=1}^k \sum_{j=2}^{n_i+1} \alpha_{ij}x_{i\,j-1} = 0$, and so $\alpha_{ij} = 0$ for $j = 2, \dots, n_i+1$, $i = 1, \dots, k$. Thus $\sum_{i=1}^k \alpha_{i1}x_{i1} + \sum_{r=1}^q \beta_ry_r = 0$, which implies that $\alpha_{i1} = 0$ for $i = 1, \dots, k$ and $\beta_r = 0$ for $r = 1, \dots, q$, since $x_{11}, \dots, x_{k1}, y_1, \dots, y_q$ is a basis of $\ker T$. Hence $B$ is a basis of $V$, and by construction it is a disjoint union of Jordan chains.

Finally, if $T$ is arbitrary, then the minimal polynomial of $T$ is of the form $(x - \lambda_1)^{m_1}\cdots(x - \lambda_k)^{m_k}$, where $\lambda_1, \dots, \lambda_k$ are the distinct eigenvalues of $T$. By the primary decomposition theorem, $V = V_1 \oplus \cdots \oplus V_k$, where $V_i = \ker(T - \lambda_iI)^{m_i}$ is a $T$-invariant subspace. Let $T_i$ be the restriction of $T$ to $V_i$. Then $T_i \in L(V_i)$ and $S_i = T_i - \lambda_iI$ is a nilpotent operator on $V_i$. Therefore $V_i$ has a Jordan basis $B_i$ for $S_i$, and hence for $T_i$, associated with the eigenvalue $\lambda_i$. Hence $B = \bigcup_{i=1}^k B_i$ is a Jordan basis for $T$.
3.3 Let $B$ be a Jordan basis of $V$ for $T$. Then the number of generalized eigenvectors of $T$ corresponding to the eigenvalue $\lambda$ and of order at most $s$ that appear in $B$ is $\dim \ker(T - \lambda I)^s$.

Proof. Let $B = \bigcup_{i=1}^r B_i$ be a Jordan basis, a union of disjoint Jordan chains $B_i = \{x_{i1}, \dots, x_{im_i}\}$, $i = 1, \dots, r$, and let $B_1, \dots, B_l$ be all the Jordan chains corresponding to $\lambda$. We prove by induction on $s$ that $\ker(T - \lambda I)^s$ has as a basis the set of those $x_{ij}$ with $i = 1, \dots, l$ and $j \leq \min(s, m_i)$. This clearly proves the statement.

First let $s = 1$. The set $\{x_{11}, \dots, x_{l1}\}$ is a linearly independent subset of $\ker(T - \lambda I)$, so we need only check that it spans $\ker(T - \lambda I)$. Let $v = \sum_{i=1}^r \sum_{j=1}^{m_i} \alpha_{ij}x_{ij} \in \ker(T - \lambda I)$. Then
$$0 = (T - \lambda I)v = \sum_{i=1}^{l}\sum_{j=2}^{m_i} \alpha_{ij}x_{i\,j-1} + \sum_{i=l+1}^{r}\Bigl(\sum_{j=1}^{m_i}\alpha_{ij}(\lambda_i - \lambda)x_{ij} + \sum_{j=2}^{m_i}\alpha_{ij}x_{i\,j-1}\Bigr),$$
where $\lambda_i$ denotes the eigenvalue of the $i$-th chain. Since $B$ is a basis and $\lambda_i \neq \lambda$ for $i > l$, comparing coefficients gives $\alpha_{ij} = 0$ for $j = 2, \dots, m_i$, $i = 1, \dots, l$, and $\alpha_{ij} = 0$ for $j = 1, \dots, m_i$, $i = l+1, \dots, r$. Hence $v = \sum_{i=1}^{l}\alpha_{i1}x_{i1}$ and $\{x_{11}, \dots, x_{l1}\}$ is a basis of $\ker(T - \lambda I)$.

Now assume the claim for $s$; we show that $\ker(T - \lambda I)^{s+1}$ has as a basis the set of those $x_{ij}$ with $i = 1, \dots, l$ and $j \leq \min(s+1, m_i)$. Since these vectors are linearly independent and lie in $\ker(T - \lambda I)^{s+1}$, we only need to verify that they span it. Let $v = \sum_{i=1}^r \sum_{j=1}^{m_i}\alpha_{ij}x_{ij} \in \ker(T - \lambda I)^{s+1}$. Then $(T - \lambda I)v \in \ker(T - \lambda I)^s = \langle x_{ij} : i = 1, \dots, l,\ j \leq \min(s, m_i)\rangle$. Computing $(T - \lambda I)v$ as above and comparing coefficients, we obtain $\alpha_{ij} = 0$ for $j = 1, \dots, m_i$, $i = l+1, \dots, r$, and also $\alpha_{ij} = 0$ for $j = s+2, \dots, m_i$, $i = 1, \dots, l$, whenever $m_i > s+1$. Hence $v = \sum_{i=1}^{l}\sum_{j=1}^{\min(s+1, m_i)} \alpha_{ij}x_{ij}$.
3.4 The number of Jordan chains for $T$ of length $m$ associated with the eigenvalue $\lambda$ is
$$2\dim\ker(T - \lambda I)^m - \dim\ker(T - \lambda I)^{m+1} - \dim\ker(T - \lambda I)^{m-1},$$
or equivalently
$$\mathrm{rank}(T - \lambda I)^{m+1} + \mathrm{rank}(T - \lambda I)^{m-1} - 2\,\mathrm{rank}(T - \lambda I)^m.$$

Proof. The number of Jordan chains for $T$ of length at least $m$ associated with $\lambda$ is exactly the number of generalized eigenvectors of $T$ of order $m$ corresponding to $\lambda$ which appear in a Jordan basis. By the above result this equals $l_m = \dim\ker(T - \lambda I)^m - \dim\ker(T - \lambda I)^{m-1}$. Therefore the number of Jordan chains for $T$ of length exactly $m$ associated with $\lambda$ is $l_m - l_{m+1}$.
Let $T$ be a linear operator on $V$ and let $c_T(x) = (x - \lambda_1)^{n_1}\cdots(x - \lambda_k)^{n_k}$, where $\lambda_1, \dots, \lambda_k$ are the distinct eigenvalues of $T$. Then by 3.2 the matrix representation of $T$ with respect to a Jordan basis is $J = \mathrm{diag}(J_1, \dots, J_k)$, where for each $i = 1, \dots, k$, $J_i = \mathrm{diag}(J_{m(i,1)}(\lambda_i), \dots, J_{m(i,r_i)}(\lambda_i))$. We order the sizes of these Jordan blocks so that $m(i,1) \geq \cdots \geq m(i,r_i)$. Such a matrix $J$ is the Jordan canonical form, or simply the Jordan form, of $T$. Note that for each eigenvalue $\lambda_i$ the number $r_i$ and the sizes $m(i,1), \dots, m(i,r_i)$ are uniquely determined by $T$. For each eigenvalue $\lambda_i$ the number $r_i$ is the geometric multiplicity of $\lambda_i$, and $m(i,1) + \cdots + m(i,r_i) = n_i$ is the algebraic multiplicity of $\lambda_i$. Also it is easy to verify that each $J_i$ is such that $J_i - \lambda_iI_{n_i}$ is nilpotent of order $m(i,1)$. Hence the minimal polynomial of $T$ is $(x - \lambda_1)^{m(1,1)}\cdots(x - \lambda_k)^{m(k,1)}$.
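For explicit small examples one can compute a Jordan form symbolically; a sketch assuming SymPy is available (its `Matrix.jordan_form` returns a transforming matrix $P$ and the Jordan matrix $J$; the matrix below is ours, chosen for illustration).

```python
from sympy import Matrix, simplify

A = Matrix([[ 5,  4,  2,  1],
            [ 0,  1, -1, -1],
            [-1, -1,  3,  0],
            [ 1,  1, -1,  2]])

# P is a change-of-basis matrix (its columns form a Jordan basis), J the Jordan form.
P, J = A.jordan_form()
print(J)                                # block diagonal with Jordan blocks
print(simplify(P * J * P.inv() - A))    # zero matrix: A = P J P^{-1}
```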
Now we show, by direct matrix multiplications, how an $n \times n$ matrix can be transformed to its Jordan form.

3.5 Let $J = J_n(0)$. Then $J^tJ = I - E_{11}$.

Proof. $J = E_{12} + \cdots + E_{n-1,n}$ and $J^t = E_{21} + \cdots + E_{n,n-1}$. Thus $J^tJ = (E_{21} + \cdots + E_{n,n-1})(E_{12} + \cdots + E_{n-1,n}) = E_{22} + \cdots + E_{nn} = I - E_{11}$.
3.6 Let $J = J_m(0)$, $a \in \mathbb{C}^n$ and $B \in \mathbb{C}^{n \times n}$. Then
$$\begin{pmatrix} I_m & e_{i+1}a^t \\ 0 & I_n \end{pmatrix}
\begin{pmatrix} J & e_ia^t \\ 0 & B \end{pmatrix}
\begin{pmatrix} I_m & -e_{i+1}a^t \\ 0 & I_n \end{pmatrix}
= \begin{pmatrix} J & e_{i+1}a^tB \\ 0 & B \end{pmatrix}.$$

Proof. The two outer factors are inverses of each other, so this is a similarity transform, and direct multiplication gives
$$\begin{pmatrix} I_m & e_{i+1}a^t \\ 0 & I_n \end{pmatrix}
\begin{pmatrix} J & e_ia^t \\ 0 & B \end{pmatrix}
\begin{pmatrix} I_m & -e_{i+1}a^t \\ 0 & I_n \end{pmatrix}
= \begin{pmatrix} J & -Je_{i+1}a^t + e_ia^t + e_{i+1}a^tB \\ 0 & B \end{pmatrix}.$$
Now since $Je_{i+1} = e_i$, the result follows.
3.7 Let $A$ be a strictly upper triangular $n \times n$ matrix. Then there exist an invertible matrix $P$ and positive integers $n_1, \dots, n_k$ with $n_1 \geq \cdots \geq n_k > 0$ and $n_1 + \cdots + n_k = n$ such that $P^{-1}AP = \mathrm{diag}(J_{n_1}(0), \dots, J_{n_k}(0))$. Moreover, if $A$ has real entries then $P$ can also be taken to have real entries.

Proof. By induction on $n$. If $n = 1$ then $A$ is a zero matrix and the statement is obvious. Assume that the statement holds for all matrices of order less than $n$. Write $A = \begin{pmatrix} 0 & a^t \\ 0 & A_1 \end{pmatrix}$, where $a \in \mathbb{C}^{n-1}$ and $A_1 \in \mathbb{C}^{(n-1)\times(n-1)}$. By induction there is an $(n-1)\times(n-1)$ invertible matrix $S_1$ such that $S_1^{-1}A_1S_1 = \mathrm{diag}(J_{r_1}, \dots, J_{r_s}) = \mathrm{diag}(J, B)$ with $r_1 \geq \cdots \geq r_s > 0$, $r_1 + \cdots + r_s = n - 1$, $J = J_{r_1}$ and $B = \mathrm{diag}(J_{r_2}, \dots, J_{r_s})$ (all blocks with zero diagonal).

If $P_1 = \mathrm{diag}(1, S_1)$, then $P_1^{-1}AP_1 = \begin{pmatrix} 0 & a^tS_1 \\ 0 & S_1^{-1}A_1S_1 \end{pmatrix}$. Now write $a^tS_1 = [\,b^t \ \ c^t\,]$, $b \in \mathbb{C}^{r_1}$ and $c \in \mathbb{C}^{n-1-r_1}$. Then
$$P_1^{-1}AP_1 = \begin{pmatrix} 0 & b^t & c^t \\ 0 & J & 0 \\ 0 & 0 & B \end{pmatrix}.$$
Next consider the following similarity transform:
$$\begin{pmatrix} 1 & -b^tJ^t & 0 \\ 0 & I & 0 \\ 0 & 0 & I \end{pmatrix}
\begin{pmatrix} 0 & b^t & c^t \\ 0 & J & 0 \\ 0 & 0 & B \end{pmatrix}
\begin{pmatrix} 1 & b^tJ^t & 0 \\ 0 & I & 0 \\ 0 & 0 & I \end{pmatrix}
= \begin{pmatrix} 0 & b^t(I - J^tJ) & c^t \\ 0 & J & 0 \\ 0 & 0 & B \end{pmatrix}
= \begin{pmatrix} 0 & (b^te_1)e_1^t & c^t \\ 0 & J & 0 \\ 0 & 0 & B \end{pmatrix}.$$
If $b^te_1 \neq 0$, then
$$\mathrm{diag}\bigl((b^te_1)^{-1}, I, (b^te_1)^{-1}I\bigr)
\begin{pmatrix} 0 & (b^te_1)e_1^t & c^t \\ 0 & J & 0 \\ 0 & 0 & B \end{pmatrix}
\mathrm{diag}\bigl(b^te_1, I, (b^te_1)I\bigr)
= \begin{pmatrix} 0 & e_1^t & c^t \\ 0 & J & 0 \\ 0 & 0 & B \end{pmatrix}.$$
Now $\begin{pmatrix} 0 & e_1^t \\ 0 & J \end{pmatrix} = J_{r_1+1}(0)$, a Jordan block of order $r_1 + 1$ with zeros on the diagonal. Denote this matrix by $\tilde J$. Thus
$$\begin{pmatrix} 0 & e_1^t & c^t \\ 0 & J & 0 \\ 0 & 0 & B \end{pmatrix}
= \begin{pmatrix} \tilde J & e_1c^t \\ 0 & B \end{pmatrix}.$$
Using the above result (3.6) recursively, we have
$$\begin{pmatrix} I & e_{i+1}c^tB^{i-1} \\ 0 & I \end{pmatrix}
\begin{pmatrix} \tilde J & e_ic^tB^{i-1} \\ 0 & B \end{pmatrix}
\begin{pmatrix} I & -e_{i+1}c^tB^{i-1} \\ 0 & I \end{pmatrix}
= \begin{pmatrix} \tilde J & e_{i+1}c^tB^{i} \\ 0 & B \end{pmatrix}$$
for $i = 1, 2, \dots$. Since $B^{r_1} = 0$, after $r_1$ steps the matrix $\begin{pmatrix} \tilde J & e_ic^tB^{i-1} \\ 0 & B \end{pmatrix}$ is similar to $\begin{pmatrix} \tilde J & 0 \\ 0 & B \end{pmatrix}$, which is of the required form.

If $b^te_1 = 0$, then $A$ is similar to $\begin{pmatrix} 0 & 0 & c^t \\ 0 & J & 0 \\ 0 & 0 & B \end{pmatrix}$, which is permutation-similar to $\begin{pmatrix} J & 0 & 0 \\ 0 & 0 & c^t \\ 0 & 0 & B \end{pmatrix}$. By the induction hypothesis, there is an invertible matrix $S_2 \in \mathbb{C}^{(n-r_1)\times(n-r_1)}$ such that $S_2^{-1}\begin{pmatrix} 0 & c^t \\ 0 & B \end{pmatrix}S_2 = J'$, a Jordan matrix with zeros on the diagonal. Therefore $A$ is similar to $\begin{pmatrix} J_{r_1} & 0 \\ 0 & J' \end{pmatrix}$, which is a Jordan matrix of the required form except that the blocks may not be in descending order of size; that can be arranged by a permutation similarity.

Finally, note that if $A$ has real entries then all the similarities used in this proof can be effected by real matrices.
3.8 Jordan decomposition theorem. Every $A \in \mathbb{C}^{n \times n}$ is similar to a matrix $\mathrm{diag}(J_{n_1}(\lambda_1), \dots, J_{n_k}(\lambda_k))$, where $J_{n_i}(\lambda_i) \in \mathbb{C}^{n_i \times n_i}$, $n_1 + \cdots + n_k = n$, and each $J_{n_i}(\lambda_i) - \lambda_iI_{n_i}$ is of the form in the above result. This form is essentially unique, that is, it depends only on $A$ and the order of occurrence of the eigenvalues.

Proof. Only uniqueness remains to be proved. For that, note that if $\lambda \neq 0$ then $\mathrm{rank}\,J_m(\lambda)^r = m$ for all positive integers $r$, while $\mathrm{rank}\,J_m(0)^r = \max(m - r, 0)$, so $\mathrm{rank}\,J_m(0)^{r-1} - \mathrm{rank}\,J_m(0)^r = 1$ for $r \leq m$. Let $J$ be a Jordan form of $A$ and write $r_n(\lambda) = \mathrm{rank}(J - \lambda I)^n$. Then $r_{n-1}(\lambda_i) - r_n(\lambda_i)$ is the number of Jordan blocks with eigenvalue $\lambda_i$ and of size at least $n$ appearing in $J$. Therefore the number of Jordan blocks with eigenvalue $\lambda_i$ of size exactly $n$ is
$$(r_{n-1}(\lambda_i) - r_n(\lambda_i)) - (r_n(\lambda_i) - r_{n+1}(\lambda_i)) = r_{n-1}(\lambda_i) - 2r_n(\lambda_i) + r_{n+1}(\lambda_i).$$
These numbers are determined by $A$ alone, since ranks are invariant under similarity.

Thus two $n \times n$ matrices $A$ and $B$ are similar if and only if they have the same characteristic polynomial and, for each eigenvalue $\lambda$ and positive integer $k$, $\mathrm{rank}(A - \lambda I)^k = \mathrm{rank}(B - \lambda I)^k$.
4 Unitary, Self-adjoint, Normal Operators

In this section $V$ will always denote an $n$-dimensional inner product space over $F = \mathbb{R}$ or $\mathbb{C}$.

4.1 Let $V$ and $W$ be finite dimensional inner product spaces and let $T \in L(V, W)$. Then there is a unique $T^* \in L(W, V)$ such that
$$(Tv, w) = (v, T^*w)$$
for all $v \in V$ and $w \in W$. The linear map $T^*$ is called the adjoint of $T$.

For a fixed $w \in W$ the mapping $f_w(v) = (Tv, w)$ is a linear functional on $V$. Thus by the Riesz representation theorem there is a unique $y \in V$ such that $(Tv, w) = (v, y)$. Hence if we set $T^*(w) = y$, then $(Tv, w) = (v, T^*w)$ for all $v \in V$. For linearity,
$$(v, T^*(\alpha u + w)) = (Tv, \alpha u + w) = \bar\alpha(Tv, u) + (Tv, w) = \bar\alpha(v, T^*u) + (v, T^*w) = (v, \alpha T^*u + T^*w)$$
for all $v \in V$ and $u, w \in W$, and so $T^*(\alpha u + w) = \alpha T^*u + T^*w$. For uniqueness, if $S \in L(W, V)$ is another such map, then for all $v \in V$ and $w \in W$ we have $(v, (S - T^*)(w)) = 0$. Hence $S = T^*$.
Following are the basic properties of the adjoint.

4.2 If $S, T \in L(V, W)$, then $(S + \alpha T)^* = S^* + \bar\alpha T^*$ for $\alpha \in F$; $S^{**} = S$, where $S^{**} = (S^*)^*$; and $(ST)^* = T^*S^*$ for composable $S$ and $T$. If $T \in L(V)$ is invertible, then $(T^*)^{-1} = (T^{-1})^*$.
Let $T \in L(V, W)$ and let $B = \{v_1, \dots, v_n\}$ and $B' = \{w_1, \dots, w_m\}$ be ordered orthonormal bases of $V$ and $W$ respectively. Then
$$Tv_j = \sum_{i=1}^m (Tv_j, w_i)w_i$$
for all $j = 1, \dots, n$. Thus the matrix of $T$ with respect to the bases $B$ and $B'$ is ${}_{B'}[T]_B$, whose $(i,j)$-th entry is $(Tv_j, w_i)$.

If $A \in \mathbb{C}^{m \times n}$, then we denote by $A^*$ the conjugate transpose of $A$.
4.3 Let $T \in L(V, W)$ and let $B = \{v_1, \dots, v_n\}$ and $B' = \{w_1, \dots, w_m\}$ be ordered orthonormal bases of $V$ and $W$ respectively. Then
$${}_{B}[T^*]_{B'} = \bigl({}_{B'}[T]_B\bigr)^*,$$
the conjugate transpose of ${}_{B'}[T]_B$.

Proof. $T^*w_j = \sum_{i=1}^n (T^*w_j, v_i)v_i = \sum_{i=1}^n \overline{(Tv_i, w_j)}\,v_i$.
4.4 If $T \in L(V)$, then $\det T^* = \overline{\det T}$ and $\mathrm{tr}\,T^* = \overline{\mathrm{tr}\,T}$.
A linear operator $T$ on $V$ is called normal if $TT^* = T^*T$; unitary if $T$ is invertible and $TT^* = I = T^*T$; and self-adjoint if $T^* = T$.
4.5 The following statements are equivalent.
(i) $T$ is unitary;
(ii) $T$ preserves inner products: $(Tu, Tv) = (u, v)$ for all $u, v \in V$;
(iii) $T$ preserves norms: $\|Tu\| = \|u\|$ for all $u \in V$;
(iv) $T$ maps orthonormal bases to orthonormal bases.

4.6 Eigenvalues of unitary operators have absolute value 1. In particular, a unitary operator on a real inner product space has eigenvalues 1 or $-1$.
An $n \times n$ matrix $A$ is called unitary if $A$ is invertible and $A^{-1} = A^*$; $A$ is called orthogonal if $A^{-1} = A^t$.

4.7 Let $A \in F^{n \times n}$. Then the following statements are equivalent.
(i) $A$ is a unitary matrix.
(ii) The columns of $A$ form an orthonormal basis of the standard inner product space $F^n$.
(iii) The rows of $A$ form an orthonormal basis of the standard inner product space $F^n$.

It is easy to see that if $T$ is unitary then the matrix of $T$ with respect to an orthonormal basis is unitary. If $T$ is unitary on a real inner product space, the matrix of $T$ is actually orthogonal.
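A small numerical illustration of 4.7, assuming NumPy: for a unitary matrix $A^*A = I$, which is exactly the statement that the columns are orthonormal.

```python
import numpy as np

theta = 0.3
# A rotation matrix is orthogonal (unitary with real entries).
A = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

# Columns orthonormal  <=>  A* A = I  <=>  A^{-1} = A*.
print(np.allclose(A.conj().T @ A, np.eye(2)))      # True
print(np.allclose(np.linalg.inv(A), A.conj().T))   # True
```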
Recall the Gram-Schmidt procedure. If $B = \{u_1, \dots, u_n\}$ is an ordered basis of an inner product space $V$, then there is an ordered orthonormal basis $B' = \{v_1, \dots, v_n\}$ such that $\langle u_1, \dots, u_k \rangle = \langle v_1, \dots, v_k \rangle$ for $k = 1, \dots, n$. In fact, these bases are related by
$$u_1 = \alpha_{11}v_1,\quad u_2 = \alpha_{12}v_1 + \alpha_{22}v_2,\quad \dots,\quad u_n = \alpha_{1n}v_1 + \cdots + \alpha_{nn}v_n,$$
where $\alpha_{ij} \in F$ and $\alpha_{ii} > 0$ for $i, j = 1, \dots, n$. We use this to prove the following statement.
4.8 Schur. If $T$ is a linear operator on an inner product space $V$ which is triangulable, then there is an ordered orthonormal basis such that the matrix of $T$ with respect to this basis is upper triangular.

Proof. Let $B = \{u_1, \dots, u_n\}$ be an ordered basis such that $[T]_B$ is upper triangular. Let $B' = \{v_1, \dots, v_n\}$ be the ordered orthonormal basis of $V$ obtained from $B$ by the Gram-Schmidt procedure. Then the elements of $B$ and $B'$ are related by the above equations. In other words, if $I$ is the identity operator, then
$${}_{B'}[I]_B = \begin{pmatrix} \alpha_{11} & \cdots & \alpha_{1n} \\ & \ddots & \vdots \\ 0 & & \alpha_{nn} \end{pmatrix}.$$
Since $[T]_{B'} = {}_{B'}[I]_B\,[T]_B\,{}_{B}[I]_{B'}$, and all the matrices on the right are upper triangular, $[T]_{B'}$ is also upper triangular.
Now we look at some important results about self-adjoint operators. Clearly the sum of self-adjoint operators is self-adjoint; the inverse of a self-adjoint operator, if it exists, is self-adjoint; and the product of two commuting self-adjoint operators is self-adjoint.
4.9 Let $T$ be a self-adjoint operator. Then
(i) For all $v \in V$, $(Tv, v) \in \mathbb{R}$.
(ii) If $(Tv, v) = 0$ for all $v \in V$, then $T = 0$.
(iii) If $T^mv = 0$ for some positive integer $m$, then $Tv = 0$.
(iv) All roots of $c_T(x)$ are real.
(v) Eigenvectors corresponding to distinct eigenvalues are orthogonal.
Proof. (i). $(Tv, v) = (v, T^*v) = (v, Tv) = \overline{(Tv, v)}$, so $(Tv, v)$ is real.

(ii). For $x, y \in V$, $0 = (T(x+y), x+y) = (Tx, y) + (Ty, x)$. If $V$ is an inner product space over $\mathbb{R}$, then $(Ty, x) = (y, Tx) = (Tx, y)$, so $0 = 2(Tx, y)$. Hence $(Tx, y) = 0$ for all $x, y \in V$, and $T = 0$. If $V$ is an inner product space over $\mathbb{C}$, then replacing $y$ by $iy$ we get $(Tx, y) - (Ty, x) = 0$ for all $x, y \in V$; together with $(Tx, y) + (Ty, x) = 0$ this gives $(Tx, y) = 0$, and hence $T = 0$ again.

(iii). We show by induction on $k$ that $T^kv = 0$ implies $Tv = 0$. For $k = 1$ this is clear. Assume the statement for all positive integers less than $k$, and let $m$ be the positive integer with $2^{m-1} < k \leq 2^m$. Since $T^kv = 0$, $T^{2^m}v = 0$. Thus $(T^{2^{m-1}}v, T^{2^{m-1}}v) = (T^{2^m}v, v) = 0$. Hence $T^{2^{m-1}}v = 0$, and since $2^{m-1} < k$, the induction hypothesis gives $Tv = 0$.

(iv). If $V$ is an inner product space over $\mathbb{C}$, then for any root $\lambda$ of $c_T(x)$, if $v$ is a corresponding eigenvector, then as $(Tv, v) = \lambda(v, v)$ and $(Tv, v)$ is real, $\lambda$ is a real number. Now assume that $V$ is an inner product space over $\mathbb{R}$. Let $A$ be the matrix of $T$ with respect to an ordered orthonormal basis. Then $A$ is a Hermitian matrix with real entries, actually symmetric. Considering $A$ as a linear operator on the standard inner product space $\mathbb{C}^n$, $A$ is self-adjoint and $c_A(x) = c_T(x)$. Therefore all roots of $c_A(x)$ are real, and so all roots of $c_T(x)$ are real.

(v). If $\lambda$ and $\mu$ are distinct eigenvalues with corresponding eigenvectors $u$ and $v$, then $\lambda$ and $\mu$ are real numbers and $\lambda(u, v) = (Tu, v) = (u, Tv) = \mu(u, v)$. Hence $(u, v) = 0$.
Now we study normal operators.

4.10 Let $T$ be a normal operator. Then
(i) $\|Tv\| = \|T^*v\|$ for all $v \in V$;
(ii) $Tv = \lambda v$ implies that $T^*v = \bar\lambda v$;
(iii) eigenvectors corresponding to distinct eigenvalues are orthogonal;
(iv) if $T^kv = 0$ for some positive integer $k$, then $Tv = 0$.

Proof. (i). $\|Tv\|^2 = (Tv, Tv) = (v, T^*Tv) = (v, TT^*v) = \|T^*v\|^2$.
(ii). Since $T - \lambda I$ is also a normal operator, $\|(T - \lambda I)v\| = \|(T - \lambda I)^*v\| = \|(T^* - \bar\lambda I)v\|$, so $Tv = \lambda v$ implies $T^*v = \bar\lambda v$.
(iii). If $\lambda$ and $\mu$ are distinct eigenvalues with corresponding eigenvectors $u$ and $v$, then using (ii), $\lambda(u, v) = (Tu, v) = (u, T^*v) = (u, \bar\mu v) = \mu(u, v)$. Hence $(u, v) = 0$.
(iv). Since $T$ is normal, $S = T^*T$ is self-adjoint. Thus $T^kv = 0$ implies that $S^kv = 0$, and so $Sv = 0$ by 4.9(iii). But then $\|Tv\|^2 = (Tv, Tv) = (Sv, v) = 0$. Hence $Tv = 0$.
4.11 Spectral decomposition. Let $T$ be a triangulable operator on a finite dimensional inner product space. Then $T$ is normal if and only if there is an orthonormal basis of $V$ consisting of eigenvectors of $T$.

Proof. Let $m_T(x) = (x - \lambda_1)^{m_1}\cdots(x - \lambda_k)^{m_k}$, where the $\lambda_i$ are the distinct eigenvalues of $T$ and the $m_i$ are positive integers. Then
$$V = \ker(T - \lambda_1I)^{m_1} \oplus \cdots \oplus \ker(T - \lambda_kI)^{m_k}.$$
Since for each $i$, $T - \lambda_iI$ is a normal operator, $(T - \lambda_iI)^{m_i}v = 0$ implies $(T - \lambda_iI)v = 0$ by 4.10(iv), and so $\ker(T - \lambda_iI)^{m_i} = \ker(T - \lambda_iI)$. Hence
$$V = \ker(T - \lambda_1I) \oplus \cdots \oplus \ker(T - \lambda_kI)$$
and $T$ is diagonalizable. Moreover, by 4.10(iii) eigenvectors belonging to distinct eigenvalues are orthogonal, so choosing an orthonormal basis of each eigenspace yields an orthonormal basis of $V$ consisting of eigenvectors of $T$.

Conversely, if $B = \{u_1, \dots, u_n\}$ is an orthonormal basis of $V$ such that $Tu_i = \lambda_iu_i$ for $i = 1, \dots, n$, then $T^*u_i = \bar\lambda_iu_i$ (by 4.3, since $B$ is orthonormal). Thus for each $i = 1, \dots, n$,
$$TT^*u_i = T(\bar\lambda_iu_i) = \bar\lambda_i\lambda_iu_i = T^*(\lambda_iu_i) = T^*Tu_i.$$
Hence $TT^* = T^*T$.
From the above result, if $T$ is a normal operator on $V$, then there is an orthonormal basis $B = \{u_1, \dots, u_n\}$ of $V$ consisting of eigenvectors of $T$, and so $[T]_B = \mathrm{diag}(\lambda_1, \dots, \lambda_n)$. Let $P_i \in L(V)$ be such that $P_i(u_i) = u_i$ and $P_i(u_j) = 0$ for $j \neq i$. Then $P_iP_j = 0$ for $i \neq j$, $P_i^2 = P_i$, and we have the spectral decomposition of $T$:
$$T = \lambda_1P_1 + \cdots + \lambda_nP_n.$$
For matrices: if $A$ is an $n \times n$ normal matrix, then there is a unitary matrix $P$ such that $P^*AP = \mathrm{diag}(\lambda_1, \dots, \lambda_n)$. Thus
$$A = P\,\mathrm{diag}(\lambda_1, \dots, \lambda_n)\,P^* = P(\lambda_1E_{11} + \cdots + \lambda_nE_{nn})P^* = \sum_{i=1}^n \lambda_i(Pe_i)(Pe_i)^* = \sum_{i=1}^n \lambda_iP_i,$$
where $P_i = (Pe_i)(Pe_i)^*$.
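A numerical sketch of the matrix form of the spectral decomposition, assuming NumPy, using a real symmetric (hence normal) matrix:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])          # real symmetric, hence normal

# For Hermitian matrices, eigh returns real eigenvalues and orthonormal eigenvectors.
w, P = np.linalg.eigh(A)

# Rebuild A as a sum of rank-one spectral projectors lambda_i * (P e_i)(P e_i)^*.
A_rebuilt = sum(w[i] * np.outer(P[:, i], P[:, i].conj()) for i in range(len(w)))
print(np.allclose(A_rebuilt, A))   # True
```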
5 Positive Definite Operators

A self-adjoint operator $T$ is called positive semi-definite if $(Tu, u) \geq 0$ for all $u \in V$, and positive definite if $(Tu, u) > 0$ for all $u \in V$, $u \neq 0$. Clearly a positive definite operator is invertible.

5.1 Let $T \in L(V)$ be self-adjoint. Then the following are equivalent.
(1). $T$ is positive semi-definite.
(2). All eigenvalues of $T$ are non-negative.
(3). There is a unique positive semi-definite operator $P$ such that $T = P^2$. We write $\sqrt{T}$ for $P$.
(4). There exists $S \in L(V)$ such that $T = S^*S$.
Proof. (1) $\Rightarrow$ (2) is easy.
(2) $\Rightarrow$ (3). Let $T = \lambda_1E_1 + \cdots + \lambda_kE_k$ be the spectral decomposition of $T$. Then $P = \sqrt{\lambda_1}E_1 + \cdots + \sqrt{\lambda_k}E_k$ is positive semi-definite and $T = P^2$. Indeed, for $v \in V$, $v = E_1v + \cdots + E_kv$, and so $(Pv, v) = \sum_{j=1}^k\sum_{i=1}^k \sqrt{\lambda_i}\,(E_iv, E_jv) = \sum_{i=1}^k \sqrt{\lambda_i}\,(E_iv, E_iv) \geq 0$.
(3) $\Rightarrow$ (4). Let $S = P$.
(4) $\Rightarrow$ (1). For $v \in V$, $(Tv, v) = (S^*Sv, v) = (Sv, Sv) \geq 0$.

Similar statements hold for positive definite operators.
5.2 Let $T \in L(V)$ be self-adjoint. Then the following are equivalent.
(1). $T$ is positive definite.
(2). All eigenvalues of $T$ are positive.
(3). There is a positive definite operator $P$ such that $T = P^2$.
(4). There exists an invertible $S \in L(V)$ such that $T = S^*S$.
5.3 Polar decomposition. Let $T \in L(V)$. Then $T = UP$, where $U$ is unitary and $P$ is positive semi-definite.

Proof. $TT^*$ is positive semi-definite. Let $\lambda_1 \geq \cdots \geq \lambda_r > 0$ and $\lambda_{r+1} = \cdots = \lambda_n = 0$ be the eigenvalues of $TT^*$ repeated according to their multiplicities, and let $w_1, \dots, w_n$ be a corresponding orthonormal basis of $V$ consisting of eigenvectors of $TT^*$. Let $u_i = \frac{1}{\sqrt{\lambda_i}}T^*w_i$ for $i = 1, \dots, r$. Then $(u_i, u_j) = \delta_{ij}$, and it is easy to verify that $T^*Tu_i = \lambda_iu_i$. Next, $\dim\ker(T^*T) = \dim\ker(TT^*) = n - r$. Thus if $u_{r+1}, \dots, u_n$ is an orthonormal basis of $\ker(T^*T)$, then $u_1, \dots, u_n$ is an orthonormal basis of $V$ consisting of eigenvectors of $T^*T$. Let $U$ be the linear operator whose action on this basis is given by $Uu_i = w_i$. Then $U$ is unitary, and if $P = \sqrt{T^*T}$, then $T = UP$.

Indeed, for $i = 1, \dots, r$,
$$Tu_i = \frac{1}{\sqrt{\lambda_i}}TT^*w_i = \sqrt{\lambda_i}\,w_i = U(\sqrt{\lambda_i}\,u_i) = UPu_i,$$
and for $i = r+1, \dots, n$, $(Tu_i, Tu_i) = (u_i, T^*Tu_i) = 0$ and $(Pu_i, Pu_i) = (u_i, P^2u_i) = (u_i, T^*Tu_i) = 0$, that is, $Tu_i = 0 = Pu_i$.

The eigenvalues of $\sqrt{T^*T}$ are called the singular values of $T$.
5.4 Singular value decomposition. Let $T \in L(V)$. Then there are ordered orthonormal bases $B = \{u_1, \dots, u_n\}$ and $B' = \{w_1, \dots, w_n\}$ such that for any $x \in V$,
$$Tx = \sum_{i=1}^n \sigma_i(x, u_i)w_i,$$
where $\sigma_1, \dots, \sigma_n$ are the singular values of $T$.

Proof. For $x \in V$, write $x = \sum_{i=1}^n (x, u_i)u_i$ with $u_i$ as in the proof of 5.3. Then $Px = \sum_{i=1}^n \sigma_i(x, u_i)u_i$, and so, as $T = UP$, $Tx = \sum_{i=1}^n \sigma_i(x, u_i)Uu_i$. If $w_i = Uu_i$, then we have the result.

The above result shows that $Tu_i = \sigma_iw_i$. Thus we have ${}_{B'}[T]_B = \mathrm{diag}(\sigma_1, \dots, \sigma_n)$.
We now give a method of obtaining a singular value decomposition for $A \in \mathbb{C}^{m \times n}$.

5.5 Let $A \in \mathbb{C}^{m \times n}$ with nonzero singular values $\sigma_1, \dots, \sigma_r$, where $r$ is the rank of $A$. Then there are unitary matrices $U \in \mathbb{C}^{m \times m}$ and $V \in \mathbb{C}^{n \times n}$ such that
$$A = U\,\mathrm{diag}(\sigma_1, \dots, \sigma_r, 0, \dots, 0)\,V.$$
Proof. If $A$ is a single number $c$, then $A = |c|e^{i\theta}$ for some $\theta \in \mathbb{R}$ is the singular value decomposition of $A$. If $A$ is a nonzero row or column vector, say $A = [a_1 \ \dots \ a_n]$, then $\sigma_1$ is the norm of $A$. Let $V$ be any unitary matrix with first row the unit vector $[a_1/\sigma_1 \ \dots \ a_n/\sigma_1]$; then $A = [\sigma_1\ 0\ \dots\ 0]V$.

We now assume that $m > 1$, $n > 1$ and $A \neq 0$, and let $u_1$ be a unit eigenvector of $A^*A$ corresponding to $\sigma_1^2$, that is, $A^*Au_1 = \sigma_1^2u_1$ and $u_1^*u_1 = 1$. Let $v_1 = \frac{1}{\sigma_1}Au_1$. Then $v_1$ is a unit vector and $u_1^*A^*v_1 = \sigma_1$. Let $P$ and $Q$ be unitary matrices with $u_1$ and $v_1$ as their first columns respectively. Then
$$Q^*AP = \begin{pmatrix} \sigma_1 & 0 \\ 0 & B \end{pmatrix}$$
for some $(m-1)\times(n-1)$ matrix $B$. The result now follows by repeating the process on $B$.
We can deduce the polar decomposition from the singular value decomposition.

5.6 For any square matrix $A$ there are a positive semi-definite matrix $P$ and a unitary matrix $U$ such that $A = PU$.

Proof. Let $U_1$ and $V_1$ be $n \times n$ unitary matrices such that $A = U_1\,\mathrm{diag}(\sigma_1, \dots, \sigma_r, 0, \dots, 0)\,V_1$. Inserting $U_1^*U_1$ between the diagonal matrix and $V_1$ gives
$$A = \bigl(U_1\,\mathrm{diag}(\sigma_1, \dots, \sigma_r, 0, \dots, 0)\,U_1^*\bigr)(U_1V_1) = PU,$$
with $P = U_1\,\mathrm{diag}(\sigma_1, \dots, \sigma_r, 0, \dots, 0)\,U_1^*$ positive semi-definite and $U = U_1V_1$ unitary.
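A numerical sketch tying 5.5 and 5.6 together, assuming NumPy and SciPy are available: the SVD gives the polar factors directly, and `scipy.linalg.polar` computes them in one call.

```python
import numpy as np
from scipy.linalg import polar

A = np.array([[1.0, 2.0],
              [0.0, 3.0]])

# Singular value decomposition: A = U @ diag(s) @ Vh.
U, s, Vh = np.linalg.svd(A)
print(np.allclose(U @ np.diag(s) @ Vh, A))   # True

# Left polar decomposition A = P @ W with P positive semi-definite, W unitary.
W, P = polar(A, side='left')
print(np.allclose(P @ W, A))                  # True
```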
6 Equivalent conditions for normal matrices

A list of seventy equivalent conditions for a complex matrix to be normal appeared in a paper of R. Grone, C.R. Johnson, E.M. Sá and H. Wolkowicz, Normal Matrices, Linear Algebra and its Applications, Vol. 87 (1987), 213-225. Then in 1998 another list of twenty conditions appeared in the paper of L. Elsner and Kh.D. Ikramov, Normal Matrices: an update, Linear Algebra and its Applications, Vol. 285 (1998), 291-303. We list here 30 equivalent conditions.

Let $A \in \mathbb{C}^{n \times n}$ and let $\lambda_1, \dots, \lambda_n$ be the eigenvalues of $A$. Then the following statements are equivalent.
1. $A$ is normal, that is, $AA^* = A^*A$.
2. $A$ is unitarily diagonalizable.
3. There is a polynomial $p(x)$ such that $A^* = p(A)$.
4. There is a set of eigenvectors of $A$ which forms an orthonormal basis for $\mathbb{C}^n$.
5. Every eigenvector of $A$ is also an eigenvector of $A^*$.
6. $A = B + iC$ for some Hermitian matrices $B$ and $C$ with $BC = CB$.
7. If $U$ is unitary such that $U^*AU = \begin{pmatrix} B & C \\ 0 & D \end{pmatrix}$, with $B$ and $D$ square, then $B$ and $D$ are normal and $C = 0$.
8. If a subspace $W$ of $\mathbb{C}^n$ is $A$-invariant, then so is $W^\perp$.
9. If $u$ is an eigenvector of $A$, then $\langle u \rangle^\perp$ is $A$-invariant.
10. $A$ can be written as $A = \sum_{i=1}^n \lambda_iP_i$, $P_i \in \mathbb{C}^{n \times n}$, such that $P_i^2 = P_i = P_i^*$, $P_iP_j = 0$ if $i \neq j$, and $\sum_{i=1}^n P_i = I$.
11. $\mathrm{tr}(A^*A) = \sum_{i=1}^n |\lambda_i|^2$.
12. The singular values of $A$ are $|\lambda_1|, \dots, |\lambda_n|$.
13. $\sum_{i=1}^n (\mathrm{Re}\,\lambda_i)^2 = \mathrm{tr}(A + A^*)^2/4$.
14. $\sum_{i=1}^n (\mathrm{Im}\,\lambda_i)^2 = -\,\mathrm{tr}(A - A^*)^2/4$.
15. The eigenvalues of $A + A^*$ are $\lambda_1 + \bar\lambda_1, \dots, \lambda_n + \bar\lambda_n$.
16. $\mathrm{tr}\bigl((A^*A)^2\bigr) = \mathrm{tr}\bigl((A^*)^2A^2\bigr)$.
17. $\|Ax\| = \|A^*x\|$ for every $x \in \mathbb{C}^n$.
18. $|A| = |A^*|$, where $|A| = (A^*A)^{1/2}$.
19. $A^* = AU$ for some unitary $U$.
20. $A^* = VA$ for some unitary $V$.
21. $UP = PU$ if $A = UP$ is the polar decomposition of $A$.
22. $UA = AU$ if $A = UP$ is the polar decomposition of $A$.
23. $AP = PA$ if $A = UP$ is the polar decomposition of $A$.
24. $A$ commutes with $A + A^*$.
25. $A$ commutes with $A - A^*$.
26. $A + A^*$ commutes with $A - A^*$.
27. $A$ commutes with $A^*A$.
28. $A$ commutes with $AA^* - A^*A$.
29. $A^*B = BA^*$ whenever $AB = BA$.
30. $AA^* - A^*A$ is positive semi-definite.
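A few of these conditions are easy to test numerically; a sketch, assuming NumPy, checking conditions 1, 11 and 17 for a sample matrix of our choosing:

```python
import numpy as np

A = np.array([[1.0, 1.0],
              [-1.0, 1.0]])        # normal (a scaled rotation)

# Condition 1: A A* = A* A.
print(np.allclose(A @ A.conj().T, A.conj().T @ A))            # True

# Condition 11: tr(A* A) equals the sum of |lambda_i|^2.
eigvals = np.linalg.eigvals(A)
print(np.isclose(np.trace(A.conj().T @ A),
                 np.sum(np.abs(eigvals) ** 2)))               # True

# Condition 17: ||A x|| = ||A* x|| for every x (spot-check a random x).
x = np.random.default_rng(1).standard_normal(2)
print(np.isclose(np.linalg.norm(A @ x),
                 np.linalg.norm(A.conj().T @ x)))             # True
```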
Proof. 1 $\Leftrightarrow$ 2. Let $U$ be a unitary matrix such that $U^*AU = T$, an upper triangular matrix (Schur). Then $AA^* = A^*A$ implies that $TT^* = T^*T$. Now equating the diagonal entries on both sides we get that $T$ is actually a diagonal matrix. Thus 1 implies 2; the converse is clear.

1 $\Leftrightarrow$ 3. Let $p$ be a polynomial of degree at most $n - 1$ such that $p(\lambda_i) = \bar\lambda_i$ for each $i$. Since $A$ is normal, it is unitarily diagonalizable, so there is a unitary matrix $U$ such that $U^*AU = \mathrm{diag}(\lambda_1, \dots, \lambda_n)$. But then
$$A^* = U\,\mathrm{diag}(\bar\lambda_1, \dots, \bar\lambda_n)\,U^* = U\,\mathrm{diag}(p(\lambda_1), \dots, p(\lambda_n))\,U^* = U\,p(\mathrm{diag}(\lambda_1, \dots, \lambda_n))\,U^* = U\,p(U^*AU)\,U^* = p(A).$$
Conversely, if $A^* = p(A)$ for some polynomial $p$, then clearly $A^*A = p(A)A = Ap(A) = AA^*$.

The equivalence of 2 and 4 is clear.
5 $\Leftrightarrow$ 1. Suppose that $A$ is normal. Let $\lambda$ be an eigenvalue of $A$ corresponding to a normalized eigenvector $u$. Let $U$ be a unitary matrix with first column $u$. Then
$$U^*AU = \begin{pmatrix} \lambda & v^t \\ 0 & B \end{pmatrix},$$
where $v \in \mathbb{C}^{n-1}$. The normality of $A$ implies that
$$\begin{pmatrix} \lambda & v^t \\ 0 & B \end{pmatrix}
\begin{pmatrix} \bar\lambda & 0 \\ \bar v & B^* \end{pmatrix}
= \begin{pmatrix} \bar\lambda & 0 \\ \bar v & B^* \end{pmatrix}
\begin{pmatrix} \lambda & v^t \\ 0 & B \end{pmatrix},$$
and so $v = 0$ and $B$ is a normal $(n-1)\times(n-1)$ matrix. But then
$$U^*A^*U = \begin{pmatrix} \bar\lambda & 0 \\ 0 & B^* \end{pmatrix}.$$
Therefore $u$ is an eigenvector of $A^*$ corresponding to the eigenvalue $\bar\lambda$.

For the converse, first note that $Av = \lambda v$ if and only if $(U^*AU)(U^*v) = \lambda(U^*v)$ for any unitary matrix $U$. Thus without loss of generality we can assume that $A$ is upper triangular. Write
$$A = \begin{pmatrix} \lambda & x^t \\ 0 & B \end{pmatrix},$$
where $x \in \mathbb{C}^{n-1}$. Then $e_1$ is an eigenvector of $A$ corresponding to $\lambda$. But then
$$A^* = \begin{pmatrix} \bar\lambda & 0 \\ \bar x & B^* \end{pmatrix},$$
and as $e_1$ is also an eigenvector of $A^*$, it follows that $x = 0$. Now since every eigenvector of $A$ is also an eigenvector of $A^*$, this property is inherited by $B$. An induction hypothesis on $B$ shows that $A$ is diagonal. Hence $A$ is normal.
6 $\Leftrightarrow$ 1. If $A = B + iC$ with $B, C$ Hermitian and $BC = CB$, a direct computation gives $AA^* = A^*A$. Conversely, let $B = (A + A^*)/2$ and $C = (A - A^*)/(2i)$.

7 $\Leftrightarrow$ 1. That 7 implies 1 is clear. We prove 1 implies 7. Suppose that $U$ is a unitary matrix such that $U^*AU = \begin{pmatrix} B & C \\ 0 & D \end{pmatrix}$, with $B$ and $D$ square matrices. Then $AA^* = A^*A$ implies that
$$\begin{pmatrix} B^*B & B^*C \\ C^*B & C^*C + D^*D \end{pmatrix}
= \begin{pmatrix} BB^* + CC^* & CD^* \\ DC^* & DD^* \end{pmatrix}.$$
Therefore $B^*B = BB^* + CC^*$. Taking the trace on both sides we get $\mathrm{tr}(B^*B) = \mathrm{tr}(BB^*) + \mathrm{tr}(CC^*)$. Since $\mathrm{tr}(B^*B) = \mathrm{tr}(BB^*)$, this implies $\mathrm{tr}(CC^*) = 0$. Hence $C = 0$, and $B$ and $D$ are normal matrices.
8 $\Rightarrow$ 9 $\Rightarrow$ 5. Let $x$ be an eigenvector of $A$. Then $\langle x \rangle^\perp$ is an $A$-invariant subspace of $\mathbb{C}^n$, and therefore $\langle x \rangle$ is an $A^*$-invariant subspace. In particular, $x$ is also an eigenvector of $A^*$.

7 $\Rightarrow$ 8. Let $W$ be an $A$-invariant subspace of $\mathbb{C}^n$ and let $w_1, \dots, w_k$ be an orthonormal basis of $W$; extend it to an orthonormal basis $w_1, \dots, w_n$ of $\mathbb{C}^n$. Clearly $W^\perp = \langle w_{k+1}, \dots, w_n \rangle$. If $Q$ is the matrix whose columns are $w_1, \dots, w_n$, then $Q$ is unitary and $Q^*AQ = \begin{pmatrix} B & C \\ 0 & D \end{pmatrix}$ with $B$ and $D$ square. Thus $C = 0$, and so $W^\perp$ is also $A$-invariant. Hence 1 $\Rightarrow$ 7 $\Rightarrow$ 8 $\Rightarrow$ 9 $\Rightarrow$ 5 $\Rightarrow$ 1.
10 $\Leftrightarrow$ 1. That 10 implies 1 is a direct computation. Now suppose 1 holds. Then there is a unitary matrix $U$ such that $A = U\,\mathrm{diag}(\lambda_1, \dots, \lambda_n)\,U^*$. Therefore $A = \lambda_1u_1u_1^* + \cdots + \lambda_nu_nu_n^*$, where $u_i$ is the $i$-th column of $U$. Letting $P_i = u_iu_i^*$ we have the result.
11 $\Leftrightarrow$ 2. That 2 implies 11 is clear. Conversely, by Schur's theorem there is a unitary matrix $U$ such that $U^*AU = T$, an upper triangular matrix. Thus $\mathrm{tr}(A^*A) = \mathrm{tr}(T^*T) = \sum_i |\lambda_i|^2 + \sum_{i<j} |t_{ij}|^2$. Hence if $\mathrm{tr}(A^*A) = \sum_i |\lambda_i|^2$, then $t_{ij} = 0$ for $i < j$, and $T$ is a diagonal matrix.

12 $\Leftrightarrow$ 2. That 2 implies 12 is clear. If the singular values of $A$ are $|\lambda_1|, \dots, |\lambda_n|$, then $\mathrm{tr}(A^*A) = \sum_{i=1}^n |\lambda_i|^2$, since the trace of $A^*A$ is the sum of the squares of the singular values, and so 12 implies 11 and hence 2.
13 $\Leftrightarrow$ 2. Let $U$ be a unitary matrix such that $U^*AU = T$ is upper triangular. Then
$$\mathrm{tr}(A + A^*)^2 = \mathrm{tr}(T + T^*)^2 = \mathrm{tr}(T^2) + 2\,\mathrm{tr}(T^*T) + \mathrm{tr}\bigl((T^*)^2\bigr)
= \sum_{i=1}^n \lambda_i^2 + 2\sum_{i=1}^n |\lambda_i|^2 + 2\sum_{i<j} |t_{ij}|^2 + \sum_{i=1}^n \bar\lambda_i^2
= \sum_{i=1}^n (\lambda_i + \bar\lambda_i)^2 + 2\sum_{i<j} |t_{ij}|^2
= 4\sum_{i=1}^n (\mathrm{Re}\,\lambda_i)^2 + 2\sum_{i<j} |t_{ij}|^2.$$
Hence if 13 holds, then $t_{ij} = 0$ for $i < j$ and $T$ is a diagonal matrix. This shows that 13 implies 2; the converse is easy.

14 $\Leftrightarrow$ 2 is similar.

15 $\Leftrightarrow$ 2. We show that 15 implies 13, and hence 2. This is easy, since if the eigenvalues of $A + A^*$ are $\lambda_i + \bar\lambda_i$, then
$$\mathrm{tr}(A + A^*)^2 = \sum_{i=1}^n (\lambda_i + \bar\lambda_i)^2 = 4\sum_{i=1}^n (\mathrm{Re}\,\lambda_i)^2.$$
The converse is again straightforward.
16 $\Leftrightarrow$ 1. Since for any matrix $X$, $\mathrm{tr}(X^*X) = 0$ if and only if $X = 0$, we show that $\mathrm{tr}\bigl((A^*A - AA^*)(A^*A - AA^*)^*\bigr) = 0$. Now, using the cyclic property of the trace,
$$\mathrm{tr}\bigl((A^*A - AA^*)(A^*A - AA^*)^*\bigr) = \mathrm{tr}\bigl((A^*A - AA^*)^2\bigr) = 2\,\mathrm{tr}\bigl((A^*A)^2\bigr) - 2\,\mathrm{tr}\bigl((A^*)^2A^2\bigr).$$
Hence if 16 holds, then $A^*A - AA^* = 0$ and $A$ is normal. The converse is clear.

17 $\Rightarrow$ 1. For $x \in \mathbb{C}^n$, $(Ax, Ax) = (A^*x, A^*x)$, and so $(x, (A^*A - AA^*)x) = 0$ for all $x$. Since $A^*A - AA^*$ is Hermitian, this gives $A^*A - AA^* = 0$. Hence $A$ is normal.
18 $\Rightarrow$ 1 follows from the uniqueness of positive semi-definite square roots.

19 $\Leftrightarrow$ 1. If $A^* = AU$ with $U$ unitary, then $A^*A = A^*(A^*)^* = (AU)(AU)^* = AA^*$. Conversely, let $A = V^*\,\mathrm{diag}(\lambda_1, \dots, \lambda_n)\,V$. Then $A^* = V^*\,\mathrm{diag}(\bar\lambda_1, \dots, \bar\lambda_n)\,V$. Let $s_i = \bar\lambda_i/\lambda_i$ if $\lambda_i \neq 0$, and $s_i = 1$ otherwise. Then $\mathrm{diag}(\bar\lambda_1, \dots, \bar\lambda_n) = \mathrm{diag}(\lambda_1, \dots, \lambda_n)\,\mathrm{diag}(s_1, \dots, s_n)$, and so
$$A^* = V^*\,\mathrm{diag}(\lambda_1, \dots, \lambda_n)(VV^*)\,\mathrm{diag}(s_1, \dots, s_n)\,V = AU,$$
where $U = V^*\,\mathrm{diag}(s_1, \dots, s_n)\,V$ is unitary, since each $|s_i| = 1$.

20 $\Leftrightarrow$ 1 is similar.
21 $\Leftrightarrow$ 1. If $A = UP$ is a polar decomposition and $UP = PU$, then $A^*A = PU^*UP = P^2$ and $AA^* = UPPU^* = PUU^*P = P^2$, so $A$ is normal. Conversely, $A^*A = AA^*$ implies that $P^2 = UP^2U^* = (UPU^*)^2$. Taking positive semi-definite square roots, we have $P = UPU^*$, that is, $PU = UP$.

22 $\Leftrightarrow$ 21. $AU = UA$ if and only if $UPU = UUP$, if and only if $PU = UP$.

23 $\Leftrightarrow$ 21. If $UP = PU$, then $UPP = PUP$, that is, $AP = PA$. Thus 21 implies 23. Conversely, if $AP = PA$, then $UP^2 = PUP$, and if $P$ is invertible the result is immediate. Otherwise let $P$ be of rank $r < n$ and write $P = V^*\begin{pmatrix} D & 0 \\ 0 & 0 \end{pmatrix}V$, where $V$ is unitary and $D$ is an $r \times r$ diagonal matrix with nonzero diagonal entries. Then $UP^2 = PUP$ gives, with $W = VUV^*$,
$$W\begin{pmatrix} D^2 & 0 \\ 0 & 0 \end{pmatrix} = \begin{pmatrix} D & 0 \\ 0 & 0 \end{pmatrix}W\begin{pmatrix} D & 0 \\ 0 & 0 \end{pmatrix}.$$
Partition $W = \begin{pmatrix} W_1 & W_2 \\ W_3 & W_4 \end{pmatrix}$ with $W_1$ an $r \times r$ matrix. Then $W_1D^2 = DW_1D$ and $W_3D^2 = 0$. Since $D$ is invertible, this implies that $DW_1 = W_1D$ and $W_3 = 0$. Since $W$ is unitary it follows that $W_2 = 0$, and so
$$W\begin{pmatrix} D & 0 \\ 0 & 0 \end{pmatrix} = \begin{pmatrix} D & 0 \\ 0 & 0 \end{pmatrix}W.$$
Hence $UP = PU$.
It is easy to verify that 24, 25 and 26 are equivalent to 1.
27 $\Rightarrow$ 16. If $A$ commutes with $A^*A$, then $(A^*A)^2 = (A^*)^2A^2$, and taking the trace on both sides we get 16.

28 $\Rightarrow$ 16. If $A$ commutes with $AA^* - A^*A$, then $A^2A^* + A^*A^2 = 2AA^*A$. Multiplying both sides on the right by $A^*$ we obtain $A^2(A^*)^2 + A^*A^2A^* = 2(AA^*)^2$. Taking the trace on both sides, and using $\mathrm{tr}(A^*A^2A^*) = \mathrm{tr}\bigl(A^2(A^*)^2\bigr)$ and $\mathrm{tr}\bigl((AA^*)^2\bigr) = \mathrm{tr}\bigl((A^*A)^2\bigr)$, we obtain 16.
29 $\Leftrightarrow$ 1. Taking $B = A$ in 29 gives $A^*A = AA^*$, that is, normality. Conversely, let $A$ be normal and let $A$ and $B$ be commuting matrices. Write $A = U^*\,\mathrm{diag}(\lambda_1, \dots, \lambda_n)\,U$, where $U$ is unitary. Then $AB = BA$ implies that
$$\mathrm{diag}(\lambda_1, \dots, \lambda_n)(UBU^*) = (UBU^*)\,\mathrm{diag}(\lambda_1, \dots, \lambda_n).$$
Denote $C = UBU^* = (c_{ij})$. Then the relation above gives $(\lambda_i - \lambda_j)c_{ij} = 0$ for all $i$ and $j$. Thus if $c_{ij} \neq 0$ then $\lambda_i = \lambda_j$. In any case we have $(\bar\lambda_i - \bar\lambda_j)c_{ij} = 0$ for all $i$ and $j$, which in turn implies that $A^*B = BA^*$.

30 $\Rightarrow$ 1 is obvious: $AA^* - A^*A$ is positive semi-definite with trace zero, hence zero.