and $\widetilde{T}$ be the linear mappings induced by $T$ on $W$ and $V/W$. Then $c_T(x) = c_{\overline{T}}(x)\,c_{\widetilde{T}}(x)$.
(ii). Let $V = W_1 \oplus W_2$, where $W_1$ and $W_2$ are $T$-invariant subspaces, and let $T_1$ and $T_2$ be the linear mappings induced by $T$ on $W_1$ and $W_2$. Then $c_T(x) = c_{T_1}(x)\,c_{T_2}(x)$.
1.2 The geometric multiplicity of an eigenvalue cannot exceed its algebraic multiplicity.
Proof. Let the geometric multiplicity of $\lambda$ be $m$. Then $T$ has $m$ linearly independent eigenvectors $u_1, \ldots, u_m$, that is, $Tu_i = \lambda u_i$ for $i = 1(1)m$. Extend this to a basis of $V$: $B = \{u_1, \ldots, u_m, u_{m+1}, \ldots, u_n\}$. Then
$$[T]_B = \begin{pmatrix} \lambda I_m & B' \\ 0 & A \end{pmatrix}.$$
Therefore $c_T(x) = (x - \lambda)^m c_A(x)$ and the algebraic multiplicity of $\lambda$ is at least $m$.
Recall that $L(V)$, the set of linear operators on $V$, is a vector space over $K$ of dimension $n^2$. Thus, if $T$ is a linear operator on $V$, then $I = T^0, T, \ldots, T^{n^2}$ are linearly dependent, that is, there are scalars $\alpha_0, \alpha_1, \ldots, \alpha_{n^2}$, not all of which are zero, such that $\alpha_0 I + \alpha_1 T + \cdots + \alpha_{n^2} T^{n^2} = 0$ ($\in L(V)$). Therefore, if $f(x) = \alpha_0 + \alpha_1 x + \cdots + \alpha_{n^2} x^{n^2}$, then $f(T) = 0$. This shows that $S = \{p(x) \in K[x] \mid p(T) = 0\}$ is a nonzero principal ideal of $K[x]$. The monic polynomial which generates this ideal is called the minimal polynomial of $T$. We will denote the minimal polynomial of $T$ by $m_T(x)$.
Also, for a fixed vector $y$ in $V$, the vectors $y, Ty, \ldots, T^n y$ are linearly dependent, and so there are scalars $\alpha_0, \alpha_1, \ldots, \alpha_n$ in $K$, not all of which are zero, such that $\alpha_0 y + \alpha_1 Ty + \cdots + \alpha_n T^n y = 0$. Similarly the ideal $\{p(x) \in K[x] \mid p(T)(y) = 0\}$ of $K[x]$ is nonzero and principal. The monic polynomial $m_T^y(x)$ which generates this ideal is called the annihilator of $y$ with respect to $T$. Clearly, $m_T^y(x)$ divides the minimal polynomial $m_T(x)$.
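The annihilator can be computed numerically. The following sketch (not part of the notes; the matrix and vector are illustrative choices) finds $m_T^y(x)$ by detecting the first power $T^k y$ that depends linearly on $y, Ty, \ldots, T^{k-1}y$:

```python
# A sketch (illustrative matrix and vector): computing the annihilator of a
# vector y with respect to T by finding the first power T^k y that is a linear
# combination of y, Ty, ..., T^{k-1} y.
import numpy as np

def annihilator(T, y, tol=1e-9):
    """Return the monic coefficients [c_0, ..., c_{k-1}, 1] of m_T^y(x)."""
    vecs = [y]
    while True:
        nxt = T @ vecs[-1]
        M = np.column_stack(vecs)
        # Solve M c = nxt in the least-squares sense; if the residual is ~0,
        # T^k y is a combination of the previous iterates and we are done.
        c, *_ = np.linalg.lstsq(M, nxt, rcond=None)
        if np.linalg.norm(M @ c - nxt) < tol:
            return np.append(-c, 1.0)   # monic: x^k - (c_{k-1}x^{k-1}+...+c_0)
        vecs.append(nxt)

T = np.array([[2.0, 1.0], [0.0, 2.0]])  # minimal polynomial (x-2)^2
y = np.array([0.0, 1.0])
print(annihilator(T, y))                # coefficients of (x-2)^2 = x^2 - 4x + 4
```

Here $y$ was chosen so that its annihilator is the whole minimal polynomial; for $y = e_1$ the loop would stop one step earlier with $x - 2$.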
1.3 Primary decomposition theorem. Let the minimal polynomial of $T$ in $K[x]$ be $m_T(x) = p_1(x)^{r_1} \cdots p_k(x)^{r_k}$, where $p_1(x), \ldots, p_k(x)$ are monic irreducible polynomials and $r_1, \ldots, r_k$ are positive integers. Then
$$V = \ker p_1(T)^{r_1} \oplus \cdots \oplus \ker p_k(T)^{r_k},$$
a direct sum of $T$-invariant subspaces of $V$. If for each $i$, $T_i$ is the linear operator on $\ker p_i(T)^{r_i}$ induced by $T$, then the minimal polynomial of $T_i$ is $p_i(x)^{r_i}$.
Proof. For each $i$, let $q_i(x) = p_1(x)^{r_1} \cdots p_{i-1}(x)^{r_{i-1}} p_{i+1}(x)^{r_{i+1}} \cdots p_k(x)^{r_k} = m_T(x)/p_i(x)^{r_i}$. Then $\gcd(q_1(x), \ldots, q_k(x)) = 1$, and so there are polynomials $s_1(x), \ldots, s_k(x)$ in $K[x]$ such that $s_1(x)q_1(x) + \cdots + s_k(x)q_k(x) = 1$. Thus, $s_1(T)q_1(T) + \cdots + s_k(T)q_k(T) = I$, and for any $v$ in $V$,
$$v = s_1(T)q_1(T)v + \cdots + s_k(T)q_k(T)v.$$
Since $p_i(T)^{r_i} s_i(T)q_i(T)v = s_i(T)m_T(T)v = 0$, so $s_i(T)q_i(T)v \in \ker p_i(T)^{r_i}$. Therefore,
$$V = \ker p_1(T)^{r_1} + \cdots + \ker p_k(T)^{r_k}.$$
Next, to show that this sum is actually direct, assume that $v_1 + \cdots + v_k = 0$, $v_i \in \ker p_i(T)^{r_i}$. Clearly, for $i \neq j$, $q_i(T)v_j = 0$. Now for any $i$,
$$v_i = -(v_1 + \cdots + v_{i-1} + v_{i+1} + \cdots + v_k),$$
and so $q_i(T)v_i = 0$. Also $p_i(T)^{r_i}v_i = 0$. Since $\gcd(q_i(x), p_i(x)^{r_i}) = 1$, there are $a_i(x), b_i(x) \in K[x]$ such that $a_i(x)q_i(x) + b_i(x)p_i(x)^{r_i} = 1$. Thus, $v_i = a_i(T)q_i(T)v_i + b_i(T)p_i(T)^{r_i}v_i = 0$. Hence, the sum is direct.
For the second part, note that $p_i(T)^{r_i}v = 0$ for all $v$ in $V_i$. Thus, if $m_i(x)$ is the minimal polynomial of $T_i$, then $m_i(x)$ divides $p_i(x)^{r_i}$. Since $p_i(x)$ is monic and irreducible, $m_i(x) = p_i(x)^s$ for some $s \leq r_i$. Now the polynomial $f(x) = q_i(x)p_i(x)^s$ is such that $f(T) = 0$. Therefore, $m_T(x)$ divides $f(x)$. Hence $s = r_i$ and $m_i(x) = p_i(x)^{r_i}$.
Let $T \in L(V)$ and let $m_T(x)$ be as in 1.3. Then by the primary decomposition theorem it follows that if $T_i$ is the restriction of $T$ to the $T$-invariant subspace $V_i = \ker p_i(T)^{r_i}$, and $B_i$ is a basis of $V_i$, then $B = \cup_{i=1}^k B_i$ is a basis for $V$ and $[T]_B = \mathrm{diag}([T_1]_{B_1}, \ldots, [T_k]_{B_k})$.
1.4 There exists a vector $y$ in $V$ such that $m_T^y(x) = m_T(x)$. In particular, $\deg m_T(x) \leq \dim V$.
Proof. Write $m_T(x) = p_1(x)^{r_1} \cdots p_k(x)^{r_k}$, as in the statement of the above proposition, and $V = \ker p_1(T)^{r_1} \oplus \cdots \oplus \ker p_k(T)^{r_k}$. Then for each $i$ there exists $y_i \in \ker p_i(T)^{r_i}$ such that $p_i(T)^{r_i}y_i = 0$ and $p_i(T)^{r_i-1}y_i \neq 0$; otherwise $m_T(x)$ would not be the minimal polynomial. If $y = y_1 + \cdots + y_k$, then clearly $m_T^y(x) = m_T(x)$.
We now prove the Cayley-Hamilton Theorem: if $T \in L(V)$, then $T$ satisfies its characteristic polynomial, that is, $c_T(T) = 0$; equivalently, the minimal polynomial divides the characteristic polynomial. For that we need the following result.
1.5 If $m_T(x) = p(x)^r$, where $p(x)$ is a monic irreducible polynomial of degree $m$, then $m$ divides $n$.
Proof. Let $y \in V$ be such that $m_T^y(x) = m_T(x)$. Then $y, Ty, \ldots, T^{mr-1}y$ are linearly independent and the subspace $W$ generated by these vectors is $T$-invariant and of dimension $mr$. If $V = W$, then the result is proved.
If $V \neq W$, then $\dim V/W = n - mr < n$. Suppose that the statement is true for all vector spaces of dimension at most $n-1$. Let $\widetilde{T}$ be the linear operator on $V/W$ induced by $T$. Since the minimal polynomial of $\widetilde{T}$ divides $m_T(x)$, the minimal polynomial of $\widetilde{T}$ is $p(x)^t$ for some $t \leq r$. By the induction hypothesis, $m$ divides $n - mr$ and so $m$ divides $n$.
1.6 Cayley-Hamilton theorem. Let $V$ be a finite dimensional vector space over $K$ of dimension $n$. Let the minimal polynomial of $T$ be $m_T(x) = p_1(x)^{r_1} \cdots p_k(x)^{r_k}$, where each $p_i(x)$ is a monic irreducible polynomial of degree $m_i$, and let $\dim \ker p_i(T)^{r_i} = n_i$. Then $m_i$ divides $n_i$ and the characteristic polynomial $c_T(x) = p_1(x)^{n_1/m_1} \cdots p_k(x)^{n_k/m_k}$. In particular, $m_T(x)$ divides $c_T(x)$.
Proof. First we consider the case when $k = 1$, that is, $m_T(x) = p(x)^r$, $\deg p(x) = m$. Then we require to show that $c_T(x) = p(x)^{n/m}$. We prove it by induction on $n$. By 1.5, $n/m$ is a positive integer.
If $n = 1$, then $V = Kv$ for any $v \in V \setminus \{0\}$, and so $Tv = \lambda v$, $\lambda \in K$. Therefore $m_T(x) = x - \lambda = c_T(x)$. Assume that the hypothesis is true for all vector spaces of dimension at most $n - 1$. Let $\dim V = n$ and let $y \in V$ be such that $m_T^y(x) = m_T(x)$ (1.4). Then $y, Ty, \ldots, T^{mr-1}y$ are linearly independent and $W = \langle y, Ty, \ldots, T^{mr-1}y \rangle$ is a nonzero $T$-invariant subspace of dimension $mr$.
If $V = W$, then $n = mr$. Let $B = \{y_1 = y,\ y_2 = Ty,\ \ldots,\ y_{mr} = T^{mr-1}y\}$ be an ordered basis of $V$ and let $(p(x))^r = \alpha_0 + \alpha_1 x + \cdots + \alpha_{mr-1}x^{mr-1} + x^{mr}$. Then
$$Ty_1 = y_2,\quad Ty_2 = y_3,\quad \ldots,\quad Ty_{mr-1} = y_{mr},$$
$$Ty_{mr} = T^{mr}y = (-\alpha_0)y_1 + (-\alpha_1)y_2 + \cdots + (-\alpha_{mr-1})y_{mr},$$
and so
$$[T]_B = \begin{pmatrix} 0 & 0 & \cdots & 0 & -\alpha_0 \\ 1 & 0 & \cdots & 0 & -\alpha_1 \\ 0 & 1 & \cdots & 0 & -\alpha_2 \\ \vdots & & \ddots & & \vdots \\ 0 & 0 & \cdots & 1 & -\alpha_{mr-1} \end{pmatrix}.$$
Therefore, $c_T(x) = p(x)^r = p(x)^{n/m}$.
If $W \neq V$, and if $T$ induces linear operators $\overline{T}$ and $\widetilde{T}$ on $W$ and $V/W$ respectively, then $c_T(x) = c_{\overline{T}}(x)\,c_{\widetilde{T}}(x)$. Since $\dim W = mr < n$, by the induction hypothesis, $c_{\overline{T}}(x) = p(x)^{mr/m}$. Also, as $\dim V/W = n - mr < n$, $c_{\widetilde{T}}(x) = p(x)^{(n-mr)/m}$. Hence $c_T(x) = p(x)^{n/m}$.
Now consider the general case, $m_T(x) = p_1(x)^{r_1} \cdots p_k(x)^{r_k}$, where each $p_i(x)$ is a monic irreducible polynomial and each $r_i$ is a positive integer. The subspace $V_i = \ker p_i(T)^{r_i}$ is $T$-invariant. Let $T_i$ be the linear operator on $V_i$ induced by $T$. Then by 1.3, it follows that $V = V_1 \oplus \cdots \oplus V_k$ and $m_{T_i}(x) = p_i(x)^{r_i}$, $i = 1, \ldots, k$. Also by 1.1, we have that $c_T(x) = c_{T_1}(x) \cdots c_{T_k}(x)$. By the above paragraph, we have $c_{T_i}(x) = p_i(x)^{n_i/m_i}$, where $n_i = \dim V_i$ and $m_i = \deg p_i(x)$. Therefore, $c_T(x) = p_1(x)^{n_1/m_1} \cdots p_k(x)^{n_k/m_k}$.
Also $m_{T_i}(x) = p_i(x)^{r_i}$ on $V_i$, and so by 1.4 there exists $y_i \in V_i$ such that $m_{T_i}^{y_i}(x) = m_{T_i}(x)$. Hence $y_i, Ty_i, \ldots, T^{m_ir_i - 1}y_i$ are linearly independent in $V_i$. Thus $m_ir_i \leq n_i$, and $r_i \leq n_i/m_i$. Hence, $m_T(x)$ divides $c_T(x)$.
The above result proves that if, for a linear operator $T$, $c_T(x) = p_1(x)^{m_1} \cdots p_k(x)^{m_k}$, where $p_1(x), \ldots, p_k(x)$ are monic irreducible polynomials and $m_1, \ldots, m_k$ are positive integers, then $c_T(T) = 0$ and $m_T(x) = p_1(x)^{r_1} \cdots p_k(x)^{r_k}$, where each $r_i$ is a positive integer and $r_i \leq m_i$. For example, if $V$ is a vector space over $\mathbb{R}$ of dimension 7 and, for a linear operator $T$ on $V$, $c_T(x) = (x^2+1)^2(x-1)^3$, then $m_T(x)$ is one of the following polynomials: $(x^2+1)(x-1)$, $(x^2+1)(x-1)^2$, $(x^2+1)(x-1)^3$, $(x^2+1)^2(x-1)$, $(x^2+1)^2(x-1)^2$ or $(x^2+1)^2(x-1)^3$. In particular, $\lambda$ is an eigenvalue of $T$ if and only if $\lambda$ is a root of its minimal polynomial.
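As an illustrative check (the matrix below is an assumed block construction, not taken from the notes), here is a $7 \times 7$ real matrix with $c_T(x) = (x^2+1)^2(x-1)^3$ whose minimal polynomial is $(x^2+1)(x-1)^2$, one of the six candidates listed above:

```python
# An illustrative 7x7 real matrix with c_T(x) = (x^2+1)^2 (x-1)^3 and minimal
# polynomial (x^2+1)(x-1)^2: two companion blocks of x^2+1, one Jordan block
# J_2(1), and one 1x1 block [1].
import numpy as np

C = np.array([[0.0, -1.0], [1.0, 0.0]])     # companion matrix of x^2 + 1
J = np.array([[1.0, 1.0], [0.0, 1.0]])      # Jordan block J_2(1)
A = np.zeros((7, 7))
A[0:2, 0:2] = C
A[2:4, 2:4] = C
A[4:6, 4:6] = J
A[6, 6] = 1.0

I = np.eye(7)
p = (A @ A + I) @ (A - I) @ (A - I)         # (A^2+I)(A-I)^2: annihilates A
q = (A @ A + I) @ (A - I)                   # (A^2+I)(A-I): degree too small
print(np.allclose(p, 0), np.allclose(q, 0)) # True False
```

The smaller candidate $(x^2+1)(x-1)$ fails exactly because of the $J_2(1)$ block, which needs the factor $(x-1)^2$.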
We present another proof of the Cayley-Hamilton Theorem. The advantage of the following proof is that it shows that the theorem is also true for matrices over a commutative ring with identity. Just read $K$ as a commutative ring with identity in the following proof!
1.7 Let $A \in K^{n \times n}$. Then $c_A(A) = 0$.
Proof. Let $c_A(x) = a_0 + a_1x + \ldots + a_{n-1}x^{n-1} + x^n$ and let $B(x)$ be the classical adjoint matrix of $xI_n - A$. The entries of $B(x)$ are polynomials in $x$ of degree at most $n-1$, and so we can write $B(x) = B_0 + B_1x + \ldots + B_{n-1}x^{n-1}$, where $B_i \in K^{n \times n}$ for all $i$. Thus by the definition of the adjoint of a matrix, $(xI_n - A)B(x) = c_A(x)I_n$, that is,
$$(xI_n - A)(B_0 + B_1x + \ldots + B_{n-1}x^{n-1}) = (a_0 + a_1x + \ldots + a_{n-1}x^{n-1} + x^n)I_n.$$
Equating the coefficients of $x$ we obtain:
$$-AB_0 = a_0I_n,$$
$$B_0 - AB_1 = a_1I_n,$$
$$\vdots$$
$$B_{r-1} - AB_r = a_rI_n,$$
$$\vdots$$
$$B_{n-1} = I_n,$$
and now multiplying by the increasing powers of $A$:
$$-AB_0 = a_0I_n,$$
$$AB_0 - A^2B_1 = a_1A,$$
$$\vdots$$
$$A^rB_{r-1} - A^{r+1}B_r = a_rA^r,$$
$$\vdots$$
$$A^nB_{n-1} = A^n.$$
On adding we obtain $0 = a_0I_n + a_1A + \ldots + a_{n-1}A^{n-1} + A^n$, that is, $c_A(A) = 0$.
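The identity $c_A(A) = 0$ is easy to test numerically; a minimal sketch (illustrative matrix) evaluating $c_A$ at $A$ by Horner's scheme with matrix products:

```python
# A numerical check of 1.7 on an illustrative matrix.  np.poly returns the
# characteristic polynomial coefficients, highest degree first.
import numpy as np

A = np.array([[1.0, 2.0, 0.0],
              [3.0, 4.0, 5.0],
              [0.0, 6.0, 7.0]])
coeffs = np.poly(A)                 # [1, a_{n-1}, ..., a_0] for x^n + ... + a_0
cA = np.zeros_like(A)
for a in coeffs:                    # Horner: cA <- cA A + a I
    cA = cA @ A + a * np.eye(3)
print(np.allclose(cA, 0))           # True
```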
2 Diagonalizable and triangulable operators
2.1 Eigenvectors corresponding to distinct eigenvalues are linearly independent.
Proof. Let $\lambda_1, \ldots, \lambda_k$ be distinct eigenvalues of $T$ and let $u_1, \ldots, u_k$ be corresponding eigenvectors, $Tu_i = \lambda_iu_i$, $i = 1(1)k$. For each $i$, define
$$S_i = \frac{(T - \lambda_1I)\ldots(T - \lambda_{i-1}I)(T - \lambda_{i+1}I)\ldots(T - \lambda_kI)}{(\lambda_i - \lambda_1)\ldots(\lambda_i - \lambda_{i-1})(\lambda_i - \lambda_{i+1})\ldots(\lambda_i - \lambda_k)}.$$
Then $S_i \in L(V)$, $S_iu_j = 0$ for $i \neq j$, and $S_iu_i = u_i$. Now if $\alpha_1u_1 + \ldots + \alpha_ku_k = 0$, then $0 = S_i(\alpha_1u_1 + \ldots + \alpha_ku_k) = \alpha_iu_i$, and so $\alpha_i = 0$.
A linear operator $T$ is diagonalizable if $V$ has a basis consisting of eigenvectors of $T$. Equivalently, $T$ is diagonalizable if there is a basis $B$ of $V$ such that $[T]_B$ is a diagonal matrix. It is immediate from the above statement that if $T$ has $n$ ($= \dim V$) distinct eigenvalues then $T$ is diagonalizable. An $n \times n$ matrix is diagonalizable if it is similar to a diagonal matrix.
2.2 Let $\lambda_1, \ldots, \lambda_k$ be the distinct eigenvalues of $T \in L(V)$. Then the following statements are equivalent.
(i). $T$ is diagonalizable.
(ii). $m_T(x) = (x - \lambda_1)\ldots(x - \lambda_k)$.
(iii). $V = E(\lambda_1) \oplus \ldots \oplus E(\lambda_k)$, where $E(\lambda_i) = \ker(T - \lambda_iI)$.
(iv). $c_T(x)$ splits over $K$ and the geometric multiplicity of each eigenvalue is equal to its algebraic multiplicity.
Proof. (i) implies (ii) is clear. (ii) implies (iii) follows from the Primary Decomposition Theorem. Now assume (iii). Let $\dim E(\lambda_i) = n_i$ and let $B_i$ be an ordered basis of $E(\lambda_i)$. Then each $E(\lambda_i)$ is a $T$-invariant subspace, and if $T_i$ is the restriction of $T$ to $E(\lambda_i)$, then $[T_i]_{B_i} = \lambda_iI_{n_i}$. Now $B = \cup_{i=1}^k B_i$ is a basis of $V$ and $[T]_B = \mathrm{diag}([T_1]_{B_1}, \ldots, [T_k]_{B_k})$. Hence $c_T(x) = (x - \lambda_1)^{n_1}\ldots(x - \lambda_k)^{n_k}$, and so the geometric multiplicity of each eigenvalue is equal to its algebraic multiplicity. This proves that (iii) implies (iv). Finally, if (iv) holds then clearly $V$ has a basis consisting of eigenvectors of $T$.
If $A \in K^{n \times n}$, to decide if $A$ is diagonalizable first check whether the characteristic polynomial splits over $K$, and then check, for each root $\lambda$ of $c_A(x)$, that $\mathrm{nullity}(A - \lambda I)$ is equal to the algebraic multiplicity of $\lambda$. In this case, if we let $P$ be any matrix whose columns are $n$ linearly independent eigenvectors of $A$, then $P^{-1}AP$ is a diagonal matrix.
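A minimal numerical sketch of this recipe (the matrix is an illustrative choice):

```python
# A sketch of the diagonalizability check above: compare algebraic and
# geometric multiplicities, then diagonalize with a matrix of eigenvectors.
import numpy as np

A = np.array([[5.0, 4.0], [1.0, 2.0]])       # eigenvalues 6 and 1
eigvals, P = np.linalg.eig(A)                # columns of P are eigenvectors
for lam in eigvals:
    alg = np.sum(np.isclose(eigvals, lam))
    geo = 2 - np.linalg.matrix_rank(A - lam * np.eye(2))  # nullity(A - lam I)
    assert alg == geo                        # holds iff A is diagonalizable
D = np.linalg.inv(P) @ A @ P
print(np.round(D, 6))                        # diagonal matrix of eigenvalues
```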
A linear operator $T$ is triangulable if $V$ has an ordered basis $B$ such that $[T]_B$ is an upper triangular matrix. An $n \times n$ matrix $A$ is triangulable if $A$ is similar to an upper triangular matrix.
2.3 For $T \in L(V)$ the following statements are equivalent.
(i). $T$ is triangulable.
(ii). The characteristic polynomial $c_T(x)$ splits over $K$.
(iii). Every nonzero $T$-invariant subspace of $V$ contains an eigenvector of $T$.
Proof. (i) implies (ii) is obvious.
(ii) implies (iii). Let $W$ be any nonzero $T$-invariant subspace of $V$. If $\overline{T} \in L(W)$ is the restriction of $T$ to $W$, then $c_{\overline{T}}(x)$ divides $c_T(x)$, and so $c_{\overline{T}}(x)$ splits. If $\lambda \in K$ is a root of $c_{\overline{T}}(x)$ and $w$ is an eigenvector of $\overline{T}$ corresponding to $\lambda$, then $(T - \lambda I)(w) = (\overline{T} - \lambda I)(w) = 0$, so $w$ is an eigenvector of $T$ in $W$.
(iii) implies (i). We induct on $n$. By (iii), $V$ contains an eigenvector $x_1$ of $T$; let $W = \langle x_1 \rangle$ and let $\widetilde{T}$ be the operator induced by $T$ on $V/W$. Since $c_{\widetilde{T}}(x)$ divides $c_T(x)$, $c_{\widetilde{T}}(x)$ splits. Now $V/W$ is an $(n-1)$-dimensional vector space over $K$, so by induction there is an ordered basis $\widetilde{B} = \{\widetilde{x}_2, \ldots, \widetilde{x}_n\}$, $\widetilde{x}_i = x_i + W$, such that $[\widetilde{T}]_{\widetilde{B}}$ is upper triangular. Then $B = \{x_1, x_2, \ldots, x_n\}$ is an ordered basis of $V$ and $[T]_B$ is upper triangular.
2.4 Let $A$ be a triangulable $n \times n$ matrix whose distinct eigenvalues $\lambda_1, \ldots, \lambda_k$ have algebraic multiplicities $n_1, \ldots, n_k$. Then $A$ is similar to the matrix $\mathrm{diag}(A_1, \ldots, A_k)$, where each $A_i$ is an upper triangular $n_i \times n_i$ matrix with $\lambda_i$ on the diagonal.
Proof. By the above result, $A$ is similar to an upper triangular matrix. We can assume that equal eigenvalues appear together on the diagonal. Now suppose that the $i$-th diagonal entry is $\lambda_i$ and the $j$-th diagonal entry is $\lambda_j$, with $\lambda_i \neq \lambda_j$ and $i < j$. Then consider the product $(I - \alpha E_{ij})A(I + \alpha E_{ij})$. Since $i < j$ and the product of two upper triangular matrices is upper triangular, this matrix is upper triangular, and its $(i,j)$-th entry is:
$$e_i^t(I - \alpha E_{ij})A(I + \alpha E_{ij})e_j = (e_i^t - \alpha e_j^t)A(e_j + \alpha e_i) = e_i^tAe_j - \alpha e_j^tAe_j + \alpha e_i^tAe_i - \alpha^2 e_j^tAe_i.$$
Since $A$ is upper triangular, $e_j^tAe_i = 0$ for $j > i$. Thus if we want the $(i,j)$-th entry to be zero after this similarity transform, $\alpha$ should be such that $e_i^tAe_j - \alpha e_j^tAe_j + \alpha e_i^tAe_i = 0$, that is, $\alpha = e_i^tAe_j/(e_j^tAe_j - e_i^tAe_i) = e_i^tAe_j/(\lambda_j - \lambda_i)$.
Now if the $n$-th and $(n-1)$-th diagonal entries are different, then by a suitable value of $\alpha$ we can make the $(n-1,n)$-th entry zero by similarity. Otherwise we move on to the $(n-2)$-th diagonal entry, and continuing in this way we can make every entry linking two different eigenvalues zero, one at a time.
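The single decoupling step of the proof can be checked on a $2 \times 2$ example (illustrative matrix, with $\alpha$ chosen as in the proof):

```python
# One decoupling step of 2.4: alpha = a_ij / (a_jj - a_ii) kills the (i,j)
# entry of an upper triangular matrix whose i-th and j-th diagonal entries
# differ.  The matrix is an illustrative choice.
import numpy as np

A = np.array([[1.0, 5.0], [0.0, 3.0]])       # lambda_1 = 1, lambda_2 = 3
i, j = 0, 1
alpha = A[i, j] / (A[j, j] - A[i, i])
E = np.zeros((2, 2)); E[i, j] = 1.0
S = np.eye(2) + alpha * E                    # inverse is I - alpha E, as E^2 = 0
B = np.linalg.inv(S) @ A @ S
print(np.round(B, 6))                        # diag(1, 3)
```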
Let $S$ and $T$ be two diagonalizable operators. We say that $S$ and $T$ are simultaneously diagonalizable if there is a basis $B$ such that $[S]_B$ and $[T]_B$ are diagonal matrices.
2.6 $S$ and $T$ are simultaneously diagonalizable if and only if $S$ and $T$ commute.
Proof. The "only if" part is clear, since diagonal matrices commute. So assume that $S$ and $T$ are diagonalizable commuting operators. Let $\lambda_1, \ldots, \lambda_k$ be the distinct eigenvalues of $S$, and let $B = \cup_{i=1}^k B_i$ be a basis of $V$ such that $B_i$ is a set of eigenvectors of $S$ corresponding to $\lambda_i$. Then $[S]_B = \mathrm{diag}(\lambda_1I_{n_1}, \ldots, \lambda_kI_{n_k})$. Write
$$[T]_B = \begin{pmatrix} A_{11} & \ldots & A_{1k} \\ \vdots & & \vdots \\ A_{k1} & \ldots & A_{kk} \end{pmatrix}$$
conformally with the blocks of $[S]_B$. Now since these matrices commute, on equating blocks we have that $A_{ij} = 0$ for $i \neq j$. Thus $[T]_B = \mathrm{diag}(A_{11}, \ldots, A_{kk})$. Therefore if $V_i = \langle B_i \rangle$, then $V = V_1 \oplus \ldots \oplus V_k$; and if $T_i$ is the restriction of $T$ to $V_i$, then $[T_i]_{B_i} = A_{ii}$. This implies that $m_{T_i}(x)$ divides $m_T(x)$ and so each $T_i$ is diagonalizable. Let $B_i'$ be a basis of $V_i$ consisting of eigenvectors of $T_i$. Then for $B' = \cup_{i=1}^k B_i'$ we have that $[T]_{B'} = \mathrm{diag}([T_1]_{B_1'}, \ldots, [T_k]_{B_k'})$, a diagonal matrix. Also for each $x \in V_i$, $Sx = \lambda_ix$, so $[S]_{B'} = [S]_B$, a diagonal matrix. Hence $S$ and $T$ are simultaneously diagonalizable.
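A small numerical instance of 2.6 (the commuting pair below is an illustrative choice; note $S = 2I + T$, so they commute):

```python
# Commuting diagonalizable matrices: an orthonormal eigenbasis of one also
# diagonalizes the other, a numerical sketch of 2.6.
import numpy as np

S = np.array([[2.0, 1.0], [1.0, 2.0]])
T = np.array([[0.0, 1.0], [1.0, 0.0]])       # S = 2I + T, so ST = TS
assert np.allclose(S @ T, T @ S)
_, P = np.linalg.eigh(T)                     # orthonormal eigenvectors of T
DS = P.T @ S @ P
DT = P.T @ T @ P
print(np.allclose(DS, np.diag(np.diag(DS))),
      np.allclose(DT, np.diag(np.diag(DT))))  # True True
```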
Similarly we define simultaneously triangulable operators: there is a basis with respect to which the matrices of both operators are upper triangular.
2.7 If linear operators $S$ and $T$ commute, then they are simultaneously triangulable.
Proof. Let $\lambda$ be an eigenvalue of $T$ and let $U = \ker(T - \lambda I)$. Then for $u \in U$, $(T - \lambda I)Su = S(T - \lambda I)u = 0$. Thus $Su \in U$ and $U$ is $S$-invariant. Let $u_1$ be an eigenvector of $S$ in $U$. Then $Su_1 = \mu u_1$ and $Tu_1 = \lambda u_1$. Now consider $V/\langle u_1 \rangle$ and the operators $\widetilde{S}$ and $\widetilde{T}$ induced on it by $S$ and $T$. They commute, so by induction they are simultaneously triangulable, and lifting a simultaneously triangulating basis of $V/\langle u_1 \rangle$ and prepending $u_1$ gives a basis of $V$ with respect to which both $S$ and $T$ are upper triangular.
Now let $T$ be a triangulable operator with minimal polynomial $m_T(x) = \prod_{i=1}^k (x - \lambda_i)^{m_i}$. Then resolving into partial fractions:
$$\frac{1}{m_T(x)} = \sum_{i=1}^k \sum_{j=1}^{m_i} \frac{a_{ij}}{(x - \lambda_i)^j}.$$
Thus if $p_j(x) = \prod_{i=1,\, i \neq j}^k (x - \lambda_i)^{m_i}$, then
$$1 = \sum_{i=1}^k \left( \sum_{j=1}^{m_i} a_{ij}(x - \lambda_i)^{m_i - j} \right) p_i(x).$$
If $q_i(x) = \sum_{j=1}^{m_i} a_{ij}(x - \lambda_i)^{m_i - j}$, then $\sum_{i=1}^k p_i(x)q_i(x) = 1$.
Define $E_i = p_i(T)q_i(T)$. Then for each $i$, $E_i \in L(V)$ and it is easy to see that this is a resolution of the identity, that is:
$$E_1 + \ldots + E_k = I, \qquad E_iE_j = 0 \text{ for } i \neq j, \qquad E_i^2 = E_i.$$
Since the $E_i$'s are diagonalizable and commuting, they are simultaneously diagonalizable. Let $B$ be such a basis. Then for each $i$:
$$[E_i]_B = \mathrm{diag}(0, \ldots, 0, I_{n_i}, 0, \ldots, 0),$$
where $n_i$ is the algebraic multiplicity of the eigenvalue $\lambda_i$. Let $D = \sum_{i=1}^k \lambda_iE_i$. Then $[D]_B$ is a diagonal matrix and so $D$ is diagonalizable. If $N = T - D$, then $N = \sum_{i=1}^k (T - \lambda_iI)E_i$. Since all these linear operators are polynomials in $T$, they all commute. Therefore if $r = \max\{m_1, \ldots, m_k\}$, we have that $N^r = 0$. Hence we have the following.
2.8 Let $T$ be a triangulable operator on $V$. Then $T = D + N$, where $D$ is diagonalizable and $N$ is nilpotent such that $DN = ND$. Moreover $D$ and $N$ are uniquely determined by these properties.
Only the uniqueness part remains to be proved. Suppose that there are also $D'$ diagonalizable and $N'$ nilpotent with $D'N' = N'D'$ and $T = D' + N'$. Then $D - D' = N' - N$. Now $D'T = D'(D' + N') = (D' + N')D' = TD'$, and as $D$ is a polynomial in $T$, $DD' = D'D$. Similarly, $NN' = N'N$. Hence $D - D'$ is diagonalizable, being a difference of commuting diagonalizable operators, while $N' - N$ is nilpotent, being a difference of commuting nilpotent operators; and a diagonalizable nilpotent operator is zero. Therefore $D - D' = N' - N = 0$.
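A worked instance of the construction behind 2.8 (illustrative matrix with minimal polynomial $(x-1)^2(x-2)$; the partial-fraction coefficients were computed by hand, giving $q_1(x) = -x$ and $q_2(x) = 1$):

```python
# The resolution of the identity for an illustrative matrix with
# m_A(x) = (x-1)^2 (x-2): here p_1 = x-2, q_1 = -x, p_2 = (x-1)^2, q_2 = 1.
import numpy as np

A = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 0.0],
              [0.0, 0.0, 2.0]])
I = np.eye(3)
E1 = (A - 2 * I) @ (-A)          # p_1(A) q_1(A)
E2 = (A - I) @ (A - I)           # p_2(A) q_2(A)
assert np.allclose(E1 + E2, I) and np.allclose(E1 @ E2, 0)
D = 1.0 * E1 + 2.0 * E2          # diagonalizable part, sum of lambda_i E_i
N = A - D                        # nilpotent part
print(np.allclose(N @ N, 0), np.allclose(D @ N, N @ D))   # True True
```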
3 The Jordan form
Let $T \in L(V)$ and let $\lambda$ be an eigenvalue of $T$. For a positive integer $r$, the subspace $E_r(\lambda) = \ker(T - \lambda I)^r$ is called the generalized eigenspace of order $r$ associated with $\lambda$; $E_1(\lambda)$ is the eigenspace associated with $\lambda$. Since $V$ is finite dimensional, there is a positive integer $p$ such that
$$0 = E_0(\lambda) \subseteq E_1(\lambda) \subseteq \ldots \subseteq E_p(\lambda) = E_{p+1}(\lambda) = \ldots.$$
An element $x \in E_r(\lambda) \setminus E_{r-1}(\lambda)$ is called a generalized eigenvector of $T$ of order $r$ corresponding to $\lambda$. Clearly, if $x$ is a generalized eigenvector of order $r$, then $(T - \lambda I)x$ is a generalized eigenvector of order $r - 1$.
A sequence of nonzero vectors $x_1, \ldots, x_k$ is called a Jordan chain of length $k$ associated with eigenvalue $\lambda$ if
$$Tx_1 = \lambda x_1,$$
$$Tx_2 = \lambda x_2 + x_1,$$
$$\vdots$$
$$Tx_k = \lambda x_k + x_{k-1}.$$
3.1 A Jordan chain consists of linearly independent vectors.
Proof. Let $x_1, \ldots, x_k$ be a Jordan chain for $T$ associated with eigenvalue $\lambda$. Assume that $\alpha_1x_1 + \ldots + \alpha_kx_k = 0$ and that $r$ is the largest index such that $\alpha_r \neq 0$. Clearly $r > 1$. Write then $x_r = \sum_{i=1}^{r-1}(-\alpha_r^{-1}\alpha_i)x_i$ and operate with $(T - \lambda I)^{r-1}$ on both sides to get $x_1 = 0$, a contradiction.
The length of a Jordan chain cannot exceed the dimension of the space, and the subspace generated by a Jordan chain is $T$-invariant. If $B = \{x_1, \ldots, x_k\}$ is a Jordan chain associated with $\lambda$ and $W$ is the subspace it generates, then
$$[T|_W]_B = \begin{pmatrix} \lambda & 1 & & \\ & \lambda & \ddots & \\ & & \ddots & 1 \\ & & & \lambda \end{pmatrix}.$$
This matrix is called the Jordan block of size $k$ associated with eigenvalue $\lambda$, and we denote it by $J_k(\lambda)$. Note that $J_k(\lambda) - \lambda I_k$ is a nilpotent matrix of order $k$.
If $V$ has a basis which is a disjoint union of Jordan chains for $T$, then the matrix representation of $T$ with respect to this basis is a block diagonal matrix with Jordan blocks on the diagonal. This basis is called a Jordan basis of $V$ for $T$, and the corresponding matrix representation a Jordan canonical form for $T$.
3.2 Existence of Jordan canonical form. If the characteristic polynomial of $T$ splits over $K$, then $V$ has a Jordan basis for $T$.
Proof. First assume that $T$ is nilpotent. We prove the claim by induction on $n$. If $T = 0$, in particular if $n = 1$, then any basis of $V$ is a Jordan basis. Suppose that $T \neq 0$ and that the statement holds for all vector spaces over $K$ of dimension less than $n$.
Since $T$ is nilpotent, $W = \mathrm{im}\, T$ is a proper $T$-invariant subspace of $V$. Let $T'$ be the restriction of $T$ to $W$. By the induction hypothesis, $W$ has a Jordan basis $B' = \cup_{i=1}^k B_i$ for $T'$, a disjoint union of Jordan chains; that is, $B_i = \{x_{i1}, \ldots, x_{in_i}\}$ with $T'x_{i1} = Tx_{i1} = 0$ and $T'x_{ij} = Tx_{ij} = x_{ij-1}$ for $j = 2(1)n_i$, $i = 1(1)k$.
Now $x_{11}, \ldots, x_{k1}$ are linearly independent vectors of $\ker T$. Extend them to a basis of $\ker T$: $x_{11}, \ldots, x_{k1}, y_1, \ldots, y_q$, $q \geq 0$. Next, each $x_{in_i} \in W = \mathrm{im}\, T$, so choose $x_{in_i+1} \in V$ such that $Tx_{in_i+1} = x_{in_i}$. Now write $B = \cup_{i=1}^{k+q} B_i'$, where $B_i' = B_i \cup \{x_{in_i+1}\}$ for $i = 1(1)k$, and $B_{k+i}' = \{y_i\}$ for $i = 1(1)q$. We now show that $B$ is a basis of $V$.
Clearly $|B| = |B'| + k + q = \dim W + \dim \ker T = n$, so it is enough to check that $B$ is linearly independent. Suppose
$$\sum_{i=1}^k \sum_{j=1}^{n_i+1} \alpha_{ij}x_{ij} + \sum_{r=1}^q \beta_ry_r = 0,$$
where $\alpha_{ij} \in K$ and $\beta_r \in K$. Then operating with $T$ on both sides we have $\sum_{i=1}^k \sum_{j=2}^{n_i+1} \alpha_{ij}x_{ij-1} = 0$, and so $\alpha_{ij} = 0$ for $j = 2(1)n_i+1$, $i = 1(1)k$. Thus $\sum_{i=1}^k \alpha_{i1}x_{i1} + \sum_{r=1}^q \beta_ry_r = 0$, which implies that $\alpha_{i1} = 0$ for $i = 1(1)k$ and $\beta_r = 0$ for $r = 1(1)q$, as $x_{11}, \ldots, x_{k1}, y_1, \ldots, y_q$ is a basis of $\ker T$.
Finally, if $T$ is arbitrary (with $c_T(x)$ split over $K$), then it follows that the minimal polynomial of $T$ is of the form $(x - \lambda_1)^{m_1}\ldots(x - \lambda_k)^{m_k}$, where $\lambda_1, \ldots, \lambda_k$ are the distinct eigenvalues of $T$. By the primary decomposition theorem, $V = V_1 \oplus \ldots \oplus V_k$, where $V_i = \ker(T - \lambda_iI)^{m_i}$ is a $T$-invariant subspace. Let $T_i$ be the restriction of $T$ to $V_i$. Then $T_i \in L(V_i)$ and $S_i = T_i - \lambda_iI$ is a nilpotent operator on $V_i$. Therefore $V_i$ has a Jordan basis $B_i$ for $S_i$, and hence for $T_i$, associated with eigenvalue $\lambda_i$. Hence $B = \cup_{i=1}^k B_i$ is a Jordan basis for $T$.
3.3 Let $B$ be a Jordan basis of $V$ for $T$. Then the number of vectors in $B$ that are generalized eigenvectors of $T$ corresponding to eigenvalue $\lambda$ of order up to $s$ is $\dim \ker(T - \lambda I)^s$.
Proof. Let $B = \cup_{i=1}^r B_i$ be a Jordan basis, a union of disjoint Jordan chains $B_i = \{x_{i1}, \ldots, x_{im_i}\}$, $i = 1(1)r$, with $B_i$ associated with eigenvalue $\lambda_i$. Let $B_1, \ldots, B_l$ be all the Jordan chains corresponding to $\lambda$. For a positive integer $p$ define
$$\widetilde{x}_{ip} = \begin{cases} x_{ip} & \text{if } p \leq m_i, \\ 0 & \text{if } p > m_i. \end{cases}$$
We prove by induction on $s$ that $\ker(T - \lambda I)^s$ has a basis consisting of the nonzero elements of the set $\{\widetilde{x}_{ij} : i = 1(1)l,\ j = 1(1)s\}$. This will clearly prove the statement.
Now $x_{11}, \ldots, x_{l1}$ is a linearly independent subset of $\ker(T - \lambda I)$. Thus to prove the induction hypothesis for $s = 1$ we need to check that this set actually spans $\ker(T - \lambda I)$. If $v = \sum_{i=1}^{r}\sum_{j=1}^{m_i} \alpha_{ij}x_{ij} \in \ker(T - \lambda I)$, then
$$0 = (T - \lambda I)v = \sum_{i=1}^l \sum_{j=2}^{m_i} \alpha_{ij}x_{ij-1} + \sum_{i=l+1}^r \sum_{j=1}^{m_i} \alpha_{ij}(\lambda_i - \lambda)x_{ij} + \sum_{i=l+1}^r \sum_{j=2}^{m_i} \alpha_{ij}x_{ij-1}$$
$$= \sum_{i=1}^l \sum_{j=1}^{m_i-1} \alpha_{ij+1}x_{ij} + \sum_{i=l+1}^r \left[ \sum_{j=1}^{m_i-1} \big(\alpha_{ij}(\lambda_i - \lambda) + \alpha_{ij+1}\big)x_{ij} + (\lambda_i - \lambda)\alpha_{im_i}x_{im_i} \right].$$
Therefore $\alpha_{ij} = 0$ for $j = 2(1)m_i$, $i = 1(1)l$, and for $j = 1(1)m_i$, $i = l+1(1)r$. Hence $v = \sum_{i=1}^l \alpha_{i1}x_{i1}$, and $x_{11}, \ldots, x_{l1}$ is a basis of $\ker(T - \lambda I)$.
Now assume the claim for $s$. We show that $\ker(T - \lambda I)^{s+1}$ has a basis consisting of the nonzero elements of $\{\widetilde{x}_{ij} : i = 1(1)l,\ j = 1(1)s+1\}$. Since the nonzero elements of this set are already linearly independent, we only need to verify that this set spans $\ker(T - \lambda I)^{s+1}$.
Let $v = \sum_{i=1}^{r}\sum_{j=1}^{m_i} \alpha_{ij}x_{ij} \in \ker(T - \lambda I)^{s+1}$. Then $(T - \lambda I)v \in \ker(T - \lambda I)^s = \langle \widetilde{x}_{ij} : i = 1(1)l,\ j = 1(1)s \rangle$. Since
$$(T - \lambda I)v = \sum_{i=1}^l \sum_{j=1}^{m_i-1} \alpha_{ij+1}x_{ij} + \sum_{i=l+1}^r \left[ \sum_{j=1}^{m_i-1} \big(\alpha_{ij}(\lambda_i - \lambda) + \alpha_{ij+1}\big)x_{ij} + (\lambda_i - \lambda)\alpha_{im_i}x_{im_i} \right],$$
we have $\alpha_{ij} = 0$ for $j = 1(1)m_i$, $i = l+1(1)r$, and also $\alpha_{ij} = 0$ for $j = s+2(1)m_i$, $i = 1(1)l$, whenever $m_i > s+1$. Hence $v = \sum_{i=1}^l \sum_{j=1}^{s+1} \alpha_{ij}x_{ij}$.
3.4 The number of Jordan chains for $T$ of length $m$ associated with eigenvalue $\lambda$ is
$$2\dim \ker(T - \lambda I)^m - \dim \ker(T - \lambda I)^{m+1} - \dim \ker(T - \lambda I)^{m-1},$$
or, equivalently,
$$\mathrm{rank}(T - \lambda I)^{m+1} + \mathrm{rank}(T - \lambda I)^{m-1} - 2\,\mathrm{rank}(T - \lambda I)^m.$$
Proof. The number of Jordan chains for $T$ of length at least $m$ associated with $\lambda$ is exactly the number of generalized eigenvectors of $T$ of order $m$ corresponding to $\lambda$ which appear in a Jordan basis. By the above result this is equal to $l_m = \dim \ker(T - \lambda I)^m - \dim \ker(T - \lambda I)^{m-1}$. Therefore the number of Jordan chains for $T$ of length exactly equal to $m$ associated with $\lambda$ is $l_m - l_{m+1}$.
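The rank formula of 3.4 can be verified directly on a nilpotent matrix assembled from Jordan blocks of known sizes (an illustrative choice):

```python
# Recovering the Jordan block structure of a nilpotent matrix (eigenvalue 0,
# blocks of sizes 3, 2, 2, 1) from the ranks of its powers, as in 3.4.
import numpy as np

def jordan_block(size):
    return np.eye(size, k=1)                 # J_size(0): ones on superdiagonal

blocks = [3, 2, 2, 1]
A = np.zeros((8, 8))
pos = 0
for s in blocks:
    A[pos:pos+s, pos:pos+s] = jordan_block(s)
    pos += s

def r(m):                                    # rank (A - 0 I)^m
    return np.linalg.matrix_rank(np.linalg.matrix_power(A, m))

for m in range(1, 4):
    count = r(m + 1) + r(m - 1) - 2 * r(m)
    print(m, count)        # m=1: 1 block, m=2: 2 blocks, m=3: 1 block
```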
Let $T$ be a linear operator on $V$ and let $c_T(x) = (x - \lambda_1)^{n_1}\ldots(x - \lambda_k)^{n_k}$, where $\lambda_1, \ldots, \lambda_k$ are the distinct eigenvalues of $T$. Then by 3.2 the matrix representation of $T$ with respect to a Jordan basis is $J = \mathrm{diag}(J_1, \ldots, J_k)$, where for each $i = 1(1)k$, $J_i = \mathrm{diag}(J_{m(i,1)}(\lambda_i), \ldots, J_{m(i,r_i)}(\lambda_i))$. We order the sizes of these Jordan blocks so that $m(i,1) \geq \ldots \geq m(i,r_i)$. Such a matrix $J$ is the Jordan canonical form, or simply the Jordan form, of $T$. Note that for each eigenvalue $\lambda_i$ the number $r_i$ and the sizes $m(i,1), \ldots, m(i,r_i)$ are uniquely determined by $T$. For each eigenvalue $\lambda_i$, the number $r_i$ is the geometric multiplicity of $\lambda_i$, and $m(i,1) + \ldots + m(i,r_i) = n_i$, the algebraic multiplicity of $\lambda_i$. Also it is easy to verify that each $J_i$ is such that $J_i - \lambda_iI_{n_i}$ is nilpotent of order $m(i,1)$. Hence the minimal polynomial of $T$ is $(x - \lambda_1)^{m(1,1)}\ldots(x - \lambda_k)^{m(k,1)}$.
Now we show by direct matrix multiplications how an $n \times n$ matrix can be transformed to its Jordan form.
3.5 Let $J = J_n(0)$. Then $J^tJ = I - E_{11}$.
Proof. $J = E_{12} + \ldots + E_{n-1\,n}$ and $J^t = E_{21} + \ldots + E_{n\,n-1}$. Thus $J^tJ = (E_{21} + \ldots + E_{n\,n-1})(E_{12} + \ldots + E_{n-1\,n}) = E_{22} + \ldots + E_{nn} = I - E_{11}$.
3.6 Let $J = J_m(0)$, $a \in \mathbb{C}^n$ and $B \in \mathbb{C}^{n \times n}$. Then
$$\begin{pmatrix} I_m & e_{i+1}a^t \\ 0 & I_n \end{pmatrix} \begin{pmatrix} J & e_ia^t \\ 0 & B \end{pmatrix} \begin{pmatrix} I_m & -e_{i+1}a^t \\ 0 & I_n \end{pmatrix} = \begin{pmatrix} J & e_{i+1}a^tB \\ 0 & B \end{pmatrix}.$$
Proof. Direct multiplication gives:
$$\begin{pmatrix} I_m & e_{i+1}a^t \\ 0 & I_n \end{pmatrix} \begin{pmatrix} J & e_ia^t \\ 0 & B \end{pmatrix} \begin{pmatrix} I_m & -e_{i+1}a^t \\ 0 & I_n \end{pmatrix} = \begin{pmatrix} J & -Je_{i+1}a^t + e_ia^t + e_{i+1}a^tB \\ 0 & B \end{pmatrix}.$$
Now since $Je_{i+1} = e_i$, the result follows.
3.7 Let $A$ be a strictly upper triangular $n \times n$ matrix. Then there exist an invertible matrix $P$ and positive integers $n_1, \ldots, n_k$, $n_1 \geq \ldots \geq n_k > 0$ and $n_1 + \ldots + n_k = n$, such that $P^{-1}AP = \mathrm{diag}(J_{n_1}(0), \ldots, J_{n_k}(0))$. Moreover, if $A$ has real entries then $P$ can also be chosen to have real entries.
Proof. By induction on $n$. If $n = 1$ then $A$ is a zero matrix and so the statement is obvious. Assume that the statement holds for all matrices of order less than $n$. Write $A = \begin{pmatrix} 0 & a^t \\ 0 & A_1 \end{pmatrix}$, where $a \in \mathbb{C}^{n-1}$ and $A_1 \in \mathbb{C}^{(n-1) \times (n-1)}$. By induction there is an $(n-1) \times (n-1)$ invertible matrix $S_1$ such that $S_1^{-1}A_1S_1 = \mathrm{diag}(J_{r_1}, \ldots, J_{r_s}) = \mathrm{diag}(J, B)$ with $r_1 \geq \ldots \geq r_s > 0$, $r_1 + \ldots + r_s = n - 1$, and $J = J_{r_1}$, $B = \mathrm{diag}(J_{r_2}, \ldots, J_{r_s})$.
If $P_1 = \mathrm{diag}(1, S_1)$, then $P_1^{-1}AP_1 = \begin{pmatrix} 0 & a^tS_1 \\ 0 & S_1^{-1}A_1S_1 \end{pmatrix}$. Now write $a^tS_1 = [b^t\ \ c^t]$, $b \in \mathbb{C}^{r_1}$ and $c \in \mathbb{C}^{n-1-r_1}$. Then
$$P_1^{-1}AP_1 = \begin{pmatrix} 0 & b^t & c^t \\ 0 & J & 0 \\ 0 & 0 & B \end{pmatrix}.$$
Next consider the following similarity transform:
$$\begin{pmatrix} 1 & -b^tJ^t & 0 \\ 0 & I & 0 \\ 0 & 0 & I \end{pmatrix} \begin{pmatrix} 0 & b^t & c^t \\ 0 & J & 0 \\ 0 & 0 & B \end{pmatrix} \begin{pmatrix} 1 & b^tJ^t & 0 \\ 0 & I & 0 \\ 0 & 0 & I \end{pmatrix} = \begin{pmatrix} 0 & b^t(I - J^tJ) & c^t \\ 0 & J & 0 \\ 0 & 0 & B \end{pmatrix} = \begin{pmatrix} 0 & (b^te_1)e_1^t & c^t \\ 0 & J & 0 \\ 0 & 0 & B \end{pmatrix}.$$
If $b^te_1 \neq 0$, then
$$\mathrm{diag}\!\left(\tfrac{1}{b^te_1},\ I,\ \tfrac{1}{b^te_1}I\right) \begin{pmatrix} 0 & (b^te_1)e_1^t & c^t \\ 0 & J & 0 \\ 0 & 0 & B \end{pmatrix} \mathrm{diag}\!\left(b^te_1,\ I,\ b^te_1I\right) = \begin{pmatrix} 0 & e_1^t & c^t \\ 0 & J & 0 \\ 0 & 0 & B \end{pmatrix}.$$
Now $\begin{pmatrix} 0 & e_1^t \\ 0 & J \end{pmatrix} = J_{r_1+1}(0)$, a Jordan block of order $r_1 + 1$ with zeros on the diagonal. Let us denote this matrix by $\widetilde{J}$. Thus
$$\begin{pmatrix} 0 & e_1^t & c^t \\ 0 & J & 0 \\ 0 & 0 & B \end{pmatrix} = \begin{pmatrix} \widetilde{J} & e_1c^t \\ 0 & B \end{pmatrix}.$$
Using the above result (3.6) recursively, we have:
$$\begin{pmatrix} I & e_{i+1}c^tB^{i-1} \\ 0 & I \end{pmatrix} \begin{pmatrix} \widetilde{J} & e_ic^tB^{i-1} \\ 0 & B \end{pmatrix} \begin{pmatrix} I & -e_{i+1}c^tB^{i-1} \\ 0 & I \end{pmatrix} = \begin{pmatrix} \widetilde{J} & e_{i+1}c^tB^i \\ 0 & B \end{pmatrix},$$
for $i = 1, 2, \ldots$. Since $B^{r_1} = 0$, after $r_1$ steps we have that the matrix $\begin{pmatrix} \widetilde{J} & e_1c^t \\ 0 & B \end{pmatrix}$ is similar to $\begin{pmatrix} \widetilde{J} & 0 \\ 0 & B \end{pmatrix}$, which is in the required form.
If $b^te_1 = 0$, then $A$ is similar to $\begin{pmatrix} 0 & 0 & c^t \\ 0 & J & 0 \\ 0 & 0 & B \end{pmatrix}$, which is permutation similar to the matrix $\begin{pmatrix} J & 0 & 0 \\ 0 & 0 & c^t \\ 0 & 0 & B \end{pmatrix}$. By the induction hypothesis, there is an invertible matrix $S_2 \in \mathbb{C}^{(n-r_1) \times (n-r_1)}$ such that $S_2^{-1}\begin{pmatrix} 0 & c^t \\ 0 & B \end{pmatrix}S_2 = J'$, a Jordan matrix with zeros on the diagonal. Therefore $A$ is similar to $\begin{pmatrix} J_{r_1} & 0 \\ 0 & J' \end{pmatrix}$, which is the Jordan matrix in the required form, except that the Jordan blocks need not be in descending order; that can be arranged by a permutation similarity.
Finally, note that if $A$ has real entries then all the similarities used in this proof can be carried out by real matrices.
3.8 Jordan decomposition theorem. Every $A \in \mathbb{C}^{n \times n}$ is similar to a matrix $\mathrm{diag}(J_{n_1}(\lambda_1), \ldots, J_{n_k}(\lambda_k))$, where $J_{n_i}(\lambda_i) \in \mathbb{C}^{n_i \times n_i}$, $n_1 + \ldots + n_k = n$, and each $J_{n_i}(\lambda_i) - \lambda_iI_{n_i}$ is of the form of the above result. This form is essentially unique, that is, it depends only on $A$ and the order of occurrence of the eigenvalues.
Proof. Only uniqueness remains to be proved. For that note that if $\lambda \neq 0$, then $\mathrm{rank}\, J_m(\lambda)^n = m$ for all positive integers $n$, while for $J_m(0)$ we have $\mathrm{rank}\, J_m(0)^n = 0$ if $n \geq m$, and $\mathrm{rank}\, J_m(0)^{n-1} - \mathrm{rank}\, J_m(0)^n = 1$ for $n \leq m$. Write $r_n(\lambda) = \mathrm{rank}(A - \lambda I)^n$. Then $r_{n-1}(\lambda_i) - r_n(\lambda_i)$ is the number of Jordan blocks with eigenvalue $\lambda_i$ of size at least $n$ appearing in $J$. Therefore the number of Jordan blocks of size exactly equal to $n$ is
$$(r_{n-1}(\lambda_i) - r_n(\lambda_i)) - (r_n(\lambda_i) - r_{n+1}(\lambda_i)) = r_{n-1}(\lambda_i) - 2r_n(\lambda_i) + r_{n+1}(\lambda_i).$$
Thus two $n \times n$ matrices $A$ and $B$ are similar if and only if they have the same characteristic polynomial and, for each eigenvalue $\lambda$ and positive integer $k$, $\mathrm{rank}(A - \lambda I)^k = \mathrm{rank}(B - \lambda I)^k$.
4 Unitary, Self-adjoint, Normal Operators
In this section V will always denote an n-dimensional inner product space
over F = R or C.
4.1 Let $V$ and $W$ be finite dimensional inner product spaces and let $T \in L(V, W)$. Then there is a unique $T^* \in L(W, V)$ such that $(Tv, w) = (v, T^*w)$ for all $v \in V$ and $w \in W$. The linear map $T^*$ is called the adjoint of $T$: for a fixed $w \in W$, the map $v \mapsto (Tv, w)$ is a linear functional on $V$, and so there is a unique vector $T^*w \in V$ with $(Tv, w) = (v, T^*w)$ for all $v \in V$.
For linearity:
$$(v, T^*(\alpha u + w)) = (Tv, \alpha u + w) = \bar{\alpha}(Tv, u) + (Tv, w) = \bar{\alpha}(v, T^*u) + (v, T^*w) = (v, \alpha T^*u + T^*w)$$
for all $v \in V$ and $u, w \in W$, and so $T^*(\alpha u + w) = \alpha T^*u + T^*w$. For uniqueness, if $S \in L(W, V)$ is another such map, then for all $v \in V$ and $w \in W$ we have $(v, (S - T^*)(w)) = 0$. Hence $S = T^*$.
Following are the basic properties of the adjoint.
4.2 If $S, T \in L(V, W)$, then $(S + \alpha T)^* = S^* + \bar{\alpha}T^*$ for $\alpha \in F$, and $S^{**} = S$, where $S^{**} = (S^*)^*$. Also $(ST)^* = T^*S^*$ for composable $S$ and $T$.
If $T \in L(V)$ and $T$ is invertible, then $(T^*)^{-1} = (T^{-1})^*$.
Let $T \in L(V, W)$ and let $B = \{v_1, \ldots, v_n\}$ and $B' = \{w_1, \ldots, w_m\}$ be ordered orthonormal bases of $V$ and $W$ respectively. Then
$$Tv_j = \sum_{i=1}^m (Tv_j, w_i)w_i,$$
for all $j = 1(1)n$. Thus the matrix of $T$ with respect to the bases $B$ and $B'$ is ${}_{B'}[T]_B$, whose $(i,j)$-th entry is $(Tv_j, w_i)$.
If $A \in \mathbb{C}^{m \times n}$, then we denote by $A^*$ the conjugate transpose of $A$.
4.3 With $B$ and $B'$ as above,
$${}_{B}[T^*]_{B'} = ({}_{B'}[T]_B)^*,$$
the conjugate transpose of ${}_{B'}[T]_B$.
Proof. Since $T^*w_j = \sum_{i=1}^n (T^*w_j, v_i)v_i = \sum_{i=1}^n \overline{(Tv_i, w_j)}\,v_i$, the $(i,j)$-th entry of ${}_{B}[T^*]_{B'}$ is $\overline{(Tv_i, w_j)}$.
4.4 If $T \in L(V)$, then $\det T^* = \overline{\det T}$ and $\mathrm{tr}\, T^* = \overline{\mathrm{tr}\, T}$.
A linear operator $T$ on $V$ is called normal if $TT^* = T^*T$; unitary if $T$ is invertible and $TT^* = I = T^*T$; and self-adjoint if $T^* = T$.
4.5 The following statements are equivalent.
(i) $T$ is unitary;
(ii) $T$ preserves inner products: $(Tu, Tv) = (u, v)$ for all $u, v \in V$;
(iii) $T$ preserves norms: $\|Tu\| = \|u\|$ for all $u \in V$;
(iv) $T$ maps an orthonormal basis to an orthonormal basis.
4.6 Eigenvalues of unitary operators have absolute value 1. In particular, a unitary operator on a real inner product space has eigenvalues $1$ or $-1$.
An $n \times n$ matrix $A$ is called unitary if $A$ is invertible and $A^{-1} = A^*$; $A$ is called orthogonal if $A^{-1} = A^t$.
4.7 Let $A \in F^{n \times n}$. Then the following statements are equivalent.
(i) $A$ is a unitary matrix.
(ii) The columns of $A$ form an orthonormal basis of the standard inner product space $F^n$.
(iii) The rows of $A$ form an orthonormal basis of the standard inner product space $F^n$.
It is easy to see that if $T$ is unitary then the matrix of $T$ with respect to an orthonormal basis is unitary. If $T$ is unitary on a real inner product space, the matrix of $T$ is actually orthogonal.
Recall the Gram-Schmidt procedure. If $B = \{u_1, \ldots, u_n\}$ is an ordered basis of an inner product space $V$, then there is an ordered orthonormal basis $B' = \{v_1, \ldots, v_n\}$ such that $\langle u_1, \ldots, u_k \rangle = \langle v_1, \ldots, v_k \rangle$ for $k = 1(1)n$. In fact, these are such that:
$$u_1 = \alpha_{11}v_1,$$
$$u_2 = \alpha_{12}v_1 + \alpha_{22}v_2,$$
$$\vdots$$
$$u_n = \alpha_{1n}v_1 + \ldots + \alpha_{nn}v_n,$$
where $\alpha_{ij} \in F$ and $\alpha_{ii} > 0$ for $i, j = 1(1)n$. We use this to prove the following statement.
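A direct implementation of the procedure for the real case (illustrative vectors); the coefficients $\alpha_{ij}$ come out as an upper triangular matrix $R$ with positive diagonal, so that $U = VR$ column by column:

```python
# Gram-Schmidt on the columns of U (real case, illustrative vectors):
# returns orthonormal columns V and the upper triangular coefficients R.
import numpy as np

def gram_schmidt(U):
    """Columns of U: basis u_1..u_n.  Returns (V, R) with U = V R."""
    n = U.shape[1]
    V = np.zeros_like(U, dtype=float)
    R = np.zeros((n, n))
    for k in range(n):
        w = U[:, k].copy()
        for i in range(k):
            R[i, k] = V[:, i] @ U[:, k]      # alpha_ik = (u_k, v_i)
            w -= R[i, k] * V[:, i]
        R[k, k] = np.linalg.norm(w)          # alpha_kk > 0
        V[:, k] = w / R[k, k]
    return V, R

U = np.array([[1.0, 1.0], [0.0, 1.0]])
V, R = gram_schmidt(U)
print(np.allclose(V.T @ V, np.eye(2)), np.allclose(U, V @ R))   # True True
```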
4.8 Schur. If $T$ is a linear operator on an inner product space $V$ which is triangulable, then there is an ordered orthonormal basis such that the matrix of $T$ with respect to this basis is upper triangular.
Proof. Let $B = \{u_1, \ldots, u_n\}$ be an ordered basis such that $[T]_B$ is upper triangular. Let $B' = \{v_1, \ldots, v_n\}$ be the ordered orthonormal basis of $V$ obtained from $B$ by the Gram-Schmidt procedure. Then the elements of $B$ and $B'$ are related by the above equations. In other words, if $I$ is the identity operator, then
$${}_{B'}[I]_B = \begin{pmatrix} \alpha_{11} & \ldots & \alpha_{1n} \\ & \ddots & \vdots \\ 0 & & \alpha_{nn} \end{pmatrix}.$$
Since $[T]_{B'} = {}_{B'}[I]_B\,[T]_B\,{}_{B}[I]_{B'}$, and all the matrices on the right are upper triangular (the inverse of an invertible upper triangular matrix is again upper triangular), $[T]_{B'}$ is also upper triangular.
Now we look at some important results about self-adjoint operators. Clearly, the sum of self-adjoint operators is self-adjoint, and the inverse of a self-adjoint operator, if it exists, is self-adjoint. The product of two commuting self-adjoint operators is also self-adjoint.
4.9 Let $T$ be a self-adjoint operator. Then
(i) for all $v \in V$, $(Tv, v) \in \mathbb{R}$;
(ii) if $(Tv, v) = 0$ for all $v \in V$, then $T = 0$;
(iii) if $T^mv = 0$ for some positive integer $m$, then $Tv = 0$;
(iv) all roots of $c_T(x)$ are real;
(v) eigenvectors corresponding to distinct eigenvalues are orthogonal.
Proof. (i). (Tv, v) = (v, Tv) = (T
v, v) = (Tv, v)
(ii). For x, y V, 0 = (T(x + y), x + y) = (Tx, y) + (Ty, x). If V is an inner
product space over R, then 0 = (Tx, y) + (Ty, x) = 2(Tx, y). Hence for all
x, y V, (Tx, y) = 0, and T 0. If V is inner product space over C, then
replacing y by iy we have that for all x, y V, (Tx, y) (Ty, x) = 0. Hence
we have T 0 again.
(iii). We show by induction on k that T^k v = 0 implies Tv = 0. For k = 1 this is clear. Now assume that the statement is true for all positive integers less than k, and let m be the positive integer such that 2^{m−1} < k ≤ 2^m. Since T^k v = 0, T^{2^m} v = 0. Thus (T^{2^{m−1}}v, T^{2^{m−1}}v) = (T^{2^m}v, v) = 0, since T^{2^{m−1}} is self-adjoint. Hence T^{2^{m−1}}v = 0, and as 2^{m−1} < k, the induction hypothesis gives Tv = 0.
(iv). If V is an inner product space over C, then for any root λ of c_T(x) with corresponding eigenvector v we have (Tv, v) = λ(v, v); since (Tv, v) is real and (v, v) > 0, λ is a real number.
Now assume that V is an inner product space over R. Let A be the matrix of T with respect to an ordered orthonormal basis. Then A is a Hermitian matrix with real entries, actually symmetric. Considering A as a linear operator on C^n with the standard inner product, A is self-adjoint and c_A(x) = c_T(x). Therefore all roots of c_A(x) are real, and so all roots of c_T(x) are real.
(v). If λ and μ are distinct eigenvalues with corresponding eigenvectors u and v, then λ and μ are real numbers and λ(u, v) = (Tu, v) = (u, Tv) = μ(u, v). Hence (u, v) = 0.
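Statements (i), (iv) and (v) can be observed numerically with np.linalg.eigh, which is specialized to Hermitian (self-adjoint) matrices. A small illustrative sketch, assuming NumPy (the random matrix is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
M = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
A = M + M.conj().T          # A = A*, a self-adjoint (Hermitian) matrix

w, U = np.linalg.eigh(A)    # eigh is specialized to Hermitian matrices

# (iv): all eigenvalues are real (eigh returns them as real floats).
assert np.issubdtype(w.dtype, np.floating)

# (v): eigh returns a full orthonormal eigenbasis, so eigenvectors of
# distinct eigenvalues are in particular orthogonal: U* U = I.
assert np.allclose(U.conj().T @ U, np.eye(4))

# (i): (Av, v) is real for every v.
v = rng.standard_normal(4) + 1j * rng.standard_normal(4)
assert abs(np.vdot(v, A @ v).imag) < 1e-10
```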
Now we study normal operators.
4.10 Let T be a normal operator. Then
(i). ‖Tv‖ = ‖T*v‖ for all v ∈ V;
(ii). Tv = λv if and only if T*v = λ̄v;
(iii). eigenvectors corresponding to distinct eigenvalues are orthogonal;
(iv). if T^k v = 0 for some positive integer k, then Tv = 0.
Proof. (i). ‖Tv‖² = (Tv, Tv) = (v, T*Tv) = (v, TT*v) = ‖T*v‖².
(ii). Since T − λI is also a normal operator, ‖(T − λI)v‖ = ‖(T − λI)*v‖ = ‖(T* − λ̄I)v‖. Hence Tv = λv if and only if T*v = λ̄v.
(iii). If λ and μ are distinct eigenvalues with corresponding eigenvectors u and v, then using (ii), λ(u, v) = (Tu, v) = (u, T*v) = (u, μ̄v) = μ(u, v). Hence (u, v) = 0.
(iv). Since T is normal, T^k v = 0 gives (T*T)^k v = (T*)^k T^k v = 0. As T*T is self-adjoint, 4.9(iii) yields T*Tv = 0, and then ‖Tv‖² = (T*Tv, v) = 0, that is, Tv = 0.
4.11 Spectral theorem. T is normal if and only if there is an orthonormal basis of V consisting of eigenvectors of T.
For the converse direction, suppose v_1, . . . , v_n is an orthonormal basis of V with Tv_i = λ_i v_i. By (ii), T*v_i = λ̄_i v_i. Thus for each i = 1(1)n,
TT*v_i = T(λ̄_i v_i) = λ̄_i λ_i v_i = λ̄_i (Tv_i) = T*(Tv_i) = T*Tv_i.
Hence TT* = T*T.
From the above result, if T is a normal operator on V, then there is an orthonormal basis B = u_1, . . . , u_n of V consisting of eigenvectors of T, and so [T]_B = diag(λ_1, . . . , λ_n). Let P_i ∈ L(V) be such that P_i(u_i) = u_i and P_i(u_j) = 0 for j ≠ i. Then P_iP_j = 0 for i ≠ j, P_i² = P_i, and we have the spectral decomposition of T:
T = λ_1P_1 + · · · + λ_nP_n.
For matrices: if A is an n×n normal matrix, then there is a unitary matrix P such that P*AP = diag(λ_1, . . . , λ_n). Thus:
A = P diag(λ_1, . . . , λ_n)P*
= P(λ_1E_11 + · · · + λ_nE_nn)P*
= Σ_{i=1}^n λ_i (Pe_i)(Pe_i)*
= Σ_{i=1}^n λ_i P_i,
where P_i = (Pe_i)(Pe_i)*.
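A hedged numerical illustration of this spectral decomposition, assuming NumPy (the unitary P and the eigenvalues λ_i are generated at random and are not part of the notes):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 3
# A normal (not Hermitian) matrix A = P diag(lam) P* with P unitary.
M = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
P, _ = np.linalg.qr(M)                    # P is unitary
lam = rng.standard_normal(n) + 1j * rng.standard_normal(n)
A = P @ np.diag(lam) @ P.conj().T
assert np.allclose(A @ A.conj().T, A.conj().T @ A)   # A is normal

# Spectral projections P_i = (P e_i)(P e_i)*.
projs = [np.outer(P[:, i], P[:, i].conj()) for i in range(n)]

assert all(np.allclose(E @ E, E) for E in projs)      # P_i^2 = P_i
assert np.allclose(projs[0] @ projs[1], 0)            # P_i P_j = 0, i != j
assert np.allclose(sum(projs), np.eye(n))             # sum P_i = I
assert np.allclose(sum(l * E for l, E in zip(lam, projs)), A)
```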
5 Positive Definite Operators
A self-adjoint operator T is called positive semi-definite if (Tu, u) ≥ 0 for all u ∈ V, and positive definite if (Tu, u) > 0 for all u ∈ V, u ≠ 0. Clearly a positive definite operator is invertible.
5.1 Let T ∈ L(V) be self-adjoint. Then the following are equivalent.
(1). T is positive semi-definite.
(2). All eigenvalues of T are non-negative.
(3). There is a unique positive semi-definite operator P such that T = P². We write √T for P.
(4). There exists S ∈ L(V) such that T = S*S.
Proof. (1) ⟹ (2) is easy.
(2) ⟹ (3). Let T = λ_1E_1 + · · · + λ_kE_k be the spectral decomposition of T. Then P = √λ_1 E_1 + · · · + √λ_k E_k is positive semi-definite and T = P². Indeed, for v ∈ V, v = E_1v + · · · + E_kv, and so
(Pv, v) = Σ_{j=1}^k Σ_{i=1}^k √λ_i (E_iv, E_jv) = Σ_{i=1}^k √λ_i (E_iv, E_iv) ≥ 0.
(3) ⟹ (4). Let S = P.
(4) ⟹ (1). For v ∈ V, (Tv, v) = (S*Sv, v) = (Sv, Sv) ≥ 0.
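The construction of P = √T in (2) ⟹ (3) can be sketched numerically, assuming NumPy; the clip against tiny negative rounding errors is an implementation detail, not part of the proof:

```python
import numpy as np

rng = np.random.default_rng(3)
S = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
T = S.conj().T @ S            # T = S*S is positive semi-definite (item (4))

# P = sum sqrt(lam_i) E_i, built from the spectral decomposition of T.
w, U = np.linalg.eigh(T)
w = np.clip(w, 0.0, None)     # guard against tiny negative rounding errors
P = U @ np.diag(np.sqrt(w)) @ U.conj().T

assert np.allclose(P @ P, T)                      # T = P^2
assert np.allclose(P, P.conj().T)                 # P is self-adjoint
assert np.all(np.linalg.eigvalsh(P) >= -1e-10)    # P is positive semi-definite
```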
5.3 Polar Decomposition. Let T ∈ L(V). Then T = UP, where U is unitary and P is positive semi-definite.
Proof. TT* is positive semi-definite. Let λ_1, . . . , λ_r be its nonzero eigenvalues, with corresponding orthonormal eigenvectors w_1, . . . , w_r, and let u_i = (1/√λ_i) T*w_i for i = 1(1)r. Then (u_i, u_j) = δ_ij, and it is easy to verify that T*Tu_i = λ_i u_i. Next, dim ker(T*T) = dim ker TT* = n − r. Thus if u_{r+1}, . . . , u_n is an orthonormal basis of ker(T*T), then u_1, . . . , u_n is an orthonormal basis of V consisting of eigenvectors of T*T. Extend w_1, . . . , w_r to an orthonormal basis w_1, . . . , w_n of V, and let U be the linear operator whose action on bases is given by Uu_i = w_i. Then U is unitary, and if P = √(T*T), then T = UP.
Indeed, for i = 1(1)r,
Tu_i = (1/√λ_i) TT*w_i = √λ_i w_i = U(√λ_i u_i) = UPu_i,
and for i = r + 1(1)n, (Tu_i, Tu_i) = (u_i, T*Tu_i) = 0 and (Pu_i, Pu_i) = (u_i, P²u_i) = (u_i, T*Tu_i) = 0, that is, Tu_i = 0 = Pu_i.
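For matrices the same decomposition can be computed from a singular value decomposition; the following NumPy sketch (illustrative only, arbitrary random A) builds U and P = √(A*A):

```python
import numpy as np

rng = np.random.default_rng(4)
A = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))

# From the SVD A = W diag(s) Vh we get A = (W Vh)(V diag(s) Vh) = U P,
# with U = W Vh unitary and P = V diag(s) V* = sqrt(A*A) semi-definite.
W, s, Vh = np.linalg.svd(A)
U = W @ Vh
P = Vh.conj().T @ np.diag(s) @ Vh

assert np.allclose(U @ U.conj().T, np.eye(4))     # U unitary
assert np.allclose(U @ P, A)                      # A = U P
assert np.allclose(P @ P, A.conj().T @ A)         # P^2 = A*A
```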
The eigenvalues of P = √(T*T) are called the singular values of T.
5.4 Let T ∈ L(V). Then there are orthonormal bases B = u_1, . . . , u_n and B' = w_1, . . . , w_n of V such that for any x ∈ V,
Tx = Σ_{i=1}^n σ_i (x, u_i)w_i,
where σ_1, . . . , σ_n are the singular values of T.
Proof. For x ∈ V, write x = Σ_{i=1}^n (x, u_i)u_i. Then Px = Σ_{i=1}^n σ_i (x, u_i)u_i, and so, as T = UP, Tx = Σ_{i=1}^n σ_i (x, u_i)Uu_i. If w_i = Uu_i, then we have the result.
The above result shows that Tu_i = σ_i w_i. Thus we have that
_{B'}[T]_B = diag(σ_1, . . . , σ_n).
We now give a method of obtaining the singular value decomposition for A ∈ C^{m×n}.
5.5 Let A ∈ C^{m×n} with nonzero singular values σ_1, . . . , σ_r, where r is the rank of A. Then there are unitary matrices U ∈ C^{m×m} and V ∈ C^{n×n} such that
A = U diag(σ_1, . . . , σ_r, 0, . . . , 0)V.
Proof. If A is a number c, then A = |c|e^{iθ} for some θ ∈ R, and this is the singular value decomposition of A. If A is a nonzero row vector, say A = [a_1 . . . a_n], then σ_1 is the norm of A. Let V be any unitary matrix with first row the unit vector [a_1/σ_1 . . . a_n/σ_1]. Then A = [σ_1 0 . . . 0]V. A nonzero column vector is treated similarly.
We now assume that m > 1, n > 1 and A ≠ 0, and let u_1 be a unit eigenvector of A*A corresponding to σ_1², that is, A*Au_1 = σ_1²u_1 and u_1*u_1 = 1. Let v_1 = (1/σ_1)Au_1. Then v_1 is a unit vector and v_1*Au_1 = σ_1. Let P and Q be unitary matrices with u_1 and v_1 as the first column respectively. Then
Q*AP = ( σ_1 0 ; 0 B ), or A = Q ( σ_1 0 ; 0 B ) P*,
and the result follows by applying the induction hypothesis to B.
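As an illustration, np.linalg.svd returns exactly the factorization of 5.5; the sketch below (arbitrary random A, assuming NumPy) also checks that the singular values are the square roots of the eigenvalues of A*A:

```python
import numpy as np

rng = np.random.default_rng(5)
m, n = 3, 5
A = rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n))

U, s, Vh = np.linalg.svd(A)        # A = U diag(s, padded with zeros) Vh

Sigma = np.zeros((m, n))
np.fill_diagonal(Sigma, s)
assert np.allclose(U @ Sigma @ Vh, A)

# The singular values are the square roots of the eigenvalues of A*A.
ev = np.sort(np.linalg.eigvalsh(A.conj().T @ A))[::-1]
assert np.allclose(np.sqrt(np.clip(ev[:m], 0, None)), s)
```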
We can deduce the polar decomposition from the singular value decomposition.
5.6 For any square matrix A there are a positive semi-definite matrix P and a unitary matrix W such that A = PW.
Proof. Let U and V be n×n unitary matrices such that
A = U diag(σ_1, . . . , σ_r, 0, . . . , 0)V.
By inserting U*U we get
A = (U diag(σ_1, . . . , σ_r, 0, . . . , 0)U*)(UV) = PW,
where P = U diag(σ_1, . . . , σ_r, 0, . . . , 0)U* is positive semi-definite and W = UV is unitary.
5.7 Let A ∈ C^{n×n} with eigenvalues λ_1, . . . , λ_n. Then the following statements are equivalent.
1. A is normal, that is, AA* = A*A.
2. A is unitarily diagonalizable.
3. There is a polynomial p(x) such that A* = p(A).
4. There is a set of eigenvectors of A which form an orthonormal basis for C^n.
5. Every eigenvector of A is also an eigenvector of A*.
6. A = B + iC for some Hermitian matrices B and C with BC = CB.
7. If U is unitary such that U*AU = ( B C ; 0 D ), with B and D square, then B and D are normal and C = 0.
8. If W, a subspace of C^n, is A-invariant, then so is W^⊥.
9. If u is an eigenvector of A, then ⟨u⟩^⊥ is A-invariant.
10. A can be written as A = Σ_{i=1}^n λ_i P_i, with P_i ∈ C^{n×n} such that P_i² = P_i = P_i*, P_iP_j = 0 if i ≠ j, and Σ_{i=1}^n P_i = I.
11. tr(A*A) = Σ_{i=1}^n |λ_i|².
12. The singular values of A are |λ_1|, . . . , |λ_n|.
13. Σ_{i=1}^n (Re λ_i)² = tr(A + A*)²/4.
14. Σ_{i=1}^n (Im λ_i)² = −tr(A − A*)²/4.
15. The eigenvalues of A + A* are λ_1 + λ̄_1, . . . , λ_n + λ̄_n.
16. tr((A*A)²) = tr((A*)²A²).
17. ‖Ax‖ = ‖A*x‖ for all x ∈ C^n.
18. |A| = |A*|, where |A| = (A*A)^{1/2}.
19. A* = AU for some unitary matrix U.
25. A commutes with A − A*.
26. A + A* commutes with A − A*.
27. A commutes with A*A.
28. A commutes with AA*A.
29. A*B = BA* whenever AB = BA.
30. AA* − A*A is positive semi-definite.
Proof. 1 ⟹ 2. Let U be a unitary matrix such that U*AU = T, an upper triangular matrix. Then AA* = A*A implies that TT* = T*T. Equating the diagonal terms on both sides, we get that T is actually a diagonal matrix. Thus 1 ⟹ 2. The converse is clear.
1 ⟹ 3. Let p be a polynomial of degree at most n − 1 such that p(λ_i) = λ̄_i for i = 1(1)n. Since A is normal, it is unitarily diagonalizable by the above; thus there is a unitary matrix U such that U*AU = diag(λ_1, . . . , λ_n). But then
A* = U diag(λ̄_1, . . . , λ̄_n)U*
= U diag(p(λ_1), . . . , p(λ_n))U*
= U p(diag(λ_1, . . . , λ_n))U*
= U p(U*AU)U*
= p(A).
Conversely, if A* = p(A), then A*A = p(A)A = Ap(A) = AA*.
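The interpolating polynomial in 1 ⟹ 3 can be computed explicitly; a sketch assuming NumPy, where the Vandermonde solve implements p(λ_i) = λ̄_i (all matrices random and illustrative):

```python
import numpy as np

rng = np.random.default_rng(6)
n = 4
# A normal matrix with (almost surely) distinct eigenvalues lam_i.
Q, _ = np.linalg.qr(rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n)))
lam = rng.standard_normal(n) + 1j * rng.standard_normal(n)
A = Q @ np.diag(lam) @ Q.conj().T

# Coefficients of the degree <= n-1 polynomial with p(lam_i) = conj(lam_i),
# found by solving the Vandermonde system.
V = np.vander(lam, n, increasing=True)
c = np.linalg.solve(V, lam.conj())

# Evaluate p(A) = c_0 I + c_1 A + ... + c_{n-1} A^{n-1}.
pA = np.zeros((n, n), dtype=complex)
Ak = np.eye(n, dtype=complex)
for ck in c:
    pA = pA + ck * Ak
    Ak = Ak @ A

assert np.allclose(pA, A.conj().T)    # p(A) = A*
```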
The equivalence of 2 and 4 is clear.
5 ⟺ 1. Suppose that A is normal. Let λ be an eigenvalue of A with corresponding normalized eigenvector u. Let U be a unitary matrix with first column u. Then
U*AU = ( λ v^t ; 0 B ),
where v ∈ C^{n−1}. The normality of A implies that
( λ v^t ; 0 B )( λ̄ 0 ; v̄ B* ) = ( λ̄ 0 ; v̄ B* )( λ v^t ; 0 B ),
and so v = 0 and B is a normal (n−1)×(n−1) matrix. But then
U*A*U = ( λ̄ 0 ; 0 B* ).
Therefore, u is an eigenvector of A* corresponding to λ̄.
Conversely, suppose that every eigenvector of A is an eigenvector of A*. If Av = λv, then (U*AU)(U*v) = λ(U*v) for any unitary matrix U. Thus without any loss we can assume that A is upper triangular. Write A = ( λ x^t ; 0 B ), where x ∈ C^{n−1}. Then e_1 is an eigenvector of A corresponding to λ. But then A* = ( λ̄ 0 ; x̄ B* ), and as e_1 is also an eigenvector of A*, x = 0. Now B is again upper triangular and inherits the hypothesis, so by induction A is diagonal, hence normal.
6 ⟺ 1. Take B = (A + A*)/2 and C = (A − A*)/2i. Then A = B + iC, and BC − CB = (A*A − AA*)/2i, so BC = CB if and only if A is normal.
7 ⟺ 1. That 7 implies 1 is clear. We prove 1 implies 7. Suppose that U is a unitary matrix such that U*AU = ( B C ; 0 D ), with B and D square matrices. Then AA* = A*A implies that
( B*B B*C ; C*B C*C + D*D ) = ( BB* + CC* CD* ; DC* DD* ).
Therefore B*B = BB* + CC*. Taking traces, tr(B*B) = tr(BB*) + tr(CC*). Since tr(B*B) = tr(BB*), we get tr(CC*) = 0, so C = 0, and then B and D are normal.
9 ⟹ 5. If x is an eigenvector of A, then by 9, ⟨x⟩^⊥ is an A-invariant subspace of C^n; therefore ⟨x⟩ = (⟨x⟩^⊥)^⊥ is A*-invariant, that is, x is an eigenvector of A*.
7 ⟹ 8. Let W be an A-invariant subspace of C^n. Let w_1, . . . , w_k be an orthonormal basis of W, and extend it to an orthonormal basis w_1, . . . , w_n of C^n. Clearly, W^⊥ = ⟨w_{k+1}, . . . , w_n⟩. If Q is the matrix whose columns are w_1, . . . , w_n, then Q is unitary and Q*AQ = ( B C ; 0 D ), with B and D square matrices. Thus C = 0, and so W^⊥ is also A-invariant. Hence 1 ⟹ 8 ⟹ 9.
10 ⟺ 1. That 10 implies 1 is a direct computation. Now suppose 1 holds. Then there is a unitary matrix U such that A = U diag(λ_1, . . . , λ_n)U*. Therefore A = λ_1u_1u_1* + · · · + λ_nu_nu_n*, where u_i is the i-th column of U. If we let P_i = u_iu_i*, we have the result.
11 ⟺ 2. That 2 implies 11 is clear. Conversely, by Schur's lemma there is a unitary matrix U such that U*AU = T is upper triangular. Then tr(A*A) = tr(T*T), that is,
Σ_{i=1}^n |λ_i|² = Σ_{i=1}^n |λ_i|² + Σ_{i<j} |t_ij|².
Hence t_ij = 0 for i < j, and T is a diagonal matrix.
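The identity in 11 is the equality case of Schur's inequality tr(A*A) ≥ Σ|λ_i|². A small numerical illustration, assuming NumPy (the two test matrices are arbitrary):

```python
import numpy as np

def defect(A):
    """tr(A*A) - sum |lam_i|^2; zero exactly when A is normal."""
    lam = np.linalg.eigvals(A)
    return float((np.trace(A.conj().T @ A) - np.sum(np.abs(lam) ** 2)).real)

A_normal = np.array([[0.0, -1.0], [1.0, 0.0]])   # normal (skew-symmetric)
A_shear = np.array([[1.0, 1.0], [0.0, 1.0]])     # not normal

assert abs(defect(A_normal)) < 1e-9
assert defect(A_shear) > 0.5    # here it equals |t_12|^2 = 1
```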
12 ⟹ 2. That 2 implies 12 is clear. If the singular values of A are |λ_1|, . . . , |λ_n|, then tr(A*A) = Σ_{i=1}^n |λ_i|², and so 12 implies 11, and hence 2.
13 ⟺ 2. Let U be a unitary matrix such that U*AU = T is upper triangular. Then
tr(A + A*)² = tr(T + T*)² = tr(T²) + 2 tr(T*T) + tr((T*)²)
= Σ_{i=1}^n t_ii² + 2( Σ_{i=1}^n |t_ii|² + Σ_{i<j} |t_ij|² ) + Σ_{i=1}^n t̄_ii²
= Σ_{i=1}^n λ_i² + 2 Σ_{i=1}^n |λ_i|² + Σ_{i=1}^n λ̄_i² + 2 Σ_{i<j} |t_ij|²
= Σ_{i=1}^n (λ_i + λ̄_i)² + 2 Σ_{i<j} |t_ij|²
= 4 Σ_{i=1}^n (Re λ_i)² + 2 Σ_{i<j} |t_ij|².
Hence if 13 holds, then t_ij = 0 for i ≠ j and T is a diagonal matrix. This shows that 13 implies 2. The converse is easy.
14 ⟺ 2 is similar.
15 ⟹ 2. We show that 15 implies 13, and hence 2. This is easy, since by 15,
tr(A + A*)² = Σ_{i=1}^n (λ_i + λ̄_i)² = 4 Σ_{i=1}^n (Re λ_i)².
The converse is again straightforward.
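Identity 13 and the defect term (1/2)Σ_{i<j}|t_ij|² appearing in its proof can both be checked on small examples, assuming NumPy (the matrices below are arbitrary illustrations):

```python
import numpy as np

def sides_of_13(A):
    """Return (sum (Re lam_i)^2, tr(A + A*)^2 / 4)."""
    lam = np.linalg.eigvals(A)
    B = A + A.conj().T
    return float(np.sum(lam.real ** 2)), float((np.trace(B @ B) / 4).real)

# For a normal matrix, identity 13 holds.
l, r = sides_of_13(np.array([[1.0 + 1j, 0.0], [0.0, 2.0 - 1j]]))
assert abs(l - r) < 1e-12

# For a non-normal one, the right side exceeds the left by
# (1/2) sum_{i<j} |t_ij|^2, as in the computation above.
l2, r2 = sides_of_13(np.array([[1.0, 2.0], [0.0, 1.0]]))
assert abs((r2 - l2) - 2.0) < 1e-12    # (1/2)|2|^2 = 2
```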
16 ⟹ 1. Since for any matrix X, tr(X*X) = 0 if and only if X = 0, we show that tr((A*A − AA*)(A*A − AA*)*) = 0. Now
tr((A*A − AA*)(A*A − AA*)*) = tr((A*A − AA*)²)
= tr((A*A)²) − tr(A*AAA*) − tr(AA*A*A) + tr((AA*)²)
= 2( tr((A*A)²) − tr((A*)²A²) ),
since by the cyclic property of the trace, tr((AA*)²) = tr((A*A)²) and tr(A*AAA*) = tr(AA*A*A) = tr((A*)²A²). By 16 this is zero. Hence A*A − AA* = 0 and A is normal.
17 ⟹ 1. For x ∈ C^n, (Ax, Ax) = (A*x, A*x), and so (x, (A*A − AA*)x) = 0 for all x. Hence A*A − AA* = 0 and A is normal.
18 ⟹ 1 follows from the uniqueness of positive semi-definite square roots: |A| = |A*| means (A*A)^{1/2} = (AA*)^{1/2}, and squaring gives A*A = AA*.
19 ⟹ 1. If A* = AU with U unitary, then A*A = A*(A*)* = (AU)(AU)* = AUU*A* = AA*.
Conversely, let A = V* diag(λ_1, . . . , λ_n)V with V unitary. Then A* = V* diag(λ̄_1, . . . , λ̄_n)V. Let s_i = λ̄_i/λ_i if λ_i ≠ 0, and s_i = 1 otherwise. Then |s_i| = 1 and diag(λ̄_1, . . . , λ̄_n) = diag(λ_1, . . . , λ_n) diag(s_1, . . . , s_n), and so
A* = V* diag(λ_1, . . . , λ_n)(V V*) diag(s_1, . . . , s_n)V = AU,
where U = V* diag(s_1, . . . , s_n)V is unitary.
20 ⟺ 1 is similar.
21 ⟺ 1. If A = UP is a polar decomposition and UP = PU, then AA* = UP(UP)* = UP²U* = P²UU* = P², while A*A = (UP)*(UP) = PU*UP = P²; hence A is normal. Conversely, A*A = AA* implies that P² = UP²U*, that is, P²U = UP². Write
P = V*( D 0 ; 0 0 )V,
where V is unitary and D is an r×r diagonal matrix with positive diagonal. Then P²U = UP² gives, with W = VUV*,
( D² 0 ; 0 0 )W = W( D² 0 ; 0 0 ).
Partition W = ( W_1 W_2 ; W_3 W_4 ) with W_1 an r×r matrix. Then D²W_1 = W_1D², D²W_2 = 0 and W_3D² = 0. Since D is invertible, W_2 = 0 and W_3 = 0. Comparing entries in D²W_1 = W_1D² gives (d_i² − d_j²)(W_1)_ij = 0; as the d_i are positive, (d_i − d_j)(W_1)_ij = 0, and so DW_1 = W_1D. Hence
( D 0 ; 0 0 )W = W( D 0 ; 0 0 ),
that is, UP = PU.
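Condition 21 can be observed numerically: for a normal matrix the polar factors computed from the SVD commute, and for a non-normal one they do not. A sketch assuming NumPy (both matrices are arbitrary illustrations):

```python
import numpy as np

def polar(A):
    """Polar factors A = U P computed from the SVD."""
    W, s, Vh = np.linalg.svd(A)
    return W @ Vh, Vh.conj().T @ np.diag(s) @ Vh

# A normal (invertible) matrix: the polar factors commute.
Q, _ = np.linalg.qr(np.array([[1.0, 1j], [2.0, 1.0]], dtype=complex))
A = Q @ np.diag(np.array([1.0 + 1j, 3.0])) @ Q.conj().T
U, P = polar(A)
assert np.allclose(A, U @ P)
assert np.allclose(U @ P, P @ U)

# A non-normal matrix: they do not commute.
U2, P2 = polar(np.array([[1.0, 1.0], [0.0, 1.0]]))
assert not np.allclose(U2 @ P2, P2 @ U2)
```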
It is easy to verify that 24, 25 and 26 are equivalent to 1.
27 ⟹ 16. If A commutes with A*A, then AA*A = A*A², and so (A*A)² = A*(AA*A) = (A*)²A². Taking traces on both sides we have 16.
28 ⟹ 16. If A commutes with AA*A, then A²A*A = AA*A², and taking traces as in the previous step one again obtains 16.
29 ⟺ 1. For normality, take B = A. Conversely, let A be normal and let B be a matrix commuting with A. Write A = U* diag(λ_1, . . . , λ_n)U, where U is unitary. Then AB = BA implies that
diag(λ_1, . . . , λ_n)(UBU*) = (UBU*) diag(λ_1, . . . , λ_n).
Denote C = UBU* = (c_ij). Then
diag(λ_1, . . . , λ_n)C = C diag(λ_1, . . . , λ_n),
which gives (λ_i − λ_j)c_ij = 0. Thus if λ_i ≠ λ_j then c_ij = 0, and if c_ij ≠ 0 then λ_i = λ_j. In any case we have (λ̄_i − λ̄_j)c_ij = 0 for all i and j, which in turn implies that A*B = BA*.
30 ⟺ 1. If A is normal, then AA* − A*A = 0, which is positive semi-definite. Conversely, AA* − A*A always has trace zero, and a positive semi-definite matrix with zero trace is zero; hence A is normal.