
1 LINEAR SPACES AND LINEAR TRANSFORMATIONS

1.1 LINEAR SPACES

The concepts in this chapter may involve non-trivial notions from diverse areas of mathematics. The reader should not be discouraged if they cannot grasp everything on a first reading. In fact, many of these ideas −as well as their relevance− are expected to be fully understood only after having covered several chapters of this book.

Definition. A vector space (or linear space) over a field 𝐹 is a set 𝑽 together with an
internal binary operation 𝑽 × 𝑽 → 𝑽 (called addition) and an external binary operation
𝐹 × 𝑽 → 𝑽 (called scalar multiplication) such that the following axioms are satisfied:

1. 𝑽 together with the operation of addition (denoted by +) is a commutative group:

a) Addition is associative: For all 𝒗1 ,𝒗2 , 𝒗3 ∈ 𝑽,


(𝒗1 + 𝒗2 ) + 𝒗3 = 𝒗1 + (𝒗2 + 𝒗3 ).
b) Existence of an identity: 𝑽 contains a unique element, denoted by 𝟎⃑, such that
𝟎⃑ + 𝒗 = 𝒗 = 𝒗 + 𝟎⃑ for all 𝒗 ∈ 𝑽.
c) Existence of inverses: For every 𝒗 ∈ 𝑽 there is a unique element −𝒗 ∈ 𝑽 such that
𝒗 + (−𝒗) = 𝟎⃑ = (−𝒗) + 𝒗.
d) Addition is commutative: For all 𝒗1 ,𝒗2 ∈ 𝑽,
𝒗1 + 𝒗2 = 𝒗2 + 𝒗1 .

2. a) For each element 𝑘 ∈ 𝐹 and each pair of elements 𝒗1 ,𝒗2 ∈ 𝑽,


𝑘(𝒗1 + 𝒗2 ) = 𝑘𝒗1 + 𝑘𝒗2 .
b) For each pair of elements 𝑘, 𝑙 ∈ 𝐹 and each element 𝒗 ∈ 𝑽,
(𝑘 + 𝑙)𝒗 = 𝑘𝒗 + 𝑙𝒗.
c) For each pair of elements 𝑘, 𝑙 ∈ 𝐹 and each element 𝒗 ∈ 𝑽,
(𝑘𝑙)𝒗 = 𝑘(𝑙𝒗).
d) For each element 𝒗 ∈ 𝑽,
1𝒗 = 𝒗.

Diverse elementary consequences can be easily derived either from the cancellation law
for addition or the properties in 2, or a combination of both. The following are frequently
used:

1. For all 𝒗 ∈ 𝑽, 0𝒗 = 𝟎⃑ and (−1)𝒗 = −𝒗.

2. For all 𝑘 ∈ 𝐹, 𝑘𝟎⃑ = 𝟎⃑.


Henceforth, the additive identity element 𝟎⃑ in any vector space will be called the zero vector.

Example 1 If 𝐹 is any field, the set

𝐹 𝑛 = 𝐹 × ⋯ × 𝐹 (𝑛 times) = {(𝑎1 , 𝑎2 , … , 𝑎𝑛 ) | 𝑎𝑗 ∈ 𝐹, 𝑗 = 1, … , 𝑛}
together with the natural coordinatewise operations of addition and scalar multiplication

(𝑎1 , 𝑎2 , … , 𝑎𝑛 ) + (𝑏1 , 𝑏2 , … , 𝑏𝑛 ) = (𝑎1 + 𝑏1 , 𝑎2 + 𝑏2 , … , 𝑎𝑛 + 𝑏𝑛 ) and

𝑘(𝑎1 , 𝑎2 , … , 𝑎𝑛 ) = (𝑘𝑎1 , 𝑘𝑎2 , … , 𝑘𝑎𝑛 )

is a vector space over 𝐹.

Example 2 For a given field 𝐹 the set 𝑀𝑚×𝑛 (𝐹) of 𝑚 × 𝑛 matrices with entries in
𝐹 together with coordinatewise addition and scalar multiplication is a vector space over 𝐹.

Example 3 Given a set 𝑅 and a field 𝐹, let F(𝑅, 𝐹) denote the set of functions on 𝑅 with
values in 𝐹 . The set F (𝑅, 𝐹) together with the operations of addition and scalar
multiplication defined for 𝑓, 𝑔 ∈ F(𝑅, 𝐹) and 𝑘 ∈ 𝐹 by

(𝑓 + 𝑔)(𝑟) = 𝑓(𝑟) + 𝑔(𝑟) and (𝑘𝑓)(𝑟) = 𝑘𝑓(𝑟),

for each 𝑟 ∈ 𝑅 , is a vector space over 𝐹.

Example 4 For a given field 𝐹 the set 𝐹 𝜔 of ∞−tuples (𝑎1 , 𝑎2 , … , 𝑎𝑗 , … ) such that 𝑎𝑗 ∈ 𝐹 for 𝑗 = 1, 2, …, together with coordinatewise addition and scalar multiplication, is a vector space over 𝐹.

The elements in a vector space 𝑽 over a field 𝐹 are called vectors and the elements in 𝐹
are called scalars.

Definition. A nonempty subset 𝑾 of a vector space 𝑽 over a field 𝐹 is a subspace of 𝑽 if

1. 𝒘1 ,𝒘2 ∈ 𝑾 imply 𝒘1 + 𝒘2 ∈ 𝑾
2. 𝑘 ∈ 𝐹 and 𝒗 ∈ 𝑾 imply 𝑘𝒗 ∈ 𝑾

Thus, a subspace 𝑾 of a vector space 𝑽 is a nonempty subset that is closed both under
addition and scalar multiplication. Note that these conditions imply that a subspace 𝑾
must contain the zero vector from the ambient space 𝑽 and that 𝑾 together with the
inherited addition operation is, in particular, a subgroup of the additive group (𝑽, +).

Example 5 The intersection of any collection of subspaces of a vector space is a subspace


of the given vector space. The union of a collection of subspaces of a vector space is not, in
general, a subspace of the given vector space.

Example 6 For a given field 𝐹, the subset of 𝐹 𝜔 given by


𝐹 ∞ = {(𝑎1 , 𝑎2 , … , 𝑎𝑗 , … ) ∈ 𝐹 𝜔 | 𝑎𝑗 = 0, 𝑒𝑥𝑐𝑒𝑝𝑡 𝑓𝑜𝑟 𝑎 𝑓𝑖𝑛𝑖𝑡𝑒 𝑛𝑢𝑚𝑏𝑒𝑟 𝑜𝑓 𝑖𝑛𝑑𝑖𝑐𝑒𝑠 𝑗}

is a subspace of 𝐹 𝜔 .

Example 7 For 𝐹 = ℝ or ℂ, the following are all subspaces of F(𝐹, 𝐹).

a) The subset 𝐹[𝑡] of all polynomials in one variable with coefficients in 𝐹.


b) The subset 𝐹𝑛 [𝑡] of all polynomials in 𝐹[𝑡] of degree less than or equal to 𝑛.

Example 8 For a given field 𝐹, the following are all subspaces of 𝑀𝑛×𝑛 (𝐹).

a) The subset of diagonal matrices.


b) The subset of upper (lower) triangular matrices.
c) The subset {𝐴 ∈ 𝑀𝑛×𝑛 (𝐹) | 𝐴𝑡 = 𝐴} of symmetric matrices.
d) The subset {𝐴 ∈ 𝑀𝑛×𝑛 (𝐹) | 𝐴𝑡 = −𝐴} of skew− symmetric matrices.

Definition. Let 𝑽 be a vector space over a field 𝐹. Given any finite collection of vectors
𝒗1 ,𝒗2 , . . . , 𝒗𝑟 in 𝑽, an expression of the form

𝑘1 𝒗1 + 𝑘2 𝒗2 + ⋯ + 𝑘𝑟 𝒗𝑟 ,

for any kit (collection) of scalars 𝑘1 , 𝑘2 , … , 𝑘𝑟 in 𝐹, is called a linear combination (of the
vectors 𝒗1 ,𝒗2 , . . . , 𝒗𝑟 ).

The linear combination

0𝒗1 + 0𝒗2 + ⋯ + 0𝒗𝑟

is called the trivial linear combination (of the vectors 𝒗1 ,𝒗2 , . . . , 𝒗𝑟 ).

Notice that a trivial linear combination is trivially equal to the zero vector of the given
vector space.

One of the most important tools for building subspaces is the so called span of a subset of
vectors.

Definition. Let 𝑅 be a (possibly infinite) subset of a vector space 𝑽 over a field 𝐹. The
subset of 𝑽 consisting of all possible linear combinations of elements in 𝑅 is called the
span of 𝑅 and is denoted by 𝑠𝑝𝑎𝑛𝑅.

Let 𝑽 be a vector space over a field 𝐹. If 𝑄 = {𝒗1 , 𝒗2 , . . . , 𝒗𝑟 } is a finite collection of


vectors in 𝑽, then we may write

𝑠𝑝𝑎𝑛𝑄 = {𝑘1 𝒗1 + 𝑘2 𝒗2 + ⋯ + 𝑘𝑟 𝒗𝑟 | 𝑘1 , 𝑘2 , … , 𝑘𝑟 ∈ 𝐹}.

If 𝑅 is an infinite subset of 𝑽, then 𝑠𝑝𝑎𝑛𝑅 is the union of all subsets 𝑠𝑝𝑎𝑛𝑄 where 𝑄 is a
finite subset of 𝑅.

We will use the convention 𝑠𝑝𝑎𝑛∅ = {𝟎⃑}.
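For concrete subspaces of ℝ 𝑛 , membership in the span of a finite set 𝑄 can be tested numerically by comparing ranks. The following minimal sketch (Python with numpy, made-up example vectors; an illustration only, not part of the original exposition) checks whether a vector belongs to 𝑠𝑝𝑎𝑛𝑄.

import numpy as np

# Generators of Q = {v1, v2} in R^3 (hypothetical example vectors).
Q = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])

def in_span(Q, v):
    # v lies in span(Q) exactly when adjoining it as a new row does not increase the rank.
    return np.linalg.matrix_rank(np.vstack([Q, v])) == np.linalg.matrix_rank(Q)

print(in_span(Q, np.array([2.0, 3.0, 5.0])))   # True: this vector is 2*v1 + 3*v2
print(in_span(Q, np.array([0.0, 0.0, 1.0])))   # False: not a linear combination of v1, v2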


Theorem 1.1. Let 𝑽 be a vector space over a field 𝐹. Then for any subset 𝑅 of 𝑽 the subset
𝑠𝑝𝑎𝑛𝑅 is a subspace of 𝑽 and any subspace of 𝑽 that contains 𝑅 must also contain 𝑠𝑝𝑎𝑛𝑅.

Proof. Regarding the first assertion, let 𝑅 = {𝒗1 , 𝒗2 , . . . , 𝒗𝑟 } be a finite subset of 𝑽. If


𝑘1 , 𝑘2 , … , 𝑘𝑟 ∈ 𝐹 and 𝑙1 , 𝑙2 , … , 𝑙𝑟 ∈ 𝐹 are two kits of scalars, then

𝑘1 𝒗1 + 𝑘2 𝒗2 + ⋯ + 𝑘𝑟 𝒗𝑟 + 𝑙1 𝒗1 + 𝑙2 𝒗2 + ⋯ + 𝑙𝑟 𝒗𝑟

= (𝑘1 + 𝑙1 )𝒗1 + (𝑘2 + 𝑙2 )𝒗2 + ⋯ + (𝑘𝑟 + 𝑙𝑟 )𝒗𝑟 .

This shows that 𝑠𝑝𝑎𝑛𝑅 is closed under addition. If 𝑑 ∈ 𝐹 and 𝑘1 , 𝑘2 , … , 𝑘𝑟 ∈ 𝐹 is a kit of


scalars, then

𝑑(𝑘1 𝒗1 + 𝑘2 𝒗2 + ⋯ + 𝑘𝑟 𝒗𝑟 ) = (𝑑𝑘1 )𝒗1 + (𝑑𝑘2 )𝒗2 + ⋯ + (𝑑𝑘𝑟 )𝒗𝑟 .

This shows that 𝑠𝑝𝑎𝑛𝑅 is closed under scalar multiplication.

If 𝑅 is infinite, an embellishment of the preceding argument will show that −also in this
case− 𝑠𝑝𝑎𝑛𝑅 is a subspace of 𝑽.

The second assertion is immediate since subspaces, being closed under addition and scalar
multiplication, must also be closed under linear combinations. 

1.2 LINEAR INDEPENDENCE, BASES, AND DIMENSION

The following concept is crucial in linear algebra.

Definition. Let 𝑅 be a (possibly infinite) subset of a vector space 𝑽 over a field 𝐹. The
subset 𝑅 is linearly independent if the only linear combinations of vectors in 𝑅 equal to
the zero vector of 𝑽 are trivial linear combinations. A subset of a vector space that is not
linearly independent is said to be linearly dependent.

Thus, a finite subset 𝑅 = {𝒗1 , 𝒗2 , . . . , 𝒗𝑟 } of a vector space 𝑽 over a field 𝐹 is linearly


independent if the following condition is satisfied:

𝑘1 𝒗1 + 𝑘2 𝒗2 + ⋯ + 𝑘𝑟 𝒗𝑟 = 𝟎⃑  ⟹  𝑘1 = 𝑘2 = ⋯ = 𝑘𝑟 = 0.

If 𝑅 is infinite, then 𝑅 is linearly independent if and only if each finite subset of 𝑅 is linearly
independent: The preceding condition must be satisfied for every finite subcollection
𝒗1 , 𝒗2 , . . . , 𝒗𝑟 ∈ 𝑅.

Clearly, a subset of a linearly independent set is also linearly independent.
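For finitely many vectors in ℝ 𝑛 or ℂ 𝑛 the defining condition can be tested numerically: the vectors are linearly independent exactly when the matrix having them as rows has rank equal to the number of vectors. A small sketch (Python/numpy, made-up vectors; an illustration only):

import numpy as np

def linearly_independent(vectors):
    # rank equals the number of vectors iff the only vanishing linear combination is trivial
    A = np.array(vectors, dtype=float)
    return np.linalg.matrix_rank(A) == len(vectors)

print(linearly_independent([[1, 0, 0], [1, 1, 0], [1, 1, 1]]))  # True
print(linearly_independent([[1, 2, 3], [2, 4, 6]]))             # False: second = 2 * first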

Theorem 1.2. (NKB, New Kid on the Block) Let 𝑅 be a (possibly infinite) linearly
independent subset of a vector space 𝑽 over a field 𝐹. Let 𝒗 be a vector in 𝑽 with 𝒗 ∉ 𝑅. Then
𝑅 ∪ {𝒗} is linearly dependent if and only if 𝒗 ∈ 𝑠𝑝𝑎𝑛𝑅.


Proof. Let 𝒗 be a vector in 𝑽 such that 𝑅 ∪ {𝒗} is linearly dependent. This means that there
exists a finite collection {𝒗1 , 𝒗2 , . . . , 𝒗𝑟 } in 𝑅 such that {𝒗1 , 𝒗2 , . . . , 𝒗𝑟 } ∪ {𝒗} is linearly
dependent. Hence, there are scalars 𝑘1 , 𝑘2 , … , 𝑘𝑟 , 𝑑 ∈ 𝐹, not all zero, such that

𝑘1 𝒗1 + 𝑘2 𝒗2 + ⋯ + 𝑘𝑟 𝒗𝑟 + 𝑑𝒗 = 𝟎⃑.

Since {𝒗1 , 𝒗2 , . . . , 𝒗𝑟 } is linearly independent by hypothesis, we must have 𝑑 ≠ 0, which


implies

𝒗 = −(1⁄𝑑) (𝑘1 𝒗1 + 𝑘2 𝒗2 + ⋯ + 𝑘𝑟 𝒗𝑟 ) ∈ 𝑠𝑝𝑎𝑛𝑅.

Conversely, let 𝒗 ∈ 𝑠𝑝𝑎𝑛𝑅. This means that there is a finite collection {𝒗1 , 𝒗2 , . . . , 𝒗𝑟 } in 𝑅
such that 𝒗 = 𝑙1 𝒗1 + 𝑙2 𝒗2 + ⋯ + 𝑙𝑟 𝒗𝑟 for some scalars 𝑙1 , 𝑙2 , … , 𝑙𝑟 ∈ 𝐹. Therefore

𝑙1 𝒗1 + 𝑙2 𝒗2 + ⋯ + 𝑙𝑟 𝒗𝑟 − 𝒗 = 𝟎⃑,

which means that {𝒗1 , 𝒗2 , . . . , 𝒗𝑟 } ∪ {𝒗}, and hence 𝑅 ∪ {𝒗}, is linearly dependent, since the linear combination above is non trivial (the coefficient of 𝒗 is −1). 

We now make a short digression to discuss the concept of a partial ordering that we shall
need to define (and prove the existence of) bases in arbitrary vector spaces.

Definition. A partial ordering on a set 𝑅 consists of a relation ≼ on 𝑅 such that

1. 𝑟 ≼ 𝑟 for all 𝑟 ∈ 𝑅 (reflexivity).
2. 𝑟 ≼ 𝑠 and 𝑠 ≼ 𝑟 imply 𝑟 = 𝑠 (antisymmetry).
3. 𝑟 ≼ 𝑠 and 𝑠 ≼ 𝑡 imply 𝑟 ≼ 𝑡 (transitivity).

Remark. 𝑟 ≺ 𝑠 denotes 𝑟 ≼ 𝑠 but 𝑟 ≠ 𝑠.

Definition. Let (𝑅, ) be a partially ordered set. A collection 𝑄 in 𝑅 is a chain if for every
pair of elements 𝑟, 𝑠 ∈ 𝑄, either 𝑟  𝑠 or 𝑠  𝑟. An element 𝑠 ∈ 𝑅 is an upper bound for a
subset 𝑄  𝑅 if r  s for all r ∈ 𝑄. An element 𝑡 ∈ 𝑅 is maximal if there is no 𝑠 ∈ 𝑅 such
that 𝑡  𝑠.

We apply the preceding concepts in the following setting: Starting with a vector space 𝑽
over a field 𝐹, we consider the family L of all (possibly infinite) collections of linearly
independent vectors in 𝑽 together with the inclusion relation ⊆. It is routine to show
that ⊆ is a partial ordering on L.

Definition. Let 𝑅 be a (possibly infinite) linearly independent subset of a vector space 𝑽


over a field 𝐹. The subset 𝑅 is maximal linearly independent if it is maximal in the partially
ordered set (L, ⊆). Such a subset is called a basis for 𝑽.

Thus, a linearly independent subset of a vector space is maximal, hence a basis, if it is not
properly contained in any other linearly independent subset of the vector space.


Theorem 1.3. Let 𝑽 be a vector space over a field 𝐹 and let 𝑅 be a subset of 𝑽. Then 𝑅 is a
basis for 𝑽 if and only if 𝑅 is linearly independent and 𝑠𝑝𝑎𝑛𝑅 = 𝑽.

Proof. If 𝑅 is a basis for 𝑽, in particular, 𝑅 is linearly independent. To show that 𝑠𝑝𝑎𝑛𝑅 =


𝑽, let 𝒗 be any vector in 𝑽 such that 𝒗 ∉ 𝑅. This implies that 𝑅 ∪ {𝒗} is linearly dependent,
since 𝑅 is maximal. Therefore, by NKB, 𝒗 ∈ 𝑠𝑝𝑎𝑛𝑅.

Conversely, suppose that 𝑅 is linearly independent and that 𝑠𝑝𝑎𝑛𝑅 = 𝑽. The latter means
that every 𝒗 ∈ 𝑽 belongs to 𝑠𝑝𝑎𝑛𝑅, implying −again by NKB− that 𝑅 ∪ {𝒗} is linearly
dependent for every 𝒗 ∉ 𝑅. Hence, 𝑅 must be maximal. 

Henceforth, if 𝑠𝑝𝑎𝑛𝑅 = 𝑽, we shall say that 𝑅 generates 𝑽 or that 𝑅 is a set of generators
for 𝑽.

Having a set of generators 𝑅 for a vector space 𝑽 means that any vector in 𝑽 may be
expressed as a linear combination of vectors in 𝑅. If this set of generators happens to be a
basis, then its linear independence implies that every vector in 𝑽 can be expressed as a
linear combination of vectors in 𝑅 in a unique way.

The cornerstone for our purposes at this point is Zorn’s Lemma, a very important principle
in set theory, equivalent to diverse set−theoretic statements such as the Axiom of Choice
and the Well-ordering Theorem.

Zorn’s Lemma Let (𝑅, ) be a partially ordered set. If every chain in 𝑅 has an upper bound,
then 𝑅 contains at least one maximal element.

We now apply Zorn’s Lemma to prove the existence of a basis in an arbitrary vector space.

Theorem 1.4. Let 𝑽 be a non-trivial vector space over a field 𝐹. Then 𝑽 has a basis.

Proof. Consider the family L of all (possibly infinite) linearly independent collections in 𝑽
together with the partial order relation . The family L is non−empty since 𝑽 is
non−trivial. Because a basis is a maximal member in L, Zorn’s Lemma will provide one if
we manage to prove that every chain in L has an upper bound (under inclusion).

Let K be a chain in L. We claim that the union 𝑅 of all members in K is an upper bound for
K. Clearly 𝑅 contains all members in K, so it suffices to show that 𝑅 is linearly
independent. Since K is a chain and 𝑅 is the union of all members in K, any finite subset of
𝑅 lies in some member of K, hence is linearly independent. 

A particularly important class of vector spaces is the class of finitely generated vector
spaces.

Definition. Let 𝑽 be a vector space over a field 𝐹. 𝑽 is finitely generated if 𝑽 has a finite
set of generators.

It is a simple exercise to prove that a finitely generated vector space has a finite basis and,
consequently, that all of its bases must be finite. (See Exercise 8.)


Theorem 1.5. (Replacement Theorem) Let 𝑽 be a finitely generated vector space over a
field 𝐹. Suppose that 𝑄 = {𝒖1 , 𝒖2 , . . . , 𝒖𝑞 } is a linearly independent subset of 𝑽 and that
𝑅 = {𝒗1 , 𝒗2 , . . . , 𝒗𝑟 } is a set of generators for 𝑽. Then

a) 𝑞≤𝑟
b) There is a subset 𝑅′ of 𝑅 with 𝑟 − 𝑞 elements such that 𝑄 ∪ 𝑅′ generates 𝑽.

Proof. The proof is by mathematical induction on 𝑞. For 𝑞 = 0, we have 𝑄 = ∅ and the


result is trivial.

Suppose the theorem is true for linearly independent subsets of 𝑽 with 𝑞 elements and let
𝑄 = {𝒖1 , 𝒖2 , . . . , 𝒖𝑞+1 } be a linearly independent subset of 𝑽. Since {𝒖1 , 𝒖2 , . . . , 𝒖𝑞 } must
also be linearly independent, we may apply the induction hypothesis to obtain 𝑞 ≤ 𝑟 and a
subset {𝒗1 , 𝒗2 , . . . , 𝒗𝑟−𝑞 } of 𝑅 such that {𝒖1 , 𝒖2 , . . . , 𝒖𝑞 } ∪ {𝒗1 , 𝒗2 , . . . , 𝒗𝑟−𝑞 } is a set of
generators for 𝑽. In particular, 𝒖𝑞+1 can be expressed as a linear combination of these 𝑟 vectors, where not all the coefficients of the vectors 𝒗1 , 𝒗2 , . . . , 𝒗𝑟−𝑞 are zero (otherwise 𝒖𝑞+1 ∈ 𝑠𝑝𝑎𝑛{𝒖1 , 𝒖2 , . . . , 𝒖𝑞 }, a contradiction by NKB and the linear independence of 𝑄).

For the same reason 𝑞 < 𝑟, that is 𝑞 + 1 ≤ 𝑟, and, for at least one index 𝑖 = 1, … , 𝑟 − 𝑞, the vector 𝒗𝑖 is a linear combination of the vectors {𝒖1 , 𝒖2 , . . . , 𝒖𝑞+1 } ∪ {𝒗1 , . . . , 𝒗̂𝑖 , . . . , 𝒗𝑟−𝑞 } (where ̂ indicates that 𝒗𝑖 has been deleted). This implies that 𝑠𝑝𝑎𝑛({𝒖1 , 𝒖2 , . . . , 𝒖𝑞+1 } ∪ {𝒗1 , . . . , 𝒗̂𝑖 , . . . , 𝒗𝑟−𝑞 }) = 𝑽, by Exercise 5. 

An immediate and essential consequence of the Replacement Theorem is that any two
bases for a finitely generated vector space, which must be finite by Exercise 8 c), must
have the same number of elements. This leads to the following definition.

Definition. Let 𝑽 be a vector space over a field 𝐹. If 𝑽 is finitely generated, the common
number of elements of all of its bases is called the dimension of 𝑽, denoted by 𝑑𝑖𝑚 𝑽.
Such spaces are called finite dimensional. If 𝑽 is not finitely generated then 𝑽 is said to be
infinite dimensional.

Example 9 For any field 𝐹, consider the vector space 𝐹 𝑛 . The vectors 𝒆1 , … , 𝒆𝑛 , where 𝒆𝑖 ,
𝑖 = 1, … , 𝑛, is the 𝑛 −tuple with 1 in the 𝑖 −th position and zeroes in the remaining ones,
form a basis for 𝐹 𝑛 . This is called the canonical basis for 𝐹 𝑛 .

Example 10 The vectors 𝒆1 , 𝒆2 , 𝒆3 , …, where 𝒆𝑖 , 𝑖 = 1, … , ∞, is the ∞ −tuple with 1 in the


𝑖 −th position and zeroes in the remaining ones, form a basis for 𝐹 ∞ . This is called the
canonical basis for 𝐹 ∞ .

Spaces with a countable basis, such as 𝐹 ∞ , are said to have (infinite) countable
dimension. It can be shown that all bases for such spaces are countable. Moreover, it can
be proven that all bases for an arbitrary (possibly uncountable, infinite dimensional)
vector space have the same cardinality. Nevertheless, bases for uncountable infinite
dimensional vector spaces cannot, in general, be described by any algorithm or explicit
decision process. This is the case for spaces such as ℝ𝜔 or ℂ𝜔 .


1.3 DIRECT SUMS

Definition. Let 𝑽 be a vector space over a field 𝐹 and let 𝑾1 and 𝑾2 be subspaces. The
sum of 𝑾1 and 𝑾2 is the subset of 𝑽 given by

𝑾1 + 𝑾2 = {𝒘1 + 𝒘2 | 𝒘1 ∈ 𝑾1 𝑎𝑛𝑑 𝒘2 ∈ 𝑾2 }.

Since addition of vectors is commutative, it is obvious that 𝑾1 + 𝑾2 = 𝑾2 + 𝑾1 . On the
other hand, it is routine to show that 𝑾1 + 𝑾2 is a subspace of 𝑽.

Theorem 1.6. Let 𝑽 be a vector space over a field 𝐹 and let 𝑾1 and 𝑾2 be finitely
generated subspaces of 𝑽. Then

𝑑𝑖𝑚 (𝑾1 + 𝑾2 ) = 𝑑𝑖𝑚 𝑾1 + 𝑑𝑖𝑚 𝑾2 − 𝑑𝑖𝑚 (𝑾1 ∩ 𝑾2 ).

Proof. We leave the proof to the reader. (See Exercise 13.) 

The definition may be extended unambiguously to define the sum 𝑾1 + ⋯ + 𝑾𝑘 of any


finite number of subspaces of 𝑽.

Example 11 In ℝ3 , let 𝑾1 = 𝑠𝑝𝑎𝑛 {(1, 0, 0), (0, 1, 0)} , 𝑾2 = 𝑠𝑝𝑎𝑛 {(1, 0, 1)}, and
𝑾3 = 𝑠𝑝𝑎𝑛 {(1, 0, 0), (0, 0, 1)}. Then 𝑾1 + 𝑾2 = 𝑾1 + 𝑾3 = ℝ3 and 𝑾2 + 𝑾3 = 𝑾3 .
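The subspaces of Example 11 also illustrate the formula of Theorem 1.6 numerically. In the sketch below (Python/numpy, an illustration only), the dimension of a span is computed as a matrix rank and 𝑑𝑖𝑚 (𝑾1 ∩ 𝑾3 ) is recovered from the formula; it comes out as 1, in agreement with 𝑾1 ∩ 𝑾3 = 𝑠𝑝𝑎𝑛 {(1, 0, 0)}.

import numpy as np

W1 = np.array([[1, 0, 0], [0, 1, 0]], dtype=float)  # generators of W1 (as rows)
W3 = np.array([[1, 0, 0], [0, 0, 1]], dtype=float)  # generators of W3 (as rows)

dim_W1 = np.linalg.matrix_rank(W1)                    # 2
dim_W3 = np.linalg.matrix_rank(W3)                    # 2
dim_sum = np.linalg.matrix_rank(np.vstack([W1, W3]))  # dim(W1 + W3) = 3

# Theorem 1.6 rearranged: dim(W1 ∩ W3) = dim W1 + dim W3 - dim(W1 + W3)
print(dim_W1 + dim_W3 - dim_sum)   # 1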

Example 12 Let 𝑾1 = {𝐴 ∈ 𝑀𝑛×𝑛 (𝐹) | 𝐴𝑡 = 𝐴} and 𝑾2 = {𝐴 ∈ 𝑀𝑛×𝑛 (𝐹) | 𝐴𝑡 = −𝐴} ,


that is, the subspaces of 𝑀𝑛×𝑛 (𝐹) of symmetric and skew − symmetric matrices,
respectively. Given any matrix 𝐵 ∈ 𝑀𝑛×𝑛 (𝐹), the matrices

𝐶 = 1⁄2 (𝐵 + 𝐵𝑡 ) and 𝐷 = 1⁄2 (𝐵 − 𝐵𝑡 )

are symmetric and skew−symmetric, respectively; also, 𝐵 = 𝐶 + 𝐷. This implies that


𝑀𝑛×𝑛 (𝐹) = 𝑾1 + 𝑾2 .
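The decomposition of Example 12 is easy to carry out explicitly. A minimal sketch (Python/numpy, with an arbitrary sample matrix; an illustration only, assuming the scalars allow division by 2, i.e. characteristic different from 2):

import numpy as np

B = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0],
              [7.0, 8.0, 9.0]])

C = 0.5 * (B + B.T)   # symmetric part:       C^t =  C
D = 0.5 * (B - B.T)   # skew-symmetric part:  D^t = -D

print(np.allclose(C, C.T), np.allclose(D, -D.T), np.allclose(B, C + D))  # True True True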

We now introduce the very important and useful concept of a direct sum.

Definition. Let 𝑽 be a vector space over a field 𝐹 and let 𝑾1 and 𝑾2 be subspaces of 𝑽. If
𝑾1 ∩ 𝑾2 = {𝟎⃑}, the sum 𝑾1 + 𝑾2 is called a direct sum and we write 𝑾1 ⊕ 𝑾2 .

If 𝑾1 and 𝑾2 are finitely generated subspaces of a vector space 𝑽 such that 𝑾1 ∩ 𝑾2 = {𝟎⃑}, i.e. the sum 𝑾1 + 𝑾2 is direct, then the formula in Theorem 1.6 takes the form

𝑑𝑖𝑚 (𝑾1 ⊕ 𝑾2 ) = 𝑑𝑖𝑚 𝑾1 + 𝑑𝑖𝑚 𝑾2 .

Example 13 In Example 11, the sum 𝑾1 + 𝑾2 is a direct sum because 𝑾1 ∩ 𝑾2 = {𝟎⃑}, and we may write 𝑾1 ⊕ 𝑾2 = ℝ3 . The sum 𝑾1 + 𝑾3 is not a direct sum since 𝑾1 ∩ 𝑾3 = 𝑠𝑝𝑎𝑛 {(1, 0, 0)}.

Example 14 In ℝ∞ , consider the subspaces

𝑾1 = {(𝑎1 , 𝑎2 , … , 𝑎𝑗 , … ) ∈ ℝ∞| 𝑎𝑗 = 0 𝑖𝑓 𝑗 𝑖𝑠 𝑒𝑣𝑒𝑛}


and 𝑾2 = {(𝑎1 , 𝑎2 , … , 𝑎𝑗 , … ) ∈ ℝ∞ | 𝑎𝑗 = 0 𝑖𝑓 𝑗 𝑖𝑠 𝑜𝑑𝑑}.

It is readily shown that ℝ∞ = 𝑾1 + 𝑾2 and 𝑾1 ∩ 𝑾2 = {𝟎⃑}. Thus, ℝ∞ = 𝑾1 ⊕ 𝑾2 .

For the case of an arbitrary finite number of summands the notion of a direct sum is
slightly more complicated.

Definition. Let 𝑽 be a vector space over a field 𝐹 and let 𝑾1 , … , 𝑾𝑘 be subspaces of 𝑽.
The sum 𝑾1 + ⋯ + 𝑾𝑘 is called a direct sum if, for each 𝑗 = 1, … , 𝑘,

𝑾𝑗 ∩ ∑_{𝑖≠𝑗} 𝑾𝑖 = {𝟎⃑}.

In this case we write 𝑾1 ⊕ ⋯ ⊕ 𝑾𝑘 .

Example 15 In ℝ4 , let 𝑾1 = 𝑠𝑝𝑎𝑛 {(1, 0, 0, 0), (0, 1, 0, 0)}, 𝑾2 = 𝑠𝑝𝑎𝑛 {(1, 1, 1, 1)}, and
𝑾3 = 𝑠𝑝𝑎𝑛 {(0, 0, 1, 0), (0, 0, 0, 1)}. Then 𝑾1 + 𝑾3 is a direct sum and 𝑾1 ⊕ 𝑾3 = ℝ4 .
Nonetheless, 𝑾1 + 𝑾2 + 𝑾3 is not a direct sum: It is true that the intersection of any two
of these three subspaces is equal to {𝟎⃑}, but

𝑾1 ∩ (𝑾2 + 𝑾3 ) = 𝑠𝑝𝑎𝑛 {(1, 1, 0, 0)}.

The intersections 𝑾2 ∩ (𝑾1 + 𝑾3 ) and 𝑾3 ∩ (𝑾1 + 𝑾2 ) are also non trivial.

In order to characterize direct sums in terms of bases, we need to introduce the notion of
concatenation of finite ordered collections of vectors.

Definition. Let 𝑽 be a vector space over a field 𝐹. The concatenation of the finite ordered
collections 𝒖1 , 𝒖2 , . . . , 𝒖𝑟 and 𝒗1 , 𝒗2 , . . . , 𝒗𝑠 of vectors in 𝑽 is the ordered collection of
vectors 𝒖1 , 𝒖2 , . . . , 𝒖𝑟 , 𝒗1 , 𝒗2 , . . . , 𝒗𝑠 .

If 𝛼 = {𝒖1 , 𝒖2 , . . . , 𝒖𝑟 } and 𝛽 = {𝒗1 , 𝒗2 , . . . , 𝒗𝑠 }, the concatenation of 𝛼 and 𝛽 is denoted


by 𝛼 ∪̇ 𝛽 = {𝒖1 , 𝒖2 , . . . , 𝒖𝑟 } ∪̇ {𝒗1 , 𝒗2 , . . . , 𝒗𝑠 }.

Notice that concatenation takes into account possible repetitions, whilst union does not.
Thus, in general, 𝛼 ∪̇ 𝛽 and 𝛼 ∪ 𝛽 may differ. The notion of concatenation may be
extended in a natural way to the case of any finite number of finite ordered collections of
vectors in 𝑽.

Theorem 1.7. Let 𝑽 be a finitely generated vector space over a field 𝐹 and let 𝑾1 , … , 𝑾𝑘
be subspaces of 𝑽. The following statements are equivalent.

a) The sum 𝑾1 + ⋯ + 𝑾𝑘 is a direct sum.


b) If 𝛼1 , … , 𝛼𝑘 are ordered bases for 𝑾1 , … , 𝑾𝑘 , respectively, then the concatenation
𝛼1 ∪̇ ⋯ ∪̇ 𝛼𝑘 is a basis for 𝑾1 + ⋯ + 𝑾𝑘 .
c) For each vector 𝒗 ∈ 𝑾1 + ⋯ + 𝑾𝑘 , there exists a unique collection of vectors
𝒘1 ∈ 𝑾1 , … , 𝒘𝑘 ∈ 𝑾𝑘 such that 𝒗 = 𝒘1 + ⋯ + 𝒘𝑘 .

Proof. Assume a) and let us prove b). Let 𝛼1 = {𝒘11 , … , 𝒘1𝑑1 }, … , 𝛼𝑘 = {𝒘𝑘1 , … , 𝒘𝑘𝑑𝑘 } be
ordered bases for 𝑾1 , … , 𝑾𝑘 , respectively. Clearly the collection 𝛼1 ∪̇ ⋯ ∪̇ 𝛼𝑘 generates


𝑾1 + ⋯ + 𝑾𝑘 . By Theorem 1.3, it will suffice to prove that 𝛼1 ∪̇ ⋯ ∪̇ 𝛼𝑘 is linearly independent. Consider scalars 𝑏𝑗𝑙 such that

∑_{𝑗=1}^{𝑘} ∑_{𝑙=1}^{𝑑𝑗} 𝑏𝑗𝑙 𝒘𝑗𝑙 = 𝟎⃑.

Then for each 𝑗 = 1, … , 𝑘,

∑_{𝑙=1}^{𝑑𝑗} 𝑏𝑗𝑙 𝒘𝑗𝑙 = − ∑_{𝑖≠𝑗} ∑_{𝑙=1}^{𝑑𝑖} 𝑏𝑖𝑙 𝒘𝑖𝑙 ∈ 𝑾𝑗 ∩ ∑_{𝑖≠𝑗} 𝑾𝑖 = {𝟎⃑},

hence, by the linear independence of 𝒘𝑗1 , … , 𝒘𝑗𝑑𝑗 , we have 𝑏𝑗1 = ⋯ = 𝑏𝑗𝑑𝑗 = 0.

We leave the proof of b) ⟹ c) to the reader. (See Exercise 17.)

Assume c) and let us prove a). For fixed 𝑗 = 1, … , 𝑘, let

𝒗 ∈ 𝑾𝑗 ∩ ∑_{𝑖≠𝑗} 𝑾𝑖 .

This implies that there are vectors 𝒘𝑖 ∈ 𝑾𝑖 , 𝑖 ≠ 𝑗, such that

𝒗 = 𝒘1 + ⋯ + 𝒘𝑗−1 + 𝒘𝑗+1 + ⋯ + 𝒘𝑘

or

𝒘1 + ⋯ + 𝒘𝑗−1 − 𝒗 + 𝒘𝑗+1 + ⋯ + 𝒘𝑘 = 𝟎⃑.

Since the latter is a representation of the zero vector in 𝑾1 + ⋯ + 𝑾𝑘 , by uniqueness we must have 𝒗 = 𝒘1 = ⋯ = 𝒘𝑗−1 = 𝒘𝑗+1 = ⋯ = 𝒘𝑘 = 𝟎⃑. 

Theorem 1.7 suggests a particularly simple way to construct direct sums in any vector
space. Let 𝑽 be a vector space over a field 𝐹 and let 𝛼 be an (ordered) finite collection of
linearly independent vectors in 𝑽. Let 𝛼1 , … , 𝛼𝑘 be a partition of 𝛼, i.e. a collection of
mutually disjoint (ordered) subcollections of 𝛼 such that 𝛼1 ∪̇ ⋯ ∪̇ 𝛼𝑘 = 𝛼. Then, by
Theorem 1.7, the sum

∑_{𝑗=1}^{𝑘} 𝑠𝑝𝑎𝑛 𝛼𝑗

is a direct sum. In fact, we obtain

𝑠𝑝𝑎𝑛 𝛼 = 𝑠𝑝𝑎𝑛 𝛼1 ⊕ ⋯ ⊕ 𝑠𝑝𝑎𝑛 𝛼𝑘 .

Example 16 In ℝ3 , consider the linearly independent (ordered) collection of vectors

𝛼 = { 𝒆1 , 𝒆1 + 𝒆2 , 𝒆1 + 𝒆2 + 𝒆3 },

which happens to be a basis for ℝ3 . Then


ℝ3 = 𝑠𝑝𝑎𝑛 {𝒆1 } ⊕ 𝑠𝑝𝑎𝑛 {𝒆1 + 𝒆2 } ⊕ 𝑠𝑝𝑎𝑛 {𝒆1 + 𝒆2 + 𝒆3 }.

Also,

ℝ3 = 𝑠𝑝𝑎𝑛 {𝒆1 , 𝒆1 + 𝒆2 } ⊕ 𝑠𝑝𝑎𝑛 {𝒆1 + 𝒆2 + 𝒆3 }.

Example 17 Let 𝑽 be a real or complex vector space together with an inner product 〈 , 〉.
(See Chapter 4 for more details.) If 𝑾 is a finite dimensional subspace of 𝑽, then the
subspace given by

𝑾⊥ = {𝒗 ∈ 𝑽 | 〈𝒗 , 𝒘〉 = 0 𝑓𝑜𝑟 𝑎𝑙𝑙 𝒘 ∈ 𝑾}

−the orthogonal complement of 𝑾 in 𝑽 − satisfies the equation 𝑽 = 𝑾 + 𝑾⊥ . Indeed, if
{𝒘1 , 𝒘2 , . . . , 𝒘𝑟 } is an orthonormal basis for 𝑾 , then, for each 𝒗 ∈ 𝑽 , the vector
𝒘⊥ = 𝒗 − 𝒘, where

𝒘 = ∑_{𝑗=1}^{𝑟} 〈𝒗 , 𝒘𝑗 〉 𝒘𝑗 ∈ 𝑾,

is easily shown to belong to 𝑾⊥ . On the other hand, 𝑾 ∩ 𝑾⊥ = {𝟎⃑}, since 〈 , 〉 is positive
definite. Thus, in this situation, we always have 𝑽 = 𝑾 ⊕ 𝑾⊥ .
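The formula 𝒘 = ∑ 〈𝒗, 𝒘𝑗 〉 𝒘𝑗 of Example 17 can be checked numerically for subspaces of ℝ 𝑛 with the standard inner product. The following sketch (Python/numpy, made-up data; an illustration only) projects a vector onto 𝑾 and verifies that the remainder is orthogonal to 𝑾.

import numpy as np

# An orthonormal basis of a 2-dimensional subspace W of R^3 (hypothetical example).
w1 = np.array([1.0, 0.0, 0.0])
w2 = np.array([0.0, 1.0, 0.0])

v = np.array([3.0, -2.0, 5.0])

w = np.dot(v, w1) * w1 + np.dot(v, w2) * w2   # w = <v,w1> w1 + <v,w2> w2, a vector in W
w_perp = v - w                                # should lie in the orthogonal complement

print(w, w_perp)
print(np.dot(w_perp, w1), np.dot(w_perp, w2))  # both 0.0: w_perp is orthogonal to W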

1.4 LINEAR TRANSFORMATIONS I

Definition. Let 𝑼 and 𝑽 be vector spaces over a field 𝐹. A linear transformation is a


function 𝑆: 𝑼 → 𝑽 such that
a) For each pair of vectors 𝒖1 , 𝒖2 ∈ 𝑼, 𝑆(𝒖1 + 𝒖2 ) = 𝑆(𝒖1 ) + 𝑆(𝒖2 ).
b) For each 𝑘 ∈ 𝐹 and each 𝒖 ∈ 𝑼, 𝑆(𝑘𝒖) = 𝑘 𝑆(𝒖).

Diverse useful consequences can be easily derived from the definition using elementary
arguments. The following are frequently used:

1. The image by 𝑆 of the zero vector in 𝑼 is the zero vector in 𝑽, i.e. 𝑆(𝟎⃑) = 𝟎⃑.
2. The image by 𝑆 of a linear combination ∑_{𝑗=1}^{𝑟} 𝑘𝑗 𝒖𝑗 of vectors in 𝑼 equals the linear combination of the images 𝑆(𝒖𝑗 ) with the same kit of scalars:

𝑆(∑_{𝑗=1}^{𝑟} 𝑘𝑗 𝒖𝑗 ) = ∑_{𝑗=1}^{𝑟} 𝑘𝑗 𝑆(𝒖𝑗 ).

A linear transformation 𝑆: 𝑽 → 𝑽 from a vector space to itself is also called an operator.

The next theorem shows one of the main features of linear transformations.


Theorem 1.8. Let 𝑼 and 𝑽 be vector spaces over a field 𝐹and let 𝑆: 𝑼 → 𝑽 be a linear
transformation. Then for every subspace 𝑾 of 𝑼, its image 𝑆(𝑾) is a subspace of 𝑽. Also,
for every subspace 𝒁 of 𝑽, the preimage 𝑆 −1 (𝒁) is a subspace of 𝑼.

Proof. Concerning the first assertion, for a given subspace 𝑾 of 𝑼, let 𝒗1 , 𝒗2 ∈ 𝑆(𝑾) and
let 𝒘1 , 𝒘2 be vectors in 𝑾 such that 𝑆(𝒘1 ) = 𝒗1 and 𝑆(𝒘2 ) = 𝒗2 . Then 𝒗1 + 𝒗2 =
𝑆(𝒘1 ) + 𝑆(𝒘2 ) = 𝑆(𝒘1 + 𝒘2 ) ∈ 𝑆(𝑾).

Let now 𝑘 ∈ 𝐹, 𝒗 ∈ 𝑆(𝑾), and let 𝒘 ∈ 𝑾 be such that 𝑆(𝒘) = 𝒗. Then 𝑘𝒗 = 𝑘 𝑆(𝒘) =
𝑆(𝑘𝒘) ∈ 𝑆(𝑾).

We leave the proof of the second assertion to the reader. (See Exercise 24.) 

If 𝑆: 𝑼 → 𝑽 is a linear transformation, then two things are of utmost importance: The image 𝑆(𝑼) of the whole domain 𝑼, i.e. 𝑟𝑎𝑛𝑔𝑒𝑆 −the range of 𝑆− and the preimage 𝑆 −1 (𝟎⃑) of the trivial subspace {𝟎⃑} of 𝑽.

Definition. Let 𝑼 and 𝑽 be vector spaces over a field 𝐹 and let 𝑆: 𝑼 → 𝑽 be a linear
transformation. The preimage 𝑆 −1 (𝟎⃑) of the trivial subspace {𝟎⃑} of 𝑽 is called the kernel
of 𝑆 and is denoted by 𝑘𝑒𝑟𝑆.

The following facts are complementary to Theorem 1.8.

Theorem 1.9. Let 𝑼 and 𝑽 be vector spaces over a field 𝐹and let 𝑆: 𝑼 → 𝑽 be a linear
transformation. If 𝑅 is a (possibly infinite) set of generators for 𝑼, then 𝑆(𝑅) is a set of
generators for 𝑟𝑎𝑛𝑔𝑒𝑆.

Proof. Let 𝒗 ∈ 𝑟𝑎𝑛𝑔𝑒 𝑆 and let 𝒖 ∈ 𝑼 such that 𝑆(𝒖) = 𝒗. Since 𝑅 is a set of generators
for 𝑼, there is a finite subcollection 𝑄 = {𝒖1 , 𝒖2 , . . . , 𝒖𝑞 } of 𝑅 and a kit 𝑘1 , 𝑘2 , … , 𝑘𝑞 ∈ 𝐹 of
scalars such that 𝒖 = 𝑘1 𝒖1 + 𝑘2 𝒖2 + ⋯ + 𝑘𝑞 𝒖𝑞 . Hence

𝒗 = 𝑆(𝒖) = 𝑆(∑_{𝑗=1}^{𝑞} 𝑘𝑗 𝒖𝑗 ) = ∑_{𝑗=1}^{𝑞} 𝑘𝑗 𝑆(𝒖𝑗 ) ∈ 𝑠𝑝𝑎𝑛𝑆(𝑅).

Thus 𝑠𝑝𝑎𝑛𝑆(𝑅) = 𝑟𝑎𝑛𝑔𝑒𝑆. 

An identical proof will show that if 𝑆: 𝑼 → 𝑽 is a linear transformation, 𝑾 is a subspace of


𝑼, and 𝑅 is a set of generators for 𝑾, then 𝑆(𝑅) is a set of generators for 𝑆(𝑾).

Corollary Let 𝑆: 𝑼 → 𝑽 be a linear transformation and let 𝑾 be a finite dimensional


subspace of 𝑼. Then 𝑆(𝑾) is finite dimensional and 𝑑𝑖𝑚 𝑆(𝑾) ≤ 𝑑𝑖𝑚 𝑾.

If 𝑾 in the previous corollary happens to be infinite dimensional, 𝑆(𝑾) may be either


finite or infinite dimensional. The inequality 𝑑𝑖𝑚 𝑆(𝑾) ≤ 𝑑𝑖𝑚 𝑾 still holds in this case,
although it must now be interpreted in terms of the arithmetic of cardinals.

Example 18 For any matrix 𝐴 ∈ 𝑀𝑚×𝑛 (𝐹), the function 𝐿𝐴 : 𝐹 𝒏 → 𝐹 𝒎 given by 𝐿𝐴 (𝒗) =
𝐴𝒗, for all 𝒗 ∈ 𝐹 𝒏 , is a linear transformation. In this situation, 𝑘𝑒𝑟𝐿𝐴 is the solution set of
the 𝑚 × 𝑛 homogeneous linear system 𝐴𝒗 = 𝟎 ⃑ . On the other hand, if 𝐴1 , 𝐴2 , … , 𝐴𝑛 are the


columns of 𝐴, then 𝑟𝑎𝑛𝑔𝑒𝐿𝐴 = 𝑠𝑝𝑎𝑛{𝐴1 , 𝐴2 , … , 𝐴𝑛 } according to Theorem 1.9, because


𝐴𝑗 = 𝐿𝐴 ( 𝒆𝑗 ), 𝑗 = 1, … , 𝑛, and 𝒆1 , … , 𝒆𝑛 is a basis –in particular a set of generators− for
𝐹𝒏.
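For a concrete matrix, 𝑘𝑒𝑟𝐿𝐴 and 𝑟𝑎𝑛𝑔𝑒𝐿𝐴 can be computed numerically, for instance via the singular value decomposition. A minimal sketch (Python/numpy, with a made-up matrix; an illustration only):

import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0]])      # rank 1: the second row is twice the first

# Kernel of L_A: right singular vectors belonging to (numerically) zero singular values.
U, s, Vt = np.linalg.svd(A)
tol = 1e-10
kernel_basis = Vt[np.sum(s > tol):]        # rows span ker L_A, here 2-dimensional
range_dim = np.linalg.matrix_rank(A)       # dim range L_A = dim of the column span = 1

print(kernel_basis.shape[0], range_dim)    # 2 1
print(np.allclose(A @ kernel_basis.T, 0))  # True: A v = 0 for every kernel basis vector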

Example 19 Let 𝑽 be vector space over a field 𝐹. Let 𝑟 ∈ 𝐹 be some fixed scalar. Then the
function 𝐻𝑟 : 𝑽 → 𝑽 given by 𝐻𝑟 (𝒗) = 𝑟𝒗, for all 𝒗 ∈ 𝑽, is readily shown to be an operator.

The operator in Example 19 is called the homothety of ratio 𝑟.

- If 𝑟 = 0, the corresponding homothety 𝐻0 is the zero operator on 𝑽, also denoted by


𝑂𝑽 .
- If 𝑟 = 1, the corresponding homothety is the identity operator on 𝑽, also denoted by
𝐼𝑑𝑽 .
- If 𝑟 = −1, the corresponding homothety is the inversion on 𝑽 about the origin, also
denoted by −𝐼𝑑𝑽.

For 𝑟 ≠ 0, it is evident that 𝑘𝑒𝑟𝐻𝑟 = {𝟎⃑} and 𝑟𝑎𝑛𝑔𝑒𝐻𝑟 = 𝑽.

Homotheties will play a very important role in Chapter 3 regarding diagonal operators.

Theorem 1.10. Let 𝑼 and 𝑽 be vector spaces over a field 𝐹 and let 𝑆: 𝑼 → 𝑽 be a linear
transformation. Then 𝑆 is injective if and only if 𝑘𝑒𝑟𝑆 = {𝟎⃑}.

Proof. If 𝑆 is injective, then, since 𝑆(𝟎⃑) = 𝟎⃑, there is no non zero vector 𝒖 ∈ 𝑼 such that
𝑆(𝒖) = 𝟎⃑. Thus 𝑘𝑒𝑟𝑆 = {𝟎⃑}.

Conversely, suppose 𝑘𝑒𝑟𝑆 = {𝟎⃑} and let 𝒖1 , 𝒖2 be vectors in 𝑼 such that 𝑆(𝒖1 ) = 𝑆(𝒖2 ).
Then 𝑆(𝒖1 − 𝒖2 ) = 𝟎⃑, which implies 𝒖1 − 𝒖2 = 𝟎⃑, i.e. 𝒖1 = 𝒖2 . Thus, 𝑆 is injective. 

Let 𝑼 and 𝑽 be vector spaces over a field 𝐹, where 𝑼 is finitely generated. Let 𝑅 =
{𝒖1 , 𝒖2 , . . . , 𝒖𝑛 } be a basis for 𝑼 and let {𝒗1 , 𝒗2 , . . . , 𝒗𝑛 } be any collection of 𝑛 (not
necessarily distinct) vectors in 𝑽. Let 𝑆: 𝑅 → 𝑽 be the function given by 𝑆(𝒖𝑗 ) = 𝒗𝑗 ,
𝑗 = 1, … , 𝑛. Consider the linear transformation (also denoted by 𝑆) 𝑆: 𝑼 → 𝑽 defined as
follows:

For each 𝒖 ∈ 𝑼, if 𝒖 = ∑𝑛𝑗=1 𝑘𝑗 𝒖𝑗 is the unique expression of 𝒖 as a linear combination of


the vectors in the basis 𝑅, then 𝑆(𝒖) = ∑𝑛𝑗=1 𝑘𝑗 𝑆(𝒖𝑗 ) = ∑𝑛𝑗=1 𝑘𝑗 𝒗𝑗 .

Definition. The linear transformation 𝑆: 𝑼 → 𝑽 defined in the previous paragraph is called


the linear extension of the function 𝑆: 𝑅 → 𝑽.

This definition may be generalized to arbitrary vector spaces in the obvious way.

Example 20 For a given field 𝐹, consider the canonical basis 𝑅 = {𝒆1 , 𝒆2 , … , 𝒆𝑗 , … , } for 𝐹 ∞ .
Let 𝑆: 𝑅 → 𝐹 ∞ be the function given by 𝑆(𝒆𝑗 ) = 𝒆𝑗+1 , for 𝑗 = 1, … , ∞. We claim that its
linear extension 𝑆: 𝐹 ∞ → 𝐹 ∞, the right shift on 𝐹 ∞ , is injective. Indeed, if ∑𝑟𝑖=1 𝑘𝑗𝑖 𝒆𝑗𝑖 is an
arbitrary vector in 𝐹 ∞ , then


𝑆(∑_{𝑖=1}^{𝑟} 𝑘𝑗𝑖 𝒆𝑗𝑖 ) = ∑_{𝑖=1}^{𝑟} 𝑘𝑗𝑖 𝑆(𝒆𝑗𝑖 ) = ∑_{𝑖=1}^{𝑟} 𝑘𝑗𝑖 𝒆𝑗𝑖+1 = 𝟎⃑

implies 𝑘𝑗1 = ⋯ = 𝑘𝑗𝑟 = 0, i.e. 𝑘𝑒𝑟𝑆 = {𝟎⃑}. Hence 𝑆 is injective by Theorem 1.10.

If 𝑆: 𝑼 → 𝑽 is a linear transformation and 𝑼 is finitely generated, then so are 𝑘𝑒𝑟𝑆 and


𝑟𝑎𝑛𝑔𝑒𝑆. This leads to the following definition.

Definition. Let 𝑼 and 𝑽 be vector spaces over a field 𝐹, with 𝑼 finitely generated, and let
𝑆: 𝑼 → 𝑽 be a linear transformation. The dimension of 𝑘𝑒𝑟𝑆 is called the nullity of 𝑆 and is
denoted by 𝑛𝑢𝑙 𝑆. The dimension of 𝑟𝑎𝑛𝑔𝑒𝑆 is called the rank of 𝑆 and is denoted by
𝑟𝑎𝑛𝑘𝑆.

The following is a relevant fact in Linear Algebra.

Theorem 1.11 (Dimension Theorem) Let 𝑼 be an 𝑛 −dimensional vector space and 𝑽 an


arbitrary vector space over a field 𝐹. If 𝑆: 𝑼 → 𝑽 is a linear transformation, then

𝑛𝑢𝑙 𝑆 + 𝑟𝑎𝑛𝑘𝑆 = 𝑑𝑖𝑚 𝑼.

Proof. Let 𝐾 = {𝒖1 , 𝒖2 , . . . , 𝒖𝑟 } be a basis for 𝑘𝑒𝑟𝑆 and 𝑅 = {𝒖1 , 𝒖2 , . . . , 𝒖𝑟 , 𝒖𝑟+1 , . . . , 𝒖𝑛 }


an extension to a basis for 𝑼. We claim that 𝑄 = {𝑆(𝒖𝑟+1 ), . . . , 𝑆(𝒖𝑛 )} is a basis for
𝑟𝑎𝑛𝑔𝑒𝑆. Clearly 𝑠𝑝𝑎𝑛𝑄 = 𝑠𝑝𝑎𝑛𝑆(𝑅), since 𝑆(𝐾) = {𝟎⃑}. Hence, 𝑠𝑝𝑎𝑛𝑄 = 𝑟𝑎𝑛𝑔𝑒𝑆
by Theorem 1.9.

It suffices now to prove that 𝑄 is linearly independent. Let 𝑘𝑟+1 , … , 𝑘𝑛 ∈ 𝐹 be scalars such
that ∑_{𝑗=𝑟+1}^{𝑛} 𝑘𝑗 𝑆(𝒖𝑗 ) = 𝟎⃑. Now,

𝟎⃑ = ∑_{𝑗=𝑟+1}^{𝑛} 𝑘𝑗 𝑆(𝒖𝑗 ) = 𝑆(∑_{𝑗=𝑟+1}^{𝑛} 𝑘𝑗 𝒖𝑗 ),

means that ∑_{𝑗=𝑟+1}^{𝑛} 𝑘𝑗 𝒖𝑗 ∈ 𝑘𝑒𝑟𝑆. Hence, there are scalars 𝑘1 , … , 𝑘𝑟 ∈ 𝐹 such that

∑_{𝑗=𝑟+1}^{𝑛} 𝑘𝑗 𝒖𝑗 = ∑_{𝑗=1}^{𝑟} 𝑘𝑗 𝒖𝑗 , or ∑_{𝑗=1}^{𝑟} (−𝑘𝑗 )𝒖𝑗 + ∑_{𝑗=𝑟+1}^{𝑛} 𝑘𝑗 𝒖𝑗 = 𝟎⃑.

But this implies that 𝑘1 = ⋯ = 𝑘𝑟 = 𝑘𝑟+1 = ⋯ = 𝑘𝑛 = 0, by the linear independence of
the extended basis 𝑅. Therefore 𝑄 is a basis for 𝑟𝑎𝑛𝑔𝑒𝑆, and 𝑛𝑢𝑙 𝑆 + 𝑟𝑎𝑛𝑘𝑆 = 𝑟 + (𝑛 − 𝑟) = 𝑛 = 𝑑𝑖𝑚 𝑼. 

Example 21 For a given field 𝐹, let 𝐿𝐴 : 𝐹 𝒏 → 𝐹 𝒎 be the linear transformation induced by


some matrix 𝐴 ∈ 𝑀𝑚×𝑛 (𝐹). Then Theorem 1.11 says that the dimension of the solution
set to the homogeneous system 𝐴𝒗 = ⃑𝟎 is equal to 𝑛 − 𝑑𝑖𝑚 (𝑠𝑝𝑎𝑛{𝐴1 , 𝐴2 , … , 𝐴𝑛 } ),
where 𝐴1 , 𝐴2 , … , 𝐴𝑛 are the columns of 𝐴. Now, the dimension of this span is the maximal
number of linearly independent columns of the matrix 𝐴. Thus, the dimension of the
solution set to the linear system is equal to the number of variables minus the rank of the
(finite) collection of vectors {𝐴1 , 𝐴2 , … , 𝐴𝑛 } in 𝐹 𝒎 . (See Chapter 2.)
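Theorem 1.11 can be checked numerically for the transformation 𝐿𝐴 of Example 21: the nullity is the dimension of the solution set of 𝐴𝒗 = 𝟎⃑ and the rank is the dimension of the column span. A small sketch (Python/numpy, made-up matrix; an illustration only):

import numpy as np

A = np.array([[1.0, 2.0, 0.0, 1.0],
              [0.0, 1.0, 1.0, 1.0],
              [1.0, 3.0, 1.0, 2.0]])   # 3 x 4, third row = first + second

n = A.shape[1]                         # number of variables (dim of the domain F^n)
rank = np.linalg.matrix_rank(A)        # dim range L_A
nullity = n - rank                     # Dimension Theorem: nul S + rank S = dim U

print(rank, nullity, rank + nullity == n)   # 2 2 True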


Definition. Let 𝑼 and 𝑽 be vector spaces over a field 𝐹and let 𝑆: 𝑼 → 𝑽 be a linear
transformation. If 𝑆 is invertible, then 𝑆 is said to be an isomorphism. In such case, the
spaces 𝑼 and 𝑽 are said to be isomorphic and we use the notation 𝑼 ≈ 𝑽.

In the particular case where 𝑆: 𝑼 → 𝑼 is an invertible operator, 𝑆 is also called an


automorphism.

It is a straightforward exercise to show that the inverse of a linear transformation, if it


exists, is also linear. Notice, on the other hand, that ≈ defines an equivalence relation on
the family of vector spaces over a given field 𝐹.

If 𝑼 and 𝑽 are vector spaces and 𝑆: 𝑼 → 𝑽 is an injective linear transformation, it follows


immediately that 𝑼 ≈ 𝑟𝑎𝑛𝑔𝑒𝑆.

Example 22 Let 𝑽 be a vector space over a field 𝐹. Then, for any 𝑟 ∈ 𝐹, 𝑟 ≠ 0, the
homothety 𝐻𝑟 : 𝑽 → 𝑽 is an automorphism.

The following fact is a consequence of Theorem 1.11.

Corollary Let 𝑼 and 𝑽 be finitely generated vector spaces over a field 𝐹. Then 𝑼 ≈ 𝑽 if and
only if 𝑑𝑖𝑚 𝑼 = 𝑑𝑖𝑚 𝑽.

Proof. Assume 𝑼 ≈ 𝑽 and let 𝑆: 𝑼 → 𝑽 be an isomorphism. If {𝒖1 , 𝒖2 , . . . , 𝒖𝑛 } is a basis for
𝑼, then {𝑆(𝒖1 ), . . . , 𝑆(𝒖𝑛 )} is a basis for 𝑽 by the proof of Theorem 1.11 (since
𝑘𝑒𝑟𝑆 = {𝟎⃑}). Thus 𝑑𝑖𝑚 𝑼 = 𝑛 = 𝑑𝑖𝑚 𝑽.

Conversely, assume 𝑑𝑖𝑚 𝑼 = 𝑑𝑖𝑚 𝑽. If 𝑅 = {𝒖1 , 𝒖2 , . . . , 𝒖𝑛 } and 𝑄 = {𝒗1 , 𝒗2 , . . . , 𝒗𝑛 } are
bases for 𝑼 and 𝑽, respectively, let 𝑆: 𝑅 → 𝑽 be the function given by 𝑆(𝒖𝑗 ) = 𝒗𝑗 , 𝑗 = 1, … , 𝑛.
Then the linear extension 𝑆: 𝑼 → 𝑽 is an isomorphism. 

The preceding corollary extends to the case of infinite dimensional vector spaces. Indeed,
two arbitrary vector spaces over a field 𝐹 are isomorphic if and only if they admit bases
with the same cardinality. (See Exercise 28 .)

Example 23. As we saw in Section 1.2, ℝ∞ has a countable basis. On the other hand, the
vector space ℝ[𝑡] of all polynomials with coefficients in ℝ also has a countable basis,
namely, the countable set of all monomials 1, 𝑡, 𝑡 2 , … , 𝑡 𝑛 , … . Thus, by Exercise 28, we
have ℝ∞ ≈ ℝ[𝑡]. Of course, an explicit isomorphism between these two spaces can easily be
exhibited, precisely by using linear extensions.

1.5 LINEAR TRANSFORMATIONS II

We begin this section with a notion that will play a particularly important role in Chapter 3.


Definition. Let 𝑆: 𝑽 → 𝑽 be an operator on a vector space 𝑽 over a field 𝐹. A subspace 𝑾
of 𝑽 is said to be 𝑆 −invariant if 𝑆(𝒘) ∈ 𝑾 for all 𝒘 ∈ 𝑾, that is, if 𝑆(𝑾) ⊆ 𝑾. If a
subspace 𝑾 is 𝑆 −invariant, we define the restriction of 𝑆 to 𝑾 to be the operator
𝑆𝑾 : 𝑾 → 𝑾 given by 𝑆𝑾 (𝒘) = 𝑆(𝒘) for all 𝒘 ∈ 𝑾.

Example 24. Let 𝑽 be a vector space over a field 𝐹 and let 𝐻𝑟 : 𝑽 → 𝑽 be the homothety of
ratio 𝑟 ∈ 𝐹. Then any subspace 𝑾 of 𝑽 is 𝐻𝑟 −invariant and the restriction of 𝐻𝑟 to 𝑾 is
the homothety of ratio 𝑟 (on 𝑾).

Example 25. Let 𝑆: 𝑽 → 𝑽 be any operator on a vector space 𝑽. Then it is readily verified
⃑ }, 𝑽 , 𝑘𝑒𝑟𝑆, and 𝑟𝑎𝑛𝑔𝑒𝑆 are all 𝑆 −invariant.
that the subspaces {𝟎

The following notion will also play a crucial role in Chapter 3.

Definition. Let 𝑽 be a vector space over a field 𝐹 and let 𝑾1 and 𝑾2 be subspaces such
that 𝑽 = 𝑾1 ⊕ 𝑾2 . Let 𝑆1 : 𝑾1 → 𝑾1 and 𝑆2 : 𝑾2 → 𝑾2 be linear operators. The direct
sum of 𝑆1 and 𝑆2 is the operator 𝑆1 ⊕ 𝑆2 : 𝑽 → 𝑽 defined as follows:

For each 𝒗 ∈ 𝑽, if 𝒗 = 𝒘1 + 𝒘2 is the unique expression of 𝒗 as a sum of a vector in 𝑾1
and a vector in 𝑾2 , then

(𝑆1 ⊕ 𝑆2 )(𝒗) = (𝑆1 ⊕ 𝑆2 )(𝒘1 + 𝒘2 ) = 𝑆1 (𝒘1 ) + 𝑆2 (𝒘2 ).

In the previous definition, notice that the subspaces 𝑾1 and 𝑾2 are automatically
𝑆1 ⊕ 𝑆2 −invariant. Moreover, the restrictions of 𝑆1 ⊕ 𝑆2 to 𝑾1 and 𝑾2 are the original
operators 𝑆1 and 𝑆2 , respectively.

The definition of a direct sum of operators extends to any finite number of summands in
the obvious way.

Example 26 Let 𝑽 = ℝ3 , 𝑾1 = 𝑠𝑝𝑎𝑛 {(1, 0, 0), (0, 1, 0)}, and 𝑾2 = 𝑠𝑝𝑎𝑛 {(1, 0, 1)}.
Clearly ℝ3 = 𝑾1 ⊕ 𝑾2 . Consider the operators 𝑆1 : 𝑾1 → 𝑾1 and 𝑆2 : 𝑾2 → 𝑾2 given by

𝑆 1 (𝒘1 ) = 𝑆 1 (𝑥1 , 𝑥2 , 0) = (𝑥1 , 𝑥1 − 𝑥2 , 0), for each 𝒘1 = (𝑥1 , 𝑥2 , 0) ∈ 𝑾1 , and

𝑆 2 (𝒘2 ) = 𝑆 2 (𝑥3 , 0, 𝑥3 ) = (2𝑥3 , 0, 2𝑥3 ), for each 𝒘2 = (𝑥3 , 0, 𝑥3 ) ∈ 𝑾2 .

To obtain 𝑆1 ⊕ 𝑆2 : ℝ3 → ℝ3 , we first solve a trivial linear system to show that

𝒗 = (𝑥1 , 𝑥2 , 𝑥3 ) = (𝑥1 − 𝑥3 , 𝑥2 , 0) + (𝑥3 , 0, 𝑥3 ) = 𝒘1 + 𝒘2

is the unique decomposition of each 𝒗 = (𝑥1 , 𝑥2 , 𝑥3 ) ∈ ℝ3 relative to the decomposition
ℝ3 = 𝑾1 ⊕ 𝑾2 . Thus,

(𝑆1 ⊕ 𝑆2 )(𝑥1 , 𝑥2 , 𝑥3 ) = (𝑆1 ⊕ 𝑆2 )((𝑥1 − 𝑥3 , 𝑥2 , 0) + (𝑥3 , 0, 𝑥3 ))


= 𝑆 1 (𝑥1 − 𝑥3 , 𝑥2 , 0) + 𝑆 2 (𝑥3 , 0, 𝑥3 )

= (𝑥1 − 𝑥3 , 𝑥1 − 𝑥2 − 𝑥3 , 0) + (2𝑥3 , 0, 2𝑥3 )

= (𝑥1 + 𝑥3 , 𝑥1 − 𝑥2 − 𝑥3 , 2𝑥3 ),

for all (𝑥1 , 𝑥2 , 𝑥3 ) ∈ ℝ3 .
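The computation of Example 26 can be reproduced numerically: express 𝒗 in the basis obtained by concatenating bases of 𝑾1 and 𝑾2 , apply 𝑆1 and 𝑆2 to the respective parts, and add. A minimal sketch (Python/numpy, an illustration only, following the data of Example 26):

import numpy as np

# Columns: basis of W1 = span{(1,0,0),(0,1,0)} followed by basis of W2 = span{(1,0,1)}.
B = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0]])

def S1(w):   # S1(x1, x2, 0) = (x1, x1 - x2, 0) on W1
    return np.array([w[0], w[0] - w[1], 0.0])

def S2(w):   # S2(x3, 0, x3) = (2 x3, 0, 2 x3) on W2
    return 2.0 * w

def S_direct_sum(v):
    c = np.linalg.solve(B, v)             # coordinates of v in the concatenated basis
    w1 = c[0] * B[:, 0] + c[1] * B[:, 1]  # component of v in W1
    w2 = c[2] * B[:, 2]                   # component of v in W2
    return S1(w1) + S2(w2)

print(S_direct_sum(np.array([1.0, 2.0, 3.0])))   # (x1 + x3, x1 - x2 - x3, 2 x3) = (4, -4, 6)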

Definition. (Projections) Let 𝑽 be a vector space over a field 𝐹 and let 𝑾1 and 𝑾2 be
subspaces of 𝑽 such that 𝑽 = 𝑾1 ⊕ 𝑾2 . The operator on 𝑽 given by 𝑃 = 𝐼𝑑𝑾1 ⊕ 𝑂𝑾2 is
called the projection on 𝑾1 along 𝑾2 .

Notice that 𝑃 may also be described in the following way: Given 𝒗 = 𝒘1 + 𝒘2 , with
𝒘1 ∈ 𝑾1 and 𝒘2 ∈ 𝑾2 (unique decomposition), then 𝑃(𝒗) = 𝑃(𝒘1 + 𝒘2 ) = 𝒘1 .

Example 27 In ℂ2 , let 𝑾1 = 𝑠𝑝𝑎𝑛 {(0, 𝑖)} and 𝑾2 = 𝑠𝑝𝑎𝑛 {(1, 1)}. Since the vectors (0, 𝑖)
and (1, 1) form a basis for ℂ2 , we have ℂ2 = 𝑾1 ⊕ 𝑾2 , by Exercise 17 and Theorem 1.7.
Let 𝑃: ℂ2 → ℂ2 be the projection on 𝑾1 along 𝑾2 . By definition, 𝑃(0, 𝑖) = (0, 𝑖) and
𝑃(1, 1) = (0, 0). On the other hand, it is easily verified that 𝒆1 = 𝑖(0, 𝑖) + (1, 1) and
𝒆2 = −𝑖(0, 𝑖).

Thus, 𝑃(𝒆1 ) = 𝑖𝑃(0, 𝑖) + 𝑃(1, 1) = (0, −1) = −𝒆2 and 𝑃(𝒆2 ) = −𝑖𝑃(0, 𝑖) = (0, 1) = 𝒆2 .

Hence, for each 𝒗 = (𝑧1 , 𝑧2 ) ∈ ℂ2, we obtain

𝑃(𝑧1 , 𝑧2 ) = 𝑃(𝑧1 𝒆1 + 𝑧2 𝒆2 ) = 𝑧1 𝑃(𝒆1 ) + 𝑧2 𝑃(𝒆2 ) = (0, −𝑧1 + 𝑧2 ).

Remark: If 𝑽 is a real or complex vector space together with an inner product 〈 , 〉, and 𝑾
is a subspace of 𝑽 such that 𝑽 = 𝑾 ⊕ 𝑾⊥ (where 𝑾⊥ is the orthogonal complement of 𝑾
in 𝑽 relative to the inner product 〈 , 〉), then the projection on 𝑾 along 𝑾⊥ will also be
called the orthogonal projection on 𝑾. (The condition 𝑽 = 𝑾 ⊕ 𝑾⊥ is always fulfilled if
𝑾 is finitely generated. See Chapter 4 or Example 17 for more details.)

The following is an example of an orthogonal projection in ℝ2 , relative to the standard


inner product on ℝ2 .

Example 28 Let 𝑾 = 𝑠𝑝𝑎𝑛 {(2, 1)}. Since the vectors (−1, 2) and (2, 1) are orthogonal
with respect to the standard inner product on ℝ2 , we have 𝑾⊥ = 𝑠𝑝𝑎𝑛 {(−1, 2)} and
ℝ2 = 𝑾 ⊕ 𝑾⊥ , for dimensional reasons. Let 𝑃: ℝ2 → ℝ2 be the orthogonal projection on
𝑾. By definition, 𝑃(2, 1) = (2, 1) and 𝑃(−1, 2) = (0, 0). Solving for 𝒆1 = (1, 0) and
𝒆2 = (0, 1) in terms of the vectors (2, 1) and (−1, 2), which also form a basis for ℝ2 , we
obtain

𝒆1 = 2⁄5 (2, 1) − 1⁄5 (−1, 2) and 𝒆2 = 1⁄5 (2, 1) + 2⁄5 (−1, 2).

Thus,


𝑃(𝒆1 ) = 2⁄5 𝑃(2, 1) − 1⁄5 𝑃(−1, 2) = (4⁄5, 2⁄5) and

𝑃(𝒆2 ) = 1⁄5 𝑃(2, 1) + 2⁄5 𝑃(−1, 2) = (2⁄5, 1⁄5).

The projection 𝑃 is the linear extension of the function 𝑃: {𝒆1 , 𝒆2 } → ℝ2 described above,
hence, for each 𝒗 = (𝑥1 , 𝑥2 ) ∈ ℝ2 , we have

𝑃(𝑥1 , 𝑥2 ) = 𝑃(𝑥1 𝒆1 + 𝑥2 𝒆2 ) = 𝑥1 𝑃(𝒆1 ) + 𝑥2 𝑃(𝒆2 )

= (4⁄5 𝑥1 + 2⁄5 𝑥2 , 2⁄5 𝑥1 + 1⁄5 𝑥2 ).

In Chapter 4 we will see alternative methods for computing orthogonal projections.
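For Example 28 the projection can also be packaged as a 2 × 2 matrix: for a line 𝑾 = 𝑠𝑝𝑎𝑛{𝒂} with the standard inner product, the orthogonal projection has matrix 𝒂𝒂𝑡 ⁄(𝒂𝑡 𝒂). A minimal sketch (Python/numpy, an illustration only) recovering the matrix with entries 4⁄5, 2⁄5, 2⁄5, 1⁄5:

import numpy as np

a = np.array([2.0, 1.0])                  # W = span{(2, 1)}
P = np.outer(a, a) / np.dot(a, a)         # matrix of the orthogonal projection on W

print(P)                                  # [[0.8, 0.4], [0.4, 0.2]]
print(P @ np.array([2.0, 1.0]))           # (2, 1): vectors in W are fixed
print(P @ np.array([-1.0, 2.0]))          # (0, 0): W^⊥ is sent to the zero vector
print(np.allclose(P @ P, P))              # True: P is idempotent (see Exercise 29)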

Projections enjoy several interesting properties (see Exercise 29). On the other hand, as we
shall see, they play an important role in Linear Algebra.

Definition. (Generalized Householder operators) Let 𝑽 be a real or complex vector space,


let 𝑾1 be a 1 −dimensional subspace of 𝑽, and let 𝑾2 be a direct complement of 𝑾1 in 𝑽,
i.e. a subspace 𝑾2 such that 𝑽 = 𝑾1 ⊕ 𝑾2 . Denote by 𝑃 the projection on 𝑾1 along 𝑾2 .
The operator 𝐻: 𝑽 → 𝑽 given by 𝐻(𝒗) = 𝒗 − 2𝑃(𝒗), for all 𝒗 ∈ 𝑽, is called a generalized
Householder operator.

Example 29 Let 𝑾1 = 𝑠𝑝𝑎𝑛 {(0, 𝑖)} and 𝑾2 = 𝑠𝑝𝑎𝑛 {(1, 1)} in ℂ2 and let 𝑃 be the
projection on 𝑾1 along 𝑾2 , as in Example 27. Then the corresponding Householder
operator 𝐻 is given by

𝐻(𝑧1 , 𝑧2 ) = (𝑧1 , 𝑧2 ) − 2(0, −𝑧1 + 𝑧2 ) = (𝑧1 , 2𝑧1 − 𝑧2 ).

Remark: If 𝑽 is a real or complex vector space together with an inner product 〈 , 〉, and 𝑾
is a 1 −dimensional subspace of 𝑽, then the generalized Householder operator 𝐻, relative
to the projection 𝑃 on 𝑾 along 𝑾⊥ , is also given by 𝐻 = −𝐼𝑑𝑾 ⊕ 𝐼𝑑𝑾⊥ . (See Exercise 30.)
Such an operator is called the reflection about 𝑾⊥ . In this case, the generalized
Householder operator 𝐻 may be interpreted geometrically as the operator that transforms
each vector in 𝑽 into its mirror image about 𝑾⊥ . See Chapter 4 for more details.

Here is an example of a reflection in ℝ2 .

Example 30 Let 𝑾 = 𝑠𝑝𝑎𝑛 {(2, 1)} and 𝑾⊥ = 𝑠𝑝𝑎𝑛 {(−1, 2)} in ℝ2 , as in Example 28. Let
𝑃 be the orthogonal projection on 𝑾. Then the corresponding Householder operator 𝐻, a
reflection in this case, is given by

𝐻(𝑥1 , 𝑥2 ) = (𝑥1 , 𝑥2 ) − 2 (4⁄5 𝑥1 + 2⁄5 𝑥2 , 2⁄5 𝑥1 + 1⁄5 𝑥2 )

= (−3⁄5 𝑥1 − 4⁄5 𝑥2 , −4⁄5 𝑥1 + 3⁄5 𝑥2 ),

for all (𝑥1 , 𝑥2 ) ∈ ℝ2 . The reader might want to draw a picture to see that 𝐻
transforms vectors in ℝ2 into their mirror images about the line 𝑾⊥ = 𝑠𝑝𝑎𝑛 {(−1, 2)}.
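In matrix form, the generalized Householder operator of Example 30 is 𝐼 − 2𝑃, with 𝑃 the projection matrix of the previous sketch. A brief numerical check (Python/numpy, an illustration only):

import numpy as np

a = np.array([2.0, 1.0])
P = np.outer(a, a) / np.dot(a, a)         # orthogonal projection on W = span{(2, 1)}
H = np.eye(2) - 2.0 * P                   # H(v) = v - 2 P(v)

print(H)                                  # [[-0.6, -0.8], [-0.8, 0.6]]
print(H @ np.array([-1.0, 2.0]))          # (-1, 2): vectors on the mirror line W^⊥ are fixed
print(H @ np.array([2.0, 1.0]))           # (-2, -1): vectors in W are reversed
print(np.allclose(H @ H, np.eye(2)))      # True: reflecting twice gives the identity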


We now introduce a vector space structure on the set of all linear transformations from
one vector space to another (over the same field).

Definition. Let 𝑼 and 𝑽 be vector spaces over a field 𝐹 . Let 𝑆 and 𝑇 be any linear
transformations from 𝑼 to 𝑽and let 𝑘 ∈ 𝐹 be any scalar.

a) The sum 𝑆 + 𝑇 is defined as follows: For each 𝒖 ∈ 𝑼,

(𝑆 + 𝑇)(𝒖) = 𝑆(𝒖) + 𝑇(𝒖).

b) The scalar multiple 𝑘𝑆 is defined as follows: For each 𝒖 ∈ 𝑼,

(𝑘𝑆)(𝒖) = 𝑘𝑆(𝒖).

It turns out that 𝑆 + 𝑇 and 𝑘𝑆 also define linear transformations from 𝑼 to 𝑽. Moreover,
the set L(𝑼, 𝑽) of all linear transformations from 𝑼 to 𝑽, together with these operations of
addition and scalar multiplication, becomes a vector space over 𝐹.

We now turn our attention to linear transformations between finitely generated vector
spaces. Such linear transformations may be interpreted in terms of matrices.

At this point we assume that the reader is acquainted with matrix multiplication and its
basic properties.

Definition. Let 𝑼 and 𝑽 be vector spaces over a field 𝐹 such that 𝑑𝑖𝑚 𝑼 = 𝑛 and
𝑑𝑖𝑚 𝑽 = 𝑚, and let 𝑆: 𝑼 → 𝑽 be a linear transformation. Let 𝛼 = {𝒖1 , . . . , 𝒖𝑗 , . . . , 𝒖𝑛 } and
𝛽 = {𝒗1 , . . . , 𝒗𝑖 , . . . , 𝒗𝑚 } be ordered bases for 𝑼 and 𝑽, respectively. For each 𝑗 = 1, … , 𝑛,
let 𝑆(𝒖𝑗 ) = ∑_{𝑖=1}^{𝑚} 𝑎𝑖𝑗 𝒗𝑖 . The matrix representation of 𝑆 with respect to 𝛼 and 𝛽 is the
𝑚 × 𝑛 matrix with entries in 𝐹 given by

[𝑆]𝛽𝛼 = (𝑎𝑖𝑗 ).

This means that the 𝑗 −th column of [𝑆]𝛽𝛼 is the (ordered) kit of scalars 𝑎1𝑗 , …, 𝑎𝑖𝑗 , …, 𝑎𝑚𝑗 in
the unique expression of 𝑆(𝒖𝑗 ) as a linear combination of the vectors 𝒗1 , . . . , 𝒗𝑖 , . . . , 𝒗𝑚 .

In the particular case of an operator 𝑆: 𝑽 → 𝑽 on a finitely generated vector space, if


𝛼 = {𝒗1 , . . . , 𝒗𝑗 , . . . , 𝒗𝑛 } is an ordered basis for 𝑽, we shall write [𝑆]𝛼 instead of [𝑆]𝛼𝛼 .

The following notation will be very useful in handling matrix representations.

Let 𝑼 be a vector space over a field 𝐹 and let 𝛼 = {𝒖1 , . . . , 𝒖𝑗 , . . . , 𝒖𝑛 } be an ordered basis
for 𝑼. If 𝒖 is any vector in 𝑼 and 𝒖 = 𝑘1 𝒖1 + ⋯ + 𝑘𝑗 𝒖𝑗 + ⋯ + 𝑘𝑛 𝒖𝑛 is the unique
expression of 𝒖 as a linear combination of the vectors in 𝛼, the vector in 𝐹 𝑛 given by

[𝒖]𝛼 = (𝑘1 , … , 𝑘𝑗 , … , 𝑘𝑛 )𝑡   (written as a column)


is called the coordinate vector of 𝒖 in the basis 𝛼.

Turning back to the situation in the definition above, where 𝑆: 𝑼 → 𝑽 is a linear
transformation between finitely generated vector spaces, and 𝛼 and 𝛽 are ordered bases
for 𝑼 and 𝑽, respectively, if we multiply the matrix [𝑆]𝛽𝛼 and the vector [𝒖]𝛼 we obtain the column vector

[𝑆]𝛽𝛼 [𝒖]𝛼 = (∑_{𝑗=1}^{𝑛} 𝑘𝑗 𝑎1𝑗 , … , ∑_{𝑗=1}^{𝑛} 𝑘𝑗 𝑎𝑖𝑗 , … , ∑_{𝑗=1}^{𝑛} 𝑘𝑗 𝑎𝑚𝑗 )𝑡 .

On the other hand, 𝒖 = ∑_{𝑗=1}^{𝑛} 𝑘𝑗 𝒖𝑗 implies

𝑆(𝒖) = 𝑆(∑_{𝑗=1}^{𝑛} 𝑘𝑗 𝒖𝑗 ) = ∑_{𝑗=1}^{𝑛} 𝑘𝑗 𝑆(𝒖𝑗 ) = ∑_{𝑗=1}^{𝑛} 𝑘𝑗 ∑_{𝑖=1}^{𝑚} 𝑎𝑖𝑗 𝒗𝑖 = ∑_{𝑖=1}^{𝑚} (∑_{𝑗=1}^{𝑛} 𝑘𝑗 𝑎𝑖𝑗 ) 𝒗𝑖 .

Thus,

[𝑆]𝛽𝛼 [𝒖]𝛼 = [𝑆(𝒖)]𝛽 ,

which shows the role of the matrix representation [𝑆]𝛽𝛼 .
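The recipe "the 𝑗 −th column of [𝑆]𝛽𝛼 is the coordinate vector of 𝑆(𝒖𝑗 ) in the basis 𝛽" translates directly into a computation. The sketch below (Python/numpy, a made-up operator on ℝ2 ; an illustration only) builds the representation column by column and checks the identity [𝑆]𝛽𝛼 [𝒖]𝛼 = [𝑆(𝒖)]𝛽 for non-canonical bases.

import numpy as np

S = np.array([[1.0, 1.0],
              [0.0, 2.0]])                      # S on R^2, written in canonical coordinates

alpha = np.array([[1.0, 1.0], [1.0, -1.0]]).T   # columns: ordered basis alpha of the domain
beta  = np.array([[2.0, 0.0], [1.0, 1.0]]).T    # columns: ordered basis beta of the codomain

# j-th column of [S]^beta_alpha = coordinates of S(u_j) in beta, i.e. beta^{-1} S u_j.
S_beta_alpha = np.linalg.solve(beta, S @ alpha)

u = np.array([3.0, -1.0])
u_alpha = np.linalg.solve(alpha, u)             # coordinate vector [u]_alpha
lhs = S_beta_alpha @ u_alpha                    # [S]^beta_alpha [u]_alpha
rhs = np.linalg.solve(beta, S @ u)              # [S(u)]_beta

print(np.allclose(lhs, rhs))                    # True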

Example 31 For a given field 𝐹, let 𝐿𝐴 : 𝐹 𝒏 → 𝐹 𝒎 be the linear transformation induced by


some matrix 𝐴 = (𝑎𝑖𝑗 ) ∈ 𝑀𝑚×𝑛 (𝐹), that is, 𝐿𝐴 (𝒗) = 𝐴𝒗, for all 𝒗 ∈ 𝐹 𝒏 . Consider the
canonical bases 𝑐𝑎𝑛𝑛 = {𝒆1 , . . . , 𝒆𝑗 , . . . , 𝒆𝑛 } and 𝑐𝑎𝑛𝑚 = {𝒆1 , . . . , 𝒆𝑖 , . . . , 𝒆𝑚 } for 𝐹 𝒏 and
𝐹 𝒎 , respectively. If 𝐴1 , … , 𝐴𝑗 , … , 𝐴𝑛 are the columns of 𝐴, then 𝐿𝐴 (𝒆𝑗 ) = 𝐴𝑗 , 𝑗 = 1, … , 𝑛, as
we saw in Example 18. But

𝐴𝑗 = (𝑎1𝑗 , … , 𝑎𝑖𝑗 , … , 𝑎𝑚𝑗 )𝑡 = ∑_{𝑖=1}^{𝑚} 𝑎𝑖𝑗 𝒆𝑖 ∈ 𝐹 𝒎 .

Thus,

[𝐿𝐴 ]^{𝑐𝑎𝑛𝑚}_{𝑐𝑎𝑛𝑛} = (𝐿𝐴 (𝒆1 ), … , 𝐿𝐴 (𝒆𝑗 ), … , 𝐿𝐴 (𝒆𝑛 )) = (𝐴1 , … , 𝐴𝑗 , … , 𝐴𝑛 ) = 𝐴.

Example 32 Let 𝑽 be a vector space over a field 𝐹 such that 𝑑𝑖𝑚 𝑽 = 𝑛. For any given
𝑟 ∈ 𝐹, consider the homothety 𝐻𝑟 : 𝑽 → 𝑽, i.e. the operator such that 𝐻𝑟 (𝒗) = 𝑟𝒗, for all
𝒗 ∈ 𝑽. Then, for any ordered basis 𝛼 for 𝑽, the matrix representation [𝐻𝑟 ]𝛼 is the matrix
𝑟𝐼𝑛 , where 𝐼𝑛 is the 𝑛 × 𝑛 identity matrix. Such a matrix is called a scalar matrix.

We now turn back to invariant subspaces of operators, which were defined at the
beginning of this section.


Let 𝑆: 𝑽 → 𝑽 be an operator on a finitely generated vector space 𝑽 over a field 𝐹 and let
𝑾 be an 𝑆 −invariant subspace of 𝑽. Let 𝛼 = {𝒘1 , 𝒘2 , . . . , 𝒘𝑟 } be an ordered basis for 𝑾
and consider an extension 𝛼 ∪̇ 𝛽 = {𝒘1 , 𝒘2 , . . . , 𝒘𝑟 } ∪̇ {𝒗1 , 𝒗2 , . . . , 𝒗𝑠 } to an ordered
basis for 𝑽. Then the matrix representation [𝑆]𝛼∪̇𝛽 is a block−triangular matrix, say

[𝑆]𝛼∪̇𝛽 = [ 𝐵  𝐷 ]
          [ 𝑂  𝐶 ],

where 𝐵 is 𝑟 × 𝑟 and 𝐶 is 𝑠 × 𝑠.

In the particular case of a direct sum of operators, say 𝑆1 ⊕ 𝑆2 : 𝑽 → 𝑽, where 𝑆1 and 𝑆2
are operators on subspaces 𝑾1 and 𝑾2 of 𝑽 such that 𝑽 = 𝑾1 ⊕ 𝑾2 , the situation
becomes more interesting. Indeed, if 𝛼 and 𝛽 are ordered bases for 𝑾1 and 𝑾2 , then, by
Theorem 1.7, 𝛼 ∪̇ 𝛽 is a basis for 𝑽 and, since 𝑾1 and 𝑾2 are both 𝑆1 ⊕ 𝑆2 −invariant, the
matrix representation [𝑆1 ⊕ 𝑆2 ]𝛼∪̇𝛽 is a block−diagonal matrix, say

[𝑆1 ⊕ 𝑆2 ]𝛼∪̇𝛽 = [ 𝐵  𝑂 ]
                 [ 𝑂  𝐶 ],

where 𝐵 and 𝐶 are submatrices of appropriate size.

This suggests the following definition.

Definition. Let 𝐵 ∈ 𝑀𝑟×𝑟 (𝐹) and 𝐶 ∈ 𝑀𝑠×𝑠 (𝐹). The direct sum of 𝐵 and 𝐶 is the
(𝑟 + 𝑠) × (𝑟 + 𝑠) block−diagonal matrix with entries in 𝐹 given by

𝐵 ⊕ 𝐶 = [ 𝐵  𝑂 ]
         [ 𝑂  𝐶 ].

Thus, in these terms, the matrix identity previous to this definition concerning direct sums
of operators may be written

[𝑆1 ⊕ 𝑆2 ]𝛼∪̇𝛽 = 𝐵 ⊕ 𝐶.

Both this definition and the matrix identity may be extended to any number of summands.

We finish this chapter with a relevant isomorphism between L(𝑼, 𝑽) and a space of
matrices, when 𝑼 and 𝑽 are finite dimensional vector spaces over a field 𝐹.

Let 𝑼 and 𝑽 be finitely generated vector spaces over a field 𝐹 such that 𝑑𝑖𝑚 𝑼 = 𝑛 and
𝑑𝑖𝑚 𝑽 = 𝑚 and let 𝑆: 𝑼 → 𝑽 and 𝑇: 𝑼 → 𝑽 be linear transformations. Consider ordered
bases 𝛼 = {𝒖1 , . . . , 𝒖𝑗 , . . . , 𝒖𝑛 } and 𝛽 = {𝒗1 , . . . , 𝒗𝑖 , . . . , 𝒗𝑚 } for 𝑼 and 𝑽, respectively. For
each 𝑗 = 1, … , 𝑛, let 𝑆(𝒖𝑗 ) = ∑_{𝑖=1}^{𝑚} 𝑎𝑖𝑗 𝒗𝑖 and 𝑇(𝒖𝑗 ) = ∑_{𝑖=1}^{𝑚} 𝑏𝑖𝑗 𝒗𝑖 . Then,

(𝑆 + 𝑇)(𝒖𝑗 ) = ∑_{𝑖=1}^{𝑚} 𝑎𝑖𝑗 𝒗𝑖 + ∑_{𝑖=1}^{𝑚} 𝑏𝑖𝑗 𝒗𝑖 = ∑_{𝑖=1}^{𝑚} (𝑎𝑖𝑗 + 𝑏𝑖𝑗 ) 𝒗𝑖 .

Hence, according to the definition,


[𝑆 + 𝑇]𝛽𝛼 = [𝑆]𝛽𝛼 + [𝑇]𝛽𝛼 .

Using a similar elementary argument, if 𝑘 ∈ 𝐹 is any scalar, we also obtain

[𝑘𝑆]𝛽𝛼 = 𝑘[𝑆]𝛽𝛼 .

In the same vein, consider the function Φ: L(𝑼, 𝑽) → 𝑀𝑚×𝑛 (𝐹) given by Φ(𝑆) = [𝑆]𝛽𝛼 , for
each 𝑆 ∈ L(𝑼, 𝑽). The two identities above mean that Φ is 𝐹 −linear. In fact, this linear
transformation happens to be an isomorphism (see Exercise 34).

Now we can go one step further. Let 𝑼, 𝑽 and 𝑾 be finitely generated vector spaces over
a field 𝐹 and let 𝑆: 𝑼 → 𝑽 and 𝑇: 𝑽 → 𝑾 be linear transformations. Consider ordered
bases 𝛼 = {𝒖1 , . . . , 𝒖𝑗 , . . . , 𝒖𝑛 } , 𝛽 = {𝒗1 , . . . , 𝒗𝑖 , . . . , 𝒗𝑚 } , and 𝛾 = {𝒘1 , . . . , 𝒘𝑗 , . . . , 𝒘𝑞 }
for 𝑼, 𝑽 and 𝑾, respectively. Then the matrix representation [𝑇𝑆]𝛾𝛼 of the composite
linear transformation 𝑇𝑆: 𝑼 → 𝑾 may be computed in terms of the matrix representations
[𝑇]𝛾𝛽 = (𝑏𝑖𝑘 ) and [𝑆]𝛽𝛼 = (𝑎𝑘𝑗 ). Indeed, for each 𝑗 = 1, … , 𝑛, we have

(𝑇𝑆)(𝒖𝑗 ) = 𝑇(𝑆(𝒖𝑗 )) = 𝑇(∑_{𝑘=1}^{𝑚} 𝑎𝑘𝑗 𝒗𝑘 ) = ∑_{𝑘=1}^{𝑚} 𝑎𝑘𝑗 𝑇(𝒗𝑘 )

= ∑_{𝑘=1}^{𝑚} 𝑎𝑘𝑗 ∑_{𝑖=1}^{𝑞} 𝑏𝑖𝑘 𝒘𝑖 = ∑_{𝑖=1}^{𝑞} (∑_{𝑘=1}^{𝑚} 𝑏𝑖𝑘 𝑎𝑘𝑗 ) 𝒘𝑖 .

Thus, we have the property

[𝑇𝑆]𝛾𝛼 = [𝑇]𝛾𝛽 [𝑆]𝛽𝛼 .
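The identity above can be verified numerically for arbitrary data. A short sketch (Python/numpy, an illustration only; the randomly generated basis matrices are assumed invertible, which holds almost surely):

import numpy as np

rng = np.random.default_rng(0)
S = rng.standard_normal((3, 2))      # a linear map U -> V  (dim U = 2, dim V = 3)
T = rng.standard_normal((4, 3))      # a linear map V -> W  (dim V = 3, dim W = 4)

alpha = rng.standard_normal((2, 2))  # columns: an ordered basis of U
beta  = rng.standard_normal((3, 3))  # columns: an ordered basis of V
gamma = rng.standard_normal((4, 4))  # columns: an ordered basis of W

rep = lambda M, dom, cod: np.linalg.solve(cod, M @ dom)   # matrix representation [M]^cod_dom

lhs = rep(T @ S, alpha, gamma)                   # [TS]^gamma_alpha
rhs = rep(T, beta, gamma) @ rep(S, alpha, beta)  # [T]^gamma_beta [S]^beta_alpha

print(np.allclose(lhs, rhs))   # True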

Denote by L(𝑽) the vector space L(𝑽, 𝑽) of operators on a vector space 𝑽. If 𝑽 is finitely
generated and 𝛼 = {𝒗1 , . . . , 𝒗𝑗 , . . . , 𝒗𝑛 } is an ordered basis for 𝑽, it turns out that the
vector space isomorphism Φ: L(𝑽) → 𝑀𝑛×𝑛 (𝐹) given by Φ(𝑆) = [𝑆]𝛼 , for each 𝑆 ∈ L(𝑽),
is also an isomorphism of algebras, since Φ(𝑇𝑆) = Φ(𝑇)Φ(𝑆), for all 𝑆, 𝑇 ∈ L(𝑽), by the
property above applied to this particular case.


EXERCISES

1. Prove properties 1 and 2 stated after the definition of a vector space in Section 1.1.

2. Prove that the intersection of an arbitrary collection of subspaces of a vector space is


also a subspace of the vector space.

3. Let 𝑽 be a vector space over a field 𝐹 and let 𝑾1 and 𝑾2 be subspaces of 𝑽. Prove that
𝑾1 ∪ 𝑾2 is a subspace of 𝑽 if and only if 𝑾1 ⊆ 𝑾2 or 𝑾2 ⊆ 𝑾1 .

4. Let 𝑽 be a vector space over a field 𝐹 and let 𝑅 be a (possibly infinite) basis for 𝑽. Show
that any vector in 𝑽 can be expressed as a linear combination of vectors in 𝑅 in a
unique way.

5. Let 𝑽 be a finitely generated vector space over a field 𝐹, and let 𝑅 = {𝒗1 , 𝒗2 , . . . , 𝒗𝑛 } be
a basis for 𝑽. Let 𝒗 be a vector in 𝑽 and let 𝑘1 , … , 𝑘𝑖 , … , 𝑘𝑛 ∈ 𝐹 be the unique scalars
such that 𝒗 = 𝑘1 𝒗1 + ⋯ + 𝑘𝑖 𝒗𝑖 + ⋯ + 𝑘𝑛 𝒗𝑛 . Show that {𝒗1 , . . . , 𝒗̂𝑖 , . . . , 𝒗𝑛 } ∪ {𝒗} is
also a basis for 𝑽 if and only if 𝑘𝑖 ≠ 0.

6. Let 𝑽 be a vector space over a field 𝐹 and let 𝑅 be an infinite subset of 𝑽. Show that
𝑠𝑝𝑎𝑛𝑅 is a subspace of 𝑽.

7. Let 𝑄 and 𝑅 be subsets of a vector space 𝑽 over a field 𝐹. Prove that


a) 𝑠𝑝𝑎𝑛(𝑄 ∪ 𝑅) = 𝑠𝑝𝑎𝑛𝑄 + 𝑠𝑝𝑎𝑛𝑅
b) 𝑠𝑝𝑎𝑛(𝑄 ∩ 𝑅) ⊆ 𝑠𝑝𝑎𝑛𝑄 ∩ 𝑠𝑝𝑎𝑛𝑅

8. Let 𝑽 be a non trivial finitely generated vector space over a field 𝐹.


a) Show that 𝑽 has a finite basis.
b) Prove that any set of generators for 𝑽 has a finite subset that also generates 𝑽.
c) Prove that all bases for 𝑽 are finite and have the same number of elements.

9. Let 𝑽 be an 𝑛 −dimensional vector space over a field 𝐹.


a) Show that any collection of 𝑛 linearly independent vectors in 𝑽 is a basis for 𝑽.
b) Prove that any generating set for 𝑽 with 𝑛 elements is a basis for 𝑽.

10. Let 𝑽 be a finitely generated vector space over a field 𝐹 and let 𝑾 be a subspace of 𝑽.
Show that 𝑾 is also finitely generated and 𝑑𝑖𝑚 𝑾 ≤ 𝑑𝑖𝑚 𝑽.

11. A partially ordered set (𝑅, ) is said to be well−ordered if any (non empty) subset of 𝑅
has a smallest element.
a) Show that the set of real numbers ℝ together with the usual partial ordering ≤ is
not well−ordered.
b) Prove that the subset of ℝ given by



∐_{𝑛=1}^{∞} {𝑛 − 1⁄𝑘 | 𝑘 ∈ ℕ, 𝑘 > 0},

together with the partial ordering inherited from (ℝ, ≤), is well−ordered.

12. Let 𝑽 be a non trivial vector space over a field 𝐹.


a) Assume that 𝑽 is finitely generated. If 𝒗1 , 𝒗2 , . . . , 𝒗𝑛 is a linearly independent set
of vectors in 𝑽, show that this collection can be extended to a basis for 𝑽. Hint:
Use the Replacement Theorem.
b) Assume now that 𝑽 is not finitely generated. If 𝑅 is a (possibly infinite) linearly
independent set of vectors in 𝑽, prove that it can be extended to a basis for 𝑽.
Hint: Apply Zorn’s Lemma to an appropriate family of linearly independent
subsets of 𝑽 with the partial order relation ⊆ .

13. Prove Theorem 1.6. Hint: Start with a basis 𝑄 for 𝑾1 ∩ 𝑾2 and consider extensions 𝑅1
and 𝑅2 to bases for 𝑾1 and 𝑾2 , respectively.

14. Prove that ℝ regarded as a vector space over ℚ is infinite dimensional. Hint: Use the
existence of transcendental real numbers.

15. If 𝑾1 and 𝑾2 are the subspaces of 𝑀𝑛×𝑛 (𝐹) of symmetric and skew−symmetric
matrices, respectively, show that 𝑀𝑛×𝑛 (𝐹) = 𝑾1 ⊕ 𝑾2 if and only if the characteristic
of 𝐹 is not 2.

16. a) Let 𝑾 be the subspace of 𝑀𝑛×𝑛 (𝐹) of all strictly lower triangular matrices (i.e. lower
triangular matrices whose diagonal entries are also equal to zero). Find two different
direct complements for 𝑾 in 𝑀𝑛×𝑛 (𝐹). (You might need to discuss separately the case
where the characteristic of 𝐹 is 2.)

b) Does 𝐹 ∞ have a direct complement in 𝐹 𝜔 ? Explain your answer.

17. Prove that b) ⟹ c) in Theorem 1.7. (In fact, c) is a consequence of a weaker version of b)
in the sense that you only need to assume that b) is true for one collection of ordered
bases 𝛼1 , … , 𝛼𝑘 for 𝑾1 , … , 𝑾𝑘 .) Conclude that if 𝑾1 , … , 𝑾𝑘 are finitely generated
subspaces of a vector space whose sum is a direct sum, then 𝑑𝑖𝑚 (𝑾1 ⊕ ⋯ ⊕ 𝑾𝑘 ) =
∑_{𝑗=1}^{𝑘} 𝑑𝑖𝑚 𝑾𝑗 .

18. Let 𝑾1 , … , 𝑾𝑘 be subspaces of a vector space 𝑽. If 𝑾2 + ⋯ + 𝑾𝑘 is a direct sum,


prove that 𝑾1 + ⋯ + 𝑾𝑘 is a direct sum if and only if 𝑾1 ∩ (𝑾2 + ⋯ + 𝑾𝑘 ) = {𝟎⃑ }.

19. Let {𝑾𝑗 }𝑗∈𝐽 be an arbitrary (possibly infinite) collection of subspaces of a vector space
𝑽. The sum ∑𝑗∈𝐽 𝑾𝑗 is defined to be the subset of 𝑽 of all sums ∑𝑗∈𝐽 𝒘𝑗 such that
𝒘𝑗 ∈ 𝑾𝑗 and 𝒘𝑗 = 𝟎⃑ except for a finite number of indices 𝑗 ∈ 𝐽. Show that ∑𝑗∈𝐽 𝑾𝑗 is a
subspace of 𝑽.


20. A sum of subspaces {𝑾𝑗 }𝑗∈𝐽 of a vector space 𝑽 (see Exercise 19) is said to be a direct
sum when the following condition is satisfied:

If 𝒘𝑗 ∈ 𝑾𝑗 , 𝑗 ∈ 𝐽, are such that ∑𝑗∈𝐽 𝒘𝑗 = 𝟎⃑, then 𝒘𝑗 = 𝟎⃑ for all 𝑗 ∈ 𝐽.

Prove that a sum ∑𝑗∈𝐽 𝑾𝑗 is direct if and only if whenever ∑𝑗∈𝐽 𝒘𝑗 = ∑𝑗∈𝐽 𝒘𝑗′ , with
𝒘𝑗 , 𝒘𝑗′ ∈ 𝑾𝑗 , then 𝒘𝑗 = 𝒘𝑗′ for all 𝑗 ∈ 𝐽. If 𝐽 happens to be a finite set of indices, is this
definition of direct sum equivalent to the one given in Section 1.3? Justify your answer.


21. Show that ℝ∞ = ⨁_{𝑗=1}^{∞} 𝑠𝑝𝑎𝑛 {𝒆𝑗 }. (See Exercise 20.)

22. Give an example of a vector space 𝑽 and a proper subspace 𝑾 of 𝑽 that have bases
with the same cardinality.

23. Fill in the details in Example 17.

24. Prove the second assertion in Theorem 1.8.

25. For a given field 𝐹, show that 𝑑𝑖𝑚 𝑀𝑚×𝑛 (𝐹) = 𝑚 × 𝑛. 𝐻𝑖𝑛𝑡: For each 𝑖 = 1, … , 𝑚 and
𝑗 = 1, … , 𝑛, consider the matrix 𝐴𝑖𝑗 ∈ 𝑀𝑚×𝑛 (𝐹) all of whose entries are equal to zero,
except for its 𝑖𝑗 −th entry that is equal to one.

26. Let 𝑼 and 𝑽 be vector spaces over a field 𝐹and let 𝑆: 𝑼 → 𝑽 be a linear transformation.

a) Prove that 𝑆 is injective if and only if the image of any (possibly infinite) linearly
independent subset of 𝑼 is a linearly independent subset of 𝑽.

b) Show that 𝑆 is an isomorphism if and only if the image of any basis for 𝑼 is a basis
for 𝑽.

27. Let 𝑼 and 𝑽 be finitely generated vector spaces over a field 𝐹 with 𝑑𝑖𝑚 𝑼 = 𝑑𝑖𝑚 𝑽. Let
𝑆: 𝑼 → 𝑽 be a linear transformation. Show that 𝑆 is injective if and only if it is
surjective. Give counterexamples to this fact in the infinite dimensional case.

28. Prove that two arbitrary (possibly infinite dimensional) vector spaces over a field 𝐹 are
isomorphic if and only if they admit bases with the same cardinality. Hint: For one
implication use Exercise 26; for the converse, use linear extensions.

29. Let 𝑽 be a vector space and consider subspaces 𝑾1 and 𝑾2 of 𝑽 such that 𝑽 =
𝑾1  𝑾2 . Let 𝑃 be the projection on 𝑾1 along 𝑾2 .
a) Show that 𝑘𝑒𝑟𝑃 = 𝑾2 .
b) Prove that 𝑃 is idempotent, i.e. 𝑃2 = 𝑃.
c) Prove that 𝑟𝑎𝑛𝑔𝑒𝑃 is equal to the fixed point set of 𝑃, i.e. {𝒗 ∈ 𝑽 | 𝑃(𝒗) = 𝒗}.
d) Show that 𝑟𝑎𝑛𝑔𝑒𝑃 = 𝑾1 .


30. Let 𝑽 be a real or complex vector space with an inner product 〈 , 〉, and let 𝑾 be a
1 −dimensional subspace of 𝑽. Show that the generalized Householder operator 𝐻,
relative to the orthogonal projection 𝑃 on 𝑾, is also given by 𝐻 = −𝐼𝑑𝑾 ⊕ 𝐼𝑑𝑾⊥ .

31. Let 𝑽 be a vector space and let 𝑃 be an operator on 𝑽 such that 𝑃2 = 𝑃.


a) Show that 𝑽 = 𝑟𝑎𝑛𝑔𝑒𝑃 + 𝑘𝑒𝑟𝑃. Hint: For each 𝒗 ∈ 𝑽, 𝒗 = 𝑃(𝒗) + (𝒗 − 𝑃(𝒗)).
b) Show that 𝑽 = 𝑟𝑎𝑛𝑔𝑒𝑃 ⊕ 𝑘𝑒𝑟𝑃.
c) Prove that 𝑃 is the projection on 𝑟𝑎𝑛𝑔𝑒𝑃 along 𝑘𝑒𝑟𝑃.

32. Let 𝑆 be an operator on a vector space 𝑽.


a) If 𝑽 is finitely generated, show that if 𝑆 satisfies 𝑽 = 𝑟𝑎𝑛𝑔𝑒𝑆 + 𝑘𝑒𝑟𝑆, then it also
satisfies 𝑽 = 𝑟𝑎𝑛𝑔𝑒𝑆 ⊕ 𝑘𝑒𝑟𝑆.
b) Give an example of an operator 𝑆 on 𝐹 ∞ such that 𝐹 ∞ = 𝑟𝑎𝑛𝑔𝑒𝑆 + 𝑘𝑒𝑟𝑆, where
this sum is not direct.

33. Let 𝑼 and 𝑽 be finitely generated vector spaces over a field 𝐹 such that 𝑑𝑖𝑚 𝑼 = 𝑛 and
𝑑𝑖𝑚 𝑽 = 𝑚 . Let 𝛼 = {𝒖1 , . . . , 𝒖𝑘 , . . . , 𝒖𝑛 } and 𝛽 = {𝒗1 , . . . , 𝒗𝑖 , . . . , 𝒗𝑚 } be ordered
bases for 𝑼 and 𝑽, respectively. For each 𝑖 = 1, … , 𝑚 and 𝑗 = 1, … , 𝑛, define 𝑆𝑖𝑗 ∈
L(𝑼, 𝑽) as follows: 𝑆𝑖𝑗 is the linear extension of the function 𝑆𝑖𝑗 : 𝛼 → 𝑽 given by

𝑆𝑖𝑗 (𝒖𝑘 ) = ⃑𝟎, if 𝑘 ≠ 𝑗, and 𝑆𝑖𝑗 (𝒖𝑘 ) = 𝒗𝑖 , if 𝑘 = 𝑗.

Prove that the collection {𝑆𝑖𝑗 } is a basis for the vector space L(𝑼, 𝑽) over 𝐹. Conclude
that 𝑑𝑖𝑚 ( L(𝑼, 𝑽)) = 𝑚 × 𝑛.

34. Prove that the linear transformation Φ: L(𝑼, 𝑽) → 𝑀𝑚×𝑛 (𝐹) described in the last
paragraph of Section 1.5 is an isomorphism. Hint: Show that the image by Φ of the basis
{𝑆𝑖𝑗 } for L(𝑼, 𝑽) given in Exercise 33 is precisely the basis {𝐴𝑖𝑗 } for 𝑀𝑚×𝑛 (𝐹) described
in Exercise 25.

35. Let 𝑼 and 𝑽 be vector spaces over a field 𝐹 and let 𝑾 be a subspace of 𝑼.
a) Show that 𝐴𝑛𝑛𝑾 = {𝑆 ∈ 𝐋(𝑼, 𝑽)| 𝑆(𝒘) = 𝟎 ⃑ , 𝑓𝑜𝑟 𝑎𝑙𝑙 𝒘 ∈ 𝑾}, the annihilator of
𝑾, is a subspace of L(𝑼, 𝑽).
b) If 𝑼 is finitely generated and 𝑽 = 𝐹, find 𝑑𝑖𝑚(𝐴𝑛𝑛𝑾). Prove your answer.

36. Consider ℂ2 as a real vector space and the ℝ −linear operator 𝑆 on ℂ2 given by
𝑆(𝒛1 , 𝒛2 ) = (𝑖𝒛1 , 𝒛̅2 ), for all (𝒛1 , 𝒛2 ) ∈ ℂ2 .
a) If 𝛼 = {(1, 0), (𝑖, 0), (0, 1), (0, 𝑖)} is the canonical ordered basis of ℂ2 as a real
vector space, find the matrix representation [𝑆]𝛼 .
b) If 𝑾1 = 𝑠𝑝𝑎𝑛 {(1, 0), (𝑖, 0)} and 𝑾2 = 𝑠𝑝𝑎𝑛 {(0, 1), (0, 𝑖)} (real spans), express 𝑆 as a direct sum of
operators with respect to the direct sum decomposition ℂ2 = 𝑾1 ⊕ 𝑾2 .
c) Describe 𝑆 geometrically in terms of rotations and reflections. Is 𝑆 also ℂ −linear?
d) Find the smallest positive integer 𝑛 such that 𝑆 𝑛 = 𝐼𝑑ℂ2 . Is 𝑆 an automorphism?


37. a) Let 𝑆 be an operator on a vector space 𝑽. Suppose that 𝑾1 and 𝑾2 are (non trivial)
𝑆 −invariant subspaces of 𝑽 such that 𝑽 = 𝑾1 ⊕ 𝑾2 . Show that 𝑆 can be expressed
non trivially as a direct sum of operators.
b) Let 𝑆: ℂ2 → ℂ2 be the operator given by 𝑆(𝒛1 , 𝒛2 ) = (−𝒛2 , 𝒛̅1 ), for all (𝒛1 , 𝒛2 ) ∈ ℂ2 .
Find non trivial 𝑆 −invariant subspaces 𝑾1 and 𝑾2 of ℂ2 such that ℂ2 = 𝑾1 ⊕ 𝑾2 .
Use this to express 𝑆 as a (non trivial) direct sum of operators.

38. Let 𝑆 be an operator on a finitely generated vector space 𝑽. If 𝑟𝑎𝑛𝑘𝑆 2 = 𝑟𝑎𝑛𝑘𝑆, prove
that 𝑽 = 𝑟𝑎𝑛𝑔𝑒𝑆 ⊕ 𝑘𝑒𝑟𝑆. Hint: Show that the restriction 𝑆𝑟𝑎𝑛𝑔𝑒𝑆 : 𝑟𝑎𝑛𝑔𝑒𝑆 → 𝑟𝑎𝑛𝑔𝑒𝑆
is an automorphism and deduce that 𝑟𝑎𝑛𝑔𝑒𝑆 ∩ 𝑘𝑒𝑟𝑆 = {𝟎⃑}. Then, consider Theorem
1.6 and Theorem 1.11 (Dimension Theorem).

39. Let 𝑆 be an operator on a finitely generated vector space 𝑽. Prove that there exists a
positive integer 𝑘 such that 𝑽 = 𝑟𝑎𝑛𝑔𝑒𝑆 𝑘 ⊕ 𝑘𝑒𝑟𝑆 𝑘 . Give an example of an operator on
an infinite dimensional vector space that does not satisfy this condition.

40. Let 𝑆: ℝ2 → ℝ2 be an operator. Prove that 𝑟𝑎𝑛𝑘𝑆 2 = 𝑟𝑎𝑛𝑘𝑆 if and only if 𝑆 is the
composite of a projection and a homothety.

41. Give an example of an operator 𝑆: ℝ3 → ℝ3 such that 𝑟𝑎𝑛𝑘𝑆 2 = 𝑟𝑎𝑛𝑘𝑆, but 𝑆 2 ≠ 𝑆.

42. Let 𝑆: 𝑼 → 𝑽 and 𝑇: 𝑽 → 𝑾 be linear transformations, where 𝑼, 𝑽, and 𝑾 are finitely


generated vector spaces. Prove the inequality

𝑟𝑎𝑛𝑘𝑇𝑆 ≤ 𝑚𝑖𝑛{𝑟𝑎𝑛𝑘𝑆, 𝑟𝑎𝑛𝑘𝑇}.


2 RANK AND DETERMINANTS


DIAGONAL AND JORDAN FORMS

This chapter is devoted to the notions of rank and determinant of a matrix, as well as to
the analysis of diagonal and Jordan forms. The concept of rank concerns arbitrary matrices,
while determinants involve only square matrices. Diagonal and Jordan forms involve
operators on finitely generated vector spaces.

2.1 RANK

Definition. Let 𝑽 be a vector space and let 𝑅 be a finite collection of vectors in 𝑽. The rank
of 𝑅, denoted by 𝑟𝑎𝑛𝑘𝑅, is the (common) number of vectors in any maximal linearly
independent subcollection of 𝑅.
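For instance, if 𝑅 = {(1, 0, 0), (0, 1, 0), (1, 1, 0)} in ℝ3 , then 𝑟𝑎𝑛𝑘𝑅 = 2: any two of these
vectors form a maximal linearly independent subcollection, since each of them is a linear
combination of the other two.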

The previous notion of rank may of course be extended to the case of infinite subsets of a
vector space by considering cardinals if necessary.

Let 𝐹 be any field. Given 𝒂1 , … , 𝒂𝑖 , … , 𝒂𝑚 ∈ 𝐹 𝑛 , then

𝐴 = [𝒂1 ; ⋯ ; 𝒂𝑖 ; ⋯ ; 𝒂𝑚 ]

is to be read as “the matrix 𝐴 = (𝑎𝑖𝑗 ) ∈ 𝑀𝑚×𝑛 (𝐹) whose rows are 𝒂1 , … , 𝒂𝑖 , … , 𝒂𝑚 ” (here and in what follows, semicolons separate the rows of a matrix displayed on a single line).

Analogously, given 𝐴1 , … , 𝐴𝑗 , … , 𝐴𝑛 ∈ 𝐹 𝑚 , then

𝐴 = (𝐴1 , … , 𝐴𝑗 , … , 𝐴𝑛 )

is to be read as “the matrix 𝐴 = (𝑎𝑖𝑗 ) ∈ 𝑀𝑚×𝑛 (𝐹) whose columns are 𝐴1 , … , 𝐴𝑗 , … , 𝐴𝑛 " .

Theorem 2.1. Let 𝐴 = (𝑎𝑖𝑗 ) ∈ 𝑀𝑚×𝑛 (𝐹). Let 𝑅 be the collection of columns of 𝐴 (a subset
of 𝐹 𝑚 ) and let 𝑄 be the collection of rows of 𝐴 (a subset of 𝐹 𝑛 ). Then, 𝑟𝑎𝑛𝑘𝑄 = 𝑟𝑎𝑛𝑘𝑅.

Proof. Let 𝑟 = 𝑟𝑎𝑛𝑘𝑅 and 𝑞 = 𝑟𝑎𝑛𝑘𝑄. Let {𝐴𝑗1 , … , 𝐴𝑗𝑙 , … , 𝐴𝑗𝑟 } and {𝒂𝑖1 , … , 𝒂𝑖𝑘 , … , 𝒂𝑖𝑞 } be
maximal linearly independent subcollections of 𝑅 and 𝑄, respectively. We want to prove
that 𝑞 = 𝑟. We suppose 𝑞 < 𝑟 and arrive at a contradiction. (The supposition 𝑟 < 𝑞 leads
to a contradiction as well, by interchanging the roles of rows and columns.)

Consider the 𝑞 × 𝑟 submatrix of 𝐴 given by


𝐴′ = [𝒂′1 ; ⋯ ; 𝒂′𝑘 ; ⋯ ; 𝒂′𝑞 ] = (𝐴1′ , … , 𝐴′𝑙 , … , 𝐴′𝑟 ),

whose 𝑘𝑙 −th entry, 𝑎′𝑘𝑙 , is the 𝑖𝑘 𝑗𝑙 −th entry in 𝐴, that is, 𝑎′𝑘𝑙 = 𝑎𝑖𝑘 𝑗𝑙 , for 𝑘 = 1, … , 𝑞 and
𝑙 = 1, … , 𝑟.

The vectors 𝐴1′ , … , 𝐴′𝑙 , … , 𝐴′𝑟 are 𝑟 vectors in 𝐹 𝑞 , hence linearly dependent, since 𝑞 < 𝑟.
Thus, there are scalars 𝑑1 , … , 𝑑𝑙 , … , 𝑑𝑟 ∈ 𝐹, not all zero, such that
𝑨′ 𝒅 = ∑𝑟𝑙=1 𝑑𝑙 𝐴′𝑙 = 𝟎⃑ ,

where 𝒅 = (𝑑1 , … , 𝑑𝑙 , … , 𝑑𝑟 ) ∈ 𝐹 𝑟 is regarded as a column vector.

Now, since {𝒂𝑖1 , … , 𝒂𝑖𝑘 , … , 𝒂𝑖𝑞 } is a maximal linearly independent subcollection of 𝑄 then,
by NKB, 𝑄 ⊆ 𝑠𝑝𝑎𝑛 {𝒂𝑖1 , … , 𝒂𝑖𝑘 , … , 𝒂𝑖𝑞 }. This, in turn, implies that the rows of the 𝑚 × 𝑟
submatrix of 𝐴, formed by the (ordered) columns 𝐴𝑗1 , … , 𝐴𝑗𝑙 , … , 𝐴𝑗𝑟 , are contained in
𝑠𝑝𝑎𝑛{𝒂1′ , … , 𝒂′𝑘 , … , 𝒂′𝑞 }. This implies that 𝒂′ 𝒅 = ⃑𝟎 for each row 𝒂′ of the latter submatrix.
Thus, we obtain
∑𝑟𝑙=1 𝑑𝑙 𝐴𝑗𝑙 = 𝟎⃑ ∈ 𝐹 𝑚 ,

i.e. {𝐴𝑗1 , … , 𝐴𝑗𝑙 , … , 𝐴𝑗𝑟 } is linearly dependent (since 𝑑1 , … , 𝑑𝑙 , … , 𝑑𝑟 are not all zero), which
contradicts the hypothesis. ∎

Definition. Let 𝐹 be a field. The rank of a matrix 𝐴 ∈ 𝑀𝑚×𝑛 (𝐹), denoted by 𝑟𝑎𝑛𝑘𝐴, is the
common number given both by the rank of its columns and the rank of its rows.

For any matrix 𝐴 = (𝐴1 , … , 𝐴𝑗 , … , 𝐴𝑛 ) ∈ 𝑀𝑚×𝑛 (𝐹) , recall the linear transformation
𝐿𝐴 : 𝐹 𝒏 → 𝐹 𝒎 given by 𝐿𝐴 (𝒗) = 𝐴𝒗, for all 𝒗 ∈ 𝐹 𝒏 . As we saw in Example 18 in Chapter 1,
𝑟𝑎𝑛𝑔𝑒𝐿𝐴 = 𝑠𝑝𝑎𝑛{𝐴1 , 𝐴2 , … , 𝐴𝑛 }. On the other hand, the dimension of this span is equal to
the maximal number of linearly independent columns of the matrix 𝐴. Thus, by Theorem
2.1, we have 𝑟𝑎𝑛𝑘𝐿𝐴 = 𝑟𝑎𝑛𝑘𝐴.

Let now 𝐴 ∈ 𝑀𝑚×𝑛 (𝐹) and 𝐵 ∈ 𝑀𝑞×𝑚 (𝐹) . If 𝐿𝐴 : 𝐹 𝒏 → 𝐹 𝒎 and 𝐿𝐵 : 𝐹 𝒎 → 𝐹 𝒒 are the
associated linear transformations, then it is immediate that 𝐿𝐵𝐴 = 𝐿𝐵 𝐿𝐴 . Thus, by Exercise
42 in Chapter 1, we arrive at the inequality

𝑟𝑎𝑛𝑘𝐵𝐴 ≤ 𝑚𝑖𝑛 {𝑟𝑎𝑛𝑘𝐴, 𝑟𝑎𝑛𝑘𝐵}.
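This inequality may be strict: for instance, over any field, if 𝐵 = [1 0 ; 0 0] and
𝐴 = [0 0 ; 0 1], then 𝐵𝐴 = 𝑂, so 𝑟𝑎𝑛𝑘𝐵𝐴 = 0 while 𝑟𝑎𝑛𝑘𝐴 = 𝑟𝑎𝑛𝑘𝐵 = 1.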

Theorem 2.2. Let 𝐴 ∈ 𝑀𝑚×𝑛 (𝐹) and 𝐵 ∈ 𝑀𝑚×𝑚 (𝐹), where 𝐵 is invertible. Then


𝑟𝑎𝑛𝑘𝐵𝐴 = 𝑟𝑎𝑛𝑘𝐴.

Proof. By the inequality above 𝑟𝑎𝑛𝑘𝐵𝐴 ≤ 𝑟𝑎𝑛𝑘𝐴. Now, since 𝐵 is invertible, we have

𝐴 = 𝐼𝑚 𝐴 = (𝐵−1 𝐵)𝐴 = 𝐵−1 (𝐵𝐴).

Thus, by the same inequality,

𝑟𝑎𝑛𝑘𝐴 = 𝑟𝑎𝑛𝑘𝐵 −1 (𝐵𝐴) ≤ 𝑟𝑎𝑛𝑘𝐵𝐴,

which implies 𝑟𝑎𝑛𝑘𝐵𝐴 = 𝑟𝑎𝑛𝑘𝐴. ∎

Analogously, if 𝐴 ∈ 𝑀𝑚×𝑛 (𝐹) and 𝐶 ∈ 𝑀𝑛×𝑛 (𝐹) with 𝐶 invertible, then

𝑟𝑎𝑛𝑘𝐴𝐶 = 𝑟𝑎𝑛𝑘𝐴.

Theorem 2.3. Let 𝐴 ∈ 𝑀𝑛×𝑛 (𝐹). Then 𝐴 is invertible if and only if 𝑟𝑎𝑛𝑘𝐴 = 𝑛.

Proof. By Exercise 5, 𝐴 is invertible if and only if 𝐿𝐴 : 𝐹 𝒏 → 𝐹 𝒏 is invertible. On the other


hand, by the comments prior to Theorem 2.2, 𝑟𝑎𝑛𝑘𝐿𝐴 = 𝑟𝑎𝑛𝑘𝐴. Thus,

𝐴 is invertible ⇔ 𝐿𝐴 is invertible ⇔ 𝑛 = 𝑟𝑎𝑛𝑘𝐿𝐴 = 𝑟𝑎𝑛𝑘𝐴.

Definition. Let 𝐹 be a field and let 𝐴 ∈ 𝑀𝑚×𝑛 (𝐹). The nullity of 𝐴, denoted by 𝑛𝑢𝑙𝐴, is the
dimension of the solution set to the homogeneous system 𝐴𝒗 = ⃑𝟎 as a subspace of 𝐹 𝒏 .

By Theorem 1.11,

𝑛𝑢𝑙𝐿𝐴 + 𝑟𝑎𝑛𝑘𝐿𝐴 = 𝑛,

which, in matrix language, translates into

𝑛𝑢𝑙𝐴 + 𝑟𝑎𝑛𝑘𝐴 = 𝑛.

Thus, the dimension of the solution set to the linear system 𝐴𝒗 = 𝟎⃑ is equal to the number
of variables minus 𝑟𝑎𝑛𝑘𝐴.
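For instance, if 𝐴 = [1 2 3 ; 2 4 6] ∈ 𝑀2×3 (ℝ), then 𝑟𝑎𝑛𝑘𝐴 = 1 (the second row is twice the
first), so 𝑛𝑢𝑙𝐴 = 3 − 1 = 2: the solution set to 𝐴𝒗 = 𝟎⃑ is the plane 𝑥1 + 2𝑥2 + 3𝑥3 = 0 in ℝ3 .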

2.2 DETERMINANTS

The concepts and techniques in this section serve as a preliminary to multilinear algebra.
We begin with a short review of permutation groups.

Definition. Let 𝑆 be a set. Any bijection from 𝑆 onto 𝑆 is called a permutation.


Clearly, the set of all permutations of a set 𝑆 is a group under composition: Composition is
associative, by definition each permutation has an inverse that is also a permutation, and
the identity function is the identity element.

We now focus on the finite set ⟦1, 𝑛⟧ = {1, 2, … , 𝑛} for each natural number 𝑛. We denote
the group of permutations of ⟦1, 𝑛⟧ by 𝑆𝑛 . Hence,

𝑆𝑛 = {𝜎: ⟦1, 𝑛⟧ → ⟦1, 𝑛⟧ | 𝜎 𝑖𝑠 𝑎 𝑝𝑒𝑟𝑚𝑢𝑡𝑎𝑡𝑖𝑜𝑛}.

The group 𝑆𝑛 is called the symmetric group on 𝑛 letters. The order of this group is 𝑛!.

In general, permutation groups are not commutative. For example, the permutations 𝜎
and 𝜌 in 𝑆3 given by 𝜎(1) = 2, 𝜎(2) = 3, and 𝜎(3) = 1 and 𝜌(1) = 3, 𝜌(2) = 2, and
𝜌(3) = 1 do not commute, i.e. 𝜎𝜌 ≠ 𝜌𝜎.
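Indeed, (𝜎𝜌)(1) = 𝜎(𝜌(1)) = 𝜎(3) = 1, while (𝜌𝜎)(1) = 𝜌(𝜎(1)) = 𝜌(2) = 2, so 𝜎𝜌 ≠ 𝜌𝜎.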

Definition. Let 𝑟 ∈ ⟦1, 𝑛⟧. An 𝒓 −cycle is a permutation 𝜎 in 𝑆𝑛 that fixes 𝑛 − 𝑟 integers in
⟦1, 𝑛⟧ and, if 𝑗1 , 𝑗2 , … , 𝑗𝑟 are the remaining 𝑟 integers in ⟦1, 𝑛⟧, then (relabeling if required)

𝜎(𝑗1 ) = 𝑗2 , 𝜎(𝑗2 ) = 𝑗3 , … , 𝜎(𝑗𝑟−1 ) = 𝑗𝑟 , 𝜎(𝑗𝑟 ) = 𝑗1 .

Such an 𝑟 −cycle is denoted by (𝑗1 , 𝑗2 , … , 𝑗𝑟 ). Any 2 −cycle is called a transposition.

Note that any 1 −cycle in 𝑆𝑛 is the identity permutation 𝐼𝑑⟦1,𝑛⟧ .

Example 1 In 𝑆4 the 3 −cycle (1, 2, 4) is the permutation 𝜎 such that 𝜎(1) = 2, 𝜎(2) = 4,
𝜎(3) = 3, and 𝜎(4) = 1; the 4 −cycle (1, 2, 3, 4) is the permutation 𝜌 such that 𝜌(1) = 2,
𝜌(2) = 3, 𝜌(3) = 4, and 𝜌(4) = 1; and the transposition (1, 2) is the permutation 𝜏 such
that 𝜏(1) = 2, 𝜏(2) = 1, 𝜏(3) = 3, and 𝜏(4) = 4.

Disjoint cycles (i.e. cycles whose non fixed point sets are disjoint) always commute. On the
other hand, not all permutations are cycles. For example, the permutation 𝜎 in 𝑆5 given by
𝜎(1) = 2, 𝜎(2) = 1, 𝜎(3) = 4, 𝜎(4) = 5, and 𝜎(5) = 3 is not a cycle. Nevertheless, the
following is true.

Theorem 2.4. Every permutation in 𝑆𝑛 can be expressed as a product (composition) of


disjoint cycles. Such a factorization is unique except for the order of the factors.

Theorem 2.5. For 𝑛 ≥ 2, every cycle −hence every permutation− in 𝑆𝑛 can be expressed
as a product of transpositions. Such a factorization is not unique.

Example 2 In 𝑆9 , let 𝜎 be the permutation sending 1, 2, 3, 4, 5, 6, 7, 8, 9 to 5, 9, 4, 8, 1, 6, 2, 3, 7,
respectively. Then 𝜎 = (1, 5)(2, 9, 7)(3, 4, 8); on the other hand,
(2, 9, 7) = (2, 7)(2, 9) and (3, 4, 8) = (3, 8)(3, 4). Thus 𝜎 = (1, 5)(2, 7)(2, 9)(3, 8)(3, 4).

Theorem 2.6. For 𝑛 ≥ 2, each permutation in 𝑆𝑛 can be factorized in an infinite number of


ways as a product of transpositions. The number of factors in any such decomposition is
always even or always odd. Accordingly, a permutation is either even or odd.

Definition. Let 𝑛 ≥ 2. The signum of a permutation 𝜎 in 𝑆𝑛 is (−1)𝑟 , where 𝑟 is the


number of factors in a factorization of 𝜎 as a product of transpositions.


Clearly, the signum of a permutation is well−defined, i.e. independent of the particular
factorization as a product of transpositions, by Theorem 2.6.

The signum of a permutation 𝜎 is denoted by 𝑠𝑖𝑔𝑛(𝜎). Thus, 𝑠𝑖𝑔𝑛(𝜎) = 1 if 𝜎 is even and


𝑠𝑖𝑔𝑛(𝜎) = −1 if 𝜎 is odd.

Example 3 If 𝐼𝑑⟦1,𝑛⟧ is the identity of 𝑆𝑛 (𝑛 ≥ 2 ), then 𝑠𝑖𝑔𝑛 (𝐼𝑑⟦1,𝑛⟧ ) = 1. For the


permutation 𝜎 in Example 2 we have 𝑠𝑖𝑔𝑛(𝜎) = (−1)5 = −1.

We now use the preceding ideas to introduce and develop the notion of determinant of a
square matrix with entries in a given field.

Definition. Let 𝐹 be any field. If 𝐴 = (𝑎𝑖𝑗 ) ∈ 𝑀𝑛×𝑛 (𝐹) is a square matrix with entries in 𝐹,
the determinant of 𝐴 is the scalar in 𝐹 given by

𝑑𝑒𝑡 𝐴 = ∑𝜎∈𝑆𝑛 𝑠𝑖𝑔𝑛(𝜎) 𝑎1𝜎(1) ⋯ 𝑎𝑗𝜎(𝑗) ⋯ 𝑎𝑛𝜎(𝑛) .

Thus, the determinant of a given 𝑛 × 𝑛 matrix is a weighted sum of all products of 𝑛


entries in 𝐴 such that no two entries belong to the same row or the same column. The
weights are the signums of the corresponding permutations.
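For instance, for 𝑛 = 2 the sum has 2! = 2 terms and reduces to the familiar formula
𝑑𝑒𝑡 𝐴 = 𝑎11 𝑎22 − 𝑎12 𝑎21 , since 𝑆2 = {𝐼𝑑, (1, 2)} and 𝑠𝑖𝑔𝑛((1, 2)) = −1. For 𝑛 = 3 the sum
has 3! = 6 terms:

𝑑𝑒𝑡 𝐴 = 𝑎11 𝑎22 𝑎33 + 𝑎12 𝑎23 𝑎31 + 𝑎13 𝑎21 𝑎32 − 𝑎13 𝑎22 𝑎31 − 𝑎11 𝑎23 𝑎32 − 𝑎12 𝑎21 𝑎33 .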

It is not difficult to show that 𝑑𝑒𝑡 𝐴 is also given by

∑𝜎∈𝑆𝑛 𝑠𝑖𝑔𝑛(𝜎) 𝑎𝜎(1)1 ⋯ 𝑎𝜎(𝑗)𝑗 ⋯ 𝑎𝜎(𝑛)𝑛 .

The determinant function 𝑑𝑒𝑡: 𝑀𝑛×𝑛 (𝐹) → 𝐹 may also be regarded as a function of 𝑛
vector variables, namely the columns of each matrix. More precisely,

𝑑𝑒𝑡: 𝐹 𝑛 × ⋯ × 𝐹 𝑛 → 𝐹 (𝑛 times)

given by

𝑑𝑒𝑡 (𝐴1 , 𝐴2 , … , 𝐴𝑛 ) = 𝑑𝑒𝑡 𝐴,

where 𝐴 = (𝐴1 , 𝐴2 , … , 𝐴𝑛 ), is a well−defined function. By the same token, the
determinant may also be considered as a function on the 𝑛 −fold Cartesian product
𝐹 𝑛 × ⋯ × 𝐹 𝑛 where the arguments are the rows (instead of the columns) of a matrix.

Henceforth, we will regard 𝑑𝑒𝑡 as a function of both the rows and columns of a matrix. The
next theorem reveals the multilinear nature of 𝑑𝑒𝑡: 𝐹 𝑛 × ⋯ × 𝐹 𝑛 → 𝐹 (𝑛 times).

Theorem 2.7. Let 𝐹 be a field.

1. For all 𝐴1 , … , 𝐴𝑗 , 𝐴𝑗′ , … , 𝐴𝑛 ∈ 𝐹 𝑛 ,

𝑑𝑒𝑡 (𝐴1 , … , 𝐴𝑗 + 𝐴𝑗′ , … , 𝐴𝑛 )


= 𝑑𝑒𝑡 (𝐴1 , … , 𝐴𝑗 , … , 𝐴𝑛 ) + 𝑑𝑒𝑡 (𝐴1 , … , 𝐴𝑗′ , … , 𝐴𝑛 ).

2. For all 𝐴1 , … , 𝐴𝑗 , … , 𝐴𝑛 ∈ 𝐹 𝑛 and 𝑘 ∈ 𝐹,

𝑑𝑒𝑡 (𝐴1 , … , 𝑘𝐴𝑗 , … , 𝐴𝑛 ) = 𝑘 𝑑𝑒𝑡 (𝐴1 , … , 𝐴𝑗 , … , 𝐴𝑛 ).

3. For all 𝐴1 , … , 𝐴𝑗 , … , 𝐴𝑛 ∈ 𝐹 𝑛 and 𝜎 ∈ 𝑆𝑛 ,

𝑑𝑒𝑡 (𝐴𝜎(1) , … , 𝐴𝜎(𝑗) , … , 𝐴𝜎(𝑛) ) = 𝑠𝑖𝑔𝑛(𝜎) 𝑑𝑒𝑡 (𝐴1 , … , 𝐴𝑗 , … , 𝐴𝑛 ).

Proof.

1. Write 𝐴𝑗′ = (𝑎′1𝑗 , … , 𝑎′𝑛𝑗 ) ∈ 𝐹 𝑛 . Then

𝑑𝑒𝑡 (𝐴1 , … , 𝐴𝑗 + 𝐴𝑗′ , … , 𝐴𝑛 )

= ∑𝜎∈𝑆𝑛 𝑠𝑖𝑔𝑛(𝜎) 𝑎𝜎(1)1 ⋯ (𝑎𝜎(𝑗)𝑗 + 𝑎′𝜎(𝑗)𝑗 ) ⋯ 𝑎𝜎(𝑛)𝑛

= ∑𝜎∈𝑆𝑛 𝑠𝑖𝑔𝑛(𝜎) 𝑎𝜎(1)1 ⋯ 𝑎𝜎(𝑗)𝑗 ⋯ 𝑎𝜎(𝑛)𝑛 + ∑𝜎∈𝑆𝑛 𝑠𝑖𝑔𝑛(𝜎) 𝑎𝜎(1)1 ⋯ 𝑎′𝜎(𝑗)𝑗 ⋯ 𝑎𝜎(𝑛)𝑛

= 𝑑𝑒𝑡 (𝐴1 , … , 𝐴𝑗 , … , 𝐴𝑛 ) + 𝑑𝑒𝑡 (𝐴1 , … , 𝐴𝑗′ , … , 𝐴𝑛 ).

2. 𝑑𝑒𝑡 (𝐴1 , … , 𝑘𝐴𝑗 , … , 𝐴𝑛 )

= ∑𝜎∈𝑆𝑛 𝑠𝑖𝑔𝑛(𝜎) 𝑎𝜎(1)1 ⋯ 𝑘𝑎𝜎(𝑗)𝑗 ⋯ 𝑎𝜎(𝑛)𝑛

= 𝑘 ∑𝜎∈𝑆𝑛 𝑠𝑖𝑔𝑛(𝜎) 𝑎𝜎(1)1 ⋯ 𝑎𝜎(𝑗)𝑗 ⋯ 𝑎𝜎(𝑛)𝑛

= 𝑘 𝑑𝑒𝑡 (𝐴1 , … , 𝐴𝑗 , … , 𝐴𝑛 ).

3. Consider first the case of a transposition 𝜏 = (𝑖, 𝑗) ∈ 𝑆𝑛 . Without loss of generality,
assume 𝑖 < 𝑗. Then

𝑑𝑒𝑡 (𝐴𝜏(1) , … , 𝐴𝜏(𝑖) , … , 𝐴𝜏(𝑗) , … , 𝐴𝜏(𝑛) )

= ∑𝜎∈𝑆𝑛 𝑠𝑖𝑔𝑛(𝜎) 𝑎1𝜎𝜏(1) ⋯ 𝑎𝑖𝜎𝜏(𝑖) ⋯ 𝑎𝑗𝜎𝜏(𝑗) ⋯ 𝑎𝑛𝜎𝜏(𝑛) .

Making a change of variable, namely 𝜌 = 𝜎𝜏, the latter is equal to


∑𝜌∈𝑆𝑛 𝑠𝑖𝑔𝑛(𝜌𝜏) 𝑎1𝜌(1) ⋯ 𝑎𝑖𝜌(𝑖) ⋯ 𝑎𝑗𝜌(𝑗) ⋯ 𝑎𝑛𝜌(𝑛)

= ∑𝜌∈𝑆𝑛 𝑠𝑖𝑔𝑛(𝜌) 𝑠𝑖𝑔𝑛(𝜏) 𝑎1𝜌(1) ⋯ 𝑎𝑛𝜌(𝑛)

= − ∑𝜌∈𝑆𝑛 𝑠𝑖𝑔𝑛(𝜌) 𝑎1𝜌(1) ⋯ 𝑎𝑛𝜌(𝑛)

= − 𝑑𝑒𝑡 (𝐴1 , … , 𝐴𝑖 , … , 𝐴𝑗 , … , 𝐴𝑛 ).

Consider now a general permutation 𝜎 and a factorization 𝜎 = 𝜏𝑟 … 𝜏1 as a product of


transpositions. Then

𝑑𝑒𝑡 (𝐴𝜎(1) , … , 𝐴𝜎(𝑛) )

= 𝑑𝑒𝑡 (𝐴𝜏𝑟 … 𝜏1 (1) , … , 𝐴𝜏𝑟 … 𝜏1 (𝑛) )

= (−1)𝑟 𝑑𝑒𝑡 (𝐴1 , … , 𝐴𝑛 )

= 𝑠𝑖𝑔𝑛(𝜎) 𝑑𝑒𝑡 (𝐴1 , … , 𝐴𝑛 ). ∎

Properties 1 and 2 in Theorem 2.7 tell us that 𝑑𝑒𝑡: 𝐹 𝑛 × ⋯ × 𝐹 𝑛 → 𝐹 is multilinear,


meaning that it is linear in each component when the remaining ones are fixed. Property 3
in Theorem 2.7 tells us that 𝑑𝑒𝑡 is alternating or skew−symmetric. In later chapters, such
functions will be referred to as skew tensors or skew forms.

The following property is essential for applications.

Theorem 2.8. Let 𝐹 be a field and let 𝐴, 𝐵 ∈ 𝑀𝑛×𝑛 (𝐹). Then 𝑑𝑒𝑡 𝐵𝐴 = 𝑑𝑒𝑡 𝐵 𝑑𝑒𝑡 𝐴.

Proof. Property 3 in Theorem 2.7 (skew−symmetry) is equivalent to the following


property: If two or more arguments are repeated, then 𝑑𝑒𝑡 = 0. That is,

𝑑𝑒𝑡 (𝐴1 , … , 𝐴𝑗 , … , 𝐴𝑗 , … , 𝐴𝑛 ) = 𝑑𝑒𝑡 (𝒂1 , … , 𝒂𝑖 , … , 𝒂𝑖 , … , 𝒂𝑛 ) = 0

(repeated columns on the left, repeated rows on the right).

Hence,

𝑑𝑒𝑡 𝐵𝐴 = 𝑑𝑒𝑡 (𝒃1 𝐴, … , 𝒃𝑗 𝐴, … , 𝒃𝑛 𝐴)

= 𝑑𝑒𝑡 (∑𝑛𝑗1 =1 𝑏1𝑗1 𝒂𝑗1 , … , ∑𝑛𝑗𝑘 =1 𝑏𝑘𝑗𝑘 𝒂𝑗𝑘 , … , ∑𝑛𝑗𝑛 =1 𝑏𝑛𝑗𝑛 𝒂𝑗𝑛 )

= ∑𝑛𝑗1 ,…,𝑗𝑛 =1 𝑏1𝑗1 ⋯ 𝑏𝑘𝑗𝑘 ⋯ 𝑏𝑛𝑗𝑛 𝑑𝑒𝑡 (𝒂𝑗1 , … , 𝒂𝑗𝑘 , … , 𝒂𝑗𝑛 )


= ∑𝜎∈𝑆𝑛 𝑏1𝜎(1) ⋯ 𝑏𝑛𝜎(𝑛) 𝑑𝑒𝑡 (𝒂𝜎(1) , … , 𝒂𝜎(𝑛) )

= ∑𝜎∈𝑆𝑛 𝑠𝑖𝑔𝑛(𝜎) 𝑏1𝜎(1) ⋯ 𝑏𝑛𝜎(𝑛) 𝑑𝑒𝑡 (𝒂1 , … , 𝒂𝑛 ),

by the property at the beginning of this proof. (This means, by the way, that at most 𝑛! of
the original 𝑛𝑛 terms are non−zero.) Finally, since the term 𝑑𝑒𝑡 (𝒂1 , … , 𝒂𝑛 ) is constant
with respect to the summation index 𝜎, the last expression is equal to

(∑𝜎∈𝑆𝑛 𝑠𝑖𝑔𝑛(𝜎) 𝑏1𝜎(1) ⋯ 𝑏𝑛𝜎(𝑛) ) 𝑑𝑒𝑡 (𝒂1 , … , 𝒂𝑛 )

= 𝑑𝑒𝑡 𝐵 𝑑𝑒𝑡 𝐴. ∎

Corollary 1 For a given field 𝐹, let 𝐴1 , … , 𝐴𝑛 ∈ 𝐹 𝑛 and 𝐴 = (𝐴1 , … , 𝐴𝑛 ). Then the following
statements are equivalent.

1. 𝑑𝑒𝑡 (𝐴1 , … , 𝐴𝑛 ) ≠ 0.
2. 𝐴1 , … , 𝐴𝑛 are linearly independent.
3. 𝑟𝑎𝑛𝑘 𝐴 = 𝑛.
4. 𝐴 is invertible.

Corollary 2 Let 𝐴, 𝐵 ∈ 𝑀𝑛×𝑛 (𝐹). If 𝐴 and 𝐵 are similar, then 𝑑𝑒𝑡 𝐴 = 𝑑𝑒𝑡 𝐵.
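Indeed, if 𝐵 = 𝐶 −1 𝐴𝐶 for some invertible matrix 𝐶, then, by Theorem 2.8,

𝑑𝑒𝑡 𝐵 = 𝑑𝑒𝑡 𝐶 −1 𝑑𝑒𝑡 𝐴 𝑑𝑒𝑡 𝐶 = 𝑑𝑒𝑡 (𝐶 −1 𝐶) 𝑑𝑒𝑡 𝐴 = 𝑑𝑒𝑡 𝐼𝑛 𝑑𝑒𝑡 𝐴 = 𝑑𝑒𝑡 𝐴.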

Definition. Given a matrix 𝐴 ∈ 𝑀𝑛×𝑛 (𝐹), for each position 𝑖𝑗 in the matrix, 𝑖, 𝑗 = 1, … , 𝑛,
the minor 𝑖𝑗 of 𝐴, denoted by 𝑀𝑖𝑗 , is the determinant of the (𝑛 − 1) × (𝑛 − 1) submatrix
obtained by deleting the 𝑖 −th row and the 𝑗 −th column of 𝐴. For each pair 𝑖𝑗, the scalar

𝐴𝑖𝑗 = (−1)𝑖+𝑗 𝑀𝑖𝑗

is called the cofactor 𝑖𝑗 of 𝐴.

Theorem 2.9. Let 𝐴 ∈ 𝑀𝑛×𝑛 (𝐹).

1. 𝑑𝑒𝑡 𝐴 = ∑𝑛𝑗=1 𝑎𝑖𝑗 𝐴𝑖𝑗 , for all 𝑖 = 1, … , 𝑛.

2. If 𝐴 is invertible, then 𝐴−1 = (1⁄𝑑𝑒𝑡 𝐴) (𝐴𝑖𝑗 )𝑡 .

Proof.

1. The proof is left to the reader.


2. If 𝐴 is invertible, then det 𝐴 ≠ 0 by Corollary 1 above. On the other hand, for each
𝑘 = 1, … , 𝑛, ∑𝑛𝑗=1 𝑎𝑘𝑗 𝐴𝑖𝑗 equals det 𝐴, if 𝑘 = 𝑖, and equals zero, if 𝑘 ≠ 𝑖, by 1 plus an
elementary argument. ∎
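For instance, for an invertible 2 × 2 matrix 𝐴 = [𝑎 𝑏 ; 𝑐 𝑑], the cofactors are 𝐴11 = 𝑑,
𝐴12 = −𝑐, 𝐴21 = −𝑏, 𝐴22 = 𝑎, and Theorem 2.9 yields the familiar formula

𝐴−1 = (1⁄(𝑎𝑑 − 𝑏𝑐)) [𝑑 −𝑏 ; −𝑐 𝑎].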


Theorem 2.10

1. If 𝐴 = (𝑎𝑖𝑗 ) ∈ 𝑀𝑛×𝑛 (𝐹) is triangular (in particular, diagonal), then 𝑑𝑒𝑡 𝐴 = 𝑎11 ⋯ 𝑎𝑛𝑛 .
2. Let 𝐵 ∈ 𝑀𝑝×𝑝 (𝐹) and 𝐶 ∈ 𝑀𝑞×𝑞 (𝐹). Let 𝐴 be the block−triangular matrix [𝐵 , 𝐷 ; 𝑂 , 𝐶],
where 𝐷 is any 𝑝 × 𝑞 matrix. Then 𝑑𝑒𝑡 𝐴 = 𝑑𝑒𝑡 𝐵 𝑑𝑒𝑡 𝐶. In particular, 𝑑𝑒𝑡 (𝐵 ⊕ 𝐶)
= 𝑑𝑒𝑡 [𝐵 , 𝑂 ; 𝑂 , 𝐶] = 𝑑𝑒𝑡 𝐵 𝑑𝑒𝑡 𝐶 (block−diagonal case).

Proof.

1. This is a trivial fact from the definition of 𝑑𝑒𝑡.


2. By definition,

𝑑𝑒𝑡 𝐴 = ∑𝜎∈𝑆𝑝+𝑞 𝑠𝑖𝑔𝑛(𝜎) 𝑎1𝜎(1) ⋯ 𝑎𝑝𝜎(𝑝) 𝑎(𝑝+1)𝜎(𝑝+1) ⋯ 𝑎(𝑝+𝑞)𝜎(𝑝+𝑞) .

Since 𝐴 is block−triangular, only those permutations 𝜎 ∈ 𝑆𝑝+𝑞 such that 𝜎(⟦𝑝 + 1, 𝑛⟧)
= ⟦𝑝 + 1, 𝑛⟧ make (possibly) non zero contributions to this sum. Such permutations
also satisfy 𝜎(⟦1, 𝑝⟧) = ⟦1, 𝑝⟧. Now, by Exercise 3, these permutations 𝜎 may be
decomposed as products 𝜎 = 𝛿𝜇, where 𝛿 is a permutation in 𝑆𝑝+𝑞 whose fixed point
set is ⟦𝑝 + 1, 𝑛⟧, and 𝜇 is a permutation in 𝑆𝑝+𝑞 whose fixed point set is ⟦1, 𝑝⟧. Thus,

𝑑𝑒𝑡 𝐴 = ∑𝛿∈𝑆𝑝′ , 𝜇∈𝑆𝑞′ 𝑠𝑖𝑔𝑛(𝛿𝜇) 𝑎1𝛿(1) ⋯ 𝑎𝑝𝛿(𝑝) 𝑎(𝑝+1)𝜇(𝑝+1) ⋯ 𝑎(𝑝+𝑞)𝜇(𝑝+𝑞) ,

where 𝑆𝑝′ is the subgroup of 𝑆𝑝+𝑞 of permutations fixing ⟦𝑝 + 1, 𝑛⟧ pointwise and 𝑆𝑞′ is
the subgroup of 𝑆𝑝+𝑞 of permutations fixing ⟦1, 𝑝⟧ pointwise.

But, since 𝑠𝑖𝑔𝑛(𝛿𝜇) = 𝑠𝑖𝑔𝑛(𝛿) 𝑠𝑖𝑔𝑛(𝜇) and the entries involved are precisely those of 𝐵
and of 𝐶, the latter is equal to

(∑𝛿∈𝑆𝑝 𝑠𝑖𝑔𝑛(𝛿) 𝑏1𝛿(1) ⋯ 𝑏𝑝𝛿(𝑝) ) (∑𝜇∈𝑆𝑞 𝑠𝑖𝑔𝑛(𝜇) 𝑐1𝜇(1) ⋯ 𝑐𝑞𝜇(𝑞) ).

Hence,

𝑑𝑒𝑡 𝐴 = 𝑑𝑒𝑡 𝐵 𝑑𝑒𝑡 𝐶. ∎


2.3 DIAGONAL FORMS

In this section we study operators on a finitely generated vector space 𝑽 that appear as
direct sums of homotheties on invariant subspaces.

Example 4 Let 𝑽 = ℝ3 , 𝑾1 = 𝑠𝑝𝑎𝑛 {(1, 0, 0), (0, 1, 0)}, and 𝑾2 = 𝑠𝑝𝑎𝑛 {(1, 0, 1)}. Clearly
ℝ3 = 𝑾1 ⊕ 𝑾2 . Consider the homotheties 𝑆1 : 𝑾1 → 𝑾1 and 𝑆2 : 𝑾2 → 𝑾2 given by

𝑆 1 (𝒘1 ) = 𝑆 1 (𝑥1 , 𝑥2 , 0) = (−𝑥1 , −𝑥2 , 0), for each 𝒘1 = (𝑥1 , 𝑥2 , 0) ∈ 𝑾1 , and

𝑆 2 (𝒘2 ) = 𝑆 2 (𝑥3 , 0, 𝑥3 ) = (2𝑥3 , 0, 2𝑥3 ), for each 𝒘2 = (𝑥3 , 0, 𝑥3 ) ∈ 𝑾2 .

As in Example 26 in Chapter 1, we readily find that

𝒗 = (𝑥1 , 𝑥2 , 𝑥3 ) = (𝑥1 − 𝑥3 , 𝑥2 , 0) + (𝑥3 , 0, 𝑥3 ) = 𝒘1 + 𝒘2

is the unique decomposition of each 𝒗 = (𝑥1 , 𝑥2 , 𝑥3 ) ∈ ℝ3 relative to the decomposition
ℝ3 = 𝑾1 ⊕ 𝑾2 . Thus,

(𝑆1 ⊕ 𝑆2 )(𝑥1 , 𝑥2 , 𝑥3 ) = (−𝑥1 + 3𝑥3 , −𝑥2 , 2𝑥3 ),

for all (𝑥1 , 𝑥2 , 𝑥3 ) ∈ ℝ3 .

Example 5 In Example 4, consider the (ordered) bases 𝛼 = {(1, 0, 0), (0, 1, 0)} for 𝑾1 and
𝛽 = {(1, 0, 1)} for 𝑾2 . Then

[𝑆1 ]𝛼 = [−1 0 ; 0 −1] = (−1)𝐼2 ,

[𝑆2 ]𝛽 = [2] = 2𝐼1 , and

[𝑆1 ⊕ 𝑆2 ]𝛼∪̇𝛽 = [−1 0 0 ; 0 −1 0 ; 0 0 2] = [−1 0 ; 0 −1] ⊕ [2] = (−1)𝐼2 ⊕ 2𝐼1 .

Therefore, [𝑆1 ⊕ 𝑆2 ]𝛼∪̇𝛽 is a direct sum of scalar matrices, hence, a diagonal matrix.

The preceding examples motivate one of the main definitions in this section.

Definition. Let 𝑆: 𝑽 → 𝑽 be an operator on a finitely generated vector space 𝑽 over a field
𝐹. The operator 𝑆 is diagonal (or diagonalizable) if there is an ordered basis 𝛽 for 𝑽 such
that [𝑆]𝛽 is diagonal. If 𝑆 is diagonal, to diagonalize 𝑆 means to find such a basis.

The next theorem characterizes diagonal operators on finite dimensional vector spaces.


Theorem 2.11. Let 𝑆: 𝑽 → 𝑽 be an operator on a finitely generated vector space 𝑽 over a


field 𝐹. Then 𝑆 is diagonal if and only if there exists a collection 𝑾1 , … , 𝑾𝑘 of 𝑆 −invariant
subspaces of 𝑽 such that

a) 𝑽 = 𝑾1 ⊕ ⋯ ⊕ 𝑾𝑘 .

b) The restrictions 𝑆𝑾1 , … , 𝑆𝑾𝑘 are homotheties 𝐻𝜆1 , … , 𝐻𝜆𝑘 , where the corresponding
ratios 𝜆1 , … , 𝜆𝑘 are all distinct.

Proof.

Let  be a basis for 𝑉 such that [𝑆] is diagonal. Without lost of generality (relabeling the
vectors in  if necessary), we may assume that repeated diagonal entries are consecutive.
[𝑆] will then have the form

1 𝐼𝑑𝑑1 𝑂 𝑂
[𝑆] = [ 𝑂 ⋱ 𝑂 ],
𝑂 𝑂 𝑘 𝐼𝑑𝑑𝑘

for certain (distinct) scalars 𝜆1 , … , 𝜆𝑘 ∈ 𝐹 and positive integers 𝑑1 , … , 𝑑𝑘 .

For each 𝑖 = 1, … , 𝑘, let 𝛼𝑖 be the (ordered) sub−collection of vectors in 𝛽 that correspond
to the columns of the diagonal block 𝜆𝑖 𝐼𝑑𝑑𝑖 . Setting 𝑾𝑖 = 𝑠𝑝𝑎𝑛 𝛼𝑖 , 𝑖 = 1, … , 𝑘, clearly
𝑽 = 𝑾1 ⊕ ⋯ ⊕ 𝑾𝑘 , because 𝛽 = 𝛼1 ∪̇ ⋯ ∪̇ 𝛼𝑘 (concatenation). The fact that 𝑆𝑾𝑖 = 𝐻𝜆𝑖 ,
for each 𝑖, is immediate.

Conversely, let 𝑾1 , … , 𝑾𝑘 be 𝑆 −invariant subspaces of 𝑽 that satisfy a) and b) in the
statement of the theorem and consider corresponding bases 𝛼1 , … , 𝛼𝑘 . Clearly 𝛽 = 𝛼1 ∪̇ ⋯ ∪̇ 𝛼𝑘
is a basis for 𝑽 such that [𝑆]𝛽 is diagonal. ∎

Both the decomposition 𝑽 = 𝑾1 ⊕ ⋯ ⊕ 𝑾𝑘 and the ratios 𝜆1 , … , 𝜆𝑘 ∈ 𝐹 in Theorem 2.11
are unique except for the order of the factors. In this situation, the operator 𝑆 can be
expressed as a direct sum of homotheties 𝑆 = 𝐻𝜆1 ⊕ ⋯ ⊕ 𝐻𝜆𝑘 .

By the same token, a diagonal operator on a finite dimensional vector space is an operator that
can be expressed as a direct sum of homotheties.

In the terms of Theorem 2.11, the subspaces 𝑾1 , … , 𝑾𝑘 are called the eigenspaces of 𝑆
and the ratios 𝜆1 , … , 𝜆𝑘 ∈ 𝐹 are called the eigenvalues of 𝑆. Non zero vectors in the 𝑾𝑖 ’s
are called eigenvectors of 𝑆.

In this terminology, notice that an operator 𝑆: 𝑽 → 𝑽 on a finitely generated vector space


𝑽 over a field 𝐹 is diagonal if and only if there exists an ordered basis for 𝑽 that consists of
eigenvectors of 𝑆.


Example 6 The operator 𝑆 in Example 4 is by construction a direct sum of homotheties,


hence a diagonal operator. The eigenspaces of 𝑆 are 𝑾1 = 𝑠𝑝𝑎𝑛 {(1, 0, 0), (0, 1, 0)}, and
𝑾2 = 𝑠𝑝𝑎𝑛 {(1, 0, 1)}. The corresponding eigenvalues are 𝜆1 = −1 and 𝜆2 = 2.

Some operators may have eigenvalues and eigenspaces even if they are not diagonal. On
the other hand, it is convenient to state a definition of eigenvalues and eigenspaces that
may include operators on infinite dimensional vector spaces. This suggests the following
definition.

Definition. Let 𝑆: 𝑽 → 𝑽 be an operator on a (possibly infinite dimensional) vector space 𝑽


over a field 𝐹. A scalar 𝑡 ∈ 𝐹 is an eigenvalue of 𝑆 if there exists a non zero vector 𝒗 ∈ 𝑽
such that 𝑆(𝒗) = 𝑡𝒗.

If 𝑡 is an eigenvalue of 𝑆, the subspace of 𝑽 given by 𝐸𝑡 = {𝒗 ∈ 𝑽 | 𝑆(𝒗) = 𝑡𝒗} is the


eigenspace of 𝑆 corresponding to the eigenvalue 𝑡.

We now develop techniques to find eigenvalues and eigenspaces and to determine


whether a given operator on a finitely generated vector space is diagonal.

Let 𝑆: 𝑽 → 𝑽 be an operator on an 𝑛-dimensional vector space 𝑽 over a field 𝐹 and let


𝑡 ∈ 𝐹. We are searching for non zero vectors 𝒗 ∈ 𝑽 that may possibly satisfy the equation
𝑆(𝒗) = 𝑡𝒗 or

(𝑆 − 𝑡𝐼𝑑𝑽 )(𝒗) = ⃑𝟎.

If 𝛼 is an ordered basis for 𝑽, the latter translates into

([𝑆]𝛼 − 𝑡𝐼𝑛 )[𝒗]𝛼 = 𝟎⃑ ,

which is an 𝑛 × 𝑛 homogeneous linear system with coefficients in 𝐹.

Thus, 𝑡 ∈ 𝐹 is an eigenvalue of 𝑆 if and only if this system has non trivial solutions or

𝑑𝑒𝑡 ([𝑆]𝛼 − 𝑡𝐼𝑛 ) = 0,

by Corollary 1 to Theorem 2.8. Notice that the latter is a polynomial equation. Hence, the
possible eigenvalues of 𝑆 will be roots of the polynomial 𝑑𝑒𝑡 ([𝑆]𝛼 − 𝑡𝐼𝑛 ). Nevertheless,
since some of these roots may not belong to 𝐹, not all the roots of this polynomial will, a
fortiori, be eigenvalues of 𝑆.
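For instance, if 𝑆 = 𝐿𝐴 : ℝ2 → ℝ2 with 𝐴 = [0 −1 ; 1 0], then 𝑑𝑒𝑡 (𝐴 − 𝑡𝐼2 ) = 𝑡 2 + 1, whose
roots ±𝑖 do not belong to ℝ; hence this operator has no (real) eigenvalues at all.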

Theorem 2.12.

a) 𝑑𝑒𝑡 ([𝑆]𝛼 − 𝑡𝐼𝑛 ) is a polynomial in 𝑡 of degree 𝑛 with coefficients in 𝐹 whose leading


term is (−1)𝑛 𝑡 𝑛 .
b) This polynomial does not depend on the choice of the ordered basis 𝛼.

Proof.

a) This is immediate from the definition of a determinant in terms of permutations.


b) Let 𝛼 and 𝛽 be two ordered bases for 𝑽. Since [𝑆]𝛼 and [𝑆]𝛽 are similar, there is an
invertible matrix 𝐶 such that [𝑆]𝛽 = 𝐶 −1 [𝑆] 𝛼 𝐶. Thus

𝑑𝑒𝑡 ([𝑆]𝛽 − 𝑡𝐼𝑛 ) = 𝑑𝑒𝑡 (𝐶 −1 [𝑆] 𝛼 𝐶 − 𝐶 −1 𝑡𝐼𝑛 𝐶)

= 𝑑𝑒𝑡 (𝐶 −1 ([𝑆] 𝛼 − 𝑡𝐼𝑛 )𝐶) = 𝑑𝑒𝑡 ([𝑆]𝛼 − 𝑡𝐼𝑛 ). ∎

Definition. The polynomial 𝑑𝑒𝑡 ([𝑆]𝛼 − 𝑡𝐼𝑛 ) is called the characteristic polynomial of 𝑆.

Example 7 In order to find the eigenvalues of the operator 𝐿𝐴 : ℝ3 → ℝ3, where 𝐴 is the
matrix [−1 0 3 ; 0 −1 0 ; 0 0 2], we first compute (via the canonical basis of ℝ3 )

𝑑𝑒𝑡 ([𝐿𝐴 ]𝑐𝑎𝑛 − 𝑡𝐼3 ) = 𝑑𝑒𝑡 (𝐴 − 𝑡𝐼3 ) = 𝑑𝑒𝑡 [−1 − 𝑡 , 0 , 3 ; 0 , −1 − 𝑡 , 0 ; 0 , 0 , 2 − 𝑡]

= (−1)3 (𝑡 + 1)2 (𝑡 − 2).

Hence, the eigenvalues of 𝐿𝐴 −the roots of this polynomial− are 𝑡 = −1, 2.

The eigenspace 𝐸−1 corresponding to 𝑡 = −1 is the solution set to the homogeneous


linear system

[0 0 3 ; 0 0 0 ; 0 0 3] [𝑥1 ; 𝑥2 ; 𝑥3 ] = [0 ; 0 ; 0].

Thus, 𝐸−1 = 𝑠𝑝𝑎𝑛 {(1, 0, 0), (0, 1, 0)}.

Similarly, it is easily verified that 𝐸2 = 𝑠𝑝𝑎𝑛 {(1, 0, 1)}.

Clearly ℝ3 = 𝐸−1 ⊕ 𝐸2 . Hence, by Theorem 2.11, 𝐿𝐴 is diagonal. In fact, 𝐿𝐴 is the operator
𝐻−1 ⊕ 𝐻2 of Examples 4 and 5, where 𝐻−1 is a homothety on 𝐸−1 and 𝐻2 is a homothety
on 𝐸2 .

Example 8 Consider the operator 𝐿𝐴 : ℝ3 → ℝ3, where 𝐴 is the 3 × 3 matrix
[2 0 0 ; 0 4 0 ; 1 0 2]. Applying the same procedure as in Example 7, we find that the
characteristic polynomial of 𝐿𝐴 is 𝑝(𝑡) = (−1)3 (𝑡 − 2)2 (𝑡 − 4), the eigenvalues are
𝑡 = 2, 4, and the eigenspaces are 𝐸2 = 𝑠𝑝𝑎𝑛 {(0, 0, 1)} and 𝐸4 = 𝑠𝑝𝑎𝑛 {(0, 1, 0)}.

Although 𝐸2 + 𝐸4 is a direct sum (since 𝐸2 ∩ 𝐸4 = {𝟎⃑ }), 𝐸2 ⊕ 𝐸4 ≠ ℝ3. Hence, in this
case, 𝐿𝐴 is not diagonal.


Theorem 2.13. Let 𝑆: 𝑽 → 𝑽 be an operator on a finitely generated vector space 𝑽 over a


field 𝐹. If 𝑾 is an 𝑆-invariant subspace of 𝑽, then the characteristic polynomial of the
restriction 𝑆𝑾 divides the characteristic polynomial of 𝑆.

Proof. Let 𝛼 = {𝒘1 , … , 𝒘𝑝 } be an ordered basis for 𝑾 and let 𝛾 = {𝒘1 , … , 𝒘𝑝 , 𝒗1 , … , 𝒗𝑞 }


be an extension to an ordered basis for 𝑽 . Since 𝑾 is 𝑆 -invariant, the matrix
representation of 𝑆 in the basis 𝛾 is a block−triangular (𝑝 + 𝑞) × (𝑝 + 𝑞) matrix, say,

𝐴 = [𝑆]𝛾 = [𝐵 , 𝐷 ; 𝑂 , 𝐶],

where 𝐵 = [𝑆𝑾 ]𝛼 . Let 𝑠(𝑡) = 𝑑𝑒𝑡 ([𝑆] 𝛾 − 𝑡𝐼𝑝+𝑞 ) be the characteristic polynomial of 𝑆 and
𝑝(𝑡) = 𝑑𝑒𝑡 ([𝑆𝑾 ]𝛼 − 𝑡𝐼𝑝 ) the characteristic polynomial of 𝑆𝑾 . Then, by Theorem 2.10,

𝑠(𝑡) = 𝑑𝑒𝑡 (𝐴 − 𝑡𝐼𝑝+𝑞 ) = 𝑑𝑒𝑡 [𝐵 − 𝑡𝐼𝑝 , 𝐷 ; 𝑂 , 𝐶 − 𝑡𝐼𝑞 ]

= 𝑑𝑒𝑡 (𝐵 − 𝑡𝐼𝑝 ) 𝑑𝑒𝑡 (𝐶 − 𝑡𝐼𝑞 )

= 𝑝(𝑡) 𝑑𝑒𝑡 (𝐶 − 𝑡𝐼𝑞 ).

Thus 𝑝(𝑡) divides 𝑠(𝑡). ∎

Theorem 2.14. Let 𝑆: 𝑽 → 𝑽 be an operator on a finitely generated vector space 𝑽 over a


field 𝐹. Suppose that 𝑆 has at least one eigenspace. Then the sum of the eigenspaces of 𝑆
is a direct sum.

Proof. Let 𝜆1 , … , 𝜆𝑖 , … , 𝜆𝑘 ∈ 𝐹 be the distinct eigenvalues of 𝑆 and let 𝐸𝜆1 , … , 𝐸𝜆𝑖 , … , 𝐸𝜆𝑘
be the corresponding eigenspaces. The proof is by mathematical induction on 𝑖 = 1, … , 𝑘.

The result is trivial for 𝑖 = 1.

Suppose that the result is true for the first 𝑖 − 1 eigenvalues, i.e. that the sum
𝐸𝜆1 + ⋯ + 𝐸𝜆𝑖−1 is direct. By Exercise 18 in Chapter 1, it suffices to show that

𝐸𝜆𝑖 ∩ (𝐸𝜆1 + ⋯ + 𝐸𝜆𝑖−1 ) = {𝟎⃑ }.

Let 𝒗 ∈ 𝐸𝜆𝑖 ∩ (𝐸𝜆1 + ⋯ + 𝐸𝜆𝑖−1 ) and let 𝒘𝑟 ∈ 𝐸𝜆𝑟 , 𝑟 = 1, … , 𝑖 − 1, be the unique vectors
such that

𝒗 = 𝒘1 + ⋯ + 𝒘𝑖−1 .

Thus,


𝑆(𝒗) = 𝜆𝑖 𝒗 = 𝜆𝑖 (𝒘1 + ⋯ + 𝒘𝑖−1 ) = 𝜆𝑖 𝒘1 + ⋯ + 𝜆𝑖 𝒘𝑖−1 .

On the other hand,

𝑆(𝒗) = 𝑆(𝒘1 + ⋯ + 𝒘𝑖−1 ) = 𝑆(𝒘1 ) + ⋯ + 𝑆(𝒘𝑖−1 ) = 𝜆1 𝒘1 + ⋯ + 𝜆𝑖−1 𝒘𝑖−1 ,

which implies

𝜆𝑖 𝒘1 + ⋯ + 𝜆𝑖 𝒘𝑖−1 = 𝜆1 𝒘1 + ⋯ + 𝜆𝑖−1 𝒘𝑖−1 .

Subtracting, we obtain

(𝜆𝑖 − 𝜆1 )𝒘1 + ⋯ + (𝜆𝑖 − 𝜆𝑖−1 )𝒘𝑖−1 = 𝟎⃑ .

Since the sum 𝐸𝜆1 + ⋯ + 𝐸𝜆𝑖−1 is direct and 𝜆𝑖 − 𝜆𝑟 ≠ 0 for 𝑟 < 𝑖, we conclude that

𝒘1 = ⋯ = 𝒘𝑖−1 = 𝟎⃑ .

Hence, 𝒗 = 𝟎⃑ . ∎

Examples 7 and 8 illustrate contrasting situations regarding Theorem 2.14. In Example 7


the eigenspaces of the operator sum up to ℝ3 , while in Example 8 they fail to do so, since
their sum is barely a two−dimensional subspace of ℝ3 . This leads to the following
question: Given an operator 𝑆: 𝑽 → 𝑽 in finite dimension, what could possibly prevent the
𝑆 −eigenspaces from summing up to 𝑽 (i.e. prevent 𝑆 from being diagonal)?

Certainly not all operators in finite dimension are diagonal. In fact, diagonal operators are
a very special class of operators. We now examine the possible obstructions that might
prevent an operator in finite dimension from being diagonal.

Definition. Let 𝐹 be a field. A polynomial 𝑝(𝑡) ∈ 𝐹𝑛 [𝑡] splits over 𝐹 if there are (not
necessarily distinct) scalars 𝑐, 𝑎1 , … , 𝑎𝑛 ∈ 𝐹 such that

𝑝(𝑡) = 𝑐(𝑡 − 𝑎1 ) ⋯ (𝑡 − 𝑎𝑛 ).

Example 9

a) 𝑝(𝑡) = 1 + 𝑡 2 does not split over ℝ but it does split over ℂ. In fact,

𝑝(𝑡) = (𝑡 − 𝑖)(𝑡 + 𝑖).

b) 𝑝(𝑡) = 1 − 𝑡 2 splits over ℝ.


c) 𝑝(𝑡) = 1 − 𝑡 4 splits over ℂ but it does not split over ℝ.

By the Fundamental Theorem of Algebra, any polynomial in ℂ[𝑡] splits over ℂ (i.e., ℂ is
algebraically closed). In particular, any polynomial in the real subspace ℝ[𝑡] splits over ℂ,
although not necessarily over ℝ.


Definition. Let 𝑆: 𝑽 → 𝑽 be an operator on an 𝑛 −dimensional vector space 𝑽 over a field 𝐹
with characteristic polynomial 𝑝(𝑡), and let 𝜆 ∈ 𝐹 be an eigenvalue of 𝑆. The algebraic
multiplicity of 𝜆 is the largest positive integer 𝑚 for which (𝑡 − 𝜆)𝑚 is a factor of 𝑝(𝑡). The
geometric multiplicity of 𝜆 is the dimension of the corresponding eigenspace 𝐸𝜆 .
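For instance, for the operator 𝐿𝐴 of Example 8, the eigenvalue 𝜆 = 2 has algebraic
multiplicity 2 (since (𝑡 − 2)2 divides 𝑝(𝑡)) but geometric multiplicity 1 (since 𝑑𝑖𝑚 𝐸2 = 1),
while 𝜆 = 4 has both multiplicities equal to 1.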

Theorem 2.15. Let 𝑆: 𝑽 → 𝑽 be an operator on an 𝑛 −dimensional vector space 𝑽 over a
field 𝐹. Then, for each eigenvalue 𝜆 of 𝑆, if any, the following inequalities hold:

1 ≤ geometric multiplicity of 𝜆 ≤ algebraic multiplicity of 𝜆.

Proof. Let 𝜆 be an eigenvalue of 𝑆 and let 𝑑 and 𝑚 denote its geometric and algebraic
multiplicities, respectively. The inequality 1 ≤ 𝑑 is trivial, since eigenspaces are non
trivial.

Regarding the second inequality, let 𝛼 = {𝒘1 , … , 𝒘𝑑 } be an ordered basis for the
eigenspace 𝐸𝜆 and let 𝛾 = {𝒘1 , … , 𝒘𝑑 , 𝒗1 , … , 𝒗𝑛−𝑑 } be an extension to an ordered basis
for 𝑽. As in the proof of Theorem 2.13 −since 𝐸𝜆 is 𝑆-invariant− the matrix representation
of 𝑆 in the basis 𝛾 is block−triangular. Moreover, since the restriction of 𝑆 to 𝐸𝜆 is the
homothety of ratio 𝜆, the 𝑛 × 𝑛 matrix [𝑆]𝛾 has the form

[𝑆]𝛾 = [𝜆𝐼𝑑 , 𝐷 ; 𝑂 , 𝐶].

Thus, the characteristic polynomial of 𝑆 is given by

𝑝(𝑡) = 𝑑𝑒𝑡 [(𝜆 − 𝑡)𝐼𝑑 , 𝐷 ; 𝑂 , 𝐶 − 𝑡𝐼𝑛−𝑑 ] = (−1)𝑑 (𝑡 − 𝜆)𝑑 𝑞(𝑡),

where 𝑞(𝑡) = 𝑑𝑒𝑡 (𝐶 − 𝑡𝐼𝑛−𝑑 ). Since 𝑞(𝑡) may possibly furnish additional factors 𝑡 − 𝜆,
clearly 𝑑 ≤ 𝑚. ∎

Corollary Let 𝑆: 𝑽 → 𝑽 be an operator on an 𝑛 −dimensional vector space 𝑽 over a field 𝐹


whose characteristic polynomial splits over 𝐹. Then 𝑆 is diagonal (diagonalizable) if and
only if for each eigenvalue of 𝑆 its geometric multiplicity equals its algebraic multiplicity.

Proof. Let 𝜆1 , … , 𝜆𝑘 ∈ 𝐹 be the distinct eigenvalues of 𝑆 and let 𝐸𝜆1 , … , 𝐸𝜆𝑘 be the
corresponding eigenspaces. Let 𝑑1 , … , 𝑑𝑘 and 𝑚1 , … , 𝑚𝑘 denote the respective geometric
and algebraic multiplicities. Since the characteristic polynomial 𝑝(𝑡) of 𝑆 has degree 𝑛 and
splits over 𝐹, then

𝑝(𝑡) = (−1)𝑛 (𝑡 − 𝜆1 )𝑚1 ⋯ (𝑡 − 𝜆𝑘 )𝑚𝑘 ,

where 𝑚1 + ⋯ + 𝑚𝑘 = 𝑛. By Theorem 2.11, 𝑆 is diagonal if and only if 𝑑1 + ⋯ + 𝑑𝑘 = 𝑛
(i.e. 𝑽 is the direct sum of its eigenspaces). By Theorem 2.15, 1 ≤ 𝑑𝑖 ≤ 𝑚𝑖 , 𝑖 = 1, … , 𝑘.
Therefore, 𝑆 is diagonal if and only if each 𝑑𝑖 , 𝑖 = 1, … , 𝑘, attains its maximum, i.e. 𝑑𝑖
= 𝑚𝑖 , 𝑖 = 1, … , 𝑘. ∎


We are now able to describe the possible obstructions for an operator 𝑆 over an
𝑛 −dimensional vector space 𝑽 over a field 𝐹 to be diagonal.

Obstruction 1. The characteristic polynomial of 𝑆 does not split over 𝐹. In this situation, at
least one root will fall outside 𝐹. Hence, the algebraic multiplicities of those roots that do
fall inside 𝐹, if any, will not add up to 𝑛. Recall, on the other hand, that complex roots of
polynomials with real coefficients appear in conjugate pairs since they arise precisely from
ℝ −irreducible quadratic factors.

Obstruction 2. The characteristic polynomial of 𝑆 splits over 𝐹 but for (at least) one
eigenvalue 𝜆 of 𝑆 we have: geometric multiplicity of 𝜆 < algebraic multiplicity of 𝜆. In this
situation, by the Corollary to Theorem 2.15, the geometric multiplicities of the eigenvalues
will not add up to 𝑛. Thus, the sum of all the eigenspaces −although a direct sum− is smaller
than 𝑽, i.e. a proper subspace of 𝑽.

Example 10 The complex operator 𝑆: ℂ𝟐 → ℂ𝟐 given by 𝑆(𝑧1 , 𝑧2 ) = (𝑧1 − 𝑖𝑧2 , −𝑖𝑧1 − 𝑧2 ),


for all (𝑧1 , 𝑧2 ) ∈ ℂ𝟐 , is not diagonal. Its characteristic polynomial is

𝑝(𝑡) = 𝑑𝑒𝑡 [1 − 𝑡 , −𝑖 ; −𝑖 , −1 − 𝑡] = (−1)2 𝑡 2 ,

meaning that 𝑡 = 0 is the only eigenvalue of 𝑆. The corresponding eigenspace 𝐸0 = 𝑘𝑒𝑟 𝑆


is not equal to the whole space ℂ𝟐 (it is just a 1 −dimensional complex subspace of ℂ𝟐 ). As
an exercise, we invite the reader to find a basis for 𝐸0 = 𝑘𝑒𝑟 𝑆.

Example 11 The real operator 𝑆: ℝ2 → ℝ2 given by 𝑆(𝑥1 , 𝑥2 ) = (𝑥1 + 𝑥2 , 𝑥2 ), for all


(𝑥1 , 𝑥2 ) ∈ ℝ𝟐 , is not diagonal. The characteristic polynomial

𝑝(𝑡) = 𝑑𝑒𝑡 [1 − 𝑡 , 1 ; 0 , 1 − 𝑡] = (−1)2 (𝑡 − 1)2 ,

has a single root, namely 𝑡 = 1. The corresponding eigenspace, 𝐸1 = 𝑠𝑝𝑎𝑛 {(1, 0)}, as in
the previous example, is not equal to the whole space, in this case ℝ2 .

In this example, it is interesting to notice that the canonical matrix of 𝑆,

[𝑆]𝑐𝑎𝑛 = [1 1 ; 0 1],

is a minimal Jordan block (see Section 2.4).

More generally, if 𝑎 is a real parameter, 𝑎 ≠ 0, the operator 𝑆𝑎 : ℝ𝟐 → ℝ𝟐 given by


𝑆(𝑥1 , 𝑥2 ) = (𝑥1 + 𝑎𝑥2 , 𝑥2 ), for all (𝑥1 , 𝑥2 ) ∈ ℝ𝟐 , is not diagonal. Such an operator is called
a shear (along the 𝑥1 −axis).

Example 12 If 𝜃 ≠ 0, 𝜋, no rotation 𝑅𝜃 : ℝ2 → ℝ2 , where

𝑅𝜃 [𝑥1 ; 𝑥2 ] = [𝑐𝑜𝑠 𝜃 , −𝑠𝑖𝑛 𝜃 ; 𝑠𝑖𝑛 𝜃 , 𝑐𝑜𝑠 𝜃] [𝑥1 ; 𝑥2 ],

for all [𝑥1 ; 𝑥2 ] ∈ ℝ2 , is diagonal. Geometrically, no non zero vector in ℝ2 is transformed by 𝑅𝜃
into a real multiple of itself. The reader may verify that the characteristic polynomial of
such a rotation has complex roots. If 𝜃 = 0, 𝜋 the operator 𝑅𝜃 corresponds, respectively,
to the identity/inversion on ℝ2 ; both of them are homotheties, hence diagonal.
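Explicitly, 𝑑𝑒𝑡 ([𝑅𝜃 ]𝑐𝑎𝑛 − 𝑡𝐼2 ) = 𝑡 2 − 2(𝑐𝑜𝑠 𝜃)𝑡 + 1, whose discriminant 4(𝑐𝑜𝑠 𝜃)2 − 4 is
negative precisely when 𝜃 ≠ 0, 𝜋.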

Example 13 The real operator 𝑆: ℝ𝟑 → ℝ𝟑 given by 𝑆(𝑥1 , 𝑥2 , 𝑥3 ) = (−𝑥2 , 𝑥1 , 𝑥3 ), for all
(𝑥1 , 𝑥2 , 𝑥3 ) ∈ ℝ𝟑 , is not diagonal. In fact,

[𝑆]𝑐𝑎𝑛 = [0 −1 0 ; 1 0 0 ; 0 0 1] = [0 −1 ; 1 0] ⊕ [1],

which implies that 𝑆 = 𝑆1 ⊕ 𝑆2 , where 𝑆1 : 𝑾1 → 𝑾1 is the counterclockwise rotation of
𝜃 = 𝜋⁄2 radians on 𝑾1 = 𝑠𝑝𝑎𝑛 {(1, 0, 0), (0, 1, 0)}, and 𝑆2 : 𝑾2 → 𝑾2 is the identity on
𝑾2 = 𝑠𝑝𝑎𝑛 {(0, 0, 1)}. The reader may easily verify that the characteristic polynomial of
𝑆 has two conjugate complex roots plus one real root.

Example 14 Any homothety 𝐻𝑟 : 𝑽 → 𝑽 on an 𝑛 −dimensional vector space 𝑽 over a field 𝐹
is diagonal. 𝐻𝑟 has a single eigenvalue, namely 𝑡 = 𝑟, and the corresponding eigenspace is
𝐸𝑟 = 𝑽. Recall, from Chapter 1, that the matrix representation of 𝐻𝑟 in any ordered basis of
𝑽 is the scalar matrix 𝑟𝐼𝑛 .

Example 15 Any projection on a finite dimensional vector space is diagonal: Let 𝑽 be a


finitely generated vector space over a field 𝐹 and let 𝑾1 and 𝑾2 be subspaces such that
𝑽 = 𝑾1 ⊕ 𝑾2 . Let 𝑃: 𝑽 → 𝑽 be the projection on 𝑾1 along 𝑾2 . Then 𝑃 = 𝐼𝑑𝑾1 ⊕ 𝑂𝑾2 ,
which is a direct sum of homotheties. Moreover, if 𝛼1 and 𝛼2 are ordered bases for 𝑾1
and 𝑾2 , respectively, then

[𝑃]𝛼1 ∪̇𝛼2 = [𝐼𝑑1 , 𝑂 ; 𝑂 , 𝑂𝑑2 ],

where 𝑑𝑖 = 𝑑𝑖𝑚 𝑾𝑖 , 𝑖 = 1, 2. The eigenvalues of 𝑃 are 𝑡 = 1, 0 and the corresponding


eigenspaces are 𝑾1 and 𝑾2 , respectively. Of course, 𝑑1 and 𝑑2 are the geometric
multiplicities.

Example 16 The operator in Example 26 in Chapter 1 is diagonal. This is the operator


𝑆: ℝ3 → ℝ3 given by 𝑆(𝑥1 , 𝑥2 , 𝑥3 ) = (𝑥1 + 𝑥3 , 𝑥1 − 𝑥2 − 𝑥3 , 2𝑥3 ), for all (𝑥1 , 𝑥2 , 𝑥3 ) ∈
ℝ3 . This operator was originally constructed as a direct sum of operators (on appropriate
subspaces of ℝ3 ), one of which was not a homothety. Using the canonical basis for ℝ3 , the
characteristic polynomial of 𝑆 is

𝑑𝑒𝑡 ([𝑆]𝑐𝑎𝑛 − 𝑡𝐼3 ) = 𝑑𝑒𝑡 [1 − 𝑡 , 0 , 1 ; 1 , −1 − 𝑡 , −1 ; 0 , 0 , 2 − 𝑡] = (−1)3 (𝑡 − 2)(𝑡 − 1)(𝑡 + 1).

Since there are three distinct eigenvalues, 𝑆 is indeed diagonal, by Exercise 24.

Example 17 The complex operator 𝑆: ℂ𝟐 → ℂ𝟐 given by 𝑆(𝑧1 , 𝑧2 ) = (𝑖𝑧1 , 𝑧1 + 𝑧2 ), for all


(𝑧1 , 𝑧2 ) ∈ ℂ𝟐 , is ℂ −diagonal, but it is not ℝ −diagonal. (The computations are left to the
reader.) Could a complex operator possibly be ℝ −diagonal and yet not be ℂ −diagonal?
(See Exercise 19.)

We close this section with the well−known Cayley−Hamilton Theorem.

Definition. Let 𝑆: 𝑽 → 𝑽 be an operator on a vector space 𝑽 over a field 𝐹 and let 𝒗 ∈ 𝑽.


The subspace 𝑾𝒗 = 𝑠𝑝𝑎𝑛{ 𝑆 𝑗 (𝒗) | 𝑗 = 0, 1, 2, … } (where 𝑆 0 = 𝐼𝑑𝑽 ) is called the 𝑆 −cyclic subspace of 𝑽
generated by 𝒗. This subspace 𝑾𝒗 is clearly 𝑆 −invariant for all 𝒗 ∈ 𝑽.

Suppose that 𝑽 is finitely generated over a field 𝐹 and let 𝒗 be a non zero vector in 𝑽. In
the terms of the previous definition, if 𝑘 is the largest positive integer such that
𝒗, 𝑆(𝒗), … , 𝑆 𝑘−1 (𝒗) are linearly independent, then this collection of vectors is maximal
linearly independent in 𝑾𝒗 , hence a basis for 𝑾𝒗 : If 𝑆 𝑘 (𝒗) ∈ 𝑠𝑝𝑎𝑛 {𝒗, 𝑆(𝒗), … , 𝑆 𝑘−1 (𝒗)},
then 𝑆 𝑙 (𝒗) ∈ 𝑠𝑝𝑎𝑛 {𝒗, 𝑆(𝒗), … , 𝑆 𝑘−1 (𝒗)} for any 𝑙 ≥ 𝑘.

Consider such an integer 𝑘 and let 𝑎0 , 𝑎1 , … , 𝑎𝑘−1 ∈ 𝐹 be the scalars such that

−𝑆 𝑘 (𝒗) = 𝑎0 𝒗 + 𝑎1 𝑆(𝒗) + ⋯ + 𝑎𝑘−1 𝑆 𝑘−1 (𝒗).

Using the basis {𝒗, 𝑆(𝒗), … , 𝑆 𝑘−1 (𝒗)}, it may be shown (see Exercise 34) that the
characteristic polynomial of the restriction 𝑆𝑾𝒗 is

𝑞(𝑡) = (−1)𝑘 (𝑎0 + 𝑎1 𝑡 + ⋯ + 𝑎𝑘−1 𝑡 𝑘−1 + 𝑡 𝑘 ).

Thus,

𝑞(𝑆𝑾𝒗 )(𝒗) = (−1)𝑘 (𝑎0 𝐼𝑑𝑾𝒗 + 𝑎1 𝑆𝑾𝒗 + ⋯ + 𝑎𝑘−1 (𝑆𝑾𝒗 )𝑘−1 + (𝑆𝑾𝒗 )𝑘 )(𝒗) = 𝟎⃑ .

Theorem 2.16. (Cayley−Hamilton) Let 𝑆: 𝑽 → 𝑽 be an operator on a finite dimensional


vector space 𝑽 over a field 𝐹. If 𝑝(𝑡) is the characteristic polynomial of 𝑆, then 𝑝(𝑆) = 𝑂𝑽 ,
i.e. 𝑆 satisfies its own characteristic equation.

Proof. Let 𝒗 be a non zero vector in 𝑽. Since 𝑾𝒗 is 𝑆 −invariant, the characteristic


polynomial 𝑞(𝑡) of the restriction 𝑆𝑾𝒗 divides 𝑝(𝑡) , by Theorem 2.13. Thus 𝑝(𝑡) =
𝑟(𝑡)𝑞(𝑡) for some polynomial 𝑟(𝑡). This implies 𝑝(𝑆) = 𝑟(𝑆)𝑞(𝑆) (composition). Hence,
applying the operator 𝑝(𝑆) to 𝒗 yields

𝑝(𝑆)(𝒗) = 𝑟(𝑆)(𝑞(𝑆)(𝒗)) = 𝑟(𝑆)(𝟎⃑ ) = 𝟎⃑ ,

by the argument prior to this theorem. ∎

Example 18 Let 𝑆: ℝ3 → ℝ3 be the operator 𝑆(𝑥1 , 𝑥2 , 𝑥3 ) = (𝑥2 + 𝑥3 , 𝑥1 + 𝑥3 , 2𝑥3 ), for


all (𝑥1 , 𝑥2 , 𝑥3 ) ∈ ℝ3 .

If 𝒗 = (1, 0, 0), then 𝑾𝒗 = 𝑠𝑝𝑎𝑛 {(1, 0, 0), (0, 1, 0)} and the characteristic polynomial of
𝑆𝑾𝒗 is 𝑞(𝑡) = −1 + 𝑡 2 . The characteristic polynomial of 𝑆 is 𝑝(𝑡) = −(𝑡 − 2)(−1 + 𝑡 2 ) =
−2 + 𝑡 + 2𝑡 2 − 𝑡 3 .

Thus, by Cayley−Hamilton, 𝑝(𝑆) = −2𝐼𝑑ℝ𝟑 + 𝑆 + 2𝑆 2 − 𝑆 3 is the zero operator on ℝ𝟑 .
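In matrix terms, with 𝐴 = [𝑆]𝑐𝑎𝑛 = [0 1 1 ; 1 0 1 ; 0 0 2], one may check directly that
𝐴2 = [1 0 3 ; 0 1 3 ; 0 0 4] and 𝐴3 = [0 1 7 ; 1 0 7 ; 0 0 8], and hence that
−2𝐼3 + 𝐴 + 2𝐴2 − 𝐴3 = 𝑂, as predicted.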


EXERCISES

1. Let 𝐹 be a field and let 𝒖1 ,…,𝒖𝑖 ,…, 𝒖𝑚 ,𝒗 be vectors in 𝐹 𝑛 such that 𝒖𝑖 𝒗 = 0 for
𝑖 = 1, … , 𝑚. Show that 𝒖𝒗 = 0 for all 𝒖 ∈ 𝑠𝑝𝑎𝑛{𝒖1 , … , 𝒖𝑖 , … , 𝒖𝑚 }.

2. Two permutations in 𝑆𝑛 are said to be disjoint if every integer in ⟦1, 𝑛⟧ moved by one
of them is fixed by the other. Prove that any two disjoint permutations in 𝑆𝑛 commute.

3. Let 𝑝 and 𝑞 be positive integers and let 𝑛 = 𝑝 + 𝑞.


a) Let 𝜎 ∈ 𝑆𝑝+𝑞 be such that 𝜎(⟦1, 𝑝⟧ ) = ⟦1, 𝑝⟧ . Show that 𝜎(⟦𝑝 + 1, 𝑛⟧ ) =
⟦𝑝 + 1, 𝑛⟧.
b) Let 𝜎 ∈ 𝑆𝑝+𝑞 be a permutation that satisfies 𝜎(⟦1, 𝑝⟧) = ⟦1, 𝑝⟧. Prove that 𝜎 may
be decomposed as a product 𝜎 = 𝛿𝜇, where 𝛿 is a permutation in 𝑆𝑝+𝑞 whose
fixed point set is ⟦𝑝 + 1, 𝑛⟧, and 𝜇 is a permutation in 𝑆𝑝+𝑞 whose fixed point set is
⟦1, 𝑝⟧.

4. Let 𝜎 = 𝛿𝜇 in 𝑆𝑛 , where 𝛿 and 𝜇 are disjoint. Suppose that 𝑖 ∈ ⟦1, 𝑛⟧ is moved by 𝛿.


Show that 𝜎 𝑟 (𝑖) = 𝛿 𝑟 (𝑖), for all 𝑟 ≥ 0.

5. Let 𝐹 be any field. If 𝐴 = (𝑎𝑖𝑗 ) ∈ 𝑀𝑛×𝑛 (𝐹), prove that

∑𝜎∈𝑆𝑛 𝑠𝑖𝑔𝑛(𝜎) 𝑎1𝜎(1) ⋯ 𝑎𝑗𝜎(𝑗) ⋯ 𝑎𝑛𝜎(𝑛) = ∑𝜎∈𝑆𝑛 𝑠𝑖𝑔𝑛(𝜎) 𝑎𝜎(1)1 ⋯ 𝑎𝜎(𝑗)𝑗 ⋯ 𝑎𝜎(𝑛)𝑛 .

6. Show that the property of skew−symmetry of 𝑑𝑒𝑡 is equivalent to the property given
at the beginning of the proof of Theorem 2.8.

7. Prove Corollary 1 to Theorem 2.8.

8. Prove part 1 of Theorem 2.9.

9. Let 𝐵 ∈ 𝑀𝑝×𝑝 (𝐹) and 𝐶 ∈ 𝑀𝑞×𝑞 (𝐹). Use induction on 𝑝 to show that 𝑑𝑒𝑡 (𝐵 ⊕ 𝐶) =
𝑑𝑒𝑡 𝐵 𝑑𝑒𝑡 𝐶. (If you wish, you may use Theorem 2.9.)

10. State and prove Cramer’s rule.

11. Let 𝑾1 = 𝑠𝑝𝑎𝑛 {(1, 0, 0, 0), (0, 1, 0, 1)} and 𝑾2 = 𝑠𝑝𝑎𝑛 {(1, 0, 1, 0), (0, 0, 0, 1)} in ℝ4 .

a) Show that ℝ4 = 𝑾1 ⊕ 𝑾2 .
b) If 𝐻2 : 𝑾2 → 𝑾2 is the homothety of ratio 2 and 𝐻1 : 𝑾1 → 𝑾1 is the homothety of
ratio 1 (i.e., the identity operator), find 𝐻1 ⊕ 𝐻2.
c) Describe two different ordered bases 𝛼 and 𝛽 for ℝ4 such that [𝐻1 ⊕ 𝐻2 ]𝛼 and
[𝐻1 ⊕ 𝐻2 ]𝛽 are both diagonal.


12. Let 𝑽 be a finitely generated vector space and let 𝑆1 and 𝑆2 be diagonal operators on 𝑽
that commute. Show that there is an ordered basis 𝛼 for 𝑽 such that [𝑆1 ]𝛼 and [𝑆2 ]𝛼
are both diagonal. (Such operators are said to be simultaneously diagonalizable.)

13. Prove the converse to Exercise 12.

14. Make the necessary computations in Example 8.

15. Let 𝑾1 be a 2 −dimensional subspace of ℝ𝑛 , and let 𝑾2 be a direct complement of
𝑾1 in ℝ𝑛 . For 𝜃 ∈ [0, 2𝜋), the operator on ℝ𝑛 given by 𝑆𝜃 = 𝑅𝜃 ⊕ 𝐼𝑑𝑾2 , where 𝑅𝜃 is
the rotation of 𝜃 radians about the origin in 𝑾1 , is the generalized rotation of 𝜃
radians with respect to 𝑾2 . Show that 𝑆𝜃 is not diagonal for 𝜃 ≠ 0, 𝜋. Hint: Prove that
the characteristic polynomial of 𝑆𝜃 has complex roots for 𝜃 ≠ 0, 𝜋.

16. Give an example of an operator on ℝ4 with no eigenvalues.

17. Let 𝑆: ℂ𝟐 → ℂ𝟐 be the complex operator given by 𝑆(𝑧1 , 𝑧2 ) = (𝑖𝑧1 , −𝑖𝑧2 ), for all
(𝑧1 , 𝑧2 ) ∈ ℂ𝟐 .

a) Find the smallest positive integer 𝑘 such that 𝑆 𝑘 = 𝐼𝑑ℂ𝟐 .


b) Show that 𝑆 is ℂ −diagonal, but not ℝ −diagonal.

18. Consider the operator in Example 17.


a) Find the eigenvalues and bases for the eigenspaces of this operator as a complex
operator.
b) Show that this operator is not ℝ −diagonal.

19. Could an operator on a finitely generated complex vector space possibly be


ℝ −diagonal and yet not be ℂ −diagonal?

20. If the ratios 𝜆1 , … , 𝜆𝑘 in Theorem 2.11 are not taken to be distinct, what could be said
about the uniqueness of the decomposition 𝑽 = 𝑾1 ⊕ ⋯ ⊕ 𝑾𝑘 ?

21. Is the complex operator 𝑆: ℂ𝟐 → ℂ𝟐 given by 𝑆(𝑧1 , 𝑧2 ) = (𝑧1 + 𝑖𝑧1 , 𝑖𝑧1 + 𝑧2 ), for all
(𝑧1 , 𝑧2 ) ∈ ℂ𝟐 , a diagonal operator?

22. For a given field 𝐹, let 𝑆: 𝐹 ∞ → 𝐹 ∞ be the right shift. Prove that 𝑆 has no eigenvalues.

23. Let 𝑆: 𝑽 → 𝑽 be an operator on a finitely generated real vector space.

a) Define 𝑑𝑒𝑡 𝑆.
b) Suppose that 𝑆 is diagonal with distinct eigenvalues 𝜆1 , … , 𝜆𝑟 and respective
algebraic multiplicities 𝑚1 , … , 𝑚𝑟 . Show that 𝑑𝑒𝑡 𝑆 = (𝜆1 )𝑚1 ⋯ (𝜆𝑟 )𝑚𝑟 .


24. Let 𝑽 be an 𝑛 −dimensional vector space over a field 𝐹. Show that an operator on 𝑽
with 𝑛 distinct eigenvalues is a diagonal operator.

25. Let 𝑽 be an 𝑛 −dimensional vector space over a field 𝐹. Show that an operator on 𝑽
with two distinct eigenvalues, one of which has geometric multiplicity equal to 𝑛 − 1,
is diagonal.

26. Let 𝑆: 𝑽 → 𝑽 be an operator on an 𝑛 −dimensional real vector space with an eigenvalue


that has geometric multiplicity equal to 𝑛 − 1. Prove that 𝑆 is diagonal. Give an
example of such an operator on ℝ3 .

27. Give an example of an operator on ℝ3 with a single eigenvalue of algebraic multiplicity


equal to three and geometric multiplicity equal to
a) One
b) Two
c) Three

28. Let 𝑾 = {(𝑥1 , 𝑥2 , 𝑥3 , 𝑥4 ) ∈ ℝ4 | 𝑥1 − 𝑥3 = 0} and consider the reflection about 𝑾⊥ .


a) Without using any matrix representation, explain why this operator is diagonal.
b) Verify a) by computing the eigenvalues and eigenspaces via the canonical basis.

29. Let 𝑽 be a real or complex vector space, let 𝑾1 be a 1 −dimensional subspace of 𝑽,


and let 𝑾2 be a direct complement of 𝑾1 in 𝑽. Let 𝑃 be the projection on 𝑾1 along
𝑾2 . If 𝐻 = 𝐼𝑑𝑽 − 2𝑃 is the corresponding generalized Householder operator, prove
that 𝐻 is diagonal.

30. Give an example of a complex non diagonal operator 𝑆: 𝑪2 → 𝑪2 with 𝑡 = 𝑖 as a


single eigenvalue.

31. Show that the Cayley-Hamilton equation for any projection 𝑃 on a finitely generated
vector space is equivalent to the condition 𝑃2 = 𝑃.
