Learning Linear Algebra
with ISETL
Kirk Weller, University of North Texas
Aaron Montgomery, Central Washington University
Julie Clark, Hollins University
Jim Cottrill, Illinois State University
Maria Trigueros, Instituto Tecnologico Autonomo de Mexico
Ilana Arnon, Centre for Educational Technology
Ed Dubinsky, RUMEC
Preliminary Version 3
July 31, 2002
© 2002 by Research in Undergraduate Mathematics Education Community
All rights reserved.
Preface
The authors wish to express thanks to Don Muench of St. John Fisher College
(Rochester, NY) whose work with linear algebra and ISETL gave us the basis
for our work. His code was written at Gettysburg College in 1991 with
students there, so we thank Jared Colesh, Ben Papada, Julie Leese, and
Dave Riihimaki.
This work is a collaborative effort, both in its authorship and its conception.
We acknowledge the assistance of the following members of RUMEC who
have worked with us at various stages of the project:
Broni Czarnocha David DeVries Clare Hemenway
George Litman Sergio Loch Rob Merkovsky
Steve Morics Asuman Oktac Vrunda Prabhu
Keith Schwingendorf
Many students and faculty have used these materials and have helped us
to refine their intent and presentation. Those members of RUMEC who have
implemented some or all of these sections are Ilana Arnon, Julie Clark, Sergio
Loch, Steve Morics, Keith Schwingendorf, and Kirk Weller. Our special
thanks go to the brave faculty who are implementing this approach and its
materials from beyond RUMEC. They and their students will guide us in
taking this preliminary version to its next level.
Robert Acar, University of Puerto Rico-Mayagüez
Felix Almendra Arao, Unidad Profesional Interdisciplinaria en Ingeniería y Tecnología.
(c) The function determines the number of non-OM components of a
tuple. (Note that not all of the components of a tuple between
the first and last must be defined.)
2. Write ISETL funcs with the given names according to each of the following
specifications. In each case, set up specific values for the parameters
and run your code to check that it works.
(a) The func is_associative has two input parameters: a set G
and a binary operation o. The action of is_associative is to
determine whether the operation represented by o is associative.
This is indicated by returning the value true or false.
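A minimal sketch of such a func, using the forall quantifier and the infix .o notation that appear elsewhere in this text (the test values at the end are our own):

```isetl
is_associative := func(G, o);
  return forall a, b, c in G | ((a .o b) .o c) = (a .o (b .o c));
end;

$ A quick check: addition mod 4 on {0, 1, 2, 3} is associative.
G := {0..3};
o := |a, b -> (a + b) mod 4|;
is_associative(G, o);          $ should print true
```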
(b) Construct a func add_7 that implements addition mod 7 in Z_7.
Use your construction in ISETL code that shows that every element
in Z_7 has an inverse in relation to add_7.
(c) Repeat part (b) with Z_6 and add_6.
3. Write an ISETL map that implements the function whose domain is Z_20
and assigns to each element x an element y such that (x + y) mod 20 =
0. Is this an smap? Explain.
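For instance, one candidate map (a sketch; the name inv20 is ours) pairs each x with 20 − x, reduced mod 20:

```isetl
inv20 := {[x, (20 - x) mod 20] : x in {0..19}};
inv20(5);     $ 15, since (5 + 15) mod 20 = 0
inv20(0);     $ 0, since (0 + 0) mod 20 = 0
```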
50 CHAPTER 1. FUNCTIONS AND STRUCTURES
4. Do the same as the previous exercise with addition replaced by multi-
plication and 0 replaced by 1.
5. Write an ISETL smap that implements the operation of addition mod
20 in Z_20.
6. Write an ISETL func that accepts a pair consisting of a set G and a
binary operation o on G. The action of the func is to convert this pair
into an smap which implements the operation. Use your func to do the
previous exercise.
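A possible shape for such a converter (a sketch; the name op_to_smap is our own):

```isetl
op_to_smap := func(G, o);
  return {[[a, b], a .o b] : a, b in G};
end;

$ Example: turn addition mod 20 on {0..19} into an smap.
add20 := op_to_smap({0..19}, |a, b -> (a + b) mod 20|);
add20([3, 18]);    $ 1
```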
7. Construct a function that takes a tuple (of any length) as input and
returns a set of all of the components of the tuple as output.
8. Look again at the func Av of Activity 1(c). Is it or is it not a binary
operation? Write an explanation. If it is, use it with several appropriate
inputs as an operation written between the two parameters. If it is not,
what modifications need to be made to Av in order to produce a similar
function (say, AvBin) which could be used as a binary operation, in
particular in the method described above?
9. Write a tuple and an smap of your own. Operate both as functions.
For each of them, write an explanation: What are the inputs of the
function? What is its domain? (The domain is the set of elements you
can input to the function.) What are its outputs?
Chapter 2
Vectors and Vector Spaces
You have seen vectors in your physics classes, in
multivariable calculus, and perhaps in other
courses. In those cases, vectors were probably
considered to be things with direction and
magnitude and were usually represented as
directed line segments. In this chapter and
beyond, we will be working with vectors in an
abstract sense. Certainly the vectors with which
you are already familiar will be included in our
work (although they will all have their tails at
the origin). However, as we work with vectors
and vector spaces, you will find that polynomials
and infinitely differentiable functions are also
vectors.
2.1 Vectors
Activities
1. (a) Define the set K = Z_5 = {0, 1, 2, 3, 4} in ISETL.
(b) Write an ISETL func add_scal that accepts two elements of K
and returns their sum mod 5.
(c) Write an ISETL func mult_scal that accepts two elements of K
and returns their product mod 5.
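This first activity might be set up along the following lines (a sketch using the |...->...| func-former; your own solutions may differ):

```isetl
K := {0..4};                          $ Z_5
add_scal := |a, b -> (a + b) mod 5|;
mult_scal := |a, b -> (a * b) mod 5|;

add_scal(3, 4);     $ 2
mult_scal(3, 4);    $ 2
```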
2. (a) Define V = (Z_5)^2, that is, the set of all 2-tuples with components
from Z_5, in ISETL. How many elements are there in V?
(b) Write an ISETL func vec_add that accepts two elements, [v_1, v_2]
and [w_1, w_2], of V and returns the tuple
[(v_1 + w_1) mod 5, (v_2 + w_2) mod 5].
(c) Write an ISETL func scal_mult that accepts an element k of
Z_5 and a tuple [v_1, v_2] from V, and returns the tuple
[(kv_1) mod 5, (kv_2) mod 5].
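Activity 2 might be sketched as follows; V here has 25 elements:

```isetl
K := {0..4};
V := {[x, y] : x, y in K};
vec_add := |v, w -> [(v(1) + w(1)) mod 5, (v(2) + w(2)) mod 5]|;
scal_mult := |k, v -> [(k * v(1)) mod 5, (k * v(2)) mod 5]|;

vec_add([2, 3], [4, 4]);    $ [1, 2]
scal_mult(2, [2, 3]);       $ [4, 1]
```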
3. Define the tuples v = [2, 3], w = [1, 1], and u = [0, 3] in ISETL. Use your
funcs defined in Activities 1 and 2 to determine whether the following
tuples are the same.
(a) v + w and w + v.
(b) (u + v) + w and u + (v + w).
(c) v + v and 2v.
(d) 1v and v.
(e) v + (−1)v and v − v.
(f) 2(3u) and (2 · 3)u.
(g) 2(v + w) and 2v + 2w.
4. How is the following code different from the func vec_add you wrote
in Activity 2? What assumption does this code make about v and w?
va := |v, w -> [(v(i) + w(i)) mod 5 : i in [1..#v]]|;
Use va to add the following tuples in (Z_5)^n. Can you add these tuples
using vec_add?
(a) [2, 2, 1] + [3, 0, 4]
(b) [0, 1, 0, 1] + [1, 2, 3, 4]
(c) [1, 2] + [2, 1]
5. (a) Write an ISETL func sm that accepts an element k from Z_5 and
a tuple v, and returns the tuple kv in which each component of v
has been multiplied (mod 5) by k.
(b) Test your func for k = 3 and v = [2, 4].
(c) Test your func for k = 0 and v = [1, 3, 3].
(d) Test your func for k = 1 and v = [3, 2, 4, 1].
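A version of sm that works for tuples of any length (a sketch modeled on the func va of Activity 4):

```isetl
sm := |k, v -> [(k * v(i)) mod 5 : i in [1..#v]]|;

sm(3, [2, 4]);          $ [1, 2]
sm(0, [1, 3, 3]);       $ [0, 0, 0]
```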
6. (a) Write an ISETL func is_closed_va that accepts a set V of tuples
and an operation va (vector addition). Your func should test
whether the sum of any two tuples in V is again in V.
(b) Test your func on V = (Z_5)^2, with va defined in Activity 4.
(c) Test your func on V = (Z_3)^3. Modify va appropriately, using
mod 3 arithmetic.
(d) Test your func on V = (Z_2)^4. Modify va appropriately, using
mod 2 arithmetic.
7. (a) Write an ISETL func is_commutative that accepts a set V of
vectors (tuples) and an operation va and determines whether or
not the operation va is commutative on V.
(b) Test your func on V = (Z_5)^2 and va.
(c) Test your func on V = (Z_3)^3 and an appropriately modified va.
(d) Test your func on V = (Z_2)^4 and an appropriately modified va.
8. (a) Write an ISETL func is_associative_va that accepts a set V of
vectors (tuples) and an operation va, and determines whether or
not va is associative on V.
(b) Test your func on V = (Z_5)^2 and va.
(c) Test your func on V = (Z_2)^2 and an appropriately modified va.
9. Explain the following ISETL code. What are the inputs to this func?
What does this func return?
has_zerovec := func(V, va);
VZERO := choose z in V | forall v in V | (v .va z) = v;
return VZERO;
end;
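For example, with V = (Z_5)^2 and the va of Activity 4, has_zerovec should pick out the tuple [0, 0]:

```isetl
V := {[x, y] : x, y in {0..4}};
va := |v, w -> [(v(i) + w(i)) mod 5 : i in [1..#v]]|;
has_zerovec(V, va);     $ [0, 0]
```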
10. (a) Use the func has_zerovec to write a new func has_vinverses
that accepts a set V of tuples and an operation va and determines
whether or not for each x in V there is a y in V with the property
that va(x, y) = the result of has_zerovec(V, va).
(b) Test your func on V = (Z_5)^2 and va.
(c) Test your func on V = (Z_3)^3 and an appropriately modified va.
(d) Test your func on V = (Z_2)^4 and an appropriately modified va.
11. Explain the following ISETL code:
is_closed_sm := func(K, V, sm);
return forall k in K, v in V | (k .sm v) in V;
end;
12. Write an ISETL func is_associative_sm that accepts a set K of
scalars, a set V of vectors, and two operations, sm (scalar multipli-
cation) and ms (multiplication of scalars). Your func should determine
whether for all k, j in K and all v in V, k(jv) = (kj)v. Note that the
right hand side of this equation uses multiplication of scalars as well
as scalar multiplication. What is the difference? Test your func on
(Z_2)^2.
13. What does the following ISETL func do? What are the inputs? The
outputs?
has_distributive1 := func(K, V, sm, va);
return forall k in K, v, w in V |
(k .sm (v .va w)) = (k .sm v) .va (k .sm w);
end;
14. Write an ISETL func has_distributive2 that accepts a set K of
scalars, a set V of tuples, and three operations: va, vector addition;
sm, scalar multiplication; and as, addition of scalars. The action of your
func is to determine whether the following expression holds for all k, j
in K and v in V: (k + j)v = kv + jv.
15. Write an ISETL func has_identityscalar that accepts a set K of
scalars, a set V of vectors (tuples), and an operation sm, scalar multi-
plication. The action of your function is to determine whether there is
an element k in K such that for all v in V, sm(v, k) = v.
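One possible shape for has_identityscalar (a sketch; note that it applies sm with the scalar first, matching the funcs of the earlier activities, while the specification above writes sm(v, k)):

```isetl
has_identityscalar := func(K, V, sm);
  return exists k in K | (forall v in V | (k .sm v) = v);
end;
```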
Discussion
In these activities you created tuples with components chosen from Z_2,
Z_3, or Z_5, and wrote code to perform operations on those tuples. Such
tuples are more commonly known as vectors. A vector can be any tuple
with components in a set K of scalars. In ISETL we denote vectors by
v = [v_1, v_2, ..., v_n]. In mathematical notation we write
v = ⟨v_1, v_2, ..., v_n⟩. The numbers v_i are known as the components
of the vector.
Any specific vector might be thought of as living in several different
spaces. For example, v = ⟨2, 2⟩ could be an element of the space (Z_3)^2,
or of (Z_5)^2, or of R^2. (Why can it not be an element of (Z_2)^2?) If we
choose to work within a specific space K^n, then we can combine vectors
with each other using an operation of vector addition. The addition is
done component-wise. For example, if we are working with 2-tuples with
entries from Z_5, the sum of v = ⟨v_1, v_2⟩ and w = ⟨w_1, w_2⟩ is
⟨(v_1 + w_1) mod 5, (v_2 + w_2) mod 5⟩. We also have an operation of scalar
multiplication which allows us to combine scalars with vectors. This mul-
tiplication is also done component-wise. There is a natural relationship be-
tween this vector addition and scalar multiplication that is very satisfying.
For example, v + v results in the same vector as 2v. Linear algebra is built
on these two operations of adding vectors and multiplying by scalars.
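For example, in (Z_5)^2 with the func va of Activity 4:

```isetl
va := |v, w -> [(v(i) + w(i)) mod 5 : i in [1..#v]]|;
[2, 3] .va [4, 4];     $ [1, 2], since 6 mod 5 = 1 and 7 mod 5 = 2
[2, 3] .va [2, 3];     $ [4, 1], the same as the scalar multiple 2v
```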
In these Activities you worked exclusively with finite sets of scalars and
vectors. This is because much of our work in ISETL requires us to be able to
define finite sets. However, many of the real-world applications of vectors
that you will see in this and other courses deal with infinite sets of scalars
and vectors. For example, R^2 is the set of all ordered pairs of real numbers.
The vectors in R^2 have both a physical (as forces or velocities) and geometric
interpretation. In R^2, vectors can be thought of as quantities that have both
a direction and magnitude. We can represent the vector v = ⟨4, 2⟩ by an
arrow in a two-dimensional plane. The arrow will start at the origin (0, 0)
and end at the point (4, 2). (See Figure 2.1.) Such a vector has a magnitude
and direction, and shows both of its components simultaneously. The vector
w = ⟨2, 1⟩ has the same direction as v but is half as long. Can you see how
to use the Pythagorean theorem to find the length of such vectors? What is
the relationship between the length of a vector v and the length of 2v? Of
v and kv?

Figure 2.1: A vector in R^2
Of course not all of the arrows in 2-space originate at (0, 0). We will
consider two vectors v_1 and v_2 to be equivalent if they have the same length
and direction, even if they originate at different points. (See Figure 2.2.)

Figure 2.2: Parallel vectors in R^2
Such vectors are obviously parallel arrows. One reason for allowing vectors
to start at different points is to be able to visualize the sum of two vectors.
In Figure 2.3 we can form the vector v + w by translating w so that the
start of w is placed at the end of v. Then v + w is the arrow drawn from
the start of v to the end of the translated vector w. Note that this geometric
vector addition produces a parallelogram. To get v + w we can travel along
v and then along w, or we can take the shortcut along the diagonal v + w of
the parallelogram.

Figure 2.3: Adding vectors in R^2 (v = ⟨4, 2⟩, w = ⟨−1, 2⟩)
Use algebra to check that the geometric addition in Figure 2.3 is correct.
In other words, is v + w = ⟨4, 2⟩ + ⟨−1, 2⟩ equal to the vector ⟨3, 4⟩?
Given a vector v, what would we mean by the vector −v? How would
you draw −v in the real plane? What is the relationship between the length
and direction of v and that of −v? How can we combine vector addition and
multiplication by −1 to obtain vector subtraction?
Ordered triples of real numbers can also be thought of as vectors and
visualized geometrically. In order to do this we need an xyz coordinate
system. The set of all such ordered triples is known as 3-space or R^3. Vectors
that live in spaces with more than 3 components are not so easily visualized.
However, many of the results and techniques of vector arithmetic are useful
in such situations where there is no direct geometric significance. This leads
us to the following definition.
Definition 2.1.1. The set of all sequences ⟨v_1, v_2, ..., v_n⟩ of real numbers
is called Real n-space and is denoted R^n.
In Activities 5-15, you wrote or explained several ISETL funcs that
checked various properties of vector addition and scalar multiplication. Sys-
tems in which these particular properties are satisfied turn out to be very
useful in the study of linear algebra. We will explore such systems further in
the next section.
Exercises
1. Compute the following vector expressions for
v = ⟨2, 3⟩, u = ⟨3, 1⟩, and w = ⟨8, 0⟩.
(a) (1/2)w
(b) v + u
(c) v + u + w
(d) 2v + 3u + w
2. (a) Draw the vectors v = ⟨4, 1⟩ and (1/2)v in a single xy-plane.
(b) Draw the vectors v = ⟨4, 1⟩ and w = ⟨2, 2⟩ and v + w and v − w
in a single xy-plane.
3. Compute the following vector expressions for
v = ⟨1, 2, 3⟩, u = ⟨3, 1, 2⟩, and w = ⟨2, 3, 1⟩.
(a) v + w
(b) v + 3u
(c) w + u
(d) 5v − 2u + 6w
(e) 2v − 3u − 4w
4. To what number do the components of every scalar multiple of v =
⟨2, 1, −3⟩ add up?
5. Use the Pythagorean theorem to find the length of the following vectors
in R^2.
(a) ⟨4, 3⟩
(b) ⟨2, 0⟩
(c) ⟨1, 2⟩
(d) 3⟨1, 2⟩
(e) ⟨0, 0⟩
6. Extend the notation of the length of a vector to R^n by

length(v) = √((v_1)^2 + (v_2)^2 + ... + (v_n)^2).
Find the length of the following vectors.
(a) ⟨2, 4, 3⟩
(b) 2⟨2, 4, 3⟩
(c) ⟨2, 0, 0⟩
(d) ⟨1, 1, 0, 2⟩
(e) ⟨5, 5, 5, 5⟩
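The length formula above could also be computed in ISETL for a tuple of any size; this sketch assumes ISETL provides a sqrt function and the %+ summation operator:

```isetl
length := func(v);
  return sqrt(%+ [v(i) * v(i) : i in [1..#v]]);
end;

length([3, 4]);    $ the familiar 3-4-5 right triangle gives length 5
```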
7. A unit vector is a vector of length one. Find 3 distinct unit vectors in
R^2. Find 4 distinct unit vectors in R^3.
8. Is the sum of any two unit vectors a unit vector? Give a proof or
counterexample.
9. Let v = ⟨5, 3, 4⟩. Find a scalar k in R such that kv is a unit vector.
10. If three corners of a parallelogram are (1, 1), (4, 2) and (1, 3), what are
all the possible fourth corners? Draw two of them.
11. Let v = ⟨1, −2, −1⟩ and w = ⟨0, 1, 1⟩. Find scalars k and j so that
kv + jw = ⟨4, 2, 6⟩.
2.2 Introduction to Vector Spaces
Activities
1. Following is a list of some funcs that you worked with in the previous
section.
is_closed_va
is_commutative
is_associative_va
has_zerovec
has_vinverses
is_closed_sm
is_associative_sm
has_distributive1
has_distributive2
has_identityscalar
Write a description of what each func does, including the kind of ob-
jects accepted, what is done to them, and the kind of object that is
returned.
2. (a) Construct in ISETL a set K = Z_3 of scalars, a set V = (Z_3)^3 of
vectors, and four operations: va (vector addition), which is addi-
tion mod 3 of elements in V; sm (scalar multiplication), which is
multiplication mod 3 of elements in V by elements in K; as (ad-
dition of scalars), which is addition mod 3; and ms (multiplication
of scalars), which is multiplication mod 3.
For example, you could write and store code such as:
K:={0..2}; V:={[x,y,z]| x,y,z in K};
va:=|v,u->[(v(i)+u(i)) mod 3 : i in [1..3]]|;
sm:=|k,v -> [(k*v(i)) mod 3: i in [1..3]]|;
as:=|k,j->(k+j) mod 3|; ms:=|k,j -> (k*j) mod 3|;
(b) Apply each of your funcs from Activity 1 to this system, [K,
V, va, sm, as, ms]. Create a table with the funcs as column
headings and this system as the first row, and use the table to
keep track of which properties are satisfied by this system.
3. Repeat Activity 2 for each of the following systems. Add a new row to
your table for each system.
(a) K = Z_5, V = (Z_5)^2, va is addition mod 5 of elements in V, sm is
multiplication mod 5 of elements in V by elements in K, as and
ms are addition and multiplication mod 5, respectively.
(b) K = Z_3, V = {⟨x, x, x⟩ : x ∈ K}, va is addition mod 3 of elements
in V, sm is multiplication mod 3 of elements in V by elements in
K, as and ms are addition and multiplication mod 3, respectively.
(c) K = Z_5, V = {⟨x, y⟩ : x, y ∈ {1, 3}}, va is addition mod 5 of
elements in V, sm is multiplication mod 5 of elements in V by
elements in K, as and ms are addition and multiplication mod 5,
respectively.
(d) K = Z_5, V = {⟨x, 0, 0⟩ : x ∈ K}, va is addition mod 5 of elements
in V, sm is multiplication mod 5 of elements in V by elements in
K, as and ms are addition and multiplication mod 5, respectively.
(e) K = Z_5, V = {⟨x, 1⟩ : x ∈ K}, va is addition mod 5 of elements
in V, sm is multiplication mod 5 of elements in V by elements in
K, as and ms are addition and multiplication mod 5, respectively.
(f) K = Z_2, V = (Z_2)^5, va is addition mod 2 of elements in V, sm is
multiplication mod 2 of elements in V by elements in K, as and
ms are addition and multiplication mod 2, respectively.
(g) K = {0}, V = (Z_3)^2, va is addition mod 3 of elements in V, sm is
multiplication mod 3 of elements in V by elements in K, as and
ms are ordinary addition and multiplication, respectively.
(h) K = Z_7, V = (Z_7)^1, va and as are addition mod 7; sm and ms are
multiplication mod 7.
(i) K = Z_5, V = {⟨x, y⟩ : x, y ∈ {0, 2, 4}}, va is addition mod 5 of
elements in V, sm is multiplication mod 5 of elements in V by
elements in K, as and ms are addition and multiplication mod 5,
respectively.
(j) K = Z_5, V = (Z_2)^3, va is addition mod 5 of elements in V, sm is
multiplication mod 2 of elements in V by elements in K, as and
ms are addition and multiplication mod 5, respectively.
(k) K = Z_3, V = (Z_3)^3, va is addition mod 3 of elements in V, sm
is defined by k⟨x, y, z⟩ = ⟨0, 0, 0⟩, as and ms are addition and
multiplication mod 3, respectively.
(l) K = Z_5, V = (Z_5)^2, va is addition mod 5 of elements in V, sm
is defined by k⟨x, y⟩ = ⟨x, y⟩, as and ms are addition and
multiplication mod 5, respectively.
4. Which systems from Activities 2 and 3 satisfy all ten properties from
Activity 1? Can you conjecture conditions on K = Z_p, V = (Z_q)^n, and the
four operations so that such a system will satisfy all ten properties?
5. Here is a list of some more systems [K, V, va, sm, as, ms]. Which
of these systems satisfy all of the properties in Activity 1? Note: Most
of the following systems can only be constructed and run in VISETL
(Virtual ISETL). This means that all of your work must be done by
hand and in your mind.
(a) K = {1, −1}, V = K^3, va is ordinary component-wise multipli-
cation, sm is ordinary component-wise multiplication, as and ms are
ordinary addition and multiplication, respectively.
(b) K = R, V = R^2, va is ordinary component-wise addition, and sm
is ordinary component-wise multiplication, as and ms are ordinary
addition and multiplication, respectively.
(c) K = R, V = R^2, va is ordinary component-wise addition, and sm
is defined by k⟨x, y⟩ = ⟨kx, 3ky⟩, as and ms are ordinary addition
and multiplication, respectively.
(d) K = R, V = the set of 1-tuples ⟨x⟩ with x a positive real number,
va is defined by ⟨x⟩ + ⟨y⟩ = ⟨xy⟩, sm is defined by k⟨x⟩ = ⟨x^k⟩,
as and ms are ordinary addition and multiplication, respectively.
(e) K = R, V = {⟨x, 0, x⟩ : x ∈ R}, with va, sm, as, and ms defined
as usual for R^3.
6. Write a func is_vector_space that accepts a set K of scalars, a set
V of vectors, and four operations va, sm, as, and ms defined on V
and K, and tests whether all of the properties listed in Activity 1 are
satisfied. Your func should return true if the system satisfies all ten
properties, and false if it fails to satisfy one or more of them. Test
your func on some of the systems defined in Activity 2.
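Assuming the funcs from Activity 1 are already defined (and that has_zerovec returns OM when no zero vector exists), is_vector_space might simply chain them together; the parameter orders shown here follow the specifications given in the previous section:

```isetl
is_vector_space := func(K, V, va, sm, as, ms);
  return is_closed_va(V, va) and is_commutative(V, va)
         and is_associative_va(V, va)
         and (has_zerovec(V, va) /= OM)
         and has_vinverses(V, va)
         and is_closed_sm(K, V, sm)
         and is_associative_sm(K, V, sm, ms)
         and has_distributive1(K, V, sm, va)
         and has_distributive2(K, V, va, sm, as)
         and has_identityscalar(K, V, sm);
end;
```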
Discussion
In the activities at the beginning of this section you constructed several
mathematical systems and examined their properties. More specifically, you
constructed certain sets of vectors and sets of scalars, defined operations on
them, and studied various properties of these sets under the defined opera-
tions.
The ten properties listed in Activity 1 are satisfied by many important
mathematical systems. Rather than study each system separately, we are
going to collectively consider all systems that satisfy these ten properties.
We begin with the definition of such systems.
Definition 2.2.1. A set V of objects called vectors, together with the binary
operations of vector addition and scalar multiplication, is said to be a vector
space over a field of scalars K if for all u, v, and w in V and all k, j in K the
following axioms are satisfied:
Axiom 1: u + v ∈ V (closure under vector addition).
Axiom 2: u + v = v + u (commutativity of vector addition).
Axiom 3: (u + v) + w = u + (v + w) (associativity).
Axiom 4: There is a vector 0 ∈ V such that v + 0 = v (zero vector).
Axiom 5: For each v ∈ V there is a unique element −v ∈ V such that
v + (−v) = 0 (vector inverses).
Axiom 6: kv ∈ V (closure under scalar multiplication).
Axiom 7: (kj)v = k(jv) (associativity of scalar multiplication).
Axiom 8: k(u + v) = ku + kv (first distributive law).
Axiom 9: (k + j)v = kv + jv (second distributive law).
Axiom 10: There is an element 1 ∈ K such that for every v in V, 1v = v
(identity scalar).
We digress for a moment to discuss this field of scalars mentioned in
the definition. Scalars are just numbers, but what is a field? A field is a
set of objects (usually numbers), together with two operations (addition and
multiplication) defined on the set, that collectively satisfy many properties
that you have seen in your previous work with the real number system. That
is, a field has all the standard properties of the real numbers, including closure
under both operations, additive and multiplicative identities and
inverses, and properties such as commutativity, associativity, and the dis-
tributive laws. There are both finite and infinite fields. R is obviously an
infinite field, as are the rational numbers Q and the complex numbers C.
However, Z is not a field. Why not? Is {0} a finite field? It turns out
that if p is a prime number, then Z_p forms a finite field under the opera-
tions of addition and multiplication mod p. The system Z_m is not a field
for m not prime, because (among other reasons) not all elements in Z_m have
multiplicative inverses.
We will not in general worry about the specific details of a field in this
course. Henceforth we will generally restrict our scalars to one of the fields
Q, R, C, or Z_p. In each case, the operations (as, ms) of addition and mul-
tiplication of scalars are henceforth understood to be ordinary addition and
multiplication, or addition and multiplication mod p, so we no longer need to
specify them.
Finite Vector Spaces
We now generalize from some of the systems you worked with in the activities
to find examples of vector spaces. In cases where we do find a vector space,
we will prove that fact. Where we do not, we will investigate the vector
space axioms that are violated. The examples we will consider fall naturally
into two types: finite and infinite vector spaces.
Your work in the Activities should have convinced you that finite sys-
tems such as K = Z_3, V = (Z_3)^3, with component-wise addition mod 3 and
component-wise multiplication mod 3, or K = Z_5, V = (Z_5)^2, with the cor-
responding operations defined mod 5, do satisfy all ten axioms of a vector
space. In fact, you may have conjectured the following theorem.
Theorem 2.2.1. For any positive integer n, and any prime p, (Z_p)^n forms
a vector space over Z_p.
Note that in our theorem we have not mentioned the operations va, sm,
as, or ms. Why not? Your work in the Activities should have convinced you
that there is a natural choice for these operations. In order for the system to
form a vector space, the operations will be done mod p. The theorem does
require that p be prime; (Z_m)^n is only a vector space when Z_m is a field.
Proof. For a particular p and n we could always use ISETL to check all
ten axioms. However, the theorem holds for all primes p, so we will not
specify a particular one. We prove the theorem for the case n = 2 and leave
the generalization to any n for Exercise 16. Our proof consists of running
through the axioms for a vector space, and citing appropriate properties of
addition and multiplication mod p which you learned about in Chapter 1.
closure: v + w = ⟨(v_1 + w_1) mod p, (v_2 + w_2) mod p⟩ ∈ (Z_p)^2 since the re-
mainder of v_i + w_i is always between 0 and p − 1.
commutativity:
v + w = ⟨(v_1 + w_1) mod p, (v_2 + w_2) mod p⟩
= ⟨(w_1 + v_1) mod p, (w_2 + v_2) mod p⟩
= w + v.
associativity: Since mod p addition is associative, the component-wise mod
p addition is also associative.
zero vector: The vector 0 = ⟨0, 0⟩ ∈ (Z_p)^2 and
v + 0 = ⟨(v_1 + 0) mod p, (v_2 + 0) mod p⟩ = ⟨v_1, v_2⟩ = v.
vector inverses: The inverse of v = ⟨v_1, v_2⟩ is ⟨p − v_1, p − v_2⟩. Why?
closure: kv = ⟨(kv_1) mod p, (kv_2) mod p⟩ ∈ (Z_p)^2.
associativity: The associativity of multiplication mod p is inherited from
the integers. Component-wise multiplication mod p is therefore asso-
ciative.
distributive law 1:
k(v + w) = k⟨(v_1 + w_1) mod p, (v_2 + w_2) mod p⟩
= ⟨k(v_1 + w_1) mod p, k(v_2 + w_2) mod p⟩
= ⟨(kv_1 + kw_1) mod p, (kv_2 + kw_2) mod p⟩
= ⟨kv_1, kv_2⟩ + ⟨kw_1, kw_2⟩
= kv + kw
distributive law 2:
(k + j)v = ⟨((k + j)v_1) mod p, ((k + j)v_2) mod p⟩
= ⟨(kv_1 + jv_1) mod p, (kv_2 + jv_2) mod p⟩
= ⟨(kv_1) mod p, (kv_2) mod p⟩ + ⟨(jv_1) mod p, (jv_2) mod p⟩
= kv + jv
identity scalar: Clearly 1 ∈ Z_p and 1v = v.
In the Activities you discovered some finite vector spaces that were not
of the form (Z_p)^n. For example, the system in Activity 3(b), where K = Z_3,
V = {⟨x, x, x⟩ : x ∈ K}, forms a vector space. Note that all the vectors in
this space have identical components, and all these vectors are also in the
vector space (Z_3)^3. So, to determine whether or not this V is a vector space,
we do not need to check all ten axioms. Axioms 2, 3, 7, 8, 9, and 10 are
automatically true for this subset V since they are true for all vectors in
(Z_3)^3 and scalars in K. Closure axioms 1 and 6 are fairly easily checked,
since adding two vectors with identical components must result in a vector
of identical components, and multiplying ⟨x, x, x⟩ by any element of Z_3 will
result in a vector with identical components. It is clear that ⟨0, 0, 0⟩ is the
zero vector in V and that the vector inverse ⟨3 − x, 3 − x, 3 − x⟩ of ⟨x, x, x⟩
is also in V, and thus axioms 4 and 5 are also satisfied. What other subsets of
(Z_p)^n did you find to be vector spaces? Can you think of additional examples
that were not explored in the Activities?
Some of the finite systems you worked with in the Activities were not
vector spaces. For example, the system in Activity 3(l), where K = Z_5, V =
(Z_5)^2, and k⟨x, y⟩ = ⟨x, y⟩, is not a vector space. To determine this,
we do not need to check the first five axioms because they only involve vector
addition, and our theorem guarantees that this system satisfies the vector
addition axioms. We do need to check axioms 6 through 10, since they all
involve some form of scalar multiplication. Is this system closed under scalar
multiplication? Does the theorem guarantee that? What is the identity
scalar? Is there more than one in this case? Which of the distributive laws
does not hold?
In the activities you also learned that the system K = {0}, V = (Z_3)^2
is not a vector space. Why not? Which axiom does it fail to satisfy? If we
change K to Z_3, would the system be a vector space? What if we change K
to Z_2?
Why is the system K = Z_5, V = {⟨x, y⟩ : x, y ∈ {1, 3}} not a vector
space? How many axioms fail? Would changing K correct the prob-
lems? Does the system K = Z_5, V = {⟨x, y⟩ : x, y ∈ {0, 2, 4}} fail the same
axioms or different ones? Can you fix K so that these systems will be
vector spaces?
Infinite Vector Spaces
Now we turn our attention to infinite vector spaces. Consider V = R^2, with
vector addition and scalar multiplication defined by the ordinary component-
wise operations. Is V a vector space over the real numbers? Is R^3 a vector
space? R^n? We answer these questions with a theorem.
Theorem 2.2.2. Let n be a positive integer. The space R^n of ordered n-
tuples with components from R is a vector space over R.
Proof. We cannot use ISETL to prove this theorem (why not?), but we note
that many of the vector space axioms are true as a consequence of properties
of the real numbers. We only need to check the component-wise application
of these properties. We now prove a few of the axioms for n = 2 and leave
the rest to Exercise 17.
closure: v + u = ⟨v_1, v_2⟩ + ⟨u_1, u_2⟩ = ⟨v_1 + u_1, v_2 + u_2⟩ ∈ R^2
commutativity: Exercise 17.
associativity:
(v + u) + w = ⟨v_1 + u_1, v_2 + u_2⟩ + ⟨w_1, w_2⟩
= ⟨v_1 + u_1 + w_1, v_2 + u_2 + w_2⟩
= ⟨v_1, v_2⟩ + ⟨u_1 + w_1, u_2 + w_2⟩ = v + (u + w)
zero vector: Exercise 17.
inverses: If v ∈ R^2 then −v = ⟨−v_1, −v_2⟩ ∈ V, and v + (−v) = ⟨0, 0⟩.
closure: Exercise 17.
associativity: Exercise 17.
distributive law 1:
k(v + u) = k(⟨v_1, v_2⟩ + ⟨u_1, u_2⟩)
= k⟨v_1 + u_1, v_2 + u_2⟩
= ⟨kv_1 + ku_1, kv_2 + ku_2⟩
= ⟨kv_1, kv_2⟩ + ⟨ku_1, ku_2⟩ = kv + ku
distributive law 2: Exercise 17.
identity scalar: 1 ∈ R, and 1v = 1⟨v_1, v_2⟩ = ⟨1v_1, 1v_2⟩ = ⟨v_1, v_2⟩ = v.
Since R^n is a vector space, it seems reasonable to believe that C will also
be a vector space over R. In this case V = {⟨a + bi⟩ : a, b ∈ R}. Scalar
multiplication is defined by k⟨a + bi⟩ = ⟨ka + kbi⟩, and vector addition by
⟨a + bi⟩ + ⟨c + di⟩ = ⟨(a + c) + (b + d)i⟩. You will verify the vector space
axioms in Exercise 5.
Is C^n a vector space over R? Is it a vector space over C? Is Q^n a
vector space over Q? Why is Q^n not a vector space over R? Which closure
axiom fails? In Activity 5, did you find infinite vector spaces that were not
of the form K^n for some field K? Is K = R, V = {⟨x, 0, x⟩ : x ∈ R}
a vector space? How would you verify closure under vector addition and
scalar multiplication? Knowing that the operations in this space are the
same as those for R^3, do you need to check commutativity, associativity, or
the distributive laws? Does V contain a zero vector? What is the vector
inverse of ⟨x, 0, x⟩ in V?
Does the system in Activity 5(d) satisfy the commutativity, associativity,
and distributive axioms? Since va and sm are not the usual operations on
R, we need to check. What is the zero vector? What is the inverse of the
vector ⟨x⟩? Is this system a vector space?
In Activity 5(a) you should have discovered that the system K = {1, −1},
V = K^3, with va given by ordinary component-wise multiplication, is not a
vector space. Why not? Which axiom does it fail to satisfy? Is the system in
Activity 5(c) closed under scalar multiplication? Is this system a vector
space?
2.2 Introduction to Vector Spaces 69
The vectors in a vector space do not necessarily have to be tuples of numbers. Polynomials and functions defined on a set S can also play the role of vectors. Vector spaces turn up in a wide variety of subjects. For example, vector spaces arise naturally in the study of solutions of systems of equations, geometry in 3-space, solutions of differential and integral equations, discrete and continuous Fourier transforms, quantum mechanics, and approximation theory.
Note that we are sometimes sloppy and write things such as "Let V = (Z₅)² be a vector space" with no specific mention of the corresponding field or operations. Technically this is incorrect. Why? In order to be a vector space, we have to specify not only the set V of vectors, but also the set K of scalars and the operations of vector addition and scalar multiplication. In many cases the scalars and operations are unambiguous, and so we just describe the set V of vectors. Henceforth, when K is not specified, you may assume it is Zₚ if V is finite, or R if V is infinite. The operations va and sm are the standard operations on (Zₚ)ⁿ or Rⁿ unless otherwise specified.
Non-Tuple Vector Spaces
There are two non-tuple vector spaces which we will discuss throughout this text. We present them here by beginning with the following theorem.
Theorem 2.2.3. The set P(K) is a vector space over K with the standard polynomial arithmetic. For any n, the set Pₙ(K) is a vector space over K with the standard polynomial arithmetic.
Proof. Left as an exercise (see Exercise 11).
This result is not very surprising because polynomials are really just tuples of numbers. Recall the definitions of pointwise operations on functions. If f and g are functions with the same domain and range, and addition and multiplication are defined on the range of f and g, then we can define f + g to be the function x ↦ f(x) + g(x) and kf to be the function x ↦ kf(x). Not only do the polynomials form a vector space, but they do so when interpreted as functions as well.
Theorem 2.2.4. The set PF(K) is a vector space over K with pointwise addition and scalar multiplication. For any n, the set PFₙ(K) is a vector space over K with pointwise addition and scalar multiplication.
Proof. Left as an exercise (see Exercise 13).
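ISETL is the language of this text, but the pointwise operations just defined are easy to sketch in any language. Here is a Python analogue (a minimal illustration of ours; the function names are not the book's):

```python
def pointwise_sum(f, g):
    # f + g is the function x -> f(x) + g(x)
    return lambda x: f(x) + g(x)

def scalar_multiple(k, f):
    # kf is the function x -> k * f(x)
    return lambda x: k * f(x)

# two polynomial functions, viewed as elements of PF(R)
p = lambda x: x**2 + 1
q = lambda x: 3 * x

s = pointwise_sum(p, q)       # the function x -> x^2 + 3x + 1
t = scalar_multiple(2, p)     # the function x -> 2x^2 + 2
print(s(2), t(2))             # 11 10
```

The point of the sketch is that the "vectors" here are functions, and the vector operations produce new functions.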
The polynomial functions actually form only a small subset of a much larger collection, the infinitely differentiable functions on R. We make the following definition.
Definition 2.2.2. The collection of infinitely differentiable functions on R consists of all functions f : R → R for which f and all of its derivatives are defined on all of R. This set will be denoted by C^∞(R).
It should be clear that PF(R) ⊆ C^∞(R), but C^∞
⟨(x₁⁵ + x₂⁵)^(1/5), (y₁⁵ + y₂⁵)^(1/5)⟩,
and scalar multiplication is defined by
k⟨x, y⟩ = ⟨k^(1/5)x, k^(1/5)y⟩.
10. Consider the set P3 = {a₀ + a₁x + a₂x² + a₃x³ : a₀, a₁, a₂, a₃ ∈ R, a₃ ≠ 0}, the set of all polynomials of degree three with coefficients in R. Show that P3 does not form a vector space over R under polynomial addition and scalar multiplication.
11. Prove Theorem 2.2.3.
12. Generalize the result of the previous exercise. That is, show that Pₙ(R), the set of all polynomials of degree n or less, forms a vector space over R.
13. Prove Theorem 2.2.4.
14. Prove Theorem 2.2.5.
15. Does the set of all real-valued discontinuous functions on S form a vector space over R under pointwise addition and scalar multiplication? Why not?
16. Generalize the proof of Theorem 2.2.1 for n ≥ 3.
17. Complete the proof of Theorem 2.2.2.
18. Complete the proof of Theorem 2.2.6.
19. Let V be a vector space. Prove that for every u, v ∈ V there is a unique vector w ∈ V such that w + v = u. How does this property relate to the operation of vector subtraction?
2.3 Subspaces
Activities
1. Use the ISETL func subset on the pairs below to determine when W
is a subset of V .
(a) W = {⟨x, 1, 0⟩ : x ∈ Z₅}, V = (Z₅)³.
(b) W = {⟨x, y⟩ : x, y ∈ Z₃}, V = (Z₅)².
(c) W = {⟨x, 0⟩ : x ∈ Z₅}, V = (Z₅)³.
(d) W = V = (Z₂)⁴.
2. Write an ISETL func is_subspace that accepts a set W and a vector space (that is, [K, V, va, sm]). The action of your func is to determine whether or not W is a nonempty subset of V, and whether W is also a vector space over K using va and sm. Test your func on each of the systems below.
(a) W₁ = {⟨1, 2⟩, ⟨2, 1⟩, ⟨0, 0⟩}, V = (Z₃)²
(b) W₂ = {⟨0, 0, x⟩ : x ∈ Z₃}, V = (Z₃)³
(c) W₃ = {⟨x, y, z, w⟩ : x, y, z, w ∈ Z₃, x + y = 2, z + w = 1}, V = (Z₃)⁴.
(d) W₄ = {⟨x, y⟩ : x, y ∈ Z₂}, V = (Z₃)².
(e) W₅ = {⟨x, x⟩ : x ∈ Z₅}, V = (Z₅)²
(f) W₆ = {⟨1, 1, 1⟩}, V = (Z₂)³
(g) W₇ = {⟨0, 0, 0, 0⟩}, V = (Z₂)⁴
(h) W₈ = {⟨x, 3, z⟩ : x, z ∈ Z₅}, V = (Z₅)³.
(i) W₉ = {⟨x, y, 0⟩ : x, y ∈ Z₅}, V = (Z₅)³.
(j) W₁₀ = {⟨x, y, z⟩ : x, y, z ∈ Z₅, x + y = z}, V = (Z₅)³.
(k) W₁₁ = {⟨x, y⟩ : x, y ∈ Z₅, x + y = 0}, V = (Z₅)².
(l) W₁₂ = W₅ ∪ W₁₁, V = (Z₅)².
(m) W₁₃ = W₉ ∩ W₁₀, V = (Z₅)³.
3. Find a subspace of (Z₅)³ that is not W₈, W₉, or W₁₀. Use your func is_subspace to verify that your subset is a subspace.
4. Write an ISETL func is_subspace2 that accepts a set W and a vector space [K, V, va, sm]. The action of your func is to determine whether or not W is a nonempty subset of V, and whether or not W satisfies the vector space axioms 1, 4, 5, and 6. Test your func on the 13 systems in Activity 2.
5. Compare your results from Activities 2 and 4. For which systems do both funcs return true? For which systems do both funcs return false? Can you make a conjecture about the equivalence of is_subspace and is_subspace2?
6. Write an ISETL func that accepts as inputs a set W and a vector space [K, V, va, sm]. The action of your func is to determine whether or not W is a nonempty subset of V, and whether for all k ∈ K and w₁, w₂ ∈ W, kw₁ + w₂ ∈ W. Test your func on the systems given in Activity 2.
7. Compare your results from Activities 2 and 6. For which systems do both funcs return true? For which systems do both funcs return false? Can you make a conjecture about the equivalence of these two funcs?
Discussion
In these activities you explored subsets of vector spaces. In each case you worked with a subset of vectors from a vector space V over a field K, and you used the same operations of vector addition, scalar multiplication, and addition and multiplication of scalars that were defined for V and K. Sometimes this subset formed a vector space itself, and sometimes it did not. There is no general rule that would allow you to determine by inspection alone when such a subset will form a vector space, but we can make the following definition.
Definition 2.3.1. Let [K, V, va, sm] be a vector space over the field K, and let W be a nonempty subset of V. If [K, W, va, sm] is again a vector space over K, then we say that W is a subspace of V.
Note that in order to be a subspace, W must first be a nonempty subset of the vectors in V, and W must also satisfy all of the vector space axioms using the operations va and sm as they were defined for V over K. So, although the set of vectors W = (Z₂)² is a subset of V = (Z₃)², and the system [Z₂, W, ·₂, +₂] is a vector space, W is not a subspace of V. Why not? There are two problems here: the vectors in V and W are defined over two different fields, and vector addition and scalar multiplication are done mod 2 in W whereas they are done mod 3 in V. We could of course use mod 3 arithmetic in W, but under these operations W will not be a vector space. Why not? Which vector space axioms are not satisfied by [Z₂, W, ·₃, +₃]?
Now consider the vector space R³ with the usual operations of vector addition and scalar multiplication, and the subset W = {⟨x, y, z⟩ : x + 2y + 3z = 0}. Is W a subset of R³? Does W have a geometric interpretation? Is W itself a vector space?
Determination of Subspaces
One way of answering that last question is to check each of the ten vector space axioms for the system [R, W, ·, +]. However, this is much more work than is really necessary. Since the operations of vector addition and scalar multiplication are exactly the same for both R³ and W, we do not need to recheck all of the vector space axioms for W. In fact, W will inherit commutativity, associativity, the distributive laws, and the scalar identity from R³. Why? Which axioms does this allow us to avoid checking? Which axioms do we still need to check?
Your work in Activities 4 and 5 should have convinced you that we need only check four axioms: Axioms 1, 4, 5, and 6. We now check these axioms for [R, W, ·, +].
Axiom 1: Let w₁ = ⟨x₁, y₁, z₁⟩ and w₂ = ⟨x₂, y₂, z₂⟩ be arbitrary vectors in W. Then w₁ + w₂ = ⟨x₁ + x₂, y₁ + y₂, z₁ + z₂⟩ and (x₁ + x₂) + 2(y₁ + y₂) + 3(z₁ + z₂) = (x₁ + 2y₁ + 3z₁) + (x₂ + 2y₂ + 3z₂) = 0 + 0 = 0, so w₁ + w₂ ∈ W, and W is closed under vector addition.
Axiom 4: Since 0 + 2·0 + 3·0 = 0, the vector 0 = ⟨0, 0, 0⟩ is in W. We do not need to check that w + 0 = w. Why not?
Axiom 5: Let w = ⟨x, y, z⟩ ∈ W. Since w ∈ R³, there is a vector −w ∈ R³ with w + (−w) = 0. We need to show that −w is in W. Since x + 2y + 3z = 0, −(x + 2y + 3z) = −x + 2(−y) + 3(−z) = 0, so −w ∈ W.
Axiom 6: Let k ∈ R and w = ⟨x, y, z⟩ ∈ W. Since x + 2y + 3z = 0, k(x + 2y + 3z) = kx + 2ky + 3kz = 0, so kw = ⟨kx, ky, kz⟩ ∈ W.
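The four checks above can also be spot-checked numerically. The following Python sketch (our own illustration, not part of the text) tests membership in W for a sum and a scalar multiple of vectors chosen from the plane:

```python
def in_W(v, tol=1e-9):
    # membership test for W = {<x, y, z> : x + 2y + 3z = 0}
    x, y, z = v
    return abs(x + 2 * y + 3 * z) < tol

w1 = (1.0, 1.0, -1.0)    # 1 + 2 - 3 = 0, so w1 is in W
w2 = (-2.0, 1.0, 0.0)    # -2 + 2 + 0 = 0, so w2 is in W

vec_sum = tuple(a + b for a, b in zip(w1, w2))   # closure under + (Axiom 1)
multiple = tuple(2.5 * a for a in w1)            # closure under sm (Axiom 6)
print(in_W(vec_sum), in_W(multiple))             # True True
```

Of course, a finite sample is evidence, not a proof; the algebraic argument above is what establishes closure for every vector in W.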
Thus W is in fact a subspace of R³. Recall that W has a familiar geometric interpretation as a plane through the origin in 3-space. Can you find another geometric subspace of R³?
Suppose W₂ = {⟨x, y, z⟩ : x + 2y + 3z = 2} is another plane in 3-space. How does W₂ differ from W? Is W₂ a subspace of R³? Which of Axioms 1, 4, 5, or 6 does W₂ fail to satisfy?
We can generalize these results in a theorem:
Theorem 2.3.1. A nonempty subset W of a vector space V over K is a subspace if and only if W is closed under the inherited vector addition and scalar multiplication, the zero vector is in W, and each vector w in W has a vector inverse −w in W.
Proof. (⇒) If W is a subspace of V over K, then W is itself a vector space and therefore satisfies all ten vector space axioms.
(⇐) The proof of this is similar to our work above and is left for Exercise 6.
In Activities 6 and 7, you may have observed that it is not necessary to check all four of these axioms separately. You may have conjectured the following theorem.
Theorem 2.3.2. A nonempty subset W of a vector space V over K is a subspace if and only if for all w₁, w₂ ∈ W and k ∈ K, kw₁ + w₂ ∈ W.
Proof. (⇒) Left as an exercise (see Exercise 7).
(⇐) We give only a rough sketch of the proof, and leave the details for Exercise 15. Use Theorem 2.3.1 so that we only need to verify four axioms for W. Assume kw₁ + w₂ ∈ W. If we choose k = 1, then we can easily show that W is closed under vector addition. Since W is nonempty, we can find a vector w ∈ W and let w = w₁ = w₂. Then by letting k = −1, one can show that 0 ∈ W. Still using k = −1, but letting w₂ = 0 (which we now know is in W), one can show that vector inverses are in W. Finally, letting w₂ be 0, and k, w be arbitrary will show that W is closed under scalar multiplication.
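Over a finite field, the single criterion of Theorem 2.3.2 can be checked by brute force, in the spirit of the ISETL func from Activity 6. Here is a Python sketch (the function name and details are ours), applied to two of the subsets from Activity 2:

```python
from itertools import product

def satisfies_criterion(W, p):
    # Theorem 2.3.2 test for a subset W of (Z_p)^n: W must be nonempty,
    # and k*w1 + w2 must lie in W for all k in Z_p and all w1, w2 in W.
    if not W:
        return False
    return all(
        tuple((k * a + b) % p for a, b in zip(w1, w2)) in W
        for k in range(p) for w1 in W for w2 in W
    )

p = 5
V = set(product(range(p), repeat=3))
W10 = {v for v in V if (v[0] + v[1]) % p == v[2]}   # x + y = z
W8 = {v for v in V if v[1] == 3}                    # y = 3
print(satisfies_criterion(W10, p), satisfies_criterion(W8, p))  # True False
```

W₁₀ passes (it is a subspace), while W₈ fails: for example, the sum of two of its vectors has middle component 3 + 3 = 1 in Z₅, which is not 3.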
Any vector space V will have at least two subspaces, the subspace V itself,
and the zero subspace (consisting solely of the vector 0). Why are these both
subspaces? Why are they called improper subspaces? Does every vector
space necessarily have proper subspaces?
Is R² a subspace of R³? Carefully re-read the definition of a subspace. Can you see why R² is not a subspace of R³? Is W₂ = {⟨x, y, 0⟩ : x, y ∈ R} a subspace of R³? Note that the subspace W₂ "looks like" or "behaves exactly like" R². We say that R² and W₂ are isomorphic vector spaces, and that W₂ is an embedding of R² in R³. Are there other copies of R² that can be embedded in R³?
Is R¹ a subspace of R³? Of R²? Can you find a subspace W₁ of R³ that is isomorphic to R¹? How many different subspaces isomorphic to R¹ are there in R³? Can you find a subspace of Rⁿ that is isomorphic to Rᵐ for all m < n?
Non-Tuple Vector Spaces
When we discussed the polynomials, the polynomial functions, and the infinitely differentiable functions, some subset relationships were presented. We are now able to describe the relationship between these sets more clearly in the following theorems.
Theorem 2.3.3. For any set of scalars K and n, m with n < m, the following statements are true:
Pₙ(K) is a subspace of Pₘ(K);
Pₙ(K) is a subspace of P(K).
Proof. Left as an exercise (see Exercise 9).
Theorem 2.3.4. For any set of scalars K and n, m with n < m, the following statements are true:
PFₙ(K) is a subspace of PFₘ(K);
PFₙ(K) is a subspace of PF(K).
Proof. Left as an exercise (see Exercise 10).
Theorem 2.3.5. The following statements are true:
PFₙ(R) is a subspace of C^∞(R);
PF(R) is a subspace of C^∞(R).
Proof. Left as an exercise (see Exercise 11).
Exercises
1. Let V be a vector space. Prove that {0} is a subspace of V.
2. Let L be a line through the origin in R³. Prove that L is a subspace of R³.
3. Show that the set of all points on the line y = mx + b is a subspace of R² if and only if b = 0.
4. Show that the set of all points in the plane ax + by + cz = d is a subspace of R³ if and only if d = 0.
5. Let W be the subset of P₂(R) consisting of all polynomials of the form f(x) = a₁x + a₂x², a₁, a₂ ∈ R. Determine whether or not W is a subspace of P₂(R).
6. Complete the proof of Theorem 2.3.1.
7. Complete the proof of Theorem 2.3.2.
8. Which of the following are subspaces of R³?
(a) W = {⟨x, y, z⟩ : x − z = 1, y + z = 2}.
(b) W = {⟨x, y, z⟩ : xy = 0}.
(c) W = {⟨0, y, 0⟩ : y ∈ R}.
9. Prove Theorem 2.3.3.
10. Prove Theorem 2.3.4.
11. Prove Theorem 2.3.5.
12. Which of the following are subspaces of C^∞(R)?
(a) {f ∈ C^∞(R) : f(1) = 0}
(b) {f ∈ C^∞(R) : f ≥ 0}
(d) {f ∈ C^∞(R) : f(x²) = (f(x))²}
(f) {f ∈ C^∞(R) : f(x²) = 2(f(x))}
(g) {f ∈ C^∞(R) : f(x) = a}
13. Let W₁ and W₂ be subspaces of a vector space V. Is W₁ ∩ W₂ a subspace? If so, prove it. If not, find a counterexample.
14. Let W₁ and W₂ be subspaces of a vector space V. Is W₁ ∪ W₂ a subspace? If so, prove it. If not, find a counterexample.
15. Let W = {(x, y) : x² + y² ≤ 9} be a subset of R². (W is a disk of radius 3.) Is W a subspace of R²? Why or why not?
16. Is Q³ a subspace of R³? (What is K?)
17. Let m < n. Find two distinct subspaces of Rⁿ that are isomorphic to Rᵐ.
Chapter 3
First Look at Systems
In this chapter, you will certainly recognize ideas that anyone would call algebra. We revisit systems of equations (perhaps your high school text called them simultaneous systems) and explore a couple of methods for finding the solutions to these systems. You will find some interesting procedures in the next sections and probably some new interpretations for things you have met before.
3.1 Systems of Equations
Activities
1. Let K = Z₃, the set of integers modulo 3. Write a statement in ISETL that determines whether the following tuples: [x, y, z] = [2, 1, 1], [1, 1, 1], [2, 2, 2], and [1, 0, 0] are or are not a solution of the equation 2x + y + 2z = 1.
2. Let K = Z₃. Construct a func in ISETL that accepts a sequence [x, y, z] of three elements of K as input; that substitutes the elements of the sequence into the respective unknowns of the equation 2x + y + 2z = 1; and that returns true, if the substituted values result in the equation being true, or returns false, if the substituted values result in the equation being false. Use this func to find the solution set of the equation.
3. Use the func you wrote in the prior activity to construct the solution set of the equation 2x + y + 2z = 1 in K = Z₃. In particular, you will want to define the set in such a manner that you iterate through every possible sequence of three elements in K (test every sequence over K) to identify all possible solutions.
4. Given K = Zₚ, where p is a prime number, and given a linear equation in K such as
a₁x₁ + a₂x₂ + ⋯ + a_q x_q = c,
where aᵢ ∈ K, i = 1, . . . , q, and c ∈ K, construct a func One_eqn that accepts the modulus p, the sequence [a₁, a₂, . . . , a_q] over K of coefficients, and the constant c as input, and that yields the set of all solutions [x₁, x₂, . . . , x_q] of the equation as output. Test your func on the equation defined in Activity 1.
5. Let K = Z₂. Consider the equations
x + y + z = 1
x + z = 0.
Use the func One_eqn you wrote in the last activity to determine the solution set of the first equation. Then use the same func to determine the solution set of the second equation. Find the intersection of both solution sets. What property do the elements of the intersection set have? What is the solution set of these two equations taken as a single system?
6. Let K = Z₂. Construct a func in ISETL that accepts a sequence [x, y, z] of three elements as input; that substitutes the elements of the sequence into the respective unknowns of the equations
x + y + z = 1
x + z = 0;
and that returns true, if the substituted values result in both equations being true, or returns false, if the substituted values result in one or both equations being false.
7. Use ISETL code to construct the solution set of the system of equations given by
x + y + z = 1
x + z = 0
in K = Z₂. In particular, you will want to define the set in such a manner that you iterate through every possible sequence of three elements in K (test every sequence over K) to identify all possible solutions.
8. Given two equations
a₁x₁ + a₂x₂ + ⋯ + a_q x_q = c₁
b₁x₁ + b₂x₂ + ⋯ + b_q x_q = c₂
in K = Zₚ, where p is a prime number, construct a func Two_eqn that accepts the modulus p, two sequences of coefficients, [a₁, a₂, . . . , a_q] and [b₁, b₂, . . . , b_q], and two constants, c₁ and c₂, as input, and that returns the set of all solutions [x₁, x₂, . . . , x_q] of both equations as output.
9. Given a system of three equations
a₁x₁ + a₂x₂ + ⋯ + a_q x_q = c₁
b₁x₁ + b₂x₂ + ⋯ + b_q x_q = c₂
d₁x₁ + d₂x₂ + ⋯ + d_q x_q = c₃
in K = Zₚ, where p is a prime number, construct a func Three_eqn that accepts the modulus p, three sequences of coefficients, [a₁, a₂, . . . , a_q], [b₁, b₂, . . . , b_q], and [d₁, d₂, . . . , d_q], and three constants, c₁, c₂, and c₃, as input and that returns the set of all solutions [x₁, x₂, . . . , x_q] to the system as output. Test your func on the system
x + 2y + z = 1
2x + y + 2z = 2
2x + 2y + z = 1
over Z₃. Describe the process for constructing such a func for any number of equations.
10. Let K = Z₅. Answer the following set of questions in relation to the system of equations in Z₅ given below.
3x₁ + 2x₂ + x₃ = 2
x₁ + 4x₂ + 4x₃ = 3
2x₁ + x₂ + 2x₃ = 2.
(a) Find the solution of this system using the func Three_eqn you wrote before.
(b) Interchange the first and second equations of the system. Find the solution of this new system using the func Three_eqn you wrote before. What do you observe?
(c) Multiply both sides of equation 2 by 3. Replace the second equation by this new equation. Apply the func Three_eqn to this transformed system. What do you observe?
(d) Multiply both sides of equation 2 by 3. Then, add the modified version of equation 2 to equation 1. Replace the second equation by this new equation. Apply the func Three_eqn to this transformed system. What do you observe?
(e) What operations can you do to transform the equations of the system without changing its solution set?
11. Let K = Z₅. Answer the following set of questions in relation to the system in Z₅ given below.
2x₁ + 3x₂ + x₃ = 3
x₁ + 4x₂ + 2x₃ = 1
3x₁ + x₂ + 2x₃ = 2.
(a) Apply the func Three_eqn to find the set of sequences [x₁, x₂, x₃] that are simultaneous solutions of all three equations.
(b) Multiply both sides of equation 2 by 3. Then, add the modified version of equation 2 to equation 1. In particular,
R2′ (new eqn 2) = R1 + 3R2.
Apply the func Three_eqn to the system
2x₁ + 3x₂ + x₃ = 3
R2′
3x₁ + x₂ + 2x₃ = 2.
Compare the solution set of this system to the original.
(c) Add equation 1 to equation 3. In particular,
R3′ (new eqn 3) = R1 + R3.
Apply the func Three_eqn to the system
2x₁ + 3x₂ + x₃ = 3
R2′
R3′.
Compare the solution set of this system to the original.
(d) Interchange rows 2 and 3. In particular,
R2″ = R3′
R3″ = R2′.
Apply the func Three_eqn to the system
2x₁ + 3x₂ + x₃ = 3
R2″
R3″.
Compare the solution set of this system to that of the original.
(e) Does the process outlined in parts (b), (c), and (d) change the solution set of the system? Why does the process described here appear to be effective in helping us to identify the solution of the original system?
12. Let K = Z₅. Given the system of equations
3x₁ + x₂ + 4x₃ = 1
x₁ + 3x₂ + 3x₃ = 4
4x₁ + x₂ + 3x₃ = 3,
find the solution set by hand using a process similar to what was outlined in the prior activity. Verify your work by applying the func Three_eqn to both the original system and the simplified system you produced by hand. What do you observe?
Discussion
Algebraic Expressions and Linear Equations
In previous courses in algebra, you spent a great deal of time working with algebraic expressions. Do you remember the difference between an algebraic expression and an equation? In some cases, you were asked to simplify expressions by applying the distributive property; the exercise
Simplify the expression: 6x(x + y) − 3(x² − 2xy)
is such an example. Similarly, you were assigned problems in which you were asked to combine like terms. Exercises such as
Simplify by combining like terms: 5bcd − 8cd − 12bcd + cd
fit into this category and are probably very familiar to you. You also spent considerable time factoring polynomials like
4x³ + 27x² + 5x − 3
25a² − 20ab + 4b².
What was the purpose of these tasks?
Although these exercises may have seemed pointless, they were designed with several objectives in mind: in particular, you were being taught about the concept of variable. What are the values that each variable can assume in algebraic expressions such as
6x(x + y) − 3(x² − 2xy)?
That is, what values can you select for x and for y, which, when substituted into the expression, yield a single number answer?
On the other hand, if we take one of the algebraic expressions above, such as 4x³ + 27x² + 5x − 3, and set it equal to, say, 4, which yields 4x³ + 27x² + 5x − 3 = 4, we now have an equation. What happens if you substitute values for x in this case? Is it always a true statement?
In a similar fashion, if we take two of the other expressions given above, say 5bcd − 8cd − 12bcd + cd and 25a² − 20ab + 4b², and set them equal to one another, the resulting equation
5bcd − 8cd − 12bcd + cd = 25a² − 20ab + 4b²
will be true only for appropriately selected sequences [a, b, c, d] of values for a, b, c, and d. Can you find some examples of values for a, b, c, and d such that the equation will be false? Can you find some examples of values for a, b, c, and d such that the equation will be true? The set of values which an unknown, or sequence of unknowns, can assume in any given equation is called the solution set of the equation. In this section and throughout the remainder of this course, we will focus our attention on finding solution sets of linear equations and systems of linear equations. Do you recall the difference between a linear equation and one that is not linear? Can you give an example of each?
A linear equation is any equation of the form
a₁x₁ + a₂x₂ + ⋯ + a_q x_q = c,
where a₁, a₂, . . . , a_q, and c are constants in K, where K is the set of real numbers or the set Zₚ, with p prime, and x₁, x₂, . . . , x_q are unknowns. Equations such as 3x₁² + 4x₂² = 2, 3y − sin(y) = 1, or 2x⁴ − 4xy + 5y = 7 are not linear equations.
Definition 3.1.1. Let K be a field. A linear equation with coefficients in K is of the form a₁x₁ + a₂x₂ + ⋯ + aₘxₘ = c, where aᵢ ∈ K, i = 1, 2, . . . , m, denote the coefficients, xᵢ, i = 1, 2, . . . , m, represent the unknowns, and c ∈ K is a constant.
Definition 3.1.2. A sequence [s₁, s₂, . . . , sₘ] is a solution of the equation if, when sᵢ is substituted for xᵢ, i = 1, 2, . . . , m, the equation
a₁s₁ + a₂s₂ + ⋯ + aₘsₘ = c
is true. The solution set of a linear equation is the collection of all such solutions.
In Activity 1, you were asked to determine whether the sequences [2, 1, 1], [1, 1, 1], [2, 2, 2], and [1, 0, 0] are solutions of the equation 2x + y + 2z = 1 in K = Z₃. It is convenient to remember here that all the equalities in the activities where the variables are elements of a finite field are congruences, and that it is implicit in the notation Zₖ that all the operations have to be done modulo k. For example, when we write 3x + 2y = 4 where x and y are in Z₇, we mean 3x + 2y ≡ 4 (mod 7). What do you have to ask when you want to know whether a sequence is an element of the solution set of an equation? By substitution of the sequences you were able to find the sequences that are solutions to the equation. Can you find all the sequences that are solutions to this equation?
What is the purpose of the func you wrote in Activity 2? How can you find all the elements of the solution set of an equation? In Activity 3 you answered this question for a particular equation, and in Activity 4 you constructed and tested a func that would return the solution set of a single linear equation in K = Zₚ, where p is prime; in particular, you were asked to write code that would input a sequence of coefficients and a constant of a linear equation, and then return the solution set of the equation by iterating through every possible sequence of values in K.
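A Python analogue may make the brute-force idea concrete. The name one_eqn below mirrors the ISETL func of Activity 4, but the code is our sketch, not the book's:

```python
from itertools import product

def one_eqn(p, coeffs, c):
    # Return every sequence [x1, ..., xq] over Z_p satisfying
    # a1*x1 + ... + aq*xq = c, by testing all p**q candidate sequences.
    return [list(xs) for xs in product(range(p), repeat=len(coeffs))
            if sum(a * x for a, x in zip(coeffs, xs)) % p == c % p]

# the equation of Activity 1: 2x + y + 2z = 1 over Z_3
sols = one_eqn(3, [2, 1, 2], 1)
print([2, 1, 1] in sols, [1, 1, 1] in sols)   # True False
print(len(sols))                              # 9
```

Since 2 is invertible mod 3, each of the 9 choices of (y, z) determines exactly one x, which is why the solution set has 9 elements.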
If K is finite and small, as it was in the activities, it is possible to check every possible sequence of values for the unknowns of an equation. If K is infinite, however, we cannot check every possible sequence. For example, if K = R, the set of real numbers, we cannot find the solution set of a linear equation by checking every sequence of possible values for the unknowns, because there are infinitely many possibilities to check. Instead, we try to determine the solution set by transforming the original equation into a simpler but equivalent equation, that is, an equation that has the same solution set, whose solution can be easily identified. For linear equations, this involves isolating an unknown variable. For example, given an equation like 2x + 3 = 11, what are the transformations you would do to find an equivalent equation which tells you directly the solution to the original equation? What properties do you use to transform the equation into an equivalent one? How do we know that the method for transforming the equation yields each and every solution?
Forms of Solution Sets
In the case of a linear equation of a single variable, we know that, if it has solutions, there is exactly one solution. For a linear equation of more than one variable, say 4x − y = 5, this is not the case. We can simplify the equation, however, by isolating y. What are the transformations you would do in this case? Can you identify the solution set of this equation? Can you identify the geometric representation of the solution set? We will return to this in the last section of this chapter.
If the variables of an equation are elements of a finite field, we can always find all the solutions in its solution set. If the variables are elements of an infinite field, this is not always the case. Why?
The solution set of the equation we were considering before can be expressed in a variety of ways. If we simply isolate y, we get the form y = 4x − 5.
If x and y are in R, we can select any value for x, which, via the expression, yields a corresponding value for y. If we let x = t, then we get the parametric form
x = t
y = 4t − 5
of the equation. The solution set of this equation can be written in vector form. For example, we can interpret the equation 4x − y = 5 as consisting of all vectors ⟨a, b⟩ in R² whose components, when substituted, x = a, y = b, result in the equation being true. That is, the vectors that are solutions of the equation. The expression for the x coordinate would be placed in the first component, and the expression for y would be placed in the second component. The vector form of the solution set of 4x − y = 5 is given by the set
S = {⟨t, 4t − 5⟩ : t ∈ R}.
Can you express the vectors in S in terms of other vectors, using the operations you learned in Chapter 2? Is S a vector space?
We can also express the solution as a sequence. In this case, we are interpreting the solution of the equation 4x − y = 5 as the set of all sequence combinations [c, d] of elements in R such that if we let x = c and y = d, the equation is true. In this case, the solution set of the given equation assumes the form
S = {[c, 4c − 5] : c ∈ R}.
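As a quick sanity check of the parametric form, a short Python snippet (ours, not the text's) can verify that every pair [t, 4t − 5] really does satisfy 4x − y = 5:

```python
def solution(t):
    # x = t is the free variable; y = 4t - 5 is determined by it
    return [t, 4 * t - 5]

checks = [solution(t) for t in range(-10, 11)]
print(all(4 * x - y == 5 for x, y in checks))   # True
```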
Given a linear equation in four variables, say
3x₁ + 2x₂ − 4x₃ + x₄ = 5,
to obtain a solution set we would follow the same basic procedure as we did in simplifying the linear equation in two variables; in particular, we would transform the equation into an equivalent one where the first unknown is isolated:
x₁ = 5/3 − (2/3)x₂ + (4/3)x₃ − (1/3)x₄.
Can you identify the equivalent equations involved in the transformation of this equation? In this solution, x₂, x₃, and x₄ are free variables, because they can assume any value, while x₁ is dependent upon, or is determined by, the values selected for x₂, x₃, and x₄. Must x₁ necessarily be the dependent variable? What is the vector form of the solution set of this equation? The sequence form? What are the vector and sequence forms of the solution set of the general linear equation given in Definition 3.1.2?
In vector form, the solution set is given by
S = {⟨5/3 − (2/3)t₁ + (4/3)t₂ − (1/3)t₃, t₁, t₂, t₃⟩ : t₁, t₂, t₃ ∈ R},
where S represents the set of vectors in R⁴ whose components satisfy the equation
3x₁ + 2x₂ − 4x₃ + x₄ = 5.
In sequence form, the solution set looks like
S = {[5/3 − (2/3)t₁ + (4/3)t₂ − (1/3)t₃, t₁, t₂, t₃] : t₁, t₂, t₃ ∈ R},
where S represents all combinations of values for the unknowns that are solutions of the equation.
In general, for a single linear equation
a₁x₁ + a₂x₂ + ⋯ + a_q x_q = c,
the solution set in vector form can be written as
S = {⟨c/a₁ − (a₂/a₁)t₁ − (a₃/a₁)t₂ − ⋯ − (a_q/a₁)t_{q−1}, t₁, t₂, . . . , t_{q−1}⟩ : t₁, t₂, . . . , t_{q−1} ∈ R},
where S is the set of vectors in R^q whose components are solutions of the equation; and the sequence form of the solution set is given by the set
S = {[c/a₁ − (a₂/a₁)t₁ − (a₃/a₁)t₂ − ⋯ − (a_q/a₁)t_{q−1}, t₁, t₂, . . . , t_{q−1}] : t₁, t₂, . . . , t_{q−1} ∈ R},
where S represents all combinations of values for the unknowns that are solutions of the equation. Note that any xᵢ, i = 1..q, can be used as the dependent variable by solving as we did for x₁. Is S a subspace of R^q?
Systems of Linear Equations
A system of equations is a collection of two or more linear equations. We are
interested in finding the solution set of systems of equations. In Activity 5 you
used what you learned in previous activities to find the solution set of each of
the equations that form the given system. Then you found the solution set of
the system. Can you define the meaning of the solution set of a system
of equations? How is the solution set of the system related to the solution set
of each of the equations? In Activity 6 you worked with the same system, but
now you constructed a func that allowed you to check whether any sequence
of your choice would be a solution to the system, and in Activity 7, you asked
the computer to determine all solutions by testing every possible sequence in
the func you constructed in Activity 6. You then generalized this process in
Activity 8 by constructing a func that would accept the coefficients and the
constants corresponding to any pair of equations in Z_p, p prime, and that
would return the solution set of the system. Can you explain in your own
words how this func works?
In Activity 9 you constructed and tested a func for solving a system
of three equations in Z_p, p prime. You were then asked how you would
generalize the procedure to construct a func for any number of equations in
K = Z_p. Can you use these ideas to describe a general process for finding
the solution of any system of m equations and n unknowns, where n and
m are any integers larger than 1, and K is a finite field? If K is finite, it
is possible to construct the solution set of a system of linear equations by
writing code that checks every possible sequence of values for the unknowns.
Why? If K is an infinite set, for example, the real numbers or the complex
numbers, such iteration is not possible. As with a single linear equation in
R, it is necessary to devise a process that transforms the original system
into a simpler, equivalent system whose solution set is readily identifiable.
In order to do this you were asked in Activity 10 to perform some operations
on a system of equations and to solve the transformed systems. What did
you find about the solution of each of those systems? The operations used
are called elementary transformations; as you found out, these operations
transform the system into a new system that has the same solution set.
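When K = Z_p the search space is the finite set Z_p^n, so the exhaustive check described above can be programmed directly. Here is a sketch in Python (the book's activities build the corresponding funcs in ISETL; the name `solve_mod_p` is our own):

```python
from itertools import product

def solve_mod_p(coeffs, consts, p):
    """Return every sequence in Z_p^n that satisfies all the equations.
    coeffs: one row of coefficients per equation; consts: the right-hand sides."""
    n = len(coeffs[0])
    return [list(seq) for seq in product(range(p), repeat=n)
            if all(sum(a * x for a, x in zip(row, seq)) % p == c % p
                   for row, c in zip(coeffs, consts))]

# The three-equation system in Z_5 worked out in this section:
print(solve_mod_p([[2, 3, 1], [1, 4, 2], [3, 1, 2]], [3, 1, 2], 5))  # [[4, 4, 3]]
```

This brute-force approach tests p^n sequences, which is exactly why it cannot carry over to an infinite field such as R.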
Definition 3.1.3. Given a system of m equations and n unknowns over a
field K, the original system can be transformed into a simpler, equivalent
system by applying one or more elementary transformations, each of which
is listed below:
3.1 Systems of Equations 93
Interchange the position of two equations.
Multiply both sides of an equation by a nonzero constant.
Add a multiple of one equation to another equation.
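Each of the three transformations is mechanical enough to write down as code. Below is a minimal Python sketch (illustrative only; representing an equation as a pair (coefficient row, constant) is our own convention, not the book's ISETL):

```python
def interchange(system, i, j):
    """Interchange the position of equations i and j."""
    s = list(system)
    s[i], s[j] = s[j], s[i]
    return s

def scale(system, i, k):
    """Multiply both sides of equation i by a nonzero constant k."""
    row, c = system[i]
    s = list(system)
    s[i] = ([k * a for a in row], k * c)
    return s

def add_multiple(system, i, j, k):
    """Add k times equation i to equation j."""
    (ri, ci), (rj, cj) = system[i], system[j]
    s = list(system)
    s[j] = ([k * a + b for a, b in zip(ri, rj)], k * ci + cj)
    return s
```

Each function returns a new system rather than modifying the old one, which mirrors the idea that a transformation produces an equivalent system.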
In Activity 11 you were asked to transform a system where K = Z_5 into
equivalent systems, that is, into a system that has the same solution set. It
is important to remember that all the operations used while working with
this activity are done modulo 5. In the first step, you multiplied the
second equation by 3, and added the result to the first equation to produce
a new second equation. This operation involved two elementary transfor-
mations: the first involved multiplying the second equation by the constant
3; the second involved adding the first equation to the second equation.
What did you observe? Why might the form you obtained be considered
simpler? If you did not have access to the func, how would you find the
solution of the system? Why do you suppose the solution set of the original
system and the simpler system are equal? In the second step, you performed
a similar transformation to alter the third equation. Again, the resulting
system yielded the same solution set. In the last step, you applied the third
type of elementary transformation: you interchanged equations 2 and 3. The
final form

    2x_1 + 3x_2 + x_3 = 3
          4x_2 + 3x_3 = 0
                 2x_3 = 1
is an equivalent system. Unlike the original system, it is possible to identify
the solution set by hand. Specifically, the last equation reveals that x_3 = 3.
If we substitute this value into the second equation, we see that

    4x_2 + 3(3) = 0,

from which it follows that x_2 = 4. If we substitute x_2 = 4 and x_3 = 3 into
the first equation, we get

    2x_1 + 3(4) + 3 = 3,

which yields x_1 = 4. Hence, the original system in K = Z_5,

    2x_1 + 3x_2 + x_3 = 3
    x_1 + 4x_2 + 2x_3 = 1
    3x_1 + x_2 + 2x_3 = 2,
has only one solution, namely x_1 = 4, x_2 = 4, x_3 = 3. If the solution set is
written in vector form, we have

    S = { ⟨4, 4, 3⟩ },

and if it is expressed in sequence form, we get

    S = { [4, 4, 3] }.
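The back substitution just carried out can be reproduced with modular inverses: in Z_5, dividing by a means multiplying by a^(-1) mod 5. A short Python sketch of this one computation (ours, for illustration; Python 3.8+ supplies the modular inverse as pow(a, -1, p)):

```python
p = 5
x3 = 1 * pow(2, -1, p) % p                    # 2*x_3 = 1          ->  x_3 = 3
x2 = -3 * x3 * pow(4, -1, p) % p              # 4*x_2 + 3*x_3 = 0  ->  x_2 = 4
x1 = (3 - 3 * x2 - x3) * pow(2, -1, p) % p    # 2*x_1 + 3*x_2 + x_3 = 3 -> x_1 = 4
print([x1, x2, x3])  # [4, 4, 3]
```

The fact that every division in Z_5 has an answer is what makes back substitution work over a field.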
Observe that the simplified system

    2x_1 + 3x_2 + x_3 = 3
          4x_2 + 3x_3 = 0
                 2x_3 = 1

has no x_1 term in the second equation and neither an x_1 nor an x_2 term in
the third equation. We could have added additional steps to simplify even
further. In particular, if we multiply each equation by a suitable nonzero
constant, we eventually get a triangular-looking system

    x_1 + 4x_2 + 3x_3 = 4
          x_2 + 2x_3 = 0
                 x_3 = 3

said to be in echelon form. The entries corresponding to the x_1 term in the
first equation, the x_2 term in the second equation, and the x_3 term in the
third equation are called leading entries.
As you can see, the process of transforming a system of equations into
echelon form involves isolating variables: in particular, we isolated x_3 and
then used it to isolate x_2, whereby we then isolated x_1. The three elementary
transformations, interchanging two equations, multiplying both sides of an
equation by a nonzero constant, and adding a multiple of one equation to
another, are the tools by which we can transform a system of equations into
an equivalent system that is in echelon form. Can you transform the system
given in Activity 10 into an equivalent system which is in echelon form?
In Chapter 6 it will be shown that the process used to transform the
system into its echelon form does not change the solution set of any system
of linear equations. Before we think about a proof, let's consider the following
example in R,

    2x_1 - x_2 + 3x_3 + x_4 = 2
    3x_1 + 2x_2 - 4x_3 + 2x_4 = 3
    x_1 + 4x_2 - 2x_3 + 5x_4 = 1.
Based upon what you did in Activity 11, the first goal is to transform the
original system into an equivalent system in which the x_1 term in the second
equation vanishes. What elementary transformation has been performed to
transform the system into
    2x_1 - x_2 + 3x_3 + x_4 = 2
         7x_2 - 17x_3 + x_4 = 12
    x_1 + 4x_2 - 2x_3 + 5x_4 = 1?
The next step might be to eliminate the x_1 term from the third equation.
What elementary transformation was used to transform the system into
    2x_1 - x_2 + 3x_3 + x_4 = 2
         7x_2 - 17x_3 + x_4 = 12
         7x_2 - x_3 + 11x_4 = 0?
The last transformation left an x_2 term in the third equation. What elemen-
tary transformation was applied to transform the system into
    2x_1 - x_2 + 3x_3 + x_4 = 2
         7x_2 - 17x_3 + x_4 = 12
            16x_3 + 10x_4 = -12?
Is the next system equivalent to the given one? Why? In order to get the
system into echelon form, we need a coefficient of 1 for each leading entry.
How would we go about doing this?
    x_1 - (1/2)x_2 + (3/2)x_3 + (1/2)x_4 = 1
         x_2 - (17/7)x_3 + (1/7)x_4 = 12/7
                  x_3 + (5/8)x_4 = -3/4
How many leading entries does this system have?
This system is now in echelon form, with leading entries provided by the
x_1 term in the first equation, the x_2 term in the second equation, and the x_3
term in the last equation. Unlike the prior example, the last unknown will
not assume a single value. In particular, the last equation x_3 + (5/8)x_4 = -3/4
is a linear equation in two variables. If we isolate the x_3 term, we get

    x_3 = -3/4 - (5/8)x_4.
This means that x_4 can assume any value; it is called a free variable. If we
let x_4 = t, we get

    x_3 = -3/4 - (5/8)t.
Substituting t for x_4 and -3/4 - (5/8)t for x_3 in the second equation yields

    x_2 - (17/7)(-3/4 - (5/8)t) + (1/7)t = 12/7
    x_2 + 51/28 + (113/56)t = 12/7
    x_2 = -3/28 - (113/56)t.
If we substitute these expressions into equation 1, we find that

    x_1 - (1/2)(-3/28 - (113/56)t) + (3/2)(-3/4 - (5/8)t) + (1/2)t = 1
    x_1 - 15/14 + (4/7)t = 1
    x_1 = 29/14 - (4/7)t.
If we now write the solution in parametric form, we get

    x_1 = 29/14 - (4/7)t
    x_2 = -3/28 - (113/56)t
    x_3 = -3/4 - (5/8)t
    x_4 = t,
where t is any real number. In vector form, the solution set can be written
as

    S = { ⟨29/14 - (4/7)t, -3/28 - (113/56)t, -3/4 - (5/8)t, t⟩ : t ∈ R }.
Is S a vector space? The algebraic structure on vector spaces can be used
to express the solution set in vector form in different suitable ways. Can you
think of one such expression? How would you write the solution in sequence
form? How many solutions does this system have? We can find any specific
solution by selecting a value for t. If we substitute the representations given
for x_1, x_2, x_3, and x_4 into each equation in the original system, all three
equations will be true, thereby proving that the proposed solution set, no
matter its specific form, is indeed the solution set of the original system of
equations. Is this always true?
In Activity 12 you transformed the given system using elementary trans-
formations. How many leading entries did the system in echelon form have?
What is the solution of that system? Is it possible for a system to have no
solution?
Let's consider another example in R:

    3x_1 + 6x_2 - 3x_3 = 6
    2x_1 - 4x_2 - 3x_3 = 1
    3x_1 + 6x_2 - 3x_3 = 10
What are the elementary operations used to transform the system into the
following equivalent systems?

    3x_1 + 6x_2 - 3x_3 = 6
         4x_2 - 15x_3 = 9
    3x_1 + 6x_2 - 3x_3 = 10

and

    3x_1 + 6x_2 - 3x_3 = 6
         4x_2 - 15x_3 = 9
                    0 = 4
The last equation corresponds to

    0x_1 + 0x_2 + 0x_3 = 4.
What is the meaning of the last equation of the system in echelon form?
Does it have a solution? What does it mean in terms of the sequence of
values [x_1, x_2, x_3]? The system has no solution.
The examples discussed here represent each of the possible types of so-
lution sets of a system of equations: a system of equations in K = Z_p has
a finite number of solutions, whereas if it is over an infinite field, a system
of equations either has a unique solution, infinitely many solutions, or no
solution. We have exemplified this result in this section; it will be proved
later, in Chapter 6. Although we have always considered the leading entries
as different from zero, in many systems they are zero. In such cases it would
be more convenient to interchange the appropriate equations first, so that
the leading entries of the top equations are different from zero, and later
transform the system into echelon form.
Summarizing the Process for Finding the Solution of a
System of Equations
If we are given a system of equations in K when K is finite, we can find the
solution set by substituting each possible sequence of values for the unknowns
into each equation. Those sequences for which the func returns true for each
equation in the system are elements of the solution set, and vice versa.
If K is an infinite set, such as R, then it is impossible to check each
sequence of possible solutions. In this case, we must transform the original
system of equations into a simpler system whose solution set is equal to that
of the original system. There are three elementary transformations that can
be applied to a system without changing its solution set:
Interchange the positions of two equations.
Multiply both sides of an equation by a nonzero constant.
Add a multiple of one equation to another equation.
The goal of applying elementary transformations is to produce a system of
equations in echelon form, that is, a simpler system whose solution set is
easy to identify or to construct. To place a system in echelon form, we must
apply the following series of steps to a system such as

    a_11 x_1 + a_12 x_2 + a_13 x_3 + a_14 x_4 + ··· + a_1q x_q = c_1
    a_21 x_1 + a_22 x_2 + a_23 x_3 + a_24 x_4 + ··· + a_2q x_q = c_2
    a_31 x_1 + a_32 x_2 + a_33 x_3 + a_34 x_4 + ··· + a_3q x_q = c_3
        ...
    a_r1 x_1 + a_r2 x_2 + a_r3 x_3 + a_r4 x_4 + ··· + a_rq x_q = c_r
1. Scale the leading coefficients to one, dividing by the coefficient of the
leading variable.
2. Apply elementary transformations that eliminate x_1 from equations 2
and higher and replace those equations by the transformed ones.
3. Do the same to eliminate the x_2 term from equations 3 and higher.
4. Do the same to eliminate the x_3 term from equations 4 and higher.
5. Continue this process until the leading entries form a triangular pattern.
Once completed, the echelon system should look something like

    x_1 + b_12 x_2 + b_13 x_3 + b_14 x_4 + ··· + b_1q x_q = d_1
          x_2 + b_23 x_3 + b_24 x_4 + ··· + b_2q x_q = d_2
                x_3 + b_34 x_4 + ··· + b_3q x_q = d_3
                    ...
          x_r + b_r(r+1) x_(r+1) + ··· + b_rq x_q = d_r
In general, the leading entry in any given equation should occur in a column
to the right of the leading entry in the prior equation. Based upon the final
echelon form, the solution set of the system can be found. If the system is
in K = R, what would you expect the final echelon form of a system that
has an infinite number of solutions to be? Does such a system have any free
variables?
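The five steps above can be assembled into a single routine. The sketch below uses plain Python with exact fractions (the function name `echelon` and the augmented-matrix representation are our own choices, not the book's ISETL); it scales each leading entry to one and eliminates below it, interchanging equations when a leading entry would otherwise be zero:

```python
from fractions import Fraction

def echelon(aug):
    """Reduce an augmented matrix (each row = coefficients + [constant])
    to echelon form using only the three elementary transformations."""
    rows = [[Fraction(x) for x in row] for row in aug]
    r = 0
    for c in range(len(rows[0]) - 1):
        # interchange: bring an equation with a nonzero entry in column c up
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue  # no leading entry in this column; move right
        rows[r], rows[pivot] = rows[pivot], rows[r]
        # scale: make the leading coefficient one
        rows[r] = [x / rows[r][c] for x in rows[r]]
        # add multiples: eliminate the column c term from the equations below
        for i in range(r + 1, len(rows)):
            k = rows[i][c]
            rows[i] = [x - k * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return rows

# A two-equation illustration: x_1 + x_2 = 3 and x_1 - x_2 = 1.
print(echelon([[1, 1, 3], [1, -1, 1]]))
```

A row that reduces to all-zero coefficients with a nonzero constant, like the 0 = 4 encountered earlier, signals that the system has no solution.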
Exercises
The following exercises involve systems of equations where the variables are
all in K = R unless otherwise stated.
1. Given the following equations verify if they are true for the values
x = 2, x = 5, x = 0 and x = 1.
(a) x^2 - 3x - 4 = 6
(b) x + 7 = 5
(c) x^2 + 2x + 1 = (x + 1)^2
2. Give two examples of nonlinear equations.
3. Are the sequences [1, 1, 0], [1, 1, 2], [0, 0, 1] and [1, 1, 0] solutions of
the equation

    2x + y + 2z = 1

for x, y, and z in Z_3?
4. Find the solution set of the system

    x_1 - x_2 + 3x_3 = 3
    2x_1 - x_2 + 2x_3 = 2
    3x_1 + x_2 - 2x_3 = 2

by transforming the system into an equivalent system, which is in ech-
elon form.
5. Using elementary transformations, find the solution set of the system

    3x_1 + 6x_2 - 3x_4 = 3
    x_1 + 3x_2 - x_3 - 4x_4 = 12
    x_1 - x_2 + x_3 + 2x_4 = 8
    2x_1 + 3x_2 = 8

by transforming the original system into echelon form.
(a) What are the leading entries? Are there any free variables?
(b) Does the system have one, infinitely many, or no solution? What
is the relationship between the existence of free variables and
whether the system has one, infinitely many, or no solution?
(c) Express the solution set in vector and sequence form. Explain
the difference between each way of expressing the solution set.
(d) Substitute the solution back into each equation of the original
system. After substitution, is each equation true?
6. Find the solution set of the system

    x_1 + 2x_2 - x_3 + 3x_4 + x_5 = 2
    2x_1 + 4x_2 - 2x_3 + 6x_4 + 3x_5 = 6
    x_1 - 2x_2 + x_3 - x_4 + 3x_5 = 4
by reducing it into an equivalent system which is in echelon form. What
are the leading entries? Continue performing elementary transforma-
tions to the system to eliminate the variable x_2 from the first and
third equations and the variable x_3 from the first and second equa-
tions. What do you observe? Can you continue performing elementary
transformations to the system without altering the leading entries?
The system you found is said to be in reduced echelon form. In a
reduced echelon form, we go beyond echelon form to get zeros in all of
the coefficients above and below each leading entry. The elementary
transformations and the basic process are the same. The result is a
system that is even more simplified than echelon form.
(a) You have already identified the leading entries. Are there any free
variables?
(b) Does this system have a unique, infinitely many, or no solution?
(c) Express the solution set in vector and sequence form.
(d) Using the general form given in either the vector or sequence forms
of the solution set, create three different specific solutions, and
substitute your results into the equations of the original system.
What do you observe?
(e) Substitute the general form of the solution back into each equation
of the original system. After substitution, is each equation true?
(f) Is the solution set of the system a vector space?
(g) Write a new system which has the same expressions on the left
side of the equations but that has zeros as the constant terms
to the right of the equal sign. Find the solution set of the new
system. Is the solution set of this system a vector space?
7. Using elementary transformations, find the solution set of the system

    2x_1 - 4x_2 + 12x_3 - 10x_4 = 58
    x_1 + 2x_2 - 3x_3 + 2x_4 = 14
    2x_1 - 4x_2 + 9x_3 - 6x_4 = 44

by transforming the original system into reduced echelon form.
(a) What are the leading entries? Are there any free variables?
(b) Does the system have one, infinitely many, or no solution?
(c) Express the solution set in vector and in sequence form.
(d) Using the general form given in either the vector or sequence forms
of the solution set, create three different specific solutions, and
substitute your results into the equations of the original system.
What do you observe?
(e) Substitute the general form of the solution back into each equation
of the original system. After substitution, is each equation true?
(f) Is the solution set of this system a vector space?
8. Write a system in reduced echelon form such that its solution is given
by:

    x_1 = 3t + 4
    x_2 = 2t + 1.

The next series of steps are designed to transform the system into one
possible original system. Carefully perform each step.
(a) Take 1 times equation 2, and add the result to equation 1 to yield
a new equation 1. Write down the new system that results from
performing this transformation.
(b) Create an equation 3 by taking 2 times equation 2. (In this case,
we think of equation 3 as 0x_1 + 0x_2 + 0x_3 = 0. Hence, what we are
really doing is taking 2 times equation 2, and adding the result
to equation 3 to yield a new equation 3.) Write down the new
system that results from performing this transformation.
(c) Take 2 times equation 1, and add the result to equation 2 to yield
a new equation 2. Write down the new system that results from
performing this transformation.
(d) Take 3 times equation 1, and add the result to equation 3 to yield
a new equation 3. Write down the new system that results from
performing this transformation.
(e) Multiply both sides of equation 1 by 3 to yield a new equation
1. Write down the new system that results from performing this
transformation.
(f) Using the general solution, construct three different specific solu-
tions, and substitute each of these into the original system you
have just created. What do you observe?
(g) Substitute the general form of the solution set given above into
each equation of the resulting original system you have created.
Is each equation true?
9. Suppose a system of 2 equations in 3 unknowns has a solution set whose
vector form is given by

    S = { ⟨3t + 1, 4t + 2, t⟩ : t ∈ R }.

Write the reduced echelon form that corresponds to this system. Then,
apply three elementary transformations of your choice. Show that the
general form of the solution is a solution of the resulting original
system you have created. Using three different elementary transfor-
mations, create a second original system, and show that the general
form of the solution is a solution to the second system you have created.
10. Using elementary transformations, find the solution set of the system

    x_1 - x_2 + 2x_3 = 3
    2x_1 - 2x_2 + 5x_3 = 4
    x_1 + 2x_2 - x_3 = 3
    2x_2 + 2x_3 = 1

by transforming the original system into reduced echelon form.
(a) What are the leading entries? How many of them are there? Are
there any free variables?
(b) Does the system have one, infinitely many, or no solution?
(c) Express the solution set in vector and in sequence form.
(d) Using the general form of the solution set create three different
specific solutions and substitute your results into the equations of
the original system. What do you observe?
(e) Substitute the general form of the solution back into each equation
of the original system. After substitution, is each equation true?
11. Using elementary transformations, find the solution set of the system

    2x_1 - 4x_2 + 16x_3 - 14x_4 = 10
    x_1 + 5x_2 - 17x_3 + 19x_4 = 2
    x_1 - 3x_2 + 11x_3 - 11x_4 = 4
    3x_1 - 4x_2 + 18x_3 - 13x_4 = 17
by transforming the original system into reduced echelon form.
(a) What are the leading entries? How many of them are there? Are
there any free variables?
(b) Does the system have one, infinitely many, or no solution?
(c) Express the solution set in vector and in sequence form.
(d) Using the general form of the solution set create three different
specific solutions and substitute your results into the equations of
the original system. What do you observe?
(e) Substitute the general form of the solution back into each equation
of the original system. After substitution, is each equation true?
12. Consider the following homogeneous system of four equations in four
unknowns given by

    x_1 - 2x_2 + x_3 - 4x_4 = 0
    2x_1 - 3x_2 + 2x_3 - 3x_4 = 0
    3x_1 - 5x_2 + 3x_3 - 4x_4 = 0
    x_1 + x_2 - 18x_3 + 2x_4 = 0.
(a) Using elementary transformations, find the solution set of the sys-
tem.
(b) What are the leading entries? How many of them are there? Are
there any free variables?
(c) Does the system have one, infinitely many, or no solution?
(d) Express the solution set in vector and in sequence form. Is this
the only solution? Why?
(e) Using the general form of the solution set, create three different
specific solutions and substitute your results into the equations of
the original system. What do you observe?
(f) Is the solution set of this system a vector space?
13. Consider the following system of four equations in four unknowns given
by

    x_1 - 2x_2 + x_3 - 4x_4 = 4
    2x_1 - 3x_2 + 2x_3 - 3x_4 = 1
    3x_1 - 5x_2 + 3x_3 - 4x_4 = 3
    x_1 + x_2 - 18x_3 + 2x_4 = 5.
(a) Compare this system to the one given in the previous exercise.
What are the similarities? What are the differences?
(b) Is

    S = [6, 6, 1, 3]

a solution to the system? Why? Is this the only solution to the
system? Why?
(c) Using elementary transformations, find the solution set of the sys-
tem. What do you observe?
(d) The system of the previous exercise can be considered the homo-
geneous system associated to this system. Why? Take the general
form of the solution set of the homogeneous system and add this
solution to the solution given by the sequence

    S = [6, 6, 1, 3].

Is the sum a solution to the system? Why?
(e) Using elementary transformations, find the solution set of the sys-
tem.
(f) What are the leading entries? How many of them are there? Are
there any free variables?
(g) Does the system have one, infinitely many, or no solution?
(h) Express the solution set in vector and in sequence form.
(i) Using the general form of the solution set, create three different
specific solutions and substitute your results into the equations of
the original system. What do you observe?
14. The reduced echelon forms of two equations in two unknowns can be
classified in one of three different ways:

    x_1 + 0x_2 = c_1
    0x_1 + x_2 = c_2              unique solution

    x_1 + bx_2 = c_1
    0x_1 + 0x_2 = c_2 (c_2 ≠ 0)   no solution

    x_1 + bx_2 = c
    0x_1 + 0x_2 = 0               infinitely many solutions

Classify, in a similar manner, the possible reduced echelon forms of a
system of three equations in three unknowns. Indicate which system(s)
yield a unique solution, infinitely many solutions, or no solution.
15. Consider the general system of two equations in two unknowns given
by
ax +by = e
cx +dy = f.
(a) Determine conditions on a, b, c, d, e, and f that result in this
system having a unique solution.
(b) Determine conditions on a, b, c, d, e, and f that result in this
system having no solution.
(c) Determine conditions on a, b, c, d, e, and f that result in this
system having innitely many solutions.
16. A homogeneous system of equations is any system of equations in which
all of the constant terms are zero. Show that x = 0, y = 0 is a solution
to the system

    ax + by = 0
    cx + dy = 0.

Prove that this is the only solution if and only if ad - bc ≠ 0.
17. For a homogeneous system of n linear equations in n unknowns, prove
that:
(a) The sum of two solutions to the system is a solution of the system.
(b) A multiple of a solution to the system is a solution of the system.
18. For a non-homogeneous system of n linear equations in n unknowns,
prove that if a vector p is a particular solution of the system, and
if h is a solution of the associated homogeneous system, that is, of a
homogeneous system that has the same coefficients as the given system,
then p + h is a solution of the non-homogeneous system.
19. Consider the following system of two differential equations in two un-
knowns given by

    x - y = x_t
    x + 3y = y_t.

Are the functions x, y given by

    x(t) = e^(2t)
    y(t) = -e^(2t)

elements of the vector space C^∞(R)?
Find three functions which are solutions to this differential equation.
Then choose any three scalars in R and use them to form a linear
combination with your three functions. Is this linear combination also
a solution?
152 CHAPTER 4. LINEARITY AND SPAN
Discussion
The Difference Between a Set and a Sequence
In this section of activities, you may have noticed that some activities refer
to a set of vectors or scalars and others refer to a sequence of vectors or
scalars. Do you recall the difference between a set and a sequence? From
Chapter 1 you will recall that a set in ISETL is designated by curly braces
{ }, whereas a sequence is denoted by square brackets, [ ], and is called a
tuple. What properties differentiate sets from tuples?
In addition to having different properties, sets and tuples are conceptually
different. Let's review some properties that you worked with in Chapter 1.
A sequence is a function whose domain consists of the set or a subset of the
positive integers and whose range can be any set. How would you see a list
like a_1, a_2, a_3, . . . , a_n, . . . as a function? If the name of the function in this
case is f, what would be meant by f(1), f(2), f(3)? In the context in which
we are working, we can focus upon the listing representation, but, because a
sequence is a function, it is not just any list, it is an ordered list. Where does
the order come from? For example, the sequence [4, 5, 6] is different from the
sequence [6, 4, 5] because of the difference in the order of the presentation of
the elements. On the other hand, a set is an unordered collection of objects.
So, the set {4, 5, 6} is equal to the set {6, 4, 5}. Additionally, if an element is
repeated in a sequence, say [4, 5, 5, 6], the repeated 5 cannot be dropped like
it would be if we were talking about the set {4, 5, 5, 6}, which is actually equal
to the set {4, 5, 6}.
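Python draws the same distinction that ISETL does between { } and [ ], so the properties above can be checked directly (illustrative Python, not the book's ISETL):

```python
# Order matters in a sequence but not in a set...
assert [4, 5, 6] != [6, 4, 5]
assert {4, 5, 6} == {6, 4, 5}
# ...and repeated elements collapse in a set but not in a sequence.
assert {4, 5, 5, 6} == {4, 5, 6}
assert [4, 5, 5, 6] != [4, 5, 6]
print("set/sequence properties confirmed")
```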
This distinction comes up when we have specific scalars and specific vec-
tors that we want to use in forming a linear combination. Thus, in Activity
1(c) we had the scalars 2, 3 and the vectors v, w. We wanted to form the
linear combination 2v + 3w. This means that we are thinking about the
sequence of scalars [2, 3] and the sequence of vectors [v, w] and not sets of
scalars or vectors. What sequences would we use if we wanted to form the
linear combination 3v - 2w? or 2w + 3v?
Can you explain why this means that in Activity 2, the func LC has
to take inputs SK, SV, which are, respectively, a sequence of scalars and a
sequence of vectors? On the other hand, in Activity 6, the func All_LC takes
a set of vectors. What is the difference? Since All_LC calls the func LC, how
did you deal with the fact that All_LC receives a set of vectors, but LC needs
4.1 Linear Combinations 153
a sequence to work with? Was a conversion involved here?
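For readers without ISETL at hand, the role of LC can be sketched in Python (this translation is ours; the book's actual func is the one you built in Activity 2):

```python
def LC(SK, SV):
    """Linear combination of a sequence of scalars SK with a sequence of
    vectors SV (each vector a list of numbers of the same length)."""
    assert len(SK) == len(SV), "one scalar per vector"
    return [sum(k * v[i] for k, v in zip(SK, SV)) for i in range(len(SV[0]))]

# 2v + 3w with v = <1, 3> and w = <-1, 2>:
print(LC([2, 3], [[1, 3], [-1, 2]]))  # [-1, 12]
```

Because SK and SV are sequences, the pairing of each scalar with its vector is unambiguous, which is exactly why LC cannot take sets as inputs.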
Forming Linear Combinations
In most of your work so far, each vector space has been of the form (K)^n,
where each vector consists of n-tuples or vectors whose components are ele-
ments of the scalar set K. A vector space does not generally have to be of
this form (for example, P_n(K) and C

⟨x, (2/3)x⟩. This form can be simplified to

    (1/3)x ⟨3, 2⟩.

Since x is an arbitrary real number, (1/3)x can represent any real number,
which means that if we let c = (1/3)x, the above scalar multiple can be
written in the form c⟨3, 2⟩. Therefore, any vector whose components satisfy
the equation y = (2/3)x is an element of the set { c⟨3, 2⟩ : c ∈ R }.
Therefore, the set of vectors in R^2 whose components satisfy the equation
y = (2/3)x is equal to the set of vectors given algebraically by { c⟨3, 2⟩ : c ∈ R }.
Go back to Activity 9: Does the graph of the set of vectors given in that
exercise form a line in the plane? If so, what is the equation of the line? What
is the relationship between the first and second coordinates of the vectors in
{ tv : t ∈ R }? Does the relationship, if any, reflect what you found in the
previous example? In general, can the graph of any set of vectors of the form

    { c⟨a, b⟩ : c ∈ R },

where a ≠ 0 or b ≠ 0, be represented as a line through the origin whose slope
is given by the ratio b/a? Explain your answer.
In Activity 11, you were asked to find the graph of the linear combination

    { v + aw : a ∈ R },

where v = ⟨1, 3⟩ and w = ⟨-1, 2⟩. This set of vectors is exactly the same as
the set of vectors specified by the xy-equation y = 5 - 2x. How do we show
this?
[Figure 4.2: { v + aw : a ∈ R }, showing the vectors v and w together with the
combinations v + w, v + 2w, and v - w in the plane.]
Let ⟨x, y⟩ ∈ { v + aw : a ∈ R }. Then,

    ⟨x, y⟩ = v + aw
           = ⟨1, 3⟩ + a⟨-1, 2⟩
           = ⟨1 - a, 3 + 2a⟩.

Since x = 1 - a and y = 3 + 2a, it follows that y = 5 - 2x. Hence, the
components of every vector in { v + aw : a ∈ R } form a solution of the
equation y = 5 - 2x.
On the other hand, each solution of the equation y = 5 - 2x can be
expressed in vector form as

    ⟨x, 5 - 2x⟩.

This is equivalent to the vector sum

    ⟨1, 3⟩ + ⟨x - 1, 2 - 2x⟩.

If we let c = x - 1, then the sum becomes

    ⟨1, 3⟩ + ⟨x - 1, 2 - 2x⟩ = ⟨1, 3⟩ + ⟨c, -2c⟩ = ⟨1, 3⟩ + (-c)⟨-1, 2⟩.

Since x is an arbitrary real number, c is an arbitrary scalar, which means
that -c is also arbitrary. So, if we let a = -c, we get

    ⟨1, 3⟩ + a⟨-1, 2⟩,

from which it follows that every vector whose components are a solution to
the equation y = 5 - 2x is a vector of the form

    ⟨1, 3⟩ + a⟨-1, 2⟩.

Therefore, the solution set, in vector form, of the equation y = 5 - 2x is equal
to the set of vectors { ⟨1, 3⟩ + a⟨-1, 2⟩ : a ∈ R }.
How is the set of vectors { v + aw : a ∈ R } related to { aw : a ∈ R }?
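The first half of this argument is easy to spot-check numerically; a short Python sketch (ours, for illustration only):

```python
v, w = (1, 3), (-1, 2)

def point(a):
    """The vector v + a*w, claimed to lie on the line y = 5 - 2x."""
    return (v[0] + a * w[0], v[1] + a * w[1])

for a in range(-5, 6):
    x, y = point(a)
    assert y == 5 - 2 * x   # every v + aw satisfies y = 5 - 2x
print("all sample points lie on y = 5 - 2x")
```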
[Figure 4.3: { sv + tw : s, t ∈ R }, the plane generated by two vectors v and w.]
Let V = R^3 be the vector space of ordered triples of real numbers. If

    v = ⟨v_1, v_2, v_3⟩ ∈ V and w = ⟨w_1, w_2, w_3⟩ ∈ V,

then the ordered triples ⟨v_1, v_2, v_3⟩ and ⟨w_1, w_2, w_3⟩ represent two arrows
whose initial points are the origin (0, 0, 0) and whose terminal points have
coordinates given by (v_1, v_2, v_3) and (w_1, w_2, w_3), respectively. If v and w
are not multiples of each other, that is, w ≠ cv for any scalar c ∈ R, then
the set of all possible linear combinations of v and w, denoted by the set

    { sv + tw : s, t ∈ R },

forms the plane generated by v, w. The vectors v, w are referred to as
generators of this plane. If you recall from multivariable calculus, every pair
of non-parallel vectors, that is, vectors that are not multiples of one another,
defines a plane. Also recall from that course the two binary operations on
vectors: the dot product and the cross product. The equation of a plane
is obtained by identifying a normal vector, say ⟨a, b, c⟩, formed by taking
the cross product of the two generators, and simplifying the resulting dot
product equation

    ⟨a, b, c⟩ · ⟨x - x_0, y - y_0, z - z_0⟩ = 0,

where (x_0, y_0, z_0) is any fixed point in the plane, (x, y, z) is any arbitrary
point in the plane, and ⟨x - x_0, y - y_0, z - z_0⟩ refers to the directed line
segment whose initial point is (x_0, y_0, z_0) and whose terminal point is (x, y, z).
To understand better the connection between the set of all linear com-
binations of a generating set and the plane formed by two generators, let's
consider the following example. Let v = ⟨1, 2, 3⟩ and w = ⟨2, 3, 1⟩. The set
of all linear combinations of v and w is denoted by the set

    { s⟨1, 2, 3⟩ + t⟨2, 3, 1⟩ : s, t ∈ R }.

Since v = ⟨1, 2, 3⟩ and w = ⟨2, 3, 1⟩ are not multiples of each other, ⟨1, 2, 3⟩
and ⟨2, 3, 1⟩ define a plane whose normal vector is ⟨-7, 5, -1⟩, found by
taking the cross product of ⟨1, 2, 3⟩ and ⟨2, 3, 1⟩. This yields the following
equation, when using the point (1, 2, 3):

    ⟨-7, 5, -1⟩ · ⟨x - 1, y - 2, z - 3⟩ = 0
    -7(x - 1) + 5(y - 2) - 1(z - 3) = 0
    -7x + 7 + 5y - 10 - z + 3 = 0
    7x - 5y + z = 0.
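The normal vector can be recomputed in a few lines of Python (illustrative; `cross` is our own helper, not an ISETL func from the book):

```python
def cross(u, v):
    """Cross product of two 3-vectors: a normal to the plane they generate."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

n = cross((1, 2, 3), (2, 3, 1))
print(n)  # (-7, 5, -1), giving -7x + 5y - z = 0, i.e. 7x - 5y + z = 0
```

As a quick sanity check, the normal is orthogonal to both generators, so the dot product of n with each of them is zero.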
We claim that the solution set of 7x - 5y + z = 0 is the set of vectors

{s(1, 2, 3) + t(2, 3, 1) : s, t in R}.

In order to prove this claim, we must show that every solution of the equation
7x - 5y + z = 0 is a linear combination of (1, 2, 3) and (2, 3, 1), and then we
must prove that every linear combination of (1, 2, 3) and (2, 3, 1) is a solution
of the equation 7x - 5y + z = 0.

Every solution of the equation 7x - 5y + z = 0 can be written as a vector
in the form (x, y, 5y - 7x). To see that this vector is an element of the set
{s(1, 2, 3) + t(2, 3, 1) : s, t in R}, we must show that (x, y, 5y - 7x) is a linear
combination of (1, 2, 3) and (2, 3, 1). In particular, we must find scalars s and
t such that the equation

(x, y, 5y - 7x) = s(1, 2, 3) + t(2, 3, 1)

holds. Simplifying, we obtain

(x, y, 5y - 7x) = (s, 2s, 3s) + (2t, 3t, t)
               = (s + 2t, 2s + 3t, 3s + t),

which is a system of 3 equations in the 2 unknowns s and t:

s + 2t = x
2s + 3t = y
3s + t = 5y - 7x.

In Chapter 3 you worked with such systems and developed methods for
finding the solution set. Can you use those methods to show that the solution
is given by

s = 2y - 3x
t = 2x - y?

Hence, every vector (x, y, z) whose coordinates x, y, and z are a solution of
the equation 7x - 5y + z = 0 is a linear combination of (1, 2, 3) and (2, 3, 1),
where s and t are 2y - 3x and 2x - y, respectively.
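The formulas for s and t can be checked directly: substituting s = 2y - 3x and t = 2x - y back into the three equations should reproduce x, y, and 5y - 7x. A quick computational check (a Python sketch; the sample points are arbitrary choices, and this is not part of the ISETL materials):

```python
# Check that s = 2y - 3x, t = 2x - y solves the system
#   s + 2t = x,  2s + 3t = y,  3s + t = 5y - 7x
# at several sample points (x, y).
for x, y in [(1.0, 2.0), (-3.0, 0.5), (4.0, -7.0)]:
    s = 2*y - 3*x
    t = 2*x - y
    assert s + 2*t == x
    assert 2*s + 3*t == y
    assert 3*s + t == 5*y - 7*x
print("s and t reproduce every solution of the form (x, y, 5y - 7x)")
```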
Next, we want to show that every element of the set

{s(1, 2, 3) + t(2, 3, 1) : s, t in R}

is a solution of the equation 7x - 5y + z = 0. Let a(1, 2, 3) + b(2, 3, 1) be an
element of the set {s(1, 2, 3) + t(2, 3, 1) : s, t in R}. This linear combination,
when simplified, is (a + 2b, 2a + 3b, 3a + b). Substituting each component for
the respective variables x, y, and z yields

7(a + 2b) - 5(2a + 3b) + (3a + b) = 7a + 14b - 10a - 15b + 3a + b
                                  = (7a - 10a + 3a) + (14b - 15b + b)
                                  = 0,

which shows that a(1, 2, 3) + b(2, 3, 1) is a solution of 7x - 5y + z = 0. Since
we have shown that every solution of 7x - 5y + z = 0 is a linear combination of
(1, 2, 3) and (2, 3, 1), and since we have proven that every linear combination
of (1, 2, 3) and (2, 3, 1) is a solution of 7x - 5y + z = 0, it follows that the set

{s(1, 2, 3) + t(2, 3, 1) : s, t in R}

is the plane generated by the vectors (1, 2, 3) and (2, 3, 1) and given by the
equation 7x - 5y + z = 0.

What we have shown in this discussion is that the solution set of the
equation 7x - 5y + z = 0 is the set {s(1, 2, 3) + t(2, 3, 1) : s, t in R}, which
is a plane in R^3.
Vectors Generated by a Set of Vectors: Span

Throughout the previous subsection, we have used the terms "generator" and
"is generated by". This was always in very specific contexts, and so the meaning
should have been clear to you. Was it? You need to understand these terms
thoroughly and in a general context. In Activity 6, you wrote a func All_LC
that formed the set of all linear combinations of vectors taken from a given
set. What does this have to do with the set of vectors generated by a given
set?

In the context of forming linear combinations, we say that the set of all
linear combinations of the vectors u and v, which is given by the set

{su + tv : s, t in K},

is the set of vectors generated by u and v. Consequently, whenever you are
given the phrase "Find the set of vectors generated by u, v, and w", you are
actually being asked to find the set of all linear combinations of u, v, and w;
that is, the set whose form is

{qu + sv + tw : q, s, t in K}.

This term is important enough to warrant a formal definition.

Definition 4.1.2. If S is a set of vectors in a vector space V, then the set
generated by S is the set W of all linear combinations of vectors in S. We
say that the elements of S are the generators of W and that W is the span
of S.

Do you think that in the context of this definition, W must turn out to
be a subspace of V?
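For tuple spaces over a finite field, this question can be explored by brute force: compute W as the set of all linear combinations and check that W is closed under addition and scalar multiplication. The following Python sketch (an illustration of the idea, not the ISETL func All_LC; the set S is a hypothetical example in (Z_3)^3):

```python
from itertools import product

p = 3  # working over Z_3
S = [(2, 1, 0), (1, 2, 1)]

def add(u, v):
    # componentwise addition mod p
    return tuple((a + b) % p for a, b in zip(u, v))

def scale(c, u):
    # scalar multiplication mod p
    return tuple((c * a) % p for a in u)

# W = all linear combinations a*S[0] + b*S[1] with a, b in Z_3
W = {add(scale(a, S[0]), scale(b, S[1])) for a, b in product(range(p), repeat=2)}

# The span is closed under both operations, i.e. W is a subspace of (Z_3)^3.
assert all(add(u, v) in W for u in W for v in W)
assert all(scale(c, u) in W for c in range(p) for u in W)
print(len(W), "vectors in the span; closed under both operations")
```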
What Vectors Can You Get from Linear Combinations?

In Activity 6, you wrote a func to compute all of the vectors you get by
forming linear combinations of vectors in a given set; that is, you computed
the set generated by the given set. In Activities 4 and 7, you considered the
more specific question of whether one of the linear combinations was equal
to the zero vector. Using the computer is one way of solving such problems,
and in Chapter 6, you will develop similar methods using matrices.

There is still another way. Go back a few pages to where you worked out
the solution set of the equation 7x - 5y + z = 0. What does this have to do
with the set of vectors generated by the set {(1, 2, 3), (2, 3, 1)}? What do
the vectors generated by this set have to do with the plane in R^3 determined
by 7x - 5y + z = 0?

As we saw earlier, we can check whether a vector can be written as a linear
combination of a set of vectors by solving a vector equation. For instance,
suppose we wish to determine whether the vector (7, 12, 18) can be written
as a linear combination of {(1, 2, 3), (2, 3, 1)}. This question, similar to what
you were asked to do in Activity 3(b) and what was shown above, amounts
to asking whether we can find scalars a and b such that the following vector
equation is true:

a(1, 2, 3) + b(2, 3, 1) = (7, 12, 18).

If we simplify the linear combination on the left and equate components
(why?), we get a system of three equations in the two unknowns a and b:

(7, 12, 18) = a(1, 2, 3) + b(2, 3, 1)
            = (a, 2a, 3a) + (2b, 3b, b)
            = (a + 2b, 2a + 3b, 3a + b),

which, when simplified further, yields

a + 2b = 7
2a + 3b = 12
3a + b = 18.

As it turns out, this system, which you should try to solve yourself, has no so-
lution. Hence, the vector (7, 12, 18) cannot be written as a linear combination
of (1, 2, 3) and (2, 3, 1), and the components of the vector, when substituted
into the expression 7x - 5y + z, would render the equation 7x - 5y + z = 0
false. If, on the other hand, the system above had yielded a solution, then
(7, 12, 18) could be written as a linear combination of (1, 2, 3) and (2, 3, 1);
(7, 12, 18) would lie in the plane generated by (1, 2, 3) and (2, 3, 1); and the
components of the vector, when substituted into the expression 7x - 5y + z,
would satisfy the equation 7x - 5y + z = 0.
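The inconsistency can be confirmed by solving the first two equations and testing the third; the matrix-based methods come in Chapter 6, so this Python sketch just mirrors hand elimination:

```python
# Solve a + 2b = 7 and 2a + 3b = 12 by elimination, then test 3a + b = 18.
# Subtracting twice the first equation from the second eliminates a.
b = (12 - 2*7) / (3 - 2*2)   # b = 2
a = 7 - 2*b                  # a = 3
assert (a, b) == (3.0, 2.0)
# The only candidate pair fails the third equation, so the system has no solution:
assert 3*a + b != 18
print("3a + b =", 3*a + b, "so (7, 12, 18) is not in the plane")
```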
In Activity 8, you wrote a func that essentially performed the operation
we have been discussing; in particular, the func accepts as input a single
vector v and a set of vectors SV and returns a boolean value obtained by
determining whether v could be expressed as a linear combination of the
elements of SV. If the vector v could not be written as a linear combination
of the elements of SV, how would the func LU need to be modified to report
such a result?

Actually, Activity 8 did a bit more. It checked not only whether the
given vector could be expressed as a linear combination of the given set of
vectors, but also whether this could be done in exactly one way; that is,
was the representation unique? The question of whether a vector can be
expressed uniquely as a linear combination of a given set of vectors is very
important and will be discussed thoroughly in Section 4.4.
Non-Tuple Vector Spaces

Your work in Activity 12 should have involved two additional vector spaces.
One is C^inf(R). The other is the vector space of all solutions of the differ-
ential equation. For which values of a, b in R are the functions a sin, b cos
solutions to the differential equation? How would you write a general linear
combination of two of these functions? When is it a solution?
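For any choice of a and b, the combination f(x) = a sin(x) + b cos(x) satisfies f'' + f = 0, since sin'' = -sin and cos'' = -cos. A numerical spot-check (a Python sketch using a central-difference second derivative; the coefficients, step size, and sample points are arbitrary choices):

```python
import math

def f(x, a=2.0, b=-3.0):
    # a hypothetical linear combination a*sin + b*cos
    return a * math.sin(x) + b * math.cos(x)

def second_derivative(g, x, h=1e-4):
    # central-difference approximation of g''(x)
    return (g(x + h) - 2*g(x) + g(x - h)) / (h * h)

# The residual f'' + f should vanish up to discretization error.
max_residual = max(abs(second_derivative(f, x) + f(x))
                   for x in [0.0, 0.7, 1.5, 3.1])
assert max_residual < 1e-5
print("max |f'' + f| over the sample points:", max_residual)
```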
Exercises

1. Let V = (Z_3)^4 be the vector space of 4-tuples with entries in Z_3. For
each of the following pairs of vectors v, w, find all linear combinations
of v and w, and determine which linear combinations yield the zero
vector. You may wish to use the func All_LC to verify your results.

(a) v = (1, 1, 2, 2) and w = (1, 2, 0, 1).
(b) v = (1, 2, 0, 1) and w = (2, 1, 0, 2).
2. Let V = (Z_3)^3. Let

S1 = {(2, 1, 0), (1, 2, 1)}
S2 = {(1, 1, 2), (0, 2, 1)}

be two sets of vectors in V. Find all linear combinations of S1 and of
S2. Do S1 and S2 generate the same set of vectors?
3. Let V = K^6, where K = (Z_2)^2 = {(x, y) | x, y in Z_2} and the additive
and multiplicative operations are given by the following formulas: if
s, t in K, then

s +_K t = (s_1, s_2) +_K (t_1, t_2)
        = ((s_1 + t_1) mod 2, (s_2 + t_2) mod 2)

s ._K t = (s_1, s_2) ._K (t_1, t_2)
        = ((s_1 t_1 + s_2 t_2) mod 2, (s_1 t_2 + s_2 t_1 + s_2 t_2) mod 2).

Select any three non-zero vectors from V. Designate one as u, one
as v, and the remaining vector as w. Let A = {(1, 1), (1, 0)}, B =
{(0, 1), (1, 0)}, and C = {(1, 1), (0, 1)} be three sets of scalars. Find
all linear combinations of the form

au + bv + cw,

where a in A, b in B, c in C. You may want to use the func LC to
verify your result.
4. Give seven vectors in R^4 that are in the set generated by

{(1, 2, 4, 2), (3, 5, 2, 3), (1, 1, 2, 1), (3, 4, 8, 4)}.
5. Let V = R^3. Determine whether

S1 = {(2, 1, 3), (1, 4, 5)}
S2 = {(3, 1, 4), (5, 1, 1)}

generate the same set of vectors in R^3.

6. Let V = R^3. Determine whether

S1 = {(2, 1, 0), (1, 1, 1)}
S2 = {(1, 1, 1), (3, 0, 1)}

generate the same set of vectors in R^3.

7. Let V = R^3. Determine whether

S1 = {(1, 2, 3), (1, 2, 5), (3, 1, 4)}
S2 = {(1, 6, 11), (2, 0, 2), (1, 2, 3)}

generate the same set of vectors in R^3.
8. Modify the func LU you constructed in Activity 8, so that, for any
vector v and set of vectors SV, LU is able to report whether v can
be expressed as a linear combination of SV uniquely (LU reports 1 as
output), in more than one way (LU reports 2 as output), or not at all (LU
reports 0 as output). Test your modified func for every possible vector
v = (v_1, v_2, v_3) in (Z_2)^3, when given the set SV = {(1, 0, 1), (0, 1, 1)}.

9. Let V = R^3. For each part, (a)-(d), determine whether the first vector
can be expressed as a linear combination of the remaining three vectors.

(a) (3, 3, 7); (1, 1, 2), (2, 1, 0), (1, 2, 1)
(b) (2, 7, 13); (1, 2, 3), (1, 2, 4), (1, 6, 10)
(c) (1, 4, 9); (1, 3, 1), (1, 1, 1), (0, 1, 4)
(d) (4, 3, 8); (1, 0, 1), (2, 1, 3), (0, 1, 5)
10. Let v be a linear combination of two vectors v_1 and v_2. Show that v
is also a linear combination of c_1 v_1 and c_2 v_2, where c_1 != 0 and c_2 != 0.

11. Suppose v is not a linear combination of two vectors v_1 and v_2. Show
that v is also not a linear combination of c_1 v_1 and c_2 v_2. Try to do this
using the previous exercise and without any calculations.
12. Let W denote the set of vectors generated by {v_1, v_2}. If v_3 is in W,
prove that {v_1, v_2, v_3} generates the same set.

13. Show that the solution set of the equation 23x - 9y + z = 0 is the set
of vectors

{s(1, 3, 4) + t(2, 5, -1) : s, t in R}.

14. Find the set of vectors whose components satisfy the equation y =
2x + 3.
15. Given the set of vectors

{v + aw : a in R},

where v = (3, 2) and w = (2, 5), find the equation of the line whose
solution set, when written in vector form, is equal to the set given
above.

16. Given the set

{v + aw : a in R},

where v = (3, 2) and w = (2, 5) as in the prior exercise, determine
what would happen to the graph of this set if the coefficient were al-
lowed to vary; that is, if you were given

{bv + aw : a, b in R},

what would this set of linear combinations look like? Draw a graph of
this set of linear combinations for b = 1, 2, 3, .5. What do you
observe? How does the graphical form of this set differ from the case
in which b = 1?

17. Find the equation of the plane, given the generating set

{(1, 3, 2), (3, 0, 2)}.
18. Find the set of vectors whose components satisfy the equation y = (5/3)x.
What happens to the graph of this set, if each vector is multiplied by
the scalar 2? What happens to the graph of this set, if (2, 1) is added
to each vector in the set?

19. Show that if W is the span of a set of vectors S in a vector space V,
then W is a subspace of V.

20. In the vector space P_n(K), describe the span of the following sets of
vectors.

(a) {1, x^2, x^4, . . . , x^(n div 2)}
(b) {x, x^2, x^3, . . . , x^n}
(c) {1}
(d) {x}
(e) {1, x}

21. In the vector space PF_4(R), in how many ways can you express the
polynomial function

x |-> 2 + 3x

as a linear combination of 1, x, x^2, x^3, x^4?

22. In the vector space PF_4(Z_3), in how many ways can you express the
polynomial function

x |-> 2 + 3x

as a linear combination of 1, x, x^2, x^3, x^4?

23. Let a, b be real numbers, g the function given by g(x) = sin(x), h the
function given by h(x) = cos(x), and f = ag + bh the linear combina-
tion. What is the function f given by?

24. For which real numbers a, b is the function f given by the linear com-
bination f = a sin + b cos a solution to the differential equation

f'' + f = 0?
4.2 Linear Independence

Activities

1. Let V = (Z_2)^4 be the vector space of quadruples of elements of Z_2. Let

SETV1 = {(1, 1, 0, 1), (1, 0, 1, 1), (1, 1, 1, 0)}
SETV2 = {(1, 1, 1, 1), (0, 0, 1, 1), (1, 1, 0, 0)}
SETV3 = {(1, 1, 0, 1), (1, 0, 1, 1), (0, 0, 1, 1)}
SETV4 = {(0, 0, 0, 0), (1, 1, 0, 0), (0, 0, 1, 1)}
SETV5 = {(1, 0, 1, 0), (0, 1, 0, 1), (1, 1, 1, 1)}

be five sets of vectors from (Z_2)^4.

(a) For each set of vectors, write down the expression for each possible
linear combination. Do not simplify.
(b) Apply the func LC that you wrote in Section 4.1, Activity 2 to
each combination you produced in (a) to decide if it yields the
zero vector.
(c) Identify which sets have the property that there is one and only
one linear combination that yields the zero vector.
2. Write a func LI that will assume name_vector_space has been run;
that will accept one input SETV, where SETV denotes a set of vectors;
and that will return a boolean value that tells if there is a unique scalar
sequence whose linear combination with the vectors in SETV yields the
zero vector. Verify the construction of LI by checking each set of vectors
given in Activity 1.

You will probably need to define a local variable TUPV and include a
line of code such as:

TUPV := [x : x in SETV];

You may wish to use one or more of the funcs you defined in the
previous section.
3. Write a func LD that will assume name_vector_space has been run;
that will accept one input SETV, where SETV denotes a set of vectors;
that will convert SETV to a sequence with a line of code such as:

TUPV := [x : x in SETV];

and that will return either the string "the set is independent", if there
is a unique scalar sequence whose linear combination with the vectors
in SETV yields the zero vector, or the set of all scalar sequences that
yield the zero vector, if more than one such scalar sequence is identified.
Verify the construction of LD by checking each set of vectors given in
Activity 1.
4. For each set of vectors {u, v, w} you constructed in Activity 1, deter-
mine whether u can be written as a linear combination of v and w;
determine whether v can be written as a linear combination of u and
w; and determine whether w can be written as a linear combination of
u and v. Keep track of this information in relation to the results you
obtained in Activities 2 and 3.
5. Let V = (Z_2)^4 be as in Activity 1. Apply the func All_LC, which you
wrote for Section 4.1, Activity 6, to find the set of vectors generated by
the zero vector, that is, the set {ov}. What do you observe? Then,
apply the funcs LI and LD to this single-element set. What do you
observe?

6. Let V = (Z_7)^2. Apply the func All_LC to find the set of vectors gen-
erated by the single-vector set {(3, 2)}. What do you observe? Then,
apply the funcs LI and LD to this set. What do you observe?
7. Let v = (2, 3) and w = (4, 6) be two vectors in R^2. Solve the vector
equation

a(2, 3) + b(4, 6) = (0, 0)

for a and b. How many solutions does this equation have: one? none?
infinitely many? As discussed in the last section, the set of vectors
generated by these two vectors is given by

{s(2, 3) + t(4, 6) : s, t in R}.

Given s in {2, 1, 0.5} and t in {3, 2}, construct all possible linear
combinations of the form

s(2, 3) + t(4, 6),

and use the func vectors to graph all of the resulting combinations.
Based upon your graphs, describe the graph of the set of vectors gen-
erated by (2, 3) and (4, 6).
8. Let v = (1, 2) and w = (3, 1) be two vectors in R^2.

(a) Solve the vector equation

a(1, 2) + b(3, 1) = (0, 0)

for a and b. How many solutions does this equation have: one?
none? infinitely many?

(b) As discussed in the last section, the set of vectors generated by
these two vectors is given by

{s(1, 2) + t(3, 1) : s, t in R}.

Given s in {2, 1, 0.5} and t in {3, 2}, construct all possible
linear combinations of the form

s(1, 2) + t(3, 1),

and use the func vectors to graph all of the resulting combina-
tions.

(c) Based upon your graphs, describe the graph of the set of vectors
generated by (1, 2) and (3, 1).
9. Let v = (1, 2, 3) and w = (2, 4, 6) be two vectors in R^3.

(a) Solve the vector equation

a(1, 2, 3) + b(2, 4, 6) = (0, 0, 0)

for a and b. How many solutions does this equation have: one?
none? infinitely many?

(b) As discussed in the last section, the set of vectors generated by
these two vectors is given by

{s(1, 2, 3) + t(2, 4, 6) : s, t in R}.

Given s in {2, 1, 0.5} and t in {3, 2}, construct all possible
linear combinations of the form

s(1, 2, 3) + t(2, 4, 6),

and then graph each resulting combination by hand.

(c) Based upon your graphs, describe the graph of the set of vectors
generated by (1, 2, 3) and (2, 4, 6).
10. Let v = (1, 2, 1) and w = (2, 1, 3) be two vectors in R^3.

(a) Solve the vector equation

a(1, 2, 1) + b(2, 1, 3) = (0, 0, 0)

for a and b. How many solutions does this equation have: one?
none? infinitely many?

(b) As discussed in the last section, the set of vectors generated by
these two vectors is given by

{s(1, 2, 1) + t(2, 1, 3) : s, t in R}.

Given s in {2, 1, 0.5} and t in {3, 2}, construct all possible
linear combinations of the form

s(1, 2, 1) + t(2, 1, 3),

and then graph each resulting combination by hand.

(c) Based upon your graphs, describe the graph of the set of vectors
generated by (1, 2, 1) and (2, 1, 3).
11. In the vector space P_n(K) of all polynomials of degree less than or equal
to n with coefficients in the field K, consider the set of polynomials

{1, x, x^2, . . . , x^p},

where p <= n. Is this set linearly independent?

12. Consider the differential equation

f'' + f = 0,

where the unknown function f is in C^inf(R).

In the previous section, you determined for which values in R of a, b
the functions given by a sin(x), b cos(x) and their linear combinations
are solutions to the differential equation. You should have decided that
all functions given by an expression of the form

a sin(x) + b cos(x)

are solutions.

From among these solutions, pick out several linearly independent sets.
What is the largest number of functions that you can have in a linearly
independent set?
Discussion

Definition of Linearly Independent and Linearly Dependent

In Activity 1, you formed all possible linear combinations of the sets of vectors
SETV1, SETV2, SETV3, SETV4 and SETV5 in (Z_2)^4. You then applied the
func LC to determine which linear combinations yielded the zero vector. Which
of these sets have the property that there are no linear combinations that give
the zero vector? Exactly one such linear combination? More than one?

A set of vectors in which only one linear combination yields the zero vector
is particularly important and deserves a name: a linearly independent set. Any
other set of vectors is called a linearly dependent set.

Here is a precise definition.
Definition 4.2.1. Let V be a vector space over K. A set of vectors

SV = {v_1, v_2, v_3, . . . , v_m}

is linearly independent if there exists one and only one sequence of scalars,
namely

SK = [0, 0, 0, . . . , 0],

whose linear combination yields the zero vector; that is,

0 v_1 + 0 v_2 + 0 v_3 + ... + 0 v_m

is the only linear combination of SV that yields the zero vector.
In the exercises, you will be asked to formulate this definition for linearly
dependent sets.

Given any set of vectors

SV = {v_1, v_2, v_3, . . . , v_m},

the linear combination

0 v_1 + 0 v_2 + 0 v_3 + ... + 0 v_m

yields the zero vector whether the set is linearly independent or dependent.
The difference between independence and dependence lies in whether there
exist linear combinations with nonzero scalars that produce the zero vector.
In the case of linearly dependent sets, this is precisely the case. In the case
of linearly independent sets, the situation is the opposite: the only linear
combination that yields the zero vector is the one in which all of the scalars
are simultaneously zero.
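Over Z_2 this difference can be checked exhaustively, since a set of m vectors admits only 2^m scalar sequences. The following Python sketch (an illustration of the idea behind the ISETL funcs LI and LD, not the funcs themselves) classifies the five sets from Activity 1:

```python
from itertools import product

p = 2  # scalars from Z_2

def zero_combos(vectors):
    """Return every scalar sequence whose linear combination gives the zero vector."""
    n = len(vectors[0])
    combos = []
    for coeffs in product(range(p), repeat=len(vectors)):
        total = tuple(sum(c * v[i] for c, v in zip(coeffs, vectors)) % p
                      for i in range(n))
        if total == (0,) * n:
            combos.append(coeffs)
    return combos

def LI(vectors):
    # independent iff the all-zero sequence is the only one yielding the zero vector
    return len(zero_combos(vectors)) == 1

sets = {
    "SETV1": [(1, 1, 0, 1), (1, 0, 1, 1), (1, 1, 1, 0)],
    "SETV2": [(1, 1, 1, 1), (0, 0, 1, 1), (1, 1, 0, 0)],
    "SETV3": [(1, 1, 0, 1), (1, 0, 1, 1), (0, 0, 1, 1)],
    "SETV4": [(0, 0, 0, 0), (1, 1, 0, 0), (0, 0, 1, 1)],
    "SETV5": [(1, 0, 1, 0), (0, 1, 0, 1), (1, 1, 1, 1)],
}
results = {name: LI(vs) for name, vs in sets.items()}
print(results)
```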
For example, if, for each set from Activity 1, we form an arbitrary linear
combination and set it equal to the zero vector,

a(1, 1, 0, 1) + b(1, 0, 1, 1) + c(1, 1, 1, 0) = (0, 0, 0, 0)   (SETV1)
a(1, 1, 1, 1) + b(0, 0, 1, 1) + c(1, 1, 0, 0) = (0, 0, 0, 0)   (SETV2)
a(1, 1, 0, 1) + b(1, 0, 1, 1) + c(0, 0, 1, 1) = (0, 0, 0, 0)   (SETV3)
a(0, 0, 0, 0) + b(1, 1, 0, 0) + c(0, 0, 1, 1) = (0, 0, 0, 0)   (SETV4)
a(1, 0, 1, 0) + b(0, 1, 0, 1) + c(1, 1, 1, 1) = (0, 0, 0, 0)   (SETV5),

and then solve each equation for a, b, and c, we would find that the all-zero
scalars

a = 0, b = 0, c = 0

satisfy all five equations. For the sets SETV1 and SETV3, however, this is
the one and only combination that produces the zero vector. This is not the
case for the vector sets SETV2, SETV4 and SETV5. Each of these sets
has at least one other linear combination that yields the zero vector. Are
these results consistent with what you should have found when you applied
the funcs LI and LD to the sets SETV1, SETV2, SETV3, SETV4 and
SETV5? Setting linear combinations equal to the zero vector, as we did
above, allows us to rewrite the definition of linear independence in terms of
an equation.
Definition 4.2.2. A set of vectors

SV = {v_1, v_2, v_3, . . . , v_m}

is linearly independent if the only solution of the equation

a_1 v_1 + a_2 v_2 + ... + a_m v_m = 0

is a_1 = a_2 = ... = a_m = 0. SV is linearly dependent if this equation has a
solution in which at least one of the scalars is nonzero.

Theorem 4.2.1. A set of vectors

SV = {v_1, v_2, v_3, . . . , v_m}

is linearly dependent if and only if at least one of the vectors in the set can
be written as a linear combination of the remaining vectors.
Proof. (=>): We will assume that SV is linearly dependent, and we will
prove that at least one of the vectors in the set can be written as a combi-
nation of the others. By definition, the dependence of SV implies that there
exists a set of scalars, say c_1, c_2, c_3, . . . , c_m, where

c_1 != 0 or c_2 != 0 or c_3 != 0 . . . or . . . c_m != 0

and

c_1 v_1 + c_2 v_2 + c_3 v_3 + ... + c_m v_m = 0.

For the purpose of this argument, it does not matter which specific scalar is
assumed to be nonzero. So, let's assume that c_1 != 0. In this case, we can
divide by c_1, from which we clearly see that v_1 can be expressed as a linear
combination of the remaining vectors:

c_1 v_1 + c_2 v_2 + c_3 v_3 + ... + c_m v_m = 0
c_2 v_2 + c_3 v_3 + ... + c_m v_m = -c_1 v_1
-(c_2/c_1) v_2 - (c_3/c_1) v_3 - ... - (c_m/c_1) v_m = v_1.

(<=): We will show that if at least one vector in SV can be written as a
linear combination of the others, then SV must be linearly dependent. For
the purpose of this argument, it does not matter which vector can be written
as a combination of the others; let's assume that v_1 is such a vector. Then
there exists a set of scalars c_2, c_3, . . . , c_m such that

v_1 = c_2 v_2 + c_3 v_3 + ... + c_m v_m.

If we rewrite this equation, we see that

(-1) v_1 + c_2 v_2 + c_3 v_3 + ... + c_m v_m = 0.

Since there is a set (-1), c_2, c_3, . . . , c_m of scalars, not all zero, that when
combined with SV yields the zero vector, it follows, according to the defini-
tion of linear dependence, that SV is a linearly dependent set.
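The rearrangement in the first half of the proof is easy to mirror numerically. In the sketch below (Python, with a hypothetical dependent set in R^3 chosen so that v1 = v2 + v3), a known dependence relation with c_1 != 0 is divided through by c_1 to solve for v_1:

```python
# A dependent set in R^3 (hypothetical example): v1 = v2 + v3.
v1, v2, v3 = (1.0, 2.0, 3.0), (1.0, 0.0, 1.0), (0.0, 2.0, 2.0)

# A dependence relation c1*v1 + c2*v2 + c3*v3 = 0 with c1 != 0:
c1, c2, c3 = 1.0, -1.0, -1.0
relation = tuple(c1*a + c2*b + c3*c for a, b, c in zip(v1, v2, v3))
assert relation == (0.0, 0.0, 0.0)

# Dividing by c1 and rearranging expresses v1 in terms of v2 and v3:
recovered = tuple(-(c2/c1)*b - (c3/c1)*c for b, c in zip(v2, v3))
assert recovered == v1
print("v1 =", -(c2/c1), "* v2 +", -(c3/c1), "* v3")
```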
Another characteristic that differentiates linearly independent and depen-
dent sets involves the relationship between the set and the vectors which the
set generates. In Activity 1, you constructed all possible linear combinations
of each set of vectors. Did you find that for some of these five sets there
was more than one linear combination giving the same answer, and for others
there was never more than one? That is, in some cases, the representation of
any vector as a linear combination of vectors in the set is unique; in others,
it is not. How did this compare with the linear independence or dependence
of the set? Again there is a general relationship, which we establish in the
following theorem.
Theorem 4.2.2. Let V be a vector space over a field K. A set of q vectors
in V,

SV = {v_1, v_2, v_3, . . . , v_q}
(b) {v_1, c v_2, v_3}

15. Let {v_1, v_2} be a linearly independent set. If v_3 cannot be written as
a linear combination of v_1 and v_2, that is,

v_3 != a v_1 + b v_2

for any pair of scalars a and b, then show that {v_1, v_2, v_3} is a linearly
independent set.
16. Prove or provide a counterexample: If three nonzero vectors u, v, w
are linearly dependent, it must be the case that u is a linear combina-
tion of v and w.

17. Let {(2, 1, 1), (4, 2, 2)} be a two-vector set in R^3. Determine
whether this set is linearly dependent or linearly independent. De-
scribe the span of this set as a set of points in R^3. Are your results, in
terms of the issue of dimension, consistent with what was discussed in
the text? Explain your answer.

18. Let {(3, 4, 1), (2, 5, 3)} be a two-vector set in R^3. Decide whether
this set is linearly dependent or linearly independent. Describe the
span of this set as a set of points in R^3. Are your results, in terms of
the issue of dimension, consistent with what was discussed in the text?
Explain your answer.

19. Let {(4, 1), (2, 5)} be a two-vector set in R^2. Determine whether this
set generates the entire plane R^2.
20. Let {(1, 4, 3), (3, 5, 2), (1, 1, 3)} be a three-vector set in R^3. Deter-
mine whether this set generates the entire space R^3 using an approach
similar to that given for generating sets of R^2. Is this set linearly in-
dependent or linearly dependent? Is your result consistent with the
discussion given in the text? Explain.

21. Let {(1, 4, 2), (1, 3, 1), (2, 1, 3)} be a set of three vectors in R^3. De-
termine whether the first vector can be written as a linear combination
of the remaining vectors. Repeat this process for the second and third
vectors. Without making any further calculations, is this set linearly
independent or linearly dependent? What can you say about the dimen-
sion of the graph of the set of vectors generated by this set? Carefully
explain your answer.

22. Prove that any set of monomial functions in PF_n(R) is linearly inde-
pendent.

23. Which sets of monomial functions in PF_n(R) span all of PF_n(R)?

24. Discuss the results of Exercises 22 and 23 if R is replaced by Z_3.
4.3 Generating Sets and Linear Independence

Activities

1. Let

SETV1 = {(2, 1, 3, 2), (1, 1, 3, 1), (3, 2, 2, 3)}
SETV2 = {(3, 2, 1, 3), (4, 3, 0, 4), (2, 2, 1, 2)}

be two sets of vectors in (Z_5)^4.

(a) Use the func LI that you wrote in Section 4.2, Activity 2 to verify
that both sets are linearly independent.
(b) Apply the func All_LC from Section 4.1, Activity 6 to find the
set of vectors generated by each set. What do you observe? Are
the two spans equal?
(c) Apply the modified version of the func LU you constructed in Sec-
tion 4.1, Exercise 8 to determine whether each vector in SETV2
can be written as a linear combination of the vectors in SETV1.
What do you observe?

2. Let

SETV1 = {(2, 1, 3, 2), (1, 1, 3, 1), (3, 2, 2, 3)}
SETV3 = {(1, 2, 0, 2), (3, 1, 1, 2), (0, 3, 0, 0)}

be two sets of vectors in (Z_5)^4.

(a) Use the func LI to verify that both sets are linearly independent.
(b) Apply the func All_LC to find the set of vectors generated by each
set. What do you observe? Are the two sets equal?
(c) Apply the modified version of the func LU you constructed in
Section 4.1, Exercise 8 of the section on linear combinations to
determine whether each vector in SETV3 can be written as a lin-
ear combination of the vectors in SETV1. What do you observe?
(d) The results from this and the prior activity illustrate a general
principle. Formulate a conjecture based upon your findings.
3. Let

SETV4 = {(1, 2, 3, 4), (3, 3, 3, 2), (2, 1, 0, 3), (3, 2, 1, 2)}

be a set of vectors in (Z_5)^4. Perform the following tasks with respect
to this set of vectors.

(a) Apply the func All_LC to find the span of this set.
(b) Apply the func LI to determine whether the set SETV4 is inde-
pendent or dependent.
(c) Apply the modified version of the func LU to determine whether
any vector in the set can be written as a linear combination of the
remaining vectors.
(d) If the answer to (c) is yes, remove one such vector, and denote
the remaining sequence as SETV5. Repeat steps (a), (b), and
(c) with SETV5. Is the span of SETV5 the same as the span
of SETV4? Is SETV5 linearly independent? If the answer to
part (c) is yes, repeat the process: remove one of the vectors that
can be written as a linear combination of the remaining vectors,
and denote the new sequence as SETV6. Repeat (a), (b), and (c)
with SETV6 and beyond that, if necessary, until you arrive at an
answer of no in part (c).
(e) When you get a no answer in part (d), what is your answer to
part (c)? Is there a relationship between whether a set is inde-
pendent and whether a vector in that set can be written as a linear
combination of the others? Does the final set you get generate
the same set of vectors as the original set SETV4? Explain your
answer.
4. Write a func LIGS that assumes that name_vector_space has been ex-
ecuted; that accepts one input SETV, where SETV is a set of vectors;
and that returns a linearly independent set constructed by employing
the following process: the func takes one of the vectors in SETV, tests
whether it is a combination of the others, removes the vector from the
set if it is, leaves the vector in the set if it is not, and successively
repeats this process until each vector in the set has been checked. Test
the func LIGS on the set SETV4 that you worked with in Activity 3.
Does LIGS return the same set you got after having completed parts
(a)-(d) of Activity 3?
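The pruning process that LIGS performs can be sketched outside ISETL as well. The following Python illustration (not the ISETL func itself) works over Z_5; is_combo tests membership by brute-force enumeration, which is feasible only for small fields and small sets:

```python
from itertools import product

p = 5  # scalars from Z_5

def lin_combo(coeffs, vectors):
    # the linear combination sum(c_i * v_i) computed mod p
    n = len(vectors[0])
    return tuple(sum(c * v[i] for c, v in zip(coeffs, vectors)) % p
                 for i in range(n))

def is_combo(v, others):
    # brute-force test: can v be written as a combination of the others?
    if not others:
        return all(x == 0 for x in v)
    return any(lin_combo(cs, others) == v
               for cs in product(range(p), repeat=len(others)))

def LIGS(vectors):
    """Successively remove vectors expressible from the rest; survivors are independent."""
    result = list(vectors)
    for v in list(vectors):
        rest = [w for w in result if w != v]
        if v in result and is_combo(v, rest):
            result = rest
    return result

SETV4 = [(1, 2, 3, 4), (3, 3, 3, 2), (2, 1, 0, 3), (3, 2, 1, 2)]
basis = LIGS(SETV4)
print(len(basis), "independent vectors remain:", basis)
```

Note that the vectors kept depend on the order in which candidates are examined; a different order can return a different (equally valid) independent subset with the same span.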
5. Let SETV = {(1, 1, 0), (0, 1, 1), (1, 0, 0)} be a set of vectors in (Z_2)^3.
Verify that SETV is linearly independent. Show that SETV generates
the entire set of vectors (Z_2)^3. Select a vector v different from (1, 1, 0),
(0, 1, 1), and (1, 0, 0), and form the new set

{(1, 1, 0), (0, 1, 1), (1, 0, 0), v}.

Test whether the resulting set is independent. What do you observe?
Repeat this for every possible choice of v that is not equal to (1, 1, 0),
(0, 1, 1), (1, 0, 0). What do you observe?
6. Let

SETV = {(2, 3, 4, 1, 1), (1, 1, 3, 1, 2), (3, 4, 2, 2, 3),
        (4, 0, 0, 3, 0), (1, 4, 2, 3, 3)}

be a set of vectors in (Z_5)^5.

In parts (b)-(d) below, you can use the predefined ISETL func npow
to construct the set of subsets with a given cardinality.

(a) Use the func LI to show that SETV is linearly dependent.
(b) Construct all subsets of SETV that consist of two vectors. Apply
LI to each set. What do you observe?
(c) Construct all subsets of SETV that consist of three vectors. Ap-
ply LI to each set. What do you observe?
(d) Construct all subsets of SETV that consist of four vectors. Apply
LI to each set. What do you observe?
7. Let V = (Z_5)^3. Apply the func LI to each part (a)-(d). Discuss your
findings; in particular, compare your result for part (d) with what you
get for parts (a)-(c).

(a) {(4, 4, 2), (2, 1, 3)}
(b) {(4, 4, 2), (3, 1, 3)}
(c) {(2, 1, 3), (3, 1, 3)}
(d) {(4, 4, 2), (2, 1, 3), (3, 1, 3)}

8. Use the func LI to determine whether each of the following sets is
independent or dependent.

(a) {(2, 1, 0), (1, 1, 1)}; {(2, 1, 0), (1, 1, 1), (0, 0, 0)} in (Z_3)^3
(b) {(2, 3, 1, 4), (3, 3, 2, 1), (1, 2, 1, 4)};
    {(2, 3, 1, 4), (3, 3, 2, 1), (1, 2, 1, 4), (0, 0, 0, 0)} in (Z_5)^4

What do you think is the point of this activity? Explain.
Discussion
In Section 4.1, generating sets were dened and in Section 4.2, the con-
cepts of linear independence and linear dependence were dened. In this
section, we will study the relationship between generating sets and the sets
of vectors they generate; discuss how to construct a linearly independent
set when given any set of vectors; present various special forms of linearly
independent sets of vectors; and prove important properties of linearly inde-
pendent and dependent sets.
Generating Sets and Their Spans
In Activity 1, you were given two sets of vectors in (Z
5
)
4
. You were asked to
nd and compare the sets of vectors generated by the two and to determine
the relationship, in terms of linear combinations, between the generating sets.
In Activity 2, one of the sets was changed and you did the same thing with
the resulting pair of sets.
What were the various phenomena that you observed? See how long a list
of observations you can make. For each observation formulate a statement
that says that what you observed is true in general. Is your theorem
correct? Try to supply a proof or a counterexample as appropriate.
4.3 Generating Sets and Linear Independence 189
One of your observations might have been an important relationship between
the sets generated by two sets of generators and the possibility of
writing every generator in one set as a linear combination of the generators
in the other set. This relationship is formalized and proved in the next
theorem.
Theorem 4.3.1. Two sets of vectors
SV1 = {v_1, v_2, ..., v_q}
SV2 = {w_1, w_2, ..., w_q}
in a vector space V generate the same set of vectors in V if and only if each
vector in SV2 can be written as a linear combination of the vectors in SV1,
and vice-versa.
Proof. (⇒) Let us assume that SV1 and SV2 generate the same set of
vectors. We will then prove that each vector in SV2 can be written as a
linear combination of the vectors in SV1. If u is an element of the span of
SV2, then u can be written as a linear combination of the vectors w_1, w_2,
..., and w_q. Now, each vector w_i, i = 1, 2, ..., q, in SV2 is an element of
the span of SV2, and since SV1 and SV2 are assumed to generate the same
set of vectors, it follows that each w_i, i = 1, 2, ..., q, is in the span of SV1,
which means that each vector w_i for i = 1, 2, ..., q can be written as a
linear combination of the elements of SV1.
In a similar manner, we can show that each vector in SV1 can be written
as a linear combination of the vectors in SV2.
(⇐) We will assume that each vector in SV2 can be written as a combination
of the vectors in SV1, and vice-versa. We will then prove that SV1 and
SV2 generate the same set of vectors. Let u be a vector in the set generated
by SV2. Then, there exist scalars a_1, a_2, ..., a_q such that
u = a_1 w_1 + a_2 w_2 + ... + a_q w_q.
Since each vector in SV2 can be written as a combination of SV1, there exist
sequences of scalars
[b_11, b_12, ..., b_1q]
[b_21, b_22, ..., b_2q]
...
[b_q1, b_q2, ..., b_qq]
such that
w_1 = b_11 v_1 + b_12 v_2 + ... + b_1q v_q
w_2 = b_21 v_1 + b_22 v_2 + ... + b_2q v_q
...
w_q = b_q1 v_1 + b_q2 v_2 + ... + b_qq v_q.
Substituting these expressions for the w_i in the expression for u, we get
u = a_1 w_1 + a_2 w_2 + ... + a_q w_q
= a_1 [b_11 v_1 + b_12 v_2 + ... + b_1q v_q] +
a_2 [b_21 v_1 + b_22 v_2 + ... + b_2q v_q] + ... +
a_q [b_q1 v_1 + b_q2 v_2 + ... + b_qq v_q]
= [a_1 b_11 + a_2 b_21 + ... + a_q b_q1] v_1 +
[a_1 b_12 + a_2 b_22 + ... + a_q b_q2] v_2 + ... +
[a_1 b_1q + a_2 b_2q + ... + a_q b_qq] v_q,
a linear combination of the vectors in SV1. Hence, u is an element of the set of vectors
generated by SV1. Since u was chosen arbitrarily, each vector in the set
generated by SV2 is also generated by SV1. In a similar fashion, we can
show that every vector contained in the set generated by SV1 is generated
by SV2. As a result, SV1 and SV2 generate the same set of vectors.
Constructing Linearly Independent Generating Sets
In Activity 3, you removed vectors until you ended up with a linearly
independent set. Every time you removed a vector, you applied All_LC to the
new set. What was the relationship between the span of the new set and
the span of its predecessor? In constructing a new generating set, were you
allowed to remove just any vector?
How do your responses to these questions relate to the following theorem?
In the last section, we proved that an important property of linearly
dependent sets, one not shared by independent sets, is that at least one of
the vectors in such a set can be written as a linear combination of the other vectors in the set.
Since the set of vectors generated by a set consists of linear combinations of
the generators, it seems reasonable that any generator that can be written
as a combination of the other generators is redundant, a fact consistent with
what you found in Activity 3 and one which we shall now prove in general.
Theorem 4.3.2. If a set of vectors
SV = {v_1, v_2, v_3, ..., v_q}
is such that v_1 can be written as a linear combination of the vectors in
SV' = {v_2, v_3, ..., v_q},
then SV and SV' generate the same set of vectors.
Think about this theorem in relation to Activity 4. You wrote a func LIGS
that removed vectors from a linearly dependent set until it either became a
linearly independent set or became empty. What does the theorem tell you
about the set of vectors generated by the resulting linearly independent set?
This theorem gives us a means by which we can construct a linearly
independent generating set whenever we are given a generating set: we simply
remove vectors which can be written as linear combinations of the others and continue
until this is no longer possible. A linearly independent generating set is
called a basis, and this will be the topic of the next section.
How about going the other way? What happens if we have a set S of
vectors and add to it a vector which is already a linear combination of the
vectors in S? How does the set of vectors generated by the new set compare
with the set of vectors generated by S?
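The removal process described above can be made concrete. The following is a hedged Python sketch of what a func like LIGS does (the helper names are invented for this sketch, not the course's ISETL code): while the set is dependent, it discards a vector that is a combination of the others, so by Theorem 4.3.2 the span never changes.

```python
from itertools import product

def combo(scalars, vectors, p):
    """The Z_p linear combination sum_i scalars[i] * vectors[i]."""
    n = len(vectors[0])
    return tuple(sum(s * v[i] for s, v in zip(scalars, vectors)) % p
                 for i in range(n))

def dependent(vectors, p):
    """True if some nontrivial Z_p combination of `vectors` is the zero vector."""
    if not vectors:
        return False
    zero = (0,) * len(vectors[0])
    return any(combo(s, vectors, p) == zero
               for s in product(range(p), repeat=len(vectors)) if any(s))

def is_combination_of(target, vectors, p):
    """True if `target` is a Z_p linear combination of `vectors`."""
    if not vectors:
        return all(x % p == 0 for x in target)
    return any(combo(s, vectors, p) == tuple(x % p for x in target)
               for s in product(range(p), repeat=len(vectors)))

def reduce_to_independent(vectors, p):
    """While the set is dependent, drop a vector that is a combination of
    the others; over a field such a vector always exists, so this stops
    with an independent set (possibly empty) having the same span."""
    vecs = list(vectors)
    while dependent(vecs, p):
        for i, v in enumerate(vecs):
            if is_combination_of(v, vecs[:i] + vecs[i + 1:], p):
                del vecs[i]
                break
    return vecs

# <3,2,1> = <2,1,0> + <1,1,1> in (Z_5)^3, so one vector is redundant:
print(reduce_to_independent([(2, 1, 0), (1, 1, 1), (3, 2, 1)], 5))
```

Note that the discarded vector stays inside the span of what remains, which is exactly why the generated set does not shrink.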
Properties of Linear Independence and Linear Dependence
In the last section, in addition to defining independence and dependence,
we stated and proved two equivalent conditions for linear independence: in
particular, a set of vectors is linearly independent if and only if one of the
following properties holds:
- no vector in the set can be written as a linear combination of the
remaining vectors; and
- any vector generated by the set can be expressed as a linear combination
of elements of the generating set in one and only one way.
In this subsection, we will prove two necessary conditions for independence.
In Activity 6, you constructed all proper subsets (sets with one or more of the
original elements missing) of two, three, and four vectors of the dependent
set
SETV = {⟨2, 3, 4, 1, 1⟩, ⟨1, 1, 3, 1, 2⟩, ⟨3, 4, 2, 2, 3⟩, ⟨4, 0, 0, 3, 0⟩, ⟨1, 4, 2, 3, 3⟩}.
You then applied the func LI to each subset to determine which subsets were
independent and which were dependent. In Activity 7 you looked at some
more examples. What did you find? What do these activities tell you about
the linear dependence or independence of a subset of a linearly dependent
set? Does Activity 8 suggest anything about that? The following theorem
addresses these questions for subsets of a linearly independent set.
What about subsets of a linearly independent set?
Theorem 4.3.3. If S is a linearly independent set in a vector space V , then
every proper subset of S must also be linearly independent.
Proof. The proof is left to the exercise section.
Is there a similar result for linearly dependent sets? In particular, must
every subset of a dependent set be dependent? Is it possible for a linearly
dependent set to have both subsets which are dependent and subsets which
are independent?
How about going the other way? Suppose you took a set of vectors and
added some vectors. What could you say about the larger set if the smaller
set was linearly independent? Linearly dependent?
In each part of Activity 7, you were given two sets of vectors: one which
was an independent set, and the second which was the same set with the
zero vector added. In both cases, the set with the zero vector was dependent.
Before looking at the following theorem, think about what might be true in
general.
Theorem 4.3.4. If SV is a linearly independent set of vectors in a vector
space V, then SV cannot contain the zero vector.
Proof. The proof is left to the exercise section.
Here is a trick question. Suppose you began with a set that was linearly
dependent and began removing vectors. Can you be sure that you will
eventually obtain a linearly independent set? This is one of those statements
that is false but, really, it is true. What could that mean?
Non-Tuple Vector Spaces
In the vector space PF_n(K), what would you say is the set generated by the
set of vectors {x ↦ 1, x ↦ x, ..., x ↦ x^p}, where p ≤ n?
In the vector space C
SV2 = {w_1, w_2, w_3},
where w_1 and w_2 are both linear combinations of SV1 but w_3 is not,
determine whether SV1 and SV2 generate the same set of vectors.
Justify your answer. If necessary, construct two sets satisfying the
given conditions to illustrate your point.
13. Let {⟨2, 1, 3, 2⟩, ⟨1, 1, 3, 4⟩} be a set of vectors in R^4.
(a) This set is linearly independent. Why?
(b) Find two more vectors v and w so that the resulting expanded set
{⟨2, 1, 3, 2⟩, ⟨1, 1, 3, 4⟩, v, w}
is linearly independent.
(c) Once you have found two such vectors, show that the set you have
constructed is indeed independent.
(d) Based upon the discussion regarding dimension given in this and
at the end of the last section, make a determination as to whether
the set you have constructed generates all of R^4.
14. Let
S = {v_1, v_2, v_3, v_4}
be a set in V = (K)^3.
(a) Is S linearly independent or linearly dependent? Explain your
answer.
(b) If every subset of S consisting of three vectors is linearly independent,
how could we construct a linearly independent set that
generates all of V?
(c) If every subset consisting of three vectors is linearly independent,
must it follow that every subset of two vectors must be linearly
independent? Justify your answer.
15. Suppose w_1 and w_2 are two vectors that are combinations of v_1 and
v_2 such that
w_1 = a v_1 + b v_2
w_2 = c v_1 + d v_2
a, b, c, d ≠ 0
a/c ≠ b/d.
Is the set {w_1, w_2} linearly independent or linearly dependent? Carefully
justify your answer. Do the two sets {v_1, v_2} and {w_1, w_2} generate
the same set? Explain.
16. Suppose you take a linearly dependent set and remove vectors one by
one. Can you always be sure that you eventually obtain an independent
set?
17. In the vector space PF_n(K), describe the set generated by the vectors
{x ↦ 1, x ↦ x, ..., x ↦ x^p}, where p ≤ n. What about the
set generated by {x ↦ x, ..., x ↦ x^p}, or the set generated by the
monomial functions with odd exponents?
18. In the vector space PF_n(K), the set {x ↦ 1, ..., x ↦ x^{n-1}} is
linearly independent. Based on this, what can you say about
the sets {x ↦ 1, ..., x ↦ x^p} where p < n? What about the sets
{x ↦ x, ..., x ↦ x^p}, p < n? Or the set of all monomials with odd
exponents? Even?
19. In the vector space C
∑_{i=1}^{n} t_i v_i,
where t_1, t_2, ..., t_n is a sequence of scalars and v_1, v_2, ..., v_n is a sequence of
vectors. If you wrote this summation expression out without the ∑ symbol,
what would you get?
The operator %.va in ISETL works exactly like the ∑ symbol in mathematics.
Following are some summation notations that mean the same thing
in mathematics as do the ISETL expressions in Activity 1. See if you can
match them, expression for expression. In particular, what has replaced SK1,
SK2, SV1, SV2?
∑_{i=1}^{n} t_i v_i
4.4 Bases and Dimension 201
∑_{i=1}^{3} a_i u_i
∑_{i=1}^{3} b_i w_i
∑_{i=1}^{3} a_i u_i + ∑_{i=1}^{3} b_i w_i
∑_{i=1}^{6} c_i v_i
One thing you can do to figure out such expressions is to write them out
in full detail without any summation or % symbols. Thus we have
∑_{i=1}^{4} t_i v_i = t_1 v_1 + t_2 v_2 + t_3 v_3 + t_4 v_4
and
∑_{i=1}^{n} t_i v_i = t_1 v_1 + t_2 v_2 + ... + t_n v_n.
You can factor a scalar out of an expression. Thus, if all of the t_i in
∑_{i=1}^{n} t_i v_i were equal to t, what would ∑_{i=1}^{n} t v_i be equal to? If you are in
doubt, choose a value for n and write everything out.
You can also add two summation expressions termwise, as in
∑_{i=1}^{n} (a_i + b_i) = ∑_{i=1}^{n} a_i + ∑_{i=1}^{n} b_i.
Again, if you need clarification, choose a value for n and write everything
out.
Things can get very complicated if you have multi-indices, or sequences
of sequences, as with matrices. Thus, if (a_{ij}), where i = 1, 2, ..., m and
j = 1, 2, ..., n, is a matrix or doubly indexed sequence of scalars, we have
∑_{i,j} a_{ij} = ∑_{i=1}^{m} ∑_{j=1}^{n} a_{ij}.
Once more, choose values for m and n and write everything out to help
understand what these expressions mean. You will have an opportunity in the
exercises to practice with these symbols.
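If you want to check these rules numerically rather than by hand, the following Python sketch does so for small sample values (the sample numbers are our own; Python's built-in sum plays the role of the ∑ symbol here):

```python
# t, a_i, b_i are arbitrary sample values for checking the summation rules
t = 3
a = [1, 2, 3, 4]
b = [5, 6, 7, 8]

lhs_factor = sum(t * b_i for b_i in b)               # sum_i t*b_i
rhs_factor = t * sum(b)                              # t * (sum_i b_i)
lhs_term = sum(a_i + b_i for a_i, b_i in zip(a, b))  # sum_i (a_i + b_i)
rhs_term = sum(a) + sum(b)                           # sum_i a_i + sum_i b_i

# a 3 x 4 doubly indexed array with a_ij = 10*i + j
A = [[10 * i + j for j in range(4)] for i in range(3)]
by_rows = sum(sum(row) for row in A)                             # sum_i sum_j a_ij
by_cols = sum(sum(A[i][j] for i in range(3)) for j in range(4))  # sum_j sum_i a_ij

print(lhs_factor == rhs_factor, lhs_term == rhs_term, by_rows == by_cols)
```

All three comparisons come out True, matching the factoring, termwise-addition, and order-of-summation rules stated above.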
Bases
Here is a mathematical formulation of the concept expressed in ISETL by the
func is_basis that you wrote in Activity 4.
Definition 4.4.1. A non-empty set B = {v_1, v_2, ..., v_n} in a vector space
V is called a basis for V if every vector v ∈ V can be written in one and only
one way as a linear combination of the vectors in B.
In Activity 3, you considered, altogether, seven sets of vectors in two
vector spaces. Which of these are bases for the vector spaces containing
them?
You should have no difficulty showing that this definition is exactly the
same as saying that the set is linearly independent and generates all of V.
Certain sets of vectors having a particular form will always be bases. For
example, for any vector space V = (K)^n, the following sets of vectors are
bases:
B_1 = {⟨1, 1, 1, ..., 1⟩, ⟨0, 1, 1, ..., 1⟩, ⟨0, 0, 1, ..., 1⟩,
⟨0, 0, 0, 1, ..., 1⟩, ..., ⟨0, 0, 0, ..., 0, 1⟩}
B_2 = {⟨1, 0, 0, ..., 0⟩, ⟨0, 1, 0, ..., 0⟩, ⟨0, 0, 1, ..., 0⟩,
..., ⟨0, 0, 0, ..., 0, 1⟩}
B_3 = {⟨1, 1, 1, ..., 1⟩, ⟨1, 1, ..., 1, 0⟩, ⟨1, 1, ..., 1, 0, 0⟩,
..., ⟨1, 1, 0, ..., 0⟩, ⟨1, 0, 0, ..., 0⟩}
We prove the first case here. The latter cases are left for the exercises.
Theorem 4.4.1. Let V = (K)^n. The set B_1 is a basis.
Proof. First we show that the set is linearly independent.
Given the vector equation
a_1 ⟨1, 1, 1, ..., 1⟩ + a_2 ⟨0, 1, 1, ..., 1⟩ +
a_3 ⟨0, 0, 1, ..., 1⟩ + ... + a_n ⟨0, 0, 0, ..., 0, 1⟩
= ⟨0, 0, 0, ..., 0⟩,
we must show that
a_1 = a_2 = a_3 = ... = a_n = 0.
We have,
⟨0, 0, 0, ..., 0⟩ = a_1 ⟨1, 1, 1, ..., 1⟩ + a_2 ⟨0, 1, 1, ..., 1⟩ +
a_3 ⟨0, 0, 1, ..., 1⟩ + ... + a_n ⟨0, 0, 0, ..., 0, 1⟩
= ⟨a_1, a_1, a_1, ..., a_1⟩ + ⟨0, a_2, a_2, ..., a_2⟩ +
⟨0, 0, a_3, ..., a_3⟩ + ... + ⟨0, 0, 0, ..., 0, a_n⟩
= ⟨a_1, (a_1 + a_2), (a_1 + a_2 + a_3), ..., (a_1 + a_2 + a_3 + ... + a_n)⟩.
Therefore,
a_1 = 0
a_1 + a_2 = 0
a_1 + a_2 + a_3 = 0
...
a_1 + a_2 + a_3 + ... + a_{n-1} = 0
a_1 + a_2 + a_3 + ... + a_{n-1} + a_n = 0.
The first equation yields a_1 = 0. Substituting this into the second equation
forces a_2 = 0. Substituting these results into the third equation results in
a_3 = 0. If we continue with subsequent steps, we get the desired result; that
is,
a_1 = a_2 = a_3 = ... = a_n = 0.
Next we show that B_1 generates all of V. Let v_1, v_2, ..., v_n be the vectors
in B_1. We must show that given any sequence c_1, c_2, ..., c_n of scalars in K,
we can find a sequence of scalars a_1, a_2, ..., a_n in K such that
∑_{i=1}^{n} a_i v_i = c,
where c = ⟨c_1, c_2, ..., c_n⟩.
When this expression is written out with all of the coordinates, you get
almost exactly the same set of equations that was obtained in showing linear
independence. The only difference is that each 0 on the right-hand side is
replaced by the appropriate c_i. That is, we must solve the following system
of equations for the unknowns a_1, a_2, ..., a_n:
a_1 = c_1
a_1 + a_2 = c_2
a_1 + a_2 + a_3 = c_3
...
a_1 + a_2 + a_3 + ... + a_{n-1} = c_{n-1}
a_1 + a_2 + a_3 + ... + a_{n-1} + a_n = c_n.
Clearly, we get a solution by taking a_1 = c_1, a_2 = c_2 - c_1, a_3 = c_3 - c_2,
and so on.
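The forward substitution in this proof is easy to turn into a short computation. The following Python sketch (our own illustration, with invented names, not course code) computes the coefficients a_1 = c_1, a_i = c_i - c_{i-1} and checks them against the basis B_1 in (Z_5)^4:

```python
def B1(n):
    """The basis B_1 = {<1,1,...,1>, <0,1,...,1>, ..., <0,...,0,1>}."""
    return [tuple(0 if j < i else 1 for j in range(n)) for i in range(n)]

def coefficients_wrt_B1(c, p):
    """Coefficients a_1, ..., a_n of c = <c_1, ..., c_n> with respect to B_1,
    by forward substitution: a_1 = c_1 and a_i = c_i - c_(i-1) for i >= 2."""
    return [c[0] % p] + [(c[i] - c[i - 1]) % p for i in range(1, len(c))]

p, c = 5, (2, 0, 4, 1)
a = coefficients_wrt_B1(c, p)

# Recombine sum_i a_i v_i coordinate by coordinate and confirm we recover c:
recombined = tuple(sum(a[i] * v[j] for i, v in enumerate(B1(len(c)))) % p
                   for j in range(len(c)))
print(a, recombined)
```

Because each basis vector contributes 1 to every coordinate from its own position onward, coordinate j of the combination is a_1 + ... + a_{j}, which is exactly the triangular system solved above.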
The set of vectors B_2 is particularly important. We call it the coordinate
basis and write its elements as e_1, e_2, ..., e_n. The vector e_i has all of its
coordinates equal to 0 except for the i-th coordinate, which is 1.
You will notice that we have defined a basis to be a set of vectors as
opposed to a sequence of vectors. The reason for this is that the property
of being a basis does not depend on the order in which the vectors are
considered, at least not in the context with which this course is concerned.
In many situations where bases are used, however, it becomes important to
fix the order of the elements. Is this the case for Activity 7? Do the coefficients
you find form a set or a sequence? This issue will definitely come up in the
next few paragraphs.
When we want to make use of the order of the elements of a basis, we
make the set into a sequence and call it an ordered basis. Thus, given any
basis, each ordering of the set produces a different ordered basis. If a basis
has 10 vectors, how many ordered bases can you get from it?
Nobody is perfect, and the difference between a basis and an ordered basis
can be so small that often we will forget to add the adjective "ordered" when
we should. But you can always tell from the context, so whenever we are
working with a basis as a sequence, we mean an ordered basis whether we
say so or not.
Expansion of a Vector with respect to a Basis.
In Activity 7, you considered the following problem: given a vector space V, a basis
B for V, and a vector v ∈ V, how can we find the coefficients of v in its
expansion as a linear combination of the vectors in B? There are several
ways of doing this. You might set up a system of linear equations and solve
it. You could do this by hand, or use a computer tool. You could also do
it (if the vector space is not too large) by using the ISETL operation choose.
That is, you would apply choose to the set of all linear combinations, with
the condition that it be equal to the given vector. In Chapter 6, you will find
another method that uses matrices.
When determining these coefficients, you really have to make sure that
each coefficient goes with a specific vector. One way of doing this is
to use an ordered basis. Then you find a sequence of scalars to form the
coefficients, and the order takes care of the matching automatically.
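In the same spirit as applying choose to the set of all linear combinations, a brute-force search can recover the coefficients when the space is small. A hypothetical Python version (the function name is our own):

```python
from itertools import product

def expansion_coefficients(v, basis, p):
    """Search every scalar sequence over Z_p for the one whose linear
    combination of the ordered basis equals v; by uniqueness of the
    expansion, there is exactly one such sequence when `basis` is a basis."""
    n = len(v)
    for coeffs in product(range(p), repeat=len(basis)):
        combo = tuple(sum(s * b[i] for s, b in zip(coeffs, basis)) % p
                      for i in range(n))
        if combo == v:
            return coeffs
    return None  # v is not in the span of `basis`

B1 = [(1, 1, 1), (0, 1, 1), (0, 0, 1)]  # the basis B_1 in (Z_5)^3
print(expansion_coefficients((2, 0, 4), B1, 5))
```

Because the basis is given as an ordered list, each coefficient is matched to its vector automatically by position, exactly as the paragraph above suggests.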
There is one case in which finding the coefficients is so easy that it might
seem trivial. Suppose V = K^n and you are working with the coordinate
basis. Now, any vector v ∈ V has both its components as an element of
K^n and its coefficients in its expansion by the n basis elements. What can
you say about these two sequences of scalars? Although this fact may seem
trivial, it is not. We very briefly pursue it in a more general form.
Representation of a vector space as K^n. Suppose you have an arbitrary
vector space V over a field K, and an ordered basis B = [b_1, b_2, ..., b_n].
Then any vector v ∈ V has a sequence of scalars t = (t_i) = (t_1, t_2, ..., t_n)
which are its coefficients with respect to B. That is,
v = ∑_i t_i b_i = t_1 b_1 + t_2 b_2 + ... + t_n b_n.
How does this compare with what you obtained in Activity 7?
Because of properties of bases, this representation is unique, so you can
think of v as an element t = (t_i) of K^n. Conversely, if you have an element
t = (t_i) ∈ K^n, then the same equation can be used to specify a vector v ∈ V.
It is easy to see that the operations of vector addition and scalar multiplication are
preserved by this correspondence. What exactly is meant by "preserved"
here?
These comments can be summarized by saying that the vector space V
is "the same as" K^n. This "being the same", however, depends on the basis B.
This observation is very important in more advanced studies of linear
algebra. We will not pursue it in this text.
In this representation of a vector space as K^n, would you say that the
order of the basis makes a difference?
Finding a Basis
You can always find a basis. In the case in which everything is finite,
Activity 8 produces a basis. How do you know that the func Make_Basis will
always work? The algorithm begins by selecting an arbitrary non-zero vector
in V, forms the subspace generated, picks a non-zero vector not in that
subspace, adds it to what has already been chosen, and continues that process
until a basis is achieved.
Did you find in Activity 8 that you got a different basis each of the three
times you ran Make_Basis? How about the number of elements in each basis?
Why does this happen? Could you guarantee that the basis you get contains
some given set of one or more vectors? What would you have to assume
about this set?
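A Python rendering of this greedy algorithm for V = (Z_p)^n might look as follows. This is a sketch of the idea behind Make_Basis, not the ISETL func itself; here the "arbitrary" vector is simply the first one not yet spanned, so the result is deterministic rather than varying between runs.

```python
from itertools import product

def span(vectors, p, n):
    """The set of all Z_p linear combinations of `vectors` inside (Z_p)^n."""
    return {tuple(sum(s * v[i] for s, v in zip(scalars, vectors)) % p
                  for i in range(n))
            for scalars in product(range(p), repeat=len(vectors))}

def make_basis(p, n):
    """Greedy basis construction for V = (Z_p)^n: repeatedly adjoin a vector
    not in the subspace generated so far, until that subspace is all of V."""
    V = list(product(range(p), repeat=n))
    B, current = [], span([], p, n)  # span of the empty set is {zero}
    while len(current) < p ** n:
        B.append(next(u for u in V if u not in current))
        current = span(B, p, n)
    return B

B = make_basis(3, 4)
print(B, len(B))
```

Each adjoined vector lies outside the current subspace, so the set stays independent at every step; the loop stops exactly when the span is all of V, which is why the result is a basis.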
Finite dimensional vector spaces. Suppose that the field K is not finite.
For example, your vector space might be R^n. Or, as in Activity 12 of Section
4.1, it might be the set of all functions from a closed interval to R whose
derivatives of every order exist at every point in the interval (left and right
derivatives at the endpoints). In such a case you can still apply the algorithm,
but you can't be sure of what will happen. That is, you pick a non-zero vector
and put it in the set B. Then you pick a non-zero vector not in the subspace
generated by B and add that vector to B. You continue this process. If your
vector space is finite, it must stop. If the vector space is not finite, it may or
may not stop. If it stops after finitely many steps, then the resulting set B is
a basis and a finite set. This case is important enough to warrant a formal
definition.
Definition 4.4.2. If a vector space V has a basis which is a finite set, then
V is called a finite dimensional vector space.
As we indicated above, even if this process does not stop, you can still
show that the vector space has a basis, but we will not discuss that situation
here.
Characterizations of bases. We defined a basis for a vector space in a
way that is equivalent to being a set which is both linearly independent and
generates the whole space. (See the comment after Definition 4.4.1 and Exercise
2.) There are two other characterizations of a basis.
Theorem 4.4.2. A subset B of a vector space V is a basis if and only if it
is a maximal linearly independent set. That is, B is linearly independent and
if any other vector is added to it, then it is no longer linearly independent.
Proof. Exercise.
Theorem 4.4.3. A subset B of a vector space V is a basis if and only if it
is a minimal generating set. That is, the subspace generated by B is all of
V , but if any vector is removed from B, then the subspace it generates is no
longer all of V .
Proof. Exercise.
Dimension
In several activities for this section, you found bases for various vector spaces.
In Activity 4, you found bases for (Z_2)^5 and for (Z_3)^4. In Activity 5 you
constructed two different bases for P_5(Z_3), and in Activity 8, you found three
bases for the same vector space.
In all of these examples, did you notice any regularities in the number
of elements in a basis? The bases for (Z_2)^5 and for (Z_3)^4 had, respectively,
5 and 4 elements. Note that for each of these vector spaces, the coordinate
bases also have 5 and 4 elements, respectively. In the other examples, what
did all bases for the same vector space always have in common?
Do you think there is a general result here? There is, but first we must
consider an important fact about the maximum number of elements in a
linearly independent set. In going through the following proof, it might help
you to pick values for m and n and write out all of the summations.
Theorem 4.4.4. If V has a basis with n elements in it, then any subset of
V with more than n elements must be linearly dependent.
Proof. Suppose that B = {b_1, b_2, ..., b_n} is a basis for V, and that C =
{c_1, c_2, ..., c_m} is a subset of V with m > n. We must show that C is a
linearly dependent set. To do that we must find scalars a_1, a_2, ..., a_m, not
all zero, such that
∑_{i=1}^{m} a_i c_i = 0.
Now, because B is a basis, each element of C is equal to some linear
combination of the vectors in B. That is, we have scalars t_{ij}, i = 1, ..., m, j =
1, ..., n, such that for each i = 1, ..., m we have
c_i = ∑_{j=1}^{n} t_{ij} b_j.
Substituting these expressions in the equation we have to solve, this equation
becomes
∑_{i=1}^{m} a_i ∑_{j=1}^{n} t_{ij} b_j = 0
or
∑_{i=1}^{m} ∑_{j=1}^{n} a_i t_{ij} b_j = 0
and, reversing the order of the two sums (why can we do this?), it becomes:
∑_{j=1}^{n} (∑_{i=1}^{m} a_i t_{ij}) b_j = 0.
In this vector equation, we can replace 0 by its expression as a linear combination
of the basis vectors to obtain:
∑_{j=1}^{n} (∑_{i=1}^{m} a_i t_{ij}) b_j = ∑_{j=1}^{n} 0 b_j.
This equation expresses the equality of two linear combinations of the basis
vectors and therefore, because of the uniqueness, each coefficient of b_j, j =
1, ..., n, is the same on both sides of the equation. This leads to the following
system of equations:
∑_{i=1}^{m} a_i t_{i1} = 0
∑_{i=1}^{m} a_i t_{i2} = 0
...
∑_{i=1}^{m} a_i t_{in} = 0.
But this is a system of n equations in m unknowns, with m > n, so the
system must have a solution in which not all of the unknowns are 0. Why?
This notion will be pursued in Exercise 13.
With this theorem, we can easily prove what you observed in considering
the number of elements in a basis.
Theorem 4.4.5. Any two bases for a vector space V have the same number
of elements.
Proof. Suppose we have two sets which are bases for V. Applying Theorem
4.4.4 to the fact that the first set is a basis and the second set is linearly
independent, we conclude that the second set cannot have more elements
than the first. Reversing the two sets, we conclude that the first set cannot
have more elements than the second. Hence they have the same number of
elements.
This theorem allows us to make the following definition:
Definition 4.4.3. The dimension of a finite dimensional vector space is the
number of elements in a basis.
Why do we need Theorem 4.4.5 before we can define the dimension of a
vector space?
Some of the following theorems were illustrated by examples in the
activities. You will have a chance to give general proofs in the exercises.
Theorem 4.4.6. The dimension of the vector space K^n is n.
Proof. Exercise.
Theorem 4.4.7. If V is an n-dimensional vector space, then any set of
vectors which generates V must have at least n elements.
Proof. Exercise.
Theorem 4.4.8. If V is an n-dimensional vector space, then any set of n
linearly independent vectors generates V .
Proof. Exercise.
Theorem 4.4.9. If V is an n-dimensional vector space, then any set of n
vectors which generates V must be linearly independent.
Proof. Exercise.
Theorem 4.4.10. If V is an n-dimensional vector space, and B is a set of
linearly independent vectors in V , then there is a basis which contains B, that
is, the set B can be extended into a basis of V .
Proof. Exercise.
Dimensions of Euclidean spaces. The Euclidean space R^2 contains
subspaces of three types: points, which are subspaces of dimension zero; lines,
which are subspaces of dimension one; and the entire space, which is of dimension
two. The Euclidean space R^3 contains four types of subspaces: points;
lines; planes; and the entire space, which is itself a subspace of dimension 3.
We have seen that any vector space of the form V = (K)^n, whether
K is equal to the set of real numbers R or not, is a space of dimension n
possessing subspaces of smaller dimension. Such a result is consistent with
the geometric notion of dimension discussed in Euclidean space.
Non-Tuple Vector Spaces
Most of the vector spaces you have studied so far are of the form (Z_p)^n. This
is because these are concrete examples and are the easiest to work with. But
you have also begun to work with some other examples: P_n(K), the vector
space of polynomials of degree less than or equal to a certain number n with
coefficients in some field K; the vector space of all solutions of a system of
homogeneous linear equations; and the vector space of all functions which
satisfy a certain differential equation. In each of these cases, the space has
bases, and they are important. We will consider some first facts.
The results you found in Activity 5(a) are completely general, and you
might have been able to solve this problem more easily by hand, without
the computer. After all, what is a linear combination of monomials but a
polynomial whose degree is less than or equal to the highest degree of a
monomial, which is n? Thus you get all polynomials with degree less than or
equal to n. Can such a polynomial be equal to the zero polynomial if the
coefficients are not all zero? That one could be interesting, so you will have
a chance to play with it in the exercises.
Theorem 4.4.11. The monomials 1, x, x^2, ..., x^n form a basis for P_n(K).
Proof. Exercise.
Determining other bases for P_n(K) involves a lot of work with the properties
of polynomials, and we will not go much farther with that in this text.
Closely related to the vector space of polynomials is the vector space of
polynomial functions. The result of Theorem 4.4.11 does not hold in general
for these spaces; instead we have the following.
Theorem 4.4.12. The monomial functions x ↦ 1, ..., x ↦ x^n form a
basis of PF_n(R).
If n < p, then the monomial functions x ↦ 1, ..., x ↦ x^n form a basis
of PF_n(Z_p).
If n ≥ p, then the monomial functions x ↦ 1, ..., x ↦ x^{p-1} form a
basis of PF_n(Z_p).
Proof. Left as an exercise (see Exercise 21).
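The reason the list of monomial functions stops at x^{p-1} over Z_p is that, by Fermat's little theorem, x^p = x for every x in Z_p, so as functions the higher monomials repeat earlier ones. A quick Python check over Z_5 (an illustration of ours, not part of the exercise):

```python
def as_function_table(exponent, p):
    """The monomial function x |-> x**exponent on Z_p, recorded as its table
    of values; two different formal polynomials can give the same table."""
    return tuple(pow(x, exponent, p) for x in range(p))

p = 5
# Fermat's little theorem: x**5 = x for every x in Z_5, so as *functions*
# the monomial x |-> x**5 is the same vector as x |-> x ...
print(as_function_table(p, p) == as_function_table(1, p))
# ... while x |-> x**2 is genuinely different from x |-> x:
print(as_function_table(2, p) == as_function_table(1, p))
```

This is why {x ↦ 1, ..., x ↦ x^n} cannot be independent in PF_n(Z_p) once n ≥ p: the function x ↦ x^p minus the function x ↦ x is the zero function, a nontrivial dependence.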
Consider the system of equations in Activity 8, and try to solve it
completely, perhaps using the methods you learned in Chapter 3.
You should be able to determine that all solutions [x_1, x_2, x_3] are given
by:
x_1 = s
x_2 = t
x_3 = s + t,
where s, t run independently through all values in Z_5.
Another way to say this is that the solutions form a subspace of (Z_5)^3
and that this subspace is generated by the two vectors ⟨1, 0, 1⟩ and ⟨0, 1, 1⟩. Can
you see why this is so? If it is, then these two vectors are obviously linearly
independent, so they form a basis for the space of solutions of this system.
This situation is very general, and you will study it more in Chapter 6.
For now, we can introduce some words you will meet later. This system has
three equations, and the vector space of solutions is of dimension 2. Hence
we say that the rank of the system is 1, and its nullity is 2.
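You can verify the claim about the two generators directly for Z_5 by comparing the solution set with the span of ⟨1, 0, 1⟩ and ⟨0, 1, 1⟩. Here is a small Python sketch of that check (the variable names are ours):

```python
from itertools import product

p = 5

# All solutions [x1, x2, x3]: x1 = s, x2 = t, x3 = s + t, with s, t in Z_5
solution_space = {(s, t, (s + t) % p) for s, t in product(range(p), repeat=2)}

# All Z_5 linear combinations a*<1,0,1> + b*<0,1,1>
v, w = (1, 0, 1), (0, 1, 1)
generated = {tuple((a * v[i] + b * w[i]) % p for i in range(3))
             for a, b in product(range(p), repeat=2)}

print(len(solution_space), solution_space == generated)
```

The combination a⟨1, 0, 1⟩ + b⟨0, 1, 1⟩ is exactly ⟨a, b, a + b⟩, so taking a = s and b = t reproduces every solution, and the two sets coincide, 25 vectors in each.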
Here is something else for you to mull over. Suppose you throw away all
of the x_i's and the "= 0" parts of the system of equations in Activity 8, leaving
you with a 3 × 3 matrix. Treat the rows of this matrix as vectors and notice
that the three vectors do not form a linearly independent set. Moreover, the
largest subset which is linearly independent has only 1 vector. Is this 1 a
coincidence? Now do the same thing with the columns of the matrix. Are
the results the same? What's that all about?
There is a deep mathematical connection between linear algebra and the
solutions of a linear differential equation, one that involves a lot of important
mathematics. In this book, we can only give the barest hint of the tip of an
iceberg.
Recall, from Section 4.3, the differential equation
f'' + f = 0,
where f is an unknown function in C^∞(R).
You checked in Section 4.3 that the two functions sin and cos are solutions
to this differential equation. You should have no trouble showing that they
form a linearly independent set in the vector space C^∞(R).
∑_{i=1}^{4} a_i b_i
(b) ∑_{i=1}^{4} t b_i = t ∑_{i=1}^{4} b_i
(c) ∑_{i=1}^{4} (a_i + b_i) = ∑_{i=1}^{4} a_i + ∑_{i=1}^{4} b_i
(d) ∑_{i,j} a_{ij} = ∑_{i=1}^{3} ∑_{j=1}^{4} a_{ij}
(e) ∑_{i=1}^{3} ∑_{j=1}^{4} a_{ij} = ∑_{j=1}^{4} ∑_{i=1}^{3} a_{ij}
2. Show that a subset of a vector space V is a basis if and only if it is
linearly independent and generates all of V .
3. Show that each of the following sets is a basis for (K)^n.
(a) B_2 = {⟨1, 0, 0, ..., 0⟩, ⟨0, 1, 0, ..., 0⟩, ⟨0, 0, 1, ..., 0⟩, ...,
⟨0, 0, 0, ..., 0, 1⟩}
(b) B_3 = {⟨1, 1, 1, ..., 1⟩, ⟨1, 1, 1, ..., 0⟩, ⟨1, 1, 1, ..., 0, 0⟩, ...,
⟨1, 1, 0, ..., 0⟩, ⟨1, 0, 0, ..., 0⟩}
4. In the paragraph on representation of a vector space as (K)^n, it is
stated that "It is easy to see that the operations of vector addition and
multiplication are preserved by this correspondence."
(a) Explain what is meant by "preserved".
(b) Prove that vector addition is preserved.
(c) Prove that scalar multiplication is preserved.
5. In Activity 3(a), choose the set which is a basis and find the coordinates
of the expansion of the vector ⟨1, 0, 1, 0, 1⟩ with respect to this basis.
6. In Activity 3(b), choose the set which is a basis and find the coordinates
of the expansion of the vector ⟨0, 1, 2, 0⟩ with respect to this basis.
7. Let P_4(R) be the vector space of all polynomials of degree less than or
equal to 4 with real coefficients.
(a) Find a basis which contains the polynomial x + 1.
(b) Find a basis which contains the polynomials x + 2 and x - 1.
(c) Find a basis which contains the polynomials x - 2 and x^2 - 2.
8. For each of the bases you found in the previous exercise, nd the ex-
pansion, with respect to that basis, of the polynomial x.
9. Consider the set of all finite sequences of real numbers.

(a) Define a scalar multiplication and a vector addition on this set, and show that with these operations it becomes a vector space.
(b) Explain what could be meant by the coordinate basis for this vector space, and show that it is a basis.
(c) Find a basis B for this vector space in which no sequence contains a zero.
(d) Consider the element of this vector space which is the sequence consisting of three 0s followed by three 1s. Find the expansion of this vector with respect to your basis B.
10. Prove Theorem 4.4.2.

4.4 Bases and Dimension 215

11. Prove Theorem 4.4.3.
12. Write out the proof of Theorem 4.4.4 for the case n = 3, m = 5 using no summation symbols.
13. (a) Consider the following homogeneous system of 3 equations in 4 unknowns:

x_1 − x_2 + 2x_3 + 4x_4 = 0
2x_1 + 3x_2 − x_3 + x_4 = 0
4x_1 + 5x_2 + 3x_3 − 2x_4 = 0.

Show there exists a non-trivial solution to the system.

(b) Generalize the result of the previous part to show that a system of n equations in m unknowns, with m > n, must have a solution in which not all of the unknowns are 0. This completes the proof of Theorem 4.4.4.
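The arithmetic of part (a) can be checked by machine. The following Python sketch is an editorial illustration, not part of the ISETL materials; the minus signs in the coefficient matrix are a reconstruction of the garbled original. It fixes the free unknown x_4 = 1 and solves the remaining 3×3 system by Cramer's rule, which automatically produces a non-trivial solution.

```python
from fractions import Fraction

# Reconstructed coefficient matrix of the homogeneous system in part (a);
# the minus signs are an editorial reconstruction of the garbled original.
A = [[1, -1, 2, 4],
     [2, 3, -1, 1],
     [4, 5, 3, -2]]

def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
          - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
          + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

# Fix the free unknown x4 = 1 and solve the remaining 3x3 system by
# Cramer's rule; any solution found this way is automatically non-trivial.
M = [row[:3] for row in A]
rhs = [Fraction(-row[3]) for row in A]          # move the x4 column across
d = det3(M)
assert d != 0

def col_replaced(m, j, b):
    return [[b[i] if k == j else m[i][k] for k in range(3)] for i in range(3)]

x = [det3(col_replaced(M, j, rhs)) / d for j in range(3)] + [Fraction(1)]
residuals = [sum(Fraction(a) * xi for a, xi in zip(row, x)) for row in A]
print(x, residuals)   # a non-trivial solution; all residuals are 0
```

Because x_4 = 1 by construction, the solution cannot be the zero vector, which is exactly what part (a) asks for.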
14. Prove Theorem 4.4.6.
15. Prove Theorem 4.4.7.
16. Prove Theorem 4.4.8.
17. Prove Theorem 4.4.9.
18. Prove Theorem 4.4.10.
19. (a) Choose several values for n and fields K, and in each case show that any polynomial in P_n(K) whose coefficients are not all zero cannot be the zero polynomial.
(b) Show in general, for any n and K, that any polynomial in P_n(K) whose coefficients are not all zero cannot be the zero polynomial.
20. Prove Theorem 4.4.11.
21. Prove Theorem 4.4.12.
22. Show the set {⟨1, 0, 1⟩, ⟨0, 1, 1⟩} is a basis for the solution space of the system of equations in Activity 8.
23. Show that the set {sin, cos} is linearly independent in the vector space C^∞(R).
24. Does the set {sin, cos} also span C^∞(R)? Explain.
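Linear independence of {sin, cos} comes down to the observation that a·sin + b·cos vanishes at x = 0 only if b = 0 and at x = π/2 only if a = 0. The short Python check below is an editorial sanity check of that argument on a grid of sample coefficients, not a proof.

```python
import math

# If a*sin + b*cos were the zero function, evaluating at x = 0 would force
# b = 0 and evaluating at x = pi/2 would force a = 0.  This check runs that
# argument numerically for a grid of candidate coefficients (a, b).
def combo(a, b, x):
    return a * math.sin(x) + b * math.cos(x)

for a in range(-3, 4):
    for b in range(-3, 4):
        if (a, b) != (0, 0):
            # some sample point must witness that the combination is nonzero
            assert any(abs(combo(a, b, x)) > 1e-9 for x in (0.0, math.pi / 2))
print("only a = b = 0 gives the zero function on the sample points")
```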
Chapter 5

Linear Transformations

Remember the definition of a function from previous mathematics courses? In calculus, functions are the main object of study, as differentiation and integration both operate on functions. You are probably saying to yourself, "Of course, a function is a mapping from some set called the domain into another set called the co-domain in which any element from the domain is mapped to exactly one element in the co-domain." Or something like "a subset f of the Cartesian product of A and B such that for every a ∈ A there is exactly one b ∈ B such that (a, b) ∈ f." This chapter is going to explore some functions from one vector space to another and consider how portions of the domains and ranges might be thought of as vector spaces themselves.
5.1 Introduction to Linear Transformations
Activities
1. Let U = (Z_3)^2, the vector space of ordered pairs of elements in Z_3. Let u, v ∈ U, where u = ⟨u_1, u_2⟩ and v = ⟨v_1, v_2⟩. Define a function T : U → U by

T(u) = ⟨(u_1 · u_2), (2 · u_2)⟩.

For all u, v ∈ U and for all c, d ∈ Z_3, perform the following steps:

(a) Compute cu + dv, and then find T(cu + dv).
(b) Compute T(u) and T(v), and then find cT(u) + dT(v).
(c) Determine whether T(cu + dv) = cT(u) + dT(v).
2. Let U = (Z_3)^2 and V = (Z_3)^3. Let u, v ∈ U, where u = ⟨u_1, u_2⟩ and v = ⟨v_1, v_2⟩. Define a function F : U → V by

F(u) = ⟨(u_1 + u_2), (2 · u_1 + u_2), u_1⟩.

For all u, v ∈ U and for all c, d ∈ Z_3, perform the following steps:

(a) Compute cu + dv, and then find F(cu + dv).
(b) Compute F(u) and F(v), and then find cF(u) + dF(v).
(c) Determine whether F(cu + dv) = cF(u) + dF(v).
3. Let U and V be vector spaces with scalars in K, and assume that name_vector_space has been run. Write an ISETL func is_linear that accepts a func H : U → V, where U and V indicate vector spaces over K; checks the equality

H(cu + dv) = cH(u) + dH(v)

for all pairs of scalars c, d ∈ K and all pairs of vectors u, v ∈ U; and returns true, if the equality being checked holds for all possible scalar and vector pairs, or false, if the equality does not hold. Apply this func to the funcs T and F defined in Activities 1 and 2.
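ISETL aside, the brute-force check that is_linear performs is easy to sketch in Python for the finite spaces above. This is an editorial illustration; the definitions of T and F below are reconstructions of the garbled formulas in Activities 1 and 2 (a product in T's first component, linear combinations throughout F).

```python
from itertools import product

p = 3                                   # working over the field Z_3
vectors2 = list(product(range(p), repeat=2))

def add(u, v):
    return tuple((a + b) % p for a, b in zip(u, v))

def scale(c, u):
    return tuple((c * a) % p for a in u)

def is_linear(H, domain):
    """Check H(cu + dv) == c*H(u) + d*H(v) for every c, d, u, v."""
    return all(
        H(add(scale(c, u), scale(d, v))) == add(scale(c, H(u)), scale(d, H(v)))
        for c in range(p) for d in range(p)
        for u in domain for v in domain)

# Reconstructed T from Activity 1 (product in the first component) and
# F from Activity 2 (linear combinations in every component).
T = lambda u: ((u[0] * u[1]) % p, (2 * u[1]) % p)
F = lambda u: ((u[0] + u[1]) % p, (2 * u[0] + u[1]) % p, u[0])

print(is_linear(T, vectors2), is_linear(F, vectors2))   # False True
```

Because the spaces are finite, the check is exhaustive rather than a spot check, which is exactly what the ISETL func is meant to do.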
4. Let U = Z_3. Let u, v ∈ U. Define a function H : U → U by

H(u) = u^2.

(a) Apply the func is_linear to H. Does is_linear return true or false in this case?
(b) If we define G : U → U by

G(u) = u^n,

where n is an integer greater than or equal to 1, for what values of n will is_linear return true?
5. Let U = (Z_5)^2. Run name_vector_space, and complete parts (a)-(e).

(a) Write a func T that accepts a vector ⟨u_1, u_2⟩ ∈ (Z_5)^2 and returns a vector in (Z_5)^2. The first component of the output is the sum of the product of u_1 by 2 with the product of u_2 by 4, and the second component is the sum of u_1 with the product of 3 and u_2.
(b) Write a func F that accepts a vector ⟨u_1, u_2⟩ ∈ (Z_5)^2 and returns the vector ⟨u_1 + 2, u_2⟩ ∈ (Z_5)^2.
(c) Write a func H that accepts a vector ⟨u_1, u_2⟩ ∈ (Z_5)^2 and returns the vector ⟨0, u_2⟩ ∈ (Z_5)^2.
(d) Write a func R that accepts a vector ⟨u_1, u_2⟩ ∈ (Z_5)^2 and returns the vector ⟨3u_1 + 2u_2 + 2, 2u_1 + u_2 + 3⟩ ∈ (Z_5)^2.
(e) Write a func S that accepts a vector ⟨u_1, u_2⟩ ∈ (Z_5)^2 and returns the vector ⟨3u_1 + 2u_2, 2u_1 + u_2⟩ ∈ (Z_5)^2.
(f) Apply the func is_linear to each of the funcs you have constructed in (a)-(e). Which return true? Which return false?
6. Let U = R^2 be the coordinate plane, the vector space of ordered pairs with real-valued components. Let u = ⟨2, 5⟩ and v = ⟨1, 3⟩ be two vectors in the plane. Define G : R^2 → R^2 to be the function that accepts a vector u ∈ R^2 and returns the vector found by rotating u counterclockwise through π/6 radians. If we think of u geometrically as an arrow that emanates from the origin, then a rotation through θ radians refers to rotating the given arrow θ radians in a counterclockwise direction.

(a) Use the ISETL tool vectors to graph G(u) + G(v) and G(u + v). What do you observe about the relationship between G(u) + G(v) and G(u + v)?
(b) Let c = 2. Use vectors to graph G(cu) and cG(u). What do you observe about the relationship between G(cu) and cG(u)?
(c) For u, v ∈ R^2 and c, d ∈ R, does G satisfy the equality

G(cu + dv) = cG(u) + dG(v)?

Explain your answer.
7. Let U = R^2 be the coordinate plane. Let u = ⟨3, 1⟩ and v = ⟨1, 3⟩ be two vectors in the plane. Define H : R^2 → R^2 to be the function that accepts a vector u ∈ R^2 and returns the vector found by reflecting u through the line whose equation is given by y = √3·x. If we think of u geometrically as an arrow emanating from the origin, then a reflection of u refers to finding the mirror image of its arrow with respect to the given reflecting line.

(a) Use vectors to graph H(u) + H(v) and H(u + v). What do you observe about the relationship between H(u) + H(v) and H(u + v)?
(b) Let c = 2. Use vectors to graph H(cu) and cH(u). What do you observe about the relationship between H(cu) and cH(u)?
(c) For u, v ∈ R^2 and c, d ∈ R, does H satisfy the equality

H(cu + dv) = cH(u) + dH(v)?

Explain your answer.
8. Let U = R^2 be the coordinate plane. Let u = ⟨3, 2⟩ and v = ⟨1, 5⟩ be two vectors in the plane. Define S : R^2 → R^2 to be the function that accepts a vector u ∈ R^2 and returns the vector found by translating the vector u by the vector ⟨3, 4⟩. If we think of u geometrically as an arrow that emanates from the origin, then a translation of u refers to moving u to a new location in the plane without disturbing its original direction.

(a) Use vectors to graph S(u) + S(v) and S(u + v). What do you observe about the relationship between S(u) + S(v) and S(u + v)?
(b) Let c = 2. Use vectors to graph S(cu) and cS(u). What do you observe about the relationship between S(cu) and cS(u)?
(c) For u, v ∈ R^2 and c, d ∈ R, does S satisfy the equality

S(cu + dv) = cS(u) + dS(v)?

Explain your answer.
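The translation of Activity 8 makes a good quick test case. The Python check below is an editorial illustration, not part of the ISETL materials: translating the sum and summing the translations give answers that differ by exactly the translation vector.

```python
# Translation by the fixed vector (3, 4), as in Activity 8.
def S(u):
    return (u[0] + 3, u[1] + 4)

def add(u, v):
    return (u[0] + v[0], u[1] + v[1])

u, v = (3, 2), (1, 5)
lhs = S(add(u, v))          # translate the sum:      (7, 11)
rhs = add(S(u), S(v))       # sum the translations:   (10, 15)
print(lhs, rhs)             # the two differ by exactly (3, 4)
```

The discrepancy already answers part (a) of the activity: additivity fails, so S cannot be linear.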
9. Let D^2 : C^∞(R) → C^∞(R) be the second derivative operator, D^2(f) = f''. Determine whether D^2 satisfies the condition

D^2(af + bg) = aD^2(f) + bD^2(g), f, g ∈ C^∞(R), a, b ∈ R.
10. Let ∫_0^1 : PF_3(R) → R be defined by

∫_0^1 p = ∫_0^1 p(t) dt.

Determine whether ∫_0^1 satisfies the condition

∫_0^1 (af + bg) = a ∫_0^1 f + b ∫_0^1 g, f, g ∈ PF_3(R), a, b ∈ R.
11. Let J : PF_2(R) → PF_3(R) be defined by

J(p) = x ↦ ∫_2^x p(t) dt.

Determine whether J satisfies the condition

J(af + bg) = aJ(f) + bJ(g), f, g ∈ PF_2(R), a, b ∈ R.
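Both activities reduce to exact polynomial arithmetic, so the integral condition can be checked by machine. The sketch below is editorial, with sample coefficients chosen for illustration; it represents a polynomial by its coefficient list and integrates over [0, 1] with exact rational arithmetic.

```python
from fractions import Fraction

# A polynomial is a coefficient list [a0, a1, a2, ...]; its integral over
# [0, 1] is exactly sum(a_i / (i + 1)).
def integral01(p):
    return sum(Fraction(a, i + 1) for i, a in enumerate(p))

def lincomb(a, p, b, q):
    """Coefficient list of a*p + b*q."""
    return [a * x + b * y for x, y in zip(p, q)]

f = [1, -2, 0, 5]        # 1 - 2x + 5x^3, a sample element of PF_3(R)
g = [4, 0, 3, -1]        # 4 + 3x^2 - x^3
a, b = Fraction(7), Fraction(-3)

# Linearity of the definite integral on these samples:
assert integral01(lincomb(a, f, b, g)) == a * integral01(f) + b * integral01(g)
print("integral over [0,1] is linear on the sample polynomials")
```

Of course a passing spot check is not a proof; the activities ask you to argue the condition for all f, g, a, b.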
Discussion

Functions between Vector Spaces

In calculus, you worked with functions whose domains and ranges were both some set of real numbers: each input was a real number, and each output was a real number. In multivariable calculus, you broadened your horizons a bit. A function of two variables accepts an ordered pair (x, y) of numbers and returns a real number. A function of three variables accepts an ordered triple (x, y, z) and returns a real number. With vector-valued functions, that is, functions whose ranges are vector spaces, the situation is reversed: each input is a real number, and each output can be an ordered pair or an ordered triple. For example, a function h : R → R^3 defined by h(x) = ⟨x^2, 2x + 3, 3x^3⟩ accepts a real number x and returns the vector, or ordered triple, ⟨x^2, 2x + 3, 3x^3⟩.

In linear algebra, you have the opportunity to expand your view even further. Look again at the functions T and F defined in the first two activities. What are the inputs and outputs of these functions? You might have noticed that both accept a vector u in (Z_3)^2, while T returns a vector in (Z_3)^2, and F returns a vector in (Z_3)^3. A function between vector spaces assigns one and only one output vector, say v in V, to each input vector, say u in U. Such functions are often called vector transformations or just transformations. Check all the functions you worked with in the activities: Which of these are vector transformations? In Activity 1, you were asked whether T(cu + dv) = cT(u) + dT(v). Part (c) of Activity 2 asked you a similar question regarding the function F. In Activity 3, you were asked to write a func that would check this condition for any vector space function H : U → V. In particular, given any H : U → V, any pair of vectors u, v ∈ U, and any pair of scalars c, d, does the following equality hold:

H(cu + dv) = cH(u) + dH(v)?

Any vector space function that satisfies this condition is called a linear transformation. Is the function T defined in Activity 1 a linear transformation? Is the function F defined in Activity 2 a linear transformation?
Definition and Significance of Linear Transformations

The single condition you used in Activity 3 to construct the func is_linear can be separated into two conditions, as presented in the definition given below.

Definition 5.1.1. Let U and V be vector spaces with scalars in K. A function T : U → V is a linear transformation if

(i.) T(u + v) = T(u) + T(v) for u, v ∈ U, and
(ii.) T(cu) = cT(u) for u ∈ U and c ∈ K.

How could the func you defined in Activity 3 be modified to check the two conditions given in the definition? Of those funcs in Activities 1, 2, 4, and 5 that are not linear, which fail condition (i)? Condition (ii)? Both?

In Activity 4, we defined a familiar function using vector notation. You can verify that the set of real numbers R is in fact a vector space over itself, that is, the set of scalars is also R. The function H defined in this activity, like any function from R to R, is a vector transformation: a real number, say x ∈ R, thought of in this context as a vector, is assigned to the vector, or real number, x^2. When you applied is_linear, what did you find? Is H a linear transformation? For what n is the function G defined in Activity 4(b) a linear transformation?
Many of the functions you studied in calculus are not linear. However, linear transformations are extremely important in calculus. For example, when you compute the derivative D of a differentiable function g : R → R at a point x = a, the function G : R → R given by

G(x) = D(g)(a) · (x − a),

where (·) denotes real number multiplication, is a linear transformation that approximates g near x = a. This is also true of multivariable functions involving the gradient. For a function of two variables h : R^2 → R, the function H : R^2 → R given by

H(x, y) = ⟨D_x(h)(a, b), D_y(h)(a, b)⟩ · ⟨(x − a), (y − b)⟩,

where D_x denotes the partial derivative function with respect to x, D_y represents the partial derivative function with respect to y, and (·) represents the dot product, is a linear transformation that approximates h near the point (a, b). Can you show that H is linear?
In Activity 9, you were asked to determine whether the differential operator D^2 is linear. What did you find? Is your answer consistent with what has just been discussed regarding the functions G and H?

In Activities 10 and 11, you considered the definite integral applied to polynomials from PF_3(R) and PF_2(R). Do the definite integrals defined in these activities satisfy the conditions given in Definition 5.1.1? If not, what modifications would need to be made to ensure linearity?
Why is linearity so important? The two conditions specified in Definition 5.1.1 ensure preservation of vector addition and scalar multiplication for a function between two vector spaces. Suppose that T : U → V is a linear transformation. The first condition given in Definition 5.1.1,

T(u + v) = T(u) + T(v), u, v ∈ U,

guarantees that the vector assigned to u + v under T is equal to the sum of the vectors assigned by T to u and v individually. This is illustrated in the diagram below: the sum of the outputs, T(u) and T(v), is equal to the output of the sum, T(u + v).

(u, v) --T--> (T(u), T(v))
   |                |
   +                +
   v                v
 u + v --T--> T(u) + T(v)

Thus, T and + can be applied in any order: taking the sum of u and v followed by applying T yields the same answer as taking the sum of the images of u and v separately.

The second condition given in Definition 5.1.1,

T(cu) = cT(u), u ∈ U, c ∈ K,

guarantees preservation of scalar multiplication: the vector assigned by T to the scalar product cu is equal to the product of c with the vector assigned by T to u.

  u --T--> T(u)
  |          |
  c          c
  v          v
 cu --T--> cT(u)
Similar to the case involving the sum, T and scalar multiplication can be applied in any order: taking the product cu followed by applying T yields the same answer as multiplying the image of u by the scalar c.

In Activities 6, 7, and 8, you were asked whether familiar geometric transformations such as rotations, reflections, and translations preserve the operations of vector addition and scalar multiplication in R^2. Activity 6 described a rotation through π/6 radians. The general definition and its analytic, or algebraic, representation are given in the following definition.

Definition 5.1.2. A rotation is a function that takes a vector in R^2 and rotates it in a counterclockwise fashion through an angle of θ radians. It is given by the formula

T(⟨x, y⟩) = ⟨x cos θ − y sin θ, x sin θ + y cos θ⟩.

Figure 5.1: A rotation
As the figure indicates, let ⟨x, y⟩ ∈ R^2 be a vector in the plane, and assume that the vector ⟨x, y⟩ forms an angle of α radians with the x-axis. Let ⟨x', y'⟩ be the vector returned after ⟨x, y⟩ has been rotated through θ radians.

If we drop a perpendicular segment from the point (x, y) to the x-axis, we can see that the segment from the origin to the point (x, 0) has length x, the segment from (x, 0) to (x, y) has length y, and the vector ⟨x, y⟩ has length √(x^2 + y^2). We can show that

x = √(x^2 + y^2) cos α
y = √(x^2 + y^2) sin α.

Similarly, if we drop a perpendicular segment from (x', y') to the x-axis, we can see that

x' = √(x^2 + y^2) cos(α + θ)
y' = √(x^2 + y^2) sin(α + θ).

If we then apply standard trigonometric identities, we can see that the geometric representation is the same as the analytic representation, which is given by the expression for T.
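The angle-sum derivation is easy to confirm numerically. The short Python check below is editorial: it compares the polar form r·cos(α + θ), r·sin(α + θ) with the analytic formula from Definition 5.1.2 on a sample vector.

```python
import math

x, y = 3.0, 2.0                     # a sample vector
theta = math.pi / 6                 # the rotation angle from Activity 6
r = math.hypot(x, y)                # sqrt(x^2 + y^2)
alpha = math.atan2(y, x)            # angle the vector makes with the x-axis

# Polar form of the rotated vector ...
x_polar = r * math.cos(alpha + theta)
y_polar = r * math.sin(alpha + theta)

# ... versus the analytic formula of Definition 5.1.2.
x_formula = x * math.cos(theta) - y * math.sin(theta)
y_formula = x * math.sin(theta) + y * math.cos(theta)

assert math.isclose(x_polar, x_formula) and math.isclose(y_polar, y_formula)
print("polar and analytic forms of the rotation agree")
```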
In Activity 7, you were asked to consider a reflection through the line y = √3·x.

Figure 5.2: A reflection through the line y = mx

In Activity 8, you considered a translation, a function T : R^2 → R^2 that takes a vector in R^2 and returns a vector given by the formula

T(⟨x, y⟩) = ⟨x + c, y + d⟩.

Let ⟨x, y⟩ ∈ R^2 be a vector in the plane, and assume that ⟨x', y'⟩ represents the vector returned after ⟨x, y⟩ has been translated by the vector ⟨c, d⟩. As the figure illustrates, the components of ⟨x', y'⟩ are the vector sum of ⟨x, y⟩ and ⟨c, d⟩.
Based upon your work in the activities, which of these geometric transformations is a linear transformation? For each one that is not, can you identify which operation, vector addition or scalar multiplication, is not preserved?

In addition to preserving vector addition and scalar multiplication, linear transformations preserve lines. A line through the vector a in the direction of v is given by the set

{tv + a : t ∈ R}.

Figure 5.4: Vector form of a line

As shown in Figure 5.4 in R^2, the direction vector v emanates from the origin, a is the vector through which the line passes, and every point on the line can be represented as the vector sum of a and some scalar multiple of v.
Linear transformations transform lines into lines. For example, the transformation T : R^3 → R^3 given by

T(⟨u_1, u_2, u_3⟩) = ⟨u_1 + u_2, u_2 − u_3, u_1 + u_3⟩

transforms the line {t⟨2, 1, −1⟩ + ⟨−1, 3, 1⟩ : t ∈ R} into the line {t⟨3, 2, 1⟩ + ⟨2, 2, 0⟩ : t ∈ R}, as shown in Figure 5.5.

Figure 5.5: Linear transformation of a line
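The minus signs in this example are an editorial reconstruction of the garbled original, chosen so that the arithmetic works out. The quick Python check below confirms that T carries the direction vector and base point of the first line onto those of the second, and spot-checks a few points along the line.

```python
def T(u):
    # The example transformation T : R^3 -> R^3 from the text.
    return (u[0] + u[1], u[1] - u[2], u[0] + u[2])

v, a = (2, 1, -1), (-1, 3, 1)       # direction and base point of the line

# Images of the direction vector and base point:
print(T(v), T(a))                   # (3, 2, 1) (2, 2, 0)

# T of a general point t*v + a equals t*T(v) + T(a); spot-check a few t.
for t in (-2, 0, 1, 5):
    point = tuple(t * vi + ai for vi, ai in zip(v, a))
    image = tuple(t * wi + bi for wi, bi in zip(T(v), T(a)))
    assert T(point) == image
```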
The example being considered here can be generalized.

Theorem 5.1.1. Let T : R^n → R^n be a linear transformation, and let l be a line in R^n given by

l = {tv + a : t ∈ R}.

Then the image of l under T is also a line.

Proof. We must show that T(l) is a line. Since T is assumed to be a linear transformation, for each point tv + a of l we can write

T(tv + a) = T(tv) + T(a) = tT(v) + T(a).

Since T is a vector transformation from R^n to R^n, T(v) ∈ R^n and T(a) ∈ R^n. Therefore, T(l) = {tT(v) + T(a) : t ∈ R} is a line; that is, T transforms the line passing through the vector a in the direction of v into the line passing through the vector T(a) in the direction of T(v).

In the exercises, you will be asked to show related consequences of linearity. Specifically, linear transformations transform parallel lines into parallel lines, line segments into line segments, and squares into parallelograms.
Component Functions and Linear Transformations

Many of the vector spaces you have studied in this course involve spaces of tuples, that is, a vector space K^n, where K is the set of real numbers R or some finite field such as Z_5. Later in this text, we will show that non-tuple, finite dimensional vector spaces are structurally equivalent to spaces of tuples of the same dimension. Hence, any insight regarding linear transformations between spaces of tuples is of particular importance to us.

As you may have noticed in the activities, any function H : K^n → K^m can be decomposed into a set of component functions. For example, the function T : (Z_3)^2 → (Z_3)^2, given in Activity 1 and defined by T(⟨a, b⟩) = ⟨(a · b), (2 · b)⟩, can be expressed in the form

T(⟨a, b⟩) = ⟨t_1(⟨a, b⟩), t_2(⟨a, b⟩)⟩,

where the function t_1 : (Z_3)^2 → Z_3, defined by

t_1(⟨a, b⟩) = (a · b),

corresponds to the expression in the first component, and the function t_2 : (Z_3)^2 → Z_3, defined by

t_2(⟨a, b⟩) = (2 · b),

corresponds to the expression given in the second component. Can you provide similar descriptions for the func F defined in Activity 2 and the funcs described in Activity 5?
Consider the transformation defined in Activity 2. In this example, you may have noticed that the expression for each component function is a linear combination of the components of the input vector. Does this characteristic appear to hold for those funcs in Activity 5 that you deemed to be linear? Is it true for the func defined in Activity 1, a transformation which you discovered was not linear? What about the non-linear funcs defined in Activity 5: is each component a linear combination of the components of the input vector, or is there at least one component for which this fails?

As the next theorem illustrates, the patterns you discovered in the activities can be generalized.
Theorem 5.1.2. A function T : K^n → K^m given by

T(u) = ⟨f_1(u), f_2(u), . . . , f_m(u)⟩

is a linear transformation if and only if each component function

f_i : K^n → K, i = 1, 2, . . . , m,

is given by

f_1(u) = f_1(⟨u_1, u_2, . . . , u_n⟩) = a_11 u_1 + a_12 u_2 + ··· + a_1n u_n
f_2(u) = f_2(⟨u_1, u_2, . . . , u_n⟩) = a_21 u_1 + a_22 u_2 + ··· + a_2n u_n
...
f_m(u) = f_m(⟨u_1, u_2, . . . , u_n⟩) = a_m1 u_1 + a_m2 u_2 + ··· + a_mn u_n,

where each a_ij, i = 1, 2, . . . , m, j = 1, 2, . . . , n, is a scalar.
Proof. (⇐:) Let u = ⟨u_1, u_2, . . . , u_n⟩ ∈ K^n. Then,

T(u) = T(⟨u_1, u_2, . . . , u_n⟩) = ⟨f_1(⟨u_1, u_2, . . . , u_n⟩), f_2(⟨u_1, u_2, . . . , u_n⟩), . . . , f_m(⟨u_1, u_2, . . . , u_n⟩)⟩,

where we assume that each component function is

f_1(u) = f_1(⟨u_1, u_2, . . . , u_n⟩) = a_11 u_1 + a_12 u_2 + ··· + a_1n u_n
f_2(u) = f_2(⟨u_1, u_2, . . . , u_n⟩) = a_21 u_1 + a_22 u_2 + ··· + a_2n u_n
...
f_m(u) = f_m(⟨u_1, u_2, . . . , u_n⟩) = a_m1 u_1 + a_m2 u_2 + ··· + a_mn u_n,

and each a_ij ∈ K, i ∈ {1, . . . , m}, j ∈ {1, . . . , n}, is a scalar. To establish that T is a linear transformation, it suffices to show that each component function f_i : K^n → K, i = 1, 2, . . . , m, is linear. The details are left as an exercise. See Exercise 18.
(⇒:) Assume that T is a linear transformation. Let u = ⟨u_1, u_2, . . . , u_n⟩ ∈ K^n, and rewrite u as the sum

u = ⟨u_1, u_2, . . . , u_n⟩ = ⟨u_1, 0, . . . , 0⟩ + ⟨0, u_2, 0, . . . , 0⟩ + ··· + ⟨0, . . . , 0, u_n⟩.

Since we are assuming that T is a linear transformation,

T(u) = T(⟨u_1, 0, . . . , 0⟩) + T(⟨0, u_2, 0, . . . , 0⟩) + ··· + T(⟨0, . . . , 0, u_n⟩).

Since each vector ⟨0, . . . , 0, u_j, 0, . . . , 0⟩ has a single nonzero component, each component function f_i : K^n → K, i = 1, . . . , m, behaves like a single-variable function that accepts u_j and returns a scalar; specifically,

f_i(⟨0, . . . , 0, u_j, 0, . . . , 0⟩) = a_ij u_j,

where a_ij ∈ K. If we take the sum over all j, we obtain the desired result for each component function f_i. The details are left to the exercises. See Exercise 19.

In the second part of the proof, (⇒), there is an assumption being made. Can you identify what that assumption is? Can you state and prove a theorem that would address this assumption?
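Theorem 5.1.2 also suggests a computation: the scalars a_ij are recovered as the images of the coordinate basis vectors. The Python sketch below is editorial; it runs this recovery for the linear func S of Activity 5(e) over (Z_5)^2 and then verifies that the recovered matrix reproduces S everywhere.

```python
p = 5

def S(u):
    # The linear func from Activity 5(e): <3u1 + 2u2, 2u1 + u2> over Z_5.
    return ((3 * u[0] + 2 * u[1]) % p, (2 * u[0] + u[1]) % p)

# Images of the coordinate basis vectors give the columns of the matrix.
cols = [S((1, 0)), S((0, 1))]
A = [[cols[j][i] for j in range(2)] for i in range(2)]   # a_ij = f_i(e_j)
print(A)   # [[3, 2], [2, 1]]

# The recovered matrix reproduces S on every vector of (Z_5)^2.
for u1 in range(p):
    for u2 in range(p):
        via_matrix = tuple(sum(A[i][j] * (u1, u2)[j] for j in range(2)) % p
                           for i in range(2))
        assert via_matrix == S((u1, u2))
```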
Non-Tuple Vector Spaces

Throughout this section, we have considered transformations between vector spaces of tuples. In other chapters, you have been introduced to other, non-tuple examples. Although these examples are familiar, you have been asked to think about them in a new context. For instance, consider the set C^∞(R) of all infinitely differentiable functions from R to R. This is a vector space with scalars in R. Do you remember what each vector in this space looks like? Do you recall how the addition and scalar multiplication operations are defined?

In Activity 9, you were asked to determine whether the second derivative operator is linear. What did you observe? If we define a function F : C^∞(R) → C^∞(R) by

F(f) = D^2(f) + f,

where f ∈ C^∞(R), is F a linear transformation? If so, can you prove that F is linear? If not, can you explain which condition of linearity is being violated?
Another class of non-tuple vector spaces are sets of polynomial functions organized by degree. For instance,

PF_3(R) = {x ↦ a_0 + a_1 x + a_2 x^2 + a_3 x^3 : a_0, a_1, a_2, a_3 ∈ R},

the set of all polynomial functions of degree three or less with real-valued coefficients, is a vector space. What did you find in Activities 10 and 11? Is the definite integral defined in Activity 10 linear? What about the function J?

Several of the exercises will ask you to make similar determinations. In particular, when given a function between two non-tuple vector spaces, how does one determine whether the given function satisfies the two conditions of linearity specified in Definition 5.1.1?
Exercises

1. Explain why each of the following functions from R^2 to R^2 is not a linear transformation.

(a) T_1(⟨x, y⟩) = ⟨3y + 1, x + y⟩
(b) T_2(⟨x, y⟩) = ⟨xy, 3x + 4y⟩
(c) T_3(⟨x, y⟩) = ⟨y/4, 2y + x + 4⟩
(d) T_4(⟨x, y⟩) = ⟨3x + 2y − 4, 5x − y + 7⟩

2. Define T : R^2 → R by T(⟨x, y⟩) = xy. Determine whether T is a linear transformation. If the function is linear, use the definition to prove it; if the function is not linear, explain why the definition fails.

3. Define T : R^3 → R^5 by

T(⟨u_1, u_2, u_3⟩) = ⟨(3u_1 + 2u_2 − u_3), (u_1 − 2u_2 − u_3), 0, (u_2 + 5u_3), (3u_1 + 3u_2)⟩.

Use the definition to show that T is a linear transformation.

4. Define T : R^n → R by

T(u) = a · u,

where a = ⟨a_1, a_2, . . . , a_n⟩ ∈ R^n, u is any vector in R^n, and (·) represents the dot product on R^n. Use the definition to show that T is a linear transformation. What is the significance of T as it relates to the topic of the dot product in multivariable calculus?
5. Determine whether each function given below is linear.

(a) D : P_3(Z_5) → P_2(Z_5) defined by

D(a_3 x^3 + a_2 x^2 + a_1 x + a_0) = 3a_3 x^2 + 2a_2 x + a_1.

(b) T : P_3(R) → R defined by

T(a_3 x^3 + a_2 x^2 + a_1 x + a_0) = a_3 − a_2 + a_0.

(c) G : P_3(R) → R defined by

G(a_3 x^3 + a_2 x^2 + a_1 x + a_0) = 2a_3 − a_2 + a_1.

(d) H : P_3(Z_5) → P_3(Z_5) defined by

H(a_3 x^3 + a_2 x^2 + a_1 x + a_0) = a_3 x^3 + a_2 x^2 + a_1 x + a_0 + 3.

(e) S : P_3(R) → P_3(R) defined by

S(a_3 x^3 + a_2 x^2 + a_1 x + a_0) = a_3 (x + 2)^3 + a_2 (x + 2)^2 + a_1 (x + 2) + a_0.
6. Determine whether each function given below is linear.

(a) D : PF_3(R) → PF_2(R) defined by

D(p) = p',

where p' is the derivative of p.

(b) T : PF_3(R) → R defined by

T(p) = ∫_0^1 p(t) dt.

(c) J : PF_3(R) → PF_4(R) defined by

J(p) = x ↦ ∫_2^x p(t) dt.
7. Let V be the vector space of all functions from R to R whose domain is all of R. Define T : C^∞(R) → V by

T(f) = F,

where F is a function such that D(F) = f and F(0) = k ≠ 0. Use the definition to show that T is not a linear transformation. In order for T to be a linear transformation, we would have to limit ourselves to a certain subspace of C^∞(R).

8. Let C^∞(R) be the vector space of all functions that are infinitely differentiable. Let D : C^∞(R) → C^∞(R) be the derivative operator, and define P : C^∞(R) → R by

P(g) = D(g)(1/2),

where g ∈ C^∞(R). Determine whether P is a linear transformation.
If T : K^n → K^m is the linear transformation defined by

T(⟨x_1, x_2, . . . , x_n⟩) = ⟨a_11 x_1 + a_12 x_2 + ··· + a_1n x_n,
a_21 x_1 + a_22 x_2 + ··· + a_2n x_n, . . . ,
a_m1 x_1 + a_m2 x_2 + ··· + a_mn x_n⟩,

then the kernel, defined to be the set

ker(T) = {⟨x_1, x_2, . . . , x_n⟩ ∈ K^n : T(⟨x_1, x_2, . . . , x_n⟩) = ⟨0, 0, . . . , 0⟩},

is the solution set of the system

a_11 x_1 + a_12 x_2 + ··· + a_1n x_n = 0
a_21 x_1 + a_22 x_2 + ··· + a_2n x_n = 0
...
a_m1 x_1 + a_m2 x_2 + ··· + a_mn x_n = 0.
Conversely, the solution set in K^n of the homogeneous system

b_11 x_1 + b_12 x_2 + ··· + b_1n x_n = 0
b_21 x_1 + b_22 x_2 + ··· + b_2n x_n = 0
...
b_m1 x_1 + b_m2 x_2 + ··· + b_mn x_n = 0

5.2 Kernel and Range 243

is the kernel of the linear transformation T : K^n → K^m defined by

T(⟨x_1, x_2, . . . , x_n⟩) = ⟨b_11 x_1 + b_12 x_2 + ··· + b_1n x_n,
b_21 x_1 + b_22 x_2 + ··· + b_2n x_n, . . . ,
b_m1 x_1 + b_m2 x_2 + ··· + b_mn x_n⟩.

This explains why KER and SOL were equal. Since ker(T) is a subspace of U, the relationship shown above suggests that the solution set of a homogeneous system is also a subspace. This is indeed the case.
Theorem 5.2.3. The solution set of a homogeneous system of m equations in n unknowns is a subspace of the vector space V = K^n.

Proof. Let

a_11 x_1 + a_12 x_2 + ··· + a_1n x_n = 0
a_21 x_1 + a_22 x_2 + ··· + a_2n x_n = 0
...
a_m1 x_1 + a_m2 x_2 + ··· + a_mn x_n = 0

be a homogeneous system of m equations in n unknowns over K. Suppose that u = ⟨u_1, u_2, . . . , u_n⟩ and v = ⟨v_1, v_2, . . . , v_n⟩ are solutions. Then, for every i = 1, 2, . . . , m,

a_i1 (u_1 + v_1) + a_i2 (u_2 + v_2) + ··· + a_in (u_n + v_n)
= a_i1 u_1 + a_i1 v_1 + a_i2 u_2 + a_i2 v_2 + ··· + a_in u_n + a_in v_n
= [a_i1 u_1 + a_i2 u_2 + ··· + a_in u_n] + [a_i1 v_1 + a_i2 v_2 + ··· + a_in v_n] = 0,

which shows that u + v is a solution of the system. In a similar manner, we can show that if c ∈ K is a scalar and if v is a solution, then the scalar product cv is a solution. See Exercise 11.
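Over a finite field the subspace claim can be checked exhaustively. The Python sketch below is editorial, with one sample homogeneous equation over Z_5 chosen for illustration; it enumerates the solution set and verifies closure under vector addition and scalar multiplication.

```python
from itertools import product

p = 5

# Sample homogeneous equation over Z_5: x1 + 2*x2 + 3*x3 = 0.
def is_solution(x):
    return (x[0] + 2 * x[1] + 3 * x[2]) % p == 0

solutions = [x for x in product(range(p), repeat=3) if is_solution(x)]
print(len(solutions))   # 25 = 5^2, a two-dimensional subspace of (Z_5)^3

# Closure under vector addition and scalar multiplication:
for u in solutions:
    for v in solutions:
        assert is_solution(tuple((a + b) % p for a, b in zip(u, v)))
    for c in range(p):
        assert is_solution(tuple((c * a) % p for a in u))
```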
If U and V are not vector spaces of tuples, the kernel of a linear transformation may not be representable as the solution set of a homogeneous system of equations. Even though no such correspondence exists, Definition 5.2.1 and Theorems 5.2.1 and 5.2.2 still apply.
The Image Space of a Linear Transformation

Let T : U → V be a linear transformation between the vector spaces U and V. We will call the set of all inputs, U, the domain space of T. The vector space V will be referred to as the range space of T. The set of all vectors in V that are assigned to at least one vector in U under T is the image space of T. We denote the image space of T using the notation

T(U) = {v ∈ V : there exists u ∈ U such that T(u) = v}.

The term image refers to the output of a single vector: if u ∈ U, then v = T(u) is the image of u under T. In Activity 5, you constructed the set IMAGESPACE, which is, in fact, the image space of T defined in Activity 2. Like the kernel of T, this set is a subspace of V.

Theorem 5.2.4. Let U and V be vector spaces with scalars in K. Let T : U → V be a linear transformation. The image space of T, T(U), is a subspace of V.

Proof. In order to show that T(U) is a subspace of V, we must show that the sum of any two vectors in T(U) is a vector in T(U) and that any scalar product of a vector in T(U) is a vector in T(U). The details are left as an exercise. See Exercise 13.
In the last subsection, we showed that there is a correspondence between the kernel of a linear transformation between tuple spaces and the solution set of a homogeneous system of equations. There is a similar correspondence between systems and the image space. For example, if T : K^n → K^m is defined by

T(⟨x_1, x_2, . . . , x_n⟩) = ⟨a_11 x_1 + a_12 x_2 + ··· + a_1n x_n,
a_21 x_1 + a_22 x_2 + ··· + a_2n x_n, . . . ,
a_m1 x_1 + a_m2 x_2 + ··· + a_mn x_n⟩,
then v = ⟨v_1, v_2, . . . , v_m⟩ ∈ T(K^n) if and only if the non-homogeneous system

a_11 x_1 + a_12 x_2 + ··· + a_1n x_n = v_1
a_21 x_1 + a_22 x_2 + ··· + a_2n x_n = v_2
...
a_m1 x_1 + a_m2 x_2 + ··· + a_mn x_n = v_m

has a solution.

Figure 5.7: The image space of T

If there were no solution, could we say that v ∈ T(K^n)?

If U and V are not vector spaces of tuples, an element of the image space may not correspond to the solution set of a particular system of equations. However, the existence of a solution to T(u) = v and the presence of v in T(U) amount to the same thing.
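Over a finite field this correspondence can be seen by brute force. The Python sketch below is editorial, with a sample 3×2 coefficient matrix over Z_5; it enumerates the image space of T and checks, for every candidate v, that membership in the image agrees with solvability of T(x) = v.

```python
from itertools import product

p = 5
A = [[1, 2], [2, 4], [0, 1]]        # sample a_ij for T : (Z_5)^2 -> (Z_5)^3

def T(x):
    return tuple(sum(row[j] * x[j] for j in range(2)) % p for row in A)

domain = list(product(range(p), repeat=2))
image = {T(x) for x in domain}
print(len(image))                   # 25: the columns of A are independent

# v lies in the image space exactly when the system T(x) = v has a solution.
for v in product(range(p), repeat=3):
    solvable = any(T(x) == v for x in domain)
    assert solvable == (v in image)
```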
Bases for the Kernel and Image Space

In Activity 4, you constructed a basis for SOL. In Activity 7, you extended this set to a basis for the entire space (Z_5)^5. What theorem from Chapter 4 allowed you to do this? After having found the image T(u) of each new basis vector, you applied the func All_LC to the set of these images. Did this set generate the set IMAGESPACE? Is this set independent? If so, can you describe a procedure for finding a basis of the image space T(K^n) of a linear transformation T : K^n → K^m? If not, can you explain how things break down?

Since the kernel and image space of a linear transformation are subspaces, we can talk about the dimension of each set. After having analyzed the relationship between the dimensions of (Z_5)^5, IMAGESPACE, and KER in Activity 8, what did you conjecture, in general, regarding the relationship between the dimensions of K^n, ker(T), and T(K^n) for a linear transformation T : K^n → K^m?
Before considering the theorem that addresses your conjecture, we need to introduce two terms, rank and nullity, that are used in this theorem. In Chapter 3, you learned that a system of equations, such as

a_11 x_1 + a_12 x_2 + ··· + a_1n x_n = c_1
a_21 x_1 + a_22 x_2 + ··· + a_2n x_n = c_2
...
a_m1 x_1 + a_m2 x_2 + ··· + a_mn x_n = c_m,
can be represented as a matrix equation

[ a_11  a_12  ...  a_1n ] [ x_1 ]   [ c_1 ]
[ a_21  a_22  ...  a_2n ] [ x_2 ] = [ c_2 ]
[  .     .          .   ] [  .  ]   [  .  ]
[ a_m1  a_m2  ...  a_mn ] [ x_n ]   [ c_m ]
The solution of such a system can be found by transforming the augmented matrix

[ a_11  a_12  ...  a_1n | c_1 ]
[ a_21  a_22  ...  a_2n | c_2 ]
[  .     .          .   |  .  ]
[ a_m1  a_m2  ...  a_mn | c_m ]
into reduced echelon form. If you recall, the rank of an augmented matrix,
or any matrix for that matter, is dened to be the number of nonzero rows
appearing in its reduced echelon form. In Activity 6, you showed that the
image of each coordinate basis vector under T was a column of the matrix A.
The columns of A generated the set IMAGESPACE. The rank of A, as dened
in Activity 6, turns out to be equal to the dimension of IMAGESPACE, the
image space of T. This link is the basis for the use of the term rank in
Theorem 5.2.5. If you apply elementary row operations to A in Activity 6,
can you verify that its rank is equal to the dimension of IMAGESPACE?
The term nullity refers to the dimension of the null space of a linear
transformation. The term null space is simply the kernel. Some texts elect
to refer to the kernel as the null space. It would be wise for you to be familiar
with both.
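The row-reduction definition of rank can be checked mechanically. The
following sketch is in Python rather than the text's ISETL (a hypothetical
helper for readers without ISETL, not part of the book's code): it
row-reduces a matrix over the rationals and counts the nonzero rows.

```python
from fractions import Fraction

def rank(rows):
    """Count the nonzero rows in the reduced echelon form of a matrix,
    working over the rationals so no precision is lost."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0                                  # index of the next pivot row
    for c in range(len(m[0]) if m else 0):
        # find a row at or below r with a nonzero entry in column c
        piv = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if piv is None:
            continue
        m[r], m[piv] = m[piv], m[r]
        p = m[r][c]
        m[r] = [x / p for x in m[r]]       # scale the pivot to 1
        for i in range(len(m)):
            if i != r and m[i][c] != 0:    # clear the rest of the column
                m[i] = [a - m[i][c] * b for a, b in zip(m[i], m[r])]
        r += 1
    return r                               # one nonzero row per pivot

# The coefficient matrix of the 3x3 system discussed later in this section:
print(rank([[1, -2, 3], [3, -4, 5], [2, -3, 4]]))  # 2
```

The function returns the rank directly, so comparing it with the dimension
of IMAGESPACE is a one-line check.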
5.2 Kernel and Range 247
Theorem 5.2.5 (Rank and Nullity). Let U and V be finite dimensional
vector spaces with scalars in K, and let T : U → V be a linear
transformation. Then,

    dim[ker(T)] + dim[T(U)] = dim(U).

Proof. Suppose dim(U) = n and dim[ker(T)] = m. Let u_1, u_2, ..., u_m be
a basis for ker(T). By Theorem 4.4.10, we can extend this linearly
independent set to a basis

    u_1, u_2, ..., u_m, u_{m+1}, u_{m+2}, ..., u_n.

    [ x_1 ]   [ b_1 ]
    [ x_2 ] = [ b_2 ]
    [  :  ]   [  :  ]
    [ x_n ]   [ b_n ]

The coefficients of the system are given by the matrix A, the unknowns by
the vector X, and the constants by the vector B. Specifically, the system
can be represented by the equation AX = B. The associated homogeneous
system AX = 0 that is mentioned in Activity 9 is nothing more than the
system

    [ a_11  a_12  ...  a_1n ] [ x_1 ]   [ 0 ]
    [ a_21  a_22  ...  a_2n ] [ x_2 ] = [ 0 ]
    [  :     :          :   ] [  :  ]   [ : ]
    [ a_m1  a_m2  ...  a_mn ] [ x_n ]   [ 0 ]

where the constants b_1, b_2, ..., b_n are replaced by zeros. Given a
particular solution X = v_p and a solution X = v_0 of the associated
homogeneous system, what did you find in Activity 9 regarding the vector
sum v_p + v_0? Is it a solution of AX = B? Can every solution of AX = B
be written this way? How do your findings compare with the statement of
the theorem given below?
Theorem 5.2.7. Let AX = B be a system of m equations in n unknowns over
a field K. Then v_s is a solution of AX = B if and only if
v_s = v_p + v_0, where v_p is a particular solution of AX = B, and v_0 is
a solution of the associated homogeneous system AX = 0.

Proof. (⇒) Define T : K^n → K^m by

    T(⟨x_1, x_2, ..., x_n⟩) = ⟨a_11 x_1 + a_12 x_2 + ... + a_1n x_n,
        a_21 x_1 + a_22 x_2 + ... + a_2n x_n, ...,
        a_m1 x_1 + a_m2 x_2 + ... + a_mn x_n⟩.
As demonstrated in the first three activities, the kernel of T can be
found by solving the homogeneous system

    [ a_11  a_12  ...  a_1n ] [ x_1 ]   [ 0 ]
    [ a_21  a_22  ...  a_2n ] [ x_2 ] = [ 0 ]
    [  :     :          :   ] [  :  ]   [ : ]
    [ a_m1  a_m2  ...  a_mn ] [ x_n ]   [ 0 ]
Any solution v_s of the system

    [ a_11  a_12  ...  a_1n ] [ x_1 ]   [ b_1 ]
    [ a_21  a_22  ...  a_2n ] [ x_2 ] = [ b_2 ]
    [  :     :          :   ] [  :  ]   [  :  ]
    [ a_m1  a_m2  ...  a_mn ] [ x_n ]   [ b_n ]
is a vector whose components, when substituted for the x_i, satisfy the
system given above, as well as the equation T(x) = b, where
b = ⟨b_1, b_2, ..., b_m⟩. Let v_p denote one particular solution. Using
these relationships and the linearity of T, we can write

    T(v_s) = T(v_p)
    T(v_s) − T(v_p) = 0
    T(v_s − v_p) = 0.

Therefore, v_s − v_p = v_0 ∈ ker(T). A rearrangement of terms gives us
the desired result: v_s = v_p + v_0.

(⇐) Assume that v_s = v_p + v_0, where v_p is a particular solution of
AX = B, and v_0 is a solution of AX = 0. Then,

    A(v_s) = A(v_p + v_0) = A(v_p) + A(v_0) = B + 0 = B.

Therefore, v_s is a solution of AX = B.
This theorem has two important interpretations. In R^3, the solution of
the system

    x_1 − 2x_2 + 3x_3 = 1
    3x_1 − 4x_2 + 5x_3 = 3
    2x_1 − 3x_2 + 4x_3 = 2

is the set {t⟨1, 2, 1⟩ + ⟨1, 0, 0⟩ : t ∈ R}. This is the line through
⟨1, 0, 0⟩ in the direction ⟨1, 2, 1⟩; ⟨1, 0, 0⟩ is a particular solution
of the system. The solution set of the associated homogeneous system

    x_1 − 2x_2 + 3x_3 = 0
    3x_1 − 4x_2 + 5x_3 = 0
    2x_1 − 3x_2 + 4x_3 = 0

is given by {t⟨1, 2, 1⟩ : t ∈ R}, the line passing through the origin in
the direction ⟨1, 2, 1⟩. Hence, the solution set of the non-homogeneous
system
can be represented as a translation of the solution set of the
homogeneous system.

Figure 5.11: General solution of a system

If we define T : R^3 → R^3 by
    T(⟨x_1, x_2, x_3⟩) = ⟨x_1 − 2x_2 + 3x_3, 3x_1 − 4x_2 + 5x_3,
                          2x_1 − 3x_2 + 4x_3⟩,

the kernel is the solution set of the homogeneous system

    x_1 − 2x_2 + 3x_3 = 0
    3x_1 − 4x_2 + 5x_3 = 0
    2x_1 − 3x_2 + 4x_3 = 0.

⟨b_1, b_2, b_3⟩ is in the image space if the non-homogeneous system

    x_1 − 2x_2 + 3x_3 = b_1
    3x_1 − 4x_2 + 5x_3 = b_2
    2x_1 − 3x_2 + 4x_3 = b_3

has a solution. If ⟨p_1, p_2, p_3⟩ is a particular solution of the
non-homogeneous system, then ⟨b_1, b_2, b_3⟩ = T(t⟨1, 2, 1⟩ + ⟨p_1, p_2, p_3⟩):
every vector in the image space is the image under T of the sum of an
element of the kernel and a particular pre-image.
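Theorem 5.2.7 can be spot-checked numerically for this system. The sketch
below is in Python rather than ISETL (an illustrative check, not part of
the book's materials): it confirms that the particular solution plus any
kernel element still solves the non-homogeneous system.

```python
def T(x):
    """The transformation of the example: the 3x3 system's left-hand side."""
    x1, x2, x3 = x
    return (x1 - 2*x2 + 3*x3, 3*x1 - 4*x2 + 5*x3, 2*x1 - 3*x2 + 4*x3)

B = (1, 3, 2)                  # right-hand side of the system
vp = (1, 0, 0)                 # the particular solution from the text
assert T(vp) == B

for t in range(-3, 4):
    v0 = (t, 2*t, t)           # kernel element t*<1, 2, 1>
    assert T(v0) == (0, 0, 0)  # v0 solves the homogeneous system
    vs = tuple(p + h for p, h in zip(vp, v0))
    assert T(vs) == B          # vp + v0 solves AX = B, as the theorem says
```

Running the loop for a few values of t is, of course, evidence rather than
proof; the proof above covers every t at once.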
Non-Tuple Vector Spaces
As discussed earlier, we cannot necessarily find the kernel and the image
space of a linear transformation in non-tuple contexts by solving a
corresponding system of equations. However, the basic concept is the
same. To find the kernel of a linear transformation T : U → V, we find
all vectors u ∈ U that are solutions of the equation T(x) = 0_V.
Similarly, v ∈ V is in the image space of T if there exists u ∈ U such
that T(u) = v. As with any linear transformation between two finite
dimensional vector spaces, Theorem 5.2.5 applies in non-tuple contexts.
This is verified by the proof, which was constructed independently of any
specific form. In the exercises, you will be asked to work with
transformations between spaces of polynomials and functions.
Exercises
1. Let T : R^3 → R^4 be the transformation given by

       T(⟨x_1, x_2, x_3⟩) = ⟨x_1 + 2x_3, 3x_2 − 4x_3,
                             x_1 − x_2 + 4x_3, 2x_1 + x_2 − 4x_3⟩.

   (a) Show that T is a linear transformation.
   (b) Find the kernel of T.
   (c) Construct a basis for the kernel of T.
   (d) Construct a basis for the image space of T.
   (e) Verify the rank and nullity theorem for this transformation.
2. Let

       x_1 + 2x_2 − 3x_3 = 0
       2x_1 + 4x_2 − 6x_3 = 0
       2x_2 − x_3 = 0
       −4x_2 + 2x_3 = 0

   be a homogeneous system of 4 equations in 3 unknowns.

   (a) Find a basis for the solution set.
   (b) Find a linear transformation T : R^3 → R^4 whose kernel is equal
       to the solution set of the system of equations given here.
   (c) Find a basis for the kernel of T.
   (d) Find a basis for the image space of T.
   (e) Verify the rank and nullity theorem for this transformation.
3. Define T : R^3 → R by

       T(⟨x_1, x_2, x_3⟩) = 3x_1 + 2x_2 − x_3.

   (a) Show that T is a linear transformation.
   (b) Find a basis for the kernel of T.
   (c) Find a basis for the image T(R^3) of T.
   (d) Describe the kernel of T geometrically.
4. Suppose that the image space of a linear transformation F : R^4 → R^4
   is spanned by the set

       {⟨2, 1, 3, 1⟩, ⟨1, 0, 2, 4⟩, ⟨1, 1, 5, 3⟩, ⟨4, 2, 6, 2⟩}.

   (a) Find an expression for F. (Note: There may be more than one
       expression that satisfies the condition given above. Your job
       here is to find one such expression.)
   (b) Find a basis for the kernel of F.
   (c) Verify the rank and nullity theorem for this transformation.
5. Suppose that the kernel of a linear transformation G : R^4 → R^4 is
   spanned by the set

       {⟨2, 3, 1, 1⟩, ⟨1, 4, 5, 2⟩, ⟨3, 1, 4, 1⟩, ⟨4, 6, 2, 2⟩}.

   (a) Find an expression for G. (Note: There may be more than one
       expression that satisfies the condition given above. Your job
       here is to find one such expression.)
   (b) Find a basis for the image space of G.
   (c) Verify the rank and nullity theorem for this transformation.
6. Suppose H : R^5 → R^5 is a linear transformation such that
   dim[ker(H)] = 3. Is it possible for the image space to be spanned by
   the set

       {⟨1, 2, 4, 2, 3⟩, ⟨2, 0, 1, 3, 3⟩, ⟨2, 4, 0, 1, 2⟩, ⟨0, 4, 1, 4, 1⟩}?

   Justify your answer using the Rank and Nullity Theorem.
7. Define ∫_0^1 : C^∞(R) → R by

       ∫_0^1 f = ∫_0^1 f(t) dt.

   Describe the kernel of ∫_0^1.
8. Let D^2 : C^∞(R) → C^∞(R) denote the second-derivative operator, and
   define L : C^∞(R) → C^∞(R) by L = D^2 + I. Find the kernel of L.
   What is the relationship between the kernel of L and the solution set
   of the differential equation f″ + f = 0?
9. Provide a proof of Theorem 5.2.1.
10. Provide a proof of Theorem 5.2.2.
11. Complete the proof of Theorem 5.2.3.
12. Find the solution set of the homogeneous system associated with the
    transformation T defined by

        T(⟨x_1, x_2, x_3, x_4⟩) = ⟨x_1 − 2x_2 + 3x_3 + 5x_4,
            x_1 − x_2 + 8x_3 + 7x_4, 2x_1 − 4x_2 + 6x_3 + 10x_4⟩.

    What is the relationship between the solution set of the homogeneous
    system that corresponds to this transformation and ker(T)?
13. Provide a proof of Theorem 5.2.4.
14. Find the solution set of the system of equations

        x_1 − x_2 + x_3 + 2x_4 − 2x_5 = 1
        2x_1 − x_2 − x_3 + 3x_4 − x_5 = 3
        x_1 − x_2 + 5x_3 − 4x_5 = 3.

    Is the vector ⟨1, 3, 3⟩ an element of the image of the transformation
    T : R^5 → R^3 defined by

        T(⟨x_1, x_2, x_3, x_4, x_5⟩) = ⟨x_1 − x_2 + x_3 + 2x_4 − 2x_5,
            2x_1 − x_2 − x_3 + 3x_4 − x_5, x_1 − x_2 + 5x_3 − 4x_5⟩?

    If so, find the set of vectors in R^5 whose image under T is the
    vector ⟨1, 3, 3⟩. If not, explain why the vector ⟨1, 3, 3⟩ is not in
    the image T(R^5) of T.
15. Suppose that G : U → R^6 is a linear transformation. Each part gives
    a different scenario involving G and U. Answer each question on the
    basis of the given scenario.

    (a) If dim[G(U)] = 4 and dim[ker(G)] = 3, what would be the
        dimension of U? Justify your answer.
    (b) If G were onto, what could we conclude about the dimension of
        the domain space U?
    (c) If dim[G(U)] = 6 and dim(U) = 6, what could we say about G?
        Is G one-to-one? onto?
    (d) If dim(U) ≠ dim(R^6), does Theorem 5.2.6 apply? For example, if
        G is one-to-one, must G also be onto? If G is onto, is G
        necessarily invertible?
16. Let T : R^4 → R^5 be defined by T(u) = 0 for all u ∈ R^4. What is
    the dimension of the kernel of T? Justify your answer.
17. Let T : K^n → K^m be defined by

        T(⟨x_1, x_2, ..., x_n⟩) = ⟨a_11 x_1 + a_12 x_2 + ... + a_1n x_n,
            a_21 x_1 + a_22 x_2 + ... + a_2n x_n, ...,
            a_m1 x_1 + a_m2 x_2 + ... + a_mn x_n⟩.

    (a) Write the expression for T in terms of matrices; that is, write
        the expression for T as T(x) = A·x.
    (b) Show that the j-th column of A is the image T(e_j), where e_j is
        the coordinate basis vector in which 1 appears in the j-th
        component.
    (c) Show that

            I = {T(e_1), T(e_2), ..., T(e_n)}

        spans the image space of T.
    (d) What does Theorem 5.2.5 say about the rank of A and the number
        of vectors in I, after I has been reduced to a linearly
        independent set? By reduced, we mean the application of
        Theorem 4.3.2.
18. If T : U → V is any vector space transformation, show that T is
    invertible, that is, there exists T^{−1} : ImageSpace(T) → U, if and
    only if T is one-to-one.

19. If T : U → V is an invertible linear transformation, show that
    T^{−1} : ImageSpace(T) → U is linear as well.
20. Define D : PF_3(R) → PF_3(R) by

        D(p) = p′, where p′ is the first derivative of p.

    (a) Show that D is a linear transformation.
    (b) Find a basis for the kernel of D.
    (c) Find a basis for the image of D.

21. Define D^2 : PF_3(R) → PF_3(R) by

        D^2(p) = p″, where p″ is the second derivative of p.

    (a) Show that D^2 is a linear transformation.
    (b) Find a basis for the kernel of D^2.
    (c) Find a basis for the image of D^2.
22. Define H : PF_2(R) → PF_3(R) by

        H(p)(x) = 2p′(x) + ∫_0^x 3p(t) dt,

    where p′ denotes the first derivative of p.

    (a) Show that H is a linear transformation.
    (b) Find a basis for the kernel of H.
    (c) Find a basis for the image of H.
23. Define T : PF_2(R) → R^4 by

        T(a_0 + a_1x + a_2x^2) = ⟨a_0, a_1, a_0 + a_1, 0⟩.

    (a) Show that T is a linear transformation.
    (b) Find a basis for the kernel of T.
    (c) Find a basis for the image of T.

24. Define F : PF_2(R) → PF_2(R) by

        F(p) = q, where q(x) = p(x) + p(−x).

    (a) Show that F is a linear transformation.
    (b) Find a basis for the kernel of F.
    (c) Find a basis for the image of F.

25. Define G : PF_2(R) → PF_3(R) by

        G(p) = q, where q(x) = (2x + 3)p(x).

    (a) Show that G is a linear transformation.
    (b) Find a basis for the kernel of G.
    (c) Find a basis for the image of G.

26. Define H : PF_2(R) → R by

        H(p) = p(1).

    (a) Show that H is a linear transformation.
    (b) Find a basis for the kernel of H.
    (c) Find a basis for the image of H.
27. Use the results of Theorem 5.2.7 to find the general form of a
    solution for the system of equations

        3x_1 − 3x_2 + 3x_3 = a
        2x_1 − x_2 + 4x_3 = b
        3x_1 − 5x_2 − x_3 = c,

    where one particular solution is of the form ⟨1, 1, 1⟩.

28. Find the form of the system of equations whose general solution
    consists of the particular solution ⟨1, 3, 4⟩ and whose associated
    homogeneous system has solution set generated by the basis

        {⟨2, 1, 2⟩, ⟨1, 1, 3⟩}.
5.3 New Constructions from Old
Activities
1. There are four sets of linear transformations given below. Write an
   ISETL func that implements each transformation. Then, construct four
   sets, R, S, F, and P, according to the categorizations given below.
   Save these sets of funcs in a file called LTexamples.

   Let R = {R_i : (Z_5)^2 → (Z_5)^2 : i = 1, 2, 3, 4, 5, 6, 7} be a set
   of linear transformations. The expression for each transformation is
   given below.

       R_1(⟨v_1, v_2⟩) = ⟨v_1 + 2v_2, 2v_1 + 3v_2⟩
       R_2(⟨v_1, v_2⟩) = ⟨v_1 + 2v_2, 4v_1 + 3v_2⟩
       R_3(⟨v_1, v_2⟩) = ⟨3v_1 + v_2, 0⟩
       R_4(⟨v_1, v_2⟩) = ⟨3v_1, 2v_2⟩
       R_5(⟨v_1, v_2⟩) = ⟨v_1 + 2v_2, v_1 + 2v_2⟩
       R_6(⟨v_1, v_2⟩) = ⟨2v_1 + 2v_2, 2v_1 + 4v_2⟩
       R_7(⟨v_1, v_2⟩) = ⟨4v_1 + 2v_2, 4v_1⟩.

   Let S = {S_i : (Z_5)^2 → (Z_5)^3 : i = 1, 2, 3, 4, 5} be a set of
   linear transformations. The expression for each transformation is
   given below.

       S_1(⟨v_1, v_2⟩) = ⟨v_1 + 2v_2, 2v_1 + 3v_2, 3v_1⟩
       S_2(⟨v_1, v_2⟩) = ⟨2v_1 + 4v_2, v_1 + 2v_2, 4v_1 + 3v_2⟩
       S_3(⟨v_1, v_2⟩) = ⟨3v_1 + v_2, 2v_1 + 2v_2, 0⟩
       S_4(⟨v_1, v_2⟩) = ⟨3v_1, 2v_2, 0⟩
       S_5(⟨v_1, v_2⟩) = ⟨3v_1 + v_2, 3v_1, 2v_1 + 3v_2⟩.

   Let F = {F_i : (Z_5)^4 → (Z_5)^2 : i = 1, 2, 3, 4} be a set of
   linear transformations. The expression for each transformation is
   given below.

       F_1(⟨v_1, v_2, v_3, v_4⟩) = ⟨v_1 + 2v_2 + v_3, 2v_1 + v_2 + v_3⟩
       F_2(⟨v_1, v_2, v_3, v_4⟩) = ⟨2v_1 + v_2, 2v_2 + v_3⟩
       F_3(⟨v_1, v_2, v_3, v_4⟩) = ⟨v_1 + 2v_2, v_1 + 2v_2⟩
       F_4(⟨v_1, v_2, v_3, v_4⟩) = ⟨2v_1, v_2⟩.

   Let P = {P_i : (Z_5)^1 → (Z_5)^4 : i = 1, 2} be a set of linear
   transformations. The expression for each transformation is given
   below.

       P_1(⟨v_1⟩) = ⟨2v_1, v_1, 0, v_1⟩
       P_2(⟨v_1⟩) = ⟨v_1, 2v_1, 2v_1, 0⟩.
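If ISETL is not at hand, the funcs can be prototyped in another language.
Here is a sketch in Python of two of the transformations above, with all
arithmetic reduced mod 5 because the scalars come from Z_5 (the names R1
and S1 simply mirror the labels in the activity; this is not the book's
ISETL code):

```python
P = 5  # the vector components lie in Z_5

def R1(v):
    """R_1 : (Z_5)^2 -> (Z_5)^2 from the list above."""
    v1, v2 = v
    return ((v1 + 2*v2) % P, (2*v1 + 3*v2) % P)

def S1(v):
    """S_1 : (Z_5)^2 -> (Z_5)^3 from the list above."""
    v1, v2 = v
    return ((v1 + 2*v2) % P, (2*v1 + 3*v2) % P, (3*v1) % P)

print(R1((1, 1)))  # (3, 0)
print(S1((1, 1)))  # (3, 0, 3)
```

The remaining transformations follow the same pattern: one tuple in, one
tuple out, every component reduced mod 5.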
2. (a) Run name_vector_space by setting K equal to Z_5, U equal to
       (Z_5)^2, and V equal to (Z_5)^3. Given S_1, as defined in
       Activity 1, and the scalar 3 ∈ Z_5, explain what you think is
       meant by 3S_1, and write a func that implements it. Apply the
       func is_linear that you constructed in Activity 3 of Section 5.1
       to 3S_1. Is 3S_1 a linear transformation?
   (b) Write a func LTsm that accepts a scalar a and a linear
       transformation F, and returns a func that implements aF.
   (c) Assume that name_vector_space has been run. Write a func that
       accepts a set of linear transformations from U to V; determines
       whether the scalar multiple of each transformation in the set is
       linear; and returns true, if each scalar multiple is linear, or
       false, if one or more is not. Apply your func to each of the
       sets R, S, F, and P defined in Activity 1. Note that you will
       need to adjust the inputs for name_vector_space accordingly.
       State a conjecture that summarizes what you observe.
3. (a) Run name_vector_space by setting K equal to Z_3, U equal to
       (Z_3)^4, and V equal to (Z_3)^2. Given F_1 and F_2, as defined
       in Activity 1, explain what you think is meant by F_1 + F_2, and
       write a func that implements F_1 + F_2. Apply is_linear to
       F_1 + F_2. Is F_1 + F_2 a linear transformation?
   (b) Why doesn't the procedure in part (a) work for F_1 + R_2? for
       S_1 + R_1? Specify condition(s) under which the sum of two
       linear transformations is defined.
   (c) Write a func LTadd that accepts two linear transformations A and
       B; determines whether the sum of A and B is defined; and returns
       the sum A + B, if the sum is defined, or OM, if the sum is not
       defined.
   (d) Form the sum S_3 + S_4 by applying the func LTadd. Determine
       whether the resulting sum is a linear transformation by applying
       the func is_linear. What do you observe?
   (e) Write a func that assumes that name_vector_space has been run;
       accepts a set of linear transformations from U to V; determines
       whether the sum of each pair is linear; and returns true, if
       each sum is a linear transformation, or false, if one or more of
       the sums is not a linear transformation. Apply this func to each
       of the sets R, S, F, and P defined in Activity 1. Note that you
       will need to adjust the inputs for name_vector_space
       accordingly. State a conjecture that summarizes what you
       observe.
4. (a) Apply the func LTadd to S_1 and S_2 from Activity 1, and
       determine whether the resulting sum is equal to either S_3, S_4,
       or S_5.
   (b) Are the funcs R_5 and F_3 that were defined in Activity 1 equal?
   (c) Are the funcs R_4 and S_4 that were defined in Activity 1 equal?
   (d) Write a func is_equal that assumes that name_vector_space has
       been run; accepts two linear transformations; determines whether
       the two inputs are equal; and returns true, if they are, or
       false, if they are not.
   (e) Use is_equal to find all pairs of linear transformations in R
       whose sum is equal to another linear transformation in R.
       Repeat for the set F.
5. Let G = {G_i : (Z_2)^1 → (Z_2)^2 : i ∈ {1, 2, 3, 4}} be a set of
   transformations with the expression for each G_i given below:

       G_1(⟨v⟩) = ⟨0, 0⟩
       G_2(⟨v⟩) = ⟨v, 0⟩
       G_3(⟨v⟩) = ⟨0, v⟩
       G_4(⟨v⟩) = ⟨v, v⟩.

   (a) Write an ISETL func for each transformation. Apply the func
       is_linear to verify that each transformation is linear.
   (b) What are the scalars in Z_2? For each transformation, determine
       each of its scalar multiples. Use the func is_equal that you
       constructed in Activity 4 to determine whether each scalar
       multiple is equal to either G_1, G_2, G_3, or G_4. What do you
       observe?
   (c) Apply the func LTadd to find the sum G_i + G_j for all possible
       combinations i, j such that i, j ∈ {1, 2, 3, 4}. Apply the func
       is_equal to determine whether each sum is equal to either G_1,
       G_2, G_3, or G_4. What do you observe?
   (d) Apply the func is_vector_space (see Activity 6 in Section 2.2)
       to the set G, together with the operations defined on G, as
       given above. Does ISETL return a response consistent with the
       results you obtained in (b) and (c)?
6. Let T = {T_i : (Z_2)^2 → (Z_2)^2 : i ∈ {1, 2, ..., 24}} be a set of
   transformations with the expression for each T_i given below:

       T_1(⟨v_1, v_2⟩) = ⟨0, 0⟩
       T_2(⟨v_1, v_2⟩) = ⟨v_1, 0⟩
       T_3(⟨v_1, v_2⟩) = ⟨v_1, v_1⟩
       T_4(⟨v_1, v_2⟩) = ⟨v_1, v_2⟩
       T_5(⟨v_1, v_2⟩) = ⟨v_1, v_1 + v_2⟩
       T_6(⟨v_1, v_2⟩) = ⟨v_2, 0⟩
       T_7(⟨v_1, v_2⟩) = ⟨v_2, v_1⟩
       T_8(⟨v_1, v_2⟩) = ⟨v_2, v_2⟩
       T_9(⟨v_1, v_2⟩) = ⟨v_2, v_1 + v_2⟩
       T_10(⟨v_1, v_2⟩) = ⟨0, v_1⟩
       T_11(⟨v_1, v_2⟩) = ⟨v_1, v_1⟩
       T_12(⟨v_1, v_2⟩) = ⟨v_2, v_1⟩
       T_13(⟨v_1, v_2⟩) = ⟨v_1 + v_2, v_1⟩
       T_14(⟨v_1, v_2⟩) = ⟨0, v_2⟩
       T_15(⟨v_1, v_2⟩) = ⟨v_1, v_2⟩
       T_16(⟨v_1, v_2⟩) = ⟨v_2, v_2⟩
       T_17(⟨v_1, v_2⟩) = ⟨v_1 + v_2, v_2⟩
       T_18(⟨v_1, v_2⟩) = ⟨0, v_1 + v_2⟩
       T_19(⟨v_1, v_2⟩) = ⟨v_1, v_1 + v_2⟩
       T_20(⟨v_1, v_2⟩) = ⟨v_2, v_1 + v_2⟩
       T_21(⟨v_1, v_2⟩) = ⟨v_1 + v_2, v_1 + v_2⟩
       T_22(⟨v_1, v_2⟩) = ⟨v_1 + v_2, 0⟩
       T_23(⟨v_1, v_2⟩) = ⟨v_1 + v_2, v_1⟩
       T_24(⟨v_1, v_2⟩) = ⟨v_1 + v_2, v_2⟩.

   Let B = {B_i : (Z_2)^2 → (Z_2)^2 : i ∈ {1, 2, 3, 4}} be a set of
   transformations with the expression for each B_i given below:

       B_1(⟨v_1, v_2⟩) = ⟨v_1, 0⟩
       B_2(⟨v_1, v_2⟩) = ⟨v_2, 0⟩
       B_3(⟨v_1, v_2⟩) = ⟨0, v_1⟩
       B_4(⟨v_1, v_2⟩) = ⟨0, v_2⟩.

   (a) Apply the func is_vector_space to determine whether the set T,
       together with the operations defined on T, forms a vector space.
   (b) Determine whether each transformation T_i can be expressed as an
       element of B or a sum of elements of B.
   (c) Determine whether the set B is linearly independent. If not, why
       not?
   (d) Is the set B a basis for the set T? If so, what is the dimension
       of the set T? If not, how does B fail to constitute a basis?
7. (a) Run name_vector_space by setting K equal to Z_3, U equal to
       (Z_3)^2, and V equal to (Z_3)^3. Given R_1 and S_1, as defined
       in Activity 1, explain what is meant by S_1 ∘ R_1, where ∘
       indicates that R_1 is followed by S_1. Write a func that
       implements S_1 ∘ R_1.
   (b) Why doesn't the procedure in part (a) work for R_1 ∘ S_1? for
       P_1 ∘ T_1? Under what condition(s) is the composition of two
       transformations defined?
   (c) Write a func LTcomp that accepts two linear transformations A
       and B; determines whether the composition of A and B is defined;
       and returns A ∘ B, if the composition is defined, or OM, if the
       composition is not defined.
   (d) Run name_vector_space by setting K equal to Z_3, U equal to
       (Z_3)^2, and V equal to (Z_3)^2. Apply the ISETL command arb to
       U to choose three vectors from (Z_3)^2. Determine whether
       R_1 ∘ R_3 applied to each of these three vectors returns the
       same result as R_5. Apply the func is_equal to R_1 ∘ R_3 and
       R_5. Is the result returned by is_equal consistent with that
       returned by R_1 ∘ R_3 and R_5 when applied to each of the three
       vectors selected by arb?
   (e) Apply the func is_linear to the composition R_2 ∘ R_4. What do
       you find?
   (f) Write a func that assumes that name_vector_space has been run;
       accepts two sets of linear transformations; determines whether
       the composition of a transformation from the first set followed
       by a transformation from the second set is defined and is
       linear; and returns true, if each composition is a linear
       transformation, or false, if one or more of the compositions is
       not a linear transformation, or OM, if the composition operation
       is undefined. Apply this func to each of the pairs (R,R), (S,R),
       and (P,F), where R, S, and F are the sets defined in Activity 1.
       Note that you will need to adjust the inputs for
       name_vector_space accordingly. State a conjecture that
       summarizes what you observe.
   (g) Find all pairs A, B in R such that A ∘ B = B ∘ A. Does the
       equality hold for every pair in R? Describe your observation in
       a single sentence using the word commutative.
Discussion

Scalar Multiple of a Linear Transformation

In Activity 2, you were asked to define the mapping 3S_1. In order to get
an expression for this map, we simply multiply each component of the
expression for S_1 by 3, with the arithmetic carried out in Z_5:

    3S_1(⟨v_1, v_2⟩) = 3⟨v_1 + 2v_2, 2v_1 + 3v_2, 3v_1⟩
                     = ⟨3(v_1 + 2v_2), 3(2v_1 + 3v_2), 3(3v_1)⟩
                     = ⟨3v_1 + v_2, v_1 + 4v_2, 4v_1⟩.
One can find the scalar multiple of any transformation in a similar
manner, as suggested in the following definition.

Definition 5.3.1. Let T : U → V be a transformation between two vector
spaces U and V with scalars in a field K. Given a scalar a ∈ K, the
scalar multiple of T by a, denoted aT, is a transformation that assigns
to each vector u ∈ U a vector aT(u) ∈ V, where T(u) represents the
vector assigned to u by T. This is represented by the notation

    (aT)(u) = a(T(u)).
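The definition translates directly into code: aT is built from T by
multiplying every output component by a. A Python sketch mirroring the
func LTsm of Activity 2(b), with arithmetic mod 5 as in the activities
(illustrative, not the book's ISETL code):

```python
P = 5  # scalars in Z_5

def LTsm(a, F):
    """Return the scalar multiple aF of the transformation F."""
    def aF(v):
        return tuple((a * x) % P for x in F(v))  # (aF)(u) = a(F(u))
    return aF

def S1(v):  # S_1 from Activity 1
    v1, v2 = v
    return ((v1 + 2*v2) % P, (2*v1 + 3*v2) % P, (3*v1) % P)

threeS1 = LTsm(3, S1)
print(threeS1((1, 0)))  # 3*(1, 2, 3) mod 5 = (3, 1, 4)
```

Evaluating threeS1 at ⟨1, 0⟩ agrees with the expression ⟨3v_1 + v_2,
v_1 + 4v_2, 4v_1⟩ computed above.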
In constructing the func is_linear in Activity 3 of Section 5.1, you most
likely checked the single condition

    T(cu_1 + du_2) = cT(u_1) + dT(u_2),

or the equivalent, two-part version,

    T(cu) = cT(u)
    T(u_1 + u_2) = T(u_1) + T(u_2),

given in Definition 5.1.1. Is your finding in Activity 2 consistent with
the following theorem?
Theorem 5.3.1. Let U and V be vector spaces with scalars in K, and let
T : U → V be a linear transformation. If a ∈ K is a scalar, then the
scalar multiple aT : U → V is a linear transformation.

Proof. Let u_1, u_2 ∈ U, and let a, c be scalars.

    (aT)(cu_1) = a(T(cu_1))
               = a(cT(u_1))
               = (ac)T(u_1)
               = (ca)T(u_1)
               = c(aT(u_1))
               = c((aT)(u_1))

Can you justify each step? To finish the proof, we still need to show
that

    (aT)(u_1 + u_2) = (aT)(u_1) + (aT)(u_2).

This can be done using the same ideas as in the first part of the proof.
You will be asked to complete this proof as an exercise. See Exercise 3.
The Sum of Two Linear Transformations

In Activity 3(a), you wrote a func to express the sum of the two funcs
F_1 and F_2. As with the sum of any two functions, the sum of two linear
transformations consists of taking the vector assigned to an input vector
u under F_1 and adding it to the vector assigned to u under F_2. If each
transformation is given by an expression, the sum is defined by adding
the individual expressions. How is the method by which you obtained the
expression for F_1 + F_2 similar to the way in which you found the
expression for 3S_1 in Activity 2(a)?

Your work in the activities and the discussion here should convince you
that the sum is not defined unless the two transformations have the same
domain and range spaces. The sum F_1 + R_2 given in Activity 3(b) is not
defined, because the domains differ. Can you explain exactly why this is
a problem? The second sum in part (b), S_1 + R_1, is not defined, because
the ranges differ. Why must the ranges be the same? Ideas related to the
sum of two transformations are summarized in the definition below.
Definition 5.3.2. Let U and V be vector spaces with scalars in K, and
let T : U → V and F : U → V be two transformations. Given u ∈ U, the
sum of T and F, denoted

    T + F : U → V,

is defined by taking the sum of the vector assigned to u under T, T(u),
with the vector assigned to u under F, F(u). The notation for this is

    (T + F)(u) = T(u) + F(u).
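In code, the definition reads componentwise. A Python sketch in the
spirit of the func LTadd from Activity 3(c) (illustrative only, with
mod-3 arithmetic and a simple output-length check standing in for the
full domain/range test):

```python
P = 3  # scalars in Z_3, as in Activity 3

def LTadd(A, B):
    """Return the sum A + B; None plays the role of ISETL's OM when the
    range spaces (here, the output lengths) do not match."""
    def s(v):
        a, b = A(v), B(v)
        if len(a) != len(b):
            return None
        return tuple((x + y) % P for x, y in zip(a, b))  # (A+B)(u) = A(u)+B(u)
    return s

# F_1 and F_2 from Activity 1, read mod 3:
F1 = lambda v: ((v[0] + 2*v[1] + v[2]) % P, (2*v[0] + v[1] + v[2]) % P)
F2 = lambda v: ((2*v[0] + v[1]) % P, (2*v[1] + v[2]) % P)

F1plusF2 = LTadd(F1, F2)
print(F1plusF2((1, 1, 1, 0)))  # (1, 1)
```

Note that the length check only detects mismatched ranges; detecting
mismatched domains would require carrying the domain along with each
transformation, as name_vector_space does in ISETL.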
Is your result in Activity 3 consistent with the following theorem?

Theorem 5.3.2. Let U and V be vector spaces with scalars in K, and let
T : U → V and F : U → V be linear transformations. Then, the sum
T + F : U → V is a linear transformation.

Proof. Let u_1, u_2 ∈ U, and let c be a scalar.

    (T + F)(u_1 + u_2) = T(u_1 + u_2) + F(u_1 + u_2)
                       = [T(u_1) + T(u_2)] + [F(u_1) + F(u_2)]
                       = [T(u_1) + F(u_1)] + [T(u_2) + F(u_2)]
                       = (T + F)(u_1) + (T + F)(u_2)

Can you justify each step? To finish the proof of the theorem, we still
need to show that

    (T + F)(cu_1) = c(T + F)(u_1).

This can be done using the same ideas as in the first part of the proof.
You will be asked to complete the proof of this theorem as an exercise.
See Exercise 8.
Equality of Linear Transformations

Generally speaking, two functions f and g are equal if their domains are
equal, their ranges are equal, and f and g assign the same output to a
given input. Since linear transformations are functions, the requirement
for equality is the same. A linear transformation T : U_1 → V_1 is equal
to a linear transformation F : U_2 → V_2 if and only if U_1 = U_2,
V_1 = V_2, and T(u) = F(u) for every input u. Consider R_5 and F_3 in
Activity 1: each is given by the same expression, and both range spaces
are the same. So, what is the problem? Why are these two transformations
not equal to one another? Consider R_4 and S_4 defined in Activity 1: the
domain of each transformation is the same, and the expression for each
transformation looks similar. However, R_4 ≠ S_4. Can you explain why?

The func is_equal you constructed in Activity 4 checks to see whether
two transformations between finite vector spaces are equal. When you
applied is_equal in Activity 5(b) and (c), what did you find? Was each
combination considered in parts (b) and (c) equal to one of the
transformations given in the set G? If so, what significance does this
have? In particular, does the set G form a vector space?
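Over a finite vector space, equality of two transformations can be
settled by brute force: simply check every input. A Python sketch of
this idea (an illustrative stand-in for the ISETL func is_equal, using
transformations from Activity 6):

```python
from itertools import product

P = 2  # scalars in Z_2

def is_equal(A, B, dim):
    """Two maps on (Z_2)^dim are equal iff they agree on all 2^dim inputs."""
    return all(A(v) == B(v) for v in product(range(P), repeat=dim))

T3  = lambda v: (v[0], v[0])  # T_3 from Activity 6
T11 = lambda v: (v[0], v[0])  # T_11 has the same expression, so T_3 = T_11
T2  = lambda v: (v[0], 0)

print(is_equal(T3, T11, 2))  # True
print(is_equal(T3, T2, 2))   # False: they differ at <1, 0>
```

Exhaustive checking is only possible because (Z_2)^2 has finitely many
vectors; over R, equality must be argued, not enumerated.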
A Set of Linear Transformations as a Vector Space

The set of transformations G = {G_1, G_2, G_3, G_4} given in Activity 5
is the set of all linear transformations between (Z_2)^1 and (Z_2)^2. In
parts (b) and (c), you showed that each scalar multiple and each sum is
equal to one of the transformations in G. Application of the func
is_vector_space confirmed these findings. Of the vector space axioms,
which axioms correspond to what you were checking in parts (b) and (c)?

As it turns out, Activity 5 is a specific example of a much more general
result. In particular, the set of all linear transformations between two
vector spaces U and V, denoted Hom(U, V), together with transformation
addition and scalar multiplication, forms a vector space. Here, each
transformation is a vector, the operation of adding two transformations
represents vector addition, and the operation of multiplying a
transformation by a scalar represents scalar multiplication. Hom is an
abbreviation for the term homomorphism, which will be defined carefully
in an abstract algebra course. For our purposes, we only need to know
that Hom(U, V) denotes the collection of all linear transformations
between U and V.

Theorem 5.3.3. Let U and V be vector spaces with scalars in K. The set
of all linear transformations, Hom(U, V), together with transformation
addition and transformation scalar multiplication, forms a vector space.

Proof. In order to prove this theorem, you will need to check each of
the vector space axioms. The full proof is left as an exercise (see
Exercise 11), but the following questions and comments are designed to
help you write a complete proof.

Theorem 5.3.1 shows that the scalar multiple of a linear transformation
is again a linear transformation. Which vector space axiom is satisfied
by this theorem? Similarly, Theorem 5.3.2 shows that the sum of two
linear transformations is itself a linear transformation. Which vector
space axiom is satisfied here?

Since addition of functions is commutative and associative, the addition
operation defined here is both commutative and associative.

The transformation defined by Z(u) = 0_V for all u ∈ U is called the
zero transformation.

For F ∈ Hom(U, V), the transformation −F ∈ Hom(U, V) denotes its
additive inverse.

As with any vector space, we can talk about finding a basis. You did
just that in Activity 6. What is the dimension of T? Can you find a
basis for the vector space G defined in Activity 5?
Creating New Linear Transformations

In the last subsection, we created a new vector space by considering the
set of all linear transformations between two previously defined vector
spaces. We can do even more: in particular, we can often define a
function between a vector space of transformations and a vector space of
tuples that preserves linearity. For example, define a function
L : G → (Z_2)^2 between the vector space G given in Activity 5 and the
vector space (Z_2)^2 by:

    L(G_1) = ⟨0, 0⟩
    L(G_2) = ⟨1, 0⟩
    L(G_3) = ⟨0, 1⟩
    L(G_4) = ⟨1, 1⟩.

Can you show that L is a linear transformation, that is:

    L(G_i + G_j) = L(G_i) + L(G_j), where G_i, G_j ∈ G, i, j ∈ {1, 2, 3, 4}
    L(cG_i) = cL(G_i), where c ∈ Z_2, and G_i ∈ G, i ∈ {1, 2, 3, 4}?

Can you define a similar linear transformation between the set of all
linear transformations T defined in Activity 6 and (Z_2)^4?
Transformations such as these will be considered in more detail in
Chapter 7.
Compositions of Linear Transformations

In Activity 7, you wrote a func to express the composition S_1 ∘ R_1 of
the two funcs R_1 and S_1 defined in Activity 1. As with any two
functions, the composition of two linear transformations, say A ∘ B,
consists of taking an input vector u, applying B to u, and then finding
the image of B(u) under A, provided that the application of A to B(u) is
defined.

Figure 5.12: Composition
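The diagram corresponds to a single line of code: (A ∘ B)(u) = A(B(u)).
A Python sketch in the spirit of the func LTcomp from Activity 7
(illustrative; the mod-3 readings of R_1 and S_1 from Activity 1 play the
roles of B and A):

```python
P = 3  # scalars in Z_3, as in Activity 7

def LTcomp(A, B):
    """Return the composition A o B, i.e. u |-> A(B(u))."""
    return lambda u: A(B(u))

def R1(v):  # R_1 read mod 3: (Z_3)^2 -> (Z_3)^2
    v1, v2 = v
    return ((v1 + 2*v2) % P, (2*v1 + 3*v2) % P)

def S1(v):  # S_1 read mod 3: (Z_3)^2 -> (Z_3)^3
    v1, v2 = v
    return ((v1 + 2*v2) % P, (2*v1 + 3*v2) % P, (3*v1) % P)

S1_after_R1 = LTcomp(S1, R1)  # defined: R1 lands in S1's domain
print(S1_after_R1((1, 0)))    # S1(R1(<1, 0>)) = S1(<1, 2>) = (2, 2, 0)
```

The sketch omits the definedness check that LTcomp is asked to perform;
composing in the other order, R1 after S1, would fail at run time because
S1's three-component outputs are not valid inputs for R1.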
Why are the compositions R
1
S
1
and P
1
T
1
that you considered in
part (b) undened? In general, if F : U V and G : W Z are two
274 CHAPTER 5. LINEAR TRANSFORMATIONS
linear transformations, what condition must be satisfied in order to ensure that the composition G ∘ F is defined? Under what condition is the operation of composition defined for pairs of transformations from Hom(U, V)? In this context, what does the word closed mean in the statement of the theorem given below?
Theorem 5.3.4. Let U be a vector space with scalars in a field K. The set of all linear transformations Hom(U, U) is closed under the operation of composition.
Proof. See Exercise 23.
In Activity 7(g), you considered the issue of commutativity. Specifically, you were trying to determine whether A ∘ B = B ∘ A for every pair of transformations in R. On the basis of your findings in this activity, can you conclude that composition is a commutative operation?
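The closure and commutativity questions above can be probed numerically. Below is a minimal Python sketch (the text's funcs are written in ISETL; Python stands in here). The maps A and B are illustrative choices of linear transformations on (Z_5)^2, not the R_1 and S_1 of Activity 1.

```python
# Composition of two linear transformations on (Z_5)^2, each represented
# as a Python function on 2-tuples. All arithmetic is done mod 5.
P = 5

def A(v):
    x, y = v
    return ((x + 2 * y) % P, (3 * x) % P)

def B(v):
    x, y = v
    return ((x + y) % P, (4 * y) % P)

def compose(f, g):
    """Return the transformation f o g: first apply g, then f."""
    return lambda v: f(g(v))

# Hom(U, U) is closed under composition: AB and BA are again maps
# from (Z_5)^2 to (Z_5)^2. Commutativity, however, can fail.
AB = compose(A, B)
BA = compose(B, A)

sample = (1, 1)
not_commutative = AB(sample) != BA(sample)
```

A single vector on which A ∘ B and B ∘ A disagree is enough to answer the commutativity question in the negative.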
Exercises
1. Let F : R^2 → R^3 be defined by

F(⟨v_1, v_2⟩) = ⟨v_1 − v_2, 3v_1 + v_2, 4v_2⟩.

(a) Verify that F is a linear transformation.
(b) Let a ∈ R. Show that aF forms a linear transformation.
2. Justify each step of the proof of Theorem 5.3.1 that establishes (aT)(cu_1) = c(aT)(u_1).
3. Complete the proof of Theorem 5.3.1 by showing that (aT)(u_1 + u_2) = (aT)(u_1) + (aT)(u_2).
4. Let T : PF_3(R) → PF_3(R) be defined by

T(p) = q, where q(x) = p(x + 3).

Let a ∈ R. Show that aT is a linear transformation.
5. Let J : PF_2(R) → PF_3(R) be defined by

J(p) = q, where q(x) = ∫_0^x p(t) dt.

Let a ∈ R. Show that aJ is a linear transformation.
6. Let F_1 : R^3 → R^2 be defined by

F_1(⟨v_1, v_2, v_3⟩) = ⟨3v_2 − 2v_3, 4v_1 + v_2⟩.

Let F_2 : R^3 → R^2 be defined by

F_2(⟨v_1, v_2, v_3⟩) = ⟨v_1 − v_3, v_2 − v_3⟩.
(a) Verify that F_1 and F_2 are linear transformations.
(b) Show that F_1 + F_2 is a linear transformation.
7. Justify each step of the proof of Theorem 5.3.2 that establishes (T + F)(u_1 + u_2) = (T + F)(u_1) + (T + F)(u_2).
8. Complete the proof of Theorem 5.3.2 by showing that (T + F)(cu_1) = c(T + F)(u_1).
9. Define T, S : PF_3(R) → PF_4(R) by

T(p) = q, where q(x) = p(x + 3),
S(p) = q, where q(x) = xp(x).

(a) Show that T and S are linear transformations.
(b) Show that T + S is a linear transformation.
10. Define D, D^2 : C^∞(R) → C^∞(R) …

… Σ_{i=1} i^2 is interpreted?
Summation notation uses the capital Greek letter Σ to indicate summation:

Σ_{i=ℓ}^{u} a_i = a_ℓ + a_{ℓ+1} + ⋯ + a_{u−1} + a_u.
The variable i is referred to as the variable of summation (or the index), the number ℓ is the lower bound, and the number u is the upper bound. Often the bounds of the summation will be omitted if they are clear from the context.
If we wish to perform a double-sum, where the bounds of each summation do not depend on the other index, we will abbreviate to a single summation sign:

Σ_{i,j} a_ij = Σ_i Σ_j a_ij = Σ_j Σ_i a_ij.
For example, if the context of the summation indicates that 1 ≤ i ≤ 3 and 1 ≤ j ≤ 2, then

Σ_{i,j} ij = Σ_{i=1}^{3} Σ_{j=1}^{2} ij = Σ_{i=1}^{3} (i + 2i) = Σ_{i=1}^{3} 3i = 3(1) + 3(2) + 3(3) = 18.
You should change the order of the indices and verify that the value of that summation is also 18; in other words, check that

Σ_{j=1}^{2} Σ_{i=1}^{3} ij = 18.
You must be careful when using these abbreviations. For example, can you explain why the following is incorrect:

Σ_{i=1}^{5} Σ_{j=1}^{i} ij = Σ_{i,j} ij?
288 CHAPTER 6. SYSTEMS, TRANSFORMATIONS AND MATRICES
When expanding a double-summation, you should expand the outer sum first if the index for the outer sum is used as a bound for the inner sum:

Σ_{i=1}^{2} Σ_{j=1}^{i} (i + j) = Σ_{j=1}^{1} (1 + j) + Σ_{j=1}^{2} (2 + j) = (1 + 1) + (2 + 1) + (2 + 2) = 9.
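Both worked examples above can be checked numerically. The following Python sketch mirrors them: an order-independent double sum, and one whose inner bound depends on the outer index and therefore must be expanded outer-first.

```python
def double_sum_independent():
    # sum_{i,j} i*j with 1 <= i <= 3 and 1 <= j <= 2;
    # the order of the two sums does not matter here.
    return sum(i * j for i in range(1, 4) for j in range(1, 3))

def double_sum_dependent():
    # sum_{i=1}^{2} sum_{j=1}^{i} (i + j);
    # the inner bound depends on i, so i must be the outer loop.
    return sum(i + j for i in range(1, 3) for j in range(1, i + 1))
```

Running both confirms the values 18 and 9 computed in the text.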
Dimensions of Matrix Vector Spaces
In Chapter 4, we discussed the dimension of a vector space. Activities 8 and 9 asked you to explore the concepts of linear independence and span in the context of vector spaces of matrices. In Activity 9, you were asked to give the dimension of K^{n×m} based on your work with (Z_3)^{4×3}. Is your conjecture consistent with the statement of Theorem 6.1.3?
Theorem 6.1.3. The dimension of K^{n×m} is nm.
Proof. Define the matrix E_ij to be the matrix whose (i, j) entry is 1 and whose other entries are 0. In functional notation:

E_ij(k, ℓ) = 1 if (k, ℓ) = (i, j), and 0 otherwise.

We will show that {E_ij | 1 ≤ i ≤ n, 1 ≤ j ≤ m} is a generating set for K^{n×m}. The proof that this set is independent is left as an exercise (see Exercise 6).
Let M = (m_ij) be a matrix in K^{n×m}. We will show that M can be written as a linear combination of the set {E_ij}:

(Σ_{i,j} m_ij E_ij)(k, ℓ) = Σ_{i,j} m_ij E_ij(k, ℓ)
  = m_{kℓ} E_{kℓ}(k, ℓ) + Σ_{(i,j)≠(k,ℓ)} m_ij E_ij(k, ℓ)
  = m_{kℓ} + Σ_{(i,j)≠(k,ℓ)} 0 = m_{kℓ} = M(k, ℓ).
Can you reformulate the proof representing matrices as arrays of numbers? Which proof is more understandable? Which proof is more detailed?
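The functional view of matrices in the proof lends itself to a computational check. In the Python sketch below (the text works in ISETL; here a matrix over Z_3 is represented as a dict keyed by 1-based positions (i, j), an illustrative representation of my choosing), a sample matrix is rebuilt as the linear combination Σ m_ij E_ij.

```python
# Verify the proof idea of Theorem 6.1.3 for a sample 2 x 3 matrix over Z_3.
P = 3
n, m = 2, 3

def E(i, j):
    """The matrix whose (i, j) entry is 1 and whose other entries are 0."""
    return {(k, l): 1 if (k, l) == (i, j) else 0
            for k in range(1, n + 1) for l in range(1, m + 1)}

def lin_comb(coeffs):
    """Form sum_{i,j} coeffs[(i, j)] * E_ij, reducing entries mod P."""
    total = {(k, l): 0 for k in range(1, n + 1) for l in range(1, m + 1)}
    for (i, j), c in coeffs.items():
        for pos, entry in E(i, j).items():
            total[pos] = (total[pos] + c * entry) % P
    return total

# Using the entries of M themselves as coefficients reproduces M exactly,
# which is the computation carried out in the proof.
M = {(1, 1): 1, (1, 2): 2, (1, 3): 0,
     (2, 1): 2, (2, 2): 1, (2, 3): 1}
```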
6.1 Vector Spaces of Matrices 289
Linear Transformations of Matrices
Because K^{n×m} is a vector space over K, it is natural to try to learn about linear transformations whose domain and/or range is a vector space of matrices. The first such transformation you worked with flattened a matrix into a tuple (in Activities 10 through 12). This is referred to as flattening the matrix. Be sure to keep the two vector spaces K^{n×m} and K^{nm} clear: the first is a collection of matrices, and the second is a collection of tuples.
Theorem 6.1.4. The flattening map is a linear transformation from K^{n×m} to K^{nm}.
Proof. Let M = (m_ij) and N = (n_ij) be elements of K^{n×m}, let a, b be scalars, and let F denote the flattening map. We must show that F(aM + bN) = aF(M) + bF(N). We start with the left-hand side of the equality: aM + bN = (am_ij + bn_ij), and so the ((i − 1)m + j)-th component of F(aM + bN) will equal am_ij + bn_ij.

On the right-hand side, the ((i − 1)m + j)-th component of F(M) will equal m_ij and the ((i − 1)m + j)-th component of F(N) will equal n_ij. Therefore, the ((i − 1)m + j)-th component of aF(M) + bF(N) will equal am_ij + bn_ij. This proves that F(aM + bN) = aF(M) + bF(N), as needed.
Theorem 6.1.5. The flattening map is one-to-one and onto.
Proof. The inverse of the flattening map can be defined as follows. Start with the nm-tuple [a_k]. For each value of k, with 1 ≤ k ≤ nm, there are unique integers i_k and j_k such that 1 ≤ i_k ≤ n, 1 ≤ j_k ≤ m, and k = (i_k − 1)m + j_k. The inverse of the flattening map is the matrix whose (i_k, j_k) entry is equal to a_k.
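The index relation k = (i − 1)m + j from the proof translates directly into code. The Python sketch below (matrices as dicts keyed by 1-based (i, j), a representation chosen for this illustration) implements the flattening map and its inverse.

```python
def flatten(M, n, m):
    """Send an n x m matrix (dict keyed by (i, j)) to an nm-tuple,
    placing entry (i, j) at component k = (i - 1) * m + j."""
    return tuple(M[(i, j)] for i in range(1, n + 1) for j in range(1, m + 1))

def unflatten(t, n, m):
    """Inverse map: component k of the tuple lands at entry (i_k, j_k),
    where i_k and j_k are the unique indices with k = (i_k - 1) * m + j_k."""
    M = {}
    for k in range(1, n * m + 1):
        i = (k - 1) // m + 1
        j = (k - 1) % m + 1
        M[(i, j)] = t[k - 1]
    return M

# The first tuple of Exercise 8, unflattened into a 2 x 3 matrix over Z_5.
t = (1, 4, 3, 4, 0, 2)
M = unflatten(t, 2, 3)
```

Round-tripping through both maps returns the original object, which is the one-to-one and onto claim of the theorem in computational form.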
Theorems 6.1.4 and 6.1.5 describe a special relationship between two vector spaces that was defined in Chapter 2. Two vector spaces are isomorphic if there is a one-to-one, onto linear transformation from one space to the other. We will explore this concept a little more in the exercises. What is important is that if two vector spaces are isomorphic, their linear structures correspond (any true statement about one of them can be translated into a true statement about the other).
The second linear map we examined (in Activities 13 and 15) is the transpose map. It has some of the same features as the flattening map.
Theorem 6.1.6. The transpose map M ↦ M^t from K^{n×m} to K^{m×n} is an isomorphism of vector spaces.
Proof. See Exercise 10.
Exercises
1. Complete the proof of Theorem 6.1.1.
2. If V is a vector space over K, how would you define V^{n×m}? How would you multiply an element of V^{n×m} by an element of K? How would you add two elements of V^{n×m}? Prove that V^{n×m} is a vector space. Explain how Theorem 6.1.1 would be a special case of this result (assuming the result were true).
3. Let S be a set, and V be a vector space over K. Let V^S denote the set of all functions from S to V. How would you multiply an element of V^S by an element of K? How would you add two elements of V^S? Prove that V^S is a vector space. Explain how Theorem 6.1.1 is a special case of this result (assuming the result were true).
4. Let T be a linear transformation from K^{n×m} to K. Prove that the collection of M ∈ K^{n×m} which map to 0 under T is a subspace of K^{n×m}. For example, Activity 7c is one such example where:

T( [ a_11  a_12 ]
   [ a_21  a_22 ] ) = a_12 − a_11 − a_21
5. Compute the following summations:

(a) Σ_{i=1}^{3} Σ_{j=1}^{i+1} (i + j)
(b) Σ_{i≠j} ij, where 1 ≤ i ≤ 3 and 1 ≤ j ≤ 3
(c) Σ_{i=1}^{2} Σ_{j=i}^{2} 2^i 3^j
6. Prove that the set of matrices E_ij defined in the proof of Theorem 6.1.3 is independent.
7. For each of the following matrices, compute their images under the flattening map.

(a) [ 1 3 2 ]
    [ 2 1 3 ]

(b) [ 2 3 ]
    [ 1 2 ]
    [ 0 1 ]
    [ 2 4 ]

(c) [ 1 3 2 3 4 2 ]
    [ 3 2 4 2 4 1 ]
8. For each of the following tuples, compute their images under the inverse of the flattening map F : (Z_5)^{2×3} → (Z_5)^6.
(a) [1, 4, 3, 4, 0, 2]
(b) [0, 3, 4, 2, 1, 0]
(c) [1, 0, 0, 0, 1, 0]
(d) [0, 3, 2, 4, 1, 3]
9. For each of the following matrices, compute their images under the transpose map.

(a) [ 1 3 1 ]
    [ 2 3 4 ]

(b) [ 2 3 2 1 4 0 ]
    [ 3 4 2 0 2 1 ]

(c) [ 1 2 ]
    [ 2 2 ]
    [ 2 4 ]
    [ 1 0 ]
    [ 1 0 ]
10. Prove Theorem 6.1.6.
11. Define the trace of an n × n matrix by:

Tr : (m_ij) ↦ Σ_{i=1}^{n} m_ii.
Determine whether the trace is a linear map from the vector space of n × n matrices over K to the 1-dimensional vector space (K)^1.
12. For a 2 × 2 matrix over K, define the map:

det : (m_ij) ↦ m_11 m_22 − m_12 m_21.

Determine whether det is a linear map from the vector space of all 2 × 2 matrices over K to the 1-dimensional vector space (K)^1.
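Exercises 11 and 12 can be probed numerically before writing any proof: a single counterexample settles non-linearity, while a passing check on one pair of matrices is only a sanity check, not a proof. The Python sketch below tests additivity of trace and det on one illustrative pair of 2 × 2 real matrices (the pair is my choice, not from the text).

```python
# Probe additivity of trace and det for 2 x 2 matrices (nested tuples).
def trace(M):
    return M[0][0] + M[1][1]

def det(M):
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

def add(M, N):
    return tuple(tuple(M[i][j] + N[i][j] for j in range(2)) for i in range(2))

A = ((1, 2), (3, 4))
B = ((0, 1), (1, 0))

trace_additive = trace(add(A, B)) == trace(A) + trace(B)  # holds here
det_additive = det(add(A, B)) == det(A) + det(B)          # fails here
```

Since det fails additivity on this single pair, det cannot be linear; the trace check merely suggests what the exercise asks you to prove.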
6.2 Transformations and Matrices
Activities
1. For the matrices M and N, given by

M = [ 1 3 4 1 0 ]        N = [ 2 3 1 3 1 ]
    [ 2 1 4 0 2 ]            [ 1 3 2 1 1 ]
    [ 1 3 0 2 1 ]            [ 0 0 1 0 0 ]
do the following:
(a) Use matrix_row from the Matrix package to obtain the rows of M as tuples. Determine the dimension of the subspace generated by these tuples in (Z_5)^5.
(b) Use matrix_col from the Matrix package to obtain the columns of M as tuples. Determine the dimension of the subspace generated by these tuples in (Z_5)^3.
(c) What relationship do you find between the dimensions of these spaces?
2. Write a func called row_rank that accepts an n × m matrix M over K, converts the rows of M into tuples, and returns the dimension of the subspace generated by these tuples in K^m. The code in row_rank can assume that name_vector_space has been run for the vector space K^m. Let K = Z_5, and let M be given by

M = [ 1 3 4 2 1 ]
    [ 2 1 3 2 3 ]
    [ 1 3 4 2 1 ]
    [ 1 4 2 2 1 ]
    [ 0 0 0 1 0 ].

Determine the value of row_rank on the following matrices:
(a) M;
(b) M with the third and fourth rows interchanged;
(c) M with the second row multiplied by 2;
(d) M with the fourth row replaced by the sum of the first and fourth rows;
(e) the matrix obtained by reducing M to echelon form (see Chapter
3);
(f) the matrix obtained by reducing M to reduced echelon form (see
Chapter 3).
Make a conjecture about the effects of elementary row operations (as defined in Chapter 3) on the value of row_rank.
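A Python stand-in for row_rank may help in checking your work (the text's func is written in ISETL and relies on name_vector_space; this sketch instead does Gaussian elimination over Z_p directly, and assumes p is prime so that pow(x, p − 2, p) computes inverses).

```python
def row_rank(M, p):
    """Dimension of the row space of M over Z_p (p prime).
    M is a nonempty list of equal-length rows of integers."""
    rows = [[x % p for x in row] for row in M]
    rank = 0
    for col in range(len(rows[0])):
        # Find a pivot at or below row `rank` in this column.
        pivot = next((r for r in range(rank, len(rows)) if rows[r][col]), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        inv = pow(rows[rank][col], p - 2, p)  # inverse mod prime p
        rows[rank] = [(x * inv) % p for x in rows[rank]]
        for r in range(len(rows)):
            if r != rank and rows[r][col]:
                factor = rows[r][col]
                rows[r] = [(a - factor * b) % p
                           for a, b in zip(rows[r], rows[rank])]
        rank += 1
    return rank

# The matrix from Activity 2, over Z_5.
M = [[1, 3, 4, 2, 1],
     [2, 1, 3, 2, 3],
     [1, 3, 4, 2, 1],
     [1, 4, 2, 2, 1],
     [0, 0, 0, 1, 0]]
```

Applying the elementary row operations of parts (b) through (f) to M and re-running row_rank lets you test your conjecture directly.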
3. Write a func called col_rank that accepts an n × m matrix M over K, converts the columns of M into tuples, and returns the dimension of the subspace generated by these tuples in K^n. The code in col_rank can assume that name_vector_space has been run for the vector space K^n. Let K = Z_3, and let M be given by

M = [ 1 0 1 2 1 ]
    [ 2 1 0 2 0 ]
    [ 1 0 1 2 1 ]
    [ 1 1 2 2 1 ]
    [ 0 0 0 1 0 ].
Determine the value of col_rank on the following matrices:
(a) M;
(b) M with the third and fourth rows interchanged;
(c) M with the second row multiplied by 2;
(d) M with the fourth row replaced by the sum of the first and fourth rows;
(e) the matrix obtained by reducing M to echelon form;
(f) the matrix obtained by reducing M to reduced echelon form.
Make a conjecture about the effects of elementary row operations (as defined in Chapter 3) on the value of col_rank.
4. For each system of linear equations over Z_3, determine the column rank of the matrix of coefficients and the number of free variables in the solution to the system.
(a) x + y = 2
    x + 2y = 1
    2x + y = 0

(b) 2y + z = 2
    x + y + 2z = 1

(c) y + z = 0
    x + 2y + 2z = 2
    x = 2

(d) x + y + z = 2
    2x + 2y + 2z = 1

(e) x + y + z + w = 1
    x + 2y + 2w = 2
    x + z + w = 0

Given a system of equations, can you determine a relationship between the column rank and the number of free variables?
5. Define a func called mat_apply that accepts an n × m matrix M = (m_ij) and an m-tuple [x_j] and returns an n-tuple whose i-th coordinate is equal to

Σ_{j=1}^{m} m_ij x_j.

Your code may assume that ms and as have been defined to implement scalar arithmetic.
For this activity, define ms and as to implement arithmetic mod 3. For each matrix M given below, determine the values of mat_apply on the tuples e_1, e_2, and e_3. Can you determine how the results relate to the entries in the given matrix?

(a) M = [ 2 3 1 ]
        [ 1 3 1 ]
(b) M = [ 1 2 0 ]
        [ 0 3 1 ]
        [ 1 0 1 ]

(c) M = [ 2 3 1 ]

(d) M = [ 1 0 0 ]
        [ 0 1 0 ]
        [ 0 0 1 ]
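A Python sketch of mat_apply (the text's func is ISETL; here the mod-3 scalar arithmetic stands in for ms and as) shows what happens on the tuples e_1, e_2, e_3 for matrix (b):

```python
P = 3  # arithmetic mod 3, as the activity specifies

def mat_apply(M, x):
    """Return the n-tuple whose i-th coordinate is sum_j M[i][j] * x[j]."""
    return tuple(sum(row[j] * x[j] for j in range(len(x))) % P for row in M)

# Matrix (b) from this activity.
M = [[1, 2, 0],
     [0, 3, 1],
     [1, 0, 1]]

e1, e2, e3 = (1, 0, 0), (0, 1, 0), (0, 0, 1)
# Applying M to e_j picks out the j-th column of M (reduced mod 3).
columns = [mat_apply(M, e) for e in (e1, e2, e3)]
```

Comparing `columns` with the printed matrix answers the activity's question: the image of e_j is the j-th column, up to reduction mod 3.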
6. Write a func called coord that accepts a vector v and an ordered basis B = [b_1, . . . , b_n] and returns the coordinates of v with respect to B. That is, the return value of coord should be the tuple of scalars [a_1, . . . , a_n] such that

v = Σ_{i=1}^{n} a_i b_i.

Your code may assume that name_vector_space has been used to establish the vector space.
Run name_vector_space to establish the vector space (Z_5)^3. Let B = [⟨1, 0, 0⟩, ⟨1, 1, 0⟩, ⟨1, 1, 1⟩] and C = [⟨2, 0, 0⟩, ⟨0, 2, 0⟩, ⟨0, 0, 2⟩]. Determine whether the following are true or false:

(a) The function coord (using the ordered basis B) is a linear transformation from (Z_5)^3 to (Z_5)^3.
(b) If the ordered basis B is used in both function calls, then coord and LC (from Activity 2 of Section 4.1) are inverse functions.
(c) The function coord using the ordered basis B and the function LC using the ordered basis C are inverse functions.
(d) The function obtained by first applying coord using the ordered basis B and then applying LC using the ordered basis C is a linear transformation.
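One possible implementation strategy for coord, sketched in Python rather than ISETL: since (Z_5)^3 is finite, a brute-force search over all 5^3 coefficient tuples suffices (a real implementation would solve the linear system instead; the brute force is just for illustration).

```python
from itertools import product

P = 5
B = [(1, 0, 0), (1, 1, 0), (1, 1, 1)]  # the ordered basis from this activity

def lc(coeffs, basis):
    """Linear combination sum_i coeffs[i] * basis[i], mod P (the LC func)."""
    return tuple(sum(c * b[k] for c, b in zip(coeffs, basis)) % P
                 for k in range(len(basis[0])))

def coord(v, basis):
    """Coordinates of v with respect to basis, found by exhaustive search.
    Unique whenever basis really is a basis."""
    for coeffs in product(range(P), repeat=len(basis)):
        if lc(coeffs, basis) == v:
            return coeffs
    raise ValueError("v is not in the span of basis")
```

Checking that lc(coord(v, B), B) returns v for several vectors v is exactly the inverse-function question of part (b).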
7. Write a func called matrify that accepts a function T : U → V, a basis B = [b_j] of U, and a basis C = [c_i] of V. The func should return a matrix M = (m_ij) such that the entries in the j-th column are the coordinates of T(b_j) with respect to the basis C.
Now run name_two_vector_spaces to set the domain to (Z_5)^2 and the range to (Z_5)^3. Use the coordinate bases for the remainder of this activity (see Section 4.4 if you cannot remember the definition of coordinate
bases). For each function T below, compare the value of T(e_1) to the value obtained by applying the matrix M returned by matrify to the vector e_1, and the value of T(e_2) to the value obtained by applying M to the vector e_2.

(a) T(⟨x, y⟩) = ⟨2x + y, x, x + y⟩
(b) T(⟨x, y⟩) = ⟨x, 0, 2x − y⟩
(c) T(⟨x, y⟩) = ⟨2x − y, x + y, xy⟩

What is the relationship between the vectors in the first two cases? Make a conjecture about the property (or lack thereof) that makes the third case behave differently.
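A Python sketch of matrify, specialized to the case of this activity: with coordinate bases on both sides, coord is the identity, so column j of the matrix is simply T(b_j). (The general func would call coord with respect to an arbitrary range basis.)

```python
P = 5

def matrify(T, domain_basis):
    """Matrix of T with respect to domain_basis and the coordinate basis
    of the range: the j-th column holds T(domain_basis[j])."""
    images = [T(b) for b in domain_basis]   # columns, as tuples
    n = len(images[0])
    return [[images[j][i] % P for j in range(len(images))] for i in range(n)]

def T(v):
    # Part (a) of this activity: T(<x, y>) = <2x + y, x, x + y> over Z_5.
    x, y = v
    return ((2 * x + y) % P, x % P, (x + y) % P)

B = [(1, 0), (0, 1)]   # coordinate basis of (Z_5)^2
M = matrify(T, B)
```

Applying M to any vector and comparing with T of that vector is the comparison the activity asks for in cases (a) and (b); case (c) breaks down because the map there is not linear.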
8. This activity uses the scalar field Z_5. For each pair of ordered bases given below, compare the result of matrify on the linear transformation given by T(⟨x, y, z⟩) = ⟨x + 2y, y + 2z⟩.

(a) B = [e_1, e_2, e_3], C = [e_1, e_2]
(b) B = [e_2, e_1, e_3], C = [e_1, e_2]
(c) B = [e_1, 2e_2, e_3], C = [e_1, e_2]
(d) B = [e_1, 2e_2, e_3], C = [e_1, 3e_2]

How does changing the order of the domain basis affect the matrix? How does changing the order of the range basis affect the matrix? How does scaling an element of the domain basis affect the matrix? How does scaling an element of the range basis affect the matrix?
9. Let T : (Z_5)^2 → (Z_5)^3 be defined as T(⟨x, y⟩) = ⟨x + 2y, x + 3y, y⟩ and use the coordinate bases throughout this activity. How do the results of matrify(T) and matrify(2T) relate to each other? Can you make a general conjecture about how matrify is affected when you scale its input?

Now let S : (Z_5)^2 → (Z_5)^3 be defined as S(⟨x, y⟩) = ⟨x, y, 2x + 3y⟩. How do the matrices matrify(T), matrify(S), and matrify(T + S) relate to each other? Can you make a conjecture about how matrify applied to a sum relates to matrify applied individually to each of the terms of a sum?
10. Use name_two_vector_spaces to set the domain and range to be (Z_5)^2. For each of the following matrices M, compare the value of col_rank(M) with the rank of the linear transformation which applies M to the vector.

(a) M = [ 1 2 ]
        [ 3 2 ]

(b) M = [ 1 2 ]
        [ 3 1 ]

(c) M = [ 0 2 ]
        [ 1 2 ]

(d) M = [ 0 0 ]
        [ 0 0 ]
Make a conjecture about the relationship between the two ranks.
Discussion
The Rank of a Matrix
In Activity 1, you examined two ways to convert an n × m matrix into a set of tuples (row-by-row or column-by-column). The dimensions of the subspaces spanned by these tuples provide a significant amount of information about a matrix. We now provide a name for these numbers.
Definition 6.2.1. Let M be an n × m matrix over K. The dimension of the subspace of K^m generated by the rows of M is called the row rank of M. The dimension of the subspace of K^n generated by the columns of M is called the column rank of M.
In Activity 1, you also discovered that although the values of n and m may be different, the row rank and the column rank of the matrix turned out to be identical. We now work toward proving this result by looking at the effect of the elementary row operations on the row and column ranks of a matrix. Can you recall the three elementary row operations?

In Activity 2, you examined the effect of the elementary row operations on the row rank of a matrix. Your work should have led you to make a conjecture consistent with the following theorem.
Theorem 6.2.1. The row rank of a matrix is unaffected by the elementary row operations.
Proof. Theorem 4.3.1 proved that two sets generate the same subspace, provided every vector in each set can be written as a linear combination of vectors in the other set. We will use this to show that the elementary row operations do not affect the subspace generated by the rows of a matrix. The first elementary row operation is interchanging two rows. In this case, the set of tuples obtained from the rows of the original matrix is identical to that of the transformed matrix; hence, the spans are the same.
The second elementary row operation is to multiply one row by a nonzero scalar. Let i be the row which is multiplied, v be the tuple from row i of the original matrix, and k be the scalar. Then the sets are identical except for the tuple from the i-th row. The original matrix will have the tuple v and the new matrix will have the tuple kv. However, kv is clearly a linear combination of the tuples from the original matrix, and v = (1/k)(kv) is a linear combination of the tuples from the new matrix.
The third elementary row operation replaces a row i by the sum of row i with another row j. Let i be the row which is being replaced, v_i be the tuple created from row i in the original matrix, j be the row which is added to row i, and v_j be the tuple from row j of the original matrix. As in the last case, the set of tuples generated from the new matrix differs by only one tuple from the set of tuples generated from the original matrix. The tuple from row i of the new matrix, v_i + v_j, is a linear combination of the tuples from the original matrix. Similarly, the tuple v_i = (v_i + v_j) − v_j can be written as a linear combination of the tuples from the new matrix.
Since the tuples produced by applying elementary row operations can be written as linear combinations of the tuples from the original matrix and vice-versa, the subspaces generated by the rows of the transformed matrix and the rows of the original matrix are the same. Therefore, the row rank is unaffected.
In Chapter 3, we transformed matrices into reduced echelon form as a means of determining the solution set of a system of equations. Here we will use echelon form, because it provides an easy way to determine the row (and column) rank of a matrix. Do you remember the requirements for a matrix to be in reduced echelon form?

Since a matrix can be transformed into reduced echelon form by using elementary row operations exclusively, the row rank of a matrix is equal
to the row rank of its corresponding reduced echelon form. As stated in Definition 6.2.1, the row rank is the dimension of the vector space generated by the rows of a matrix. However, the nonzero rows of a reduced echelon matrix are linearly independent (why?). If we put these ideas together, what is the relationship between the number of nonzero rows of a reduced echelon matrix and its rank? Given any matrix, how can we find a basis for the vector space generated by its rows?
Although you may have been able to predict the outcome of Activity 2, the results of Activity 3 may have come as a surprise. These findings demonstrate the following theorem, which is considered one of the deep results in linear algebra.
Theorem 6.2.2. The column rank of a matrix is unaffected by the elementary row operations.
Although this can be proven directly, an elegant proof (which avoids many
of the calculations) will be presented in Section 6.3. Your conjecture from
Activity 1 follows directly from Theorem 6.2.2.
Theorem 6.2.3. The column rank and the row rank of a matrix M are equal.
Proof. We can assume that M is in reduced echelon form (why?). Denote the row rank of M by r. Because M is in reduced echelon form, M will have exactly r nonzero rows. We consider the tuples generated by the columns of M which have leading entries.

Each of the tuples coming from these columns will contain a single nonzero entry, and this entry will equal 1. This means that these tuples are e_1, . . . , e_r.

To conclude the proof, every column of M will have zero in all but its top r positions, so the set {e_1, . . . , e_r} will generate the subspace generated by these tuples. This is sufficient to prove that the column rank of M is r.
In the context of systems of equations, the rank of the coefficient matrix can be related to the number of determined variables in the solution set. This relationship was presented in Theorem 3.2.1 and again in Activity 4. The following theorem restates the result.

Theorem 6.2.4. The number of determined variables in the solution set of a system of linear equations is equal to the rank of the coefficient matrix of that system.
Proof. Given a system of equations with coefficient matrix M, we augment M and then transform the augmented matrix to reduced echelon form. Every leading entry then becomes a determined variable in the solution of the system. From the proof of Theorem 6.2.3, we can see that the number of leading entries is equal to the rank of the coefficient matrix.
The Matrix of a Linear Transformation
In Chapter 3, you used matrices to solve systems of equations. In Chapter 5, you described systems of equations in terms of linear transformations and found that every matrix can be used to define a linear transformation. We complete the triangle in this section by showing that every linear transformation can be described in terms of a matrix. In Activity 5, you wrote the expression for a linear transformation between two vector spaces of tuples as a matrix application.

In order to use this technique on every vector space, you need to represent a vector space as a collection of tuples. This was the purpose of the funcs in Activity 6. The functions implemented in coord and LC provide an isomorphism between an n-dimensional vector space V over K and the vector space of tuples K^n. In some ways, this is precisely the reason ordered bases were defined.

In Activity 7, you probably realized that matrify provided a generic method for representing a linear transformation as a matrix application. This method of producing a matrix from a linear transformation is so important, it gets its own definition.
Definition 6.2.2. Let V be a vector space of dimension n, and let B = [b_i] be an ordered basis. Then for each vector v, the coordinate vector (or coordinates) of v with respect to B is defined to be the vector ⟨x_1, . . . , x_n⟩ in K^n such that

v = Σ_{i=1}^{n} x_i b_i.

Given vector spaces U and V with ordered bases B = [b_j] and C, respectively, and a linear transformation T : U → V, the matrix representation of T with respect to B and C is the matrix whose j-th column is the coordinate vector of T(b_j) with respect to the ordered basis C.
Although the dimensions of the resulting matrix are not explicitly mentioned, they can be determined based on the dimensions of U and V.
The choice of ordered bases is significant, as was seen in Activity 8. This dependence can create ambiguities which make computations very difficult; it will be explored more fully in Chapter 7. We will try to alleviate this with the following notation. If a tuple-vector is a coordinate vector, then we will subscript it with the name of the ordered basis. For example, consider the vector v = ⟨1, 2⟩ in the vector space (Z_5)^2 with ordered bases B = [e_1, e_2] and C = [⟨1, 1⟩, ⟨0, 1⟩]. Then the coordinates of v with respect to B will be ⟨1, 2⟩_B. This explains why the basis B is referred to as the coordinate basis. On the other hand, the coordinates of v with respect to C are given by ⟨1, 1⟩_C. In cases where the ordered bases are clear from the context, we will often omit mention of them. One particular case is a vector space of tuples, where we will assume the use of the coordinate bases if no other bases are mentioned.
Because there is so much information involved with matrix representations, the following diagram is often used to illustrate the situation.

         T
    U ------> V
    |         |
    B         C
    |         |
    v         v
   K^m -----> K^n
         M

In this diagram, the vertical arrows indicate the isomorphisms from U to K^m and from V to K^n, which you implemented as coord (with inverse LC). These vertical arrows are labeled by the ordered bases used for the isomorphism. The arrow labeled T indicates the linear transformation from U to V, and the arrow labeled M indicates the application of the matrix representation of T with respect to B and C.

The top row of the diagram presents the linear transformation in terms of the vector spaces, while the bottom row of the diagram presents the matrix representation. The vertical arrows represent the choice of coordinates for the vector space and tie together the two presentations.
This diagram also illustrates an important equality: if we take a vector in U, follow the arrow labeled B (downward), and then apply the matrix M, we get the same result as if we had first applied T, followed by finding the coordinate representation in terms of the basis C. This was suggested in Activity 7 and is stated in the following theorem.
Theorem 6.2.5. Let U and V be vector spaces with ordered bases B and C, respectively, and let T : U → V be a linear transformation. Given u in U, the coordinate vector of T(u) with respect to C is equal to the result of
applying the matrix representation of T to the coordinate vector of u with
respect to B.
Proof. Let B = [b_j] and C = [c_i] be the ordered bases. Let [u_j] be the coordinates of u with respect to B, and let M = (m_ij) be the matrix representation of T with respect to B and C. Then

T(u) = T(Σ_j u_j b_j) = Σ_j u_j T(b_j) = Σ_j u_j (Σ_i m_ij c_i)
     = Σ_{i,j} u_j m_ij c_i = Σ_i (Σ_j m_ij u_j) c_i.

Therefore, the coordinates of T(u) with respect to C form the tuple whose i-th component is Σ_j m_ij u_j. This tuple is the same as that obtained by applying the matrix M to the tuple [u_j].
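The commuting-diagram equality of Theorem 6.2.5 can be checked numerically. In the Python sketch below (over Z_5, with an illustrative map T of my choosing, a non-coordinate domain basis B, and the coordinate basis on the range so that range coordinates are the identity), going down-then-across agrees with going across-then-down.

```python
P = 5

def T(v):
    # An illustrative linear map (Z_5)^2 -> (Z_5)^3.
    x, y = v
    return ((x + 2 * y) % P, (x + 3 * y) % P, y % P)

B = [(1, 0), (1, 1)]   # a non-coordinate ordered basis of the domain

def coord_B(u):
    """Coordinates of u w.r.t. B: solve u = a1*(1,0) + a2*(1,1) by hand."""
    a2 = u[1] % P
    a1 = (u[0] - u[1]) % P
    return (a1, a2)

# Matrix of T w.r.t. B and the coordinate basis: column j is T(B[j]).
M = [[T(b)[i] for b in B] for i in range(3)]

def apply_matrix(M, x):
    return tuple(sum(r[j] * x[j] for j in range(len(x))) % P for r in M)

u = (3, 4)
lhs = apply_matrix(M, coord_B(u))   # down the B arrow, then across via M
rhs = T(u)                          # across via T; range coords are identity
```

Equality of `lhs` and `rhs` on any test vector is exactly the statement of the theorem for this choice of bases.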
Given an n × m matrix M over K, there is a linear transformation T : K^m → K^n obtained by matrix application. A natural question now arises: is the matrix associated with T equal to the original matrix M? The answer is contained in the following theorem.
Theorem 6.2.6. Let M be an n × m matrix over K and T : K^m → K^n the linear transformation mapping v to Mv. Then the matrix of T with respect to the coordinate bases is equal to M.
Proof. See Exercise 7.
Properties of Matrix Representations
In Chapter 5, we defined a vector space structure on the set of linear transformations from U to V, denoted by Hom(U, V). In Section 6.1, we defined a vector space structure on the set of n × m matrices, denoted by K^{n×m}. Theorem 6.2.5 provides a bridge between linear transformations and matrices. A natural question is how the vector space structure on Hom(U, V) relates to that on K^{n×m}. You explored this relationship in Activity 9. Was
the conjecture you formulated in that activity consistent with the following
theorem?
Theorem 6.2.7. Let U and V be n-dimensional vector spaces over K. Let B be an ordered basis for U, and let C be an ordered basis for V. Let T and S be elements of Hom(U, V), and let k ∈ K. Assume all matrix representations are given with respect to the bases B and C. Then, we can conclude:

the matrix representation of T + S is equal to the sum of the matrix representation of T and the matrix representation of S;
the matrix representation of kT is equal to the product of k and the matrix representation of T.
Proof. We will prove the result for scalar multiplication, and leave the result for addition as an exercise (see Exercise 6). Let T : U → V be a linear transformation, and let k be any scalar in K. To simplify the notation, let B = [b_j] and C = [c_i] represent the ordered bases of U and V, respectively, and let M = (m_ij) denote the matrix representation of T with respect to B and C. We now determine the matrix representation of kT with respect to B and C.

For all j, we have:

(kT)(b_j) = k(T(b_j)) = k(Σ_i m_ij c_i) = Σ_i (km_ij) c_i.

Therefore the j-th column of the matrix representation of kT is equal to the product of k and the j-th column of the matrix representation of T. This completes the proof.
Another place where there have been parallel developments between linear transformations and matrices is the value known as rank. Do you recall the definition of the rank of a transformation and the rank of a matrix? As you discovered in Activity 10, this value is also independent of the matrix representation.
Theorem 6.2.8. The rank of a linear transformation is equal to the column rank of its matrix representation (with respect to any ordered bases).

Proof. Let T : U → V be a linear transformation, and B = [b_j] and C be ordered bases of U and V, respectively. Assume that the dimension of V is equal to n.
The range of T is spanned by the vectors T(b_j). The tuple of coordinates of the vector T(b_j) with respect to the ordered basis C is equal to the j-th column of the matrix. As a result, the range of T is isomorphic to the subspace of K^n spanned by the column tuples; hence they will have the same dimension.
Theorem 6.2.8 ties the concept of column rank to the concept of the rank of a linear transformation. This result will be instrumental in proving Theorem 6.2.2, using the following strategy. In Section 6.3, we will determine the linear transformation analog of an elementary row operation. It will be proven that these operations do not affect the rank of the linear transformation and hence will not affect the column rank of the associated matrix. This is the missing piece in the proof of Theorem 6.2.2.
Retrospection
We started with the goal of solving systems of linear equations (each of which can be written as a single matrix equation). In this task, we developed an abstract theory of vector spaces, bases, and linear transformations. We have now returned full circle, because every vector space can be represented as a vector space of tuples, and every linear transformation can be represented as a matrix application. A reasonable question now is: if everything really is just matrices, why bother with all of the abstraction?

By making the dependence on the ordered bases explicit, we have gained the freedom to change our choice of ordered bases. In many cases this can help simplify the problem at hand (as will be done in Chapter 7). Another advantage is that many of the proofs are actually easier to write if we abstract away from the details. Perhaps the greatest gain has been in places where there are no natural bases. By providing abstract proofs of the results, we learn that many of the techniques which were effective in the concrete case of tuples can be applied (without change) to the more abstract cases.
Exercises

1. Consider each of the matrices below as being over Z_5. Compute their
   row rank. Next, consider each of the matrices below as being over R.
   Compute their row rank. Do these numbers differ?
306 CHAPTER 6. SYSTEMS, TRANSFORMATIONS AND MATRICES
   (a) M = \begin{pmatrix} 1 & 2 & 3 & 4 & 1 \\ 2 & 3 & 1 & 3 & 2 \end{pmatrix}

   (b) M = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}

   (c) M = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}

   (d) M = \begin{pmatrix} 1 & 2 & 1 \\ 2 & 0 & 3 \\ 1 & 4 & 0 \end{pmatrix}

   (e) M = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 0 \\ 1 & 4 & 0 \end{pmatrix}
2. For each of the following matrices over R, compute its column rank.

   (a) M = \begin{pmatrix} 1 & 2 & 3 & 4 & 1 \\ 2 & 3 & 1 & 3 & 2 \end{pmatrix}

   (b) M = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}

   (c) M = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}

   (d) M = \begin{pmatrix} 1 & 2 & 1 \\ 2 & 0 & 3 \\ 1 & 4 & 0 \end{pmatrix}

   (e) M = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 0 \\ 1 & 4 & 0 \end{pmatrix}
3. For each system of equations over R, determine the number of deter-
   mined variables. You do not need to solve the equations.

   (a) x + 2y + 3z = 4
       x + 3y − 2z = 2
       3x + 8y + z = 8

   (b) x + y − z = 2
       x + 2y − 3z = 4
       2x + 7y − 4z = 2

   (c) x + y = 2
       x + 3y = 4
       x − 2y = 7
4. For each vector space, ordered basis B, and vector v below, write the
   coordinates of v with respect to B.

   (a) V = (Z_5)^2, B = [⟨1, 1⟩, ⟨0, 1⟩], and v = ⟨1, 3⟩

   (b) V = R^3, B = [⟨1, 2, 3⟩, ⟨2, 4, 1⟩, ⟨3, 6, 5⟩], and v = ⟨0, 4, 1⟩

   (c) V = R^{2×3}, B = [E_{11}, E_{12}, E_{13}, E_{21}, E_{22}, E_{23}], and

       v = \begin{pmatrix} 1 & 3 & 2 \\ 2 & 1 & 0 \end{pmatrix}.

       The definition of the E_{ij} can be found in the proof of Theo-
       rem 6.1.3.
   (d) V = PF_2(R), the vector space of polynomial functions with degree
       two or less over R, B = [x ↦ 1, x ↦ x, x ↦ x^2], and v = p, where
       p(x) = x^2 + 3x − 2.

   (e) V = PF_2(R), the vector space of polynomial functions with degree
       two or less over R, B = [x ↦ 1, x ↦ x + 1, x ↦ x^2 + x + 1],
       and v = p, where p(x) = x^2 − 2x + 3.

   (f) V = PF(Z_3), the vector space of all polynomial functions over Z_3,
       B = [x ↦ 1, x ↦ x, x ↦ x^2], and v = p, where p(x) = x^3.
       If you consider this impossible, carefully consider the information
       about polynomials in Chapter 1.
   (g) V = PF(Z_5), the vector space of all polynomial functions over Z_5,
       B = [x ↦ 1, x ↦ x, x ↦ x^2, x ↦ x^3, x ↦ x^4], and v = p,
       where

       p(x) = 2x^7 + 3x^6 + 2x^5 + 3x^4 + x^3 + 2x^2 + x + 4.
   (h) V = C(R) …

(MN)_{ij} = \sum_{k=1}^{p} m_{ik} n_{kj}.
One important question to ask about matrix multiplication is how it
interacts with the vector space operations. In Activity 3, you verified that
matrix multiplication by a given matrix is a linear transformation. This
result is true in general.
Theorem 6.3.1. Let M be a p × q matrix over K. Then:

• the map N ↦ MN defines a linear transformation from K^{q×m} to K^{p×m};

• the map N ↦ NM defines a linear transformation from K^{n×p} to K^{n×q}.
Proof. We prove the case for the first map. You will be asked to prove the
second case in the exercises (see Exercise 2). Let M ∈ K^{p×q}. For N ∈ K^{q×m},
define T(N) = MN ∈ K^{p×m}. We must show that for any a, b ∈ K and
N, P ∈ K^{q×m} we have

M(aN + bP) = T(aN + bP) = aT(N) + bT(P) = aMN + bMP.

This can be done by direct computation:

M(aN + bP) = (m_{ij})(a n_{ij} + b p_{ij})
           = \Bigl( \sum_k m_{ik} (a n_{kj} + b p_{kj}) \Bigr)
           = \Bigl( \sum_k a m_{ik} n_{kj} + b m_{ik} p_{kj} \Bigr)
           = a \Bigl( \sum_k m_{ik} n_{kj} \Bigr) + b \Bigl( \sum_k m_{ik} p_{kj} \Bigr)
           = aMN + bMP.
6.3 Matrix Multiplication 317
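Theorem 6.3.1 lends itself to a direct numerical check. The sketch below is written in Python rather than the text's ISETL (the helper names `mat_mul`, `mat_add`, and `mat_scale` are ours, not the book's), and it verifies M(aN + bP) = aMN + bMP for one concrete choice of M, N, P, a, and b.

```python
def mat_mul(M, N):
    # (MN)_ij = sum over k of m_ik * n_kj
    return [[sum(M[i][k] * N[k][j] for k in range(len(N)))
             for j in range(len(N[0]))] for i in range(len(M))]

def mat_add(N, P):
    return [[n + p for n, p in zip(rn, rp)] for rn, rp in zip(N, P)]

def mat_scale(a, N):
    return [[a * entry for entry in row] for row in N]

M = [[1, 2], [3, 4], [5, 6]]      # 3 x 2
N = [[1, 0, 2], [2, 1, 1]]        # 2 x 3
P = [[0, 1, 1], [1, 3, 0]]        # 2 x 3
a, b = 2, -3

# Linearity of the map N |-> MN: M(aN + bP) = aMN + bMP
lhs = mat_mul(M, mat_add(mat_scale(a, N), mat_scale(b, P)))
rhs = mat_add(mat_scale(a, mat_mul(M, N)), mat_scale(b, mat_mul(M, P)))
assert lhs == rhs
```

A single example does not prove the theorem, of course; the proof above does, but running a check like this is a quick way to catch an error in a hand computation.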
In Activities 1, 2 and 4, you explored some of the basic properties of
matrix multiplication. Although you undoubtedly discovered that matrix
multiplication shared some properties with real number multiplication, you
found that there are notable differences. Can you identify some of the sim-
ilarities? Can you describe the differences? Some of the similarities are
presented in the following theorem.
Theorem 6.3.2. Let the matrix I be defined by (δ_{ij}), where δ_{ij} is equal to 1
if i = j, and δ_{ij} = 0 otherwise. Let the matrix Z be the zero matrix. If A, B,
and C are matrices, then the following equalities hold, provided the products
are defined (in other words, provided the dimensions are correct).

A(B + C) = AB + AC
(A + B)C = AC + BC
k(AB) = (kA)B = A(kB)
AI = A = IA
AZ = Z = ZA
A(BC) = (AB)C
Proof. We will prove some of these and leave others as exercises.

A(B + C) = AB + AC is valid, because multiplication from the left by
A is a linear transformation.

(A + B)C = AC + BC is valid, because multiplication from the right
by C is a linear transformation.

A(kB) = k(AB) is valid, because multiplication from the left by A is
a linear transformation.

(kA)B = k(AB) is valid, because multiplication from the right by B is
a linear transformation.

A(BC) = (AB)C is left to the reader (see Exercise 3).
We will prove AI = A = IA directly.

AI = (a_{ij})(δ_{ij}) = \Bigl( \sum_k a_{ik} δ_{kj} \Bigr)
   = \Bigl( a_{ij} δ_{jj} + \sum_{k ≠ j} a_{ik} δ_{kj} \Bigr)
   = \Bigl( a_{ij} + \sum_{k ≠ j} 0 \Bigr) = (a_{ij}) = A.

The proof of A = IA is similar and left to the reader (see Exercise 4).

AZ = Z = ZA is left to the reader (see Exercise 5).
The statement in the theorem, "provided the products are defined," is
required, because not all matrices can be multiplied. What are the require-
ments on the dimensions of two matrices that allow them to be multiplied?
The second property which matrix multiplication lacks is commutativity.
Even when both AB and BA are defined, they may not even be the same
dimension, as you discovered in Activity 1. This also explains why Theo-
rem 6.3.2 requires two different distribution laws (one from the right and one
from the left).
Multiplication as Composition

There is another operation which has properties similar to matrix multipli-
cation: composition of linear transformations. Composition is associative,
distributes over addition and scalar multiplication, and has an identity and
a zero function. Like matrix multiplication, function composition is not
always defined and is usually noncommutative.

In Activities 5 and 6, you explored the connection between function com-
position and matrix multiplication. Although the proofs of the following
two theorems are straightforward, they provide an important connection
between matrices and linear transformations, which we will be able to use
for a variety of purposes.
Theorem 6.3.3. Let M be an n × p matrix over K and N a p × m matrix
over K. If we start with a tuple x in K^m, apply N to this tuple and then
apply M to the resulting tuple in K^p, then the final tuple in K^n will be equal
to the tuple obtained by applying MN to the original tuple in K^m. In short:

M(Nx) = (MN)x.
Proof. Let x = ⟨x_1, . . . , x_m⟩ be any vector in K^m.

The result of applying the matrix N to x will have \sum_j n_{kj} x_j as its k-th
component. We then apply the matrix M to this, and the result will have

\sum_k m_{ik} \Bigl( \sum_j n_{kj} x_j \Bigr)

as its i-th component.

The result of applying the matrix MN to the vector x will have

\sum_j \Bigl( \sum_k m_{ik} n_{kj} \Bigr) x_j

as its i-th component. What remains to be shown is that these two are equal:

\sum_k m_{ik} \Bigl( \sum_j n_{kj} x_j \Bigr) = \sum_{k,j} m_{ik} n_{kj} x_j = \sum_j \Bigl( \sum_k m_{ik} n_{kj} \Bigr) x_j.
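The identity M(Nx) = (MN)x can also be checked numerically. The Python sketch below is our own illustration (the text's activities use ISETL, and the helper names are assumed, not the book's):

```python
def mat_mul(M, N):
    return [[sum(M[i][k] * N[k][j] for k in range(len(N)))
             for j in range(len(N[0]))] for i in range(len(M))]

def mat_vec(M, x):
    # i-th component: sum over j of m_ij * x_j
    return [sum(m * xj for m, xj in zip(row, x)) for row in M]

M = [[1, 2, 0], [0, 1, 3]]     # n x p = 2 x 3
N = [[1, 1], [0, 2], [4, 0]]   # p x m = 3 x 2
x = [5, -1]                    # a tuple in K^m

# Applying N, then M, agrees with applying MN directly.
assert mat_vec(M, mat_vec(N, x)) == mat_vec(mat_mul(M, N), x) == [0, 58]
```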
This analogous relationship also holds for matrix representations of a
linear transformation, as the following theorem shows:

Theorem 6.3.4. Let T : K^p → K^n and S : K^m → K^p be linear transfor-
mations. The matrix representation of T ∘ S is equal to the product of the
matrix representation of T and the matrix representation of S.
Proof. Let the matrix representation of T be given by (t_{ij}) and the matrix
representation of S be given by (s_{ij}). We find the column tuples of the matrix
representation of T ∘ S by applying T ∘ S to the basis elements of K^m.

T ∘ S(e_j) = T \Bigl( \sum_k s_{kj} e_k \Bigr)
           = \sum_k s_{kj} T(e_k)
           = \sum_k s_{kj} \Bigl( \sum_i t_{ik} e_i \Bigr)
           = \sum_{i,k} t_{ik} s_{kj} e_i
           = \sum_i \Bigl( \sum_k t_{ik} s_{kj} \Bigr) e_i

This means that the matrix representation of T ∘ S is the matrix \Bigl( \sum_k t_{ik} s_{kj} \Bigr),
which completes the proof.
The result of these two theorems is that (with the extra baggage of ordered
bases) matrix multiplication and transformation composition are just two
views of the same process. This can be described using the following diagram.

U ---S---> V ---T---> W        (composite across the top: T ∘ S)

K^m --N--> K^p --M--> K^n      (composite across the bottom: MN)

The top row of the diagram illustrates the composition of linear transfor-
mations between vector spaces and the bottom row illustrates matrix multi-
plication. As always in these diagrams, the link between the top and bottom
rows is the choice of ordered bases.
Invertible Matrices and Change of Bases

In Activity 4, you discovered the identity matrix I. What is special about
a matrix of this form? In Activities 7 and 8, you discovered a method for
determining the inverse of a matrix (in other words, a method for finding
a matrix N such that MN = I). The method suggested in this activity is
equivalent to solving the equation MN = I, where I is the identity matrix,
and N represents the inverse of M, if the inverse exists. If we denote column
j of N by N_j, then the system of equations given by MN_j = I_j is one of the
n systems represented by MN = I, which, when expanded, yields n systems
of equations in n unknowns, all of which have the same coefficient matrix.
Since the coefficients are the same for each system, we can apply elementary
row operations to the augmented matrix (M | I) to find the solution set, if
it exists, of each individual system. The func mat_inv that you wrote in
Activity 8 implements this strategy for determining the inverse.
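The strategy just described (row-reduce the augmented matrix (M | I)) can be sketched as follows. The text's func is written in ISETL; the Python version below is only an illustrative stand-in, using exact rational arithmetic so that round-off cannot hide a singular matrix.

```python
from fractions import Fraction

def mat_inv(M):
    """Row-reduce the augmented matrix (M | I); the right half of the
    result is the inverse. Returns None when no pivot can be found,
    i.e., when M is not invertible."""
    n = len(M)
    A = [[Fraction(M[i][j]) for j in range(n)] +
         [Fraction(1) if i == j else Fraction(0) for j in range(n)]
         for i in range(n)]
    for col in range(n):
        piv = next((r for r in range(col, n) if A[r][col] != 0), None)
        if piv is None:
            return None
        A[col], A[piv] = A[piv], A[col]
        A[col] = [entry / A[col][col] for entry in A[col]]
        for r in range(n):
            if r != col and A[r][col] != 0:
                factor = A[r][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [row[n:] for row in A]

M = [[2, 1], [5, 3]]
N = mat_inv(M)
MN = [[sum(M[i][k] * N[k][j] for k in range(2)) for j in range(2)]
      for i in range(2)]
assert MN == [[1, 0], [0, 1]]              # MN = I
assert mat_inv([[1, 2], [2, 4]]) is None   # a singular matrix has no inverse
```

Note that one pass of elimination inverts the single shared coefficient matrix, exactly as the paragraph above observes: the n systems MN_j = I_j are solved simultaneously.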
Theorem 6.3.5. The matrix M is invertible if and only if the matrix ob-
tained by reducing M to reduced echelon form is the identity matrix.

Proof. See Exercise 7.

A direct consequence of this was investigated in Activity 8 and is pre-
sented here as a corollary.

Corollary. An n × n matrix is invertible if and only if its row rank is equal
to n.

Proof. This follows quickly from the previous theorem. The row rank of M
is equal to the row rank of the matrix obtained by reducing M to reduced
echelon form. If M is invertible, the reduced matrix is I, and the row rank
is n. If M is not invertible, the reduced matrix, which is not I, has a row of
all zeros and hence row rank less than n.

You should see if you can translate the analysis of the paragraph pre-
ceding the theorem into statements about linear transformations. Can you
use this technique to describe a general method for finding inverses of linear
transformations?
Theorem 6.3.6. If T is an invertible linear transformation from K^n to K^n,
then the matrix representation of T in K^{n×n} is invertible.

Proof. Let T have inverse S, so that T ∘ S = I. Let M be the matrix
representation of T and N be the matrix representation of S. Then MN is
the matrix representation of T ∘ S = I, which means that MN = I.

In this proof we have used I with two meanings. In one place it refers
to the identity linear transformation, and in another it refers to the identity
matrix. Can you identify which one is which in the proof above?
Theorem 6.3.7. Let M be an invertible matrix in K^{n×m}. The linear trans-
formation T : K^m → K^n defined by

T(x) = Mx,

where x ∈ K^m, is invertible.

Proof. See Exercise 8.

Corollary. An n × n matrix is invertible if and only if its column rank is
equal to n.

Proof. Let M be any matrix in K^{n×n} and T be the linear transformation
defined by applying M. Then the range of T is generated by the columns of
M. This means that the rank of T is equal to the column rank of M. The
corollary follows because T is invertible if and only if its rank is equal to the
dimension of its domain (which is n).
One particular application of inverse matrices was presented in Activity 9,
where you used a matrix inverse to find the solution set of a system of linear
equations. The general technique can be described as follows. Starting with
an equation MX = Y , you find the value of X by computing M^{-1} Y , where
M^{-1} is the inverse of M. Written out in notation, the solution should look
very familiar.

MX = Y
M^{-1}(MX) = M^{-1} Y
(M^{-1} M)X = M^{-1} Y
IX = M^{-1} Y
X = M^{-1} Y
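The derivation above can be run on a small example. This Python sketch is our own illustration (the helper names `inv2` and `mat_vec` are assumptions, not the book's ISETL funcs); it inverts a 2 × 2 matrix by the well-known adjugate formula and then solves MX = Y as X = M^{-1}Y.

```python
from fractions import Fraction

def inv2(M):
    # Inverse of a 2 x 2 matrix via the adjugate: (1/det) [[d, -b], [-c, a]]
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)
    assert det != 0, "M is not invertible"
    return [[d / det, -b / det], [-c / det, a / det]]

def mat_vec(M, x):
    return [sum(m * xj for m, xj in zip(row, x)) for row in M]

# Solve M X = Y as X = M^{-1} Y.
M = [[2, 1], [5, 3]]
Y = [4, 11]
X = mat_vec(inv2(M), Y)
assert X == [1, 2]
assert mat_vec(M, X) == Y   # the computed X really solves the system
```

This is exactly the situation exploited in Exercises 12 and 13 below: once M^{-1} is known, each new right-hand side Y costs only one matrix-vector product.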
Now that we have a number of conditions for determining the invertibil-
ity of a matrix, we focus on one interpretation of invertible transformations
(and matrices). When you worked Activity 10, you discovered a connection
between changing ordered bases and invertibility. Namely, every time you
apply an invertible matrix to an ordered basis, you get another ordered ba-
sis. As a result, the linear transformation defined by the application of an
invertible matrix is simply a change of ordered basis. This perspective will
allow us to prove the following theorem.
Theorem 6.3.8. If P is an invertible matrix, then the column rank of PM
is equal to the column rank of M.

Proof. The key to the proof is that M and PM both represent the same
linear transformation, except with respect to different ordered bases. Since
both matrices have rank equal to that of the transformation they represent,
M and PM have the same rank.

Let M be an n × m matrix and P be an invertible n × n matrix. Let
A be the coordinate basis of K^m, B be the coordinate basis of K^n, and let
the ordered set C = [c_i] be the result of applying the inverse of P to the
elements of B. Let T : K^m → K^n be the linear transformation defined by
applying M with respect to A and B. In the exercises (see Exercise 16), you
will prove that the matrix representation of T with respect to A and B is
equal to M. We show that the matrix representation of T with respect to A
and C is equal to PM.

The first step is to prove that C really is a basis. Let S be the linear
transformation defined by the application of the inverse of P. Because the
inverse of P is invertible, S is an invertible linear transformation. This implies
that the range of S has dimension n and so must be all of K^n. However, the
range of S is spanned by the n vectors c_i = S(e_i), and so this set must be
linearly independent by Theorem 4.4.9. As a result, C is a basis of K^n.

Note that because c_i is the result of applying the inverse of P to the
vector e_i, we know that the vector e_i is the result of applying P to the c_i. In
other words,

e_k = \sum_i p_{ik} c_i.

To finish the proof, we compute the matrix representation of T with
respect to the ordered bases A and C:

T(e_j) = \sum_k m_{kj} e_k = \sum_k m_{kj} \Bigl( \sum_i p_{ik} c_i \Bigr) = \sum_i \Bigl( \sum_k p_{ik} m_{kj} \Bigr) c_i.

This calculation shows that the matrix representation of T with respect to
A and C is equal to

\Bigl( \sum_k p_{ik} m_{kj} \Bigr) = PM,

which is what was desired.
The proof may be hard to conceptualize, but it can be clarified using the
diagram notation.

K^m --T--> K^n --I--> K^n

K^m --M--> K^n --P--> K^n      (composite across the bottom: PM)

At the top of the diagram, there are two arrows labelled with the linear
transformation T, which has a fixed rank. At the bottom of the diagram,
the corresponding arrows are labelled M and PM, so the column ranks of
M and PM must be equal.
Exercises 17 and 18 ask you to state and prove the analog of Theorem 6.3.8
for M and MP, where P is invertible. A more complete development of this
theory is presented in Chapter 7.
So now we arrive at a position where we are able to prove Theorem 6.2.2.
Recall that the theorem states that the column rank of a matrix is unaffected
by the elementary row operations.
Proof of Theorem 6.2.2. Let M be a matrix in K^{n×n}. We prove that each
elementary row operation on M can be implemented by multiplying M on
the left by an invertible matrix. This suffices to prove the result by Theo-
rem 6.3.8.

The interchanging of rows i and j can be implemented by multiplying by
the matrix M defined by

m_{kℓ} = \begin{cases}
1 & k = ℓ, \text{ except } (k, ℓ) = (i, i) \text{ or } (k, ℓ) = (j, j) \\
1 & (k, ℓ) = (j, i) \text{ or } (k, ℓ) = (i, j) \\
0 & \text{otherwise}
\end{cases}

The multiplying of row i by the scalar a can be implemented by multi-
plying by the matrix M defined by

m_{kℓ} = \begin{cases}
1 & k = ℓ, \text{ except } (k, ℓ) = (i, i) \\
a & (k, ℓ) = (i, i) \\
0 & \text{otherwise}
\end{cases}

The replacement of row i by the sum of row i and row j can be imple-
mented by multiplying by the matrix M defined by

m_{kℓ} = \begin{cases}
1 & k = ℓ \\
1 & (k, ℓ) = (i, j) \\
0 & \text{otherwise}
\end{cases}

We leave as exercises the proof that these matrices implement the appro-
priate elementary row operations (see Exercises 19 through 21). The fact
that they are invertible follows from the fact that each operation is re-
versible.
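The three elementary matrices above are easy to build and test. The sketch below is Python of our own (the text works in ISETL, and the function names are assumptions); multiplying each matrix on the left of N performs the corresponding row operation:

```python
def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def mat_mul(M, N):
    return [[sum(M[i][k] * N[k][j] for k in range(len(N)))
             for j in range(len(N[0]))] for i in range(len(M))]

def swap_rows(n, i, j):       # interchange rows i and j
    E = identity(n)
    E[i], E[j] = E[j], E[i]
    return E

def scale_row(n, i, a):       # multiply row i by the scalar a
    E = identity(n)
    E[i][i] = a
    return E

def add_row(n, i, j):         # replace row i with row i + row j
    E = identity(n)
    E[i][j] = 1
    return E

N = [[1, 2], [3, 4], [5, 6]]
assert mat_mul(swap_rows(3, 0, 2), N) == [[5, 6], [3, 4], [1, 2]]
assert mat_mul(scale_row(3, 1, 10), N) == [[1, 2], [30, 40], [5, 6]]
assert mat_mul(add_row(3, 0, 1), N) == [[4, 6], [3, 4], [5, 6]]
```

Exercises 19 through 21 ask for the general proofs of exactly these three facts.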
Exercises

1. For each of the matrices M, N over R below, compute the product
   MN.

   (a) M = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 4 \end{pmatrix} and
       N = \begin{pmatrix} 2 & 3 \\ 2 & 1 \\ 3 & 1 \end{pmatrix}

   (b) M = \begin{pmatrix} 1 & 2 & 3 \\ 0 & 1 & 2 \\ 0 & 0 & 3 \end{pmatrix} and
       N = \begin{pmatrix} 3 & 0 & 0 \\ 2 & 4 & 0 \\ 1 & 3 & 1 \end{pmatrix}

   (c) M = \begin{pmatrix} 4 & 0 & 0 \\ 0 & 5 & 0 \\ 0 & 0 & 3 \end{pmatrix} and
       N = \begin{pmatrix} 2 & 0 & 0 \\ 0 & 4 & 0 \\ 0 & 0 & 6 \end{pmatrix}
2. Complete the proof of Theorem 6.3.1.

3. Provide the proof of associativity in Theorem 6.3.2: for any three matri-
   ces A, B, and C, (AB)C = A(BC), whenever the products are defined.

4. Complete the proof of the existence of an identity matrix in Theo-
   rem 6.3.2: for any matrix A, AI = A = IA, whenever the product is
   defined.

5. Provide the proof that multiplication by the zero matrix always results
   in the zero matrix in Theorem 6.3.2: for any matrix A, ZA = Z = AZ,
   whenever the products are defined.
6. Define T, S : R^3 → R^2 by

   T(⟨x, y, z⟩) = ⟨x + 2y, z⟩
   S(⟨x, y, z⟩) = ⟨y − z, z − x⟩.

   Compute the matrix representations of the following linear transforma-
   tions with respect to the coordinate bases.

   (a) T − S
   (b) 2T − 3S
   (c) T − (2S + T + I)

7. Provide the proof of Theorem 6.3.5.

8. Provide the proof of Theorem 6.3.7.

9. Assume that M and N are square matrices. Prove that if M and N
   are invertible, then MN is invertible.

10. Assume that M and N are square matrices. Prove that if MN is
    invertible, then M and N are both invertible.

11. For each linear transformation T over R given below, compute its in-
    verse.

    (a) T(⟨x, y⟩) = ⟨x + y, x − y⟩
    (b) T(⟨x, y, z⟩) = ⟨x + 2y, 3x − z, 4x + 2y + 3z⟩
    (c) T(⟨x, y, z, w⟩) = ⟨x + y + z, y + z, z + w, x + w⟩

12. Solve each system of equations over R given below. (Hint: Are there
    similarities between these systems' coefficient matrices that can be ex-
    ploited?)
    (a) x + 7y + z = 14
        2x + 2y − 10z = 4
        3y + 4z = 7

    (b) x + 7y + z = 2
        2x + 2y − 10z = 3
        3y + 4z = 2

    (c) x + 7y + z = 0
        2x + 2y − 10z = 0
        3y + 4z = 0

    (d) x + 7y + z = 7
        2x + 2y − 10z = 1
        3y + 4z = 7

    (e) x + 7y + z = 1
        2x + 2y − 10z = 1
        3y + 4z = 1

    (f) x + 7y + z = 2
        2x + 2y − 10z = 4
        3y + 4z = 6
    (g) x + 7y + z = √2
        2x + 2y − 10z = …
        3y + 4z = e
13. Solve each system of equations over R below. (Hint: Are there similari-
    ties between these systems' coefficient matrices that can be exploited?)

    (a) 2x + 3y + 2z + w = 3
        x + y + z = 2
        2y + 2z = 3
        3z + 4w = 7

    (b) 2x + 3y + 2z + w = 1
        x + y + z = 3
        2y + 2z = 4
        3z + 4w = 4

    (c) 2x + 3y + 2z + w = 1
        x + y + z = 1
        2y + 2z = 0
        3z + 4w = 0

    (d) 2x + 3y + 2z + w = 2
        x + y + z = 5
        2y + 2z = 8
        3z + 4w = 10

    (e) 2x + 3y + 2z + w = 4
        x + y + z = 2
        2y + 2z = 3
        3z + 4w = 47

    (f) 2x + 3y + 2z + w = √3
        x + y + z = √2
        2y + 2z = e^e
        3z + 4w = 0
14. Let D : P_2(R) → P_1(R) be the linear transformation D(p) = p′, the
    derivative of p. Let J : P_1(R) → P_2(R) be the linear transformation
    where J(p) = \int_0^x p(t) dt.

    (a) Compute the matrix representation M of D with respect to the
        ordered bases [1, x, x^2] and [1, x].

    (b) Compute the matrix representation N of J with respect to the
        ordered bases [1, x] and [1, x, x^2].

    (c) Compute the matrix product MN. What does this say about the
        linear transformation D ∘ J?
    (d) Compute the matrix product NM. What does this say about J ∘ D?
15. Let D : P_5(R) → P_4(R) be the linear transformation D(p) = p′, the
    derivative of p. Let J : P_4(R) → P_5(R) be the linear transformation
    where J(p) = \int_0^x p(t) dt.

    (a) Compute the matrix representation M of D with respect to the
        ordered bases [1, x, x^2, x^3, x^4, x^5] and [1, x, x^2, x^3, x^4].

    (b) Assume more generally that D : P_n(R) → P_{n−1}(R) is the linear
        transformation D(p) = p′, the derivative of p. What is the matrix
        representation of D with respect to the ordered bases [1, . . . , x^n]
        and [1, . . . , x^{n−1}]?

    (c) Compute the matrix representation N of J with respect to the
        ordered bases [1, x, x^2, x^3, x^4] and [1, x, x^2, x^3, x^4, x^5].

    (d) Assume more generally that J : P_{n−1}(R) → P_n(R) is the linear
        transformation J(p) = \int_0^x p(t) dt. What is the matrix represen-
        tation of J with respect to the ordered bases [1, . . . , x^{n−1}] and
        [1, . . . , x^n]?
16. Let M be an n × m matrix and T : K^m → K^n be the linear transfor-
    mation defined by application of the matrix M. Prove that the matrix
    representation of T is equal to M. Notice, we have not specified or-
    dered bases for this exercise. Why was that omission acceptable in this
    case?

17. State the theorem analogous to Theorem 6.3.8 for the matrices MP
    and M.

18. Prove the theorem you stated in Exercise 17.
19. Let M be defined as

    m_{kℓ} = \begin{cases}
    1 & k = ℓ, \text{ except } (k, ℓ) = (i, i) \text{ or } (k, ℓ) = (j, j) \\
    1 & (k, ℓ) = (j, i) \text{ or } (k, ℓ) = (i, j) \\
    0 & \text{otherwise}
    \end{cases}

    Prove that the matrix MN is the result of interchanging rows i and
    j in the matrix N.

20. Let M be defined as

    m_{kℓ} = \begin{cases}
    1 & k = ℓ, \text{ except } (k, ℓ) = (i, i) \\
    a & (k, ℓ) = (i, i) \\
    0 & \text{otherwise}
    \end{cases}

    Prove that the matrix MN is the result of multiplying row i of the
    matrix N by a.

21. Let M be defined as

    m_{kℓ} = \begin{cases}
    1 & k = ℓ \\
    1 & (k, ℓ) = (i, j) \\
    0 & \text{otherwise}
    \end{cases}

    Prove that the matrix MN is the result of replacing row i of the
    matrix N with the sum of rows i and j of the matrix N.

22. Although we have defined a matrix product above, another possibility
    for a product would have been

    A ∘ B = (1/2) AB + (1/2) BA,

    where juxtaposition indicates the product defined in this section. Prove
    that this new product is linear and commutative. For A, I, and Z of
    the same size, show that I ∘ A = A = A ∘ I and Z ∘ A = Z = A ∘ Z.
    Give an example of three matrices A, B, and C for which (A ∘ B) ∘ C ≠
    A ∘ (B ∘ C).
6.4 Determinants

Activities

1. Define a func called det which accepts a 2 × 2 matrix M = (m_{ij}) and
   returns the value m_{11} m_{22} − m_{12} m_{21}. Your code may assume that as
   and ms implement addition and multiplication.

   Now define as and ms to implement arithmetic mod 5. For each of
   the following pairs of matrices, compute det(M), det(N), and det(MN).
   Make a conjecture based on your findings.

   (a) M = \begin{pmatrix} 2 & 3 \\ 1 & 4 \end{pmatrix} and
       N = \begin{pmatrix} 3 & 4 \\ 0 & 1 \end{pmatrix}

   (b) M = \begin{pmatrix} 1 & 3 \\ 2 & 1 \end{pmatrix} and
       N = \begin{pmatrix} 0 & 1 \\ 2 & 4 \end{pmatrix}

   (c) M = \begin{pmatrix} 2 & 0 \\ 1 & 3 \end{pmatrix} and
       N = \begin{pmatrix} 1 & 3 \\ 3 & 4 \end{pmatrix}
Discussion

You probably noticed that this activity section was rather sparse. Often
you may want to determine whether a matrix is invertible, but you are not
interested in the value of the inverse matrix. The purpose of this section is
to provide you with a technique for determining the invertibility of a matrix
through a single computation. We provide such a function with the following
definition.

Definition 6.4.1. If M is a 1 × 1 matrix, then the determinant of M is m_{11}.
The determinant of M is denoted by det(M).

If M is an n × n matrix, the (i, j) cofactor of M is defined as the deter-
minant of the (n−1) × (n−1) matrix obtained by removing the i-th row and
j-th column from the matrix M. This cofactor is denoted by M_{ij}.
If M is an n × n matrix, the determinant of M is defined as any of the
following:

det(M) = \sum_{j=1}^{n} (−1)^{i+j} m_{ij} M_{ij}

det(M) = \sum_{i=1}^{n} (−1)^{i+j} m_{ij} M_{ij}

Note that the first formula represents n different summations (one for each
fixed i with 1 ≤ i ≤ n) and the second formula represents n different sum-
mations (one for each fixed j with 1 ≤ j ≤ n). The first formula is called the
expansion along the i-th row, and the second formula is called the expansion
along the j-th column.
We currently do not have the tools to prove the following theorem, but
it is important because it states that there is no ambiguity in the above
definition.

Theorem 6.4.1. All of the formulas in Definition 6.4.1 produce the same
scalar.

The formula for computing the determinants of 2 × 2 matrices was pro-
vided in Activity 1 and is worth remembering. We provide it here again.

det \begin{pmatrix} a & b \\ c & d \end{pmatrix} = ad − bc
The formula for computing determinants of 3 × 3 matrices is slightly
more complicated, but also worth remembering. We provide the formula for
expansion along the first row here:

det \begin{pmatrix} m_{11} & m_{12} & m_{13} \\ m_{21} & m_{22} & m_{23} \\ m_{31} & m_{32} & m_{33} \end{pmatrix}
= m_{11}(m_{22} m_{33} − m_{32} m_{23}) − m_{12}(m_{21} m_{33} − m_{31} m_{23})
+ m_{13}(m_{21} m_{32} − m_{31} m_{22}).

Since the value of the determinant of a matrix is independent of the row
or column selected when performing a cofactor expansion, the easiest way
to compute the determinant of a matrix M is to expand along the row or
column which contains the largest number of zero entries.
Theorem 6.4.2. If Z is a zero matrix, then det(Z) = 0. If I is an identity
matrix, then det(I) = 1.

Proof. See Exercises 2 and 3.

In Activity 1, you discovered one important property of 2 × 2 determi-
nants, namely, that the determinant of a product of matrices is the product
of the determinants of those matrices. This holds in general, although we
will not prove it in this text.

Theorem 6.4.3. For any pair of n × n matrices M and N, det(MN) =
det(M) det(N).

Theorem 6.4.4. The matrix M is invertible if and only if det(M) ≠ 0.

Proof. If M is invertible with inverse N, then det(M) det(N) = det(MN) =
det(I) = 1. This implies that det(M) ≠ 0.

The proof that det(M) ≠ 0 implies that M has an inverse can be found
in more advanced linear algebra texts.
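Definition 6.4.1 translates directly into a recursive function. The sketch below is Python of our own (Activity 1 asks for an ISETL func); it expands along the first row and then checks Theorem 6.4.3 on one pair of matrices over R.

```python
def det(M):
    # Cofactor expansion along the first row (Definition 6.4.1 with i = 1).
    n = len(M)
    if n == 1:
        return M[0][0]
    total = 0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in M[1:]]  # drop row 1, column j
        total += (-1) ** j * M[0][j] * det(minor)
    return total

assert det([[1, 3], [2, 4]]) == 1 * 4 - 3 * 2   # the 2 x 2 formula

def mat_mul(M, N):
    return [[sum(M[i][k] * N[k][j] for k in range(len(N)))
             for j in range(len(N[0]))] for i in range(len(M))]

M = [[2, 2, 1], [0, 1, 0], [3, 2, 1]]
N = [[2, 0, 0], [1, 2, 0], [2, 1, 3]]
assert det(M) == -1
assert det(mat_mul(M, N)) == det(M) * det(N)    # Theorem 6.4.3
```

The recursive expansion costs on the order of n! operations, which is why row-reduction-based methods are preferred in practice for anything but small matrices.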
As you might have noticed, our coverage of determinants has been short,
and many of the theorems have been left unproven. There is enough infor-
mation about determinants to fill an entire chapter, but, for the moment, we
only need a few results. Complete proofs would take us far afield.
Exercises

1. Compute the determinants of the following matrices.

   (a) M = \begin{pmatrix} 1 & 3 \\ 2 & 4 \end{pmatrix}, over R

   (b) M = \begin{pmatrix} 2 & 3 \\ 1 & 4 \end{pmatrix}, over Z_5

   (c) M = \begin{pmatrix} 2 & 2 & 1 \\ 0 & 1 & 0 \\ 3 & 2 & 1 \end{pmatrix}, over Z_5

   (d) M = \begin{pmatrix} 2 & 0 & 0 \\ 1 & 2 & 0 \\ 2 & 1 & 3 \end{pmatrix}, over R

   (e) M = \begin{pmatrix} x & 3 & 2 \\ 2 & x & 2 \\ 1 & 2 & x \end{pmatrix}, over R

2. Prove that if Z is the zero matrix, then det(Z) = 0. (Hint: This proof
   requires the use of mathematical induction.)

3. Prove that if I is the identity matrix, then det(I) = 1. (Hint: This
   proof requires the use of mathematical induction.)
Chapter 7

Getting to Second Bases

So the grand finale certainly has an interesting
title. In this last chapter, we want to look at
some of the ways that linear algebra is
useful, other than being a great way to spend a
semester with your favorite mathematician.
Specifically, we explore the power that is
harnessed by using matrix representations for
linear transformations. We revisit bases of a
vector space and see that by choosing wisely,
much work can be avoided! And the fun we will
have with eigenstuff. . .
7.1 Change of Basis

Activities

1. Complete parts (a)–(c) for each of the following matrices A = (a_{ij})
   given below. Use the information obtained in (a)–(c) to answer the
   question posed in (d).

   A_1 = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \\ 3 & 1 & 2 \end{pmatrix}

   A_2 = \begin{pmatrix}
   1/2 & 1/8 & 1/4 & 1/8 \\
   0 & 1/2 & 0 & 1/2 \\
   1/4 & 1/2 & 0 & 1/4 \\
   0 & 1/3 & 0 & 2/3
   \end{pmatrix}

   A_3 = \begin{pmatrix} 1 & 1 & 1 \\ 3 & 2 & 3 \\ 5 & 5 & 4 \end{pmatrix}
   (a) Show that A is invertible by applying a tool from Chapter 6 to
       construct its inverse S = (s_{ij}).

   (b) For any x = ⟨x_1, x_2, . . . , x_n⟩ ∈ R^n, where n denotes the number
       of columns in A, define the vector y = ⟨y_1, y_2, . . . , y_n⟩ ∈ R^n by

       y = S · x.

       Select three nonzero vectors x, and use the equation to find three
       corresponding vectors y.

   (c) Let b_j be the vector whose components are the entries of the j-th
       column of A. Check to see that the sequence B = [b_1, b_2, . . . , b_n]
       forms a basis for R^n, and then, for each vector y, compute the
       sum

       z = \sum_{j=1}^{n} y_j b_j.

       Compare each vector z to the vector x to which it corresponds.
       What do you observe?

   (d) Describe the procedure alluded to in (a)–(c). Given a basis B =
       [b_1, b_2, . . . , b_n] for R^n, how do we find the coordinate vector [x]_B
       of x, that is, the vector ⟨s_1, s_2, . . . , s_n⟩ whose components are the
       scalars in the expression

       x = \sum_{i=1}^{n} s_i b_i ?
2. Construct a func ChangeCoeff that will accept an ordered basis B =
   [b_1, b_2, . . . , b_n] for R^n and a vector x ∈ R^n and that returns the co-
   ordinate vector [x]_B = ⟨s_1, s_2, . . . , s_n⟩ of x. This is the vector whose
   components are the coefficients of the equation

   x = \sum_{i=1}^{n} s_i b_i.

   Check your func for each x that you selected in Activity 1.
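The idea behind ChangeCoeff can be sketched as follows. The activity asks for an ISETL func; this Python stand-in (the names `solve` and `change_coeff`, and the sample basis, are our assumptions) puts the basis vectors into the columns of a matrix M and solves M·s = x, so that s = M^{-1}x is the coordinate vector.

```python
from fractions import Fraction

def solve(M, x):
    # Solve M s = x by Gauss-Jordan elimination on the augmented matrix (M | x).
    n = len(M)
    A = [[Fraction(M[i][j]) for j in range(n)] + [Fraction(x[i])]
         for i in range(n)]
    for c in range(n):
        piv = next(r for r in range(c, n) if A[r][c] != 0)
        A[c], A[piv] = A[piv], A[c]
        A[c] = [entry / A[c][c] for entry in A[c]]
        for r in range(n):
            if r != c and A[r][c] != 0:
                factor = A[r][c]
                A[r] = [a - factor * b for a, b in zip(A[r], A[c])]
    return [A[i][n] for i in range(n)]

def change_coeff(B, x):
    # Columns of M are the basis vectors b_j; the coordinate vector is M^{-1} x.
    n = len(x)
    M = [[B[j][i] for j in range(n)] for i in range(n)]
    return solve(M, x)

B = [[1, 0, 1], [1, 1, 0], [0, 1, 1]]   # an assumed basis of R^3
x = [3, 4, 5]
s = change_coeff(B, x)
assert s == [2, 1, 3]
# Check the defining equation x = sum over i of s_i * b_i.
assert [sum(s[i] * B[i][k] for i in range(3)) for k in range(3)] == x
```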
3. Let B = [b_1, b_2, . . . , b_n] be an ordered basis for R^n. Let T : R^n → R^n
   be a linear transformation defined by T(x) = C · x, where each matrix
   C is defined below. Complete (a)–(d) for each transformation T and
   basis B.

   C_1 = \begin{pmatrix} 9 & 7 & 7 \\ 3 & 1 & 3 \\ 5 & 5 & 3 \end{pmatrix}

   B_1 is formed from the columns of the matrix A_1 defined in Activity 1.

   C_2 = \begin{pmatrix} 1 & 3 & 2 & 3 \\ 2 & 1 & 5 & 1 \\ 1 & 4 & 0 & 8 \\ 2 & 5 & 3 & 0 \end{pmatrix}

   B_2 is formed from the columns of the matrix A_2 defined in Activity 1.

   C_3 = \begin{pmatrix} 9 & 1 & 1 \\ 2 & 1 & 3 \\ 3 & 5 & 4 \end{pmatrix}

   B_3 is formed from the columns of the matrix A_3 defined in Activity 1.
   (a) Verify that the matrix representation of T with respect to the
       coordinate basis is equal to the matrix C.

   (b) Let M be the matrix whose j-th column is the sequence of compo-
       nents of the vector b_j ∈ B. Show that M is invertible, and find
       its inverse M^{-1}. Compute M^{-1} C M.

   (c) Select two nonzero vectors x ∈ R^n. Apply the func ChangeCoeff
       to find the coordinate vector [x]_B of x with respect to the basis
       B. Compute (M^{-1} C M) · [x]_B.

   (d) Compute T(x). Apply the func ChangeCoeff to find [T(x)]_B.
       Compare (M^{-1} C M) · [x]_B and [T(x)]_B.
4. Let T : R^n → R^n be a linear transformation, and let C be the matrix
   representation of T with respect to the coordinate basis. Based upon
   your experience with Activity 3, construct a func ChangeFromCoordB
   that will accept the matrix representation C and an ordered basis
   B = [b_1, b_2, . . . , b_n] and return the matrix representation with respect
   to the basis B. Check your func for each transformation defined in
   Activity 3.

5. Let T : R^n → R^n be a linear transformation, and let B be the matrix
   representation of T with respect to an ordered basis B. Construct a
   func ChangeToCoordB that will accept the matrix representation B of
   T with respect to B and return the matrix representation of T with re-
   spect to the coordinate basis. Check your func for each transformation
   defined in Activity 3.
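Activities 3 through 5 revolve around the computation M^{-1} C M. The Python sketch below is our own illustration (the activities themselves are to be done in ISETL, and the 2 × 2 example matrices are our assumptions); it carries out the change of representation and confirms that the new matrix acts on B-coordinates exactly as T acts on vectors.

```python
from fractions import Fraction

def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def mat_inv(M):
    # Gauss-Jordan elimination on the augmented matrix (M | I).
    n = len(M)
    A = [[Fraction(M[i][j]) for j in range(n)] +
         [Fraction(1) if i == j else Fraction(0) for j in range(n)]
         for i in range(n)]
    for c in range(n):
        piv = next(r for r in range(c, n) if A[r][c] != 0)
        A[c], A[piv] = A[piv], A[c]
        A[c] = [entry / A[c][c] for entry in A[c]]
        for r in range(n):
            if r != c and A[r][c] != 0:
                factor = A[r][c]
                A[r] = [a - factor * b for a, b in zip(A[r], A[c])]
    return [row[n:] for row in A]

def mat_vec(M, x):
    return [sum(m * xj for m, xj in zip(row, x)) for row in M]

# C: matrix of T in the coordinate basis; the columns of M are the new basis B.
C = [[2, 1], [1, 2]]
M = [[1, 1], [1, -1]]
Minv = mat_inv(M)
D = mat_mul(mat_mul(Minv, C), M)     # representation of T with respect to B
assert D == [[3, 0], [0, 1]]         # diagonal for this choice of basis

x = [3, 5]
xB = mat_vec(Minv, x)                # coordinates of x in B
TxB = mat_vec(Minv, mat_vec(C, x))   # coordinates of T(x) in B
assert mat_vec(D, xB) == TxB         # D does in coordinates what T does to vectors
```

The basis columns here were chosen so that D comes out diagonal, which previews Activity 6: a wise choice of basis can make the representation much simpler.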
6. In the following examples, try to find a basis that transforms the given
   matrix B to C having the indicated form.

   (a) The matrix B is given by

       B_1 = \begin{pmatrix} 2 & 2 \\ 5 & 1 \end{pmatrix}.

       The matrix C is to have a diagonal form, that is, all entries other
       than those on the main diagonal are zero.

   (b) The matrix B is given by

       B_2 = \begin{pmatrix} 19 & 18 & 5 \\ 32 & 33 & 10 \\ 62 & 69 & 22 \end{pmatrix}.

       The matrix C is to have a lower triangular form. This means that
       all entries above the main diagonal are zero.

   (c) The matrix B is the matrix C_3 from Activity 3. The matrix C is
       to have a diagonal form.
Discussion

In Chapter 6, you learned how to find coordinate vectors and matrix
representations. In both cases, you discovered that neither is unique. For
a vector u ∈ U, the components of its corresponding coordinate vector in
K^n depend upon the basis selected for U. Similarly, the form of a matrix
representation of a linear transformation T : U → V depends upon the
bases selected for U and V . In this section, we investigate the relationship
between representations for different bases. In particular, if u ∈ U, and
if B and C are two bases for U, what is the relationship between the two
coordinate vectors [u]_B and [u]_C? If T : U → U is a linear transformation
from U to itself, how is the matrix representation with respect to B related
to that for C? In the first subsection, we will discuss change of basis in
relation to coordinate vectors. In the second subsection, we will see how to
transform a matrix representation from one basis into another. Throughout
this chapter, you will be introduced to examples that show why the ability
to change bases is important.
Coordinate Vectors

If V = K^n, and if the given basis is the coordinate basis C = [e_1, ..., e_n],
the coordinate vector [v]_C of any v ∈ K^n is simply the vector v itself. Do
you remember what the form of each e_i vector is? Can you explain why the
coordinate vector in this case is the same as v?

There are many instances in which we need to work with a basis other
than the coordinate basis. In such a case, the coordinate vector of v ∈ K^n,
as you discovered in Activity 1, is not equal to v. Our interest in this section
340 CHAPTER 7. GETTING TO SECOND BASES
is to study the relationship between coordinate vectors and different bases,
and to understand the procedure for changing from one basis to another.
The func ChangeCoeff that you wrote in Activity 2 involves changing from
the coordinate basis to a second basis B. Is this func consistent with the
following theorem?
Theorem 7.1.1. Given a basis B = [b_1, b_2, ..., b_n] for a vector space K^n,
let M be the matrix whose jth column entries are the components of the
vector b_j. Then the matrix M is invertible, and, given any vector
x = ⟨x_1, x_2, ..., x_n⟩ ∈ K^n, the vector

    y = M^{-1} x

is the coordinate vector of x with respect to B, that is,

    x = \sum_{i=1}^{n} y_i b_i.
Proof. The proof of this theorem is a tour de force of notation together
with calculations involving sequences, summations, and multi-indices. One
strategy for understanding the proof is to take a very specific example and
follow through the formulas with that example. Other than heavy notation,
the steps of the argument are not particularly difficult.

According to the Corollary of Theorem 6.3.4, the matrix M defined in
the statement of the theorem is invertible since its row rank is n.

Now we introduce some notation. All indices are assumed to run from 1
to n. In the double index for an element of a matrix, the first index counts
the rows, the second indicates the columns. Let M = (t_{ij}), M^{-1} = (s_{ij}), and
let C = [e_1, e_2, ..., e_n] be the coordinate basis for R^n.
Since the entries of the ith column of M are the components of the basis
vector b_i, M applied to each coordinate basis vector e_i yields

    b_i = M e_i = \sum_k t_{ki} e_k.

Since M M^{-1} is the identity matrix, we know that the kjth coordinate of
the product is 0 for all values of the indices, except when k = j, in which case
the entry is 1. A convenient and standard way of expressing this is through
the symbol \delta_{kj}, called the Kronecker delta. This is nothing more than a
shorthand for the long statement: 0, if k is different from j; 1 otherwise.
Thus, we have

    \sum_i t_{ki} s_{ij} = \delta_{kj}.
As defined in the activities, let y = ⟨y_1, y_2, ..., y_n⟩ be the vector given by
the product of M^{-1} and x = ⟨x_1, x_2, ..., x_n⟩. Using summation notation, we
have the following expression for each component of y:

    y_i = \sum_j s_{ij} x_j.

In terms of C = [e_1, ..., e_n], y is of the form

    y = \sum_i \left( \sum_j s_{ij} x_j \right) e_i.
Now that we have established these relationships, we are ready to show
that

    x = \sum_i y_i b_i.

Before looking at the explanations that follow, justify, for yourself, each step
of the calculation:

    \sum_i y_i b_i = \sum_i \left( \sum_j s_{ij} x_j \right) M e_i
                   = \sum_i \sum_j s_{ij} x_j \sum_k t_{ki} e_k
                   = \sum_k \left( \sum_j \left( \sum_i t_{ki} s_{ij} \right) x_j \right) e_k
                   = \sum_k \left( \sum_j \delta_{kj} x_j \right) e_k
                   = \sum_k x_k e_k
                   = x.
In the first line, we substituted \sum_j s_{ij} x_j for y_i, and replaced b_i by its
equivalent formulation M e_i.

In the second line, we expressed M e_i in terms of the entries in the ith
column of the matrix M.

In the third line, we reordered, rearranged, and collected terms in this
triple summation.

In the fourth line, we replaced the expression for the product M M^{-1}
by its value in terms of the Kronecker delta.

In the fifth line, we evaluated each \delta_{kj}, dropping all 0 terms.

In the last line, we noted that the expression was the expansion of x in
terms of the coordinate basis.
Let's summarize what we have discovered thus far.

1. If V = K^n, the coordinate vector of v ∈ K^n with respect to the
   coordinate basis is equal to v.

2. If B = [b_1, b_2, ..., b_n] is any basis, we can find the coordinate vector
   of v by computing the product

       M^{-1} v,

   where M is the matrix whose jth column is the sequence of the
   components of the vector b_j. From this point forward, we will refer to a
   matrix such as M as a transition matrix.

3. Given [v]_B, the coordinate vector of v with respect to B, we can find
   v by computing the product

       M [v]_B.
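These computations can be illustrated with a short program. The activities use ISETL; the sketch below is in Python purely for illustration, and the basis [⟨1, 2⟩, ⟨3, 4⟩] and vector ⟨5, 6⟩ are made-up examples, not taken from the text.

```python
from fractions import Fraction

def transition_matrix(basis):
    # Columns of M are the basis vectors: M[i][j] is the i-th component of b_j.
    n = len(basis)
    return [[Fraction(basis[j][i]) for j in range(n)] for i in range(n)]

def inverse_2x2(M):
    # Hand-rolled inverse for the 2x2 case used below.
    (a, b), (c, d) = M
    det = a * d - b * c
    assert det != 0, "basis vectors must be linearly independent"
    return [[d / det, -b / det], [-c / det, a / det]]

def mat_vec(M, v):
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

basis = [(1, 2), (3, 4)]       # hypothetical basis B for R^2
v = [Fraction(5), Fraction(6)]
M = transition_matrix(basis)
coord = mat_vec(inverse_2x2(M), v)   # [v]_B = M^{-1} v
recovered = mat_vec(M, coord)        # v = M [v]_B
```

Here [v]_B works out to ⟨−1, 2⟩, and multiplying back by M recovers ⟨5, 6⟩, exactly the round trip described in points 2 and 3 above.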
If B_1 and B_2 are two bases for V = K^n, and if v ∈ V, what is the
relationship between [v]_{B_1} and [v]_{B_2}? How do we get from [v]_{B_1} to [v]_{B_2}?
from [v]_{B_2} to [v]_{B_1}? The diagram given below illustrates these relationships.
What are the entries of the transition matrices M_{B_1} and M_{B_2}?
    [v]_{B_2}  --M_{B_2}-->  v  --M_{B_1}^{-1}-->  [v]_{B_1}
    [v]_{B_1}  --M_{B_1}-->  v  --M_{B_2}^{-1}-->  [v]_{B_2}
    [v]_{B_2}  --M_{B_1}^{-1} M_{B_2}-->  [v]_{B_1}
    [v]_{B_1}  --M_{B_2}^{-1} M_{B_1}-->  [v]_{B_2}
Starting on the left, we see that multiplying the coordinate vector [v]_{B_2} by the
matrix M_{B_1}^{-1} M_{B_2} yields [v]_{B_1}. Note that we are multiplying by the product of
two matrices: the first factor applied, M_{B_2}, goes from B_2 to the coordinate basis,
and the second, M_{B_1}^{-1}, goes back from the coordinate basis to B_1. In the middle,
if we multiply the vector v by the matrix M_{B_2}^{-1}, then we get the coordinate
vector [v]_{B_2}. If, on the other hand, we multiply v by the matrix M_{B_1}^{-1}, we
get the coordinate vector [v]_{B_1}. On the right, if we start with the coordinate
vector [v]_{B_1} and multiply by the matrix M_{B_2}^{-1} M_{B_1}, we get the coordinate
vector [v]_{B_2}. If we start at any node, [v]_{B_2}, v, or [v]_{B_1}, we can get to any
other node by following the appropriate arrow, or sequence of arrows. Since
the coordinate vector of v with respect to the coordinate basis C is equal to v
itself, we can actually say that the matrix M_{B_1}^{-1} is the matrix that transforms
[v]_C into [v]_{B_1}. What matrix would we use to transform [v]_{B_1} into [v]_C?
[v]_{B_2} into [v]_C? [v]_C into [v]_{B_2}?
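The path through the coordinate basis can be traced in code. The sketch below is in Python rather than the ISETL used in the activities, and the two bases are made-up examples chosen here, not taken from the text.

```python
from fractions import Fraction

def inv2(M):
    # Inverse of a 2x2 matrix over the rationals.
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def matvec(A, v):
    return [sum(A[i][k] * v[k] for k in range(2)) for i in range(2)]

def cols(basis):
    # Transition matrix: basis vectors as columns.
    return [[Fraction(basis[j][i]) for j in range(2)] for i in range(2)]

M1 = cols([(1, 1), (1, -1)])   # hypothetical basis B_1
M2 = cols([(2, 0), (0, 3)])    # hypothetical basis B_2

v_B1 = [Fraction(2), Fraction(1)]          # coordinates of v with respect to B_1
v = matvec(M1, v_B1)                       # back to the coordinate basis
v_B2 = matvec(matmul(inv2(M2), M1), v_B1)  # [v]_{B_2} = M_{B_2}^{-1} M_{B_1} [v]_{B_1}
```

With these choices v = ⟨3, 1⟩, and the composite matrix M_{B_2}^{-1} M_{B_1} carries ⟨2, 1⟩ to ⟨3/2, 1/3⟩, matching a direct computation of [v]_{B_2}.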
Alias and alibi. There is a point of interpretation that may at first seem
confusing. However, it is interesting and important, because it appears in
many different situations. In particular, we can think of a basis as a frame
of reference for locating vectors. If the vector space is R^n, and if the basis
is the coordinate basis C = [e_1, e_2, ..., e_n], then the coefficients of a vector
v ∈ R^n with respect to this basis are precisely its coordinates in a coordinate
system in which the basis vectors are the axes.

The same is true of any basis. Consider the basis B = [⟨2, 1⟩, ⟨-1, 3⟩] in R^2.
The vector ⟨1, 2⟩ ∈ R^2 has the coordinates 1 and 2 with respect to the
coordinate basis. What are its coordinates with respect to the basis B?
With respect to B, [⟨1, 2⟩]_B = ⟨5/7, 3/7⟩. Can you show how to get this using
Theorem 7.1.1?
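The computation behind the stated answer can be sketched numerically. This is Python rather than the text's ISETL; the second basis vector is taken as ⟨−1, 3⟩, on the assumption that a minus sign was lost in reproduction, which is the choice consistent with the stated coordinates 5/7 and 3/7.

```python
from fractions import Fraction

# Basis B for R^2; b2 = <-1, 3> is assumed (the stated answer <5/7, 3/7>
# only comes out with this sign).
b1, b2 = (2, 1), (-1, 3)
v = (1, 2)

# M has the basis vectors as columns; invert by hand in the 2x2 case.
a, b = Fraction(b1[0]), Fraction(b2[0])
c, d = Fraction(b1[1]), Fraction(b2[1])
det = a * d - b * c
Minv = [[d / det, -b / det], [-c / det, a / det]]

# Theorem 7.1.1: [v]_B = M^{-1} v
coords = [Minv[0][0] * v[0] + Minv[0][1] * v[1],
          Minv[1][0] * v[0] + Minv[1][1] * v[1]]
```

The result is ⟨5/7, 3/7⟩, as claimed.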
Need a picture of a vector in R^3 in which the coordinates of the
vector are clearly shown to be arrows along the coordinate
axes.

Figure 7.1: Basis

A picture in R^2 showing the basis vectors b and appropriate
lines to e_1, e_2 indicating the coordinates as lengths and the same
thing relative to b_1, b_2.

Figure 7.2: Basis representation

But we could also show this in another way, as given in the accompanying
figure.

Here we have a coordinate system with only b_1, b_2 as axes,
shown in the same position as the coordinate axes are shown
and with the new coordinates of v indicated.

Figure 7.3: An alternate representation

Now come the interpretations. In the first figure, we may consider that
the vector ⟨1, 2⟩ is unchanged, but it has two names: the coordinates ⟨1, 2⟩
and the coordinates ⟨5/7, 3/7⟩. This is called the alias interpretation.

On the other hand, the second picture suggests that the vector ⟨1, 2⟩ has
been changed. Originally, it was ⟨1, 2⟩, but now it is changed to ⟨5/7, 3/7⟩. This
is called the alibi interpretation.
Matrix Representations

Every vector in an n-dimensional vector space has a coordinate vector
representation with respect to a given basis. The same is true of linear
transformations. For instance, if L : U → V is a linear transformation and if
B and C are bases for U and V respectively, then there is an m × n matrix
A, where dim(U) = n and dim(V) = m, that represents L in the sense that
if [u]_B = ⟨s_1, s_2, ..., s_n⟩ is the coordinate vector of u with respect to B and
[L(u)]_C is the coordinate vector of L(u) with respect to C, then

    [L(u)]_C = A [u]_B.

The jth column of A is the sequence of coefficients of the vector L(b_j) in
terms of the basis C. We can illustrate this in the figure below.

       u     ----L---->    L(u)
       |                     |
       v                     v
    [u]_B    ----A---->  [L(u)]_C

This diagram tells us that if we take a vector u ∈ U, find its coordinate vector
[u]_B, and multiply by the matrix representation A, we will get the same result
as if we had first applied L to u and then found the coordinate vector [L(u)]_C.
In other words, L : U → V can be represented by A : K^n → K^m.
In this subsection, we will limit the discussion to linear transformations
from a vector space U to itself. In this context, we will investigate what
happens to a matrix representation when we change the basis. In Activities 3
and 4, you considered the process by which one changes from the coordinate
basis to a basis B for a linear transformation between spaces of tuples.
Let's review the methodology suggested in these activities by considering
an example. Let L : R^2 → R^2 be the linear transformation given by the
formula

    L(⟨x_1, x_2⟩) = ⟨3x_1 + 2x_2, x_1 + 2x_2⟩.

If we work with the coordinate basis C = [e_1, e_2], the matrix representation
of L with respect to C is given by

    C = \begin{pmatrix} 3 & 2 \\ 1 & 2 \end{pmatrix}.

The matrix representation of L with respect to B = [⟨1, -1⟩, ⟨2, 1⟩] is
given by

    B = \begin{pmatrix} 1 & 0 \\ 0 & 4 \end{pmatrix}.

Later in this section and throughout the remainder of this chapter, we will
discover that diagonal forms are extremely important and useful. The func
ChangeFromCoordB that you constructed in Activity 4 changes a representation
written in terms of the coordinate basis into a representation with
respect to a basis B. If you applied ChangeFromCoordB to the matrix
C and the basis B given here, would ISETL return the matrix B? Are the
component pieces of ChangeFromCoordB consistent with the theorem given
below?
Theorem 7.1.2. Let L : R^n → R^n be a linear transformation whose matrix
with respect to the coordinate basis is denoted by C. Let B = [b_1, ..., b_n] be
an ordered basis for R^n, and let M be the matrix whose jth column is the
sequence of components of b_j. Then the matrix B of L with respect to B is
given by

    B = M^{-1} C M.

Proof. The jth column of B consists of the sequence of coefficients of L(b_j)
in terms of B. We must show that the jth column of M^{-1}CM consists of
the same sequence of coefficients. Since C is the matrix representation of L
with respect to C, the jth column of CM is the coordinate vector [L(b_j)]_C.
By Theorem 7.1.1, the product M^{-1}[L(b_j)]_C is [L(b_j)]_B.
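Theorem 7.1.2 can be checked numerically against a small example. The sketch below is Python rather than the text's ISETL, using C = [[3, 2], [1, 2]] (the matrix of L(⟨x₁, x₂⟩) = ⟨3x₁ + 2x₂, x₁ + 2x₂⟩) and the basis vectors b₁ = ⟨1, −1⟩ and b₂ = ⟨2, 1⟩; the minus sign in b₁ is assumed, since it is the choice that produces a diagonal form.

```python
from fractions import Fraction

def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def inv2(M):
    # Inverse of a 2x2 matrix over the rationals.
    (a, b), (c, d) = [[Fraction(e) for e in row] for row in M]
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

C = [[3, 2], [1, 2]]   # coordinate-basis representation of L
M = [[1, 2], [-1, 1]]  # columns: b_1 = <1, -1>, b_2 = <2, 1>

B = matmul(inv2(M), matmul(C, M))   # Theorem 7.1.2: B = M^{-1} C M
```

The product comes out as the diagonal matrix diag(1, 4).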
The func ChangeToCoordB that you constructed in Activity 5 reverses the
process given by Theorem 7.1.2. If we start with the matrix representation
in terms of a non-coordinate basis, how do we get to the matrix
representation for the coordinate basis? Specifically, how do we recover the
coordinate representation C from B? Before proceeding further, let's summarize
the relationship between the coordinate basis and a second basis B.
If L : R^n → R^n is a linear transformation, then the jth column of the
matrix representation of L with respect to the coordinate basis C is the
coordinate vector [L(e_j)]_C.

If we want to find the matrix of L with respect to B, we first construct
a transition matrix M. The jth column of this matrix consists of the
components of the vector b_j.

If we let C denote the coordinate basis representation of L, then the
matrix representation with respect to B is found by computing the
product M^{-1}CM. The jth column of the product is the coordinate
vector [L(b_j)]_B.

If we are given the matrix representation B with respect to B and wish
to find the coordinate basis representation, we compute the product
M B M^{-1}. The jth column of M B M^{-1} is the coordinate vector [L(e_j)]_C.
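The last point, the reverse direction, can also be checked with a short computation. The sketch below is Python rather than ISETL, reusing the example with C = [[3, 2], [1, 2]] and basis vectors b₁ = ⟨1, −1⟩, b₂ = ⟨2, 1⟩ (the minus sign is assumed, as it is the choice consistent with the diagonal form diag(1, 4)).

```python
from fractions import Fraction

def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def inv2(M):
    # Inverse of a 2x2 matrix over the rationals.
    (a, b), (c, d) = [[Fraction(e) for e in row] for row in M]
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

M = [[1, 2], [-1, 1]]   # columns: b_1 = <1, -1>, b_2 = <2, 1>
B = [[1, 0], [0, 4]]    # representation of L with respect to B

C = matmul(M, matmul(B, inv2(M)))   # M B M^{-1}: back to the coordinate basis
```

The product M B M^{-1} recovers the coordinate representation [[3, 2], [1, 2]].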
If B_1 and B_2 are two bases, neither of which is the coordinate basis, how do
we use Theorem 7.1.2 to find the transition from B_1 to B_2, and vice-versa?
The diagram below illustrates the relationships involved in making these
transitions. In the figure, v is a vector in K^n, B_1 is the matrix representation
of L in terms of B_1, and B_2 denotes the matrix representation of L with respect
to B_2.
    [v]_{B_1}  ----B_1---->  [L(v)]_{B_1}
        ↑                        ↑
        | M_{B_1}^{-1}           | M_{B_1}^{-1}
        v    -----L----->      L(v)
        | M_{B_2}^{-1}           | M_{B_2}^{-1}
        ↓                        ↓
    [v]_{B_2}  ----B_2---->  [L(v)]_{B_2}
What are the entries of the transition matrices M_{B_1} and M_{B_2}? Using the
diagram, we can see that

    [L(v)]_{B_2} = B_2 M_{B_2}^{-1} M_{B_1} [v]_{B_1}
    [L(v)]_{B_2} = M_{B_2}^{-1} M_{B_1} B_1 [v]_{B_1}.

Therefore,

    B_2 = M_{B_2}^{-1} M_{B_1} B_1 M_{B_1}^{-1} M_{B_2}.
Following a similar argument, we can write B_1 in terms of B_2. How do
we interpret the equation given here? To get from B_1 to B_2, one would
start with B_1 and convert to the coordinate basis. This is represented by
M_{B_1} B_1 M_{B_1}^{-1}. This is followed by a transition from the coordinate basis to B_2,
which is represented by multiplying by the inverse of M_{B_2} on the left and M_{B_2}
on the right. How would we construct a func using ChangeToCoordB and
ChangeFromCoordB to make the transition from B_1 to B_2, and vice-versa?
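The composite formula B₂ = M_{B₂}^{-1} M_{B₁} B₁ M_{B₁}^{-1} M_{B₂} can be checked by comparing it against the direct computation M_{B₂}^{-1} C M_{B₂}. The sketch below is Python rather than ISETL, and the coordinate representation C and the two bases are made-up examples chosen here.

```python
from fractions import Fraction

def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def inv2(M):
    # Inverse of a 2x2 matrix over the rationals.
    (a, b), (c, d) = [[Fraction(e) for e in row] for row in M]
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

C  = [[3, 2], [1, 2]]    # a coordinate-basis representation (hypothetical example)
M1 = [[1, 2], [-1, 1]]   # transition matrix of a basis B_1
M2 = [[1, 1], [0, 1]]    # transition matrix of a basis B_2

B1 = matmul(inv2(M1), matmul(C, M1))            # representation in terms of B_1
# B_2 = M_2^{-1} M_1 B_1 M_1^{-1} M_2, passing through the coordinate basis:
B2_via_B1 = matmul(inv2(M2), matmul(M1, matmul(B1, matmul(inv2(M1), M2))))
B2_direct = matmul(inv2(M2), matmul(C, M2))     # straight from C
```

Both routes give the same matrix, as the formula predicts; the two halves of the computation are exactly the roles of ChangeToCoordB and ChangeFromCoordB.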
Matrices with Special Forms
As we have seen, the process of making a transition from one basis to another
is quite involved. How does this process help us in working with linear
transformations? In this subsection and throughout the remainder of this
chapter, we will consider examples that will help to show the importance of
the ability to change bases.
Triangular matrices. Recall that if you have a system of m linear equations
in n unknowns, then you can interpret it as an equation L(x) = c,
where L : R^n → R^m is a linear transformation, c is a vector in R^m, and
x ∈ R^n. If A = (a_{ij}) is the matrix of L with respect to the coordinate bases
and x = ⟨x_1, ..., x_n⟩, c = ⟨c_1, ..., c_m⟩ are the representations of x, c in terms
of their coefficients with respect to these bases, then the system of equations
can be represented as the matrix equation Ax = c.

Now, suppose that the matrix A is in lower triangular form, that is,
a_{ij} = 0 for i < j. Then you can write the solution very quickly. The first
equation involves only x_1, so you can solve it (provided a_{11} ≠ 0). The second
equation involves only x_1, x_2. Since you already know the value of x_1, you
can solve for x_2. Following a similar approach, we can find the solution for
each x_i.

For example, the answer to Activity 6(b) is the following lower triangular
matrix

    \begin{pmatrix} 3 & 0 & 0 \\ 1 & 2 & 0 \\ 7 & -4 & 3 \end{pmatrix}.
You might not have found this matrix, but now that you know it, can you
find the basis that gives it? The system of equations that gives this matrix
is

    3x_1                = c_1
     x_1 + 2x_2         = c_2
    7x_1 - 4x_2 + 3x_3  = c_3,

where c_1, c_2, c_3 are given numbers.

You can write the solution almost immediately as

    x_1 = c_1 / 3
    x_2 = (1/2)(c_2 - x_1) = (3c_2 - c_1) / 6
    x_3 = (1/3)(c_3 - 7x_1 + 4x_2) = (c_3 - 3c_1 + 2c_2) / 3.
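The step-by-step substitution just described is exactly forward substitution, and it can be sketched in a few lines of code. The example below is Python rather than the ISETL used in the activities, and the right-hand side c = ⟨6, 2, 1⟩ is chosen here for illustration.

```python
from fractions import Fraction

def forward_substitution(A, c):
    # Solve A x = c for a lower triangular matrix A with nonzero diagonal:
    # each equation involves only x_1, ..., x_i, so solve them in order.
    n = len(A)
    x = []
    for i in range(n):
        s = sum(A[i][j] * x[j] for j in range(i))
        x.append((Fraction(c[i]) - s) / A[i][i])
    return x

A = [[3, 0, 0],
     [1, 2, 0],
     [7, -4, 3]]   # the lower triangular matrix above
x = forward_substitution(A, [6, 2, 1])
```

For c = ⟨6, 2, 1⟩ this gives x = ⟨2, 0, −13/3⟩, which agrees with the closed-form expressions x₁ = c₁/3, x₂ = (3c₂ − c₁)/6, x₃ = (c₃ − 3c₁ + 2c₂)/3.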
Of course, this is not the solution of the system of equations whose matrix
of coefficients is the original matrix B_2 given in Activity 6(b). Actually, the
x_1, x_2, x_3 given here are the coefficients of that solution with respect to the
basis that transformed B_2 into the above triangular matrix. In fact, that basis
is B = [⟨1, 1, 2⟩, ⟨1, 2, 3⟩, ⟨1, 2, 4⟩]. Given this information, and assuming that
the right hand side of this system was given by c_1 = 6, c_2 = 2, c_3 = 1, can you
find the solution of the original system? (Don't forget: the values of c_1, c_2, c_3
are coefficients of a vector with respect to the basis B.)
You will note that we are not saying much about how to find a basis that
transforms a matrix into triangular form. One reason for this is that it is not
very practical. If we want to solve a system of equations, there are better
methods, such as Gaussian elimination or inversion of the matrix of coefficients
(for which there are efficient computer methods). A more interesting
comment is a theoretical one. Suppose you have a linear transformation
L : V → V and a basis B for V such that the matrix of L with respect
to B is upper triangular. Then L(b_1) is an element of the subspace of V
generated by b_1. Also, L(b_2) is an element of the subspace of V generated
by b_1, b_2. This means that every element of the subspace of V generated
by b_1, b_2 is mapped by L into that same subspace generated by b_1, b_2.
In other words, the subspace of V generated by b_1, b_2 is invariant under
the linear transformation.

We can continue this process and say, for each k (up to the dimension
of V), that the subspace generated by b_1, b_2, ..., b_k is invariant under L.
Such a decomposition of V into an increasing sequence of subspaces, each
invariant under L, has theoretical importance in the general theory of vector
spaces.
Diagonal matrices. A linear transformation T : R^n → R^n whose matrix
representation with respect to the coordinate basis is diagonal has a
particularly simple structure. It takes each vector e_i and multiplies it by a
fixed scalar. For instance, if x = ⟨2, 1, 3⟩, and if T : R^3 → R^3 is a linear
transformation whose matrix representation with respect to the coordinate
basis is

    \begin{pmatrix} 3 & 0 & 0 \\ 0 & 4 & 0 \\ 0 & 0 & 5 \end{pmatrix},

then T(x) = ⟨3 · 2, 4 · 1, 5 · 3⟩ = ⟨6, 4, 15⟩. In this case, the x component is scaled by a
factor of 3, the y component is scaled by a factor of 4, and the z component
is scaled by a factor of 5. This means that the unit cube maps to the box B,
as illustrated in the figure below.

In R^3, provide a picture of the unit cube and the resulting box
found by applying T to the unit cube.

Figure 7.4: Losing a dimension
If the matrix representation of T with respect to the coordinate basis
is not diagonal, then such a simple description of T is generally not possible.
Under certain circumstances, however, we can find a basis for which
the matrix representation is diagonal and the notion of scaling makes sense.
Consider the following example T : R^2 → R^2 defined by

    T(⟨x_1, x_2⟩) = \begin{pmatrix} 5 & 4 \\ 4 & -1 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}.

As you can see, the matrix representation of T with respect to the coordinate
basis is not diagonal. However, in this case, we can find a basis for which the
matrix representation of T is diagonal, namely

    B = [⟨2/√5, 1/√5⟩, ⟨-1/√5, 2/√5⟩].

The matrix of T with respect to B is given by

    \begin{pmatrix} 7 & 0 \\ 0 & -3 \end{pmatrix}.

The vectors ⟨2/√5, 1/√5⟩ and ⟨-1/√5, 2/√5⟩ define a new coordinate system.

In R^2, picture of coordinate axes with vectors from B superimposed.
Picture of unit square ABCO with respect to new
coordinate system and picture of image rectangle A'B'C'O. O
here is the origin.

Figure 7.5: Effect of a diagonal basis

T is a scaling with respect to B, with a factor of 7 in the x' direction and
-3 in the y' direction. As you can see, the square ABCO is mapped by T
to the rectangle A'B'C'O.
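The diagonalization just described can be verified numerically. The sketch below is in Python rather than the text's ISETL; several minus signs in the printed matrix and basis appear to have been lost in reproduction, so A = [[5, 4], [4, −1]] and b₂ = ⟨−1/√5, 2/√5⟩ are assumptions, chosen because they are the values for which the quoted diagonal form diag(7, −3) actually comes out.

```python
import math

# Reconstructed data (see the lead-in: signs are assumptions).
A = [[5.0, 4.0], [4.0, -1.0]]
s = math.sqrt(5.0)
b1 = (2.0 / s, 1.0 / s)
b2 = (-1.0 / s, 2.0 / s)

# M has b1, b2 as columns; since the columns are orthonormal, M^{-1} = M^T.
M = [[b1[0], b2[0]], [b1[1], b2[1]]]
Mt = [[M[0][0], M[1][0]], [M[0][1], M[1][1]]]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

D = matmul(Mt, matmul(A, M))   # should be (numerically) diag(7, -3)
```

Up to floating-point rounding, D is the diagonal matrix diag(7, −3).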
The ability to diagonalize can also be used to simplify the equation of a
conic section in which there is a middle term, say

    3x^2 + 2xy + 3y^2 = 8.

The presence of the xy term makes this difficult to graph. In order to use
diagonalization, we need to make use of an auxiliary concept called the inner
product. This is a deep and important concept in mathematics, but, for
now, you can consider the following formula as shorthand notation. If
x = ⟨x_1, ..., x_n⟩, y = ⟨y_1, ..., y_n⟩ ∈ R^n, then we can define the inner product
⟨x, y⟩ by

    ⟨x, y⟩ = \sum_{i=1}^{n} x_i y_i.

We can write the algebraic expression 3x^2 + 2xy + 3y^2 as an inner product

    ⟨A⟨x_1, x_2⟩, ⟨x_1, x_2⟩⟩,

where

    A = \begin{pmatrix} 3 & 1 \\ 1 & 3 \end{pmatrix},

and ⟨ , ⟩ represents the dot product on R^2. Next, we change to the basis
B = [⟨1/√2, 1/√2⟩, ⟨-1/√2, 1/√2⟩]. The matrix B of the transformation
represented by A is

    B = \begin{pmatrix} 4 & 0 \\ 0 & 2 \end{pmatrix}.
If we let M be the matrix whose columns are the vectors b_1 and b_2, set
⟨y_1, y_2⟩ = M^{-1}⟨x_1, x_2⟩, and make the substitution ⟨x_1, x_2⟩ = M⟨y_1, y_2⟩ in
the original equation, then, after some simplification, we get

    4y_1^2 + 2y_2^2 = 8.

Picture of ellipse with both sets of coordinate axes shown.

Figure 7.6: Rotation of coordinate axes

As Figure 7.6 shows, the original equation 3x^2 + 2xy + 3y^2 = 8 can be
represented by the equation 4(x')^2 + 2(y')^2 = 8 in the x'y' coordinate system,
where the x'-axis is the line that coincides with the basis vector ⟨1/√2, 1/√2⟩,
and the y'-axis is the line that coincides with the basis vector ⟨-1/√2, 1/√2⟩.
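The two claims in this passage, that the quadratic expression equals the inner product ⟨Ax, x⟩ and that the substitution turns it into 4y₁² + 2y₂², can be spot-checked at a point. The sketch below is Python rather than ISETL, and the test point ⟨1, 2⟩ is chosen here for illustration; the second basis vector is taken as ⟨−1/√2, 1/√2⟩, on the assumption that a minus sign was lost in printing.

```python
import math

def inner(x, y):
    # <x, y> = sum_i x_i y_i
    return sum(a * b for a, b in zip(x, y))

def matvec(M, v):
    return [inner(row, v) for row in M]

A = [[3.0, 1.0], [1.0, 3.0]]
x = [1.0, 2.0]                                       # an arbitrary test point

quad = 3 * x[0]**2 + 2 * x[0] * x[1] + 3 * x[1]**2   # 3x^2 + 2xy + 3y^2
as_inner = inner(matvec(A, x), x)                    # <Ax, x>

# Change to the rotated basis: y = M^{-1} x.  Since the columns of M are
# orthonormal, M^{-1} = M^T, so y_1, y_2 are just inner products with b_1, b_2.
s = math.sqrt(2.0)
b1, b2 = [1 / s, 1 / s], [-1 / s, 1 / s]
y1, y2 = inner(b1, x), inner(b2, x)
diag_form = 4 * y1**2 + 2 * y2**2                    # 4 y_1^2 + 2 y_2^2
```

At ⟨1, 2⟩ all three expressions evaluate to the same number, so the level set 3x² + 2xy + 3y² = 8 really is the curve 4y₁² + 2y₂² = 8 in the rotated coordinates.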
Exercises

1. Let V = R^3. Find the coordinate vector of ⟨2, 1, 3⟩ in terms of the
   basis [⟨1, 1, 1⟩, ⟨1, 1, 0⟩, ⟨1, 0, 0⟩].

2. Let V = R^2. Find the coordinate vector of ⟨6, 5⟩ with respect to the
   basis [⟨1, 1⟩, ⟨2, 1⟩].

3. Let V = R^4. Find the coordinate vector of ⟨6, 3, 1, 2⟩ with respect to
   the basis [⟨1, 1, 0, 2⟩, ⟨2, 1, 1, 3⟩, ⟨3, 0, 1, 0⟩, ⟨1, 0, 0, 4⟩].
4. Complete (a)-(b) for the bases

       B_1 = [⟨1, 2⟩, ⟨3, 0⟩] and B_2 = [⟨2, 1⟩, ⟨3, 2⟩].

   (a) If [x]_{B_1} = ⟨4, 3⟩, find [x]_{B_2}.
   (b) Find the form of x in terms of the coordinate basis.

5. Complete (a)-(b) for the bases

       B_1 = [⟨3, 1⟩, ⟨1, 1⟩] and B_2 = [⟨2, 3⟩, ⟨4, 5⟩].

   (a) If [x]_{B_1} = ⟨2, 5⟩, find [x]_{B_2}.
   (b) Find the form of x in terms of the coordinate basis.
6. Let V = P_3. Find the coordinate vector of p = 3 - 2x + x^3 with respect
   to the basis [1, x^2 - 2, x^2 - x + 1, x^3 + x].

7. Let V = P_2. Find the coordinate vector of p = 3x^2 - 6x - 2 with
   respect to the basis [x - 1, 2x + 3, x^2 + x].
8. Let P_2(R) be the vector space of polynomials of degree two or less with
   real coefficients. A basis C = [c_0, c_1, c_2] is given by

       c_0 = 1
       c_1 = x
       c_2 = x^2.

   Another basis B = [b_0, b_1, b_2] is given by

       b_0 = (1/2) x(x - 1)
       b_1 = -x^2 + 1
       b_2 = (1/2) x(x + 1).

   For each of the polynomials given below, write its coordinates
   with respect to the basis C, and then change to its coordinates with
   respect to the basis B. You can solve the problem by hand, or use the
   computer tools you built in the activities.

   (a) p(x) = 3x^2 - 5x + 2
   (b) q(x) = 7x^2 - 4
   (c) r(x) = 2x - 1
9. Let P_2(R) be the vector space of polynomials of degree two or less with
   real coefficients. A basis C = [c_0, c_1, c_2] is given by

       c_0 = 1
       c_1 = x
       c_2 = x^2.

   Another basis B = [b_0, b_1, b_2] is given by

       b_0 = (1/2) x(x - 1)
       b_1 = -x^2 + 1
       b_2 = (1/2) x(x + 1).

   For each of the following polynomials, find its coordinates with respect
   to the basis B. You can solve the problem by hand, or use the computer
   tools you built in the activities.

   (a) p = 4c_0 - 3c_1 - 6c_2
   (b) q = 6c_0 - 4c_2
   (c) r = 3c_1
10. Justify each step of the calculation given in the proof of Theorem 7.1.1.
11. Formulate a theorem, similar to Theorem 7.1.1, that gives a formula
    for changing a coordinate vector with respect to a basis B_1 to its
    corresponding coordinate vector with respect to a basis B_2, where neither
    B_1 nor B_2 is the coordinate basis. Once you have written a statement
    of this theorem, provide a proof.
12. In each of the following problems, a linear transformation L : R^n → R^n
    is given by its matrix A with respect to the given basis B. Find its
    matrix with respect to the coordinate basis. You can solve the problem
    by hand, or use the computer tools you built in the activities.

    (a) A = \begin{pmatrix} 3 & 1 & 4 \\ 1 & 3 & 0 \\ 0 & 2 & 3 \end{pmatrix}

        B = [⟨2, 2, 2⟩, ⟨3, 2, 3⟩, ⟨2, 1, 1⟩]

    (b) A = \begin{pmatrix} 1/2 & 1/8 & 1/4 & 1/8 \\ 0 & 1/2 & 0 & 1/2 \\ 1/4 & 1/2 & 0 & 1/4 \\ 0 & 1/3 & 0 & 2/3 \end{pmatrix}

        B = [⟨1, 1, 1, 0⟩, ⟨1, 1, 0, 1⟩, ⟨1, 0, 1, 1⟩, ⟨0, 1, 1, 1⟩]
13. In each of the following problems, a linear transformation L : R^n → R^n
    is given by its matrix A with respect to the given basis B_1. Find
    its matrix with respect to the basis B_2. You can solve the problem by
    hand, or use the computer tools you built in the activities.

    (a) A = \begin{pmatrix} 3 & 1 & 4 \\ 1 & 3 & 0 \\ 0 & 2 & 3 \end{pmatrix}

        B_1 = [⟨2, 2, 2⟩, ⟨3, 2, 3⟩, ⟨2, 1, 1⟩]
        B_2 = [⟨2, 1, 0⟩, ⟨1, 0, 2⟩, ⟨1, 2, 1⟩]

    (b) A = \begin{pmatrix} 1/2 & 1/8 & 1/4 & 1/8 \\ 0 & 1/2 & 0 & 1/2 \\ 1/4 & 1/2 & 0 & 1/4 \\ 0 & 1/3 & 0 & 2/3 \end{pmatrix}

        B_1 = [⟨1, 1, 1, 0⟩, ⟨1, 1, 0, 1⟩, ⟨1, 0, 1, 1⟩, ⟨0, 1, 1, 1⟩]
        B_2 = [⟨1, 1, 0, 2⟩, ⟨2, 1, 2, 0⟩, ⟨1, 0, 2, 2⟩, ⟨0, 2, 1, 1⟩]
14. Define a transformation F : R^3 → R^3 by

        F(⟨x_1, x_2, x_3⟩) = ⟨x_1 - x_2 - x_3, 2x_1 + 3x_3, x_2 + x_3⟩.

    Find the matrix representation of F with respect to the basis
    d = [⟨1, 2, 1⟩, ⟨2, 1, 1⟩, ⟨3, 1, 2⟩].
15. Suppose that T : R^3 → R^3 is a linear transformation whose matrix
    representation with respect to some basis B_1 is given by

        \begin{pmatrix} 3 & 1 & 2 \\ 4 & 5 & 6 \\ 1 & 3 & 7 \end{pmatrix}.

    Suppose that the transition matrix from B_1 to another basis B_2 is given
    by

        \begin{pmatrix} 2 & 1 & 1 \\ 1 & 0 & 1 \\ 0 & 2 & 3 \end{pmatrix}.

    Find the expression for the rule of correspondence of T in terms of the
    coordinate basis.
16. Let P_3(R) be the vector space of polynomials of degree three or less
    with real coefficients. A basis C = [c_0, c_1, c_2, c_3] is given by

        c_0 = 1
        c_1 = x
        c_2 = x^2
        c_3 = x^3.

    This is called the basis of monomials.

    Another basis B = [b_0, b_1, b_2, b_3] is given by

        b_0 = 1
        b_1 = x
        b_2 = x(x - 1)
        b_3 = x(x - 1)(x - 2).
    This will be called the basis of linear products.

    Let L : P_3(R) → P_3(R) be the linear transformation, called the
    difference operator, given by

        L(p)(x) = p(x + 1) - p(x).

    Write the matrix of L with respect to the basis C, and then find its
    matrix with respect to the basis B. You can solve the problem by
    hand, or use the computer tools you built in the activities.
17. Define L : P_3(R) → P_3(R) to be the linear transformation defined by
    the derivative, that is, L(p) = p'. Write the matrix of L with respect
    to the monomial basis (see previous exercise), and then find its matrix
    with respect to the basis of linear products (see previous exercise).
    You can solve the problem by hand, or use the computer tools you
    built in the activities. In this exercise, we think of each polynomial as
    an expression for a function from R to R.
18. Why do you think that we have been using sequences of basis vectors
    rather than sets of basis vectors? If the order of the vectors in a basis is
    changed, would the coordinate vector change? To help you in answering
    this question, construct a basis in R^3. Select a vector x. Find the
    coordinate vector of x with respect to the basis you have constructed.
    Change the order of the basis elements, and find the coordinate vector
    with respect to this new ordering. What do you observe?
19. Let F : R^n → R^n be a linear transformation. Formulate a theorem,
    similar to Theorem 7.1.2, that gives a formula for changing the matrix
    representation of F with respect to a basis B_1 to its corresponding
    matrix representation with respect to a basis B_2, where neither B_1 nor
    B_2 is the coordinate basis. Once you have written a statement of this
    theorem, provide a proof.
7.2 Eigenvalues and Eigenvectors
Activities
1. In each of the following problems, (a)-(e), find as many solutions to
   the given equation as you can. Let x ∈ R^2.

   (a) For the matrix A_1 = \begin{pmatrix} 3 & 1 \\ 2 & 2 \end{pmatrix}, find all solutions to A_1 x = x.

   (b) For the matrix A_2 = \begin{pmatrix} 5 & 2 \\ 7 & 3 \end{pmatrix}, find all solutions to A_2 x = x.

   (c) For the matrix A_3 = \begin{pmatrix} 2 & 1 \\ 2 & 2 \end{pmatrix}, find all solutions to A_3 x = x.

   (d) Let λ, the Greek letter lambda, represent a scalar. For the matrix
       A_4 = \begin{pmatrix} 1 & 3 \\ 2 & 2 \end{pmatrix}, find all λ such that the equation A_4 x = λx has at
       least one solution. For each such λ, find all solutions x that satisfy
       the given equation.

   (e) Let λ represent a scalar. For the matrix
       A_5 = \begin{pmatrix} 5 & 2 \\ 2 & 1 \end{pmatrix}, find all λ such that the equation A_5 x = λx has
       at least one solution. For each such λ, find all solutions x that
       satisfy the given equation.
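A candidate solution to an equation like the one in Activity 1(a) is easy to check by direct substitution. The sketch below is Python rather than the ISETL used in the activities, with the entries of A_1 taken as printed; ⟨1, −2⟩ is one vector for which A_1 x = x holds with those entries, since A_1 − I is then singular.

```python
# For Activity 1(a): with A_1 as printed, A_1 - I = [[2, 1], [2, 1]] is singular,
# so A_1 x = x has nonzero solutions; every multiple of <1, -2> works.
A1 = [[3, 1], [2, 2]]

def matvec(A, v):
    return [A[0][0] * v[0] + A[0][1] * v[1],
            A[1][0] * v[0] + A[1][1] * v[1]]

x = [1, -2]
fixed = matvec(A1, x)   # equals x, so x solves A_1 x = x
```

Substituting confirms A_1⟨1, −2⟩ = ⟨1, −2⟩, so ⟨1, −2⟩ (and each of its scalar multiples) is a solution.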
2. Complete (a)-(d) for the matrix A_4 defined in Activity 1(d).

   (a) Write the polynomial p given by p(λ) = det(A_4 - λI), where I is
       the 2 × 2 identity matrix.

   (b) Find all λ such that p(λ) = 0. For each solution, select a nonzero
       vector x such that A_4 x = λx.

   (c) What can you say about the vectors you picked in the previous
       step? Do they form a basis for R^2?

   (d) Think of A_4 as representing the expression of a linear transformation
       T : R^2 → R^2 given by

           T(x) = A_4 x.

       Use the information you have gathered thus far to find a diagonal
       matrix representation for T.
3. Repeat Activity 2 for the matrix given by A_6 = \begin{pmatrix} 0 & 3 & 3 \\ 2 & 2 & 2 \\ 4 & 1 & 1 \end{pmatrix}.

4. Repeat Activity 2 for the matrix given by A_7 = \begin{pmatrix} 0 & 0 & 0 \\ 4 & 1 & 1 \\ 3 & 2 & 1 \end{pmatrix}.

   Did anything different happen this time? Can you diagonalize A_7? Try
   to explain as much as you can.
5. Let S be the matrix whose columns are the vectors you found in
   Activity 3. Set D = S^{-1} A_6 S, where A_6 denotes the matrix from Activity 3.
   Do you see anything remarkable about D? If so, can you explain it?
Discussion

Basic Ideas

Given a linear transformation L : V → V, the point of Activity 1 was to
illustrate the idea that it is possible to find scalars λ and nonzero vectors x
such that

    L(x) = λx.

When this occurs, we say that λ is an eigenvalue of L and x is an eigenvector
belonging to λ. Formally, we have

Definition 7.2.1. Let L : V → V be a linear transformation. If there
exists a scalar λ and a nonzero vector x ∈ V for which L(x) = λx, then we
say that λ is an eigenvalue of L. Any nonzero vector x satisfying the equality
for a particular λ is called an eigenvector belonging to λ.
What examples of eigenvalues and eigenvectors did you find in the
activities? Carefully describe them before proceeding.

What effect does a linear transformation have upon an eigenvector? We
know that a linear transformation takes a vector and transforms it into
another vector. If L : R^n → R^n is a linear transformation, and if x ∈ R^n is
an eigenvector belonging to λ, how might we describe the way L transforms
x? According to the definition, the effect of L on an eigenvector, no matter
how complex the transformation, is nothing more than a simple scaling of
the eigenvector. In this context, what does the term scaling mean?

In R^3, give an arrow for a vector x. Give a second, longer arrow
that coincides with x that will be labeled L(x).
Figure 7.7: Image of an eigenvector

The activities introduced a methodology for finding eigenvalues and
eigenvectors. The procedure outlined in Activity 2 can be expanded and
generalized. In order to identify the eigenvalues of a linear transformation
L : K^n → K^n, we define a second transformation based upon the equation
L(x) = λx. Define I : K^n → K^n to be the identity transformation:
I(x) = x for all x ∈ K^n. For a scalar λ, define M : K^n → K^n by the expression

    M(x) = L(x) - λI(x).

Is M a linear transformation? Before continuing, you should check this.
What is the relationship between the expression for M and the equation
L(x) = λx? A particular scalar, say s, is a solution to L(x) = λx if and only
if there exists a nonzero vector x_s ∈ K^n such that x_s ∈ ker(M). Can you
explain why this is the case?

If the kernel of M were to contain the zero vector exclusively, then
dim(ker(M)) = 0. According to Theorem 5.2.5, the Rank and Nullity
Theorem, rank(M) = n. As a result, the rank of any matrix representation
of M would be n. In such a case, Theorems 6.3.4 and 6.4.4 tell us that
det(M) ≠ 0. Hence, a scalar s is an eigenvalue if and only if

    |[L] - sI| = 0,

where [L] denotes a matrix representation of L. The set of eigenvalues of L
is subsequently given by the set

    {λ : |[L] - λI| = 0}.

The determinant |[L] - λI|, a polynomial in λ, is called the characteristic
polynomial of L. Any root of this polynomial is an eigenvalue of L. It is
useful to summarize all of this in a theorem.
Theorem 7.2.1. Let L : K^n → K^n be a linear transformation, [L] a matrix
representation of L, and let p be the function given by p(λ) = det([L] - λI).
Then p is a polynomial of degree n in λ with coefficients in K. The
eigenvalues of L are precisely the roots of p.
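For a 2 × 2 matrix, Theorem 7.2.1 reduces to finding the roots of a quadratic, p(λ) = λ² − tr(A)λ + det(A). The sketch below is in Python rather than the text's ISETL, applied to A_4 with the entries as printed in Activity 1(d); note that minus signs may have been lost in the reproduction of the activities, so treat the specific numbers as illustrative.

```python
import math

def eigenvalues_2x2(A):
    # Characteristic polynomial of a 2x2 matrix:
    # p(t) = det(A - t I) = t^2 - tr(A) t + det(A); solve by the quadratic formula.
    tr = A[0][0] + A[1][1]
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    disc = tr * tr - 4 * det
    if disc < 0:
        return []          # no real roots: no real eigenvalues
    r = math.sqrt(disc)
    return sorted([(tr - r) / 2, (tr + r) / 2])

# A_4 with the entries as printed in Activity 1(d):
# p(t) = t^2 - 3t - 4 = (t - 4)(t + 1), so the roots are -1 and 4.
A4 = [[1, 3], [2, 2]]
roots = eigenvalues_2x2(A4)
```

With these entries the characteristic polynomial factors as (λ − 4)(λ + 1), giving eigenvalues −1 and 4.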
As you may recall from your study of algebra, a polynomial of degree n
has at most n roots or zeros. This means that a linear transformation from
R^n into itself has at most n eigenvalues. If we are working over the complex
numbers C, we know that a polynomial can be factored completely. Hence,
if L is a linear transformation from C^n to itself, and if we count eigenvalues
according to their multiplicity, then L has exactly n eigenvalues.
Bases of Eigenvectors
The steps in Activity 2 provided a rough sketch of the procedure for finding a diagonal matrix representation. In Activity 5, you found a diagonal form by multiplying by a suitable transition matrix. These activities raise some important questions. Is a set of eigenvectors always linearly independent? Do diagonal matrix representations correspond to bases consisting exclusively of eigenvectors? What happened in Activity 4 that prevented the transformation defined by $A_7$ from having a diagonal representation? Providing answers to these questions will be one focus of our work in this and the next section.
We provide a partial answer to the first question in the following theorem.
Theorem 7.2.2. Let $L : V \to V$ be a linear transformation. Let $\lambda_1, \lambda_2, \ldots, \lambda_k$ be a set of distinct eigenvalues of $L$. For each $i = 1, 2, \ldots, k$, let $v_i$ be an eigenvector belonging to $\lambda_i$. Then, the set of vectors
$$\{v_i : i = 1, 2, \ldots, k\}$$
is linearly independent.
Proof. The proof will be by induction on $k$.
Since an eigenvector, by definition, is not zero, the theorem is true for $k = 1$.
Suppose that the theorem holds for any set of $k$ eigenvectors, $v_1, v_2, \ldots, v_k$, belonging to a set $\lambda_1, \lambda_2, \ldots, \lambda_k$
7.2 Eigenvalues and Eigenvectors 361
of $k$ distinct eigenvalues. Now, suppose we have a set of $k + 1$ distinct eigenvalues,
$$\lambda_1, \lambda_2, \ldots, \lambda_k, \lambda_{k+1},$$
and a corresponding set of eigenvectors,
$$v_1, v_2, \ldots, v_k, v_{k+1},$$
where $v_i$ belongs to $\lambda_i$. By the induction hypothesis, the first $k$ of these are linearly independent. In order to show that the entire set is independent, all we have to show is that $v_{k+1}$ is not a linear combination of $v_1, v_2, \ldots, v_k$.
Suppose this is not the case; that is, $v_{k+1} = \sum_{i=1}^{k} t_i v_i$, where the $t_i$ are scalars. Then we would have,
$$\sum_{i=1}^{k} t_i \lambda_{k+1} v_i = \lambda_{k+1} \sum_{i=1}^{k} t_i v_i = \lambda_{k+1} v_{k+1} = L(v_{k+1}) = L\left(\sum_{i=1}^{k} t_i v_i\right) = \sum_{i=1}^{k} t_i L(v_i) = \sum_{i=1}^{k} t_i \lambda_i v_i.$$
But then we would have,
$$0 = \sum_{i=1}^{k} t_i \lambda_{k+1} v_i - \sum_{i=1}^{k} t_i \lambda_i v_i = \sum_{i=1}^{k} t_i (\lambda_{k+1} - \lambda_i) v_i.$$
Since all of the eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_{k+1}$ are distinct, $\lambda_{k+1} - \lambda_i \ne 0$ for $i = 1, 2, \ldots, k$. Since the vectors $v_1, v_2, \ldots, v_k$ are linearly independent, $t_i = 0$ for $i = 1, 2, \ldots, k$. This implies that $v_{k+1} = 0$, which is not the case.
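Theorem 7.2.2 is easy to probe numerically: if the eigenvalues are distinct, the matrix whose columns are corresponding eigenvectors has full rank. A NumPy sketch with an illustrative matrix that is not one from the activities:

```python
import numpy as np

A = np.array([[2.0, 1.0, 0.0],     # illustrative; eigenvalues 2, 3, 5 are distinct
              [0.0, 3.0, 1.0],
              [0.0, 0.0, 5.0]])

vals, vecs = np.linalg.eig(A)      # columns of vecs are eigenvectors

# Theorem 7.2.2 predicts these three eigenvectors are linearly independent,
# i.e. the matrix having them as columns has full rank.
print(np.linalg.matrix_rank(vecs))   # 3
```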
This theorem explains why the matrix $S$ in Activity 5 is invertible. The columns of $S$ were eigenvectors belonging to different eigenvalues. Indeed, if a linear transformation on a vector space $V$ of dimension $n$ has $n$ distinct eigenvalues, then a set of $n$ eigenvectors, one corresponding to each distinct eigenvalue, constitutes a basis. This situation can be generalized. Specifically, we can construct a basis of eigenvectors in which some belong to the same eigenvalue. The next theorem will help us to prove such a theorem.
Theorem 7.2.3. The set of all eigenvectors belonging to the same eigenvalue
forms a subspace.
Proof. See Exercise 10.
We will use the theorem just stated, along with Theorem 7.2.2, to prove
a stronger version of Theorem 7.2.2. We will show that it is possible to
construct a linearly independent set of eigenvectors in which some of the
vectors belong to the same eigenvalue. This theorem is an important step in
the process of determining conditions that guarantee diagonalizability, that
is, the existence of a basis for which a linear transformation has a diagonal
matrix representation.
Theorem 7.2.4. Let $L : V \to V$ be a linear transformation, and let $\lambda_1, \lambda_2, \ldots, \lambda_k$ be a set of distinct eigenvalues of $L$. Let $E$ be a set of eigenvectors that satisfies the following condition: for each $i = 1, 2, \ldots, k$, those eigenvectors that belong to $\lambda_i$ are linearly independent. It then follows that the entire set $E$ is linearly independent.
Proof. Suppose that
$$E = \{v_1, v_2, \ldots, v_m\}, \quad m \le n.$$
Let $a_1, a_2, \ldots, a_m$ be scalars such that
$$a_1 v_1 + a_2 v_2 + \cdots + a_m v_m = 0.$$
It suffices to show that
$$a_1 = a_2 = \cdots = a_m = 0.$$
Group the combination according to those vectors which belong to the same eigenvalue. In such a case, it follows, by Theorem 7.2.3, that each such sum yields another eigenvector belonging to the same eigenvalue. If we do this for every such grouping, the original linear combination
$$a_1 v_1 + a_2 v_2 + \cdots + a_m v_m$$
simplifies to a sum of distinct eigenvectors, each belonging to a different eigenvalue. By Theorem 7.2.2, each vector must be zero. This means that each part of the original combination that belonged to a particular eigenvalue yields a vector sum of zero. Since we are assuming that those vectors in $E$ that belong to the same eigenvalue are linearly independent, it follows that the associated scalars are zero. Since this happens for each such grouping of the original linear combination above, it follows that the coefficients $a_1, a_2, \ldots, a_m$ are all simultaneously zero.
The next two definitions will help us to state a condition that guarantees the existence of a basis of eigenvectors. The proof will be based upon Theorems 7.2.2, 7.2.3, and 7.2.4. This theorem will be a key tool in helping us to devise a procedure for diagonalization. This procedure will be discussed in detail in the next section.
Definition 7.2.2. The algebraic multiplicity of an eigenvalue of a linear transformation $L : K^n \to K^n$ is its multiplicity as a root of the characteristic polynomial of $L$.
Definition 7.2.3. The geometric multiplicity of an eigenvalue is the dimension of the subspace of eigenvectors that belong to it.
Theorem 7.2.5. If $L : K^n \to K^n$ is a linear transformation, and if the sum of the geometric multiplicities of the eigenvalues of $L$ is $n$, then there is a basis of eigenvectors of $L$.
Proof. For each eigenvalue, find a basis for the subspace of its eigenvectors (it is a subspace by Theorem 7.2.3). Then, by Theorem 7.2.4, the union of all of these bases is linearly independent. By the assumption regarding geometric multiplicities, the number of vectors in this linearly independent set is $n$. By Theorem 4.4.8, this set is a basis.
Theorem 7.2.5, which gives a condition for diagonalizability, allows us to expand the level of detail of the procedure given in Activity 2. In each step, assume that $L : \mathbb{R}^n \to \mathbb{R}^n$ is a linear transformation.
1. Find the matrix representation $[L]$ of $L$ with respect to the coordinate basis. Determine the characteristic polynomial $|[L] - \lambda I|$.
2. Find the roots of the characteristic polynomial.
3. Find a basis for the subspace corresponding to each eigenvalue.
4. Take the union of these bases.
5. If the sum of the geometric multiplicities of the eigenvalues of $L$ is equal to $n$, then the linearly independent set described in 4. forms a basis. Can you explain why? What happens if the sum of the geometric multiplicities is not equal to $n$? Would $L$ be diagonalizable in this case?
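The five steps above can be sketched in NumPy (the text's activities use ISETL; this is an illustrative translation, with the eigenspace bases in step 3 read off from a singular value decomposition rather than by hand):

```python
import numpy as np

def diagonalize(A, tol=1e-9):
    """Steps 1-5: return (C, D) with D = C^{-1} A C diagonal, or None."""
    n = A.shape[0]
    vals = np.linalg.eigvals(A)                 # steps 1-2: roots of the char. poly
    basis = []
    for lam in np.unique(np.round(vals, 8)):
        # step 3: eigenspace basis = null space of A - lam*I, read off from
        # the right singular vectors whose singular value is (near) zero
        _, s, Vh = np.linalg.svd(A - lam * np.eye(n))
        basis += [Vh[i] for i in range(n) if s[i] < tol]
    if len(basis) != n:                         # step 5: multiplicities fall short of n
        return None
    C = np.array(basis).T                       # step 4: union of the bases, as columns
    return C, np.linalg.inv(C) @ A @ C

C, D = diagonalize(np.array([[4.0, 1.0],
                             [0.0, 2.0]]))
print(np.round(D, 6))                           # diagonal, entries 2 and 4
```

When the geometric multiplicities fall short of $n$, the function returns `None`, mirroring the failure mode asked about in step 5.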
This procedure leaves several unanswered questions: Can you find the roots of the characteristic equation? How do we construct a basis for a given eigenspace (the subspace of eigenvectors corresponding to a particular eigenvalue)? What happens if the sum of the geometric multiplicities is not equal to the dimension of the vector space? What is the precise relationship between an eigenbasis, if one can be constructed, and the coordinate basis? These and related questions will be answered in the next section.
What Can Happen?
Before considering conditions guaranteeing diagonalizability, as well as other applications of the theory of eigenvalues and eigenvectors, it may be useful to summarize all of the various possibilities encountered thus far. If $L : K^n \to K^n$ is a linear transformation, where $K$ is some field, the following list details important facts concerning eigenvalues and eigenvectors.
- If $K$ is the set $\mathbb{C}$ of complex numbers, then it is certain that $L$ will have at least one eigenvalue. In general, however, it is possible that $L$ has no eigenvalues. When might $L$ not have any eigenvalues?
- $L$ has at most $n$ eigenvalues. Can you explain why?
- If there is a basis for $K^n$ consisting of eigenvectors, then the matrix of $L$ with respect to that basis is diagonal. We actually showed that it might be possible to construct a basis consisting exclusively of eigenvectors. However, we have not yet proven that a basis consisting of eigenvectors yields a diagonal matrix representation. Can you prove that?
- If $L$ has $n$ distinct eigenvalues, then there is a basis for $K^n$ consisting of eigenvectors.
- If the sum of the geometric multiplicities of the eigenvalues of $L$ is $n$, then there is a basis for $K^n$ consisting of eigenvectors.
In considering these possibilities, we must be careful to take into account the base field. For example, the characteristic polynomial $p$ of the matrix $A_7$ in Activity 4 is given by
$$p(\lambda) = -\lambda(1 + \lambda^2).$$
One might be tempted to conclude that 0 is the only eigenvalue. If $K = \mathbb{R}$, as was the case in Activity 4, then 0 is the only eigenvalue, and a transformation $T : \mathbb{R}^3 \to \mathbb{R}^3$ defined by $T(x) = A_7 x$ cannot be diagonalized. On the other hand, if we take $K$ to be the complex numbers, then a linear transformation $T : \mathbb{C}^3 \to \mathbb{C}^3$ defined by $A_7$ has 3 distinct eigenvalues. In this case, $T$ can be diagonalized.
Exercises
In doing the following problems, you may use any computer software, including any constructions you made in the activities, or a computer algebra system such as Derive, Maple, Matlab, or Mathematica.
1. Consider each of the following matrices as representing a linear transformation on a vector space whose field of scalars is $\mathbb{R}$. For each matrix, determine the characteristic polynomial, the eigenvalues, and the corresponding eigenspaces.
(a) $A_1 = \begin{pmatrix} 5 & 4 \\ 8 & 7 \end{pmatrix}$
(b) $A_2 = \begin{pmatrix} 2 & 1 \\ 3 & 1 \end{pmatrix}$
(c) $A_3 = \begin{pmatrix} 7 & 6 \\ 15 & 12 \end{pmatrix}$
(d) $A_4 = \begin{pmatrix} 4 & 0 & 2 \\ 2 & 3 & 2 \\ 3 & 0 & 1 \end{pmatrix}$
(e) $A_5 = \begin{pmatrix} 4 & 1 & 2 \\ 0 & 3 & 2 \\ 0 & 0 & 1 \end{pmatrix}$
(f) $A_6 = \begin{pmatrix} 9 & 7 & 7 \\ 3 & 1 & 3 \\ 5 & 5 & 3 \end{pmatrix}$
(g) $A_7 = \begin{pmatrix} 3 & 1 & 0 \\ 0 & 1 & 0 \\ 4 & 2 & 1 \end{pmatrix}$
(h) $A_8 = \begin{pmatrix} 2 & 3 & 6 \\ 6 & 2 & 3 \\ 3 & 6 & 2 \end{pmatrix}$
2. Repeat Exercise 1 for the following matrices, except this time assume that the field of scalars is the complex numbers $\mathbb{C}$.
(a) $A_1 = \begin{pmatrix} 1 & 3 \\ 1 & 1 \end{pmatrix}$
(b) $A_2 = \begin{pmatrix} 2 & 3 & 6 \\ 6 & 2 & 3 \\ 3 & 6 & 2 \end{pmatrix}$
3. Suppose $F : \mathbb{R}^2 \to \mathbb{R}^2$ is a linear transformation whose matrix representation with respect to the ordered basis $B = [\langle 1, 1\rangle, \langle 2, 1\rangle]$ is
$$\begin{pmatrix} 5 & 0 \\ 0 & 1 \end{pmatrix}.$$
(a) Show that $B$ is a basis of eigenvectors.
(b) Find the matrix representation of $F$ with respect to the coordinate basis.
(c) Find the characteristic polynomial of $F$.
4. Suppose that $p(x) = (x - 3)^2 (x - 2)(x + 1)$ is the characteristic polynomial of a linear transformation $T : \mathbb{R}^4 \to \mathbb{R}^4$. Is $T$ diagonalizable? Why, or why not?
5. Suppose that $p(\lambda) = \lambda^2 - 5\lambda + 6$ is the characteristic polynomial of a linear transformation $G : \mathbb{R}^2 \to \mathbb{R}^2$.
(a) Construct a basis for $\mathbb{R}^2$ of eigenvectors of $G$.
(b) Find the matrix representation of $G$ with respect to the eigenbasis you constructed in (a).
(c) Find the matrix representation of $G$ with respect to the ordered basis $B = [\langle 1, 3\rangle, \langle 4, 2\rangle]$.
6. Let $L : \mathbb{R}^n \to \mathbb{R}^n$ be a linear transformation. Let $I : \mathbb{R}^n \to \mathbb{R}^n$ be the identity transformation. Show that the transformation $M : \mathbb{R}^n \to \mathbb{R}^n$, defined by
$$M(x) = L(x) - \lambda I(x),$$
is a linear transformation.
7. Let $L$, $I$, and $M$ be defined as they were in Exercise 6. Show that $\lambda$ is an eigenvalue of $L$ if and only if there exists a nonzero vector $x$ such that $x \in \ker(M)$.
8. Let $L$, $I$, and $M$ be defined as they have been in the previous two exercises. Carefully explain why the set of eigenvalues of $L$ is the solution set of the equation
$$\det([M]) = 0,$$
where $[M]$ denotes the matrix representation of $M$ with respect to the coordinate basis.
9. Explain why the matrix $S$ defined in Activity 5 is invertible.
10. Provide a proof of Theorem 7.2.3.
11. In Theorem 7.2.4, the original linear combination
$$a_1 v_1 + a_2 v_2 + \cdots + a_m v_m$$
simplifies to a sum of distinct eigenvectors, each belonging to a different eigenvalue. If you recall, we were trying to show that the scalars $a_i$ are simultaneously zero. Hence, we were considering the simplified sum when set equal to the zero vector. In this context, use Theorem 7.2.2 to explain why each of the vectors in the simplified combination must be the zero vector.
12. Let A be a matrix representation of a linear transformation on a vector
space over K with respect to some basis B. If A is diagonal, prove that
the basis B consists entirely of eigenvectors, and show that the diagonal
elements of A are the eigenvalues.
13. Let
$$A = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$$
be the matrix representation of a linear transformation $T : \mathbb{R}^2 \to \mathbb{R}^2$ with respect to the coordinate basis.
(a) Find the characteristic polynomial of $T$.
(b) Show that $T$ has no eigenvalues.
(c) Interpret this result geometrically. In particular, what does it mean to say that $T$ has no eigenvalues?
(d) If the vector space were $\mathbb{C}^n$ instead of $\mathbb{R}^n$, would $T$ still have no eigenvalues? Explain.
14. Consider the linear transformation $T : \mathbb{R}^3 \to \mathbb{R}^3$ whose matrix with respect to the coordinate basis is
$$\begin{pmatrix} 3 & 1 & 1 \\ 0 & 3 & 1 \\ 0 & 0 & 5 \end{pmatrix}.$$
Try to diagonalize this matrix. What is the relationship between the algebraic and geometric multiplicities of an eigenvalue in this case?
15. If $A$ is a square matrix, then $A^2$ denotes $A \cdot A$. Similarly, $A^3 = A \cdot A \cdot A$, and $A^n = A \cdot A \cdots A$, a product of $n$ copies of $A$. If a square matrix $A$ is diagonalizable, and if all of its eigenvalues are either $1$ or $-1$, then show that $A^2 = I$.
16. If $A$ is a square, diagonalizable matrix, and if all of its eigenvalues are either $1$ or $0$, then prove $A^2 = A$.
17. If $A$ is a square, diagonalizable matrix, and if all of its eigenvalues are either $3$ or $-5$, then show that $A^2 + 2A - 15I = 0$.
18. Can you think of a general statement for which the three previous exercises are special cases?
19. Show that the general formula for the characteristic polynomial of a $2 \times 2$ matrix
$$\begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}$$
is given by $\lambda^2 - \operatorname{tr}(A)\lambda + \det(A)$, where $\operatorname{tr}(A)$ denotes the trace of $A$, the sum of the diagonal entries.
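The formula in Exercise 19 is easy to spot-check numerically; the matrix below is an arbitrary illustrative choice:

```python
import numpy as np

A = np.array([[2.0, 7.0],
              [1.0, 4.0]])          # arbitrary illustrative 2x2 matrix

tr, det = np.trace(A), np.linalg.det(A)   # tr = 6, det = 1
for lam in np.linalg.eigvals(A):
    # every eigenvalue is a root of p(lam) = lam^2 - tr*lam + det
    print(abs(lam**2 - tr * lam + det))   # ~0 up to rounding error
```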
20. Suppose $v$ is a nonzero eigenvector of a matrix $A$ belonging to the eigenvalue $\lambda$. Show that $v$ is an eigenvector of $A^n$ belonging to $\lambda^n$.
7.3 Diagonalization and Applications
Activities
1. Let $T : \mathbb{R}^2 \to \mathbb{R}^2$ be defined by
$$T(\langle x_1, x_2\rangle) = A \begin{pmatrix} x_1 \\ x_2 \end{pmatrix},$$
where
$$A = \begin{pmatrix} 7 & 10 \\ 3 & 4 \end{pmatrix}.$$
(a) Verify that the matrix representation of $T$ with respect to the coordinate basis is given by $A$. Find the characteristic polynomial of $T$ with respect to the coordinate basis, that is, compute $|A - \lambda I_2|$.
(b) Let $B = [\langle 2, 1\rangle, \langle 3, 4\rangle]$ be another basis for $\mathbb{R}^2$. Apply the func ChangeMatRep that you constructed in Activity 4 of Section 7.2 to find $[T]_B$, the matrix representation of $T$ with respect to the basis $B$. Find the characteristic polynomial of $T$ with respect to the basis $B$, that is, compute $|[T]_B - \lambda I_2|$.
(c) Use Theorem 7.1.2 to show that $A$ and $[T]_B$ are similar matrices. What is the relationship between $|A - \lambda I_2|$ and $|[T]_B - \lambda I_2|$?
(d) Based upon the results you obtained in part (c), what can we say about the characteristic polynomials of two similar matrices? If asked to find the eigenvalues of a linear transformation, does it appear to matter what basis we work with? Explain your answer.
2. Define $F : \mathbb{R}^3 \to \mathbb{R}^3$ by
$$F(\langle x_1, x_2, x_3\rangle) = \langle 4x_1 + x_3,\; 2x_1 + 3x_2 + 2x_3,\; x_1 + 4x_3\rangle.$$
(a) Find the eigenvalues of $F$. Does the characteristic polynomial factor completely?
(b) Suppose $\lambda = a$ is an eigenvalue of $F$. According to Theorem 7.2.3, the eigenspace corresponding to $a$ forms a subspace of $\mathbb{R}^3$. A vector $x \in \mathbb{R}^3$ is an eigenvector corresponding to $a$ if and only if $x$ is a solution of $([F] - aI_3)x = 0$. Why is this the case? How can we use this equation to find a basis for the eigenspace corresponding to $a$? Once you have answered these questions, find a basis for the eigenspace corresponding to each of the eigenvalues you found in part (a).
(c) Let $\mathcal{E}$ be the collection of all of the eigenbasis vectors you found in part (b). According to Theorem 7.2.4, this set is linearly independent. Does this set form a basis for $\mathbb{R}^3$ in this case? Why, or why not?
(d) If $\mathcal{E}$ forms a basis for $\mathbb{R}^3$, what is $[F]_{\mathcal{E}}$, the matrix representation with respect to $\mathcal{E}$? What is the form of the transition matrix from the coordinate basis $\mathcal{C}$ to the basis $\mathcal{E}$? If $\mathcal{E}$ does not form a basis, would it still be possible to construct a basis of eigenvectors?
(e) What is the relationship between the algebraic multiplicity of each eigenvalue and its geometric multiplicity? Do the geometric multiplicities add to the dimension of $\mathbb{R}^3$?
3. Define $G : \mathbb{R}^3 \to \mathbb{R}^3$ by
$$G(\langle x_1, x_2, x_3\rangle) = \langle x_3,\; x_1 - x_3,\; x_2 + x_3\rangle.$$
(a) Find the eigenvalues of $G$. Does the characteristic polynomial factor completely?
(b) Find a basis for the eigenspace corresponding to each of the eigenvalues you found in part (a).
(c) Let $\mathcal{E}$ be the collection of all of the eigenbasis vectors you found in part (b). Does this set form a basis for $\mathbb{R}^3$ in this case? Why, or why not?
(d) If $\mathcal{E}$ forms a basis for $\mathbb{R}^3$, what is $[G]_{\mathcal{E}}$, the matrix representation with respect to $\mathcal{E}$? What is the form of the transition matrix from the coordinate basis $\mathcal{C}$ to the basis $\mathcal{E}$? If $\mathcal{E}$ does not form a basis, would it still be possible to construct a basis of eigenvectors?
(e) What is the relationship between the algebraic multiplicity of each eigenvalue and its geometric multiplicity? Do the geometric multiplicities add to the dimension of $\mathbb{R}^3$?
4. Let $P_2(\mathbb{R})$ be the space of all polynomials with real coefficients of degree 2 or less. Define $H : P_2(\mathbb{R}) \to P_2(\mathbb{R})$ by
$$H(p) = p',$$
where $p \in P_2(\mathbb{R})$, and $p'$ denotes the derivative of $p$. (Note that here we are considering polynomials as functions, rather than as a sequence of coefficients from a field. Do you see this distinction?)
(a) Find the matrix representation of $H$ with respect to the basis $B = \{1, x, x^2\}$. Find the eigenvalues of $H$. Does the characteristic polynomial factor completely?
(b) Find a basis for the eigenspace corresponding to each of the eigenvalues you found in part (a).
(c) Let $\mathcal{E}$ be the collection of all of the eigenbasis vectors you found in part (b). Does this set form a basis for $P_2(\mathbb{R})$ in this case? Why, or why not?
(d) If $\mathcal{E}$ forms a basis for $P_2(\mathbb{R})$, what is $[H]_{\mathcal{E}}$, the matrix representation with respect to $\mathcal{E}$? What is the form of the transition matrix from basis $B$ to the basis $\mathcal{E}$? If $\mathcal{E}$ does not form a basis, would it still be possible to construct a basis of eigenvectors?
(e) What is the relationship between the algebraic multiplicity of each eigenvalue and its geometric multiplicity? Do the geometric multiplicities add to the dimension of $P_2(\mathbb{R})$?
(f) On the basis of your findings here and in Activities 2 and 3, under what condition is a linear transformation likely to have a diagonal matrix representation?
5. Let
$$A = \begin{pmatrix} 4 & 6 \\ 3 & 5 \end{pmatrix},$$
and suppose $A = C^{-1}DC$, where
$$C = \begin{pmatrix} 1 & 1 \\ 1 & 2 \end{pmatrix} \quad\text{and}\quad D = \begin{pmatrix} 1 & 0 \\ 0 & 2 \end{pmatrix}.$$
Determine whether $A^n = C^{-1}D^n C$ for $n = 2, 3, 4$. What do you observe? Based upon your observations, state a conjecture, if possible.
Discussion
Relationship between Diagonalizability and Eigenvalues
Toward the end of the last section, we outlined a procedure for diagonalizing a matrix. In this section, we will provide additional detail, as well as determine conditions which ensure diagonalizability. But, before we continue, it might be helpful to define exactly what we mean by diagonalizability.
Definition 7.3.1. Let $T : \mathbb{R}^n \to \mathbb{R}^n$ be a linear transformation. $T$ is diagonalizable if there exists a basis $B$ such that the corresponding matrix representation $[T]_B$ is a diagonal matrix.
Activities 2, 3, and 4, together with the discussion in the last section, suggest that the diagonalizability of a linear transformation depends upon the ability to find, or construct, a basis consisting exclusively of eigenvectors. The theorem below verifies that this is indeed the case.
Theorem 7.3.1. Let $T : \mathbb{R}^n \to \mathbb{R}^n$ be a linear transformation, and let $B$ be a basis. The matrix representation $[T]_B$ is diagonal if and only if $B$ consists exclusively of eigenvectors.
Proof. ($\Leftarrow$) Assume that $B$ is a basis of eigenvectors. We want to show that $[T]_B$ is a diagonal matrix. The proof of this part is left to the exercises. See Exercise 6.
($\Rightarrow$) Assume that $[T]_B$ is a diagonal matrix. Then, there exist scalars $a_{11}, a_{22}, \ldots, a_{nn}$ such that
$$[T]_B = \begin{pmatrix} a_{11} & 0 & 0 & \cdots & 0 \\ 0 & a_{22} & 0 & \cdots & 0 \\ 0 & 0 & a_{33} & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & a_{nn} \end{pmatrix}.$$
Let the basis $B$ be given by $B = \{v_1, v_2, \ldots, v_n\}$. Then, according to Definition 6.2.2,
$$T(v_i) = 0v_1 + \cdots + 0v_{i-1} + a_{ii}v_i + 0v_{i+1} + \cdots + 0v_n = a_{ii} v_i.$$
According to Definition 7.2.1, $v_i$, $i = 1, 2, \ldots, n$, is an eigenvector. Therefore, $B$ is a basis consisting entirely of eigenvectors.
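The ($\Leftarrow$) direction of Theorem 7.3.1 can be watched numerically: conjugating by a matrix $C$ whose columns form an eigenbasis produces a diagonal representation. An illustrative NumPy sketch (the matrix is not one from the activities):

```python
import numpy as np

A = np.array([[4.0, 6.0],
              [-3.0, -5.0]])    # illustrative; eigenvalues are 1 and -2

vals, C = np.linalg.eig(A)      # columns of C form an eigenbasis of R^2
rep = np.linalg.inv(C) @ A @ C  # representation of x -> Ax in the basis C
print(np.round(rep, 6))         # diagonal, with the eigenvalues on the diagonal
```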
In Section 7.2, you found eigenvalues by working with the coordinate basis. The results of Activity 1 suggest that the characteristic polynomial does not depend upon the specific choice of basis. The next theorem establishes this as a general result.
Theorem 7.3.2. Let $T : \mathbb{R}^n \to \mathbb{R}^n$ be a linear transformation. Let $B$ and $B'$ be two bases for $\mathbb{R}^n$. Then,
$$|[T]_B - \lambda I_n| = |[T]_{B'} - \lambda I_n|,$$
that is, the characteristic polynomial is independent of the choice of basis.
Proof. As given in the statement of the theorem, let $[T]_B$ and $[T]_{B'}$ be two matrix representations of $T$ with respect to the bases $B$ and $B'$, respectively. According to Theorem 7.1.2, these two matrices are similar, that is, there exists an invertible matrix $C$ such that
$$[T]_{B'} = C^{-1} [T]_B C.$$
What are the entries of $C$? Can you recall, based upon the theorem just cited?
Using $C$, we can establish the following equality,
$$|[T]_{B'} - \lambda I_n| = |C^{-1} [T]_B C - \lambda I_n| = |C^{-1}([T]_B - \lambda I_n)C| = |C^{-1}|\,|[T]_B - \lambda I_n|\,|C| = |[T]_B - \lambda I_n|\,|C^{-1}C| = |[T]_B - \lambda I_n|,$$
which is what we wished to prove. Can you justify each step?
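Theorem 7.3.2 amounts to saying that similar matrices have the same characteristic polynomial, which is easy to confirm numerically; the matrices below are random illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 3))        # stands in for [T]_B
P = rng.standard_normal((3, 3))        # an (almost surely invertible) change of basis
B = np.linalg.inv(P) @ A @ P           # a matrix similar to A, i.e. [T]_{B'}

# np.poly(M) returns the coefficients of det(tI - M), the characteristic polynomial.
print(np.round(np.poly(A), 6))
print(np.round(np.poly(B), 6))         # same coefficients, as the theorem predicts
```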
These theorems simplify the basic procedure for diagonalizing a transformation. The theorem we have just proven tells us that we can use any basis, and hence, any matrix representation, to find the eigenvalues of a linear transformation. Theorem 7.3.1 reveals that diagonalizability depends entirely upon the ability to construct a basis of eigenvectors. What remains is to find conditions that guarantee the existence of an eigenbasis.
Conditions that Guarantee Diagonalizability
In Activities 2, 3, and 4, you were asked to compare the geometric and algebraic multiplicities of each eigenvalue, as well as to determine whether the characteristic polynomial splits, that is, factors completely. Based upon your results, is it possible for a diagonalizable transformation to have a characteristic polynomial that does not split? If the characteristic polynomial splits, can you immediately conclude that the transformation is diagonalizable? Does the relationship between the algebraic and geometric multiplicities of each eigenvalue appear to have any bearing upon the issue of diagonalizability?
The next theorem provides an answer to the first question.
Theorem 7.3.3. Let $T : \mathbb{R}^n \to \mathbb{R}^n$ be a linear transformation that is diagonalizable. Then, the characteristic polynomial of $T$ splits.
Proof. According to the assumption, there exists a basis $B$ such that the resulting matrix representation $[T]_B$ is a diagonal matrix. By Theorem 7.3.2, the choice of basis does not affect the form of the characteristic polynomial. Hence, we can find the characteristic polynomial of $T$ by evaluating the determinant $|[T]_B - \lambda I_n|$. Since the only nonzero entries of $[T]_B - \lambda I_n$ lie along the diagonal and are of the form $a_{ii} - \lambda$, $i = 1, 2, \ldots, n$, it follows that the characteristic polynomial $|[T]_B - \lambda I_n|$ will consist exclusively of a product of $n$ factors of the form $(a_{ii} - \lambda)$, $i = 1, 2, \ldots, n$.
Does this theorem answer the second question posed in the first paragraph of this subsection? Why, or why not?
Activities 2, 3, and 4 reveal a second consequence of diagonalizability, the equality of the algebraic and geometric multiplicities of each eigenvalue. Before we can prove this result, we first show that the geometric multiplicity of an eigenvalue cannot exceed its algebraic multiplicity.
Theorem 7.3.4. Let $T : \mathbb{R}^n \to \mathbb{R}^n$ be a linear transformation. Let $\lambda$ be an eigenvalue of $T$. Then, the geometric multiplicity of $\lambda$ does not exceed its algebraic multiplicity.
Proof. Let $\lambda$ be an eigenvalue of $T$ having algebraic multiplicity $m$. Certainly, $m \le n$. Why is this true? Let $v_1, \ldots, v_p$ be a basis for the eigenspace corresponding to $\lambda$. Then, $p \le n$. Why? According to Theorem 4.4.10, we can expand this linearly independent set to a basis for all of $K^n$, say
$$v_1, \ldots, v_p, v_{p+1}, \ldots, v_n.$$
Since the first $p$ vectors are eigenvectors, the matrix representation of $T$ with respect to this basis is of the form
$$\begin{pmatrix}
\lambda & 0 & \cdots & 0 & a_{1,p+1} & \cdots & a_{1n} \\
0 & \lambda & \cdots & 0 & a_{2,p+1} & \cdots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots & \vdots & & \vdots \\
0 & 0 & \cdots & \lambda & a_{p,p+1} & \cdots & a_{pn} \\
0 & 0 & \cdots & 0 & a_{p+1,p+1} & \cdots & a_{p+1,n} \\
\vdots & \vdots & & \vdots & \vdots & & \vdots \\
0 & 0 & \cdots & 0 & a_{n,p+1} & \cdots & a_{nn}
\end{pmatrix}.$$
According to Theorem 7.2.1, the characteristic polynomial is the determinant of the matrix
$$\begin{pmatrix}
\lambda & 0 & \cdots & 0 & a_{1,p+1} & \cdots & a_{1n} \\
\vdots & \vdots & \ddots & \vdots & \vdots & & \vdots \\
0 & 0 & \cdots & \lambda & a_{p,p+1} & \cdots & a_{pn} \\
0 & 0 & \cdots & 0 & a_{p+1,p+1} & \cdots & a_{p+1,n} \\
\vdots & \vdots & & \vdots & \vdots & & \vdots \\
0 & 0 & \cdots & 0 & a_{n,p+1} & \cdots & a_{nn}
\end{pmatrix} - tI_n = \begin{pmatrix}
\lambda - t & 0 & \cdots & 0 & a_{1,p+1} & \cdots & a_{1n} \\
0 & \lambda - t & \cdots & 0 & a_{2,p+1} & \cdots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots & \vdots & & \vdots \\
0 & 0 & \cdots & \lambda - t & a_{p,p+1} & \cdots & a_{pn} \\
0 & 0 & \cdots & 0 & a_{p+1,p+1} - t & \cdots & a_{p+1,n} \\
\vdots & \vdots & & \vdots & \vdots & & \vdots \\
0 & 0 & \cdots & 0 & a_{n,p+1} & \cdots & a_{nn} - t
\end{pmatrix}.$$
If we apply what we have learned about determinants from Chapter 6, we can see that the determinant of the matrix given above will simplify to an $n$th degree polynomial with a factor of the form $(\lambda - t)^p$. Since the algebraic multiplicity is assumed to be $m$, it follows that $p \le m$, that is, the geometric multiplicity cannot exceed the algebraic multiplicity.
We will use this theorem to prove the following theorem, which shows that the equality of the algebraic and geometric multiplicities of each eigenvalue is a second consequence of diagonalizability.
Theorem 7.3.5. Let $T : \mathbb{R}^n \to \mathbb{R}^n$ be a linear transformation that is diagonalizable. Then, the geometric and algebraic multiplicities of each eigenvalue are equal.
Proof. Suppose that $\lambda_1, \lambda_2, \ldots, \lambda_k$, $k \le n$, are the distinct eigenvalues of $T$. By Theorem 7.3.1, there exists a basis $B$ consisting exclusively of eigenvectors. Since $\dim(\mathbb{R}^n) = n$, there are $n$ vectors in the set $B$. Let $E_i$, $i = 1, 2, \ldots, k$, each be the set of those vectors in $B$ that correspond to the eigenvalue $\lambda_i$. Let $j_i$, $i = 1, 2, \ldots, k$, represent the number of vectors in $E_i$. Let $m_i$, $i = 1, 2, \ldots, k$, denote the algebraic multiplicity of $\lambda_i$.
Since $E_i$, $i = 1, 2, \ldots, k$, is a subset of $B$, each $E_i$ is a linearly independent set. This set also generates the eigenspace corresponding to $\lambda_i$: any vector in the eigenspace of $\lambda_i$ can be written as a linear combination of $B$, from which it follows that any such vector can be written as a linear combination of $E_i$. (Can you fill in the details here?) Hence, $E_i$ forms a basis for the eigenspace of $\lambda_i$, which means that $j_i$ represents the geometric multiplicity of $\lambda_i$.
By Theorem 7.3.4, $j_i \le m_i$ for all $i = 1, 2, \ldots, k$. We can use this to say
$$n = \sum_{i=1}^{k} j_i \le \sum_{i=1}^{k} m_i \le n,$$
from which it follows that
$$\sum_{i=1}^{k} (m_i - j_i) = 0.$$
Since $m_i - j_i \ge 0$ for all $i = 1, 2, \ldots, k$, we can conclude that $j_i = m_i$ for all $i = 1, 2, \ldots, k$, which is what we wished to prove.
As a result of Theorems 7.3.3 and 7.3.5, we know that the splitting of the characteristic polynomial and the equality of the algebraic and geometric multiplicities of each eigenvalue are consequences of diagonalizability. Can we go the other way? As the activities show, neither of these conditions in isolation is sufficient to ensure diagonalizability. What do we mean by sufficient here? Of the linear transformations in Activities 2, 3, and 4, only one proved to be diagonalizable. In this case, both the characteristic polynomial split and the algebraic and geometric multiplicities were equal. As the next theorem shows, both of these things must occur together in order to ensure the existence of an eigenbasis.
Theorem 7.3.6. Let $T : \mathbb{R}^n \to \mathbb{R}^n$ be a linear transformation. If the characteristic polynomial of $T$ splits, and if the algebraic and geometric multiplicities of each eigenvalue of $T$ are equal, then $T$ is a diagonalizable transformation.
Proof. Let $\lambda_1, \lambda_2, \ldots, \lambda_k$ be the distinct eigenvalues of $T$. Let $m_i$, $i = 1, 2, \ldots, k$, represent the algebraic multiplicities of each eigenvalue. Since the characteristic polynomial splits,
$$m_1 + m_2 + \cdots + m_k = n,$$
that is, the algebraic multiplicities add to the dimension of the vector space $\mathbb{R}^n$.
Let $j_i$, $i = 1, 2, \ldots, k$, denote the geometric multiplicity of each eigenvalue $\lambda_i$, and let $\mathcal{E}_i$, $i = 1, 2, \ldots, k$, be a basis for the eigenspace corresponding to $\lambda_i$. If we let
$$B = \mathcal{E}_1 \cup \mathcal{E}_2 \cup \cdots \cup \mathcal{E}_k,$$
that is, $B$ is the collection of all eigenbasis vectors from each $\mathcal{E}_i$, then $B$, according to Theorem 7.2.4, is a linearly independent set. By assumption, $j_i = m_i$ for all $i = 1, 2, \ldots, k$. Therefore, $B$ is a linearly independent set of $n$ eigenvectors. According to Theorem 4.4.8, $B$ forms an eigenbasis for $\mathbb{R}^n$. By Theorem 7.3.1, it follows that the matrix representation $[T]_B$ is diagonal.
Theorems 7.3.3, 7.3.4, and 7.3.6 can be combined into a single if and only if theorem. What is the statement of this theorem? Now that we have established dual conditions that are equivalent to diagonalizability, we can elaborate upon the procedure for finding an eigenbasis that was outlined briefly in the last section and alluded to in the exercises.
A Procedure for Diagonalizing a Transformation
In this subsection, we will provide a detailed description of the process of diagonalizing a linear transformation. We discuss each step in the context of working with a specific example. Let $T : \mathbb{R}^3 \to \mathbb{R}^3$ be a linear transformation defined by
$$T(\langle x_1, x_2, x_3\rangle) = \langle 15x_1 + 7x_2 - 7x_3,\; x_1 + x_2 + x_3,\; 13x_1 + 7x_2 - 5x_3\rangle.$$
1. Find the matrix representation of $T$. In this case, we find the matrix representation with respect to the coordinate basis, which is
$$\begin{pmatrix} 15 & 7 & -7 \\ 1 & 1 & 1 \\ 13 & 7 & -5 \end{pmatrix}.$$
2. Find the eigenvalues of $T$. This involves completing the series of steps involving the characteristic polynomial, which are given below.
$$\begin{pmatrix} 15 & 7 & -7 \\ 1 & 1 & 1 \\ 13 & 7 & -5 \end{pmatrix} - t \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} = \begin{pmatrix} 15 - t & 7 & -7 \\ 1 & 1 - t & 1 \\ 13 & 7 & -5 - t \end{pmatrix}$$
$$\begin{pmatrix} c_1 e^{2x} + c_3 e^{4x} \\ c_2 e^{2x} + 2c_3 e^{4x} \\ -c_1 e^{2x} - c_2 e^{2x} - c_3 e^{4x} \end{pmatrix} = e^{2x}\left[ c_1 \begin{pmatrix} 1 \\ 0 \\ -1 \end{pmatrix} + c_2 \begin{pmatrix} 0 \\ 1 \\ -1 \end{pmatrix} \right] + e^{4x}\left[ c_3 \begin{pmatrix} 1 \\ 2 \\ -1 \end{pmatrix} \right] = e^{2x} z_1 + e^{4x} z_2,$$
where $z_1$ and $z_2$ represent arbitrary elements of the eigenspaces corresponding to the eigenvalues 2 and 4, respectively.
Markov Chains
If $A$ is a square matrix, the notation $A^k$ refers to the $k$th power of $A$, which, as you would expect, is the product of $A$ with itself $k$ times:
$$A^k = \underbrace{A \cdot A \cdots A \cdot A}_{k \text{ times}}.$$
In Activity 5, we showed how to compute the power of a matrix using diagonalization. In particular, if $A$ is similar to a diagonal matrix $D$, we can compute any power $k$ of $A$ by computing the product
$$A^k = C D^k C^{-1},$$
where $C$ is the matrix whose columns are the components of each vector in the eigenbasis, and $D^k$ is found by raising each diagonal entry of $D$ to the $k$th power.
Theorem 7.3.7. Let $A$ be an $n \times n$ diagonalizable matrix with entries in $\mathbb{R}$. If $C$ is the transition matrix, and if $D$ is the diagonal form with respect to the eigenbasis, then, for any $k$,
$$A^k = C D^k C^{-1},$$
where $C$ is the matrix whose columns are the components of the vectors of the eigenbasis.
Proof. See Exercise 19.
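Before proving Theorem 7.3.7, it can be checked on an example. The matrix below is chosen only for illustration (it is not one of the text's examples, and the sketch is in Python rather than the text's ISETL): an upper triangular matrix with eigenvalues 2 and 3 and eigenvectors (1, 0) and (1, 1).

```python
# Compare A^k computed by repeated multiplication with C D^k C^{-1}
# for a hypothetical diagonalizable matrix A (not from the text).

def mat_mul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_pow(X, k):
    # k-fold product X * X * ... * X, k >= 1.
    P = X
    for _ in range(k - 1):
        P = mat_mul(P, X)
    return P

A = [[2, 1],
     [0, 3]]       # upper triangular, so its eigenvalues are 2 and 3
C = [[1, 1],
     [0, 1]]       # columns: eigenvectors (1, 0) and (1, 1)
C_inv = [[1, -1],
         [0, 1]]   # inverse of C
k = 5
D_k = [[2**k, 0],
       [0, 3**k]]  # D^k: each diagonal entry of D raised to the kth power

direct = mat_pow(A, k)                        # A^5 by repeated multiplication
via_diag = mat_mul(mat_mul(C, D_k), C_inv)    # C D^5 C^{-1}
```

Both computations produce the same matrix, as the theorem predicts; the diagonalized form needs only two matrix multiplications and two scalar powers, no matter how large k is.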
We will apply this theorem in the following example involving Markov
chains. Suppose we have two adjacent cities A and B whose city managers
wish to predict long term trends in the movement of population between the
two cities. Currently, 70% of the people in the two cities live in City A,
while 30% live in City B. In a typical year, 20% of the people in City A move
to City B and 80% of the people remain in City A, while 10% of the people
in City B move to City A, with 90% remaining in City B. If a .5% increase
per year is expected for the two cities combined, what will be the population
in each city in 30 years, if the current combined population is 150,000
people? We can first set up a table of migration data:

            From City A  From City B
To City A       .8           .1
To City B       .2           .9
The initial population distribution is given by
\[
\begin{pmatrix} .7 \\ .3 \end{pmatrix}.
\]
The proportion in City A after the first year will consist of 80% of the original
70% plus 10% of the 30% from City B, that is,
\[
\text{Proportion in City A after 1 year} = .8 \cdot .7 + .1 \cdot .3.
\]
Similarly, the proportion in City B after the first year will consist of 90% of
the original 30% plus 20% of the 70% from City A, that is,
\[
\text{Proportion in City B after 1 year} = .2 \cdot .7 + .9 \cdot .3.
\]
In terms of matrices, we have
\[
\begin{pmatrix} .8 & .1 \\ .2 & .9 \end{pmatrix}
\begin{pmatrix} .7 \\ .3 \end{pmatrix}
= \begin{pmatrix} .8 \cdot .7 + .1 \cdot .3 \\ .2 \cdot .7 + .9 \cdot .3 \end{pmatrix}
= \begin{pmatrix} .59 \\ .41 \end{pmatrix}.
\]
The proportion in City A and City B after year two will be given by
\[
\begin{pmatrix} .8 & .1 \\ .2 & .9 \end{pmatrix}^{2}
\begin{pmatrix} .7 \\ .3 \end{pmatrix}.
\]
Can you explain why? After 30 years, the proportions will be given by
\[
\begin{pmatrix} .8 & .1 \\ .2 & .9 \end{pmatrix}^{30}
\begin{pmatrix} .7 \\ .3 \end{pmatrix}.
\]
Since the matrix
\[
\begin{pmatrix} .8 & .1 \\ .2 & .9 \end{pmatrix}
\]
is diagonalizable, we can compute this product using Theorem 7.3.7. The
eigenvalues are .7 and 1. The set {(1, -1)} is a basis for the eigenspace
corresponding to .7, and {(1, 2)} is a basis for the eigenspace corresponding
to 1. According to Theorem 7.3.7,
\[
\begin{pmatrix} .8 & .1 \\ .2 & .9 \end{pmatrix}^{30}
= \begin{pmatrix} 1 & 1 \\ -1 & 2 \end{pmatrix}
\begin{pmatrix} .7 & 0 \\ 0 & 1 \end{pmatrix}^{30}
\begin{pmatrix} \frac{2}{3} & -\frac{1}{3} \\ \frac{1}{3} & \frac{1}{3} \end{pmatrix}.
\]
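This diagonalization can be verified numerically. In the Python sketch below (the text's code is ISETL), the eigenpairs are checked directly against the migration matrix, and then C D C^{-1} is multiplied out to confirm that it reproduces the matrix:

```python
# Check the eigenpairs of the migration matrix M, then confirm that
# C D C^{-1} reproduces M, where the columns of C are the eigenvectors.

M = [[0.8, 0.1],
     [0.2, 0.9]]

def mat_vec(X, v):
    return tuple(sum(X[i][j] * v[j] for j in range(2)) for i in range(2))

def mat_mul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

v1, lam1 = (1, -1), 0.7   # proposed eigenvector for eigenvalue .7
v2, lam2 = (1, 2), 1.0    # proposed eigenvector for eigenvalue 1

# M v should be (approximately) lambda * v for each pair.
check1 = mat_vec(M, v1)
check2 = mat_vec(M, v2)

C = [[1, 1],
     [-1, 2]]                       # eigenvectors as columns
C_inv = [[2/3, -1/3],
         [1/3, 1/3]]                # inverse of C (det C = 3)
D = [[lam1, 0],
     [0, lam2]]

reconstructed = mat_mul(mat_mul(C, D), C_inv)   # should reproduce M
```

Since C D C^{-1} equals the migration matrix, C D^{30} C^{-1} equals its 30th power.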
Using this equality, what is the proportion matrix after year 30? Using the
growth assumption given at the beginning of the discussion of this problem,
how many people will live in both cities combined after 30 years? How many
will live in City A? How many will live in City B?
Exercises
1. Define T : R^2 → R^2 by
\[
T((x_1, x_2)) = (5x_1 - 3x_2,\ 3x_1 - x_2).
\]
Determine whether T is diagonalizable. If it is, find an eigenbasis,
and find the transition matrix between the coordinate basis and the
eigenbasis. If not, explain why.
2. Define F : R^2 → R^2 by
\[
F((x_1, x_2)) = \begin{pmatrix} 2 & 4 \\ 1 & 4 \end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \end{pmatrix}.
\]
Determine whether F is diagonalizable. If it is, find an eigenbasis,
and find the transition matrix between the coordinate basis and the
eigenbasis. If not, explain why.
3. Define H : R^3 → R^3 by
\[
H((x_1, x_2, x_3)) = \begin{pmatrix} 1 & 0 & 0 \\ 2 & 1 & 2 \\ 2 & 0 & 3 \end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}.
\]
Determine whether H is diagonalizable. If it is, find an eigenbasis,
and find the transition matrix between the coordinate basis and the
eigenbasis. If not, explain why.
4. Define T : R^3 → R^3 by
\[
T((x_1, x_2, x_3)) = \begin{pmatrix} 1 & 2 & 3 \\ 0 & 1 & 2 \\ 1 & 1 & 0 \end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}.
\]
Determine whether T is diagonalizable. If it is, find an eigenbasis,
and find the transition matrix between the coordinate basis and the
eigenbasis. If not, explain why.
5. Define G : R^4 → R^4 by
\[
G((x_1, x_2, x_3, x_4)) = (4x_1 + 2x_2 - 2x_3 + 2x_4,\ x_1 + 3x_2 + x_3 - x_4,\ 2x_3,\ x_1 + x_2 - 3x_3 + 5x_4).
\]
Determine whether G is diagonalizable. If it is, find an eigenbasis,
and find the transition matrix between the coordinate basis and the
eigenbasis. If not, explain why.
6. Provide a proof of the first part of Theorem 7.3.1.
7. Justify each step of the equality given in the proof of Theorem 7.3.2.
8. Theorems 7.3.3, 7.3.4, and 7.3.6 can be combined into a single if and
only if theorem. What is the statement of this theorem?
9. Let P_3(R) be the vector space of polynomials of degree 3 or less. Define
T : P_3(R) → P_3(R) by
\[
T(p) = p'' + p',
\]
where p ∈ P_3(R), p'' is the second derivative of p, and p' is the
first derivative of p. Determine whether T is diagonalizable. If it
is, find an eigenbasis, and find the transition matrix between the basis
{1, x, x^2, x^3} and the eigenbasis. If not, explain why.
10. Prove that if A is a diagonal matrix, then its eigenvalues are the diagonal
elements.
11. Prove that if A is an upper triangular matrix, then its eigenvalues are
the diagonal elements.
12. Prove that λ = 0 is an eigenvalue of a matrix A if and only if A is
singular.
13. Find the general solution of each system of differential equations.
(a) f_1' = f_1 + f_2
    f_2' = 3f_1 - f_2
(b) f_1' = 8f_1 + 10f_2
    f_2' = -5f_1 - 7f_2
(c) f_1' = f_1 + f_3
    f_2' = f_2 + f_3
    f_3' = 2f_3
14. Let T : R^n → R^n be an invertible linear transformation, that is, a
transformation that is both one-to-one and onto. Show that T is
diagonalizable if and only if its inverse T^{-1} : R^n → R^n is diagonalizable.
15. Let P_2(R) be the vector space of polynomials of degree 2 or less. Define
F : P_2(R) → P_2(R) by
\[
F(p) = (p(0) + p(1)) \cdot (x + x^2),
\]
where p ∈ P_2(R), p(0) is the value of the polynomial evaluated at
x = 0, and p(1) is the value of the polynomial evaluated at x = 1.
Determine whether F is diagonalizable. If it is, find an eigenbasis,
and find the transition matrix between the basis {1, x, x^2} and the
eigenbasis. If not, explain why.
16. Let A be a square matrix. A power of A, say A^n, is nothing more than
a matrix product of n copies of A. Use this definition to answer the
following questions regarding various matrix polynomials in A and the
diagonalizability of A.
(a) Show that if a matrix A is diagonalizable and all of its eigenvalues
are either 1 or -1, then A^2 = I.
(b) Show that if a matrix A is diagonalizable and all of its eigenvalues
are either 1 or 0, then A^2 = A.
(c) Show that if a matrix A is diagonalizable and all of its eigenvalues
are either 3 or -5, then A^2 + 2A - 15I = 0.
(d) Can you think of a general statement for which the three previous
exercises are special cases?
17. Prove that if A is diagonalizable with distinct eigenvalues
\lambda_1, \lambda_2, \ldots, \lambda_n, then
\[
|A| = \lambda_1 \lambda_2 \cdots \lambda_n.
\]
18. If A and B are similar matrices, prove that if A is diagonalizable, then
B is diagonalizable.
19. Provide a proof for Theorem 7.3.7.
20. Answer the questions posed at the end of the discussion of the Markov
Chain example.
21. Construct a model of population flows between cities, suburbs, and
nonmetropolitan areas of the U.S. Their respective populations in 1985
were 60 million, 125 million, and 55 million. The matrix giving
probabilities of the moves is
From City From Suburb From Nonmetro
To City .96 .01 .015
To Suburb .03 .98 .005
To Nonmetro .01 .01 .98
Predict the population that will live in each category in 2010, if the
total population is assumed to be 350 million.