
Learning Linear Algebra

with ISETL
Kirk Weller Aaron Montgomery
University of North Texas Central Washington University
Julie Clark Jim Cottrill
Hollins University Illinois State University
Maria Trigueros Ilana Arnon
Instituto Tecnológico Autónomo de México
Centre for Educational
Technology
Ed Dubinsky
RUMEC
Preliminary Version 3
July 31, 2002
© 2002 by Research in Undergraduate Mathematics Education Community
All rights reserved.
Preface
The authors wish to express thanks to Don Muench of St. John Fisher College
(Rochester, NY) whose work with linear algebra and ISETL gave us the basis
for our work. His code was written at Gettysburg College in 1991 with
students there, so we thank Jared Colesh, Ben Papada, Julie Leese, and
Dave Riihimaki.
This work is a collaborative effort both in authorship and its conception.
We acknowledge the assistance of the following members of RUMEC who
have worked with us at various stages of the project:
Broni Czarnocha David DeVries Clare Hemenway
George Litman Sergio Loch Rob Merkovsky
Steve Morics Asuman Oktac Vrunda Prabhu
Keith Schwingendorf
Many students and faculty have used these materials and have helped us
to refine their intent and presentation. Those members of RUMEC who have
implemented some or all of these sections are Ilana Arnon, Julie Clark, Sergio
Loch, Steve Morics, Keith Schwingendorf, and Kirk Weller. Our special
thanks go to the brave faculty who are implementing this approach and its
materials from beyond RUMEC. They and their students will guide us in
taking this preliminary version to its next level.
Robert Acar University of Puerto Rico-Mayagüez
Felix Almendra Arao Unidad Profesional Interdisciplinaria en Ingeniería
y Tecnologías Avanzadas, México
Contents
Preface v
1 Functions and Structures 1
1.1 Introduction to ISETL . . . . . . . . . . . . . . . . . . . . . . 2
Activities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
Getting Started . . . . . . . . . . . . . . . . . . . . . . 8
Simple Objects and Operations . . . . . . . . . . . . . 10
Modular arithmetic. . . . . . . . . . . . . . . . . 10
Variables. . . . . . . . . . . . . . . . . . . . . . 11
Boolean. . . . . . . . . . . . . . . . . . . . . . . 11
Control Statements . . . . . . . . . . . . . . . . . . . . 12
if statements. . . . . . . . . . . . . . . . . . . . 12
for loops. . . . . . . . . . . . . . . . . . . . . . 13
Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
1.2 Structures and Operations . . . . . . . . . . . . . . . . . . . . 19
Activities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
Tuples . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
Sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
Tuple and Set Formers . . . . . . . . . . . . . . . . . . 25
Set Operations . . . . . . . . . . . . . . . . . . . . . . 26
Tuple and Set Operations . . . . . . . . . . . . . . . . 27
Sets of Tuples . . . . . . . . . . . . . . . . . . . . . . . 27
Quantification . . . . . . . . . . . . . . . . . . . . . 28
Modular Arithmetic . . . . . . . . . . . . . . . . . . . 29
Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
1.3 Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
Activities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
Funcs and Their Syntax Options . . . . . . . . . . . . 42
Funcs for Binary Operations . . . . . . . . . . . . . . . 43
Funcs to Test Properties . . . . . . . . . . . . . . . . . 44
Tuples and Smaps . . . . . . . . . . . . . . . . . . . . 45
Procs . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
The Fields Z_p . . . . . . . . . . . . . . . . . . . . . 46
Polynomials and Polynomial Functions . . . . . . . . . 47
Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
2 Vectors and Vector Spaces 51
2.1 Vectors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
Activities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
2.2 Introduction to Vector Spaces . . . . . . . . . . . . . . . . . . 60
Activities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
Finite Vector Spaces . . . . . . . . . . . . . . . . . . . 64
Infinite Vector Spaces . . . . . . . . . . . . . . . . . 67
Non-Tuple Vector Spaces . . . . . . . . . . . . . . . . . 69
Basic Properties of Vector Spaces . . . . . . . . . . . . 70
name vector space . . . . . . . . . . . . . . . . . . . . 71
Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
2.3 Subspaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
Activities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
Determination of Subspaces . . . . . . . . . . . . . . . 76
Non-Tuple Vector Spaces . . . . . . . . . . . . . . . . . 78
Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
3 First Look at Systems 81
3.1 Systems of Equations . . . . . . . . . . . . . . . . . . . . . . . 82
Activities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86
Algebraic Expressions and Linear Equations . . . . . . 86
Forms of Solution Sets . . . . . . . . . . . . . . . . . . 89
Systems of Linear Equations . . . . . . . . . . . . . . . 92
Summarizing the Process for Finding the Solution of a
System of Equations . . . . . . . . . . . . . . 98
Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99
3.2 Solving Systems Using Augmented Matrices . . . . . . . . . . 109
Activities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109
Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113
Using Augmented Matrices . . . . . . . . . . . . . . . . 113
Summarizing the Process for Finding the Solution of
a System of Equations Using an Augmented
Matrix . . . . . . . . . . . . . . . . . . . . . . 120
Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 121
3.3 A Geometric View of Systems . . . . . . . . . . . . . . . . . . 130
Activities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130
Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137
Equations in Two Unknowns . . . . . . . . . . . . . . . 137
Equations in Three Unknowns . . . . . . . . . . . . . . 140
Systems of Three Equations in Three Unknowns . . . . 142
Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143
4 Linearity and Span 147
4.1 Linear Combinations . . . . . . . . . . . . . . . . . . . . . . . 148
Activities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 148
Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 152
The Difference Between a Set and a Sequence . . . . . 152
Forming Linear Combinations . . . . . . . . . . . . . . 153
Simplied Single-Vector Representations . . . . . . . . 154
Geometric Representation . . . . . . . . . . . . . . . . 155
Vectors Generated by a Set of Vectors: Span . . . . . 161
What Vectors Can You Get from Linear Combinations? 162
Non-Tuple Vector Spaces . . . . . . . . . . . . . . . . . 163
Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 164
4.2 Linear Independence . . . . . . . . . . . . . . . . . . . . . . . 168
Activities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 168
Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 172
Definition of Linearly Independent and Linearly Dependent 172
Geometric Interpretation/Generating Sets . . . . . . . 177
Non-Tuple Vector Spaces . . . . . . . . . . . . . . . . . 180
Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 181
4.3 Generating Sets and Linear Independence . . . . . . . . . . . 185
Activities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 185
Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 188
Generating Sets and Their Spans . . . . . . . . . . . . 188
Constructing Linearly Independent Generating Sets . . 190
Properties of Linear Independence and Linear Depen-
dence . . . . . . . . . . . . . . . . . . . . . . 191
Non-Tuple Vector Spaces . . . . . . . . . . . . . . . . . 193
Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 193
4.4 Bases and Dimension . . . . . . . . . . . . . . . . . . . . . . . 197
Activities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 197
Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 200
Summation Notation . . . . . . . . . . . . . . . . . . . 200
Bases . . . . . . . . . . . . . . . . . . . . . . . . . . . . 202
Expansion of a Vector with respect to a Basis. . . . . . 205
Representation of a vector space as K^n . . . . . . 205
Finding a Basis . . . . . . . . . . . . . . . . . . . . . . 206
Finite dimensional vector spaces. . . . . . . . . 206
Characterizations of bases. . . . . . . . . . . . . 207
Dimension . . . . . . . . . . . . . . . . . . . . . . . . . 207
Dimensions of Euclidean spaces. . . . . . . . . . 210
Non-Tuple Vector Spaces . . . . . . . . . . . . . . . . . 211
Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 213
5 Linear Transformations 217
5.1 Introduction to Linear Transformations . . . . . . . . . . . . . 218
Activities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 218
Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 222
Functions between Vector Spaces . . . . . . . . . . . . 222
Definition and Significance of Linear Transformations . 223
Component Functions and Linear Transformations . . . 230
Non-Tuple Vector Spaces . . . . . . . . . . . . . . . . . 232
Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 233
5.2 Kernel and Range . . . . . . . . . . . . . . . . . . . . . . . . . 238
Activities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 238
Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 240
The Kernel of a Linear Transformation . . . . . . . . . 240
The Image Space of a Linear Transformation . . . . . . 244
Bases for the Kernel and Image Space . . . . . . . . . 245
The General Form of a System of Linear Equations . . 252
Non-Tuple Vector Spaces . . . . . . . . . . . . . . . . . 256
Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 256
5.3 New Constructions from Old . . . . . . . . . . . . . . . . . . . 263
Activities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 263
Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 268
Scalar Multiple of a Linear Transformation . . . . . . . 268
The Sum of Two Linear Transformations . . . . . . . . 270
Equality of Linear Transformations . . . . . . . . . . . 271
A Set of Linear Transformations as a Vector Space . . 271
Creating New Linear Transformations . . . . . . . . . . 272
Compositions of Linear Transformations . . . . . . . . 273
Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 274
6 Systems, Transformations and Matrices 281
6.1 Vector Spaces of Matrices . . . . . . . . . . . . . . . . . . . . 282
Activities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 282
Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 284
Vector Spaces of Matrices . . . . . . . . . . . . . . . . 284
Subspaces of Matrices . . . . . . . . . . . . . . . . . . 286
Summation Notation . . . . . . . . . . . . . . . . . . . 287
Dimensions of Matrix Vector Spaces . . . . . . . . . . . 288
Linear Transformations of Matrices . . . . . . . . . . . 289
Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 290
6.2 Transformations and Matrices . . . . . . . . . . . . . . . . . . 293
Activities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 293
Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 298
The Rank of a Matrix . . . . . . . . . . . . . . . . . . 298
The Matrix of a Linear Transformation . . . . . . . . . 301
Properties of Matrix Representations . . . . . . . . . . 303
Retrospection . . . . . . . . . . . . . . . . . . . . . . . 305
Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 305
6.3 Matrix Multiplication . . . . . . . . . . . . . . . . . . . . . . . 311
Activities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 311
Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 315
Matrix Multiplication . . . . . . . . . . . . . . . . . . . 315
Multiplication as Composition . . . . . . . . . . . . . . 318
Invertible Matrices and Change of Bases . . . . . . . . 320
Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 325
6.4 Determinants . . . . . . . . . . . . . . . . . . . . . . . . . . . 331
Activities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 331
Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 331
Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 333
7 Getting to Second Bases 335
7.1 Change of Basis . . . . . . . . . . . . . . . . . . . . . . . . . . 336
Activities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 336
Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 339
Coordinate Vectors . . . . . . . . . . . . . . . . . . . . 339
Alias and alibi. . . . . . . . . . . . . . . . . . . 343
Matrix Representations . . . . . . . . . . . . . . . . . . 344
Matrices with Special Forms . . . . . . . . . . . . . . . 347
Triangular matrices. . . . . . . . . . . . . . . . . 348
Diagonal matrices. . . . . . . . . . . . . . . . . 349
Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 352
7.2 Eigenvalues and Eigenvectors . . . . . . . . . . . . . . . . . . 357
Activities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 357
Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 358
Basic Ideas . . . . . . . . . . . . . . . . . . . . . . . . 358
Bases of Eigenvectors . . . . . . . . . . . . . . . . . . . 360
What Can Happen? . . . . . . . . . . . . . . . . . . . 364
Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 365
7.3 Diagonalization and Applications . . . . . . . . . . . . . . . . 370
Activities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 370
Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 373
Relationship between Diagonalizability and Eigenvalues 373
Conditions that Guarantee Diagonalizability . . . . . . 375
A Procedure for Diagonalizing a Transformation . . . . 379
Using Diagonalization to Solve a System of Differential
Equations . . . . . . . . . . . . . . . . . . . . 381
Markov Chains . . . . . . . . . . . . . . . . . . . . . . 384
Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 386
Chapter 1
Functions and Structures
ISETL is a Mathematical Programming Language.
Before you run away, the emphasis is on the
Mathematical aspects. ISETL is a tool for
constructing mathematics using a computer.
However, it is a language for programming the
computer, so you will need to learn some
commands and syntax. This chapter assumes no
prior knowledge of ISETL or of programming. It
gets you started in a gentle manner and sets you
on your way to learn linear algebra. Rest assured
that you will be learning plenty of linear algebra
in this chapter, too.
1.1 Introduction to ISETL
Activities
1. Use the documentation provided for your computer to make sure that
you can answer the following questions.
(a) How do you turn the computer on?
(b) How do you turn the computer off?
(c) How do you enter information? From the keyboard? A mouse?
Disks?
(d) How do you move around the screen? With keys? A mouse?
(e) How do you make files, and how are they organized?
(f) How do you save, back-up or discard files?
2. Use the documentation provided for ISETL to make sure that you can
answer the following questions.
(a) How do you start an ISETL session?
(b) How do you end an ISETL session?
(c) How do you enter information to the system by typing directly?
(d) How do you transfer information from a le to the system?
(e) How do you make changes, correct errors, add or delete material?
(f) How do you save the work that you do in a session?
(g) How do you print from a file or windows?
3. Following is data that can appear on the screen when you are in ISETL.
The code on a line beginning with a > or >> prompt must be entered by
you. (The symbols > or >> are ISETL prompts. You do not enter the
prompt; ISETL will provide it for you.) The end of such a line indicates
that you should press Return or Enter. The other lines are put on
the screen by ISETL.
Start ISETL and operate interactively to enter the appropriate lines
and obtain the indicated responses (note that the number of decimal
places may vary).
>
> $NUMBERS
>
> 7 + 18;
25;
> 13 * (-233.8);
-3039.400;
> 170
>> + 237 - 460
>> * 2
>> ;
-513;
> 3 + 2; 2 +
5;
>> 1;
3;
> 3 + $this is a comment
>> 2;
5;
> 9/3;
3.000;
> 9/4;
2.250;
> 9/0;
!Error: Divide by zero
> 3 ** 2;
9
> 3**(1.2);
3.737;
> 9**(1/2);
3.000;
> (-1)**(1/2);
OM;
>
> $VARIABLES
>
> x := 2;
> x;
2;
> X;
OM;
> a := 1; b := 2; a := b; b := a;
> a; b;
2;
2;
>
> $BOOLEANS
>
> 6 = 2 * 3;
true;
> 5 >= 2* 3;
false;
> is_integer(3/2);
false;
> is_integer(9/3);
true;
> is_integer(4.00);
true;
> x := 2;
> x < 3 and x > 1;
true;
> x > 2 and x < 2;
false;
> x < 0 or x > 1;
true;
> x < 1 impl x > 3;
true;
> not true and false;
false;
> not (true and false);
true;
>
> $IF
>
> x := 2;
> if x > 2 then
>> write "x is larger than 2";
>> else
>> write "x is 2 or smaller";
>> end if;
x is 2 or smaller
> x := 4;
> if x > 2 then
>> write "x is larger than 2";
>> elseif x > 3 then
>> write "x is larger than 3";
>> end if;
x is larger than 2
> if x > 2 then
>> write "x is in (2, infinity)";
>> elseif x > 1 then
>> write "x is in (1, 2]";
>> elseif x > 0 then
>> write "x is in (0, 1]";
>> else
>> write "x is in (-infinity, 0]";
>> end if;
x is in (2, infinity)
> x := 1;
> if x > 2 then
>> write "x is in (2, infinity)";
>> elseif x > 1 then
>> write "x is in (1, 2]";
>> elseif x > 0 then
>> write "x is in (0, 1]";
>> else
>> write "x is in (-infinity, 0]";
>> end if;
x is in (0, 1]
> $FOR LOOPS
> S := {2..6};
> y := 1;
> for x in S do
>> y := x * y;
>> end for;
> y;
720;
> S := {1..3};
> a := 0;
> for x, y in S do
>> a := a + x + y;
>> end for;
> a;
36;
4. In parts (a)-(g), you will be asked to work with code involving modular
arithmetic. Modular arithmetic will be used throughout this course.
(a) Following is a list of items for you to enter into ISETL. Before
entering, guess and write down what the response in ISETL will
be. In any case where the response is different from what you
predicted, try to understand why.
> 5 mod 5;
> 7 mod 5;
> 7 mod 7;
> 3 mod 5;
> (2 + 3) mod 5;
> -2 mod 5;
(b) Do you understand the meaning of the operation mod? Write an
explanation of this operation.
(c) There are several ways to visualize the operation mod. We shall
start with one method of visualization, and present another later
in the chapter.
To draw numbers mod 7 (here 7 is the right operand) draw a
number line and mark 0 and consecutive multiples of 7 (both
negative and positive, on both sides of 0) as long as your page
permits. Leave reasonable and (approximately) equal distances
between every two consecutive multiples. (Why equal distances?)
Choose any integer n (try first an integer which is not a multiple
of 7), which is in the range of your number line.
Draw your integer approximately on the line. How did you know
where to locate n: Between which two multiples of 7? Nearer to
which of the two?
Shift your integer by multiples of 7 until it reaches within the in-
terval [0, 7). The number corresponding to this location of your
shifted integer n is n mod 7.

[Figure 1.1: A number line for mod 7, marked at the multiples -21, -14, -7, 0, 7, 14, 21, showing n = 20 shifted to 6 = 20 mod 7.]

According to the drawing in Figure 1.1, 20 mod 7 = 6. Check this
answer in ISETL.
(d) Choose three more integers n_1, n_2, n_3, among them a multiple of
7 and a negative. Locate them also on the number line, shift, and
find n_i mod 7. Check your answers in ISETL.
(e) Following is another list of items for you to enter into ISETL.
Before entering, use number lines to predict and write down what
the response in ISETL will be. In any case where the response is
different from what you predicted, try to understand why.
(f) > 21 mod 7;
> (3 + 4) mod 7;
> 3 = -4 mod 7;
> (1 + 6) mod 7;
> 1 = -6 mod 7;
> (4 * 2) mod 7;
> (2 + 5) mod 6;
> (2 * 5) mod 6;
> 2/3 mod 6;
> 2 * 1 = (2 * 4) mod 6;
> 2 * 4 = (2 * 1) mod 6;
> (2 * 4) mod 6 = (2 * 1) mod 6;
(g) What did your number line look like for the last set of ISETL code
given in (f)? Why?
5. Following again is a list of items for you to enter into ISETL. Before
entering, predict and write down what the response in ISETL will be.
In any case where the response is different from what you predicted,
try to understand why.
It may be more convenient for you to work with a file and to copy items
to the screen as you enter them. The specific instructions for doing this
will vary with the system.
(a) > b := 10;
> b;
> b + 20;
> b := b - 4; b; B;
(b) > (2 /= 3) and ((5.2/3.1) > 0.9);
> (3 <= 3) impl (3 = 2 + 1);
> (3 <= 3) impl (not (3 = 2 + 1));
> (3 > 3) impl (3 = 2 + 1);
> (3 > 3) impl (not (3 = 2 + 1));
(c) > 7 mod 4; 11 mod 4; -1 mod 4;
> (23 + 17) mod 3;
(d) > a := 0; b := 1; c := 2; d := 3;
> a := d; b := c; c := b; d := a;
> a; b; c; d;
(e) > is_integer(5); is_integer(-13);
> is_integer(6/4); is_integer(6/3);
Discussion
Getting Started
Interaction with ISETL through the execution window follows a simple pat-
tern:
ISETL provides you with a prompt (>);
You type and edit a line, and end by pressing Return or Enter. (Once
you type Return or Enter, you cannot edit the line any further. If
you attempt to do so, ISETL will not recognize the changes.);
ISETL reads the line and attempts to execute any complete statements
that it nds (complete statements end with semi-colons);
If you have an incomplete statement and press Return or Enter,
ISETL provides you with a double prompt (>>) with which to continue
your statement;
If at the end of the line, you have produced something which cannot
be the start of a correct statement, ISETL returns some funny words
and tosses out all of your input back to the last complete statement it
executed;
The cycle starts again.
Since statements may be long and you will lose all of the intermediate
work if you make an error on a line, it is a good idea to first type the intended
statements somewhere other than the execution window. ISETL allows you
to open plain text windows for this purpose. The method for transferring
code from a plain text window to the execution window is dependent upon
the system.
Another reason to keep your work in a separate window is that ISETL
deletes the contents of the Execution Window when it gets too long. In
fact, ISETL may not provide you with any warning until after it has done
this. In order to retain your work, you will need to make it a habit to save
the contents of your Execution Window at regular intervals. Furthermore,
you will need to save those contents to a different file each time ISETL has
truncated the text (or you will overwrite the file containing the old text).
Note that while it tosses out the contents of the execution window, it does
not remove the effects of those contents. For example, if you set b to 3 early
in the session and that line is discarded by ISETL, then b will remain 3 even
after the window has been truncated.
Other than commands, there are two different types of instructions that
you can give to ISETL: directives and comments. Directives adjust the ISETL
environment and follow a slightly different set of rules than commands. Di-
rectives always start at the first ! on the line and continue until the end of
the line. Everything on the line after the directive will be discarded. Com-
ments are ignored by ISETL. They start at the first $ on the line and continue
until the end of the line.
Simple Objects and Operations
The first simple object ISETL supports is the symbol OM which means that
the result of the computation is undefined.
ISETL supports a number of different types of objects, as well as operators
which act on those objects. The most common object for you to manipulate
in ISETL will probably be numbers. There are three different types of num-
bers that ISETL deals with: integers, fractions, and decimals. The special
symbol OM means that the object is undened.
As you would expect, the symbols in ISETL for addition and subtraction
are + and -, respectively. In ISETL, multiplication is represented with a
*. Placing two numbers next to each other without the * is not supported
(you will get an error message about a Bad Mapping or OM). Division is in-
dicated by the slash symbol (/). ISETL also supports exponentiation using
the exponential operator (**).
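For instance, a few lines of our own (the particular numbers are arbitrary) show these operators at work:
> 6 - 2 * 3;
0;
> 7 / 2;
3.500;
> 2 ** 5;
32;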
Modular arithmetic. One arithmetic operator which you may not have
been introduced to before is the mod operator. In Activity 4, you were shown
how to use number lines to elaborate mod expressions. You may have realized
that to elaborate an n mod k expression you need a number line with k-
multiples. After locating n between the appropriate k-multiples, you shift n
by multiples of k to the interval [0, k). n mod k equals the value of the new
location of n following the shift into [0, k).
Is it possible to find n mod k on the number line without shifting n to
the interval [0, k)? Just by locating n between the appropriate k-multiples?
Try out some examples of your own to answer this question. Use ISETL to
check your answers.
You may have found out that mod corresponds to the distance between
n and the nearest multiple of k to its left. Namely, mod reduces its left
operand to the remainder upon division by its right operand: to find 20 mod 7
you located 20 between 14 and 21. You can represent 20 as 20 = 2·7 + 6,
hence 20 mod 7 = 6. This is the general interpretation of mod: n mod k = r
if and only if there exists an integer a and an integer r, 0 ≤ r < k, such that
n = a·k + r. Check both the graphical and arithmetic descriptions with
examples of your own. Use ISETL to verify your calculations.
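For example, here is a quick check of our own of the arithmetic description; the numbers 38 and -2 are arbitrary choices:
> 38 mod 7;
3;
> 38 = 5 * 7 + 3;
true;
> -2 mod 7;
5;
> -2 = (-1) * 7 + 5;
true;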
Variables. In Activity 5, you worked on the code:
> a := 0; b := 1; c := 2; d := 3;
> a := d; b := c; c := b; d := a;
> a; b; c; d;
Were you surprised by the result of this code? Do you think that this result is
what the programmer had in mind? Can you guess what was intended? How
would you make it right? ISETL objects may be named either literally (e.g.,
using the symbol 22 to refer to the number twenty-two) or using variables.
A variable is a sequence of letters, digits, underscores (_), carets (^) and
primes (') which begins with a letter and is not a reserved word in ISETL.
Upper and lower case letters are considered different when determining a
variable's value, and so drat and DrAt refer to different objects. Any object
can be assigned (or bound) to a variable and this is done in ISETL with the
assignment operator (:=). Once this is done, the value of the variable will
equal the value of the object. The value of the variable is determined at the
time the variable is assigned and will remain until the variable is explicitly
reassigned.
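If the programmer in Activity 5 intended to exchange the values of a and d and of b and c, one possible repair (a sketch of our own, using a temporary variable that we have named temp) is:
> a := 0; b := 1; c := 2; d := 3;
> temp := a; a := d; d := temp;   $ exchange a and d
> temp := b; b := c; c := temp;   $ exchange b and c
> a; b; c; d;
3;
2;
1;
0;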
Boolean. ISETL also supports boolean values (true and false). One
means of generating a boolean value is the use of comparison
operators on numbers. These are summarized in the table below:
operator meaning
= equality
/= inequality
< strictly less than
<= less than or equal
> strictly greater than
>= greater than or equal
Boolean values also have their own operators: and, or, not and impl. The
first three are fairly clear in meaning. The last one, impl, requires some
explanation. The implication operator is false only when a true statement is
said to imply a false statement. Based on this, a false statement is said to
imply either a true statement or a false statement. The other feature of the
impl operator is that it completely evaluates both expressions before testing
the implication. This means that a statement like
x < 1 impl x > 3
is only tested for the particular value of x at the time of the test. In this
case, if x were the value 2, then the truth value of the statement above would
be true. As a result, the ISETL statement does not have the same meaning
as the statement if x < 1 then x > 3, where x is implicitly assumed to be
any possible number. The impl operator is most useful where the variable
ranges over all possible values of a particular set. The code to do this will
be discussed in the next section.
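A short experiment of our own makes the point that impl is evaluated with whatever value x happens to have at that moment:
> x := 2;
> x < 1 impl x > 3;
true;
> x := 0;
> x < 1 impl x > 3;
false;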
Control Statements
Control statements allow you to direct activities in ISETL. Here we will learn
to work with if and for statements.
if statements. An if statement consists of a sequence of branches. Each
branch consists of a condition and a statement block. ISETL will test each
condition until it nds one which is true, and it will execute the statement
block associated with that condition. The rst branch is indicated with an
if condition then construction, and later branches are indicated with an
elseif condition then construction. A final branch can be indicated by an
else construction, and it will be executed if no other branch is executed.
The order of the branches may be important since only the block asso-
ciated to the first matching (true) condition will be executed. This means
that while the following is a valid statement in ISETL, it is unlikely that it
does what the writer intended:
> if x > 2 then
>> write "x is larger than 2";
>> elseif x > 3 then
>> write "x is larger than 3";
>> end if;
A more complex example was given in Activity 3:
> if x > 2 then
>> write "x is in (2, infinity)";
>> elseif x > 1 then
>> write "x is in (1, 2]";
>> elseif x > 0 then
>> write "x is in (0, 1]";
>> else
>> write "x is in (-infinity, 0]";
>> end if;
The general structure of an if statement code is:
> if (boolean expression) then
>> (statements)
>> elseif (boolean expression) then
>> (statements)
>> end if;
for loops. A for loop is used to repeat execution of a list of statements a
fixed number of times. A for loop begins with the keyword for followed by
an iterator. An iterator is a domain specification of one or more variables in
a tuple or set. (You will see how tuples and sets work in the next section.)
In Activity 3 you elaborated the code
> S := {1..3};
> a := 0;
> for x, y in S do
>> a := a + x + y;
>> end for;
> a;
In this example, when the two variables are iterated over the set S, ISETL
selects a value of x and then iterates over all of the values for y. Then a second
value for x is selected and ISETL iterates again over all values for y. This is
repeated for each value of x. One possible order of selection might produce
the following solution:
a = 0 + 1 + 1 = 2
a = 2 + 1 + 2 = 5
a = 5 + 1 + 3 = 9
a = 9 + 2 + 1 = 12
a = 12 + 2 + 2 = 16
a = 16 + 2 + 3 = 21
a = 21 + 3 + 1 = 25
a = 25 + 3 + 2 = 30
a = 30 + 3 + 3 = 36.
Hence the output of this code is 36. After the iterator, the for loop has the
keyword do, followed by a list of commands (the last example consists of but
one command: a := a + x + y).
A for loop is completed by the keyword end for.
The general structure of a for loop code is:
> for (variables) in (tuple or set) do
>> (statements)
>> end for;
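Finally, for contrast, here is a small loop of our own over a tuple rather than a set. Because the components of a tuple come in a fixed order, the result is completely predictable; it would not be if T were a set:
> T := [5, 1, 3];
> s := 0;
> for x in T do
>> s := 10 * s + x;
>> end for;
> s;
513;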
Exercises
1. In your own words, write out explanations for each of the following
terms. Note that anything here which is in typewriter font is con-
sidered to be an ISETL keyword.
(a) prompt
(b) true
(c) om
(d) mod
(e) boolean
(f) if statement
(g) for loop
(h) input
(i) objects
(j) operations
(k) ;
(l) impl
2. Read the following code, and follow the instructions and/or answer the
questions listed after the code.
rp := om;
x := 12;
y := 18;
if is_integer(x) and is_integer(y) and x > 0 and y > 0 then
    rp := true;
    for i in [2..min(x, y)] do
        if (x mod i = 0) and (y mod i = 0) then
            rp := false;
        end;
    end;
end;
x; y; rp;
(a) Run the code several times, with different initial values for x, y.
(b) In your own words, write out an explanation of what this code
does. In particular, explain how the code gets data to work on,
what it does with the data, and what is the meaning of the result.
(c) Place this code in an external file. Exit ISETL and then re-enter
ISETL to run this code without retyping it.
(d) What does it mean to say that this code tests its input?
(e) Suppose that you run this code and that the value of y you enter
is always twice the value of x. Can you be sure of what the value
of rp will be? Why?
(f) Add a statement to the code that will display a meaningful an-
nouncement about the result.
(g) List some relationships between the values of x and y for which
you can always be sure of the value of rp at the end.
(h) Suppose that values a, b for x, y result in rp having the value
true. Suppose that this is still the case for values b, c for x, y.
What will happen if you give x, y the values a, c?
3. Look at each of the following sets of ISETL code. Predict what will be
the result if the code is entered. Then enter it and note if you were
right or wrong. In either case, explain why.
12 div 4; 12 div 5; 12 div -5; -12 div 5; -12 div -4;
12 div 0;
12 mod 4; 9 mod 4; 6 mod 4; 3 mod 4; -2 mod 4;
-4 mod 4; -7 mod 4;
2 = 3; (4 + 5) /= -123; (12 mod 4) >= (12 div 4);
even(2**14); odd(187965*45);
max(-27, 27); min(-27, max(27, -27));
abs(min(-10, 12) - max(-10, 12));
4. Write a for loop which adds up all the numbers between 1 and 100.
5. Find the value of x mod 6 for x = -7, -6, ... 6, 7.
6. Describe all possible integer values of a for which a mod 6 = 0.
7. Describe all possible integer values of b for which b mod 6 = 4.
8. The following is another visualization of the operator mod. Choose an
integer k between 5 and 20. Like a number line with k-multiples, the
following drawing will serve as a representation of mod k.
Draw a circle and mark on it k arbitrary points, in (approximately)
equal distances. Write next to one of the points, outside the circle, the
number 0. Continue labeling the other points, moving in a constant
direction (clockwise or counter-clockwise), writing each consecutive in-
teger in its turn next to the following point. After k steps you will
write the number k next to the point 0, behind it. An example of a
drawing for mod k, with k = 7, is given in Figure 1.2.
[Figure 1.2: A circular representation of mod 7, with seven points labeled 0 through 6 and the number 7 written behind the point labeled 0.]
Continue writing numbers for at least two tours around the circle. Can
you describe the numbers that accumulate behind a specic point?
Could you continue the list of numbers behind a particular point with-
out moving all the way around the circle? Could you write a mathe-
matical (algebraic) description of the set of the numbers for a specific
point?
Choose a point on your circle, and write an ISETL code which constructs
the set of the numbers of this point up to 100. Compare the output of
this code with the numbers you have written in your drawing.
9. Look at each of the following sets of ISETL code. Predict what will be
the result if the code is entered. Then enter it and note if you were
right or wrong. In either case, explain why.
(a) n := 58; (n div 3) * 3 + n mod 3;
(b) is_integer(-1020.0) and is_integer(-1020);
(c) true impl true; true impl false; false impl true;
false impl false;
(d) (23 + 5) mod 7 = 0 impl 7 mod 7 = 1;
(e) (1 + 6) mod 7 = 0 impl (1 + 5) mod 6 = 0;
(f) (23 + 5) mod 7 = 1 impl -70 mod 7 = 1;
(g) (2 + 3) mod 5 = 0 and (2 * 3) mod 5 = 1;
10. Write ISETL code that will run through all of the integers from 1 to 50
and, each time the integer is even, will write out its square.
11. Change your code in the previous problem so that instead of even inte-
gers, it will write out the square each time the integer gives a remainder
of 3 when divided by 7.
12. Use ISETL to determine the larger of the fractions 2/3 + 8/9 and
4/5 + 6/7. Do you see a pattern in the choice of the four fractions?
Run several variations of the pattern and see if you can find a general
rule.
1.2 Structures and Operations
Activities
1. Following is a list of items for you to enter into ISETL. Before entering,
guess and write down what the response of ISETL will be. In case the
response is different from your prediction, try to understand why.
T1 := [0..19]; T1;
T2 := [0, 2..19]; T2;
T3 := [2, 8..21]; T3;
T4 := [3, -5, 1];
T5 :=[2**i + 1 : i in [0..4]]; T5;
T1(5); T2(5); T3(5);
T4(3); T4(1); T5(1);
T3(8);
#T1; #T2; #T3; #T4; #T5;
U:=[1, 2, T2, 3 < 2, [3.5, -100]];
U(7); #U; U(5); U(5)(2);
2 in U; false in U; -100 in U; -100 in U(5);
Z20 := {0..19};
T1; T1; T1; T1;
Z20; Z20; Z20; Z20;
T1(5); Z20(5);
E := [2, 1 > 2, [1, 2]]; E1 := [1 > 2, 2, [1, 2]];
E2 := [2, 1 > 2, [1, 2], 2];
E = E1; E = E2; E1 = E2;
F := {2, 1 > 2, [1, 2]}; F1 := {1 > 2, 2, [1, 2]};
F2 := {2, 1 > 2, [1, 2], 2};
F = F1; F = F2; F1 = F2;
N := {0, 1, {0, 1}}; R := [0, 1, {0, 1}];
N = R;
B := [1, 2, 1, 3, 1, 4];
C := {1, 2, 1, 3, 1, 4};
#B; #C; B; B; B; C; C; C;
K1 := {1, 3, 2} with 5;
#K1; K1;
L1 := [1, 3, 2] with 5;
#L1; L1;
K2 := {1, 3, 2} with 3;
#K2; K2; K2; K2;
L2 := [1, 3, 2] with 3;
#L2; L2; L2; L2;
2. Write a few paragraphs describing your experience with Activity 1.
What did you predict? What happened? How do you explain what
ISETL did? Rather than just reporting events in chronological order,
try to organize your description in some logical order and suggest gener-
alizations. Include a description of the main dierences between using
[..] (which constructs a tuple or sequence) and .. (which constructs
a set).
3. Write out a verbal explanation of the result of giving each of the fol-
lowing input lines to ISETL.
p := [1, 1, 0]; q := [1, 0, 1];
r := [(p(i) + q(i)) mod 2 : i in [1..3]]; r;
s := [1, 2, 0];
s1 := [(3 * s(i)) mod 5 : i in [1..3]]; s1;
G := {[a, b, c] : a, b, c in [0, 2, 4]};
H := {[a, b, c] : a, b, c in [1, 3]};
K := {[a, b, c] : a, b, c in [1..3]};
H union K; H union G; K union G;
K inter H; H inter G;
H subset K; G subset H;
Z20 := {0..19};
L := {g * h : g, h in Z20 | even(g) and h < 10};
L1 := {g * h : g, h in Z20 | even(g)};
L1 subset L; L subset L1;
Z20 - {0}; 0 in Z20; 0 in Z20 - {0};
S := pow({0, 1, 2, 3}); {0, 1} in S; {} in S;
arb(Z20); arb(Z20); arb(Z20); arb(Z20);
%+[1..10]; %*[1..6]; %or[2=1, 2=2, 2=3, 2=4];
%+{1..10}; %*{1..6}; %or{2=1, 2=2, 2=3, 2=4};
4. (a) Let Z2_3 be the set of all the 3-tuples (tuples with 3 elements)
with elements in {0, 1}. How many elements are there in Z2_3?
(b) Write ISETL code that constructs Z2_3, and use it to check your
conjecture.
5. Write out a verbal explanation of the result of giving each of the fol-
lowing input lines to ISETL.
Z20 := {0..19};
Z2_3 := {[a,b,c] : a,b,c in [0,1]}; Z2_3;
forall x in Z20 | (x + 0) mod 20 = x;
forall x in Z20 | (x + 3) mod 20 = x;
exists p in Z2_3 | p(1) < p(2);
exists p in Z2_3 | p(1) = p(2);
exists e in Z20 | (forall g in Z20 | (e + g) mod 20 = g);
forall g in Z20 | (exists g' in Z20 |
(g + g') mod 20 = 0);
forall p, q in Z2_3 |
[(p(i) + q(i)) mod 2 : i in [1..3]] in Z2_3;
choose e in Z20 | (forall g in Z20 | (e + g) mod 20 = g);
e := choose x in Z20 |
(forall g in Z20 | (x + g) mod 20 = g); e;
6. Write out a verbal explanation of the result of giving each of the fol-
lowing input lines to ISETL.
Z5 := {a mod 5 : a in [-30..50]};
A := {a mod 5 : a in [-100..100]};
#Z5; #A; A = Z5;
C := {c : c in Z5 | (exists d in Z5 | (c + d) mod 5 = 0)};
C;
G := {g : g in Z5 | (exists d in Z5 | (g * d) mod 5 = 1)};
G;
C = Z5; G = Z5; G = Z5 - {0}; #G;
forall a in Z5 | (exists d in Z5 | (a + d) mod 5=0);
forall a in Z5 | (exists d in Z5 | (a * d) mod 5=1);
Z6 := {a mod 6 : a in [-100..100]}; Z6; #Z6;
M := {m : m in Z6 | (exists d in Z6 | (m + d) mod 6 = 0)};
M;
N := {n : n in Z6 | (exists d in Z6 | (n * d) mod 6 = 1)};
N;
M = Z6; N = Z6; N = Z6 - {0};
forall a in Z6 | (exists d in Z6 | (a + d) mod 6 = 0);
forall a in Z6 | (exists d in Z6 | (a * d) mod 6 = 1);
Z7 := {g mod 7 : g in [-50..50]}; Z7;
K := {(5 * g) mod 7 : g in Z7}; K;
H := {(2 * g) mod 7 : g in Z7}; H;
Z5 := {g mod 5 : g in [-50..50]}; Z5;
K1 := {(3 * g) mod 5 : g in Z5}; K1;
H1 := {(2 * g) mod 5 : g in Z5}; H1;
Z20 := {g mod 20 : g in [-50..50]}; Z20;
K2 := {(5 * g) mod 20 : g in Z20}; K2;
H2 := {(2 * g) mod 20 : g in Z20}; H2;
Z6 := {0..5};
K3 := {(5 * g) mod 6 : g in Z6}; K3;
H3 := {(4 * g) mod 6 : g in Z6}; H3;
7. Write ISETL code that will construct the following sets. Run your code
to check that it is correct.
(a) The set of all integers from 1 to 1000 whose squares mod 20 are
greater than 14.
(b) The set Z2_4 of all 4-tuples (tuples with four elements) with entries
from Z2.
(c) The set of all sums of the tuple p with the tuple q where p, q run
through all elements of Z2_3.
(d) The set of all elements of the form [[x, y], (x + y) mod 6]
where x, y run through all the elements of Z6.
8. Write ISETL code that will test the truth or falsity of the following
statements. Run your code to check that it is correct.
(a) Every element of Z20 is even.
(b) Every element of Z2_3 is a tuple.
(c) Some element of Z20 is a tuple.
(d) Some elements of Z20 are odd.
(e) The product mod 20 of every pair of elements of Z20 - {0} is
again in Z20 - {0}.
(f) Every element of Z20 has a corresponding element which when
added to it mod 20 gives the result 0.
(g) There is an element of Z20 which when added to any element of
Z20 does not change it.
Discussion
Tuples
The ISETL object called tuple is used to represent a finite sequence. In Activ-
ity 1, the code for T1 yields the sequence of whole numbers from 0 to 19.
Similarly, [-30..50] would yield the sequence of consecutive integers whose
first term is -30 and whose last term is 50. In general, any tuple given
by [a..b], where a and b are integers and b > a, will yield a consecutive
sequence of integers that begins with a and ends with b. What happens if
b < a?
T2 in Activity 1 differs from T1 in that the difference between successive
terms is 2. Upon receiving input such as [4, 7..15], ISETL constructs the
sequence [4, 7, 10, 13]. In general, if a < b < c, with a, b, and c integers,
the ISETL tuple [a, b..c] returns an arithmetic sequence with common
difference b - a whose first two terms are a and b and whose last term does
not exceed c. What does ISETL return if a < b < c does not hold?
How would a decreasing sequence be constructed? Non-arithmetic se-
quences can also be constructed by simply listing all of the elements (T4 in
Activity 1 is an example) or by using a formula (T5 in Activity 1 is an exam-
ple). Of course, arithmetic sequences can also be expressed in either of these
ways.
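As a small illustration of our own, the arithmetic sequence 4, 7, 10, 13 can be written in any of the three styles just described:
> [4, 7..15];
[4, 7, 10, 13];
> [4, 7, 10, 13];
[4, 7, 10, 13];
> [3 * i + 1 : i in [1..4]];
[4, 7, 10, 13];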
The components of a tuple do not have to be integers, or even numbers.
They can be any ISETL objects, including other tuples. U in Activity 1 is
such an example: The first term of this sequence is 1, the second is 2, the
third is the tuple T2, the fourth is the proposition 3 < 2, and the fifth is the
tuple whose elements are the numbers 3.5 and -100.
One of the most important facts about tuples is that their elements come
in a fixed, definite order. Each time you evaluate a tuple, you get the same
sequence in the same order. This should not be surprising, since a sequence
is a function whose domain is the set of integers: the output corresponding
to the integer 1 is the rst element of the sequence, the output corresponding
to the integer 2 is the second element of the sequence, and so on. Since tuples
retain their original ordering, it is possible to access specified components of
a tuple. As you discovered in Activity 1, the value of the expression T1(5) is
the value of the fifth component of the tuple T1. What would be the value
of the seventh component of U?
Sets
The ISETL object set is exactly the same as a finite set in mathematics. The
elements of a set can be any ISETL objects, including other sets, as in the
set N of Activity 1. This includes the empty set {}, as in {1, {}, {1, 2}}.
Sets can also have tuples for elements, as in Z2_3 of Activities 4 and 5. Sets
can also be elements of tuples. Can you construct such an example?
As with tuples, a code of the form {a..b}; returns the set of consec-
utive integers starting with a and ending with b, provided that a and b are
integers and a < b. If b < a, then ISETL will return the empty set. Similarly,
if a, b and c are integers and a < b < c, the set {a, b..c}; will return a and
b, with subsequent elements obtained by adding the constant difference b - a,
such that no term exceeds c. What elements are returned if the condition
a < b < c does not hold?
Sets and tuples differ in two important ways. In a set, order and repetition
do not matter, while the opposite is true for tuples. For instance, the set
{1, 2, 3} is equal to any set consisting of any permutation of the elements
1, 2, and 3. For example, {1, 2, 3} = {2, 3, 1}. On the other hand, the
tuple, or sequence, [1, 2, 3] is not equal to the tuple [3, 2, 1]. This is what you
discovered in Activity 1: the terms of the tuple T1 are always presented in
the order in which they were entered. This is not the case with the set Z20;
ISETL lists the elements of Z20 in varying order.
Repetition, like order, is a second distinguishing characteristic of se-
quences. This is why the sequences E and E2 in Activity 1 are not equal.
However, when the elements of E and E2 were entered as sets, to produce
F and F2, the result was different: You found that F = F2 is true. Re-
peated elements of a set are disregarded: When one uses the code #, with
which ISETL produces the number of elements, one can see that a repeated
element in a set is counted but once, while in a tuple it is counted as many
times as it appears. As a result, tuples and sets react differently to the opera-
tion with. What is the difference? Use the results you obtained in Activity 1
to explain this difference.
Tuple and Set Formers
In addition to defining sets and sequences by listing their elements or terms,
the set and tuple objects can be defined using former notation. In Activity 3,
the sets Z20, H, K, L, HK and S are all defined using set-former notation.
Whether a set or a tuple, the former has three parts. The first is an expres-
sion. Every variable that appears in the expression must either have been
assigned values previously or appear in the second part of the former. The
first part of the former is completed with a colon (:). The second part, called
the domain specification, takes unassigned variables in the expression and
iterates them through previously defined sets or tuples. For the set HK :=
{6 * n : n in [1..5]}, the first part of the former is the expression 6 *
n, and the second part indicates that n is an element of the tuple [1..5].
Although it is not necessary for every variable that appears in the domain
specifier to appear in the expression, it is required that every unassigned
variable that appears in the expression must also appear in the domain spec-
ifier. For example, the tuple r := [p(i) + q(i) : i in [1..3]] given in
Activity 3 is the component-wise sum of previously defined tuples p and q.
As a result, p and q do not need to appear in the domain specifier. The
index i is the unassigned variable iterating through [1..3]. The last part
of the former notation is optional. If present, it begins with the symbol (|)
and is followed by a boolean expression, that is, an expression whose value is
true or false. For example, L := {g * h : g, h in Z20 | even(g) and
h < 10} is the set of all the numbers produced from the expression g * h
by substituting all the possible combinations of even elements in Z20 for g,
and elements of Z20 smaller than 10 for h.
When presented with former notation, ISETL constructs the set (or tu-
ple) by iterating through all possible combinations of values of the variables
defined in the domain specification. For each combination of values of the
variables, the boolean expression in the third part is evaluated. If the third
part is not present, then the boolean value is automatically assumed to be
true. If the result of evaluating the boolean expression is false, then noth-
ing more is done, and ISETL moves on to the next combination of values
for the variables. If the boolean expression is true (or not present), then
ISETL evaluates the expression and returns the result as an element of the
set. Thus, the value of a former expression is the set of all values of the
expression obtained by iterating the variables through their domains such
that the condition in the third part holds. This is similar to what you were
asked to do in Activity 7.
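As one more example of our own, the following former has all three parts: an expression, a domain specification, and a condition. (Since the result is a set, the order in which ISETL lists its elements may vary.)
> {x**2 : x in {1..10} | x mod 3 = 0};
{9, 36, 81};
> [x**2 : x in [1..5]];
[1, 4, 9, 16, 25];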
Set Operations
ISETL can perform the usual set operations. You used these operations in Ac-
tivity 3. Recall, as you read the following summary, that the convention in
this text is that any word in typewriter font is an ISETL keyword. A short
ISETL session illustrating several of these operations follows the list.
1. The basic idea of a set is that any object is either in the set or it is not
in the set. You can test for set membership using the operation in.
2. The union (union) of two sets A and B is the set of all values which are
elements of A, B, or both. The intersection (inter) of two sets A and B
is the set of all elements which are contained in both A and B.
3. A set A is a subset (subset) of a set B if every element of A is also an
element of B.
4. The difference between two sets A and B (-) is the set of all elements
which are in A but not in B.
5. The value in ISETL of {} is the empty set, the set which has no ele-
ments.
6. The cardinality operator (#) applied to a set A returns the number of
elements in A.
7. The operation pow applied to a set A constructs the set of all subsets
of A. This is called the power set of A, denoted in ISETL by pow(A).
When the set A is finite, pow(A) can be worked out by first putting into
it the empty set {}, then all one-element sets consisting of one of the
elements of A, then all the possible two-element subsets of A (consisting
of two of the elements of A), and so on, until the largest subset of A, the
one of greatest cardinality, which is A itself. What is the cardinality of
the power set of an arbitrary, finite set?
8. The operation arb selects an arbitrary element of a set.
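Here is such a session, using two small sets of our own. Your output may list the elements of each set in a different order, and arb may select a different element:
> A := {1, 2, 3}; B := {3, 4};
> A union B; A inter B; A - B;
{1, 2, 3, 4};
{3};
{1, 2};
> B subset A; #A;
false;
3;
> pow(B);
{{}, {3}, {4}, {3, 4}};
> arb(B);
4;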
Tuple and Set Operations
In Activity 1, you were introduced to operations applied to tuples. For
instance the code %+[3..9] tells ISETL to find the sum of the terms of
the sequence 3, 4, 5, 6, 7, 8, 9. If the addition sign were replaced with a
multiplication sign, then ISETL would find the product of the terms of the
sequence.
If the terms of a tuple consist of boolean expressions, that is, statements
that can be judged to be either true or false, code such as %or[3 < 2, 2 > 1,
6 = 7] instructs ISETL to return the value true if at least one of the statements is
true. What would the code %and[3 < 2, 2 > 1, 6 = 7] yield?
Sets of Tuples
In Activities 4 and 5, you instructed ISETL to construct the set Z2_3. When
you typed Z2_3, ISETL returned the set of all possible combinations of 3-
tuples (tuples with three components) whose entries were either 0 or 1. As
discussed in the previous subsection on sets, the elements of a set in ISETL
can be any ISETL objects, including, as you saw in these activities, tuples.
There are a variety of ways to represent a set of tuples. One is to simply
list each tuple given in the set. For example, if A is the set of all 2-tuples
(tuples consisting of two components) of all possible two-element orderings
of the first three counting numbers, then we could represent A in ISETL by
listing each element:
A := {[1, 1], [1, 2], [1, 3], [2, 1], [2, 2],
[2, 3], [3 ,1], [3, 2], [3, 3]};
On the other hand, it would be more convenient to use former notation:
A := {[a, b] : a, b in {1, 2, 3}};
Whenever defining a set of tuples using former notation, the expression, or
first part of the set former, will consist of a tuple whose components are
various expressions. For instance, if we want to define a set B consisting of
all 3-tuples (tuples with three components) of elements of Z20 in which the
first component is always zero, the second is always even, and the third is
two more than 3 times the second, then, using set former notation, we would
write:
B := {[0, b, ((3 * b) + 2) mod 20] : b in Z20 | even(b)};
Tuples will be used frequently throughout the text. Tuples constitute a
special and important kind of vector. Vectors are important objects of
study in linear algebra. Sets of tuples will often constitute a vector space.
A vector space is a set of vectors with two operations that satisfy certain
conditions.
Quantification
Quantified logical statements are used in mathematics to express conditions,
usually in a definition, statement of a property, or a construction. forall
involves a universal quantifier. In order for a statement involving a universal
quantifier to be true, the condition must hold for all possible values of the
variable attributed to it. exists involves an existential quantifier. In order
for a statement involving an existential quantifier to be true, the condition
has to hold for only one of the values of the variable attributed to it.
In Activity 5, you were asked to evaluate several statements involving
universal quantification. The ISETL statement forall x in Z20 | (x +
0) mod 20 = x illustrates the standard form of a universal quantifier: the
first part begins with the keyword forall and is followed by a domain
specification. The domain specification is completed with the symbol |. The
second part of the quantifying statement is a boolean expression, that is, the
condition that the variable must satisfy.
To evaluate a universal quantification expression, ISETL iterates through
the values of the variable in the domain specifier. Thus, in this example,
it considers every value of x in the set Z20. For each value, the boolean
expression is evaluated. If the boolean expression is found to be false for
just one x, then the entire universal quantification statement is false. If the
boolean expression is true for every x given by the domain specifier, then the
value of the quantification is true.
The existential quantifier is similar, except that it returns true if the value
of the boolean expression is true at least once. Consequently, an existential
quantifier returns false only when the boolean expression is false for every
single value of the variable.
The operation choose is a useful alternative to exists. The syntax
is exactly the same, and choose performs the same internal operation as
exists. Instead of returning true or false, however, choose will select and
subsequently return one value of the variable that makes the condition true.
If there is no such value, choose will return OM. Thus, in the case of the
following statement from Activity 5, e := choose x in Z20 | (forall g
in Z20 | (x + g) mod 20 = g); ISETL will return the value of 0 for e.
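For instance, in this small example of our own, choose behaves just like exists except that it hands back a value that works, or OM when no value does:
> exists x in {1..10} | x * x = 49;
true;
> choose x in {1..10} | x * x = 49;
7;
> choose x in {1..10} | x * x = 50;
OM;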
Modular Arithmetic
In Activity 6 you could see differences between multiplication mod 7 and
mod 5 in Z_7 and Z_5 respectively on the one hand, and multiplication mod 20
and mod 6 in Z_20 and Z_6 respectively on the other hand. These differences
can be seen when comparing these sets
K := {(5 * g) mod 7 : g in Z7};
H := {(2 * g) mod 7 : g in Z7};
K1 := {(3 * g) mod 5 : g in Z5};
H1 := {(2 * g) mod 5 : g in Z5};
With the following sets:
K2 := {(5 * g) mod 20 : g in Z20};
H2 := {(2 * g) mod 20 : g in Z20};
K3 := {(5 * g) mod 6 : g in Z6};
H3 := {(4 * g) mod 6 : g in Z6};
What are the differences between them? Notice that H, for example, is a
subset of Z_7, and is determined by multiplication mod 7. Similarly, each of
the other sets is a subset of some Z_n and determined by multiplication mod
that same n. What differences do you see in the relation between these sets
and the Z_n they are subsets of?
Exercises
1. How many elements are there in the following set? Use the # operator
in ISETL to check your answer.
{2, 3, 6 mod 4, {[1, 1], {-1, 2..5}}, {}};
2. List the elements in each of the following sets and note the number of
elements in each. Use ISETL to check your answers. If a set is empty,
explain why.
{2..12}
{4..4}
{10..1}
{-2, 4..38}
{0, 3..-1}
{100, 90..-5}
{100, 90..100}
{100, 90..101}
{10 ,9..0}
{4, 4..8}
3. For each of the following sets of code, predict what result will be re-
turned by ISETL and then check your answer on the computer.
T := {[3, 4], 3 + 4, 8};
7 in T; 4 in T; (1 + 7) in T;
3 + 5 notin T; 7 notin T;
T = {7 * 1, 15 mod 8};
T /= {}; T /= T; {} subset T; T subset T;
T subset {}; not({8} subset T);
{7, 1 + 6, 49 mod 6, 7 + 0} subset T;
#(T); pow(T);
[3, 2, 1] with 2 = [3, 2, 1];
{3, 2, 1} with 2 = {3, 2, 1};
4. Let Y and Z be dened by
Y := {Z5, 6.9, {2..10}, {{true impl false}, false},
(10 div -4)+16, {5 in {3,6..9}}, {{}}};
Z := {{false or true}, 28 mod 2, Z5, {}, true impl false,
{10, 2, 9, 3, 8, 4, 7, 5, 6}, {false}, abs(-6.9)};
For each expression in the following list, determine if the value is an
element of Y, an element of Z, an element of both sets, or neither.
(a) true
(b) false
(c) true
(d) false
(e) 13 + 1
(f) {2, 3..10}
(g) Z5
(h)
(i)
(j) {0, 1..4}
(k) {10, 9..2}
5. In the context of the previous exercise, do the following.
(a) List every expression whose value is in both Y and Z.
(b) List every expression whose value is in either Y or Z or both.
(c) List every expression whose value is in Y but not in Z.
(d) Can you write an ISETL expression that will give an answer to
any of the above?
6. For each of the following ISETL set former expressions, give a verbal explanation of the set and then list the elements. For example, if the expression is {x**2 : x in {2..10} | x mod 2 = 0}; then the verbal description might be "The set of all squares of the even integers from 2 to 10," and the list would be {4, 16, 36, 64, 100}.
(a) {x : x in {2, 5..10}};
(b) {r : r in {2, 5..100} | r mod 5 = 0};
(c) {t**4 + t**2 : t in {-6..6} | even(t div 3)};
(d) {even(n) : n in {-3, -1..11}};
(e) {(x * y) mod 3 : x, y in {-8, -7, 0, 7, 8} | x < y};
(f) {[s, t] : s in {10, 8..4}, t in {5..s} | (s + t) mod 2 = 0};
(g) {(p and q) = (q and p) : p, q in {true, false}};
7. (a) Construct addition and multiplication tables for addition mod 5 and multiplication mod 5 in Z_5 (fill in the given tables):

(a + b) mod 5 |  0    1    2    3    4
            0 |  0
            1 |
            2 |                      1
            3 |                      2
            4 |

(a * b) mod 5 |  0    1    2    3    4
            0 |  0
            1 |
            2 |                      3
            3 |                      2
            4 |
(b) Construct similar tables for addition mod 6 and multiplication mod 6 in Z_6.
(c) Write a verbal description of all the differences you found between these tables: compare the addition tables to the multiplication tables; compare the tables mod 5 (in Z_5) to those mod 6 (in Z_6).
8. Write ISETL set former expressions for each of the following.
(a) The set of all 3-tuples of elements from a given set of integers K.
(b) The set of all possible sums of two elements, one taken from a
given set of integers S and one from a given set of integers T.
(c) The set of all 4-tuples of those elements from a given set of integers
K that are even.
(d) The set of all subsets {a, b} of Z_5 for which (a * b) mod 5 = 1.
(e) The set of all subsets {a, b} of Z_5 for which (a + b) mod 5 = 0.
(f) The set of all subsets {a, b} of Z_6 for which (a * b) mod 6 = 1.
(g) The set of all subsets {a, b} of Z_6 for which (a + b) mod 6 = 0.
9. Compare the sets you constructed in Exercise 8, parts (d)-(g), to the tables you constructed in Exercise 7.
10. Assume that S is a set and T is a tuple, both of which have been pre-
viously dened in ISETL. Write an ISETL expression that will evaluate
to a tuple whose components are the elements of S and the set whose
elements are the components of T.
11. Evaluate the following tuples and then use ISETL to check your answers.
(a) [x**2 : x in [1, 3..10]];
(b) [[1..r] : r in [0, 2..6]];
(c) [N + 2 < 2**N : N in [0..20]];
(d) [u * v : u in [-5..0], v in [-5..(u + 1)] |
(u + v) mod 3 = 0];
12. Use the ISETL forall, exists, and choose constructs to write code that implements the following statements. Assume that S is the set of multiples of 3 from 0 to 49.
(a) Every odd number in 0 . . . 50 is in S.
(b) Every even number in S is divisible by 6.
(c) It is not the case that every even number in S is divisible by 6.
(d) There is an odd number in S, which is divisible by 5.
(e) There is an even number in S, which is divisible by 5.
(f) There is an element m of S such that the number of elements of
S that are less than m is twice the number of elements of S that
are greater than m.
(g) An element a of S exists such that for every element x of S, there
is an element y of S such that the average of x and y is a.
1.3 Functions
Activities
1. For each of the following sets of ISETL code, try to predict what the
result would be. Then run the code and check your prediction. Write
out a verbal explanation of what the code is doing.
(a) f := func(x);
return (x + 3) mod 6;
end;
f(5); f(0); f(37);
h := |x -> (x + 3) mod 6|;
h(5); h(0); h(37);
h=f;
forall x in [-10..10] | h(x) = f(x);
(b) fact := func(n);
return %*[1..n];
end;
fact(3); fact(5); fact(50);
forall n in [2..20] | fact(n) = n * fact(n - 1);
f:= func(x);
return (x + 3) mod 6;
end;
f(fact(3)); fact(f(114));
(c) Av := func(T);
return %+T/#T;
end;
Av([1, 2, 3, 4]); Av([1, 2, 3, 4, 0]);
CompAv := func(T,S);
return max(Av(T),Av(S));
end;
CompAv([1, 2, 3, 4], [1, 2, 3, 4, 0]);
What is the input of the function Av? What is its output? What
is the input of the function CompAv? What is its output?
(d) Z6 := {0..5};
inv := func(x);
if x in Z6 then
return choose g in Z6 | (x * g) mod 6 = 1;
end;
end;
inv(2); inv(5); inv(3); inv(1); inv(0);
(e) Z20 := {0..19};
closed:= func(H);
return forall x, y in H | (x + y) mod 20 in H;
end;
closed({0, 4..19}); closed({3, 5, 9}); closed(Z20);
2. For each of the following specications, write an ISETL func with the
specied input parameters that returns the result of the action that is
described. If any auxiliary objects are needed, then construct them as
well. Select some specic values and run your func on them to see that
it works.
(a) The input parameter is a single number x and the action is to
compute the square mod 20 of x.
(b) K is the set Z_5 - {0} (Z_5 without the element 0). The input parameter is a single variable x and the action is to choose any element of K whose product mod 5 with x is 1.
3. For each of the following sets of ISETL code, try to predict what the
result would be. Then run the code and check your prediction. Write
out a verbal explanation of what the code is doing.
(a) Z5 := {0..4};
add_5 := func(x, y);
if (x in Z5 and y in Z5) then
return (x + y) mod 5;
end;
end;
add_5(3, 4); add_5(2, 3); add_5(4, 4);
3 .add_5 4; 2 .add_5 3; 4 .add_5 4;
(b) G := Z5 - {0}; G;
mlt_5 := func(x, y);
if (x in G and y in G) then
return (x * y) mod 5;
end;
end;
forall x, y in G | x .mlt_5 y in G;
exists e in G | (forall g in G | e .mlt_5 g = g);
choose e in G | (forall g in G | e .mlt_5 g = g);
id := choose e in G |
(forall g in G | e .mlt_5 g = g); id;
forall g in G | (exists h in G | g .mlt_5 h = id);
4. (a) Using the following specification, write ISETL funcs with the specified input parameters that return outputs as described. If any auxiliary objects are needed, then construct them as well. Select some specific values and run your funcs on them to see that they work.
Let G be Z7 - {0} (namely, the set of integers from 1 to 6).
Construct a function mlt_7 that takes for input two elements of G and gives as output their product mod 7.
Use the operation mlt_7 in the construction of a func named inv_7. The input of inv_7 is a single value g. The func should check that g is an element of G and, if it is, choose an element of G whose product with g under the operation mlt_7 is equal to 1.
(b) If your function inv_7 works properly, predict and then check in ISETL the values of the following expressions:
inv_7(3); inv_7(5); inv_7(2); inv_7(1); inv_7(6);
3 .mlt_7 5; 2 .mlt_7 4; 1 .mlt_7 1; 6 .mlt_7 6;
forall g in Z7 - {0} | (exists h in Z7 - {0} |
g .mlt_7 h = 1);
Keep the codes you constructed here for future use.
(c) Repeat Activities 4(a) and 4(b) with Z6. Check the values you obtained with your multiplication-mod-6 table. How are these results reflected in this table?
You may wish to keep these codes as well.
(d) Write a comparison between the behavior of the functions inv_7 and inv_6. How are they similar? How different?
(e) Repeat Activities 4(a) and 4(b) with Z5. Check the values you obtained with your multiplication-mod-5 table.
Keep the codes you constructed here for future use.
(f) What is the function inv_5 like: inv_6 or inv_7? Explain.
5. Write ISETL funcs with the given names according to each of the follow-
ing specications. In each case, set up specic values for the parameters
and run your code to check that it works.
(a) The func is_closed has two input parameters: a set G and a func o which is some operation on two variables from G, such as the one in Activity 3. The action of is_closed is to determine whether the result of o, when applied to two elements of G, is always an element of G. This is indicated by returning the value true or false.
(b) The func is_commutative has two input parameters: a set G and a func o which is some operation on two elements from G, such as the operations in Activity 3. The action of is_commutative is to determine whether or not the result of o depends on the order of the elements. This is indicated by returning the value true or false.
(c) The func identity has two input parameters: a set G and a func o which is some operation on two variables from G, such as the operations in Activity 3. The action of identity is to search for an element e of G which has the property that for any element g of G, the result of the operation o applied to e and g is again g. This is indicated by returning the value of e if it exists or OM if it does not.
(d) The func inverses has three input parameters: a set G, a binary operation o on G, and an element g of G. We assume that G and o are such that identity(G, o) is defined (does not return OM). The action of inverses is to first assign the value of identity(G, o) to a variable e, and then search for an element g' of G which has the property that the result of the operation o applied to g and g' is e. The func inverses returns the element g' if it exists or OM if it doesn't.
6. (a) Write an ISETL func invertibles that takes for input a set G and a binary operation o defined on G. We assume that G and o are such that identity(G, o) is defined (does not return OM). The action of invertibles is to construct the set of all elements in G that have an inverse with respect to o in G, namely, all the elements g in G such that there exists an element g' in G such that g .o g' = identity(G, o).
(b) What does your func do? What are invertibles?
(c) In the previous activities you constructed the funcs mlt_5, mlt_6, and mlt_7. Operate your func invertibles on the following inputs: (Z5, mlt_5), (Z6, mlt_6), (Z7, mlt_7). (Use .mlt_5, .mlt_6, and .mlt_7 as in Activity 3.)
(d) Summarize your findings in (c): What did your func check? What did you find? How does (Z6, mlt_6) differ from (Z5, mlt_5) and (Z7, mlt_7) (from the point of view of the invertibles)?
(e) Do you see the differences between (Z6, mlt_6) and (Z5, mlt_5) in the multiplication tables you constructed in Exercise 7 in Section 1.2? How are these differences represented there?
7. For each of the following sets of ISETL code, try to predict what the
result would be. Then run the code and check your prediction. Write
out a verbal explanation of what the code is doing.
(a) p := [[1, 3], [2, 4], [3, 2], [4, 1]];
p(1); p(3);
s := {[1, 3], [2, 4], [3, 2], [4, 1]};
s(1); s(3); s(5);
Write an explanation: What did the expression p(i) do for the
tuple p? What did the expression s(i) do for the set of tuples s?
s := s with [1, 5]; s;
s(3); s(1);
st := [s(i) : i in [1..4]]; st;
st(1); st(3);
(b) next := func(n);
if n in [1..10] then return n + 1;
end;
end;
mnext := {[n, n+1] : n in [1..10]};
mnext;
next(6); mnext(6); next(9); mnext(9);
next(11); mnext(11);
forall n in [0..11] | mnext(n) = next(n);
(c) G := {1..12};
o := func(x,y);
if (x in G and y in G) then
return (x * y) mod 13;
end;
end;
m13 := {[[x, y], x .o y] : x, y in G};
m13;
m13(3, 5); m13(2, 4);
3 .m13 5; 2 .m13 4;
forall x, y in G | x .m13 y = x .o y;
8. Construct functions of your own, with the following specications:
(a) The function takes two tuples for input, and returns a tuple con-
sisting of the sum of each pair of their components for output.
(b) The function takes a tuple and a number for input, and returns a
tuple consisting of the product of the number and each component
of the tuple for output.
(c) The function takes two tuples and two numbers for input, and returns a tuple for output. This function should multiply the first number with the first tuple (as in part (b)), multiply the second number and tuple, and then add the results as in part (a).
9. Describe what the following code is doing.
SetNot := proc(pair);
G := pair(1); o := pair(2);
e := choose x in G | (forall g in G | x .o g = g);
inv := {[g, choose h in G | g .o h = e] : g in G};
end;
pair := [ ];
pair(1) := {0..5};
pair(2) := func(x, y);
if (x in G and y in G) then
return (x * y) mod 6;
end;
end;
pair;
SetNot(pair);
G; e; inv(5); inv(2);
What would happen if, after running this code (and defining the funcs in Activity 5), you entered statements such as 3 .o 5, is_closed(G, o), is_commutative(G, o), identity(G, o)?
Can you imagine why we might want to go to all the trouble of something like SetNot?
10. An ISETL func can also return a func. Look at the following code and
write down an explanation of what it does:
add_a := func(a);
return func(x);
return x + a;
end;
end;
Predict the results of the following ISETL statements:
add_a(3); add_a(3)(5); add_a(3)(2); add_a(7)(4);
f3 := add_a(3); f3(5); f3(2);
f7 := add_a(7); f7(4);
11. Write down an ISETL func compose which takes as input two ISETL
funcs f and g and returns a func representing their composition; that
is, the func compose returns a func which for each input x returns
f(g(x)). Use your func to compose some of the funcs previously defined in the Activities, and study the resulting new funcs.
Discussion
Funcs and Their Syntax Options
The notion of function is fundamental to every area of mathematics. A function has to do with transforming an input value, or a collection of input values, into a single output value. In ISETL there are several ways to represent functions, one of which is the func. What kinds of inputs are used in the funcs of Activity 1? What kinds of outputs are used? A func processes (transforms) the input to obtain an output. The syntax of funcs is designed to describe these processes. There are a variety of processes demonstrated in the funcs of the activities. Let us look at the function of Activity 1(d) again:
inv := func(x);
if x in Z6 then
return choose g in Z6 | (x * g) mod 6 = 1;
end;
end;
Funcs must have a header line, a list of statements for ISETL to process, and an end statement. The header line for a func usually starts by naming the func. This is done by an assignment to an identifier. In our example, the line inv := func(x); assigns the name inv to the function, followed by the keyword func and the list of parameters enclosed in parentheses. The complete expression is followed by a semicolon.
The func in our example has an if statement for its primary process. If statements terminate with an end; or an end if; command; in our example, the first of the two end;s. As we can see in our example, the label that indicates which process is being completed (end if;, end func;) is optional. Every func and every control statement must have its own end statement. In the example, the func also includes a return statement which causes ISETL to evaluate the expression and end the processing. Return statements can only be used inside a func. They instruct ISETL to send the current value of the expression to the computer screen or to some other process in which the func may be embedded. A return statement causes the operation in a func to stop. In our example, for an input which is a number in Z_6, the func chooses an element g in Z_6 such that (x * g) mod 6 equals 1, and produces it on the screen.
So the general syntax for funcs is as follows:
name := func(list of parameters);
statements;
end;
Once a func is defined (and recorded by ISETL) it can be used with different values of inputs (or parameters). The syntax for operating a func with some (properly) chosen input is name(parameters). In our example, since the function was assigned to the identifier (the name) inv, and the input is a number in Z6 (otherwise the func responds with OM), the func will operate in response to an expression such as inv(5);.
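As a further small illustration (the func sq20 below is our own throwaway example, not part of the activities), assume Z20 := {0..19}; has been entered:

sq20 := func(x);
    if x in Z20 then
        return (x * x) mod 20;
    end;
end;

sq20(7); sq20(13);

Both calls should return 9, since 49 mod 20 = 9 and 169 mod 20 = 9, while an input outside Z20 would produce OM.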
Funcs for Binary Operations
When a func has two input parameters, which are assumed to represent two
elements from some set, and the returned value is also assumed to belong to
the same set, then the func is said to represent a binary operation. Binary
operations are extremely important in Algebra in general, and also in Linear
Algebra, and you will spend considerable time working with them.
In mathematics, it is the usual practice to write the name of a binary operation between the two parameters, rather than before them. Thus, we write a + b rather than +(a, b). We can do something very similar in ISETL. If o is a func which fulfills the requirements for a binary operation (input consisting of two parameters, output a single value, all three elements of the same set) then, instead of o(e, g), we have the option of writing e .o g. Putting the period before the name of the operation is the signal to ISETL that the operation is between the two parameters, not before them. You used binary operations in this way in Activity 3. What about the func Av of Activity 1(c)? Can it be used as a binary operation? Why, or why not?
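For instance, if the func add_5 from Activity 3(a) is still defined, then the two expressions

add_5(2, 4);
2 .add_5 4;

should both return 1, since (2 + 4) mod 5 = 1; the infix form is simply an alternative way of calling the same func.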
In Activity 8, you were to construct funcs to work with tuples and num-
bers. You probably needed to decide the size of the tuples involved and how
to deal with them in the code. For instance, in part (a) you might have used
code like:
act8a := func(tup1, tup2);
[a, b, c, d] := tup1;
[e, f, g, h] := tup2;
return [a + e, b + f, c + g, d + h];
end func;
which works well for tuples of length 4. A perhaps more elegant approach
is:
act8a := func(tup1, tup2);
return [tup1(1) + tup2(1), tup1(2) + tup2(2),
tup1(3) + tup2(3), tup1(4) + tup2(4)];
end func;
but neither of these will work with any length tuple or make use of the power
of ISETL. Compare the following code to the previous examples. Can you
explain how it works?
act8a := func(tup1, tup2);
return [tup1(i) + tup2(i) : i in [1..#tup1]];
end func;
This func computes the component-wise sum of two tuples.
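As a quick check, a call such as act8a([1, 2, 3], [4, 5, 6]); should return [5, 7, 9], and the very same func now handles tuples of any common length without further changes.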
Funcs to Test Properties
In Activity 5 you constructed several funcs that tested binary operations for various properties. The func is_closed that you were asked to write in Activity 5(a), for example, had two input parameters, a set G and a binary operation o. It was assumed that before you call this func you would have defined a set and a binary operation. The definition of the func would contain a single boolean expression. The value of this expression was returned as the result of a call to this func. Look back at the funcs you constructed in Activity 5, and write a description of the property that is tested by each of the funcs you were asked to construct in this activity.
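To make the shape of such a func concrete, here is a minimal sketch of one reasonable way is_closed could look (your own version may differ):

is_closed := func(G, o);
    return forall x, y in G | (x .o y) in G;
end;

With this definition, is_closed(Z5, add_5); should return true, assuming Z5 and the func add_5 of Activity 3(a) are defined.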
Tuples and Smaps
In Activity 7(b) you compared two ISETL objects: the func next, which assigns to each number in [1..10] the next larger number, and the set of ordered pairs mnext := {[n, n+1] : n in [1..10]};. Hopefully you also discovered that expressions like mnext(6) = next(6) return true.
In ISETL, an ordered pair is a tuple with only two components defined. A set of pairs is called a map. The map gives us one more way to represent functions in ISETL, as in mathematics. As you can see in the above example, the way a map represents a function is that it assigns the second component of each ordered pair to the first component of the same pair. However, not every map represents a function. Since a function assigns to each element in its domain a single value, in order for a map to represent a function it must have the following additional property:
No value appears as a first component of more than one pair in the map.
A map with this property is called an smap (single-valued map). Any time an smap is constructed and assigned to a variable, that variable can be used as a function.
We have seen before (in Section 1.2, Activity 1) that tuples too can be used as functions. Use the examples of Activity 7(a) to discuss the differences between the way tuples operate like functions and the way smaps do.
Procs
Activity 9 has an example of an ISETL proc, or procedure. A procedure is the same as a func except that it has no return statement and does not return a value. It is used to perform some internal operations, such as establishing the values of certain variables. In our example, it establishes the values of G, o, e, and the function inv. It is also used for external effects on the screen (such as printing or drawing something), or on other devices (disk, printer, and so forth).
The Fields Z_p
In the activities you dealt with the properties of structures consisting of the sets Zn and their binary operations mlt_n and add_n. Mlt_n was defined to be multiplication mod n, and add_n was addition mod n. We will denote such a structure by the 3-tuple [Zn, add_n, mlt_n]. Examples are [Z5, add_5, mlt_5], [Z6, add_6, mlt_6], [Z7, add_7, mlt_7], and others. In relation to these structures, you dealt with several new concepts:
The identity was an element related to an operation within its respective Z_n which, when operated with another element, produced that other element as a result.
An inverse of an element was defined relative to an operation, its identity, and a specific element g in the corresponding Z_n: g' was said to be an inverse of g with respect to an operation o in its respective Z_n if g o g' = the identity (of the same operation).
An element which had an inverse with respect to a specific operation in its appropriate Z_n was called invertible (in relation to that operation).
In addition to the work with the ISETL code, you looked for the properties of the operations in the operation tables you constructed in Section 1.2, Exercise 7. How do you identify each of these concepts (identity, inverse, and invertibles) in these tables?
We hope that you became aware of some basic differences among [Z6, add_6, mlt_6], [Z5, add_5, mlt_5], and [Z7, add_7, mlt_7]. The main difference concerns the above concepts: while all three structures have an identity for the operations mlt_n, not every element in each of the structures has an inverse (in relation to mlt_n).
The element 0 is not invertible with respect to mlt_n in any of the structures. While in Z5 and Z7 all the other elements (g /= 0) have inverses, in Z6 not all do. Which are the elements of Z6 which do not have such an inverse (are not invertible)?
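One way to answer this directly in ISETL, assuming G := {1..5}; and the func mlt_6 you wrote for Activity 4(c) are still defined, is to build the set of invertible elements:

{g : g in G | (exists h in G | g .mlt_6 h = 1)};

Comparing this set with G shows exactly which elements of Z6 fail to have an inverse.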
In the rest of the course we will learn about structures named vector spaces. These structures are constructed upon (and rely on) structures of numbers called fields. Fields have some important characteristics. Let us explain these characteristics (a quick ISETL check of one of them follows the list):
A field consists of a set K and two binary operations operating on it;
Both operations are closed (see Activity 5);
Each operation is commutative and associative (see Activity 5);
There is a relation between the two operations called distributivity, which you probably know but with which we will not deal now;
Each operation has an identity;
For one of the operations every element has an inverse, and the identity of this operation is usually denoted by 0;
As for the second operation, all the elements of K - {0} have inverses. That means that every element has an inverse, except the identity of the first operation!
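For example, to see that Z_5 has this last property with respect to multiplication mod 5, one can reuse the set G := {1..4} and the func mlt_5 from Activity 3(b) (assuming they are still defined) and check:

forall g in G | (exists h in G | g .mlt_5 h = 1);

which should return true: every nonzero element of Z_5 has a multiplicative inverse.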
We will not keep working with fields as such; that work belongs to another algebra course. For the time being, we need to know which Z_n's are fields, and hence can have vector spaces built upon them.
From your work with Z_n and their operations, you could see that Z_6 is not a field. Although it has an identity for each operation, not every element of Z_6 - {0} is invertible in relation to multiplication mod 6. Which elements of Z_6 are not invertible?
We will not prove this here, but it can be proved that, as in our examples, when n is prime, Z_n is a field. Examples are Z_5 and Z_7. We will use them to construct vector spaces upon. Can you think of more examples of fields? But when n is not prime, as with Z_6 (because 6 = 2 * 3), Z_n is not a field. Likewise Z_12 and Z_4, with which you worked, are not fields. We will not use them for the construction of vector spaces. Can you think of other examples of non-fields?
Polynomials and Polynomial Functions
An important collection of functions that can be implemented in ISETL is the collection of polynomial functions. We start with the definition of a polynomial.
Definition 1.3.1. A polynomial in x is an expression of the form
a_0 + a_1 x + a_2 x^2 + a_3 x^3 + ... + a_n x^n.
The values a_i are called the coefficients of the polynomial. The polynomial is said to be over the numbers K if all of the coefficients are in the set K. A polynomial for which a_n ≠ 0 but a_m = 0 for all m > n is said to have degree n.
Notice that a polynomial is not a function; it is only an expression using the variable x. Two polynomials are equal if and only if all of their coefficients match. In ISETL, a polynomial is implemented as a tuple of scalars in order of increasing degree. For example, the polynomial 2 + 3x^2 is implemented as the tuple [2, 0, 3]. The collection of all polynomials of degree less than or equal to n over the numbers K is denoted by P_n(K). The collection of all polynomials over K is denoted by P(K).
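Although the text works with polynomials simply as tuples, it may help to see how such a tuple could be interpreted as the corresponding polynomial function. The following func is only a sketch of one way to do this; the name eval_poly and its details are ours, not part of the standard code of the activities:

eval_poly := func(p, x);
    return %+ [p(i) * x**(i - 1) : i in [1..#p]];
end;

For example, eval_poly([2, 0, 3], 2); should return 14, since 2 + 0*2 + 3*4 = 14.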
Each polynomial expression has an interpretation as a function. We now define a polynomial function.
Definition 1.3.2. A polynomial function is a function p where p(x) is a polynomial in x.
In ISETL, polynomial functions are represented as funcs. Over the real numbers, different polynomials lead to different polynomial functions. However, over the finite fields Z_p, polynomial functions can behave in rather unexpected ways. For example, over Z_5, consider the polynomial functions p and q where p(x) = x and q(x) = x^5. Compute p(a) and q(a) for every a in Z_5. You should discover that p = q despite the fact that the polynomials p(x) and q(x) are different. As a result, defining the degree of a polynomial function is more complicated than it seems.
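A quick way to convince yourself of this at the ISETL prompt, assuming Z5 := {0..4}; has been entered, is:

forall a in Z5 | a mod 5 = (a**5) mod 5;

which should return true, so the two funcs |x -> x mod 5| and |x -> (x**5) mod 5| agree on every element of Z_5.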
Definition 1.3.3. A polynomial function has degree n if it is the interpretation of a degree n polynomial and not an interpretation of any polynomial of lesser degree.
The collection of all polynomial functions of degree less than or equal to n over the numbers K is denoted by PF_n(K). The collection of all polynomial functions over K is denoted by PF(K).
Theorem 1.3.1. Over the field Z_p, the polynomial functions x -> x and x -> x^p are the same. As a result, PF(Z_p) = PF_{p-1}(Z_p).
Exercises
1. Each of the following is a description of a function. Use an ISETL func to implement it, with a restricted domain where appropriate. Calculate the value of the function on at least three values of the domain in two ways: first with paper and pencil, using the given verbal description, and second, on the computer, using your func. Explain any discrepancies.
(a) The function takes a 4-tuple (a tuple with four elements) whose
components are positive integers and computes the sum of the
cubes of the components.
(b) The domain of the function is the set of all points of the square centered at the origin of a rectangular coordinate system with sides parallel to the coordinate axes and having length 2. The action of the function is to rotate the square counterclockwise through an angle of 180 degrees.
(c) The function determines the number of non-OM components of a tuple. (Note that not all of the components of a tuple between the first and the last must be defined.)
2. Write ISETL funcs with the given names according to each of the follow-
ing specications. In each case, set up specic values for the parameters
and run your code to check that it works.
(a) The func is_associative has two input parameters: a set G and a binary operation o. The action of is_associative is to determine whether the operation represented by o is associative. This is indicated by returning the value true or false.
(b) Construct a func add_7 that implements addition mod 7 in Z_7. Use your construction in ISETL code that shows that every element in Z_7 has an inverse in relation to add_7.
(c) Repeat part (b) with Z_6 and add_6.
3. Write an ISETL map that implements the function whose domain is Z_20 and assigns to each element x an element y such that (x + y) mod 20 = 0. Is this an smap? Explain.
4. Do the same as the previous exercise with addition replaced by multi-
plication and 0 replaced by 1.
5. Write an ISETL smap that implements the operation of addition mod 20 in Z_20.
6. Write an ISETL func that accepts a pair consisting of a set G and a
binary operation o on G. The action of the func is to convert this pair
into an smap which implements the operation. Use your func to do the
previous exercise.
7. Construct a function that takes a tuple (of any length) for input and returns a set of all of the components of the tuple for output.
8. Look again at the func Av of Activity 1(c). Is it or is it not a binary
operation? Write an explanation. If it is, use it with several appropriate
inputs as an operation written between the two parameters. If it is not,
what modications need to be done to Av in order to produce a similar
function (say, AvBin) which could be used as a binary operation, in
particular in the method described above.
9. Write a tuple and an smap of your own. Operate both as functions. For each of them, write an explanation: What are the inputs of the function? What is its domain? (The domain is the set of elements you can input to the function.) What are its outputs?
Chapter 2
Vectors and Vector Spaces
You have seen vectors in your physics classes, in
multivariable calculus and perhaps in other
courses. In those cases, vectors were probably
considered to be things with direction and
magnitude and were usually represented as
directed line segments. In this chapter and
beyond, we will be working with vectors in an
abstract sense. Certainly the vectors with which
you are already familiar will be included in our
work (although they will all have their tails at
the origin). However, as we work with vectors
and vector spaces, you will find that polynomials
and infinitely differentiable functions are also
vectors.
2.1 Vectors
Activities
1. (a) Define the set K = Z_5 = {0, 1, 2, 3, 4} in ISETL.
(b) Write an ISETL func add_scal that accepts two elements of K and returns their sum mod 5.
(c) Write an ISETL func mult_scal that accepts two elements of K and returns their product mod 5.
2. (a) Define V = (Z_5)^2, that is, the set of all 2-tuples with components from Z_5, in ISETL. How many elements are there in V?
(b) Write an ISETL func vec_add that accepts two elements, [v_1, v_2] and [w_1, w_2], of V and returns the tuple [(v_1 + w_1) mod 5, (v_2 + w_2) mod 5].
(c) Write an ISETL func scal_mult that accepts an element k of Z_5 and a tuple [v_1, v_2] from V, and returns the tuple [(k*v_1) mod 5, (k*v_2) mod 5].
3. Define the tuples v = [2, 3], w = [1, 1], and u = [0, 3] in ISETL. Use your funcs defined in Activities 1 and 2 to determine whether the following tuples are the same.
(a) v + w and w + v.
(b) (u + v) + w and u + (v + w).
(c) v + v and 2v.
(d) 1v and v.
(e) v + (-1)v and v - v.
(f) 2(3u) and (2 * 3)u.
(g) 2(v + w) and 2v + 2w.
4. How is the following code different from the func vec_add you wrote in Activity 2? What assumption does this code make about v and w?
va := |v, w -> [(v(i) + w(i)) mod 5 : i in [1..#v]]|;
Use va to add the following tuples in (Z_5)^n. Can you add these tuples using vec_add?
(a) [2, 2, 1] + [3, 0, 4]
(b) [0, 1, 0, 1] + [1, 2, 3, 4]
(c) [1, 2] + [2, 1]
5. (a) Write an ISETL func sm that accepts an element k from Z_5 and a tuple v, and returns the tuple kv in which each component of v has been multiplied (mod 5) by k.
(b) Test your func for k = 3 and v = [2, 4].
(c) Test your func for k = 0 and v = [1, 3, 3].
(d) Test your func for k = 1 and v = [3, 2, 4, 1].
6. (a) Write an ISETL func is_closed_va that accepts a set V of tuples and an operation va (vector addition). Your func should test whether the sum of any two tuples in V is again in V.
(b) Test your func on V = (Z_5)^2, with va defined in Activity 4.
(c) Test your func on V = (Z_3)^3. Modify va appropriately, using mod 3 arithmetic.
(d) Test your func on V = (Z_2)^4. Modify va appropriately, using mod 2 arithmetic.
7. (a) Write an ISETL func is_commutative that accepts a set V of vectors (tuples) and an operation va and determines whether or not the operation va is commutative on V.
(b) Test your func on V = (Z_5)^2 and va.
(c) Test your func on V = (Z_3)^3 and an appropriately modified va.
(d) Test your func on V = (Z_2)^4 and an appropriately modified va.
8. (a) Write an ISETL func is_associative_va that accepts a set V of vectors (tuples) and an operation va, and determines whether or not va is associative on V.
(b) Test your func on V = (Z_5)^2 and va.
(c) Test your func on V = (Z_2)^2 and an appropriately modified va.
9. Explain the following ISETL code. What are the inputs to this func?
What does this func return?
has_zerovec := func(V, va);
VZERO := choose z in V | forall v in V | (v .va z) = v;
return VZERO;
end;
10. (a) Use the func has_zerovec to write a new func has_vinverses that accepts a set V of tuples and an operation va and determines whether or not for each x in V there is a y in V with the property that va(x, y) = the result of has_zerovec(V, va).
(b) Test your func on V = (Z_5)^2 and va.
(c) Test your func on V = (Z_3)^3 and an appropriately modified va.
(d) Test your func on V = (Z_2)^4 and an appropriately modified va.
11. Explain the following ISETL code:
is_closed_sm := func(K, V, sm);
return forall k in K, v in V | (k .sm v) in V;
end;
12. Write an ISETL func is_associative_sm that accepts a set K of scalars, a set V of vectors, and two operations, sm (scalar multiplication) and ms (multiplication of scalars). Your func should determine whether, for all k, j in K and all v in V, k(jv) = (kj)v. Note that the right hand side of this equation uses multiplication of scalars as well as scalar multiplication. What is the difference? Test your func on (Z_2)^2.
13. What does the following ISETL func do? What are the inputs? The
outputs?
has_distributive1 := func(K, V, sm, va);
return forall k in K, v, w in V |
(k .sm (v .va w)) = (k .sm v) .va (k .sm w);
end;
14. Write an ISETL func has_distributive2 that accepts a set K of scalars, a set V of tuples, and three operations: va, vector addition; sm, scalar multiplication; and as, addition of scalars. The action of your func is to determine whether the following expression holds for all k, j in K and v in V: (k + j)v = kv + jv.
15. Write an ISETL func has_identityscalar that accepts a set K of scalars, a set V of vectors (tuples), and an operation sm, scalar multiplication. The action of your function is to determine whether there is an element k in K such that for all v in V, sm(k, v) = v.
Discussion
In these activities you created tuples with components chosen from Z_2, Z_3, or Z_5, and wrote code to perform operations on those tuples. Such tuples are more commonly known as vectors. A vector can be any tuple with components in a set K of scalars. In ISETL we denote vectors by v = [v_1, v_2, ..., v_n]. In mathematical notation we write v = ⟨v_1, v_2, ..., v_n⟩. The numbers v_i are known as the components of the vector.
Any specific vector might be thought of as living in several different spaces. For example, v = ⟨2, 2⟩ could be an element of the space (Z_3)^2, or of (Z_5)^2, or of R^2. (Why is it not an element of (Z_2)^2?) If we choose to work within a specific space K^n, then we can combine vectors with each other using an operation of vector addition. The addition is done component-wise. For example, if we are working with 2-tuples with entries from Z_5, the sum of v = ⟨v_1, v_2⟩ and w = ⟨w_1, w_2⟩ is ⟨(v_1 + w_1) mod 5, (v_2 + w_2) mod 5⟩. We also have an operation of scalar multiplication, which allows us to combine scalars with vectors. This multiplication is also done component-wise. There is a natural relationship between this vector addition and scalar multiplication that is very satisfying. For example, v + v results in the same vector as 2v. Linear algebra is built on these two operations of adding vectors and multiplying by scalars.
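As a small numerical illustration, assuming the funcs vec_add and scal_mult of Activity 2 are defined as specified there, one should find that

vec_add([2, 3], [4, 4]);
scal_mult(3, [2, 3]);

return [1, 2] and [1, 4] respectively, and that vec_add([2, 3], [2, 3]); agrees with scal_mult(2, [2, 3]);.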
In these Activities you worked exclusively with finite sets of scalars and vectors. This is because much of our work in ISETL requires us to be able to define finite sets. However, many of the real-world applications of vectors that you will see in this and other courses deal with infinite sets of scalars and vectors. For example, R^2 is the set of all ordered pairs of real numbers. The vectors in R^2 have both a physical (as forces or velocities) and a geometric interpretation. In R^2, vectors can be thought of as quantities that have both a direction and a magnitude. We can represent the vector v = ⟨4, 2⟩ by an arrow in a two-dimensional plane. The arrow will start at the origin (0, 0)
Figure 2.1: A vector in R^2
and end at the point (4, 2). (See Figure 2.1.) Such a vector has a magnitude and a direction, and shows both of its components simultaneously. The vector w = ⟨2, 1⟩ has the same direction as v but is half as long. Can you see how to use the Pythagorean theorem to find the length of such vectors? What is the relationship between the length of a vector v and the length of 2v? Of v and kv?
Of course not all of the arrows in 2-space originate at (0, 0). We will consider two vectors v_1 and v_2 to be equivalent if they have the same length and direction, even if they originate at different points. (See Figure 2.2.)
Figure 2.2: Parallel vectors in R^2
Such vectors are obviously parallel arrows. One reason for allowing vectors to start at different points is to be able to visualize the sum of two vectors. In Figure 2.3 we can form the vector v + w by translating w so that the start of w is placed at the end of v. Then v + w is the arrow drawn from the start of v to the end of the translated vector w.
Figure 2.3: Adding vectors in R^2, with v = ⟨4, 2⟩ and w = ⟨-1, 2⟩
Note that this geometric vector addition produces a parallelogram. To get v + w we can travel along v and then along w, or we can take the shortcut along the diagonal v + w of the parallelogram.
Use algebra to check that the geometric addition in Figure 2.3 is correct. In other words, is v + w = ⟨4, 2⟩ + ⟨-1, 2⟩ equal to the vector ⟨3, 4⟩?
Given a vector v, what would we mean by the vector -v? How would you draw -v in the real plane? What is the relationship between the length and direction of v and that of -v? How can we combine vector addition and multiplication by -1 to obtain vector subtraction?
Ordered triples of real numbers can also be thought of as vectors and visualized geometrically. In order to do this we need an xyz coordinate system. The set of all such ordered triples is known as 3-space, or R^3. Vectors that live in spaces with more than 3 components are not so easily visualized. However, many of the results and techniques of vector arithmetic are useful in such situations where there is no direct geometric significance. This leads us to the following definition.
Definition 2.1.1. The set of all sequences ⟨v_1, v_2, ..., v_n⟩ of real numbers is called real n-space and is denoted R^n.
In Activities 5-15, you wrote or explained several ISETL funcs that checked various properties of vector addition and scalar multiplication. Systems in which these particular properties are satisfied turn out to be very useful in the study of linear algebra. We will explore such systems further in the next section.
Exercises
1. Compute the following vector expressions for v = ⟨2, 3⟩, u = ⟨3, 1⟩, and w = ⟨8, 0⟩.
(a) (1/2)w
(b) v + u
(c) v + u + w
(d) 2v + 3u + w
2. (a) Draw the vectors v = ⟨4, 1⟩ and (1/2)v in a single xy-plane.
(b) Draw the vectors v = ⟨4, 1⟩ and w = ⟨2, 2⟩ and v + w and v - w in a single xy-plane.
3. Compute the following vector expressions for v = ⟨1, 2, 3⟩, u = ⟨3, 1, 2⟩, and w = ⟨2, 3, 1⟩.
(a) v + w
(b) v + 3u
(c) w + u
(d) 5v - 2u + 6w
(e) 2v - 3u - 4w
4. To what number do the components of every scalar multiple of v = ⟨2, 1, -3⟩ add up?
5. Use the Pythagorean theorem to find the length of the following vectors in R^2.
(a) ⟨4, 3⟩
(b) ⟨2, 0⟩
(c) ⟨1, 2⟩
(d) 3⟨1, 2⟩
(e) ⟨0, 0⟩
6. Extend the notation of the length of a vector to R^n by
length(v) = sqrt((v_1)^2 + (v_2)^2 + ... + (v_n)^2).
Find the length of the following vectors.
(a) ⟨2, 4, 3⟩
(b) 2⟨2, 4, 3⟩
(c) ⟨2, 0, 0⟩
(d) ⟨1, 1, 0, 2⟩
(e) ⟨5, 5, 5, 5⟩
7. A unit vector is a vector of length one. Find 3 distinct unit vectors in R^2. Find 4 distinct unit vectors in R^3.
8. Is the sum of any two unit vectors a unit vector? Give a proof or counterexample.
9. Let v = ⟨5, 3, 4⟩. Find a scalar k in R such that kv is a unit vector.
10. If three corners of a parallelogram are (1, 1), (4, 2), and (1, 3), what are all the possible fourth corners? Draw two of them.
11. Let v = ⟨1, 2, 1⟩ and w = ⟨0, 1, 1⟩. Find scalars k and j so that kv + jw = ⟨4, 2, 6⟩.
2.2 Introduction to Vector Spaces
Activities
1. Following is a list of some funcs that you worked with in the previous
section.
is_closed_va
is_commutative
is_associative_va
has_zerovec
has_vinverses
is_closed_sm
is_associative_sm
has_distributive1
has_distributive2
has_identityscalar
Write a description of what each func does, including the kind of ob-
jects accepted, what is done to them, and the kind of object that is
returned.
2. (a) Construct in ISETL a set K = Z_3 of scalars, a set V = (Z_3)^3 of vectors, and four operations: va (vector addition), which is addition mod 3 of elements in V; sm (scalar multiplication), which is multiplication mod 3 of elements in V by elements in K; as (addition of scalars), which is addition mod 3; and ms (multiplication of scalars), which is multiplication mod 3.
For example, you could write and store code such as:
K := {0..2}; V := {[x, y, z] : x, y, z in K};
va := |v, u -> [(v(i) + u(i)) mod 3 : i in [1..3]]|;
sm := |k, v -> [(k * v(i)) mod 3 : i in [1..3]]|;
as := |k, j -> (k + j) mod 3|; ms := |k, j -> (k * j) mod 3|;
(b) Apply each of your funcs from Activity 1 to this system, [K, V, va, sm, as, ms]. Create a table with the funcs as column headings and this system as the first row, and use the table to keep track of which properties are satisfied by this system.
3. Repeat Activity 2 for each of the following systems. Add a new row to
your table for each system.
(a) K = Z_5, V = (Z_5)^2, va is addition mod 5 of elements in V, sm is multiplication mod 5 of elements in V by elements in K, as and ms are addition and multiplication mod 5 respectively.
(b) K = Z_3, V = {⟨x, x, x⟩ : x ∈ K}, va is addition mod 3 of elements in V, sm is multiplication mod 3 of elements in V by elements in K, as and ms are addition and multiplication mod 3 respectively.
(c) K = Z_5, V = {⟨x, y⟩ : x, y ∈ {1, 3}}, va is addition mod 5 of elements in V, sm is multiplication mod 5 of elements in V by elements in K, as and ms are addition and multiplication mod 5 respectively.
(d) K = Z_5, V = {⟨x, 0, 0⟩ : x ∈ K}, va is addition mod 5 of elements in V, sm is multiplication mod 5 of elements in V by elements in K, as and ms are addition and multiplication mod 5 respectively.
(e) K = Z_5, V = {⟨x, 1⟩ : x ∈ K}, va is addition mod 5 of elements in V, sm is multiplication mod 5 of elements in V by elements in K, as and ms are addition and multiplication mod 5 respectively.
(f) K = Z_2, V = (Z_2)^5, va is addition mod 2 of elements in V, sm is multiplication mod 2 of elements in V by elements in K, as and ms are addition and multiplication mod 2 respectively.
(g) K = {0}, V = (Z_3)^2, va is addition mod 3 of elements in V, sm is multiplication mod 3 of elements in V by elements in K, as and ms are ordinary addition and multiplication respectively.
(h) K = Z_7, V = (Z_7)^1, va and as are addition mod 7, sm and ms are multiplication mod 7.
(i) K = Z_5, V = {⟨x, y⟩ : x, y ∈ {0, 2, 4}}, va is addition mod 5 of elements in V, sm is multiplication mod 5 of elements in V by elements in K, as and ms are addition and multiplication mod 5 respectively.
(j) K = Z_5, V = (Z_2)^3, va is addition mod 5 of elements in V, sm is multiplication mod 2 of elements in V by elements in K, as and ms are addition and multiplication mod 5 respectively.
(k) K = Z_3, V = (Z_3)^3, va is addition mod 3 of elements in V, sm is defined by k⟨x, y, z⟩ = ⟨0, 0, 0⟩, as and ms are addition and multiplication mod 3 respectively.
(l) K = Z_5, V = (Z_5)^2, va is addition mod 5 of elements in V, sm is defined by k⟨x, y⟩ = ⟨x, y⟩, as and ms are addition and multiplication mod 5 respectively.
4. Which systems from Activities 2 and 3 satisfy all ten properties from Activity 1? Can you conjecture conditions on K = Z_p, V = (Z_q)^n, and the four operations so that such a system will satisfy all ten properties?
5. Here is a list of some more systems [K, V, va, sm, as, ms]. Which of these systems satisfy all of the properties in Activity 1? Note: Most of the following systems can only be constructed and run in VISETL (Virtual ISETL). This means that all of your work must be done by hand and in your mind.
(a) K = {-1, 1}, V = K^3, va is ordinary component-wise multiplication, sm is ordinary component-wise multiplication, as and ms are ordinary addition and multiplication respectively.
(b) K = R, V = R^2, va is ordinary component-wise addition, sm is ordinary component-wise multiplication, as and ms are ordinary addition and multiplication respectively.
(c) K = R, V = R^2, va is ordinary component-wise addition, sm is defined by k⟨x, y⟩ = ⟨kx, 3ky⟩, as and ms are ordinary addition and multiplication respectively.
(d) K = R, V = {⟨x⟩ : x ∈ R, x > 0}, va is defined by ⟨x⟩ + ⟨y⟩ = ⟨xy⟩, sm is defined by k⟨x⟩ = ⟨x^k⟩, as and ms are ordinary addition and multiplication respectively.
(e) K = R, V = {⟨x, 0, x⟩ : x ∈ R}, with va, sm, as, and ms defined as usual for R^3.
6. Write a func is_vector_space that accepts a set K of scalars, a set V of vectors, and four operations va, sm, as, and ms defined on V and K, and tests whether all of the properties listed in Activity 1 are satisfied. Your func should return true if the system satisfies all ten properties, and false if it fails to satisfy one or more of them. Test your func on some of the systems defined in Activity 2.
Discussion
In the activities at the beginning of this section you constructed several mathematical systems and examined their properties. More specifically, you constructed certain sets of vectors and sets of scalars, defined operations on them, and studied various properties of these sets under the defined operations.
The ten properties listed in Activity 1 are satisfied by many important mathematical systems. Rather than study each system separately, we are going to collectively consider all systems that satisfy these ten properties. We begin with the definition of such systems.
Definition 2.2.1. A set V of objects called vectors, together with the binary operations of vector addition and scalar multiplication, is said to be a vector space over a field of scalars K if for all u, v, and w in V and all k, j in K the following axioms are satisfied:
Axiom 1: u + v ∈ V (closure under vector addition).
Axiom 2: u + v = v + u (commutativity of vector addition).
Axiom 3: (u + v) + w = u + (v + w) (associativity).
Axiom 4: There is a vector 0 ∈ V such that v + 0 = v (zero vector).
Axiom 5: For each v ∈ V there is a unique element (-v) ∈ V such that v + (-v) = 0 (vector inverses).
Axiom 6: kv ∈ V (closure under scalar multiplication).
Axiom 7: (kj)v = k(jv) (associativity of scalar multiplication).
Axiom 8: k(u + v) = ku + kv (first distributive law).
Axiom 9: (k + j)v = kv + jv (second distributive law).
Axiom 10: There is an element 1 ∈ K such that for every v in V, 1v = v (identity scalar).
We digress for a moment to discuss this field of scalars mentioned in the definition. Scalars are just numbers, but what is a field? A field is a set of objects (usually numbers), together with two operations (addition and multiplication) defined on the set, that collectively satisfy many properties that you have seen in your previous work with the real number system. That is, a field has all the standard properties of the real numbers, including closure under both operations, additive and multiplicative identities and inverses, and properties such as commutativity, associativity, and the distributive laws. There are both finite and infinite fields. R is obviously an infinite field, as are the rational numbers Q and the complex numbers C.
However, Z is not a field. Why not? Is {0} a finite field? It turns out that if p is a prime number, then Z_p forms a finite field under the operations of addition and multiplication mod p. The system Z_m is not a field for m not prime, because (among other reasons) not all elements in Z_m have multiplicative inverses.
We will not in general worry about the specific details of a field in this course. Henceforth we will generally restrict our scalars to one of the fields Q, R, C, or Z_p. In each case, the operations (as, ms) of addition and multiplication of scalars are understood to be ordinary addition and multiplication, or addition and multiplication mod p, so we no longer need to specify them.
Finite Vector Spaces
We now generalize from some of the systems you worked with in the activities to find examples of vector spaces. In cases where we do find a vector space, we will prove that fact. Where we do not, we will investigate the vector space axioms that are violated. The examples we will consider fall naturally into two types: finite and infinite vector spaces.
Your work in the Activities should have convinced you that finite systems such as K = Z_3, V = (Z_3)^3, with component-wise addition mod 3 and component-wise multiplication mod 3, or K = Z_5, V = (Z_5)^2, with the corresponding operations defined mod 5, do satisfy all ten axioms of a vector space. In fact, you may have conjectured the following theorem.
Theorem 2.2.1. For any positive integer n and any prime p, (Z_p)^n forms a vector space over Z_p.
Note that in our theorem we have not mentioned the operations va, sm, as, or ms. Why not? Your work in the Activities should have convinced you that there is a natural choice for these operations. In order for the system to form a vector space, the operations will be done mod p. The theorem does require that p be prime; (Z_m)^n is only a vector space when Z_m is a field.
Proof. For a particular p and n we could always use ISETL to check all
ten axioms. However, the theorem holds for all primes p, so we will not
specify a particular one. We prove the theorem for the case n = 2 and leave
the generalization to any n for Exercise 16. Our proof consists of running
through the axioms for a vector space, and citing appropriate properties of
addition and multiplication mod p which you learned about in Chapter 1.
closure: v + w = ⟨(v_1 + w_1) mod p, (v_2 + w_2) mod p⟩ ∈ (Z_p)^2 since the remainder of v_i + w_i is always between 0 and p - 1.
commutativity:
v + w = ⟨(v_1 + w_1) mod p, (v_2 + w_2) mod p⟩
      = ⟨(w_1 + v_1) mod p, (w_2 + v_2) mod p⟩
      = w + v.
associativity: Since mod p addition is associative, the component-wise mod p addition is also associative.
zero vector: The vector 0 = ⟨0, 0⟩ ∈ (Z_p)^2 and
v + 0 = ⟨(v_1 + 0) mod p, (v_2 + 0) mod p⟩ = ⟨v_1, v_2⟩ = v.
vector inverses: The inverse of v = ⟨v_1, v_2⟩ is ⟨p - v_1, p - v_2⟩. Why?
closure: kv = ⟨(kv_1) mod p, (kv_2) mod p⟩ ∈ (Z_p)^2.
associativity: The associativity of multiplication mod p is inherited from the integers. Component-wise multiplication mod p is therefore associative.
distributive law 1:
k(v + w) = k⟨(v_1 + w_1) mod p, (v_2 + w_2) mod p⟩
         = ⟨k(v_1 + w_1) mod p, k(v_2 + w_2) mod p⟩
         = ⟨(kv_1 + kw_1) mod p, (kv_2 + kw_2) mod p⟩
         = ⟨kv_1, kv_2⟩ + ⟨kw_1, kw_2⟩
distributive law 2:
(k + j)v = ⟨((k + j)v_1) mod p, ((k + j)v_2) mod p⟩
         = ⟨(kv_1 + jv_1) mod p, (kv_2 + jv_2) mod p⟩
         = ⟨(kv_1) mod p, (kv_2) mod p⟩ + ⟨(jv_1) mod p, (jv_2) mod p⟩
         = kv + jv
identity scalar: Clearly 1 ∈ Z_p and 1v = v.
In the Activities you discovered some finite vector spaces that were not of the form (Z_p)^n. For example, the system in Activity 3(b), where K = Z_3 and V = {⟨x, x, x⟩ : x ∈ K}, forms a vector space. Note that all the vectors in this space have identical components, and all these vectors are also in the vector space (Z_3)^3. So, to determine whether or not this V is a vector space, we do not need to check all ten axioms. Axioms 2, 3, 7, 8, 9, and 10 are automatically true for this subset V since they are true for all vectors in (Z_3)^3 and scalars in K. Closure axioms 1 and 6 are fairly easily checked, since adding two vectors with identical components must result in a vector with identical components, and multiplying ⟨x, x, x⟩ by any element of Z_3 will result in a vector with identical components. It is clear that ⟨0, 0, 0⟩ is the zero vector in V and that the vector inverse ⟨3 - x, 3 - x, 3 - x⟩ of ⟨x, x, x⟩ is also in V, and thus axioms 4 and 5 are also satisfied. What other subsets of (Z_p)^n did you find to be vector spaces? Can you think of additional examples that were not explored in the Activities?
Some of the finite systems you worked with in the Activities were not vector spaces. For example, the system in Activity 3(l), where K = Z_5, V = (Z_5)^2, and k⟨x, y⟩ = ⟨x, y⟩, is not a vector space. To determine this, we do not need to check the first 5 axioms, because they only involve vector addition, and our theorem guarantees that this system satisfies the vector addition axioms. We do need to check axioms 6 through 10, since they all involve some form of scalar multiplication. Is this system closed under scalar multiplication? Does the theorem guarantee that? What is the identity scalar? Is there more than one in this case? Which of the distributive laws does not hold?
In the activities you also learned that the system K = {0}, V = (Z_3)^2 is not a vector space. Why not? Which axiom does it fail to satisfy? If we change K to Z_3, would the system be a vector space? What if we change K to Z_2?
Why is the system K = Z_5, V = {⟨x, y⟩ : x, y ∈ {1, 3}} not a vector space? How many axioms fail? Would changing K correct the problems? Does the system K = Z_5, V = {⟨x, y⟩ : x, y ∈ {0, 2, 4}} fail the same axioms or different ones? Can you fix K so that these systems will be vector spaces?
Infinite Vector Spaces
Now we turn our attention to infinite vector spaces. Consider V = R^2, with vector addition and scalar multiplication defined by the ordinary component-wise operations. Is V a vector space over the real numbers? Is R^3 a vector space? R^n? We answer these questions with a theorem.
Theorem 2.2.2. Let n be a positive integer. The space R^n of ordered n-tuples with components from R is a vector space over R.
Proof. We cannot use ISETL to prove this theorem (why not?), but we note that many of the vector space axioms are true as a consequence of properties of the real numbers. We only need to check the component-wise application of these properties. We now prove a few of the axioms for n = 2 and leave the rest to Exercise 17.
closure: v + u = ⟨v_1, v_2⟩ + ⟨u_1, u_2⟩ = ⟨v_1 + u_1, v_2 + u_2⟩ ∈ R^2.
commutativity: Exercise 17.
associativity:
(v + u) + w = ⟨v_1 + u_1, v_2 + u_2⟩ + ⟨w_1, w_2⟩
            = ⟨v_1 + u_1 + w_1, v_2 + u_2 + w_2⟩
            = ⟨v_1, v_2⟩ + ⟨u_1 + w_1, u_2 + w_2⟩ = v + (u + w)
zero vector: Exercise 17.
inverses: If v ∈ R^2 then -v = ⟨-v_1, -v_2⟩ ∈ V, and v + (-v) = ⟨0, 0⟩.
closure: Exercise 17.
associativity: Exercise 17.

distributive law 1:

k(v + u) = k(⟨v_1, v_2⟩ + ⟨u_1, u_2⟩)
         = k⟨v_1 + u_1, v_2 + u_2⟩
         = ⟨kv_1 + ku_1, kv_2 + ku_2⟩
         = ⟨kv_1, kv_2⟩ + ⟨ku_1, ku_2⟩ = kv + ku

distributive law 2: Exercise 17.

identity scalar: 1 ∈ R, and 1v = 1⟨v_1, v_2⟩ = ⟨1v_1, 1v_2⟩ = ⟨v_1, v_2⟩ = v.
Since R^n is a vector space, it seems reasonable to believe that C will also be a vector space over R. In this case V = {⟨a + bi⟩ : a, b ∈ R}. Scalar multiplication is defined by k⟨a + bi⟩ = ⟨ka + kbi⟩, and vector addition by ⟨a + bi⟩ + ⟨c + di⟩ = ⟨(a + c) + (b + d)i⟩. You will verify the vector space axioms in Exercise 5.
Is C^n a vector space over R? Is it a vector space over C? Is Q^n a vector space over Q? Why is Q^n not a vector space over R? Which closure axiom fails? In Activity 5, did you find infinite vector spaces that were not of the form (K)^n for some field K? Is K = R, V = {⟨x, 0, x⟩ : x ∈ R} a vector space? How would you verify closure under vector addition and scalar multiplication? Knowing that the operations in this space are the same as those for R^3, do you need to check commutativity, associativity, or the distributive laws? Does V contain a zero vector? What is the vector inverse of ⟨x, 0, x⟩ in V?

Does the system in Activity 5(d) satisfy the commutativity, associativity and distributive axioms? Since va and sm are not the usual operations on R, we need to check. What is the zero vector? What is the inverse of the vector ⟨x⟩? Is this system a vector space?
In Activity 5(a) you should have discovered that the system K = {1, −1}, V = (K)^3, va = ordinary component-wise multiplication, is not a vector space. Why not? Which axiom does it fail to satisfy? Is the system in Activity 5(c) closed under scalar multiplication? Is this system a vector space?
The vectors in a vector space do not necessarily have to be tuples of numbers. Polynomials and functions defined on a set S can also play the role of vectors. Vector spaces turn up in a wide variety of subjects. For example, vector spaces arise naturally in the study of solutions of systems of equations, geometry in 3-space, solutions of differential and integral equations, discrete and continuous Fourier transforms, quantum mechanics, and approximation theory.

Note that we are sometimes sloppy and write things such as "Let V = (Z_5)^2 be a vector space" with no specific mention of the corresponding field or operations. Technically this is incorrect. Why? In order to be a vector space, we have to specify not only the set V of vectors, but also the set K of scalars and the operations of vector addition and scalar multiplication. In many cases the scalars and operations are unambiguous, and so we just describe the set V of vectors. Henceforth, when K is not specified, you may assume it is Z_p if V is finite, or R if V is infinite. The operations va and sm are the standard operations on (Z_p)^n or R^n unless otherwise specified.
Non-Tuple Vector Spaces

There are two non-tuple vector spaces which we will discuss throughout this text. We present them here by beginning with the following theorem.

Theorem 2.2.3. The set P(K) is a vector space over K with the standard polynomial arithmetic. For any n, the set P_n(K) is a vector space over K with the standard polynomial arithmetic.

Proof. Left as an exercise (see Exercise 11).

This result is not very surprising because polynomials are really just tuples of numbers. Recall the definitions of pointwise operations on functions. If f and g are functions with the same domain and range, and addition and multiplication are defined on the range of f and g, then we can define f + g to be the function x ↦ f(x) + g(x) and kf to be the function x ↦ kf(x). Not only do the polynomials form a vector space, but they do so when interpreted as functions as well.
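Because ISETL funcs are themselves objects, these pointwise operations can be written directly as funcs that return funcs. The following is only an illustrative sketch, assuming the standard ISETL func syntax; the names f_plus and s_mult are ours.

f_plus := func(f, g);                  $ pointwise sum of two funcs
  return func(x);
    return f(x) + g(x);
  end;
end;

s_mult := func(k, f);                  $ pointwise scalar multiple of a func
  return func(x);
    return k * f(x);
  end;
end;

pf := func(x); return x**2; end;       $ the polynomial function x -> x^2
qf := func(x); return 3*x + 1; end;    $ the polynomial function x -> 3x + 1
r := f_plus(pf, qf);
r(2);                                  $ should return 11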
Theorem 2.2.4. The set PF(K) is a vector space over K with pointwise addition and scalar multiplication. For any n, the set PF_n(K) is a vector space over K with pointwise addition and scalar multiplication.
Proof. Left as an exercise (see Exercise 13).

The polynomial functions are actually only a small subset of a much larger collection, the infinitely differentiable functions on R. We make the following definition.

Definition 2.2.2. The collection of infinitely differentiable functions on R consists of all functions f : R → R for which f and all of its derivatives are defined on all of R. This set will be denoted by C^∞(R).

It should be clear that PF(R) ⊆ C^∞(R), but C^∞(R) contains other functions such as sin, cos, and x ↦ e^x. These functions also form a vector space.

Theorem 2.2.5. The set C^∞(R) is a vector space over R with pointwise addition and scalar multiplication.

Proof. Left as an exercise (see Exercise 14).

This last vector space is of great importance in the area of differential equations and is also interesting because it does not have a natural tuple structure.
Basic Properties of Vector Spaces

We conclude this section with the following theorem about vector spaces.

Theorem 2.2.6. Let V be a vector space over a field of scalars K. Then

1. The zero vector is unique.
2. Vector inverses are unique.
3. For any v ∈ V, 0v = 0.
4. Any scalar k times the zero vector is the zero vector (k0 = 0).
5. The scalar −1 times a vector v is the additive inverse of the vector.
6. If kv is 0 then either k = 0 or v = 0.
Proof. We prove (1) and (4) and leave the rest for Exercise 18.
(1): Suppose there are two zero vectors 0 and v in V. Consider the vector sum 0 + v. We can compute it in two ways. Since 0 is a zero vector, 0 + v = v. Since v is a zero vector, 0 + v = 0. Hence 0 = v.

(4): Using the second distributive law we know that for any v, k0 + kv = k(0 + v) = kv. Now adding −kv to both sides yields k0 = k0 + kv + (−kv) = kv + (−kv) = 0.
The combination of Properties (2) and (5) allows us to simplify our notation and to speak of vector subtraction. That is, the meaning of v − w is now clear: v − w = v + (−w). Our field of scalars K will also have an operation of scalar subtraction, defined as the addition of additive inverses. A field will also have an operation of division (multiplication by multiplicative inverses). For example, in Z_5, 4/3 = 4 · 2 = 3. Can we define an operation of vector division in a similar manner? Why or why not?
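The value 2 used here is the multiplicative inverse of 3 in Z_5, and ISETL can find it with a choose expression in the same style as the ds operation defined below. A small sketch:

inv3 := choose r in {0..4} | (3*r) mod 5 = 1;    $ multiplicative inverse of 3 in Z_5
inv3;                                            $ should return 2
(4 * inv3) mod 5;                                $ 4/3 in Z_5; should return 3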
name_vector_space
Most of the remainder of our work in this course will be done within the context of a vector space. When we are writing ISETL code, it would be helpful to have an easy way of defining and referring to all the necessary pieces of a vector space. Carefully consider the code below. What does this code do? What kind of objects are accepted? What kind of objects are returned? Any time you wish to work in ISETL with a finite vector space, we strongly suggest that you first apply name_vector_space and then work exclusively with the standard notation V, K, va, vs, as, ss, ms, ds, sm, ov, os, is. For now, you might try applying name_vector_space to the vector space (Z_3)^3.
name_vector_space := proc(set_scal, op_add_scal, op_mult_scal,
                          set_vec, op_add_vec, op_scal_vec_mult);
  $ SETS
  V := set_vec;                                  $ set of vectors
  K := set_scal;                                 $ set of scalars
  $ OPERATIONS
  va := op_add_vec;                              $ vector addition
  vs := |u,v -> choose w in V | v .va w = u|;    $ vector subtractn
  as := op_add_scal;                             $ addn of scalars
  ss := |s,t -> choose r in K | r .as t = s|;    $ subtn of scalars
  ms := op_mult_scal;                            $ mult. of scalars
  ds := |s,t -> choose r in K | r .ms t = s|;    $ divn of scalars
  sm := op_scal_vec_mult;                        $ scalar mult.
  $ DISTINGUISHED OBJECTS
  ov := choose o in V | forall v in V | o .va v = v;   $ Zero vector
  os := choose o in K | forall s in K | o .as s = s;   $ Zero scalar
  is := choose i in K | forall s in K | i .ms s = s;   $ Unit scalar
  write "Vector space objects defined: ","\n","\t",
        "V, K, va, vs, as, ss, ms, ds, sm, ov, os, is";
end proc;
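For example, a call of name_vector_space for the vector space (Z_3)^3 might look something like the sketch below. The operation definitions are ours, written with the same smap notation used in the proc; only the call itself follows the parameter order given above.

Z3 := {0..2};
tuples3 := {[x, y, z] : x, y, z in Z3};                    $ all 3-tuples over Z_3

add3 := |s, t -> (s + t) mod 3|;                           $ addition of scalars
mult3 := |s, t -> (s * t) mod 3|;                          $ multiplication of scalars
vadd3 := |u, v -> [(u(i) + v(i)) mod 3 : i in [1..3]]|;    $ vector addition
smult3 := |k, v -> [(k * v(i)) mod 3 : i in [1..3]]|;      $ scalar multiplication

name_vector_space(Z3, add3, mult3, tuples3, vadd3, smult3);
ov;                              $ should be [0, 0, 0]
[1, 2, 0] .va [2, 2, 1];         $ should be [0, 1, 1]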
Exercises

1. Let V be a set consisting of the single vector v, and let K = R. Let vector addition be defined by v + v = v and scalar multiplication by kv = v. Is V a vector space? If so, prove this. If not, list the axioms that are not satisfied by this system.

2. Prove that V = {⟨a, 0, 0⟩ : a ∈ Z_p} forms a vector space over Z_p.

3. Show that the line through the origin in R^3 in the direction ⟨a, b, c⟩ is a vector space. That is, show that {⟨ta, tb, tc⟩ : t ∈ R} is a vector space under the usual operations on vectors in R^3.

4. Show that the plane {⟨x, y, z⟩ : x, y, z ∈ R and ax + by + cz = 0} is a vector space. (Vector addition and scalar multiplication are the usual operations defined in R^3.)

5. Verify that C^n is a vector space over R.

6. Let V = {⟨x, y⟩ : x, y ∈ R, x ≥ 0}. Determine whether or not V forms a vector space over R under the usual operations of addition and multiplication for R^2.

7. Let V = R^3. Determine whether or not V forms a vector space over R under the usual operation of addition for R^3, if scalar multiplication is defined by k⟨x, y, z⟩ = ⟨x, ky, z⟩.
8. Let V = R^2. Determine whether or not V forms a vector space over R under the usual operation of addition for R^2, if scalar multiplication is defined by k⟨x, y⟩ = ⟨0, 0⟩.

9. Let V = R^2. Determine whether or not V forms a vector space over R if vector addition is defined by

⟨x_1, y_1⟩ + ⟨x_2, y_2⟩ = ⟨(x_1^5 + x_2^5)^{1/5}, (y_1^5 + y_2^5)^{1/5}⟩,

and scalar multiplication is defined by

k⟨x, y⟩ = ⟨k^{1/5} x, k^{1/5} y⟩.
10. Consider the set P3 = {a_0 + a_1 x + a_2 x^2 + a_3 x^3 : a_0, a_1, a_2, a_3 ∈ R, a_3 ≠ 0}, the set of all polynomials of degree three with coefficients in R. Show that P3 does not form a vector space over R under polynomial addition and scalar multiplication.
11. Prove Theorem 2.2.3.

12. Generalize the result of the previous exercise. That is, show that P_n(R), the set of all polynomials of degree n or less, forms a vector space over R.

13. Prove Theorem 2.2.4.

14. Prove Theorem 2.2.5.

15. Does the set of all real-valued discontinuous functions on S form a vector space over R under pointwise addition and scalar multiplication? Why not?

16. Generalize the proof of Theorem 2.2.1 for n ≥ 3.

17. Complete the proof of Theorem 2.2.2.

18. Complete the proof of Theorem 2.2.6.

19. Let V be a vector space. Prove that for every u, v ∈ V there is a unique vector w ∈ V such that w + v = u. How does this property relate to the operation of vector subtraction?
2.3 Subspaces
Activities
1. Use the ISETL func subset on the pairs below to determine when W is a subset of V.

(a) W = {⟨x, 1, 0⟩ : x ∈ Z_5}, V = (Z_5)^3.
(b) W = {⟨x, y⟩ : x, y ∈ Z_3}, V = (Z_5)^2.
(c) W = {⟨x, 0⟩ : x ∈ Z_5}, V = (Z_5)^3.
(d) W = V = (Z_2)^4.
2. Write an ISETL func is_subspace that accepts a set W and a vector space (that is, [K, V, va, sm]). The action of your func is to determine whether or not W is a nonempty subset of V, and whether W is also a vector space over K using va and sm. Test your func on each of the systems below.

(a) W_1 = {⟨1, 2⟩, ⟨2, 1⟩, ⟨0, 0⟩}, V = (Z_3)^2
(b) W_2 = {⟨0, 0, x⟩ : x ∈ Z_3}, V = (Z_3)^3
(c) W_3 = {⟨x, y, z, w⟩ : x, y, z, w ∈ Z_3, x + y = 2, z + w = 1}, V = (Z_3)^4.
(d) W_4 = {⟨x, y⟩ : x, y ∈ Z_2}, V = (Z_3)^2.
(e) W_5 = {⟨x, x⟩ : x ∈ Z_5}, V = (Z_5)^2
(f) W_6 = {⟨1, 1, 1⟩}, V = (Z_2)^3
(g) W_7 = {⟨0, 0, 0, 0⟩}, V = (Z_2)^4
(h) W_8 = {⟨x, 3, z⟩ : x, z ∈ Z_5}, V = (Z_5)^3.
(i) W_9 = {⟨x, y, 0⟩ : x, y ∈ Z_5}, V = (Z_5)^3.
(j) W_10 = {⟨x, y, z⟩ : x, y, z ∈ Z_5, x + y = z}, V = (Z_5)^3.
(k) W_11 = {⟨x, y⟩ : x, y ∈ Z_5, x + y = 0}, V = (Z_5)^2.
(l) W_12 = W_5 ∪ W_11, V = (Z_5)^2.
(m) W_13 = W_9 ∩ W_10, V = (Z_5)^3.
3. Find a subspace of (Z_5)^3 that is not W_8, W_9, or W_10. Use your func is_subspace to verify that your subset is a subspace.

4. Write an ISETL func is_subspace2 that accepts a set W and a vector space [K, V, va, sm]. The action of your func is to determine whether or not W is a nonempty subset of V, and whether or not W satisfies the vector space axioms 1, 4, 5, and 6. Test your func on the 13 systems in Activity 2.

5. Compare your results from Activities 2 and 4. For which systems do both funcs return true? For which systems do both funcs return false? Can you make a conjecture about the equivalence of is_subspace and is_subspace2?

6. Write an ISETL func that accepts as inputs a set W and a vector space [K, V, va, sm]. The action of your func is to determine whether or not W is a nonempty subset of V, and whether for all k ∈ K and w_1, w_2 ∈ W, kw_1 + w_2 ∈ W. Test your func on the systems given in Activity 2.

7. Compare your results from Activities 2 and 6. For which systems do both funcs return true? For which systems do both funcs return false? Can you make a conjecture about the equivalence of these two funcs?
Discussion

In these activities you explored subsets of vector spaces. In each case you worked with a subset of vectors from a vector space V over a field K, and you used the same operations of vector addition, scalar multiplication, and addition and multiplication of scalars that were defined for V and K. Sometimes this subset formed a vector space itself, and sometimes it did not. There is no general rule that would allow you to determine by inspection alone when such a subset will form a vector space, but we can make the following definition.

Definition 2.3.1. Let [K, V, va, sm] be a vector space over the field K, and let W be a nonempty subset of V. If [K, W, va, sm] is again a vector space over K, then we say that W is a subspace of V.
Note that in order to be a subspace, W must first be a nonempty subset of the vectors in V, and W must also satisfy all of the vector space axioms using the operations va and sm as they were defined for V over K. So, although the set of vectors W = (Z_2)^2 is a subset of V = (Z_3)^2, and the system [Z_2, W, ·_2, +_2] is a vector space, W is not a subspace of V. Why not? There are two problems here: the vectors in V and W are defined over two different fields, and vector addition and scalar multiplication are done mod 2 in W whereas they are done mod 3 in V. We could of course use mod 3 arithmetic in W, but under these operations W will not be a vector space. Why not? Which vector space axioms are not satisfied by [Z_2, W, ·_3, +_3]?
Now consider the vector space R^3 with the usual operations of vector addition and scalar multiplication, and the subset W = {⟨x, y, z⟩ : x + 2y + 3z = 0}. Is W a subset of R^3? Does W have a geometric interpretation? Is W itself a vector space?
Determination of Subspaces

One way of answering that last question is to check each of the ten vector space axioms for the system [R, W, ·, +]. However, this is much more work than is really necessary. Since the operations of vector addition and scalar multiplication are exactly the same for both R^3 and W, we do not need to recheck all of the vector space axioms for W. In fact, W will inherit commutativity, associativity, the distributive laws, and the scalar identity from R^3. Why? Which axioms does this allow us to avoid checking? Which axioms do we still need to check?

Your work in Activities 4 and 5 should have convinced you that we need only check four axioms: Axioms 1, 4, 5, and 6. We now check these axioms for [R, W, ·, +].
Axiom 1: Let w_1 = ⟨x_1, y_1, z_1⟩ and w_2 = ⟨x_2, y_2, z_2⟩ be arbitrary vectors in W. Then w_1 + w_2 = ⟨x_1 + x_2, y_1 + y_2, z_1 + z_2⟩ and (x_1 + x_2) + 2(y_1 + y_2) + 3(z_1 + z_2) = (x_1 + 2y_1 + 3z_1) + (x_2 + 2y_2 + 3z_2) = 0 + 0 = 0, so w_1 + w_2 ∈ W, and W is closed under vector addition.

Axiom 4: Since 0 + 2·0 + 3·0 = 0, the vector 0 = ⟨0, 0, 0⟩ is in W. We do not need to check that w + 0 = w. Why not?

Axiom 5: Let w = ⟨x, y, z⟩ ∈ W. Since w ∈ R^3, there is a vector −w ∈ R^3 with w + (−w) = 0. We need to show that −w is in W. Since x + 2y + 3z = 0, −(x + 2y + 3z) = −x + 2(−y) + 3(−z) = 0, so −w ∈ W.

Axiom 6: Let k ∈ R and w = ⟨x, y, z⟩ ∈ W. Since x + 2y + 3z = 0, k(x + 2y + 3z) = kx + 2ky + 3kz = 0, so kw = ⟨kx, ky, kz⟩ ∈ W.
Thus W is in fact a subspace of R^3. Recall that W has a familiar geometric interpretation as a plane through the origin in 3-space. Can you find another geometric subspace of R^3?
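W itself is infinite, so ISETL cannot enumerate it, but the same style of check can be run on a finite analog of W. For the corresponding subset of (Z_5)^3 a sketch might look like this; the names W5, va5, and sm5 are ours.

Z5 := {0..4};
W5 := {[x, y, z] : x, y, z in Z5 | (x + 2*y + 3*z) mod 5 = 0};

va5 := |u, v -> [(u(i) + v(i)) mod 5 : i in [1..3]]|;
sm5 := |k, v -> [(k * v(i)) mod 5 : i in [1..3]]|;

forall u, v in W5 | u .va5 v in W5;          $ Axiom 1: closure under addition
[0, 0, 0] in W5;                             $ Axiom 4: zero vector
forall k in Z5, w in W5 | k .sm5 w in W5;    $ Axiom 6: closure under scalar mult.

Each test should return true.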
Suppose W_2 = {⟨x, y, z⟩ : x + 2y + 3z = 2} is another plane in 3-space. How does W_2 differ from W? Is W_2 a subspace of R^3? Which of Axioms 1, 4, 5, or 6 does W_2 fail to satisfy?
We can generalize these results in a theorem:
Theorem 2.3.1. A nonempty subset W of a vector space V over K is a subspace if and only if W is closed under the inherited vector addition and scalar multiplication, the zero vector is in W, and each vector w in W has a vector inverse −w in W.

Proof. (⇒) If W is a subspace of V over K, then W is itself a vector space and therefore satisfies all ten vector space axioms.

(⇐) The proof of this is similar to our work above and is left for Exercise 6.
In Activities 6 and 7, you may have observed that it is not necessary
to check all four of these axioms separately. You may have conjectured the
following theorem.
Theorem 2.3.2. A nonempty subset W of a vector space V over K is a subspace if and only if for all w_1, w_2 ∈ W and k ∈ K, kw_1 + w_2 ∈ W.

Proof. (⇒) Left as an exercise (see Exercise 7).

(⇐) We give only a rough sketch of the proof, and leave the details for Exercise 15. Use Theorem 2.3.1, so that we only need to verify four axioms for W. Assume kw_1 + w_2 ∈ W. If we choose k = 1, then we can easily show that W is closed under vector addition. Since W is nonempty, we can find a vector w ∈ W and let w = w_1 = w_2. Then by letting k = −1, one can show that 0 ∈ W. Still using k = −1, but letting w_2 = 0 (which we now know is in W), one can show that vector inverses are in W. Finally, letting w_2 be 0, and k, w_1 be arbitrary, will show that W is closed under scalar multiplication.
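The single condition of Theorem 2.3.2 translates almost word for word into ISETL. A sketch of such a test, essentially the func asked for in Activity 6 but with the pieces of the vector space passed separately rather than as the 4-tuple [K, V, va, sm], might look like this; the name is_subspace3 is ours.

is_subspace3 := func(W, K, V, va, sm);
  $ nonempty subset of V that is closed under k.sm w1 .va w2
  return (W subset V) and (W /= {}) and
         (forall k in K, w1, w2 in W | (k .sm w1) .va w2 in W);
end;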
Any vector space V will have at least two subspaces, the subspace V itself,
and the zero subspace (consisting solely of the vector 0). Why are these both
subspaces? Why are they called improper subspaces? Does every vector
space necessarily have proper subspaces?
Is R^2 a subspace of R^3? Carefully re-read the definition of a subspace. Can you see why R^2 is not a subspace of R^3? Is W_2 = {⟨x, y, 0⟩ : x, y ∈ R} a subspace of R^3? Note that the subspace W_2 "looks like" or "behaves exactly like" R^2. We say that R^2 and W_2 are isomorphic vector spaces, and that W_2 is an embedding of R^2 in R^3. Are there other copies of R^2 that can be embedded in R^3?

Is R^1 a subspace of R^3? Of R^2? Can you find a subspace W_1 of R^3 that is isomorphic to R^1? How many different isomorphic subspaces of R^1 are there in R^3? Can you find a subspace of R^n that is isomorphic to R^m for all m < n?
Non-Tuple Vector Spaces

When we discussed the polynomials, the polynomial functions and the infinitely differentiable functions, some subset relationships were presented. We are now able to describe the relationship between these sets more clearly in the following theorems.

Theorem 2.3.3. For any set of scalars K and n, m with n < m, the following statements are true:

P_n(K) is a subspace of P_m(K);
P_n(K) is a subspace of P(K).

Proof. Left as an exercise (see Exercise 9).

Theorem 2.3.4. For any set of scalars K and n, m with n < m, the following statements are true:

PF_n(K) is a subspace of PF_m(K);
PF_n(K) is a subspace of PF(K).

Proof. Left as an exercise (see Exercise 10).

Theorem 2.3.5. The following statements are true:
PF_n(R) is a subspace of C^∞(R);
PF(R) is a subspace of C^∞(R).

Proof. Left as an exercise (see Exercise 11).
Exercises

1. Let V be a vector space. Prove that {0} is a subspace of V.

2. Let L be a line through the origin in R^3. Prove that L is a subspace of R^3.

3. Show that the set of all points on the line y = mx + b is a subspace of R^2 if and only if b = 0.

4. Show that the set of all points in the plane ax + by + cz = d is a subspace of R^3 if and only if d = 0.

5. Let W be the subset of P_2(R) consisting of all polynomials of the form f(x) = a_1 x + a_2 x^2, a_1, a_2 ∈ R. Determine whether or not W is a subspace of P_2(R).

6. Complete the proof of Theorem 2.3.1.

7. Complete the proof of Theorem 2.3.2.

8. Which of the following are subspaces of R^3?

(a) W = {⟨x, y, z⟩ : x − z = 1, y + z = 2}.
(b) W = {⟨x, y, z⟩ : xy = 0}.
(c) W = {⟨0, y, 0⟩}.

9. Prove Theorem 2.3.3.

10. Prove Theorem 2.3.4.

11. Prove Theorem 2.3.5.

12. Which of the following subsets of C^∞(R) are subspaces of C^∞(R)?
(a) {f ∈ C^∞(R) : f(1) = 0}
(b) {f ∈ C^∞(R) : f(0) = f(1)}
(c) {f ∈ C^∞(R) : f ≥ 0}
(d) {f ∈ C^∞(R) : f(−x) = −f(x)}
(e) {f ∈ C^∞(R) : f(x^2) = (f(x))^2}
(f) {f ∈ C^∞(R) : f(x^2) = 2(f(x))}
(g) {f ∈ C^∞(R) : f(x) = a}
13. Let W_1 and W_2 be subspaces of a vector space V. Is W_1 ∩ W_2 a subspace? If so, prove it. If not, find a counterexample.

14. Let W_1 and W_2 be subspaces of a vector space V. Is W_1 ∪ W_2 a subspace? If so, prove it. If not, find a counterexample.
15. Let W = {(x, y) : x^2 + y^2 ≤ 9} be a subset of R^2. (W is a disk of radius 3.) Is W a subspace of R^2? Why or why not?

16. Is Q^3 a subspace of R^3? (What is K?)

17. Let m ≤ n. Find two distinct subspaces of R^n that are isomorphic to R^m.
Chapter 3
First Look at Systems

In this chapter, you will certainly recognize ideas that anyone would call algebra. We revisit systems of equations (perhaps your high school text called them simultaneous systems) and explore a couple of methods for finding the solutions to these systems. You will find some interesting procedures in the next sections and probably some new interpretations for things you have met before.
3.1 Systems of Equations
Activities
1. Let K = Z_3, the set of integers modulo 3. Write a statement in ISETL that determines whether the following tuples: [x, y, z] = [2, 1, 1], [1, 1, 1], [2, 2, 2], and [1, 0, 0] are or are not a solution of the equation 2x + y + 2z = 1.
2. Let K = Z_3. Construct a func in ISETL that accepts a sequence [x, y, z] of three elements of K as input; that substitutes the elements of the sequence into the respective unknowns of the equation 2x + y + 2z = 1; and that returns true, if the substituted values result in the equation being true, or returns false, if the substituted values result in the equation being false. Use this func to find the solution set of the equation.

3. Use the func you wrote in the prior activity to construct the solution set of the equation 2x + y + 2z = 1 in K = Z_3. In particular, you will want to define the set in such a manner that you iterate through every possible sequence of three elements in K (test every sequence over K) to identify all possible solutions.
4. Given K = Z_p, where p is a prime number, and given a linear equation in K such as

a_1 x_1 + a_2 x_2 + ··· + a_q x_q = c,

where a_i ∈ K, i = 1, . . . , q, and c ∈ K, construct a func One_eqn that accepts the modulus p, the sequence [a_1, a_2, . . . , a_q] over K of coefficients, and the constant c as input, and that yields the set of all solutions [x_1, x_2, . . . , x_q] of the equation as output. Test your func on the equation defined in Activity 1.
5. Let K = Z_2. For the equations

x + y + z = 1
x + z = 0,

use the func One_eqn you wrote in the last activity to determine the solution set of the first equation. Then use the same func to determine the solution set of the second equation. Find the intersection of both solution sets. What property do the elements of the intersection set have? What is the solution set of these two equations taken as a single system?
6. Let K = Z_2. Construct a func in ISETL that accepts a sequence [x, y, z] of three elements as input; that substitutes the elements of the sequence into the respective unknowns of the equations

x + y + z = 1
x + z = 0;

and that returns true, if the substituted values result in both equations being true, or returns false, if the substituted values result in one or both equations being false.
7. Use ISETL code to construct the solution set of the system of equations given by

x + y + z = 1
x + z = 0

in K = Z_2. In particular, you will want to define the set in such a manner that you iterate through every possible sequence of three elements in K (test every sequence over K) to identify all possible solutions.
8. Given two equations

a_1 x_1 + a_2 x_2 + ··· + a_q x_q = c_1
b_1 x_1 + b_2 x_2 + ··· + b_q x_q = c_2

in K = Z_p, where p is a prime number, construct a func Two_eqn that accepts the modulus p, two sequences of coefficients, [a_1, a_2, . . . , a_q] and [b_1, b_2, . . . , b_q], and two constants, c_1 and c_2, as input, and that returns the set of all solutions [x_1, x_2, . . . , x_q] of both equations as output.
9. Given a system of three equations

a_1 x_1 + a_2 x_2 + ··· + a_q x_q = c_1
b_1 x_1 + b_2 x_2 + ··· + b_q x_q = c_2
d_1 x_1 + d_2 x_2 + ··· + d_q x_q = c_3

in K = Z_p, where p is a prime number, construct a func Three_eqn that accepts the modulus p, three sequences of coefficients, [a_1, a_2, . . . , a_q], [b_1, b_2, . . . , b_q], and [d_1, d_2, . . . , d_q], and three constants, c_1, c_2, and c_3, as input, and that returns the set of all solutions [x_1, x_2, . . . , x_q] to the system as output. Test your func on the system

x + 2y + z = 1
2x + y + 2z = 2
2x + 2y + z = 1

over Z_3. Describe the process for constructing such a func for any number of equations.
10. Let K = Z_5. Answer the following set of questions in relation to the system of equations in Z_5 given below.

3x_1 + 2x_2 + x_3 = 2
x_1 + 4x_2 + 4x_3 = 3
2x_1 + x_2 + 2x_3 = 2.

(a) Find the solution of this system using the func Three_eqn you wrote before.
(b) Interchange the first and second equations of the system. Find the solution of this new system using the func Three_eqn you wrote before. What do you observe?
(c) Multiply both sides of equation 2 by 3. Replace the second equation by this new equation. Apply the func Three_eqn to this transformed system. What do you observe?
(d) Multiply both sides of equation 2 by 3. Then, add the modified version of equation 2 to equation 1. Replace the second equation by this new equation. Apply the func Three_eqn to this transformed system. What do you observe?
(e) What operations can you do to transform the equations of the
system without changing its solution set?
11. Let K = Z_5. Answer the following set of questions in relation to the system in Z_5 given below.

2x_1 + 3x_2 + x_3 = 3
x_1 + 4x_2 + 2x_3 = 1
3x_1 + x_2 + 2x_3 = 2.

(a) Apply the func Three_eqn to find the set of sequences [x_1, x_2, x_3] that are simultaneous solutions of all three equations.

(b) Multiply both sides of equation 2 by 3. Then, add the modified version of equation 2 to equation 1. In particular,

R2′ (new eqn 2) = R1 + 3R2.

Apply the func Three_eqn to the system

2x_1 + 3x_2 + x_3 = 3
R2′
3x_1 + x_2 + 2x_3 = 2.

Compare the solution set of this system to the original.

(c) Add equation 1 to equation 3. In particular,

R3′ (new eqn 3) = R1 + R3.

Apply the func Three_eqn to the system

2x_1 + 3x_2 + x_3 = 3
R2′
R3′.

Compare the solution set of this system to the original.

(d) Interchange rows 2 and 3. In particular,

R2″ = R3′
R3″ = R2′.

Apply the func Three_eqn to the system

2x_1 + 3x_2 + x_3 = 3
R2″
R3″.

Compare the solution set of this system to that of the original.

(e) Does the process outlined in parts (b), (c) and (d) change the solution set of the system? Why does the process described here appear to be effective in helping us to identify the solution of the original system?
12. Let K = Z_5. Given the system of equations

3x_1 + x_2 + 4x_3 = 1
x_1 + 3x_2 + 3x_3 = 4
4x_1 + x_2 + 3x_3 = 3,

find the solution set by hand using a process similar to what was outlined in the prior activity. Verify your work by applying the func Three_eqn to both the original system and the simplified system you produced by hand. What do you observe?
Discussion

Algebraic Expressions and Linear Equations

In previous courses in algebra, you spent a great deal of time working with algebraic expressions. Do you remember the difference between an algebraic expression and an equation? In some cases, you were asked to simplify expressions by applying the distributive property; the exercise

Simplify the expression: 6x(x + y) − 3(x^2 − 2xy)

is such an example. Similarly, you were assigned problems in which you were asked to combine like terms. Exercises such as

Simplify by combining like terms: 5bcd − 8cd − 12bcd + cd

fit into this category and are probably very familiar to you. You also spent considerable time factoring polynomials like

4x^3 + 27x^2 + 5x − 3
25a^2 − 20ab + 4b^2.

What was the purpose of these tasks?
Although these exercises may have seemed pointless, they were designed with several objectives in mind: in particular, you were being taught about the concept of variable. What are the values that each variable can assume in algebraic expressions such as

6x(x + y) − 3(x^2 − 2xy)?

That is, what values can you select for x and for y, which, when substituted into the expression, yield a single number answer?

On the other hand, if we take one of the algebraic expressions above, such as 4x^3 + 27x^2 + 5x − 3, and set it equal to, say, 4, which yields 4x^3 + 27x^2 + 5x − 3 = 4, we now have an equation. What happens if you substitute values for x in this case? Is it always a true statement?
In a similar fashion, if we take two of the other expressions given above, say 5bcd − 8cd − 12bcd + cd and 25a^2 − 20ab + 4b^2, and set them equal to one another, the resulting equation

5bcd − 8cd − 12bcd + cd = 25a^2 − 20ab + 4b^2

will be true only for appropriately selected sequences [a, b, c, d] of values for a, b, c, and d. Can you find some examples of values for a, b, c, and d such that the equation will be false? Can you find some examples of values for a, b, c, and d such that the equation will be true? The set of values which an unknown, or sequence of unknowns, can assume in any given equation is called the solution set of the equation. In this section and throughout the remainder of this course, we will focus our attention on finding solution sets of linear equations and systems of linear equations. Do you recall the difference between a linear equation and one that is not linear? Can you give an example of each?
A linear equation is any equation of the form

a_1 x_1 + a_2 x_2 + ··· + a_q x_q = c,

where a_1, a_2, . . . , a_q, and c are constants in K, where K is the set of real numbers or the set Z_p, with p prime, and x_1, x_2, . . . , x_q are unknowns. Equations such as 3x_1^2 + 4x_2^2 = 2, 3y − sin(y) = 1, or 2x^4 − 4xy + 5y = 7 are not linear equations.
Definition 3.1.1. Let K be a field. A linear equation with coefficients in K is of the form a_1 x_1 + a_2 x_2 + ··· + a_m x_m = c, where a_i ∈ K, i = 1, 2, . . . , m denote the coefficients, x_i, i = 1, 2, . . . , m represent the unknowns, and c ∈ K is a constant.
Definition 3.1.2. A sequence [s_1, s_2, . . . , s_m] is a solution of the equation if, when s_i is substituted for x_i, i = 1, 2, . . . , m, the equation

a_1 s_1 + a_2 s_2 + ··· + a_m s_m = c

is true. The solution set of a linear equation is the collection of all such solutions.
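This definition translates directly into ISETL. A sketch of a func that checks whether a sequence s is a solution of the equation with coefficient tuple a and constant c over Z_p might look like the following; the name is_solution is ours, and we assume the %+ reduction operator of standard ISETL.

is_solution := func(p, a, c, s);
  $ true iff s solves a(1)x_1 + ... + a(q)x_q = c over Z_p
  return (%+ [a(i) * s(i) : i in [1..#a]]) mod p = c mod p;
end;

is_solution(3, [2, 1, 2], 1, [2, 1, 1]);   $ checks 2x + y + 2z = 1 at [2, 1, 1]; true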
In Activity 1, you were asked to determine whether the sequences [2, 1, 1], [1, 1, 1], [2, 2, 2], and [1, 0, 0] are solutions of the equation 2x + y + 2z = 1 in K = Z_3. It is convenient to remember here that all the equalities in the activities where the variables are elements of a finite field are congruences, and that it is implicit in the notation Z_k that all the operations have to be done modulo k. For example, when we write 3x + 2y = 4 where x and y are in Z_7, we mean 3x + 2y ≡ 4 (mod 7). What do you have to ask when you want to know whether a sequence is an element of the solution set of an equation? By substitution of the sequences you were able to find the sequences that are solutions to the equation. Can you find all the sequences that are solutions to this equation?
What is the purpose of the func you wrote in Activity 2? How can you find all the elements of the solution set of an equation? In Activity 3 you answered this question for a particular equation, and in Activity 4 you constructed and tested a func that would return the solution set of a single linear equation in K = Z_p, where p is prime; in particular, you were asked to write code that would input a sequence of coefficients and a constant of a linear equation, and then return the solution set of the equation by iterating through every possible sequence of elements of K.
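For the equation 2x + y + 2z = 1 of Activity 1, for instance, the entire solution set over Z_3 can be produced with a single set former. A sketch along these lines (not necessarily the func you wrote):

S := {[x, y, z] : x, y, z in {0..2} | (2*x + y + 2*z) mod 3 = 1};
S;        $ nine solutions: one for each choice of x and z, with y determined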
If K is finite and small, as it was in the activities, it is possible to check every possible sequence of values for the unknowns of an equation. If K is infinite, however, we cannot check every possible sequence. For example, if K = R, the set of real numbers, we cannot find the solution set of a linear equation by checking every sequence of possible values for the unknowns, because there are infinitely many possibilities to check. Instead, we try to determine the solution set by transforming the original equation into a simpler but equivalent equation, that is, an equation that has the same solution set, whose solution can be easily identified. For linear equations, this involves isolating an unknown variable. For example, given an equation like 2x + 3 = 11, what are the transformations you would do to find an equivalent equation which tells you directly the solution to the original equation? What properties do you use to transform the equation into an equivalent one? How do we know that the method for transforming the equation yields each and every solution?
Forms of Solution Sets
In the case of a linear equation of a single variable, we know that, if it has solutions, there is exactly one solution. For a linear equation of more than one variable, say 4x − y = 5, this is not the case. We can simplify the equation, however, by isolating y. What are the transformations you would do in this case? Can you identify the solution set of this equation? Can you identify the geometric representation of the solution set? We will return to this in the last section of this chapter.

If the variables of an equation are elements of a finite field, we can always find all the solutions in its solution set. If the variables are elements of an infinite field, this is not always the case. Why?
The solution set of the equation we were considering before can be expressed in a variety of ways. If we simply isolate y, we get the form y = 4x − 5.
If x and y are in R, we can select any value for x, which, via the expression, yields a corresponding value for y. If we let x = t, then we get the parametric form

x = t
y = 4t − 5

of the equation. The solution set of this equation can be written in vector form. For example, we can interpret the equation 4x − y = 5 as consisting of all vectors ⟨a, b⟩ in R^2 whose components, when substituted, x = a, y = b, result in the equation being true; that is, the vectors that are solutions of the equation. The expression for the x coordinate would be placed in the first component, and the expression for y would be placed in the second component. The vector form of the solution set of 4x − y = 5 is given by the set

S = {⟨t, 4t − 5⟩ : t ∈ R}.

Can you express the vectors in S in terms of other vectors, using the operations you learned in Chapter 2? Is S a vector space?

We can also express the solution as a sequence. In this case, we are interpreting the solution of the equation 4x − y = 5 as the set of all sequence combinations [c, d] of elements in R such that if we let x = c and y = d, the equation is true. In this case, the solution set of the given equation assumes the form

S = {[c, 4c − 5] : c ∈ R}.
Given a linear equation in four variables, say

3x_1 + 2x_2 − 4x_3 + x_4 = 5,

to obtain a solution set we would follow the same basic procedure as we did in simplifying the linear equation in two variables; in particular, we would transform the equation into an equivalent one where the first unknown is isolated:

x_1 = 5/3 − (2/3)x_2 + (4/3)x_3 − (1/3)x_4.

Can you identify the equivalent equations involved in the transformation of this equation? In this solution, x_2, x_3, and x_4 are free variables, because they can assume any value, while x_1 is dependent upon, or is determined by, the values selected for x_2, x_3, and x_4. Must x_1 necessarily be the dependent variable? What is the vector form of the solution set of this equation? The sequence form? What are the vector and sequence forms of the solution set of the general linear equation given in Definition 3.1.2?
In vector form, the solution set is given by

S = {⟨5/3 − (2/3)t_1 + (4/3)t_2 − (1/3)t_3, t_1, t_2, t_3⟩ : t_1, t_2, t_3 ∈ R},

where S represents the set of vectors in R^4 whose components satisfy the equation

3x_1 + 2x_2 − 4x_3 + x_4 = 5.
In sequence form, the solution set looks like

S = {[5/3 − (2/3)t_1 + (4/3)t_2 − (1/3)t_3, t_1, t_2, t_3] : t_1, t_2, t_3 ∈ R},

where S represents all combinations of values for the unknowns that are solutions of the equation.
In general, for a single linear equation

a_1 x_1 + a_2 x_2 + ··· + a_q x_q = c,

the solution set in vector form can be written as

S = {⟨c/a_1 − (a_2/a_1)t_1 − (a_3/a_1)t_2 − ··· − (a_q/a_1)t_{q−1}, t_1, t_2, . . . , t_{q−1}⟩ : t_1, t_2, . . . , t_{q−1} ∈ R},

where S is the set of vectors in R^q whose components are solutions of the equation; and the sequence form of the solution set is given by the set

S = {[c/a_1 − (a_2/a_1)t_1 − (a_3/a_1)t_2 − ··· − (a_q/a_1)t_{q−1}, t_1, t_2, . . . , t_{q−1}] : t_1, t_2, . . . , t_{q−1} ∈ R},

where S represents all combinations of values for the unknowns that are solutions of the equation. Note that any x_i, i = 1..q, can be used as the dependent variable by solving as we did for x_1. Is S a subspace of R^q?
Systems of Linear Equations

A system of equations is a collection of two or more linear equations. We are interested in finding the solution set of systems of equations. In Activity 5 you used what you learned in previous activities to find the solution set of each of the equations that form the given system. Then you found the solution set of the system. Can you define the meaning of the solution set of a system of equations? How is the solution set of the system related to the solution set of each of the equations? In Activity 6 you worked with the same system, but now you constructed a func that allowed you to check whether any sequence of your choice would be a solution to the system, and in Activity 7 you asked the computer to determine all solutions by testing every possible sequence in the func you constructed in Activity 6. You then generalized this process in Activity 8 by constructing a func that would accept the coefficients and the constants corresponding to any pair of equations in Z_p, p prime, and that would return the solution set of the system. Can you explain in your own words how this func works?

In Activity 9 you constructed and tested a func for solving a system of three equations in Z_p, p prime. You were then asked how you would generalize the procedure to construct a func for any number of equations in K = Z_p. Can you use these ideas to describe a general process for finding the solution of any system of m equations and n unknowns, where n and m are any integers larger than 1, and K is a finite field? If K is finite, it is possible to construct the solution set of a system of linear equations by writing code that checks every possible sequence of values for the unknowns. Why? If K is an infinite set, for example, the real numbers or the complex numbers, such iteration is not possible. As with a single linear equation in R, it is necessary to devise a process that transforms the original system into a simpler, equivalent system whose solution set is readily identifiable. In order to do this you were asked in Activity 10 to perform some operations on a system of equations and to solve the transformed systems. What did you find about the solution of each of those systems? The operations used are called elementary transformations, and as you found out, these operations transform the system into a new system that has the same solution set.
Definition 3.1.3. Given a system of m equations and n unknowns over a field K, the original system can be transformed into a simpler, equivalent system by applying one or more elementary transformations, each of which is listed below:

Interchange the position of two equations.
Multiply both sides of an equation by a nonzero constant.
Add a multiple of one equation to another equation.
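Over a finite field these claims can be checked empirically. For example, one can compare the solution set of the Activity 10 system over Z_5 with the solution set obtained after multiplying its second equation by 3. The sketch below uses plain set formers and nothing beyond what the Activities provide; the names S_original and S_transformed are ours.

K := {0..4};
S_original := {[x, y, z] : x, y, z in K |
    (3*x + 2*y + z) mod 5 = 2 and (x + 4*y + 4*z) mod 5 = 3
                               and (2*x + y + 2*z) mod 5 = 2};
S_transformed := {[x, y, z] : x, y, z in K |
    (3*x + 2*y + z) mod 5 = 2 and (3*x + 12*y + 12*z) mod 5 = 4
                               and (2*x + y + 2*z) mod 5 = 2};
S_original = S_transformed;       $ should return true (note 4 = 9 mod 5)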
In Activity 11 you were asked to transform a system where K = Z_5 into equivalent systems, that is, into systems that have the same solution set. It is important to remember that all the operations used while working with this activity are done modulo 5. In the first step, you multiplied the second equation by 3, and added the result to the first equation to produce a "new" second equation. This operation involved two elementary transformations: the first involved multiplying the second equation by the constant 3; the second involved adding the first equation to the second equation. What did you observe? Why might the form you obtained be considered simpler? If you did not have access to the func, how would you find the solution of the system? Why do you suppose the solution set of the original system and the simpler system are equal? In the second step, you performed a similar transformation to alter the third equation. Again, the resulting system yielded the same solution set. In the last step, you applied the third type of elementary transformation: you interchanged equations 2 and 3. The final form

2x_1 + 3x_2 + x_3 = 3
4x_2 + 3x_3 = 0
2x_3 = 1
is an equivalent system. Unlike the original system, it is possible to identify the solution set by hand. Specifically, the last equation reveals that x_3 = 3. If we substitute this value into the second equation, we see that

4x_2 + 3(3) = 0,

from which it follows that x_2 = 4. If we substitute x_2 = 4 and x_3 = 3 into the first equation, we get

2x_1 + 3(4) + 3 = 3,

which yields x_1 = 4. Hence, the original system in K = Z_5

2x_1 + 3x_2 + x_3 = 3
x_1 + 4x_2 + 2x_3 = 1
3x_1 + x_2 + 2x_3 = 2,

has only one solution, namely x_1 = 4, x_2 = 4, x_3 = 3. If the solution set is written in vector form, we have

S = {⟨4, 4, 3⟩},

and if it is expressed in sequence form, we get

S = {[4, 4, 3]}.
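It is easy to confirm this in ISETL by substituting [4, 4, 3] into each equation of the original system, for instance:

s := [4, 4, 3];
(2*s(1) + 3*s(2) + s(3)) mod 5 = 3;       $ true
(s(1) + 4*s(2) + 2*s(3)) mod 5 = 1;       $ true
(3*s(1) + s(2) + 2*s(3)) mod 5 = 2;       $ true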
Observe that the simplified system

2x_1 + 3x_2 + x_3 = 3
4x_2 + 3x_3 = 0
2x_3 = 1

has no x_1 term in the second equation and neither an x_1 nor an x_2 term in the third equation. We could have added additional steps to simplify even further. In particular, if we multiply each equation by a suitable nonzero constant, we eventually get a triangular-looking system

x_1 + 4x_2 + 3x_3 = 4
x_2 + 2x_3 = 0
x_3 = 3

said to be in echelon form. The entries corresponding to the x_1 term in the first equation, the x_2 term in the second equation, and the x_3 term in the third equation are called leading entries.
As you can see, the process of transforming a system of equations into echelon form involves isolating variables: in particular, we isolated x_3 and then used it to isolate x_2, whereby we then isolated x_1. The three elementary transformations, interchanging two equations, multiplying both sides of an equation by a constant, and adding a multiple of one equation to another, are the tools by which we can transform a system of equations into an equivalent system that is in echelon form. Can you transform the system given in Activity 10 into an equivalent system which is in echelon form?

In Chapter 6 it will be shown that the process used to transform the system into its echelon form does not change the solution set of any system of linear equations. Before we think about a proof, let's consider the following
example in R:

2x_1 − x_2 + 3x_3 + x_4 = −2
3x_1 + 2x_2 − 4x_3 + 2x_4 = 3
−x_1 + 4x_2 − 2x_3 + 5x_4 = 1.
Based upon what you did in Activity 11, the first goal is to transform the original system into an equivalent system in which the x_1 term in the second equation vanishes. What elementary transformation has been performed to transform the system into

2x_1 − x_2 + 3x_3 + x_4 = −2
7x_2 − 17x_3 + x_4 = 12
−x_1 + 4x_2 − 2x_3 + 5x_4 = 1?
The next step might be to eliminate the x_1 term from the third equation. What elementary transformation was used to transform the system into

2x_1 − x_2 + 3x_3 + x_4 = −2
7x_2 − 17x_3 + x_4 = 12
7x_2 − x_3 + 11x_4 = 0?
The last transformation left an x_2 term in the third equation. What elementary transformation was applied to transform the system into

2x_1 − x_2 + 3x_3 + x_4 = −2
7x_2 − 17x_3 + x_4 = 12
16x_3 + 10x_4 = −12?

Is the next system equivalent to the given one? Why? In order to get the system into echelon form, we need a coefficient of 1 for each leading entry. How would we go about doing this?
x_1 − (1/2)x_2 + (3/2)x_3 + (1/2)x_4 = −1
x_2 − (17/7)x_3 + (1/7)x_4 = 12/7
x_3 + (5/8)x_4 = −3/4

How many leading entries does this system have?
This system is now in echelon form, with leading entries provided by the x_1 term in the first equation, the x_2 term in the second equation, and the x_3 term in the last equation. Unlike the prior example, the last unknown will not assume a single value. In particular, the last equation x_3 + (5/8)x_4 = −3/4 is a linear equation in two variables. If we isolate the x_3 term, we get

x_3 = −3/4 − (5/8)x_4.

This means that x_4 can assume any value; it is called a free variable. If we let x_4 = t, we get

x_3 = −3/4 − (5/8)t.
Substituting t for x_4 and −3/4 − (5/8)t for x_3 in the second equation yields

x_2 − (17/7)(−3/4 − (5/8)t) + (1/7)t = 12/7
x_2 + 51/28 + (93/56)t = 12/7
x_2 = −3/28 − (93/56)t.
If we substitute these expressions into equation 1, we find that

x_1 − (1/2)(−3/28 − (93/56)t) + (3/2)(−3/4 − (5/8)t) + (1/2)t = −1
x_1 − 15/14 + (11/28)t = −1
x_1 = 1/14 − (11/28)t.
If we now write the solution in parametric form, we get

x_1 = 1/14 − (11/28)t
x_2 = −3/28 − (93/56)t
x_3 = −3/4 − (5/8)t
x_4 = t,
where t is any real number. In vector form, the solution set can be written as

S = {⟨1/14 − (11/28)t, −3/28 − (93/56)t, −3/4 − (5/8)t, t⟩ : t ∈ R}.
Is S a vector space? The algebraic structure on vector spaces can be used to express the solution set in vector form in different suitable ways. Can you think of one such expression? How would you write the solution in sequence form? How many solutions does this system have? We can find any specific solution by selecting a value for t. If we substitute the representations given for x_1, x_2, x_3, and x_4 into each equation in the original system, all three equations will be true, thereby proving that the proposed solution set, no matter its specific form, is indeed the solution set of the original system of equations. Is this always true?

In Activity 12 you transformed the given system using elementary transformations. How many leading entries did the system in echelon form have? What is the solution of that system? Is it possible for a system to have no solution?
Let's consider another example in R:

3x_1 + 6x_2 − 3x_3 = 6
2x_1 − 4x_2 − 3x_3 = 1
3x_1 + 6x_2 − 2x_3 = 10

What are the elementary operations used to transform the system into the following equivalent systems?

3x_1 + 6x_2 − 3x_3 = 6
4x_2 − 15x_3 = 9
3x_1 + 6x_2 − 3x_3 = 10
and

3x_1 + 6x_2 − 3x_3 = 6
4x_2 − 15x_3 = 9
0 = 4

The last equation corresponds to

0x_1 + 0x_2 + 0x_3 = 4.
What is the meaning of the last equation of the system in echelon form? Does it have a solution? What does it mean in terms of the sequence of values [x_1, x_2, x_3]? The system has no solution.
The examples discussed here represent each of the possible types of solution sets of a system of equations: a system of equations in K = Z_p has a finite number of solutions, whereas if it is over an infinite field, a system of equations either has a unique solution, infinitely many solutions, or no solution. We have exemplified this result in this section; later on, in Chapter 6, it will be proved. Although we have always considered the leading entries as different from zero, in many systems they are zero. In such cases it would be more convenient to interchange the appropriate equations first, so that the leading entries of the top equations are different from zero, and later transform the system into echelon form.
Summarizing the Process for Finding the Solution of a System of Equations

If we are given a system of equations in K when K is finite, we can find the solution set by substituting each possible sequence of values for the unknowns into each equation. Those sequences for which the func returns true for each equation in the system are elements of the solution set, and vice versa.
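In ISETL this brute-force search is a single set former. For a system of three equations in three unknowns over Z_p, with the coefficient rows collected in a tuple rows and the constants in a tuple consts, a sketch might look like the following; the name solve3_mod is ours.

solve3_mod := func(p, rows, consts);
  $ rows: tuple of three coefficient tuples; consts: tuple of three constants
  $ returns the set of all solutions [x, y, z] of the system over Z_p
  return {[x, y, z] : x, y, z in {0..p-1} |
            forall i in [1..3] |
              (rows(i)(1)*x + rows(i)(2)*y + rows(i)(3)*z) mod p = consts(i) mod p};
end;

solve3_mod(5, [[2,3,1], [1,4,2], [3,1,2]], [3,1,2]);   $ should return {[4, 4, 3]}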
If K is an infinite set, such as R, then it is impossible to check each
sequence of possible solutions. In this case, we must transform the original
system of equations into a simpler system whose solution set is equal to that
of the original system. There are three elementary transformations that can
be applied to a system without changing its solution set:
Interchange the positions of two equations.
Multiply both sides of an equation by a nonzero constant.
Add a multiple of one equation to a multiple of another equation.
The goal of applying elementary transformations is to produce a system of
equations in echelon form, that is, a simpler system whose solution set is
easy to identify or to construct. To place a system in echelon form, we must
apply the following series of steps to a system such as
a_11 x_1 + a_12 x_2 + a_13 x_3 + a_14 x_4 + ··· + a_1q x_q = c_1
a_21 x_1 + a_22 x_2 + a_23 x_3 + a_24 x_4 + ··· + a_2q x_q = c_2
a_31 x_1 + a_32 x_2 + a_33 x_3 + a_34 x_4 + ··· + a_3q x_q = c_3
  .
  .
  .
a_r1 x_1 + a_r2 x_2 + a_r3 x_3 + a_r4 x_4 + ··· + a_rq x_q = c_r
1. Scale the leading coefficients to one, dividing by the coefficient of the leading variable.

2. Apply elementary transformations that eliminate x_1 from equations 2 and higher, and replace those equations by the transformed ones.

3. Do the same to eliminate the x_2 term from equations 3 and higher.

4. Do the same to eliminate the x_3 term from equations 4 and higher.

5. Continue this process until the leading entries form a triangular pattern.
Once completed, the echelon system should look something like

x_1 + b_12 x_2 + b_13 x_3 + b_14 x_4 + ··· + b_1q x_q = d_1
      x_2 + b_23 x_3 + b_24 x_4 + ··· + b_2q x_q = d_2
            x_3 + b_34 x_4 + ··· + b_3q x_q = d_3
              .
              .
              .
                  x_r + b_{r(r+1)} x_{r+1} + ··· + b_rq x_q = d_r
In general, the leading entry in any given equation should occur in a column to the right of the leading entry in the prior equation. Based upon the final echelon form, the solution set of the equation can be found. If the system is in K = R, what would you expect the final echelon form of a system that has an infinite number of solutions to be? Does such a system have any free variables?
Exercises
The following exercises involve systems of equations where the variables are
all in K = R unless otherwise stated.
1. Given the following equations, verify whether they are true for the values x = 2, x = 5, x = 0 and x = 1.

(a) x^2 − 3x − 4 = 6
(b) x + 7 = 5
(c) x^2 + 2x + 1 = (x + 1)^2

2. Give two examples of nonlinear equations.

3. Are the sequences [1, 1, 0], [1, 1, 2], [0, 0, 1] and [1, 1, 0] solutions of the equation

2x + y + 2z = 1

for x, y, and z in Z_3?
4. Find the solution set of the system

x_1 − x_2 + 3x_3 = 3
2x_1 − x_2 + 2x_3 = 2
3x_1 + x_2 − 2x_3 = 2

by transforming the system into an equivalent system, which is in echelon form.
5. Using elementary transformations, find the solution set of the system

3x_1 + 6x_2 − 3x_4 = 3
x_1 + 3x_2 − x_3 − 4x_4 = 12
x_1 − x_2 + x_3 + 2x_4 = 8
2x_1 + 3x_2 = 8

by transforming the original system into echelon form.

(a) What are the leading entries? Are there any free variables?
(b) Does the system have one, infinitely many, or no solution? What is the relationship between the existence of free variables and whether the system has one, infinitely many, or no solution?
(c) Express the solution set in vector and sequence form. Explain the difference between each way of expressing the solution set.
(d) Substitute the solution back into each equation of the original
system. After substitution, is each equation true?
6. Find the solution set of the system

x_1 + 2x_2 − x_3 + 3x_4 + x_5 = 2
2x_1 + 4x_2 − 2x_3 + 6x_4 + 3x_5 = 6
x_1 − 2x_2 + x_3 − x_4 + 3x_5 = 4

by reducing it into an equivalent system which is in echelon form. What are the leading entries? Continue performing elementary transformations to the system to eliminate the variable x_2 from the first and third equations and the variable x_3 from the first and second equations. What do you observe? Can you continue performing elementary transformations to the system without altering the leading entries?

The system you found is said to be in reduced echelon form. In a reduced echelon form, we go beyond echelon form to get zeros in all of the coefficients above and below each leading entry. The elementary transformations and the basic process are the same. The result is a system that is even more simplified than echelon form.

(a) You have already identified the leading entries. Are there any free variables?
(b) Does this system have a unique, infinitely many, or no solution?
(c) Express the solution set in vector and sequence form.
(d) Using the general form given in either the vector or sequence forms of the solution set, create three different specific solutions, and substitute your results into the equations of the original system. What do you observe?
(e) Substitute the general form of the solution back into each equation of the original system. After substitution, is each equation true?
(f) Is the solution of the system a vector space?
(g) Write a new system which has the same expressions on the left side of the equations but that has zeros as the constant terms to the right of the equal sign. Find the solution set of the new system. Is the solution set of this system a vector space?
7. Using elementary transformations, find the solution set of the system

2x_1 − 4x_2 + 12x_3 − 10x_4 = 58
x_1 + 2x_2 − 3x_3 + 2x_4 = 14
2x_1 − 4x_2 + 9x_3 − 6x_4 = 44

by transforming the original system into reduced echelon form.

(a) What are the leading entries? Are there any free variables?
(b) Does the system have one, infinitely many, or no solution?
(c) Express the solution set in vector and in sequence form.
(d) Using the general form given in either the vector or sequence forms of the solution set, create three different specific solutions, and substitute your results into the equations of the original system. What do you observe?
(e) Substitute the general form of the solution back into each equation of the original system. After substitution, is each equation true?
(f) Is the solution set of this system a vector space?
8. Write a system in reduced echelon form such that its solution is given by:

x_1 = 3t + 4
x_2 = 2t + 1.

The next series of steps are designed to transform the system into one possible original system. Carefully perform each step.

(a) Take −1 times equation 2, and add the result to equation 1 to yield a new equation 1. Write down the new system that results from performing this transformation.
(b) Create an equation 3 by taking 2 times equation 2. (In this case, we think of equation 3 as 0x_1 + 0x_2 + 0x_3 = 0. Hence, what we are really doing is taking 2 times equation 2, and adding the result to equation 3 to yield a new equation 3.) Write down the new system that results from performing this transformation.
(c) Take 2 times equation 1, and add the result to equation 2 to yield
a new equation 2. Write down the new system that results from
performing this transformation.
(d) Take 3 times equation 1, and add the result to equation 3 to yield
a new equation 3. Write down the new system that results from
performing this transformation.
(e) Multiply both sides of equation 1 by 3 to yield a new equation
1. Write down the new system that results from performing this
transformation.
(f) Using the general solution, construct three different specific solutions, and substitute each of these into the original system you have
have just created. What do you observe?
(g) Substitute the general form of the solution set given above into
each equation of the resulting original system you have created.
Is each equation true?
9. Suppose a system of 2 equations in 3 unknowns has a solution set whose vector form is given by

S = {⟨3t + 1, 4t + 2, t⟩ : t ∈ R}.

Write the reduced echelon form that corresponds to this system. Then, apply three elementary transformations of your choice. Show that the general form of the solution is a solution of the resulting original system you have created. Using three different elementary transformations, create a second original system, and show that the general form of the solution is a solution to the second system you have created.
10. Using elementary transformations, find the solution set of the system

    x_1 -  x_2 + 2x_3 = 3
   2x_1 - 2x_2 + 5x_3 = 4
    x_1 + 2x_2 -  x_3 = 3
          2x_2 + 2x_3 = 1
by transforming the original system into reduced echelon form.
(a) What are the leading entries? How many of them are there? Are
there any free variables?
(b) Does the system have one, innitely many, or no solution?
(c) Express the solution set in vector and in sequence form.
(d) Using the general form of the solution set create three dierent
specic solutions and substitute your results into the equations of
the original system. What do you observe?
(e) Substitute the general form of the solution back into each equation
of the original system. After substitution, is each equation true?
11. Using elementary transformations, find the solution set of the system

   2x_1 - 4x_2 + 16x_3 - 14x_4 = 10
    x_1 + 5x_2 - 17x_3 + 19x_4 = 2
    x_1 - 3x_2 + 11x_3 - 11x_4 = 4
   3x_1 - 4x_2 + 18x_3 - 13x_4 = 17

by transforming the original system into reduced echelon form.
(a) What are the leading entries? How many of them are there? Are there any free variables?
(b) Does the system have one, innitely many, or no solution?
(c) Express the solution set in vector and in sequence form.
(d) Using the general form of the solution set create three dierent
specic solutions and substitute your results into the equations of
the original system. What do you observe?
(e) Substitute the general form of the solution back into each equation
of the original system. After substitution, is each equation true?
12. Consider the following homogeneous system of four equations in four unknowns given by

    x_1 - 2x_2 +   x_3 - 4x_4 = 0
   2x_1 - 3x_2 +  2x_3 - 3x_4 = 0
   3x_1 - 5x_2 +  3x_3 - 4x_4 = 0
    x_1 +  x_2 - 18x_3 + 2x_4 = 0.
(a) Using elementary transformations, nd the solution set of the sys-
tem.
(b) What are the leading entries? How many of them are there? Are there any free variables?
(c) Does the system have one, innitely many, or no solution?
(d) Express the solution set in vector and in sequence form. Is this
the only solution? Why?
(e) Using the general form of the solution set, create three dierent
specic solutions and substitute your results into the equations of
the original system. What do you observe?
(f) Is the solution set of this system a vector space?
13. Consider the following system of four equations in four unknowns given by

    x_1 - 2x_2 +   x_3 - 4x_4 = 4
   2x_1 - 3x_2 +  2x_3 - 3x_4 = 1
   3x_1 - 5x_2 +  3x_3 - 4x_4 = 3
    x_1 +  x_2 - 18x_3 + 2x_4 = 5.
(a) Compare this system to the one given in the previous exercise.
What are the similarities? What are the dierences?
(b) Is
S = [6, 6, 1, 3]
a solution to the system? Why? Is this the only solution to the
system? Why?
(c) Using elementary transformations, nd the solution set of the sys-
tem. What do you observe?
(d) The system of the previous exercise can be considered the homo-
geneous system associated to this system. Why? Take the general
form of the solution set of the homogeneous system and add this
solution to the solution given by the sequence
S = [6, 6, 1, 3].
Is the sum a solution to the system? Why?
(e) Using elementary transformations, nd the solution set of the sys-
tem.
(f) What are the leading entries? How many of them are there? Are there any free variables?
(g) Does the system have one, innitely many, or no solution?
(h) Express the solution set in vector and in sequence form.
(i) Using the general form of the solution set, create three dierent
specic solutions and substitute your results into the equations of
the original system. What do you observe?
14. The reduced echelon forms of two equations in two unknowns can be classified in one of three different ways:

    x_1 + 0x_2 = c_1
   0x_1 +  x_2 = c_2                 (unique solution)

    x_1 + bx_2 = c_1
   0x_1 + 0x_2 = c_2  (c_2 ≠ 0)      (no solution)

    x_1 + bx_2 = c
   0x_1 + 0x_2 = 0                   (infinitely many solutions)

Classify, in a similar manner, the possible reduced echelon forms of a system of three equations in three unknowns. Indicate which system(s) yield a unique solution, infinitely many solutions, or no solution.
15. Consider the general system of two equations in two unknowns given
by
ax +by = e
cx +dy = f.
(a) Determine conditions on a, b, c, d, e, and f that result in this
system having a unique solution.
(b) Determine conditions on a, b, c, d, e, and f that result in this
system having no solution.
(c) Determine conditions on a, b, c, d, e, and f that result in this
system having innitely many solutions.
16. A homogeneous system of equations is any system of equations in which all of the constant terms are zero. Show that x = 0, y = 0 is a solution to the system

   ax + by = 0
   cx + dy = 0.

Prove that this is the only solution if and only if ad - bc ≠ 0.
17. For a homogeneous system of n linear equations in n unknowns, prove
that:
(a) The sum of two solutions to the system is a solution of the system.
(b) A multiple of a solution to the system is a solution of the system.
18. For a non-homogeneous system of n linear equations in n unknowns,
prove that if a vector p is a particular solution of the system, and
if h is a solution of the associated homogeneous system, that is, of a
homogeneous system that has the same coecients as the given system,
p +h is a solution of the non-homogeneous system.
19. Consider the following system of two differential equations in two unknowns given by

   x -  y = x_t
   x + 3y = y_t.

Are the functions x, y given by

   x(t) = e^{2t}
   y(t) = e^{2t}

elements of the vector space C^∞(R)? Is the pair [x, y] a solution to the system? Why? Explain in your own words what the solution to a system of differential equations is.
20. Prove or disprove the following statements:
(a) The solution set of a system of equations in K = R is a vector space.
(b) The solution set of a homogeneous system of equations in K = R is a vector space.
3.2 Solving Systems Using Augmented Matrices
Activities
1. Go back to Activity 11 in the prior section (see page 85). The op-
erations given in (b), (c), and (d) transformed the original system of
equations into a simpler system. Go back and review the steps of that
process. What changed in each transformation? What remained the
same?
2. Go back to Activity 12 in the prior section (see page 86). Transform the
original system into echelon form by hand. After each step, write the
newly formed system on the left side of the page. On the right side of
the page, write only the numbers corresponding to the coecients and
the constant term of each equation. What do you observe? Can you
read the solution of the system from the array? What do the elements
on the right side of the page represent? Would it be possible to design
a procedure that uses only the coecients and the constants arranged
in an array when applying elementary transformations?
3. Let K = Z_5. Consider the system given below:

   3x_1 +  x_2 + 4x_3 = 1
    x_1 + 3x_2 + 3x_3 = 4
   4x_1 +  x_2 + 3x_3 = 3
(a) Take the coefficients of each equation and form three sequences.
(b) Use the program matrix_from_tuple from the Matrix package in ISETL to define an array of the coefficients of the equations of the system, or matrix, where the first row of the matrix, [a_11, a_12, a_13], is the sequence of the coefficients from the first equation, that is,

   [a_11, a_12, a_13] = [3, 1, 4];

the second row of the matrix, [a_21, a_22, a_23], is the sequence of the coefficients from the second equation, that is,

   [a_21, a_22, a_23] = [1, 3, 3];

and the third row of the matrix, [a_31, a_32, a_33], is the sequence of the coefficients from the third equation, that is,

   [a_31, a_32, a_33] = [4, 1, 3].

Then use the program display_matrix(M) from the same package to display the array. This array is called the coefficient matrix of the system.
(c) Take the constants of each equation and form a sequence. Use the program augment_col(M,c) from the Matrix package to define a new array which includes the coefficients and constants from the system defined at the beginning of this activity. We call this array the augmented matrix of the system.
(d) Use the programs matrix_row, matrix_add, scale_matrix and set_matrix_row from the Matrix package to create a program reduce_matrix that will take the augmented matrix of a system of equations, return its echelon form, and display all intermediate steps, that is, the augmented matrix that results after having applied a single elementary transformation.
(e) Write down the resulting system of equations. To do this, convert each row of the matrix into an equation, with the first column entry set equal to the coefficient of x_1, the second column entry set equal to the coefficient of x_2, the third column entry set equal to the coefficient of x_3, and the column entry after the vertical line set equal to the constant on the opposite side of the equal sign. What is the solution of the system?
(f) Apply the func Three_eqn to the system, and compare with the result you obtained in part (c). What do you observe?
4. Let K = Z_7. Given the system of equations,

   4x_1 +  x_2 + 3x_3 + 2x_4 = 1
    x_1 + 3x_2 + 2x_3 + 5x_4 = 2
   3x_1 + 4x_2 + 2x_3 +  x_4 = 6,

follow a process similar to that described in the last activity to find the solution set of the given system. Modify the program you wrote in the last activity, so that the final augmented matrix is in reduced echelon form. What is the solution of the system? Apply the func Three_eqn both to the original and the simplified systems to verify your results.
5. Let K = R. Given the following array,

   [ 4  7  1 | 0 ]
   [ 8  8  2 | 4 ]
   [ 3  4  1 | 3 ],

write the corresponding system of equations. Solve the system. Does the system have one solution, multiple solutions, or no solution?
6. Let K = R. Consider the following system of equations

   2x_1 + 2x_2 + 5x_3 +  x_4 + 3x_5 = 0
    x_1 + 2x_2 + 4x_3 + 6x_4 +  x_5 = 0
   3x_1 + 4x_2 +  x_3 +  x_4 + 4x_5 = 0
    x_1 + 3x_2 + 3x_3 + 2x_4 + 5x_5 = 0.

(a) Apply the reduce_matrix program to find the solution set of this system.
(b) How many free variables do you find? How many solutions does this system have?
(c) Is the solution set of this system a vector space? Use one of the tools you designed in Chapter 2.
(d) Compare the coefficients of the last system with those of the following one:

   2x_1 + 2x_2 + 5x_3 +  x_4 + 3x_5 = 3
    x_1 + 2x_2 + 4x_3 + 6x_4 +  x_5 = 2
   3x_1 + 4x_2 +  x_3 +  x_4 + 4x_5 = 5
    x_1 + 3x_2 + 3x_3 + 2x_4 + 5x_5 = 2.

What do you observe?
(e) Is the sequence

   [160/61, 114/61, 30/61, 40/61]

a solution to this system?
(f) Find the solution set of the system given in part (d). Compare the solution of the system given in part (d) with the form of the solution set of the homogeneous system given at the beginning of this activity. What do you observe? Describe the solution to part (d) in terms of the solution of the associated homogeneous system.
(g) Is the solution set of the system in part (d) a vector space?
7. Suppose a system of equations has the following matrix of coefficients,

   [ 2  1  3 ]
   [ 1  3  1 ]
   [ 4  0  3 ].

Solve the system of equations associated with this matrix, if the constants of the system are given by the following sequences:
(a) [1, 0, 0]
(b) [0, 1, 0]
(c) [0, 0, 1]
What do you observe? Can you design a way to solve the three systems at the same time?
8. Let K = Z_7. Apply Three_eqn to verify that the two systems of equations,

    x_1 +  x_2 +  x_3 = 6
   4x_1 + 3x_2 + 5x_3 = 1
   3x_1 + 2x_2 +  x_3 = 6

and

   5x_1 + 2x_2 + 4x_3 = 6
    x_1 + 3x_2 + 3x_3 = 0
   3x_1 + 2x_2 + 4x_3 = 2,

have the same solution set. Find the echelon form of each system. What do you observe?
9. Let K = Z_7. Apply Three_eqn to verify that the two systems of equations,

    x_1 +  x_2 +  x_3 = 6
   4x_1 + 3x_2 + 5x_3 = 1
   3x_1 + 2x_2 +  x_3 = 6

and

    x_1 + 2x_2 + 3x_3 = 1
   2x_1 + 3x_2 + 5x_3 = 4
   3x_1 + 4x_2 + 4x_3 = 6,

have different solution sets. Find the echelon form of each system. What do you observe?
10. Given a field K = Z_p, where p is a prime number, or K = R, and given a general system of linear equations, such as

   a_11 x_1 + a_12 x_2 + ... + a_1q x_q = c_1
   a_21 x_1 + a_22 x_2 + ... + a_2q x_q = c_2
   a_31 x_1 + a_32 x_2 + ... + a_3q x_q = c_3
      ...
   a_n1 x_1 + a_n2 x_2 + ... + a_nq x_q = c_n,

where the coefficients a_ij and the constants c_i belong to K, write the steps for finding the solution set of a system using matrices as if you were being asked to explain this process to a classmate having difficulty with linear algebra.
Discussion

Using Augmented Matrices

In the last section, we discussed various strategies for finding the solution set of a system of linear equations. For a given system in Z_p, p prime, you constructed funcs that accepted a sequence of values and then tested whether the given sequence was a solution. For the same system, you used this func to identify all of the sequences that make up the solution set. You were then asked to generalize this process by writing funcs for a system of two equations and a system of three equations. What were the inputs and the outputs for each of the funcs, Two_eqn and Three_eqn, that you wrote? How would you generalize this process for a system of n equations? These funcs are limited, however, because they can only be applied to systems of equations defined over finite fields. Although such funcs would work in theory when K = R, this is not practical, because R has infinitely many elements. To compensate for this, you learned how to apply elementary transformations to a system of linear equations over R to transform it into a simpler system whose solution set was both easy to identify and equivalent to the solution set of the original system. However, the process of transforming the system through the use of elementary transformations can often be cumbersome, despite its effectiveness. The purpose of this section is to introduce a means of streamlining the process.

In Activity 1, you analyzed the process of transforming the system to identify those features that remained invariable and those that changed. Once you identified these features, in Activity 2 you were asked to repeat the procedure of transformation of a system of equations in a different way. Which procedure seems to make the process of transformation easier? In Activity 3, you learned how to write a system of linear equations as an array of numbers, the augmented matrix associated with the given system. The augmented matrix is formed by augmenting the coefficient matrix (formed by placing the coefficients of the unknowns in an array) with a column consisting of the constant terms of the equations.
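The same bookkeeping can be carried out in any language that has nested lists. Purely as an illustration, here is a short Python sketch (it is not the ISETL Matrix package; the names coeffs, constants, and augment are ours) that builds an augmented matrix from the coefficient rows and the constant terms of the system in Activity 3:

# Coefficients and constants of the Activity 3 system (K = Z_5):
#   3x1 +  x2 + 4x3 = 1
#    x1 + 3x2 + 3x3 = 4
#   4x1 +  x2 + 3x3 = 3
coeffs = [[3, 1, 4],
          [1, 3, 3],
          [4, 1, 3]]
constants = [1, 4, 3]

def augment(coeff_rows, consts):
    """Attach each constant to the end of its coefficient row."""
    return [row + [c] for row, c in zip(coeff_rows, consts)]

augmented = augment(coeffs, constants)
for row in augmented:
    print(row)
# [3, 1, 4, 1]
# [1, 3, 3, 4]
# [4, 1, 3, 3]

Nothing here depends on the field: the augmented matrix is only a record of the coefficients and constants, arranged so that each row corresponds to one equation.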
In an augmented matrix, such as

   [ a_11  a_12  a_13 | a_14 ]
   [ a_21  a_22  a_23 | a_24 ]
   [ a_31  a_32  a_33 | a_34 ]
   [ a_41  a_42  a_43 | a_44 ],

the columns to the left of the vertical line correspond to the coefficients of the equations, and the column to the right of the vertical line corresponds to the constants for each equation. (If it is clear that the matrix represents the coefficients and constants of an associated system of equations, the vertical line is often dropped.) The first row of terms, a_11, a_12, a_13, and a_14, are the coefficients and the constant of the first equation: a_11, a_12, a_13 are the coefficients of the unknowns, say x_1, x_2, and x_3, and a_14 is the constant which appears on the opposite side of the equals sign; in particular, row 1 corresponds to the equation

   a_11 x_1 + a_12 x_2 + a_13 x_3 = a_14.

Each double digit subscript denotes the row and column position of each entry. For example, the 12 that appears in a_12 tells us that the entry corresponding to a_12 resides in the second column of the first row; a_43 denotes the entry that occupies the third column of the fourth row. In general, a_ij denotes the entry in the j-th column of the i-th row. What is the form of the other equations associated with the augmented matrix shown above?
In the first activities, you discovered that elementary transformations change the coefficients and the constants; the unknowns remain unchanged. In Activity 2, you were asked to think about designing a procedure for simplifying a system that would involve only the coefficients and the constants of the system. In Activity 3, you were actually introduced to such a methodology: you formed the augmented matrix corresponding to the given system; you applied elementary row operations to each row of the matrix to convert it into echelon form; and you used the subsequent echelon form to identify the solution set of the original system. You designed the program reduce_matrix to perform these steps after having accepted the coefficients and constant of each equation.
Let's apply this procedure to a specific example in K = R, such as

    x_1 + 2x_2 -  x_3 + 3x_4 +  x_5 = 2
   2x_1 + 4x_2 - 2x_3 + 6x_4 + 3x_5 = 6
   -x_1 - 2x_2 +  x_3 -  x_4 + 3x_5 = 4.

First, rewrite the system in its associated augmented matrix:

   [  1   2  -1   3   1 | 2 ]
   [  2   4  -2   6   3 | 6 ]
   [ -1  -2   1  -1   3 | 4 ].

Second, our goal is to transform this matrix into either its echelon or its reduced echelon form. A matrix is in echelon form if:
1. Any rows consisting entirely of zeros are grouped at the bottom of the matrix.
2. The first nonzero element of each row is 1. This element is called a leading entry.
3. The leading entry of each row is positioned to the right of the leading entry in the prior row.
4. All entries in the column below a leading entry are zero.

The process of using elementary row operations to bring a matrix into echelon form is known as Gaussian elimination. The nonzero entries of an echelon matrix create a diagonal configuration. Three examples of echelon matrices are given below. What features do they have that define this form?
   [ 1  2  1  0  2  1 ]
   [ 0  0  1  3  1  2 ]
   [ 0  0  0  0  1  5 ]

   [ 0  1  1  2  3 ]
   [ 0  0  1  1  4 ]
   [ 0  0  0  0  1 ]

   [ 1  0  2  1 ]
   [ 0  1  3  4 ]
   [ 0  0  1  1 ]
   [ 0  0  0  1 ]
The three matrices given below are not in echelon form. Can you explain why?

   [ 1  3  2  4 ]
   [ 0  0  1  2 ]
   [ 0  1  3  4 ]

   [ 2  1  3  4 ]
   [ 0  1  1  2 ]
   [ 0  1  2  3 ]

   [ 0  1  6  2  3 ]
   [ 1  2  3  1  1 ]
   [ 0  0  0  1  2 ]
   [ 0  0  0  3  5 ]
Reduced echelon form is basically the same as echelon form, except that all of the column entries above, as well as below, a given leading entry must be zero. Three examples of reduced echelon matrices are given below.

   [ 1  5  0  2  0 ]
   [ 0  0  1  9  0 ]
   [ 0  0  0  0  1 ]
   [ 0  0  0  0  0 ]

   [ 1  0  4  0  0 ]
   [ 0  1  2  0  0 ]
   [ 0  0  0  1  0 ]
   [ 0  0  0  0  1 ]

   [ 1  2  0  3  0  4 ]
   [ 0  0  1  2  0  9 ]
   [ 0  0  0  0  1  8 ]
The three matrices given below are not in reduced echelon form. Can you explain why?

   [ 1  0  0  5  3 ]
   [ 0  0  1  0  3 ]
   [ 0  1  2  3  7 ]

   [ 1  0  4  2  6 ]
   [ 0  1  3  2  3 ]
   [ 0  0  0  1  2 ]
   [ 0  0  0  0  1 ]

   [ 1  0  2  0  3 ]
   [ 0  0  0  0  0 ]
   [ 0  1  2  0  7 ]
   [ 0  0  0  1  3 ]
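One way to internalize these conditions is to write them out as a test. The following Python sketch is our own illustration (it is not part of the course's ISETL code); it checks the echelon and reduced echelon conditions on a matrix stored as a list of rows:

def leading_position(row):
    """Column index of the first nonzero entry, or None for a zero row."""
    for j, entry in enumerate(row):
        if entry != 0:
            return j
    return None

def is_echelon(M):
    last_lead = -1
    seen_zero_row = False
    for row in M:
        lead = leading_position(row)
        if lead is None:
            seen_zero_row = True          # zero rows must stay at the bottom
            continue
        if seen_zero_row:
            return False                  # nonzero row below a zero row
        if row[lead] != 1:
            return False                  # each leading entry must be 1
        if lead <= last_lead:
            return False                  # leading entries move strictly right
        last_lead = lead
    return True

def is_reduced_echelon(M):
    if not is_echelon(M):
        return False
    for i, row in enumerate(M):
        lead = leading_position(row)
        if lead is None:
            continue
        # every other entry in a leading column must be zero
        if any(M[k][lead] != 0 for k in range(len(M)) if k != i):
            return False
    return True

print(is_echelon([[1, 3, 2, 4], [0, 0, 1, 2], [0, 1, 3, 4]]))   # False
print(is_reduced_echelon([[1, 5, 0, 2, 0],
                          [0, 0, 1, 9, 0],
                          [0, 0, 0, 0, 1],
                          [0, 0, 0, 0, 0]]))                    # True

Try it on the other matrices displayed above and compare the answers with your own explanations.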
In the prior section, you applied elementary transformations to transform a system into echelon form. In trying to simplify an augmented matrix, what transformations did you use? You used these analogous elementary row operations to find the solution of the systems in Activities 4 and 5. We will apply these operations to simplify the following matrix. Can you identify the operation performed at each step of the transformation?
   [  1   2  -1   3   1 | 2 ]
   [  2   4  -2   6   3 | 6 ]
   [ -1  -2   1  -1   3 | 4 ].

Step 1 Which elementary row operations were used to eliminate the column 1 entries in rows 2 and 3? These operations yield

   [ 1  2  -1  3  1 | 2 ]
   [ 0  0   0  0  1 | 2 ]
   [ 0  0   0  2  4 | 6 ].

Step 2 What elementary row operation was used to yield

   [ 1  2  -1  3  1 | 2 ]
   [ 0  0   0  2  4 | 6 ]
   [ 0  0   0  0  1 | 2 ].

Step 3 What elementary row operation was used to yield

   [ 1  2  -1  3  1 | 2 ]
   [ 0  0   0  1  2 | 3 ]
   [ 0  0   0  0  1 | 2 ].

Is the last matrix in echelon form? If so, and if we wish to transform it into reduced echelon form, we must continue the process using the elementary row operations to eliminate the column entry above the leading entry in row 2, and to eliminate the nonzero entries in column 5 above the leading entry in row 3. What operations have been performed in the next steps?

Step 4 They would give us

   [ 1  2  -1  0  -5 | -7 ]
   [ 0  0   0  1   2 |  3 ]
   [ 0  0   0  0   1 |  2 ].

Step 5 How can we obtain the following matrix, which is in reduced echelon form?

   [ 1  2  -1  0  0 |  3 ]
   [ 0  0   0  1  0 | -1 ]
   [ 0  0   0  0  1 |  2 ]
Is the last matrix in reduced echelon form? Why? Once the matrix is in reduced echelon form we can identify the solution set. What are the equations corresponding to this matrix? As you found out, the unknowns x_4 and x_5 are fixed; x_2 and x_3 are free variables; and x_1 is dependent upon x_2 and x_3.
Echelon and reduced echelon form provide a convenient notation for identifying the solution set of a system of equations. Elementary row operations are the necessary tool that allows us to systematize the simplification process for transforming a system of equations into an augmented matrix in echelon form.
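If you want to check the five steps above mechanically, the whole elimination can be scripted. The sketch below is a plain Python version of the procedure discussed here (offered only as an illustration of the algorithm, not as the reduce_matrix program you wrote in ISETL); it uses exact fractions so that no rounding error creeps in:

from fractions import Fraction

def rref(M):
    """Return the reduced echelon form of M (a list of rows), using only
    the three elementary row operations."""
    A = [[Fraction(x) for x in row] for row in M]
    rows, cols = len(A), len(A[0])
    pivot_row = 0
    for col in range(cols):
        # find a row at or below pivot_row with a nonzero entry in this column
        pivot = next((r for r in range(pivot_row, rows) if A[r][col] != 0), None)
        if pivot is None:
            continue
        A[pivot_row], A[pivot] = A[pivot], A[pivot_row]               # interchange rows
        A[pivot_row] = [x / A[pivot_row][col] for x in A[pivot_row]]  # scale to a leading 1
        for r in range(rows):
            if r != pivot_row and A[r][col] != 0:                     # add a multiple of the pivot row
                factor = A[r][col]
                A[r] = [x - factor * p for x, p in zip(A[r], A[pivot_row])]
        pivot_row += 1
        if pivot_row == rows:
            break
    return A

M = [[ 1,  2, -1,  3, 1, 2],
     [ 2,  4, -2,  6, 3, 6],
     [-1, -2,  1, -1, 3, 4]]
for row in rref(M):
    print([str(x) for x in row])
# ['1', '2', '-1', '0', '0', '3']
# ['0', '0', '0', '1', '0', '-1']
# ['0', '0', '0', '0', '1', '2']

The printed rows agree with the final matrix obtained in Step 5 above.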
The objective of Activity 6 was to apply this new tool to help you solve a homogeneous system and to relate the solution of a non-homogeneous system with its associated homogeneous system. Can the general solution of a non-homogeneous system be written as the sum of a specific solution and the general solution of the homogeneous system? If the solution set of the homogeneous system given at the beginning of the activity is a vector space, what exactly must you show? If the answer is yes, of what vector space is the solution set a subspace? Why isn't the solution set of the non-homogeneous system a subspace? Activity 7 was intended to demonstrate that it is possible to use the augmented matrix to solve several systems at once, since the coefficients of each equation remain the same. In trying to solve all three systems simultaneously, what would be the form of the resulting augmented matrix?

What did you find when working with Activities 8 and 9? What can you say about the reduced echelon form corresponding to a given solution set? As you were able to verify, two systems of equations have the same solution set if and only if they can be simplified to the same reduced echelon form. In Activity 8, where you verified that the two systems have the same solution set, you found that the corresponding reduced echelon forms are equal. On the other hand, in Activity 9 you discovered that each system yielded a different reduced echelon form. What can you say about the solution set in this case? What is the purpose of Activity 10? Does each reduced echelon form correspond to a single original system? How does your answer relate to Exercise 9 in the last section?
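The question about adding a particular solution to solutions of the associated homogeneous system can also be explored numerically. The following Python check is only an illustration (the helper name satisfies is ours); it uses the worked example from this section, whose reduced echelon form was found above:

# Worked example from this section:
#    x1 + 2x2 -  x3 + 3x4 +  x5 = 2
#   2x1 + 4x2 - 2x3 + 6x4 + 3x5 = 6
#   -x1 - 2x2 +  x3 -  x4 + 3x5 = 4
coeffs = [[1, 2, -1, 3, 1], [2, 4, -2, 6, 3], [-1, -2, 1, -1, 3]]
constants = [2, 6, 4]

def satisfies(solution, rows, consts):
    """True if the sequence 'solution' satisfies every equation."""
    return all(sum(a * x for a, x in zip(row, solution)) == c
               for row, c in zip(rows, consts))

# From the reduced echelon form: x1 = 3 - 2*x2 + x3, x4 = -1, x5 = 2.
particular = [3, 0, 0, -1, 2]        # take x2 = x3 = 0
homogeneous = [-1, 1, 1, 0, 0]       # a solution of the associated homogeneous system

print(satisfies(particular, coeffs, constants))      # True
print(satisfies(homogeneous, coeffs, [0, 0, 0]))     # True
summed = [p + h for p, h in zip(particular, homogeneous)]
print(satisfies(summed, coeffs, constants))          # True

Checks like this do not prove anything by themselves, but they suggest the general statement you are asked to prove in Exercise 18 of the last section.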
In this section you have been applying elementary operations to systems of equations in order to find their solution set in a convenient computational way. With practice you will find that you can combine several elementary operations into one step. For example, such a combined operation would be the replacement of a row by a sum of multiples of two rows, provided that the row replaced appears in the linear combination with a nonzero coefficient. This operation can be thought of in the following way: if you think of each row of the augmented matrix as a vector, the operation is the same as replacing one vector with a sum of multiples of two other vectors. It will be very useful in later chapters.
The original system of m equations and n unknowns in R and the corresponding system in echelon form are, as we have seen, equivalent. The system in reduced echelon form is particularly easy to solve because the variable x_i appears only in the i-th equation. Furthermore, non-zero coefficients appear in all n rows or only in the first r rows. Since each x_i appears in but one row with unit coefficient, we can consider two cases: when all n rows have non-zero coefficients, the system has a unique solution; when the non-zero coefficients appear in only r rows, the remaining n - r unknowns can be given values arbitrarily, and the corresponding values of the x_i can be computed accordingly. The n - r unknowns corresponding to the indexes greater than r can be considered parameters, or free variables, and thus the system has an infinite number of solutions.

The paragraph above can be considered a non-formal proof of the following theorem that will be proved formally in Chapter 6:
Theorem 3.2.1. The system of simultaneous linear equations in K = R,

   a_11 x_1 + a_12 x_2 + ... + a_1n x_n = c_1
   a_21 x_1 + a_22 x_2 + ... + a_2n x_n = c_2
   a_31 x_1 + a_32 x_2 + ... + a_3n x_n = c_3
      ...
   a_m1 x_1 + a_m2 x_2 + ... + a_mn x_n = c_m,

has a solution if and only if all solutions can be expressed in terms of n - r independent parameters, or free variables. The number r is called the rank of the coefficient matrix associated with the system.
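In the spirit of the theorem, the bookkeeping "count the leading rows, compare with the number of unknowns, and watch for an impossible row" can be written out explicitly. The Python sketch below is only our own illustration of that count (it assumes its input is an augmented matrix already in reduced echelon form):

def classify(aug):
    """Report the rank of the coefficient part of a (reduced) echelon
    augmented matrix and describe the solution set of its system."""
    n_unknowns = len(aug[0]) - 1        # the last column holds the constants
    rank = sum(1 for row in aug if any(a != 0 for a in row[:-1]))
    inconsistent = any(all(a == 0 for a in row[:-1]) and row[-1] != 0
                       for row in aug)  # a row 0 = nonzero is impossible
    if inconsistent:
        return rank, "no solution"
    if rank == n_unknowns:
        return rank, "unique solution"
    return rank, f"infinitely many solutions ({n_unknowns - rank} free variables)"

# Reduced echelon form of the worked example in this section:
R = [[1, 2, -1, 0, 0, 3],
     [0, 0, 0, 1, 0, -1],
     [0, 0, 0, 0, 1, 2]]
print(classify(R))   # (3, 'infinitely many solutions (2 free variables)')

Here r = 3 and n = 5, so n - r = 2 free variables remain, which matches the earlier observation that x_2 and x_3 are free.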
Summarizing the Process for Finding the Solution of a System of Equations Using an Augmented Matrix

As we observed in the prior section, we can find the solution set of a system of equations in K = Z_p, p prime, by direct substitution: if a sequence of values for the unknowns returns true for each equation of the system, then that sequence of values is an element of the solution set. If K is an infinite set, such as R, then it is impossible to check each sequence. In this case, we must transform the original system of equations into a simpler system. In this section, we executed this process by applying elementary row operations to the augmented matrix of a system of equations. The three elementary row operations (interchange the positions of two rows, multiply a row by a nonzero constant, and add a multiple of one row to another row) are analogous to the three elementary transformations defined in the last section. We apply these operations to transform the original system into echelon or reduced echelon form. Both forms allow us to identify the solution set without having to resort to direct substitution. To use an augmented matrix, we take
a system, such as

   a_11 x_1 + a_12 x_2 + a_13 x_3 + a_14 x_4 + ... + a_1q x_q = c_1
   a_21 x_1 + a_22 x_2 + a_23 x_3 + a_24 x_4 + ... + a_2q x_q = c_2
   a_31 x_1 + a_32 x_2 + a_33 x_3 + a_34 x_4 + ... + a_3q x_q = c_3
      ...
   a_r1 x_1 + a_r2 x_2 + a_r3 x_3 + a_r4 x_4 + ... + a_rq x_q = c_r,

form its associated matrix,

   [ a_11  a_12  a_13  a_14  ...  a_1q | c_1 ]
   [ a_21  a_22  a_23  a_24  ...  a_2q | c_2 ]
   [ a_31  a_32  a_33  a_34  ...  a_3q | c_3 ]
   [  ...                              | ... ]
   [ a_r1  a_r2  a_r3  a_r4  ...  a_rq | c_r ],

and then apply elementary row operations until we get one of the echelon forms. The simplified system, which is equivalent to the original, will yield either a unique solution, infinitely many solutions, or no solution.
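Each of the three row operations named above is easy to express on its own. The sketch below is a Python illustration with function names of our own choosing (they merely echo the role of the ISETL Matrix-package programs mentioned in the activities); rows are stored as lists, and each function returns a new matrix rather than changing the old one:

def interchange(M, i, j):
    """Interchange rows i and j (rows are numbered from 0 here)."""
    M = [row[:] for row in M]
    M[i], M[j] = M[j], M[i]
    return M

def scale(M, i, c):
    """Multiply row i by the nonzero constant c."""
    assert c != 0, "scaling by zero is not an elementary row operation"
    M = [row[:] for row in M]
    M[i] = [c * x for x in M[i]]
    return M

def add_multiple(M, i, j, c):
    """Add c times row j to row i."""
    M = [row[:] for row in M]
    M[i] = [x + c * y for x, y in zip(M[i], M[j])]
    return M

A = [[1, 2, -1, 3, 1, 2],
     [2, 4, -2, 6, 3, 6],
     [-1, -2, 1, -1, 3, 4]]
A = add_multiple(A, 1, 0, -2)   # clear the 2 in column 1 of the second row
A = add_multiple(A, 2, 0, 1)    # clear the -1 in column 1 of the third row
A = interchange(A, 1, 2)        # move the row with the next leading entry up
A = scale(A, 1, 0.5)            # make that leading entry a 1
print(A[1])                     # [0.0, 0.0, 0.0, 1.0, 2.0, 3.0]

Chaining these three operations, in some order, is exactly what the reduction to echelon form does.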
Exercises

All systems considered in these exercises are in R unless otherwise stated in the exercise.

1. For each of the augmented matrices given in (a)-(e), determine whether the given matrix is in echelon form. If it is, write the corresponding system of equations, and identify its solution set. If the matrix is not in echelon form, explain why.
(a)
   [ 0  1  2  1 | 3 ]
   [ 0  0  1  1 | 2 ]
   [ 0  0  0  1 | 4 ]

(b)
   [ 1  1  3  2  0 | 1 ]
   [ 0  1  0  0  0 | 4 ]
   [ 0  2  1  5  3 | 6 ]
   [ 0  0  5  1  0 | 7 ]

(c)
   [ 1  2  1  0  0 | 3 ]
   [ 0  0  1  1  3 | 4 ]
   [ 0  0  0  0  1 | 6 ]

(d)
   [ 1  1  2  1 | 3 ]
   [ 0  1  3  4 | 2 ]
   [ 0  0  0  0 | 1 ]

(e)
   [ 1  1  2  1 | 3 ]
   [ 0  0  0  0 | 0 ]
   [ 0  0  2  1 | 5 ]
   [ 0  1  3  2 | 4 ]
2. For each of the augmented matrices given in (a)-(f), determine whether the given matrix is in reduced echelon form. If it is, write the corresponding system of equations, and identify its solution set. If the matrix is not in reduced echelon form, explain why.

(a)
   [ 1  0  0 | 2 ]
   [ 0  1  1 | 3 ]

(b)
   [ 0  1  3  0 | 1 ]
   [ 2  0  1  0 | 5 ]
   [ 0  0  0  1 | 4 ]

(c)
   [ 1  0  2  0  0 | 1 ]
   [ 0  1  1  0  0 | 4 ]
   [ 0  0  0  1  0 | 4 ]
   [ 0  0  0  0  0 | 0 ]

(d)
   [ 1  2  3  0  4 | 3 ]
   [ 0  0  1  1  2 | 7 ]
   [ 0  0  0  0  1 | 5 ]

(e)
   [ 1  0  2  0  3 | 5 ]
   [ 0  1  3  0  2 | 3 ]
   [ 0  0  0  1  2 | 0 ]
   [ 0  0  0  0  0 | 0 ]

(f)
   [ 1  1  2  0  3  0  3 | 5 ]
   [ 0  0  0  1  2  0  4 | 2 ]
   [ 0  0  0  0  0  1  2 | 2 ]
3. Use augmented matrices and elementary row operations to find the solution set of each of the following systems of equations. Identify the leading entries, the free variables, and the rank of the coefficient matrix associated with the system. Write the solution in vector and parametric form.

(a)
   3x_1 + 6x_2 - 3x_3 = 6
   2x_1 - 4x_2 - 3x_3 = 1
   3x_1 + 6x_2 - 2x_3 = 10

(b)
    x_1 -  x_2 + 2x_3 + 4x_4 = 0
   2x_1 + 3x_2 -  x_3 +  x_4 = 0
   4x_1 + 5x_2 + 3x_3 - 2x_4 = 0

(c)
   4x_1 + 8x_2 - 12x_3 = 28
    x_1 - 2x_2 +  3x_3 = 7
   2x_1 + 4x_2 -  8x_3 = 16
   3x_1 - 6x_2 +  9x_3 = 21

(d)
   2x_1 -  x_2 + 3x_3 +  x_4 = 12
   3x_1 + 2x_2 -  x_3 + 4x_4 = 3

(e)
    x_1 + 2x_2 +  x_3 - 3x_4 = 3
   2x_1 + 3x_2 +  x_3 - 5x_4 = 9
   2x_1 + 4x_2 -  x_3 + 3x_4 = 2
    x_1 -  x_2 -  x_3 +  x_4 = 1
4. For each system below, (a)-(c), find the values of h, k, and l that result in each system of equations having a unique solution.

(a)
   2x +  y = h
   8x - 4y = k

(b)
    x + 3y -  z = h
    x -  y + 2z = k
   3x + 4y      = l

(c)
   2x -  6y = 3
   4x + 12y = h

5. For the augmented matrices given in parts (a) and (b) below, find the values of r that result in the corresponding system of equations having infinitely many solutions.

(a)
   [ 1  4 | 2 ]
   [ 3  r | 1 ]

(b)
   [ 2  r  1 | 0 ]
   [ 1  3  r | 0 ]
6. For the augmented matrix given below, find values for h and k that result in the corresponding system of equations having no solution.

   [ 1  h | 1 ]
   [ 2  3 | k ]

7. Write two systems of equations that have the same solution set.

8. Given the following vector form of a solution set,

   S = {⟨2, 3s + 2t, s + 2t - 3, 4 - s, s, 2, t⟩ : s, t ∈ R},

write the associated reduced echelon matrix.

9. Write all of the possible augmented reduced echelon matrices that would correspond to a system of two equations in two unknowns. Denote entries that can be a number other than 1 or 0 with *. Identify which matrices correspond to no solution, which correspond to infinitely many solutions, and which correspond to a unique solution.

10. Write all of the possible augmented reduced echelon matrices that would correspond to a system of three equations in three unknowns. Denote entries that can be a number other than 1 or 0 with *. Identify which matrices correspond to no solution, which correspond to infinitely many solutions, and which correspond to a unique solution.

11. Based upon your answers to the prior two exercises, what augmented reduced echelon matrix would correspond to a system of four equations with four unknowns that has a unique solution? Try to generalize this to n equations in n unknowns.
12. Find the sequence form of the solution set of the system of equations whose reduced echelon form is given by the augmented matrix

   [ 1  1  0  1 | 0 ]
   [ 0  0  1  2 | 0 ]
   [ 0  0  0  0 | 0 ].

Once you have found the sequence form of the general solution, perform the following tasks:
(a) Determine three different specific nonzero solutions.
(b) Show that these specific sequences satisfy the system corresponding to the augmented matrix.
(c) Take two of the solutions, add them, and show that the resulting sequence is also a solution.
(d) Take a multiple of one solution and a different multiple of another solution and add them. Is the sum still a solution of the system?
(e) If we replace the entry at the 15th position with 3 and the entry at the 5th position with 4, is the resulting sum still a solution?

13. Apply elementary row operations to the augmented echelon matrix

   [ 1  2  1 | 3 ]
   [ 0  1  2 | 1 ]
   [ 0  0  1 | 2 ]

to transform it into the reduced echelon form.
(a) Write the resulting system of equations.
(b) What is the solution set of this system?
(c) What operations would have to be performed to get this system back into the echelon form presented above?
14. Show that [0, 0, 0] is a solution of the system

   2x_1 -  x_2 + 3x_3 = 0
    x_1 + 4x_2 - 2x_3 = 0.

(a) If we replaced the constants with nonzero values, would [0, 0, 0] be a solution?
(b) If you are given any system of equations where each constant is zero, must the sequence consisting of all zeros be a solution? Explain your answer.

15. Show that [0, 0, 0, 0] is a solution of the system whose augmented matrix is given by

   [ 2  2  3  4 | 0 ]
   [ 1  3  4  5 | 0 ]
   [ 3  4  2  1 | 0 ].

(a) If we replaced the constants with nonzero values, would [0, 0, 0, 0] be a solution?
(b) Based upon your answer to this and the preceding exercise, can any system with constants that are all zero ever have no solution? Explain your answer.
(c) Can the zero vector be a solution of a non-homogeneous system of equations? Explain your answer.
16. (a) Are the solution sets of the systems of equations given in two
previous exercises vector spaces?
(b) Write two systems that have the same coecients as those in the
two previous exercises that are not homogeneous.
(c) Find their solution sets.
(d) Are these sets vector spaces?
(e) Can you nd a relationship between the form of the systems whose
solution sets are vector spaces?
17. What can you say about the solution set of a homogeneous system of
equations in which there are as many unknowns as equations? Carefully
explain your answer.
18. If you are given a system of equations that has as many unknowns
as rows and that has all zero constants, what can you say about the
solution set? Carefully explain your answer.
19. In the augmented matrix given below, the third column is the sum of the first two columns. What can we say about the solution set of the associated system of equations: does it consist of a single solution, infinitely many solutions, or no solution? Justify your answer.

   [ 2   3  5 | 0 ]
   [ 5  -1  4 | 0 ]
   [ 3   1  4 | 0 ]
   [ 1   0  1 | 0 ]

20. In the augmented matrix given below, the third column is the sum of the first column and twice the second column. What can we say about the solution set of the associated system of equations: does it consist of a single solution, infinitely many solutions, or no solution? Justify your answer.

   [ -1  2   3 | 0 ]
   [  7  4  15 | 0 ]
   [  3  2   7 | 0 ]
3.3 A Geometric View of Systems
Activities
All the equations and systems in these activities are in K = R, unless oth-
erwise stated in the activity.
1. Given the equation

   3x - 5y = 4
(a) Find the solution set of the equation. Does the solutions set of
the equation form a vector space?
(b) Find ve specic ordered pairs that are solutions to the equation.
Plot them in the coordinate plane. What do you observe?
(c) Using ISETL, draw the graph of the equation. What do you ob-
serve? Use plot to plot the solutions you found in part (a) on the
same coordinate system as the graph of the equation when x is in
[3, 3]. What do you nd?
(d) Answer using your own words: Where will all the points of the
solution set lie when you plot them on the same coordinate system
as the graph of the equation?
(e) Repeat the previous instructions for the equation

   3x - 5y = 0.
Compare the results found here with those you obtained for the
equation given at the beginning of the activity. What are the
dierences? What are the similarities?
2. Consider the system of equations,
x + 2y = 0
x + 2y = 3.
(a) Graph the solution set of each equation in the same coordinate
plane.
(b) Select a point that is a solution of the rst equation but not of
the second one. Locate it on the plane. What do you observe?
(c) Select a point that is a solution of the second equation but not of
the rst one. Locate it on the plane. What do you observe?
(d) Select a point that is neither a solution of the second equation nor
of the rst. Locate it on the plane. What do you observe?
(e) Can you nd a point that is a solution of both equations? Why?
(f) Find the solution set of the system algebraically, if it exists. What
do you nd? Compare with your answer to the previous part of
this activity. What is the relationship between the algebraic result
for the solution set of the system and the geometric representa-
tion? Carefully explain.
3. Consider the system of equations,
   3x + 2y = 5
   6x - 7y = 2.
(a) Graph the solution set of each equation in the same coordinate
plane.
(b) Select a point that is a solution of the rst equation but not of
the second one. Locate it on the plane. What do you observe?
(c) Select a point that is a solution of the second equation but not of
the rst one. Locate it on the plane. What do you observe?
(d) Select a point that is neither a solution of the second equation nor
of the rst. Locate it on the plane. What do you observe?
(e) Find the solution set of the given system algebraically, if it exists.
Locate the points of the solution set on the coordinate plane.
What do you observe?
4. Consider the system of equations,
    6x + 15y = 6
   14x + 35y = 14.
(a) Graph the solution set of each equation in the same coordinate
plane.
(b) Select a point that is a solution of the rst equation but not of
the second one. Locate it on the plane. What do you observe?
(c) Select a point that is a solution of the second equation but not of
the rst one. Locate it on the plane. What do you observe?
(d) Select a point that is neither a solution of the second equation nor
of the rst. Locate it on the plane. What do you observe?
(e) Find the solution set of the given system algebraically, if it exists.
Locate the points of the solution set on the coordinate plane.
What do you observe?
5. Given a system of two equations in two unknowns,
ax +by = e
cx +dy = f,
answer the following questions:
(a) Give conditions on a, b, c, d, e, and f under which a system of two
equations in two unknowns has a unique solution, innitely many
solutions, or no solution.
(b) Coordinate each set of conditions with the appropriate geometric
representation.
(c) Given a graph of the system and the graph of the two equations
when the system has a unique solution, which points in the co-
ordinate graph are solutions of the rst equation? the second
equation? both equations? neither equation?
(d) Given a graph of the system and the graph of the two equations
in the case when the system has innitely many solutions, which
points on the coordinate graph are solutions of the rst equation?
the second equation? both equations? neither equation?
(e) Given a graph of the system and the graph of the two equations
in the case when the system has no solution, which points on the
coordinate graph are solutions of the rst equation? the second
equation? both equations? neither equation?
6. Consider the equation
   x - 2y + z = 0.
(a) Find the solution set of the equation. How many solutions does
the equation have?
(b) Determine whether the ordered triples
(0, 0, 0), (2, 1, 2), and (2, 1, 0)
are solutions of the given equation. Plot three ordered triples
that are elements of the solution set of the equation in a three-
dimensional space. Can you imagine a plane passing by those
points? Chose other three points from the solution set of the
equation and plot them on the same graph as the previous three.
What do you obtain?
(c) If we were to graph all the points of the solution set, what would
be the form of its geometric representation?
(d) Repeat parts (a) and (b) for the equation
   x - 2y + z = 10,
using the ordered triples (10, 0, 0), (2, 1, 6) and (1, 3, 12). Com-
pare the results found here with those you obtained before. What
dierences and what similarities do you nd?
7. Consider the system of equations
x +y +z = 0
x +y +z = 5.
(a) What is the geometric representation of the solution set of each
equation?
(b) Are the geometric representation of the solution set of each equa-
tion parallel?
(c) Do the geometric representation of the solution set of each equa-
tion intersect? If they do, can you describe the intersection geo-
metrically?
(d) Find the solution set of the system. Is the solution set you ob-
tained consistent with the geometric representation? Carefully
explain.
(e) Compare your answers to those of the previous activity. What do
you observe?
(f) Can you write a system of two different linear equations in three unknowns that represents two coincident planes?
8. Consider the system of equations
    x - 3y + 2z = 8
   3x - 7y +  z = 2.
(a) What is the geometric representation of the solution set of each
equation?
(b) Are the planes represented by the equations parallel? Why?
(c) Do the planes intersect each other? If they do, can you describe
the intersection geometrically?
(d) Find the solution set of the system. Is the algebraic solution
consistent with the geometric representation? Carefully explain.
(e) Plot the solution set in a three-dimensional space. What do you
observe? Compare with your answer to the previous question, do
both answers agree? Why?
9. Consider the system of equations
   5x + 2y -  z = 11
    x -  y +  z = 1
   4x + 2y + 3z = 5.
(a) What is the geometric representation of the solution set of each
equation?
(b) Are the planes represented by the equations parallel?
(c) Do the rst two planes intersect each other? If they do, can you
describe the intersection geometrically?
(d) Do the last two planes intersect each other? If they do, can you
describe the intersection geometrically?
(e) Solve the system formed by the rst two equations. What do you
nd? What is the geometric representation of this solution set?
(f) Solve the system formed by the last two equations. What do you
nd? What is the geometric representation of this solution set?
(g) Do the three planes represented by each of the equations intersect?
If they do, can you describe the intersection geometrically?
(h) Find the solution set of the entire system. Is the algebraic solution
consistent with the geometric representation? Carefully explain.
(i) Plot the solution set in a three-dimensional space. What do you
observe? Compare with your answer to the previous question, do
both answers agree? Carefully explain.
10. Consider the system of equations
   3x -  y - z = 5
    x - 5y + z = 3
    x + 2y - z = 1.
(a) What is the geometric representation of the solution set of each
equation?
(b) Are the planes represented by the equations parallel?
(c) Do the rst two planes intersect each other? If they do, can you
describe the intersection geometrically?
(d) Do the last two planes intersect each other? If they do, can you
describe the intersection geometrically?
(e) Solve the system formed by the rst two equations. What do you
nd? What is the geometric representation of this solution set?
(f) Solve the system formed by the last two equations. What do you
nd? What is the geometric representation of this solution set?
(g) Do the three planes represented by each of the equations intersect?
If they do, can you describe the intersection geometrically?
(h) Find the solution set of the entire system. Is the algebraic solution
consistent with the geometric representation? Carefully explain.
11. Consider the system of equations
   3x -  y - z = 5
    x - 5y + z = 3
    x + 2y - z = 2.
(a) What is the geometric representation of the solution set of each
equation?
(b) Are the planes represented by the equations parallel?
(c) Do the rst two planes intersect each other? If they do, can you
describe the intersection geometrically?
(d) Do the last two planes intersect each other? If they do, can you
describe the intersection geometrically?
(e) Solve the system formed by the rst two equations. What do you
nd? What is the geometric representation of this solution set?
(f) Solve the system formed by the last two equations. What do you
nd? What is the geometric representation of this solution set?
(g) Do the three planes represented by each of the equations intersect?
If they do, can you describe the intersection geometrically?
(h) Find the solution set of the entire system. Is the algebraic solution
consistent with the geometric representation? Carefully explain.
12. Consider the following systems of equations,
(1)
   2x + 3y - 8z = 0
    x - 2y + 3z = 0
   5x - 3y +  z = 0

(2)
   2x + 3y - 8z = 1
    x - 2y + 3z = 4
   5x - 3y +  z = 13.
(a) What is the geometric representation of the solution set of each
equation in the two systems?
(b) Given the ordered triple (1, 2, 1), test whether it is a solution to
System (1). Test whether it is a solution to System (2). What do
you nd?
(c) Given the ordered triple (2, 4, 2), test whether it is a solution
to System (1). Test whether it is a solution to System (2). What
do you nd?
(d) Calculate the ordered triple found by adding (1, 2, 1) and (1, 4, 13).
As you can see, this ordered triple is the same vector as (1, 2, 1)
but it has been translated to a new position displacing it by adding
the vector (1, 4, 13). Is the resulting ordered triple a solution to
System (1)? Is it a solution to System (2)?
(e) Calculate the ordered triple obtained by adding (2, 4, 2) and
(1, 4, 13). As you can see, this ordered triple is the same vector
as (2, 4, 2) but it has been translated to a new position dis-
placing it by adding the vector (1, 4, 13). Is the resulting ordered
triple a solution to System (1)? Is it a solution to System (2)?
(f) Find the solution set of System (1). Find the solution set of Sys-
tem (2). What is the relationship between the algebraic represen-
tation of the two solution sets?
(g) Can you nd any relationship between the geometric representa-
tions of two solution sets? Describe it. Can you nd any relation-
ship between the two systems?
Discussion

Equations in Two Unknowns

You already know how to find the solution set of an equation or a system of equations, and you also know the meaning of the solution set in algebraic terms. We are interested now in finding out what is the meaning of equations, systems of equations, and their solutions in geometric terms.

We will start with equations in two variables. The equation ax + by = c can be considered as a system consisting of only one equation. What is its solution set? You already know that the solution set of that system over R consists of an infinite set of points in a two-dimensional space. In Activity 1 you plotted the solution set of two such equations in the coordinate plane. As you found out, the geometric representation of the solution set of a single linear equation in two variables is a line in the coordinate plane. What is the slope of the line that represents the solution set of the first equation in Activity 1? What is its y-intercept? What is the slope of the line that represents the solution set of the second equation in Activity 1? What is its y-intercept? There is a difference between the two equations given in this activity, however. The solution set of one equation forms a vector subspace of R^2, while the other does not. Which is which? When you compared the two lines representing the two equations you could observe that the lines are parallel; they have the same slope. The second line is a line through the origin, and the first one is a line through (0, -4/5); this line can be considered as the translation of the line through the origin to the point (0, -4/5).
As you recall from the previous sections, you had demonstrated that you
could obtain the solution set of a non-homogeneous system of equations from
the solution of the associated homogeneous system if you knew one particular
solution. What is the particular solution in this case? Is the relationship be-
tween the geometric representation consistent with the algebraic relationship
just mentioned?
In Activity 2 the equations are also in K = R. You were asked to find out if the components of a point which is a solution of the first equation form a solution of the other. When you plotted this point on the plane, you noticed that it was located on the line representing the solution set of the first equation, but not on the line representing the solution set of the second equation. You also noticed that when you picked a point which is not a solution of either equation, it was located on the plane in a place that was not on either of the lines representing the solution sets of the equations. You finally realized that the lines representing the solution sets of both equations do not intersect, that is, they do not have any point in common. When you solved the system you found out that the solution set is empty; there are no points in the solution set. You could then verify that when the solution set of the system is empty, the lines representing the equations of the system do not intersect. Can you explain this result in terms of the result of the previous activity?
In Activity 3 you were presented again with a system of two equations in K = R and two unknowns. You realized that the geometric representation of each solution set of those equations is a line. Why? What are their slopes? What are their y-intercepts? In this case the two lines are not parallel; they intersect in one point. Again, you were asked to find out if the components of a point which is a solution of the first equation form a solution of the other. When you plotted this point in the plane you noticed that it was located on the line representing the solution set of the first equation, but not on the line representing the solution set of the second equation. As in the previous activity, you also noticed that the point which is not a solution of the first equation, but is a solution of the second, is located on the line representing the second equation. The point which is not a solution of either equation is again located in the plane outside the two lines representing each of the equations. Then you were asked to solve the system. This time you found the system has a unique solution, which is precisely the point of intersection of the lines representing the solution sets of the two equations.
The lines represented by the equations of the system over R given in Activity 4 are parallel. Why? When you chose a point that was a solution to the first equation you noticed that the same point was always a solution to the second equation, and vice versa. Was it possible to find a point that satisfied the requirements given by the activity? When you solved the system you realized that any point that is a solution of one equation is a solution of the other. When you plotted both equations you noticed that the particular lines they represent lie one over the other; that is, they are two different ways to write the equation of the same line, so any point on one line is also on the other. The system has an infinite number of solutions that are represented on the plane by all the points that are on the line. On the other hand, points that don't satisfy any of the equations are not solutions of any of them.
The system presented in K = R in Activity 5 is a general representation of two equations with two unknowns. If you give particular values to the parameters a, b, c, and d you will have a particular system. Depending on the values assigned to these parameters you can have different situations, but it is always true that each of the equations represents a line. What is the relationship between the slope of the first equation, given by -a/b, the slope of the second equation, given by -c/d, and the existence of exactly one or infinitely many solutions? If the slopes are equal, under which condition do the y-intercepts, denoted by e/b and f/d, correspond to a system with infinitely many solutions, represented by coincident lines, or a system with no solution, represented by parallel lines?
As you have seen in previous activities, systems of two equations and two unknowns over R can have one solution, an infinite number of solutions, or no solution at all. You already know from previous sections that the system will have one solution when the system can be row reduced without obtaining a row of zeros, that it will have an infinite number of solutions if it can be row reduced and you obtain a row that consists only of zeros, and that it has no solution if the row reduction process leads to a row in which every entry except the last entry is zero. Now you also know that each of the equations of the system can be represented geometrically by a line. The number of solutions depends on the position of the two lines in the plane. In previous activities you found that the condition for the system to have one solution is that the lines represented by the equations are not parallel, that is, that they have different slopes. If the two lines represented by the equations have the same slope, there are two possibilities: if they have the same y-intercept, the equations are representations of the same line, or you might say that one of the lines lies exactly over the other line, and that is why they intersect in an infinite number of points; if the lines represented by the equations do not have the same y-intercept, they are parallel, they do not intersect, and the system has no solution. Can you find a condition for the solution set of a system of equations in K = R to be a vector space? Can you relate this condition to the geometric representation of the solution set?
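The case analysis for two lines can be spelled out directly from the coefficients. The short Python sketch below is our own illustration of the conditions discussed above (it is not part of the ISETL materials), written for the case in which b and d are nonzero so that both equations really are lines with slopes -a/b and -c/d:

def describe(a, b, c, d, e, f):
    """Describe the solution set of  ax + by = e,  cx + dy = f,
    assuming b and d are nonzero."""
    if a * d - b * c != 0:                      # different slopes
        x = (e * d - b * f) / (a * d - b * c)   # Cramer's rule
        y = (a * f - e * c) / (a * d - b * c)
        return f"unique solution: ({x}, {y})"
    if e * d - b * f == 0:                      # same slope and same intercept
        return "infinitely many solutions (coincident lines)"
    return "no solution (parallel lines)"

print(describe(1, 2, 1, 2, 0, 3))      # Activity 2: parallel lines, no solution
print(describe(3, 2, 6, -7, 5, 2))     # Activity 3: a single point of intersection
print(describe(6, 15, 14, 35, 6, 14))  # Activity 4: coincident lines

Compare the three printed descriptions with the pictures you drew in Activities 2, 3, and 4.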
Equations in Three Unknowns

Equations with three unknowns in R represent, as you observed in Activity 6, planes in a three-dimensional cartesian space R^3. You may recall that the equation of a plane in a three-dimensional space can be found by identifying a vector that is normal to all the vectors that lie in the plane and a point that lies in the plane. A normal vector can be constructed by taking two vectors that lie on the plane and obtaining the cross product of the two vectors. Since the plane consists of the set of all vectors that are normal to this vector, the equation of the plane passing through the point (x_0, y_0, z_0) and normal to the vector ⟨a, b, c⟩ is the set of all points (x, y, z) such that

   ⟨a, b, c⟩ · ⟨x - x_0, y - y_0, z - z_0⟩ = 0,

as shown in Figure 3.1. In Activity 6, the normal vector is given by the coefficients of the variables, which in this case is ⟨1, -2, 1⟩. The plane passes through the point (0, 0, 0), since the equation has no nonzero constant. If there were a nonzero constant, a point could be identified by selecting values for, say, y and z, and solving the resulting equation for x. Finding the equation for a plane will be dealt with later in the text. In the case of the equation x - 2y + z = 0, you can verify that the normal vector is given by the coefficients of the variables of the equation, that is ⟨1, -2, 1⟩ in the case of the example, and (x_0, y_0, z_0) are the coordinates of a point on the plane, (0, 0, 0) in the example. We will come back to this in later chapters.
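The recipe "take two vectors in the plane, cross them to get a normal, and then write the dot-product equation" can be tried out numerically. The following Python lines are only a sketch of that computation (the helper names cross and on_plane are ours):

def cross(u, v):
    """Cross product of two vectors in R^3; the result is normal to both."""
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]

def on_plane(point, normal, p0):
    """True if 'point' satisfies <normal> . <point - p0> = 0."""
    return sum(n * (x - x0) for n, x, x0 in zip(normal, point, p0)) == 0

# Two solutions of x - 2y + z = 0, viewed as vectors lying in the plane:
u = [2, 1, 0]
v = [0, 1, 2]
print(cross(u, v))                                   # [2, -4, 2], a multiple of <1, -2, 1>
print(on_plane([1, 1, 1], [1, -2, 1], [0, 0, 0]))    # True:  1 - 2 + 1 = 0
print(on_plane([2, 1, 2], [1, -2, 1], [0, 0, 0]))    # False: 2 - 2 + 2 = 2

Notice that the cross product of two vectors in the plane recovers (a multiple of) the coefficient vector of the equation, which is exactly the observation made in the paragraph above.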
In Activity 6 you plotted the equation of the plane corresponding to the given equation and verified that the points in the solution set are on the plane. As in Activity 1, you were asked to consider a second equation that had the same coefficients as the previous one but passes through different points. As the coefficients are the same, the planes represented by the two equations are parallel. Can you explain why? As in Activity 1, when you compared the solution sets of both equations you noticed that you could obtain the solution set of the non-homogeneous system of equations from the solution of the associated homogeneous system if you knew one particular solution. Again, the second equation represents a plane with the same directions as that represented by the first equation but that is translated to another location. Can you describe the specific translation here? One of the two equations yielded a solution set that is a subspace of R^3. Which is it?

Figure 3.1: Plane with normal vector v = <1, -2, 1>
In Activity 7 you considered a system of two equations and three unknowns over R. In this case you could verify that each of these equations geometrically represents a plane in a three-dimensional cartesian space. If a triple is a solution of one of the equations, it has to represent a point on the plane represented by the equation. Working with the system of the equations, you noticed that the solution set is empty. In geometric terms this means that the planes represented by the two equations have no points in common; they are parallel. This condition can be verified by comparing the coefficients of the variables in the equations and the independent constants. Exactly how does this help us? Under what condition does a system of two equations in three unknowns in R yield a nonempty solution? Under what conditions do two planes coincide? What is the solution set of a system of two equations and three unknowns in R in which the corresponding planes coincide?

If two equations represent the same plane, then not only do the normal vectors that define the planes have to be parallel, but all the points that lie in one plane have to lie in the other. What does this mean in terms of the solution of the system formed by the equations of the two planes?
The planes represented by the equations in Activity 8 are not parallel.
You know this because the normal vectors represented by the coefficients
of the equations are not parallel. As the planes are not parallel, they will
intersect. What is the geometric representation of the intersection of the
planes? When you solved the system you found out that the solution set of
the system has an infinite number of points. When you plotted the solution
set in a three-dimensional space you found that the solution space can be
represented by a line. That line lies in the intersection of both planes, as
indicated in Figure 3.2.
Figure 3.2: Intersecting planes from Activity 8
Systems of Three Equations in Three Unknowns
The planes represented by each of the three equations of the system given
in Activity 9 are not parallel. Why? By looking at the first two planes you
found out that there is an infinite number of solutions in the solution set of
that system. Those solutions are on a line which lies in both planes. The
same result holds for a system formed by the last two equations. What do
you expect the intersection of those lines will be? When you solved the entire
system, you discovered a single solution. Is this the point of intersection of the
lines representing solution sets of the systems defined by the first and second
equations and the second and third equations?
In the case of Activity 10, you followed the same instructions as in Activ-
ity 9. You probably expected the intersection of the three planes to represent
again a single point in R^3 because the three planes are not parallel. In this
activity, however, the system has an infinite number of solutions, and the geo-
metric representation of the solution set is a line in the space. See Figure 3.3.
Figure 3.3: Intersecting planes as in Activity 10
The system in Activity 11 is similar to that given in Activity 10 in that
the planes are not parallel. Nevertheless, there is no point of intersection
common to all three planes; the solution set in this activity is empty, despite
the fact that each pair of equations, first/second and second/third, yields a
solution set represented by a line. Do the lines formed by each pair intersect?
In Activity 12 you were given two systems that differed only in terms of
their constants. You tested whether adding a specific solution of System (2)
to a solution of System (1) would yield a solution of System (2). What did
you find? If you think in geometric terms, the solution set of System (2), a
non-homogeneous system, can be thought of as a translation of the solution
set of System (1), a homogeneous system. Can you explain why?
Exercises
All the equations and systems in these exercises are in K = R, unless other-
wise stated in the activity.
1. In this section you found that a linear equation in two variables over R
represents a line in a two-dimensional space, and that a linear equation
in three variables represents a plane in a three-dimensional space. What
can you say about the geometric representation of the solution of a
linear equation in more than three variables?
2. Explain in geometric terms why no system of linear equations can have
exactly two solutions.
3. Describe the solution set of the system of equations kx + y = 5 and
k(x + 2) - y = 3.
4. Determine the value of k such that (4, -1, 1) is on the plane described
by the solution set of the equation kx + 3y - kz = 7.
5. Given the equation 5x - 7y + 11 = 0, what is the geometric represen-
tation of its solution set? Write a general equation that describes all
the possible lines that are parallel to this one.
6. Verify that the solution set of the system of equations given by x +
3y + z = 9, 4x + 3y - 2z = 12 is a line.
7. Write the system of equations that yields the xy-plane as a solution.
Write the system of equations that yields the yz-plane as a solution.
Write the system of equations that yields the xz-plane as a solution.
8. Consider the system of equations given by
x + 4y - 5z = 0
2x - y + 8z = 9.
Describe the geometric representation of the solution set of each equa-
tion. Then describe the geometric representation of the solution set of
the system.
9. Suppose that the solution set of a system of equations is given by
x = 4 - 3z
y = 1 + 6z,

where z is a free variable. Use what you know about vectors to describe
this solution set as a line in R^3.
10. Suppose that the solution set of a system of equations is given by
x = 7 + w
y = 5 - 2w
z = 5 + 2w,

where w is a free variable. Use what you know about vectors to describe
this solution set as a line in R^4.
11. Compare geometrically the solution sets of the following systems of
equations:

    (1)  6y - 18z = 0
         x + 2y + 3z = 0
         2x + 3y + 9z = 0

    (2)  6y - 18z = 24
         x + 2y + 3z = 6
         2x + 3y + 9z = 8.
Chapter 4
Linearity and Span
Vectors and sets of vectors return as the features
for discussion. You may have thought we were
done with vector spaces as we looked at systems
of equations. However, we will need to solve
some systems that arise as we spend time
familiarizing ourselves with the elements of
vector spaces. The main question before us is
"What is the smallest subset of a vector space
that can be used to get the whole thing?" The
answer to that may surprise you.
4.1 Linear Combinations
Activities
1. Let V = (Z_5)^3 be the vector space of triples of elements of Z_5. Parts
(a)-(c) refer to V and it is assumed that name vector space has been
run.
Let v = <1, 2, 3> and w = <2, 1, 4> be two elements in V.
(a) Use ISETL to calculate the vector obtained by multiplying v by
the scalar 2.
(b) Use ISETL to calculate the vector obtained by adding the vectors
v and w.
(c) Use ISETL to calculate the vector obtained by first multiplying v
by the scalar 2 and w by the scalar 3 and then adding the two
resulting vector/scalar products together.
2. Write a func LC that will assume name vector space has been run;
that will accept two inputs SK and SV , where SK denotes a sequence
of scalars, and SV represents a sequence of vectors of the same length;
and that will return a vector constructed by taking the linear combina-
tion of SV with respect to the sequence SK; that is, the combination
formed by first multiplying each vector in SV by its corresponding
scalar in SK and then adding together the resulting vectors.
Apply LC to any sequence of four nonzero vectors and four nonzero
scalars from the vector space V = (Z_5)^6.
Try to use % (see Section 1.2, p. 27) in writing this func.
3. Let V = (Z_5)^3. Use the ISETL func LC you constructed in Activity 2
to perform the tasks given in parts (a)-(b) below.
(a) Let u = <1, 2, 1>, v = <3, 1, 4>, and w = <4, 0, 2> be a sequence
of three vectors in V. Write all the possible sequences [b, c, d]
of three scalars based upon the possible choices of b ∈ {0, 1},
c ∈ {2, 3}, and d ∈ {4}. Then, apply LC to [u, v, w] and each
scalar sequence. Identify which combinations yield the zero vector.
Are your results consistent with those you would get by computing
each combination by hand?
(b) Use the tool Three eqn you constructed in Chapter 3 to find the values
of a, b, and c that make the following statements true. Once you
have found these values, write an ISETL statement that uses LC
to verify your results.
i. a<2, 1, 3> + b<1, 2, 1> + c<4, 0, 2> = <0, 0, 0>
ii. a<3, 1, 1> + b<4, 1, 2> + c<2, 2, 3> = <0, 0, 0>
4. Write a func that assumes that name vector space has been executed;
that will accept two inputs SK and SV , where SK and SV are defined
as in Activity 2; and that will return a boolean value obtained by
applying LC to the pair SK and SV to check whether the resulting
combination LC(SK,SV) is equal to the zero vector.
Test your func on the linear combinations from Activity 3, part (a).
5. Consider the following ISETL code.
UKn:=func(n);
if n=1 then return {[s] : s in K};
else return {t with s : t in UKn(n-1), s in K};
end;
end;
Explain how this program is executed in the case in which K = Z_3 and
n = 2; K = Z_3 and n = 3; K = Z_3 and n = 4. Then, explain how
this program is executed in general. What is your interpretation of the
result of running this program?
6. Write a func All LC that assumes that name vector space has been
executed; that will accept a single input SETV , where SETV denotes
a set of vectors in V ; and that will return the set of all possible linear
combinations of SETV . Use UKn in your program and note that if you
want to use LC on SETV , then you will rst have to convert it from a
set to a sequence.
7. Use the vector space V = (Z_3)^4 to complete parts (a)-(c) below.
(a) Construct five different sets of vectors, each set consisting of four
vectors from V.
(b) For each set in (a), compute all possible linear combinations by
hand. (Leave these combinations unsimplified.)
(c) Apply All LC to each set in (a) to simplify all of the combina-
tions you constructed in (b). How many different linear combina-
tions do you get for each of these sets?
(d) For each of these sets, which linear combinations yield the zero
vector?
8. Write a func LU that assumes that name vector space has already
been executed; that will accept two inputs v and SETV from a vector
space V , where v is any vector, and SETV is a set of vectors; and that
will return a boolean value obtained by determining whether v can be
written as a linear combination of SETV in one and only one way.
For the vector space V = (Z_7)^4, use LU to determine the answer to the
following two questions.
(a) Can the vector v = <3, 4, 1, 2> be expressed uniquely as a linear
combination of the set
SET1V = {<1, 2, 1, 0>, <3, 0, 1, 2>, <2, 1, 0, 1>,
<4, 0, 3, 5>, <5, 3, 0, 3>}?
(b) Can the vector v = <3, 4, 1, 2> be expressed uniquely as a linear
combination of the set
SET2V = {<1, 0, 0, 0>, <3, 0, 1, 2>, <2, 1, 0, 1>, <4, 0, 3, 5>}?
9. Consider the vector v = <1, 2> in R^2. Use the ISETL func vectors to
view this vector in the plane. Let t = 2, 3, 0.5. For each value of
t, use vectors to graph the scalar product tv.
(a) What do you observe? Based upon these examples, what does the
set

    {tv : t ∈ R}

look like when it is graphed in the plane?
(b) Explain why it looks this way.
10. Consider the vectors v = <1, 3> and w = <-1, 2> in R^2. Let a =
0.1, 0.2, 0.3, ..., 1. Use vectors to view each possible combination:
take the product of a with v and (1 - a) with w, and add the results
together; in short, graph av + (1 - a)w for each value of a given above.
What do you observe? Based upon these examples, what do you think
the set

    {av + (1 - a)w : a ∈ [0, 1]}
looks like when it is graphed in the plane?
11. Consider the vectors v = <1, 3> and w = <-1, 2> in R^2. Let a =
0.1, 0.2, ..., 4. Use vectors to view the following combinations: take
the product of a with w and add it to v; in short, graph v + aw for
each value of a given above.
(a) What do you observe? Based upon these examples, what do you
think the set

    {v + aw : a ∈ R}

looks like when it is graphed in the plane?
(b) How would it look if you let a run through all values in [0, ∞)?
How about (-∞, 1]? (-∞, ∞)?
12. Consider the differential equation

    f'' + f = 0,

where the function f is in C^∞(R).
Find three functions which are solutions to this differential equation.
Then choose any three scalars in R and use them to form a linear
combination with your three functions. Is this linear combination also
a solution?
Discussion
The Difference Between a Set and a Sequence
In this section of activities, you may have noticed that some activities refer
to a set of vectors or scalars and others refer to a sequence of vectors or
scalars. Do you recall the difference between a set and a sequence? From
Chapter 1 you will recall that a set in ISETL is designated by curly braces
{ }, whereas a sequence is denoted by square brackets, [ ], and is called a
tuple. What properties differentiate sets from tuples?
In addition to having different properties, sets and tuples are conceptually
different. Let's review some properties that you worked with in Chapter 1.
A sequence is a function whose domain consists of the set or a subset of the
positive integers and whose range can be any set. How would you see a list
like a_1, a_2, a_3, ..., a_n, ... as a function? If the name of the function in this
case is f, what would be meant by f(1), f(2), f(3)? In the context in which
we are working, we can focus upon the listing representation, but, because a
sequence is a function, it is not just any list, it is an ordered list. Where does
the order come from? For example, the sequence [4, 5, 6] is different from the
sequence [6, 4, 5] because of the difference in the order of the presentation of
the elements. On the other hand, a set is an unordered collection of objects.
So, the set {4, 5, 6} is equal to the set {6, 4, 5}. Additionally, if an element is
repeated in a sequence, say [4, 5, 5, 6], the repeated 5 cannot be dropped like
it would be if we were talking about the set {4, 5, 5, 6}, which is actually equal
to the set {4, 5, 6}.
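If you would like to check this at the ISETL prompt, the following short transcript illustrates it (the prompt is shown here as >; your installation may display it slightly differently):

> [4, 5, 6] = [6, 4, 5];       $ tuples are ordered
false;
> {4, 5, 6} = {6, 4, 5};       $ sets are not
true;
> {4, 5, 5, 6} = {4, 5, 6};    $ repeated elements collapse in a set
true;
> [4, 5, 5, 6] = [4, 5, 6];    $ but not in a tuple
false;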
This distinction comes up when we have specific scalars and specific vec-
tors that we want to use in forming a linear combination. Thus, in Activity
1(c) we had the scalars 2, 3 and the vectors v, w. We wanted to form the
linear combination 2v + 3w. This means that we are thinking about the
sequence of scalars [2, 3] and the sequence of vectors [v, w] and not sets of
vectors or scalars. What sequences would we use if we wanted to form the
linear combination 3v - 2w? or 2w + 3v?
Can you explain why this means that in Activity 2, the func LC has
to take inputs SK, SV which are, respectively, a sequence of scalars and a
sequence of vectors? On the other hand, in Activity 6, the func All LC takes
a set of vectors. What is the difference? Since All LC calls the func LC, how
did you deal with the fact that All LC receives a set of vectors, but LC needs
a sequence to work with? Was a conversion involved here?
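For reference, one way to carry out such a conversion is with a tuple former (the same line of code appears in the Activities of the next section); ISETL is free to list the elements of the set in any order it chooses, which is acceptable here as long as the scalars are matched to the resulting tuple consistently:

TUPV := [x : x in SETV];    $ a tuple whose entries are the elements of the set SETV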
Forming Linear Combinations
In most of your work so far, each vector space has been of the form (K)^n,
where the vectors are n-tuples whose components are ele-
ments of the scalar set K. A vector space does not generally have to be of
this form (for example, P_n(K) and C^∞(R)). However, many of the vector
spaces that we will encounter in this course will have elements consisting of
tuples of entries from the scalar set. Indeed, we will discuss in Section 4.4
that there is a sense in which every vector space is essentially the same as a
vector space of tuples of elements of the scalar set.
No matter what form its elements assume, a vector space is always defined
over, or accompanied by, a set of scalars. For this reason, whenever a vector
space is defined, it is common to use the phrase: "Let V be a vector space
over a field K." For our purposes, we do not need to know what a field is; you
will study that concept in abstract algebra. In this course, the scalar field
will be a familiar set like the real number system or a finite set such as Z_2,
Z_3, Z_5, or Z_7.
Given this relationship between a vector space and its scalar field, how
would you, given two vectors v and w in a vector space V and two scalars a
and b in its associated scalar field K, explain how to multiply v by a? add
v and w? compute the combination av + bw?
In Activity 1, you were asked to perform this series of operations. In
particular, given the set {v = <1, 2, 3>, w = <2, 1, 4>} in (Z_5)^3, you were asked
to form three combinations: 2v in part (a), v + w in part (b), and 2v + 3w in
part (c). As it turns out, these combinations represent three of the possible
linear combinations of the set of vectors {v = <1, 2, 3>, w = <2, 1, 4>}. If
we let a, b ∈ Z_5, where a ∈ {1, 2, 3} and b ∈ {2, 4}, what are the linear
combinations of the form av + bw for the given values of a and b?
In Activity 2, you constructed a func that accepted as input a sequence of
vectors SV and a sequence of scalars SK and returned the linear combination
of the corresponding pair as output. If we keep the vectors in the order in
which they were given in Activity 1, then SV = [v, w], with SK = [2, 0] in
part (a), SK = [1, 1] in part (b), and SK = [2, 3] in part (c). In Activity 3,
SV = [<1, 2, 1>, <3, 1, 4>, <4, 0, 2>], and, based upon the choices for b, c, and
d, the possible scalar sequences would be of the form [0, 2, 4], [0, 3, 4], [1, 2, 4],
and [1, 3, 4]. If we let SV = [v_1, v_2, v_3] be an arbitrary sequence of vectors,
and if we let SK = [a_1, a_2, a_3] be an arbitrary sequence of scalars, what would
be the form of the linear combination of the sequence SV with respect to the
sequence SK? The definition given below discusses how to form such linear
combinations in general.
Definition 4.1.1. Let V be a vector space over the field K. Let SV =
[v_1, v_2, ..., v_q] be a sequence of q vectors in V, and let SK = [a_1, a_2, ..., a_q]
be a sequence of q scalars in K. The linear combination of the sequence SV
with respect to the scalar sequence SK is given by:

    a_1 v_1 + a_2 v_2 + ... + a_q v_q.
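To make the definition concrete, here is a minimal stand-alone sketch of a linear-combination calculation for the special case V = (Z_5)^3, written without the name vector space code; the names sm5, va5, and LC5 are ours, not part of the course software, and vectors are entered as ISETL tuples such as [1, 2, 3].

sm5 := func(a, v);                $ scalar multiple, componentwise mod 5
         return [ (a * v(i)) mod 5 : i in [1..#v] ];
       end;
va5 := func(v, w);                $ vector sum, componentwise mod 5
         return [ (v(i) + w(i)) mod 5 : i in [1..#v] ];
       end;
LC5 := func(SK, SV);              $ linear combination of the tuple of vectors SV
         local total, i;          $ with respect to the tuple of scalars SK
         total := sm5(SK(1), SV(1));
         for i in [2..#SV] do
           total := va5(total, sm5(SK(i), SV(i)));
         end;
         return total;
       end;
LC5([2, 3], [[1, 2, 3], [2, 1, 4]]);    $ 2v + 3w from Activity 1(c); prints [3, 2, 3];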
Simplified Single-Vector Representations
Let V = R^4, the vector space of 4-tuples with real-number entries. Let

    SK = [3, 2, 5] and SV = [<2, 4, 3, 1>, <-3, 0, 1, 5>, <3, 2, 6, 4>].

If we want to express the linear combination of SV with respect to SK as
a single vector, we would explicitly perform the operations indicated by the
definition; in particular,

    3<2, 4, 3, 1> + 2<-3, 0, 1, 5> + 5<3, 2, 6, 4>
      = <6, 12, 9, 3> + <-6, 0, 2, 10> + <15, 10, 30, 20>
      = <15, 22, 41, 33>.
If we continue to let V = R^4, but if, in this case, we let

    SK = [a_1, a_2, a_3],
    SV = [v_1, v_2, v_3]
       = [<v_11, v_12, v_13, v_14>, <v_21, v_22, v_23, v_24>, <v_31, v_32, v_33, v_34>],

how would we express the linear combination of SV with respect to SK in
the form of a single vector?
In general, if V = (K)^n, the vector space of n-tuples with entries in K,
and if

    SV = [v_1, v_2, v_3, ..., v_q]

is a sequence of q vectors in V, then the single-vector form of the linear
combination of SV with respect to the scalar sequence

    SK = [a_1, a_2, a_3, ..., a_q]

would be given as follows:

    a_1 v_1 + a_2 v_2 + ... + a_q v_q
      = a_1 <v_11, v_12, v_13, ..., v_1n> + a_2 <v_21, v_22, v_23, ..., v_2n>
        + a_3 <v_31, v_32, v_33, ..., v_3n> + ... + a_q <v_q1, v_q2, v_q3, ..., v_qn>
      = <a_1 v_11, a_1 v_12, a_1 v_13, ..., a_1 v_1n> + <a_2 v_21, a_2 v_22, a_2 v_23, ..., a_2 v_2n>
        + <a_3 v_31, a_3 v_32, a_3 v_33, ..., a_3 v_3n> + ... + <a_q v_q1, a_q v_q2, a_q v_q3, ..., a_q v_qn>
      = <(a_1 v_11 + a_2 v_21 + a_3 v_31 + ... + a_q v_q1),
         (a_1 v_12 + a_2 v_22 + a_3 v_32 + ... + a_q v_q2),
         (a_1 v_13 + a_2 v_23 + a_3 v_33 + ... + a_q v_q3),
         ...,
         (a_1 v_1n + a_2 v_2n + a_3 v_3n + ... + a_q v_qn)>.
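As a quick numerical check of the R^4 example above, the same computation can be done in ISETL with nothing more than tuple formers and the compound operator %+ (this is throwaway code written for this discussion, not part of the course software):

SK := [3, 2, 5];
SV := [[2, 4, 3, 1], [-3, 0, 1, 5], [3, 2, 6, 4]];
[ %+ [ SK(j) * SV(j)(i) : j in [1..3] ] : i in [1..4] ];
$ prints [15, 22, 41, 33];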
Geometric Representation
Let V = R^2, the vector space of ordered pairs with real-numbered entries.
If v = <v_1, v_2> is an element of V, the ordered pair (v_1, v_2) is represented geometrically by
an arrow whose initial point is the origin (0, 0) and whose terminal point has
coordinates given by the ordered pair (v_1, v_2). If we let v_1 = 3 and v_2 = 2,
the set of all linear combinations of the vector <3, 2>, which was graphed in
Activity 9, is represented algebraically by

    {c<3, 2> : c ∈ R}.
How would you describe the graph of this set? Take a sheet of paper, draw
it by hand and compare it with what you got on the screen in Activity 9.
What do you observe?
How about Activity 10?
You should have discovered that the graph of this set of vectors is a line
that passes through the origin. If you recall, every line in the plane can be
represented by an equation of the form y = mx + b, where m is the slope of
the line, and b is the y-intercept. Since this line passes through the origin,
the value of b is zero. What is the value of m? To answer this question, we
need to identify two points that lie on the line: certainly, one is (0, 0); if we
let c = 1, another point is (3, 2). Based upon these two ordered pairs, we
see that the equation of the line is y = (2/3)x. This is precisely the relationship
that exists between the first and second coordinates of any vector in R^2 that
is an element of the set {c<3, 2> : c ∈ R}.
Figure 4.1: {c<3, 2> : c ∈ R}
In particular, if
    (x, y) ∈ {c<3, 2> : c ∈ R},

and if x ≠ 0 and c ≠ 0, then the relationship between x and y can be
represented by the ratio

    y/x = 2c/3c = 2/3;

that is, y = (2/3)x. Of course, this relationship continues to hold in the case in
which x and y are both zero.
Conversely, any solution of the equation y = (2/3)x can be represented in
vector form by the tuple <x, (2/3)x>. This form can be simplified to

    (1/3)x <3, 2>.

Since x is an arbitrary real number, (1/3)x can represent any real number, which
means that if we let c = (1/3)x, the above scalar multiple can be written in the
form c<3, 2>. Therefore, any vector whose components satisfy the equation
y = (2/3)x is an element of the set {c<3, 2> : c ∈ R}.
Thus, the set of vectors in R^2 whose components satisfy the equation
y = (2/3)x is equal to the set of vectors given algebraically by {c<3, 2> : c ∈ R}.
Go back to Activity 9: Does the graph of the set of vectors given in that
exercise form a line in the plane? If so, what is the equation of the line? What
is the relationship between the first and second coordinates of the vectors in
{tv : t ∈ R}? Does the relationship, if any, reflect what you found in the
previous example? In general, can the graph of any set of vectors of the form

    {c<a, b> : c ∈ R},

where a ≠ 0 or b ≠ 0, be represented as a line through the origin whose slope
is given by the ratio b/a? Explain your answer.
In Activity 11, you were asked to find the graph of the linear combination

    {v + aw : a ∈ R},

where v = <1, 3> and w = <-1, 2>. This set of vectors is exactly the same as
the set of vectors specified by the xy-equation y = 5 - 2x. How do we show
this?
Figure 4.2: {v + aw : a ∈ R}
Let <x, y> ∈ {v + aw : a ∈ R}. Then,

    (x, y) = v + aw
           = (1, 3) + a(-1, 2)
           = (1 - a, 3 + 2a).

Since x = 1 - a and y = 3 + 2a, it follows that y = 5 - 2x. Hence, the
components of every vector in {v + aw : a ∈ R} form a solution of the
equation y = 5 - 2x.
On the other hand, each solution of the equation y = 5 - 2x can be
expressed in vector form as

    <x, 5 - 2x>.

This is equivalent to the vector sum

    <1, 3> + <x - 1, 2 - 2x>.

If we let c = x - 1, then the sum becomes

    <1, 3> + <x - 1, 2 - 2x> = <1, 3> + <c, -2c> = <1, 3> + (-c)<-1, 2>.

Since x is an arbitrary real number, c is an arbitrary scalar, which means
that -c is also arbitrary. So, if we let a = -c, we get

    <1, 3> + a<-1, 2>,

from which it follows that every vector whose components are a solution to
the equation y = 5 - 2x is a vector of the form

    <1, 3> + a<-1, 2>.

Therefore, the solution set, in vector form, of the equation y = 5 - 2x is equal to
the set of vectors {<1, 3> + a<-1, 2> : a ∈ R}.
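As a quick numerical spot check of the two directions of this argument, take the solution of y = 5 - 2x with x = 3, so that c = x - 1 = 2 and a = -c = -2:

\[
(3,\; 5 - 2\cdot 3) = (3,\,-1), \qquad
\langle 1,3\rangle + (-2)\langle -1,2\rangle = \langle 1+2,\; 3-4\rangle = \langle 3,-1\rangle .
\]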
How is the set of vectors {v + aw : a ∈ R} related to {aw : a ∈ R}?
Figure 4.3: {sv + tw : s, t ∈ R}
Let V = R^3 be the vector space of ordered triples of real numbers. If

    v = <v_1, v_2, v_3> ∈ V  and  w = <w_1, w_2, w_3> ∈ V,

then the ordered triples <v_1, v_2, v_3> and <w_1, w_2, w_3> represent two arrows
whose initial points are the origin (0, 0, 0) and whose terminal points have
coordinates given by (v_1, v_2, v_3) and (w_1, w_2, w_3), respectively. If v and w are
not multiples of each other, that is, w ≠ cv for any scalar c ∈ R, then the
set of all possible linear combinations of v and w, denoted by the set

    {sv + tw : s, t ∈ R},

forms the plane generated by {v, w}. The vectors v, w are referred to as
generators of this plane. If you recall from multivariable calculus, every pair
of non-parallel vectors, that is, vectors that are not multiples of one another,
defines a plane. Also recall from that course the two binary operations on vectors:
the dot product and cross product. The equation of a plane is obtained by
identifying a normal vector, say <a, b, c>, formed by taking the cross product
of the two generators, and simplifying the resulting dot product equation

    <a, b, c> · <x - x_0, y - y_0, z - z_0> = 0,

where (x_0, y_0, z_0) is any fixed point in the plane, (x, y, z) is any arbitrary
point in the plane, and <x - x_0, y - y_0, z - z_0> refers to the directed line segment
whose initial point is (x_0, y_0, z_0) and whose terminal point is (x, y, z).
To understand better the connection between the set of all linear com-
binations of a generating set and the plane formed by two generators, let's
consider the following example. Let v = <1, 2, 3> and w = <2, 3, 1>. The set
of all linear combinations of v and w is denoted by the set

    {s<1, 2, 3> + t<2, 3, 1> : s, t ∈ R}.

Since v = <1, 2, 3> and w = <2, 3, 1> are not multiples of each other, <1, 2, 3>
and <2, 3, 1> define a plane whose normal vector is <-7, 5, -1>, found by
taking the cross product of <1, 2, 3> and <2, 3, 1>. This yields the following
equation, when using the point (1, 2, 3):

    <-7, 5, -1> · <x - 1, y - 2, z - 3> = 0
    -7(x - 1) + 5(y - 2) - 1(z - 3) = 0
    -7x + 7 + 5y - 10 - z + 3 = 0
    7x - 5y + z = 0   (after multiplying through by -1).
We claim that the solution set of 7x - 5y + z = 0 is the set of vectors

    {s<1, 2, 3> + t<2, 3, 1> : s, t ∈ R}.

In order to prove this claim, we must show that every solution of the equation
7x - 5y + z = 0 is a linear combination of <1, 2, 3> and <2, 3, 1>, and then we
must prove that every linear combination of <1, 2, 3> and <2, 3, 1> is a solution
of the equation 7x - 5y + z = 0.
Every solution of the equation 7x - 5y + z = 0 can be written as a vector
in the form <x, y, 5y - 7x>. To see that this vector is an element of the set
{s<1, 2, 3> + t<2, 3, 1> : s, t ∈ R}, we must show that <x, y, 5y - 7x> is a linear
combination of <1, 2, 3> and <2, 3, 1>. In particular, we must find scalars s and
t such that the equation

    <x, y, 5y - 7x> = s<1, 2, 3> + t<2, 3, 1>

holds. Simplifying, we obtain

    <x, y, 5y - 7x> = <s, 2s, 3s> + <2t, 3t, t>
                    = <s + 2t, 2s + 3t, 3s + t>,

which is a system of 3 equations in the 2 unknowns s and t,

    s + 2t = x
    2s + 3t = y
    3s + t = 5y - 7x.

In Chapter 3 you worked with such systems and developed methods for
finding the solution set. Can you use those methods to show that the solution
is given by the following?

    s = 2y - 3x
    t = 2x - y.

Hence, every vector <x, y, z> whose coordinates x, y, and z are a solution of
the equation 7x - 5y + z = 0 is a linear combination of <1, 2, 3> and <2, 3, 1>,
where s, t are 2y - 3x, 2x - y, respectively.
Next, we want to show that every element of the set

    {s<1, 2, 3> + t<2, 3, 1> : s, t ∈ R}

is a solution of the equation 7x - 5y + z = 0. Let a<1, 2, 3> + b<2, 3, 1> be an
element of the set {s<1, 2, 3> + t<2, 3, 1> : s, t ∈ R}. This linear combination,
when simplified, is <a + 2b, 2a + 3b, 3a + b>. Substituting each component for
the respective variables x, y, and z yields

    7(a + 2b) - 5(2a + 3b) + (3a + b) = 7a + 14b - 10a - 15b + 3a + b
                                      = (7a - 10a + 3a) + (14b - 15b + b)
                                      = 0,

which shows that a<1, 2, 3> + b<2, 3, 1> is a solution of 7x - 5y + z = 0. Since
we have shown that every solution of 7x - 5y + z = 0 is a linear combination of
<1, 2, 3> and <2, 3, 1>, and since we have proven that every linear combination
of <1, 2, 3> and <2, 3, 1> is a solution of 7x - 5y + z = 0, it follows that the set

    {s<1, 2, 3> + t<2, 3, 1> : s, t ∈ R}

is the plane generated by the vectors <1, 2, 3> and <2, 3, 1> and given by the
equation 7x - 5y + z = 0.
What we have shown in this discussion is that the solution set of the
equation 7x - 5y + z = 0 is the set {s<1, 2, 3> + t<2, 3, 1> : s, t ∈ R}, which
is a plane in R^3.
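If you want to see this claim confirmed numerically in ISETL, the following throwaway code (written for this discussion, not part of the course software) forms s<1, 2, 3> + t<2, 3, 1> as a tuple of reals and checks that its components satisfy 7x - 5y + z = 0:

check := func(s, t);
           local p;
           p := [s*1 + t*2, s*2 + t*3, s*3 + t*1];     $ s<1,2,3> + t<2,3,1>
           return 7*p(1) - 5*p(2) + p(3) = 0;
         end;
check(2, -3);       $ true
check(-1, 4);       $ true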
Vectors Generated by a Set of Vectors: Span
Throughout the previous subsection, we have used the terms generator or is
generated by. This was always in very specific contexts and so the meaning
should have been clear to you. Was it? You need to understand these terms
thoroughly and in a general context. In Activity 6, you wrote a func All LC
that formed the set of all linear combinations of vectors taken from a given
set. What does this have to do with the set of vectors generated by a given
set?
In the context of forming linear combinations, we say that the set of all
linear combinations of the vectors u and v, which is given by the set
    {su + tv : s, t ∈ K},

is the set of vectors generated by u and v. Consequently, whenever you are
given the phrase "find the set of vectors generated by u, v, and w," you are
actually being asked to find the set of all linear combinations of u, v, and w;
that is, the set whose form is

    {qu + sv + tw : q, s, t ∈ K}.
This term is important enough to warrant a formal definition.
Definition 4.1.2. If S is a set of vectors in a vector space V , then the set
generated by S is the set W of all linear combinations of vectors in S. We
say that the elements of S are the generators of W and that W is the span
of S.
Do you think that in the context of this definition, W must turn out to
be a subspace of V ?
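As a small concrete illustration of this definition (stand-alone code that does not use All LC or name vector space), here is the span of the single vector [1, 2] in (Z_3)^2, computed by brute force over the three possible scalars:

{ [ (a*1) mod 3, (a*2) mod 3 ] : a in [0..2] };
$ yields the three vectors [0, 0], [1, 2], and [2, 1] (in some order)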
What Vectors Can You Get from Linear Combinations?
In Activity 6, you wrote a func to compute all of the vectors you get by
forming linear combinations of vectors in a given set, that is, you computed
the set generated by the given set. In Activities 4 and 7, you considered the
more specic question of whether one of the linear combinations was equal
to the zero vector. Using the computer is one way of solving such problems
and in Chapter 6, you will develop similar methods using matrices.
There is still another way. Go back a few pages to where you worked out
the solution set of the equation 7x - 5y + z = 0. What does this have to do
with the set of vectors generated by the set {<1, 2, 3>, <2, 3, 1>}? What do
the vectors generated by this set have to do with the plane in R^3 determined
by 7x - 5y + z = 0?
As we saw earlier, we can check whether a vector can be written as a linear
combination of a set of vectors by solving a vector equation. For instance,
suppose we wish to determine whether the vector <7, 12, 18> can be written
as a linear combination of {<1, 2, 3>, <2, 3, 1>}. This question, similar to what
you were asked to do in Activity 3(b) and what was shown above, amounts
to asking whether we can find scalars a and b such that the following vector
equation is true:

    a<1, 2, 3> + b<2, 3, 1> = <7, 12, 18>.

If we simplify the linear combination on the left and equate components
(why?), we get a system of three equations in the two unknowns a and b:

    <7, 12, 18> = a<1, 2, 3> + b<2, 3, 1>
               = <a, 2a, 3a> + <2b, 3b, b>
               = <a + 2b, 2a + 3b, 3a + b>,

which, when simplified further, yields

    a + 2b = 7
    2a + 3b = 12
    3a + b = 18.

As it turns out, this system, which you should try to solve yourself, has no so-
lution. Hence, the vector <7, 12, 18> cannot be written as a linear combination
of <1, 2, 3> and <2, 3, 1>, and the components of the vector, when substituted
into the expression 7x - 5y + z, would render the equation 7x - 5y + z = 0
false. If, on the other hand, the system above had yielded a solution, then
<7, 12, 18> could be written as a linear combination of <1, 2, 3> and <2, 3, 1>;
<7, 12, 18> would lie in the plane generated by <1, 2, 3> and <2, 3, 1>; and the
components of the vector, when substituted into the expression 7x - 5y + z,
would satisfy the equation 7x - 5y + z = 0.
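One way to see that the system is inconsistent is to solve the first two equations and then test the result in the third:

\[
a + 2b = 7,\quad 2a + 3b = 12 \;\Longrightarrow\; b = 2,\; a = 3,
\qquad\text{but then}\qquad 3a + b = 11 \neq 18 .
\]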
In Activity 8, you wrote a func that essentially performed the operation
we have been discussing; in particular, the func accepts as input a single
vector v and set of vectors SV and returns a boolean value obtained by
determining whether v could be expressed as a linear combination of the
elements of SV . If the vector v could not be written as a linear combination
of SV, how would LU need to be modified to report such a result?
Actually, Activity 8 did a bit more. It checked, not only whether the
given vector could be expressed as a linear combination of the given set of
vectors, but also whether this could be done in exactly one way, that is,
was the representation unique? The question of whether a vector can be
expressed uniquely as a linear combination of a given set of vectors is very
important and will be discussed thoroughly in Section 4.4.
Non-Tuple Vector Spaces
Your work in Activity 12 should have involved two additional vector spaces.
One is C^∞(R). The other is the vector space of all solutions of the differ-
ential equation. For which values of a, b in R are the functions a sin, b cos
solutions to the differential equation? How would you write a general linear
combination of two of these functions? When is it a solution?
Exercises
1. Let V = (Z_3)^4 be the vector space of 4-tuples with entries in Z_3. For
each of the following pairs of vectors v, w, find all linear combinations
of v and w, and determine which linear combinations yield the zero
vector. You may wish to use the func All LC to verify your results.
(a) v = <1, 1, 2, 2> and w = <1, 2, 0, 1>.
(b) v = <1, 2, 0, 1> and w = <2, 1, 0, 2>.
2. Let V = (Z_3)^3. Let

    S1 = {<2, 1, 0>, <1, 2, 1>}
    S2 = {<1, 1, 2>, <0, 2, 1>}

be two sets of vectors in V . Find all linear combinations of S1 and of
S2. Do S1 and S2 generate the same set of vectors?
3. Let V = K^6, where K = (Z_2)^2 = {<x, y> | x, y ∈ Z_2} and the additive
and multiplicative operations are given by the following formulas: if
s, t ∈ K, then

    s +_K t = <s_1, s_2> +_K <t_1, t_2>
            = <(s_1 + t_1) mod 2, (s_2 + t_2) mod 2>

    s ·_K t = <s_1, s_2> ·_K <t_1, t_2>
            = <(s_1 t_1 + s_2 t_2) mod 2, (s_1 t_2 + s_2 t_1 + s_2 t_2) mod 2>.

Select any three non-zero vectors from V. Designate one as u, one
as v, and the remaining vector as w. Let A = {<1, 1>, <1, 0>}, B =
{<0, 1>, <1, 0>}, and C = {<1, 1>, <0, 1>} be three sets of scalars. Find
all linear combinations of the form

    au + bv + cw,

where a ∈ A, b ∈ B, c ∈ C.
You may want to use the func LC to verify your result.
4. Give seven vectors in R^4 that are in the set generated by

    {<1, 2, 4, 2>, <3, 5, 2, 3>, <1, 1, 2, 1>, <3, 4, 8, 4>}.
5. Let V = R^3. Determine whether

    S1 = {<2, 1, 3>, <1, 4, 5>}
    S2 = {<3, 1, 4>, <5, 1, 1>}

generate the same set of vectors in R^3.
6. Let V = R^3. Determine whether

    S1 = {<2, 1, 0>, <1, 1, 1>}
    S2 = {<1, 1, 1>, <3, 0, 1>}

generate the same set of vectors in R^3.
7. Let V = R^3. Determine whether

    S1 = {<1, 2, 3>, <1, 2, 5>, <3, 1, 4>}
    S2 = {<1, 6, 11>, <2, 0, 2>, <1, 2, 3>}

generate the same set of vectors in R^3.
8. Modify the func LU you constructed in Activity 8, so that, for any
vector v and set of vectors SV , LU is able to report whether v can
be expressed as a linear combination of SV uniquely (LU reports 1 as
output), in more than one way (LU reports 2 as output), or not at all (LU
reports 0 as output). Test your modified func for every possible vector
v = <v_1, v_2, v_3> in (Z_2)^3, when given the set SV = {<1, 0, 1>, <0, 1, 1>}.
9. Let V = R^3. For each part, (a)-(d), determine whether the first vector
can be expressed as a linear combination of the remaining three vectors.
(a) <3, 3, 7>; <1, 1, 2>, <2, 1, 0>, <1, 2, 1>
(b) <2, 7, 13>; <1, 2, 3>, <1, 2, 4>, <1, 6, 10>
(c) <1, 4, 9>; <1, 3, 1>, <1, 1, 1>, <0, 1, 4>
(d) <4, 3, 8>; <1, 0, 1>, <2, 1, 3>, <0, 1, 5>
10. Let v be a linear combination of two vectors v_1 and v_2. Show that v
is also a linear combination of c_1 v_1 and c_2 v_2, where c_1 ≠ 0 and c_2 ≠ 0.
11. Suppose v is not a linear combination of two vectors v_1 and v_2. Show
that v is also not a linear combination of c_1 v_1 and c_2 v_2 (again with
c_1 ≠ 0 and c_2 ≠ 0). Try to do this using the previous exercise and
without any calculations.
12. Let W denote the set of vectors generated by {v_1, v_2}. If v_3 ∈ W,
prove that {v_1, v_2, v_3} generates the same set.
13. Show that the solution set of the equation 23x - 9y + z = 0 is the set
of vectors

    {s<1, 3, 4> + t<2, 5, -1> : s, t ∈ R}.
14. Find the set of vectors whose components satisfy the equation y =
2x + 3.
15. Given the set of vectors
    {v + aw : a ∈ R},

where v = <3, 2> and w = <2, 5>, find the equation of the line whose
solution set, when written in vector form, is equal to the set given
above.
16. Given the set
    {v + aw : a ∈ R},

where v = <3, 2> and w = <2, 5> as in the prior exercise, determine
what would happen to the graph of this set if the coefficient were al-
lowed to vary; that is, if you were given

    {bv + aw : a, b ∈ R},

what would this set of linear combinations look like? Draw a graph of
this set of linear combinations for b = 1, 2, 3, .5. What do you
observe? How does the graphical form of this set differ from the case
in which b = 1?
17. Find the equation of the plane, given the generating set
{<1, 3, 2>, <3, 0, 2>}.
18. Find the set of vectors whose components satisfy the equation y = (5/3)x.
What happens to the graph of this set, if each vector is multiplied by
the scalar 2? What happens to the graph of this set, if <2, 1> is added
to each vector in the set?
19. Show that if W is the span of a set of vectors S in a vector space V ,
then W is a subspace of V .
20. In the vector space P_n(K), describe the span of the following sets of
vectors.
(a) {1, x^2, x^4, ..., x^(n div 2)}
(b) {x, x^2, x^3, ..., x^n}
(c) {1}
(d) {x}
(e) {1, x}
21. In the vector space PF_4(R), in how many ways can you express the
polynomial function

    x ↦ 2 + 3x

as a linear combination of 1, x, x^2, x^3, x^4?
22. In the vector space PF_4(Z_3), in how many ways can you express the
polynomial function

    x ↦ 2 + 3x

as a linear combination of 1, x, x^2, x^3, x^4?
23. Let a, b be real numbers, g the function given by g(x) = sin(x), h the
function given by h(x) = cos(x), and f = ag + bh the linear combina-
tion. What is the function f given by?
24. For which real numbers a, b is the function f given by the linear com-
bination f = a sin + b cos, a solution to the differential equation

    f'' + f = 0?
4.2 Linear Independence
Activities
1. Let V = (Z_2)^4 be the vector space of quadruples of elements of Z_2. Let

    SETV1 = {<1, 1, 0, 1>, <1, 0, 1, 1>, <1, 1, 1, 0>}
    SETV2 = {<1, 1, 1, 1>, <0, 0, 1, 1>, <1, 1, 0, 0>}
    SETV3 = {<1, 1, 0, 1>, <1, 0, 1, 1>, <0, 0, 1, 1>}
    SETV4 = {<0, 0, 0, 0>, <1, 1, 0, 0>, <0, 0, 1, 1>}
    SETV5 = {<1, 0, 1, 0>, <0, 1, 0, 1>, <1, 1, 1, 1>}

be five sets of vectors from (Z_2)^4.
(a) For each set of vectors, write down the expression for each possible
linear combination. Do not simplify.
(b) Apply the func LC that you wrote in Section 4.1, Activity 2 to
each combination you produced in (a) to decide if it yields the
zero vector.
(c) Identify which sets have the property that there is one and only
one linear combination that yields the zero vector.
2. Write a func LI that will assume name vector space has been run;
that will accept one input SETV , where SETV denotes a set of vectors;
and that will return a boolean value that tells if there is a unique scalar
sequence whose linear combination with the vectors in SETV yields the zero vector.
Verify the construction of LI by checking each set of vectors given in
Activity 1.
You will probably need to define a local variable TUPV and include a
line of code such as:
TUPV := [x : x in SETV];
You may wish to use one or more of the funcs you dened in the
previous section.
3. Write a func LD that will assume name vector space has been run;
that will accept one input SETV , where SETV denotes a set of vectors;
that will convert SETV to a sequence with a line of code such as:
TUPV := [x : x in SETV];
and that will return either the string "the set is independent", if there
is a unique scalar sequence whose linear combination with the vectors
in SETV yields the zero vector, or the set of all scalar sequences that
yield the zero vector, if more than one such scalar sequence is identied.
Verify the construction of LD by checking each set of vectors given in
Activity 1.
4. For each set of vectors {u, v, w} you constructed in Activity 1, deter-
mine whether u can be written as a linear combination of v and w;
determine whether v can be written as a linear combination of u and
w; and determine whether w can be written as a linear combination of
u and v. Keep track of this information in relation to the results you
obtained in Activities 2 and 3.
5. Let V = (Z_2)^4 be as in Activity 1. Apply the func All LC, which you
wrote for Section 4.1, Activity 6, to find the set of vectors generated by
the zero vector, that is, by the set whose only element is the zero vector.
What do you observe? Then,
apply the funcs LI and LD to this single element set. What do you
observe?
6. Let V = (Z_7)^2. Apply the func All LC to find the set of vectors gen-
erated by the single-vector set {<3, 2>}. What do you observe? Then,
apply the funcs LI and LD to this set. What do you observe?
7. Let v = <2, 3> and w = <4, 6> be two vectors in R^2. Solve the vector
equation

    a<2, 3> + b<4, 6> = <0, 0>

for a and b. How many solutions does this equation have: one? none?
infinitely many? As discussed in the last section, the set of vectors
generated by these two vectors is given by

    {s<2, 3> + t<4, 6> : s, t ∈ R}.

Given s ∈ {2, 1, 0.5} and t ∈ {3, 2}, construct all possible linear
combinations of the form

    s<2, 3> + t<4, 6>,

and use the func vectors to graph all of the resulting combinations.
Based upon your graphs, describe the graph of the set of vectors gen-
erated by <2, 3> and <4, 6>.
8. Let v = <1, 2> and w = <3, -1> be two vectors in R^2.
(a) Solve the vector equation

    a<1, 2> + b<3, -1> = <0, 0>

for a and b. How many solutions does this equation have: one?
none? infinitely many?
(b) As discussed in the last section, the set of vectors generated by
these two vectors is given by

    {s<1, 2> + t<3, -1> : s, t ∈ R}.

Given s ∈ {2, 1, 0.5} and t ∈ {3, 2}, construct all possible
linear combinations of the form

    s<1, 2> + t<3, -1>,

and use the func vectors to graph all of the resulting combina-
tions.
(c) Based upon your graphs, describe the graph of the set of vectors
generated by <1, 2> and <3, -1>.
9. Let v = <1, 2, 3> and w = <2, 4, 6> be two vectors in R^3.
(a) Solve the vector equation

    a<1, 2, 3> + b<2, 4, 6> = <0, 0, 0>

for a and b. How many solutions does this equation have: one?
none? infinitely many?
(b) As discussed in the last section, the set of vectors generated by
these two vectors is given by

    {s<1, 2, 3> + t<2, 4, 6> : s, t ∈ R}.

Given s ∈ {2, 1, 0.5} and t ∈ {3, 2}, construct all possible
linear combinations of the form

    s<1, 2, 3> + t<2, 4, 6>,

and then graph each resulting combination by hand.
(c) Based upon your graphs, describe the graph of the set of vectors
generated by <1, 2, 3> and <2, 4, 6>.
10. Let v = <1, 2, 1> and w = <2, 1, 3> be two vectors in R^3.
(a) Solve the vector equation

    a<1, 2, 1> + b<2, 1, 3> = <0, 0, 0>

for a and b. How many solutions does this equation have: one?
none? infinitely many?
(b) As discussed in the last section, the set of vectors generated by
these two vectors is given by

    {s<1, 2, 1> + t<2, 1, 3> : s, t ∈ R}.

Given s ∈ {2, 1, 0.5} and t ∈ {3, 2}, construct all possible
linear combinations of the form

    s<1, 2, 1> + t<2, 1, 3>,

and then graph each resulting combination by hand.
(c) Based upon your graphs, describe the graph of the set of vectors
generated by <1, 2, 1> and <2, 1, 3>.
11. In the vector space P_n(K) of all polynomials of degree less than or equal
to n with coefficients in the field K, consider the set of polynomials
{1, x, x^2, ..., x^p} where p ≤ n. Is this set linearly independent?
12. Consider the differential equation

    f'' + f = 0,

where the unknown function f is in C^∞(R).
In the previous section, you determined for which values of a, b in R
the functions given by a sin(x), b cos(x) and their linear combinations
are solutions to the differential equation. You should have decided that
all functions given by an expression of the form

    a sin(x) + b cos(x)

are solutions.
From among these solutions, pick out several linearly independent sets.
What is the largest number of functions that you can have in a linearly
independent set?
Discussion
Definition of Linearly Independent and Linearly Dependent Sets
In Activity 1, you formed all possible linear combinations of the sets of vectors
SETV1, SETV2, SETV3, SETV4, and SETV5 in (Z_2)^4. You then applied the func LC to
determine which linear combinations yielded the zero vector. Which of these
sets have the property that there are no linear combinations that give the
zero vector? Exactly one such linear combination? More than one?
A set of vectors in which only one linear combination yields the zero vector
is particularly important and deserves a name: linearly independent set. Any
other set of vectors is called a linearly dependent set.
Here is a precise definition.
Definition 4.2.1. Let V be a vector space over K. A set of vectors

    SV = {v_1, v_2, v_3, ..., v_m}

is linearly independent if there exists one and only one sequence of scalars,
namely

    SK = [0, 0, 0, ..., 0],

whose linear combination yields the zero vector; that is,

    0v_1 + 0v_2 + 0v_3 + ... + 0v_m

is the only linear combination of SV that yields the zero vector.
In the exercises, you will be asked to formulate this definition for linearly
dependent sets.
Given any set of vectors

    SV = {v_1, v_2, v_3, ..., v_m},

the linear combination

    0v_1 + 0v_2 + 0v_3 + ... + 0v_m

yields the zero vector whether the set is linearly independent or dependent.
The difference between independence and dependence lies in whether there
exist linear combinations with nonzero scalars that produce the zero vector.
In the case of linearly dependent sets, this is precisely the case. In the case
of linearly independent sets, the situation is the opposite: the only linear
combination that yields the zero vector is the one in which all of the scalars
are simultaneously zero.
For example, if, for each set from Activity 1, we form an arbitrary linear
combination and set it equal to the zero vector,

    a<1, 1, 0, 1> + b<1, 0, 1, 1> + c<1, 1, 1, 0> = <0, 0, 0, 0>   (SETV1)
    a<1, 1, 1, 1> + b<0, 0, 1, 1> + c<1, 1, 0, 0> = <0, 0, 0, 0>   (SETV2)
    a<1, 1, 0, 1> + b<1, 0, 1, 1> + c<0, 0, 1, 1> = <0, 0, 0, 0>   (SETV3)
    a<0, 0, 0, 0> + b<1, 1, 0, 0> + c<0, 0, 1, 1> = <0, 0, 0, 0>   (SETV4)
    a<1, 0, 1, 0> + b<0, 1, 0, 1> + c<1, 1, 1, 1> = <0, 0, 0, 0>   (SETV5)

and then solve each equation for a, b, and c, we would find that the all-zero
scalars

    a = 0, b = 0, c = 0

satisfy all five equations. For the sets SETV1 and SETV3, however, this is
the one and only combination that produces the zero vector. This is not the
case for the vector sets SETV2, SETV4, and SETV5. Each of these sets
has at least one other linear combination that yields the zero vector. Are
these results consistent with what you should have found when you applied
the funcs LI and LD to the sets SETV1, SETV2, SETV3, SETV4, and
SETV5? Setting linear combinations equal to the zero vector, as we did
above, allows us to rewrite the definition of linear independence in terms of
an equation.
Definition 4.2.2. A set of vectors

    SV = {v_1, v_2, v_3, ..., v_m}

in a vector space is linearly independent if and only if there exists a unique
solution to the vector equation

    a_1 v_1 + a_2 v_2 + a_3 v_3 + ... + a_m v_m = 0,

namely,

    a_1 = a_2 = a_3 = ... = a_m = 0.
In the exercises, you will be asked to formulate a similar definition for
linearly dependent, that is, not linearly independent, sets.
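As a brute-force illustration of Definition 4.2.2, the following stand-alone ISETL code (written for this discussion; it does not call the course funcs LC or LI, and it stores SETV1 as a tuple of tuples so that its vectors can be indexed) lists every scalar triple over Z_2 whose combination with SETV1 gives the zero vector:

SETV1 := [[1, 1, 0, 1], [1, 0, 1, 1], [1, 1, 1, 0]];
zero  := [0, 0, 0, 0];
comb  := func(a, b, c);
           return [ (a*SETV1(1)(i) + b*SETV1(2)(i) + c*SETV1(3)(i)) mod 2
                    : i in [1..4] ];
         end;
{ [a, b, c] : a in {0, 1}, b in {0, 1}, c in {0, 1} | comb(a, b, c) = zero };
$ yields only [0, 0, 0], so SETV1 is linearly independent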
Although we have defined linear independence, we have not yet identified
any characteristics that distinguish independence from dependence. One im-
portant difference exists in the relationship between the vectors within the
linearly independent or dependent sets. In Activity 4, you took each set
SETV1, SETV2, SETV3, SETV4, and SETV5 and determined whether
each vector in the set could be written as a linear combination of the remain-
ing vectors. What did you find in Activity 4? How do these results compare
with the linear dependence or independence of these sets? There is, in fact,
a general relationship which we establish in the next theorem.
Theorem 4.2.1. Let V be a vector space over the field K. A set of vectors

    SV = {v_1, v_2, v_3, ..., v_q}

is linearly dependent if and only if at least one of the vectors in the set can
be written as a linear combination of the remaining vectors.
Proof. (⇒): We will assume that SV is linearly dependent, and we will
prove that at least one of the vectors in the set can be written as a combi-
nation of the others. By definition, the dependence of SV implies that there
exists a set of scalars, say c_1, c_2, c_3, ..., c_q, where

    c_1 ≠ 0 or c_2 ≠ 0 or c_3 ≠ 0 ... or ... c_q ≠ 0

and

    c_1 v_1 + c_2 v_2 + c_3 v_3 + ... + c_q v_q = 0.

For the purpose of this argument, it does not matter which specific scalar is
assumed to be nonzero. So, let's assume that c_1 ≠ 0. In this case, we can
divide by c_1, from which we clearly see that v_1 can be expressed as a linear
combination of the remaining vectors.

    c_1 v_1 + c_2 v_2 + c_3 v_3 + ... + c_q v_q = 0
    c_2 v_2 + c_3 v_3 + ... + c_q v_q = -c_1 v_1
    -(c_2/c_1) v_2 - (c_3/c_1) v_3 - ... - (c_q/c_1) v_q = v_1

(⇐): We will show that if at least one vector in SV can be written as a
linear combination of the others, then SV must be linearly dependent. For
the purpose of this argument, it does not matter which vector can be written
as a combination of the others; let's assume that v_1 is such a vector. Then
there exists a set of scalars c_2, c_3, ..., c_q such that

    v_1 = c_2 v_2 + c_3 v_3 + ... + c_q v_q.

If we rewrite this equation, we see that

    (-1)v_1 + c_2 v_2 + c_3 v_3 + ... + c_q v_q = 0.

Since there is a set {-1, c_2, c_3, ..., c_q} of scalars, not all zero, that when
combined with SV yields the zero vector, it follows, according to the defini-
tion of linear dependence, that SV is a linearly dependent set.
Another characteristic that differentiates linearly independent and depen-
dent sets involves the relationship between the set and the vectors which the
set generates. In Activity 1, you constructed all possible linear combinations
of each set of vectors. Did you find that for some of these five sets there
was more than one linear combination giving the same answer and for others
there was never more than one? That is, in some cases, the representation of
any vector as a linear combination of vectors in the set is unique; in others,
it is not. How did this compare with the linear independence or dependence
of the set? Again, there is a general relationship, which we establish in the
following theorem.
Theorem 4.2.2. Let V be a vector space over a field K. A set of q vectors
in V,

    SV = {v_1, v_2, v_3, ..., v_q},

is linearly independent if and only if each vector v contained in the span of
SV can be written as a linear combination of SV in one and only one way.
Proof. (⇒): Assume that SV is a linearly independent set, and let v be an
element of the set of vectors generated by SV. By definition, we can write
v as a linear combination of SV. Let's suppose that this can be done in two
different ways; that is, suppose there are two different linear combinations of
SV that represent the vector v. Using the assumption that SV is linearly
independent, we will prove that each pair of corresponding scalars from these
two linear combinations consists of two equal scalars. This will prove that
v can be expressed as a linear combination of the vectors of SV in one and
only one way.
We set the two linear combinations of v equal to one another. We then
simplify the equation: we get both linear combinations on one side of the
equation; we group like vectors from SV; and then we apply the distributive
property for scalar multiples to each vector in SV. In its final form, the
equation consists of having the zero vector on one side and a linear combi-
nation of SV on the opposite side, where each vector in SV is multiplied by
the difference between the corresponding scalars from the two different linear
combinations of v. Since we are assuming that SV is linearly independent,
each scalar difference is zero. From this, we see that each pair of correspond-
ing scalars consists of two equal scalars. As a result, the two different linear
combinations of SV must have the same coefficients: v can be written as a
linear combination of SV in one and only one way.
(⇐): To prove the converse, we assume that any arbitrary vector v in
the set of vectors generated by SV can be written as a linear combination
of SV in one and only one way. We use this to prove that SV is a linearly
independent set.
The set of vectors generated by SV consists of all possible linear combi-
nations of SV. One such combination consists of the expression where each
scalar is zero. As a result, the zero vector must be an element of the set of
vectors generated by SV. However, because of our assumption, this is the
only linear combination of SV that yields the zero vector. According to the
definition of linear independence, SV must be a linearly independent set.
In addition to providing insight into the dierences between independent
and dependent sets, either Theorem 4.2.1 or Theorem 4.2.2 could have been
used as the definition of linear independence and dependence, with Defini-
tions 4.2.1 and 4.2.2 proven as theorems. In other words, the two definitions
and the two theorems are equivalent formulations of the same concept. In
the exercises, you will be asked to use Theorem 4.2.1 to prove Definition 4.2.1
as a theorem. Your proof, taken together with the proof given above, will
show that Definition 4.2.1 and Theorem 4.2.1 are logically equivalent; that
is, one statement can be substituted for the other as the definition of linear
independence or dependence. The reason for making this point is that each
statement focuses upon a different, yet equivalent aspect of independence or
dependence. Which definition we choose to employ depends upon the specific
circumstances of the problem being posed.
Geometric Interpretation/Generating Sets
Euclidean two-dimensional space R^2 contains three different types of objects:
points, which are of dimension zero, lines, which are of dimension one, and
the entire plane, which is of dimension two. In Activity 5, you discovered
that the zero vector in (Z_2)^4, considered as a single-vector set, is linearly
dependent. All linear combinations of the zero vector yield the zero vector.
This is also true in R^2: the zero vector is a linearly dependent set that
generates a zero-dimensional object.
On the other hand, in Activity 6, you discovered that the nonzero single-
vector set {<3, 2>} is independent and generates multiples of itself. When is
a set with a single vector linearly independent? In the previous section, we
found that the single-vector set {<3, 2>} in R^2 generated a set of vectors of
the form

    {t<3, 2> : t ∈ R},

where the graph is given by a line whose equation is y = (2/3)x. For single-
vector sets, notice that the set consisting of the zero vector, a dependent
set, generates a set of dimension zero, while the set consisting of a nonzero
vector, an independent set, generates a set of dimension one.
An analogous result holds for two-vector sets. In Activities 7 and 8, you
were asked to determine whether the given set of vectors is independent or
dependent and to graph several of the vectors contained in the sets they gen-
erate. Based upon your graphs, what would you say about the set of vectors
generated by the sets in the two activities? Is there a difference? How does
your response compare with the linear dependence or independence of the two
sets? Can you formulate a general statement about the relationship between
the dependence or independence of a set and the geometric representation of
the set it generates in the plane?
As one example of all this, we can prove that the set of vectors generated
in R^2 by the set

    {<1, 2>, <3, -1>},

that is,

    {s<1, 2> + t<3, -1> : s, t ∈ R},

is not a line through the origin, but is the entire plane. In order to prove
this, we must show that every element of R^2 can be expressed as a linear
combination of <1, 2> and <3, -1>. In particular, we must show that we can
always find scalars s and t such that the vector equation

    <a, b> = s<1, 2> + t<3, -1>

holds for any values of a and b. If we simplify the right hand side of this
equation, equate coordinates, and solve for s and t, we get

    s = (1/7)a + (3/7)b
    t = (2/7)a - (1/7)b.

Since any values of a and b can be substituted into the expressions for s
and t, the vector equation given above will have a solution for all possible
choices of a and b. Hence, every vector <a, b> in R^2 can be written as a linear
combination of <1, 2> and <3, -1>; the set {<1, 2>, <3, -1>} generates all
of R^2. In Activity 7, this is not the case. The vector equation

    <a, b> = s<2, 3> + t<4, 6>,

which is equivalent to the following system of equations

    2s + 4t = a
    3s + 6t = b,

yields no solution if, for example, a = 1 and b = 2.
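Spelling out the check that the formulas for s and t above really do solve the vector equation <a, b> = s<1, 2> + t<3, -1>, that is, the coordinate equations s + 3t = a and 2s - t = b:

\[
s + 3t = \tfrac{1}{7}(a + 3b) + \tfrac{3}{7}(2a - b) = a,
\qquad
2s - t = \tfrac{2}{7}(a + 3b) - \tfrac{1}{7}(2a - b) = b .
\]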
These examples suggest that linearly independent sets in R^2 span sets
whose dimension is equal to the number of vectors in the generating set,
whereas linearly dependent sets generate sets whose dimension is less than
the number of vectors in the generating set. This observation, as we will
see below, holds in R^3. We are using the term dimension here somewhat
informally and without explanation. In Section 4.4 we will discuss dimension
more thoroughly.
Like Euclidean two-dimensional space, Euclidean three-dimensional space
$R^3$ contains points and lines. The difference is that $R^3$ contains more than
one plane, and the entire space itself represents a three-dimensional object.
Similar to the $R^2$ case and what you found in Activity 4, any single-vector
set consisting of the zero vector is a linearly dependent set that generates
only the zero vector, and a single-vector set consisting of a non-zero vector is
a linearly independent set that generates scalar multiples of the generating
vector. The graph of the set of vectors generated by a nonzero, single-vector
set is a line in $R^3$. Can you show this for the set $\{\langle 2, 1, 4 \rangle\}$ in $R^3$?
[Figure 4.4: Generating a line]
In Activities 9 and 10, you studied the geometric representations of the
sets of vectors generated by two-vector sets. In Activity 9, you graphed
different linear combinations of the dependent set $\{\langle 1, 2, 3 \rangle, \langle 2, 4, 6 \rangle\}$. The six
combinations you graphed lie on a line passing through the origin that is
oriented in the direction of the vector $\langle 1, 2, 3 \rangle$. This is true for any linear
combination of this set, a result consistent with what we found in the case of
$R^2$: namely, a linearly dependent set generates a geometric object of dimension
smaller than the number of elements in the set. The set in Activity 10,
$\{\langle 1, 2, 1 \rangle, \langle 2, 1, 3 \rangle\}$, is linearly independent.
The six linear combinations you were asked to graph lie in the plane generated
by $\langle 1, 2, 1 \rangle$ and $\langle 2, 1, 3 \rangle$, a result consistent with what we discovered
in the last section and what was discussed earlier in this section. How do
we show that this sequence generates a plane? What is the equation of the
plane it generates? How do we know that the set in Activity 9 does not
generate a plane? How does one go about finding the equation of the line it
does generate?
[Figure 4.5: Generating a plane]
What about three-vector sets in $R^3$? If we form a set of three vectors in
$R^3$, what is the graph of the set it generates? Similar to what we have seen
for one-vector and two-vector sets, a linearly dependent set of three vectors
will generate a set whose graph is of dimension zero, one, or two, while a
linearly independent set of three vectors will generate the entire space $R^3$.
Although we will not consider a three-vector set here, the prior examples
illustrate an important point: a linearly dependent set of $q$ vectors generates
a set whose graph is of dimension less than $q$, and a linearly independent set
of $q$ vectors generates a set whose graph is of dimension $q$. We will revisit
this issue in Section 4.4 when we discuss the notion of dimension in a more
general context.
Non-Tuple Vector Spaces
In reference to Activity 11, what does the linear dependence or independence
of the set of polynomial functions $\{x \mapsto 1,\ x \mapsto x,\ x \mapsto x^2,\ \ldots,\ x \mapsto x^p\}$, $p \le n$,
in $PF_n(K)$ have to do with the fact that a polynomial of degree $p$ has at most $p$ zeros?
Turning to Activity 12, is there any connection between the degree of
the differential equation and the maximum number of linearly independent
solutions you could find?
Exercises
1. Restate Definition 4.2.1 for linearly dependent sets.
2. Restate Definition 4.2.2 for linearly dependent sets.
3. Determine whether the following sets of vectors are independent or dependent.
(a) $\{\langle 1, 1 \rangle, \langle 3, 2 \rangle, \langle 2, 1 \rangle\}$ in $R^2$.
(b) $\{\langle 1, 1, 1 \rangle, \langle 4, 3, 2 \rangle, \langle 1, 2, 0 \rangle, \langle 0, 1, 2 \rangle\}$ in $R^3$.
(c) $\{\langle 2, 3, 1, 2 \rangle, \langle 1, 3, 2, 1 \rangle, \langle 4, 2, 1, 3 \rangle, \langle 0, 3, 2, 1 \rangle, \langle 2, 0, 3, 2 \rangle\}$ in $R^4$.
4. Let $SV = \{\langle 2, 3, 1 \rangle, \langle 1, 2, 2 \rangle, \langle 5, 4, 4 \rangle\}$. Check to see whether each
vector in the set can be written as a linear combination of the remaining
two vectors. Using only the information you get from checking these
combinations, determine whether this set is linearly independent or
dependent. Explain how you can make such a determination without
invoking the Definitions 4.2.1 or 4.2.2 directly.
5. Let $SV = \{\langle 2, 3, 1 \rangle, \langle 1, 2, 2 \rangle, \langle 4, 3, 5 \rangle\}$. Check to see whether
each vector in the set can be written as a linear combination of the remaining
two vectors. Using only the information you get from checking
these combinations, determine whether this set is linearly independent
or dependent. Explain how you can make such a determination without
invoking the Definitions 4.2.1 or 4.2.2 directly.
6. In this exercise, you are asked to develop the general form of a particular
type of linearly independent set. In parts (a)-(c), show that each
set of vectors is linearly independent. In parts (d) and (e), use this
information to generalize the pattern represented in parts (a)-(c).
(a) $\{\langle 2, 0 \rangle, \langle 0, 1 \rangle\}$ in $R^2$.
(b) $\{\langle 3, 2, 0 \rangle, \langle 1, 0, 2 \rangle, \langle 0, 1, 1 \rangle\}$ in $R^3$.
(c) $\{\langle 1, 1, 2, 0 \rangle, \langle 2, 1, 0, 2 \rangle, \langle 3, 0, 2, 1 \rangle, \langle 0, 1, 2, 1 \rangle\}$ in $R^4$.
(d) Based upon the pattern given in (a)-(c), where we will assume
that the components are always non-negative, construct a linearly
independent set of vectors in $R^5$. Show that the set you have
constructed is indeed independent.
(e) Describe a process for constructing a linearly independent set of
$n$ vectors in $R^n$ using the approach outlined above.
7. In this exercise, you are asked to develop the general form of a particular
type of linearly independent set. In parts (a)-(c), show that each
set of vectors is linearly independent. In parts (d) and (e), use this
information to generalize the pattern represented in parts (a)-(c).
(a) $\{\langle 1, 0 \rangle, \langle 0, 1 \rangle\}$ in $R^2$.
(b) $\{\langle 1, 0, 0 \rangle, \langle 0, 1, 0 \rangle, \langle 0, 0, 1 \rangle\}$ in $R^3$.
(c) $\{\langle 1, 0, 0, 0 \rangle, \langle 0, 1, 0, 0 \rangle, \langle 0, 0, 1, 0 \rangle, \langle 0, 0, 0, 1 \rangle\}$ in $R^4$.
(d) Based upon the pattern given in (a)-(c), construct a linearly independent
set of vectors in $R^5$. Show that the set you have constructed
is indeed independent.
(e) Describe a process for constructing a linearly independent set of
$n$ vectors in $R^n$ using the approach outlined above.
8. Let $SV1 = \{\langle 2, 1, 2 \rangle, \langle 3, 1, 4 \rangle\}$ and $SV2 = \{\langle 1, 1, 0 \rangle, \langle 0, 1, 0 \rangle\}$ be two
sets of vectors in $R^3$. Show that $SV1$ and $SV2$ are linearly independent
sets. Describe the spans of these two sets. Do they generate the same
sets of vectors? Draw a picture of $R^3$ depicting these sets and their
spans.
9. Let $SV1 = \{\langle 2, 1, 2 \rangle, \langle 3, 1, 4 \rangle\}$ and $SV2 = \{\langle 2, 3, 1 \rangle, \langle 1, 0, 2 \rangle\}$ be
two sets of vectors in $R^3$. Show that $SV1$ and $SV2$ are linearly independent
sets. Describe the spans of these two sets. Do they generate
the same sets of vectors? Draw a picture of $R^3$ depicting these sets and
their spans.
10. Restate Theorem 4.2.1 for linearly independent sets.
11. Restate Theorem 4.2.1 in terms of linear independence. Use this statement
to prove Definition 4.2.1 as a theorem.
12. Restate Theorem 4.2.2 for linearly dependent sets.
13. The proof of Theorem 4.2.2, although complete, is written in a conversational
style without any calculations. An alternative would be to
write out all the steps in expressions and equations, as was done in the
proof of Theorem 4.2.1. Rewrite the proof of Theorem 4.2.2 in this
more computational style.
14. Let $\{v_1, v_2, v_3\}$ be a linearly dependent set. Let $c$ be a non-zero scalar.
Show that the following sets are also linearly dependent.
(a) $\{v_1, v_1 + v_2, v_3\}$
(b) $\{v_1, cv_2, v_3\}$
15. Let $\{v_1, v_2\}$ be a linearly independent set. If $v_3$ cannot be written as
a linear combination of $v_1$ and $v_2$, that is,
\[ v_3 \ne a v_1 + b v_2 \]
for any pair of scalars $a$ and $b$, then show that $\{v_1, v_2, v_3\}$ is a linearly
independent set.
16. Prove or provide a counterexample. If three nonzero vectors $\{u, v, w\}$
are linearly dependent, it must be the case that $u$ is a linear combination
of $v$ and $w$.
17. Let $\{\langle 2, 1, 1 \rangle, \langle 4, 2, 2 \rangle\}$ be a two-vector set in $R^3$. Determine
whether this set is linearly dependent or linearly independent. Describe
the span of this set as a set of points in $R^3$. Are your results, in
terms of the issue of dimension, consistent with what was discussed in
the text? Explain your answer.
18. Let $\{\langle 3, 4, 1 \rangle, \langle 2, 5, 3 \rangle\}$ be a two-vector set in $R^3$. Decide whether
this set is linearly dependent or linearly independent. Describe the
span of this set as a set of points in $R^3$. Are your results, in terms of
the issue of dimension, consistent with what was discussed in the text?
Explain your answer.
19. Let $\{\langle 4, 1 \rangle, \langle 2, 5 \rangle\}$ be a two-vector set in $R^2$. Determine whether this
set generates the entire plane $R^2$.
20. Let $\{\langle 1, 4, 3 \rangle, \langle 3, 5, 2 \rangle, \langle 1, 1, 3 \rangle\}$ be a three-vector set in $R^3$. Determine
whether this set generates the entire space $R^3$ using an approach
similar to that given for generating sets of $R^2$. Is this set linearly independent
or linearly dependent? Is your result consistent with the
discussion given in the text? Explain.
21. Let $\{\langle 1, 4, 2 \rangle, \langle 1, 3, 1 \rangle, \langle 2, 1, 3 \rangle\}$ be a set of three vectors in $R^3$. Determine
whether the first vector can be written as a linear combination
of the remaining vectors. Repeat this process for the second and third
vectors. Without making any further calculations, is this set linearly
independent or linearly dependent? What can you say about the dimension
of the graph of the set of vectors generated by this set? Carefully
explain your answer.
22. Prove that any set of monomial functions in $PF_n(R)$ is linearly independent.
23. Which sets of monomial functions in $PF_n(R)$ span all of $PF_n(R)$?
24. Discuss the results of Exercises 22 and 23 if $R$ is replaced by $Z_3$.
4.3 Generating Sets and Linear Independence
Activities
1. Let
\[ SETV1 = \{\langle 2, 1, 3, 2 \rangle, \langle 1, 1, 3, 1 \rangle, \langle 3, 2, 2, 3 \rangle\} \]
\[ SETV2 = \{\langle 3, 2, 1, 3 \rangle, \langle 4, 3, 0, 4 \rangle, \langle 2, 2, 1, 2 \rangle\} \]
be two sets of vectors on $(Z_5)^4$.
(a) Use the func LI that you wrote in Section 4.2, Activity 2 to verify
that both sets are linearly independent.
(b) Apply the func All_LC from Section 4.1, Activity 6 to find the
set of vectors generated by each set. What do you observe? Are
the two spans equal?
(c) Apply the modified version of the func LU you constructed in Section
4.1, Exercise 8 to determine whether each vector in $SETV2$
can be written as a linear combination of the vectors in $SETV1$.
What do you observe?
2. Let
\[ SETV1 = \{\langle 2, 1, 3, 2 \rangle, \langle 1, 1, 3, 1 \rangle, \langle 3, 2, 2, 3 \rangle\} \]
\[ SETV3 = \{\langle 1, 2, 0, 2 \rangle, \langle 3, 1, 1, 2 \rangle, \langle 0, 3, 0, 0 \rangle\} \]
be two sets of vectors on $(Z_5)^4$.
(a) Use the func LI to verify that both sets are linearly independent.
(b) Apply the func All_LC to find the set of vectors generated by each
set. What do you observe? Are the two sets equal?
(c) Apply the modified version of the func LU you constructed in
Section 4.1, Exercise 8 of the section on linear combinations to
determine whether each vector in $SETV3$ can be written as a linear
combination of the vectors in $SETV1$. What do you observe?
(d) The results from this and the prior exercise illustrate a general
principle. Formulate a conjecture based upon your findings.
3. Let
\[ SETV4 = \{\langle 1, 2, 3, 4 \rangle, \langle 3, 3, 3, 2 \rangle, \langle 2, 1, 0, 3 \rangle, \langle 3, 2, 1, 2 \rangle\} \]
be a set of vectors in $(Z_5)^4$. Perform the following tasks with respect
to this set of vectors.
(a) Apply the func All_LC to find the span of this set.
(b) Apply the func LI to determine whether the set $SETV4$ is independent
or dependent.
(c) Apply the modified version of the func LU to determine whether
any vector in the set can be written as a linear combination of the
remaining vectors.
(d) If the answer to (c) is yes, remove one such vector, and denote
the remaining sequence as $SETV5$. Repeat steps (a), (b), and
(c) with $SETV5$. Is the span of $SETV5$ the same as the span
of $SETV4$? Is $SETV5$ linearly independent? If the answer to
part (c) is yes, repeat the process: remove one of the vectors that
can be written as a linear combination of the remaining vectors,
and denote the new sequence as $SETV6$. Repeat (a), (b), and (c)
with $SETV6$ and beyond that, if necessary, until you arrive at an
answer of no in part (c).
(e) When you get a no answer in part (d), what is your answer to
part (c)? Is there a relationship between whether a set is independent
and whether a vector in that set can be written as a linear
combination of the others? Does the final set you get generate
the same set of vectors as the original set $SETV4$? Explain your
answer.
4. Write a func LIGS that assumes that name_vector_space has been executed;
that accepts one input $SETV$, where $SETV$ is a set of vectors;
and that returns a linearly independent set constructed by employing
the following process: the func takes one of the vectors in $SETV$, tests
whether it is a combination of the others, removes the vector from the
set if it is, leaves the vector in the set if it is not, and successively
repeats this process until each vector in the set has been checked. Test
the func LIGS on the set $SETV4$ that you worked with in Activity 3.
Does LIGS return the same set you got after having completed parts
(a)-(d) of Activity 3?
5. Let $SETV = \{\langle 1, 1, 0 \rangle, \langle 0, 1, 1 \rangle, \langle 1, 0, 0 \rangle\}$ be a set of vectors in $(Z_2)^3$.
Verify that $SETV$ is linearly independent. Show that $SETV$ generates
the entire set of vectors $(Z_2)^3$. Select a vector $v$ different from $\langle 1, 1, 0 \rangle$,
$\langle 0, 1, 1 \rangle$, and $\langle 1, 0, 0 \rangle$, and form the new set
\[ \{\langle 1, 1, 0 \rangle, \langle 0, 1, 1 \rangle, \langle 1, 0, 0 \rangle, v\}. \]
Test whether the resulting set is independent. What do you observe?
Repeat this for every possible choice of $v$ that is not equal to $\langle 1, 1, 0 \rangle$,
$\langle 0, 1, 1 \rangle$, or $\langle 1, 0, 0 \rangle$. What do you observe?
6. Let
\[ SETV = \{\langle 2, 3, 4, 1, 1 \rangle, \langle 1, 1, 3, 1, 2 \rangle, \langle 3, 4, 2, 2, 3 \rangle, \langle 4, 0, 0, 3, 0 \rangle, \langle 1, 4, 2, 3, 3 \rangle\} \]
be a set of vectors in $(Z_5)^5$.
In parts (b)-(d) below, you can use the predefined ISETL func npow
to construct the set of subsets with a given cardinality.
(a) Use the func LI to show that $SETV$ is linearly dependent.
(b) Construct all subsets of $SETV$ that consist of two vectors. Apply
LI to each set. What do you observe?
(c) Construct all subsets of $SETV$ that consist of three vectors. Apply
LI to each set. What do you observe?
(d) Construct all subsets of $SETV$ that consist of four vectors. Apply
LI to each set. What do you observe?
7. Let $V = (Z_5)^3$. Apply the func LI to each of the sets in parts (a)-(d).
Discuss your findings; in particular, compare your result for part (d)
with what you get for parts (a)-(c).
(a) $\{\langle 4, 4, 2 \rangle, \langle 2, 1, 3 \rangle\}$
(b) $\{\langle 4, 4, 2 \rangle, \langle 3, 1, 3 \rangle\}$
(c) $\{\langle 2, 1, 3 \rangle, \langle 3, 1, 3 \rangle\}$
(d) $\{\langle 4, 4, 2 \rangle, \langle 2, 1, 3 \rangle, \langle 3, 1, 3 \rangle\}$
8. Use the func LI to determine whether each of the following sets is
independent or dependent.
(a) $\{\langle 2, 1, 0 \rangle, \langle 1, 1, 1 \rangle\}$; $\{\langle 2, 1, 0 \rangle, \langle 1, 1, 1 \rangle, \langle 0, 0, 0 \rangle\}$ in $(Z_3)^3$
(b) $\{\langle 2, 3, 1, 4 \rangle, \langle 3, 3, 2, 1 \rangle, \langle 1, 2, 1, 4 \rangle\}$;
$\{\langle 2, 3, 1, 4 \rangle, \langle 3, 3, 2, 1 \rangle, \langle 1, 2, 1, 4 \rangle, \langle 0, 0, 0, 0 \rangle\}$ in $(Z_5)^4$
What do you think is the point of this activity? Explain.
Discussion
In Section 4.1, generating sets were defined and in Section 4.2, the concepts
of linear independence and linear dependence were defined. In this
section, we will study the relationship between generating sets and the sets
of vectors they generate; discuss how to construct a linearly independent
set when given any set of vectors; present various special forms of linearly
independent sets of vectors; and prove important properties of linearly independent
and dependent sets.
Generating Sets and Their Spans
In Activity 1, you were given two sets of vectors in $(Z_5)^4$. You were asked to
find and compare the sets of vectors generated by the two and to determine
the relationship, in terms of linear combinations, between the generating sets.
In Activity 2, one of the sets was changed and you did the same thing with
the resulting pair of sets.
What were the various phenomena that you observed? See how long a list
of observations you can make. For each observation formulate a statement
that says that what you observed is true in general. Is your theorem
correct? Try to supply a proof or a counterexample as appropriate.
One of your observations might have been an important relationship between
the sets generated by two sets of generators and the possibility of
writing every generator in one set as a linear combination of the generators
in the other set. This relationship is formalized and proved in the next
theorem.
Theorem 4.3.1. Two sets of vectors
\[ SV1 = \{v_1, v_2, \ldots, v_q\} \qquad SV2 = \{w_1, w_2, \ldots, w_q\} \]
in a vector space $V$ generate the same set of vectors in $V$ if and only if each
vector in $SV2$ can be written as a linear combination of the vectors in $SV1$,
and vice-versa.
Proof. ($\Rightarrow$:) Let us assume that $SV1$ and $SV2$ generate the same set of
vectors. We will then prove that each vector in $SV2$ can be written as a
linear combination of the vectors in $SV1$. If $u$ is an element of the span of
$SV2$, then $u$ can be written as a linear combination of the vectors $w_1$, $w_2$,
\ldots, and $w_q$. Now, each vector $w_i$, $i = 1, 2, \ldots, q$, in $SV2$ is an element of
the span of $SV2$, and since $SV1$ and $SV2$ are assumed to generate the same
set of vectors, it follows that each $w_i$, $i = 1, 2, \ldots, q$, is in the span of $SV1$,
which means that each vector $w_i$ for $i = 1, 2, \ldots, q$ can be written as a
linear combination of the elements of $SV1$.
In a similar manner, we can show that each vector in $SV1$ can be written
as a linear combination of the vectors in $SV2$.
($\Leftarrow$:) We will assume that each vector in $SV2$ can be written as a combination
of the vectors in $SV1$, and vice-versa. We will then prove that $SV1$ and
$SV2$ generate the same set of vectors. Let $u$ be a vector in the set generated
by $SV2$. Then, there exist scalars $a_1, a_2, \ldots, a_q$ such that
\[ u = a_1 w_1 + a_2 w_2 + \cdots + a_q w_q. \]
Since each vector in $SV2$ can be written as a combination of $SV1$, there exist
sequences of scalars
\[ [b_{11}, b_{12}, \ldots, b_{1q}], \quad [b_{21}, b_{22}, \ldots, b_{2q}], \quad \ldots, \quad [b_{q1}, b_{q2}, \ldots, b_{qq}] \]
such that
\begin{align*}
w_1 &= b_{11} v_1 + b_{12} v_2 + \cdots + b_{1q} v_q \\
w_2 &= b_{21} v_1 + b_{22} v_2 + \cdots + b_{2q} v_q \\
&\ \ \vdots \\
w_q &= b_{q1} v_1 + b_{q2} v_2 + \cdots + b_{qq} v_q.
\end{align*}
Substituting these expressions for the $w_i$ in the expression for $u$, we get
\begin{align*}
u &= a_1 w_1 + a_2 w_2 + \cdots + a_q w_q \\
  &= a_1 [b_{11} v_1 + b_{12} v_2 + \cdots + b_{1q} v_q] + a_2 [b_{21} v_1 + b_{22} v_2 + \cdots + b_{2q} v_q] + \cdots + a_q [b_{q1} v_1 + b_{q2} v_2 + \cdots + b_{qq} v_q] \\
  &= [a_1 b_{11} + a_2 b_{21} + \cdots + a_q b_{q1}] v_1 + [a_1 b_{12} + a_2 b_{22} + \cdots + a_q b_{q2}] v_2 + \cdots + [a_1 b_{1q} + a_2 b_{2q} + \cdots + a_q b_{qq}] v_q,
\end{align*}
a linear combination of $SV1$. Hence, $u$ is an element of the set of vectors
generated by $SV1$. Since $u$ was chosen arbitrarily, each vector in the set
generated by $SV2$ is also generated by $SV1$. In a similar fashion, we can
show that every vector contained in the set generated by $SV1$ is generated
by $SV2$. As a result, $SV1$ and $SV2$ generate the same set of vectors.
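If you would like to see the theorem in action before proving anything for yourself, a quick ISETL experiment along the following lines can help. This is only a sketch of one possible check: it assumes that name_vector_space has been run for $(Z_5)^4$, that vectors are represented as ISETL tuples, and that the func All_LC from Section 4.1 is available.

    SETV1 := {[2,1,3,2], [1,1,3,1], [3,2,2,3]};
    SETV2 := {[3,2,1,3], [4,3,0,4], [2,2,1,2]};
    $ is every generator in one set a combination of the generators in the other?
    (forall w in SETV2 | w in All_LC(SETV1)) and (forall v in SETV1 | v in All_LC(SETV2));
    $ compare with a direct test of whether the two spans are equal
    All_LC(SETV1) = All_LC(SETV2);

According to Theorem 4.3.1, the two expressions should always return the same truth value; compare this with what you observed in Activities 1 and 2.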
Constructing Linearly Independent Generating Sets
In Activity 3, you removed vectors until you ended up with a linearly independent
set. Every time you removed a vector, you applied All_LC to the
new set. What was the relationship between the span of the new set and
the span of its predecessor? In constructing a new generating set, were you
allowed to remove just any vector?
How do your responses to these questions relate to the following theorem?
In the last section, we proved that an important property of linearly
dependent sets, one not shared by independent sets, is that at least one of its
vectors can be written as a linear combination of the other vectors in the set.
Since the set of vectors generated by a set consists of linear combinations of
the generators, it seems reasonable that any generator that can be written
as a combination of the other generators is redundant, a fact consistent with
what you found in Activity 3 and one which we shall now prove in general.
Theorem 4.3.2. If a set of vectors
\[ SV = \{v_1, v_2, v_3, \ldots, v_q\} \]
in a vector space is linearly dependent, and if one of the vectors, say $v_1$, is
a linear combination of the vectors in $SV' = \{v_2, v_3, \ldots, v_q\}$, then $SV$ and
$SV'$ generate the same set of vectors.
Proof. We can simply apply Theorem 4.3.1 with $SV1 = SV$ and $SV2 = SV'$.
Think about this theorem in relation to Activity 4. You wrote a func LIGS
that removed vectors from a linearly dependent set until it either became a
linearly independent set or became empty. What does the theorem tell you
about the set of vectors generated by the resulting linearly independent set?
This theorem gives us a means by which we can construct a linearly
independent generating set whenever we are given a generating set: we simply
remove vectors which can be written as linear combinations of the others and
continue until this is no longer possible. A linearly independent generating set is
called a basis, and this will be the topic of the next section.
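To make the removal process concrete, here is one possible way the func LIGS of Activity 4 might be written. This is only a sketch, not the book's official solution: it assumes that name_vector_space has been run, that the func All_LC from Section 4.1 is available, and it tests redundancy with All_LC rather than with the modified func LU used in Activity 3; for simplicity it also ignores the corner case of a set containing only the zero vector. Your own version may well look different.

    LIGS := func(SETV);
      local keep, w;
      keep := SETV;
      for w in SETV do
        if w in All_LC(keep less w) then  $ w is a combination of the remaining vectors,
          keep := keep less w;            $ so it is redundant and can be dropped
        end;
      end;
      return keep;
    end;

By Theorem 4.3.2, each removal leaves the span unchanged, so a func of this kind returns a set that generates the same set of vectors as the original SETV.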
How about going the other way? What happens if we have a set $S$ of
vectors and add to it a vector which is already a linear combination of the
vectors in $S$? How does the set of vectors generated by the new set compare
with the set of vectors generated by $S$?
Properties of Linear Independence and Linear Dependence
In the last section, in addition to defining independence and dependence,
we stated and proved two equivalent conditions of linear independence: in
particular, a set of vectors is linearly independent if and only if one of the
following properties holds:
• no vector in the set can be written as a linear combination of the remaining vectors; and
• any vector generated by the set can be expressed as a linear combination of elements of the generating set in one and only one way.
In this subsection, we will prove two necessary conditions for independence.
In Activity 6, you constructed all proper subsets (sets with one or more of the
original elements missing) of two, three, and four vectors of the dependent
set
\[ SETV = \{\langle 2, 3, 4, 1, 1 \rangle, \langle 1, 1, 3, 1, 2 \rangle, \langle 3, 4, 2, 2, 3 \rangle, \langle 4, 0, 0, 3, 0 \rangle, \langle 1, 4, 2, 3, 3 \rangle\}. \]
You then applied the func LI to each subset to determine which subsets were
independent and which were dependent. In Activity 7 you looked at some
more examples. What did you find? What do these activities tell you about
the linear dependence or independence of a subset of a linearly dependent
set? Does Activity 8 suggest anything about that?
What about subsets of a linearly independent set? The following theorem
addresses that question.
Theorem 4.3.3. If $S$ is a linearly independent set in a vector space $V$, then
every proper subset of $S$ must also be linearly independent.
Proof. The proof is left to the exercise section.
Is there a similar result for linearly dependent sets? In particular, must
every subset of a dependent set be dependent? Is it possible for a linearly
dependent set to have both subsets which are dependent and subsets which
are independent?
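These questions are easy to explore in ISETL. The following one-liner is just a sketch of one way to do it: it assumes that name_vector_space has been run for $(Z_5)^5$, that SETV is the five-vector set from Activity 6, and that the func LI is available; it uses pow together with a cardinality test instead of the predefined func npow mentioned in the activity.

    $ the truth values of LI on all three-element subsets of SETV
    { LI(S) : S in pow(SETV) | #S = 3 };

If the resulting set contains both true and false, then SETV has three-element subsets of both kinds.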
How about going the other way? Suppose you took a set of vectors and
added some vectors. What could you say about the larger set if the smaller
set was linearly independent? Linearly dependent?
In each part of Activity 8, you were given two sets of vectors: one which
was an independent set, and the second which was the same set with the
zero vector added. In both cases, the set with the zero vector was dependent.
Before looking at the following theorem, think about what might be true in
general.
Theorem 4.3.4. If $SV$ is a linearly independent set of vectors in a vector
space $V$, then $SV$ cannot contain the zero vector.
Proof. The proof is left to the exercise section.
Here is a trick question. Suppose you began with a set that was linearly
dependent and began removing vectors. Can you be sure that you will
eventually obtain a linearly independent set? This is one of those statements
that is false but, really, it is true. What could that mean?
Non-Tuple Vector Spaces
In the vector space $PF_n(K)$, what would you say is the set generated by the
set of vectors $\{x \mapsto 1, x \mapsto x, \ldots, x \mapsto x^p\}$ where $p \le n$?
In the vector space $C^\infty(R)$, what relation does the subspace generated by
the functions sin, cos have to the set of solutions of the differential equation
\[ f'' + f = 0? \]
Exercises
1. Let $SV = \{(2, 1, 3, 1), (2, 1, 1, 3), (3, 4, 1, 1)\}$ be a linearly independent
set of vectors in $R^4$. Construct a second linearly independent
set that spans the same set as $SV$. Verify that the spans of these two
sets are the same.
2. Let $SV$ be the same set as that given in 1. Construct a second linearly
independent set that spans a different set of vectors from that spanned
by $SV$. Verify that the spans of these two sets are indeed different.
3. Let
\[ SV = \{\langle 1, 2, 3, 2 \rangle, \langle 2, 3, 3, 1 \rangle, \langle 1, 1, 6, 1 \rangle, \langle 4, 5, 15, 1 \rangle\} \]
be a set of vectors in $R^4$. Show that this set is linearly dependent.
Construct a linearly independent set by removing dependent vectors.
Show that the two sets generate the same set of vectors. What would
you say is the dimension of the set of vectors generated by this set?
4. Let
\[ SV = \{\langle 3, 4, 1 \rangle, \langle 1, 1, 2 \rangle, \langle 5, 2, 5 \rangle\} \]
be a set of vectors in $R^3$. Show that this set is linearly dependent.
Construct a linearly independent set by removing dependent vectors.
Show that the two sets generate the same set of vectors. What would
you say is the dimension of the set of vectors generated by this set?
Explain.
5. Let $V = (K)^n$ be the vector space of all $n$-tuples whose components
consist of elements of $K$. Prove that the set
\[ \{\langle 1, 1, 1, \ldots, 1 \rangle, \langle 1, 1, 1, \ldots, 1, 0 \rangle, \langle 1, 1, 1, \ldots, 0, 0 \rangle, \ldots, \langle 1, 1, 0, \ldots, 0 \rangle, \langle 1, 0, 0, \ldots, 0, 0 \rangle\} \]
is linearly independent.
6. Let $V = (K)^n$ be the vector space of all $n$-tuples whose components
consist of elements of $K$. Prove that the set
\[ \{\langle 1, 0, 0, \ldots, 0 \rangle, \langle 0, 1, 0, \ldots, 0 \rangle, \langle 0, 0, 1, \ldots, 0 \rangle, \ldots, \langle 0, 0, 0, \ldots, 0, 1 \rangle\} \]
is linearly independent.
7. Provide the proof of Theorem 4.3.3.
8. Provide the proof of Theorem 4.3.4.
9. Without performing any calculations, explain why each of the following
sets in $R^3$ is linearly dependent.
(a) $\{\langle 1, 2, 3 \rangle, \langle 0, 0, 0 \rangle, \langle 2, 1, 3 \rangle\}$
(b) $\{\langle 1, 1, 2 \rangle, \langle 3, 4, 1 \rangle, \langle 1, 1, 0 \rangle, \langle 2, 1, 3 \rangle\}$
(c) $\{\langle 3, 5, 1 \rangle, \langle 3, 1, 2 \rangle, v\}$, where $v = a \langle 3, 5, 1 \rangle + b \langle 3, 1, 2 \rangle$ for some
scalars $a$ and $b$
(d) $\{v_1, v_2, v\}$, where $\{v_1, v_2\}$ is a linearly dependent set, but there
exist no scalars $a$ and $b$ such that $v = a v_1 + b v_2$.
10. For each of the parts (a)-(c) below, find a linearly independent
set that generates the same set as the set given. Then, show that the
resulting linearly independent set generates all of $R^n$. Try to complete
each part without making specific computations; in short, use the theorems
proven and the concepts discussed in this and the prior section
to justify your claims.
(a) $\{(2, 1), (1, 3), (1, 1)\}$, $n = 2$
(b) $\{(1, 2, 3), (3, 1, 2), (1, 4, 3), (2, 4, 1)\}$, $n = 3$
(c) $\{(2, 0, 1, 0), (0, 1, 3, 1), (1, 2, 1, 1), (3, 2, 1, 4), (1, 1, 1, 2)\}$, $n = 4$
11. Construct a linearly dependent set in $R^3$ whose proper subsets are all
linearly independent.
12. Given two sets in $R^3$, say
\[ SV1 = \{v_1, v_2, v_3\} \qquad SV2 = \{w_1, w_2, w_3\}, \]
where $w_1$ and $w_2$ are both linear combinations of $SV1$ but $w_3$ is not,
determine whether $SV1$ and $SV2$ generate the same set of vectors.
Justify your answer. If necessary, construct two sets satisfying the
given conditions to illustrate your point.
13. Let $\{\langle 2, 1, 3, 2 \rangle, \langle 1, 1, 3, 4 \rangle\}$ be a set of vectors in $R^4$.
(a) This set is linearly independent. Why?
(b) Find two more vectors $v$ and $w$ so that the resulting expanded set
\[ \{\langle 2, 1, 3, 2 \rangle, \langle 1, 1, 3, 4 \rangle, v, w\} \]
is linearly independent.
(c) Once you have found two such vectors, show that the set you have
constructed is indeed independent.
(d) Based upon the discussion regarding dimension given in this and
at the end of the last section, make a determination as to whether
the set you have constructed generates all of $R^4$.
14. Let
\[ S = \{v_1, v_2, v_3, v_4\} \]
be a set in $V = (K)^3$.
(a) Is $S$ linearly independent or linearly dependent? Explain your
answer.
(b) If every subset of $S$ consisting of three vectors is linearly independent,
how could we construct a linearly independent set that
generates all of $V$?
(c) If every subset consisting of three vectors is linearly independent,
must it follow that every subset of two vectors must be linearly
independent? Justify your answer.
15. Suppose $w_1$ and $w_2$ are two vectors that are combinations of $v_1$ and
$v_2$ such that
\[ w_1 = a v_1 + b v_2, \qquad w_2 = c v_1 + d v_2, \qquad a, b, c, d \ne 0, \qquad \frac{a}{c} \ne \frac{b}{d}. \]
Is the set $\{w_1, w_2\}$ linearly independent or linearly dependent? Carefully
justify your answer. Do the two sets $\{v_1, v_2\}$ and $\{w_1, w_2\}$ generate
the same set? Explain.
16. Suppose you take a linearly dependent set and remove vectors one by
one. Can you always be sure that you eventually obtain an independent
set?
17. In the vector space $PF_n(K)$, describe the set generated by the vectors
$\{x \mapsto 1, x \mapsto x, \ldots, x \mapsto x^p\}$ where $p \le n$. What about the
set generated by $\{x \mapsto x, \ldots, x \mapsto x^p\}$, or the set generated by the
monomial functions with odd exponents?
18. In the vector space $PF_n(K)$, the set $\{x \mapsto 1, \ldots, x \mapsto x^{n-1}\}$ is
linearly independent. Based on this, what can you say about
the sets $\{x \mapsto 1, \ldots, x \mapsto x^p\}$ where $p < n$? What about the sets
$\{x \mapsto x, \ldots, x \mapsto x^p\}$, $p < n$? Or the set of all monomials with odd
exponents? Even?
19. In the vector space $C^\infty(R)$, what relation does the subspace generated
by the functions sin, cos have to the set of solutions of the differential
equation
\[ f'' + f = 0? \]
4.4 Bases and Dimension
Activities
1. In Section 4.1, Activity 2, you wrote the func LC that accepted a
sequence of scalars SK and a sequence of vectors SV and returned the
vector that was the linear combination of the vectors in SV using the
scalars in SK. There is an ISETL operation called % that can be applied
to .va to make forming linear combinations easy. Here is a version of
LC that uses this feature. It is assumed that name_vector_space has
been run.

LC := func(SK, SV);
  return %.va[SK(i) .sm SV(i) : i in [1..#SK]];
end;

(a) Describe in words how the operation %.va works.
(b) Let $V = (Z_5)^3$, run name_vector_space, define SK and SV, and
run the line

%.va[SK(i) .sm SV(i) : i in [1..#SK]];

on the following two examples.
i. $SK1 = [1, 4, 3]$, $SV1 = [\langle 1, 3, 1 \rangle, \langle 2, 1, 4 \rangle, \langle 4, 0, 2 \rangle]$
ii. $SK2 = [3, 0, 3]$, $SV2 = [\langle 3, 4, 4 \rangle, \langle 2, 1, 0 \rangle, \langle 3, 3, 3 \rangle]$
(c) Based on your experience with %, predict what ISETL will do with
each of the following three lines of code, run them, and explain
any relationship between the expressions.
i. LC(SK1,SV1); LC(SK2,SV2);
ii. LC(SK1,SV1).va LC(SK2,SV2);
iii. LC(SK1 + SK2, SV1 + SV2);
Note: The ISETL operation + here will concatenate the two tuples.
2. Let $SETV = \{\langle 1, 1, 0 \rangle, \langle 0, 1, 1 \rangle\}$ be a set of vectors in $(Z_2)^3$.
(a) Verify that $SETV$ is linearly independent.
(b) Show that $SETV$ does not generate the entire set of vectors $(Z_2)^3$.
(c) Select a vector $v$ different from $\langle 1, 1, 0 \rangle$ and $\langle 0, 1, 1 \rangle$ and form the
new set $\{\langle 1, 1, 0 \rangle, \langle 0, 1, 1 \rangle, v\}$. Test whether the resulting set is
independent. What do you observe?
(d) Repeat this for every possible choice of $v$ that is not equal to
$\langle 1, 1, 0 \rangle$ and $\langle 0, 1, 1 \rangle$. What do you observe? Explain.
3. For each vector space $V$ and set of vectors $SETV$ listed below, apply
the func LI from Section 4.2, Activity 2 to determine if the set is
linearly independent and the func All_LC from Section 4.1, Activity 6
to determine if the set of vectors generated by $SETV$ is equal to $V$.
(a) $V = (Z_2)^5$
\[ SETV1 = \{\langle 1, 1, 1, 1, 1 \rangle, \langle 0, 1, 1, 1, 1 \rangle, \langle 0, 0, 1, 1, 1 \rangle, \langle 0, 0, 0, 1, 1 \rangle, \langle 0, 0, 0, 0, 1 \rangle\} \]
\[ SETV2 = \{\langle 0, 1, 1, 1, 1 \rangle, \langle 1, 0, 1, 1, 1 \rangle, \langle 1, 1, 0, 1, 1 \rangle, \langle 1, 1, 1, 0, 1 \rangle\} \]
\[ SETV3 = \{\langle 1, 1, 1, 1, 1 \rangle, \langle 0, 0, 1, 1, 1 \rangle, \langle 0, 0, 0, 0, 1 \rangle, \langle 1, 1, 0, 0, 1 \rangle, \langle 1, 0, 0, 0, 1 \rangle\} \]
(b) $V = (Z_3)^4$
\[ SETV4 = \{\langle 1, 2, 2, 1 \rangle, \langle 0, 1, 2, 1 \rangle, \langle 0, 0, 2, 1 \rangle\} \]
\[ SETV5 = \{\langle 0, 1, 2, 1 \rangle, \langle 2, 0, 2, 2 \rangle, \langle 1, 2, 0, 1 \rangle, \langle 2, 2, 2, 0 \rangle\} \]
\[ SETV6 = \{\langle 1, 0, 1, 1 \rangle, \langle 0, 0, 1, 2 \rangle, \langle 0, 0, 0, 2 \rangle, \langle 1, 1, 2, 0 \rangle\} \]
\[ SETV7 = \{\langle 1, 1, 1, 1 \rangle, \langle 0, 0, 1, 2 \rangle, \langle 0, 0, 0, 2 \rangle, \langle 1, 1, 2, 0 \rangle, \langle 1, 0, 0, 0 \rangle\} \]
4. Write a func is_basis that assumes name_vector_space has been run;
that accepts a set of vectors and a vector space $V$; that identifies the
set as being linearly dependent or independent; and that determines
whether the span of the set is $V$. In your func, you may want to rule
out the empty set right away.
Test your func on all of the examples in Activity 3.
5. Let $P_n(K)$ be the vector space of all polynomials of degree less than or
equal to $n$ with coefficients in the field $K$. Set up appropriate representations
and apply is_basis to solve the following problems.
(a) Determine if the set of monomials $\{1, x, x^2, x^3, x^4\}$ is a basis for $P_4(Z_3)$.
(b) Find a basis for $P_4(Z_3)$ as different from the monomials as you can.
(c) Find an example of a linearly independent set and a set which
generates the entire vector space such that neither is a basis for $P_4(Z_3)$.
6. For the following system of linear equations with coefficients in
$Z_5$, use the func Three_eqn that you wrote for Activity 9 in Section
3.1 to find the solution set. Run name_vector_space on this subspace
of $(Z_5)^3$. Then, find a linearly independent set whose span is the solution
set. Apply the func is_basis to this linearly independent set.
\[ x_1 + x_2 + 4x_3 = 0 \]
\[ 2x_1 + 2x_2 + 3x_3 = 0 \]
\[ 3x_1 + 3x_2 + 2x_3 = 0. \]
7. If $V$ is a vector space, and $B$ is a set of vectors for which is_basis returns
true, then we know that each vector in $V$ can be written uniquely as a
linear combination of the vectors in $B$. This raises the problem, given
$B$ and a vector $v$, of finding the coefficients in that linear combination.
(a) Explain why the above statement about unique linear combinations is true.
(b) Pick a set from Activity 3, part (b) for which is_basis returns
true. Find the coefficients for the linear combination of the vectors
in this set that is equal to $\langle 0, 1, 2, 0 \rangle$.
8. Consider the following func, which assumes that name_vector_space
has been run and accepts a vector space $V$.

Make_Basis := func(V);
  local select;
  select := func(SETV, W);
    if is_basis(SETV, V) then
      return SETV;
    else
      SETV := SETV with arb(W);
      W := W - All_LC(SETV);
      return select(SETV, W);
    end;
  end;
  return select({}, V less ov);
end;

(a) Explain in words what Make_Basis does.
(b) Pick a vector space $V$ and apply Make_Basis three times to $V$.
How are your three results related to each other? Are they the
same? Do they have the same number of elements?
(c) Apply is_basis to each of the three sets of vectors returned by
the three applications of Make_Basis. Describe what happens.
Discussion
Summation Notation
A very important tool for writing expressions in linear algebra is summation
notation, that is, expressions such as
\[ \sum_{i=1}^{n} t_i v_i \]
where $t_1, t_2, \ldots, t_n$ is a sequence of scalars and $v_1, v_2, \ldots, v_n$ is a sequence of
vectors. If you wrote this summation expression out without the $\sum$ symbol,
what would you get?
The operator %.va in ISETL works exactly like the $\sum$ symbol in mathematics.
Following are some summation notations that mean the same thing
in mathematics as do the ISETL expressions in Activity 1. See if you can
match them, expression for expression. In particular, what has replaced SK1,
SK2, SV1, SV2?
\[ \sum_{i=1}^{n} t_i v_i \qquad \sum_{i=1}^{3} a_i u_i \qquad \sum_{i=1}^{3} b_i w_i \qquad \sum_{i=1}^{3} a_i u_i + \sum_{i=1}^{3} b_i w_i \qquad \sum_{i=1}^{6} c_i v_i \]
One thing you can do to figure out such expressions is to write them out
in full detail without any summation or % symbols. Thus we have
\[ \sum_{i=1}^{4} t_i v_i = t_1 v_1 + t_2 v_2 + t_3 v_3 + t_4 v_4 \]
and
\[ \sum_{i=1}^{n} t_i v_i = t_1 v_1 + t_2 v_2 + \cdots + t_n v_n. \]
You can factor a scalar out of an expression. Thus if all of the $t_i$ in
$\sum_{i=1}^{n} t_i v_i$ were equal to $t$, what would $\sum_{i=1}^{n} t v_i$ be equal to? If you are in
doubt, choose a value for $n$ and write everything out.
You can also add two summation expressions termwise, as in
\[ \sum_{i=1}^{n} (a_i + b_i) = \sum_{i=1}^{n} a_i + \sum_{i=1}^{n} b_i. \]
Again, if you need clarification, choose a value for $n$ and write everything out.
Things can get very complicated if you have multi-indices, or sequences
of sequences, as with matrices. Thus if $(a_{ij})$, where $i = 1, 2, \ldots, m$ and
$j = 1, 2, \ldots, n$, is a matrix or doubly indexed sequence of scalars, we have
\[ \sum_{i,j} a_{ij} = \sum_{i=1}^{m} \sum_{j=1}^{n} a_{ij}. \]
Once more, choose a value for $n$ and write everything out to help understand
what these expressions mean. You will have an opportunity in the
exercises to practice with these symbols.
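You can also let ISETL do the checking for you. The following experiment is only a sketch: it assumes that name_vector_space has been run for $V = (Z_5)^3$, that vectors are represented as ISETL tuples, and that .sm and .va are the scalar multiplication and vector addition set up there; SK1 and SV1 are the data from Activity 1(b).

    SK1 := [1, 4, 3];
    SV1 := [[1, 3, 1], [2, 1, 4], [4, 0, 2]];
    $ the ISETL counterpart of the summation from i = 1 to 3 of t_i v_i
    %.va [SK1(i) .sm SV1(i) : i in [1..#SK1]];
    $ the same linear combination written out term by term
    ((SK1(1) .sm SV1(1)) .va (SK1(2) .sm SV1(2))) .va (SK1(3) .sm SV1(3));

The two lines should return the same vector, which is exactly the point of the $\sum$ notation: it abbreviates the written-out sum.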
Bases
Here is a mathematical formulation of the concept expressed in ISETL by the
func is_basis that you wrote in Activity 4.
Definition 4.4.1. A non-empty set $B = \{v_1, v_2, \ldots, v_n\}$ in a vector space
$V$ is called a basis for $V$ if every vector $v \in V$ can be written in one and only
one way as a linear combination of the vectors in $B$.
In Activity 3, you considered, altogether, seven sets of vectors in two
vector spaces. Which of these are bases for the vector spaces containing
them?
You should have no difficulty showing that this definition is exactly the
same as saying that the set is linearly independent and generates all of $V$.
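For comparison with your own solution to Activity 4, here is one possible version of is_basis, offered only as a sketch: it assumes that name_vector_space has been run and that the funcs LI (Section 4.2, Activity 2) and All_LC (Section 4.1, Activity 6) are available; your version may look quite different.

    is_basis := func(SETV, V);
      if SETV = {} then            $ rule out the empty set right away
        return false;
      end;
      return LI(SETV) and (All_LC(SETV) = V);   $ independent and spans all of V
    end;

This mirrors the remark above: a basis is a linearly independent set whose span is the whole space.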
Certain sets of vectors having a particular form will always be bases. For
example, for any vector space $V = (K)^n$, the following sets of vectors are
bases:
\[ B_1 = \{\langle 1, 1, 1, \ldots, 1 \rangle, \langle 0, 1, 1, \ldots, 1 \rangle, \langle 0, 0, 1, \ldots, 1 \rangle, \langle 0, 0, 0, 1, \ldots, 1 \rangle, \ldots, \langle 0, 0, 0, \ldots, 0, 1 \rangle\} \]
\[ B_2 = \{\langle 1, 0, 0, \ldots, 0 \rangle, \langle 0, 1, 0, \ldots, 0 \rangle, \langle 0, 0, 1, \ldots, 0 \rangle, \ldots, \langle 0, 0, 0, \ldots, 0, 1 \rangle\} \]
\[ B_3 = \{\langle 1, 1, 1, \ldots, 1 \rangle, \langle 1, 1, \ldots, 1, 0 \rangle, \langle 1, 1, \ldots, 1, 0, 0 \rangle, \ldots, \langle 1, 1, 0, \ldots, 0 \rangle, \langle 1, 0, 0, \ldots, 0 \rangle\} \]
We prove the first case here. The latter cases are left for the exercises.
Theorem 4.4.1. Let $V = (K)^n$. The set $B_1$ is a basis.
Proof. First we show that the set is linearly independent. Given the vector equation
\[ a_1 \langle 1, 1, 1, \ldots, 1 \rangle + a_2 \langle 0, 1, 1, \ldots, 1 \rangle + a_3 \langle 0, 0, 1, \ldots, 1 \rangle + \cdots + a_n \langle 0, 0, 0, \ldots, 0, 1 \rangle = \langle 0, 0, 0, \ldots, 0 \rangle, \]
we must show that
\[ a_1 = a_2 = a_3 = \cdots = a_n = 0. \]
We have
\begin{align*}
\langle 0, 0, 0, \ldots, 0 \rangle &= a_1 \langle 1, 1, 1, \ldots, 1 \rangle + a_2 \langle 0, 1, 1, \ldots, 1 \rangle + a_3 \langle 0, 0, 1, \ldots, 1 \rangle + \cdots + a_n \langle 0, 0, 0, \ldots, 0, 1 \rangle \\
&= \langle a_1, a_1, a_1, \ldots, a_1 \rangle + \langle 0, a_2, a_2, \ldots, a_2 \rangle + \langle 0, 0, a_3, \ldots, a_3 \rangle + \cdots + \langle 0, 0, 0, \ldots, 0, a_n \rangle \\
&= \langle a_1,\ (a_1 + a_2),\ (a_1 + a_2 + a_3),\ \ldots,\ (a_1 + a_2 + a_3 + \cdots + a_n) \rangle.
\end{align*}
Therefore,
\begin{align*}
a_1 &= 0 \\
a_1 + a_2 &= 0 \\
a_1 + a_2 + a_3 &= 0 \\
&\ \ \vdots \\
a_1 + a_2 + a_3 + \cdots + a_{n-1} &= 0 \\
a_1 + a_2 + a_3 + \cdots + a_{n-1} + a_n &= 0.
\end{align*}
The first equation yields $a_1 = 0$. Substituting this into the second equation
forces $a_2 = 0$. Substituting these results into the third equation results in
$a_3 = 0$. If we continue with subsequent steps, we get the desired result; that
is,
\[ a_1 = a_2 = a_3 = \cdots = a_n = 0. \]
Next we show that $B_1$ generates all of $V$. Let $v_1, v_2, \ldots, v_n$ be the vectors
in $B_1$. We must show that given any sequence $c_1, c_2, \ldots, c_n$ of scalars in $K$,
we can find a sequence of scalars $a_1, a_2, \ldots, a_n$ in $K$ such that
\[ \sum_{i=1}^{n} a_i v_i = c, \quad \text{where } c = \langle c_1, c_2, \ldots, c_n \rangle. \]
When this expression is written out with all of the coordinates, you get
almost exactly the same set of equations that was obtained in showing linear
independence. The only difference is that each 0 on the right hand side is
replaced by the appropriate $c_i$. That is, we must solve the following system
of equations for the unknowns $a_1, a_2, \ldots, a_n$:
\begin{align*}
a_1 &= c_1 \\
a_1 + a_2 &= c_2 \\
a_1 + a_2 + a_3 &= c_3 \\
&\ \ \vdots \\
a_1 + a_2 + a_3 + \cdots + a_{n-1} &= c_{n-1} \\
a_1 + a_2 + a_3 + \cdots + a_{n-1} + a_n &= c_n.
\end{align*}
Clearly, we get a solution by taking $a_1 = c_1$, $a_2 = c_2 - c_1$, $a_3 = c_3 - c_2$,
and so on.
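In general (a small addition of ours that just packages the pattern), subtracting each equation from the next gives
\[ a_1 = c_1, \qquad a_i = c_i - c_{i-1} \quad (i = 2, \ldots, n), \]
which solves the system for any choice of the $c_i$.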
The set of vectors $B_2$ is particularly important. We call it the coordinate
basis and write its elements as $e_1, e_2, \ldots, e_n$. The vector $e_i$ has all of its
coordinates equal to 0 except for the $i$th coordinate, which is 1.
You will notice that we have defined a basis to be a set of vectors as
opposed to a sequence of vectors. The reason for this is that the property
of being a basis does not depend on the order in which the vectors are
considered, at least not in the context with which this course is concerned.
In many situations where bases are used, however, it becomes important to
fix the order of the elements. Is that the case for Activity 7? Do the coefficients
you find form a set or a sequence? This issue will definitely come up in the
next few paragraphs.
When we want to make use of the order of the elements of a basis, we
make the set into a sequence and call it an ordered basis. Thus, given any
basis, each ordering of the set produces a different ordered basis. If a basis
has 10 vectors, how many ordered bases can you get from it?
Nobody is perfect, and the difference between a basis and an ordered basis
can be so small that often we will forget to add the adjective ordered when
we should. But you can always tell from the context, so whenever we are
working with a basis as a sequence, we mean an ordered basis whether we
say so or not.
Expansion of a Vector with respect to a Basis.
In Activity 7, you considered the following problem: given a vector space $V$, a basis
$B$ for $V$, and a vector $v \in V$, how can we find the coefficients of $v$ in its
expansion as a linear combination of the vectors in $B$? There are several
ways of doing this. You might set up a system of linear equations and solve
them. You could do this by hand, or use a computer tool. You could also do
it (if the vector space is not too large) by using the ISETL operation choose.
That is, you would apply choose to the set of all linear combinations with
the condition that it be equal to the given vector. In Chapter 6, you will find
another method that uses matrices.
When determining these coefficients, you really have to make sure that
each coefficient goes with a specific vector. One way of doing this would be
to use an ordered basis. Then you find a sequence of scalars to form the
coefficients, and the order takes care of the matching automatically.
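For small finite vector spaces, the brute-force search can be written in a couple of lines. The following is only a sketch: it assumes that name_vector_space has been run, that LC is the func from Activity 1, that B is an ordered basis given as a tuple of three vectors, that v is the vector whose coefficients we want, and that the set of scalars is available under the name K (your setup may use a different name). It also uses arb on a set former rather than the operation choose mentioned above.

    coeffs := arb({ [s1, s2, s3] : s1, s2, s3 in K | LC([s1, s2, s3], B) = v });

Since $B$ is a basis, exactly one tuple of scalars satisfies the condition, so the set has a single element and arb simply hands it back.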
There is one case in which finding the coefficients is so easy that it might
seem trivial. Suppose $V = K^n$ and you are working with the coordinate
basis. Now, any vector $v \in V$ has both its components as an element of
$K^n$ and its coefficients in its expansion by the $n$ basis elements. What can
you say about these two sequences of scalars? Although this fact may seem
trivial, it is not. We very briefly pursue it in a more general form.
Representation of a vector space as $K^n$. Suppose you have an arbitrary
vector space $V$ over a field $K$, and an ordered basis $B = [b_1, b_2, \ldots, b_n]$.
Then any vector $v \in V$ has a sequence of scalars $t = (t_i) = (t_1, t_2, \ldots, t_n)$
which are its coefficients with respect to $B$. That is,
\[ v = \sum_i t_i b_i = t_1 b_1 + t_2 b_2 + \cdots + t_n b_n. \]
How does this compare with what you obtained in Activity 7?
Because of properties of bases, this representation is unique, so you can
think of $v$ as an element $t = (t_i)$ of $K^n$. Conversely, if you have an element
$t = (t_i) \in K^n$, then the same equation can be used to specify a vector $v \in V$.
It is easy to see that the operations of vector addition and multiplication are
preserved by this correspondence. What exactly is meant by "preserved"
here?
These comments can be summarized by saying that the vector space $V$
is "the same as" $K^n$. This "being the same," however, depends on the basis $B$.
This observation is very important in more advanced studies of linear
algebra. We will not pursue it in this text.
In this representation of a vector space as $K^n$, would you say that the
order of the basis makes a difference?
Finding a Basis
You can always find a basis. In the case in which everything is finite, Activity
8 produces a basis. How do you know that the func Make_Basis will
always work? The algorithm begins by selecting an arbitrary non-zero vector
in $V$, forms the subspace generated, picks a non-zero vector not in that subspace,
adds it to what has already been chosen, and continues that process
until a basis is achieved.
Did you find in Activity 8 that you got a different basis each of the three
times you ran Make_Basis? How about the number of elements in each basis?
Why does this happen? Could you guarantee that the basis you get contains
some given set of one or more vectors? What would you have to assume
about this set?
Finite dimensional vector spaces. Suppose that the field $K$ is not finite.
For example, your vector space might be $R^n$. Or, as in Activity 12 of Section
4.1, it might be the set of all functions from a closed interval to $R$ whose
derivatives of every order exist at every point in the interval (left and right
derivatives at the endpoints). In such a case you can still apply the algorithm, but
you can't be sure of what will happen. That is, you pick a non-zero vector
and put it in the set $B$. Then you pick a non-zero vector not in the subspace
generated by $B$ and add that vector to $B$. You continue this process. If your
vector space is finite, it must stop. If the vector space is not finite, it may or
may not stop. If it stops after finitely many steps, then the resulting set $B$ is
a basis and a finite set. This case is important enough to warrant a formal
definition.
Definition 4.4.2. If a vector space $V$ has a basis which is a finite set, then
$V$ is called a finite dimensional vector space.
As we indicated above, even if this process does not stop, you can still
show that the vector space has a basis, but we will not discuss that situation
here.
Characterizations of bases. We defined a basis for a vector space in a
way that is equivalent to being a set which is both linearly independent and
generates the whole space. (See the comment after Definition 4.4.1 and Exercise
2.) There are two other characterizations of a basis.
Theorem 4.4.2. A subset $B$ of a vector space $V$ is a basis if and only if it
is a maximal linearly independent set. That is, $B$ is linearly independent and
if any other vector is added to it, then it is no longer linearly independent.
Proof. Exercise.
Theorem 4.4.3. A subset $B$ of a vector space $V$ is a basis if and only if it
is a minimal generating set. That is, the subspace generated by $B$ is all of
$V$, but if any vector is removed from $B$, then the subspace it generates is no
longer all of $V$.
Proof. Exercise.
Dimension
In several activities for this section, you found bases for various vector spaces.
In Activity 4, you found bases for $(Z_2)^5$ and for $(Z_3)^4$. In Activity 5 you
constructed two different bases for $P_4(Z_3)$, and in Activity 8, you found three
bases for the same vector space.
In all of these examples, did you notice any regularities in the number
of elements in a basis? The bases for $(Z_2)^5$ and for $(Z_3)^4$ had, respectively,
5 and 4 elements. Note that for each of these vector spaces, the coordinate
bases also have 5 and 4 elements, respectively. In the other examples, what
did all bases for the same vector space always have in common?
Do you think there is a general result here? There is, but first we must
consider an important fact about the maximum number of elements in a
linearly independent set. In going through the following proof, it might help
you to pick values for $m$ and $n$ and write out all of the summations.
Theorem 4.4.4. If $V$ has a basis with $n$ elements in it, then any subset of
$V$ with more than $n$ elements must be linearly dependent.
Proof. Suppose that $B = \{b_1, b_2, \ldots, b_n\}$ is a basis for $V$, and that $\mathcal{C} =
\{c_1, c_2, \ldots, c_m\}$ is a subset of $V$ with $m > n$. We must show that $\mathcal{C}$ is a
linearly dependent set. To do that we must find scalars $a_1, a_2, \ldots, a_m$, not
all zero, such that
\[ \sum_{i=1}^{m} a_i c_i = 0. \]
Now, because $B$ is a basis, each element of $\mathcal{C}$ is equal to some linear
combination of the vectors in $B$. That is, we have scalars $t_{ij}$, $i = 1, \ldots, m$,
$j = 1, \ldots, n$, such that for each $i = 1, \ldots, m$ we have
\[ c_i = \sum_{j=1}^{n} t_{ij} b_j. \]
Substituting these expressions in the equation we have to solve, this equation
becomes
\[ \sum_{i=1}^{m} a_i \sum_{j=1}^{n} t_{ij} b_j = 0 \]
or
\[ \sum_{i=1}^{m} \sum_{j=1}^{n} a_i t_{ij} b_j = 0 \]
and, reversing the order of the two sums (why can we do this?), it becomes
\[ \sum_{j=1}^{n} \sum_{i=1}^{m} a_i t_{ij} b_j = 0. \]
In this vector equation, we can replace 0 by its expression as a linear combination
of the basis vectors to obtain
\[ \sum_{j=1}^{n} \left( \sum_{i=1}^{m} a_i t_{ij} \right) b_j = \sum_{j=1}^{n} 0\, b_j. \]
This equation expresses the equality of two linear combinations of the basis
vectors and therefore, because of the uniqueness, each coefficient of $b_j$, $j =
1, \ldots, n$, is the same on both sides of the equation. This leads to the following
system of equations:
\[ \sum_{i=1}^{m} a_i t_{i1} = 0, \qquad \sum_{i=1}^{m} a_i t_{i2} = 0, \qquad \ldots, \qquad \sum_{i=1}^{m} a_i t_{in} = 0. \]
But this is a system of $n$ equations in $m$ unknowns, with $m > n$, so the
system must have a solution in which not all of the unknowns are 0. Why?
This notion will be pursued in Exercise 13.
With this theorem, we can easily prove what you observed in considering
the number of elements in a basis.
Theorem 4.4.5. Any two bases for a vector space $V$ have the same number
of elements.
Proof. Suppose we have two sets which are bases for $V$. Applying Theorem
4.4.4 to the fact that the first set is a basis and the second set is linearly
independent, we conclude that the second set cannot have more elements
than the first. Reversing the two sets, we conclude that the first set cannot
have more elements than the second. Hence they have the same number of
elements.
This theorem allows us to make the following definition:
Definition 4.4.3. The dimension of a finite dimensional vector space is the
number of elements in a basis.
Why do we need Theorem 4.4.5 before we can define the dimension of a
vector space?
Some of the following theorems were illustrated by examples in the activities.
You will have a chance to give general proofs in the exercises.
Theorem 4.4.6. The dimension of the vector space $K^n$ is $n$.
Proof. Exercise.
Theorem 4.4.7. If $V$ is an $n$-dimensional vector space, then any set of
vectors which generates $V$ must have at least $n$ elements.
Proof. Exercise.
Theorem 4.4.8. If $V$ is an $n$-dimensional vector space, then any set of $n$
linearly independent vectors generates $V$.
Proof. Exercise.
Theorem 4.4.9. If $V$ is an $n$-dimensional vector space, then any set of $n$
vectors which generates $V$ must be linearly independent.
Proof. Exercise.
Theorem 4.4.10. If $V$ is an $n$-dimensional vector space, and $B$ is a set of
linearly independent vectors in $V$, then there is a basis which contains $B$; that
is, the set $B$ can be extended into a basis of $V$.
Proof. Exercise.
Dimensions of Euclidean spaces. The Euclidean space $R^2$ contains elements
of three types: points, which are subspaces of dimension zero; lines,
which are subspaces of dimension one; and the entire space, which is of dimension
two. The Euclidean space $R^3$ contains four types of subspaces: points;
lines; planes; and the entire space, which is itself a subspace of dimension 3.
We have seen that any vector space of the form $V = (K)^n$, whether
$K$ is equal to the set of real numbers $R$ or not, is a space of dimension $n$
possessing subspaces of smaller dimension. Such a result is consistent with
the geometric notion of dimension discussed in Euclidean space.
Non-Tuple Vector Spaces
Most of the vector spaces you have studied so far are of the form $(Z_p)^n$. This
is because these are concrete examples and are the easiest to work with. But
you have also begun to work with some other examples: $P_n(K)$, the vector
space of polynomials of degree less than or equal to a certain number $n$ with
coefficients in some field $K$; the vector space of all solutions of a system of
homogeneous linear equations; and the vector space of all functions which
satisfy a certain differential equation. In each of these cases, the space has
bases, and they are important. We will consider some first facts.
The results you found in Activity 5(a) are completely general, and you
might have been able to solve this problem more easily by hand without
the computer. After all, what is a linear combination of monomials but a
polynomial whose degree is less than or equal to the highest degree of a
monomial, which is $n$. Thus you get all polynomials with degree less than or
equal to $n$. Can such a polynomial be equal to the zero polynomial, if the
coefficients are not all zero? That one could be interesting, so you will have
a chance to play with it in the exercises.
Theorem 4.4.11. The monomials $1, x, x^2, \ldots, x^n$ form a basis for $P_n(K)$.
Proof. Exercise.
Determining other bases for $P_n(K)$ involves a lot of work with the properties
of polynomials, and we will not go much farther with that in this text.
Closely related to the vector space of polynomials is the vector space of
polynomial functions. The result of Theorem 4.4.11 does not hold in general
for these spaces; instead we have the following.
Theorem 4.4.12. The monomial functions $x \mapsto 1, \ldots, x \mapsto x^n$ form a
basis of $PF_n(R)$.
If $n < p$, then the monomial functions $x \mapsto 1, \ldots, x \mapsto x^n$ form a basis
of $PF_n(Z_p)$.
If $n \ge p$, then the monomial functions $x \mapsto 1, \ldots, x \mapsto x^{p-1}$ form a
basis of $PF_n(Z_p)$.
Proof. Left as an exercise (see Exercise 21).
Consider the system of equations in Activity 6, and try to solve it completely,
perhaps using the methods you learned in Chapter 3.
You should be able to determine that all solutions $[x_1, x_2, x_3]$ are given
by
\[ x_1 = s, \qquad x_2 = t, \qquad x_3 = s + t, \]
where $s, t$ run independently through all values in $Z_5$.
Another way to say this is that the solutions form a subspace of $(Z_5)^3$
and that this subspace is generated by the two vectors $\langle 1, 0, 1 \rangle$, $\langle 0, 1, 1 \rangle$. Can
you see why this is so? If it is, then these two vectors are obviously linearly
independent, so they form a basis for the space of solutions of this system.
This situation is very general, and you will study it more in Chapter 6.
For now, we can introduce some words you will meet later. This system has
three equations, and the vector space of solutions is of dimension 2. Hence
we say that the rank of the system is 1, and its nullity is 2.
Here is something else for you to mull over. Suppose you throw away all
of the $x$'s and the $= 0$ parts of the system of equations in Activity 6, leaving
you with a $3 \times 3$ matrix. Treat the rows of this matrix as vectors and notice
that the three vectors do not form a linearly independent set. Moreover, the
largest subset which is linearly independent has only 1 vector. Is this 1 a
coincidence? Now do the same thing with the columns of the matrix. Are
the results the same? What's that all about?
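If you would rather experiment than ponder, something along the following lines will do it. This is only a sketch: it assumes that name_vector_space has been run for $(Z_5)^3$, that vectors are represented as ISETL tuples, and that the funcs LI and LIGS from the earlier sections are available; the tuples below are the rows and columns of the coefficient matrix of the system in Activity 6.

    rows := {[1, 1, 4], [2, 2, 3], [3, 3, 2]};
    cols := {[1, 2, 3], [4, 3, 2]};    $ the first two columns are equal, so as a
                                       $ set there are only two distinct columns
    LI(rows); LI(cols);
    #LIGS(rows); #LIGS(cols);          $ sizes of the independent sets LIGS extracts

Compare the two pairs of answers with the questions asked above.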
There is a deep mathematical connection between linear algebra and the
solutions of a linear differential equation that involves a lot of important
mathematics. In this book, we can only give the barest hint of the tip of an
iceberg.
Recall, from Section 4.3, the differential equation
\[ f'' + f = 0, \]
where $f$ is an unknown function in $C^\infty(R)$.
You checked in Section 4.3 that the two functions sin, cos are solutions
to this differential equation. You should have no trouble showing that they
form a linearly independent set in the vector space $C^\infty(R)$. Do they generate
the subspace of all solutions?
In the theory of differential equations, the study of initial value problems
shows that if you choose any real numbers $a, b$, then there is a unique function
$f \in C^\infty(R)$ which is a solution to the differential equation and satisfies
$f(0) = a$ and $f'(0) = b$. On the other hand, since the value of the sin
function and its derivative at 0 are 0, 1 respectively, and the value of the cos
function and its derivative at 0 are 1, 0 respectively, given any $a, b$, we can
find scalars $s, t$ such that the function $s \sin + t \cos$ has its value at 0 and its
derivative at 0 equal to $a, b$ respectively.
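Explicitly (a small step we are filling in here): since $\sin(0) = 0$, $\cos(0) = 1$, and $(s \sin + t \cos)' = s \cos - t \sin$, we have
\[ (s \sin + t \cos)(0) = t \qquad \text{and} \qquad (s \sin + t \cos)'(0) = s, \]
so taking $t = a$ and $s = b$ does the job.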
So take any function $f \in C^\infty(R)$ and let $a = f(0)$, $b = f'(0)$. Then find
$s, t$ as above. We have the fact that both $f$ and $s \sin + t \cos$ are solutions to
the differential equation and they have the same values and derivatives at 0.
By the uniqueness, it follows that
\[ f = s \sin + t \cos. \]
Hence $\{\sin, \cos\}$ generates the space of solutions of the differential equation,
so it is a basis. Incidentally, can you explain why we use sin, cos here and
not $\sin(x), \cos(x)$?
You might think that some of the statements just made require proofs.
You will have an opportunity to provide them in the exercises.
Exercises
1. Write out each of the following expressions or equations without use
of summation notation. Then explain why the equations in (b)-(e) are true.
(a) $\sum_{i=1}^{4} a_i b_i$
(b) $\sum_{i=1}^{4} t b_i = t \sum_{i=1}^{4} b_i$
(c) $\sum_{i=1}^{4} (a_i + b_i) = \sum_{i=1}^{4} a_i + \sum_{i=1}^{4} b_i$
(d) $\sum_{i,j} a_{ij} = \sum_{i=1}^{3} \sum_{j=1}^{4} a_{ij}$
(e) $\sum_{i=1}^{3} \sum_{j=1}^{4} a_{ij} = \sum_{j=1}^{4} \sum_{i=1}^{3} a_{ij}$
2. Show that a subset of a vector space $V$ is a basis if and only if it is
linearly independent and generates all of $V$.
3. Show that each of the following sets is a basis for $(K)^n$.
(a) $B_2 = \{\langle 1, 0, 0, \ldots, 0 \rangle, \langle 0, 1, 0, \ldots, 0 \rangle, \langle 0, 0, 1, \ldots, 0 \rangle, \ldots, \langle 0, 0, 0, \ldots, 0, 1 \rangle\}$
(b) B_3 = {⟨1, 1, 1, . . . , 1⟩, ⟨1, 1, 1, . . . , 0⟩, ⟨1, 1, 1, . . . , 0, 0⟩, . . . , ⟨1, 1, 0, . . . , 0⟩, ⟨1, 0, 0, . . . , 0⟩}
4. In the paragraph on representations of a vector space as (K)^n, it is stated that "It is easy to see that the operations of vector addition and scalar multiplication are preserved by this correspondence."
(a) Explain what is meant by "preserved."
(b) Prove that vector addition is preserved.
(c) Prove that scalar multiplication is preserved.
5. In Activity 3(a), choose the set which is a basis and find the coordinates of the expansion of the vector ⟨1, 0, 1, 0, 1⟩ with respect to this basis.
6. In Activity 3(b), choose the set which is a basis and find the coordinates of the expansion of the vector ⟨0, 1, 2, 0⟩ with respect to this basis.
7. Let P_4(R) be the vector space of all polynomials of degree less than or equal to 4 with real coefficients.
(a) Find a basis which contains the polynomial x + 1.
(b) Find a basis which contains the polynomials x + 2 and x − 1.
(c) Find a basis which contains the polynomials x − 2 and x² − 2.
8. For each of the bases you found in the previous exercise, find the expansion, with respect to that basis, of the polynomial x.
9. Consider the set of all finite sequences of real numbers.
(a) Define a scalar multiplication and a vector addition on this set, and show that with these operations it becomes a vector space.
(b) Explain what could be meant by the "coordinate basis" for this vector space, and show that it is a basis.
(c) Find a basis B for this space in which no sequence contains a zero.
(d) Consider the element of this space which is the sequence consisting of three 0s followed by three 1s. Find the expansion of this vector with respect to your basis B.
10. Prove Theorem 4.4.2.
11. Prove Theorem 4.4.3.
12. Write out the proof of Theorem 4.4.4 for the case n = 3, m = 5, using no summation symbols.
13. (a) Consider the following homogeneous system of 3 equations in 4 unknowns:
x_1 − x_2 + 2x_3 + 4x_4 = 0
2x_1 + 3x_2 − x_3 + x_4 = 0
4x_1 + 5x_2 + 3x_3 − 2x_4 = 0.
Show there exists a non-trivial solution to the system.
(b) Generalize the result of the previous part to show that a system of n equations in m unknowns, with m > n, must have a solution in which not all of the unknowns are 0. This completes the proof of Theorem 4.4.4.
14. Prove Theorem 4.4.6.
15. Prove Theorem 4.4.7.
16. Prove Theorem 4.4.8.
17. Prove Theorem 4.4.9.
18. Prove Theorem 4.4.10.
19. (a) Choose several values for n and fields K, and in each case show that any polynomial in P_n(K) whose coefficients are not all zero cannot be the zero polynomial.
(b) Show in general, for any n and K, that any polynomial in P_n(K) whose coefficients are not all zero cannot be the zero polynomial.
20. Prove Theorem 4.4.11.
21. Prove Theorem 4.4.12.
22. Show that the set {⟨1, 0, 1⟩, ⟨0, 1, 1⟩} is a basis for the solution space of the system of equations in Activity 8.
23. Show that the set {sin, cos} is linearly independent in the vector space C^∞(R).
24. The set {sin, cos} also spans the space of solutions of the differential equation f′′ + f = 0 in C^∞(R) (and so, with the previous exercise, it is a basis for that solution space). Showing that it generates the whole solution space requires some background in differential equations. Look this up and sketch a proof that {sin, cos} spans the solution space.
Chapter 5
Linear Transformations
Remember the definition of a function from previous mathematics courses? In calculus, functions are the main object of study, as differentiation and integration both operate on functions. You are probably saying to yourself, "Of course, a function is a mapping from some set called the domain into another set called the co-domain in which any element from the domain is mapped to exactly one element in the co-domain." Or something like "a subset f of the Cartesian product of A and B such that for every a ∈ A there is exactly one b ∈ B such that (a, b) ∈ f." This chapter is going to explore some functions from one vector space to another and consider how portions of the domains and ranges might be thought of as vector spaces themselves.
5.1 Introduction to Linear Transformations
Activities
1. Let U = (Z_3)^2, the vector space of ordered pairs of elements in Z_3. Let u, v ∈ U, where u = ⟨u_1, u_2⟩ and v = ⟨v_1, v_2⟩. Define a function T : U → U by
T(u) = ⟨u_1u_2, 2u_2⟩.
For all u, v ∈ U and for all c, d ∈ Z_3, perform the following steps:
(a) Compute cu + dv, and then find T(cu + dv).
(b) Compute T(u) and T(v), and then find cT(u) + dT(v).
(c) Determine whether T(cu + dv) = cT(u) + dT(v).
2. Let U = (Z_3)^2 and V = (Z_3)^3. Let u, v ∈ U, where u = ⟨u_1, u_2⟩ and v = ⟨v_1, v_2⟩. Define a function F : U → V by
F(u) = ⟨u_1 + u_2, 2u_1 + u_2, u_1⟩.
For all u, v ∈ U and for all c, d ∈ Z_3, perform the following steps:
(a) Compute cu + dv, and then find F(cu + dv).
(b) Compute F(u) and F(v), and then find cF(u) + dF(v).
(c) Determine whether F(cu + dv) = cF(u) + dF(v).
3. Let U and V be vector spaces with scalars in K, and assume that name_vector_space has been run. Write an ISETL func is_linear that accepts a func H : U → V, where U and V indicate vector spaces over K; checks the equality
H(cu + dv) = cH(u) + dH(v)
for all pairs of scalars c, d ∈ K and all pairs of vectors u, v ∈ U; and returns true, if the equality being checked holds for all possible scalar and vector pairs, or false, if the equality does not hold. Apply this func to the funcs T and F defined in Activities 1 and 2. (One possible shape for is_linear is sketched after these activities.)
4. Let U = Z_3. Let u, v ∈ U. Define a function H : U → U by
H(u) = u².
(a) Apply the func is_linear to H. Does is_linear return true or false in this case?
(b) If we define G : U → U by
G(u) = u^n,
where n is an integer greater than or equal to 1, for what values of n will is_linear return true?
5. Let U = (Z_5)^2. Run name_vector_space, and complete parts (a)–(f).
(a) Write a func T that accepts a vector ⟨u_1, u_2⟩ ∈ (Z_5)^2 and returns a vector in (Z_5)^2. The first component of the output is the sum of the product of u_1 by 2 with the product of u_2 by 4, and the second component is the sum of u_1 with the product of 3 and u_2.
(b) Write a func F that accepts a vector ⟨u_1, u_2⟩ ∈ (Z_5)^2 and returns the vector ⟨u_1 + 2, u_2⟩ ∈ (Z_5)^2.
(c) Write a func H that accepts a vector ⟨u_1, u_2⟩ ∈ (Z_5)^2 and returns the vector ⟨0, u_2⟩ ∈ (Z_5)^2.
(d) Write a func R that accepts a vector ⟨u_1, u_2⟩ ∈ (Z_5)^2 and returns the vector ⟨3u_1 + 2u_2 + 2, 2u_1 + u_2 + 3⟩ ∈ (Z_5)^2.
(e) Write a func S that accepts a vector ⟨u_1, u_2⟩ ∈ (Z_5)^2 and returns the vector ⟨3u_1 + 2u_2, 2u_1 + u_2⟩ ∈ (Z_5)^2.
(f) Apply the func is_linear to each of the funcs you have constructed in (a)–(e). Which return true? Which return false?
6. Let U = R^2 be the coordinate plane, the vector space of ordered pairs with real-valued components. Let u = ⟨2, 5⟩ and v = ⟨1, 3⟩ be two vectors in the plane. Define G : R^2 → R^2 to be the function that accepts a vector u ∈ R^2 and returns the vector found by rotating u counterclockwise through π/6 radians. If we think of u geometrically as an arrow that emanates from the origin, then a rotation through θ radians refers to rotating the given arrow θ radians in a counterclockwise direction.
(a) Use the ISETL tool vectors to graph G(u) + G(v) and G(u + v). What do you observe about the relationship between G(u) + G(v) and G(u + v)?
(b) Let c = 2. Use vectors to graph G(cu) and cG(u). What do you observe about the relationship between G(cu) and cG(u)?
(c) For u, v ∈ R^2 and c, d ∈ R, does G satisfy the equality G(cu + dv) = cG(u) + dG(v)? Explain your answer.
7. Let U = R^2 be the coordinate plane. Let u = ⟨3, 1⟩ and v = ⟨1, 3⟩ be two vectors in the plane. Define H : R^2 → R^2 to be the function that accepts a vector u ∈ R^2 and returns the vector found by reflecting u through the line whose equation is given by y = √3·x. If we think of u geometrically as an arrow emanating from the origin, then a reflection of u refers to finding the mirror image of its arrow with respect to the given reflecting line.
(a) Use vectors to graph H(u) + H(v) and H(u + v). What do you observe about the relationship between H(u) + H(v) and H(u + v)?
(b) Let c = 2. Use vectors to graph H(cu) and cH(u). What do you observe about the relationship between H(cu) and cH(u)?
(c) For u, v ∈ R^2 and c, d ∈ R, does H satisfy the equality H(cu + dv) = cH(u) + dH(v)? Explain your answer.
8. Let U = R^2 be the coordinate plane. Let u = ⟨3, 2⟩ and v = ⟨1, 5⟩ be two vectors in the plane. Define S : R^2 → R^2 to be the function that accepts a vector u ∈ R^2 and returns the vector found by translating u by the vector ⟨3, 4⟩. If we think of u geometrically as an arrow that emanates from the origin, then a translation of u refers to moving u to a new location in the plane without disturbing its original direction.
(a) Use vectors to graph S(u) + S(v) and S(u + v). What do you observe about the relationship between S(u) + S(v) and S(u + v)?
(b) Let c = 2. Use vectors to graph S(cu) and cS(u). What do you observe about the relationship between S(cu) and cS(u)?
(c) For u, v ∈ R^2 and c, d ∈ R, does S satisfy the equality S(cu + dv) = cS(u) + dS(v)? Explain your answer.
9. Let D : C^∞(R) → C^∞(R) be the operator defined by D(f) = f′, the derivative of f. Let D² be the operator defined by D ∘ D. Determine whether D² : C^∞(R) → C^∞(R) satisfies the condition
D²(af + bg) = aD²(f) + bD²(g),   f, g ∈ C^∞(R), a, b ∈ R.
10. Let ∫₀¹ : PF_3(R) → R be defined by
∫₀¹ p = ∫₀¹ p(t) dt.
Determine whether ∫₀¹ satisfies the condition
∫₀¹ (af + bg) = a ∫₀¹ f + b ∫₀¹ g,   f, g ∈ PF_3(R), a, b ∈ R.
11. Let J : PF_2(R) → PF_3(R) be defined by
J(p) = x ↦ ∫₂ˣ p(t) dt.
Determine whether J satisfies the condition
J(af + bg) = aJ(f) + bJ(g),   f, g ∈ PF_2(R), a, b ∈ R.
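Before turning to the discussion, here is one possible shape for the func requested in Activity 3. It is only a sketch: it assumes that name_vector_space has set up the set of vectors U, the set of scalars K, and binary funcs vadd (vector addition) and smult (scalar multiplication); those two names are placeholders of ours, not necessarily the names your code uses.

    $ Sketch of is_linear for Activity 3.
    is_linear := func(H);
        return forall c, d in K, u, v in U |
            H(vadd(smult(c, u), smult(d, v))) =
                vadd(smult(c, H(u)), smult(d, H(v)));
    end;

For something to feed it, a func such as the one described in Activity 5(e) might be written directly with mod-5 arithmetic on plain ISETL tuples:

    S := func(u);
        return [(3*u(1) + 2*u(2)) mod 5, (2*u(1) + u(2)) mod 5];
    end;

If your vector space code represents vectors or their operations differently, the same ideas apply with the appropriate names substituted.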
Discussion
Functions between Vector Spaces
In calculus, you worked with functions whose domains and ranges were both some set of real numbers: each input was a real number and each output was a real number. In multivariable calculus, you broadened your horizons a bit. A function of two variables accepts an ordered pair (x, y) of numbers and returns a real number. A function of three variables accepts an ordered triple (x, y, z) and returns a real number. With vector-valued functions, that is, functions whose ranges are vector spaces, the situation is reversed: each input is a real number, and each output can be an ordered pair or an ordered triple. For example, a function h : R → R^3 defined by h(x) = ⟨x², 2x + 3, 3x³⟩ accepts a real number x and returns the vector, or ordered triple, ⟨x², 2x + 3, 3x³⟩.

In linear algebra, you have the opportunity to expand your view even further. Look again at the functions T and F defined in the first two activities. What are the inputs and outputs of these functions? You might have noticed that both accept a vector u in (Z_3)^2, while T returns a vector in (Z_3)^2 and F returns a vector in (Z_3)^3. A function between vector spaces assigns one and only one output vector, say v in V, to each input vector, say u in U. Such functions are often called vector transformations, or just transformations. Check all the functions you worked with in the activities: which of these are vector transformations? In Activity 1, you were asked whether T(cu + dv) = cT(u) + dT(v). Part (c) of Activity 2 asked you a similar question regarding the function F. In Activity 3, you were asked to write a func that would check this condition for any vector space function H : U → V. In particular, given any H : U → V, any pair of vectors u, v ∈ U, and any pair of scalars c, d, does the following equality hold:
H(cu + dv) = cH(u) + dH(v)?
Any vector space function that satisfies this condition is called a linear transformation. Is the function T defined in Activity 1 a linear transformation? Is the function F defined in Activity 2 a linear transformation?
Definition and Significance of Linear Transformations

The single condition you used in Activity 3 to construct the func is_linear can be separated into two conditions, as presented in the definition given below.

Definition 5.1.1. Let U and V be vector spaces with scalars in K. A function T : U → V is a linear transformation if
(i.) T(u + v) = T(u) + T(v) for u, v ∈ U, and
(ii.) T(cu) = cT(u) for u ∈ U and c ∈ K.

How could the func you defined in Activity 3 be modified to check the two conditions given in the definition? (One possibility is sketched below.) Of those funcs in Activities 1, 2, 4, and 5 that are not linear, which fail condition (i)? Condition (ii)? Both?
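One possibility, again using the placeholder names U, K, vadd, and smult from the sketch given after the activities, is to test the two conditions of Definition 5.1.1 separately:

    $ Sketch only: additivity and homogeneity checked one at a time.
    is_additive := func(H);
        return forall u, v in U | H(vadd(u, v)) = vadd(H(u), H(v));
    end;

    is_homogeneous := func(H);
        return forall c in K, u in U | H(smult(c, u)) = smult(c, H(u));
    end;

A func H satisfies Definition 5.1.1 exactly when both of these return true, so their conjunction should agree with is_linear(H).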
In Activity 4, we defined a familiar function using vector notation. You can verify that the set of real numbers R is in fact a vector space over itself, that is, the set of scalars is also R. The function H defined in this activity, like any function from R to R, is a vector transformation: a real number, say x ∈ R, thought of in this context as a vector, is assigned to the vector, or real number, x². When you applied is_linear, what did you find? Is H a linear transformation? For what n is the function G defined in Activity 4(b) a linear transformation?

Many of the functions you studied in calculus are not linear. However, linear transformations are extremely important in calculus. For example, when you compute the derivative D of a differentiable function g : R → R at a point x = a, the function G : R → R given by
G(x) = D(g)(a) · (x − a),
where (·) denotes real number multiplication, is a linear transformation that approximates g near x = a. This is also true of multivariable functions involving the gradient. For a function of two variables h : R^2 → R, the function H : R^2 → R given by
H(x, y) = ⟨D_x(h)(a, b), D_y(h)(a, b)⟩ · ⟨x − a, y − b⟩,
where D_x denotes the partial derivative function with respect to x, D_y represents the partial derivative function with respect to y, and (·) represents the dot product, is a linear transformation that approximates h near the point (a, b). Can you show that H is linear?

In Activity 9, you were asked to determine whether the differential operator D² is linear. What did you find? Is your answer consistent with what has just been discussed regarding the functions G and H?

In Activities 10 and 11, you considered the definite integral applied to polynomials from PF_2(R) and PF_3(R). Do the definite integrals defined in these activities satisfy the conditions given in Definition 5.1.1? If not, what modifications would need to be made to ensure linearity?
Why is linearity so important? The two conditions specified in Definition 5.1.1 ensure preservation of vector addition and scalar multiplication for a function between two vector spaces. Suppose that T : U → V is a linear transformation. The first condition given in Definition 5.1.1,
T(u + v) = T(u) + T(v),   u, v ∈ U,
guarantees that the vector assigned to u + v under T is equal to the sum of the vectors assigned by T to u and v individually. This is illustrated in the diagram below: the sum of the outputs, T(u) and T(v), is equal to the output of the sum, T(u + v).

    (u, v)  --T-->  (T(u), T(v))
      | +               | +
    u + v   --T-->  T(u) + T(v)

Thus, T and + can be applied in either order: taking the sum of u and v and then applying T yields the same answer as taking the sum of the images of u and v separately.

The second condition given in Definition 5.1.1,
T(cu) = cT(u),   u ∈ U, c ∈ K,
guarantees preservation of scalar multiplication: the vector assigned by T to the scalar product cu is equal to the product of c with the vector assigned by T to u.

    u    --T-->  T(u)
    | c·            | c·
    cu   --T-->  cT(u)

Similar to the case involving the sum, T and scalar multiplication can be applied in either order: taking the product cu and then applying T yields the same answer as multiplying the image of u by the scalar c.
In Activities 6, 7, and 8, you were asked whether familiar geometric transformations such as rotations, reflections, and translations preserve the operations of vector addition and scalar multiplication in R^2. Activity 6 described a rotation through π/6 radians. The general definition and its analytic, or algebraic, representation are given below.

Definition 5.1.2. A rotation is a function that takes a vector in R^2 and rotates it in a counterclockwise fashion through an angle of θ radians. It is given by the formula
T(⟨x, y⟩) = ⟨x cos θ − y sin θ, x sin θ + y cos θ⟩.

[Figure 5.1: A rotation through θ, taking (x, y) to (x′, y′).]

As the figure indicates, let ⟨x, y⟩ ∈ R^2 be a vector in the plane, and assume that the vector ⟨x, y⟩ forms an angle of α radians with the x-axis. Let ⟨x′, y′⟩ be the vector returned after ⟨x, y⟩ has been rotated through θ radians.

If we drop a perpendicular segment from the point (x, y) to the x-axis, we can see that the segment from the origin to the point (x, 0) has length x, the segment from (x, 0) to (x, y) has length y, and the vector ⟨x, y⟩ has length √(x² + y²). We can show that
x = √(x² + y²) cos α
y = √(x² + y²) sin α.
Similarly, if we drop a perpendicular segment from (x′, y′) to the x-axis, we can see that
x′ = √(x² + y²) cos(α + θ)
y′ = √(x² + y²) sin(α + θ).
If we then apply standard trigonometric identities, we can see that the geometric representation is the same as the analytic representation, which is given by the expression for T.
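For completeness, the trigonometric step can be written out as follows; it uses nothing beyond the angle-addition identities:

x′ = √(x² + y²) cos(α + θ) = √(x² + y²)(cos α cos θ − sin α sin θ) = x cos θ − y sin θ
y′ = √(x² + y²) sin(α + θ) = √(x² + y²)(sin α cos θ + cos α sin θ) = x sin θ + y cos θ,

which is exactly the formula in Definition 5.1.2.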
In Activity 7, you were asked to consider a reflection through the line y = √3·x. As with a rotation, there is a general analytic, or algebraic, representation.

Definition 5.1.3. A reflection through the line y = mx is a function that takes a vector in R^2 and returns a vector given by the formula
T(⟨x, y⟩) = ⟨x cos 2θ + y sin 2θ, x sin 2θ − y cos 2θ⟩,
where the angle of inclination of the reflecting line is given by θ = tan⁻¹ m.

Let ⟨x, y⟩ ∈ R^2 be a vector in the plane, and assume that the vector ⟨x, y⟩ forms an angle of α radians with the x-axis, as shown in Figure 5.2. Assume that the line y = mx forms an angle of θ radians with the x-axis, where m = tan θ. Let ⟨x′, y′⟩ represent the vector returned after ⟨x, y⟩ has been reflected through y = mx.

The drawing indicates that the reflection through y = mx is the same as first reflecting the vector ⟨x, y⟩ through the x-axis and then rotating the resulting reflection ⟨x, −y⟩ through 2θ radians. The components of ⟨x′, y′⟩ can be found by applying the expression for a rotation given in Definition 5.1.2. Can you explain why a reflection can be represented as a reflection through an axis followed by a rotation?
which is dened below.
5.1 Introduction to Linear Transformations 227
y
x

2
(x,y)
(x',y')
(x,-y)
y = mx

Figure 5.2: Reection through a line


y
x
(x,y)
(x',y')
(c,d)
c
d
Figure 5.3: A translation
228 CHAPTER 5. LINEAR TRANSFORMATIONS
Denition 5.1.4. A translation by the vector c, d) is a function T : R
2

R
2
that takes a vector in R
2
and returns a vector given by the formula
T(x, y)) = x +c, y +d) .
Let ⟨x, y⟩ ∈ R^2 be a vector in the plane, and assume that ⟨x′, y′⟩ represents the vector returned after ⟨x, y⟩ has been translated by the vector ⟨c, d⟩. As the figure illustrates, the components of ⟨x′, y′⟩ are the vector sum of ⟨x, y⟩ and ⟨c, d⟩.

Based upon your work in the activities, which of these geometric transformations is a linear transformation? For each one that is not, can you identify which operation, vector addition or scalar multiplication, is not preserved?

In addition to preserving vector addition and scalar multiplication, linear transformations preserve lines. A line through the vector a in the direction of v is given by the set
{tv + a : t ∈ R}.

[Figure 5.4: Vector form of a line, showing v, tv, a, and tv + a.]

As shown in Figure 5.4 in R^2, the direction vector v emanates from the origin, a is the vector through which the line passes, and every point on the line can be represented as the vector sum of a and some scalar multiple of v.

Linear transformations transform lines into lines. For example, the transformation T : R^3 → R^3 given by
T(⟨u_1, u_2, u_3⟩) = ⟨u_1 + u_2, u_2 − u_3, u_1 + u_3⟩
transforms the line {t⟨2, 1, −1⟩ + ⟨−1, 3, 1⟩ : t ∈ R} into the line {t⟨3, 2, 1⟩ + ⟨2, 2, 0⟩ : t ∈ R}, as shown in Figure 5.5.

[Figure 5.5: Linear transformation of a line.]
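As a quick check of that example (our own verification, using nothing beyond the formula for T):

T(t⟨2, 1, −1⟩ + ⟨−1, 3, 1⟩) = tT(⟨2, 1, −1⟩) + T(⟨−1, 3, 1⟩) = t⟨3, 2, 1⟩ + ⟨2, 2, 0⟩,

since T(⟨2, 1, −1⟩) = ⟨2 + 1, 1 − (−1), 2 + (−1)⟩ = ⟨3, 2, 1⟩ and T(⟨−1, 3, 1⟩) = ⟨−1 + 3, 3 − 1, −1 + 1⟩ = ⟨2, 2, 0⟩.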
The example being considered here can be generalized.

Theorem 5.1.1. Let T : R^n → R^n be a linear transformation, and let l be a line in R^n given by
l = {tv + a : t ∈ R}.
Then the image of l under T is also a line.

Proof. We must show that T(l) is a line. Since T is assumed to be a linear transformation, we can write
T(tv + a) = T(tv) + T(a) = tT(v) + T(a).
Since T is a vector transformation from R^n to R^n, T(v) ∈ R^n and T(a) ∈ R^n. Therefore {tT(v) + T(a) : t ∈ R} is a line; that is, T transforms the line passing through the vector a in the direction of v into the line passing through the vector T(a) in the direction of T(v).

In the exercises, you will be asked to show related consequences of linearity. Specifically, linear transformations transform parallel lines into parallel lines, line segments into line segments, and squares into parallelograms.
Component Functions and Linear Transformations

Many of the vector spaces you have studied in this course are spaces of tuples, that is, vector spaces K^n, where K is the set of real numbers R or some finite field such as Z_5. Later in this text, we will show that non-tuple, finite dimensional vector spaces are structurally equivalent to spaces of tuples of the same dimension. Hence, any insight regarding linear transformations between spaces of tuples is of particular importance to us.

As you may have noticed in the activities, any function H : K^n → K^m can be decomposed into a set of component functions. For example, the function T : (Z_3)^2 → (Z_3)^2, given in Activity 1 and defined by T(⟨a, b⟩) = ⟨ab, 2b⟩, can be expressed in the form
T(⟨a, b⟩) = ⟨t_1(⟨a, b⟩), t_2(⟨a, b⟩)⟩,
where the function t_1 : (Z_3)^2 → Z_3, defined by
t_1(⟨a, b⟩) = ab,
corresponds to the expression in the first component, and the function t_2 : (Z_3)^2 → Z_3, defined by
t_2(⟨a, b⟩) = 2b,
corresponds to the expression given in the second component. Can you provide similar descriptions for the func F defined in Activity 2 and the funcs described in Activity 5?

Consider the transformation defined in Activity 2. In this example, you may have noticed that the expression for each component function is a linear combination of the components of the input vector. Does this characteristic appear to hold true for those funcs in Activity 5 that you deemed to be linear? Is this true for the func defined in Activity 1, a transformation which you discovered was not linear? What about the non-linear funcs defined in Activity 5: is each component a linear combination of the components of the input vector, or is there at least one component for which this fails?

As the next theorem illustrates, the patterns you discovered in the activities can be generalized.
Theorem 5.1.2. A function T : K^n → K^m given by
T(u) = ⟨f_1(u), f_2(u), . . . , f_m(u)⟩
is a linear transformation if and only if each component function f_i : K^n → K, i = 1, 2, . . . , m, is given by
f_1(u) = f_1(⟨u_1, u_2, . . . , u_n⟩) = a_11 u_1 + a_12 u_2 + · · · + a_1n u_n
f_2(u) = f_2(⟨u_1, u_2, . . . , u_n⟩) = a_21 u_1 + a_22 u_2 + · · · + a_2n u_n
⋮
f_m(u) = f_m(⟨u_1, u_2, . . . , u_n⟩) = a_m1 u_1 + a_m2 u_2 + · · · + a_mn u_n,
where each a_ij, i = 1, 2, . . . , m, j = 1, 2, . . . , n, is a scalar.
Proof. (⇐) Let u = ⟨u_1, u_2, . . . , u_n⟩ ∈ K^n. Then
T(u) = T(⟨u_1, u_2, . . . , u_n⟩) = ⟨f_1(⟨u_1, u_2, . . . , u_n⟩), f_2(⟨u_1, u_2, . . . , u_n⟩), . . . , f_m(⟨u_1, u_2, . . . , u_n⟩)⟩,
where we assume that each component function is
f_1(u) = f_1(⟨u_1, u_2, . . . , u_n⟩) = a_11 u_1 + a_12 u_2 + · · · + a_1n u_n
f_2(u) = f_2(⟨u_1, u_2, . . . , u_n⟩) = a_21 u_1 + a_22 u_2 + · · · + a_2n u_n
⋮
f_m(u) = f_m(⟨u_1, u_2, . . . , u_n⟩) = a_m1 u_1 + a_m2 u_2 + · · · + a_mn u_n,
and each a_ij ∈ K, i ∈ {1, . . . , m}, j ∈ {1, . . . , n}, is a scalar. To establish that T is a linear transformation, it suffices to show that each component function f_i : K^n → K, i = 1, 2, . . . , m, is linear. The details are left as an exercise; see Exercise 18.

(⇒) Assume that T is a linear transformation. Let u = ⟨u_1, u_2, . . . , u_n⟩ ∈ K^n, and rewrite u as the sum
u = ⟨u_1, u_2, . . . , u_n⟩ = ⟨u_1, 0, . . . , 0⟩ + ⟨0, u_2, 0, . . . , 0⟩ + · · · + ⟨0, . . . , 0, u_n⟩.
Since we are assuming that T is a linear transformation,
T(u) = T(⟨u_1, 0, . . . , 0⟩) + T(⟨0, u_2, 0, . . . , 0⟩) + · · · + T(⟨0, . . . , 0, u_n⟩).
Since each vector ⟨0, . . . , 0, u_j, 0, . . . , 0⟩ has a single nonzero component, each component function f_i : K^n → K, i = 1, . . . , m, behaves like a single-variable function that accepts u_j and returns a scalar; specifically,
f_i(⟨0, . . . , 0, u_j, 0, . . . , 0⟩) = a_ij u_j,
where a_ij ∈ K. If we take the sum over all j, we obtain the desired result for each component function f_i. The details are left to the exercises; see Exercise 19.

In the second part of the proof, (⇒), there is an assumption being made. Can you identify what that assumption is? Can you state and prove a theorem that would address this assumption?
Non-Tuple Vector Spaces

Throughout this section, we have considered transformations between vector spaces of tuples. In other chapters, you have been introduced to other, non-tuple examples. Although these examples are familiar, you have been asked to think about them in a new context. For instance, consider the set C^∞(R) of all infinitely differentiable functions from R to R. This is a vector space with scalars in R. Do you remember what each vector in this space looks like? Do you recall how the addition and scalar multiplication operations are defined?

In Activity 9, you were asked to determine whether the second derivative operator is linear. What did you observe? If we define a function F : C^∞(R) → C^∞(R) by
F(f) = D²(f) + f,
where f ∈ C^∞(R), can we say that F is a linear transformation on C^∞(R)? If so, can you prove that F is linear? If not, can you explain which condition of linearity is being violated?

Another class of non-tuple vector spaces consists of sets of polynomial functions organized by degree. For instance,
PF_3(R) = {x ↦ a_0 + a_1 x + a_2 x² + a_3 x³ : a_0, a_1, a_2, a_3 ∈ R},
the set of all polynomial functions of degree three or less with real-valued coefficients, is a vector space. What did you find in Activities 10 and 11? Is the definite integral defined in Activity 10 linear? What about the function J?

Several of the exercises will ask you to make similar determinations. In particular, when given a function between two non-tuple vector spaces, how does one determine whether the given function satisfies the two conditions of linearity specified in Definition 5.1.1?
Exercises
1. Explain why each of the following functions from R^2 to R^2 is not a linear transformation.
(a) T_1(⟨x, y⟩) = ⟨3y + 1, x + y⟩
(b) T_2(⟨x, y⟩) = ⟨xy, 3x + 4y⟩
(c) T_3(⟨x, y⟩) = ⟨y/4, 2y + x + 4⟩
(d) T_4(⟨x, y⟩) = ⟨3x + 2y − 4, 5x − y + 7⟩
2. Define T : R^2 → R by T(⟨x, y⟩) = xy. Determine whether T is a linear transformation. If the function is linear, use the definition to prove it; if the function is not linear, explain why the definition fails.
3. Define T : R^3 → R^5 by
T(⟨u_1, u_2, u_3⟩) = ⟨3u_1 + 2u_2 − u_3, u_1 − 2u_2 − u_3, 0, u_2 + 5u_3, 3u_1 + 3u_2⟩.
Use the definition to show that T is a linear transformation.
4. Define T : R^n → R by
T(u) = a · u,
where a = ⟨a_1, a_2, . . . , a_n⟩ ∈ R^n, u is any vector in R^n, and (·) represents the dot product on R^n. Use the definition to show that T is a linear transformation. What is the significance of T as it relates to the topic of the dot product in multivariable calculus?
5. Determine whether each function given below is linear.
(a) D : P_3(Z_5) → P_2(Z_5) defined by
D(a_3 x³ + a_2 x² + a_1 x + a_0) = 3a_3 x² + 2a_2 x + a_1.
(b) T : P_3(R) → R defined by
T(a_3 x³ + a_2 x² + a_1 x + a_0) = a_3 − a_2 + a_0.
(c) G : P_3(R) → R defined by
G(a_3 x³ + a_2 x² + a_1 x + a_0) = 2a_3 − a_2 + a_1.
(d) H : P_3(Z_5) → P_3(Z_5) defined by
H(a_3 x³ + a_2 x² + a_1 x + a_0) = a_3 x³ + a_2 x² + a_1 x + a_0 + 3.
(e) S : P_3(R) → P_3(R) defined by
S(a_3 x³ + a_2 x² + a_1 x + a_0) = a_3 (x + 2)³ + a_2 (x + 2)² + a_1 (x + 2) + a_0.
6. Determine whether each function given below is linear.
(a) D : PF_3(R) → PF_2(R) defined by
D(p) = p′,
where p′ is the derivative of p.
(b) T : PF_3(R) → R defined by
T(p) = ∫₀¹ p(t) dt.
(c) J : PF_3(R) → PF_4(R) defined by
J(p) = x ↦ ∫₂ˣ p(t) dt.
7. Let V be the vector space of all functions from R to R whose domain is all of R. Let T : C^∞(R) → V be defined by
T(f) = F,
where F is a function such that D(F) = f and F(0) = k ≠ 0. Use the definition to show that T is not a linear transformation. In order for T to be a linear transformation, we would have to limit ourselves to a certain subspace of C^∞(R). What subspace would that be?
8. Let g : R → R be a differentiable function. Let a ∈ R. Define G : R → R by
G(x) = D(g)(a) · (x − a),
where D denotes the derivative function. Show that G is a linear transformation.
9. Let C^∞(R) be the vector space of all functions that are infinitely differentiable. Let D : C^∞(R) → C^∞(R) be the derivative function, that is, D(g) = g′ for g ∈ C^∞(R). Define P : C^∞(R) → R by
P(g) = D(g)(1/2),
where g ∈ C^∞(R), and P(g) represents the derivative of g evaluated at the point 1/2. Use the definition to show that P is a linear transformation.
10. Define f : R^3 → R by
f(⟨x_1, x_2, x_3⟩) = x_1 + x_2² + x_1² x_3.
Define F : R^3 → R by
F(⟨x_1, x_2, x_3⟩) = ⟨D_x1(f)(2, 1, 1), D_x2(f)(2, 1, 1), D_x3(f)(2, 1, 1)⟩ · ⟨x_1 − 2, x_2 − 1, x_3 − 1⟩,
where D_xi, i = 1, 2, 3, denotes the partial derivative function with respect to the variable x_i, and (·) denotes the dot product. Show that F is a linear transformation.
11. Let f : R^n → R be a function whose partial derivatives exist. Let (a_1, a_2, . . . , a_n) ∈ R^n. Define F : R^n → R by
F(⟨x_1, x_2, . . . , x_n⟩) = ⟨D_x1(f)(a_1, . . . , a_n), D_x2(f)(a_1, . . . , a_n), . . . , D_xn(f)(a_1, . . . , a_n)⟩ · ⟨x_1 − a_1, x_2 − a_2, . . . , x_n − a_n⟩,
where D_xi, i = 1, 2, . . . , n, denotes the partial derivative function with respect to the variable x_i, and (·) denotes the dot product. Show that F is a linear transformation.
12. Use the definition of a linear transformation to prove that a rotation is a linear transformation.
13. Use the definition of a linear transformation to prove that a reflection through a line passing through the origin is a linear transformation.
14. Use the definition of a linear transformation to prove that a translation is not a linear transformation.
15. A line segment passing through the vector a in the direction v is given by the set {tv + a : t ∈ [0, 1]}.
(a) Define T : R^2 → R^2 by
T(⟨u_1, u_2⟩) = ⟨2u_1 + 3u_2, u_1 + u_2⟩.
Show symbolically and geometrically that the line segment given by
{t⟨1, 2⟩ + ⟨3, 4⟩ : t ∈ [0, 1]}
is transformed into a line segment. Find the direction vector v and the vector a through which the image segment passes.
(b) Let T : K^n → K^n be a linear transformation. Show that the image of a line segment given by {tv + a : t ∈ [0, 1]} is also a line segment.
16. Consider the line l in R^3 given by l = {t⟨2, 1, 3⟩ + ⟨1, 0, 2⟩ : t ∈ R}.
(a) Find the form of the line l′ that passes through the vector ⟨0, 1, 1⟩ and is parallel to l.
(b) Define T : R^3 → R^3 by
T(⟨u_1, u_2, u_3⟩) = ⟨2u_1 + 3u_2, u_1 + u_2, u_2 + u_3⟩.
Show symbolically and geometrically that T(l) and T(l′) are parallel.
(c) Let T : K^n → K^n be a linear transformation. Let l be a line passing through a_l in the direction v. Let l′ be a line parallel to l that passes through the vector a_l′. Show that T(l) and T(l′) are parallel.
17. Let T : K^n → K^n be a linear transformation. Let l be a line in K^n that passes through the origin. Show that T(l) passes through the origin.
18. Complete the proof of the first part (⇐) of Theorem 5.1.2.
19. Complete the proof of the second part (⇒) of Theorem 5.1.2. What assumption is being made in that part of the proof? State this assumption as a theorem, and provide a proof.
20. Let T : U → V be a linear transformation, where U and V are vector spaces with scalars in K. Use induction to show that
T(cu_1 + du_2) = cT(u_1) + dT(u_2),   u_1, u_2 ∈ U, c, d ∈ K,
extends to a combination of any size; in particular, show that
T(a_1 v_1 + a_2 v_2 + · · · + a_n v_n) = a_1 T(v_1) + a_2 T(v_2) + · · · + a_n T(v_n),
where v_1, . . . , v_n ∈ U, a_1, . . . , a_n ∈ K, and n ≥ 1.
5.2 Kernel and Range
Activities
1. Use UKn, defined in Activity 5 of Section 4.1, and the func LC, which you constructed in Activity 5 of Section 4.2, to construct the set SOL of all 5-tuples that simultaneously satisfy each of the four equations in Z_5 listed below.
x_1 + x_2 + x_5 = 0
x_2 + x_3 + x_5 = 0
x_3 + x_4 + x_5 = 0
x_1 + x_4 + x_5 = 0
2. Let T : (Z_5)^5 → (Z_5)^4 be a linear transformation defined as follows:
• The first component of the output is the sum of the first, second, and last components of the input.
• The second component of the output is the sum of the second, third, and last components of the input.
• The third component of the output is the sum of the third, fourth, and last components of the input.
• The last component of the output is the sum of the first, fourth, and last components of the input.
(a) Write an ISETL statement that constructs the set, called KER, of all vectors u in (Z_5)^5 such that T(u) = 0. (One possible shape for T and KER is sketched after these activities.)
(b) Is KER a subspace of (Z_5)^5? Use is_subspace to determine this.
3. Complete parts (a)–(d) as a means of determining the relationship between the sets KER and SOL.
(a) Determine whether SOL is a subset of KER. Use the ISETL code SOL Subset KER to make this determination.
(b) Determine whether KER is a subset of SOL. Use the ISETL code KER Subset SOL to make this determination.
(c) Based only upon your findings in (a) and (b), which of the following appears to be true?
• Every element of SOL is also an element of KER, but there is at least one element of KER that is not an element of SOL.
• Every element of KER is also an element of SOL, but there is at least one element of SOL that is not an element of KER.
• KER is neither a subset of SOL, nor is SOL a subset of KER.
• The sets SOL and KER are equal.
4. Find a basis for the solution set from Activity 1. What is the dimension of KER? Explain your reasoning.
5. Construct the set IMAGESPACE of all vectors of the form T(u), where u is in (Z_5)^5 and T is the func you constructed in Activity 2.
6. Take T from Activity 2, and rewrite the output vector in the form of a matrix product; that is, express T in the form T(u) = A·u, where A is a 4 × 5 matrix whose entries in column j are the coefficients of the j-th component of the input in the expression for the output vector. Then complete the following steps.
(a) Think of each column of A as a vector in (Z_5)^4. Apply the func All_LC you constructed in Activity 4 of Section 4.1 to find the set COLS of all linear combinations of the columns of A.
(b) Use ISETL to determine whether IMAGESPACE is a subset of COLS, and vice versa. What is the relationship between the subspace generated by the columns of the matrix A, given by COLS, and the set IMAGESPACE?
(c) Apply T to the coordinate basis
{⟨1, 0, 0, 0, 0⟩, ⟨0, 1, 0, 0, 0⟩, ⟨0, 0, 1, 0, 0⟩, ⟨0, 0, 0, 1, 0⟩, ⟨0, 0, 0, 0, 1⟩}
of (Z_5)^5. Compare the image of each basis vector with the columns of A. What do you observe?
7. Take the set you constructed in Activity 4, and extend it to a basis for all of (Z_5)^5. Apply T to each newly constructed basis vector. Apply All_LC to the set of all such T(u). Call the resulting set IMB. Use ISETL to determine the relationship between IMAGESPACE and IMB. Is IMAGESPACE a subset of IMB? Is IMB a subset of IMAGESPACE?
8. What is the relationship between the dimensions of (Z_5)^5, KER, and IMAGESPACE? Make a conjecture about a general relation between these dimensions.
9. Consider AX = B, where A is the matrix of coefficients of the left side of the system in Activity 1, B is a 4 × 1 matrix representing the constants on the right-hand side of the equals sign, and X is the column vector with entries x_1, x_2, x_3, x_4, x_5.
(a) Determine the solution set; that is, find all vectors X such that AX = B. Select one such solution, and call it v_p.
(b) Determine the solution set of AX = 0, where 0 denotes the 4 × 1 matrix whose entries are all zero. Find a particular solution, and call it v_0.
(c) Does the vector v_p + v_0 form a solution of AX = B? Select a different solution to AX = 0, replace v_0 with this new solution, and check whether v_p + v_0 is a solution to AX = B.
(d) Can you find a solution of AX = B that is not of the form v_p + v_0?
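Here is one possible shape for the func T of Activity 2 and the set KER of Activity 2(a), written as a sketch. It assumes that vectors in (Z_5)^5 are plain ISETL tuples, that arithmetic is done with mod, and that UKn from Activity 1 is the set of all such 5-tuples; if your setup represents these differently, adjust the names accordingly.

    $ Sketch of T from Activity 2, with arithmetic mod 5.
    T := func(u);
        return [(u(1) + u(2) + u(5)) mod 5,
                (u(2) + u(3) + u(5)) mod 5,
                (u(3) + u(4) + u(5)) mod 5,
                (u(1) + u(4) + u(5)) mod 5];
    end;

    $ KER: all u in (Z_5)^5 sent to the zero vector of (Z_5)^4.
    KER := {u : u in UKn | T(u) = [0, 0, 0, 0]};

    $ The set IMAGESPACE of Activity 5 can be built in the same spirit.
    IMAGESPACE := {T(u) : u in UKn};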
Discussion
The Kernel of a Linear Transformation
Let T : U → V be a linear transformation between two vector spaces U and V. In this section, we will introduce two subspaces that are associated with a linear transformation: the kernel, which is a subspace of U, and the image space, which is a subspace of V.

Definition 5.2.1. Let U and V be vector spaces with scalars in K, and let T : U → V be a linear transformation. The kernel of T, denoted ker(T), is the set of all vectors in U that are mapped to the zero vector in V under T. In symbols,
ker(T) = {u ∈ U : T(u) = 0_V}.

[Figure 5.6: The kernel of T, the subset of U carried by T to 0_V.]

What is the relationship between the kernel of a linear transformation and the set KER you constructed in Activity 2(a)? What did you find in part (b), when you applied the func is_subspace to KER? Before going on to the next theorem, think about whether the kernel of a linear transformation is a subspace.

Theorem 5.2.1. Let U and V be vector spaces with scalars in K, and let T : U → V be a linear transformation. The kernel of T, ker(T), is a subspace of U.

Proof. In order to show that ker(T) is a subspace of U, we must show that the sum of any two vectors in ker(T) is an element of ker(T) and that any scalar product of a vector in ker(T) is an element of ker(T). The details are left as an exercise. See Exercise 9.

Every subspace of a vector space contains the zero vector as an element. Since ker(T) is a subspace of the domain, the zero vector must be an element of the kernel. Consequently, the zero vector of the domain is mapped to the zero vector of the range space. Is this true for any function?
Theorem 5.2.2. Let U and V be vector spaces with scalars in K, and let T : U → V be a linear transformation. The zero vector of U is mapped to the zero vector of V under T; that is, T(0_U) = 0_V, where 0_U denotes the zero vector in U and 0_V denotes the zero vector in V.

Proof. See Exercise 10.
In Activity 3, you were asked to compare the sets SOL and KER. If you compare the system of equations given in Activity 1 and the expression for T described in Activity 2, you will notice that the left-hand side of each equation is one of the components of the expression for T. This suggests that the kernel of a linear transformation between spaces of tuples can be found by solving a homogeneous system of equations. Generally speaking, if T : K^n → K^m is given by
T(⟨x_1, x_2, . . . , x_n⟩) = ⟨a_11 x_1 + a_12 x_2 + · · · + a_1n x_n, a_21 x_1 + a_22 x_2 + · · · + a_2n x_n, . . . , a_m1 x_1 + a_m2 x_2 + · · · + a_mn x_n⟩,
then the kernel, defined to be the set
ker(T) = {⟨x_1, x_2, . . . , x_n⟩ ∈ K^n : T(⟨x_1, x_2, . . . , x_n⟩) = ⟨0, 0, . . . , 0⟩},
is the solution set of the system
a_11 x_1 + a_12 x_2 + · · · + a_1n x_n = 0
a_21 x_1 + a_22 x_2 + · · · + a_2n x_n = 0
⋮
a_m1 x_1 + a_m2 x_2 + · · · + a_mn x_n = 0.
Conversely, the solution set in K^n of the homogeneous system
b_11 x_1 + b_12 x_2 + · · · + b_1n x_n = 0
b_21 x_1 + b_22 x_2 + · · · + b_2n x_n = 0
⋮
b_m1 x_1 + b_m2 x_2 + · · · + b_mn x_n = 0
is the kernel of the linear transformation T : K^n → K^m defined by
T(⟨x_1, x_2, . . . , x_n⟩) = ⟨b_11 x_1 + b_12 x_2 + · · · + b_1n x_n, b_21 x_1 + b_22 x_2 + · · · + b_2n x_n, . . . , b_m1 x_1 + b_m2 x_2 + · · · + b_mn x_n⟩.
This explains why KER and SOL were equal. Since ker(T) is a subspace of U, the relationship shown above suggests that the solution set of a homogeneous system is also a subspace. This is indeed the case.
Theorem 5.2.3. The solution set of a homogeneous system of m equations in n unknowns is a subspace of the vector space V = K^n.

Proof. Let
a_11 x_1 + a_12 x_2 + · · · + a_1n x_n = 0
a_21 x_1 + a_22 x_2 + · · · + a_2n x_n = 0
⋮
a_m1 x_1 + a_m2 x_2 + · · · + a_mn x_n = 0
be a homogeneous system of m equations in n unknowns over K. Suppose that u = ⟨u_1, u_2, . . . , u_n⟩ and v = ⟨v_1, v_2, . . . , v_n⟩ are solutions. Then, for every i = 1, 2, . . . , m,
a_i1(u_1 + v_1) + a_i2(u_2 + v_2) + · · · + a_in(u_n + v_n)
  = a_i1 u_1 + a_i1 v_1 + a_i2 u_2 + a_i2 v_2 + · · · + a_in u_n + a_in v_n
  = [a_i1 u_1 + a_i2 u_2 + · · · + a_in u_n] + [a_i1 v_1 + a_i2 v_2 + · · · + a_in v_n] = 0,
which shows that u + v is a solution of the system. In a similar manner, we can show that if c ∈ K is a scalar and v is a solution, then the scalar product cv is a solution. See Exercise 11.

If U and V are not vector spaces of tuples, the kernel of a linear transformation may not be representable as the solution set of a homogeneous system of equations. Even though no such correspondence exists, Definition 5.2.1 and Theorems 5.2.1 and 5.2.2 still apply.
The Image Space of a Linear Transformation

Let T : U → V be a linear transformation between the vector spaces U and V. We will call the set of all inputs U the domain space of T. The vector space V will be referred to as the range space of T. The set of all vectors in V that are assigned to at least one vector in U under T is the image space of T. We denote the image space of T by
T(U) = {v ∈ V : there exists u ∈ U such that T(u) = v}.
The term image refers to the output of a single vector: if u ∈ U, then v = T(u) is the image of u under T. In Activity 5, you constructed the set IMAGESPACE, which is, in fact, the image space of the T defined in Activity 2. Like the kernel of T, this set is a subspace of V.

Theorem 5.2.4. Let U and V be vector spaces with scalars in K. Let T : U → V be a linear transformation. The image space of T, T(U), is a subspace of V.

Proof. In order to show that T(U) is a subspace of V, we must show that the sum of any two vectors in T(U) is equal to a vector in T(U) and that any scalar product of a vector in T(U) is equal to a vector in T(U). The details are left as an exercise. See Exercise 13.

In the last subsection, we showed that there is a correspondence between the kernel of a linear transformation between spaces of tuples and the solution set of a homogeneous system of equations. There is a similar correspondence between systems and the image space. For example, if T : K^n → K^m is defined by
T(⟨x_1, x_2, . . . , x_n⟩) = ⟨a_11 x_1 + a_12 x_2 + · · · + a_1n x_n, a_21 x_1 + a_22 x_2 + · · · + a_2n x_n, . . . , a_m1 x_1 + a_m2 x_2 + · · · + a_mn x_n⟩,
then v = ⟨v_1, v_2, . . . , v_m⟩ ∈ T(K^n) if and only if the non-homogeneous system
a_11 x_1 + a_12 x_2 + · · · + a_1n x_n = v_1
a_21 x_1 + a_22 x_2 + · · · + a_2n x_n = v_2
⋮
a_m1 x_1 + a_m2 x_2 + · · · + a_mn x_n = v_m
has a solution. If there were no solution, could we say that v ∈ T(K^n)?

[Figure 5.7: The image space of T; the set of solutions of T(u) = v in K^n is carried by T onto v in the image space inside K^m.]

If U and V are not vector spaces of tuples, an element of the image space may not correspond to the solution set of a particular system of equations. However, the existence of a solution to T(u) = v and the presence of v in T(U) amount to the same thing.
Bases for the Kernel and Image Space

In Activity 4, you constructed a basis for SOL. In Activity 7, you extended this set to a basis for the entire space (Z_5)^5. What theorem from Chapter 4 allowed you to do this? After having found the image T(u) of each new basis vector, you applied the func All_LC to the set of these images. Did this set generate the set IMAGESPACE? Is this set independent? If so, can you describe a procedure for finding a basis of the image space T(K^n) of a linear transformation T : K^n → K^m? If not, can you explain how things break down?

Since the kernel and image space of a linear transformation are subspaces, we can talk about the dimension of each. After having analyzed the relationship between the dimensions of (Z_5)^5, IMAGESPACE, and KER in Activity 8, what did you conjecture, in general, regarding the relationship between the dimensions of K^n, ker(T), and T(K^n) for a linear transformation T : K^n → K^m?

Before considering the theorem that addresses your conjecture, we need to introduce two terms, rank and nullity, that are used in that theorem. In Chapter 3, you learned that a system of equations, such as
a_11 x_1 + a_12 x_2 + · · · + a_1n x_n = c_1
a_21 x_1 + a_22 x_2 + · · · + a_2n x_n = c_2
⋮
a_m1 x_1 + a_m2 x_2 + · · · + a_mn x_n = c_m,
can be represented as a matrix equation

[ a_11 a_12 . . . a_1n ]   [ x_1 ]   [ c_1 ]
[ a_21 a_22 . . . a_2n ] · [ x_2 ] = [ c_2 ]
[  ⋮    ⋮          ⋮  ]   [  ⋮  ]   [  ⋮  ]
[ a_m1 a_m2 . . . a_mn ]   [ x_n ]   [ c_m ]

The solution of such a system can be found by transforming the augmented matrix

[ a_11 a_12 . . . a_1n  c_1 ]
[ a_21 a_22 . . . a_2n  c_2 ]
[  ⋮    ⋮          ⋮    ⋮  ]
[ a_m1 a_m2 . . . a_mn  c_m ]

into reduced echelon form. If you recall, the rank of an augmented matrix, or of any matrix for that matter, is defined to be the number of nonzero rows appearing in its reduced echelon form. In Activity 6, you showed that the image of each coordinate basis vector under T is a column of the matrix A. The columns of A generate the set IMAGESPACE. The rank of A, as defined in Activity 6, turns out to be equal to the dimension of IMAGESPACE, the image space of T. This link is the basis for the use of the term rank in Theorem 5.2.5. If you apply elementary row operations to A in Activity 6, can you verify that its rank is equal to the dimension of IMAGESPACE?

The term nullity refers to the dimension of the null space of a linear transformation. The null space is simply the kernel; some texts elect to refer to the kernel as the null space. It would be wise for you to be familiar with both terms.
Theorem 5.2.5 (Rank and Nullity). Let U and V be finite dimensional vector spaces with scalars in K, and let T : U → V be a linear transformation. Then
dim[ker(T)] + dim[T(U)] = dim(U).

Proof. Suppose dim(U) = n and dim[ker(T)] = m. Let
{u_1, u_2, . . . , u_m}
be a basis for ker(T). By Theorem 4.4.10, we can extend this linearly independent set to a basis
{u_1, u_2, . . . , u_m, u_{m+1}, u_{m+2}, . . . , u_n}
for U. We will show that the set
{T(u_{m+1}), T(u_{m+2}), . . . , T(u_n)}
is a basis for T(U).

First, we show that this set spans T(U). Let v ∈ T(U). Then there exists u ∈ U such that T(u) = v. We can write u as a linear combination of the given basis for U. Grouping terms, we have
u = x + y,
where x is a linear combination of u_1, u_2, . . . , u_m and y is a linear combination of u_{m+1}, u_{m+2}, . . . , u_n. Since {u_1, u_2, . . . , u_m} is a basis for ker(T), x ∈ ker(T). Therefore
v = T(u) = T(x + y) = T(x) + T(y) = T(y).
Since y is a linear combination of u_{m+1}, u_{m+2}, . . . , u_n, it follows from the linearity of T that T(y), and hence v, is a linear combination of T(u_{m+1}), T(u_{m+2}), . . . , T(u_n). Therefore {T(u_{m+1}), . . . , T(u_n)} forms a spanning set for T(U).

Next, we must show that the set {T(u_{m+1}), . . . , T(u_n)} is linearly independent. Suppose that
a_{m+1} T(u_{m+1}) + a_{m+2} T(u_{m+2}) + · · · + a_n T(u_n) = 0_V.
Since T is a linear transformation, we can write
T(z) = 0_V,
where z = a_{m+1} u_{m+1} + a_{m+2} u_{m+2} + · · · + a_n u_n. By the definition of kernel, z ∈ ker(T). Since {u_1, u_2, . . . , u_m} is a basis for ker(T), there exists a linear combination of these vectors, say c_z, such that z = c_z. We can rewrite this expression as
c_z − z = 0_U.
Hence, we have a linear combination of u_1, . . . , u_m, u_{m+1}, . . . , u_n set equal to the zero vector. Since this set is a basis, it follows that the scalars must all be zero. Therefore
a_{m+1} = a_{m+2} = · · · = a_n = 0.
As a result, we can conclude that the set {T(u_{m+1}), T(u_{m+2}), . . . , T(u_n)} is linearly independent, and hence it is a basis for T(U). According to the definition of dimension, we can conclude that dim[T(U)] = n − m, which is what we wished to prove.
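As a quick concrete check of the theorem (our own example, reusing the transformation T : R^3 → R^3 from Section 5.1 given by T(⟨u_1, u_2, u_3⟩) = ⟨u_1 + u_2, u_2 − u_3, u_1 + u_3⟩): solving T(u) = ⟨0, 0, 0⟩ gives u_2 = u_3 and u_1 = −u_2, so ker(T) = {t⟨−1, 1, 1⟩ : t ∈ R}, which has dimension 1. The image T(R^3) is spanned by T(⟨1, 0, 0⟩) = ⟨1, 0, 1⟩, T(⟨0, 1, 0⟩) = ⟨1, 1, 0⟩, and T(⟨0, 0, 1⟩) = ⟨0, −1, 1⟩; the third equals the first minus the second, so the image has dimension 2. Indeed, 1 + 2 = 3 = dim(R^3).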
Theorem 5.2.5 can be used to tie together the notions of invertibility, one-to-one, and onto. A function is one-to-one if every vector in the image space is assigned to one and only one vector in the domain space. In symbols, a linear transformation T : U → V is one-to-one if, for u_1, u_2 ∈ U such that u_1 ≠ u_2, it follows that T(u_1) ≠ T(u_2). This can be stated equivalently as: T(u_1) = T(u_2) implies that u_1 = u_2.

A function is defined to be onto if the image space is equal to the range space. In other words, every element of the range space is the image of some element of the domain. For a linear transformation T, this means that if v ∈ V, then there exists u ∈ U such that T(u) = v.

If you recall from calculus, a function f : D → R, where D denotes the domain space and R represents the range space, has an inverse f⁻¹ : f(D) → D, where f(D) indicates the image space of f, if f⁻¹(f(a)) = a for all a ∈ D and f(f⁻¹(b)) = b for all b ∈ f(D).

[Figure 5.8: A one-to-one transformation.]

[Figure 5.9: An onto transformation, with the image space of T filling V.]

The underlying feature which guarantees the existence of f⁻¹ is that each element of f(D) is assigned to one and only one element of D under f⁻¹. Can you recall why the function f : R → R defined by f(x) = x², whose graph is given below, does not have an inverse? Can you explain this using the graph? Are you able to reason this using the algebraic expression for the function?

[Figure 5.10: The graph of f(x) = x².]

Can you give an example of a function from calculus whose inverse exists? Can you justify your answer using the graph? Using the algebraic expression? Working with these examples should remind you that a function has an inverse if and only if the function is one-to-one. Can you prove this? For linear transformations, we can make an even stronger statement.
Theorem 5.2.6. Let T : U → U be a linear transformation, where U is a finite dimensional vector space. The following statements are equivalent.
1. T is one-to-one.
2. T is onto.
3. T is invertible; that is, there exists T⁻¹ : U → U such that
T⁻¹(T(u)) = u and T(T⁻¹(u)) = u
for every u ∈ U.

Proof. Throughout the proof, assume that dim(U) = n.

(1 ⟹ 2:) We first show that 1 implies 2. We assume that T is one-to-one and use this to show that T is onto. Since T is one-to-one, the kernel of T, ker(T), consists only of the zero vector. Why? Therefore dim(ker(T)) = 0. By Theorem 5.2.5, dim(ker(T)) + dim(T(U)) = n, which implies that dim(T(U)) = n. Since T(U) is a subspace of U, and since both have dimension n, we can conclude that T(U) = U; the image space is equal to the range space. Hence T is onto.

(2 ⟹ 3:) We prove that 2 implies 3. We assume that T is onto and use this to show that T is invertible. Since T is onto, the image space T(U) is equal to the range space U. Therefore, by Theorem 5.2.5, the kernel of T, ker(T), consists only of the zero vector. In order to define an inverse transformation, the pre-image of each vector v, that is, the set {u ∈ U : T(u) = v}, can consist of but a single vector. Suppose that there exist u_1, u_2 ∈ U such that T(u_1) = v = T(u_2). By linearity,
T(u_1) = T(u_2)  ⟹  T(u_1 − u_2) = 0_U  ⟹  u_1 − u_2 ∈ ker(T).
Since ker(T) = {0_U}, u_1 − u_2 = 0_U, or u_1 = u_2: there is but one vector whose image is v. Hence, we can define an inverse function.

(3 ⟹ 1:) We prove that 3 implies 1. We assume that T is invertible and use this to show that T is one-to-one. Suppose that
T(u_1) = T(u_2)
for some u_1, u_2 ∈ U. Since T⁻¹ exists, we can write
T⁻¹(T(u_1)) = T⁻¹(T(u_2))  ⟹  u_1 = u_2,
which is what we wished to prove.

For a linear transformation with the same domain and range spaces, this theorem establishes the logical equivalence of one-to-one, onto, and invertibility: 1 if and only if 2; 1 if and only if 3; and 2 if and only if 3. Although we did not directly prove every implication (for example, "if 2 then 1"), each of the remaining conditional statements holds; "if 2 then 1" holds because 2 implies 3 and 3 implies 1. Using a similar strategy, can you explain why 1 ⟺ 3 and 2 ⟺ 3 are true?

From a practical standpoint, the presence of one condition gives the others as a consequence. So, if T : U → U is a linear transformation that is one-to-one, then T is also both onto and invertible. Speaking of linearity, where was the requirement that T be linear used in the proof? And, moreover, if T⁻¹ exists, is it a linear transformation?

In the next chapter, we will prove that every linear transformation between finite dimensional vector spaces has a matrix representation. This suggests that we need to consider the issue of rank. What is the relationship, if any, between the rank of a linear transformation T : U → U, the dimension of U, and whether T is one-to-one, onto, and invertible? Could Theorem 5.2.6 be expanded to include "rank(T) = dim(U)" as a fourth equivalent condition?
The General Form of a System of Linear Equations

A system of m equations in n unknowns over a field K, say
a_11 x_1 + a_12 x_2 + · · · + a_1n x_n = b_1
a_21 x_1 + a_22 x_2 + · · · + a_2n x_n = b_2
⋮
a_m1 x_1 + a_m2 x_2 + · · · + a_mn x_n = b_m,
can be written in terms of the product of a matrix by a vector:

[ a_11 a_12 . . . a_1n ]   [ x_1 ]   [ b_1 ]
[ a_21 a_22 . . . a_2n ] · [ x_2 ] = [ b_2 ]
[  ⋮    ⋮          ⋮  ]   [  ⋮  ]   [  ⋮  ]
[ a_m1 a_m2 . . . a_mn ]   [ x_n ]   [ b_m ]

The coefficients of the system are given by the matrix A, the unknowns by the vector X, and the constants by the vector B. Specifically, the system can be represented by the equation AX = B. The associated homogeneous system AX = 0 that is mentioned in Activity 9 is nothing more than the system

[ a_11 a_12 . . . a_1n ]   [ x_1 ]   [ 0 ]
[ a_21 a_22 . . . a_2n ] · [ x_2 ] = [ 0 ]
[  ⋮    ⋮          ⋮  ]   [  ⋮  ]   [ ⋮ ]
[ a_m1 a_m2 . . . a_mn ]   [ x_n ]   [ 0 ]

in which the constants b_1, b_2, . . . , b_m are replaced by zeros. Given a particular solution X = v_p of AX = B and a solution X = v_0 of the associated homogeneous system, what did you find in Activity 9 regarding the vector sum v_p + v_0? Is it a solution of AX = B? Can every solution of AX = B be written this way? How do your findings compare with the statement of the theorem given below?
Theorem 5.2.7. Let AX = B be a system of m equations in n unknowns over a field K. Then v_s is a solution of AX = B if and only if v_s = v_p + v_0, where v_p is a particular solution of AX = B and v_0 is a solution of the associated homogeneous system AX = 0.

Proof. (⟹) Define T : K^n → K^m by
T(⟨x_1, x_2, . . . , x_n⟩) = ⟨a_11 x_1 + a_12 x_2 + · · · + a_1n x_n, a_21 x_1 + a_22 x_2 + · · · + a_2n x_n, . . . , a_m1 x_1 + a_m2 x_2 + · · · + a_mn x_n⟩.
As demonstrated in the first three activities, the kernel of T can be found by solving the homogeneous system

[ a_11 a_12 . . . a_1n ]   [ x_1 ]   [ 0 ]
[ a_21 a_22 . . . a_2n ] · [ x_2 ] = [ 0 ]
[  ⋮    ⋮          ⋮  ]   [  ⋮  ]   [ ⋮ ]
[ a_m1 a_m2 . . . a_mn ]   [ x_n ]   [ 0 ]

Any solution v_s of the system

[ a_11 a_12 . . . a_1n ]   [ x_1 ]   [ b_1 ]
[ a_21 a_22 . . . a_2n ] · [ x_2 ] = [ b_2 ]
[  ⋮    ⋮          ⋮  ]   [  ⋮  ]   [  ⋮  ]
[ a_m1 a_m2 . . . a_mn ]   [ x_n ]   [ b_m ]

is a vector whose components, when substituted for the x_i, satisfy the system given above, as well as the equation T(x) = b, where b = ⟨b_1, b_2, . . . , b_m⟩. Let v_p denote one particular solution. Using these relationships and the linearity of T, we can write
T(v_s) = T(v_p)
T(v_s) − T(v_p) = 0
T(v_s − v_p) = 0.
Therefore v_s − v_p = v_0 ∈ ker(T). A rearrangement of terms gives us the desired result: v_s = v_p + v_0.

(⟸) Assume that v_s = v_p + v_0, where v_p is a particular solution of AX = B and v_0 is a solution of AX = 0. Then
A(v_s) = A(v_p + v_0) = A(v_p) + A(v_0) = B + 0 = B.
Therefore v_s is a solution of AX = B.
This theorem has two important interpretations. In R^3, the solution of
the system
\[
\begin{aligned}
x_1 - 2x_2 + 3x_3 &= 1 \\
3x_1 - 4x_2 + 5x_3 &= 3 \\
2x_1 - 3x_2 + 4x_3 &= 2
\end{aligned}
\]
is the set {t⟨1, 2, 1⟩ + ⟨1, 0, 0⟩ : t ∈ R}. This is the line through ⟨1, 0, 0⟩ in
the direction ⟨1, 2, 1⟩; ⟨1, 0, 0⟩ is a particular solution of the system. The
solution set of the associated homogeneous system
\[
\begin{aligned}
x_1 - 2x_2 + 3x_3 &= 0 \\
3x_1 - 4x_2 + 5x_3 &= 0 \\
2x_1 - 3x_2 + 4x_3 &= 0
\end{aligned}
\]
is given by {t⟨1, 2, 1⟩ : t ∈ R}, the line passing through the origin in the
direction ⟨1, 2, 1⟩. Hence, the solution set of the non-homogeneous system
can be represented as a translation of the solution set of the homogeneous
system.

Figure 5.11: General solution of a system
If we define T : R^3 → R^3 by
T(⟨x_1, x_2, x_3⟩) = ⟨x_1 − 2x_2 + 3x_3, 3x_1 − 4x_2 + 5x_3, 2x_1 − 3x_2 + 4x_3⟩,
the kernel is the solution set of the homogeneous system
\[
\begin{aligned}
x_1 - 2x_2 + 3x_3 &= 0 \\
3x_1 - 4x_2 + 5x_3 &= 0 \\
2x_1 - 3x_2 + 4x_3 &= 0.
\end{aligned}
\]
The vector ⟨b_1, b_2, b_3⟩ is in the image space if the non-homogeneous system
\[
\begin{aligned}
x_1 - 2x_2 + 3x_3 &= b_1 \\
3x_1 - 4x_2 + 5x_3 &= b_2 \\
2x_1 - 3x_2 + 4x_3 &= b_3
\end{aligned}
\]
has a solution. If ⟨p_1, p_2, p_3⟩ is a particular solution of the non-homogeneous
system, then ⟨b_1, b_2, b_3⟩ = T(t⟨1, 2, 1⟩ + ⟨p_1, p_2, p_3⟩): every vector in the image
space is the image under T of the sum of an element of the kernel and a
particular pre-image.
Non-Tuple Vector Spaces
As discussed earlier, we cannot necessarily find the kernel and the image
space of a linear transformation in non-tuple contexts by solving a corresponding
system of equations. However, the basic concept is the same. To
find the kernel of a linear transformation T : U → V, we find all vectors
u ∈ U that are solutions of the equation T(x) = 0_V. Similarly, v ∈ V is
in the image space of T if there exists u ∈ U such that T(u) = v. As with any
linear transformation between two finite dimensional vector spaces, Theorem 5.2.5
applies in non-tuple contexts. This is verified by the proof, which
was constructed independently of any specific form. In the exercises, you will
be asked to work with transformations between spaces of polynomials and
functions.
Exercises
1. Let T : R^3 → R^4 be a transformation given by
   T(⟨x_1, x_2, x_3⟩) = ⟨x_1 + 2x_3, 3x_2 − 4x_3, x_1 − x_2 + 4x_3, 2x_1 + x_2 − 4x_3⟩.
(a) Show that T is a linear transformation.
(b) Find the kernel of T.
(c) Construct a basis for the kernel of T.
(d) Construct a basis for the image space of T.
(e) Verify the rank and nullity theorem for this transformation.
2. Let
\[
\begin{aligned}
x_1 + 2x_2 - 3x_3 &= 0 \\
2x_1 + 4x_2 + 6x_3 &= 0 \\
2x_2 - x_3 &= 0 \\
4x_2 + 2x_3 &= 0
\end{aligned}
\]
be a homogeneous system of 4 equations in 3 unknowns.
(a) Find a basis for the solution set.
(b) Find a linear transformation T : R^3 → R^4 whose kernel is equal
to the solution set of the system of equations given here.
(c) Find a basis for the kernel of T.
(d) Find a basis for the image space of T.
(e) Verify the rank and nullity theorem for this transformation.
3. Define T : R^3 → R by
   T(⟨x_1, x_2, x_3⟩) = 3x_1 + 2x_2 − x_3.
(a) Show that T is a linear transformation.
(b) Find a basis for the kernel of T.
(c) Find a basis for the image T(R^3) of T.
(d) Describe the kernel of T geometrically.
4. Suppose that the image space of a linear transformation F : R^4 → R^4
is spanned by the set
   {⟨2, 1, 3, 1⟩, ⟨1, 0, 2, 4⟩, ⟨1, 1, 5, 3⟩, ⟨4, 2, 6, 2⟩}.
(a) Find an expression for F. (Note: There may be more than one
expression that satisfies the condition given above. Your job here
is to find one such expression.)
(b) Find a basis for the kernel of F.
(c) Verify the rank and nullity theorem for this transformation.
5. Suppose that the kernel of a linear transformation G : R^4 → R^4 is
spanned by the set
   {⟨2, 3, 1, 1⟩, ⟨1, 4, 5, 2⟩, ⟨3, 1, 4, 1⟩, ⟨4, 6, 2, 2⟩}.
(a) Find an expression for G. (Note: There may be more than one
expression that satisfies the condition given above. Your job here
is to find one such expression.)
(b) Find a basis for the image space of G.
(c) Verify the rank and nullity theorem for this transformation.
6. Suppose H : R^5 → R^5 is a linear transformation such that dim[ker(H)] = 3.
Is it possible for the image space to be spanned by the set
   {⟨1, 2, 4, 2, 3⟩, ⟨2, 0, 1, 3, 3⟩, ⟨2, 4, 0, 1, 2⟩, ⟨0, 4, 1, 4, 1⟩}?
Justify your answer using the Rank and Nullity Theorem.
7. Define ∫_0^1 : C^∞(R) → R by
   ∫_0^1 f = ∫_0^1 f(t) dt.
Describe the kernel of ∫_0^1.
8. Let D^2 : C^∞(R) → C^∞(R) represent the second derivative. Let
I : C^∞(R) → C^∞(R) denote the identity transformation (I(f) = f).
Define L : C^∞(R) → C^∞(R) by L = D^2 + I. Find the kernel of L.
What is the relationship between the kernel of L and the solution
set of the differential equation f″ + f = 0?
9. Provide a proof of Theorem 5.2.1.
10. Provide a proof of Theorem 5.2.2.
11. Complete the proof of Theorem 5.2.3.
12. Find the solution set of the homogeneous system associated with the
transformation T defined by
   T(⟨x_1, x_2, x_3, x_4⟩) = ⟨x_1 − 2x_2 + 3x_3 + 5x_4, x_1 − x_2 + 8x_3 + 7x_4, 2x_1 − 4x_2 + 6x_3 + 10x_4⟩.
What is the relationship between the solution set of the homogeneous
system that corresponds to this transformation and ker(T)?
13. Provide a proof of Theorem 5.2.4.
14. Find the solution set of the system of equations
\[
\begin{aligned}
x_1 - x_2 + x_3 + 2x_4 - 2x_5 &= 1 \\
2x_1 - x_2 - x_3 + 3x_4 - x_5 &= 3 \\
x_1 - x_2 + 5x_3 - 4x_5 &= 3.
\end{aligned}
\]
Is the vector ⟨1, 3, 3⟩ an element of the image of the transformation
T : R^5 → R^3 defined by
   T(⟨x_1, x_2, x_3, x_4, x_5⟩) = ⟨x_1 − x_2 + x_3 + 2x_4 − 2x_5, 2x_1 − x_2 − x_3 + 3x_4 − x_5, x_1 − x_2 + 5x_3 − 4x_5⟩?
If so, find the set of vectors in R^5 whose image under T is the vector
⟨1, 3, 3⟩. If not, explain why the vector ⟨1, 3, 3⟩ is not in the image
T(R^5) of T.
15. Suppose that G : U → R^6 is a linear transformation. Each part gives
a different scenario involving G and U. Answer each question on the
basis of the given scenario.
(a) If dim[G(U)] = 4 and dim[ker(G)] = 3, what would be the dimension
of U? Justify your answer.
(b) If G were onto, what could we conclude about the dimension of
the domain space U?
(c) If dim[G(U)] = 6 and dim(U) = 6, what could we say about G?
Is G one-to-one? onto?
(d) If dim(U) ≠ dim(R^6), does Theorem 5.2.6 apply? For example, if
G is one-to-one, must G also be onto? If G is onto, is G necessarily
invertible?
16. Let T : R^4 → R^5 be defined by T(u) = 0 for all u ∈ R^4. What is the
dimension of the kernel of T? Justify your answer.
17. Let T : K^n → K^m be defined by
   T(⟨x_1, x_2, ..., x_n⟩) = ⟨a_{11}x_1 + a_{12}x_2 + ... + a_{1n}x_n, a_{21}x_1 + a_{22}x_2 + ... + a_{2n}x_n, ..., a_{m1}x_1 + a_{m2}x_2 + ... + a_{mn}x_n⟩.
(a) Write the expression for T in terms of matrices; that is, write the
expression for T as T(x) = Ax.
(b) Show that the j-th column of A is the image T(e_j) of the vector e_j,
where e_j is the coordinate basis vector with 1 in the j-th
component.
(c) Show that
   I = {T(e_1), T(e_2), ..., T(e_n)}
spans the image space of T.
spans the image space of T.
(d) What does Theorem 5.2.5 say about the rank of A and the num-
ber of vectors in I, after I has been reduced to a linearly in-
dependent set? By reduced, we mean the application of Theo-
rem 4.3.2.
18. If T : U → V is any vector space transformation, show that T is
invertible, that is, there exists T^{-1} : ImageSpace(T) → U, if and only
if T is one-to-one.
19. If T : U → V is an invertible linear transformation, show that T^{-1} :
ImageSpace(T) → U is linear as well.
20. Define D : PF_3(R) → PF_3(R) by
   D(p) = p′, where p′ is the first derivative of p.
(a) Show that D is a linear transformation.
(b) Find a basis for the kernel of D.
(c) Find a basis for the image of D.
21. Define D^2 : PF_3(R) → PF_3(R) by
   D^2(p) = p″, where p″ is the second derivative of p.
(a) Show that D^2 is a linear transformation.
(b) Find a basis for the kernel of D^2.
(c) Find a basis for the image of D^2.
22. Define H : PF_2(R) → PF_3(R) by
   H(p) = x ↦ 2p′(x) + ∫_0^x 3p(t) dt,
where p′ denotes the first derivative of p.
(a) Show that H is a linear transformation.
(b) Find a basis for the kernel of H.
(c) Find a basis for the image of H.
23. Define T : PF_2(R) → R^4 by
   T(a_0 + a_1x + a_2x^2) = ⟨a_0, a_1, a_0 + a_1, 0⟩.
(a) Show that T is a linear transformation.
(b) Find a basis for the kernel of T.
(c) Find a basis for the image of T.
24. Define F : PF_2(R) → PF_2(R) by
   F(p) = q, where q(x) = p(x) + p(−x).
(a) Show that F is a linear transformation.
(b) Find a basis for the kernel of F.
(c) Find a basis for the image of F.
25. Define G : PF_2(R) → PF_2(R) by
   G(p) = q, where q(x) = (2x + 3)p(x).
(a) Show that G is a linear transformation.
(b) Find a basis for the kernel of G.
(c) Find a basis for the image of G.
26. Define H : PF_2(R) → R by
   H(p) = p(1).
(a) Show that H is a linear transformation.
(b) Find a basis for the kernel of H.
(c) Find a basis for the image of H.
27. Use the results of Theorem 5.2.7 to find the general form of a solution
for the system of equations
\[
\begin{aligned}
3x_1 - 3x_2 + 3x_3 &= a \\
2x_1 - x_2 + 4x_3 &= b \\
3x_1 - 5x_2 - x_3 &= c,
\end{aligned}
\]
where one particular solution is of the form ⟨1, 1, 1⟩.
28. Find the form of the system of equations whose general solution consists
of the particular solution ⟨1, 3, 4⟩ and whose associated homogeneous
system has solution set generated by the basis
   {⟨2, 1, 2⟩, ⟨1, 1, 3⟩}.
5.3 New Constructions from Old
Activities
1. There are four sets of linear transformations given below. Write an
ISETL func that implements each transformation. Then, construct
four sets, R, S, F, and P, according to the categorizations given below.
Save these sets of funcs in a file called LTexamples. (A sketch of one
such func appears just after this list of activities.)

Let R = {R_i : (Z_5)^2 → (Z_5)^2 : i = 1, 2, 3, 4, 5, 6, 7} be a set of
linear transformations. The expression for each transformation is given below.
   R_1(⟨v_1, v_2⟩) = ⟨v_1 + 2v_2, 2v_1 + 3v_2⟩
   R_2(⟨v_1, v_2⟩) = ⟨v_1 + 2v_2, 4v_1 + 3v_2⟩
   R_3(⟨v_1, v_2⟩) = ⟨3v_1 + v_2, 0⟩
   R_4(⟨v_1, v_2⟩) = ⟨3v_1, 2v_2⟩
   R_5(⟨v_1, v_2⟩) = ⟨v_1 + 2v_2, v_1 + 2v_2⟩
   R_6(⟨v_1, v_2⟩) = ⟨2v_1 + 2v_2, 2v_1 + 4v_2⟩
   R_7(⟨v_1, v_2⟩) = ⟨4v_1 + 2v_2, 4v_1⟩.

Let S = {S_i : (Z_5)^2 → (Z_5)^3 : i = 1, 2, 3, 4, 5} be a set of linear
transformations. The expression for each transformation is given below.
   S_1(⟨v_1, v_2⟩) = ⟨v_1 + 2v_2, 2v_1 + 3v_2, 3v_1⟩
   S_2(⟨v_1, v_2⟩) = ⟨2v_1 + 4v_2, v_1 + 2v_2, 4v_1 + 3v_2⟩
   S_3(⟨v_1, v_2⟩) = ⟨3v_1 + v_2, 2v_1 + 2v_2, 0⟩
   S_4(⟨v_1, v_2⟩) = ⟨3v_1, 2v_2, 0⟩
   S_5(⟨v_1, v_2⟩) = ⟨3v_1 + v_2, 3v_1, 2v_1 + 3v_2⟩.

Let F = {F_i : (Z_5)^4 → (Z_5)^2 : i = 1, 2, 3, 4} be a set of linear
transformations. The expression for each transformation is given below.
   F_1(⟨v_1, v_2, v_3, v_4⟩) = ⟨v_1 + 2v_2 + v_3, 2v_1 + v_2 + v_3⟩
   F_2(⟨v_1, v_2, v_3, v_4⟩) = ⟨2v_1 + v_2, 2v_2 + v_3⟩
   F_3(⟨v_1, v_2, v_3, v_4⟩) = ⟨v_1 + 2v_2, v_1 + 2v_2⟩
   F_4(⟨v_1, v_2, v_3, v_4⟩) = ⟨2v_1, v_2⟩.

Let P = {P_i : (Z_5)^1 → (Z_5)^4 : i = 1, 2} be a set of linear
transformations. The expression for each transformation is given below.
   P_1(⟨v_1⟩) = ⟨2v_1, v_1, 0, v_1⟩
   P_2(⟨v_1⟩) = ⟨v_1, 2v_1, 2v_1, 0⟩.
2. (a) Run name_vector_space by setting K equal to Z_5, U equal to
(Z_5)^2 and V equal to (Z_5)^3. Given S_1, as defined in Activity 1,
and the scalar 3 ∈ Z_5, explain what you think is meant by 3S_1,
and write a func that implements it. Apply the func is_linear
that you constructed in Activity 3 of Section 5.1 to 3S_1. Is 3S_1 a
linear transformation?
(b) Write a func LTsm that accepts a scalar a and a linear transformation
F, and returns a func that implements aF.
(c) Assume that name_vector_space has been run. Write a func that
accepts a set of linear transformations from U to V; determines
whether the scalar multiple of each transformation in the set is
linear; returns true, if each transformation is linear, or false, if
one or more transformations is not linear. Apply your func to
each of the sets R, S, F, and P defined in Activity 1. Note that you
will need to adjust the inputs for name_vector_space accordingly.
State a conjecture that summarizes what you observe.
3. (a) Run name_vector_space by setting K equal to Z_3, U equal to
(Z_3)^4 and V equal to (Z_3)^2. Given F_1 and F_2, as defined in Activity 1,
explain what you think is meant by F_1 + F_2, and write a
func that implements F_1 + F_2. Apply is_linear to F_1 + F_2. Is
F_1 + F_2 a linear transformation?
(b) Why doesn't the procedure in part (a) work for F_1 + R_2? for
S_1 + R_1? Specify condition(s) under which the sum of two linear
transformations is defined.
(c) Write a func LTadd that accepts two linear transformations A
and B; determines whether the sum of A and B is defined; and
returns the sum A + B, if the sum is defined, or OM, if the sum is
not defined.
(d) Form the sum S_3 + S_4 by applying the func LTadd. Determine
whether the resulting sum is a linear transformation by applying
the func is_linear. What do you observe?
(e) Write a func that assumes that name_vector_space has been run;
accepts a set of linear transformations from U to V; determines
whether the sum of each pair is linear; returns true, if each sum
is a linear transformation, or false, if one or more of the sums is
not a linear transformation. Apply this func to each of the sets R,
S, F, and P defined in Activity 1. Note that you will need to adjust
the inputs for name_vector_space accordingly. State a conjecture
that summarizes what you observe.
4. (a) Apply the func LTadd to S_1 and S_2 from Activity 1, and determine
whether the resulting sum is equal to either S_3, S_4, or S_5.
(b) Are the funcs R_5 and F_3 that were defined in Activity 1 equal?
(c) Are the funcs R_4 and S_4 that were defined in Activity 1 equal?
(d) Write a func is_equal that assumes that name_vector_space has
been run; accepts two linear transformations; determines whether
the two inputs are equal; and returns true, if they are, or false,
if they are not.
(e) Use is_equal to find all pairs of linear transformations in R whose
sum is equal to another linear transformation in R. Repeat for the
set F.
5. Let G = {G_i : (Z_2)^1 → (Z_2)^2 : i ∈ {1, 2, 3, 4}} be a set of transformations
with the expression for each G_i given below:
   G_1(⟨v⟩) = ⟨0, 0⟩
   G_2(⟨v⟩) = ⟨v, 0⟩
   G_3(⟨v⟩) = ⟨0, v⟩
   G_4(⟨v⟩) = ⟨v, v⟩.
(a) Write an ISETL func for each transformation. Apply the func
is_linear to verify that each transformation is linear.
(b) What are the scalars in Z_2? For each transformation, determine
each of its scalar multiples. Use the func is_equal that you constructed
in Activity 4 to determine whether each scalar multiple
is equal to either G_1, G_2, G_3, or G_4. What do you observe?
(c) Apply the func LTadd to find the sum G_i + G_j of all possible
combinations i, j such that i, j ∈ {1, 2, 3, 4}. Apply the func
is_equal to determine whether each sum is equal to either G_1,
G_2, G_3, or G_4. What do you observe?
(d) Apply the func is_vector_space (see Activity 6 in Section 2.2)
to the set G, together with the operations defined on G, as given
above. Does ISETL return a response consistent with the results
you obtained in (b) and (c)?
6. Let T = {T_i : (Z_2)^2 → (Z_2)^2 : i ∈ {1, 2, ..., 24}} be a set of transformations
with the expression for each T_i given below:
   T_1(⟨v_1, v_2⟩) = ⟨0, 0⟩
   T_2(⟨v_1, v_2⟩) = ⟨v_1, 0⟩
   T_3(⟨v_1, v_2⟩) = ⟨v_1, v_1⟩
   T_4(⟨v_1, v_2⟩) = ⟨v_1, v_2⟩
   T_5(⟨v_1, v_2⟩) = ⟨v_1, v_1 + v_2⟩
   T_6(⟨v_1, v_2⟩) = ⟨v_2, 0⟩
   T_7(⟨v_1, v_2⟩) = ⟨v_2, v_1⟩
   T_8(⟨v_1, v_2⟩) = ⟨v_2, v_2⟩
   T_9(⟨v_1, v_2⟩) = ⟨v_2, v_1 + v_2⟩
   T_10(⟨v_1, v_2⟩) = ⟨0, v_1⟩
   T_11(⟨v_1, v_2⟩) = ⟨v_1, v_1⟩
   T_12(⟨v_1, v_2⟩) = ⟨v_2, v_1⟩
   T_13(⟨v_1, v_2⟩) = ⟨v_1 + v_2, v_1⟩
   T_14(⟨v_1, v_2⟩) = ⟨0, v_2⟩
   T_15(⟨v_1, v_2⟩) = ⟨v_1, v_2⟩
   T_16(⟨v_1, v_2⟩) = ⟨v_2, v_2⟩
   T_17(⟨v_1, v_2⟩) = ⟨v_1 + v_2, v_2⟩
   T_18(⟨v_1, v_2⟩) = ⟨0, v_1 + v_2⟩
   T_19(⟨v_1, v_2⟩) = ⟨v_1, v_1 + v_2⟩
   T_20(⟨v_1, v_2⟩) = ⟨v_2, v_1 + v_2⟩
   T_21(⟨v_1, v_2⟩) = ⟨v_1 + v_2, v_1 + v_2⟩
   T_22(⟨v_1, v_2⟩) = ⟨v_1 + v_2, 0⟩
   T_23(⟨v_1, v_2⟩) = ⟨v_1 + v_2, v_1⟩
   T_24(⟨v_1, v_2⟩) = ⟨v_1 + v_2, v_2⟩.
Let B = {B_i : (Z_2)^2 → (Z_2)^2 : i ∈ {1, 2, 3, 4}} be a set of transformations
with the expression for each B_i given below:
   B_1(⟨v_1, v_2⟩) = ⟨v_1, 0⟩
   B_2(⟨v_1, v_2⟩) = ⟨v_2, 0⟩
   B_3(⟨v_1, v_2⟩) = ⟨0, v_1⟩
   B_4(⟨v_1, v_2⟩) = ⟨0, v_2⟩.
(a) Apply the func is_vector_space to determine whether the set T,
together with the operations defined on T, forms a vector space.
(b) Determine whether each transformation T_i can be expressed as an
element of B or a sum of elements of B.
(c) Determine whether the set B is linearly independent. If not, why
not?
(d) Is the set B a basis for the set T? If so, what is the dimension of
the set T? If not, how does B fail to constitute a basis?
7. (a) Run name_vector_space by setting K equal to Z_3, U equal to
(Z_3)^2, and V equal to (Z_3)^3. Given R_1 and S_1, as defined in
Activity 1, explain what is meant by S_1 ∘ R_1, where ∘ indicates
that R_1 is followed by S_1. Write a func that implements S_1 ∘ R_1.
(b) Why doesn't the procedure in part (a) work for R_1 ∘ S_1? P_1 ∘ T_1?
Under what condition(s) is the composition of two transformations
defined?
(c) Write a func LTcomp that accepts two linear transformations A
and B; determines whether the composition of A and B is defined;
and returns A ∘ B, if the composition is defined, or OM, if the
composition is not defined.
(d) Run name_vector_space by setting K equal to Z_3, U equal to
(Z_3)^2, and V equal to (Z_3)^2. Apply the ISETL command arb to
U to choose three vectors from (Z_3)^2. Determine whether R_1 ∘ R_3
applied to each of these three vectors returns the same result as
R_5. Apply the func is_equal to R_1 ∘ R_3 and R_5. Is the result
returned by is_equal consistent with that returned by R_1 ∘ R_3
and R_5 when applied to each of the three vectors selected by arb?
(e) Apply the func is_linear to the composition R_2 ∘ R_4. What do
you find?
(f) Write a func that assumes that name_vector_space has been run;
accepts two sets of linear transformations; determines whether the
composition of a transformation from the first set followed by a
transformation from the second set is defined and is linear; returns
true, if each composition is a linear transformation, or false, if
one or more of the compositions is not a linear transformation,
or OM, if the composition operation is undefined. Apply this func
to each of the pairs (R,R), (S,R), and (P,F), where R, S, and F
are the sets defined in Activity 1. Note that you will need to
adjust the inputs for name_vector_space accordingly. State
a conjecture that summarizes what you observe.
(g) Find all pairs A, B in R such that A ∘ B = B ∘ A. Does the
equality hold for every pair in R? Describe your observation in a
single sentence using the word commutative.
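Activity 1 above asks you to write an ISETL func for each transformation before building the sets R, S, F, and P. As a point of reference, here is a minimal sketch of one possible func for R_1; it assumes that vectors in (Z_5)^2 are represented as ISETL tuples and that the mod 5 arithmetic is written directly into the func (your own version might instead use previously defined operation funcs).

   R1 := func(v);
      $ v is a tuple [v1, v2] with entries in Z_5
      $ return the tuple corresponding to <v1 + 2 v2, 2 v1 + 3 v2>, mod 5
      return [ (v(1) + 2*v(2)) mod 5, (2*v(1) + 3*v(2)) mod 5 ];
   end;

For example, R1([1, 4]) should return [4, 4]. The remaining funcs, and the sets R, S, F, and P themselves, can be built in exactly the same way.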
Discussion
Scalar Multiple of a Linear Transformation
In Activity 2, you were asked to define the mapping 3S_1. In order to get an
expression for this map, we simply multiply each component of the expression
for S_1 by 3:
\[
\begin{aligned}
3S_1(\langle v_1, v_2\rangle) &= 3\,\langle v_1 + 2v_2,\; 2v_1 + 3v_2,\; 3v_1\rangle \\
&= \langle 3(v_1 + 2v_2),\; 3(2v_1 + 3v_2),\; 3(3v_1)\rangle \\
&= \langle 3v_1 + v_2,\; v_1 + 4v_2,\; 4v_1\rangle .
\end{aligned}
\]
One can find the scalar multiple of any transformation in a similar manner,
as suggested in the following definition.
Definition 5.3.1. Let T : U → V be a transformation between two vector
spaces U and V with scalars in a field K. Given a scalar a ∈ K, the scalar
multiple of T by a, denoted aT, is a transformation that assigns to each
vector u ∈ U a vector aT(u) ∈ V, where T(u) represents the vector assigned
to u by T. This is represented by the notation
   (aT)(u) = a(T(u)).
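A minimal sketch of the func LTsm from Activity 2(b) is given below. It assumes that vectors are ISETL tuples, that the scalars come from Z_5 (replace 5 by whatever modulus is in use), and that a func may be returned as a value; none of these choices is forced by the definition above.

   LTsm := func(a, F);
      $ returns the func aF: each component of F(u) is multiplied
      $ by the scalar a, with arithmetic done mod 5
      return func(u);
                return [ (a * x) mod 5 : x in F(u) ];
             end;
   end;

With this sketch, LTsm(3, S1) plays the role of 3S_1 computed above.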
In constructing the func is_linear in Activity 3 of Section 5.1, you most
likely checked the single condition,
   T(cu_1 + du_2) = cT(u_1) + dT(u_2),
or the equivalent, two-part version,
   T(cu) = cT(u)
   T(u_1 + u_2) = T(u_1) + T(u_2),
given in Definition 5.1.1. Is your finding in Activity 2 consistent with the
following theorem?
Theorem 5.3.1. Let U and V be vector spaces with scalars in K, and let
T : U → V be a linear transformation. If a ∈ K is a scalar, then the scalar
multiple aT : U → V is a linear transformation.
Proof. Let u_1, u_2 ∈ U, and let a, c be scalars.
\[
\begin{aligned}
(aT)(cu_1) &= a(T(cu_1)) \\
&= a(cT(u_1)) \\
&= (ac)T(u_1) \\
&= (ca)T(u_1) \\
&= c(aT(u_1)) \\
&= c(aT)(u_1).
\end{aligned}
\]
Can you justify each step? To finish the proof, we still need to show that
   (aT)(u_1 + u_2) = (aT)(u_1) + (aT)(u_2).
This can be done using the same ideas as in the first part of the proof. You
will be asked to complete this proof as an exercise. See Exercise 3.
The Sum of Two Linear Transformations
In Activity 3(a), you wrote a func to express the sum of the two funcs F_1
and F_2. As with the sum of any two functions, the sum of two linear transformations
consists of taking the vector assigned to an input vector u under
F_1 and adding it to the vector assigned to u under F_2. If each transformation
is given by an expression, the sum is defined by adding the individual expressions.
How is the method by which you obtained the expression for F_1 + F_2
similar to the way in which you found the expression for 3S_1 in Activity 2(a)?
Your work in the activities and the discussion here should convince you
that the sum is not defined unless the two transformations have the same
domain and range spaces. The sum F_1 + R_2 given in Activity 3(b) is not
defined, because the domains differ. Can you explain exactly why this is a
problem? The second sum in part (b), S_1 + R_1, is not defined, because the
ranges differ. Why must the ranges be the same? Ideas related to the sum
of two transformations are summarized in the definition below.
Definition 5.3.2. Let U and V be vector spaces with scalars in K, and let
T : U → V and F : U → V be two transformations. Given u ∈ U, the
sum of T and F, denoted
   T + F : U → V,
is defined by taking the sum of the vector assigned to u under T, T(u), with
the vector assigned to u under F, F(u). The notation for this is
   (T + F)(u) = T(u) + F(u).
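Definition 5.3.2 translates directly into a func along the lines of LTadd from Activity 3(c). The sketch below adds the output tuples componentwise mod 5 and leaves out the check that the two transformations have the same domain and range spaces (in the activity, that check returns OM when the sum is undefined); the tuple representation and the fixed modulus are assumptions of the sketch, not requirements of the definition.

   LTadd := func(T, F);
      $ returns the pointwise sum T + F
      return func(u);
                $ add corresponding components of T(u) and F(u), mod 5
                return [ (T(u)(i) + F(u)(i)) mod 5 : i in [1..#(T(u))] ];
             end;
   end;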
Is your result in Activity 3 consistent with the following theorem?
Theorem 5.3.2. Let U and V be vector spaces with scalars in K, and let
T : U → V and F : U → V be linear transformations. Then, the sum
T + F : U → V is a linear transformation.
Proof. Let u_1, u_2 ∈ U, and let c be a scalar.
\[
\begin{aligned}
(T + F)(u_1 + u_2) &= T(u_1 + u_2) + F(u_1 + u_2) \\
&= [T(u_1) + T(u_2)] + [F(u_1) + F(u_2)] \\
&= [T(u_1) + F(u_1)] + [T(u_2) + F(u_2)] \\
&= (T + F)(u_1) + (T + F)(u_2).
\end{aligned}
\]
Can you justify each step? To finish the proof of the theorem, we still need
to show that
   (T + F)(cu_1) = c(T + F)(u_1).
This can be done using the same ideas as in the first part of the proof.
You will be asked to complete the proof of this theorem as an exercise. See
Exercise 8.
Equality of Linear Transformations
Generally speaking, two functions f and g are equal if their domains are
equal, their ranges are equal, and if f and g assign the same output to a
given input. Since linear transformations are functions, the requirement for
equality is the same. A linear transformation T : U_1 → V_1 is equal to a
linear transformation F : U_2 → V_2 if and only if U_1 = U_2, V_1 = V_2, and
T(u) = F(u) for every input u. Consider R_5 and F_3 in Activity 1: each
is given by the same expression, and both range spaces are the same. So,
what is the problem? Why are these two transformations not equal to one
another? Consider R_4 and S_4 defined in Activity 1: the domain of each
transformation is the same, and the expression for each transformation looks
similar. However, R_4 ≠ S_4. Can you explain why?
The func is_equal you constructed in Activity 4 checks to see whether
two transformations between finite vector spaces are equal. When you applied
is_equal in Activity 5(b) and (c), what did you find? Was each combination
considered in parts (b) and (c) equal to one of the transformations
given in the set G? If so, what significance does this have? In particular,
does the set G form a vector space?
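For transformations between finite vector spaces, a func such as is_equal from Activity 4 can simply compare outputs on every domain vector. Here is a minimal sketch, assuming that name_vector_space has been run and has stored the (finite) domain in a global set U; a fuller version would also confirm that the two funcs share the same domain and range spaces.

   is_equal := func(T, F);
      $ two transformations with the same finite domain U are equal
      $ exactly when they agree on every vector in U
      return forall u in U | T(u) = F(u);
   end;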
A Set of Linear Transformations as a Vector Space
The set of transformations G = {G_1, G_2, G_3, G_4} given in Activity 5 is the
set of all linear transformations between (Z_2)^1 and (Z_2)^2. In parts (b) and
(c), you showed that each scalar multiple and each sum is equal to one of the
transformations in G. Application of the func is_vector_space confirmed
these findings. Of the vector space axioms, which axioms correspond to what
you were checking in parts (b) and (c)?
As it turns out, Activity 5 is a specific example of a much more general
result. In particular, the set of all linear transformations between two vector
spaces U and V, denoted Hom(U, V), together with transformation addition
and scalar multiplication, forms a vector space. Here, each transformation is
a vector, the operation of adding two transformations represents vector addition,
and the operation of multiplying a transformation by a scalar represents
scalar multiplication. Hom is an abbreviation for the term homomorphism,
which will be defined carefully in an abstract algebra course. For our purposes,
we only need to know that Hom(U, V) denotes the collection of all
linear transformations between U and V.
Theorem 5.3.3. Let U and V be vector spaces with scalars in K. The
set of all linear transformations, Hom(U, V), together with transformation
addition and transformation scalar multiplication, forms a vector space.
Proof. In order to prove this theorem, you will need to check each of the
vector space axioms. The full proof is left as an exercise (see Exercise 11),
but the following questions and comments are designed to help you to write
a complete proof.
Theorem 5.3.1 shows that the scalar multiple of a linear transformation
is again a linear transformation. Which vector space axiom is satisfied by
this theorem? Similarly, Theorem 5.3.2 shows that the sum of two linear
transformations is itself a linear transformation. Which vector space axiom
is satisfied here?
Since addition of functions is commutative and associative, the addition
operation defined here is both commutative and associative.
The transformation defined by Z(u) = 0_V for all u ∈ U is called the zero
transformation.
For F ∈ Hom(U, V), the transformation −F ∈ Hom(U, V) denotes its
additive inverse.
As with any vector space, we can talk about finding a basis. You did just
that in Activity 6. What is the dimension of T? Can you find a basis for the
vector space G defined in Activity 5?
Creating New Linear Transformations
In the last subsection, we created a new vector space by considering the set
of all linear transformations between two previously defined vector spaces.
We can do even more: in particular, we can often define a function between
a vector space of transformations and a vector space of tuples that preserves
linearity. For example, define a function L : G → (Z_2)^2 between the vector
space G given in Activity 5 and the vector space (Z_2)^2 by:
   L(G_1) = ⟨0, 0⟩
   L(G_2) = ⟨1, 0⟩
   L(G_3) = ⟨0, 1⟩
   L(G_4) = ⟨1, 1⟩.
Can you show that L is a linear transformation, that is:
   L(G_i + G_j) = L(G_i) + L(G_j), where G_i, G_j ∈ G, i, j ∈ {1, 2, 3, 4}
   L(cG_i) = cL(G_i), where c ∈ Z_2, and G_i ∈ G, i ∈ {1, 2, 3, 4}?
Can you define a similar linear transformation between the set of all linear
transformations T defined in Activity 6 and (Z_2)^4? Transformations such as
these will be considered in more detail in Chapter 7.
Compositions of Linear Transformations
In Activity 7, you wrote a func to express the composition S_1 ∘ R_1 of the
two funcs R_1 and S_1 defined in Activity 1. As with any two functions, the
composition of two linear transformations, say A ∘ B, consists of taking an
input vector u, applying B to u, and then finding the image of B(u) under
A, provided that the application of A to B(u) is defined.

Figure 5.12: Composition (u in U is sent by B to B(u) in V, and then by A to A(B(u)) in W)
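In ISETL, composition is just one func applied after another, so a sketch of LTcomp from Activity 7(c) can be as short as the following; as with the LTadd sketch, the test that the composition is actually defined (and the return of OM when it is not) is left out here.

   LTcomp := func(A, B);
      $ returns the func A o B: apply B first, then A
      return func(u);
                return A(B(u));
             end;
   end;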
Why are the compositions R_1 ∘ S_1 and P_1 ∘ T_1 that you considered in
part (b) undefined? In general, if F : U → V and G : W → Z are two
linear transformations, what condition must be satisfied in order to ensure
that the composition G ∘ F is defined? Under what condition is the operation
of composition defined for pairs of transformations from Hom(U, V)? In this
context, what does the word closed mean in the statement of the theorem
given below?
Theorem 5.3.4. Let U be a vector space with scalars in a field K. The
set of all linear transformations Hom(U, U) is closed under the operation of
composition.
Proof. See Exercise 23.
In Activity 7(g), you considered the issue of commutativity. Specifically,
you were trying to determine whether A ∘ B = B ∘ A for every pair of
transformations in R. On the basis of your findings in this activity, can you
conclude that composition is a commutative operation?
Exercises
1. Let F : R^2 → R^3 be defined by
   F(⟨v_1, v_2⟩) = ⟨v_1 − v_2, 3v_1 + v_2, 4v_2⟩.
(a) Verify that F is a linear transformation.
(b) Let a ∈ R. Show that aF forms a linear transformation.
2. Justify each step of the proof of Theorem 5.3.1 that establishes
   (aT)(cu_1) = c(aT)(u_1).
3. Complete the proof of Theorem 5.3.1 by showing that
   (aT)(u_1 + u_2) = (aT)(u_1) + (aT)(u_2).
4. Let T : PF_3(R) → PF_3(R) be defined by
   T(p) = q, where q(x) = p(x + 3).
Let a ∈ R. Show that aT is a linear transformation.
5. Let J : PF_2(R) → PF_3(R) be defined by
   J(p) = x ↦ ∫_0^x p(t) dt.
Let a ∈ R. Show that aJ is a linear transformation.
6. Let F_1 : R^3 → R^2 be defined by
   F_1(⟨v_1, v_2, v_3⟩) = ⟨3v_2 − 2v_3, 4v_1 + v_2⟩.
Let F_2 : R^3 → R^2 be defined by
   F_2(⟨v_1, v_2, v_3⟩) = ⟨v_1 − v_3, v_2 − v_3⟩.
(a) Verify that F_1 and F_2 are linear transformations.
(b) Show that F_1 + F_2 is a linear transformation.
7. Justify each step of the proof of Theorem 5.3.2 that establishes
   (T + F)(u_1 + u_2) = (T + F)(u_1) + (T + F)(u_2).
8. Complete the proof of Theorem 5.3.2 by showing that
   (T + F)(cu_1) = c(T + F)(u_1).
9. Let T, S : PF_3(R) → PF_4(R) be defined by
   T(p) = q, where q(x) = p(x + 3),
   S(p) = q, where q(x) = xp(x).
(a) Show that T and S are linear transformations.
(b) Show that T + S is a linear transformation.
10. Let D, D^2 : C^∞(R) → C^∞(R) be the first and second derivative
operators, respectively.
(a) Show that D and D^2 are linear transformations.
(b) Show that D^2 + D is a linear transformation.
11. Complete the proof of Theorem 5.3.3.
12. Show that the set Hom(R, R^3) of linear transformations from R to R^3,
where addition, for T, G ∈ Hom(R, R^3), is defined by
   (T + G)(v) = T(v) + G(v),
and scalar multiplication, for T ∈ Hom(R, R^3) and a ∈ R, is defined by
   (aT)(v) = aT(v),
forms a vector space.
13. Let F_1, F_2, F_3 be transformations in Hom(R, R^3), the set of all linear
transformations from R to R^3, as defined below:
   F_1(⟨v⟩) = ⟨v, 0, 0⟩
   F_2(⟨v⟩) = ⟨0, v, 0⟩
   F_3(⟨v⟩) = ⟨0, 0, v⟩.
(a) Show that {F_1, F_2, F_3} spans Hom(R, R^3).
(b) Show that {F_1, F_2, F_3} is an independent subset of Hom(R, R^3).
(c) Determine the dimension of Hom(R, R^3).
14. Find a basis for the set G defined in Activity 5. What is the dimension
of this vector space?
15. Let T be the set of functions defined in Activity 6. Each T_i ∈ T,
i ∈ {1, ..., 24}, is defined by
   T_i(⟨v_1, v_2⟩) = ⟨av_1 + bv_2, cv_1 + dv_2⟩,
where ⟨v_1, v_2⟩ ∈ (Z_2)^2, and a, b, c, d ∈ Z_2. A review of this activity
shows that the choices for a, b, c, and d are unique for each i. For
example, T_5 is defined by
   T_5(⟨v_1, v_2⟩) = ⟨v_1, v_1 + v_2⟩,
where a = c = d = 1 and b = 0. Define L : T → (Z_2)^4 by
   L(T_i) = ⟨a, b, c, d⟩.
(a) Show that L defines a linear transformation.
(b) Determine whether L is one-to-one. See Section 5.2, if you do not
remember the definition of one-to-one.
(c) Determine whether L is onto. See Section 5.2, if you do not
remember the definition of onto.
16. Let Hom(R^2, R^3) be the set of all linear transformations from R^2 to R^3.
(a) Show that Hom(R^2, R^3) forms a vector space under transformation
addition and transformation scalar multiplication.
(b) Find a basis for Hom(R^2, R^3).
(c) Define a transformation between Hom(R^2, R^3) and the vector
space R^6. Is this transformation linear? one-to-one? onto?
17. Let F_1 : R^3 → R^3 be defined by
   F_1(⟨v_1, v_2, v_3⟩) = ⟨3v_2 − 2v_3, v_1 + v_3, 4v_1 + v_2⟩.
Let F_2 : R^3 → R^3 be defined by
   F_2(⟨v_1, v_2, v_3⟩) = ⟨v_1 + v_2 − v_3, v_1 − v_2 + v_3, v_1 + v_2 + v_3⟩.
(a) Verify that F_1 and F_2 are linear transformations.
(b) Find F_2 ∘ F_1, and verify that the composition is also a linear
transformation.
18. Let F ∈ Hom(R^3, R^3) be defined by
   F(⟨v_1, v_2, v_3⟩) = ⟨v_1 + 2v_2 + v_3, 2v_1 − v_2 + 3v_3, v_1 − 3v_2 − 2v_3⟩.
(a) Determine whether F is 1-1 and onto.
(b) Find the dimension of the kernel of F.
(c) If F is 1-1 and onto, find its inverse, and verify that the composition
of F and its proposed inverse yields the identity transformation.
19. Let S : U → U and R : U → U be two invertible linear transformations.
Show that (S ∘ R)^{-1} = R^{-1} ∘ S^{-1}.
20. Let R_1 : R^2 → R^2 be a rotation through π/4 radians, and let R_2 :
R^2 → R^2 be a rotation through π/2 radians.
(a) Show that the composition R_2 ∘ R_1 is a rotation, and determine
the angle of rotation.
(b) Graph ⟨1, 3⟩, R_1(⟨1, 3⟩), and (R_2 ∘ R_1)(⟨1, 3⟩) on the same set of
axes. Is the graph consistent with what you have proven in (a)?
(c) Write a general proof: If R_1 : R^2 → R^2 is a rotation through α
radians, and R_2 : R^2 → R^2 is a rotation through β radians, then
R_2 ∘ R_1 is a rotation through α + β radians.
21. Let F_1 : R^2 → R^2 be a reflection through the line y = √3 x, and let
F_2 : R^2 → R^2 be a reflection through the line y = x.
(a) Show that the composition F_2 ∘ F_1 is a rotation, and find the angle
of rotation.
(b) Graph ⟨2, 1⟩, F_1(⟨2, 1⟩), and (F_2 ∘ F_1)(⟨2, 1⟩) on the same set of
axes. Is the graph consistent with what you have proven in (a)?
(c) Write a general proof: If F_1 : R^2 → R^2 is a reflection through
the line y = m_1 x, F_2 : R^2 → R^2 is a reflection through the line
y = m_2 x, and m_1 ≠ m_2, then F_2 ∘ F_1 is a rotation through twice
the angle between the two reflecting lines.
22. If T_2 : R^2 → R^2 is a rotation about the origin, and T_1 : R^2 → R^2 is
a reflection with respect to a line through the origin, explain, without
making any computations, why the composition T_2 ∘ T_1 cannot be a
translation.
23. Write the proof of Theorem 5.3.4. Then, answer the following questions
related to the operation of composition.
(a) Is composition associative? That is, given f, g, h ∈ Hom(U, U),
does the following equality hold:
   h ∘ (g ∘ f) = (h ∘ g) ∘ f?
If the answer is yes, provide a proof. If the equality does not hold,
find a counterexample.
(b) Is composition commutative? That is, given f, g ∈ Hom(U, U),
does the following equality hold:
   g ∘ f = f ∘ g?
If the answer is yes, provide a proof. If the equality does not hold,
find a counterexample.
(c) Show that the transformation I : U → U given by I(u) = u is
an element of the set Hom(U, U). For every F ∈ Hom(U, U), show
that F ∘ I = F = I ∘ F.
24. Let I be the subset of Hom(U, U) that consists of all invertible linear
transformations from U to U.
(a) Is I closed under composition? If so, write a proof. If not, find two
invertible transformations whose composition is not invertible.
(b) Is I closed under transformation addition? If so, write a proof. If
not, find an example of two invertible transformations whose sum
is not invertible.
(c) Is I closed under transformation scalar multiplication? If so, write
a proof. If not, find an example of a transformation and a scalar
whose resulting product is not invertible.
(d) Does I form a vector subspace of Hom(U, U) under transformation
addition and transformation scalar multiplication? If so, write a
proof. If not, provide an explanation.
Chapter 6
Systems, Transformations and
Matrices
If you have stayed with us this far, you are in for
a treat. We have been making some connections
between the different topics that have been
studied up until now. Things like how solution
sets to certain systems of equations are
subspaces of particular vector spaces and how
they both can be pictured geometrically for
small dimensions. But now we are ready to put
it all in a neat and tidy package to be tied with a
bow. The package in this case is matrices. You
have already seen how matrices are connected to
systems of equations. In the second section we
connect matrices to linear transformations, and
the package is complete. Wonder what we will
do in Chapter 7?
6.1 Vector Spaces of Matrices
Activities
1. Use the Matrix package in ISETL to define the matrices 2M and M + N,
where
\[
M = \begin{pmatrix} 1 & 0 & 3 \\ 2 & 1 & 4 \end{pmatrix}
\quad\text{and}\quad
N = \begin{pmatrix} 2 & 1 & 2 \\ 3 & 4 & 0 \end{pmatrix}
\]
and all arithmetic is done mod 5.
2. Write a func scale_mat that accepts a scalar a and a matrix of scalars
M and returns the matrix aM. Write a func add_mat which accepts
two matrices M and N and returns the matrix M + N. Your funcs
should assume that ms and as have been defined to implement scalar
multiplication and scalar arithmetic.
Next define ms and as to implement arithmetic mod 5 and let M and
N be as in Activity 1. Check your funcs using the following matrices.
(a) 2M
(b) 3M + 2N
(c) 3(M + N)
3. For the matrices M and N given in Activity 1, use scale_mat and add_mat
from Activity 2 to compute 2N and N + N. What is the relationship
between these? Now compute 3M and M + M + M. What is the
relationship between these? What axiom for vector spaces is being
demonstrated here?
4. For the matrix M in Activity 1, determine a matrix Z such that
M + Z = M. For the matrix Z you have just constructed, determine
whether the equality P + Z = P holds for all 2 × 3 matrices P
over Z_5. Can you describe what such a matrix would look like for n × m
matrices over K? What axiom for vector spaces is being demonstrated
here?
5. For the matrix N given in Activity 1 and the matrix Z you found in
Activity 4, determine a matrix −N such that N + (−N) = Z. Write
a func neg_mat that accepts a matrix P and determines −P such that
P + (−P) = Z. Your code should work for R as well as Z_p. What
axiom for vector spaces is being demonstrated here?
6. Use is_vector_space from Chapter 2 to determine if the collection
of 4 × 3 matrices over Z_3 forms a vector space over Z_3 (using the
appropriate arithmetic). We will denote this vector space as (Z_3)^{4×3}.
7. Use is_vector_space to verify that (Z_5)^{2×2} is a vector space (using
the appropriate arithmetic). Use is_subspace from Chapter 2 to determine
if the following are subspaces of (Z_5)^{2×2}:
(a) The set of all matrices in (Z_5)^{2×2} with the value 0 in the upper
left corner (a_{11} = 0);
(b) The set of all matrices in (Z_5)^{2×2} with the value 1 in the upper
left corner (a_{11} = 1);
(c) The set of all matrices in (Z_5)^{2×2} where the upper right corner is
equal to the sum of the upper left and lower left corners (a_{12} =
a_{11} + a_{21});
(d) The set of all matrices in (Z_5)^{2×2} where the upper right corner
is equal to the product of the upper left and lower left corners
(a_{12} = a_{11}a_{21}).
8. Construct a linearly independent set of 3 vectors in (Z_3)^{4×3}. Use LI
from Activity 2 of Section 4.2 to verify that the set you have constructed
is linearly independent.
9. Find a basis for (Z_3)^{4×3}, and confirm that it is a basis by using is_basis
from Activity 4 of Section 4.4. What is the dimension of (Z_3)^{4×3}? Can
you determine a general method for finding a basis for K^{m×n}? Make a
conjecture about the dimension of K^{m×n}.
10. Write a func flatten_mat that accepts an n × m matrix whose (i, j)
entry is a_{ij} and that returns a vector whose (i−1)m + j entry is equal to
a_{ij}. In other words, read the matrix left-to-right from top-to-bottom:
\[
\begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{pmatrix} \mapsto [1, 2, 3, 4, 5, 6].
\]
What is the range of flatten_mat? What is the dimension of the range
of flatten_mat?
11. Use is_linear from Chapter 5 to verify that flatten_mat is a linear
transformation from (Z_3)^{4×3} to (Z_3)^{12}. Remember, in order to use
is_linear, you first need to run name_two_vector_spaces.
12. Verify that flatten_mat from (Z_3)^{4×3} to (Z_3)^{12} is one-to-one and
onto.
13. Write a func trans_mat that accepts an n × m matrix whose (i, j) entry
is a_{ij} and returns a matrix whose (i, j) entry is a_{ji}. What is the range
of trans_mat? What is the dimension of the range of trans_mat?
14. Assume that (Z_5)^{2×3} and (Z_5)^{3×2} are vector spaces. Use is_linear to
verify that trans_mat from (Z_5)^{2×3} to (Z_5)^{3×2} is a linear transformation.
15. Verify that trans_mat from (Z_5)^{2×3} to (Z_5)^{3×2} is one-to-one and onto.
Discussion
Vector Spaces of Matrices
In Section 3.2, you were introduced to matrices as a computational tool
for determining the solution set of a system of linear equations. Does your
memory of their definition match the following?
Definition 6.1.1. If S is a set of numbers, an n × m matrix over S (read n
by m matrix over S) is a function M : N × N → S. The dimension of an
n × m matrix is n by m. The element at (i, j), or (i, j) entry, of M is equal
to M(i, j) and is denoted by m_{ij}.
A matrix will often be written as a rectangular array of n rows and m
columns. The use of capital letters for the matrix and the corresponding
lowercase letter for the elements is conventional.
The notation M = (m_{ij})_{n×m} is used to indicate a matrix of dimension
n × m whose (i, j) entry is m_{ij}. For example, can you see why the matrix
written as (i + j)_{2×2} is equal to
\[
\begin{pmatrix} 2 & 3 \\ 3 & 4 \end{pmatrix}?
\]
Can you write (ij)_{3×3}, (2i)_{2×3}, and (i^j)_{3×1} as arrays of real numbers? Often
the dimensions will be omitted from this notation if they are clear from the
context.
You were introduced to vector spaces in Chapter 2 and explored the
concept further in Chapters 4 and 5. We now examine matrices in this
context.
What two operations must be defined on a set if it is to be a vector space?
What properties must these operations satisfy? In Activities 1 and 2 you defined
two matrix operations, scaling and adding. You tested whether these
operations satisfied various vector space properties in Activities 3 through 5.
In Activities 6 and 7, you used the func is_vector_space to prove that one
particular collection of matrices, together with scaling and adding, constitutes
a vector space. Is that which you proved in Activity 6 true for any
particular collection of matrices? The next theorem addresses this issue.
Theorem 6.1.1. The collection of matrices of dimension n × m over a set
of scalars is a vector space under the operations:
\[
(m_{ij}) + (n_{ij}) = (m_{ij} + n_{ij}); \qquad k(m_{ij}) = (km_{ij}).
\]
Proof. We prove the distributive properties. A check of the remaining properties
is left as an exercise (see Exercise 1).
Given scalars k and l and matrices M = (m_{ij}) and N = (n_{ij}), we must
show that k(M + N) = kM + kN and that (k + l)M = kM + lM. Both of
these results can be obtained by calculation:
\[
\begin{aligned}
k(M + N) &= k((m_{ij} + n_{ij})) = (k(m_{ij} + n_{ij})) = (km_{ij} + kn_{ij}) \\
&= (km_{ij}) + (kn_{ij}) = k(m_{ij}) + k(n_{ij}) = kM + kN, \\
(k + l)M &= (k + l)(m_{ij}) = ((k + l)m_{ij}) = (km_{ij} + lm_{ij}) \\
&= (km_{ij}) + (lm_{ij}) = k(m_{ij}) + l(m_{ij}) = kM + lM.
\end{aligned}
\]
Throughout the proof of Theorem 6.1.1, parentheses were used in more
than one context. In some cases, they were used as grouping symbols. In
others, they were used as shorthand to denote matrices. Can you distinguish
between these two uses? There was an additional ambiguity involving
addition and multiplication. In some instances, multiplication refers to an
operation between a scalar and a matrix. In other instances, it refers to an
operation between two scalars. There is an analogous ambiguity in working
with addition: the same notation indicates both matrix and scalar addition.
Can you distinguish between these uses?
The following two questions will be explored in the exercises (see Exercises
2 and 3). If V is a vector space, do the n × m matrices over V form a
vector space? If V is a vector space and S is a set, does the collection of all
functions from S to V form a vector space? Can you see how each of these
is a generalization of the previous theorem?
Subspaces of Matrices
Because n × m matrices over K form a vector space over K, there is a notion
of a subspace of matrices. Do you recall the definition of a subspace and
what is required to prove that a subset is a subspace? In the case of vector
spaces, it is sufficient to prove that the subset is non-empty and that for
every v, w in the subset and every scalar k, the vector v + kw is also in the
subset (Theorem 2.3.2).
In Activity 7, you discovered that you could create subspaces by requiring
that particular entries be 0. For example, the collection of all matrices whose
(1, 1) entry is 0 is a subspace of the vector space (Z_5)^{2×2}. The general result
is the following:
Theorem 6.1.2. For any i and j, with 1 ≤ i ≤ n and 1 ≤ j ≤ m, the
collection of all n × m matrices over K whose (i, j) entry is 0 is a subspace
of K^{n×m}.
Proof. First note that the zero matrix (0) satisfies the conditions and therefore
the subset is non-empty. Next, if both M and N have 0 as their (i, j)
entry, then the (i, j) entry of M + kN will equal 0 + k·0 = 0 and so M + kN
is contained in the subset.
You should be able to formulate and prove the more general theorem
which states that if you have a fixed set of entries required to be 0 you still
have a vector subspace of K^{n×m}. Did any of the other subsets of matrices in
Activity 7 form subspaces? For those that did, can you formulate a general
statement like Theorem 6.1.2? This will be explored further in Exercise 4.
Summation Notation
Before continuing with the discussion on matrices, a few words about summation
notation are probably appropriate. Do you recall how the notation
\[
\sum_{i=1}^{3} i^2
\]
is interpreted?
Summation notation uses the capital Greek letter Σ to indicate summation:
\[
\sum_{i=\ell}^{u} a_i = a_\ell + a_{\ell+1} + \cdots + a_{u-1} + a_u .
\]
The variable i is referred to as the variable of summation (or the index), the
number ℓ is the lower bound and the number u is the upper bound. Often the
bounds of the summation will be omitted if they are clear from the context.
If we wish to perform a double-sum, where the bounds of each summation
do not depend on the other index, we will abbreviate to a single summation
sign:
\[
\sum_{i,j} a_{ij} = \sum_i \sum_j a_{ij} = \sum_j \sum_i a_{ij} .
\]
For example, if the context of the summation indicates that 1 ≤ i ≤ 3 and
1 ≤ j ≤ 2, then
\[
\sum_{i,j} ij = \sum_{i=1}^{3} \sum_{j=1}^{2} ij = \sum_{i=1}^{3} (i + 2i) = \sum_{i=1}^{3} 3i = 3(1) + 3(2) + 3(3) = 18.
\]
You should change the order of the indices and verify that the value of that
summation is also 18; in other words, check that
\[
\sum_{j=1}^{2} \sum_{i=1}^{3} ij = 18.
\]
You must be careful when using these abbreviations. For example, can
you explain why the following is incorrect:
\[
\sum_{i=1}^{5} \sum_{j=1}^{i} i \cdot j = \sum_{i,j} i \cdot j ?
\]
When expanding a double-summation, you should expand the outer sum first
if the index for the outer sum is used as a bound for the inner sum:
\[
\sum_{i=1}^{2} \sum_{j=1}^{i} (i + j) = \sum_{j=1}^{1} (1 + j) + \sum_{j=1}^{2} (2 + j) = (1 + 1) + (2 + 1) + (2 + 2) = 9.
\]
Dimensions of Matrix Vector Spaces
In Chapter 4, we discussed the dimension of a vector space. Activities 8
and 9 asked you to explore the concepts of linear independence and span in
the context of vector spaces of matrices. In Activity 9, you were asked to give
the dimension of K^{n×m} based on your work with (Z_3)^{4×3}. Is your conjecture
consistent with the statement of Theorem 6.1.3?
Theorem 6.1.3. The dimension of K^{n×m} is nm.
Proof. Define the matrix E_{ij} to be the matrix whose (i, j) entry is 1 and
whose other entries are 0. In functional notation:
\[
E_{ij}(k, \ell) =
\begin{cases}
1 & \text{if } (k, \ell) = (i, j) \\
0 & \text{otherwise.}
\end{cases}
\]
We will show that {E_{ij} | 1 ≤ i ≤ n, 1 ≤ j ≤ m} is a generating set for K^{n×m}.
The proof that this set is independent is left as an exercise (see Exercise 6).
Let M = (m_{ij}) be a matrix in K^{n×m}. We will show that M can be written
as a linear combination of the set {E_{ij}}:
\[
\Bigl(\sum_{i,j} m_{ij} E_{ij}\Bigr)(k, \ell) = \sum_{i,j} m_{ij} E_{ij}(k, \ell)
= m_{k\ell} E_{k\ell}(k, \ell) + \sum_{(i,j) \neq (k,\ell)} m_{ij} E_{ij}(k, \ell)
= m_{k\ell} + \sum_{(i,j) \neq (k,\ell)} 0 = m_{k\ell} = M(k, \ell).
\]
Can you reformulate the proof representing matrices as arrays of numbers?
Which proof is more understandable? Which proof is more detailed?
Linear Transformations of Matrices
Because K^{n×m} is a vector space over K, it is natural to try to learn about
linear transformations whose domain and/or range is a vector space of matrices.
The first such transformation you worked with flattened a matrix into
a tuple (in Activities 10 through 12). This is referred to as flattening the
matrix. Be sure to keep the two vector spaces K^{n×m} and K^{nm} clear: the
first is a collection of matrices, and the second is a collection of tuples.
Theorem 6.1.4. The flattening map is a linear transformation from K^{n×m}
to K^{nm}.
Proof. Let M = (m_{ij}) and N = (n_{ij}) be elements of K^{n×m}, let a, b be scalars,
and let F denote the flattening map. We must show that F(aM + bN) =
aF(M) + bF(N). We start with the left hand side of the equality: aM + bN =
(am_{ij} + bn_{ij}), and so the ((i−1)m + j)-th component of F(aM + bN) will equal
am_{ij} + bn_{ij}.
On the right hand side, the ((i−1)m + j)-th component of F(M) will equal
m_{ij} and the ((i−1)m + j)-th component of F(N) will equal n_{ij}. Therefore, the
((i−1)m + j)-th component of aF(M) + bF(N) will equal am_{ij} + bn_{ij}. This
proves that F(aM + bN) = aF(M) + bF(N), as needed.
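A sketch of flatten_mat from Activity 10, under the same assumption that a matrix is stored as a tuple of row tuples, makes the index bookkeeping in this proof concrete: entry (i, j) lands in component (i − 1)m + j of the tuple.

   flatten_mat := func(M);
      local n, m;
      n := #M;          $ number of rows
      m := #(M(1));     $ number of columns
      $ list the entries row by row; the pairs (i, j) are produced in
      $ the order (1,1), (1,2), ..., (1,m), (2,1), ..., (n,m)
      return [ M(i)(j) : i in [1..n], j in [1..m] ];
   end;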
Theorem 6.1.5. The flattening map is one-to-one and onto.
Proof. The inverse of the flattening map can be defined as follows. Start with
the nm-tuple [a_k]. For each value of k, with 1 ≤ k ≤ nm, there are unique
integers i_k and j_k such that 1 ≤ i_k ≤ n, 1 ≤ j_k ≤ m, and k = (i_k − 1)m + j_k.
The inverse of the flattening map is the matrix whose (i_k, j_k) entry is equal
to a_k.
Theorems 6.1.4 and 6.1.5 describe a special relationship between two vector
spaces that was defined in Chapter 2. Two vector spaces are isomorphic
if there is a one-to-one, onto linear transformation from one space to the
other. We will explore this concept a little more in the exercises. What is
important is that if two vector spaces are isomorphic, their linear structures
correspond (any true statement about one of them can be translated into a
true statement about the other).
The second linear map we examined (in Activities 13 and 15) is the transpose
map. It has some of the same features as the flattening map.
Theorem 6.1.6. The transpose map M ↦ M^t from K^{n×m} to K^{m×n} is
an isomorphism of vector spaces.
Proof. See Exercise 10.
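Under the same tuple-of-rows representation, a sketch of trans_mat from Activity 13 is equally short; row j of the output collects column j of the input.

   trans_mat := func(M);
      $ the new row index runs over the old columns, and vice versa
      return [ [ M(i)(j) : i in [1..#M] ] : j in [1..#(M(1))] ];
   end;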
Exercises
1. Complete the proof of Theorem 6.1.1.
2. If V is a vector space over K, how would you define V^{n×m}? How would
you multiply an element of V^{n×m} by an element of K? How would you
add two elements of V^{n×m}? Prove that V^{n×m} is a vector space. Explain
how Theorem 6.1.1 would be a special case of this result (assuming the
result were true).
3. Let S be a set, and V be a vector space over K. Let V^S denote the
set of all functions from S to V. How would you multiply an element
of V^S by an element of K? How would you add two elements of V^S?
Prove that V^S is a vector space. Explain how Theorem 6.1.1 is a special
case of this result (assuming the result were true).
4. Let T be a linear transformation from K^{n×m} to K. Prove that the
collection of M ∈ K^{n×m} which map to 0 under T is a subspace of
K^{n×m}.
For example, Activity 7(c) is one such example, where:
\[
T\begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} = a_{12} - a_{11} - a_{21}.
\]
5. Compute the following summations:
(a) \( \sum_{i=1}^{3} \sum_{j=1}^{i+1} (i + j) \)
(b) \( \sum_{i \neq j} ij \), where 1 ≤ i ≤ 3 and 1 ≤ j ≤ 3
(c) \( \sum_{i=1}^{2} \sum_{j=i}^{2} 2^i 3^j \)
6. Prove that the set of matrices {E_{ij}} defined in the proof of Theorem 6.1.3
is independent.
7. For each of the following matrices, compute their images under the
flattening map.
(a) \( \begin{pmatrix} 1 & 3 & 2 \\ 2 & 1 & 3 \end{pmatrix} \)
(b) \( \begin{pmatrix} 2 & 3 \\ 1 & 2 \\ 0 & 1 \\ 2 & 4 \end{pmatrix} \)
(c) \( \begin{pmatrix} 1 & 3 & 2 & 3 & 4 & 2 \\ 3 & 2 & 4 & 2 & 4 & 1 \end{pmatrix} \)
8. For each of the following tuples, compute their images under the inverse
of the flattening map F : (Z_5)^{2×3} → (Z_5)^6.
(a) [1, 4, 3, 4, 0, 2]
(b) [0, 3, 4, 2, 1, 0]
(c) [1, 0, 0, 0, 1, 0]
(d) [0, 3, 2, 4, 1, 3]
9. For each of the following matrices, compute their images under the
transpose map.
(a) \( \begin{pmatrix} 1 & 3 & 1 \\ 2 & 3 & 4 \end{pmatrix} \)
(b) \( \begin{pmatrix} 2 & 3 & 2 & 1 & 4 & 0 \\ 3 & 4 & 2 & 0 & 2 & 1 \end{pmatrix} \)
(c) \( \begin{pmatrix} 1 & 2 \\ 2 & 2 \\ 2 & 4 \\ 1 & 0 \\ 1 & 0 \end{pmatrix} \)
10. Prove Theorem 6.1.6.
11. Define the trace of an n × n matrix by:
\[
\operatorname{Tr} : (m_{ij}) \mapsto \sum_{i=1}^{n} m_{ii} .
\]
Determine whether the trace is a linear map from the vector space of
n × n matrices over K to the 1-dimensional vector space (K)^1.
12. For a 2 × 2 matrix over K, define the map:
\[
\det : (m_{ij}) \mapsto m_{11}m_{22} - m_{12}m_{21} .
\]
Determine whether det is a linear map from the vector space of all 2 × 2
matrices over K to the 1-dimensional vector space (K)^1.
6.2 Transformations and Matrices
Activities
1. For the matrices M and N, given by
\[
M = \begin{pmatrix} 1 & 3 & 4 & 1 & 0 \\ 2 & 1 & 4 & 0 & 2 \\ 1 & 3 & 0 & 2 & 1 \end{pmatrix}
\quad\text{and}\quad
N = \begin{pmatrix} 2 & 3 & 1 & 3 & 1 \\ 1 & 3 & 2 & 1 & 1 \\ 0 & 0 & 1 & 0 & 0 \end{pmatrix},
\]
do the following:
(a) Use matrix_row from the Matrix package to obtain the rows of
M as tuples. Determine the dimension of the subspace generated
by these tuples in (Z_5)^5.
(b) Use matrix_col from the Matrix package to obtain the columns of
M as tuples. Determine the dimension of the subspace generated
by these tuples in (Z_5)^3.
(c) What relationship do you find between the dimensions of these
spaces?
2. Write a func called row_rank that accepts an n × m matrix M over K,
converts the rows of M into tuples, and returns the dimension of
the subspace generated by these tuples in K^m. The code in row_rank
can assume that name_vector_space has been run for the vector space
K^m. Let K = Z_5, and let M be given by
\[
M = \begin{pmatrix}
1 & 3 & 4 & 2 & 1 \\
2 & 1 & 3 & 2 & 3 \\
1 & 3 & 4 & 2 & 1 \\
1 & 4 & 2 & 2 & 1 \\
0 & 0 & 0 & 1 & 0
\end{pmatrix}.
\]
Determine the value of row_rank on the following matrices:
(a) M;
(b) M with the third and fourth rows interchanged;
(c) M with the second row multiplied by 2;
(d) M with the fourth row replaced by the sum of the first and fourth
rows;
(e) the matrix obtained by reducing M to echelon form (see Chapter 3);
(f) the matrix obtained by reducing M to reduced echelon form (see
Chapter 3).
Make a conjecture about the effects of elementary row operations (as
defined in Chapter 3) on the value of row_rank.
3. Write a func called col_rank that accepts an n × m matrix M over K,
converts the columns of M into tuples, and returns the dimension of
the subspace generated by these tuples in K^n. The code in col_rank
can assume that name_vector_space has been run for the vector space
K^n. Let K = Z_3, and let M be given by
\[
M = \begin{pmatrix}
1 & 0 & 1 & 2 & 1 \\
2 & 1 & 0 & 2 & 0 \\
1 & 0 & 1 & 2 & 1 \\
1 & 1 & 2 & 2 & 1 \\
0 & 0 & 0 & 1 & 0
\end{pmatrix}.
\]
Determine the value of col_rank on the following matrices:
(a) M;
(b) M with the third and fourth rows interchanged;
(c) M with the second row multiplied by 2;
(d) M with the fourth row replaced by the sum of the first and fourth
rows;
(e) the matrix obtained by reducing M to echelon form;
(f) the matrix obtained by reducing M to reduced echelon form.
Make a conjecture about the effects of elementary row operations (as
defined in Chapter 3) on the value of col_rank.
4. For each system of linear equations over Z_3, determine the column
rank of the matrix of coefficients and the number of free variables in
the solution to the system.
(a)
   x + y = 2
   x + 2y = 1
   2x + y = 0
(b)
   2y + z = 2
   x + y + 2z = 1
(c)
   y + z = 0
   x + 2y + 2z = 2
   x = 2
(d)
   x + y + z = 2
   2x + 2y + 2z = 1
(e)
   x + y + z + w = 1
   x + 2y + 2w = 2
   x + z + w = 0
Given a system of equations, can you determine a relationship between
the column rank and the number of free variables?
5. Define a func called mat_apply that accepts an n × m matrix M = (m_{ij})
and an m-tuple [x_j] and returns an n-tuple whose i-th coordinate is equal
to
\[
\sum_{j=1}^{m} m_{ij} x_j .
\]
Your code may assume that ms and as have been defined to implement
scalar arithmetic. (A sketch of one possible implementation of mat_apply
appears just after this list of activities.)
For this activity, define ms and as to implement arithmetic mod 3. For
each matrix M given below, determine the values of mat_apply on the
tuples e_1, e_2, and e_3. Can you determine how the results relate to the
entries in the given matrix?
(a) M = \( \begin{pmatrix} 2 & 3 & 1 \\ 1 & 3 & 1 \end{pmatrix} \)
(b) M = \( \begin{pmatrix} 1 & 2 & 0 \\ 0 & 3 & 1 \\ 1 & 0 & 1 \end{pmatrix} \)
(c) M = \( \begin{pmatrix} 2 & 3 & 1 \end{pmatrix} \)
(d) M = \( \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \)
6. Write a func called coord that accepts a vector v and an ordered
basis B = [b_1, ..., b_n] and returns the coordinates of v with respect
to B. That is, the return value of coord should be the tuple of scalars
[a_1, ..., a_n] such that
\[
v = \sum_{i=1}^{n} a_i b_i .
\]
Your code may assume that name_vector_space has been used to establish
the vector space.
Run name_vector_space to establish the vector space (Z_5)^3. Let B =
[⟨1, 0, 0⟩, ⟨1, 1, 0⟩, ⟨1, 1, 1⟩] and C = [⟨2, 0, 0⟩, ⟨0, 2, 0⟩, ⟨0, 0, 2⟩]. Determine
whether the following are true or false:
(a) The function coord (using the ordered basis B) is a linear transformation
from (Z_5)^3 to (Z_5)^3.
(b) If the ordered basis B is used in both function calls, then coord
and LC (from Activity 2 of Section 4.1) are inverse functions.
(c) The function coord using the ordered basis B and the function LC
using the ordered basis C are inverse functions.
(d) The function obtained by first applying coord using the ordered
basis B and then applying LC using ordered basis C is a linear
transformation.
7. Write a func called matrify that accepts a function T: U → V, a basis B = [b_j] of U, and a basis C = [c_i] of V. The func should return a matrix M = (m_{ij}) such that the entries in the j-th column are the coordinates of T(b_j) with respect to the basis C.
Now run name_two_vector_spaces to set the domain to (Z_5)^2 and the range to (Z_5)^3. Use the coordinate bases for the remainder of this activity (see Section 4.4 if you cannot remember the definition of coordinate bases). For each function T below, compare the value of T(e_1) to the value obtained by applying M to the vector e_1, and the value of T(e_2) to the value obtained by applying M to the vector e_2.

(a) T(⟨x, y⟩) = ⟨2x + y, x, x + y⟩
(b) T(⟨x, y⟩) = ⟨x, 0, 2x - y⟩
(c) T(⟨x, y⟩) = ⟨2x - y, x + y, xy⟩

What is the relationship between the vectors in the first two cases? Make a conjecture about the property (or lack thereof) that makes the third case behave differently.
8. This activity uses the scalar field Z_5. For each pair of ordered bases given below, compare the result of matrify on the linear transformation given by T(⟨x, y, z⟩) = ⟨x + 2y, y + 2z⟩.

(a) B = [e_1, e_2, e_3], C = [e_1, e_2]
(b) B = [e_2, e_1, e_3], C = [e_1, e_2]
(c) B = [e_1, 2e_2, e_3], C = [e_1, e_2]
(d) B = [e_1, 2e_2, e_3], C = [e_1, 3e_2]

How does changing the order of the domain basis affect the matrix? How does changing the order of the range basis affect the matrix? How does scaling an element of the domain basis affect the matrix? How does scaling an element of the range basis affect the matrix?
9. Let T: (Z_5)^2 → (Z_5)^3 be defined as T(⟨x, y⟩) = ⟨x + 2y, x + 3y, y⟩ and use the coordinate bases throughout this activity. How do the results of matrify(T) and matrify(2T) relate to each other? Can you make a general conjecture about how matrify is affected when you scale its input?
Now let S: (Z_5)^2 → (Z_5)^3 be defined as S(⟨x, y⟩) = ⟨x, y, 2x + 3y⟩. How do the matrices matrify(T), matrify(S), and matrify(T + S) relate to each other? Can you make a conjecture about how matrify applied to a sum relates to matrify applied individually to each of the terms of a sum?
10. Use name_two_vector_spaces to set the domain and range to be (Z_5)^2. For each of the following matrices M, compare the value of col_rank(M) with the rank of the linear transformation which applies M to a vector.

(a) M = \begin{pmatrix} 1 & 2 \\ 3 & 2 \end{pmatrix}
(b) M = \begin{pmatrix} 1 & 2 \\ 3 & 1 \end{pmatrix}
(c) M = \begin{pmatrix} 0 & 2 \\ 1 & 2 \end{pmatrix}
(d) M = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}

Make a conjecture about the relationship between the two ranks.
Discussion
The Rank of a Matrix
In Activity 1, you examined two ways to convert an n × m matrix into a set of tuples (row-by-row or column-by-column). The dimensions of the sets spanned by these tuples provide a significant amount of information about a matrix. We now provide a name for these numbers.

Definition 6.2.1. Let M be an n × m matrix over K. The dimension of the subspace of K^m generated by the rows of M is called the row rank of M. The dimension of the subspace of K^n generated by the columns of M is called the column rank of M.
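For instance (a small illustration of our own, over R rather than one of the activity matrices): for

M = \begin{pmatrix} 1 & 0 & 2 \\ 0 & 1 & 3 \end{pmatrix},

the rows ⟨1, 0, 2⟩ and ⟨0, 1, 3⟩ are linearly independent in R^3, so the row rank is 2; the columns ⟨1, 0⟩, ⟨0, 1⟩, and ⟨2, 3⟩ span R^2, so the column rank is also 2.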
In Activity 1, you also discovered that although the values of n and m may be different, the row rank and the column rank of the matrix turned out to be identical. We now work toward proving this result by looking at the effect of the elementary row operations on the row and column ranks of a matrix. Can you recall the three elementary row operations?
In Activity 2, you examined the effect of the elementary row operations on the row rank of a matrix. Your work should have led you to make a conjecture consistent with the following theorem.
Theorem 6.2.1. The row rank of a matrix is unaffected by the elementary row operations.

Proof. Theorem 4.3.1 proved that two sets generate the same subspace, provided every vector in each set can be written as a linear combination of vectors in the other set. We will use this to show that the elementary row operations do not affect the subspace generated by the rows of a matrix.
The first elementary row operation is interchanging two rows. In this case, the set of tuples obtained from the rows of the original matrix is identical to those of the transformed matrix; hence, the spans are the same.
The second elementary row operation is to multiply one row by a (nonzero) scalar. Let i be the row which is multiplied, v be the tuple from row i of the original matrix, and k be the scalar. Then the sets are identical except for the tuple from the i-th row. The original matrix will have the tuple v and the new matrix will have the tuple kv. However, kv is clearly a linear combination of the tuples from the original matrix, and v = (1/k)(kv) is a linear combination of the tuples from the new matrix.
The third elementary row operation replaces a row i by the sum of row i with another row j. Let i be the row which is being replaced, v_i be the tuple created from row i in the original matrix, j be the row which is added to row i, and v_j be the tuple from row j of the original matrix. As in the last case, the set of tuples generated from the new matrix differs by only one tuple from the set of tuples generated from the original matrix. The tuple from row i of the new matrix, v_i + v_j, is a linear combination of the tuples from the original matrix. Similarly, the tuple v_i = (v_i + v_j) - v_j can be written as a linear combination of the tuples from the new matrix.
Since the tuples produced by applying elementary row operations can be written as linear combinations of the tuples from the original matrix and vice versa, the subspaces generated by the rows of the transformed matrix and the rows of the original matrix are the same. Therefore, the row rank is unaffected.
In Chapter 3, we transformed matrices into reduced echelon form as a means of determining the solution set of a system of equations. Here we will use echelon form, because it provides an easy way to determine the row (and column) rank of a matrix. Do you remember the requirements for a matrix to be in reduced echelon form?
Since a matrix can be transformed into reduced echelon form by using elementary row operations exclusively, the row rank of a matrix is equal to the row rank of its corresponding reduced echelon form. As stated in Definition 6.2.1, the row rank is the dimension of the vector space generated by the rows of a matrix. However, the nonzero rows of a reduced echelon matrix are linearly independent (why?). If we put these ideas together, what is the relationship between the number of nonzero rows of a reduced echelon matrix and its rank? Given any matrix, how can we find a basis for the vector space generated by its rows?
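For instance (an illustration of our own, over R):

\begin{pmatrix} 1 & 2 & 1 \\ 2 & 4 & 2 \\ 0 & 1 & 3 \end{pmatrix} reduces to the echelon form \begin{pmatrix} 1 & 2 & 1 \\ 0 & 1 & 3 \\ 0 & 0 & 0 \end{pmatrix},

so the row rank is 2, and the two nonzero rows ⟨1, 2, 1⟩ and ⟨0, 1, 3⟩ form a basis for the subspace generated by the rows of the original matrix.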
Although you may have been able to predict the outcome of Activity 2, the results of Activity 3 may have come as a surprise. These findings demonstrate the following theorem, which is considered one of the deep results in linear algebra.

Theorem 6.2.2. The column rank of a matrix is unaffected by the elementary row operations.

Although this can be proven directly, an elegant proof (which avoids many of the calculations) will be presented in Section 6.3. Your conjecture from Activity 1 follows directly from Theorem 6.2.2.

Theorem 6.2.3. The column rank and the row rank of a matrix M are equal.

Proof. We can assume that M is in reduced echelon form (why?). Denote the row rank of M by r. Because M is in reduced echelon form, M will have exactly r nonzero rows. We consider the tuples generated by the columns of M which have leading entries.
Each of the tuples coming from these columns will contain a single nonzero entry, and this entry will equal 1. This means that these tuples will be e_1, ..., e_r.
To conclude the proof, every column of M will have zero in all but its top r positions, so the set e_1, ..., e_r will generate the subspace generated by these tuples. This is sufficient to prove that the column rank of M is r.

In the context of systems of equations, the rank of the coefficient matrix can be related to the number of determined variables in the solution set. This relationship was presented in Theorem 3.2.1 and again in Activity 4. The following theorem restates the result.

Theorem 6.2.4. The number of determined variables in the solution set of a system of linear equations is equal to the rank of the coefficient matrix of that system.

Proof. Given a system of equations with coefficient matrix M, we augment M and then transform the augmented matrix to reduced echelon form. Every leading entry then becomes a determined variable in the solution of the system. From the proof of Theorem 6.2.3, we can see that the number of leading entries is equal to the rank of the coefficient matrix.
The Matrix of a Linear Transformation
In Chapter 3, you used matrices to solve systems of equations. In Chapter 5, you described systems of equations in terms of linear transformations and found that every matrix can be used to define a linear transformation. We complete the triangle in this section by showing that every linear transformation can be described in terms of a matrix. In Activity 5, you wrote the expression for a linear transformation between two vector spaces of tuples as a matrix application.
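For readers who want a quick reference for what "matrix application" computes, here is a rough sketch in Python rather than ISETL (the list-of-rows representation and plain integer arithmetic are our own choices; over Z_p each sum would be reduced mod p):

    def mat_apply(M, x):
        # M is an n x m matrix stored as a list of rows; x is an m-tuple.
        # The i-th coordinate of the image is the sum over j of M[i][j] * x[j].
        return [sum(M[i][j] * x[j] for j in range(len(x))) for i in range(len(M))]

    # Applying a 2 x 3 matrix to e_1 = [1, 0, 0] returns its first column:
    print(mat_apply([[2, 3, 1], [1, 3, 1]], [1, 0, 0]))  # [2, 1]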
In order to use this technique on every vector space, you need to represent a vector space as a collection of tuples. This was the purpose of the funcs in Activity 6. The functions implemented in coord and LC provide an isomorphism between an n-dimensional vector space V over K and the vector space of tuples K^n. In some ways, this is precisely the reason ordered bases were defined.
In Activity 7, you probably realized that matrify provided a generic method for representing a linear transformation as a matrix application. This method of producing a matrix from a linear transformation is so important, it gets its own definition.
Definition 6.2.2. Let V be a vector space of dimension n, and let B = [b_i] be an ordered basis. Then for each vector v, the coordinate vector (or coordinates) of v with respect to B is defined to be the vector ⟨x_1, ..., x_n⟩ in K^n such that

v = \sum_{i=1}^{n} x_i b_i.

Given vector spaces U and V with ordered bases B = [b_j] and C, respectively, and a linear transformation T: U → V, the matrix representation of T with respect to B and C is the matrix whose j-th column is the coordinate vector of T(b_j) with respect to the ordered basis C.

Although the dimensions of the resulting matrix are not explicitly mentioned, they can be determined based on the dimensions of U and V.
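As a small illustration (an example of our own, using the coordinate bases): let T: R^2 → R^3 be defined by T(⟨x, y⟩) = ⟨x + y, x - y, 2y⟩. Then T(e_1) = ⟨1, 1, 0⟩ and T(e_2) = ⟨1, -1, 2⟩, so the matrix representation of T with respect to the coordinate bases is

\begin{pmatrix} 1 & 1 \\ 1 & -1 \\ 0 & 2 \end{pmatrix},

whose columns are exactly these two coordinate vectors.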
The choice of ordered bases is significant, as was seen in Activity 8. This dependence can create ambiguities which make computations very difficult and will be explored more fully in Chapter 7. We will try to alleviate this with the following notation. If a tuple-vector is a coordinate vector, then we will subscript it with the name of the ordered basis. For example, consider the vector v = ⟨1, 2⟩ in the vector space (Z_5)^2 with ordered bases B = [e_1, e_2] and C = [⟨1, 1⟩, ⟨0, 1⟩]. Then the coordinates of v with respect to B will be ⟨1, 2⟩_B. This explains why the basis B is referred to as the coordinate basis. On the other hand, the coordinates of v with respect to C are given by ⟨1, 1⟩_C. In cases where the ordered bases are clear from the context, we will often omit mention of them. One particular case is a vector space of tuples, where we will assume the use of the coordinate bases if no other bases are mentioned.
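As a check on the example just given: 1 · ⟨1, 1⟩ + 1 · ⟨0, 1⟩ = ⟨1, 2⟩ = v, which is why the coordinates of v with respect to C are ⟨1, 1⟩_C.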
Because there is so much information involved with matrix representations, the following diagram is often used to illustrate the situation.

          T
     U ------> V
     |         |
     B         C
     |         |
     v         v
    K^m -----> K^n
          M

In this diagram, the vertical arrows indicate the isomorphisms from U to K^m and from V to K^n, which you implemented as coord (with inverse LC). These vertical arrows are labeled by the ordered bases used for the isomorphism. The arrow labeled T indicates the linear transformation from U to V, and the arrow labeled M indicates the application of the matrix representation of T with respect to B and C.
The top row of the diagram presents the linear transformation in terms of the vector spaces, while the bottom row of the diagram presents the matrix representation. The vertical arrows represent the choice of coordinates for the vector space and tie together the two presentations.
This diagram also illustrates an important equality: if we take a vector in U, follow the arrow labeled B (downward), and then apply the matrix M, we get the same result as if we had first applied T, followed by finding the coordinate representation in terms of the basis C. This was suggested in Activity 7 and is stated in the following theorem.

Theorem 6.2.5. Let U and V be vector spaces with ordered bases B and C, respectively, and let T: U → V be a linear transformation. Given u in U, the coordinate vector of T(u) with respect to C is equal to the result of applying the matrix representation of T to the coordinate vector of u with respect to B.
Proof. Let B = [b_j] and C = [c_i] be the ordered bases. Let [u_j] be the coordinates of u with respect to B, and let M = (m_{ij}) be the matrix representation of T with respect to B and C. Then

T(u) = T\left( \sum_j u_j b_j \right) = \sum_j u_j T(b_j) = \sum_j u_j \left( \sum_i m_{ij} c_i \right) = \sum_{i,j} u_j m_{ij} c_i = \sum_i \left( \sum_j m_{ij} u_j \right) c_i.

Therefore, the coordinates of T(u) with respect to C form the tuple whose i-th component is \sum_j m_{ij} u_j. This tuple is the same as that obtained by applying the matrix M to the tuple [u_j].
Given an n × m matrix M over K, there is a linear transformation T: K^m → K^n obtained by matrix application. A natural question now arises: is the matrix associated with T equal to the original matrix M? The answer is contained in the following theorem.

Theorem 6.2.6. Let M be an n × m matrix over K and T: K^m → K^n the linear transformation mapping v to Mv. Then the matrix of T with respect to the coordinate bases is equal to M.
Proof. See Exercise 7.
Properties of Matrix Representations
In Chapter 5, we defined a vector space structure on the set of linear transformations from U to V, denoted by Hom(U, V). In Section 6.1, we defined a vector space structure on the set of n × m matrices, denoted by K^{n×m}. Theorem 6.2.5 provides a bridge between linear transformations and matrices. A natural question is how the vector space structure on Hom(U, V) relates to that on K^{n×m}. You explored this relationship in Activity 9. Was the conjecture you formulated in that activity consistent with the following theorem?

Theorem 6.2.7. Let U and V be n-dimensional vector spaces over K. Let B be an ordered basis for U, and let C be an ordered basis for V. Let T and S be elements of Hom(U, V), and let k ∈ K. Assume all matrix representations are given with respect to the bases B and C. Then, we can conclude:
the matrix representation of T + S is equal to the sum of the matrix representation of T and the matrix representation of S;
the matrix representation of kT is equal to the product of k and the matrix representation of T.

Proof. We will prove the result for scalar multiplication, and leave the result for addition as an exercise (see Exercise 6). Let T: U → V be a linear transformation, and let k be any scalar in K. To simplify the notation, let B = [b_j] and C = [c_i] represent the ordered bases of U and V, respectively, and let M = (m_{ij}) denote the matrix representation of T with respect to B and C. We now determine the matrix representation of kT with respect to B and C.
For all j, we have:

(kT)(b_j) = k(T(b_j)) = k\left( \sum_i m_{ij} c_i \right) = \sum_i (k m_{ij}) c_i.

Therefore the j-th column of the matrix representation of kT is equal to the product of k and the j-th column of the matrix representation of T. This completes the proof.
Another place where there have been parallel developments between linear transformations and matrices is the value known as rank. Do you recall the definition for the rank of a transformation and the rank of a matrix? As you discovered in Activity 10, this value is also independent of the matrix representation.

Theorem 6.2.8. The rank of a linear transformation is equal to the column rank of its matrix representation (with respect to any ordered bases).

Proof. Let T: U → V be a linear transformation, and B = [b_j] and C be ordered bases of U and V, respectively. Assume that the dimension of V is equal to n.
The range of T is spanned by the vectors T(b_j). The tuple of coordinates of the vector T(b_j) with respect to the ordered basis C is equal to the j-th column of the matrix. As a result, the range of T is isomorphic to the subspace of K^n spanned by the column tuples; hence they will have the same dimension.

Theorem 6.2.8 ties the concept of column rank to the concept of the rank of a linear transformation. This result will be instrumental in proving Theorem 6.2.2 using the following strategy. In Section 6.3, we will determine the linear transformation analog of an elementary row operation. It will be proven that these operations do not affect the rank of the linear transformation and hence will not affect the column rank of the associated matrix. This is the missing piece in the proof of Theorem 6.2.2.
Retrospection
We started with the goal of solving systems of linear equations (each of which
can be written as a single matrix equation). In this task, we developed an
abstract theory of vector spaces, bases, and linear transformations. We have
now returned full circle, because every vector space can be represented as a
vector space of tuples, and every linear transformation can be represented as a
matrix application. A reasonable question now is why bother, if everything
really is just matrices, to do all of the abstraction?
By making the dependence on the ordered bases explicit, we have gained
the freedom to change our choice of ordered bases. In many cases this can
help simplify a problem at hand (as will be done in Chapter 7). Another
advantage is that many of the proofs are actually easier to write if we abstract
away from the details. Perhaps the greatest gain has been in places where
there are no natural bases. By providing abstract proofs of the results, we
learn that many of the techniques which were effective in the concrete case
of tuples can be applied (without change) to the more abstract cases.
Exercises
1. Consider each of the matrices below as being over Z_5. Compute their row rank. Next, consider each of the matrices below as being over R. Compute their row rank. Do these numbers differ?
(a) M = \begin{pmatrix} 1 & 2 & 3 & 4 & 1 \\ 2 & 3 & 1 & 3 & 2 \end{pmatrix}
(b) M = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}
(c) M = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}
(d) M = \begin{pmatrix} 1 & 2 & 1 \\ 2 & 0 & 3 \\ 1 & 4 & 0 \end{pmatrix}
(e) M = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 0 \\ 1 & 4 & 0 \end{pmatrix}

2. For each of the following matrices over R, compute its column rank.

(a) M = \begin{pmatrix} 1 & 2 & 3 & 4 & 1 \\ 2 & 3 & 1 & 3 & 2 \end{pmatrix}
(b) M = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}
(c) M = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}
(d) M = \begin{pmatrix} 1 & 2 & 1 \\ 2 & 0 & 3 \\ 1 & 4 & 0 \end{pmatrix}
(e) M = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 0 \\ 1 & 4 & 0 \end{pmatrix}
3. For each system of equations over R, determine the number of determined variables. You do not need to solve the equations.

(a) x + 2y + 3z = 4
    x + 3y - 2z = 2
    3x + 8y + z = 8
(b) x + y - z = 2
    x + 2y - 3z = 4
    2x + 7y - 4z = 2
(c) x + y = 2
    x + 3y = 4
    x - 2y = 7
4. For each vector space, ordered basis B, and vector v below, write the coordinates of v with respect to B.

(a) V = (Z_5)^2, B = [⟨1, 1⟩, ⟨0, 1⟩], and v = ⟨1, 3⟩
(b) V = R^3, B = [⟨1, 2, 3⟩, ⟨2, 4, 1⟩, ⟨3, 6, 5⟩], and v = ⟨0, 4, 1⟩
(c) V = R^{2×3}, B = [E_{11}, E_{12}, E_{13}, E_{21}, E_{22}, E_{23}], and
    v = \begin{pmatrix} 1 & 3 & 2 \\ 2 & 1 & 0 \end{pmatrix}.
    The definition of the E_{ij} can be found in the proof of Theorem 6.1.3.
(d) V = PF_2(R), the vector space of polynomial functions with degree two or less over R, B = [x ↦ 1, x ↦ x, x ↦ x^2], v = p, where p(x) = x^2 + 3x - 2.
(e) V = PF_2(R), the vector space of polynomial functions with degree two or less over R, B = [x ↦ 1, x ↦ x + 1, x ↦ x^2 + x + 1], and v = p, where p(x) = x^2 - 2x + 3.
(f) V = PF(Z_3), the vector space of all polynomial functions over Z_3, B = [x ↦ 1, x ↦ x, x ↦ x^2], and v = p, where p(x) = x^3. If you consider this impossible, carefully consider the information about polynomials in Chapter 1.
(g) V = PF(Z_5), the vector space of all polynomial functions over Z_5, B = [x ↦ 1, x ↦ x, x ↦ x^2, x ↦ x^3, x ↦ x^4], and v = p, where
    p(x) = 2x^7 + 3x^6 + 2x^5 + 3x^4 + x^3 + 2x^2 + x + 4.
(h) V = C^∞(R), the set of all infinitely differentiable functions over R, B has as its first three elements x ↦ 1, x ↦ x and x ↦ x^2, and v = f, where f(x) = 2(x + 1)^2 + 3x - 1. You are only required to provide a clear description of the coordinates.
(i) V = C^∞(R), the set of all infinitely differentiable functions over R, B has as its first two elements sin and cos, v = f, where f is the function f(x) = sin(x + π). As in the previous item, you are only required to provide a description of the coordinates.
5. For each of the linear transformations T: U → V and ordered bases B and C below, write the matrix of T with respect to B and C.

(a) T: (Z_5)^2 → (Z_5)^3 defined by
    T(⟨x, y⟩) = ⟨2x + y, x - 3y, 3x⟩,
    and each basis is the appropriate coordinate basis.
(b) T: (Z_5)^3 → (Z_5)^3 defined by
    T(⟨x, y, z⟩) = ⟨x, y, z⟩,
    B the coordinate basis, and C = [⟨1, 0, 0⟩, ⟨0, 2, 0⟩, ⟨1, 0, 1⟩].
(c) T: R^2 → R^3 defined by
    T(⟨x, y⟩) = ⟨x + y, 2x - y, x + 3y⟩,
    B = [⟨1, 0, 1⟩, ⟨0, 2, 3⟩, ⟨1, 2, 3⟩], and C the coordinate basis.
(d) T: PF_2(R) → PF_2(R) defined by T(p) = q, where q(x) = p(x + 1), B = C = [x ↦ 1, x ↦ x, x ↦ x^2].
(e) T: PF_3(R) → PF_2(R) defined by T(p) = p′, the derivative of p, B = [x ↦ 1, x ↦ x, x ↦ x^2, x ↦ x^3] and C = [x ↦ 1, x ↦ x, x ↦ x^2].
(f) V the vector subspace of C^∞(R) spanned by sin, cos, x ↦ x, x ↦ 1 and T: V → V defined by T(f) = f′, the derivative of f, B = C = [sin, cos, x ↦ x, x ↦ 1].
6. Complete the proof of Theorem 6.2.7.
7. Complete the proof of Theorem 6.2.6.
8. For the linear transformations T, S: R^3 → R^3 defined by

T(⟨x, y, z⟩) = ⟨x + y, y + z, z + x⟩
S(⟨x, y, z⟩) = ⟨2x, x + 3y, x + 2y + z⟩,

find the matrix representations for the linear transformations below with respect to the coordinate bases.

(a) 3T
(b) T + S
(c) 2T + 3S
(d) 3T + 2S
(e) 3(T + 2S)
9. For each linear transformation below, compute its rank.

(a) T: (Z_5)^3 → (Z_5)^2 given by T(⟨x, y, z⟩) = ⟨2x + 3y, 2z + x⟩.
(b) T: (Z_5)^2 → (Z_5)^2 given by T(⟨x, y⟩) = ⟨x + y, x - y⟩.
(c) T: R^3 → R^3 given by T(⟨x, y, z⟩) = ⟨x + y - z, x + z, 2y + z⟩.
(d) T: PF_4(Z_5) → PF_4(Z_5) given by T(p) = q, where q(x) = p(x + 1).
(e) T: C^∞(R) → C^∞(R) given by T(f) = f′, the derivative of f.
6.3 Matrix Multiplication
Activities
1. Define the product of M and N by the rule that the j-th column of the product is equal to the result of applying the matrix M to the j-th column of N. This product is denoted by MN.
For each pair of matrices M and N below, compute MN and NM. Based on these results, make a conjecture about the relationship between MN and NM.

(a) M = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 4 \\ 1 & 3 & 0 \\ 0 & 1 & 0 \end{pmatrix} and N = \begin{pmatrix} 2 & 1 \\ 1 & 2 \\ 1 & 1 \end{pmatrix}
(b) M = \begin{pmatrix} 1 & 2 & 1 \\ 2 & 2 & 1 \\ 3 & 1 & 1 \end{pmatrix} and N = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}
(c) M = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix} and N = \begin{pmatrix} 1 & 2 & 4 & 1 \\ 2 & 3 & 2 & 1 \\ 2 & 1 & 0 & 2 \end{pmatrix}
(d) M = \begin{pmatrix} 1 & 3 \\ 2 & 1 \end{pmatrix} and N = \begin{pmatrix} 1 & 1 \\ 2 & 3 \end{pmatrix}
2. Define a func called mat_mul which accepts two matrices and returns their product. Your code may assume that as and ms have been defined.
Define as and ms to implement arithmetic mod 5. For each M, N and P given below, compute the values of (MN)P and M(NP). Based on these results, make a conjecture about the relationship between (MN)P and M(NP).

(a) M = \begin{pmatrix} 1 & 2 \\ 2 & 3 \end{pmatrix} and N = \begin{pmatrix} 2 & 4 \\ 2 & 2 \end{pmatrix} and P = \begin{pmatrix} 1 & 3 \\ 2 & 1 \end{pmatrix}
(b) M = \begin{pmatrix} 1 & 3 & 2 \\ 2 & 1 & 3 \\ 0 & 1 & 0 \end{pmatrix} and N = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 0 & 4 \\ 4 & 4 & 4 \end{pmatrix} and P = \begin{pmatrix} 2 & 3 & 1 \\ 2 & 3 & 4 \\ 1 & 0 & 1 \end{pmatrix}
(c) M = \begin{pmatrix} 1 & 3 \\ 2 & 3 \\ 1 & 3 \end{pmatrix} and N = \begin{pmatrix} 2 & 1 \\ 3 & 1 \end{pmatrix} and P = \begin{pmatrix} 2 \\ 1 \end{pmatrix}
3. Define a func called left_mul that accepts a matrix M and returns a func. The return value of left_mul(M) should implement the map N ↦ MN.
Now define as and ms to implement arithmetic mod 3, and let M be

M = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \end{pmatrix}.

What are the domain and range of left_mul(M)? Is this a linear transformation between these spaces?
4. Let I denote the matrix in (Z_5)^{2×2} defined by

I = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}.

For each matrix M in (Z_5)^{2×2}, what are the values of MI and IM? Why do you think that the matrix whose (i, j) entry is equal to 1, if i = j, and equal to 0 otherwise, is called the identity matrix?
5. Use name_two_vector_spaces to define a domain of (Z_5)^2 and a range of (Z_5)^3. For each pair of matrices M and N below, compute the result of applying N followed by applying M to ⟨1, 3⟩. Then compute the result of applying MN to the vector ⟨1, 3⟩. Based on these results, make a conjecture about the connection between the composition of matrix applications and the product of matrices.

(a) M = \begin{pmatrix} 1 & 3 & 4 & 1 \\ 2 & 3 & 1 & 0 \\ 1 & 0 & 0 & 1 \end{pmatrix} and N = \begin{pmatrix} 2 & 3 \\ 1 & 1 \\ 3 & 0 \\ 0 & 1 \end{pmatrix}
(b) M = \begin{pmatrix} 1 & 3 & 2 \\ 2 & 1 & 0 \\ 2 & 3 & 4 \end{pmatrix} and N = \begin{pmatrix} 2 & 1 \\ 3 & 4 \\ 0 & 0 \end{pmatrix}
6. Use name_two_vector_spaces to set the domain to (Z_3)^2 and the range to (Z_3)^3. For each pair of linear transformations T, S compute the product of the matrix representations of T and S and the matrix representation of the composition T ∘ S. (Use the coordinate bases for all representations.) Based on these results, make a conjecture about the connection between the composition of linear transformations and the product of their matrix representations.

(a) T(⟨x, y⟩) = ⟨2x + y, x - y, x + y⟩
    S(⟨x, y, z⟩) = ⟨x + y, x - y, x + z⟩
(b) T(⟨x, y⟩) = ⟨x + y, x - y, 2x, 3y⟩
    S(⟨x, y, z, w⟩) = ⟨x, y, z⟩
7. For each matrix M over Z_5, create the augmented matrix (M|I), where I is the identity matrix of the appropriate dimension (see Activity 4). Use Gaussian Elimination to reduce this augmented matrix to reduced echelon form, which we denote by (E|N), and then compute MN. What can you say about this product if E is the identity matrix? What can you say about this product if E is not the identity matrix?

(a) M = \begin{pmatrix} 2 & 1 \\ 7 & 8 \end{pmatrix}
(b) M = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 4 & 1 \\ 0 & 1 & 0 \end{pmatrix}
(c) M = \begin{pmatrix} 0 & 1 \\ 0 & 2 \end{pmatrix}
(d) M = \begin{pmatrix} 1 & 4 & 2 \\ 2 & 3 & 0 \\ 1 & 2 & 0 \end{pmatrix}
8. Use the technique in Activity 7 to write a func called mat_inv that accepts a matrix M and returns a matrix N such that MN is the identity matrix. If no such matrix exists, mat_inv should return OM.
For each matrix M over Z_5 given below, compute the column rank of M and then determine if M has an inverse. Make a conjecture about the relationship between these two results.

(a) M = \begin{pmatrix} 1 & 3 & 2 \\ 2 & 1 & 4 \\ 3 & 4 & 1 \end{pmatrix}
(b) M = \begin{pmatrix} 2 & 1 & 3 \\ 1 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}
(c) M = \begin{pmatrix} 2 & 1 \\ 2 & 0 \end{pmatrix}
(d) M = \begin{pmatrix} 1 & 4 \\ 2 & 3 \end{pmatrix}
9. For each system of equations below, find the solution set using Gaussian Elimination. Then find the inverse of the coefficient matrix and apply this matrix to the tuple of constants. Based on your findings, describe a method of solving a system of equations using matrix inverses.

(a) x + y - z = 3
    x - y + z = 0
    x + 2y + z = 1
(b) x + 2y - 3z = 4
    3y + 2z = 2
    x - 4z = 2
(c) x + y = 0
    2x - 2y = 0
10. For each linear transformation from (Z_5)^3 to (Z_5)^3, determine if it is invertible. Then determine if the image of the coordinate basis under T is also a basis.

(a) T(⟨x, y, z⟩) = ⟨x + y, x + 4y, z⟩
(b) T(⟨x, y, z⟩) = ⟨x + 2y + 3z, 2x + 3y, 3x + y + 4z⟩
(c) T(⟨x, y, z⟩) = ⟨x + y, y + 4z, z + 4x⟩
(d) T(⟨x, y, z⟩) = ⟨x + 3y + z, x + 2y, y + z⟩
Discussion
Matrix Multiplication
You might recall that a vector space does not require a multiplication operation between the elements of the vector space, only an addition operation between them and multiplication between an element and a scalar. In Section 6.1, we focused on the vector space structure of the collection of n × m matrices over the scalars K and discussed such an addition and multiplication. In this section we will develop an additional structure on matrices: the ability to multiply two matrices together. We now present a formal definition of matrix multiplication as developed in Activities 1 and 2.
Definition 6.3.1. Given an n × p matrix M = (m_{ij}) and a p × m matrix N = (n_{ij}), we define their product to be the n × m matrix

MN = \left( \sum_{k=1}^{p} m_{ik} n_{kj} \right).
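The definition translates directly into a computation. Here is a rough sketch in Python rather than ISETL (the list-of-rows representation and plain integer arithmetic are our own choices for illustration; over Z_p each entry would be reduced mod p):

    def mat_mul(M, N):
        # Entry (i, j) of MN is the sum over k of M[i][k] * N[k][j].
        p = len(N)          # rows of N, which must equal the number of columns of M
        cols = len(N[0])    # columns of N
        return [[sum(M[i][k] * N[k][j] for k in range(p)) for j in range(cols)]
                for i in range(len(M))]

    # A small integer example of our own:
    # mat_mul([[1, 2], [3, 4]], [[5, 6], [7, 8]]) returns [[19, 22], [43, 50]].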
One important question to ask about matrix multiplication is how it interacts with the vector space operations. In Activity 3, you verified that matrix multiplication by a given matrix is a linear transformation. This result is true in general.

Theorem 6.3.1. Let M be a p × q matrix over K. Then:
the map N ↦ MN defines a linear transformation from K^{q×m} to K^{p×m};
the map N ↦ NM defines a linear transformation from K^{n×p} to K^{n×q}.
Proof. We prove the case for the first map. You will be asked to prove the second case in the exercises (see Exercise 2). Let M ∈ K^{p×q}. For N ∈ K^{q×m}, define T(N) = MN ∈ K^{p×m}. We must show that for any a, b ∈ K and N, P ∈ K^{q×m} we have

M(aN + bP) = T(aN + bP) = aT(N) + bT(P) = aMN + bMP.

This can be done by direct computation:

M(aN + bP) = (m_{ij})(a n_{ij} + b p_{ij}) = \left( \sum_k m_{ik}(a n_{kj} + b p_{kj}) \right)
= \left( \sum_k a m_{ik} n_{kj} + b m_{ik} p_{kj} \right)
= a \left( \sum_k m_{ik} n_{kj} \right) + b \left( \sum_k m_{ik} p_{kj} \right) = aMN + bMP.
In Activities 1, 2 and 4, you explored some of the basic properties of matrix multiplication. Although you undoubtedly discovered that matrix multiplication shared some properties with real number multiplication, you found that there are notable differences. Can you identify some of the similarities? Can you describe the differences? Some of the similarities are presented in the following theorem.
Theorem 6.3.2. Let the matrix I be defined by (δ_{ij}), where δ_{ij} is equal to 1, if i = j, and δ_{ij} = 0, otherwise. Let the matrix Z be the zero matrix. If A, B, and C are matrices, then the following equalities hold, provided the products are defined (in other words, provided the dimensions are correct).

A(B + C) = AB + AC
(A + B)C = AC + BC
k(AB) = (kA)B = A(kB)
AI = A = IA
AZ = Z = ZA
A(BC) = (AB)C

Proof. We will prove some of these and leave others as exercises.
A(B + C) = AB + AC is valid, because multiplication from the left by A is a linear transformation.
(A + B)C = AC + BC is valid, because multiplication from the right by C is a linear transformation.
A(kB) = k(AB) is valid, because multiplication from the left by A is a linear transformation.
(kA)B = k(AB) is valid, because multiplication from the right by B is a linear transformation.
A(BC) = (AB)C is left to the reader (see Exercise 3).
We will prove AI = A = IA directly.

AI = (a_{ij})(δ_{ij}) = \left( \sum_k a_{ik} δ_{kj} \right) = \left( a_{ij} δ_{jj} + \sum_{k ≠ j} a_{ik} δ_{kj} \right) = \left( a_{ij} + \sum_{k ≠ j} 0 \right) = (a_{ij}) = A.

The proof of A = IA is similar and left to the reader (see Exercise 4).
AZ = Z = ZA is left to the reader (see Exercise 5).
The statement in the theorem, "provided the products are defined", is required, because not all matrices can be multiplied. What are the requirements on the dimensions of two matrices that allow them to be multiplied?
The second property which matrix multiplication lacks is commutativity. Even when both AB and BA are defined, they may not even be the same dimension, as you discovered in Activity 1. This also explains why Theorem 6.3.2 requires two different distribution laws (one from the right and one from the left).
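For a concrete instance (an example of our own, over R), even square matrices of the same size need not commute:

\begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}, while \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}.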
Multiplication as Composition
There is another operation which has similar properties to matrix multiplication: composition of linear transformations. Composition is associative, distributes over addition and scalar multiplication, has an identity, and has a zero-function. Function composition is not always defined and is usually non-commutative.
In Activities 5 and 6, you explored the connection between function composition and matrix multiplication. Although the proofs of the following two theorems are straightforward, they provide an important connection between matrices and linear transformations, which we will be able to use for a variety of purposes.
Theorem 6.3.3. Let M be an n × p matrix over K and N a p × m matrix over K. If we start with a tuple x in K^m, apply N to this tuple, and then apply M to the resulting tuple in K^p, then the final tuple in K^n will be equal to the tuple obtained by applying MN to the original tuple in K^m. In short:
M(Nx) = (MN)x.
Proof. Let x = ⟨x_1, ..., x_m⟩ be any vector in K^m.
The result of applying the matrix N to x will have \sum_j n_{kj} x_j as its k-th component. We then apply the matrix M to this, and the result will have

\sum_k m_{ik} \left( \sum_j n_{kj} x_j \right)

as its i-th component.
The result of applying the matrix MN to the vector x will have

\sum_j \left( \sum_k m_{ik} n_{kj} \right) x_j

as its i-th component. What remains to be shown is that these two are equal:

\sum_k m_{ik} \left( \sum_j n_{kj} x_j \right) = \sum_{k,j} m_{ik} n_{kj} x_j = \sum_j \left( \sum_k m_{ik} n_{kj} \right) x_j.
This analogous relationship also holds for matrix representations of a linear transformation, as the following theorem shows:

Theorem 6.3.4. Let T: K^p → K^n and S: K^m → K^p be linear transformations. The matrix representation of T ∘ S is equal to the product of the matrix representation of T and the matrix representation of S.
Proof. Let the matrix representation of T be given by (t_{ij}) and the matrix representation of S be given by (s_{ij}). We find the column tuples of the matrix representation of T ∘ S by applying T ∘ S to the basis elements of K^m.

T ∘ S(e_j) = T \left( \sum_k s_{kj} e_k \right) = \sum_k s_{kj} T(e_k) = \sum_k s_{kj} \left( \sum_i t_{ik} e_i \right) = \sum_{i,k} t_{ik} s_{kj} e_i = \sum_i \left( \sum_k t_{ik} s_{kj} \right) e_i.

This means that the matrix representation of T ∘ S is the matrix \left( \sum_k t_{ik} s_{kj} \right), which completes the proof.
The result of these two theorems is that (with the extra baggage of ordered bases) matrix multiplication and transformation composition are just two views of the same process. This can be described using the following diagram.

          S           T
     U ------> V ------> W          (composite along the top: T ∘ S)
     |         |         |
     v         v         v
    K^m -----> K^p -----> K^n        (composite along the bottom: MN)
          N           M

The top row of the diagram illustrates the composition of linear transformations between vector spaces and the bottom row illustrates matrix multiplication. As always in these diagrams, the link between the top and bottom rows is the choice of ordered bases.
Invertible Matrices and Change of Bases
In Activity 4, you discovered the identity matrix I. What is special about a matrix of this form? In Activities 7 and 8, you discovered a method for determining the inverse of a matrix (in other words, a method for finding a matrix N such that MN = I). The method suggested in this activity is equivalent to solving the equation MN = I, where I is the identity matrix and N represents the inverse of M, if the inverse exists. If we denote column j of N by N_j, then the system of equations given by M N_j = I_j is one of n systems represented by MN = I, which, when expanded, yields n systems of equations in n unknowns, all of which have the same coefficient matrix. Since the coefficients are the same for each system, we can apply elementary row operations to the augmented matrix (M|I) to find the solution set, if it exists, of each individual system. The func mat_inv that you wrote in Activity 8 implements this strategy for determining the inverse.
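As a small worked example over R (not one of the matrices from the activities): to invert M = \begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix}, row reduce the augmented matrix

\left( \begin{array}{cc|cc} 2 & 1 & 1 & 0 \\ 1 & 1 & 0 & 1 \end{array} \right) \longrightarrow \left( \begin{array}{cc|cc} 1 & 0 & 1 & -1 \\ 0 & 1 & -1 & 2 \end{array} \right),

so N = \begin{pmatrix} 1 & -1 \\ -1 & 2 \end{pmatrix}, and multiplying confirms that MN = I.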
Theorem 6.3.5. The matrix M is invertible if and only if the matrix obtained by reducing M to reduced echelon form is the identity matrix.

Proof. See Exercise 7.

A direct consequence of this was investigated in Activity 8 and is presented here as a corollary.

Corollary. An n × n matrix is invertible if and only if its row rank is equal to n.

Proof. This follows quickly from the previous theorem. The row rank of M is equal to the row rank of the matrix obtained by reducing M to reduced echelon form. If M is invertible, the reduced matrix is I, and the row rank is n. If M is not invertible, the reduced matrix, which is not I, has a row of all zeros and hence row rank less than n.
You should see if you can translate the analysis of the paragraph preceding the theorem into statements about linear transformations. Can you use this technique to describe a general method for finding inverses of linear transformations?

Theorem 6.3.6. If T is an invertible linear transformation from K^n to K^n, then the matrix representation of T in K^{n×n} is invertible.

Proof. Let T have inverse S so that T ∘ S = I. Let M be the matrix representation of T and N be the matrix representation of S. Then MN is the matrix representation of T ∘ S = I, which means that MN = I.

In this proof we have used I with two meanings. In one place it refers to the identity linear transformation, and in another it refers to the identity matrix. Can you identify which one is which in the proof above?
Theorem 6.3.7. Let M be an invertible matrix in K^{n×m}. The linear transformation T: K^m → K^n defined by

T(x) = Mx,

where x ∈ K^m, is invertible.

Proof. See Exercise 8.

Corollary. An n × n matrix is invertible if and only if its column rank is equal to n.

Proof. Let M be any matrix in K^{n×n} and T be the linear transformation defined by applying M. Then the range of T is generated by the columns of M. This means that the rank of T is equal to the column rank of M. The corollary follows because T is invertible if and only if its rank is equal to the dimension of its domain (which is n).
One particular application of inverse matrices was presented in Activity 9, where you used a matrix inverse to find the solution set of a system of linear equations. The general technique can be described as follows. Starting with an equation MX = Y, you find the value of X by computing M^{-1} Y, where M^{-1} is the inverse of M. Written out in notation, the solution should look very familiar.

MX = Y
M^{-1}(MX) = M^{-1} Y
(M^{-1} M)X = M^{-1} Y
IX = M^{-1} Y
X = M^{-1} Y
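Continuing the small example above, the system 2x + y = 3, x + y = 1 has coefficient matrix M with inverse M^{-1} = \begin{pmatrix} 1 & -1 \\ -1 & 2 \end{pmatrix}, so X = M^{-1} ⟨3, 1⟩ = ⟨2, -1⟩; substituting x = 2, y = -1 back into the equations confirms the solution.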
Now that we have a number of conditions for determining the invertibility of a matrix, we focus on one interpretation of invertible transformations (and matrices). When you worked Activity 10, you discovered a connection between changing ordered bases and invertibility. Namely, every time you apply an invertible matrix to an ordered basis, you get another ordered basis. As a result, the linear transformation defined by the application of an invertible matrix is simply a change of ordered basis. This perspective will allow us to prove the following theorem.
Theorem 6.3.8. If P is an invertible matrix, then the column rank of PM is equal to the column rank of M.
Proof. The key to the proof is that M and PM both represent the same linear transformation, except with respect to different ordered bases. Since both matrices have rank equal to that of the transformation they represent, M and PM have the same rank.
Let M be an n × m matrix and P be an invertible n × n matrix. Let A be the coordinate basis of K^m, let B be the coordinate basis of K^n, and let the ordered set C = [c_i] be the result of applying the inverse of P to the elements of B. Let T: K^m → K^n be the linear transformation defined by applying M with respect to A and B. In the exercises (see Exercise 16), you will prove that the matrix representation of T with respect to A and B is equal to M. We show that the matrix representation of T with respect to A and C is equal to PM.
The first step is to prove that C is really a basis. Let S be the linear transformation defined by the application of the inverse of P. Because the inverse of P is invertible, S is an invertible linear transformation. This implies that the range of S has dimension n and so must be all of K^n. However, the range of S is spanned by the n vectors c_i = S(e_i), and so this set must be linearly independent by Theorem 4.4.9. As a result, C is a basis of K^n.
Note that because c_i is the result of applying the inverse of P to the vector e_i, we know that the vector e_i is the result of applying P to c_i. In other words,

e_k = \sum_i p_{ik} c_i.
To finish the proof, we compute the matrix representation of T with respect to the ordered bases A and C:

T(e_j) = \sum_k m_{kj} e_k = \sum_k m_{kj} \left( \sum_i p_{ik} c_i \right) = \sum_i \left( \sum_k p_{ik} m_{kj} \right) c_i.

This calculation shows that the matrix representation of T with respect to A and C is equal to

\left( \sum_k p_{ik} m_{kj} \right) = PM,

which is what was desired.
The proof may be hard to conceptualize, but it can be clarified using the diagram notation.

          T           I
    K^m ------> K^n ------> K^n        (composite along the top: T)
     |           |           |
     A           B           C
     v           v           v
    K^m ------> K^n ------> K^n        (composite along the bottom: PM)
          M           P

At the top of the diagram, there are two arrows labelled with the linear transformation T, which has a fixed rank. At the bottom of the diagram, the corresponding arrows are labelled M and PM, so the column ranks of M and PM must be equal.
Exercises 17 and 18 ask you to state and prove the analog of Theorem 6.3.8 for M and MP, where P is invertible. A more complete development of this theory is presented in Chapter 7.
So now we arrive at a position where we are able to prove Theorem 6.2.2. Recall that the theorem states that the column rank of a matrix is unaffected by the elementary row operations.

Proof of Theorem 6.2.2. Let M be a matrix in K^{n×n}. We prove that each elementary row operation on M can be implemented by multiplying M on the left by an invertible matrix. This suffices to prove the result by Theorem 6.3.8.
The interchanging of rows i and j can be implemented by multiplying by the matrix M defined by

m_{k,\ell} = \begin{cases} 1 & k = \ell, \text{ except } (k, \ell) = (i, i) \text{ and } (k, \ell) = (j, j) \\ 1 & (k, \ell) = (j, i) \text{ or } (k, \ell) = (i, j) \\ 0 & \text{otherwise} \end{cases}

The multiplying of row i by the scalar a can be implemented by multiplying by the matrix M defined by

m_{k,\ell} = \begin{cases} 1 & k = \ell, \text{ except } (k, \ell) = (i, i) \\ a & (k, \ell) = (i, i) \\ 0 & \text{otherwise} \end{cases}

The replacement of row i by the sum of row i and row j can be implemented by multiplying by the matrix M defined by

m_{k,\ell} = \begin{cases} 1 & k = \ell \\ 1 & (k, \ell) = (i, j) \\ 0 & \text{otherwise} \end{cases}
We leave as exercises the proofs that these matrices implement the appropriate elementary row operations (see Exercises 19 through 21). The fact that they are invertible follows from the fact that each operation is reversible.
Exercises
1. For each of the matrices M, N over R below, compute the product MN.

(a) M = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 4 \end{pmatrix} and N = \begin{pmatrix} 2 & 3 \\ 2 & 1 \\ 3 & 1 \end{pmatrix}
(b) M = \begin{pmatrix} 1 & 2 & 3 \\ 0 & 1 & 2 \\ 0 & 0 & 3 \end{pmatrix} and N = \begin{pmatrix} 3 & 0 & 0 \\ 2 & 4 & 0 \\ 1 & 3 & 1 \end{pmatrix}
(c) M = \begin{pmatrix} 4 & 0 & 0 \\ 0 & 5 & 0 \\ 0 & 0 & 3 \end{pmatrix} and N = \begin{pmatrix} 2 & 0 & 0 \\ 0 & 4 & 0 \\ 0 & 0 & 6 \end{pmatrix}
2. Complete the proof of Theorem 6.3.1.
3. Provide the proof of associativity in Theorem 6.3.2: for any three matrices A, B, and C, (AB)C = A(BC), whenever the products are defined.

4. Complete the proof of the existence of an identity matrix in Theorem 6.3.2: for any matrix A, AI = A = IA, whenever the product is defined.

5. Provide the proof that multiplication by the zero matrix always results in the zero matrix in Theorem 6.3.2: for any matrix A, ZA = Z = AZ, whenever the products are defined.
6. Define T, S: R^3 → R^2 by

T(⟨x, y, z⟩) = ⟨x + 2y, z⟩
S(⟨x, y, z⟩) = ⟨y - z, z - x⟩.

Compute the matrix representations of the following linear transformations with respect to the coordinate bases.

(a) T - S
(b) 2T - 3S
(c) T - (2S + T + I)
7. Provide the proof of Theorem 6.3.5.
8. Provide the proof of Theorem 6.3.7.
9. Assume that M and N are square matrices. Prove that if M and N
are invertible, then MN is invertible.
10. Assume that M and N are square matrices. Prove that if MN is
invertible, then M and N are both invertible.
11. For each linear transformation T over R given below, compute its inverse.

(a) T(⟨x, y⟩) = ⟨x + y, x - y⟩
(b) T(⟨x, y, z⟩) = ⟨x + 2y, 3x - z, 4x + 2y + 3z⟩
(c) T(⟨x, y, z, w⟩) = ⟨x + y + z, y + z, z + w, x + w⟩
12. Solve each system of equations over R given below. (Hint: Are there similarities between these systems' coefficient matrices that can be exploited?)

(a) x + 7y + z = 14
    2x + 2y - 10z = 4
    3y + 4z = 7
(b) x + 7y + z = 2
    2x + 2y - 10z = 3
    3y + 4z = 2
(c) x + 7y + z = 0
    2x + 2y - 10z = 0
    3y + 4z = 0
(d) x + 7y + z = 7
    2x + 2y - 10z = 1
    3y + 4z = 7
(e) x + 7y + z = 1
    2x + 2y - 10z = 1
    3y + 4z = 1
(f) x + 7y + z = 2
    2x + 2y - 10z = 4
    3y + 4z = 6
(g) x + 7y + z = √2
    2x + 2y - 10z = π
    3y + 4z = e
13. Solve each system of equations over R below. (Hint: Are there similarities between these systems' coefficient matrices that can be exploited?)

(a) 2x + 3y + 2z + w = 3
    x + y + z = 2
    2y + 2z = 3
    3z + 4w = 7
(b) 2x + 3y + 2z + w = 1
    x + y + z = 3
    2y + 2z = 4
    3z + 4w = 4
(c) 2x + 3y + 2z + w = 1
    x + y + z = 1
    2y + 2z = 0
    3z + 4w = 0
(d) 2x + 3y + 2z + w = 2
    x + y + z = 5
    2y + 2z = 8
    3z + 4w = 10
(e) 2x + 3y + 2z + w = 4
    x + y + z = 2
    2y + 2z = 3
    3z + 4w = 47
(f) 2x + 3y + 2z + w = √3
    x + y + z = 2
    2y + 2z = e^e
    3z + 4w = 0
14. Let D: P_2(R) → P_1(R) be the linear transformation D(p) = p′, the derivative of p. Let J: P_1(R) → P_2(R) be the linear transformation where J(p) = \int_0^x p(t) dt.

(a) Compute the matrix representation M of D with respect to the ordered bases [1, x, x^2] and [1, x].
(b) Compute the matrix representation N of J with respect to the ordered bases [1, x] and [1, x, x^2].
(c) Compute the matrix product MN. What does this say about the linear transformation D ∘ J?
(d) Compute the matrix product NM. What does this say about J ∘ D?
15. Let D: P_5(R) → P_4(R) be the linear transformation D(p) = p′, the derivative of p. Let J: P_4(R) → P_5(R) be the linear transformation where J(p) = \int_0^x p(t) dt.

(a) Compute the matrix representation M of D with respect to the ordered bases [1, x, x^2, x^3, x^4, x^5] and [1, x, x^2, x^3, x^4].
(b) Assume more generally that D: P_n(R) → P_{n-1}(R) is the linear transformation D(p) = p′, the derivative of p. What is the matrix representation of D with respect to the ordered bases [1, ..., x^n] and [1, ..., x^{n-1}]?
(c) Compute the matrix representation N of J with respect to the ordered bases [1, x, x^2, x^3, x^4] and [1, x, x^2, x^3, x^4, x^5].
(d) Assume more generally that J: P_{n-1}(R) → P_n(R) is the linear transformation J(p) = \int_0^x p(t) dt. What is the matrix representation of J with respect to the ordered bases [1, ..., x^{n-1}] and [1, ..., x^n]?
16. Let M be an n × m matrix and T: K^m → K^n be the linear transformation defined by application of the matrix M. Prove that the matrix representation of T is equal to M. Notice, we have not specified ordered bases for this exercise. Why was that omission acceptable in this case?
17. State the theorem analogous to Theorem 6.3.8 for the matrices MP
and M.
18. Prove the theorem you stated in Exercise 17.
19. Let M be defined as

m_{k,\ell} = \begin{cases} 1 & k = \ell, \text{ except } (k, \ell) = (i, i) \text{ and } (k, \ell) = (j, j) \\ 1 & (k, \ell) = (j, i) \text{ or } (k, \ell) = (i, j) \\ 0 & \text{otherwise} \end{cases}

Prove that the matrix MN is the result of interchanging rows i and j in the matrix N.

20. Let M be defined as

m_{k,\ell} = \begin{cases} 1 & k = \ell, \text{ except } (k, \ell) = (i, i) \\ a & (k, \ell) = (i, i) \\ 0 & \text{otherwise} \end{cases}

Prove that the matrix MN is the result of multiplying row i of the matrix N by a.

21. Let M be defined as

m_{k,\ell} = \begin{cases} 1 & k = \ell \\ 1 & (k, \ell) = (i, j) \\ 0 & \text{otherwise} \end{cases}

Prove that the matrix MN is the result of replacing row i of the matrix N with the sum of rows i and j of the matrix N.
22. Although we have defined a matrix product above, another possibility for a product would have been

A ∗ B = (1/2)AB + (1/2)BA,

where juxtaposition indicates the product defined in this section. Prove that this new product is linear and commutative. For A, I and Z of the same size, show that I ∗ A = A = A ∗ I and Z ∗ A = Z = A ∗ Z. Give an example of three matrices A, B and C for which (A ∗ B) ∗ C ≠ A ∗ (B ∗ C).
6.4 Determinants
Activities
1. Define a func called det which accepts a 2 × 2 matrix M = (m_{ij}) and returns the value m_{11} m_{22} - m_{12} m_{21}. Your code may assume that as and ms implement addition and multiplication.
Now define as and ms to implement arithmetic mod 5. For each of the following pairs of matrices, compute det(M), det(N) and det(MN). Make a conjecture based on your findings.

(a) M = \begin{pmatrix} 2 & 3 \\ 1 & 4 \end{pmatrix} and N = \begin{pmatrix} 3 & 4 \\ 0 & 1 \end{pmatrix}
(b) M = \begin{pmatrix} 1 & 3 \\ 2 & 1 \end{pmatrix} and N = \begin{pmatrix} 0 & 1 \\ 2 & 4 \end{pmatrix}
(c) M = \begin{pmatrix} 2 & 0 \\ 1 & 3 \end{pmatrix} and N = \begin{pmatrix} 1 & 3 \\ 3 & 4 \end{pmatrix}
Discussion
You probably noticed that this activity section was rather sparse. Often you may want to determine whether a matrix is invertible, but you are not interested in the value of the inverse matrix. The purpose of this section is to provide you with a technique for determining the invertibility of a matrix through a single computation. We provide such a function with the following definition.
Definition 6.4.1. If M is a 1 × 1 matrix, then the determinant of M is m_{11}. The determinant of M is denoted by det(M).
If M is an n × n matrix, the (i, j) cofactor of M is defined as the determinant of the (n - 1) × (n - 1) matrix obtained by removing the i-th row and j-th column from the matrix M. This cofactor is denoted by M_{ij}.
If M is an n × n matrix, the determinant of M is defined as any of the following:

det(M) = \sum_{j=1}^{n} (-1)^{i+j} m_{ij} M_{ij}
det(M) = \sum_{i=1}^{n} (-1)^{i+j} m_{ij} M_{ij}

Note that the first formula represents n different summations (one for each fixed i with 1 ≤ i ≤ n) and the second formula represents n different summations (one for each fixed j with 1 ≤ j ≤ n). The first formula is called the expansion along the i-th row, and the second formula is called the expansion along the j-th column.
We currently do not have the tools to prove the following theorem, but it is important because it states that there is no ambiguity in the above definition.

Theorem 6.4.1. All of the formulas in Definition 6.4.1 produce the same scalar.

The formula for computing the determinants of 2 × 2 matrices was provided in Activity 1 and is worth remembering. We provide it here again.

det \begin{pmatrix} a & b \\ c & d \end{pmatrix} = ad - bc
The formula for computing determinants of 3 × 3 matrices is slightly more complicated, but also worth remembering. We provide the formula for expansion along the first row here:

det \begin{pmatrix} m_{11} & m_{12} & m_{13} \\ m_{21} & m_{22} & m_{23} \\ m_{31} & m_{32} & m_{33} \end{pmatrix} = m_{11}(m_{22} m_{33} - m_{32} m_{23}) - m_{12}(m_{21} m_{33} - m_{31} m_{23}) + m_{13}(m_{21} m_{32} - m_{31} m_{22}).

Since the value of the determinant of a matrix is independent of the row or column selected when performing a cofactor expansion, the easiest way to compute the determinant of a matrix M is to find the row or column which contains the largest number of zero entries.
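Definition 6.4.1 also gives a direct (if inefficient) algorithm. Here is a rough sketch in Python rather than ISETL, expanding along the first row and using ordinary arithmetic (over Z_p the result would be reduced mod p):

    def det(M):
        # Cofactor expansion along the first row (the case i = 1 in Definition 6.4.1).
        n = len(M)
        if n == 1:
            return M[0][0]
        total = 0
        for j in range(n):
            # The cofactor: delete the first row and column j + 1 of M.
            minor = [row[:j] + row[j + 1:] for row in M[1:]]
            total += (-1) ** j * M[0][j] * det(minor)
        return total

    # A small integer example of our own: det([[1, 2], [3, 4]]) returns -2.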
6.4 Determinants 333
Theorem 6.4.2. If Z is a zero matrix, then det(Z) = 0. If I is an identity
matrix, then det(I) = 1.
Proof. See Exercises 2 and 3.
In Activity 1, you discovered one important property of 2 × 2 determinants, namely, that the determinant of a product of matrices is the product of the determinants of those matrices. This holds in general, although we will not prove it in this text.

Theorem 6.4.3. For any pair of n × n matrices M and N, det(MN) = det(M) det(N).

Theorem 6.4.4. The matrix M is invertible if and only if det(M) ≠ 0.

Proof. If M is invertible with inverse N, then det(M) det(N) = det(MN) = det(I) = 1. This implies that det(M) ≠ 0.
The proof that det(M) ≠ 0 implies that M has an inverse can be found in more advanced linear algebra texts.
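For example, over R, det \begin{pmatrix} 1 & 2 \\ 2 & 4 \end{pmatrix} = 1 · 4 - 2 · 2 = 0, and indeed this matrix is not invertible: its second column is twice its first, so its column rank is 1.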
As you might have noticed, our coverage of determinants has been short, and many of the theorems have been left unproven. There is enough information about determinants to fill an entire chapter, but, for the moment, we only need a few results. Complete proofs would take us far afield.
Exercises
1. Compute the determinants of the following matrices.
(a) M = \begin{pmatrix} 1 & 3 \\ 2 & 4 \end{pmatrix}, over R
(b) M = \begin{pmatrix} 2 & 3 \\ 1 & 4 \end{pmatrix}, over Z_5
(c) M = \begin{pmatrix} 2 & 2 & 1 \\ 0 & 1 & 0 \\ 3 & 2 & 1 \end{pmatrix}, over Z_5
(d) M = \begin{pmatrix} 2 & 0 & 0 \\ 1 & 2 & 0 \\ 2 & 1 & 3 \end{pmatrix}, over R
(e) M = \begin{pmatrix} x & 3 & 2 \\ 2 & x & 2 \\ 1 & 2 & x \end{pmatrix}, over R
2. Prove that if Z is the zero matrix, then det(Z) = 0. (Hint: This proof
requires the use of mathematical induction.)
3. Prove that if I is the identity matrix, then det(I) = 1. (Hint: This
proof requires the use of mathematical induction.)
Chapter 7
Getting to Second Bases
So the grand finale certainly has an interesting title. In this last chapter, we want to look at some of the ways that linear algebra is useful, other than being a great way to spend a semester with your favorite mathematician. Specifically, we explore the power that is harnessed by using matrix representations for linear transformations. We revisit bases of a vector space and see that by choosing wisely, much work can be avoided! And the fun we will have with eigenstuff...
7.1 Change of Basis
Activities
1. Complete parts (a)-(c) for each of the following matrices A = (a_{ij}) given below. Use the information obtained in (a)-(c) to answer the question posed in (d).

A_1 = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \\ 3 & 1 & 2 \end{pmatrix}

A_2 = \begin{pmatrix} 1/2 & 1/8 & 1/4 & 1/8 \\ 0 & 1/2 & 0 & 1/2 \\ 1/4 & 1/2 & 0 & 1/4 \\ 0 & 1/3 & 0 & 2/3 \end{pmatrix}

A_3 = \begin{pmatrix} 1 & 1 & 1 \\ 3 & 2 & 3 \\ 5 & 5 & 4 \end{pmatrix}
(a) Show that A is invertible by applying a tool from Chapter 6 to construct its inverse S = (s_{ij}).
(b) For any x = ⟨x_1, x_2, ..., x_n⟩ ∈ R^n, where n denotes the number of columns in A, define the vector y = ⟨y_1, y_2, ..., y_n⟩ ∈ R^n by y = S x. Select three nonzero vectors x, and use the equation to find three corresponding vectors y.
(c) Let b_j be the vector whose components are the entries of the j-th column of A. Check to see that the sequence B = [b_1, b_2, ..., b_n] forms a basis for R^n, and then, for each vector y, compute the sum

z = \sum_{j=1}^{n} y_j b_j.

Compare each vector z to the vector x to which it corresponds. What do you observe?
(d) Describe the procedure alluded to in (a)-(c). Given a basis B = [b_1, b_2, ..., b_n] for R^n, how do we find the coordinate vector [x]_B of x, that is, the vector ⟨s_1, s_2, ..., s_n⟩ whose components are the scalars in the expression

x = \sum_{i=1}^{n} s_i b_i ?
2. Construct a func ChangeCoeff that will accept an ordered basis B = [b_1, b_2, ..., b_n] for R^n and a vector x ∈ R^n and that returns the coordinate vector [x]_B = ⟨s_1, s_2, ..., s_n⟩ of x. This is the vector whose components are the coefficients of the equation

x = \sum_{i=1}^{n} s_i b_i.

Check your func for each x that you selected in Activity 1.
3. Let B = [b_1, b_2, ..., b_n] be an ordered basis for R^n. Let T: R^n → R^n be a linear transformation defined by T(x) = C x, where each matrix C is defined below. Complete (a)-(d) for each transformation T and basis B.

C_1 = \begin{pmatrix} 9 & 7 & 7 \\ 3 & 1 & 3 \\ 5 & 5 & 3 \end{pmatrix}
B_1 is formed from the columns of the matrix A_1 defined in Activity 1.

C_2 = \begin{pmatrix} 1 & 3 & 2 & 3 \\ 2 & 1 & 5 & 1 \\ 1 & 4 & 0 & 8 \\ 2 & 5 & 3 & 0 \end{pmatrix}
B_2 is formed from the columns of the matrix A_2 defined in Activity 1.

C_3 = \begin{pmatrix} 9 & 1 & 1 \\ 2 & 1 & 3 \\ 3 & 5 & 4 \end{pmatrix}
B_3 is formed from the columns of the matrix A_3 defined in Activity 1.
(a) Verify that the matrix representation of T with respect to the coordinate basis is equal to the matrix C.
(b) Let M be the matrix whose j-th column is the sequence of components of the vector b_j ∈ B. Show that M is invertible, and find its inverse M^{-1}. Compute M^{-1} C M.
(c) Select two nonzero vectors x ∈ R^n. Apply the func ChangeCoeff to find the coordinate vector [x]_B of x with respect to the basis B. Compute (M^{-1} C M) [x]_B.
(d) Compute T(x). Apply the func ChangeCoeff to find [T(x)]_B. Compare (M^{-1} C M) [x]_B and [T(x)]_B.
4. Let T: R^n → R^n be a linear transformation, and let C be the matrix representation of T with respect to the coordinate basis. Based upon your experience with Activity 3, construct a func ChangeFromCoordB that will accept the matrix representation C and an ordered basis B = [b_1, b_2, ..., b_n] and return the matrix representation with respect to the basis B. Check your func for each transformation defined in Activity 3.

5. Let T: R^n → R^n be a linear transformation, and let B be the matrix representation of T with respect to an ordered basis B. Construct a func ChangeToCoordB that will accept the matrix representation B of T with respect to B and return the matrix representation of T with respect to the coordinate basis. Check your func for each transformation defined in Activity 3.
6. In the following examples, try to find a basis that transforms the given matrix B to a matrix C having the indicated form.

(a) The matrix B is given by
    B_1 = \begin{pmatrix} 2 & 2 \\ 5 & 1 \end{pmatrix}.
    The matrix C is to have a diagonal form, that is, all entries other than those on the main diagonal are zero.
(b) The matrix B is given by
    B_2 = \begin{pmatrix} 19 & 18 & 5 \\ 32 & 33 & 10 \\ 62 & 69 & 22 \end{pmatrix}.
    The matrix C is to have a lower triangular form. This means that all entries above the main diagonal are zero.
(c) The matrix B is the matrix C_3 from Activity 3. The matrix C is to have a diagonal form.
Discussion

In Chapter 6, you learned how to find coordinate vectors and matrix representations. In both cases, you discovered that neither is unique. For a vector $u \in U$, the components of its corresponding coordinate vector in $K^n$ depend upon the basis selected for $U$. Similarly, the form of a matrix representation of a linear transformation $T : U \to V$ depends upon the bases selected for $U$ and $V$. In this section, we investigate the relationship between representations for different bases. In particular, if $u \in U$, and if $B$ and $\mathcal{C}$ are two bases for $U$, what is the relationship between the two coordinate vectors $[u]_B$ and $[u]_{\mathcal{C}}$? If $T : U \to U$ is a linear transformation from $U$ to itself, how is the matrix representation with respect to $B$ related to that for $\mathcal{C}$? In the first subsection, we will discuss change of basis in relation to coordinate vectors. In the second subsection, we will see how to transform a matrix representation from one basis into another. Throughout this chapter, you will be introduced to examples that show why the ability to change bases is important.

Coordinate Vectors

If $V = K^n$, and if the given basis is the coordinate basis $\mathcal{C} = [e_1, \ldots, e_n]$, the coordinate vector $[v]_{\mathcal{C}}$ of any $v \in K^n$ is simply the vector $v$ itself. Do you remember what the form of each $e_i$ vector is? Can you explain why the coordinate vector in this case is the same as $v$?

There are many instances in which we need to work with a basis other than the coordinate basis. In such a case, the coordinate vector of $v \in K^n$, as you discovered in Activity 1, is not equal to $v$. Our interest in this section is to study the relationship between coordinate vectors and different bases, and to understand the procedure for changing from one basis to another. The func ChangeCoeff that you wrote in Activity 2 involves changing from the coordinate basis to a second basis $B$. Is this func consistent with the following theorem?
Theorem 7.1.1. Given a basis $B = [b_1, b_2, \ldots, b_n]$ for a vector space $K^n$, let $M$ be the matrix whose $j$th column entries are the components of the vector $b_j$. Then, the matrix $M$ is invertible, and, given any vector $x = \langle x_1, x_2, \ldots, x_n \rangle \in K^n$, the vector given by
$$y = M^{-1} x$$
gives the coordinate vector of $x$ with respect to $B$, that is,
$$x = \sum_{i=1}^{n} y_i b_i.$$
Proof. The proof of this theorem is a tour de force of notation together with calculations involving sequences, summations, and multi-indices. One strategy in understanding the proof is to take a very specific example and follow through the formulas with that example. Other than heavy notation, the steps of the argument are not particularly difficult.

According to the Corollary of Theorem 6.3.4, the matrix $M$ defined in the statement of the theorem is invertible since its row rank is $n$.

Now we introduce some notation. All indices are assumed to run from 1 to $n$. In the double index for an element of a matrix, the first index counts the rows, the second indicates the columns. Let $M = (t_{ij})$, $M^{-1} = (s_{ij})$, and let $\mathcal{C} = [e_1, e_2, \ldots, e_n]$ be the coordinate basis for $\mathbb{R}^n$.

Since the entries of the $i$th column of $M$ are the components of the basis vector $b_i$, $M$ applied to each coordinate basis vector $e_i$ yields
$$b_i = M e_i = \sum_{k} t_{ki} e_k.$$

Since $MM^{-1}$ is the identity matrix, we know that the $kj$th coordinate of the product is 0 for all values of the indices, except when $k = j$, in which case the entry is 1. A convenient and standard way of expressing this is through the symbol $\delta_{kj}$, called the Kronecker delta. This is nothing more than a shorthand for the long statement: 0, if $k$ is different from $j$; 1 otherwise. Thus, we have,
$$\sum_{i} t_{ki} s_{ij} = \delta_{kj}.$$

As defined in the activities, let $y = \langle y_1, y_2, \ldots, y_n \rangle$ be the vector given by the product of $M^{-1}$ and $x = \langle x_1, x_2, \ldots, x_n \rangle$. Using summation notation, we have the following expression for each component of $y$,
$$y_i = \sum_{j} s_{ij} x_j.$$
In terms of $\mathcal{C} = [e_1, \ldots, e_n]$, $y$ is of the form
$$y = \sum_{i} \Bigl( \sum_{j} s_{ij} x_j \Bigr) e_i.$$

Now that we have established these relationships, we are ready to show that the vector $y = M^{-1} x$ is indeed the coordinate vector of $x$ with respect to $B$, that is, that $x = \sum_i y_i b_i$. Before looking at the explanations that follow, justify, for yourself, each step of the calculation:
$$\sum_{i} y_i b_i = \sum_{i} \Bigl( \sum_{j} s_{ij} x_j \Bigr) M e_i
= \sum_{i} \Bigl( \sum_{j} s_{ij} x_j \Bigr) \sum_{k} t_{ki} e_k
= \sum_{k} \Bigl( \sum_{j} \Bigl( \sum_{i} t_{ki} s_{ij} \Bigr) x_j \Bigr) e_k
= \sum_{k} \Bigl( \sum_{j} \delta_{kj} x_j \Bigr) e_k
= \sum_{k} x_k e_k = x.$$

In the first line, we substituted $\sum_j s_{ij} x_j$ for $y_i$, and replaced $b_i$ by its equivalent formulation $M e_i$.

In the second line, we expressed $M e_i$ in terms of the entries in the $i$th column of the matrix $M$.

In the third line, we reordered, rearranged, and collected terms in this triple summation.

In the fourth line, we replaced the expression for the product $MM^{-1}$ by its value in terms of the Kronecker delta.

In the fifth line, we dropped every term in which $\delta_{kj} = 0$, leaving only the term with $j = k$.

In the last line, we noted that the expression was the expansion of $x$ in terms of the coordinate basis.
Let's summarize what we discovered thus far.

1. If $V = K^n$, the coordinate vector of $v \in K^n$ with respect to the coordinate basis is equal to $v$.

2. If $B = [b_1, b_2, \ldots, b_n]$ is any basis, we can find the coordinate vector of $v$ by computing the product
$$M^{-1} v,$$
where $M$ is the matrix whose $j$th column is the sequence of the components of the vector $b_j$. From this point forward, we will refer to a matrix such as $M$ as a transition matrix.

3. Given $[v]_B$, the coordinate vector of $v$ with respect to $B$, we can find $v$ by computing the product
$$M\,[v]_B.$$
If $B_1$ and $B_2$ are two bases for $V = K^n$, and if $v \in V$, what is the relationship between $[v]_{B_1}$ and $[v]_{B_2}$? How do we get from $[v]_{B_1}$ to $[v]_{B_2}$? From $[v]_{B_2}$ to $[v]_{B_1}$? The diagram given below illustrates these relationships. What are the entries of the transition matrices $M_{B_1}$ and $M_{B_2}$?

[Diagram: three nodes, $[v]_{B_2}$ on the left, $v$ in the middle, and $[v]_{B_1}$ on the right. The arrows between $[v]_{B_2}$ and $v$ are labeled $M_{B_2}$ and $M_{B_2}^{-1}$; the arrows between $v$ and $[v]_{B_1}$ are labeled $M_{B_1}$ and $M_{B_1}^{-1}$; the composite arrows between $[v]_{B_2}$ and $[v]_{B_1}$ are labeled $M_{B_1}^{-1} M_{B_2}$ and $M_{B_2}^{-1} M_{B_1}$.]
Starting on the left, we see that multiplying the coordinate vector $[v]_{B_2}$ by the matrix $M_{B_1}^{-1} M_{B_2}$ yields $[v]_{B_1}$. Note that we are multiplying by the product of two matrices: the first factor applied, $M_{B_2}$, goes from $B_2$ to the coordinate basis, and the second, $M_{B_1}^{-1}$, goes from the coordinate basis back to $B_1$. In the middle, if we multiply the vector $v$ by the matrix $M_{B_2}^{-1}$, then we get the coordinate vector $[v]_{B_2}$. If, on the other hand, we multiply $v$ by the matrix $M_{B_1}^{-1}$, we get the coordinate vector $[v]_{B_1}$. On the right, if we start with the coordinate vector $[v]_{B_1}$ and multiply by the matrix $M_{B_2}^{-1} M_{B_1}$, we get the coordinate vector $[v]_{B_2}$. If we start at any node, $[v]_{B_2}$, $v$, or $[v]_{B_1}$, we can get to any other node by following the appropriate arrow, or sequence of arrows. Since the coordinate vector of $v$ with respect to the coordinate basis $\mathcal{C}$ is equal to $v$ itself, we can actually say that the matrix $M_{B_1}^{-1}$ is the matrix that transforms $[v]_{\mathcal{C}}$ into $[v]_{B_1}$. What matrix would we use to transform $[v]_{B_1}$ into $[v]_{\mathcal{C}}$? $[v]_{B_2}$ into $[v]_{\mathcal{C}}$? $[v]_{\mathcal{C}}$ into $[v]_{B_2}$?
Alias and alibi. There is a point of interpretation that may at first seem confusing. However, it is interesting and important, because it appears in many different situations. In particular, we can think of a basis as a frame of reference for locating vectors. If the vector space is $\mathbb{R}^n$, and if the basis is the coordinate basis $\mathcal{C} = [e_1, e_2, \ldots, e_n]$, then the coefficients of a vector $v \in \mathbb{R}^n$ with respect to this basis are precisely its coordinates in a coordinate system in which the basis vectors are the axes.

The same is true of any basis. Consider the basis $B = [\langle 2, 1 \rangle, \langle -1, 3 \rangle]$ in $\mathbb{R}^2$. The vector $\langle 1, 2 \rangle \in \mathbb{R}^2$ has the coordinates 1 and 2 with respect to the coordinate basis. What are its coordinates with respect to the basis $B$? With respect to $B$, $[\langle 1, 2 \rangle]_B = \langle \tfrac{5}{7}, \tfrac{3}{7} \rangle$. Can you show how to get this using Theorem 7.1.1?
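One quick way to check these numbers is with a short computation. The following is only an illustrative sketch (Python/NumPy, not the text's ISETL tools): the transition matrix $M$ has the basis vectors of $B$ as its columns, and $[v]_B = M^{-1}v$.

import numpy as np

# Transition matrix for B = [<2, 1>, <-1, 3>]: basis vectors as columns.
M = np.array([[2.0, -1.0],
              [1.0,  3.0]])
v = np.array([1.0, 2.0])

coords = np.linalg.solve(M, v)   # [v]_B = M^{-1} v
print(coords)                    # approximately [5/7, 3/7]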
[Need a picture of a vector in $\mathbb{R}^3$ in which the coordinates of the vector are clearly shown to be arrows along the coordinate axes.]

Figure 7.1: Basis

[A picture in $\mathbb{R}^2$ showing the basis vectors $b_1, b_2$ and appropriate lines to $e_1, e_2$ indicating the coordinates as lengths, and the same thing relative to $b_1, b_2$.]

Figure 7.2: Basis representation

But we could also show this in another way, as given in the accompanying figure.

[Here we have a coordinate system with only $b_1, b_2$ as axes, shown in the same position as the coordinate axes are shown, and with the new coordinates of the vector indicated.]

Figure 7.3: An alternate representation
Now come the interpretations. In the first figure, we may consider that the vector $\langle 1, 2 \rangle$ is unchanged, but it has two names: the coordinates $\langle 1, 2 \rangle$ and the coordinates $\langle \tfrac{5}{7}, \tfrac{3}{7} \rangle$. This is called the alias interpretation. On the other hand, the second picture suggests that the vector $\langle 1, 2 \rangle$ has been changed. Originally, it was $\langle 1, 2 \rangle$, but now it is changed to $\langle \tfrac{5}{7}, \tfrac{3}{7} \rangle$. This is called the alibi interpretation.
Matrix Representations

Every vector in an $n$-dimensional vector space has a coordinate vector representation with respect to a given basis. The same is true of linear transformations. For instance, if $L : U \to V$ is a linear transformation and if $B$ and $\mathcal{C}$ are bases for $U$ and $V$ respectively, then there is an $m \times n$ matrix $A$, where $\dim(U) = n$ and $\dim(V) = m$, that represents $L$ in the sense that if $[u]_B = \langle s_1, s_2, \ldots, s_n \rangle$ is the coordinate vector of $u$ with respect to $B$ and $[L(u)]_{\mathcal{C}}$ is the coordinate vector of $L(u)$ with respect to $\mathcal{C}$, then
$$[L(u)]_{\mathcal{C}} = A \cdot [u]_B.$$
The $j$th column of $A$ is the sequence of coefficients of the vector $L(b_j)$ in terms of the basis $\mathcal{C}$. We can illustrate this in the figure below.

[Diagram: across the top, $u \xrightarrow{\;L\;} L(u)$; across the bottom, $[u]_B \xrightarrow{\;A\;} [L(u)]_{\mathcal{C}}$; vertical arrows send $u$ to $[u]_B$ and $L(u)$ to $[L(u)]_{\mathcal{C}}$.]

This diagram tells us that if we take a vector $u \in U$, find its coordinate vector $[u]_B$, and multiply by the matrix representation $A$, we will get the same result as if we had first applied $L$ to $u$ and then found the coordinate vector $[L(u)]_{\mathcal{C}}$. In other words, $L : U \to V$ can be represented by $A : K^n \to K^m$.
In this subsection, we will limit the discussion to linear transformations from a vector space $U$ to itself. In this context, we will investigate what happens to a matrix representation when we change the basis. In Activities 3 and 4, you considered the process by which one changes from the coordinate basis to a basis $B$ for a linear transformation between spaces of tuples. Let's review the methodology suggested in these activities by considering an example. Let $L : \mathbb{R}^2 \to \mathbb{R}^2$ be the linear transformation given by the formula
$$L(\langle x_1, x_2 \rangle) = \langle 3x_1 + 2x_2,\; x_1 + 2x_2 \rangle.$$
If we work with the coordinate basis $\mathcal{C} = [e_1, e_2]$, the matrix representation of $L$ with respect to $\mathcal{C}$ is given by
$$C = \begin{pmatrix} 3 & 2 \\ 1 & 2 \end{pmatrix}.$$
The matrix representation of $L$ with respect to $B = [\langle 1, -1 \rangle, \langle 2, 1 \rangle]$ is given by
$$B = \begin{pmatrix} 1 & 0 \\ 0 & 4 \end{pmatrix}.$$
Later in this section and throughout the remainder of this chapter, we will discover that diagonal forms are extremely important and useful. The func ChangeFromCoordB that you constructed in Activity 4 changes a representation written in terms of the coordinate basis into a representation with respect to a basis $B$. If you applied ChangeFromCoordB to the matrix $C$ and the basis $B$ given here, would ISETL return the matrix $B$? Are the component pieces of ChangeFromCoordB consistent with the theorem given below?
Theorem 7.1.2. Let $L : \mathbb{R}^n \to \mathbb{R}^n$ be a linear transformation whose matrix with respect to the coordinate basis is denoted by $C$. Let $B = [b_1, \ldots, b_n]$ be an ordered basis for $\mathbb{R}^n$, and let $M$ be the matrix whose $j$th column is the sequence of coefficients of $b_j$. Then, the matrix $B$ of $L$ with respect to $B$ is given by
$$B = M^{-1} C M.$$

Proof. The $j$th column of $B$ consists of the sequence of coefficients of $L(b_j)$ in terms of $B$. We must show that the $j$th column of $M^{-1}CM$ consists of the same sequence of coefficients. Since $C$ is the matrix representation of $L$ with respect to $\mathcal{C}$, the $j$th column of $CM$ is the coordinate vector $[L(b_j)]_{\mathcal{C}}$. By Theorem 7.1.1, the product $M^{-1}[L(b_j)]_{\mathcal{C}}$ is $[L(b_j)]_B$.
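The worked example above can be checked directly. The following is a sketch of that check in Python/NumPy (not the text's ISETL func); $M$ has the basis vectors of $B$ as its columns, and the theorem predicts that $M^{-1}CM$ is diagonal.

import numpy as np

# C is the coordinate-basis representation of L; M is built from B = [<1,-1>, <2,1>].
C = np.array([[3.0, 2.0],
              [1.0, 2.0]])
M = np.column_stack([[1.0, -1.0], [2.0, 1.0]])

B = np.linalg.inv(M) @ C @ M
print(np.round(B, 10))   # expected: the diagonal matrix diag(1, 4)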
The func ChangeToCoordB that you constructed in Activity 5 reverses the process given by Theorem 7.1.2. If we start with the matrix representation in terms of a non-coordinate basis, how do we get to the matrix representation for the coordinate basis? Specifically, how do we recover the coordinate-basis representation $C$ from the matrix $B$? Before proceeding further, let's summarize the relationship between the coordinate basis and a second basis $B$.

If $L : \mathbb{R}^n \to \mathbb{R}^n$ is a linear transformation, then the $j$th column of the matrix representation of $L$ with respect to the coordinate basis $\mathcal{C}$ is the coordinate vector $[L(e_j)]_{\mathcal{C}}$.

If we want to find the matrix of $L$ with respect to $B$, we first construct a transition matrix $M$. The $j$th column of this matrix consists of the coefficients of the vector $b_j$.

If we let $C$ denote the coordinate basis representation of $L$, then the matrix representation with respect to $B$ is found by computing the product $M^{-1}CM$. The $j$th column of the product is the coordinate vector $[L(b_j)]_B$.

If we are given the matrix representation $B$ with respect to $B$ and wish to find the coordinate basis representation, we compute the product $MBM^{-1}$. The $j$th column of $MBM^{-1}$ is the coordinate vector $[L(e_j)]_{\mathcal{C}}$.
If $B_1$ and $B_2$ are two bases, neither of which is the coordinate basis, how do we use Theorem 7.1.2 to find the transition from $B_1$ to $B_2$, and vice-versa? The diagram below illustrates the relationships involved in making these transitions. In the figure, $v$ is a vector in $K^n$, $B_1$ is the matrix representation of $L$ in terms of $B_1$, and $B_2$ denotes the matrix representation of $L$ with respect to $B_2$.

[Diagram: the top row is $[v]_{B_1} \xrightarrow{\;B_1\;} [L(v)]_{B_1}$, the middle row is $v \longrightarrow L(v)$, and the bottom row is $[v]_{B_2} \xrightarrow{\;B_2\;} [L(v)]_{B_2}$. On both the $v$ side and the $L(v)$ side, the vertical arrows from the middle row are labeled $M_{B_1}^{-1}$ (to the $B_1$ row) and $M_{B_2}^{-1}$ (to the $B_2$ row).]
What are the entries of the transition matrices $M_{B_1}$ and $M_{B_2}$? Using the diagram, we can see that
$$[L(v)]_{B_2} = B_2\, M_{B_2}^{-1} M_{B_1}\, [v]_{B_1}
\qquad\text{and}\qquad
[L(v)]_{B_2} = M_{B_2}^{-1} M_{B_1}\, B_1\, [v]_{B_1}.$$
Therefore,
$$B_2 = M_{B_2}^{-1} M_{B_1}\, B_1\, M_{B_1}^{-1} M_{B_2}.$$
Following a similar argument, we can write $B_1$ in terms of $B_2$. How do we interpret the equation given here? To get from $B_1$ to $B_2$, one would start with $B_1$ and convert to the coordinate basis. This is represented by $M_{B_1} B_1 M_{B_1}^{-1}$. This is followed by a transition from the coordinate basis to $B_2$, which is represented by multiplying by the inverse of $M_{B_2}$ on the left and $M_{B_2}$ on the right. How would we construct a func using ChangeToCoordB and ChangeFromCoordB to make the transition from $B_1$ to $B_2$, and vice-versa?
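One possible answer to that question, sketched in Python/NumPy rather than ISETL (the function name and structure here are only illustrative), composes the two steps just described:

import numpy as np

def change_between_bases(B1_rep, B1, B2):
    # Given the representation B1_rep of L with respect to the basis B1,
    # return its representation with respect to B2 by passing through the
    # coordinate basis: M2^{-1} (M1 B1_rep M1^{-1}) M2.
    M1 = np.column_stack(B1)
    M2 = np.column_stack(B2)
    C = M1 @ B1_rep @ np.linalg.inv(M1)     # the ChangeToCoordB step
    return np.linalg.inv(M2) @ C @ M2       # the ChangeFromCoordB step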
Matrices with Special Forms

As we have seen, the process of making a transition from one basis to another is quite involved. How does this process help us in working with linear transformations? In this subsection and throughout the remainder of this chapter, we will consider examples that will help to show the importance of the ability to change bases.

Triangular matrices. Recall that if you have a system of $m$ linear equations in $n$ unknowns, then you can interpret it as an equation $L(x) = c$, where $L : \mathbb{R}^n \to \mathbb{R}^m$ is a linear transformation, $c$ is a vector in $\mathbb{R}^m$, and $x \in \mathbb{R}^n$. If $A = (a_{ij})$ is the matrix of $L$ with respect to the coordinate bases and $x = \langle x_1, \ldots, x_n \rangle$, $c = \langle c_1, \ldots, c_m \rangle$ are the representations of $x$, $c$ in terms of their coefficients with respect to these bases, then the system of equations can be represented as the matrix equation $Ax = c$.

Now, suppose that the matrix $A$ is in lower triangular form, that is, $a_{ij} = 0$ for $i < j$. Then you can write the solution very quickly. The first equation involves only $x_1$, so you can solve it (provided $a_{11} \neq 0$). The second equation involves only $x_1, x_2$. Since you already know the solution for $x_1$, you can solve for $x_2$. Following a similar approach, we can find the solution for each $x_i$.
For example, the answer to Activity 6(b) is the following lower triangular matrix
$$\begin{pmatrix} 3 & 0 & 0 \\ 1 & 2 & 0 \\ 7 & -4 & 3 \end{pmatrix}.$$
You might not have found this matrix, but now that you know it, can you find the basis that gives it? The system of equations that gives this matrix is
$$\begin{aligned}
3x_1 &= c_1 \\
x_1 + 2x_2 &= c_2 \\
7x_1 - 4x_2 + 3x_3 &= c_3,
\end{aligned}$$
where $c_1, c_2, c_3$ are given numbers.

You can write the solution almost immediately as
$$\begin{aligned}
x_1 &= \frac{c_1}{3} \\
x_2 &= \frac{1}{2}(c_2 - x_1) = \frac{3c_2 - c_1}{6} \\
x_3 &= \frac{1}{3}(c_3 - 7x_1 + 4x_2) = \frac{c_3 - 3c_1 + 2c_2}{3}.
\end{aligned}$$
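The pattern just used is forward substitution. As a small illustration only (Python/NumPy, not part of the text's toolkit), here is that idea applied to the matrix above, using the right-hand side values mentioned in the next paragraph:

import numpy as np

def forward_substitute(A, c):
    # Solve A x = c when A is lower triangular with nonzero diagonal entries,
    # solving for one unknown at a time, top to bottom.
    n = len(c)
    x = np.zeros(n)
    for i in range(n):
        x[i] = (c[i] - A[i, :i] @ x[:i]) / A[i, i]
    return x

A = np.array([[3.0,  0.0, 0.0],
              [1.0,  2.0, 0.0],
              [7.0, -4.0, 3.0]])
c = np.array([6.0, 2.0, 1.0])
print(forward_substitute(A, c))    # x1 = 2, x2 = 0, x3 = -13/3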
Of course, this is not the solution of the system of equations whose matrix of coefficients is the original matrix $B_2$ given in Activity 6(b). Actually, the $x_1, x_2, x_3$ given here are the coefficients of that solution with respect to the basis that transformed $B_2$ into the above triangular matrix. In fact, that basis is $B = [\langle 1, 1, 2 \rangle, \langle 1, 2, 3 \rangle, \langle 1, 2, 4 \rangle]$. Given this information and assuming that the right hand side of this system was given by $c_1 = 6$, $c_2 = 2$, $c_3 = 1$, can you find the solution of the original system? (Don't forget that the values of $c_1, c_2, c_3$ are coefficients of a vector with respect to the basis $B$.)
You will note that we are not saying much about how to find a basis that transforms a matrix into triangular form. One reason for this is that it is not very practical. If we want to solve a system of equations, there are better methods, such as Gaussian elimination or inversion of the matrix of coefficients (for which there are efficient computer methods). A more interesting comment is a theoretical one. Suppose you have a linear transformation $L : V \to V$ and a basis $B$ for $V$ such that the matrix of $L$ with respect to $B$ is upper triangular. Then $L(b_1)$ is an element of the subspace of $V$ generated by $b_1$. Also, $L(b_2)$ is an element of the subspace of $V$ generated by $b_1, b_2$. This means that every element of the subspace of $V$ generated by $b_1, b_2$ is mapped by $L$ into that same subspace generated by $b_1, b_2$. In other words, the subspace of $V$ generated by $b_1, b_2$ is invariant under the linear transformation.

We can continue this process and say, for each $k$ (up to the dimension of $V$), that the subspace generated by $b_1, b_2, \ldots, b_k$ is invariant under $L$. Such a decomposition of $V$ into an increasing sequence of subspaces, each invariant under $L$, has theoretical importance in the general theory of vector spaces.
Diagonal matrices. A linear transformation $T : \mathbb{R}^n \to \mathbb{R}^n$ whose matrix representation with respect to the coordinate basis is diagonal has a particularly simple structure. It takes each vector $e_i$ and multiplies it by a fixed scalar. For instance, if $x = \langle 2, 1, 3 \rangle$, and if $T : \mathbb{R}^3 \to \mathbb{R}^3$ is a linear transformation whose matrix representation with respect to the coordinate basis is
$$\begin{pmatrix} 3 & 0 & 0 \\ 0 & 4 & 0 \\ 0 & 0 & 5 \end{pmatrix},$$
then $T(x) = \langle 3 \cdot 2,\; 4 \cdot 1,\; 5 \cdot 3 \rangle$. In this case, the $x$ component is scaled by a factor of 3, the $y$ component is scaled by a factor of 4, and the $z$ component is scaled by a factor of 5. This means that the unit cube maps to the box $B$, as illustrated in the figure below.

[In $\mathbb{R}^3$, provide a picture of the unit cube and the resulting box found by applying $T$ to the unit cube.]

Figure 7.4: Losing a dimension
If the matrix representation of $T$ with respect to the coordinate basis is not diagonal, then such a simple description of $T$ is generally not possible. Under certain circumstances, however, we can find a basis for which the matrix representation is diagonal and the notion of scaling makes sense. Consider the following example $T : \mathbb{R}^2 \to \mathbb{R}^2$ defined by
$$T(\langle x_1, x_2 \rangle) = \begin{pmatrix} 5 & 4 \\ 4 & -1 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}.$$
As you can see, the matrix representation of $T$ with respect to the coordinate basis is not diagonal. However, in this case, we can find a basis for which the matrix representation of $T$ is diagonal, namely
$$B = \left[ \left\langle \tfrac{2}{\sqrt{5}}, \tfrac{1}{\sqrt{5}} \right\rangle, \left\langle \tfrac{-1}{\sqrt{5}}, \tfrac{2}{\sqrt{5}} \right\rangle \right].$$
The matrix of $T$ with respect to $B$ is given by
$$\begin{pmatrix} 7 & 0 \\ 0 & -3 \end{pmatrix}.$$
The vectors $\left\langle \tfrac{2}{\sqrt{5}}, \tfrac{1}{\sqrt{5}} \right\rangle$ and $\left\langle \tfrac{-1}{\sqrt{5}}, \tfrac{2}{\sqrt{5}} \right\rangle$ define a new coordinate system.

[In $\mathbb{R}^2$, a picture of the coordinate axes with the vectors from $B$ superimposed; a picture of the unit square $ABCO$ with respect to the new coordinate system and a picture of the image rectangle $A'B'C'O$, where $O$ is the origin.]

Figure 7.5: Effect of a diagonal basis

$T$ is a scaling with respect to $B$, with a factor of 7 in the $x'$ direction and $-3$ in the $y'$ direction. As you can see, the square $ABCO$ is mapped by $T$ to the rectangle $A'B'C'O$.
The ability to diagonalize can also be used to simplify the equation of a conic section in which there is a middle term, say
$$3x^2 + 2xy + 3y^2 = 8.$$
The presence of the $xy$ term makes this difficult to graph. In order to use diagonalization, we need to make use of an auxiliary concept called an inner product. This is a deep and important concept in mathematics, but, for now, you can consider the following formula as shorthand notation. If $x = \langle x_1, \ldots, x_n \rangle$, $y = \langle y_1, \ldots, y_n \rangle \in \mathbb{R}^n$, then we can define the inner product $\langle x, y \rangle$ by
$$\langle x, y \rangle = \sum_{i=1}^{n} x_i y_i.$$
We can write the algebraic expression $3x^2 + 2xy + 3y^2$ as an inner product
$$\langle A \langle x_1, x_2 \rangle, \langle x_1, x_2 \rangle \rangle,$$
where
$$A = \begin{pmatrix} 3 & 1 \\ 1 & 3 \end{pmatrix},$$
and $\langle\,\cdot\,,\,\cdot\,\rangle$ represents the dot product on $\mathbb{R}^2$. Next, we change to the basis
$$B = \left[ \left\langle \tfrac{1}{\sqrt{2}}, \tfrac{1}{\sqrt{2}} \right\rangle, \left\langle \tfrac{-1}{\sqrt{2}}, \tfrac{1}{\sqrt{2}} \right\rangle \right].$$
The matrix $B$ of the transformation represented by $A$ is
$$B = \begin{pmatrix} 4 & 0 \\ 0 & 2 \end{pmatrix}.$$
If we let $M$ be the matrix whose columns are the vectors $b_1$ and $b_2$, set $\langle y_1, y_2 \rangle = M^{-1} \langle x_1, x_2 \rangle$, and make the substitution $\langle x_1, x_2 \rangle = M \langle y_1, y_2 \rangle$ in the original equation, then, after some simplification, we get
$$4y_1^2 + 2y_2^2 = 8.$$

[Picture of the ellipse with both sets of coordinate axes shown.]

Figure 7.6: Rotation of coordinate axes

As Figure 7.6 shows, the original equation $3x^2 + 2xy + 3y^2 = 8$ can be represented by the equation $4(x')^2 + 2(y')^2 = 8$ in the $x'y'$ coordinate system, where the $x'$-axis is the line that coincides with the basis vector $\left\langle \tfrac{1}{\sqrt{2}}, \tfrac{1}{\sqrt{2}} \right\rangle$, and the $y'$-axis is the line that coincides with the basis vector $\left\langle \tfrac{-1}{\sqrt{2}}, \tfrac{1}{\sqrt{2}} \right\rangle$.
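The change of variables above can be spot-checked numerically. The snippet below is only a sketch (Python/NumPy, not the text's tools): for a randomly chosen $\langle y_1, y_2 \rangle$, setting $x = My$ and evaluating the original quadratic form should give the same value as $4y_1^2 + 2y_2^2$.

import numpy as np

A = np.array([[3.0, 1.0],
              [1.0, 3.0]])
M = np.column_stack([[1.0, 1.0], [-1.0, 1.0]]) / np.sqrt(2.0)

rng = np.random.default_rng(0)
y = rng.standard_normal(2)
x = M @ y
print(np.isclose(x @ A @ x, 4.0 * y[0]**2 + 2.0 * y[1]**2))   # expect True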
Exercises

1. Let $V = \mathbb{R}^3$. Find the coordinate vector of $\langle 2, 1, 3 \rangle$ in terms of the basis $[\langle 1, 1, 1 \rangle, \langle 1, 1, 0 \rangle, \langle 1, 0, 0 \rangle]$.

2. Let $V = \mathbb{R}^2$. Find the coordinate vector of $\langle 6, 5 \rangle$ with respect to the basis $[\langle 1, 1 \rangle, \langle 2, 1 \rangle]$.

3. Let $V = \mathbb{R}^4$. Find the coordinate vector of $\langle 6, 3, 1, 2 \rangle$ with respect to the basis $[\langle 1, 1, 0, 2 \rangle, \langle 2, 1, 1, 3 \rangle, \langle 3, 0, 1, 0 \rangle, \langle 1, 0, 0, 4 \rangle]$.

4. Complete (a)–(b) for the bases
$$B_1 = [\langle 1, 2 \rangle, \langle 3, 0 \rangle] \quad\text{and}\quad B_2 = [\langle 2, 1 \rangle, \langle 3, 2 \rangle].$$
(a) If $[x]_{B_1} = \langle 4, 3 \rangle$, find $[x]_{B_2}$.
(b) Find the form of $x$ in terms of the coordinate basis.

5. Complete (a)–(b) for the bases
$$B_1 = [\langle 3, 1 \rangle, \langle 1, 1 \rangle] \quad\text{and}\quad B_2 = [\langle 2, 3 \rangle, \langle 4, 5 \rangle].$$
(a) If $[x]_{B_1} = \langle 2, 5 \rangle$, find $[x]_{B_2}$.
(b) Find the form of $x$ in terms of the coordinate basis.

6. Let $V = P_3$. Find the coordinate vector of $p = 3 - 2x + x^3$ with respect to the basis $[1,\; x^2 - 2,\; x^2 - x + 1,\; x^3 + x]$.

7. Let $V = P_2$. Find the coordinate vector of $p = 3x^2 - 6x - 2$ with respect to the basis $[x - 1,\; 2x + 3,\; x^2 + x]$.

8. Let $P_2(\mathbb{R})$ be the vector space of polynomials of degree two or less with real coefficients. A basis $\mathcal{C} = [c_0, c_1, c_2]$ is given by
$$c_0 = 1, \qquad c_1 = x, \qquad c_2 = x^2.$$
Another basis $B = [b_0, b_1, b_2]$ is given by
$$b_0 = \tfrac{1}{2}x(x - 1), \qquad b_1 = x^2 + 1, \qquad b_2 = \tfrac{1}{2}x(x + 1).$$
For each of the polynomials given below, write its coordinates with respect to the basis $\mathcal{C}$, and then change to its coordinates with respect to the basis $B$. You can solve the problem by hand, or use the computer tools you built in the activities.
(a) $p(x) = 3x^2 - 5x + 2$
(b) $q(x) = 7x^2 - 4$
(c) $r(x) = 2x - 1$
9. Let $P_2(\mathbb{R})$ be the vector space of polynomials of degree two or less with real coefficients. A basis $\mathcal{C} = [c_0, c_1, c_2]$ is given by
$$c_0 = 1, \qquad c_1 = x, \qquad c_2 = x^2.$$
Another basis $B = [b_0, b_1, b_2]$ is given by
$$b_0 = \tfrac{1}{2}x(x - 1), \qquad b_1 = x^2 + 1, \qquad b_2 = \tfrac{1}{2}x(x + 1).$$
For each of the following polynomials, find its coordinates with respect to the basis $B$. You can solve the problem by hand, or use the computer tools you built in the activities.
(a) $p = 4c_0 - 3c_1 - 6c_2$
(b) $q = 6c_0 - 4c_2$
(c) $r = 3c_1$
10. Justify each step of the calculation given in Theorem 7.1.1.

11. Formulate a theorem, similar to Theorem 7.1.1, that gives a formula for changing a coordinate vector with respect to a basis $B_1$ to its corresponding coordinate vector with respect to a basis $B_2$, where neither $B_1$ nor $B_2$ is the coordinate basis. Once you have written a statement of this theorem, provide a proof.

12. In each of the following problems, a linear transformation $L : \mathbb{R}^n \to \mathbb{R}^n$ is given by its matrix $A$ with respect to the given basis $B$. Find its matrix with respect to the coordinate basis. You can solve the problem by hand, or use the computer tools you built in the activities.
(a)
$$A = \begin{pmatrix} 3 & 1 & 4 \\ 1 & 3 & 0 \\ 0 & 2 & 3 \end{pmatrix} \qquad
B = [\langle 2, 2, 2 \rangle, \langle 3, 2, 3 \rangle, \langle 2, 1, 1 \rangle]$$
(b)
$$A = \begin{pmatrix} 1/2 & 1/8 & 1/4 & 1/8 \\ 0 & 1/2 & 0 & 1/2 \\ 1/4 & 1/2 & 0 & 1/4 \\ 0 & 1/3 & 0 & 2/3 \end{pmatrix} \qquad
B = [\langle 1, 1, 1, 0 \rangle, \langle 1, 1, 0, 1 \rangle, \langle 1, 0, 1, 1 \rangle, \langle 0, 1, 1, 1 \rangle]$$

13. In each of the following problems, a linear transformation $L : \mathbb{R}^n \to \mathbb{R}^n$ is given by its matrix $A$ with respect to the given basis $B_1$. Find its matrix with respect to the basis $B_2$. You can solve the problem by hand, or use the computer tools you built in the activities.
(a)
$$A = \begin{pmatrix} 3 & 1 & 4 \\ 1 & 3 & 0 \\ 0 & 2 & 3 \end{pmatrix}$$
$$B_1 = [\langle 2, 2, 2 \rangle, \langle 3, 2, 3 \rangle, \langle 2, 1, 1 \rangle] \qquad
B_2 = [\langle 2, 1, 0 \rangle, \langle 1, 0, 2 \rangle, \langle 1, 2, 1 \rangle]$$
(b)
$$A = \begin{pmatrix} 1/2 & 1/8 & 1/4 & 1/8 \\ 0 & 1/2 & 0 & 1/2 \\ 1/4 & 1/2 & 0 & 1/4 \\ 0 & 1/3 & 0 & 2/3 \end{pmatrix}$$
$$B_1 = [\langle 1, 1, 1, 0 \rangle, \langle 1, 1, 0, 1 \rangle, \langle 1, 0, 1, 1 \rangle, \langle 0, 1, 1, 1 \rangle] \qquad
B_2 = [\langle 1, 1, 0, 2 \rangle, \langle 2, 1, 2, 0 \rangle, \langle 1, 0, 2, 2 \rangle, \langle 0, 2, 1, 1 \rangle]$$
14. Define a transformation $F : \mathbb{R}^3 \to \mathbb{R}^3$ by
$$F(\langle x_1, x_2, x_3 \rangle) = \langle x_1 - x_2 - x_3,\; 2x_1 + 3x_3,\; x_2 + x_3 \rangle.$$
Find the matrix representation of $F$ with respect to the basis
$$\mathcal{D} = [\langle 1, 2, 1 \rangle, \langle 2, 1, 1 \rangle, \langle 3, 1, 2 \rangle].$$
15. Suppose that $T : \mathbb{R}^3 \to \mathbb{R}^3$ is a linear transformation whose matrix representation with respect to some basis $B_1$ is given by
$$\begin{pmatrix} 3 & 1 & 2 \\ 4 & 5 & 6 \\ 1 & 3 & 7 \end{pmatrix}.$$
Suppose that the transition matrix from $B_1$ to another basis $B_2$ is given by
$$\begin{pmatrix} 2 & 1 & 1 \\ 1 & 0 & 1 \\ 0 & 2 & 3 \end{pmatrix}.$$
Find the expression for the rule of correspondence of $T$ in terms of the coordinate basis.

16. Let $P_3(\mathbb{R})$ be the vector space of polynomials of degree three or less with real coefficients. A basis $\mathcal{C} = [c_0, c_1, c_2, c_3]$ is given by
$$c_0 = 1, \qquad c_1 = x, \qquad c_2 = x^2, \qquad c_3 = x^3.$$
This is called the basis of monomials.
Another basis $B = [b_0, b_1, b_2, b_3]$ is given by
$$b_0 = 1, \qquad b_1 = x, \qquad b_2 = x(x - 1), \qquad b_3 = x(x - 1)(x - 2).$$
This will be called the basis of linear products.
Let $L : P_3(\mathbb{R}) \to P_3(\mathbb{R})$ be the linear transformation, called the difference operator, given by
$$L(p)(x) = p(x + 1) - p(x).$$
Write the matrix of $L$ with respect to the basis $\mathcal{C}$, and then find its matrix with respect to the basis $B$. You can solve the problem by hand, or use the computer tools you built in the activities.

17. Define $L : P_3(\mathbb{R}) \to P_3(\mathbb{R})$ to be the linear transformation defined by the derivative, that is, $L(p) = p'$. Write the matrix of $L$ with respect to the monomial basis (see the previous exercise), and then find its matrix with respect to the basis of linear products (see the previous exercise). You can solve the problem by hand, or use the computer tools you built in the activities. In this exercise, we think of each polynomial as an expression for a function from $\mathbb{R}$ to $\mathbb{R}$.

18. Why do you think that we have been using sequences of basis vectors rather than sets of basis vectors? If the order of the vectors in a basis is changed, would the coordinate vector change? To help you in answering this question, construct a basis in $\mathbb{R}^3$. Select a vector $x$. Find the coordinate vector of $x$ with respect to the basis you have constructed. Change the order of the basis elements, and find the coordinate vector with respect to this new ordering. What do you observe?

19. Let $F : \mathbb{R}^n \to \mathbb{R}^n$ be a linear transformation. Formulate a theorem, similar to Theorem 7.1.2, that gives a formula for changing the matrix representation of $F$ with respect to a basis $B_1$ to its corresponding matrix representation with respect to a basis $B_2$, where neither $B_1$ nor $B_2$ is the coordinate basis. Once you have written a statement of this theorem, provide a proof.
7.2 Eigenvalues and Eigenvectors

Activities

1. In each of the following problems, (a)–(e), find as many solutions to the given equation as you can. Let $x \in \mathbb{R}^2$.

(a) For the matrix $A_1 = \begin{pmatrix} 3 & 1 \\ 2 & 2 \end{pmatrix}$, find all solutions to $A_1 x = x$.

(b) For the matrix $A_2 = \begin{pmatrix} 5 & 2 \\ 7 & 3 \end{pmatrix}$, find all solutions to $A_2 x = x$.

(c) For the matrix $A_3 = \begin{pmatrix} 2 & 1 \\ 2 & 2 \end{pmatrix}$, find all solutions to $A_3 x = x$.

(d) Let $\lambda$, the Greek letter lambda, represent a scalar. For the matrix $A_4 = \begin{pmatrix} 1 & 3 \\ 2 & 2 \end{pmatrix}$, find all $\lambda$ such that the equation $A_4 x = \lambda x$ has at least one solution. For each such $\lambda$, find all solutions $x$ that satisfy the given equation.

(e) Let $\lambda$, the Greek letter lambda, represent a scalar. For the matrix $A_5 = \begin{pmatrix} 5 & 2 \\ 2 & 1 \end{pmatrix}$, find all $\lambda$ such that the equation $A_5 x = \lambda x$ has at least one solution. For each such $\lambda$, find all solutions $x$ that satisfy the given equation.
2. Complete (a)–(d) for the matrix $A_4$ defined in Activity 1(d).

(a) Write the polynomial $p$ given by $p(\lambda) = \det(A_4 - \lambda I)$, where $I$ is the $2 \times 2$ identity matrix.

(b) Find all $\lambda$ such that $p(\lambda) = 0$. For each solution, select a nonzero vector $x$ such that $A_4 x = \lambda x$.

(c) What can you say about the vectors you picked in the previous step? Do they form a basis for $\mathbb{R}^2$?

(d) Think of $A_4$ as representing the expression of a linear transformation $T : \mathbb{R}^2 \to \mathbb{R}^2$ given by
$$T(x) = A_4 x.$$
Use the information you have gathered thus far to find a diagonal matrix representation for $T$.
3. Repeat Activity 2 for the matrix given by
$$A_6 = \begin{pmatrix} 0 & 3 & 3 \\ 2 & 2 & 2 \\ 4 & 1 & 1 \end{pmatrix}.$$

4. Repeat Activity 2 for the matrix given by
$$A_7 = \begin{pmatrix} 0 & 0 & 0 \\ 4 & 1 & -1 \\ 3 & 2 & -1 \end{pmatrix}.$$
Did anything different happen this time? Can you diagonalize $A_7$? Try to explain as much as you can.

5. Let $S$ be the matrix whose columns are the eigenvectors you found in Activity 3. Set $D = S^{-1} A_6 S$, where $A_6$ denotes the matrix from Activity 3. Do you see anything remarkable about $D$? If so, can you explain it?
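The computation in Activity 5 is meant to be done with the ISETL tools you have built. The following is a rough illustration only (Python/NumPy): the columns of $S$ are eigenvectors of $A_6$, and $D = S^{-1} A_6 S$ is then computed directly.

import numpy as np

A6 = np.array([[0.0, 3.0, 3.0],
               [2.0, 2.0, 2.0],
               [4.0, 1.0, 1.0]])

eigenvalues, S = np.linalg.eig(A6)   # columns of S are eigenvectors of A6
D = np.linalg.inv(S) @ A6 @ S
print(np.round(D, 10))               # expect a diagonal matrix
print(np.round(eigenvalues, 10))     # its diagonal entries are the eigenvalues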
Discussion

Basic Ideas

Given a linear transformation $L : V \to V$, the point of Activity 1 was to illustrate the idea that it is possible to find scalars $\lambda$ and nonzero vectors $x$ such that
$$L(x) = \lambda x.$$
When this occurs, we say that $\lambda$ is an eigenvalue of $L$ and $x$ is an eigenvector belonging to $\lambda$. Formally, we have

Definition 7.2.1. Let $L : V \to V$ be a linear transformation. If there exists a nonzero vector $x \in V$ for which $L(x) = \lambda x$, then we say that $\lambda$ is an eigenvalue of $L$. Any nonzero vector $x$ satisfying the equality for a particular $\lambda$ is called an eigenvector belonging to $\lambda$.

What examples of eigenvalues and eigenvectors did you find in the activities? Carefully describe them before proceeding.

What effect does a linear transformation have upon an eigenvector? We know that a linear transformation takes a vector and transforms it into another vector. If $L : \mathbb{R}^n \to \mathbb{R}^n$ is a linear transformation, and if $x \in \mathbb{R}^n$ is an eigenvector belonging to $\lambda$, how might we describe the way $L$ transforms $x$? According to the definition, the effect of $L$ on an eigenvector, no matter how complex the transformation, is nothing more than a simple scaling of the eigenvector. In this context, what does the term scaling mean?

[In $\mathbb{R}^3$, give an arrow for a vector $x$. Give a second, longer arrow that coincides with $x$ that will be labeled $L(x)$.]

Figure 7.7: Image of an eigenvector

The activities introduced a methodology for finding eigenvalues and eigenvectors. The procedure outlined in Activity 2 can be expanded and generalized. In order to identify the eigenvalues of a linear transformation $L : K^n \to K^n$, we define a second transformation based upon the equation $L(x) = \lambda x$. Define $I : K^n \to K^n$ to be the identity transformation: $I(x) = x$ for all $x \in K^n$. Define $M : K^n \to K^n$ by the expression
$$M(x) = L(x) - \lambda I(x).$$
Is $M$ a linear transformation? Before continuing, you should check this. What is the relationship between the expression for $M$ and the equation $L(x) = \lambda x$? A particular scalar, say $s$, is a solution to $L(x) = \lambda x$ if and only if there exists a nonzero vector $x_s \in K^n$ such that $x_s \in \ker(M)$. Can you explain why this is the case?

If the kernel of $M$ were to contain the zero vector exclusively, then $\dim(\ker(M)) = 0$. According to Theorem 5.2.5, the Rank and Nullity Theorem, $\operatorname{rank}(M) = n$. As a result, the rank of any matrix representation of $M$ would be $n$. In such a case, Theorems 6.3.4 and 6.4.4 tell us that $\det(M) \neq 0$. Hence, a scalar $s$ is an eigenvalue if and only if
$$|[L] - sI| = 0,$$
where $[L]$ denotes a matrix representation of $L$. The set of eigenvalues of $L$ is subsequently given by the set
$$\{ \lambda : |[L] - \lambda I| = 0 \}.$$
The determinant $|[L] - \lambda I|$, a polynomial in $\lambda$, is called the characteristic polynomial of $L$. Any root of this polynomial is an eigenvalue of $L$. It is useful to summarize all of this in a theorem.
Theorem 7.2.1. Let $L : K^n \to K^n$ be a linear transformation, $[L]$ a matrix representation of $L$, and let $p$ be the function given by $p(\lambda) = \det([L] - \lambda I)$. Then, $p$ is a polynomial of degree $n$ in $\lambda$ with coefficients in $K$. The eigenvalues of $L$ are precisely the roots of $p$.

As you may recall from your study of algebra, a polynomial of degree $n$ has at most $n$ roots or zeros. This means that a linear transformation from $\mathbb{R}^n$ into itself has at most $n$ eigenvalues. If we are working over the complex numbers $\mathbb{C}$, we know that a polynomial can be factored completely. Hence, if $L$ is a linear transformation from $\mathbb{C}^n$ to itself, and if we count eigenvalues according to their multiplicity, then $L$ would have exactly $n$ eigenvalues.
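As a small numerical aside (a Python/NumPy sketch, not the text's ISETL tools), the characteristic polynomial and its roots can be computed for the matrix $A_4$ of Activity 1(d). Note that np.poly returns the coefficients of $\det(\lambda I - A)$, which agrees with $p(\lambda) = \det(A - \lambda I)$ up to a factor of $(-1)^n$ and therefore has the same roots.

import numpy as np

A4 = np.array([[1.0, 3.0],
               [2.0, 2.0]])

char_coeffs = np.poly(A4)        # coefficients of det(lambda*I - A4)
eigenvalues = np.roots(char_coeffs)
print(char_coeffs)               # [1, -3, -4], i.e. lambda^2 - 3*lambda - 4
print(eigenvalues)               # roots 4 and -1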
Bases of Eigenvectors

The steps in Activity 2 provided a rough sketch of the procedure for finding a diagonal matrix representation. In Activity 5, you found a diagonal form by multiplying by a suitable transition matrix. These activities raise some important questions. Is a set of eigenvectors always linearly independent? Do diagonal matrix representations correspond to bases consisting exclusively of eigenvectors? What happened in Activity 4 that prevented the transformation defined by $A_7$ from having a diagonal representation? Providing answers to these questions will be one focus of our work in this and the next section. We provide a partial answer to the first question in the following theorem.

Theorem 7.2.2. Let $L : V \to V$ be a linear transformation. Let $\lambda_1, \lambda_2, \ldots, \lambda_k$ be a set of distinct eigenvalues of $L$. For each $i = 1, 2, \ldots, k$, let $v_i$ be an eigenvector belonging to $\lambda_i$. Then, the set of vectors
$$\{ v_i : i = 1, 2, \ldots, k \}$$
is linearly independent.

Proof. The proof will be by induction on $k$.

Since an eigenvector, by definition, is not zero, the theorem is true for $k = 1$.

Suppose that the theorem holds for any set of $k$ eigenvectors, $v_1, v_2, \ldots, v_k$, belonging to a set $\lambda_1, \lambda_2, \ldots, \lambda_k$ of $k$ distinct eigenvalues. Now, suppose we have a set of $k + 1$ distinct eigenvalues, $\lambda_1, \lambda_2, \ldots, \lambda_k, \lambda_{k+1}$, and a corresponding set of eigenvectors, $v_1, v_2, \ldots, v_k, v_{k+1}$, where $v_i$ belongs to $\lambda_i$. By the induction hypothesis, the first $k$ of these are linearly independent. In order to show that the entire set is independent, all we have to show is that $v_{k+1}$ is not a linear combination of $v_1, v_2, \ldots, v_k$.

Suppose this is not the case; that is, $v_{k+1} = \sum_{i=1}^{k} t_i v_i$, where the $t_i$ are scalars. Then we would have,
$$\sum_{i=1}^{k} t_i \lambda_{k+1} v_i = \lambda_{k+1} \sum_{i=1}^{k} t_i v_i = \lambda_{k+1} v_{k+1} = L(v_{k+1})
= L\Bigl( \sum_{i=1}^{k} t_i v_i \Bigr) = \sum_{i=1}^{k} t_i L(v_i) = \sum_{i=1}^{k} t_i \lambda_i v_i.$$
But then we would have,
$$0 = \sum_{i=1}^{k} t_i \lambda_{k+1} v_i - \sum_{i=1}^{k} t_i \lambda_i v_i = \sum_{i=1}^{k} t_i (\lambda_{k+1} - \lambda_i)\, v_i.$$
Since all of the eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_{k+1}$ are distinct, $\lambda_{k+1} - \lambda_i \neq 0$ for $i = 1, 2, \ldots, k$. Since the vectors $v_1, v_2, \ldots, v_k$ are linearly independent, $t_i = 0$ for $i = 1, 2, \ldots, k$. This implies that $v_{k+1} = 0$, which is not the case.
This theorem explains why the matrix $S$ in Activity 5 is invertible. The columns of $S$ were the eigenvectors belonging to different eigenvalues. Indeed, if a vector space $V$ has dimension $n$ and $n$ distinct eigenvalues, then a set of $n$ eigenvectors, each corresponding to a distinct eigenvalue, constitutes a basis. This situation can be generalized. Specifically, we can construct a basis of eigenvectors in which some belong to the same eigenvalue. The next theorem will help us to prove such a theorem.

Theorem 7.2.3. The set of all eigenvectors belonging to the same eigenvalue, together with the zero vector, forms a subspace.

Proof. See Exercise 10.
We will use the theorem just stated, along with Theorem 7.2.2, to prove a stronger version of Theorem 7.2.2. We will show that it is possible to construct a linearly independent set of eigenvectors in which some of the vectors belong to the same eigenvalue. This theorem is an important step in the process of determining conditions that guarantee diagonalizability, that is, the existence of a basis for which a linear transformation has a diagonal matrix representation.

Theorem 7.2.4. Let $L : V \to V$ be a linear transformation, and let $\lambda_1, \lambda_2, \ldots, \lambda_k$ be a set of distinct eigenvalues of $L$. Let $E$ be a set of eigenvectors that satisfies the following condition: for each $i = 1, 2, \ldots, k$, those eigenvectors that belong to $\lambda_i$ are linearly independent. It then follows that the entire set $E$ is linearly independent.
Proof. Suppose that
$$E = \{ v_1, v_2, \ldots, v_m \}, \qquad m \le n.$$
Let $a_1, a_2, \ldots, a_m$ be scalars such that
$$a_1 v_1 + a_2 v_2 + \cdots + a_m v_m = 0.$$
It suffices to show that
$$a_1 = a_2 = \cdots = a_m = 0.$$
Group the combination according to those vectors which belong to the same eigenvalue. In such a case, it follows, by Theorem 7.2.3, that each such sum yields another eigenvector belonging to the same eigenvalue, or the zero vector. If we do this for every such grouping, the original linear combination
$$a_1 v_1 + a_2 v_2 + \cdots + a_m v_m$$
simplifies to a sum of distinct eigenvectors, each belonging to a different eigenvalue. By Theorem 7.2.2, each vector in this sum must be zero. This means that each part of the original combination that belonged to a particular eigenvalue yields a vector sum of zero. Since we are assuming that those vectors in $E$ that belong to the same eigenvalue are linearly independent, it follows that the associated scalars are zero. Since this happens for each such grouping of the original linear combination above, it follows that the coefficients $a_1, a_2, \ldots, a_m$ are all simultaneously zero.
The next two definitions will help us to state a condition that guarantees the existence of a basis of eigenvectors. The proof will be based upon Theorems 7.2.2, 7.2.3, and 7.2.4. This theorem will be a key tool in helping us to devise a procedure for diagonalization. This procedure will be discussed in detail in the next section.

Definition 7.2.2. The algebraic multiplicity of an eigenvalue $\lambda$ of a linear transformation $L : K^n \to K^n$ is its multiplicity as a root of the characteristic polynomial of $L$.

Definition 7.2.3. The geometric multiplicity of an eigenvalue is the dimension of the subspace of eigenvectors that belong to it.

Theorem 7.2.5. If $L : K^n \to K^n$ is a linear transformation, and if the sum of the geometric multiplicities of the eigenvalues of $L$ is $n$, then there is a basis of eigenvectors of $L$.

Proof. For each eigenvalue, find a basis for the subspace of its eigenvectors (it is a subspace by Theorem 7.2.3). Then, by Theorem 7.2.4, the union of all of these bases is linearly independent. By the assumption regarding geometric multiplicities, the number of vectors in this linearly independent set is $n$. By Theorem 4.4.8, this set is a basis.
Theorem 7.2.5, which gives a condition for diagonalizability, allows us to expand the level of detail of the procedure given in Activity 2. In each step, assume that $L : \mathbb{R}^n \to \mathbb{R}^n$ is a linear transformation.

1. Find the matrix representation $[L]$ of $L$ with respect to the coordinate basis. Determine the characteristic polynomial $|[L] - \lambda I|$.

2. Find the roots of the characteristic polynomial.

3. Find a basis for the subspace corresponding to each eigenvalue.

4. Take the union of these bases.

5. If the sum of the geometric multiplicities of the eigenvalues of $L$ is equal to $n$, then the linearly independent set described in step 4 forms a basis. Can you explain why? What happens if the sum of the geometric multiplicities is not equal to $n$? Would $L$ be diagonalizable in this case?
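As a rough computational sketch of these five steps (Python with NumPy and SciPy, assuming both are available; this is not the text's ISETL code and the helper name is ours), one might write:

import numpy as np
from scipy.linalg import null_space

def eigen_basis_sketch(A):
    # Steps 1-2: characteristic polynomial and its roots.
    char_poly = np.poly(A)
    eigenvalues = np.roots(char_poly)
    basis = []
    # Step 3: the eigenspace for lambda is the null space of A - lambda*I.
    for lam in set(np.round(eigenvalues, 8)):        # crude grouping of repeated roots
        E = null_space(A - lam * np.eye(A.shape[0]))
        basis.extend(E.T)                            # step 4: union of the bases
    return np.array(basis)                           # step 5: a basis iff it has n vectors

A6 = np.array([[0.0, 3.0, 3.0], [2.0, 2.0, 2.0], [4.0, 1.0, 1.0]])
print(eigen_basis_sketch(A6).shape[0])               # 3: geometric multiplicities sum to n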
This procedure leaves several unanswered questions: Can you find the roots of the characteristic equation? How do we construct a basis for a given eigenspace (the subspace of eigenvectors corresponding to a particular eigenvalue)? What happens if the sum of the geometric multiplicities is not equal to the dimension of the vector space? What is the precise relationship between an eigenbasis, if one can be constructed, and the coordinate basis? These and related questions will be answered in the next section.
What Can Happen?

Before considering conditions guaranteeing diagonalizability, as well as other applications of the theory of eigenvalues and eigenvectors, it may be useful to summarize all of the various possibilities encountered thus far. If $L : K^n \to K^n$ is a linear transformation, where $K$ is some field, the following list details important facts concerning eigenvalues and eigenvectors.

If $K$ is the set $\mathbb{C}$ of complex numbers, then it is certain that $L$ will have at least one eigenvalue. In general, however, it is possible that $L$ has no eigenvalues. When might $L$ not have any eigenvalues?

$L$ has at most $n$ eigenvalues. Can you explain why?

If there is a basis for $K^n$ consisting of eigenvectors, then the matrix of $L$ with respect to that basis is diagonal. We actually showed that it might be possible to construct a basis consisting exclusively of eigenvectors. However, we have not yet proven that a basis consisting of eigenvectors yields a diagonal matrix representation. Can you prove that?

If $L$ has $n$ distinct eigenvalues, then there is a basis for $K^n$ consisting of eigenvectors.

If the sum of the geometric multiplicities of the eigenvalues of $L$ is $n$, then there is a basis for $K^n$ consisting of eigenvectors.

In considering these possibilities, we must be careful to take into account the base field. For example, the characteristic polynomial $p$ of the matrix $A_7$ in Activity 4 is given by
$$p(\lambda) = -\lambda(1 + \lambda^2).$$
One might be tempted to conclude that 0 is the only eigenvalue. If $K = \mathbb{R}$, as was the case in Activity 4, then 0 is the only eigenvalue, and a transformation $T : \mathbb{R}^3 \to \mathbb{R}^3$ defined by $T(x) = A_7 x$ cannot be diagonalized. On the other hand, if we take $K$ to be the complex numbers, then a linear transformation $T : \mathbb{C}^3 \to \mathbb{C}^3$ defined by $A_7$ has 3 distinct eigenvalues. In this case, $T$ can be diagonalized.
Exercises

In doing the following problems, you may use any computer software, including any constructions you made in the activities, or a computer algebra system such as Derive, Maple, Matlab, or Mathematica.

1. Consider each of the following matrices as representing a linear transformation on a vector space whose field of scalars is $\mathbb{R}$. For each matrix, determine the characteristic polynomial, the eigenvalues, and the corresponding eigenspaces.

(a) $A_1 = \begin{pmatrix} 5 & 4 \\ 8 & 7 \end{pmatrix}$

(b) $A_2 = \begin{pmatrix} 2 & 1 \\ 3 & 1 \end{pmatrix}$

(c) $A_3 = \begin{pmatrix} 7 & 6 \\ 15 & 12 \end{pmatrix}$

(d) $A_4 = \begin{pmatrix} 4 & 0 & 2 \\ 2 & 3 & 2 \\ 3 & 0 & 1 \end{pmatrix}$

(e) $A_5 = \begin{pmatrix} 4 & 1 & 2 \\ 0 & 3 & 2 \\ 0 & 0 & 1 \end{pmatrix}$

(f) $A_6 = \begin{pmatrix} 9 & 7 & 7 \\ 3 & 1 & 3 \\ 5 & 5 & 3 \end{pmatrix}$

(g) $A_7 = \begin{pmatrix} 3 & 1 & 0 \\ 0 & 1 & 0 \\ 4 & 2 & 1 \end{pmatrix}$

(h) $A_8 = \begin{pmatrix} 2 & 3 & 6 \\ 6 & 2 & 3 \\ 3 & 6 & 2 \end{pmatrix}$
2. Repeat Exercise 1 for the following matrices, except this time assume that the field of scalars is the complex numbers $\mathbb{C}$.

(a) $A_1 = \begin{pmatrix} 1 & 3 \\ 1 & 1 \end{pmatrix}$

(b) $A_2 = \begin{pmatrix} 2 & 3 & 6 \\ 6 & 2 & 3 \\ 3 & 6 & 2 \end{pmatrix}$

3. Suppose $F : \mathbb{R}^2 \to \mathbb{R}^2$ is a linear transformation whose matrix representation with respect to the ordered basis $B = [\langle 1, 1 \rangle, \langle 2, 1 \rangle]$ is
$$\begin{pmatrix} 5 & 0 \\ 0 & 1 \end{pmatrix}.$$
(a) Show that $B$ is a basis of eigenvectors.
(b) Find the matrix representation of $F$ with respect to the coordinate basis.
(c) Find the characteristic polynomial of $F$.

4. Suppose that $p(x) = (x - 3)^2 (x - 2)(x + 1)$ is the characteristic polynomial of a linear transformation $T : \mathbb{R}^4 \to \mathbb{R}^4$. Is $T$ diagonalizable? Why, or why not?
5. Suppose that $p(\lambda) = \lambda^2 - 5\lambda + 6$ is the characteristic polynomial of a linear transformation $G : \mathbb{R}^2 \to \mathbb{R}^2$.
(a) Construct a basis for $\mathbb{R}^2$ of eigenvectors of $G$.
(b) Find the matrix representation of $G$ with respect to the eigenbasis you constructed in (a).
(c) Find the matrix representation of $G$ with respect to the ordered basis $B = [\langle 1, 3 \rangle, \langle 4, 2 \rangle]$.

6. Let $L : \mathbb{R}^n \to \mathbb{R}^n$ be a linear transformation. Let $I : \mathbb{R}^n \to \mathbb{R}^n$ be the identity transformation. Show that the transformation $M : \mathbb{R}^n \to \mathbb{R}^n$, defined by
$$M(x) = L(x) - \lambda I(x),$$
is a linear transformation.

7. Let $L$, $I$, and $M$ be defined as they were in Exercise 6. Show that $\lambda$ is an eigenvalue of $L$ if and only if there exists a nonzero vector $x$ such that $x \in \ker(M)$.

8. Let $L$, $I$, and $M$ be defined as they have been in the previous two exercises. Carefully explain why the set of eigenvalues of $L$ is the solution set of the equation
$$\det([M]) = 0,$$
where $[M]$ denotes the matrix representation of $M$ with respect to the coordinate basis.
9. Explain why the matrix $S$ defined in Activity 5 is invertible.

10. Provide a proof of Theorem 7.2.3.

11. In the proof of Theorem 7.2.4, the original linear combination
$$a_1 v_1 + a_2 v_2 + \cdots + a_m v_m$$
simplifies to a sum of distinct eigenvectors, each belonging to a different eigenvalue. If you recall, we were trying to show that the scalars $a_i$ are simultaneously zero. Hence, we were considering the simplified sum when set equal to the zero vector. In this context, use Theorem 7.2.2 to explain why each of the vectors in the simplified combination must be the zero vector.

12. Let $A$ be a matrix representation of a linear transformation on a vector space over $K$ with respect to some basis $B$. If $A$ is diagonal, prove that the basis $B$ consists entirely of eigenvectors, and show that the diagonal elements of $A$ are the eigenvalues.

13. Let
$$A = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$$
be the matrix representation of a linear transformation $T : \mathbb{R}^2 \to \mathbb{R}^2$ with respect to the coordinate basis.
(a) Find the characteristic polynomial of $T$.
(b) Show that $T$ has no eigenvalues.
(c) Interpret this result geometrically. In particular, what does it mean to say that $T$ has no eigenvalues?
(d) If the vector space were $\mathbb{C}^n$ instead of $\mathbb{R}^n$, would $T$ still have no eigenvalues? Explain.

14. Consider the linear transformation $T : \mathbb{R}^3 \to \mathbb{R}^3$ whose matrix with respect to the coordinate basis is
$$\begin{pmatrix} 3 & 1 & 1 \\ 0 & 3 & 1 \\ 0 & 0 & 5 \end{pmatrix}.$$
Try to diagonalize this matrix. What is the relationship between the algebraic and geometric multiplicities of an eigenvalue in this case?

15. If $A$ is a square matrix, then $A^2$ denotes $A \cdot A$. Similarly, $A^3 = A \cdot A \cdot A$, and $A^n = A \cdot A \cdots A$, a product of $n$ copies of $A$. If a square matrix $A$ is diagonalizable, and if all of its eigenvalues are either 1 or $-1$, then show that $A^2 = I$.

16. If $A$ is a square, diagonalizable matrix, and if all of its eigenvalues are either 1 or 0, then prove $A^2 = A$.

17. If $A$ is a square, diagonalizable matrix, and if all of its eigenvalues are either 3 or $-5$, then show that $A^2 + 2A - 15I = 0$.

18. Can you think of a general statement for which the three previous exercises are special cases?

19. Show that the general formula for the characteristic polynomial of a $2 \times 2$ matrix
$$\begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}$$
is given by $\lambda^2 - (\operatorname{tr}(A))\lambda + \det(A)$, where $\operatorname{tr}(A)$ denotes the trace of $A$, the sum of the diagonal entries.

20. Suppose $v$ is a nonzero eigenvector of a matrix $A$ belonging to the eigenvalue $\lambda$. Show that $v$ is an eigenvector of $A^n$ belonging to $\lambda^n$.
7.3 Diagonalization and Applications

Activities

1. Let $T : \mathbb{R}^2 \to \mathbb{R}^2$ be defined by
$$T(\langle x_1, x_2 \rangle) = A \begin{pmatrix} x_1 \\ x_2 \end{pmatrix},
\qquad\text{where}\qquad
A = \begin{pmatrix} 7 & 10 \\ 3 & 4 \end{pmatrix}.$$

(a) Verify that the matrix representation of $T$ with respect to the coordinate basis is given by $A$. Find the characteristic polynomial of $T$ with respect to the coordinate basis, that is, compute $|A - \lambda I_2|$.

(b) Let $B = [\langle 2, 1 \rangle, \langle 3, 4 \rangle]$ be another basis for $\mathbb{R}^2$. Apply the func ChangeFromCoordB that you constructed in Activity 4 of Section 7.1 to find $[T]_B$, the matrix representation of $T$ with respect to the basis $B$. Find the characteristic polynomial of $T$ with respect to the basis $B$, that is, compute $|[T]_B - \lambda I_2|$.

(c) Use Theorem 7.1.2 to show that $A$ and $[T]_B$ are similar matrices. What is the relationship between $|A - \lambda I_2|$ and $|[T]_B - \lambda I_2|$?

(d) Based upon the results you obtained in part (c), what can we say about the characteristic polynomials of two similar matrices? If asked to find the eigenvalues of a linear transformation, does it appear to matter what basis we work with? Explain your answer.
2. Define $F : \mathbb{R}^3 \to \mathbb{R}^3$ by
$$F(\langle x_1, x_2, x_3 \rangle) = \langle 4x_1 + x_3,\; 2x_1 + 3x_2 + 2x_3,\; x_1 + 4x_3 \rangle.$$

(a) Find the eigenvalues of $F$. Does the characteristic polynomial factor completely?

(b) Suppose $\lambda = a$ is an eigenvalue of $F$. According to Theorem 7.2.3, the eigenspace corresponding to $a$ forms a subspace of $\mathbb{R}^3$. A vector $x \in \mathbb{R}^3$ is an eigenvector corresponding to $a$ if and only if $x$ is a nonzero solution of $([F]_{\mathcal{C}} - a I_3)\,x = 0$. Why is this the case? How can we use this equation to find a basis for the eigenspace corresponding to $a$? Once you have answered these questions, find a basis for the eigenspace corresponding to each of the eigenvalues you found in part (a).

(c) Let $\mathcal{E}$ be the collection of all of the eigenbasis vectors you found in part (b). According to Theorem 7.2.4, this set is linearly independent. Does this set form a basis for $\mathbb{R}^3$ in this case? Why, or why not?

(d) If $\mathcal{E}$ forms a basis for $\mathbb{R}^3$, what is $[F]_{\mathcal{E}}$, the matrix representation with respect to $\mathcal{E}$? What is the form of the transition matrix from the coordinate basis $\mathcal{C}$ to the basis $\mathcal{E}$? If $\mathcal{E}$ does not form a basis, would it still be possible to construct a basis of eigenvectors?

(e) What is the relationship between the algebraic multiplicity of each eigenvalue and its geometric multiplicity? Do the geometric multiplicities add up to the dimension of $\mathbb{R}^3$?
3. Define $G : \mathbb{R}^3 \to \mathbb{R}^3$ by
$$G(\langle x_1, x_2, x_3 \rangle) = \langle x_3,\; x_1 - x_3,\; x_2 + x_3 \rangle.$$

(a) Find the eigenvalues of $G$. Does the characteristic polynomial factor completely?

(b) Find a basis for the eigenspace corresponding to each of the eigenvalues you found in part (a).

(c) Let $\mathcal{E}$ be the collection of all of the eigenbasis vectors you found in part (b). Does this set form a basis for $\mathbb{R}^3$ in this case? Why, or why not?

(d) If $\mathcal{E}$ forms a basis for $\mathbb{R}^3$, what is $[G]_{\mathcal{E}}$, the matrix representation with respect to $\mathcal{E}$? What is the form of the transition matrix from the coordinate basis $\mathcal{C}$ to the basis $\mathcal{E}$? If $\mathcal{E}$ does not form a basis, would it still be possible to construct a basis of eigenvectors?

(e) What is the relationship between the algebraic multiplicity of each eigenvalue and its geometric multiplicity? Do the geometric multiplicities add up to the dimension of $\mathbb{R}^3$?
4. Let $P_2(\mathbb{R})$ be the space of all polynomials with real coefficients of degree 2 or less. Define $H : P_2(\mathbb{R}) \to P_2(\mathbb{R})$ by
$$H(p) = p',$$
where $p \in P_2(\mathbb{R})$, and $p'$ denotes the derivative of $p$. (Note that here we are considering polynomials as functions, rather than as a sequence of coefficients from a field. Do you see this distinction?)

(a) Find the matrix representation of $H$ with respect to the basis $B = [1, x, x^2]$. Find the eigenvalues of $H$. Does the characteristic polynomial factor completely?

(b) Find a basis for the eigenspace corresponding to each of the eigenvalues you found in part (a).

(c) Let $\mathcal{E}$ be the collection of all of the eigenbasis vectors you found in part (b). Does this set form a basis for $P_2(\mathbb{R})$ in this case? Why, or why not?

(d) If $\mathcal{E}$ forms a basis for $P_2(\mathbb{R})$, what is $[H]_{\mathcal{E}}$, the matrix representation with respect to $\mathcal{E}$? What is the form of the transition matrix from the basis $B$ to the basis $\mathcal{E}$? If $\mathcal{E}$ does not form a basis, would it still be possible to construct a basis of eigenvectors?

(e) What is the relationship between the algebraic multiplicity of each eigenvalue and its geometric multiplicity? Do the geometric multiplicities add up to the dimension of $P_2(\mathbb{R})$?

(f) On the basis of your findings here and in Activities 2 and 3, under what condition is a linear transformation likely to have a diagonal matrix representation?
5. Let
$$A = \begin{pmatrix} 4 & 6 \\ -3 & -5 \end{pmatrix},$$
and let $A = C^{-1} D C$, where
$$C = \begin{pmatrix} 1 & 1 \\ 1 & 2 \end{pmatrix} \qquad\text{and}\qquad D = \begin{pmatrix} 1 & 0 \\ 0 & -2 \end{pmatrix}.$$
Determine whether $A^n = C^{-1} D^n C$ for $n = 2, 3, 4$. What do you observe? Based upon your observations, state a conjecture, if possible.
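The check in Activity 5 can also be run numerically. The snippet below is only an illustrative sketch (Python/NumPy, not the text's ISETL tools); to keep the relation $A = C^{-1}DC$ exact regardless of how the signs are read, the code computes $A$ directly from $C$ and $D$.

import numpy as np

C = np.array([[1.0, 1.0],
              [1.0, 2.0]])
D = np.array([[1.0,  0.0],
              [0.0, -2.0]])
A = np.linalg.inv(C) @ D @ C

for n in (2, 3, 4):
    lhs = np.linalg.matrix_power(A, n)
    rhs = np.linalg.inv(C) @ np.linalg.matrix_power(D, n) @ C
    print(n, np.allclose(lhs, rhs))     # True for every n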
Discussion

Relationship between Diagonalizability and Eigenvalues

Toward the end of the last section, we outlined a procedure for diagonalizing a matrix. In this section, we will provide additional detail, as well as determine conditions which ensure diagonalizability. But, before we continue, it might be helpful to define exactly what we mean by diagonalizability.

Definition 7.3.1. Let $T : \mathbb{R}^n \to \mathbb{R}^n$ be a linear transformation. $T$ is diagonalizable if there exists a basis $B$ such that the corresponding matrix representation $[T]_B$ is a diagonal matrix.

Activities 2, 3, and 4, together with the discussion in the last section, suggest that the diagonalizability of a linear transformation depends upon the ability to find, or construct, a basis consisting exclusively of eigenvectors. The theorem below verifies that this is indeed the case.
Theorem 7.3.1. Let $T : \mathbb{R}^n \to \mathbb{R}^n$ be a linear transformation, and let $B$ be a basis. The matrix representation $[T]_B$ is diagonal if and only if $B$ consists exclusively of eigenvectors.

Proof. ($\Leftarrow$) Assume that $B$ is a basis of eigenvectors. We want to show that $[T]_B$ is a diagonal matrix. The proof of this part is left to the exercises. See Exercise 6.

($\Rightarrow$) Assume that $[T]_B$ is a diagonal matrix. Then, there exist scalars $a_{11}, a_{22}, \ldots, a_{nn}$ such that
$$[T]_B = \begin{pmatrix}
a_{11} & 0 & 0 & \cdots & 0 \\
0 & a_{22} & 0 & \cdots & 0 \\
0 & 0 & a_{33} & \cdots & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & 0 & \cdots & a_{nn}
\end{pmatrix}.$$
Let the basis $B$ be given by $B = [v_1, v_2, \ldots, v_n]$. Then, according to Definition 6.2.2,
$$T(v_i) = 0 v_1 + \cdots + 0 v_{i-1} + a_{ii} v_i + 0 v_{i+1} + \cdots + 0 v_n = a_{ii} v_i.$$
According to Definition 7.2.1, each $v_i$, $i = 1, 2, \ldots, n$, is an eigenvector. Therefore, $B$ is a basis consisting entirely of eigenvectors.
In Section 7.2, you found eigenvalues by working with the coordinate basis. The results of Activity 1 suggest that the characteristic polynomial does not depend upon the specific choice of basis. The next theorem establishes this as a general result.

Theorem 7.3.2. Let $T : \mathbb{R}^n \to \mathbb{R}^n$ be a linear transformation. Let $B$ and $B'$ be two bases for $\mathbb{R}^n$. Then,
$$|[T]_B - \lambda I_n| = |[T]_{B'} - \lambda I_n|,$$
that is, the characteristic polynomial is independent of the choice of basis.

Proof. As given in the statement of the theorem, let $[T]_B$ and $[T]_{B'}$ be the two matrix representations of $T$ with respect to the bases $B$ and $B'$, respectively. According to Theorem 7.1.2, these two matrices are similar, that is, there exists an invertible matrix $C$ such that
$$[T]_{B'} = C^{-1} [T]_B C.$$
What are the entries of $C$? Can you recall, based upon the theorem just cited?

Using $C$, we can establish the following equality,
$$\begin{aligned}
|[T]_{B'} - \lambda I_n| &= |C^{-1} [T]_B C - \lambda I_n| \\
&= |C^{-1}([T]_B - \lambda I_n)C| \\
&= |C^{-1}|\,|[T]_B - \lambda I_n|\,|C| \\
&= |[T]_B - \lambda I_n|\,|C^{-1}C| \\
&= |[T]_B - \lambda I_n|,
\end{aligned}$$
which is what we wished to prove. Can you justify each step?
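The theorem can also be illustrated numerically. The following is only a sketch (Python/NumPy): conjugating the matrix $A$ from Activity 1 by an arbitrary invertible matrix leaves the coefficients of the characteristic polynomial unchanged, up to round-off.

import numpy as np

rng = np.random.default_rng(1)
A = np.array([[7.0, 10.0], [3.0, 4.0]])          # the matrix from Activity 1
C = rng.standard_normal((2, 2))                  # almost surely invertible
similar = np.linalg.inv(C) @ A @ C

print(np.round(np.poly(A), 8))
print(np.round(np.poly(similar), 8))             # same coefficients, up to round-off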
These theorems simplify the basic procedure for diagonalizing a trans-
formation. The theorem we have just proven tells us that we can use any
basis, and hence, any matrix representation, to nd the eigenvalues of a
linear transformation. Theorem 7.3.1 reveals that diagonalizability depends
entirely upon the ability to construct a basis of eigenvectors. What remains
is to nd conditions that guarantee the existence of an eigenbasis.
7.3 Diagonalization and Applications 375
Conditions that Guarantee Diagonalizability
In Activities 2, 3, and 4, you were asked to compare the geometric and al-
gebraic multiplicities of each eigenvalue, as well as to determine whether the
characteristic polynomial splits, that is, factors completely. Based upon your
results, is it possible for a diagonalizable transformation to have a characteris-
tic polynomial that does not split? If the characteristic polynomial splits, can
you immediately conclude that the transformation is diagonalizable? Does
the relationship between the algebraic and geometric multiplicities of each
eigenvalue appear to have any bearing upon the issue of diagonalizability?
The next theorem provides an answer to the rst question.
Theorem 7.3.3. Let $T : \mathbb{R}^n \to \mathbb{R}^n$ be a linear transformation that is diagonalizable. Then, the characteristic polynomial of $T$ splits.
Proof. According to the assumption, there exists a basis $B$ such that the resulting matrix representation $[T]_B$ is a diagonal matrix. By Theorem 7.3.2, the choice of basis does not affect the form of the characteristic polynomial. Hence, we can find the characteristic polynomial of $T$ by evaluating the determinant $\lvert [T]_B - tI_n \rvert$. Since the only nonzero entries of $[T]_B - tI_n$ lie along the diagonal and are of the form $a_{ii} - t$, $i = 1, 2, \ldots, n$, it follows that the characteristic polynomial $\lvert [T]_B - tI_n \rvert$ will consist exclusively of a product of $n$ factors of the form $(a_{ii} - t)$, $i = 1, 2, \ldots, n$.
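A concrete illustration of the contrapositive may be helpful; the example is ours and is not one of the activities. A rotation of the plane through 90 degrees has characteristic polynomial $t^2 + 1$, which has no real roots and therefore does not split over $\mathbb{R}$, so by Theorem 7.3.3 the rotation cannot be diagonalizable over $\mathbb{R}$. A quick check in Python with SymPy (outside the text's ISETL setting):

import sympy as sp

t = sp.symbols('t')
R = sp.Matrix([[0, -1],
               [1,  0]])                      # rotation of R^2 through 90 degrees

print(R.charpoly(t).as_expr())                # t**2 + 1: no real roots, so no splitting over R
print(R.eigenvals())                          # the complex eigenvalues I and -I
print(R.is_diagonalizable(reals_only=True))   # False: not diagonalizable over R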
Does this theorem answer the second question posed in the first paragraph of this subsection? Why, or why not?

Activities 2, 3, and 4 reveal a second consequence of diagonalizability: the equality of the algebraic and geometric multiplicities of each eigenvalue. Before we can prove this result, we first show that the geometric multiplicity of an eigenvalue cannot exceed its algebraic multiplicity.
Theorem 7.3.4. Let $T : \mathbb{R}^n \to \mathbb{R}^n$ be a linear transformation. Let $\lambda$ be an eigenvalue of $T$. Then, the geometric multiplicity of $\lambda$ does not exceed its algebraic multiplicity.
Proof. Let $\lambda$ be an eigenvalue of $T$ having algebraic multiplicity $m$. Certainly, $m \le n$. Why is this true? Let $\{v_1, \ldots, v_p\}$ be a basis for the eigenspace corresponding to $\lambda$. Then, $p \le n$. Why? According to Theorem 4.4.10, we can expand this linearly independent set to a basis for all of $\mathbb{R}^n$, say
\[
\{v_1, \ldots, v_p, v_{p+1}, \ldots, v_n\}.
\]
Since the first $p$ vectors are eigenvectors, the matrix representation of $T$ with respect to this basis is of the form
\[
\begin{bmatrix}
\lambda & 0 & \cdots & 0 & a_{1\,p+1} & \cdots & a_{1n} \\
0 & \lambda & \cdots & 0 & a_{2\,p+1} & \cdots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots & \vdots & & \vdots \\
0 & 0 & \cdots & \lambda & a_{p\,p+1} & \cdots & a_{pn} \\
0 & 0 & \cdots & 0 & a_{p+1\,p+1} & \cdots & a_{p+1\,n} \\
\vdots & \vdots & & \vdots & \vdots & & \vdots \\
0 & 0 & \cdots & 0 & a_{n\,p+1} & \cdots & a_{nn}
\end{bmatrix}.
\]
According to Theorem 7.2.1, the characteristic polynomial is the determinant of the matrix
\[
\begin{bmatrix}
\lambda & 0 & \cdots & 0 & a_{1\,p+1} & \cdots & a_{1n} \\
0 & \lambda & \cdots & 0 & a_{2\,p+1} & \cdots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots & \vdots & & \vdots \\
0 & 0 & \cdots & \lambda & a_{p\,p+1} & \cdots & a_{pn} \\
0 & 0 & \cdots & 0 & a_{p+1\,p+1} & \cdots & a_{p+1\,n} \\
\vdots & \vdots & & \vdots & \vdots & & \vdots \\
0 & 0 & \cdots & 0 & a_{n\,p+1} & \cdots & a_{nn}
\end{bmatrix}
- tI_n
=
\begin{bmatrix}
\lambda - t & 0 & \cdots & 0 & a_{1\,p+1} & \cdots & a_{1n} \\
0 & \lambda - t & \cdots & 0 & a_{2\,p+1} & \cdots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots & \vdots & & \vdots \\
0 & 0 & \cdots & \lambda - t & a_{p\,p+1} & \cdots & a_{pn} \\
0 & 0 & \cdots & 0 & a_{p+1\,p+1} - t & \cdots & a_{p+1\,n} \\
\vdots & \vdots & & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & 0 & a_{n\,p+1} & \cdots & a_{nn} - t
\end{bmatrix}.
\]
If we apply what we have learned about determinants from Chapter 6, we can see that the determinant of the matrix given above will simplify to an $n$th-degree polynomial with a factor of the form $(\lambda - t)^p$, so that $\lambda$ is a root of the characteristic polynomial of multiplicity at least $p$. Since the algebraic multiplicity of $\lambda$ is assumed to be $m$, it follows that $p \le m$, that is, the geometric multiplicity cannot exceed the algebraic multiplicity.
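To see that the inequality in Theorem 7.3.4 can be strict, consider the shear matrix used in the sketch below; the example is ours and is not taken from the activities. Its characteristic polynomial is $(t - 1)^2$, so the eigenvalue 1 has algebraic multiplicity 2, yet its eigenspace turns out to be only one-dimensional. A sketch in Python with SymPy:

import sympy as sp

t = sp.symbols('t')
A = sp.Matrix([[1, 1],
               [0, 1]])                       # a shear of the plane

print(sp.factor(A.charpoly(t).as_expr()))     # (t - 1)**2: algebraic multiplicity 2
print(len((A - 1*sp.eye(2)).nullspace()))     # 1: geometric multiplicity is only 1
print(A.is_diagonalizable())                  # False: there is no basis of eigenvectors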
We will use this theorem to prove the following theorem, which shows that
the equality of the algebraic and geometric multiplicities of each eigenvalue
is a second consequence of diagonalizability.
Theorem 7.3.5. Let $T : \mathbb{R}^n \to \mathbb{R}^n$ be a linear transformation that is diagonalizable. Then, the geometric and algebraic multiplicities of each eigenvalue are equal.
Proof. Suppose that $\lambda_1, \lambda_2, \ldots, \lambda_k$, $k \le n$, are the distinct eigenvalues of $T$. By Theorem 7.3.1, there exists a basis $B$ consisting exclusively of eigenvectors. Since $\dim(\mathbb{R}^n) = n$, there are $n$ vectors in the set $B$. Let $E_i$, $i = 1, 2, \ldots, k$, be the set of those vectors in $B$ that correspond to the eigenvalue $\lambda_i$. Let $j_i$, $i = 1, 2, \ldots, k$, represent the number of vectors in $E_i$. Let $m_i$, $i = 1, 2, \ldots, k$, denote the algebraic multiplicity of $\lambda_i$.

Since $E_i$, $i = 1, 2, \ldots, k$, is a subset of $B$, each $E_i$ is a linearly independent set. This set also generates the eigenspace corresponding to $\lambda_i$: any vector in the eigenspace of $\lambda_i$ can be written as a linear combination of $B$, from which it follows that any such vector can be written as a linear combination of $E_i$. (Can you fill in the details here?) Hence, $E_i$ forms a basis for the eigenspace of $\lambda_i$, which means that $j_i$ is the geometric multiplicity of $\lambda_i$.

By Theorem 7.3.4, $j_i \le m_i$ for all $i = 1, 2, \ldots, k$. We can use this to say
\[
n = \sum_{i=1}^{k} j_i \le \sum_{i=1}^{k} m_i \le n,
\]
from which it follows that
\[
\sum_{i=1}^{k} (m_i - j_i) = 0.
\]
Since $m_i - j_i \ge 0$ for all $i = 1, 2, \ldots, k$, we can conclude that $j_i = m_i$ for all $i = 1, 2, \ldots, k$, which is what we wished to prove.
As a result of Theorems 7.3.3 and 7.3.5, we know that the splitting of the characteristic polynomial and the equality of the algebraic and geometric multiplicities of each eigenvalue are consequences of diagonalizability. Can we go the other way? As the activities show, neither of these conditions in isolation is sufficient to ensure diagonalizability. What do we mean by sufficient here? Of the linear transformations in Activities 2, 3, and 4, only one proved to be diagonalizable. In this case, both the characteristic polynomial split and the algebraic and geometric multiplicities were equal. As the next theorem shows, both of these things must occur together in order to ensure the existence of an eigenbasis.
Theorem 7.3.6. Let $T : \mathbb{R}^n \to \mathbb{R}^n$ be a linear transformation. If the characteristic polynomial of $T$ splits, and if the algebraic and geometric multiplicities of each eigenvalue of $T$ are equal, then $T$ is a diagonalizable transformation.
Proof. Let $\lambda_1, \lambda_2, \ldots, \lambda_k$ be the distinct eigenvalues of $T$. Let $m_i$, $i = 1, 2, \ldots, k$, represent the algebraic multiplicity of each eigenvalue. Since the characteristic polynomial splits,
\[
m_1 + m_2 + \cdots + m_k = n,
\]
that is, the algebraic multiplicities add to the dimension of the vector space $\mathbb{R}^n$.

Let $j_i$, $i = 1, 2, \ldots, k$, denote the geometric multiplicity of the eigenspace of $\lambda_i$, $i = 1, 2, \ldots, k$. Let $\mathcal{C}_i$, $i = 1, 2, \ldots, k$, be a basis for the eigenspace corresponding to $\lambda_i$. If we let
\[
B = \mathcal{C}_1 \cup \mathcal{C}_2 \cup \cdots \cup \mathcal{C}_k,
\]
that is, $B$ is the collection of all eigenbasis vectors from each $\mathcal{C}_i$, then $B$, according to Theorem 7.2.4, is a linearly independent set. By assumption, $j_i = m_i$ for all $i = 1, 2, \ldots, k$. Therefore, $B$ is a linearly independent set of $n$ eigenvectors. According to Theorem 4.4.8, $B$ forms an eigenbasis for $\mathbb{R}^n$. By Theorem 7.3.1, it follows that the matrix representation $[T]_B$ is diagonal.
Theorems 7.3.3, 7.3.4, and 7.3.6 can be combined into a single if and only if theorem. What is the statement of this theorem? Now that we have established dual conditions that are equivalent to diagonalizability, we can elaborate upon the procedure for finding an eigenbasis that was outlined briefly in the last section and alluded to in the exercises.
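The combined criterion can also be phrased as a small computational test. The sketch below is written in Python with SymPy rather than ISETL; the helper name diagonalizable_over_R and the three test matrices are our own. For each real eigenvalue the function compares the geometric multiplicity with the algebraic multiplicity, and it checks that the real eigenvalues account for all $n$ roots, which is exactly the statement that the characteristic polynomial splits over $\mathbb{R}$.

import sympy as sp

def diagonalizable_over_R(A):
    # Test the two conditions of the combined theorem for a square SymPy matrix A.
    n = A.rows
    real_root_count = 0
    for eigenvalue, alg_mult, basis in A.eigenvects():
        if not eigenvalue.is_real:
            continue                      # complex roots: the polynomial may fail to split over R
        real_root_count += alg_mult
        if len(basis) != alg_mult:        # geometric multiplicity differs from algebraic
            return False
    return real_root_count == n           # splits over R exactly when real roots account for all n

shear    = sp.Matrix([[1, 1], [0, 1]])    # splits, but the multiplicities differ
rotation = sp.Matrix([[0, -1], [1, 0]])   # does not split over R
diagonal = sp.Matrix([[2, 0], [0, 3]])    # already diagonal
print(diagonalizable_over_R(shear), diagonalizable_over_R(rotation), diagonalizable_over_R(diagonal))
# False False True -- compare with SymPy's built-in is_diagonalizable(reals_only=True)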
A Procedure for Diagonalizing a Transformation
In this subsection, we will provide a detailed description of the process of diagonalizing a linear transformation. We discuss each step in the context of working with a specific example. Let $T : \mathbb{R}^3 \to \mathbb{R}^3$ be the linear transformation defined by
\[
T(\langle x_1, x_2, x_3 \rangle) = \langle 15x_1 + 7x_2 - 7x_3,\; -x_1 + x_2 + x_3,\; 13x_1 + 7x_2 - 5x_3 \rangle.
\]
1. Find the matrix representation of $T$. In this case, we find the matrix representation with respect to the coordinate basis, which is
\[
\begin{bmatrix}
15 & 7 & -7 \\
-1 & 1 & 1 \\
13 & 7 & -5
\end{bmatrix}.
\]
2. Find the eigenvalues of $T$. This involves completing the series of steps involving the characteristic polynomial, which are given below.
\[
\left|\;
\begin{bmatrix}
15 & 7 & -7 \\
-1 & 1 & 1 \\
13 & 7 & -5
\end{bmatrix}
- t
\begin{bmatrix}
1 & 0 & 0 \\
0 & 1 & 0 \\
0 & 0 & 1
\end{bmatrix}
\;\right|
=
\begin{vmatrix}
15 - t & 7 & -7 \\
-1 & 1 - t & 1 \\
13 & 7 & -5 - t
\end{vmatrix}
= -(t - 1)(t - 8)(t - 2).
\]
Since the polynomial splits, one requirement of diagonalizability has been satisfied. What would have happened if the polynomial had not split? Would we be able to construct an eigenbasis? Why, or why not? For this example, the eigenvalues are 1, 2, and 8. Each has algebraic multiplicity 1. (The sketch following step 4 checks these computations by machine.)
3. Find a basis for each eigenspace. If $a$ is an eigenvalue, then $x$ is an eigenvector with eigenvalue $a$ if and only if $x$ is a nonzero solution of the matrix equation
\[
([T] - aI_3)\,x = 0.
\]
$\lambda = 1$:
\[
([T] - (1)I_3)\,x = 0
\]
\[
\left(
\begin{bmatrix}
15 & 7 & -7 \\
-1 & 1 & 1 \\
13 & 7 & -5
\end{bmatrix}
- (1)
\begin{bmatrix}
1 & 0 & 0 \\
0 & 1 & 0 \\
0 & 0 & 1
\end{bmatrix}
\right)
\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}
=
\begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}.
\]
This yields the following system of equations,
\begin{align*}
14x_1 + 7x_2 - 7x_3 &= 0 \\
-x_1 + 0x_2 + x_3 &= 0 \\
13x_1 + 7x_2 - 6x_3 &= 0,
\end{align*}
whose solution set is
\[
\{\langle r, -r, r \rangle : r \in \mathbb{R}\}.
\]
$\{\langle 1, -1, 1 \rangle\}$ is a basis for the eigenspace of $\lambda = 1$. The geometric multiplicity of this eigenspace is obviously 1.
$\lambda = 2$:
\[
([T] - (2)I_3)\,x = 0
\]
\[
\text{Solution Set} = \{\langle 0, r, r \rangle : r \in \mathbb{R}\}.
\]
$\{\langle 0, 1, 1 \rangle\}$ is a basis for the eigenspace of $\lambda = 2$. The geometric multiplicity of this eigenspace is obviously 1. Can you fill in the details?
$\lambda = 8$:
\[
([T] - (8)I_3)\,x = 0
\]
\[
\text{Solution Set} = \{\langle r, 0, r \rangle : r \in \mathbb{R}\}.
\]
$\{\langle 1, 0, 1 \rangle\}$ is a basis for the eigenspace of $\lambda = 8$. The geometric multiplicity of this eigenspace is obviously 1. Can you fill in the details?
Since the characteristic polynomial splits, and since the geometric multiplicities add to the dimension of $\mathbb{R}^3$,
\[
\{\langle 1, -1, 1 \rangle, \langle 0, 1, 1 \rangle, \langle 1, 0, 1 \rangle\}
\]
forms an eigenbasis for $\mathbb{R}^3$. What would have happened if one or more of the eigenvalues had geometric multiplicities that failed to equal their corresponding algebraic multiplicities? Would the transformation be diagonalizable?
4. Find the matrix representation with respect to the eigenbasis:
\begin{align*}
T(\langle 1, -1, 1 \rangle) &= \langle 1, -1, 1 \rangle \\
T(\langle 0, 1, 1 \rangle) &= 2\,\langle 0, 1, 1 \rangle \\
T(\langle 1, 0, 1 \rangle) &= 8\,\langle 1, 0, 1 \rangle.
\end{align*}
Therefore,
\[
[T]_{\{\langle 1,-1,1 \rangle,\, \langle 0,1,1 \rangle,\, \langle 1,0,1 \rangle\}} =
\begin{bmatrix}
1 & 0 & 0 \\
0 & 2 & 0 \\
0 & 0 & 8
\end{bmatrix}.
\]
What is the change of basis matrix from the coordinate basis to the eigenbasis given here? How would we find this diagonal form, if we were only given the change of basis matrix?
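Each of the four steps can be checked by machine. The sketch below, again in Python with SymPy rather than ISETL, recomputes the characteristic polynomial, a basis vector for each eigenspace, and the diagonal form $M^{-1}[T]M$, where the columns of $M$ are the eigenbasis vectors found in step 3.

import sympy as sp

t = sp.symbols('t')
A = sp.Matrix([[15, 7, -7],
               [-1, 1,  1],
               [13, 7, -5]])            # [T] with respect to the coordinate basis

# SymPy works with det(tI - A), which differs from det(A - tI) above by a factor of (-1)^3.
print(sp.factor(A.charpoly(t).as_expr()))     # (t - 1)*(t - 2)*(t - 8)

# Step 3: one basis vector for each eigenspace.
for lam in (1, 2, 8):
    print(lam, (A - lam*sp.eye(3)).nullspace())

# Step 4: the columns of M are the eigenbasis vectors, so M**(-1)*A*M is the diagonal form.
M = sp.Matrix([[ 1, 0, 1],
               [-1, 1, 0],
               [ 1, 1, 1]])
print(M.inv() * A * M)                  # diag(1, 2, 8)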
Now that we know how to find a diagonal form, the next issue is to see how it applies. This will be the focus of the remainder of this and the next section.
Using Diagonalization to Solve a System of Differential Equations
Consider the system of differential equations
\begin{align*}
f_1' &= 3f_1 + f_2 + f_3 \\
f_2' &= 2f_1 + 4f_2 + 2f_3 \\
f_3' &= -f_1 - f_2 + f_3,
\end{align*}
where each $f_i : \mathbb{R} \to \mathbb{R}$, $i = 1, 2, 3$, is to be a differentiable function. One solution of this system is that each $f_i$ is the zero function. However, we wish to find all solutions. Let $F : \mathbb{R} \to \mathbb{R}^3$ be given by
\[
F(x) = \langle f_1(x), f_2(x), f_3(x) \rangle.
\]
Since each $f_i$ is differentiable, $F$ is differentiable, and its derivative is given by
\[
F'(x) = \langle f_1'(x), f_2'(x), f_3'(x) \rangle.
\]
$F'$ and $F$ are related by the matrix equation
\[
F'(x) = T \cdot F(x),
\]
382 CHAPTER 7. GETTING TO SECOND BASES
where
T =
_
_
3 1 1
2 4 2
1 1 1
_
_
.
As you can see, this is nothing more than the original system given above.
We can show that the matrix given in the equation is diagonalizable. Its
diagonal form with respect to the basis
B = 1, 0, 1) , 0, 1, 1) , 1, 2, 1)
is
[T]
B
=
_
_
2 0 0
0 2 0
0 0 4
_
_
.
We will use this diagonal form to help us find the solution set of this system of equations. The transition matrix from the coordinate basis to $B$ is given by the inverse of
\[
M =
\begin{bmatrix}
1 & 0 & 1 \\
0 & 1 & 2 \\
-1 & -1 & -1
\end{bmatrix}.
\]
According to Theorem 7.1.2,
\[
T = M \, [T]_B \, M^{-1}.
\]
It then follows that
\[
F'(x) = T \cdot F(x) = M \, [T]_B \, M^{-1} F(x),
\]
which is equivalent to
\[
M^{-1} F'(x) = [T]_B \, M^{-1} F(x).
\]
If we let $G : \mathbb{R} \to \mathbb{R}^3$ be given by
\[
G(x) =
\begin{bmatrix} g_1(x) \\ g_2(x) \\ g_3(x) \end{bmatrix}
= M^{-1} F(x),
\]
then $G$ is differentiable and
\[
G'(x) =
\begin{bmatrix} g_1'(x) \\ g_2'(x) \\ g_3'(x) \end{bmatrix}
= M^{-1} F'(x).
\]
7.3 Diagonalization and Applications 383
Since
F
t
(x) = M [T]
B
M
1
F(x),
G
t
(x) = M
1
) F
t
(x)
= [T]
B
M
1
F(x)
= [T]
B
G(x)
=
_
_
2g
1
(x)
2g
2
(x)
4g
3
(x)
_
_
.
The three equations
\begin{align*}
g_1'(x) &= 2g_1(x) \\
g_2'(x) &= 2g_2(x) \\
g_3'(x) &= 4g_3(x)
\end{align*}
are independent of each other and can be solved individually. Their solutions are given by
\begin{align*}
g_1(x) &= c_1 e^{2x} \\
g_2(x) &= c_2 e^{2x} \\
g_3(x) &= c_3 e^{4x}.
\end{align*}
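Each of these is the familiar exponential growth equation $g' = ag$, whose general solution is $g(x) = ce^{ax}$. If you wish to confirm this with a computer algebra system, a short check in Python with SymPy (outside the ISETL environment) suffices; the symbol a stands for a generic constant rate.

import sympy as sp

x, a = sp.symbols('x a')
g = sp.Function('g')

# General solution of g'(x) = a*g(x): g(x) = C1*exp(a*x).
print(sp.dsolve(sp.Eq(g(x).diff(x), a*g(x)), g(x)))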
Since $G(x) = M^{-1} F(x)$, it follows that $F(x) = M \cdot G(x)$, which gives us
\begin{align*}
F(x) =
\begin{bmatrix} f_1(x) \\ f_2(x) \\ f_3(x) \end{bmatrix}
&=
\begin{bmatrix}
1 & 0 & 1 \\
0 & 1 & 2 \\
-1 & -1 & -1
\end{bmatrix}
\cdot
\begin{bmatrix} c_1 e^{2x} \\ c_2 e^{2x} \\ c_3 e^{4x} \end{bmatrix}
=
\begin{bmatrix}
c_1 e^{2x} + c_3 e^{4x} \\
c_2 e^{2x} + 2c_3 e^{4x} \\
-c_1 e^{2x} - c_2 e^{2x} - c_3 e^{4x}
\end{bmatrix} \\
&= e^{2x}
\left( c_1 \begin{bmatrix} 1 \\ 0 \\ -1 \end{bmatrix}
+ c_2 \begin{bmatrix} 0 \\ 1 \\ -1 \end{bmatrix} \right)
+ e^{4x}
\left( c_3 \begin{bmatrix} 1 \\ 2 \\ -1 \end{bmatrix} \right) \\
&= e^{2x} z_1 + e^{4x} z_2,
\end{align*}
where $z_1$ and $z_2$ represent arbitrary elements of the eigenspaces corresponding to the eigenvalues 2 and 4, respectively.
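It is worth verifying that this family of functions really does satisfy the original system $F'(x) = T \cdot F(x)$ for every choice of the constants $c_1$, $c_2$, $c_3$. The sketch below performs the check symbolically in Python with SymPy (again, outside the text's ISETL setting).

import sympy as sp

x, c1, c2, c3 = sp.symbols('x c1 c2 c3')
T = sp.Matrix([[ 3,  1, 1],
               [ 2,  4, 2],
               [-1, -1, 1]])

# The general solution found above.
F = sp.Matrix([ c1*sp.exp(2*x) + c3*sp.exp(4*x),
                c2*sp.exp(2*x) + 2*c3*sp.exp(4*x),
               -c1*sp.exp(2*x) - c2*sp.exp(2*x) - c3*sp.exp(4*x)])

# F'(x) - T*F(x) should be identically zero.
print((F.diff(x) - T*F).expand())        # Matrix([[0], [0], [0]])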
Markov Chains
If $A$ is a square matrix, the notation $A^k$ refers to the $k$th power of $A$, which, as you would expect, is the product of $A$ with itself $k$ times:
\[
A^k = \underbrace{A \cdot A \cdots A}_{k \text{ times}}.
\]
In Activity 5, we showed how to compute the power of a matrix using diagonalization. In particular, if $A$ is similar to a diagonal matrix $D$, we can compute any power $k$ of $A$ by computing the product
\[
A^k = C D^k C^{-1},
\]
where $C$ is the matrix whose columns are the components of each vector in the eigenbasis, and $D^k$ is found by raising each diagonal entry of $D$ to the $k$th power.
Theorem 7.3.7. Let $A$ be an $n \times n$ diagonalizable matrix with entries in $\mathbb{R}$. If $C$ is the transition matrix, and if $D$ is the diagonal form with respect to the eigenbasis, then, for any $k$,
\[
A^k = C D^k C^{-1},
\]
where $C$ is the matrix whose columns are the components of the vectors of the eigenbasis.

Proof. See Exercise 19.
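Theorem 7.3.7 is easy to test numerically. The sketch below, in Python with NumPy, reuses the matrix diagonalized earlier in this section; the exponent k = 5 is an arbitrary choice of ours, and C and D are the eigenbasis and diagonal form found there.

import numpy as np

A = np.array([[15.0, 7.0, -7.0],
              [-1.0, 1.0,  1.0],
              [13.0, 7.0, -5.0]])
C = np.array([[ 1.0, 0.0, 1.0],          # columns: the eigenbasis found earlier
              [-1.0, 1.0, 0.0],
              [ 1.0, 1.0, 1.0]])
D = np.diag([1.0, 2.0, 8.0])

k = 5
direct       = np.linalg.matrix_power(A, k)                     # A multiplied by itself k times
diagonalized = C @ np.diag(np.diag(D)**k) @ np.linalg.inv(C)    # C D^k C^{-1}
print(np.allclose(direct, diagonalized))                        # True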
We will apply this theorem in the following example involving Markov chains. Suppose we have two adjacent cities A and B whose city managers both wish to predict long term trends in the movement of population between the two cities. Currently, 70% of the people in the two cities live in City A, while 30% live in City B. In a typical year, 20% of the people in City A move to City B and 80% of the people remain in City A, while 10% of the people in City B move to City A, with 90% remaining in City B. If a .5% increase per year is expected for the two cities combined, what will be the population in each city in 30 years, if the current combined population is 150,000 people? We can first set up a table of migration data:

              From City A    From City B
To City A         .8             .1
To City B         .2             .9
The initial population distribution is given by
\[
\begin{bmatrix} .7 \\ .3 \end{bmatrix}.
\]
The proportion in City A after the first year will consist of 80% of the original 70% plus 10% of the 30% from City B, that is,
\[
\text{Proportion in City A after 1 year} = .8 \cdot .7 + .1 \cdot .3.
\]
Similarly, the proportion in City B after the first year will consist of 90% of the original 30% plus 20% of the 70% from City A, that is,
\[
\text{Proportion in City B after 1 year} = .2 \cdot .7 + .9 \cdot .3.
\]
In terms of matrices, we have
\[
\begin{bmatrix} .8 & .1 \\ .2 & .9 \end{bmatrix}
\cdot
\begin{bmatrix} .7 \\ .3 \end{bmatrix}
=
\begin{bmatrix} .8 \cdot .7 + .1 \cdot .3 \\ .2 \cdot .7 + .9 \cdot .3 \end{bmatrix}
=
\begin{bmatrix} .59 \\ .41 \end{bmatrix}.
\]
The proportions in City A and City B after year two will be given by
\[
\begin{bmatrix} .8 & .1 \\ .2 & .9 \end{bmatrix}^{2}
\cdot
\begin{bmatrix} .7 \\ .3 \end{bmatrix}.
\]
Can you explain why? After 30 years, the proportions will be given by
\[
\begin{bmatrix} .8 & .1 \\ .2 & .9 \end{bmatrix}^{30}
\cdot
\begin{bmatrix} .7 \\ .3 \end{bmatrix}.
\]
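Before applying the theorem, it is reassuring to confirm the first step or two directly. The sketch below, in Python with NumPy rather than ISETL, uses only the data already given above.

import numpy as np

M = np.array([[0.8, 0.1],
              [0.2, 0.9]])               # the migration matrix
p0 = np.array([0.7, 0.3])                # initial proportions in City A and City B

p1 = M @ p0
print(p1)                                # [0.59 0.41], as computed above

p2 = M @ p1                              # one more year; same as matrix_power(M, 2) @ p0
print(np.allclose(p2, np.linalg.matrix_power(M, 2) @ p0))       # True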
Since the matrix
\[
\begin{bmatrix} .8 & .1 \\ .2 & .9 \end{bmatrix}
\]
is diagonalizable, we can compute this product using Theorem 7.3.7. The eigenvalues are .7 and 1. $\{\langle 1, -1 \rangle\}$ is a basis for the eigenspace of .7, and $\{\langle .5, 1 \rangle\}$ is a basis for the eigenspace of 1. According to Theorem 7.3.7,
\[
\begin{bmatrix} .8 & .1 \\ .2 & .9 \end{bmatrix}^{30}
=
\begin{bmatrix} 1 & .5 \\ -1 & 1 \end{bmatrix}
\cdot
\begin{bmatrix} .7 & 0 \\ 0 & 1 \end{bmatrix}^{30}
\cdot
\begin{bmatrix} \tfrac{2}{3} & -\tfrac{1}{3} \\ \tfrac{2}{3} & \tfrac{2}{3} \end{bmatrix}.
\]
Using this equality, what is the proportion matrix after year 30? Using the
growth assumption given at the beginning of the discussion of this problem,
how many people will live in both cities combined after 30 years? How many
will live in City A? How many will live in City B?
Exercises
1. Define $T : \mathbb{R}^2 \to \mathbb{R}^2$ by
\[
T(\langle x_1, x_2 \rangle) = \langle 5x_1 - 3x_2,\; 3x_1 - x_2 \rangle.
\]
Determine whether $T$ is diagonalizable. If it is, find an eigenbasis, and find the transition matrix between the coordinate basis and the eigenbasis. If not, explain why.
2. Define $F : \mathbb{R}^2 \to \mathbb{R}^2$ by
\[
F(\langle x_1, x_2 \rangle) =
\begin{bmatrix} 2 & 4 \\ 1 & 4 \end{bmatrix}
\cdot
\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}.
\]
Determine whether $F$ is diagonalizable. If it is, find an eigenbasis, and find the transition matrix between the coordinate basis and the eigenbasis. If not, explain why.
3. Define $H : \mathbb{R}^3 \to \mathbb{R}^3$ by
\[
H(\langle x_1, x_2, x_3 \rangle) =
\begin{bmatrix} 1 & 0 & 0 \\ 2 & 1 & 2 \\ 2 & 0 & 3 \end{bmatrix}
\cdot
\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}.
\]
Determine whether $H$ is diagonalizable. If it is, find an eigenbasis, and find the transition matrix between the coordinate basis and the eigenbasis. If not, explain why.
4. Define $T : \mathbb{R}^3 \to \mathbb{R}^3$ by
\[
T(\langle x_1, x_2, x_3 \rangle) =
\begin{bmatrix} 1 & 2 & 3 \\ 0 & 1 & 2 \\ 1 & 1 & 0 \end{bmatrix}
\cdot
\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}.
\]
Determine whether $T$ is diagonalizable. If it is, find an eigenbasis, and find the transition matrix between the coordinate basis and the eigenbasis. If not, explain why.
5. Define $G : \mathbb{R}^4 \to \mathbb{R}^4$ by
\[
G(\langle x_1, x_2, x_3, x_4 \rangle) =
\langle 4x_1 + 2x_2 - 2x_3 + 2x_4,\; x_1 + 3x_2 + x_3 - x_4,\; 2x_3,\; x_1 + x_2 - 3x_3 + 5x_4 \rangle.
\]
Determine whether $G$ is diagonalizable. If it is, find an eigenbasis, and find the transition matrix between the coordinate basis and the eigenbasis. If not, explain why.
6. Provide a proof of the first part of Theorem 7.3.1.
7. Justify each step of the equality given in the proof of Theorem 7.3.2.
8. Theorems 7.3.3, 7.3.4, and 7.3.6 can be combined into a single if and
only if theorem. What is the statement of this theorem?
9. Let $P_3(\mathbb{R})$ be the vector space of polynomials of degree 3 or less. Define $T : P_3(\mathbb{R}) \to P_3(\mathbb{R})$ by
\[
T(p) = p'' + p',
\]
where $p \in P_3(\mathbb{R})$, $p''$ is the second derivative of $p$, and $p'$ is the first derivative of $p$. Determine whether $T$ is diagonalizable. If it is, find an eigenbasis, and find the transition matrix between the basis $\{1, x, x^2, x^3\}$ and the eigenbasis. If not, explain why.
10. Prove that if $A$ is a diagonal matrix, then its eigenvalues are the diagonal elements.
11. Prove that if A is an upper triangular matrix, then its eigenvalues are
the diagonal elements.
12. Prove that $\lambda = 0$ is an eigenvalue of a matrix $A$ if and only if $A$ is singular.
13. Find the general solution of each system of differential equations.
(a)
\begin{align*}
f_1' &= f_1 + f_2 \\
f_2' &= 3f_1 - f_2
\end{align*}
(b)
\begin{align*}
f_1' &= 8f_1 + 10f_2 \\
f_2' &= -5f_1 - 7f_2
\end{align*}
(c)
\begin{align*}
f_1' &= f_1 + f_3 \\
f_2' &= f_2 + f_3 \\
f_3' &= 2f_3
\end{align*}
14. Let $T : \mathbb{R}^n \to \mathbb{R}^n$ be an invertible linear transformation, that is, a transformation that is both one-to-one and onto. Show that $T$ is diagonalizable if and only if its inverse $T^{-1} : \mathbb{R}^n \to \mathbb{R}^n$ is diagonalizable.
15. Let $P_2(\mathbb{R})$ be the vector space of polynomials of degree 2 or less. Define $F : P_2(\mathbb{R}) \to P_2(\mathbb{R})$ by
\[
F(p) = p(0) + p(1) \cdot (x + x^2),
\]
where $p \in P_2(\mathbb{R})$, $p(0)$ is the value of the polynomial evaluated at $x = 0$, and $p(1)$ is the value of the polynomial evaluated at $x = 1$. Determine whether $F$ is diagonalizable. If it is, find an eigenbasis, and find the transition matrix between the basis $\{1, x, x^2\}$ and the eigenbasis. If not, explain why.
16. Let $A$ be a square matrix. A power of $A$, say $A^n$, is nothing more than a matrix product of $n$ copies of $A$. Use this definition to answer the following questions regarding various matrix polynomials in $A$ and the diagonalizability of $A$.
(a) Show that if a matrix $A$ is diagonalizable and all of its eigenvalues are either 1 or $-1$, then $A^2 = I$.
(b) Show that if a matrix $A$ is diagonalizable and all of its eigenvalues are either 1 or 0, then $A^2 = A$.
(c) Show that if a matrix $A$ is diagonalizable and all of its eigenvalues are either 3 or $-5$, then $A^2 + 2A - 15I = 0$.
(d) Can you think of a general statement for which the three previous exercises are special cases?
17. Prove that if $A$ is diagonalizable with distinct eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_n$, then
\[
\lvert A \rvert = \lambda_1 \lambda_2 \cdots \lambda_n.
\]
18. If A and B are similar matrices, prove that if A is diagonalizable, then
B is diagonalizable.
19. Provide a proof for Theorem 7.3.7.
20. Answer the questions posed at the end of the discussion of the Markov
Chain example.
21. Construct a model of population flows between cities, suburbs, and nonmetropolitan areas of the U.S. Their respective populations in 1985 were 60 million, 125 million, and 55 million. The matrix giving probabilities of the moves is

                From City   From Suburb   From Nonmetro
To City            .96          .01           .015
To Suburb          .03          .98           .005
To Nonmetro        .01          .01           .98

Predict the population that will live in each category in 2010, if the total population is assumed to be 350 million.