
PMA1014 Linear Algebra

Academic Year 2009/2010


Thomas Huettemann and Ivan Todorov
Contents

Chapter I. Vector Spaces
1. Systems of linear equations
2. Vector spaces: definition and basic facts
3. Examples
4. Subspaces and linear combinations of vectors

Chapter II. Linear independence and bases
1. The notion of linear independence
2. Bases
3. Basis Selection and Basis Extension theorems
4. Subspaces and bases
5. The dimension

Chapter III. Linear mappings
1. Mappings between sets
2. Linear mappings
3. The kernel and the image
4. The dimension formula for linear maps
5. Isomorphisms
CHAPTER I
Vector Spaces
1. Systems of linear equations
When we use the term equation we mean a certain relation between various quantities, some of which are known and some of which are not known. The quantities are assumed to be expressible as real numbers, and the relations between them are usually expressed using the operations of addition, subtraction, multiplication and division. To solve an equation means to find values for the unknown quantities which satisfy the given relation.
For example, an equation is the relation

   2x^2 − 3x + 1 = 2x − 2.

Here we have one unknown quantity, which is designated by the letter x, and all other quantities appearing in the equation are known; these are 2, −3 and 1. Using the operations between real numbers and the quantities (both known and unknown) we form the relation which expresses the above equation.
An equation may involve more than one unknown variable. For example, such an equation is

   x + 2y^2 − 4 = 5z − 2x^2,

which involves the unknown variables x, y, z. A solution for this equation is then a set of three numbers, one corresponding to x, one to y and one to z, such that, when we substitute these three numbers in the corresponding places in the equation, we obtain a truthful identity. For example, the triple x = 1, y = 2 and z = 7/5 is a solution. There are however many other solutions to this equation. To solve the equation means to find all triples of numbers which satisfy the given relation.
An equation is called linear if multiplication of unknown variables does not appear in the corresponding relation; thus, terms like x^2, x^3, xy, etc. are missing from it. Speaking formally, a linear equation with respect to x_1, x_2, ..., x_n is an equation of the form

   a_1 x_1 + a_2 x_2 + ... + a_n x_n = b,

where x_1, x_2, ..., x_n are the unknown variables while a_1, a_2, ..., a_n, b are certain quantities assumed to be known. The numbers a_1, ..., a_n are called coefficients while b is called the free term of the equation.
The coefficients and the free term will for our purposes be taken to be real numbers.
For example, the equation

   2x − 3y = 5

is linear. As is the case with our previous example, this equation does not possess just one solution: the pairs x = 1, y = −1 and x = 3, y = 1/3 both solve it. As a matter of fact, it has infinitely many solutions: for every value of x we can find a value of y which together with x forms a solution. To do this, it suffices to express y in terms of x using the given equation:

   y = (2x − 5)/3.
A system of linear equations is simply a set of several linear equations which share the same variables, say x_1, x_2, ..., x_n. To solve a system of linear equations means to find all n-tuples of numbers which, when substituted into each one of the equations, give a truthful identity. In other words, the set of solutions of a system is the intersection of the sets of solutions of each one of the equations.
To solve a system of linear equations, we usually operate with the equations using multiplication and addition. Suppose, for example, that we want to solve the following system:

   2x − y = 2
   x + 3y = 4.

If we multiply both sides of the first equation by 3 then the new equation, namely 6x − 3y = 6, is equivalent to the initial equation, namely 2x − y = 2, in the sense that the two equations have the same sets of solutions. So, the system

   6x − 3y = 6
   x + 3y = 4

will be equivalent to the initial system (that is, will have the same set of solutions). Adding the two equations now gives the equation 7x = 10, in which only the variable x is present, and hence we can solve it, obtaining x = 10/7.

Similarly, multiplying the second equation by 2 and the first by −1 will lead to the equivalent system

   −2x + y = −2
   2x + 6y = 8.

Adding the two equations leads to the equation 7y = 6, from where we deduce y = 6/7.
Now suppose that instead of the free terms 2 and 4 in the above system we have just letters, say a and b:

   2x − y = a
   x + 3y = b.

We may look at a and b as new variables, which are expressed in terms of the old ones, x and y, via the above relations. Given these relations, it is possible, following the same steps as above, to express x and y in terms of a and b. Indeed, repeating these steps with the letters a and b instead of the numbers 2 and 4 we obtain

   x = (3/7)a + (1/7)b
   y = −(1/7)a + (2/7)b.

What we did is better than solving the system with concrete given free terms, because we can now solve the system which has the same coefficients but arbitrary free terms, and not just the free terms 2 and 4.
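For readers who want to experiment, the calculation above is easy to check on a computer. The following sketch uses Python with the numpy library (our own illustration, not part of the notes): inverting the coefficient matrix once solves the system for every choice of the free terms a and b, exactly as the formulas above do.

```python
import numpy as np

# Coefficient matrix of the system  2x - y = a,  x + 3y = b
A = np.array([[2.0, -1.0],
              [1.0,  3.0]])

# The rows of the inverse give the coefficients of a and b in x and y:
#   x = (3/7)a + (1/7)b,   y = -(1/7)a + (2/7)b
A_inv = np.linalg.inv(A)
print(A_inv * 7)                      # [[ 3.  1.]
                                      #  [-1.  2.]]

# Check against the concrete free terms a = 2, b = 4
x, y = A_inv @ np.array([2.0, 4.0])
print(x, y)                           # 10/7 and 6/7, as found above
```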
In the example above we worked with a system of two equations. If we have a system of three equations with three unknowns, for example,

   α_1 x + β_1 y + γ_1 z = a
   α_2 x + β_2 y + γ_2 z = b
   α_3 x + β_3 y + γ_3 z = c

where x, y, z are unknowns while the α_i, β_i, γ_i, a, b, c are given, then solving it will involve similar steps of multiplication of equations by a non-zero number and addition of two equations. In this way, a result will be reached where x, y and z will be expressed in terms of a, b and c and the coefficients. The new and important feature here is that we view the equations forming the system as objects, each one of which can be multiplied by a number and each two of which can be added. In this way, the set of all linear equations in some fixed number of unknown variables is equipped with two operations: addition and multiplication by a scalar. Formally, this is a completely new situation compared to the ones encountered before: until now the operation of addition was considered only in sets of numbers, say the rational numbers or the real numbers. Now we have an operation, which we call again addition, but which is applied not to numbers but to linear equations.
Now let us return to the system that we had, but add one more equation to it. Consider the system

   (∗)   2x − y = a
         x + 3y = b
         4x + y = c.

We now have three numbers a, b and c which are expressed in terms of only two variables, x and y. Assuming that the system does have a solution, this leads to some relation between a, b and c; that is, a, b and c cannot be independent from each other. Indeed, we already saw that the first two equations alone can be solved with respect to x and y; substituting the corresponding expressions into the last equation will produce such a relation between a, b and c, namely

   c = (11/7)a + (6/7)b.

This relation must be satisfied if the system of linear equations (∗) has a solution; if it is not satisfied, the system (∗) cannot have a solution.
It is natural to expect that this is only a special case of a more general statement: if n + 1 variables, say y_1, ..., y_{n+1}, are expressed in terms of n variables, say x_1, ..., x_n, through linear relations involving real coefficients, then the variables y_1, ..., y_{n+1} are related through a linear equation with real coefficients.
2. Vector spaces: definition and basic facts

In this section we give the formal definition of a vector space and develop its first properties. This definition is motivated by plenty of examples arising in mathematics and its applications.
Definition I.1. A vector space over R, or R-vector space, is a non-empty set V together with a distinguished element o ∈ V and two operations: addition which, given any two elements u, v ∈ V, produces an element u + v ∈ V, and multiplication by a scalar which, given an element v ∈ V and a real number λ ∈ R, produces an element λv ∈ V, such that the following properties are satisfied:
(A1) u + v = v + u for all u, v ∈ V;
(A2) (u + v) + w = u + (v + w) for all u, v, w ∈ V;
(A3) v + o = v for all v ∈ V;
(A4) for each v ∈ V there exists an element w ∈ V such that v + w = o;
(S1) λ(u + v) = λu + λv for all u, v ∈ V and all λ ∈ R;
(S2) (λ + μ)v = λv + μv for all v ∈ V and all λ, μ ∈ R;
(S3) (λμ)v = λ(μv) for all v ∈ V and all λ, μ ∈ R;
(S4) 1v = v for all v ∈ V.
The elements of V are called vectors and the distinguished element o ∈ V is called the zero vector. Often the term "linear space over R" is used instead of "vector space over R". When talking about an R-vector space, real numbers are often called scalars.
In the above definition we used the symbol o to denote a special vector, namely the zero vector. By property (A3) of the addition of vectors, the zero vector behaves like the number zero in the operation of addition of real numbers. It is very important, however, not to confuse it with the number 0, which also appears very often in considerations with vector spaces because of the other operation, multiplication of a vector by a scalar.
We note that the symbol + is used in the above definition for two different things. It denotes both the operation of addition of real numbers and that of addition of vectors. For example, in property (S2), on the left hand side + denotes the sum of the two real numbers λ and μ, while on the right hand side it denotes the sum of the two vectors λv and μv. Similarly, in (S3) we encounter both multiplication of real numbers, λμ being the product of λ and μ, as well as multiplication of a vector by a scalar: on the left hand side this is the multiplication of the vector v by the scalar λμ, while on the right hand side it is the multiplication of the vector v by the scalar μ, and of the vector μv by the scalar λ.
In Definition I.1 we spoke about a vector space over R. It would be natural for the reader to ask whether there can be vector spaces over something different from R. The answer is affirmative: vector spaces can be defined if the set R of real numbers is replaced by a set which acts like R, that is, a set of "pseudo-numbers" which can be added, subtracted, multiplied and divided like the real numbers and which have similar properties to the real numbers. A very important example of such a set is the subset Q of rational numbers. This leads to the notion of a vector space over Q; the definition is the same as the one of a vector space over R, with the only difference that the operation of multiplication by a scalar is now defined only for rational numbers, and consequently properties (S1)-(S4) are required to hold for all rational numbers λ and μ.
Now suppose V is an R-vector space as defined above, and let v ∈ V be a fixed vector. By property (A4) there exists w ∈ V with v + w = o. The vector w is uniquely determined by v; that is, if w, w_2 ∈ V are such that v + w = o = v + w_2, then w = w_2. This can be seen as follows:

   w = w + o                  by (A3)
     = w + (v + w_2)
     = w + (w_2 + v)          by (A1)
     = (w + w_2) + v          by (A2)
     = (w_2 + w) + v          by (A1)
     = w_2 + (w + v)          by (A2)
     = w_2 + (v + w)          by (A1)
     = w_2 + o
     = w_2                    by (A3)

This unique vector w is denoted by −v, so that (as expected) v + (−v) = (−v) + v = o. Of course we will simply write u − v instead of u + (−v).
Now we want to establish further basic properties of the operations in linear spaces.

Theorem I.2. Let V be an R-vector space with zero vector o.
(i) For all u, v, w ∈ V, if u + v = u + w then v = w.
(ii) For all u ∈ V we have 0u = o.
(iii) For all u ∈ V we have (−1)u = −u.
(iv) For all λ ∈ R we have λo = o.
(v) Let λ ∈ R and u ∈ V. We have λu = o if and only if λ = 0 or u = o.
(vi) Let u, v ∈ V and 0 ≠ λ ∈ R. If λu = λv then u = v.
Proof. (i) Suppose that u + v = u + w. Then −u + (u + v) = −u + (u + w). Using (A2), we have (−u + u) + v = (−u + u) + w, and by (A1), (u + (−u)) + v = (u + (−u)) + w. Now (A4) gives o + v = o + w, and (A3) implies that v = w.

(ii) Using (A3) and (S2) in this order we have that

   0u + o = 0u = (0 + 0)u = 0u + 0u.

Now using (i) we conclude that 0u = o.

(iii) Using (A4), (ii), (S2) and (S4) (in this order) we see that

   u + (−u) = o = 0u = (1 + (−1))u = 1u + (−1)u = u + (−1)u.

Now (i) implies that (−1)u = −u.

(iv) Using (A4), (A3), (S1), (A2) and (A4) we have that

   o = λo + (−(λo)) = λ(o + o) + (−(λo)) = (λo + λo) + (−(λo))
     = λo + (λo + (−(λo))) = λo + o = λo,

that is, λo = o.

(v) and (vi) are left as an exercise. □

We will use the properties established in Theorem I.2 extensively in the sequel. In fact they are so basic that we will not even bother to mention them explicitly when we use them!
3. Examples
In this section we gather a variety of examples of vector spaces.
Example I.3. Let n ∈ N. An ordered n-tuple of real numbers is a collection

   x = (x_1, x_2, ..., x_n)

where x_i ∈ R for each i = 1, ..., n. Two such collections, say x = (x_1, x_2, ..., x_n) and y = (y_1, y_2, ..., y_n), are equal precisely when

   x_1 = y_1, x_2 = y_2, ..., x_n = y_n.

Ordered 2-tuples are just ordered pairs of real numbers, and ordered 1-tuples are just collections consisting of one real number, which can be identified with this number.
The set of ordered n-tuples of real numbers is denoted by R^n. In symbols,

   R^n = { (x_1, x_2, ..., x_n) | x_i ∈ R for i = 1, 2, ..., n }.
We introduce operations of addition and multiplication by a real number in R^n as follows: if u = (x_1, x_2, ..., x_n) and v = (y_1, y_2, ..., y_n) are elements of R^n and λ ∈ R, we set

   u + v = (x_1 + y_1, x_2 + y_2, ..., x_n + y_n)   and   λu = (λx_1, λx_2, ..., λx_n).

We also define o to be the ordered n-tuple with all entries equal to 0:

   o = (0, 0, ..., 0).

Then the set R^n equipped with these operations is a vector space over R. We leave it to the student to check that this is indeed the case, in other words, that the properties (A1)-(A4), (S1)-(S4) indeed hold.
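As an aside, the componentwise operations on R^n are easy to experiment with on a computer. The sketch below uses Python with numpy (our own illustration, not part of the notes) and spot-checks one instance of axiom (S1).

```python
import numpy as np

# Two vectors in R^4, represented as numpy arrays
u = np.array([1.0, 2.0, -1.0, 0.5])
v = np.array([3.0, 0.0,  2.0, 1.5])

print(u + v)        # componentwise addition: [4.  2.  1.  2.]
print(2.5 * v)      # multiplication by the scalar 2.5: [7.5  0.  5.  3.75]

# A spot-check of axiom (S1): lambda*(u + v) equals lambda*u + lambda*v
lam = -3.0
print(np.allclose(lam * (u + v), lam * u + lam * v))   # True
```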
Example I.4. Let S be the set of all sequences with real entries. For sequences a = (a_n)_{n=1}^∞ and b = (b_n)_{n=1}^∞ we define a new sequence a + b whose nth term is, by definition, equal to a_n + b_n. This sequence is called the sum of a and b. Similarly, if λ ∈ R then one may define the sequence λa whose nth term is the real number λa_n. Let o be the sequence all of whose terms are equal to zero. It can be checked that the set S equipped with the operations described above is a vector space over R.

Informally speaking, a sequence can be thought of as an ordered tuple, but having infinitely many entries. In this sense, the present example is an extension of Example I.3.
Example I.5. Let I be an interval and let F(I) be the set of all functions f : I → R defined on I. Given two functions f, g ∈ F(I) and a scalar λ ∈ R we define f + g by setting (f + g)(x) = f(x) + g(x), and define λf by setting (λf)(x) = λf(x) (multiplication of real numbers on the right), for each x ∈ I. The function o is defined by o(x) = 0. Equipped with this structure, F(I) is a vector space over R.
Example I.6. For n ∈ N, let P_n be the set of all polynomials with real coefficients of degree at most n:

   P_n = { a_n x^n + a_{n−1} x^{n−1} + ... + a_1 x + a_0 | a_n, a_{n−1}, ..., a_0 ∈ R }.

Equipped with the usual operations of addition of polynomials and multiplication of a polynomial by a real number, the set P_n is a vector space over R. The zero vector is the polynomial all of whose coefficients are equal to zero.

Let P be the set of all polynomials, that is, P = ⋃_{n=1}^∞ P_n. Then, equipped with the same operations, P is a vector space. Note that P_n ⊆ P for each n ∈ N.
Example I.7. Let m, n ∈ N. An m-by-n matrix is a table of real numbers with m rows and n columns. The numbers appearing in the table are called the entries of the matrix. To denote the entries of a matrix, we use double subscripts. If 1 ≤ i ≤ m and 1 ≤ j ≤ n, let a_{i,j} be the real number which is located in the ith row and the jth column of a given matrix A. This number is also called the (i, j)-entry of A. We write

   A = (a_{i,j}) =
       ( a_{1,1}  a_{1,2}  ...  a_{1,n} )
       ( a_{2,1}  a_{2,2}  ...  a_{2,n} )
       (   ...      ...    ...    ...   )
       ( a_{m,1}  a_{m,2}  ...  a_{m,n} )

We denote by M_{m,n} the set of all m-by-n matrices. If A = (a_{i,j}) and B = (b_{i,j}) are two elements of M_{m,n}, then the sum of A and B is the matrix A + B whose (i, j)-entry is equal to a_{i,j} + b_{i,j}, that is, to the sum of the (i, j)-entries of A and B. Similarly, if λ ∈ R then we define λA to be the matrix whose (i, j)-entry is equal to λa_{i,j}. In this way, we have equipped M_{m,n} with operations of addition and multiplication by a scalar. Let o ∈ M_{m,n} be the matrix all of whose entries are equal to zero. With respect to this structure, M_{m,n} is a vector space over R.
4. Subspaces and linear combinations of vectors
Definition I.8. Let V be a vector space over R. A subset W ⊆ V is called a subspace of V if
(Sub 0) o ∈ W (so in particular W ≠ ∅);
(Sub 1) u + v ∈ W whenever u, v ∈ W;
(Sub 2) λv ∈ W whenever v ∈ W and λ ∈ R.
We remark that every vector space V possesses two subspaces: the subspace {o} consisting of the zero vector only (called the trivial subspace), and V itself. But here are some non-trivial examples:
Example I.9. Let V = R^2 and W = { (x, 0) | x ∈ R }. Then W is a subspace of V, as a simple verification shows. Note that if we identify V with the plane, W corresponds to the x-axis.

More generally, let V = R^n, k ≤ n, and

   W = { (x_1, ..., x_k, 0, ..., 0) | x_1, x_2, ..., x_k ∈ R }.

Then W is a subspace of V. We note that W can be identified with R^k, after deleting the zeros in its definition.
Example I.10. Let S be the vector space defined in Example I.4. Let S_0 be the set of all convergent sequences. Results in Analysis assert that
• the constant sequence (0)_{n≥1} is convergent, so S_0 satisfies (Sub 0);
• the sum of two convergent sequences is convergent, so S_0 satisfies (Sub 1);
• the product of a constant with a convergent sequence is convergent, so S_0 satisfies (Sub 2).
This means that S_0 is a subspace of S.
Example I.11. Let I be an interval, and let F(I) be the linear space defined in Example I.5. Let C(I) be the subset of F(I) consisting of all continuous functions on I. Results in Analysis assert that
• the constant function o with value 0 everywhere is continuous, so C(I) satisfies (Sub 0);
• the sum of two continuous functions is continuous, so C(I) satisfies (Sub 1);
• the product of a continuous function with a real number is continuous, so C(I) satisfies (Sub 2).
This means that C(I) is a subspace of F(I).
Example I.12. Let a, b ∈ R and

   W = { (x, y) ∈ R^2 | ax + by = 0 }.

Then W is a subspace of R^2.

More generally, for a_1, a_2, ..., a_n ∈ R the subset

   U = { (x_1, x_2, ..., x_n) ∈ R^n | a_1 x_1 + a_2 x_2 + ... + a_n x_n = 0 }

is a subspace of R^n.
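Why U is closed under taking linear combinations can also be illustrated numerically: the defining expression a_1 x_1 + ... + a_n x_n is itself linear in x. The Python/numpy sketch below is our own illustration (the coefficients and the helper functions are hypothetical, chosen only for the demonstration).

```python
import numpy as np

rng = np.random.default_rng(1)
a = rng.normal(size=5)                 # the coefficients a_1, ..., a_n (here n = 5)

def in_U(x, tol=1e-10):
    # membership test for U: a_1*x_1 + ... + a_n*x_n = 0 (up to rounding)
    return abs(a @ x) < tol

def random_member():
    # choose the first n-1 entries freely, then solve the defining equation
    # for the last entry (assuming a[-1] != 0)
    x = rng.normal(size=5)
    x[-1] = -(a[:-1] @ x[:-1]) / a[-1]
    return x

u, v = random_member(), random_member()
lam, mu = 2.5, -1.3
print(in_U(u), in_U(v), in_U(lam * u + mu * v))   # True True True
```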
Definition I.13. Let V be a vector space and v_1, v_2, ..., v_n ∈ V. A linear combination of the vectors v_1, ..., v_n is a vector of the form

   λ_1 v_1 + λ_2 v_2 + ... + λ_n v_n,

where λ_1, λ_2, ..., λ_n are scalars.

More generally, if S ⊆ V is a subset, a linear combination of vectors in S is any vector of the form

   λ_1 v_1 + λ_2 v_2 + ... + λ_n v_n,

where n is a natural number, λ_1, λ_2, ..., λ_n are scalars, and all the vectors v_i are in S (note that the number n is not fixed here).
For example, the vector (2, 3) ∈ R^2 is a linear combination of the vectors (1, 0) and (2, 1). Indeed, we have

   (2, 3) = −4 (1, 0) + 3 (2, 1).

In the vector space P_n of all polynomials of degree at most n (cf. Example I.6), every element of P_n is a linear combination of the vectors 1, x, ..., x^n. In fact, this is the very definition of a polynomial of degree at most n. More interesting, perhaps, is the fact that every polynomial in P_n is a linear combination of the vectors

   1, x − 1, x^2 − x, x^3 − x^2, ..., x^n − x^{n−1}.
This follows from the following equality:

   a_n x^n + a_{n−1} x^{n−1} + ... + a_1 x + a_0
      = a_n (x^n − x^{n−1})
      + (a_n + a_{n−1}) (x^{n−1} − x^{n−2})
      + (a_n + a_{n−1} + a_{n−2}) (x^{n−2} − x^{n−3})
      + ...
      + (a_n + a_{n−1} + ... + a_1) (x − 1)
      + (a_n + a_{n−1} + ... + a_0).
We now prove a criterion for being a subspace which will be used throughout without specific reference.

Proposition I.14. Let V be a vector space and let W ⊆ V be a non-empty subset. The following conditions are equivalent:
(i) W is a subspace;
(ii) λu + μv ∈ W for all u, v ∈ W and all λ, μ ∈ R;
(iii) λ_1 v_1 + ... + λ_n v_n ∈ W for all v_1, ..., v_n ∈ W and all λ_1, ..., λ_n ∈ R.

Proof. We will only prove the equivalence of (i) and (ii). The equivalence with (iii) is left as an exercise.

Suppose condition (ii) holds. Let u, v ∈ W and λ = μ = 1. By (ii), u + v = 1u + 1v ∈ W, so property (Sub 1) of Definition I.8 is fulfilled. By (ii) again, λv = λv + 0v ∈ W for all λ ∈ R and v ∈ W, hence W satisfies condition (Sub 2). Finally, choose any u ∈ W (we can do this since W ≠ ∅ by hypothesis). Then using (ii) a third time we see that o = 0u + 0u ∈ W, so W satisfies (Sub 0) as well. This proves that W is a subspace.

Now suppose that (i) holds, that is, suppose that W is a subspace of V. Fix u, v ∈ W and λ, μ ∈ R. Since W is a subspace, λu ∈ W and μv ∈ W by (Sub 2). But then λu + μv ∈ W from (Sub 1). This shows that condition (ii) is satisfied by W. □
Lemma I.15. Let V be a vector space over R. The intersection of any non-empty family of subspaces of V is a subspace.

Proof. Let S be a certain collection of subspaces of V (maybe infinite). Thus, the elements of S are themselves subspaces of V. Let W be the intersection of all subspaces which belong to S. Then o ∈ W, as every subspace of V must contain the zero vector, by (Sub 0). Suppose that u, v are vectors which belong to every subspace from S. Then, by (Sub 1) of Definition I.8, u + v also belongs to every subspace in the family S, and so u + v belongs to the intersection of all those subspaces, that is, to W. In a similar manner we see that W satisfies property (Sub 2). So W is indeed a subspace of V as claimed. □
Definition I.16. Let V be a vector space over R, and let S ⊆ V be a non-empty subset. We define the subspace generated by S, denoted [S], to be the intersection of all the subspaces of V which contain S (this is a subspace by the previous Lemma). We also set [∅] = {o}.

Proposition I.17. Let V be a vector space over R and S ⊆ V be a subset. Then [S] is the smallest subspace of V containing S; more precisely, S ⊆ [S], and if W is a subspace of V with S ⊆ W, then [S] ⊆ W.

Proof. Consider the collection of all subspaces of V which contain the subset S; note that V belongs to this collection, so the collection is non-empty. By definition, [S] is the intersection of all its elements. Since every element of the collection contains S, it follows that [S] contains S as well.

It remains to prove that [S] is the smallest subspace containing S. Suppose that W is a subspace of V with S ⊆ W. Then W belongs to the collection above. It follows from the definition of [S] that then [S] ⊆ W. The proof is complete. □
Definition I.18. Let V be a vector space over R and let S ⊆ V be a non-empty subset of V. The span of S, denoted span(S), is the set of linear combinations of vectors in S, cf. Definition I.13. We set span(∅) = {o}.

Lemma I.19. Let V be a vector space and S ⊆ V be a subset. Then span(S) is a subspace of V.

Proof. By virtue of its definition, span(S) ⊆ V and o ∈ span(S) (the linear combination with all coefficients 0). So span(S) satisfies (Sub 0).

Suppose now that

   u = λ_1 u_1 + ... + λ_n u_n   and   v = μ_1 v_1 + ... + μ_m v_m

are elements of span(S); here u_1, ..., u_n, v_1, ..., v_m are elements of S and λ_1, ..., λ_n, μ_1, ..., μ_m are scalars. But then, for α, β ∈ R, we have that

   αu + βv = α(λ_1 u_1 + ... + λ_n u_n) + β(μ_1 v_1 + ... + μ_m v_m)
           = (αλ_1)u_1 + ... + (αλ_n)u_n + (βμ_1)v_1 + ... + (βμ_m)v_m,

which is again in span(S). Thus, span(S) is a subspace by Proposition I.14. □
Theorem I.20. Let V be a vector space and S ⊆ V be a subset. Then [S] = span(S).

Proof. Suppose we have an element λ_1 v_1 + λ_2 v_2 + ... + λ_n v_n ∈ span(S), where v_1, ..., v_n ∈ S and λ_1, ..., λ_n ∈ R. Since S ⊆ [S], part (iii) of Proposition I.14 shows that λ_1 v_1 + ... + λ_n v_n ∈ [S]. We have shown that span(S) ⊆ [S].

To establish the reverse inclusion, first note that S ⊆ span(S); in fact, for s ∈ S we have s = 1s ∈ span(S), since 1s is a (very short) linear combination of vectors in S. By Lemma I.19 we know that span(S) is a subspace of V. But by Proposition I.17, [S] is the smallest subspace containing S, so that [S] ⊆ span(S). □
A special case of particular importance is obtained when the set S is finite. In this case, Theorem I.20 reduces to the following:

Corollary I.21. Let S = {v_1, v_2, ..., v_n}. Then

   [S] = span(S) = { λ_1 v_1 + λ_2 v_2 + ... + λ_n v_n | λ_1, λ_2, ..., λ_n ∈ R }.

Corollary I.22. If T ⊆ S, then [T] ⊆ [S].

Proof. The inclusion span(T) ⊆ span(S) is clear: every linear combination of vectors in T is also a linear combination of vectors in S, since T ⊆ S. By Theorem I.20 this implies the claim. □
Example I.23. Let S = {(1, 1), (2, 1)} ⊆ R^2. Then [S] = R^2. Well, [S] is a subset of R^2 by its very definition, so we have to establish that, as a matter of fact, every vector in R^2 is contained in [S].

So fix a vector (a, b) ∈ R^2. We want to find scalars λ and μ such that

   λ (1, 1) + μ (2, 1) = (a, b).

This amounts to solving the system of linear equations

   λ + 2μ = a
   λ + μ = b.

A straightforward calculation gives λ = 2b − a and μ = a − b. Thus

   (a, b) = (2b − a) (1, 1) + (a − b) (2, 1)

is indeed a linear combination of vectors in S. This means that (a, b) ∈ span(S). But span(S) = [S] by Theorem I.20, so (a, b) ∈ [S] as required.
Example I.24. We define special vectors in R^n, called unit vectors. Given an integer j with 1 ≤ j ≤ n, the vector e_j has jth entry 1, and all other entries 0. That is,

   e_1 = (1, 0, 0, ..., 0),   e_2 = (0, 1, 0, ..., 0),   ...,   e_n = (0, 0, ..., 0, 1).

Now let S = {e_1, e_2, ..., e_n}. Then [S] = R^n. To see this, since [S] ⊆ R^n anyway, it suffices to show that every vector from R^n is in [S]. But this is easy: for every vector u = (a_1, a_2, ..., a_n) ∈ R^n we have

   u = a_1 e_1 + a_2 e_2 + ... + a_n e_n ∈ span(S) = [S].
Example I.25. In R^n, let

   v_1 = (1, −1, 0, 0, ..., 0),   v_2 = (1, 0, −1, 0, ..., 0),   ...,   v_{n−1} = (1, 0, 0, ..., 0, −1).

We want to identify the subspace generated by these vectors explicitly; we claim that there is an equality

   [S] = { (x_1, x_2, ..., x_n) ∈ R^n | x_1 + x_2 + ... + x_n = 0 },

where S = {v_1, v_2, ..., v_{n−1}}.
To prove this equality, let us denote by M its right hand side. It is clear that v_i ∈ M for each i = 1, ..., n − 1, by direct calculation. Now let v ∈ [S]. Since [S] = span(S) by Theorem I.20 we can write v as a linear combination

   (1)   v = λ_1 v_1 + λ_2 v_2 + ... + λ_{n−1} v_{n−1}.

Using the explicit definition of the vectors v_i this means that we have

   v = (λ_1 + λ_2 + ... + λ_{n−1}, −λ_1, −λ_2, ..., −λ_{n−1}).

But this implies v ∈ M, by definition of M and inspection. This proves [S] ⊆ M.

To prove the reverse inclusion let v ∈ M; in other words, we have v = (x_1, x_2, ..., x_n) with

   (2)   x_1 + x_2 + ... + x_n = 0.

We want to show that v ∈ [S]; by Theorem I.20 again it is enough to show that v ∈ span(S). In other words, we want to determine scalars λ_j such that v can be written in the form given in (1) above. Note first that, from equation (2), we actually have

   v = (−x_2 − x_3 − ... − x_n, x_2, ..., x_n);

substituting this into equation (1) yields a system of linear equations which can easily be solved for the λ_j. The result will be

   λ_{n−1} = −x_n,   λ_{n−2} = −x_{n−1},   ...,   λ_1 = −x_2,
and direct computation confirms that v = λ_1 v_1 + λ_2 v_2 + ... + λ_{n−1} v_{n−1}. In particular, v ∈ span(S) = [S] as required.
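The calculation in this example is easy to reproduce numerically. In the Python/numpy sketch below (our own illustration, not part of the notes), the vectors v_1, ..., v_{n−1} are the columns of a matrix, and the coefficients λ_i = −x_{i+1} recover any vector whose entries sum to zero.

```python
import numpy as np

n = 5
# Columns of V are v_1, ..., v_(n-1): first entry 1, entry i+1 equal to -1, all others 0
V = np.zeros((n, n - 1))
V[0, :] = 1.0
for i in range(n - 1):
    V[i + 1, i] = -1.0

# Take any v whose entries sum to 0 ...
rng = np.random.default_rng(2)
v = rng.normal(size=n)
v[0] = -v[1:].sum()

# ... and reconstruct it with the coefficients lambda_i = -x_(i+1) from the example
lam = -v[1:]
print(np.allclose(V @ lam, v))     # True: v lies in span{v_1, ..., v_(n-1)}
```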
Lemma I.26. Let S be a subset of the vector space V, and let T ⊆ S. Suppose that every vector t ∈ T is in span(S \ T). Then [S \ T] = [S].

Proof. We have S \ T ⊆ S and hence [S \ T] ⊆ [S] by Corollary I.22. Now [S \ T] contains the set S \ T, but it also contains every vector of T: for t ∈ T we have t ∈ span(S \ T) = [S \ T], using the hypothesis and Theorem I.20. So [S \ T] contains the set (S \ T) ∪ T = S. But [S] is the smallest subspace of V containing S, so [S] ⊆ [S \ T]. In total, this shows [S] = [S \ T]. □
CHAPTER II
Linear independence and bases
1. The notion of linear independence
In this chapter we study the notion of linear independence. We
have actually seen this notion in various forms in previous sections;
now it is time to drag the concept to the surface. We begin with a
rigorous definition.
Definition II.1. Let V be a vector space over R, let v_1, v_2, ..., v_n ∈ V be a finite collection of vectors (possibly with repetitions), and let S ⊆ V be a non-empty subset (possibly infinite).

• We call the vectors v_1, v_2, ..., v_n ∈ V linearly dependent if there are scalars λ_1, λ_2, ..., λ_n ∈ R, not all of which are 0, such that

   λ_1 v_1 + λ_2 v_2 + ... + λ_n v_n = o.

• We call the vectors v_1, v_2, ..., v_n ∈ V linearly independent if they are not linearly dependent. Explicitly, this means that whenever we are given scalars λ_1, λ_2, ..., λ_n ∈ R such that

   λ_1 v_1 + λ_2 v_2 + ... + λ_n v_n = o,

then necessarily λ_1 = λ_2 = ... = λ_n = 0.

• We call the set S linearly dependent if there are (finitely many!) distinct(!) vectors w_1, w_2, ..., w_k ∈ S and scalars λ_1, λ_2, ..., λ_k ∈ R, not all of which are 0, such that

   λ_1 w_1 + λ_2 w_2 + ... + λ_k w_k = o,

where k ≥ 1 is a suitable natural number.

• We call the set S linearly independent if it is not linearly dependent. Explicitly, this means that every (finite!) collection w_1, w_2, ..., w_k ∈ S of distinct(!) vectors is linearly independent.
Remark II.2. (1) If one of the v_j is the zero vector, or if two of the v_j are identical, then the vectors v_1, v_2, ..., v_n are clearly linearly dependent.
(2) If S is linearly dependent, then so is every superset of S.
(3) If S is linearly independent, then so is every subset of S.
(4) A finite set S = {v_1, v_2, ..., v_n} of vectors in V (with the v_j pairwise distinct) is linearly independent if and only if its elements v_1, v_2, ..., v_n are linearly independent vectors.
It is useful to note a fact we've used before: the zero vector o is always a linear combination of given vectors v_1, v_2, ..., v_n; indeed,

   o = 0v_1 + 0v_2 + ... + 0v_n.

This does however not imply that v_1, v_2, ..., v_n are linearly dependent. In order for this to happen, we must have that o is a linear combination of v_1, v_2, ..., v_n where not all coefficients are zero.
Example II.3. In R^3, the vectors (1, 2, −1) and (−1, 0, 3) are linearly independent. Indeed, suppose that

   λ_1 (1, 2, −1) + λ_2 (−1, 0, 3) = (0, 0, 0).

This means that λ_1 and λ_2 are a solution of the system of linear equations

   λ_1 − λ_2 = 0
   2λ_1 = 0
   −λ_1 + 3λ_2 = 0.

From the second equation we see that we must have λ_1 = 0, and the first (or third) equation then implies that necessarily λ_2 = 0. In other words, the only linear combination of the two given vectors which yields the zero vector is the trivial linear combination with all coefficients 0. This means that the vectors are linearly independent.
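In R^n, linear independence of a finite list of vectors can also be tested numerically: place the vectors as the columns of a matrix and compare its rank with the number of columns. A Python/numpy sketch (our own illustration, not part of the notes):

```python
import numpy as np

# The two vectors from Example II.3, placed as the columns of a matrix
A = np.column_stack([[1.0, 2.0, -1.0],
                     [-1.0, 0.0, 3.0]])

# The columns are linearly independent exactly when the rank equals the number of columns
print(np.linalg.matrix_rank(A) == A.shape[1])    # True
```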
Example II.4. In P_3, the vector space of polynomials of degree at most 3, the vectors

   p_1 = x^3 + 2x^2 + 1
   p_2 = x^2 − x − 3
   p_3 = 2x^3 + 7x^2 − 3x − 7

are linearly dependent. Indeed, it is easily verified that we have

   2p_1 + 3p_2 − p_3 = o.

More constructively, to show that the given polynomials are linearly dependent we have to prove that the equation λ_1 p_1 + λ_2 p_2 + λ_3 p_3 = o has a solution with at least one of the λ_i different from zero. The equation is equivalent to the following system of linear equations:

   λ_1 + 2λ_3 = 0
   2λ_1 + λ_2 + 7λ_3 = 0
   −λ_2 − 3λ_3 = 0
   λ_1 − 3λ_2 − 7λ_3 = 0.

Once again, the task is to show that this system has a solution with at least one of the λ_j non-zero. For example, from the second and fourth equations we can conclude that 7λ_2 + 21λ_3 = 0, so λ_2 = −3λ_3. The first equation tells us that λ_1 = −2λ_3. Choosing λ_3 = 1 yields λ_1 = −2 and λ_2 = −3, and this is, by direct calculation, indeed a solution of the system.
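Since a polynomial of degree at most 3 is determined by its four coefficients, the dependence relation can be checked by simple arithmetic on coefficient vectors. A short Python/numpy sketch (our own illustration):

```python
import numpy as np

# Coefficient vectors (constant term first) of p1, p2, p3 from Example II.4
p1 = np.array([1.0, 0.0, 2.0, 1.0])     # 1 + 2x^2 + x^3
p2 = np.array([-3.0, -1.0, 1.0, 0.0])   # -3 - x + x^2
p3 = np.array([-7.0, -3.0, 7.0, 2.0])   # -7 - 3x + 7x^2 + 2x^3

# The relation 2*p1 + 3*p2 - p3 = o exhibits the linear dependence
print(np.allclose(2 * p1 + 3 * p2 - p3, 0.0))    # True
```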
Example II.5. The unit vectors

   e_1 = (1, 0, 0, ..., 0),   e_2 = (0, 1, 0, ..., 0),   ...,   e_n = (0, 0, ..., 0, 1)

in R^n are linearly independent. Indeed, suppose that

   λ_1 e_1 + λ_2 e_2 + ... + λ_n e_n = o.

By writing out this equality explicitly, this means that

   (λ_1, λ_2, ..., λ_n) = (0, 0, ..., 0).

We have shown that necessarily λ_j = 0 for 1 ≤ j ≤ n. This proves that the unit vectors are linearly independent.
Proposition II.6. Let V be a vector space over R. The vectors v_1, v_2, ..., v_n in V are linearly dependent if and only if one of them is a linear combination of the rest.

Proof. Suppose that v_1, v_2, ..., v_n are linearly dependent. Then

   λ_1 v_1 + λ_2 v_2 + ... + λ_n v_n = o

for some choice of scalars λ_1, λ_2, ..., λ_n not all of which are zero. Thus there exists an index i such that λ_i ≠ 0. But then, after dividing by λ_i and keeping the vector v_i on the left hand side, we conclude that

   v_i = −(λ_1/λ_i) v_1 − ... − (λ_{i−1}/λ_i) v_{i−1} − (λ_{i+1}/λ_i) v_{i+1} − ... − (λ_n/λ_i) v_n,

that is, v_i is a linear combination of the rest of the vectors.

Conversely, suppose that one of the vectors, say v_j, is a linear combination of the rest. Thus, there exist scalars λ_1, ..., λ_{j−1}, λ_{j+1}, ..., λ_n such that v_j = λ_1 v_1 + ... + λ_{j−1} v_{j−1} + λ_{j+1} v_{j+1} + ... + λ_n v_n. But then

   λ_1 v_1 + ... + λ_{j−1} v_{j−1} − v_j + λ_{j+1} v_{j+1} + ... + λ_n v_n = o,

and since not all coefficients in this linear combination are zero (namely the coefficient of v_j is −1), we have shown that the vectors v_1, ..., v_n are linearly dependent. □
Remark II.7. A variation of the proof shows: a subset S of the vector space V is linearly dependent if and only if there is s ∈ S such that s ∈ span(S \ {s}).
2. Bases
Definition II.8. Let V be a vector space and S ⊆ V. The subset S is called a generating set for V if [S] = V.
Remark II.9. If S ⊆ S′ and S is generating then so is S′. For we have [S] ⊆ [S′] by Corollary I.22. Since S is generating we know [S] = V and hence S′ ⊆ [S]. Since [S′] is the smallest subspace of V containing S′, this implies the reverse inclusion [S′] ⊆ [S]. Hence [S′] = [S] = V, that is, S′ is generating.
Definition II.10. A vector space is called finite dimensional if it has a finite generating set. We write dim V < ∞ to indicate that V is a finite dimensional vector space.
A vector space is called infinite dimensional if it is not finite dimensional. We write dim V = ∞ to indicate that V is infinite dimensional.
Most (but not all) of the examples we have considered are finite dimensional. The vector space R^n is finite dimensional, as was shown in Example I.24. However, the vector space S of all sequences, the vector space S_0 of all convergent sequences, and the vector space P of all polynomials (of arbitrary degree) are examples of infinite dimensional vector spaces.
Definition II.11. Let V be a vector space. The subset S ⊆ V is called a basis for V if S is generating and linearly independent.
Example II.12. The set S = {e_1, e_2, ..., e_n} in R^n is a basis. Indeed, S is generating as pointed out above, and Example II.5 shows that S is linearly independent as well.
Proposition II.13. Let V be a vector space and B ⊆ V. The set B is a basis for V if and only if every vector in V can be expressed in a unique way as a linear combination of vectors in B.

Proof. Suppose that B is a basis. Then B is generating and hence every vector in V can be expressed as a linear combination of vectors from B. We need to show uniqueness of such a representation. Suppose that

   v = λ_1 v_1 + λ_2 v_2 + ... + λ_n v_n = μ_1 v_1 + μ_2 v_2 + ... + μ_n v_n,

where v_1, v_2, ..., v_n ∈ B and λ_1, λ_2, ..., λ_n, μ_1, μ_2, ..., μ_n are scalars. Then we have that

   (λ_1 − μ_1)v_1 + (λ_2 − μ_2)v_2 + ... + (λ_n − μ_n)v_n = o,

and since B is linearly independent we must have that λ_i − μ_i = 0, that is, λ_i = μ_i, for each i = 1, 2, ..., n. It follows that the representation of v as a linear combination of vectors from B is unique.

Conversely, suppose that every vector in V can be expressed in a unique way as a linear combination of vectors in B. In particular, B is a generating set. Suppose now that

   λ_1 v_1 + λ_2 v_2 + ... + λ_n v_n = o

for some (pairwise distinct) vectors v_1, v_2, ..., v_n ∈ B and scalars λ_1, λ_2, ..., λ_n. But we also know that 0v_1 + 0v_2 + ... + 0v_n = o. So we have two ways of writing the zero vector as a linear combination of vectors in B. By the uniqueness clause in the hypotheses the two ways must actually coincide, that is, we must have λ_i = 0 for 1 ≤ i ≤ n. But this means that the vectors v_i are linearly independent. Since this argument is valid for every finite collection of vectors from B, this shows that B is linearly independent and hence a basis as claimed. □
Theorem II.14. Let V be an R-vector space, and let v_1, v_2, ..., v_n be vectors in V. The following conditions are equivalent:
(i) The vectors v_1, v_2, ..., v_n generate the vector space V (that is, V = [{v_1, v_2, ..., v_n}]) and are linearly independent.
(ii) The vectors v_1, v_2, ..., v_n form a minimal list of vectors generating V (that is, they generate V, but no n − 1 of these vectors generate V).
(iii) The vectors v_1, v_2, ..., v_n are a maximal list of linearly independent vectors (that is, v_1, v_2, ..., v_n are linearly independent, but given any vector v ∈ V, the vectors v, v_1, v_2, ..., v_n are linearly dependent).
Proof. (i) ⇒ (ii): The hypotheses of (i) already say that the vectors generate V, so all we have to check is minimality. We work by contradiction. Assume that after leaving out one of the vectors, say v_i, the remaining ones still generate all of V. By re-numbering the vectors we may assume that i = 1. That is, we may assume that [{v_2, v_3, ..., v_n}] = V. But then we have

   v_1 ∈ V = [{v_2, v_3, ..., v_n}] = span({v_2, v_3, ..., v_n}),

so that the vectors v_1, v_2, ..., v_n are linearly dependent by Proposition II.6. But this contradicts the hypotheses of (i)! So in fact the remaining n − 1 vectors cannot generate all of V.
(ii) ⇒ (iii): Let us check first that, under the hypotheses of (ii), the vectors v_i are in fact linearly independent. We work (as happens so often) by contradiction. If the vectors v_i are linearly dependent, we can find scalars λ_i, not all of which are 0, such that

   λ_1 v_1 + λ_2 v_2 + ... + λ_n v_n = o.

By renumbering the v_i and λ_i we may assume that λ_1 ≠ 0. But this then means that

   v_1 = −(λ_2/λ_1) v_2 − (λ_3/λ_1) v_3 − ... − (λ_n/λ_1) v_n ∈ span({v_2, v_3, ..., v_n}).

From Lemma I.26 we then get the equality

   [{v_2, v_3, ..., v_n}] = [{v_1, v_2, ..., v_n}] = V,

that is, after forgetting v_1 the remaining vectors generate all of V. But this is not possible as (ii) stipulates that the list of vectors is a minimal list generating V!

So v_1, ..., v_n are linearly independent. To show that they form a maximal list of linearly independent vectors, let v ∈ V be given. By hypothesis (ii) the v_i generate V, so that

   v ∈ V = [{v_1, v_2, ..., v_n}] = span({v_1, v_2, ..., v_n}).

By Proposition II.6 this implies that v, v_1, v_2, ..., v_n are linearly dependent, as required.
(iii) ⇒ (i): Suppose now that condition (iii) holds. We have to prove that the vectors v_i generate the vector space V (this is enough since they are already linearly independent, by hypothesis). So let v ∈ V. Since v, v_1, v_2, ..., v_n are linearly dependent by the maximality clause in the hypotheses, there are scalars μ, λ_1, ..., λ_n, not all zero, such that

   μv + λ_1 v_1 + λ_2 v_2 + ... + λ_n v_n = o.

The crucial observation now is that we must have μ ≠ 0; for otherwise one of the λ_i must be non-zero, and the above equality shows that the v_i are linearly dependent, contradicting the hypothesis. Since μ ≠ 0 we can re-arrange and express v as a linear combination

   v = −(λ_1/μ) v_1 − (λ_2/μ) v_2 − ... − (λ_n/μ) v_n,

so that v ∈ span({v_1, ..., v_n}) = [{v_1, ..., v_n}]. This shows that the vectors v_i do indeed generate all of V. □
3. Basis Selection and Basis Extension theorems
Our next aim is to show that every finite dimensional vector space has a basis. For the trivial vector space {o}, which contains no vector except the zero vector, this is easy: the empty set is a basis (it certainly generates the space, and is considered to be linearly independent, by convention). Note that the set {o} is linearly dependent and hence not a basis. For non-trivial vector spaces the result is substantially more difficult to prove.
Theorem II.15 (Basis Selection Theorem). Suppose that V is a finite dimensional vector space, and suppose that S is a finite generating set for V. Then there is a subset G ⊆ S which is a basis of V. In other words, from any finite list of vectors that generate V we can select vectors which form a basis of V.

The proof is constructive and yields a recipe for finding a basis from a given finite generating set. We'll describe the algorithm first, and then prove that it does as claimed.
Theorem II.16 (Basis Selection Algorithm). Suppose that V is a finite dimensional vector space, and suppose that the (finitely many) vectors w_1, w_2, ..., w_n generate V. Label these vectors iteratively as follows:
• If w_1 ≠ o, label w_1 "good"; otherwise, label it "bad".
• Suppose we have labelled w_1, ..., w_{k−1} already, for k ≤ n. Then we label w_k "good" if it is not in the span of the vectors that have been labelled "good" so far; otherwise, if w_k is in the span of the good vectors so far, we label it "bad".
When all the vectors are labelled, the good vectors among them are a basis of V.
Proof. Let S = {w_1, w_2, ..., w_n}, and let T denote the set of bad vectors in S. Then S \ T is the set of good vectors. By construction, every bad vector is a linear combination of good vectors, so [S \ T] = [S] = V by Lemma I.26 (and using the hypothesis that the w_j generate V). So we know that the good vectors generate V.

We're left to check that the good vectors are linearly independent. We work by contradiction, so assume for the moment that the good vectors are in fact linearly dependent. To ease notation, let g_1 be the first good vector the Algorithm produces, g_2 the second good vector and so on, with g_k being the last good vector. Linear dependence now means that there are scalars λ_j, not all zero, such that

   λ_1 g_1 + λ_2 g_2 + ... + λ_k g_k = o.

Let m be the maximal index with λ_m ≠ 0 (the case m = k is possible and means λ_k ≠ 0). Then by re-arranging the above equality we have

   g_m = −(λ_1/λ_m) g_1 − (λ_2/λ_m) g_2 − ... − (λ_{m−1}/λ_m) g_{m−1},

that is, g_m is in the span of the previously found good vectors. But this means that the Algorithm must label g_m as a bad vector, a contradiction!
Corollary II.17. Every finite dimensional vector space has a basis. If a vector space is generated by n vectors, there is a basis consisting of not more than n vectors.

Proof. Let V be a finite dimensional vector space. Then V has a finite generating set S ⊆ V. After listing the elements of S in some order we can apply the Basis Selection Algorithm to extract a basis from S. Clearly such a basis cannot have more elements than the set S. □
Example II.18. We'll work in the vector space R^3. The vectors

   w_1 = (1, 1, 1),  w_2 = (0, 0, 0),  w_3 = (1, 2, 3),  w_4 = (0, 1, 2),  w_5 = (1, 0, 1)

generate R^3 (this is left as an exercise). Running the Basis Selection Algorithm on these vectors works as follows. First, the vector w_1 is not the zero vector, so w_1 is good. Next, w_2 = 0w_1 ∈ span({w_1}) is in the span of the previously found good vector(s), so w_2 is bad. The vector w_3 is not a multiple of w_1, hence w_3 ∉ span({w_1}) and w_3 is good. Next, w_4 = −w_1 + w_3 ∈ span({w_1, w_3}) is in the span of the previously found good vectors, so w_4 is bad. Finally, w_5 is not in span({w_1, w_3}) since the equation

   w_5 = λw_1 + μw_3

has no solution at all. This means that w_5 is good.

The Basis Selection Algorithm terminates now; the resulting basis of R^3 is w_1, w_3, w_5. (It might be instructive to check directly that these three vectors form a basis.)
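For vectors in R^n, the Basis Selection Algorithm is easy to implement: "w_k lies in the span of the good vectors so far" can be tested numerically with a least-squares solve. The Python/numpy sketch below is our own illustration (the function name basis_selection is ours); it reproduces the run of Example II.18.

```python
import numpy as np

def basis_selection(vectors, tol=1e-10):
    """Keep a vector ("good") only if it is not in the span of the good vectors found so far."""
    good = []
    for w in vectors:
        if not good:
            in_span = np.allclose(w, 0.0, atol=tol)    # first step: only the zero vector is "bad"
        else:
            G = np.column_stack(good)
            # w lies in span(good) iff the least-squares residual of G x = w is (numerically) zero
            coeffs, *_ = np.linalg.lstsq(G, w, rcond=None)
            in_span = np.allclose(G @ coeffs, w, atol=tol)
        if not in_span:
            good.append(w)
    return good

w1 = np.array([1.0, 1.0, 1.0])
w2 = np.array([0.0, 0.0, 0.0])
w3 = np.array([1.0, 2.0, 3.0])
w4 = np.array([0.0, 1.0, 2.0])
w5 = np.array([1.0, 0.0, 1.0])

basis = basis_selection([w1, w2, w3, w4, w5])
print(len(basis))          # 3 -- the good vectors w1, w3, w5, a basis of R^3
```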
The Basis Selection Theorem II.15 asserts that we can reduce a given finite generating set to a subset that keeps the property of being generating but is moreover linearly independent. There is another, "dual" way to arrive at a basis, viz., starting with a linearly independent set and enlarging it to a superset which keeps the property of being linearly independent but is moreover generating.
Theorem II.19 (Basis Extension Theorem). Let V be a finite dimensional vector space and S ⊆ V be a finite linearly independent set. Then there exists a superset B ⊇ S which is a basis for V.

Proof. Let S = {v_1, v_2, ..., v_k}. We must show that there exist vectors v_{k+1}, ..., v_n of V such that the set B = {v_1, ..., v_k, v_{k+1}, ..., v_n} is a basis for V. Since V is finite dimensional, there exists a finite generating set G = {w_1, w_2, ..., w_m} for V. Consider the set

   S ∪ G = {v_1, v_2, ..., v_k, w_1, w_2, ..., w_m}.

By Remark II.9, S ∪ G is a generating set. We now apply the Basis Selection Algorithm (Theorem II.16) to the vectors in the set S ∪ G in the order as listed above. Since S is linearly independent, all vectors v_1, ..., v_k will be labelled good and will consequently be contained in the basis of V produced by the algorithm. It follows that there exists some set B (viz., the resulting basis after running the algorithm) that contains all vectors v_1, ..., v_k and is a basis for V. That's what we wanted, so we're done. □
4. Subspaces and bases
Let V be a vector space over R and W ⊆ V be a subspace. Then W is a vector space over R in its own right, with respect to the operations of addition and multiplication inherited from V. Suppose that we are given a basis B_W of W. Then, in particular, B_W is a linearly independent set of vectors in W, and since the operations of addition and multiplication by a scalar are the same in V and in W, we see that B_W is linearly independent when viewed as a subset of V. According to Theorem II.19, at least in the case that V is finite dimensional, B_W can be enlarged to a set B_V which is a basis for V. Thus, we arrive at the following reformulation of Theorem II.19:

   Every basis of a non-trivial subspace of a finite dimensional vector space V can be extended to a basis of V.

It is often useful to identify a basis for a certain subspace of a given vector space and to extend it to a basis of the whole of the space. We demonstrate how this works in a particular example.
Example II.20. Let V = R^3, let a, b ∈ R and

   W = { (x, y, z) ∈ R^3 | z = ax + by }.

We leave it as an exercise to check that W is a subspace of R^3. Our goal now is to find a basis for W, and to extend it to a basis of R^3.

To find a basis for W, we first set x = 1, y = 0 in the defining relation of W. This yields z = a, and we obtain that the vector u = (1, 0, a) lies in W. Similarly, setting x = 0 and y = 1 in the defining relation of W yields z = b, and we see that the vector v = (0, 1, b) lies in W. We claim that the set {u, v} is a basis for W. To show this, note first that u and v are linearly independent, for if λu + μv = o then

   (λ, μ, λa + μb) = (0, 0, 0),

which implies that λ = μ = 0. Second, if (x, y, z) ∈ W then z = ax + by and hence

   (x, y, z) = x (1, 0, a) + y (0, 1, b) = xu + yv.

Thus [{u, v}] = span({u, v}) = W; in other words, {u, v} is both linearly independent and generates W, hence is a basis for W.

To extend this to a basis of R^3, we use the Basis Extension Theorem II.19. Its proof tells us how to proceed: apply the Basis Selection Algorithm (Theorem II.16) to the list of vectors u, v, e_1, e_2, e_3, where e_i is the ith unit vector. (The point here is that e_1, e_2, e_3 form a basis of R^3.) The vectors u and v will be labelled good; what happens next depends on the values of a and b:
• If a ≠ 0 then e_1 will be labelled good while e_2 and e_3 are labelled bad. The resulting basis is u, v, e_1.
• If a = 0 the vector e_1 will be labelled bad. If b ≠ 0 then e_2 will be labelled good and e_3 bad, so that we obtain the basis u, v, e_2.
• Finally, if a = b = 0 then e_1 and e_2 will be labelled bad, but e_3 will be labelled good. The resulting basis is u, v, e_3.

It is interesting to observe that the result is different if we apply the Basis Selection Algorithm to the vectors u, v, e_3, e_2, e_1 (note the order). Now the vector e_3 will always be labelled good, independent of a and b!
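The same run can be carried out numerically: keep a unit vector only if it enlarges the span of what has been kept so far, which in R^3 can be detected by a rank computation. A Python/numpy sketch (our own illustration; the concrete values of a and b are hypothetical, chosen only for the demonstration):

```python
import numpy as np

a, b = 2.0, 0.0                       # sample values of the parameters a and b
u = np.array([1.0, 0.0, a])
v = np.array([0.0, 1.0, b])
units = [np.eye(3)[:, i] for i in range(3)]     # e1, e2, e3

chosen = [u, v]
for e in units:
    candidate = np.column_stack(chosen + [e])
    # keep e only if it enlarges the span, i.e. increases the rank
    if np.linalg.matrix_rank(candidate) > len(chosen):
        chosen.append(e)

print(len(chosen))                    # 3: u, v and (here, since a != 0) e1
```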
5. The dimension
Every finite dimensional vector space has a certain non-negative integer attached to it which completely characterises the vector space (as we will see later). In this section, we are going to define this number, called the dimension of the vector space. Before defining the dimension, we prove the Basis Theorem, a deep result that is at the centre of linear algebra: it asserts that any two bases of a finite dimensional vector space have the same number of elements. This is by no means obvious, and the proof uses all the material we have discussed so far.
Theorem II.21 (Basis Theorem). Suppose V is a vector space over R. Suppose that both

   v_1, v_2, ..., v_s   and   w_1, w_2, ..., w_t

are bases of V. Then necessarily s = t. In other words, every basis of V has the same number of elements.

(The Basis Theorem as stated applies to finite dimensional vector spaces: it is built into the hypotheses that a finite basis exists!)
Proof. Without loss of generality we may assume s ≥ t (otherwise swap the roles of the v_i and w_j). Below we'll prove the following:

   Claim: For any integer k with 1 ≤ k ≤ t, the list of vectors

      v_1, v_2, ..., v_{s−k}, w_1, w_2, ..., w_k

   is a basis of V, possibly after renumbering the vectors v_i in a suitable manner.

Once we've proved this we conclude as follows: The Claim, for k = t, says that the vectors w_1, ..., w_t together with s − t of the vectors v_i form a basis of V and are thus linearly independent. But the vectors w_1, ..., w_t alone form a basis by hypothesis, and hence are a maximal linearly independent list of vectors. It follows that none of the v_i can occur, that is to say, we must have s − t = 0 and thus s = t.
It remains to prove the Claim. We deal with k = 1 first. Since v_1, ..., v_s is a basis of V we can write w_1 as a linear combination:

   (1)   w_1 = λ_1 v_1 + λ_2 v_2 + ... + λ_s v_s.

Now at least one of the λ_i must be non-zero (otherwise w_1 = o, which is impossible as w_1 is part of a basis); by renumbering the vectors v_i, 1 ≤ i ≤ s, we may assume that λ_s ≠ 0. But then

   (2)   v_s = −(λ_1/λ_s) v_1 − (λ_2/λ_s) v_2 − ... − (λ_{s−1}/λ_s) v_{s−1} + (1/λ_s) w_1,
and consequently we know (using Lemma I.26 for the third equality) that

   V = [{v_1, v_2, ..., v_{s−1}, v_s}] = [{v_1, v_2, ..., v_{s−1}, v_s, w_1}]
     = [{v_1, v_2, ..., v_{s−1}, w_1}]   by (2).

In other words, v_1, ..., v_{s−1}, w_1 are generating.
These vectors are also linearly independent. For suppose we have a linear relation

   (3)   α_1 v_1 + α_2 v_2 + ... + α_{s−1} v_{s−1} + β_1 w_1 = o.

If β_1 ≠ 0 we can write

   w_1 = −(α_1/β_1) v_1 − (α_2/β_1) v_2 − ... − (α_{s−1}/β_1) v_{s−1},

which means that we have two different ways of writing w_1 as a linear combination of the v_i: one not involving v_s, as just seen, and one involving v_s, as seen in (1). But this can't happen as v_1, ..., v_s are a basis of V (Proposition II.13).

So we must have β_1 = 0. But then (3) is a linear relation between the v_i's, which are linearly independent, hence all the other coefficients α_i must be zero too. This finishes the proof of the Claim for k = 1.
Suppose now that for some k with 1 ≤ k < t we've already shown that, for a suitable renumbering of the vectors v_i, 1 ≤ i ≤ s, the list

   (4)   v_1, v_2, ..., v_{s−k}, w_1, w_2, ..., w_k

is a basis of V. We will prove that, possibly after renumbering the v_i, 1 ≤ i ≤ s − k, the list

   v_1, v_2, ..., v_{s−k−1}, w_1, w_2, ..., w_{k+1}

is a basis of V as well. This establishes the Claim for all k, as we can go through the argument for k = 1, 2, ..., t − 1 iteratively.
Since (4) is a basis of V, we can write w_{k+1} as a linear combination

   (5)   w_{k+1} = λ_1 v_1 + λ_2 v_2 + ... + λ_{s−k} v_{s−k} + μ_1 w_1 + μ_2 w_2 + ... + μ_k w_k.

Now at least one of the λ_i is non-zero (otherwise (5) shows that the w_j are linearly dependent, which is impossible as they form a basis of V).
By renumbering the v_i, 1 ≤ i ≤ s − k, if necessary, we may assume λ_{s−k} ≠ 0 and can hence write

   (6)   v_{s−k} = −(λ_1/λ_{s−k}) v_1 − (λ_2/λ_{s−k}) v_2 − ... − (λ_{s−k−1}/λ_{s−k}) v_{s−k−1}
                 − (μ_1/λ_{s−k}) w_1 − (μ_2/λ_{s−k}) w_2 − ... − (μ_k/λ_{s−k}) w_k + (1/λ_{s−k}) w_{k+1}.
This identity implies, using Lemma I.26 for the third equality, that

   V = [{v_1, v_2, ..., v_{s−k−1}, v_{s−k}, w_1, w_2, ..., w_k}]
     = [{v_1, v_2, ..., v_{s−k−1}, v_{s−k}, w_1, w_2, ..., w_k, w_{k+1}}]
     = [{v_1, v_2, ..., v_{s−k−1}, w_1, w_2, ..., w_{k+1}}]   by (6).

In other words, v_1, ..., v_{s−k−1}, w_1, ..., w_{k+1} are generating.
These vectors are also linearly independent. Indeed, suppose we have a linear relation

   (7)   α_1 v_1 + α_2 v_2 + ... + α_{s−k−1} v_{s−k−1} + β_1 w_1 + β_2 w_2 + ... + β_{k+1} w_{k+1} = o.

If β_{k+1} ≠ 0 we can re-arrange this to express w_{k+1} as a linear combination of the other vectors, which are part of the basis (4); since the resulting expression does not involve the vector v_{s−k}, this differs from (5), and this cannot happen since (4) is a basis of V (Proposition II.13).

So we must have β_{k+1} = 0. But then (7) is a linear relation between linearly independent vectors, so all the other coefficients must vanish as well, proving the Claim. □
We are now ready to give the definition of the dimension.

Definition II.22. Let V be a finite dimensional vector space. The dimension of V is the number of vectors in any basis of V. We denote the dimension of V by the symbol dim(V).
Remark II.23. It follows from Theorem II.21 that the dimension of a vector space is independent of the particular basis that is used to calculate it; indeed, by this theorem, every basis has the same number of elements. In other words, Theorem II.21 guarantees that the dimension of a finite dimensional vector space is well-defined.
Proposition II.24. Let V be a finite dimensional vector space, and
write n = dim(V ).
(i) Every linearly independent set of n vectors is automatically gen-
erating and thus forms a basis of V .
(ii) Every generating set of n vectors is automatically linearly inde-
pendent and thus forms a basis of V .
Proof. (i) Suppose that v_1, v_2, ..., v_n are linearly independent. By the Basis Extension Theorem II.19 we can find further vectors w_1, w_2, ..., w_k which together with all the v_i form a basis of V. But dim(V) = n, and every basis of V must have n elements by the Basis Theorem II.21. So the v_i together with the w_j must be a list of exactly n vectors; in other words, we can't have any w_j showing up, as we started with n of the v_i already! Since the Basis Extension Theorem produces a basis regardless, the conclusion must be that the vectors v_1, v_2, ..., v_n already form a basis of V.

(ii) Suppose that v_1, v_2, ..., v_n are a generating system for V. Then some of these vectors form a basis of V, by the Basis Selection Theorem II.15. But since dim(V) = n this basis must have precisely n members; this is possible only if the Basis Selection Theorem actually selects all the v_i. This proves that the v_i already form a basis as claimed. □
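In R^n, Proposition II.24 has a convenient numerical counterpart: for exactly n vectors, "linearly independent", "generating" and "basis" all come down to the same test, namely full rank (equivalently, non-zero determinant) of the matrix having these vectors as columns. A Python/numpy sketch (our own illustration), using the three good vectors found in Example II.18:

```python
import numpy as np

# Three vectors in R^3 (so n = dim(R^3) = 3), placed as matrix columns
A = np.column_stack([[1.0, 1.0, 1.0],     # w1
                     [1.0, 2.0, 3.0],     # w3
                     [1.0, 0.0, 1.0]])    # w5

print(np.linalg.matrix_rank(A) == 3)      # True: independent, hence also generating
print(abs(np.linalg.det(A)) > 1e-12)      # True: the same test via the determinant
```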
CHAPTER III
Linear mappings
1. Mappings between sets
Definition III.1. Let A and B be sets. A map from A to B is a rule f which assigns to every element x of the set A an element, denoted f(x), from the set B. We call A the domain and B the codomain of f, say that f takes values in B, and use the string of symbols

   f : A → B

to denote a map f with domain A and codomain B.

If f : A → B is a map, we often say that f maps the element x of A to the element f(x) of B. We also say that x is sent to f(x) or, being even more colloquial, that x goes to f(x) via f. The set B is sometimes called the target set, or simply the target, of f.
Example III.2. (i) If I is an interval and f : I → R is a function, then f is also a map with domain I and target R.
(ii) A sequence is nothing else but a map a : N → R. In our notation for sequences, we had set a_n = a(n).
(iii) Let f : N → N be given by f(n) = 2n. Then f is a map from N into N. Equivalently, f is a sequence whose terms are precisely the even integers in increasing order.
(iv) Let f : Z → N be given by f(n) = |n|. Then f is a map from Z into N.
(v) Let f : R^2 → R^2 be given by f((x, y)) = (x + 1, y + 1). Then f is a map from R^2 into R^2.
(vi) Let f : R → R^2 be given by f(x) = (x, x^2). Then f is a map from R into R^2.
(vii) Let f : R → R^2 be the map given by f(x) = (cos x, sin x). Then f is a map from R into R^2.
A very important example of a class of mappings is given as follows:

Definition III.3. Let A be a set. The identity map on A is the map id_A : A → A given by id_A(x) = x for all x ∈ A.
We continue with the introduction of two basic properties that a
map may have.
Definition III.4. Let A and B be sets and f : A → B be a map.
(i) The map f is called injective if, whenever x_1 and x_2 are distinct elements of A, that is, x_1 ≠ x_2, we have that f(x_1) and f(x_2) are distinct elements of B, that is, f(x_1) ≠ f(x_2).
Symbolically: x_1 ≠ x_2 ⇒ f(x_1) ≠ f(x_2).
(ii) The map f is called surjective if, whenever y is an element of B, there exists an element x of A such that f(x) = y.
Symbolically: ∀ y ∈ B ∃ x ∈ A with f(x) = y.
(iii) The map f is called bijective if it is both injective and surjective.
Note. Very often, a surjective map is also called an onto map. Also,
saying that f is a map from A onto B means that f is surjective.
Examples. The map in Example III.2 (iii) is injective but not
surjective. The map in Example III.2 (iv) is surjective but not injective.
The map in Example III.2 (v) is bijective. The map in Example III.2 (vii)
is neither injective nor surjective.
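For maps between finite sets, injectivity and surjectivity can be checked by
brute force. The following sketch (plain Python, written only for illustration)
tests the doubling map of Example III.2 (iii) on a finite initial segment of N;
on an infinite domain such a check is of course only indicative.

    # f(n) = 2n, viewed as a map from {1, ..., 20} to {1, ..., 40}.
    domain = range(1, 21)
    codomain = set(range(1, 41))
    f = lambda n: 2 * n

    values = [f(n) for n in domain]

    # Injective: distinct inputs give distinct outputs, i.e. no value is hit twice.
    injective = len(values) == len(set(values))

    # Surjective: every element of the codomain is hit.
    surjective = set(values) == codomain

    print(injective, surjective)   # True False, matching Example III.2 (iii)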
We finish this section with an operation between mappings, a version
of which for functions played a prominent role in previous chapters:
the composition. Suppose that A, B and C are sets and f : A → B
and g : B → C are mappings. Then the mapping g ∘ f : A → C, called
the composition g after f, is defined by letting (g ∘ f)(x) = g(f(x))
for every x ∈ A.
The proof of the following facts is left as an exercise:
Proposition III.5. Let A and B be sets and f : A → B and
g : B → A be maps. If g ∘ f = id_A then f is injective and g is
surjective.
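A concrete instance of Proposition III.5 (a plain Python sketch, for
illustration only): take A = B = N, let f double and let g halve, rounding
down. Then g ∘ f is the identity, f is injective but not surjective, and g
is surjective but not injective.

    f = lambda n: 2 * n        # injective, not surjective (odd numbers are missed)
    g = lambda m: m // 2       # surjective, not injective (g(2) = g(3) = 1)

    # g o f is the identity on a sample of natural numbers.
    print(all(g(f(n)) == n for n in range(100)))   # True

    # f o g, by contrast, is not the identity.
    print(all(f(g(n)) == n for n in range(100)))   # False (fails for odd n)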
2. Linear mappings
Let V and W be vector spaces over R.
Definition III.6. A map f : V → W is called linear if the following
two conditions are satisfied:
(Lin 1) The map f is homogeneous: f(λv) = λf(v) for all v ∈ V and
all λ ∈ R;
(Lin 2) The map f is additive: f(u + v) = f(u) + f(v) for all u, v ∈ V.
Example III.7. (i) The identity map id : V V is linear for
every vector space V .
(ii) The map f : R^3 → R^3 given by
f((x, y, z)) = (2x + y, y - z, y + 3z),   (x, y, z) ∈ R^3,
is linear.
(iii) The map f : R^2 → R^3 given by
f((x, y)) = (2x + y, y, x - y + 1),   (x, y) ∈ R^2,
is not linear (indeed, f(2·(0, 0)) = (0, 0, 1) while 2·f((0, 0)) = (0, 0, 2),
so condition (Lin 1) fails).
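One can probe the two conditions of Definition III.6 numerically. The sketch
below (numpy again, for illustration) checks homogeneity and additivity for
the map of Example III.7 (ii) and exhibits a violation for the map of
Example III.7 (iii); a finite number of sample points can of course only
disprove linearity, never prove it.

    import numpy as np

    def f2(v):                       # the map of Example III.7 (ii): R^3 -> R^3
        x, y, z = v
        return np.array([2*x + y, y - z, y + 3*z])

    def f3(v):                       # the map of Example III.7 (iii): R^2 -> R^3
        x, y = v
        return np.array([2*x + y, y, x - y + 1])

    u, v = np.array([1.0, 2.0, -1.0]), np.array([0.5, -3.0, 4.0])
    lam = 2.5

    # (Lin 1) and (Lin 2) hold at these sample points for f2 ...
    print(np.allclose(f2(lam * u), lam * f2(u)))        # True
    print(np.allclose(f2(u + v), f2(u) + f2(v)))        # True

    # ... but additivity already fails for f3, because of the constant term +1.
    p, q = np.array([1.0, 1.0]), np.array([2.0, 0.0])
    print(np.allclose(f3(p + q), f3(p) + f3(q)))        # False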
The following characterisation of linear maps is basic. It combines
the two conditions from Definition III.6 into one.
Proposition III.8. Let V and W be vector spaces, and let f : V →
W be a map. The following conditions are equivalent:
(i) The map f is linear.
(ii) We have f(λu + μv) = λf(u) + μf(v) for all scalars λ and μ, and
all vectors u, v ∈ V.
(iii) We have equality
f(λ_1 v_1 + λ_2 v_2 + ... + λ_n v_n) = λ_1 f(v_1) + λ_2 f(v_2) + ... + λ_n f(v_n)
for all n ≥ 1, all scalars λ_i and all vectors v_i (where 1 ≤ i ≤ n). □
We now discuss how to detect that two linear maps are equal, and
how to construct linear maps:
Theorem III.9. Let V be a vector space over R, and suppose we are
given vectors v_1, v_2, ..., v_n in V. Let also W be a vector space over R
and let w_1, w_2, ..., w_n be given vectors in W.
(a) Suppose that the vectors v_1, v_2, ..., v_n generate V. Suppose further
that f and g are linear maps V → W which agree on the v_i, that
is, such that f(v_i) = g(v_i) for 1 ≤ i ≤ n. Then f and g are equal
(as maps), that is, f(v) = g(v) for every v ∈ V.
(b) Suppose now that the vectors v_1, v_2, ..., v_n are a basis for V. Then
there exists a unique linear map h : V → W with h(v_i) = w_i for
1 ≤ i ≤ n. That is, a linear map is completely specified by prescribing
the images of basis vectors, and any such prescription determines a
linear map.
Proof. (a) We have to show that f(v) = g(v) for every v ∈ V.
So let v ∈ V. Since V = [{v_1, v_2, ..., v_n}] by hypothesis, we have
V = span{v_1, v_2, ..., v_n} by Theorem I.20, so there are scalars λ_i
such that v = λ_1 v_1 + λ_2 v_2 + ... + λ_n v_n. But then, using
Proposition III.8 twice, we have indeed
f(v) = f(λ_1 v_1 + λ_2 v_2 + ... + λ_n v_n)
     = λ_1 f(v_1) + λ_2 f(v_2) + ... + λ_n f(v_n)
     = λ_1 g(v_1) + λ_2 g(v_2) + ... + λ_n g(v_n)
     = g(λ_1 v_1 + λ_2 v_2 + ... + λ_n v_n)
     = g(v).
(b) For any v ∈ V there exist unique scalars λ_i such that
v = λ_1 v_1 + λ_2 v_2 + ... + λ_n v_n, by Proposition II.13. We can thus
define the map h by the formula
h(v) = λ_1 w_1 + λ_2 w_2 + ... + λ_n w_n.
By uniqueness of the scalars used, there is no ambiguity: the map h is
well defined. For any index i we clearly have h(v_i) = w_i (since then,
by uniqueness, λ_i = 1, and λ_j = 0 if j ≠ i).
This map h is linear. For let u, v ∈ V be given, and write
u = μ_1 v_1 + ... + μ_n v_n and v = λ_1 v_1 + ... + λ_n v_n. Then necessarily
u + v = (μ_1 + λ_1)v_1 + (μ_2 + λ_2)v_2 + ... + (μ_n + λ_n)v_n,
and thus, using the definition of h three times,
h(u) + h(v) = h(μ_1 v_1 + ... + μ_n v_n) + h(λ_1 v_1 + ... + λ_n v_n)
            = μ_1 w_1 + ... + μ_n w_n + λ_1 w_1 + ... + λ_n w_n
            = (μ_1 + λ_1)w_1 + ... + (μ_n + λ_n)w_n
            = h((μ_1 + λ_1)v_1 + (μ_2 + λ_2)v_2 + ... + (μ_n + λ_n)v_n)
            = h(u + v),
so that h satisfies condition (Lin 2). Similarly, with v as above, we
have
αv = (αλ_1)v_1 + ... + (αλ_n)v_n
for every scalar α, and thus
αh(v) = α(λ_1 w_1 + ... + λ_n w_n)
      = (αλ_1)w_1 + ... + (αλ_n)w_n
      = h((αλ_1)v_1 + ... + (αλ_n)v_n)
      = h(αv),
so that h also satisfies (Lin 1). That is, h is linear.
Finally, it remains to observe that this map h is necessarily unique
by part (a).
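In coordinates the construction in the proof of Theorem III.9 (b) is very
concrete: if V = R^n with the standard basis e_1, ..., e_n, then the unique
linear map h with h(e_i) = w_i is given by the matrix whose columns are the
w_i. The following sketch (numpy, for illustration, with arbitrarily chosen
vectors w_i) spells this out.

    import numpy as np

    # Prescribed images w_1, w_2, w_3 in R^2 of the standard basis of R^3.
    w = [np.array([1.0, 0.0]),
         np.array([2.0, 1.0]),
         np.array([0.0, -1.0])]

    # The matrix of h has the w_i as its columns.
    H = np.column_stack(w)

    def h(v):
        # h(v) = v_1 w_1 + v_2 w_2 + v_3 w_3, exactly as in the proof.
        return H @ v

    # Check h(e_i) = w_i ...
    for i in range(3):
        e_i = np.eye(3)[:, i]
        assert np.allclose(h(e_i), w[i])

    # ... and linearity on a sample pair of vectors.
    u, v = np.array([1.0, -2.0, 3.0]), np.array([0.0, 5.0, 1.0])
    print(np.allclose(h(2*u + 3*v), 2*h(u) + 3*h(v)))   # True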
We finish this section by pointing out that if U, V and W are vector
spaces over R and f : U → V and g : V → W are linear maps then the
composition g ∘ f : U → W is a linear map as well; the easy proof is
left as an exercise.
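As a quick numerical sanity check of this claim (numpy, with maps chosen
arbitrarily for illustration), one can verify the two conditions of
Definition III.6 for the composition of two concrete linear maps at sample
points.

    import numpy as np

    F = np.array([[1.0, 2.0], [0.0, 1.0], [3.0, -1.0]])   # a linear map f: R^2 -> R^3
    G = np.array([[1.0, 0.0, 2.0], [0.0, 1.0, 1.0]])      # a linear map g: R^3 -> R^2

    f = lambda x: F @ x
    g = lambda y: G @ y
    gf = lambda x: g(f(x))                                 # the composition g o f

    u, v, lam = np.array([1.0, -2.0]), np.array([3.0, 0.5]), 4.0
    print(np.allclose(gf(lam * u), lam * gf(u)))           # True: (Lin 1) holds here
    print(np.allclose(gf(u + v), gf(u) + gf(v)))           # True: (Lin 2) holds here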
3. The kernel and the image
Definition III.10. Let V and W be vector spaces over R and
f : V → W be a linear map. The kernel of f is the set
ker f = {v ∈ V | f(v) = o}.
The image of f is the set
im f = {w ∈ W | there is v ∈ V such that f(v) = w} = {f(v) | v ∈ V}.
Proposition III.11. Let V and W be vector spaces over R and
f : V → W be a linear map. Then ker f is a subspace of V, and im f is
a subspace of W.
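A quick numerical illustration of Proposition III.11 (numpy, with a matrix
chosen only for illustration): for a map f given by a matrix A, the vectors
of ker f are exactly those with Av = o, and sums and scalar multiples of such
vectors visibly satisfy the same equation again.

    import numpy as np

    A = np.array([[1.0, 2.0, 0.0],
                  [2.0, 4.0, 0.0]])          # f: R^3 -> R^2, f(v) = A v

    u = np.array([2.0, -1.0, 0.0])           # A u = 0, so u lies in ker f
    v = np.array([0.0, 0.0, 5.0])            # A v = 0, so v lies in ker f
    print(np.allclose(A @ u, 0), np.allclose(A @ v, 0))               # True True

    # Closure of the kernel under addition and scalar multiplication:
    print(np.allclose(A @ (u + v), 0), np.allclose(A @ (3.0 * u), 0))  # True True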
4. The dimension formula for linear maps
Theorem III.9 (b) allows us to reduce many questions about a linear
map to checking what happens on a set of basis vectors. The following
facts were discussed in the lectures:
Proposition III.12. Suppose that h : V → W is a linear map
between vector spaces. Suppose that v_1, v_2, ..., v_n is a basis of V.
Write w_i = h(v_i) for 1 ≤ i ≤ n.
(a) The vectors w_1, w_2, ..., w_n generate the subspace im h of W. In
particular, dim im h ≤ n = dim V.
(b) The map h is surjective if and only if the vectors w_1, w_2, ..., w_n
generate W. If this is the case, we necessarily have dim W ≤ n =
dim V.
(c) The map h is injective if and only if the vectors w_1, w_2, ..., w_n
are linearly independent. If this is the case, we have dim im h =
n = dim V.
Theorem III.13 (Dimension formula for linear maps). Let V and W
be vector spaces over R. Suppose that V is finite dimensional. Let
f : V → W be a linear map. Then
dim ker f + dim im f = dim V.
Proof. Given in lectures.
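For a map f : R^n → R^m given by a matrix A, the formula can be checked
numerically: dim im f is the rank of A, and a basis of ker f can be read off
from the singular value decomposition. A numpy sketch, with the matrix chosen
only for illustration:

    import numpy as np

    A = np.array([[1.0, 2.0, 0.0, 1.0],
                  [0.0, 1.0, 1.0, 1.0],
                  [1.0, 3.0, 1.0, 2.0]])       # f: R^4 -> R^3; third row = first + second

    n = A.shape[1]                               # dim V = 4
    dim_im = np.linalg.matrix_rank(A)            # = 2

    # Kernel basis: right singular vectors belonging to (numerically) zero
    # singular values.
    _, s, Vt = np.linalg.svd(A)
    s = np.append(s, np.zeros(n - len(s)))       # pad so s has one entry per row of Vt
    kernel_basis = Vt[s < 1e-10]
    dim_ker = kernel_basis.shape[0]              # = 2

    print(np.allclose(A @ kernel_basis.T, 0))    # True: these vectors lie in ker f
    print(dim_ker + dim_im == n)                 # True, as the dimension formula predicts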
5. Isomorphisms
Definition III.14. Let V and W be vector spaces.
(i) A linear map f : V → W that is a bijection is called an
isomorphism of V onto W.
(ii) We say that V is isomorphic to W if there exists an isomorphism
of V onto W.
The term isomorphic (coming from the Greek isos = equal and morphe =
form or shape) used in the above definition is just a fancy word for
saying that V and W are indistinguishable from each other, although,
strictly speaking, they may be different!
Remark III.15. Given two vector spaces V and W, we can speak of
V and W being isomorphic, rather than of V being isomorphic to W. This
is because if V is isomorphic to W via a linear bijection f : V → W
then, by considering the inverse mapping f^{-1} : W → V (which is again
a linear bijection), we see that W is also isomorphic to V.
We also see that the relation of being isomorphic possesses the
following property: if U is isomorphic to V and V is isomorphic to W,
then U is isomorphic to W. By considering the identity map, it is also
clear that any vector space is isomorphic to itself.
The big surprise is that, for a given dimension, say n ∈ N, there
is "only one" vector space over R with dimension n; that is, any two
vector spaces over R of dimension n are isomorphic and thus formally
indistinguishable.
Theorem III.16. Let V and W be vector spaces over R of dimension
n. Then V is isomorphic to W.
Proof. Let v_1, v_2, ..., v_n be a basis of V and let w_1, w_2, ..., w_n
be a basis of W. By Theorem III.9 (b), there exists a unique linear map
f : V → W with f(v_i) = w_i for 1 ≤ i ≤ n. Since the w_i form a basis
of W, they generate W and are linearly independent; by Proposition III.12
(b) and (c), the map f is therefore bijective and thus an isomorphism.
Corollary III.17. Let n ∈ N. Every n-dimensional vector space
over R is isomorphic to R^n.
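Corollary III.17 is what justifies doing concrete computations in R^n:
choosing a basis of an n-dimensional space V identifies each vector with its
coordinate vector. A small sketch of this identification (plain Python, for
illustration only) for the space of polynomials of degree at most 2 with
basis 1, x, x^2, a standard example not developed in these notes:

    # Identify a polynomial a + b*x + c*x^2 with its coordinate vector (a, b, c) in R^3.
    def to_coordinates(poly):
        # poly is given as a dict of coefficients, e.g. {0: a, 1: b, 2: c}
        return [poly.get(k, 0) for k in range(3)]

    def from_coordinates(vec):
        return {k: vec[k] for k in range(3)}

    p = {0: 1, 1: -2, 2: 5}          # the polynomial 1 - 2x + 5x^2
    v = to_coordinates(p)            # [1, -2, 5], its coordinate vector in R^3
    print(from_coordinates(v) == p)  # True: the two maps are mutually inverse,
                                     # giving the isomorphism of Corollary III.17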
After Corollary III.17 it is natural to ask why we studied finite
dimensional vector spaces in general and did not confine ourselves to
the much more concrete and understandable R^n. There are at least
two immediate answers to this question: first, we were only able to
realise that there is "only one" (finite dimensional) vector space of
a given dimension after a whole lot of work was done abstractly; and
second, having the abstract perspective on a vector space is very often
useful: being able to distance oneself from the concrete n-tuples of
real numbers is helpful exactly because the exact nature of the vectors
is not important, but only the properties that can be derived from the
axiomatic setup of vector space theory.