
INDIAN INSTITUTE OF TECHNOLOGY, BOMBAY

MA 106 Autumn 2012-13


Note 8
VECTOR (= LINEAR) SPACES
The term vector spaces is a bit misleading. In courses in physics, a vector is defined as an entity which has both a magnitude or length and a direction. Velocity, acceleration and force are given as standard examples of vectors and velocity is specifically distinguished from speed so as to bring out the distinction between a vector and a scalar. Vector addition is defined by the so-called parallelogram law and scalar multiplication by a real number, say $\alpha$, is defined as multiplying the length of the vector by $|\alpha|$ and keeping the same direction if $\alpha > 0$ and reversing it if $\alpha < 0$. We also study various products such as the dot and the cross products.
We want to generalise the concept of a vector so that we can handle vectors other than those involved in physics. One such generalisation is provided by what are called euclidean spaces. For every positive integer $n$ we let $\mathbb{R}^n$ be the set of all ordered n-tuples of real numbers. These ordered n-tuples can be denoted either as row vectors or as column vectors of length $n$. We shall adopt the latter. But it is notationally clumsy to write down a column vector

$$\begin{bmatrix} u_1 \\ u_2 \\ \vdots \\ u_n \end{bmatrix}.$$

So, we often write it as the transpose of a row vector, i.e. as $[u_1\ u_2\ \ldots\ u_n]^t$. In fact, many times we denote this by a boldface letter $\mathbf{u}$ and call $u_1, u_2, \ldots, u_n$ the components of $\mathbf{u}$. In effect, we are treating a member of $\mathbb{R}^n$ as an $n \times 1$ matrix. Not surprisingly, the addition as well as the scalar multiplication for these vectors are defined componentwise, exactly as for matrices.
The concept of a length makes sense for vectors in $\mathbb{R}^n$. If $u \in \mathbb{R}^n$, we define its length (also called its norm, denoted variously by $|u|$ or $\|u\|$) as $\sqrt{u_1^2 + u_2^2 + \ldots + u_n^2}$. Thus the norm is a function from $\mathbb{R}^n$ to the set of non-negative real numbers. It is clear that $|u| > 0$ for all $u \neq 0$. This property is called the positivity of the norm. A less obvious property is the triangle inequality which says that for any two vectors $u, v \in \mathbb{R}^n$, $|u + v| \leq |u| + |v|$, with equality holding if and only if one of the two vectors $u$ and $v$ is a non-negative scalar multiple of the other. The proof follows from a well-known inequality about real numbers called the Cauchy-Schwarz inequality, which says that for any real numbers $u_1, u_2, \ldots, u_n, v_1, v_2, \ldots, v_n$,

$$u_1 v_1 + u_2 v_2 + \ldots + u_n v_n \;\leq\; \sqrt{u_1^2 + u_2^2 + \ldots + u_n^2}\,\sqrt{v_1^2 + v_2^2 + \ldots + v_n^2} \qquad (1)$$

with equality holding if and only if there exists $\lambda \geq 0$ such that either $u_i = \lambda v_i$ for all $i = 1, 2, \ldots, n$ or $v_i = \lambda u_i$ for all $i = 1, 2, \ldots, n$. (A proof of the Cauchy-Schwarz inequality will be indicated in the exercises.)
We can also define the concept of the angle between two (non-zero) vectors, say $u$ and $v$, in $\mathbb{R}^n$. For this, we need to define what is called their inner product, which is a generalisation of the dot product for vectors in $\mathbb{R}^3$. It is defined as the real number $u_1 v_1 + u_2 v_2 + \ldots + u_n v_n$ and denoted by $\langle u, v \rangle$ or simply by $u \cdot v$. Note that since both $u$ and $v$ are matrices of size $n \times 1$, $\langle u, v \rangle$ can also be written as the matrix product $u^t v$, with the understanding that a $1 \times 1$ matrix is to be identified with its lone entry. Note that $|u|$ is precisely $\sqrt{\langle u, u \rangle}$ and the Cauchy-Schwarz inequality can be stated as

$$\langle u, v \rangle \;\leq\; |u|\,|v| \qquad (2)$$

or equivalently, when neither $u$ nor $v$ is the zero vector, as $\dfrac{\langle u, v \rangle}{|u|\,|v|} \leq 1$. We are now justified in defining the angle between $u$ and $v$ as $\cos^{-1}\left(\dfrac{\langle u, v \rangle}{|u|\,|v|}\right)$. (There is no generalisation of the cross product to $\mathbb{R}^n$. It is a peculiarity of $\mathbb{R}^3$.)
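These definitions are easy to experiment with numerically. Here is a small sketch (using NumPy; the particular vectors are arbitrary choices) that computes the norm, the inner product and the angle, and checks the Cauchy-Schwarz inequality:

```python
import numpy as np

u = np.array([1.0, 2.0, 2.0])
v = np.array([3.0, 0.0, 4.0])

inner = np.dot(u, v)                  # <u, v> = u1*v1 + ... + un*vn
norm_u = np.sqrt(np.dot(u, u))        # |u| = sqrt(<u, u>) = 3
norm_v = np.sqrt(np.dot(v, v))        # |v| = 5

# Cauchy-Schwarz inequality (1): <u, v> <= |u| |v|
print(inner <= norm_u * norm_v)       # True (11 <= 15)

# angle between u and v, in radians
theta = np.arccos(inner / (norm_u * norm_v))
print(theta)
```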
We can similarly consider the complex n-dimensional vector space $\mathbb{C}^n$, consisting of all $n \times 1$ matrices with complex entries. Note that this time the scalars are complex numbers. Addition and scalar multiplication are defined componentwise. The length $|u|$ of a vector $u = [u_1\ u_2\ \ldots\ u_n]^t \in \mathbb{C}^n$ is defined as $\sqrt{|u_1|^2 + |u_2|^2 + \ldots + |u_n|^2}$. Note that this equals $\sqrt{u^* u}$ and not $\sqrt{u^t u}$ as in the real case. Similarly, the inner product $\langle u, v \rangle$ is defined as $u^* v$ and not as $u^t v$. This is our first encounter with evidence that for complex matrices the Hermitian adjoint is more relevant than the transpose.
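A short sketch (again with NumPy and arbitrarily chosen vectors) showing why $u^* u$, and not $u^t u$, gives the squared length of a complex vector:

```python
import numpy as np

u = np.array([1 + 1j, 2j])
v = np.array([3 + 0j, 1 - 1j])

# Hermitian inner product <u, v> = u* v; np.vdot conjugates its first argument
inner = np.vdot(u, v)

# the plain transpose gives u^t u, which need not even be real:
print(np.dot(u, u))                 # (-4+2j)  -- not usable as a squared length
print(np.vdot(u, u))                # (6+0j)   -- |u|^2 = |1+i|^2 + |2i|^2 = 2 + 4
print(np.sqrt(np.vdot(u, u).real))  # |u|
```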
In applications to physics, the length of a vector is very relevant. However, in many other applications the length of a vector does not make sense, while the concepts of addition of vectors and scalar multiplication still do. Both these concepts are involved in forming a linear combination, i.e. an expression of the form $\alpha u + \beta v$ where $u, v$ are vectors and $\alpha, \beta$ are scalars. More generally, if we have vectors $v_1, v_2, \ldots, v_n$ and scalars $\alpha_1, \alpha_2, \ldots, \alpha_n$, then we can form the linear combination $\alpha_1 v_1 + \alpha_2 v_2 + \ldots + \alpha_n v_n$. For example, consider the set of polynomials with real coefficients. There is no natural definition of the length of a polynomial. But we can talk about a linear combination of a (finite) number of polynomials.
In real life too, we encounter many situations where the concept of a linear
combination makes sense while that of the length does not. For example, a
diet is a linear combination of unit quantities of certain basic cereals. To
handle such situations mathematically, we look for a mathematical structure
where the concept of a linear combination, but not necessarily that of length,
makes sense. The most appropriate name for such a structure would be a
linear space. But unfortunately, the name vector space has come to stay
for such a structure even though the concept of a length makes no sense in
it in general.
Thus, in essence, a vector space (or more appropriately a linear space) is a set, say $V$, in which it is possible to form linear combinations. That is, given any two elements $u, v \in V$ and any two scalars $\alpha, \beta$, we are given a new element $\alpha u + \beta v$ of the set $V$. The scalars $\alpha, \beta$ could be either real or complex numbers and depending upon that we call $V$ a real or a complex linear space.¹
¹ Actually, it is possible to consider linear spaces in which the scalars range over certain other sets called fields. The sets of real and complex numbers are examples of fields. If $F$ is a field and we have a linear space $V$ in which we consider linear combinations with coefficients coming from $F$, then we say $V$ is a vector space over the field $F$. In this terminology real and complex linear spaces are, respectively, vector spaces over the fields of real and complex numbers. There are many other fields besides $\mathbb{R}$ and $\mathbb{C}$. But we shall have no occasion to consider vector spaces over them. We only remark in passing that there is a very tiny field, denoted by $\mathbb{Z}_2$, which consists of only two elements, 0 and 1, in which $1 + 1 = 0$. The incidence matrix of a graph can be thought of as a matrix over $\mathbb{Z}_2$ and when so done the machinery of abstract vector spaces becomes available to study graph theory and sometimes pays rich dividends.

In any particular example of a linear space, the linear combination of two vectors with given scalars will be defined explicitly in terms of the context relevant to that example. For example, suppose $V$ is the set of all (real) polynomials in a variable $x$. Suppose $p(x) = a_0 + a_1 x + a_2 x^2 + \ldots + a_m x^m$ and $q(x) = b_0 + b_1 x + b_2 x^2 + \ldots + b_n x^n$, and $\alpha, \beta$ are any two real numbers. Then we define $\alpha p(x) + \beta q(x)$ to be the polynomial $r(x) = c_0 + c_1 x + c_2 x^2 + \ldots + c_k x^k$, where $k = \max\{m, n\}$ and, for $i = 0, 1, 2, \ldots, k$, $c_i = \alpha a_i + \beta b_i$, with the understanding that if $m < k$ then $a_i = 0$ for $i = m+1, \ldots, k$ and similarly, if $n < k$ then $b_j = 0$ for $j = n+1, \ldots, k$. Similarly, we can consider the vector space of all complex matrices of some fixed size, say $m \times n$, in which linear combinations are defined entrywise. But these are particular examples. When $V$ is an abstract vector space, the only answer that can be given to the question, "But what do you mean by a linear combination of two vectors in $V$?" is "It is some element of $V$. We do not know the rule by which it is defined."
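To make the polynomial example concrete, here is a sketch of the addition rule defined above (plain Python; the helper name lin_comb_poly is ours, not from any library):

```python
def lin_comb_poly(alpha, p, beta, q):
    """Return the coefficients of alpha*p(x) + beta*q(x).

    p and q are lists of real coefficients [a_0, a_1, ..., a_m],
    padded with zeros up to the common degree k = max(m, n).
    """
    k = max(len(p), len(q))
    p = p + [0.0] * (k - len(p))      # a_i = 0 for i beyond the degree of p
    q = q + [0.0] * (k - len(q))      # b_j = 0 for j beyond the degree of q
    return [alpha * a + beta * b for a, b in zip(p, q)]

# 2*(1 + x) + 3*(x^2) = 2 + 2x + 3x^2
print(lin_comb_poly(2.0, [1.0, 1.0], 3.0, [0.0, 0.0, 1.0]))   # [2.0, 2.0, 3.0]
```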
This arbitrariness about linear combinations in an abstract vector space would give us unlimited freedom in constructing examples of vector spaces. Take $V$ to be absolutely any set, say, the set of all elephants in India. Given any two elephants $u$ and $v$ and any scalars $\alpha, \beta$, we are free to let $\alpha u + \beta v$ be any elephant, as long as it is in India! Any theorems about vector spaces would be applicable to this set of elephants and provide the strongest evidence of the relevance of mathematics to real life!

The trouble is that there are hardly any theorems worth the name about abstract vector spaces if all we know about them is that they contain linear combinations of their elements. In mathematics, the truth of every statement depends upon the truth of some other statements. Often these latter statements are themselves proved using some more elementary statements. But ultimately we reach a stage where the truth of some statements has to be taken for granted. These statements are called axioms. Every branch of mathematics has its own system of axioms. These axioms are often incorporated into the definition. Thus, instead of defining a vector space merely as a set in which for every two elements $u, v$ and every two scalars $\alpha, \beta$ an element $\alpha u + \beta v$ is assigned, we define it as a set in which this assignment is subject to some restrictions, called the axioms of a vector space. Without these axioms it will be impossible to prove any results about abstract vector spaces. So many manipulations which we do instinctively will become impossible. There is no guarantee, for example, that $u + v$ would equal $v + u$. For this to happen, commutativity has to be included as an axiom.
Just exactly how many and which axioms should be made a part of the definition of an abstract vector space is a matter of debate. Too few or too weak axioms will not enable us to prove any non-trivial results about vector spaces. On the other hand, too many or too restrictive axioms would narrow down the totality of examples of vector spaces and hence considerably reduce the scope of applicability of the powerful implications of these strong axioms. In other words, depth and generality are conflicting virtues and some balance has to be struck between them. It is usually only time which decides which is the right balance.

In the case of abstract vector spaces, this balance was reached a long time ago. So, today everybody agrees which axioms are to be a part of the definition of an abstract vector space. Instead of specifying these axioms in terms of linear combinations, it is convenient to split them into two parts, viz. the axioms about vector addition and the axioms about scalar multiplication. Both of these are listed on p. 359 of the text and will not be repeated here. (As a minor difference of notation, we have been denoting scalars by Greek letters. In the text, roman letters or numbers are used for them.)

Because of the restrictions imposed by these axioms of a vector space, the set of all elephants in India is no longer a vector space and therefore the theorems we prove using these axioms will not be applicable to it. There is no help for this. Loss of generality is the inevitable price to be paid for increased depth.
But there are many examples of vector spaces. We already have $\mathbb{R}^n$ and $\mathbb{C}^n$ as foremost examples. It is also easy to see that the set of all polynomials in a variable $x$ is a vector space. More generally, the set of all functions from some set $S$ to $\mathbb{R}$ is a vector space if the operations of addition of functions and multiplication of a function by a scalar are defined pointwise. As a special case, the sets of matrices $M_{mn}(\mathbb{R})$ and $M_{mn}(\mathbb{C})$ are vector spaces for fixed values of $m$ and $n$. We shall encounter many other examples. So, despite the rather long list of axioms of a vector space, we have a huge collection of vector spaces. Had we insisted upon making the length of a vector an indispensable part of the definition of an abstract vector space, we would have had to miss many of these examples.
But what about the theorems about abstract linear spaces? If we are not able to prove some non-trivial theorems about abstract linear spaces, then there is no point in the abstraction. Fortunately, on this count too, the definition passes the bar. Naturally, the initial consequences immediately derivable from the axioms will be very elementary, almost to the point of being trivial. For example, because of associativity of addition of vectors, we see that the linear combination $\alpha_1 v_1 + \alpha_2 v_2 + \alpha_3 v_3$ has only one interpretation. This is a useful but hardly a profound result. But as we define and study more concepts, we shall encounter more non-trivial results.

We begin by defining the concept of a subspace of an (abstract) vector space, say $V$. This concept is important because many times our interest is not so much in the entire space but in some subspace relevant for our purpose. The definition is motivated by the requirement that lines or planes passing through the origin should be subspaces of $\mathbb{R}^3$. Since a linear combination is the most basic concept in a linear space, it is natural to expect that a subspace should be something which is closed under linear combinations, that is, any linear combination of any two vectors in that subset should also belong to that subset. (It is easily seen that in $\mathbb{R}^3$, every line as well as every plane passing through the origin has this property.)
So, formally, we say that a subset, say $W$, of a vector space $V$ is a subspace of $V$ if it is closed under linear combinations, i.e. whenever $u, v \in W$, $\alpha u + \beta v \in W$ for any scalars $\alpha$ and $\beta$. It is easy to check that this single condition is equivalent to the two conditions together, viz. closure under addition and closure under scalar multiplication. (In fact, this observation applies in general. Whenever anything is to be checked for linear combinations, it is enough to check it separately for sums and for scalar multiplication.) It is easily seen that in $\mathbb{R}^3$ every line and every plane passing through the origin is a subspace. In $\mathbb{R}^2$, neither the positive quadrant nor the union of the two axes is a subspace of $\mathbb{R}^2$. The former is closed under addition of vectors but not under scalar multiplication, while the latter is closed under scalar multiplication but not under addition of vectors. For a fixed positive integer $n$, the set of all polynomials of degree $n$ or less (including the zero polynomial) is a subspace of the vector space of all polynomials defined above.
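The two counterexamples in $\mathbb{R}^2$ can be checked with tiny numerical experiments (a sketch using NumPy; the test vectors are arbitrary):

```python
import numpy as np

def in_positive_quadrant(v):
    return v[0] >= 0 and v[1] >= 0

def on_union_of_axes(v):
    return v[0] == 0 or v[1] == 0

u = np.array([1.0, 2.0])
# the positive quadrant is closed under addition, but a negative scalar leaves it:
print(in_positive_quadrant(u + u), in_positive_quadrant(-1.0 * u))   # True False

e1, e2 = np.array([1.0, 0.0]), np.array([0.0, 1.0])
# the union of the two axes is closed under scalar multiples, but not under addition:
print(on_union_of_axes(3.0 * e1), on_union_of_axes(e1 + e2))         # True False
```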
Especially important subspaces are those generated by a subset $S$ of $V$. Assume first that $S$ is finite, say $S = \{v_1, v_2, \ldots, v_n\}$. Then the set of all possible linear combinations of these vectors is called the span (or, more precisely, the linear span) of $S$ and is generally denoted by $L(S)$. It is easily seen that $L(S)$ is a subspace of $V$ because a linear combination of two linear combinations of the $v$'s is again a linear combination of them. It is easy to show, in fact, that $L(S)$ is the smallest subspace of $V$ which contains the set $S$. For this reason, $L(S)$ is also called the subspace generated (or spanned) by $S$. We can also define $L(S)$ when $S$ is an infinite set, as the set of all possible finite linear combinations of vectors in $S$.

In this note, matrices have figured only as examples of vector spaces but not as tools. As we shall see later, determining whether a given vector $v$ in a particular vector space $V$ lies in the span of a given finite subset of $V$ often amounts to solving a system of linear equations. So matrices are important computational tools in vector spaces.
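As a preview of that computation: $v$ lies in the span of $v_1, \ldots, v_k$ exactly when the system $Ax = v$ is consistent, where the columns of $A$ are the $v_i$. A sketch (NumPy; the vectors are arbitrary choices):

```python
import numpy as np

v1 = np.array([1.0, 0.0, 1.0])
v2 = np.array([0.0, 1.0, 1.0])
v  = np.array([2.0, 3.0, 5.0])         # = 2*v1 + 3*v2, so it lies in the span

A = np.column_stack([v1, v2])          # columns are the spanning vectors
# v is in L({v1, v2}) iff appending v as a column does not raise the rank
in_span = np.linalg.matrix_rank(np.column_stack([A, v])) == np.linalg.matrix_rank(A)
print(in_span)                         # True
```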
INDIAN INSTITUTE OF TECHNOLOGY, BOMBAY
MA 106 Autumn 2012-13
Note 9
VECTOR SPACE BASICS
In the last note we defined abstract vector spaces, more appropriately called linear spaces because the central idea in them is that of a linear combination and not that of length. We now define the basic concepts about abstract vector spaces. Many authors (possibly worried that their readers may dread abstraction!) define these concepts only for the two prime examples of vector spaces, viz. $\mathbb{R}^n$ and $\mathbb{C}^n$. And, as we shall see later, if we confine ourselves only to what are called finite dimensional vector spaces, then there is some justification for this, because every finite dimensional space can be shown to be isomorphic to one of these model spaces. However, in applications we do come across infinite dimensional vector spaces, such as the space of all real or complex valued functions on a set. Secondly, even if we consider a finite dimensional vector space, it is often preferable to study it directly than by appealing to its isomorphism with some euclidean space, just as it is better to answer a question in English directly instead of first translating it into Hindi (or some other language you are comfortable with), answering it in Hindi and then translating the answer back into English!

With this preamble, we now define one of the most basic concepts in a vector space, viz. that of linear dependence or independence. Unless otherwise stated, we shall let $V$ be the vector space under consideration.
Definition 1: A collection of vectors $v_1, v_2, \ldots, v_k$ in $V$ is said to be linearly dependent if some non-trivial linear combination of these vectors vanishes, i.e. there exist scalars $\alpha_1, \alpha_2, \ldots, \alpha_k$, not all zero, such that

$$\alpha_1 v_1 + \alpha_2 v_2 + \ldots + \alpha_k v_k = 0 \qquad (1)$$

If no such $\alpha$'s exist then the vectors $v_1, v_2, \ldots, v_k$ are said to be linearly independent.

Equivalently, $v_1, v_2, \ldots, v_k$ are linearly independent if whenever an equation like (1) holds, all the $\alpha$'s vanish.
A couple of comments are in order. First, linear dependence or independence is a property of the collection and not of the individual members. There is nothing like a particular vector being linearly dependent or independent. The same vector may belong to two different collections of which one is linearly dependent and the other is linearly independent. So, it is a little misleading to say "a collection of linearly independent vectors". A more appropriate wording is "a linearly independent collection of vectors". But the misleading diction is commonly used.

Another point to note is that the members of the collection need not be distinct. Normally, when we list the elements of a set, we avoid repetitions. But occasionally, we consider collections in which some elements are repeated. For example, when we talk of the collection of all columns of a matrix, it may happen that two (or more) columns are identical. That is why we allow repetitions. (We also encounter a similar situation when some polynomial has multiple roots.) Of course, a collection in which some vector is repeated is always linearly dependent. (For example, if $v_1 = v_3$ (say), then (1) is satisfied with $\alpha_1 = 1$, $\alpha_3 = -1$ and $\alpha_i = 0$ for all other $i$'s.)
It is easy to characterise linearly dependent collections of vectors. Sometimes this characterisation is taken as the definition.

Theorem 1: A (finite) collection $v_1, v_2, \ldots, v_k$ is linearly dependent if and only if some $v_i$ is a linear combination of the remaining vectors.
Proof: Assume $v_1, v_2, \ldots, v_k$ are linearly dependent. Then (1) holds for some scalars $\alpha_1, \alpha_2, \ldots, \alpha_k$, at least one of which, say $\alpha_i$, is non-zero. Then we can divide (1) by $\alpha_i$ throughout and can write

$$v_i = -\frac{\alpha_1}{\alpha_i} v_1 - \ldots - \frac{\alpha_{i-1}}{\alpha_i} v_{i-1} - \frac{\alpha_{i+1}}{\alpha_i} v_{i+1} - \ldots - \frac{\alpha_k}{\alpha_i} v_k \qquad (2)$$

which shows that $v_i$ is a linear combination of the remaining $v$'s. Conversely, assume some $v_j$ is a linear combination of the remaining $v$'s, say

$$v_j = \beta_1 v_1 + \ldots + \beta_{j-1} v_{j-1} + \beta_{j+1} v_{j+1} + \ldots + \beta_k v_k \qquad (3)$$

Then we take $\alpha_j = -1$ and $\alpha_i = \beta_i$ for all $i \neq j$. With these values it is easily seen that (1) holds. Since $\alpha_j = -1 \neq 0$, it follows that the $v$'s are linearly dependent.
Trivially, any collection which contains the zero vector is always linearly dependent. A collection of two vectors, say $v_1$ and $v_2$, is linearly dependent if and only if one of the two vectors is a scalar multiple of the other. This is a direct consequence of the theorem above. It is also clear that a subcollection of a linearly independent collection is linearly independent and a superset of a linearly dependent set is linearly dependent.

We can view some of our earlier work in the terminology of linear dependence and independence. Suppose $A$ is an $n \times n$ matrix. If we let $V$ be the vector space of all row vectors of length $n$ (rather than column vectors of length $n$), then the rows of $A$ form a collection of vectors in $V$. Exercise 11 of Tutorial 8 can be paraphrased to say that the rows of $A$ are linearly independent if and only if $\det(A) \neq 0$, with a similar statement holding for the columns of $A$. Thus in this case, the maximum number of linearly independent rows of $A$ equals the rank of $A$, both being equal to $n$. Actually, the equality of the rank of $A$ with the maximum number of linearly independent rows of $A$ can be proved for any matrix, not necessarily a square matrix. But first we need to do some spadework.
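A quick numerical check of this connection between the determinant, the rank and the number of independent rows (a sketch using NumPy; the matrix is an arbitrary singular example):

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [0.0, 1.0, 4.0],
              [1.0, 3.0, 7.0]])            # third row = first row + second row

print(np.isclose(np.linalg.det(A), 0.0))   # True: the rows are linearly dependent
print(np.linalg.matrix_rank(A))            # 2 = maximum number of independent rows
```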
The maximum number of linearly independent vectors in a given vector space $V$ is called its dimension. The trouble with this definition is that this number may not be finite. What if we keep on getting bigger and bigger collections of vectors that are linearly independent? The following result puts a stopper on this process if the vector space $V$ is spanned by some finite subset.
Theorem 2: Any n + 1 (or more) vectors in the span of any n vectors in
any vector space are linearly dependent.
Proof: This can be proved by induction on $n$. But we shall give a proof based on row reduction (or, rather, its consequences). Suppose $w_1, w_2, \ldots, w_n, w_{n+1}$ are vectors in the linear span of vectors $v_1, v_2, \ldots, v_n$ in some vector space $V$. We have to prove that they are linearly dependent. We are given that there exist scalars $a_{ij}$ such that the following system of equations holds.

$$\begin{aligned}
w_1 &= a_{11} v_1 + a_{12} v_2 + \ldots + a_{1n} v_n \\
w_2 &= a_{21} v_1 + a_{22} v_2 + \ldots + a_{2n} v_n \\
&\;\;\vdots \\
w_n &= a_{n1} v_1 + a_{n2} v_2 + \ldots + a_{nn} v_n \\
w_{n+1} &= a_{n+1,1} v_1 + a_{n+1,2} v_2 + \ldots + a_{n+1,n} v_n
\end{aligned}$$

Now let $A = (a_{ij})$ be the coefficient matrix. This is an $(n+1) \times n$ matrix and so its rank can be at most $n$. Hence the rank of its transpose, $A^t$, is also less than $n+1$. Therefore, by row reduction, the homogeneous system $A^t x = 0$ of $n$ linear equations in $n+1$ unknowns has at least one non-trivial solution, say $c = [c_1\ c_2\ \ldots\ c_n\ c_{n+1}]^t$. Here at least one $c_j \neq 0$. If we expand the system $A^t c = 0$, we get that for every $i = 1, 2, \ldots, n$,

$$\sum_{j=1}^{n+1} a_{ji} c_j = 0 \qquad (4)$$

Now consider the linear combination $\sum_{j=1}^{n+1} c_j w_j$. Since each $w_j$ is a linear combination of $v_1, v_2, \ldots, v_n$, by collecting the coefficients of the $v$'s together, we have

$$\sum_{j=1}^{n+1} c_j w_j = \left(\sum_{j=1}^{n+1} a_{j1} c_j\right) v_1 + \left(\sum_{j=1}^{n+1} a_{j2} c_j\right) v_2 + \ldots + \left(\sum_{j=1}^{n+1} a_{jn} c_j\right) v_n \qquad (5)$$

But by (4), the sum in each parenthesis vanishes. So the linear combination $c_1 w_1 + c_2 w_2 + \ldots + c_n w_n + c_{n+1} w_{n+1}$ is the zero vector. Since at least one $c_j$ is non-zero, we get that the collection $w_1, w_2, \ldots, w_n, w_{n+1}$ is linearly dependent.
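The argument is constructive enough to be simulated numerically. A sketch (NumPy, with $n = 2$ and arbitrarily chosen $v_1, v_2$ and coefficients $a_{ij}$):

```python
import numpy as np

v1 = np.array([1.0, 0.0, 2.0])
v2 = np.array([0.0, 1.0, 1.0])

# three vectors in the span of v1, v2; the coefficient matrix A is 3 x 2
A = np.array([[1.0,  1.0],     # w1 =  v1 +  v2
              [2.0, -1.0],     # w2 = 2v1 -  v2
              [1.0,  3.0]])    # w3 =  v1 + 3v2
W = A @ np.vstack([v1, v2])    # rows of W are w1, w2, w3

# a non-trivial solution c of A^t c = 0, read off from the SVD of A^t
c = np.linalg.svd(A.T)[2][-1]
print(np.allclose(A.T @ c, 0), np.allclose(c @ W, 0))   # True True: c1*w1 + c2*w2 + c3*w3 = 0
```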
This theorem makes it possible to define the dimension and another very important concept, called a basis, for vector spaces which are spanned (or generated) by some finite subset, say $S$. (Such spaces are also often called finite dimensional. But it is a little awkward to use this term before we have defined dimension.) For, if the spanning set $S$ has $k$ elements (say), then no linearly independent set in $V$ can contain more than $k$ elements. We are now justified in defining the dimension of $V$ as the size of the largest linearly independent subset of $V$. Every such subset is called a basis for $V$. Note that a basis is not unique. If $B = \{v_1, v_2, \ldots, v_n\}$ is a basis for $V$, so are $\{2v_1, v_2, \ldots, v_n\}$ or $\{v_1 + v_2, v_2, \ldots, v_n\}$ and many others. In fact, as we shall soon see, any linearly independent subset of $V$ can be enlarged to a basis of $V$. (This is not such a trivial observation as it sounds. For example, a builder may be able to build a 20 storey tower. But that does not mean that he can build 5 more floors on a 15 storey building already constructed by somebody else. The structure of that building may not be strong enough to support the additional floors.)
The following theorem lists a few basic properties of bases.
Theorem 3: Suppose $B$ is a basis for an n-dimensional vector space $V$. Then,
(i) every element of $V$ can be expressed as a unique linear combination of elements of $B$;
(ii) every linearly independent subset of $V$ which spans $V$ is a basis;
(iii) every linearly independent subset of $V$ can be enlarged to a basis of $V$.
Proof: Assume $B = \{v_1, v_2, \ldots, v_n\}$. For (i), we first show that $B$ spans $V$. Let $v \in V$. If some $v_i$ equals $v$ then trivially $v \in L(B)$. Otherwise, as $n$ is the maximum number of linearly independent vectors in $V$, the set $\{v_1, v_2, \ldots, v_n, v\}$ is linearly dependent. So, there exist $\alpha_1, \alpha_2, \ldots, \alpha_n$ and $\beta$, not all 0, such that $\sum_{i=1}^{n} \alpha_i v_i + \beta v = 0$. Note that $\beta \neq 0$, as otherwise we shall get a non-trivial vanishing linear combination of the $v_i$'s, contradicting the linear independence of $B$. As $\beta \neq 0$, we can write $v = \sum_{i=1}^{n} \left(-\frac{\alpha_i}{\beta}\right) v_i$, which shows that $v \in L(B)$.

So far we have proved that every $v \in V$ can be expressed as some linear combination of elements of $B$. To prove uniqueness, assume that the same $v \in V$ is expressed both as $\sum_{i=1}^{n} \alpha_i v_i$ and also as $\sum_{i=1}^{n} \beta_i v_i$. Then we have $\sum_{i=1}^{n} (\alpha_i - \beta_i) v_i = 0$, which, by the linear independence of the $v_i$'s, means $\alpha_i - \beta_i = 0$, i.e. $\alpha_i = \beta_i$ for every $i = 1, 2, \ldots, n$. This completes the proof of (i).

For (ii), suppose some linearly independent subset, say $C = \{w_1, w_2, \ldots, w_k\}$, spans $V$. By (i), all the $w_j$'s lie in $L(B)$ and so by Theorem 2, we have $k \leq n$. But as $C$ spans $V$, $v_1, v_2, \ldots, v_n$ all lie in $L(C)$. So again by Theorem 2, we get $n \leq k$. Hence $k = n$. So $C$ also is a largest linearly independent set and hence is a basis for $V$.

Finally, for (iii), suppose some $C = \{w_1, w_2, \ldots, w_k\}$ is linearly independent. If $C$ spans $V$ we are done by (ii). Otherwise take any $u_1 \notin L(C)$. Using the same argument that was used in the first part of the proof of (i) (but in a contrapositive manner), we get that the set $C \cup \{u_1\}$ is linearly independent. Again, if this set spans $V$, by (ii) it is a basis. Otherwise, there is some $u_2 \in V$ such that $C \cup \{u_1, u_2\}$ is linearly independent. But by Theorem 2, this process cannot go on indefinitely. So at some stage, we shall get a superset of $C$ which is a basis for $V$.
By (ii) we see that for every positive integer $n$, the dimension of $\mathbb{R}^n$ (or $\mathbb{C}^n$) is indeed $n$, because the set $\{e_1, e_2, \ldots, e_n\}$ (where $e_k$ has 1 as its $k$-th entry and 0 everywhere else) both spans $\mathbb{R}^n$ (or $\mathbb{C}^n$) and is linearly independent. So the name n-dimensional euclidean space is not a misnomer. The basis just cited is called the standard basis for $\mathbb{R}^n$ or $\mathbb{C}^n$, as the case may be. (By convention, we let the empty set be a basis for the trivial zero space consisting of a single zero vector. Its dimension is zero.)
Thus a basis is both a spanning and a linearly independent set. Add one more element to a basis and it will cease to be linearly independent. Remove one element from a basis and it will cease to span the vector space. A basis can, therefore, be viewed as a set which achieves a critical balance of these two virtues. It is possible to start with any spanning set and shrink it to a basis. This construction of a basis is dual to that in Part (iii) of the theorem above. It will be given as an exercise. A few other results about bases will also be given as exercises. When $V$ is not finite dimensional, in view of (ii) we can still define a basis for $V$ as a linearly independent subset of $V$ which also spans $V$. But even proving the existence of such a set is a challenge we shall not undertake. Forget about finding it!
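A sketch of this shrink-to-a-basis procedure for column vectors (NumPy rank checks; the function name and the sample vectors are our own choices):

```python
import numpy as np

def shrink_to_basis(spanning_vectors):
    """Greedily keep a vector only if it is not already in the span of the ones kept,
    detected by checking whether it increases the rank."""
    basis = []
    for v in spanning_vectors:
        candidate = basis + [v]
        if np.linalg.matrix_rank(np.column_stack(candidate)) == len(candidate):
            basis = candidate
    return basis

S = [np.array([1.0, 0.0, 1.0]),
     np.array([2.0, 0.0, 2.0]),      # multiple of the first, gets dropped
     np.array([0.0, 1.0, 0.0]),
     np.array([1.0, 1.0, 1.0])]      # sum of the two kept vectors, gets dropped
print(len(shrink_to_basis(S)))        # 2 = dimension of the span of S
```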
Another noteworthy consequence of Theorem 2 is the following.
Theorem 4: Suppose $S$ is a finite spanning set for a vector space $V$. Then $\dim(V)$ equals the maximum number of linearly independent elements of $S$.

Proof: Let $T = \{w_1, w_2, \ldots, w_r\}$ be a maximal linearly independent subset of $S$. Then surely $\dim(V) \geq r$. For the other way inequality, note first that every element of $S$ is in the span of $T$ by Theorem 1. But then every element of $V$ is also in the span of $T$. So by Theorem 2, $\dim(V) \leq r$.

We now go back to showing the equality of the rank and the row rank (defined as the maximum number of linearly independent rows) for any $m \times n$ matrix, say $A$. By the theorem above, the latter equals the dimension of the row space of $A$, i.e. the subspace of $\mathbb{R}^n$ spanned by the rows of $A$. It is easy to show (see Problem 10 of Tutorial 4) that under any row operations, the row space and hence the row rank of a matrix remains unchanged. We already know that this holds for the rank. Hence to prove the equality of the rank and the row rank of a matrix, it suffices to do so when the matrix is in a row echelon form. For such matrices the result is obvious.

What is not so obvious is that the row rank of $A$ also equals the column rank of $A$, defined analogously. A direct proof of this fact is, in fact, not so easy. But with determinants it is easy. The column rank of $A$ equals the row rank of $A^T$ and hence the rank of $A^T$. But by the properties of determinants, that is the same as the rank of $A$.
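A numerical spot-check of the equality of the row rank and the column rank (NumPy; the matrix is an arbitrary rank-2 example):

```python
import numpy as np

A = np.array([[1.0, 2.0, 0.0, 1.0],
              [0.0, 1.0, 1.0, 0.0],
              [1.0, 3.0, 1.0, 1.0]])   # 3 x 4; the third row is the sum of the first two

# the rank of A and of its transpose coincide
print(np.linalg.matrix_rank(A), np.linalg.matrix_rank(A.T))   # 2 2
```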
INDIAN INSTITUTE OF TECHNOLOGY, BOMBAY
MA 106 Autumn 2012-13
Note 10
LINEAR TRANSFORMATIONS
So far we studied one vector space (= linear space) at a time. We often have to consider a transition from one vector space to another, in simpler language a function from one vector space, say $V$, to another, say $W$. There is not much point in considering arbitrary functions, because such a function is more appropriately a function merely from the set $V$ to the set $W$. If this function is to throw some light on the relationship between properties of $V$ and $W$ as vector spaces (e.g. their dimensions), it better preserve or be compatible with the vector space structure, viz. the concept of a linear combination. This is the motivation behind the definition of a linear transformation, say $T : V \to W$. Simply stated, it is a function² which preserves linear combinations, i.e. which has the property that for all scalars³ $\alpha_1, \alpha_2$ and vectors $v_1, v_2 \in V$ we have

$$T(\alpha_1 v_1 + \alpha_2 v_2) = \alpha_1 T(v_1) + \alpha_2 T(v_2) \qquad (1)$$

As usual, the single condition (1) is equivalent to two separate ones, viz. preservation of addition and that of scalar multiplication, i.e. $T(v_1 + v_2) = T(v_1) + T(v_2)$ and $T(\alpha v) = \alpha T(v)$. Also, by induction, if (1) holds true then we can extend it to the linear combination of any $k$ vectors in $V$.

² A function is a general term applied to any rule of correspondence which assigns to each element of a set, called its domain, a unique element of some other (possibly the same) set called its codomain. The terms map, mapping, transformation, operator, functional, form and field are really synonymous with a function. But by convention, they are used in specialised contexts. Thus we talk of a vector field, when it is in fact a vector valued function.

³ Whenever we talk of a linear transformation, it is understood that the scalars in the case of the domain vector space are from the same set as those of the codomain vector space. There is not much point in studying transformations from a real vector space to a complex one, for example. Note also that to avoid unnecessary fussiness, we denote the vector additions in the two spaces by the same symbol + and similarly for the scalar multiplication (for which we often use no symbol). The context usually makes it clear what is meant. Similarly, the zero vector of every vector space will be denoted by the same symbol 0, even though it is a different element for every vector space.
Examples of linear transformations are abundant. The most trivial is the trivial transformation, where $V$ and $W$ can be any vector spaces and the transformation assigns to every vector in $V$ the vector 0 in $W$. Sometimes this transformation is denoted by 0 itself. If $V$ is a subspace of a vector space $W$, then there is the inclusion transformation $i : V \to W$ defined by $i(v) = v$. All that this transformation does is to take an element of $V$ and view it as an element of $W$. An especially important case of this arises when $V = W$. In this case, $i$ is called the identity transformation on $V$ and denoted by $1_V$ or $\mathrm{id}_V$. The verification of preservation of linear combinations in all three cases is trivial.
Usually, whenever we say that some operation or process is linear, this can be paraphrased in the language of a suitable linear transformation. For example, we say that differentiation is a linear process. In elementary calculus this simply means that the derivative of a sum is the sum of the derivatives and the derivative of a constant multiple of a function is that constant times the derivative of that function. We can express this in the language of a linear transformation. Let $W$ be the vector space of all real valued functions on some interval $[a, b]$ and let $V$ be the subspace of those that are differentiable at every point of $[a, b]$. Define $D : V \to W$ by $D(f) = f'$. Then $D$ is a linear transformation. Similarly the definite integral defines a linear transformation from the vector space of all real valued continuous functions on $[a, b]$ to the vector space $\mathbb{R}$.
There are some standard ways to form new linear transformations from old ones. Suppose $T_1, T_2 : V \to W$ are two linear transformations. Define $T_1 + T_2 : V \to W$ by pointwise addition, i.e. $(T_1 + T_2)(v) = T_1(v) + T_2(v)$. It is called the sum of $T_1$ and $T_2$. It is easily checked that this is also a linear transformation from $V$ to $W$. More generally, we can define the linear transformation $\alpha_1 T_1 + \alpha_2 T_2$ for any scalars $\alpha_1, \alpha_2$.
The composite of two linear transformations, when defined, is also a linear transformation, as is easily checked. As with functions, if $T : V \to W$ and $S : W \to U$ are linear transformations, then their composite is denoted by $S \circ T$ (or sometimes simply by $ST$). It is a transformation from $V$ to $U$. The associative law holds for composites. We frequently encounter a situation where $T$ is a linear transformation from a vector space $V$ to itself. In that case, $T \circ T$ is denoted by $T^2$. Taking more composites, we can define $T^k$ for every positive integer $k$. By convention, we set $T^0$ to be $1_V$, the identity transformation on $V$.
Merging these two processes (viz. the composites and linear combinations) gives rise to some linear transformations which are very important in applications. We already introduced the differential operator $D$ as a linear transformation from the space of all differentiable functions on some interval, say $[a, b]$. The operator $D^2$, i.e. $D \circ D$, is defined on the space of all twice differentiable functions. So, on this space we can talk of the differential operator $D^2 + \alpha D + \beta$, where $\alpha, \beta$ are some real numbers. (Here $\beta$ really means $\beta i$ where $i$ is the inclusion function of the space of twice differentiable functions into the space of all functions.) This is a linear transformation defined by $(D^2 + \alpha D + \beta)(f) = f'' + \alpha f' + \beta f$. If we denote $f(x)$ by a variable $y$ (say), then it is also customary to write this transformation by

$$(D^2 + \alpha D + \beta)y = \frac{d^2 y}{dx^2} + \alpha \frac{dy}{dx} + \beta y \qquad (2)$$

An equation which involves such a linear differential operator is called a linear differential equation (of order two in the present situation). Such equations arise naturally in real life situations (e.g. a simple harmonic motion or an RLC circuit) and their solutions are extremely important.
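A quick symbolic check of such an operator (a sketch using SymPy; the particular $\alpha$, $\beta$ and functions are arbitrary choices):

```python
import sympy as sp

x = sp.symbols('x')
alpha, beta = 3, 2                      # arbitrary real coefficients

def L(f):
    """Apply the operator D^2 + alpha*D + beta to the expression f."""
    return sp.diff(f, x, 2) + alpha * sp.diff(f, x) + beta * f

f, g = sp.sin(x), sp.exp(-x)
# linearity: L(2f + 5g) - (2 L(f) + 5 L(g)) simplifies to 0
print(sp.simplify(L(2*f + 5*g) - (2*L(f) + 5*L(g))))   # 0
# e^{-x} solves the linear differential equation L(y) = 0 for these alpha, beta
print(sp.simplify(L(g)))                                # 0
```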
A linear transformation which is also a bijection is called a vector space
isomorphism. It is easy to show that the inverse is also a linear transfor-
mation. Two vector spaces are said to be isomorphic to each other if there
exists a vector space isomorphism between them. From an abstract point of
view, such vector spaces are replicas of each other.
Interesting as these examples are, for our purpose the most important examples of linear transformations are those arising out of matrices. Let $A = (a_{ij})$ be any $m \times n$ matrix with real entries. (The case of a complex matrix is similar.) Let $V$ and $W$ be, respectively, the euclidean spaces $\mathbb{R}^n$ and $\mathbb{R}^m$. We denote their elements by column vectors. Now suppose $v \in \mathbb{R}^n$. Then the matrix $Av$ makes sense and is a column vector of length $m$, i.e. an element of $\mathbb{R}^m$. This way we get a function, say $T : \mathbb{R}^n \to \mathbb{R}^m$. We claim this function is linear. This follows immediately by the distributive law for matrix multiplication, which yields $T(\alpha v + \beta w) = A(\alpha v + \beta w) = A(\alpha v) + A(\beta w) = \alpha Av + \beta Aw = \alpha T(v) + \beta T(w)$. When we want to stress that $T$ is derived from $A$, we denote it by $T_A$ rather than by mere $T$. It is clear that if $B$ is another $m \times n$ matrix then $T_{A+B} = T_A + T_B$. What is more interesting is that if $B$ is an $n \times p$ matrix then the matrix product $AB$ is defined and we have $T_{AB} = T_A \circ T_B$. In other words, matrix multiplication corresponds to the composite of the corresponding linear transformations. In the exercises we shall point out that certain $2 \times 2$ matrices give rise to some very familiar transformations of $\mathbb{R}^2$ to itself, such as rotations, reflections and projections. Later on we shall also extend certain concepts about matrices, such as ranks and eigenvalues, to linear transformations.
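A numerical illustration of $T_{AB} = T_A \circ T_B$ (NumPy; the matrices and the vector are arbitrary):

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [0.0, 1.0],
              [3.0, 0.0]])       # 3 x 2, so T_A : R^2 -> R^3
B = np.array([[1.0, -1.0, 0.0],
              [2.0,  0.0, 1.0]]) # 2 x 3, so T_B : R^3 -> R^2

v = np.array([1.0, 2.0, 3.0])

# T_{AB}(v) equals T_A(T_B(v)): matrix multiplication corresponds to composition
print(np.allclose((A @ B) @ v, A @ (B @ v)))   # True
```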
The following theorem highlights the role of bases in the study of linear transformations whose domains are finite-dimensional.

Theorem 1: Suppose $V$ is a finite dimensional vector space with a basis $B = \{v_1, v_2, \ldots, v_n\}$ and $W$ is any vector space. Then a linear transformation $T : V \to W$ is uniquely determined by its values on $B$, i.e. by the vectors $T(v_1), T(v_2), \ldots, T(v_n) \in W$. Conversely, given any (not necessarily distinct) $w_1, w_2, \ldots, w_n \in W$, there exists a unique linear transformation $T : V \to W$ which satisfies $T(v_i) = w_i$ for $i = 1, 2, \ldots, n$.
Proof: This follows from Part (i) of Theorem 3 in Note 9. Let $v \in V$. Then $v$ can be expressed uniquely as $\sum_{i=1}^{n} \alpha_i v_i$. Then $T(v) = \sum_{i=1}^{n} \alpha_i T(v_i)$ by the linearity of $T$. As the $\alpha$'s are unique (for a given $v$), we see that the values of $T$ on the basis elements determine $T(v)$ uniquely. For the converse too, write a given $v \in V$ as $\sum_{i=1}^{n} \alpha_i v_i$ and define $T(v) = \sum_{i=1}^{n} \alpha_i w_i$. This is a well-defined function from $V$ to $W$ because the $\alpha$'s are unique. As for its linearity, assume $v = \sum_{i=1}^{n} \alpha_i v_i$ and $v' = \sum_{i=1}^{n} \beta_i v_i$. Then, for any scalars $\lambda, \mu$, a direct computation gives

$$T(\lambda v + \mu v') = T\!\left(\sum_{i=1}^{n} \lambda \alpha_i v_i + \sum_{i=1}^{n} \mu \beta_i v_i\right) = T\!\left(\sum_{i=1}^{n} (\lambda \alpha_i + \mu \beta_i) v_i\right) = \sum_{i=1}^{n} (\lambda \alpha_i + \mu \beta_i) w_i = \lambda \sum_{i=1}^{n} \alpha_i w_i + \mu \sum_{i=1}^{n} \beta_i w_i = \lambda T(v) + \mu T(v').$$

Uniqueness of $T$ follows from the first part.
Unlike the earlier examples of linear transformations, which already had some significance as functions, this theorem is a factory for an unlimited supply of linear transformations. All we have to do is to start with any basis for any space and assign arbitrary values (from any other vector space) to its elements, without worrying about their significance. Of course, that is not the way the theorem is used. Instead, the theorem has some interesting consequences, as will be illustrated in the exercises. One of them is that any two vector spaces of equal dimensions are isomorphic to each other.
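For $V = \mathbb{R}^n$ the construction in the theorem can be carried out explicitly: if the columns of an invertible matrix $P$ form the chosen basis and the columns of $Q$ are the prescribed images, the resulting transformation has matrix $QP^{-1}$. A sketch (NumPy; the basis and the images are arbitrary choices):

```python
import numpy as np

# columns of P: a basis {v1, v2} of R^2; columns of Q: prescribed images w1, w2 in R^3
P = np.array([[1.0, 1.0],
              [0.0, 1.0]])
Q = np.array([[1.0, 0.0],
              [2.0, 1.0],
              [0.0, 3.0]])

A = Q @ np.linalg.inv(P)          # matrix of the unique linear T with T(v_i) = w_i
print(np.allclose(A @ P, Q))      # True: each basis vector is sent to its prescribed image
```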
INDIAN INSTITUTE OF TECHNOLOGY, BOMBAY
MA 106 Autumn 2012-13
Tutorial 4
PART A
1. Examine whether the following sets of vectors constitute a vector space.
(In all examples, the vector space operations are the usual ones.) When
they do, obtain bases for them.
(a) The set of all $(x_1, x_2, x_3, x_4)$ in $\mathbb{R}^4$ such that
    (i) $x_4 = 0$; (ii) $x_1 \leq x_2$; (iii) $x_1^2 - x_2^2 = 0$; (iv) $x_1 = x_2 = x_3 = x_4$;
    (v) $x_1 - x_2 = 0$.
(b) The set of all real functions of the form $a \cos x + b \sin x + c$, where $a, b, c$ vary over all real numbers.
(c) Homogeneous polynomials in two variables of degree 3, together with the zero polynomial.
(d) The set of all $n \times n$ real matrices $((a_{ij}))$ which are:
    (i) diagonal; (ii) upper triangular; (iii) having zero trace; (iv) symmetric; (v) skew-symmetric; (vi) invertible.
(e) The set of all real polynomials of degree $\leq 5$, together with the zero polynomial.
(f) The set of all complex polynomials of degree $\leq 5$ with $p(0) = p(1)$, together with the zero polynomial.
(g) The real functions of the form $(ax + b)e^x$, $a, b \in \mathbb{R}$.
2. Prove that property (i) of a basis given in Theorem 3 actually characterises a basis. That is, if $B$ is a subset of a (finite dimensional) vector space $V$ with the property that every $v \in V$ is a unique linear combination of elements of $B$, then $B$ is a basis for $V$.
3. (a) Let $W$ be a subspace of any vector space $V$ and $S$ be any subset of $V$. Prove that $L(S)$ is a subspace of $V$ and, further, if $S \subseteq W$, then $L(S) \subseteq W$. (Verbally, the span of $S$ is the smallest subspace of $V$ that contains $S$.)
   (b) In (a), assume $V$ is finite dimensional. Prove that there exists some linearly independent subset $C$ of $S$ which spans $L(S)$. In particular, if $S$ spans $V$, then $S$ contains some basis of $V$. [Hint: Go on picking elements $u_1, u_2, \ldots$ from $S$ so that the elements picked are linearly independent, until these elements span $L(S)$. This construction of a basis is dual to that in Part (iii) of Theorem 3. There we started with a linearly independent subset and enlarged it to a basis. In this exercise, we start with a spanning set and shrink it to a basis.]
4. Prove that every subspace of a finite dimensional vector space is finite dimensional and that its dimension cannot exceed that of the ambient space. (This is not as obvious as it looks, because similar results do not hold for other containments. For example, the area of a subregion is at most the area of the region, but its perimeter may be greater.) Deduce that the only subspaces of $\mathbb{R}^3$ are the zero space, the lines through the origin, the planes passing through the origin and the entire $\mathbb{R}^3$.
5. Examine whether the following subsets of the set of real valued functions on $\mathbb{R}$ are linearly dependent or independent. Compute the dimension of the subspace spanned by each set.
   (a) $\{1 + t, (1 + t)^2\}$; (b) $\{x, |x|\}$.
6. Examine whether the following sets are linearly independent.
   (a) $\{(a, b), (c, d)\} \subseteq \mathbb{R}^2$, with $ad - bc \neq 0$.
   (b) $\{(1 + i, 2i, 2), (1, 1 + i, 1 - i)\}$ in $\mathbb{C}^3$.
   (c) For $\lambda_1, \ldots, \lambda_k$ distinct real numbers, the set $\{v_1, \ldots, v_k\}$ where $v_i = (1, \lambda_i, \lambda_i^2, \ldots, \lambda_i^{k-1})$.
   (d) $\{e^{\lambda_1 x}, e^{\lambda_2 x}, \ldots, e^{\lambda_n x}\}$ for distinct real numbers $\lambda_1, \lambda_2, \ldots, \lambda_n$.
   (e) $\{1, \cos x, \cos 2x, \ldots, \cos nx\}$.
   (f) $\{1, \sin x, \sin 2x, \ldots, \sin nx\}$.
   (g) $\{e^x, xe^x, \ldots, x^n e^x\}$.
7. Let $P_n[x]$ denote the vector space consisting of the zero polynomial and all real polynomials of degree $\leq n$, where $n$ is fixed. Let $S$ be the set of all polynomials $p(x)$ in $P_n[x]$ satisfying the following conditions. Check whether $S$ is a subspace; if so, compute the dimension of $S$.
   (i) $p(0) = 0$; (ii) $p$ is an odd function; (iii) $p(0) = p'(0) = 0$.
8. Suppose $\{v_1, v_2, \ldots, v_n\}$ is an independent set in a vector space $V$. Suppose elements $u_1, u_2, \ldots, u_m$ and $u$ of $V$ are expressed in terms of the $v$'s as $u_i = \sum_{j=1}^{n} a_{ij} v_j$ and $u = \sum_{j=1}^{n} b_j v_j$. Let $A$ be the $m \times n$ matrix $(a_{ij})$ and $b$ be the row vector $(b_1, b_2, \ldots, b_n)$. Prove that:
   (i) the set $\{u_1, u_2, \ldots, u_m\}$ is linearly independent if and only if the rank of $A$ is $m$;
   (ii) $u$ is in the span of $\{u_1, u_2, \ldots, u_m\}$ if and only if the equation $xA = b$ has a solution.
9. Suppose $B$ and $C$ are two bases for a finite dimensional vector space $V$. Prove that for every $u \in B$, there is some $v \in C$ such that the set $(B - \{u\}) \cup \{v\}$ is a basis for $V$. (This is called the basis exchange property because, verbally, it says that any one element of one basis can be exchanged for some element of the other basis.)
10. Prove that for any set $S = \{v_1, v_2, \ldots, v_m\}$ and for any scalar $\alpha$, the set $\{v_1 + \alpha v_2, v_2, v_3, \ldots, v_m\}$ has the same span as the set $S$. Deduce that the row space of a matrix remains unchanged under row operations.
11. Suppose $A$ is a real $m \times n$ matrix and $T_A$ is the corresponding linear transformation from $\mathbb{R}^n$ to $\mathbb{R}^m$. Prove that the columns of $A$ are precisely the images (under $T_A$) of the elements of the standard basis for $\mathbb{R}^n$. Prove further that every linear transformation $T : \mathbb{R}^n \to \mathbb{R}^m$ is of the form $T_A$ for a unique $m \times n$ matrix $A$. [Hint: Use Theorem 1 of Note 10 for the second part.]
12. Let $T : V \to W$ be a linear transformation. Prove that $T$ always maps a linearly dependent set to a linearly dependent set. Prove further that $T$ is one-to-one if and only if it maps every linearly independent set to a linearly independent set, and also that $T$ is onto if and only if it maps every spanning subset of $V$ to a spanning subset of $W$.

13. For a linear transformation $T : V \to W$ (with $V, W$ finite dimensional) prove that the following conditions are equivalent:
   (i) $T$ is an isomorphism;
   (ii) $T$ maps every basis of $V$ to a basis of $W$;
   (iii) $T$ maps some basis of $V$ to a basis of $W$.
   Deduce that isomorphic spaces have equal dimensions. Also prove the converse using Theorem 1 of Note 10.
PART B
14. Let $V$ be the set of all positive real numbers. For $x, y \in V$ define $x +' y$ to be $xy$ (the ordinary product of $x$ and $y$). For any real scalar $\alpha$ and any $x \in V$, define $\alpha \cdot x$ as $x^{\alpha}$. Prove that with these (unusual) operations of addition and scalar multiplication, $V$ is a vector space.
15. (Charm of axiomatic deductions) Using the axioms of an abstract vector space (given on p. 359 of the text), prove that in any abstract vector space $V$,
   (i) $0 \cdot v = 0$ for every vector $v$ and $\alpha \cdot 0 = 0$ for every scalar $\alpha$;
   (ii) if $\alpha v = 0$, then either $\alpha = 0$ or $v = 0$;
   (iii) if $\alpha v = \alpha w$ and $\alpha \neq 0$, then $v = w$ (cancellation by non-zero scalars);
   (iv) if $\alpha v = \beta v$ and $v \neq 0$, then $\alpha = \beta$ (cancellation by non-zero vectors).
16. Prove that in an abstract vector space the hypothesis of commutativity of the addition of vectors is redundant, in the sense that it can be derived from the remaining axioms, including both the distributive laws and the existence of additive inverses. [Hint: Expand $(1 + 1)(v_1 + v_2)$ in two ways.] (In a tightly framed axiom system, such redundancies are avoided. The best way to do this is to take each axiom and give an example where all except that axiom hold true.)
17. Prove that the operation of taking the limit of a (convergent) sequence defines a linear transformation from the set of all convergent sequences to $\mathbb{R}$ (or to $\mathbb{C}$). Prove the same for limits of functions at some specified point.
18. Show that the exponential and the sine functions satisfy equations involving linear differential operators of orders 1 and 2 respectively.