
Linear Algebra and Matrix Analysis

Vector Spaces
Throughout this course, the base field $F$ of scalars will be $\mathbb{R}$ or $\mathbb{C}$. Recall that a vector space is a nonempty set $V$ on which are defined the operations of addition (for $v, w \in V$, $v + w \in V$) and scalar multiplication (for $\alpha \in F$ and $v \in V$, $\alpha v \in V$), subject to the following conditions:
1. $x + y = y + x$
2. $(x + y) + z = x + (y + z)$
3. There exists an element $0 \in V$ such that $x + 0 = x$ for all $x$
4. For each $x \in V$, there is an element of $V$ denoted $-x$ such that $x + (-x) = 0$
5. $\alpha(\beta x) = (\alpha\beta)x$
6. $\alpha(x + y) = \alpha x + \alpha y$
7. $(\alpha + \beta)x = \alpha x + \beta x$
8. $1x = x$
A subset $W \subseteq V$ is a subspace if $W$ is closed under addition and scalar multiplication, so $W$ inherits a vector space structure of its own.
Examples:
(1) $\{0\}$
(2) $F^n = \left\{ \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix} : \text{each } x_j \in F \right\}$, $n \geq 1$
(3) $F^{m \times n} = \left\{ \begin{pmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{m1} & \cdots & a_{mn} \end{pmatrix} : \text{each } a_{ij} \in F \right\}$, $m, n \geq 1$
(4) $F^\infty = \left\{ \begin{pmatrix} x_1 \\ x_2 \\ \vdots \end{pmatrix} : \text{each } x_j \in F \right\}$
(5) $\ell^1(F) \subseteq F^\infty$, where
$$\ell^1(F) = \left\{ \begin{pmatrix} x_1 \\ x_2 \\ \vdots \end{pmatrix} : \sum_{j=1}^\infty |x_j| < \infty \right\},$$
and $\ell^\infty(F) \subseteq F^\infty$, where
$$\ell^\infty(F) = \left\{ \begin{pmatrix} x_1 \\ x_2 \\ \vdots \end{pmatrix} : \sup_j |x_j| < \infty \right\}.$$
$\ell^1(F)$ and $\ell^\infty(F)$ are clearly subspaces of $F^\infty$.
Let $0 < p < \infty$, and define
$$\ell^p(F) = \left\{ \begin{pmatrix} x_1 \\ x_2 \\ \vdots \end{pmatrix} : \sum_{j=1}^\infty |x_j|^p < \infty \right\}.$$
Since
$$|x + y|^p \leq (|x| + |y|)^p \leq (2\max(|x|, |y|))^p = 2^p \max(|x|^p, |y|^p) \leq 2^p(|x|^p + |y|^p),$$
it follows that $\ell^p(F)$ is a subspace of $F^\infty$.
Exercise: Show that $\ell^p(F) \subseteq \ell^q(F)$ if $0 < p < q \leq \infty$.
(6) Let $X$ be a nonempty set; then the set of all functions $f : X \to F$ has a natural structure as a vector space over $F$: define $f_1 + f_2$ by $(f_1 + f_2)(x) = f_1(x) + f_2(x)$, and define $\alpha f$ by $(\alpha f)(x) = \alpha f(x)$.
(7) For a metric space $X$, let $C(X, F)$ denote the set of all continuous $F$-valued functions on $X$. $C(X, F)$ is a subspace of the vector space defined in (6). Define $C_b(X, F) \subseteq C(X, F)$ to be the subspace of all bounded continuous functions $f : X \to F$.
(8) If $U \subseteq \mathbb{R}^n$ is a nonempty open set and $k$ is a nonnegative integer, the set $C^k(U, F) \subseteq C(U, F)$ of functions all of whose derivatives of order at most $k$ exist and are continuous on $U$ is a subspace of $C(U, F)$. The set $C^\infty(U, F) = \bigcap_{k=0}^\infty C^k(U, F)$ is a subspace of each of the $C^k(U, F)$.
(9) Define $\mathcal{P}(F) \subseteq C^\infty(\mathbb{R}, F)$ to be the space of all $F$-valued polynomials on $\mathbb{R}$:
$$\mathcal{P}(F) = \{a_0 + a_1 x + \cdots + a_m x^m : m \geq 0, \text{ each } a_j \in F\}.$$
Each $p \in \mathcal{P}(F)$ is viewed as a function $p : \mathbb{R} \to F$ given by $p(x) = a_0 + a_1 x + \cdots + a_m x^m$.
(10) Define $\mathcal{P}_n(F) \subseteq \mathcal{P}(F)$ to be the subspace of all polynomials of degree $\leq n$.
(11) Let $V = \{u \in C^2(\mathbb{R}, \mathbb{C}) : u'' + u = 0\}$. It is easy to check directly from the definition that $V$ is a subspace of $C^2(\mathbb{R}, \mathbb{C})$. Alternatively, one knows that
$$V = \{a_1 \cos x + a_2 \sin x : a_1, a_2 \in \mathbb{C}\} = \{b_1 e^{ix} + b_2 e^{-ix} : b_1, b_2 \in \mathbb{C}\},$$
from which it is also clear that $V$ is a vector space.
More generally, if $L(u) = u^{(m)} + a_{m-1} u^{(m-1)} + \cdots + a_1 u' + a_0 u$ is an $m$th order linear constant-coefficient differential operator, then $V = \{u \in C^m(\mathbb{R}, \mathbb{C}) : L(u) = 0\}$ is a vector space. $V$ can be explicitly described as the set of all linear combinations of certain functions of the form $x^j e^{rx}$, where $j \geq 0$ and $r$ is a root of the characteristic polynomial $r^m + a_{m-1} r^{m-1} + \cdots + a_1 r + a_0 = 0$. For details, see Chapter 3 of Birkhoff & Rota.
Convention: Throughout this course, if the field $F$ is not specified, it is assumed to be $\mathbb{C}$.
Linear Independence, Span, Basis
Let $V$ be a vector space. A linear combination of the vectors $v_1, \ldots, v_m \in V$ is a vector $v \in V$ of the form
$$v = \alpha_1 v_1 + \cdots + \alpha_m v_m$$
where each $\alpha_j \in F$. Let $S \subseteq V$ be a subset of $V$. $S$ is called linearly independent if for every finite subset $\{v_1, \ldots, v_m\}$ of $S$,
$$\sum_{i=1}^m \alpha_i v_i = 0 \implies \alpha_1 = \cdots = \alpha_m = 0.$$
Otherwise, $S$ is called linearly dependent. Define the span of $S$ (denoted $\mathrm{Span}(S)$) to be the set of all linear combinations of all finite subsets of $S$. (Note: a linear combination is by definition a finite sum.) If $S = \emptyset$, set $\mathrm{Span}(S) = \{0\}$. $S$ is said to be a basis of $V$ if $S$ is linearly independent and $\mathrm{Span}(S) = V$.
Facts: (a) Every vector space has a basis; in fact, if $S$ is any linearly independent set in $V$, then there is a basis of $V$ containing $S$. The proof of this in infinite dimensions uses Zorn's lemma and is nonconstructive. Such a basis in infinite dimensions is called a Hamel basis. Typically it is impossible to identify a Hamel basis explicitly, and they are of little use. There are other sorts of bases in infinite dimensions defined using topological considerations which are very useful and which we will consider later.
(b) Any two bases of the same vector space $V$ can be put into 1-1 correspondence. Define the dimension of $V$ (denoted $\dim V$) $\in \{0, 1, 2, \ldots, \infty\}$ to be the number of elements in a basis of $V$. The vectors $e_1, \ldots, e_n$, where
$$e_j = \begin{pmatrix} 0 \\ \vdots \\ 1 \\ \vdots \\ 0 \end{pmatrix} \leftarrow j\text{th entry},$$
form the standard basis of $F^n$, and $\dim F^n = n$.
Remark. Any vector space $V$ over $\mathbb{C}$ may be regarded as a vector space over $\mathbb{R}$ by restriction of the scalar multiplication. It is easily checked that if $V$ is finite-dimensional with basis $\{v_1, \ldots, v_n\}$ over $\mathbb{C}$, then $\{v_1, \ldots, v_n, iv_1, \ldots, iv_n\}$ is a basis for $V$ over $\mathbb{R}$. In particular, $\dim_{\mathbb{R}} V = 2 \dim_{\mathbb{C}} V$.
The vectors $e_1, e_2, \ldots \in F^\infty$ are linearly independent. However, $\mathrm{Span}\{e_1, e_2, \ldots\}$ is the proper subset $F^\infty_0 \subsetneq F^\infty$ consisting of all vectors with only finitely many nonzero components. So $\{e_1, e_2, \ldots\}$ is not a basis of $F^\infty$. But $\{x^m : m \in \{0, 1, 2, \ldots\}\}$ is a basis of $\mathcal{P}$.
Now let $V$ be a finite-dimensional vector space, and $\{v_1, \ldots, v_n\}$ be a basis for $V$. Any $v \in V$ can be written uniquely as $v = \sum_{i=1}^n x_i v_i$ for some $x_i \in F$. So we can define a map from $V$ into $F^n$ by $v \mapsto \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}$. The $x_i$'s are called the coordinates of $v$ with respect to the basis $\{v_1, \ldots, v_n\}$. This coordinate map clearly preserves the vector space operations and is bijective, so it is an isomorphism of $V$ with $F^n$ in the following sense.
Definition. Let $V$, $W$ be vector spaces. A map $L : V \to W$ is a linear transformation if
$$L(\alpha_1 v_1 + \alpha_2 v_2) = \alpha_1 L(v_1) + \alpha_2 L(v_2)$$
for all $v_1, v_2 \in V$ and $\alpha_1, \alpha_2 \in F$. If in addition $L$ is bijective, then $L$ is called a (vector space) isomorphism.
Even though every finite-dimensional vector space $V$ is isomorphic to $F^n$, where $n = \dim V$, the isomorphism depends on the choice of basis. Many properties of $V$ are independent of the basis (e.g. $\dim V$). We could try to avoid bases, but it is very useful to use coordinate systems. So we need to understand how coordinates change when the basis is changed.
Change of Basis
Let $V$ be a finite dimensional vector space. Let $\{v_1, \ldots, v_n\}$ and $\{w_1, \ldots, w_n\}$ be two bases for $V$. For $v \in V$, let $x = (x_1, \ldots, x_n)^T$ and $y = (y_1, \ldots, y_n)^T$ denote the vectors of coordinates of $v$ with respect to the bases $B_1 = \{v_1, \ldots, v_n\}$ and $B_2 = \{w_1, \ldots, w_n\}$, respectively. Here $T$ denotes the transpose. So $v = \sum_{i=1}^n x_i v_i = \sum_{j=1}^n y_j w_j$. Express each $w_j$ in terms of $v_1, \ldots, v_n$: $w_j = \sum_{i=1}^n c_{ij} v_i$ ($c_{ij} \in F$). Let
$$C = \begin{pmatrix} c_{11} & \cdots & c_{1n} \\ \vdots & & \vdots \\ c_{n1} & \cdots & c_{nn} \end{pmatrix} \in F^{n \times n}.$$
Then
$$\sum_{i=1}^n x_i v_i = v = \sum_{j=1}^n y_j w_j = \sum_{i=1}^n \left( \sum_{j=1}^n c_{ij} y_j \right) v_i,$$
so $x_i = \sum_{j=1}^n c_{ij} y_j$, i.e. $x = Cy$. $C$ is called the change of basis matrix.
Notation: Horn-Johnson uses $M_{m,n}(F)$ to denote what we denote by $F^{m \times n}$: the set of $m \times n$ matrices with entries in $F$. H-J writes $[v]_{B_1}$ for $x$, $[v]_{B_2}$ for $y$, and ${}_{B_1}[I]_{B_2}$ for $C$, so $x = Cy$
becomes $[v]_{B_1} = {}_{B_1}[I]_{B_2} [v]_{B_2}$. Similarly, we can express each $v_j$ in terms of $w_1, \ldots, w_n$: $v_j = \sum_{i=1}^n b_{ij} w_i$ ($b_{ij} \in F$). Let
$$B = \begin{pmatrix} b_{11} & \cdots & b_{1n} \\ \vdots & & \vdots \\ b_{n1} & \cdots & b_{nn} \end{pmatrix} \in F^{n \times n}.$$
Then $y = Bx$. We obtain that $C$ and $B$ are invertible and $B = C^{-1}$.
Formal matrix notation: Write the basis vectors $(v_1, \cdots, v_n)$ and $(w_1, \cdots, w_n)$ formally in rows. Then the equations $w_j = \sum_{i=1}^n c_{ij} v_i$ become the formal matrix equation
$$(w_1, \cdots, w_n) = (v_1, \cdots, v_n)C$$
using the usual matrix multiplication rules. In general, $(v_1, \cdots, v_n)$ and $(w_1, \cdots, w_n)$ are not matrices (although in the special case where each $v_j$ and $w_j$ is a column vector in $F^n$, we have $W = VC$, where $V, W \in F^{n \times n}$ are the matrices whose columns are the $v_j$'s and the $w_j$'s, respectively). We also have the formal matrix equations $v = (v_1, \cdots, v_n)x$ and $v = (w_1, \cdots, w_n)y$, so
$$(v_1, \cdots, v_n)x = (w_1, \cdots, w_n)y = (v_1, \cdots, v_n)Cy,$$
which gives us $x = Cy$ as before.
Remark. We can read the matrix equation $W = VC$ as saying that the $j$th column of $W$ is the linear combination of the columns of $V$ whose coefficients are in the $j$th column of $C$.
Constructing New Vector Spaces from Given Ones
(1) The intersection of any family of subspaces of $V$ is again a subspace: let $\{W_\alpha : \alpha \in G\}$ be a family of subspaces of $V$ (where $G$ is an index set); then $\bigcap_{\alpha \in G} W_\alpha$ is a subspace of $V$.
(2) Sums of subspaces: If $W_1$, $W_2$ are subspaces of $V$, then
$$W_1 + W_2 = \{w_1 + w_2 : w_1 \in W_1,\ w_2 \in W_2\}$$
is also a subspace, and $\dim(W_1 + W_2) + \dim(W_1 \cap W_2) = \dim W_1 + \dim W_2$. We say that the sum $W_1 + W_2$ is direct if $W_1 \cap W_2 = \{0\}$ (equivalently: for each $v \in W_1 + W_2$, there are unique $w_1 \in W_1$ and $w_2 \in W_2$ for which $v = w_1 + w_2$), and in this case we write $W_1 \oplus W_2$ for $W_1 + W_2$. More generally, if $W_1, \ldots, W_n$ are subspaces of $V$, then $W_1 + \cdots + W_n = \{w_1 + \cdots + w_n : w_j \in W_j,\ 1 \leq j \leq n\}$ is a subspace. We say that the sum is direct if whenever $w_j \in W_j$ and $\sum_{j=1}^n w_j = 0$, then each $w_j = 0$, and in this case we write $W_1 \oplus \cdots \oplus W_n$. Even more generally, if $\{W_\alpha : \alpha \in G\}$ is a family of subspaces of $V$, define $\sum_{\alpha \in G} W_\alpha = \mathrm{span}\left(\bigcup_{\alpha \in G} W_\alpha\right)$. We say that the sum is direct if for each finite subset $G'$ of $G$, whenever $w_\alpha \in W_\alpha$ for $\alpha \in G'$ and $\sum_{\alpha \in G'} w_\alpha = 0$, then each $w_\alpha = 0$ for $\alpha \in G'$ (equivalently: for each $\alpha \in G$, $W_\alpha \cap \sum_{\beta \in G,\, \beta \neq \alpha} W_\beta = \{0\}$).
(3) Direct Products: Let $\{V_\alpha : \alpha \in G\}$ be a family of vector spaces over $F$. Define $V = \bigtimes_{\alpha \in G} V_\alpha$ to be the set of all functions $v : G \to \bigcup_{\alpha \in G} V_\alpha$ for which $v(\alpha) \in V_\alpha$ for all $\alpha \in G$. We write $v_\alpha$ for $v(\alpha)$, and we write $v = (v_\alpha)_{\alpha \in G}$, or just $v = (v_\alpha)$. Define $v + w = (v_\alpha + w_\alpha)$ and $\lambda v = (\lambda v_\alpha)$. Then $V$ is a vector space over $F$. (Example: $G = \mathbb{N} = \{1, 2, \ldots\}$, each $V_n = F$. Then $\bigtimes_{n \geq 1} V_n = F^\infty$.)
(4) (External) Direct Sums: Let $\{V_\alpha : \alpha \in G\}$ be a family of vector spaces over $F$. Define $\bigoplus_{\alpha \in G} V_\alpha$ to be the subspace of $\bigtimes_{\alpha \in G} V_\alpha$ consisting of those $v$ for which $v_\alpha = 0$ except for finitely many $\alpha \in G$. (Example: For $n = 0, 1, 2, \ldots$ let $V_n = \mathrm{span}(x^n)$ in $\mathcal{P}$. Then $\mathcal{P}$ can be identified with $\bigoplus_{n \geq 0} V_n$.)
Facts: (a) If $G$ is a finite index set, then $\bigtimes_{\alpha \in G} V_\alpha$ and $\bigoplus_{\alpha \in G} V_\alpha$ are isomorphic.
(b) If each $W_\alpha$ is a subspace of $V$ and the sum $\sum_{\alpha \in G} W_\alpha$ is direct, then it is naturally isomorphic to the external direct sum $\bigoplus_{\alpha \in G} W_\alpha$.
(5) Quotients: Let $W$ be a subspace of $V$. Define on $V$ the equivalence relation $v_1 \sim v_2$ if $v_1 - v_2 \in W$, and define the quotient to be the set $V/W$ of equivalence classes. Let $v + W$ denote the equivalence class of $v$. Define a vector space structure on $V/W$ by defining $\alpha_1(v_1 + W) + \alpha_2(v_2 + W) = (\alpha_1 v_1 + \alpha_2 v_2) + W$. Define the codimension of $W$ in $V$ by $\mathrm{codim}(W) = \dim(V/W)$.
Dual Vector Spaces
Definition. Let $V$ be a vector space. A linear functional on $V$ is a function $f : V \to F$ for which $f(\alpha_1 v_1 + \alpha_2 v_2) = \alpha_1 f(v_1) + \alpha_2 f(v_2)$ for $v_1, v_2 \in V$, $\alpha_1, \alpha_2 \in F$. Equivalently, $f$ is a linear transformation from $V$ to the 1-dimensional vector space $F$.
Examples:
(1) Let $V = F^n$, and let $f$ be a linear functional on $V$. Set $f_i = f(e_i)$ for $1 \leq i \leq n$. Then for $x = (x_1, \ldots, x_n)^T = \sum_{i=1}^n x_i e_i \in F^n$,
$$f(x) = \sum_{i=1}^n x_i f(e_i) = \sum_{i=1}^n f_i x_i.$$
So every linear functional on $F^n$ is a linear combination of the coordinates.
(2) Let $V = F^\infty$. Given an $N$ and some $f_1, f_2, \ldots, f_N \in F$, we can define a linear functional $f(x) = \sum_{i=1}^N f_i x_i$ for $x \in F^\infty$. However, not all linear functionals on $F^\infty$ are of this form.
(3) Let $V = \ell^1(F)$. If $f \in \ell^\infty(F)$, then for $x \in \ell^1(F)$,
$$\sum_{i=1}^\infty |f_i x_i| \leq \left(\sup_i |f_i|\right) \sum_{i=1}^\infty |x_i| < \infty,$$
so the sum $f(x) = \sum_{i=1}^\infty f_i x_i$ converges absolutely, defining a linear functional on $\ell^1(F)$. Similarly, if $V = \ell^\infty(F)$ and $f \in \ell^1(F)$, then $f(x) = \sum_{i=1}^\infty f_i x_i$ defines a linear functional on $\ell^\infty(F)$.
(4) Let $X$ be a metric space and $x_0 \in X$. Then $f(u) = u(x_0)$ defines a linear functional on $C(X)$.
(5) If $-\infty < a < b < \infty$, then $f(u) = \int_a^b u(x)\,dx$ defines a linear functional on $C([a, b])$.
Definition. If $V$ is a vector space, the dual space of $V$ is the vector space $V^*$ of all linear functionals on $V$, where the vector space operations on $V^*$ are given by $(\alpha_1 f_1 + \alpha_2 f_2)(v) = \alpha_1 f_1(v) + \alpha_2 f_2(v)$.
Remark. When $V$ is infinite dimensional, $V^*$ is often called the algebraic dual space of $V$, as it depends only on the algebraic structure of $V$. We will be more interested in linear functionals related also to a topological structure on $V$. After introducing norms (which induce metrics on $V$), we will define $V'$ to be the vector space of all continuous linear functionals on $V$. (When $V$ is finite dimensional, with any norm on $V$, every linear functional on $V$ is continuous, so $V' = V^*$.)
Dual Basis in Finite Dimensions
Let $V$ be a finite dimensional vector space and let $v_1, \ldots, v_n$ be a basis for $V$. For $1 \leq i \leq n$, define linear functionals $f_i \in V^*$ by $f_i(v_j) = \delta_{ij}$, where
$$\delta_{ij} = \begin{cases} 1 & \text{if } i = j \\ 0 & \text{if } i \neq j. \end{cases}$$
Let $v \in V$, and let $x = (x_1, \ldots, x_n)^T$ be the vector of coordinates of $v$ with respect to the basis $v_1, \ldots, v_n$, i.e., $v = \sum_{i=1}^n x_i v_i$. Then $f_i(v) = x_i$, i.e., $f_i$ maps $v$ into its coordinate $x_i$. Now if $f \in V^*$, let $a_i = f(v_i)$; then
$$f(v) = f\left(\sum x_i v_i\right) = \sum_{i=1}^n a_i x_i = \sum_{i=1}^n a_i f_i(v),$$
so $f = \sum_{i=1}^n a_i f_i$. This representation is unique (exercise), so $f_1, \ldots, f_n$ is a basis for $V^*$, called the dual basis to $v_1, \ldots, v_n$. We get $\dim V^* = \dim V$.
If we write the dual basis in a column $\begin{pmatrix} f_1 \\ \vdots \\ f_n \end{pmatrix}$ and the coordinates $(a_1 \cdots a_n)$ of $f = \sum_{i=1}^n a_i f_i \in V^*$ in a row, then $f = (a_1 \cdots a_n) \begin{pmatrix} f_1 \\ \vdots \\ f_n \end{pmatrix}$. The defining equation of the dual basis is (matrix multiply, evaluate)
$$(*) \qquad \begin{pmatrix} f_1 \\ \vdots \\ f_n \end{pmatrix} (v_1 \cdots v_n) = \begin{pmatrix} 1 & & 0 \\ & \ddots & \\ 0 & & 1 \end{pmatrix} = I.$$
Change of Basis and Dual Bases: Let $w_1, \ldots, w_n$ be another basis of $V$ related to the first basis $v_1, \ldots, v_n$ by the change-of-basis matrix $C$, i.e., $(w_1 \cdots w_n) = (v_1 \cdots v_n)C$. Left-multiplying $(*)$ by $C^{-1}$ and right-multiplying by $C$ gives
$$C^{-1} \begin{pmatrix} f_1 \\ \vdots \\ f_n \end{pmatrix} (w_1 \cdots w_n) = I.$$
Therefore
$$\begin{pmatrix} g_1 \\ \vdots \\ g_n \end{pmatrix} = C^{-1} \begin{pmatrix} f_1 \\ \vdots \\ f_n \end{pmatrix} \quad \text{satisfies} \quad \begin{pmatrix} g_1 \\ \vdots \\ g_n \end{pmatrix} (w_1 \cdots w_n) = I,$$
and so $g_1, \ldots, g_n$ is the dual basis to $w_1, \ldots, w_n$. If $(b_1 \cdots b_n)$ are the coordinates of $f \in V^*$ with respect to $g_1, \ldots, g_n$, then
$$f = (b_1 \cdots b_n) \begin{pmatrix} g_1 \\ \vdots \\ g_n \end{pmatrix} = (b_1 \cdots b_n) C^{-1} \begin{pmatrix} f_1 \\ \vdots \\ f_n \end{pmatrix} = (a_1 \cdots a_n) \begin{pmatrix} f_1 \\ \vdots \\ f_n \end{pmatrix},$$
so $(b_1 \cdots b_n)C^{-1} = (a_1 \cdots a_n)$, i.e., $(b_1 \cdots b_n) = (a_1 \cdots a_n)C$ is the transformation law for the coordinates of $f$ with respect to the two dual bases $f_1, \ldots, f_n$ and $g_1, \ldots, g_n$.
Linear Transformations
Linear transformations were defined above.
Examples:
(1) Let $T : F^n \to F^m$ be a linear transformation. For $1 \leq j \leq n$, write
$$T(e_j) = t_j = \begin{pmatrix} t_{1j} \\ \vdots \\ t_{mj} \end{pmatrix} \in F^m.$$
If $x = \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix} \in F^n$, then $T(x) = T\left(\sum x_j e_j\right) = \sum x_j t_j$, which we can write as
$$T(x) = (t_1 \cdots t_n) \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix} = \begin{pmatrix} t_{11} & \cdots & t_{1n} \\ \vdots & & \vdots \\ t_{m1} & \cdots & t_{mn} \end{pmatrix} \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}.$$
So every linear transformation from $F^n$ to $F^m$ is given by multiplication by a matrix in $F^{m \times n}$.
(2) One can construct linear transformations $T : F^\infty \to F^\infty$ by matrix multiplication. Let
$$T = \begin{pmatrix} t_{11} & t_{12} & \cdots \\ t_{21} & \ddots & \\ \vdots & & \end{pmatrix}$$
be an infinite matrix for which each row has only finitely many nonzero entries. In forming $Tx$ for $x = \begin{pmatrix} x_1 \\ \vdots \end{pmatrix} \in F^\infty$, each entry in $Tx$ is given by a finite sum, so $Tx$ makes sense and $T$ clearly defines a linear transformation from $F^\infty$ to itself. (However, not all linear transformations on $F^\infty$ are of this form.) The shift operators $(x_1, x_2, \ldots)^T \mapsto (0, x_1, x_2, \ldots)^T$ and $(x_1, x_2, \ldots)^T \mapsto (x_2, x_3, \ldots)^T$ are examples of linear transformations of this form.
(3) If $\sup_{i,j} |t_{ij}| < \infty$ and $x \in \ell^1$, then for each $i$, $\sum_{j=1}^\infty |t_{ij} x_j| \leq \sup_{i,j} |t_{ij}| \sum_{j=1}^\infty |x_j|$. It follows that matrix multiplication $Tx$ defines a linear transformation $T : \ell^1 \to \ell^\infty$.
(4) There are many ways that linear transformations arise on function spaces, for example:
(a) Let $k \in C([c, d] \times [a, b])$, where $[a, b]$, $[c, d]$ are closed bounded intervals. Define the linear transformation $L : C[a, b] \to C[c, d]$ by $L(u)(x) = \int_a^b k(x, y)u(y)\,dy$. $L$ is called an integral operator and $k(x, y)$ is called its kernel.
(b) Let $X$ be a metric space and let $m \in C(X)$. Then $L(u)(x) = m(x)u(x)$ defines a multiplier operator $L$ on $C(X)$.
(c) Let $X$ and $Y$ be metric spaces and let $g : X \to Y$ be continuous. Then $L(u)(x) = u(g(x))$ defines a composition operator $L : C(Y) \to C(X)$.
(d) $u \mapsto u'$ defines a differential operator $L : C^1[a, b] \to C[a, b]$.
Suppose $V$, $W$ are finite-dimensional with bases $\{v_1, \ldots, v_n\}$, $\{w_1, \ldots, w_m\}$, respectively, and suppose $L : V \to W$ is linear. For $1 \leq j \leq n$, we can write $Lv_j = \sum_{i=1}^m t_{ij} w_i$. The matrix
$$T = \begin{pmatrix} t_{11} & \cdots & t_{1n} \\ \vdots & & \vdots \\ t_{m1} & \cdots & t_{mn} \end{pmatrix} \in F^{m \times n}$$
is called the matrix of $L$ with respect to the bases $B_1 = \{v_1, \ldots, v_n\}$, $B_2 = \{w_1, \ldots, w_m\}$. (H-J writes $T = {}_{B_2}[L]_{B_1}$.) Let $v \in V$, and let $\begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}$ be the coordinates of $v$ with respect to $B_1$ and $\begin{pmatrix} y_1 \\ \vdots \\ y_m \end{pmatrix}$ the coordinates of $Lv$ with respect to $B_2$. Then
$$\sum_{i=1}^m y_i w_i = Lv = L\left(\sum_{j=1}^n x_j v_j\right) = \sum_{i=1}^m \left(\sum_{j=1}^n t_{ij} x_j\right) w_i,$$
so for $1 \leq i \leq m$, we have $y_i = \sum_{j=1}^n t_{ij} x_j$, i.e. $y = Tx$. Thus, in terms of the coordinates relative to these bases, $L$ is represented by matrix multiplication by $T$.
Note that the relations defining $T$ can be rewritten as $L(v_1 \cdots v_n) = (w_1 \cdots w_m)T$. Suppose now that we choose different bases $B_1' = \{v_1', \ldots, v_n'\}$ and $B_2' = \{w_1', \ldots, w_m'\}$ for $V$ and $W$, respectively, with change-of-basis matrices $C \in F^{n \times n}$, $D \in F^{m \times m}$:
$$(v_1' \cdots v_n') = (v_1 \cdots v_n)C \quad \text{and} \quad (w_1' \cdots w_m') = (w_1 \cdots w_m)D.$$
Then
$$L(v_1' \cdots v_n') = (w_1 \cdots w_m)TC = (w_1' \cdots w_m')D^{-1}TC,$$
so the matrix of $L$ in the new bases is $D^{-1}TC$. In particular, if $W = V$ and we choose $B_2 = B_1$ and $B_2' = B_1'$, then $D = C$, so the matrix of $L$ in the new basis is $C^{-1}TC$. A matrix of the form $C^{-1}TC$ is said to be similar to $T$. Therefore similar matrices can be viewed as representations of the same linear transformation with respect to different bases.
Linear transformations can be studied abstractly or in terms of matrix representations. For $L : V \to W$, the range $\mathcal{R}(L)$, null space $\mathcal{N}(L)$ (or kernel $\ker(L)$), rank $\mathrm{rank}(L) = \dim(\mathcal{R}(L))$, etc., can be defined directly in terms of $L$, or in terms of matrix representations. If $T \in F^{n \times n}$ is the matrix of $L : V \to V$ in some basis, it is easiest to define $\det L = \det T$ and $\mathrm{tr}\, L = \mathrm{tr}\, T$. Since $\det(C^{-1}TC) = \det T$ and $\mathrm{tr}(C^{-1}TC) = \mathrm{tr}\, T$, these are independent of the basis.
Vector Spaces of Linear Transformations
Let $V$, $W$ be vector spaces. We denote by $\mathcal{L}(V, W)$ the set of all linear transformations from $V$ to $W$. The set $\mathcal{L}(V, W)$ has a natural vector space structure: if $L_1, L_2 \in \mathcal{L}(V, W)$ and $\alpha_1, \alpha_2 \in F$, define $\alpha_1 L_1 + \alpha_2 L_2 \in \mathcal{L}(V, W)$ by $(\alpha_1 L_1 + \alpha_2 L_2)(v) = \alpha_1 L_1(v) + \alpha_2 L_2(v)$. In the infinite-dimensional case, we will be more interested in the vector space $B(V, W)$ of all bounded linear transformations (to be defined) from $V$ to $W$ with respect to norms on $V$ and $W$. When $V$ and $W$ are finite-dimensional, it will turn out that $B(V, W) = \mathcal{L}(V, W)$.
If $V$, $W$ have dimensions $n$, $m$, respectively, then the matrix representation above shows that $\mathcal{L}(V, W)$ is isomorphic to $F^{m \times n}$, so it has dimension $nm$. When $V = W$, we denote $\mathcal{L}(V, V)$ by $\mathcal{L}(V)$. Since the composition $ML : V \to U$ of linear transformations $L : V \to W$ and $M : W \to U$ is also linear, $\mathcal{L}(V)$ is naturally an algebra with composition as the multiplication operation.
Projections
Suppose $W_1$, $W_2$ are subspaces of $V$ and $V = W_1 \oplus W_2$. Then we say $W_1$ and $W_2$ are complementary subspaces. Any $v \in V$ can be written uniquely as $v = w_1 + w_2$ with $w_1 \in W_1$, $w_2 \in W_2$. So we can define maps $P_1 : V \to W_1$, $P_2 : V \to W_2$ by $P_1 v = w_1$, $P_2 v = w_2$. It is easy to check that $P_1$, $P_2$ are linear. We usually regard $P_1$, $P_2$ as mapping $V$ into itself (as $W_1 \subseteq V$, $W_2 \subseteq V$). $P_1$ is called the projection onto $W_1$ along $W_2$ (and $P_2$ the projection onto $W_2$ along $W_1$). It is important to note that $P_1$ is not determined solely by the subspace $W_1 \subseteq V$, but also depends on the choice of the complementary subspace $W_2$. Since a linear transformation is determined by its restrictions to direct summands of its domain, $P_1$ is uniquely characterized as that linear transformation on $V$ which satisfies
$$P_1\big|_{W_1} = I\big|_{W_1} \quad \text{and} \quad P_1\big|_{W_2} = 0.$$
It follows easily that
$$P_1^2 = P_1, \quad P_2^2 = P_2, \quad P_1 + P_2 = I, \quad P_1 P_2 = P_2 P_1 = 0.$$
In general, an element $q$ of an algebra is called idempotent if $q^2 = q$. If $P : V \to V$ is a linear transformation and $P$ is idempotent, then $P$ is a projection in the above sense: it is the projection onto $\mathcal{R}(P)$ along $\mathcal{N}(P)$.
This discussion extends to the case in which $V = W_1 \oplus \cdots \oplus W_m$ for subspaces $W_i$. We can define projections $P_i : V \to W_i$ in the obvious way: $P_i$ is the projection onto $W_i$ along $W_1 \oplus \cdots \oplus W_{i-1} \oplus W_{i+1} \oplus \cdots \oplus W_m$. Then
$$P_i^2 = P_i \text{ for } 1 \leq i \leq m, \quad P_1 + \cdots + P_m = I, \quad \text{and} \quad P_i P_j = P_j P_i = 0 \text{ for } i \neq j.$$
If $V$ is finite dimensional, we say that a basis $\{w_1, \ldots, w_p, u_1, \ldots, u_q\}$ for $V = W_1 \oplus W_2$ is adapted to the decomposition $W_1 \oplus W_2$ if $\{w_1, \ldots, w_p\}$ is a basis for $W_1$ and $\{u_1, \ldots, u_q\}$ is a basis for $W_2$. With respect to such a basis, the matrix representations of $P_1$ and $P_2$ are
$$\begin{pmatrix} I & 0 \\ 0 & 0 \end{pmatrix} \quad \text{and} \quad \begin{pmatrix} 0 & 0 \\ 0 & I \end{pmatrix},$$
where the block structure is
$$\begin{pmatrix} p \times p & p \times q \\ q \times p & q \times q \end{pmatrix},$$
abbreviated by labeling the block rows and columns by $p$ and $q$.
Invariant Subspaces
We say that a subspace $W \subseteq V$ is invariant under a linear transformation $L : V \to V$ if $L(W) \subseteq W$. If $V$ is finite dimensional and $\{w_1, \ldots, w_p\}$ is a basis for $W$ which we complete to some basis $\{w_1, \ldots, w_p, u_1, \ldots, u_q\}$ of $V$, then $W$ is invariant under $L$ iff the matrix of $L$ in this basis is of the form
$$\begin{pmatrix} * & * \\ 0 & * \end{pmatrix}$$
(with blocks of sizes $p$ and $q$ as above), i.e., block upper-triangular.
We say that $L : V \to V$ preserves the decomposition $W_1 \oplus \cdots \oplus W_m = V$ if each $W_i$ is invariant under $L$. In this case, $L$ defines linear transformations $L_i : W_i \to W_i$, $1 \leq i \leq m$, and we write $L = L_1 \oplus \cdots \oplus L_m$. Clearly $L$ preserves the decomposition iff the matrix $T$ of $L$ with respect to an adapted basis is of block diagonal form
$$T = \begin{pmatrix} T_1 & & & 0 \\ & T_2 & & \\ & & \ddots & \\ 0 & & & T_m \end{pmatrix},$$
where the $T_i$'s are the matrices of the $L_i$'s in the bases of the $W_i$'s.
Nilpotents
A linear transformation $L : V \to V$ is called nilpotent if $L^r = 0$ for some $r > 0$. A basic example is a shift operator on $F^n$: define $Se_1 = 0$, and $Se_i = e_{i-1}$ for $2 \leq i \leq n$. The matrix of $S$ is denoted $S_n$:
$$S_n = \begin{pmatrix} 0 & 1 & & 0 \\ & \ddots & \ddots & \\ & & \ddots & 1 \\ 0 & & & 0 \end{pmatrix} \in F^{n \times n}.$$
Note that $S^m$ shifts by $m$: $S^m e_i = 0$ for $1 \leq i \leq m$, and $S^m e_i = e_{i-m}$ for $m + 1 \leq i \leq n$. Thus $S^n = 0$. For $1 \leq m \leq n - 1$, the matrix $(S_n)^m$ of $S^m$ is zero except for 1's on the $m$th superdiagonal (i.e., the $ij$ elements for $j = i + m$, $1 \leq i \leq n - m$, are 1's):
$$(S_n)^m = \begin{pmatrix} 0 & \cdots & 0 & 1 & & 0 \\ & \ddots & & & \ddots & \\ & & \ddots & & & 1 \\ & & & \ddots & & \vdots \\ 0 & & & & & 0 \end{pmatrix},$$
with the 1's running from the $(1, m+1)$ entry to the $(n-m, n)$ entry.
Note, however, that the analogous shift operator on $F^\infty$ defined by $Se_1 = 0$, $Se_i = e_{i-1}$ for $i \geq 2$, is not nilpotent.
Structure of Nilpotent Operators in Finite Dimensions
We next prove a theorem which describes the structure of all nilpotent operators in finite dimensions. This is an important result in its own right, and it will be a key step in showing that every matrix is similar to a matrix in Jordan form.
Theorem. Let $V$ be finite dimensional and $L : V \to V$ be nilpotent. Then there is a basis for $V$ in which $L$ is a direct sum of shift operators.
Proof. Since $L$ is nilpotent, there is an integer $r$ so that $L^r = 0$ but $L^{r-1} \neq 0$. Let $v_1, \ldots, v_{l_1}$ be a basis for $\mathcal{R}(L^{r-1})$, and for $1 \leq i \leq l_1$, choose $w_i \in V$ for which $v_i = L^{r-1} w_i$. (As an aside, observe that
$$V = \mathcal{N}(L^r) = \mathcal{N}(L^{r-1}) \oplus \mathrm{span}\{w_1, \ldots, w_{l_1}\}.)$$
We claim that the set
$$\mathcal{S}_1 = \{L^k w_i : 0 \leq k \leq r - 1,\ 1 \leq i \leq l_1\}$$
is linearly independent. Suppose
$$\sum_{i=1}^{l_1} \sum_{k=0}^{r-1} c_{ik} L^k w_i = 0.$$
Apply $L^{r-1}$ to obtain
$$\sum_{i=1}^{l_1} c_{i0} L^{r-1} w_i = 0.$$
Hence $\sum_{i=1}^{l_1} c_{i0} v_i = 0$, so $c_{i0} = 0$ for $1 \leq i \leq l_1$. Now apply $L^{r-2}$ to the double sum to obtain
$$0 = \sum_{i=1}^{l_1} c_{i1} L^{r-1} w_i = \sum_{i=1}^{l_1} c_{i1} v_i,$$
so $c_{i1} = 0$ for $1 \leq i \leq l_1$. Successively applying lower powers of $L$ shows that all $c_{ik} = 0$.
Observe that for $1 \leq i \leq l_1$, $\mathrm{span}\{L^{r-1} w_i, L^{r-2} w_i, \ldots, w_i\}$ is invariant under $L$, and $L$ acts by shifting these vectors. It follows that on $\mathrm{span}(\mathcal{S}_1)$, $L$ is the direct sum of $l_1$ copies of the $(r \times r)$ shift $S_r$, and in the basis
$$\{L^{r-1} w_1, L^{r-2} w_1, \ldots, w_1, L^{r-1} w_2, \ldots, w_2, \ldots, L^{r-1} w_{l_1}, \ldots, w_{l_1}\}$$
for $\mathrm{span}(\mathcal{S}_1)$, $L$ has the matrix $\begin{pmatrix} S_r & & 0 \\ & \ddots & \\ 0 & & S_r \end{pmatrix}$. In general, $\mathrm{span}(\mathcal{S}_1)$ need not be all of $V$, so we aren't done.
We know that $L^{r-1} w_1, \ldots, L^{r-1} w_{l_1}$ is a basis for $\mathcal{R}(L^{r-1})$, and that
$$L^{r-1} w_1, \ldots, L^{r-1} w_{l_1}, L^{r-2} w_1, \ldots, L^{r-2} w_{l_1}$$
are linearly independent vectors in $\mathcal{R}(L^{r-2})$. Complete the latter to a basis of $\mathcal{R}(L^{r-2})$ by appending, if necessary, vectors $u_1, \ldots, u_{l_2}$. As before, choose $w_{l_1 + j}$ for which
$$L^{r-2} w_{l_1 + j} = u_j, \quad 1 \leq j \leq l_2.$$
We will replace the $w_{l_1 + j}$ ($1 \leq j \leq l_2$) by vectors in $\mathcal{N}(L^{r-1})$. Note that
$$L u_j = L^{r-1} w_{l_1 + j} \in \mathcal{R}(L^{r-1}),$$
so we may write
$$L^{r-1} w_{l_1 + j} = \sum_{i=1}^{l_1} a_{ij} L^{r-1} w_i$$
for some $a_{ij} \in F$. For $1 \leq j \leq l_2$, set
$$\tilde{w}_{l_1 + j} = w_{l_1 + j} - \sum_{i=1}^{l_1} a_{ij} w_i \quad \text{and} \quad \tilde{u}_j = L^{r-2} \tilde{w}_{l_1 + j}.$$
Replacing the $u_j$'s by the $\tilde{u}_j$'s still gives a basis of $\mathcal{R}(L^{r-2})$ as above (exercise). Clearly $L^{r-1} \tilde{w}_{l_1 + j} = 0$ for $1 \leq j \leq l_2$. (Again as an aside, observe that we now have the direct sum decomposition
$$\mathcal{N}(L^{r-1}) = \mathcal{N}(L^{r-2}) \oplus \mathrm{span}\{L w_1, \ldots, L w_{l_1}, \tilde{w}_{l_1 + 1}, \ldots, \tilde{w}_{l_1 + l_2}\}.)$$
Relabel $\tilde{w}_{l_1 + j}$ as $w_{l_1 + j}$.
So we now have a basis for $\mathcal{R}(L^{r-2})$ of the form
$$\{L^{r-1} w_1, \ldots, L^{r-1} w_{l_1}, L^{r-2} w_1, \ldots, L^{r-2} w_{l_1}, L^{r-2} w_{l_1+1}, \ldots, L^{r-2} w_{l_1+l_2}\}$$
for which $L^{r-1} w_{l_1+j} = 0$ for $1 \leq j \leq l_2$. By the same argument as above, upon setting
$$\mathcal{S}_2 = \{L^k w_{l_1+j} : 0 \leq k \leq r - 2,\ 1 \leq j \leq l_2\},$$
we conclude that $\mathcal{S}_1 \cup \mathcal{S}_2$ is linearly independent, and $L$ acts on $\mathrm{span}(\mathcal{S}_2)$ as a direct sum of $l_2$ copies of the $(r-1) \times (r-1)$ shift $S_{r-1}$. We can continue this argument, decreasing $r$ one at a time, and end up with a basis of $\mathcal{R}(L^0) = V$ in which $L$ acts as a direct sum of shift operators:
$$L = \overbrace{S_r \oplus \cdots \oplus S_r}^{l_1} \oplus \overbrace{S_{r-1} \oplus \cdots \oplus S_{r-1}}^{l_2} \oplus \cdots \oplus \overbrace{S_1 \oplus \cdots \oplus S_1}^{l_r}$$
(Note: $S_1 = 0 \in F^{1 \times 1}$.) $\square$
Remarks:
(1) For $j \geq 1$, let $k_j = \dim(\mathcal{N}(L^j))$. It follows easily from the above that $0 < k_1 < k_2 < \cdots < k_r = k_{r+1} = k_{r+2} = \cdots = n$, and thus $r \leq n$.
(2) The structure of $L$ is determined by knowing $r$ and $l_1, \ldots, l_r$. These, in turn, are determined by knowing $k_1, \ldots, k_n$ (see the homework).
(3) General facts about nilpotent transformations follow from this normal form. For example, if $\dim V = n$ and $L : V \to V$ is nilpotent, then
(i) $L^n = 0$
(ii) $\mathrm{tr}\, L = 0$
(iii) $\det L = 0$
(iv) $\det(I + L) = 1$
(v) for any $\lambda \in F$, $\det(\lambda I - L) = \lambda^n$
Dual Transformations
Recall that if $V$ and $W$ are vector spaces, we denote by $V^*$ and $\mathcal{L}(V, W)$ the dual space of $V$ and the space of linear transformations from $V$ to $W$, respectively.
Let $L \in \mathcal{L}(V, W)$. We define the dual, or adjoint, transformation $L^* : W^* \to V^*$ by $(L^* g)(v) = g(Lv)$ for $g \in W^*$, $v \in V$. Clearly $L \mapsto L^*$ is a linear transformation from $\mathcal{L}(V, W)$ to $\mathcal{L}(W^*, V^*)$, and $(L \circ M)^* = M^* \circ L^*$ if $M \in \mathcal{L}(U, V)$.
When $V$, $W$ are finite dimensional and we choose bases for $V$ and $W$, we get corresponding dual bases, and we can represent vectors in $V$, $W$, $V^*$, $W^*$ by their coordinate vectors
$$x = \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}, \quad y = \begin{pmatrix} y_1 \\ \vdots \\ y_m \end{pmatrix}, \quad a = (a_1 \cdots a_n), \quad b = (b_1 \cdots b_m).$$
Also, $L$ is represented by a matrix $T \in F^{m \times n}$ for which $y = Tx$. Now if $g \in W^*$ has coordinates $b = (b_1 \cdots b_m)$, we have
$$g(Lv) = (b_1 \cdots b_m)\, T \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix},$$
so $L^* g$ has coordinates $(a_1 \cdots a_n) = (b_1 \cdots b_m)T$. Thus $L$ is represented by left-multiplication by $T$ on column vectors, and $L^*$ is represented by right-multiplication by $T$ on row vectors.
Another common convention is to represent the dual coordinate vectors also as columns; taking the transpose in the above gives
$$\begin{pmatrix} a_1 \\ \vdots \\ a_n \end{pmatrix} = T^T \begin{pmatrix} b_1 \\ \vdots \\ b_m \end{pmatrix},$$
so $L^*$ is represented by left-multiplication by $T^T$ on column vectors. ($T^T$ is the transpose of $T$: $(T^T)_{ij} = t_{ji}$.)
We can take the dual of $V^*$ to obtain $V^{**}$. There is a natural inclusion $V \subseteq V^{**}$: if $v \in V$, then $f \mapsto f(v)$ defines a linear functional on $V^*$. This map is injective since if $v \neq 0$, there is an $f \in V^*$ for which $f(v) \neq 0$. (Proof: Complete $v$ to a basis for $V$ and take $f$ to be the first vector in the dual basis.)
We identify $V$ with its image, so we can regard $V \subseteq V^{**}$. If $V$ is finite dimensional, then $V = V^{**}$ since $\dim V = \dim V^* = \dim V^{**}$. If $V$ is infinite dimensional, however, then there are elements of $V^{**}$ which are not in $V$.
If $S \subseteq V$ is a subset, we define the annihilator $S^a \subseteq V^*$ by
$$S^a = \{f \in V^* : f(v) = 0 \text{ for all } v \in S\}.$$
Clearly $S^a = (\mathrm{span}(S))^a$. Now $(S^a)^a \subseteq V^{**}$, and if $\dim V < \infty$, we can identify $V^{**} = V$ as above.
Proposition. If $\dim V < \infty$, then $(S^a)^a = \mathrm{span}(S)$.
Proof. It follows immediately from the definition that $\mathrm{span}(S) \subseteq (S^a)^a$. To show $(S^a)^a \subseteq \mathrm{span}(S)$, assume without loss of generality that $S$ is a subspace. We claim that if $W$ is an $m$-dimensional subspace of $V$ and $\dim V = n$, then $\dim W^a = \mathrm{codim}\, W = n - m$. To see this, choose a basis $w_1, \ldots, w_m$ for $W$ and complete it to a basis $w_1, \ldots, w_m, w_{m+1}, \ldots, w_n$ for $V$. Then clearly the dual basis vectors $f_{m+1}, \ldots, f_n$ are a basis for $W^a$, so $\dim W^a = n - m$. Hence $\dim(S^a)^a = n - \dim S^a = n - (n - \dim S) = \dim S$. Since we know $S \subseteq (S^a)^a$, the result follows. $\square$
In complete generality, we have:
Proposition. Suppose $L \in \mathcal{L}(V, W)$. Then $\mathcal{N}(L^*) = \mathcal{R}(L)^a$.
Proof. Clearly both are subspaces of $W^*$. Let $g \in W^*$. Then $g \in \mathcal{N}(L^*) \iff L^* g = 0 \iff (\forall v \in V)\ (L^* g)(v) = 0 \iff (\forall v \in V)\ g(Lv) = 0 \iff g \in \mathcal{R}(L)^a$. $\square$
As an immediate consequence of the two Propositions above, we conclude:
Corollary. Suppose $L \in \mathcal{L}(V, W)$ and $\dim W < \infty$. Then $\mathcal{R}(L) = \mathcal{N}(L^*)^a$.
We are often interested in identifying $\mathcal{R}(L)$ for some $L \in \mathcal{L}(V, W)$, or equivalently in determining those $w \in W$ for which there exists $v \in V$ satisfying $Lv = w$. If $V$ and $W$ are finite-dimensional, choose bases of $V$, $W$, thereby obtaining coordinate vectors $x \in F^n$, $y \in F^m$ for $v$, $w$ and a matrix $T$ representing $L$. This question then amounts to determining those $y \in F^m$ for which the linear system $Tx = y$ can be solved. According to the Corollary above, we have $\mathcal{R}(L) = \mathcal{N}(L^*)^a$. Thus there exists $v \in V$ satisfying $Lv = w$ iff $g(w) = 0$ for all $g \in W^*$ for which $L^* g = 0$. In terms of matrices, $Tx = y$ is solvable iff
$$(b_1 \cdots b_m) \begin{pmatrix} y_1 \\ \vdots \\ y_m \end{pmatrix} = 0$$
for all $(b_1 \cdots b_m)$ for which $(b_1 \cdots b_m)T = 0$, or equivalently, $T^T \begin{pmatrix} b_1 \\ \vdots \\ b_m \end{pmatrix} = 0$. These are often called the compatibility conditions for solving the linear system $Tx = y$.
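One way these conditions might be checked numerically (the rank-1 matrix below is chosen arbitrarily so that the system is genuinely constrained): extract a basis of $\mathcal{N}(T^T)$ from the SVD, and test whether $y$ is annihilated by it.

```python
import numpy as np

T = np.array([[1.0, 2.0],
              [2.0, 4.0]])       # rank 1, so R(T) is a line in R^2

# Null space of T^T: left singular vectors of T with (numerically) zero
# singular value.
U, s, Vt = np.linalg.svd(T)
null_TT = U[:, s < 1e-12]        # columns span N(T^T)

def solvable(y):
    """Compatibility conditions: b^T y = 0 for every b with T^T b = 0."""
    return np.allclose(null_TT.T @ y, 0)

print(solvable(np.array([1.0, 2.0])))   # True:  (1, 2) lies in R(T)
print(solvable(np.array([1.0, 0.0])))   # False: fails the compatibility condition
```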
Bilinear Forms
A function $\varphi : V \times V \to F$ is called a bilinear form if it is linear in each variable separately.
Examples:
(1) Let $V = F^n$. For any matrix $A \in F^{n \times n}$,
$$\varphi(y, x) = \sum_{i=1}^n \sum_{j=1}^n a_{ij} y_i x_j = y^T A x$$
is a bilinear form. In fact, all bilinear forms on $F^n$ are of this form: since
$$\varphi\left(\sum y_i e_i, \sum x_j e_j\right) = \sum_{i=1}^n \sum_{j=1}^n y_i x_j\, \varphi(e_i, e_j),$$
we can just set $a_{ij} = \varphi(e_i, e_j)$. Similarly, for any finite-dimensional $V$, we can choose a basis $v_1, \ldots, v_n$; if $\varphi$ is a bilinear form on $V$ and $v = \sum x_j v_j$, $w = \sum y_i v_i$, then
$$\varphi(w, v) = \sum_{i=1}^n \sum_{j=1}^n y_i x_j\, \varphi(v_i, v_j) = y^T A x,$$
where $A \in F^{n \times n}$ is given by $a_{ij} = \varphi(v_i, v_j)$. $A$ is called the matrix of $\varphi$ with respect to the basis $v_1, \ldots, v_n$.
(2) One can also use infinite matrices $(a_{ij})_{i,j \geq 1}$ for $V = F^\infty$ as long as convergence conditions are imposed. For example, if all $|a_{ij}| \leq M$, then $\varphi(y, x) = \sum_{i=1}^\infty \sum_{j=1}^\infty a_{ij} y_i x_j$ defines a bilinear form on $\ell^1$ since
$$\sum_{i=1}^\infty \sum_{j=1}^\infty |a_{ij} y_i x_j| \leq M \left( \sum_{i=1}^\infty |y_i| \right) \left( \sum_{j=1}^\infty |x_j| \right).$$
Similarly, if $\sum_{i=1}^\infty \sum_{j=1}^\infty |a_{ij}| < \infty$, then we get a bilinear form on $\ell^\infty$.
(3) If $V$ is a vector space and $f, g \in V^*$, then $\varphi(w, v) = g(w)f(v)$ is a bilinear form.
(4) If $V = C[a, b]$, then the following are all examples of bilinear forms:
(i) $\varphi(v, u) = \int_a^b \int_a^b k(x, y)v(x)u(y)\,dx\,dy$, for $k \in C([a, b] \times [a, b])$
(ii) $\varphi(v, u) = \int_a^b h(x)v(x)u(x)\,dx$, for $h \in C([a, b])$
(iii) $\varphi(v, u) = v(x_0) \int_a^b u(x)\,dx$, for $x_0 \in [a, b]$
We say that a bilinear form $\varphi$ is symmetric if $(\forall v, w \in V)\ \varphi(v, w) = \varphi(w, v)$. In the finite-dimensional case, this corresponds to the condition that the matrix $A$ be symmetric, i.e., $A = A^T$, or $(\forall i, j)\ a_{ij} = a_{ji}$.
Returning to Example (1) above, let $V$ be finite-dimensional and consider how the matrix of the bilinear form $\varphi$ changes when the basis of $V$ is changed. Let $(v_1', \ldots, v_n')$ be another basis for $V$ related to the original basis $(v_1, \ldots, v_n)$ by the change of basis matrix $C \in F^{n \times n}$. We have seen that the coordinates $x'$ for $v$ relative to $(v_1', \ldots, v_n')$ are related to the coordinates $x$ relative to $(v_1, \ldots, v_n)$ by $x = Cx'$. If $y'$ and $y$ denote the coordinates of $w$ relative to the two bases, we have $y = Cy'$ and therefore
$$\varphi(w, v) = y^T A x = (y')^T C^T A C x'.$$
It follows that the matrix of $\varphi$ in the basis $(v_1', \ldots, v_n')$ is $C^T A C$. Compare this with the way the matrix representing a linear transformation $L$ changed under change of basis: if $T$ was the matrix of $L$ in the basis $(v_1, \ldots, v_n)$, then the matrix in the basis $(v_1', \ldots, v_n')$ was $C^{-1}TC$. Hence the way a matrix changes under change of basis depends on whether the matrix represents a linear transformation or a bilinear form.
Sesquilinear Forms
When $F = \mathbb{C}$, we will more often use sesquilinear forms: $\varphi : V \times V \to \mathbb{C}$ is called sesquilinear if $\varphi$ is linear in the second variable and conjugate-linear in the first variable, i.e.,
$$\varphi(\alpha_1 w_1 + \alpha_2 w_2, v) = \bar{\alpha}_1 \varphi(w_1, v) + \bar{\alpha}_2 \varphi(w_2, v).$$
(Sometimes the convention is reversed and $\varphi$ is conjugate-linear in the second variable. The two possibilities are equivalent upon interchanging the variables.) For example, on $\mathbb{C}^n$ all sesquilinear forms are of the form $\varphi(w, z) = \sum_{i=1}^n \sum_{j=1}^n a_{ij} \bar{w}_i z_j$ for some $A \in \mathbb{C}^{n \times n}$. To be able to discuss bilinear forms over $\mathbb{R}$ and sesquilinear forms over $\mathbb{C}$ at the same time, we will speak of a sesquilinear form over $\mathbb{R}$ and mean just a bilinear form over $\mathbb{R}$. A sesquilinear form is said to be Hermitian-symmetric (or sometimes just Hermitian) if
$$(\forall v, w \in V)\ \varphi(w, v) = \overline{\varphi(v, w)}$$
(when $F = \mathbb{R}$, we say the form is symmetric). When $F = \mathbb{C}$, this corresponds to the condition that $A = A^H$, where $A^H = \bar{A}^T$ (i.e., $(A^H)_{ij} = \bar{a}_{ji}$) is the Hermitian transpose (or conjugate transpose) of $A$, and a matrix $A \in \mathbb{C}^{n \times n}$ satisfying $A = A^H$ is called Hermitian. When $F = \mathbb{R}$, this corresponds to the condition $A = A^T$ (i.e., $A$ is symmetric).
To a sesquilinear form $\varphi$, we can associate the quadratic form $\varphi(v, v)$. We say that $\varphi$ is nonnegative (or positive semi-definite) if $(\forall v \in V)\ \varphi(v, v) \geq 0$, and that $\varphi$ is positive (or positive definite) if $\varphi(v, v) > 0$ for all $v \neq 0$ in $V$. By an inner product on $V$, we will mean a positive-definite Hermitian-symmetric sesquilinear form.
Examples:
(1) $F^n$ with the Euclidean inner product $\langle y, x \rangle = \sum_{i=1}^n \bar{y}_i x_i$.
(2) Let $V = F^n$, and let $A \in F^{n \times n}$ be Hermitian-symmetric. Define
$$\langle y, x \rangle_A = \sum_{i=1}^n \sum_{j=1}^n a_{ij} \bar{y}_i x_j = \bar{y}^T A x.$$
The requirement that $\langle x, x \rangle_A > 0$ for $x \neq 0$ for $\langle \cdot, \cdot \rangle_A$ to be an inner product serves to define positive-definite matrices.
(3) If $V$ is any finite-dimensional vector space, we can choose a basis and thus identify $V \cong F^n$, and then transfer the Euclidean inner product to $V$ in the coordinates of this basis. The resulting inner product depends on the choice of basis; there is no canonical inner product on a general vector space. With respect to the coordinates induced by a basis, any inner product on a finite-dimensional vector space $V$ is of the form (2).
(4) One can define an inner product on $\ell^2$ by $\langle y, x \rangle = \sum_{i=1}^\infty \bar{y}_i x_i$. To see (from first principles) that this sum converges absolutely, apply the finite-dimensional Cauchy-Schwarz inequality to obtain
$$\sum_{i=1}^n |y_i x_i| \leq \left( \sum_{i=1}^n |x_i|^2 \right)^{\frac{1}{2}} \left( \sum_{i=1}^n |y_i|^2 \right)^{\frac{1}{2}} \leq \left( \sum_{i=1}^\infty |x_i|^2 \right)^{\frac{1}{2}} \left( \sum_{i=1}^\infty |y_i|^2 \right)^{\frac{1}{2}}.$$
Now let $n \to \infty$ to deduce that the series $\sum_{i=1}^\infty \bar{y}_i x_i$ converges absolutely.
(5) The $L^2$-inner product on $C([a, b])$ is given by $\langle v, u \rangle = \int_a^b \overline{v(x)} u(x)\,dx$. (Exercise: show that this is indeed positive definite on $C([a, b])$.)
An inner product on $V$ determines an injection $V \to V^*$: if $w \in V$, define $w^* \in V^*$ by $w^*(v) = \langle w, v \rangle$. Since $w^*(w) = \langle w, w \rangle$, it follows that $w^* = 0 \iff w = 0$, so the map $w \mapsto w^*$ is injective. The map $w \mapsto w^*$ is conjugate-linear (rather than linear, unless $F = \mathbb{R}$) since $(\alpha w)^* = \bar{\alpha} w^*$. The image of this map is a subspace of $V^*$. If $\dim V < \infty$, then this map is surjective too, since $\dim V = \dim V^*$. In general, it is not surjective.
Let $\dim V < \infty$, and represent vectors in $V$ as elements of $F^n$ by choosing a basis. If $v$, $w$ have coordinates $\begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}$, $\begin{pmatrix} y_1 \\ \vdots \\ y_n \end{pmatrix}$, respectively, and the inner product has matrix $A \in F^{n \times n}$ in this basis, then
$$w^*(v) = \langle w, v \rangle = \sum_{j=1}^n \left( \sum_{i=1}^n a_{ij} \bar{y}_i \right) x_j.$$
It follows that $w^*$ has components $b_j = \sum_{i=1}^n a_{ij} \bar{y}_i$ with respect to the dual basis. Recalling that the components of dual vectors are written in a row, we can write this as $b = \bar{y}^T A$.
An inner product on $V$ allows a reinterpretation of annihilators. If $W \subseteq V$ is a subspace, define the orthogonal complement
$$W^\perp = \{v \in V : \langle w, v \rangle = 0 \text{ for all } w \in W\}.$$
Clearly $W^\perp$ is a subspace of $V$, and $W \cap W^\perp = \{0\}$. The orthogonal complement $W^\perp$ is closely related to the annihilator $W^a$: it is evident that for $v \in V$, we have $v \in W^\perp$ if and only if $v^* \in W^a$. If $\dim V < \infty$, we saw above that every element of $V^*$ is of the form $v^*$ for some $v \in V$. So we conclude in the finite-dimensional case that
$$W^a = \{v^* : v \in W^\perp\}.$$
It follows that $\dim W^\perp = \dim W^a = \mathrm{codim}\, W$. From this and $W \cap W^\perp = \{0\}$, we deduce that $V = W \oplus W^\perp$. So in a finite dimensional inner product space, a subspace $W$ has a natural complementary subspace, namely $W^\perp$. The induced projection onto $W$ along $W^\perp$ is called the orthogonal projection onto $W$.

Das könnte Ihnen auch gefallen