1
Topic 1: Linear equations [AR 1.1 and 1.2]
2
We will begin with a couple of examples to illustrate how linear
equations can arise.
[Figure: the feasible region in the (x1, x2)-plane, with the line of constant profit; axis ticks at 2000, 6000 and 10000.]
We can find the maximum profit by moving the ‘line of constant profit’
parallel to itself until it meets the shaded region.
4
Example (This is Example 1 of [AR 11.2])
[Circuit diagram: a 30V source and a 7Ω resistor in Loop 1, a 50V source and an 11Ω resistor in Loop 2, with a shared 3Ω resistor; the currents I1, I2, I3 meet at the point A as indicated.]
Because the current going into point A is the same as the current going
out, we have
I1 = I2 + I3
Measuring voltage around Loop 1 and Loop 2 we get, respectively,
7I1 + 3I3 = 30
11I2 − 3I3 = 50
5
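As an aside, systems like this are easy to solve numerically. A minimal sketch, assuming Python with numpy is available (the three equations above, in matrix form):

import numpy as np

# I1 - I2 - I3 = 0,  7*I1 + 3*I3 = 30,  11*I2 - 3*I3 = 50
A = np.array([[1.0, -1.0, -1.0],
              [7.0,  0.0,  3.0],
              [0.0, 11.0, -3.0]])
b = np.array([0.0, 30.0, 50.0])

I1, I2, I3 = np.linalg.solve(A, b)  # the three unknown currents
print(I1, I2, I3)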
1.1 Systems of equations, coefficient arrays, row operations
Linear equations
Now that we know what makes an equation linear, it is easy to see that the number of variables is unimportant, so we have in the general case the following definition.
6
Definition (Linear equation and linear system)
A linear equation in n variables, x1 , x2 , . . . , xn , is an equation of the form
a1 x1 + a2 x2 + · · · + an xn = b
where a1 , a2 , . . . , an are real constants (not all zero) and b is also a real
constant.
7
Examples
x + 2y = 7
(3/8) x − 21y = 0
8
Definition (Solution of a system of linear equations)
A solution to a system of linear equations in the variables x1 , . . . , xn is a
set of values of these variables which satisfy every equation in the
system.
How would you solve the following linear system?
2x − y = 3
x +y =0
Graphically
◮ Need accurate sketch!
◮ Not practical for three or more variables.
Elimination
◮ Will always give a solution, but is too ad hoc, particularly in higher dimensions (meaning three or more variables).
9
However, the good news is that with the introduction of some clever
notation, we can formalise the procedure involved in solving
simultaneous equations.
Example
Find all solutions for the following system of linear equations
2x − y = 3
x +y =0
10
Example Find all solutions for the linear system
3x + 2y = −3
2x + y = −1
11
Coefficient arrays
I1 − I2 − I3 = 0
7I1 + (0×)I2 + 3I3 = 30
(0×)I1 + 11I2 − 3I3 = 50
12
Definition (Matrix)
A matrix is a rectangular array of numbers.
The numbers in the array are called the entries of the matrix.
Example
The augmented matrix for the previous set of equations is:
13
Example
Write the following system of linear equations as an augmented matrix
2x − y = 3
x +y =0
Note
The number of rows is equal to the number of equations.
Each column, except the last, corresponds to a variable.
The last column contains the constant term from each equation.
14
Row Operations
Our aim is to use matrices to assist us in finding a solution to a system
of equations.
First we need to decide what sort of operations we can perform on the
augmented matrix. An essential condition is that whichever operations
we perform, we must be able to recover the solution to the original
system from the new matrix we obtain.
Let’s start by considering operations on a matrix that mimic those
operations used in the elimination method.
2x − y = 3
x +y =0
Note
The matrices are not equal, but are equivalent in that the solution set is
the same for each system represented by each augmented matrix.
16
1.2 Reduction of systems to reduced row-echelon form
Gaussian elimination
Using a sequence of elementary row operations, we can always get to a
matrix that allows us to determine the solution set of a linear system.
However, the degree of complication increases as the number of
variables and equations increases, so it is a good idea to formalise the
process.
The leftmost non-zero element in each row is called the leading entry.
17
Definition (Row echelon form)
A matrix is in row-echelon form if:
1. For any row with a leading entry, all elements below that entry and
in the same column as it, are zero.
2. For any two rows, the leading entry of the lower row is further to
the right than the leading entry in the higher row.
3. Any row that consists solely of zeros is lower than any row with
non-zero entries.
Examples
[1 −2 3 4 5]   and   [1 0 0 3; 0 1 1 2; 0 0 0 3]   are in r.e. form
[0 0 0 2 4; 0 0 3 1 6; 0 0 0 0 0; 2 −3 6 −4 9]   is not in r.e. form
18
Gaussian elimination is a systematic (or algorithmic) approach to the
reduction of a matrix to row-echelon form.
Gaussian elimination
1. Make the top left element (row 1, column 1) a leading entry; that
is, reorder rows so that the entry in the top left position is non-zero.
2. Add multiples of the first row to the other rows to make all other
entries (from row 2 down) in the first column zero.
3. Reorder rows 2 . . . n so that the next leading entry is in row 2.
4. Add multiples of the second row to rows 3 . . . n, making all other
entries (from row 3 down) in the column containing the second
leading entry zero.
5. Repeat until you run out of rows.
19
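A minimal sketch of steps 1-5 in Python, assuming numpy is available; the function name and zero tolerance are my own choices:

import numpy as np

def row_echelon(M, tol=1e-12):
    """Reduce a copy of M to row-echelon form by the five steps above."""
    A = np.array(M, dtype=float)
    rows, cols = A.shape
    r = 0                                   # row that should receive the next leading entry
    for c in range(cols):
        # steps 1/3: find a usable row and reorder
        pivot = next((i for i in range(r, rows) if abs(A[i, c]) > tol), None)
        if pivot is None:
            continue                        # no leading entry in this column
        A[[r, pivot]] = A[[pivot, r]]
        # steps 2/4: add multiples of row r to clear the entries below
        for i in range(r + 1, rows):
            A[i] -= (A[i, c] / A[r, c]) * A[r]
        r += 1
        if r == rows:                       # step 5: run out of rows
            break
    return A

print(row_echelon([[2, -1, 3], [1, 1, 0]]))   # the system 2x - y = 3, x + y = 0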
Example
Use Gaussian elimination to reduce the augmented matrix which
represents the linear system
3x + 2y − z = −15
x + y − 4z = −30
3x + y + 3z = 11
3x + 3y − 5z = −41
to row-echelon form.
20
By reducing a matrix to row-echelon form, Gaussian elimination allows
us to easily solve a system of linear equations. To do this we need to
read off the final equations from the row-echelon matrix and then
manipulate them to find the solution. The final manipulation is
sometimes called back substitution. We illustrate with an example.
Example
From the row-echelon matrix of the previous example we can calculate
the solutions to the original system.
Note
This procedure relies on the fact that the new row-echelon matrix gives
a linear system with exactly the same set of solutions as the original
linear system.
21
Is there any way to solve a system without having to perform the final
manipulation?
22
Definition (Reduced row-echelon form)
A matrix is in reduced row-echelon (r.r.e.) form if it is in row-echelon form, every leading entry is 1, and each leading 1 is the only non-zero entry in its column.
Examples
[1 −2 3 −4 5] is in r.r.e. form
[1 0 0 3; 0 1 1 2; 0 0 0 3] is not in r.r.e. form
[1 0 0 2 4; 0 1 3 1 6; 0 0 0 0 0; 0 0 1 −4 9] is not in r.r.e. form
23
1.2.2 Gauss-Jordan elimination
Gauss-Jordan elimination is a systematic way to reduce a matrix to
reduced row-echelon form using row operations.
Gauss-Jordan elimination
1. Use Gaussian elimination to reduce matrix to row-echelon form.
2. Use row operations to create zeros above the leading entries.
3. Multiply rows by appropriate numbers to create the leading 1’s.
24
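A computer algebra system can carry out Gauss-Jordan elimination exactly; a sketch using sympy (applied here to the augmented matrix of the example on the next slide):

from sympy import Matrix

M = Matrix([[3, 2, -1, -15],
            [1, 1, -4, -30],
            [3, 1,  3,  11],
            [3, 3, -5, -41]])

R, pivot_columns = M.rref()   # reduced row-echelon form and the pivot columns
print(R)                      # rows (1,0,0,-4), (0,1,0,2), (0,0,1,7), (0,0,0,0)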
Example
Use Gauss-Jordan elimination to find a solution to the linear system
3x + 2y − z = −15
x + y − 4z = −30
3x + y + 3z = 11
3x + 3y − 5z = −41
25
1.3 Consistent and inconsistent systems
Example
Find the solution of the system
2x + 4y − z = −3
x − 3y + 2z = 11
4x − 2y + 5z = 21
26
Another example
Find the solution of the system
x− y+ z =3
x − 7y + 3z = −11
2x + y + z = 16
27
Yet another example
Find the solution of the system
x+ y+ z =4
2x + y + 2z = 9
3x + 2y + 3z = 13
28
Types of solution sets
It is clear from the above examples that there are different types of
solutions for systems of equations.
Definition (Consistency)
◮ A system of linear equations is said to be consistent if the system
has at least one solution.
◮ A system of linear equations is said to be inconsistent if the system
has no solution.
For any system of linear equations, one of three types of solution set is
possible.
◮ no solution (inconsistent)
◮ one solution (consistent and unique)
◮ infinitely many solutions (consistent but not unique)
29
Inconsistent systems
We can determine the type of solution set a system has by reducing its
augmented matrix to row-echelon form.
An inconsistent system is revealed by a row whose only non-zero entry lies in the last column, since such a row corresponds to an equation of the form
0 × x1 + 0 × x2 + · · · + 0 × xn = 5
and of course this is not satisfied for any values of x1 , . . . , xn .
30
Example
1 2 0 4
2 1 1 2
4 2 2 3
32
Consistent systems
Recall that a consistent system has either a unique solution or infinitely
many solutions.
Unique solution:
For a system of equations with n variables, a unique solution exists
precisely when the row reduced augmented matrix has n non-zero rows.
In this case we can read off the solution straight from the reduced
matrix.
Example
[1 1 3 5; 1 2 1 4; 2 1 1 4] ∼
x1 = x2 = x3 =
33
Infinitely many solutions:
Example
What is the solution to the equations with rre form;
1 2 0 0 5 1
0 0 1 0 6 2
0 0 0 1 7 3
0 0 0 0 0 0
The corresponding (non-zero) equations are
x1 + 2x2 + 5x5 = 1, x3 + 6x5 = 2, x4 + 7x5 = 3.
34
In general
If the system is consistent and its row reduced augmented matrix has r < n non-zero rows, then it has infinitely many solutions.
More precisely, in the rre form of the matrix, there will be n − r columns (among the first n) which contain no leading entry.
35
Example
Suppose that the rre matrix for a system of equations is
1 2 0 1 1
0 0 1 2 2
0 0 0 0 0
0 0 0 0 0
x1 = x3 =
36
Example
[1 1 3 5; 1 2 8 11; 2 1 1 4] ∼
x1 = x2 = x3 =
37
Example
Solve the linear system:
v − 2w + z = 1
2u − v − z = 0
4u + v − 6w = 3
38
Example
Find the values of k for which the system
u + 3v + 4w = 6
4u + 9v − w = 4
6u + 9v + kw = 8
has
(i) no solution
(ii) a unique solution
(iii) an infinite number of solutions
39
Topic 2: Matrices and Determinants [AR 1.3 – 1.6]
40
2.1 Properties of matrices
Notation
Sometimes it is convenient to refer to the entries of a matrix rather
than the entire matrix.
If A is a matrix, we denote its entries as Aij , where i specifies the row of
the entry and j specifies the column. Using this notation, we write
A = [A11 A12 · · · A1n ; A21 A22 · · · A2n ; . . . ; Am1 Am2 · · · Amn ]   or   A = [Aij ]
41
We say that a matrix has size m × n when it has m rows and n columns.
Example
The matrix A = [1 2 3; π e 27.1] has size 2 × 3.
Note
A12 and A21 are not equal (in this example).
42
Some special matrices
Some matrices with special features are given names suggestive of those
features.
Definition (Special matrices)
◮ A matrix having the same number of rows as columns is called a
square matrix
◮ A matrix with only one row is called a row matrix
(They have size 1 × n)
◮ A matrix with only one column is called a column matrix (They
have size n × 1)
◮ A matrix with all elements equal to zero is referred to as a zero
matrix
◮ A square matrix with Aij = 0 for i ≠ j is called a diagonal matrix
◮ A square matrix A satisfying Aij = 1 if i = j, and Aij = 0 if i ≠ j,
is called an identity matrix.
43
Examples
The matrices [1 2; 3 4] and [1 2 2; 3 4 5; 6 7 8] are both square.
[4 3 5 −2] is a row matrix.
[1; 2; 3] is a column matrix.
The matrices [0 0 0; 0 0 0] and [0 0; 0 0; 0 0] are both zero matrices.
[1 0; 0 1] and [1 0 0; 0 1 0; 0 0 1] are both identity matrices.
44
2.2 Matrix algebra
45
Example
Given A = [1 y 0; −7 2 3] and B = [1 3 0; x 2 3],
determine the values of x and y for which A = B.
Scalar Multiplication
(cA)ij = c × Aij
46
Example
Let A = [0 1; 1 −1; 2 −7]. What are −2A and αA for α ∈ R?
If α and β are any scalars and C and D are any matrices of the same
size, then
1. (α + β)C = αC + βC
2. α(D + C ) = αD + αC
3. α(βC ) = (αβ)C
47
Addition of Matrices
Be Careful: Matrix addition is only defined for matrices of the same size.
48
Example
Let
A = [2 0 −3; 1 −1 3] ,   B = [−1 −1 1; 0 1 2]   and   C = [−1 1; 2 0].
49
Properties of Matrix Addition
For matrices A, B and C , all of the same size, the following statements
hold:
1. A + B = B + A (commutativity)
2. A + (B + C ) = (A + B) + C (associativity)
3. A − A = 0
4. A + 0 = A
50
Matrix multiplication
Sometimes we can multiply two matrices together.
52
Example
Let A = [1 0; 2 3] and B = [4 3; 2 1]
Calculate AB and BA.
53
Definition (Commuting matrices)
The matrices A and B are said to commute if AB = BA.
For both AB and BA to be defined and equal we must have that A and
B are square and have the same size.
Note
If I is an n × n identity matrix and A is any n × n square matrix, then
AI = IA = A
54
Properties of matrix multiplication
The following properties hold whenever the matrix products and sums
are defined:
1. A(B + C ) = AB + AC (left distributivity)
2. (A + B)C = AC + BC (right distributivity)
3. A(BC ) = (AB)C (associativity)
4. A(αB) = α(AB)
5. AI = IA = A
6. A0 = 0 and 0A = 0
55
Matrix multiplication and linear systems
Using the rule for matrix multiplication, a linear system can be written
as a matrix equation.
Example
56
Another non-property of matrix multiplication
Example
Let A = [1 1; −1 −1] and B = [1 −1; −1 1].
57
2.3 Matrix inverses
A square matrix A is invertible if there is a matrix B (the inverse of A, written A−1) such that
AB = BA = I
58
Note
59
Inverse of a 2 × 2 matrix
In general, for a 2 × 2 matrix A = [a b; c d]:
1. A is invertible iff ad − bc ≠ 0
2. If ad − bc ≠ 0, then A−1 = (1/(ad − bc)) [d −b; −c a]
Example
Find the inverse of A = [2 −1; 1 1]
60
Example
If A is a square matrix satisfying A3 = 0, show that
(I − A)−1 = I + A + A2
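A sketch of the verification, using A3 = 0 at the last step:
(I − A)(I + A + A2 ) = I + A + A2 − A − A2 − A3 = I − A3 = I
and similarly (I + A + A2 )(I − A) = I, so I + A + A2 is indeed the inverse of I − A.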
61
Finding the inverse of a square matrix
Calculating the inverse of a matrix
Given an n × n matrix A, we can find A−1 as follows:
[A | I ] ∼ [R | B]
where R is the reduced row-echelon form of A: if R = I , then A−1 = B; if R ≠ I , then A is not invertible.
63
Another example
Find the inverse of [1 2 1; −1 −1 1; −1 0 3]
64
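A numerical cross-check is often worthwhile (assuming numpy). For this particular matrix the determinant turns out to be 0, so the reduction [A | I ] can never reach the form [I | B], and the inversion routine reports failure:

import numpy as np

A = np.array([[ 1,  2, 1],
              [-1, -1, 1],
              [-1,  0, 3]], dtype=float)

try:
    print(np.linalg.inv(A))
except np.linalg.LinAlgError:
    print("A is singular; it has no inverse")   # this branch is taken here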
Properties of the matrix inverse
If A and B are invertible matrices of the same size, and α a non-zero
scalar, then
1. (αA)−1 = (1/α) A−1
2. (AB)−1 = B−1 A−1
3. (A^n)−1 = (A−1)^n (for all n ∈ N)
Exercise
Prove these!
65
Row operations by matrix multiplication
Example
Consider I3 = [1 0 0; 0 1 0; 0 0 1]
Now consider the matrix A = [a b c; d e f; g h i]
Calculation gives
FA =
GA =
HA =
67
This works in general!
68
Example
[1 2; 3 4] ∼ [1 2; 0 −2] ∼ [1 2; 0 1]
[1 0; 0 −1/2] [1 0; −3 1] [1 2; 3 4] = [1 2; 0 1]
69
If A ∼ I , then there is a sequence of elementary matrices E1 , E2 , . . . , En
such that En En−1 · · · E2 E1 A = I
A is invertible ⇐⇒ A ∼ In
This theorem justifies why our procedure for finding inverses using row
operations actually works.... Can you see why?
70
Matrix transpose
Example
Let A = [1 2 3; 4 5 6]. Then AT is
71
Properties of the transpose
1. (AT )T = A
2. (A + B)T = AT + B T (whenever A + B is defined)
3. (αA)T = αAT (where α is a scalar)
4. (AB)T = B T AT (whenever AB is defined)
5. (AT )−1 = (A−1 )T (whenever A−1 is defined)
72
Example
Using part 4 above, prove part 5: (AT )−1 = (A−1 )T .
73
Linear systems revisited
x + 2y + z = − 3
−x − y + z = 11
y + 3z = 21
74
For convenience, label
(i) the matrix of coefficients as A,
(ii) the column matrix of variables as x,
(iii) the right hand side column matrix as b.
Using this notation, the linear system can be written as
Ax = b
Theorem
Suppose A is an invertible matrix.
Then the linear system Ax = b has exactly one solution, and it is given
by x = A−1 b
x + 2y + z = −3
−x − y + z = 11
y + 3z = 21
76
Another example
x + 2y + z = a
−x − y + z = b
y + 3z = c
77
2.4 Rank of a matrix
Definition (Rank)
The rank of a matrix A is the number of non-zero rows in the reduced
row-echelon form of A.
Note
◮ This is the same as the number of non-zero rows
in a row-echelon form of A.
◮ If A has size m × n, then rank(A) ≤ m and rank(A) ≤ n
Example
Find the rank of each of the following matrices:
[1 2 1; −1 −1 1; 0 1 3]   and   [1 −1 2 1; 0 1 1 −2; 1 −3 0 5]
78
Theorem
The linear system Ax = b, where A is an m × n matrix, has:
1. No solution if rank(A) < rank([A | b])
2. A unique solution if rank(A) = rank([A | b]) = n
3. Infinitely many solutions if rank(A) = rank([A | b]) < n
79
Theorem
If A is an n × n matrix, the following conditions are equivalent:
1. A is invertible
2. Ax = b has a unique solution for any b
3. The rank of A is n
4. The reduced row-echelon form of A is In
Proof:
1 ⇒ 2 We’ve seen before (slide 75).
2 ⇒ 3 Follows from the previous theorem (or what we already knew about
linear systems).
3 ⇒ 4 Immediate from the definition of rank, and the fact that A is square.
4 ⇒ 1 Let R be the RREF of A. Then R = EA, where E = Ek Ek−1 . . . E1
is a product of elementary matrices. So we have I = EA. We have
already noted that this implies that A is invertible (and
A−1 = E ).
80
2.5 Solutions of non-homogeneous linear equations
The general solution to
Ax = b
is given by
x = xh + x0
where x0 is any one solution to Ax = b and xh varies through all
solutions to the homogeneous equations Ax = 0.
81
Example
The equations
[1 0 −1 1; 0 1 −1 −1; 0 0 0 0] [x1; x2; x3; x4] = [1; 2; 0]    (Ax = b)
have a solution x1 = 1, x2 = 2, x3 = 0, x4 = 0.
82
2.6 Determinants [AR 2.1–2.3]
When we calculate the inverse of the 2 × 2 matrix A = [a b; c d] we find that we need to invert the number ad − bc.
83
Defining the determinant
The determinant of a 2 × 2 matrix A = [a b; c d] is given by
det(A) = ad − bc
Furthermore, the matrix [a b; c d] is invertible iff det(A) ≠ 0
Definition (Determinant)
Let A be an n × n matrix. The determinant of A, denoted det(A) or |A|,
can be defined as the signed sum of all the ways to multiply together n
entries of the matrix, with all chosen from different rows and columns.
84
To determine the sign of the products, imagine all but the elements in
the product in question are set to zero in the matrix. Now swap
columns until a diagonal matrix results. If the number of swaps required
is even, then the product has a + sign, while if it is odd, it is to be
given a − sign.
Determinants of 2 × 2 matrices
a b
For A = , the possible products are ad and bc.
c d
When we set all entries other than a and d to zero, then the matrix is
diagonal; so ad has a + sign. When we set all entries other than b and
c to zero then we need to interchange the two columns to obtain a
diagonal matrix; so we have a minus sign.
85
Determinant of a 3 × 3 matrix
Suppose
A = [a11 a12 a13; a21 a22 a23; a31 a32 a33]
We can form a table of all the products of 3 entries taken from different
rows and columns, together with the signs:
product sign
a11 a22 a33 +
a11 a23 a32 −
a12 a21 a33 −
a12 a23 a31 +
a13 a21 a32 +
a13 a22 a31 −
Hence
det(A) = a11 a22 a33 − a11 a23 a32 − a12 a21 a33 + a12 a23 a31 + a13 a21 a32 − a13 a22 a31
86
The formula for 3 × 3 matrices is complicated and it becomes quickly
worse as the size of the matrix increases. But there are better ways to
calculate determinants.
Submatrices and cofactors
Here is one way for practical calculation of (small) determinants.
Definition (Submatrix)
Let A be an m × n matrix. The (i , j)-submatrix of A, denoted by
A(i , j), is the (m − 1) × (n − 1) matrix obtained by deleting the i th row
and jth column from A.
Example
If A = [1 2 1; −1 −1 1; 0 1 3], then A(2, 3) = [1 2; 0 1]
87
Definition (cofactor)
Let A be a square matrix. The (i , j)-cofactor of A, denoted by Cij , is
defined by
Cij = (−1)^{i+j} det(A(i, j))
Example (continued)
A(2, 3) = [1 2; 0 1], so det (A(2, 3)) = 1 and C23 = (−1)^{2+3} × 1 = −1
88
Cofactor Expansion
Example
Calculate det [1 2 1; −1 −1 1; 0 1 3]
89
Theorem
The determinant of an n × n matrix A can be computed by multiplying
the entries in any row (or column) by their cofactors and adding the
resulting products.
Proof: We can prove this in the n = 3 case using the formula on slide
86. The proof for the general case is essentially the same, but gets more
technical...
90
Example
91
How do you remember the sign of the cofactor?
The (1, 1)-cofactor always has sign +. Starting from there, imagine
walking to the square you want using either horizontal or vertical steps.
The appropriate sign will change at each step.
92
Example
Calculate det [1 −2 0 1; 3 1 2 0; 1 0 1 0; 2 −2 1 2]
93
Some properties of determinants:
1. det(AT ) = det(A)
2. If A has a row or column of zeros, then det(A) = 0
3. det(AB) = det(A) det(B)
4. If A is invertible, then det(A) ≠ 0 and det(A−1 ) = 1/det(A)
Examples
[2 −1 9; 0 3 2; 0 0 2] ,   [2 0 0; 1 3 0; 2 −3 2] ,   [2 0 0; 0 3 0; 0 0 2]
95
Theorem
If A is an n × n triangular matrix, then
det(A) is the product of the entries on the main diagonal of A.
Example
Let A = [2 −10 92 −117; 0 3 28 −31; 0 0 −1 27; 0 0 0 2]
What is det(A)?
96
We can use row operations to manipulate a matrix into triangular form
in order to make the determinant calculation easier.
97
Example
Let A = [a b c; d e f; g h i]
Row swap:
E = [1 0 0; 0 0 1; 0 1 0]    EA = [a b c; g h i; d e f]
98
Multiply a row by a scalar:
F = [2 0 0; 0 1 0; 0 0 1]    FA = [2a 2b 2c; d e f; g h i]
Add a multiple of one row to another:
G = [1 0 0; 0 1 0; 3 0 1]    GA = [a b c; d e f; g+3a h+3b i+3c]
Theorem
Let A be a square matrix.
1. If B is obtained from A by swapping two rows (or two columns) of
A, then det(B) = − det(A)
2. If B is obtained from A by multiplying a row (or column) of A by
the scalar α, then det(B) = α det(A)
3. If B is obtained from A by replacing a row (or column) of A by
itself plus a multiple of another row (column), then
det(B) = det(A)
100
Example
Calculate det [1 2 1; −1 −1 1; 0 1 3]
101
Example
Calculate det [1 −2 0 1; 3 1 2 0; 1 0 1 0; 2 −2 1 2]
102
We collect here some properties of the determinant function, most of
which we’ve already noted.
Theorem
Let A be an n × n matrix. Then,
1. det(AT ) = det(A)
2. det(AB) = det(A) det(B)
3. det(αA) = α^n det(A)
4. If A is a triangular matrix, then its determinant is the product of
the elements on the main diagonal
5. If A has a row (or column) of zeros, then det(A) = 0
6. If A has a row (or column) which is a scalar multiple of another
row (or column) then det(A) = 0
7. A is singular iff det(A) = 0 (and A is invertible iff det(A) ≠ 0)
103
Example (showing how to prove property 3 above)
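A sketch of the argument, in my own wording: αA is obtained from A by multiplying each of its n rows by the scalar α, and by part 2 of the theorem on slide 100 each such row multiplication multiplies the determinant by α. Doing this for all n rows gives det(αA) = α^n det(A).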
104
Topic 3: Euclidean Vector Spaces
3.1 Vectors in Rn
3.2 Dot product
3.3 Cross product of vectors in R3
3.4 Geometric applications
3.5 Linear combinations
3.6 Subspaces of Rn
3.7 Bases and dimension
3.8 Rank-nullity theorem
3.9 Coordinates relative to a basis
105
3.1 Vectors in Rn [AR 3.1]
Notation:
R2 = {(a, b) | a, b ∈ R}
= the set of all ordered pairs of real numbers
106
The algebraic approach to vector addition and scalar multiplication
extends to 3 dimensions or more.
Rn = {(x1 , x2 , . . . , xn ) | xi ∈ R for i = 1, 2, . . . , n}
= the set of all n-tuples of real numbers
107
Notation
u = (u1 , u2 , u3 ) = u1 i + u2 j + u3 k
108
Definition (magnitude)
The length (or magnitude or norm) of a vector
u = (u1 , u2 , . . . , un ) ∈ Rn is given by
kuk = √(u1^2 + u2^2 + · · · + un^2 )
109
Definition
The distance between two vectors u, v ∈ Rn is given by
d(u, v) = kv − uk
Example
Find the distance between the points P(1, 3, −1) and Q(2, 1, −1).
110
3.2 Dot product [AR 3.3]
Let
u = (u1 , u2 , . . . , un ) ∈ Rn
and
v = (v1 , v2 , . . . , vn ) ∈ Rn
be two vectors in Rn . Their dot product is the scalar
u · v = u1 v1 + u2 v2 + · · · + un vn
Examples
(3, −1) · (1, 2) =
(i + j + k) · (−j + k) =
111
The angle between two vectors can be defined in terms of the dot
product.
Definition (Angle)
The angle θ between two vectors u, v ∈ Rn is given by the expression
u · v = kukkvk cos θ
The angle defined in this way is exactly the usual angle between two
vectors in R2 or R3 .
That our definition of angle makes sense relies on the following
113
Properties of the dot product
1. u · v is a scalar
2. u · v = v · u
3. u · (v + w) = u · v + u · w
4. u · u = kuk2
5. (αu) · v = α(u · v)
Note
Suppose u = (u1 , . . . , un ) and v = (v1 , . . . , vn ) are two vectors in Rn .
If we write each as a row matrix U = [u1 · · · un ] and V = [v1 · · · vn ], then
u · v = UV T
114
3.3 Cross product of vectors in R3 [AR 3.4]
115
Geometry of the cross product
u × v = kuk kvk sin(θ) n̂
[Figure: u and v spanning the angle θ, with the unit normal n̂ perpendicular to both (right-hand rule).]
Example
Find a vector perpendicular to both (2, 3, 1) and (1, 1, 1).
116
Properties of the cross product
1. v × u = −(u × v)
2. u × (v + w) = (u × v) + (u × w)
3. (αu) × v = α(u × v)
4. u × 0 = 0
5. u × u = 0
6. u · (u × v) = 0
Note
The cross product is defined only for R3 . Unlike dot product and many
of the other properties we are considering, it does not extend to Rn in
general.
117
3.4 Geometric applications [AR 3.5]
Basic applications
area of parallelogram spanned by u and v = ku × vk
[Figure: the parallelogram with sides u and v.]
Note
If u and v are elements of R2 with u = (u1 , u2 ) and v = (v1 , v2 ), then
area of parallelogram = absolute value of det [u1 u2; v1 v2]
118
Example
Find the area of the triangle with vertices (2, −5, 4), (3, −4, 5) and
(3, −6, 2).
119
3. Assuming u ≠ 0, the projection of v onto u is given by
proju v = ((u · v)/kuk^2 ) u
[Figure: v decomposed into proju v along u and the perpendicular component v − proju v.]
Notice that:
◮ If we set û = u/kuk, then proju v = (û · v) û
◮ u · (v − proju v) =
120
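A small numerical sketch of the projection formula (assuming numpy), using the data of the example on the next slide:

import numpy as np

def proj(u, v):
    """Orthogonal projection of v onto u (u must be non-zero)."""
    u, v = np.asarray(u, dtype=float), np.asarray(v, dtype=float)
    return (np.dot(u, v) / np.dot(u, u)) * u

v = np.array([2.0, 1.0, 3.0])
w = np.array([2.0, -1.0, -2.0])
v1 = proj(w, v)                  # component of v parallel to w
v2 = v - v1                      # component of v perpendicular to w
print(v1, v2, np.dot(w, v2))     # w . v2 is 0 up to rounding error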
Example
Let w = (2, −1, −2) and v = (2, 1, 3).
Find vectors v1 and v2 such that
◮ v = v1 + v2 ,
◮ v1 is parallel to w, and
◮ v2 is perpendicular to w.
121
Scalar triple product
Notice that
u · (u × v) = det [u1 u2 u3; u1 u2 u3; v1 v2 v3] = 0
Similarly
u · (v × u) = v · (u × v) = v · (v × u) = 0
122
Suppose u, v, w ∈ R3 are three vectors.
[Figure: parallelepiped with adjacent edges u, v and w.]
volume of parallelepiped = |u · (v × w)| = absolute value of det [u1 u2 u3; v1 v2 v3; w1 w2 w3]
123
Example
Find the volume of the parallelepiped with adjacent edges PQ, PR, PS, where the points are: P(2, −5, 4)... P(2, −1, 1), Q(4, 6, 7), R(5, 9, 7) and S(8, 8, 8).
124
Lines
r = r0 + tv,   t ∈ R
where r0 = OP0 is the position vector of a point P0 on the line, and v is a direction vector for the line.
[Figure: the line through P0 with direction v; a general point has position vector r0 + tv.]
125
Letting r = (x, y , z), r0 = (x0 , y0 , z0 ) and v = (a, b, c) the equation
becomes
(x, y , z) = (x0 , y0 , z0 ) + t(a, b, c)
x = x0 + ta
y = y0 + tb t∈R
z = z0 + tc
Example
What is the parametric form of the line passing through the points
P(−1, 2, 3) and Q(4, −2, 5)?
126
Cartesian equations for a line
If a ≠ 0, b ≠ 0 and c ≠ 0, we can solve the parametric equations for t
and equate. This gives the cartesian form of the straight line:
(x − x0)/a = (y − y0)/b = (z − z0)/c
Example
What is the cartesian form of the line passing through the points
P(−1, 2, 3) and Q(4, −2, 5)?
127
Example
Find the vector equation of the line whose cartesian form is
(x + 1)/5 = (y − 3)/(−1) = (z − 4)/2
128
Definition
Two lines are said to:
◮ intersect if there is a point lying in both
◮ be parallel if their direction vectors are parallel
The angle between two lines is the angle between their direction vectors
Example
Find the vector equation of the line through the point P(0, 0, 1) that is
parallel to the line given by
(x − 1)/1 = (y + 2)/2 = (z − 6)/2
129
Planes
r = r0 + su + tv,   s, t ∈ R
[Figure: the plane through P0 containing the directions u and v; a general point has position vector r = r0 + su + tv.]
The angle between two planes is given by the angle between their (unit)
normal vectors.
131
It follows that the equation of the plane can also be written as
(r − r0 ) · n̂ = 0
[Figure: the plane with unit normal n̂; r0 and r are position vectors of points of the plane, so r − r0 lies in the plane.]
132
Examples
1. The plane perpendicular to the direction (1, 2, 3) and through the
point (4, 5, 6) is given by x + 2y + 3z = d where
d = 1 × 4 + 2 × 5 + 3 × 6. That is
x + 2y + 3z = 32
133
3. The plane through (1, 1, 1) containing vectors parallel to (1, 0, 1)
and (0, 1, 2) is the set of all vectors of the form
(1, 1, 1) + s(1, 0, 1) + t(0, 1, 2) s, t ∈ R
5. Find the vector and cartesian equations of the plane containing the
three points P(2, 1, −1), Q(3, 0, 1), and R(−1, 1, −1).
134
Intersection of a line and a plane
Example
Where does the line
(x − 1)/1 = (y − 2)/2 = (z − 3)/3
meet the plane 3x + 2y + z = 20?
135
Intersection of two planes
Example What is the cartesian equation of the line of intersection of
the two planes x + 3y + 2z = 6 and 3x + 2y + z = 11?
A point on the line is given by solving the two equations. For example:
Or you could solve the two equations (for the planes) simultaneously.
136
Distance from a point to a line
Example Find the distance from the point P(2, 1, 1) to the line with
cartesian equation
(x − 2)/1 = (y − 1)/1 = z/2
137
3.5 Linear Combinations [AR 5.2]
In this way, we can build up lines, planes and their higher dimensional
versions.
138
Examples
1. w = (2, 3) is a linear combination of e1 = (1, 0) and e2 = (0, 1)
139
Linear Dependence [AR 5.3]
By taking all linear combinations of a given set of vectors, we can build
up lines, planes, etc.
α1 v1 + · · · + αk vk = 0
Remember
The zero vector 0 = (0, . . . , 0) is not the same as the number zero.
140
So, the set of vectors is linearly dependent if and only if some
non-trivial linear combination gives the zero vector.
Proof:
Two vectors are linearly dependent iff one is a multiple of the other.
Three vectors in R3 are linearly dependent iff they lie in a plane.
141
Definition (Linear independence)
A set of vectors is called linearly independent if it is not linearly
dependent.
Examples
1. The vectors (2, −1) and (−6, 3) are linearly dependent.
142
Examples
1. The vectors (1, 0, 0), (1, 1, 0) and (1, 1, 1) are linearly independent.
3. (1, 2, 3), (1, 0, 0), (0, 1, 0), (0, 1, 1) are linearly dependent.
143
To decide if vectors v1 , . . . , vk ∈ Rn are linearly independent
144
Examples
1. (1, 0, 0, 0), (1, 1, 0, 0), (1, 1, 1, 0) ∈ R4 are linearly independent
145
An important observation
Example
If
[v1 · · · v5 ] ∼ [1 3 0 2 0; 0 0 1 1 0; 0 0 0 0 1]
then v2 = 3v1 and v4 = 2v1 + v3 .
146
Let’s illustrate this with an
Example
The vectors
v1 = (1, 2, 1, 3), v2 = (2, 4, 2, 6), v3 = (0, −1, 3, −1), v4 = (1, 3, −2, 4)
satisfy v2 = 2v1 and v4 = v1 − v3
147
Useful Facts:
From the method above for deciding whether vectors are linearly
dependent we can derive the following.
Idea of proof: Because then the rre form of the matrix [v1 · · · vk ] must
contain columns which do not have a leading entry. So these columns
will be dependent on other columns which do have a leading entry.
148
Theorem
Vectors v1 , . . . , vn ∈ Rn are linearly independent iff the matrix A = [v1 · · · vn ] has det(A) ≠ 0.
Idea of proof:
149
Example
Decide whether the following vectors in R3 are linearly dependent or
independent:
150
3.6 Subspaces of Rn [AR 5.2]
Definition (Subspace)
A subspace (of Rn ) is a subset S (of Rn ) that satisfies:
0. S is non-empty
1. u, w ∈ S =⇒ u + w ∈ S (closed under vector addition)
2. u ∈ S, α ∈ R =⇒ αu ∈ S (closed under scalar multiplication)
151
Examples
1. The xy -plane S = {(x, y , z) ∈ R3 | z = 0} is a subspace of R3
152
Another example
Show that the points on the line y = 2x form a subspace of R2
Note
Every subspace of Rn contains the zero vector.
153
An important example of a subspace is given by the following.
Example
The solution to the homogeneous linear system
x +y +z =0
x −y −z =0
is a subspace of R3 .
154
In general, we have the following:
Proof:
We check the 3 conditions in the definition of subspace
155
Generating a subspace
Let v1 , . . . , vk be vectors in Rn .
Definition (Span)
The subspace spanned (or subspace generated) by these vectors is the
set of all linear combinations of the given vectors:
Span{v1 , . . . , vk } = {α1 v1 + · · · + αk vk | α1 , . . . , αk ∈ R}
Examples
1. In R2 , Span{(3, 2)} is the line through the origin in the direction given by (3, 2), i.e., the line y = (2/3)x
Proof:
Remark
The subspace spanned by a set of vectors is the ‘smallest’ subspace that
contains those vectors.
157
More examples
In R3 :
1. Span{(1, 1, 0)} is the line through the origin containing the point
(1, 1, 0).
158
Spanning sets
159
Example
Show that the following vectors span R4 :
{(1, 0, −1, 0), (1, 1, 1, 1), (3, 0, 0, 0), (4, 1, −3, −1)}
That is, show that for every (a, b, c, d) ∈ R4 the linear system below has a solution.
160
Writing this linear system in matrix form gives:
[1 1 3 4; 0 1 0 1; −1 1 0 −3; 0 1 0 −1] [x; y; z; w] = [a; b; c; d]
(the 4 × 4 coefficient matrix here is A)
Which has augmented matrix:
[1 1 3 4 a; 0 1 0 1 b; −1 1 0 −3 c; 0 1 0 −1 d]
We know that this linear system is consistent for all possible values of
a, b, c, d if and only if rank(A) = 4
(Which is the case in this example)
161
To decide if v1 , . . . , vk ∈ Rn span Rn
1. Form the matrix A = [v1 · · · vk ]
having columns given by the vectors v1 , . . . , vk
2. Calculate rank(A) as before:
a. Reduce to row-echelon form
b. Count the number of non-zero rows
162
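The rank test in step 2 can be cross-checked numerically; a sketch assuming numpy, using the four vectors of the previous example as columns:

import numpy as np

A = np.array([[ 1, 1, 3,  4],
              [ 0, 1, 0,  1],
              [-1, 1, 0, -3],
              [ 0, 1, 0, -1]], dtype=float)

print(np.linalg.matrix_rank(A))   # 4, so the four columns span R^4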
Since A = [v1 · · · vk ] has k columns, we know that rank(A) ≤ k.
It follows that:
Proposition
If k < n, then v1 , . . . , vk ∈ Rn can’t span Rn .
163
3.7 Bases and dimension [AR 5.4]
For example, (1, 0), (0, 1), (2, 3), (−7, 4) span R2 , but the last two are
not needed since they can be expressed as linear combinations of the
first two. The vectors are not linearly independent.
Bases
Definition (Basis)
A basis for a subspace V ⊆ Rn is a set of vectors from V which
1. spans V
2. is linearly independent
164
Examples
1. {(1, 0), (0, 1)} is a basis for R2
2. {(2, 0), (−1, 1)} is a basis for R2
3. {(2, −1, −1), (1, 2, −3)} is a basis for the plane x + y + z = 0 in R3
4. {(2, 3, 7)} is a basis for the line in R3 given by x/2 = y/3 = z/7
7
Note
A subspace of Rn can have many bases. For example, any two vectors
in R2 which are not collinear will form a basis of R2 .
165
Notation/example:
The vectors
166
The following is an important and very useful theorem about bases:
Proof:
The proof is based on what we already know about (homogeneous)
linear systems:
167
So, in particular, every basis of Rn has exactly n elements.
Dimension
The above theorem tells us that although there can be many different
bases for the same space, they will all have the same number of vectors.
Definition (Dimension)
The dimension of a subspace V is the number of vectors in a basis for
V . This is denoted dim(V ).
168
Examples
1. The dimension of R2 is
2. Rn has dimension
3. The line {(α, α, α) | α ∈ R} ⊆ R3 has dimension
4. The plane {(x, y , z) | x + y + z = 0} ⊆ R3 has dimension
169
Calculating bases I
Example
Find a basis for the subspace
171
To find a basis for the span of a set of vectors
172
Remarks
◮ Don’t forget that it is the columns of A (not its row-echelon form)
that we use for the basis.
◮ This method gives a basis that is a subset of the original set of
vectors. Later we will give a second method which gives a basis of
different, but usually simpler, vectors
Example
Let S = {(1, −1, 2, 1), (0, 1, 1, −2), (1, −3, 0, 5)}
Find a subset of S that is a basis for hSi
173
Column space of a matrix
Remember that we saw a method for obtaining a basis for Span(S) that
started with the matrix [v1 · · · vk ] having the vectors of S as columns.
So, if V is a subspace of Rn :
175
Let m = dim(V ). From the above observation, since any basis must
have m elements, we conclude the following:
Theorem
Let V be a subspace of Rn , and suppose that dim(V ) = m.
1. If a spanning set of V has exactly m elements, then it is a basis.
2. If a linearly independent subset of V has exactly m elements, then
it is a basis.
Example
Given that
{(1, −π, √2), (23, 1, 100), (3/2, 7, 1)}
is a spanning set for R3 , it is a basis for R3 .
176
Calculating bases II
Example
Let S = {(1, −1, 2, 1), (0, 1, 1, −2), (1, −3, 0, 5)}
Find a subset of S that is a basis for hSi.
[1 −1 2 1; 0 1 1 −2; 1 −3 0 5] ∼ · · · ∼ [1 −1 2 1; 0 1 1 −2; 0 0 0 0]
It follows that a basis for hSi is {(1, −1, 2, 1), (0, 1, 1, −2)}
178
Row space of a matrix
Definition
Let A be an m × n matrix. The subspace of Rn spanned by the rows of A
is called the row space of A.
Suppose A ∼ B. The above method relies on the fact that (unlike the situation with the column space) the row space of A is equal to the row space of B.
Note
For any matrix A, the row space and the column space have the same dimension, namely rank(A).
180
Example
Let
A = [1 −1 2 −2; 2 0 1 0; 5 −3 7 −6; 1 1 −1 3]
181
Bases for solution spaces
We have seen how to find a basis for a subspace given a spanning set
for the subspace.
182
Example
Find a basis for the subspace of R4 defined by the equations
x1 + x3 + x4 = 0
3x1 + 2x2 + 5x3 + x4 = 0
x2 + x3 − x4 = 0
183
Finding a basis for the solution space:
184
Why does this work?
The set S = {v1 , · · · , vk } is a spanning set for the solution space, since
every solution can be written as a linear combination of the vectors in S.
The fact that the vectors are linearly independent results from the way
in which they were defined.
185
Another example
Find a basis for the subspace of R4 given by
(x1 , x2 , x3 , x4 ) = t1 ( , , , ) + t2 ( , , , ) t1 , t2 ∈ R
187
We have seen techniques to find bases for the row space, column space
and solution space of a matrix.
rank(A) + nullity(A) = n
Given what we know about finding the rank and the solution space of a
matrix, this is simply the statement that every column in the
row-echelon form either contains a leading entry or doesn’t contain a
leading entry.
188
Note
If you are asked to find the solution space, the column space and the
row space of A, you only need to find the reduced row-echelon form of
A once.
Remember
Definition (Coordinates)
Suppose B = {v1 , . . . , vn } is a basis for Rn . For v ∈ Rn write
v = α1 v1 + · · · + αn vn
Then
[v]B = [α1 ; . . . ; αn ]
is called the coordinate matrix of v with respect to B.
190
Examples
1. If we consider R2 with the standard basis B = {i, j},
the vector v = (1, 5) has coordinates [v]B = [1; 5]
191
Having fixed a basis, there is a one-to-one correspondence between
vectors and their coordinate matrices.
192
Topic 4: General Vector Spaces [AR chapt 5]
193
Vectors in Rn have some basic properties shared by many other mathematical systems. For example,
Key Idea:
Write down these basic properties and look for other systems which
share these properties. Any system that does share these properties will
be called a vector space.
194
4.1 The vector space axioms [AR 5.1]
Let’s start trying to write down the basic properties that we want
‘vectors’ to satisfy.
This leads to a list of ten properties (or axioms) that we will then take
as our definition.
195
The scalars are members of a number system F called a field in which
we have addition, subtraction, multiplication and division.
196
Definition (Vector Space)
A vector space is a non-empty set V with two operations: addition and
scalar multiplication.
These operations are required to satisfy the following rules.
For any u, v, w ∈ V :
Addition behaves well:
A1 u + v ∈ V (closure of vector addition)
A2 (u + v) + w = u + (v + w) (associativity)
A3 u+v =v+u (commutativity)
There must be a zero and inverses:
A4 There exists a vector 0 ∈ V such that
v + 0 = v for all v ∈ V (existence of zero vector)
A5 For all v ∈ V , there exists a vector −v
such that v + (−v) = 0 (additive inverses)
197
Definition (Vector Space ctd)
For all u, v ∈ V and α, β ∈ F:
198
Remark
It follows from the axioms that for all v ∈ V :
1. 0v = 0
2. (−1)v = −v
We are not going to show that the axioms hold for these systems.
If, however, you would like to get a feel for how this is done, read
AR5.1, Example 2.
199
4.2 Examples of vector spaces
After all, this was what we based our definition on! Vector spaces with
R as the scalars are called real vector spaces
200
2. Vector space of matrices
Denote by Mmn (or Mmn (R), Mm,n (R) or Mm×n (R)) the set of all m × n matrices with real entries.
Mmn is a real vector space with the following familiar operations:
[a11 a12; a21 a22] + [b11 b12; b21 b22] = [a11 + b11  a12 + b12; a21 + b21  a22 + b22]
α [a11 a12; a21 a22] = [αa11 αa12; αa21 αa22]
3. Vector space of polynomials
Pn = {a0 + a1 x + a2 x^2 + · · · + an x^n | a0 , a1 , . . . , an ∈ R},
the set of polynomials of degree at most n, is a real vector space.
202
4. Vector space of functions
Let S be a set.
Remark
The ‘+’ on the left of the first equation is not the same as the ‘+’ on
the right!
Why not?
203
Example
Let f , g ∈ F(R, R) be defined by
204
What is the zero vector in F(R, R)?
0(x) = 0
205
4.3 Complex vector spaces
Example
C2 = {(a1 , a2 ) | a1 , a2 ∈ C}
with the operations :
Remark
All of the above examples of real vector spaces: Rn , Pn (R), F(S, R)
have complex analogues: Cn , Pn (C), F(S, C)
206
Important observation
All the concepts we looked at for Rn
(such as subspaces, linear independence, spanning sets, bases)
carry over directly to general vector spaces.
207
4.4 Subspaces of general vector spaces
Definition (Subspace)
A subspace of a vector space V is a subset S ⊆ V that is itself a vector
space (using the operations from V ).
The following theorem shows that, in fact, we get the same thing.
Example
Let V = M2,2 the vector space of real 2 × 2 matrices and H ⊆ V be
matrices with trace equal to 0, where ‘trace’ is the sum of the diagonal
entries.
In other words
H = { [a b; c d] | a + d = 0 }
209
Another example
Let
V = P2 = {a0 + a1 x + a2 x 2 | a0 , a1 , a2 ∈ R}
and
W = {a0 + a1 x + a2 x 2 | a1 a2 > 0} ⊆ V
Is W a subspace of V ?
210
More examples
1. {0} is always a subspace of V
2. V is always a subspace of V
3. The set of diagonal matrices { [a 0 0; 0 b 0; 0 0 c] | a, b, c ∈ R } is a subspace of M3,3
5. S = {2 × 2 matrices with determinant equal to 0}
= { [a b; c d] | ad − bc = 0 } is not a subspace of M2,2
211
4.5 Spanning sets, linear independence and bases
These concepts, which we have seen for subspaces of Rn ,
apply equally well in a general vector space.
Let V be a vector space with scalars F and S ⊆ V a subset.
Definition
A linear combination of vectors v1 , v2 , . . . , vk ∈ S is a sum
α1 v1 + · · · + αk vk
Definition
The set S is linearly dependent if there are vectors v1 , . . . , vk ∈ S and
scalars α1 , . . . , αk at least one of which is non-zero, such that
α1 v1 + · · · + αk vk = 0
Definition
A basis for V is a set which is both linearly independent and a spanning
set for V .
213
Example
Are the following elements of M2,2 linearly independent?
[1 3; 0 1] ,   [−2 1; 0 −1] ,   [1 3; 0 4]
214
Another example
215
As with Rn we have the following important results:
Theorem
Let V be a vector space.
1. Every spanning set for V contains a basis for V
2. Every linearly independent set in V can be extended to a basis of V
3. Any two bases of V have the same cardinality
(i.e., ‘same number of elements’)
The basic idea behind the proof is exactly as we saw with Rn . But there
are some very interesting technical differences! These mostly concern
the possibility that a basis might have infinitely many elements.
216
Definition
The dimension of V , denoted dim(V ), is the number of elements in a
basis of V . We call V finite dimensional if it admits a finite basis, and
infinite dimensional otherwise.
Examples
1. {1, x, x 2 , . . . , x n } is a basis for Pn . So dim(Pn (R)) = n + 1
2. { [1 0; 0 0], [0 1; 0 0], [0 0; 1 0], [0 0; 0 1] } is a basis for M2×2 ,
so dim(M2×2 ) = 4.
3. { [1 0; 0 −1], [0 1; 0 0], [0 0; 1 0] } is a basis for the vector space of 2 × 2 matrices with trace equal to zero.
This is therefore a 3 dimensional subspace of M2×2 .
217
An infinite dimensional example
218
In the case of a finite dimensional vector space, we have the following:
Theorem
Suppose V has dimension n, and S is a subset of V .
1. If |S| < n, then S does not span V
2. If |S| > n, then S is linearly dependent
219
Examples
1. The polynomials
{2 + x + x 2 , 1 + x, − 1 − 7x 2 , x − x 2 }
2. The matrices
{ [2 1; 3 4], [−1 1; 0 1], [6 7; 4 5] }
220
Standard Bases
The dimension of Rn is n.
The dimension of Pn is n + 1.
222
4.6 Coordinate matrices
The notion of coordinates relative to a basis carries over from the case
of vectors in Rn .
Definition
Suppose that B = {v1 , . . . , vn } is a basis for a vector space V . For any
v ∈ V we have v = α1 v1 + · · · + αn vn for unique scalars αi , and the coordinate matrix is [v]B = [α1 ; . . . ; αn ] as before.
223
Examples
1. In P2 with basis B = {1, x, x 2 } the polynomial p = 2 + 7x − 9x 2 has coordinates [p]B = [2; 7; −9]
2. M2×2 with basis B = { [1 0; 0 0], [0 1; 0 0], [0 0; 1 0], [0 0; 0 1] }
The matrix A = [1 2; 3 4] has coordinates [A]B = [1; 2; 3; 4]
224
Topic 5: Linear Transformations [AR 4.2, 8.1]
225
We now turn to thinking about maps from one vector space to another.
v = α1 v1 + α2 v2 + · · · + αn vn
226
Definition (Linear transformation)
Let V and W be vector spaces (over the same field of scalars).
A linear transformation from V to W is a map T : V → W such that
for each u, v ∈ V and for each scalar α:
227
5.1 Linear transformations from R2 to R2
Reflection across the y-axis sends (x, y ) to (−x, y ):
T [x; y] = [−x; y]
[Figure: the point (x, y ) and its mirror image (−x, y ) across the y-axis.]
228
A common feature of all linear transformations is that they can be
represented by a matrix. In the example above
T [x; y] = [−1 0; 0 1] [x; y] = [−x; y]
The matrix
AT = [−1 0; 0 1]
is called the standard matrix representation of the transformation T
229
Examples of (geometric) linear transformations from R2 to R2
1. Reflection across the x-axis has matrix
2. Reflection in the line y = 5x has matrix [−12/13 5/13; 5/13 12/13]
3. Rotation around the origin by an angle of π/2 has matrix [0 −1; 1 0]
4. Rotation around the origin by an angle of θ has matrix
We need to work out the coordinates of the point Q obtained by rotating P.
[Figure: the point P rotated by θ about the origin to the point Q.]
230
Examples continued
5. Compression/expansion along the x-axis has matrix
6. Shear along the x-axis has matrix [1 c; 0 1]
These are best thought of as mappings on a rectangle.
For example, a shear along the x-axis corresponds to the mapping
231
Successive Transformations
Example
Find the image of (x, y ) after a shear along the x-axis with c = 1 followed by a compression along the y -axis with c = 1/2.
Solution:
Let R : R2 → R2 be the compression and denote its standard matrix
representation by AR . Similarly let S : R2 → R2 be the shear and
denote its standard matrix representation by AS . Then the coordinate
matrix of R(S(x, y )) is given by
AR AS [x; y]
232
Note
1. The matrix for the linear transformation S followed by the linear
transformation R is the matrix product AR AS .
(In other words ARS = AR AS )
2. Notice that (reading right to left) the two matrices are in the
opposite order to the order in which the transformations are
applied.
3. The composition of two linear transformations T (v) = R(S(v)) is
also written T (v) = R ◦ S(v)
233
5.2 Linear transformations from Rn to Rm
To prove that this is a linear transformation, we must show that for any
u, v ∈ R3 and α ∈ R
we have that
234
Proof Let u = (u1 , u2 , u3 ) and v = (v1 , v2 , v3 ). First we note that
u + v = (u1 + v1 , u2 + v2 , u3 + v3 )
235
In fact this example is typical of all linear transformations from Rn to
Rm — the mapping rule gives a vector whose components are linear
combinations of the components of the input vector.
Theorem
All linear transformations T : Rn → Rm have a standard matrix representation AT specified by
AT = [ [T (e1 )] [T (e2 )] · · · [T (en )] ]
Note
◮ The matrix AT has size m × n.
◮ Alternative notations for AT include: [T ] or [T ]S or [T ]S,S
236
Proof
Let v ∈ Rn and write
v = α1 e1 + α2 e2 + · · · + αn en
The coordinate matrix of v with respect to the standard basis is then
[v] = [α1 ; α2 ; . . . ; αn ]
Since T is linear, we have that
Since T is linear, we have that
T (v) = α1 T (e1 ) + α2 T (e2 ) + · · · + αn T (en )
237
We read off from this that
[T (v)] = α1 [T (e1 )] + α2 [T (e2 )] + · · · + αn [T (en )]    (property of coordinate matrices)
= [ [T (e1 )] [T (e2 )] · · · [T (en )] ] [α1 ; . . . ; αn ]    (just matrix multiplication)
[T (v)] = AT [v]
In words: AT times the column matrix [v] equals the coordinate matrix of T (v). That is, the image of any vector v ∈ Rn can be calculated using the matrix AT .
238
Note
1. All linear transformations map the zero vector to the zero vector.
2. Linear transformations map lines through the origin to other lines
through the origin (or to just the origin).
3. The above theory generalizes to linear transformations between any
n-dimensional vector space V and m-dimensional vector space W .
239
Examples
1. Define T : R3 → R4 by T (x1 , x2 , x3 ) = (x1 , x3 , x2 , x1 + x3 ).
Calculate AT .
240
5.3 Matrix representations in general [AR 8.4]
T : U → V is a linear transformation,
B = {b1 , . . . , bn } is a basis for U and
C = {c1 , . . . , cm } is a basis for V .
241
We want a matrix that can be used to calculate the effect of T .
Specifically, if we denote the matrix by AC,B , we want that
[T (u)]C = AC,B [u]B for all u ∈ U    (∗)
Note
For this matrix equation to make sense the size of AC,B must be m × n.
Theorem
There exists a unique matrix satisfying the above condition (∗). It is
given by h i
AC,B = [T (b1 )]C [T (b2 )]C · · · [T (bn )]C
The proof is the same as for the case of the formula for the standard
matrix AT .
242
A note on notation
The matrix AC,B is also denoted by [T ]C,B . In the special case in which
U = V and B = C, we often write [T ]B in place of [T ]B,B .
Example
A linear transformation T : R3 → R2 has matrix [5 1 0; 1 5 −2] with
respect to the standard bases of R3 and R2 . What is its matrix with
respect to the basis B = {(1, 1, 0), (1, −1, 0), (1, −1, −2)} of R3 and
the basis C = {(1, 1), (1, −1)} of R2 ?
Solution
We apply T to the elements of B to get:
243
Example
Consider a linear transformation T : V → V , where V is the vector space of real valued 2 × 2 matrices. Find the matrix of T with respect to the bases
B = { [1 1; 0 0], [1 0; 0 1], [1 0; 1 0], [0 1; 1 0] }
and
C = { [1 0; 0 0], [0 1; 0 0], [0 0; 1 0], [0 0; 0 1] }
244
Solution
The task is to work out the coordinates with respect to C of the image
of each element in B.
245
5.4 Image, kernel, rank and nullity [AR 8.2]
Let T : U → V be a linear transformation.
ker(T ) = {u ∈ U | T (u) = 0}    Im(T ) = {T (u) | u ∈ U}
Example: find the kernel and image of T : R3 → R2 , where
T (x, y , z) = (x − y + z, 2x − 2y + 2z)
247
Definition
A linear transformation T : U → V is called injective if ker(T ) = {0}.
It is called surjective if Im(T ) = V .
Example
Is the linear transformation of the preceding example injective ?
Is it surjective ?
248
Note
When we calculate ker(T ) we find that we are solving equations of the
form AT X = 0. So ker(T ) is the same as the solution space for AT .
To calculate Im(T ) we can use the fact that, because B spans U, T (B) must span T (U), the image of T . But the elements of T (B) are given by the columns of AT . Thus the image of T can be identified with the column space of AT . It follows that rank(T ) = rank(AT ).
We can now use the result of slide 188 to see that for a linear
transformation T : U → V with dim(U) = n
nullity(T ) + rank(T ) = n
249
5.5 Change of basis
Transition Matrices
By multiplication by a matrix!
250
Theorem
There exists a unique matrix P such that for any vector v ∈ V ,
[v]C = P[v]B
P = [ [b1 ]C · · · [bn ]C ]
251
Proof:
We want to find a matrix P such that for all vectors v in V ,
[v]C = P[v]B
Recall that for any linear transformation T : V → V ,
[T (v)]C = [T ]C,B [v]B    (∗)
where
[T ]C,B = [ [T (b1 )]C [T (b2 )]C · · · [T (bn )]C ]
is the matrix representation of T .
252
Applying this to the special case where T (v) = v for all v
(i.e., T is the identity linear transformation)
gives
[T ]C,B = [b1 ]C [b2 ]C . . . [bn ]C
and (∗) becomes
[v]C = [T ]C,B [v]B
So we can take P = [T ]C,B
Exercise
Finish the proof by showing that P is unique. That is, if Q is a matrix
satisfying [v]C = Q[v]B for all v, then Q = P.
253
A simple case
The transition matrix is easy to calculate when one of B or C is the
standard basis.
Example
In R2 , write down the transition matrix from B to S, where S is the standard basis and B = {(1, 1), (1, −1)}.
Solution
h i
PS,B = [b1 ]S [b2 ]S =
254
Going in the other direction
255
Example
For B and S as in the previous example, compute PB,S , the transition
matrix from S to B.
Use it to compute [v]B , given [v]S = [2; 0].
Solution
We saw that in this case
PS,B = [1 1; 1 −1]
It follows that
PB,S =
[v]B =
256
Calculating a general transition matrix
Combining these,
257
Example
With U = V = R2 and B = {(1, 2), (1, 1)} and C = {(−3, 4), (1, −1)},
find PC,B .
PS,B = PS,C =
So
PC,S = and PC,B =
258
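The blanks above can be filled in numerically; a sketch assuming numpy (recall that the columns of PS,B are the vectors of B written in standard coordinates):

import numpy as np

PSB = np.array([[1.0, 1.0],      # columns (1,2) and (1,1)
                [2.0, 1.0]])
PSC = np.array([[-3.0,  1.0],    # columns (-3,4) and (1,-1)
                [ 4.0, -1.0]])

PCS = np.linalg.inv(PSC)         # P_{C,S} = (P_{S,C})^{-1}
PCB = PCS @ PSB                  # P_{C,B} = P_{C,S} P_{S,B}
print(PCB)                       # [[3, 2], [10, 7]]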
Relationship Between Different Matrix Representations
Example
Find the standard matrix representation [T ]S of T : R2 → R2 , where
T (x, y ) = (3x − y , −x + 3y )
Solution
259
Example continued
Now find the matrix of T with respect to the basis
B = {(1, 1), (1, −1)}
Solution
260
How are [T ]S and [T ]B related?
Theorem
The matrix representations of T : V → V with respect to two bases C
and B are related by the following equation:
[T ]B = PB,C [T ]C PC,B
Proof:
We need to show that for all v ∈ V
[T (v)]B = PB,C [T ]C PC,B [v]B
Starting with the right-hand side we obtain:
PB,C [T ]C PC,B [v]B = PB,C [T ]C [v]C (property of PC,B )
= PB,C [T (v)]C (property of [T ]C )
= [T (v)]B (property of PB,C )
261
Example
[T ]C =
[T ]B =
where C = {(1, 0), (0, 1)} is the standard basis and B = {(1, 1), (1, −1)}
[T ]B = PB,C [T ]C PC,B
262
Topic 6: Inner Product Spaces [AR Ch 6]
263
6.1 Definition of inner products
The Euclidean length of a vector v ∈ Rn is defined as
kvk = √(v1^2 + v2^2 + · · · + vn^2 ) = √(v · v)
and recall that the projection of v onto u is
proju v = ((u · v)/kuk^2 ) u
264
We want to address two issues:
1. How to generalise the notion of a dot product for vector spaces
other than Rn .
2. To relate ideas associated with the dot product and its
generalisations to bases and linear equations.
The first can be done by looking carefully at some of the key properties
and generalising them.
265
Definition (Inner Product)
Let V be a vector space over the real numbers. An inner product on V
is a function that associates with every pair of vectors u, v ∈ V a real
number, denoted hu, vi, satisfying the following properties.
Note
A vector space can admit many different inner products. We’ll see some
examples shortly.
266
In words these axioms say that we require the inner product to:
1. be symmetric;
2. be linear with respect to scalar multiplication;
3. be linear with respect to addition;
4. a. have positive squared lengths;
b. be such that only the zero vector has length 0.
But this is not the only inner product in the vector space Rn .
267
Example
Show that, in R2 , if u = (u1 , u2 ) and v = (v1 , v2 ) then
hu, vi = u1 v1 + 2u2 v2
268
More examples
1. Show that in R3 , hu, vi = u1 v1 − u2 v2 + u3 v3 does not define an
inner product by showing that axiom 4a does not hold.
2. Show that in R2 ,
hu, vi = uT [2 −1; −1 1] v = [u1 u2 ] [2 −1; −1 1] [v1 ; v2 ]
defines an inner product.
269
Another example
The check that this satisfies axioms 2 and 3 follows exactly the same
lines as the previous example.
The hard work occurs in satisfying axiom(s) 4. A quick check with the
standard basis vectors shows that we must have a > 0 and d > 0.
cos θ = hv, ui / (kvk kuk)   with 0 ≤ θ ≤ π
272
Note
In order for this definition of angle to make sense we need
−1 ≤ hv, ui / (kvk kuk) ≤ 1
273
Example
In R2 with the inner product hu, vi = u1 v1 + 2u2 v2 , let u = (3, 1) and v = (−2, 3). Then
hu, vi = h(3, 1), (−2, 3)i = 3 × (−2) + 2 × 1 × 3 = 0
so u and v are orthogonal (using this inner product)
274
Example (an inner product for functions)
The set C [a, b] of all continuous functions f : [a, b] → R is a vector space.
It's a subspace of F([a, b], R), the vector space of all functions.
275
Example
Consider C [0, 2π] with the inner product hf , g i = ∫_0^{2π} f (x)g (x) dx
The norms of the functions s(x) = sin(x) and c(x) = cos(x) are:
ksk^2 = hs, si = ∫_0^{2π} sin^2 (x) dx = ∫_0^{2π} (1/2)(1 − cos(2x)) dx
= [ x/2 − (1/4) sin(2x) ]_0^{2π} = π
So ksk = √π and (similarly) kck = √π
hs, ci = ∫_0^{2π} sin(x) cos(x) dx = ∫_0^{2π} (1/2) sin(2x) dx = [ −(1/4) cos(2x) ]_0^{2π} = 0
Application
The definition of angle is okay!
We defined the angle between two vectors using
hu, vi
cos θ =
kukkvk
From the Cauchy-Schwarz inequality we know that
−1 ≤ hu, vi / (kuk kvk) ≤ 1
277
Example
For two continuous functions f , g : [a, b] → R, it follows directly from
the Cauchy-Schwarz inequality that
( ∫_a^b f (x)g (x) dx )^2 ≤ ( ∫_a^b f^2 (x) dx ) ( ∫_a^b g^2 (x) dx )
Example
Set f (x) = √x and g (x) = 1/√x; also a = 1, b = t > 1.
We obtain
( ∫_1^t 1 dx )^2 ≤ ( ∫_1^t x dx ) ( ∫_1^t (1/x) dx )
which becomes
(t − 1)^2 ≤ ((t^2 − 1)/2) log t   or   log t ≥ 2(t − 1)/(t + 1)
278
6.4 Orthogonality and projections [AR 6.2]
Orthogonal sets
Recall that u and v are orthogonal if hu, vi = 0
Examples
1. {(1, 0, 0), (0, 1, 0), (0, 0, 1)} is orthogonal in R3 with the dot
product.
2. So is {(1, 1, 1), (1, −1, 0), (1, 1, −2)}.
3. {sin(x), cos(x)} is orthogonal in C [0, 2π] equipped with the inner
product defined on slide 276
279
Proposition
Every orthogonal set of nonzero vectors is linearly independent
Proof:
280
Orthonormal sets
Definition
A set of vectors {v1 , . . . , vk } is called orthonormal if it is orthogonal
and each vector has length one. That is
{v1 , . . . , vk } is orthonormal ⇐⇒ hvi , vj i = 0 for i ≠ j, and hvi , vi i = 1
Note
Any orthogonal set of non-zero vectors can be made orthonormal by
dividing each vector by its length.
Examples
1. In R3 with the dot product:
{(1, 0, 0), (0, 1, 0), (0, 0, 1)} is orthonormal
{(1, 1, 1), (1, −1, 0), (1, 1, −2)} is not (though it is orthogonal)
{ (1/√3)(1, 1, 1), (1/√2)(1, −1, 0), (1/√6)(1, 1, −2) } is orthonormal
281
2. In C [0, 2π] with the inner product hf , g i = ∫_0^{2π} f (x)g (x) dx:
The set {sin(x), cos(x)} is orthogonal but not orthonormal.
The set { (1/√π) sin(x), (1/√π) cos(x) } is orthonormal
The (infinite) set
{ 1/√(2π), (1/√π) sin(x), (1/√π) cos(x), (1/√π) sin(2x), (1/√π) cos(2x), . . . }
is orthonormal.
282
Orthonormal bases
Proof: Exercise!
283
Orthogonal projection [AR 6.3]
Let V be a real vector space, with inner product h·, ·i
Let u ∈ V be a unit vector (i.e., kuk = 1)
Definition
The orthogonal projection of v onto u is
p = hv, uiu
Note
◮ p = kvk cos θ u
◮ v − p is orthogonal to u
Example
The orthogonal projection of (2, 3) onto (1/√2)(1, 1) is:
284
More generally, we can project onto a subspace W of V as follows.
Definition
The orthogonal projection of v ∈ V onto W is
projW v = hv, u1 iu1 + · · · + hv, uk iuk
where {u1 , . . . , uk } is any orthonormal basis for W .
286
6.5 Gram-Schmidt orthogonalization procedure
The following can be used to make any basis orthonormal:
Gram-Schmidt procedure [AR Thm 6.3.6]
Suppose {v1 , . . . , vk } is a basis for V .
1. Let u1 = v1 /kv1 k
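The remaining steps (elided above) subtract from each vk its projections onto the vectors already constructed and then normalise. A sketch of the whole procedure for the dot product on Rn , assuming numpy:

import numpy as np

def gram_schmidt(vectors):
    """Return an orthonormal basis for the span of the given vectors."""
    basis = []
    for v in vectors:
        w = np.asarray(v, dtype=float)
        for u in basis:
            w = w - np.dot(u, w) * u      # remove the component along u
        if np.linalg.norm(w) > 1e-12:     # skip vectors dependent on earlier ones
            basis.append(w / np.linalg.norm(w))
    return basis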
Example
Find the point in W closest to v = (2, 2, 1, 3)
Answer
(1/2)(3, 5, 3, 5)
288
6.6 Application: curve fitting [AR 6.4]
Given a set of data points (x1 , y1 ), (x2 , y2 ),. . . , (xn , yn ) we want to find
the straight line y = a + bx that best approximates the data.
289
So given (x1 , y1 ), . . . , (xn , yn ) we want to find a, b ∈ R which minimise
Σ_{i=1}^n (yi − (a + bxi ))^2
290
We seek the vector in W = {Au | u ∈ R2 } (the column space of A) that is closest to y.
The closest vector is precisely projW y
291
However, we can calculate u directly (without finding an orthonormal
basis for W ) by noting that
hw, y − projW yi = 0 for all w ∈ W
=⇒ AT (y − Au) = 0
=⇒ AT y − AT Au = 0
=⇒ AT Au = AT y
From this we can calculate u, given that we know A and y.
292
Summary
Given data points (x1 , y1 ), (x2 , y2 ), . . . , (xn , yn ), to find the straight line y = a + bx of best fit, we find u = [a; b] such that
AT A u = AT y    (∗)
where y = [y1 ; y2 ; . . . ; yn ] and A = [1 x1 ; 1 x2 ; . . . ; 1 xn ]
If AT A is invertible, (∗) has the unique solution
u = (AT A)−1 AT y
293
Example
Find the straight line which best fits the data points
(−1, 1), (1, 1), (2, 3)
Answer
Line of best fit is: y = 9/7 + (4/7)x
294
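A numerical sketch of this example via the normal equations (assuming numpy):

import numpy as np

x = np.array([-1.0, 1.0, 2.0])
y = np.array([ 1.0, 1.0, 3.0])

A = np.column_stack([np.ones_like(x), x])   # rows (1, x_i)
u = np.linalg.solve(A.T @ A, A.T @ y)       # solve A^T A u = A^T y
print(u)                                    # approximately [9/7, 4/7]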
Extension
The same method works for finding quadratic fitting curves.
To find the quadratic y = a + bx + cx 2 which best fits data (x1 , y1 ),
(x2 , y2 ),. . . , (xn , yn ) we take
A = [1 x1 x1^2 ; 1 x2 x2^2 ; . . . ; 1 xn xn^2 ]   and   y = [y1 ; . . . ; yn ]
and solve
AT Au = AT y
for
u = [a; b; c]
295
Topic 7: Eigenvalues and Eigenvectors [AR Chapt 7]
296
7.1 Definition of eigenvalues and eigenvectors
The topic of eigenvalues and eigenvectors is fundamental to many
applications of linear algebra. These include quantum mechanics in
physics, image compression and reconstruction in computing and
engineering, and the analysis of high dimensional data in statistics.
Suppose that
T (v) = λv
for some scalar λ. It follows that T maps the subspace Span{v} to
itself.
Note that λ is a scale factor which stretches the subspace and possibly
changes its sense (if λ is negative).
297
Definition
Let T : V → V be a linear transformation.
A scalar λ is an eigenvalue of T if there is a non-zero vector v ∈ V such
that
T (v) = λv (∗)
The vector v is called an eigenvector of T (with eigenvalue λ).
If λ is an eigenvalue of T , then the set of all v satisfying (∗) is a
subspace of V (exercise!) and is called the eigenspace of λ.
298
In fact the idea of eigenvalues and eigenvectors can be applied directly
to square matrices.
Definition
Let A be an n × n matrix and let λ be a scalar. Then a non-zero n × 1 column matrix v with the property that
Av = λv
is called an eigenvector of A, and λ is the corresponding eigenvalue.
Rearranging, Av = λv is equivalent to
(A − λI )v = 0
The values of λ for which this equation has non-zero solutions are precisely the eigenvalues.
Theorem
The homogeneous linear system (A − λI )v = 0 has a non-zero solution
if and only if det(A − λI ) = 0. Consequently, the eigenvalues of A are
the values of λ for which
det(A − λI ) = 0
300
Notation
The equation det(A − λI ) = 0 is referred to as the characteristic
equation. From our study of determinants, we know that det(A − λI ) is
a polynomial of degree n in λ. It is called the characteristic polynomial.
Example
Find the eigenvalues of [1 4; 1 1].
301
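A numerical cross-check (assuming numpy); eig returns the roots of the characteristic polynomial together with corresponding eigenvectors:

import numpy as np

A = np.array([[1.0, 4.0],
              [1.0, 1.0]])

evals, evecs = np.linalg.eig(A)
print(evals)    # 3 and -1, the roots of (1 - l)^2 - 4 = 0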
Example
Find the eigenvalues of [0 1; −1 0].
302
Example
Find the eigenvalues of the matrix A = [1 0 0 0; 0 1 0 0; 0 0 1 −1; 0 0 1 1].
303
7.3 The Cayley-Hamilton theorem
Theorem (Cayley-Hamilton)
Every square matrix satisfies its own characteristic equation: if p(λ) = det(A − λI ) is the characteristic polynomial of A, then p(A) = 0.
304
Examples
Let A = [3 2; 1 4]
305
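A sketch of the verification for this A (my own computation): the characteristic polynomial is p(λ) = det(A − λI ) = (3 − λ)(4 − λ) − 2 = λ^2 − 7λ + 10, and
A^2 − 7A + 10I = [11 14; 7 18] − [21 14; 7 28] + [10 0; 0 10] = [0 0; 0 0]
as the Cayley-Hamilton theorem predicts.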
7.4 Finding eigenvectors
306
Example
For each of the eigenvalues λ = −1, 8 (the eigenvalue −1 is repeated twice) of the matrix [3 2 4; 2 0 2; 4 2 3], find a basis for the corresponding eigenspace.
307
7.5 Diagonalization [AR 7.2]
Definition
A square matrix A is said to be diagonalizable if there is an invertible
matrix P such that P −1 AP is a diagonal matrix. The matrix P is said
to diagonalize A.
308
To test if A is diagonalizable, the following theorem can be used.
Theorem
An n × n matrix A is diagonalizable if and only if there is a basis for Rn
all of whose elements are eigenvectors of A.
Idea of proof:
If we can form a basis in which all the basis vectors are eigenvectors of
T , then the new matrix for T will be diagonal.
309
A simple case in which A is diagonalizable is when A has n distinct
eigenvalues. This follows from the theorem above and the following
Lemma
Eigenvectors corresponding to distinct eigenvalues are linearly
independent.
Idea of proof:
310
Example
Give reasons why the matrix A = [1 2; 0 1] cannot be diagonalized.
311
How to diagonalize a matrix
Theorem
Let A be a diagonalizable n × n matrix. Thus there exists a basis
{v1 , . . . , vn } for Rn whose
elements are eigenvectors of A.
Let P = [ [v1 ] · · · [vn ] ] and let λi be the eigenvalue of the eigenvector vi .
Then
P −1 AP = diag [λ1 , λ2 , . . . , λn ]
312
Example
Check that [2; 1; 2], [−1; 2; 0], [−1; 0; 1] are eigenvectors of the matrix
A = [3 2 4; 2 0 2; 4 2 3]
313
Orthogonal matrices
Because a matrix often represents a physical system it can be important
that the change-of-basis transformation does not affect shape.
This will happen when the change-of-basis matrix is orthogonal:
Definition
An n × n matrix P is orthogonal if the columns of P form an
orthonormal basis of Rn .
Examples
[−1/√2 1/√6 1/√3; 0 −2/√6 1/√3; 1/√2 1/√6 1/√3]   and   [cos θ −sin θ; sin θ cos θ]   are orthogonal,
but   [1 1; 0 1]   and   [1 −1; 1 1]   are not.
314
Orthogonal matrices have some good properties.
315
Real symmetric matrices
Definition
A matrix A is symmetric if AT = A
Examples
[1 2; 2 3]   and   [3 −1 4; −1 1 5; 4 5 9]   are symmetric,
but   [2 1; 3 2]   and   [3 −1 4; −1 1 5; 4 6 9]   are not
317
Example
Find matrices D and Q as above for A = [2 −1; −1 2]
318
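A numerical sketch (assuming numpy): for a symmetric matrix, eigh returns an orthonormal set of eigenvectors, which gives Q and D directly:

import numpy as np

A = np.array([[ 2.0, -1.0],
              [-1.0,  2.0]])

evals, Q = np.linalg.eigh(A)             # eigenvalues 1 and 3, orthonormal eigenvectors
D = np.diag(evals)
print(np.allclose(Q @ D @ Q.T, A))       # A = Q D Q^T
print(np.allclose(Q.T @ Q, np.eye(2)))   # Q is orthogonal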
Powers of a matrix
In fact, we have
Lemma
Suppose A is diagonalizable, so that D = P −1 AP is diagonal. Then
Ak = PD k P −1
319
Example
For the matrix of slide 301 we can write
A = PDP −1 with D = [3 0; 0 −1], P = [2 −2; 1 1]
Thus
A^k = PD^k P −1 with P −1 = (1/4) [1 2; −1 2]
Explicitly,
A^n = (1/4) × [2 −2; 1 1] × [3^n 0; 0 (−1)^n ] × [1 2; −1 2]
= (1/4) [2(3^n + (−1)^n ) 4(3^n − (−1)^n ); 3^n − (−1)^n 2(3^n + (−1)^n )]
320
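A numerical sketch of this identity (assuming numpy; n = 5 is an arbitrary choice):

import numpy as np

A = np.array([[1.0, 4.0], [1.0, 1.0]])
P = np.array([[2.0, -2.0], [1.0, 1.0]])
d = np.array([3.0, -1.0])                       # diagonal of D

n = 5
An = P @ np.diag(d ** n) @ np.linalg.inv(P)     # P D^n P^{-1}
print(np.allclose(An, np.linalg.matrix_power(A, n)))   # True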
Conic Sections [AR 9.6]
ax^2 + bxy + cy^2 + dx + ey + f = 0
We will see how the shape of this graph can be calculated using diagonalization.
[Figure: the graph of one such conic.]
321
We shall assume that d and e are zero.
See [AR 9.6] for a discussion of how to reduce to this case.
ellipse:  x^2/α^2 + y^2/β^2 = 1
hyperbola:  x^2/α^2 − y^2/β^2 = 1
parabola:  y = αx^2
322
The equation in matrix form
Consider the curve defined by the equation
ax^2 + bxy + cy^2 = 1
That is,
xT Ax = 1
where x = [x; y] and A = [a b/2; b/2 c] is a real symmetric matrix.
We can diagonalize in order to simplify the equation so that we can
identify the curve.
Let’s demonstrate with an example.
323
Identify and sketch the conic defined by x^2 + 4xy + y^2 = 1
This can be written as xT Ax = 1 where A = [1 2; 2 1]
Diagonalizing gives A = QDQT with D = [3 0; 0 −1] and Q = (1/√2) [1 −1; 1 1]
Let x′ = [x′ ; y′ ] be the co-ordinates of (x, y ) relative to the orthonormal basis of eigenvectors: B = {(1/√2, 1/√2), (−1/√2, 1/√2)}
Then x = Qx′ (Q is precisely the transition matrix PS,B ), and the equation of the conic can be rewritten:
3(x′)^2 − (y′)^2 = 1
[Figure: the hyperbola 3(x′)^2 − (y′)^2 = 1 in the rotated (x′ , y′ ) axes, and the same curve x^2 + 4xy + y^2 = 1 in the original (x, y ) axes.]
325
Summary
326
Example (a quadric surface)
The equation
3X^2 + 3Y^2 − 3Z^2 = 1
327
The surface is a ‘hyperboloid of one sheet’; see the sketch below.
(You are not expected to identify quadric surfaces in three dimensions.)
328