Michael T. Heath
Scientific Computing
Least Squares
Data Fitting
Given $m$ data points $(t_i, y_i)$, find $n$-vector $x$ of parameters that gives best fit to model function $f(t, x)$:

$$\min_x \sum_{i=1}^{m} \bigl(y_i - f(t_i, x)\bigr)^2$$
Polynomial fitting

$$f(t, x) = x_1 + x_2 t + x_3 t^2 + \cdots + x_n t^{n-1}$$

is linear, since polynomial is linear in coefficients $x_i$, though nonlinear in independent variable $t$

Fitting sum of exponentials

$$f(t, x) = x_1 e^{x_2 t} + \cdots + x_{n-1} e^{x_n t}$$

is example of nonlinear problem

For now, we will consider only linear least squares problems
Example: Data Fitting

Fitting quadratic polynomial $p(t) = x_1 + x_2 t + x_3 t^2$ to five data points $(t_i, y_i)$ gives linear system

$$Ax = \begin{bmatrix} 1 & t_1 & t_1^2 \\ 1 & t_2 & t_2^2 \\ 1 & t_3 & t_3^2 \\ 1 & t_4 & t_4^2 \\ 1 & t_5 & t_5^2 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} \cong \begin{bmatrix} y_1 \\ y_2 \\ y_3 \\ y_4 \\ y_5 \end{bmatrix} = b$$
Matrix whose columns (or rows) are successive powers of
independent variable is called Vandermonde matrix
Example, continued
For data

    t   -1.0  -0.5   0.0   0.5   1.0
    y    1.0   0.5   0.0   0.5   2.0

overdetermined $5 \times 3$ linear system is

$$Ax = \begin{bmatrix} 1 & -1.0 & 1.0 \\ 1 & -0.5 & 0.25 \\ 1 & 0.0 & 0.0 \\ 1 & 0.5 & 0.25 \\ 1 & 1.0 & 1.0 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} \cong \begin{bmatrix} 1.0 \\ 0.5 \\ 0.0 \\ 0.5 \\ 2.0 \end{bmatrix} = b$$

Solution, which we will see later how to compute, is

$$x = \begin{bmatrix} 0.086 & 0.40 & 1.4 \end{bmatrix}^T$$

so approximating polynomial is

$$p(t) = 0.086 + 0.4t + 1.4t^2$$
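A short NumPy check (a sketch, not from the original slides; variable names are mine):

```python
import numpy as np

t = np.array([-1.0, -0.5, 0.0, 0.5, 1.0])
y = np.array([1.0, 0.5, 0.0, 0.5, 2.0])

# Design matrix with columns 1, t, t^2
A = np.column_stack([np.ones_like(t), t, t**2])

# Solve the overdetermined system in the least squares sense
x, *_ = np.linalg.lstsq(A, y, rcond=None)
print(x)  # approximately [0.086, 0.400, 1.429]
```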
Example, continued
Resulting curve and original data points are shown in graph
Normal Equations
To minimize squared Euclidean norm of residual vector

$$\|r\|_2^2 = r^T r = (b - Ax)^T (b - Ax) = b^T b - 2x^T A^T b + x^T A^T A x$$

take derivative with respect to $x$ and set it to 0:

$$2A^T A x - 2A^T b = 0$$

which reduces to $n \times n$ linear system of normal equations

$$A^T A x = A^T b$$
Orthogonality
Vectors $v_1$ and $v_2$ are orthogonal if their inner product is zero, $v_1^T v_2 = 0$

Space spanned by columns of $m \times n$ matrix $A$, $\mathrm{span}(A) = \{Ax : x \in \mathbb{R}^n\}$, is of dimension at most $n$

If $m > n$, $b$ generally does not lie in $\mathrm{span}(A)$, so there is no exact solution to $Ax = b$

Vector $y = Ax$ in $\mathrm{span}(A)$ closest to $b$ in 2-norm occurs when residual $r = b - Ax$ is orthogonal to $\mathrm{span}(A)$:

$$0 = A^T r = A^T (b - Ax)$$

again giving system of normal equations

$$A^T A x = A^T b$$
Orthogonality, continued
Geometric relationships among b, r, and span(A) are
shown in diagram
Orthogonal Projectors
Matrix $P$ is orthogonal projector if it is idempotent ($P^2 = P$) and symmetric ($P^T = P$)

Orthogonal projector onto orthogonal complement $\mathrm{span}(P)^\perp$ is given by $P_\perp = I - P$

For any vector $v$,

$$v = (P + (I - P))\,v = Pv + P_\perp v$$

For least squares problem $Ax \cong b$, if $\mathrm{rank}(A) = n$, then

$$P = A(A^T A)^{-1} A^T$$

is orthogonal projector onto $\mathrm{span}(A)$, and

$$b = Pb + P_\perp b = Ax + (b - Ax) = y + r$$
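A small NumPy sketch (my own illustration, not from the slides) confirms these properties for the data-fitting matrix $A$ used earlier:

```python
import numpy as np

t = np.array([-1.0, -0.5, 0.0, 0.5, 1.0])
A = np.column_stack([np.ones_like(t), t, t**2])

# Orthogonal projector onto span(A); fine for a demo, though forming
# (A^T A)^{-1} explicitly is not advised in practice
P = A @ np.linalg.inv(A.T @ A) @ A.T

print(np.allclose(P @ P, P))  # idempotent: True
print(np.allclose(P.T, P))    # symmetric:  True

b = np.array([1.0, 0.5, 0.0, 0.5, 2.0])
y, r = P @ b, (np.eye(5) - P) @ b
print(np.allclose(A.T @ r, 0))  # residual orthogonal to span(A): True
```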
Sensitivity and Conditioning

Sensitivity of least squares solution depends on angle $\theta$ between $b$ and $y = Ax$, defined by

$$\cos(\theta) = \frac{\|y\|_2}{\|b\|_2} = \frac{\|Ax\|_2}{\|b\|_2}$$
Example: Normal Equations Method

For polynomial data-fitting example given previously, normal equations method gives

$$A^T A = \begin{bmatrix} 5.0 & 0.0 & 2.5 \\ 0.0 & 2.5 & 0.0 \\ 2.5 & 0.0 & 2.125 \end{bmatrix}, \qquad A^T b = \begin{bmatrix} 4.0 \\ 1.0 \\ 3.25 \end{bmatrix}$$
Example, continued
Cholesky factorization of symmetric positive definite matrix $A^T A$ gives

$$A^T A = \begin{bmatrix} 2.236 & 0 & 0 \\ 0 & 1.581 & 0 \\ 1.118 & 0 & 0.935 \end{bmatrix} \begin{bmatrix} 2.236 & 0 & 1.118 \\ 0 & 1.581 & 0 \\ 0 & 0 & 0.935 \end{bmatrix} = LL^T$$

Solving lower triangular system $Lz = A^T b$ by forward-substitution gives $z = \begin{bmatrix} 1.789 & 0.632 & 1.336 \end{bmatrix}^T$

Solving upper triangular system $L^T x = z$ by back-substitution gives $x = \begin{bmatrix} 0.086 & 0.400 & 1.429 \end{bmatrix}^T$
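A NumPy sketch (names mine) reproduces these numbers:

```python
import numpy as np

t = np.array([-1.0, -0.5, 0.0, 0.5, 1.0])
b = np.array([1.0, 0.5, 0.0, 0.5, 2.0])
A = np.column_stack([np.ones_like(t), t, t**2])

# Form and factor the normal equations matrix
L = np.linalg.cholesky(A.T @ A)   # A^T A = L L^T, L lower triangular
z = np.linalg.solve(L, A.T @ b)   # forward substitution (dense solve shown for brevity)
x = np.linalg.solve(L.T, z)       # back substitution
print(z)  # approximately [1.789, 0.632, 1.336]
print(x)  # approximately [0.086, 0.400, 1.429]
```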
Shortcomings of Normal Equations

Information can be lost in forming $A^T A$ and $A^T b$. For example, take

$$A = \begin{bmatrix} 1 & 1 \\ \epsilon & 0 \\ 0 & \epsilon \end{bmatrix}$$

where $\epsilon$ is positive number smaller than $\sqrt{\epsilon_{\mathrm{mach}}}$

Then in floating-point arithmetic

$$A^T A = \begin{bmatrix} 1 + \epsilon^2 & 1 \\ 1 & 1 + \epsilon^2 \end{bmatrix} = \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}$$

which is singular, even though $A$ has full rank
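A quick numerical demonstration (my own, with $\epsilon = 10^{-8}$ in double precision):

```python
import numpy as np

eps = 1e-8  # smaller than sqrt(machine epsilon) ~ 1.49e-8 for float64
A = np.array([[1.0, 1.0],
              [eps, 0.0],
              [0.0, eps]])

AtA = A.T @ A
print(AtA)                          # [[1., 1.], [1., 1.]] -- eps^2 is lost in the sum
print(np.linalg.matrix_rank(AtA))   # 1: singular
print(np.linalg.matrix_rank(A))     # 2: A itself has full rank
```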
Orthogonal Transformations
We seek alternative method that avoids numerical
difficulties of normal equations
We need numerically robust transformation that produces
easier problem without changing solution
What kind of transformation leaves least squares solution
unchanged?
Square matrix Q is orthogonal if QT Q = I
Multiplication of vector by orthogonal matrix preserves
Euclidean norm
$$\|Qv\|_2^2 = (Qv)^T (Qv) = v^T Q^T Q v = v^T v = \|v\|_2^2$$
Thus, multiplying both sides of least squares problem by
orthogonal matrix does not change its solution
Triangular Least Squares Problems

Least squares problem with upper triangular matrix,

$$\begin{bmatrix} R \\ O \end{bmatrix} x \cong \begin{bmatrix} b_1 \\ b_2 \end{bmatrix}$$

where $R$ is $n \times n$ upper triangular and $b$ is partitioned similarly, is easy to solve

Residual is

$$\|r\|_2^2 = \|b_1 - Rx\|_2^2 + \|b_2\|_2^2$$

We have no control over second term, but first term becomes zero if $x$ satisfies triangular system $Rx = b_1$, which can be solved by back-substitution
QR Factorization

Given $m \times n$ matrix $A$, with $m > n$, we seek $m \times m$ orthogonal matrix $Q$ such that

$$A = Q \begin{bmatrix} R \\ O \end{bmatrix}$$

where $R$ is $n \times n$ and upper triangular

Linear least squares problem $Ax \cong b$ is then transformed into triangular least squares problem

$$Q^T A x = \begin{bmatrix} R \\ O \end{bmatrix} x \cong \begin{bmatrix} c_1 \\ c_2 \end{bmatrix} = Q^T b$$

which has same solution, since

$$\|r\|_2^2 = \|b - Ax\|_2^2 = \left\| b - Q \begin{bmatrix} R \\ O \end{bmatrix} x \right\|_2^2 = \left\| Q^T b - \begin{bmatrix} R \\ O \end{bmatrix} x \right\|_2^2$$
Orthogonal Bases

If we partition $m \times m$ orthogonal matrix $Q = [Q_1 \; Q_2]$, where $Q_1$ is $m \times n$, then

$$A = Q \begin{bmatrix} R \\ O \end{bmatrix} = [Q_1 \; Q_2] \begin{bmatrix} R \\ O \end{bmatrix} = Q_1 R$$

is called reduced QR factorization of $A$

Columns of $Q_1$ are orthonormal basis for $\mathrm{span}(A)$, and columns of $Q_2$ are orthonormal basis for $\mathrm{span}(A)^\perp$

$Q_1 Q_1^T$ is orthogonal projector onto $\mathrm{span}(A)$

Solution to least squares problem $Ax \cong b$ is given by solution to square system

$$Q_1^T A x = R x = c_1 = Q_1^T b$$
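In NumPy this corresponds to the "reduced" mode of np.linalg.qr (a sketch, reusing the earlier data-fitting example):

```python
import numpy as np

t = np.array([-1.0, -0.5, 0.0, 0.5, 1.0])
b = np.array([1.0, 0.5, 0.0, 0.5, 2.0])
A = np.column_stack([np.ones_like(t), t, t**2])

Q1, R = np.linalg.qr(A, mode='reduced')  # A = Q1 R, Q1 is 5x3, R is 3x3

# Solve R x = Q1^T b (dense solve shown; back-substitution in practice)
x = np.linalg.solve(R, Q1.T @ b)
print(x)  # approximately [0.086, 0.400, 1.429]
```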
Computing QR Factorization
To compute QR factorization of $m \times n$ matrix $A$, with
m > n, we annihilate subdiagonal entries of successive
columns of A, eventually reaching upper triangular form
Similar to LU factorization by Gaussian elimination, but use
orthogonal transformations instead of elementary
elimination matrices
Possible methods include
Householder transformations
Givens rotations
Gram-Schmidt orthogonalization
Householder Transformations

Householder transformation has form

$$H = I - 2\,\frac{vv^T}{v^T v}$$

for nonzero vector $v$; $H$ is both orthogonal and symmetric

Given vector $a$, we choose $v$ so that

$$Ha = \begin{bmatrix} \alpha \\ 0 \\ \vdots \\ 0 \end{bmatrix} = \alpha \begin{bmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{bmatrix} = \alpha e_1$$

Substituting into formula for $H$, we can take

$$v = a - \alpha e_1$$

and $\alpha = \pm\|a\|_2$, with sign chosen to avoid cancellation
Example: Householder Transformation

If $a = \begin{bmatrix} 2 & 1 & 2 \end{bmatrix}^T$, then we take

$$v = a - \alpha e_1 = \begin{bmatrix} 2 \\ 1 \\ 2 \end{bmatrix} - \alpha \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}$$

where $\alpha = \pm\|a\|_2 = \pm 3$

Since $a_1$ is positive, we choose negative sign for $\alpha$ to avoid cancellation, so

$$v = \begin{bmatrix} 2 \\ 1 \\ 2 \end{bmatrix} - (-3) \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} = \begin{bmatrix} 5 \\ 1 \\ 2 \end{bmatrix}$$

To confirm that transformation works,

$$Ha = a - 2\,\frac{v^T a}{v^T v}\,v = \begin{bmatrix} 2 \\ 1 \\ 2 \end{bmatrix} - 2\,\frac{15}{30} \begin{bmatrix} 5 \\ 1 \\ 2 \end{bmatrix} = \begin{bmatrix} -3 \\ 0 \\ 0 \end{bmatrix}$$
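The same computation in NumPy (a sketch; function name is mine, and the update uses the identity $Hu = u - 2\,(v^T u / v^T v)\,v$ shown below):

```python
import numpy as np

def householder_vector(a):
    """Return v and alpha so that H a = alpha * e1 for H = I - 2 v v^T / (v^T v)."""
    alpha = -np.copysign(np.linalg.norm(a), a[0])  # sign opposite a[0] avoids cancellation
    v = a.astype(float).copy()
    v[0] -= alpha
    return v, alpha

a = np.array([2.0, 1.0, 2.0])
v, alpha = householder_vector(a)
print(v, alpha)                       # [5. 1. 2.] -3.0
print(a - 2 * (v @ a) / (v @ v) * v)  # [-3.  0.  0.]
```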
Householder QR Factorization
To compute QR factorization of A, use Householder
transformations to annihilate subdiagonal entries of each
successive column
Each Householder transformation is applied to entire
matrix, but does not affect prior columns, so zeros are
preserved
In applying Householder transformation $H$ to arbitrary vector $u$,

$$Hu = \left(I - 2\,\frac{vv^T}{v^T v}\right) u = u - 2\left(\frac{v^T u}{v^T v}\right) v$$

which is much cheaper than general matrix-vector multiplication and requires only vector $v$, not full matrix $H$
Example: Householder QR Factorization

For polynomial data-fitting example given previously, with

$$A = \begin{bmatrix} 1 & -1.0 & 1.0 \\ 1 & -0.5 & 0.25 \\ 1 & 0.0 & 0.0 \\ 1 & 0.5 & 0.25 \\ 1 & 1.0 & 1.0 \end{bmatrix}, \qquad b = \begin{bmatrix} 1.0 \\ 0.5 \\ 0.0 \\ 0.5 \\ 2.0 \end{bmatrix}$$

Householder vector $v_1$ for annihilating subdiagonal entries of first column of $A$ is

$$v_1 = \begin{bmatrix} 1 \\ 1 \\ 1 \\ 1 \\ 1 \end{bmatrix} - \begin{bmatrix} -2.236 \\ 0 \\ 0 \\ 0 \\ 0 \end{bmatrix} = \begin{bmatrix} 3.236 \\ 1 \\ 1 \\ 1 \\ 1 \end{bmatrix}$$
Example, continued
Applying resulting Householder transformation H1 yields
transformed matrix and right-hand side
$$H_1 A = \begin{bmatrix} -2.236 & 0 & -1.118 \\ 0 & -0.191 & -0.405 \\ 0 & 0.309 & -0.655 \\ 0 & 0.809 & -0.405 \\ 0 & 1.309 & 0.345 \end{bmatrix}, \qquad H_1 b = \begin{bmatrix} -1.789 \\ -0.362 \\ -0.862 \\ -0.362 \\ 1.138 \end{bmatrix}$$

Householder vector $v_2$ for annihilating subdiagonal entries of second column of $H_1 A$ is

$$v_2 = \begin{bmatrix} 0 \\ -0.191 \\ 0.309 \\ 0.809 \\ 1.309 \end{bmatrix} - \begin{bmatrix} 0 \\ 1.581 \\ 0 \\ 0 \\ 0 \end{bmatrix} = \begin{bmatrix} 0 \\ -1.772 \\ 0.309 \\ 0.809 \\ 1.309 \end{bmatrix}$$
Example, continued
Applying resulting Householder transformation H2 yields
$$H_2 H_1 A = \begin{bmatrix} -2.236 & 0 & -1.118 \\ 0 & 1.581 & 0 \\ 0 & 0 & -0.725 \\ 0 & 0 & -0.589 \\ 0 & 0 & 0.047 \end{bmatrix}, \qquad H_2 H_1 b = \begin{bmatrix} -1.789 \\ 0.632 \\ -1.035 \\ -0.816 \\ 0.404 \end{bmatrix}$$

Householder vector $v_3$ for annihilating subdiagonal entries of third column of $H_2 H_1 A$ is

$$v_3 = \begin{bmatrix} 0 \\ 0 \\ -0.725 \\ -0.589 \\ 0.047 \end{bmatrix} - \begin{bmatrix} 0 \\ 0 \\ 0.935 \\ 0 \\ 0 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ -1.660 \\ -0.589 \\ 0.047 \end{bmatrix}$$
Example, continued
Applying resulting Householder transformation H3 yields
$$H_3 H_2 H_1 A = \begin{bmatrix} -2.236 & 0 & -1.118 \\ 0 & 1.581 & 0 \\ 0 & 0 & 0.935 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}, \qquad H_3 H_2 H_1 b = \begin{bmatrix} -1.789 \\ 0.632 \\ 1.336 \\ 0.026 \\ 0.337 \end{bmatrix}$$

Now solve upper triangular system $Rx = c_1$ by back-substitution to obtain $x = \begin{bmatrix} 0.086 & 0.400 & 1.429 \end{bmatrix}^T$
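Putting the pieces together, a compact Householder QR sketch (my own code, not the book's) reproduces the example end to end:

```python
import numpy as np

def householder_qr_solve(A, b):
    """Solve the least squares problem min ||Ax - b||_2 via Householder QR."""
    A, b = A.astype(float).copy(), b.astype(float).copy()
    m, n = A.shape
    for k in range(n):
        a = A[k:, k]
        alpha = -np.copysign(np.linalg.norm(a), a[0])
        v = a.copy(); v[0] -= alpha
        beta = v @ v
        if beta == 0.0:
            continue
        # Apply H_k = I - 2 v v^T / (v^T v) to remaining columns and to b
        A[k:, k:] -= np.outer(2.0 * v / beta, v @ A[k:, k:])
        b[k:] -= 2.0 * (v @ b[k:]) / beta * v
    # Back-substitution on R x = c1 (R sits in the top of A, c1 in b[:n])
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        x[i] = (b[i] - A[i, i + 1:] @ x[i + 1:]) / A[i, i]
    return x

t = np.array([-1.0, -0.5, 0.0, 0.5, 1.0])
b = np.array([1.0, 0.5, 0.0, 0.5, 2.0])
A = np.column_stack([np.ones_like(t), t, t**2])
print(householder_qr_solve(A, b))  # approximately [0.086, 0.400, 1.429]
```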
Givens Rotations

Givens rotations introduce zeros one at a time

Given vector $\begin{bmatrix} a_1 & a_2 \end{bmatrix}^T$, choose scalars $c$ and $s$ so that

$$\begin{bmatrix} c & s \\ -s & c \end{bmatrix} \begin{bmatrix} a_1 \\ a_2 \end{bmatrix} = \begin{bmatrix} \alpha \\ 0 \end{bmatrix}$$

with $c^2 + s^2 = 1$, or equivalently, $\alpha = \sqrt{a_1^2 + a_2^2}$

Previous equation can be rewritten

$$\begin{bmatrix} a_1 & a_2 \\ a_2 & -a_1 \end{bmatrix} \begin{bmatrix} c \\ s \end{bmatrix} = \begin{bmatrix} \alpha \\ 0 \end{bmatrix}$$

Gaussian elimination yields triangular system

$$\begin{bmatrix} a_1 & a_2 \\ 0 & -a_1 - a_2^2/a_1 \end{bmatrix} \begin{bmatrix} c \\ s \end{bmatrix} = \begin{bmatrix} \alpha \\ -\alpha a_2 / a_1 \end{bmatrix}$$
Back-substitution then gives

$$s = \frac{\alpha a_2}{a_1^2 + a_2^2} \qquad \text{and} \qquad c = \frac{\alpha a_1}{a_1^2 + a_2^2}$$

Finally, $c^2 + s^2 = 1$, or $\alpha = \sqrt{a_1^2 + a_2^2}$, implies

$$c = \frac{a_1}{\sqrt{a_1^2 + a_2^2}} \qquad \text{and} \qquad s = \frac{a_2}{\sqrt{a_1^2 + a_2^2}}$$
Example: Givens Rotation

Let $a = \begin{bmatrix} 4 & 3 \end{bmatrix}^T$. To annihilate second entry, compute cosine and sine

$$c = \frac{a_1}{\sqrt{a_1^2 + a_2^2}} = \frac{4}{5} = 0.8 \qquad \text{and} \qquad s = \frac{a_2}{\sqrt{a_1^2 + a_2^2}} = \frac{3}{5} = 0.6$$

To confirm that rotation works,

$$\begin{bmatrix} c & s \\ -s & c \end{bmatrix} \begin{bmatrix} 4 \\ 3 \end{bmatrix} = \begin{bmatrix} 0.8 & 0.6 \\ -0.6 & 0.8 \end{bmatrix} \begin{bmatrix} 4 \\ 3 \end{bmatrix} = \begin{bmatrix} 5 \\ 0 \end{bmatrix}$$
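A small sketch of the same computation (function name is mine):

```python
import numpy as np

def givens(a1, a2):
    """Return c, s with c^2 + s^2 = 1 such that the rotation zeros a2."""
    r = np.hypot(a1, a2)  # sqrt(a1^2 + a2^2), computed robustly
    return a1 / r, a2 / r

c, s = givens(4.0, 3.0)
G = np.array([[c, s], [-s, c]])
print(c, s)                      # 0.8 0.6
print(G @ np.array([4.0, 3.0]))  # [5. 0.]
```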
Givens QR Factorization
More generally, to annihilate selected component of vector
in n dimensions, rotate target component with another
component
$$\begin{bmatrix} 1 & 0 & 0 & 0 & 0 \\ 0 & c & 0 & s & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & -s & 0 & c & 0 \\ 0 & 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} a_1 \\ a_2 \\ a_3 \\ a_4 \\ a_5 \end{bmatrix} = \begin{bmatrix} a_1 \\ \alpha \\ a_3 \\ 0 \\ a_5 \end{bmatrix}$$
By systematically annihilating successive entries, we can
reduce matrix to upper triangular form using sequence of
Givens rotations
Each rotation is orthogonal, so their product is orthogonal,
producing QR factorization
Givens QR Factorization
Straightforward implementation of Givens method requires
about 50% more work than Householder method, and also
requires more storage, since each rotation requires two
numbers, c and s, to define it
These disadvantages can be overcome, but doing so requires more complicated implementation
Givens can be advantageous for computing QR
factorization when many entries of matrix are already zero,
since those annihilations can then be skipped
Gram-Schmidt Orthogonalization
Given vectors a1 and a2 , we seek orthonormal vectors q1
and q2 having same span
This can be accomplished by subtracting from second
vector its projection onto first vector and normalizing both
resulting vectors, as shown in diagram
Gram-Schmidt Orthogonalization
Process can be extended to any number of vectors $a_1, \ldots, a_k$, orthogonalizing each successive vector against all preceding ones, giving classical Gram-Schmidt procedure

for k = 1 to n
    q_k = a_k
    for j = 1 to k - 1
        r_jk = q_j^T a_k
        q_k = q_k - r_jk q_j
    end
    r_kk = ||q_k||_2
    q_k = q_k / r_kk
end
Resulting qk and rjk form reduced QR factorization of A
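A direct transcription into NumPy (a sketch; as noted in the next section, classical Gram-Schmidt can lose orthogonality in floating point):

```python
import numpy as np

def classical_gram_schmidt(A):
    """Reduced QR factorization A = Q R via classical Gram-Schmidt."""
    m, n = A.shape
    Q = np.zeros((m, n))
    R = np.zeros((n, n))
    for k in range(n):
        qk = A[:, k].copy()
        for j in range(k):
            R[j, k] = Q[:, j] @ A[:, k]  # project against preceding q_j
            qk -= R[j, k] * Q[:, j]
        R[k, k] = np.linalg.norm(qk)
        Q[:, k] = qk / R[k, k]
    return Q, R

t = np.array([-1.0, -0.5, 0.0, 0.5, 1.0])
A = np.column_stack([np.ones_like(t), t, t**2])
Q, R = classical_gram_schmidt(A)
print(np.allclose(Q @ R, A))  # True
```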
Modified Gram-Schmidt

Classical Gram-Schmidt procedure often suffers loss of orthogonality in finite-precision arithmetic. Modified Gram-Schmidt applies the projections sequentially, orthogonalizing each remaining vector against each new $q_k$ as soon as it is computed; it is mathematically equivalent but numerically more stable
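A sketch of the modified procedure (same calling convention as the classical version above):

```python
import numpy as np

def modified_gram_schmidt(A):
    """Reduced QR factorization A = Q R via modified Gram-Schmidt."""
    m, n = A.shape
    V = A.astype(float).copy()
    Q = np.zeros((m, n))
    R = np.zeros((n, n))
    for k in range(n):
        R[k, k] = np.linalg.norm(V[:, k])
        Q[:, k] = V[:, k] / R[k, k]
        for j in range(k + 1, n):
            R[k, j] = Q[:, k] @ V[:, j]   # orthogonalize remaining columns
            V[:, j] -= R[k, j] * Q[:, k]  # against q_k immediately
    return Q, R
```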
Rank Deficiency

If $\mathrm{rank}(A) < n$, i.e., columns of $A$ are linearly dependent, then QR factorization still exists, but upper triangular factor $R$ is singular and least squares solution is not unique
Example: Near Rank Deficiency

Consider $3 \times 2$ matrix

$$A = \begin{bmatrix} 0.641 & 0.242 \\ 0.321 & 0.121 \\ 0.962 & 0.363 \end{bmatrix}$$

Computing QR factorization,

$$R = \begin{bmatrix} 1.1997 & 0.4527 \\ 0 & 0.0002 \end{bmatrix}$$

$R$ is extremely close to singular (exactly singular to 3-digit accuracy of problem statement)

If $R$ is used to solve linear least squares problem, result is highly sensitive to perturbations in right-hand side

For practical purposes, $\mathrm{rank}(A) = 1$ rather than 2, because columns are nearly linearly dependent
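NumPy confirms the tiny trailing entry of $R$ (a sketch):

```python
import numpy as np

A = np.array([[0.641, 0.242],
              [0.321, 0.121],
              [0.962, 0.363]])

Q, R = np.linalg.qr(A)
print(R)  # |R[1, 1]| is on the order of 1e-4: A is numerically rank 1
print(np.linalg.matrix_rank(A, tol=1e-3))  # 1
```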
Singular Value Decomposition

Singular value decomposition (SVD) of $m \times n$ matrix $A$ has form

$$A = U \Sigma V^T$$

where $U$ is $m \times m$ orthogonal matrix, $V$ is $n \times n$ orthogonal matrix, and $\Sigma$ is $m \times n$ diagonal matrix whose diagonal entries $\sigma_i \geq 0$, called singular values of $A$, are usually ordered so that $\sigma_1 \geq \sigma_2 \geq \cdots \geq \sigma_n$. Columns $u_i$ of $U$ and $v_i$ of $V$ are called left and right singular vectors
Example: SVD

SVD of

$$A = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \\ 10 & 11 & 12 \end{bmatrix}$$

is $A = U \Sigma V^T$ with singular values $\sigma_1 = 25.5$, $\sigma_2 = 1.29$, and $\sigma_3 = 0$ (to three digits), so

$$\Sigma = \begin{bmatrix} 25.5 & 0 & 0 \\ 0 & 1.29 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}$$

Since $\sigma_3 = 0$, $\mathrm{rank}(A) = 2$
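np.linalg.svd reproduces the singular values (a sketch):

```python
import numpy as np

A = np.arange(1.0, 13.0).reshape(4, 3)  # [[1,2,3],[4,5,6],[7,8,9],[10,11,12]]
U, s, Vt = np.linalg.svd(A)             # s holds singular values, descending
print(s)                  # approximately [25.46, 1.29, 0.0]
print(np.sum(s > 1e-10))  # numerical rank: 2
```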
Applications of SVD

Minimum norm solution to $Ax \cong b$ is given by

$$x = \sum_{\sigma_i \neq 0} \frac{u_i^T b}{\sigma_i}\, v_i$$

For ill-conditioned or rank deficient problems, "small" singular values can be dropped from sum to stabilize solution

Euclidean matrix norm: $\|A\|_2 = \sigma_{\max}$

Euclidean condition number: $\mathrm{cond}(A) = \dfrac{\sigma_{\max}}{\sigma_{\min}}$
54 / 61
Normal Equations
Orthogonal Methods
SVD
Pseudoinverse

Define pseudoinverse of scalar $\sigma$ to be $1/\sigma$ if $\sigma \neq 0$, zero otherwise

Define pseudoinverse of (possibly rectangular) diagonal matrix by transposing and taking scalar pseudoinverse of each entry

Then pseudoinverse of general real $m \times n$ matrix $A$ is given by

$$A^+ = V \Sigma^+ U^T$$

Pseudoinverse always exists whether or not matrix is square or has full rank

If $A$ is square and nonsingular, then $A^+ = A^{-1}$

In all cases, minimum-norm solution to $Ax \cong b$ is given by $x = A^+ b$
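In NumPy, np.linalg.pinv computes $A^+$ via the SVD (a sketch; the right-hand side here is my own arbitrary choice):

```python
import numpy as np

A = np.arange(1.0, 13.0).reshape(4, 3)  # the rank-2 example above
b = np.array([1.0, 0.0, 1.0, 0.0])      # arbitrary right-hand side (my choice)

x = np.linalg.pinv(A) @ b  # minimum-norm least squares solution
print(x)

# Same result from the explicit singular value sum, skipping sigma_3 = 0
U, s, Vt = np.linalg.svd(A)
x2 = sum((U[:, i] @ b) / s[i] * Vt[i] for i in range(2))
print(np.allclose(x, x2))  # True
```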
Orthogonal Bases
SVD of matrix, $A = U \Sigma V^T$, provides orthogonal bases for subspaces relevant to $A$

Columns of $U$ corresponding to nonzero singular values form orthonormal basis for $\mathrm{span}(A)$

Remaining columns of $U$ form orthonormal basis for orthogonal complement $\mathrm{span}(A)^\perp$

Columns of $V$ corresponding to zero singular values form orthonormal basis for null space of $A$

Remaining columns of $V$ form orthonormal basis for orthogonal complement of null space of $A$
Comparison of Methods
Forming normal equations matrix $A^T A$ requires about $n^2 m / 2$ multiplications, and solving resulting symmetric linear system requires about $n^3 / 6$ multiplications

Solving least squares problem using Householder QR factorization requires about $mn^2 - n^3/3$ multiplications

If $m \approx n$, both methods require about same amount of work

If $m \gg n$, Householder QR requires about twice as much work as normal equations

Cost of SVD is proportional to $mn^2 + n^3$, with proportionality constant ranging from 4 to 10, depending on algorithm used