
Numerical Analysis and Application

(数值分析与应用)

Wei Zhang(张炜)

Faculty of Mechanical Engineering and Automation


Zhejiang Sci-Tech University

Course Information

 Lecture by: Wei Zhang(张炜)


Contact: 18758042685, zhangwei@zstu.edu.cn
Office: 23-314
 Programming practice instructed by: Zhengdao Wang(王政道)

 10 lectures + 5 programming practices + 1 exam depending on schedule

 Grading: 50% assignments + 50% final exam (open book)

 Textbook: R. L. Burden, J. D. Faires, Numerical Analysis, 9th edition, Cengage Learning.

Preliminaries and Error Analysis
(概论与误差分析)

Wei Zhang(张炜)

Faculty of Mechanical Engineering and Automation


Zhejiang Sci-Tech University

Numerical Analysis and Application(数值分析与应用)
Lecture 1: Preliminaries and Error Analysis(概论与误差分析)

What is Numerical Analysis?

 Numerical analysis: the study of algorithms that use numerical approximation (as
opposed to general symbolic manipulations) for the problems of mathematical analysis (as
distinguished from discrete mathematics).

 Applications: in virtually all fields of engineering and science.


• Ordinary Differential Equations(常微分方程): celestial mechanics (planets, stars
and galaxies)
• Numerical Linear Algebra(数值线性代数): data analysis
• Stochastic Differential Equations (随机微分方程)and Markov Chains(马尔科夫
链): simulating living cells for medicine and biology

 Industries: numerical weather prediction, computing the trajectory of a spacecraft, crash safety of vehicles, hedge funds, deciding airline ticket prices, actuarial analysis in insurance companies


What is Numerical Analysis?

Babylonian clay tablet (1800-1600 BC) with annotations. The approximation of the square root of 2 is given to four sexagesimal figures, which is about six decimal figures: 1 + 24/60 + 51/60² + 10/60³ = 1.41421296...

Weather models use systems of differential equations based on the laws of physics, fluid motion, and chemistry, and use a coordinate system that divides the planet into a 3D grid. Winds, heat transfer, solar radiation, relative humidity, and surface hydrology are calculated within each grid cell, and the interactions with neighboring cells are used to calculate atmospheric properties in the future.

What We Learn in Numerical Analysis?

 Error analysis(误差分析)
 Interpolation(插值)
 Least Squares Approximation(最小二乘逼近)
 Numerical Differentiation and Integration(数值微分与积分)
 Initial Value Problems for Ordinary Differential Equations(常微分方程的初值问题)
 Direct Methods for Solving Linear Systems(线性方程组的直接解法)
 Iterative Methods for Solving Linear Systems(线性方程组的迭代解法)
 Numerical Solutions of Nonlinear Systems of Equations(数值求解非线性方程组)


Outlines

 Review of Calculus(微积分回顾)
 Round-off Errors and Computer Arithmetic(舍入误差和计算机算法)
 Algorithms and Convergence(算法和收敛性)


Review of Calculus


Review of Calculus
 Limits and Continuity

• The functions we work with will be assumed to be continuous, since continuity ensures predictable behavior.

• Functions that are not continuous can skip over points of interest and cause difficulties
when attempting to approximate a solution to a problem.

• Definition 1.1: A function f defined on a set X of real numbers has the limit L at x0, written lim_{x→x0} f(x) = L, if, given any real number ε > 0, there exists a real number δ > 0 such that

|f(x) − L| < ε, whenever x ∈ X and 0 < |x − x0| < δ.

Limit of a function(函数的极限)


Review of Calculus
 Limits and Continuity

• Definition 1.2: Let f be a function defined on a set X of real numbers and x0 ∈ X. Then f is continuous at x0 if
lim_{x→x0} f(x) = f(x0).
Continuous function(连续函数)
The function f is continuous on the set X if it is continuous at each number in X.

• Definition 1.3: Let {xn}∞n=1 be an infinite sequence of real numbers. This sequence has the limit x (converges to x) if, for any ε > 0 there exists a positive integer N(ε) such that
|xn − x| < ε, whenever n > N(ε). The notation
lim_{n→∞} xn = x
Limit of a sequence(数列的极限)
means that the sequence {xn}∞n=1 converges to x.


Review of Calculus
 Differentiability

• A function with a smooth graph will normally behave more predictably than one with
numerous jagged features.

• Definition 1.5: Let f be a function defined in an open interval containing x0. The function f is differentiable at x0 if
f ′(x0) = lim_{x→x0} (f(x) − f(x0)) / (x − x0)
Differentiability(函数可微)
exists. The number f ′(x0) is called the derivative of f at x0. A function that has a derivative at each number in a set X is differentiable on X.


Review of Calculus
 Differentiability

• Theorem 1.6: If the function f is differentiable at x0, then f is continuous at x0.

• Theorem 1.7: (Rolle’s Theorem) Suppose f ∈ C[a, b] and f is differentiable on (a, b). If f (a) = f (b), then a number c in (a, b) exists with f ′(c) = 0.(罗尔定理)

• Theorem 1.8: (Mean Value Theorem) If f ∈ C[a, b] and f is differentiable on (a, b), then a number c in (a, b) exists with
f ′(c) = (f (b) − f (a)) / (b − a).(中值定理)


Review of Calculus
 Differentiability

• Theorem 1.9: (Extreme Value Theorem) If f ∈ C[a, b], then c1, c2 ∈ [a, b] exist with
f (c1) ≤ f (x) ≤ f (c2), for all x ∈ [a, b]. In addition, if f is differentiable on (a, b), then the numbers c1 and c2 occur either at the endpoints of [a, b] or where f ′ is zero.(极值定理)

• Theorem 1.10: (Generalized Rolle’s Theorem) Suppose f ∈ C[a, b] is n times differentiable on (a, b). If f (x) = 0 at the n + 1 distinct numbers a ≤ x0 < x1 < . . . < xn ≤ b, then a number c in (x0, xn), and hence in (a, b), exists with f (n)(c) = 0.(广义罗尔定理)


Review of Calculus
 Differentiability

• Theorem 1.11: (Intermediate Value Theorem) If f ∈ C[a, b] and K is any number between f (a) and f (b), then there exists a number c in (a, b) for which f (c) = K.(介值定理)


Review of Calculus
 Integration

• Theorem 1.12: The Riemann integral of the function f on the interval [a, b] is the following limit, provided it exists:
∫_a^b f(x) dx = lim_{max Δxi → 0} Σ_{i=1}^{n} f(zi) Δxi,(黎曼积分)
where the numbers x0, x1, . . . , xn satisfy a = x0 ≤ x1 ≤ · · · ≤ xn = b, where Δxi = xi − xi−1, for each i = 1, 2, . . . , n, and zi is arbitrarily chosen in the interval [xi−1, xi].

• For computational convenience, the points xi are often chosen to be equally spaced in [a, b], with zi = xi for each i = 1, 2, . . . , n; then
∫_a^b f(x) dx = lim_{n→∞} ((b − a)/n) Σ_{i=1}^{n} f(xi), where xi = a + i(b − a)/n.


Review of Calculus
 Integration

• Theorem 1.13: (Weighted Mean Value Theorem for Integrals) Suppose f ∈ C[a, b], the Riemann integral of g exists on [a, b], and g(x) does not change sign on [a, b]. Then there exists a number c in (a, b) with:
∫_a^b f(x)g(x) dx = f(c) ∫_a^b g(x) dx.(加权积分中值定理)

When g(x) ≡ 1, the theorem is the usual Mean Value Theorem for Integrals. It gives the average value of the function f over the interval [a, b] as
f(c) = (1/(b − a)) ∫_a^b f(x) dx.

Review of Calculus
 Taylor Polynomials and Series

• Theorem 1.14: (Taylor’s Theorem) Suppose f ∈ Cn[a, b], that f (n+1) exists on [a, b], and x0 ∈ [a, b]. For every x ∈ [a, b], there exists a number ξ(x) between x0 and x with
f(x) = Pn(x) + Rn(x),(泰勒定理)

where

Pn(x) = Σ_{k=0}^{n} (f (k)(x0)/k!) (x − x0)^k is the nth Taylor polynomial, and

Rn(x) = (f (n+1)(ξ(x)) / (n + 1)!) (x − x0)^{n+1} is the truncation error.
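As a quick numerical illustration (a minimal sketch; the function e^x, the expansion point x0 = 0, and the sample point x = 0.5 are our own choices, not from the slides), the actual error of the nth Taylor polynomial stays within the bound given by the remainder term:

```python
import math

def taylor_exp(x, n):
    """nth Taylor polynomial of e**x about x0 = 0: sum of x**k / k! for k = 0..n."""
    term, total = 1.0, 1.0
    for k in range(1, n + 1):
        term *= x / k          # term is now x**k / k!
        total += term
    return total

x, n = 0.5, 4
# |R_n(x)| <= max|f^(n+1)(xi)| * |x - x0|**(n+1) / (n+1)!, and e^xi <= e^x on [0, x]
bound = math.exp(x) * x ** (n + 1) / math.factorial(n + 1)
error = abs(taylor_exp(x, n) - math.exp(x))
assert error <= bound
```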


Round-off Errors and Computer Arithmetic


Round-off Errors and Computer Arithmetic


• In computer arithmetic, (√3)² ≠ 3 in general, because of finite-digit arithmetic with round-off errors.

• Only some rational numbers(有理数)can be represented exactly.

• Round-off errors: error that is produced when a calculator or computer is used to


perform real number calculations.

• Reason: the arithmetic performed in a machine involves numbers with only a finite
number of digits, with the result that calculations are performed with only
approximate representations of the actual numbers.


Round-off Errors and Computer Arithmetic


 Binary Machine Numbers(二进制机器数)

• IEEE Binary Floating Point Arithmetic Standard 754-1985 and 754-2008(IEEE二进制浮点数计算标准)

• The standard specifies data interchange, rounding, arithmetic operations, and handling of exceptions.

Layout: sign bit + characteristic + mantissa(尾数)

Single precision: 1 sign bit, 8-bit characteristic, 23-bit mantissa (32 bits in total)

Double precision: 1 sign bit, 11-bit characteristic, 52-bit mantissa (64 bits in total)


Round-off Errors and Computer Arithmetic


 Binary Machine Numbers(二进制机器数)

• Illustration: a binary machine number consists of a sign bit s, a characteristic c, and a mantissa f. For a 64-bit double with positive sign (s = 0), the corresponding decimal number is (−1)^s 2^{c−1023}(1 + f).


Round-off Errors and Computer Arithmetic


 Binary Machine Numbers(二进制机器数)

• The smallest normalized positive number: s = 0, c = 1, and f = 0, i.e., 2^{−1022}(1 + 0) ≈ 0.22251 × 10^{−307}.

• The largest normalized positive number: s = 0, c = 2046, and f = 1 − 2^{−52}, i.e., 2^{1023}(2 − 2^{−52}) ≈ 0.17977 × 10^{309}.

• Numbers smaller than 2^{−1022}(1 + 0) underflow and are typically set to zero.

• Numbers larger than 2^{1023}(2 − 2^{−52}) overflow and cause the computations to stop (infinite value).
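These bounds can be checked directly against Python's double-precision float (a minimal sketch; sys.float_info is Python's view of the IEEE 754 double format):

```python
import sys

# Smallest normalized positive double: 2^-1022  (c = 1, f = 0)
smallest = 2.0 ** -1022
# Largest finite double: 2^1023 * (2 - 2^-52)  (c = 2046, f = 1 - 2^-52)
largest = 2.0 ** 1023 * (2.0 - 2.0 ** -52)

assert smallest == sys.float_info.min
assert largest == sys.float_info.max
# Exceeding the largest finite value overflows to infinity
assert largest * 2 == float("inf")
```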


Round-off Errors and Computer Arithmetic


 Decimal Machine Numbers(十进制机器数)

• Normalized decimal floating-point form of k-digit decimal machine numbers:
±0.d1d2 . . . dk × 10^n, with 1 ≤ d1 ≤ 9 and 0 ≤ di ≤ 9 for each i = 2, . . . , k.

• A positive real number has the form y = 0.d1d2 . . . dk d{k+1} d{k+2} . . . × 10^n.

Chopping(截断): drop the digits d{k+1} d{k+2} . . . , giving fl(y) = 0.d1d2 . . . dk × 10^n.

Rounding(舍入): add 5 × 10^{n−(k+1)} to y and then chop.

• Example: Determine the five-digit (a) chopping and (b) rounding values of π = 3.14159265 . . .

Chopping: fl(π) = 0.31415 × 10^1 = 3.1415.

Rounding: fl(π) = 0.31416 × 10^1 = 3.1416.


Round-off Errors and Computer Arithmetic


 Decimal Machine Numbers(十进制机器数)

• Definition 1.15: Suppose that p∗ is an approximation to p. The absolute error is |p − p∗|,


and the relative error is |p − p∗|/|p|, provided that p ≠ 0.

• Definition 1.16: The number p∗ is said to approximate p to t significant digits (or figures) if t is the largest nonnegative integer for which
|p − p∗| / |p| ≤ 5 × 10^{−t}.


Round-off Errors and Computer Arithmetic


 Finite Digit Arithmetic

• Assume real numbers x and y, and their floating-point representations f l(x) and f l(y)

• Example 3: Suppose that x = 5/7 and y = 1/3. Use five-digit chopping for calculating x + y.

Solution: fl(x) = 0.71428 × 10^0 and fl(y) = 0.33333 × 10^0, so
x ⊕ y = fl(fl(x) + fl(y)) = fl(1.04761 × 10^0) = 0.10476 × 10^1 = 1.0476.
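The chopping and rounding rules, and Example 3, can be sketched in code (a minimal sketch; the helper names chop and round_fl are our own):

```python
import math

def chop(x, k):
    """k-digit chopping: keep the first k significant decimal digits."""
    if x == 0:
        return 0.0
    n = math.floor(math.log10(abs(x))) + 1   # position of the leading digit
    factor = 10 ** (k - n)
    return math.floor(x * factor) / factor

def round_fl(x, k):
    """k-digit rounding: add 5 in the (k+1)-th digit, then chop."""
    if x == 0:
        return 0.0
    n = math.floor(math.log10(abs(x))) + 1
    factor = 10 ** (k - n)
    return math.floor(x * factor + 0.5) / factor

# Five-digit chopping and rounding of pi
assert chop(math.pi, 5) == 3.1415
assert round_fl(math.pi, 5) == 3.1416

# Example 3: x = 5/7, y = 1/3 with five-digit chopping
fx, fy = chop(5 / 7, 5), chop(1 / 3, 5)
assert fx == 0.71428 and fy == 0.33333
assert chop(fx + fy, 5) == 1.0476
```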


Round-off Errors and Computer Arithmetic


 Nested Arithmetic(嵌套计算)

• Accuracy loss due to round-off error can also be reduced by rearranging calculations.

• Example 6: Evaluate f (x) = x3 − 6.1x2 + 3.2x + 1.5 at x = 4.71 using three-digit arithmetic.
Solution

Exact
Chopping
Rounding

Relative difference


Round-off Errors and Computer Arithmetic


 Nested Arithmetic(嵌套计算)

• Accuracy loss due to round-off error can also be reduced by rearranging calculations.

• Example 6: Evaluate f (x) = x3 − 6.1x2 + 3.2x + 1.5 at x = 4.71 using three-digit arithmetic.
Solution An alternative way

Chopping

Relative difference

One way to reduce round-off error is to reduce the number of computations.
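Nesting can be sketched as Horner's scheme: each step multiplies by x and adds a coefficient, so a degree-n polynomial needs only n multiplications and n additions (a minimal sketch; the function name horner and the tolerances are our own choices):

```python
def horner(coeffs, x):
    """Evaluate a polynomial by nesting (Horner's scheme).
    coeffs are [a_n, ..., a_1, a_0], highest degree first."""
    result = 0.0
    for c in coeffs:
        result = result * x + c
    return result

# f(x) = x^3 - 6.1x^2 + 3.2x + 1.5 at x = 4.71, as in Example 6
x = 4.71
nested = horner([1.0, -6.1, 3.2, 1.5], x)
direct = x**3 - 6.1 * x**2 + 3.2 * x + 1.5

# In full double precision both forms agree; nesting matters for finite digits
assert abs(nested - direct) < 1e-9
assert abs(nested - (-14.263899)) < 1e-6
```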


Algorithms and Convergence


Algorithms and Convergence


 Characterizing Algorithms

• Algorithm: a procedure that describes, in an unambiguous manner, a finite sequence of


steps to be performed in a specified order.
• Stable algorithm: small changes in the initial data produce correspondingly small
changes in the final results.

• Conditionally stable algorithm: stable only for certain choices of initial data.
• Definition 1.17: Suppose that E0 > 0 denotes an error introduced at some stage in the calculations and En represents the magnitude of the error after n subsequent operations.
• If En ≈ C·n·E0, where C is a constant independent of n, then the growth of error is linear.
• If En ≈ C^n·E0, for some C > 1, then the growth of error is called exponential.


Algorithms and Convergence


 Rates of Convergence

• Definition 1.18: Suppose {βn}∞n=1 is a sequence known to converge to zero, and {αn}∞n=1 converges to a number α. If a positive constant K exists with
|αn − α| ≤ K|βn|, for large n,
then we say that {αn}∞n=1 converges to α with rate, or order, of convergence O(βn). It is indicated by writing αn = α + O(βn).

• Normally we use βn = 1/n^p for some p > 0, and the largest such p is of interest.
• Example 2:
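As a sketch of Definition 1.18 (the concrete sequence αn = (n + 1)/n² is our own choice for illustration), αn → 0 with rate O(1/n), because (n + 1)/n² ≤ 2n/n² = 2·(1/n):

```python
# alpha_n = (n + 1)/n^2 converges to 0 with rate O(1/n):
# |alpha_n - 0| <= K * (1/n) with K = 2, for every n >= 1
for n in range(1, 1001):
    alpha_n = (n + 1) / n**2
    assert alpha_n <= 2 / n
```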


Summary
 Review of Calculus
• Limits, continuity, differentiability, Rolle’s theorem, mean value theorem, extreme value
theorem, generalized Rolle’s theorem, intermediate value theorem, Riemann integral,
Taylor’s theorem
 Round-off Errors and Computer Arithmetic
• Binary and decimal machine numbers
• Finite-digit arithmetic, and nested arithmetic

 Algorithms and Convergence


• Linearly and exponentially growing errors
• Rate of convergence

Interpolation
(插值)

Wei Zhang(张炜)

Faculty of Mechanical Engineering and Automation


Zhejiang Sci-Tech University

Numerical Analysis and Application(数值分析与应用)
Lecture 2: Interpolation(插值)

Outlines

 Lagrange Interpolation(拉格朗日插值)
 Neville’s Method(Neville插值)
 Divided Differences(均差插值)
 Hermite Interpolation(埃尔米特插值)
 Cubic Spline Interpolation(样条插值)


Introduction
 Example
• The table lists the population, in thousands of people, from 1950 to 2000 for the United
States, and the data are also represented in the figure.

Question: can we use the data to reasonably estimate the population in 1975 or even in the year 2020?


Lagrange Interpolation


Lagrange Interpolation
 Fundamentals
• Algebraic polynomials(代数多项式): functions of the form
Pn(x) = an x^n + an−1 x^{n−1} + · · · + a1 x + a0, mapping the set of real numbers into itself,
where n is a nonnegative integer and a0, . . . , an are real constants.

• For any function defined and continuous on a closed and bounded interval, there
exists a polynomial that is as “close” to the given function as desired.


Lagrange Interpolation
 Fundamentals
• Theorem 3.1: (Weierstrass Approximation Theorem) Suppose that f is defined and continuous on [a, b]. For each ε > 0, there exists a polynomial P(x) with the property that
|f(x) − P(x)| < ε, for all x in [a, b].

• The derivative and indefinite integral of a polynomial are easy to determine and are also polynomials.

• Taylor polynomials are not used for interpolation because all the information used in the approximation is concentrated at the single number x0.

Karl Weierstrass(维尔斯特拉斯)


Lagrange Interpolation
 First-degree polynomial interpolating
• Problem definition: determining a polynomial of degree one that passes through the
distinct points (x0, y0) and (x1, y1), i.e., f (x0) = y0 and f (x1) = y1.

Solution: define the functions
L0(x) = (x − x1) / (x0 − x1) and L1(x) = (x − x0) / (x1 − x0).

The linear Lagrange interpolating polynomial through (x0, y0) and (x1, y1) is
P(x) = L0(x)·y0 + L1(x)·y1 = ((x − x1)/(x0 − x1))·y0 + ((x − x0)/(x1 − x0))·y1.


Lagrange Interpolation
 First-degree polynomial interpolating: example
• Problem: Determine the linear Lagrange interpolating polynomial that passes through
the points (2, 4) and (5, 1).

Solution: P(x) = ((x − 5)/(2 − 5))·4 + ((x − 2)/(5 − 2))·1 = −x + 6, so that P(2) = 4 and P(5) = 1.


Lagrange Interpolation
 Generalized Lagrange interpolation
• Problem: construct a polynomial of degree at most n that passes through the n + 1 points


Lagrange Interpolation
 Generalized Lagrange interpolation
• Solution: we construct, for each k = 0, 1, . . . , n, a polynomial function Ln,k(x) satisfying Ln,k(xi) = 0 when i ≠ k and Ln,k(xk) = 1.

The polynomial function is:
Ln,k(x) = Π_{i=0, i≠k}^{n} (x − xi)/(xk − xi)
= ((x − x0) · · · (x − xk−1)(x − xk+1) · · · (x − xn)) / ((xk − x0) · · · (xk − xk−1)(xk − xk+1) · · · (xk − xn)).


Lagrange Interpolation
 Generalized Lagrange interpolation
• Theorem 3.2: If x0, x1, . . . , xn are n + 1 distinct numbers and f is a function whose values are given at these numbers, then a unique polynomial P(x) of degree at most n exists with
f(xk) = P(xk), for each k = 0, 1, . . . , n.

The polynomial is:
P(x) = f(x0)Ln,0(x) + · · · + f(xn)Ln,n(x) = Σ_{k=0}^{n} f(xk)Ln,k(x),

the nth Lagrange interpolating polynomial.

Ln,k(x) is simplified as Lk(x) when there is no confusion as to its degree.
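The nth Lagrange interpolating polynomial can be sketched directly from this formula (a minimal, unoptimized implementation; the function name lagrange is our own, and the nodes come from the 1/x example on the slides):

```python
def lagrange(xs, ys, x):
    """Evaluate the Lagrange interpolating polynomial through (xs[k], ys[k]) at x."""
    total = 0.0
    for k in range(len(xs)):
        # L_k(x) = prod over i != k of (x - x_i)/(x_k - x_i)
        Lk = 1.0
        for i in range(len(xs)):
            if i != k:
                Lk *= (x - xs[i]) / (xs[k] - xs[i])
        total += ys[k] * Lk
    return total

# Nodes from the slides' example: f(x) = 1/x at x0 = 2, x1 = 2.75, x2 = 4
xs = [2.0, 2.75, 4.0]
ys = [1 / v for v in xs]

# The polynomial reproduces the data exactly at the nodes
assert abs(lagrange(xs, ys, 2.75) - 1 / 2.75) < 1e-12
# In between, P(3) approximates f(3) = 1/3
assert abs(lagrange(xs, ys, 3.0) - 1 / 3) < 0.01
```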


Lagrange Interpolation
 Generalized Lagrange interpolation
• Example: use the numbers (called nodes) x0 = 2, x1 = 2.75, and x2 = 4 to find the second
Lagrange interpolating polynomial for f (x) = 1/x

• Solution: determine the coefficient polynomials
L0(x) = (x − 2.75)(x − 4) / ((2 − 2.75)(2 − 4)),
L1(x) = (x − 2)(x − 4) / ((2.75 − 2)(2.75 − 4)),
L2(x) = (x − 2)(x − 2.75) / ((4 − 2)(4 − 2.75)).

Considering f(2) = 1/2, f(2.75) = 4/11, and f(4) = 1/4, we obtain
P(x) = (1/22)x² − (35/88)x + (49/44) ≈ 0.0455x² − 0.3977x + 1.1136.


Lagrange Interpolation
 Remainder term(余项)
• Theorem 3.3: Suppose x0, x1, . . . , xn are distinct numbers in the interval [a, b] and f ∈ Cn+1[a, b]. Then, for each x in [a, b], a number ξ(x) (generally unknown) between x0, x1, . . . , xn, and hence in (a, b), exists with
f(x) = P(x) + (f (n+1)(ξ(x)) / (n + 1)!) (x − x0)(x − x1) · · · (x − xn),
where P(x) is the interpolating polynomial. The second term is the remainder term.

• The remainder term is used to estimate the error bound.

• Comparison: the Taylor polynomial remainder contains the single factor (x − x0)^{n+1}, while the Lagrange polynomial remainder contains the product (x − x0)(x − x1) · · · (x − xn).


Lagrange Interpolation
 Remainder term
• Example: we have the second Lagrange polynomial for f (x) = 1/x on [2, 4] using the
nodes x0 = 2, x1 = 2.75, and x2 = 4. Determine the error form for this polynomial, and the
maximum error when the polynomial is used to approximate f (x) for x∈[2, 4].
• Solution: we have f(x) = x^{−1}, so f ′′′(x) = −6x^{−4}.

The second Lagrange polynomial has the error form
(f ′′′(ξ(x)) / 3!) (x − 2)(x − 2.75)(x − 4) = −(ξ(x))^{−4} (x − 2)(x − 2.75)(x − 4).

The maximum value of (ξ(x))^{−4} on [2, 4] is 2^{−4} = 1/16.

We now need the maximum on this interval of the absolute value of the polynomial g(x) = (x − 2)(x − 2.75)(x − 4).


Lagrange Interpolation
 Remainder term
• Example: we have the second Lagrange polynomial for f (x) = 1/x on [2, 4] using the
nodes x0 = 2, x1 = 2.75, and x2 = 4. Determine the error form for this polynomial, and the
maximum error when the polynomial is used to approximate f (x) for x∈[2, 4].
Solution: how to calculate the maximum value of
g(x) = (x − 2)(x − 2.75)(x − 4) = x³ − 8.75x² + 24.5x − 22 on [2, 4].

For Dx g(x) = (3x − 7)(x − 7/2) = 0, the critical points are
x = 7/3, with g(7/3) = 25/108, and x = 7/2, with g(7/2) = −9/16.

The maximum error is (1/16)·|−9/16| = 9/256 ≈ 0.03516.

Neville’s Method


Neville’s Method
 Introduction
• Lagrange interpolation: the degree of the polynomial needed for the desired accuracy
is generally not known until computations have been performed.
• We now derive these approximating polynomials in a manner that uses the previous
calculations to greater advantage.

• Definition 3.4: let f be a function defined at x0, x1, x2, . . . , xn, and suppose that m1, m2, . . .,
mk are k distinct integers, with 0 ≤ mi ≤ n for each i. The Lagrange polynomial that agrees
with f (x) at the k points xm1, xm2, . . . , xmk is denoted Pm1,m2,...,mk(x).
• Example: suppose that x0 = 1, x1 = 2, x2 = 3, x3 = 4, x4 = 6, and f (x) = ex. Determine the
interpolating polynomial denoted P1,2,4(x).
Solution: this is the Lagrange polynomial that agrees with f (x) at x1 = 2, x2 = 3, and x4 = 6:
P1,2,4(x) = ((x − 3)(x − 6)/((2 − 3)(2 − 6)))e² + ((x − 2)(x − 6)/((3 − 2)(3 − 6)))e³ + ((x − 2)(x − 3)/((6 − 2)(6 − 3)))e⁶.

Neville’s Method
 Fundamentals
• Theorem 3.5: Let f be defined at x0, x1, . . . , xk, and let xj and xi be two distinct numbers in this set. Then
P(x) = [(x − xj) P0,1,...,j−1,j+1,...,k(x) − (x − xi) P0,1,...,i−1,i+1,...,k(x)] / (xi − xj)
is the kth Lagrange polynomial that interpolates f at the k + 1 points x0, x1, . . . , xk.

• The procedure that recursively generates interpolating polynomial approximations is


called Neville’s method.

In the Neville table, moving downward uses consecutive points with larger i, and moving to the right increases the degree of the interpolating polynomial.

Neville’s Method
 Fundamentals
• The interpolating polynomials can be generated recursively. For example:

• Notation: let Qi,j(x), for 0 ≤ j ≤ i, denote the interpolating polynomial of degree j on the (j + 1) numbers xi−j, xi−j+1, . . . , xi−1, xi :
Qi,j(x) = [(x − xi−j) Qi,j−1(x) − (x − xi) Qi−1,j−1(x)] / (xi − xi−j) = Pi−j,i−j+1,...,i(x).

Neville’s Method
 Example
• Problem: apply Neville’s method to the data by constructing a recursive table for x=1.5.

Solution: Let x0 = 1.0, x1 = 1.3, x2 = 1.6, x3 = 1.9 and x4 = 2.2, Degree zero
then Q0,0 = f (1.0), Q1,0 = f (1.3), Q2,0 = f (1.6), Q3,0 = f (1.9) and Q4,0 = f (2.2)


Neville’s Method
 Example
• Problem: apply Neville’s method to the data by constructing a recursive table for x=1.5.

Solution: higher-degree approximations
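Neville's recursive table can be sketched as follows (a minimal sketch; since the slide's data table is not reproduced here, the nodes of the earlier f(x) = 1/x example are used instead):

```python
def neville(xs, ys, x):
    """Build Neville's table: Q[i][j] interpolates f on x_{i-j}, ..., x_i at x.
    Q[n-1][n-1] is the highest-degree approximation."""
    n = len(xs)
    Q = [[0.0] * n for _ in range(n)]
    for i in range(n):
        Q[i][0] = ys[i]                       # degree-zero entries
    for i in range(1, n):
        for j in range(1, i + 1):
            Q[i][j] = ((x - xs[i - j]) * Q[i][j - 1]
                       - (x - xs[i]) * Q[i - 1][j - 1]) / (xs[i] - xs[i - j])
    return Q

# Approximate f(3) = 1/3 for f(x) = 1/x with nodes 2, 2.75, 4
xs = [2.0, 2.75, 4.0]
ys = [1 / v for v in xs]
Q = neville(xs, ys, 3.0)
# The bottom-right entry is the full second-degree approximation
assert abs(Q[2][2] - 1 / 3) < 0.01
```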


Neville’s Method
 Neville’s Iterated Interpolation


Divided Differences


Divided Differences
 Introduction
• Divided-difference methods is used to generate successively higher-degree polynomial
approximations.

• The divided differences of f with respect to x0, x1, . . . , xn are used to express Pn(x) as
Pn(x) = a0 + a1(x − x0) + a2(x − x0)(x − x1) + · · · + an(x − x0)(x − x1) · · · (x − xn−1),
for appropriate constants a0, a1, . . . , an.


Divided Differences
 Notation
• The zeroth divided difference of the function f with respect to xi: f [xi] = f (xi).

• The first divided difference of f with respect to xi and xi+1 is denoted f [xi, xi+1]:
f [xi, xi+1] = (f [xi+1] − f [xi]) / (xi+1 − xi).

• The second divided difference is defined as
f [xi, xi+1, xi+2] = (f [xi+1, xi+2] − f [xi, xi+1]) / (xi+2 − xi).

• The kth divided difference is defined as
f [xi, xi+1, . . . , xi+k] = (f [xi+1, . . . , xi+k] − f [xi, . . . , xi+k−1]) / (xi+k − xi).

• The nth divided difference is defined as
f [x0, x1, . . . , xn] = (f [x1, . . . , xn] − f [x0, . . . , xn−1]) / (xn − x0).


Divided Differences
 Newton’s Divided Difference
• The Lagrange polynomial can be written in the form of Newton’s divided difference:
Pn(x) = f [x0] + f [x0, x1](x − x0) + f [x0, x1, x2](x − x0)(x − x1) + · · · + f [x0, . . . , xn](x − x0) · · · (x − xn−1).

The value of f [x0, x1, . . . , xk] is independent of the order of the numbers x0, x1, . . . , xk.
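Newton's divided-difference form can be sketched as follows (helper names are our own; the coefficient table is built in place, and evaluation nests like Horner's scheme):

```python
def divided_differences(xs, ys):
    """Return the Newton coefficients f[x0], f[x0,x1], ..., f[x0,...,xn]."""
    n = len(xs)
    coef = list(ys)
    for j in range(1, n):
        # work bottom-up so each entry still holds the previous column's value
        for i in range(n - 1, j - 1, -1):
            coef[i] = (coef[i] - coef[i - 1]) / (xs[i] - xs[i - j])
    return coef

def newton_eval(xs, coef, x):
    """Evaluate Newton's form by nesting."""
    result = coef[-1]
    for k in range(len(coef) - 2, -1, -1):
        result = result * (x - xs[k]) + coef[k]
    return result

# Same nodes as the earlier Lagrange example: f(x) = 1/x at 2, 2.75, 4
xs = [2.0, 2.75, 4.0]
ys = [1 / v for v in xs]
coef = divided_differences(xs, ys)
# Newton's form agrees with the unique interpolating polynomial
assert abs(newton_eval(xs, coef, 2.75) - 1 / 2.75) < 1e-12
assert abs(newton_eval(xs, coef, 3.0) - 0.32955) < 1e-4
```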


Hermite Interpolation


Hermite Interpolation
 Hermite Polynomial
• The Hermite polynomials agree with the magnitude and first derivative of f at x0, x1, . . . , xn.
• Theorem 3.9: If f ∈ C1[a, b] and x0, . . . , xn ∈ [a, b] are distinct, the unique polynomial of least degree agreeing with f and f ′ at x0, . . . , xn is the Hermite polynomial of degree at most 2n + 1 given by
H2n+1(x) = Σ_{j=0}^{n} f (xj) Hn,j(x) + Σ_{j=0}^{n} f ′(xj) Ĥn,j(x),

where, for Ln,j(x) denoting the jth Lagrange coefficient polynomial of degree n,
Hn,j(x) = [1 − 2(x − xj) L′n,j(xj)] L²n,j(x) and Ĥn,j(x) = (x − xj) L²n,j(x).
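An equivalent way to build the Hermite polynomial is a divided-difference table with each node doubled and f [xi, xi] = f ′(xi); a minimal sketch (the test function x³ is our own choice, picked because a degree-3 Hermite interpolant on two nodes must reproduce it exactly):

```python
def hermite(xs, ys, dys, x):
    """Evaluate the Hermite interpolant via divided differences on doubled nodes."""
    n = len(xs)
    m = 2 * n
    z = [xs[i // 2] for i in range(m)]
    Q = [[0.0] * m for _ in range(m)]
    for i in range(n):
        Q[2 * i][0] = ys[i]
        Q[2 * i + 1][0] = ys[i]
        Q[2 * i + 1][1] = dys[i]                 # f[x_i, x_i] = f'(x_i)
        if i > 0:
            Q[2 * i][1] = (Q[2 * i][0] - Q[2 * i - 1][0]) / (z[2 * i] - z[2 * i - 1])
    for i in range(2, m):
        for j in range(2, i + 1):
            Q[i][j] = (Q[i][j - 1] - Q[i - 1][j - 1]) / (z[i] - z[i - j])
    # nested evaluation of the Newton form on the doubled nodes
    result = Q[m - 1][m - 1]
    for k in range(m - 2, -1, -1):
        result = result * (x - z[k]) + Q[k][k]
    return result

# H has degree <= 2n+1 = 3, so it reproduces f(x) = x**3 exactly from two nodes
f = lambda x: x ** 3
df = lambda x: 3 * x ** 2
xs = [1.0, 2.0]
val = hermite(xs, [f(v) for v in xs], [df(v) for v in xs], 1.5)
assert abs(val - 1.5 ** 3) < 1e-12
```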


Hermite Interpolation
 Hermite Interpolation: Example
• Problem: Use the Hermite polynomial that agrees with the data in the table to find an
approximation of f (1.5).

Solution: first compute the Lagrange


polynomials and their derivatives


Hermite Interpolation
 Hermite Interpolation: Example
• Problem: Use the Hermite polynomial that agrees with the data in the table to find an
approximation of f (1.5).

Solution: compute the Hermite polynomials

H5(1.5) = 0.5118277

Cubic Spline Interpolation


Cubic Spline Interpolation


 Piecewise Polynomial Approximation
• Divide the approximation interval into a collection of subintervals and construct a
(generally) different approximating polynomial on each subinterval.

• Piecewise-linear interpolation: joining a set of data points by a series of straight lines.

• Disadvantage: no differentiability at the endpoints of the subintervals.


Cubic Spline Interpolation


 Cubic Splines
• Cubic spline interpolation: use cubic polynomials between each successive pair of nodes
• The four constants in each cubic give the spline sufficient flexibility to ensure that the interpolant is continuously differentiable on the interval and has a continuous second derivative.
• No assumption that the derivatives of the interpolant agree with those of the function.


Cubic Spline Interpolation


 Cubic Splines
• Definition 3.10: given a function f defined on [a, b] and a set of nodes a = x0 < x1 < · · · < xn = b, a cubic spline interpolant S for f is a function that satisfies the following conditions:
(a) S(x) is a cubic polynomial, denoted Sj(x), on the subinterval [xj, xj+1] for each j = 0, 1, . . . , n − 1; (Spline)
(b) Sj(xj) = f (xj) and Sj(xj+1) = f (xj+1) for each j = 0, 1, . . . , n − 1;
(c) Sj+1(xj+1) = Sj(xj+1) for each j = 0, 1, . . . , n − 2; (Implied by (b): S is a continuous function.)
(d) S′j+1(xj+1) = S′j(xj+1) for each j = 0, 1, . . . , n − 2; (Continuous first derivative)
(e) S″j+1(xj+1) = S″j(xj+1) for each j = 0, 1, . . . , n − 2; (Continuous second derivative)
(f) One of the following sets of boundary conditions is satisfied:
(i) S″(x0) = S″(xn) = 0 (natural (or free) boundary);
(ii) S′(x0) = f ′(x0) and S′(xn) = f ′(xn) (clamped boundary).


Cubic Spline Interpolation


 Cubic Splines
• Example: Construct a natural cubic spline that passes through the points (1, 2), (2, 3),
and (3, 5).
Solution: this spline consists of two cubics. The first, for the interval 1 ≤ x ≤ 2, is denoted
S0(x) = a0 + b0(x − 1) + c0(x − 1)² + d0(x − 1)³,
and the other, for 2 ≤ x ≤ 3, is denoted
S1(x) = a1 + b1(x − 2) + c1(x − 2)² + d1(x − 2)³.

Four conditions: the splines must agree with the data at the nodes:
S0(1) = 2, S0(2) = 3, S1(2) = 3, S1(3) = 5.

Two conditions: continuity of the first and second derivatives at the interior node:
S′0(2) = S′1(2) and S″0(2) = S″1(2).


Cubic Spline Interpolation


 Cubic Splines
• Example: Construct a natural cubic spline that passes through the points (1, 2), (2, 3),
and (3, 5).
Solution:
Two conditions: the natural boundary conditions S″0(1) = 0 and S″1(3) = 0.

Solving this system of equations gives the spline
S0(x) = 2 + (3/4)(x − 1) + (1/4)(x − 1)³, for 1 ≤ x ≤ 2,
S1(x) = 3 + (3/2)(x − 2) + (3/4)(x − 2)² − (1/4)(x − 2)³, for 2 ≤ x ≤ 3.
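The construction behind this example can be sketched end to end (a minimal sketch following the standard tridiagonal algorithm for natural splines; the variable names mirror the aj, bj, cj, dj coefficients, and the function name is our own):

```python
def natural_cubic_spline(xs, ys):
    """Solve for the natural-spline coefficients so that on [x_j, x_{j+1}]
    S_j(x) = a[j] + b[j]*(x-x_j) + c[j]*(x-x_j)**2 + d[j]*(x-x_j)**3."""
    n = len(xs) - 1
    h = [xs[j + 1] - xs[j] for j in range(n)]
    a = list(ys)
    # right-hand side of the tridiagonal system for the c_j
    alpha = [0.0] * (n + 1)
    for j in range(1, n):
        alpha[j] = 3 / h[j] * (a[j + 1] - a[j]) - 3 / h[j - 1] * (a[j] - a[j - 1])
    # forward sweep (natural boundary: c_0 = c_n = 0)
    l = [1.0] * (n + 1)
    mu = [0.0] * (n + 1)
    z = [0.0] * (n + 1)
    for j in range(1, n):
        l[j] = 2 * (xs[j + 1] - xs[j - 1]) - h[j - 1] * mu[j - 1]
        mu[j] = h[j] / l[j]
        z[j] = (alpha[j] - h[j - 1] * z[j - 1]) / l[j]
    # back substitution
    c = [0.0] * (n + 1)
    b = [0.0] * n
    d = [0.0] * n
    for j in range(n - 1, -1, -1):
        c[j] = z[j] - mu[j] * c[j + 1]
        b[j] = (a[j + 1] - a[j]) / h[j] - h[j] * (c[j + 1] + 2 * c[j]) / 3
        d[j] = (c[j + 1] - c[j]) / (3 * h[j])
    return a, b, c, d

# The example above: natural spline through (1, 2), (2, 3), (3, 5)
a, b, c, d = natural_cubic_spline([1.0, 2.0, 3.0], [2.0, 3.0, 5.0])
assert abs(b[0] - 0.75) < 1e-12 and abs(d[0] - 0.25) < 1e-12
assert abs(b[1] - 1.5) < 1e-12 and abs(c[1] - 0.75) < 1e-12 and abs(d[1] + 0.25) < 1e-12
```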


Cubic Spline Interpolation


 Construction of a Cubic Splines
• A spline defined on an interval that is divided into n subintervals will require determining
4n constants.

for each j = 0, 1, . . . , n − 1.
• By introducing hj = xj+1 − xj , we have

• By defining bn = S’(xn) , we have

• By defining cn = S’’(xn)/2 , we have


Cubic Spline Interpolation


 Natural Splines
• Theorem 3.11: If f is defined at a = x0 < x1 < ···< xn = b, then f has a unique natural spline
interpolant S on the nodes x0, x1, . . ., xn; that is, a spline interpolant that satisfies the
natural boundary conditions S’’(a) = 0 and S’’(b) = 0.
• Proof: considering c0 = 0 and cn = 0, and, for each j = 1, 2, . . . , n − 1,
hj−1 cj−1 + 2(hj−1 + hj) cj + hj cj+1 = (3/hj)(aj+1 − aj) − (3/hj−1)(aj − aj−1).

69
Strictly diagonally dominant

Cubic Spline Interpolation


 Natural Splines
• Example: Use the data points (0, 1), (1, e), (2, e2), and (3, e3) to form a natural spline S(x)
that approximates f (x) = ex.
Solution: n = 3, h0 = h1 = h2 = 1, a0 = 1, a1 = e, a2 = e², and a3 = e³. With c0 = c3 = 0, the tridiagonal system is
c0 + 4c1 + c2 = 3(e² − 2e + 1),
c1 + 4c2 + c3 = 3(e³ − 2e² + e).

Cubic Spline Interpolation


 Clamped Splines(固定边界样条)
• Theorem 3.12: If f is defined at a = x0 < x1 < ···< xn = b and differentiable at a and b, then
f has a unique clamped spline interpolant S on the nodes x0, x1, . . ., xn; that is, a spline
interpolant that satisfies the clamped boundary conditions S’(a) = f ’(a) and S’(b) = f ’(b) .
• Proof: the clamped conditions add two equations to the system. For j = 0, S′(a) = f ′(a) gives
2h0 c0 + h0 c1 = (3/h0)(a1 − a0) − 3f ′(a);
for j = n − 1, S′(b) = f ′(b) gives
hn−1 cn−1 + 2hn−1 cn = 3f ′(b) − (3/hn−1)(an − an−1).

Together with the interior equations, this is a linear equation system Ax = b with a strictly diagonally dominant matrix.


Summary
 Lagrange Interpolation
• A single polynomial of degree at most n constructed over the whole domain through n + 1 points
• nth Lagrange interpolating polynomial
• Neville’s method

 Cubic Spline Interpolation

• A cubic polynomial constructed on each subinterval
• Natural spline, clamped spline

Least Squares Approximation
(最小二乘逼近)

Wei Zhang(张炜)

Faculty of Mechanical Engineering and Automation


Zhejiang Sci-Tech University

Numerical Analysis and Application(数值分析与应用)
Lecture 3: Least Squares Approximation(最小二乘逼近)

Outlines

 Linear Least Squares(线性最小二乘法)


 Polynomial Least Squares(多项式最小二乘法)


Introduction
• Example: Hooke’s law for force-deformation relationship F(l) = k(l − E)

• We want to determine the spring constant for a spring.

• The data points (0, 5.3), (2, 7.0), (4, 9.4), and (6, 12.3) do not quite lie on a straight line.

Objective: find the line that best approximates all the data points.

Approximation has two aims: find a simpler function that approximates a given function, and find the best function to represent given data.

Linear Least Squares Approximation


Linear Least Squares Approximation


 Discrete Approximation

Raw data Ninth-degree interpolating polynomial

• The actual relationship between x and y is linear.
• High-order interpolation would introduce oscillations that were not originally present.
• A better approach would be to find the “best” approximating line, even if it does not agree precisely with the data at any point.

Linear Least Squares Approximation


 Discrete Approximation

• Construct a straight line a1x + a0.

• a1xi + a0 is the ith approximating value, and yi is the ith given value (i = 1, 2, . . . , 10).

• One option (minimax problem): finding the best linear approximation in the absolute sense requires values of a0 and a1 that minimize
max_{1≤i≤10} |yi − (a1xi + a0)|.

• Another option (absolute deviation): finding values of a0 and a1 to minimize
Σ_{i=1}^{10} |yi − (a1xi + a0)|.


Linear Least Squares Approximation


 Definition and Concepts

• Definition: fitting the best least squares line to a collection of data {(xi, yi)}mi=1 involves minimizing the total error
E(a0, a1) = Σ_{i=1}^{m} [yi − (a1xi + a0)]²
with respect to the parameters a0 and a1. (Carl Friedrich Gauss)

• The squared error puts substantially more weight on a point that is out of line with the rest of the data, but does not permit that point to completely dominate the approximation.


Linear Least Squares Approximation


 Derivation

• To minimize the total error
E(a0, a1) = Σ_{i=1}^{m} [yi − (a1xi + a0)]²,

• we need ∂E/∂a0 = 0 and ∂E/∂a1 = 0, which lead to the normal equations
a0·m + a1 Σxi = Σyi and a0 Σxi + a1 Σxi² = Σxiyi.

Linear Least Squares Approximation


 Derivation

• What we need to compute a0 and a1: solving the normal equations gives
a0 = (Σxi² Σyi − Σxiyi Σxi) / (m Σxi² − (Σxi)²),
a1 = (m Σxiyi − Σxi Σyi) / (m Σxi² − (Σxi)²).
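The normal equations can be solved in closed form directly from the data sums (a minimal sketch; the spring data from the introduction are used as a trial data set, and the helper name least_squares_line is our own):

```python
def least_squares_line(xs, ys):
    """Closed-form normal-equation solution for the line a1*x + a0."""
    m = len(xs)
    sx = sum(xs)
    sy = sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    denom = m * sxx - sx * sx
    a1 = (m * sxy - sx * sy) / denom
    a0 = (sxx * sy - sx * sxy) / denom
    return a0, a1

# Spring data from the introduction: (0, 5.3), (2, 7.0), (4, 9.4), (6, 12.3)
xs = [0.0, 2.0, 4.0, 6.0]
ys = [5.3, 7.0, 9.4, 12.3]
a0, a1 = least_squares_line(xs, ys)
assert abs(a1 - 1.17) < 1e-9   # slope
assert abs(a0 - 4.99) < 1e-9   # intercept
```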

81
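The two normal equations above have a closed-form solution in terms of the data sums. A minimal sketch (the function name and sample data are illustrative, not from the slides):

```python
# Least squares line y ≈ a1*x + a0 from the normal equations.
def least_squares_line(xs, ys):
    m = len(xs)
    sx = sum(xs)
    sy = sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    denom = m * sxx - sx * sx           # nonzero when the xi are not all equal
    a1 = (m * sxy - sx * sy) / denom    # slope
    a0 = (sxx * sy - sxy * sx) / denom  # intercept
    return a0, a1

# Points lying exactly on y = 2x + 1 are recovered exactly.
a0, a1 = least_squares_line([0, 1, 2, 3], [1, 3, 5, 7])
```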
Numerical Analysis and Application(数值分析与应用)
Lecture 3: Least Squares Approximation(最小二乘逼近)

Linear Least Squares Approximation


 Example

• Example: Find the least squares line approximating the data

82
Numerical Analysis and Application(数值分析与应用)
Lecture 3: Least Squares Approximation(最小二乘逼近)

Polynomial Least Squares Approximation

83
Numerical Analysis and Application(数值分析与应用)
Lecture 3: Least Squares Approximation(最小二乘逼近)

Polynomial Least Squares Approximation


 Definition

• The general problem of approximating a set of data, {(xi, yi) | i = 1, 2, . . . , m} , with an


algebraic polynomial

of degree n < m − 1.

• Choose the constants a0, a1, . . ., an to minimize the least squares error

84
Numerical Analysis and Application(数值分析与应用)
Lecture 3: Least Squares Approximation(最小二乘逼近)

Polynomial Least Squares Approximation


 Derivation

• To minimize E, it is necessary that ∂E/∂aj = 0 { j = 0, 1, . . . , n}

• n + 1 normal equations for the n + 1 unknowns aj

• The normal equations have a unique solution provided that the xi are distinct

85
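The normal equations can be assembled and solved numerically. A sketch in pure Python (the small in-place Gaussian elimination is only for this (n+1)×(n+1) system; names and data are illustrative):

```python
# Degree-n least squares polynomial via the normal equations:
#   sum_k a_k * sum_i x_i^(j+k) = sum_i y_i * x_i^j,  for j = 0..n
def least_squares_poly(xs, ys, n):
    m = n + 1
    S = [[sum(x ** (j + k) for x in xs) for k in range(m)] for j in range(m)]
    t = [sum(y * x ** j for x, y in zip(xs, ys)) for j in range(m)]
    for col in range(m - 1):                 # forward elimination
        for row in range(col + 1, m):
            r = S[row][col] / S[col][col]
            for k in range(col, m):
                S[row][k] -= r * S[col][k]
            t[row] -= r * t[col]
    a = [0.0] * m                            # back substitution
    for j in range(m - 1, -1, -1):
        a[j] = (t[j] - sum(S[j][k] * a[k] for k in range(j + 1, m))) / S[j][j]
    return a                                 # [a0, a1, ..., an]

# Data lying exactly on 1 + 2x + 3x^2 is reproduced exactly.
xs = [0.0, 0.25, 0.5, 0.75, 1.0]
coeffs = least_squares_poly(xs, [1 + 2 * x + 3 * x ** 2 for x in xs], 2)
```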
Numerical Analysis and Application(数值分析与应用)
Lecture 3: Least Squares Approximation(最小二乘逼近)

Polynomial Least Squares Approximation


 Example

• Example: fit the data with the discrete least squares polynomial
of degree at most 2

Solution: n = 2, m = 5, and the three normal equations are

86
Numerical Analysis and Application(数值分析与应用)
Lecture 3: Least Squares Approximation(最小二乘逼近)

Polynomial Least Squares Approximation


 Approximation for Exponentially Related Data

• Exponentially related data requires the approximating function to be of the form


or

• Commonly used method: consider the logarithm of the approximating equation

or

Linear Problem

87
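The log-linearization above reduces the exponential fit to a straight-line fit on (x, ln y). A sketch for the form y = b·e^(ax) (function name and data are illustrative):

```python
import math

# Fit y ≈ b * exp(a*x) by least squares on (x, ln y); requires y > 0.
def fit_exponential(xs, ys):
    ln_ys = [math.log(y) for y in ys]
    m = len(xs)
    sx = sum(xs)
    sy = sum(ln_ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ln_ys))
    denom = m * sxx - sx * sx
    a = (m * sxy - sx * sy) / denom        # exponent
    ln_b = (sxx * sy - sxy * sx) / denom   # intercept in log space
    return math.exp(ln_b), a

# Data generated from y = 2*exp(0.5*x) is recovered exactly.
b, a = fit_exponential([0, 1, 2, 3], [2 * math.exp(0.5 * x) for x in [0, 1, 2, 3]])
```

Note the caveat: this minimizes the error of the logarithms, not the least squares error of the original data.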
Numerical Analysis and Application(数值分析与应用)
Lecture 3: Least Squares Approximation(最小二乘逼近)

Polynomial Least Squares Approximation


 Example

• Example: Approximate the data in the first three columns

Solution: assume an approximation of the form

88
Numerical Analysis and Application(数值分析与应用)
Lecture 3: Least Squares Approximation(最小二乘逼近)

Summary
 Linear Least Square
• The approximating line does not necessarily pass through any of the data points.
• Minimize the square of error.
• Required data:

 Polynomial Least Square


• Solving n + 1 normal equations for a polynomial of degree n.
• Approximation with exponentially related data.

89
Numerical Differentiation and Integration
(数值微分与积分)

Wei Zhang(张炜)

Faculty of Mechanical Engineering and Automation


Zhejiang Sci-Tech University

90
Numerical Analysis and Application(数值分析与应用)
Lecture 4: Numerical Differentiation and Integration(数值微分与积分)

Outlines

 Numerical Differentiation(数值微分)
 Richardson’s Extrapolation(理查德森外插)
 Elements of Numerical Integration(数值积分基本知识)
 Composite Numerical Integration(复合积分)
 Romberg Integration(龙贝格积分)

91
Numerical Analysis and Application(数值分析与应用)
Lecture 4: Numerical Differentiation and Integration(数值微分与积分)

Numerical Differentiation
 Example
• A sheet of corrugated roofing

• Compute the length of the curve given by f (x) = sin x from x = 0 to x = 48

Difficult to determine

• Methods are developed to approximate the solution to problems of this type.

92
Numerical Analysis and Application(数值分析与应用)
Lecture 4: Numerical Differentiation and Integration(数值微分与积分)

Numerical Differentiation

93
Numerical Analysis and Application(数值分析与应用)
Lecture 4: Numerical Differentiation and Integration(数值微分与积分)

Numerical Differentiation
 First-Order Derivative
• The derivative of the function f at x0 is

For small h

 Approximation using Lagrange Polynomial


• Suppose first that x0 ∈ (a, b), where f ∈ C2[a, b], and that x1 = x0 + h for some h ≠ 0 that is
sufficiently small to ensure that x1 ∈ [a, b]

for some ξ(x) between x0 and x1


94
Numerical Analysis and Application(数值分析与应用)
Lecture 4: Numerical Differentiation and Integration(数值微分与积分)

Numerical Differentiation

First-order derivative Approximation error


• For x = x0

• Forward-difference formula if h > 0


• Backward-difference formula if h < 0

95
Numerical Analysis and Application(数值分析与应用)
Lecture 4: Numerical Differentiation and Integration(数值微分与积分)

Numerical Differentiation
 Example
• Problem: Use the forward-difference formula to approximate the derivative of f (x) = ln x
at x0 = 1.8 using h = 0.1, h = 0.05 and h = 0.01, and determine bounds for the
approximation errors.
Solution: forward-difference formula

h = 0.1

Because f ″(x) = −1/x² and 1.8 < ξ < 1.9

Error bound

96
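The numbers in this example are easy to check. A sketch of the forward-difference formula applied to f(x) = ln x at x0 = 1.8 (not the slides' code; the exact derivative is 1/1.8):

```python
import math

# Forward-difference formula: f'(x0) ≈ (f(x0 + h) − f(x0)) / h
def forward_difference(f, x0, h):
    return (f(x0 + h) - f(x0)) / h

exact = 1 / 1.8
errors = {h: abs(forward_difference(math.log, 1.8, h) - exact)
          for h in (0.1, 0.05, 0.01)}
# each error stays below the bound h * max|f''| / 2 = h / (2 * 1.8**2)
```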
Numerical Analysis and Application(数值分析与应用)
Lecture 4: Numerical Differentiation and Integration(数值微分与积分)

Numerical Differentiation
 Three-Point Formula
• Deriving the approximation formulas using Lagrange coefficient polynomial

j = 0, 1, 2

97
Numerical Analysis and Application(数值分析与应用)
Lecture 4: Numerical Differentiation and Integration(数值微分与积分)

Numerical Differentiation
 Three-Point Formula
• Three-point endpoint formula

• Three-point midpoint formula

98
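The three-point midpoint formula has O(h²) error, so halving h should cut the error by roughly a factor of four. A sketch demonstrating this (integrand chosen for illustration):

```python
import math

# Three-point midpoint formula: f'(x0) ≈ (f(x0 + h) − f(x0 − h)) / (2h)
def midpoint_derivative(f, x0, h):
    return (f(x0 + h) - f(x0 - h)) / (2 * h)

# For f = sin, f'(0.5) = cos(0.5); compare errors for h and h/2.
e1 = abs(midpoint_derivative(math.sin, 0.5, 0.1) - math.cos(0.5))
e2 = abs(midpoint_derivative(math.sin, 0.5, 0.05) - math.cos(0.5))
```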
Numerical Analysis and Application(数值分析与应用)
Lecture 4: Numerical Differentiation and Integration(数值微分与积分)

Numerical Differentiation
 Five-Point Formula
• Five-point endpoint formula

• Five-point midpoint formula

 Second-Order Derivative
• Three-point midpoint formula

99
Numerical Analysis and Application(数值分析与应用)
Lecture 4: Numerical Differentiation and Integration(数值微分与积分)

Richardson’s Extrapolation

100
Numerical Analysis and Application(数值分析与应用)
Lecture 4: Numerical Differentiation and Integration(数值微分与积分)

Richardson’s Extrapolation
 Purpose: generate high-accuracy results while using low-order formulas.

Combine these rather inaccurate O(h) approximations in an appropriate way


to produce formulas with a higher-order truncation error.

 Applicability: the approximation technique must have an error term with a predictable form,
usually depending on powers of the step size h.
 General procedure: for a formula N1(h) that approximates an unknown constant M

K1, K2, K3… are constants.

101
Numerical Analysis and Application(数值分析与应用)
Lecture 4: Numerical Differentiation and Integration(数值微分与积分)

Richardson’s Extrapolation
 Realization:

Eliminates the h2 term

102
Numerical Analysis and Application(数值分析与应用)
Lecture 4: Numerical Differentiation and Integration(数值微分与积分)

Richardson’s Extrapolation
 Realization: for each j = 2, 3, . . . , the O(h2j) approximation

103
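The recurrence above can be tabulated directly. A sketch assuming the error expansion of N1(h) contains only even powers of h, as for the central-difference formula used here (function names are illustrative):

```python
import math

# Richardson extrapolation table: each column gains two orders of accuracy,
#   N_j(h) = N_{j-1}(h/2) + (N_{j-1}(h/2) − N_{j-1}(h)) / (4**(j-1) − 1)
def richardson(N1, h, levels):
    table = [[N1(h / 2 ** i)] for i in range(levels)]
    for i in range(1, levels):
        for j in range(1, i + 1):
            prev = table[i][j - 1]
            table[i].append(prev + (prev - table[i - 1][j - 1]) / (4 ** j - 1))
    return table  # table[i][j] is an O(h^(2j+2)) approximation

# N1 is the O(h^2) central difference for d/dx sin(x) at x = 0.5.
N1 = lambda h: (math.sin(0.5 + h) - math.sin(0.5 - h)) / (2 * h)
best = richardson(N1, 0.4, 4)[-1][-1]  # close to cos(0.5)
```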
Numerical Analysis and Application(数值分析与应用)
Lecture 4: Numerical Differentiation and Integration(数值微分与积分)

Elements of Numerical Integration

104
Numerical Analysis and Application(数值分析与应用)
Lecture 4: Numerical Differentiation and Integration(数值微分与积分)

Elements of Numerical Integration


 Numerical Quadrature
• Purpose: evaluating the definite integral of a function that has no explicit anti-derivative
or whose anti-derivative is not easy to obtain.

• Basic idea: select a set of distinct nodes {x0, . . . , xn} from the interval [a, b]. Then
integrate the Lagrange interpolating polynomial

Numerical quadrature

105
Numerical Analysis and Application(数值分析与应用)
Lecture 4: Numerical Differentiation and Integration(数值微分与积分)

Elements of Numerical Integration


 The Trapezoidal Rule(梯形法则)
• To derive the Trapezoidal rule, approximate the integral using equally-spaced nodes.
• Let x0 = a, x1 = b, h = b − a, use the linear Lagrange polynomial:

106
Numerical Analysis and Application(数值分析与应用)
Lecture 4: Numerical Differentiation and Integration(数值微分与积分)

Elements of Numerical Integration


 The Trapezoidal Rule(梯形法则)

h = x1 − x0 Trapezoidal rule

107
Numerical Analysis and Application(数值分析与应用)
Lecture 4: Numerical Differentiation and Integration(数值微分与积分)

Elements of Numerical Integration


 Simpson’s Rule(Simpson法则)
• To derive Simpson’s rule, approximate the integral using equally-spaced nodes.
• Let x0 = a, x1 = a + h, x2 = b, use the Taylor polynomial:

Simpson’s rule

108
Numerical Analysis and Application(数值分析与应用)
Lecture 4: Numerical Differentiation and Integration(数值微分与积分)

Elements of Numerical Integration


 Simpson’s Rule(Simpson法则)
• Use the Lagrange interpolation:

109
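The two rules differ in their degrees of precision: the Trapezoidal rule is exact for linear f, Simpson's rule for cubics. A sketch checking this on an illustrative integrand (not the slides' example data):

```python
# Single-interval Trapezoidal rule and Simpson's rule on [a, b].
def trapezoid(f, a, b):
    h = b - a
    return h / 2 * (f(a) + f(b))

def simpson(f, a, b):
    h = (b - a) / 2
    return h / 3 * (f(a) + 4 * f(a + h) + f(b))

# Integral of x^3 on [0, 2] is 4: Simpson's rule is exact, trapezoid is not.
s = simpson(lambda x: x ** 3, 0, 2)   # 4.0
t = trapezoid(lambda x: x ** 3, 0, 2)  # 8.0
```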
Numerical Analysis and Application(数值分析与应用)
Lecture 4: Numerical Differentiation and Integration(数值微分与积分)

Elements of Numerical Integration


 Example
• Problem: Compare the Trapezoidal rule and Simpson’s rule approximations to
when f (x) is

Solution:

110
Numerical Analysis and Application(数值分析与应用)
Lecture 4: Numerical Differentiation and Integration(数值微分与积分)

Elements of Numerical Integration


 Closed Newton-Cotes Formulas(闭区间上的牛顿-柯特斯公式)
• Nodes xi = x0 +ih, for i = 0, 1,. . . , n, where x0 = a, xn = b and h = (b − a)/n.
• The endpoints of the closed interval [a, b] are included as nodes.

• (n+1)-point closed Newton-Cotes formula

111
Numerical Analysis and Application(数值分析与应用)
Lecture 4: Numerical Differentiation and Integration(数值微分与积分)

Elements of Numerical Integration


 Closed Newton-Cotes Formulas(闭区间上的牛顿-柯特斯公式)

• Theorem 4.2: suppose that denotes the (n + 1)-point closed Newton-Cotes


formula with x0 = a, xn = b, and h = (b − a)/n. There exists ξ ∈ (a, b) for which

if n is even and f ∈ C n+2[a, b]

if n is odd and f ∈ C n+1[a, b]

n = 1: Trapezoidal rule

n = 2: Simpson’s rule

112
Numerical Analysis and Application(数值分析与应用)
Lecture 4: Numerical Differentiation and Integration(数值微分与积分)

Composite Numerical Integration

113
Numerical Analysis and Application(数值分析与应用)
Lecture 4: Numerical Differentiation and Integration(数值微分与积分)

Composite Numerical Integration


 Defects of Newton-Cotes Formulas
• High-degree Newton-Cotes formulas have coefficients that are difficult to obtain, and the
underlying high-degree interpolating polynomials tend to oscillate.
• Newton-Cotes formulas are based on equally-spaced nodes, which is inaccurate over large
intervals.

 Composite Numerical Integration


• Low-order Newton-Cotes formulas in a piecewise approach.
• Example: Use Simpson’s rule to approximate and compare this to the results
obtained by adding the Simpson’s rule approximations for and . Compare these
approximations to the sum of Simpson’s rule for , , , and .
Solution: exact answer in this case is e4 − e0 = 53.59815

(1) error = −3.17143

114
Numerical Analysis and Application(数值分析与应用)
Lecture 4: Numerical Differentiation and Integration(数值微分与积分)

Composite Numerical Integration


(2) error = −0.26570

(3) error = −0.01807

115
Numerical Analysis and Application(数值分析与应用)
Lecture 4: Numerical Differentiation and Integration(数值微分与积分)

Composite Numerical Integration


 Generalization of Composite Numerical Integration
• Problem: for an arbitrary integral
• Procedure:
(1) Subdivide the interval [a, b] into n subintervals (n is an even number).

116
Numerical Analysis and Application(数值分析与应用)
Lecture 4: Numerical Differentiation and Integration(数值微分与积分)

Composite Numerical Integration


 Generalization of Composite Numerical Integration
(2) Apply Simpson’s rule on each consecutive pair of subintervals
with h = (b − a)/n and xj = a + jh, for each j = 0, 1, . . . , n, we have

(3) For each j = 1, 2,. . . , (n/2) − 1, we have f (x2j) appearing in the term corresponding to
the interval [x2j−2, x2j] and also in the term corresponding to the interval [x2j, x2j+2]

117
Numerical Analysis and Application(数值分析与应用)
Lecture 4: Numerical Differentiation and Integration(数值微分与积分)

Composite Numerical Integration


 Composite Simpson’s Rule
• Theorem 4.4: Let f ∈ C4 [a, b], n be even, h = (b − a)/n, and xj = a + jh, for each j = 0, 1, . . . ,
n. There exists a μ ∈ (a, b) for which the Composite Simpson’s rule for n subintervals
can be written with its error term as

Error term O(h4)

The most frequently used general-purpose quadrature algorithm.

118
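The Composite Simpson's rule in Theorem 4.4 is short to implement. A sketch (not the slides' code) applied to the earlier example, the integral of e^x on [0, 4] with exact value e^4 − 1 ≈ 53.59815; with n = 8 (h = 0.5) the error magnitude is about 0.01807, matching part (3) of that example:

```python
import math

# Composite Simpson's rule with n (even) subintervals on [a, b].
def composite_simpson(f, a, b, n):
    if n % 2:
        raise ValueError("n must be even")
    h = (b - a) / n
    s = f(a) + f(b)
    s += 4 * sum(f(a + j * h) for j in range(1, n, 2))  # odd-index nodes
    s += 2 * sum(f(a + j * h) for j in range(2, n, 2))  # interior even-index nodes
    return h / 3 * s

approx = composite_simpson(math.exp, 0.0, 4.0, 8)
err = abs(approx - (math.exp(4) - 1))  # about 0.01807
```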
Numerical Analysis and Application(数值分析与应用)
Lecture 4: Numerical Differentiation and Integration(数值微分与积分)

Composite Numerical Integration


 Composite Trapezoidal rule
• Theorem 4.5: Let f ∈ C2[a, b], h = (b − a)/n, and xj = a + jh, for each j = 0, 1, . . . , n. There
exists a μ ∈ (a, b) for which the Composite Trapezoidal rule for n subintervals can be
written with its error term as

119
Numerical Analysis and Application(数值分析与应用)
Lecture 4: Numerical Differentiation and Integration(数值微分与积分)

Composite Numerical Integration


 Composite Midpoint rule
• Theorem 4.6: Let f ∈ C2[a, b], h = (b − a)/(n+2), and xj = a + (j+1)h, for each j = -1, 0,
1, . . . , n+1. There exists a μ ∈ (a, b) for which the Composite Midpoint rule for n+2
subintervals can be written with its error term as

120
Numerical Analysis and Application(数值分析与应用)
Lecture 4: Numerical Differentiation and Integration(数值微分与积分)

Composite Numerical Integration


 Example
• Problem: Determine values of h that will ensure an approximation error of less than
0.00002 when approximating and employing (a) Composite Trapezoidal rule
and (b) Composite Simpson’s rule.
Solution: (a) Composite Trapezoidal rule
The error form for the Composite Trapezoidal rule

n = π/h

n ≥ 360
121
Numerical Analysis and Application(数值分析与应用)
Lecture 4: Numerical Differentiation and Integration(数值微分与积分)

Composite Numerical Integration


 Example
• Problem: Determine values of h that will ensure an approximation error of less than
0.00002 when approximating and employing (a) Composite Trapezoidal rule
and (b) Composite Simpson’s rule.
Solution: (b) Composite Simpson’s rule
The error form for the Composite Simpson’s rule

n = π/h

n ≥ 18
122
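The value found in part (a) can be verified numerically: with n = 360, the Composite Trapezoidal rule for the integral of sin x over [0, π] (exact value 2) does achieve an error below 0.00002. A sketch (not the slides' code):

```python
import math

# Composite Trapezoidal rule with n subintervals on [a, b].
def composite_trapezoid(f, a, b, n):
    h = (b - a) / n
    interior = sum(f(a + j * h) for j in range(1, n))
    return h * ((f(a) + f(b)) / 2 + interior)

err_trap = abs(composite_trapezoid(math.sin, 0.0, math.pi, 360) - 2.0)
# err_trap is about 1.27e-5, below the required tolerance 0.00002
```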
Numerical Analysis and Application(数值分析与应用)
Lecture 4: Numerical Differentiation and Integration(数值微分与积分)

Romberg Integration

123
Numerical Analysis and Application(数值分析与应用)
Lecture 4: Numerical Differentiation and Integration(数值微分与积分)

Romberg Integration
 Romberg Integration = Composite Trapezoidal Rule + Richardson’s Extrapolation

f ∈ C∞[a, b] Composite Trapezoidal Rule

 Realization: to approximate the integral using the results of the Composite


Trapezoidal rule with n = 1, 2, 4, 8, 16, . . .

(1) Resulting approximations are denoted as R1,1, R2,1, R3,1, R4,1 , R5,1

n = 1 n = 2 n = 4 n = 8 n = 16

124
Numerical Analysis and Application(数值分析与应用)
Lecture 4: Numerical Differentiation and Integration(数值微分与积分)

Romberg Integration
(2) Obtaining O(h4) approximations R2,2, R3,2, R4,2, R5,2

(3) Obtaining O(h6) approximations R3,3, R4,3, R5,3

125
Numerical Analysis and Application(数值分析与应用)
Lecture 4: Numerical Differentiation and Integration(数值微分与积分)

Romberg Integration
 Example:
• Problem: Use the Composite Trapezoidal rule to find approximations to
with n = 1, 2, 4, 8, and 16. Then perform Romberg extrapolation on the results.
Solution:

126
Numerical Analysis and Application(数值分析与应用)
Lecture 4: Numerical Differentiation and Integration(数值微分与积分)

Romberg Integration
The O(h4) approximations are

The O(h6) approximations are

The O(h8) approximations are

The O(h10) approximations are

127
Numerical Analysis and Application(数值分析与应用)
Lecture 4: Numerical Differentiation and Integration(数值微分与积分)

Romberg Integration
General results:

Generalized form

128
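The generalized recurrence R_{i,j} can be implemented by reusing the previous trapezoid sums, so each new row costs only 2^(i−1) new function evaluations. A sketch (function name illustrative), applied to the integral of sin x over [0, π]:

```python
import math

# Romberg table: R[i][0] is the Composite Trapezoidal rule with n = 2^i,
# later columns apply Richardson extrapolation.
def romberg(f, a, b, levels):
    R = [[(b - a) / 2 * (f(a) + f(b))]]  # R[0][0], n = 1
    for i in range(1, levels):
        h = (b - a) / 2 ** i
        # refine the trapezoid sum, reusing the previous row's value
        new_pts = sum(f(a + (2 * k - 1) * h) for k in range(1, 2 ** (i - 1) + 1))
        row = [R[i - 1][0] / 2 + h * new_pts]
        for j in range(1, i + 1):
            row.append(row[j - 1] + (row[j - 1] - R[i - 1][j - 1]) / (4 ** j - 1))
        R.append(row)
    return R

R = romberg(math.sin, 0.0, math.pi, 5)  # R[4][4] approximates the exact value 2
```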
Numerical Analysis and Application(数值分析与应用)
Lecture 4: Numerical Differentiation and Integration(数值微分与积分)

Romberg Integration
 Example: easily extended to larger n
• Problem: add an additional extrapolation row to the above table to approximate
Solution:

129
Numerical Analysis and Application(数值分析与应用)
Lecture 4: Numerical Differentiation and Integration(数值微分与积分)

Summary
 Numerical Differentiation
 Richardson’s Extrapolation
• Generate high-accuracy results while using low-order formulas.
 Numerical Integration
• Trapezoidal rule, Simpson’s rule, closed Newton-Cotes formulas.
 Composite Numerical Integration
• Low-order, piecewise approach.
• Based on the Trapezoidal rule or Simpson’s rule.
 Romberg Integration
• Composite Trapezoidal Rule + Richardson’s Extrapolation
• Improved accuracy.

130
Initial Value Problems for
Ordinary Differential Equations
(常微分方程的初值问题)

Wei Zhang(张炜)

Faculty of Mechanical Engineering and Automation


Zhejiang Sci-Tech University

131
Numerical Analysis and Application(数值分析与应用)
Lecture 5: Initial Value Problems for Ordinary Differential Equations(常微分方程的初值问题)

Outlines

 Elementary Theory of Initial Value Problems(初值问题的基本理论)


 Euler’s Method(欧拉方法)
 Higher-Order Taylor Methods(高阶泰勒方法)
 Runge-Kutta Methods(龙格-库塔方法)
 Runge-Kutta-Fehlberg Method(龙格-库塔-菲尔伯格方法)
 Multistep Methods(多步法)

132
Numerical Analysis and Application(数值分析与应用)
Lecture 5: Initial Value Problems for Ordinary Differential Equations(常微分方程的初值问题)

Introduction
 What is an initial value problem (IVP)?
• The motion of a swinging pendulum

For small θ the equation can be simplified; for large θ no simplification is possible.

• Initial value problem with initial condition y(a) = α

133
Numerical Analysis and Application(数值分析与应用)
Lecture 5: Initial Value Problems for Ordinary Differential Equations(常微分方程的初值问题)

Elementary Theory of Initial Value Problems

134
Numerical Analysis and Application(数值分析与应用)
Lecture 5: Initial Value Problems for Ordinary Differential Equations(常微分方程的初值问题)

Elementary Theory of Initial Value Problems


 Initial value problem
• The change of some variable with respect to another.
• Solution to a differential equation that satisfies a given initial condition.
• Definition 5.1: A function f (t, y) is said to satisfy a Lipschitz condition in the variable y
on a set D ⊂ ℝ2 if a constant L > 0 exists with

whenever (t, y1) and (t, y2) are in D. The constant L is called a Lipschitz constant for f.

• Definition 5.2: A set D ⊂ ℝ2 is said to be convex (凸集) if whenever (t1, y1) and (t2, y2)
belong to D, then ((1 − λ)t1 + λt2, (1 − λ)y1 + λy2) also belongs to D for every λ in [0, 1].

135
Numerical Analysis and Application(数值分析与应用)
Lecture 5: Initial Value Problems for Ordinary Differential Equations(常微分方程的初值问题)

Elementary Theory of Initial Value Problems


 Initial value problem
• Theorem 5.3: Suppose f (t, y) is defined on a convex set D ⊂ ℝ2. If a constant L > 0 exists
with

then f satisfies a Lipschitz condition on D in the variable y with Lipschitz constant L.

• Theorem 5.4: Suppose that D = {(t, y) | a ≤ t ≤ b and −∞ < y < ∞} and that f (t, y) is
continuous on D. If f satisfies a Lipschitz condition on D in the variable y, then the initial-
value problem

has a unique solution y(t) for a ≤ t ≤ b.

136
Numerical Analysis and Application(数值分析与应用)
Lecture 5: Initial Value Problems for Ordinary Differential Equations(常微分方程的初值问题)

Elementary Theory of Initial Value Problems


 Well-Posed Problems (适定问题)
• Whether small changes in the statement of the problem introduce correspondingly small
changes in the solution ?
• Definition 5.5: The initial-value problem

is said to be a well-posed problem if:


• A unique solution, y(t), to the problem exists, and
• There exists constants ε0 > 0 and k > 0 such that for any ε, with ε0 > ε > 0, whenever δ(t)
is continuous with |δ(t)| < ε for all t in [a, b], and when |δ0| < ε, the initial-value
problem small change

has a unique solution z(t) that satisfies

137
Numerical Analysis and Application(数值分析与应用)
Lecture 5: Initial Value Problems for Ordinary Differential Equations(常微分方程的初值问题)

Elementary Theory of Initial Value Problems


 Well-Posed Problems (适定问题)
• Theorem 5.6: Suppose D = {(t, y) | a ≤ t ≤ b and −∞ < y < ∞}. If f is continuous and
satisfies a Lipschitz condition in the variable y on the set D, then the initial-value problem

is well-posed.

138
Numerical Analysis and Application(数值分析与应用)
Lecture 5: Initial Value Problems for Ordinary Differential Equations(常微分方程的初值问题)

Euler’s Method

139
Numerical Analysis and Application(数值分析与应用)
Lecture 5: Initial Value Problems for Ordinary Differential Equations(常微分方程的初值问题)

Euler’s Method
 Basic Idea
• The most elementary approximation technique for solving initial-value problems.
• Objective: to obtain approximations to the well-posed initial-value problem

at various values, called mesh points, in the interval [a, b].


• Procedure:
(1) Choosing a positive integer N and selecting the mesh points in the interval [a, b]

(2) Use Taylor’s theorem

140
Numerical Analysis and Application(数值分析与应用)
Lecture 5: Initial Value Problems for Ordinary Differential Equations(常微分方程的初值问题)

Euler’s Method
 Basic Idea
(3) Constructing wi ≈ y(ti), for each i = 1, 2, . . . , N, the Euler’s method is

 Example
• Problem: use Euler’s method to approximate the solution to

at t = 2 with h = 0.5.
Solution:

141
Numerical Analysis and Application(数值分析与应用)
Lecture 5: Initial Value Problems for Ordinary Differential Equations(常微分方程的初值问题)

Euler’s Method
 Geometrical Interpretation

142
Numerical Analysis and Application(数值分析与应用)
Lecture 5: Initial Value Problems for Ordinary Differential Equations(常微分方程的初值问题)

Euler’s Method
 Example
• Problem: use Euler’s method to approximate the solution to

with N = 10 to determine approximations, and compare these with the exact values given
by y(t) = (t + 1)2 − 0.5et.

Solution: with N = 10 we have h = 0.2, ti = 0.2i, w0 = 0.5

143
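The update w_{i+1} = w_i + h·f(t_i, w_i) is short to implement. A sketch reproducing this example (not the slides' code); with h = 0.2 it gives w10 ≈ 4.8657845 against the exact y(2) ≈ 5.3054720:

```python
import math

# Euler's method for y' = f(t, y), y(a) = alpha, on [a, b] with N steps.
def euler(f, a, b, alpha, N):
    h = (b - a) / N
    t, w = a, alpha
    ws = [w]
    for _ in range(N):
        w = w + h * f(t, w)  # w_{i+1} = w_i + h f(t_i, w_i)
        t = t + h
        ws.append(w)
    return ws

ws = euler(lambda t, y: y - t ** 2 + 1, 0.0, 2.0, 0.5, 10)
exact = (2 + 1) ** 2 - 0.5 * math.exp(2)  # y(2) from y(t) = (t+1)^2 - 0.5 e^t
```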
Numerical Analysis and Application(数值分析与应用)
Lecture 5: Initial Value Problems for Ordinary Differential Equations(常微分方程的初值问题)

Higher-Order Taylor Methods

144
Numerical Analysis and Application(数值分析与应用)
Lecture 5: Initial Value Problems for Ordinary Differential Equations(常微分方程的初值问题)

Higher-Order Taylor Methods


 Local Truncation Error
• Measure how well the approximations generated by the methods satisfy the differential
equation.
• The local truncation error determines the actual approximation error.
• Definition 5.11: The difference method

has local truncation error

• The Euler’s method has the truncation error

145
Numerical Analysis and Application(数值分析与应用)
Lecture 5: Initial Value Problems for Ordinary Differential Equations(常微分方程的初值问题)

Higher-Order Taylor Methods


 High-Order Methods
• Try to select a difference-equation method whose local truncation error is O(hp) for
as large a value of p as possible.
• Keeping the number and complexity of calculations of the methods within a reasonable
bound.
• The solution y(t) to the initial-value problem

Truncation error
146
Numerical Analysis and Application(数值分析与应用)
Lecture 5: Initial Value Problems for Ordinary Differential Equations(常微分方程的初值问题)

Higher-Order Taylor Methods


 Taylor Method of Order n

Euler’s method is Taylor’s method of order one.

• Theorem 5.12: If Taylor’s method of order n is used to approximate the solution to

with step size h and if y ∈ Cn+1[a, b], then the local truncation error is O(hn).

147
Numerical Analysis and Application(数值分析与应用)
Lecture 5: Initial Value Problems for Ordinary Differential Equations(常微分方程的初值问题)

Higher-Order Taylor Methods


 Example
• Problem: Apply Taylor’s method of orders (a) two and (b) four with N = 10 to the initial-
value problem

Solution: Taylor’s method of order two

Because N = 10 we have h = 0.2, and ti = 0.2i for each i = 1, 2, . . . , 10

148
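For this particular f the total derivative can be written out by hand: f′(t, y) = y′ − 2t = y − t² + 1 − 2t along the solution. That gives a direct implementation of Taylor's method of order two (a sketch, not the slides' code); with h = 0.2 it yields w10 ≈ 5.3476843:

```python
import math

# Taylor's method of order two for y' = y - t^2 + 1, y(0) = 0.5.
def taylor2(a, b, alpha, N):
    h = (b - a) / N
    t, w = a, alpha
    for _ in range(N):
        f = w - t ** 2 + 1            # f(t, y) for this example
        fp = f - 2 * t                # total derivative f' = y' - 2t
        w += h * (f + (h / 2) * fp)   # order-two update w + h*T2
        t += h
    return w

w = taylor2(0.0, 2.0, 0.5, 10)        # approximates y(2) = 9 - 0.5 e^2
```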
Numerical Analysis and Application(数值分析与应用)
Lecture 5: Initial Value Problems for Ordinary Differential Equations(常微分方程的初值问题)

Higher-Order Taylor Methods


Solution: Taylor’s method of order four

149
Numerical Analysis and Application(数值分析与应用)
Lecture 5: Initial Value Problems for Ordinary Differential Equations(常微分方程的初值问题)

Higher-Order Taylor Methods


Solution: Taylor’s method of order four

Because N = 10 we have h = 0.2

150
Numerical Analysis and Application(数值分析与应用)
Lecture 5: Initial Value Problems for Ordinary Differential Equations(常微分方程的初值问题)

Runge-Kutta Methods

151
Numerical Analysis and Application(数值分析与应用)
Lecture 5: Initial Value Problems for Ordinary Differential Equations(常微分方程的初值问题)

Runge-Kutta Methods
 Why Runge-Kutta Methods
• The Taylor methods requires the computation and evaluation of the derivatives of f (t, y).
• Advantages of Runge-Kutta methods: high-order, no need to evaluate f (n) (t, y).
• Theorem 5.13: Suppose that f (t, y) and all its partial derivatives of order less than or
equal to n + 1 are continuous on D = {(t, y) | a ≤ t ≤ b, c ≤ y ≤ d}, and let (t0, y0) ∊ D. For
every (t, y) ∊ D, there exists ξ between t and t0 and μ between y and y0 with
Residual

nth Taylor polynomial in two variables

152
Numerical Analysis and Application(数值分析与应用)
Lecture 5: Initial Value Problems for Ordinary Differential Equations(常微分方程的初值问题)

Runge-Kutta Methods
 Runge-Kutta Methods of Order Two
• Objective: determine values for a1, α1, and β1 such that a1 f (t + α1, y + β1) approximates

with error no greater than O(h2).


• Derivation: determine values for a1, α1, and β1 such that a1 f (t + α1, y + β1) approximates

Taylor expansion

153
Numerical Analysis and Application(数值分析与应用)
Lecture 5: Initial Value Problems for Ordinary Differential Equations(常微分方程的初值问题)

Runge-Kutta Methods
 Runge-Kutta Methods of Order Two

Second order residual

 Midpoint Method

154
Numerical Analysis and Application(数值分析与应用)
Lecture 5: Initial Value Problems for Ordinary Differential Equations(常微分方程的初值问题)

Runge-Kutta Methods
 Runge-Kutta Methods of Order Three
Heun’s method

• Problem: Applying the Heun’s method with N = 10, h = 0.2, ti = 0.2i, and w0 = 0.5 to
approximate the solution to the example

155
Numerical Analysis and Application(数值分析与应用)
Lecture 5: Initial Value Problems for Ordinary Differential Equations(常微分方程的初值问题)

Runge-Kutta Methods
 Runge-Kutta Methods of Order Four

156
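The classical order-four scheme can be sketched as follows (applied to the running example y′ = y − t² + 1, y(0) = 0.5; not the slides' code). With h = 0.2 the error at t = 2 is on the order of 10⁻⁴:

```python
import math

# Classical fourth-order Runge-Kutta method on [a, b] with N steps.
def rk4(f, a, b, alpha, N):
    h = (b - a) / N
    t, w = a, alpha
    for _ in range(N):
        k1 = h * f(t, w)
        k2 = h * f(t + h / 2, w + k1 / 2)
        k3 = h * f(t + h / 2, w + k2 / 2)
        k4 = h * f(t + h, w + k3)
        w += (k1 + 2 * k2 + 2 * k3 + k4) / 6
        t += h
    return w

w = rk4(lambda t, y: y - t ** 2 + 1, 0.0, 2.0, 0.5, 10)
exact = 9 - 0.5 * math.exp(2)  # y(2)
```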
Numerical Analysis and Application(数值分析与应用)
Lecture 5: Initial Value Problems for Ordinary Differential Equations(常微分方程的初值问题)

Runge-Kutta Methods
 Computational Comparisons
• The main computational effort in applying the RK methods is the evaluation of f .

• The methods of order less than five with smaller step size are used in preference to the
higher-order methods using a larger step size

157
Numerical Analysis and Application(数值分析与应用)
Lecture 5: Initial Value Problems for Ordinary Differential Equations(常微分方程的初值问题)

Runge-Kutta-Fehlberg Method

158
Numerical Analysis and Application(数值分析与应用)
Lecture 5: Initial Value Problems for Ordinary Differential Equations(常微分方程的初值问题)

Runge-Kutta-Fehlberg Method
 Why Adaptive Method
• Using varying step sizes for integral approximations produces efficient methods.
• An adaptive step-size procedure can estimate the truncation error without approximating
the higher derivatives of the function.
• For a initial value problem

nth-order Taylor method


(n+1)th-order Taylor method

The truncation error

159
Numerical Analysis and Application(数值分析与应用)
Lecture 5: Initial Value Problems for Ordinary Differential Equations(常微分方程的初值问题)

Runge-Kutta-Fehlberg Method
 Why Adaptive Method

O(hn)
O(hn+1)
O(hn)

Estimate the truncation error
160
Numerical Analysis and Application(数值分析与应用)
Lecture 5: Initial Value Problems for Ordinary Differential Equations(常微分方程的初值问题)

Runge-Kutta-Fehlberg Method
 Runge-Kutta-Fehlberg Method
• Use a Runge-Kutta method with local truncation error of order five

to estimate the local error in a Runge-Kutta method of order four

Six evaluations of f

161
Numerical Analysis and Application(数值分析与应用)
Lecture 5: Initial Value Problems for Ordinary Differential Equations(常微分方程的初值问题)

Runge-Kutta-Fehlberg Method
 Runge-Kutta-Fehlberg Method
• Procedure
(1) Compute the first values of wi+1 and w̃i+1 using the step size h
(2) Compute q for that step

(3) When q < 1: repeat the calculations using the step size qh
When q ≥ 1: accept the value computed at this step using the step size h, but change the
step size to qh for the (i + 1)st step

• Choose the value of q conservatively; for example, for n = 4

162
Numerical Analysis and Application(数值分析与应用)
Lecture 5: Initial Value Problems for Ordinary Differential Equations(常微分方程的初值问题)

Runge-Kutta-Fehlberg Method
 Runge-Kutta-Fehlberg Method
• Problem: Use the Runge-Kutta-Fehlberg method with a tolerance TOL = 10−5, a
maximum step size hmax = 0.25, and a minimum step size hmin = 0.01 to approximate the
solution to the initial-value problem

and compare the results with the exact solution y(t) = (t + 1)2 − 0.5et.
Solution: Determine w1 using h = 0.25

163
Numerical Analysis and Application(数值分析与应用)
Lecture 5: Initial Value Problems for Ordinary Differential Equations(常微分方程的初值问题)

Runge-Kutta-Fehlberg Method
 Runge-Kutta-Fehlberg Method

q < 1: we cannot accept the approximation 0.9204886 for y(0.25), but adjust the step
size to h = 0.9461033291(0.25) ≈ 0.2365258 and repeat.

164
Numerical Analysis and Application(数值分析与应用)
Lecture 5: Initial Value Problems for Ordinary Differential Equations(常微分方程的初值问题)

Multistep Methods

165
Numerical Analysis and Application(数值分析与应用)
Lecture 5: Initial Value Problems for Ordinary Differential Equations(常微分方程的初值问题)

Multistep Methods
 Introduction
• The above one-step methods: the approximation at the mesh point ti+1 involves
information from only one of the previous mesh points, ti.
• Multistep methods: use the approximations at more than one previous mesh point
to determine the approximation at the next point.
• Definition 5.14: An m-step multistep method for solving the initial-value problem

has a difference equation for finding the approximation wi+1 at the mesh point ti+1
represented by the following equation, where m is an integer greater than 1:

for i = m − 1, m, . . . , N − 1, where h = (b − a)/N, the a0, a1, . . . , am-1 and b0, b1, . . . , bm are
constants, and the starting values are specified

166
Numerical Analysis and Application(数值分析与应用)
Lecture 5: Initial Value Problems for Ordinary Differential Equations(常微分方程的初值问题)

Multistep Methods
 Introduction
• The method is called explicit (显式) when bm = 0, and implicit (隐式) for bm ≠ 0.
• Explicit fourth-order Adams-Bashforth technique

• Implicit fourth-order Adams-Moulton technique

• Definition 5.15: If y(t) is the solution to the initial-value problem

and

is the (i + 1)st step in a multistep method, the local truncation error at this step is

167
Numerical Analysis and Application(数值分析与应用)
Lecture 5: Initial Value Problems for Ordinary Differential Equations(常微分方程的初值问题)

Multistep Methods
 Adams-Bashforth Explicit Methods
• Two-step method

for i = 1, 2, . . . , N − 1
truncation error

• Three-step method

for i = 2, 3, . . . , N − 1

truncation error

168
Numerical Analysis and Application(数值分析与应用)
Lecture 5: Initial Value Problems for Ordinary Differential Equations(常微分方程的初值问题)

Multistep Methods
 Adams-Bashforth Explicit Methods
• Four-step method

for i = 3, 4, . . . , N − 1

truncation error

• Five-step method

for i = 4, 5, . . . , N − 1

truncation error

169
Numerical Analysis and Application(数值分析与应用)
Lecture 5: Initial Value Problems for Ordinary Differential Equations(常微分方程的初值问题)

Multistep Methods
 Adams-Moulton Implicit Methods
• Two-step method

for i = 1, 2, . . . , N − 1
truncation error

• Three-step method

for i = 2, 3, . . . , N − 1
truncation error

• Four-step method

for i = 3, 4, . . . , N − 1

truncation error
170


Numerical Analysis and Application(数值分析与应用)
Lecture 5: Initial Value Problems for Ordinary Differential Equations(常微分方程的初值问题)

Multistep Methods
 Example
• Problem: Consider the initial-value problem

Use the exact values given from y(t) = (t + 1)2 − 0.5et as starting values and h = 0.2 to
compare the approximations from (a) by the explicit Adams-Bashforth four-step method
and (b) the implicit Adams-Moulton three-step method.
Solution: Adams-Bashforth method
i = 3, 4, . . . , 9

Adams-Moulton method
i = 2, 3, . . . , 9

171
Numerical Analysis and Application(数值分析与应用)
Lecture 5: Initial Value Problems for Ordinary Differential Equations(常微分方程的初值问题)

Multistep Methods
 Example

172
Numerical Analysis and Application(数值分析与应用)
Lecture 5: Initial Value Problems for Ordinary Differential Equations(常微分方程的初值问题)

Multistep Methods
 Predictor-Corrector Methods (预估-修正法)
• The implicit Adams-Moulton method gave better results than the explicit Adams-
Bashforth method of the same order.
• Deficiency of the implicit method: first having to convert the method algebraically to
an explicit representation for wi+1.
• Predictor-Corrector method: an explicit method to predict and an implicit to improve
the prediction.
• Procedure
(1) Calculate an approximation, w4p, to y(t4) using the four-step explicit Adams-Bashforth
method as predictor

(2) Improving by inserting w4p in the right side of the three-step implicit Adams-Moulton
method and using that method as a corrector

173
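The two-stage predictor-corrector procedure above can be sketched as follows, with RK4 supplying the three starting values as in the slide's problem; the IVP is the running example y′ = y − t^2 + 1, y(0) = 0.5, and the code is a minimal illustration rather than the lecture's own algorithm listing.

```python
import math

def f(t, y):
    return y - t * t + 1.0

def rk4_step(f, t, w, h):
    """One classical fourth-order Runge-Kutta step (for the starting values)."""
    k1 = h * f(t, w)
    k2 = h * f(t + h / 2.0, w + k1 / 2.0)
    k3 = h * f(t + h / 2.0, w + k2 / 2.0)
    k4 = h * f(t + h, w + k3)
    return w + (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0

def adams_pc4(f, a, b, n, alpha):
    """Adams fourth-order predictor-corrector: AB4 predicts, AM3 corrects."""
    h = (b - a) / n
    t = [a + i * h for i in range(n + 1)]
    w = [alpha]
    for i in range(3):                  # RK4 starting values w1, w2, w3
        w.append(rk4_step(f, t[i], w[i], h))
    for i in range(3, n):
        # (1) predict with the explicit four-step Adams-Bashforth method
        p = w[i] + h / 24.0 * (55.0 * f(t[i], w[i]) - 59.0 * f(t[i-1], w[i-1])
                               + 37.0 * f(t[i-2], w[i-2]) - 9.0 * f(t[i-3], w[i-3]))
        # (2) correct with the implicit three-step Adams-Moulton method,
        #     inserting the predicted value on the right-hand side
        w.append(w[i] + h / 24.0 * (9.0 * f(t[i+1], p) + 19.0 * f(t[i], w[i])
                                    - 5.0 * f(t[i-1], w[i-1]) + f(t[i-2], w[i-2])))
    return t, w

t, w = adams_pc4(f, 0.0, 2.0, 10, 0.5)
```

Inserting the predicted value into the corrector avoids solving the implicit equation algebraically, which is exactly the deficiency of the implicit method noted above.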

Multistep Methods
 Predictor-Corrector Methods (预估-修正法)
• Problem: Apply the Adams fourth-order predictor-corrector method with h = 0.2 and
starting values from the Runge-Kutta fourth order method to the initial-value problem

Solution:

174

Summary
 Euler’s Method
 Higher-Order Taylor Methods
 Runge-Kutta Methods
• Midpoint method
• Runge-Kutta method of order two
• Runge-Kutta method of order four
 Runge-Kutta-Fehlberg Methods
• Estimate truncation error with minimal computational cost
• RK order five + RK order four
 Multistep Methods
• Use solutions on more mesh points
• Explicit Adams-Bashforth methods
• Implicit Adams-Moulton methods

175
Direct Methods for Solving Linear Systems
(线性方程组的直接解法)

Wei Zhang(张炜)

Faculty of Mechanical Engineering and Automation


Zhejiang Sci-Tech University

176
Numerical Analysis and Application(数值分析与应用)
Lecture 6: Direct Methods for Solving Linear Systems(线性方程组的直接解法)

Introduction
 Engineering Problem
• Kirchhoff’s laws of electrical circuits (基尔霍
夫电路定律)
• Linear system of equations for this problem
• General form of linear system of equations:
given the constants aij, for each i, j = 1, 2,. . . , n,
and bi, for each i = 1, 2, . . . , n, and we need to
determine the unknowns x1, . . . , xn.

• Direct methods: theoretically give the exact solution to the system in a finite number
of steps.

177

Outlines

 Fundamentals of Linear Systems of Equations(线性方程组基本知识)


 Pivoting Strategies(选主元消去法)
 Matrix Factorization(矩阵分解)
 Special Types of Matrices(特殊形式的矩阵)

178

Fundamentals of Linear Systems of Equations

179

Fundamentals of Linear Systems of Equations


 Basics Related to Linear Algebra
• Three basic operations
(1) Equation Ei can be multiplied by any nonzero constant λ with the resulting
equation used in place of Ei. This operation is denoted (λEi) → (Ei).
(2) Equation Ej can be multiplied by any constant λ and added to equation Ei with
the resulting equation used in place of Ei. This operation is denoted (Ei + λEj) → (Ei).
(3) Equations Ei and Ej can be transposed in order. This operation is denoted (Ei) ↔ (Ej).
• Problem: The four equations will be solved for x1, x2, x3 and x4.

180

Fundamentals of Linear Systems of Equations


 Basics Related to Linear Algebra
Solution: (1) First use equation E1 to eliminate the unknown x1 from equations E2, E3,
and E4 by performing (E2 − 2E1) → (E2), (E3 − 3E1) → (E3), and (E4 + E1) → (E4)

(2) E2 is used to eliminate the unknown x2 from E3 and E4 by performing


(E3 − 4E2) → (E3) and (E4 + 3E2) → (E4). This results in
Triangular form

181

Fundamentals of Linear Systems of Equations


 Basics Related to Linear Algebra
(3) E4 implies x4 = 1
(4) Backward-substitution

182

Fundamentals of Linear Systems of Equations


 Matrices and Vectors
• n-dimensional row vector
• n-dimensional column vector

• Augmented matrix(增广矩阵)

183

Fundamentals of Linear Systems of Equations


 General Procedure of Gaussian Elimination Method

Linear system Ax = b, with coefficient matrix A and right-hand-side vector b
Augmented matrix [A, b]

Forward elimination
Provided a11 ≠ 0, eliminate the coefficient of x1 in each row below the first.
Provided aii ≠ 0, eliminate xi in each row below the ith, for all values of i = 1, 2, . . . , n − 1.
184

Fundamentals of Linear Systems of Equations


 General Procedure of Gaussian Elimination Method

Resulting matrix: upper triangular, with the same solution set as the original system

Backward substitution
Solve the nth equation for xn: xn = a_n,n+1^(n) / a_nn^(n)
Solve the (n − 1)th equation for xn−1, and in general

xi = ( a_i,n+1^(i) − Σ_{j=i+1}^{n} a_ij^(i) xj ) / a_ii^(i)

185
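The forward-elimination and backward-substitution procedure above can be sketched in Python. The slide's matrices did not survive extraction, so the 4×4 system below (the one reduced with (E2 − 2E1), (E3 − 3E1), (E4 + E1), then (E3 − 4E2), (E4 + 3E2) earlier in the lecture, per the textbook) should be treated as illustrative.

```python
def gauss_solve(A, b):
    """Naive Gaussian elimination (no pivoting) + backward substitution.
    Assumes every pivot a_ii is nonzero, as in the slide's procedure."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]    # augmented matrix [A | b]
    for i in range(n - 1):                          # forward elimination
        for j in range(i + 1, n):
            m = M[j][i] / M[i][i]                   # multiplier m_ji
            for k in range(i, n + 1):
                M[j][k] -= m * M[i][k]              # (E_j - m_ji E_i) -> (E_j)
    x = [0.0] * n
    for i in range(n - 1, -1, -1):                  # backward substitution
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

A = [[1.0, 1.0, 0.0, 3.0],
     [2.0, 1.0, -1.0, 1.0],
     [3.0, -1.0, -1.0, 2.0],
     [-1.0, 2.0, 3.0, -1.0]]
b = [4.0, 1.0, -3.0, 4.0]
x = gauss_solve(A, b)
```

For this system the procedure yields x = (−1, 2, 0, 1), matching the hand elimination: the triangular form gives x4 = 1 and backward substitution supplies the rest.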

Fundamentals of Linear Systems of Equations


 Example
• Problem: Solve the linear system using Gaussian Elimination method.

Solution: The augmented matrix

Pivot element

186

Fundamentals of Linear Systems of Equations


 Example

187

Pivoting Strategies

188

Pivoting Strategies
 Simple Row Interchange
• For pivot elements a_kk^(k) = 0: row interchange (Ek) ↔ (Ep), where p is the smallest integer
greater than k with a_pk^(k) ≠ 0.
 Why Applying Pivoting (选主元) Strategies
• If a_kk^(k) is small in magnitude compared to a_jk^(k), then the magnitude of the multiplier
m_jk = a_jk^(k) / a_kk^(k)
satisfies |m_jk| ≫ 1.
• Any round-off error in the computed coefficients is then amplified:

Forward elimination: a_jl^(k+1) = a_jl^(k) − m_jk · a_kl^(k)

Backward substitution: xj is obtained by dividing by the small pivot, compounding the error.

189

Pivoting Strategies
 Example
• Problem: Apply Gaussian elimination to the system
0.003000 x1 + 59.14 x2 = 59.17
5.291 x1 − 6.130 x2 = 46.78
using four-digit arithmetic with rounding, and compare the results to the exact solution
x1 = 10.00 and x2 = 1.000.
Solution: The first pivot element is a_11^(1) = 0.003000

Exact

190

Pivoting Strategies
 Example

Exact result: x1 = 10.00, x2 = 1.000

191

Pivoting Strategies
 Partial Pivoting (列主元法)
(𝒌)
• Pivoting is performed by selecting an element 𝒂𝒑𝒌 with a larger magnitude as the
pivot, and interchanging the kth and pth rows.
• Simple partial pivoting: select an element in the same column that is below the
diagonal and has the largest absolute value, i.e., the smallest p ≥ k with
|a_pk^(k)| = max_{k≤i≤n} |a_ik^(k)|

and perform (Ek) ↔ (Ep). No interchange of columns.


• Problem: Apply Gaussian elimination to the system

using partial pivoting and four-digit arithmetic with rounding, and compare the results to
the exact solution x1 = 10.00 and x2 = 1.000.

192

Pivoting Strategies
 Partial Pivoting (列主元法)
Solution: Find the pivot

Exact solution

193
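The partial-pivoting variant can be sketched as follows; the coefficients are those of the two-equation example above (exact solution x1 = 10, x2 = 1), and in full double precision both orderings succeed, so the code only illustrates where the row interchange happens.

```python
def gauss_partial_pivot(A, b):
    """Gaussian elimination with simple partial pivoting: at step k, swap in
    the row at or below the diagonal with the largest |a_pk|."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for k in range(n - 1):
        p = max(range(k, n), key=lambda i: abs(M[i][k]))   # pivot row
        if p != k:
            M[k], M[p] = M[p], M[k]                        # (E_k) <-> (E_p)
        for j in range(k + 1, n):
            m = M[j][k] / M[k][k]
            for l in range(k, n + 1):
                M[j][l] -= m * M[k][l]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

# The slide's two-equation example: pivoting swaps in the 5.291 row first.
x = gauss_partial_pivot([[0.003, 59.14], [5.291, -6.130]], [59.17, 46.78])
```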

Pivoting Strategies
 Partial Pivoting (列主元法)
• Counter-Example: Apply Gaussian elimination with partial pivoting to the system
30.00 x1 + 591400 x2 = 591700
5.291 x1 − 6.130 x2 = 46.78
which is the same as the above problem except that all the entries in the first equation
have been multiplied by 10^4.
Solution: partial pivoting requires no row interchange (since |30.00| > |5.291|), leading to

Wrong results

194

Pivoting Strategies
 Scaled Partial Pivoting (比例消元法)
• The element in the pivot position that is largest relative to the entries in its row.
• Procedure: (1) define a scale factor si for each row as
si = max_{1≤j≤n} |aij|
(2) The pivot is determined by choosing the least integer p with
|a_p1| / s_p = max_{1≤k≤n} |a_k1| / s_k
and performing (E1) ↔ (Ep).

(3) To eliminate the variable xi, we select the smallest integer p ≥ i with
|a_pi^(i)| / s_p = max_{i≤k≤n} |a_ki^(i)| / s_k
and perform the row interchange (Ei) ↔ (Ep) if i ≠ p.

195

Pivoting Strategies
 Scaled Partial Pivoting (比例消元法)
• Problem: Apply Gaussian elimination to the system

Solution: no row interchange is needed

Exact solution

196
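Scaled partial pivoting can be sketched as a small variant of the previous routine. The system below is the counter-example's (first equation scaled by 10^4): the scale factors make the method pick the 5.291 row as pivot, which plain partial pivoting would not.

```python
def gauss_scaled_pivot(A, b):
    """Gaussian elimination with scaled partial pivoting: the pivot row
    maximizes |a_pi| / s_p, where s_p is the row's largest |entry|."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    s = [max(abs(a) for a in row) for row in A]     # scale factor s_i per row
    for i in range(n - 1):
        p = max(range(i, n), key=lambda k: abs(M[k][i]) / s[k])
        if p != i:
            M[i], M[p] = M[p], M[i]                 # (E_i) <-> (E_p)
            s[i], s[p] = s[p], s[i]                 # keep scale factors with rows
        for j in range(i + 1, n):
            m = M[j][i] / M[i][i]
            for l in range(i, n + 1):
                M[j][l] -= m * M[i][l]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

x = gauss_scaled_pivot([[30.0, 591400.0], [5.291, -6.130]], [591700.0, 46.78])
```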

Pivoting Strategies
 Complete Pivoting (全主元消去法)
• Complete pivoting at the kth step searches all the entries aij, for i = k, k + 1, . . . , n and j =
k, k+1, . . . , n, to find the entry with the largest magnitude.
• Both row and column interchanges are performed.
• Massive computational cost for comparisons.
• This strategy is recommended only for systems where accuracy is essential.

197

Matrix Factorization

198

Matrix Factorization
 LU Factorization
• The steps used to solve a system of the form Ax = b can be used to factor a matrix.
• The factorization is particularly useful when it has the form A = LU, where L is lower
triangular and U is upper triangular.
• Theorem 6.19: If Gaussian elimination can be performed on the linear system Ax = b
without row interchanges, then the matrix A can be factored into the product of a
lower-triangular matrix L and an upper-triangular matrix U, that is, A = LU, where
m_ji = a_ji^(i) / a_ii^(i)

• Doolittle’s method: the 1s be on the diagonal of L.


• Crout’s method: the 1s be on the diagonal of U.
• Cholesky’s method: lii = uii.
199

Matrix Factorization
 LU Factorization
• Suppose A = LU, then solve for x by using a two-step process:
(1) First let y = Ux and solve the lower triangular system Ly = b for y.
(2) Once y is known, solve the upper triangular system Ux = y to determine the solution x.
• Problem: Determine the LU factorization for matrix A in the linear system Ax = b, where

Then use the factorization to solve the system

200

Matrix Factorization
 LU Factorization
Solution:

201

Matrix Factorization
 LU Factorization
Introduce the substitution y = Ux. Then b = L(Ux) = Ly

Forward substitution

then solve Ux = y for x, the solution of the original system

Backward substitution

202
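The two-step LU solve above can be sketched in Python with Doolittle's convention (1s on the diagonal of L). As before, the slide's matrix did not survive extraction, so the 4×4 system reused from the Gaussian-elimination example is illustrative.

```python
def lu_doolittle(A):
    """Doolittle factorization (1s on the diagonal of L); assumes Gaussian
    elimination succeeds on A without row interchanges."""
    n = len(A)
    L = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    U = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):                     # row i of U
            U[i][j] = A[i][j] - sum(L[i][k] * U[k][j] for k in range(i))
        for j in range(i + 1, n):                 # column i of L: multipliers m_ji
            L[j][i] = (A[j][i] - sum(L[j][k] * U[k][i] for k in range(i))) / U[i][i]
    return L, U

def lu_solve(L, U, b):
    n = len(b)
    y = [0.0] * n
    for i in range(n):                            # forward substitution: L y = b
        y[i] = b[i] - sum(L[i][j] * y[j] for j in range(i))
    x = [0.0] * n
    for i in range(n - 1, -1, -1):                # backward substitution: U x = y
        x[i] = (y[i] - sum(U[i][j] * x[j] for j in range(i + 1, n))) / U[i][i]
    return x

A = [[1.0, 1.0, 0.0, 3.0],
     [2.0, 1.0, -1.0, 1.0],
     [3.0, -1.0, -1.0, 2.0],
     [-1.0, 2.0, 3.0, -1.0]]
L, U = lu_doolittle(A)
x = lu_solve(L, U, [4.0, 1.0, -3.0, 4.0])
```

Once L and U are computed, any new right-hand side b costs only the two triangular solves, which is the main payoff of the factorization.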

Special Types of Matrices

203

Special Types of Matrices


 Diagonally Dominant Matrices (对角占优矩阵)
• Definition 6.20: The n  n matrix A is said to be diagonally dominant when
|aii| ≥ Σ_{j≠i} |aij| holds for each i = 1, 2, . . . , n.
When the inequality is strict for every i, A is a strictly diagonally dominant matrix.

• Theorem 6.21: A strictly diagonally dominant matrix A is nonsingular (非奇异).


Gaussian elimination can be performed on any linear system of the form Ax = b to obtain
its unique solution without row or column interchanges, and the computations will be
stable with respect to the growth of round-off errors.

204
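The definition above translates directly into a one-line check; the first example matrix below is strictly diagonally dominant, the second is not (both are illustrative, not from the slide).

```python
def is_strictly_diagonally_dominant(A):
    """True when |a_ii| exceeds the sum of the other absolute entries in row i,
    for every row i (Definition 6.20, strict form)."""
    return all(abs(row[i]) > sum(abs(a) for j, a in enumerate(row) if j != i)
               for i, row in enumerate(A))

A1 = [[7.0, 2.0, 0.0], [3.0, 5.0, -1.0], [0.0, 5.0, -6.0]]   # dominant
A2 = [[6.0, 4.0, -3.0], [4.0, -2.0, 0.0], [-3.0, 0.0, 1.0]]  # row 2 fails
```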

Special Types of Matrices


 Positive Definite Matrices (正定矩阵)
• Definition 6.22: A matrix A is positive definite if it is symmetric and if xtAx > 0 for
every n-dimensional vector x ≠ 0.

• Theorem 6.23: If A is an n  n positive definite matrix, then: (1) A has an inverse; (2) aii >
0, for each i = 1, 2, . . . , n; (3) max_{1≤k,j≤n} |akj| ≤ max_{1≤i≤n} aii; (4) (aij)^2 < aii ajj, for each
i ≠ j.

Purpose: eliminate certain matrices from the positive definite ones.
205

Special Types of Matrices


 Positive Definite Matrices (正定矩阵)
• Theorem 6.25: A symmetric matrix A is positive definite if and only if each of its leading
principal submatrices has a positive determinant (行列式).

• Definition 6.24: A leading principal submatrix of a matrix A is the upper-left k  k
block of A, for some 1 ≤ k ≤ n.
Purpose: eliminate certain matrices from the positive definite ones.

• Theorem 6.26: The symmetric matrix A is positive definite if and only if Gaussian
elimination without row interchanges can be performed on the linear system Ax = b with
all pivot elements positive. Moreover, in this case, the computations are stable with
respect to the growth of round-off errors.

206

Special Types of Matrices


 Tridiagonal Matrices (三对角矩阵)
• Definition 6.30: An n  n matrix is called a band matrix if integers p and q, with 1 < p, q
< n, exist with the property that aij = 0 whenever p ≤ j − i or q ≤ i − j. The band width of a
band matrix is defined as w = p + q − 1.
• p: the number of non-zero diagonals above and including, the main diagonal.
• q: the number of non-zero diagonals below and including, the main diagonal.

p=q=2
bandwidth 2 + 2 − 1 = 3

• Tridiagonal matrix: p = q = 2 and bandwidth 3

207

Special Types of Matrices


 Tridiagonal Matrices (三对角矩阵)
• The factorization is simplified because a large number of zeros appear in these matrices.
• The matrix A has at most (3n − 2) nonzero entries, which are used to determine the
entries of L and U: a total of (3n − 2) undetermined entries in the factorization.

• The entries in A can be overwritten by the entries in L and U, with the result that no
new storage is required.
208

Special Types of Matrices


 Tridiagonal Matrices (三对角矩阵)
• Problem: Determine the Crout factorization of the symmetric tridiagonal matrix

and use this factorization to solve the linear system

Solution:

209

Special Types of Matrices


 Tridiagonal Matrices (三对角矩阵)

Crout factorization

210

Special Types of Matrices


 Tridiagonal Matrices (三对角矩阵)

211
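The Crout-style factorization for a tridiagonal system runs in O(n) and can be sketched as follows. The slide's matrix did not survive extraction; the symmetric tridiagonal system below (2 on the diagonal, −1 off it, solution (1, 1, 1, 1)) is the textbook version of the example and should be treated as illustrative.

```python
def thomas_solve(a, d, c, b):
    """Solve a tridiagonal system via a Crout-style factorization in O(n).
    a: sub-diagonal (length n-1), d: diagonal (length n),
    c: super-diagonal (length n-1), b: right-hand side."""
    n = len(d)
    l = [0.0] * n            # diagonal of L
    u = [0.0] * (n - 1)      # super-diagonal of U (U has 1s on its diagonal)
    z = [0.0] * n            # forward substitution: solve L z = b
    l[0] = d[0]
    u[0] = c[0] / l[0]
    z[0] = b[0] / l[0]
    for i in range(1, n):
        l[i] = d[i] - a[i - 1] * u[i - 1]
        if i < n - 1:
            u[i] = c[i] / l[i]
        z[i] = (b[i] - a[i - 1] * z[i - 1]) / l[i]
    x = [0.0] * n            # backward substitution: solve U x = z
    x[n - 1] = z[n - 1]
    for i in range(n - 2, -1, -1):
        x[i] = z[i] - u[i] * x[i + 1]
    return x

x = thomas_solve([-1.0, -1.0, -1.0], [2.0, 2.0, 2.0, 2.0], [-1.0, -1.0, -1.0],
                 [1.0, 0.0, 0.0, 1.0])
```

Only the three diagonals are stored, matching the slide's point that the (3n − 2) entries of A can simply be overwritten by those of L and U.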

Summary
 Basics of Vectors and Matrices
 Gauss Elimination Methods
• Forward elimination + backward substitution
• Pivoting: partial pivoting, scaled partial pivoting, complete pivoting
 Matrix Factorization
• LU factorization: Crout’s method, Doolittle’s method, Cholesky’s method
 Special Types of Matrices
• Diagonally dominant matrices, positive definite matrices, tridiagonal matrices

212
Iterative Methods for Solving Linear Systems
(线性方程组的迭代解法)

Wei Zhang(张炜)

Faculty of Mechanical Engineering and Automation


Zhejiang Sci-Tech University

213
Numerical Analysis and Application(数值分析与应用)
Lecture 7: Iterative Methods for Solving Linear Systems(线性方程组的迭代解法)

Introduction
 Engineering Problem
• Trusses (桁架): lightweight structures
capable of carrying heavy loads
• The truss is in static equilibrium.
• Two endpoints (1, 4), four pin joints (1, 2, 3,
4), eight forces.

214

Outlines

 Norms of Vectors and Matrices(向量和矩阵的范数)


 Eigenvalues and Eigenvectors(特征值和特征向量)
 Jacobi Method(雅克比方法)
 Gauss-Seidel Method(高斯-赛德尔方法)
 Relaxation Techniques(松弛方法)
 Error Bound and Iterative Refinement(误差界和迭代优化)

215

Norms of Vectors and Matrices

216

Norms of Vectors and Matrices


 Vector Norms and Distances
• ℝn denotes the set of all n-dimensional column vectors with real-number components.
• Definition 7.1: A vector norm on ℝn is a function, ∙ , from ℝn into ℝ with the following
properties:
(1) 𝐱 ≥ 0 for all x ∊ ℝn
(2) 𝐱 = 0 if and only if x = 0
(3) 𝛼𝐱 = 𝛼 𝐱 for all α ∊ ℝ and x ∊ ℝn
(4) 𝐱 + 𝐲 ≤ 𝐱 + 𝐲 for all x, y ∊ ℝn

• Definition 7.2: The l2 and l∞ norms for the vector x = (x1, x2, . . . , xn)t are defined by
‖x‖2 = ( Σ_{i=1}^{n} xi^2 )^{1/2} and ‖x‖∞ = max_{1≤i≤n} |xi|
217
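The l2 and l∞ vector norms, and the l∞ matrix norm (the maximum absolute row sum), can be sketched in a few lines:

```python
import math

def l2_norm(x):
    """||x||_2 = sqrt(sum of squares)."""
    return math.sqrt(sum(xi * xi for xi in x))

def linf_norm(x):
    """||x||_inf = largest absolute component."""
    return max(abs(xi) for xi in x)

def linf_matrix_norm(A):
    """||A||_inf = maximum absolute row sum."""
    return max(sum(abs(a) for a in row) for row in A)

x = [-1.0, 1.0, -2.0]
```

The distance between two vectors (Definition 7.4) is then just the norm of their difference, e.g. `linf_norm([xi - yi for xi, yi in zip(x, y)])`.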

Norms of Vectors and Matrices


 Vector Norms and Distances

• The distance between two vectors is defined as the norm of the difference of the vectors.
• Definition 7.4: If x = (x1, x2, . . . , xn)t and y = (y1, y2, . . . , yn)t are vectors in ℝn, the l2 and l∞
distances between x and y are defined by
‖x − y‖2 = ( Σ_{i=1}^{n} (xi − yi)^2 )^{1/2} and ‖x − y‖∞ = max_{1≤i≤n} |xi − yi|
218

Norms of Vectors and Matrices


 Matrix Norms and Distances
• Definition 7.8: A matrix norm on the set of all n  n matrices is a real-valued function,
∙ , defined on this set, satisfying for all n  n matrices A and B and all real numbers α:
(1) 𝐴 ≥ 0
(2) 𝐴 = 0 if and only if A is O, the matrix with all 0 entries
(3) 𝛼𝐴 = 𝛼 𝐴
(4) 𝐴 + 𝐵 ≤ 𝐴 + 𝐵
(5) 𝐴𝐵 ≤ 𝐴 𝐵
• The l2 and l∞ matrix norms:
‖A‖∞ = max_{1≤i≤n} Σ_{j=1}^{n} |aij| (maximum absolute row sum) and ‖A‖2 = [ρ(AtA)]^{1/2}
• The distance between n  n matrices A and B with respect to this matrix norm is
𝐴−𝐵

219

Eigenvalues and Eigenvectors

220

Eigenvalues and Eigenvectors


 Eigenvalue and Eigenvector
• Definition 7.12: If A is a square matrix, the characteristic polynomial of A is defined by
p(λ) = det(A − λI)
an nth-degree polynomial with at most n distinct zeros

• Definition 7.13: If p is the characteristic polynomial of the matrix A, the zeros of p are
eigenvalues (characteristic values) of the matrix A. If λ is an eigenvalue of A and x ≠ 0
satisfies (A − λI)x = 0, then x is an eigenvector (characteristic vector) of A corresponding
to the eigenvalue λ.

221

Eigenvalues and Eigenvectors


 Spectral Radius (谱半径)
• Definition 7.14: The spectral radius ρ(A) of a matrix A is defined by
ρ(A) = max |λ|, where λ ranges over the eigenvalues of A.
For complex λ = α + βi, we define |λ| = (α^2 + β^2)^{1/2}.


 Convergent Matrix (收敛矩阵)
• Definition 7.16: an n  n matrix A is convergent if
lim_{k→∞} (A^k)_{ij} = 0, for each i, j = 1, 2, . . . , n
222

Jacobi Method

223

Jacobi Method
 Iterative Methods
• The iterative methods are efficient in terms of both computer storage and computation
for large systems with a high percentage of 0 entries.
• An iterative technique to solve the n  n linear system Ax = b starts with an initial
approximation x(0) to the solution x and generates a sequence of vectors {x(k)}_{k=0}^∞ that
converges to x.
 Jacobi Method
• By solving the ith equation in Ax = b for xi, we obtain (provided aii ≠ 0)
xi = Σ_{j≠i} ( −aij xj / aii ) + bi / aii, for i = 1, 2, . . . , n    (final solution)

• For each k ≥ 1, generate the components xi(k) of x(k) from the components of x(k−1) by
xi(k) = (1/aii) [ Σ_{j≠i} ( −aij xj(k−1) ) + bi ], for i = 1, 2, . . . , n    (iteration)
224
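The Jacobi iteration above can be sketched in Python; the 4×4 system is the one from the example (unique solution (1, 2, −1, 1)t), with the slide's relative stopping test.

```python
def jacobi(A, b, x0, tol=1e-3, max_iter=100):
    """Jacobi iteration: every component of x^(k) is built from x^(k-1) only."""
    n = len(A)
    x = x0[:]
    for _ in range(max_iter):
        x_new = [(b[i] - sum(A[i][j] * x[j] for j in range(n) if j != i)) / A[i][i]
                 for i in range(n)]
        # stopping test: ||x^(k) - x^(k-1)||_inf / ||x^(k)||_inf < tol
        if max(abs(xn - xo) for xn, xo in zip(x_new, x)) < tol * max(abs(v) for v in x_new):
            return x_new
        x = x_new
    return x

A = [[10.0, -1.0, 2.0, 0.0],
     [-1.0, 11.0, -1.0, 3.0],
     [2.0, -1.0, 10.0, -1.0],
     [0.0, 3.0, -1.0, 8.0]]
b = [6.0, 25.0, -11.0, 15.0]
x = jacobi(A, b, [0.0, 0.0, 0.0, 0.0])
```

Note that A is strictly diagonally dominant, which guarantees the iteration converges for any starting vector.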

Jacobi Method
 Jacobi Method
• Problem: The linear system Ax = b given by
10x1 − x2 + 2x3 = 6
−x1 + 11x2 − x3 + 3x4 = 25
2x1 − x2 + 10x3 − x4 = −11
3x2 − x3 + 8x4 = 15
has the unique solution x = (1, 2, −1, 1)t. Use Jacobi method to find approximations x(k) to
x starting with x(0) = (0, 0, 0, 0)t until
‖x(k) − x(k−1)‖∞ / ‖x(k)‖∞ < 10^−3
Solution: first solve equation Ei for xi

225

Jacobi Method
 Jacobi Method
From the initial approximation x(0) = (0, 0, 0, 0)t we have x(1) given by

226

Jacobi Method
 Jacobi Method: Factorization Form
• Iterative methods: convert the system Ax = b into an equivalent system of the form
x = Tx + c for some fixed matrix T and vector c.
• The sequence of approximate solution vectors is generated by computing
x(k) = T x(k−1) + c, for each k = 1, 2, 3, . . .
• Jacobi method: split A into its diagonal and off-diagonal parts, A = D − L − U, where D
is the diagonal of A, −L its strictly lower-triangular part, and −U its strictly upper-triangular part. Then
x(k) = D^−1 (L + U) x(k−1) + D^−1 b, i.e., Tj = D^−1 (L + U) and cj = D^−1 b
227

Jacobi Method
 Jacobi Method: Factorization Form

228

Gauss-Seidel Method

229

Gauss-Seidel Method
 Gauss-Seidel Method
• In Jacobi method: components of x(k−1) are used to compute all the components xi(k) of x(k).
• For i > 1, the components x1(k), . . . , x_{i−1}(k) of x(k) have already been computed and are
expected to be better approximations to the actual solutions than x1(k−1), . . . , x_{i−1}(k−1).
• Gauss-Seidel method: compute xi(k) using the most recently calculated values:
xi(k) = (1/aii) [ bi − Σ_{j=1}^{i−1} aij xj(k) − Σ_{j=i+1}^{n} aij xj(k−1) ], for i = 1, 2, . . . , n
• Problem: Use the Gauss-Seidel iterative technique to find approximate solutions to
the same linear system Ax = b considered for the Jacobi method,
starting with x(0) = (0, 0, 0, 0)t and iterating until
‖x(k) − x(k−1)‖∞ / ‖x(k)‖∞ < 10^−3
230

Gauss-Seidel Method
 Gauss-Seidel Method
Solution: we write the system, for each k = 1, 2, . . . as

231
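The Gauss-Seidel sweep can be sketched by modifying the Jacobi loop so each component immediately uses the values already updated in the same sweep; the system is again the running 4×4 example.

```python
def gauss_seidel(A, b, x0, tol=1e-3, max_iter=100):
    """Gauss-Seidel: each sweep uses the freshly updated x_1 .. x_{i-1}."""
    n = len(A)
    x = x0[:]
    for _ in range(max_iter):
        old = x[:]
        for i in range(n):
            s1 = sum(A[i][j] * x[j] for j in range(i))           # updated values
            s2 = sum(A[i][j] * old[j] for j in range(i + 1, n))  # previous sweep
            x[i] = (b[i] - s1 - s2) / A[i][i]
        if max(abs(x[i] - old[i]) for i in range(n)) < tol * max(abs(v) for v in x):
            return x
    return x

A = [[10.0, -1.0, 2.0, 0.0],
     [-1.0, 11.0, -1.0, 3.0],
     [2.0, -1.0, 10.0, -1.0],
     [0.0, 3.0, -1.0, 8.0]]
b = [6.0, 25.0, -11.0, 15.0]
x = gauss_seidel(A, b, [0.0, 0.0, 0.0, 0.0])
```

Using the most recent values typically halves the number of sweeps compared with Jacobi on systems like this one.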

Gauss-Seidel Method
 Gauss-Seidel Method: Factorization Form
With A = D − L − U as for the Jacobi method, the Gauss-Seidel iteration can be written as
(D − L) x(k) = U x(k−1) + b, i.e., x(k) = (D − L)^−1 U x(k−1) + (D − L)^−1 b
so that Tg = (D − L)^−1 U and cg = (D − L)^−1 b.

232

Relaxation Techniques

233

Relaxation Techniques
 Why Relaxation Methods
• The rate of convergence of an iterative technique depends on the spectral radius of the
matrix associated with the method.
• Convergence acceleration: choose a method whose associated matrix has minimal
spectral radius.
• Definition 7.23: Suppose x̃ ∊ ℝn is an approximation to the solution of the linear system
defined by Ax = b. The residual vector for x̃ with respect to this system is r = b − Ax̃.
 Gauss-Seidel method presented by the residual
Denote the approximate solution vector xi(k) defined by
xi(k) = (x1(k), . . . , x_{i−1}(k), xi(k−1), . . . , xn(k−1))t
with residual vector ri(k) = b − A xi(k)
234

Relaxation Techniques
 Gauss-Seidel method presented by the residual
The mth component of ri(k) is
r_mi(k) = bm − Σ_{j=1}^{i−1} a_mj xj(k) − Σ_{j=i}^{n} a_mj xj(k−1)
and, in particular, the ith component is
r_ii(k) = bi − Σ_{j=1}^{i−1} a_ij xj(k) − Σ_{j=i+1}^{n} a_ij xj(k−1) − a_ii xi(k−1)
so the G-S iteration can be written as
xi(k) = xi(k−1) + r_ii(k) / a_ii
235

Relaxation Techniques
 Gauss-Seidel method presented by the residual
By modifying the Gauss-Seidel procedure to include a relaxation parameter ω:
xi(k) = xi(k−1) + ω r_ii(k) / a_ii    (relaxation method)
For certain choices of positive ω, we can reduce the norm of the residual vector and
obtain significantly faster convergence.
• Under-relaxation methods (欠松弛): 0 < ω < 1
• Over-relaxation methods (超松弛) : 1 < ω
• Successive over-relaxation (SOR) methods (逐次超松弛) : solving the system with G-S
method

236
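The SOR sweep blends the Gauss-Seidel update with the previous iterate using the weight ω, and ω = 1 recovers plain Gauss-Seidel; the 3×3 system below is the one from the example (solution (3, 4, −5)t).

```python
def sor(A, b, x0, omega, tol=1e-7, max_iter=200):
    """Successive over-relaxation: x_i^(k) = (1-w) x_i^(k-1) + (w/a_ii)(...).
    Returns the approximation and the number of sweeps used."""
    n = len(A)
    x = x0[:]
    for k in range(1, max_iter + 1):
        old = x[:]
        for i in range(n):
            s = (b[i] - sum(A[i][j] * x[j] for j in range(i))
                      - sum(A[i][j] * old[j] for j in range(i + 1, n)))
            x[i] = (1.0 - omega) * old[i] + omega * s / A[i][i]
        if max(abs(x[i] - old[i]) for i in range(n)) < tol:
            return x, k
    return x, max_iter

A = [[4.0, 3.0, 0.0], [3.0, 4.0, -1.0], [0.0, -1.0, 4.0]]
b = [24.0, 30.0, -24.0]
x_sor, k_sor = sor(A, b, [1.0, 1.0, 1.0], 1.25)
x_gs, k_gs = sor(A, b, [1.0, 1.0, 1.0], 1.0)   # omega = 1 is Gauss-Seidel
```

Running both confirms the slide's comparison: SOR with ω = 1.25 reaches the same accuracy in far fewer sweeps than Gauss-Seidel.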

Relaxation Techniques
 Example
• Problem: The linear system Ax = b given by
4x1 + 3x2 = 24
3x1 + 4x2 − x3 = 30
−x2 + 4x3 = −24
has the solution (3, 4, −5)t. Compare the iterations from the Gauss-Seidel method and the
SOR method with ω = 1.25 using x(0) = (1, 1, 1)t for both methods.
Solution: For each k = 1, 2, . . . , the equations for the Gauss-Seidel method are

and the equations for the SOR method with ω = 1.25 are

237

Relaxation Techniques
 Example

For the iterates to be accurate to seven decimal places, the Gauss-Seidel method requires
34 iterations, as opposed to 14 iterations for the SOR method with ω = 1.25.

238

Relaxation Techniques
 Convergence of SOR Method
• Theorem 7.24: If aii ≠ 0, for each i = 1, 2, . . . , n, then ρ(Tω) ≥ |ω−1|. This implies that the
SOR method can converge only if 0 < ω < 2.
• Theorem 7.25: If A is a positive definite matrix and 0 < ω < 2, then the SOR method
converges for any choice of initial approximate vector x(0).

• Theorem 7.26: If A is positive definite and tridiagonal, then ρ(Tg) = [ρ(Tj)]^2 < 1, and the
optimal choice of ω for the SOR method is
ω = 2 / ( 1 + √(1 − [ρ(Tj)]^2) )
With this choice of ω, we have ρ(Tω) = ω − 1.

239

Relaxation Techniques
 SOR Method: Factorization Form
With A = D − L − U, the SOR iteration can be written in matrix form as
x(k) = (D − ωL)^−1 [ (1 − ω)D + ωU ] x(k−1) + ω (D − ωL)^−1 b
so that Tω = (D − ωL)^−1 [ (1 − ω)D + ωU ] and cω = ω (D − ωL)^−1 b.
240

Error Bound and Iterative Refinement

241

Error Bound and Iterative Refinement


 Example
• Problem: The linear system Ax = b given by
x1 + 2x2 = 3
1.0001x1 + 2x2 = 3.0001
has the unique solution x = (1, 1)t. Determine the residual vector for the poor approximation
x̃ = (3, −0.0001)t.
Solution:
r = b − Ax̃ = (0.0002, 0)t

The residual vector is small: ‖r‖∞ = 0.0002

yet the error in the solution vector is large: ‖x − x̃‖∞ = 2

242

Error Bound and Iterative Refinement


 Condition Number
• Theorem 7.27: Suppose that x̃ is an approximation to the solution of Ax = b, A is a
nonsingular matrix, and r is the residual vector for x̃. Then for any natural norm,
‖x − x̃‖ ≤ ‖r‖ ∙ ‖A^−1‖
and if x ≠ 0 and b ≠ 0,
‖x − x̃‖ / ‖x‖ ≤ K(A) ∙ ‖r‖ / ‖b‖
This gives the connection between the residual vector and the accuracy of the approximation.

• Theorem 7.28: The condition number (条件数) of the nonsingular matrix A relative to
a norm ∙ is
K(A) = ‖A‖ ∙ ‖A^−1‖
• A matrix A is well-conditioned (良态) if K(A) is close to 1, and is ill-conditioned (病态)
when K(A) is significantly greater than 1.
243
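The condition number K∞(A) = ‖A‖∞ ‖A⁻¹‖∞ can be computed directly for the 2×2 example above (treating its coefficients, which did not survive extraction, as the textbook's classic illustration); the large value explains why a tiny residual coexists with a large solution error.

```python
def inv2(A):
    """Inverse of a 2x2 matrix (assumes a nonzero determinant)."""
    (a, b), (c, d) = A
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def linf(A):
    """||A||_inf: maximum absolute row sum."""
    return max(sum(abs(v) for v in row) for row in A)

A = [[1.0, 2.0], [1.0001, 2.0]]
K = linf(A) * linf(inv2(A))        # K_inf(A) = ||A||_inf * ||A^-1||_inf
```

Here K works out to about 60002, so relative errors in the data can be amplified by a factor of roughly 6 × 10^4: the matrix is badly ill-conditioned.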

Error Bound and Iterative Refinement


 Iterative Refinement
• Iterative refinement: Solve the system Ay = r for the approximate error y ≈ x − x̃;
then x̃ + y is a more accurate approximation to the solution of the linear system Ax = b
than the original approximation x̃.
• Example: the approximation to the linear system

using five-digit arithmetic and Gaussian elimination, to be

and the solution to Ay = r(1) to be

244

Error Bound and Iterative Refinement


 Iterative Refinement
compute r(2) = b − A𝐱(2) and solve the system Ay(2) = r(2), which gives

245
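One refinement step can be sketched as follows, using the ill-conditioned 2×2 system from the error-bound example and its poor approximation x̃ = (3, −0.0001)t; the inner solver is an assumed helper, not the lecture's algorithm listing.

```python
def solve_pp(A, b):
    """Helper: Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for k in range(n - 1):
        p = max(range(k, n), key=lambda i: abs(M[i][k]))
        M[k], M[p] = M[p], M[k]
        for j in range(k + 1, n):
            m = M[j][k] / M[k][k]
            for l in range(k, n + 1):
                M[j][l] -= m * M[k][l]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def refine(A, b, x_approx):
    """One iterative-refinement step: r = b - A x~, solve A y = r, return x~ + y."""
    n = len(A)
    r = [b[i] - sum(A[i][j] * x_approx[j] for j in range(n)) for i in range(n)]
    y = solve_pp(A, r)
    return [xi + yi for xi, yi in zip(x_approx, y)]

A = [[1.0, 2.0], [1.0001, 2.0]]
b = [3.0, 3.0001]
better = refine(A, b, [3.0, -0.0001])   # poor approximation from the example
```

In full double precision a single step already recovers the exact solution (1, 1) to many digits; in limited-precision arithmetic the step is repeated, as in the slide's r(2), y(2) computation.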

Summary
 Norms of Vectors and Matrices
 Jacobi Method
• Simple
• Always use the old solutions to compute the new ones
 Gauss-Seidel Method
• Use the updated solutions to compute the new solutions
 Relaxation Techniques
• Over-relaxation, under-relaxation, SOR
• 0<ω<2
 Iterative Refinement
• Spectral radius, condition number
• Solve the system for error and residual

246
