
ENGR-516 Spring 2011

Lecture #2 January 24, 2011

Adjunct Prof. Michael A. Soderstrand


soderstrand@ieee.org
405-334-8329

1/24/2011 ENGR-516 Prof. Soderstrand Spring 2011 1


Homework #1 -- Due Monday January 24, 2011

Do the following problems from Chapters 1 and 2:

Problem 1.3, p. 21 of the text

Rather than the linear relationship of Eq. (1.7), you might choose to model
the upward force on the parachutist as a second-order relationship:

F_u = −c′v²

where c′ = a second-order drag coefficient (kg/m).


a) Using calculus, obtain the closed-form solution for the case where the
jumper is initially at rest (v=0 at t=0).
b) Repeat the numerical calculation in Example 1.2 with the same initial
condition and parameter values. Use a value of 0.225 kg/m for c'.



Homework #1 -- Due Monday January 24, 2011

Problem 1.19, p. 24 of the text

The velocity is equal to the rate of change of distance x(m),

dx/dt = v(t) [Eq P1.19]

a) Substitute Eq. (1.10) and develop an analytical solution for the
distance as a function of time. Assume that x(0)=0.
b) Use Euler's method to numerically integrate Eqs. (P1.19) and (1.9) in
order to determine both the velocity and distance fallen as a function of
time for the first 10s of free fall using the same parameters as in
Example 1.2.
c) Develop a plot of your numerical results together with analytical
solutions.
Homework #1 -- Due Monday January 24, 2011

Problem 2.3, p. 47 of the text

Develop, debug, and document a program to
determine the roots of a quadratic equation ax² + bx +
c, in either a high-level language or a macro language
of your choice (MatLab strongly suggested). Use a
subroutine procedure to compute the roots (either real
or complex). Perform test runs for the cases:
a) a=1, b=6, c=2;
b) a=0, b=-4, c=1.6;
c) a=3, b=2.5, c=7.
Homework #1 -- Due Monday January 24, 2011

Problem 2.25, p. 51 of the text

The pseudocode below computes the factorial. Express this algorithm as a well-structured
function in the language of your choice (MatLab is strongly recommended). Test it by
computing 0! and 5!. In addition, test the error trap by trying to evaluate (-2)!

Pseudocode

FUNCTION fac(n)
  IF n ≥ 0 THEN
    x = 1
    DOFOR i = 1, n
      x = x · i
    END DO
    fac = x
  ELSE
    display error message
    terminate
  ENDIF
END fac
Chapter 3 – Approximation and
Round-Off Errors
 Errors are inherent in any numerical solution
 Even when we have an exact analytic solution (as
in the parachute or circuit example of Chapter 1),
as soon as we use a computer to calculate
solutions, those solutions have error in them.
 Often we do not have exact or analytic solutions;
numerical techniques then give us approximations
– but how much error is there?



Sources of Error
 This lecture will deal with all of the major errors
associated with numerical analysis
 Chapter 3 discusses accuracy and precision and
errors due to number representation in the
computer.
 Chapter 4 deals with truncation errors and in
detail errors associated with Taylor’s Series
approximations.



3.1 Significant Figures
 Below is a car odometer (see text Fig 3.1)

 How many significant digits are there?


 There are eight significant digits.
 However, only the first seven can be used with confidence.
 We can approximate the 8th digit (126,462.25).
3.1 Significant Figures
 Below is a car speedometer (see text Fig 3.1)

 How many significant digits are there?


 There are two significant digits.
 However, only the first can be used with confidence.
 We can approximate the 2nd digit (52).
3.2 Accuracy and
Precision
 Accuracy refers to how closely a
computed or measured value agrees
with the true value.
 Precision refers to how closely
computed or measured values agree
with each other.



Illustration of Accuracy and Precision

Figure 3.2
p. 55 of text



Computer Precision
 Precision on a computer comes from
the word-length used for computation.

 Accuracy on a computer comes from


the quality of the algorithms used to
perform the calculations.



3.3 Error Definitions
 Numerical errors include truncation
errors due to inexact mathematical
operations and round-off errors due
to significant figure limitations.
 True Value = Calculated Value +
True Error
 Calculated Value is referred to in the
text as the Approximation
Error Definitions
 Et = True Value – Approximation
 Relative Error expressed as a percentage
is given by:
ε_t = (True Error / True Value) × 100% = (E_t / TV) × 100%
 NOTE: This assumes we know the true
value.



Example 3.1
 Problem Statement We measure a
bridge and a rivet. We measure
10,000cm for the bridge and 11cm
for the rivet. If the true values are
9,999cm and 10cm respectively,
calculate a) the true error and b) the
relative percent error. (Note: This is
slightly different from the version in the text.)



Example 3.1
For the Bridge:
For the Bridge:
 E_t = 9999 cm − 10000 cm = −1 cm
 ε_t = (E_t / TV) × 100% = (−1 cm / 9999 cm) × 100% ≈ −0.01%

For the Rivet:
 E_t = 10 cm − 11 cm = −1 cm
 ε_t = (E_t / TV) × 100% = (−1 cm / 10 cm) × 100% = −10%
Approximate Error
 Often we do not know the true value.
 When we do not know the true value, we use the
following equation to calculate the approximate error:

ε_a = (Approximate Error / Approximation) × 100%

 In the above formula, the denominator is our approximation and the numerator is the approximate error.
 We know the approximation, but how do we find the approximate error?



Finding the Approximate Error
 When we do not know the true value, it is often challenging to find the
approximate error.
 However, in iterative algorithms, we can use the difference in successive
approximations as a reasonable indication of the approximate error:

ε_a = ((Current Approximation − Previous Approximation) / Current Approximation) × 100%

 In the above formula, the denominator is our current approximation and the numerator is the approximate error.

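The successive-approximation error estimate above can be sketched in a few lines. This is a hedged illustration in Python rather than MatLab (the course language), using Newton's method for sqrt(2) as a hypothetical iterative process; the names are my own:

```python
# Hypothetical illustration: estimate sqrt(2) iteratively, stopping when the
# approximate percent error (current minus previous, over current) drops below es.
def newton_sqrt(a, es_percent, x0=1.0, maxit=50):
    x = x0
    ea = 100.0
    for _ in range(maxit):
        x_old = x
        x = 0.5 * (x + a / x)                # Newton update for sqrt(a)
        ea = abs((x - x_old) / x) * 100.0    # approximate percent error
        if ea < es_percent:
            break
    return x, ea

root, ea = newton_sqrt(2.0, 0.05)            # es = 0.05% (3 significant digits)
```

Note that the loop never sees the true value of sqrt(2); the difference between successive iterates is the only error measure available.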


Sign of the Error
 The error is negative if the
approximation is larger than the
true value
 The error is positive if the
approximation is smaller than the
true value



Absolute Error
 Often we are not concerned with
the sign of the error.
 In the rest of this class, we will
use absolute error.
ε_a = |Approximate Error / Approximation| × 100%
Error & Significant Digits
 Often we want the error to be less
than a certain number of significant
digits n.
 The error will be less than n
significant digits if:
ε_a < ε_s = (0.5 × 10^(2−n)) %



Let's Take a TEN MINUTE Break


Example 3.2 Error Estimates for Iterative Methods
Problem Statement. In mathematics, functions can
often be represented by infinite series. For example,
the exponential function can be computed using
e^x = 1 + x + x²/2! + x³/3! + ⋯ + xⁿ/n!
Thus, as more terms are added to the sequence, the
approximation becomes a better and better estimate
of the true value of ex. This is called a Maclaurin
series expansion.
Example 3.2 Error Estimates for Iterative Methods
Problem Statement Continued:
e^x = 1 + x + x²/2! + x³/3! + ⋯ + xⁿ/n!

Starting with the simplest version, e^x = 1, add terms
one at a time to estimate e^0.5 = 1.648721… Add
terms until the absolute value of the approximate
error estimate ε_a falls below a prespecified error
criterion ε_s conforming to three significant figures.
Example 3.2 SOLUTION
First solve for the error-specification equivalent to
three significant digits:
ε_s = (0.5 × 10^(2−n)) % = (0.5 × 10^(2−3)) % = 0.05%
The true error can be calculated as:
ε_t = ((e^0.5 − approx) / e^0.5) × 100%



Example 3.2 SOLUTION
 However, we usually do not know the true value
and therefore cannot calculate the true error.
 In such cases we must use the approximation error:

ε_a = ((approx(n) − approx(n−1)) / approx(n)) × 100%



x = 0.5,  ε_s = 0.05%,  e^0.5 = 1.64872127

n   Approx        ε_t         ε_a
0 1.00000000 39.34693%
1 1.50000000 9.02040% 33.33333333%
2 1.62500000 1.43877% 7.69230769%
3 1.64583333 0.17516% 1.26582278%
4 1.64843750 0.01721% 0.15797788%
5 1.64869792 0.00142% 0.01579529%
6 1.64871962 0.00010% 0.00131626%
7 1.64872117 0.00001% 0.00009402%
8 1.64872127 0.00000% 0.00000588%
3.3.1 Iterative
Calculations
 Most of the methods in this course
are iterative, using successive
approximations to the true value.
 The computer implementation
involves LOOPS, usually ending when
the error drops below a specified
value.
function [v,ea,iter] = IterMeth(x,es,maxit)
% Implements a general series expansion
% INPUT:
%   x     = independent variable
%   es    = specified (stopping) error, in percent
%   maxit = maximum number of iterations
%   Uses a function term(x,n) that defines the n-th term of the series
% OUTPUT:
%   v    = approximate value after iter iterations
%   ea   = approximate percent error after iter iterations
%   iter = number of iterations
iter = 1; sol = 1; ea = 100;
while ea > es && iter < maxit
    solold = sol;
    sol = sol + term(x,iter);
    iter = iter + 1;
    if sol ~= 0
        ea = abs((sol - solold)/sol)*100;   % approximate percent error
    end
end
v = sol;
end
function [ val ] = term( x,n )
% Function to calculate the
% n-th term of e^x
% INPUT: x, n
% OUTPUT: val
val=x^n/factorial(n);
return
end



3.4 Round-Off Errors
 Round-off errors originate from the fact that
computers retain only a fixed number of
significant digits.
 Irrational numbers cannot be expressed
exactly.
 Many numbers that are exact in decimal are
not exact in the binary representation used by computers.



3.4.1 Computer Representation of Numbers
 Integers are represented in most computers in two’s
complement representation.
 MatLab Integer Representations
Class                     Range of Values     Name
Signed 8-bit integer      −2^7 to 2^7 − 1     int8
Signed 16-bit integer     −2^15 to 2^15 − 1   int16
Signed 32-bit integer     −2^31 to 2^31 − 1   int32
Signed 64-bit integer     −2^63 to 2^63 − 1   int64
Unsigned 8-bit integer    0 to 2^8 − 1        uint8
Unsigned 16-bit integer   0 to 2^16 − 1       uint16
Unsigned 32-bit integer   0 to 2^32 − 1       uint32
Unsigned 64-bit integer   0 to 2^64 − 1       uint64
3.4.1 Computer Representation of Numbers
 Floating Point numbers are represented in most
computers by IEEE Standard 754.
Name        Common name          Base   Digits   E min    E max    Decimal digits   Decimal E max
binary16    Half precision       2      10+1     −14      +15      3.31             4.51
binary32    Single precision     2      23+1     −126     +127     7.22             38.23
binary64    Double precision     2      52+1     −1022    +1023    15.95            307.95
binary128   Quadruple precision  2      112+1    −16382   +16383   34.02            4931.77
decimal32                        10     7        −95      +96      7                96
decimal64                        10     16       −383     +384     16               384
decimal128                       10     34       −6143    +6144    34               6144



3.4.1 Computer Representation of Numbers
 MatLab uses primarily double precision IEEE Standard 754
arithmetic.
 Single precision is available, but half, quadruple, and the
decimal representations are not.
Binary and HEX
Binary HEX Binary HEX Binary HEX Binary HEX
0000 0 0100 4 1000 8 1100 C
0001 1 0101 5 1001 9 1101 D
0010 2 0110 6 1010 A 1110 E
0011 3 0111 7 1011 B 1111 F



3.4.1 Integer Representation in MatLab
uint8: 8-bit unsigned integer (0 – 255)
Binary      HEX   Decimal
0110 1010   6A    6·16¹ + 10 = 106
0011 1011   3B    3·16¹ + 11 = 59
1100 0101   C5    12·16¹ + 5 = 197
1111 0111   F7    15·16¹ + 7 = 247



3.4.1 Integer Representation in MatLab
uint8: 8-bit unsigned integer (0 – 255)
Decimal   HEX                          Binary
16        16/16 = 1 R 0 → 10           0001 0000
40        40/16 = 2 R 8 → 28           0010 1000
228       228/16 = 14 R 4 → E4         1110 0100
157       157/16 = 9 R 13 → 9D         1001 1101



3.4.1 Integer Representation in MatLab
int8: 8-bit signed integer (-128 – 127)
Binary      HEX   Decimal
0110 1010   6A    6·16¹ + 10 = 106
0011 1011   3B    3·16¹ + 11 = 59
1100 0101   C5    4·16¹ + 5 − 128 = −59
1111 0111   F7    7·16¹ + 7 − 128 = −9



3.4.1 Integer Representation in MatLab
int8: 8-bit signed integer (-128 – 127)
Decimal   HEX                              Binary
16        16/16 = 1 R 0 → 10               0001 0000
40        40/16 = 2 R 8 → 28               0010 1000
−28       (256 − 28)/16 = 14 R 4 → E4      1110 0100
−99       (256 − 99)/16 = 9 R 13 → 9D      1001 1101

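The two's complement rule in the tables above (subtract 256 when the sign bit is set) can be checked with a short sketch. This is a hedged Python illustration (the course uses MatLab); the helper names are my own:

```python
# Hypothetical helpers: interpret the same 8-bit pattern as uint8 vs int8.
def as_uint8(byte):
    # Unsigned: just keep the low 8 bits (0 .. 255)
    return byte & 0xFF

def as_int8(byte):
    # Two's complement: subtract 256 when the sign bit (bit 7) is set (-128 .. 127)
    b = byte & 0xFF
    return b - 256 if b >= 128 else b

print(as_uint8(0xE4), as_int8(0xE4))  # 228 -28
print(as_uint8(0x9D), as_int8(0x9D))  # 157 -99
```

The same bit pattern 0xE4 reads as 228 unsigned but −28 signed, matching the table entries.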


3.4.1 Integer Representation in MatLab
int16: 16-bit signed integer (-32,768 – 32,767)
Binary                HEX    Decimal
0110 1010 0011 1011   6A3B   6·16³ + 10·16² + 3·16¹ + 11 = 27,195
                             (MatLab: hex2dec('6A3B'))
1100 0101 1111 0111   C5F7   4·16³ + 5·16² + 15·16¹ + 7 − 32,768 = −14,857
                             (MatLab: hex2dec('45F7') + intmin('int16'))



3.4.1 Integer Representation in MatLab
int16: 16-bit signed integer (-32,768 – 32,767)
Decimal   HEX                               Binary
4136      4136/16 = 258 R 8                 0001 0000 0010 1000
          258/16 = 16 R 2
          16/16 = 1 R 0   → 1028
−7011     65,536 − 7011 = 58,525            1110 0100 1001 1101
          58,525/16 = 3657 R 13
          3657/16 = 228 R 9
          228/16 = 14 R 4   → E49D
Let's Take a TEN MINUTE Break
3.4.1 Floating Point Representation
 Computer equivalent of scientific notation.

I = 2,356   F = 2.356 × 10³   (normalization)
I = 2,356   F = 2.36 × 10³    (normalization & rounding)

 Binary equivalent:

I = 1011   F = 1.011 × 2³   (normalization)
I = 1011   F = 1.10 × 2³    (normalization & rounding)

 In binary we save a bit: the whole-number part of a
normalized number is always 1, so we only need to store the fractional part.
IEEE 754 Double Precision

 MSB is the sign of the number (0 = +; 1 = -)


 The exponent is 11 bits in a biased number system
(subtract 1023 from the actual number)
 The mantissa is formed by 1 + the 52-bit fraction. (53
bits total)
Exponent encoding
The double precision binary floating-point exponent is encoded using an offset
binary representation, with the zero offset being 1023; also known as exponent
bias in the IEEE 754 standard. Examples of such representations would be:

■ Emin (1) = -1022


■ E (50) = -973
■ Emax (2046) = 1023

Thus, as defined by the offset binary representation, in order to get the true
exponent the exponent bias of 1023 has to be subtracted from the written
exponent. The exponents 0x000 and 0x7ff have a special meaning:

■ 0x000 is used to represent zero (if F=0) and subnormals (if F≠0); and
■ 0x7ff is used to represent infinity (if F=0) and NaNs (if F≠0),

where F is the fraction (mantissa). All other bit patterns are valid encodings.



IEEE 754 Double Precision

(−𝟏)𝒔𝒊𝒈𝒏 × 𝟐(𝒆𝒙𝒑𝒐𝒏𝒆𝒏𝒕−𝟏𝟎𝟐𝟑) × 𝟏. 𝒎𝒂𝒏𝒕𝒊𝒔𝒔𝒂


0x3ff0 0000 0000 0000 = 1
0x3ff0 0000 0000 0001 = 1.0000000000000002,
the next higher number > 1
0x3ff0 0000 0000 0002 = 1.0000000000000004
0x4000 0000 0000 0000 = 2
0xc000 0000 0000 0000 = –2
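The hex patterns above can be decoded directly by reinterpreting the 64-bit pattern as a double. A hedged Python sketch (the course uses MatLab; here `struct` does the bit reinterpretation):

```python
import struct

def bits_to_double(bits):
    # Reinterpret a 64-bit integer pattern as an IEEE 754 double
    return struct.unpack('>d', struct.pack('>Q', bits))[0]

print(bits_to_double(0x3ff0000000000000))   # 1.0
print(bits_to_double(0x4000000000000000))   # 2.0
print(bits_to_double(0xc000000000000000))   # -2.0
# The pattern one unit above 1.0 differs from 1.0 by exactly 2^-52:
print(bits_to_double(0x3ff0000000000001) - 1.0)  # 2.220446049250313e-16
```

This confirms that bumping the low mantissa bit of 1.0 moves the value by 2⁻⁵², the spacing of doubles just above 1.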
IEEE 754 Double Precision

(−𝟏)𝒔𝒊𝒈𝒏 × 𝟐(𝒆𝒙𝒑𝒐𝒏𝒆𝒏𝒕−𝟏𝟎𝟐𝟑) × 𝟏. 𝒎𝒂𝒏𝒕𝒊𝒔𝒔𝒂


0x0000 0000 0000 0000 = 0
0x8000 0000 0000 0000 = –0
0x7ff0 0000 0000 0000 = Infinity
0xfff0 0000 0000 0000 = -Infinity
IEEE 754 Double Precision

(−𝟏)𝒔𝒊𝒈𝒏 × 𝟐(𝒆𝒙𝒑𝒐𝒏𝒆𝒏𝒕−𝟏𝟎𝟐𝟑) × 𝟏. 𝒎𝒂𝒏𝒕𝒊𝒔𝒔𝒂


0x0000 0000 0000 0001 ≈ 4.9406564584124654 × 10⁻³²⁴ (Min subnormal positive double)
0x0010 0000 0000 0000 ≈ 2.2250738585072014 × 10⁻³⁰⁸ (Min normal positive double)
0x7fef ffff ffff ffff ≈ 1.7976931348623157 × 10³⁰⁸ (Max double)
Note: Subnormals fill the gap between zero and smallest number
(see text pp. 65-67 for explanation of this gap).
IEEE 754 Double Precision

(−𝟏)𝒔𝒊𝒈𝒏 × 𝟐(𝒆𝒙𝒑𝒐𝒏𝒆𝒏𝒕−𝟏𝟎𝟐𝟑) × 𝟏. 𝒎𝒂𝒏𝒕𝒊𝒔𝒔𝒂


0x7ff0 0000 0000 0001 = sNaN (signaling NaN)
0xfff8 0000 0000 0000 = qNaN (quiet NaN)



There are three kinds of operations which return NaN:
 Operations with a NaN as at least one operand
 Indeterminate forms
  o The divisions 0/0, ∞/∞, ∞/−∞, −∞/∞, and −∞/−∞
  o The multiplications 0×∞ and 0×−∞
  o The additions ∞ + (−∞), (−∞) + ∞ and equivalent subtractions
  o The standard has alternative functions for powers:
     The standard pow function and the integer exponent pown function define 0⁰, 1^∞, and ∞⁰ as 1.
     The powr function defines all three as invalid operations (NaN).
 Real operations with complex results, for example:
  o The square root of a negative number
  o The logarithm of a negative number
  o The inverse sine or cosine of a number that is less than −1 or greater than +1.
3.4.2 Arithmetic Manipulations
 Floating-point addition and subtraction can cause
significant error.
 Exponents must be the same to add or subtract:
   1.557 × 10⁴
 + 4.381 × 10²    ← can't add yet!
 The mantissa of the number with the smaller exponent
is modified to make the exponents the same:
   1.557   × 10⁴
 + 0.04381 × 10⁴
 = 1.60081 × 10⁴
3.4.2 Arithmetic Manipulations
   1.557   × 10⁴
 + 0.04381 × 10⁴
 = 1.60081 × 10⁴
 However, we must now either truncate (chop) or round
to the machine precision (4 digits):
   1.600 × 10⁴   (chopped)
   1.601 × 10⁴   (rounded)
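The 4-digit chop-vs-round example can be reproduced with Python's `decimal` module, which lets us set the working precision; a hedged sketch (not MatLab, purely illustrative):

```python
from decimal import Decimal, getcontext, ROUND_DOWN

getcontext().prec = 4                      # emulate a 4-significant-digit machine
s_round = Decimal('1.557E4') + Decimal('4.381E2')   # default: rounded
getcontext().rounding = ROUND_DOWN         # switch to chopping
s_chop = Decimal('1.557E4') + Decimal('4.381E2')
print(s_round, s_chop)   # the exact sum 1.60081E+4 becomes 1.601E+4 / 1.600E+4
```

Both results lose the trailing digits 81 of the exact sum, exactly as on the slide.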
3.4.2 Arithmetic Manipulations
 Floating-point subtraction of two numbers that are
very close to each other causes major problems.
 Exponents must be the same to add or subtract:
   7.642 × 10³
 − 7.641 × 10³
 = 0.001 × 10³
 Renormalization creates three non-significant digits:
   0.001 × 10³ = 1.000 × 10⁰

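Subtractive cancellation can be demonstrated the same way; a hedged Python sketch, first at 4-digit decimal precision, then in ordinary binary floating point:

```python
from decimal import Decimal, getcontext

getcontext().prec = 4                        # 4-significant-digit machine
d = Decimal('7.642E3') - Decimal('7.641E3')
print(d)    # 1: four-digit operands leave only one significant digit

# The same effect in binary doubles: the leading digits cancel, so the
# rounding error of the operands dominates the small result.
x, y = 0.7642, 0.7641
print(x - y)   # close to, but not exactly, 1e-4
```

The subtraction itself is exact; the damage is that almost nothing of the operands' precision survives in the result.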


Example 3.7
Investigate the effect of round-off error on large
numbers of interdependent computations.
 The text suggests adding 1 to itself 10,000 times and
0.00001 to itself 100,000 times, in single precision.
 The first sum is 10,000, but the second will not
be exactly 1 due to round-off errors.
 The text then suggests doing the second sum
in double precision.



Example 3.7 Generalized
% Program Fig0312 (page 72 of text)
% INPUTS:  x1 = the first number to add (e.g., 1)
%          x2 = the second number to add (e.g., 0.1)
%          x3 = the third number to add (e.g., 0.1)
%          n  = number of iterations (e.g., 10,000,000)
% OUTPUTS: sum1 = x1 summed n times (single precision)
%          sum2 = x2 summed n times (single precision)
%          sum3 = x3 summed n times (double precision)
sum1=single(0);
sum2=sum1;
sum3=0;
for i = 1:n
    sum1=sum1+x1;
    sum2=sum2+x2;
    sum3=sum3+x3;
end
Example 3.7 Generalized
% Program Fig0312 (continued) -- print the results
t1=sprintf('n = %0.10g, ',n);
t2=sprintf('x1 = %0.8g, x2 = %0.8g, x3 = %0.8g\n',x1,x2,x3);
t3=sprintf('sum1 = %0.8g\n',sum1);
t4=sprintf('sum2 = %0.8g\n',sum2);
t5=sprintf('sum3 = %0.8g\n',sum3);
t=sprintf('%s%s%s%s%s',t1,t2,t3,t4,t5);
sprintf(t)

Sample run: n = 10000000, x1 = 1, x2 = 0.1, x3 = 0.1
sum1 = 10000000
sum2 = 1087937
sum3 = 1000000
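The single-versus-double drift can be sketched without MatLab. A hedged Python illustration: `struct` round-trips each partial sum through a 32-bit float, emulating single-precision accumulation (a smaller n than the slide's is used to keep it quick; the helper name is my own):

```python
import struct

def f32(x):
    # Round a Python double to the nearest IEEE 754 single
    return struct.unpack('f', struct.pack('f', x))[0]

n = 100_000
tenth32 = f32(0.1)                  # 0.1 is not exact in binary
sum_single, sum_double = 0.0, 0.0
for _ in range(n):
    sum_single = f32(sum_single + tenth32)   # every add rounded to single
    sum_double += 0.1                        # double precision
print(sum_single, sum_double)   # the single-precision sum drifts from 10,000
```

Even at this smaller n, the single-precision total visibly misses the target while the double-precision total is essentially exact.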


Example 3.8
Find the roots of the quadratic equation with
a=1, b=30000000.0000001, c=3.
(x+30000000)(x+0.0000001)
 MatLab and Excel have trouble with this:

EDU>> a=1; b=30000000.0000001; c=3;
EDU>> r1=(-b+sqrt(b^2-4*a*c))/2/a
r1 = -1.00582838058472e-007     (= -0.000000100582838)
EDU>> r2=(-b-sqrt(b^2-4*a*c))/2/a
r2 = -30000000
EDU>> r1=2*c/(-b-sqrt(b^2-4*a*c))
r1 = -1e-007                    (= -0.000000100000000)
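The same cancellation can be reproduced in a hedged Python sketch: the usual quadratic formula subtracts two nearly equal numbers for the small root, while the rationalized form 2c/(−b − √(b² − 4ac)) avoids the subtraction entirely:

```python
import math

a, b, c = 1.0, 30000000.0000001, 3.0
disc = math.sqrt(b * b - 4 * a * c)
r1_naive = (-b + disc) / (2 * a)       # subtracts two nearly equal numbers
r1_stable = (2 * c) / (-b - disc)      # algebraically equivalent, no cancellation
print(r1_naive)    # wrong in the third significant digit
print(r1_stable)   # agrees with the true root -1e-07
```

Both expressions are exact in real arithmetic; only the rounding behavior differs.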
Example 3.9
Using a Taylor series to calculate e^x when x is negative is not
recommended.
 Instead calculate 1/e^|x|.
 However, the example in the text on pp. 74-75 works fine in both
MatLab and Excel.
 You need to use a larger-magnitude x to see the effect (try -19 and -20).

EDU>> IterMeth(x,es,maxit)
ans = 2.55376446514562e-009

EDU>> exp(-19)
ans = 5.60279643753727e-009
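The failure mode (and the 1/e^|x| fix) can be sketched in a few lines of Python; a hedged illustration with a hypothetical series helper, not the text's IterMeth:

```python
import math

def exp_series(x, nterms=120):
    # Straight Maclaurin sum of e^x: fine for x > 0,
    # but for large negative x the huge alternating terms cancel badly.
    s, term = 1.0, 1.0
    for n in range(1, nterms):
        term *= x / n
        s += term
    return s

direct = exp_series(-19.0)               # catastrophic cancellation
via_reciprocal = 1.0 / exp_series(19.0)  # all terms positive: well behaved
print(direct, via_reciprocal, math.exp(-19.0))
```

The direct sum is badly wrong even though the series converges mathematically; the reciprocal form matches exp(-19) to near machine precision.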
Let's Take a TEN MINUTE Break
Chapter 4 – Truncation errors
and the Taylor Series
 Truncation Errors are those that result from using
an approximation in place of an exact mathematical
procedure
 Euler’s Method is an example of a first order
approximation to the real next value of a function.
 Most approximation methods including Euler’s
Method make use of a Taylor Series Approximation.



4.1 – The Taylor Series
 If a function f and its first n+1 derivatives are
continuous on an interval containing x and x+h:
f(x + h) = f(x) + f′(x)h + (f″(x)/2!)h² + ⋯ + (f⁽ⁿ⁾(x)/n!)hⁿ + R_n

 Where the nth-order approximation error is given by:

R_n = ∫ from t = x to x+h of ((x + h − t)ⁿ/n!) f⁽ⁿ⁺¹⁾(t) dt = (f⁽ⁿ⁺¹⁾(ξ)/(n + 1)!) hⁿ⁺¹

Where ξ is some value between x and x+h


4.1 – Zero-, 1st- and 2nd-Order Approximations
 R_n accounts for the gap between the approximation and the exact prediction:



Example 4.1 – Exact Taylor
Series Approximation
 For polynomials of order n, the n-th order Taylor series
approximation is exact.
 Consider f(x) = −0.1x⁴ − 0.15x³ − 0.5x² − 0.25x + 1.2  (n = 4)
  f′(x) = −0.4x³ − 0.45x² − x − 0.25
  f″(x) = −1.2x² − 0.9x − 1
  f‴(x) = −2.4x − 0.9
  f⁗(x) = −2.4
  f⁽ᵛ⁾(x) = 0
 Hence R₄ = (f⁽⁵⁾(ξ)/5!) h⁵ = (0/5!) h⁵ = 0, and the 4th-order
expansion reproduces f exactly.
Example 4.1 – Successive Approximation
 Here are the successive approximations for x = 0 and h = 1 of
f(x) = −0.1x⁴ − 0.15x³ − 0.5x² − 0.25x + 1.2

Derivatives evaluated at x = 0:
  f(0) = 1.2,  f′(0) = −0.25,  f″(0) = −1,  f‴(0) = −0.9,  f⁗(0) = −2.4,  f⁽ᵛ⁾(0) = 0

Each order adds one term:  fₙ(1) = fₙ₋₁(1) + (f⁽ⁿ⁾(0)/n!) hⁿ

  f₀(1) = f(0) = 1.2
  f₁(1) = f₀(1) + f′(0)(1) = 1.2 − 0.25 = 0.95
  f₂(1) = f₁(1) + (f″(0)/2!)(1)² = 0.95 − 1/2 = 0.45
  f₃(1) = f₂(1) + (f‴(0)/3!)(1)³ = 0.45 − 0.9/6 = 0.3
  f₄(1) = f₃(1) + (f⁗(0)/4!)(1)⁴ = 0.3 − 2.4/24 = 0.2 = f(1)
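The successive approximations above can be checked numerically; a hedged Python sketch (the derivative values at x = 0 are taken straight from the example):

```python
import math

derivs = [1.2, -0.25, -1.0, -0.9, -2.4]   # f(0), f'(0), f''(0), f'''(0), f''''(0)
h = 1.0

approx, partial = [], 0.0
for n, d in enumerate(derivs):
    partial += d / math.factorial(n) * h ** n   # add the next Taylor term
    approx.append(partial)
print(approx)   # approximately 1.2, 0.95, 0.45, 0.3, 0.2

f = lambda x: -0.1 * x**4 - 0.15 * x**3 - 0.5 * x**2 - 0.25 * x + 1.2
```

The 4th-order value matches f(1) = 0.2 exactly (up to round-off), confirming that the remainder vanishes for a 4th-order polynomial.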
Example 4.2 Infinite Taylor Series Approximation
Use Taylor series expansions with n = 0 through n = 6 to approximate
f(x) = cos(x) at x = π/3, on the basis of f and its derivatives at π/4 (h = π/3 − π/4).

Order n   f⁽ⁿ⁾(π/4)                  Approx. of f(π/3)   ε_t
0         cos(x) = 0.70710678        0.70710678          41.4214%
1         −sin(x) = −0.70710678      0.52198666          4.3973%
2         −cos(x) = −0.70710678      0.49775449          0.4491%
3         sin(x) = 0.70710678        0.49986915          0.0262%
4         cos(x) = 0.70710678        0.50000755          0.0015%
5         −sin(x) = −0.70710678      0.50000030          0.0001%
6         −cos(x) = −0.70710678      0.49999999          0.0000%
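The cosine table can be regenerated with a few lines; a hedged Python sketch exploiting the fact that the derivatives of cos cycle with period 4:

```python
import math

x0 = math.pi / 4
h = math.pi / 3 - math.pi / 4
# Derivatives of cos(x) cycle: cos, -sin, -cos, sin, cos, ...
deriv = [math.cos, lambda t: -math.sin(t), lambda t: -math.cos(t), math.sin]

approx, errors = 0.0, []
for n in range(7):
    approx += deriv[n % 4](x0) / math.factorial(n) * h ** n
    errors.append(abs((0.5 - approx) / 0.5) * 100)   # true value is cos(pi/3) = 0.5
print(errors)   # roughly 41.42, 4.40, 0.449, ... shrinking toward zero
```

Each extra term cuts the true percent error by more than an order of magnitude, just as in the table.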
4.1.1 – The Remainder for the
Taylor Series Expansion
 Suppose we truncate the Taylor Series expansion
after the first term (zero-order approximation):

f(x + h) = f(x) + R₀

 Where the 0th-order approximation error is given by (n = 0 in the remainder formula):

R₀ = ∫ from t = x to x+h of f′(t) dt = f′(ξ) h

Where ξ is some value between x and x+h


4.1.1 – The Remainder for
the Taylor Series Expansion
 R₀ is the gap between the zero-order prediction and the exact value:

Figure 4.2
Zero-order
Taylor Series
Prediction
4.1.1 – The Remainder for
the Taylor Series Expansion
 R₀/h is the slope given by the derivative mean-value theorem:

Figure 4.3
Derivative
Mean-Value
theorem
4.1.2 Using the Taylor Series to
Estimate Truncation Error
 The remainder R_n is an exact measure of the error of
truncating a Taylor series after term n.
 Even though R_n usually cannot be calculated exactly,
R_n is of order hⁿ⁺¹ [i.e., O(hⁿ⁺¹)].
 This allows us to estimate the error with good accuracy.
 In iterative procedures this estimate serves as the stopping criterion.



Figure 4.4: Increasing non-linearity of a function requires
higher and higher order Taylor series approximations.



Figure 4.5: Log-log plot of R₁ vs. h, showing that as h gets
smaller, R₁ decreases as h², i.e., O(h²).



4.1.3 Numerical Differentiation
 This will be covered in much more detail in
Chapters 23 and 24.
 Here we will introduce the three approximations
typically used for a derivative
o The forward difference
o The backward difference
o The centered difference
 We will also look briefly at second derivatives.
4.1.3 Numerical Differentiation
 The first forward difference approximation:

f′(xᵢ) = (f(xᵢ₊₁) − f(xᵢ))/h + O(h) = Δfᵢ/h + O(h)

 The first backward difference approximation:

f′(xᵢ) = (f(xᵢ) − f(xᵢ₋₁))/h + O(h) = ∇fᵢ/h + O(h)



4.1.3 Numerical Differentiation
 The first centered difference approximation:

f′(xᵢ) = (f(xᵢ₊₁) − f(xᵢ₋₁))/(2h) + O(h²)

 The second derivative approximation:

f″(xᵢ) = (f(xᵢ₊₁) − 2f(xᵢ) + f(xᵢ₋₁))/h² + O(h²)


Example 4.4 (pages 92-93 of the text)
f(x) = Ax⁴ + Bx³ + Cx² + Dx + E with A = −0.1, B = −0.15, C = −0.5, D = −0.25, E = 1.2
True value: f′(0.5) = −0.9125

x      f(x)
0.00   1.20000000
0.25   1.10351563
0.50   0.92500000
0.75   0.63632813
1.00   0.20000000

For h = 0.5:                      error
  forward   Δf/h = −1.45          58.90%
  backward  ∇f/h = −0.55          39.73%
  centered  δf/(2h) = −1          9.59%

For h = 0.25:                     error     error reduction
  forward   Δf/h = −1.15469       26.54%    45.06%
  backward  ∇f/h = −0.71406       21.75%    54.74%
  centered  δf/(2h) = −0.93438    2.40%     25.00%
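The three difference formulas can be checked against the example; a hedged Python sketch (the course uses MatLab, but the arithmetic is identical):

```python
f = lambda x: -0.1 * x**4 - 0.15 * x**3 - 0.5 * x**2 - 0.25 * x + 1.2
true = -0.9125   # f'(0.5) from the analytical derivative

def diffs(x, h):
    fwd = (f(x + h) - f(x)) / h            # forward difference,  O(h)
    bwd = (f(x) - f(x - h)) / h            # backward difference, O(h)
    ctr = (f(x + h) - f(x - h)) / (2 * h)  # centered difference, O(h^2)
    return fwd, bwd, ctr

print(diffs(0.5, 0.5))    # approximately (-1.45, -0.55, -1.0)
print(diffs(0.5, 0.25))   # approximately (-1.15469, -0.71406, -0.93438)
```

Halving h cuts the centered-difference error by roughly a factor of four, consistent with its O(h²) behavior.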
4.2 – Error Propagation
 The purpose of this section is to investigate how
errors propagate through mathematical functions.
 For example, if we multiply two numbers together,
each with a known error, what is the error of the
product?
 We will look at error propagation for
o Single-variable functions
o Multi-variable functions
o Stability and condition



4.2.1 – Single-Variable
Functions
 Assume we have a function f(x) that is
dependent on a single variable x.
 We have a known value x̃ that is an
approximation to x.
 Then the error in using f(x̃) rather than f(x) is
estimated by:

Δf(x̃) = |f′(x̃)| Δx̃





Example 4.5
 Given x̃ = 2.5 with Δx̃ = 0.01, estimate the
resulting error in the function f(x) = x³.
 Solution: We have f′(x) = 3x².
Hence f′(x̃) = 3(2.5)² = 18.75. Therefore:

Δf(x̃) = |f′(x̃)| Δx̃ = 18.75 × 0.01 = 0.1875

 Predict: (2.5)³ − 0.1875 ≤ x³ ≤ (2.5)³ + 0.1875, i.e.
15.4375 ≤ x³ ≤ 15.8125
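The first-order propagation estimate is a one-liner in code; a hedged Python sketch of Example 4.5:

```python
x, dx = 2.5, 0.01
f  = lambda t: t ** 3
fp = lambda t: 3 * t ** 2        # derivative of x^3

df = abs(fp(x)) * dx             # first-order error estimate |f'(x)| * dx
lo, hi = f(x) - df, f(x) + df
print(f(x), df, (lo, hi))        # 15.625, half-width 0.1875 -> [15.4375, 15.8125]
```

The interval matches the slide's prediction; note this is only a first-order (linearized) bound, not an exact one.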
4.2.2 – Multi-Variable Functions
 The concept of 4.2.1 can be extended to multi-variable
functions:
Δf(x̃₁, x̃₂, ⋯, x̃ₙ) ≅ |∂f/∂x₁| Δx̃₁ + |∂f/∂x₂| Δx̃₂ + ⋯ + |∂f/∂xₙ| Δx̃ₙ

 See Example 4.6 pp. 96-97.



4.2.3 – Stability and Condition
 The condition of a mathematical problem relates to
its sensitivity to changes in input values.
 We say that a computation is numerically
unstable if the uncertainty of the input values is
grossly magnified by the numerical method.
 A Condition Number can be defined by:

Condition Number = x̃ f′(x̃) / f(x̃)
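The condition number is easy to evaluate for concrete functions; a hedged Python sketch (the choice of test points is mine, purely illustrative):

```python
import math

def condition_number(f, fp, x):
    # |x f'(x) / f(x)|: how much a relative input error is amplified
    return abs(x * fp(x) / f(x))

well = condition_number(math.sin, math.cos, 0.1)   # about 1: well-conditioned
ill = condition_number(math.tan, lambda t: 1.0 / math.cos(t) ** 2, 1.57)
print(well, ill)   # tan near pi/2 is ill-conditioned: a huge condition number
```

A condition number near 1 means input and output errors are comparable; a number in the thousands, as for tan just below π/2, means the problem itself magnifies any input uncertainty.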
4.3 – Total Numerical Error
 The Total Numerical Error is the summation of the
truncation and round-off (or chopping) errors.
 Round-off and chopping errors increase with
subtractive cancellation and increased number of
calculations.
 Truncation errors decrease with smaller step size,
but smaller step sizes increase the number of
calculations.
 Hence: There is a trade-off!



4.3 – Total Numerical Error
Truncation errors decrease with smaller step size,
but smaller step sizes increase the number of
calculations. Hence: There is a trade-off!

Figure 4.8 Trade-off


between Truncation
and Round-Off
errors.
Example 4.8
Truncation errors decrease with smaller step size,
but smaller step sizes increase the number of
calculations. Hence: There is a trade-off!

Figure 4.9 Trade-off


between Truncation
and Round-Off errors
in Example 4.8.



4.4 – Blunders, Formulation
Errors and Uncertainty
 Programming blunders are often a cause of
error that is difficult to eliminate.
 Modeling Errors (also called Formulation
Errors) are also difficult to eliminate.
 Data Uncertainty also causes errors, but we
do have some measures for this that can be
very helpful in at least accounting for these
types of errors.



End of Tonight’s Lecture
 Any questions?
 Homework for next week: 3.1, 3.3, 3.6
(note: use e⁻¹⁹ rather than e⁻⁵), 3.13, 4.1
and 4.23
 Quiz for this week is NOW online –
complete before class next week
 Quiz for next week will be online next
Monday.
