
ENGR-516 Spring 2011

Lecture #2 January 24, 2011

Adjunct Prof. Michael A. Soderstrand


soderstrand@ieee.org
405-334-8329

1/24/2011 ENGR-516 Prof. Soderstrand Spring 2011 1


Homework #1 -- Due Monday January 24, 2011

Do the following problems from Chapters 1 and 2:

Problem 1.3, p. 21 of the text

Rather than the linear relationship of Eq. (1.7), you might choose to model
the upward force on the parachutist as a second-order relationship:

F_u = −c′v²

where c′ = a second-order drag coefficient (kg/m).


a) Using calculus, obtain the closed-form solution for the case where the
jumper is initially at rest (v=0 at t=0).
b) Repeat the numerical calculation in Example 1.2 with the same initial
condition and parameter values. Use a value of 0.225 kg/m for c'.



Homework #1 -- Due Monday January 24, 2011

Problem 1.19, p. 24 of the text

The velocity is equal to the rate of change of distance x(m),

dx/dt = v(t) [Eq P1.19]

a) Substitute Eq. (1.10) and develop an analytical solution for the
distance as a function of time. Assume that x(0)=0.
b) Use Euler's method to numerically integrate Eqs. (P1.19) and (1.9) in
order to determine both the velocity and distance fallen as a function of
time for the first 10s of free fall using the same parameters as in
Example 1.2.
c) Develop a plot of your numerical results together with analytical
solutions.
Homework #1 -- Due Monday January 24, 2011

Problem 2.3, p. 47 of the text

Develop, debug, and document a program to
determine the roots of a quadratic equation ax² + bx +
c, in either a high-level language or a macro language
of your choice (MatLab strongly suggested). Use a
subroutine procedure to compute the roots (either real
or complex). Perform test runs for the cases:
a) a=1, b=6, c=2;
b) a=0, b=-4, c=1.6;
c) a=3, b=2.5, c=7.
Homework #1 -- Due Monday January 24, 2011

Problem 2.25, p. 51 of the text

The pseudocode below computes the factorial. Express this algorithm as a well-structured
function in the language of your choice (MatLab is strongly recommended). Test it by
computing 0! and 5!. In addition, test the error trap by trying to evaluate (-2)!

Pseudocode

FUNCTION fac(n)
  IF n ≥ 0 THEN
    x = 1
    DOFOR i = 1, n
      x = x · i
    END DO
    fac = x
  ELSE
    display error message
    terminate
  ENDIF
END fac
Chapter 3 – Approximation and
Round-Off Errors
 Errors are inherent in any numerical solution
 Even when we have an exact analytic solution (as
in the parachute or circuit example of Chapter 1),
as soon as we use a computer to calculate
solutions, those solutions have error in them.
 Often we do not have exact or analytic solutions;
numerical techniques then give us approximations
– but how much error is there?



Sources of Error
 This lecture will deal with all of the major errors
associated with numerical analysis
 Chapter 3 discusses accuracy and precision and
errors due to number representation in the
computer.
 Chapter 4 deals with truncation errors and in
detail errors associated with Taylor’s Series
approximations.



3.1 Significant Figures
 Below is a car odometer (see text Fig 3.1)

 How many significant digits are there?


 There are eight significant digits.
 However, only the first seven can be used with confidence.
 We can approximate the 8th digit (126,462.25).
3.1 Significant Figures
 Below is a car speedometer (see text Fig 3.1)

 How many significant digits are there?


 There are two significant digits.
 However, only the first can be used with confidence.
 We can approximate the 2nd digit (52).
3.2 Accuracy and
Precision
 Accuracy refers to how closely a
computed or measured value agrees
with the true value.
 Precision refers to how closely
computed or measured values agree
with each other.



Illustration of Accuracy and Precision

Figure 3.2
p. 55 of text



Computer Precision
 Precision on a computer comes from
the word-length used for computation.

 Accuracy on a computer comes from


the quality of the algorithms used to
perform the calculations.



3.3 Error Definitions
 Numerical errors include truncation
errors due to inexact mathematical
operations and round-off errors due
to significant figure limitations.
 True Value = Calculated Value +
True Error
 Calculated Value is referred to in the
text as the Approximation
Error Definitions
 Et = True Value – Approximation
 Relative Error expressed as a percentage
is given by:
ε_t = (True Error / True Value) × 100% = (E_t / TV) × 100%
 NOTE: This assumes we know the true
value.



Example 3.1
 Problem Statement We measure a
bridge and a rivet. We measure
10,000cm for the bridge and 11cm
for the rivet. If the true values are
9,999cm and 10cm respectively,
calculate a) the true error and b) the
relative percent error. (Note: This is
slightly different from the version in the text.)



Example 3.1
For the Bridge:
For the Bridge:
 E_t = 9999 cm − 10000 cm = −1 cm
 ε_t = (E_t / TV) × 100% = (−1 cm / 9999 cm) × 100% ≈ −0.01%

For the Rivet:
 E_t = 10 cm − 11 cm = −1 cm
 ε_t = (E_t / TV) × 100% = (−1 cm / 10 cm) × 100% = −10%
Approximate Error
 Often we do not know the true value.
 When we do not know the true value, we use the
following equation to calculate the approximate error:

ε_a = (Approximate Error / Approximation) × 100%

 In the above formula, the denominator is our approximation and the numerator is the approximate error.
 We know the approximation, but how do we find the approximate error?



Finding the Approximate Error
 When we do not know the true value, it is often challenging to find the
approximate error.
 However, in iterative algorithms, we can use the difference in successive
approximations as a reasonable indication of the approximate error:

ε_a = ((Current Approximation − Previous Approximation) / Current Approximation) × 100%

 In the above formula, the denominator is our current approximation and the numerator is the approximate error.

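The successive-approximation error estimate above can be sketched in a few lines. This is a hedged illustration in Python rather than MatLab (the course language), using Newton's method for sqrt(2) as a hypothetical iterative process; the names are my own:

```python
# Hypothetical illustration: estimate sqrt(2) iteratively, stopping when the
# approximate percent error (current minus previous, over current) drops below es.
def newton_sqrt(a, es_percent, x0=1.0, maxit=50):
    x = x0
    ea = 100.0
    for _ in range(maxit):
        x_old = x
        x = 0.5 * (x + a / x)                # Newton update for sqrt(a)
        ea = abs((x - x_old) / x) * 100.0    # approximate percent error
        if ea < es_percent:
            break
    return x, ea

root, ea = newton_sqrt(2.0, 0.05)            # es = 0.05% (3 significant digits)
```

Note that the loop never sees the true value of sqrt(2); the difference between successive iterates is the only error measure available.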


Sign of the Error
 The error is negative if the
approximation is larger than the
true value
 The error is positive if the
approximation is smaller than the
true value



Absolute Error
 Often we are not concerned with
the sign of the error.
 In the rest of this class, we will
use absolute error.
ε_a = |Approximate Error / Approximation| × 100%
Error & Significant Digits
 Often we want the error to be less
than a certain number of significant
digits n.
 The error will be less than n
significant digits if:
ε_a < ε_s = (0.5 × 10^(2−n)) %



Let's Take a TEN MINUTE Break


Example 3.2 Error Estimates for Iterative Methods
Problem Statement. In mathematics, functions can
often be represented by infinite series. For example,
the exponential function can be computed using
e^x = 1 + x + x²/2! + x³/3! + ⋯ + xⁿ/n!
Thus, as more terms are added to the sequence, the
approximation becomes a better and better estimate
of the true value of ex. This is called a Maclaurin
series expansion.
Example 3.2 Error Estimates for Iterative Methods
Problem Statement Continued:
e^x = 1 + x + x²/2! + x³/3! + ⋯ + xⁿ/n!

Starting with the simplest version, e^x = 1, add terms
one at a time to estimate e^0.5 = 1.648721… Add
terms until the absolute value of the approximate
error estimate ε_a falls below a prespecified error
criterion ε_s conforming to three significant figures.
Example 3.2 SOLUTION
First solve for the error-specification equivalent to
three significant digits:
ε_s = (0.5 × 10^(2−n)) % = (0.5 × 10^(2−3)) % = 0.05%
The true error can be calculated as:
ε_t = ((e^0.5 − approx) / e^0.5) × 100%



Example 3.2 SOLUTION
 However, we usually do not know the true value
and therefore cannot calculate the true error.
 In such cases we must use the approximation error:

ε_a = ((approx(n) − approx(n−1)) / approx(n)) × 100%



x = 0.5,  ε_s = 0.05%,  e^0.5 = 1.64872127

n   Approx        ε_t         ε_a
0 1.00000000 39.34693%
1 1.50000000 9.02040% 33.33333333%
2 1.62500000 1.43877% 7.69230769%
3 1.64583333 0.17516% 1.26582278%
4 1.64843750 0.01721% 0.15797788%
5 1.64869792 0.00142% 0.01579529%
6 1.64871962 0.00010% 0.00131626%
7 1.64872117 0.00001% 0.00009402%
8 1.64872127 0.00000% 0.00000588%
3.3.1 Iterative
Calculations
 Most of the methods in this course
are iterative, using successive
approximations to the true value.
 The computer implementation
involves LOOPS, usually ending when
the error drops below a specified
value.
function [v,ea,iter] = IterMeth(x,es,maxit)
% Implements a general series expansion
% INPUT:
%   x     = independent variable
%   es    = specified (stopping) error, in percent
%   maxit = maximum number of iterations
%   Uses a function term(x,n) that defines the n-th term of the series
% OUTPUT:
%   v    = approximate value after iter iterations
%   ea   = approximate percent error after iter iterations
%   iter = number of iterations
iter = 1; sol = 1; ea = 100;
while ea > es && iter < maxit
    solold = sol;
    sol = sol + term(x,iter);
    iter = iter + 1;
    if sol ~= 0
        ea = abs((sol - solold)/sol)*100;   % approximate percent error
    end
end
v = sol;
end
function [ val ] = term( x,n )
% Function to calculate the
% n-th term of e^x
% INPUT: x, n
% OUTPUT: val
val=x^n/factorial(n);
return
end



3.4 Round-Off Errors
 Round-off errors originate from the fact that
computers retain only a fixed number of
significant digits.
 Irrational numbers cannot be expressed
exactly.
 Many numbers that are exact in decimal are
not exact in the binary representation used by computers.



3.4.1 Computer Representation of Numbers
 Integers are represented in most computers in two’s
complement representation.
 MatLab Integer Representations
Class                     Range of Values     Name
Signed 8-bit integer      −2^7 to 2^7 − 1     int8
Signed 16-bit integer     −2^15 to 2^15 − 1   int16
Signed 32-bit integer     −2^31 to 2^31 − 1   int32
Signed 64-bit integer     −2^63 to 2^63 − 1   int64
Unsigned 8-bit integer    0 to 2^8 − 1        uint8
Unsigned 16-bit integer   0 to 2^16 − 1       uint16
Unsigned 32-bit integer   0 to 2^32 − 1       uint32
Unsigned 64-bit integer   0 to 2^64 − 1       uint64
3.4.1 Computer Representation of Numbers
 Floating Point numbers are represented in most
computers by IEEE Standard 754.
Name        Common name          Base   Digits   E min    E max    Decimal digits   Decimal E max
binary16    Half precision       2      10+1     −14      +15      3.31             4.51
binary32    Single precision     2      23+1     −126     +127     7.22             38.23
binary64    Double precision     2      52+1     −1022    +1023    15.95            307.95
binary128   Quadruple precision  2      112+1    −16382   +16383   34.02            4931.77
decimal32                        10     7        −95      +96      7                96
decimal64                        10     16       −383     +384     16               384
decimal128                       10     34       −6143    +6144    34               6144



3.4.1 Computer Representation of Numbers
 MatLab uses primarily double precision IEEE Standard 754
arithmetic.
 Single precision is available, but half, quadruple, and the
decimal representations are not.
Binary and HEX
Binary HEX Binary HEX Binary HEX Binary HEX
0000 0 0100 4 1000 8 1100 C
0001 1 0101 5 1001 9 1101 D
0010 2 0110 6 1010 A 1110 E
0011 3 0111 7 1011 B 1111 F



3.4.1 Integer Representation in MatLab
uint8: 8-bit unsigned integer (0 – 255)
Binary      HEX   Decimal
0110 1010   6A    6·16¹ + 10 = 106
0011 1011   3B    3·16¹ + 11 = 59
1100 0101   C5    12·16¹ + 5 = 197
1111 0111   F7    15·16¹ + 7 = 247



3.4.1 Integer Representation in MatLab
uint8: 8-bit unsigned integer (0 – 255)
Decimal   HEX                          Binary
16        16/16 = 1 R 0 → 10           0001 0000
40        40/16 = 2 R 8 → 28           0010 1000
228       228/16 = 14 R 4 → E4         1110 0100
157       157/16 = 9 R 13 → 9D         1001 1101



3.4.1 Integer Representation in MatLab
int8: 8-bit signed integer (-128 – 127)
Binary      HEX   Decimal
0110 1010   6A    6·16¹ + 10 = 106
0011 1011   3B    3·16¹ + 11 = 59
1100 0101   C5    4·16¹ + 5 − 128 = −59
1111 0111   F7    7·16¹ + 7 − 128 = −9



3.4.1 Integer Representation in MatLab
int8: 8-bit signed integer (-128 – 127)
Decimal   HEX                              Binary
16        16/16 = 1 R 0 → 10               0001 0000
40        40/16 = 2 R 8 → 28               0010 1000
−28       (256 − 28)/16 = 14 R 4 → E4      1110 0100
−99       (256 − 99)/16 = 9 R 13 → 9D      1001 1101

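The two's complement rule in the tables above (subtract 256 when the sign bit is set) can be checked with a short sketch. This is a hedged Python illustration (the course uses MatLab); the helper names are my own:

```python
# Hypothetical helpers: interpret the same 8-bit pattern as uint8 vs int8.
def as_uint8(byte):
    # Unsigned: just keep the low 8 bits (0 .. 255)
    return byte & 0xFF

def as_int8(byte):
    # Two's complement: subtract 256 when the sign bit (bit 7) is set (-128 .. 127)
    b = byte & 0xFF
    return b - 256 if b >= 128 else b

print(as_uint8(0xE4), as_int8(0xE4))  # 228 -28
print(as_uint8(0x9D), as_int8(0x9D))  # 157 -99
```

The same bit pattern 0xE4 reads as 228 unsigned but −28 signed, matching the table entries.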


3.4.1 Integer Representation in MatLab
int16: 16-bit signed integer (-32,768 – 32,767)
Binary                HEX    Decimal
0110 1010 0011 1011   6A3B   6·16³ + 10·16² + 3·16¹ + 11 = 27,195
                             (MatLab: hex2dec('6A3B'))
1100 0101 1111 0111   C5F7   4·16³ + 5·16² + 15·16¹ + 7 − 32,768 = −14,857
                             (MatLab: hex2dec('45F7') + intmin('int16'))



3.4.1 Integer Representation in MatLab
int16: 16-bit signed integer (-32,768 – 32,767)
Decimal   HEX                               Binary
4136      4136/16 = 258 R 8                 0001 0000 0010 1000
          258/16 = 16 R 2
          16/16 = 1 R 0   → 1028
−7011     65,536 − 7011 = 58,525            1110 0100 1001 1101
          58,525/16 = 3657 R 13
          3657/16 = 228 R 9
          228/16 = 14 R 4   → E49D
Let's Take a TEN MINUTE Break
3.4.1 Floating Point Representation
 Computer equivalent of scientific notation.

I = 2,356   F = 2.356 × 10³   (normalization)
I = 2,356   F = 2.36 × 10³    (normalization & rounding)

 Binary equivalent:

I = 1011   F = 1.011 × 2³   (normalization)
I = 1011   F = 1.10 × 2³    (normalization & rounding)

 In binary we save a bit: the whole-number part of a
normalized number is always 1, so we only need to store the fractional part.
IEEE 754 Double Precision

 MSB is the sign of the number (0 = +; 1 = -)


 The exponent is 11 bits in a biased number system
(subtract 1023 from the actual number)
 The mantissa is formed by 1 + the 52-bit fraction. (53
bits total)
Exponent encoding
The double precision binary floating-point exponent is encoded using an offset
binary representation, with the zero offset being 1023; also known as exponent
bias in the IEEE 754 standard. Examples of such representations would be:

■ Emin (1) = -1022


■ E (50) = -973
■ Emax (2046) = 1023

Thus, as defined by the offset binary representation, in order to get the true
exponent the exponent bias of 1023 has to be subtracted from the written
exponent. The exponents 0x000 and 0x7ff have a special meaning:

■ 0x000 is used to represent zero (if F=0) and subnormals (if F≠0); and
■ 0x7ff is used to represent infinity (if F=0) and NaNs (if F≠0),

where F is the fraction (mantissa). All other bit patterns are valid encodings.



IEEE 754 Double Precision

(−𝟏)𝒔𝒊𝒈𝒏 × 𝟐(𝒆𝒙𝒑𝒐𝒏𝒆𝒏𝒕−𝟏𝟎𝟐𝟑) × 𝟏. 𝒎𝒂𝒏𝒕𝒊𝒔𝒔𝒂


0x3ff0 0000 0000 0000 = 1
0x3ff0 0000 0000 0001 = 1.0000000000000002,
the next higher number > 1
0x3ff0 0000 0000 0002 = 1.0000000000000004
0x4000 0000 0000 0000 = 2
0xc000 0000 0000 0000 = –2
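The hex patterns above can be decoded directly by reinterpreting the 64-bit pattern as a double. A hedged Python sketch (the course uses MatLab; here `struct` does the bit reinterpretation):

```python
import struct

def bits_to_double(bits):
    # Reinterpret a 64-bit integer pattern as an IEEE 754 double
    return struct.unpack('>d', struct.pack('>Q', bits))[0]

print(bits_to_double(0x3ff0000000000000))   # 1.0
print(bits_to_double(0x4000000000000000))   # 2.0
print(bits_to_double(0xc000000000000000))   # -2.0
# The pattern one unit above 1.0 differs from 1.0 by exactly 2^-52:
print(bits_to_double(0x3ff0000000000001) - 1.0)  # 2.220446049250313e-16
```

This confirms that bumping the low mantissa bit of 1.0 moves the value by 2⁻⁵², the spacing of doubles just above 1.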
IEEE 754 Double Precision

(−𝟏)𝒔𝒊𝒈𝒏 × 𝟐(𝒆𝒙𝒑𝒐𝒏𝒆𝒏𝒕−𝟏𝟎𝟐𝟑) × 𝟏. 𝒎𝒂𝒏𝒕𝒊𝒔𝒔𝒂


0x0000 0000 0000 0000 = 0
0x8000 0000 0000 0000 = –0
0x7ff0 0000 0000 0000 = Infinity
0xfff0 0000 0000 0000 = -Infinity
IEEE 754 Double Precision

(−𝟏)𝒔𝒊𝒈𝒏 × 𝟐(𝒆𝒙𝒑𝒐𝒏𝒆𝒏𝒕−𝟏𝟎𝟐𝟑) × 𝟏. 𝒎𝒂𝒏𝒕𝒊𝒔𝒔𝒂


0x0000 0000 0000 0001 ≈ 4.9406564584124654 × 10⁻³²⁴ (Min subnormal positive double)
0x0010 0000 0000 0000 ≈ 2.2250738585072014 × 10⁻³⁰⁸ (Min normal positive double)
0x7fef ffff ffff ffff ≈ 1.7976931348623157 × 10³⁰⁸ (Max double)
Note: Subnormals fill the gap between zero and smallest number
(see text pp. 65-67 for explanation of this gap).
IEEE 754 Double Precision

(−𝟏)𝒔𝒊𝒈𝒏 × 𝟐(𝒆𝒙𝒑𝒐𝒏𝒆𝒏𝒕−𝟏𝟎𝟐𝟑) × 𝟏. 𝒎𝒂𝒏𝒕𝒊𝒔𝒔𝒂


0x7ff0 0000 0000 0001 = sNaN (signaling NaN)
0xfff8 0000 0000 0000 = qNaN (quiet NaN)



There are three kinds of operations which return NaN:
 Operations with a NaN as at least one operand
 Indeterminate forms
  o The divisions 0/0, ∞/∞, ∞/−∞, −∞/∞, and −∞/−∞
  o The multiplications 0×∞ and 0×−∞
  o The additions ∞ + (−∞), (−∞) + ∞ and equivalent subtractions
  o The standard has alternative functions for powers:
     The standard pow function and the integer exponent pown function define 0⁰, 1^∞, and ∞⁰ as 1.
     The powr function defines all three as invalid operations (NaN).
 Real operations with complex results, for example:
  o The square root of a negative number
  o The logarithm of a negative number
  o The inverse sine or cosine of a number that is less than −1 or greater than +1.
3.4.2 Arithmetic Manipulations
 Floating-point addition and subtraction can cause
significant error.
 Exponents must be the same to add or subtract:
   1.557 × 10⁴
 + 4.381 × 10²    ← can't add yet!
 The mantissa of the number with the smaller exponent
is modified to make the exponents the same:
   1.557   × 10⁴
 + 0.04381 × 10⁴
 = 1.60081 × 10⁴
3.4.2 Arithmetic Manipulations
   1.557   × 10⁴
 + 0.04381 × 10⁴
 = 1.60081 × 10⁴
 However, we must now either truncate (chop) or round
to the machine precision (4 digits):
   1.600 × 10⁴   (chopped)
   1.601 × 10⁴   (rounded)
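The 4-digit chop-vs-round example can be reproduced with Python's `decimal` module, which lets us set the working precision; a hedged sketch (not MatLab, purely illustrative):

```python
from decimal import Decimal, getcontext, ROUND_DOWN

getcontext().prec = 4                      # emulate a 4-significant-digit machine
s_round = Decimal('1.557E4') + Decimal('4.381E2')   # default: rounded
getcontext().rounding = ROUND_DOWN         # switch to chopping
s_chop = Decimal('1.557E4') + Decimal('4.381E2')
print(s_round, s_chop)   # the exact sum 1.60081E+4 becomes 1.601E+4 / 1.600E+4
```

Both results lose the trailing digits 81 of the exact sum, exactly as on the slide.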
3.4.2 Arithmetic Manipulations
 Floating-point subtraction of two numbers that are
very close to each other causes major problems.
 Exponents must be the same to add or subtract:
   7.642 × 10³
 − 7.641 × 10³
 = 0.001 × 10³
 Renormalization creates three non-significant digits:
   0.001 × 10³ = 1.000 × 10⁰

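Subtractive cancellation can be demonstrated the same way; a hedged Python sketch, first at 4-digit decimal precision, then in ordinary binary floating point:

```python
from decimal import Decimal, getcontext

getcontext().prec = 4                        # 4-significant-digit machine
d = Decimal('7.642E3') - Decimal('7.641E3')
print(d)    # 1: four-digit operands leave only one significant digit

# The same effect in binary doubles: the leading digits cancel, so the
# rounding error of the operands dominates the small result.
x, y = 0.7642, 0.7641
print(x - y)   # close to, but not exactly, 1e-4
```

The subtraction itself is exact; the damage is that almost nothing of the operands' precision survives in the result.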


Example 3.7
Investigate the effect of round-off error on large
numbers of interdependent computations.
 The text suggests adding 1 to itself 10,000 times and
0.00001 to itself 100,000 times, in single precision.
 The first sum is 10,000, but the second will not
be exactly 1 due to round-off errors.
 The text then suggests doing the second sum
in double precision.



Example 3.7 Generalized
% Program Fig0312 (page 72 of text)
% INPUTS:  x1 = the first number to add (e.g., 1)
%          x2 = the second number to add (e.g., 0.1)
%          x3 = the third number to add (e.g., 0.1)
%          n  = number of iterations (e.g., 10,000,000)
% OUTPUTS: sum1 = x1 summed n times (single precision)
%          sum2 = x2 summed n times (single precision)
%          sum3 = x3 summed n times (double precision)
sum1=single(0);
sum2=sum1;
sum3=0;
for i = 1:n
    sum1=sum1+x1;
    sum2=sum2+x2;
    sum3=sum3+x3;
end
Example 3.7 Generalized
% Program Fig0312 (continued) -- print the results
t1=sprintf('n = %0.10g, ',n);
t2=sprintf('x1 = %0.8g, x2 = %0.8g, x3 = %0.8g\n',x1,x2,x3);
t3=sprintf('sum1 = %0.8g\n',sum1);
t4=sprintf('sum2 = %0.8g\n',sum2);
t5=sprintf('sum3 = %0.8g\n',sum3);
t=sprintf('%s%s%s%s%s',t1,t2,t3,t4,t5);
sprintf(t)

Sample run: n = 10000000, x1 = 1, x2 = 0.1, x3 = 0.1
sum1 = 10000000
sum2 = 1087937
sum3 = 1000000
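The single-versus-double drift can be sketched without MatLab. A hedged Python illustration: `struct` round-trips each partial sum through a 32-bit float, emulating single-precision accumulation (a smaller n than the slide's is used to keep it quick; the helper name is my own):

```python
import struct

def f32(x):
    # Round a Python double to the nearest IEEE 754 single
    return struct.unpack('f', struct.pack('f', x))[0]

n = 100_000
tenth32 = f32(0.1)                  # 0.1 is not exact in binary
sum_single, sum_double = 0.0, 0.0
for _ in range(n):
    sum_single = f32(sum_single + tenth32)   # every add rounded to single
    sum_double += 0.1                        # double precision
print(sum_single, sum_double)   # the single-precision sum drifts from 10,000
```

Even at this smaller n, the single-precision total visibly misses the target while the double-precision total is essentially exact.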


Example 3.8
Find the roots of the quadratic equation with
a=1, b=30000000.0000001, c=3.
(x+30000000)(x+0.0000001)
 MatLab and Excel have trouble with this:

EDU>> a=1; b=30000000.0000001; c=3;
EDU>> r1=(-b+sqrt(b^2-4*a*c))/2/a
r1 = -1.00582838058472e-007     (= -0.000000100582838)
EDU>> r2=(-b-sqrt(b^2-4*a*c))/2/a
r2 = -30000000
EDU>> r1=2*c/(-b-sqrt(b^2-4*a*c))
r1 = -1e-007                    (= -0.000000100000000)
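The same cancellation can be reproduced in a hedged Python sketch: the usual quadratic formula subtracts two nearly equal numbers for the small root, while the rationalized form 2c/(−b − √(b² − 4ac)) avoids the subtraction entirely:

```python
import math

a, b, c = 1.0, 30000000.0000001, 3.0
disc = math.sqrt(b * b - 4 * a * c)
r1_naive = (-b + disc) / (2 * a)       # subtracts two nearly equal numbers
r1_stable = (2 * c) / (-b - disc)      # algebraically equivalent, no cancellation
print(r1_naive)    # wrong in the third significant digit
print(r1_stable)   # agrees with the true root -1e-07
```

Both expressions are exact in real arithmetic; only the rounding behavior differs.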
Example 3.9
Using a Taylor series to calculate e^x when x is negative is not
recommended.
 Instead calculate 1/e^|x|.
 However, the example in the text on pp. 74-75 works fine in both
MatLab and Excel.
 You need to use a larger-magnitude x to see the effect (try -19 and -20).

EDU>> IterMeth(x,es,maxit)
ans = 2.55376446514562e-009

EDU>> exp(-19)
ans = 5.60279643753727e-009
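The failure mode (and the 1/e^|x| fix) can be sketched in a few lines of Python; a hedged illustration with a hypothetical series helper, not the text's IterMeth:

```python
import math

def exp_series(x, nterms=120):
    # Straight Maclaurin sum of e^x: fine for x > 0,
    # but for large negative x the huge alternating terms cancel badly.
    s, term = 1.0, 1.0
    for n in range(1, nterms):
        term *= x / n
        s += term
    return s

direct = exp_series(-19.0)               # catastrophic cancellation
via_reciprocal = 1.0 / exp_series(19.0)  # all terms positive: well behaved
print(direct, via_reciprocal, math.exp(-19.0))
```

The direct sum is badly wrong even though the series converges mathematically; the reciprocal form matches exp(-19) to near machine precision.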
Let's Take a TEN MINUTE Break
Chapter 4 – Truncation errors
and the Taylor Series
 Truncation Errors are those that result from using
an approximation in place of an exact mathematical
procedure
 Euler’s Method is an example of a first order
approximation to the real next value of a function.
 Most approximation methods including Euler’s
Method make use of a Taylor Series Approximation.



4.1 – The Taylor Series
 If a function f and its first n+1 derivatives are
continuous on an interval containing x and x+h:
f(x + h) = f(x) + f′(x)h + (f″(x)/2!)h² + ⋯ + (f⁽ⁿ⁾(x)/n!)hⁿ + R_n

 Where the nth-order approximation error is given by:

R_n = ∫ from t = x to x+h of ((x + h − t)ⁿ/n!) f⁽ⁿ⁺¹⁾(t) dt = (f⁽ⁿ⁺¹⁾(ξ)/(n + 1)!) hⁿ⁺¹

Where ξ is some value between x and x+h


4.1 – Zero-, 1st- and 2nd-Order Approximations
 R_n accounts for the gap between the approximation and the exact prediction:



Example 4.1 – Exact Taylor
Series Approximation
 For polynomials of order n, the n-th order Taylor series
approximation is exact.
 Consider f(x) = −0.1x⁴ − 0.15x³ − 0.5x² − 0.25x + 1.2  (n = 4)
  f′(x) = −0.4x³ − 0.45x² − x − 0.25
  f″(x) = −1.2x² − 0.9x − 1
  f‴(x) = −2.4x − 0.9
  f⁗(x) = −2.4
  f⁽ᵛ⁾(x) = 0
 Hence R₄ = (f⁽⁵⁾(ξ)/5!) h⁵ = (0/5!) h⁵ = 0, and the 4th-order
expansion reproduces f exactly.
Example 4.1 – Successive Approximation
 Here are the successive approximations for x = 0 and h = 1 of
f(x) = −0.1x⁴ − 0.15x³ − 0.5x² − 0.25x + 1.2

Derivatives evaluated at x = 0:
  f(0) = 1.2,  f′(0) = −0.25,  f″(0) = −1,  f‴(0) = −0.9,  f⁗(0) = −2.4,  f⁽ᵛ⁾(0) = 0

Each order adds one term:  fₙ(1) = fₙ₋₁(1) + (f⁽ⁿ⁾(0)/n!) hⁿ

  f₀(1) = f(0) = 1.2
  f₁(1) = f₀(1) + f′(0)(1) = 1.2 − 0.25 = 0.95
  f₂(1) = f₁(1) + (f″(0)/2!)(1)² = 0.95 − 1/2 = 0.45
  f₃(1) = f₂(1) + (f‴(0)/3!)(1)³ = 0.45 − 0.9/6 = 0.3
  f₄(1) = f₃(1) + (f⁗(0)/4!)(1)⁴ = 0.3 − 2.4/24 = 0.2 = f(1)
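The successive approximations above can be checked numerically; a hedged Python sketch (the derivative values at x = 0 are taken straight from the example):

```python
import math

derivs = [1.2, -0.25, -1.0, -0.9, -2.4]   # f(0), f'(0), f''(0), f'''(0), f''''(0)
h = 1.0

approx, partial = [], 0.0
for n, d in enumerate(derivs):
    partial += d / math.factorial(n) * h ** n   # add the next Taylor term
    approx.append(partial)
print(approx)   # approximately 1.2, 0.95, 0.45, 0.3, 0.2

f = lambda x: -0.1 * x**4 - 0.15 * x**3 - 0.5 * x**2 - 0.25 * x + 1.2
```

The 4th-order value matches f(1) = 0.2 exactly (up to round-off), confirming that the remainder vanishes for a 4th-order polynomial.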
Example 4.2 Infinite Taylor Series Approximation
Use Taylor series expansions with n = 0 through n = 6 to approximate
f(x) = cos(x) at x = π/3, on the basis of f and its derivatives at π/4 (h = π/3 − π/4).

Order n   f⁽ⁿ⁾(π/4)                  Approx. of f(π/3)   ε_t
0         cos(x) = 0.70710678        0.70710678          41.4214%
1         −sin(x) = −0.70710678      0.52198666          4.3973%
2         −cos(x) = −0.70710678      0.49775449          0.4491%
3         sin(x) = 0.70710678        0.49986915          0.0262%
4         cos(x) = 0.70710678        0.50000755          0.0015%
5         −sin(x) = −0.70710678      0.50000030          0.0001%
6         −cos(x) = −0.70710678      0.49999999          0.0000%
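The cosine table can be regenerated with a few lines; a hedged Python sketch exploiting the fact that the derivatives of cos cycle with period 4:

```python
import math

x0 = math.pi / 4
h = math.pi / 3 - math.pi / 4
# Derivatives of cos(x) cycle: cos, -sin, -cos, sin, cos, ...
deriv = [math.cos, lambda t: -math.sin(t), lambda t: -math.cos(t), math.sin]

approx, errors = 0.0, []
for n in range(7):
    approx += deriv[n % 4](x0) / math.factorial(n) * h ** n
    errors.append(abs((0.5 - approx) / 0.5) * 100)   # true value is cos(pi/3) = 0.5
print(errors)   # roughly 41.42, 4.40, 0.449, ... shrinking toward zero
```

Each extra term cuts the true percent error by more than an order of magnitude, just as in the table.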
4.1.1 – The Remainder for the
Taylor Series Expansion
 Suppose we truncate the Taylor Series expansion
after the first term (zero-order approximation):

f(x + h) = f(x) + R₀

 Where the 0th-order approximation error is given by (n = 0 in the remainder formula):

R₀ = ∫ from t = x to x+h of f′(t) dt = f′(ξ) h

Where ξ is some value between x and x+h


4.1.1 – The Remainder for
the Taylor Series Expansion
 R₀ is the gap between the zero-order prediction and the exact value:

Figure 4.2
Zero-order
Taylor Series
Prediction
4.1.1 – The Remainder for
the Taylor Series Expansion
 R₀/h is the slope given by the derivative mean-value theorem:

Figure 4.3
Derivative
Mean-Value
theorem
4.1.2 Using the Taylor Series to
Estimate Truncation Error
 The remainder R_n is an exact measure of the error of
truncating a Taylor series after term n.
 Even though R_n usually cannot be calculated exactly,
R_n is of order hⁿ⁺¹ [i.e., O(hⁿ⁺¹)].
 This allows us to estimate the error with good accuracy.
 In iterative procedures this estimate serves as the stopping criterion.



Figure 4.4: Increasing non-linearity of a function requires
higher and higher order Taylor series approximations.



Figure 4.5: Log-log plot of R₁ vs. h, showing that as h gets
smaller, R₁ decreases as h², i.e., O(h²).



4.1.3 Numerical Differentiation
 This will be covered in much more detail in
Chapters 23 and 24.
 Here we will introduce the three approximations
typically used for a derivative
o The forward difference
o The backward difference
o The centered difference
 We will also look briefly at second derivatives.
4.1.3 Numerical Differentiation
 The first forward difference approximation:

f′(xᵢ) = (f(xᵢ₊₁) − f(xᵢ))/h + O(h) = Δfᵢ/h + O(h)

 The first backward difference approximation:

f′(xᵢ) = (f(xᵢ) − f(xᵢ₋₁))/h + O(h) = ∇fᵢ/h + O(h)



4.1.3 Numerical Differentiation
 The first centered difference approximation:

f′(xᵢ) = (f(xᵢ₊₁) − f(xᵢ₋₁))/(2h) + O(h²)

 The second derivative approximation:

f″(xᵢ) = (f(xᵢ₊₁) − 2f(xᵢ) + f(xᵢ₋₁))/h² + O(h²)


Example 4.4 (pages 92-93 of the text)
f(x) = Ax⁴ + Bx³ + Cx² + Dx + E with A = −0.1, B = −0.15, C = −0.5, D = −0.25, E = 1.2
True value: f′(0.5) = −0.9125

x      f(x)
0.00   1.20000000
0.25   1.10351563
0.50   0.92500000
0.75   0.63632813
1.00   0.20000000

For h = 0.5:                      error
  forward   Δf/h = −1.45          58.90%
  backward  ∇f/h = −0.55          39.73%
  centered  δf/(2h) = −1          9.59%

For h = 0.25:                     error     error reduction
  forward   Δf/h = −1.15469       26.54%    45.06%
  backward  ∇f/h = −0.71406       21.75%    54.74%
  centered  δf/(2h) = −0.93438    2.40%     25.00%
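The three difference formulas can be checked against the example; a hedged Python sketch (the course uses MatLab, but the arithmetic is identical):

```python
f = lambda x: -0.1 * x**4 - 0.15 * x**3 - 0.5 * x**2 - 0.25 * x + 1.2
true = -0.9125   # f'(0.5) from the analytical derivative

def diffs(x, h):
    fwd = (f(x + h) - f(x)) / h            # forward difference,  O(h)
    bwd = (f(x) - f(x - h)) / h            # backward difference, O(h)
    ctr = (f(x + h) - f(x - h)) / (2 * h)  # centered difference, O(h^2)
    return fwd, bwd, ctr

print(diffs(0.5, 0.5))    # approximately (-1.45, -0.55, -1.0)
print(diffs(0.5, 0.25))   # approximately (-1.15469, -0.71406, -0.93438)
```

Halving h cuts the centered-difference error by roughly a factor of four, consistent with its O(h²) behavior.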
4.2 – Error Propagation
 The purpose of this section is to investigate how
errors propagate through mathematical functions.
 For example, if we multiply two numbers together,
each with a known error, what is the error of the
product?
 We will look at error propagation for
o Single-variable functions
o Multi-variable functions
o Stability and condition



4.2.1 – Single-Variable
Functions
 Assume we have a function f(x) that is
dependent on a single variable x.
 We have a known value x̃ that is an
approximation to x.
 Then the error in using f(x̃) rather than f(x) is
estimated by:

Δf(x̃) = |f′(x̃)| Δx̃





Example 4.5
 Given x̃ = 2.5 with Δx̃ = 0.01, estimate the
resulting error in the function f(x) = x³.
 Solution: We have f′(x) = 3x².
Hence f′(x̃) = 3(2.5)² = 18.75. Therefore:

Δf(x̃) = |f′(x̃)| Δx̃ = 18.75 × 0.01 = 0.1875

 Predict: (2.5)³ − 0.1875 ≤ x³ ≤ (2.5)³ + 0.1875, i.e.
15.4375 ≤ x³ ≤ 15.8125
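The first-order propagation estimate is a one-liner in code; a hedged Python sketch of Example 4.5:

```python
x, dx = 2.5, 0.01
f  = lambda t: t ** 3
fp = lambda t: 3 * t ** 2        # derivative of x^3

df = abs(fp(x)) * dx             # first-order error estimate |f'(x)| * dx
lo, hi = f(x) - df, f(x) + df
print(f(x), df, (lo, hi))        # 15.625, half-width 0.1875 -> [15.4375, 15.8125]
```

The interval matches the slide's prediction; note this is only a first-order (linearized) bound, not an exact one.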
4.2.2 – Multi-Variable Functions
 The concept of 4.2.1 can be extended to multi-variable
functions:
Δf(x̃₁, x̃₂, ⋯, x̃ₙ) ≅ |∂f/∂x₁| Δx̃₁ + |∂f/∂x₂| Δx̃₂ + ⋯ + |∂f/∂xₙ| Δx̃ₙ

 See Example 4.6 pp. 96-97.



4.2.3 – Stability and Condition
 The condition of a mathematical problem relates to
its sensitivity to changes in input values.
 We say that a computation is numerically
unstable if the uncertainty of the input values is
grossly magnified by the numerical method.
 A Condition Number can be defined by:

Condition Number = x̃ f′(x̃) / f(x̃)
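The condition number is easy to evaluate for concrete functions; a hedged Python sketch (the choice of test points is mine, purely illustrative):

```python
import math

def condition_number(f, fp, x):
    # |x f'(x) / f(x)|: how much a relative input error is amplified
    return abs(x * fp(x) / f(x))

well = condition_number(math.sin, math.cos, 0.1)   # about 1: well-conditioned
ill = condition_number(math.tan, lambda t: 1.0 / math.cos(t) ** 2, 1.57)
print(well, ill)   # tan near pi/2 is ill-conditioned: a huge condition number
```

A condition number near 1 means input and output errors are comparable; a number in the thousands, as for tan just below π/2, means the problem itself magnifies any input uncertainty.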
4.3 – Total Numerical Error
 The Total Numerical Error is the summation of the
truncation and round-off (or chopping) errors.
 Round-off and chopping errors increase with
subtractive cancellation and increased number of
calculations.
 Truncation errors decrease with smaller step size,
but smaller step sizes increase the number of
calculations.
 Hence: There is a trade-off!



4.3 – Total Numerical Error
Truncation errors decrease with smaller step size,
but smaller step sizes increase the number of
calculations. Hence: There is a trade-off!

Figure 4.8 Trade-off


between Truncation
and Round-Off
errors.
Example 4.8
Truncation errors decrease with smaller step size,
but smaller step sizes increase the number of
calculations. Hence: There is a trade-off!

Figure 4.9 Trade-off


between Truncation
and Round-Off errors
in Example 4.8.



4.4 – Blunders, Formulation
Errors and Uncertainty
 Programming blunders are often a cause of
error that is difficult to eliminate.
 Modeling Errors (also called Formulation
Errors) are also difficult to eliminate.
 Data Uncertainty also causes errors, but we
do have some measures for this that can be
very helpful in at least accounting for these
types of errors.



End of Tonight’s Lecture
 Any questions?
 Homework for next week: 3.1, 3.3, 3.6
(note: use e⁻¹⁹ rather than e⁻⁵), 3.13, 4.1
and 4.23
 Quiz for this week is NOW online –
complete before class next week
 Quiz for next week will be online next
Monday.
