
BIT 18 (1978), 415-424

A GLOBALLY CONVERGENT INTERVAL METHOD FOR COMPUTING AND BOUNDING REAL ROOTS

ELDON HANSEN

Abstract. In this paper, we extend the interval Newton method to the case where the interval derivative may contain zero. This extended method will isolate and bound all the real roots of a continuously differentiable function in a given interval. In particular, it will bound multiple roots. We prove that the method never fails to converge.

Key words: interval arithmetic, interval analysis, roots, Newton method, error bounds,
convergence.

1. Introduction.
In [1], R. E. Moore derived an interval extension of Newton's method and showed that, in the neighborhood of a simple root of a real function $f(x)$, it converges quadratically to the root. Nickel [2], [3] has observed that the method converges globally provided the derivative of $f(x)$ is non-zero in some interval containing the root. In this paper, we extend the method to yield a globally convergent method to find (and bound) all real roots in an interval $X_0$. The roots may have any multiplicity, but we assume there are a finite number of them in $X_0$. We assume throughout this paper that the function $f(x)$ is continuously differentiable on $X_0$.

2. Moore's method.
Let $X_i = [a_i, b_i]$ be an interval containing a root $r$ of $f(x)$, and let $x_i$ be the midpoint of $X_i$. Let $f'(X_i)$ be an interval extension (see [1]) of $f'(x)$ for $x \in X_i$ and assume

$$0 \notin f'(X_i). \tag{2.1}$$

Define

$$N(X_i) = x_i - \frac{f(x_i)}{f'(X_i)} \tag{2.2}$$

and let

$$X_{i+1} = X_i \cap N(X_i). \tag{2.3}$$
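One step of (2.2) and (2.3) can be sketched in a few lines of Python. This is a minimal illustration, not the author's program: the names `newton_step`, `f` and `dfX` are hypothetical, intervals are plain `(lo, hi)` tuples, and ordinary floating point is used without outward rounding, so the result is not a rigorous enclosure.

```python
def newton_step(f, dfX, X):
    """One step of (2.2)-(2.3) on X = (a, b), assuming 0 is not in f'(X)."""
    a, b = X
    x = 0.5 * (a + b)                    # midpoint x_i
    c, d = dfX(X)                        # interval extension f'(X_i) = [c, d]
    if c <= 0.0 <= d:
        raise ValueError("0 in f'(X): assumption (2.1) is violated")
    fx = f(x)
    # f(x)/[c, d] has endpoints fx/c and fx/d because [c, d] excludes zero.
    lo, hi = sorted((x - fx / c, x - fx / d))
    # Intersect N(X) with X, as in (2.3).
    lo, hi = max(a, lo), min(b, hi)
    return (lo, hi) if lo <= hi else None


# Illustrative problem (an assumption, not from the paper): f(x) = x^2 - 2.
f = lambda x: x * x - 2.0
dfX = lambda X: (2.0 * X[0], 2.0 * X[1])   # f'(X) = 2X
X1 = newton_step(f, dfX, (1.0, 2.0))
print(X1)
```

Starting from `(1.0, 2.0)`, the returned interval still contains the root of the illustrative function and is much narrower than half the starting width.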

Received April 28, 1978. Revised May 29, 1978.


Moore [1] shows that if $r \in X_0$ and if the sequence $\{X_i\}$ is generated by (2.3), then $r \in X_i$ for all $i$. He also shows that if convergence occurs, it is asymptotically quadratic. That is, if we define the width of $X_i = [a_i, b_i]$ to be

$$w(X_i) = b_i - a_i, \tag{2.4}$$

then

$$w(X_{i+1}) \le C\,[w(X_i)]^2$$


for some constant $C$. Nickel [2], [3] observes that if (2.1) holds, then convergence must occur. See also [6]. Usually, the interval Newton method is defined so that $f'(X_i)$ is an interval extension of $f'(x)$ for $x \in X_i$, and hence $f'(X_i)$ contains the range of $f'(x)$ for $x \in X_i$. In [8], it is shown that the various properties of the interval Newton method discussed in this paper also hold when $f'(X_i)$ does not contain the range of $f'$ over $X_i$, provided $f'(X_i)$ is properly derived. In the derivation in [8], the various occurrences of $x$ in the expression for $f(x)$ are written as separate variables which are equal to $x$. Then $f$ is expanded as a function of several variables. As shown in [8], this yields a more rapidly convergent Newton method because $f'(X_i)$ becomes a narrower interval. An example is given below in Section 9.

3. Extended interval analysis.


If (2.1) does not hold, that is, if

$$0 \in f'(X), \tag{3.1}$$

then the above method cannot be used in ordinary interval arithmetic since (2.2) involves division by an interval containing zero. However, an extended interval analysis can be used which permits such a division operation. For example, see [4], [5]. We shall avoid a general discussion of extended interval analysis and merely present the specific concepts we require. Let $[c_i, d_i]$ denote the interval we get when we evaluate $f'$ with interval argument $X_i$, that is,

$$f'(X_i) = [c_i, d_i]. \tag{3.2}$$

If $0 \in f'(X_i)$, then

$$\frac{1}{f'(X_i)} = \begin{cases} [1/d_i, \infty] & \text{if } c_i = 0, \\ [-\infty, 1/c_i] & \text{if } d_i = 0, \\ [-\infty, 1/c_i] \cup [1/d_i, \infty] & \text{otherwise.} \end{cases} \tag{3.3}$$
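The three cases of (3.3) translate directly into code. The sketch below is illustrative only: the function name `extended_reciprocal` is hypothetical, the result is returned as a list of `(lo, hi)` tuples, and the degenerate interval $[0, 0]$, which (3.3) does not cover, is mapped to the whole real line as an assumption.

```python
import math


def extended_reciprocal(c, d):
    """Return 1/[c, d] as a list of intervals following (3.3), for c <= 0 <= d."""
    if not (c <= 0.0 <= d):
        raise ValueError("expects an interval that contains zero")
    if c == 0.0 and d == 0.0:
        # Not covered by (3.3); treated here as "no information" (assumption).
        return [(-math.inf, math.inf)]
    if c == 0.0:
        return [(1.0 / d, math.inf)]
    if d == 0.0:
        return [(-math.inf, 1.0 / c)]
    return [(-math.inf, 1.0 / c), (1.0 / d, math.inf)]


print(extended_reciprocal(-4.0, 2.0))
```

Returning a list keeps the one-piece and two-piece results uniform, which simplifies the later intersection with $X_i$.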


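If $f(x_i) = 0$, then $x_i$ is a root of $f$. Excluding this case, we have the following two possibilities. If $f(x_i) > 0$, then from (2.2) and (3.3),

$$N(X_i) = \begin{cases} [-\infty, q_i] & \text{if } c_i = 0, \\ [p_i, \infty] & \text{if } d_i = 0, \\ [-\infty, q_i] \cup [p_i, \infty] & \text{otherwise,} \end{cases} \tag{3.4a}$$

where

$$p_i = x_i - f(x_i)/c_i, \qquad q_i = x_i - f(x_i)/d_i.$$

If $f(x_i) < 0$, then

$$N(X_i) = \begin{cases} [q_i, \infty] & \text{if } c_i = 0, \\ [-\infty, p_i] & \text{if } d_i = 0, \\ [-\infty, p_i] \cup [q_i, \infty] & \text{otherwise.} \end{cases} \tag{3.4b}$$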
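The case analysis for $N(X_i)$, followed by the intersection with $X_i$, can be sketched as follows. The function name `extended_newton` and its argument order are hypothetical, the illustrative function $x^2 - 1$ is not from the paper, and plain floating point is used without outward rounding.

```python
import math


def extended_newton(x, fx, c, d, a, b):
    """Return X ∩ N(X) as a list of intervals.

    X = [a, b], x is the expansion point, fx = f(x) != 0, f'(X) = [c, d].
    """
    if c > 0.0 or d < 0.0:
        # Ordinary case (2.2): f'(X) excludes zero.
        pieces = [tuple(sorted((x - fx / c, x - fx / d)))]
    else:
        # A zero endpoint of f'(X) removes the corresponding finite bound.
        p = x - fx / c if c != 0.0 else None
        q = x - fx / d if d != 0.0 else None
        # f(x) > 0: [-inf, q] and [p, inf]; f(x) < 0: [-inf, p] and [q, inf].
        left, right = (q, p) if fx > 0.0 else (p, q)
        pieces = []
        if left is not None:
            pieces.append((-math.inf, left))
        if right is not None:
            pieces.append((right, math.inf))
    out = []
    for lo, hi in pieces:
        lo, hi = max(a, lo), min(b, hi)      # intersect with X
        if lo <= hi:
            out.append((lo, hi))
    return out


# Illustrative data: f(x) = x^2 - 1 on X = [-2, 2.5], so f'(X) = 2X = [-4, 5].
print(extended_newton(0.25, 0.25 ** 2 - 1.0, -4.0, 5.0, -2.0, 2.5))
```

An empty list means that $X_i$ and $N(X_i)$ are disjoint, so the interval can be discarded.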
Forming $X_i \cap N(X_i)$ is a simple set operation. We get a finite set which can be either a single interval, or the union of two intervals, or the empty set. Thus even if $0 \in f'(X_i)$, the result (2.3) is a bounded set. We shall show that the method defined by (2.3) is globally convergent in a sense to be defined below.

4. Proof of convergence when $0 \notin f'(X_0)$.
As stated earlier, if $0 \notin f'(X_0)$, then we have global convergence (see [2], [3]). This follows as a corollary to the following well-known theorem, which states that $w(X_{i+1}) < \tfrac{1}{2} w(X_i)$ for all $i = 0, 1, \ldots$. Thus the interval Newton method converges reasonably rapidly even before the asymptotically quadratic nature is evident.

THEOREM 1. Assume $0 \notin f'(X_0)$. If $X_i$ ($i = 1, 2, \ldots$) is generated by (2.3), then

$$w(X_{i+1}) < \tfrac{1}{2} w(X_i).$$

A proof of this theorem follows easily from the observation that the midpoint $x_i$ of $X_i$ is not in $N(X_i)$ when $x_i$ is not a root of $f(x)$. Hence either $N(X_i) > x_i$ or else $N(X_i) < x_i$, and $X_{i+1} = X_i \cap N(X_i)$ must contain less than half of $X_i$.
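The observation used in this proof can be written out case by case. The following is a worked restatement, assuming $f'(X_i) = [c_i, d_i]$ with $0 \notin [c_i, d_i]$ and $f(x_i) \ne 0$; the symbols $c_i$, $d_i$ are those of (3.2).

```latex
\begin{align*}
c_i > 0,\ f(x_i) > 0 &: \quad
  N(X_i) = \Bigl[x_i - \tfrac{f(x_i)}{c_i},\ x_i - \tfrac{f(x_i)}{d_i}\Bigr],
  \quad \sup N(X_i) < x_i, \\
c_i > 0,\ f(x_i) < 0 &: \quad
  N(X_i) = \Bigl[x_i - \tfrac{f(x_i)}{d_i},\ x_i - \tfrac{f(x_i)}{c_i}\Bigr],
  \quad \inf N(X_i) > x_i, \\
d_i < 0,\ f(x_i) > 0 &: \quad
  N(X_i) = \Bigl[x_i - \tfrac{f(x_i)}{c_i},\ x_i - \tfrac{f(x_i)}{d_i}\Bigr],
  \quad \inf N(X_i) > x_i, \\
d_i < 0,\ f(x_i) < 0 &: \quad
  N(X_i) = \Bigl[x_i - \tfrac{f(x_i)}{d_i},\ x_i - \tfrac{f(x_i)}{c_i}\Bigr],
  \quad \sup N(X_i) < x_i.
\end{align*}
```

In each case $N(X_i)$ lies strictly on one side of the midpoint, so $X_i \cap N(X_i)$ is contained in $[a_i, x_i)$ or in $(x_i, b_i]$, and its width is less than $\tfrac{1}{2} w(X_i)$.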

5. Proof of convergence when $0 \in f'(X)$.
We shall now prove that the interval Newton method is convergent even when $0 \in f'(X_0)$. The method is more complicated in this case, however, since even if $X_i$ is an interval, $X_{i+1}$ may be a disjoint set composed of two intervals. When this occurs, we assume the interval Newton method is applied separately to each of the two intervals. These (or subsequent intervals) may again give rise to two new smaller intervals; that is, the number of intervals may grow. In practice the number of intervals should not be much larger than the number of real roots of $f$ in $X_0$.

Suppose that before the $i$th step, the set $X_i$ is composed of $s_i$ disjoint intervals. We shall denote them by $X_i^{(j)}$ ($j = 1, \ldots, s_i$). The $i$th step involves only one of these intervals. Nevertheless, we shall rename the other unchanged intervals so that just after the $i$th step, the set of intervals is $X_{i+1}^{(j)}$ ($j = 1, \ldots, s_{i+1}$). We have $s_{i+1} = s_i + 1$ if two disjoint intervals are formed; $s_{i+1} = s_i$ if the interval is merely reduced in size; or $s_{i+1} = s_i - 1$ if $N(X_i^{(j)})$ and $X_i^{(j)}$ are disjoint. The latter case can occur only if no root of $f(x)$ lies in $X_i^{(j)}$. Assume that the set of intervals is labeled so that

$$w(X_i^{(j+1)}) \le w(X_i^{(j)}) \qquad (j = 1, \ldots, s_i - 1).$$

Thus $w(X_i^{(1)})$ is at least as large as any other $w(X_i^{(j)})$ where $j = 2, \ldots, s_i$. We now state and prove a theorem showing that the method always converges.

THEOREM 2. Let an initial interval $X_0$ be given and let new intervals be obtained beginning with $X_0$ using (2.3) with $N$ defined by (3.4a) and (3.4b). Assume that at the $i$th step ($i = 1, 2, \ldots$), the method is applied to the interval $X_i^{(1)}$ of largest width. Also assume that $f(x)$ and its derivative $f'(x)$ each has a finite number of zeros in $X_0$. Then for sufficiently large values of $i$,

$$w_i < \varepsilon \tag{5.1}$$

where $w_i = \sum_{j=1}^{s_i} w(X_i^{(j)})$ and $\varepsilon > 0$ is arbitrary.
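The "widest interval first" rule of Theorem 2 maps naturally onto a priority queue. The sketch below is one possible arrangement under several assumptions that are not in the paper: the names `extended_step` and `bound_roots` are hypothetical, an interval is set aside once its own width falls below `tol` (a per-interval rule rather than the total-width test (5.1)), an exact zero at the midpoint is handled by bisection, and floating point is used without outward rounding, so the output is not a guaranteed enclosure.

```python
import heapq
import math


def extended_step(f, dfX, X):
    """Return X ∩ N(X) as a list of intervals, expanding about the midpoint."""
    a, b = X
    x = 0.5 * (a + b)
    fx = f(x)
    if fx == 0.0:
        return [(a, x), (x, b)]          # midpoint is a root: bisect (assumption)
    c, d = dfX(X)
    if c > 0.0 or d < 0.0:
        pieces = [tuple(sorted((x - fx / c, x - fx / d)))]
    else:
        p = x - fx / c if c != 0.0 else None
        q = x - fx / d if d != 0.0 else None
        left, right = (q, p) if fx > 0.0 else (p, q)
        pieces = []
        if left is not None:
            pieces.append((-math.inf, left))
        if right is not None:
            pieces.append((right, math.inf))
    out = []
    for lo, hi in pieces:
        lo, hi = max(a, lo), min(b, hi)
        if lo <= hi:
            out.append((lo, hi))
    return out


def bound_roots(f, dfX, X0, tol=1e-9, max_steps=1000):
    """Process the widest remaining interval at each step."""
    heap = [(-(X0[1] - X0[0]), X0)]      # negated width gives a max-heap
    done, steps = [], 0
    while heap and steps < max_steps:
        _, X = heapq.heappop(heap)
        steps += 1
        for Y in extended_step(f, dfX, X):
            if Y[1] - Y[0] < tol:
                done.append(Y)           # narrow enough: set aside
            else:
                heapq.heappush(heap, (-(Y[1] - Y[0]), Y))
    return sorted(done + [X for _, X in heap])


# Illustrative problem (not from the paper): f(x) = x^2 - 1 on [-2, 2.5].
result = bound_roots(lambda x: x * x - 1.0,
                     lambda X: (2.0 * X[0], 2.0 * X[1]),
                     (-2.0, 2.5))
print(result)
```

On this illustrative problem the first step already splits the initial interval in two, since $f'(X_0) = [-4, 5]$ contains zero; the piece that contains no root is discarded when its Newton image misses it.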

Before proving this theorem, we introduce some notation and a concept which will enable us to simplify the proof. For notational simplicity we shall denote $X_i^{(1)}$ by $X_i$. Let $F'(X_i)$ denote the set of values of $f'(x)$ as $x$ ranges over $X_i$; that is,

$$F'(X_i) = \{f'(x) : x \in X_i\}.$$

As shown in [1],

$$f'(X_i) \supseteq F'(X_i),$$

but in practice, the computed interval $f'(X_i)$ is generally not equal to $F'(X_i)$. It is also shown in [1] that if $f'(x_i)$ is a rational function of $x_i$ and if $f'(X_i)$ is obtained by simply replacing $x_i$ by $X_i$ and doing the necessary rational operations in interval arithmetic, then

$$w[f'(X_i)] - w[F'(X_i)] = O[w(X_i)]. \tag{5.2}$$

That is, the amount by which the length of $f'(X_i)$ exceeds that of $F'(X_i)$ tends to zero as the length of $X_i$ tends to zero. Actually, the right member of (5.2) can be replaced by $O[w(X_i)^2]$ if the computation of $f'(X_i)$ is done using special procedures such as the "centered form" (see [1] and Chapter 10 of [7]). When the method in [8] is used, the interval $f'(X_i)$ is replaced by a subset of itself.

We begin the proof of Theorem 2 by noting that if $0 \notin f'(X_i)$, then the Newton step (2.3) generates a single new interval, say $X'_i$ (or else the empty set). By Theorem 1,

$$w(X'_i) < \tfrac{1}{2} w(X_i). \tag{5.3}$$

Thus the width of the largest interval is reduced by more than half by the $i$th step.

Without loss of generality, assume hereafter that $f(x_i) > 0$. Then $N(X_i)$ is given by (3.4a). If $c_i = 0$, the $i$th step produces the new interval

$$X'_i = X_i \cap N(X_i) = X_i \cap [-\infty, q_i].$$

But $q_i < x_i$, so again we find that (5.3) holds. Similarly, if $d_i = 0$, we again obtain (5.3). Thus (5.3) holds if $c_i$ is non-negative or if $d_i$ is non-positive. The remaining case is $c_i < 0 < d_i$. In this case, using (3.4a),

$$X'_i = ([-\infty, q_i] \cup [p_i, \infty]) \cap [a_i, b_i].$$

We shall distinguish several sub-cases. First, however, note that $q_i < x_i < b_i$ and $a_i < x_i < p_i$, so that $X'_i$ is strictly contained in $X_i$. Thus we already see that in all cases, the total width

$$w_i = \sum_{j=1}^{s_i} w(X_i^{(j)})$$

of the remaining intervals is a monotonically strictly decreasing function of $i$. Since $w_i$ is non-negative, it has a limit as $i \to \infty$. It remains to show that the limit is zero.

As a first sub-case, suppose $q_i < a_i$ and $b_i < p_i$. Then $X'_i$ is empty and (5.3) certainly holds. Note that this implies that $X_i$ does not contain a root of $f$. Next suppose $a_i \le q_i$ and $b_i < p_i$. Then

$$X'_i = [a_i, q_i] \subset [a_i, x_i]$$

so (5.3) again holds. Similarly, if $q_i < a_i$ and $p_i \le b_i$, then (5.3) holds. In all the previous cases, $X'_i$ was a single interval (or empty) and (5.3) held. The only remaining case is when $a_i \le q_i < x_i < p_i \le b_i$, so that

$$X'_i = [a_i, q_i] \cup [p_i, b_i].$$

In this case, (5.3) need not hold and we require a more refined argument to prove convergence. Note that $q_i < x_i$ and $p_i > x_i$. Hence $x_i \notin X'_i$ and so each of the disjoint intervals composing $X'_i$ has width less than half that of $X_i$. Therefore, for sufficiently large $i$, all the remaining intervals are of arbitrarily small width. What we must show is that their total width approaches zero.

We have assumed that $f(x)$ has a finite number of zeros in the initial interval $X_0$. Let $\delta$ denote the smallest distance between any two distinct zeros. (Note that we allow multiple zeros.) Assume the algorithm has proceeded until $i$ is so large that the largest interval $X_i^{(1)}$ has width less than $\delta$. This eventuality must occur since, as we have seen, each remaining subinterval becomes arbitrarily small.

By the hypothesis of Theorem 2, the number of zeros of $f'(x)$ in $X_0$ is finite. Hence only a finite number of intervals $X_i^{(j)}$ can contain a zero of $f'(x)$, and their total length approaches zero as $i \to \infty$. The remaining intervals, which do not contain a zero of $f'(x)$, should be such that $0 \notin f'(X)$. If this is the case, then as we have seen, their widths go rapidly to zero. However, as pointed out earlier, the computed interval $f'(X)$ is generally larger than the true range $F'(X)$. Hence it is possible that $0 \in f'(X)$ even though there is no point $x \in X$ such that $f'(x) = 0$. We see from equation (5.2) that the possibility of this happening decreases as the width of $X$ goes to zero. Asymptotically, as the width of the largest remaining interval goes to zero, the set of remaining intervals $X_i^{(j)}$ for which $0 \in f'(X_i^{(j)})$ is the same as the set of intervals containing a zero of $f'(x)$. We have seen that the total width of these intervals approaches zero as $i \to \infty$. We have also seen that the total width of the other remaining intervals approaches zero. Thus the theorem is proved. ∎

6. Elimination of intervals.
It sometimes happens that $X_{i+1}$ is empty. It is of interest to consider when this occurs. The following theorem is relevant.

THEOREM 3. Assume $|f(x)| \ge \delta > 0$ for all $x \in X_i = [a_i, b_i]$ and that $|f'(X_i)| \le M$. Then $X_i$ will be entirely eliminated in $m$ steps where $m \le (b_i - a_i)M/(2\delta)$.

We shall prove this theorem in a special case in which $f'(X_i)$ contains zero but is non-negative. The general case follows in the same way. Thus we assume $f'(X_i) = [0, d_i]$. Without loss of generality, we also assume that $f(x)$ is positive for $x \in X_i$. Thus by assumption, $f(x) \ge \delta > 0$ for all $x \in X_i$. In this case $N(X_i)$ is given by (3.4a) as $N(X_i) = [-\infty, x_i - f(x_i)/d_i]$. Since $f(x_i)/d_i > 0$, it follows that $N(X_i) < x_i$ and hence the right half of $X_i$ is eliminated in one step. Either the left half of $X_i$ is also eliminated or else the subinterval $[x_i - f(x_i)/d_i, x_i]$ is eliminated. This subinterval has width $w = f(x_i)/d_i$ and, from the hypotheses of the theorem, $w \ge \delta/M$. The bound $\delta$ holds for all of $X_i$ and, by inclusion monotonicity (see [1]), the bound $M$ will hold for any subset of $X_i$. Hence at each subsequent step, we either eliminate all of the remaining subset of $X_i$ or else we again eliminate a subinterval of width $\ge \delta/M$.


The total width of the left half of $X_i$ is $(b_i - a_i)/2$ and a subset of width $\ge \delta/M$ is eliminated at each step. Thus the conclusion of the theorem follows. ∎
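The bound of Theorem 3 can be compared with an actual step count on a small case that matches the special situation of the proof. Everything in this sketch is an illustrative assumption: the function $f(x) = x^2 + 0.01$ on $[0, 1]$, for which $f'(X) = 2X = [0, 2]$, $\delta = 0.01$ and $M = 2$, as well as the helper names `extended_step` and `steps_to_eliminate`. Plain floating point is used.

```python
import math


def extended_step(f, dfX, X):
    """Return X ∩ N(X) as a list of 0, 1 or 2 intervals (midpoint expansion)."""
    a, b = X
    x = 0.5 * (a + b)
    fx = f(x)                 # nonzero here because |f| >= delta > 0
    c, d = dfX(X)
    if c > 0.0 or d < 0.0:
        pieces = [tuple(sorted((x - fx / c, x - fx / d)))]
    else:
        p = x - fx / c if c != 0.0 else None
        q = x - fx / d if d != 0.0 else None
        left, right = (q, p) if fx > 0.0 else (p, q)
        pieces = []
        if left is not None:
            pieces.append((-math.inf, left))
        if right is not None:
            pieces.append((right, math.inf))
    out = []
    for lo, hi in pieces:
        lo, hi = max(a, lo), min(b, hi)
        if lo <= hi:
            out.append((lo, hi))
    return out


def steps_to_eliminate(f, dfX, X, max_steps=10_000):
    """Count Newton steps until no part of X remains."""
    work, steps = [X], 0
    while work and steps < max_steps:
        Y = work.pop()
        work.extend(extended_step(f, dfX, Y))
        steps += 1
    return steps


delta, M = 0.01, 2.0
f = lambda x: x * x + delta               # no real roots, f >= delta
dfX = lambda X: (2.0 * X[0], 2.0 * X[1])  # f'(X) = 2X
X = (0.0, 1.0)
steps = steps_to_eliminate(f, dfX, X)
bound = (X[1] - X[0]) * M / (2.0 * delta)
print(steps, bound)
```

In this case the count is far below the bound, which is consistent with the theorem giving a worst-case estimate.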

7. Evaluation of f (x).
In the preceding section, we implicitly assumed $f(x)$ was known exactly. In practice we evaluate $f(x)$ using interval arithmetic. If the resulting interval contains zero and if $f'(X)$ also contains zero, then $N(X) = [-\infty, \infty]$ and $X' = X$. Hence there is a practical limit on how accurately we can bound a root. When $0 \in f'(X)$, we can only reduce the size of $X$ if the interval computed for $f(x)$ does not contain zero. However, even if $0 \notin f'(X)$, we eventually reach a stage where $X' = X$ when using finite precision interval arithmetic. For a multiple root, this will occur for a larger interval $X$, in general, since the condition of a multiple root tends to be worse than that of a simple one.
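The effect described here is easy to reproduce with a small outward-rounded interval evaluation. The sketch below is an illustration under stated assumptions: it needs Python 3.9 or later for `math.nextafter`, widens every result by one floating-point step in each direction as a simple stand-in for directed rounding, and uses the polynomial $x^3 - x^2 - x + 1$ of Section 9 with a hypothetical evaluation point close to its double root. The helper names are not from the paper.

```python
import math


def _down(v):
    return math.nextafter(v, -math.inf)


def _up(v):
    return math.nextafter(v, math.inf)


def iadd(X, Y):
    return (_down(X[0] + Y[0]), _up(X[1] + Y[1]))


def isub(X, Y):
    return (_down(X[0] - Y[1]), _up(X[1] - Y[0]))


def imul(X, Y):
    prods = [X[0] * Y[0], X[0] * Y[1], X[1] * Y[0], X[1] * Y[1]]
    return (_down(min(prods)), _up(max(prods)))


def F(x):
    """Interval enclosure of f(x) = x^3 - x^2 - x + 1 at the float x."""
    X = (x, x)
    X2 = imul(X, X)
    X3 = imul(X2, X)
    return iadd(isub(isub(X3, X2), X), (1.0, 1.0))


near = F(1.0 + 1e-9)   # exact value is about 2e-18, far below rounding error
far = F(1.5)           # exact value is 0.625
print(near, far)
```

Close to the double root the enclosure straddles zero, so the sign of $f(x)$ cannot be decided, while away from the root the enclosure is tight and strictly positive.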

8. Termination.
In practice, we use finite precision arithmetic. Hence it is not possible to isolate a root to arbitrary accuracy. We now consider how termination conditions should be used to account for this fact. If the tolerance $\varepsilon$ is sufficiently large and the error criterion (5.1) is satisfied, we will of course stop the algorithm. However, if $\varepsilon$ is too small, (5.1) may never be satisfied. We wish our program to recognize when it has done the best job it can and automatically stop.

The source of the difficulty is that we cannot evaluate $f(x)$ exactly. To assure that, without question, the set of intervals $X_i^{(j)}$ contains all the roots of $f(x)$ which lie in $X_0$, we must evaluate $f(x)$ in interval arithmetic to bound rounding errors. Let $F(x)$ denote the interval we obtain when evaluating $f(x)$. Suppose $f$ has a root near the center $x$ of an interval $X$ and we obtain $F(x) = [\varepsilon_L, \varepsilon_R]$ where $\varepsilon_L < 0$ and $\varepsilon_R > 0$. Suppose also that we find $0 \in f'(X)$. Then $N(X) = [-\infty, \infty]$ and the Newton step does not reduce $X$ in size. This could happen when $w(X)$ is not small, so that we have not isolated a root of $f$ even though there is probably a root near $x$ because $0 \in F(x)$.

One thing we can do in this case is simply to divide $X$ in half and add each half to our list of intervals. It is possible to make use of the fact that a root exists near $x$. We learn this fact when we find that $F(x)$ contains zero. We shall not discuss the details of how this might be done.

Simply dividing $X$ in half does not reduce the total length of all remaining intervals. Thus we might repeat this step ad infinitum and never satisfy the termination condition (5.1). Hence if we find that $0 \in F(x)$ and $0 \in f'(X)$ and the width of the current interval $X$ is less than $\varepsilon$, we stop subdividing $X$. Our termination condition is subsequently applied only to the remaining intervals exclusive of $X$. A simple procedure is to print out $X$ and remove it from the list of intervals yet to be processed.
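The decision rule just described can be summarized as a small function. This is only a sketch of the bisect-or-report logic; the function name `next_action`, the string labels it returns, and the `(lo, hi)` tuple representation are all hypothetical.

```python
def next_action(Fx, dfX, X, eps):
    """Choose the next operation for X.

    Fx  : interval enclosure F(x) of f at the expansion point, as (lo, hi)
    dfX : interval derivative f'(X), as (lo, hi)
    X   : current interval, as (lo, hi)
    eps : tolerance on the width of X
    """
    zero_in_F = Fx[0] <= 0.0 <= Fx[1]
    zero_in_df = dfX[0] <= 0.0 <= dfX[1]
    if not (zero_in_F and zero_in_df):
        return "newton"     # N(X) is not the whole line; take a Newton step
    if X[1] - X[0] < eps:
        return "report"     # output X and drop it from the work list
    return "bisect"         # split X in half and keep both halves


print(next_action((-1e-16, 1e-16), (-1.0, 1.0), (0.0, 1.0), 1e-6))
```

Separating the decision from the arithmetic makes it easy to swap the `"bisect"` branch for another strategy, such as re-expanding about an endpoint.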


Another procedure that we might use instead, when $0 \in F(x)$ and $0 \in f'(X)$, is the following. Repeat the Newton step letting $x$ be an endpoint of $X$ instead of the center of $X$. If we still have $0 \in F(x)$, repeat the step again using the other endpoint. Generally, this procedure will eventually yield the best possible interval. However, it is possible that an interval $X$ contains a root at its center and roots at each endpoint. Thus this alternative procedure would fail to isolate them from one another. A cautious person might prefer to evaluate $f$ at additional points in $X$. On the other hand, if $\varepsilon$ is much too small, simply dividing $X$ in half when $0 \in F(x)$ and $0 \in f'(X)$ can result in many fruitless subdividing steps. Hence the second procedure is more attractive.

9. An example.
We now present a numerical example which illustrates how our method can isolate separated roots and bound either simple or multiple roots. Consider the polynomial

$$f(x) = (x - 1)^2 (x + 1) = x^3 - x^2 - x + 1 \tag{9.1}$$

which has a simple root at $x = -1$ and a double root at $x = 1$. The derivative of $f(x)$ is

$$f'(x) = 3x^2 - 2x - 1.$$

Using the method described in [8], we can replace the interval derivative of $f$ by

$$f'(X) = x^2 - x - 1 + X(X + x - 1) \tag{9.2}$$

where $x$ is the center of $X$. This is accomplished by rewriting $f(x)$ as a function

$$g(x_1, x_2, x_3) = x_1 x_2 x_3 - x_1 x_2 - x_1 + 1,$$

so that $g(x, x, x) = f(x)$. The multidimensional interval Newton method (see [8]) is applied to the function $g$. Then $x_1$, $x_2$, and $x_3$ are set equal to $x$. The result is of the form of the one-dimensional method with $f'(X)$ given by (9.2). Note that for a given interval $X$, $f'(X)$ as given by (9.2) does not generally contain the range of $f'(x)$ for $x \in X$.

Let the initial interval be $X_0 = [-2, 2.5]$ and note that $X_0$ contains both the simple and double root. The center of $X_0$ is $x_0 = 0.25$. From (9.2),

$$f'(X_0) = [-8.07, 4.32].$$

For brevity, we record intermediate results to only three significant digits. However, our calculations were done using twelve decimal digit arithmetic on the HP9830A computer. Since $0 \in f'(X_0)$ and $f(x_0) > 0$, we obtain two semi-infinite intervals from (3.4a). These intervals are

$$N_1(x_0, X_0) = [-\infty, 0.0870], \qquad N_2(x_0, X_0) = [0.337, \infty].$$
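The numbers of this first step can be re-derived with a few lines of code. The sketch below evaluates (9.2) at $X_0 = [-2, 2.5]$ and then the two finite endpoints $q_0 = x_0 - f(x_0)/d_0$ and $p_0 = x_0 - f(x_0)/c_0$ of the semi-infinite intervals. The helper names `imul` and `df_92` are hypothetical, and ordinary double precision is used instead of the twelve-digit decimal arithmetic of the original computation, so only agreement to about three digits should be expected.

```python
def imul(X, Y):
    """Interval product of X and Y, both given as (lo, hi) tuples."""
    prods = [X[0] * Y[0], X[0] * Y[1], X[1] * Y[0], X[1] * Y[1]]
    return (min(prods), max(prods))


def df_92(X):
    """Interval derivative (9.2): x^2 - x - 1 + X(X + x - 1), x = center of X."""
    a, b = X
    x = 0.5 * (a + b)
    s = x * x - x - 1.0
    inner = (a + x - 1.0, b + x - 1.0)    # X + x - 1
    prod = imul(X, inner)
    return (s + prod[0], s + prod[1])


def f(x):
    return (x - 1.0) ** 2 * (x + 1.0)


X0 = (-2.0, 2.5)
x0 = 0.5 * (X0[0] + X0[1])
c, d = df_92(X0)
fx = f(x0)
q = x0 - fx / d     # upper end of the left semi-infinite interval
p = x0 - fx / c     # lower end of the right semi-infinite interval
print((c, d), q, p)
```

The computed derivative interval lies inside the rounded interval quoted above, and `q` and `p` agree with the three-digit endpoints $0.0870$ and $0.337$.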


Intersecting these intervals with $X_0$, we obtain

$$X_1^{(1)} = [-2, 0.0870], \qquad X_1^{(2)} = [0.337, 2.5].$$

Let us first process $X_1^{(1)}$, which contains the simple root only. We find $f'(X_1^{(1)}) > 0$ so that the interval Newton method can be applied in the usual way. We obtain $[-1.28, -0.975]$. Three additional steps yield the interval $[-1.000000038, -0.999999996]$.

Suppose we had started with the ordinary interval Newton method using (2.2) and merely subdivided any interval for which $f'(X)$ contained zero. We would first have subdivided $X_0$ into (say) $X'_1 = [-2, 0.25]$ and $X'_2 = [0.25, 2.5]$. We would find $0 \in f'(X'_1)$ and would have to subdivide again. Thus the entire initial interval would remain (but be subdivided into three subintervals). In contrast, two steps of our extended method eliminate 45% of the initial interval.

Let us now continue with $X_1^{(2)}$, which contains the double root. Convergence to this root is slow because it is at a linear rate. We chose $\varepsilon$ so small that the convergence criterion $w_i < \varepsilon$ could not be satisfied. We chose the option of using an endpoint (or endpoints) of $X$ instead of the midpoint whenever both $0 \in F(x)$ and $0 \in f'(X)$. It was necessary to use an endpoint four times. After getting $X_1^{(2)}$, our algorithm terminated after an additional 33 evaluations of $f$ and 29 evaluations of $f'(X)$ and obtained the final interval $[1.000004232, 1.000005406]$ bounding the double root at $x = 1$. The Newton method using real arithmetic rather than interval arithmetic requires 17 steps to get as accurate a result when starting from the midpoint of $X_1^{(2)}$. Thus the interval method is slower, but it provides guaranteed error bounds.

10. Conclusion.
We have shown that the interval Newton method can be extended to include the case in which the interval derivative contains zero. We have proved that the extended method always converges. This type of generalization can also be done for the multidimensional interval Newton method; see [9]. Our method cannot be used directly to find complex roots. This is because the mean value theorem (which is used to derive the interval Newton method) does not hold in the complex plane. However, the complex case can be separated into real and imaginary parts. Then the method in [9] can be used.

REFERENCES

1. Ramon E. Moore, Interval analysis, Prentice-Hall, 1966.
2. Karl Nickel, Triplex-Algol and its applications, in Topics in interval analysis (edited by E. R. Hansen), Oxford Press, 1969.
3. Karl Nickel, On the Newton method in interval analysis, Univ. of Wisconsin, Mathematics Research Center Report 1136, Dec., 1971.
4. W. M. Kahan, A more complete interval arithmetic, Lecture notes for a summer course at the Univ. of Michigan, 1968.
5. Richard J. Hanson, Interval arithmetic as a closed arithmetic system on a computer, Jet Propulsion Lab. Report 197, June, 1968.
6. G. Alefeld and J. Herzberger, Einführung in die Intervallrechnung, Bibliographisches Institut Mannheim - Wien - Zürich, 1974.
7. Eldon Hansen (ed.), Topics in interval analysis, Oxford University Press, 1969.
8. Eldon Hansen, Interval forms of Newton's method, to appear in vol. 19 of Computing.
9. Eldon Hansen, Bounding solutions of systems of equations using interval analysis, to be submitted.

DEPARTMENT OF PURE AND APPLIED MATHEMATICS
WASHINGTON STATE UNIVERSITY
PULLMAN, WASHINGTON 99164
U.S.A.
