
Numerical

Polynomial

Algebra

Hans J. Stetter

Institute for Applied and Numerical Mathematics Vienna University of Technology Vienna, Austria


Society for Industrial and Applied Mathematics Philadelphia

Copyright © 2004 by the Society for Industrial and Applied Mathematics.

10 9 8 7 6 5 4 3 2 1

All rights reserved. Printed in the United States of America. No part of this book may be reproduced, stored, or transmitted in any manner without the written permission of the publisher. For information, write to the Society for Industrial and Applied Mathematics, 3600 University City Science Center, Philadelphia, PA 19104-2688.

MAPLE is a registered trademark of Waterloo Maple Inc.

MATLAB is a registered trademark of The MathWorks, Inc. For MATLAB product information, please contact: The MathWorks, Inc., 3 Apple Hill Drive, Natick, MA 01760-2098 USA, 508-647-7000, Fax: 508-647-7101, info@mathworks.com, www.mathworks.com/

Mathematica is a registered trademark of Wolfram Research, Inc.

Library of Congress Cataloging-in-Publication Data

Stetter, Hans J., 1930- Numerical polynomial algebra / Hans J. Stetter. p. cm. Includes bibliographical references and index. ISBN 0-89871-557-1 (pbk.) 1. Polynomials. 2. Numerical analysis. I. Title.

QA161.P59 S74 2004

512.9'422-dc22

2004041691

About the cover: The cover art shows the discretized image of the variety of a pseudofactorizable polynomial in three variables; cf. Example 7.13 and Figure 7.5 for the varieties of the pseudofactors.

Contents

Preface
Acknowledgments

I Polynomials and Numerical Analysis

1 Polynomials
  1.1 Linear Spaces of Polynomials
  1.2 Polynomials as Functions
  1.3 Rings and Ideals of Polynomials
    1.3.1 Polynomial Rings
    1.3.2 Polynomial Ideals
  1.4 Polynomials and Affine Varieties
  1.5 Polynomials in Scientific Computing
    1.5.1 Polynomials in Scientific and Industrial Applications

2 Representations of Polynomial Ideals
  2.1 Ideal Bases
  2.2 Quotient Rings of Polynomial Ideals
    2.2.1 Linear Spaces of Residue Classes and Their Multiplicative Structure
    2.2.2 Commuting Families of Matrices
  2.3 Dual Spaces of Polynomial Ideals
    2.3.1 Dual Vector Spaces
    2.3.2 Dual Spaces of Quotient Rings
  2.4 The Central Theorem of Polynomial Systems Solving
    2.4.1 Basis Transformations in R and D
    2.4.2 A Preliminary Version
    2.4.3 The General Case
  2.5 Normal Sets and Border Bases
    2.5.1 Monomial Bases of a Quotient Ring
    2.5.2 Border Bases of Polynomial Ideals
    2.5.3 Groebner Bases
    2.5.4 Polynomial Interpolation

3 Polynomials with Coefficients of Limited Accuracy
  3.1 Data of Limited Accuracy
    3.1.1 Empirical Data
    3.1.2 Empirical Polynomials
    3.1.3 Valid Approximate Results
  3.2 Estimation of the Result Indetermination
    3.2.1 Well-Posed and Ill-Posed Problems
    3.2.2 Condition of an Empirical Algebraic Problem
    3.2.3 Linearized Estimation of the Result Indetermination
  3.3 Backward Error of Approximate Results
    3.3.1 Determination of the Backward Error
    3.3.2 Transformations of an Empirical Polynomial
  3.4 Refinement of Approximate Results

4 Approximate Numerical Computation
  4.1 Solution Algorithms for Numerical Algebraic Problems
  4.2 Numerical Stability of Computational Algorithms
    4.2.1 Generation and Propagation of Computational Errors
    4.2.2 Numerical Stability
    4.2.3 Causes for Numerical Instability
  4.3 Floating-Point Arithmetic
    4.3.1 Floating-Point Numbers
    4.3.2 Arithmetic with Floating-Point Numbers
    4.3.3 Floating-Point Errors
    4.3.4 Local Use of Higher Precision
  4.4 Use of Intervals
    4.4.1 Interval Arithmetic
    4.4.2 Validation Within Intervals
    4.4.3 Interval Mathematics and Scientific Computing

II Univariate Polynomial Problems

5 Univariate Polynomials
  5.1 Intrinsic Polynomials
    5.1.1 Some Analytic Properties
    5.1.2 Spaces of Polynomials
    5.1.3 Some Algebraic Properties
    5.1.4 The Multiplicative Structure
    5.1.5 Numerical Determination of Zeros of Intrinsic Polynomials
  5.2 Zeros of Empirical Univariate Polynomials
    5.2.1 Backward Error of Polynomial Zeros
    5.2.2 Pseudozero Domains for Univariate Polynomials
    5.2.3 Zeros with Large Modulus
  5.3 Polynomial Division
    5.3.1 Sensitivity Analysis of Polynomial Division
    5.3.2 Division of Empirical Polynomials
  5.4 Polynomial Interpolation
    5.4.1 Classical Representations of Interpolation Polynomials
    5.4.2 Sensitivity Analysis of Univariate Polynomial Interpolation
    5.4.3 Interpolation Polynomials for Empirical Data

6 Various Tasks with Empirical Univariate Polynomials
  6.1 Algebraic Predicates
    6.1.1 Algebraic Predicates for Empirical Data
    6.1.2 Real Polynomials with Real Zeros
    6.1.3 Stable Polynomials
  6.2 Divisors of Empirical Polynomials
    6.2.1 Divisors and Zeros
    6.2.2 Sylvester Matrices
    6.2.3 Refinement of an Approximate Factorization
    6.2.4 Multiples of Empirical Polynomials
  6.3 Multiple Zeros and Zero Clusters
    6.3.1 Intuitive Approach
    6.3.2 Zero Clusters of Empirical Polynomials
    6.3.3 Cluster Polynomials
    6.3.4 Multiple Zeros of Empirical Polynomials
    6.3.5 Zero Clusters about Infinity
  6.4 Greatest Common Divisors
    6.4.1 Intrinsic Polynomial Systems in One Variable
    6.4.2 Empirical Polynomial Systems in One Variable
    6.4.3 Algorithmic Determination of Approximate Common Divisors
    6.4.4 Refinement of Approximate Common Zeros and Divisors
    6.4.5 Example

III Multivariate Polynomial Problems

7 One Multivariate Polynomial
  7.1 Analytic Aspects
    7.1.1 Intuitional Difficulties with Real and Complex Data
    7.1.2 Taylor Approximations
    7.1.3 Nearest Points on a Manifold
  7.2 Empirical Multivariate Polynomials
    7.2.1 Valid Results for Empirical Polynomials
    7.2.2 Pseudozero Sets of Empirical Multivariate Polynomials
    7.2.3 Condition of Zero Manifolds
  7.3 Singular Points on Algebraic Manifolds
    7.3.2 Determination of Singular Zeros
    7.3.3 Manifold Structure at a Singular Point
  7.4 Numerical Factorization of a Multivariate Polynomial
    7.4.1 Analysis of the Problem
    7.4.2 An Algorithmic Approach
    7.4.3 Algorithmic Details
    7.4.4 Condition of a Multivariate Factorization

8 Zero-Dimensional Systems of Multivariate Polynomials
  8.1 Quotient Rings and Border Bases of 0-Dimensional Ideals
    8.1.1 The Quotient Ring of a Specified Dual Space
    8.1.2 The Ideal Generated by a Normal Set Ring
    8.1.3 Quasi-Univariate Normal Sets
  8.2 Normal Set Representations of 0-Dimensional Ideals
    8.2.1 Computation of Normal Forms and Border Basis Expansions
    8.2.2 The Syzygies of a Border Basis
    8.2.3 Admissible Data for a Normal Set Representation
  8.3 Regular Systems of Polynomials
    8.3.1 Complete Intersections
    8.3.2 Continuity of Polynomial Zeros
    8.3.3 Expansion by a Complete Intersection System
    8.3.4 Number of Zeros of a Complete Intersection System
  8.4 Groebner Bases
    8.4.1 Term Order and Order-Based Reduction
    8.4.2 Groebner Bases
    8.4.3 Direct Characterization of Reduced Groebner Bases
    8.4.4 Discontinuous Dependence of Groebner Bases on P
  8.5 Multiple Zeros of Intrinsic Polynomial Systems
    8.5.1 Dual Space of a Multiple Zero
    8.5.2 Normal Set Representation for a Multiple Zero
    8.5.3 From Multiplication Matrices to Dual Space

9 Systems of Empirical Multivariate Polynomials
  9.1 Regular Systems of Empirical Polynomials
    9.1.1 Backward Error of Polynomial Zeros
    9.1.2 Pseudozero Domains for Multivariate Empirical Systems
    9.1.3 Feasible Normal Sets for Regular Empirical Systems
    9.1.4 Sets of Ideals of System Neighborhoods
  9.2 Approximate Representations of Polynomial Ideals
    9.2.1 Approximate Normal Set Representations
    9.2.2 Refinement of an Approximate Normal Set Representation
    9.2.3 Refinement Towards the Exact Representation
  9.3 Multiple Zeros and Zero Clusters
    9.3.1 Approximate Dual Space for a Zero Cluster
    9.3.2 Further Refinement
    9.3.3 Cluster Ideals
    9.3.4 Asymptotic Analysis of Zero Clusters
  9.4 Singular Systems of Empirical Polynomials
    9.4.1 Singular Systems of Linear Polynomials
    9.4.2 Singular Polynomial Systems; Simple d-Points
    9.4.3 A Nontrivial Example
    9.4.4 Multiple d-Points
  9.5 Singular Polynomial Systems with Diverging Zeros
    9.5.1 Inconsistent Linear Systems
    9.5.2 BKK-Deficient Polynomial Systems
  9.6 Multivariate Interpolation
    9.6.1 Principal Approach
    9.6.2 Special Situations
    9.6.3 Smoothing Interpolation

10 Numerical Basis Computation
  10.1 Algorithmic Computation of Groebner Bases
    10.1.1 Principles of Groebner Basis Algorithms
    10.1.2 Avoiding Ill-Conditioned
    10.1.3 Groebner Basis Computation for Floating-Point Systems
  10.2 Algorithmic Computation of Normal Set Representations
    10.2.1 An Intuitive Approach
    10.2.2 Determination of a Normal Set for a Complete Intersection
    10.2.3 Basis Computation with Specified Normal Set
  10.3 Numerical Aspects of Basis Computation
    10.3.1 Two Fundamental Difficulties
    10.3.2 Pivoting
    10.3.3 Basis Computation with Empirical Data
    10.3.4 A Numerical Example

IV Positive-Dimensional Polynomial Systems

11 Matrix Eigenproblems for Positive-Dimensional Systems
  11.1 Multiplicative Structure of ∞-Dimensional Quotient Rings
    11.1.1 Quotient Rings and Normal Sets of Positive-Dimensional Ideals
    11.1.2 Finite Sections of Infinite Multiplication Matrices
    11.1.3 Extension of the Central Theorem
  11.2 Singular Matrix Eigenproblems
    11.2.1 The Solution Space of a Singular Matrix Eigenproblem
    11.2.2 Algorithmic Determination of Parametric Eigensolutions
    11.2.3 Algorithmic Determination of Regular Eigensolutions
  11.3 Zero Sets from Finite Multiplication Matrices
    11.3.1 One-Dimensional Zero Sets
    11.3.2 Multi-Dimensional Zero Sets
  11.4 A Quasi-0-Dimensional Approach
    11.4.1 Quotient Rings with Parameters
    11.4.2 A Modified Approach

Index

Preface

"Numerical Polynomial Algebra" is not a standard designation of a mathematical discipline; therefore, I should start by explaining the title of this book. Historically, in the growth of computational mathematics, which occurred in parallel with the breath-taking explosion in the performance of computational machinery, all areas of mathematics which play a role in the modelling and analysis of real world phenomena developed their branch of Numerical Analysis:

Linear Algebra, Differential Equations, Approximation, Optimization, etc. The collective term Numerical Analysis turned out to be appropriate: The fact that data and relations from the real world inevitably have a limited accuracy makes it necessary to embed the computational tasks into metric spaces: Few parts of scientific computing can proceed without approximations and without the analytic background (like norms, for example) to deal with the inherent indeterminations. Numerical Linear Algebra is the best-known example: It originated from an embedding of the constructive parts of classical linear algebra into linear functional analysis, and its growth into one of the supporting pillars of scientific computing was driven by the use of analytic tools like mappings, norms, convergent iteration, etc. Empirical data could easily be fitted into this conceptual frame so that the approximate solution of approximate linear problems with approximate data could be conceived and implemented.

One area of mathematics did not follow that trend: classical nonlinear algebra. It had undergone a remarkable algorithmic development in the late 19th century; then the axiomatic age had turned it into an abstract discipline. When the symbol manipulation capabilities of electronic computers became evident, a faction of algebraists remembered the algorithmic aspects of their field and developed them into "Computer Algebra," as a computational tool for the solution of constructive problems in pure mathematics. They have designed and implemented algorithms which delight the algebraic community; but at the same time, this enterprise has somehow prevented the growth of a numerical nonlinear algebra. The inadequacy of this mathematically interesting project for realistic problems is exposed when the solution of a system of linear equations with numerical coefficients is obtained in the form of fractions of integers with hundreds of digits.

But nonlinear algebraic tasks do exist in scientific computing: Multivariate polynomials are a natural modelling tool. This creates multivariate systems of polynomial equations, multivariate interpolation problems, decomposition problems (factorization), etc.; the modelling of nontrivial geometric constellations alone generates a multitude of nonlinear algebraic problems. These computational tasks from the real world possess (some) data with limited accuracy and there are no exact solutions; thus, they are generally not accessible by the sophisticated exact


tools which Computer Algebra has provided. At the same time, they often require a global structural analysis of the situation and cannot satisfactorily be solved with general-purpose tools of Numerical Analysis. (The computation of the zeros of one univariate polynomial became an exception: Here, algebra and numerical analysis joined ranks to develop efficient and reliable black-box software for the (necessarily approximate) solution of this task.)

Thus, in the late 20th century, a no man's land between computer algebra and numerical analysis had remained on the landscape of scientific computing which invited discovery and cultivation for general usage. But, most surprisingly, this challenge of pioneering a "numerical nonlinear algebra" remained practically unnoticed by the many young mathematicians hungry for success, even by those working in the immediate neighborhood of the glaring white spot. When I accepted that challenge more than 10 years ago and tried to recruit help for my expeditions, my soliciting was met with little resonance. On these expeditions, I have met stimulating mathematical adventures all along the way and interesting unsolved problems wherever I proceeded. Many of these problems are still waiting for their efficient solution.

From the beginning, in stepping into this virgin territory, I found it more important to set up directions and road posts than to investigate and plot small areas meticulously. I believe that I have now gained an overview of large parts of that territory and I wish to communicate my findings in printed form, beyond my many lectures at conferences and seminars over the past years. This has been the motive for writing this book. The more restrictive title "Numerical Polynomial Algebra" (instead of the original "Numerical Nonlinear Algebra") expresses the fact that there remain interesting and computationally important areas in nonlinear algebra which I have not even touched.

A number of principles have guided the composition of this text:

The most prominent one is continuity: Throughout, all data are from C or R so that all quantities and relations are automatically embedded into analysis, as in Numerical Linear Algebra. Derivatives of maps are widely used, not just formally but also quantitatively. This permits an analysis of the sensitivity of results to small changes in the data of a problem ("condition"). Continuity is the indispensable basis for the use of floating-point computation, or any other approximate computation. Concepts which are inherently discontinuous (like g.c.d., radical, etc.) must be reinterpreted or abandoned.

Continuity is also a prerequisite for the consideration of data with limited accuracy which we systematically assume throughout the text, with a concept of families of neighborhoods as a formal basis. Correctness of a result is replaced by its validity, conceived as a continuous property represented by a numerical value, not as a discrete property (yes-no): A result is valid if it is the exact result of nearby data, which is established by a backward error analysis. For multi-component quantities, we use weighted maximum norms throughout; but a weighted 2-norm would do just as well.

The interpretation of algebraic relations as continuous maps permits the systematic use of iterative refinement as an algorithmic tool. Crude initial results may be refined into sufficiently valid ones by the use of local linearization, a standard tool throughout analysis.

Within polynomial algebra proper, I have tried to employ the quotient ring aspect of ideals wherever possible. The vector space structure of quotient rings and the linear mapping structure of multiplication permit an ample use of concepts and algorithms from (numerical) linear algebra. The determination of all zeros of a polynomial system from the eigenvectors of


the multiplication matrices of the associated quotient ring is the most prominent example.

I have widely used standard linear algebra notations. The systematic use of row vectors for coefficients and of column vectors for bases has proved very helpful; within monomial basis vectors, components are always arranged by increasing degree or term order. These conventions may lead to linear systems $b^T A = c^T$ for row vectors and elimination from right to left, which is somewhat nonstandard, but the internal consistency of this notational principle has been an ample reward.

Another guiding principle has been to write a textbook rather than a monograph. For a novel area of practical importance-which numerical polynomial algebra is in many ways-it is crucial that students are given the opportunity to absorb its principles. I hope that this book may be used as a text for relevant courses in Mathematics and Computer Science and to help students get acquainted with the numerical solution of quantitative problems in commutative algebra. I have included "Exercises" with all sections of the book; as usual, they are meant to challenge the reader's understanding by confronting him/her with numerical and theoretical problems. Also, most of the numerical examples in the text are not only demonstrations for the relevance of formal results but an invitation for a replication of the indicated computation. The textbook approach has also kept me from including references to technical papers within the text. Instead, I have added "Historical and Bibliographical Notes" at the end of each chapter which put the material into perspective and point to contributors of its development.

The dual nature of the subject area as a part of numerical analysis as well as of polynomial algebra requires that the text be attractive and readable for students and scientists from both fields. As I know from my own experience, a standard numerical analyst knows few concepts and results from commutative algebra, and a standard algebraist has a natural aversion to approximate data and approximate computation which appear as foreign elements in his/her world. Therefore, I have seen it necessary to include low level introductory sections on matters of numerical analysis as well as of polynomial algebra, and I have tried to refrain from highly technical language in either subject area. Thus, a reader well versed in one of the areas must find some passages trivial or naive, but I consider this less harmful than assuming a technical knowledge which part of the intended readership does not possess.

Beyond students and colleagues from numerical analysis and computer algebra, the intended readership comprises experts from various areas in scientific computing. Polynomial algebra provides specialized and effective tools for many of their tasks, in particular for tasks with strong geometric aspects. They may be interested to see how many nontrivial algebraic problems can be solved efficiently in a meaningful way for data with limited accuracy. Altogether, I hope that this book may arouse general interest in a neglected area of computational mathematics where, for a while at least, interesting research projects abound and publishable results lurk behind every corner. This should make the area particularly attractive for scientists in the beginning phases of their careers.

For me personally, my encounter with numerical polynomial algebra has become a crucial event in my scientific life. It happened at a time when, with my advancing age, my interest in mathematical research had begun to decrease. In particular, I had lost interest in highly technical investigations as they are indispensable in any advanced scientific field. At that point, through some coincidences, I became aware of the fact that many fundamental aspects of the numerical treatment of nonlinear algebraic problems had hardly been touched. In my 60s, I began to learn the basics of commutative algebra and to apply my lifelong experience in numerical analysis to

xiv

Preface

it, and my fascination grew with every new insight. This late love affair of my scientific life has gained me 10 or more years of intense intellectual activity for which I can only be grateful. The result is this book, and, like a late lover, I must ask forgiveness for some foolish ideas in it which may irritate my younger and more meticulous colleagues.

This book also marks the end of my active scientific research. I have decided that I will devote my few or many remaining years to other activities which I have delayed long enough. If my mind should, for a short while, continue to tempt me with mathematical ideas and problems, I will simply put them on my homepage for others to exploit. I also owe it to my dear wife Christine who has so often patiently acknowledged the priority of science in our 44 years of married life that this state does not continue to the very end. Without her continuing love and support, this last mark of my scientific life would not have come into existence.

Vienna, July 2003

Hans J. Stetter

Acknowledgments

Fifteen years ago, when I began to get interested in the numerical solving of polynomial systems, my knowledge of commutative algebra was nil. I could not have gained even the modest insight into polynomial algebra represented in this book without the advice and help of many colleagues much more knowledgeable in the area; implicitly or explicitly, they have contributed a great deal to my work. I wish to express my gratitude to

my early road companion H. M. Moeller;

B. Buchberger, H. Hong, J. Schicho, F. Winkler at RISC;

my friends at ORCCA, R. Corless, K. Geddes, M. Giesbrecht, D. Jeffrey, I. Kotsireas, G. Labahn,

G. Reid, S. Watt, and a number of people at Waterloo Maple Inc.;

my friends and collaborators in China, Huang Y. Zh., Wu W. D., Wu W. Ts., Zhi L. H.;

in the USA, B. Caviness, G. Collins, D. Cox, G. Hoffmann, E. Kaltofen, Y. N. Lakshman, T. Y. Li, D. Manocha, V. Pan, S. Steinberg, M. Sweedler, B. Trager, J. Verschelde;

in Japan, H. Kobayashi, M. T. Noda, T. Sasaki, K. Shirayanagi;

in France, J.-Ch. Faugere, I. Emiris (now back in Greece), D. Lazard, B. Mourrain, M.-F. Roy;

in Italy, D. Bini, P. M. Gianni, M. G. Marinari, T. Mora, L. Robbiano, C. Traverso;

in Spain, L. Gonzalez-Vega, T. Recio;

in Germany, J. Apel, J. Calmet, K. Gatermann, J.v.z. Gathen, T. Sauer, F. Schwarz, W. Seiler, V. Weispfenning;

in Russia, V. Gerdt;

my students J. Haunschmied, V. Hribernig, A. Kondratyev, G. Thallinger;

and the numerous other colleagues all over the world who have discussed matters of polynomial algebra with me on various occasions.


Part I

Polynomials and Numerical Analysis

Chapter 1

Polynomials

In their use as modelling tools in Scientific Computing, polynomials appear, at first, simply as a special class of functions from $\mathbb{C}^s$ to $\mathbb{C}$ (or $\mathbb{R}^s$ to $\mathbb{R}$). Such polynomials are automatically objects of univariate ($s = 1$) or multivariate ($s > 1$) analysis over the complex or real numbers. For linear polynomials, this fact has played virtually no role in classical linear algebra; but it has become a fundamental aspect of today's numerical linear algebra, where concepts from analysis (norms, neighborhoods, convergence, etc.) and related results are widely used in the design and analysis of computational algorithms. In an analogous manner, the consideration of polynomial algebra as a part of analysis plays a fundamental role in numerical polynomial algebra; it will be widely used throughout this book. In particular, this embedding of algebra into analysis permits the extension of algebraic algorithms to polynomials with coefficients of limited accuracy; cf. Chapter 3.

On the other hand, certain sets of polynomials have special algebraic structures: they may be linear spaces, rings, ideals, etc. Algebraic properties related to these structures may play a crucial role in solving computational tasks involving polynomials, e.g., for finding zeros of polynomial systems; cf. Chapter 2.

In this introductory chapter, we consider various aspects of polynomials which will play a fundamental role in our later investigations.

The following notations will generally be used (but without strict adherence):

scalars and coefficients $\in \mathbb{C}$ or $\mathbb{R}$: lower-case Greek letters $\alpha, \beta, \gamma, \ldots$
elements (points) $\in \mathbb{C}^s$ or $\mathbb{R}^s$: lower-case Greek letters $\xi, \eta, \zeta, \ldots$
vectors of coefficients etc.: lower-case Latin letters $a, b, c, \ldots$
$s$-dim. variables (indeterminates): lower-case Latin letters $x, y, z, \ldots$
polynomials: lower-case Latin letters $p, q, \ldots$
systems of polynomials: upper-case Latin letters $P, Q, \ldots$


1.1 Linear Spaces of Polynomials

Definition 1.1. A monomial in the $s$ variables $x_1, \ldots, x_s$ is the power product

$$x^j := x_1^{j_1} \cdots x_s^{j_s}, \quad \text{with } j = (j_1, \ldots, j_s) \in \mathbb{N}_0^s; \qquad (1.1)$$

$j$ is the exponent and $|j| := \sum_{\sigma=1}^{s} j_\sigma$ the degree of the monomial $x^j$. The set of all monomials in $s$ variables will be denoted by $\mathcal{T}^s$, independently of the notation for the variables. $\mathcal{T}_d^s \subset \mathcal{T}^s$ is the set of monomials in $s$ variables of degree $\le d$. □

For example, $x^2 y^3 z$ is a monomial of degree 6 in $\mathcal{T}^3$ and hence contained in $\mathcal{T}_d^3$ for $d \ge 6$. Note that each monomial set $\mathcal{T}_d^s$ contains the monomial $1 = x^0 = x^{(0,\ldots,0)}$.

Proposition 1.1. $\mathcal{T}_d^s$ contains $\binom{d+s}{s}$ monomials; $\binom{d+s-1}{s-1}$ of these have exact degree $d$.

Proof. The proposition follows from fundamental formulas in combinatorics. □

Obviously, the number of different monomials grows rapidly with the number of variables $s$ and the degree $d$. For example, there are 126 monomials of degree $\le 5$ in 4 variables, and 3003 monomials of degree $\le 8$ in 6 variables. This rapid growth is a major reason for the high computational complexity of many polynomial algorithms.

Definition 1.2. A complex (real) polynomial¹ in $s$ variables is a finite linear combination of monomials from $\mathcal{T}^s$ with coefficients from $\mathbb{C}$ or $\mathbb{R}$, resp.:

$$p(x) = p(x_1, \ldots, x_s) = \sum_{(j_1,\ldots,j_s) \in J} a_{j_1 \ldots j_s}\, x_1^{j_1} \cdots x_s^{j_s} = \sum_{j \in J} a_j x^j. \qquad (1.2)$$

The set $J \subset \mathbb{N}_0^s$ which contains the exponents of those monomials which are present in the polynomial $p$ (i.e., which have a nonvanishing coefficient) is the support of $p$; $\deg(p) := \max_{j \in J} |j|$ is the (total) degree of $p$. The summands of a polynomial are called terms. The exponent and the degree of a term are those of the associated monomial. □

Definition 1.3. A polynomial $p$ with a support $J$ such that $|j| = \deg(p)$ for each $j \in J$ is called homogeneous. □

The following is a polynomial of total degree 4 in the 3 variables $x, y, z$:

$$4x^2y^2 - 7xz^3 + 2z^4 + 3.5x^3 - y^2z - 8.5xz - 10.$$

The terms of degree 4 form a homogeneous polynomial of total degree 4:

$$4x^2y^2 - 7xz^3 + 2z^4.$$
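A polynomial like the degree-4 example above is conveniently stored as a map from exponent tuples to coefficients. The following Python sketch (an illustration, not from the book; the dictionary encoding is one possible choice) reads off the support, the total degree, and the homogeneous part of top degree:

```python
# Exponent-tuple -> coefficient dictionary for the polynomial
# 4x^2y^2 - 7xz^3 + 2z^4 + 3.5x^3 - y^2z - 8.5xz - 10 in the variables x, y, z.
p = {
    (2, 2, 0): 4.0,    # 4 x^2 y^2
    (1, 0, 3): -7.0,   # -7 x z^3
    (0, 0, 4): 2.0,    # 2 z^4
    (3, 0, 0): 3.5,    # 3.5 x^3
    (0, 2, 1): -1.0,   # -y^2 z
    (1, 0, 1): -8.5,   # -8.5 x z
    (0, 0, 0): -10.0,  # -10
}

support = set(p)                         # the set J of exponents (Definition 1.2)
degree = max(sum(j) for j in p)          # total degree: max |j| over the support
top = {j: c for j, c in p.items() if sum(j) == degree}
# top holds the degree-4 homogeneous part: 4x^2y^2 - 7xz^3 + 2z^4
print(degree)   # 4
```

Storing only the nonzero terms is exactly the "sparse" point of view discussed below: the dictionary has 7 entries, while the full monomial basis of degree ≤ 4 in 3 variables has 35.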

Definition 1.4. The set of all complex (real) polynomials in $s$ variables will be denoted by $\mathcal{P}^s_{\mathbb{C}}$ or $\mathcal{P}^s_{\mathbb{R}}$, resp., independently of the notation for the variables. When the coefficient domain is evident, the notation $\mathcal{P}^s$ will be used. $\mathcal{P}^s_d \subset \mathcal{P}^s$ will denote the set of polynomials in $s$ variables of total degree $\le d$. □

Obviously, $\mathcal{P}^s_d$ is a linear space (vector space) over $\mathbb{C}$ or $\mathbb{R}$, resp., of dimension $\binom{d+s}{s}$ (cf. Proposition 1.1); addition and multiplication by a scalar are defined in the natural way. A

¹ Throughout this book, only such polynomials are considered; cf. the preface.


generic basis in $\mathcal{P}^s_d$ is furnished by the monomials of $\mathcal{T}_d^s$ arranged in some linear order. With respect to such a basis, the coefficients of a polynomial $p$ are the components of $p$ as an element of the linear space, i.e., $p$ is represented by the vector of its coefficients (ordered appropriately). The zero element 0 is the zero polynomial with $a_j = 0$ for all $j$. Vector space computations in $\mathcal{P}^s_d$ (i.e., addition and multiplication by a scalar) are thus reduced to the analogous computations with the coefficient vectors, as in any linear space.

However, polynomials may also be multiplied, and multiplication generally results in a polynomial of higher degree which is outside the linear space of the factor polynomials; cf. section 1.3. Therefore, the vector space notation for polynomials can only be used within specified contexts. On the other hand, because of its simplicity, it should be used in computations with polynomials wherever it is feasible.

Another reason for a potential inadequacy of vector space notation in dealing with polynomials is the fact that the cardinality |J| of the support J of a polynomial may be very small relative to the magnitude of the associated basis T_d^s so that almost all components are 0. Such polynomials are called sparse, in analogy to the use of this word in linear algebra. Multivariate polynomials which appear in scientific computing are generally sparse.

Fortunately, we will often have to deal with linear spaces R of polynomials from some P^s with a fixed uniform support J so that a fixed monomial basis {x^j, j ∈ J} can be used. Moreover, in these spaces R, multiplication of the element polynomials is defined in a way that it does not lead out of R; hence, in spite of their fixed dimensions |J|, they are commutative rings, so-called quotient rings. We will formally introduce and discuss these objects in section 2.2 and later use them a great deal.

In numerical polynomial algebra, a good deal of the algorithmic manipulations of polynomials are linear operations; in this context, we will widely employ the standard notations of numerical linear algebra. To facilitate this practice, we will generally collect the coefficients of a polynomial into a row vector a^T = ( \cdots a_j \cdots ) and its monomials into a column vector x = ( \cdots x^j \cdots )^T. Then

p(x) = a^T x .        (1.3)

The use of row vectors for coefficients agrees with the common notation a^T x for linear polynomials in linear algebra; therefore it is the natural choice. It implies, however, that a linear system for the computation of a coefficient vector a^T appears in the form a^T A = b^T. For the sake of a systematic notation (which greatly assists the human intuitive and associative powers), we will not transpose such systems during formal manipulations.

Example 1.1: The monomial vector for polynomials from P_d^1 (univariate polynomials of maximal degree d) is x := (1, x, \dots, x^d)^T. A shift of the origin to ξ ∈ R requires a rewriting to the


basis vector

\begin{pmatrix} 1 \\ x-\xi \\ (x-\xi)^2 \\ \vdots \\ (x-\xi)^d \end{pmatrix} = \begin{pmatrix} 1 & & & & \\ -\xi & 1 & & & \\ \xi^2 & -2\xi & 1 & & \\ \vdots & & & \ddots & \\ (-\xi)^d & \cdots & & -d\xi & 1 \end{pmatrix} \begin{pmatrix} 1 \\ x \\ x^2 \\ \vdots \\ x^d \end{pmatrix} =: E\, x ;

the corresponding rewriting of a polynomial p is simply achieved:

p(x) = a^T x = a^T E^{-1} E\, x = \tilde{a}^T (x - \xi) = \sum_{j=0}^{d} \tilde{a}_j (x - \xi)^j .

□
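The rewriting in Example 1.1 can be checked numerically. In the following sketch (plain Python; the concrete p, ξ, and all variable names are ours, not from the text), E has the entries \binom{j}{k}(-ξ)^{j-k} displayed above, and its inverse, which shifts back, has the entries \binom{j}{k} ξ^{j-k}:

```python
from math import comb

d, xi = 3, 2                       # degree and shift (example values, ours)
a = [7, -4, 0, 5]                  # p(x) = 7 - 4x + 5x^3

# (x - xi)^j = sum_k C(j,k)(-xi)^(j-k) x^k, so E[j][k] = C(j,k)(-xi)^(j-k);
# the inverse shift gives Einv[j][k] = C(j,k) xi^(j-k).
E = [[comb(j, k) * (-xi) ** (j - k) if k <= j else 0 for k in range(d + 1)]
     for j in range(d + 1)]
Einv = [[comb(j, k) * xi ** (j - k) if k <= j else 0 for k in range(d + 1)]
        for j in range(d + 1)]

# sanity check: E * Einv is the identity matrix
I = [[sum(E[i][k] * Einv[k][j] for k in range(d + 1)) for j in range(d + 1)]
     for i in range(d + 1)]
assert I == [[1 if i == j else 0 for j in range(d + 1)] for i in range(d + 1)]

# atilde^T = a^T E^{-1}
atilde = [sum(a[j] * Einv[j][k] for j in range(d + 1)) for k in range(d + 1)]

x = 5                              # any evaluation point
p = sum(a[j] * x ** j for j in range(d + 1))
p_shifted = sum(atilde[j] * (x - xi) ** j for j in range(d + 1))
assert p == p_shifted              # p(x) = atilde^T (x - xi)
```

Note that ã collects the Taylor coefficients of p about ξ, in agreement with Exercise 2 below.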

Naturally, we may freely use bases other than monomial for linear spaces of polynomials if it is advantageous for the understanding of a situation or for the design and analysis of computational algorithms. For example, we may wish to have a basis which is orthogonal w.r.t. some special scalar product, or which has other desirable properties.

Polynomials, in particular multivariate ones, often occur in sets or systems. We denote systems of polynomials by capital letters; e.g.,

P(x) = { p_ν(x), ν = 1(1)n } .

Notationally and operationally, such systems will often be treated as vectors of polynomials:

P(x) = \begin{pmatrix} p_1(x) \\ \vdots \\ p_n(x) \end{pmatrix} .        (1.4)

It is true that this notation implies an order of the polynomials in the system which was originally not there. But such an (arbitrary) order is also generated by the assignment of subscripts, and generally without harm.

Exercises

1. (a) Consider the linear space P_4^2 (cf. Definition 1.4). What is its dimension? Introduce a monomial basis; consider reasons for choosing various orders for the basis monomials x^j in the basis vector x.

(b) Differentiation w.r.t. x_1 and x_2, resp., are linear operations in P_4^2. For a fixed basis vector x, which matrices D_1, D_2 represent differentiation so that (∂/∂x_i) x = D_i x, i = 1, 2? Which matrix represents ∂²/(∂x_1 ∂x_2) x? How can you tell from the D_i that all derivatives of an order greater than 4 vanish for p ∈ P_4^2?

(c) With p(x) = a^T x, show that the coefficient vector of (∂/∂x_i) p(x) is a^T D_i. Check that D_1 D_2 = D_2 D_1. Explain why the commutativity is necessary and sufficient to make the notation q(D_1, D_2), with q ∈ P^2, meaningful. What is the coefficient vector of q(∂/∂x_1, ∂/∂x_2) p(x)?
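As a hint of what part (b) is after, here is the univariate analogue in plain Python (the degree, the example p, and all names are ours): in P_4^1 with basis vector x = (1, x, ..., x^4)^T, differentiation is represented by a single nilpotent matrix D.

```python
d = 4
# d/dx x^j = j x^{j-1}, so differentiating the basis vector
# x = (1, x, ..., x^d)^T gives D x with D[j][j-1] = j:
D = [[j if k == j - 1 else 0 for k in range(d + 1)] for j in range(d + 1)]

# the coefficient vector of p' is a^T D (cf. part (c)):
a = [3, 1, 0, 2, 5]                # p(x) = 3 + x + 2x^3 + 5x^4
ap = [sum(a[j] * D[j][k] for j in range(d + 1)) for k in range(d + 1)]
assert ap == [1, 0, 6, 20, 0]      # p'(x) = 1 + 6x^2 + 20x^3

# D is nilpotent: D^(d+1) = 0, mirroring the fact that all
# derivatives of an order greater than d vanish on P_d^1
P = [row[:] for row in D]
for _ in range(d):
    P = [[sum(P[i][k] * D[k][j] for k in range(d + 1))
          for j in range(d + 1)] for i in range(d + 1)]
assert all(e == 0 for row in P for e in row)
```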


2. (a) According to Example 1.1, the coefficient vector ã^T of p(x) = a^T x ∈ P_d^1 w.r.t. the basis (x - ξ) is ã^T = a^T E^{-1}. Derive the explicit form of E^{-1}.

(b) By Taylor's Theorem, the components ã_j of ã^T are also given by ã_j = (1/j!) (d^j/dx^j) p(ξ). Show that this leads to the same matrices in the linear transformation between a^T and ã^T.

3. (a) The classical Chebyshev polynomials T_ν ∈ P_ν^1 are defined by

T_0(x) := 1 ,    T_1(x) := x ,    T_{ν+1}(x) := 2x T_ν(x) - T_{ν-1}(x) ,    ν = 1, 2, \dots

Show that T_ν(1) = 1, T_ν(-1) = (-1)^ν for all ν; T_ν(0) = 0 for ν odd and = (-1)^{ν/2} for ν even. Derive the same relations from the identity

T_ν(cos φ) = cos νφ ,    φ ∈ [0, π] .        (1.5)

(b) Consider the representations

p(x) = a^T x = b^T (T_0(x), \dots, T_d(x))^T

for p ∈ P_d^1. Which matrices M and M^{-1} represent the transformations b^T = a^T M^{-1} and a^T = b^T M?

(c) The T_ν satisfy max_{x∈[-1,1]} |T_ν(x)| = 1, as is well-known and also follows, e.g., from (1.5). For p(x) = b^T (T_0(x), \dots, T_d(x))^T, this implies max_{x∈[-1,1]} |p(x)| ≤ \sum_{ν=0}^{d} |b_ν| (why?). Which bound for |p| in terms of the monomial coefficients a^T follows from (b)?
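The defining recurrence and the special values asked for in part (a) are easy to check numerically; a plain-Python sketch (function name and test values ours):

```python
from math import cos, isclose

def cheb(nu, x):
    # T_nu(x) via the three-term recurrence T_{nu+1} = 2x T_nu - T_{nu-1}
    t0, t1 = 1.0, x
    if nu == 0:
        return t0
    for _ in range(nu - 1):
        t0, t1 = t1, 2 * x * t1 - t0
    return t1

for nu in range(8):
    assert cheb(nu, 1.0) == 1.0
    assert cheb(nu, -1.0) == (-1.0) ** nu
    if nu % 2 == 1:
        assert cheb(nu, 0.0) == 0.0
    else:
        assert cheb(nu, 0.0) == (-1.0) ** (nu // 2)
    # identity (1.5): T_nu(cos phi) = cos(nu * phi)
    phi = 0.3
    assert isclose(cheb(nu, cos(phi)), cos(nu * phi), abs_tol=1e-12)
```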

1.2 Polynomials as Functions

In this section, we recall some analytic aspects of polynomials regarded as functions. While the linear polynomials of linear algebra constitute a particularly simple class of functions whose analytic aspects are trivial or straightforward, this is no longer the case for polynomials of a total degree d > 1. For example, the trivial polynomial p(x) = x² + a maps the two distinct real points ξ and -ξ, ξ ≠ 0, to the same point ξ² + a ∈ R so that p is not bijective. Furthermore, the image of R is only the interval [a, ∞).

The fact that R is not an algebraically closed field causes well-known complications when real polynomials are regarded as functions between domains in real space only. Therefore, throughout this book, we will mainly consider polynomials as functions between complex domains; note that real polynomials may also be considered as having a complex domain and range. Except if stated otherwise, individual polynomials in s variables or systems of n such polynomials will be regarded as mappings

p : C^s → C    or    P : C^s → C^n .

However, we must clarify the notation C: In our algebraic context, it will always denote the open complex "plane" without the point ∞. For us, "|x| very large" is a near-singular situation, as in most other areas of numerical analysis. This is particularly important for our use of the multidimensional complex spaces C^s, with their analytically intricate structure at ∞. In any case, our intuition for the C^s, s > 1, is extremely restricted so that we may often have the R^s in mind when we are formally dealing with the C^s. Compare also Proposition 1.4 and the remark following it.

Multivariate differential operators will play an important role in some parts of this book; we use the following notation for them:


Definition 1.5. For j ∈ N_0^s,

∂_j := \frac{1}{j_1! \cdots j_s!} \, \frac{\partial^{|j|}}{\partial x_1^{j_1} \cdots \partial x_s^{j_s}}        (1.6)

is a differentiation operator of order |j|. A polynomial q(x) = \sum_{j∈J} b_j x^j ∈ P^s defines the differential operator

q(∂) := \sum_{j∈J} b_j ∂_j .        (1.7)    □

In examples, we may also use shorthand notations like p_{x_1} for ∂_1 p = ∂p/∂x_1. The factors in (1.6) simplify a number of expressions; cf., e.g., the expansion (1.9) below. In particular, the well-known Leibniz rule for the differentiation of products takes the simple form

∂_j (p · q) = \sum_{k ≤ j} ∂_{j-k} p · ∂_k q ,        (1.8)

where ≤ is the generic partial ordering in N_0^s.
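In one variable, the scaled operator is simply ∂_k p = p^{(k)}/k!, which acts on a coefficient list as a_i ↦ \binom{i}{k} a_i. The following plain-Python sketch (helper names and the example polynomials are ours) checks the factorial-free Leibniz rule (1.8) for a small univariate case:

```python
from math import comb

def dpart(a, k):
    # scaled derivative (1/k!) d^k/dx^k on a univariate coefficient list a:
    # coefficient of x^(i-k) becomes C(i,k) * a_i
    return [comb(i, k) * a[i] for i in range(k, len(a))]

def pmul(a, b):
    # polynomial product of two coefficient lists
    c = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for k, bk in enumerate(b):
            c[i + k] += ai * bk
    return c

def padd(a, b):
    n = max(len(a), len(b))
    a, b = a + [0] * (n - len(a)), b + [0] * (n - len(b))
    return [x + y for x, y in zip(a, b)]

# Leibniz rule (1.8) in one variable: d_j(p q) = sum_{k<=j} d_{j-k} p * d_k q
p, q, j = [1, -2, 0, 3], [4, 5], 2      # p = 1 - 2x + 3x^3, q = 4 + 5x
lhs = dpart(pmul(p, q), j)
rhs = []
for k in range(j + 1):
    rhs = padd(rhs, pmul(dpart(p, j - k), dpart(q, k)))
assert lhs == rhs
```

With the classical (unscaled) derivatives, the same identity would carry binomial factors; the scaling in (1.6) absorbs them.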

Proposition 1.2. For p ∈ P_d^s, we have ∂_j p ∈ P_{d-|j|}^s, and all derivatives of an order > d vanish identically.

Proof:

∂_σ x^j = j_σ x^j / x_σ    if x_σ divides x^j ,
        = 0                if x_σ does not divide x^j .    □

Proposition 1.3. For p ∈ P_d^s, ξ ∈ C^s, the expansion of p in powers of \bar{x} = x - ξ (the Taylor expansion about ξ) is

p(x) = p(ξ + \bar{x}) = \sum_{δ=0}^{d} \sum_{|j|=δ} (∂_j p)(ξ) \, \bar{x}^j =: \bar{p}(\bar{x}; ξ) .        (1.9)

Proof: The proof follows from the binomial theorem and (1.6). □

Example 1.2: For x = (y, z), \bar{x} = (\bar{y}, \bar{z}) ∈ C², p(y, z) = 5y³z - 2y²z² + z⁴, ξ = (η, ζ) = (2, -1):

\bar{p}(\bar{y}, \bar{z}) = p(η, ζ) + p_y(η, ζ) \bar{y} + p_z(η, ζ) \bar{z}
    + \frac{1}{2} p_{yy}(η, ζ) \bar{y}² + p_{yz}(η, ζ) \bar{y}\bar{z} + \frac{1}{2} p_{zz}(η, ζ) \bar{z}²
    + \frac{1}{6} p_{yyy}(η, ζ) \bar{y}³ + \frac{1}{2} p_{yyz}(η, ζ) \bar{y}²\bar{z} + \frac{1}{2} p_{yzz}(η, ζ) \bar{y}\bar{z}² + \frac{1}{6} p_{zzz}(η, ζ) \bar{z}³
    + \frac{1}{6} p_{yyyz}(η, ζ) \bar{y}³\bar{z} + \frac{1}{4} p_{yyzz}(η, ζ) \bar{y}²\bar{z}² + \frac{1}{24} p_{zzzz}(η, ζ) \bar{z}⁴

= -47 - 68\bar{y} + 52\bar{z} - 32\bar{y}² + 76\bar{y}\bar{z} - 2\bar{z}² - 5\bar{y}³ + 34\bar{y}²\bar{z} - 8\bar{y}\bar{z}² - 4\bar{z}³ + 5\bar{y}³\bar{z} - 2\bar{y}²\bar{z}² + \bar{z}⁴ .    □
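The expansion of Example 1.2 can be checked by comparing both sides at integer points; since both sides are polynomials, agreement on a sufficiently large grid implies identity. A plain-Python sketch (function names ours):

```python
def p(y, z):
    return 5 * y**3 * z - 2 * y**2 * z**2 + z**4

def pbar(yb, zb):
    # expansion of p about (eta, zeta) = (2, -1) from Example 1.2
    return (-47 - 68 * yb + 52 * zb
            - 32 * yb**2 + 76 * yb * zb - 2 * zb**2
            - 5 * yb**3 + 34 * yb**2 * zb - 8 * yb * zb**2 - 4 * zb**3
            + 5 * yb**3 * zb - 2 * yb**2 * zb**2 + zb**4)

# exact integer arithmetic on a grid exceeding the degree in each variable
for yb in range(-3, 4):
    for zb in range(-3, 4):
        assert p(2 + yb, -1 + zb) == pbar(yb, zb)
```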

For convenience, we will sometimes use the Frechet derivative concept of Functional Analysis to represent results in a more compact notation. Frechet differentiation is a straightforward generalization of common differentiation to maps between Banach spaces; in our algebraic context, all Banach spaces of interest are finite-dimensional vector spaces.


For a sufficient understanding of our essentially notational use of the concept, we observe that we can interpret the differentiability of a function f : R → R in a domain D ⊂ R thus: f is differentiable at x ∈ D if there exists a linear map u(x) : R → R such that

\lim_{\Delta x \to 0} \frac{1}{|\Delta x|} \, | f(x + \Delta x) - f(x) - u(x)\,\Delta x | = 0 ;

note that a linear map R → R is given by a real number to be employed as a factor. A slightly stronger formulation which is, however, equivalent in our setting is

| f(x + \Delta x) - f(x) - u(x)\,\Delta x | = O(|\Delta x|²) .        (1.10)

Thus, differentiation of f : R → R is an operation which maps f into u : R → L(R → R), the space of linear maps from R to R. The derivative u is generally denoted by f′ or (d/dx) f, etc.

Now, we consider functions or maps from one vector space A into another one B and apply the same line of thought: Frechet differentiation is an operation which maps f : A → B into a function u : A → L(A → B) such that, for x from the domain D ⊂ A of differentiability,

\| f(x + \Delta x) - f(x) - u(x) \cdot \Delta x \|_B = O(\|\Delta x\|_A^2) ,        (1.11)

where \|·\|_A, \|·\|_B are the norms in A and B, resp., and the · denotes the action of the linear operation u(x). Again, notations like f′ etc. are commonly employed for u. Obviously, f′(x) linearizes the local variation of f in the neighborhood of x. A few examples will show the notational power of the Frechet differentiation concept:

Let A = R^m, B = R (or C^m and C); i.e. f is a scalar function of m variables. Then the Frechet derivative u(x) of f at x must satisfy

f(x + \Delta x) = f(x_1 + \Delta x_1, \dots, x_m + \Delta x_m) = f(x) + u(x) \cdot \Delta x + O(\|\Delta x\|^2)

= f(x_1, \dots, x_m) + \left( \frac{\partial f}{\partial x_1}(x), \dots, \frac{\partial f}{\partial x_m}(x) \right) \begin{pmatrix} \Delta x_1 \\ \vdots \\ \Delta x_m \end{pmatrix} + O(\|\Delta x\|^2) ;

thus the Frechet derivative u = f′ of f is given by x → grad f(x) := ( \frac{\partial f}{\partial x_1}(x), \dots, \frac{\partial f}{\partial x_m}(x) ), a row vector of dimension m. For a vector of functions, we simply obtain the vector of the Frechet derivatives. Thus,

for A = R^m, B = R^n, and f : A → B,

f(x + \Delta x) = \begin{pmatrix} f_1(x_1 + \Delta x_1, \dots, x_m + \Delta x_m) \\ \vdots \\ f_n(x_1 + \Delta x_1, \dots, x_m + \Delta x_m) \end{pmatrix} = f(x) + u(x) \cdot \Delta x + O(\|\Delta x\|^2)

= \begin{pmatrix} f_1(x) \\ \vdots \\ f_n(x) \end{pmatrix} + \begin{pmatrix} \frac{\partial f_1}{\partial x_1}(x) & \cdots & \frac{\partial f_1}{\partial x_m}(x) \\ \vdots & & \vdots \\ \frac{\partial f_n}{\partial x_1}(x) & \cdots & \frac{\partial f_n}{\partial x_m}(x) \end{pmatrix} \begin{pmatrix} \Delta x_1 \\ \vdots \\ \Delta x_m \end{pmatrix} + O(\|\Delta x\|^2) .
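The linearization property (1.11) can be observed numerically: for a concrete map f : R² → R², the remainder after subtracting the Jacobian term decays like \|\Delta x\|². A plain-Python sketch (the example map, its hand-entered Jacobian, and all names are ours):

```python
from math import sqrt

def f(x1, x2):
    return (x1**2 * x2, x1 + x2**3)

def jac(x1, x2):
    # Jacobian (Frechet derivative) of f, entered by hand
    return ((2 * x1 * x2, x1**2),
            (1.0, 3 * x2**2))

x = (1.5, -0.5)
J = jac(*x)
for h in (1e-2, 1e-3, 1e-4):
    dx = (h, -2 * h)
    fx = f(*x)
    fxd = f(x[0] + dx[0], x[1] + dx[1])
    lin = tuple(fx[i] + J[i][0] * dx[0] + J[i][1] * dx[1] for i in range(2))
    err = sqrt((fxd[0] - lin[0])**2 + (fxd[1] - lin[1])**2)
    norm2 = dx[0]**2 + dx[1]**2
    # remainder is O(||dx||^2): bounded by a fixed multiple of ||dx||^2
    assert err <= 10 * norm2
```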


Higher Frechet derivatives tend to become less intuitive: When f′ = u is a map from A to L(A → B), f″ = u′ has to be a map from A to L(A → L(A → B)) = L(A × A → B), i.e. into the bilinear maps from A × A to B. Thus the Frechet generalization of the classical second derivative to a scalar function of m variables assigns to each x ∈ D the bilinear mapping

(\Delta x, \Delta x) → \Delta x^T f″(x) \Delta x ,

with the symmetric m × m Hessian matrix H(x) := f″(x) = ( \frac{\partial^2 f}{\partial x_i \partial x_k}(x) ). With the use of this higher Frechet derivative concept, the Taylor expansion of a system P of n polynomials p_ν in m variables takes the compact form

P(x + \Delta x) = \sum_{\kappa=0}^{\deg(P)} \frac{1}{\kappa!} \, P^{(\kappa)}(x) \, (\Delta x)^{\kappa} .

Proposition 1.4. A polynomial p ∈ P^s is a holomorphic function on each compact part D ⊂ C^s. The image p(D) of D is a compact part of C.

According to Propositions 1.2 and 1.3, each p