
The Pythagorean Theorem and Its Consequences
Jim Emery

Edited: 8/4/13

Contents
1 Pythagoras: Biographical Sketch

2 Eight Proofs of the Pythagorean Theorem
2.1 Proof I: Euclid's Elements
2.2 Proof II: The Ascent of Man
2.3 Proof III: Garfield
2.4 Proof IV: An Arrangement of Four Triangles in a Square of side a + b
2.5 Proof V: An Arrangement of Four Triangles in a Square of side c
2.6 Remarks on Geometric Proofs, Versus Algebraic Proofs
2.7 Proof VI: Equating Two Square Arrangements
2.8 Proof VII: Triangle Area Proportional to the Hypotenuse Squared
2.9 Proof VIII: Similarity and Proportion

3 A Crisis in Greek Mathematics: What is a real number?

4 Inequalities

5 Euclidean Distance

6 Distance Functions and the Metric Space

7 Vector Spaces and Inner Product Spaces

8 Normed Linear Spaces

9 Normed Linear Spaces and Functional Analysis

10 Hilbert Space and $\ell^2$

11 Orthogonality, Orthogonal Polynomials, Fourier Series

12 Projections

13 Linear Least Squares Problems as Geometric Problems: Orthogonality and the Pythagorean Theorem

14 Elementary Formulation of the Least Squares Problem for Straight Line Fitting

15 A Geometric View of the Least Squares Problem

16 Bibliography

List of Figures
1 The Pythagorean Theorem
2 Proof I: Euclid's Elements
3 Proof II: The Ascent of Man
4 Proof II: The Ascent of Man
5 Proof III: Garfield's Proof
6 Proof IV: An Arrangement of Four Triangles in a Square of side a + b
7 Proof V: An Arrangement of Four Triangles in a Square of side c
8 An Arrangement That Leads to a Geometric Proof
9 Proof VI: Equating Two Square Arrangements
10 Proof VII: Area Proportional to the Hypotenuse Squared
11 Proof VIII: Similarity and Proportion

1 Pythagoras: Biographical Sketch
Pythagoras proclaimed that "All is Number" (that is, all is mathematics).
He was born on Samos about 570 BC and died about 495 BC. Knowledge about
him is vague and uncertain. He is said to have related mathematics to music,
to have believed in reincarnation, and to have founded a secret religion in
the town of Croton, a Greek colony in southern Italy. Much that is said about
him may be apocryphal, but perhaps he was the first to call himself a
philosopher (lover of wisdom). Many later philosophers claimed to have been
influenced by his ideas. The Pythagorean theorem itself may have originated
in the cultures of the Babylonians and Indians, although he may have been
the first to write down a formal proof of this theorem, earlier versions being
folklore and tradition.

2 Eight Proofs of the Pythagorean Theorem


Most of the proofs are evident from geometrical figures. Some proofs are
algebraic, and many use the concepts of similar triangles and proportion.

2.1 Proof I: Euclid's Elements

In our figure for Euclid's proof, which appears in his work The Elements,
two overlaid shaded triangles are congruent, and so have equal areas.
Corresponding to each triangle is a rectangle of double its area. One such
rectangle is the square on a side of the original right triangle. The other
makes up a portion of the square on the hypotenuse. So suppose the triangle
sides are a and b and the hypotenuse is c. Then $a^2$ is equal to the area of
one sub-rectangle of the square on the hypotenuse c. Similarly $b^2$ is equal
to the area of the rest of the square on the hypotenuse. Thus

\[ a^2 + b^2 = c^2. \]

2.2 Proof II: The Ascent of Man


Jacob Bronowski devotes several pages to discussing a proof of the Pythagorean
theorem in his book, The Ascent of Man, and in his television series. This
occurs in the chapter called "The Music of the Spheres" and in an episode
similarly titled in the television series. See the figures captioned "The
Ascent of Man".

Figure 1: The Pythagorean Theorem. The area of the square on the
hypotenuse of a right triangle is equal to the sum of the areas of the squares
on the sides: $a^2 + b^2 = c^2$, where here $a = 6$, $b = 8$, $c = 10$.

Figure 2: Proof I: Euclid's Elements. The short side of the triangle is a,
the long side is b and the hypotenuse is c. The more darkly shaded triangle,
rotated counterclockwise by 90 degrees, will fall exactly on the more lightly
shaded triangle. So these two triangles are congruent. The line from the top
vertex divides the square on the hypotenuse c into a left rectangle L and a
right one R. The dark triangle has area $b^2/2$, because its base has length b,
as does its height. The area of the lightly shaded triangle is 1/2 that of the
right sub-rectangle R. Therefore the area of R is $b^2$. Repeating the argument
on the left side of the figure with two new triangles, we find the area of L is
$a^2$. Therefore $c^2 = a^2 + b^2$.

Figure 3: Proof II: The Ascent of Man. Jacob Bronowski in his book
The Ascent of Man discusses this proof on pages 156-162. The book is
based on the 1972 BBC television series of the same name. The small side
of the triangle is a, the long side b, the hypotenuse c. The area of the left
figure is $c^2$. The shaded inner square has side length $b - a$. Rearranging the
pieces of the left figure we get the right figure consisting of a small square of
area $a^2$, and a larger composite square of area $b^2$. Therefore $a^2 + b^2 = c^2$.

Figure 4: Proof II: The Ascent of Man. Jacob Bronowski in his book
The Ascent of Man discusses this proof on pages 156-162. The book is
based on the 1972 BBC television series of the same name. The small side
of the triangle is a, the long side b, the hypotenuse c. The area of the left
figure is $c^2$. The shaded inner square has side length $b - a$. Rearranging the
pieces of the left figure we get the right figure consisting of a square region
of area $b^2$ (shaded region), and a square region of area $a^2$ (unshaded region).
Therefore $a^2 + b^2 = c^2$.

2.3 Proof III: Garfield


James A. Garfield contributed an original proof of the Pythagorean theorem.
Of course most proofs of this theorem are rather similar. I had heard about
Garfield's proof many times, but had not actually seen it. However, his
proof is presented in the book: Welchons, Krickenberger, Pearson, Plane
Geometry.
I graduated from James A. Garfield elementary school in Long Beach,
California, a few years back, so I am closely connected to Garfield. Garfield
was one of our assassinated presidents, a rather interesting person, an
exception to our rather dull and dim-witted group of presidents in general. His
assassin Charles Guiteau had a connection with the Oneida Community in
Oneida, New York. This was a 19th century social experiment devoted to
free love. For an interesting treatment of these matters see Sarah Vowell's
book Assassination Vacation. If you are not familiar with Sarah, her quirky
personality and her squeaky voice, as heard on This American Life, you
are really missing out.
Garfield's proof uses two copies of the triangle, which has short side a,
long side b, and hypotenuse c. We rest one copy on its short side a, the
other on its long side b, so that the two triangles touch at a point and
their bases lie along one line. Then we add a line joining the top vertex of
the first triangle to the top vertex of the second triangle, getting a
trapezoid (see the Garfield figure). A trapezoid is a quadrilateral with two
parallel opposite sides. The area of a trapezoid is the average length of its
two parallel sides times the perpendicular distance between them (this can be
shown by decomposing the trapezoid into two triangles by drawing a diagonal).
Here the parallel sides have lengths a and b, and the distance between them
is a + b. So the area of the trapezoid is

\[ A = \frac{a+b}{2}\,(a + b) = \frac{1}{2}(a^2 + 2ab + b^2) = \frac{a^2}{2} + ab + \frac{b^2}{2}. \]

On the other hand, writing the area as the sum of the areas of the three
triangles, we have

\[ A = \frac{ab}{2} + \frac{ab}{2} + \frac{c^2}{2} = ab + \frac{c^2}{2}. \]

Equating these two expressions for A, we obtain

\[ a^2 + b^2 = c^2. \]

Figure 5: Proof III: Garfield's Proof. The long side of the triangle is
b, the short side a, the hypotenuse c. The area of the trapezoid is
$A = (a + b)(a + b)/2 = a^2/2 + ab + b^2/2$. The area as the sum of the three
triangles is $A = ab + c^2/2$. Equating the two expressions for A we obtain the
result $a^2 + b^2 = c^2$.

2.4 Proof IV: An Arrangement of Four Triangles in a Square of side a + b

Consider the figure called "An Arrangement of Four Triangles in a
Square of side a + b". The short side of the triangle is a, the long side b,
and the hypotenuse c. The area of the enclosing square is
$A = (a + b)^2 = a^2 + 2ab + b^2$. The area of the four triangles and the inner
square is $A = 4(ab/2) + c^2 = 2ab + c^2$. Equating these two expressions for
the area A we have $a^2 + b^2 = c^2$.

2.5 Proof V: An Arrangement of Four Triangles in a Square of side c

Consider the figure called "An Arrangement of Four Triangles in a
Square of side c". The short side of the triangle is a, the long side b, and
the hypotenuse c. The area of the enclosing square is $A = c^2$. The area of
the four triangles and the inner square is
$A = 4(ab/2) + (b - a)^2 = 2ab + b^2 - 2ab + a^2 = a^2 + b^2$.
Equating these two expressions for the area A we have $a^2 + b^2 = c^2$.

2.6 Remarks on Geometric Proofs, Versus Algebraic Proofs

Euclid's proof is purely geometric, with no reliance on algebra. The figure
titled "An Arrangement That Leads to a Geometric Proof" leads to another
purely geometric proof. Most of the other proofs involve a slight amount of
algebra.

2.7 Proof VI: Equating Two Square Arrangements


Referring to the figure for proof VI, the left enclosing square and the right
enclosing square have the same area, and each contains four copies of the
triangle. Therefore the sum of the areas of the two shaded squares on the left
is equal to the area of the shaded square on the right. That is, the sum of the
square on the short side of the triangle and the square on the long side of the
triangle is equal to the square on the hypotenuse.

Figure 6: Proof IV: An Arrangement of Four Triangles in a Square
of side a + b. The short side of the triangle is a, the long side b, and the
hypotenuse c. The area of the enclosing square is $A = (a + b)^2$. The area
of the four triangles and the inner square is $A = 4(ab/2) + c^2 = 2ab + c^2$.
Equating these two expressions for the area A we have $a^2 + b^2 = c^2$.

Figure 7: Proof V: An Arrangement of Four Triangles in a Square of
side c. The short side of the triangle is a, the long side b, and the hypotenuse
c. The area of the enclosing square is $A = c^2$. The area of the four triangles
and the inner square is $A = 4(ab/2) + (b - a)^2 = a^2 + b^2$. Equating these
two expressions for the area A we have $a^2 + b^2 = c^2$.

Figure 8: An Arrangement That Leads to a Geometric Proof. By
equating this square with a certain second square also of side a + b we arrive at
a clearly geometric proof of the Pythagorean Theorem that uses no algebra,
and could have been employed by the Greeks, who did not have algebra
available. They used number in their arguments, but to them numbers were
line segment lengths. Through this means Euclid treated the concept of
proportional numbers.

Figure 9: Proof VI: Equating Two Square Arrangements. The left
enclosing square and the right enclosing square have the same area. Therefore
the sum of the areas of the two shaded squares on the left is equal to the
area of the shaded square on the right. That is, the sum of the square on the
short side of the triangle and the square on the long side of the triangle is
equal to the square on the hypotenuse.

2.8 Proof VII: Triangle Area Proportional to the Hypotenuse Squared

See the figure for proof VII. Let the short side of the triangle be a, the
long side b, and the hypotenuse c. Similar right triangles have areas
proportional to the squares of their hypotenuses, with the same
proportionality constant, say $k$. This follows because similar triangles have
corresponding sides that are proportional, so the ratios of triangle sides
are the same for similar triangles. This is established in Euclidean geometry,
and is the basis of trigonometry. In particular similar triangles have the
same acute angles. Let one of them be $\theta$. Then the area of the triangle is

\[ A = \frac{ab}{2} = \frac{1}{2}\, c\cos(\theta)\, c\sin(\theta) = k c^2, \]

where $k = \frac{1}{2}\cos(\theta)\sin(\theta)$. The vertical line divides the
triangle into two triangles similar to the original, a left one and a right one.
The hypotenuse of the left sub-triangle is a, that of the right one b. Thus
their areas are $k a^2$ and $k b^2$. The area of the original triangle is
$k c^2$. So $k a^2 + k b^2 = k c^2$. Thus $a^2 + b^2 = c^2$.

2.9 Proof VIII: Similarity and Proportion


Let the short side of the triangle be a, the long side b, and the hypotenuse c.
The vertical line divides the triangle into two sub-triangles, both similar to
the original. Referring to the figure, corresponding sides are proportional.
The hypotenuse c is divided into two segments, $c_1$ on the left and $c_2$ on
the right. We have

\[ \frac{a}{c_1} = \frac{c}{a}, \]

so

\[ a^2 = c_1 c. \]

Similarly

\[ \frac{b}{c_2} = \frac{c}{b}, \]

so

\[ b^2 = c_2 c. \]

Then

\[ a^2 + b^2 = (c_1 + c_2)\,c = c^2. \]
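As a quick numerical illustration of the proportions above, the following Python sketch uses the side lengths from Figure 1 (a = 6, b = 8, c = 10). The variable names are chosen here for illustration only, and the computation is a consistency check for one triangle, not a proof.

```python
# Numerical check of the proportions in Proof VIII for the 6-8-10 triangle.
a, b, c = 6.0, 8.0, 10.0

# Similar triangles give a/c1 = c/a and b/c2 = c/b.
c1 = a * a / c   # segment of the hypotenuse next to side a
c2 = b * b / c   # segment of the hypotenuse next to side b

# The two segments should make up the whole hypotenuse.
segments_sum_to_c = abs((c1 + c2) - c) < 1e-12
print(c1, c2, segments_sum_to_c)
```

The two segments come out as 3.6 and 6.4, which add up to the hypotenuse length 10.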

Figure 10: Proof VII: Area Proportional to the Hypotenuse Squared.
Let the short side of the triangle be a, the long side b, and the hypotenuse
c. Similar right triangles have areas proportional to the squares of their
hypotenuses, with the same proportionality constant, say $k$. This follows
because similar triangles have corresponding sides that are proportional, so
the ratios of triangle sides are the same for similar triangles. This is
established in Euclidean geometry, and is the basis of trigonometry. In
particular similar triangles have the same acute angles. Let one of them be
$\theta$. Then the area of the triangle is
$A = (ab)/2 = \frac{1}{2} c\cos(\theta)\, c\sin(\theta) = k c^2$, where
$k = \frac{1}{2}\cos(\theta)\sin(\theta)$. The vertical line divides the triangle
into two similar triangles, a left one and a right one. The hypotenuse of the
left sub-triangle is a, that of the right one b. Thus their areas are $k a^2$
and $k b^2$. The area of the original triangle is $k c^2$. So
$k a^2 + k b^2 = k c^2$. Thus $a^2 + b^2 = c^2$.

Figure 11: Proof VIII: Similarity and Proportion. Let the short side
of the triangle be a, the long side b, and the hypotenuse c. The vertical
line divides the triangle into two similar triangles. Corresponding sides are
proportional. The hypotenuse c is divided into two segments, $c_1$ on the left
and $c_2$ on the right. We have $a/c_1 = c/a$, so $a^2 = c_1 c$. And
$b/c_2 = c/b$, so $b^2 = c_2 c$. Then $a^2 + b^2 = (c_1 + c_2)c = c^2$.

3 A Crisis in Greek Mathematics: What is a real number?

For the Greeks numbers were lengths of line segments. Fractions (rational
numbers) are obtained by dividing line segments into equal pieces. They
discovered that the diagonal of a square cannot be equal to any multiple of
a fractional division of the side of the square. This is a big problem for
their concept of number!
We show that the square root of a prime number is not rational. So suppose
the integer p is a prime, having no factors other than 1 and itself. Suppose
$\sqrt{p}$ could be written as a rational number, as a fraction say $m/n$,
where m and n have no common factor (if they had, we could divide out the
common factors):

\[ \sqrt{p} = \frac{m}{n}. \]

Squaring we have

\[ p = \frac{m^2}{n^2}. \]

Then

\[ n^2 p = m^2. \]

Since the prime p divides $m^2$, p must be a factor of m, say

\[ m = pr. \]

Then

\[ n^2 p = p^2 r^2, \]

so $n^2 = p r^2$. But this implies that p is a factor of n. This contradicts
our assumption that m and n had no common factor. Therefore the square root
of a prime is not a rational number.
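The argument can be illustrated, though of course not proved, by a finite search. The Python sketch below looks for integers m and n with $m^2 = p n^2$, which is what $\sqrt{p} = m/n$ would require. The function name `rational_sqrt_witness` and the search bound are hypothetical choices made for this illustration.

```python
import math

def rational_sqrt_witness(p, max_n):
    """Search for integers (m, n), 1 <= n <= max_n, with m*m == p*n*n.

    Returns the first such pair, or None if no pair exists in the range.
    """
    for n in range(1, max_n + 1):
        target = p * n * n
        m = math.isqrt(target)        # exact integer square root
        if m * m == target:
            return (m, n)
    return None

# For primes no witness is found; for the non-prime 4 the search succeeds.
for p in (2, 3, 5, 7, 11):
    print(p, rational_sqrt_witness(p, 10000))
print(4, rational_sqrt_witness(4, 10000))
```

A search of this kind can only rule out denominators up to the chosen bound; the proof above rules out all of them at once.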
The real numbers may be defined as Dedekind cuts, or as equivalence
classes of Cauchy sequences.
LEAST UPPER BOUND AXIOM. If A is any nonempty set of real numbers
that is bounded above, then A has a least upper bound.

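4 Inequalities
The Cauchy-Schwarz and Minkowski inequalities (see Goldberg).

Cauchy-Schwarz:

\[ \sum_{n=1}^{\infty} s_n t_n \le \left[ \sum_{n=1}^{\infty} s_n^2 \right]^{1/2} \left[ \sum_{n=1}^{\infty} t_n^2 \right]^{1/2} \]

Minkowski:

\[ \left[ \sum_{n=1}^{\infty} (s_n + t_n)^2 \right]^{1/2} \le \left[ \sum_{n=1}^{\infty} s_n^2 \right]^{1/2} + \left[ \sum_{n=1}^{\infty} t_n^2 \right]^{1/2} \]

If a, b, c are vectors in a normed vector space (triangle inequality):

\[ \|c - a\| \le \|b - a\| + \|c - b\|. \]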
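For finite sequences both inequalities are easy to test numerically. The following Python sketch does so for two randomly generated sequences; the helper names `dot` and `norm`, the sequence length, and the rounding tolerance are assumptions made for this illustration.

```python
import math
import random

def dot(s, t):
    """Sum of the products s_n * t_n."""
    return sum(x * y for x, y in zip(s, t))

def norm(s):
    """Square root of the sum of the squares s_n^2."""
    return math.sqrt(dot(s, s))

random.seed(0)
s = [random.uniform(-1.0, 1.0) for _ in range(50)]
t = [random.uniform(-1.0, 1.0) for _ in range(50)]
tol = 1e-12  # allowance for floating point rounding

# Cauchy-Schwarz: |sum s_n t_n| <= ||s|| ||t||
cauchy_schwarz_ok = abs(dot(s, t)) <= norm(s) * norm(t) + tol

# Minkowski: ||s + t|| <= ||s|| + ||t||
s_plus_t = [x + y for x, y in zip(s, t)]
minkowski_ok = norm(s_plus_t) <= norm(s) + norm(t) + tol

# Equality in Cauchy-Schwarz when one sequence is a multiple of the other.
t2 = [2.0 * x for x in s]
equality_gap = abs(abs(dot(s, t2)) - norm(s) * norm(t2))

print(cauchy_schwarz_ok, minkowski_ok, equality_gap)
```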

5 Euclidean Distance
From the Pythagorean Theorem we are able to define the Euclidean distance
between points. So if we have two points with respective coordinates
$p_1 = (x_1, y_1, z_1)$ and $p_2 = (x_2, y_2, z_2)$, the distance between the
points is

\[ d = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2 + (z_2 - z_1)^2}. \]

So now we can talk about the nearness of points and thus talk about
concepts such as continuity and differentiability. Also we are able to
formulate the ideas of analytic geometry.
For example we are able to define the ellipse as the locus of points for
which the sum of the distances to two fixed points, called the foci, is
constant. Doing this we arrive at the canonical representation of an ellipse
with the equation

\[ \frac{x^2}{a^2} + \frac{y^2}{b^2} = 1, \]

and the standard equation of the ellipsoid

\[ \frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} = 1. \]

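The focal definition of the ellipse can be checked against the canonical equation numerically. In the Python sketch below, the semi-axes a = 5 and b = 3 and the function name `distance` are illustrative assumptions. For $a > b$ the foci lie at $(\pm\sqrt{a^2 - b^2}, 0)$, and the constant sum of distances is $2a$.

```python
import math

def distance(p, q):
    """Euclidean distance between two points of the same dimension."""
    return math.sqrt(sum((qi - pi) ** 2 for pi, qi in zip(p, q)))

a, b = 5.0, 3.0
f = math.sqrt(a * a - b * b)          # focal distance from the center
foci = ((-f, 0.0), (f, 0.0))

# Sample points on x^2/a^2 + y^2/b^2 = 1 and sum the distances to the foci.
sums = []
for k in range(12):
    t = 2.0 * math.pi * k / 12
    p = (a * math.cos(t), b * math.sin(t))
    sums.append(distance(p, foci[0]) + distance(p, foci[1]))

print(min(sums), max(sums))   # both should be close to 2a
```

The same `distance` function works in three dimensions, for example between (0, 0, 0) and (1, 2, 2).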
6 Distance Functions and the Metric Space
A metric is a distance function $\rho$ defined on some set of points M with
the following four properties. For points a, b, c of M,

\[ \rho(a, b) \ge 0 \quad (i) \]

\[ \rho(a, b) = \rho(b, a) \quad (ii) \]

\[ \rho(a, a) = 0 \quad (iii) \]

and

\[ \rho(a, c) \le \rho(a, b) + \rho(b, c) \quad (iv) \]

An open ball about the point a of radius r, B(a, r), is the set of all points
p such that

\[ \rho(a, p) < r. \]

A metric space $(M, \rho)$ consists of a set M with a metric $\rho$.
An open set in a metric space is a set A so that for every point a in A
there exists some open ball about a that is a subset of A. A metric space M
and the class of all its open subsets form a topological space.
The metric for ordinary Euclidean two dimensional space is defined by
the Pythagorean Theorem. So let point $p_1 = (x_1, y_1)$ and point
$p_2 = (x_2, y_2)$. Then the Euclidean distance between the points is the
square root of the sum of the squares of the differences of the coordinates,

\[ d(p_1, p_2) = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2}, \]

which by the Pythagorean Theorem is the length of the line segment
connecting the two points.
So the triangle inequality says that the sum of the lengths of two sides
of a triangle is at least the length of the third side.
This is metric property (iv):

\[ \rho(a, c) \le \rho(a, b) + \rho(b, c) \quad (iv) \]

For this simple two dimensional case, let a and b now denote the lengths of
two sides of a triangle, $\theta$ the angle between them, and c the length of
the third side. From the law of cosines,

\[ c^2 = a^2 + b^2 - 2ab\cos(\theta) \le a^2 + b^2 + 2ab = (a + b)^2, \]

so $c \le a + b$.
For more general arguments see lineara.pdf, Topics In Linear Algebra
and Its Applications by James Emery.

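The four metric properties and the open ball can be made concrete for the Euclidean plane. The Python sketch below checks properties (i) through (iv) on a small random sample of points; the names `rho` and `in_open_ball`, the sample size, and the rounding tolerance are assumptions made for this illustration. A finite check of this kind is only a sanity test, not a proof.

```python
import itertools
import math
import random

def rho(p, q):
    """Euclidean metric in the plane."""
    return math.hypot(q[0] - p[0], q[1] - p[1])

def in_open_ball(center, r, p):
    """True if p lies in the open ball B(center, r)."""
    return rho(center, p) < r

random.seed(1)
pts = [(random.uniform(-5, 5), random.uniform(-5, 5)) for _ in range(20)]
tol = 1e-12  # allowance for floating point rounding

axioms_hold = all(
    rho(a, b) >= 0.0                           # (i)
    and abs(rho(a, b) - rho(b, a)) <= tol      # (ii)
    and rho(a, a) == 0.0                       # (iii)
    and rho(a, c) <= rho(a, b) + rho(b, c) + tol   # (iv)
    for a, b, c in itertools.product(pts, repeat=3)
)
print(axioms_hold)
print(in_open_ball((0.0, 0.0), 1.0, (0.6, 0.7)))   # inside the unit ball
print(in_open_ball((0.0, 0.0), 1.0, (1.0, 0.0)))   # on the boundary, not inside
```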
7 Vector Spaces and Inner Product Spaces

8 Normed Linear Spaces

9 Normed Linear Spaces and Functional Analysis

10 Hilbert Space and $\ell^2$

11 Orthogonality, Orthogonal Polynomials, Fourier Series

12 Projections

13 Linear Least Squares Problems as Geometric Problems: Orthogonality and the Pythagorean Theorem

14 Elementary Formulation of the Least Squares Problem for Straight Line Fitting

The traditional way of deriving least squares equations is to write the
expression for the sum of the squared differences between the given data and
the approximating function, and then to set the partial derivatives with
respect to the coefficients of the approximating function to zero. Let us do
this for the case of fitting a straight line to given data. Assume the model
$f(x) = ax + b$ and minimize

\[ r(a, b) = \sum_{i=1}^{n} (a x_i + b - y_i)^2. \]

The conditions for a minimum are

\[ \frac{\partial r}{\partial a} = \sum_{i=1}^{n} 2 x_i (a x_i + b - y_i) = 0 \]

\[ \frac{\partial r}{\partial b} = \sum_{i=1}^{n} 2 (a x_i + b - y_i) = 0. \]

We get a two by two system of equations,

\[ a \sum_{i=1}^{n} x_i^2 + b \sum_{i=1}^{n} x_i = \sum_{i=1}^{n} x_i y_i \]

\[ a \sum_{i=1}^{n} x_i + b \sum_{i=1}^{n} 1 = \sum_{i=1}^{n} y_i. \]

These equations are known as the normal equations of the problem. They
have a unique solution if the determinant is not zero, that is if

\[ n \sum_{i=1}^{n} x_i^2 - \left( \sum_{i=1}^{n} x_i \right)^2 \ne 0. \]

If the x values are not all equal this follows from the Cauchy-Schwarz
inequality applied to the vectors $(1, 1, \ldots, 1)$ and
$(x_1, x_2, \ldots, x_n)$, because equality holds in that inequality only when
one vector is a multiple of the other. The general problem can be viewed more
naturally as being geometric.
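The two by two normal equations above can be solved directly by Cramer's rule. The following Python sketch does this; the function name `fit_line` and the sample data are hypothetical and chosen only to illustrate the formulas.

```python
def fit_line(xs, ys):
    """Fit f(x) = a*x + b by solving the 2 by 2 normal equations.

    Returns (a, b). Raises ValueError when the determinant
    n*sum(x^2) - (sum x)^2 is zero, i.e. when all x values are equal.
    """
    n = len(xs)
    sx = sum(xs)
    sy = sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    det = n * sxx - sx * sx
    if det == 0:
        raise ValueError("all x values are equal; normal equations are singular")
    # Cramer's rule applied to
    #   a*sxx + b*sx = sxy
    #   a*sx  + b*n  = sy
    a = (n * sxy - sx * sy) / det
    b = (sxx * sy - sx * sxy) / det
    return a, b

# Data lying exactly on the line y = 2x + 1.
a, b = fit_line([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
print(a, b)
```

For data that lie exactly on a line, as here, the fit recovers the slope and intercept of that line.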

15 A Geometric View of the Least Squares Problem

The abstract linear least squares problem may be formulated as approximation
in a vector space by some element of a subspace. Often this vector space
is a space of functions. As examples the subspace could be generated by a
basis such as

\[ 1, x, x^2, x^3, \ldots, \]

or such as

\[ 1, \cos(t), \sin(t), \cos(2t), \sin(2t), \ldots \]

The first case would be a polynomial, or power series, approximation. The
second would be a Fourier, or trigonometric, approximation. So consider
a vector space V with an inner product of u with v, written as (u, v). Given
a subspace S and an arbitrary element g of V, we are to find the element in
S that best approximates g in the norm corresponding to the inner product.
The $L^2$ norm for functions is based on the inner product

\[ (f, g) = \int f g, \]

and for sequences is based on the inner product

\[ (f, g) = \sum_{i=1}^{n} f_i g_i. \]

This $L^2$ norm corresponds directly to the "squares" part of the least squares
approximation. But the theory carries through for an arbitrary inner product.
The norm defined by an inner product is

\[ \|f\| = (f, f)^{1/2}. \]

A solution $f \in S$ minimizes

\[ (f - g, f - g) = \|f - g\|^2. \]

We will show that the problem is solved by the orthogonal projection of a
vector into a subspace. One can think of this as analogous to the simple
geometric problem of projecting a vector in space onto a plane. Think of a
vector from the origin to a point, and think of a plane through the origin,
not containing this vector. The plane is a vector space. The vector in the
plane closest to the original vector is obviously the orthogonal projection of
the vector onto the plane. The same thing happens in the general problem,
where the plane becomes the subspace. For example the subspace might be
the set of all cubic polynomials, and the problem is to find the cubic
polynomial that best fits the data.
Two vectors are orthogonal, i.e. perpendicular, if their inner product is
zero. We require a preliminary theorem to prove the main proposition.
Pythagorean Theorem. If $v_1$ is orthogonal to $v_2$, then

\[ \|v_1 + v_2\|^2 = \|v_1\|^2 + \|v_2\|^2. \]

Proof.

\[ (v_1 + v_2, v_1 + v_2) = (v_1, v_1) + 2(v_1, v_2) + (v_2, v_2) = (v_1, v_1) + (v_2, v_2). \]

Proposition. If $f \in S$ and $(g - f, h) = 0$ for all $h \in S$, then f is a
solution to the least squares problem.
Proof. Let $s \in S$. We have

\[ \|g - s\|^2 = \|(g - f) + (f - s)\|^2 = \|g - f\|^2 + \|f - s\|^2 \ge \|g - f\|^2. \]

By assumption, $g - f$ is orthogonal to the subspace S, and $f - s$ is in S. So
the second equality is a consequence of the Pythagorean Theorem.
We have shown that

\[ \|g - s\| \ge \|g - f\| \quad \text{for all } s \in S, \]

so f is the best approximation to g in S and this completes the proof.
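The proposition can be illustrated with the picture of a vector in space projected onto a plane. In the Python sketch below the plane is the xy-plane, spanned by two orthonormal vectors; the function names and the particular vectors are assumptions made for this illustration. The residual g - f is orthogonal to the plane, and the Pythagorean identity from the proof holds for a competing element s of the plane.

```python
def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def sub(u, v):
    return [x - y for x, y in zip(u, v)]

def project_onto_plane(g, u1, u2):
    """Orthogonal projection of g onto span{u1, u2}.

    Assumes u1 and u2 are orthonormal.
    """
    c1, c2 = dot(g, u1), dot(g, u2)
    return [c1 * a + c2 * b for a, b in zip(u1, u2)]

g = [1.0, 2.0, 3.0]
u1 = [1.0, 0.0, 0.0]
u2 = [0.0, 1.0, 0.0]

f = project_onto_plane(g, u1, u2)      # the best approximation in the plane
residual = sub(g, f)                   # g - f, orthogonal to the plane

s = [4.0, -1.0, 0.0]                   # some other element of the plane
lhs = dot(sub(g, s), sub(g, s))                               # ||g - s||^2
rhs = dot(residual, residual) + dot(sub(f, s), sub(f, s))     # ||g - f||^2 + ||f - s||^2
print(f, residual, lhs, rhs)
```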


Notice that a unique solution always exists because f is the unique
orthogonal projection of g into S. For finite dimensional subspaces the
solution can be formulated as a solution to a set of n linear equations in n
unknowns. Let S equal the span of $f_1, \ldots, f_n$. Let the solution be

\[ f = c_1 f_1 + c_2 f_2 + \cdots + c_n f_n. \]

Then the minimum condition is equivalent to

\[ (f_i, c_1 f_1 + c_2 f_2 + \cdots + c_n f_n - g) = 0, \quad i = 1, \ldots, n. \]

This is the same as

\[ c_1 (f_i, f_1) + c_2 (f_i, f_2) + \cdots + c_n (f_i, f_n) = (f_i, g), \quad i = 1, \ldots, n. \]

These n linear equations in n unknowns are called the normal equations of
the problem. In the usual case, S is a space of discrete functions. These are
functions defined on a finite domain. Suppose there are m data values so
that the domain is

\[ \{p_1, p_2, \ldots, p_m\}. \]

We identify the function $f_i$ with the vector

\[ f_i = \begin{bmatrix} f_i(p_1) \\ f_i(p_2) \\ \vdots \\ f_i(p_m) \end{bmatrix}. \]

So $f_i$ is an m dimensional column vector of values of the ith function. We can
formulate the minimum conditions with matrices. The inner product is then
the transpose of the first vector times the second. We write the transpose of
a vector v as $v^t$. We have $(f_i, f_j) = f_i^t f_j$. Then

\[ c_1 (f_i, f_1) + c_2 (f_i, f_2) + \cdots + c_n (f_i, f_n) = (f_i, g), \quad i = 1, \ldots, n. \]
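Thus

\[ \begin{bmatrix} f_i^t f_1 & \cdots & f_i^t f_n \end{bmatrix} \begin{bmatrix} c_1 \\ \vdots \\ c_n \end{bmatrix} = f_i^t g. \]

If we let A be an m row by n column matrix, whose ith column is $f_i$, then

\[ A = \begin{bmatrix} f_1 & f_2 & \cdots & f_n \end{bmatrix}. \]

Written out

\[ A = \begin{bmatrix} f_1(p_1) & f_2(p_1) & \cdots & f_n(p_1) \\ f_1(p_2) & f_2(p_2) & \cdots & f_n(p_2) \\ \vdots & \vdots & & \vdots \\ f_1(p_m) & f_2(p_m) & \cdots & f_n(p_m) \end{bmatrix}. \]

Also let

\[ B = \begin{bmatrix} g(p_1) \\ \vdots \\ g(p_m) \end{bmatrix}. \]

The normal equations become

\[ A^t A \begin{bmatrix} c_1 \\ \vdots \\ c_n \end{bmatrix} = A^t B. \]

Note that the original approximation problem in this form is a system of m
equations in n unknowns

\[ A \begin{bmatrix} c_1 \\ \vdots \\ c_n \end{bmatrix} \approx B. \]
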
Any linear system of this form with m > n can be interpreted as a least
squares problem and has an approximate least squares solution. The matrices
A and B are a convenient input set to a general linear least squares solver
(see the listing of subroutine llsq).
There is always a unique solution to the linear least squares problem.
The solution is the orthogonal projection into the subspace. But there will
be more than one solution to the normal equations if the given functions
spanning the subspace are not linearly independent. The normal equations
always have a solution, so they are consistent. From the theory of linear
equations, if the determinant D of the coefficient matrix of the normal
equations is not zero, then there is a unique solution. Then we can solve the
equations either by inverting the coefficient matrix, or by Gaussian
elimination. If D is zero, then there is more than one solution, and the
general solution will involve one or more variables of arbitrary value.
Straightforward Gaussian elimination will fail. The D = 0 solutions can be
computed by using elementary row operations, which can be done numerically
or with various computer algebra programs.
When we are concerned only with the discrete space, it does not matter that
there are multiple solutions to the normal equations, because every solution
gives a linear combination equal to the unique projection into the subspace.
The various solutions just give different linear combinations of dependent
vectors that equal the same vector. On the other hand, if points other than
the sample points are in the relevant domain of the functions, then the
multiple solutions may give functions that are not the same on this extended
domain. To illustrate, compare functions f and g, where $f(x) = x(x - 1)$
is equal to zero on the domain consisting of x = 0 and x = 1, but is not zero
on the extended domain of all real numbers. Let g be the true zero function,
$g(x) = 0$. The two functions agree on {0, 1}, but give different values on
an extended domain. Frequently we want to use the least squares solution
for interpolation between the given data points, and so the case of multiple
solutions to the normal equations does have consequences.
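A small Python sketch makes this concrete. It uses the sample points 0 and 1 and the two functions x and x^2, which agree at both sample points and so are linearly dependent as discrete functions (their difference is the function x(1 - x), which vanishes there). The data values, function names, and the two particular solutions are hypothetical choices for this illustration.

```python
def f1(x):
    return x

def f2(x):
    return x * x

points = [0.0, 1.0]
data = [0.5, 2.0]

def gram_and_rhs(funcs, points, data):
    """Return (A^t A, A^t B) for the columns f(points)."""
    cols = [[f(p) for p in points] for f in funcs]
    ata = [[sum(u * v for u, v in zip(ci, cj)) for cj in cols] for ci in cols]
    atb = [sum(u * y for u, y in zip(ci, data)) for ci in cols]
    return ata, atb

def satisfies(ata, atb, c, tol=1e-12):
    """True if c solves the normal equations ata * c = atb."""
    return all(abs(sum(r * ck for r, ck in zip(row, c)) - rhs) <= tol
               for row, rhs in zip(ata, atb))

def fitted(c, x):
    return c[0] * f1(x) + c[1] * f2(x)

ata, atb = gram_and_rhs([f1, f2], points, data)
det = ata[0][0] * ata[1][1] - ata[0][1] * ata[1][0]   # zero: dependent columns

sol_a = [2.0, 0.0]    # fitted function 2x
sol_b = [0.0, 2.0]    # fitted function 2x^2

print(det, satisfies(ata, atb, sol_a), satisfies(ata, atb, sol_b))
print([fitted(sol_a, p) for p in points], [fitted(sol_b, p) for p in points])
print(fitted(sol_a, 0.5), fitted(sol_b, 0.5))   # differ between the sample points
```

Both coefficient vectors solve the normal equations and give the same values at the sample points, yet the two fitted functions disagree at x = 0.5.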
We will show that if $f_1, \ldots, f_n$ are linearly independent then the normal
equations have a unique solution. This is evident because in this case
$f_1, \ldots, f_n$ is a basis of S and the unique solution f in S has unique
components with respect to this basis. It is also a direct consequence of the
following proposition.
Proposition. If $f_1, \ldots, f_n$ are the linearly independent columns of a
matrix A, which has m > n rows, then $\det(A^t A)$ is not equal to zero.
Proof. Suppose the determinant is zero. Then there exist $c_1, c_2, \ldots, c_n$,
not all zero, such that

\[ c_1 \begin{bmatrix} (f_1, f_1) \\ (f_2, f_1) \\ \vdots \\ (f_n, f_1) \end{bmatrix} + c_2 \begin{bmatrix} (f_1, f_2) \\ (f_2, f_2) \\ \vdots \\ (f_n, f_2) \end{bmatrix} + \cdots + c_n \begin{bmatrix} (f_1, f_n) \\ (f_2, f_n) \\ \vdots \\ (f_n, f_n) \end{bmatrix} = 0. \]

Let

\[ v = c_1 f_1 + \cdots + c_n f_n. \]

The ith row of the first equation shows that $(f_i, v) = 0$, for
$i = 1, \ldots, n$. It follows that $(v, v) = \sum_i c_i (f_i, v) = 0$. This
implies $v = 0$, and so, by linear independence, each $c_i$ is zero. This is a
contradiction, so the proposition is true.
Example 1. We are to fit the function

\[ y = f(x) = a \sin(x) + b \cos(x) \]

to the data

x     y
1.0   3.0
2.5   5.6
3.4   7.8

Apply the sin function to the x values to get the first column of matrix A
and the cos function to get the second column. Let vector B be the y values.
The normal equations are

\[ A^t A C = A^t B \]

or in terms of the components

\[ \begin{bmatrix} 1.13154358 & 0.22224325 \\ 0.22224325 & 1.86845642 \end{bmatrix} C = \begin{bmatrix} 3.882636366 \\ -10.40652323 \end{bmatrix}. \]

The solution is

\[ C = \begin{bmatrix} 4.6334245 \\ -6.120705005 \end{bmatrix}. \]

So

\[ f(x) = 4.6334245 \sin(x) - 6.120705005 \cos(x). \]
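The numbers in Example 1 can be reproduced with a few lines of Python. This sketch is independent of the Fortran subroutine in the text; it forms the 2 by 2 normal equations directly and solves them by Cramer's rule, and the variable names are chosen here for illustration.

```python
import math

xs = [1.0, 2.5, 3.4]
ys = [3.0, 5.6, 7.8]

# Columns of A are sin(x) and cos(x) evaluated at the data points.
A = [[math.sin(x), math.cos(x)] for x in xs]

# Normal equations: (A^t A) C = A^t B
ata = [[sum(row[i] * row[j] for row in A) for j in range(2)] for i in range(2)]
atb = [sum(row[i] * y for row, y in zip(A, ys)) for i in range(2)]

# Solve the 2 by 2 system by Cramer's rule.
det = ata[0][0] * ata[1][1] - ata[0][1] * ata[1][0]
a = (atb[0] * ata[1][1] - ata[0][1] * atb[1]) / det
b = (ata[0][0] * atb[1] - ata[1][0] * atb[0]) / det

print(ata)
print(atb)
print(a, b)
```

The computed coefficient a is positive and b is negative, in agreement with the fitted function above.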
The following Fortran subroutine does the linear least squares computations.
It forms the normal equations and then calls the Gaussian elimination
routine gausse, which is not listed here.

c+    llsq   least squares solution of a*c=b (solving for c)
      subroutine llsq(a,ia,m,n,ws,c,b,ier)
c     parameters
c     a-m by n matrix. declared row dimension ia.
c       a is overwritten: its leading n by n block is used to
c       hold transpose(a)*a.
c     ws-working storage vector of length m
c     c-vector of size n
c     b-vector of size m
c     ier-return parameter: ier=0 normal return, ier=1 normal
c       equations nearly singular, ier=2 normal equations singular.
c
      dimension a(ia,1),b(1),c(1),ws(1)
c     compute lower elements of jth column of transpose(a)*a
      do 50 j=1,n
         do 18 i=j,n
            s=0.
            do 15 k=1,m
               s=s+a(k,i)*a(k,j)
   15       continue
   18    ws(i)=s
c
c     compute jth element of right side vector
         s=0.
         do 40 k=1,m
   40    s=s+a(k,j)*b(k)
         c(j)=s
c
c     store lower elements of jth column in a
         do 19 i=j,n
   19    a(i,j)=ws(i)
c
   50 continue
c     fill in upper values
      do 60 i=1,n
         do 60 j=i,n
            a(i,j)=a(j,i)
   60 continue
      ib=1
      mm=1
      eps=1.e-12
      inv=0
c     solve normal equations
      call gausse(a,ia,c,ib,n,mm,inv,eps,det,ier)
      return
      end

16 Bibliography
[1] Heath T. L. (translator), Euclid's Elements, 3 Volumes, Dover, 1956.

[2] Welchons A. M., Krickenberger W. R., Pearson Helen R., Plane Geometry,
Ginn and Company, 1958. Garfield proof p. 253.

[3] Bronowski Jacob, The Ascent of Man, Little Brown and Company, 1973.

[4] Halmos Paul R., Introduction to Hilbert Space: And the Theory of
Spectral Multiplicity, Chelsea, 1951. Halmos was an assistant to John
von Neumann.

[5] Halmos Paul R., Finite Dimensional Vector Spaces, Springer-Verlag, 1975.

[6] Diggins Julia E., String, Straight-Edge and Shadow: The Story of
Geometry, The Viking Press, 1965. This is a book for junior high school
students and elementary school teachers. A very nice short book with
pictures, a history of the Greeks and Pythagoras, as well as some interesting
mathematical discussions I had not seen elsewhere.

[7] Pedoe Dan, Geometry and the Liberal Arts, St. Martin's Press, 1976.

[8] Vowell Sarah, Assassination Vacation, Simon and Schuster, 2005.

[9] Goldberg Richard R., Methods of Real Analysis, Blaisdell Publishing
Company, 1964.
