
ANDREW SWANN

CONTENTS

1. Introduction 89
2. Symmetric Matrices 90
3. Orthogonal Changes of Coordinates 91
4. Classifying Degree Two Curves in the Plane 94
5. An Extremal Problem 95
6. Arbitrary Changes of Coordinate 97
7. Critical Points of Smooth Functions 98

1. Introduction

At this point we have concentrated on applications of vectors and matrices to linear problems, particularly solutions of linear equations. However, the theory we have developed is rather flexible and in particular lends itself well to certain situations that are quadratic in nature. Consider the following equation in the plane:
\[
7x^2 + 8xy + y^2 - 3x + 2y - 1 = 0. \tag{1.1}
\]
Here we place one constraint on two variables, so we would expect the solution set to be a curve. What type of curve? We are familiar with some curves that are described by equations like (1.1). Four particular examples come to mind:

Circle: $x^2 + y^2 = r^2$.
Ellipse: $x^2/a^2 + y^2/b^2 = 1$.
Parabola: $y = ax^2$.
Hyperbola: $x^2/a^2 - y^2/b^2 = 1$.
Date: December, 1999; minor revisions January, 1999.
Below we will see that together with straight lines these are essentially the only curves that can arise. We will also show how we can decide which of these correspond to (1.1).

2. Symmetric Matrices

Let $A$ be an $n \times n$ symmetric matrix. The quantity $x^t A x$ defines a polynomial in the $n$ coordinates of $x$. For example, if $A = \begin{pmatrix} 5 & 2 \\ 2 & -3 \end{pmatrix}$ then we have
\[
x^t A x = \begin{pmatrix} x_1 & x_2 \end{pmatrix}
\begin{pmatrix} 5 & 2 \\ 2 & -3 \end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \end{pmatrix}
= 5x_1^2 + 4x_1 x_2 - 3x_2^2.
\]
In general this procedure produces a polynomial that is quadratic in the variables $x_1, \dots, x_n$:
\[
q_A(x_1, x_2, \dots, x_n) = x^t A x
= \sum_{i=1}^n \sum_{j=1}^n a_{ij} x_i x_j
= \sum_{i=1}^n a_{ii} x_i^2 + \sum_{1 \le i < j \le n} 2 a_{ij} x_i x_j.
\]
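This expansion is easy to verify numerically. The following sketch (not part of the original notes; it assumes Python with numpy is available) evaluates $x^t A x$ for the matrix above and compares it with the expanded polynomial:

```python
import numpy as np

# Quadratic form of the worked example: A = [[5, 2], [2, -3]],
# which should give q(x1, x2) = 5*x1^2 + 4*x1*x2 - 3*x2^2.
A = np.array([[5.0, 2.0], [2.0, -3.0]])

def q(A, x):
    """Evaluate the quadratic form q_A(x) = x^t A x."""
    x = np.asarray(x, dtype=float)
    return float(x @ A @ x)

# Compare x^t A x with the expanded polynomial at a few sample points.
for x1, x2 in [(1.0, 0.0), (0.0, 1.0), (2.0, -1.0), (0.5, 3.0)]:
    poly = 5*x1**2 + 4*x1*x2 - 3*x2**2
    assert abs(q(A, (x1, x2)) - poly) < 1e-12
```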

Definition 2.1. We call $q_A$ the quadratic form associated to $A$.

Moreover, any quadratic polynomial without linear or constant terms can be written as $x^t A x$ for a suitable choice of $A$: if
\[
q(x_1, \dots, x_n) = \sum_{i=1}^n b_i x_i^2 + \sum_{1 \le i < j \le n} c_{ij} x_i x_j,
\]
then we take
\[
A = \begin{pmatrix}
b_1 & \tfrac12 c_{12} & \dots & \tfrac12 c_{1n} \\
\tfrac12 c_{12} & b_2 & \dots & \tfrac12 c_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
\tfrac12 c_{1n} & \tfrac12 c_{2n} & \dots & b_n
\end{pmatrix}.
\]
Pay particular attention to the factor $1/2$ that appears in front of the coefficients $c_{ij}$.
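The recipe above, including the factor $1/2$, can be sketched in code. The helper name `matrix_of_form` is my own and not from the notes; the sketch assumes Python with numpy:

```python
import numpy as np

def matrix_of_form(b, c):
    """Symmetric matrix A with q_A(x) = sum b_i x_i^2 + sum_{i<j} c_ij x_i x_j.

    b: sequence of diagonal coefficients b_1, ..., b_n.
    c: dict mapping (i, j) with i < j (0-based) to the coefficient c_ij.
    Note the factor 1/2 on the off-diagonal entries.
    """
    A = np.diag(np.asarray(b, dtype=float))
    for (i, j), cij in c.items():
        A[i, j] = A[j, i] = cij / 2.0
    return A

# q(x1, x2) = 5 x1^2 - 3 x2^2 + 4 x1 x2 recovers the matrix from the text.
A = matrix_of_form([5, -3], {(0, 1): 4})
assert np.allclose(A, [[5, 2], [2, -3]])
x = np.array([2.0, -1.0])
assert abs(x @ A @ x - (5*4 - 3*1 + 4*(-2))) < 1e-12
```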


The two most common examples one encounters are:
\[
n = 2:\quad A = \begin{pmatrix} a & b \\ b & c \end{pmatrix},
\qquad q_A(x, y) = ax^2 + 2bxy + cy^2,
\]
\[
n = 3:\quad A = \begin{pmatrix} a & h & g \\ h & b & f \\ g & f & c \end{pmatrix},
\qquad q_A(x, y, z) = ax^2 + by^2 + cz^2 + 2fyz + 2gxz + 2hxy.
\]
The naming of the entries in the $3 \times 3$ case is standard in mechanics and I find the following mnemonic helpful:

All Hairy Gorillas Have Big Feet Good For Climbing

The other special case to note is when $A$ is a diagonal matrix:
\[
A = \begin{pmatrix} \lambda_1 & & & \\ & \lambda_2 & & \\ & & \ddots & \\ & & & \lambda_n \end{pmatrix},
\qquad q_A(x_1, x_2, \dots, x_n) = \lambda_1 x_1^2 + \lambda_2 x_2^2 + \dots + \lambda_n x_n^2.
\]

3. Orthogonal Changes of Coordinates

Suppose we make a change of coordinates; how does this affect the polynomial $q$? If $x = P\tilde{x}$ for some invertible $n \times n$ matrix $P$, then
\[
q_A(x) = x^t A x = \tilde{x}^t P^t A P \tilde{x} = q_{P^t A P}(\tilde{x}). \tag{3.1}
\]
We thus get a new quadratic form represented by the matrix $P^t A P$. Note that this is different from the formula describing how the matrix $[T]_B$ of a linear transformation changes: we have the transpose $P^t$ instead of the inverse $P^{-1}$. However, if $P$ is an orthogonal matrix, then $P^t = P^{-1}$. We can thus use our theorem on diagonalisability of symmetric matrices to deduce:

Theorem 3.1. Let $q_A$ be the quadratic form associated to a symmetric matrix $A$. Then there is an orthogonal matrix $P$ such that
\[
q_{P^t A P}(x_1, \dots, x_n) = \lambda_1 x_1^2 + \lambda_2 x_2^2 + \dots + \lambda_n x_n^2,
\]
where $\lambda_1, \dots, \lambda_n$ are the eigenvalues of $A$.
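Numerically, an orthogonal diagonalisation as in Theorem 3.1 can be obtained from a standard eigensolver. A sketch assuming Python with numpy, whose `numpy.linalg.eigh` returns the eigenvalues of a symmetric matrix together with orthonormal eigenvector columns:

```python
import numpy as np

# Orthogonal diagonalisation of a symmetric matrix: the columns of P are
# orthonormal eigenvectors, so P^t A P is diagonal (Theorem 3.1).
A = np.array([[5.0, -2.0], [-2.0, 5.0]])   # the form 5x^2 - 4xy + 5y^2

eigenvalues, P = np.linalg.eigh(A)          # eigenvalues in ascending order
assert np.allclose(P.T @ P, np.eye(2))      # P is orthogonal
assert np.allclose(P.T @ A @ P, np.diag(eigenvalues))
assert np.allclose(eigenvalues, [3.0, 7.0])
```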


The directions defined by the columns of $P$ are called the principal axes of $q_A$.

Example 3.2. Consider the ellipse $\tfrac14 x^2 + \tfrac19 y^2 = 1$. In this case the quadratic form is $\tfrac14 x^2 + \tfrac19 y^2$ and the associated matrix is diagonal, so the principal axes are aligned with the coordinate axes.

Example 3.3. Consider the equation
\[
5x^2 - 4xy + 5y^2 = 1. \tag{3.2}
\]
The quadratic form $5x^2 - 4xy + 5y^2$ has matrix $A = \begin{pmatrix} 5 & -2 \\ -2 & 5 \end{pmatrix}$. The characteristic polynomial of this matrix is
\[
\det(\lambda I - A) = \det \begin{pmatrix} \lambda - 5 & 2 \\ 2 & \lambda - 5 \end{pmatrix}
= \lambda^2 - 10\lambda + 21 = (\lambda - 3)(\lambda - 7),
\]
so $A$ has eigenvalues $3$ and $7$. We now compute the corresponding eigenvectors:

$\lambda = 3$: We solve $0 = (3I - A)\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} -2 & 2 \\ 2 & -2 \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix}$, so $x = y$ and a corresponding unit eigenvector is $\frac{1}{\sqrt2}\begin{pmatrix} 1 \\ 1 \end{pmatrix}$.

$\lambda = 7$: We solve $0 = (7I - A)\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 2 & 2 \\ 2 & 2 \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix}$, so $y = -x$ and a corresponding unit eigenvector is $\frac{1}{\sqrt2}\begin{pmatrix} 1 \\ -1 \end{pmatrix}$.

Thus
\[
P = \frac{1}{\sqrt2} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}
\]
gives us a coordinate system in which (3.2) becomes
\[
3\tilde{x}^2 + 7\tilde{y}^2 = 1,
\]
which is an ellipse, with principal axes at $45°$ to the usual axes (see Figure 1). Note the ellipse crosses these axes at $\tilde{x} = \pm 1/\sqrt3$ and $\tilde{y} = \pm 1/\sqrt7$, respectively.

Example 3.4. The quadratic part of equation (1.1) is $q(x, y) = 7x^2 + 8xy + y^2$, which is represented by the matrix $A = \begin{pmatrix} 7 & 4 \\ 4 & 1 \end{pmatrix}$. The characteristic polynomial of $A$ is
\[
\det(\lambda I - A) = \det \begin{pmatrix} \lambda - 7 & -4 \\ -4 & \lambda - 1 \end{pmatrix}
= \lambda^2 - 8\lambda - 9,
\]
so $A$ has eigenvalues $\lambda = -1$ and $\lambda = 9$. We now find an orthonormal basis of eigenvectors:

Figure 1: The ellipse $5x^2 - 4xy + 5y^2 = 1$ with its principal axes.

$\lambda = -1$: We solve $0 = (-I - A)\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} -8 & -4 \\ -4 & -2 \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix}$, which gives $y = -2x$. The corresponding eigenvectors are thus multiples of $\begin{pmatrix} 1 \\ -2 \end{pmatrix}$, which has norm squared $1^2 + (-2)^2 = 5$. A unit eigenvector is thus given by
\[
v_1 = \frac{1}{\sqrt5} \begin{pmatrix} 1 \\ -2 \end{pmatrix}.
\]

$\lambda = 9$: We solve $0 = (9I - A)\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 2 & -4 \\ -4 & 8 \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix}$, so $x = 2y$ and a corresponding unit eigenvector is given by
\[
v_2 = \frac{1}{\sqrt5} \begin{pmatrix} 2 \\ 1 \end{pmatrix}.
\]

We build $P$ by using the vectors $v_1$ and $v_2$ for the columns:
\[
P = \frac{1}{\sqrt5} \begin{pmatrix} 1 & 2 \\ -2 & 1 \end{pmatrix}.
\]
The matrix $P$ satisfies
\[
P^t A P = \begin{pmatrix} -1 & 0 \\ 0 & 9 \end{pmatrix}.
\]
The new coordinates are related to the old by $\begin{pmatrix} \tilde{x} \\ \tilde{y} \end{pmatrix} = P^t \begin{pmatrix} x \\ y \end{pmatrix}$, since $P^{-1} = P^t$. So $\tilde{x} = \frac{1}{\sqrt5}(x - 2y)$ and $\tilde{y} = \frac{1}{\sqrt5}(2x + y)$, and the quadratic part of (1.1) becomes
\[
-\tilde{x}^2 + 9\tilde{y}^2.
\]
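The computations in Example 3.4 can be double-checked numerically; a sketch assuming Python with numpy (the eigensolver may choose different signs for the eigenvectors, so we only compare the diagonalised form):

```python
import numpy as np

# Numerical check of Example 3.4.
A = np.array([[7.0, 4.0], [4.0, 1.0]])
eigenvalues, P = np.linalg.eigh(A)          # columns of P are unit eigenvectors
assert np.allclose(eigenvalues, [-1.0, 9.0])
assert np.allclose(P.T @ A @ P, np.diag([-1.0, 9.0]))

# In the new coordinates the quadratic part 7x^2 + 8xy + y^2 becomes
# -x'^2 + 9y'^2: check this at a sample point.
x = np.array([1.0, 2.0])
xp = P.T @ x                                 # new coordinates
assert abs(x @ A @ x - (-xp[0]**2 + 9*xp[1]**2)) < 1e-10
```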


4. Classifying Degree Two Curves in the Plane

Consider a general curve of degree two in the variables $x$ and $y$. This is given by an equation
\[
ax^2 + 2bxy + cy^2 + dx + ey + g = 0. \tag{4.1}
\]
We may rewrite this in terms of matrices as
\[
x^t A x + b^t x + g = 0, \tag{4.2}
\]
where $x = \begin{pmatrix} x \\ y \end{pmatrix}$, $A = \begin{pmatrix} a & b \\ b & c \end{pmatrix}$ and $b = \begin{pmatrix} d \\ e \end{pmatrix}$. We can find an orthogonal matrix $P$ such that
\[
P^t A P = \begin{pmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{pmatrix}.
\]

Changing coordinates by $\begin{pmatrix} x \\ y \end{pmatrix} = P \begin{pmatrix} x' \\ y' \end{pmatrix}$ in (4.2), we see that (4.1) becomes
\[
\lambda_1 x'^2 + \lambda_2 y'^2 + d'x' + e'y' + g = 0,
\qquad \text{where } \begin{pmatrix} d' \\ e' \end{pmatrix} = P^t \begin{pmatrix} d \\ e \end{pmatrix}.
\]
If $\lambda_1$ and $\lambda_2$ are non-zero, then we can remove the linear terms by setting $\bar{x} = x' + d'/(2\lambda_1)$ and $\bar{y} = y' + e'/(2\lambda_2)$ (to complete the square), giving
\[
\lambda_1 \bar{x}^2 + \lambda_2 \bar{y}^2 = g'', \tag{4.3}
\]
where $g'' = d'^2/(4\lambda_1) + e'^2/(4\lambda_2) - g$. Such a change of coordinates corresponds to moving the origin to the point $(-d'/(2\lambda_1), -e'/(2\lambda_2))$. There are now two cases:

$\lambda_1$ and $\lambda_2$ have the same sign: then (4.3) describes an ellipse provided that $g''$ has the same sign as $\lambda_1$ and $\lambda_2$. If $g'' = 0$, we get just a single point; if $g''$ has the opposite sign to the $\lambda_i$ then (4.3) has no solutions.

$\lambda_1$ and $\lambda_2$ have opposite signs: then (4.3) describes a hyperbola provided $g'' \ne 0$. If $g'' = 0$ the hyperbola degenerates to the intersection of two straight lines.

The above cases occur provided both eigenvalues are non-zero. If both eigenvalues vanish, then the original equation (4.1) has no quadratic terms and describes a straight line. If one eigenvalue vanishes, say $\lambda_1 = 0$, then we put $\bar{y} = y' + e'/(2\lambda_2)$ and $\bar{x} = e'^2/(4\lambda_2) - d'x' - g$, to get $\bar{x} = \lambda_2 \bar{y}^2$, which is

a parabola. This requires $d' \ne 0$; if that is not the case, we have $\bar{y}^2 + \tilde{g} = 0$ for a suitable constant $\tilde{g}$, which either gives a pair of parallel lines ($\tilde{g} < 0$), a line ($\tilde{g} = 0$) or is empty ($\tilde{g} > 0$). Thus the only curves other than straight lines that occur are parabolas, hyperbolas or ellipses (including circles).

Example 4.1. Let us continue with (1.1). As the eigenvalues $-1$ and $9$ found in Example 3.4 are of opposite signs, this is a hyperbola unless it is degenerate, i.e., unless $g'' = 0$. We find
\[
\begin{pmatrix} d' \\ e' \end{pmatrix} = P^t \begin{pmatrix} d \\ e \end{pmatrix}
= \frac{1}{\sqrt5} \begin{pmatrix} 1 & -2 \\ 2 & 1 \end{pmatrix} \begin{pmatrix} -3 \\ 2 \end{pmatrix}
= \frac{1}{\sqrt5} \begin{pmatrix} -7 \\ -4 \end{pmatrix}.
\]
So we have
\[
g'' = \frac{d'^2}{4 \cdot (-1)} + \frac{e'^2}{4 \cdot 9} - g
= -\frac{49}{20} + \frac{4}{45} + 1 = -\frac{49}{36} \ne 0.
\]
Therefore (1.1) is a hyperbola.

5. An Extremal Problem

Changing coordinates with an orthogonal matrix has the advantage that distances and angles are preserved. This is because
\[
Px \cdot Py = (Px)^t Py = x^t P^t P y = x^t y = x \cdot y.
\]
Thus such coordinate changes are appropriate in the following type of example.

Example 5.1. Problem: Find the points on the surface
\[
2x^2 + 4y^2 + 2z^2 + 2xz = 4 \tag{5.1}
\]
in $\mathbb{R}^3$ that lie closest to the origin and those that lie furthest from the origin.

Solution: The matrix corresponding to the quadratic form $2x^2 + 4y^2 + 2z^2 + 2xz$ is
\[
A = \begin{pmatrix} 2 & 0 & 1 \\ 0 & 4 & 0 \\ 1 & 0 & 2 \end{pmatrix}.
\]
The characteristic polynomial of $A$ is
\[
\det(\lambda I - A) = \det \begin{pmatrix} \lambda - 2 & 0 & -1 \\ 0 & \lambda - 4 & 0 \\ -1 & 0 & \lambda - 2 \end{pmatrix}.
\]


Expanding this determinant along the middle row we get
\[
\det(\lambda I - A) = (\lambda - 4) \det \begin{pmatrix} \lambda - 2 & -1 \\ -1 & \lambda - 2 \end{pmatrix}
= (\lambda - 4)(\lambda^2 - 4\lambda + 3)
= (\lambda - 4)(\lambda - 3)(\lambda - 1),
\]
so the eigenvalues of $A$ are $1$, $3$ and $4$. By the Theorem there is an orthogonal change of coordinates $x = P\tilde{x}$ so that
\[
P^t A P = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 4 \end{pmatrix} \tag{5.2}
\]
and in this coordinate system equation (5.1) becomes
\[
\tilde{x}^2 + 3\tilde{y}^2 + 4\tilde{z}^2 = 4. \tag{5.3}
\]
This is an ellipsoid with principal axes aligned with the new coordinate system. The points of (5.3) lying on the new coordinate axes are found by setting two of the variables $\tilde{x}$, $\tilde{y}$ and $\tilde{z}$ equal to zero and solving (5.3) for the remaining one. Thus the extreme points are among $(\pm 2, 0, 0)$, $(0, \pm 2/\sqrt3, 0)$ and $(0, 0, \pm 1)$. Of these $(\pm 2, 0, 0)$ are furthest from $0$ and $(0, 0, \pm 1)$ are closest.

We now need to find the coordinates of these points in the original coordinate system, i.e., we need $P \begin{pmatrix} 2 \\ 0 \\ 0 \end{pmatrix}$ and $P \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}$. For these we only need to know the first and last columns of $P$.

The first column is given by a unit eigenvector with eigenvalue $1$ (the first entry in (5.2)). To find this eigenvector we solve
\[
\begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix} = (1 \cdot I - A) \begin{pmatrix} x \\ y \\ z \end{pmatrix}
= \begin{pmatrix} -1 & 0 & -1 \\ 0 & -3 & 0 \\ -1 & 0 & -1 \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix},
\]
so $z = -x$ and $y = 0$. The unit eigenvector is $v_1 = \frac{1}{\sqrt2}\begin{pmatrix} 1 \\ 0 \\ -1 \end{pmatrix}$. Thus the points on (5.1) furthest from $0$ are $\pm 2v_1 = \pm(\sqrt2, 0, -\sqrt2)$.

For the closest points, we need a unit eigenvector with eigenvalue $4$ (the last entry in (5.2)). Such a vector is $v_3 = \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix}$, so the closest points are $\pm v_3 = \pm(0, 1, 0)$.
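A numerical cross-check of this example, assuming Python with numpy; along each principal axis a point at distance $d$ lies on the surface when $\lambda_i d^2 = 4$, so $d = 2/\sqrt{\lambda_i}$, and the smallest eigenvalue gives the furthest point:

```python
import numpy as np

# Numerical check of Example 5.1.
A = np.array([[2.0, 0.0, 1.0],
              [0.0, 4.0, 0.0],
              [1.0, 0.0, 2.0]])
eigenvalues, P = np.linalg.eigh(A)
assert np.allclose(eigenvalues, [1.0, 3.0, 4.0])

# Distance to the surface along each unit eigenvector: lambda_i d^2 = 4.
d = 2.0 / np.sqrt(eigenvalues)
assert np.isclose(d.max(), 2.0)     # furthest points, along v_1
assert np.isclose(d.min(), 1.0)     # closest points, along v_3

furthest = d[0] * P[:, 0]           # +/- (sqrt(2), 0, -sqrt(2)) up to sign
assert np.isclose(np.linalg.norm(furthest), 2.0)
assert np.isclose(furthest @ A @ furthest, 4.0)   # lies on the surface
```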

6. Arbitrary Changes of Coordinate

As the discussion of the classification of curves shows, often one only needs to know the signs of the eigenvalues. In fact, if we allow arbitrary, rather than orthogonal, coordinate changes, this is the only information that remains about a quadratic form.

Theorem 6.1. Let $q_A$ be a quadratic form with matrix $A$. Then there is an invertible matrix $Q$ such that
\[
q_{Q^t A Q}(x_1, \dots, x_n) = x_1^2 + \dots + x_t^2 - x_{t+1}^2 - \dots - x_r^2,
\]
where $r$ is the rank of $A$, and $t$ is the number of positive eigenvalues of $A$.

Proof. By Theorem 3.1 there is an orthogonal matrix $P$ diagonalising $A$. Moreover, we may choose $P$ so that the diagonal entries of $P^t A P$ are ordered as we wish. Let's have the positive eigenvalues first, the negative ones next and the zero eigenvalues last. Fix $P$ so that
\[
P^t A P = \operatorname{diag}(\mu_1^2, \dots, \mu_t^2, -\mu_{t+1}^2, \dots, -\mu_r^2, 0, \dots, 0)
\]
for some positive real numbers $\mu_1, \dots, \mu_r$. Let $D$ be the diagonal matrix
\[
D = \operatorname{diag}\Bigl(\frac{1}{\mu_1}, \dots, \frac{1}{\mu_t}, \frac{1}{\mu_{t+1}}, \dots, \frac{1}{\mu_r}, 1, \dots, 1\Bigr).
\]
Then $D$ is an invertible matrix such that $D(P^t A P)D$ has the desired form. As $D$ is symmetric, we take $Q = PD$, so $Q^t A Q = D^t P^t A P D = D P^t A P D$, as required.

Example 6.2. In Example 3.4, we chose $P$ so that the eigenvalues come out in the wrong order. However, this may be remedied by swapping the columns. So we use
\[
P = \frac{1}{\sqrt5} \begin{pmatrix} 2 & 1 \\ 1 & -2 \end{pmatrix}.
\]
The eigenvalues are now $9$ and $-1$, so we take $D = \operatorname{diag}(1/3, 1)$ and $Q$ becomes
\[
Q = PD = \frac{1}{3\sqrt5} \begin{pmatrix} 2 & 3 \\ 1 & -6 \end{pmatrix}.
\]
One can now check that $Q^t A Q = \operatorname{diag}(1, -1)$.
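This check can be carried out numerically; a sketch assuming Python with numpy, with $P$ entered column-swapped as in Example 6.2 and $D = \operatorname{diag}(1/3, 1)$:

```python
import numpy as np

# Theorem 6.1 applied to Example 6.2: rescale the orthogonal
# diagonalisation so that Q^t A Q = diag(1, -1).
A = np.array([[7.0, 4.0], [4.0, 1.0]])
P = np.array([[2.0, 1.0], [1.0, -2.0]]) / np.sqrt(5.0)   # eigenvalues 9, -1
assert np.allclose(P.T @ A @ P, np.diag([9.0, -1.0]))

D = np.diag([1.0/3.0, 1.0])
Q = P @ D
assert np.allclose(Q.T @ A @ Q, np.diag([1.0, -1.0]))
```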


Definition 6.3. The rank of $q_A$ is the rank of $A$. The signature of $q_A$ is the number of positive eigenvalues of $A$ minus the number of negative eigenvalues. For $A \in M(n, n)$, $q_A$ is non-degenerate if its rank is $n$; $q_A$ is positive definite if its signature equals $n$, i.e., if all eigenvalues of $A$ are positive.

In the notation of the Theorem, the rank of $q_A$ is $r$ and the signature of $q_A$ is $s := 2t - r$. Note that $t = (s + r)/2$, so the rank and signature determine the form of $q$ in the Theorem.

Example 6.4. If the rank is $4$, so $q$ is non-degenerate, and the signature is $2$, then there is a coordinate system in which
\[
q(x_1, x_2, x_3, x_4) = x_1^2 + x_2^2 + x_3^2 - x_4^2,
\]
which is a Lorentz metric.

7. Critical Points of Smooth Functions

Let $f \colon \mathbb{R}^n \to \mathbb{R}$ be a smooth function. The critical points of $f$ are points satisfying
\[
\frac{\partial f}{\partial x_1} = 0, \quad \frac{\partial f}{\partial x_2} = 0, \quad \dots, \quad \frac{\partial f}{\partial x_n} = 0.
\]
For example, suppose $f(x, y, z) = 4x^2 - 3xy + y^2 - \cos(z) + 1$. Then we have
\[
\frac{\partial f}{\partial x} = 8x - 3y, \qquad \frac{\partial f}{\partial y} = -3x + 2y, \qquad \frac{\partial f}{\partial z} = \sin(z),
\]
so there are critical points at $(x, y, z) = (0, 0, n\pi)$ for $n$ an integer.

Suppose $0$ is a critical point of $f$ and adjust $f$ so that $f(0) = 0$. The nature of the critical point is determined by the Hessian of $f$, which is the symmetric $n \times n$ matrix of all second partial derivatives $\operatorname{Hess}(f) = [\partial^2 f / \partial x_i \partial x_j]$, since by Taylor's Theorem we have that $f$ is approximated near $0$ by
\[
\frac12 \sum_{i=1}^n \sum_{j=1}^n \frac{\partial^2 f}{\partial x_i \partial x_j}(0)\, x_i x_j.
\]
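For the example $f(x, y, z) = 4x^2 - 3xy + y^2 - \cos(z) + 1$ above, the Hessian at the critical point $0$ can be written down by hand and compared with the quadratic Taylor approximation; a sketch assuming Python with numpy:

```python
import numpy as np

# Hessian of f(x, y, z) = 4x^2 - 3xy + y^2 - cos z + 1 at the origin,
# computed by hand: the only z-dependence gives d^2f/dz^2 = cos z = 1 at z = 0.
H = np.array([[8.0, -3.0, 0.0],
              [-3.0, 2.0, 0.0],
              [0.0,  0.0, 1.0]])

eigenvalues = np.linalg.eigvalsh(H)
assert np.all(eigenvalues > 0)      # all positive: rank 3, signature 3

def f(x, y, z):
    return 4*x**2 - 3*x*y + y**2 - np.cos(z) + 1

# Check the Taylor approximation f(x) ~ (1/2) x^t H x near the origin.
x = np.array([1e-4, -2e-4, 3e-4])
assert abs(f(*x) - 0.5 * x @ H @ x) < 1e-12
```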

The Hessian defines a quadratic form, and Theorem 6.1 implies that there are local coordinates $(\tilde{x}_1, \dots, \tilde{x}_n)$ such that $f(\tilde{x})$ near $0$ is approximated by
\[
\tilde{x}_1^2 + \dots + \tilde{x}_t^2 - \tilde{x}_{t+1}^2 - \dots - \tilde{x}_r^2,
\]
where $r$ is the rank of $\operatorname{Hess}(f)$ and $t$ is the number of positive eigenvalues. Suppose that $r = n$. Then the critical point is a

Local minimum: if $t = n$, i.e., all the eigenvalues are positive,