Theorem 1
Our universe, U, is unstable.

Proof.
It can be shown that the perverse cohomology group
\[
H^{237}_{\mathrm{pot}}(U) = 10^{10^{10}} \text{ potatoes}.
\]
We want to optimize an objective function
\[
f(x)
\]
subject to constraints
\[
g_1(x) = 0, \quad g_2(x) = 0, \quad \ldots
\]
The constraint functions, $g_1, g_2$, etc., are often linear or quadratic but they can be more complicated.
A Simple Example

For example, find the maximum of
\[
f(x, y) = 5x^2 + 4xy + 2y^2
\]
on the unit circle
\[
x^2 + y^2 = 1.
\]
It turns out that the maximum of f on the unit circle is 6 and that it is achieved for
\[
\begin{pmatrix} x \\ y \end{pmatrix}
= \begin{pmatrix} \frac{2}{\sqrt{5}} \\[3pt] \frac{1}{\sqrt{5}} \end{pmatrix}.
\]
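This claim is easy to check numerically. A minimal sketch (assuming NumPy is available; the sampling density is an arbitrary choice):

```python
import numpy as np

def f(x, y):
    # The quadratic form from the example.
    return 5 * x**2 + 4 * x * y + 2 * y**2

# Sample the unit circle densely.
t = np.linspace(0.0, 2.0 * np.pi, 100_000)
values = f(np.cos(t), np.sin(t))

best = values.max()
print(best)  # very close to 6
print(f(2 / np.sqrt(5), 1 / np.sqrt(5)))  # the claimed maximizer gives 6
```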
The function f can be written in terms of a matrix as
\[
f(x, y) = (x,\; y) \begin{pmatrix} 5 & 2 \\ 2 & 2 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix},
\qquad\text{with}\qquad
A = \begin{pmatrix} 5 & 2 \\ 2 & 2 \end{pmatrix}.
\]
This means that there are (unit) vectors, $e_1, e_2$, that form a basis of $\mathbb{R}^2$ and such that
\[
A e_i = \lambda_i e_i, \qquad i = 1, 2.
\]
In this case, $\lambda_1 = 6$, $\lambda_2 = 1$, and
\[
e_1 = \begin{pmatrix} \frac{2}{\sqrt{5}} \\[3pt] \frac{1}{\sqrt{5}} \end{pmatrix},
\qquad
e_2 = \begin{pmatrix} -\frac{1}{\sqrt{5}} \\[3pt] \frac{2}{\sqrt{5}} \end{pmatrix},
\]
so we can write
\[
A = \begin{pmatrix} 5 & 2 \\ 2 & 2 \end{pmatrix}
= \begin{pmatrix} \frac{2}{\sqrt{5}} & -\frac{1}{\sqrt{5}} \\[3pt] \frac{1}{\sqrt{5}} & \frac{2}{\sqrt{5}} \end{pmatrix}
\begin{pmatrix} 6 & 0 \\ 0 & 1 \end{pmatrix}
\begin{pmatrix} \frac{2}{\sqrt{5}} & \frac{1}{\sqrt{5}} \\[3pt] -\frac{1}{\sqrt{5}} & \frac{2}{\sqrt{5}} \end{pmatrix}.
\]
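The eigenvalues and the factorization can be confirmed with NumPy's `eigh` (a sketch, not part of the original slides):

```python
import numpy as np

A = np.array([[5.0, 2.0],
              [2.0, 2.0]])

# eigh handles symmetric matrices; eigenvalues come back in ascending order
# with orthonormal eigenvectors as columns.
eigvals, eigvecs = np.linalg.eigh(A)
print(eigvals)  # [1. 6.]

# Reassemble A = P D P^T from its spectral decomposition.
D = np.diag(eigvals)
reconstructed = eigvecs @ D @ eigvecs.T
print(np.allclose(reconstructed, A))  # True
```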
The matrix
\[
P = \begin{pmatrix} \frac{2}{\sqrt{5}} & -\frac{1}{\sqrt{5}} \\[3pt] \frac{1}{\sqrt{5}} & \frac{2}{\sqrt{5}} \end{pmatrix},
\]
whose columns are the eigenvectors $e_1, e_2$, is orthogonal: $P^\top P = P P^\top = I$.
Observe that
\[
f(x, y) = (x,\; y) \begin{pmatrix} 5 & 2 \\ 2 & 2 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}
= (x,\; y) \begin{pmatrix} \frac{2}{\sqrt{5}} & -\frac{1}{\sqrt{5}} \\[3pt] \frac{1}{\sqrt{5}} & \frac{2}{\sqrt{5}} \end{pmatrix}
\begin{pmatrix} 6 & 0 \\ 0 & 1 \end{pmatrix}
\begin{pmatrix} \frac{2}{\sqrt{5}} & \frac{1}{\sqrt{5}} \\[3pt] -\frac{1}{\sqrt{5}} & \frac{2}{\sqrt{5}} \end{pmatrix}
\begin{pmatrix} x \\ y \end{pmatrix}
= (x,\; y)\, P \begin{pmatrix} 6 & 0 \\ 0 & 1 \end{pmatrix} P^\top \begin{pmatrix} x \\ y \end{pmatrix}.
\]
If we let
\[
\begin{pmatrix} u \\ v \end{pmatrix} = P^\top \begin{pmatrix} x \\ y \end{pmatrix},
\]
then
\[
f(u, v) = (u,\; v) \begin{pmatrix} 6 & 0 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} u \\ v \end{pmatrix}
= 6u^2 + v^2.
\]
Jean Gallier (Upenn)
The constraint on the unit circle can be written as
\[
(x,\; y) \begin{pmatrix} x \\ y \end{pmatrix} = 1.
\]
Since
\[
\begin{pmatrix} x \\ y \end{pmatrix} = P \begin{pmatrix} u \\ v \end{pmatrix},
\]
we get
\[
1 = (x,\; y) \begin{pmatrix} x \\ y \end{pmatrix}
= (u,\; v)\, P^\top P \begin{pmatrix} u \\ v \end{pmatrix}
= (u,\; v) \begin{pmatrix} u \\ v \end{pmatrix},
\]
so
\[
u^2 + v^2 = 1.
\]
Since the maximum of $6u^2 + v^2$ on $u^2 + v^2 = 1$ is achieved for $(u, v) = (1, 0)$, and
\[
\begin{pmatrix} x \\ y \end{pmatrix}
= P \begin{pmatrix} u \\ v \end{pmatrix}
= \begin{pmatrix} \frac{2}{\sqrt{5}} & -\frac{1}{\sqrt{5}} \\[3pt] \frac{1}{\sqrt{5}} & \frac{2}{\sqrt{5}} \end{pmatrix}
\begin{pmatrix} u \\ v \end{pmatrix},
\]
this yields
\[
\begin{pmatrix} x \\ y \end{pmatrix}
= \begin{pmatrix} \frac{2}{\sqrt{5}} \\[3pt] \frac{1}{\sqrt{5}} \end{pmatrix}.
\]
Important Fact 1.

We may assume that A is symmetric, which means that $A^\top = A$.

This is because we can write
\[
A = H(A) + S(A),
\]
where
\[
H(A) = \frac{A + A^\top}{2}
\quad\text{and}\quad
S(A) = \frac{A - A^\top}{2}.
\]
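A quick numerical illustration of why symmetry can be assumed (NumPy assumed): the skew-symmetric part contributes nothing to the quadratic form, so $x^\top A x = x^\top H(A)\, x$ for every x.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))   # a generic, non-symmetric matrix

H = (A + A.T) / 2   # symmetric part H(A)
S = (A - A.T) / 2   # skew-symmetric part S(A)

x = rng.standard_normal(4)

# x^T S x = 0 because (x^T S x)^T = -x^T S x, and it is a scalar.
print(np.isclose(x @ S @ x, 0.0))          # True
print(np.isclose(x @ A @ x, x @ H @ x))    # True
```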
but this only implies that the real part of $f(x)$ is zero; that is, $f(x)$ is pure imaginary or zero.

However, if A is Hermitian, then $f(x) = x^* A x$ is real.
Important Fact 2.

Every $n \times n$ real symmetric matrix, A, has real eigenvalues, say
\[
\lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_n,
\]
and can be diagonalized with respect to an orthonormal basis of eigenvectors.

This means that there is a basis of orthonormal vectors, $(e_1, \ldots, e_n)$, where $e_i$ is an eigenvector for $\lambda_i$, that is,
\[
A e_i = \lambda_i e_i, \qquad 1 \leq i \leq n.
\]
The same result holds for (complex) Hermitian matrices (w.r.t. the Hermitian inner product).
\[
\begin{aligned}
&\text{maximize} && x^\top A x \\
&\text{subject to} && x^\top x = 1,\; x \in \mathbb{R}^n,
\end{aligned}
\]
Courant-Fischer

\[
\max_{x^\top x = 1} x^\top A x = \lambda_1,
\]
the largest eigenvalue of A, and this maximum is achieved for any unit eigenvector associated with $\lambda_1$.

This fact is part of the Courant-Fischer Theorem.
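A sketch (NumPy assumed; the matrix and sample counts are arbitrary) illustrating this part of the Courant-Fischer Theorem: random unit vectors never beat $\lambda_1$, and the top eigenvector attains it.

```python
import numpy as np

rng = np.random.default_rng(1)
M = rng.standard_normal((5, 5))
A = (M + M.T) / 2                     # a random symmetric matrix

eigvals, eigvecs = np.linalg.eigh(A)  # ascending order
lam1 = eigvals[-1]                    # largest eigenvalue
e1 = eigvecs[:, -1]                   # associated unit eigenvector

# Rayleigh quotients of random unit vectors never exceed lam1.
xs = rng.standard_normal((1000, 5))
xs /= np.linalg.norm(xs, axis=1, keepdims=True)
rayleigh = np.einsum('ij,jk,ik->i', xs, A, xs)
print(np.all(rayleigh <= lam1 + 1e-9))    # True

# The maximum is attained at the top eigenvector.
print(np.isclose(e1 @ A @ e1, lam1))      # True
```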
where $T(k)$ is the tube size of the cut; it depends on the thickness factor, $k$ (in fact, $T(k) = k/|S|$).
Very recently, Shi and Kennedy found a better formulation of the objective function involving a new normalization of the matrix arising from the graph G.

We will only present the old formulation.

Maximizing $C(S, O, k)$ is a hard combinatorial problem, so Shi, Zhu and Song had the idea of converting the original problem to a simpler problem using a circular embedding.
\[
\theta_j = \frac{2\pi j}{|S|}.
\]
The maximum jumping angle $\theta_{\max}$ will also play a role; this is the maximum of the angle between two consecutive nodes.
\[
C_e(r, \theta, \theta_{\max}) = \frac{1}{\theta_{\max}}
\sum_{\substack{i < j \leq i + \theta_{\max} \\ r_i > 0,\; r_j > 0}} P_{ij}/|S|,
\]
where

1. the matrix $P = (P_{ij})$ is obtained from the weight matrix, W, (of the graph $G = (V, E, W)$) by a suitable normalization;
2. $r_j \in \{0, 1\}$.
Continuous Relaxation

Thus, $C_e(r, \theta, \theta_{\max})$ is well approximated by
\[
\frac{1}{\theta_{\max}}\,
\frac{\sum_{j,k} P_{jk}\,\mathrm{Re}(\overline{x_j}\, x_k\, e^{i\alpha})}{\sum_j |x_j|^2},
\]
that is,
\[
\frac{1}{\theta_{\max}}\,
\frac{\mathrm{Re}(x^* P x\, e^{i\alpha})}{x^* x}.
\]
The matrix P is a real matrix but, in general, it is neither symmetric nor normal ($PP^* \neq P^* P$).

If we assume that $0 < \alpha_{\min} \leq \alpha \leq \alpha_{\max}$, we would like to solve the following optimization problem:
\[
\begin{aligned}
&\text{maximize} && \mathrm{Re}(x^*\, e^{i\alpha} P\, x) \\
&\text{subject to} && x^* x = 1,\; x \in \mathbb{C}^n;\quad \alpha_{\min} \leq \alpha \leq \alpha_{\max}.
\end{aligned}
\]
\[
\begin{aligned}
&\text{maximize} && \mathrm{Re}(x^*\, e^{i\alpha} P\, y) \\
&\text{subject to} && x^* y = c,\; x, y \in \mathbb{C}^n;\quad \alpha_{\min} \leq \alpha \leq \alpha_{\max},
\end{aligned}
\]
with $c = e^{i\alpha}$.

However, it turns out that this problem is too relaxed, because the constraint $x^* y = c$ is weak; it allows x to be very large and y to be very small, and conversely.
Since the real part of a complex number z is
\[
\mathrm{Re}(z) = \frac{z + \overline{z}}{2},
\]
we have
\[
H(e^{i\alpha} P) = \frac{1}{2}\big(e^{i\alpha} P + e^{-i\alpha} P^\top\big).
\]
\[
\begin{aligned}
&\text{maximize} && x^* H(\alpha)\, x \\
&\text{subject to} && x^* x = 1,\; x \in \mathbb{C}^n;\quad \alpha_{\min} \leq \alpha \leq \alpha_{\max},
\end{aligned}
\]
with
\[
H(\alpha) = H(e^{i\alpha} P) = \cos\alpha\, H(P) + i \sin\alpha\, S(P),
\]
a Hermitian matrix.
The optimal value is the largest eigenvalue, $\lambda_1$, of $H(\alpha)$, over all $\alpha$ such that $\alpha_{\min} \leq \alpha \leq \alpha_{\max}$, and it is attained for any associated complex unit eigenvector, $x = x_r + i x_i$.

Ryan Kennedy has implemented this method and has obtained good results.
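The resulting procedure is simple to sketch (NumPy assumed; the matrix P, the $\alpha$ range, and the grid resolution below are made-up placeholders, not the actual implementation): scan $\alpha$ over a grid, form the Hermitian matrix $H(\alpha) = \cos\alpha\, H(P) + i \sin\alpha\, S(P)$, and keep the largest eigenvalue seen.

```python
import numpy as np

def best_eigenpair(P, alpha_min, alpha_max, n_grid=200):
    """Scan alpha and maximize the top eigenvalue of H(alpha).

    H(alpha) = cos(alpha) * H(P) + 1j * sin(alpha) * S(P) is Hermitian
    for real P, so eigh applies at every grid point.
    """
    Hp = (P + P.T) / 2          # symmetric part H(P)
    Sp = (P - P.T) / 2          # skew-symmetric part S(P)
    best_val, best_vec, best_alpha = -np.inf, None, None
    for alpha in np.linspace(alpha_min, alpha_max, n_grid):
        w, v = np.linalg.eigh(np.cos(alpha) * Hp + 1j * np.sin(alpha) * Sp)
        if w[-1] > best_val:
            best_val, best_vec, best_alpha = w[-1], v[:, -1], alpha
    return best_val, best_vec, best_alpha

# Toy example with a small random non-symmetric P (placeholder data).
rng = np.random.default_rng(2)
P = rng.standard_normal((6, 6))
lam1, x, alpha = best_eigenpair(P, 0.1, np.pi / 2)

H_best = np.cos(alpha) * (P + P.T) / 2 + 1j * np.sin(alpha) * (P - P.T) / 2
print(np.isclose(np.linalg.norm(x), 1.0))              # True
print(np.isclose((x.conj() @ H_best @ x).real, lam1))  # True
```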
Theorem 2
Let $X(t)$ be a normal matrix that depends differentiably on t. If $\lambda$ is any simple eigenvalue of X at $t = 0$ (it has algebraic multiplicity 1) and if u is the corresponding unit eigenvector, then the derivatives at $t = 0$ of $\lambda(t)$ and $u(t)$ are given by
\[
\lambda' = u^* X' u
\]
\[
u' = (\lambda I - X)^{+} X' u,
\]
where $(\lambda I - X)^{+}$ is the pseudo-inverse of $\lambda I - X$, $X'$ is the derivative of X at $t = 0$, and $u'$ is orthogonal to u.
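The eigenvalue derivative formula is easy to test with finite differences (NumPy assumed; the family $X(t) = A + tB$ with A, B symmetric is a convenient choice, since it is normal for every t):

```python
import numpy as np

rng = np.random.default_rng(3)
M1, M2 = rng.standard_normal((2, 4, 4))
A = (M1 + M1.T) / 2     # X(0): symmetric, hence normal
B = (M2 + M2.T) / 2     # X'(t) = B for the family X(t) = A + t*B

def top_eig(t):
    # Largest eigenvalue of X(t); generically simple for random data.
    return np.linalg.eigh(A + t * B)[0][-1]

# Unit eigenvector u of X(0) for the top eigenvalue.
u = np.linalg.eigh(A)[1][:, -1]

predicted = u @ B @ u                          # lambda' = u^* X' u
h = 1e-6
numerical = (top_eig(h) - top_eig(-h)) / (2 * h)  # central difference
print(np.isclose(predicted, numerical, atol=1e-4))   # True
```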
Proof.
If X is a normal matrix, it is well known that $Xu = \lambda u$ iff $X^* u = \overline{\lambda}\, u$, and so, if $Xu = \lambda u$, then
\[
u^* X = \lambda u^*.
\]
Taking the derivative of $Xu = \lambda u$ and using the chain rule, we get
\[
X' u + X u' = \lambda' u + \lambda u'.
\]
By taking the inner product with $u^*$, we get
\[
u^* X' u + u^* X u' = \lambda' u^* u + \lambda u^* u'.
\]
However, $u^* X = \lambda u^*$, so $u^* X u' = \lambda u^* u'$, and as u is a unit vector, $u^* u = 1$, so
\[
u^* X' u + \lambda u^* u' = \lambda' + \lambda u^* u',
\]
that is, $\lambda' = u^* X' u$.

Deriving the formula for the derivative of u is more involved.
Unfortunately, this does not seem to help much in finding the $\alpha$ for which the function $f(x, \alpha)$ has local maxima.

Still, since the maxima of $|x^* P x|$ dominate the maxima of $x^* H(\alpha)\, x$, and are a subset of those maxima, it is useful to understand better how to find the local maxima of $|x^* P x|$.

The determination of the local extrema of $|x^* P x|$ (with $x^* x = 1$) is closely related to the structure of the set of complex numbers
\[
F(P) = \{ x^* P x \in \mathbb{C} \mid x \in \mathbb{C}^n,\; x^* x = 1 \},
\]
known as the field of values of P or the numerical range of P (the notation $W(P)$ is also commonly used, corresponding to the German terminology Wertvorrat or Wertevorrat).
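$F(P)$ is easy to approximate by sampling (NumPy assumed; the matrix and sample count are placeholders): draw random complex unit vectors x and collect the numbers $x^* P x$. Every sample necessarily satisfies $|x^* P x| \leq \|P\|_2$ by Cauchy-Schwarz, which the sketch checks.

```python
import numpy as np

def field_of_values_samples(P, n=5000, seed=0):
    """Sample points x^* P x of the numerical range F(P) for random unit x."""
    rng = np.random.default_rng(seed)
    n_dim = P.shape[0]
    # Complex Gaussian vectors, normalized onto the unit sphere of C^n.
    X = rng.standard_normal((n, n_dim)) + 1j * rng.standard_normal((n, n_dim))
    X /= np.linalg.norm(X, axis=1, keepdims=True)
    return np.einsum('ij,jk,ik->i', X.conj(), P, X)

rng = np.random.default_rng(1)
P = rng.standard_normal((5, 5))           # placeholder matrix
z = field_of_values_samples(P)

# All sampled points lie within the spectral-norm disk.
print(np.all(np.abs(z) <= np.linalg.norm(P, 2) + 1e-9))   # True
```

Plotting the real and imaginary parts of `z` makes the convexity proved by Toeplitz and Hausdorff visible.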
This set was studied as early as 1918 by Toeplitz and Hausdorff, who proved that $F(P)$ is convex.

Figure: Felix Hausdorff, 1868-1942 (left) and Otto Toeplitz, 1881-1940 (right)
Figure: Beauty
The quantity
\[
r(P) = \max\{ |z| \mid z \in F(P) \}
\]
is called the numerical radius of P.

It is obviously of interest to us since it corresponds to the maximum of $|x^* P x|$ over all unit vectors, x.

It is easy to show that
\[
F(e^{i\theta} P) = e^{i\theta} F(P)
\]
and so,
\[
F(P) = e^{-i\theta} F(e^{i\theta} P).
\]
Geometrically, this means that $F(P)$ is obtained from $F(e^{i\theta} P)$ by rotating it by $-\theta$.
This fact yields a nice way of finding supporting lines for the convex set, $F(P)$.

To show this, we use a proposition from Horn and Johnson whose proof is quite simple:

Theorem 3
For any $n \times n$ matrix, P, and any unit vector, $x \in \mathbb{C}^n$, the following properties are equivalent:
(1) $\mathrm{Re}(x^* P x) = \max\{ \mathrm{Re}(z) \mid z \in F(P) \}$
(2) $x^* H(P)\, x = \max\{ r \mid r \in F(H(P)) \}$
(3) The vector, x, is an eigenvector of $H(P)$ corresponding to the largest eigenvalue, $\lambda_1$, of $H(P)$.
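Property (3) gives a practical recipe: the top eigenvector of $H(P)$ attains the rightmost point of $F(P)$. A small numerical check (NumPy assumed; random unit vectors stand in for the max over $F(P)$):

```python
import numpy as np

rng = np.random.default_rng(4)
P = rng.standard_normal((5, 5))          # placeholder matrix
H = (P + P.T) / 2                        # H(P); real symmetric since P is real

lam1_H, V = np.linalg.eigh(H)[0][-1], np.linalg.eigh(H)[1]
x = V[:, -1]                             # eigenvector for the largest eigenvalue

# (1): Re(x^* P x) equals the largest eigenvalue of H(P)
# (x is real here, so the real part is automatic).
print(np.isclose(x @ P @ x, lam1_H))     # True

# ... and no sampled point of F(P) has a larger real part.
Z = rng.standard_normal((2000, 5)) + 1j * rng.standard_normal((2000, 5))
Z /= np.linalg.norm(Z, axis=1, keepdims=True)
samples = np.einsum('ij,jk,ik->i', Z.conj(), P, Z)
print(np.all(samples.real <= lam1_H + 1e-9))   # True
```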
These properties can be exploited to find the local maxima of the objective function of our contour identification problem. Ryan Kennedy has done some work on this problem.

Much more work needs to be done, in particular:

1. Cope with the dimension of the matrix, P;
2. Understand the role of various normalizations of P (stochastic, bi-stochastic).

Shi and Kennedy have made recent progress on the issue of normalization.
Gander, Golub and von Matt (1989) showed that these constraints can be eliminated, but a linear term has to be added to the objective function.

One needs to consider objective functions of the form
\[
f(x) = x^\top A x + 2 x^\top b,
\]
with $b \neq 0$.

This problem was studied by Gander, Golub and von Matt (1989).

I also have a solution to this problem involving an algebraic curve generalizing the hyperbola to $\mathbb{R}^n$, but this will have to wait for another talk!
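For the maximization case, the stationarity condition $Ax + b = \mu x$ with $\mu$ above the top eigenvalue of A reduces the problem to a one-dimensional secular equation $\|(\mu I - A)^{-1} b\| = 1$, which bisection solves. The following is a sketch under those assumptions (NumPy assumed; it is not the Gander-Golub-von Matt algorithm itself, and it skips the hard case where b is orthogonal to the top eigenvector):

```python
import numpy as np

def max_quadratic_on_sphere(A, b, tol=1e-12):
    """Maximize x^T A x + 2 x^T b over the unit sphere (generic case).

    Stationary points satisfy (mu*I - A) x = b; for the maximizer, mu lies
    above the largest eigenvalue of A, where ||x(mu)|| is strictly decreasing,
    so we can bisect on ||x(mu)|| = 1.
    """
    lam1 = np.linalg.eigh(A)[0][-1]
    n = A.shape[0]

    def x_of(mu):
        return np.linalg.solve(mu * np.eye(n) - A, b)

    lo, hi = lam1 + 1e-10, lam1 + 1.0
    while np.linalg.norm(x_of(hi)) > 1.0:   # push hi up until ||x(hi)|| < 1
        hi = lam1 + 2 * (hi - lam1)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if np.linalg.norm(x_of(mid)) > 1.0:
            lo = mid
        else:
            hi = mid
    return x_of((lo + hi) / 2)

rng = np.random.default_rng(5)
M = rng.standard_normal((4, 4))
A = (M + M.T) / 2
b = rng.standard_normal(4)

x = max_quadratic_on_sphere(A, b)
f = lambda v: v @ A @ v + 2 * v @ b
print(np.isclose(np.linalg.norm(x), 1.0))   # True

# The solution beats random unit vectors.
V = rng.standard_normal((2000, 4))
V /= np.linalg.norm(V, axis=1, keepdims=True)
others = np.einsum('ij,jk,ik->i', V, A, V) + 2 * V @ b
print(np.all(f(x) + 1e-9 >= others))
```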