MONTHLY
Volume 119, No. 2 February 2012
EDITOR
Scott T. Chapman
Sam Houston State University
ASSOCIATE EDITORS
William Adkins Ulrich Krause
Louisiana State University Universitat Bremen
David Aldous Jeffrey Lawson
University of California, Berkeley Western Carolina University
Elizabeth Allman C. Dwight Lahr
University of Alaska, Fairbanks Dartmouth College
Jonathan M. Borwein Susan Loepp
University of Newcastle Williams College
Jason Boynton Irina Mitrea
North Dakota State University Temple University
Edward B. Burger Bruce P. Palka
Williams College National Science Foundation
Minerva Cordero-Epperson Vadim Ponomarenko
University of Texas, Arlington San Diego State University
Beverly Diamond Catherine A. Roberts
College of Charleston College of the Holy Cross
Allan Donsig Rachel Roberts
University of Nebraska, Lincoln Washington University, St. Louis
Michael Dorff Ivelisse M. Rubio
Brigham Young University Universidad de Puerto Rico, Rio Piedras
Daniela Ferrero Adriana Salerno
Texas State University Bates College
Luis David Garcia-Puente Edward Scheinerman
Sam Houston State University Johns Hopkins University
Sidney Graham Susan G. Staples
Central Michigan University Texas Christian University
Tara Holm Dennis Stowe
Cornell University Idaho State University
Roger A. Horn Daniel Ullman
University of Utah George Washington University
Lea Jenkins Daniel Velleman
Clemson University Amherst College
Daniel Krashen
University of Georgia
EDITORIAL ASSISTANT
Bonnie K. Ponce
NOTICE TO AUTHORS
The MONTHLY publishes articles, as well as notes and other features, about mathematics and the profession. Its readers span a broad spectrum of mathematical interests, and include professional mathematicians as well as students of mathematics at all collegiate levels. Authors are invited to submit articles and notes that bring interesting mathematical ideas to a wide audience of MONTHLY readers.
The MONTHLY's readers expect a high standard of exposition; they expect articles to inform, stimulate, challenge, enlighten, and even entertain. MONTHLY articles are meant to be read, enjoyed, and discussed, rather than just archived. Articles may be expositions of old or new results, historical or biographical essays, speculations or definitive treatments, broad developments, or explorations of a single application. Novelty and generality are far less important than clarity of exposition and broad appeal. Appropriate figures, diagrams, and photographs are encouraged.
Notes are short, sharply focused, and possibly informal. They are often gems that provide a new proof of an old theorem, a novel presentation of a familiar theme, or a lively discussion of a single issue.
Beginning January 1, 2011, submission of articles and notes is required via the MONTHLY's Editorial Manager System. Initial submissions in pdf or LaTeX form can be sent to the Editor Scott Chapman at

Proposed problems or solutions should be sent to:
DOUG HENSLEY, MONTHLY Problems
Department of Mathematics
Texas A&M University
3368 TAMU
College Station, TX 77843-3368
In lieu of duplicate hardcopy, authors may submit pdfs to monthlyproblems@math.tamu.edu.

Advertising Correspondence:
MAA Advertising
1529 Eighteenth St. NW
Washington DC 20036
Phone: (877) 622-2373
E-mail: tmarmor@maa.org
Further advertising information can be found online at www.maa.org

Change of address, missing issue inquiries, and other subscription correspondence:
MAA Service Center, maahq@maa.org

All at the address:
The Mathematical Association of America
1529 Eighteenth Street, N.W.
Washington, DC 20036
Abstract. The classical Heron problem states: on a given straight line in the plane, find a
point C such that the sum of the distances from C to the given points A and B is minimal. This
problem can be solved using standard geometry or differential calculus. In the light of modern
convex analysis, we are able to investigate more general versions of this problem. In this paper
we propose and solve the following problem: on a given nonempty closed convex subset of
$\mathbb{R}^s$, find a point such that the sum of the distances from that point to $n$ given nonempty closed
convex subsets of $\mathbb{R}^s$ is minimal.
where $\|\cdot\|$ is the Euclidean norm in $\mathbb{R}^s$. The new generalized Heron problem is formulated as follows:
\[
\text{minimize } D(x) := \sum_{i=1}^{n} d(x; \Omega_i) \quad \text{subject to } x \in \Omega, \tag{1.2}
\]
where all the sets $\Omega$ and $\Omega_i$, $i = 1, \ldots, n$, are nonempty, closed, and convex; these are
our standing assumptions in this paper. Thus (1.2) is a constrained convex optimization
problem, and hence it is natural to use techniques of convex analysis and optimization
to solve it.
The function $f$ is closed if its epigraph is closed, and it is convex if its epigraph is a
convex subset of $\mathbb{R}^{s+1}$. It is easy to check that $f$ is convex if and only if
of the set $\Omega$. It follows immediately from the definitions that $\Omega \subset \mathbb{R}^s$ is closed (resp.
convex) if and only if the indicator function (2.3) is closed (resp. convex).
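An element $v \in \mathbb{R}^s$ is called a subgradient of a convex function $f\colon \mathbb{R}^s \to \overline{\mathbb{R}}$ at
$\bar{x} \in \operatorname{dom} f$ if it satisfies the inequality
\[
\langle v, x - \bar{x} \rangle \le f(x) - f(\bar{x}) \quad \text{for all } x \in \mathbb{R}^s.
\]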
Figure 1. The absolute value function and subtangent lines at (0, 0).
which is a nonsmooth convex counterpart of the classical Fermat stationary rule. Applying
(2.7) to the constrained optimization problem (2.5) via its unconstrained description
(2.6) requires the usage of subdifferential calculus. The most fundamental
calculus result of convex analysis is the following Moreau-Rockafellar theorem for
the subdifferential of sums; see, e.g., [4, p. 51].
for all $x \in \bigcap_{i=1}^{n} \operatorname{dom} \varphi_i$.
Observe that a vector $v$ belongs to $N(\bar{x}; \Omega)$ if and only if it makes a right or obtuse
angle with the vector from $\bar{x}$ to $x$ for any $x \in \Omega$. It easily follows from the definitions
that
Finally in this section, we present a useful formula for computing the subdifferential
of the distance function (1.1) via the Euclidean projection
of $\bar{x} \in \mathbb{R}^s$ on the closed and convex set $\Omega \subset \mathbb{R}^s$. It follows from the definition of the
Euclidean projection that $\Pi(\bar{x}; \Omega) = \{\bar{x}\}$ if $\bar{x} \in \Omega$ and it is a singleton when $\bar{x} \notin \Omega$.
In the sequel, we identify the set $\Pi(\bar{x}; \Omega)$ with its unique element.
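\[
\partial d(\bar{x}; \Omega) =
\begin{cases}
\left\{ \dfrac{\bar{x} - \Pi(\bar{x}; \Omega)}{d(\bar{x}; \Omega)} \right\} & \text{if } \bar{x} \notin \Omega, \\[2ex]
N(\bar{x}; \Omega) \cap \mathbb{B} & \text{if } \bar{x} \in \Omega,
\end{cases}
\]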
\[
\gamma := \inf_{x \in \Omega} D(x)
\]
and thus there exists $w_k \in \Omega_1$ with $\|x_k - w_k\| < \gamma + 1$ for such indices $k$. Then
which shows that the sequence {xk } is bounded. The existence of optimal solutions
follows in this case from the arguments above.
\[
\cos(v, u) := \frac{\langle v, u \rangle}{\|v\| \cdot \|u\|}. \tag{3.11}
\]
The next theorem gives necessary and sufficient conditions for optimal solutions to
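(1.2) via projections (2.10) on $\Omega_i$ incorporated into quantities (3.11).
\[
\Omega_i \cap \Omega = \emptyset \quad \text{for } i = 1, \ldots, n. \tag{3.13}
\]
\[
a_i(\bar{x}) := \frac{\bar{x} - \Pi(\bar{x}; \Omega_i)}{d(\bar{x}; \Omega_i)} \ne 0, \quad i = 1, \ldots, n. \tag{3.14}
\]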
Then $\bar{x} \in \Omega$ is an optimal solution to the generalized Heron problem (1.2) if and only
if we have the inclusion
\[
-\sum_{i=1}^{n} a_i(\bar{x}) \in N(\bar{x}; \Omega). \tag{3.15}
\]
Suppose in addition that the normal cone $N(\bar{x}; \Omega)$ to the constraint set is representable
by a subspace $L$. Then (3.15) is equivalent to
\[
\sum_{i=1}^{n} \cos\big(a_i(\bar{x}), u\big) = 0 \quad \text{whenever } u \in L \setminus \{0\}. \tag{3.16}
\]
Applying the generalized Fermat rule (2.7), we see that $\bar{x}$ is a solution to (3.17) if and
only if
\[
0 \in \partial\Big( \sum_{i=1}^{n} d(\cdot; \Omega_i) + \delta(\cdot; \Omega) \Big)(\bar{x}). \tag{3.18}
\]
Since all of the functions $d(\cdot; \Omega_i)$, $i = 1, \ldots, n$, are convex and continuous, we
employ the subdifferential sum rule of Theorem 2.1 to (3.18) and arrive at
\[
\begin{aligned}
0 \in \partial\big( D + \delta(\cdot, \Omega) \big)(\bar{x}) &= \sum_{i=1}^{n} \partial d(\bar{x}; \Omega_i) + N(\bar{x}; \Omega) \\
&= \sum_{i=1}^{n} a_i(\bar{x}) + N(\bar{x}; \Omega),
\end{aligned} \tag{3.19}
\]
where the second representation in (3.19) is due to (2.9), assumption (3.13), and the
subdifferential description of Proposition 2.2 with $a_i(\bar{x})$ defined in (3.14). It is obvious
that (3.19) and (3.15) are equivalent.
Suppose in addition that the normal cone $N(\bar{x}; \Omega)$ to the constraint set is
representable by a subspace $L$. Then the inclusion (3.15) is equivalent to
\[
0 \in \sum_{i=1}^{n} a_i(\bar{x}) + L^{\perp},
\]
Taking into account that $\|a_i(\bar{x})\| = 1$ for $i = 1, \ldots, n$ by (3.14) and assumption
(3.13), the latter equality is equivalent to
\[
\sum_{i=1}^{n} \frac{\langle a_i(\bar{x}), u \rangle}{\|a_i(\bar{x})\| \cdot \|u\|} = 0 \quad \text{for all } u \in L \setminus \{0\},
\]
which gives (3.16) due to the notation (3.11) and thus completes the proof of the
theorem.
Corollary 3.3. Let $\Omega$ be an affine subspace parallel to a subspace $L$, and let assumption
(3.13) of Theorem 3.2 be satisfied. Then $\bar{x} \in \Omega$ is a solution to the generalized
Heron problem (1.2) if and only if condition (3.16) holds.
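Condition (3.16) can be checked numerically in the simplest setting. The following is a minimal sketch, assuming illustrative data that do not come from the paper: the constraint is the $x$-axis (so $L$ is spanned by $u = (1, 0)$), and the two sets are the single points $A_1 = (0, 1)$ and $A_2 = (3, 2)$. The helper names `unit_vector_from`, `cosine`, and `cosine_sum` are hypothetical. Reflecting $A_2$ across the axis suggests the optimal point $M = (1, 0)$, and the sum of cosines should vanish there.

```python
import math

def unit_vector_from(target, point):
    """Unit vector (point - target)/|point - target|, i.e. a_i in (3.14)
    when the i-th set is the single point `target`."""
    diff = [p - t for p, t in zip(point, target)]
    norm = math.sqrt(sum(c * c for c in diff))
    return [c / norm for c in diff]

def cosine(v, u):
    """cos(v, u) as in (3.11)."""
    dot = sum(a * b for a, b in zip(v, u))
    norm_v = math.sqrt(sum(a * a for a in v))
    norm_u = math.sqrt(sum(b * b for b in u))
    return dot / (norm_v * norm_u)

# Illustrative data (assumptions, not taken from the paper).
A1, A2 = (0.0, 1.0), (3.0, 2.0)
u = (1.0, 0.0)  # direction vector of the constraint line (the x-axis)

def cosine_sum(t):
    """Left-hand side of (3.16) at the candidate point M = (t, 0)."""
    M = (t, 0.0)
    return sum(cosine(unit_vector_from(A, M), u) for A in (A1, A2))

at_reflection_point = cosine_sum(1.0)  # candidate obtained by reflection
at_other_point = cosine_sum(2.0)       # a non-optimal point for comparison
```

In this sketch the sum vanishes (up to rounding) at $t = 1$ and is clearly nonzero at $t = 2$, in line with the corollary.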
The underlying characterization (3.16) can be checked easily when the subspace L
in Theorem 3.2 is given as the span of fixed generating vectors.
Proof. We show that (3.16) is equivalent to (3.20) in the setting under consideration.
Since (3.16) obviously implies (3.20), it remains to justify the opposite implication.
Set
\[
a := \sum_{i=1}^{n} a_i(\bar{x})
\]
Let us examine in more detail the case of two sets $\Omega_1$ and $\Omega_2$ in (1.2) with the
normal cone to the constraint set $\Omega$ being a straight line generated by a given vector.
This is a direct extension of the classical Heron problem to the setting when the two
points are replaced by closed and convex sets, and the constraint line is replaced by a
closed convex set $\Omega$ with the property above. The next theorem gives a complete and
verifiable solution to the new problem.
Proof. It follows from Theorem 3.2 that $\bar{x} \in \Omega$ is an optimal solution to (1.2) if and
only if $-a_1 - a_2 \in N(\bar{x}; \Omega)$. By the assumed structure of the normal cone to $\Omega$ the
latter is equivalent to the alternative:
To justify (i), let us show that the second equality in (3.24) implies the corresponding
one in (3.22). Indeed, we have $\|a_1\| = \|a_2\| = 1$, and thus
\[
\begin{aligned}
\langle a_1, \lambda a \rangle &= \langle a_1, a_1 + a_2 \rangle \\
&= \langle a_1, a_1 \rangle + \langle a_1, a_2 \rangle \\
&= 1 + \langle a_1, a_2 \rangle \\
&= \langle a_2, a_2 \rangle + \langle a_2, a_1 \rangle \\
&= \langle a_2, a_1 + a_2 \rangle \\
&= \langle a_2, \lambda a \rangle,
\end{aligned}
\]
which ensures that $\langle a_1, a \rangle = \langle a_2, a \rangle$ as $\lambda \ne 0$. This gives us the equality $\cos(a_1, a) =
\cos(a_2, a)$ due to $\|a_1\| = \|a_2\| = 1$ and $a \ne 0$. Hence we arrive at (3.22).
To justify (ii), we need to prove that the relationships in (3.23) imply
\[
a_1 = x_1 a + y_1 b \quad \text{and} \quad a_2 = x_2 a + y_2 b.
\]
Since $\cos(a_1, a) = \cos(a_2, a)$, we have $x_1 = x_2$. Then $y_1^2 = y_2^2$ by $\|a_1\|^2 = \|a_2\|^2$.
Due to $a_1 \ne a_2$ this implies $y_1 = -y_2$ and thus completes the proof.
Finally in this section, we present two examples illustrating the application of The-
orem 3.2 and Corollary 3.4, respectively, to solving the corresponding generalized and
classical Heron problems.
Example 3.6. Consider problem (1.2) where $n = 2$, the sets $\Omega_1$ and $\Omega_2$ are two points
$A_1$ and $A_2$ in the plane, and the constraint $\Omega$ is a disk that does not contain $A_1$ or $A_2$.
Condition (3.15) from Theorem 3.2 characterizes a solution $M$ to this generalized
[Figure 3 plot: two panels with axes x and y, each showing the points A1 and A2 and the solution M.]
Figure 3. Generalized Heron problems for two points with disk constraints.
where $\vec{a}$ is a direction vector of $L$. Note that the latter equation completely characterizes
the solution of the classical Heron problem in the plane in both cases when $A_1$
and $A_2$ are on the same side and different sides of $L$; see Figure 4.
[Figure 4 plot: two panels with axes x and y, each showing the points A1 and A2 and the solution M.]
Figure 4. The classical Heron problem.
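\[
x_{k+1} = \Pi\Big( x_k - \alpha_k \sum_{i=1}^{n} v_{ik};\ \Omega \Big), \quad k = 1, 2, \ldots, \tag{4.26}
\]
\[
v_{ik} :=
\begin{cases}
\dfrac{x_k - \Pi(x_k; \Omega_i)}{d(x_k; \Omega_i)} & \text{if } x_k \notin \Omega_i, \\[2ex]
0 & \text{if } x_k \in \Omega_i.
\end{cases}
\]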
Then the iterative sequence {xk } in (4.26) converges to an optimal solution of the gen-
eralized Heron problem (1.2) and the value sequence
Proof. Observe that algorithm (4.26) is well posed, since the projection to a convex
set used in (4.26) is uniquely defined. Since one of the sets $\Omega$ and $\Omega_i$, $i = 1, \ldots, n$,
is bounded, the problem has an optimal solution by Proposition 3.1. This algorithm
and its convergence under conditions (4.27) are based on the subgradient method for
convex functions in the so-called "square summable but not summable" case (see,
e.g., [1, Proposition 8.2.8, p. 480]), the subdifferential sum rule of Theorem 2.1, and
the subdifferential formula for the distance function given in Proposition 2.2. The
reader can compare this algorithm and its justifications with the related developments
in [6] for the numerical solution of the (unconstrained) generalized Fermat-Torricelli
problem.
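The iteration (4.26) is short enough to sketch directly. The following is a minimal illustration in Python, not the authors' MATLAB program: it assumes the step sizes $1/k$, axis-parallel squares as target sets, and a disk as the constraint. The function names (`project_ball`, `project_box`, `heron_subgradient`) and the test data are hypothetical choices made for this sketch. The data are symmetric about the $x$-axis, so the minimizer on the unit disk should be the point $(1, 0)$, with optimal value $3\sqrt{2}$.

```python
import math

def dist(x, y):
    """Euclidean distance between two points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def project_ball(x, center, radius):
    """Euclidean projection of x onto the closed ball B(center, radius)."""
    d = dist(x, center)
    if d <= radius:
        return list(x)
    return [c + radius * (a - c) / d for a, c in zip(x, center)]

def project_box(x, center, half_side):
    """Projection onto an axis-parallel square or cube: clamp each coordinate."""
    return [min(max(a, c - half_side), c + half_side) for a, c in zip(x, center)]

def total_distance(x, target_projections):
    """D(x): sum of the distances from x to the target sets."""
    return sum(dist(x, proj(x)) for proj in target_projections)

def heron_subgradient(x_start, constraint_projection, target_projections,
                      iterations=2000):
    """Projected subgradient iteration (4.26) with step sizes 1/k (an assumption)."""
    x = list(x_start)
    for k in range(1, iterations + 1):
        alpha = 1.0 / k
        direction = [0.0] * len(x)
        for proj in target_projections:
            p = proj(x)
            d = dist(x, p)
            if d > 0.0:  # v_ik = 0 when x already lies in the i-th set
                for j in range(len(x)):
                    direction[j] += (x[j] - p[j]) / d
        x = constraint_projection([a - alpha * g for a, g in zip(x, direction)])
    return x

# Hypothetical data: two squares of half side 0.5 and the unit disk.
targets = [
    lambda x: project_box(x, (3.0, 2.0), 0.5),
    lambda x: project_box(x, (3.0, -2.0), 0.5),
]
constraint = lambda x: project_ball(x, (0.0, 0.0), 1.0)

solution = heron_subgradient([0.0, 0.5], constraint, targets)
value = total_distance(solution, targets)
```

Passing the sets as projection functions keeps the loop independent of the dimension and of the shapes involved, so the same sketch covers cubes with a ball constraint in $\mathbb{R}^3$.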
Example 4.2. Consider the generalized Heron problem (1.2) for (not necessarily disjoint)
squares $\Omega_i$, $i = 1, \ldots, n$, of right position in $\mathbb{R}^2$ (i.e., such that the sides of each
square are parallel to the x-axis and the y-axis) subject to a given disk constraint $\Omega$. Let
$c_i = (a_i, b_i)$ and $r_i$, $i = 1, \ldots, n$, be the centers and half the side lengths of the squares
under consideration. The vertices of the $i$th square are denoted by $q_{1i} = (a_i + r_i,
b_i + r_i)$, $q_{2i} = (a_i - r_i, b_i + r_i)$, $q_{3i} = (a_i - r_i, b_i - r_i)$, $q_{4i} = (a_i + r_i, b_i - r_i)$. Let
$r$ and $p = (\nu, \eta)$ be the radius and the center of the constraint.
\[
P(x, y) = (w_x + \nu,\ w_y + \eta)
\]
with
\[
w_x = \frac{r(x - \nu)}{\sqrt{(x - \nu)^2 + (y - \eta)^2}}
\]
and
\[
w_y = \frac{r(y - \eta)}{\sqrt{(x - \nu)^2 + (y - \eta)^2}}.
\]
for i = 1, . . . , n and k = 1, 2, . . .
Example 4.3. Consider the generalized Heron problem (1.2) for (not necessarily disjoint)
cubes of right position in $\mathbb{R}^3$ subject to a ball constraint. In this case the projection
$\Pi((x, y, z); \Omega)$ and quantities $v_{ik}$ are computed similarly to Example 4.2.
Once again, we implemented this algorithm with a MATLAB program. Figure 6 and
the corresponding table present the calculation results for the ball constraint $\Omega$ with
center (0, 2, 0) and radius 1, for the cubes $\Omega_i$ with centers (0, 4, 0), (4, 2, 3),
(3, 4, 2), (5, 4, 4), and (1, 8, 1) of the same half side length 1, for the starting
point $x_1 = (0, 2, 0)$, and for the sequence $\alpha_k = 1/k$ in (4.26) satisfying (4.27). The
approximate optimal solution and optimal value are $x \approx (0.92531, 1.62907, 0.07883)$
and $V \approx 22.23480$.
MATLAB RESULT
k         x_k                              V_k
1         (0, 2, 0)                        24.18180
10        (0.92583, 1.63052, 0.07947)      22.23480
100       (0.92531, 1.62908, 0.07884)      22.23480
1,000     (0.92531, 1.62907, 0.07883)      22.23480
10,000    (0.92531, 1.62907, 0.07883)      22.23480
20,000    (0.92531, 1.62907, 0.07883)      22.23480
30,000    (0.92531, 1.62907, 0.07883)      22.23480
[Figure 6 plot: three-dimensional view with axes x, y, and z.]
Figure 6. Generalized Heron problem for cubes with a ball constraint.
ACKNOWLEDGMENTS. Research of the first author was partially supported by the US National Science
Foundation under grant DMS-1007132 and by the Australian Research Council under grant DP-12092508.
REFERENCES
1. D. Bertsekas, A. Nedic, A. Ozdaglar, Convex Analysis and Optimization, Athena Scientific, Belmont, MA,
2003.
2. J. M. Borwein, A. S. Lewis, Convex Analysis and Nonlinear Optimization: Theory and Examples, second
edition, Springer, New York, 2006.
3. J.-B. Hiriart-Urruty, C. Lemaréchal, Fundamentals of Convex Analysis, Springer-Verlag, Berlin, 2001.
4. G. G. Magaril-Ilyaev, M. V. Tikhomirov, Convex Analysis: Theory and Applications, American Mathe-
matical Society, Providence, RI, 2003.
BORIS MORDUKHOVICH is Distinguished University Professor and President of the Academy of Schol-
ars at Wayne State University. He has more than 300 publications including monographs and patents. Among
his best known achievements are the introduction of powerful constructions of generalized differentiation
(bearing his name), their development, and applications to broad classes of problems in variational analy-
sis, optimization, equilibrium, control, economics, engineering, and other fields. Mordukhovich is a SIAM
Fellow and a recipient of many international awards and honors including Doctor Honoris Causa degrees from
four universities.
Department of Mathematics, Wayne State University, Detroit, MI 48202
boris@math.wayne.edu
NGUYEN MAU NAM received his B.S. from Hue University, Vietnam, in 1998 and his Ph.D. from Wayne
State University in 2007 under the direction of Boris Mordukhovich. He is currently an Assistant Professor of
Mathematics at the University of Texas-Pan American.
Department of Mathematics, University of Texas-Pan American, Edinburg, TX 78539
nguyenmn@utpa.edu
JUAN SALINAS JR. received his B.S. in Electrical Engineering from the University of Texas-Pan American
in 1999. He is currently a graduate student at the University of Texas-Pan American.
Department of Mathematics, University of Texas-Pan American, Edinburg, TX 78539
jsalinasn@broncs.utpa.edu
Abstract. In this article we identify several beautiful properties of Jacobi sums that become
evident when these numbers are organized as a matrix and studied via the tools of linear alge-
bra. In the process we reconsider a convention employed in computing Jacobi sum values by
illustrating how these properties become less elegant or disappear entirely when the standard
definition for Jacobi sums is utilized. We conclude with a conjecture regarding polynomials
that factor in an unexpected manner.
1. JACOBI SUMS. Carl Jacobi's formidable mathematical legacy includes such contributions
as the Jacobi triple product, the Jacobi symbol, the Jacobi elliptic functions
with associated Jacobi amplitudes, and the Jacobian in the change of variables theorem,
to but scratch the surface. Among his many discoveries, Jacobi sums stand out
as one of the most brilliant gems. Very informally, a Jacobi sum adds together certain
roots of unity in a manner prescribed by the arithmetic structure of the finite field on
which it is based. (We will supply a precise definition momentarily.) For a given finite
field a Jacobi sum depends on two parameters, so it is natural to assemble these values
into a matrix. We have done so below for the Jacobi sums arising from the field with
eight elements. We invite the reader to study this collection of numbers and identify as
many properties as are readily apparent.
6
1 1 1 1 1 1
1 1 + i 7 5 1 i 7 1 i 7 5 1 i 7 1 + i 7 1
2 2 2 2
1 5 1 i 7 1 + i 7 1 + i 7 5 1 i 7
2 2 2 2
1 1 i 7
5 1
5 1
(1)
1 1 i 7 1 + i 7 1 i 7 1 + 2i 7 2 + 2i 7
2
1 52 12 i 7 52 12 i 7 1 + i 7 1 i 7 1 + i 7
1
5 1 5 1
1 1 + i 7 7 7 7 7
1 2
+ 2
i 1 i 1 i 2
+ 2
i
5
1 1 1 i 7 2
+ 12 i 7 1 + i 7 5
2
+ 21 i 7 1 i 7
Before enumerating the standard properties of Jacobi sums we offer a modest back-
ground on their development and applications. According to [2] Jacobi first proposed
these sums as mathematical objects worthy of study in a letter mailed to Gauss in 1827.
Ten years later he published his findings, with extensions of his work provided soon
after by Cauchy, Gauss, and Eisenstein. It is interesting to note that while Gauss sums
will suffice for a proof of quadratic reciprocity, a demonstration of cubic reciprocity
along similar lines requires a foray into the realm of Jacobi sums; Eisenstein formu-
lated a generalization of Jacobi sums (see [3]) in order to prove biquadratic reciprocity.
As shown in [5], Jacobi sums may be used to estimate the number of integral solutions
to congruences such as $x^3 + y^3 \equiv 1 \bmod p$. These estimates played an important role
in the development of the Weil conjectures [6]. Jacobi sums were also employed by
Adleman, Pomerance, and Rumely [1] for primality testing.
http://dx.doi.org/10.4169/amer.math.monthly.119.02.100
MSC: Primary 11T24
2. PRELIMINARIES. Recall that there exists a finite field $\mathbb{F}_q$ with $q$ elements if and
only if $q = p^r$ is a power of a prime, and such a field is unique up to isomorphism.
We shall not require any specialized knowledge of finite fields beyond the fact that the
multiplicative group $\mathbb{F}_q^*$ of nonzero elements forms a cyclic group of order $q - 1$. The
quantity $q - 1$ appears throughout our discussion, so we set $m = q - 1$ from here on.
Thus $\mathbb{F}_q^*$ has $m$ elements.
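Fix a generator $g$ of $\mathbb{F}_q^*$ and let $\zeta = e^{2\pi i/m}$. The function $\chi$ defined by $\chi(g^k) = \zeta^k$
for $1 \le k \le m$ is an example of a multiplicative character on $\mathbb{F}_q^*$; that is, a function
$\chi\colon \mathbb{F}_q^* \to \mathbb{C}$ satisfying
\[
\chi(uv) = \chi(u)\chi(v) \quad \text{for all } u, v \in \mathbb{F}_q^*. \tag{2}
\]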
We use an $m$th root of unity since $(\chi(g))^m = \chi(g^m) = \chi(1) = 1$. As the reader may
verify, there are precisely $m$ multiplicative characters on $\mathbb{F}_q^*$, namely $\chi, \chi^2, \ldots, \chi^m$,
where $\chi^a(g^k) = (\chi(g^k))^a = \zeta^{ak}$ as one would expect. Note that $\chi^m(g^k) = 1$ for all $k$,
so we call $\chi^m$ the trivial character. It follows that the value of the exponent $a$ only
matters mod $m$. In particular, the inverse of $\chi^a$ (which is also the complex conjugate)
may be written either as $\chi^{-a}$ or as $\chi^{m-a}$. By the same token, we will usually write the
trivial character as $\chi^0$.
To define a Jacobi sum it is necessary to extend each character $\chi^a$ to all of $\mathbb{F}_q$
by defining $\chi^a(0)$. The multiplicative condition forces $\chi^a(0) = 0$ whenever $1 \le a <
m$. But for the trivial character a seemingly arbitrary choice¹ must be made, since
taking either $\chi^0(0) = 0$ or $\chi^0(0) = 1$ satisfies (2). Convention dictates that we declare
$\chi^0(0) = 1$ for the trivial character. However, we opt for setting $\chi^a(0) = 0$ for all $a$. As
the opportunity arises we will point out the ramifications of this choice. Properties of
roots of unity now imply that
\[
\sum_{u \in \mathbb{F}_q} \chi^a(u) = 0, \quad 1 \le a < m, \qquad \sum_{u \in \mathbb{F}_q} \chi^0(u) = q - 1 = m. \tag{3}
\]
(One rationale behind taking $\chi^0(0) = 1$ is presumably rooted in the fact that the latter
sum would come to $q$ rather than $q - 1$, giving a more pleasing value.)
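A Jacobi sum takes as its arguments a pair of multiplicative characters on a given
finite field and returns a complex number:
\[
J_q(\chi^a, \chi^b) = \sum_{u \in \mathbb{F}_q} \chi^a(u)\, \chi^b(1 - u) = \sum_{u + v = 1} \chi^a(u)\, \chi^b(v). \tag{4}
\]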
¹ Ireland and Rosen explain that Jacobi sums arise when counting solutions to equations over $\mathbb{F}_p$. In this
context $\chi^0(0)$ tallies solutions to $x^e = 0$, which would seem to motivate the value $\chi^0(0) = 1$. However, one
might also argue that the zero solution should not be included since the equations are homogeneous, leading to
$\chi^0(0) = 0$ instead.
The middle expression is more utilitarian, while the final one highlights the symmetry
in the definition. When the field $\mathbb{F}_q$ is clear we will drop the subscript $q$. We will also
often omit $\chi$ and refer to a particular Jacobi sum simply as $J(a, b)$. Because the terms
of the sum corresponding to $u = 0$ and $u = 1$ always vanish, we may write
\[
J(a, b) = \sum_{u \ne 0, 1} \chi^a(u)\, \chi^b(1 - u), \tag{5}
\]
where it is understood that the sum is over $u \in \mathbb{F}_q$. Thus a Jacobi sum adds together
$q - 2$ not necessarily distinct $m$th roots of unity.
In a marvelous manner this sum plays the additive and multiplicative structures of
the field off one another, yielding a collection of numbers with extraordinary properties.
To illustrate how these numbers are computed we return to matrix (1), which
catalogs the values $J_8(\chi^a, \chi^b)$ for $0 \le a, b \le 6$ for a particular generator $g$ of $\mathbb{F}_8^*$.
(For aesthetic reasons we begin numbering rows and columns of this matrix at 0.) The
generator $g$ of $\mathbb{F}_8^*$ chosen satisfies
\[
g^1 + g^3 = 1, \quad g^2 + g^6 = 1, \quad g^4 + g^5 = 1, \quad g^7 + 0 = 1. \tag{6}
\]
\[
= -\chi^a(g^7) = -1,
\]
where the penultimate step follows from (3). If we had employed the conventional
value for $\chi^0(0)$ the term $\chi^a(g^7)$ would also appear in the sum, giving a total of 0 instead.
By way of further orientation the reader is encouraged to confirm that $J(5, 1) =
-1 + i\sqrt{7}$ and that $J(3, 4) = -1$.
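These confirmations can also be carried out numerically. The sketch below assumes only the relations (6): writing $u = g^k$ and $1 - u = g^l$, each term of (5) becomes $\zeta^{ak + bl}$ with $\zeta = e^{2\pi i/7}$. The names `partner`, `zeta`, and `jacobi_sum` are illustrative choices for this sketch. Because the sum skips $u = 0$ and $u = 1$, it automatically uses the value 0 for every character at 0.

```python
import cmath
import math

m = 7  # m = q - 1 for the field with q = 8 elements

# partner[k] = l encodes the relation g^k + g^l = 1 read off from (6).
partner = {1: 3, 3: 1, 2: 6, 6: 2, 4: 5, 5: 4}

zeta = cmath.exp(2j * math.pi / m)  # primitive m-th root of unity

def jacobi_sum(a, b):
    """J(a, b) as in (5): sum of zeta^(a*k + b*l) over the pairs g^k + g^l = 1."""
    return sum(zeta ** ((a * k + b * l) % m) for k, l in partner.items())

# The full 7 x 7 Jacobi sum matrix, rows and columns numbered from 0.
B = [[jacobi_sum(a, b) for b in range(m)] for a in range(m)]
```

With this setup `B[5][1]` and `B[3][4]` give floating-point approximations to the two values the reader is asked to confirm.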
A cursory examination shows that matrix (1) is symmetric, that the top left entry
equals $q - 2$, and that the remaining entries along the top row, the left column, and the
secondary diagonal are $-1$. Slightly less obvious is the fact that all other entries have
an absolute value of $\sqrt{8}$. The sum of the entries along the top row is 0; a quick check
Observe that $|J(a, b) + 1|^2$ and $|J(a, b) + 8|^2$ are either 0 or of the form $2^r 7^s$ with
$r, s \in \mathbb{N}$ for every entry of (1). In general the quantities $J_q(a, b) + 1$ and $J_q(a, b) +
q$ satisfy interesting congruences. We also remark that all the results presented here
continue to be valid regardless of the generator $g$ of $\mathbb{F}_q^*$ used to define $\chi$. The value of
$J_q(\chi^{as}, \chi^{bs})$ obtained by using the generator $g^s$ is identical to that of $J_q(\chi^a, \chi^b)$ using
the original generator $g$, so altering the generator only permutes the rows and columns
of a Jacobi sum matrix in a symmetric fashion.
where $\omega = e^{2\pi i/3}$. One might speculate that cube roots of unity make an appearance
since we used characters of $\mathbb{F}_8$, and $8 = 2^3$. But in fact the same phenomenon occurs
for every value of $q$. This is explained by the fact that powers of these matrices (suitably
scaled) cycle with period three, a property that depends on using the nonstandard
value for $\chi^0(0)$.
It is a standard opening gambit in these sorts of proofs to move the summation over $k$
to the inside and then use the fact that $\sum_{k=0}^{m-1} \chi^k(u) = 0$ unless $u = 1$, in which case
the sum equals $m$. (It is this feature of characters that makes them useful for counting
arguments.) Employing this strategy leads to
\[
\sum_{u \ne 0, 1} \sum_{v \ne 0, 1} \chi^a(u)\, \chi^b(1 - v) \sum_{k=0}^{m-1} \chi^k\big((1 - u)v\big). \tag{12}
\]
where we have used $\chi^a(u) = \chi^{-a}(\tfrac{1}{u})$. Next observe that $J(-a, -b) = \overline{J(a, b)}$ since
$\chi^{-a} = \overline{\chi^a}$; we introduced negative exponents in anticipation of this fact. And now in
a beautiful stroke we realize that $\frac{v}{v-1} + \frac{1}{1-v} = 1$, and as $v$ runs through all elements of
$\mathbb{F}_q$ other than 0 and 1 so does $\frac{v}{v-1}$. Hence the right-hand side of (13) is precisely the
sum defining $J(-a, -b)$, thus proving the first part.
With this result in hand the second part will follow once we show $B\overline{B} = m^2 I -
mU$. This is equivalent to demonstrating that
\[
\sum_{k=0}^{m-1} J(a, k)\, \overline{J(k, b)} =
\begin{cases}
m^2 - m & a = b, \\
-m & a \ne b.
\end{cases} \tag{14}
\]
The same ingredients are needed as above (but without negative exponents at the end),
so we omit the proof in favor of permitting the reader to supply the steps. There are no
major surprises along the way, and the explanation is quite satisfying.
The final claim is an immediate consequence of the second part. For $n \ge 4$ we
compute
\[
B^n = B^{n-4} B (m^3 I - m^2 U) = m^3 B^{n-3}, \tag{15}
\]
where we have used the fact that $BU$ is the zero matrix because the entries within each
row of $B$ sum to 0. This completes the proof.
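The identities used in this proof can be sanity-checked on the matrix for $\mathbb{F}_8$. The sketch below is a numerical illustration only, assuming the pairing read off from (6) and the convention $\chi^a(0) = 0$; the helper names `matmul` and `max_abs_diff` are hypothetical. It compares $B^2$ with $m\overline{B}$, $B^3$ with $m^3 I - m^2 U$, and $B^4$ with $m^3 B$, where $U$ is the all-ones matrix.

```python
import cmath
import math

m = 7
partner = {1: 3, 3: 1, 2: 6, 6: 2, 4: 5, 5: 4}  # g^k + g^l = 1, from (6)
zeta = cmath.exp(2j * math.pi / m)

# Jacobi sum matrix for the field with eight elements.
B = [[sum(zeta ** ((a * k + b * l) % m) for k, l in partner.items())
      for b in range(m)] for a in range(m)]

def matmul(X, Y):
    """Product of two m x m matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(m)) for j in range(m)]
            for i in range(m)]

def max_abs_diff(X, Y):
    """Largest entrywise distance between two m x m matrices."""
    return max(abs(X[i][j] - Y[i][j]) for i in range(m) for j in range(m))

B2 = matmul(B, B)
B3 = matmul(B2, B)
B4 = matmul(B3, B)

m_times_conj_B = [[m * z.conjugate() for z in row] for row in B]
cube_target = [[m ** 3 * (i == j) - m ** 2 for j in range(m)] for i in range(m)]
m3_times_B = [[m ** 3 * z for z in row] for row in B]

err_square = max_abs_diff(B2, m_times_conj_B)  # B^2 versus m * conj(B)
err_cube = max_abs_diff(B3, cube_target)       # B^3 versus m^3 I - m^2 U
err_fourth = max_abs_diff(B4, m3_times_B)      # B^4 versus m^3 B, cf. (15)
```

All three discrepancies should be at the level of floating-point rounding.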
The fact that the eigenspaces for $\lambda = 7$, $7\omega$, and $7\overline{\omega}$ have the same dimension has
probably not escaped notice. In general the eigenspaces are always as close in size as
possible, a fact that depends upon ascertaining the traces of Jacobi sum matrices.
Proof. The values occurring along the main diagonal of $B$ are $J(a, a)$. Hence
\[
\operatorname{tr}(B) = \sum_{a=0}^{m-1} J(a, a) = \sum_{a=0}^{m-1} \sum_{u \in \mathbb{F}_q} \chi^a(u)\, \chi^a(1 - u) = \sum_{u \in \mathbb{F}_q} \sum_{a=0}^{m-1} \chi^a(u - u^2). \tag{16}
\]
But the inner sum vanishes unless $u - u^2 = 1$, in which case its value is $m$. When
$\mathbb{F}_q$ has characteristic 3 we find that $u = -1$ is a double root of the equation, while
for other characteristics $u = -1$ is not a root. Since $(u^2 - u + 1)(u + 1) = u^3 + 1$, in
these cases we seek values of $u$ with $u^3 = -1$ other than $u = -1$. For $m \equiv 0 \bmod 3$
there will be two such values, while for $m \equiv 1 \bmod 3$ there are no such values, because
$\mathbb{F}_q^*$ is a cyclic group of order $m$. In summary, $u - u^2 = 1$ has one, two, or zero distinct
roots when $m \equiv 2, 0, 1 \bmod 3$, as claimed.
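As a quick check of this count against matrix (1), here is a worked instance, assuming the diagonal entries of (1) are read as the values $J(a, a)$. For $q = 8$ we have $m = 7 \equiv 1 \bmod 3$, so $u - u^2 = 1$ has no roots and the trace should vanish. The diagonal consists of 6, three copies of $-1 + i\sqrt{7}$, and three copies of $-1 - i\sqrt{7}$:

```latex
\[
\operatorname{tr}(B) = 6 + 3\,(-1 + i\sqrt{7}) + 3\,(-1 - i\sqrt{7}) = 6 - 6 = 0 .
\]
```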
Proposition 3. The characteristic polynomial of a Jacobi sum matrix B has the form
Proof. Clearly $p_B(x)$ is monic. Furthermore, the sign will be positive unless $m$ is odd;
i.e., when $q = 2^r$. We will also see below that $\operatorname{rank}(B) = m - 1$, giving the single factor
of $x$. Now let us show that $p_B(x)$ has real coefficients, meaning that the eigenvalues
$m\omega$ and $m\overline{\omega}$ occur in pairs. This follows from the relationship $J(m - a, m - b) =
\overline{J(a, b)}$. In other words, swapping rows $a$ and $m - a$ as well as columns $b$ and $m - b$
for all $a$ and $b$ with $1 \le a, b \le \frac{m}{2}$ does not alter $p_B(x)$ but does conjugate every entry
of $B$, and therefore $p_B(x)$ is real. The trace of $B$ is the sum of the eigenvalues, and
each triple $m$, $m\omega$, $m\overline{\omega}$ will cancel. The result now follows from the fact that $\operatorname{tr}(B)$ is
equal to 0, $m$, or $2m$.
4. RELATED MATRICES. Observe that the list of eigenvalues for a Jacobi sum
matrix constructed using the conventional definition is nearly identical to the list given
by Proposition 3, the difference being that the eigenvalue $\lambda = 0$ is replaced by $\lambda = 1$
and a single occurrence of $\lambda = m$ changes to $\lambda = m + 1 = q$. This is a consequence of
the close relationship in each case between the characteristic polynomial of the entire
matrix and that of the lower right $(m - 1) \times (m - 1)$ submatrix of values they share.
c1
1
M = .. (18)
. M0
1
We next add columns 2 through n to the first column. Since the sum of the entries
within each row of M is zero this operation cancels every term in the first column
below the top entry, which becomes
If $J_q(a, b)$ were computed in the traditional manner the top row of our Jacobi sum
matrix would be $q$ followed by a row of 0s, so the list of eigenvalues would consist of
those of the lower right submatrix, augmented by the value $\lambda = q$. Invoking the lemma
now leads to the statement made above comparing lists of eigenvalues.
Purely to satisfy our curiosity, we now propose permuting the rows and columns
of a Jacobi sum matrix $B$ before computing the eigenvalues. For example, take $B$ to
equal matrix (1), let $P$ be any $7 \times 7$ permutation matrix, and consider the degree-seven
polynomial $\det(PB - xI)$. Compiling the roots to all 5040 polynomials that arise in
this manner produces a list with somewhat more than 3500 distinct complex numbers;
locating them in the complex plane yields the scatterplot on the left in Figure 1. The
roots, whose locations are marked by small solid discs, form a nearly unbroken chain
along the circle of radius 7 centered at the origin, with discernible gaps located only
near the real axis. By way of comparison, the related $7 \times 7$ matrix of conventional
Jacobi sum values yields the right-hand plot in Figure 1. Put another way, matrix $B$
generates in excess of 3500 algebraic integers, each of degree 14 or less over $\mathbb{Q}$ and
each having absolute value 7. As one might hope, this property is shared by all Jacobi
sum matrices. The following result was conjectured by the author and proved by Ron
Evans (personal communication, Jan. 2011); we present this proof below.
[Figure 1 plot: two scatterplots of roots in the complex plane.]
Figure 1. A plot of the roots of $\det(PB - xI)$ on the left and the corresponding roots for the matrix of conventional Jacobi sum values on the right, for all
$7 \times 7$ permutation matrices $P$.
Proposition 4. Let $B$ denote a Jacobi sum matrix for the finite field $\mathbb{F}_q$ and let $P$ be
any $m \times m$ permutation matrix, where $m = q - 1$. Then every nonzero eigenvalue $\lambda$
of the matrix $PB$ satisfies $|\lambda| = m$.
Proof. Let $\lambda$ be a nonzero eigenvalue of $PB$, so that $PBv = \lambda v$ for some nonzero
vector $v$. Letting $M^*$ denote the conjugate transpose of a matrix $M$, it follows that
$(PBv)^*(PBv) = (\lambda v)^*(\lambda v)$. Expanding yields $v^* B^* P^* P B v = |\lambda|^2 v^* v$, which implies
\[
v^* \big( m^{-1} B^3 \big) v = |\lambda|^2 v^* v, \tag{22}
\]
\[
v^* (m^2 I - mU) v = m^2 v^* v - m\, v^* U v = m^2 v^* v = |\lambda|^2 v^* v. \tag{23}
\]
Proof. According to Corollary 1 the eigenvalues of $B$ belong to the set $\{0, m, m\omega, m\overline{\omega}\}$.
We know $\det(B) = 0$, so $\operatorname{rank}(B) < m$. But by the previous lemma $B'$ is nonsingular;
therefore $\operatorname{rank}(B) = m - 1$, implying that exactly one eigenvalue of $B$ is 0. We next
apply Lemma 1 to conclude that the eigenvalues of $B'$ are among $\{1, m, m\omega, m\overline{\omega}\}$,
with the value 1 occurring precisely once. Finally, the discussion within Proposition 3
indicates that the values $m\omega$ and $m\overline{\omega}$ come in pairs. Hence the product of the $m - 1$
eigenvalues, which is $\det(B')$, comes to $m^{m-2}$.
Proof. The case $i = j = 0$ is handled by Proposition 6. When $j > 0$ note that adding
all other columns of $B'$ to column $j$ effectively replaces that column with the negative
of column 0 of $B$, since the sum of the entries within every row is 0. Moving this
column back to the far left and negating it introduces a sign of $(-1)^j$ to the value of
the determinant. The same reasoning applies to the rows; therefore $B'$ is transformed
into $A$ by operations that change the sign of $\det(B')$ by $(-1)^{i+j}$.
In other words, the determinant of this submatrix appears to be related to the conjugates
of the entries in the complementary upper left $2 \times 2$ submatrix. The same
phenomenon occurs elsewhere; for instance, if $A$ is the upper left $5 \times 5$ submatrix
of (1) then we find that $\det(A) = 343(-7 + 3i\sqrt{7})$, and sure enough
\[
-7 + 3i\sqrt{7} = \overline{J(5, 5)} - \overline{J(5, 6)} - \overline{J(6, 5)} + \overline{J(6, 6)}. \tag{25}
\]
\[
\frac{\det(A)}{m^k} = \epsilon_{A^c}\, \frac{\operatorname{ddet}(\overline{A^c})}{m^{m-k}}. \tag{28}
\]
Observe that the power of $m$ in each denominator corresponds to the size of the matrix
in the numerator. Also, the examples outlined above illustrate the case $m = 7$, $k = 5$;
in both examples the sign happened to be $\epsilon_{A^c} = 1$. We provide a proof of this result in
the appendix. The reader is encouraged to peruse the argument; among other things,
a number of steps would make excellent exercises for linear algebra students.
Before considering a collection of multivariable polynomials with unlikely factorizations,
we pause to present a couple of elementary facts concerning the diminished
determinant, which arose naturally in the preceding discussion. Early in the proof of
Theorem 2 we will need an analogue to expansion by minors to handle the transition
between diminished determinants for matrices of different sizes. To clarify the analogy,
let $M$ be an $n \times n$ matrix with entries $m_{ij}$ and let $M^i_j$ denote the submatrix obtained
by deleting row $i$ and column $j$ from $M$. Then expansion by minors implies that
\[ \det(M) = \frac{1}{n} \sum_{i,j=1}^{n} (-1)^{i+j}\, m_{ij} \det(M^i_j). \tag{29} \]
\[ \operatorname{ddet}(M) = \frac{1}{n-1} \sum_{i,j=1}^{n} (-1)^{i+j}\, m_{ij} \operatorname{ddet}(M^i_j). \tag{30} \]
\[ \operatorname{ddet}(M) = \sum_{k,l=1}^{n} (-1)^{k+l} \det(M^k_l)
 = \sum_{k,l} (-1)^{k+l}\, \frac{1}{n-1} \sum_{\substack{i \ne k \\ j \ne l}} (-1)^{i'+j'}\, m_{ij} \det(M^{ik}_{jl}). \tag{31} \]
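\[ \operatorname{ddet}(M) = \frac{1}{n-1} \sum_{i,j} (-1)^{i+j}\, m_{ij} \sum_{\substack{k \ne i \\ l \ne j}} (-1)^{k'+l'} \det(M^{ik}_{jl})
 = \frac{1}{n-1} \sum_{i,j=1}^{n} (-1)^{i+j}\, m_{ij} \operatorname{ddet}(M^i_j). \tag{32} \]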
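The expansion formula for the diminished determinant can be checked numerically on small matrices. The sketch below is only an illustration: the helper names (`minor`, `det`, `ddet`, `ddet_by_expansion`) and the two sample matrices are assumptions introduced here, rows and columns are indexed from 0, and the empty matrix is taken to have determinant 1.

```python
from fractions import Fraction
from itertools import product


def minor(M, i, j):
    """Return M with row i and column j deleted (0-indexed)."""
    return [row[:j] + row[j + 1:] for r, row in enumerate(M) if r != i]


def det(M):
    """Determinant by cofactor expansion along row 0; the empty matrix has determinant 1."""
    if not M:
        return 1
    return sum((-1) ** j * M[0][j] * det(minor(M, 0, j)) for j in range(len(M)))


def ddet(M):
    """Diminished determinant: signed sum of the determinants of all minors M^k_l."""
    n = len(M)
    return sum((-1) ** (k + l) * det(minor(M, k, l))
               for k, l in product(range(n), repeat=2))


def ddet_by_expansion(M):
    """(1/(n-1)) * sum over i, j of (-1)^(i+j) * m_ij * ddet(M^i_j), for n >= 2."""
    n = len(M)
    total = sum((-1) ** (i + j) * M[i][j] * ddet(minor(M, i, j))
                for i, j in product(range(n), repeat=2))
    return Fraction(total, n - 1)


# Hypothetical sample matrices, chosen only for illustration.
M3 = [[2, -1, 3], [0, 4, 1], [5, 2, -2]]
M4 = [[1, 2, 0, -3], [4, -1, 2, 1], [0, 3, 5, 2], [-2, 1, 1, 6]]
for M in (M3, M4):
    assert ddet(M) == ddet_by_expansion(M)
```

Exact integer and rational arithmetic is used so that the comparison is an equality test rather than a floating-point approximation.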
Diminished determinants also resemble determinants with respect to row and
column transpositions.
Proof. Every term in the sum $\sum (-1)^{i+j} \det(M^i_j)$ is negated by such an operation, for
one of two reasons. If column $i$ and row $j$ stay put then a pair of rows or columns
within $M^i_j$ trade places, negating $\det(M^i_j)$ without affecting $(-1)^{i+j}$. On the other
hand, if column $i$ or row $j$ is involved in the exchange then $M^i_j$ still appears in the
sum with entries intact, but now with an attached sign of $(-1)^{i+j\pm 1}$.
In each case the 1s are situated along a line through the origin, where the origin is
the upper left entry and we reduce coordinates mod 7; the subscript indicates the slope
of the line. We have already observed that the characteristic polynomial of matrix (1),
which we shall denote as B once again, splits completely over the field Q():
Further experimentation suggests that it is not a coincidence that the slopes used for P1 ,
P2 , and P4 are powers of 2. For instance, det(B w P1 x P2 y P4 z P8 ) splits into
Conjecture 1. Let $B$ be a Jacobi sum matrix for the finite field $\mathbb{F}_q$, where $q = p^r$ and
$m = q - 1$. For $(k, m) = 1$ denote by $P_k$ the $m \times m$ permutation matrix whose entry
in row $s$, column $t$ is 1 for all $0 \le s, t < m$ with $s \equiv kt \bmod m$. Then the polynomial
Other evidence that we have not included here suggests that this conjecture can be
extended in scope.
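The permutation matrices $P_k$ in the conjecture are easy to build explicitly. The following sketch constructs them from the rule "entry 1 in row $s$, column $t$ exactly when $s \equiv kt \pmod m$" and checks two consequences for $m = 7$. The function names `perm_matrix` and `mat_mul` are assumptions made for this illustration, and the multiplicative rule $P_a P_b = P_{ab}$ is a direct consequence of the definition rather than a statement taken from the text.

```python
from math import gcd


def perm_matrix(k, m):
    """P_k: the m x m 0/1 matrix with a 1 in row s, column t exactly when s = k*t (mod m)."""
    if gcd(k, m) != 1:
        raise ValueError("k must be coprime to m")
    return [[1 if (s - k * t) % m == 0 else 0 for t in range(m)] for s in range(m)]


def mat_mul(A, B):
    """Product of two square integer matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][u] * B[u][j] for u in range(n)) for j in range(n)] for i in range(n)]


m = 7
P2, P4 = perm_matrix(2, m), perm_matrix(4, m)
# Every row and every column holds exactly one 1, so P_2 is a permutation matrix.
assert all(sum(row) == 1 for row in P2)
assert all(sum(col) == 1 for col in zip(*P2))
# P_a P_b = P_{ab}; since 2 * 4 = 8 = 1 (mod 7), the product is the identity P_1.
assert mat_mul(P2, P4) == perm_matrix(1, m)
```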
In summary, we have examined an elegant tool from number theory via the lens
of linear algebra and uncovered several nice results in the process. At the very least
this approach demonstrates a tidy manner in which many of the elementary (though
perhaps not fully mapped out) facts concerning Jacobi sums may be packaged. On an
optimistic note, this avenue of inquiry may even lead to a more complete understanding
of Jacobi sums.
\[ \frac{\det(A)}{m^{k}} = \epsilon_{A^c}\, \frac{\operatorname{ddet}(\overline{A^c})}{m^{m-k}}. \tag{38} \]
Proof. For $k = 0$ and $k = m$ the statement to be proved reduces to
\[ 1 = \frac{\operatorname{ddet}(\overline{B})}{m^{m}}, \qquad \frac{\det(B)}{m^{m}} = 0. \tag{39} \]
The former is a consequence of Corollary 3, while the latter is clear. Furthermore,
the statement for $k = m - 1$ is equivalent to Corollary 3. Hence we need only show
that case $k$ follows from case $k + 1$ for $1 \le k \le m - 2$. In the interest of
presenting a lucid argument, we will provide a sketch of the proof in the case $k = m - 3$,
followed by a summary of the algebra for the general case, which is qualitatively no
different.
Therefore suppose the result holds for $k = m - 2$ and that $A$ is an $(m - 3) \times (m - 3)$
submatrix of $B$. For the sake of organization we permute the rows and columns of $B$
in order to situate the entries of $A^c$ in the upper left corner, but otherwise maintain the
00 01 02 03 04
10 11 12 13 14
21 22 23 24
20
(40)
C =
30 31 32
40 41 42
A
..
.
We claim that the result for $k = m - 2$ continues to hold for matrix $C$, up to a sign
which we now determine. For exchanging a pair of adjacent rows or columns of $B$
will negate exactly one of $\det(A)$, $\operatorname{ddet}(\overline{A^c})$ or $\epsilon_{A^c}$, according as the pair of rows or
columns both intersect $A$, both intersect $A^c$ (by Lemma 3), or intersect both. If the
entries of $A^c$ reside in rows $r_1$, $r_2$, $r_3$ and columns $c_1$, $c_2$, $c_3$ then it requires
det(D) ddet(D c )
Ac = D c . (42)
m m2 m2
The final observation to be made before embarking upon a grand calculation is
that the dot product of any row vector of $C$ with the conjugate of another row vector
is $-m$, while the dot product of a row vector with its own conjugate is $m^2 - m$. This
relationship holds for $B$ since $B$ is symmetric and $B\overline{B} = m^2 I - mU$, as noted in the
proof of Theorem 1. Permuting rows and columns of $B$ does not destroy this property,
which consequently holds for $C$ as well. Now to begin.
We wish to relate ddet(Ac ) to det(A). By Lemma 2 we may begin
1
ddet(Ac ) = 00 ddet 11 12 01 ddet 10 12 +
2 21 22 20 22
Ac
12 12 12
= 00 det(C 12 ) + 01 det(C 02 ) + 02 det(C 01 ) + , (43)
2m m4
since the result holds for k = m 2 with a correction factor of Ac . As before C ikjl
denotes the submatrix of C obtained by deleting rows i, k and columns j, l. We then
12 12 12
expand each of det(C12 ), det(C02 ), and det(C01 ) by minors along row 0, which gives
( 00 00 + 01 01 + 02 02 ) det(A), along with a fair number of other terms. We next
collect the remaining terms according to whether they involve 03 , 04 , 05 , and so on.
The reader may verify that the sum of the terms containing a factor of 03 is
where A03 F3 refers to matrix A with all entries in the left column replaced by 03 .
Combining the terms involving 04 , 05 , . . . in the same manner, we may rewrite the
where A3 represents matrix A with each entry in its leftmost column replaced by the
sum of all the entries in that column. Defining A j similarly for j 3, (47) reduces to
Each term of $\det(A)$ appears in $\det(A_j)$ for every $j$ and hence appears $m - 3$ times in
the sum; all other terms cancel in pairs, as the reader may verify. Hence we are left with
\[ 3(m^2 - m)\det(A) - m(m - 3)\det(A) = 2m^2 \det(A). \tag{50} \]
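In summary, we have shown that
\[ \operatorname{ddet}(\overline{A^c}) = \frac{\epsilon_{A^c}}{2m^{m-4}} \bigl(2m^2 \det(A)\bigr) = \epsilon_{A^c}\, m^{6}\, \frac{\det(A)}{m^{m}}. \tag{51} \]
Dividing through by $\epsilon_{A^c} m^3$ gives the desired equality.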
The calculation proceeds in an identical fashion for other values of $k$. One arrives
at the expression
\[ k(m^2 - m)\det(A) - m(m - k)\det(A) = (k - 1)m^2 \det(A) \tag{52} \]
in place of (50), yielding
\[ \operatorname{ddet}(\overline{A^c}) = \frac{\epsilon_{A^c}}{(k - 1)\,m^{m-2k+2}} \bigl((k - 1)m^2 \det(A)\bigr) = \epsilon_{A^c}\, m^{2k}\, \frac{\det(A)}{m^{m}}. \tag{53} \]
Rearranging gives the result.
ACKNOWLEDGMENTS. I would like to thank the referees for many helpful remarks and suggestions. In
particular, the idea of generating the right-hand scatterplot in the figure as well as the insightful remarks
contained in the footnote were both due to the referees. I am also grateful to Ron Evans for sharing the (quite
rapidly found) proof appearing in this article.
1. L. Adleman, C. Pomerance, R. Rumely, On distinguishing prime numbers from composite numbers, Ann.
of Math. 117 (1983) 173–206; available at http://dx.doi.org/10.2307/2006975.
2. B. C. Berndt, R. J. Evans, K. S. Williams, Gauss and Jacobi Sums. Wiley, New York, 1998.
3. G. Eisenstein, Einfacher beweis und verallgemeinerung des fundamental theorems für die biquadratischen
reste, in Mathematische Werke, Band I, 223–245, Chelsea, New York, 1975.
4. R. A. Horn, C. R. Johnson, Matrix Analysis. Cambridge University Press, Cambridge, 1985.
5. K. Ireland, M. Rosen, A Classical Introduction to Modern Number Theory, second edition. Springer, New
York, 1990.
6. A. Weil, Number of solutions of equations in a finite field, Bull. Amer. Math. Soc. 55 (1949) 497–508;
available at http://dx.doi.org/10.1090/S0002-9904-1949-09219-4.
Abstract. Alcuin of York (c. 740–804) lived over four hundred years before Fibonacci. Like
Fibonacci, Alcuin has a sequence of integers named after him. Although not as well known
as the Fibonacci sequence, Alcuin's sequence has several interesting properties. The purposes
of this note are to acquaint the reader with Alcuin's sequence, to give the simplest available
proofs of various formulas for Alcuin's sequence, and to showcase a new discovery about the
period of Alcuin's sequence modulo a fixed integer.
A certain father died and left as an inheritance to his three sons 30 glass flasks,
of which 10 were full of oil, another 10 were half full, while another 10 were
empty. Divide the oil and flasks so that an equal share of the commodities should
equally come down to the three sons, both of oil and glass [4].
There are five solutions, of which Alcuin gives only the first:
The numbers of full, empty, and half-full flasks are represented by the columns F,
E, and H, respectively. We don't regard solutions with the sons permuted as distinct.
Notice that in each solution, each son receives an equal number of full and empty
http://dx.doi.org/10.4169/amer.math.monthly.119.02.115
MSC: Primary 11B50
0, 0, 0, 1, 0, 1, 1, 2, 1, 3, 2, 4, 3, 5, 4, 7, 5, 8, 7, 10, 8, 12, . . . .
Proof. Let pk (n) denote the number of partitions of the nonnegative integer n into
k positive integer parts (summands). The order of the parts is unimportant, and for
uniformity we list them in nonincreasing order. For example, p3 (5) = 2, and the two
relevant partitions are 3 + 1 + 1 and 2 + 2 + 1.
The number of integer triangles of even perimeter is equal to the number of parti-
tions of half the perimeter into three parts:
\[ t(2n) = p_3(n), \quad n \ge 0. \]
\[ (a, b, c) \leftrightarrow \{n - a,\; n - b,\; n - c\}, \]
\[ p_3(n + 6) = p_3(n) + n + 3, \quad n \ge 0. \]
To prove this, consider the last summand in a partition of n + 6 into three parts. If this
summand is at least 3, then subtract 2 from each summand to obtain a partition of n
into three parts. If the last summand is a 2, then subtract 2 from each part to obtain a
partition of n into one or two parts. If the last summand is a 1, then subtract 1 from each
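part to obtain a partition of $n + 3$ into one or two parts. Using the obvious formulas
$p_1(n) = 1$ and $p_2(n) = \lfloor n/2 \rfloor$, where $\lfloor x \rfloor$ is the greatest integer less than or equal to
$x$, we obtain
\[ p_3(n + 6) = p_3(n) + 1 + \left\lfloor \frac{n}{2} \right\rfloor + 1 + \left\lfloor \frac{n + 3}{2} \right\rfloor, \quad n \ge 0, \]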
which simplifies to our desired recurrence relation by examining the cases n even and
n odd.
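The recurrence can be confirmed by direct enumeration. The sketch below counts partitions into exactly three parts by brute force and compares the result with the recurrence and with the nearest-integer formula $\|n^2/12\|$; the function names `p3` and `nearest_n2_over_12` are assumptions made for this illustration.

```python
def p3(n):
    """Number of partitions of n into exactly three positive parts a >= b >= c."""
    count = 0
    for c in range(1, n // 3 + 1):              # smallest part
        for b in range(c, (n - c) // 2 + 1):    # middle part; the bound forces a >= b
            count += 1                          # a = n - b - c is then determined
    return count


def nearest_n2_over_12(n):
    """Nearest integer to n^2 / 12, computed in exact integer arithmetic."""
    return (n * n + 6) // 12


# Initial values listed in the text: 0, 0, 0, 1, 1, 2.
assert [p3(n) for n in range(6)] == [0, 0, 0, 1, 1, 2]
# Recurrence p3(n + 6) = p3(n) + n + 3.
assert all(p3(n + 6) == p3(n) + n + 3 for n in range(100))
# Closed form via the nearest integer to n^2 / 12.
assert all(p3(n) == nearest_n2_over_12(n) for n in range(100))
```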
Initial values of the sequence { p3 (n)} can be obtained from the relation p3 (n) =
t (2n) or from scratch:
0, 0, 0, 1, 1, 2, . . . .
where $\|x\|$ is the nearest integer to $x$. (It is easy to show that $n^2/12$ is never a
half-integer.) This formula for $p_3(n)$ has the correct six initial values, and it satisfies the
recurrence relation:
\[ \left\| \frac{(n + 6)^2}{12} \right\| = \left\| \frac{n^2}{12} \right\| + n + 3, \quad n \ge 0. \]
(a, b, c) (a + 1, b + 1, c + 1),
From the above formula, we notice a couple of interesting features of Alcuin's
sequence. The only fixed point is $t(48) = 48$. Furthermore, the sequence has the zig-zag
property that
Like the Fibonacci sequence, Alcuin's sequence has a rational generating function.
Proof. We claim first that the triples $(a, b, c)$ that represent integer triangles (i.e.,
triples of positive integers satisfying $a \le b \le c < a + b$) are precisely the triples of
the form
and each way of writing $n - 3$ as a sum of 2's, 3's, and 4's leads to an $x^n$ term in the
sum on the right.
The denominator of the rational generating function yields an order-nine linear
recurrence relation for Alcuin's sequence.
Now the order-nine recurrence relation can be read off by equating coefficients of $x^n$
for $n \ge 9$.
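To illustrate how a recurrence is read off from a denominator, the sketch below assumes that the denominator is $(1 - x^2)(1 - x^3)(1 - x^4)$, which is what the description in terms of 2's, 3's, and 4's suggests; that product, and the helper names `t` and `poly_mul`, are assumptions of this example. The expanded coefficients are then tested against a brute-force count of integer triangles.

```python
def t(n):
    """Count integer triangles a <= b <= c < a + b with perimeter n (brute force)."""
    count = 0
    for a in range(1, n // 3 + 1):
        for b in range(a, (n - a) // 2 + 1):
            c = n - a - b           # c >= b by the loop bound
            if c < a + b:           # triangle inequality
                count += 1
    return count


def poly_mul(p, q):
    """Multiply polynomials given as coefficient lists (index = power of x)."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


# Assumed denominator (1 - x^2)(1 - x^3)(1 - x^4), expanded to degree nine.
denom = poly_mul(poly_mul([1, 0, -1], [1, 0, 0, -1]), [1, 0, 0, 0, -1])
assert denom == [1, 0, -1, -1, -1, 1, 1, 1, 0, -1]

# Equating coefficients of x^n for n >= 9: sum_j denom[j] * t(n - j) = 0, i.e.
# t(n) = t(n-2) + t(n-3) + t(n-4) - t(n-5) - t(n-6) - t(n-7) + t(n-9).
values = [t(n) for n in range(80)]
for n in range(9, 80):
    assert sum(d * values[n - j] for j, d in enumerate(denom)) == 0
```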
Theorem 4. For any integer $m \ge 2$, the sequence $\{t(n) \bmod m\}$ is periodic with period
$12m$. Moreover, the range of the sequence consists of all integers modulo $m$ if and only
if $m$ is one of the following:
This collection parallels a similar family of moduli for which the Fibonacci se-
quence contains all residues [3, p. 318].
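The period claim of Theorem 4 can be tested experimentally for small moduli. The sketch below uses the nearest-integer formula for $t(n)$ that the proof relies on ($\|n^2/48\|$ for even $n$, with $t(n) = t(n + 3)$ for odd $n$) and searches for the least period over a finite window. The names `t_closed` and `minimal_period`, and the window length of $48m$ terms, are assumptions chosen for this illustration; a finite check of this kind is evidence, not a proof.

```python
def t_closed(n):
    """Alcuin's sequence: nearest integer to n^2/48 (n even) or (n+3)^2/48 (n odd)."""
    k = n if n % 2 == 0 else n + 3
    return (k * k + 24) // 48


def minimal_period(m):
    """Least L with t(n + L) = t(n) (mod m) for every n in a window of 48*m terms."""
    horizon = 48 * m                      # hypothetical, generously long window
    seq = [t_closed(n) % m for n in range(2 * horizon)]
    for L in range(1, horizon + 1):
        if all(seq[n] == seq[n + L] for n in range(horizon)):
            return L
    return None


for m in range(2, 11):
    assert minimal_period(m) == 12 * m
```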
\[ t(2n + 12m) = \left\| \frac{(2n + 12m)^2}{48} \right\| = \left\| \frac{(2n)^2}{48} \right\| + nm + 3m^2 \equiv t(2n) \pmod{m}. \]
Since $t(n) = t(n + 3)$ for $n$ odd, the sequence $\{t(n) \bmod m\}$ is periodic with period
$L \le 12m$.
Now we will show that $L \ge 12m$. Let $\ell = L$ or $L - 1$, so that $\ell$ is even. We then
have $\|\ell^2/48\| \equiv 0 \pmod{m}$ and $\|(\ell + 2)^2/48\| \equiv 0 \pmod{m}$, since $t(-1) = t(0) =
t(1) = t(2) = 0$. Hence $m$ divides $\|(\ell + 2)^2/48\| - \|\ell^2/48\|$, which is nonzero
because $L > 12$ (since $\{t(n)\}$ is an aperiodic pattern of 0's and 1's for $-4 \le n \le 8$). This
difference is less than $(\ell + 2)^2/48 - \ell^2/48 + 1 = (\ell + 13)/12$, and it follows that
$12m < \ell + 13 \le L + 13$. By definition of period, $L$ is a divisor of $12m$. If $L < 12m$,
then $L \le 6m$, but this contradicts the inequality $12m < L + 13$, for $m \ge 3$. The
result is checked by inspection for $m = 2$. We conclude that $L \ge 12m$, and therefore
$L = 12m$.
The question of when the range of $\{t(n) \bmod m\}$ contains all integers modulo $m$
reduces to a question of when a certain set of polynomials represents all integers modulo
$m$. Our proof will use two venerable tools of number theory, namely Hensel's lemma
and quadratic nonresidues.
Letting $n = 12k + r$, where $0 \le r < 12$, we find from Theorem 1 that the values of
$t(n)$ are given by six quadratic polynomials:
\[ 3k^2, \quad 3k^2 + k, \quad 3k^2 + 2k, \quad 3k^2 + 3k + 1, \quad 3k^2 + 4k + 1, \quad 3k^2 + 5k + 2. \]
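The six polynomials correspond to the even residues $r = 0, 2, 4, 6, 8, 10$; odd arguments reduce to even ones through $t(n) = t(n + 3)$. A short check, using the nearest-integer formula for $t(n)$ and hypothetical names `t_closed` and `POLYS` introduced only for this sketch:

```python
def t_closed(n):
    """Alcuin's sequence: nearest integer to n^2/48 (n even) or (n+3)^2/48 (n odd)."""
    k = n if n % 2 == 0 else n + 3
    return (k * k + 24) // 48


# Residue r of n modulo 12 (r even) paired with the polynomial in k giving t(12k + r).
POLYS = {
    0: lambda k: 3 * k * k,
    2: lambda k: 3 * k * k + k,
    4: lambda k: 3 * k * k + 2 * k,
    6: lambda k: 3 * k * k + 3 * k + 1,
    8: lambda k: 3 * k * k + 4 * k + 1,
    10: lambda k: 3 * k * k + 5 * k + 2,
}

for k in range(200):
    for r, poly in POLYS.items():
        assert t_closed(12 * k + r) == poly(k)

# Odd n contributes no new values, because t(n) = t(n + 3) and n + 3 is even.
assert all(t_closed(n) == t_closed(n + 3) for n in range(1, 2000, 2))
```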
must be a square modulo $p$. However, by a theorem of André Weil (see, e.g., [9]),
there exist arbitrarily many consecutive quadratic nonresidues modulo p, if p is large
enough. The number N of sequences of h consecutive quadratic nonresidues modulo
p satisfies
p
N 3h p.
2h
ACKNOWLEDGMENTS. The authors wish to thank the referees for valuable suggestions that improved the
motivation and clarity of this paper.
REFERENCES
1. G. E. Andrews, A note on partitions and triangles with integer sides, Amer. Math. Monthly 86 (1979)
477–478; available at http://dx.doi.org/10.2307/2320420.
2. M. Erickson, Aha! Solutions, Mathematical Association of America, Washington, DC, 2009.
3. R. L. Graham, D. E. Knuth, O. Patashnik, Concrete Mathematics: A Foundation for Computer Science,
second edition, Addison-Wesley, Reading, MA, 1994.
DONALD J. BINDNER received his B.S. from Truman State University in 1992 and his Ph.D. from the
University of Georgia-Athens in 2001. His interests include computer programming and free software. He and
Martin recently wrote the book A Student's Guide to the Study, Practice, and Tools of Modern Mathematics
(CRC Press).
Department of Mathematics and Computer Science, Truman State University, Kirksville, MO 63501
dbindner@truman.edu
MARTIN ERICKSON received his B.S. and M.S. from the University of Michigan in 1985 and his Ph.D.
from the University of Michigan in 1987. His mathematical interests are combinatorics, number theory, and
problem solving.
Department of Mathematics and Computer Science, Truman State University, Kirksville, MO 63501
erickson@truman.edu
\[ 1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \cdots + \frac{1}{n}. \]
Since the harmonic series diverges, it follows that the overhang can be arranged to be
as large as desired, simply by using a suitably large number of blocks.
Figure 1. The total overhang of this tower of twenty blocks is $1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{20}$.
In practice, these special stacks are constructed from top to bottom: the top block
is placed so that its middle, balancing point is at the upper right corner of the second
block. Then the top two blocks are placed together so that their combined balancing
point is at the upper right corner of the third block, and so on.1
Okay, review is over; now for something new. Let's rescale our special stacks in
the vertical direction, so that each stack has height 1; the resulting stacks resemble
decks of playing cards, as indicated in Figure 2. We'll call these stacks the harmonic
http://dx.doi.org/10.4169/amer.math.monthly.119.02.122
MSC: Primary 26A06
1 Recently, stacks have been investigated for which it is permitted to place two blocks upon any lower
block. Stacking in this way, one can use some blocks as counterweights and thus achieve significantly greater
overhangs than with the staircases. See [5] for these recent results and a comprehensive bibliography for the
stacking problem.
staircases. Notice that we've arranged for the top right corner of the table to coincide
with the origin of our coordinate system.
We will prove (Theorem 2.1) that the sequence of harmonic staircases converges to
the harmonic stack, determined by the function $1 - e^{-x}$. And, just like the harmonic
staircases, the harmonic stack won't topple over. In fact, we will show (Theorem 3.2)
that the harmonic stack is stable in a correspondingly stricter sense.
Motivated by the limiting process above, in this article we shall consider general
stacks of width 2 and height 1. A general stack is not one solid piece, but rather consists
of infinitely many infinitely thin and unconnected horizontal blocks. Similar to a tower
of finite blocks, a general stack is capable of toppling at any level: to avoid toppling at
a given height, the center of mass of the stack above that height must lie directly above
the cross-section of the stack at that height. (By comparison, a solid stack is safe from
toppling just as long as its total center of mass lies above its base.)
As indicated above, the harmonic staircases are distinguished in the framework of
the original problem by each block being extended as far as possible. We will show
(Theorem 3.2) that the harmonic stack is similarly characterized among the stable
stacks: cutting the stack horizontally at any height into two pieces, the center of mass
of the top piece lies directly above the upper right corner of the lower piece.
The harmonic staircase consisting of n blocks has maximum overhang within a
natural class of stacks made up of the same blocks. Similarly, we will show (Theorem
6.1) that the harmonic stack is a fastest growing stable stack. What may be surprising is
that the harmonic stack is not the uniquely fastest growing stable stack (Theorem 6.2).
Other results include various methods of transforming stable stacks into new stable
stacks, and further characterizations of the harmonic stack amongst stable stacks. All
the arguments employed are elementary.
Getting ready to stack. The original stacking problem is posed in terms of three-
dimensional blocks. However, the harmonic staircases and all stacks that we are inter-
ested in are simply figures in the x y-plane, orthogonally extended in the z-direction.
Clearly, the two-dimensional stacks will be stable if and only if their extensions are. So,
we lose nothing by restricting ourselves to discussing and drawing two-dimensional
stacks. (In Section 7, we make a short excursion into the world of 3D blocks).
The $n$-block harmonic staircase $\mathrm{HAR}_n$ has piecewise constant stack function
\[ \mathrm{har}_n(y) = \sum_{m = n - \lfloor ny \rfloor}^{n} \frac{1}{m}, \]
where the floor function $\lfloor ny \rfloor$ is the largest integer $m \le ny$. We've shaded the graph of
$\mathrm{har}_6(y)$ in the picture of $\mathrm{HAR}_6$ in Figure 3. Essentially, it consists of the vertical
right-hand borders of the rectangular blocks in the stack.
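The stack function of the $n$-block staircase is simple to evaluate directly, which makes the convergence to the harmonic stack function $\mathrm{har}(y) = -\log(1 - y)$ easy to observe numerically. The sketch below is an illustration only; the function names `har_n` and `har` and the sample heights are assumptions of the example.

```python
import math


def har_n(n, y):
    """Stack function of the n-block harmonic staircase at height y in [0, 1)."""
    if not 0 <= y < 1:
        raise ValueError("y must lie in [0, 1)")
    lowest = n - math.floor(n * y)          # smallest index m in the sum
    return sum(1.0 / m for m in range(lowest, n + 1))


def har(y):
    """Stack function of the harmonic stack."""
    return -math.log(1.0 - y)


# The bottom block of HAR_6 overhangs the table by 1/6.
assert abs(har_n(6, 0.0) - 1 / 6) < 1e-12
# Near the top, all six terms 1 + 1/2 + ... + 1/6 contribute.
assert abs(har_n(6, 0.99) - 2.45) < 1e-12
# For large n the staircase is close to the harmonic stack.
for y in (0.1, 0.5, 0.9):
    assert abs(har_n(100000, y) - har(y)) < 1e-3
```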
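[Figure 3: the stack $\mathrm{HAR}_6$ resting on the table, with block overhangs $\frac{1}{6}, \frac{1}{5}, \frac{1}{4}, \frac{1}{3}, \frac{1}{2}, 1$.]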
As indicated in the picture, all stacks are resting on the $x$-axis, and the table extends
from $-\infty$ to 0, making the upper right corner of the table the origin. The weight of
part of a stack is simply its area. Since stacks are of height 1 and constant width 2, it
follows that all stacks have total weight 2.
2 The stack function being either Riemann or Lebesgue integrable suffices. It is also sufficient that the stack
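where $\gamma$ is the Euler-Mascheroni constant and $\epsilon_n \to 0$. Now let $y \in [0, 1)$. Then
\[ \mathrm{har}_n(y) = \sum_{m=1}^{n} \frac{1}{m} - \sum_{m=1}^{n-1-\lfloor ny \rfloor} \frac{1}{m}
 = -\log\left(1 - \frac{\lfloor ny \rfloor}{n}\right) + \log\frac{n+1}{n} + \epsilon_n - \epsilon_{n-1-\lfloor ny \rfloor}. \]
Consequently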
Figure 4. As one solid piece, this stack has a huge overhang and will not topple over.
4 The proof actually shows that, for $h < 1$, $\mathrm{har}_n(y)$ converges uniformly to $\mathrm{har}(y)$ on $[0, h]$. Note also that
Theorem 2.1 can be related to the Maclaurin expansion of the harmonic stack function:
\[ \mathrm{har}(y) = -\log(1 - y) = y + \frac{y^2}{2} + \frac{y^3}{3} + \frac{y^4}{4} + \cdots. \]
For the limiting value $y = 1$ this identity says that the overhang of the harmonic stack is equal to the limit of
the harmonic series.
The gravity function and the gravity curve of a stack. More formally, consider a
general stack S, and fix a height y. Consider the slab of S lying above y, and let g(y) be
the x-coordinate of the center of mass of this slab. Then we call (g(y), y) the gravity
point of S at height y. Notice that a gravity point always lies directly below the center
of mass of the slab defining it, as pictured in Figure 5. We also call the function g(y)
the gravity function of the stack S, and the graph of x = g(y) is the gravity curve of S.
Figure 5. The gravity point at height y is directly below the center of mass of the top slab.
Proposition 3.1 (Equation of the gravity function). Suppose $S$ is a stack with stack
function $f$. Then the gravity function of $S$ is continuous and is given by
\[ g(y) = \frac{\int_y^1 (f(t) - 1)\,dt}{1 - y}. \]
At every point, g is differentiable from the right, and the above equation for f holds
everywhere with the derivative so interpreted. Consequently, the stack function
uniquely determines the gravity function and vice versa.5
Proof. If we consider the stack $S$ to be made of infinitesimally thin blocks, then the
block at height $t$ has mass $2\,dt$, and the $x$-coordinate of its center of mass is $f(t) - 1$.
It then follows that
\[ g(y) = \frac{\int_y^1 (f(t) - 1)\, 2\,dt}{\text{mass of the slab above height } y}
 = \frac{\int_y^1 (f(t) - 1)\,dt}{1 - y}. \]
5 In the more general setting of Lebesgue integrable functions, the gravity function g is absolutely continu-
From here on, $g'(y)$ shall always denote, if need be, the right derivative of the
gravity function $g$. The previous result then promises that this right derivative always
exists.
Normalizing the tables. We say that a stack S is stable if S contains its gravity curve.
If f is the stack function of S, this is the case exactly when
Further, we say that a stack is balanced at 0 if g(0) = 0: that is, if the center of mass
of the whole stack is above the top right corner of the table.
It is easy to check that all harmonic staircases are stable stacks balanced at 0. In
fact, a harmonic staircase is constructed exactly so that its gravity curve will contain
the top right corner of the table, as well as the top right corners of all but the topmost
block; through the top block, the gravity curve is simply a vertical line directly up the
middle.
As part of the next result, we prove that HAR is balanced at 0. As well, it is obvi-
ous that any stack can be translated to be balanced at 0. This justifies the following
normalization:
From here on, we shall consider only those stacks that are balanced at 0.
Figure 6. The gravity curves of the vertical stack and a harmonic staircase.
The gravity curve of the harmonic stack. Intuitively, the stack functions of HARn
approximate the gravity functions of HARn , suggesting that the stack and gravity func-
tions of HAR should coincide. This is indeed the case.
Theorem 3.2 (The harmonic stack and gravity functions coincide). The harmonic
stack HAR is the unique stack whose gravity function and stack function coincide. In
particular, HAR is stable.
Proof. Suppose $S$ is a stack with stack function $f$. By Proposition 3.1, the stack and
gravity functions of $S$ will coincide if and only if $f$ is continuous and
\[ \frac{\int_y^1 (f(t) - 1)\,dt}{1 - y} = f(y). \]
Multiplying through by $1 - y$ and differentiating gives $-(f(y) - 1) = -f(y) + (1 - y) f'(y)$,
and so
\[ f'(y) = \frac{1}{1 - y}. \]
Antidifferentiating gives
\[ f(y) = -\log(1 - y) + C. \]
Since we have normalized to have all stacks balanced at 0, the only possibility is
$C = 0$, giving the harmonic stack.
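The coincidence of the stack and gravity functions of the harmonic stack can also be seen numerically from the formula of Proposition 3.1. The sketch below approximates the integral with a midpoint rule, which never evaluates the integrand at the singular endpoint $t = 1$; the function names `gravity` and `har`, the step count, and the tolerance are assumptions chosen for this illustration.

```python
import math


def har(y):
    """Stack function of the harmonic stack."""
    return -math.log(1.0 - y)


def gravity(f, y, steps=100000):
    """Approximate g(y) = (integral from y to 1 of (f(t) - 1) dt) / (1 - y) by the midpoint rule."""
    h = (1.0 - y) / steps
    total = sum(f(y + (i + 0.5) * h) - 1.0 for i in range(steps))
    return total * h / (1.0 - y)


# For the harmonic stack the gravity function agrees with the stack function.
for y in (0.0, 0.3, 0.6, 0.9):
    assert abs(gravity(har, y) - har(y)) < 1e-4

# For comparison, the vertical stack f = 1 has gravity function identically 0.
assert abs(gravity(lambda t: 1.0, 0.25)) < 1e-12
```

The value at $y = 0$ also confirms that the harmonic stack is balanced at 0.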
Proposition 3.3. The gravity functions of the stacks $\mathrm{HAR}_n$ converge pointwise to har.
Proof. Let $g_n$ be the gravity function of $\mathrm{HAR}_n$ and let $g$ be the gravity function of
HAR. We prove that, as suggested above, $g_n(y) - \mathrm{har}_n(y) \to 0$ pointwise on $[0, 1)$.
The proposition then follows immediately from Theorem 2.1 and Theorem 3.2.
Fix $y \in [0, 1)$, suppose $n \in \mathbb{N}$, and set $m = \lfloor ny \rfloor$. Then $y \in \left[\frac{m}{n}, \frac{m+1}{n}\right)$. If $n$ is large
then $n > m + 1$, and therefore
\[ g_n\!\left(\frac{m}{n}\right) \le g_n(y) \le \mathrm{har}_n(y) = g_n\!\left(\frac{m+1}{n}\right) = g_n\!\left(\frac{m}{n}\right) + \frac{1}{n - m}. \]
From the proof of Theorem 2.1, we know that $\frac{1}{n - m} = \frac{1}{n - \lfloor ny \rfloor} \to 0$. It follows that
$g_n(y) - \mathrm{har}_n(y) \to 0$, as desired.
4. CUT AND PASTE. The following diagram shows two stacks S and T together
with their gravity curves. We now slice both stacks at height t, and then combine them
to make a new stack as pictured: the bottom slab of S stays fixed, and the top slab of
T is horizontally translated so that the ends of the gravity curves coincide. We will
denote this new stack by S \t T . From Proposition 3.1 we know that gravity functions
are continuous. It then follows immediately from the definition of the gravity curve
that the gravity curve of S \t T is exactly the union of the two part-curves.
We immediately conclude the following.
Proposition 4.1 (Properties of cut and pasted stacks). Let S and T be two stacks. If
both S and T are stable then so is S \t T .
Here is a nice application of this construction. Suppose that S and T are stable
stacks. Then St = S \t T, t [0, 1] is a continuous deformation of T into S with all
intermediate stacks being stable.
Figure 8. The stack $\mathrm{har}_2 \backslash_{1/2}\, \mathrm{har}_4$. The gray curve is the gravity curve.
Cutting and pasting is also a useful technique for transforming finite stacks of
rectangular blocks. As an example, the stack $\mathrm{har}_2 \backslash_{1/2}\, \mathrm{har}_4$ consists of the three blocks
pictured in Figure 8, with the lower two overhangs of length $\frac{1}{2}$.
Now replace the top block, by cutting and pasting with $\mathrm{har}_8$ at $t = \frac{3}{4}$, giving
$(\mathrm{har}_2 \backslash_{1/2}\, \mathrm{har}_4) \backslash_{3/4}\, \mathrm{har}_8$. Continuing this process forever and taking the limit, we arrive
at the infinite-block stack
The stack HALF consists of blocks of heights $\frac{1}{2}, \frac{1}{4}, \frac{1}{8}, \frac{1}{16}, \ldots$, with all overhangs
of length $\frac{1}{2}$. So, HALF has infinite overhang. Also, by applying Proposition 3.1 one
Figure 9. Finite stacks converging to HALF. The gray curve is the gravity curve.
Figure 10. Stretching and translating the top part of a stack into a new stack.
The stack function for t S is easily determined from the construction, and then the
gravity function is easily determined from Proposition 3.1. We summarize this in the
following proposition.
Proposition 5.1 (The defining functions of stretched stacks). Let $S$ be a stack with
stack function $f$ and gravity function $g$, and let $t \in [0, 1)$. Then t S has stack function
\[ f(y(1 - t) + t) - g(t), \]
and gravity function
\[ g(y(1 - t) + t) - g(t). \]
If $S$ is stable then so is t S.
\[ f(y) = a \log(1 - y) + a + 1 \]
and
\[ g(y) = a \log(1 - y) \]
for some $a \in \mathbb{R}$.
Notice that $a = 0$ gives the vertical stack, and $a = -1$ gives HAR. Also, $a = 1$ gives
HAR reflected in the $y$-axis. It is clear that the stable stacks are given by $a \in [-1, 1]$,
and so $a = \pm 1$ give the extreme stacks.
Proof of Theorem 5.2. Since t S = S, the gravity curves of the two stacks have to be
identical. So, by Proposition 5.1,
\[ g'(y) = \frac{g'(0)}{1 - y}, \]
where $g'(y)$ denotes the right derivative of $g$. Since $g(0) = 0$, we can conclude that
\[ g(y) = a \log(1 - y) \]
Notice that the HALF stack introduced in the previous section is self-similar for
infinitely many values of t: it is easy to check that
g 0 , obtaining the desired expression for g. See, for example, [1, Theorem 7.1].
not unique in this regard, as an adaptation of HARn does just as well, as shown in
Figure 11.
In this section we will prove similar results for HAR. First, we prove (Theorem
6.1) that HAR has the fastest growing gravity function amongst stable stacks. We then
prove (Theorem 6.2) that HAR has one of the fastest growing stack functions, but that
it is not unique in this regard.
In what follows, we will consider stacks growing to the right of the table. Of course
stacks can also grow to the left, and there are obvious left versions of all our results
below.
Theorem 6.1 (HAR has the fastest growing gravity function). Suppose $S$ is a stable
stack with gravity function $g$. Then $\mathrm{har} - g$ is a nondecreasing and nonnegative
function.
Note that it is possible that g = har on an initial interval [0, t]. However, Theorem
6.1 implies that once har gets in front of g, it remains so.
Proof of Theorem 6.1. Since $\mathrm{har}(0) = g(0) = 0$, we only need to prove that $\mathrm{har} - g$
is nondecreasing. To do this, assume by way of contradiction that $\mathrm{har}'(t) - g'(t) < 0$
for some $t \in [0, 1)$, where $g'(t)$ refers to the right derivative of $g$. Let $h$ be the gravity
function of t S. From Proposition 5.1 and the self-similarity of HAR, it follows that
har =t har < h on some interval (0, s].
Now consider the stack T = (t S) \s HAR. By Proposition 4.1 and Proposition 5.1,
T is stable. Further, if k is the gravity function and f is the stack function of T , then
the stability of T and the choice of s ensures that har < k f on (0, 1); see Figure 12.
Figure 12. The gravity function k of the stack T is to the right of har.
Having proved the gravity function har of HAR is dominant, what can we say about
har as a stack function relative to other stack functions? Certainly, HAR need not al-
ways be in front of other stacks. For example, the vertical stack begins in front of HAR,
before being overtaken.
On the other hand, to have a stable stack S strictly in front of HAR from a certain
height on is impossible: if this were the case, then above that height the gravity function
of S would also be in front of har, contradicting our previous theorem. However, there
do exist stable stacks that effectively compete for the lead.
Theorem 6.2 (The fastest growing stack functions). There are no stable stacks that
stay ahead of HAR for all $y$ near 1. However, there do exist stacks that grow as fast as
HAR. That is, there is a stable stack $S$, with stack function $f$, such that $f(y) - \mathrm{har}(y)$
changes sign infinitely often as $y$ approaches 1.
Proof of Theorem 6.2. We have already argued that a stable stack cannot stay ahead of
HAR for all y near 1. We will now construct a stack S that repeatedly alternates with
HAR for the lead: S has stack function f satisfying
To see how to construct such a stack S, we first assume that S has the desired lead-
changing property, and we use this to derive explicit sufficient conditions for the stack
function f . We then show that f can indeed be chosen so that these conditions are
satisfied.
We begin by considering the gravity function g of S. Since S is assumed stable, it
follows that for any y for which har(y) < f (y), we must also have
(0) = 1.
We now assume that (y) is differentiable. Then, using Proposition 3.1, we can
calculate
Finally, S being ahead of HAR at height y amounts to f (y) > har(y). So, for the
lead-changing property, it suffices to have
(0) = 1
lim (y) = 0
y1
0 (y)(1 y) 2 for all y
0 (y)(1 y) > 1 for values of y arbitrarily close to 1.
These conditions will guarantee that the corresponding stack S is stable and balanced
at 0, and will repeatedly overtake HAR.
To construct an explicit function satisfying these conditions, we shall take to
have constant segments that are connected by small S-bends of just the right size and
slope. To do this, we first define a suitable prototype S-bend; see Figure 13:
\[ B(y) = \frac{5}{4}\, y - \frac{1}{4}\, y^5, \quad y \in [-1, 1]. \]
Note that $B(\pm 1) = \pm 1$ and $B'(\pm 1) = 0$. Also, $B' \ge 0$ on $[-1, 1]$, with a maximum
of $\frac{5}{4}$. We also take $B(y) = 1$ for $y > 1$ and $B(y) = -1$ for $y < -1$.
We now define by subtracting a sum of suitable linear transformations of B.
Specifically, define
X
(y) = 1 Bn (y),
n=1
1 5
0 (y)(1 y) 5 2n2 = > 1.
2n 4
Figure 16. Reflecting the top slab of a stack about an axis through its gravity point.
Figure 17. Constructing a stable stack with infinite overhang to both the left and right.
Recall the stack HALF constructed at the end of Section 4, consisting of blocks of
heights $\frac{1}{2^n}$, each placed with overhang $\frac{1}{2}$. We now flip this stack above the 1st, 3rd,
6th, and in general the $\frac{(n+1)n}{2}$th block; each pair of flips results in the stack extending
$\frac{1}{2}$ further in both directions. Using Proposition 3.1, it is then easy to show that the
limiting result of these flips is a stable stack $S$, with stack function having unbounded
oscillation in both directions as $y \to 1$.
Note also that this flipping procedure can be used to construct a stack that continu-
ally overtakes the harmonic stack, similar to that constructed at the end of the previous
section. For this we modify the harmonic stack by flipping out infinitely many small
horizontal slivers that get arbitrarily close to the top of the stack. It is then possible to
arrange for these slivers to jut beyond the harmonic stack.
We now momentarily venture into the world of 3-dimensional blocks. We'll create a
stack that casts a shadow over the whole $xz$-plane. (We'll continue to label the vertical
direction as $y$.)
Begin with the oscillating stack S just constructed. Notice that there are infinitely
many blocks of S such that a top corner of the block lies above the origin and the grav-
ity curve of S also passes through that corner. Now, thicken the blocks in S to have
a thickness d in the z direction, giving a 3D stack $\hat{S}$. Next, take any fixed irrational
number a, and consider the angle $\theta = a\pi$. Finally, at the height of each of the
distinguished corners of S above the origin, successively rotate the top slab of $\hat{S}$ through
the angle $\theta$ around the y-axis.
Any angle $\varphi$ is closely approximated (modulo $2\pi$) by arbitrarily large integer multiples
of $\theta$. It follows that, no matter how small the thickness d, any point in the direction $\varphi$
will eventually lie under some block of $\hat{S}$. Hence $\hat{S}$ casts a shadow over the whole
xz-plane.
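The density claim behind this argument can be checked numerically. The following Python sketch is only an illustration: it assumes the rotation angle is an irrational multiple of $\pi$, and the choice $a = \sqrt{2}$, the target direction, the tolerance, and the helper name `first_multiple_near` are all hypothetical.

```python
import math

def first_multiple_near(theta, target, tol, min_n=1, max_n=1_000_000):
    """Smallest n in [min_n, max_n] with n*theta within tol of target (mod 2*pi), else None."""
    two_pi = 2.0 * math.pi
    for n in range(min_n, max_n + 1):
        diff = (n * theta - target) % two_pi
        if min(diff, two_pi - diff) < tol:   # circular distance to the target direction
            return n
    return None

a = math.sqrt(2)          # illustrative irrational number (an assumption)
theta = a * math.pi       # rotation angle
target = 1.0              # arbitrary direction, in radians
n1 = first_multiple_near(theta, target, tol=1e-2)
# A second, larger multiple that also lands near the same direction.
n2 = first_multiple_near(theta, target, tol=1e-2, min_n=n1 + 1)
```

Repeating the search with `min_n` just past the previous hit keeps producing larger multiples, which is the "arbitrarily large integer multiples" part of the argument.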
Proposition 8.1 (Balancing the exponential function). The region under the graph
of the function $e^x$ to the left of the point $x = a$ balances over a fulcrum at $x = a - 1$.
Proof. Using the standard formula, we find that the x-coordinate of the center of mass
of the tail region is
$$\frac{\int_{-\infty}^{a} x e^x \, dx}{\int_{-\infty}^{a} e^x \, dx} = a - 1.$$
Figure 18. The tail of the region under $y = e^x$ always balances 1 unit to the left of the cut.
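As a sanity check on this balancing property, the center of mass of the tail can be approximated numerically. This is a rough sketch: the midpoint rule, the truncation depth, and the function name `tail_centroid` are choices made here for illustration.

```python
import math

def tail_centroid(a, depth=60.0, steps=200_000):
    """Approximate x-coordinate of the center of mass of the region under e^x on (-inf, a]."""
    lo = a - depth                 # truncate the tail; e^(a - 60) is negligible next to e^a
    h = (a - lo) / steps
    moment = 0.0
    area = 0.0
    for k in range(steps):
        x = lo + (k + 0.5) * h     # midpoint of the k-th subinterval
        w = math.exp(x) * h        # area of the thin strip at x
        moment += x * w
        area += w
    return moment / area
```

For any cut point `a`, `tail_centroid(a)` should agree with `a - 1` up to the discretization error.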
Figure 19 shows the region between the graph of the exponential function $e^x$ and its
horizontal translate $e^{x-2}$, truncated at a certain height. If the height is less than 1, then
this is exactly a top slab of the harmonic stack, rotated 180 degrees about the point
(0, 1/2). What we want to prove is that no matter where we cut, the shaded region
balances with the fulcrum at a: this establishes again that the gravity function and the
stack function of HAR are identical.
Figure 20. The sliver trapped by $y = e^x$ and its translate $y = e^{x-d}$ balances over a pivot at $a - 1$.
Now set the horizontal difference to be $d = 2/n$. Then $n + 1$ copies of the sliver fit
together seamlessly into the shaded region in Figure 19, with a few curvy triangles
missing at the top and an extra sliver sticking out on the right; see Figure 21.
Figure 22. A stable non-harmonic stack together with its gravity curve.
Now, if we slide the top slab slightly to the right, then its gravity point a will stay
within the top of the middle slab, ensuring that the top slab will not topple. However,
sliding the top slab will move the gravity point of the whole stack to the right, and the
stack will not be balanced at 0.
We can avoid this, however, by simultaneously sliding the top slab to the right and
the middle slab to the left. Clearly, we can do this in such a way that the gravity point
b of the combined top and middle slabs stays fixed, thereby leaving the gravity curve
inside the bottom slab unchanged. This means that the adjusted stack still balances,
REFERENCES
1. D. Bressoud, A Radical Approach to Lebesgue's Theory of Integration, MAA Textbooks, Cambridge
University Press, Cambridge, 2008.
2. J. Bryant, C. Sangwin, How Round is Your Circle? Princeton University Press, Princeton, 2008.
3. R. Courant, H. Robbins, I. Stewart, What is Mathematics? Oxford University Press, New York, 1996.
4. J. F. Hall, Fun with stacking blocks, Amer. J. Phys. 73 (2005) 1107–1116; available at
http://dx.doi.org/10.1119/1.2074007.
5. M. Paterson, Y. Peres, M. Thorup, P. Winkler, U. Zwick, Maximum overhang, Amer. Math. Monthly 116
(2009) 763–787; available at http://dx.doi.org/10.4169/000298909X474855.
BURKARD POLSTER received his Ph.D. in 1993 from the University of Erlangen-Nürnberg in Germany.
He currently teaches at Monash University in Melbourne, Australia. Readers may be familiar with some of his
books dealing with fun and beautiful mathematics such as The Mathematics of Juggling, Q.E.D.: Beauty in
Mathematical Proof, or the Shoelace Book.
School of Mathematical Sciences, Monash University, Victoria 3800, Australia
Burkard.Polster@monash.edu
MARTY ROSS is a mathematical nomad. He received his Ph.D. in 1991 from Stanford University. Burkard,
Marty, and their mascot the QED cat are Australia's tag team of mathematics. They have a weekly column
in Melbourne's AGE newspaper and are heavily involved in the popularization of mathematics. Their various
activities can be checked out at http://www.qedcat.com. When he is not partnering Burkard, Marty enjoys
smashing calculators with a hammer.
PO Box 83, Fairfield, Victoria 3078, Australia
martiniross@gmail.com.au.
DAVID TREEBY studied mathematics at Monash University in Australia where he graduated in 2005. He
currently teaches mathematics to high school students at Presbyterian Ladies' College in Melbourne, Australia.
He delights in exploring beautiful mathematics with students at his school.
Presbyterian Ladies' College, 141 Burwood Hwy, Burwood, Victoria 3125, Australia
david.treeby@gmail.com.
Abstract. We use the intrinsic diameter distance to describe when a Riemann map has a con-
tinuous extension to the closed unit disk.
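Here $\bar\Omega_d$ and $\partial_d\Omega := \bar\Omega_d \setminus \Omega_d$ are the metric completion and metric boundary of the
metric space $\Omega_d := (\Omega, d)$, where d is the diameter distance on $\Omega$. Also, in (a) $\Rightarrow$
(b) we have $g = i \circ h$, where $i : \bar\Omega_d \to \bar\Omega$ is the extension of the identity map $\Omega_d \to \Omega$.
See §2.C for definitions.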
Figure 1 illustrates a simple non-Jordan domain that satisfies conditions (a) through
(e); Figure 2 pictures two domains that do not.
It is straightforward to check that (b) implies (c) and that (d) implies (e). We refer
to [4, Theorem 2.1, p. 20] for a proof of the nontrivial fact that (e) implies (a). In §3
we verify that (a) implies (b) and that (c) implies (d).
Our ideas and proofs should be accessible to students possessing basic knowledge
of complex analysis and plane topology, and could possibly serve as a capstone
experience for undergraduate mathematics majors. The reader is forewarned that our proofs
employ basic plane and metric topology arguments; the role of holomorphicity is
explained in §2.B.
http://dx.doi.org/10.4169/amer.math.monthly.119.02.140
MSC: Primary 30C20, Secondary 30C35, 30J99
[Figure: a Riemann map f from the unit disk onto $\Omega$; the marked point in $\partial\Omega$ has two preimages in $\partial_d\Omega$ and in $\partial\mathbb{D}$.]
2. PRELIMINARIES.
2.A. Basic notation and terminology. Throughout this article $\Omega$ denotes a simply
connected bounded domain in the complex plane $\mathbb{C}$. We write $D(a; r) := \{z \in \mathbb{C} \mid
|z - a| < r\}$ for the open disk centered at $a \in \mathbb{C}$ with radius $r > 0$. Then $\mathbb{D} := D(0; 1)$
is the unit disk with boundary $\mathbb{T} := \partial\mathbb{D} = \{e^{it} \mid t \in [0, 2\pi]\}$, the unit circle.
A path is a continuous map of a compact interval, and unless explicitly indicated
otherwise, we assume that the parameter interval is [0, 1]. We use the phrase path in
$\Omega$ with a terminal endpoint in $\partial\Omega$ to describe a path
$$\gamma : [0, 1] \to \Omega \cup \{\zeta\} \quad\text{with}\quad \gamma([0, 1)) \subset \Omega \ \text{ and } \ \gamma(1) = \zeta \in \partial\Omega.$$
We write $|\gamma| := \gamma([0, 1])$ for the image of the path $\gamma$. However, we write [a, b]
both for the Euclidean line segment joining a and b as well as the affine path $[0, 1] \ni
t \mapsto a + t(b - a)$; the reader can distinguish these two meanings by context.
A path $\gamma$ joins $\gamma(0)$ to $\gamma(1)$. When $\gamma(0) = \gamma(1)$, we call $\gamma$ a closed path. By a
curve we mean the image of a path, a closed curve is the image of a closed path, and
an arc is the image of an injective path. A crosscut of $\Omega$ is an arc with endpoints in
$\partial\Omega$ and all other points in $\Omega$. An endcut of $\Omega$ is an arc having one endpoint in $\partial\Omega$ and all
other points in $\Omega$.
We note that every path contains an injective subpath that joins its endpoints; see [5].
A (closed) Jordan curve is a topological circle, that is, the homeomorphic image
of $\mathbb{T}$. We call $\Gamma$ a plane Jordan curve if $\Gamma$ is a Jordan curve in $\mathbb{C}$; in this setting,
the Jordan curve theorem asserts that $\mathbb{C} \setminus \Gamma$ has exactly two components: the bounded
component $\operatorname{int}(\Gamma)$ called the interior of $\Gamma$ and the unbounded component $\operatorname{ext}(\Gamma)$ called
the exterior of $\Gamma$.
2.B. Riemann maps. A Riemann map $f : \mathbb{D} \to \Omega$ is a holomorphic homeomorphism;
complex analysts call such an f a conformal map. We require two properties of
Riemann maps. First, according to [4, Prop. 2.14, p. 29], each path in $\Omega$ with a terminal
endpoint in $\partial\Omega$ will have a preimage that is a path in $\mathbb{D}$ with a terminal endpoint in
$\mathbb{T}$. This fact, whose proof is based on a length-area estimate known as Wolff's lemma
(see [4, Prop. 2.2, p. 20]), does not require that f have a continuous extension to the
closed disk.
Then f = 0.
Proof. We assume that A {eit | 0 t 2/n} for some positive integer n. Define
Then F is holomorphic and bounded in $\mathbb{D}$ with the property that for each $\zeta \in \mathbb{T}$,
$$\lim_{\substack{z \to \zeta \\ z \in \mathbb{D}}} F(z) = 0.$$
For the slit disk, pictured in Figure 1, the map i is surjective but not injective. For the
two domains pictured in Figure 2, i is neither surjective nor injective; here the left-
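There is also a natural connection between $\partial_d\Omega$ and endcuts of $\Omega$. To see this, let
$\gamma$ be a path in $\Omega$ with terminal endpoint $\gamma(1) = \zeta$ in $\partial\Omega$. By choosing a sequence of
points along $\gamma$ that is d-Cauchy but nonconvergent in $\Omega_d$, we obtain a point $\xi \in \partial_d\Omega$
with $i(\xi) = \zeta$. The continuity of $\gamma$ at $t = 1$ ensures that $\lim_{t \to 1} d(\gamma(t), \xi) = 0$, so
we can define a continuous path $\gamma_d$ in $\bar\Omega_d$ by
$$\gamma_d(t) := \begin{cases} \gamma(t) & \text{if } t \in [0, 1), \\ \xi & \text{if } t = 1. \end{cases}$$
Thus each such path $\gamma$ corresponds to a path $\gamma_d$ in $\bar\Omega_d$ with a unique terminal endpoint
$\gamma_d(1) \in \partial_d\Omega$ and with the property that $\gamma = i \circ \gamma_d$, so $\gamma(1) = i(\gamma_d(1))$.
Let $\alpha$ and $\beta$ be two paths in $\Omega$ with terminal endpoints in $\partial\Omega$. We declare $\alpha$ and
$\beta$ to be d-equivalent provided $\lim_{t \to 1} d(\alpha(t), \beta(t)) = 0$. There is a natural one-to-one
correspondence between $\partial_d\Omega$ and the equivalence classes of such paths. A general
discussion, with detailed proofs, can be found in [2].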
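In §3 we use the following elementary facts.
2.3 Lemma. If $\bar\Omega_d$ is compact, then i is surjective; in particular, $i(\partial_d\Omega) = \partial\Omega$.
$$|\zeta - i(\xi)| \le |\zeta - z_{n_k}| + |z_{n_k} - i(\xi)| \le |\zeta - z_{n_k}| + d(z_{n_k}, \xi) \to 0 \quad\text{as } k \to \infty,$$
it follows that $\zeta = i(\xi)$.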
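Proof that h is continuous. Using the fact that $\Omega_d \xrightarrow{\ \mathrm{id}\ } \Omega$ is a homeomorphism we see
that $h|_{\mathbb{D}} = \mathrm{id}^{-1} \circ g|_{\mathbb{D}}$ is continuous. Thus it suffices to check that h is continuous at
each point of $\mathbb{T}$. Let $a \in \mathbb{T}$ and $\varepsilon > 0$ be given. Select $\delta > 0$ so that $g(\bar{\mathbb{D}} \cap D(a; \delta)) \subset
D(g(a); \varepsilon/4)$. Let $z \in \mathbb{D} \cap D(a; \delta/2)$.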
The continuity of g, in conjunction with the definition of h, guarantees that
a0 , z 0 D D(a; ) and d(h(a0 ), h(a)) < /4, d(h(z 0 ), h(z)) < /4.
and therefore
Proof that h is injective. Let a and b be distinct points in $\mathbb{T} = \partial\mathbb{D}$ and let I and J be
the components of $\mathbb{T} \setminus \{a, b\}$ (so I and J are open subarcs of $\mathbb{T}$). We demonstrate that
if $h(a) = h(b)$, then $\min\{\operatorname{diam}[g(I)], \operatorname{diam}[g(J)]\} = 0$; since this would contradict
(2.1), it follows that $h(a) \ne h(b)$.
As there is no harm in doing so, we assume that $g(0) = 0$. Then the paths $\alpha := g \circ
[0, a]$ and $\beta := g \circ [0, b]$, given by $\alpha(t) := g(ta)$ and $\beta(t) := g(tb)$, define endcuts of
$\Omega$ both having initial endpoint $\alpha(0) = g(0) = 0 = \beta(0)$ and with terminal endpoints
$\alpha(1) = g(a)$, $\beta(1) = g(b)$ in $\partial\Omega$. See Figures 3 and 4.
Figure 3. The paths $\alpha$ and $\beta$.
Now suppose that $\xi := h(a) = h(b)$. Then $\zeta := i(\xi) = g(a) = g(b)$, $\alpha$ and $\beta$ are
d-equivalent paths that join 0 to $\zeta$, and $C := |\alpha| \cup |\beta|$ is a plane Jordan curve
in $\Omega \cup \{\zeta\}$. Note that $C \cap \partial\Omega = \{\zeta\}$; however, in general, $D := \operatorname{int}(C)$ need not be
contained in $\Omega$. See Figure 3.
Let $\varepsilon > 0$ be given. We show that either g(I) or g(J) has diameter smaller than $\varepsilon$.
We assume that $\varepsilon < |\zeta|/10$. It is helpful to examine topology in the unit disk picture
in Figure 4, but the reader must remember that g is only a homeomorphism in $\mathbb{D}$.
The continuity of D , in conjunction with the d-equivalence of and , guar-
antees the existence of a (0, 1) such that
:= |[R,S] and := g 1 B .
(S)
( )
0
Figure 5. The path and its subpath .
By its definition, is an injective path in that joins (r ) to (s) and is such that
|| C = {(r ), (s)}. Thus || is a crosscut either of D = int(C) or of E := ext(C)
(and || separates 0 and , either in D or in E). To see that the former holds (i.e., that
|| is a crosscut of D), note that there are (closed) Jordan curves
with
00 , 01 ( { }) D( ; /5),
and
3.B. (c) $\Rightarrow$ (d). We assume that $\partial_d\Omega$ is a closed Jordan curve. In particular then,
$\partial_d\Omega$ is compact. To see that $\partial\Omega$ is a closed curve, it suffices to verify that $i(\partial_d\Omega) = \partial\Omega$.
This latter requirement follows from Lemmas 2.3 and 2.4.
ACKNOWLEDGMENTS. The author thanks the referees for their helpful comments. He was partially sup-
ported by the Charles Phelps Taft Research Center.
REFERENCES
1. D. Freeman, D. Herron, Bilipschitz homogeneity and inner diameter distance, J. Anal. Math. 111 (2010)
1–46; available at http://dx.doi.org/10.1007/s11854-010-0011-6.
2. D. Herron, Geometry and topology of intrinsic distances, J. Anal. 18 (2010) 197–231.
3. R. Näkki, J. Väisälä, John disks, Expo. Math. 9 (1991) 3–43.
4. C. Pommerenke, Boundary Behavior of Conformal Maps, Grundlehren der mathematischen
Wissenschaften, no. 299, Springer-Verlag, Berlin, 1992.
5. J. Väisälä, Exhaustions of John domains, Ann. Acad. Sci. Fenn. Math. 19 (1994) 47–57.
Abstract. The aim of this note is to introduce a new technique for proving and discovering
some inequalities.
$$\frac{a}{b+c} + \frac{b}{c+a} + \frac{c}{a+b} \ge \frac{3}{2}, \qquad a, b, c > 0. \tag{1}$$
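$$= 3\sum_{n=1}^{\infty} \frac{a^n + b^n + c^n}{3} \ge 3\sum_{n=1}^{\infty} \left(\frac{a+b+c}{3}\right)^{n} = 3\sum_{n=1}^{\infty} \left(\frac{1}{3}\right)^{n} = 3 \cdot \frac{\frac{1}{3}}{1 - \frac{1}{3}} = \frac{3}{2},$$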
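2. DISCOVERING NEW INEQUALITIES. Vasile Cîrtoaje [2] states that for all
nonnegative real numbers $a_1, a_2, \ldots, a_k < 1$ satisfying
$$a = \sqrt{\left(a_1^2 + a_2^2 + \cdots + a_k^2\right)/k} \ \ge\ \sqrt{3}/3,$$
we have
$$\frac{a_1}{1 - a_1^2} + \frac{a_2}{1 - a_2^2} + \cdots + \frac{a_k}{1 - a_k^2} \ \ge\ \frac{ka}{1 - a^2}. \tag{2}$$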
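Here we give a sort of extension. Note that, by using (2) and the inequality between
the quadratic mean and the arithmetic mean, we obtain
$$\left(\frac{a_1}{1 - a_1^2}\right)^{2} + \left(\frac{a_2}{1 - a_2^2}\right)^{2} + \cdots + \left(\frac{a_k}{1 - a_k^2}\right)^{2} \ \ge\ \frac{\left(\frac{a_1}{1 - a_1^2} + \frac{a_2}{1 - a_2^2} + \cdots + \frac{a_k}{1 - a_k^2}\right)^{2}}{k} \ \ge\ \frac{ka^2}{(1 - a^2)^2}.$$
Thus the inequality
$$\frac{a_1^2}{(1 - a_1^2)^2} + \frac{a_2^2}{(1 - a_2^2)^2} + \cdots + \frac{a_k^2}{(1 - a_k^2)^2} \ \ge\ \frac{ka^2}{(1 - a^2)^2}, \tag{3}$$
as a consequence of (2), holds for all $a_1, a_2, \ldots, a_k \in (0, 1)$, with the condition
$a \ge \sqrt{3}/3$.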
We use our method to prove that inequality (3) holds without the condition
$a \ge \sqrt{3}/3$. In this case, using
$$\sum_{n=1}^{\infty} n x^n = \frac{x}{(1 - x)^2}, \qquad x \in (0, 1), \tag{4}$$
$$= k\,\frac{a^2}{(1 - a^2)^2}.$$
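A small numerical experiment can make the summation method concrete. The sketch below is an illustration, not the article's proof: it assumes the termwise step is the power-mean inequality $a_1^{2n} + \cdots + a_k^{2n} \ge k a^{2n}$, which is then weighted by n and summed using the series $\sum n x^n = x/(1-x)^2$. The function names and the random sampling are hypothetical.

```python
import random

def lhs_3(values):
    """Sum of a_i^2 / (1 - a_i^2)^2."""
    return sum(x * x / (1 - x * x) ** 2 for x in values)

def rhs_3(values):
    """k * a^2 / (1 - a^2)^2, where a is the quadratic mean of the values."""
    k = len(values)
    a2 = sum(x * x for x in values) / k
    return k * a2 / (1 - a2) ** 2

def series_n_x_n(x, terms=2000):
    """Partial sum of n * x^n, to compare with x / (1 - x)^2."""
    return sum(n * x ** n for n in range(1, terms + 1))

def random_sample(rng, k):
    """k values in (0, 0.95); no lower bound on the quadratic mean is imposed."""
    return [rng.uniform(0.01, 0.95) for _ in range(k)]
```

Sampling many tuples with `random_sample` and comparing `lhs_3` with `rhs_3` gives no counterexample, including tuples whose quadratic mean is below $\sqrt{3}/3$.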
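New inequalities can be discovered via this method, as we will see next. Consider
the well-known inequality:
$$a^2 + b^2 + c^2 \ge ab + bc + ca, \tag{5}$$
which is true for all a, b, c, but we take $a, b, c \in (-1, 1)$. For every $n \in \mathbb{N}$,
and by addition,
$$a^{2n} + b^{2n} + c^{2n} \ge (ab)^n + (bc)^n + (ca)^n.$$
$$\sum_{n=0}^{\infty} a^{2n} + \sum_{n=0}^{\infty} b^{2n} + \sum_{n=0}^{\infty} c^{2n} \ \ge\ \sum_{n=0}^{\infty} (ab)^n + \sum_{n=0}^{\infty} (bc)^n + \sum_{n=0}^{\infty} (ca)^n,$$
$$\frac{1}{1 - a^2} + \frac{1}{1 - b^2} + \frac{1}{1 - c^2} \ \ge\ \frac{1}{1 - ab} + \frac{1}{1 - bc} + \frac{1}{1 - ca}, \qquad a, b, c \in (-1, 1).$$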
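The passage from the termwise inequality to the closed form can be illustrated numerically. In this sketch (function names and sample ranges are assumptions made for illustration), the gap between the two sides of the final inequality is computed both from the closed form and as a partial sum over n starting at 0, applying (5) to $a^n, b^n, c^n$.

```python
import random

def closed_form_gap(a, b, c):
    """Left side minus right side of the summed inequality."""
    left = 1 / (1 - a * a) + 1 / (1 - b * b) + 1 / (1 - c * c)
    right = 1 / (1 - a * b) + 1 / (1 - b * c) + 1 / (1 - c * a)
    return left - right

def partial_sum_gap(a, b, c, terms=500):
    """Sum over n = 0..terms-1 of the termwise gaps, each of which is nonnegative by (5)."""
    total = 0.0
    for n in range(terms):
        total += (a ** (2 * n) + b ** (2 * n) + c ** (2 * n)
                  - (a * b) ** n - (b * c) ** n - (c * a) ** n)
    return total
```

The two functions agree up to truncation error, and both stay nonnegative on random triples in (-1, 1).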
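and then using the summation method with respect to n again, we obtain
$$\frac{1}{1 - a^4} + \frac{1}{1 - b^4} + \frac{1}{1 - c^4} \ \ge\ \frac{1}{1 - a^2 bc} + \frac{1}{1 - ab^2 c} + \frac{1}{1 - abc^2}.$$
$$\frac{17}{1 - x^3} + \frac{17}{1 - y^3} + \frac{17}{1 - z^3} + \frac{45}{1 - xyz} \ \ge\ \frac{32}{1 - x^2 y} + \frac{32}{1 - y^2 z} + \frac{32}{1 - z^2 x}.$$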
Our method is also suitable for obtaining new results related to convexity. Walter
Janous [3] showed
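$$\left(\frac{1}{1 - z} - \frac{1}{1 - x}\right) f(y) \ \ge\ \left(\frac{1}{1 - z} - \frac{1}{1 - y}\right) f(x) + \left(\frac{1}{1 - y} - \frac{1}{1 - x}\right) f(z),$$
which shows that the function $g : (0, 1) \to \mathbb{R}$ given by $g(x) = (1 - x) f(x)$ is also
concave.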
Finally, we use Hölder's inequality to establish two inequalities that are new, as far
as we know.
and
$$\frac{a^p}{p\,(1 - a^p)^2} + \frac{b^q}{q\,(1 - b^q)^2} \ \ge\ \frac{ab}{(1 - ab)^2}.$$
ACKNOWLEDGMENTS. This work was supported by a grant of the Romanian National Authority for
Scientific Research, CNCS-UEFISCDI, project number PN-II-ID-PCE-2011-3-0087.
REFERENCES
Abstract. For a fixed prime p, we consider the set of maps $\mathbb{Z}/p\mathbb{Z} \to \mathbb{Z}/p\mathbb{Z}$ of the form
$a \mapsto T_n(a)$, where $T_n(x)$ is the degree-n Chebyshev polynomial of the first kind. We observe
that these maps form a semigroup, and we determine its size and structure.
Theorem 1. The polynomials $T_n$ and $T_m$ induce the same map $\mathbb{Z}/2\mathbb{Z} \to \mathbb{Z}/2\mathbb{Z}$ if and
only if $n \equiv m \pmod 2$. There are a total of two Chebyshev maps $\mathbb{Z}/2\mathbb{Z} \to \mathbb{Z}/2\mathbb{Z}$,
namely the identity and the constant map 1. These form a semigroup isomorphic to
$\mathbb{Z}/2\mathbb{Z}$ under the operation of multiplication.
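Theorem 1 can be checked directly for small n by reducing the integer coefficients of $T_n$ modulo 2. The sketch below is illustrative only: it uses the standard recurrence $T_{k+1} = 2xT_k - T_{k-1}$, and the helper names are hypothetical.

```python
def chebyshev_coeffs(n):
    """Integer coefficients of T_n, lowest degree first, via T_{k+1} = 2x T_k - T_{k-1}."""
    prev, cur = [1], [0, 1]                 # T_0 = 1, T_1 = x
    if n == 0:
        return prev
    for _ in range(n - 1):
        nxt = [0] + [2 * c for c in cur]    # coefficients of 2x * T_k
        for i, c in enumerate(prev):
            nxt[i] -= c                     # subtract T_{k-1}
        prev, cur = cur, nxt
    return cur

def reduce_mod_2(coeffs):
    """Coefficients mod 2, with trailing zero coefficients stripped."""
    reduced = [c % 2 for c in coeffs]
    while len(reduced) > 1 and reduced[-1] == 0:
        reduced.pop()
    return reduced
```

For even n the reduction is the constant polynomial 1 (`[1]`), and for odd n it is x (`[0, 1]`), matching the two maps named in the theorem.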
Theorem 2. Let p be an odd prime. The polynomials $T_n$ and $T_m$ induce the same map
$\mathbb{Z}/p\mathbb{Z} \to \mathbb{Z}/p\mathbb{Z}$ if and only if n is congruent to either $\pm m$ or $\pm pm$ modulo $(p^2 - 1)/2$.
1 The coefficients of these linear polynomials are only required to lie in the algebraic closure of Z/ pZ.
Before proving these results, we illustrate Theorem 2 by writing it out in the two
smallest cases.
When $p = 3$, there are three Chebyshev maps on $\mathbb{Z}/p\mathbb{Z}$, induced by $T_1$, $T_2$, and
$T_4$. Here $T_1$ is the identity, $T_4$ is the constant map 1, and $T_2 \circ T_2 = T_4$. These three
maps comprise the quotient of the semigroup $\mathbb{Z}/4\mathbb{Z}$ (under multiplication) by the
subgroup {1, 3}; the cosets of this subgroup are {1, 3}, {0}, and {2}.
When $p = 5$, there are six Chebyshev maps on $\mathbb{Z}/p\mathbb{Z}$. These maps correspond to
the cosets of the subgroup {1, 5, 7, 11} of the semigroup $\mathbb{Z}/12\mathbb{Z}$ (under
multiplication), namely,
{0}, {1, 5, 7, 11}, {2, 10}, {3, 9}, {4, 8}, {6}.
Here a prescribed coset corresponds to the map $a \mapsto T_n(a)$, where n is any positive
integer whose image in $\mathbb{Z}/12\mathbb{Z}$ lies in the prescribed coset. The coset containing 1 is
the identity element, and in this case it is the only invertible element in the quotient
semigroup. Note that the cosets have sizes 1, 2, and 4. This also holds for larger
primes, and will be made explicit in the proof of Theorem 2.
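The two small cases above can be reproduced by brute force. This is a minimal sketch, assuming (as Theorem 2 implies) that it suffices to let n run over one full period $1, \ldots, (p^2 - 1)/2$; the function names are hypothetical.

```python
def chebyshev_map(n, p):
    """Tuple (T_n(0), ..., T_n(p-1)) mod p, using T_{k+1}(a) = 2a T_k(a) - T_{k-1}(a)."""
    values = []
    for a in range(p):
        prev, cur = 1, a % p        # T_0(a) = 1, T_1(a) = a
        if n == 0:
            cur = 1
        for _ in range(n - 1):
            prev, cur = cur, (2 * a * cur - prev) % p
        values.append(cur)
    return tuple(values)

def classes_of_exponents(p):
    """Group the residues n mod (p^2 - 1)/2 by the map that T_n induces on Z/pZ."""
    modulus = (p * p - 1) // 2
    classes = {}
    for n in range(1, modulus + 1):
        classes.setdefault(chebyshev_map(n, p), []).append(n % modulus)
    return sorted(sorted(residues) for residues in classes.values())
```

Calling `classes_of_exponents(3)` and `classes_of_exponents(5)` returns the residue classes listed in the text, with one class per distinct Chebyshev map.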
some integers $f_i$. Now put $g(x) := f(x, 1) = \sum_{i=0}^{\lfloor n/2 \rfloor} f_i x^{n-2i}$, so that $g(u + u^{-1}) =
u^n + u^{-n}$. Then $h(x) := g(2x)/2$ satisfies $h((z + z^{-1})/2) = (z^n + z^{-n})/2$, which for
$z = e^{i\theta}$ implies that $h(\cos\theta) = \cos n\theta$. Hence $h - T_n$ vanishes at $\cos\theta$, and since $\theta$ is
arbitrary it follows that $h = T_n$.
We now determine the lowest-degree term of h, and use it to compute the reduction
of h mod 2. If n is even then substituting $u = -v$ yields
so that $f_{n/2} = 2 \cdot (-1)^{n/2}$. Since $h = \sum_{i=0}^{\lfloor n/2 \rfloor} f_i x^{n-2i}\, 2^{n-2i-1}$ and each $f_i$ is an integer,
it follows for even n that $h \equiv 1 \pmod 2$. If n is odd then
$$\frac{u^n + v^n}{u + v} = \sum_{i=0}^{(n-1)/2} f_i\, (u + v)^{n-1-2i} (uv)^i;$$
substituting $u = -v$ on the right yields $f_{(n-1)/2}\,(-v^2)^{(n-1)/2}$, and evaluating the left
side at $u = -v$ (for instance, via l'Hôpital's rule) yields $n v^{n-1}$. Thus we find that
For any F p , write = + 1 with as in the lemma; then, since pth power-
ing is an automorphism of F p which fixes , we have
p + p = p = = + 1 ,
+ 1 + 1
Tn = Tm n + n = m + m
2 2
either n = m or n = m
either nm = 1 or n+m = 1.
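(it turns out that $\frac{n}{n-k}\binom{n-k}{k}$ is an integer). If $2\alpha \ne 0$ then the Dickson polynomial is
related to the Chebyshev polynomial via the change of variables
$$D_n(x, \alpha) = 2\left(\sqrt{\alpha}\right)^{n}\, T_n\!\left(\frac{x}{2\sqrt{\alpha}}\right).$$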
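The change of variables can be tested numerically over the reals. The sketch below rests on assumptions that are not in the text: it takes the parameter (called `alpha` here) to be a positive real so that the square root is real, it uses the standard recurrence $D_k = x D_{k-1} - \alpha D_{k-2}$ with $D_0 = 2$, $D_1 = x$, and it compares against the explicit sum with coefficients $\frac{n}{n-k}\binom{n-k}{k}(-\alpha)^k$.

```python
import math

def dickson(n, x, alpha):
    """D_n(x, alpha) from the recurrence D_k = x*D_{k-1} - alpha*D_{k-2}, D_0 = 2, D_1 = x."""
    prev, cur = 2.0, x
    if n == 0:
        return prev
    for _ in range(n - 1):
        prev, cur = cur, x * cur - alpha * prev
    return cur

def dickson_explicit(n, x, alpha):
    """Explicit sum for n >= 1; n*C(n-k, k)/(n-k) is an integer, so // is exact."""
    return sum((n * math.comb(n - k, k) // (n - k)) * (-alpha) ** k * x ** (n - 2 * k)
               for k in range(n // 2 + 1))

def chebyshev(n, t):
    """T_n(t) from the recurrence T_{k+1} = 2t*T_k - T_{k-1}."""
    prev, cur = 1.0, t
    if n == 0:
        return prev
    for _ in range(n - 1):
        prev, cur = cur, 2 * t * cur - prev
    return cur

def dickson_via_chebyshev(n, x, alpha):
    """Right side of the change of variables, for alpha > 0."""
    root = math.sqrt(alpha)
    return 2 * root ** n * chebyshev(n, x / (2 * root))
```

For example, `dickson(2, x, alpha)` and `dickson_via_chebyshev(2, x, alpha)` both reduce to `x*x - 2*alpha`.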
ACKNOWLEDGMENTS. We thank Florian Block, Kevin Carde, Jeff Lagarias, and the referees for valuable
suggestions which improved the exposition in this paper. The first, third and fourth authors were partially sup-
ported by the NSF under grants DMS-0502170, DMS-0801029, and DMS-0903420, respectively. The second
author was supported by an NSF Graduate Research Fellowship.
REFERENCES
Abstract. Let $P \subset \mathbb{R}^3$ be a pyramid with the base a convex polygon Q. We show that when
other faces are collapsed (rotated around the edges onto the plane spanned by Q), they cover
the whole base Q.
Figure 1. An impossible configuration of four collapsing walls of a pyramid leaving a hole in the base.
Q F A F ,
For example, suppose pyramid P in the theorem has a very large height, and all
walls are nearly vertical. The theorem then implies that every point $O \in Q$ has an
orthogonal projection into the interior of some edge e of Q. This is a classical result
with a number of far-reaching generalizations (see [4, 9]). Thus, the collapsing walls
theorem can be viewed as yet another generalization of this result (see Section 3).
To prove the theorem, assume to the contrary that B / F1 . Then there exists a face
of P, say F2 , such that H2 separates B from the origin. Denote by C the closest point
to B on L 2 , and by 0 the angle between the line BC and the horizontal plane H , where
the angle is taken with the half-plane of H which contains Q (and thus the origin). In
this notation, the above condition implies that 0 > 2 .
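Without loss of generality we may assume that line $L_2$ is given by the equations
$y = r_2$ and $z = 0$. Then
$$C = \bigl(r_1(1 - \cos\theta_1)\,a,\ r_2,\ 0\bigr),$$
and
$$\cos\theta' = \cos\widehat{OCB} = \frac{r_2 - r_1(1 - \cos\theta_1)\,b}{\sqrt{r_1^2 \sin^2\theta_1 + \bigl(r_2 - r_1(1 - \cos\theta_1)\,b\bigr)^2}}.$$
Note that the quantity $t/\sqrt{a^2 + t^2}$ is monotone increasing as a function of t, and that
$b \le 1$. We get
$$\cos\theta' \ \ge\ \frac{r_2 - r_1(1 - \cos\theta_1)}{\sqrt{r_1^2 \sin^2\theta_1 + \bigl(r_2 - r_1(1 - \cos\theta_1)\bigr)^2}}.$$
$$\frac{r_2 - r_1(1 - \cos\theta_1)}{\sqrt{r_1^2 \sin^2\theta_1 + \bigl(r_2 - r_1(1 - \cos\theta_1)\bigr)^2}} \ <\ \cos\theta_2. \tag{1}$$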
The rest of this section is dedicated to showing that (1) and (2) cannot both be true.
This gives a contradiction with our assumptions and proves the claim. We split the
proof into two cases depending on whether the dihedral angle $\theta_2$ is acute or obtuse. In
each case we repeatedly rewrite (1) and (2), eventually leading to a contradiction.
Case 1 (obtuse angles). Suppose $\frac{\pi}{2} < \theta_2 < \pi$. In this case $\cos\theta_2 < 0$ and
$r_2 - r_1(1 - \cos\theta_1) < 0$. Now (1) implies
$$1 + \frac{r_1^2 \sin^2\theta_1}{\bigl(r_2 - r_1(1 - \cos\theta_1)\bigr)^2} \ <\ \frac{1}{\cos^2\theta_2}, \tag{3}$$
and
$$\frac{r_1 \sin\theta_1}{r_2 - r_1(1 - \cos\theta_1)} \ >\ \tan\theta_2. \tag{4}$$
This can be further rewritten as:
$$\frac{r_2}{r_1} \ <\ 1 - \cos\theta_1 + \frac{\sin\theta_1}{\tan\theta_2}. \tag{5}$$
Now (5) and (2) together imply
$$\frac{\tan\frac{\theta_1}{2}}{\tan\frac{\theta_2}{2}} \ <\ 1 - \cos\theta_1 + \frac{\sin\theta_1}{\tan\theta_2},$$
which is impossible. Indeed, suppose for some $\alpha$ and $\beta$ satisfying $0 < \alpha, \beta < \pi$ we
have
$$\frac{\tan\frac{\alpha}{2}}{\tan\frac{\beta}{2}} \ <\ 1 - \cos\alpha + \frac{\sin\alpha}{\tan\beta}. \tag{6}$$
Dividing both sides by $\tan\frac{\alpha}{2}$, after some easy manipulations, we conclude that (6) is
equivalent to
$$\frac{1}{\tan\frac{\beta}{2}} \ <\ \sin\alpha + \frac{1 + \cos\alpha}{\tan\beta}, \tag{7}$$
Since the left-hand side of (8) is equal to 1, we get a contradiction and complete the
proof in Case 1.
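The impossibility of (6) can also be probed numerically. The sketch below is an illustration under stated assumptions: it reads (6) as $\tan(\alpha/2)/\tan(\beta/2) < 1 - \cos\alpha + \sin\alpha/\tan\beta$ for $0 < \alpha, \beta < \pi$, and it uses the identity (derived here, not quoted from the text) that the right side minus the left side equals $\tan(\alpha/2)\,(\cos(\alpha - \beta) - 1)/\sin\beta$, which is never positive. The function names are hypothetical.

```python
import math

def gap_6(alpha, beta):
    """Right side minus left side of (6); (6) would require this to be positive."""
    left = math.tan(alpha / 2) / math.tan(beta / 2)
    right = 1 - math.cos(alpha) + math.sin(alpha) / math.tan(beta)
    return right - left

def gap_6_closed_form(alpha, beta):
    """Same quantity rewritten so that its sign is evident (assumed identity, checked below)."""
    return math.tan(alpha / 2) * (math.cos(alpha - beta) - 1) / math.sin(beta)

def max_gap_on_grid(steps=200):
    """Largest value of gap_6 over an interior grid of (0, pi) x (0, pi)."""
    h = math.pi / steps
    return max(gap_6(i * h, j * h)
               for i in range(1, steps) for j in range(1, steps))
```

On the grid the maximum is zero up to rounding, attained along $\alpha = \beta$, so the strict inequality (6) never holds there.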
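Case 2 (right and acute angles). Suppose now that $0 < \theta_2 \le \frac{\pi}{2}$. Then $\cos\theta_2 \ge 0$, and
$0 < \tan\frac{\theta_2}{2} \le 1$. Let us first show that the numerator of (1) is nonnegative, i.e., that
$r_2 \ge r_1(1 - \cos\theta_1)$. From the contrary assumption we have $r_2/r_1 < (1 - \cos\theta_1)$. Together
$$1 - \cos\theta_1 \ >\ \frac{r_2}{r_1} \ \ge\ \frac{\tan\frac{\theta_1}{2}}{\tan\frac{\theta_2}{2}} \ \ge\ \tan\frac{\theta_1}{2},$$
$$1 + \frac{r_1^2 \sin^2\theta_1}{\bigl(r_2 - r_1(1 - \cos\theta_1)\bigr)^2} \ >\ \frac{1}{\cos^2\theta_2}, \tag{9}$$
and
$$\frac{r_1 \sin\theta_1}{r_2 - r_1(1 - \cos\theta_1)} \ >\ \tan\theta_2. \tag{10}$$
Note now that (10) coincides with (4). Since (6) does not hold for any $\alpha$ and $\beta$
satisfying $0 < \alpha, \beta < \pi$, we obtain the contradiction verbatim as in the proof of Case 1. This
completes the analysis of Case 2 and finishes the proof of the theorem.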
3. FINAL REMARKS.
3.1. The collapsing walls theorem extends verbatim to higher dimensions. Moreover,
it also extends to every polytope $P \subset \mathbb{R}^d$, as follows. For each facet F of P, let $H_F$
denote the hyperplane containing F. Fix one facet Q of P. If all other facets F of P
are rotated around the affine subspace $H_F \cap H_Q$ onto $H_Q$ (or if $H_F$ is parallel to $H_Q$ we
just consider the orthogonal projection of F onto $H_Q$), then they cover the whole facet
Q. We refer to [5], where this result is proved in full generality. We should mention
that after we advertised the result in this paper, other people (in personal
communications with Arseniy Akopyan and, independently, with Günter Rote)
found alternative elementary and beautiful proofs that are not more complicated and
perhaps technically even easier for the simple three-dimensional case presented in this
paper. However, we have not yet seen another proof (simple or not) that generalizes to
higher dimensions or to the case of a general convex polytope, as does the argument
in this paper.
3.2. Let us note that when the walls of a pyramid are collapsed outside, rather than
onto the base, they are pairwise nonintersecting (see Figure 2). We leave this easy
exercise to the reader.
3.4. The proof of the theorem is based on an implicit subdivision of Q given by the
smallest of the linear functions i at every point O Q. Recall that i is a weighted
distance to the edge ei . Thus this subdivision is in fact a weighted analogue of the
dual Voronoi subdivision in the plane (see [1, 3]). As a consequence, computing this
subdivision can be done efficiently, both theoretically and practically.
ACKNOWLEDGMENTS. The authors are thankful to Yuri Rabinovich for his interest in the problem. The
first author was partially supported by the National Security Agency and the National Science Foundation. The
second author was supported by the Israeli Science Foundation (grant No. 938/06).
REFERENCES
1. F. Aurenhammer, Voronoi diagrams: A survey of a fundamental geometric data structure, ACM Comput.
Surv. 23 (1991) 345–405; available at http://dx.doi.org/10.1145/116873.116880.
2. J. H. Conway, M. Goldberg, R. K. Guy, Problem 66-12, SIAM Review 11 (1969) 78–82; available at
http://dx.doi.org/10.1137/1011014.
3. S. Fortune, Voronoi Diagrams and Delaunay Triangulations, Computing in Euclidean Geometry, second
edition. Edited by F. Hwang and D.-Z. Du, Lecture Notes Ser. Comput. 4, 225–265, World Scientific,
Singapore, 1995.
4. I. Pak, Lectures on Discrete and Polyhedral Geometry, monograph (to appear). Available at
http://www.math.ucla.edu/~pak/book.htm
5. I. Pak, R. Pinchasi, How to cut out a convex polyhedron, (to appear).
6. S. Tabachnikov, Around four vertices, Russian Math. Surveys 45 (1990) 229–230; available at
http://dx.doi.org/10.1070/RM1990v045n01ABEH002326.
1 One can give a construction in which there is only one such edge, if the center of mass is replaced by a
PROBLEMS
11621. Proposed by Z. K. Silagadze, Budker Institute of Nuclear Physics and Novosibirsk
State University, Novosibirsk, Russia. Find
$$\int_{s_1=-\infty}^{\infty} \int_{s_2=-\infty}^{s_1} \int_{s_3=-\infty}^{s_2} \int_{s_4=-\infty}^{s_3} \cos(s_1^2 - s_2^2)\, \cos(s_3^2 - s_4^2)\ ds_4\, ds_3\, ds_2\, ds_1.$$
11627. Proposed by Samuel Alexander, The Ohio State University, Columbus, Ohio.
Let N be the set of nonnegative integers. Let M be the set of all functions from N
to N. For a function $f_0$ from an interval [0, m] in N to N, say that f extends $f_0$ if
$f(n) = f_0(n)$ for $0 \le n \le m$. Let $F(f_0)$ be the set of all extensions in M of $f_0$, and
equip M with the topology in which the open sets of M are unions of sets of the form
$F(f_0)$. Thus, $\{f \in M : f(0) = 7 \text{ and } f(1) = 11\}$ is an open set.
Let S be a proper subset of M that can be expressed both as $\bigcup_{i \in N} \bigcap_{j \in N} X_{i,j}$ and as
$\bigcap_{i \in N} \bigcup_{j \in N} Y_{i,j}$, where each set $X_{i,j}$ or $Y_{i,j}$ is a subset of M that is both closed
and open (clopen). Show that there is a family $Z_{i,j}$ of clopen sets such that
$S = \bigcup_{i \in N} \bigcap_{j \in N} Z_{i,j}$ and $S = \bigcap_{i \in N} \bigcup_{j \in N} Z_{i,j}$.
SOLUTIONS
where $I_\rho$ is the identity matrix of order $\rho$ and $U^*$ denotes the conjugate transpose of
U. We use this expression for P to partition the projector Q. Using the same matrix
U, we write
$$Q = U \begin{pmatrix} A & B \\ B^* & D \end{pmatrix} U^*,$$
$$R(A) = R(AA^* + BB^*) = R(AA^*) + R(BB^*) = R(A) + R(B),$$
and hence $R(B) \subseteq R(A)$. Other relationships among A, B, and D are found in
Lemmas 1–5 of [1]; we use two of these. The first expresses the orthogonal projector $P_D$
onto the column space of D as $P_D = D + B^* \bar{A}^\dagger B$, where $\bar{A}^\dagger$ is the Moore–Penrose
inverse of $\bar{A} = I_\rho - A$. The second expresses the rank of $\bar{A}$ as $r(\bar{A}) = \rho - r(A) + r(B)$.
Furthermore, Theorem 1 of [1] gives $r(Q) = r(A) - r(B) + r(D)$, and Lemma 6 of [1]
gives $r(P + Q) = \rho + r(D)$. Taking differences of these expressions yields
$r(P + Q) - r(Q) = \rho - r(A) + r(B)$ and $r(P + Q) - r(P) = r(D)$. Since the third of the
desired equations is just the difference of the first two, it suffices to show that $P - Q$
has $r(\bar{A})$ positive eigenvalues and $r(D)$ negative eigenvalues.
Theorem 5 in [1] expresses $P - Q$ as
$$P - Q = U \begin{pmatrix} \bar{A} & -B \\ -B^* & -D \end{pmatrix} U^*,$$
The matrices before and after the central matrix in the product on the right are
nonsingular and are conjugate transposes of each other. By Sylvester's Law of Inertia (see [2,
Section 1.3]), the numbers of positive and negative eigenvalues are unchanged by
conjugation. Since $\bar{A} = I - A$ and $P_D$ are nonnegative definite (the eigenvalues of an idempotent
matrix lie in the interval [0, 1]), we conclude that $P - Q$ has $r(\bar{A})$ positive eigenvalues
and $r(D)$ negative eigenvalues, as desired.
[1] O. M. Baksalary and G. Trenkler, Eigenvalues of functions of orthogonal
projectors, Linear Alg. Appl. 431 (2009) 2172–2186.
[2] R. A. Horn, F. Zhang, Basic properties of the Schur complement, in The Schur
Complement and its Applications, edited by F. Zhang, Springer Verlag, New York,
2005, 17–46.
Also solved by R. Chapman (U. K.), E. A. Herman, O. Kouba (Syria), J. H. Lindsey II, K. Schilling, J. Simons
(U. K.), R. Stong, Z. Voros (Hungary), S. Xiao (Canada), GCHQ Problem Solving Group (U. K.), and the
proposer.
of V ,
r
X r
X
(V 0 V (n))i, j = i1
k k k = H (n)i, j .
n+ j1 n+i+ j2
=
k=1 k=1
Therefore,
r
Y n Y
det H (n) = det(V 0 V (n)) = k (i j )2 .
k=1 j<i
and
r
X
(A + B)v j = i u i (u i v j ) + j v j for 1 j s.
i=1
Editorial comment. It would be nice to extend the result to normal matrices. The prob-
lem is that C = A + B is not normal when A and B are normal. Thus the rank of C is
not necessarily the same as the number of nonzero eigenvalues of C. Other than this,
everything works for normal matrices.
One may wonder whether the condition k 3n be replaced with k n. This
fails at least when n = 1, since tr (A + B) = tr A + tr B for all numbers A and B, but
AB 6= 0.
Also solved by J. Simons (U. K.), R. Stong, and the proposer.
Friendly Paths
11484 [2010, 182]. Proposed by Giedrius Alkauskas, Vilnius University, Vilnius,
Lithuania. An uphill lattice path is the union of a (doubly infinite) sequence of
directed line segments in $\mathbb{R}^2$, each connecting an integer pair (a, b) to an adjacent pair,
either (a, b + 1) or (a + 1, b). A downhill lattice path is defined similarly, but with
$b - 1$ in place of $b + 1$, and a monotone lattice path is an uphill or downhill lattice path.
Given a finite set P of points in $\mathbb{Z}^2$, a friendly path is a monotone lattice path for which
there are as many points in P on one side of the path as on the other. (Points that lie
on the path do not count.)
(a) Show that if $N = a^2 + b^2 + a + b$ for some positive integer pair (a, b) satisfying
$a \le b \le a + \sqrt{2a}$, then for some set of N points there is no friendly path.
(b)* Is it true that for every odd-sized set of points there is a friendly path?
Solution to (a) by the proposer. Let P be the centrally symmetric configuration con-
sisting of triangles of points in four quadrants as in the figure (where a = 4 and
b = 7). The first and third quadrants contain triangles meeting a diagonals, compris-
ing a(a + 1)/2 points. The second and fourth quadrants contain triangles meeting b
diagonals, comprising b(b + 1)/2 points. In total, |P| = N . Let A, B, C, D denote the
subsets in the four quadrants.
We prove that there is no friendly path for P. If the first and last points of P on
a monotone path Q lie in neighboring quadrants, then at least N /2 points lie on one
side, and Q is not friendly. If the first and last points are in C and A, then Q hits one
point in each of 2a + 1 diagonals. Since N is even, this leaves an odd number of points
of P outside Q, and they cannot be split equally.
It remains to consider a downhill lattice path Q whose first and last points are in
B and D. If Q hits a point of P at every step between these extremes, then Q hits
2b + 1 points of P, and again the remainder cannot be split equally. Hence, we may
assume that an odd number of lattice points along Q between its ends are not in P. By
symmetry, we may assume these points are in A. We claim that every such path has
more points of P below it than above it.
Consider the point x just above the leftmost column of the triangle in A. The down-
hill path Q containing x that has the most points of P above and to its right goes
directly rightward to x and then down. There are ba1 2
points of P above Q in B,
a b
2
points of P to the right of Q in A, and 2 points of P to the right of Q in D.
Meanwhile, on the other side of Q are a+1 b+1 ba+1
2
+ 2
2
points. An equal split
requires
a b ba1 a+1 b+1 ba+2
+ + + ,
2 2 2 2 2 2
which simplifies to $a + b \le (b - a)^2$. The left side is at least 2a, and the right side is
at most 2a, so $b = a$ is necessary, but then $2a \le 0$.
As we move from x to any other point in the first quadrant outside A as a point of Q
outside P between points of Q in P, the number of points above Q decreases, while
the number of points below Q increases. Hence the two sides can never have equal size.
Editorial comment. We do not know the answer to part (b)*. Parity considerations
made part (a) easy using a centrally symmetric configuration. However, a centrally
symmetric configuration of odd size has a central point. Any symmetric path through
that point is a friendly path. This makes it difficult to construct a counterexample.
No other solutions were received.
Roads to Infinity. The mathematics of truth and proof. By John Stillwell. A K Peters, Natick,
MA, 2010. xi + 203 pp., ISBN 978-1-56881-466-7. $39.95.
Mathematics appears to be an ever-unfolding dialectic between the finite and the
infinite, between discrete structures and continuous forms, but also between symbolic
content and idealisation. In a well-known paper, "On the infinite," Hilbert proposed a
distinction between the contentual in mathematics (which he tentatively identified
with the study of strictly finite structures) and the ideal elements that are constantly
being introduced to help explore the realm of mathematical truths. Hilbert also claimed
that the infinite is not to be found in Nature, yet it may still be the case that the infinite
occupies a well-justified place in our thinking, that it plays the role of an indispensable
concept [6]. Broadly construed, that is the topic of the book under review. More than
dealing with roads to infinity, or with infinity studied for its own sake (the topic of
higher set theory), the book's core focus is roads from infinity, in the sense explained
below.
In that same paper of 1925, Hilbert offered new clarifications of his celebrated
program to vindicate infinitarian mathematics by methods employing only finitary,
contentual mathematics, by means of a proof theory studying the structure of proofs
inside any given (axiomatized) theory, which would show that mathematics is
consistent, i.e., free from contradiction.¹ This became the springboard for Gödel's surprising
and celebrated Incompleteness Theorems, but also for new developments in proof
theory initiated in the 1930s by Gentzen, who was able to establish the consistency of the
axiomatic system called PA (Peano Arithmetic) by an extension of Hilbertian
methods. This is the mathematics of truth and proof to which Stillwell's book is devoted,
and in which he chooses to emphasize the contributions of Emil Post and Gerhard
Gentzen.
Stillwell is a master expositor and does a very good job explaining and weaving
together many core issues in mathematical logic and foundational studies. Less than
halfway through the book the reader has reached the limitation results that affect
formal systems capable of codifying a certain amount of arithmetic: incompleteness,
unprovability of consistency, the halting problem in computation, the decision problem.
But, although Stillwell sets himself the task of dispelling the myth that
incompleteness is a difficult concept, I doubt that he has managed to do so. Diagonalization
is a simple and clear technique, and the author does very well presenting it in several
versions, from the original set-theoretic one (Chapter 1) to the relevant proof-theoretic
http://dx.doi.org/10.4169/amer.math.monthly.119.02.169
1 He also had the courage to propose a way of solving the Continuum Problem, which turned out to be
seriously flawed.
2 This has given rise to all kinds of misunderstandings, even among expert mathematicians. To suggest
some readings, I find very interesting Cellucci's distinction between closed and open symbolic systems
(see, e.g., his [2]); a masterful exposition of mistaken readings of Gödel's theorems is provided in [4].
3 Stillwell does not mention that Post suffered all his adult life from crippling manic-depressive disease at
a time when no drug therapy was available for this malady (M. Davis). His case, together with those of Cantor
and Gödel, has reinforced the popular, romantic notion that there is some link between logic and madness.
4 Kruskal's theorem has to do with trees, finite graphs that are connected and contain no closed paths: For
any infinite sequence of trees $T_1, T_2, T_3, \ldots$ there are indices $i < j$ such that $T_i$ embeds in $T_j$ (i.e., the infinite
sequence of trees contains an infinite increasing subsequence of trees embedding each other).
$2^2$,
$3^2 \cdot 2 + 3 \cdot 2 + 2$,
$4^2 \cdot 2 + 4 \cdot 2 + 1$,
$5^2 \cdot 2 + 5 \cdot 2$,
$6^2 \cdot 2 + 6 + 5, \ldots$
and, according to Stillwell (following Kirby & Paris, see pages 49–50), it reaches 0 at
base $3 \cdot 2^{402653211} - 1$. The Goodstein process looks very natural if you have studied
some set theory, to the point of becoming acquainted with countable ordinals and with
the fact that any descending sequence of countable ordinals will be finite. For the
Goodstein process is a finitary, arithmetic translation of the finiteness of descending
sequences of ordinals.
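The process can be tried out numerically. The following is a minimal sketch, assuming the usual definition of a Goodstein sequence (write the current term in hereditary base b, replace every b by b + 1, subtract 1, and move on to base b + 1, starting from base 2). The function names `bump` and `goodstein` are illustrative only and do not come from the book or the review.

```python
def bump(n, base):
    """Rewrite n from hereditary base `base` into hereditary base `base + 1`.

    Every occurrence of `base`, including those inside exponents,
    is replaced by `base + 1`.
    """
    result = 0
    exponent = 0
    while n > 0:
        n, digit = divmod(n, base)
        if digit:
            # Exponents are themselves rewritten recursively.
            result += digit * (base + 1) ** bump(exponent, base)
        exponent += 1
    return result


def goodstein(start, max_terms):
    """Return up to `max_terms` terms of the Goodstein sequence of `start`."""
    terms = []
    value, base = start, 2
    while len(terms) < max_terms:
        terms.append(value)
        if value == 0:
            break
        value = bump(value, base) - 1
        base += 1
    return terms


print(goodstein(4, 5))   # [4, 26, 41, 60, 83]
print(goodstein(3, 10))  # [3, 3, 3, 2, 1, 0]
```

The first call reproduces the values of the five rows displayed above for the number 4. The second shows a sequence short enough to watch reach 0; the sequence for 4 also terminates, but far too late to be computed this way.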
The interesting fact is that Goodstein's theorem, albeit a truth of arithmetic which
can be stated in PA, and whose statement involves only the basic operations of
addition, product and exponentiation, cannot be proved in PA. Indeed, it can be proved
in PA that the Goodstein theorem (expressed in this theory by means of an axiom
schema) implies the consistency of PA. But then, by Gödel's second incompleteness
theorem, a.k.a. the unprovability of consistency, it follows that Goodstein's theorem is
a formally unprovable sentence of PA (better: schema of sentences).
Of course, Goodstein's theorem can be proved in set theory, for instance in ZFC.
The process by which the numbers in any Goodstein sequence (like the two we showed
above) change and eventually come down to 0 can be understood by considering
descending sequences of countable ordinals. Countable ordinals are objects like $\omega$, $\omega^2$, or
$\omega^2 \cdot 2 + \omega + 5$, or even $\omega^{\omega^\omega + \omega} + \omega^{\omega+3} + 3$, written in what is called the Cantor normal
form; obviously there are infinitely many such regimented polynomials in $\omega$. And
they are bounded above by $\varepsilon_0$, which can be defined as the limit of the sequence
$\omega, \omega^\omega, \omega^{\omega^\omega}, \ldots$, or alternatively as the first ordinal $\varepsilon$ such that $\omega^\varepsilon = \varepsilon$;⁵ Cantor knew
this number already and established that it is merely countable. That is, our friend $\varepsilon_0$ is
the first countable ordinal that cannot be written as one of the above polynomials in $\omega$.
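For instance, the sequence for number 4 above corresponds to this (one may simply
interpret $\omega$ as a variable that takes value n + 1 in row n):
$\omega^\omega$,
$\omega^2 \cdot 2 + \omega \cdot 2 + 2$,
$\omega^2 \cdot 2 + \omega \cdot 2 + 1$,
$\omega^2 \cdot 2 + \omega \cdot 2$,
$\omega^2 \cdot 2 + \omega + 5, \ldots$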
Given that such descending sequences of ordinals must be finite, Goodstein's
theorem is true. Actually the result can be proved in a theory much simpler than set theory:
it is sufficient to employ Gentzen's expanded arithmetic, which adds $\varepsilon_0$-Induction to
the system of Peano Arithmetic.
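The passage from numbers to ordinals can also be made mechanical. The sketch below is only an illustration, assuming that the ordinal attached to row n is obtained by writing the n-th term in hereditary base n + 1 and then replacing the base by the symbol `w` (standing for omega). The function name `ordinal_form` is hypothetical, and the (value, base) pairs are the numerical values of the five rows given for the number 4.

```python
def ordinal_form(n, base):
    """Hereditary base-`base` notation of n, with the base written as 'w'."""
    if n == 0:
        return "0"
    terms = []
    exponent = 0
    while n > 0:
        n, digit = divmod(n, base)
        if digit:
            if exponent == 0:
                term = str(digit)
            else:
                # Exponents above 1 are themselves written hereditarily.
                if exponent == 1:
                    power = "w"
                else:
                    power = "w^(" + ordinal_form(exponent, base) + ")"
                term = power if digit == 1 else power + "*" + str(digit)
            terms.append(term)
        exponent += 1
    # Highest power first, as in the Cantor normal form.
    return " + ".join(reversed(terms))


# (value, base) pairs: numerical values of the five rows for the number 4.
ROWS = [(4, 2), (26, 3), (41, 4), (60, 5), (83, 6)]

for value, base in ROWS:
    print(value, "->", ordinal_form(value, base))
```

The numbers grow from row to row while the printed ordinal notations strictly decrease. That is the point of the argument: the growing base is invisible once it has been replaced by `w`, and only the subtraction of 1 remains.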
5 Of course there are uncountably many such numbers, even among the countable ordinals!
It is not that Stillwell's technical exposition, sketchy as it may be, has shortcomings;
the author is very proficient and very clever in finding ways to present his material.
But in general the book offers more on the side of proof theory and computability, than
6 Readers interested in this topic will find a lot in the philosophically oriented [1].
7 Bernays in 1935: "The axiom of choice is an immediate application of the quasi-combinatorial concepts
in question." Gödel, assuming the extensional, combinatorial notion of class or set, says in 1944: "nothing
can express better the meaning of the term 'class' than the axiom of [separation] and the axiom of choice"
(Collected Papers, vol. II, 139; compare p. 131).
sequences and functions applies only if we restrict to lawlike sequences. Once we are considering arbitrary
sequences, this is essentially the same as functions $f : \mathbb{N} \to \{0, 1, \ldots, 9\}$.
The above may help emphasize, once again, that there is something inconcrete in the
subject matter of classical set theory (in $\aleph_1$ and $2^{\aleph_0}$, hence on both sides of Cantor's
equation $2^{\aleph_0} = \aleph_1$). We have pinned it down to the heavily idealizing tendency (and
the associated vagueness) of quasi-combinatorialism, which is expressed directly
in the contentious axiom AC and indirectly in the standard reading of the axiom of
powersets. Coming back to Stillwell's discussion of the finitary counterparts of those
ideas, we find that the material on arithmetic and proof theory is rather concrete,
contentual: there is something tangible about $\varepsilon_0$ and about Gentzen's measuring the
proof-theoretic strength of the proposition that PA is consistent, while there is
something intangible about the real number system and about the problem of measuring
its uncountability. Yet mathematicians have always found a need for ideal horizons
11 Compare the case with the algebraic numbers: one can define in the strict sense a transcendental number
by using Cantor's diagonal argument applied to an explicit enumeration of the set of algebraic numbers. See
R. Gray, Georg Cantor and Transcendental Numbers, The American Mathematical Monthly 101 (1994), pp.
819–832; available at http://dx.doi.org/10.2307/2975129.
REFERENCES
1. T. Arrigoni, What is Meant by V? Reflections on the Universe of All Sets, Mentis-Verlag, Paderborn, 2007.
2. C. Cellucci, Why Proof? What is Proof? in Deduction, Computation, Experiment. Exploring the
Effectiveness of Proof. Edited by G. Corsi and R. Lupacchini. Springer, Berlin, 2008.
3. J. Ferreirós, On arbitrary sets and ZFC, Bull. of Symbolic Logic 17 (2011) 361–393.
4. T. Franzén, Gödel's theorem: An incomplete guide to its use and abuse, A K Peters, Natick, MA, 2005.
5. L. Henkin, Completeness in the theory of types, J. of Symbolic Logic 15 (1950) 81–91; available at
http://dx.doi.org/10.2307/2266967.
6. D. Hilbert, On the Infinite, in From Frege to Gödel, a Source Book in Mathematical Logic. Edited by J.
van Heijenoort. Harvard University Press, Cambridge, MA, 2002.
12 Two more examples given by Stillwell: we have both the finite and infinite Ramsey theorems, but few
Ramsey numbers are actually known! (pp. 152–153); also what is said about the Green-Tao theorem (p. 158).
13 Unless one does not employ the full set-theoretic semantics, but so-called Henkin semantics, which is