MAY 2016
415
Søren Eilers
Mocposite Functions
427
Harold P. Boas
439
448
471
NOTES
A Corrigendum to Unreasonable Slightness
482
Arseniy Sheydvasser
486
491
Iosif Pinelis
497
502
Peter McGrath
504
BOOK REVIEW
Scientist, Scholar & Scoundrel: A Bibliographical Investigation
of the Life and Exploits of Count Guglielmo Libri/Mathematician,
Journalist, Patron, Historian of Science, Paleographer, Book
Collector, Bibliographer, Antiquarian, Bookseller, Forger
and Book Thief by Jeremy M. Norman
512
Gerald L. Alexanderson
ENDNOTES
515
MATHBITS
470, An Alternative Approach to the Product Rule; 481, On Measurable
Semigroups in ; 490, Rational Nonaxis Points on the Unit Circle Have
Irrational Angles
THE AMERICAN MATHEMATICAL MONTHLY
VOLUME 123, NO. 5
MAY 2016
EDITOR
Scott T. Chapman
Sam Houston State University
EDITOR-ELECT
Susan Colley
Oberlin College
Douglas B. West
University of Illinois
NOTES EDITOR
Sergei Tabachnikov
Pennsylvania State University
Doug Hensley
Texas A&M University
ASSOCIATE EDITORS
William Adkins, Louisiana State University
Jeffrey Lawson, Western Carolina University
David Aldous, University of California, Berkeley
C. Dwight Lahr, Dartmouth College
Elizabeth Allman, University of Alaska, Fairbanks
Susan Loepp, Williams College
Jonathan M. Borwein, University of Newcastle
Irina Mitrea, Temple University
Jason Boynton, North Dakota State University
Bruce P. Palka, National Science Foundation
Edward B. Burger, Southwestern University
Vadim Ponomarenko, San Diego State University
Minerva Cordero-Epperson, University of Texas, Arlington
Catherine A. Roberts, College of the Holy Cross
Allan Donsig, University of Nebraska, Lincoln
Rachel Roberts, Washington University, St. Louis
Michael Dorff, Brigham Young University
Ivelisse M. Rubio, Universidad de Puerto Rico, Rio Piedras
Daniela Ferrero, Texas State University
Adriana Salerno, Bates College
Luis David Garcia-Puente, Sam Houston State University
Edward Scheinerman, Johns Hopkins University
Sidney Graham, Central Michigan University
Anne Shepler, University of North Texas
Tara Holm, Cornell University
Frank Sottile, Texas A&M University
Lea Jenkins, Clemson University
Susan G. Staples, Texas Christian University
Daniel Krashen, University of Georgia
Daniel Ullman, George Washington University
Ulrich Krause, Universität Bremen
Daniel Velleman, Amherst College
Steven Weintraub, Lehigh University
ASSISTANT MANAGING EDITOR
Bonnie K. Ponce
MANAGING EDITOR
Beverly Joy Ruedi
NOTICE TO AUTHORS
The MONTHLY publishes articles, as well as notes and other features, about mathematics and the profession. Its readers span
a broad spectrum of mathematical interests, and include professional mathematicians as well as students of mathematics
at all collegiate levels. Authors are invited to submit articles
and notes that bring interesting mathematical ideas to a wide
audience of MONTHLY readers.
Abstract. We detail the history of the problem of deciding how many ways one may combine
n 2 × 4 LEGO bricks, and explain what is known (and not known) about the related question
of how these numbers grow with n.
1. HISTORICAL BOUNDS. For decades, the LEGO Company (since 2005: The
LEGO Group) would state in promotional material that six of the company's iconic
2 × 4 bricks could be combined in 102981500 ways if they had the same color. The
author coincidentally became aware in 2003 that this number was incorrect, and in
2004 computed the correct number, which is almost 9 times larger. It is a key purpose of
this note to explain how the correction was obtained, but let us first discuss the history
of the problem, as this is highly instructive for understanding its solution.
With the help of the LEGO Group Archive, the number 102981500 has been traced
back to 1974. It appeared in two short notes ([6], [5]) in the company's newsletter as
an example of the use of the formula
$$t_n = 22 \cdot 2^{n-2} \sum_{i=0}^{n-2} 23^i + 2^{n-1} \tag{1.1}$$
which was found by Jørgen Kirk Kristiansen, a chemical engineer working in the company labs who is also the grandson of the founder of the LEGO Group. Mr. Kristiansen
was fully aware that he did not count all possible buildings and stated so explicitly in
the note, explaining that building number 3 in Figure 1 is not counted, whereas buildings 1 and 2 are.
In fact, formula (1.1) gives a completely correct (but, as we shall see, unnecessarily
complicated) description of the number of buildings of maximal height, counted in
the sense we will describe below. Moreover, the values
$t_2 = 24$, $t_3 = 1060$
were correctly computed this way. Precisely how and when this happened remains
unclear, but over the course of the years it was forgotten that (1.1) was only intended as
a lower bound of the number of buildings, and hence the number 102981500, which
is $t_6 - 4$, was presented as the exact number of buildings in the LEGO Company's
official communication, for instance in the 2004 company profile along with other
LEGO facts and figures such as
Figure 1. Illustrations from [6], [5]. (a) Three buildings. (b) $a_2 = 24$.
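and
[Figure with panels (a), (b), (c).]
$$\tfrac{1}{2}\,(46^{n-1} - 2^{n-1}) + 2^{n-1} = \tfrac{1}{2}\,(46^{n-1} + 2^{n-1}), \tag{1.2}$$
$$2^{n-2} \sum_{i=0}^{n-2} 23^i = 2^{n-2}\,\frac{23^{n-1} - 1}{23 - 1}.$$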
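As a quick numerical check of the lower bound, the following sketch evaluates the closed form in (1.2), read here as $t_n = \frac{1}{2}(46^{n-1} + 2^{n-1})$. That reading, and the function name `t`, are assumptions made for illustration; the values it is compared against are the ones quoted in the text.

```python
def t(n: int) -> int:
    """Closed form (1.2), assumed to read t_n = (46^(n-1) + 2^(n-1)) / 2."""
    # 46^(n-1) and 2^(n-1) always have the same parity, so the division is exact.
    return (46 ** (n - 1) + 2 ** (n - 1)) // 2


if __name__ == "__main__":
    for n in (2, 3, 6):
        print(n, t(n))
    # The historical figure 102981500 is stated to be t_6 - 4.
    print(t(6) - 4)
```

Under this reading, the values $t_2 = 24$ and $t_3 = 1060$ and the relation between $t_6$ and 102981500 all come out as stated.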
Let us digress a bit to note that the numbers 24, 1060, and 102981500 had in fact
been a matter of contention at the LEGO Company in the early 1990s. When an exhibition "The Art of LEGO" was prepared at London's Science Museum, the head of the
LEGO Company in the United Kingdom, Clive Nicholls, got interested in the problem
and made the point that since any child would create 46 different configurations when
asked to build all possible buildings with two bricks, LEGO should communicate the
higher numbers 46, $46^2 = 2116$ and $46^5 = 205962976$ instead. Apart from seeing no
reason to undersell LEGO's versatility, Mr. Nicholls had the further point that since
any LEGO brick has the company logo printed in fine print inside each stud, it is in
fact possible to distinguish a brick from its 180° rotation.
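The count of 46 can be reproduced by brute force. The sketch below models a 2 × 4 brick as a set of stud positions on an integer grid (a modelling choice of ours, not taken from the text), places a second brick on top in every parallel or orthogonal position that shares at least one stud, and then counts the placements that are unchanged by a 180° rotation of the bottom brick.

```python
from itertools import product


def cells(x0, y0, width, height):
    """Set of stud positions covered by a brick with lower-left stud (x0, y0)."""
    return frozenset((x0 + i, y0 + j) for i in range(width) for j in range(height))


BOTTOM = cells(0, 0, 4, 2)  # the fixed 2 x 4 brick


def placements():
    """All ways to put a second 2 x 4 brick on top, as (cells, shared studs)."""
    found = []
    for width, height in ((4, 2), (2, 4)):  # parallel, then orthogonal
        for dx, dy in product(range(-4, 5), repeat=2):
            top = cells(dx, dy, width, height)
            shared = len(top & BOTTOM)
            if shared > 0:
                found.append((top, shared))
    return found


def rotate180(top):
    """Rotate by 180 degrees about the centre of the bottom brick."""
    return frozenset((3 - x, 1 - y) for x, y in top)


ALL = placements()
TOTAL = len(ALL)                                          # all placements
MULTI_STUD = sum(1 for _, shared in ALL if shared > 1)    # more than one stud shared
SYMMETRIC = sum(1 for top, _ in ALL if rotate180(top) == top)
UP_TO_ROTATION = (TOTAL + SYMMETRIC) // 2                 # identify 180-degree rotations
```

In this model `TOTAL` is the figure Mr. Nicholls argued for, and `UP_TO_ROTATION` is the figure obtained once a building and its 180° rotation are identified.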
Mr. Nicholls got an answer from the very top of the organization ([8]), formulated
by board member Per Sørensen:
The science of form is called morphology. It includes the concept of isomorphism: in this case, the ability of two or more objects to assume the same shape.
All objects are isomorphous which by rotation in three-dimensional space and/or
enlargement or reduction can be made to have the same shape. [...] An eight-stud
LEGO element is isomorphous with an eight-stud DUPLO element, and white
and red eight-stud bricks are also isomorphous with each other. [...] When the
elements are isomorphous with each other, variations in which Element I is fitted
on top of Element II are not morphologically different from variations in which
Element II is fitted on top of Element I, even though, in purely physical terms
they are of course different, because (whatever the people in the moulding shop
may say) two elements are not the same. If the LEGO logo on the studs is turned
one way or the other, this is (morphologically speaking) uninteresting, because
it can be regarded as an unintentional difference and thus insignificant in terms
of the morphological nature of the object.
Clive Nicholls had two words for that: "Morphology, schmorphology," and after
a long tirade in the company newsletter, he threatened to leave the LEGO Company
for the then archenemy TYCO, a subsidiary of Mattel, if the board did not revise the
numbers. On a conciliatory note, Mr. Sørensen closed the discussion as follows: "I
propose that in the future, we answer the question in the same way that Rolls Royce
answers questions about horse power: enough!"
Before moving on to a discussion of how to find $a_n$ by use of computers, note that
we at least now have $a_2 = 24$ and the lower bound $a_n \ge t_n$, which tells us that $a_n$ grows
at least as fast as exponentially with base 46. In order to get any sort of theoretical
handle on this problem, we need to complement this observation with an upper bound
of the same nature. Finding such an upper bound would presumably be anathema to the
communications division of LEGO Group, and since it is actually a good deal harder
than providing a lower bound, it seems rather safe to assert that this was attempted for
the first time by the author. Incidentally, the solution given in joint work with Durhuus
([3]) draws on another idea perfected by the LEGO Group: the building instructions.
To obtain a useful upper bound for $a_n$, valid for any n, we will use the approach
that any building can be created by a set of instructions, and then count the possible
instructions instead of the buildings themselves. To be able to implement such an
overcounting strategy, however, we need to work with building instructions of a less
immediate nature than what the average LEGO user would prefer. For $n \ge 2$ we will
say that a map
$$\varphi : \{1, 2, \dots, 16n - 24\} \to \{-8, -7, \dots, 7, 8\}$$
is an instruction when $\varphi(i) \neq 0$ for precisely $n - 1$ values of i.
To use such an instruction, we enumerate the studs of the brick 1, . . . , 8 starting in
the top left corner and working left to right from the top row. First take one brick and
call it brick 1. Then read $\varphi(1), \dots, \varphi(8)$ from left to right to specify what to build on
top of brick 1 as follows. If $\varphi(1) > 0$, take another brick and place it parallel to brick
1 with hole $\varphi(1)$ on top of stud 1. If $\varphi(1) < 0$, take a brick and place it orthogonally,
rotated +90°, to brick 1 with hole $-\varphi(1)$ on top of stud 1. In both cases, give the new
brick the number 2. If $\varphi(1) = 0$, do nothing. Then proceed to read $\varphi(2)$ to see what,
if anything, to place on stud 2, and so on until $\varphi(8)$. Enumerate the bricks as they
are introduced. When n > 2, similarly interpret $\varphi(9), \dots, \varphi(16)$ as an instruction of
which bricks, if any, to place on top of brick 2, and $\varphi(17), \dots, \varphi(24)$ as an instruction for
what to place underneath brick 2, reading this time $\varphi(i)$ as a specification of a stud
to be placed in a hole, and continue this way to the end of the instruction.
Figure 3. Building instructions, 1965. (Used with permission. © 2015 The LEGO Group.)
See Figures 4(a) and (b) for two examples of instructions defining buildings we
are attempting to count. We note, however, that two things can go wrong when one
attempts to follow such instructions: as in Figure 4(c) the bricks specified can collide,
and as in Figure 4(d) we may encounter a situation where no brick number $\ell + 1$
has been introduced when we have reached the end of the specifications of what to
place on bricks $1, \dots, \ell$. We also note that in most situations, there are many different
instructions leading to the same building.
But since, just as is the case for LEGO Group building instructions (cf. Figure 3),
there is obviously an instruction which will create any given building among the ones
we are aspiring to count, the number of instructions is larger than the number of buildings. And we can count the instructions as
$$\binom{16n - 24}{n - 1}\, 16^{n-1},$$
since the binomial coefficient enumerates the number of possible positions of nonzero
values and $16^{n-1}$ enumerates the number of choices for the nonzero entries. To
avoid the computational complexity of computing binomial coefficients, one can appeal to
Stirling's formula to see that these numbers grow no faster than $(16^{17}/15^{15})^{n-1} \approx 674.02^{n-1}$. In fact, we will always have that the number of instructions, and hence
Figure 4. Instructions and resulting buildings, panels (a) to (d).
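The instruction count and the Stirling-type estimate above can be compared numerically. The sketch below assumes the count is $\binom{16n-24}{n-1}16^{n-1}$, as the explanation of the binomial coefficient and the factor $16^{n-1}$ suggests; the names `instruction_count` and `GROWTH` are ours.

```python
from math import comb


def instruction_count(n: int) -> int:
    """Number of instructions for n >= 2 bricks, assuming the count
    binom(16n - 24, n - 1) * 16^(n - 1) described in the text."""
    return comb(16 * n - 24, n - 1) * 16 ** (n - 1)


# Exponential growth rate suggested by Stirling's formula.
GROWTH = 16 ** 17 / 15 ** 15

if __name__ == "__main__":
    print(f"growth rate ~ {GROWTH:.2f}")
    for n in range(2, 8):
        # Exact integer comparison with the rounded-up bound 675^(n-1).
        print(n, instruction_count(n), instruction_count(n) <= 675 ** (n - 1))
```

Python integers are unbounded, so the comparison with $675^{n-1}$ is exact and involves no floating-point rounding.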
There remains the problem for the case where the solid is not necessarily an
n-story building. I only have a result 1560 for n = 3 using a computer. I think it
is computable until n = 5 or 6.
n    a_n                Computed by
1    1
2    24                 Kristiansen 1974
3    1560               Anonymous 2002
4    119580             Eilers 2004
5    10166403           Eilers 2004
6    915103765          Eilers 2004
7    85747377755        Abrahamsen-Eilers 2006
8    8274075616387      Abrahamsen-Eilers 2006
9    816630819554486    Nilsson 2012
As indicated in Figure 5, the prediction on how far $a_n$ was computable was a bit
on the pessimistic side, but the claim that $a_3 = 1560$ is correct. And the anonymous¹
LEGO enthusiast was certainly hitting the nail on the head by predicting that issues
concerning efficiency of computation would come up.
We have already touched upon such issues; indeed the main reason for finding our
revised formula (1.2) superior to the original (1.1) is that it is faster to compute, and
apart from our desire to provide an upper bound of the same nature as the lower bound,
we emphasized the feature that $675^{n-1}$ was efficiently computable also for large n.
Indeed, even though the numbers $t_n$ and $u_n$ grow quickly as n increases, we may use
logarithms, successive squaring, or other standard computational methods to compute
the numbers at an expense in time which grows at worst as a linear expression in n, or,
which is the same, as a linear expression in the number of digits in the computed
numbers.
But when it comes to computing $a_n$, we do not know of any method which does
not require us to go through, one at a time, a large part of the possible configurations,
and since the number of buildings grows exponentially with n, so does the time consumption. The first attempt by the author, naively going through all possible buildings
saving time only by employing our conventions of identification, could compute up to
$a_6 = 915103765$. That number, which the LEGO Group in short order accepted and
helped disseminate widely, was the main goal of the initial investigations, but computing it required almost a week's computing time on a laptop, and hence there was little
hope to compute beyond n = 6.
The situation was improved somewhat in joint work with Abrahamsen ([2]), who
among many other things made the observation that it is more efficient to count closer
to Mr. Nicholls' convention, fixing a brick and then counting all buildings containing
this brick at its base level. One issue, then, that perhaps Mr. Nicholls had not considered, is that when the base level contains more than one brick arranged with its
long side parallel to the base brick, say k such bricks, then every configuration will
be counted k times even if one does not allow identifications by rotations, but only by
translations. But taking this into account, and keeping track also of which buildings
are symmetric after rotations, we may compute $a_n$ by
$$a_n = \sum_{m=1}^{n} \frac{c(n, m) + c_{180}(n, m) + 2c_{90}(n, m)}{2m}$$
¹ The author of the post has been identified, but prefers to remain anonymous.
where c(n, m) is the number of configurations with n bricks containing the base brick
in its bottom level, so that there are m bricks in the lower level, and where $c_{180}(n, m)$
and $c_{90}(n, m)$ count those that are symmetric after rotations by 180 and 90 degrees in
the XY-plane as indicated.
However, it is obviously unnecessarily inefficient to work our way through all buildings this way, since (1.1) allows us to quickly count all the buildings of maximal height.
Let's elaborate on the idea implicit in Mr. Kristiansen's computation. Whenever we
know that there is only one single brick in some layer of the building, we can compute
the number of possibilities by multiplication of the number of possibilities of what to
put below and the number of possibilities of what to put on top. We may speed up
the computations substantially by defining c(n, m) as the number of buildings with
n bricks, m of which are in the bottommost level, which are fat in the sense that at
every level above the bottommost, there are at least two bricks. Also, define c(n) as
the number of buildings with n + 1 bricks so that there is one brick each in the topmost
and bottommost level, and two or more in any other level. For instance, building 3 in
Figure 1 is one of the buildings counted by c(5). It is elementary, but tedious, to verify
then that
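$$a_n = \sum_{m=2}^{n} \frac{c(n, m) + c_{180}(n, m) + 2c_{90}(n, m)}{2m}
+ \frac{1}{2} \sum_{\ell=0}^{n-1} \; \sum_{m_1 + m_2 + k_1 + \cdots + k_\ell = n + 1}
\Bigl[ c(m_1, 1)\,c(m_2, 1)\,c(k_1) \cdots c(k_\ell)
+ c_{180}(m_1, 1)\,c_{180}(m_2, 1)\,c_{180}(k_1) \cdots c_{180}(k_\ell) \Bigr]. \tag{2.3}$$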
Formula (2.3) looks rather formidable, but has several mitigating features. First, we
note that since the 2 × 4 brick is not itself invariant under a rotation by 90°, it takes
at least 4 bricks (two with the long side parallel to the X-axis, two with the long
side parallel to the Y-axis) to create a layer which is invariant under such a rotation,
and hence $c_{90}(n, m) = 0$ unless 4 divides both m and n; the first nonzero value is
$c_{90}(8, 4) = 244$. Second, since $c(n, m) = 0$ when $n < m + 2$ (unless n = m = 1) and
since c(2) = 0, the expressions reduce substantially for small n. Indeed, we rediscover
$$a_2 = \frac{1}{2}\bigl[c(1) + c_{180}(1)\bigr] = t_2$$
and find
$$a_3 = \frac{1}{2}\bigl[c(1)^2 + 2c(3, 1) + c_{180}(1)^2 + 2c_{180}(3, 1)\bigr],$$
$$a_4 = \frac{1}{4}\bigl[c(4, 2) + c_{180}(4, 2)\bigr] + \frac{1}{2}\bigl[c(1)^3 + 2c(3, 1)c(1) + c(3) + 2c(4, 1)
+ c_{180}(1)^3 + 2c_{180}(3, 1)c_{180}(1) + c_{180}(3) + 2c_{180}(4, 1)\bigr],$$
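The small cases can be checked by hand, or with a few lines of arithmetic. The sketch below takes c(1) = 46 from the series for C(x) quoted later in the text and infers $c_{180}(1) = 2$ from the requirement $a_2 = 24$; the latter value is our inference, not a figure stated in the text. It then isolates the part of $a_3$ that is not of maximal height, using $a_3 = 1560$ from the table.

```python
C1 = 46      # c(1), the coefficient of x in the series for C(x)
C1_180 = 2   # c_180(1): inferred from a_2 = 24, an assumption of this sketch
A3 = 1560    # a_3, taken from the table of known values

# a_2 = (c(1) + c_180(1)) / 2
a2 = (C1 + C1_180) // 2

# The part of a_3 built from c(1) and c_180(1) alone: (c(1)^2 + c_180(1)^2) / 2
max_height_part = (C1 ** 2 + C1_180 ** 2) // 2

# What is left must equal c(3, 1) + c_180(3, 1)
remaining = A3 - max_height_part

if __name__ == "__main__":
    print(a2, max_height_part, remaining)
```

Under these assumptions `max_height_part` coincides with $t_3$, in line with the remark that (1.1) counts exactly the buildings of maximal height.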
exponentially with base $\sqrt{1248} \approx 35.3$. Thus, unless a better way is found to count fat
buildings, the computation time needed to compute $a_n$ will grow exponentially with
a prohibitively large base. The author does not believe it can be done in polynomial
time, but has no formal evidence for such a claim.
The concrete programs used in [2] (see [1] for more details) could compute $a_6$
in about 5 minutes, but with computing times increasing by a factor of around 100 for each
additional brick, finding $a_8$ took about 500 CPU hours and finding $a_9$ was projected
to take more than 5 CPU years. Thus the author was rather awed when approached in
2012 by Johan Nilsson, a Swedish mathematician then based in Germany, who could
not only supply $a_9$ but had also independently verified $a_1, \dots, a_8$.
Dr. Nilsson's approach was to parallelize the problem. The algorithms used in the
author's computations do not lend themselves well to such an approach, but Nilsson
had the brilliant idea of running through all instructions instead, one at a time, checking
which gave rise to buildings. Dividing the universe of instructions evenly among a
large number of computers at the Department of Mathematics at the University of
Bielefeld, which were working on the problem when otherwise idle, Nilsson could
obtain $a_9$ in a matter of months.
3. THE GROWTH CONSTANT. The most efficient way of communicating the versatility of the 2 × 4 brick, rather than a sequence of individual counts, would be via
the growth constant h defined so that
$$a_n \sim k\,h^n.$$
Such growth constants are ubiquitous in asymptotic combinatorics and are key concepts in applications, measuring capacity in contexts of information theory or computer science, or entropy in contexts of physics.
That such a constant h is defined is nontrivial and requires some interpretation of
what we mean by $\sim$. Of course if we knew that $a_{n+1}/a_n$ converged as $n \to \infty$, the
limit would be an excellent candidate, but although this is in all likelihood the case,
(3.4)
in the sense that if one limit exists, so does the other. Note that since we have found
that $\log a_n \le \log u_n = (n - 1) \log 675$, the sequence $(\log a_n)/n$ is bounded. Our claim
(3.4) follows immediately from the inequalities
$$a_{n-1} \le c(n, 1) \le 2a_n.$$
The leftmost inequality follows by mapping each equivalence class of configurations
with $n - 1$ bricks to a representative placed on top of a fixed base brick and noting
that this map is injective. The rightmost follows by mapping each configuration to an
equivalence class and noting that this map is at most 2-to-1.
Letting $C_n$ denote the set of buildings counted by c(n, 1), one sees that $c(n + m, 1) \ge c(n, 1)\,c(m, 1)$ by noting that an injective map from $C_n \times C_m$ to $C_{m+n}$ is defined
by placing the base brick of the element of $C_m$ somewhere on the top layer of the element of $C_n$. Hence, $\log c(n, 1)$ is a superadditive sequence, and appealing to Fekete's
lemma, $\log c(n, 1)/n$ converges to $\sup_{n \in \mathbb{N}} \log c(n, 1)/n$ in $[0, \infty]$. But we have seen
that the limit is finite; indeed it is less than log 675.
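To get a feel for how slowly these limits are approached, one can tabulate the n-th roots and the successive ratios of the known values $a_1, \dots, a_9$ from the table. This is only an illustration on nine data points; it says nothing rigorous about the limit, and the names `A`, `nth_roots` and `ratios` are ours.

```python
# Known values a_1, ..., a_9 from the table.
A = [1, 24, 1560, 119580, 10166403, 915103765,
     85747377755, 8274075616387, 816630819554486]

# n-th roots a_n^(1/n), the quantity whose limit defines the growth constant.
nth_roots = [a ** (1.0 / n) for n, a in enumerate(A, start=1)]

# Successive ratios a_(n+1) / a_n, another candidate for the same limit.
ratios = [A[i + 1] / A[i] for i in range(len(A) - 1)]

if __name__ == "__main__":
    for n, root in enumerate(nth_roots, start=1):
        print(f"n={n}: a_n^(1/n) = {root:.2f}")
    for n, ratio in enumerate(ratios, start=1):
        print(f"a_{n + 1}/a_{n} = {ratio:.2f}")
```

Both sequences are increasing over this range, and the ratios are consistently larger than the roots.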
Taking exponentials, we set
$$h = \lim_{n \to \infty} \sqrt[n]{a_n} = \lim_{n \to \infty} \sqrt[n]{c(n, 1)},$$
noting in particular that when we focus on h rather than individual counts, Mr.
Nicholls' protests become completely inconsequential. Convincing ourselves that
upper bounds $u_n^{b \times w} = (s_{b \times w})^n$ for any dimensions can be obtained by counting instructions, we see further that
$$h_{b \times w} = \lim_{n \to \infty} \sqrt[n]{a_n^{b \times w}}$$
makes sense for any choice of dimension $b \times w$. Thus, we have an extremely efficient
measure of the versatility of each brick in the LEGO product line, which can meaningfully be compared among themselves, and to other such measures. But before we
can ask the LEGO Group to start saying something like
"Already have a lot of 2 × 4 LEGO bricks? Buy one more and have the number
of buildings you can create multiplied by h!"
we have to face up to the task of computing, or at least estimating, such numbers h.
Our lower and upper bounds tell us that $46 \le h \le 675$, leaving a lot of room for
improvement. The first step is to scrutinize our definition of instructions with the aim
of reducing the upper bound. For instance, since 38 of the 46 ways to place one brick
on top of another involve more than one stud, we can avoid some redundancy by
distributing the positions evenly with at most 6 choices for each stud. Furthermore,
one can use that one stud (or one hole) has already been spoken for when placing all
bricks except the first to reduce the number of possible choices on the side which has
already been in use from 46 to 30. One checks that 30 positions can be distributed
evenly on 5 studs, leading to
$$\binom{(8 + 5)n - (8 + 5 + 5)}{n - 1}\, 6^{n-1}$$
instructions, and the ensuing estimate
$$h \le (13^{13}/12^{12}) \cdot 6 < 204.$$
Adapting much more advanced methods developed in the context of enumerating polyominoes ([4]), it was proved in [3] that h < 177.
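The refined estimate can be checked in the same way as the first one. The sketch below assumes the refined count is $\binom{13n-18}{n-1}6^{n-1}$, by analogy with the earlier count $\binom{16n-24}{n-1}16^{n-1}$; the names `refined_count` and `REFINED_BOUND` are ours.

```python
from math import comb


def refined_count(n: int) -> int:
    """Refined instruction count for n >= 2 bricks, assuming the form
    binom((8+5)n - (8+5+5), n - 1) * 6^(n - 1)."""
    return comb(13 * n - 18, n - 1) * 6 ** (n - 1)


# Growth rate of the refined count, by the same Stirling-type argument.
REFINED_BOUND = 6 * 13 ** 13 / 12 ** 12

if __name__ == "__main__":
    print(f"refined bound ~ {REFINED_BOUND:.2f}")
    for n in range(2, 8):
        print(n, refined_count(n), refined_count(n) <= 204 ** (n - 1))
```

For n = 2 the refined count is already close to the true number of two-brick configurations before identification, which gives some indication of how much redundancy the refinement removes.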
The lower bound can be improved somewhat by appealing to the concept of generating functions. Organizing the individual counts into a power series
$$A(z) = \sum_{n=1}^{\infty} a_n z^n,$$
we see by the root criterion that the sum converges in $[0, 1/h)$ and diverges in
$(1/h, \infty)$. Using standard methods from the theory of generating functions, (2.3)
translates to
$$A(z) = \sum_{m=2}^{\infty} \frac{C_m(z) + C_m^{180}(z) + 2C_m^{90}(z)}{2m}
+ \frac{C_1(z)^2}{2z\,(1 - C(z))}
+ \frac{C_1^{180}(z)^2}{2z\,(1 - C^{180}(z))}. \tag{3.5}$$
More precisely, h is the reciprocal of the smallest solution to $C(x) = 1$ on $[0, 1]$. We
know, as recorded in Figure 7,
$$C(x) \ge 46x + 74130x^3 + 867346x^4$$
on $[0, 1]$, so solving for x we obtain that $h > 66$. Using more values of c(n), the last
being
$$c(9) = 2067477693115$$
as computed by Nilsson, and the general estimate $c(n + 2) > 1248\,c(n)$, we may show
(as in [3]) that $h > 81$.
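The step "solving for x" can be carried out by bisection. The sketch below finds the point where the polynomial lower bound for C(x) reaches 1; since C(x) is at least as large as the polynomial, the smallest solution of C(x) = 1 cannot exceed that point, so its reciprocal is a lower bound for h. The function names and the choice of bisection are ours.

```python
def lower_poly(x: float) -> float:
    """Polynomial lower bound for C(x) on [0, 1] quoted in the text."""
    return 46 * x + 74130 * x ** 3 + 867346 * x ** 4


def root_of_lower_poly(iterations: int = 100) -> float:
    """Solve lower_poly(x) = 1 on [0, 1] by bisection.

    lower_poly is increasing on [0, 1] with lower_poly(0) < 1 < lower_poly(1),
    so there is exactly one solution. The upper end of the bracket is returned,
    which keeps 1 / result on the safe side as a lower bound for h.
    """
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if lower_poly(mid) < 1:
            lo = mid
        else:
            hi = mid
    return hi


H_LOWER = 1 / root_of_lower_poly()

if __name__ == "__main__":
    print(f"h > {H_LOWER:.3f}")
```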
But it remains a sad fact, not for want of trying, that this is the best the author
has been able to do, and hence the ambitions for using growth constants to gauge and
compare the versatility of different brick sizes are largely unrealized. For instance, we
can create upper bounds by counting instructions to see that both $h_{1 \times 2}$ and $h_{2 \times 2}$ are less
than 81, and hence prove the nonsurprising fact that the 2 × 4 brick is more versatile
than both the 1 × 2 and the 2 × 2 brick. But because of overlaps between the intervals
in which we know that $h_{1 \times 2}$ and $h_{2 \times 2}$ must be contained, we are not able to say with
certainty which of these bricks is more versatile. By comparing $a_n^{1 \times 2}$:
1, 4, 37, 375, 4493, 56848, 753536, 10283622, 143607345
to $a_n^{2 \times 2}$:
1, 3, 31, 412, 6435, 106108, 1825803, 32320892, 584956651
for $n \in \{1, \dots, 9\}$ it appears that the 2 × 2 brick is superior.
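The comparison of the two sequences can be made concrete by looking at successive ratios, which is one informal way to read off apparent growth. The sketch below uses only the eighteen values listed above; the variable names are ours, and nine terms are of course no proof of anything about the limits.

```python
# Counts for the 1 x 2 and 2 x 2 bricks, n = 1, ..., 9, as listed in the text.
A_1X2 = [1, 4, 37, 375, 4493, 56848, 753536, 10283622, 143607345]
A_2X2 = [1, 3, 31, 412, 6435, 106108, 1825803, 32320892, 584956651]


def successive_ratios(seq):
    """Ratios seq[n+1] / seq[n]; their limit, if it exists, is the growth constant."""
    return [seq[i + 1] / seq[i] for i in range(len(seq) - 1)]


RATIOS_1X2 = successive_ratios(A_1X2)
RATIOS_2X2 = successive_ratios(A_2X2)

if __name__ == "__main__":
    for n, (r12, r22) in enumerate(zip(RATIOS_1X2, RATIOS_2X2), start=1):
        print(f"n={n}: 1x2 ratio {r12:.2f}, 2x2 ratio {r22:.2f}")
```

The 1 × 2 brick is ahead for n = 2 and n = 3, but the 2 × 2 counts overtake it from n = 4 onward and the last observed ratio is clearly larger for the 2 × 2 brick.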
Although it felt a bit like acknowledging defeat, we in [3] took to heuristic estimation of h by the standard method of fitting a straight line to a semilogarithmic plot. The
best fit to our observations $a_1, \dots, a_8$, however, gave the value $h \approx 74.8$, which was not
consistent with our lower bounds, indicating that we had too few observations for such
an approach. In [2] we consequently applied Monte Carlo methods, estimating $a_n$ by
drawing instructions at random and seeing how often they gave rise to actual buildings, to
estimate $a_9, \dots, a_{20}$, and arrived at $h \approx 117$.
This remains the author's best guess, but of course it should be taken only for what it
is: a guess. For instance, we now know that our estimate $a_9 \approx 7.94 \cdot 10^{14}$, obtained 5
years before Nilsson provided the exact value, was almost 3% too low. Imprecisions of
this nature can be expected to cancel out, but this leaves the real problem that we have
no way of knowing how well the growth of $a_1, \dots, a_{20}$ predicts the true value of h.
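The semilogarithmic fit is easy to repeat. The sketch below fits an ordinary least-squares line to the points $(n, \log a_n)$ for n = 1, ..., 8 and exponentiates the slope; ordinary least squares is our assumption about what "best fit" means here. It also computes the relative error of the Monte Carlo estimate of $a_9$ against the exact value from the table.

```python
import math

# Known values a_1, ..., a_9 from the table.
A = [1, 24, 1560, 119580, 10166403, 915103765,
     85747377755, 8274075616387, 816630819554486]


def semilog_growth(values):
    """Least-squares slope of log(a_n) against n, returned as exp(slope)."""
    count = len(values)
    xs = list(range(1, count + 1))
    ys = [math.log(v) for v in values]
    x_mean = sum(xs) / count
    y_mean = sum(ys) / count
    covariance = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    variance = sum((x - x_mean) ** 2 for x in xs)
    return math.exp(covariance / variance)


H_FIT = semilog_growth(A[:8])            # fit to a_1, ..., a_8 only

A9_ESTIMATE = 7.94e14                    # Monte Carlo estimate quoted in the text
A9_REL_ERROR = (A[8] - A9_ESTIMATE) / A[8]

if __name__ == "__main__":
    print(f"fitted growth constant ~ {H_FIT:.1f}")
    print(f"a_9 estimate too low by {100 * A9_REL_ERROR:.1f}%")
```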
ACKNOWLEDGMENT. The author was supported by VILLUM FONDEN through the network for Experimental Mathematics in Number Theory, Operator Algebras, and Topology, as well as the Danish National
Research Foundation through the Centre for Symmetry and Deformation (DNRF92). The author further wishes
to thank records manager Tine Froberg Mortensen at the LEGO Group Archives for her invaluable assistance.
REFERENCES
1. M. Abrahamsen, S. Eilers, Efficient counting of LEGO structures, Tech. Report www.math.ku.dk/~eilers/eclbii.pdf, University of Copenhagen, 2007.
2. M. Abrahamsen, S. Eilers, On the asymptotic enumeration of LEGO structures, Exp. Math. 20 (2011) 145–152.
3. B. Durhuus, S. Eilers, On the entropy of LEGO, J. Appl. Math. Comput. 45 (2014) 433–448.
4. D. A. Klarner, R. L. Rivest, A procedure for improving the upper bound for the number of n-ominoes, Canad. J. Math. 25 (1973) 585–602.
5. J. K. Kristiansen, Mere taljonglering med klodser, Klodshans 3 (1974) 13 [Danish].
6. J. K. Kristiansen, Taljonglering med klodser eller talrige klodser, Klodshans 2 (1974) 12 [Danish].
7. N. J. A. Sloane, The on-line encyclopedia of integer sequences, http://oeis.org.
8. K. Sørensen, Morphology in practice, LEGO Rev. 2 (1991) 8.
SØREN EILERS obtained his Ph.D. from the University of Copenhagen in 1995. He is currently on sabbatical from his position there, acting as the main organizer of the program "Classification of operator algebras:
Complexity, rigidity, and dynamics" at Institut Mittag-Leffler in Stockholm. After being featured as the crazy
mathematician in "A LEGO Brickumentary," his Bacon-Erdős number dropped from ∞ to 7.
Department of Mathematical Sciences, University of Copenhagen, Universitetsparken 5
DK-2100 Copenhagen Ø, Denmark
eilers@math.ku.dk
Mocposite Functions
Harold P. Boas
Abstract. Traditional mathematical notation can lead to confusion. Expressions that appear to
define composite functions sometimes do not. A particular example with engineering applications is studied in detail.
An engineering student and a mathematics student walk into a bar. Instead of carding
the students, the bartender offers them free drinks for a correct answer to the question,
"Is the function $\sqrt{1 - z^2}$ even or odd?" The engineering student shouts out, "Even,
of course!" Noticing the bartender's sphinx tattoo, the mathematics student slyly says,
"My answer is: Yes." Although the smart-aleck second answer arguably is less wrong
than the first, the bartender throws both students into the street and orders them to stay
away until they have studied analytic continuation.
use exponent 1/2 to denote a square root? The exponential form is both cleaner than
$\sqrt{\ }$ to typeset and consistent with the standard notation for other powers.
A thornier problem than the notation is the ambiguity inherent in the concept of
square root, for every number has two square roots. If you object that the number 0 is
an exception having a single square root, then observe that what $\sqrt{z}$ really means is
a solution w to a particular quadratic equation: namely, $w^2 - z = 0$. Every quadratic
equation has two solutions, counting multiplicity.
Nonetheless, there is one case in which everybody agrees that the symbol $\sqrt{z}$ has
a unique meaning. When z happens to be a positive real number, convention dictates
that $\sqrt{z}$ always denotes the positive square root of z. But if z is a negative real number
(or, worse, a nonreal number), then confusion can and does arise.
A quantity i whose square equals $-1$ is fundamental to complex analysis, so neither
the existence nor the uniqueness of i should pass without comment. In the influential
terminology of Descartes [5, p. 380], nonreal solutions of polynomial equations are
http://dx.doi.org/10.4169/amer.math.monthly.123.5.427
MSC: Primary 30B40
"imaginary" in the sense of existing only in the imagination. The device of giving
imaginary numbers a concrete existence as ordered pairs of real numbers (equipped
with a suitable multiplication) is due to William Rowan Hamilton [6] 200 years after
Descartes. The imaginary unit i has an alternative realization, invented by Augustin-Louis Cauchy [4], that can be expressed in modern language as the equivalence class
of the indeterminate x in the algebraic structure $\mathbb{R}[x]/(x^2 + 1)$, the quotient of the ring
of polynomials with real coefficients by the ideal consisting of polynomials that have
$x^2 + 1$ as a factor.
Authors who wish to have the letter i available as a summation index often write
a complex variable in the form $x + y\sqrt{-1}$ instead of $x + yi$, innocently imagining (I
suppose) that the symbol $\sqrt{-1}$ has a unique meaning rather than two possible values.
An inevitable consequence of this belief would be that $\sqrt{-4}$ has a unique meaning
(namely, $2\sqrt{-1}$) and more generally that $\sqrt{z}$ is well defined for z everywhere on the
negative part of the real axis. Since this set is precisely the standard branch cut across
which the complex square-root function is discontinuous, such authors are implicitly
constructing an edifice on top of a fault line and hoping that no earthquake occurs.
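The fault line is easy to observe numerically. The sketch below uses Python's standard `cmath.sqrt`, which implements the principal branch with its cut along the negative real axis, and evaluates it at two points just above and just below $-4$; the offset `EPS` is an arbitrary small number chosen for illustration.

```python
import cmath

EPS = 1e-9  # small imaginary offset, chosen arbitrarily for illustration

# Approach -4 from the upper and from the lower half-plane.
above = cmath.sqrt(complex(-4.0, EPS))
below = cmath.sqrt(complex(-4.0, -EPS))

if __name__ == "__main__":
    print("just above the cut:", above)
    print("just below the cut:", below)
    print("jump across the cut:", abs(above - below))
```

The two inputs differ by about $2 \cdot 10^{-9}$, yet the outputs land near $2i$ and $-2i$ respectively, so no single value of $\sqrt{-4}$ is singled out by continuity.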
The standard square-root function arises by considering an inverse of the function
that sends a complex number z to the image $z^2$. As indicated in Fig. 1, this squaring
function maps the open right-hand half-plane (where the real part of z is positive)
bijectively onto the complex plane with a left-hand slit along the real axis from 0
to $-\infty$. The principal branch of $\sqrt{z}$ means the inverse of this squaring function.
Being the inverse of a holomorphic (that is, complex-analytic) function, the principal
branch of $\sqrt{z}$ is a holomorphic function too. More generally, a branch of $\sqrt{z}$ means
a holomorphic function f such that $(f(z))^2 = z$ for every z in some prescribed domain
in the complex plane.
Not every domain supports a branch of $\sqrt{z}$. The obstruction is the existence in the
domain of a simple closed curve $\gamma$ that surrounds the origin. Indeed, if $(f(z))^2 = z$
for every z in some domain, then the chain rule implies that $2 f f' = 1$, so
$$\frac{f'(z)}{f(z)} = \frac{1}{2(f(z))^2} = \frac{1}{2z}.$$
If C denotes the image of $\gamma$ under f, then
$$\frac{1}{2\pi i} \int_\gamma \frac{f'(z)}{f(z)}\,dz = \frac{1}{2\pi i} \int_C \frac{1}{w}\,dw,$$
which equals the winding number of the curve C about 0: namely, a particular integer.
On the other hand, this integer equals
$$\frac{1}{2\pi i} \int_\gamma \frac{1}{2z}\,dz,$$
which is half the winding number of $\gamma$ about 0. Hence, the existence of f precludes
the existence of a curve in the domain with winding number 1 about 0, since 1/2 is
not an integer.
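The half-integer in this argument can be seen numerically by integrating around the unit circle, which has winding number 1 about 0. The sketch below approximates $\frac{1}{2\pi i}\oint f(z)\,dz$ with a midpoint rule in the angle; the function name `unit_circle_integral` and the number of sample points are choices of ours.

```python
import cmath
import math


def unit_circle_integral(f, samples: int = 2000) -> complex:
    """Approximate (1 / (2*pi*i)) * integral of f(z) dz over the unit circle,
    using the parametrization z = exp(i*theta) and the midpoint rule."""
    dtheta = 2 * math.pi / samples
    total = 0j
    for k in range(samples):
        theta = (k + 0.5) * dtheta
        z = cmath.exp(1j * theta)
        total += f(z) * 1j * z * dtheta   # dz = i * z * dtheta
    return total / (2j * math.pi)


WINDING = unit_circle_integral(lambda z: 1 / z)          # winding number of the circle
HALF = unit_circle_integral(lambda z: 1 / (2 * z))       # the integral from the argument

if __name__ == "__main__":
    print("winding number:", WINDING)
    print("integral of 1/(2z):", HALF)
```

Since the second value is not an integer, it cannot be the winding number of any closed curve, which is exactly the contradiction used above.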
When g is a holomorphic function, a branch of $\sqrt{g}$ means a holomorphic function f such that $(f(z))^2 = g(z)$ for every z in some prescribed domain. A subtle but
a branch of $\sqrt{g}$
on the entire complex plane (namely, the identity function), but there
is no branch of $\sqrt{z}$ on the image of g
(for that image is the entire complex plane).
If a domain supports a branch of $\sqrt{z}$, then the negative of that function is another
branch. Consequently, the value of the expression $\sqrt{z}$ when $z = 4$ is not necessarily
[Figures 2 and 3: the maps $1 - z^2$ and $f_1(z)$.]
Therefore, the function that sends z to $1 - z^2$ maps the open upper half-plane bijectively onto the plane with a left-hand slit along the real axis from 1 to $-\infty$. (See Fig. 2.)
This open region is a subset of the domain of the principal branch of the square-root
function, so $\sqrt{1 - z^2}$ is well-defined on the upper half-plane as a composite function,
say $f_1$.
The image of $f_1$ is nearly identical to the image of the principal branch of the square
root, except for removal of the image under the square-root function of the segment
of the real axis from 0 to 1. Since the square-root function maps that segment back to
itself, the function $f_1$ maps the upper half-plane bijectively onto the right-hand half-plane with a slit along the real axis from 0 to 1, as shown in Fig. 3. Notice that if y is
a positive real number, then $f_1(iy) = \sqrt{1 + y^2}$ (positive square root), so $f_1$ maps the
part of the imaginary axis in the upper half-plane onto the unbounded interval of the
real axis from 1 to $+\infty$.
The next step (a nonunique process) is to extend the function $f_1$ to a larger domain. Here is one way to proceed. Observe that when $z$ is a point in the upper half-plane with real part greater than 1 and imaginary part close to 0, the point $z^2$ has the same properties. The point $1 - z^2$ then lies in the third quadrant close to the real axis. Taking the principal square root shows that the value $f_1(z)$ lies in the fourth quadrant close to the imaginary axis. The upshot is that $f_1$ extends continuously to the unbounded interval of the real axis to the right of 1 and maps this interval to the bottom half of the imaginary axis. Explicitly, the extension of $f_1$ maps an arbitrary real number $x$ greater than 1 to the image $-i\sqrt{x^2 - 1}$ (positive square root). Parallel reasoning shows that $f_1$ extends continuously to the unbounded interval of the real axis to the left of $-1$, and $f_1$ maps this interval to the top half of the imaginary axis (Fig. 3).
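The boundary behavior described here can be sanity-checked numerically. The sketch below assumes only that $f_1(z)$ is the principal square root of $1 - z^2$ on the upper half-plane, which is what `cmath.sqrt` computes; the sample points and the small offset `eps` are arbitrary choices for illustration.

```python
import cmath

def f1(z):
    """Principal square root of 1 - z^2 (cmath.sqrt uses the principal branch)."""
    return cmath.sqrt(1 - z * z)

y, x, eps = 2.0, 2.0, 1e-9

on_imaginary_axis = f1(1j * y)      # should be close to sqrt(1 + y^2), a positive real
right_boundary = f1(x + 1j * eps)   # should be close to -i*sqrt(x^2 - 1)
left_boundary = f1(-x + 1j * eps)   # should be close to +i*sqrt(x^2 - 1)

print(on_imaginary_axis, right_boundary, left_boundary)
```

Points just above the real axis to the right of 1 land near the bottom half of the imaginary axis, and points just above the real axis to the left of $-1$ land near the top half.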
This situation admits application of the Schwarz reflection principle, the simplest method of analytic continuation discussed in a first course on complex analysis. The principle says that if a holomorphic function in the top half of a region symmetric with respect to the real axis extends continuously to an open subset of the real axis and takes real values there, then the function extends across that subset of the real axis to a function that is holomorphic in the whole symmetric region. Moreover, the extended function maps points that are symmetric with respect to the real axis to image points that are again symmetric with respect to the real axis.
[Figure: (a) domain of $f_2$; (b) domain of $f_3$.]
Accordingly, the function $i f_1$ extends by reflection to be holomorphic on the plane with a slit along the real axis from $-1$ to $1$. Let $f_2$ denote the corresponding extension of $f_1$ to this slit plane. Since the extension of $i f_1$ maps pairs of complex-conjugate points to complex-conjugate image points, the function $f_2$ has the property that
\[ f_2(\bar{z}) = -\overline{f_2(z)} \tag{1} \]
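A quick numerical check of this conjugation property is sketched below. It relies on an assumption that is not derived at this point in the text: that on the plane slit from $-1$ to $1$ the extension can be written in closed form as $f_2(z) = -i z\sqrt{1 - 1/z^2}$ with the principal square root. The sample points are arbitrary.

```python
import cmath

def f1(z):
    """Principal square root of 1 - z^2, used on the upper half-plane."""
    return cmath.sqrt(1 - z * z)

def f2(z):
    """Assumed closed form for the extension to the plane slit along [-1, 1]."""
    return -1j * z * cmath.sqrt(1 - 1 / (z * z))

samples = [0.3 + 0.7j, -1.5 + 0.2j, 2.0 - 3.0j, 0.5 - 0.1j]

# Property (1): f2(conj(z)) = -conj(f2(z)).
conjugation_defect = max(abs(f2(z.conjugate()) + f2(z).conjugate()) for z in samples)
# Oddness: f2(-z) = -f2(z).
oddness_defect = max(abs(f2(-z) + f2(z)) for z in samples)
# Agreement with f1 on the upper half-plane.
agreement_defect = max(abs(f2(z) - f1(z)) for z in samples if z.imag > 0)

print(conjugation_defect, oddness_defect, agreement_defect)
```

All three defects come out at rounding-error level for these samples, which is consistent with the assumed formula being an odd extension of $f_1$.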
the problem of finding a holomorphic self-mapping of the slit plane with inverse function equal to its negative.
$\sqrt{1 - z^2}$. To an engineer, this function is the natural interpretation of the symbols $\sqrt{1 - z^2}$, not only because the function is composite but also because the reciprocal of this function is the analytic continuation to the doubly slit plane of the derivative of the inverse-sine function used in elementary differential calculus.
In summary, the bartender's question does not admit a one-word answer. A reasonable but incomplete short answer is, "It depends on the domain of the function." A deeper answer is, "The question is wrong!" The ultimate domain for $\sqrt{1 - z^2}$ is not a region in the plane but rather a two-sheeted Riemann surface, and on an abstract surface, the notions of "even" and "odd" lose meaning. The surface can be visualized as two copies of Fig. 4(a) stitched together along the slit, the upper edge of the slit in either sheet being attached to the lower edge of the slit in the other sheet; crossing the slit corresponds to moving from one sheet to the other. Alternatively, joining two copies of Fig. 4(b) results in an equivalent surface.
Figure 6. Two exotic slit regions.
...
...
the integral is independent of the path joining $z_0$ to $z$ because the region is simply connected: Two different paths can be deformed into each other without changing the value of the integral. The function $f e^{-g}$ has value 1 at $z_0$, and the derivative of $f e^{-g}$ is equal to zero by the product rule, the chain rule, and the fundamental theorem of calculus. Therefore, $f = e^{g}$.
The natural name for $g$, a holomorphic logarithm of $f$, is $\log f$. Often, $\log f$ is a mocposite function: The symbols must not be interpreted as a composition $\log \circ f$.
Consider, for instance, the sine function on the plane with the infinitely many unbounded vertical slits shown in Fig. 7: for each integer $n$, a slit starting at the point $n\pi$ on the real axis and going up. The zeroes of the sine function are the endpoints of the slits, so the sine function has no zero on the plane with these infinitely many slits, which is a simply connected region. Therefore, a holomorphic logarithm function $\log \sin z$ exists on the region. This function is mocposite, for the sine function maps the infinitely slit plane onto $\mathbb{C} \setminus \{0\}$, the punctured plane, where no holomorphic logarithm function lives: Composition $\log \circ \sin$ is not defined.
Exercise for the reader. If the value of the function $\log \sin z$ when $z = \frac{1}{2}\pi$ is 0, then the value when $z = 2\pi + \frac{1}{2}\pi$ is $2\pi i$. More generally, if $n$ is an arbitrary integer, then the value of $\log \sin z$ when $z = (n + \frac{1}{2})\pi$ is $n\pi i$.
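For readers who want numerical evidence before attempting the exercise, here is a hedged sketch. It assumes that the logarithm is obtained by integrating $\cos z / \sin z$ from $\frac{1}{2}\pi$ along a path that dips below the real axis (so that it avoids the upward slits), and it approximates that integral with the midpoint rule. The function names and the particular rectangular path are illustrative choices.

```python
import cmath
import math

def cot(z):
    return cmath.cos(z) / cmath.sin(z)

def segment_integral(f, a, b, n=4000):
    """Midpoint-rule approximation of the integral of f along the segment from a to b."""
    h = (b - a) / n
    return sum(f(a + (k + 0.5) * h) for k in range(n)) * h

def log_sin_at(n):
    """Approximate log sin at (n + 1/2)*pi, normalized to 0 at pi/2, continued
    along a path that passes below the real axis (the slits go up)."""
    start = math.pi / 2
    end = (n + 0.5) * math.pi
    corners = [start, start - 1j, end - 1j, end]
    return sum(segment_integral(cot, a, b) for a, b in zip(corners, corners[1:]))

for n in (-1, 0, 1, 2):
    print(n, log_sin_at(n))
```

The printed values are close to $n\pi i$, in agreement with the claim of the exercise.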
A mocposite function of a different character is the entire (holomorphic in the entire plane) function $\cos\sqrt{z}$. This expression cannot be understood as a composite function, for $\sqrt{z}$ is not holomorphic in a neighborhood of the origin. Nonetheless, the cosine function has a Maclaurin series containing only even powers of the variable, and replacing this variable by $\sqrt{z}$ produces the power series
\[ 1 - \frac{z}{2!} + \frac{z^2}{4!} - \frac{z^3}{6!} + \cdots, \]
which converges for every $z$ and thus represents an entire function that can reasonably be named $\cos\sqrt{z}$. This function is perhaps the simplest example of an entire function of fractional order. (The order of an entire function $f$ is the infimum of the positive
\[ \frac{1}{\pi}\int_{-1}^{1} \frac{\sqrt{1 - t^2}}{z - t}\,dt = z - \sqrt{z^2 - 1}. \tag{2} \]
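\[ \int_{-1}^{1} \frac{\sqrt{1 - t^2}}{-z - t}\,dt \overset{(s=-t)}{=} \int_{-1}^{1} \frac{\sqrt{1 - s^2}}{-z + s}\,ds \overset{(s=t)}{=} -\int_{-1}^{1} \frac{\sqrt{1 - t^2}}{z - t}\,dt. \]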
Therefore, the right-hand side of equation (2) must be antisymmetric too, but the term $\sqrt{z^2 - 1}$ does not look antisymmetric to an engineer. As explained in §2, this expression is an odd mocposite function.
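The antisymmetry can be observed numerically. The sketch below rests on two assumptions that should be read as such: that equation (2) is the identity $\frac{1}{\pi}\int_{-1}^{1} \frac{\sqrt{1-t^2}}{z-t}\,dt = z - \sqrt{z^2-1}$, and that the odd branch of $\sqrt{z^2-1}$ off the slit can be computed as $z\sqrt{1 - 1/z^2}$ with the principal square root. The integral is evaluated with the substitution $t = \cos\theta$ and the midpoint rule.

```python
import cmath
import math

def left_side(z, n=4000):
    """(1/pi) * integral over [-1, 1] of sqrt(1 - t^2)/(z - t) dt,
    computed with t = cos(theta) and the midpoint rule in theta."""
    h = math.pi / n
    total = 0j
    for k in range(n):
        theta = (k + 0.5) * h
        total += math.sin(theta) ** 2 / (z - math.cos(theta))
    return total * h / math.pi

def right_side(z):
    """z - sqrt(z^2 - 1), with sqrt(z^2 - 1) taken as z*sqrt(1 - 1/z^2) (assumed odd branch)."""
    return z - z * cmath.sqrt(1 - 1 / (z * z))

samples = [2 + 0j, -2 + 0j, 1 + 1j, 0.5j]
for z in samples:
    print(z, left_side(z), right_side(z), left_side(-z))
```

For each sample the two sides agree to rounding error, and replacing $z$ by $-z$ flips the sign of the integral.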
\[ -2\pi\sqrt{z^2 - 1} = -2\pi z + \int_{\text{ellipse}} \frac{\sqrt{1 - w^2}}{z - w}\,dw. \]
Now let the ellipse collapse down to the slit. When $w$ has a positive imaginary part and approaches a real value $t$ between $-1$ and $1$, the quantity $\sqrt{1 - w^2}$ approaches the positive value $\sqrt{1 - t^2}$. The mocposite function $\sqrt{1 - w^2}$ is antisymmetric, so when $w$ has a negative imaginary part and approaches a real value $t$ between $-1$ and $1$, the quantity $\sqrt{1 - w^2}$ approaches the negative value $-\sqrt{1 - t^2}$. The top part of the ellipse approaches the slit oriented from left to right, and the bottom part of the ellipse approaches the slit oriented from right to left. Accordingly,
\[ \int_{\text{ellipse}} \frac{\sqrt{1 - w^2}}{z - w}\,dw \quad\text{approaches}\quad 2\int_{-1}^{1} \frac{\sqrt{1 - t^2}}{z - t}\,dt. \]
The conclusion is that
\[ -2\pi\sqrt{z^2 - 1} = -2\pi z + 2\int_{-1}^{1} \frac{\sqrt{1 - t^2}}{z - t}\,dt. \]
Hankel carefully explains how he understands the expression $\sqrt{1 - t^2}$ when $t$ is outside the interval of the real axis between $-1$ and $1$: namely, as the product of suitably chosen branches of $\sqrt{1 - t}$ and $\sqrt{1 + t}$. A sequel to this paper [8] was published two years after Hankel's untimely death at age 34 from a stroke [11]. Despite the clear account of branches in the original article, George Neville Watson trips up in his exposition of Hankel's work half a century later [10, Chap. 6] by incautiously claiming noninteger powers of $t^2 - 1$ to be even functions (and by integrating over a contour not lying in any region where such a power of $t^2 - 1$ can be defined as a holomorphic function¹).
The mocposite function on the right-hand side of equation (2) appears in another engineering application, one dealing with airplane wings. A version of the Joukowski² airfoil map sends a nonzero complex number $z$ to the average of $z$ and $1/z$. At least formally, this map is an inverse of the right-hand side of equation (2). Indeed,
\[ \frac{1}{z - \sqrt{z^2 - 1}} = \frac{z + \sqrt{z^2 - 1}}{z^2 - (z^2 - 1)} = z + \sqrt{z^2 - 1}, \]
so, as required,
\[ \frac{1}{2}\left( z - \sqrt{z^2 - 1} + \frac{1}{z - \sqrt{z^2 - 1}} \right) = z. \]
¹ Experts will see how to salvage Watson's derivation by integrating a suitable holomorphic one-form over an appropriate cycle in a Riemann surface.
² Famous in his native land, Nikolai Egorovich Zhukovskii (1847–1921) is the father of Russian aviation. In his French publications, notably the 1916 book Aérodynamique, the usual transliteration of his name is Joukowski, the spelling by which his map is commonly designated in the English literature.
What is needed in addition to this formal calculation is a consideration of domains. The first observation is that the Joukowski map sending $z$ to $\frac{1}{2}(z + \frac{1}{z})$ is a two-to-one mapping from $\mathbb{C} \setminus \{0\}$, the punctured plane, onto the whole plane $\mathbb{C}$. Indeed, if $c$ is an arbitrary complex number, then saying that $\frac{1}{2}(z + \frac{1}{z}) = c$ is equivalent to saying that $z^2 + 1 = 2cz$, so there are two solutions for $z$ (counting multiplicity). Moreover, the symmetry between $z$ and $1/z$ reveals that the Joukowski function maps each of the regions $\{ z \in \mathbb{C} : 0 < |z| < 1 \}$ and $\{ z \in \mathbb{C} : |z| > 1 \}$ one-to-one onto the same image. If $\theta$ is a real number, then $\frac{1}{2}(e^{i\theta} + e^{-i\theta}) = \cos\theta$, so the Joukowski function maps the unit circle two-to-one onto the segment of the real axis between $-1$ and $1$.
Consequently, the Joukowski function maps the punctured unit disk bijectively onto the plane with a slit from $-1$ to $1$ and maps the exterior of the unit disk bijectively onto the same image. The expression on the right-hand side of equation (2) is the inverse of one of these two functions. Since $z - \sqrt{z^2 - 1}$ is close to 0 when the modulus of $z$ is large, this expression is the inverse of the restriction of the Joukowski function to the punctured unit disk. The Joukowski function is plainly odd (antisymmetric), and the inverse of an odd function is odd, so the preceding argument reconfirms the oddness of the mocposite function $\sqrt{1 - z^2}$.
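The inverse relationship and the domain claims can be illustrated numerically. In the sketch below, `joukowski` is the map $w \mapsto \frac{1}{2}(w + 1/w)$ from the text, while `inverse_branch` is an assumed closed form: it computes $z - \sqrt{z^2 - 1}$ with $\sqrt{z^2 - 1}$ taken as $z\sqrt{1 - 1/z^2}$ (principal square root), which is one way to realize the branch that is small for large $|z|$. The sample points are arbitrary points off the slit.

```python
import cmath
import math

def joukowski(w):
    """Average of w and 1/w."""
    return 0.5 * (w + 1 / w)

def inverse_branch(z):
    """z - sqrt(z^2 - 1), with sqrt(z^2 - 1) computed as z*sqrt(1 - 1/z^2) (assumed form)."""
    return z - z * cmath.sqrt(1 - 1 / (z * z))

samples = [2 + 0j, -2 + 0j, 1 + 1j, -0.3 + 0.5j, 0.2j, 10 - 7j]

for z in samples:
    w = inverse_branch(z)
    # w lies in the punctured unit disk, and the Joukowski map sends it back to z
    print(z, abs(w), joukowski(w))

# On the unit circle the Joukowski map reduces to the cosine.
theta = 0.7
on_circle = joukowski(cmath.exp(1j * theta))
```

Every computed preimage has modulus less than 1, the round trip returns the starting point, and the preimage of $-z$ is the negative of the preimage of $z$.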
5. COMPLETED MY DESIGN. After both analysis and application, my story
about mocposite functions, symmetry, and analytic continuation has come full circle. I
hope that you have returned to the starting point at a new level on the Riemann surface
of understanding. Here is your exit exam.
Exercise for the reader. Show that on the plane with a slit along the real axis from $-1$ to $1$, the function $\sqrt{1 - \frac{1}{z^2}}$ is even and composite, and $\sqrt{1 - z^2} = -i z\sqrt{1 - \frac{1}{z^2}}$.
My secondary theme is that we mathematicians often commit expository solecisms by using confusing or ambiguous expressions, such as $\sqrt{1 - z^2}$, even though we purport to value rigor and precision. Lewis Carroll, from whose works I have borrowed my section titles, memorably chaffed eggheads for this shortcoming:
"When I use a word," Humpty Dumpty said, in rather a scornful tone, "it means just what I choose it to mean – neither more nor less." [3]
REFERENCES
1. F. Cajori, A History of Mathematical Notations. Vol. 1. Open Court Publishing, Chicago, 1928.
2. L. Carroll, Alice's Adventures in Wonderland. Macmillan, London, 1866.
3. L. Carroll, Through the Looking-Glass. Macmillan, London, 1872.
4. A. Cauchy, Mémoire sur une nouvelle théorie des imaginaires, et sur les racines symboliques des équations et des équivalences, C. R. Acad. Sci. Paris 24 (1847) 1120–1130.
5. R. Descartes, Discours de la méthode pour bien conduire sa raison et chercher la vérité dans les sciences, plus la dioptrique, les météores et la géométrie qui sont des essais de cette méthode, J. Maire, Leyde, 1637.
6. W. R. Hamilton, Theory of conjugate functions, or algebraic couples; with a preliminary and elementary essay on algebra as the science of pure time, Trans. Roy. Irish Acad. 17 (1837) 293–422, http://www.jstor.org/stable/30078796.
7. H. Hankel, Die Cylinderfunctionen erster und zweiter Art, Math. Ann. 1 (1869) 467–501, https://doi.org/10.1007/BF01445870.
8. H. Hankel, Bestimmte Integrale mit Cylinderfunctionen, Math. Ann. 8 (1875) 453–470, https://doi.org/10.1007/BF02106596.
9. K. Hellan, Introduction to Fracture Mechanics. McGraw-Hill, New York, 1984.
10. G. N. Watson, A Treatise on the Theory of Bessel Functions. Cambridge Univ. Press, Cambridge, 1922.
11. W. v. Zahn, Einige Worte zum Andenken an Hermann Hankel, Math. Ann. 7 (1874) 583–590, https://doi.org/10.1007/BF02104927.
HAROLD P. BOAS received a Ph.D. in 1980 from the Massachusetts Institute of Technology. He taught at Columbia University for four years before joining the faculty of Texas A&M University, where he currently is Regents Professor and Presidential Professor for Teaching Excellence. He edited the book-review column in this journal during 1998–1999 and served as editor of the Notices of the American Mathematical Society during 2001–2003.
Department of Mathematics, Texas A&M University, College Station TX 77843
boas@tamu.edu
1. INTRODUCTION.
Tractrix and pseudosphere. A tractrix is the trajectory of the rear wheel of a bicycle whose front wheel, at constant distance from the rear wheel, moves along a fixed
straight line. For example, in Figure 1a the straight line is the x axis, and the bicycle
shown from above has length k between its wheels. The rear wheel is placed initially
at (0, k), and as the front wheel moves along the positive x axis the tractrix is the path
traced by the rear wheel. We shall refer to this as the first quadrant portion of a full
tractrix, shown in Figure 1b, which is the curve with a center of symmetry obtained
by first reflecting the first quadrant portion through the y axis and then reflecting this
extended curve through the x axis, the asymptote of the tractrix. We denote by k the constant length of the tangent segment (the distance between the bicycle wheels) from the tractrix to the asymptote. The tractrix has a cusp of height k above the asymptote and a reflected cusp below the asymptote. Figure 1c shows a pseudosphere, the name commonly given to the surface of revolution generated by rotating a tractrix about its asymptote.
Figure 1. (a) First quadrant tractrix with tangent segments of constant length k. (b) Full tractrix. (c) Surface of revolution obtained by rotating the tractrix in (b) about its asymptote is a pseudosphere of pseudoradius k.
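The bicycle description of the tractrix lends itself to a small simulation. The sketch below is an illustration under stated assumptions, not the authors' construction: it moves the front wheel along the x axis in small steps, lets the rear wheel move along the frame direction (a crude Euler scheme), and compares the result with the classical closed form $x = k\ln\frac{k + \sqrt{k^2 - y^2}}{y} - \sqrt{k^2 - y^2}$ for the tractrix, a formula that is standard but not quoted in the text. The function names are made up for this example.

```python
import math

def trace_rear_wheel(k=1.0, front_end=2.0, steps=20000):
    """Euler simulation: the front wheel moves along the x axis from 0 to
    front_end; the rear wheel starts at (0, k) and always moves along the frame."""
    rx, ry = 0.0, k
    ds = front_end / steps
    for i in range(steps):
        fx = i * ds                      # current front-wheel position (fx, 0)
        dx, dy = fx - rx, -ry            # frame vector from rear wheel to front wheel
        length = math.hypot(dx, dy)
        ux, uy = dx / length, dy / length
        speed = ux                       # component of the front velocity (1, 0) along the frame
        rx += speed * ux * ds
        ry += speed * uy * ds
    return rx, ry

def tractrix_x(y, k=1.0):
    """Closed-form x(y) for the first-quadrant tractrix starting at (0, k)."""
    root = math.sqrt(k * k - y * y)
    return k * math.log((k + root) / y) - root

rx, ry = trace_rear_wheel()
print(rx, ry, tractrix_x(ry))
```

Up to the discretization error of the Euler scheme, the simulated rear-wheel position lies on the closed-form curve.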
The pseudosphere is so named because many of its properties are analogous to those of a sphere. A well-known analogy concerns curvature. The surface of a sphere has constant positive Gaussian curvature, whereas the surface of a pseudosphere has constant negative Gaussian curvature (see [3, p. 282]). Another analogy concerns surface area. It is known (from integral calculus) that the pseudosphere has surface area $4\pi k^2$, the same as that of a sphere of radius k, obtained by rotating a circle about its diameter. Because of this analogy, it is reasonable to refer to distance k as the pseudoradius of the pseudosphere. A lower-dimensional analogy relates the tractrix and circle: the region in Figure 1b bounded by the full tractrix has area equal to that of a circular disk of radius k (see [2, p. 13]).
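Both area statements can be checked with a short numerical integration. The sketch assumes the standard parametrization of the first-quadrant tractrix, $x = k(t - \tanh t)$, $y = k/\cosh t$ for $t \ge 0$ (not given in the text), for which $ds = k\tanh t\,dt$ and $dx = k\tanh^2 t\,dt$. The infinite parameter range is truncated at an arbitrary cutoff `T`.

```python
import math

def midpoint(f, a, b, n=100000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

k = 1.5
T = 40.0   # cutoff for the infinite parameter range; the tail is negligible

# Pseudosphere: twice the area swept by the first-quadrant tractrix,
# i.e. twice the integral of 2*pi*y ds with y = k/cosh(t), ds = k*tanh(t) dt.
surface_area = 2 * midpoint(
    lambda t: 2 * math.pi * (k / math.cosh(t)) * k * math.tanh(t), 0.0, T)

# Region bounded by the full tractrix: four times the integral of y dx,
# with dx = k*tanh(t)^2 dt.
enclosed_area = 4 * midpoint(
    lambda t: (k / math.cosh(t)) * k * math.tanh(t) ** 2, 0.0, T)

print(surface_area, 4 * math.pi * k**2)
print(enclosed_area, math.pi * k**2)
```

The two printed pairs agree closely, matching the sphere of radius k and the disk of radius k respectively.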
http://dx.doi.org/10.4169/amer.math.monthly.123.5.439
MSC: Primary 51M25
Figure 2. Each pseudospherical zone has surface area equal to that of a cylindrical zone which, in turn, is that
of a spherical zone.
2. SURFACES OF CONSTANT GAUSSIAN CURVATURE. It is a pleasant surprise to learn that the equality of surface areas of zones on a pseudosphere and on a
sphere, as illustrated in Figure 2, can be extended to surface areas of zones on any
surface of revolution of constant Gaussian curvature (positive or negative).
Figure 3a shows a surface of constant positive curvature $K$. If $\rho = 1/\sqrt{K}$, it is known that $\rho^2 = Rr$, where $R$ and $r$ are the two principal radii of curvature at an arbitrary point $P$ of the profile curve (which we call a Gaussian profile) that is rotated to produce the surface (see [3, p. 131]). In Figure 3a, $R$ is the radius of curvature of the Gaussian profile at $P$, and (by a theorem of Meusnier [3, p. 122]) $r$ is the length of the normal from $P$ to the axis of rotation. A small arc of length $\Delta l$ at point $P$ on the profile curve sweeps out an elemental zone on the surface of area $\Delta S$ given by $\Delta S = 2\pi x\,\Delta l$, where $x$ is the distance of $P$ from the axis of rotation. But $x = r\cos\theta$, where $\theta$ is the angle shown, hence $\Delta S = 2\pi r\cos\theta\,\Delta l$. But $\Delta l = R\,\Delta\theta$, so $\Delta S = 2\pi r R\cos\theta\,\Delta\theta = 2\pi\rho^2\cos\theta\,\Delta\theta$. But this is also the area of a zone on a sphere of radius $\rho$ swept by rotating a circular arc of length $\rho\,\Delta\theta$. Figure 3b shows the result of integration with respect to $\theta$ of the elements in Figure 3a from $\theta = \theta_1$ to $\theta = \theta_2$. On the surface of constant curvature $1/\rho^2$, the area of the shaded zone whose end normals correspond to the angles $\theta_1$ and $\theta_2$ is equal to that of the corresponding spherical zone of radius $\rho$ whose end normals correspond to the same angles (Figure 3c). Figure 3d shows a zone of the same area on another surface of constant curvature $1/\rho^2$.
[Figure 3. Panels (a)-(d); labels: General, Spindle type, Sphere, Bulge type.]
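The integration step can be made concrete. Writing $\rho$ for the radius with $K = 1/\rho^2$ and $\theta$ for the angle of the normal (symbol names assumed here, since the extracted text has lost the original letters), the elemental zones $2\pi\rho^2\cos\theta\,\Delta\theta$ add up to $2\pi\rho^2(\sin\theta_2 - \sin\theta_1)$, which does not depend on the particular profile. The sketch below compares a numerical sum with this closed form; the function names are hypothetical.

```python
import math

def zone_area_numeric(rho, theta1, theta2, n=100000):
    """Sum the elemental zones 2*pi*rho^2*cos(theta)*d(theta) with the midpoint rule."""
    h = (theta2 - theta1) / n
    return sum(2 * math.pi * rho**2 * math.cos(theta1 + (i + 0.5) * h)
               for i in range(n)) * h

def zone_area_closed_form(rho, theta1, theta2):
    """Area of the spherical zone of radius rho between normal angles theta1 and theta2."""
    return 2 * math.pi * rho**2 * (math.sin(theta2) - math.sin(theta1))

rho, theta1, theta2 = 2.0, 0.2, 1.1
numeric = zone_area_numeric(rho, theta1, theta2)
closed = zone_area_closed_form(rho, theta1, theta2)
print(numeric, closed)
```

Only $\rho$ and the two angles enter the result, which is why every surface in the family shares the zone area of the sphere.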
The analysis in Figure 3 can be reformulated as follows. Consider the family of
all surfaces of revolution with parallel axes of revolution having the same constant
positive curvature. For any member of this family, consider the zone defined by two
normals at its ends making two given fixed angles with the axis of revolution. Then we
have the following.
Equality of zonal areas on surfaces of constant positive curvature. The surface
area of such a zone is the same for every member of the family which, in particular, is
the area of the corresponding zone for the spherical member of the family.
The same analysis applies to families of solids of revolution of constant negative
curvature, as indicated in Figure 4. The surface area of the shaded region in Figure 4b
whose normals at the ends of the zones subtended by angles $\theta_1$ and $\theta_2$ is equal to that
[Figure 4. Panels (a)-(d); labels: General, Hyperboloid type, Pseudosphere, Conic type.]
We can rotate a Gaussian profile repeatedly through higher and higher dimensions to obtain an n-dimensional Gaussian surface (not necessarily of constant curvature) as was done in [1] with the n-pseudosphere. The equal zone area property that was established in [1] for the n-sphere and n-pseudosphere also holds for an n-dimensional Gaussian surface (via an n-cylindroid), similar to the result for the 3-dimensional case. Details are omitted.
3. PROFILES OF SURFACES WITH CONSTANT CURVATURE.
Using bicycles to trace Gaussian profiles. Just as a tractrix is traced by the rear
wheel of a bicycle, we will show that the same is true of any Gaussian profile. The
constant length of the bicycle (the distance between the rear and front wheels) will
provide us with the constant k that serves as the constant radius of curvature of the
Gaussian profile. First we note the following geometrical property inherent in any
bicycle motion. It concerns the line passing through the axle of a bicycle wheel that is
perpendicular to the plane containing the outer circumference of the wheel.
Lemma 1. The two lines through the two axles of a bicycle's wheels intersect at the
center of curvature of the path traced by the rear wheel, regardless of the direction of
the front wheel.
Sketch of proof. First, assume the front wheel makes a fixed acute angle $\alpha$ with the frame of the bicycle, as in Figure 6a. Then the wheels trace concentric circles and the lines through the axles meet at their common center, denoted by $S$. So for this motion, $S$ is the center of curvature of both trajectories. Let $r$ denote the radial distance from $S$ to the rear wheel. Then the radial distance from $S$ to the front wheel is the hypotenuse of a right triangle with legs $r$ and $k$, where $k$ is the constant distance between the rear and front wheels, as indicated in Figure 6a. Note that $r = k/\tan\alpha$. In the limiting case $\alpha = \pi/2$ the radial distance $r$ shrinks to 0, as shown in Figure 6b. We can continually change the direction of motion of the front wheel by allowing $\alpha$ to vary. The rear wheel always points in a direction along the frame, as indicated in Figures 6c and 6d.
Figure 6. (a) Bicycle wheels trace concentric circles if the front wheel makes a fixed angle $\alpha$ with the frame. (b) Limiting case with $\alpha = \pi/2$. As the front wheel changes its direction, as in (c) and (d), the rear wheel moves in a direction along the frame.
Now let's see what happens when the front wheel moves from an initial position 1 to a nearby second position 2 with the angle changing by a small amount, as illustrated in Figure 7a. The motion of the front wheel from 1 to 2 has two components, shown in Figure 7b: a radial component along the frame to an intermediate position i, and a transverse component perpendicular to the frame from i to 2. Figures 7c and 7d indicate two different ways to reach the second position as a combination of radial and transverse motions. In Figure 7c, the front wheel moves radially in the direction of the frame to an intermediate position i, and then turns at right angles and rotates through a small angle to position 2. In Figure 7d, the front wheel turns at right angles from
Figure 7. (a) Front wheel moves from position 1 to nearby position 2, and rear wheel follows at distance k.
(b) Radial and transverse components of the motion in (a). Two ways, (c) and (d), to decompose the motion in
(a) into radial and transverse components.
Figure 8. (a) Rotation of a tractrix (a prototype of a negative Gaussian profile) about its asymptote generates
a pseudosphere. Rotation of a negative Gaussian profile generates a surface of constant negative Gaussian
curvature of hyperboloid type (b) if the front wheel of the bicycle always lies to the left of the axis of revolution,
and of conic type (c) if the front wheel always lies to the right of the axis of revolution.
Take a bicycle of length $k$ with its rear wheel to the left of the axis of revolution, and consider two cases: (1) The front wheel always lies to the left of the axis, as in Figure 8b, and (2) the front wheel always lies to the right of the axis, as in Figure 8c. In both cases, the rear wheel is tangent to $G^-$ as shown in Figures 8b and 8c. Draw the line through the axle of the rear wheel perpendicular to the bicycle. It intersects the axis of revolution at a point designated as $S$, which, by Meusnier's theorem ([3, p. 122]), is one of the two principal centers of curvature of the surface of revolution generated by $G^-$ with corresponding radius of curvature $r$, the distance from the rear wheel to $S$. According to Lemma 1, the lines through the two axles of the bicycle's wheels intersect at the second center of curvature of the surface generated by $G^-$, denoted by $C$, with corresponding radius of curvature $R$, the distance from $C$ to $G^-$ as indicated in Figure 8b. Note that the two centers of curvature $C$ and $S$ lie on opposite sides of $G^-$. The front wheel moves along a line through $S$, either away from $S$ or towards $S$. The right triangles in Figure 8b with common side $k$ show that $rR = k^2$, so the constant negative Gaussian curvature of the surface of revolution is $K = -1/k^2$, just as in the case of a pseudosphere, which is the limiting case in which the front wheel moves along the axis of rotation.
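The right-triangle relation can be verified with coordinates. The sketch below uses a convenient frame that is an assumption of this example: the rear wheel at the origin, the bicycle frame along the x axis, $S = (0, r)$ on the rear axle line, and the front wheel at $F = (k, 0)$ moving along the line $FS$. The second center $C$ is then found as the intersection of the front axle line (perpendicular to $FS$ at $F$) with the rear axle line.

```python
def second_center(r, k):
    """Return the point C where the front axle line meets the rear axle line,
    and its distance R from the rear wheel (placed at the origin)."""
    # The front wheel moves along F -> S = (-k, r); its axle direction is perpendicular: (r, k).
    t = -k / r                  # parameter at which F + t*(r, k) reaches the line x = 0
    C = (k + t * r, t * k)      # intersection with the rear axle line (the y axis)
    R = abs(C[1])
    return C, R

r, k = 2.5, 1.2
C, R = second_center(r, k)
K = -1 / (r * R)                # Gaussian curvature with centers on opposite sides
print(C, R, r * R, K)
```

The point $C$ comes out on the opposite side of the rear wheel from $S$, the product $rR$ equals $k^2$, and the resulting curvature is $-1/k^2$ regardless of the choice of $r$.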
In addition to the pseudosphere, there are two types of surfaces so generated,
depending on the position of the front wheel. If it always lies to the left of the axis, as
in Figure 8b, the surface is said to be of hyperboloid type, and if it always lies on the
other side the surface is said to be of conic type, as in Figure 8c.
Tracing Gaussian profiles of surfaces with positive curvature. Figure 9a shows a circle $G^+$ that serves as the prototype of a positive Gaussian profile that produces a spherical surface of constant positive Gaussian curvature $1/k^2$. In this special case, the circle is traced by the rear wheel of a bicycle of length $k$, while the front wheel moves perpendicular to the line connecting it to the common center of curvature $C$ of the lines through the two axles, which coincides with a point $S$ on the axis of revolution. Figures 9b and 9c show a general positive Gaussian profile $G^+$ traced by the rear wheel of a bicycle of length $k$. The two principal centers of curvature of the surface generated by $G^+$ are on opposite sides of $G^+$, with corresponding radii $r$ and $R$, with $r$ being the distance from $G^+$ to $S$. By Lemma 1, the lines through the two axles of the bicycle intersect at one of the centers, indicated by $C'$, with corresponding radius of curvature $R$. The reflection of $C'$ through the tangent line to $G^+$ is the point $C$ at distance $R$ on the other side of $G^+$. The two right triangles with common side $k$ show that $rR = k^2$, so the constant positive Gaussian curvature of the surface of revolution is $K = 1/k^2$.
Figure 9. (a) Rotation of a circle (a prototype of a positive Gaussian profile) generates a spherical surface. (b) Rotation producing a bulge type surface. (c) Rotation producing a spindle type surface.
In addition to the sphere, there are two types of surfaces so produced, bulge type as in Figure 9b, and spindle type as in Figure 9c. The type is determined by the location of $G^+$ relative to the axis of rotation. For a bulge type, $G^+$ always lies to the left of the axis, and for a spindle type it crosses the axis.
Cycloid as limiting case of a Gaussian profile. Figure 10 shows what happens as $r$ becomes very large and $R$ becomes very small, while their product remains constant, $rR = k^2$. The Gaussian profile becomes asymptotically a cycloid! To see why, refer to Figure 10b. It shows one arch of a cycloid, which we denote by $G$, traced by a point $P$ on a disk of diameter $d$ rolling along a line that is parallel to the axis of revolution and at distance $H$ from it. It is known (see [2, p. 368]) that the cycloid $G$ is the evolute of another cycloid $G'$ also generated by a rolling disk of diameter $d$. A string of length $OL$ unwrapped from the cusp $O$ on $G$ traces the arc on $G'$ with one
Figure 10. Gaussian profile becomes more and more like a cycloid as $r$ becomes arbitrarily large and $R$ becomes arbitrarily small, with their product being constant.
\[ M = \frac{1}{2}(\kappa_1 + \kappa_2), \tag{1} \]
where $\kappa_1$ and $\kappa_2$ are the two principal curvatures at an arbitrary point $P$ of the profile curve that is rotated to produce the surface. Minimal surfaces of revolution, such as the catenoid, have $M = 0$. They were studied in the 19th century by Delaunay, who showed that the only surfaces of revolution of constant mean curvature are those obtained by rotating three types of catenary curves: parabolic, elliptic, and hyperbolic. A parabolic catenary is the classical "hanging chain" catenary (discussed in [2, p. 346]). The general type is defined as the locus of the focus of a conic that rolls along a line without slipping. In classical differential geometry, these special surfaces of revolution are described parametrically in terms of elliptic functions (see [3, p. 288]). In [2, Sec. 3.12], these curves were investigated by elementary geometric methods, and associated areas and arclengths were determined. Animation showing these curves being traced can be found on the website
www.mathcurve.com/courbes2d/delaunay/delaunay.shtml
which also compares their surfaces of revolution with surfaces of constant Gaussian, or total, curvature $K$ given by
\[ K = \kappa_1 \kappa_2. \tag{2} \]
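The two curvature formulas are easy to evaluate for a surface obtained by rotating a graph $y = y(x) > 0$ about the x axis. The sketch below assumes the standard expressions for the principal curvatures of such a surface (profile curvature $-y''/(1 + y'^2)^{3/2}$ and reciprocal normal length $1/(y\sqrt{1 + y'^2})$, with signs chosen so that a sphere has positive curvatures); these expressions are textbook facts, not taken from the article, and the function names are hypothetical.

```python
import math

def principal_curvatures(y, dy, d2y):
    """Principal curvatures of the surface obtained by rotating the graph
    y = y(x) > 0 about the x axis, from the values of y, y', y'' at one point."""
    w = math.sqrt(1 + dy * dy)
    kappa1 = -d2y / w**3      # curvature of the profile curve
    kappa2 = 1 / (y * w)      # reciprocal of the normal length from P to the axis
    return kappa1, kappa2

def mean_and_gaussian(y, dy, d2y):
    """Mean curvature M = (kappa1 + kappa2)/2 and Gaussian curvature K = kappa1*kappa2."""
    k1, k2 = principal_curvatures(y, dy, d2y)
    return 0.5 * (k1 + k2), k1 * k2

# Catenoid: profile y = cosh(x).
x = 0.7
catenoid = mean_and_gaussian(math.cosh(x), math.sinh(x), math.cosh(x))

# Unit sphere: profile y = sqrt(1 - x^2), so y' = -x/y and y'' = -1/y^3.
x = 0.3
y = math.sqrt(1 - x * x)
unit_sphere = mean_and_gaussian(y, -x / y, -1 / y**3)

print(catenoid, unit_sphere)
```

The catenoid has mean curvature 0 (with negative Gaussian curvature), while the unit sphere has $M = 1$ and $K = 1$.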
Figure 11. Gaussian profile $G^+$ traced by a bicycle of length $k$ and two parallel profiles $D^+$ and $D^-$ traced by the two rear training wheels of a child's bicycle, each of which is at the same distance $k$ from the frame.
Figure 12. Cycloidal bicyclix (dashed) traced by the front wheel of a bicycle of length $k$ whose rear wheel follows a cycloid. Two parallel profiles, also cycloids, are traced by the training wheels of a child's bicycle, each of which is at the same distance $k$ from the frame.
REFERENCES
1. T. M. Apostol, M. A. Mnatsakanian, Volume/surface area relations for n-dimensional spheres, pseudospheres, and catenoids, Amer. Math. Monthly 122 (2015) 745–756.
2. T. M. Apostol, M. A. Mnatsakanian, New Horizons in Geometry. Dolciani Mathematical Expositions No.
47, Mathematical Association of America, Washington, DC, 2012.
3. E. Kreyszig, Differential Geometry. Mathematical Expositions No. 11, Univ. of Toronto Press, Toronto;
Oxford Univ. Press, London, 1959.
TOM M. APOSTOL joined the Caltech faculty in 1950 and became professor emeritus in 1992. He is director
of Project MATHEMATICS! (http://www.projectmathematics.com), an award-winning series of videos
he initiated in 1987. His long career in mathematics is described in the September 1997 issue of The College
Mathematics Journal. He was selected as a Fellow of the American Mathematical Society in 2012. He is
currently working with colleague Mamikon Mnatsakanian to produce materials demonstrating Mamikon's
innovative and exciting approach to mathematics.
California Institute of Technology, 253-37 Caltech, Pasadena, CA 91125.
apostol@caltech.edu
MAMIKON A. MNATSAKANIAN received a Ph.D. in physics in 1969 from Yerevan University, where he
became professor of astrophysics. As an undergraduate he began developing innovative geometric methods
for solving many calculus problems by a dynamic and visual approach that makes no use of formulas. He is
currently working with Tom Apostol under the auspices of Project MATHEMATICS! to present his methods in
a multimedia format.
California Institute of Technology, 253-37 Caltech, Pasadena, CA 91125.
mamikon@caltech.edu
(1)
This criterion is frequently used to show the calculus student that certain classical integrals such as $\operatorname{erf}(x) = \frac{2}{\sqrt{\pi}}\int_0^x e^{-u^2}\,du$, $\operatorname{Li}(x) = \int_2^x \frac{du}{\ln u}$, and $\operatorname{Si}(x) = \int_0^x \frac{\sin u}{u}\,du$ cannot be expressed in terms of elementary functions, but in spite of its essential role in determining the nonelementary character of important integrals in applications, the relevance of this result in concrete situations has been confined to a few subclasses of the class of Liouville integrals [9, 10, 14].
In this paper, we use Liouville's criterion on integration to provide a student-friendly algorithm that produces a decomposition that fulfills the conditions of Hardy's reduction theory that we describe below. In a certain sense, this decomposition provides the minimum transcendental and the maximum elementary components of the Liouville integrals. As a result, we can determine whether these integrals are elementary functions and, in the affirmative case, find them. In addition to Liouville's criterion, the essential tools to achieve our goal are partial fraction decomposition and simple notions from linear algebra.
http://dx.doi.org/10.4169/amer.math.monthly.123.5.448
MSC: Primary 00A05, Secondary 26A09
Concerning the whole class of elementary functions, advanced readers may find
in [3, 11] a complete algorithm to determine if the integral of any given elementary
function is also elementary. In regard to free and commercial software for the class
of integrals examined in this work, the reader may compare the answers furnished by
those programs with the ones provided by our alternative algorithm.
Naively speaking, an elementary function is one that can be expressed through
addition, product, division, and composition of rational, exponential, logarithmic, and
trigonometric functions and their inverses. A precise definition of elementary function
as well as a complete proof of Liouville's theorem can be found in [12, 13].
Integrals that can be expressed in terms of elementary functions will be referred
to as elementary integrals or as integrals that can be integrated in finite terms. Unless
specified otherwise, all polynomials, rational functions, and vector spaces considered
below will be over the field C of the complex numbers.
2. MAIN RESULTS. We begin with a result that will surely be of interest to the
calculus student.
Theorem 1. Let $a$ be a complex number and $r$, $s$ rational numbers such that $sa \neq 0$. Then $\int x^r e^{a x^s}\,dx$ is elementary if and only if $(r + 1)/s$ is a positive integer.
Using Euler's identity, we obtain the following.
Corollary 1. Let $\alpha$, $\beta$ be real numbers and $r$, $s$ rational numbers with $s(\alpha + i\beta) \neq 0$. Then $\int x^r e^{\alpha x^s}\cos(\beta x^s)\,dx$ and $\int x^r e^{\alpha x^s}\sin(\beta x^s)\,dx$ are elementary functions if and only if $(r + 1)/s$ is a positive integer.
Our search for this result was motivated by students' questions and encouraged by a classical theorem of Tchebychef that states that if $r$, $s$, $t$ are rational and $a$, $b$ are real numbers with $abs \neq 0$, then $\int x^r (a + b x^s)^t\,dx$ is elementary if and only if any of the numbers $t$, $(r + 1)/s$, or $(r + 1)/s + t$ is an integer [17].
Several particular cases of Theorem 1 for integer $r$ and $s$ can be found in the literature [9, 10, 14] without any reference to the condition that these numbers satisfy. To prove this theorem, we note first that a change of variables allows us to assume that $r$ and $s$ are integers. Then we integrate by parts to reduce the problem to a finite number of cases and finish the proof by applying Liouville's criterion on integration.
Our first result toward the main goal of this work is concerned with the class of
Liouville integrals with $f$ and $g$ in the space $\mathcal{P}$ of polynomials with coefficients in $\mathbb{C}$.
Theorem 2. For any given polynomial $g \in \mathcal{P}$ with $k = \deg g \geq 2$, let
$$E_g = \left\{ f \in \mathcal{P} : \int f(x)e^{g(x)}\,dx \text{ is an elementary function} \right\}$$
and $N_g = \operatorname{span}\{1, x, \ldots, x^{k-2}\}$. Then
$$E_g = \operatorname{span}\{x^j g'(x) + j x^{j-1} : j = 0, 1, \ldots\}$$
and $\mathcal{P} = N_g \oplus E_g$.
The basic idea to prove this theorem goes as follows. If $\int f(x)e^{g(x)}\,dx$ is
elementary and $R(x) = \sum_{j=0}^{n} b_j x^j$ satisfies (1), then
$$f(x) = \sum_{j=0}^{n} b_j \left(x^j g'(x) + j x^{j-1}\right).$$
Since $(x^j g'(x) + j x^{j-1})e^{g(x)} = (x^j e^{g(x)})'$, all we need to show is that the linear span
of the class of functions $x^j g'(x) + j x^{j-1}$, $j = 0, 1, \ldots$, contains the whole class of
polynomials $f$ that can be integrated against $e^{g(x)}$ in finite terms. We achieve this using
Liouville's criterion on integration.
Before we turn to the general case of rational functions $f$ and $g$, we have the
following.
Corollary 2. Let $g$ be as in Theorem 2. Then for any $f \in \mathcal{P}$ there are unique polynomials $P$ and $u$, with $\deg u \leq k - 2$, such that
$$\int f(x)e^{g(x)}\,dx = \int u(x)e^{g(x)}\,dx + P(x)e^{g(x)}. \tag{2}$$
Moreover, the left side of (2) is elementary if and only if $u \equiv 0$ and its value is
$P(x)e^{g(x)}$. If $u \not\equiv 0$ the nonelementary integral in the right side is minimal in the sense
that for any polynomials $v$ and $Q$ satisfying
$$\int f(x)e^{g(x)}\,dx = \int v(x)e^{g(x)}\,dx + Q(x)e^{g(x)}, \tag{3}$$
we have $\deg v \geq \deg u$.
Corollary 2 shows that if the left side of (2) is nonelementary, we only have to
worry about the $k - 1$ nonelementary integrals $\int x^j e^{g(x)}\,dx$, $j = 0, 1, \ldots, k - 2$. Hardy
refers to these integrals as independent transcendents [7]. Corollary 2 shows too that
decomposition (2) fulfills the conditions of Hardy's reduction theory. In the words of
Hardy:
Such a reduction theory endeavours in each case
1. to split up any integral of the class under consideration into the sum of a number of
parts of which some are elementary and the others are not;
2. to reduce the number of the latter terms to the least possible;
3. to prove that these terms are incapable of further reduction, and are genuinely new and
independent transcendents.
The Integration of Functions of a Single Variable, p. 46, [7].
In view of this fact, (2) will be referred to as Hardy's reduction of the given integral.
To find this reduction, we only have to apply the following simple algorithm: write $f$
in terms of the basis of $\mathcal{P}$ given by Theorem 2. This algorithm can be performed by
a calculus student and can easily be carried out using any mathematical software. The
underlying idea of the algorithm for any Liouville integral is the same.
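The algorithm just described can be sketched in a few lines. The following is an illustrative implementation under stated assumptions, not the authors' code: polynomials are lists of rational coefficients with the lowest degree first, and the name `hardy_reduction` is hypothetical. It removes the top-degree term of $f$ repeatedly by subtracting a multiple of $x^j g'(x) + j x^{j-1}$, which is the basis of $E_g$ from Theorem 2.

```python
from fractions import Fraction

def hardy_reduction(f, g):
    """Return (u, P) with integral(f e^g) = integral(u e^g) + P e^g, deg u <= deg g - 2.

    f and g are coefficient lists, lowest degree first; deg g must be at least 2.
    """
    f = [Fraction(c) for c in f]
    g = [Fraction(c) for c in g]
    k = len(g) - 1
    if k < 2 or g[-1] == 0:
        raise ValueError("g must have degree at least 2")
    f += [Fraction(0)] * max(0, (k - 1) - len(f))      # make room for u
    dg = [i * g[i] for i in range(1, k + 1)]           # coefficients of g'
    P = [Fraction(0)] * max(len(f) - (k - 1), 1)
    for d in range(len(f) - 1, k - 2, -1):             # degrees k-1 and above
        j = d - (k - 1)                                # x^j g' + j x^(j-1) has degree d
        c = f[d] / dg[-1]
        if c == 0:
            continue
        P[j] = c
        for i, coef in enumerate(dg):                  # subtract c * x^j * g'
            f[j + i] -= c * coef
        if j >= 1:
            f[j - 1] -= c * j                          # subtract c * j * x^(j-1)
    return f[:k - 1], P

# f = 10x^4 - 3x + 1 and g = x^5 give u = 1 - 3x and P = 2.
print(hardy_reduction([1, -3, 0, 0, 10], [0, 0, 0, 0, 0, 1]))
```

By Corollary 2, the integral is elementary exactly when every entry of the returned `u` is zero. Exact rational arithmetic is used so that this zero test is reliable.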
Example 1. Find Hardy's reduction of $\int f(x)e^{ax^2/2}\,dx$ for any polynomial $f$ and
$a \in \mathbb{C} \setminus \{0\}$.
In this case, we have $g(x) = ax^2/2$ and
$$f(x) = A_0 + \sum_{j=0}^{n-1} A_{j+1}\left(a x^{j+1} + j x^{j-1}\right).$$
Thus, we have
$$\int f(x)e^{ax^2/2}\,dx = A_0\int e^{ax^2/2}\,dx + \sum_{j=0}^{n-1} A_{j+1}\int \left(a x^{j+1} + j x^{j-1}\right)e^{ax^2/2}\,dx = A_0\int e^{ax^2/2}\,dx + \sum_{j=0}^{n-1} A_{j+1}x^j e^{ax^2/2}.$$
The polynomials $u(x) = A_0$ and $P(x) = \sum_{j=0}^{n-1} A_{j+1}x^j$ are computable in the sense
that $A_0, A_1, \ldots, A_n$ can explicitly be found for any given $f$; moreover, the integral on
the left side is elementary if and only if $A_0 = 0$ and its value is given by $P(x)e^{ax^2/2}$.
Example 1 is a variant of a fact stated in [6] that says that if $P$ is a polynomial with
coefficients in $\mathbb{R}$, then $\int P(x)e^{-x^2}\,dx$ is elementary if and only if $P$ is in the linear span
of the class of Hermite polynomials $H_n(x) = (-1)^n e^{x^2}\frac{d^n}{dx^n}e^{-x^2}$ with $n \geq 1$. In view of
the close connection between $e^{-x^2}$ and Hermite polynomials, we do not think this fact
can be generalized for polynomials integrated against $e^g$ for any $g \in \mathcal{P}$.
The following result is a natural generalization of Theorem 2 for $f$ and $g$ in the
space $\mathcal{Q}$ of rational functions with coefficients in $\mathbb{C}$.
Theorem 3. For any given nonconstant rational function $g \in \mathcal{Q}$, let
$$E_g = \left\{ f \in \mathcal{Q} : \int f(x)e^{g(x)}\,dx \text{ is an elementary function} \right\}.$$
Then a basis for $E_g$ is given by
$$\left\{ \frac{g'(x)}{(x-\alpha)^j} - \frac{j}{(x-\alpha)^{j+1}},\ g'(x),\ x^j g'(x) + j x^{j-1} : \alpha \in \mathbb{C},\ j = 1, 2, \ldots \right\}. \tag{4}$$
The essential idea to prove this theorem consists in using Liouville's criterion on
integration to show that $E_g$ is the linear span of (4). In analogy to the polynomial
case, below we will see that this result provides a simple algorithm to find a Hardy's
reduction of the type (2) for any Liouville integral. This algorithm only requires
partial fraction decomposition and a little linear algebra.
Note that the dimension of the space E g in Theorem 3 has cardinality of the continuum and so does its co-dimension as a subspace of Q. We will see that in concrete
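$$\cos\theta = \frac{1-u^2}{1+u^2}, \qquad \sin\theta = \frac{2u}{1+u^2}, \qquad d\theta = \frac{2\,du}{1+u^2}.$$
Thus,
$$\int F(\cos\theta, \sin\theta)\,e^{G(\cos\theta, \sin\theta)}\,d\theta = \int f(u)e^{g(u)}\,du,$$
where $f(u) = \frac{2}{1+u^2}F\left(\frac{1-u^2}{1+u^2}, \frac{2u}{1+u^2}\right)$ and $g(u) = G\left(\frac{1-u^2}{1+u^2}, \frac{2u}{1+u^2}\right)$. Here we assume
that $g$ is not constant. This example includes the case in which $F$ and $G$ are rational functions of the form $F(\cos x, \sin x, \ldots, \sin mx)$ and $G(\cos x, \sin x, \ldots, \sin nx)$.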
3. Let $F$ and $G$ be rational functions and $p$, $q$ positive integers. If $g(u) = G(u^p) + u^{pq}$ is nonconstant, then setting $\ln x = u^{pq}$ we have $dx = pq\,u^{pq-1}e^{u^{pq}}\,du$, and
thus,
$$\int F\left((\ln x)^{1/p}\right)e^{G\left((\ln x)^{1/q}\right)}\,dx = \int pq\,u^{pq-1}F(u^q)\,e^{g(u)}\,du.$$
Simple cases are given by $\int \frac{dx}{\ln x} = \int \frac{e^u}{u}\,du$ and by $\int \sqrt[p]{\ln x}\,dx = \int p\,u^p e^{u^p}\,du$,
which turn out to be nonelementary for all $p \geq 2$ (see Section 4).
Examples 1 and 3 can be generalized, as the following one, to the case in
which $F$ and $G$ are rational functions of $n$ and $m$ variables, respectively.
4. Suppose $F(x_1, \ldots, x_n)$ and $G(x_1, \ldots, x_m)$ are rational functions with $G$ nonconstant. If $M$ is the product of the $n + m$ positive integers $p_1, \ldots, p_n$ and
$q_1, \ldots, q_m$, the change of variables $x = u^M$ provides
$$\int F\left(x^{1/p_1}, \ldots, x^{1/p_n}\right)e^{G\left(x^{1/q_1}, \ldots, x^{1/q_m}\right)}\,dx = \int f(u)e^{g(u)}\,du,$$
where $f(u) = Mu^{M-1}F\left(u^{M/p_1}, \ldots, u^{M/p_n}\right)$ and $g(u) = G\left(u^{M/q_1}, \ldots, u^{M/q_m}\right)$.
mr (x)
+ g (x)r (x),
x a
which is impossible since the left side is continuous at x = a but the right is not.
The following corollary quickly recovers a classical example that is often of interest
to the calculus student.
Corollary 3. For any positive integer $n$ and $a \in \mathbb{C} \setminus \{0\}$, the integral $\int \frac{e^{ax}}{x^n}\,dx$ is not
an elementary function.
Proof. Integrating by parts yields
$$\int \frac{e^{ax}}{x^{n+1}}\,dx = -\frac{e^{ax}}{n x^n} + \frac{a}{n}\int \frac{e^{ax}}{x^n}\,dx.$$
Thus, by induction, only the case $n = 1$ needs to be treated, but this follows at once from Lemma 1.
As is well known, taking $a = i$ and using Euler's identity, we find that for all positive integers $n$ the classical integrals $\int \frac{\cos x}{x^n}\,dx$ and $\int \frac{\sin x}{x^n}\,dx$ are not elementary. Taking
$a = n = 1$, neither is $\int \frac{du}{\log u} = \int \frac{e^x}{x}\,dx$.
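$$I_n = \frac{x^{n-k+1}}{ka}e^{ax^k} - \frac{(n-k+1)}{ka}I_{n-k}. \tag{5}$$
$$\int \frac{e^{x^5}}{x^{21}}\,dx = -\frac{x^{-20}}{20}e^{x^5} + \frac{5}{20}I_{-16} = -\left(\frac{x^{-20}}{20} + \frac{5x^{-15}}{20\cdot 15} + \frac{5^2x^{-10}}{20\cdot 15\cdot 10} + \frac{5^3x^{-5}}{20\cdot 15\cdot 10\cdot 5}\right)e^{x^5} + \frac{5^4}{20\cdot 15\cdot 10\cdot 5}\int \frac{e^{x^5}}{x}\,dx.$$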
l1
(k)
j k j+r +1
(n
k
+
1)!
(ka)
x
k
(1) j1
In = (1)l
eax + Ir ,
(k)
(ka)l
(k
j
+
r
+
1)!
j=0
In =
j=0
It turns out that these expressions are in fact Hardy's reduction of $I_n$ introduced in (2).
We remind the reader that, for $n \geq 1 - k$,
$$n!^{(k)} = \begin{cases} 1, & \text{if } n = 1-k, 2-k, \ldots, -1, 0; \\ n\,(n-k)!^{(k)}, & \text{if } n \geq 1. \end{cases}$$
As usual, for $k = 1, 2, 3$, we write $n!$, $n!!$, and $n!!!$.
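The recursive definition of the multifactorial translates directly into a short loop. This is a minimal sketch; the function name `multifactorial` is an assumption made for illustration.

```python
def multifactorial(n, k):
    """Return n!^(k): 1 for 1-k <= n <= 0, and n * (n-k)!^(k) for n >= 1."""
    if k < 1 or n < 1 - k:
        raise ValueError("need k >= 1 and n >= 1 - k")
    result = 1
    while n >= 1:          # unwind the recursion n, n-k, n-2k, ...
        result *= n
        n -= k
    return result

print(multifactorial(5, 1))   # 5!   = 120
print(multifactorial(7, 2))   # 7!!  = 7*5*3*1 = 105
print(multifactorial(8, 3))   # 8!!! = 8*5*2 = 80
```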
5. PROOF OF THEOREM 2. Next we prove Theorem 2 and obtain Hardy's reduction of $\int f(x)e^{g(x)}\,dx$ stated in (2).
Proof. Let $k$ denote the degree of $g$. If $k = 1$, there is nothing to prove. So we may
assume $k \geq 2$ and define $N_g = \operatorname{span}\{1, x, \ldots, x^{k-2}\}$. If $\varphi_j = x^j g'(x) + j x^{j-1}$, it is
easily verified by induction that $\{\varphi_j : j = 0, 1, \ldots\}$ is linearly independent. Moreover,
we clearly have $\mathcal{P} = N_g \oplus \operatorname{span}\{\varphi_j : j = 0, 1, \ldots\}$ and
$$\int \varphi_j(x)e^{g(x)}\,dx = x^j e^{g(x)} + C. \tag{6}$$
Thus, if we prove that $\int u(x)e^{g(x)}\,dx$ is not elementary for all $u \in N_g \setminus \{0\}$, then it
will follow that $E_g = \operatorname{span}\{\varphi_j : j = 0, 1, \ldots\}$; therefore, Theorem 2 will be proved.
Let $u \in N_g \setminus \{0\}$ and suppose $\int u(x)e^{g(x)}\,dx$ is elementary. By Liouville's criterion on
integration, $u = (p/q)g' + (p/q)'$ for some relatively prime polynomials $p$ and $q$. A
short calculation yields
$$(uq - p' - pg')q = -pq'.$$
Since $p$ and $q$ are relatively prime, a multiplicity argument about the zeroes of $q$ and
$q'$ implies that $q$ is a nonzero constant $C$. Thus, we have
$$Cu - p' - pg' = 0.$$
Since $\deg u \leq k - 2$ and $\deg g' = k - 1$, a simple degree argument again shows that
$p \equiv 0$ and thus $u \equiv 0$, but this contradicts the choice of $u$.
Now we prove Corollary 2.
Proof. In view of (6), the existence of a Hardy's reduction of $\int f(x)e^{g(x)}\,dx$ of the
type (2) follows from the fact that $\mathcal{P} = N_g \oplus E_g$. To prove the uniqueness of $u$ and $P$,
suppose
$$\int f(x)e^{g(x)}\,dx = \int u_1 e^{g(x)}\,dx + P_1(x)e^{g(x)} = \int u_2 e^{g(x)}\,dx + P_2(x)e^{g(x)},$$
where $P_1$, $P_2$ are polynomials and both $u_1$ and $u_2$ have degree at most $k - 2$. Thus,
$u_1, u_2 \in N_g$. If we set $P = P_2 - P_1$, then $u_1 - u_2 = P' + Pg' \in E_g \cap N_g = \{0\}$. Thus,
$$\int (10x^4 - 3x + 1)e^{x^5}\,dx,$$
and decide if this integral is elementary. According to Theorem 2, $g(x) = x^5$, $f(x) = 10x^4 - 3x + 1$, $E_g = \operatorname{span}\{\varphi_j(x) = 5x^{j+4} + j x^{j-1} : j = 0, 1, 2, \ldots\}$, and $N_g = \operatorname{span}\{1, x, x^2, x^3\}$. Thus, $f(x) = 1 - 3x + 2\varphi_0(x)$, and Hardy's reduction is
$$\int (10x^4 - 3x + 1)e^{x^5}\,dx = \int (1 - 3x)e^{x^5}\,dx + 2e^{x^5}.$$
Since $u(x) = 1 - 3x$ is not the zero polynomial, the given integral is not elementary.
Remark. The interested reader may compare our Hardy's reduction for this integral
with any of the answers provided by either MAPLE or Wolfram.
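Example 3. Find Hardy's reduction of
$$\int (x^5 + x^4 + x^3 + x^2 + x + 1)e^{x^3+x}\,dx.$$
With $g(x) = x^3 + x$ and $\varphi_j$ as in Theorem 2, we have
$$\int \varphi_j(x)e^{x^3+x}\,dx = x^j e^{x^3+x} + C$$
and
$$f(x) = \frac{8}{9} + \frac{1}{9}x - \frac{1}{9}\varphi_0(x) + \frac{2}{9}\varphi_1(x) + \frac{1}{3}\varphi_2(x) + \frac{1}{3}\varphi_3(x).$$
Thus,
$$\int f(x)e^{g(x)}\,dx = \int \left(\frac{8}{9} + \frac{1}{9}x\right)e^{x^3+x}\,dx + \left(-\frac{1}{9} + \frac{2}{9}x + \frac{1}{3}x^2 + \frac{1}{3}x^3\right)e^{x^3+x}.$$
Since $u(x) = \frac{8}{9} + \frac{1}{9}x$ is not the zero polynomial, the given integral is not elementary.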
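Proof. It is well known [18] that if $\mathcal{Q}$ is the vector space of rational functions with
coefficients in $\mathbb{C}$, then
$$\left\{ 1,\ x^j,\ \frac{1}{(x-\alpha)^j} : \alpha \in \mathbb{C},\ j = 1, 2, \ldots \right\} \tag{7}$$
is a basis of $\mathcal{Q}$. If $g$ is a polynomial of degree $k \geq 1$, we define
$$N_g = \begin{cases} \operatorname{span}\left\{ \dfrac{1}{x-\alpha} : \alpha \in \mathbb{C} \right\}, & \text{if } k = 1, \\[2mm] \operatorname{span}\left\{ \dfrac{1}{x-\alpha},\ x^j : \alpha \in \mathbb{C},\ j = 0, 1, \ldots, k-2 \right\}, & \text{if } k \geq 2. \end{cases}$$
Now we set
$$\psi_{\alpha j}(x) = \frac{g'(x)}{(x-\alpha)^j} - \frac{j}{(x-\alpha)^{j+1}} \quad\text{and}\quad \varphi_j(x) = x^j g'(x) + j x^{j-1}. \tag{8}$$
A short calculation shows by induction that $\{\psi_{\alpha j}, g', \varphi_j : \alpha \in \mathbb{C},\ j = 1, 2, \ldots\}$ is linearly independent; in addition, using the basis of $\mathcal{Q}$ given in (7), we see that
$$\mathcal{Q} = N_g \oplus \operatorname{span}\{\psi_{\alpha j}, g', \varphi_j : \alpha \in \mathbb{C},\ j = 1, 2, \ldots\}.$$
Moreover, for all $\alpha \in \mathbb{C}$ and $j = 0, 1, \ldots$,
$$\int \psi_{\alpha j}(x)e^{g(x)}\,dx = \frac{e^{g(x)}}{(x-\alpha)^j} + C \quad\text{and}\quad \int \varphi_j(x)e^{g(x)}\,dx = x^j e^{g(x)} + C. \tag{9}$$
We will have that $E_g = \operatorname{span}\{\psi_{\alpha j}, g', \varphi_j : \alpha \in \mathbb{C},\ j = 1, 2, \ldots\}$ and thus prove the
theorem if we show that $\int u(x)e^{g(x)}\,dx$ is not elementary for all $u \in N_g \setminus \{0\}$. This
follows as in the proof of Theorem 2 if $u$ is a polynomial and from Lemma 1 otherwise.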
g(x)
n
j=1
Aj
+ Q(x),
x j
Proof. The proof of this corollary is similar to that of Corollary 2. We just remind
the reader that if u is a nonzero rational function with u = p/q, where p and q are
relatively prime polynomials, then deg u = max{deg p, deg q}.
The decomposition given by Corollary 4, which we will refer to as Hardy's reduction of $\int f(x)e^{g(x)}\,dx$, says that if this integral is not an elementary function, then
we only have to be concerned about the nonelementary integrals $\int x^j e^{g(x)}\,dx$, for
$j = 0, 1, \ldots, \deg g - 2$, and a finite number of integrals of the form $\int \frac{e^{g(x)}}{x-\alpha}\,dx$.
Corollary 4 generalizes to any polynomial $g$ the following well-known fact,
which says that any integral of the form $\int R(x)e^x\,dx$, where $R(x)$ is a rational function,
can be reduced so as to contain only a term $e^x S(x)$, where $S(x)$ is a rational function,
and a number of terms of the type $\alpha\int \frac{e^x\,dx}{x-a}$. If all the constants $\alpha$ vanish, then the
integral can be calculated in the final form $e^x S(x)$ [7].
In concrete situations it is convenient to consider reasonable subspaces of Q, E g ,
and N g that possess a countable basis. For example, if f is a rational function with
partial fraction decomposition
$$f(x) = \sum_{l=1}^{n}\sum_{t=1}^{n_l} \frac{B_{lt}}{(x-\alpha_l)^t} + P_1(x), \tag{10}$$
(11)
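$$N_g(\mathcal{F}) = \begin{cases} \operatorname{span}\left\{ \dfrac{1}{x-\alpha} : \alpha \in \mathcal{F} \right\}, & \text{if } k = 1, \\[2mm] \operatorname{span}\left\{ \dfrac{1}{x-\alpha},\ x^j : \alpha \in \mathcal{F},\ j = 0, 1, \ldots, k-2 \right\}, & \text{if } k \geq 2. \end{cases}$$
Note that $\mathcal{Q}_g(\mathcal{F}) = E_g(\mathcal{F}) \oplus N_g(\mathcal{F})$ and if $f \in \mathcal{Q}_g(\mathcal{F})$, then $\int f(x)e^{g(x)}\,dx$ is elementary if and only if $f \in E_g(\mathcal{F})$; note too that $E_g(\mathcal{F})$ has finite co-dimension as a subspace of $\mathcal{Q}_g(\mathcal{F})$.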
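Example 4. Find Hardy's reduction of
$$\int \frac{4}{(x^2-1)^2}e^{x^3/3}\,dx.$$
Here $g(x) = x^3/3$ and
$$f(x) = \frac{4}{(x^2-1)^2} = \frac{1}{x+1} + \frac{1}{(x+1)^2} - \frac{1}{x-1} + \frac{1}{(x-1)^2}.$$
If we set $\mathcal{F} = \{-1, 1\}$, then we have $\mathcal{Q}_g(\mathcal{F}) = \operatorname{span}\left\{1, x^j, \frac{1}{(x+1)^j}, \frac{1}{(x-1)^j} : j = 1, 2, \ldots\right\}$,
$$E_g(\mathcal{F}) = \operatorname{span}\left\{\psi_{\pm 1, j}(x),\ x^2,\ \varphi_j(x) : j = 1, 2, \ldots\right\},$$
where $\psi_{\pm 1, j}(x) = \frac{x^2}{(x \mp 1)^j} - \frac{j}{(x \mp 1)^{j+1}}$, and $\varphi_j(x) = x^{j+2} + j x^{j-1}$. We also have that
$N_g(\mathcal{F}) = \operatorname{span}\left\{\frac{1}{x+1}, \frac{1}{x-1}, 1, x\right\}$. A short calculation shows that
$$f(x) = 2x + \frac{2}{x+1} - \psi_{-1,1}(x) - \psi_{1,1}(x).$$
Hence, by (9), Hardy's reduction is
$$\int \frac{4}{(x^2-1)^2}e^{x^3/3}\,dx = \int \left(2x + \frac{2}{x+1}\right)e^{x^3/3}\,dx - \left(\frac{1}{x+1} + \frac{1}{x-1}\right)e^{x^3/3},$$
and, since $u(x) = 2x + \frac{2}{x+1}$ is not identically zero, the given integral is not elementary.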
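The decomposition of $f(x) = 4/(x^2-1)^2$ used in this example, with $g'(x) = x^2$, can be checked with exact rational arithmetic. The sketch below is only a sanity check under stated assumptions: the helper names `f`, `psi`, and `rhs` are hypothetical, and the identity is tested at six sample points. Since both sides become polynomials of degree at most 5 after multiplying by $(x^2-1)^2$, agreement at six points away from $\pm 1$ is enough.

```python
from fractions import Fraction

def f(x):
    """f(x) = 4/(x^2 - 1)^2."""
    return Fraction(4) / (x * x - 1) ** 2

def psi(alpha, j, x):
    """psi_{alpha,j}(x) = g'(x)/(x-alpha)^j - j/(x-alpha)^(j+1), with g'(x) = x^2."""
    return x * x / (x - alpha) ** j - Fraction(j) / (x - alpha) ** (j + 1)

def rhs(x):
    """2x + 2/(x+1) - psi_{-1,1}(x) - psi_{1,1}(x)."""
    return 2 * x + Fraction(2) / (x + 1) - psi(-1, 1, x) - psi(1, 1, x)

# Six rational sample points, none equal to 1 or -1.
samples = [Fraction(n, 7) for n in (-20, -3, 2, 5, 11, 30)]
print(all(f(x) == rhs(x) for x in samples))
```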
Remark. We warn the reader that our algorithm to find a Hardy's reduction of
$\int f(x)e^{g(x)}\,dx$ has the advantages and limitations of partial fraction decomposition.
Everything works out well if the roots of the denominator of $f$ are known, but we have
a problem otherwise. For example, we do not know how to find a Hardy's reduction
of $\int \frac{e^{g(x)}}{x^5-6x+3}\,dx$, where $g$ is a nonconstant polynomial, since $q(x) = x^5 - 6x + 3$ is
not solvable by radicals [16]. On the other hand, we note that $\frac{1}{x^5-6x+3} \in N_g$ because
all the roots of $q$ are simple. Thus, $\int \frac{e^{g(x)}}{x^5-6x+3}\,dx$ is not elementary and has a Hardy's
reduction of the form
$$\int \frac{e^{g(x)}}{x^5 - 6x + 3}\,dx = \sum_{j=1}^{5} A_j \int \frac{e^{g(x)}}{x - \alpha_j}\,dx,$$
where $A_j = \dfrac{1}{\prod_{k \neq j}(\alpha_j - \alpha_k)}$ and $\alpha_j$, $j = 1, \ldots, 5$, are the roots of $q$ that could
be approximated by Newton's method.
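As a rough illustration of the last sentence, the three real roots of $q(x) = x^5 - 6x + 3$ can be approximated by Newton's method in floating point. This is a sketch under assumptions: the starting guesses and the fixed iteration count are chosen by hand for this particular $q$, and the two complex roots are not treated. It also uses the standard identity $\prod_{k \neq j}(\alpha_j - \alpha_k) = q'(\alpha_j)$ for a monic polynomial with simple roots, so that $A_j = 1/q'(\alpha_j)$.

```python
def q(x):
    return x**5 - 6 * x + 3

def dq(x):
    return 5 * x**4 - 6

def newton(x, steps=50):
    """Plain Newton iteration for q; the step count is a fixed, generous assumption."""
    for _ in range(steps):
        x = x - q(x) / dq(x)
    return x

# Hand-picked starting guesses, one near each real root of q.
real_roots = [newton(x0) for x0 in (-1.7, 0.5, 1.5)]

# A_j = 1/q'(alpha_j), since q is monic with simple roots.
coefficients = [1 / dq(alpha) for alpha in real_roots]

for alpha, A in zip(real_roots, coefficients):
    print(alpha, A)
```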
The advanced interested reader may find in [2] an algorithm that does not require
factoring the denominators of $f$ and $g$ to determine if $\int f(x)e^{g(x)}\,dx$ is an elementary
function and, if so, find its value.
7. PROOF OF THEOREM 3, CASE II. Now we prove Theorem 3 in the case that
both $f$ and $g$ are rational functions and $g$ is not a polynomial.
Before we prove this theorem, we introduce notation. As customary, $\mathcal{Q}$ will be
viewed as the linear span of (7). Next we suppose that the partial fraction decomposition of $g$ is given by
$$g(x) = \sum_{r=1}^{m}\sum_{s=1}^{l_r} \frac{A_{rs}}{(x-\beta_r)^s} + P(x), \tag{12}$$
lr
m
r =1 s=1
1
(xr )s
s Ar s
+ P (x).
(x r )s+1
and
N g = {r s (x) : r = 1, . . . , m; s = 1, . . . , lr + 1} \ mlm +1 (x) .
Finally, we set = {1 , . . . , m } and define
1
, r s : r s N g , C \ ,
N g = span 1, x, . . . , x k1 ,
x
where it is understood that {1, x, . . . , x k1 } is empty if P is constant.
Remark. The remainder of this proof is valid if in the definition of N g we leave out
any rlr +1 (x), with r = 1, . . . , m 1, instead of mlm +1 (x).
Proof. With the above notation in hand, in view of (9), Theorem 3 will be proved if
we show that
g , and
Q = N E
g
g(x)
C
\
,
then
u(x)e
d x is not elementary because of Lemma 1.
x
So we may assume that $u(x) = \sum_{\rho_{rs} \in \mathcal{N}_g} a_{rs}\rho_{rs} + Q(x)$, where $Q \equiv 0$ or a polynomial
of degree at most $k - 1$. If $\int u(x)e^{g(x)}\,dx$ is elementary, then by Liouville's criterion
$u = (p/q)g' + (p/q)'$, where $p$ and $q$ are relatively prime polynomials. A short calculation yields
$$(uq - p' - pg')q = -q'p,$$
and multiplying by $D(x) = \prod_{r=1}^{m}(x - \beta_r)^{l_r+1}$, the common denominator of $g'$, we
obtain the polynomial equation
$$(quD - p'D - pg'D)q = -q'pD.$$
A multiplicity argument about the zeroes of $q$ and $q'$ shows that the only possible roots
of $q$ are $\beta_1, \ldots, \beta_m$. If $q(x) = (x - \beta_1)^{\nu}\sigma(x)$, with $\nu \geq 1$ and $\sigma(\beta_1) \neq 0$, then
$$(quD - p'D - pg'D)\,\sigma = -p\,(x - \beta_1)^{l_1}\prod_{r=2}^{m}(x - \beta_r)^{l_r+1}\left(\nu\sigma + (x - \beta_1)\sigma'\right).$$
By definition of $g$, we have that $l_1 \geq 1$ and thus $x - \beta_1$ divides the left side of this last
equation. Since $x - \beta_1$ does not divide $\sigma$, then it divides the first factor. Note now that
$x - \beta_1$ divides the first two terms of this factor and thus divides $pg'D$. Since $A_{1l_1} \neq 0$,
then $x - \beta_1$ divides $p\prod_{r=2}^{m}(x - \beta_r)^{l_r+1}$, which is impossible. Thus, $\beta_1$ is not a root of
$q$ and by a similar argument neither is $\beta_r$, for $r = 2, \ldots, m$. Hence, $q$ is a nonzero
constant $C_1$, and therefore we have
$$C_1 uD - p'D = pg'D,$$
(13)
ar s Dr s + Q D,
pg D = p
r s N g
lr
m
r =1 s=1
s Ar s D
+ p P D.
(x r )s+1
k1
1
1
, r s =
,
: r s N g , C \ N g + E g .
x
(x r )s
Moreover, since lm Amlm = 0 and g E g , solving for (x 1)lm +1 we find that (x 1)lm +1
m
m
N g + E g . Taking these last two assertions as a starting point, it follows by induction
on j = 1, 2, . . . that we have the following.
If C \ , then
j
j+1 N g + E g , since j (x) =
j+1 E g .
(x)
(x)
1
(xr )lr + j+1
(x)
N g + E g since r j E g .
Hence, Q = N g E g .
Corollary 5. Let $g$ and $P$ be as in (12). Then for any $f \in \mathcal{Q}$ there exists a unique
rational function $R$ and a unique rational function $u \in N_g$ of the form
$$u(x) = \sum_{j=1}^{n} \frac{A_j}{x - \alpha_j} + \sum_{\rho_{rs} \in \mathcal{N}_g} a_{rs}\rho_{rs} + Q(x),$$
where $N_g$ and $\mathcal{N}_g$ are as defined in the proof of Theorem 3, case II, and $Q \equiv 0$ if $P \equiv 0$
f (x)e
g(x)
dx =
Moreover, the left side of this last equation is elementary if and only if u 0, and if
u 0 the nonelementary integral in the right side is minimal in the sense that for any
rational functions v and satisfying
f (x)e
g(x)
dx =
we have deg v deg u. This is where the degree of a rational function is defined as in
the proof of Corollary 4.
Proof. The proof of this corollary is similar to that of Corollary 2.
As before, the decomposition given by Corollary 5 will be referred to as Hardy's
reduction of $\int f(x)e^{g(x)}\,dx$.
As in the previous section, when dealing with concrete situations, it is convenient to
work with suitable subspaces of Q, E g , and N g that have a countable basis. Thus, for
example, if f is as in (10) and g as in (12), then we set F = {1 , . . . , n } and consider
E g (F ) as in (11),
Qg (F ) = span 1, x j ,
1
1
,
: F , j = 1, 2, . . . ;
j
(x ) (x r )s
r \ F , s = 1, . . . , lr + 1
and
N g (F ) = span 1, x, . . . , x k1 ,
1
, r s : r s N g , F \ ,
x
where is as in N g and r s as in N g .
Again, Qg (F ) = E g (F ) N g (F ), and if f Qg (F ), then f (x)e g(x) d x is elementary if and only if f E g (F ).
Example 5. Find Hardys reduction of
4
x + x3 x2 x + 2
2x 6 3x 5 5x 4 + x 3 + 3x 2 + 12x 5
exp
dx
x 4 6x 3 + 13x 2 12x + 4
x2 1
2x 6 3x 5 5x 4 +x 3 +3x 2 +12x5
.
x 4 6x 3 +13x 2 12x+4
By
1
1
+ x2 + x
x 1 x +1
and
f (x) =
8
38
5
9
+
+ 2x 2 + 9x + 23.
+
x 1 (x 1)2
x 2 (x 2)2
1
1
1
,
Qg (F ) = span 1, x ,
,
: = 1, 2; j = 1, 2, . . . ,
(x ) j x + 1 (x + 1)2
1
1
1
,
,
Ng =
,
x 1 (x 1)2 x + 1
1
1
1
1
N g (F ) = span 1, x,
,
,
,
,
x 2 x 1 (x 1)2 x + 1
j
and
E g (F ) = span j (x), g (x), j (x); = 1, 2; j = 1, 2, . . . ,
where, from (8),
j (x) =
4x
2x + 1
j
+
,
j
2
2
j
(x ) (x 1)
(x )
(x ) j+1
g (x) =
1
1
+
+ 2x + 1,
(x 1)2
(x + 1)2
j (x) =
4x j+1
+ 2x j+1 + x j + j x j1 .
(x 2 1)2
1
1
+
+ 4g (x) + 1 (x) + 921 (x),
(x 1)2
x 2
where
1
1 (x) = (x1)
2
1
x1
21 (x) = 2
1
(x2)2
1
(x+1)2
37
9(x2)
1
x+1
1
(x1)2
+ 2x 2 + x + 1,
1
x1
1
3(x+1)2
1
.
9(x+1)
1
1
+
4g
+
(x)
+
(x)
+
9
(x)
e g(x) d x
1
21
(x 1)2
x 2
1
1
9
g(x)
e
e g(x) .
=
+
d
x
+
4
+
x
+
(x 1)2
x 2
x 2
f (x)e g(x) d x =
Since u(x) =
1
(x1)2
1
x2
2a
(2 j 1)!! a2 j = 0.
(14)
P(x)eax d x is elemen-
j j1
x
A j+1 x j+1 +
2a
j=0
2l 1
2l
2l1
A2l x 2l2
+ A2l2 +
= A2l x + A2l1 x
2a
3
1
A2
2
A4 x + A1 + A3 x +
.
+ + A2 +
2a
a
2a
ajx j =
2l1
That is to say,
2l 1
A2l = a2l2 ,
2a
3
1
A2
. . . , A2 +
A 4 = a2 , A 1 + A 3 = a1 ,
= a0 ,
2a
a
2a
2
x 2 j ex d x = (2 j 1)!!/2 j and
x 2 j+1 ex d x = 0,
which is (14) with $a = -1$. Using the results of this paper, the above example can be
pushed a bit further.
Example 7. Let
= an x n + + a1 x + a0 and a C \ {0}. If n = 3l + r , with
P(x) ax
3
r = 1, 2, then P(x)e d x is elementary if and only if
l
1 j
j=0
3a
(3 j 2)!!! a3 j = 0 and
j=0
3a
l
1 j
3a
(3 j 2)!!! a3 j = 0 and
l1
1 j
j=0
3a
(3 j 1)!!! a3 j+1 = 0.
3
Proof. We only treat the case n = 3l + 2. In view of Theorem 2, P(x)eax d x is ele
3l+2
j2 j3
j
mentary if and only if j=0 a j x j = 3l+2
. Comparing the coeffij=2 A j x + 3a x
j
cients of x , we see that this equality corresponds to
3l
A3l+2 = a3l1 ,
3a
A5
2
A3
. . . , A2 +
= a2 ,
A 4 = a1 ,
= a0 .
a
3a
3a
l
1 j
j=0
ka
l1
1 j
j=0
ka
k
Now we consider integrals of the form $\int P(x)e^{a/x^k}\,dx$ with $k > 0$ and $P$ a polynomial.
We work out the cases $k = 1$ and $2$ and leave the general case to the reader.
Example 8. For any $P(x) = a_n x^n + \cdots + a_1 x + a_0$ and $a \in \mathbb{C} \setminus \{0\}$, we have that
$\int P(x)e^{a/x}\,dx$ is elementary if and only if
$$\sum_{j=0}^{n} \frac{a_j a^j}{(j+1)!} = 0. \tag{16}$$
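Proof. In view of Theorem 3, if $P$ is a polynomial, then $\int P(x)e^{a/x}\,dx$ is an elementary function if and only if $P \in \operatorname{span}\left\{x - \frac{a}{2},\ x^2 - \frac{a}{3}x,\ \ldots,\ x^j - \frac{a}{j+1}x^{j-1},\ \ldots\right\}$. The
latter occurs if and only if $\sum_{j=0}^{n} a_j x^j = \sum_{j=1}^{n} A_j\left(x^j - \frac{a}{j+1}x^{j-1}\right)$. Comparing the
coefficients of $x^j$, we see that this last equality is equivalent to
$$A_n = a_n, \quad A_{n-1} - \frac{a}{n+1}A_n = a_{n-1}, \quad \ldots, \quad A_2 - \frac{a}{4}A_3 = a_2, \quad A_1 - \frac{a}{3}A_2 = a_1, \quad -\frac{a}{2}A_1 = a_0.$$
Eliminating $A_1, \ldots, A_n$ from this system of linear equations, we obtain (16).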
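$$\sum_{j=0}^{l} \frac{(2a)^j}{(2j+1)!!}\,a_{2j} = 0 \quad\text{and}\quad \sum_{j=0}^{l} \frac{(2a)^j}{(2j+2)!!}\,a_{2j+1} = 0, \tag{17}$$
$$\sum_{j=0}^{l} \frac{(2a)^j}{(2j+1)!!}\,a_{2j} = 0 \quad\text{and}\quad \sum_{j=0}^{l-1} \frac{(2a)^j}{(2j+2)!!}\,a_{2j+1} = 0.$$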
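Proof. Again by Theorem 3, if $P$ is a polynomial, then $\int P(x)e^{a/x^2}\,dx$ is an elementary
function if and only if $P \in \operatorname{span}\left\{x^2 - \frac{2a}{3},\ \ldots,\ x^j - \frac{2a}{j+1}x^{j-2},\ \ldots\right\}$.
Suppose that $n = 2l + 1$ and that $\sum_{j=0}^{2l+1} a_j x^j = \sum_{j=2}^{2l+1} A_j\left(x^j - \frac{2a}{j+1}x^{j-2}\right)$. Comparing the coefficients of $x^j$, we get
$$A_{2l+1} = a_{2l+1}, \quad A_{2l} = a_{2l}, \quad A_{2l-1} - \frac{2a}{2l+2}A_{2l+1} = a_{2l-1}, \quad \ldots, \quad A_2 - \frac{2a}{5}A_4 = a_2, \quad -\frac{2a}{4}A_3 = a_1, \quad -\frac{2a}{3}A_2 = a_0.$$
Eliminating $A_2, \ldots, A_{2l+1}$ from this system of linear equations, we obtain (17). The case
$n = 2l$ is treated similarly.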
The interested reader can prove the following.
Proposition 2. Let $k$ be a positive integer, $P(x) = a_n x^n + \cdots + a_1 x + a_0$, and $a \in \mathbb{C} \setminus \{0\}$. If $n = kl + r$, with $0 \leq r \leq k - 1$, then $\int P(x)e^{a/x^k}\,dx$ is elementary if and
only if the coefficients of $P$ satisfy the following system of $k$ linear equations
$$\begin{aligned} \sum_{j=0}^{l} \frac{(ka)^j}{(kj+i+1)!^{(k)}}\,a_{kj+i} &= 0, \qquad i = 0, 1, \ldots, r; \\ \sum_{j=0}^{l-1} \frac{(ka)^j}{(kj+i+1)!^{(k)}}\,a_{kj+i} &= 0, \qquad i = r+1, \ldots, k-1. \end{aligned} \tag{18}$$
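The system in Proposition 2 can be evaluated directly. The following is a minimal sketch, assuming rational coefficients and a rational value of $a$; the names `multifactorial` and `exp_inverse_power_is_elementary` are hypothetical. The two ranges of $j$ in the system are handled by one loop that stops as soon as the index $kj + i$ exceeds $n$.

```python
from fractions import Fraction

def multifactorial(n, k):
    """n!^(k): product n (n-k) (n-2k) ... over the positive terms; 1 if n <= 0."""
    result = 1
    while n >= 1:
        result *= n
        n -= k
    return result

def exp_inverse_power_is_elementary(coeffs, a, k):
    """Test the k linear equations of Proposition 2 for P(x) e^(a/x^k).

    coeffs[j] is the coefficient a_j of x^j in P; a must be nonzero.
    """
    a = Fraction(a)
    if a == 0 or k < 1:
        raise ValueError("need a != 0 and k >= 1")
    for i in range(k):
        total = Fraction(0)
        j = 0
        while k * j + i < len(coeffs):      # j <= l if i <= r, j <= l - 1 otherwise
            weight = (k * a) ** j / multifactorial(k * j + i + 1, k)
            total += weight * Fraction(coeffs[k * j + i])
            j += 1
        if total != 0:
            return False
    return True

# k = 1, a = 2: P(x) = x - a/2 = x - 1 lies in the span described in Example 8.
print(exp_inverse_power_is_elementary([-1, 1], 2, 1))
# k = 2, a = 3: P(x) = x^2 does not satisfy the first equation.
print(exp_inverse_power_is_elementary([0, 0, 1], 3, 2))
```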
a0
a1
al ax
e dx
+
+
+
xn
x n1
x nl
(19)
(20)
(n 2)A2 a A1
(n l)Al a Al1
a Al
(n 1)A1
+
+ +
nl ,
xn
x n1
x nl+1
x
for some constants A1 , A2 , . . . , An1 . Comparing the coefficients, we obtain the system of linear equations
(n 1)A1 = a0 , (n 2)A2 a A1 = a1 ,
. . . , (n l)Al a Al1 = al1 , a Al = al .
A short calculation shows that this system is equivalent to the fact that (20) is satisfied.
A Taylor expansion of $P$ around $x = c$ and setting $u = x - c$ yields
$$\int \frac{P(x)}{(x-c)^n}e^{ax}\,dx = e^{ac}\int \left(\frac{P(c)}{u^n} + \frac{P'(c)/1!}{u^{n-1}} + \cdots + \frac{P^{(l)}(c)/l!}{u^{n-l}}\right)e^{au}\,du.$$
$$\int F(x)\log G(x)\,dx, \qquad \int F(x)\arctan G(x)\,dx,$$
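$$E = \operatorname{span}\left\{\frac{1}{x},\ \frac{1}{(x-\alpha)^j} : \alpha \in \mathbb{C},\ j = 2, 3, \ldots\right\}$$
and $\mathcal{Q}_p = N \oplus E$.
Proof. Note first that $\mathcal{Q}_p = N \oplus \operatorname{span}\left\{\frac{1}{x}, \frac{1}{(x-\alpha)^j} : \alpha \in \mathbb{C},\ j = 2, 3, \ldots\right\}$. Since
$\int \frac{\log x}{x}\,dx = \frac{1}{2}(\log x)^2 + C$ is elementary and, for $\alpha \in \mathbb{C}$ and any integer $j \geq 2$, so is
$$\int \frac{\log x}{(x-\alpha)^j}\,dx = \frac{1}{j-1}\int \frac{dx}{x(x-\alpha)^{j-1}} - \frac{\log x}{(j-1)(x-\alpha)^{j-1}},$$
it suffices to show that $\int u(x)\log x\,dx$ is not elementary for all $u \in N \setminus \{0\}$.
Let $u(x) = \sum_{l=1}^{n} \frac{B_l}{x - \alpha_l}$ in $N \setminus \{0\}$ and suppose $\int u(x)\log x\,dx$ is elementary. By
Liouville–Hardy's criterion, there exists a rational function $g$ and a constant $C$ such
that $u(x) = g'(x) + \frac{C}{x}$, but this equation is impossible. In fact, if $B_k \neq 0$, then $u$ has
a simple pole at $x = \alpha_k$. Since $\alpha_k \neq 0$, there is an integer $m \geq 1$ and a rational function $r$ continuous at $x = \alpha_k$ such that $r(\alpha_k) \neq 0$ and $g(x) = r(x)/(x - \alpha_k)^m$. A short
calculation with both expressions of $g'$ gives
$$(x - \alpha_k)^m\left(u(x) - \frac{C}{x}\right) = r'(x) - \frac{m\,r(x)}{x - \alpha_k},$$
which is impossible since the left side is continuous at $x = \alpha_k$ but the right is not.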
Corollary 6. Any Liouville-like integral of the type considered in this section has a
Hardy's reduction that is the sum of an elementary function plus a finite number of
nonelementary integrals of the form $\int \frac{\log x}{x - \alpha}\,dx$, with $\alpha \in \mathbb{C} \setminus \{0\}$.
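$$\int \frac{3x^2 - 5x + 4}{x(x-2)^2}\log x\,dx.$$
Note that
$$\frac{3x^2 - 5x + 4}{x(x-2)^2} = \frac{1}{x} + \frac{2}{x-2} + \frac{3}{(x-2)^2},$$
with $\frac{1}{x-2} \in N$ and $\frac{1}{x}, \frac{1}{(x-2)^2} \in E$. Thus,
$$\int \frac{3x^2 - 5x + 4}{x(x-2)^2}\log x\,dx = 2\int \frac{\log x}{x-2}\,dx + 3\int \frac{\log x}{(x-2)^2}\,dx + \int \frac{\log x}{x}\,dx = 2\int \frac{\log x}{x-2}\,dx - \frac{3\log x}{x-2} + \frac{3}{2}\log\left(1 - \frac{2}{x}\right) + \frac{1}{2}(\log x)^2.$$
Since $\frac{1}{x-2} \in N$, the given integral is not elementary.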
ACKNOWLEDGMENT. We are thankful for the valuable comments, remarks, and suggestions of the referees. One of the authors, J. Cruz-Sampedro, was supported during his sabbatical leave by the Universidad
Autónoma Metropolitana-Azcapotzalco.
REFERENCES
1. J. Baddoura, Integration in finite terms with elementary functions and dilogarithms, J. Symbolic Comput. 41 (2006) 909–942.
2. M. Bronstein, The transcendental Risch differential equation, J. Symbolic Comput. 9 (1990) 49–60.
3. M. Bronstein, Integration of elementary functions, J. Symbolic Comput. 9 (1990) 117–173.
4. M. Bronstein, Symbolic Integration I: Transcendental Functions. Second edition. With a foreword by B. F. Caviness. Vol. 1, Algorithms and Computation in Mathematics, Springer-Verlag, Berlin Heidelberg, 2005.
5. J. H. Davenport, The Risch differential equation problem, SIAM J. Comput. 15 (1986) 903–918.
6. P. Diaconis, S. Zabell, Closed form summation for classical distributions: variations on a theme of de Moivre, Statist. Sci. 6 (1991) 284–302.
7. G. H. Hardy, The Integration of Functions of a Single Variable. Cambridge Tracts in Mathematics and Mathematical Physics, Cambridge Univ. Press, Warehause, England, 1905, https://archive.org/details/integrationoffun00hardrich.
8. J. Liouville, Mémoire sur l'intégration d'une classe de fonctions transcendantes, J. Reine Angew. Math. 13 (1835) 93–118, http://gdz.sub.uni-goettingen.de/dms/load/img/?PPN=GDZPPN002140268&IDDOC=267366.
9. E. A. Marchisotto, G.-A. Zakeri, An invitation to integration in finite terms, Coll. Math. J. 25 (1994) 295–308.
10. D. G. Mead, Classroom notes: Integration, Amer. Math. Monthly 68 (1961) 152–156.
11. R. H. Risch, The problem of integration in finite terms, Trans. Amer. Math. Soc. 139 (1969) 167–189.
12. J. F. Ritt, Integration in Finite Terms. Liouville's Theory of Elementary Methods. Columbia Univ. Press, New York, 1948.
13. M. Rosenlicht, Integration in finite terms, Amer. Math. Monthly 79 (1972) 963–972.
14. G. F. Simmons, Calculus with Analytic Geometry, McGraw Hill, New York, 1985.
15. M. F. Singer, B. D. Saunders, B. F. Caviness, An extension of Liouville's theorem on integration in finite terms, SIAM J. Comput. 14 (1985) 966–990.
16. I. Stewart, Galois Theory. Second edition. Chapman and Hall, New York, 1989.
17. P. L. Tchebychef, Intégration des Différentielles Irrationnelles, Oeuvres de P. L. Tchebychef, Imprimerie de l'Académie Impériale des Sciences, St. Pétersbourg 1 (1899) 147–168, https://archive.org/details/oeuvresdepltche01chebrich.
18. L. Verde-Star, Rational functions, Amer. Math. Monthly 116 (2009) 804–827.
JAIME CRUZ-SAMPEDRO received his Ph.D. in mathematics from the University of Virginia. His mathematical interests included applied mathematics, number theory, analysis, operator theory, and Schrödinger
operators.
Universidad Autónoma Metropolitana-Azcapotzalco, Mexico City 02200, Mexico
Editor's Note: We publish the following memorial statement written by Margarita Tetlalmatzi-Montiel regarding her co-author, Jaime Cruz-Sampedro. On November 3, 2015 the mathematical community lost a great
friend with the passing of Jaime Cruz Sampedro. We knew Jaime as a person who supported our efforts to do
important things and who kept open a wary eye to make sure we were living up to his high expectations. We
were always the better for this and will now try to continue to bear in mind his standards. We owe him a lot
and will miss him a lot.
$$\bigl((f+g)^2\bigr)' = \bigl(f^2 + g^2 + 2fg\bigr)' = 2\bigl(ff' + gg' + (fg)'\bigr).$$
$$\bigl(u^2\bigr)'(x) = \lim_{h \to 0}\frac{u^2(x+h) - u^2(x)}{h} = \lim_{h \to 0}\,\bigl(u(x+h) + u(x)\bigr)\frac{u(x+h) - u(x)}{h} = 2u(x)u'(x).$$
Submitted by Piotr Josevich
http://dx.doi.org/10.4169/amer.math.monthly.123.5.470
MSC: Primary 26A06
n1
n1
x
i .
f (x) =
(1)
j
i=0
j=0, j=i i
Another expression for the LIF of a
function f : Fq Fq can be
f [ j+1 , . . . , k ] f [ j , . . . , k1 ]
.
k j
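$$\begin{pmatrix} 1 & 1 & 1 \\ 2 & 0 & 0 \\ 3 & 1 & 3 \end{pmatrix}\begin{pmatrix} a_1 \\ a_2 \\ a_3 \end{pmatrix} = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}.$$
We immediately check that there is no such solution, since 2 is not a unit in $\mathbb{Z}_4$.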
Though the main object of study in our paper is single- and multivariate polynomial functions over finite rings, we briefly mention three related works on polynomial
functions in a more general setting. The reference [2] studies generalized polynomial
mappings from $\mathbb{Z}_n$ to $\mathbb{Z}_m$; a follow-up work is also presented in [1]. Generalized polynomial mappings from $\mathbb{Z}_{n_1} \times \mathbb{Z}_{n_2} \times \cdots \times \mathbb{Z}_{n_r}$ to $\mathbb{Z}_m$ are also studied in [3].
In this paper we consider an interpolation method due to a factorial representation
of polynomial functions. If a function does not admit a polynomial representation,
the factorial interpolation calculation fails in a natural way. In Section 2, we outline
some facts about factorial representations. In Section 3, we analyze the computational
complexity of factorial interpolation in $\mathbb{Z}_{p^m}$, when it exists, and we show that our
method is preferable to the Lagrange and Newton interpolation methods. In Section 4
we compare an implementation of factorial representations with the interpolation function (based on LIF) of the Number Theory Library; see [10]. In Section 5 we briefly
discuss some extensions of factorial representations to functions in several variables
and over infinite rings. We conclude in Section 6 with comments regarding the use of
factorial interpolation in Shamir's secret sharing scheme.
2. FACTORIAL REPRESENTATION. The notion of factorial representations for
expressing polynomial functions can be found in [8].
Definition. Let $j$ be a positive integer. The $j$th (falling) factorial of $x$, denoted $x^{(j)}$, is
defined by
$$x^{(j)} = x(x-1)\cdots(x-j+1), \tag{2}$$
$$\sum_{j=0}^{n} a_{(j)}x^{(j)} \in R[x].$$
Then $g$ is the factorial representation of $\varphi$ and the act of finding $g$ is the factorial
interpolation of $\varphi$.
Let $g \in R[x]$ be in factorial representation. Since $i^{(j)} = 0$ for $j > i$, the evaluation $g(i)$ requires computation and addition of only the first $i + 1$ terms of the sum.
This paper is primarily concerned, however, with expressing a function in factorial
representation.
Factorial interpolation over prime fields. First we consider the case of maps
$\varphi : \mathbb{F}_p \to \mathbb{F}_p$, where $p$ is a prime. Recall that all such $\varphi$ can be represented by a
polynomial. Let $f(x) = \sum_{i=0}^{p-1} a_{(i)}x^{(i)}$. In factorial notation,
$$f(k) = k^{(k)}a_{(k)} + k^{(k-1)}a_{(k-1)} + \cdots + k^{(1)}a_{(1)} + a_{(0)},$$
since $k^{(k+i)} = 0$ for any $i > 0$. We observe that $a_{(0)} = \varphi(0)$, and solving for the coefficients is equivalent to solving the system
$$\begin{pmatrix} (p-1)^{(p-1)} & (p-1)^{(p-2)} & \cdots & (p-1)^{(1)} \\ 0 & (p-2)^{(p-2)} & \cdots & (p-2)^{(1)} \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1^{(1)} \end{pmatrix}\begin{pmatrix} a_{(p-1)} \\ a_{(p-2)} \\ \vdots \\ a_{(1)} \end{pmatrix} = \begin{pmatrix} \varphi(p-1) - \varphi(0) \\ \varphi(p-2) - \varphi(0) \\ \vdots \\ \varphi(1) - \varphi(0) \end{pmatrix}, \tag{3}$$
which is triangular and hence invertible. Therefore, there is a unique solution and the
number of functions representable in factorial representation is $p^p$.
Proposition 1. Let $\varphi : \mathbb{F}_p \to \mathbb{F}_p$ and let $f(x) = \sum_{i=0}^{p-1} a_{(i)}x^{(i)}$. Set $a_{(0)} = \varphi(0)$, and
let $a_{(i)}$ be given by the solution of Equation (3) for $i = 1, \ldots, p - 1$. Then $f$ is the
factorial interpolation of $\varphi$.
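Next we show that a back-substitution method with precomputation can efficiently
find the factorial representation of a function.
Example. Consider the function $\varphi : \mathbb{Z}_5 \to \mathbb{Z}_5$ defined in two row notation by
$$\begin{pmatrix} 0 & 1 & 2 & 3 & 4 \\ 0 & 2 & 3 & 4 & 1 \end{pmatrix}$$
and let $f(x) = \sum_{i=0}^{4} a_{(i)}x^{(i)}$ be its factorial interpolation. Clearly, $a_{(0)} = 0$ since
$\varphi(0) = 0$. The matrix equation to solve is
$$\begin{pmatrix} 4 & 4 & 2 & 4 \\ 0 & 1 & 1 & 3 \\ 0 & 0 & 2 & 2 \\ 0 & 0 & 0 & 1 \end{pmatrix}\begin{pmatrix} a_{(4)} \\ a_{(3)} \\ a_{(2)} \\ a_{(1)} \end{pmatrix} = \begin{pmatrix} 1 \\ 4 \\ 3 \\ 2 \end{pmatrix}.$$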
Nq
a( j) x ( j) ,
j=0
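$$\begin{pmatrix} (q-1)^{(q-1)} & (q-1)^{(q-2)} & \cdots & (q-1)^{(1)} \\ 0 & (q-2)^{(q-2)} & \cdots & (q-2)^{(1)} \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1^{(1)} \end{pmatrix}\begin{pmatrix} a_{(q-1)} \\ a_{(q-2)} \\ \vdots \\ a_{(1)} \end{pmatrix} = \begin{pmatrix} \varphi(q-1) - \varphi(0) \\ \varphi(q-2) - \varphi(0) \\ \vdots \\ \varphi(1) - \varphi(0) \end{pmatrix} \tag{4}$$
has a solution and $f(x) = \sum_{j=0}^{p^m-1} a_{(j)}x^{(j)}$ is the factorial interpolation of $\varphi$, where
$a_{(0)} = \varphi(0)$, or $\varphi$ is not representable as a polynomial.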
We return to our previous example of a function not expressible as a polynomial.
Example. Consider the function $\varphi : \mathbb{Z}_4 \to \mathbb{Z}_4$ defined by $\varphi(0) = 0$ and $\varphi(a) = 1$ for
$a \neq 0$. The adaptation of Equation (4) to $\mathbb{Z}_4$ is obvious, and the equation becomes
$$\begin{pmatrix} 2 & 2 & 3 \\ 0 & 2 & 2 \\ 0 & 0 & 1 \end{pmatrix}\begin{pmatrix} a_{(3)} \\ a_{(2)} \\ a_{(1)} \end{pmatrix} = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}.$$
It is immediate that unless $a_{(2)} = 0$, there can be no solution, since 2 is on the diagonal
and has no inverse. Eliminating the (2, 3) entry shows that $2a_{(2)} = 3$, a contradiction.
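The triangular systems in the two examples above can be solved one unknown at a time, starting from $a_{(0)}$. The following is a minimal sketch under stated assumptions, not the implementation discussed in the paper: the names `falling` and `factorial_interpolation` are hypothetical, the function is given by its list of values at $0, 1, \ldots, n-1$, and each scalar congruence is solved by brute-force search, which is adequate only for small moduli.

```python
def falling(x, j):
    """Falling factorial x^(j) = x (x-1) ... (x-j+1), with x^(0) = 1."""
    result = 1
    for t in range(j):
        result *= x - t
    return result

def factorial_interpolation(values, m):
    """Return [a_(0), a_(1), ...] with sum_j a_(j) k^(j) = values[k] (mod m), or None."""
    coeffs = []
    for k, target in enumerate(values):
        known = sum(a * falling(k, j) for j, a in enumerate(coeffs))
        rest = (target - known) % m
        lead = falling(k, k) % m                 # diagonal entry k^(k) = k! mod m
        candidates = [a for a in range(m) if (lead * a) % m == rest]
        if not candidates:
            return None                          # no polynomial represents the function
        coeffs.append(candidates[0])
    return coeffs

print(factorial_interpolation([0, 2, 3, 4, 1], 5))   # [0, 2, 2, 1, 0]
print(factorial_interpolation([0, 1, 1, 1], 4))      # None: 2 * a_(2) = 3 has no solution mod 4
```

When the diagonal entry is not a unit, several candidates may exist; the sketch simply takes the first one.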
Conversions between factorial and polynomial representations. For a commutative ring with unity $R$ and a transcendental element $x$, the (single variable) polynomial ring $R[x]$ can be considered a free commutative $R$-algebra with generators $\{1, x, x^2, \ldots\}$. Suppose that $R = \{0, 1, r_1, \ldots, r_{n-1}\}$ is a finite set of points. If $f \in R[x]$ is a polynomial mapping, then we have seen it is expressible in factorial representation; that is, if $f(x) = \sum_{i=0}^{m} a_i x^i$ for any $x \in R$, then $f$ has a representation as $f(x) = \sum_{i=0}^{m} a_{(i)} x^{(i)}$, for some $a_{(i)} \in R$.
Converting between traditional and factorial representations can also be performed with linear algebra with the following observation. Define a free commutative $R$-algebra with generators $\{1, x, x(x - r_1), \ldots, x(x - r_1) \cdots (x - r_{n-1})\}$; it is easy to see that this is a generating set for the subspace of $R[x]$ of polynomials of degree at most $n$ using distributivity and that each term in a generating set is monic. Hence, to convert between traditional polynomial representation and factorial representations, it is enough to perform a change-of-basis operation on the respective coefficient vectors.
This change-of-basis relies on combinatorial constants called Stirling numbers. Stirling numbers have many interesting combinatorial and number theoretic properties; we list a few of them here and refer the reader to [7, Section 6.1] for more information.
The Stirling numbers of the first kind, denoted $s(n, k)$, are given by the relations
$$x^{(n)} = \sum_{k=0}^{n} (-1)^{n-k} s(n, k)\, x^k, \quad n = 0, 1, \ldots.$$
There are many equivalent definitions; for example, $s(n, k)$ counts the number of permutations of $n$ elements with exactly $k$ disjoint cycles. The reverse conversion coefficients are given by the Stirling numbers of the second kind, denoted $S(n, k)$, which are given by the relations
$$x^n = \sum_{k=0}^{n} S(n, k)\, x^{(k)}, \quad n = 0, 1, \ldots,$$
and we draw notice to the lack of the alternating sign in this direction. Stirling numbers of the second kind likewise have a combinatorial interpretation as the number of ways to partition an $n$-set into $k$ nonempty subsets.
Let $g(x) = \sum_{i=0}^{n} a_i x^i$; rewriting $g$ in factorial representation gives $g(x) = \sum_{i=0}^{n} a_i \sum_{k=0}^{i} s(i, k) x^{(k)}$. Hence, the vector of (factorial) coefficients of $g$ satisfies the matrix equation
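$$\begin{pmatrix} a_{(n)} \\ a_{(n-1)} \\ \vdots \\ a_{(0)} \end{pmatrix} = \begin{pmatrix} s(n, n) & -s(n, n-1) & \cdots & (-1)^n s(n, 0) \\ 0 & s(n-1, n-1) & \cdots & (-1)^{n-1} s(n-1, 0) \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & s(0, 0) \end{pmatrix} \begin{pmatrix} a_n \\ a_{n-1} \\ \vdots \\ a_0 \end{pmatrix}.$$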
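Similarly, if $h$ is given in factorial representation, $h(x) = \sum_{i=0}^{n} a_{(i)} x^{(i)}$, its traditional polynomial representation has the vector of coefficients satisfying
$$\begin{pmatrix} a_n \\ a_{n-1} \\ \vdots \\ a_0 \end{pmatrix} = \begin{pmatrix} S(n, n) & S(n, n-1) & \cdots & S(n, 0) \\ 0 & S(n-1, n-1) & \cdots & S(n-1, 0) \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & S(0, 0) \end{pmatrix} \begin{pmatrix} a_{(n)} \\ a_{(n-1)} \\ \vdots \\ a_{(0)} \end{pmatrix}.$$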
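Independently of the matrix layout, a change of basis can be checked numerically from the defining relation $x^n = \sum_k S(n, k) x^{(k)}$ alone. The sketch below is a hedged illustration, not the authors' implementation: the function names are hypothetical, the table of $S(n, k)$ is built with the standard recurrence $S(n, k) = k\,S(n-1, k) + S(n-1, k-1)$ (an assumption about how one might precompute it), and the sample polynomial is made up for the test.

```python
def stirling2(n_max):
    """Table S[n][k] of Stirling numbers of the second kind for 0 <= k <= n <= n_max."""
    S = [[0] * (n_max + 1) for _ in range(n_max + 1)]
    S[0][0] = 1
    for n in range(1, n_max + 1):
        for k in range(1, n + 1):
            S[n][k] = k * S[n - 1][k] + S[n - 1][k - 1]
    return S


def power_to_factorial(a, mod):
    """Map coefficients a[i] of x^i to coefficients b[k] of x^(k), using x^i = sum_k S(i,k) x^(k)."""
    n = len(a) - 1
    S = stirling2(n)
    return [sum(a[i] * S[i][k] for i in range(k, n + 1)) % mod for k in range(n + 1)]


def eval_power(a, x, mod):
    """Evaluate sum_i a[i] x^i mod `mod`."""
    return sum(c * pow(x, i, mod) for i, c in enumerate(a)) % mod


def eval_factorial(b, x, mod):
    """Evaluate sum_k b[k] x^(k) mod `mod`, building x^(k) incrementally."""
    total, term = 0, 1
    for k, c in enumerate(b):
        total = (total + c * term) % mod
        term = term * (x - k) % mod
    return total


a = [1, 0, 2, 3]  # hypothetical example: 1 + 2x^2 + 3x^3 over Z_5
b = power_to_factorial(a, 5)
print(b)
print(all(eval_power(a, x, 5) == eval_factorial(b, x, 5) for x in range(5)))
```

Each output coefficient is one dot product against a column of the triangular table, so the work is a single triangular matrix-vector product once the table exists.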
The complexity of this conversion is then the cost of one fully dense upper triangular matrix-vector multiplication; naive multiplication takes $\sum_{i=1}^{n+1} (2i - 1) = (n + 1)^2$ base field operations. This cost supposes that the Stirling numbers have been precomputed, which requires storing $n(n + 1)/2$ constants. Moreover, Stirling numbers may be computed using the simple recurrences
computations. Moreover, this improvement shows we need not precompute all falling factorials $(p - i)^{(j)}$, $i = 1, 2, \ldots, p - 1$, $j = 1, 2, \ldots, i$. We observe, however, that this improvement is not applicable when the (integral) modulus is not prime (that is, if $m > 1$) since it requires division by $i$ for all $i$ (of which some may be zero divisors).
Table 1 provides a comparison of our methods with classical interpolation methods; see [5, Section 5.2] for details on the costs of Lagrange and Newton interpolation methods. We observe that the Factorial (Horner) entry applies only to the case $q = p$, a prime number.
Table 1. Number of field operations (over $\mathbb{F}_q$) for several interpolation methods.
Method                  Field operations
Factorial (backsolve)   $(q - 1)^2$
Factorial (Horner)      $(q - 1)(q - 2)$
LIF                     $7q^2 - 8q + 1$
Newton                  $\frac{5}{2} q^2$
p       iters.             LIF
5       $2 \times 10^7$    2.63
23      $2 \times 10^7$    15.55
2029    $10^3$             68.43
A word on parallelism. Modern processors benefit from having multiple cores with
which to compute. Determining the computational complexity of parallel programs
is a many-tiered problem and is closely tied to the hardware. It is preferable, when
possible, to develop programs with small amounts of communication between cores.
The Lagrange interpolation scheme permits parallelism since each term in Equation (1) is independent of the other terms and hence may be computed separately. A
master thread can accept and add all of the terms to produce the output.
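$$\begin{pmatrix} 4 & 4 & 3 & 2 \\ 0 & 2 & 2 & 4 \\ 0 & 0 & 3 & 3 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} b_{(4)} \\ b_{(3)} \\ b_{(2)} \\ b_{(1)} \end{pmatrix} = \begin{pmatrix} 3 \\ 1 \\ 4 \\ 2 \end{pmatrix}.$$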
Solving the system gives (b(4) , b(3) , b(2) , b(1) ) = (1, 3, 1, 2). Hence, the altered factorial representation of is given by 2x + x(x 1) + 3x(x 1)(x 3) + x(x 1)
(x 3)(x 4). It is easy to verify that this interpolates the function .
Evidently, the ordering of the elements is not important to the algorithm; the ordering provides an equivalent linearization problem. Hence, factorial interpolation may
be applied to any finite commutative ring R with unity.
Finite fields $\mathbb{F}_q$, $q = p^m$. The multiplicative group of $\mathbb{F}_q$ is cyclic of order $q - 1$. Order the elements of $\mathbb{F}_q$ by $0, \alpha^1, \alpha^2, \ldots, \alpha^{q-1}$, where $\alpha$ is a primitive element of $\mathbb{F}_q$. Then, the $i$th term in the factorial representation is $a_{(i)}\, x(x - \alpha) \cdots (x - \alpha^{i-1})$.
q
REFERENCES
1. M. Bhargava, Congruence preservation and polynomial functions from $\mathbb{Z}_n$ to $\mathbb{Z}_m$, Discrete Math. 173 (1997) 15–21.
2. Z. Chen, On polynomial functions from $\mathbb{Z}_n$ to $\mathbb{Z}_m$, Discrete Math. 137 (1995) 137–145.
3. Z. Chen, On polynomial functions from $\mathbb{Z}_{n_1} \times \mathbb{Z}_{n_2} \times \cdots \times \mathbb{Z}_{n_r}$ to $\mathbb{Z}_m$, Discrete Math. 162 (1996) 67–76.
4. I. S. Duff, M. A. Heroux, R. Pozo, An overview of the sparse basic linear algebra subprograms: The new standard from the BLAS technical forum, ACM Trans. Math. Software 28 (2002) 239–267.
5. J. von zur Gathen, J. Gerhard, Modern Computer Algebra. Third edition. Cambridge Univ. Press, Cambridge, 2013.
6. J. González-Domínguez, M. J. Martín, G. L. Taboada, J. Touriño, Dense Triangular Solvers on Multicore Clusters using UPC, 11th International Conference on Computational Science (ICCS 2011), Singapore, June 2011, 231–240.
7. R. L. Graham, D. E. Knuth, O. Patashnik, Concrete Mathematics. Addison-Wesley, Reading, MA, 1990.
8. G. L. Mullen, H. Stevens, Polynomial functions (mod m), Acta Math. Hung. 44 (1984) 237–241.
9. A. Shamir, How to share a secret, Commun. ACM 22 (1979) 612–613.
10. V. Shoup, The Number Theory Library. Electronic resource; documentation and software available: http://www.shoup.net/ntl (2015).
11. D. Singmaster, On polynomial functions (mod m), J. Number Theory 6 (1974) 345–352.
GARY MULLEN received his Ph.D. in mathematics from Penn State University in 1974. Since that time,
he has found much enjoyment and reward in his many years of teaching at Penn State. His research interests
center around finite fields; and when not contemplating the serene beauty of mathematics he enjoys gardening,
hunting, fishing, and having a beer after a busy day in the office.
Department of Mathematics, The Pennsylvania State University, University Park, PA, 16802
mullen@math.psu.edu
DANIEL PANARIO studied mathematics and computer science in Uruguay. He received the MSc degree
from the University of Sao Paulo, Brazil, and the Ph.D. degree from the University of Toronto, Canada. He is a
Professor at Carleton University, Ottawa, Canada. His main research interests are finite fields and applications,
combinatorics and probabilistic analysis of algorithms.
School of Mathematics and Statistics, Carleton University, 1125 Colonel By Drive, Ottawa ON, Canada, K1S
5B6
daniel@math.carleton.ca
DAVID THOMSON is an Assistant Professor in the Department of Mathematical Sciences and Cyber Math
Fellow with the Army Cyber Institute at West Point. He is also an Adjunct Research Professor at Carleton
University in Ottawa, Canada, where he received his PhD in 2012. His research interests include algebraic,
algorithmic, and combinatorial aspects of finite fields and their applications. The views expressed in this article
are those of the author and do not reflect the official policy or position of the Department of the Army, DOD,
or the U.S. Government. This work was done when the author was at Carleton University.
Army Cyber Institute, United States Military Academy, 2101 New South Post Road, Spellman Hall, West Point,
NY, 10996
David.Thomson@usma.edu
On Measurable Semigroups in R
In the discussion in [1, p. 315] of possible versions of the famous Spitzer identity for random walks on subsets $D$ of $\mathbb{R}$, the following alternative is listed, among others: both $D$ and $D^c$ have void interiors but neither is empty. This alternative is called pathological in [1]. Here $D$ is assumed to be Borel-measurable and $D^c = \mathbb{R} \setminus D$.
It is shown in [1] that, in order for a version of Spitzer's identity to hold, both $D$ and $D^c$ must be additive semigroups. However, then the mentioned alternative is not just pathological but plainly nonexistent, even if the condition of the Borel measurability of $D$ is weakened to that of the Lebesgue measurability. Indeed, if $D$ is Lebesgue-measurable and $\lambda$ denotes the Lebesgue measure, then $\lambda(D) \vee \lambda(D^c) > 0$, which contradicts
Proposition 2. Let $D$ be any Lebesgue-measurable subset of $\mathbb{R}$ with $\lambda(D) > 0$ which is an additive semigroup. Then $D$ has a nonvoid interior.
Proof. By the regularity of the measure and the condition
t
2
1
2
(a0 , at ) .
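Let $C_t := a_0 + a_t - D_t = \{a_0 + a_t - x : x \in D_t\}$. Then $C_t \subseteq (a_0, a_t)$, $\lambda(C_t) = \lambda(D_t) > \frac{1}{2} \lambda\big((a_0, a_t)\big)$, and $D_t \subseteq (a_0, a_t)$. Hence, $\lambda(C_t \cap D_t) \geq \lambda(C_t) + \lambda(D_t) - \lambda\big((a_0, a_t)\big) > 0$. So, there is some $y \in C_t \cap D_t$. For any such $y$, one has $a_0 + a_t - y \in D_t$ and therefore
$$a_0 + a_t = y + (a_0 + a_t - y) \in D_t + D_t \subseteq D + D \subseteq D.$$
We conclude that
$$\emptyset \neq (a_0 + a_{1/2}, a_0 + a_1) = \{a_0 + a_t : t \in (\tfrac{1}{2}, 1)\} \subseteq D,$$
so that $D$ has a nonvoid interior.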
REFERENCE
1. J. F. C. Kingman, Spitzer's identity and its use in probability theory, J. London Math. Soc. 37 (1962) 309–316.
NOTES
Edited by Sergei Tabachnikov
=
(, ).
1
+ +
That the map is surjective is clear from the definition of a unimodular pair. It
remains to prove injectivity. Fix an element [(, )] U2 (A)/E 2 (A), and suppose
that it has two pre-images, for which we choose explicit coset representatives
and
. But then
1
1
=
L 2 (A).
1
Therefore,
=
.
Notice that L 2 (A) E 2 (A). Lemma 1 tells us two important facts. First, it tells
us that if we want to know the size of SL2 (A)/E 2 (A), we can get this information
by just considering the orbits of unimodular pairs; in particular, $E_2(A)$ is an infinite index subgroup if $U_2(A)/E_2(A)$ is an infinite set. Secondly, it gives us a convenient way to examine the normality of $E_2(A)$: if $E_2(A)$ is a normal subgroup, then it is an immediate consequence that the obvious map $SL_2(A)/E_2(A) \to U_2(A)/E_2(A)$ is
a bijective map, and then we can transfer the group structure on SL2 (A)/E 2 (A) to
U2 (A)/E 2 (A). As we shall see, this will significantly simplify computations and give
an easy way to prove that E 2 (A) is nonnormal.
In order to prove the infinitude of orbits of $U_2(A)$, Nica introduced the concept of special pairs, defined as all unimodular pairs $(\alpha, \beta)$ such that $|\alpha| = |\beta| < |\alpha \pm \beta|$. He then proved the key result.
finitely many other special pairs, we know immediately that $U_2(A)/E_2(A)$ is infinite, which gives the desired result. For nonnormality, one uses the arithmetic of the matrix group $SL_2(A)$ to derive a contradiction assuming $E_2(A)$ is normal (although Nica does not make use of Lemma 1, his proof is easily seen to be equivalent to this strategy). Better yet, it is really enough to do the construction for rings $A = \mathbb{Z}[\sqrt{-D}]$, as taking $D' = 4D - 1$, it is immediate that this construction works just as well for rings $A = \mathbb{Z}[\tfrac{1}{2}(1 + \sqrt{1 - 4D})]$.
Nica's construction of such an infinite family hinges crucially on the existence of solutions to the Pell equation $X^2 - DY^2 = 1$. As mentioned previously, this makes it unsuitable for the rings $\mathbb{Z}[di]$.
3. THE CASE $A = \mathbb{Z}[di]$: Thankfully, this is easily remedied by the construction of a new infinite family of special unimodular pairs.
Proof of Theorem 4's Remaining Case. Specifically, notice that if $d \mid n$, then
$$\begin{pmatrix} 1 + n + ni & 1 + n - ni \\ n & 1 - ni \end{pmatrix} \in SL_2(\mathbb{Z}[di]).$$
Thus, the pair $(1 + n + ni, 1 + n - ni)$ is unimodular. That it is special is straightforward since if $n > 1$,
$$|1 + n + ni|^2 = (1 + n)^2 + n^2 = |1 + n - ni|^2,$$
$$|1 + n + ni + (1 + n - ni)|^2 = (2 + 2n)^2 > (1 + n)^2 + n^2,$$
$$|1 + n + ni - (1 + n - ni)|^2 = (2n)^2 > (1 + n)^2 + n^2.$$
Therefore, for any $d \geq 2$, we have an infinite family of special unimodular pairs. Applying Lemma 2, we see immediately that $E_2(A)$ is an infinite index subgroup of $SL_2(A)$ by the above discussion.
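The determinant and the three norm conditions are easy to check mechanically with exact Gaussian-integer arithmetic. The sketch below is illustrative only: Gaussian integers are stored as hypothetical `(real, imag)` tuples, the matrix is assumed to have first row $(1+n+ni,\ 1+n-ni)$ and second row $(n,\ 1-ni)$, and the divisibility condition $d \mid n$ (which only governs membership in $\mathbb{Z}[di]$) is not modeled.

```python
def add(z, w):
    """Sum of two Gaussian integers stored as (real, imag) tuples."""
    return (z[0] + w[0], z[1] + w[1])


def sub(z, w):
    """Difference of two Gaussian integers."""
    return (z[0] - w[0], z[1] - w[1])


def mul(z, w):
    """Product of two Gaussian integers."""
    return (z[0] * w[0] - z[1] * w[1], z[0] * w[1] + z[1] * w[0])


def norm(z):
    """Squared absolute value |z|^2."""
    return z[0] ** 2 + z[1] ** 2


def check(n):
    """Return (determinant is 1, pair is special) for the matrix built from n."""
    alpha = (1 + n, n)    # 1 + n + ni
    beta = (1 + n, -n)    # 1 + n - ni
    gamma = (n, 0)        # n
    delta = (1, -n)       # 1 - ni
    det = sub(mul(alpha, delta), mul(beta, gamma))
    special = (norm(alpha) == norm(beta)
               and norm(alpha) < norm(add(alpha, beta))
               and norm(alpha) < norm(sub(alpha, beta)))
    return det == (1, 0), special


print([check(n) for n in range(1, 6)])
```

For $n = 1$ the determinant is still 1 but the last inequality fails, which matches the requirement $n > 1$ in the text.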
Proving that E 2 (A) is a nonnormal subgroup is similarly easy. Indeed, suppose
that E 2 (A) is normal so that U2 (A)/E 2 (A) inherits a group structure. We have matrix
relations
0 1
1 ni
1 0
n
n
0 1
1 + ni
=
1 + ni
1 0
n
1 ni
n
1 + ni
=
n
1 + ni
n
n
1 ni
n
1 ni
1
.
Projectinginto U
2 (A)/E 2 (A), this simplifies greatly (especially since it is readily
0 1
checked that 1
0 E 2 (A)):
[(1 ni, n)] = [(1 + ni, n)]
[(1 ni, n)] = [(1 + ni, n)]1 .
In particular, this implies that [(1 ni, n)]2 = 1. However, it is also true that
1 ni
n
n
1 + ni
2
1 2ni
=
2n
2n
1 + 2ni
0 1
1 2ni
2n
0 1
1 0
1 2n + 2ni 1 + 2n + 2ni
1 0
1 + 2n + 2ni 1 + 2n 2ni
=
,
2n
1 2ni
which projects to
[(1 ni, n)]2 = [(1 2ni, 2n)]
[(1 2ni, 2n)] = [(1 + 2n + 2ni, 1 + 2n 2ni)].
By Lemma 2,
[(1 + 2n + 2ni, 1 + 2n 2ni)] = [(1 + 2n
+ 2n
i, 1 + 2n
2n
i)],
for distinct values of n, n
, so we can always choose n such that [(1 + 2n + 2ni,
1 + 2n 2ni)] = 1. But by the above, this implies that [(1 ni, n)]2 = 1, which
is a contradiction. Therefore, E 2 (A) is a nonnormal subgroup of SL2 (A).
Equivalently, one can show directly that the matrix
1 ni
n
n
1 + ni
0 1
1 ni
1 0
n
n
1 + ni
1
does not belong to E 2 (A), proving that E 2 (A) is nonnormal. We leave this as an exercise for the reader.
ACKNOWLEDGMENT. The author is indebted to Alex Kontorovich, both for pointing out the flaw in Nica's paper and for helpful suggestions along the way.
REFERENCES
1. H. Bass, J. Milnor, J.-P. Serre, Solution of the congruence subgroup problem for SLn (n 3) and Sp2n
Abstract. We follow an incorrect entry in a well-known table of series and products through
several earlier tables and books, all the way to the relevant (correct) identity in the work of
Euler. Along the way we explain what may have led to the error.
In addition to modern computer algebra systems and bibliographic databases, old-fashioned tables of integrals, sums, and products remain important research tools in
mathematics. This is especially the case for the present authors whose research is in
the classical areas of analysis, combinatorics, and number theory.
What follows is a cautionary tale about the use of tables, along with a reminder, as
much to ourselves as to the reader, to go back to the sources if at all possible. But before
we continue, an important caution to the reader of this note: Some of the identities that
follow are incorrect, while others may be misleading. Also, some notations will be
inconsistent, to preserve the historical context.
In the process of looking up some infinite products in the excellent and very useful handbook by E. R. Hansen [12], we came across a curious identity for the Euler numbers $E_n$. These numbers, as they appear in [12] and other modern books and papers, can be defined by the exponential generating function
$$\frac{2}{e^t + e^{-t}} = \sum_{n=0}^{\infty} E_n \frac{t^n}{n!}. \tag{1}$$
Since the left-hand side of (1) is an even function, we have $E_{2k+1} = 0$ for all $k \geq 0$. Furthermore, it can be shown that the even-index Euler numbers are integers, and $E_0 = 1$, $E_2 = -1$, $E_4 = 5$, $E_6 = -61$, $E_8 = 1385, \ldots$. For further properties, see, e.g., [1] or its successor volume [16]. The identity in question is (89.8.3) on p. 486 of [12], namely
$$\prod_{k=1}^{\infty} \left(1 - \frac{(-1)^k}{(2k+1)^{2n+1}}\right) = \frac{|E_{2n}|}{2(2n)!} \left(\frac{\pi}{2}\right)^{2n+1}. \tag{2}$$
A quick numerical check with Maple and Mathematica for a few small values of
n revealed that the identity is definitely incorrect. Before we realized what was going
on, we found the identity, in the same form, in the first volume of the well-known
multivolume tables by Prudnikov et al. [17, p. 754]. No reference is given, and the
identity also appears in the Russian original [18], again in the same form.
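A check of this kind is easy to reproduce. The Python sketch below is an illustration, not the authors' Maple or Mathematica session: it assumes identity (2) reads as the product over $k \geq 1$ of $1 - (-1)^k/(2k+1)^{2n+1}$ on the left and $|E_{2n}| (\pi/2)^{2n+1} / (2(2n)!)$ on the right, truncates the product at a hypothetical cutoff `terms`, and takes the values $|E_{2n}|$ from the list given above.

```python
import math

# |E_0|, |E_2|, |E_4|, |E_6|, |E_8| as listed in the text.
ABS_EULER = {0: 1, 2: 1, 4: 5, 6: 61, 8: 1385}


def lhs(n, terms=20000):
    """Truncated product over k >= 1 of 1 - (-1)^k / (2k+1)^(2n+1)."""
    prod = 1.0
    for k in range(1, terms + 1):
        prod *= 1.0 - (-1) ** k / (2 * k + 1) ** (2 * n + 1)
    return prod


def rhs(n):
    """Claimed closed form |E_2n| / (2 (2n)!) * (pi/2)^(2n+1)."""
    return ABS_EULER[2 * n] / (2 * math.factorial(2 * n)) * (math.pi / 2) ** (2 * n + 1)


for n in (1, 2, 3):
    print(n, lhs(n), rhs(n))
```

For $n \geq 1$ the product converges quickly, so a modest cutoff is enough to see that the two sides disagree well beyond rounding error.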
http://dx.doi.org/10.4169/amer.math.monthly.123.5.486
MSC: Primary 01A90, Secondary 11M06
One of the useful features of Hansen's handbook [12] is the fact that most entries come with one or more references to the literature, or at least to other entries in the table. The reference associated with identity (2), i.e., Hansen's (89.8.3), is given as identity (1131) in the smaller and older collection of formulas by Jolley [15]. The identity in question is listed on pp. 238–239 as
2n+1
1
1
1
2
1 + 2n+1
1 2n+1
1 + 2n+1 = E n = E 2n ,
2(2n)!
3
5
7
(3)
where E n and E 2n are notations of the Euler numbers that differ from the usage in (1)
and (2), namely E 0 = E 1 = 1, E 2 = 5, E 3 = 61, E 4 = 1385, . . ..
Jolley also provides references, and for his identity (1131), i.e., (3) above, the reader is referred to p. 365 of the famous old algebra book by Chrystal [4]; it is actually part II of this two-volume work, a fact that is not mentioned in [15]. The identity in question turns out to be (15) on p. 365 of [4], namely
$$E_m = 2(2m)! \left(\frac{2}{\pi}\right)^{2m+1} \Big/ \left(1 + \frac{1}{3^{2m+1}}\right) \left(1 - \frac{1}{5^{2m+1}}\right) \left(1 + \frac{1}{7^{2m+1}}\right) \cdots, \tag{4}$$
where yet another notation for the Euler numbers is used, namely $E_1 = 1$, $E_2 = 5$, $E_3 = 61$, $E_4 = 1385, \ldots$ (see [4, p. 342]). Note that the large fraction sign in (4) is missing in (3).
In addition to hinting at a proof of this last identity, Chrystal refers the reader to Euler by writing, "See again Euler, Introd. in Anal. Inf., §284" in a footnote on p. 365. Euler's influential book [6] is available in English translation [7], and §284 can be found on pp. 239–241 of [6], or on pp. 244–245 of [7], where we find
$$A = 1 - \frac{1}{3^n} + \frac{1}{5^n} - \frac{1}{7^n} + \frac{1}{9^n} - \frac{1}{11^n} + \frac{1}{13^n} - \frac{1}{15^n} + \text{\&c.} \tag{5}$$
$$= \frac{3^n}{3^n + 1} \cdot \frac{5^n}{5^n - 1} \cdot \frac{7^n}{7^n + 1} \cdot \frac{11^n}{11^n + 1} \cdot \frac{13^n}{13^n - 1} \cdot \frac{17^n}{17^n - 1} \cdot \text{\&c}, \tag{6}$$
"where the powers of all the prime numbers in the numerators occur and the denominators are increased or decreased by 1 depending on whether the number has the form $4m - 1$ or $4m + 1$." [7, p. 245].
The identity (6) already gives us a glimpse of what might have gone wrong on the way towards the infinite product (2). But first we need to establish a connection between Euler's expressions (5) and (6) and what were later called the Euler numbers. In another well-known book [8], published a few years after [6], Euler expressed the numbers $A$ in (5) in terms of the Taylor coefficients of the secant function, which by (1) are closely related to the Euler numbers. For instance, one of the explicit identities that can be found in §224 in [8, p. 542], or in German translation in [9, p. 259], is
$$1 - \frac{1}{3^9} + \frac{1}{5^9} - \frac{1}{7^9} + \frac{1}{9^9} - \text{\&c} = \frac{\varepsilon}{1 \cdot 2 \cdots 8} \cdot \frac{\pi^9}{2^{10}}, \tag{7}$$
where $\varepsilon = 1385$. Since this note deals with incorrect identities, we must mention in passing that there are typographical errors in the original identity (and neighboring ones) on p. 542 of [8]. Only two pages further they are correctly printed; it is also correct in [9]. However, in [14, p. 135] it is pointed out, with historical and bibliographical notes, that Euler's evaluation of what is $|E_{18}|$ in the notation of (1) is incorrect. This paper [14], by the way, uses an alternative definition of Euler numbers which is also in modern use and is more amenable to combinatorial applications; see also [19, p. 149].
The rest of the story is now easy to piece together. The numbers $\alpha, \beta, \ldots, \varepsilon, \ldots$ used by Euler in [8, p. 542] correspond to $1, E_1, \ldots, E_4, \ldots$ in Chrystal's notation, and (4) is indeed the general form of Euler's identity, if the sequence 3, 5, 7, . . . is interpreted as the beginning of the sequence of odd primes, rather than the sequence of odd integers greater than 1. Therefore, the origin of the incorrect formula (2) is quite likely Chrystal's identity (4). Showing just one more term, along the lines of Euler's identity (6), would have avoided all this. However, whether or not Jolley misinterpreted the sequence 3, 5, 7, . . ., the identity (3) still contains the mistake of the missing fraction sign which then made it into Hansen's identity (2).
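The reading of (4) as a product over odd primes can be tested numerically. The following sketch is an illustration under assumptions: the helper names are hypothetical, the product is truncated at primes below a cutoff `limit`, and each odd prime $p$ contributes the factor $1 + p^{-(2m+1)}$ when $p \equiv 3 \pmod 4$ and $1 - p^{-(2m+1)}$ when $p \equiv 1 \pmod 4$, following the rule quoted with (6).

```python
import math


def odd_primes(limit):
    """Odd primes up to `limit`, via a simple sieve of Eratosthenes."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, int(limit ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i::i] = bytearray(len(range(i * i, limit + 1, i)))
    return [p for p in range(3, limit + 1) if sieve[p]]


def chrystal_value(m, limit=10000):
    """2 (2m)! (2/pi)^(2m+1) divided by the truncated product over odd primes."""
    s = 2 * m + 1
    prod = 1.0
    for p in odd_primes(limit):
        sign = 1 if p % 4 == 3 else -1  # +1 for primes 4k - 1, -1 for primes 4k + 1
        prod *= 1 + sign / p ** s
    return 2 * math.factorial(2 * m) * (2 / math.pi) ** s / prod


for m in (1, 2, 3, 4):
    print(m, chrystal_value(m))
```

With the prime interpretation the computed values land on Chrystal's $E_1, E_2, E_3, E_4$ to within truncation and rounding error, which is the behavior the text describes.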
In the end, we shouldn't blame Chrystal too much. Given that his book is written in great detail, even a moderately attentive reader would realize that the product in (4) had to be over the odd primes. This is all the more so as Chrystal remarks that a previous identity is transformed into his (15), i.e., identity (4) above, in the same way as before. He apparently refers to identity (8) in [4, p. 364], namely
$$B_m = 2(2m)! \big/ (2\pi)^{2m} \left(1 - 1/2^{2m}\right) \left(1 - 1/3^{2m}\right) \left(1 - 1/5^{2m}\right) \cdots \tag{8}$$
(once again in Chrystal's notation), where $B_m$ is the $m$th Bernoulli number in the historical notation that has $B_1 = 1/6$, $B_2 = 1/30$, $B_3 = 1/42$, $B_4 = 1/30$, $B_5 = 5/66, \ldots$; for different notations see, e.g., [16, Ch. 24]. This last identity (8) is closely related to the Euler product for the Riemann zeta function, especially if we compare it with the following famous formula named after Euler:
$$B_m = \frac{2(2m)!}{(2\pi)^{2m}} \left(\frac{1}{1^{2m}} + \frac{1}{2^{2m}} + \frac{1}{3^{2m}} + \cdots\right), \tag{9}$$
again reproduced as in [4, p. 363]. While there is much less danger of the product in (8) being misunderstood, Euler himself showed more terms in the analogs of (8) and (9); see §283 in [6] or [7].
We already mentioned that the product in (8) is, essentially, the Euler product for the
Riemann zeta function. Similarly, the product in (4) is the Euler product of an appropriate L-series (in fact, the series (5)), and both are special cases of Euler products of
Dirichlet L-series; see, e.g., [5, p. 162ff.].
Before we close, let us reiterate that some of the identities in this note are incorrect
or misleading. Indeed, the reader will have realized that (2) and (3) are false, and
(4) is correct only when interpreted as a product over the odd primes. Along with
the incorrect entry (89.8.3) in [12], i.e., (2) above, two consequences are mentioned,
namely (89.4.11) for n = 0, and (89.6.12) for n = 1. The first one of these is correct
since it is in a somewhat different form, taken from an identity in the well-known
classical book by Bromwich [2, p. 224, Ex. 9]. The special case of (89.4.11), namely
w = 1, that is relevant here is
$$\prod_{k=1}^{\infty} \left(1 - \frac{(-1)^k}{2k+1}\right) = \frac{\sqrt{2}\,\pi}{4}. \tag{10}$$
However, the second identity (89.6.12) is indeed false, the corrected version being
2
3
(1)k
+
cosh
.
(11)
1
=
3
(2k + 1)
12
12
4
k=1
18. A. P. Prudnikov, Yu. A. Brychkov, O. I. Marichev, Integraly i ryady. Elementarnye funktsii. Nauka,
Moscow, 1981.
19. R. P. Stanley, Enumerative Combinatorics. Vol. I. Wadsworth & Brooks/Cole, Monterey, CA, 1986.
20. Wikiquote contributors, Pierre-Simon Laplace, Wikiquote, The Free Quote Compendium, http://en.wikiquote.org/wiki/Pierre-Simon_Laplace.
Department of Mathematics and Statistics, Dalhousie University, Halifax, Nova Scotia, B3H 4R2, Canada
dilcher@mathstat.dal.ca
LSS-Supelec, Universite Paris-Sud, Orsay, France and Department of Mathematics, Tulane University,
New Orleans, LA 70118, USA
cvignat@tulane.edu
(mod p),
where the first equality follows by the binomial theorem mod p, and the second by
Fermats little theorem and the fact that p is odd. It follows that
k
(a + bi) p a bi 0
(mod p)
Let $V$ be a vector space over $\mathbb{R}$ endowed with a norm $\|\cdot\|$, with the dual space $V^*$. Let us say that the norm $\|\cdot\|$ admits an additive decomposition if there exists a Borel measure $\mu$ on $V^*$ such that
$$\|x\| = \int_{V^*} |\ell(x)|\, \mu(d\ell) \quad \text{for all } x \in V. \tag{1}$$
d,
where is
R
the Borel measure determined by the condition that (a, b] = f + (b) f + (a) for all
real a and b such that a< b. The latter condition is equivalent to each of the following
conditions:
(i) [a,b) = f (b) f (a)
for all real a and b such that a < b and (ii)
[a, b) + (a, b] = 2 f (b) f (a) for all real a and b such that a < b.
Now we are ready to state the following explicit additive decomposition of an arbitrary norm in the case when V = R2 .
Theorem 1. Let $\|\cdot\|$ be any norm on $\mathbb{R}^2$. Let $N(u) := \|(u, 1)\|$ for all real $u$. Then the function $N$ is convex, the limit
$$c := \lim_{u \to \infty} \big[\|(u, 1)\| - 2\|(u, 0)\| + \|(-u, 1)\|\big] \tag{2}$$
(3)
Special cases of representation (3), for the p norm on R2 with p > 1, are
(u, v) p = (|u| + |v| )
p
p 1/ p
p1
=
2
(4)
1
2
(|u + v| + |u v|),
for all (u, v) R2 . In these cases, the constant c in (3)(2) is 0. A simple case with
a nonzero c is given by the formula (u, v) = (u, v)1 = |u| + |v| for (u, v) R2 ,
with c = 2.
Remark 2. Concerning formula (4), one may recall Theorem 7.2 and Corollary 1 in [8], which state that the classical normed spaces $\ell_p^n$ (for natural $n$) and $L_p(0, 1)$ can be isometrically imbedded into $L_r(0, 1)$ whenever $1 \leq r \leq p \leq 2$. A simple argument, based on the scaling properties of sums of independent identically distributed (i.i.d.) symmetric stable random variables (r.v.'s), was proposed in [12, pages 161–162], which shows that these isometric imbeddings can be given in rather explicit form, in terms of such sums of r.v.'s.
In particular, any Euclidean space is linearly isometric to a subspace of L 1 . In this
case, one has the following explicit additive decomposition of the norm:
x2 :=
xx =
Rd
|x t| d ( dt)
(5)
for all x Rd , where d is the standard Gaussian measure on Rd and denotes the standard inner product on Rd . In place of d , one can similarly use any other spherically
invariant measure on Rd such that Rd |x t| ( dt) (0, ) for some or, equivalently, any nonzero vector x in Rd .
The proof of Theorem 1 relies on the following.
Lemma 3. Suppose that f : R R is a convex function such that for some real k
there exist finite limits
d+ := d f,k;+ := lim [ f (u) ku]
u
(6)
R
|u t| d f (t).
(7)
u
= f (0) + 0 ( f (t) k) dt, which converges to a finite limit (as u ) only if
k+ = k. Similarly, k = k. So, in view of (6), for any real u
u
u
z
f (u) + ku = d +
( f (z) + k) dz = d +
dz
d f (t)
= d +
= d +
d f (t)
dz
t
max(0, u t) d f (t),
so that
f (u) + ku = d +
Similarly, f (u) ku = d+ +
two identities, one obtains (7).
max(0, u t) d f (t).
Proof of Theorem 1. That the function N is convex follows immediately from the convexity of the norm. Note next that the limits d f,k; in (6) exist in [, ] for any
convex function f : R R and any real k. On the other hand, for all real u
N (u) |u| (1, 0) = (u, 1) (u, 0) (0, 1),
by the norm inequality. So, the limits d = d f,k; in (6) exist and are finite for f = N
and
k = (1, 0).
(8)
(9)
for all real u and all real v = 0. The last expression in (9) is continuousin v R by
dominated convergencebecause, by (9) with (u, v) = (0, 1), one has R |t| dN (t)
= 2(0, 1) (d+ + d ) < . Thus, one has (3)with d+ + d in place of cfor
all (u, v) R2 .
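Moreover,
$$d_+ + d_- = \lim_{u \to \infty} \big[N(u) - ku + N(-u) - ku\big] = \lim_{u \to \infty} \big[\|(u, 1)\| - 2\|(1, 0)\|u + \|(-u, 1)\|\big] = \lim_{u \to \infty} \big[\|(u, 1)\| - 2\|(u, 0)\| + \|(-u, 1)\|\big] = c \geq 0,$$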
From the proofs of Theorem 1 and Lemma 3, it follows that the nondecreasing
function N tends to k as x , where k is as in (8). So,
F :=
1
2
1
2k
N
(s) :=
1
2
s < 1.
(11)
(12)
where X and Y are i.i.d. random vectors in Rd and 2 is the Euclidean norm, as in
(5). This inequality was obtained in [3]. As noted in [7], in the case d = 1 (12) follows
immediately from the identity
E |X + Y | = E |X Y | + 2
Now the $L_1$-imbedding formula (10) immediately yields the following corollary.
Corollary 5. For any two-dimensional normed space $V$ and any i.i.d. random vectors $X$ and $Y$ in $V$,
$$E\|X - Y\| \leq E\|X + Y\|. \tag{13}$$
As shown by Johnson [2], for each natural $d \geq 3$ inequality (13) fails to hold for $V = \mathbb{R}^d$ in general. Indeed, define the norm $\|\cdot\|$ on $\mathbb{R}^d$ by the formula
$$\|x\| := \max\{|x_i| \vee |x_i - x_j| : i, j = 1, \ldots, d\}$$
for $x = (x_1, \ldots, x_d) \in \mathbb{R}^d$. Let $(e_1, \ldots, e_d)$ be the standard basis of $\mathbb{R}^d$. Let $X$ and $Y$ be i.i.d. random vectors in $\mathbb{R}^d$. For any natural $d \geq 4$, suppose that the random vector $X$ is such that $P(X = e_i) = \frac{1}{d}$ for each $i = 1, \ldots, d$. Then $E\|X - Y\| = 2\frac{d-1}{d} > \frac{d+1}{d} = E\|X + Y\|$. In the remaining case when $d = 3$, suppose that the random vector $X$ is such that $P(X = e_1) = P(X = e_2) = P(X = e_3) = P\big(X = -\frac{1}{2}(e_1 + e_2 + e_3)\big) = \frac{1}{4}$. Then $E\|X - Y\| = \frac{21}{16} > \frac{19}{16} = E\|X + Y\|$.
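Both counterexamples involve finitely supported distributions, so the expectations can be computed exactly by enumeration. The sketch below is an illustration with hypothetical helper names; it uses exact rational arithmetic and assumes the fourth atom in the case $d = 3$ is $-\frac{1}{2}(e_1 + e_2 + e_3)$, the sign that reproduces the stated values.

```python
from fractions import Fraction
from itertools import product


def norm(x):
    """Johnson's norm: the largest of |x_i| and |x_i - x_j| over all i, j."""
    d = len(x)
    return max(max(abs(x[i]), abs(x[i] - x[j])) for i in range(d) for j in range(d))


def expectations(support):
    """Return (E||X - Y||, E||X + Y||) for i.i.d. X, Y with the given (vector, prob) atoms."""
    minus = plus = Fraction(0)
    for (x, px), (y, py) in product(support, repeat=2):
        minus += px * py * norm([a - b for a, b in zip(x, y)])
        plus += px * py * norm([a + b for a, b in zip(x, y)])
    return minus, plus


def basis(d):
    """Standard basis vectors of R^d with exact rational entries."""
    return [[Fraction(int(i == j)) for j in range(d)] for i in range(d)]


# d = 4: X uniform on the standard basis.
uniform4 = [(e, Fraction(1, 4)) for e in basis(4)]
print(expectations(uniform4))

# d = 3: three basis vectors plus -(e1 + e2 + e3)/2, each with probability 1/4.
atoms3 = [(e, Fraction(1, 4)) for e in basis(3)]
atoms3.append(([Fraction(-1, 2)] * 3, Fraction(1, 4)))
print(expectations(atoms3))
```

Enumerating all ordered pairs of atoms is quadratic in the support size, which is negligible for these examples.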
In the arXiv version [11] of this note, one can find applications of the additive decomposition of norms concerning the following: (i) explicit representations of the moments of the norm of a random vector $X$ in terms of the characteristic function and the Fourier–Laplace transform of the distribution of $X$ and (ii) an explicit and partially improved form of the exact version of the Littlewood–Khinchin–Kahane inequality obtained by Latała and Oleszkiewicz.
ACKNOWLEDGMENT. This note was sparked by answers by Noam D. Elkies and Suvrit Sra on MathOverflow [1] and William B. Johnson's comments there.
REFERENCES
1. Absolute value inequality for complex numbers, 2015, MathOverflow, http://mathoverflow.net/questions/167685/absolute-value-inequality-for-complex-numbers.
2. An inequality for two independent identically distributed random vectors in a normed space, 2015, MathOverflow, http://mathoverflow.net/questions/208194/an-inequality-for-two-independent-identically-distributed-random-vectors-in-a-no/208245#208250.
3. A. Buja, B. F. Logan, J. A. Reeds, L. A. Shepp, Inequalities and positive-definite functions arising from a problem in multidimensional scaling, Ann. Statist. 22 (1994) 406–438.
4. W. Fechner, Hlawka's functional inequality, Aequationes Math. 87 (2014) 71–87.
5. L. M. Kelly, D. M. Smiley, M. F. Smiley, Two dimensional spaces are quadrilateral spaces, Amer. Math. Monthly 72 (1965) 753–754.
6. A. Koldobsky, H. König, Aspects of the isometric theory of Banach spaces, in Handbook of the Geometry of Banach Spaces, Vol. I. North-Holland, Amsterdam, 2001. 899–939.
7. M. Lifshits, R. M. Schilling, I. Tyurin, A probabilistic inequality related to negative definite functions, in High Dimensional Probability VI. Progress in Probability, Vol. 66, Springer, Basel, 2013. 73–80.
8. J. Lindenstrauss, A. Pełczyński, Absolutely summing operators in $L_p$-spaces and their applications, Studia Math. 29 (1968) 275–326.
9. J. Lindenstrauss, On the extension of operators with a finite-dimensional range, Illinois J. Math. 8 (1964) 488–499.
10. C. P. Niculescu, L.-E. Persson, Convex Functions and Their Applications. CMS Books in Mathematics/Ouvrages de Mathématiques de la SMC, 23, Springer, New York, 2006. A contemporary approach.
11. I. Pinelis, Explicit additive decomposition of norms on $\mathbb{R}^2$, http://arxiv.org/abs/1506.00537, 2015.
12. H. P. Rosenthal, On the span in $L_p$ of sequences of independent random variables. II, in Proceedings of the Sixth Berkeley Symposium on Mathematical Statistics and Probability (Univ. California, Berkeley, Calif., 1970/1971), Vol. II: Probability theory. Univ. California Press, Berkeley, CA, 1972. 149–167.
Department of Mathematical Sciences, Michigan Technological University, Houghton, Michigan 49931
ipinelis@mtu.edu
Archimedes showed that the area of the region bounded by a parabola and a chord AB is 2/3 the area of the circumscribing parallelogram ABCD (Figure 1(a)). Sides AD and BC are parallel to the axis of the parabola, and CD is tangent to the parabola. To see why, translate the vertical segments in the parallelogram so that their top endpoints lie on a horizontal line (Figure 1(b)). The transformation maps the parallelogram to a rectangle $A'B'C'D'$ and the parabola to another parabola with its vertex at the midpoint of $C'D'$. The symmetric parabolic segment occupies 2/3 of the rectangle, and Archimedes's result follows from Cavalieri's principle.
Figure 1. The area of a parabolic segment
The authors of [1] claim a converse to this theorem, that with certain differentiability
assumptions quadratics are the only convex functions with the 2/3-property (where
two sides of the circumscribing parallelogram are parallel to the y-axis). Our efforts to
understand their proof led us to consider functions of the form
(1)
y = f (x) = ax + d + bx + c, a = 0.
We soon realized these functions also have the 2/3-property. The graphs of these
functions are parabolic arcs, as the reader may check by writing (1) in the form
$Ax^2 + 2Bxy + Cy^2 + Dx + Ey + F = 0$ and verifying that $B^2 - AC = 0$. However, the 2/3-property is not the Archimedean property of Figure 1, for sides BC and
AD of the circumscribing parallelogram ABCD are not parallel to the axis (Figure 2).
But since ABCD has the same area as the circumscribing parallelogram ABcd with
sides Ad and Bc parallel to the axis, the functions (1) also have the 2/3-property.
http://dx.doi.org/10.4169/amer.math.monthly.123.5.497
MSC: Primary 26A06, Secondary 51N20
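The 2/3-property of the family (1) can be checked numerically. The following is a minimal sketch, assuming that (1) reads $f(x) = \sqrt{ax + d} + bx + c$ and that the circumscribing parallelogram has two sides parallel to the y-axis; the function name `area_ratio` and its parameters are illustrative and not taken from the note.

```python
from math import sqrt, isclose

def area_ratio(a, b, c, d, x0, x1):
    """Return A/P for f(x) = sqrt(a*x + d) + b*x + c over the chord on [x0, x1].

    A is the area between the chord and the graph; P is the area of the
    circumscribing parallelogram whose other two sides are vertical.
    Assumes a != 0 and a*x + d > 0 on [x0, x1].
    """
    f = lambda x: sqrt(a * x + d) + b * x + c
    # Antiderivative of f, used for the exact area under the graph.
    F = lambda x: (2.0 / (3.0 * a)) * (a * x + d) ** 1.5 + b * x * x / 2.0 + c * x
    width = x1 - x0
    slope = (f(x1) - f(x0)) / width
    # Area between chord and graph: |integral of f - trapezoid under the chord|.
    segment = abs((F(x1) - F(x0)) - width * (f(x0) + f(x1)) / 2.0)
    # Point where the tangent is parallel to the chord:
    # a / (2*sqrt(a*xc + d)) + b = slope.
    root = a / (2.0 * (slope - b))
    xc = (root * root - d) / a
    # Length of a vertical side = vertical gap between chord and graph at xc.
    gap = abs(f(xc) - (f(x0) + slope * (xc - x0)))
    return segment / (width * gap)

if __name__ == "__main__":
    print(area_ratio(2.0, 0.5, -1.0, 3.0, 0.0, 4.0))
    print(area_ratio(-1.0, 2.0, 0.0, 10.0, 1.0, 5.0))
```

Both sample calls should print a value that agrees with 2/3 up to rounding, whatever chord is chosen.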
In this note we describe how we came upon this family of functions and sketch a
proof of the following theorem.
Theorem. Let I be an open interval and $f : I \to \mathbb{R}$ a four times differentiable function with nonvanishing second derivative.
Then f has the 2/3-property if and only if
Figure 3. Translating A to the origin
We assume f is three times differentiable and that $f'' > 0$. Then chord A'B' lies
above the graph of y = g(x) and the area of S is
\[ A(a, h) = \frac{1}{2} h g(h) - \int_0^h g(x)\,dx. \]
Substituting the Taylor expansion
\[ g(h) = \frac{f''(a)}{2} h^2 + \frac{f^{(3)}(a)}{6} h^3 + o(h^3) \tag{2} \]
gives
\[ A(a, h) = \frac{f''(a)}{12} h^3 + \frac{f^{(3)}(a)}{24} h^4 + o(h^4). \tag{3} \]
Approximating the function P(a, h) for the area of the circumscribing parallelogram A'B'C'D' is more delicate. The difficulty lies in getting a handle on the position of the point of tangency of side C'D' and the curve y = g(x). The mean value
theorem and our assumption that $f'' > 0$ guarantee a unique such point (c, g(c)), with
$c \in (0, h)$ and
\[ g'(c) = g(h)/h. \tag{4} \]
Let
\[ F(h) = \begin{cases} g(h)/h, & h \neq 0, \\ g'(0), & h = 0. \end{cases} \]
By (2),
\[ F(h) = \frac{f''(a)}{2} h + \frac{f^{(3)}(a)}{6} h^2 + o(h^2). \tag{5} \]
We now follow [1] and approximate the function c(h). Since $g''(c) \neq 0$, c(h) is
differentiable in a neighborhood of h = 0. Differentiating both sides of the equation
$g'(c(h)) = F(h)$ with respect to h gives
\[ c'(h) = F'(h)/g''(c(h)). \tag{6} \]
In particular,
\[ c'(0) = 1/2. \]
This is not surprising. Since $g''(0) \neq 0$, g(x) behaves like a quadratic near x = 0, and
c ultimately lies at the midpoint of the interval [0, h]; as $h \to 0$, $c/h \to 1/2$.
A similar calculation using (5) and (6) shows that
\[ c''(0) = \frac{f^{(3)}(a)}{12 f''(a)}, \]
and hence
\[ c(h) = \frac{h}{2} + \frac{f^{(3)}(a)}{24 f''(a)} h^2 + o(h^2). \tag{7} \]
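As a sanity check on (7), here is a minimal sketch for the sample function f = exp, assuming $g(x) = f(a + x) - f(a) - f'(a)x$ as suggested by the expansion (2). In that case (4) can be solved in closed form, $c = \ln((e^h - 1)/h)$, and since $f^{(3)}/f'' = 1$ the approximation (7) reads $h/2 + h^2/24$. The function names are illustrative.

```python
from math import log, expm1

def tangency_point(h):
    """Exact solution c of g'(c) = g(h)/h when f = exp (independent of a)."""
    return log(expm1(h) / h)

def approximation_7(h):
    """Right-hand side of (7) for f = exp, where f'''(a)/f''(a) = 1."""
    return h / 2.0 + h * h / 24.0

if __name__ == "__main__":
    for h in (0.2, 0.1, 0.05):
        c = tangency_point(h)
        print(h, c, approximation_7(h), c - approximation_7(h))
```

The printed differences shrink much faster than $h^2$, as the $o(h^2)$ remainder in (7) requires.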
Using (2) first and then (7) gives the width A'D' of the circumscribing parallelogram A'B'C'D' in Figure 3 as
\[ H(a, h) = \frac{f''(a)}{2} c^2 + \frac{f^{(3)}(a)}{3} c^3 + o(c^3) = \frac{f''(a)}{8} h^2 + \frac{f^{(3)}(a)}{16} h^3 + o(h^3), \]
and so
\[ P(a, h) = h H(a, h) = \frac{f''(a)}{8} h^3 + \frac{f^{(3)}(a)}{16} h^4 + o(h^4). \tag{8} \]
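Comparing (3) and (8) shows that $A(a, h)/P(a, h) \to 2/3$ as $h \to 0$. Carrying the expansions one order further, for a four times differentiable f, gives
\[ c(h) = \frac{h}{2} + \frac{f^{(3)}(a)}{24 f''(a)} h^2 + \frac{f''(a) f^{(4)}(a) - (f^{(3)}(a))^2}{48 (f''(a))^2} h^3 + o(h^3), \]
\[ A(a, h) = \frac{f''(a)}{12} h^3 + \frac{f^{(3)}(a)}{24} h^4 + \frac{f^{(4)}(a)}{80} h^5 + o(h^5), \]
and
\[ P(a, h) = \frac{f''(a)}{8} h^3 + \frac{f^{(3)}(a)}{16} h^4 + \frac{21 f''(a) f^{(4)}(a) + (f^{(3)}(a))^2}{1152 f''(a)} h^5 + o(h^5). \]
For f to have the 2/3-property, the coefficients of $h^5$ must satisfy $A = \frac{2}{3} P$, which gives
\[ \left(f^{(3)}(a)\right)^2 = \frac{3}{5} f''(a) f^{(4)}(a). \tag{9} \]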
While (9) is a necessary condition for f to have the 2/3-property, it would certainly be
surprising if it were sufficient as well. But in fact it is, for the condition leads directly
to quadratics and the family of functions (1). To see how, let $Y = f''(x)$, and rewrite
(9) as
\[ (Y')^2 = \frac{3}{5} Y Y''. \]
If $Y' \equiv 0$, then f is a quadratic. Otherwise, integrating
\[ \frac{Y''}{Y'} = \frac{5}{3} \frac{Y'}{Y} \]
gives
\[ Y' = C_1 Y^{5/3}, \]
for some constant $C_1 \neq 0$. Then
\[ \frac{d^2 y}{dx^2} = Y = (C_1 x + C_2)^{-3/2}, \]
for some constant $C_2$. Integrating twice gives the family (1).
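The last step can be checked numerically. The sketch below assumes the reading $Y = (C_1 x + C_2)^{-3/2}$ and verifies that it satisfies $(Y')^2 = \frac{3}{5} Y Y''$. It also checks, by a central second difference, that $-\frac{4}{C_1^2}\sqrt{C_1 x + C_2}$ is one antiderivative of order two; that particular antiderivative (a square root up to a constant factor, with the linear term omitted) is chosen here for illustration only.

```python
from math import sqrt, isclose

def Y(x, c1, c2):
    return (c1 * x + c2) ** -1.5

def dY(x, c1, c2):
    return -1.5 * c1 * (c1 * x + c2) ** -2.5

def d2Y(x, c1, c2):
    return 3.75 * c1 * c1 * (c1 * x + c2) ** -3.5

def candidate(x, c1, c2):
    """A function whose second derivative should equal Y (illustrative choice)."""
    return -(4.0 / (c1 * c1)) * sqrt(c1 * x + c2)

def second_difference(func, x, step=1e-3):
    """Central finite-difference estimate of the second derivative of func at x."""
    return (func(x + step) - 2.0 * func(x) + func(x - step)) / (step * step)

if __name__ == "__main__":
    x, c1, c2 = 1.0, 2.0, 3.0
    print(dY(x, c1, c2) ** 2, 0.6 * Y(x, c1, c2) * d2Y(x, c1, c2))
    print(second_difference(lambda t: candidate(t, c1, c2), x), Y(x, c1, c2))
```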
ACKNOWLEDGMENT. The authors wish to thank the referee for several helpful suggestions.
REFERENCE
1. A. Benyi, P. Szeptycki, F. Van Vleck, Archimedean properties and parabolas, Amer. Math. Monthly 107
(2000) 945-949.
Mathematics Department, North Seattle College, Seattle, WA 98103
mgaul@northseattle.edu
Mathematics Department, Shoreline Community College, Shoreline, WA 98133
fkuczmar@shoreline.edu
Abstract. Using winding numbers, we give an extremely short proof that every continuous
field of tangent vectors on $S^2$ must vanish somewhere.
$\langle q, p \rangle = s\}$, oriented so that p is the positive normal. These curves are all regularly
homotopic and so have the same rotation number with respect to v, say n.
Now notice that for s = 0, $C_{p,s}$ and $C_{-p,s}$ parametrize the same great circle but
with opposite orientations. Thus, $n = -n$ and hence $n = 0$. On the other hand, for s
close to 1, the rotation number of $C_{p,s}$ is close to the rotation number of a circle in
the plane because v is close to v(p) on $C_{p,s}$ by continuity. Thus, $n \in \{-1, 1\}$. This is a
contradiction.
REFERENCES
1. W. Chinn, N. Steenrod, First Concepts of Topology. Mathematical Association of America, Washington,
DC, 1966.
2. M. Eisenberg, R. Guy, A proof of the Hairy Ball theorem, Amer. Math. Monthly 86 (1979) 571–574.
Department of Mathematics, Brown University, Providence RI 02912
Peter Mcgrath@brown.edu
PROBLEMS
11908. Proposed by George E. Andrews, The Pennsylvania State University, University Park, PA, and Emeric Deutsch, Polytechnic Institute of New York University,
Brooklyn, NY. Let n and k be nonnegative integers. Show that the number of partitions
of n having k even parts is the same as the number of partitions of n in which the largest
repeated part is k (defined to be 0 if the parts are all distinct). For example, 7 has three
partitions with two even parts (4 + 2 + 1 = 3 + 2 + 2 = 2 + 2 + 1 + 1 + 1) and also
three partitions in which the largest repeated part is 2: (3 + 2 + 2 = 2 + 2 + 2 + 1
= 2 + 2 + 1 + 1 + 1).
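The claim in Problem 11908 can be explored by brute force for small n. The following is a minimal sketch, not a proof: it enumerates all partitions of n and tabulates them once by number of even parts and once by largest repeated part. The helper names are illustrative.

```python
from collections import Counter

def partitions(n, max_part=None):
    """Yield every partition of n as a non-increasing tuple of positive parts."""
    if max_part is None:
        max_part = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, max_part), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest

def even_part_count(p):
    """Number of even parts, counted with multiplicity."""
    return sum(1 for part in p if part % 2 == 0)

def largest_repeated_part(p):
    """Largest part occurring at least twice, or 0 if all parts are distinct."""
    repeated = [part for part, mult in Counter(p).items() if mult >= 2]
    return max(repeated, default=0)

def distributions(n):
    """Return the two tallies (by even-part count, by largest repeated part)."""
    by_even = Counter(even_part_count(p) for p in partitions(n))
    by_repeat = Counter(largest_repeated_part(p) for p in partitions(n))
    return by_even, by_repeat

if __name__ == "__main__":
    by_even, by_repeat = distributions(7)
    print(sorted(by_even.items()))
    print(sorted(by_repeat.items()))
```

For n = 7 both tallies give the value 3 at k = 2, matching the example in the problem statement.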
11909. Proposed by Hideyuki Ohtsuka, Saitama, Japan. Prove that for every positive
integer m there exists a polynomial $P_m$ in two variables, with integer coefficients, such
that for all integers n and r with $0 \le r \le n$,
\[ \sum_{k=-r}^{r} \binom{n}{r+k} \binom{n}{r-k} k^{2m} = \frac{P_m(n, r)}{\prod_{j=1}^{m} (2n - 2j + 1)} \binom{2n}{2r}. \]
11910. Proposed by Cornel Ioan Valean, Teremia Mare, Romania. Let G k be the reciprocal of the kth Fibonacci number; for example, G 4 = 1/3 and G 5 = 1/5. Find
n=1
11912. Proposed by Pal Peter Dalyay, Szeged, Hungary. Let R and r be the radii of the
circumcircle and incircle of triangle ABC, respectively. Let $r_A$, $r_B$, and $r_C$ be the radii
of the A-, B-, and C-mixtilinear incircles of ABC, respectively. Prove that
$4r \le r_A + r_B + r_C \le \frac{1}{4}(5R + 6r)$. (For the
definition of a mixtilinear incircle see problem 11774; that problem and its solution
are found on the next page of this issue.)
11913. Proposed by George Stoica, Saint John, New Brunswick, Canada. Let be a
positive constant, and let f map (0, ) to R+ . Given limx x 1/ f (x) = , prove
f (x)
lim inf 1+ = 0.
x
f (x)
11914. Proposed by Robin Chapman, Mathematics Research Institute, University of
Exeter, Exeter, (U. K.), and Roberto Tauraso, Università di Roma Tor Vergata, Rome,
Italy. Show that for all positive integers m and n,
n
3m
mk
k n k
j n + 1 2k
(4)
(2)
= 0.
k 1 j=1
j 1
3m j
k=1
(Here $\binom{x}{k} = \frac{1}{k!} \prod_{i=0}^{k-1} (x - i)$ for $x \in \mathbb{R}$.)
SOLUTIONS
Compositions Having At Least One 1
11767 [2014, 267]. Proposed by Mircea Merca, University of Craiova, Craiova,
Romania. Prove that
\[ \sum \frac{(1 + t_1 + t_2 + \cdots + t_n)!}{(1 + t_1)!\, t_2! \cdots t_n!} = 2^n - F_n, \]
where the sum is over all nonnegative integer solutions to $t_1 + 2t_2 + \cdots + nt_n = n$ and
$F_k$ is the kth Fibonacci number.
Solution I by CMC 328, Carleton College, Northfield, MN. View the sum as over all
partitions of n + 1 having at least one 1, treating $t_1 + 1$ as the number of copies of 1
and $t_j$ as the number of copies of j for $2 \le j \le n$. The summand counts the ways to
permute the parts, so the sum is the number of compositions of n + 1 having at least
one 1.
The number of compositions of n + 1 is $2^n$, so it suffices to prove that the number
$a_n$ of compositions of n + 1 with no 1 is $F_n$. This is clear for n = 0 and n = 1. When
$n \ge 2$, these compositions have last part 2 or greater than 2. Deleting the last part
shows that there are $a_{n-2}$ of the first type, and subtracting 1 from the last part shows
that there are $a_{n-1}$ of the second type. By induction, $a_n = a_{n-1} + a_{n-2} = F_n$.
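The identity of Problem 11767 is easy to confirm for small n by direct enumeration. The sketch below is a check rather than a proof; the names `solutions`, `merca_sum`, and `fib` are illustrative, and the Fibonacci numbers are indexed with $F_0 = 0$, $F_1 = 1$ as in Solution I.

```python
from math import factorial

def fib(k):
    """Return the kth Fibonacci number with F_0 = 0 and F_1 = 1."""
    a, b = 0, 1
    for _ in range(k):
        a, b = b, a + b
    return a

def solutions(n):
    """Yield all tuples (t_1, ..., t_n) of nonnegative integers with sum of j*t_j = n."""
    def rec(j, remaining):
        # Yields (t_1, ..., t_j) with t_1 + 2 t_2 + ... + j t_j = remaining.
        if j == 0:
            if remaining == 0:
                yield ()
            return
        for t in range(remaining // j + 1):
            for rest in rec(j - 1, remaining - j * t):
                yield rest + (t,)
    yield from rec(n, n)

def merca_sum(n):
    """Left-hand side of the identity: sum of (1 + t_1 + ... + t_n)! / ((1 + t_1)! t_2! ... t_n!)."""
    total = 0
    for t in solutions(n):
        t1 = t[0] if t else 0
        denominator = factorial(1 + t1)
        for tj in t[1:]:
            denominator *= factorial(tj)
        total += factorial(1 + sum(t)) // denominator
    return total

if __name__ == "__main__":
    for n in range(8):
        print(n, merca_sum(n), 2 ** n - fib(n))
```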
Solution II by Borislav Karaivanov, Lexington, SC. Rewrite the sum as
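\[ \sum \frac{(t_1 + t_2 + \cdots + t_n)!}{t_1!\, t_2! \cdots t_n!}, \]
summed over all integer solutions to $t_1 + 2t_2 + \cdots + nt_n = n + 1$ with $t_1 \ge 1$ and
$t_i \ge 0$ for $i \ge 2$. This sum is the coefficient of $x^{n+1}$ in the series
\begin{align*}
f(x) &= \sum_{m=0}^{\infty} (x + x^2 + \cdots)^m - \sum_{m=0}^{\infty} (x^2 + x^3 + \cdots)^m
= \sum_{m=0}^{\infty} \left(\frac{x}{1-x}\right)^m - \sum_{m=0}^{\infty} \left(\frac{x^2}{1-x}\right)^m \\
&= \frac{1-x}{1-2x} - \frac{1-x}{1-x-x^2}
= \frac{x(1-x)^2}{(1-2x)(1-x-x^2)}
= \frac{x}{1-2x} - \frac{x^2}{1-x-x^2}.
\end{align*}
The coefficient subtracted in the second term is the number of 1, 2-lists with sum n − 1,
well known to be $F_n$, so the answer is $2^n - F_n$.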
Also solved by R. Bagby, D. Beckwith, R. Chapman (U. K.), M. Hoffman, Y. J. Ionin, O. P. Lossers
(Netherlands), R. Martin (Germany), R. Molinari, M. Omarjee (France), N. C. Singer, J. H. Smith, R. Stong,
R. Tauraso (Italy), T. Viteam (South Africa), T. Woodcock, GCHQ Problem Solving Group (U. K.), TCDmath
Problem Group (Ireland), and the proposer.
Mixtilinear Incircles
11774 [2015, 366]. Proposed by Yunus Tuncbilek, Ataturk High School of Science,
Istanbul, Turkey and Danny Lee, Herkimer Senior High School, New York, NY. Consider
the circumscribed circle of triangle ABC. The A-mixtilinear incircle of ABC
is the circle that is internally tangent to the circumscribed circle, AB, and AC, and similarly for B and
C. Let A', $P_B$, and $P_C$ be the points on the circumscribed circle, AB, and AC, respectively, at which the A-mixtilinear incircle touches. Define B' and C' in the same manner that A' was defined.
(See figure.)
since they subtend the same arcs on the circumscribed circle. Thus we have similar triangles $CXA \sim CBY$
and $CXB \sim CAY$. Hence
\[ \frac{AX}{YB} = \frac{AC}{YC} \quad\text{and}\quad \frac{AY}{XB} = \frac{YC}{BC}. \]
The product of these is the claimed formula. For the converse, both the isogonality of
CX and CY and the ratio AX/XB uniquely determine a point X on side AB.
Let I map the extended plane by inverting through the circle with center C and
radius $\sqrt{ab}$ and then reflecting across the bisector of angle ACB. Note that I swaps C
with the point at infinity and swaps A with B. Hence it swaps line CA with CB and
swaps line AB with the circumcircle of ABC. It also swaps the C-mixtilinear incircle
with the C-excircle. Thus I swaps the tangency point C' of the C-mixtilinear incircle
with the circumcircle and the tangency point, call it Q, of the C-excircle with AB. It follows that the
rays CC' and CQ are isogonal in angle ACB. Thus by the lemma above
\[ \frac{|BC'|}{|AC'|} = \frac{|AQ|}{|QB|} \cdot \frac{|BC|}{|AC|} = \frac{a(s-b)}{b(s-a)}, \]
where we have used the well-known formulas |AQ| = s − b and |QB| = s − a.
Furthermore, I swaps the tangency point, call it D, of the C-mixtilinear incircle
with CA with the tangency point of the C-excircle with CB. This last tangency point is
well known to be at distance s from C. It follows that $|CD| \cdot s = ab$. Hence $|CD| = ab/s$
and $|DA| = b - |CD| = b(s-a)/s$. Thus $|DA|/|CD| = (s-a)/a$.
Denote the points where the A- and B-mixtilinear incircles are tangent with AB by
$P_B$ and E, respectively. Analogs of the result of the previous paragraph yield
\[ \frac{|BP_B|}{|P_B A|} \cdot \frac{|BE|}{|EA|} = \frac{s-b}{b} \cdot \frac{a}{s-a} = \frac{|BC'|}{|AC'|}. \]
Now consider the homothety with center B' that takes the B-mixtilinear incircle to
the circumcircle. This map takes line AB, which is tangent to the B-mixtilinear incircle, to a parallel
tangent to the circumcircle. Hence its image is the tangent to the circumcircle at the midpoint of arc AB. Since this
tangency point is the image of E under the homothety, it follows that B'E contains
the midpoint of arc AB or, equivalently, that B'E bisects angle AB'B. The angle bisector
theorem now yields $|BB'|/|B'A| = |BE|/|EA|$, and this gives
\[ \frac{|BP_B|}{|P_B A|} \cdot \frac{|BB'|}{|B'A|} = \frac{|BC'|}{|AC'|}. \]
By the lemma above (applied to triangle ABC'), the rays $C'P_B$ and $C'B'$ are isogonal
in angle AC'B, and hence $\angle B'C'A = \angle BC'P_B$. Angles $\angle C'BP_B = \angle C'BA$ and $\angle C'B'A$
are also congruent since they subtend the same arc of the circumcircle. Hence, we see that triangles
$C'P_B B$ and $C'AB'$ are similar. Analogously, triangles $CP_C B'$ and $C'AB'$ are similar.
Hence triangles $C'P_B B$ and $CP_C B'$ are similar.
Also solved by C. Delorme (France), C. R. Pranesachar (India), R. Stong, H. Widmer (Switzerland), GCHQ
Problem Solving Group (U. K.), and the proposers.
A Partition Inequality
11775 [2014, 455]. Proposed by Isaac Sofair, Fredericksburg, VA. Let $A_1, \ldots, A_k$ be
finite sets. For $J \subseteq \{1, \ldots, k\}$, let $N_J = \left|\bigcap_{j \in J} A_j\right|$, and let $S_m = \sum_{J : |J| = m} N_J$.
Ti .
Sm =
m
m
i=1
It suffices to show that the inverse of the k k matrix A with (m, i)-entry mk
m
is the k k matrix B with (s, m)-entry (1)k+m+s+1 ks
(interpreting nj as
ki
m
0 when j < 0 or j > n). To see this, we compute the (s, i)-entry of BA:
k
m
k
k i
(1)k+m+s+1
ks
m
m
m=1
k
k
m
k
m
k i
k+m+s+1
k+m+s+1
=
(1)
(1)
ks m
ks
m
m=0
m=0
k
k
k
s
k i
s i
k+m+s+1
k+m+s
=
(1)
(1)
+
.
k s m=0
m k +s
k s m=0
m k +s
Since the alternating sum of a row of Pascal's triangle (other than the first) vanishes,
the first sum in the last expression vanishes, as does the second except when s = i, in
which case it is 1. Thus BA is the identity matrix.
(b) For the desired value Um , we compute
k
k
k
i
k+i+ j+1
Si
Tj =
(1)
Um =
k j
j=m
j=m i=1
=
k
k
(1)
k+i+ j+1
i=1 j=m
k
i
k+i+m+1 i 1
(1)
Si =
Si ,
k j
km
i=1
j=m (1)
i
k j
= (1)m
i1
, which is proved
km
Also solved by D. Beckwith, B. S. Burdick, R. Chapman (U. K.), Y. J. Ionin, B. Karaivanov, O. Kouba
(Syria), J. H. Lindsey II, O. P. Lossers (Netherlands), Y. Shim (Korea), J. C. Smith, R. Stong, R. Tauraso
(Italy), TCDmath Problem Group (Ireland), and the proposer.
A Line of Urns
11776 [2014, 455]. Proposed by David Beckwith, Sag Harbor, NY. Given urns
$U_1, U_2, \ldots, U_n$ in a line, and plenty of identical blue and identical red balls, let $a_n$
be the number of ways to put balls into the urns subject to the conditions that
(i) each urn contains at most one ball,
(ii) any urn containing a red ball is next to exactly one urn containing a blue ball,
and
(iii) no two urns containing a blue ball are adjacent.
(a) Show that
\[ \sum_{n=0}^{\infty} a_n t^n = \frac{1 + t + 2t^2}{1 - t - t^2 - 3t^3}. \]
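(b) Show that
\[ a_n = \sum_{j \ge 0} \sum_{m \ge 0} 4^j \left[ \binom{n-2m}{j} \binom{m}{j} + \binom{n-2m-1}{j} \binom{m}{j} + 2 \binom{n-2m}{j} \binom{m-1}{j} \right]. \]
Here $\binom{k}{l} = 0$ if $k < l$.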
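Both parts of Problem 11776 can be compared against a brute-force count for small n. The sketch below is a check under stated assumptions: the recurrence $a_n = a_{n-1} + a_{n-2} + 3a_{n-3}$ with $a_0, a_1, a_2 = 1, 2, 5$ is read off from the generating function in (a), and the double sum is the formula of (b) as reconstructed here from the problem statement. All function names are illustrative.

```python
from itertools import product
from math import comb

def is_valid(urns):
    """Check conditions (ii) and (iii) for a tuple over 'E' (empty), 'B', 'R'."""
    n = len(urns)
    for i, ball in enumerate(urns):
        neighbors = [urns[j] for j in (i - 1, i + 1) if 0 <= j < n]
        if ball == "R" and neighbors.count("B") != 1:
            return False
        if ball == "B" and i + 1 < n and urns[i + 1] == "B":
            return False
    return True

def brute_force(n):
    """Count valid fillings of n urns by exhaustive enumeration."""
    return sum(1 for urns in product("EBR", repeat=n) if is_valid(urns))

def from_recurrence(n):
    """a_n from the recurrence implied by the generating function in (a)."""
    a = [1, 2, 5]
    while len(a) <= n:
        a.append(a[-1] + a[-2] + 3 * a[-3])
    return a[n]

def binom(k, l):
    """Binomial coefficient with the convention that it is 0 when k < l."""
    return comb(k, l) if 0 <= l <= k else 0

def from_double_sum(n):
    """a_n from the double sum in (b)."""
    total = 0
    for j in range(n + 1):
        for m in range(n // 2 + 1):
            total += 4 ** j * (
                binom(n - 2 * m, j) * binom(m, j)
                + binom(n - 2 * m - 1, j) * binom(m, j)
                + 2 * binom(n - 2 * m, j) * binom(m - 1, j)
            )
    return total

if __name__ == "__main__":
    for n in range(8):
        print(n, brute_force(n), from_recurrence(n), from_double_sum(n))
```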
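\begin{align*}
(1 - t - t^2 - 3t^3) \sum_{n=0}^{\infty} a_n t^n
&= \sum_{n=0}^{\infty} a_n t^n - \sum_{n=1}^{\infty} a_{n-1} t^n - \sum_{n=2}^{\infty} a_{n-2} t^n - 3 \sum_{n=3}^{\infty} a_{n-3} t^n \\
&= 1 + t + 2t^2 + \sum_{n=3}^{\infty} (a_n - a_{n-1} - a_{n-2} - 3a_{n-3}) t^n.
\end{align*}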
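(b) Use the identity $\sum_{m \ge 0} \binom{m}{k} t^m = t^k/(1-t)^{k+1}$ to obtain
\begin{align*}
\sum_{n \ge 0} \sum_{m \ge 0} \binom{n-2m}{j} \binom{m}{j} t^n
&= \sum_{m \ge 0} \binom{m}{j} t^{2m} \sum_{n \ge 0} \binom{n-2m}{j} t^{n-2m} \\
&= \frac{t^{2j}}{(1-t^2)^{j+1}} \cdot \frac{t^j}{(1-t)^{j+1}}
= \frac{1}{(1-t)(1-t^2)} \left( \frac{t^3}{(1-t)(1-t^2)} \right)^j.
\end{align*}
It follows that
\begin{align*}
\sum_{n \ge 0} \sum_{j \ge 0} \sum_{m \ge 0} 4^j \binom{n-2m}{j} \binom{m}{j} t^n
&= \sum_{j \ge 0} 4^j \sum_{n \ge 0} \sum_{m \ge 0} \binom{n-2m}{j} \binom{m}{j} t^n \\
&= \frac{1}{(1-t)(1-t^2)} \sum_{j \ge 0} \left( \frac{4t^3}{(1-t)(1-t^2)} \right)^j \\
&= \frac{1}{(1-t)(1-t^2)} \cdot \frac{1}{1 - \frac{4t^3}{(1-t)(1-t^2)}}
= \frac{1}{1 - t - t^2 - 3t^3}.
\end{align*}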
The Beast
11777 [2014, 456]. Proposed by Marian Dinca, Bucharest, Romania. Let $x_1, \ldots, x_n$
be real numbers such that $\prod_{k=1}^{n} x_k = 1$. Prove that
\[ \sum_{k=1}^{n} \frac{x_k^2}{x_k^2 - 2x_k \cos(2\pi/n) + 1} \ge 1. \]
y
+
y
0.49n,
for all other n.
k+1
k+2
k=1
(Reference: V. G. Drinfeld, A cyclic inequality, Math. Notes. Acad. Sci. USSR 9
(1971) 68–71. H. S. Shapiro, Monthly Problem 4603, 61 (1954) 571.
http://mathworld.wolfram.com/ShapiroCyclicSumConstant.html.)
Lemma. Fix $n \in \mathbb{N}$ with $n \ge 4$. If $x_1, \ldots, x_n$ are positive real numbers with product 1,
then
\[ \sum_{k=1}^{n} \left( \frac{x_k}{x_k + 1} \right)^2 \ge 1. \]
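Proof. Let $y_k = \prod_{j=k}^{n} x_j$ for $1 \le k \le n$, with $y_{n+1} = y_1$ and $y_{n+2} = y_2$. Note that
$y_k > 0$ and $x_k = y_k/y_{k+1}$ for $1 \le k \le n$. Also,
\begin{align*}
\sum_{k=1}^{n} \frac{x_k}{x_k + 1}
&= \sum_{k=1}^{n} \frac{y_k}{y_k + y_{k+1}}
= \sum_{k=1}^{n} \frac{y_{k+1}}{y_{k+1} + y_{k+2}} \\
&= \sum_{k=1}^{n} \frac{y_{k+1} - y_k}{y_{k+1} + y_{k+2}} + \sum_{k=1}^{n} \frac{y_k}{y_{k+1} + y_{k+2}}
\ge \sum_{k=1}^{n} \frac{y_k}{y_{k+1} + y_{k+2}}. \tag{1}
\end{align*}
Note that since $y_k > 0$, setting $t = \max_{1 \le k \le n} \{y_{k+1} + y_{k+2}\} > 0$ yields
\[ \sum_{k=1}^{n} \frac{y_{k+1} - y_k}{y_{k+1} + y_{k+2}} \ge \frac{1}{t} \sum_{k=1}^{n} (y_{k+1} - y_k) = \frac{1}{t} (y_{n+1} - y_1) = 0. \]
Thus, omitting this sum leads to the stated inequality in (1). Using the quadratic mean-arithmetic
mean inequality, we obtain
\[ \sum_{k=1}^{n} \left( \frac{x_k}{x_k + 1} \right)^2 \ge \frac{1}{n} \left( \sum_{k=1}^{n} \frac{x_k}{x_k + 1} \right)^2 \ge \frac{1}{n} \left( \sum_{k=1}^{n} \frac{y_k}{y_{k+1} + y_{k+2}} \right)^2. \]
The result now follows by applying the Shapiro inequality.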
We now return to the original problem and prove the case n = 3, for which the
lemma is not needed. Let x, y, z be real numbers such that xyz = 1. With z = 1/(xy),
the case n = 3 becomes
\[ \frac{x^2}{x^2 + x + 1} + \frac{y^2}{y^2 + y + 1} + \frac{1}{x^2 y^2 + xy + 1} \ge 1, \]
which is equivalent to
\[ \frac{\frac{1}{4}(2x^2 y^2 - x - y)^2 + \frac{3}{4}(x - y)^2}{(x^2 + x + 1)(y^2 + y + 1)(x^2 y^2 + xy + 1)} \ge 0. \]
For each factor in the denominator, we have $t^2 + t + 1 = (t + \frac{1}{2})^2 + \frac{3}{4} > 0$. The
desired inequality follows. This completes the case n = 3.
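The algebraic equivalence used in the case n = 3 can be spot-checked in exact rational arithmetic. The sketch below compares the left side minus 1 with the quotient displayed above on a grid of nonzero rationals; it assumes the reading of the numerator as $\frac{1}{4}(2x^2y^2 - x - y)^2 + \frac{3}{4}(x - y)^2$, and the function names are illustrative.

```python
from fractions import Fraction

def lhs_minus_one(x, y):
    """x^2/(x^2+x+1) + y^2/(y^2+y+1) + z^2/(z^2+z+1) - 1 with z = 1/(xy).

    Expects nonzero Fraction arguments so that the arithmetic is exact.
    """
    z = 1 / (x * y)
    term = lambda t: t * t / (t * t + t + 1)
    return term(x) + term(y) + term(z) - 1

def closed_form(x, y):
    """The quotient of a sum of squares by a positive denominator."""
    numerator = Fraction(1, 4) * (2 * x * x * y * y - x - y) ** 2 + Fraction(3, 4) * (x - y) ** 2
    denominator = (x * x + x + 1) * (y * y + y + 1) * (x * x * y * y + x * y + 1)
    return numerator / denominator

if __name__ == "__main__":
    x, y = Fraction(2), Fraction(3)
    print(lhs_minus_one(x, y), closed_form(x, y))
```

Agreement on many rational points is evidence for the identity, not a proof; expanding both sides symbolically would settle it.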
Now we consider the case $n \ge 4$. Let $x_1, x_2, \ldots, x_n$ be real numbers with product
1. For $1 \le k \le n$,
\[ 0 < 1 - \cos^2 \frac{2\pi}{n} \le x_k^2 - 2x_k \cos \frac{2\pi}{n} + 1 \le x_k^2 + 2|x_k| + 1 = (|x_k| + 1)^2. \]
Applying the lemma to $|x_1|, \ldots, |x_n|$, we obtain the required inequality
\[ \sum_{k=1}^{n} \frac{x_k^2}{x_k^2 - 2x_k \cos(2\pi/n) + 1} \ge \sum_{k=1}^{n} \left( \frac{|x_k|}{|x_k| + 1} \right)^2 \ge 1. \]
Also solved by M. Aassila (France), P. P. Dalyay (Hungary), D. Fleischman, Y. J. Ionin, O. P. Lossers
(Netherlands), P. Perfetti (Italy), R. E. Prather, J. C. Smith, N. Stanciu (Romania), A. Stenger, R. Stong, R.
Tauraso (Italy), Z. Voros (Hungary), GCHQ Problem Solving Group (U. K.), and the proposer.
REVIEWS
Edited by Jeffrey Nunemacher
Mathematics and Computer Science, Ohio Wesleyan University, Delaware, OH 43015
Scientist, Scholar & Scoundrel: A Bibliographical Investigation of the Life and Exploits of
Count Guglielmo Libri/Mathematician, Journalist, Patron, Historian of Science, Paleographer,
Book Collector, Bibliographer, Antiquarian Bookseller, Forger and Book Thief. By Jeremy M.
Norman. The Grolier Club, New York, 2013. xii+ 176 pp., ISBN 078-1-60583-941-4, $35.
the newly unified country. Before departing England, he shipped the remainder of his
collection to Italy.
It is tempting to try to give Libri the benefit of the doubt and assume that he was
driven by a passion for books, but there is plenty of evidence that he was complicit in
numerous attempts to alter bindings and identifying marks to make it hard for prosecutors to trace the provenance of items in his possession. He did produce some useful
work in bibliography and pioneered in developing a taste for fine bindings and such.
But in the end, one can only conclude that he was a crook: a fascinating crook, but a
crook nonetheless. But he skillfully succeeded in maintaining his good reputation, at
least outside France. Nevertheless, it is for the crimes and his manner of evading arrest
and protecting his reputation that Libri is remembered, rather than for his mathematics.
Norman provides ample evidence that Libri's errant behavior was present throughout his career. In Pisa when he no longer wished to fulfill his teaching duties there, he
got a bogus health exemption from a doctor friend so he no longer had to teach, but
he retained his professional title and his salary for the rest of his life.
The quality of Norman's scholarship is evident in every section of his book. And his
prose is direct and elegant. It is, however, a catalogue of an exhibit, so it has more bibliographic detail than some readers (mainly professional mathematicians) will care
much about. But with a subject like Libri, the narrative is not only scholarly and informative but also entertaining.
REFERENCE
1. G. L. Alexanderson, Sophie Germain and a problem in number theory, Bull. Amer. Math. Soc. 49 (2012) 327–331.
Santa Clara University, Santa Clara, CA 95053-0290
galexanderson@scu.edu
Professor Alexander Ramm from Kansas State University has submitted to this
MONTHLY the following statement.
Upon reflection and consultation with the Editor of the MONTHLY, I submit that my
paper [6] is a duplicate publication of my earlier paper [7]. I apologize deeply to the
MONTHLY and its readers and retract paper [6] from the MONTHLY.
Now, for every $n \in \mathbb{N}$, take any bijection of $C_n$ onto $\mathbb{R}$, and
define $f : \mathbb{R} \to \mathbb{R}$ by letting f(x) be the image of x under this bijection if $x \in C_n$ (and f(x) = 0 otherwise). This f is everywhere
surjective (and also zero almost everywhere!). Indeed, let I be any interval in $\mathbb{R}$.
There exists $k \in \mathbb{N}$ such that $I_k \subset I$. Thus $f(I) \supset f(I_k) \supset f(C_k) = \mathbb{R}$. This type of
function is also treated in the monograph [1], in which
everywhere surjective functions enjoying (simultaneously) several other pathologies are introduced. There are not only many such functions, but in fact there exists
a vector space V with $\dim(V) = 2^{\mathfrak{c}} = \dim(\mathbb{R}^{\mathbb{R}})$, every nonzero element of which is
everywhere surjective (see [1, 2, 3, 8]).
We received the following from Mike Slattery, concerning his recent MONTHLY
paper "On a property motivated by groups with a specified number of subgroups,"
123 (2016) 78–81.
I have discovered that there is an oversight in the last example of my article (p. 81).
In this example, I state that, if G is a finite group with exactly six subgroups, then G
is similar to one of $C_{32}$, $C_{12}$, $C_3 \times C_3$, or the dihedral group of order 6. In fact, the
quaternion group of order 8 should also be in that list. This has no impact on the rest
of the paper.
http://dx.doi.org/10.4169/amer.math.monthly.123.5.515
Peter R. Mercer from Buffalo State College sends along the following.
In the MONTHLY's January 2016 issue, the "Monthly Gems" piece (123 (2016)
p. 77) is a simplification of a proof due to Matsuoka. This simplification was already
demonstrated by D. Daners in [4]. It was also illustrated in my book [5].
REFERENCES
1. R. M. Aron, L. Bernal-Gonzalez, D. Pellegrino, J. B. Seoane-Sepulveda, Lineability: The Search for
Linearity in Mathematics, Monographs and Research Notes in Mathematics, Chapman & Hall/CRC, Boca Raton, FL, 2015.
2. R. M. Aron, V. I. Gurariy, J. B. Seoane-Sepulveda, Lineability and spaceability of sets of functions
on R, Proc. Amer. Math. Soc. 133 (2005) 795–803.
3. L. Bernal-Gonzalez, D. Pellegrino, J. B. Seoane-Sepulveda, Linear subsets of nonlinear sets in topological vector spaces, Bull. Amer. Math. Soc. (N.S.) 51 (2014) 71–130.
4. D. Daners, A short elementary proof that $\sum_{n=1}^{\infty} 1/n^2 = \pi^2/6$, Math. Mag. 85 (2012) 361–364.
5. P. Mercer, More Calculus of a Single Variable, Springer UTM, New York, 2014.
6. A. G. Ramm, A variational principle and its application to estimating the electrical capacitance of a
perfect conductor, Amer. Math. Monthly 120 (2013) 747–750.
7. A. G. Ramm, A variational principle and its application, Int. J. Pure Appl. Math. 77 no. 3 (2012)
309–313.
8. J. B. Seoane-Sepulveda, Chaos and Lineability of Pathological Phenomena in Analysis, Ph.D. thesis,
Kent State University, ProQuest LLC, Ann Arbor, MI, 2006.