
THE AMERICAN MATHEMATICAL MONTHLY


VOLUME 123, NO. 5

MAY 2016

The LEGO Counting Problem  415
Søren Eilers

Mocposite Functions  427
Harold P. Boas

A New Look at Surfaces of Constant Curvature  439
Tom M. Apostol and Mamikon A. Mnatsakanian

Hardy's Reduction for a Class of Liouville Integrals of Elementary Functions  448
Jaime Cruz-Sampedro and Margarita Tetlalmatzi-Montiel

Fast and Simple Modular Interpolation Using Factorial Representation  471
G. L. Mullen, D. Panario, and D. Thomson

NOTES

A Corrigendum to "Unreasonable Slightness"  482
Arseniy Sheydvasser

Euler and the Strong Law of Small Numbers  486
Karl Dilcher and Christophe Vignat

Explicit Additive Decomposition of Norms on $\mathbb{R}^2$  491
Iosif Pinelis

Parabolas and Archimedes' 2/3 Property  497
Michael Gaul and Fred Kuczmarski

An Extremely Short Proof of the Hairy Ball Theorem  502
Peter McGrath

PROBLEMS AND SOLUTIONS  504

BOOK REVIEW

Scientist, Scholar & Scoundrel: A Bibliographical Investigation of the Life and Exploits of Count Guglielmo Libri/Mathematician, Journalist, Patron, Historian of Science, Paleographer, Book Collector, Bibliographer, Antiquarian, Bookseller, Forger and Book Thief by Jeremy M. Norman  512
Gerald L. Alexanderson

ENDNOTES  515

MATHBITS
470, An Alternative Approach to the Product Rule; 481, On Measurable Semigroups in $\mathbb{R}$; 490, Rational Nonaxis Points on the Unit Circle Have Irrational Angles

An Official Publication of the Mathematical Association of America



EDITOR
Scott T. Chapman
Sam Houston State University

EDITOR-ELECT
Susan Colley
Oberlin College

NOTES EDITOR
Sergei Tabachnikov
Pennsylvania State University

BOOK REVIEW EDITOR
Jeffrey Nunemacher
Ohio Wesleyan University

PROBLEM SECTION EDITORS
Gerald Edgar
Ohio State University

Doug Hensley
Texas A&M University

Douglas B. West
University of Illinois
ASSOCIATE EDITORS
William Adkins, Louisiana State University
David Aldous, University of California, Berkeley
Elizabeth Allman, University of Alaska, Fairbanks
Jonathan M. Borwein, University of Newcastle
Jason Boynton, North Dakota State University
Edward B. Burger, Southwestern University
Minerva Cordero-Epperson, University of Texas, Arlington
Allan Donsig, University of Nebraska, Lincoln
Michael Dorff, Brigham Young University
Daniela Ferrero, Texas State University
Luis David Garcia-Puente, Sam Houston State University
Sidney Graham, Central Michigan University
Tara Holm, Cornell University
Lea Jenkins, Clemson University
Daniel Krashen, University of Georgia
Ulrich Krause, Universität Bremen
Jeffrey Lawson, Western Carolina University
C. Dwight Lahr, Dartmouth College
Susan Loepp, Williams College
Irina Mitrea, Temple University
Bruce P. Palka, National Science Foundation
Vadim Ponomarenko, San Diego State University
Catherine A. Roberts, College of the Holy Cross
Rachel Roberts, Washington University, St. Louis
Ivelisse M. Rubio, Universidad de Puerto Rico, Rio Piedras
Adriana Salerno, Bates College
Edward Scheinerman, Johns Hopkins University
Anne Shepler, University of North Texas
Frank Sottile, Texas A&M University
Susan G. Staples, Texas Christian University
Daniel Ullman, George Washington University
Daniel Velleman, Amherst College
Steven Weintraub, Lehigh University

ASSISTANT MANAGING EDITOR
Bonnie K. Ponce

MANAGING EDITOR
Beverly Joy Ruedi

NOTICE TO AUTHORS

The MONTHLY publishes articles, as well as notes and other features, about mathematics and the profession. Its readers span a broad spectrum of mathematical interests, and include professional mathematicians as well as students of mathematics at all collegiate levels. Authors are invited to submit articles and notes that bring interesting mathematical ideas to a wide audience of MONTHLY readers.

The MONTHLY's readers expect a high standard of exposition; they expect articles to inform, stimulate, challenge, enlighten, and even entertain. MONTHLY articles are meant to be read, enjoyed, and discussed, rather than just archived. Articles may be expositions of old or new results, historical or biographical essays, speculations or definitive treatments, broad developments, or explorations of a single application. Novelty and generality are far less important than clarity of exposition and broad appeal. Appropriate figures, diagrams, and photographs are encouraged.

Notes are short, sharply focused, and possibly informal. They are often gems that provide a new proof of an old theorem, a novel presentation of a familiar theme, or a lively discussion of a single issue.

Submission of articles, notes, and filler pieces is required via the MONTHLY's Editorial Manager System. Initial submissions in pdf or LaTeX form can be sent to the Editor-Elect Susan Colley at
www.editorialmanager.com/monthly

The Editorial Manager System will cue the author for all required information concerning the paper. The MONTHLY has instituted a double blind refereeing policy. Manuscripts which contain the authors' names will be returned. Questions concerning submission of papers can be addressed to the Editor-Elect at monthly@maa.org. Authors who use LaTeX can find our article/note template at www.maa.org/monthly.html. This template requires the style file maa-monthly.sty, which can also be downloaded from the same webpage. A formatting document for MONTHLY references can be found there too.

Letters to the Editor on any topic are invited. Comments, criticisms, and suggestions for making the MONTHLY more lively, entertaining, and informative can be forwarded to the Editor-Elect at monthly@maa.org.

The online MONTHLY archive at www.jstor.org is a valuable resource for both authors and readers; it may be searched online in a variety of ways for any specified keyword(s). MAA members whose institutions do not provide JSTOR access may obtain individual access for a modest annual fee; call 800-331-1622 for more information.

See the MONTHLY section of MAA Online for current information such as contents of issues and descriptive summaries of forthcoming articles:
www.maa.org/monthly.html

Proposed problems or solutions should be sent to:

DOUG HENSLEY, MONTHLY Problems
Department of Mathematics
Texas A&M University
3368 TAMU
College Station, TX 77843-3368

In lieu of duplicate hardcopy, authors may submit pdfs to monthlyproblems@math.tamu.edu.

Advertising correspondence should be sent to:


MAA Advertising
1529 Eighteenth St. NW
Washington DC 20036
Phone: (202) 319-8461
E-mail: advertising@maa.org
Further advertising information can be found online at www.maa.org.
Change of address, missing issue inquiries, and other subscription correspondence can be sent to:
maaservice@maa.org
or
The MAA Customer Service Center
P.O. Box 91112
Washington, DC 20090-1112
(800) 331-1622
(301) 617-7800
Recent copies of the MONTHLY are available for purchase
through the MAA Service Center at the address above.
Microfilm Editions are available at: University Microfilms International, Serial Bid coordinator, 300 North Zeeb Road, Ann
Arbor, MI 48106.
The AMERICAN MATHEMATICAL MONTHLY (ISSN 0002-9890) is
published monthly except bimonthly June-July and August-September by the Mathematical Association of America at
1529 Eighteenth Street, NW, Washington, DC 20036 and Lancaster, PA, and copyrighted by the Mathematical Association
of America (Incorporated), 2015, including rights to this journal issue as a whole and, except where otherwise noted, rights
to each individual contribution. Permission to make copies
of individual articles, in paper or electronic form, including
posting on personal and class web pages, for educational and
scientific use is granted without fee provided that copies are
not made or distributed for profit or commercial advantage
and that copies bear the following copyright notice: [Copyright 2016 Mathematical Association of America. All rights
reserved.] Abstracting, with credit, is permitted. To copy
otherwise, or to republish, requires specific permission of the
MAA's Director of Publications and possibly a fee. Periodicals
postage paid at Washington, DC, and additional mailing offices. Postmaster: Send address changes to the American Mathematical Monthly, Membership/Subscription Department,
MAA, 1529 Eighteenth Street, NW, Washington, DC 20036-1385.

The LEGO Counting Problem


Søren Eilers

Abstract. We detail the history of the problem of deciding how many ways one may combine $n$ $2 \times 4$ LEGO bricks, and explain what is known, and not known, about the related question of how these numbers grow with $n$.

1. HISTORICAL BOUNDS. For decades, the LEGO Company (since 2005: The LEGO Group) would state in promotional material that six of the company's iconic $2 \times 4$ bricks could be combined in 102981500 ways if they had the same color. The author coincidentally became aware in 2003 that this number was incorrect, and in 2004 computed the correct number, which is almost 9 times larger. It is a key purpose of this note to explain how the correction was obtained, but let us first discuss the history of the problem, as this is highly instructive for understanding its solution.
With the help of the LEGO Group Archive, the number 102981500 has been traced back to 1974. It appeared in two short notes ([6], [5]) in the company's newsletter as an example of the use of the formula
$$t_n = 22 \sum_{i=0}^{n-2} 46^i \, 2^{n-2-i} + 2^{n-1}, \tag{1.1}$$
which was found by Jørgen Kirk Kristiansen, a chemical engineer working in the company labs who is also the grandson of the founder of the LEGO Group. Mr. Kristiansen was fully aware that he did not count all possible buildings and stated so explicitly in the note, explaining that building number 3 in Figure 1 is not counted, whereas buildings 1 and 2 are.
In fact, formula (1.1) gives a completely correct (but, as we shall see, unnecessarily complicated) description of the number of buildings of maximal height, counted in the sense we will describe below. Moreover, the values
$$t_2 = 24, \qquad t_3 = 1060$$
were correctly computed this way. Precisely how and when this happened remains unclear, but over the years it was forgotten that (1.1) was only intended as a lower bound on the number of buildings, and hence the number 102981500, which is $t_6 - 4$, was presented as the exact number of buildings in the LEGO Company's official communication, for instance in the 2004 company profile along with other LEGO facts and figures such as

    It would take 40,000,000,000 LEGO bricks stacked on top of each other to reach from the Earth to the Moon

and

    On average each person on Earth owns 52 LEGO bricks.

Figure 1. Illustrations from [6], [5]. (a) Three buildings. (b) $a_2 = 24$.

http://dx.doi.org/10.4169/amer.math.monthly.123.5.415
MSC: Primary 05A16, Secondary 05B30

To avoid more misunderstandings, let us be very precise about what we are intending to count. We fix the dimensions $b \times w$ with $b \le w$ of a brick in the LEGO product range and count all buildings that are contiguous. By contiguous we mean that any brick $B_0$ is connected to any other brick $B'$ in the sense that there is a number $\ell \ge 0$ and bricks $B_1, \dots, B_\ell$ so that $B_0$ is attached to $B_1$, $B_1$ is attached to $B_2$, etc., and $B_\ell$ is attached to $B'$. We only consider buildings in which all bricks are placed with top and bottom parallel to the $XY$-plane and with two of their sides parallel to the $X$-axis, and identify buildings which may be obtained from each other by translations in all of $\mathbb{R}^3$ or rotations in the $XY$-plane. Thus, in Figure 2 the configuration (a) is not counted, and the two configurations (b) and (c) are counted as one. We denote by $a_n^{b \times w}$ the number of such (equivalence classes of) buildings which can be obtained by $n$ $b \times w$ bricks, and abbreviate $a_n = a_n^{2 \times 4}$.

Figure 2. (a) Not counted. (b) and (c) Counted as one.
As we shall see below, apart from the decision to only consider buildings where all sides are parallel or perpendicular, it is of little mathematical consequence which conventions are used, but let us convince ourselves that we are using the same conventions as Mr. Kristiansen. Indeed, as observed in [6], there are 46 different ways to place one brick on top of another when the lower one is fixed, and 2 of these are self-symmetric after a rotation by $180^\circ$, whereas the remaining 44 buildings come in pairs defining 22 different buildings in our sense. Thus, as illustrated in Figure 1 (with the symmetric buildings colored white) there are exactly 24 different buildings with two bricks. Furthermore, it is now clear that when we fix one brick and place the remaining $n-1$ bricks on top of each other, we have a total of $46^{n-1}$ different choices for doing so. To obtain a building which is invariant under a rotation by $180^\circ$, we must choose one of the two exceptional configurations at every level, so $2^{n-1}$ of these choices lead to unique self-symmetric buildings, whereas the remaining choices come in pairs. In total, the number of buildings of height $n$ becomes
$$t_n = \frac{1}{2}\left(46^{n-1} - 2^{n-1}\right) + 2^{n-1} = \frac{1}{2}\left(46^{n-1} + 2^{n-1}\right), \tag{1.2}$$
which is consistent with (1.1), since
$$\sum_{i=0}^{n-2} 46^i \, 2^{n-2-i} = 2^{n-2} \sum_{i=0}^{n-2} 23^i = 2^{n-2} \, \frac{23^{n-1} - 1}{23 - 1}.$$
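As a quick sanity check of the algebra relating (1.1) and (1.2), the following Python sketch evaluates both formulas with exact integer arithmetic. The function names `t_original` and `t_closed` are illustrative choices and do not come from the source.

```python
def t_original(n: int) -> int:
    """Formula (1.1): 22 * sum_{i=0}^{n-2} 46^i * 2^(n-2-i) + 2^(n-1)."""
    return 22 * sum(46**i * 2**(n - 2 - i) for i in range(n - 1)) + 2**(n - 1)


def t_closed(n: int) -> int:
    """Formula (1.2): (46^(n-1) + 2^(n-1)) / 2, an integer for every n >= 1."""
    return (46**(n - 1) + 2**(n - 1)) // 2


# The two formulas should agree for every n; check a small range.
for n in range(2, 8):
    assert t_original(n) == t_closed(n)
    print(n, t_closed(n))
```

In particular `t_closed(6)` evaluates to 102981504, which is four more than the rounded figure 102981500 quoted in the promotional material.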

Let us digress a bit to note that the numbers 24, 1060, and 102981500 had in fact been a matter of contention at the LEGO Company in the early 1990s. When an exhibition "The Art of LEGO" was prepared at London's Science Museum, the head of the LEGO Company in the United Kingdom, Clive Nicholls, got interested in the problem and made the point that since any child would create 46 different configurations when asked to build all possible buildings with two bricks, LEGO should communicate the higher numbers 46, $46^2 = 2116$ and $46^5 = 205962976$ instead. Apart from seeing no reason to undersell LEGO's versatility, Mr. Nicholls had the further point that since any LEGO brick has the company logo printed in fine print inside each stud, it is in fact possible to distinguish a brick from its $180^\circ$ rotation.
Mr. Nicholls got an answer from the very top of the organization ([8]), formulated by board member Per Sørensen:

    The science of form is called morphology. It includes the concept of isomorphism: in this case, the ability of two or more objects to assume the same shape. All objects are isomorphous which by rotation in three-dimensional space and/or enlargement or reduction can be made to have the same shape. [...] An eight-stud LEGO element is isomorphous with an eight-stud DUPLO element, and white and red eight-stud bricks are also isomorphous with each other. [...] When the elements are isomorphous with each other, variations in which Element I is fitted on top of Element II are not morphologically different from variations in which Element II is fitted on top of Element I, even though, in purely physical terms they are of course different, because (whatever the people in the moulding shop may say) two elements are not the same. If the LEGO logo on the studs is turned one way or the other, this is, morphologically speaking, uninteresting, because it can be regarded as an unintentional difference and thus insignificant in terms of the morphological nature of the object.

Clive Nicholls had two words for that: "Morphology, schmorphology," and after a long tirade in the company newsletter, he threatened to leave the LEGO Company for the then archenemy TYCO, a subsidiary of Mattel, if the board did not revise the numbers. On a conciliatory note, Mr. Sørensen closed the discussion as follows: "I propose that in the future, we answer the question in the same way that Rolls Royce answers questions about horse power: enough!"
Before moving on to a discussion of how to find $a_n$ by use of computers, note that we at least now have $a_2 = 24$ and the lower bound $a_n \ge t_n$, which tells us that $a_n$ grows at least as fast as exponentially with base 46. In order to get any sort of theoretical handle on this problem, we need to complement this observation with an upper bound of the same nature. Finding such an upper bound would presumably be anathema to the communications division of LEGO Group, and since it is actually a good deal harder than providing a lower bound, it seems rather safe to assert that this was attempted for the first time by the author. Incidentally, the solution given in joint work with Durhuus ([3]) draws on another idea perfected by the LEGO Group: the building instructions.
To obtain a useful upper bound for $a_n$, valid for any $n$, we will use the approach that any building can be created by a set of instructions, and then count the possible instructions instead of the buildings themselves. To be able to implement such an overcounting strategy, however, we need to work with building instructions of a less immediate nature than what the average LEGO user would prefer. For $n \ge 2$ we will say that a map
$$\iota : \{1, 2, \dots, 16n - 24\} \to \{-8, -7, \dots, 7, 8\}$$
is an instruction when $\iota(i) \ne 0$ for precisely $n - 1$ values of $i$.
To use such an instruction, we enumerate the studs of the brick $1, \dots, 8$ starting in the top left corner and working left to right from the top row. First take one brick and call it brick 1. Then read $\iota(1), \dots, \iota(8)$ from left to right to specify what to build on top of brick 1 as follows. If $\iota(1) > 0$, take another brick and place it parallel to brick 1 with hole $\iota(1)$ on top of stud 1. If $\iota(1) < 0$, take a brick and place it orthogonally, rotated $+90^\circ$, to brick 1 with hole $-\iota(1)$ on top of stud 1. In both cases, give the new brick the number 2. If $\iota(1) = 0$, do nothing. Then proceed to read $\iota(2)$ to see what, if anything, to place on stud 2, and so on until $\iota(8)$. Enumerate the bricks as they are introduced. When $n > 2$, similarly interpret $\iota(9), \dots, \iota(16)$ as an instruction of which bricks, if any, to place on top of brick 2, and $\iota(17), \dots, \iota(24)$ as an instruction for what to place underneath brick 2, reading this time $|\iota(i)|$ as a specification of a stud to be placed in a hole, and continue this way to the end of the instruction.
Figure 3. Building instructions, 1965. (Used with permission. © 2015 The LEGO Group.)

See Figures 4(a) and (b) for two examples of instructions defining buildings we are attempting to count. We note, however, that two things can go wrong when one attempts to follow such instructions: as in Figure 4(c) the bricks specified can collide, and as in Figure 4(d) we may encounter a situation where no brick number $\ell + 1$ has been introduced when we have reached the end of the specifications of what to place on bricks $1, \dots, \ell$. We also note that in most situations, there are many different instructions leading to the same building.
But since, just as is the case for LEGO Group building instructions (cf. Figure 3), there is obviously an instruction which will create any given building among the ones we are aspiring to count, the number of instructions is larger than the number of buildings. And we can count the instructions as


$$\binom{16n - 24}{n - 1} 16^{n-1},$$
since the binomial coefficient enumerates the number of possible positions of nonzero values and $16^{n-1}$ enumerates the number of choices for the nonzero entries. To avoid the computational complexity of computing binomial coefficients, appealing to Stirling's formula one can see that these numbers grow no faster than $(16^{17}/15^{15})^{n-1} \approx 674.02^{n-1}$. In fact, we will always have that the number of instructions, and hence $a_n$, is bounded by $u_n = 675^{n-1}$. Thus $a_n$ grows at most as fast as exponentially with base 675.

Figure 4. Instructions and resulting buildings, panels (a)-(d).

In conclusion, we may now say with certainty that the number of ways to combine six $2 \times 4$ bricks lies somewhere between 102981504 and $675^5 = 140126044921875$. To narrow it down we need to use a computer.
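The chain of bounds in this section can be checked numerically for small $n$. The sketch below uses Python's exact integers and `math.comb`; the helper names `instruction_count`, `upper_bound`, and `lower_bound` are hypothetical labels for the three quantities discussed in the text.

```python
from math import comb


def instruction_count(n: int) -> int:
    """Number of instructions for n >= 2 bricks: C(16n - 24, n - 1) * 16^(n - 1)."""
    return comb(16 * n - 24, n - 1) * 16 ** (n - 1)


def upper_bound(n: int) -> int:
    """The simplified bound u_n = 675^(n - 1)."""
    return 675 ** (n - 1)


def lower_bound(n: int) -> int:
    """Buildings of maximal height, t_n = (46^(n-1) + 2^(n-1)) / 2."""
    return (46 ** (n - 1) + 2 ** (n - 1)) // 2


# t_n <= (number of instructions) <= u_n for each n in a small range.
for n in range(2, 10):
    assert lower_bound(n) <= instruction_count(n) <= upper_bound(n)

print(lower_bound(6), instruction_count(6), upper_bound(6))
```

For $n = 6$ the outer values are exactly the two numbers quoted in the concluding sentence above.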
2. COUNTING WITH COMPUTERS. In 2011, the author was made aware that he was not the first to try to remedy that the numbers provided by LEGO Company were only covering a subset of all buildings. On the LEGO user groups' electronic discussion forum LUGnet, a user already in 2002 posted the argument leading to (1.2), and added:

    There remains the problem for the case where the solid is not necessarily an $n$-story building. I only have a result 1560 for $n = 3$ using a computer. I think it is computable until $n = 5$ or 6.
n    a_n                Computed by
1    1
2    24                 Kristiansen 1974
3    1560               Anonymous 2002
4    119580             Eilers 2004
5    10166403           Eilers 2004
6    915103765          Eilers 2004
7    85747377755        Abrahamsen-Eilers 2006
8    8274075616387      Abrahamsen-Eilers 2006
9    816630819554486    Nilsson 2012

Figure 5. Known values of $a_n$ (A112389 of [7])

As indicated in Figure 5, the prediction on how far $a_n$ was computable was a bit on the pessimistic side, but the claim that $a_3 = 1560$ is correct. And the anonymous$^1$ LEGO enthusiast was certainly hitting the nail on the head by predicting that issues concerning efficiency of computation would come up.
We have already touched upon such issues; indeed the main reason for finding our revised formula (1.2) superior to the original (1.1) is that it is faster to compute, and apart from our desire to provide an upper bound of the same nature as the lower bound, we emphasized the feature that $675^{n-1}$ was efficiently computable also for large $n$. Indeed, even though the numbers $t_n$ and $u_n$ grow quickly as $n$ increases, we may use logarithms, successive squaring, or other standard computational methods, to compute the numbers at an expense in time which grows at worst as a linear expression in $n$, or, which is the same, as a linear expression in the number of digits in the computed numbers.
But when it comes to computing $a_n$, we do not know of any method which does not require us to go through, one at a time, a large part of the possible configurations, and since the number of buildings grows exponentially with $n$, so does the time consumption. The first attempt by the author, naively going through all possible buildings saving time only by employing our conventions of identification, could compute up to $a_6 = 915103765$. That number, which the LEGO Group in short order accepted and helped disseminate widely, was the main goal of the initial investigations, but computing it required almost a week's computing time on a laptop, and hence there was little hope to compute beyond $n = 6$.
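To make the naive approach concrete, here is a minimal brute-force sketch in Python. It is not the author's program: the representation of a brick as a tuple `(x, y, z, o)` on an integer stud grid (with `o = 0` for the long side along the $X$-axis), the growth-by-one-brick strategy, and all function names are assumptions made for illustration. Buildings are identified under translations and the four rotations by multiples of $90^\circ$ in the $XY$-plane.

```python
def footprint(brick):
    """Return (x0, x1, y0, y1) for a brick (x, y, z, o); o=0 is 4 wide in X, 2 in Y."""
    x, y, _, o = brick
    w, h = (4, 2) if o == 0 else (2, 4)
    return x, x + w, y, y + h


def overlaps(a, b):
    """True if the footprints of two bricks share at least one stud position."""
    ax0, ax1, ay0, ay1 = footprint(a)
    bx0, bx1, by0, by1 = footprint(b)
    return ax0 < bx1 and bx0 < ax1 and ay0 < by1 and by0 < ay1


def rotate90(brick):
    """Rotate a brick by 90 degrees about the Z-axis: (px, py) -> (-py, px)."""
    x, y, z, o = brick
    h = 2 if o == 0 else 4  # extent along Y before the rotation
    return (-y - h, x, z, 1 - o)


def normalize(bricks):
    """Translate so the minimal coordinates are 0 and sort the bricks."""
    mx = min(b[0] for b in bricks)
    my = min(b[1] for b in bricks)
    mz = min(b[2] for b in bricks)
    return tuple(sorted((x - mx, y - my, z - mz, o) for x, y, z, o in bricks))


def canonical(bricks):
    """Smallest normalized form over the four rotations in the XY-plane."""
    forms, current = [], list(bricks)
    for _ in range(4):
        forms.append(normalize(current))
        current = [rotate90(b) for b in current]
    return min(forms)


def grow(buildings):
    """All buildings obtained by attaching one brick above or below some brick."""
    result = set()
    for building in buildings:
        for base in building:
            x, y, z, _ = base
            for dz in (1, -1):
                for o in (0, 1):
                    for nx in range(x - 3, x + 4):
                        for ny in range(y - 3, y + 4):
                            new = (nx, ny, z + dz, o)
                            if not overlaps(new, base):
                                continue  # the new brick would not be attached to base
                            if any(b[2] == new[2] and overlaps(new, b) for b in building):
                                continue  # collision with a brick on the same level
                            result.add(canonical(building + (new,)))
    return result


def count_buildings(n):
    """Count contiguous buildings of n bricks by growing them one brick at a time."""
    buildings = {canonical(((0, 0, 0, 0),))}
    for _ in range(n - 1):
        buildings = grow(buildings)
    return len(buildings)


print([count_buildings(n) for n in (1, 2, 3)])
```

Every contiguous building with $n$ bricks contains a brick whose removal leaves a contiguous building, so growing by one brick at a time reaches all of them. The cost grows with the number of buildings, which is why this kind of enumeration stalls after a handful of bricks.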
The situation was improved somewhat in joint work with Abrahamsen ([2]), who among many other things made the observation that it is more efficient to count closer to Mr. Nicholls' convention, fixing a brick and then counting all buildings containing this brick at its base level. One issue, then, that perhaps Mr. Nicholls had not considered, is that when the base level contains more than one brick arranged with its long side parallel to the base brick, say $k$ such bricks, then every configuration will be counted $k$ times even if one does not allow identifications by rotations, but only by translations. But taking this into account, and keeping track also of which buildings are symmetric after rotations, we may compute $a_n$ by
$$a_n = \sum_{m=1}^{n} \frac{c(n, m) + c^{180}(n, m) + 2c^{90}(n, m)}{2m},$$
where $c(n, m)$ is the number of configurations with $n$ bricks containing the base brick in its bottom level, so that there are $m$ bricks in the lower level, and where $c^{180}(n, m)$ and $c^{90}(n, m)$ count those that are symmetric after rotations by 180 and 90 degrees in the $XY$-plane as indicated.

$^1$ The author of the post has been identified, but prefers to remain anonymous.
However, it is obviously unnecessarily inefficient to work our way through all buildings this way, since (1.1) allows us to quickly count all the buildings of maximal height. Let's elaborate on the idea implicit in Mr. Kristiansen's computation. Whenever we know that there is only one single brick in some layer of the building, we can compute the number of possibilities by multiplication of the number of possibilities of what to put below and the number of possibilities of what to put on top. We may speed up the computations substantially by defining $c(n, m)$ as the number of buildings with $n$ bricks, $m$ of which are in the bottommost level, which are fat in the sense that at every level above the bottommost, there are at least two bricks. Also, define $c(n)$ as the number of buildings with $n + 1$ bricks so that there is one brick each in the topmost and bottommost level, and two or more in any other level. For instance, building 3 in Figure 1 is one of the buildings counted by $c(5)$. It is elementary, but tedious, to verify then that
$$a_n = \sum_{m=2}^{n} \frac{c(n, m) + c^{180}(n, m) + 2c^{90}(n, m)}{2m} + \frac{1}{2} \sum_{\ell=0}^{n-1} \; \sum_{m_1 + m_2 + k_1 + \cdots + k_\ell = n+1} \Big[ c(m_1, 1)\, c(m_2, 1)\, c(k_1) \cdots c(k_\ell) + c^{180}(m_1, 1)\, c^{180}(m_2, 1)\, c^{180}(k_1) \cdots c^{180}(k_\ell) \Big]. \tag{2.3}$$

Formula (2.3) looks rather formidable, but has several mitigating features. First, we note that since the $2 \times 4$ brick is not itself invariant under a rotation by $90^\circ$, it takes at least 4 bricks (two with the long side parallel to the $X$-axis, two with the long side parallel to the $Y$-axis) to create a layer which is invariant under such a rotation, and hence $c^{90}(n, m) = 0$ unless 4 divides both $m$ and $n$; the first nonzero value is $c^{90}(8, 4) = 244$. Second, since $c(n, m) = 0$ when $n < m + 2$ (unless $n = m = 1$) and since $c(2) = 0$, the expressions reduce substantially for small $n$. Indeed, we rediscover
$$a_2 = \frac{1}{2}\Big[ c(1) + c^{180}(1) \Big] = t_2$$
and find
$$a_3 = \frac{1}{2}\Big[ c(1)^2 + 2c(3, 1) + c^{180}(1)^2 + 2c^{180}(3, 1) \Big],$$
$$a_4 = \frac{1}{4}\Big[ c(4, 2) + c^{180}(4, 2) \Big] + \frac{1}{2}\Big[ c(1)^3 + 2c(3, 1)c(1) + c(3) + 2c(4, 1) + c^{180}(1)^3 + 2c^{180}(3, 1)c^{180}(1) + c^{180}(3) + 2c^{180}(4, 1) \Big],$$
where all of the necessary constants are listed in Figure 7.
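The way the second sum in (2.3) collapses for small $n$ can be reproduced mechanically. The sketch below enumerates all index tuples $(m_1, m_2, k_1, \dots, k_\ell)$ of positive integers summing to $n + 1$ and discards those containing a factor known to vanish, namely $c(2, 1) = 0$ and $c(2) = 0$; the factor $c(1, 1) = 1$ is dropped from the printed products. The function names and the string encoding of the terms are illustrative assumptions, not taken from the source.

```python
from collections import Counter


def compositions(total, parts):
    """Yield all tuples of `parts` positive integers that sum to `total`."""
    if parts == 1:
        if total >= 1:
            yield (total,)
        return
    for first in range(1, total - parts + 2):
        for rest in compositions(total - first, parts - 1):
            yield (first,) + rest


def surviving_terms(n):
    """Nonvanishing products in the second sum of (2.3), with multiplicities."""
    terms = Counter()
    for ell in range(0, n):
        for combo in compositions(n + 1, ell + 2):
            m1, m2, ks = combo[0], combo[1], combo[2:]
            if m1 == 2 or m2 == 2 or 2 in ks:
                continue  # c(2, 1) = 0 and c(2) = 0 kill the whole product
            factors = [f"c({m},1)" for m in (m1, m2) if m != 1]  # c(1,1) = 1
            factors += [f"c({k})" for k in ks]
            terms[" ".join(sorted(factors)) or "1"] += 1
    return dict(terms)


print(surviving_terms(3))
print(surviving_terms(4))
```

For $n = 3$ this leaves $c(1)^2$ once and $c(3, 1)$ twice, and for $n = 4$ it leaves $c(1)^3$, $c(3)$, $2c(3, 1)c(1)$, and $2c(4, 1)$, matching the bracketed expressions for $a_3$ and $a_4$.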


The point of (2.3) is of course that the number of fat buildings grows slower than the total number of buildings. This helps, but it doesn't help a whole lot, since to this day we know of no way to avoid more or less individually counting the fat buildings. Thus, the time needed to compute $a_n$ is at least as large as a number proportional to $c(n - 1)$, and these numbers can be proven (as we will see below) to grow at least exponentially with base $\sqrt{1248} \approx 35.3$. Thus, unless a better way is found to count fat buildings, the computation time needed to compute $a_n$ will grow exponentially with a prohibitively large base. The author does not believe it can be done in polynomial time, but has no formal evidence for such a claim.

Figure 6. $c^{180}(4, 1) = 8$

Figure 7. Some basic counts
The concrete programs used in [2] (see [1] for more details) could compute $a_6$ in about 5 minutes, but with computing times increasing by a factor of around 100 for each additional brick, finding $a_8$ took about 500 CPU hours and finding $a_9$ was projected to take more than 5 CPU years. Thus the author was rather awed when approached in 2012 by Johan Nilsson, a Swedish mathematician then based in Germany, who could not only supply $a_9$ but had also independently verified $a_1, \dots, a_8$.
Dr. Nilsson's approach was to parallelize the problem. The algorithms used in the author's computations do not lend themselves well to such an approach, but Nilsson had the brilliant idea of running through all instructions instead, one at a time, checking which gave rise to buildings. Dividing the universe of instructions evenly among a large number of computers at the Department of Mathematics at the University of Bielefeld, which were working on the problem when otherwise idle, Nilsson could obtain $a_9$ in a matter of months.
3. THE GROWTH CONSTANT. The most efficient way of communicating the versatility of the $2 \times 4$ brick, rather than a sequence of individual counts, would be via the growth constant $h$ defined so that
$$a_n \sim k h^n.$$
Such growth constants are ubiquitous in asymptotic combinatorics and are key concepts in applications, measuring capacity in contexts of information theory or computer science, or entropy in contexts of physics.
That such a constant $h$ is defined is nontrivial and requires some interpretation of what we mean by $\sim$. Of course if we knew that $a_{n+1}/a_n$ converged as $n \to \infty$, the limit would be an excellent candidate, but although this is in all likelihood the case, the author knows of no way of proving it. Instead, which is nearly as useful, we may use our upper bound to prove that $\sqrt[n]{a_n}$ converges by the following lemma.
Lemma 3.1. [3] $\lim_{n \to \infty} \log a_n / n$ exists.

Proof: We note first that
$$\lim_{n \to \infty} \frac{\log a_n}{n} = \lim_{n \to \infty} \frac{\log c(n, 1)}{n} \tag{3.4}$$
in the sense that if one limit exists, so does the other. Note that since we have found that $\log a_n \le \log u_n = (n - 1) \log 675$, the sequence $\log a_n / n$ is bounded. Our claim (3.4) follows immediately by the inequalities
$$a_{n-1} \le c(n, 1) \le 2a_n.$$
The leftmost inequality follows by mapping each equivalence class of configurations with $n - 1$ bricks to a representative placed on top of a fixed base brick and noting that this map is injective. The rightmost follows by mapping each configuration to an equivalence class and noting that this map is at most 2-to-1.
Letting $\mathcal{C}_n$ denote the set of buildings counted by $c(n, 1)$, one sees that $c(n + m, 1) \ge c(n, 1)c(m, 1)$ by noting that an injective map from $\mathcal{C}_n \times \mathcal{C}_m$ to $\mathcal{C}_{m+n}$ is defined by placing the base brick of the element of $\mathcal{C}_m$ somewhere on the top layer of the element of $\mathcal{C}_n$. Hence, $\log c(n, 1)$ is a superadditive sequence, and appealing to Fekete's lemma, $\log c(n, 1)/n$ converges to $\sup_{n \in \mathbb{N}} \log c(n, 1)/n$ in $[0, \infty]$. But we have seen that the limit is finite; indeed it is less than $\log 675$.
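To get a feeling for how slowly the known values approach the limit, one can tabulate both the consecutive ratios $a_{n+1}/a_n$ and the roots $\sqrt[n]{a_n}$ from the values in Figure 5. The following Python sketch does so; the list name `A` and the helper names are illustrative, and the floating-point output is only an approximation of the exact quantities.

```python
# Known values a_1, ..., a_9 as listed in Figure 5.
A = [1, 24, 1560, 119580, 10166403, 915103765,
     85747377755, 8274075616387, 816630819554486]


def ratios(values):
    """Consecutive ratios a_{n+1} / a_n."""
    return [b / a for a, b in zip(values, values[1:])]


def nth_roots(values):
    """The sequence a_n^(1/n), whose limit is the growth constant h."""
    return [v ** (1.0 / n) for n, v in enumerate(values, start=1)]


for n, (r, root) in enumerate(zip(ratios(A), nth_roots(A)[1:]), start=2):
    print(f"n={n}: a_n/a_(n-1) = {r:.2f}, a_n^(1/n) = {root:.2f}")
```

The ratios climb steadily into the nineties while the $n$th roots lag far behind, which illustrates why nine values alone say little about $h$ and why rigorous bounds are needed.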

Taking exponentials, we set
$$h = \lim_{n \to \infty} \sqrt[n]{a_n} = \lim_{n \to \infty} \sqrt[n]{c(n, 1)},$$
noting in particular that when we focus on $h$ rather than individual counts, Mr. Nicholls' protests become completely inconsequential. Convincing ourselves that upper bounds $u_n^{b \times w} = (s^{b \times w})^n$ for any dimensions can be obtained by counting instructions, we see further that
$$h^{b \times w} = \lim_{n \to \infty} \sqrt[n]{a_n^{b \times w}}$$
makes sense for any choice of dimension $b \times w$. Thus, we have an extremely efficient measure of the versatility of each brick in the LEGO product line, which can meaningfully be compared among themselves, and to other such measures. But before we can ask the LEGO Group to start saying something like

    Already have a lot of $2 \times 4$ LEGO bricks? Buy one more and have the number of buildings you can create multiplied by $h$!

we have to face up to the task of computing, or at least estimating, such numbers h.
Our lower and upper bounds tell us that 46 h 675, leaving a lot of room for
improvement. The first step is to scrutinize our definition of instructions with the aim
of reducing the upper bound. For instance, since 38 of the 46 ways to place one brick
on top of another involves more than one stud, we can avoid some redundance by
424

c THE MATHEMATICAL ASSOCIATION OF AMERICA [Monthly 123




distributing the positions evenly with at most 6 choices for each stud. Furthermore,
one can use that one stud (or one hole) has already been spoken for when placing all
bricks except the first to reduce the number of possible choices on the side which has
already been in use from 46 to 30. One checks that 30 positions can be distributed
evenly on 5 studs, leading to


$$\binom{(8+5)n-(8+5+5)}{n-1}\,6^{n-1}$$
instructions, and the ensuing estimate
$$h \le (13^{13}/12^{12}) \cdot 6 < 204.$$
Adapting much more advanced methods developed in the context of enumerating polyominoes ([4]), it was proved in [3] that h < 177.
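The constant in the estimate above can be checked numerically. The following is a minimal sketch, assuming the simplified instruction count $\binom{13n}{n}6^n$ (constant offsets in the binomial coefficient do not change the exponential growth rate); the function name is illustrative only.

```python
from math import comb, exp, log

# Closed-form growth rate of comb(13n, n) * 6^n, as quoted in the text.
bound = 13**13 / 12**12 * 6

def nth_root_of_instruction_count(n: int) -> float:
    """n-th root of comb(13n, n) * 6^n, computed through logarithms of exact integers."""
    return exp((log(comb(13 * n, n)) + n * log(6)) / n)

if __name__ == "__main__":
    print(f"(13^13 / 12^12) * 6 = {bound:.4f}")
    for n in (10, 100, 1000):
        print(n, round(nth_root_of_instruction_count(n), 4))
```

The n-th roots increase toward the closed-form constant from below, which is consistent with the bound lying just under 204.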
The lower bound can be improved somewhat by appealing to the concept of generating functions. Organizing the individual counts into a power series
$$A(z) = \sum_{n=1}^{\infty} a_n z^n = 46z + 1560z^2 + 119580z^3 + \cdots$$
we see by the root criterion that the sum converges in $[0, 1/h)$ and diverges in $(1/h, \infty)$. Using standard methods from the theory of generating functions, (2.3)
translates to
$$A(z) = \sum_{m=2}^{\infty}\bigl(C_m(z) + C_m^{180}(z) + 2C_m^{90}(z)\bigr) + \frac{C_1(z)^2}{2z(1-C(z))} + \frac{C_1^{180}(z)^2}{2z(1-C^{180}(z))}$$
with functions $C_m$, $C_m^{180}$, $C_m^{90}$, $C$, $C^{180}$ defined from constants $c(n,m)$, $c^{180}(n,m)$, $c^{90}(n,m)$, $c(n)$, and $c^{180}(n)$ in the same way that we defined $A$ from $a_n$. Moreover,
since all of these functions must converge on $[0, 1/h)$, since $c(n,m), c^{180}(n,m), c^{90}(n,m) < a_n$, and since $c(n), c^{180}(n) \le c(n,1)$, we may conclude (after a little more work) that $A(z)$ diverges at $1/h$ as a result of division by zero:
$$C(1/h) = 1. \tag{3.5}$$
More precisely, $h$ is the reciprocal of the smallest solution to $C(x) = 1$ on $[0,1]$. We know, as recorded in Figure 7,
$$C(x) \ge 46x + 74130x^3 + 867346x^4$$
on $[0,1]$, so solving for $x$ we obtain that $h > 66$. Using more values of $c(n)$, the last
being
c(9) = 2067477693115
as computed by Nilsson, and the general estimate c(n + 2) > 1248c(n), we may show
(as in [3]) that h > 81.
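The step from the polynomial lower bound to $h > 66$ can be reproduced with a few lines of bisection. This is a sketch only: it uses just the three coefficients quoted above, and the function names are illustrative. Because the true $C(x)$ is at least as large as the polynomial, $C$ reaches the value 1 no later than the polynomial does, so the reciprocal of the polynomial's root is a lower bound for $h$.

```python
def lower_polynomial(x: float) -> float:
    """Truncated lower bound for C(x) quoted in the text."""
    return 46 * x + 74130 * x**3 + 867346 * x**4

def smallest_root(target: float = 1.0, lo: float = 0.0, hi: float = 1.0,
                  iterations: int = 200) -> float:
    """Bisection for lower_polynomial(x) = target; the polynomial increases on [0, 1]."""
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if lower_polynomial(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

x_star = smallest_root()
h_lower_bound = 1 / x_star  # reciprocal of the root bounds h from below

if __name__ == "__main__":
    print(f"root = {x_star:.8f}, lower bound for h = {h_lower_bound:.4f}")
```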
But it remains a sad fact, not for want of trying, that this is the best the author
has been able to do, and hence the ambitions for using growth constants to gauge and
compare the versatility of different brick sizes are largely unrealized. For instance, we can create upper bounds by counting instructions to see that both $h_{1\times 2}$ and $h_{2\times 2}$ are less than 81, and hence prove the nonsurprising fact that the 2 × 4 brick is more versatile than both the 1 × 2 and the 2 × 2 brick. But because of overlaps between the intervals in which we know that $h_{1\times 2}$ and $h_{2\times 2}$ must be contained, we are not able to say with certainty which of these bricks is more versatile. By comparing $a_n^{1\times 2}$:
1, 4, 37, 375, 4493, 56848, 753536, 10283622, 143607345
to $a_n^{2\times 2}$:
1, 3, 31, 412, 6435, 106108, 1825803, 32320892, 584956651
for $n \in \{1, \ldots, 9\}$ it appears that the 2 × 2 brick is superior.
Although it felt a bit like acknowledging defeat, we in [3] took to heuristic estimation of $h$ by the standard method of fitting a straight line to a semilogarithmic plot. The best fit to our observations $a_1, \ldots, a_8$, however, gave the value $h \approx 74.8$, which was not consistent with our lower bounds, indicating that we had too few observations for such an approach. In [2] we consequently applied Monte Carlo methods, estimating $a_n$ by drawing instructions at random and seeing how often they gave rise to actual buildings, to estimate $a_9, \ldots, a_{20}$ and arrive at $h \approx 117$.
This remains the author's best guess, but of course it should be taken only for what it is: a guess. For instance, we now know that our estimate $a_9 \approx 7.94 \times 10^{14}$, obtained 5 years before Nilsson provided the exact value, was almost 3% too low. Imprecisions of this nature can be expected to cancel out, but this leaves the real problem that we have no way of knowing how well the growth of $a_1, \ldots, a_{20}$ predicts the true value of $h$.
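The semilogarithmic fitting method can be illustrated on the two short sequences listed above for the 1 × 2 and 2 × 2 bricks (the 2 × 4 data behind the value 74.8 are not reproduced in this section). The sketch below is an ordinary least-squares line through the points $(n, \log a_n)$; the function name and the choice to use all nine terms are assumptions made for illustration, not the procedure of [3].

```python
from math import exp, log

# Counts for n = 1, ..., 9 as listed in the text.
A_1X2 = [1, 4, 37, 375, 4493, 56848, 753536, 10283622, 143607345]
A_2X2 = [1, 3, 31, 412, 6435, 106108, 1825803, 32320892, 584956651]

def semilog_growth_estimate(counts):
    """Exponential of the least-squares slope of log(a_n) against n."""
    n = len(counts)
    xs = list(range(1, n + 1))
    ys = [log(c) for c in counts]
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    numerator = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    denominator = sum((x - x_mean) ** 2 for x in xs)
    return exp(numerator / denominator)

if __name__ == "__main__":
    for name, counts in (("1x2", A_1X2), ("2x2", A_2X2)):
        fit = semilog_growth_estimate(counts)
        last_ratio = counts[-1] / counts[-2]
        print(f"{name}: fitted growth {fit:.2f}, last ratio a_9/a_8 = {last_ratio:.2f}")
```

In both sequences the last ratio $a_9/a_8$ exceeds the fitted growth factor, a sign that the ratios are still rising and that a fit to so few terms tends to underestimate the limit.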
ACKNOWLEDGMENT. The author was supported by VILLUM FONDEN through the network for Experimental Mathematics in Number Theory, Operator Algebras, and Topology, as well as the Danish National
Research Foundation through the Centre for Symmetry and Deformation (DNRF92). The author further wishes
to thank records manager Tine Froberg Mortensen at the LEGO Group Archives for her invaluable assistance.

REFERENCES
1. M. Abrahamsen, S. Eilers, Efficient counting of LEGO structures, Tech. Report www.math.ku.dk/~eilers/eclbii.pdf, University of Copenhagen, 2007.
2. M. Abrahamsen, S. Eilers, On the asymptotic enumeration of LEGO structures, Exp. Math. 20 (2011) 145–152.
3. B. Durhuus, S. Eilers, On the entropy of LEGO, J. Appl. Math. Comput. 45 (2014) 433–448.
4. D. A. Klarner, R. L. Rivest, A procedure for improving the upper bound for the number of n-ominoes, Canad. J. Math. 25 (1973) 585–602.
5. J. K. Kristiansen, Mere taljonglering med klodser, Klodshans 3 (1974) 13 [Danish].
6. J. K. Kristiansen, Taljonglering med klodser eller talrige klodser, Klodshans 2 (1974) 12 [Danish].
7. N. J. A. Sloane, The on-line encyclopedia of integer sequences, http://oeis.org.
8. K. Sørensen, Morphology in practice, LEGO Rev. 2 (1991) 8.

SØREN EILERS obtained his Ph.D. from the University of Copenhagen in 1995. He is currently on sabbatical from his position there, acting as the main organizer of the program "Classification of operator algebras: Complexity, rigidity, and dynamics" at Institut Mittag-Leffler in Stockholm. After being featured as the crazy mathematician in "A LEGO Brickumentary," his Bacon-Erdős number dropped from ∞ to 7.
Department of Mathematical Sciences, University of Copenhagen, Universitetsparken 5
DK-2100 Copenhagen Ø, Denmark
eilers@math.ku.dk

Mocposite Functions
Harold P. Boas

Abstract. Traditional mathematical notation can lead to confusion. Expressions that appear to
define composite functions sometimes do not. A particular example with engineering applications is studied in detail.

An engineering student and a mathematics student walk into a bar. Instead of carding the students, the bartender offers them free drinks for a correct answer to the question, "Is the function $\sqrt{1-z^2}$ even or odd?" The engineering student shouts out, "Even, of course!" Noticing the bartender's sphinx tattoo, the mathematics student slyly says, "My answer is: Yes." Although the smart-aleck second answer arguably is less wrong than the first, the bartender throws both students into the street and orders them to stay away until they have studied analytic continuation.

The surprise is that $\sqrt{1-z^2}$ appears in some applications as an odd function! This statement seems absurd at first sight, for $1-z^2$ is manifestly even and composing an even function with any subsequent operation preserves evenness. The startling resolution of this paradox is that $\sqrt{1-z^2}$ only pretends to be a composite function but actually is not one. I propose that such mock composite functions be called mocposite.

This article studies $\sqrt{1-z^2}$ as a means of entering the looking-glass world [3] of mocposite functions, where even is odd and odd is ordinary. Some prior acquaintance with the elements of complex analysis will make the reader's passage smoother. My tale includes both a caution on confusing conventions and a pedagogical praise of pedantry.

1. INDICES AND SURDS. Understanding $\sqrt{1-z^2}$ requires first coming to grips with the notation for square roots. The peculiar symbol √ dates from 16th-century Germany, according to Florian Cajori [1, §§316–338], and the juxtaposition of the horizontal grouping bar (vinculum) is a subsequent innovation of René Descartes, one of his most enduring and most regrettable contributions to mathematics. Why not use exponent 1/2 to denote a square root? The exponential form is both cleaner than √ to typeset and consistent with the standard notation for other powers.

A thornier problem than the notation is the ambiguity inherent in the concept of square root, for every number has two square roots. If you object that the number 0 is an exception having a single square root, then observe that what $\sqrt{z}$ really means is a solution $w$ to a particular quadratic equation: namely, $w^2 - z = 0$. Every quadratic equation has two solutions, counting multiplicity.

Nonetheless, there is one case in which everybody agrees that the symbol $\sqrt{z}$ has a unique meaning. When $z$ happens to be a positive real number, convention dictates that $\sqrt{z}$ always denotes the positive square root of $z$. But if $z$ is a negative real number (or, worse, a nonreal number), then confusion can and does arise.
A quantity $i$ whose square equals $-1$ is fundamental to complex analysis, so neither the existence nor the uniqueness of $i$ should pass without comment. In the influential terminology of Descartes [5, p. 380], nonreal solutions of polynomial equations are "imaginary" in the sense of existing only in the imagination. The device of giving imaginary numbers a concrete existence as ordered pairs of real numbers (equipped with a suitable multiplication) is due to William Rowan Hamilton [6] 200 years after Descartes. The imaginary unit $i$ has an alternative realization, invented by Augustin-Louis Cauchy [4], that can be expressed in modern language as the equivalence class of the indeterminate $x$ in the algebraic structure $\mathbb{R}[x]/(x^2+1)$, the quotient of the ring of polynomials with real coefficients by the ideal consisting of polynomials that have $x^2+1$ as a factor.

Figure 1. The squaring function $z \mapsto z^2$

http://dx.doi.org/10.4169/amer.math.monthly.123.5.427
MSC: Primary 30B40
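Hamilton's device is easy to make concrete. The sketch below models complex numbers as ordered pairs of reals with the usual multiplication rule $(a,b)(c,d) = (ac - bd,\ ad + bc)$; the names `Pair`, `multiply`, `i`, and `minus_i` are illustrative. It shows both existence (a pair whose square is $(-1, 0)$) and non-uniqueness (there are two such pairs).

```python
from typing import Tuple

Pair = Tuple[float, float]

def multiply(p: Pair, q: Pair) -> Pair:
    """Hamilton's product of ordered pairs: (a, b)(c, d) = (ac - bd, ad + bc)."""
    a, b = p
    c, d = q
    return (a * c - b * d, a * d + b * c)

i = (0.0, 1.0)
minus_i = (0.0, -1.0)

if __name__ == "__main__":
    # Both pairs square to the pair representing -1.
    print(multiply(i, i))
    print(multiply(minus_i, minus_i))
```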
Authors who wish to have the letter $i$ available as a summation index often write a complex variable in the form $x + y\sqrt{-1}$ instead of $x + yi$, innocently imagining (I suppose) that the symbol $\sqrt{-1}$ has a unique meaning rather than two possible values. An inevitable consequence of this belief would be that $\sqrt{-4}$ has a unique meaning (namely, $2\sqrt{-1}$) and more generally that $\sqrt{z}$ is well defined for $z$ everywhere on the negative part of the real axis. Since this set is precisely the standard branch cut across which the complex square-root function is discontinuous, such authors are implicitly constructing an edifice on top of a fault line and hoping that no earthquake occurs.

The standard square-root function arises by considering an inverse of the function that sends a complex number $z$ to the image $z^2$. As indicated in Fig. 1, this squaring function maps the open right-hand half-plane (where the real part of $z$ is positive) bijectively onto the complex plane with a left-hand slit along the real axis from $0$ to $-\infty$. The principal branch of $\sqrt{z}$ means the inverse of this squaring function. Being the inverse of a holomorphic (that is, complex-analytic) function, the principal branch of $\sqrt{z}$ is a holomorphic function too. More generally, a branch of $\sqrt{z}$ means a holomorphic function $f$ such that $(f(z))^2 = z$ for every $z$ in some prescribed domain in the complex plane.
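The fault line can be observed directly. The following sketch uses Python's `cmath.sqrt`, which implements the principal branch with its cut along the negative real axis, and evaluates it at two points just above and just below $-4$; the helper name and the offset `1e-12` are arbitrary choices for illustration.

```python
import cmath

def principal_sqrt(z: complex) -> complex:
    """Principal branch: the square root lying in the closed right-hand half-plane."""
    return cmath.sqrt(z)

# Two points that straddle the branch cut near z = -4.
above = principal_sqrt(complex(-4.0, 1e-12))
below = principal_sqrt(complex(-4.0, -1e-12))

if __name__ == "__main__":
    print("just above the cut:", above)
    print("just below the cut:", below)
```

The two inputs differ by a negligible amount, yet the outputs are close to $2i$ and $-2i$ respectively.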

Not every domain supports a branch of $\sqrt{z}$. The obstruction is the existence in the domain of a simple closed curve $\gamma$ that surrounds the origin. Indeed, if $(f(z))^2 = z$ for every $z$ in some domain, then the chain rule implies that $2 f f' = 1$, so
$$\frac{f'(z)}{f(z)} = \frac{1}{2(f(z))^2} = \frac{1}{2z}.$$
If $C$ denotes the image of $\gamma$ under $f$, then
$$\frac{1}{2\pi i}\int_{\gamma} \frac{f'(z)}{f(z)}\,dz = \frac{1}{2\pi i}\int_{C} \frac{1}{w}\,dw,$$
which equals the winding number of the curve $C$ about 0: namely, a particular integer. On the other hand, this integer equals
$$\frac{1}{2\pi i}\int_{\gamma} \frac{1}{2z}\,dz,$$
which is half the winding number of $\gamma$ about 0. Hence, the existence of $f$ precludes the existence of a curve $\gamma$ in the domain with winding number 1 about 0 since 1/2 is not an integer.
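The obstruction can also be seen numerically by attempting the impossible: follow a square root continuously once around the unit circle. The sketch below (function name and step count are illustrative) starts with the root $w = 1$ of $z = 1$ and, at each small step, picks whichever of the two square roots is closer to the previous value.

```python
import cmath
import math

def continue_sqrt_around_circle(steps: int = 1000) -> complex:
    """Track a continuous square root of z as z goes once around the unit circle."""
    w = 1 + 0j  # chosen square root at the starting point z = 1
    for k in range(1, steps + 1):
        z = cmath.exp(2j * math.pi * k / steps)
        r = cmath.sqrt(z)
        # Of the two roots r and -r, keep the one nearer the previous value.
        w = r if abs(r - w) <= abs(-r - w) else -r
    return w

if __name__ == "__main__":
    print("value after one full loop:", continue_sqrt_around_circle())
```

After one loop the tracked value returns to the other square root of 1, so no single-valued continuous choice exists on a domain containing the circle.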

When $g$ is a holomorphic function, a branch of $\sqrt{g}$ means a holomorphic function $f$ such that $(f(z))^2 = g(z)$ for every $z$ in some prescribed domain. A subtle but crucial point is that the existence of a branch of $\sqrt{g}$ does not necessarily entail the existence of a branch of $\sqrt{z}$ on the image of $g$. If $g(z) = z^2$, for instance, then there is a branch of $\sqrt{g}$ on the entire complex plane (namely, the identity function), but there is no branch of $\sqrt{z}$ on the image of $g$ (for that image is the entire complex plane).

If a domain supports a branch of $\sqrt{z}$, then the negative of that function is another branch. Consequently, the value of the expression $\sqrt{z}$ when $z = 4$ is not necessarily equal to $\sqrt{4}$ (since $\sqrt{4}$ conventionally is positive). Do you sense the other-worldly weirdness wafting from the standard notation for square roots?

Pedants distinguish between the name of a function, say cos, and the value of a function at a point, say cos z. Most authors, however, use the notation cos z ambiguously to mean either the value of the function cos at the point z or the function that sends the variable z to the value cos z. My father was fond of pointing out that the second usage corresponds to a standard trope of classical rhetoric: Synecdoche is the figure of speech in which a part stands for the whole. Normally, no confusion arises from naming a function by a generic value, but $\sqrt{1-z^2}$ presents a dramatic exception, as I shall demonstrate now.
2. AND THIS WAS ODD. When I was an undergraduate, back in the days when the distinguished mathematician John Tate had won only the first of his many major awards, I heard him declare that "2 is an odd prime" (an entirely reasonable statement in the context of algebraic number theory). I intend to make the case that $\sqrt{1-z^2}$ is an odd function (in both senses of the word "odd").

Keep in mind that what the expression $\sqrt{1-z^2}$ means is a function $f$ such that $(f(z))^2 = 1 - z^2$ for all $z$ in some specified domain. Introducing the variable $w$ to represent $f(z)$ converts the equation into the following form: $w^2 + z^2 = 1$. This relation defines a certain subset of the space $\mathbb{C}^2$ of two complex variables, a subset that some readers may wish to think of as a Riemann surface (a one-dimensional complex manifold). The implicit-function theorem implies that $w$ can be expressed as a holomorphic function of $z$ locally near each point on the surface at which $w$ is different from 0 (equivalently, $z$ is different from $\pm 1$).

Since the equation is symmetric with respect to the two variables, there is no reason for $w$ to play a distinguished role. If the equation determines $w$ as a function of $z$ in some region of the complex plane, then symmetry dictates that $z$ is the same function of $w$ in the identical region of the $w$-plane. Actually, there must be two functions, negatives of each other, since the equation does not distinguish between $w$ and $-w$ (or between $z$ and $-z$).

Symmetry considerations thus give rise to the problem of prescribing a suitable subdomain of $\mathbb{C} \setminus \{0, 1, -1\}$ and a bijective holomorphic function $f$ from that subdomain to itself such that $(f(z))^2 = 1 - z^2$ for every $z$. Moreover, the inverse function must be either the same function $f$ or its negative.

A reasonable initial step in the construction of $f$ is to define a branch of $\sqrt{1-z^2}$ on the upper half-plane, the set where $z$ has a positive imaginary part. Every complex number that is neither a positive real number nor 0 is the square of exactly one such $z$. Therefore, the function that sends $z$ to $1 - z^2$ maps the open upper half-plane bijectively onto the plane with a left-hand slit along the real axis from $1$ to $-\infty$. (See Fig. 2.) This open region is a subset of the domain of the principal branch of the square-root function, so $\sqrt{1-z^2}$ is well-defined on the upper half-plane as a composite function, say $f_1$.

Figure 2. Preparation for taking a square root (the map $z \mapsto 1 - z^2$)

Figure 3. $\sqrt{1-z^2}$ on the upper half-plane (the map $f_1$)

The image of $f_1$ is nearly identical to the image of the principal branch of the square root, except for removal of the image under the square-root function of the segment of the real axis from 0 to 1. Since the square-root function maps that segment back to itself, the function $f_1$ maps the upper half-plane bijectively onto the right-hand half-plane with a slit along the real axis from 0 to 1, as shown in Fig. 3. Notice that if $y$ is a positive real number, then $f_1(iy) = \sqrt{1+y^2}$ (positive square root), so $f_1$ maps the part of the imaginary axis in the upper half-plane onto the unbounded interval of the real axis from $1$ to $+\infty$.
The next step (a nonunique process) is to extend the function $f_1$ to a larger domain. Here is one way to proceed. Observe that when $z$ is a point in the upper half-plane with real part greater than 1 and imaginary part close to 0, the point $z^2$ has the same properties. The point $1 - z^2$ then lies in the third quadrant close to the real axis. Taking the principal square root shows that the value $f_1(z)$ lies in the fourth quadrant close to the imaginary axis. The upshot is that $f_1$ extends continuously to the unbounded interval of the real axis to the right of 1 and maps this interval to the bottom half of the imaginary axis. Explicitly, the extension of $f_1$ maps an arbitrary real number $x$ greater than 1 to the image $-i\sqrt{x^2-1}$ (positive square root). Parallel reasoning shows that $f_1$ extends continuously to the unbounded interval of the real axis to the left of $-1$, and $f_1$ maps this interval to the top half of the imaginary axis (Fig. 3).
This situation admits application of the Schwarz reflection principle, the simplest method of analytic continuation discussed in a first course on complex analysis. The principle says that if a holomorphic function in the top half of a region symmetric with respect to the real axis extends continuously to an open subset of the real axis and takes real values there, then the function extends across that subset of the real axis to a function that is holomorphic in the whole symmetric region. Moreover, the extended function maps points that are symmetric with respect to the real axis to image points that are again symmetric with respect to the real axis.

Figure 4. Two domains for $\sqrt{1-z^2}$: (a) domain of $f_2$; (b) domain of $f_3$

Accordingly, the function $i f_1$ extends by reflection to be holomorphic on the plane with a slit along the real axis from $-1$ to $1$. Let $f_2$ denote the corresponding extension of $f_1$ to this slit plane. Since the extension of $i f_1$ maps pairs of complex-conjugate points to complex-conjugate image points, the function $f_2$ has the property that
$$f_2(\bar z) = -\overline{f_2(z)} \tag{1}$$
for every point $z$ in the slit plane.


When $z$ lies in the upper half-plane, $(f_2(z))^2 = (f_1(z))^2 = 1 - z^2$. Two holomorphic functions that agree on an open set agree identically on their common connected domain (by the identity principle from a first course on complex analysis), so $(f_2(z))^2 = 1 - z^2$ on the whole plane with a slit along the real axis from $-1$ to $1$. In other words, $f_2(z)$ gives a well-defined meaning to $\sqrt{1-z^2}$ on this slit plane, shown in Fig. 4(a).

What symmetry property does $f_2$ have? Letting $y$ be a positive real number and setting $z$ equal to $iy$ in equation (1) shows that $f_2(-iy) = -\overline{f_2(iy)}$. Since $f_1$ (hence $f_2$) takes real values on the top half of the imaginary axis, the preceding equation implies that $f_2(-iy) = -f_2(iy)$. In other words, the expression $f_2(z) + f_2(-z)$ is identically equal to zero when $z$ lies on the top half of the imaginary axis. Since zeroes of nonconstant holomorphic functions are isolated, the sum $f_2(z) + f_2(-z)$ is identically equal to zero when $z$ is in the domain of $f_2$. Thus, $f_2$ is an odd (antisymmetric) function on the plane with a slit along the real axis from $-1$ to $1$.
Since $f_2$ maps the upper half-plane bijectively onto the right-hand half-plane with a slit along the real axis from 0 to 1 (as shown in Fig. 3), the reflection principle implies that $f_2$ maps the whole slit plane bijectively to itself. If $y$ is a positive real number, then
$$f_2(f_2(iy)) = f_2\bigl(\sqrt{1+y^2}\bigr) = (-i)\sqrt{y^2} = -iy.$$
The identity principle now implies that the composite function $-f_2 \circ f_2$ is equal to the identity function. In other words, the function $-f_2$ is the inverse of $f_2$. Thus, $f_2$ solves the problem of finding a holomorphic self-mapping of the slit plane with inverse function equal to its negative.

Figure 5. An even function without a square root (the map $z \mapsto 1 - z^2$)
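The properties of $f_2$ can be spot-checked numerically. The sketch below assumes one convenient closed form for $f_2$ that is not stated in the text: $-iz\sqrt{1 - 1/z^2}$ with the principal square root, which is holomorphic off the segment $[-1, 1]$, squares to $1 - z^2$, and agrees with $f_1$ on the upper half of the imaginary axis. The function `naive` is the genuine composite of the principal root with $z \mapsto 1 - z^2$, included for contrast.

```python
import cmath

def f2(z: complex) -> complex:
    """Assumed closed form for the odd branch of sqrt(1 - z^2) off the slit [-1, 1]."""
    return -1j * z * cmath.sqrt(1 - 1 / (z * z))

def naive(z: complex) -> complex:
    """Composite of the principal square root with z -> 1 - z^2 (an even function)."""
    return cmath.sqrt(1 - z * z)

if __name__ == "__main__":
    z = 0.3 + 2j
    print("f2(z)      =", f2(z))
    print("f2(-z)     =", f2(-z))       # negative of f2(z): odd
    print("naive(-z)  =", naive(-z))    # same as naive(z): even
    print("f2(f2(z))  =", f2(f2(z)))    # compare with -z
```

In the upper half-plane the two functions agree; in the lower half-plane they differ by a sign, which is exactly the gap between the mocposite and composite readings.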

The preceding discussion demonstrates that the mocposite function $\sqrt{1-z^2}$ cannot be understood as a composite function on the plane slit along the real axis from $-1$ to $1$, for the function is odd instead of even. Another way to see that $\sqrt{1-z^2}$ cannot be a composite function on the indicated domain is to observe that the function sending $z$ to $1 - z^2$ maps the plane slit along the real axis from $-1$ to $1$ onto the plane slit along the real axis from 0 to 1, as shown in Fig. 5. There is no holomorphic (nor even continuous) square-root function on the latter region, for the region contains the circle centered at 0 with radius 2, and this simple closed curve has winding number about the origin equal to 1.

On a different domain, however, the expression $\sqrt{1-z^2}$ can be understood as an even composite function. Going back to $f_1$ defined on the upper half-plane, observe that $f_1$ extends continuously to the interval $(-1, 1)$ of the real axis, sending a real number $x$ between $-1$ and $1$ to the positive square root $\sqrt{1-x^2}$. By the Schwarz reflection principle, the function $f_1$ extends across this interval of the real axis to a holomorphic function $f_3$ defined on the plane with two slits, one along the real axis from $1$ to $\infty$ and the other along the real axis from $-1$ to $-\infty$. (See Fig. 4(b).) Moreover, $f_3(\bar z) = \overline{f_3(z)}$ for every $z$ in the domain. In particular, if $y$ is a positive real number and $z = iy$, then $f_3(-iy) = \overline{f_3(iy)} = f_3(iy)$ (again since $f_1$, hence $f_3$, takes real values on the top half of the imaginary axis). Therefore, $f_3$ is an even function.

The function sending $z$ to $1 - z^2$ maps the doubly slit plane onto the plane with a left-hand slit along the real axis from $0$ to $-\infty$, which is precisely the domain of the principal branch of the square root (see Fig. 1), and $f_3(z)$ is the composite function $\sqrt{1-z^2}$. To an engineer, this function is the natural interpretation of the symbols $\sqrt{1-z^2}$, not only because the function is composite but also because the reciprocal of this function is the analytic continuation to the doubly slit plane of the derivative of the inverse-sine function used in elementary differential calculus.
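The link between $f_3$ and the inverse sine can be tested numerically. The sketch below takes $f_3$ to be the composite of Python's principal `cmath.sqrt` with $z \mapsto 1 - z^2$ and compares its reciprocal with a central-difference derivative of `cmath.asin`, whose branch cuts lie on the real axis outside $[-1, 1]$; the step size and sample point are arbitrary illustrative choices.

```python
import cmath

def f3(z: complex) -> complex:
    """Even composite branch of sqrt(1 - z^2) on the doubly slit plane."""
    return cmath.sqrt(1 - z * z)

def arcsin_derivative(z: complex, h: float = 1e-6) -> complex:
    """Central-difference approximation to the derivative of the principal arcsine."""
    return (cmath.asin(z + h) - cmath.asin(z - h)) / (2 * h)

if __name__ == "__main__":
    z = 0.5 + 0.7j
    print("1 / f3(z)        =", 1 / f3(z))
    print("d/dz arcsin at z =", arcsin_derivative(z))
    print("f3(-z) - f3(z)   =", f3(-z) - f3(z))
```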
In summary, the bartender's question does not admit a one-word answer. A reasonable but incomplete short answer is, "It depends on the domain of the function." A deeper answer is, "The question is wrong!" The ultimate domain for $\sqrt{1-z^2}$ is not a region in the plane but rather a two-sheeted Riemann surface, and on an abstract surface, the notions of "even" and "odd" lose meaning. The surface can be visualized as two copies of Fig. 4(a) stitched together along the slit, the upper edge of the slit in either sheet being attached to the lower edge of the slit in the other sheet; crossing the slit corresponds to moving from one sheet to the other. Alternatively, joining two copies of Fig. 4(b) results in an equivalent surface.

The construction cannot be implemented physically in three-dimensional space, so this surface exists only in the imagination. To discuss $\sqrt{1-z^2}$ with an engineer, a mathematician has to cut the Riemann surface into two pieces such that each piece projects bijectively to a planar region. The domains shown in Fig. 4 arise from two different ways of cutting the surface apart. More elaborate bisections of the surface produce exotic domains for $\sqrt{1-z^2}$, such as the ones shown in Fig. 6.

Figure 6. Two exotic slit regions: (a) and (b)

Exercise for the reader. On the two planar regions whose boundary slits are indicated in Fig. 6, is $\sqrt{1-z^2}$ a mocposite function? a composite function? an even function? an odd function?
3. WILL YOU JOIN THE DANCE? I invite you to seek out your own examples of mocposite functions. Such functions are easy to find; you need not travel to Wonderland [2] to encounter them. Here are a few more examples to start your feet moving the right way.

A family of mocposite functions arises from the principal branch of the logarithm function $\log z$, which is defined on the complex plane with a slit along the negative part of the real axis and has the property that $e^{\log z} = z$ for every $z$ in the domain. The real part of $\log z$ is equal to the natural logarithm of the modulus of $z$, and the imaginary part of $\log z$ is equal to the argument (angle) of $z$, taken between $-\pi$ and $\pi$.

There is a holomorphic function $g$ on the slit plane such that $e^{g(z)} = z^2$ for every $z$: namely, $g(z) = 2\log z$. Since $g(z)$ is a logarithm of $z^2$, a natural name for $g(z)$ is $\log z^2$. This name represents a mocposite function, for $\log z^2$ cannot mean the composition of a logarithm function with the squaring function. One reason is that the squaring function maps the slit plane onto the plane with a puncture at 0, and there is no holomorphic logarithm function defined on the punctured plane (just as there is no holomorphic square-root function on the punctured plane). A more forceful reason is that $g\bigl(\frac{1+i}{\sqrt{2}}\bigr) = \frac{\pi i}{2}$, but $g\bigl(-\frac{1+i}{\sqrt{2}}\bigr) = -\frac{3\pi i}{2}$, so $g$ lacks the symmetry property that every function of $z^2$ must have. In particular, if $z = -\frac{1+i}{\sqrt{2}}$, then $\log z^2 \ne \log(z^2)$; ouch!
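The discrepancy between $\log z^2$ read as $2\log z$ and read as a composition is easy to reproduce with the principal logarithm `cmath.log`. The sketch below evaluates both readings at the point $z = -(1+i)/\sqrt{2}$ and at its negative; the function names are illustrative.

```python
import cmath
import math

def g(z: complex) -> complex:
    """The mocposite reading of log z^2: twice the principal logarithm."""
    return 2 * cmath.log(z)

def log_of_square(z: complex) -> complex:
    """The composite reading: principal logarithm applied to z squared."""
    return cmath.log(z * z)

z = -(1 + 1j) / math.sqrt(2)

if __name__ == "__main__":
    print("g(z)            =", g(z))
    print("g(-z)           =", g(-z))
    print("log(z * z)      =", log_of_square(z))
```

The two values of `g` differ by $2\pi i$ even though $z$ and $-z$ have the same square, while the composite reading returns the same value at both points.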
There is an analogous mocposite function $\log z^n$ on the slit plane for every integer $n$ greater than 1. More generally, a basic theorem in complex analysis says that if $f$ is a zero-free holomorphic function on a simply connected region of the plane (that is, a region without holes), then there exists a holomorphic function $g$ such that $e^{g(z)} = f(z)$ for every point $z$ in the region.

The standard proof fixes a base point $z_0$ in the region and a complex number $c$ such that $e^c = f(z_0)$. Set $g(z)$ equal to $c + \int_{z_0}^{z} f'(\zeta)/f(\zeta)\,d\zeta$. By Cauchy's theorem, the integral is independent of the path joining $z_0$ to $z$ because the region is simply connected: Two different paths can be deformed into each other without changing the value of the integral. The function $f e^{-g}$ has value 1 at $z_0$, and the derivative of $f e^{-g}$ is equal to zero by the product rule, the chain rule, and the fundamental theorem of calculus. Therefore, $f = e^g$.

The natural name for $g$, a holomorphic logarithm of $f$, is $\log f$. Often, $\log f$ is a mocposite function: The symbols must not be interpreted as a composition $\log \circ f$. Consider, for instance, the sine function on the plane with the infinitely many unbounded vertical slits shown in Fig. 7: for each integer $n$, a slit starting at the point $n\pi$ on the real axis and going up. The zeroes of the sine function are the endpoints of the slits, so the sine function has no zero on the plane with these infinitely many slits, which is a simply connected region. Therefore, a holomorphic logarithm function $\log \sin z$ exists on the region. This function is mocposite, for the sine function maps the infinitely slit plane onto $\mathbb{C} \setminus \{0\}$, the punctured plane, where no holomorphic logarithm function lives: Composition $\log \circ \sin$ is not defined.

Figure 7. Slits for a domain of $\log \sin z$

Exercise for the reader. If the value of the function $\log \sin z$ when $z = \frac{1}{2}\pi$ is 0, then the value when $z = 2\pi + \frac{1}{2}\pi$ is $2\pi i$. More generally, if $n$ is an arbitrary integer, then the value of $\log \sin z$ when $z = (n + \frac{1}{2})\pi$ is $n\pi i$.
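The construction in the standard proof can be carried out numerically for $f = \sin$. The sketch below integrates $f'/f = \cot$ with a trapezoid rule along a three-segment path that dips below the real axis, so that it avoids the upward slits; the base point $\pi/2$ with $c = 0$, the path shape, the `depth` parameter, and the step counts are all illustrative assumptions.

```python
import cmath
import math

def integrate_segment(func, start: complex, end: complex, steps: int = 2000) -> complex:
    """Composite trapezoid rule along the straight segment from start to end."""
    step = (end - start) / steps
    total = 0.5 * (func(start) + func(end))
    for k in range(1, steps):
        total += func(start + k * step)
    return total * step

def log_sin(z: complex, depth: float = 1.0) -> complex:
    """Holomorphic logarithm of sin normalized by log_sin(pi/2) = 0.

    The path runs down from pi/2, across at imaginary part -depth, then
    vertically to z. It avoids the slits as long as Re z is not a multiple
    of pi (or Im z is negative).
    """
    def cot(s: complex) -> complex:
        return cmath.cos(s) / cmath.sin(s)

    z = complex(z)
    z0 = complex(math.pi / 2, 0.0)
    corners = [z0, z0 - 1j * depth, complex(z.real, -depth), z]
    return sum(integrate_segment(cot, a, b) for a, b in zip(corners, corners[1:]))

if __name__ == "__main__":
    for n in (0, 1, 2, -1):
        value = log_sin((n + 0.5) * math.pi)
        print(n, value / math.pi)  # compare with n * i
```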
A mocposite function of a different character is the entire (holomorphic in the entire plane) function $\cos\sqrt{z}$. This expression cannot be understood as a composite function, for $\sqrt{z}$ is not holomorphic in a neighborhood of the origin. Nonetheless, the cosine function has a Maclaurin series containing only even powers of the variable, and replacing this variable by $\sqrt{z}$ produces the power series
$$1 - \frac{z}{2!} + \frac{z^2}{4!} - \frac{z^3}{6!} + \cdots,$$
which converges for every $z$ and thus represents an entire function that can reasonably be named $\cos\sqrt{z}$. This function is perhaps the simplest example of an entire function of fractional order. (The order of an entire function $f$ is the infimum of the positive values of $\rho$ for which $f(z)e^{-|z|^{\rho}}$ is a bounded function of $z$.) Since $\cos z$ is the average of $e^{iz}$ and $e^{-iz}$, the order of $\cos z$ evidently is 1; the order of $\cos\sqrt{z}$ is 1/2.
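The power series can be summed directly, with no square root ever taken. The sketch below (function name and the number of terms are illustrative choices, adequate for moderate $|z|$) builds each term from the previous one and compares the result with the cosine of either square root of $z$.

```python
import cmath

def cos_sqrt(z: complex, terms: int = 60) -> complex:
    """Partial sum of the series with terms (-1)^k z^k / (2k)!."""
    total = 0j
    term = 1 + 0j
    for k in range(terms):
        total += term
        # Ratio of consecutive terms: -z / ((2k + 1)(2k + 2)).
        term *= -z / ((2 * k + 1) * (2 * k + 2))
    return total

if __name__ == "__main__":
    z = 3 + 4j
    root = cmath.sqrt(z)
    print("series      :", cos_sqrt(z))
    print("cos(+root)  :", cmath.cos(root))
    print("cos(-root)  :", cmath.cos(-root))
```

Because the cosine is even, the choice of square root is immaterial, which is why the series defines a single-valued entire function.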
4. SOME HARD-BOILED THINGS CAN BE CRACKED. You might think that mocposite functions are a notational curiosity of no practical importance. On the contrary, a graduate student of engineering came to me in puzzlement recently when she encountered a mocposite function in fracture mechanics. She had read in a book [9, §B.2] about the stress intensity field induced by a crack in a material, the crack being modeled by the interval of the real axis from $-1$ to $1$. The theory requires the following evaluation of an integral involving an arbitrary complex number $z$ lying outside the integration interval:
$$\frac{1}{\pi}\int_{-1}^{1} \frac{\sqrt{1-t^2}}{z-t}\,dt = z - \sqrt{z^2-1}. \tag{2}$$
Since $1 - t^2$ is a positive real number when $-1 < t < 1$, the expression $\sqrt{1-t^2}$ in the integrand means the usual positive square root. Elementary real changes of variable show that the integral is an antisymmetric function of $z$:
$$\int_{-1}^{1}\frac{\sqrt{1-t^2}}{-z-t}\,dt \overset{(s=-t)}{=} \int_{-1}^{1}\frac{\sqrt{1-s^2}}{-z+s}\,ds \overset{(s=t)}{=} -\int_{-1}^{1}\frac{\sqrt{1-t^2}}{z-t}\,dt.$$
Therefore, the right-hand side of equation (2) must be antisymmetric too, but the term $\sqrt{z^2-1}$ does not look antisymmetric to an engineer. As explained in §2, this expression is an odd mocposite function.

Figure 8. An integration contour

The mocposite function $\sqrt{z^2-1}$ might mean either $+i\sqrt{1-z^2}$ or $-i\sqrt{1-z^2}$. Which choice is right for equation (2)? Since the integral on the left-hand side tends to 0 when $|z| \to \infty$, the expression $\sqrt{z^2-1}$ needs to be close to $z$ when $|z|$ is large. The mocposite function $\sqrt{1-z^2}$ constructed in §2 is close to $-iz$ when $|z|$ is large, so $\sqrt{z^2-1}$ correspondingly needs to be interpreted as $+i\sqrt{1-z^2}$.

Exercise for the reader. Verify equation (2), at least when $z$ is a real number greater than 1, via techniques of Calculus II. [Suggestion: Substitute $2u/(1 + u^2)$ for $t$ to reduce the problem to integration of a rational function.]
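Equation (2) can be checked numerically before attempting a proof. The sketch below makes two assumptions that are not in the text: it uses the substitution $t = \cos\theta$ with a midpoint rule instead of the suggested rational substitution, and it realizes the odd branch of $\sqrt{z^2-1}$ by the closed form $z\sqrt{1 - 1/z^2}$ with the principal square root, which is close to $z$ for large $|z|$.

```python
import cmath
import math

def sqrt_z2_minus_1(z: complex) -> complex:
    """Assumed closed form for the odd branch of sqrt(z^2 - 1) off the slit [-1, 1]."""
    return z * cmath.sqrt(1 - 1 / (z * z))

def left_side(z: complex, steps: int = 4000) -> complex:
    """(1/pi) * integral over [-1, 1] of sqrt(1 - t^2)/(z - t) dt, via t = cos(theta)."""
    h = math.pi / steps
    total = 0j
    for k in range(steps):
        theta = (k + 0.5) * h  # midpoint rule on [0, pi]
        total += math.sin(theta) ** 2 / (z - math.cos(theta))
    return total * h / math.pi

def right_side(z: complex) -> complex:
    return z - sqrt_z2_minus_1(z)

if __name__ == "__main__":
    for z in (2, -2, 1 + 1j, 0.3 - 2j):
        print(z, left_side(z), right_side(z))
```

With the even composite reading of the square root, the two sides would disagree for $z$ in the left half-plane, so the check also illustrates why the odd branch is the one equation (2) requires.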
The appearance of branch issues on the right-hand side of equation (2) suggests that complex contour integration is the most natural way to evaluate the integral. One procedure is to integrate $\sqrt{1-w^2}/(z-w)$ with respect to $w$ along a path consisting of a circle (oriented counterclockwise) with large radius $R$ and an ellipse (oriented clockwise) that surrounds the slit on the real axis (Fig. 8). By the residue theorem, this integral equals $-2\pi i\sqrt{1-z^2}$ or $-2\pi\sqrt{z^2-1}$.

On the large circle, the expression $\sqrt{1-w^2}$ is $-iw + O(1/R)$, whence the integral over the circle is
$$\int_{\text{circle}} \frac{-iw}{z-w}\,dw + O(1/R).$$
By the residue theorem, the preceding expression equals $-2\pi z + O(1/R)$. Cauchy's theorem implies that the integral over the circle is independent of $R$ (as long as $R$ is large enough that the point $z$ is inside the circle), so the value actually is exactly $-2\pi z$. Accordingly,
$$-2\pi\sqrt{z^2-1} = -2\pi z + \int_{\text{ellipse}} \frac{\sqrt{1-w^2}}{z-w}\,dw.$$
Now let the ellipse collapse down to the slit. When $w$ has a positive imaginary part and approaches a real value $t$ between $-1$ and $1$, the quantity $\sqrt{1-w^2}$ approaches the positive value $\sqrt{1-t^2}$. The mocposite function $\sqrt{1-w^2}$ is antisymmetric, so when $w$ has a negative imaginary part and approaches a real value $t$ between $-1$ and $1$, the quantity $\sqrt{1-w^2}$ approaches the negative value $-\sqrt{1-t^2}$. The top part of the ellipse approaches the slit oriented from left to right, and the bottom part of the ellipse approaches the slit oriented from right to left. Accordingly,
$$\int_{\text{ellipse}} \frac{\sqrt{1-w^2}}{z-w}\,dw \quad\text{approaches}\quad 2\int_{-1}^{1}\frac{\sqrt{1-t^2}}{z-t}\,dt.$$
The conclusion is that
$$-2\pi\sqrt{z^2-1} = -2\pi z + 2\int_{-1}^{1}\frac{\sqrt{1-t^2}}{z-t}\,dt.$$
Dividing by $2\pi$ shows that equation (2) holds.


The trick of letting a contour collapse down to a slit when the integrand involves a (noninteger) power of $1 - t^2$ is an old idea. An early instance of this technique appears in the first volume of Mathematische Annalen in a paper by Hermann Hankel containing a discussion [7, §3] of integral representations of Bessel functions (special functions that appear in problems of mathematical physics involving cylindrical symmetry). One special case of Hankel's theory is the representation of the Bessel function $J_0(z)$ as
$$\frac{1}{\pi}\int_{-1}^{1}\frac{e^{izt}}{\sqrt{1-t^2}}\,dt.$$
Hankel carefully explains how he understands the expression $\sqrt{1-t^2}$ when $t$ is outside the interval of the real axis between $-1$ and $1$: namely, as the product of suitably chosen branches of $\sqrt{1-t}$ and $\sqrt{1+t}$. A sequel to this paper [8] was published two years after Hankel's untimely death at age 34 from a stroke [11]. Despite the clear account of branches in the original article, George Neville Watson trips up in his exposition of Hankel's work half a century later [10, Chap. 6] by incautiously claiming noninteger powers of $t^2 - 1$ to be even functions (and by integrating over a contour not lying in any region where $\sqrt{t^2-1}$ can be defined as a holomorphic function¹).
The mocposite function on the right-hand side of equation (2) appears in another engineering application, one dealing with airplane wings. A version of the Joukowski$^2$ airfoil map sends a nonzero complex number $z$ to the average of $z$ and $1/z$. At least formally, this map is an inverse of the right-hand side of equation (2). Indeed,
$$\frac{1}{z-\sqrt{z^2-1}} = \frac{z+\sqrt{z^2-1}}{z^2-(z^2-1)} = z+\sqrt{z^2-1},$$
1 Experts will see how to salvage Watson's derivation by integrating a suitable holomorphic one-form over an appropriate cycle in a Riemann surface.
2 Famous in his native land, Nikolai Egorovich Zhukovskii (1847–1921) is the father of Russian aviation. In his French publications (notably the 1916 book Aérodynamique) the usual transliteration of his name is Joukowski, the spelling by which his map is commonly designated in the English literature.





so, as required,
$$\frac{1}{2}\left(z-\sqrt{z^2-1} + \frac{1}{z-\sqrt{z^2-1}}\right) = z.$$
What is needed in addition to this formal calculation is a consideration of domains. The first observation is that the Joukowski map sending $z$ to $\frac12(z+\frac1z)$ is a two-to-one mapping from $\mathbb{C}\setminus\{0\}$, the punctured plane, onto the whole plane $\mathbb{C}$. Indeed, if $c$ is an arbitrary complex number, then saying that $\frac12(z+\frac1z) = c$ is equivalent to saying that $z^2+1 = 2cz$, so there are two solutions for $z$ (counting multiplicity). Moreover, the symmetry between $z$ and $1/z$ reveals that the Joukowski function maps each of the regions $\{\, z\in\mathbb{C} : 0<|z|<1 \,\}$ and $\{\, z\in\mathbb{C} : |z|>1 \,\}$ one-to-one onto the same image. If $\theta$ is a real number, then $\frac12(e^{i\theta}+e^{-i\theta}) = \cos\theta$, so the Joukowski function maps the unit circle two-to-one onto the segment of the real axis between $-1$ and $1$.
Consequently, the Joukowski function maps the punctured unit disk bijectively onto the plane with a slit from $-1$ to $1$ and maps the exterior of the unit disk bijectively onto the same image. The expression on the right-hand side of equation (2) is the inverse of one of these two functions. Since $z-\sqrt{z^2-1}$ is close to $0$ when the modulus of $z$ is large, this expression is the inverse of the restriction of the Joukowski function to the punctured unit disk. The Joukowski function is plainly odd (antisymmetric), and the inverse of an odd function is odd, so the preceding argument reconfirms the oddness of the mocposite function $\sqrt{1-z^2}$.
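The domain discussion above can be illustrated with a short numerical experiment. This is a sketch under an assumption that is not spelled out in the text: the branch of $\sqrt{z^2-1}$ off the slit is computed as $z\sqrt{1-1/z^2}$ with the principal square root. The names `joukowski` and `inverse_branch` are hypothetical helpers introduced only for this illustration.

```python
import cmath

def joukowski(z):
    """The Joukowski map: the average of z and 1/z."""
    return 0.5 * (z + 1 / z)

def inverse_branch(z):
    """z - sqrt(z^2 - 1), using sqrt(z^2 - 1) = z * sqrt(1 - 1/z^2) off the slit [-1, 1]."""
    return z - z * cmath.sqrt(1 - 1 / (z * z))

# For each sample point off the slit, report:
#   whether the image lies in the unit disk,
#   the round-trip error joukowski(inverse_branch(z)) - z,
#   the oddness defect inverse_branch(-z) + inverse_branch(z).
for z in (2 + 0j, -1.5 + 0.25j, 0.3j, 4 - 3j):
    w = inverse_branch(z)
    print(z, abs(w) < 1, abs(joukowski(w) - z), abs(inverse_branch(-z) + w))
```

Each row should report an image inside the unit disk, a round-trip error near zero, and an oddness defect near zero.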
5. COMPLETED MY DESIGN. After both analysis and application, my story
about mocposite functions, symmetry, and analytic continuation has come full circle. I
hope that you have returned to the starting point at a new level on the Riemann surface
of understanding. Here is your exit exam.
Exercise for the reader. Show that on the plane with a slit along the real axis from $-1$ to $1$, the function $\sqrt{1-\frac{1}{z^2}}$ is even and composite, and $\sqrt{1-z^2} = -iz\sqrt{1-\frac{1}{z^2}}$.
My secondary theme is that we mathematicians often commit expository solecisms by using confusing or ambiguous expressions, such as $\sqrt{1-z^2}$, even though we purport to value rigor and precision. Lewis Carroll, from whose works I have borrowed my section titles, memorably chaffed eggheads for this shortcoming:
“When I use a word,” Humpty Dumpty said, in rather a scornful tone, “it means just what I choose it to mean – neither more nor less.” [3]

Was Humpty Dumpty a mathematician?


REFERENCES
1. F. Cajori, A History of Mathematical Notations. Vol. 1. Open Court Publishing, Chicago, 1928.
2. L. Carroll, Alice's Adventures in Wonderland. Macmillan, London, 1866.
3. L. Carroll, Through the Looking-Glass. Macmillan, London, 1872.
4. A. Cauchy, Mémoire sur une nouvelle théorie des imaginaires, et sur les racines symboliques des équations et des équivalences, C. R. Acad. Sci. Paris 24 (1847) 1120–1130.
5. R. Descartes, Discours de la méthode pour bien conduire sa raison et chercher la vérité dans les sciences, plus la dioptrique, les météores et la géométrie qui sont des essais de cette méthode, J. Maire, Leyde, 1637.
6. W. R. Hamilton, Theory of conjugate functions, or algebraic couples; with a preliminary and elementary essay on algebra as the science of pure time, Trans. Roy. Irish Acad. 17 (1837) 293–422, http://www.jstor.org/stable/30078796.
7. H. Hankel, Die Cylinderfunctionen erster und zweiter Art, Math. Ann. 1 (1869) 467–501, https://doi.org/10.1007/BF01445870.
8. H. Hankel, Bestimmte Integrale mit Cylinderfunctionen, Math. Ann. 8 (1875) 453–470, https://doi.org/10.1007/BF02106596.
9. K. Hellan, Introduction to Fracture Mechanics. McGraw-Hill, New York, 1984.
10. G. N. Watson, A Treatise on the Theory of Bessel Functions. Cambridge Univ. Press, Cambridge, 1922.
11. W. v. Zahn, Einige Worte zum Andenken an Hermann Hankel, Math. Ann. 7 (1874) 583–590, https://doi.org/10.1007/BF02104927.
HAROLD P. BOAS received a Ph.D. in 1980 from the Massachusetts Institute of Technology. He taught at Columbia University for four years before joining the faculty of Texas A&M University, where he currently is Regents Professor and Presidential Professor for Teaching Excellence. He edited the book-review column in this journal during 1998–1999 and served as editor of the Notices of the American Mathematical Society during 2001–2003.
Department of Mathematics, Texas A&M University, College Station TX 77843
boas@tamu.edu

100 Years Ago This Month in The American Mathematical Monthly
Edited by Vadim Ponomarenko
An indication of the great decrease in the number of new publications in the natural and exact sciences, occasioned by the European war, is the size of Vol. 37 (1915) of Naturae Novitates. It contains only 340 pages, as compared to more than 500 pages in 1914, and upwards of 620 pages each for the years of 1912 and 1913.
Upon the occasion of the celebration of the seventieth birthday of the distinguished Swedish mathematician, Professor M. G. MITTAG-LEFFLER, he and his wife set aside their entire fortune to be used in founding an international institute for pure mathematics. The Acta Mathematica has been edited by Professor Mittag-Leffler since its founding in 1882.
Excerpted from “Notes and News” 23 (1916) 184–188.





A New Look at Surfaces of Constant Curvature
Tom M. Apostol and Mamikon A. Mnatsakanian
Abstract. Equality of zonal areas on a sphere and pseudosphere is extended by elementary
geometric methods to surfaces of revolution of constant total (Gaussian) curvature, and constant mean (Delaunay) curvature. Bicycle wheels are used to trace profiles of these surfaces.
Surprisingly, cycloids appear as limiting cases of such profiles.

1. INTRODUCTION.
Tractrix and pseudosphere. A tractrix is the trajectory of the rear wheel of a bicycle whose front wheel, at constant distance from the rear wheel, moves along a fixed
straight line. For example, in Figure 1a the straight line is the x axis, and the bicycle
shown from above has length k between its wheels. The rear wheel is placed initially
at (0, k), and as the front wheel moves along the positive x axis the tractrix is the path
traced by the rear wheel. We shall refer to this as the first quadrant portion of a full
tractrix, shown in Figure 1b, which is the curve with a center of symmetry obtained
by first reflecting the first quadrant portion through the y axis and then reflecting this
extended curve through the x axis, the asymptote of the tractrix. We denote by k the constant length of the tangent segment (the distance between the bicycle wheels) from the tractrix to the asymptote. The tractrix has a cusp of height k above the asymptote and a reflected cusp below the asymptote. Figure 1c shows a pseudosphere, the name commonly given to the surface of revolution generated by rotating a tractrix about its asymptote.

Figure 1. (a) First quadrant tractrix with tangent segments of constant length k. (b) Full tractrix. (c) Surface of revolution obtained by rotating the tractrix in (b) about its asymptote is a pseudosphere of pseudoradius k.
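The bicycle description of the tractrix can be tried out numerically. The sketch below is illustrative only and rests on two assumptions that are not taken from the article: the rear wheel's velocity is the component of the front wheel's velocity along the frame (the usual rigid-frame, no-sideslip model), and the first quadrant tractrix has the standard parametrization $x = k(t-\tanh t)$, $y = k/\cosh t$ with the front wheel at $(kt, 0)$. The function names are hypothetical.

```python
import math

def rear_velocity(rear, front_x):
    """Rear-wheel velocity when the front wheel, at (front_x, 0), moves with velocity (1, 0)."""
    dx = front_x - rear[0]
    dy = -rear[1]
    length = math.hypot(dx, dy)
    ux, uy = dx / length, dy / length   # unit vector along the frame
    speed = ux                          # component of (1, 0) along the frame
    return (speed * ux, speed * uy)

def trace_rear_wheel(k, s_end, steps=3000):
    """RK4 integration of the rear wheel, starting at (0, k) with the front wheel at the origin."""
    x, y = 0.0, float(k)
    h = s_end / steps
    for i in range(steps):
        s = i * h
        k1 = rear_velocity((x, y), s)
        k2 = rear_velocity((x + 0.5 * h * k1[0], y + 0.5 * h * k1[1]), s + 0.5 * h)
        k3 = rear_velocity((x + 0.5 * h * k2[0], y + 0.5 * h * k2[1]), s + 0.5 * h)
        k4 = rear_velocity((x + h * k3[0], y + h * k3[1]), s + h)
        x += h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6
        y += h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6
    return x, y

def tractrix_point(k, s):
    """Point of the standard tractrix when the front wheel is at (s, 0)."""
    t = s / k
    return k * (t - math.tanh(t)), k / math.cosh(t)

print(trace_rear_wheel(2.0, 3.0), tractrix_point(2.0, 3.0))
```

The simulated rear-wheel position and the parametrized tractrix point should agree closely, and the distance between the wheels should stay at k throughout.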
The pseudosphere is so named because many of its properties are analogous to those of a sphere. A well known analogy concerns curvature. The surface of a sphere has constant positive Gaussian curvature, whereas the surface of a pseudosphere has constant negative Gaussian curvature (see [3, p. 282]). Another analogy concerns surface area. It is known (from integral calculus) that the pseudosphere has surface area $4\pi k^2$, the same as that of a sphere of radius k, obtained by rotating a circle about its diameter. Because of this analogy, it is reasonable to refer to distance k as the pseudoradius of the pseudosphere. A lower-dimensional analogy relates the tractrix and circle: the region in Figure 1b bounded by the full tractrix has area equal to that of a circular disk of radius k (see [2, p. 13]).
http://dx.doi.org/10.4169/amer.math.monthly.123.5.439
MSC: Primary 51M25


Surface areas of pseudospherical and spherical zones. In [1], further analogies relating the pseudosphere and sphere were obtained. In particular, the following
theorem was proved. It is illustrated in Figure 2, and describes a zone-by-zone area
property relating surface areas of pseudospherical and spherical zones. This property
was also extended in [1] to n-pseudospheres.
Theorem 1. If a pseudosphere of pseudoradius k is cut into zones by planes perpendicular to its asymptote, then the lateral surface area of each zone is equal to that of a
corresponding zone on a circular cylinder of radius k, hence also on a spherical zone
of radius k. In particular, the pseudosphere and sphere have equal surface areas.
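A brief calculus check of Theorem 1 may help; it is a sketch and not the geometric proof given in [1]. It assumes that zones correspond when they span the same range of distances $y$ from the asymptote, and it introduces $y$ and the arclength $s$ as illustrative symbols. Because the tangent segment from the tractrix to the asymptote has constant length $k$, the profile satisfies $dy/ds = -y/k$.

```latex
\[
dA = 2\pi y\,ds = 2\pi y\left(-\frac{k}{y}\right)dy = -2\pi k\,dy,
\qquad
A = \int_{y_2}^{y_1} 2\pi k\,dy = 2\pi k\,(y_1 - y_2).
\]
% Letting y run from k (the cusp) down to 0 gives 2*pi*k^2 for one half of the
% pseudosphere, hence 4*pi*k^2 for the whole surface, as for a sphere of radius k.
```

Under these assumptions the zone area depends only on $y_1 - y_2$, which is exactly the area of a zone of height $y_1 - y_2$ on a cylinder of radius $k$.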

Figure 2. Each pseudospherical zone has surface area equal to that of a cylindrical zone which, in turn, is that
of a spherical zone.

2. SURFACES OF CONSTANT GAUSSIAN CURVATURE. It is a pleasant surprise to learn that the equality of surface areas of zones on a pseudosphere and on a
sphere, as illustrated in Figure 2, can be extended to surface areas of zones on any
surface of revolution of constant Gaussian curvature (positive or negative).
Figure 3a shows a surface of constant positive curvature $K$. If $\rho = 1/\sqrt{K}$, it is known that $\rho^2 = Rr$, where $R$ and $r$ are the two principal radii of curvature at an arbitrary point $P$ of the profile curve (which we call a Gaussian profile) that is rotated to produce the surface (see [3, p. 131]). In Figure 3a, $R$ is the radius of curvature of the Gaussian profile at $P$, and (by a theorem of Meusnier [3, p. 122]) $r$ is the length of the normal from $P$ to the axis of rotation. A small arc of length $\Delta l$ at point $P$ on the profile curve sweeps out an elemental zone on the surface of area $\Delta S$ given by $\Delta S = 2\pi x\,\Delta l$, where $x$ is the distance of $P$ from the axis of rotation. But $x = r\cos\theta$, where $\theta$ is the angle shown, hence $\Delta S = 2\pi r\cos\theta\,\Delta l$. But $\Delta l = R\,\Delta\theta$, so $\Delta S = 2\pi rR\cos\theta\,\Delta\theta = 2\pi\rho^2\cos\theta\,\Delta\theta$. But this is also the area of a zone on a sphere of radius $\rho$ swept by rotating a circular arc of length $\rho\,\Delta\theta$. Figure 3b shows the result of integration with respect to $\theta$ of the elements in Figure 3a from $\theta = \theta_1$ to $\theta = \theta_2$. On the surface of constant curvature $1/\rho^2$, the area of the shaded zone whose normals at the ends are subtended by angles $\theta_1$ and $\theta_2$ is equal to that of the corresponding spherical zone of radius $\rho$ whose normals are subtended by the same angles (Figure 3c). Figure 3d shows a zone of the same area on another surface of constant curvature $1/\rho^2$.

Figure 3. Zonal areas on surfaces of revolution of a given positive constant curvature. Panel labels: spindle type, general, sphere, bulge type.
The analysis in Figure 3 can be reformulated as follows. Consider the family of
all surfaces of revolution with parallel axes of revolution having the same constant
positive curvature. For any member of this family, consider the zone defined by two
normals at its ends making two given fixed angles with the axis of revolution. Then we
have the following.
Equality of zonal areas on surfaces of constant positive curvature. The surface
area of such a zone is the same for every member of the family which, in particular, is
the area of the corresponding zone for the spherical member of the family.
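The common zonal area can be written out explicitly. The following is a sketch in notation chosen here for illustration: $\rho$ denotes the radius with $\rho^2 = 1/K = Rr$, and $\theta$ denotes the angle of the normal for which $x = r\cos\theta$, so that the elemental area is $2\pi\rho^2\cos\theta\,d\theta$.

```latex
\[
S = \int_{\theta_1}^{\theta_2} 2\pi\rho^2\cos\theta\,d\theta
  = 2\pi\rho^2\,(\sin\theta_2 - \sin\theta_1).
\]
% The result involves only rho and the two end angles, not the particular
% profile, which is why every member of the family has the same zonal area.
% For the spherical member, theta_1 = -pi/2 and theta_2 = pi/2 give 4*pi*rho^2.
```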
The same analysis applies to families of solids of revolution of constant negative curvature, as indicated in Figure 4. The surface area of the shaded region in Figure 4b whose normals at the ends of the zone are subtended by angles $\theta_1$ and $\theta_2$ is equal to that of the corresponding zone on a pseudosphere in Figure 4c having the same constant curvature. The same is true for the example in Figure 4d.

Figure 4. Zonal areas on surfaces of revolution of a given negative constant curvature. Panel labels: general, hyperboloid type, pseudosphere, conic type.
Equality of zonal areas on surfaces of constant negative curvature. The surface
area of such a zone is the same for every member of the family which, in particular, is
the area of the corresponding zone for the pseudospherical member of the family.
In Figure 2, the equality of zonal areas on the sphere and pseudosphere was indicated by corresponding horizontal lines at the ends of the zones. We can rotate the
sphere as indicated in Figure 5 so that both the pseudosphere and sphere have the same
horizontal axis of revolution. In this diagram, zones determined by corresponding parallel pairs of normals to the two surfaces have equal areas, as indicated by numbers 1
and 2, 2 and 3, etc., through 6 and 7. This unifies the equality property of zonal areas
for both types of constant curvature, positive and negative, as presented in Figures 3
and 4.
Figure 5. Zones determined by corresponding pairs of normals have equal areas.

We can rotate a Gaussian profile repeatedly through higher and higher dimensions to obtain an n-dimensional Gaussian surface (not necessarily of constant curvature) as was done in [1] with the n-pseudosphere. The equal zone area property that was established in [1] for the n-sphere and n-pseudosphere also holds for an n-dimensional Gaussian surface (via an n-cylindroid) similar to the result for the 3-dimensional case. Details are omitted.
3. PROFILES OF SURFACES WITH CONSTANT CURVATURE.
Using bicycles to trace Gaussian profiles. Just as a tractrix is traced by the rear
wheel of a bicycle, we will show that the same is true of any Gaussian profile. The
constant length of the bicycle (the distance between the rear and front wheels) will
provide us with the constant k that serves as the constant radius of curvature of the
Gaussian profile. First we note the following geometrical property inherent in any
bicycle motion. It concerns the line passing through the axle of a bicycle wheel that is
perpendicular to the plane containing the outer circumference of the wheel.
Lemma 1. The two lines through the two axles of a bicycle's wheels intersect at the center of curvature of the path traced by the rear wheel, regardless of the direction of the front wheel.
Sketch of proof. First, assume the front wheel makes a fixed acute angle $\alpha$ with the frame of the bicycle, as in Figure 6a. Then the wheels trace concentric circles and the lines through the axles meet at their common center, denoted by S. So for this motion, S is the center of curvature of both trajectories. Let r denote the radial distance from S to the rear wheel. Then the radial distance from S to the front wheel is the hypotenuse of a right triangle with legs r and k, where k is the constant distance between the rear and front wheels, as indicated in Figure 6a. Note that $r = k/\tan\alpha$. In the limiting case $\alpha = \pi/2$ the radial distance r shrinks to 0, as shown in Figure 6b. We can continually change the direction of motion of the front wheel by allowing $\alpha$ to vary. The rear wheel always points in a direction along the frame, as indicated in Figures 6c and 6d.

Figure 6. (a) Bicycle wheels trace concentric circles if the front wheel makes a fixed angle $\alpha$ with the frame. (b) Limiting case with $\alpha = \pi/2$. As the front wheel changes its direction, as in (c) and (d), the rear wheel moves in a direction along the frame.
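The right-triangle relation in the concentric-circle case is easy to tabulate. This is a small illustrative sketch; the symbol `alpha` for the steering angle and the helper name `turning_radii` are assumptions made here, not notation confirmed by the article.

```python
import math

def turning_radii(k, alpha):
    """Radii of the concentric circles traced by the rear and front wheels.

    k is the distance between the wheels; alpha (radians, 0 < alpha <= pi/2)
    is the fixed angle between the front wheel and the frame.
    """
    rear = k / math.tan(alpha)    # r = k / tan(alpha)
    front = math.hypot(rear, k)   # hypotenuse of the right triangle with legs r and k
    return rear, front

# As alpha grows toward pi/2, the rear radius shrinks toward 0
# and the front radius approaches k.
for alpha in (0.1, math.pi / 4, 1.5, math.pi / 2):
    print(alpha, turning_radii(1.0, alpha))
```

The last row shows the limiting case in which the rear wheel pivots in place while the front wheel circles it at distance k.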
Now let's see what happens when the front wheel moves from an initial position 1 to a nearby second position 2 with the angle changing by a small amount $\Delta\alpha$, as illustrated in Figure 7a. The motion of the front wheel from 1 to 2 has two components, shown in Figure 7b, a radial component along the frame to an intermediate position i, and a transverse component perpendicular to the frame from i to 2. Figures 7c and 7d indicate two different ways to reach the second position as a combination of radial and transverse motions. In Figure 7c, the front wheel moves radially in the direction of the frame to an intermediate position i, and then turns at right angles and rotates through angle $\Delta\alpha$ to position 2. In Figure 7d, the front wheel turns at right angles from position 1 and rotates through angle $\Delta\alpha$ to an intermediate position i before advancing radially to position 2. In either case, the rear wheel follows the front wheel at constant distance k. During the actual motion in Figure 7a, both wheels undergo instantaneous rotation around their respective centers of curvature, and the lines through the two axles intersect as asserted in the lemma.

Figure 7. (a) Front wheel moves from position 1 to nearby position 2, and rear wheel follows at distance k. (b) Radial and transverse components of the motion in (a). Two ways, (c) and (d), to decompose the motion in (a) into radial and transverse components.
Tracing Gaussian profiles of surfaces with negative curvature. A tractrix rotated to produce a pseudosphere of pseudoradius k serves as the prototype of a Gaussian profile traced by the rear wheel of a bicycle of length k, which, when rotated, produces a surface of constant negative Gaussian curvature $K = -1/k^2$. We refer to such a profile as a negative Gaussian profile. Figure 8a shows the tractrix, and Figures 8b and 8c show a general negative Gaussian profile, denoted by $G^-$, which will be rotated around an axis of revolution, indicated as a vertical line.
Figure 8. (a) Rotation of a tractrix (a prototype of a negative Gaussian profile) about its asymptote generates
a pseudosphere. Rotation of a negative Gaussian profile generates a surface of constant negative Gaussian
curvature of hyperboloid type (b) if the front wheel of the bicycle always lies to the left of the axis of revolution,
and of conic type (c) if the front wheel always lies to the right of the axis of revolution.

Take a bicycle of length k with its rear wheel to the left of the axis of revolution, and consider two cases: (1) The front wheel always lies to the left of the axis, as in Figure 8b, and (2) the front wheel always lies to the right of the axis as in Figure 8c. In both cases, the rear wheel is tangent to $G^-$ as shown in Figures 8b and 8c. Draw the line through the axle of the rear wheel perpendicular to the bicycle. It intersects the axis of revolution at a point designated as S, which, by Meusnier's theorem ([3, p. 122]) is one of the two principal centers of curvature of the surface of revolution generated by $G^-$ with corresponding radius of curvature r, the distance from the rear wheel to S. According to Lemma 1, the lines through the two axles of the bicycle's wheels intersect at the second center of curvature of the surface generated by $G^-$, denoted by C, with corresponding radius of curvature R, the distance from C to $G^-$ as indicated in Figure 8b. Note that the two centers of curvature C and S lie on opposite sides of $G^-$. The front wheel moves along a line through S, either away from S or towards S. The right triangles in Figure 8b with common side k show that $rR = k^2$, so the constant negative Gaussian curvature of the surface of revolution is $K = -1/k^2$, just as in the case of a pseudosphere, which is the limiting case in which the front wheel moves along the axis of rotation.
In addition to the pseudosphere, there are two types of surfaces so generated, depending on the position of the front wheel. If it always lies to the left of the axis, as in Figure 8b, the surface is said to be of hyperboloid type, and if it always lies on the other side the surface is said to be of conic type, as in Figure 8c.
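The relation $rR = k^2$ can be checked numerically in the pseudosphere case, where the profile is a tractrix and the axis of revolution is its asymptote. This is a sketch under assumptions made here: the tractrix is given by the standard parametrization $x = k(t-\tanh t)$, $y = k/\cosh t$, derivatives are approximated by central differences, and the helper names are hypothetical.

```python
import math

def tractrix(k, t):
    """Standard tractrix with asymptote along the x-axis and cusp at (0, k)."""
    return k * (t - math.tanh(t)), k / math.cosh(t)

def radii_product(k, t, h=1e-3):
    """Numerical estimate of r * R at the tractrix point with parameter t > 0."""
    (x0, y0), (x1, y1), (x2, y2) = tractrix(k, t - h), tractrix(k, t), tractrix(k, t + h)
    # First and second derivatives by central differences.
    dx, dy = (x2 - x0) / (2 * h), (y2 - y0) / (2 * h)
    ddx, ddy = (x2 - 2 * x1 + x0) / h**2, (y2 - 2 * y1 + y0) / h**2
    speed = math.hypot(dx, dy)
    # R: radius of curvature of the profile curve.
    R = speed**3 / abs(dx * ddy - dy * ddx)
    # r: length of the normal from the point to the axis (the x-axis here).
    r = y1 * speed / abs(dx)
    return r * R

k = 2.0
for t in (0.5, 1.0, 2.0):
    print(t, radii_product(k, t), k * k)
```

At each sampled point the product should match $k^2$ up to the finite-difference error.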
Tracing Gaussian profiles of surfaces with positive curvature. Figure 9a shows a circle $G^+$ that serves as the prototype of a positive Gaussian profile that produces a spherical surface of constant positive Gaussian curvature $1/k^2$. In this special case, the circle is traced by the rear wheel of a bicycle of length k, while the front wheel moves perpendicular to the line connecting it to the common center of curvature C of the lines through the two axles, which coincides with a point S on the axis of revolution. Figures 9b and 9c show a general positive Gaussian profile $G^+$ traced by the rear wheel of a bicycle of length k. The two principal centers of curvature of the surface generated by $G^+$ are on opposite sides of $G^+$, with corresponding radii r and R, with r being the distance from $G^+$ to S. By Lemma 1, the lines through the two axles of the bicycle intersect at one of the centers, indicated by $C'$, with corresponding radius of curvature R. The reflection of $C'$ through the tangent line to $G^+$ is the point C at distance R on the other side of $G^+$. The two right triangles with common side k show that $rR = k^2$, so the constant positive Gaussian curvature of the surface of revolution is $K = 1/k^2$.
In addition to the sphere, there are two types of surfaces so produced, bulge type as in Figure 9b, and spindle type as in Figure 9c. The type is determined by the location of $G^+$ relative to the axis of rotation. For a bulge type, $G^+$ always lies to the left of the axis, and for a spindle type it crosses the axis.

Figure 9. (a) Rotation of a circle (a prototype of a positive Gaussian profile) generates a spherical surface. (b) Rotation producing a bulge type surface. (c) Rotation producing a spindle type surface.
Cycloid as limiting case of a Gaussian profile. Figure 10 shows what happens as r becomes very large and R becomes very small, while their product remains constant, $rR = k^2$. The Gaussian profile becomes asymptotically a cycloid! To see why, refer to Figure 10b. It shows one arch of a cycloid, which we denote by G, traced by a point P on a disk of diameter d rolling along a line that is parallel to the axis of revolution and at distance H from it. It is known (see [2, p. 368]) that the cycloid G is the evolute of another cycloid $G'$ also generated by a rolling disk of diameter d. A string of length OL unwrapped from the cusp O on $G'$ traces the arc on G with one endpoint at P. The tangent of length t to $G'$ at C, when extended through a distance t to G, intersects G as a normal at P. Therefore, G has radius of curvature $R = 2t$ at P, and the distance r of P from the center of curvature S is given by $r = t + H/\cos\theta$, where $\theta$ is the angle indicated in Figure 10b. But $\cos\theta = t/d$, so $r = t + Hd/t$. Therefore, $Rr = 2Hd + 2t^2 = 2Hd + 2d^2\cos^2\theta = 2Hd\,(1 + (d/H)\cos^2\theta)$. If $H \gg d$, this is asymptotic to $2Hd$, a quantity independent of $\theta$ or the position of P on the cycloid.

Figure 10. Gaussian profile becomes more and more like a cycloid as r becomes arbitrarily large and R becomes arbitrarily small, with their product being constant.
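The asymptotic claim can be made concrete with a few numbers. The sketch below simply evaluates the product from the relations in the paragraph above; the value $R = 2t$ is inferred from the stated product $Rr = 2Hd + 2t^2$, the angle symbol `theta` is an assumption, and the function name is illustrative.

```python
import math

def cycloid_radii_product(H, d, theta):
    """R * r for the cycloid profile, with t = d*cos(theta), R = 2t, r = t + H/cos(theta)."""
    t = d * math.cos(theta)
    R = 2 * t
    r = t + H / math.cos(theta)
    return R * r

# With H much larger than d, the ratio to 2*H*d stays within d/H of 1
# for every angle, so the product is nearly constant along the arch.
H, d = 1000.0, 1.0
for theta in (0.0, 0.5, 1.0, 1.5):
    print(theta, cycloid_radii_product(H, d, theta) / (2 * H * d))
```

Each printed ratio equals $1 + (d/H)\cos^2\theta$, which here differs from 1 by at most one part in a thousand.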
Tracing Delaunay profiles of surfaces of revolution with constant mean curvature.
The catenoid is an example of a minimal surface of revolution. It is also a special
surface of constant mean curvature M. In general, mean curvature is defined by
$$M = \frac{1}{2}(\kappa_1 + \kappa_2), \tag{1}$$
where $\kappa_1$ and $\kappa_2$ are the two principal curvatures at an arbitrary point P of the profile
curve that is rotated to produce the surface. Minimal surfaces of revolution, such as the
catenoid, have M = 0. They were studied in the 19th century by Delaunay who showed
that the only surfaces of constant mean curvature of revolution are those obtained by
rotating three types of catenary curves: parabolic, elliptic, and hyperbolic. A parabolic
catenary is the classical hanging chain catenary (discussed in [2, p. 346]). The
general type is defined as the locus of the focus of a conic that rolls along a line
without slipping. In classical differential geometry, these special surfaces of revolution are described parametrically in terms of elliptic functions (see [3, p. 288]). In
[2, Sec. 3.12], these curves were investigated by elementary geometric methods, and
associated areas and arclengths were determined. Animations showing these curves being traced can be found on the website
www.mathcurve.com/courbes2d/delaunay/delaunay.shtml
which also compares their surfaces of revolution with surfaces of constant Gaussian, or total, curvature K given by
$$K = \kappa_1\kappa_2. \tag{2}$$
Bonnet discovered the following surprising theorem relating surfaces of revolution with constant Gaussian curvature and those with constant mean curvature. A proof of Bonnet's theorem is given in [3, p. 278].

Figure 11. Gaussian profile $G^+$ traced by a bicycle of length k and two parallel profiles $D^+$ and $D^-$ traced by the two rear training wheels of a child's bicycle, each of which is at the same distance k from the frame.

Theorem 2. Let $G^+$ be the profile of a surface of constant Gaussian curvature $1/k^2$. Then there are two surfaces parallel to $G^+$ that have constant mean curvature $\frac{1}{2k}$ and $-\frac{1}{2k}$. The directed distances of these profiles from $G^+$ are k and $-k$, respectively.
Figure 11 shows an example of a Gaussian profile $G^+$ traced by the rear wheel of a bicycle of length k. The two equidistant parallel surfaces with constant mean curvatures of opposite signs could be traced by the two training wheels of a child's bicycle, each at distance k from the frame, as suggested in Figure 11.
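The mechanism behind Theorem 2 can be illustrated with a few lines of arithmetic. This is a sketch, not the proof in [3]: it assumes that moving to a parallel surface at directed distance a changes the principal radii of curvature from $r_1, r_2$ to $r_1 - a, r_2 - a$, a sign convention chosen here, so which of the two parallel surfaces gets the positive mean curvature depends on that convention. The function name is hypothetical.

```python
def parallel_mean_curvature(r1, r2, a):
    """Mean curvature of the parallel surface at directed distance a.

    Assumes the principal radii r1, r2 become r1 - a and r2 - a on the
    parallel surface (a sign convention chosen for this sketch).
    """
    return 0.5 * (1.0 / (r1 - a) + 1.0 / (r2 - a))

k = 2.0
for r1 in (0.5, 1.0, 3.0, 10.0):
    r2 = k * k / r1   # Gaussian curvature 1/(r1*r2) equals 1/k^2
    print(r1, parallel_mean_curvature(r1, r2, k), parallel_mean_curvature(r1, r2, -k))
```

Whatever value of $r_1$ is chosen, the two printed mean curvatures have magnitude $1/(2k)$ and opposite signs, because $r_1 r_2 = k^2$ makes the numerator and denominator proportional.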
Cycloid as limiting case of a Delaunay profile. We have already noted that the cycloid is a limiting case of a Gaussian profile (Figure 10). Figure 12 shows the cycloid in Figure 10b (rotated through a right angle) and two parallel profiles which are also cycloids. These profiles are limiting cases of Delaunay profiles that can be traced by the training wheels of a child's bicycle as indicated in Figure 11. In Figure 12, a cycloidal bicyclix, shown dashed, is traced by the front wheel of a bicycle of length k whose rear wheel follows a cycloid. The profile traced by the rightmost training wheel is the involute of this cycloid and is known to be another cycloid (see [2, p. 368]). Similarly, the profile traced by the leftmost training wheel is also a cycloid. These limiting profiles provide another illustration of Theorem 2.

Figure 12. Cycloidal bicyclix (dashed) traced by the front wheel of a bicycle of length k whose rear wheel follows a cycloid. Two parallel profiles, also cycloids, are traced by the training wheels of a child's bicycle, each of which is at the same distance k from the frame.





REFERENCES
1. T. M. Apostol, M. A. Mnatsakanian, Volume/surface area relations for n-dimensional spheres, pseudospheres, and catenoids, Amer. Math. Monthly 122 (2015) 745–756.
2. T. M. Apostol, M. A. Mnatsakanian, New Horizons in Geometry. Dolciani Mathematical Expositions No. 47, Mathematical Association of America, Washington, DC, 2012.
3. E. Kreyszig, Differential Geometry. Mathematical Expositions No. 11, Univ. of Toronto Press, Toronto; Oxford Univ. Press, London, 1959.
TOM M. APOSTOL joined the Caltech faculty in 1950 and became professor emeritus in 1992. He is director of Project MATHEMATICS! (http://www.projectmathematics.com), an award-winning series of videos he initiated in 1987. His long career in mathematics is described in the September 1997 issue of The College Mathematics Journal. He was selected as a Fellow of the American Mathematical Society in 2012. He is currently working with colleague Mamikon Mnatsakanian to produce materials demonstrating Mamikon's innovative and exciting approach to mathematics.
California Institute of Technology, 253-37 Caltech, Pasadena, CA 91125.
apostol@caltech.edu

MAMIKON A. MNATSAKANIAN received a Ph.D. in physics in 1969 from Yerevan University, where he became professor of astrophysics. As an undergraduate he began developing innovative geometric methods for solving many calculus problems by a dynamic and visual approach that makes no use of formulas. He is currently working with Tom Apostol under the auspices of Project MATHEMATICS! to present his methods in a multimedia format.
California Institute of Technology, 253-37 Caltech, Pasadena, CA 91125.
mamikon@caltech.edu


Hardy's Reduction for a Class of Liouville Integrals of Elementary Functions
Jaime Cruz-Sampedro and Margarita Tetlalmatzi-Montiel
In memory of Jaime Cruz-Sampedro
Abstract. This paper is concerned with a class of integrals whose integrands are the product of a rational function times the exponential of a nonconstant rational function. We call these Liouville integrals. For these integrals, we provide a student-friendly algorithm producing a two-term decomposition with minimum transcendental and maximum elementary components. This decomposition fulfills the conditions of Hardy's reduction theory, determines whether these integrals are elementary functions, and when in the affirmative, finds them. To achieve our goal, we use partial fraction decomposition, simple notions of linear algebra, and a special case of an 1835 theorem of Liouville that we refer to as Liouville's criterion on integration. There is in the literature a complete algorithm to decide if the integral of an elementary function is also elementary. Ours is a gentle alternative for the class of Liouville integrals.

1. INTRODUCTION. Deciding if the integral of an elementary function is also an elementary function is an important question that has been studied since the time of Newton and Leibniz. Largely based on the work of Liouville [8], Risch [11], and Rosenlicht [13], a lot of progress has been made on this problem during the last two centuries [1, 2, 3, 4, 5, 7, 12, 15, 17]. Yet there are classes of integrals of interest to the calculus student for which this question does not have a complete answer. An example of these is the class of integrals of the form $\int x^r e^{ax^s}\,dx$, with $r, s$ integers. The aim of this paper is to examine the above question for a class of integrals of this type that we will refer to as the class of Liouville integrals. This class is precisely the object of study of the following special case of a theorem of Liouville [8, 9, 12, 13].
Theorem (Liouville's criterion on integration, 1835). Let $f$ and $g$ be rational functions with $g$ nonconstant. Then $\int f(x)e^{g(x)}\,dx$ is an elementary function if and only if there exists a rational function $R$ such that $\int f(x)e^{g(x)}\,dx = R(x)e^{g(x)}$ or, equivalently,
\[ f(x) = R(x)g'(x) + R'(x). \tag{1} \]

This criterion is frequently used to show the calculus student that certain classical integrals such as $\operatorname{erf}(x) = \frac{2}{\sqrt{\pi}}\int_0^x e^{-u^2}\,du$, $\operatorname{Li}(x) = \int_2^x \frac{du}{\ln u}$, and $\operatorname{Si}(x) = \int_0^x \frac{\sin u}{u}\,du$ cannot be expressed in terms of elementary functions, but in spite of its essential role in determining the nonelementary character of important integrals in applications, the relevance of this result in concrete situations has been confined to a few subclasses of the class of Liouville integrals [9, 10, 14].
In this paper, we use Liouville's criterion on integration to provide a student-friendly algorithm that produces a decomposition that fulfills the conditions of Hardy's
http://dx.doi.org/10.4169/amer.math.monthly.123.5.448
MSC: Primary 00A05, Secondary 26A09

© THE MATHEMATICAL ASSOCIATION OF AMERICA [Monthly 123




reduction theory that we describe below. In a certain sense, this decomposition provides the minimum transcendental and the maximum elementary components of the Liouville integrals. As a result, we can determine whether these integrals are elementary functions and, when they are, find them. In addition to Liouville's criterion, the essential tools to achieve our goal are partial fraction decomposition and simple notions from linear algebra.
Concerning the whole class of elementary functions, advanced readers may find
in [3, 11] a complete algorithm to determine if the integral of any given elementary
function is also elementary. In regard to free and commercial software for the class
of integrals examined in this work, the reader may compare the answers furnished by
those programs with the ones provided by our alternative algorithm.
Naively speaking, an elementary function is one that can be expressed through
addition, product, division, and composition of rational, exponential, logarithmic, and
trigonometric functions and their inverses. A precise definition of elementary function as well as a complete proof of Liouville's theorem can be found in [12, 13].
Integrals that can be expressed in terms of elementary functions will be referred to as elementary integrals or as integrals that can be integrated in finite terms. Unless specified otherwise, all polynomials, rational functions, and vector spaces considered below will be over the field $\mathbb{C}$ of the complex numbers.
2. MAIN RESULTS. We begin with a result that will surely be of interest to the
calculus student.
Theorem 1. Let $a$ be a complex number and $r, s$ rational numbers such that $sa \neq 0$. Then $\int x^r e^{ax^s}\,dx$ is elementary if and only if $(r + 1)/s$ is a positive integer.
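As a quick illustration of how the condition of Theorem 1 can be checked in practice, the following Python sketch tests whether $(r + 1)/s$ is a positive integer using exact rational arithmetic. The function name `is_elementary` and its signature are our own choices for illustration and are not part of the paper.

```python
from fractions import Fraction

def is_elementary(r, s, a=1):
    """Apply the test of Theorem 1 to the integral of x**r * exp(a * x**s).

    r and s must be rational (int, Fraction, or a string such as "1/2");
    a is any nonzero complex number.
    """
    r, s = Fraction(r), Fraction(s)
    if s == 0 or a == 0:
        raise ValueError("Theorem 1 assumes s * a != 0")
    ratio = (r + 1) / s
    # Elementary exactly when (r + 1)/s is a positive integer.
    return ratio > 0 and ratio.denominator == 1

print(is_elementary(1, 2))           # integrand x * exp(x^2)
print(is_elementary(0, 2))           # integrand exp(x^2)
print(is_elementary(-1, 1))          # integrand exp(x) / x
print(is_elementary("-1/2", "1/2"))  # integrand exp(sqrt(x)) / sqrt(x)
```

Under these assumptions the first and last calls should report an elementary integral, while the two classical examples $e^{x^2}$ and $e^x/x$ should not.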
Using Euler's identity, we obtain the following.
Corollary 1. Let $\alpha, \beta$ be real numbers and $r, s$ rational numbers with $s(\alpha + i\beta) \neq 0$. Then $\int x^r e^{\alpha x^s}\cos(\beta x^s)\,dx$ and $\int x^r e^{\alpha x^s}\sin(\beta x^s)\,dx$ are elementary functions if and only if $(r + 1)/s$ is a positive integer.
Our search for this result was motivated by students' questions and encouraged by a classical theorem of Tchebychef that states that if $r, s, t$ are rational and $a, b$ are real numbers with $abs \neq 0$, then $\int x^r (a + bx^s)^t\,dx$ is elementary if and only if any of the numbers $t$, $(r + 1)/s$, or $(r + 1)/s + t$ is an integer [17].
Several particular cases of Theorem 1 for integer r and s can be found in the literature [9, 10, 14] without any reference to the condition that these numbers satisfy. To
prove this theorem, we note first that a change of variables allows us to assume that r
and s are integers. Then we integrate by parts to reduce the problem to a finite number
of cases and finish the proof by applying Liouville's criterion on integration.
Our first result toward the main goal of this work is concerned with the class of Liouville integrals with $f$ and $g$ in the space $\mathcal{P}$ of polynomials with coefficients in $\mathbb{C}$.
Theorem 2. For any given polynomial $g \in \mathcal{P}$ with $k = \deg g \geq 2$, let
\[ E_g = \left\{ f \in \mathcal{P} : \int f(x)e^{g(x)}\,dx \text{ is an elementary function} \right\} \]
and $N_g = \operatorname{span}\{1, x, \dots, x^{k-2}\}$. Then
\[ E_g = \operatorname{span}\{x^j g'(x) + j x^{j-1} : j = 0, 1, \dots\} \]
and $\mathcal{P} = N_g \oplus E_g$.
The basic idea of the proof is as follows. Suppose $\int f(x)e^{g(x)}\,dx$ is elementary and $R(x) = \sum_{j=0}^{n} b_j x^j$ satisfies (1). Then
\[ f(x) = \sum_{j=0}^{n} b_j \left(x^j g'(x) + j x^{j-1}\right). \]
Since $(x^j g'(x) + j x^{j-1})e^{g(x)} = (x^j e^{g(x)})'$, all we need to show is that the linear span of the class of functions $x^j g'(x) + j x^{j-1}$, $j = 0, 1, \dots$, contains the whole class of polynomials $f$ that can be integrated against $e^{g(x)}$ in finite terms. We achieve this using Liouville's criterion on integration.
Before we turn to the general case of rational functions f and g, we have the
following.
Corollary 2. Let $g$ be as in Theorem 2. Then for any $f \in \mathcal{P}$ there are unique polynomials $P$ and $u$, with $\deg u \leq k - 2$, such that
\[ \int f(x)e^{g(x)}\,dx = \int u(x)e^{g(x)}\,dx + P(x)e^{g(x)}. \tag{2} \]
Moreover, the left side of (2) is elementary if and only if $u \equiv 0$, and then its value is $P(x)e^{g(x)}$. If $u \not\equiv 0$, the nonelementary integral on the right side is minimal in the sense that for any polynomials $v$ and $Q$ satisfying
\[ \int f(x)e^{g(x)}\,dx = \int v(x)e^{g(x)}\,dx + Q(x)e^{g(x)}, \tag{3} \]
we have $\deg v \geq \deg u$.
Corollary 2 shows that if the left side of (2) is nonelementary, we only have to worry about the $k - 1$ nonelementary integrals $\int x^j e^{g(x)}\,dx$, $j = 0, 1, \dots, k - 2$. Hardy refers to these integrals as independent transcendents [7]. Corollary 2 shows too that decomposition (2) fulfills the conditions of Hardy's reduction theory. In the words of Hardy:
Such a reduction theory endeavours in each case
1. to split up any integral of the class under consideration into the sum of a number of
parts of which some are elementary and the others are not;
2. to reduce the number of the latter terms to the least possible;
3. to prove that these terms are incapable of further reduction, and are genuinely new and
independent transcendents.
The Integration of Functions of a Single Variable, p. 46, [7].

In view of this fact, (2) will be referred to as Hardy's reduction of the given integral.
To find this reduction, we only have to apply the following simple algorithm: write f
in terms of the basis of P given by Theorem 2. This algorithm can be performed by
a calculus student and easily be accomplished using any mathematical software. The
underlying idea of the algorithm for any Liouville integral is the same.
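The algorithm just described can be sketched in a few lines of Python, assuming polynomials are stored as coefficient lists (lowest degree first) with rational entries. The name `hardy_reduction` and the list representation are illustrative assumptions; eliminating degrees from the top down is only one possible way of writing $f$ in the basis of Theorem 2.

```python
from fractions import Fraction

def hardy_reduction(f, g):
    """Split f as u + sum_j P[j] * phi_j, where phi_j = x^j g' + j x^(j-1).

    f and g are coefficient lists, lowest degree first, with deg g >= 2.
    Returns (u, P): the coefficients of u (deg u <= deg g - 2) and of P, so that
    integral(f e^g) = integral(u e^g) + P e^g.
    """
    k = len(g) - 1
    if k < 2 or g[k] == 0:
        raise ValueError("g must have degree at least 2")
    dg = [i * Fraction(g[i]) for i in range(1, k + 1)]  # coefficients of g'
    rem = [Fraction(c) for c in f]
    P = [Fraction(0)] * max(len(rem) - (k - 1), 0)
    # phi_j has degree j + k - 1, so eliminate degrees from the top down.
    for d in range(len(rem) - 1, k - 2, -1):
        j = d - (k - 1)
        c = rem[d] / dg[-1]
        P[j] = c
        for i, coeff in enumerate(dg):  # subtract c * x^j * g'(x)
            rem[i + j] -= c * coeff
        if j >= 1:                      # subtract c * j * x^(j-1)
            rem[j - 1] -= c * j
    return rem[:k - 1], P

# f(x) = 10x^4 - 3x + 1 integrated against exp(x^5)
u, P = hardy_reduction([1, -3, 0, 0, 10], [0, 0, 0, 0, 0, 1])
print(u, P)
```

For this input the sketch should return $u(x) = 1 - 3x$ and $P(x) = 2$; the integral is elementary exactly when every entry of `u` is zero.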
Example 1. Find Hardy's reduction of $\int f(x)e^{ax^2/2}\,dx$ for any polynomial $f$ and $a \in \mathbb{C} \setminus \{0\}$.
In this case, we have $g(x) = ax^2/2$ and
\[ \mathcal{P} = \operatorname{span}\{1\} \oplus \operatorname{span}\{ax, ax^2 + 1, ax^3 + 2x, \dots\}. \]
If $f$ is of degree $n$, we can find $A_0, A_1, \dots, A_n \in \mathbb{C}$ such that
\[ f(x) = A_0 + \sum_{j=0}^{n-1} A_{j+1}\left(ax^{j+1} + jx^{j-1}\right). \]
Thus, we have
\begin{align*}
\int f(x)e^{ax^2/2}\,dx &= A_0\int e^{ax^2/2}\,dx + \sum_{j=0}^{n-1} A_{j+1}\int \left(ax^{j+1} + jx^{j-1}\right)e^{ax^2/2}\,dx \\
&= A_0\int e^{ax^2/2}\,dx + \sum_{j=0}^{n-1} A_{j+1}x^j e^{ax^2/2}.
\end{align*}
The polynomials $u(x) = A_0$ and $P(x) = \sum_{j=0}^{n-1} A_{j+1}x^j$ are computable in the sense that $A_0, A_1, \dots, A_n$ can explicitly be found for any given $f$; moreover, the integral on the left side is elementary if and only if $A_0 = 0$, and then its value is given by $P(x)e^{ax^2/2}$.
Example 1 is a variant of a fact stated in [6] that says that if $P$ is a polynomial with coefficients in $\mathbb{R}$, then $\int P(x)e^{-x^2}\,dx$ is elementary if and only if $P$ is in the linear span of the class of Hermite polynomials $H_n(x) = (-1)^n e^{x^2}\frac{d^n}{dx^n}e^{-x^2}$ with $n \geq 1$. In view of the close connection between $e^{-x^2}$ and Hermite polynomials, we don't think this fact can be generalized for polynomials integrated against $e^g$ for any $g \in \mathcal{P}$.
The following result is a natural generalization of Theorem 2 for $f$ and $g$ in the space $\mathcal{Q}$ of rational functions with coefficients in $\mathbb{C}$.
Theorem 3. For any given nonconstant rational function $g \in \mathcal{Q}$, let
\[ E_g = \left\{ f \in \mathcal{Q} : \int f(x)e^{g(x)}\,dx \text{ is an elementary function} \right\}. \]
Then a basis for $E_g$ is given by
\[ \left\{ \frac{g'(x)}{(x-\alpha)^j} - \frac{j}{(x-\alpha)^{j+1}},\ g'(x),\ x^j g'(x) + jx^{j-1} : \alpha \in \mathbb{C},\ j = 1, 2, \dots \right\}. \tag{4} \]

The essential idea of the proof consists in using Liouville's criterion on integration to show that $E_g$ is the linear span of (4). In analogy to the polynomial case, we will see below that this result provides a simple algorithm to find a Hardy's reduction of the type (2) for any Liouville integral. This algorithm only requires partial fraction decomposition and a little linear algebra.
Note that the dimension of the space $E_g$ in Theorem 3 has the cardinality of the continuum and so does its co-dimension as a subspace of $\mathcal{Q}$. We will see that in concrete situations it is possible (and convenient) to work with appropriate subspaces of $\mathcal{Q}$ that have a countable basis.
There are several classes of integrals for which the above Hardy's reduction, and
thus their elementary character, can be established by transforming these into Liouville
integrals.
1. If $F$ and $G$ are rational functions, with $G$ nonconstant, and $p, q$ are positive integers, then the substitution $e^x = u^{pq}$ gives
\[ \int F\left(e^{x/p}\right)e^{G(e^{x/q})}\,dx = \int \frac{pq\,F(u^q)}{u}\,e^{G(u^p)}\,du. \]
If $k$ is an integer, a simple case is $\int e^{e^{kx}}\tanh x\,dx = \int \frac{u^2 - 1}{u(u^2 + 1)}e^{u^k}\,du$, which happens to be nonelementary for all nonzero $k$ (see Section 3). Since $\cosh x$ and $\sinh x$ are rational functions of $e^x$, this example includes the case in which $F$ and $G$ are rational functions of the form $F(\cosh x, \sinh x, \dots, \sinh mx)$ and $G(\cosh x, \sinh x, \dots, \sinh nx)$.
2. If $F(x, y)$ and $G(x, y)$ are rational functions in two variables, then the substitution $u = \tan(\theta/2)$ furnishes
\[ \cos\theta = \frac{1 - u^2}{1 + u^2}, \qquad \sin\theta = \frac{2u}{1 + u^2}, \qquad d\theta = \frac{2\,du}{1 + u^2}. \]
Thus,
\[ \int F(\cos\theta, \sin\theta)e^{G(\cos\theta, \sin\theta)}\,d\theta = \int f(u)e^{g(u)}\,du, \]
where $f(u) = \frac{2}{1+u^2}F\left(\frac{1-u^2}{1+u^2}, \frac{2u}{1+u^2}\right)$ and $g(u) = G\left(\frac{1-u^2}{1+u^2}, \frac{2u}{1+u^2}\right)$. Here we assume that $g$ is not constant. This example includes the case in which $F$ and $G$ are rational functions of the form $F(\cos x, \sin x, \dots, \sin mx)$ and $G(\cos x, \sin x, \dots, \sin nx)$.
3. Let $F$ and $G$ be rational functions and $p, q$ positive integers. If $g(u) = G(u^p) + u^{pq}$ is nonconstant, then setting $\ln x = u^{pq}$ we have $dx = pq\,u^{pq-1}e^{u^{pq}}\,du$, and thus,
\[ \int F\left((\ln x)^{1/p}\right)e^{G((\ln x)^{1/q})}\,dx = \int pq\,u^{pq-1}F(u^q)\,e^{g(u)}\,du. \]
Simple cases are given by $\int \frac{dx}{\ln x} = \int \frac{e^u}{u}\,du$ and by $\int \sqrt[p]{\ln x}\,dx = \int p\,u^p e^{u^p}\,du$, which turn out to be nonelementary for all $p \geq 2$ (see Section 4).
Examples 1 and 3 can be generalized, as in the following one, to the case in which $F$ and $G$ are rational functions of $n$ and $m$ variables, respectively.
4. Suppose $F(x_1, \dots, x_n)$ and $G(x_1, \dots, x_m)$ are rational functions with $G$ nonconstant. If $M$ is the product of the $n + m$ positive integers $p_1, \dots, p_n$ and $q_1, \dots, q_m$, the change of variables $x = u^M$ provides
\[ \int F\left(x^{1/p_1}, \dots, x^{1/p_n}\right)e^{G(x^{1/q_1}, \dots, x^{1/q_m})}\,dx = \int f(u)e^{g(u)}\,du, \]
where $f(u) = Mu^{M-1}F\left(u^{M/p_1}, \dots, u^{M/p_n}\right)$ and $g(u) = G\left(u^{M/q_1}, \dots, u^{M/q_m}\right)$.
5. If $f$ and $g$ are rational functions with real coefficients and $g$ is nonconstant, we can use Euler's identity and the fact that $\int f(x)\cos g(x)\,dx$ and $\int f(x)\sin g(x)\,dx$ are determined, up to additive constants, by their restrictions to any nonempty open interval of the real axis to show that these integrals are elementary if and only if $\int f(x)e^{ig(x)}\,dx$ is.
In the last section of this work, we provide an algorithm to find a Hardy's reduction for the Liouville-like integrals $\int f(x)\log g(x)\,dx$ and $\int f(x)\arctan g(x)\,dx$, where $f$ and $g$ are rational functions, with $g$ nonconstant. Our main tool to study these integrals will be the following criterion of integration in finite terms that Hardy derived from a theorem of Liouville [7].
Theorem (Liouville-Hardy's criterion on integration, 1905). If $f$ is a rational function, then $\int f(x)\log x\,dx$ is elementary if and only if there exist a rational function $g$ and a constant $C$ such that $f(x) = g'(x) + C/x$.
3. A USEFUL LEMMA. To begin, we use Liouville's criterion to establish a property of rational functions with simple poles that plays an important role in this work.
Lemma 1. Let $g$ and $h$ be rational functions both continuous at a fixed $a \in \mathbb{C}$. If $h(a) \neq 0$ and $g$ is nonconstant, then $\int \frac{h(x)}{x-a}e^{g(x)}\,dx$ is not elementary.
Proof. If $\int \frac{h(x)}{x-a}e^{g(x)}\,dx$ is elementary, then by Liouville's criterion on integration there is a rational function $R$ such that
\[ \frac{h(x)}{x - a} = R(x)g'(x) + R'(x). \]
Since $h(a) \neq 0$ and $g$ is continuous at $x = a$, $R$ has a pole at $x = a$. Thus, there is an integer $m \geq 1$ and a rational function $r$ continuous at $x = a$, with $r(a) \neq 0$, such that $R(x) = r(x)/(x - a)^m$. A short calculation yields
\[ h(x)(x - a)^{m-1} = r'(x) - \frac{m\,r(x)}{x - a} + g'(x)r(x), \]
which is impossible since the left side is continuous at $x = a$ but the right is not.
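For readers who wish to see the short calculation in the proof of Lemma 1 spelled out, one possible route, using only the quotient rule and the notation of the proof, is the following.

```latex
\begin{align*}
R(x) &= \frac{r(x)}{(x-a)^m}, \qquad
R'(x) = \frac{r'(x)}{(x-a)^m} - \frac{m\,r(x)}{(x-a)^{m+1}},\\[4pt]
\frac{h(x)}{x-a} &= R(x)g'(x) + R'(x)
 = \frac{1}{(x-a)^m}\left( g'(x)r(x) + r'(x) - \frac{m\,r(x)}{x-a} \right),\\[4pt]
h(x)(x-a)^{m-1} &= r'(x) - \frac{m\,r(x)}{x-a} + g'(x)r(x).
\end{align*}
```

The last line is obtained by multiplying the previous one by $(x-a)^m$. Because $m \geq 1$, the left side is continuous at $x = a$; on the right, $r'$ and $g'r$ are continuous at $x = a$, while $m\,r(x)/(x-a)$ has a pole there since $r(a) \neq 0$.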
The following corollary quickly recovers a classical example that is often of interest
to the calculus student.
Corollary 3. For any positive integer $n$ and $a \in \mathbb{C} \setminus \{0\}$, the integral $\int \frac{e^{ax}}{x^n}\,dx$ is not an elementary function.
Proof. Integrating by parts yields $\int \frac{e^{ax}}{x^{n+1}}\,dx = -\frac{e^{ax}}{nx^n} + \frac{a}{n}\int \frac{e^{ax}}{x^n}\,dx$. Thus, by induction, only the case $n = 1$ needs to be treated, but this follows at once from Lemma 1.
As is well known, taking $a = i$ and using Euler's identity, we find that for all positive integers $n$ the classical integrals $\int \frac{\cos x}{x^n}\,dx$ and $\int \frac{\sin x}{x^n}\,dx$ are not elementary. Taking $a = n = 1$, neither is $\int \frac{du}{\log u} = \int \frac{e^x}{x}\,dx$.

4. PROOF OF THEOREM 1. Now we turn to the proof of our first result.
Proof. We begin by assuming that $r = n$ and $s = k$ are integers with $k > 0$ fixed. Defining $I_n = \int x^n e^{ax^k}\,dx$ and integrating by parts, we find
\[ I_n = \frac{x^{n-k+1}}{ka}e^{ax^k} - \frac{(n - k + 1)}{ka}I_{n-k}. \tag{5} \]
Setting $n = k - 1$ in (5), we see that $I_{k-1}$ is elementary; in addition, for $n \neq k - 1$ we have that $I_n$ is elementary if and only if $I_{n-k}$ is. We claim that $I_m$ is not elementary for $m = -1, 0, \dots, k - 2$. Thus, $I_n$ is elementary if and only if there is an integer $l \geq 0$ such that $n = k - 1 + kl$, that is to say, if and only if $(n + 1)/k$ is a positive integer. To prove our claim, note first that the case $m = -1$ follows from Lemma 1. Suppose now that $0 \leq m \leq k - 2$ and that $I_m$ is elementary. By Liouville's criterion on integration, $x^m = (p/q)\,akx^{k-1} + (p/q)'$ for some relatively prime polynomials $p$ and $q$. A short calculation gives
\[ \left(x^m q - p' - akx^{k-1}p\right)q = -pq'. \]
Since $p$ and $q$ are relatively prime, a multiplicity argument about the zeroes of $q$ and $q'$ implies that $q$ is a nonzero constant $C$. Thus, we have
\[ Cx^m - p' - akx^{k-1}p = 0. \]
Note next that $0 \leq m \leq k - 2$; thus a simple degree argument shows that we must have $p \equiv 0$, but this leads to the contradiction $Cx^m \equiv 0$.
Our assertion for a negative integer $k$ follows from the positive case by making $u = 1/x$. Thus, the result is valid when $r$ and $s$ are integers with $sa \neq 0$.
Suppose now that $r = p/q$ and $s = c/d$, with $cdq \neq 0$. Setting $x = u^{dq}$ yields
\[ \int x^r e^{ax^s}\,dx = dq\int u^{d(p+q)-1}e^{au^{cq}}\,du. \]
Thus, the given integral is elementary if and only if $(r + 1)/s = d(p + q)/cq$ is a positive integer.
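Theorem 1 shows also that, iterating (5), any $I_n$ can be reduced to the sum of an elementary function and $I_r$, for some $r \in \{-1, 0, \dots, k - 2\}$. For example,
\begin{align*}
\int \frac{e^{x^5}}{x^{21}}\,dx &= -\frac{x^{-20}}{20}e^{x^5} + \frac{5}{20}I_{-16} \\
&= -\left(\frac{x^{-20}}{20} + \frac{5x^{-15}}{20\cdot 15} + \frac{5^2x^{-10}}{20\cdot 15\cdot 10} + \frac{5^3x^{-5}}{20\cdot 15\cdot 10\cdot 5}\right)e^{x^5} + \frac{5^4}{20\cdot 15\cdot 10\cdot 5\cdot 1}\int \frac{e^{x^5}}{x}\,dx.
\end{align*}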
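The iteration of (5) for negative exponents can be sketched in Python as follows. The sketch solves (5) for the lower-index integral, $I_m = \frac{x^{m+1}}{m+1}e^{ax^k} - \frac{ka}{m+1}I_{m+k}$, and repeats until the index lands in $\{-1, \dots, k-2\}$. The function name `reduce_negative` and the dictionary representation of the elementary part are assumptions made for illustration.

```python
from fractions import Fraction

def reduce_negative(n, k, a=1):
    """Iterate (5) upward from I_n (n < -1) until the index is in {-1, ..., k-2}.

    Returns (terms, coeff, r) such that
    I_n = sum(c * x**e for e, c in terms.items()) * exp(a * x**k) + coeff * I_r.
    """
    a = Fraction(a)
    terms = {}            # exponent of x -> rational coefficient
    coeff = Fraction(1)   # coefficient multiplying the remaining integral
    m = n
    while m < -1:
        # I_m = x^(m+1) e^(a x^k) / (m+1) - (k a / (m+1)) I_(m+k)
        terms[m + 1] = coeff / (m + 1)
        coeff *= -k * a / (m + 1)
        m += k
    return terms, coeff, m

terms, coeff, r = reduce_negative(-21, 5)
print(terms)
print(coeff, r)
```

For $n = -21$, $k = 5$, $a = 1$, the remaining integral should be $I_{-1}$ with coefficient $5^4/(20\cdot 15\cdot 10\cdot 5) = 1/24$, in agreement with the worked example.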

The reader can verify that if $n = kl + r \geq 0$, with $0 \leq r \leq k - 1$, then
\[ I_n = (-1)^l\,\frac{(n - k + 1)!^{(k)}}{(ka)^l}\left[\sum_{j=0}^{l-1}(-1)^{j-1}\frac{(ka)^j x^{kj+r+1}}{(kj + r + 1)!^{(k)}}\,e^{ax^k} + I_r\right], \]
while if $n = kl + r < 0$, with $-1 \leq r \leq k - 2$, then
\[ I_n = -\sum_{j=0}^{|l|-1}\frac{(ka)^j x^{n+kj+1}}{|n + 1|\cdots|n + kj + 1|}\,e^{ax^k} + \frac{(ka)^{|l|}}{|n + 1|!^{(k)}}\,I_r. \]
It turns out that these expressions are in fact Hardy's reduction of $I_n$ introduced in (2). We remind the reader that, for $n \geq 1 - k$,
\[ n!^{(k)} = \begin{cases} 1, & \text{if } n = 1 - k, 2 - k, \dots, -1, 0; \\ n\,(n - k)!^{(k)}, & \text{if } n \geq 1. \end{cases} \]
As usual, for $k = 1, 2, 3$, we write $n!$, $n!!$, and $n!!!$.
5. PROOF OF THEOREM 2. Next we prove Theorem 2 and obtain Hardy's reduction of $\int f(x)e^{g(x)}\,dx$ stated in (2).
Proof. Let $k$ denote the degree of $g$. If $k = 1$, there is nothing to prove. So we may assume $k \geq 2$ and define $N_g = \operatorname{span}\{1, x, \dots, x^{k-2}\}$. If $\varphi_j = x^j g'(x) + jx^{j-1}$, it is easily verified by induction that $\{\varphi_j : j = 0, 1, \dots\}$ is linearly independent. Moreover, we clearly have $\mathcal{P} = N_g \oplus \operatorname{span}\{\varphi_j : j = 0, 1, \dots\}$ and
\[ \int \varphi_j(x)e^{g(x)}\,dx = x^j e^{g(x)} + C. \tag{6} \]
Thus, if we prove that $\int u(x)e^{g(x)}\,dx$ is not elementary for all $u \in N_g \setminus \{0\}$, then it will follow that $E_g = \operatorname{span}\{\varphi_j : j = 0, 1, \dots\}$; therefore, Theorem 2 will be proved.
Let $u \in N_g \setminus \{0\}$ and suppose $\int u(x)e^{g(x)}\,dx$ is elementary. By Liouville's criterion on integration, $u = (p/q)g' + (p/q)'$ for some relatively prime polynomials $p$ and $q$. A short calculation yields
\[ (uq - p' - pg')q = -pq'. \]
Since $p$ and $q$ are relatively prime, a multiplicity argument about the zeroes of $q$ and $q'$ implies that $q$ is a nonzero constant $C$. Thus, we have
\[ Cu - p' - pg' = 0. \]
Since $\deg u \leq k - 2$ and $\deg g' = k - 1$, a simple degree argument again shows that $p \equiv 0$ and thus $u \equiv 0$, but this contradicts the choice of $u$.
Now we prove Corollary 2.
Proof. In view of (6), the existence of a Hardy's reduction of $\int f(x)e^{g(x)}\,dx$ of the type (2) follows from the fact that $\mathcal{P} = N_g \oplus E_g$. To prove the uniqueness of $u$ and $P$, suppose
\[ \int f(x)e^{g(x)}\,dx = \int u_1 e^{g(x)}\,dx + P_1(x)e^{g(x)} = \int u_2 e^{g(x)}\,dx + P_2(x)e^{g(x)}, \]
where $P_1, P_2$ are polynomials and both $u_1$ and $u_2$ have degree at most $k - 2$. Thus, $u_1, u_2 \in N_g$. If we set $P = P_2 - P_1$, then $u_1 - u_2 = P' + Pg' \in E_g \cap N_g = \{0\}$. Thus, $u_1 = u_2$ and $P' + Pg' = 0$. Solving this last differential equation, we obtain $P(x) = Ce^{-g(x)}$, where $C$ is a constant. Since $P$ is a polynomial and $g$ is nonconstant, we must have $C = 0$ and therefore $P_1 = P_2$. Suppose now that $u$ is as in (2) and that $v$ satisfies (3). Since $\int (v(x) - u(x))e^{g(x)}\,dx$ is elementary, we have $v = u + h$ with $h \in E_g$. If $h \not\equiv 0$, then $\deg h \geq k - 1 > \deg u$. Thus, $\deg v \geq \deg u$.
Example 2. Find Hardy's reduction of
\[ \int (10x^4 - 3x + 1)e^{x^5}\,dx, \]
and decide if this integral is elementary. According to Theorem 2, $g(x) = x^5$, $f(x) = 10x^4 - 3x + 1$, $E_g = \operatorname{span}\{\varphi_j(x) = 5x^{j+4} + jx^{j-1} : j = 0, 1, 2, \dots\}$, and $N_g = \operatorname{span}\{1, x, x^2, x^3\}$. Thus, $f(x) = 1 - 3x + 2\varphi_0(x)$, and Hardy's reduction is
\[ \int (10x^4 - 3x + 1)e^{x^5}\,dx = \int (1 - 3x)e^{x^5}\,dx + 2e^{x^5}. \]
Since $u(x) = 1 - 3x$ is not the zero polynomial, the given integral is not elementary.
Remark. The interested reader may compare our Hardy's reduction for this integral with any of the answers provided by either MAPLE or Wolfram.
Example 3. Find Hardy's reduction of
\[ \int (x^5 + x^4 + x^3 + x^2 + x + 1)e^{x^3+x}\,dx, \]
and determine if it is elementary. Now $f(x) = x^5 + x^4 + x^3 + x^2 + x + 1$, $g(x) = x^3 + x$, $N_g = \operatorname{span}\{1, x\}$, and
\[ E_g = \operatorname{span}\{\varphi_j(x) = 3x^{j+2} + x^j + jx^{j-1} : j = 0, 1, 2, \dots\}. \]
A short calculation shows that $\int \varphi_j(x)e^{x^3+x}\,dx = x^j e^{x^3+x} + C$ and
\[ f(x) = \frac{8}{9} + \frac{1}{9}x - \frac{1}{9}\varphi_0(x) + \frac{2}{9}\varphi_1(x) + \frac{1}{3}\varphi_2(x) + \frac{1}{3}\varphi_3(x). \]
Thus, the desired reduction is
\[ \int f(x)e^{g(x)}\,dx = \int \left(\frac{8}{9} + \frac{1}{9}x\right)e^{x^3+x}\,dx + \left(-\frac{1}{9} + \frac{2}{9}x + \frac{1}{3}x^2 + \frac{1}{3}x^3\right)e^{x^3+x}. \]
Since $u(x) = \frac{8}{9} + \frac{1}{9}x$ is not the zero polynomial, the given integral is not elementary.

6. PROOF OF THEOREM 3, CASE I. We split the proof of Theorem 3 into two cases. Here we prove this theorem in the case that $f$ is a rational function and $g$ a nonconstant polynomial.
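Proof. It is well known [18] that if $\mathcal{Q}$ is the vector space of rational functions with coefficients in $\mathbb{C}$, then
\[ \left\{1,\ x^j,\ \frac{1}{(x - \alpha)^j} : \alpha \in \mathbb{C},\ j = 1, 2, \dots\right\} \tag{7} \]
is a basis of $\mathcal{Q}$. If $g$ is a polynomial of degree $k \geq 1$, we define
\[ N_g = \begin{cases} \operatorname{span}\left\{\dfrac{1}{x - \alpha} : \alpha \in \mathbb{C}\right\}, & \text{if } k = 1, \\[8pt] \operatorname{span}\left\{\dfrac{1}{x - \alpha},\ x^j : \alpha \in \mathbb{C},\ j = 0, 1, \dots, k - 2\right\}, & \text{if } k \geq 2. \end{cases} \]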
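Now we set
\[ \psi_{\alpha j}(x) = \frac{g'(x)}{(x - \alpha)^j} - \frac{j}{(x - \alpha)^{j+1}} \quad\text{and}\quad \varphi_j(x) = x^j g'(x) + jx^{j-1}. \tag{8} \]
A short calculation shows by induction that $\{\psi_{\alpha j}, g', \varphi_j : \alpha \in \mathbb{C},\ j = 1, 2, \dots\}$ is linearly independent; in addition, using the basis of $\mathcal{Q}$ given in (7), we see that
\[ \mathcal{Q} = N_g \oplus \operatorname{span}\{\psi_{\alpha j}, g', \varphi_j : \alpha \in \mathbb{C},\ j = 1, 2, \dots\}. \]
Moreover, for all $\alpha \in \mathbb{C}$ and $j = 0, 1, \dots$,
\[ \int \psi_{\alpha j}(x)e^{g(x)}\,dx = \frac{e^{g(x)}}{(x - \alpha)^j} + C \quad\text{and}\quad \int \varphi_j(x)e^{g(x)}\,dx = x^j e^{g(x)} + C. \tag{9} \]
We will have that $E_g = \operatorname{span}\{\psi_{\alpha j}, g', \varphi_j : \alpha \in \mathbb{C},\ j = 1, 2, \dots\}$ and thus prove the theorem if we show that $\int u(x)e^{g(x)}\,dx$ is not elementary for all $u \in N_g \setminus \{0\}$. This follows as in the proof of Theorem 2 if $u$ is a polynomial and from Lemma 1 otherwise.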

Corollary 4. Let $g$ be a nonconstant polynomial. Then for any $f \in \mathcal{Q}$ there exists a unique rational function $R$ and a unique rational function $u \in N_g$ of the form
\[ u(x) = \sum_{j=1}^{n}\frac{A_j}{x - \alpha_j} + Q(x), \]
where $Q$ is zero or a polynomial of degree at most $\deg g - 2$, such that
\[ \int f(x)e^{g(x)}\,dx = \int u(x)e^{g(x)}\,dx + R(x)e^{g(x)}. \]
Moreover, the left side of this last equation is elementary if and only if $u \equiv 0$, and if $u \not\equiv 0$ the nonelementary integral on the right side is minimal in the sense that for any rational functions $v$ and $Q$ satisfying
\[ \int f(x)e^{g(x)}\,dx = \int v(x)e^{g(x)}\,dx + Q(x)e^{g(x)}, \]
we have $\deg v \geq \deg u$.

Proof. The proof of this corollary is similar to that of Corollary 2. We just remind the reader that if $u$ is a nonzero rational function with $u = p/q$, where $p$ and $q$ are relatively prime polynomials, then $\deg u = \max\{\deg p, \deg q\}$.
The decomposition given by Corollary 4, which we will refer to as Hardy's reduction of $\int f(x)e^{g(x)}\,dx$, says that if this integral is not an elementary function, then we only have to be concerned about the nonelementary integrals $\int x^j e^{g(x)}\,dx$, for $j = 0, 1, \dots, \deg g - 2$, and a finite number of integrals of the form $\int \frac{e^{g(x)}}{x - \alpha}\,dx$.
Corollary 4 extends to any polynomial $g$ the following well-known fact, which says that any integral of the form $\int R(x)e^x\,dx$, where $R(x)$ is a rational function, can be reduced so as to contain only a term $e^x S(x)$, where $S(x)$ is a rational function, and a number of terms of the type $\alpha\int \frac{e^x\,dx}{x - a}$. If all the constants $\alpha$ vanish, then the integral can be calculated in the final form $e^x S(x)$ [7].
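In concrete situations it is convenient to consider reasonable subspaces of $\mathcal{Q}$, $E_g$, and $N_g$ that possess a countable basis. For example, let $f$ be a rational function with partial fraction decomposition
\[ f(x) = \sum_{l=1}^{n}\sum_{t=1}^{n_l}\frac{B_{lt}}{(x - \alpha_l)^t} + P_1(x), \tag{10} \]
where $P_1$ is a polynomial. Assuming that $n \geq 1$ and $\alpha_l \neq \alpha_m$ whenever $l \neq m$, we set $\mathcal{F} = \{\alpha_1, \dots, \alpha_n\}$ and define
\[ \mathcal{Q}(\mathcal{F}) = \operatorname{span}\left\{1,\ x^j,\ \frac{1}{(x - \alpha)^j} : \alpha \in \mathcal{F},\ j = 1, 2, \dots\right\}. \]
Moreover, if $g$ is a polynomial of degree $k \geq 1$ we define
\[ E_g(\mathcal{F}) = \operatorname{span}\{\psi_{\alpha j}(x),\ g'(x),\ \varphi_j(x) : \alpha \in \mathcal{F},\ j = 1, 2, \dots\}, \tag{11} \]
where $\psi_{\alpha j}(x)$ and $\varphi_j(x)$ are as in (8). We also define
\[ N_g(\mathcal{F}) = \begin{cases} \operatorname{span}\left\{\dfrac{1}{x - \alpha} : \alpha \in \mathcal{F}\right\}, & \text{if } k = 1, \\[8pt] \operatorname{span}\left\{\dfrac{1}{x - \alpha},\ x^j : \alpha \in \mathcal{F},\ j = 0, 1, \dots, k - 2\right\}, & \text{if } k \geq 2. \end{cases} \]
Note that $\mathcal{Q}_g(\mathcal{F}) = E_g(\mathcal{F}) \oplus N_g(\mathcal{F})$ and if $f \in \mathcal{Q}_g(\mathcal{F})$, then $\int f(x)e^{g(x)}\,dx$ is elementary if and only if $f \in E_g(\mathcal{F})$; note too that $E_g(\mathcal{F})$ has finite co-dimension as a subspace of $\mathcal{Q}_g(\mathcal{F})$.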
Example 4. Find Hardy's reduction of
\[ \int \frac{4}{(x^2 - 1)^2}\,e^{x^3/3}\,dx, \]
and decide whether this integral is elementary.
Note first that $g(x) = x^3/3$ and $f(x) = \frac{4}{(x^2 - 1)^2}$. By partial fraction decomposition, we have
\[ f(x) = \frac{1}{x + 1} + \frac{1}{(x + 1)^2} - \frac{1}{x - 1} + \frac{1}{(x - 1)^2}. \]
If we set $\mathcal{F} = \{-1, 1\}$, then we have $\mathcal{Q}(\mathcal{F}) = \operatorname{span}\left\{1, x^j, \frac{1}{(x+1)^j}, \frac{1}{(x-1)^j} : j = 1, 2, \dots\right\}$,
\[ E_g(\mathcal{F}) = \operatorname{span}\{\psi_{\pm 1, j}(x),\ x^2,\ \varphi_j(x) : j = 1, 2, \dots\}, \]
where $\psi_{\pm 1, j}(x) = \frac{x^2}{(x \mp 1)^j} - \frac{j}{(x \mp 1)^{j+1}}$ and $\varphi_j(x) = x^{j+2} + jx^{j-1}$. We also have that $N_g(\mathcal{F}) = \operatorname{span}\left\{\frac{1}{x+1}, \frac{1}{x-1}, 1, x\right\}$. A short calculation shows that
\[ f(x) = 2x + \frac{2}{x + 1} - \psi_{1,1}(x) - \psi_{-1,1}(x). \]
In view of (9), we have that Hardy's reduction of the given integral is
\[ \int f(x)e^{g(x)}\,dx = \int \left(2x + \frac{2}{x + 1}\right)e^{x^3/3}\,dx - \left(\frac{1}{x - 1} + \frac{1}{x + 1}\right)e^{x^3/3}. \]
Since $u(x) = 2x + \frac{2}{x+1} \not\equiv 0$, the given integral is not elementary.
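The decomposition of $f$ in Example 4 can be spot-checked with exact rational arithmetic. The sketch below evaluates both sides of $f = u - \psi_{1,1} - \psi_{-1,1}$ at a few rational points away from the poles; the helper names (`psi`, `decomposition`, `check_example_4`) are ours, and agreement at sample points is a sanity check rather than a proof.

```python
from fractions import Fraction

def g_prime(x):
    """Derivative of g(x) = x^3 / 3."""
    return x * x

def psi(alpha, j, x):
    """psi_{alpha j}(x) = g'(x)/(x - alpha)^j - j/(x - alpha)^(j+1), as in (8)."""
    return g_prime(x) / (x - alpha) ** j - Fraction(j) / (x - alpha) ** (j + 1)

def f(x):
    return Fraction(4) / (x * x - 1) ** 2

def decomposition(x):
    u = 2 * x + Fraction(2) / (x + 1)      # candidate nonelementary part
    return u - psi(1, 1, x) - psi(-1, 1, x)

def check_example_4():
    samples = [Fraction(n, 7) for n in (2, 3, 5, 10, -20)]
    return all(f(x) == decomposition(x) for x in samples)

print(check_example_4())
```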

Remark. We warn the reader that our algorithm to find a Hardy's reduction of $\int f(x)e^{g(x)}\,dx$ has the advantages and limitations of partial fraction decomposition. Everything works out well if the roots of the denominator of $f$ are known, but we have a problem otherwise. For example, we do not know how to find a Hardy's reduction of $\int \frac{e^{g(x)}}{x^5 - 6x + 3}\,dx$, where $g$ is a nonconstant polynomial, since $q(x) = x^5 - 6x + 3$ is not solvable by radicals [16]. On the other hand, we note that $\frac{1}{x^5 - 6x + 3} \in N_g$ because all the roots of $q$ are simple. Thus, $\int \frac{e^{g(x)}}{x^5 - 6x + 3}\,dx$ is not elementary and has a Hardy's reduction of the form
\[ \int \frac{e^{g(x)}}{x^5 - 6x + 3}\,dx = \sum_{j=1}^{5} A_j\int \frac{e^{g(x)}}{x - \alpha_j}\,dx, \]
where $A_j = \prod_{k \neq j}\frac{1}{\alpha_j - \alpha_k}$ and $\alpha_j$, $j = 1, \dots, 5$, are the roots of $q$, which could be approximated by Newton's method.
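A minimal numerical sketch of this remark follows. It approximates the three real roots of $q$ by Newton's method from hand-picked starting points, obtains the complex pair from the deflated quadratic, and uses the identity $A_j = 1/q'(\alpha_j)$, which holds because $q$ is monic with simple roots. The starting points, the deflation step, and all function names are our own assumptions, not part of the paper.

```python
import cmath

COEFFS = [1, 0, 0, 0, -6, 3]  # q(x) = x^5 - 6x + 3, highest degree first

def q(z):
    return z**5 - 6*z + 3

def dq(z):
    return 5*z**4 - 6

def newton(z, steps=50):
    """Plain Newton iteration for q, started at z."""
    for _ in range(steps):
        z = z - q(z) / dq(z)
    return z

def deflate(coeffs, root):
    """Synthetic division of a polynomial (highest degree first) by (x - root)."""
    out = [coeffs[0]]
    for c in coeffs[1:-1]:
        out.append(c + root * out[-1])
    return out

# q changes sign on (-2, -1), (0, 1) and (1, 2), so it has three real roots.
real_roots = [newton(z0) for z0 in (-2.0, 0.5, 2.0)]

quad = COEFFS
for root in real_roots:
    quad = deflate(quad, root)
_, b, c = quad                      # remaining factor x^2 + b x + c
disc = cmath.sqrt(b * b - 4 * c)
complex_roots = [newton((-b + disc) / 2), newton((-b - disc) / 2)]

roots = real_roots + complex_roots
A = [1 / dq(z) for z in roots]      # A_j = prod_{k != j} 1/(alpha_j - alpha_k)

for alpha, coeff in zip(roots, A):
    print(alpha, coeff)
```

With these approximations, $\sum_j A_j/(x - \alpha_j)$ should agree with $1/q(x)$ to floating-point accuracy at any point that is not a root.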
The interested advanced reader may find in [2] an algorithm that does not require factoring the denominators of $f$ and $g$ to determine if $\int f(x)e^{g(x)}\,dx$ is an elementary function and, when it is, to find its value.
7. PROOF OF THEOREM 3, CASE II. Now we prove Theorem 3 in the case that both $f$ and $g$ are rational functions and $g$ is not a polynomial.
Before we prove this theorem, we introduce notation. As customary, $\mathcal{Q}$ will be viewed as the linear span of (7). Next we suppose that the partial fraction decomposition of $g$ is given by
\[ g(x) = \sum_{r=1}^{m}\sum_{s=1}^{l_r}\frac{A_{rs}}{(x - \alpha_r)^s} + P(x), \tag{12} \]
where $P \in \mathcal{P}$ is either zero or $\deg P = k$. We assume $m \geq 1$, the complex numbers $\alpha_r$ to be distinct, and $A_{r l_r} \neq 0$ for $r = 1, \dots, m$. Now we define
\[ E_g = \operatorname{span}\{\psi_{\alpha j},\ g',\ \varphi_j : \alpha \in \mathbb{C},\ j = 1, 2, \dots\}, \]
where $\psi_{\alpha j}(x)$ and $\varphi_j(x)$ are as in (8), and
\[ g'(x) = -\sum_{r=1}^{m}\sum_{s=1}^{l_r}\frac{sA_{rs}}{(x - \alpha_r)^{s+1}} + P'(x). \]
We also set $\rho_{rs}(x) = \frac{1}{(x - \alpha_r)^s}$ and
\[ N_g' = \{\rho_{rs}(x) : r = 1, \dots, m;\ s = 1, \dots, l_r + 1\} \setminus \{\rho_{m, l_m+1}(x)\}. \]
Finally, we set $\Lambda = \{\alpha_1, \dots, \alpha_m\}$ and define
\[ N_g = \operatorname{span}\left\{1, x, \dots, x^{k-1},\ \frac{1}{x - \alpha},\ \rho_{rs} : \rho_{rs} \in N_g',\ \alpha \in \mathbb{C} \setminus \Lambda\right\}, \]
where it is understood that $\{1, x, \dots, x^{k-1}\}$ is empty if $P$ is constant.
Remark. The remainder of this proof is valid if in the definition of $N_g'$ we leave out any $\rho_{r, l_r+1}(x)$, with $r = 1, \dots, m - 1$, instead of $\rho_{m, l_m+1}(x)$.
Proof. With the above notation in hand, in view of (9), Theorem 3 will be proved if we show that $\mathcal{Q} = N_g \oplus E_g$, and that $\int u(x)e^{g(x)}\,dx$ is not elementary for any $u \in N_g \setminus \{0\}$.
First, we prove the latter. Let $u \in N_g \setminus \{0\}$ be given. If $u$ has a nonzero term of the form $\frac{A}{x - \alpha}$ for some $\alpha \in \mathbb{C} \setminus \Lambda$, then $\int u(x)e^{g(x)}\,dx$ is not elementary because of Lemma 1. So we may assume that $u(x) = \sum_{\rho_{rs} \in N_g'} a_{rs}\rho_{rs} + Q(x)$, where $Q \equiv 0$ or $Q$ is a polynomial of degree at most $k - 1$. If $\int u(x)e^{g(x)}\,dx$ is elementary, then by Liouville's criterion $u = (p/q)g' + (p/q)'$, where $p$ and $q$ are relatively prime polynomials. A short calculation yields
\[ (uq - p' - pg')q = -q'p, \]
and multiplying by $D(x) = \prod_{r=1}^{m}(x - \alpha_r)^{l_r+1}$, the common denominator of $g'$, we obtain the polynomial equation
\[ (quD - p'D - pg'D)q = -q'pD. \]
A multiplicity argument about the zeroes of $q$ and $q'$ shows that the only possible roots of $q$ are $\alpha_1, \dots, \alpha_m$. If $q(x) = (x - \alpha_1)^{\mu}\sigma(x)$, with $\mu \geq 1$ and $\sigma(\alpha_1) \neq 0$, then
\[ (quD - p'D - pg'D)\sigma = -p\,(x - \alpha_1)^{l_1}\prod_{r=2}^{m}(x - \alpha_r)^{l_r+1}\left(\mu\sigma + (x - \alpha_1)\sigma'\right). \]
By definition of $g$, we have that $l_1 \geq 1$ and thus $x - \alpha_1$ divides the right side of this last equation, hence also the left side. Since $x - \alpha_1$ does not divide $\sigma$, it divides the first factor. Note now that $x - \alpha_1$ divides the first two terms of this factor and thus divides $pg'D$. Since $A_{1 l_1} \neq 0$, $x - \alpha_1$ divides $p\prod_{r=2}^{m}(x - \alpha_r)^{l_r+1}$, which is impossible. Thus, $\alpha_1$ is not a root of $q$ and by a similar argument neither is $\alpha_r$, for $r = 2, \dots, m$. Hence, $q$ is a nonzero constant $C_1$, and therefore we have
\[ C_1 uD - p'D = pg'D, \tag{13} \]
where the polynomials $uD$ and $pg'D$ are given by
\[ uD = \sum_{\rho_{rs} \in N_g'} a_{rs}D\rho_{rs} + QD, \qquad pg'D = -p\sum_{r=1}^{m}\sum_{s=1}^{l_r}\frac{sA_{rs}D}{(x - \alpha_r)^{s+1}} + pP'D. \]
Suppose $k = \deg P \geq 1$. Next we show that this assumption leads to a contradiction. Note that in this case the degree of the right side of (13) is that of $pP'D$.
If $Q \equiv 0$, then the degree of $pP'D$ is bigger than that of $C_1 uD - p'D$, unless $p \equiv 0$. But this implies that $u \equiv 0$, which contradicts the choice of $u$.
If $Q \not\equiv 0$, then $\deg Q \leq k - 1$. If $p$ were nonconstant, then the degree of $pP'D$ would be bigger than that of the left side of (13), which is at most $\max\{\deg QD, \deg p'D\}$. Since this last is impossible, $p$ is a constant $C_2$ and therefore
\[ u = (C_2/C_1)\,g'(x). \]
If $C_2 = 0$, we are led to the contradiction $u \equiv 0$. Thus, $C_2 \neq 0$. Since $A_{m l_m} \neq 0$, using this last equation, we solve for $\rho_{m, l_m+1}$ and show that $\rho_{m, l_m+1} \in N_g$, but this contradicts the definition of $N_g$.
Hence, $P$ is a constant and $Q \equiv 0$. If $p$ is nonconstant, then (13) is impossible since now the degree of $p'D$ is bigger than that of $C_1 uD - pg'D$. Therefore, $p$ is a constant $C_2$, and as before we arrive at a contradiction. This proves that $\int u(x)e^{g(x)}\,dx$ is not elementary if $u \in N_g \setminus \{0\}$.
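Now we prove that $\mathcal{Q} = N_g \oplus E_g$. In view of (9) and what we just proved, we have that $N_g \cap E_g = \{0\}$. On the other hand, $N_g + E_g \subset \mathcal{Q}$. So it suffices to show that the generators of $\mathcal{Q}$ given in (7) belong to $N_g + E_g$. By definition of $N_g$, we have
\[ \left\{1, x, \dots, x^{k-1},\ \frac{1}{x - \alpha},\ \rho_{rs} = \frac{1}{(x - \alpha_r)^s} : \rho_{rs} \in N_g',\ \alpha \in \mathbb{C} \setminus \Lambda\right\} \subset N_g + E_g. \]
Moreover, since $l_m A_{m l_m} \neq 0$ and $g' \in E_g$, solving for $\frac{1}{(x - \alpha_m)^{l_m+1}}$ we find that $\frac{1}{(x - \alpha_m)^{l_m+1}} \in N_g + E_g$. Taking these last two assertions as a starting point, it follows by induction on $j = 1, 2, \dots$ that we have the following.
For fixed $k \geq 0$, $x^{j+k-1} \in N_g + E_g$, since $\varphi_j(x) = x^j g'(x) + jx^{j-1} \in E_g$.
If $\alpha \in \mathbb{C} \setminus \Lambda$, then $\frac{1}{(x - \alpha)^{j+1}} \in N_g + E_g$, since $\psi_{\alpha j}(x) = \frac{g'(x)}{(x - \alpha)^j} - \frac{j}{(x - \alpha)^{j+1}} \in E_g$.
If $\alpha = \alpha_r$ for some $r = 1, \dots, m$, then $\frac{1}{(x - \alpha_r)^{l_r+j+1}} \in N_g + E_g$, since $\psi_{\alpha_r j}(x) \in E_g$.
Hence, $\mathcal{Q} = N_g \oplus E_g$.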
Corollary 5. Let $g$ and $P$ be as in (12). Then for any $f \in \mathcal{Q}$ there exists a unique rational function $R$ and a unique rational function $u \in N_g$ of the form
\[ u(x) = \sum_{j=1}^{n}\frac{A_j}{x - \alpha_j} + \sum_{\rho_{rs} \in N_g'} a_{rs}\rho_{rs} + Q(x), \]
where $N_g$ and $N_g'$ are as defined in the proof of Theorem 3, case II, and $Q \equiv 0$ if $P \equiv 0$ or $Q$ is a polynomial of degree at most $\deg P - 1$, such that
\[ \int f(x)e^{g(x)}\,dx = \int u(x)e^{g(x)}\,dx + R(x)e^{g(x)}. \]
Moreover, the left side of this last equation is elementary if and only if $u \equiv 0$, and if $u \not\equiv 0$ the nonelementary integral on the right side is minimal in the sense that for any rational functions $v$ and $\omega$ satisfying
\[ \int f(x)e^{g(x)}\,dx = \int v(x)e^{g(x)}\,dx + \omega(x)e^{g(x)}, \]
we have $\deg v \geq \deg u$. Here the degree of a rational function is defined as in the proof of Corollary 4.
Proof. The proof of this corollary is similar to that of Corollary 2.
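As before, the decomposition given by Corollary 5 will be referred to as Hardy's reduction of $\int f(x)e^{g(x)}\,dx$.
As in the previous section, when dealing with concrete situations, it is convenient to work with suitable subspaces of $\mathcal{Q}$, $E_g$, and $N_g$ that have a countable basis. Thus, for example, if $f$ is as in (10) and $g$ as in (12), then we set $\mathcal{F} = \{\alpha_1, \dots, \alpha_n\}$ and consider $E_g(\mathcal{F})$ as in (11),
\[ \mathcal{Q}_g(\mathcal{F}) = \operatorname{span}\left\{1,\ x^j,\ \frac{1}{(x - \alpha)^j},\ \frac{1}{(x - \alpha_r)^s} : \alpha \in \mathcal{F},\ j = 1, 2, \dots;\ \alpha_r \in \Lambda \setminus \mathcal{F},\ s = 1, \dots, l_r + 1\right\} \]
and
\[ N_g(\mathcal{F}) = \operatorname{span}\left\{1, x, \dots, x^{k-1},\ \frac{1}{x - \alpha},\ \rho_{rs} : \rho_{rs} \in N_g',\ \alpha \in \mathcal{F} \setminus \Lambda\right\}, \]
where $\Lambda$ is as in $N_g$ and $\rho_{rs}$ as in $N_g'$.
Again, $\mathcal{Q}_g(\mathcal{F}) = E_g(\mathcal{F}) \oplus N_g(\mathcal{F})$, and if $f \in \mathcal{Q}_g(\mathcal{F})$, then $\int f(x)e^{g(x)}\,dx$ is elementary if and only if $f \in E_g(\mathcal{F})$.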
Example 5. Find Hardy's reduction of

\[ \int \frac{2x^6 - 3x^5 - 5x^4 + x^3 + 3x^2 + 12x - 5}{x^4 - 6x^3 + 13x^2 - 12x + 4}\, \exp\!\left(\frac{x^4 + x^3 - x^2 - x + 2}{x^2 - 1}\right) dx \]

and determine if this integral is elementary.

First we note that

\[ g(x) = \frac{x^4 + x^3 - x^2 - x + 2}{x^2 - 1} \quad\text{and}\quad f(x) = \frac{2x^6 - 3x^5 - 5x^4 + x^3 + 3x^2 + 12x - 5}{x^4 - 6x^3 + 13x^2 - 12x + 4}. \]

By partial fraction decomposition, we have

\[ g(x) = \frac{1}{x-1} - \frac{1}{x+1} + x^2 + x \]

and

\[ f(x) = \frac{8}{x-1} + \frac{5}{(x-1)^2} + \frac{38}{x-2} - \frac{9}{(x-2)^2} + 2x^2 + 9x + 23. \]

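If we set $F = \{1, 2\}$, then

\[ Q_g(F) = \operatorname{span}\left\{1,\ x^j,\ \frac{1}{(x-\alpha)^j},\ \frac{1}{x+1},\ \frac{1}{(x+1)^2} : \alpha = 1, 2;\ j = 1, 2, \dots\right\}, \]

the basis elements of $N_g$ with poles at the poles of $g$ are

\[ \frac{1}{x-1},\quad \frac{1}{(x-1)^2},\quad \frac{1}{x+1}, \]

\[ N_g(F) = \operatorname{span}\left\{1,\ x,\ \frac{1}{x-2},\ \frac{1}{x-1},\ \frac{1}{(x-1)^2},\ \frac{1}{x+1}\right\}, \]

and

\[ E_g(F) = \operatorname{span}\left\{\varphi_{\alpha j}(x),\ g'(x),\ \psi_j(x) : \alpha = 1, 2;\ j = 1, 2, \dots\right\}, \]

where, from (8),

\[ \varphi_{\alpha j}(x) = -\frac{4x}{(x-\alpha)^j (x^2-1)^2} + \frac{2x+1}{(x-\alpha)^j} - \frac{j}{(x-\alpha)^{j+1}}, \]

\[ g'(x) = -\frac{1}{(x-1)^2} + \frac{1}{(x+1)^2} + 2x + 1, \]

\[ \psi_j(x) = -\frac{4x^{j+1}}{(x^2-1)^2} + 2x^{j+1} + x^j + jx^{j-1}. \]

A short calculation shows that

\[ f(x) = \frac{1}{(x-1)^2} + \frac{1}{x-2} + 4g'(x) + \psi_1(x) + 9\varphi_{21}(x), \]

where

\[ \psi_1(x) = -\frac{1}{(x-1)^2} - \frac{1}{x-1} - \frac{1}{(x+1)^2} + \frac{1}{x+1} + 2x^2 + x + 1, \]

\[ \varphi_{21}(x) = 2 - \frac{1}{(x-2)^2} + \frac{37}{9(x-2)} + \frac{1}{(x-1)^2} + \frac{1}{x-1} - \frac{1}{3(x+1)^2} - \frac{1}{9(x+1)}. \]

In view of (9), Hardy's reduction is

\[ \int f(x)e^{g(x)}\,dx = \int \left[\frac{1}{(x-1)^2} + \frac{1}{x-2} + 4g'(x) + \psi_1(x) + 9\varphi_{21}(x)\right] e^{g(x)}\,dx \]
\[ \qquad = \int \left[\frac{1}{(x-1)^2} + \frac{1}{x-2}\right] e^{g(x)}\,dx + \left(4 + x + \frac{9}{x-2}\right) e^{g(x)}. \]

Since $u(x) = \frac{1}{(x-1)^2} + \frac{1}{x-2} \in N_g(F) \setminus \{0\}$, the given integral is not elementary.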

8. APPLICATIONS. As an application of the previous results, we characterize the polynomials $P$ such that $\int P(x)e^{ax^k}\,dx$ is elementary in terms of equations that the coefficients of $P$ satisfy. We treat the simplest values of $k$ and let the interested reader work out the general case.
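Example 6. Let $P(x) = a_n x^n + \cdots + a_1 x + a_0$ and $a \in \mathbb{C} \setminus \{0\}$. If $n = 2l + r$, with $r = 0, 1$, then $\int P(x)e^{ax^2}\,dx$ is elementary if and only if

\[ \sum_{j=0}^{l} \left(-\frac{1}{2a}\right)^{j} (2j-1)!!\; a_{2j} = 0. \tag{14} \]

Proof. We only treat the case $n = 2l$. In view of Theorem 2, $\int P(x)e^{ax^2}\,dx$ is elementary if and only if

\[ \sum_{j=0}^{2l} a_j x^j = \sum_{j=0}^{2l-1} A_{j+1}\left(x^{j+1} + \frac{j}{2a}x^{j-1}\right) \]
\[ \qquad = A_{2l}x^{2l} + A_{2l-1}x^{2l-1} + \left(A_{2l-2} + \frac{2l-1}{2a}A_{2l}\right)x^{2l-2} + \cdots + \left(A_2 + \frac{3}{2a}A_4\right)x^2 + \left(A_1 + \frac{1}{a}A_3\right)x + \frac{A_2}{2a}. \]

That is to say,

\[ A_{2l} = a_{2l},\quad A_{2l-1} = a_{2l-1},\quad A_{2l-2} + \frac{2l-1}{2a}A_{2l} = a_{2l-2},\quad \dots,\quad A_2 + \frac{3}{2a}A_4 = a_2,\quad A_1 + \frac{1}{a}A_3 = a_1,\quad \frac{A_2}{2a} = a_0, \]

but this system of linear equations is equivalent to (14).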


Example 6 generalizes a result of [6] that says that a polynomial $P$ can be integrated against $e^{-x^2}$ in finite terms if and only if the following orthogonality condition is satisfied:

\[ \int_{-\infty}^{\infty} P(x)e^{-x^2}\,dx = 0. \]

Using the well-known fact that for $j = 0, 1, 2, \dots$

\[ \int_{-\infty}^{\infty} x^{2j}e^{-x^2}\,dx = \sqrt{\pi}\,(2j-1)!!/2^j \quad\text{and}\quad \int_{-\infty}^{\infty} x^{2j+1}e^{-x^2}\,dx = 0, \]

if $P(x) = a_n x^n + \cdots + a_1 x + a_0$, where $n = 2l + r$ with $r = 0, 1$, then the above orthogonality condition says that $\int P(x)e^{-x^2}\,dx$ is elementary if and only if

\[ \sum_{j=0}^{l} \frac{(2j-1)!!}{2^j}\, a_{2j} = 0, \]

which is (14) with $a = -1$. Using the results of this paper, the above example can be pushed a bit further.
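Example 7. Let $P(x) = a_n x^n + \cdots + a_1 x + a_0$ and $a \in \mathbb{C} \setminus \{0\}$. If $n = 3l + r$, with $r = 1, 2$, then $\int P(x)e^{ax^3}\,dx$ is elementary if and only if

\[ \sum_{j=0}^{l} \left(-\frac{1}{3a}\right)^{j} (3j-2)!!!\; a_{3j} = 0 \quad\text{and}\quad \sum_{j=0}^{l} \left(-\frac{1}{3a}\right)^{j} (3j-1)!!!\; a_{3j+1} = 0, \tag{15} \]

while if $n = 3l$, then $\int P(x)e^{ax^3}\,dx$ is elementary if and only if

\[ \sum_{j=0}^{l} \left(-\frac{1}{3a}\right)^{j} (3j-2)!!!\; a_{3j} = 0 \quad\text{and}\quad \sum_{j=0}^{l-1} \left(-\frac{1}{3a}\right)^{j} (3j-1)!!!\; a_{3j+1} = 0. \]

Proof. We only treat the case $n = 3l + 2$. In view of Theorem 2, $\int P(x)e^{ax^3}\,dx$ is elementary if and only if

\[ \sum_{j=0}^{3l+2} a_j x^j = \sum_{j=2}^{3l+2} A_j\left(x^j + \frac{j-2}{3a}x^{j-3}\right). \]

Comparing the coefficients of $x^j$, we see that this equality corresponds to

\[ A_{3l+2} = a_{3l+2},\quad A_{3l+1} = a_{3l+1},\quad A_{3l} = a_{3l},\quad A_{3l-1} + \frac{3l}{3a}A_{3l+2} = a_{3l-1},\quad \dots,\quad A_2 + \frac{A_5}{a} = a_2,\quad \frac{2}{3a}A_4 = a_1,\quad \frac{A_3}{3a} = a_0. \]

This system of linear equations is equivalent to (15).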

Proceeding as in the previous examples, the interested reader can prove the following.

Proposition 1. Let $k$ be a positive integer, $P(x) = a_n x^n + \cdots + a_1 x + a_0$, and $a \in \mathbb{C} \setminus \{0\}$. If $n = kl + r$, with $0 \le r \le k - 1$, then $\int P(x)e^{ax^k}\,dx$ is elementary if and only if the coefficients of $P$ satisfy the following system of $k - 1$ linear equations:

\[ \sum_{j=0}^{l} \left(-\frac{1}{ka}\right)^{j} \big((j-1)k + i + 1\big)!^{(k)}\; a_{kj+i} = 0, \qquad i = 0, 1, \dots, \min\{r, k-2\}; \]

\[ \sum_{j=0}^{l-1} \left(-\frac{1}{ka}\right)^{j} \big((j-1)k + i + 1\big)!^{(k)}\; a_{kj+i} = 0, \qquad i = \min\{r, k-2\} + 1, \dots, k-2. \]

Now we consider integrals of the form $\int P(x)e^{a/x^k}\,dx$ with $k > 0$ and $P$ a polynomial. We work out the cases $k = 1$ and $2$ and leave the general case to the reader.
Example 8. For any $P(x) = a_n x^n + \cdots + a_1 x + a_0$ and $a \in \mathbb{C} \setminus \{0\}$, we have that $\int P(x)e^{a/x}\,dx$ is elementary if and only if

\[ \sum_{j=0}^{n} \frac{a_j a^j}{(j+1)!} = 0. \tag{16} \]

Proof. In view of Theorem 3, if $P$ is a polynomial, then $\int P(x)e^{a/x}\,dx$ is an elementary function if and only if $P \in \operatorname{span}\{x - \frac{a}{2},\ x^2 - \frac{a}{3}x,\ \dots,\ x^j - \frac{a}{j+1}x^{j-1},\ \dots\}$. The latter occurs if and only if

\[ \sum_{j=0}^{n} a_j x^j = \sum_{j=1}^{n} A_j\left(x^j - \frac{a}{j+1}x^{j-1}\right). \]

Comparing the coefficients of $x^j$, we see that this last equality is equivalent to

\[ A_n = a_n,\quad A_{n-1} - \frac{a}{n+1}A_n = a_{n-1},\quad \dots,\quad A_2 - \frac{a}{4}A_3 = a_2,\quad A_1 - \frac{a}{3}A_2 = a_1,\quad -\frac{aA_1}{2} = a_0. \]

This system of linear equations is equivalent to (16).


Example 9. Let $P(x) = a_n x^n + \cdots + a_1 x + a_0$ and $a \in \mathbb{C} \setminus \{0\}$. If $n = 2l + 1$, then $\int P(x)e^{a/x^2}\,dx$ is elementary if and only if

\[ \sum_{j=0}^{l} \frac{(2a)^j}{(2j+1)!!}\, a_{2j} = 0 \quad\text{and}\quad \sum_{j=0}^{l} \frac{(2a)^j}{(2j+2)!!}\, a_{2j+1} = 0, \tag{17} \]

while if $n = 2l$, then $\int P(x)e^{a/x^2}\,dx$ is elementary if and only if

\[ \sum_{j=0}^{l} \frac{(2a)^j}{(2j+1)!!}\, a_{2j} = 0 \quad\text{and}\quad \sum_{j=0}^{l-1} \frac{(2a)^j}{(2j+2)!!}\, a_{2j+1} = 0. \]

Proof. Again by Theorem 3, if $P$ is a polynomial, then $\int P(x)e^{a/x^2}\,dx$ is an elementary function if and only if $P \in \operatorname{span}\{x^2 - \frac{2a}{3},\ \dots,\ x^j - \frac{2a}{j+1}x^{j-2},\ \dots\}$. Suppose that $n = 2l + 1$ and that

\[ \sum_{j=0}^{2l+1} a_j x^j = \sum_{j=2}^{2l+1} A_j\left(x^j - \frac{2a}{j+1}x^{j-2}\right). \]

Comparing the coefficients of $x^j$, we get

\[ A_{2l+1} = a_{2l+1},\quad A_{2l} = a_{2l},\quad A_{2l-1} - \frac{2a}{2l+2}A_{2l+1} = a_{2l-1},\quad \dots,\quad A_2 - \frac{2a}{5}A_4 = a_2,\quad -\frac{2a}{4}A_3 = a_1,\quad -\frac{2a}{3}A_2 = a_0. \]

Eliminating $A_2, \dots, A_{2l+1}$ from this system of linear equations, we obtain (17). The case $n = 2l$ is treated similarly.
The interested reader can prove the following.

Proposition 2. Let $k$ be a positive integer, $P(x) = a_n x^n + \cdots + a_1 x + a_0$, and $a \in \mathbb{C} \setminus \{0\}$. If $n = kl + r$, with $0 \le r \le k - 1$, then $\int P(x)e^{a/x^k}\,dx$ is elementary if and only if the coefficients of $P$ satisfy the following system of $k$ linear equations:

\[ \sum_{j=0}^{l} \frac{(ka)^j}{(kj + i + 1)!^{(k)}}\, a_{kj+i} = 0, \qquad i = 0, 1, \dots, r; \]

\[ \sum_{j=0}^{l-1} \frac{(ka)^j}{(kj + i + 1)!^{(k)}}\, a_{kj+i} = 0, \qquad i = r + 1, \dots, k - 1. \]
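As with Proposition 1, the system in Proposition 2 can be evaluated directly. The Python sketch below does so with exact fractions. The names `multifactorial` and `inverse_power_sums` are illustrative choices, not notation from the paper.

```python
from fractions import Fraction


def multifactorial(n, k):
    """k-fold multifactorial n (n-k) (n-2k) ... for a positive integer n."""
    result = 1
    while n > 0:
        result *= n
        n -= k
    return result


def inverse_power_sums(coeffs, a, k):
    """Left-hand sides of the k equations of Proposition 2.

    coeffs[m] is the coefficient a_m of x**m in P. The integral of
    P(x) exp(a / x**k) is elementary exactly when every returned sum is zero.
    """
    a = Fraction(a)
    sums = []
    for i in range(k):
        total = Fraction(0)
        j = 0
        while k * j + i < len(coeffs):
            total += (k * a) ** j / multifactorial(k * j + i + 1, k) * coeffs[k * j + i]
            j += 1
        sums.append(total)
    return sums
```

With $k = 1$ and $a = 2$, the polynomial $x - 1 = x - a/2$ gives a single vanishing sum, in agreement with the spanning set used in the proof of Example 8.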





Now we obtain characterizations of a class of simple rational functions that have


elementary integrals when integrated against an exponential.
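Example 10. Let $a \ne 0$ and $c$ be given constants. If $n$ is a positive integer and $P$ a polynomial of degree $l \le n - 1$, then $\int \frac{P(x)}{(x-c)^n}e^{ax}\,dx$ is elementary if and only if

\[ \sum_{j=n-l}^{n} j\binom{n}{j}\, a^{j} P^{(n-j)}(c) = 0. \tag{18} \]

Proof. First, we show that

\[ \int \left(\frac{a_0}{x^n} + \frac{a_1}{x^{n-1}} + \cdots + \frac{a_l}{x^{n-l}}\right)e^{ax}\,dx \tag{19} \]

is elementary if and only if

\[ \sum_{j=n-l}^{n} \frac{a^j a_{n-j}}{(j-1)!} = 0. \tag{20} \]

Using Theorem 3, we see that (19) is elementary if and only if

\[ \frac{a_0}{x^n} + \frac{a_1}{x^{n-1}} + \cdots + \frac{a_l}{x^{n-l}} = A_1\left(\frac{n-1}{x^n} - \frac{a}{x^{n-1}}\right) + A_2\left(\frac{n-2}{x^{n-1}} - \frac{a}{x^{n-2}}\right) + \cdots + A_l\left(\frac{n-l}{x^{n-l+1}} - \frac{a}{x^{n-l}}\right) \]
\[ \qquad = \frac{(n-1)A_1}{x^n} + \frac{(n-2)A_2 - aA_1}{x^{n-1}} + \cdots + \frac{(n-l)A_l - aA_{l-1}}{x^{n-l+1}} - \frac{aA_l}{x^{n-l}}, \]

for some constants $A_1, A_2, \dots, A_l$. Comparing the coefficients, we obtain the system of linear equations

\[ (n-1)A_1 = a_0,\quad (n-2)A_2 - aA_1 = a_1,\quad \dots,\quad (n-l)A_l - aA_{l-1} = a_{l-1},\quad -aA_l = a_l. \]

A short calculation shows that this system is equivalent to the fact that (20) is satisfied.

A Taylor expansion of $P$ around $x = c$ and setting $u = x - c$ yields

\[ \int \frac{P(x)}{(x-c)^n}e^{ax}\,dx = e^{ac}\int \left(\frac{P(c)}{u^n} + \frac{P'(c)/1!}{u^{n-1}} + \cdots + \frac{P^{(l)}(c)/l!}{u^{n-l}}\right)e^{au}\,du. \]

Thus, replacing $a_j = P^{(j)}(c)/j!$ in (20) and multiplying by $n!$ gives (18).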

9. LIOUVILLE-LIKE INTEGRALS. In this section, we consider the problem of determining the elementary character of the Liouville-like integrals

\[ \int F(x)\log G(x)\,dx, \qquad \int F(x)\arctan G(x)\,dx, \qquad \int F(x)\tanh^{-1} G(x)\,dx, \]

where $F$ and $G$ are rational functions with $G$ nonconstant. Taking into account the fact that $\arctan x = \frac{1}{2i}\log\frac{1+ix}{1-ix}$ and $\tanh^{-1} x = \frac{1}{2}\log\frac{1+x}{1-x}$, it suffices to investigate the first of these integrals. To simplify the problem, we write

\[ G(x) = a(x-\alpha_1)\cdots(x-\alpha_n)/(x-\beta_1)\cdots(x-\beta_m) \]

and use properties of log and appropriate substitutions to find a rational function $f$ such that

\[ \int F(x)\log G(x)\,dx = \int f(x)\log x\,dx. \]

Next, we use Liouville-Hardy's criterion on integration, stated in Section 2 of this work, to provide an algorithm that determines if this last integral is elementary for any given rational function $f$.

Since $\int P(x)\log x\,dx$ is elementary for any polynomial $P$, in what follows we consider the space $Q_p$ of proper rational functions with coefficients in $\mathbb{C}$. A rational function $f$ is said to be proper if it is either zero or the degree of its numerator is smaller than that of its denominator.



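Theorem 4. Let $E = \{f \in Q_p : \int f(x)\log x\,dx \text{ is an elementary function}\}$ and let $N = \operatorname{span}\{\frac{1}{x-\alpha} : \alpha \in \mathbb{C} \setminus \{0\}\}$. Then

\[ E = \operatorname{span}\left\{\frac{1}{x},\ \frac{1}{(x-\alpha)^j} : \alpha \in \mathbb{C},\ j = 2, 3, \dots\right\} \]

and $Q_p = N \oplus E$.

Proof. Note first that $Q_p = N \oplus \operatorname{span}\{\frac{1}{x}, \frac{1}{(x-\alpha)^j} : \alpha \in \mathbb{C},\ j = 2, 3, \dots\}$. Since $\int \frac{\log x}{x}\,dx = \frac{1}{2}(\log x)^2 + C$ is elementary, and for $\alpha \in \mathbb{C}$ and any integer $j \ge 2$ so is

\[ \int \frac{\log x}{(x-\alpha)^j}\,dx = \frac{1}{j-1}\int \frac{dx}{x(x-\alpha)^{j-1}} - \frac{\log x}{(j-1)(x-\alpha)^{j-1}}, \]

it suffices to show that $\int u(x)\log x\,dx$ is not elementary for all $u \in N \setminus \{0\}$.

Let $u(x) = \sum_{l=1}^{n} \frac{B_l}{x-\alpha_l}$ in $N \setminus \{0\}$ and suppose $\int u(x)\log x\,dx$ is elementary. By Liouville-Hardy's criterion, there exists a rational function $g$ and a constant $C$ such that $u(x) = g'(x) + \frac{C}{x}$, but this equation is impossible. In fact, if $B_k \ne 0$, then $u$ has a simple pole at $x = \alpha_k$. Since $\alpha_k \ne 0$, there is an integer $m \ge 1$ and a rational function $r$ continuous at $x = \alpha_k$ such that $r(\alpha_k) \ne 0$ and $g(x) = r(x)/(x-\alpha_k)^m$. A short calculation with both expressions of $g'$ gives

\[ (x-\alpha_k)^m\left(u(x) - \frac{C}{x}\right) = r'(x) - \frac{m\,r(x)}{x-\alpha_k}, \]

which is impossible since the left side is continuous at $x = \alpha_k$ but the right is not.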
Corollary 6. Any Liouville-like integral of the type considered in this section has a Hardy's reduction that is the sum of an elementary function plus a finite number of nonelementary integrals of the form $\int \frac{\log x}{x-\alpha}\,dx$, with $\alpha \in \mathbb{C} \setminus \{0\}$.




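Example 11. Find a Hardy's reduction of

\[ \int \frac{3x^2 - 5x + 4}{x(x-2)^2}\log x\,dx. \]

Note that

\[ \frac{3x^2 - 5x + 4}{x(x-2)^2} = \frac{1}{x} + \frac{2}{x-2} + \frac{3}{(x-2)^2}, \]

with $\frac{1}{x-2} \in N$ and $\frac{1}{x}, \frac{1}{(x-2)^2} \in E$. Thus,

\[ \int \frac{3x^2 - 5x + 4}{x(x-2)^2}\log x\,dx = 2\int \frac{\log x}{x-2}\,dx + 3\int \frac{\log x}{(x-2)^2}\,dx + \int \frac{\log x}{x}\,dx \]
\[ \qquad = 2\int \frac{\log x}{x-2}\,dx - \frac{3\log x}{x-2} + \frac{3}{2}\log\left(1 - \frac{2}{x}\right) + \frac{1}{2}(\log x)^2. \]

Since $\frac{1}{x-2} \in N \setminus \{0\}$, the given integral is not elementary.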
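The elementary part of this reduction can be sanity-checked numerically. The sketch below is only a floating-point check, not a proof, and the function names are illustrative. It compares a central-difference derivative of the elementary part with the integrand $\left(\frac{1}{x} + \frac{3}{(x-2)^2}\right)\log x$ at real points $x > 2$, and also compares the two sides of the partial fraction decomposition.

```python
import math


def elementary_part(x):
    """Closed-form part of the reduction, valid for real x > 2."""
    return (-3 * math.log(x) / (x - 2)
            + 1.5 * math.log(1 - 2 / x)
            + 0.5 * math.log(x) ** 2)


def elementary_integrand(x):
    """The piece of the integrand that lies in E, times log x."""
    return (1 / x + 3 / (x - 2) ** 2) * math.log(x)


def full_rational(x):
    """Rational factor of the original integrand."""
    return (3 * x ** 2 - 5 * x + 4) / (x * (x - 2) ** 2)


def split_rational(x):
    """Partial fraction decomposition of the same rational factor."""
    return 1 / x + 2 / (x - 2) + 3 / (x - 2) ** 2


def central_difference(func, x, h=1e-5):
    """Second-order numerical derivative of func at x."""
    return (func(x + h) - func(x - h)) / (2 * h)
```

Evaluating `central_difference(elementary_part, x)` and `elementary_integrand(x)` at a few points such as $x = 3$ or $x = 5$ should give values that agree to several digits.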

ACKNOWLEDGMENT. We are thankful for the valuable comments, remarks, and suggestions of the referees. One of the authors, J. Cruz-Sampedro, was supported during his sabbatical leave by the Universidad Autónoma Metropolitana-Azcapotzalco.

REFERENCES
1. J. Baddoura, Integration in finite terms with elementary functions and dilogarithms, J. Symbolic Comput. 41 (2006) 909-942.
2. M. Bronstein, The transcendental Risch differential equation, J. Symbolic Comput. 9 (1990) 49-60.
3. M. Bronstein, Integration of elementary functions, J. Symbolic Comput. 9 (1990) 117-173.
4. M. Bronstein, Symbolic Integration I: Transcendental Functions. Second edition. With a foreword by B. F. Caviness. Vol. 1, Algorithms and Computation in Mathematics, Springer-Verlag, Berlin Heidelberg, 2005.
5. J. H. Davenport, The Risch differential equation problem, SIAM J. Comput. 15 (1986) 903-918.
6. P. Diaconis, S. Zabell, Closed form summation for classical distributions: variations on a theme of de Moivre, Statist. Sci. 6 (1991) 284-302.
7. G. H. Hardy, The Integration of Functions of a Single Variable. Cambridge Tracts in Mathematics and Mathematical Physics, Cambridge Univ. Press, Warehause, England, 1905, https://archive.org/details/integrationoffun00hardrich.
8. J. Liouville, Mémoire sur l'intégration d'une classe de fonctions transcendantes, J. Reine Angew. Math. 13 (1835) 93-118, http://gdz.sub.uni-goettingen.de/dms/load/img/?PPN=GDZPPN002140268&IDDOC=267366.
9. E. A. Marchisotto, G.-A. Zakeri, An invitation to integration in finite terms, Coll. Math. J. 25 (1994) 295-308.
10. D. G. Mead, Classroom notes: Integration, Amer. Math. Monthly 68 (1961) 152-156.
11. R. H. Risch, The problem of integration in finite terms, Trans. Amer. Math. Soc. 139 (1969) 167-189.
12. J. F. Ritt, Integration in Finite Terms. Liouville's Theory of Elementary Methods. Columbia Univ. Press, New York, 1948.
13. M. Rosenlicht, Integration in finite terms, Amer. Math. Monthly 79 (1972) 963-972.
14. G. F. Simmons, Calculus with Analytic Geometry, McGraw Hill, New York, 1985.
15. M. F. Singer, B. D. Saunders, B. F. Caviness, An extension of Liouville's theorem on integration in finite terms, SIAM J. Comput. 14 (1985) 966-990.
16. I. Stewart, Galois Theory. Second edition. Chapman and Hall, New York, 1989.
17. P. L. Tchebychef, Intégration des Différentielles Irrationnelles, Oeuvres de P. L. Tchebychef, Imprimerie de l'Académie Impériale des Sciences, St. Pétersbourg 1 (1899) 147-168, https://archive.org/details/oeuvresdepltche01chebrich.
18. L. Verde-Star, Rational functions, Amer. Math. Monthly 116 (2009) 804-827.


JAIME CRUZ-SAMPEDRO received his Ph.D. in mathematics from the University of Virginia. His mathematical interests included applied mathematics, number theory, analysis, operator theory, and Schrödinger operators.
Universidad Autónoma Metropolitana-Azcapotzalco, Mexico City 02200, Mexico

MARGARITA TETLALMATZI-MONTIEL received her M.S. in mathematics from the University of Virginia. Her mathematical interests include probability, applied mathematics, and history of mathematics.
Universidad Autónoma del Estado de Hidalgo, Pachuca Hgo. 42184, Mexico
tmontiel@uaeh.edu.mx

Editor's Note: We publish the following memorial statement written by Margarita Tetlalmatzi-Montiel regarding her co-author, Jaime Cruz-Sampedro. On November 3, 2015, the mathematical community lost a great friend with the passing of Jaime Cruz-Sampedro. We knew Jaime as a person who supported our efforts to do important things and who kept a wary eye open to make sure we were living up to his high expectations. We were always the better for this and will now try to continue to bear in mind his standards. We owe him a lot and will miss him a lot.

An Alternative Approach to the Product Rule


The usual proof of the product rule in calculus involves adding and subtracting the
same quantity, which can be unintuitive for students. Here is an alternative which
uses the fact that f g arises as a cross term in ( f + g)2 .
Using the power rule,

\[ \big((f+g)^2\big)' = 2(f+g)(f'+g') = 2\big(ff' + gg' + f'g + fg'\big). \]

On the other hand, by expanding first and then differentiating,

\[ \big((f+g)^2\big)' = \big(f^2 + g^2 + 2fg\big)' = 2\big(ff' + gg' + (fg)'\big). \]

Comparing the above expressions reveals the product rule

\[ (fg)' = f'g + fg'. \]

One need not know the chain rule to carry out these calculations because it is easy to derive the formula for the derivative of the square of a function directly:

\[ \big(u^2(x)\big)' = \lim_{h\to 0}\frac{u^2(x+h) - u^2(x)}{h} = \lim_{h\to 0}\big(u(x+h) + u(x)\big)\frac{u(x+h) - u(x)}{h} = 2u(x)u'(x). \]
Submitted by Piotr Josevich
http://dx.doi.org/10.4169/amer.math.monthly.123.5.470
MSC: Primary 26A06





Fast and Simple Modular Interpolation Using


Factorial Representation
G. L. Mullen, D. Panario, and D. Thomson
Abstract. We study a representation for polynomial functions over finite rings. This factorial
representation is particularly useful for fast interpolation, and we show that it is computationally preferable to the Lagrange Interpolation Formula (LIF) and to Newton interpolation
over finite fields and rings. Moreover, over arbitrary finite rings the calculation of the factorial
representation aborts naturally when a given mapping does not arise as a polynomial function.

1. INTRODUCTION. Let $R$ be a commutative ring with unity. A function $\varphi : R \to R$ is representable by a polynomial if $\varphi(r) = f(r)$ for some $f \in R[x]$ and all $r \in R$. Functions which are representable as polynomials (mod $m$) for some positive integer $m$, and their generalizations, have been studied; see for example [8, 11].
Throughout this manuscript, let $q = p^m$ for some prime $p$ and positive integer $m$. The finite field with $q$ elements is denoted by $\mathbb{F}_q$ and the finite ring of integers modulo $p^m$ is denoted $\mathbb{Z}_{p^m}$. The Lagrange Interpolation Formula (LIF) shows that any mapping $\varphi : \mathbb{F}_q \to \mathbb{F}_q$ can be given by a polynomial of degree at most $q - 1$.
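Theorem 1. Let $\alpha_0, \alpha_1, \dots, \alpha_{n-1}$ be distinct elements of $\mathbb{F}_q$ and let $\beta_0, \beta_1, \dots, \beta_{n-1}$ be arbitrary elements of $\mathbb{F}_q$. There exists exactly one polynomial $f \in \mathbb{F}_q[x]$ of degree less than $n$ such that $f(\alpha_i) = \beta_i$, $0 \le i \le n - 1$, given by

\[ f(x) = \sum_{i=0}^{n-1} \beta_i \prod_{j=0,\ j\ne i}^{n-1} \frac{x - \alpha_j}{\alpha_i - \alpha_j}. \tag{1} \]

Another expression for the LIF of a function $f : \mathbb{F}_q \to \mathbb{F}_q$ can be obtained by rearranging Equation (1), and is given by $\sum_{a \in \mathbb{F}_q} f(a)\big(1 - (x-a)^{q-1}\big)$.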

A less well-known, but computationally superior, interpolation method is Newton Interpolation. Given a mapping $\varphi$, a set of distinct elements $\{\alpha_i\}$ of $\mathbb{F}_q$, and their images $\{\beta_i = \varphi(\alpha_i)\}$, construct the tableau of divided differences recursively by $f[\alpha_0] = \beta_0$, $f[\alpha_1] = \beta_1$, ..., $f[\alpha_{n-1}] = \beta_{n-1}$, and

\[ f[\alpha_j, \alpha_{j+1}, \dots, \alpha_{k-1}, \alpha_k] = \frac{f[\alpha_{j+1}, \dots, \alpha_k] - f[\alpha_j, \dots, \alpha_{k-1}]}{\alpha_k - \alpha_j}. \]

The Newton interpolating polynomial $f$ is given by

\[ f(x) = f[\alpha_0] + f[\alpha_0, \alpha_1](x - \alpha_0) + f[\alpha_0, \alpha_1, \alpha_2](x - \alpha_0)(x - \alpha_1) + \cdots + f[\alpha_0, \alpha_1, \dots, \alpha_{n-1}](x - \alpha_0)(x - \alpha_1)\cdots(x - \alpha_{n-2}), \]

where the evaluations can be computed using Horner's rule. We compare the computational complexity of these interpolation methods in Section 3 and compare an implementation of factorial interpolants to LIF in Table 2.
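The divided-difference tableau and the Horner-style evaluation can be sketched in a few lines of Python over a prime field $\mathbb{F}_p$. This is a minimal illustration, not the implementation timed in the paper. The function names are hypothetical, and it assumes Python 3.8 or later for modular inverses via `pow(x, -1, p)`.

```python
def newton_coefficients(points, values, p):
    """Divided differences f[a_0], f[a_0,a_1], ..., f[a_0,...,a_{n-1}] modulo a prime p.

    points must be pairwise distinct modulo p; values[i] is the image of points[i].
    """
    n = len(points)
    table = [v % p for v in values]
    coeffs = [table[0]]
    for level in range(1, n):
        # After this pass, table[j] holds f[a_j, ..., a_{j+level}].
        table = [
            (table[j + 1] - table[j])
            * pow((points[j + level] - points[j]) % p, -1, p) % p
            for j in range(n - level)
        ]
        coeffs.append(table[0])
    return coeffs


def newton_eval(coeffs, points, x, p):
    """Evaluate the Newton form at x with Horner's rule, modulo p."""
    result = 0
    for c, alpha in zip(reversed(coeffs), reversed(points[:len(coeffs)])):
        result = (result * (x - alpha) + c) % p
    return result
```

For example, over $\mathbb{F}_5$ with points `[0, 1, 2, 3, 4]` and values `[0, 2, 3, 4, 1]`, the interpolant returned by `newton_eval` reproduces the given values at every point.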
http://dx.doi.org/10.4169/amer.math.monthly.123.5.471
MSC: Primary 11T06


When considering functions on the ring of integers modulo $n$, by the Chinese remainder theorem it is enough to consider functions on the ring $\mathbb{Z}_{p^m}$, where $p^m$ is the largest power of $p$ dividing $n$. When $m > 1$, not every function on the ring $\mathbb{Z}_{p^m}$ can be represented by a polynomial with coefficients in the ring.
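Example. Consider the function $f : \mathbb{Z}_4 \to \mathbb{Z}_4$ defined by $f(0) = 0$ and $f(a) = 1$ for $a \ne 0$. If $f(x) = a_0 + a_1 x + a_2 x^2 + a_3 x^3$, then $f(0) = 0 = a_0$ and the remaining coefficients can be determined by solving the linear system

\[ \begin{pmatrix} 1 & 1 & 1 \\ 2 & 0 & 0 \\ 3 & 1 & 3 \end{pmatrix} \begin{pmatrix} a_1 \\ a_2 \\ a_3 \end{pmatrix} = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}. \]

We immediately check that there is no such solution, since the second row requires $2a_1 = 1$ and 2 is not a unit in $\mathbb{Z}_4$.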
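Because $\mathbb{Z}_4$ is tiny, the claim can also be confirmed by exhaustive search. The following Python sketch (the name `polynomial_functions` is illustrative) tabulates the function induced by every polynomial of degree at most 3 with coefficients in $\mathbb{Z}_4$ and checks whether the target table $(0, 1, 1, 1)$ occurs.

```python
from itertools import product


def polynomial_functions(modulus, max_degree):
    """Value tables (f(0), ..., f(modulus-1)) of all polynomials of degree <= max_degree."""
    tables = set()
    for coeffs in product(range(modulus), repeat=max_degree + 1):
        table = tuple(
            sum(c * pow(x, i, modulus) for i, c in enumerate(coeffs)) % modulus
            for x in range(modulus)
        )
        tables.add(table)
    return tables


tables = polynomial_functions(4, 3)
target = (0, 1, 1, 1)  # f(0) = 0 and f(a) = 1 for a != 0
target_is_polynomial = target in tables
```

The brute-force approach is only feasible for very small moduli, since the number of coefficient tuples grows as `modulus ** (max_degree + 1)`.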
Though the main object of study in our paper is single- and multivariate polynomial functions over finite rings, we briefly mention three related works on polynomial functions in a more general setting. The reference [2] studies generalized polynomial mappings from $\mathbb{Z}_n$ to $\mathbb{Z}_m$; a follow-up work is also presented in [1]. Generalized polynomial mappings from $\mathbb{Z}_{n_1} \times \mathbb{Z}_{n_2} \times \cdots \times \mathbb{Z}_{n_r}$ to $\mathbb{Z}_m$ are also studied in [3].
In this paper we consider an interpolation method due to a factorial representation
of polynomial functions. If a function does not admit a polynomial representation,
the factorial interpolation calculation fails in a natural way. In Section 2, we outline
some facts about factorial representations. In Section 3, we analyze the computational
complexity of factorial interpolation in Z pm , when it exists, and we show that our
method is preferable to the Lagrange and Newton interpolation methods. In Section 4
we compare an implementation of factorial representations with the interpolation function (based on LIF) of the Number Theory Library; see [10]. In Section 5 we briefly
discuss some extensions of factorial representations to functions in several variables
and over infinite rings. We conclude in Section 6 with comments regarding the use of factorial interpolation in Shamir's secret sharing scheme.
2. FACTORIAL REPRESENTATION. The notion of factorial representations for
expressing polynomial functions can be found in [8].
Definition. Let $j$ be a positive integer. The $j$th (falling) factorial of $x$, denoted $x^{(j)}$, is defined by

\[ x^{(j)} = x(x-1)\cdots(x-j+1), \tag{2} \]

where $x$ is any element in the considered ring.

We use the simple observation that $i^{(j)} = 0$ whenever $j > i$ to develop an interpolation technique. We also use the convention $i^{(0)} = 1$ whenever $i \ge 0$.
Definition. Let $R$ be a commutative ring with unity and suppose a function $\varphi : R \to R$ satisfies $\varphi(r) = g(r)$ for all $r \in R$ and some $g \in R[x]$ given by

\[ g(x) = \sum_{j=0}^{n} a_{(j)} x^{(j)} \in R[x]. \]

Then $g$ is the factorial representation of $\varphi$ and the act of finding $g$ is the factorial interpolation of $\varphi$.
Let $g \in R[x]$ be in factorial representation. Since $i^{(j)} = 0$ for $j > i$, the evaluation $g(i)$ requires computation and addition of only the first $i + 1$ terms of the sum. This paper is primarily concerned, however, with expressing a function in factorial representation.
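Factorial interpolation over prime fields. First we consider the case of maps $\varphi : \mathbb{F}_p \to \mathbb{F}_p$, where $p$ is a prime. Recall that all such $\varphi$ can be represented by a polynomial. Let $f(x) = \sum_{i=0}^{p-1} a_{(i)} x^{(i)}$. In factorial notation,

\[ f(k) = k^{(k)} a_{(k)} + k^{(k-1)} a_{(k-1)} + \cdots + k^{(1)} a_{(1)} + a_{(0)}, \]

since $k^{(k+i)} = 0$ for any $i > 0$. We observe that $a_{(0)} = \varphi(0)$, and solving for the coefficients is equivalent to solving the system

\[ \begin{pmatrix} (p-1)^{(p-1)} & (p-1)^{(p-2)} & \cdots & (p-1)^{(1)} \\ 0 & (p-2)^{(p-2)} & \cdots & (p-2)^{(1)} \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1^{(1)} \end{pmatrix} \begin{pmatrix} a_{(p-1)} \\ a_{(p-2)} \\ \vdots \\ a_{(1)} \end{pmatrix} = \begin{pmatrix} \varphi(p-1) - \varphi(0) \\ \varphi(p-2) - \varphi(0) \\ \vdots \\ \varphi(1) - \varphi(0) \end{pmatrix}, \tag{3} \]

which is triangular with nonzero diagonal entries $k^{(k)} = k!$ and hence invertible. Therefore, there is a unique solution and the number of functions representable in factorial representation is $p^p$.

Proposition 1. Let $\varphi : \mathbb{F}_p \to \mathbb{F}_p$ and let $f(x) = \sum_{i=0}^{p-1} a_{(i)} x^{(i)}$. Set $a_{(0)} = \varphi(0)$, and let $a_{(i)}$ be given by the solution of Equation (3) for $i = 1, \dots, p - 1$. Then $f$ is the factorial interpolation of $\varphi$.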
Next we show that a back-substitution method with precomputation can efficiently
find the factorial representation of a function.
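Example. Consider the function $\varphi : \mathbb{Z}_5 \to \mathbb{Z}_5$ defined in two row notation by

\[ \begin{pmatrix} 0 & 1 & 2 & 3 & 4 \\ 0 & 2 & 3 & 4 & 1 \end{pmatrix} \]

and let $f(x) = \sum_{i=0}^{4} a_{(i)} x^{(i)}$ be its factorial interpolation. Clearly, $a_{(0)} = 0$ since $\varphi(0) = 0$. The matrix equation to solve is

\[ \begin{pmatrix} 4 & 4 & 2 & 4 \\ 0 & 1 & 1 & 3 \\ 0 & 0 & 2 & 2 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} a_{(4)} \\ a_{(3)} \\ a_{(2)} \\ a_{(1)} \end{pmatrix} = \begin{pmatrix} 1 \\ 4 \\ 3 \\ 2 \end{pmatrix}. \]

Backwards substitution beginning from $a_{(1)} = 2$ yields the solution

\[ (a_{(4)}, a_{(3)}, a_{(2)}, a_{(1)}) = (0, 1, 2, 2), \]

and the factorial representation of $\varphi$ is $f(x) = 2x + 2x(x-1) + x(x-1)(x-2)$.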
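The back-substitution in this example can be sketched as follows for a prime modulus. The function names are illustrative, and modular inverses use Python 3.8+ `pow(x, -1, p)`. This is not the authors' implementation: it recomputes the falling factorials on the fly instead of precomputing the matrix.

```python
def falling_factorial(x, j, p):
    """x^(j) = x (x-1) ... (x-j+1) modulo p, with x^(0) = 1."""
    result = 1
    for t in range(j):
        result = result * (x - t) % p
    return result


def factorial_interpolation(values, p):
    """Coefficients [a_(0), ..., a_(p-1)] for the map k -> values[k] on F_p, p prime."""
    coeffs = [values[0] % p]
    for k in range(1, p):
        # Contribution of the coefficients already known: sum_{j<k} k^(j) a_(j).
        known = sum(falling_factorial(k, j, p) * coeffs[j] for j in range(k)) % p
        diagonal = falling_factorial(k, k, p)  # equals k!, nonzero modulo p for k < p
        coeffs.append((values[k] - known) * pow(diagonal, -1, p) % p)
    return coeffs


def evaluate(coeffs, x, p):
    """Evaluate sum_j a_(j) x^(j) modulo p."""
    return sum(c * falling_factorial(x, j, p) for j, c in enumerate(coeffs)) % p
```

Calling `factorial_interpolation([0, 2, 3, 4, 1], 5)` returns `[0, 2, 2, 1, 0]`, which matches the solution $(a_{(4)}, a_{(3)}, a_{(2)}, a_{(1)}) = (0, 1, 2, 2)$ with $a_{(0)} = 0$.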

Factorial interpolation over $\mathbb{Z}_q$, $q = p^m$. We now consider factorial interpolation of functions over $\mathbb{Z}_q$, $q = p^m$, $m > 1$. Mullen and Stevens [8] showed that polynomial functions are expressible in factorial notation, as follows.
Let $s$ be a positive integer and denote by $v_p(s)$ the exponent of the largest power of $p$ which divides $s!$; that is, $v_p(s) = \sum_{r \ge 1} \lfloor s/p^r \rfloor$. Also, denote by $N_q$ the unique integer such that $q$ divides $(N_q + 1)!$ but $q$ does not divide $N_q!$, so that $N_q \equiv -1 \pmod{p}$.
Theorem 2. [8, Theorem 2.1] The distinct polynomial functions (mod $p^m$) are exactly those represented in the form

\[ f(x) = \sum_{j=0}^{N_q} a_{(j)} x^{(j)}, \]

where $a_{(j)}$ is a representative of a complete residue system (mod $p^{m - v_p(j)}$). Moreover, for $m \le p$, the number of such polynomial functions is $p^{pm(m+1)/2}$.
Theorem 2 shows that the method of factorial interpolation for rings Zq will finish
successfully if and only if the function is representable as a polynomial. As commented
above, if q = p, then this process always finishes successfully.
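Theorem 3. Let $\varphi : \mathbb{Z}_q \to \mathbb{Z}_q$. Either the system of equations

\[ \begin{pmatrix} (q-1)^{(q-1)} & (q-1)^{(q-2)} & \cdots & (q-1)^{(1)} \\ 0 & (q-2)^{(q-2)} & \cdots & (q-2)^{(1)} \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1^{(1)} \end{pmatrix} \begin{pmatrix} a_{(q-1)} \\ a_{(q-2)} \\ \vdots \\ a_{(1)} \end{pmatrix} = \begin{pmatrix} \varphi(q-1) - \varphi(0) \\ \varphi(q-2) - \varphi(0) \\ \vdots \\ \varphi(1) - \varphi(0) \end{pmatrix} \tag{4} \]

has a solution and $f(x) = \sum_{j=0}^{p^m - 1} a_{(j)} x^{(j)}$ is the factorial interpolation of $\varphi$, where $a_{(0)} = \varphi(0)$, or $\varphi$ is not representable as a polynomial.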
We return to our previous example of a function not expressible as a polynomial.
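Example. Consider the function $\varphi : \mathbb{Z}_4 \to \mathbb{Z}_4$ defined by $\varphi(0) = 0$ and $\varphi(a) = 1$ for $a \ne 0$. The adaptation of Equation (4) to $\mathbb{Z}_4$ is obvious, and the equation becomes

\[ \begin{pmatrix} 2 & 2 & 3 \\ 0 & 2 & 2 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} a_{(3)} \\ a_{(2)} \\ a_{(1)} \end{pmatrix} = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}. \]

It is immediate that unless $a_{(2)} = 0$, there can be no solution, since 2 is on the diagonal and has no inverse. Eliminating the (2, 3) entry shows that $2a_{(2)} = 3$, a contradiction.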
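One way to turn this observation into a procedure over $\mathbb{Z}_q$ is to solve each row as a linear congruence and report failure as soon as a row is unsolvable. The sketch below is an assumption-laden variant and not necessarily the authors' routine: it uses a gcd test, so a non-unit on the diagonal is accepted whenever the congruence is still solvable, and it then picks one admissible coefficient. All function names are hypothetical.

```python
from math import gcd


def falling_factorial(x, j, q):
    """x^(j) = x (x-1) ... (x-j+1) modulo q, with x^(0) = 1."""
    result = 1
    for t in range(j):
        result = result * (x - t) % q
    return result


def factorial_interpolation_mod(values, q):
    """Return [a_(0), ..., a_(q-1)] for the map k -> values[k] on Z_q, or None on failure."""
    coeffs = [values[0] % q]
    for k in range(1, q):
        known = sum(falling_factorial(k, j, q) * coeffs[j] for j in range(k)) % q
        rhs = (values[k] - known) % q
        diagonal = falling_factorial(k, k, q)
        if diagonal == 0:
            # Row reads 0 * a_(k) = rhs: solvable only when rhs vanishes.
            if rhs != 0:
                return None
            coeffs.append(0)
            continue
        g = gcd(diagonal, q)
        if rhs % g != 0:
            return None  # congruence diagonal * a = rhs (mod q) has no solution
        reduced = q // g
        coeffs.append((rhs // g) * pow(diagonal // g, -1, reduced) % reduced)
    return coeffs


def evaluate(coeffs, x, q):
    """Evaluate sum_j a_(j) x^(j) modulo q."""
    return sum(c * falling_factorial(x, j, q) for j, c in enumerate(coeffs)) % q
```

On the example above, `factorial_interpolation_mod([0, 1, 1, 1], 4)` returns `None`, because the second row reduces to $2a_{(2)} \equiv 3 \pmod 4$.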
Conversions between factorial and polynomial representations. For a commutative ring with unity $R$ and a transcendental element $x$, the (single variable) polynomial ring $R[x]$ can be considered a free commutative $R$-algebra with generators $\{1, x, x^2, \dots\}$. Suppose that $R = \{0, 1, r_1, \dots, r_{n-1}\}$ is a finite set of points. If $f \in R[x]$ is a polynomial mapping, then we have seen it is expressible in factorial representation; that is, if $f(x) = \sum_{i=0}^{m} a_i x^i$ for any $x \in R$, then $f$ has a representation as $f(x) = \sum_{i=0}^{m} a_{(i)} x^{(i)}$, for some $a_{(i)} \in R$.
Converting between traditional and factorial representations can also be performed with linear algebra with the following observation. Define a free commutative $R$-algebra with generators $\{1, x, x(x - r_1), \dots, x(x - r_1)\cdots(x - r_{n-1})\}$; it is easy to see that this is a generating set for the subspace of $R[x]$ of polynomials of degree at most $n$ using distributivity and that each term in a generating set is monic. Hence, to convert between traditional polynomial representation and factorial representations, it is enough to perform a change-of-basis operation on the respective coefficient vectors.
This change-of-basis relies on combinatorial constants called Stirling numbers. Stirling numbers have many interesting combinatorial and number theoretic properties; we list a few of them here and refer the reader to [7, Section 6.1] for more information.
The Stirling numbers of the first kind, denoted $s(n, k)$, are given by the relations

\[ x^{(n)} = \sum_{k=0}^{n} (-1)^{n-k} s(n, k)\, x^k, \qquad n = 0, 1, \dots. \]

There are many equivalent definitions; for example, $s(n, k)$ counts the number of permutations of $n$ elements with exactly $k$ disjoint cycles. The reverse conversion coefficients are given by the Stirling numbers of the second kind, denoted $S(n, k)$, which are given by the relations

\[ x^n = \sum_{k=0}^{n} S(n, k)\, x^{(k)}, \qquad n = 0, 1, \dots, \]

and we draw notice to the lack of the alternating sign in this direction. Stirling numbers of the second kind likewise have a combinatorial interpretation as the number of ways to partition an $n$-set into $k$ nonempty subsets.
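Let $g(x) = \sum_{i=0}^{n} a_i x^i$. Rewriting each monomial with the Stirling numbers of the second kind gives $g(x) = \sum_{i=0}^{n} a_i \sum_{k=0}^{i} S(i, k)\, x^{(k)}$, so that $a_{(k)} = \sum_{i=k}^{n} S(i, k)\, a_i$. Hence, the vector of (factorial) coefficients of $g$ satisfies the matrix equation

\[ \begin{pmatrix} a_{(0)} \\ a_{(1)} \\ \vdots \\ a_{(n)} \end{pmatrix} = \begin{pmatrix} S(0,0) & S(1,0) & \cdots & S(n,0) \\ 0 & S(1,1) & \cdots & S(n,1) \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & S(n,n) \end{pmatrix} \begin{pmatrix} a_0 \\ a_1 \\ \vdots \\ a_n \end{pmatrix}. \]

Similarly, if $h$ is given in factorial representation, $h(x) = \sum_{i=0}^{n} a_{(i)} x^{(i)}$, then by the relations for the Stirling numbers of the first kind its traditional polynomial representation has coefficients $a_k = \sum_{i=k}^{n} (-1)^{i-k} s(i, k)\, a_{(i)}$; that is, the vector of coefficients satisfies

\[ \begin{pmatrix} a_0 \\ a_1 \\ \vdots \\ a_n \end{pmatrix} = \begin{pmatrix} s(0,0) & -s(1,0) & \cdots & (-1)^{n} s(n,0) \\ 0 & s(1,1) & \cdots & (-1)^{n-1} s(n,1) \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & s(n,n) \end{pmatrix} \begin{pmatrix} a_{(0)} \\ a_{(1)} \\ \vdots \\ a_{(n)} \end{pmatrix}. \]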

The complexity of this conversion is then the cost of one fully dense upper triangular matrix-vector multiplication; naive multiplication takes $\sum_{i=1}^{n+1} (2i - 1) = (n+1)^2$ base field operations. This cost supposes that the Stirling numbers have been precomputed, which requires storing $n(n+1)/2$ constants. Moreover, Stirling numbers may be computed using the simple recurrences

\[ s(n, k) = (n-1)\,s(n-1, k) + s(n-1, k-1) \]

for $n > 0$ with $s(0, 0) = s(n, n) = 1$ and $s(n, k) = 0$ for $k > n$,

\[ S(n, k) = k\,S(n-1, k) + S(n-1, k-1) \]

for $n > 0$ with $S(0, 0) = S(n, n) = 1$ and $S(n, k) = 0$ for $k > n$.
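The recurrences translate directly into code. The following Python sketch builds both tables and converts coefficient vectors in each direction, following the defining relations for $s(n, k)$ and $S(n, k)$. The function names are illustrative, the boundary values $s(n, 0) = S(n, 0) = 0$ for $n > 0$ are assumed, and integer arithmetic is used (reduce modulo $q$ afterwards if working over $\mathbb{Z}_q$).

```python
def stirling_tables(n):
    """Unsigned first-kind s[m][k] and second-kind S[m][k] for 0 <= k <= m <= n."""
    s = [[0] * (n + 1) for _ in range(n + 1)]
    S = [[0] * (n + 1) for _ in range(n + 1)]
    s[0][0] = S[0][0] = 1
    for m in range(1, n + 1):
        for k in range(1, m + 1):
            s[m][k] = (m - 1) * s[m - 1][k] + s[m - 1][k - 1]
            S[m][k] = k * S[m - 1][k] + S[m - 1][k - 1]
    return s, S


def monomial_to_factorial(a):
    """From coefficients a[i] of x**i to coefficients of the falling factorials x^(k)."""
    n = len(a) - 1
    _, S = stirling_tables(n)
    return [sum(S[i][k] * a[i] for i in range(k, n + 1)) for k in range(n + 1)]


def factorial_to_monomial(b):
    """From coefficients b[i] of x^(i) to coefficients of the monomials x**k."""
    n = len(b) - 1
    s, _ = stirling_tables(n)
    return [
        sum((-1) ** (i - k) * s[i][k] * b[i] for i in range(k, n + 1))
        for k in range(n + 1)
    ]
```

For example, `monomial_to_factorial([0, 0, 1])` gives `[0, 1, 1]`, reflecting $x^2 = x^{(2)} + x^{(1)}$, and `factorial_to_monomial([0, 2, 2, 1, 0])` gives `[0, 2, -1, 1, 0]`, that is, $2x + 2x(x-1) + x(x-1)(x-2) = x^3 - x^2 + 2x$.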
We remark that similar conversions should be possible for multivariate interpolation
(see Section 5), though this requires tensors and new analogs of Stirling numbers and
a full investigation is beyond the scope of this paper.
3. COMPUTATIONAL COMPLEXITY. In this section we analyze the computational complexity of computing the factorial interpolation over $\mathbb{Z}_q$, $q = p^m$. The interpolated polynomial may have degree at most $q - 1$, so we compute the computational complexity of this maximal case. Consider the equation to solve the factorial interpolation of a map $\varphi$ given by Equation (4). The leftmost matrix is fully dense and upper-triangular. We observe that over $\mathbb{F}_p$ every entry above the diagonal is distinct, so further Gaussian elimination cannot yield an improvement. Thus, we present a solution of this system by back-substitution.
Theorem 4. Let $(q - i)^{(q - j)}$ be precomputed for all $i = 1, 2, \dots, q - 1$ and $j = 1, \dots, i$. Then computing the factorial interpolation of a mapping $\mathbb{Z}_q \to \mathbb{Z}_q$ can be performed with $(q - 1)^2$ operations in $\mathbb{Z}_q$.
Proof. Consider the row corresponding to $q - i$. The backwards substitution is given by

\[ (q-i)^{(q-i)} a_{(q-i)} + \cdots + (q-i)^{(1)} a_{(1)} = \varphi(q-i) - \varphi(0). \]

Precomputing all $(q-i)^{(q-i-j)}$, $j = 0, \dots, q - i - 1$, and assuming that $a_{(q-i-j)}$ has already been computed by a previous row, gives a simple count of operations of $q - i - 1$ multiplications, $q - i - 2$ additions, 1 subtraction, and 1 modular division to compute $a_{(q-i)}$. At any point, if the coefficient to divide is not a unit, then the factorial interpolation is impossible and the function is not representable by a polynomial.
We assume that all operations have equal cost, since the Cayley table of all operations can be considered to be a look-up. In this paradigm, the total number of operations for row $q - i$ is $2(q - i) - 1$. Hence, the total number of operations is $\sum_{i=1}^{q-1} \big(2(q - i) - 1\big) = (q - 1)^2$.
We consider a further improvement based on Horner's rule. We illustrate this with an example. For simplicity, let $m = 1$ and $\varphi(0) = a_{(0)} = 0$, then $a_{(1)} = \varphi(1)$. Next, $2^{(2)} a_{(2)} + 2^{(1)} a_{(1)} = 2(a_{(2)} + a_{(1)}) = \varphi(2)$ with $a_{(1)}$ known. Hence, $a_{(2)}$ can be determined with 1 division and 1 subtraction in serial. Continuing like this, we obtain

\[ \varphi(3) = 3(2)(1)a_{(3)} + 3(2)a_{(2)} + 3a_{(1)} = 3\big(2(a_{(3)} + a_{(2)}) + a_{(1)}\big). \]

Hence, $a_{(3)}$ can be computed by performing one division followed by one subtraction, two times in serial. Inductively, it is straightforward to see that when computing the row corresponding to $i$, there are $i - 1$ such iterations of Horner's rule, each requiring one subtraction and one modular division. Once again assuming that all field computations are equal, the cost of evaluating in $\mathbb{F}_p$ is $2\sum_{i=1}^{p-1} (i - 1) = (p - 2)(p - 1)$ field computations. Moreover, this improvement shows we need not precompute all falling factorials $(p - i)^{(j)}$, $i = 1, 2, \dots, p - 1$, $j = 1, 2, \dots, i$. We observe, however, that this improvement is not applicable when the (integral) modulus is not prime (that is, if $m > 1$) since it requires division by $i$ for all $i$ (of which some may be zero divisors).
Table 1 provides a comparison of our methods with classical interpolation methods; see [5, Section 5.2], for details on the costs of Lagrange and Newton interpolation methods. We observe that the "Factorial (Horner)" entry applies only to the case $q = p$, a prime number.
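The nested form used in this example suggests the following Python sketch for a prime modulus. It is a hedged illustration of the Horner-style idea rather than the authors' code: the function names are hypothetical, $\varphi(0)$ is subtracted first so that the sketch also covers $\varphi(0) \ne 0$, and inverses are taken with Python 3.8+ `pow(x, -1, p)`.

```python
def horner_factorial_interpolation(values, p):
    """Coefficients [a_(0), ..., a_(p-1)] over F_p using the nested (Horner) form.

    Uses phi(k) - a_(0) = k (a_(1) + (k-1) (a_(2) + ... + 1 * a_(k))).
    """
    coeffs = [values[0] % p]
    for k in range(1, p):
        t = (values[k] - values[0]) % p
        # k - 1 iterations, each one modular division followed by one subtraction.
        for j in range(k - 1):
            t = (t * pow(k - j, -1, p) - coeffs[j + 1]) % p
        coeffs.append(t)  # the final division would be by 1
    return coeffs


def evaluate_factorial(coeffs, x, p):
    """Evaluate sum_j a_(j) x^(j) modulo p in nested form."""
    result = 0
    for j in range(len(coeffs) - 1, -1, -1):
        result = (coeffs[j] + (x - j) * result) % p
    return result
```

No falling factorials are precomputed here, which reflects the remark above; the price is that every divisor $k - j$ must be invertible, so the sketch is restricted to prime $p$.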
Table 1. Number of field operations (over $\mathbb{F}_q$) for several interpolation methods.

Method                   Field operations
Factorial (backsolve)    (q - 1)^2
Factorial (Horner)       (q - 1)(q - 2)
LIF                      7q^2 - 8q + 1
Newton                   (5/2) q^2

4. SOFTWARE IMPLEMENTATION. In this section we compare a basic software implementation of our factorial representations with the interpolate function of the lzz_p class in the Number Theory Library (NTL) [10], which is for computations over $\mathbb{F}_p$. We chose NTL for our comparison because it is considered a standard package for finite field arithmetic, and also because it is the package with which the authors are most familiar. Popular computer algebra packages also have finite field computations, but are often slower in practice.
We ran our program on an Intel Core i5-2300 (2800 MHz) processor, running Red
Hat Linux 6, and we used NTL version 8.0.0 for our comparison.
We implemented the back-solution version of factorial interpolation, as in Theorem 4,
including a precomputation of the matrix. To create random functions for the factorial
representation, we fix f (0) = 0 and use the C random number generator to set f (i) to
a random number between 0 and p 1.
For modular arithmetic, instead of storing the Cayley tables of operations we use
hand-coded arithmetic over $\mathbb{F}_p$. This is to permit a fair comparison, as arithmetic for
the LIF necessarily uses the NTL lzz_p class internal routines.
In Table 2 we report timings (from the GNU gprof profiling library) for the amount
of time spent in the respective interpolate functions (and their children in the call
tree) for the number of iterations given in the iters. column.
Table 2. Run time in seconds of factorial interpolation vs. LIF for various values of $p$.

p       iters.             Factorial interpolation (backsolve)    LIF
5       $2 \times 10^7$    0.16                                   2.63
23      $2 \times 10^7$    4.27                                   15.55
2029    $10^3$             16.89                                  68.43


A word on parallelism. Modern processors benefit from having multiple cores with
which to compute. Determining the computational complexity of parallel programs
is a many-tiered problem and is closely tied to the hardware. It is preferable, when
possible, to develop programs with small amounts of communication between cores.
The Lagrange interpolation scheme permits parallelism since each term in Equation (1) is independent of the other terms and hence may be computed separately. A
master thread can accept and add all of the terms to produce the output.
Factorial interpolation and Newton interpolation are iterative methods, and so
do not permit obvious parallelism. The back-solution method essentially admits an
upper-triangular dense matrix problem. Linear algebra (both sparse and dense forms)
is a popular topic in parallel computing: the Basic Linear Algebra Subprograms
(BLAS) are specifications for high-performance linear algebra computations; see [4],
for example. The particular problem of solving upper-triangular dense matrices using
BLAS is studied in [6], though we have not attempted to implement these routines.
5. EXTENSIONS TO OTHER RINGS. This section is motivated by the question:
What is special about ordering the elements of $\mathbb{F}_p$ by $0, 1, 2, \ldots, p-1$?
Let us return to our faithful example.
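Example. Consider again the function $\mathbb{Z}_5 \to \mathbb{Z}_5$ defined in two-row notation by
$$\begin{pmatrix} 0 & 1 & 2 & 3 & 4 \\ 0 & 2 & 3 & 4 & 1 \end{pmatrix}.$$
If the ordering of $\mathbb{F}_p$ is 0, 1, 3, 4, 2, then an altered factorial representation is
$$b^{(1)} x + b^{(2)} x(x-1) + b^{(3)} x(x-1)(x-3) + b^{(4)} x(x-1)(x-3)(x-4),$$
where $b^{(0)}$ is the value of the function at 0 (here 0, so there is no constant term) and $b^{(i)}$,
$i = 1, 2, 3, 4$, are the solutions to the matrix equation obtained by evaluating at $x = 2, 4, 3, 1$ (mod 5):
$$\begin{pmatrix} 4 & 3 & 2 & 2 \\ 0 & 2 & 2 & 4 \\ 0 & 0 & 1 & 3 \\ 0 & 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} b^{(4)} \\ b^{(3)} \\ b^{(2)} \\ b^{(1)} \end{pmatrix} =
\begin{pmatrix} 3 \\ 1 \\ 4 \\ 2 \end{pmatrix}.$$
Solving the system gives $(b^{(4)}, b^{(3)}, b^{(2)}, b^{(1)}) = (0, 1, 3, 2)$. Hence, the altered factorial
representation of the function is given by $2x + 3x(x-1) + x(x-1)(x-3)$. It is easy to verify that
this interpolates the function; for instance, at $x = 3$ it gives $6 + 18 = 24 \equiv 4 \pmod 5$.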
Evidently, the ordering of the elements is not important to the algorithm; the ordering provides an equivalent linearization problem. Hence, factorial interpolation may
be applied to any finite commutative ring R with unity.
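The back-solution for an arbitrary ordering can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the function names `altered_factorial_coeffs` and `evaluate` are hypothetical, the modulus is assumed prime so that every nonzero pivot is invertible, and `pow(x, -1, p)` requires Python 3.8 or later.

```python
def altered_factorial_coeffs(values, order, p):
    """Back-solve for b[0], b[1], ... such that
    f(x) = sum_i b[i] * prod_{j<i} (x - order[j])  (mod p)
    agrees with values[x] at every x in `order`. Assumes p is prime."""
    coeffs = []
    for i, x in enumerate(order):
        acc, basis = 0, 1
        for j in range(i):
            acc = (acc + coeffs[j] * basis) % p   # part already determined
            basis = basis * (x - order[j]) % p    # extend the product by one factor
        # basis == prod_{j<i} (x - order[j]), nonzero because the points are distinct
        coeffs.append((values[x] - acc) * pow(basis, -1, p) % p)
    return coeffs


def evaluate(coeffs, order, x, p):
    """Evaluate the altered factorial representation at x (mod p)."""
    acc, basis = 0, 1
    for j, b in enumerate(coeffs):
        acc = (acc + b * basis) % p
        basis = basis * (x - order[j]) % p
    return acc


p = 5
values = [0, 2, 3, 4, 1]      # bottom row of the two-row notation
order = [0, 1, 3, 4, 2]       # the altered ordering of the example
coeffs = altered_factorial_coeffs(values, order, p)
print(coeffs)                 # b^(0), b^(1), ..., b^(4)
print(all(evaluate(coeffs, order, x, p) == values[x] for x in range(p)))  # True
```

Passing `order = [0, 1, 2, 3, 4]` instead recovers the usual falling-factorial basis, which is one way to see that the ordering only changes the linear system, not the method.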
Finite fields $\mathbb{F}_q$, $q = p^m$. The multiplicative group of $\mathbb{F}_q$ is cyclic of order $q-1$. Order
the elements of $\mathbb{F}_q$ by $0, \alpha^1, \alpha^2, \ldots, \alpha^{q-1}$, where $\alpha$ is a primitive element of $\mathbb{F}_q$. Then
the $i$th term in the factorial representation is $a^{(i)} x(x-\alpha)\cdots(x-\alpha^{i-1})$.
Multivariate functions over $\mathbb{F}_q$ and $\mathbb{Z}_{p^m}$. Since $x_i^q = x_i$, for an $s$-variate polynomial
over $\mathbb{F}_q$, construct the array where the $(i_1, i_2, \ldots, i_s, j)$th element of the array corresponds
to the term $(x_1^{i_1} x_2^{i_2} \cdots x_s^{i_s} - j)$. Order the elements in any way, for example,
lexicographically. For $1 \le a \le q^{s+1}$, write the $a$th term as its lexicographic representative.
Over $\mathbb{Z}_{p^m}$, proceed as with multivariate polynomials over $\mathbb{F}_q$; however, since not
all functions are representable by polynomials, the algorithm will fail if it requires
dividing by a zero divisor.
Functions over $\mathbb{Z}$ and $\mathbb{Q}$. We observe that factorial interpolation can be performed
over any totally orderable commutative ring with unity. We require here the prior
knowledge that if the function is representable as a polynomial, it must have degree
below a fixed bound, and that an equal number of points is available. Any ordering of the
given points will suffice.
Factorial interpolation over $\mathbb{Q}$ will be similar to that over $\mathbb{Z}$. Any finite (nonexplicit)
number of values of $\mathbb{Q}$ will suffice, and $\mathbb{Q}$ can be ordered by the ordering implicit in
Cantor's proof of the countability of the rationals.
6. AN APPLICATION TO SHAMIR'S SECRET SHARING SCHEME. An
application of interpolation over finite fields is Shamir's secret sharing scheme [9].
Here the secret key is encoded as the constant term of a polynomial $f$ of positive
degree $d$. Public secrets are distributed as points on the curve defined by $f$, and $f$ (and
the secret key) is recoverable once a participant has collected $d+1$ points. Shamir's
scheme is a threshold scheme, in that any participant need collect only $d+1$ points from
a pool of $N \ge d+1$ distributed points. Moreover, Shamir's scheme permits a multitiered
trust infrastructure, as trusted parties are given multiple points on the curve and
hence need to collect fewer points from the participant pool than a basic party.
Lagrange interpolants are proposed for the classical Shamir secret sharing scheme,
but clearly any interpolation method could be applied. When using factorial interpolation,
each new point collected requires construction and back-solution of a new row.
Once again, back-solution of the $i$th row takes at most $2i-1$ field operations, and
each back-solution can be performed as the point arrives. Once $d+1$ points have been
collected with preimages ordered as $(z_0, z_1, \ldots, z_d)$, the secret key is
$$-a^{(0)} z_0 + a^{(1)} z_0 z_1 - \cdots + (-1)^{d+1} a^{(d)} z_0 z_1 \cdots z_d,$$
where $a^{(i)}$ is the $i$th coefficient of the factorial interpolant. The $i$th step can be computed
with 2 multiplications and an addition by storing the result of the $(i-1)$th
iteration and the value of $z_0 z_1 \cdots z_{i-1}$.
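The incremental recovery described above can be sketched as follows. This is a minimal illustration, not the authors' code: the class name `IncrementalRecoverer`, the helper `make_shares`, and the choice of modulus are hypothetical, and the sketch uses the basis $\prod_{j<i}(x - z_j)$, so its indexing of coefficients may differ by one from the formula in the text. Each arriving share costs one back-solved row, and the running constant term is updated with two multiplications and one addition.

```python
P = 2029  # a prime modulus, chosen here only for illustration


def make_shares(secret, higher_coeffs, xs, p=P):
    """Points (x, f(x)) on f(x) = secret + c1*x + c2*x^2 + ... (mod p)."""
    poly = [secret] + list(higher_coeffs)
    return [(x, sum(c * pow(x, k, p) for k, c in enumerate(poly)) % p) for x in xs]


class IncrementalRecoverer:
    def __init__(self, p=P):
        self.p = p
        self.zs = []          # preimages in arrival order
        self.a = []           # coefficients a_i of prod_{j<i} (x - z_j)
        self.at_zero = 0      # value at 0 of the interpolant built so far
        self.prod_at_zero = 1 # prod_{j<i} (0 - z_j)

    def add_share(self, z, y):
        p = self.p
        acc, basis = 0, 1
        for a_j, z_j in zip(self.a, self.zs):     # back-solve the new row
            acc = (acc + a_j * basis) % p
            basis = basis * (z - z_j) % p
        a_new = (y - acc) * pow(basis, -1, p) % p
        # two multiplications and one addition update the constant term
        self.at_zero = (self.at_zero + a_new * self.prod_at_zero) % p
        self.prod_at_zero = self.prod_at_zero * (-z) % p
        self.a.append(a_new)
        self.zs.append(z)
        return self.at_zero


shares = make_shares(secret=1234, higher_coeffs=[166, 94], xs=[1, 2, 3, 4, 5])
rec = IncrementalRecoverer()
for z, y in (shares[4], shares[1], shares[2]):    # any d + 1 = 3 shares, any order
    recovered = rec.add_share(z, y)
print(recovered == 1234)  # True
```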
ACKNOWLEDGMENT. We would like to sincerely thank Xiaolan Yuan for her discussions during the Fall
2014 Semester when she participated in the first author's Mathematics Advanced Study Semesters program
course on finite fields at Penn State University. Her calculations were the first to show us the differences in the
running times of the falling factorial and Lagrange Interpolation methods. We are truly indebted to her.
We would also like to thank two anonymous reviewers for their helpful and insightful comments.

REFERENCES
1. M. Bhargava, Congruence preservation and polynomial functions from $\mathbb{Z}_n$ to $\mathbb{Z}_m$, Discrete Math. 173
(1997) 15–21.
2. Z. Chen, On polynomial functions from $\mathbb{Z}_n$ to $\mathbb{Z}_m$, Discrete Math. 137 (1995) 137–145.
3. Z. Chen, On polynomial functions from $\mathbb{Z}_{n_1} \times \mathbb{Z}_{n_2} \times \cdots \times \mathbb{Z}_{n_r}$ to $\mathbb{Z}_m$, Discrete Math. 162 (1996) 67–76.
4. I. S. Duff, M. A. Heroux, R. Pozo, An overview of the sparse basic linear algebra subprograms: The new
standard from the BLAS technical forum, ACM Trans. Math. Software 28 (2002) 239–267.
5. J. von zur Gathen, J. Gerhard, Modern Computer Algebra. Third edition. Cambridge Univ. Press,
Cambridge, 2013.
6. J. González-Domínguez, M. J. Martín, G. L. Taboada, J. Touriño, Dense Triangular Solvers on Multicore
Clusters using UPC, 11th International Conference on Computational Science (ICCS 2011), Singapore,
June 2011, 231–240.
7. R. L. Graham, D. E. Knuth, O. Patashnik, Concrete Mathematics. Addison-Wesley, Reading, MA, 1990.
8. G. L. Mullen, H. Stevens, Polynomial functions (mod m), Acta Math. Hung. 44 (1984) 237–241.
9. A. Shamir, How to share a secret, Commun. ACM 22 (1979) 612–613.
10. V. Shoup, The Number Theory Library. Electronic resource; documentation and software available:
http://www.shoup.net/ntl (2015).
11. D. Singmaster, On polynomial functions (mod m), J. Number Theory 6 (1974) 345–352.


GARY MULLEN received his Ph.D. in mathematics from Penn State University in 1974. Since that time,
he has found much enjoyment and reward in his many years of teaching at Penn State. His research interests
center around finite fields; and when not contemplating the serene beauty of mathematics he enjoys gardening,
hunting, fishing, and having a beer after a busy day in the office.
Department of Mathematics, The Pennsylvania State University, University Park, PA, 16802
mullen@math.psu.edu

DANIEL PANARIO studied mathematics and computer science in Uruguay. He received the MSc degree
from the University of São Paulo, Brazil, and the Ph.D. degree from the University of Toronto, Canada. He is a
Professor at Carleton University, Ottawa, Canada. His main research interests are finite fields and applications,
combinatorics and probabilistic analysis of algorithms.
School of Mathematics and Statistics, Carleton University, 1125 Colonel By Drive, Ottawa ON, Canada, K1S
5B6
daniel@math.carleton.ca

DAVID THOMSON is an Assistant Professor in the Department of Mathematical Sciences and Cyber Math
Fellow with the Army Cyber Institute at West Point. He is also an Adjunct Research Professor at Carleton
University in Ottawa, Canada, where he received his PhD in 2012. His research interests include algebraic,
algorithmic, and combinatorial aspects of finite fields and their applications. The views expressed in this article
are those of the author and do not reflect the official policy or position of the Department of the Army, DOD,
or the U.S. Government. This work was done when the author was at Carleton University.
Army Cyber Institute, United States Military Academy, 2101 New South Post Road, Spellman Hall, West Point,
NY, 10996
David.Thomson@usma.edu

On Measurable Semigroups in $\mathbb{R}$
In the discussion in [1, p. 315] of possible versions of the famous Spitzer identity
for random walks on subsets $D$ of $\mathbb{R}$, the following alternative is listed, among others:
both $D$ and $D^c$ have void interiors but neither is empty. This alternative is called
"pathological" in [1]. Here $D$ is assumed to be Borel-measurable and $D^c = \mathbb{R} \setminus D$.
It is shown in [1] that, in order for a version of Spitzer's identity to hold, both
$D$ and $D^c$ must be additive semigroups. However, then the mentioned alternative is
not just pathological, but plainly nonexistent, even if the condition of the Borel
measurability of $D$ is weakened to that of the Lebesgue measurability. Indeed, if $D$ is
Lebesgue-measurable and $\lambda$ denotes the Lebesgue measure, then $\lambda(D) \vee \lambda(D^c) > 0$,
which contradicts
Proposition 2. Let $D$ be any Lebesgue-measurable subset of $\mathbb{R}$ with $\lambda(D) > 0$ which
is an additive semigroup. Then $D$ has a nonvoid interior.
Proof. By the regularity of the measure $\lambda$ and the condition $\lambda(D) > 0$, one can find
a nonempty interval $(a_0, a_1) \subset \mathbb{R}$ such that $\lambda\big(D \cap (a_0, a_1)\big) > \frac{3}{4}(a_1 - a_0)$. For simplicity,
consider the normalized measure $\mu := \frac{\lambda}{a_1 - a_0}$, so that $\mu(D_1) > \frac{3}{4}$, where
$D_t := D \cap (a_0, a_t)$ and $a_t := a_0 + (a_1 - a_0)t$ for $t \in [0, 1]$.
Take now any $t \in (\frac{1}{2}, 1)$. Then
$$\mu(D_t) = \mu(D_1) - \mu\big(D \cap [a_t, a_1)\big) \ge \mu(D_1) - (1 - t) > \frac{t}{2} = \frac{1}{2}\,\mu\big((a_0, a_t)\big).$$
Let $C_t := a_0 + a_t - D_t = \{a_0 + a_t - x : x \in D_t\}$. Then $C_t \subseteq (a_0, a_t)$, $\mu(C_t) =
\mu(D_t) > \frac{1}{2}\mu\big((a_0, a_t)\big)$, and $D_t \subseteq (a_0, a_t)$. Hence, $\mu(C_t \cap D_t) \ge \mu(C_t) + \mu(D_t) -
\mu\big((a_0, a_t)\big) > 0$. So, there is some $y \in C_t \cap D_t$. For any such $y$, one has $a_0 +
a_t - y \in D_t$ and therefore
$$a_0 + a_t = y + (a_0 + a_t - y) \in D_t + D_t \subseteq D + D \subseteq D.$$
We conclude that
$$\emptyset \ne (a_0 + a_{1/2}, a_0 + a_1) = \{a_0 + a_t : t \in (\tfrac{1}{2}, 1)\} \subseteq D,$$
so that $D$ has a nonvoid interior.
REFERENCE
1. J. F. C. Kingman, Spitzer's identity and its use in probability theory. J. London Math. Soc. 37
(1962) 309–316.

Submitted by Iosif Pinelis


http://dx.doi.org/10.4169/amer.math.monthly.123.5.481
MSC: Primary 28A05


NOTES
Edited by Sergei Tabachnikov

A Corrigendum to "Unreasonable Slightness"

Arseniy Sheydvasser
Abstract. We revisit Bogdan Nica's 2011 paper, "The Unreasonable Slightness of $E_2$ over
Imaginary Quadratic Rings," and correct an inaccuracy in his proof.

1. INTRODUCTION. We review briefly the setup of Nica's paper [3]. Let $A$ be a
commutative ring and $E_n(A)$ the group generated by elementary matrices, i.e., matrices
in $SL_n(A)$ that have ones on the diagonal and one nonzero off-diagonal entry.
It is a basic fact that $SL_n(\mathbb{Z}) = E_n(\mathbb{Z})$, which prompts the question: What happens for
other commutative rings $A$?
In the course of their resolution of the congruence subgroup problem, Bass, Milnor,
and Serre [1] settled the matter for algebraic number fields in higher dimensions.
Theorem 1 (Bass-Milnor-Serre). Let $A = \mathcal{O}_K$ be the ring of integers of any algebraic
number field $K$. Then, if $n > 2$, $SL_n(A) = E_n(A)$.
Similarly, with $n = 2$ and for all algebraic number fields other than the imaginary
quadratic ones, Vaserstein [4] showed that elementary matrices are enough.
Theorem 2 (Vaserstein). Let $A = \mathcal{O}_K$ be the ring of integers of an algebraic number
field $K$. Then, if $K$ is not imaginary quadratic, $SL_2(A) = E_2(A)$.
In contrast, imaginary quadratic fields with $n = 2$ behave very differently, as was
first shown by Cohn [2].
Theorem 3 (Cohn). Let $K$ be an imaginary quadratic field, and let $A = \mathcal{O}_K$ be its
ring of integers. If $K = \mathbb{Q}(i), \mathbb{Q}(\sqrt{-2}), \mathbb{Q}(\sqrt{-3}), \mathbb{Q}(\sqrt{-7}), \mathbb{Q}(\sqrt{-11})$, then $SL_2(A)
= E_2(A)$. Otherwise, $SL_2(A) \ne E_2(A)$.
We examine more closely the case where $A$ is an imaginary quadratic ring, that
is, a subring of $\mathbb{C}$ of the form $\mathbb{Z}[\omega]$, where $\omega$ is a nonreal algebraic integer of degree
2. Such an $A$ is necessarily of the form $\mathbb{Z}[\sqrt{-D}]$ or $\mathbb{Z}[\frac{1}{2}(1 + \sqrt{1 - 4D})]$ for some
integer $D \ge 1$. Nica then gives an elementary proof of the following fact.
Theorem 4 (Nica). Let $A = \mathbb{Z}[\sqrt{-D}]$ or $A = \mathbb{Z}[\frac{1}{2}(1 + \sqrt{1 - 4D})]$ with $D \ge 4$. Then
$E_2(A)$ is an infinite index, nonnormal subgroup of $SL_2(A)$.
Nica's proof hinges on the existence of nontrivial solutions to the Pell equation
$X^2 - DY^2 = 1$. Since nontrivial solutions exist if and only if $D$ is not a perfect square,
the proof is incomplete in the case of rings $\mathbb{Z}[di]$ where $d \in \mathbb{Z}$. The basic method is
still sound, however, and we will provide a simple correction to cover this case as well.
http://dx.doi.org/10.4169/amer.math.monthly.123.5.482
MSC: Primary 11C00

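2. PRELIMINARIES. Let $U_2(A)$ denote the set of unimodular pairs, that is, pairs
$(\alpha, \beta) \in A^2$ that are the top row of some matrix $\begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix} \in SL_2(A)$. The usefulness of
unimodular pairs is captured in the following lemma.
Lemma 1. Let $A$ be a commutative ring. Let $L_2(A)$ be the subgroup of $SL_2(A)$ of
elements of the form $\begin{pmatrix} 1 & 0 \\ \lambda & 1 \end{pmatrix}$. Then there exists a bijective map
$L_2(A) \backslash SL_2(A) / E_2(A) \to U_2(A)/E_2(A)$ defined by
$$\left[\begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix}\right] \mapsto [(\alpha, \beta)].$$
Proof. We must first check that this map is well defined. In particular, we should check
that multiplication on the left by an element of $L_2(A)$ does not affect which orbit
$\begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix}$ maps to. Indeed,
$$\begin{pmatrix} 1 & 0 \\ \lambda & 1 \end{pmatrix}\begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix} =
\begin{pmatrix} \alpha & \beta \\ \gamma + \lambda\alpha & \delta + \lambda\beta \end{pmatrix} \mapsto (\alpha, \beta).$$
That the map is surjective is clear from the definition of a unimodular pair. It
remains to prove injectivity. Fix an element $[(\alpha, \beta)] \in U_2(A)/E_2(A)$, and suppose
that it has two pre-images, for which we choose explicit coset representatives
$\begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix}$ and $\begin{pmatrix} \alpha & \beta \\ \gamma' & \delta' \end{pmatrix}$. But then
$$\begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix}\begin{pmatrix} \alpha & \beta \\ \gamma' & \delta' \end{pmatrix}^{-1} =
\begin{pmatrix} 1 & 0 \\ \gamma\delta' - \delta\gamma' & 1 \end{pmatrix} \in L_2(A).$$
Therefore,
$$\left[\begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix}\right] = \left[\begin{pmatrix} \alpha & \beta \\ \gamma' & \delta' \end{pmatrix}\right].$$
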
Notice that $L_2(A) \subset E_2(A)$. Lemma 1 tells us two important facts. First, it tells
us that if we want to know the size of $SL_2(A)/E_2(A)$, we can get this information
by just considering the orbits of unimodular pairs; in particular, $E_2(A)$ is an infinite
index subgroup if $U_2(A)/E_2(A)$ is an infinite set. Secondly, it gives us a convenient
way to examine the normality of $E_2(A)$: if $E_2(A)$ is a normal subgroup, then it is
an immediate consequence that the obvious map $SL_2(A)/E_2(A) \to U_2(A)/E_2(A)$ is
a bijective map, and then we can transfer the group structure on $SL_2(A)/E_2(A)$ to
$U_2(A)/E_2(A)$. As we shall see, this will significantly simplify computations and give
an easy way to prove that $E_2(A)$ is nonnormal.
In order to prove the infinitude of orbits of $U_2(A)$, Nica introduced the concept of
special pairs, defined as all unimodular pairs $(\alpha, \beta)$ such that $|\alpha| = |\beta| < |\alpha \pm \beta|$.
He then proved the key result.
Lemma 2. Let $A = \mathbb{Z}[\sqrt{-D}]$ or $A = \mathbb{Z}[\frac{1}{2}(1 + \sqrt{1 - 4D})]$ with $D \ge 4$. Let $(\alpha, \beta)$,
$(\alpha', \beta')$ be special pairs. Then $(\alpha', \beta')$ is $E_2(A)$-equivalent to $(\alpha, \beta)$ if and only if
$(\alpha', \beta') = (\alpha, \beta), (-\alpha, -\beta), (\beta, -\alpha)$, or $(-\beta, \alpha)$.
With this lemma, proving $E_2(A)$ has infinite index reduces to constructing an infinite
family of special pairs: since any special pair can only be $E_2(A)$-equivalent to
finitely many other special pairs, we know immediately that $U_2(A)/E_2(A)$ is infinite,
which gives the desired result. For nonnormality, one uses the arithmetic of the matrix
group $SL_2(A)$ to derive a contradiction assuming $E_2(A)$ is normal (although Nica does
not make use of Lemma 1, his proof is easily seen to be equivalent to this strategy).
Better yet, it is really enough to do the construction for rings $A = \mathbb{Z}[\sqrt{-D}]$, as taking
$D' = 4D - 1$, it is immediate that this construction works just as well for rings
$A = \mathbb{Z}[\frac{1}{2}(1 + \sqrt{1 - 4D})]$.
Nica's construction of such an infinite family hinges crucially on the existence of
solutions to the Pell equation $X^2 - DY^2 = 1$. As mentioned previously, this makes it
unsuitable for the rings $\mathbb{Z}[di]$.
3. THE CASE $A = \mathbb{Z}[di]$. Thankfully, this is easily remedied by the construction of
a new infinite family of special unimodular pairs.
Proof of Theorem 4's Remaining Case. Specifically, notice that if $d \mid n$, then
$$\begin{pmatrix} 1 + n + ni & 1 + n - ni \\ n & 1 - ni \end{pmatrix} \in SL_2(\mathbb{Z}[di]).$$
Thus, the pair $(1 + n + ni, 1 + n - ni)$ is unimodular. That it is special is straightforward
since if $n > 1$,
$$|1 + n + ni|^2 = (1 + n)^2 + n^2 = |1 + n - ni|^2,$$
$$|1 + n + ni + (1 + n - ni)|^2 = (2 + 2n)^2 > (1 + n)^2 + n^2,$$
$$|1 + n + ni - (1 + n - ni)|^2 = (2n)^2 > (1 + n)^2 + n^2.$$
Therefore, for any $d \ge 2$, we have an infinite family of special unimodular pairs.
Applying Lemma 2, we see immediately that $E_2(A)$ is an infinite index subgroup of
$SL_2(A)$ by the above discussion.
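The two claims about this family (determinant 1, and the special-pair inequalities for $n > 1$) are easy to check by machine. The following is a minimal sketch using exact integer arithmetic on Gaussian integers stored as `(real, imag)` tuples; the helper names are hypothetical and the check covers only a finite range of $n$.

```python
def gmul(u, v):
    """Product of Gaussian integers given as (real, imag) tuples."""
    return (u[0] * v[0] - u[1] * v[1], u[0] * v[1] + u[1] * v[0])


def gadd(u, v):
    return (u[0] + v[0], u[1] + v[1])


def gsub(u, v):
    return (u[0] - v[0], u[1] - v[1])


def norm(u):
    """Squared absolute value."""
    return u[0] * u[0] + u[1] * u[1]


def check(n):
    alpha, beta = (1 + n, n), (1 + n, -n)      # top row: 1+n+ni, 1+n-ni
    gamma, delta = (n, 0), (1, -n)             # bottom row: n, 1-ni
    unimodular = gsub(gmul(alpha, delta), gmul(beta, gamma)) == (1, 0)
    special = norm(alpha) == norm(beta) < min(norm(gadd(alpha, beta)),
                                              norm(gsub(alpha, beta)))
    return unimodular, special


print(check(1))                                          # (True, False)
print(all(check(n) == (True, True) for n in range(2, 200)))  # True
```

The case $n = 1$ shows why the hypothesis $n > 1$ is needed: the determinant is still 1, but $(2n)^2 = 4 < 5 = (1+n)^2 + n^2$.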
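Proving that $E_2(A)$ is a nonnormal subgroup is similarly easy. Indeed, suppose
that $E_2(A)$ is normal so that $U_2(A)/E_2(A)$ inherits a group structure. We have matrix
relations
$$\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}\begin{pmatrix} 1 - ni & -n \\ -n & 1 + ni \end{pmatrix}\begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} =
\begin{pmatrix} 1 + ni & n \\ n & 1 - ni \end{pmatrix},$$
$$\begin{pmatrix} 1 - ni & -n \\ -n & 1 + ni \end{pmatrix} = \begin{pmatrix} 1 + ni & n \\ n & 1 - ni \end{pmatrix}^{-1}.$$
Projecting into $U_2(A)/E_2(A)$, this simplifies greatly (especially since it is readily
checked that $\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} \in E_2(A)$):
$$[(1 - ni, -n)] = [(1 + ni, n)],$$
$$[(1 - ni, -n)] = [(1 + ni, n)]^{-1}.$$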
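In particular, this implies that $[(1 - ni, -n)]^2 = 1$. However, it is also true that
$$\begin{pmatrix} 1 - ni & -n \\ -n & 1 + ni \end{pmatrix}^2 = \begin{pmatrix} 1 - 2ni & -2n \\ -2n & 1 + 2ni \end{pmatrix},$$
$$\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}\begin{pmatrix} 1 - 2ni & -2n \\ -1 - 2n + 2ni & 1 + 2n + 2ni \end{pmatrix}\begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} =
\begin{pmatrix} 1 + 2n + 2ni & 1 + 2n - 2ni \\ 2n & 1 - 2ni \end{pmatrix},$$
which projects to
$$[(1 - ni, -n)]^2 = [(1 - 2ni, -2n)],$$
$$[(1 - 2ni, -2n)] = [(1 + 2n + 2ni, 1 + 2n - 2ni)].$$
By Lemma 2,
$$[(1 + 2n + 2ni, 1 + 2n - 2ni)] \ne [(1 + 2n' + 2n'i, 1 + 2n' - 2n'i)]$$
for distinct values of $n, n'$, so we can always choose $n$ such that $[(1 + 2n + 2ni,
1 + 2n - 2ni)] \ne 1$. But by the above, this implies that $[(1 - ni, -n)]^2 \ne 1$, which
is a contradiction. Therefore, $E_2(A)$ is a nonnormal subgroup of $SL_2(A)$.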
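The matrix identities used in this argument can be verified mechanically. The sketch below is an illustration under an explicit assumption about signs: it takes $M = \begin{pmatrix} 1-ni & -n \\ -n & 1+ni \end{pmatrix}$ and $w = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$, one self-consistent choice, and checks $wMw^{-1} = M^{-1}$, the formula for $M^2$, and the conjugation that produces the special pair. Gaussian integers are `(real, imag)` tuples and all helper names are hypothetical.

```python
def gmul(u, v):
    return (u[0] * v[0] - u[1] * v[1], u[0] * v[1] + u[1] * v[0])


def gadd(u, v):
    return (u[0] + v[0], u[1] + v[1])


def neg(u):
    return (-u[0], -u[1])


def mmul(X, Y):
    """Product of 2x2 matrices with Gaussian-integer entries."""
    return tuple(
        tuple(gadd(gmul(X[r][0], Y[0][c]), gmul(X[r][1], Y[1][c])) for c in range(2))
        for r in range(2)
    )


def inv(X):
    """Inverse of a 2x2 matrix of determinant 1."""
    (a, b), (c, d) = X
    return ((d, neg(b)), (neg(c), a))


def relations_hold(n):
    W = (((0, 0), (1, 0)), ((-1, 0), (0, 0)))
    M = (((1, -n), (-n, 0)), ((-n, 0), (1, n)))
    M2 = mmul(M, M)
    # same top row as M^2, bottom row shifted by an element of L_2(A)
    N = (((1, -2 * n), (-2 * n, 0)), ((-1 - 2 * n, 2 * n), (1 + 2 * n, 2 * n)))
    target = (((1 + 2 * n, 2 * n), (1 + 2 * n, -2 * n)), ((2 * n, 0), (1, -2 * n)))
    return (
        mmul(mmul(W, M), inv(W)) == inv(M)
        and M2 == (((1, -2 * n), (-2 * n, 0)), ((-2 * n, 0), (1, 2 * n)))
        and N[0] == M2[0]
        and mmul(mmul(W, N), inv(W)) == target
    )


print(all(relations_hold(n) for n in range(1, 50)))  # True
```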
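Equivalently, one can show directly that the matrix
$$\begin{pmatrix} 1 - ni & -n \\ -n & 1 + ni \end{pmatrix}\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}\begin{pmatrix} 1 - ni & -n \\ -n & 1 + ni \end{pmatrix}^{-1}$$
does not belong to $E_2(A)$, proving that $E_2(A)$ is nonnormal. We leave this as an exercise
for the reader.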
ACKNOWLEDGMENT. The author is indebted to Alex Kontorovich, both for pointing out the flaw in Nica's
paper and for helpful suggestions along the way.

REFERENCES
1. H. Bass, J. Milnor, J.-P. Serre, Solution of the congruence subgroup problem for $SL_n$ ($n \ge 3$) and $Sp_{2n}$
($n \ge 2$). Publ. Math. Inst. Hautes Études Sci. 33 (1967) 59–137.
2. P. M. Cohn, On the structure of the $GL_2$ of a ring. Publ. Math. Inst. Hautes Études Sci. 30 (1966) 5–53.
3. B. Nica, The unreasonable slightness of $E_2$ over imaginary quadratic rings. Amer. Math. Monthly 118 no. 5
(2011) 455–462.
4. L. N. Vaserstein, On the group $SL_2$ over Dedekind rings of arithmetic type. Math. USSR Sb. 18 (1972)
321–332.
Department of Mathematics, Yale University, New Haven CT 06511
arseniy.sheydvasser@yale.edu


Euler and the Strong Law of Small Numbers


Karl Dilcher and Christophe Vignat

Abstract. We follow an incorrect entry in a well-known table of series and products through
several earlier tables and books, all the way to the relevant (correct) identity in the work of
Euler. Along the way we explain what may have led to the error.

In addition to modern computer algebra systems and bibliographic databases, old-fashioned tables of integrals, sums, and products remain important research tools in
mathematics. This is especially the case for the present authors whose research is in
the classical areas of analysis, combinatorics, and number theory.
What follows is a cautionary tale about the use of tables, along with a reminder, as
much to ourselves as to the reader, to go back to the sources if at all possible. But before
we continue, an important caution to the reader of this note: Some of the identities that
follow are incorrect, while others may be misleading. Also, some notations will be
inconsistent, to preserve the historical context.
In the process of looking up some infinite products in the excellent and very useful
handbook by E. R. Hansen [12], we came across a curious identity for the Euler numbers
$E_n$. These numbers, as they appear in [12] and other modern books and papers,
can be defined by the exponential generating function
$$\frac{2}{e^t + e^{-t}} = \sum_{n=0}^{\infty} E_n \frac{t^n}{n!}. \qquad (1)$$
Since the left-hand side of (1) is an even function, we have $E_{2k+1} = 0$ for all $k \ge 0$.
Furthermore, it can be shown that the even-index Euler numbers are integers, and
$E_0 = 1$, $E_2 = -1$, $E_4 = 5$, $E_6 = -61$, $E_8 = 1385, \ldots$. For further properties, see,
e.g., [1] or its successor volume [16]. The identity in question is (89.8.3) on p. 486 of
[12], namely
$$\prod_{k=1}^{\infty} \left(1 - \frac{(-1)^k}{(2k+1)^{2n+1}}\right) = \frac{1}{2(2n)!}\left(\frac{\pi}{2}\right)^{2n+1} |E_{2n}|. \qquad (2)$$


A quick numerical check with Maple and Mathematica for a few small values of
n revealed that the identity is definitely incorrect. Before we realized what was going
on, we found the identity, in the same form, in the first volume of the well-known
multivolume tables by Prudnikov et al. [17, p. 754]. No reference is given, and the
identity also appears in the Russian original [18], again in the same form.
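Such a numerical check is easy to reproduce. The sketch below, in Python rather than Maple or Mathematica, handles the case $n = 1$, where $|E_2| = 1$ and the right-hand side of (2) is $\pi^3/32$. It compares that value with the product over odd integers as printed in (2), and with the reciprocal product over odd primes, whose factors take the sign $+$ for primes of the form $4m - 1$ and $-$ for primes of the form $4m + 1$. The function names and truncation limits are choices made for this illustration only.

```python
import math


def odd_primes_up_to(limit):
    """Odd primes <= limit, by a simple sieve of Eratosthenes."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, int(limit ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, limit + 1, i)))
    return [q for q in range(3, limit + 1) if sieve[q]]


def product_over_odd_integers(s, terms=100_000):
    """Truncation of prod_{k>=1} (1 - (-1)^k / (2k+1)^s), the left side of (2)."""
    prod = 1.0
    for k in range(1, terms + 1):
        prod *= 1.0 - (-1) ** k / (2 * k + 1) ** s
    return prod


def reciprocal_product_over_odd_primes(s, limit=10_000):
    """Truncation of prod_p 1 / (1 - chi(p) / p^s), chi(p) = +1 if p = 1 mod 4, else -1."""
    prod = 1.0
    for q in odd_primes_up_to(limit):
        chi = 1 if q % 4 == 1 else -1
        prod /= 1.0 - chi / q ** s
    return prod


rhs = math.pi ** 3 / 32                      # right side of (2) for n = 1
print(rhs)
print(product_over_odd_integers(3))          # visibly different from rhs
print(reciprocal_product_over_odd_primes(3)) # agrees with rhs to many digits
```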
One of the useful features of Hansen's handbook [12] is the fact that most entries
come with one or more references to the literature, or at least to other entries in the
table. The reference associated with identity (2), i.e., Hansen's (89.8.3), is given as
http://dx.doi.org/10.4169/amer.math.monthly.123.5.486
MSC: Primary 01A90, Secondary 11M06

identity (1131) in the smaller and older collection of formulas by Jolley [15]. The
identity in question is listed on pp. 238–239 as
$$2(2n)! \left(\frac{2}{\pi}\right)^{2n+1} \left(1 + \frac{1}{3^{2n+1}}\right)\left(1 - \frac{1}{5^{2n+1}}\right)\left(1 + \frac{1}{7^{2n+1}}\right) \cdots = E_n^* = E_{2n}, \qquad (3)$$
where $E_n^*$ and $E_{2n}$ are notations of the Euler numbers that differ from the usage in (1)
and (2), namely $E_0^* = E_1^* = 1$, $E_2^* = 5$, $E_3^* = 61$, $E_4^* = 1385, \ldots$.
Jolley also provides references, and for his identity (1131), i.e., (3) above, the reader
is referred to p. 365 of the famous old algebra book by Chrystal [4]; it is actually part II
of this two-volume work, a fact that is not mentioned in [15]. The identity in question
turns out to be (15) on p. 365 of [4], namely
$$E_m = \frac{2(2m)! \left(\frac{2}{\pi}\right)^{2m+1}}{\left(1 + \frac{1}{3^{2m+1}}\right)\left(1 - \frac{1}{5^{2m+1}}\right)\left(1 + \frac{1}{7^{2m+1}}\right) \cdots}, \qquad (4)$$
where yet another notation for the Euler numbers is used, namely $E_1 = 1$, $E_2 = 5$,
$E_3 = 61$, $E_4 = 1385, \ldots$ (see [4, p. 342]). Note that the large fraction sign in (4) is
missing in (3).
In addition to hinting at a proof of this last identity, Chrystal refers the reader to
Euler by writing, "See again Euler, Introd. in Anal. Inf., §284" in a footnote on p. 365.
Euler's influential book [6] is available in English translation [7], and §284 can be
found on pp. 239–241 of [6], or on pp. 244–245 of [7], where we find
$$A = 1 - \frac{1}{3^n} + \frac{1}{5^n} - \frac{1}{7^n} + \frac{1}{9^n} - \frac{1}{11^n} + \frac{1}{13^n} - \frac{1}{15^n} + \text{\&c.} \qquad (5)$$
And at the end of §284, Euler writes
$$A = \frac{3^n}{3^n + 1} \cdot \frac{5^n}{5^n - 1} \cdot \frac{7^n}{7^n + 1} \cdot \frac{11^n}{11^n + 1} \cdot \frac{13^n}{13^n - 1} \cdot \frac{17^n}{17^n - 1} \cdot \text{\&c}, \qquad (6)$$
"where the powers of all the prime numbers in the numerators occur and the denominators
are increased or decreased by 1 depending on whether the number has the form
$4m - 1$ or $4m + 1$." [7, p. 245].
The identity (6) already gives us a glimpse of what might have gone wrong on
the way towards the infinite product (2). But first we need to establish a connection
between Euler's expressions (5) and (6) and what were later called the Euler numbers.
In another well-known book [8], published a few years after [6], Euler expressed the
numbers $A$ in (5) in terms of the Taylor coefficients of the secant function, which by
(1) are closely related to the Euler numbers. For instance, one of the explicit identities
that can be found in §224 in [8, p. 542], or in German translation in [9, p. 259], is
$$1 - \frac{1}{3^9} + \frac{1}{5^9} - \frac{1}{7^9} + \frac{1}{9^9} - \text{\&c} = \frac{\varepsilon}{1 \cdot 2 \cdots 8} \cdot \frac{\pi^9}{2^{10}}, \qquad (7)$$
where $\varepsilon = 1385$. Since this note deals with incorrect identities, we must mention in
passing that there are typographical errors in the original identity (and neighboring
ones) on p. 542 of [8]. Only two pages further they are correctly printed; it is also correct
in [9]. However, in [14, p. 135] it is pointed out, with historical and bibliographical
notes, that Euler's evaluation of what is $|E_{18}|$ in the notation of (1) is incorrect. This
paper [14], by the way, uses an alternative definition of Euler numbers which is also in
modern use and is more amenable to combinatorial applications; see also [19, p. 149].
The rest of the story is now easy to piece together. The numbers $\alpha, \beta, \ldots, \varepsilon, \ldots$
used by Euler in [8, p. 542] correspond to $1, E_1, \ldots, E_4, \ldots$ in Chrystal's notation,
and (4) is indeed the general form of Euler's identity, provided the sequence 3, 5, 7, . . . is
interpreted as the beginning of the sequence of odd primes, rather than the sequence
of odd integers greater than 1. Therefore, the origin of the incorrect formula (2) is
quite likely Chrystal's identity (4). Showing just one more term, along the lines of
Euler's identity (6), would have avoided all this. However, whether or not Jolley
misinterpreted the sequence 3, 5, 7, . . ., the identity (3) still contains the mistake of the
missing fraction sign which then made it into Hansen's identity (2).
In the end, we shouldn't blame Chrystal too much. Given that his book is written
in great detail, even a moderately attentive reader would realize that the product in
(4) had to be over the odd primes. This is all the more so as Chrystal remarks that a
previous identity is transformed into his (15), i.e., identity (4) above, "in the same way
as before." He apparently refers to identity (8) in [4, p. 364], namely
$$B_m = \frac{2(2m)!}{(2\pi)^{2m}\left(1 - 1/2^{2m}\right)\left(1 - 1/3^{2m}\right)\left(1 - 1/5^{2m}\right) \cdots} \qquad (8)$$
(once again in Chrystal's notation), where $B_m$ is the $m$th Bernoulli number in
the historical notation that has $B_1 = 1/6$, $B_2 = 1/30$, $B_3 = 1/42$, $B_4 = 1/30$,
$B_5 = 5/66, \ldots$; for different notations see, e.g., [16, Ch. 24]. This last identity (8)
is closely related to the Euler product for the Riemann zeta function, especially if we
compare it with the following famous formula named after Euler:
$$B_m = \frac{2(2m)!}{(2\pi)^{2m}}\left(\frac{1}{1^{2m}} + \frac{1}{2^{2m}} + \frac{1}{3^{2m}} + \cdots\right), \qquad (9)$$
again reproduced as in [4, p. 363]. While there is much less danger of the product in
(8) being misunderstood, Euler himself showed more terms in the analogs of (8) and
(9); see §283 in [6] or [7].
We already mentioned that the product in (8) is, essentially, the Euler product for the
Riemann zeta function. Similarly, the product in (4) is the Euler product of an appropriate
L-series (in fact, the series (5)), and both are special cases of Euler products of
Dirichlet L-series; see, e.g., [5, p. 162ff.].
Before we close, let us reiterate that some of the identities in this note are incorrect
or misleading. Indeed, the reader will have realized that (2) and (3) are false, and
(4) is correct only when interpreted as a product over the odd primes. Along with
the incorrect entry (89.8.3) in [12], i.e., (2) above, two consequences are mentioned,
namely (89.4.11) for $n = 0$, and (89.6.12) for $n = 1$. The first one of these is correct
since it is in a somewhat different form, taken from an identity in the well-known
classical book by Bromwich [2, p. 224, Ex. 9]. The special case of (89.4.11), namely
$w = 1$, that is relevant here is
$$\prod_{k=1}^{\infty} \left(1 - \frac{(-1)^k}{2k+1}\right) = \frac{\sqrt{2}}{4}\,\pi. \qquad (10)$$
However, the second identity (89.6.12) is indeed false, the corrected version being
$$\prod_{k=1}^{\infty} \left(1 - \frac{(-1)^k}{(2k+1)^3}\right) = \frac{\pi}{12} + \frac{\sqrt{2}\,\pi}{12} \cosh\left(\frac{\sqrt{3}\,\pi}{4}\right). \qquad (11)$$
In an attempt to prove (11), we tried to use Hansen's formula (89.6.2) which,
interestingly, also turned out to be incorrect. In this case, however, there was only
a small typographical error (it should read $t = e^{\pi i/3}$). We first obtained the correct
version (11) symbolically with the computer algebra system Mathematica. We were
then able to prove its extension to arbitrary (even or odd) powers of the denominators
$2k + 1$, using properties of the Gamma function; see also [13, p. 55]. But this is
another story, and we refer the interested reader to [3], where among numerous other
results the correct identity (11) was also obtained. We thank the authors of [3] for
making us aware of an undated and privately distributed 3-page list of errata which
includes the published errata mentioned under [12]. In this unpublished list, formula
(89.6.12) appears corrected in the form (11), while (89.8.3), i.e., identity (2) above,
was withdrawn.
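The corrected identity (11) can also be confirmed numerically. The following is a small sketch in Python; the function name and the truncation point are choices made for this illustration. The product converges quickly because its logarithm behaves like an alternating series with terms of size $(2k+1)^{-3}$.

```python
import math


def partial_product(terms):
    """prod_{k=1}^{terms} (1 - (-1)^k / (2k+1)^3), a truncation of the left side of (11)."""
    prod = 1.0
    for k in range(1, terms + 1):
        prod *= 1.0 - (-1) ** k / (2 * k + 1) ** 3
    return prod


closed_form = math.pi / 12 * (1 + math.sqrt(2) * math.cosh(math.sqrt(3) * math.pi / 4))
approx = partial_product(20_000)
print(closed_form, approx)
print(abs(closed_form - approx) < 1e-9)  # True
```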
The mistake that has been the main subject of this note is a good example of the
"Strong Law of Small Numbers" which Richard K. Guy [11] formulated as follows:
"There aren't enough small numbers to meet the many demands made of them." One of
the corollaries stated in [11] captures the situation even better: "Superficial similarities
spawn spurious statements."
However, we would like to give the last word to Pierre-Simon Laplace and his
famous dictum: "Lisez Euler, lisez Euler, c'est notre maître à tous." (Read Euler, read
Euler, he is the master of us all.) [20]. And to do so, the best place to begin is the
wonderful Euler Archive [10].
REFERENCES
1. M. Abramowitz, I. A. Stegun, Handbook of Mathematical Functions. National Bureau of Standards,
Washington, DC, 1964.
2. T. J. I'A. Bromwich, An Introduction to the Theory of Infinite Series. Second edition. Macmillan, London,
1926.
3. M. Chamberland, A. Straub, On gamma quotients and infinite products, Adv. Appl. Math. 51 (2013)
546–562.
4. G. Chrystal, Algebra: An Elementary Text-Book for the Higher Classes of Secondary Schools and for
Colleges. Part II. Second edition. Adam and Charles Black, London, 1900.
5. H. Cohen, Number Theory. Vol. II: Analytic and Modern Tools. Springer, New York, 2007.
6. L. Euler, Introductio in Analysin Infinitorum. Tomus primus. Bernuset, Delamolliere, Falque & Soc.,
Lyon, 1748.
7. L. Euler, Introduction to Analysis of the Infinite. Book I. Translated from the Latin and with an introduction
by John D. Blanton. Springer-Verlag, New York, 1988.
8. L. Euler, Institutiones Calculi Differentialis. Pars posterior. Academia Imperialis Scientiarum
Petropolitanae, 1755.
9. L. Euler, Vollständige Anleitung zur Differenzialrechnung. Zweyter Theil. Aus dem Lateinischen
übersetzt und mit Anmerkungen und Zusätzen begleitet von Johann Andreas Christian Michelsen.
Lagarde und Friedrich, Berlin und Libau, 1790. Reprinted by LTR Verlag GmbH, Bad Honnef, 1981.
10. The Euler Archive. A digital library dedicated to the work and life of Leonhard Euler,
http://eulerarchive.maa.org.
11. R. K. Guy, The strong law of small numbers, Amer. Math. Monthly 95 (1988) 697–712.
12. E. R. Hansen, A Table of Series and Products. Prentice-Hall, Englewood Cliffs, NJ, 1975. Errata: Math.
Comp. 47 (1986) 767.
13. E. R. Hansen, Addendum for A Table of Series and Products. Unpublished and distributed with [12],
http://eldonhansen.com/.
14. W. Johnson, Some polynomials associated with up-down permutations, Discrete Math. 210 (2000)
117–136.
15. L. B. W. Jolley, Summation of Series. Second revised edition. Dover Publications, New York, 1961.
16. NIST Handbook of Mathematical Functions. Ed. F. W. J. Olver et al. Cambridge Univ. Press, New York,
2010.
17. A. P. Prudnikov, Yu. A. Brychkov, O. I. Marichev, Integrals and Series. Vol. 1, Elementary Functions.
Translated from the Russian and with a preface by N. M. Queen. Gordon & Breach Science Publishers,
New York, 1986.


18. A. P. Prudnikov, Yu. A. Brychkov, O. I. Marichev, Integraly i ryady. Elementarnye funktsii. Nauka,
Moscow, 1981.
19. R. P. Stanley, Enumerative Combinatorics. Vol. I. Wadsworth & Brooks/Cole, Monterey, CA, 1986.
20. Wikiquote contributors, Pierre-Simon Laplace, Wikiquote, The Free Quote Compendium, http://en.
wikiquote.org/wiki/Pierre-Simon_Laplace.
Department of Mathematics and Statistics, Dalhousie University, Halifax, Nova Scotia, B3H 4R2, Canada
dilcher@mathstat.dal.ca
LSS-Supélec, Université Paris-Sud, Orsay, France and Department of Mathematics, Tulane University,
New Orleans, LA 70118, USA
cvignat@tulane.edu

Rational Nonaxis Points on the Unit Circle Have Irrational Angles


We give a short elementary proof that any rational point on the unit circle that does
not lie on a coordinate axis has an angle that is an irrational multiple of $\pi$. For more
refined results at the cost of more involved and less elementary proofs, see Eckert [1]
and Tan [2].
Theorem 1. If $a, b, c \in \mathbb{Z}$, $\frac{a}{c} + \frac{b}{c}\,i = e^{2\pi i \theta}$, and $ab \neq 0$, then $\theta$ is irrational.
Proof. Since $a^2 + b^2 = c^2$, by standard arguments, we may assume $a, b, c$ are pairwise relatively prime and $c$ is odd. Let $p$ be a prime dividing $c$. If $\theta$ is rational, then
for some $N \in \mathbb{N}$, $(a + bi)^N = c^N$, and so $(a + bi)^n \equiv 0 \pmod{p}$ for all $n \geq N$, where
mod $p$ means that both real and imaginary coordinates are considered mod $p$.
However,
\[ (a + bi)^p \equiv a^p + (bi)^p \equiv a \pm bi \pmod{p}, \]
where the first congruence follows by the binomial theorem mod $p$, and the second by
Fermat's little theorem and the fact that $p$ is odd. It follows that
\[ (a + bi)^{p^k} \equiv a \pm bi \not\equiv 0 \pmod{p} \]
for all $k \in \mathbb{N}$; contradiction.
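The congruence at the heart of the proof can be checked numerically. The sketch below is only an illustration, not part of the original note: the helper name `gauss_pow_mod` and the sample triples (3, 4, 5) and (5, 12, 13) are assumptions chosen here. It raises $a + bi$ to the power $p^k$ with both coordinates reduced mod $p$ and confirms that the result is $a \pm bi$, hence nonzero.

```python
def gauss_pow_mod(a, b, n, p):
    """Return (a + b*i)**n with real and imaginary parts reduced mod p."""
    re, im = 1, 0                      # running product, starts at 1 + 0i
    base_re, base_im = a % p, b % p
    while n > 0:
        if n & 1:                      # multiply the result by the current base
            re, im = ((re * base_re - im * base_im) % p,
                      (re * base_im + im * base_re) % p)
        # square the base
        base_re, base_im = ((base_re * base_re - base_im * base_im) % p,
                            (2 * base_re * base_im) % p)
        n >>= 1
    return re, im


# Illustrative triples (a, b, c) with a^2 + b^2 = c^2 and a prime p dividing c.
for a, b, p in [(3, 4, 5), (5, 12, 13)]:
    for k in range(1, 4):
        print(a, b, p, k, gauss_pow_mod(a, b, p ** k, p))
```

Repeated squaring keeps every intermediate value below $p^2$, so the check stays cheap even for large exponents $p^k$.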


REFERENCES
1. E. J. Eckert, The group of primitive Pythagorean triangles, Math. Mag. 57 no. 1 (1984) 22–27.
2. L. Tan, The group of rational points on the unit circle, Math. Mag. 69 no. 3 (1996) 163–171.

Submitted by Tim Hsu, San Jose State University


http://dx.doi.org/10.4169/amer.math.monthly.123.5.490
MSC: Primary 00A01





Explicit Additive Decomposition of Norms on $\mathbb{R}^2$
Iosif Pinelis

Abstract. A well-known result by Lindenstrauss is that any two-dimensional normed space can be isometrically imbedded into $L^1(0, 1)$. We provide an explicit form of such an imbedding. The proof is elementary and self-contained.

Let $V$ be a vector space over $\mathbb{R}$ endowed with a norm $\|\cdot\|$, with the dual space $V^*$.
Let us say that the norm $\|\cdot\|$ admits an additive decomposition if there exists a Borel
measure $\mu$ on $V^*$ such that
\[ \|x\| = \int_{V^*} |\ell(x)|\, \mu(d\ell) \quad \text{for all } x \in V. \tag{1} \]
Clearly such a decomposition exists if $V$ is one dimensional. In this note an explicit
decomposition of the form (1) will be given in the case when $V$ is two dimensional. It
is well known that, in general, there is no such decomposition if the dimension of $V$ is
greater than 2; cf. Remark 4 in the present note.
To state our main result, let us recall some basic facts about convex functions.
Suppose that a function $f : \mathbb{R} \to \mathbb{R}$ is convex. Then $f$ is continuous and has finite nondecreasing right and left derivatives $f'_+$ and $f'_-$, which are right- and left-continuous,
respectively. Moreover, the function $f' := (f'_+ + f'_-)/2$ is nondecreasing as well. The
Lebesgue–Stieltjes integral $\int_{\mathbb{R}} \varphi(t)\, df'(t)$ is the Lebesgue integral $\int_{\mathbb{R}} \varphi\, d\nu$, where $\nu$ is
the Borel measure determined by the condition that $\nu((a, b]) = f'_+(b) - f'_+(a)$ for all
real $a$ and $b$ such that $a < b$. The latter condition is equivalent to each of the following
conditions: (i) $\nu([a, b)) = f'_-(b) - f'_-(a)$ for all real $a$ and $b$ such that $a < b$ and (ii)
$\nu([a, b)) + \nu((a, b]) = 2\bigl(f'(b) - f'(a)\bigr)$ for all real $a$ and $b$ such that $a < b$.
Now we are ready to state the following explicit additive decomposition of an arbitrary norm in the case when $V = \mathbb{R}^2$.
Theorem 1. Let $\|\cdot\|$ be any norm on $\mathbb{R}^2$. Let $N(u) := \|(u, 1)\|$ for all real $u$. Then
the function $N$ is convex, the limit
\[ c := \lim_{u \to \infty} \bigl[ \|(u, 1)\| - 2\|(u, 0)\| + \|(u, -1)\| \bigr] \tag{2} \]
exists and is finite and nonnegative, and
\[ \|(u, v)\| = \frac{c\,|v|}{2} + \frac{1}{2} \int_{\mathbb{R}} |u - tv|\, dN'(t) \quad \text{for all } (u, v) \in \mathbb{R}^2. \tag{3} \]

Obviously, (3) is an explicit additive decomposition of the form (1), with $\mu$ defined by
the condition that $2 \int_{\mathbb{R}^2} g(s, t)\, \mu(ds\, dt) = c\, g(0, 1) + \int_{\mathbb{R}} g(1, -t)\, dN'(t)$ for
all nonnegative Borel-measurable functions $g : \mathbb{R}^2 \to \mathbb{R}$.
http://dx.doi.org/10.4169/amer.math.monthly.123.5.491
MSC: Primary 46B04, Secondary 26D20; 39B22


Special cases of representation (3), for the $\ell_p$ norm on $\mathbb{R}^2$ with $p > 1$, are
\[ \|(u, v)\|_p = (|u|^p + |v|^p)^{1/p} = \frac{p - 1}{2} \int_{-\infty}^{\infty} |u - tv|\, |t|^{p-2} (|t|^p + 1)^{\frac{1}{p} - 2}\, dt \tag{4} \]
when $p \in (1, \infty)$ and
\[ \|(u, v)\|_\infty = \max(|u|, |v|) = \tfrac{1}{2} \bigl( |u + v| + |u - v| \bigr), \]
for all $(u, v) \in \mathbb{R}^2$. In these cases, the constant $c$ in (3)–(2) is 0. A simple case with
a nonzero $c$ is given by the formula $\|(u, v)\| = \|(u, v)\|_1 = |u| + |v|$ for $(u, v) \in \mathbb{R}^2$,
with $c = 2$.
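Formula (4) lends itself to a quick numerical sanity check. The following sketch is an illustration only; the function name `lp_norm_via_formula4`, the substitution $t = \tan\theta$, the midpoint rule, and the sample values of $(u, v, p)$ are all choices made here rather than anything taken from the note.

```python
import math


def lp_norm_via_formula4(u, v, p, n=200_000):
    """Midpoint-rule estimate of the right side of (4), using t = tan(theta)."""
    h = math.pi / n
    total = 0.0
    for j in range(n):
        theta = -math.pi / 2 + (j + 0.5) * h   # midpoint of the j-th subinterval
        t = math.tan(theta)
        at = abs(t)
        # density |t|^(p-2) * (|t|^p + 1)^(1/p - 2) appearing in (4)
        weight = at ** (p - 2) * (at ** p + 1.0) ** (1.0 / p - 2.0)
        # dt = d(theta) / cos(theta)^2
        total += abs(u - t * v) * weight / math.cos(theta) ** 2
    return 0.5 * (p - 1) * total * h


# Compare with the closed form (|u|^p + |v|^p)^(1/p) for p = 2.
print(lp_norm_via_formula4(3.0, 4.0, 2.0), math.hypot(3.0, 4.0))
```

The substitution maps the whole real line to a bounded interval, which avoids having to truncate the integral by hand.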
Remark 2. Concerning formula (4), one may recall Theorem 7.2 and Corollary 1 in
[8], which state that the classical normed spaces $\ell_p^n$ (for natural $n$) and $L^p(0, 1)$ can be
isometrically imbedded into $L^r(0, 1)$ whenever $1 \le r \le p \le 2$. A simple argument,
based on the scaling properties of sums of independent identically distributed (i.i.d.)
symmetric stable random variables (r.v.'s), was proposed in [12, pages 161–162],
which shows that these isometric imbeddings can be given in rather explicit form, in
terms of such sums of r.v.'s.
In particular, any Euclidean space is linearly isometric to a subspace of $L^1$. In this
case, one has the following explicit additive decomposition of the norm:
\[ \|x\|_2 := \sqrt{x \cdot x} = \sqrt{\frac{\pi}{2}} \int_{\mathbb{R}^d} |x \cdot t|\, \gamma_d(dt) \tag{5} \]
for all $x \in \mathbb{R}^d$, where $\gamma_d$ is the standard Gaussian measure on $\mathbb{R}^d$ and $\cdot$ denotes the standard inner product on $\mathbb{R}^d$. In place of $\gamma_d$, one can similarly use any other spherically
invariant measure $\nu$ on $\mathbb{R}^d$ such that $\int_{\mathbb{R}^d} |x \cdot t|\, \nu(dt) \in (0, \infty)$ for some or, equivalently, any nonzero vector $x$ in $\mathbb{R}^d$.
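The proof of Theorem 1 relies on the following.
Lemma 3. Suppose that $f : \mathbb{R} \to \mathbb{R}$ is a convex function such that for some real $k$
there exist finite limits
\[ d_+ := d_{f,k;+} := \lim_{u \to \infty} [f(u) - ku] \quad \text{and} \quad d_- := d_{f,k;-} := \lim_{u \to -\infty} [f(u) + ku]. \tag{6} \]
Then for all $u \in \mathbb{R}$
\[ f(u) = \frac{d_+ + d_-}{2} + \frac{1}{2} \int_{\mathbb{R}} |u - t|\, df'(t). \tag{7} \]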

Formula (7) is similar to well-known integral representations of convex functions;
cf., e.g., [10, Section 1.6]. The particularity of representation (7) is due to the fact that
it is obtained under condition (6).
Proof of Lemma 3. Since the function $f'$ is nondecreasing, there exist limits $k_\pm := \lim_{x \to \pm\infty} f'(x) \in [-\infty, \infty]$. Moreover, for any real $u > 0$ one has $f(u) - ku = f(0) + \int_0^u (f'(t) - k)\, dt$, which converges to a finite limit (as $u \to \infty$) only if
$k_+ = k$. Similarly, $k_- = -k$. So, in view of (6), for any real $u$
\begin{align*}
f(u) + ku &= d_- + \int_{-\infty}^{u} (f'(z) + k)\, dz = d_- + \int_{-\infty}^{u} dz \int_{-\infty}^{z} df'(t) \\
&= d_- + \int_{-\infty}^{u} df'(t) \int_{t}^{u} dz = d_- + \int_{-\infty}^{u} \max(0, u - t)\, df'(t),
\end{align*}
so that
\[ f(u) + ku = d_- + \int_{\mathbb{R}} \max(0, u - t)\, df'(t). \]
Similarly, $f(u) - ku = d_+ + \int_{\mathbb{R}} \max(0, t - u)\, df'(t)$ for any real $u$. Adding the last
two identities, one obtains (7).
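Representation (7) is easy to test when the measure $df'$ is a finite sum of point masses, as happens for piecewise linear $f$. The sketch below is illustrative only: the helper `lemma3_rhs` and the two sample functions (the function $N$ for the $\ell_1$ norm and for the supremum norm, both with $k = 1$) are choices made here, and the limits $d_\pm$ and the jumps of $f'$ were worked out by hand for these two cases.

```python
def lemma3_rhs(u, d_plus, d_minus, atoms):
    """Right side of (7) when df' is a finite sum of point masses.

    `atoms` is a list of (t, mass) pairs: f' jumps by `mass` at the point t.
    """
    return (d_plus + d_minus) / 2 + sum(m * abs(u - t) for t, m in atoms) / 2


def n_l1(u):
    """N(u) = ||(u, 1)|| for the l1 norm; here k = 1 and d_+ = d_- = 1."""
    return abs(u) + 1


def n_sup(u):
    """N(u) = ||(u, 1)|| for the supremum norm; here k = 1 and d_+ = d_- = 0."""
    return max(abs(u), 1)


for u in [-3.0, -1.0, -0.25, 0.0, 0.5, 1.0, 2.5]:
    # l1 norm: f' jumps by 2 at t = 0
    print(u, n_l1(u), lemma3_rhs(u, 1, 1, [(0, 2)]))
    # supremum norm: f' jumps by 1 at t = -1 and at t = 1
    print(u, n_sup(u), lemma3_rhs(u, 0, 0, [(-1, 1), (1, 1)]))
```

In both cases $d_+ + d_-$ agrees with the constant $c$ of Theorem 1 (2 for the $\ell_1$ norm, 0 for the supremum norm).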

Proof of Theorem 1. That the function $N$ is convex follows immediately from the convexity of the norm. Note next that the limits $d_{f,k;\pm}$ in (6) exist in $[-\infty, \infty]$ for any
convex function $f : \mathbb{R} \to \mathbb{R}$ and any real $k$. On the other hand, for all real $u$
\[ \bigl| N(u) - |u|\, \|(1, 0)\| \bigr| = \bigl| \|(u, 1)\| - \|(u, 0)\| \bigr| \le \|(0, 1)\|, \]
by the norm inequality. So, the limits $d_\pm = d_{f,k;\pm}$ in (6) exist and are finite for $f = N$
and
\[ k = \|(1, 0)\|. \tag{8} \]
Therefore, by Lemma 3, (7) holds with $f = N$ and $d_\pm = d_{N,\|(1,0)\|;\pm}$. It follows that,
with these $d_\pm$,
\[ 2\|(u, v)\| = 2|v|\, \|(u/v, 1)\| = 2|v|\, N(u/v) = (d_+ + d_-)|v| + \int_{\mathbb{R}} |u - tv|\, dN'(t) \tag{9} \]
for all real $u$ and all real $v \neq 0$. The last expression in (9) is continuous in $v \in \mathbb{R}$ (by
dominated convergence) because, by (9) with $(u, v) = (0, 1)$, one has $\int_{\mathbb{R}} |t|\, dN'(t)
= 2\|(0, 1)\| - (d_+ + d_-) < \infty$. Thus, one has (3), with $d_+ + d_-$ in place of $c$, for
all $(u, v) \in \mathbb{R}^2$.
Moreover,
\begin{align*}
d_+ + d_- &= \lim_{u \to \infty} \bigl[ N(u) - ku + N(-u) - ku \bigr] \\
&= \lim_{u \to \infty} \bigl[ \|(u, 1)\| - 2\|(1, 0)\|\, u + \|(-u, 1)\| \bigr] \\
&= \lim_{u \to \infty} \bigl[ \|(u, 1)\| - 2\|(u, 0)\| + \|(u, -1)\| \bigr] = c \ge 0,
\end{align*}
by (2) and, again, the convexity of the norm.
This completes the proof of Theorem 1.

From the proofs of Theorem 1 and Lemma 3, it follows that the nondecreasing
function $N'(x)$ tends to $\pm k$ as $x \to \pm\infty$, where $k$ is as in (8). So,
\[ F := \frac{1}{2} + \frac{1}{2k}\, N' \]
is a cumulative probability distribution function (cdf), regularized in the sense that
$2F(u) = F(u+) + F(u-)$ for all real $u$. Let the function $F^{-1} : (0, 1) \to \mathbb{R}$ be
(the smallest, left-continuous generalized inverse to $F$) defined by the condition
\[ F^{-1}(s) = \inf\{u \in \mathbb{R} : F(u) \ge s\} \quad \text{for } s \in (0, 1). \]
A well-known fact is that, if $S$ is a random variable (r.v.) uniformly distributed on
the interval $(0, 1)$, then the regularized cdf of the r.v. $F^{-1}(S)$ is $F$. So, (3) can be
rewritten as
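\[ \|(u, v)\| = \int_0^1 |u\, \varphi(s) + v\, \psi(s)|\, ds \tag{10} \]
for all $(u, v) \in \mathbb{R}^2$, where
\[ \varphi(s) := \begin{cases} 2k & \text{if } 0 < s < \tfrac{1}{2}, \\ 0 & \text{if } \tfrac{1}{2} \le s < 1, \end{cases} \qquad \psi(s) := \begin{cases} -2k\, F^{-1}(2s) & \text{if } 0 < s < \tfrac{1}{2}, \\ c & \text{if } \tfrac{1}{2} \le s < 1. \end{cases} \tag{11} \]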

Thus, the mapping $(u, v) \mapsto u\varphi + v\psi$ is a linear isometric imbedding of $\mathbb{R}^2$ (endowed
with the arbitrary norm $\|\cdot\|$) into $L^1(0, 1)$.
That any two-dimensional normed space is isometric to a subspace of $L^1(0, 1)$ was
shown by Lindenstrauss [9, Corollary 2]. In distinction from that result, the imbedding
into $L^1(0, 1)$ given by formulas (10)–(11) is quite explicit. Another difference is that
our method is elementary and the proof is self-contained. (Also, formula (3) is simpler
than, and therefore in some situations may be preferable to, (10)–(11).) On the other
hand, the study [9] contains a number of results that are more general than Corollary 2
therein.
More information on isometric imbeddings into $L^1$ can be found in survey [6]. In
particular, a connection with zonoids is discussed in Section 5 there; cf. the last display
on page 924 in [6] and formula (1) in the present note.
Using the isometric imbeddings into $L^1$ mentioned in Remark 2 and (say) the one
given by (10), it is straightforward to verify Hlawka's inequality
\[ \|x\| + \|y\| + \|z\| + \|x + y + z\| \ge \|x + y\| + \|y + z\| + \|z + x\| \]
for all $x, y, z$ in $V$ when $V$ is either $\ell_p^n$ (for natural $n$ and $p \in [1, 2]$) or two dimensional. Another way to show that Hlawka's inequality holds for any two-dimensional
normed space was presented in [5].
Remark 4. In general, a normed space $V$ of any given dimension greater than 2 is not
linearly isometric to a subspace of $L^1$. Indeed, otherwise Hlawka's inequality would
hold for all $x, y, z$ in $\mathbb{R}^3$. However, as pointed out, e.g., in [4], Hlawka's inequality
fails to hold for some $x, y, z$ in $\mathbb{R}^3$ endowed with the supremum norm.
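The failure of Hlawka's inequality for the supremum norm on $\mathbb{R}^3$ can be seen on a concrete triple. The vectors below are an illustrative choice made here (they are not quoted from [4]), and the helper names are hypothetical; the same triple satisfies the inequality under the $\ell_1$ norm, as expected.

```python
def sup_norm(x):
    return max(abs(t) for t in x)


def l1_norm(x):
    return sum(abs(t) for t in x)


def add(*vectors):
    """Coordinatewise sum of any number of vectors."""
    return tuple(sum(coords) for coords in zip(*vectors))


def hlawka_gap(norm, x, y, z):
    """Left side minus right side of Hlawka's inequality; negative means it fails."""
    lhs = norm(x) + norm(y) + norm(z) + norm(add(x, y, z))
    rhs = norm(add(x, y)) + norm(add(y, z)) + norm(add(z, x))
    return lhs - rhs


x, y, z = (1, 1, -1), (1, -1, 1), (-1, 1, 1)
print(hlawka_gap(sup_norm, x, y, z))  # -2: the inequality fails (4 < 6)
print(hlawka_gap(l1_norm, x, y, z))   # 6: the inequality holds (12 >= 6)
```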
In conclusion, consider the inequality
\[ \operatorname{E} \|X - Y\|_2 \le \operatorname{E} \|X + Y\|_2, \tag{12} \]
where $X$ and $Y$ are i.i.d. random vectors in $\mathbb{R}^d$ and $\|\cdot\|_2$ is the Euclidean norm, as in
(5). This inequality was obtained in [3]. As noted in [7], in the case $d = 1$, (12) follows
immediately from the identity
\[ \operatorname{E} |X + Y| = \operatorname{E} |X - Y| + 2 \int_0^\infty [\operatorname{P}(X > r) - \operatorname{P}(X < -r)]^2\, dr. \]
Now the $L^1$-imbedding formula (10) immediately yields the following corollary.
Corollary 5. For any two-dimensional normed space $V$ and any i.i.d. random vectors
$X$ and $Y$ in $V$,
\[ \operatorname{E} \|X - Y\| \le \operatorname{E} \|X + Y\|. \tag{13} \]

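As shown by Johnson [2], for each natural $d \ge 3$ inequality (13) fails to hold for
$V = \mathbb{R}^d$ in general. Indeed, define the norm on $\mathbb{R}^d$ by the formula
\[ \|x\| := \max\{ |x_i| \vee |x_i - x_j| : i, j = 1, \dots, d \} \]
for $x = (x_1, \dots, x_d) \in \mathbb{R}^d$. Let $(e_1, \dots, e_d)$ be the standard basis of $\mathbb{R}^d$. Let $X$ and $Y$
be i.i.d. random vectors in $\mathbb{R}^d$. For any natural $d \ge 4$, suppose that the random vector
$X$ is such that $\operatorname{P}(X = e_i) = \frac{1}{d}$ for each $i = 1, \dots, d$. Then $\operatorname{E} \|X - Y\| = 2\,\frac{d-1}{d} > \frac{d+1}{d} = \operatorname{E} \|X + Y\|$. In the remaining case when $d = 3$, suppose that the random vector $X$ is
such that $\operatorname{P}(X = e_1) = \operatorname{P}(X = e_2) = \operatorname{P}(X = e_3) = \operatorname{P}\bigl(X = -\tfrac{1}{2}(e_1 + e_2 + e_3)\bigr) = \tfrac{1}{4}$.
Then $\operatorname{E} \|X - Y\| = \frac{21}{16} > \frac{19}{16} = \operatorname{E} \|X + Y\|$.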
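The expectations in this counterexample can be recomputed exactly by enumerating all pairs of values. This is a minimal sketch with hypothetical helper names (`johnson_norm`, `expected_norms`, `basis`); it assumes $X$ is uniform on the listed points and uses exact rational arithmetic.

```python
from fractions import Fraction
from itertools import product


def johnson_norm(x):
    """max over i, j of |x_i| and |x_i - x_j|."""
    return max(max(abs(a) for a in x),
               max(abs(a - b) for a in x for b in x))


def expected_norms(points):
    """(E||X - Y||, E||X + Y||) for i.i.d. X, Y uniform on `points`."""
    n = len(points)
    pairs = list(product(points, repeat=2))
    diff = sum(johnson_norm([a - b for a, b in zip(x, y)]) for x, y in pairs)
    summ = sum(johnson_norm([a + b for a, b in zip(x, y)]) for x, y in pairs)
    return Fraction(diff, n * n), Fraction(summ, n * n)


def basis(d):
    """Standard basis vectors of R^d as tuples."""
    return [tuple(1 if i == j else 0 for j in range(d)) for i in range(d)]


# d = 4: X uniform on e_1, ..., e_4.
print(expected_norms(basis(4)))

# d = 3: X uniform on e_1, e_2, e_3 and -(e_1 + e_2 + e_3)/2.
half = Fraction(-1, 2)
print(expected_norms(basis(3) + [(half, half, half)]))
```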
In the arXiv version [11] of this note, one can find applications of the additive
decomposition of norms concerning the following: (i) explicit representations of the
moments of the norm of a random vector $X$ in terms of the characteristic function and
the Fourier–Laplace transform of the distribution of $X$ and (ii) an explicit and partially
improved form of the exact version of the Littlewood–Khinchin–Kahane inequality
obtained by Latała and Oleszkiewicz.
ACKNOWLEDGMENT. This note was sparked by answers by Noam D. Elkies and Suvrit Sra on MathOverflow [1] and William B. Johnson's comments there.

REFERENCES
1. Absolute value inequality for complex numbers, 2015, MathOverflow, http://mathoverflow.net/questions/167685/absolute-value-inequality-for-complex-numbers.
2. An inequality for two independent identically distributed random vectors in a normed space, 2015, MathOverflow, http://mathoverflow.net/questions/208194/an-inequality-for-two-independent-identically-distributed-random-vectors-in-a-no/208245#208250.
3. A. Buja, B. F. Logan, J. A. Reeds, L. A. Shepp, Inequalities and positive-definite functions arising from
a problem in multidimensional scaling, Ann. Statist. 22 (1994) 406–438.
4. W. Fechner, Hlawka's functional inequality, Aequationes Math. 87 (2014) 71–87.
5. L. M. Kelly, D. M. Smiley, M. F. Smiley, Two dimensional spaces are quadrilateral spaces, Amer. Math.
Monthly 72 (1965) 753–754.
6. A. Koldobsky, H. König, Aspects of the isometric theory of Banach spaces, in Handbook of the Geometry
of Banach Spaces, Vol. I. North-Holland, Amsterdam, 2001. 899–939.
7. M. Lifshits, R. M. Schilling, I. Tyurin, A probabilistic inequality related to negative definite functions, in
High Dimensional Probability VI. Progress in Probability, Vol. 66, Springer, Basel, 2013. 73–80.
8. J. Lindenstrauss, A. Pełczyński, Absolutely summing operators in $L_p$-spaces and their applications,
Studia Math. 29 (1968) 275–326.
9. J. Lindenstrauss, On the extension of operators with a finite-dimensional range, Illinois J. Math. 8 (1964)
488–499.
10. C. P. Niculescu, L.-E. Persson, Convex Functions and Their Applications: A Contemporary Approach. CMS Books in Mathematics/Ouvrages de Mathématiques de la SMC, 23, Springer, New York, 2006.
11. I. Pinelis, Explicit additive decomposition of norms on $\mathbb{R}^2$, http://arxiv.org/abs/1506.00537,
2015.
12. H. P. Rosenthal, On the span in $L^p$ of sequences of independent random variables. II, in Proceedings
of the Sixth Berkeley Symposium on Mathematical Statistics and Probability (Univ. California, Berkeley,
Calif., 1970/1971), Vol. II: Probability theory. Univ. California Press, Berkeley, CA, 1972. 149–167.
Department of Mathematical Sciences, Michigan Technological University, Houghton, Michigan 49931
ipinelis@mtu.edu


Parabolas and Archimedes' 2/3-Property


Michael Gaul and Fred Kuczmarski
Abstract. Archimedes discovered that the area of the region bounded by a parabola and
a chord is 2/3 the area of its circumscribing parallelogram with two sides parallel to the
parabola's axis. We show that parabolic arcs are the only smooth strictly convex functions
with the 2/3-property, where two sides of the circumscribing parallelogram are parallel to the
y-axis.

Archimedes showed that the area of the region bounded by a parabola and a chord AB
is 2/3 the area of the circumscribing parallelogram ABCD (Figure 1(a)). Sides AD and
BC are parallel to the axis of the parabola, and CD is tangent to the parabola. To see
why, translate the vertical segments in the parallelogram so that their top endpoints
lie on a horizontal line (Figure 1(b)). The transformation maps the parallelogram to a
rectangle $A'B'C'D'$ and the parabola to another parabola with its vertex at the midpoint of $C'D'$. The symmetric parabolic segment occupies 2/3 of the rectangle, and
Archimedes's result follows from Cavalieri's principle.

Figure 1. The area of a parabolic segment

The authors of [1] claim a converse to this theorem, that with certain differentiability
assumptions quadratics are the only convex functions with the 2/3-property (where
two sides of the circumscribing parallelogram are parallel to the y-axis). Our efforts to
understand their proof led us to consider functions of the form
\[ y = f(x) = a\sqrt{x + d} + bx + c, \quad a \neq 0. \tag{1} \]
We soon realized these functions also have the 2/3-property. The graphs of these
functions are parabolic arcs, as the reader may check by writing (1) in the form
$Ax^2 + 2Bxy + Cy^2 + Dx + Ey + F = 0$ and verifying that $B^2 - AC = 0$.$^1$ However, the 2/3-property is not the Archimedean property of Figure 1, for sides BC and
AD of the circumscribing parallelogram ABCD are not parallel to the axis (Figure 2).
But since ABCD has the same area as the circumscribing parallelogram ABcd with
sides Ad and Bc parallel to the axis, the functions (1) also have the 2/3-property.
http://dx.doi.org/10.4169/amer.math.monthly.123.5.497
MSC: Primary 26A06, Secondary 51N20

Figure 2. The 2/3-property for the curve $y = x + \sqrt{x}$

In this note we describe how we came upon this family of functions and sketch a
proof of the following theorem.
Theorem. Let $I$ be an open interval and $f : I \to \mathbb{R}$ a four times differentiable function with nonvanishing second derivative.
Then $f$ has the 2/3-property if and only if
$f(x) = ax^2 + bx + c$ or $f(x) = a\sqrt{x + d} + bx + c$, for some constants $a$, $b$, $c$, and
$d$, with $a \neq 0$, or equivalently if and only if the graph of $f$ is a parabolic arc.
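The 2/3-property in the theorem can be probed numerically. The sketch below is an illustration under stated assumptions, not the authors' method: the helper name `two_thirds_ratio`, the use of Simpson's rule and a ternary search, and the sample functions and intervals are all choices made here. It relies on the fact that the tangent parallel to the chord touches the graph where the vertical distance to the chord is largest.

```python
import math


def two_thirds_ratio(f, p, q, n=2000):
    """Segment area divided by circumscribing-parallelogram area for the chord over [p, q].

    Assumes f is strictly convex or strictly concave on [p, q]; n must be even.
    """
    slope = (f(q) - f(p)) / (q - p)

    def gap(x):
        # vertical distance between the graph and the chord
        return abs(f(x) - (f(p) + slope * (x - p)))

    # Segment area by composite Simpson's rule.
    h = (q - p) / n
    s = gap(p) + gap(q)
    for j in range(1, n):
        s += (4 if j % 2 else 2) * gap(p + j * h)
    segment = s * h / 3

    # The gap is unimodal, so a ternary search locates its maximum,
    # which is the vertical side length of the parallelogram.
    lo, hi = p, q
    for _ in range(200):
        m1, m2 = lo + (hi - lo) / 3, hi - (hi - lo) / 3
        if gap(m1) < gap(m2):
            lo = m1
        else:
            hi = m2
    height = gap((lo + hi) / 2)
    return segment / ((q - p) * height)


print(two_thirds_ratio(lambda x: x + math.sqrt(x), 1.0, 4.0))  # family (1)
print(two_thirds_ratio(lambda x: x * x, -1.0, 3.0))            # quadratic
print(two_thirds_ratio(math.exp, 0.0, 2.0))                    # not a parabolic arc
```

For the first two functions the printed ratio should agree with 2/3 up to quadrature error, while for the exponential it should visibly differ.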
Since we originally thought that functions satisfying the hypotheses of the theorem
would necessarily be quadratic, our hope was that a comparison of the fourth degree
Taylor polynomials for the areas of the segment and circumscribing parallelogram
would force the third derivative of f to vanish. But this approach ultimately failed. In
fact, much to our surprise, it told us nothing at all about f . While using the identical
approach with fifth degree polynomials ultimately proves the theorem, we thought it
would be instructive to first carry out the easier computation with quartics.
We start by approximating the area of the region $S$ bounded by the graph of
$y = f(x)$ and the chord AB with endpoints $A(a, f(a))$ and $B(a + h, f(a + h))$. To
simplify the calculations, define
\[ g(x) = f(x + a) - f(a), \]
in effect translating $A$ to the origin (Figure 3). We now work with the translated region
$S'$ bounded by $y = g(x)$ and the chord with endpoints $A'(0, 0)$ and $B'(h, g(h))$, keeping in mind that $g^{(n)}(0) = f^{(n)}(a)$, for $n \in \mathbb{N}$.
$^1$ Conversely, any function whose graph is an arc of a parabola with nonvertical axis may be written in the
form (1), as the reader can verify by solving $Ax^2 + 2Bxy + Cy^2 + Dx + Ey + F = 0$ for $y$ when $B^2 - AC
= 0$ and $C \neq 0$.

Figure 3. Translating A to the origin

We assume $f$ is three times differentiable and that $f'' > 0$. Then chord $A'B'$ lies
above the graph of $y = g(x)$ and the area of $S'$ is
\[ A(a, h) = \frac{1}{2}\, h\, g(h) - \int_0^h g(x)\, dx. \]
Using the third degree Taylor approximation to $g(x)$ at $x = 0$,
\[ g(h) = f'(a)\, h + \frac{f''(a)}{2} h^2 + \frac{f^{(3)}(a)}{6} h^3 + o(h^3), \tag{2} \]
gives
\[ A(a, h) = \frac{f''(a)}{12} h^3 + \frac{f^{(3)}(a)}{24} h^4 + o(h^4). \tag{3} \]
Approximating the function $P(a, h)$ for the area of the circumscribing parallelogram $A'B'C'D'$ is more delicate. The difficulty lies in getting a handle on the position of the point of tangency of side $C'D'$ and the curve $y = g(x)$. The mean value
theorem and our assumption that $f'' > 0$ guarantee a unique such point $(c, g(c))$, with
$c \in (0, h)$ and
\[ g'(c) = g(h)/h. \tag{4} \]

For $h \neq 0$, (4) defines $c = c(h)$ implicitly as a function of $h$. To allow for the
possibility that $h = 0$, we replace the right side of (4) with the function
\[ F(h) = \begin{cases} g(h)/h, & h \neq 0, \\ g'(0), & h = 0, \end{cases} \]
so that $c(0) = 0$. Then from (2),
\[ F(h) = f'(a) + \frac{f''(a)}{2} h + \frac{f^{(3)}(a)}{6} h^2 + o(h^2). \tag{5} \]

We now follow [1] and approximate the function $c(h)$. Since $g''(c) \neq 0$, $c(h)$ is
differentiable in a neighborhood of $h = 0$. Differentiating both sides of the equation
$g'(c(h)) = F(h)$ with respect to $h$ gives
\[ c'(h) = F'(h)/g''(c(h)). \tag{6} \]
In particular,
\[ c'(0) = 1/2. \]
This is not surprising. Since $g''(0) \neq 0$, $g(x)$ behaves like a quadratic near $x = 0$, and
$c$ ultimately lies at the midpoint of the interval $[0, h]$; as $h \to 0$, $c/h \to 1/2$.
A similar calculation using (5) and (6) shows that
\[ c''(0) = \frac{f^{(3)}(a)}{12 f''(a)}, \]
and hence
\[ c(h) = \frac{h}{2} + \frac{f^{(3)}(a)}{24 f''(a)} h^2 + o(h^2). \tag{7} \]

Using (2) first and then (7) gives the width $A'D'$ of the circumscribing parallelogram $A'B'C'D'$ in Figure 3 as
\begin{align*}
H(a, h) = c\, g'(c) - g(c) &= \frac{f''(a)}{2} c^2 + \frac{f^{(3)}(a)}{3} c^3 + o(c^3) \\
&= \frac{f''(a)}{8} h^2 + \frac{f^{(3)}(a)}{16} h^3 + o(h^3),
\end{align*}
and its area as
\[ P(a, h) = h\, H(a, h) = \frac{f''(a)}{8} h^3 + \frac{f^{(3)}(a)}{16} h^4 + o(h^4). \tag{8} \]

Comparing (3) and (8) shows that
\[ A(a, h) - \frac{2}{3} P(a, h) = o(h^4). \]
Surprisingly, all functions with nonvanishing second derivative have the 2/3-property
to the fourth order approximation. In fact, just the third order approximations show
that
\[ \lim_{h \to 0} A(a, h)/P(a, h) = 2/3. \]

Forging ahead, we now assume that $f$ is four times differentiable. Calculations
along similar lines show that
\[ c(h) = \frac{h}{2} + \frac{f^{(3)}(a)}{24 f''(a)} h^2 + \frac{f''(a) f^{(4)}(a) - (f^{(3)}(a))^2}{48 (f''(a))^2} h^3 + o(h^3), \]
\[ A(a, h) = \frac{f''(a)}{12} h^3 + \frac{f^{(3)}(a)}{24} h^4 + \frac{f^{(4)}(a)}{80} h^5 + o(h^5), \]
and
\[ P(a, h) = \frac{f''(a)}{8} h^3 + \frac{f^{(3)}(a)}{16} h^4 + \frac{21 f''(a) f^{(4)}(a) + (f^{(3)}(a))^2}{1152 f''(a)} h^5 + o(h^5). \]

Setting $A(a, h) = (2/3) P(a, h)$ and comparing the coefficients of $h^5$ gives
\[ \bigl( f^{(3)}(a) \bigr)^2 = \frac{3}{5} f''(a) f^{(4)}(a). \tag{9} \]

While (9) is a necessary condition for $f$ to have the 2/3-property, it would certainly be
surprising if it were sufficient as well. But in fact it is, for the condition leads directly
to quadratics and the family of functions (1). To see how, let $Y = f''(x)$, and rewrite
(9) as
\[ (Y')^2 = \frac{3}{5}\, Y Y''. \]
Now if $f^{(3)}$ is identically zero, then $f$ is quadratic. Otherwise, integrating
\[ \frac{Y''}{Y'} = \frac{5}{3}\, \frac{Y'}{Y} \]
gives
\[ Y' = C_1 Y^{5/3}, \]
for some constant $C_1 \neq 0$. Then
\[ \frac{d^2 y}{dx^2} = Y = (C_1 x + C_2)^{-3/2}, \]
for some constant $C_2$. Integrating twice gives the family (1).
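The integration steps can be spelled out as follows. This is a sketch under the assumption that $Y$ and $Y'$ do not vanish on the interval considered; the constants $K_1$, $K_2$, $\alpha$, $\beta$ are names introduced here for illustration, the text's $C_1$ and $C_2$ in the final formula correspond to $\alpha$ and $\beta$ after renaming, and only the case $Y > 0$, $\alpha > 0$ is written out.

```latex
\begin{align*}
\frac{Y''}{Y'} = \frac{5}{3}\,\frac{Y'}{Y}
  &\;\Longrightarrow\; \ln|Y'| = \tfrac{5}{3}\ln|Y| + K_1
  \;\Longrightarrow\; Y' = C_1 Y^{5/3}, \\
Y^{-5/3}\,dY = C_1\,dx
  &\;\Longrightarrow\; -\tfrac{3}{2}\,Y^{-2/3} = C_1 x + K_2
  \;\Longrightarrow\; Y = (\alpha x + \beta)^{-3/2},
  \quad \alpha = -\tfrac{2}{3}C_1,\ \beta = -\tfrac{2}{3}K_2, \\
f'(x) &= \int (\alpha x + \beta)^{-3/2}\,dx
  = -\frac{2}{\alpha}(\alpha x + \beta)^{-1/2} + b, \\
f(x) &= -\frac{4}{\alpha^{2}}\sqrt{\alpha x + \beta} + bx + c
  = a\sqrt{x + d} + bx + c,
  \quad a = -\frac{4}{\alpha^{3/2}},\ d = \frac{\beta}{\alpha}.
\end{align*}
```

As a consistency check, $f(x) = \sqrt{x}$ has $f'' = -\tfrac{1}{4}x^{-3/2}$, $f^{(3)} = \tfrac{3}{8}x^{-5/2}$, and $f^{(4)} = -\tfrac{15}{16}x^{-7/2}$, so that $(f^{(3)})^2 = \tfrac{9}{64}x^{-5} = \tfrac{3}{5} f'' f^{(4)}$, in agreement with (9).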
ACKNOWLEDGMENT. The authors wish to thank the referee for several helpful suggestions.

REFERENCE
1. Á. Bényi, P. Szeptycki, F. Van Vleck, Archimedean properties and parabolas, Amer. Math. Monthly 107
(2000) 945–949.
Mathematics Department, North Seattle College, Seattle, WA 98103
mgaul@northseattle.edu
Mathematics Department, Shoreline Community College, Shoreline, WA 98133
fkuczmar@shoreline.edu


An Extremely Short Proof of the Hairy Ball Theorem
Peter McGrath

Abstract. Using winding numbers, we give an extremely short proof that every continuous
field of tangent vectors on $S^2$ must vanish somewhere.

Consider the unit two sphere $S^2 = \{p \in \mathbb{R}^3 : |p| = 1\}$ in $\mathbb{R}^3$. We say a function $v : S^2 \to \mathbb{R}^3$ is a vector field on $S^2$ if $\langle v(p), p \rangle = 0$ for each $p \in S^2$ and call a vector field
continuous if its component functions are continuous.
Theorem 1. Suppose $v$ is a continuous vector field on $S^2$. Then there is $p \in S^2$ such
that $v(p) = 0$.
This classical theorem was originally proven by Poincaré and is sometimes called
the Hairy Ball theorem. Theorem 1 has many interesting proofs (see, for instance, [2]
and the charming book [1]) and various generalizations; for more information, see the
introduction of [2]. The distinguishing attribute of the present proof is its brevity and
elegance: Each of the aforementioned proofs requires computations in and between a
set of stereographic coordinate charts that appropriately cover $S^2$. The argument here
is shorter and simpler.
A regular smooth curve in the plane is a smooth map $\gamma : S^1 \to \mathbb{R}^2$ whose derivative $\gamma'$
does not vanish anywhere. The rotation number of such a curve is $\frac{1}{2\pi}$ times the
change that the oriented angle $\gamma'$ makes with some fixed reference direction (e.g., $e_1 =
(1, 0)$) as the curve is traversed; in other words, it is the winding number of $\gamma'$, thought
of as a map $S^1 \to \mathbb{R}^2 \setminus \{0\}$. The rotation number is an integer that is an invariant under
regular homotopy (homotopy through regular curves).
Proof. Suppose for the sake of a contradiction that $S^2$ admits a continuous nonvanishing vector field $v$; we may suppose $v$ has unit length by replacing $v$ with $v/|v|$. We first
note that the definition of rotation number can be extended to curves in $S^2$ by replacing
the fixed reference direction $e_1$ by the variable direction $v$ in the definition above.
To see this, endow $\mathbb{R}^3$ with a right-handed orientation so the ordered 3-tuple of
standard basis vectors $\{e_1, e_2, e_3\}$ is positively oriented and identify $\mathbb{R}^2$ with the subset
$\{(x, y, z) \in \mathbb{R}^3 : z = 0\} \subset \mathbb{R}^3$. Given $p \in S^2$ and a unit vector $w \in T_p S^2$, there is a
unique unit vector $w^\perp \in T_p S^2$ such that $\{p, w, w^\perp\}$ is positively oriented. For such $p$
and $w$, denote by $\Phi_{p,w}$ the isometry of $\mathbb{R}^3$ determined by requiring that $\Phi_{p,w}$ map
the point $p$ to $0$ and send the ordered 3-tuple of tangent vectors $\{w, w^\perp, p\} \subset T_p \mathbb{R}^3$
to $\{e_1, e_2, e_3\} \subset T_0 \mathbb{R}^3$. Clearly, $\Phi_{p,w}$ depends continuously on $p$ and $w$. We define the
rotation number of a curve $\gamma$ in $S^2$ with respect to $v$ to be the winding number of the
continuous curve $\Phi_{\gamma, v(\gamma)}(\gamma')$.
Consider now the family of regular smooth curves in $S^2$ defined as follows: $C_{p,s}$ (for
$p \in S^2$, $s \in (-1, 1)$) is the circle $\{q \in S^2 : \langle q, p \rangle = s\}$, the intersection of $S^2$ with the plane $\langle q, p \rangle = s$, oriented so that $p$ is the positive normal. These curves are all regularly
homotopic and so have the same rotation number with respect to $v$, say $n$.
Now notice that for $s = 0$, $C_{p,s}$ and $C_{-p,-s}$ parametrize the same great circle but
with opposite orientations. Thus, $n = -n$ and hence $n = 0$. On the other hand, for $s$
close to 1, the rotation number of $C_{p,s}$ is close to the rotation number of a circle in
the plane because $v$ is close to $v(p)$ on $C_{p,s}$ by continuity. Thus, $n \in \{-1, 1\}$. This is a
contradiction.
http://dx.doi.org/10.4169/amer.math.monthly.123.5.502
MSC: Primary 55M25
REFERENCES
1. W. Chinn, N. Steenrod, First Concepts of Topology. Mathematical Association of America, Washington,
DC, 1966.
2. M. Eisenberg, R. Guy, A proof of the Hairy Ball theorem, Amer. Math. Monthly 86 (1979) 571–574.
Department of Mathematics, Brown University, Providence RI 02912
Peter Mcgrath@brown.edu


PROBLEMS AND SOLUTIONS


Edited by Gerald A. Edgar, Doug Hensley, Douglas B. West
with the collaboration of Itshak Borosh, Paul Bracken, Ezra A. Brown, Randall
Dougherty, Tamas Erdelyi, Zachary Franco, Christian Friesen, Ira M. Gessel, Laszlo
Liptak, Frederick W. Luttmann, Vania Mascioni, Frank B. Miles, Steven J. Miller,
Mohamed Omar, Richard Pfiefer, Dave Renfro, Cecil C. Rousseau, Leonard Smiley,
Kenneth Stolarsky, Richard Stong, Walter Stromquist, Daniel Ullman, Charles Vanden
Eynden, and Fuzhen Zhang.
Proposed problems and solutions should be sent in duplicate to the MONTHLY
problems address on the back of the title page. Proposed problems should never be
under submission concurrently to more than one journal, nor posted to the internet
before the due date for solutions. Submitted solutions should arrive before Sept 30,
2016. Additional information, such as generalizations and references, is welcome.
The problem number and the solver's name and address should appear on each solution. An asterisk (*) after the number of a problem or a part of a problem indicates
that no solution is currently available.

PROBLEMS
11908. Proposed by George E. Andrews, The Pennsylvania State University, University Park, PA, and Emeric Deutsch, Polytechnic Institute of New York University,
Brooklyn, NY. Let n and k be nonnegative integers. Show that the number of partitions
of n having k even parts is the same as the number of partitions of n in which the largest
repeated part is k (defined to be 0 if the parts are all distinct). For example, 7 has three
partitions with two even parts (4 + 2 + 1 = 3 + 2 + 2 = 2 + 2 + 1 + 1 + 1) and also
three partitions in which the largest repeated part is 2: (3 + 2 + 2 = 2 + 2 + 2 + 1
= 2 + 2 + 1 + 1 + 1).
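The statement of Problem 11908 can be explored by brute force for small $n$. The sketch below is only an experiment, not a solution; the helper names are hypothetical, and even parts are counted with multiplicity, as in the example for $n = 7$.

```python
def partitions(n, max_part=None):
    """Yield the partitions of n as non-increasing tuples."""
    if max_part is None:
        max_part = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, max_part), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest


def even_parts(partition):
    """Number of even parts, counted with multiplicity."""
    return sum(1 for part in partition if part % 2 == 0)


def largest_repeated_part(partition):
    """Largest part occurring at least twice, or 0 if all parts are distinct."""
    repeated = [part for part in set(partition) if partition.count(part) >= 2]
    return max(repeated, default=0)


def counts_by(statistic, n):
    """Map each value k of the statistic to the number of partitions of n attaining it."""
    counts = {}
    for partition in partitions(n):
        k = statistic(partition)
        counts[k] = counts.get(k, 0) + 1
    return counts


print(counts_by(even_parts, 7))
print(counts_by(largest_repeated_part, 7))
```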
11909. Proposed by Hideyuki Ohtsuka, Saitama, Japan. Prove that for every positive
integer $m$ there exists a polynomial $P_m$ in two variables, with integer coefficients, such
that for all integers $n$ and $r$ with $0 \le r \le n$,
\[ \sum_{k=-r}^{r} k^{2m} \binom{n}{r + k} \binom{n}{r - k} = \frac{P_m(n, r)}{\prod_{j=1}^{m} (2n - 2j + 1)} \binom{2n}{2r}. \]
11910. Proposed by Cornel Ioan Vălean, Teremia Mare, Romania. Let $G_k$ be the reciprocal of the $k$th Fibonacci number; for example, $G_4 = 1/3$ and $G_5 = 1/5$. Find
\[ \sum_{n=1}^{\infty} \bigl( \arctan G_{4n-3} + \arctan G_{4n-2} + \arctan G_{4n-1} - \arctan G_{4n} \bigr). \]
11911. Proposed by Leonard Giugiuc, Drobeta Turnu Severin, Romania. Let $a$, $b$,
and $c$ be positive numbers such that $1 + ab + bc + ca = a + b + c + 2abc$. Prove
$a^3 + b^3 + c^3 + 5abc \ge 1$ and determine when equality holds.
http://dx.doi.org/10.4169/amer.math.monthly.123.5.504





11912. Proposed by Pál Péter Dályay, Szeged, Hungary. Let $\omega$ be the circumscribed
circle of triangle ABC, and let $R$ and $r$ be the radii of its circumcircle and incircle,
respectively. Let $r_A$, $r_B$, and $r_C$ be the radii of the A-, B-, and C-mixtilinear incircles
of ABC and $\omega$, respectively. Prove that $4r \le r_A + r_B + r_C \le \frac{1}{4}(5R + 6r)$. (For the
definition of a mixtilinear incircle see problem 11774; that problem and its solution
are found on the next page of this issue.)
11913. Proposed by George Stoica, Saint John, New Brunswick, Canada. Let $\alpha$ be a
positive constant, and let $f$ map $(0, \infty)$ to $\mathbb{R}^+$. Given $\lim_{x \to \infty} x^{1/\alpha} f(x) = \infty$, prove
\[ \liminf_{x \to \infty} \frac{|f'(x)|}{f(x)^{1+\alpha}} = 0. \]
11914. Proposed by Robin Chapman, Mathematics Research Institute, University of
Exeter, Exeter, (U. K.), and Roberto Tauraso, Università di Roma Tor Vergata, Rome,
Italy. Show that for all positive integers $m$ and $n$,





n
3m

mk
k n k
j n + 1 2k
(4)
(2)
= 0.
k 1 j=1
j 1
3m j
k=1

(Here $\binom{x}{k} = \frac{1}{k!} \prod_{i=0}^{k-1} (x - i)$ for $x \in \mathbb{R}$.)

SOLUTIONS
Compositions Having At Least One 1
11767 [2014, 267]. Proposed by Mircea Merca, University of Craiova, Craiova,
Romania. Prove that
\[ \sum \frac{(1 + t_1 + t_2 + \cdots + t_n)!}{(1 + t_1)!\, t_2! \cdots t_n!} = 2^n - F_n, \]
where the sum is over all nonnegative integer solutions to $t_1 + 2t_2 + \cdots + n t_n = n$ and
$F_k$ is the $k$th Fibonacci number.
Solution I by CMC 328, Carleton College, Northfield, MN. View the sum as over all
partitions of $n + 1$ having at least one 1, treating $t_1 + 1$ as the number of copies of 1
and $t_j$ as the number of copies of $j$ for $2 \le j \le n$. The summand counts the ways to
permute the parts, so the sum is the number of compositions of $n + 1$ having at least
one 1.
The number of compositions of $n + 1$ is $2^n$, so it suffices to prove that the number
$a_n$ of compositions of $n + 1$ with no 1 is $F_n$. This is clear for $n = 0$ and $n = 1$. When
$n \ge 2$, these compositions have last part 2 or greater than 2. Deleting the last part
shows that there are $a_{n-2}$ of the first type, and subtracting 1 from the last part shows
that there are $a_{n-1}$ of the second type. By induction, $a_n = a_{n-1} + a_{n-2} = F_n$.
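Both the identity and the counting interpretation in Solution I can be checked by enumeration for small $n$. The following sketch is only a check, with hypothetical helper names; it evaluates the sum directly over partitions of $n$, and separately counts compositions of $n + 1$ containing at least one 1.

```python
from math import factorial


def partitions(n, max_part=None):
    """Yield the partitions of n as non-increasing tuples."""
    if max_part is None:
        max_part = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, max_part), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest


def compositions(total):
    """Yield the compositions (ordered sums of positive integers) of `total`."""
    if total == 0:
        yield ()
        return
    for first in range(1, total + 1):
        for rest in compositions(total - first):
            yield (first,) + rest


def merca_sum(n):
    """Left side of the identity for n >= 1: t_j is the multiplicity of the part j."""
    total = 0
    for part in partitions(n):
        t = [part.count(j) for j in range(1, n + 1)]
        denominator = factorial(1 + t[0])
        for tj in t[1:]:
            denominator *= factorial(tj)
        total += factorial(1 + sum(t)) // denominator
    return total


def fib(n):
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


for n in range(1, 9):
    with_a_one = sum(1 for c in compositions(n + 1) if 1 in c)
    print(n, merca_sum(n), 2 ** n - fib(n), with_a_one)
```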
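Solution II by Borislav Karaivanov, Lexington, SC. Rewrite the sum as
\[ \sum \frac{(t_1 + t_2 + \cdots + t_n)!}{t_1!\, t_2! \cdots t_n!}, \]
summed over all integer solutions to $t_1 + 2t_2 + \cdots + n t_n = n + 1$ with $t_1 \ge 1$ and
$t_i \ge 0$ for $i \ge 2$. This sum is the coefficient of $x^{n+1}$ in the series
\begin{align*}
f(x) &= \sum_{m=0}^{\infty} (x + x^2 + \cdots)^m - \sum_{m=0}^{\infty} (x^2 + x^3 + \cdots)^m \\
&= \sum_{m=0}^{\infty} \left( \frac{x}{1 - x} \right)^m - \sum_{m=0}^{\infty} \left( \frac{x^2}{1 - x} \right)^m \\
&= \frac{1 - x}{1 - 2x} - \frac{1 - x}{1 - x - x^2} = \frac{x (1 - x)^2}{(1 - 2x)(1 - x - x^2)}.
\end{align*}
Hence we seek the coefficient of $x^n$ in
\[ \frac{f(x)}{x} = \frac{(1 - x)^2}{(1 - 2x)(1 - x - x^2)} = \frac{1}{1 - 2x} - \frac{x}{1 - x - x^2}. \]
The coefficient subtracted in the second term is the number of 1, 2-lists with sum $n - 1$,
well known to be $F_n$, so the answer is $2^n - F_n$.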
Also solved by R. Bagby, D. Beckwith, R. Chapman (U. K.), M. Hoffman, Y. J. Ionin, O. P. Lossers
(Netherlands), R. Martin (Germany), R. Molinari, M. Omarjee (France), N. C. Singer, J. H. Smith, R. Stong,
R. Tauraso (Italy), T. Viteam (South Africa), T. Woodcock, GCHQ Problem Solving Group (U. K.), TCDmath
Problem Group (Ireland), and the proposer.

Mixtilinear Incircles
11774 [2015, 366]. Proposed by Yunus Tuncbilek, Ataturk High School of Science, Istanbul, Turkey and Danny Lee, Herkimer Senior High School, New York, NY. Let $\omega$ be the circumscribed circle of triangle $ABC$. The $A$-mixtilinear incircle of $ABC$ and $\omega$ is the circle that is internally tangent to $\omega$, $AB$, and $AC$, and similarly for $B$ and $C$. Let $A'$, $P_B$, and $P_C$ be the points on $\omega$, $AB$, and $AC$, respectively, at which the $A$-mixtilinear incircle touches. Define $B'$ and $C'$ in the same manner that $A'$ was defined. (See figure.)
[Figure: triangle ABC inscribed in its circumscribed circle, with labeled points B, P_C, C', O, P_B, O_A, and C.]
Prove that triangles $C'P_BB$ and $CP_CB'$ are similar.


Solution by Radouan Boukharfane (student), Poitiers, France. Let $a$, $b$, and $c$ be the sidelengths of $ABC$, and let $s$ be its semiperimeter.
Lemma. Let $X$ be a point on the side $AB$ of triangle $ABC$, and let $Y$ be a point on the arc $AB$ (not containing $C$) of the circumcircle of $ABC$. The rays $CX$ and $CY$ are isogonal in $\angle ACB$ if and only if
$$\frac{AX}{XB} \cdot \frac{AY}{YB} = \frac{AC}{BC}.$$
Proof. Suppose that $CX$ and $CY$ are isogonal, that is, $\angle XCA = \angle BCY$ and $\angle XCB = \angle ACY$. We also have $\angle CAX = \angle CAB = \angle CYB$ and $\angle CBX = \angle CBA = \angle CYA$ since they subtend the same arcs on $\omega$. Thus we have similar triangles $CXA \sim CBY$ and $CXB \sim CAY$. Hence
$$\frac{AX}{YB} = \frac{AC}{YC} \quad\text{and}\quad \frac{AY}{XB} = \frac{YC}{BC}.$$
The product of these is the claimed formula. For the converse, both the isogonality of $CX$ and $CY$ and the ratio $\frac{AX}{XB}$ uniquely determine a point $X$ on side $AB$.
Let $I$ map the extended plane by inverting through the circle with center $C$ and radius $\sqrt{ab}$ and then reflecting across the angle bisector of $\angle ACB$. Note that $I$ swaps $C$ with the point at infinity and swaps $A$ with $B$. Hence it swaps line $CA$ with $CB$ and swaps line $AB$ with the circumcircle of $ABC$. It also swaps the $C$-mixtilinear incircle with the $C$-excircle. Thus $I$ swaps the tangency point $C'$ of the $C$-mixtilinear incircle with $\omega$ and the tangency point, call it $Q$, of the $C$-excircle with $AB$. It follows that the rays $CC'$ and $CQ$ are isogonal in $\angle ACB$. Thus by the lemma above
$$\frac{|BC'|}{|AC'|} = \frac{|AQ|}{|QB|} \cdot \frac{|BC|}{|AC|} = \frac{a(s-b)}{b(s-a)},$$
where we have used the well-known formulas $|AQ| = s - b$ and $|QB| = s - a$.
Furthermore, $I$ swaps the tangency point, call it $D$, of the $C$-mixtilinear incircle with $CA$ with the tangency point of the $C$-excircle with $CB$. This last tangency point is well known to be at distance $s$ from $C$. It follows that $|CD| \cdot s = ab$. Hence $|CD| = \frac{ab}{s}$ and $|DA| = b - |CD| = \frac{b(s-a)}{s}$. Thus $\frac{|DA|}{|CD|} = \frac{s-a}{a}$.
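The length $|CD| = ab/s$ can be checked numerically on a sample triangle. The sketch below builds the circle tangent to $CA$ and $CB$ at distance $ab/s$ from $C$ and measures how far it is from being internally tangent to the circumcircle. The coordinates and the helper names `circumcircle` and `tangency_gap` are assumptions made for this illustration; the check uses plain coordinate geometry rather than the inversion argument of the solution.

```python
import math

def circumcircle(A, B, C):
    """Return (center, radius) of the circle through A, B, C."""
    (ax, ay), (bx, by), (cx, cy) = A, B, C
    d = 2 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    ox = ((ax*ax + ay*ay) * (by - cy) + (bx*bx + by*by) * (cy - ay)
          + (cx*cx + cy*cy) * (ay - by)) / d
    oy = ((ax*ax + ay*ay) * (cx - bx) + (bx*bx + by*by) * (ax - cx)
          + (cx*cx + cy*cy) * (bx - ax)) / d
    center = (ox, oy)
    return center, math.dist(center, A)

def tangency_gap(A, B, C):
    """Return |OK| - (R - rho) for the circle tangent to CA and CB at distance ab/s from C."""
    a, b, c = math.dist(B, C), math.dist(C, A), math.dist(A, B)
    s = (a + b + c) / 2
    t = a * b / s                                   # claimed length |CD|
    u = ((A[0] - C[0]) / b, (A[1] - C[1]) / b)      # unit vector from C toward A
    v = ((B[0] - C[0]) / a, (B[1] - C[1]) / a)      # unit vector from C toward B
    lam = t / (1 + u[0] * v[0] + u[1] * v[1])
    K = (C[0] + lam * (u[0] + v[0]), C[1] + lam * (u[1] + v[1]))  # center, on the bisector at C
    D = (C[0] + t * u[0], C[1] + t * u[1])          # tangency point on CA
    rho = math.dist(K, D)                           # radius of the candidate circle
    O, R = circumcircle(A, B, C)
    return math.dist(O, K) - (R - rho)              # zero exactly when internally tangent

# Sample triangle (arbitrary choice).
assert abs(tangency_gap((0.0, 0.0), (4.0, 0.0), (1.0, 3.0))) < 1e-9
```

A gap of zero up to rounding error means the circle with tangent length $ab/s$ from $C$ is internally tangent to the circumcircle, so it is the $C$-mixtilinear incircle.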
Denote the points where the $A$- and $B$-mixtilinear incircles are tangent with $AB$ by $P_B$ and $E$, respectively. Analogs of the result of the previous paragraph yield
$$\frac{|BP_B|}{|P_BA|} \cdot \frac{|BE|}{|EA|} = \frac{s-b}{b} \cdot \frac{a}{s-a} = \frac{|BC'|}{|AC'|}.$$
Now consider the homothety with center $B'$ that takes the $B$-mixtilinear incircle to $\omega$. This map takes line $AB$, which is tangent to the $B$-mixtilinear incircle, to a parallel tangent to $\omega$. Hence its image is the tangent to $\omega$ at the midpoint of arc $AB$. Since this tangency point is the image of $E$ under the homothety, it follows that $B'E$ contains the midpoint of arc $AB$ or, equivalently, that $B'E$ bisects $\angle AB'B$. The angle bisector theorem now yields $\frac{|BB'|}{|B'A|} = \frac{|BE|}{|EA|}$, and this gives
$$\frac{|BP_B|}{|P_BA|} \cdot \frac{|BB'|}{|B'A|} = \frac{|BC'|}{|AC'|}.$$
By the lemma above (applied to triangle $ABC'$), the rays $C'P_B$ and $C'B'$ are isogonal in $\angle AC'B$, and hence $\angle B'C'A = \angle BC'P_B$. Angles $\angle C'BP_B = \angle C'BA$ and $\angle C'B'A$ are also congruent since they subtend the same arc of $\omega$. Hence, we see that triangles $C'P_BB$ and $C'AB'$ are similar. Analogously, triangles $CP_CB'$ and $C'AB'$ are similar. Hence triangles $C'P_BB$ and $CP_CB'$ are similar.
Also solved by C. Delorme (France), C. R. Pranesachar (India), R. Stong, H. Widmer (Switzerland), GCHQ
Problem Solving Group (U. K.), and the proposers.

A Partition Inequality
11775 [2014, 455]. Proposed by Isaac Sofair, Fredericksburg, VA. Let $A_1, \ldots, A_k$ be finite sets. For $J \subseteq \{1, \ldots, k\}$, let $N_J = \left|\bigcup_{j \in J} A_j\right|$, and let $S_m = \sum_{J : |J| = m} N_J$.
(a) Express in terms of $S_1, \ldots, S_k$ the number of elements that belong to exactly $m$ of the sets $A_1, \ldots, A_k$.
(b) Same question as in (a), except that we now require the number of elements belonging to at least $m$ of the sets $A_1, \ldots, A_k$.
Solution by Mark Meyerson, Naval Academy, Annapolis, MD.
(a) Let $T_m$ be the desired value; we prove $T_m = \sum_{i=1}^{k} (-1)^{k+i+m+1} \binom{i}{k-m} S_i$. If an element $x$ belongs to exactly $i$ of $A_1, \ldots, A_k$, then $x$ contributes $\binom{k}{m} - \binom{k-i}{m}$ to $S_m$. Therefore,
$$S_m = \sum_{i=1}^{k} \left[\binom{k}{m} - \binom{k-i}{m}\right] T_i.$$
It suffices to show that the inverse of the $k \times k$ matrix $A$ with $(m, i)$-entry $\binom{k}{m} - \binom{k-i}{m}$ is the $k \times k$ matrix $B$ with $(s, m)$-entry $(-1)^{k+m+s+1} \binom{m}{k-s}$ (interpreting $\binom{n}{j}$ as 0 when $j < 0$ or $j > n$). To see this, we compute the $(s, i)$-entry of $BA$:
$$\begin{aligned}
\sum_{m=1}^{k} (-1)^{k+m+s+1} \binom{m}{k-s} \left[\binom{k}{m} - \binom{k-i}{m}\right]
&= \sum_{m=0}^{k} (-1)^{k+m+s+1} \binom{m}{k-s}\binom{k}{m} - \sum_{m=0}^{k} (-1)^{k+m+s+1} \binom{m}{k-s}\binom{k-i}{m} \\
&= \binom{k}{k-s} \sum_{m=0}^{k} (-1)^{k+m+s+1} \binom{s}{m-k+s} + \binom{k-i}{k-s} \sum_{m=0}^{k} (-1)^{k+m+s} \binom{s-i}{m-k+s}.
\end{aligned}$$
Since the alternating sum of a row of Pascal's triangle (other than the first) vanishes, the first sum in the last expression vanishes, as does the second except when $s = i$, in which case it is 1. Thus $BA$ is the identity matrix.
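The formula in part (a) can be tested on random families of sets. The following Python sketch assumes, as the solution's counting argument indicates, that $N_J$ is the size of the union of the $A_j$ with $j \in J$. The helper names `binom`, `union_sums`, `exactly`, and `t_formula` are illustrative only.

```python
from itertools import combinations
from math import comb
import random

def binom(n, j):
    """Binomial coefficient, taken to be 0 when j < 0 or j > n (the solution's convention)."""
    return comb(n, j) if 0 <= j <= n else 0

def union_sums(sets):
    """Return [S_1, ..., S_k], where S_m sums |union of A_j for j in J| over all J with |J| = m."""
    k = len(sets)
    return [sum(len(set().union(*J)) for J in combinations(sets, m))
            for m in range(1, k + 1)]

def exactly(sets, m):
    """Count elements lying in exactly m of the sets, by direct inspection."""
    universe = set().union(*sets)
    return sum(1 for x in universe if sum(x in A for A in sets) == m)

def t_formula(S, m):
    """T_m = sum_{i=1}^{k} (-1)^(k+i+m+1) * C(i, k-m) * S_i."""
    k = len(S)
    return sum((-1) ** (k + i + m + 1) * binom(i, k - m) * S[i - 1]
               for i in range(1, k + 1))

# Random families of up to five subsets of {0, ..., 11}.
random.seed(0)
for _ in range(50):
    k = random.randint(1, 5)
    sets = [set(random.sample(range(12), random.randint(0, 8))) for _ in range(k)]
    S = union_sums(sets)
    for m in range(1, k + 1):
        assert t_formula(S, m) == exactly(sets, m)
```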
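(b) For the desired value $U_m$, we compute
$$\begin{aligned}
U_m = \sum_{j=m}^{k} T_j &= \sum_{j=m}^{k} \sum_{i=1}^{k} (-1)^{k+i+j+1} \binom{i}{k-j} S_i \\
&= \sum_{i=1}^{k} \sum_{j=m}^{k} (-1)^{k+i+j+1} \binom{i}{k-j} S_i = \sum_{i=1}^{k} (-1)^{k+i+m+1} \binom{i-1}{k-m} S_i,
\end{aligned}$$
where the last equality comes from $\sum_{j=m}^{k} (-1)^j \binom{i}{k-j} = (-1)^m \binom{i-1}{k-m}$, which is proved by induction on $k - m$.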
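The binomial identity used in the last step of part (b) can be checked exhaustively for small parameters. This is a numerical sanity check and does not replace the induction; `binom`, `partial_alternating_sum`, and `closed_form` are names invented for this example.

```python
from math import comb

def binom(n, j):
    """Binomial coefficient, taken to be 0 when j < 0 or j > n."""
    return comb(n, j) if 0 <= j <= n else 0

def partial_alternating_sum(i, k, m):
    """Left side: sum over j = m..k of (-1)^j * C(i, k - j)."""
    return sum((-1) ** j * binom(i, k - j) for j in range(m, k + 1))

def closed_form(i, k, m):
    """Right side: (-1)^m * C(i - 1, k - m)."""
    return (-1) ** m * binom(i - 1, k - m)

# Check every 1 <= i, m <= k for k up to 8.
for k in range(1, 9):
    for i in range(1, k + 1):
        for m in range(1, k + 1):
            assert partial_alternating_sum(i, k, m) == closed_form(i, k, m)
```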

Also solved by D. Beckwith, B. S. Burdick, R. Chapman (U. K.), Y. J. Ionin, B. Karaivanov, O. Kouba
(Syria), J. H. Lindsey II, O. P. Lossers (Netherlands), Y. Shim (Korea), J. C. Smith, R. Stong, R. Tauraso
(Italy), TCDmath Problem Group (Ireland), and the proposer.

A Line of Urns
11776 [2014, 455]. Proposed by David Beckwith, Sag Harbor, NY. Given urns $U_1, U_2, \ldots, U_n$ in a line, and plenty of identical blue and identical red balls, let $a_n$ be the number of ways to put balls into the urns subject to the conditions that
(i) each urn contains at most one ball,
(ii) any urn containing a red ball is next to exactly one urn containing a blue ball, and
(iii) no two urns containing a blue ball are adjacent.
(a) Show that
$$\sum_{n=0}^{\infty} a_n t^n = \frac{1 + t + 2t^2}{1 - t - t^2 - 3t^3}.$$
(b) Show that
$$a_n = \sum_{j \ge 0} \sum_{m \ge 0} 4^j \left[ \binom{n-2m}{j}\binom{m}{j} + \binom{n-2m-1}{j}\binom{m}{j} + 2\binom{n-2m}{j}\binom{m-1}{j} \right].$$
Here $\binom{k}{l} = 0$ if $k < l$.

Solution by James Christopher Smith, Knoxville, TN.


(a) By explicit counting, $a_0 = 1$, $a_1 = 2$, $a_2 = 5$, and $a_3 = 10$. View solutions as strings of length $n$ using E, B, R for empty, blue, and red, respectively, with $e_n$, $b_n$, $r_n$ counting those beginning E, B, or R. Always $a_n = e_n + b_n + r_n$, and $e_n = a_{n-1}$ for $n \ge 1$. Also $r_n = b_{n-1}$ for $n \ge 2$. For $n \ge 3$, the solutions beginning B consist of $e_{n-1}$ beginning BEE, BEB, or BER, plus $e_{n-2}$ beginning BRE, plus $r_{n-2}$ beginning BRR. Thus $b_n = e_{n-1} + e_{n-2} + r_{n-2}$, and
$$\begin{aligned}
a_n = e_n + b_n + r_n &= a_{n-1} + e_{n-1} + e_{n-2} + r_{n-2} + b_{n-1} \\
&= a_{n-1} + a_{n-2} + a_{n-3} + b_{n-3} + e_{n-2} + e_{n-3} + r_{n-3} \\
&= a_{n-1} + a_{n-2} + a_{n-3} + b_{n-3} + a_{n-3} + e_{n-3} + r_{n-3} \\
&= a_{n-1} + a_{n-2} + 3a_{n-3}.
\end{aligned}$$
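The initial values and the recurrence can be confirmed by enumerating all strings over E, B, R for small $n$. The sketch below is an independent brute-force check; the function names `is_valid` and `count_fillings` are chosen for this illustration.

```python
from itertools import product

def is_valid(config):
    """Check conditions (ii) and (iii) for a string over 'E' (empty), 'B' (blue), 'R' (red)."""
    n = len(config)
    for pos, ball in enumerate(config):
        neighbors = [config[q] for q in (pos - 1, pos + 1) if 0 <= q < n]
        if ball == 'R' and neighbors.count('B') != 1:
            return False    # a red ball needs exactly one blue neighbor
        if ball == 'B' and 'B' in neighbors:
            return False    # blue balls may not be adjacent
    return True

def count_fillings(n):
    """Number of valid fillings of n urns (condition (i) is built into the alphabet)."""
    return sum(1 for config in product('EBR', repeat=n) if is_valid(config))

a = [count_fillings(n) for n in range(11)]
assert a[:4] == [1, 2, 5, 10]
assert all(a[n] == a[n - 1] + a[n - 2] + 3 * a[n - 3] for n in range(3, 11))
```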
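Therefore,
$$\begin{aligned}
(1 - t - t^2 - 3t^3) \sum_{n=0}^{\infty} a_n t^n &= \sum_{n=0}^{\infty} a_n t^n - \sum_{n=1}^{\infty} a_{n-1} t^n - \sum_{n=2}^{\infty} a_{n-2} t^n - 3\sum_{n=3}^{\infty} a_{n-3} t^n \\
&= 1 + t + 2t^2 + \sum_{n=3}^{\infty} (a_n - a_{n-1} - a_{n-2} - 3a_{n-3}) t^n = 1 + t + 2t^2.
\end{aligned}$$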

 
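(b) Use the identity $\sum_{m \ge 0} \binom{m}{k} t^m = t^k/(1-t)^{k+1}$ to obtain
$$\sum_{n \ge 0} \sum_{m \ge 0} \binom{n-2m}{j}\binom{m}{j} t^n = \left(\sum_{m \ge 0} \binom{m}{j} t^{2m}\right)\left(\sum_{n \ge 0} \binom{n}{j} t^n\right) = \frac{t^{2j}}{(1-t^2)^{j+1}} \cdot \frac{t^j}{(1-t)^{j+1}} = \frac{1}{(1-t)(1-t^2)} \left(\frac{t^3}{(1-t)(1-t^2)}\right)^j.$$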

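It follows that
$$\begin{aligned}
\sum_{n \ge 0} \sum_{j \ge 0} \sum_{m \ge 0} 4^j \binom{n-2m}{j}\binom{m}{j} t^n &= \sum_{j \ge 0} 4^j \sum_{n \ge 0} \sum_{m \ge 0} \binom{n-2m}{j}\binom{m}{j} t^n = \frac{1}{(1-t)(1-t^2)} \sum_{j \ge 0} \left(\frac{4t^3}{(1-t)(1-t^2)}\right)^j \\
&= \frac{1}{(1-t)(1-t^2)} \cdot \frac{1}{1 - \frac{4t^3}{(1-t)(1-t^2)}} = \frac{1}{1 - t - t^2 - 3t^3}.
\end{aligned}$$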

Hence, in the expansion of $1/(1 - t - t^2 - 3t^3)$, the coefficients of $t^{n-1}$ and $t^{n-2}$ are, respectively,
$$\sum_{j \ge 0} \sum_{m \ge 0} 4^j \binom{n-1-2m}{j}\binom{m}{j} \quad\text{and}\quad \sum_{j \ge 0} \sum_{m \ge 0} 4^j \binom{n-2-2m}{j}\binom{m}{j}.$$
Shifting the index for $m$ in the last expression and summing the various contributions now yields (b).
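The double sum in (b) can be compared numerically with the recurrence $a_n = a_{n-1} + a_{n-2} + 3a_{n-3}$ from part (a). In this sketch `binom` encodes the convention $\binom{k}{l} = 0$ for $k < l$, including negative $k$; `a_closed_form` and `a_recurrence` are illustrative names.

```python
from math import comb

def binom(k, l):
    """Binomial coefficient with C(k, l) = 0 if k < l (which covers negative k as well)."""
    return comb(k, l) if 0 <= l <= k else 0

def a_closed_form(n):
    """Evaluate the double sum of part (b); larger j or m contribute only zero terms."""
    total = 0
    for j in range(n + 1):
        for m in range(n // 2 + 2):
            total += 4 ** j * (binom(n - 2 * m, j) * binom(m, j)
                               + binom(n - 2 * m - 1, j) * binom(m, j)
                               + 2 * binom(n - 2 * m, j) * binom(m - 1, j))
    return total

def a_recurrence(count):
    """First `count` terms from a_0 = 1, a_1 = 2, a_2 = 5 and the recurrence of part (a)."""
    a = [1, 2, 5]
    while len(a) < count:
        a.append(a[-1] + a[-2] + 3 * a[-3])
    return a[:count]

assert [a_closed_form(n) for n in range(15)] == a_recurrence(15)
```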
Also solved by R. Chapman (U. K.), M. Funkhouser, O. Geupel (Germany), O. Kouba (Syria), O. P. Lossers
(Netherlands), Y. Shim (Korea), R. Stong, R. Tauraso (Italy), GCHQ Problem Solving Group (U. K.), Missouri
State University Problem Solving Group, and the proposer.

The Beast
11777 [2014, 456]. Proposed by Marian Dinca, Bucharest, Romania. Let $x_1, \ldots, x_n$ be real numbers such that $\prod_{k=1}^{n} x_k = 1$. Prove that
$$\sum_{k=1}^{n} \frac{x_k^2}{x_k^2 - 2x_k \cos(2\pi/n) + 1} \ge 1.$$

Solution by Mazen Zarrouk, Montgomery College, Takoma Park, MD. When $n = 1$, the inequality becomes $\frac{1}{0} \ge 1$, which makes sense if we take $\frac{1}{0} = +\infty$. The inequality is not true for $n = 2$, as can be seen by taking $x_1 = x_2 = 1$. In the following it will be shown that the inequality is true for $n \ge 3$.
We will use the Shapiro inequality: If $y_i \ge 0$ for $1 \le i \le n$, with $y_{n+1} = y_1$ and $y_{n+2} = y_2$, then
$$\sum_{k=1}^{n} \frac{y_k}{y_{k+1} + y_{k+2}} \ge \begin{cases} n/2, & \text{for even } n \text{ at most 12 or odd } n \text{ at most 23}, \\ 0.49n, & \text{for all other } n. \end{cases}$$
(Reference: V. G. Drinfeld, A cyclic inequality, Math. Notes. Acad. Sci. USSR 9 (1971) 68–71. H. S. Shapiro, Monthly Problem 4603, 61 (1954) 571. http://mathworld.wolfram.com/ShapiroCyclicSumConstant.html.)
Lemma. Fix $n \in \mathbb{N}$ with $n \ge 4$. If $x_1, \ldots, x_n$ are positive real numbers with product 1, then
$$\sum_{k=1}^{n} \left(\frac{x_k}{x_k + 1}\right)^2 \ge 1.$$

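Proof. Let $y_k = \prod_{j=k}^{n} x_j$ for $1 \le k \le n$, with $y_{n+1} = y_1$ and $y_{n+2} = y_2$. Note that $y_k > 0$ and $x_k = y_k/y_{k+1}$ for $1 \le k \le n$. Also,
$$\begin{aligned}
\sum_{k=1}^{n} \frac{x_k}{x_k + 1} &= \sum_{k=1}^{n} \frac{y_k}{y_k + y_{k+1}} = \sum_{k=1}^{n} \frac{y_{k+1}}{y_{k+1} + y_{k+2}} \\
&= \sum_{k=1}^{n} \frac{y_{k+1} - y_k}{y_{k+1} + y_{k+2}} + \sum_{k=1}^{n} \frac{y_k}{y_{k+1} + y_{k+2}} \ge \sum_{k=1}^{n} \frac{y_k}{y_{k+1} + y_{k+2}}. \qquad (1)
\end{aligned}$$
Note that since $y_k > 0$, setting $t = \max_{1 \le k \le n} \{y_{k+1} + y_{k+2}\} > 0$ yields
$$\sum_{k=1}^{n} \frac{y_{k+1} - y_k}{y_{k+1} + y_{k+2}} \ge \frac{1}{t} \sum_{k=1}^{n} (y_{k+1} - y_k) = \frac{1}{t}(y_{n+1} - y_1) = 0.$$
Thus, omitting this sum leads to the stated inequality in (1). Using the quadratic mean–arithmetic mean inequality, we obtain
$$\sum_{k=1}^{n} \left(\frac{x_k}{x_k + 1}\right)^2 \ge \frac{1}{n} \left(\sum_{k=1}^{n} \frac{x_k}{x_k + 1}\right)^2 \ge \frac{1}{n} \left(\sum_{k=1}^{n} \frac{y_k}{y_{k+1} + y_{k+2}}\right)^2.$$
The result now follows by applying the Shapiro inequality.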
We now return to the original problem and prove the case $n = 3$, for which the lemma is not needed. Let $x$, $y$, $z$ be real numbers such that $xyz = 1$. With $z = 1/(xy)$, the case $n = 3$ becomes
$$\frac{x^2}{x^2 + x + 1} + \frac{y^2}{y^2 + y + 1} + \frac{1}{x^2y^2 + xy + 1} \ge 1,$$
which is equivalent to
$$\frac{\frac{1}{4}(2x^2y^2 - x - y)^2 + \frac{3}{4}(x - y)^2}{(x^2 + x + 1)(y^2 + y + 1)(x^2y^2 + xy + 1)} \ge 0.$$
For each factor in the denominator, we have $t^2 + t + 1 = (t + \frac{1}{2})^2 + \frac{3}{4} > 0$. The desired inequality follows. This completes the case $n = 3$.
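The algebraic equivalence used for $n = 3$ can be confirmed with exact rational arithmetic on a grid of sample points. The sketch below is a spot check rather than a symbolic proof; `lhs_minus_one` and `closed_form` are names introduced for this example.

```python
from fractions import Fraction
from itertools import product

def lhs_minus_one(x, y):
    """Left side of the n = 3 inequality minus 1, after substituting z = 1/(xy)."""
    p = x * y
    return (x * x / (x * x + x + 1) + y * y / (y * y + y + 1)
            + 1 / (p * p + p + 1) - 1)

def closed_form(x, y):
    """The sum-of-squares expression divided by the product of the three denominators."""
    p = x * y
    numerator = (Fraction(1, 4) * (2 * p * p - x - y) ** 2
                 + Fraction(3, 4) * (x - y) ** 2)
    return numerator / ((x * x + x + 1) * (y * y + y + 1) * (p * p + p + 1))

# Nonzero rational sample points (zero is excluded because z = 1/(xy) must exist).
values = [Fraction(v, 3) for v in range(-7, 8) if v != 0]
for x, y in product(values, repeat=2):
    assert lhs_minus_one(x, y) == closed_form(x, y)
    assert closed_form(x, y) >= 0
```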
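Now we consider the case $n \ge 4$. Let $x_1, x_2, \ldots, x_n$ be real numbers with product 1. For $1 \le k \le n$,
$$0 < 1 - \cos^2\left(\frac{2\pi}{n}\right) \le x_k^2 - 2x_k \cos\left(\frac{2\pi}{n}\right) + 1 \le x_k^2 + 2|x_k| + 1 = (|x_k| + 1)^2.$$
Applying the lemma to $|x_1|, \ldots, |x_n|$, we obtain the required inequality
$$\sum_{k=1}^{n} \frac{x_k^2}{x_k^2 - 2x_k \cos(2\pi/n) + 1} \ge \sum_{k=1}^{n} \left(\frac{|x_k|}{|x_k| + 1}\right)^2 \ge 1.$$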
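A randomized numerical experiment illustrates the inequality for $n \ge 3$ and the failure at $n = 2$. The sampling scheme and the helper names `beast_sum` and `random_unit_product` are assumptions made for this sketch; random testing cannot establish the inequality, it only looks for counterexamples.

```python
import math
import random

def beast_sum(xs):
    """Left side of the inequality for the given reals (n = len(xs))."""
    n = len(xs)
    c = math.cos(2 * math.pi / n)
    return sum(x * x / (x * x - 2 * x * c + 1) for x in xs)

def random_unit_product(n, rng):
    """Random nonzero reals with product 1: the last entry normalizes the product."""
    xs = [rng.choice((-1, 1)) * math.exp(rng.uniform(-2, 2)) for _ in range(n - 1)]
    xs.append(1 / math.prod(xs))
    return xs

rng = random.Random(0)
for n in range(3, 12):
    for _ in range(1000):
        # Small tolerance for floating-point rounding near the equality case.
        assert beast_sum(random_unit_product(n, rng)) >= 1 - 1e-9

# The counterexample for n = 2 mentioned in the solution: x1 = x2 = 1.
assert beast_sum([1, 1]) == 0.5
```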
Also solved by M. Aassila (France), P. P. Dalyay (Hungary), D. Fleischman, Y. J. Ionin, O. P. Lossers
(Netherlands), P. Perfetti (Italy), R. E. Prather, J. C. Smith, N. Stanciu (Romania), A. Stenger, R. Stong, R.
Tauraso (Italy), Z. Voros (Hungary), GCHQ Problem Solving Group (U. K.), and the proposer.


REVIEWS
Edited by Jeffrey Nunemacher
Mathematics and Computer Science, Ohio Wesleyan University, Delaware, OH 43015

Scientist, Scholar & Scoundrel: A Bibliographical Investigation of the Life and Exploits of
Count Guglielmo Libri/Mathematician, Journalist, Patriot, Historian of Science, Paleographer,
Book Collector, Bibliographer, Antiquarian Bookseller, Forger and Book Thief. By Jeremy M.
Norman. The Grolier Club, New York, 2013. xii + 176 pp., ISBN 078-1-60583-941-4, $35.

Reviewed by Gerald L. Alexanderson


In 2012 in an article about Sophie Germain [1], I found an excuse to write something
about the notorious mathematician Guglielmo Bruno Icilio Timoleone Libri-Carrucci
dalla Sommajia, who in the subtitle above has been identified as a mathematician,
journalist, patriot, historian of science, paleographer, book collector, bibliographer,
antiquarian bookseller, forger, and . . . thief. He is better known in book collecting and history of science circles than in mathematics. And, of course, one's first
inclination is to assume that a name like that has to be a joke. Or at the very least,
Libri must be some exotic character from the Italian aristocracy. One would not be far
wrong: he was a count. And, of course, "libri" means "books." So this book thief was
actually Count Libri, or Count Books, a suitable name as it turns out.
Did such a person really exist? Apparently, he did. And, though there is a fair
amount of information about him in book collecting circles, a recent scholarly catalogue has been published about Libri since my 2012 article in the AMS Bulletin.
The author, Jeremy Norman, is a well-known book scholar, book dealer, and collector,
and this 176-page illustrated bibliographical catalogue written in narrative form was
issued on the occasion of an exhibition of Libri material collected by Norman over
the past ten or so years and the subject of an exhibit at the Grolier Club in New York.
(This is the oldest private organization of collectors of rare books in the United States,
founded in 1884 and occupying a stately Georgian town house on East 60th Street in
New York. The headquarters building houses offices of this prestigious club, a library,
and an exhibition gallery.)
Libri's prowess as a mathematician is probably not a very rewarding study. Though
he knew a number of important people in the history of 19th century mathematics,
he contributed little original and significant mathematics himself. But few mathematicians have led such a picaresque life. Born in northern Italy in 1803, Libri was brought
up in a family with scholarly aspirations and a shaky reputation in noble circles in
Tuscany. However, by the age of 20, he had been appointed to a chair in mathematical
physics at the University of Pisa.
Libri was an early believer in a constitutional government in Tuscany and an early
fighter for a unified Italy. In 1831, after he participated in an abortive coup in Tuscany, he fled to France, chased by the carabinieri of various Italian states. Before that,
when he was first in Paris during the French Revolution of 1830, he was cited for
bravery in support of the revolution and became acquainted with François Guizot and
others, who would play important roles in the new government. Once he formally
http://dx.doi.org/10.4169/amer.math.monthly.123.5.512

moved to Paris, through an amazing combination of political acumen and modest
mathematical accomplishments (in mathematical physics, number theory, probability,
and most importantly in history), by 1833, he had succeeded Legendre by being elected
to Legendre's seat in the Académie des Sciences. Quickly, he became acquainted with
the mathematical aristocracy in Paris at the time: Fourier, Laplace, Biot, Poisson, and
Arago. He also became a very close friend of Guizot, who became secretary of state
and later prime minister under the regime of the citizen king, Louis Philippe. Taking
advantage of chaotic political times in France, ill-funded libraries, and less-than-ideal
appointments to key posts in the great scholarly institutions, Libri promoted the idea
of producing a union catalogue of French provincial libraries to his politician friends,
putting himself in charge. Taking advantage of this official appointment as inspector
of provincial libraries and often working after closing hours, Libri was able to steal,
with little chance of detection, books and manuscripts from these libraries. We read
of his dramatic and probably intimidating appearances, carrying a stiletto (to defend
himself from the carabinieri still hunting him) and wearing a large cloak, the better to
conceal treasures he was removing from the libraries. One might expect this to arouse
suspicion, but for a long while, it apparently did not. Libri had friends in high places in
the scientific and mathematical community, but he also cultivated others in the cultural
aristocracy, notably Prosper Mérimée, who gave us the story that became the opera
Carmen.
In Paris, he shared with Germain her interest in number theory and specifically
Fermat's Last Theorem. In 1838, Libri published the first in a series of volumes in his
Histoire des sciences mathématiques en Italie depuis la renaissance des lettres jusqu'à
la fin du xviiième siècle, a serious contribution to the history of mathematics. For
his sources, he had a vast collection, acquired by purchase or by theft, of manuscripts
by Fermat, Descartes, Euler, d'Alembert, Galileo, Leibniz, Mersenne, and Gassendi.
In 1841, Libri bought all the extant papers of Galileo's trial, considered until then
definitely lost.
But it couldn't last. In 1848, after the resignation of Louis Philippe and the failure of Guizot's government, with a warrant out for his arrest, Libri fled to London
under a false passport in the company of Guizot's daughters. There he set himself up
as a book and manuscript collector, providing treasures, often stolen, that he had
brought with him from France in 17 crates. Once in England, he appears never to have
stolen any material. Instead, he held a series of famous auction sales at Sotheby's,
always under his own name, at which famous libraries and collectors purchased books
and manuscripts, even though some were well aware that they were buying items that
possibly had been stolen.
After Libri fled to England, the French government spent years building their case
against him and eventually convicted him in absentia. He used the proceeds of auction
sales of books and manuscripts in his Paris apartments that he left behind to cover
the court costs. However, from the moment of his indictment in 1848, Libri, who was
also an accomplished journalist, conducted a propaganda war against the prosecutors,
publishing over 1,000 pages of pamphlets and books under his own name refuting
the prosecution's case and enlisting influential friends also to publish pamphlets supporting him. Altogether, around 70 pamphlets and books were published in this process. Through this manner of deception and refutation, known today as the "Libri
Affair," Libri so muddied the prosecution's case that even though the proceedings
of the indictment, trial, and conviction were reported in the English and Italian press,
his reputation in those countries remained untarnished. In fact, after his death in Italy
in 1869, The Times of London stated in his obituary that he had been falsely accused
of all crimes. Health declining, in old age, Libri had returned to Italy as a patriot of
the newly unified country. Before departing England, he shipped the remainder of his
collection to Italy.
It is tempting to try to give Libri the benefit of the doubt and assume that he was
driven by a passion for books, but there is plenty of evidence that he was complicit in
numerous attempts to alter bindings and identifying marks to make it hard for prosecutors to trace the provenance of items in his possession. He did produce some useful
work in bibliography and pioneered in developing a taste for fine bindings and such.
But in the end, one can only conclude that he was a crook: a fascinating crook, but a
crook nonetheless. But he skillfully succeeded in maintaining his good reputation, at
least outside France. Nevertheless, it is for the crimes and his manner of evading arrest
and protecting his reputation that Libri is remembered, rather than for his mathematics.
Norman provides ample evidence that Libri's errant behavior was present throughout his career. In Pisa when he no longer wished to fulfill his teaching duties there, he
got a bogus health exemption from a doctor friend so he no longer had to teach, but
he retained his professional title and his salary for the rest of his life.
The quality of Norman's scholarship is evident in every section of his book. And his
prose is direct and elegant. It is, however, a catalogue of an exhibit, so it has more bibliographic detail than some readers (mainly professional mathematicians) will care
much about. But with a subject like Libri, the narrative is not only scholarly and informative but entertaining as well.
REFERENCE
1. G. L. Alexanderson, Sophie Germain and a problem in number theory, Bull. Amer. Math. Soc. 49 (2012)
327–331.
Santa Clara University, Santa Clara, CA 95053-0290
galexanderson@scu.edu

EDITOR'S ENDNOTES

Professor Alexander Ramm from Kansas State University has submitted to this
MONTHLY the following statement.
Upon reflection and consultation with the Editor of the MONTHLY, I submit that my
paper [6] is a duplicate publication of my earlier paper [7]. I apologize deeply to the
MONTHLY and its readers and retract paper [6] from the MONTHLY.

The following comes to us from Richard M. Aron and Juan B. Seoane-Sepúlveda
with regard to the paper "A function that is surjective on every interval," 123 (2016)
88–89, in this MONTHLY by David G. Radcliffe.
In the above mentioned paper, the author presented a function $f : \mathbb{R} \to \mathbb{R}$ with
$f(I) = \mathbb{R}$ for every nontrivial interval $I \subset \mathbb{R}$. These types of functions were already
known in the literature as everywhere surjective (see [2, 8]). A simpler construction
can actually be made as follows (see [1, 8]): Let $(I_n)_{n \in \mathbb{N}}$ be the collection of all open
intervals with rational endpoints. The interval $I_1$ contains a Cantor type set; call it
$C_1$. Now, $I_2 \setminus C_1$ also contains a Cantor type set; call it $C_2$. Inductively, we construct
a family of pairwise disjoint Cantor type sets, $(C_n)_{n \in \mathbb{N}}$, such that for every $n \in \mathbb{N}$,
$I_n \setminus \left(\bigcup_{k=1}^{n-1} C_k\right) \supset C_n$. Now, for every $n \in \mathbb{N}$, take any bijection $\varphi_n : C_n \to \mathbb{R}$, and
define $f : \mathbb{R} \to \mathbb{R}$ as $f(x) = \varphi_n(x)$ if $x \in C_n$ (and 0 otherwise). This $f$ is everywhere
surjective (and also zero almost everywhere!). Indeed, let $I$ be any interval in $\mathbb{R}$.
There exists $k \in \mathbb{N}$ such that $I_k \subset I$. Thus $f(I) \supset f(I_k) \supset f(C_k) = \varphi_k(C_k) = \mathbb{R}$. A
monograph that also deals with this type of function can be found in [1], in which
everywhere surjective functions enjoying (simultaneously) several other pathologies are introduced. There are not only many such functions, but in fact there exists
a vector space $V$ with $\dim(V) = 2^{\mathfrak{c}} = \dim(\mathbb{R}^{\mathbb{R}})$, every nonzero element of which is
everywhere surjective (see [1, 2, 3, 8]).

We received the following from Mike Slattery, concerning his recent MONTHLY
paper "On a property motivated by groups with a specified number of subgroups,"
123 (2016) 78–81.
I have discovered that there is an oversight in the last example of my article (p. 81).
In this example, I state that, if $G$ is a finite group with exactly six subgroups, then $G$
is similar to one of $C_{32}$, $C_{12}$, $C_3 \times C_3$, or the dihedral group of order 6. In fact, the
quaternion group of order 8 should also be in that list. This has no impact on the rest
of the paper.
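That the quaternion group of order 8 has exactly six subgroups can be confirmed by brute force. The sketch below models the group as the unit quaternions $\pm 1, \pm i, \pm j, \pm k$ stored as integer 4-tuples and tests every nonempty subset for closure under multiplication, which suffices for a finite group. The representation and the names `qmul`, `Q8`, and `is_subgroup` are choices made for this illustration.

```python
from itertools import combinations

def qmul(p, q):
    """Hamilton product of quaternions given as (real, i, j, k) integer tuples."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
            a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
            a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
            a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2)

units = [(1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)]
Q8 = units + [tuple(-v for v in u) for u in units]
identity = (1, 0, 0, 0)

def is_subgroup(subset):
    """A nonempty finite subset closed under multiplication is a subgroup."""
    s = set(subset)
    return identity in s and all(qmul(p, q) in s for p in s for q in s)

subgroups = [c for r in range(1, 9) for c in combinations(Q8, r) if is_subgroup(c)]
assert len(subgroups) == 6
assert sorted(len(h) for h in subgroups) == [1, 2, 4, 4, 4, 8]
```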
http://dx.doi.org/10.4169/amer.math.monthly.123.5.515


Peter R. Mercer from Buffalo State College sends along the following.
In the MONTHLY's January 2016 issue, the "Monthly Gems" piece (123 (2016)
p. 77) is a simplification of a proof due to Matsuoka. This simplification was already
demonstrated by D. Daners in [4]. It was also illustrated in my book [5].

REFERENCES
1. R. M. Aron, L. Bernal-González, D. Pellegrino, J. B. Seoane-Sepúlveda, Lineability: The search for linearity in Mathematics, Monographs and Research Notes in Mathematics, Chapman & Hall/CRC, Boca Raton, FL, 2015.
2. R. M. Aron, V. I. Gurariy, J. B. Seoane-Sepúlveda, Lineability and spaceability of sets of functions on $\mathbb{R}$, Proc. Amer. Math. Soc. 133 (2005) 795–803.
3. L. Bernal-González, D. Pellegrino, J. B. Seoane-Sepúlveda, Linear subsets of nonlinear sets in topological vector spaces, Bull. Amer. Math. Soc. (N.S.) 51 (2014) 71–130.
4. D. Daners, A short elementary proof that $\sum_{n=1}^{\infty} \frac{1}{n^2} = \frac{\pi^2}{6}$, Math. Mag. 85 (2012) 361–364.
5. P. Mercer, More Calculus of a Single Variable, Springer UTM, New York, 2014.
6. A. G. Ramm, A variational principle and its application to estimating the electrical capacitance of a perfect conductor, Amer. Math. Monthly 120 (2013) 747–750.
7. A. G. Ramm, A variational principle and its application, Int. J. Pure Appl. Math. 77 no. 3 (2012) 309–313.
8. J. B. Seoane-Sepúlveda, Chaos and Lineability of Pathological Phenomena in Analysis, Ph.D. thesis, Kent State University, ProQuest LLC, Ann Arbor, MI, 2006.

Scott T. Chapman, Editor
