
Differential Calculus: Mathematics 102

The University of British Columbia


Notes by Leah Edelstein-Keshet [1]: All rights reserved
September 1, 2014

[1] This disclaimer is inserted in view of UBC Policy 81. Copyright Leah Edelstein-Keshet. Not to
be copied, used, or revised without explicit written permission from the author.


Contents

Preface

1 Power functions as building blocks
    1.1 Power functions
    1.2 How big can a cell be? A model for nutrient balance
        1.2.1 Building the model
        1.2.2 Nutrient balance depends on cell size
        1.2.3 Even and odd power functions
    1.3 Sustainability and Energy balance on Planet Earth
    1.4 Combining power functions: first steps in graph sketching
        1.4.1 Sketching a simple (two-term) polynomial
        1.4.2 Sketching a simple rational function
    1.5 Rate of an enzyme-catalyzed reaction
        1.5.1 Saturation and Michaelis-Menten kinetics
        1.5.2 Hill functions
    1.6 Analysis versus computational tools: two sides of a coin
    1.7 For further study: Michaelis-Menten transformed to a linear relationship
    1.8 For further study: Spacing of fish in a school
    Exercises

2 Average rates of change, average velocity and the secant line
    2.1 Time-dependent data and rates of change
        2.1.1 Milk temperature in a recipe for yoghurt
        2.1.2 Data for swimming Tuna
        2.1.3 Data for a falling object
    2.2 The slope of a straight line is a rate of change
    2.3 The slope of a secant line is the average rate of change
    2.4 From average to instantaneous rate of change
        2.4.1 Refined temperature data
        2.4.2 Refined data for the height of a falling object
        2.4.3 Instantaneous velocity
    2.5 Introduction to the derivative
    Exercises

3 Three faces of the derivative: geometric, analytic, and computational
    3.1 The geometric view: Zooming into the graph of a function
        3.1.1 Locally, the graph of a function looks like a straight line
        3.1.2 At a cusp or a discontinuity, the derivative is not defined
        3.1.3 From the graph of a function, we can sketch its derivative
        3.1.4 Constant and linear functions and their derivatives
        3.1.5 Molecular motors
    3.2 Analytic view: calculating the derivative
        3.2.1 Technical matters: continuous functions and limits
        3.2.2 Computing the derivative
    3.3 Computational face of the derivative: software to the rescue!
        3.3.1 Concentration-dependent rate of chemical reaction
    Exercises

4 Differentiation rules, simple antiderivatives and applications
    4.1 Rules of differentiation
        4.1.1 The derivative of power functions: the power rule
        4.1.2 The derivative is a linear operation
        4.1.3 The derivative of a polynomial
        4.1.4 Antiderivatives of power functions and polynomials
        4.1.5 Product and quotient rules for derivatives
        4.1.6 The power rule for fractional powers
    4.2 Application: From acceleration to displacement
        4.2.1 Position, velocity, and acceleration
    4.3 Sketching first, second, and anti-derivatives
        4.3.1 A biological speed machine
    Exercises

5 Tangent lines, linear approximation, and Newton's method
    5.1 The equation of a tangent line
        5.1.1 Simple functions and their tangent lines
    5.2 Generic tangent line equation and properties
        5.2.1 Generic tangent line equation
        5.2.2 Where a tangent line intersects the x axis
    5.3 Close to a point, we can approximate a function by its tangent line
        5.3.1 Accuracy of the linear approximation
    5.4 Tangent lines can help approximate the zeros of a function
        5.4.1 Newton's method
    5.5 Harder tangent line problems: Finding the point of tangency
    Exercises

6 Sketching the graph of a function using calculus tools
    6.1 Overall shape of the graph of a function
        6.1.1 Increasing and decreasing functions
        6.1.2 Concavity and points of inflection
        6.1.3 Determining whether $f''(x)$ changes sign
    6.2 Special points on the graph of a function
        6.2.1 Zeros of a function
        6.2.2 Critical points
        6.2.3 What happens close to a critical point
    6.3 Sketching the graph of a function
        6.3.1 Global maxima and minima, endpoints of an interval
    Exercises

7 Optimization
    7.1 Simple biological optimization problems
        7.1.1 Density dependent (logistic) growth in a population
        7.1.2 Cell size for maximal nutrient accumulation rate
    7.2 Optimization with a constraint
        7.2.1 A cylindrical cell with minimal surface area
        7.2.2 Wine for Kepler's wedding
    7.3 Checking endpoints
    7.4 Optimal foraging
        7.4.1 For further study: Other patch functions
    7.5 Additional Examples of geometric optimization
        7.5.1 Rectangular box with largest surface area
        7.5.2 A cylinder in a sphere
    Exercises

8 Introducing the chain rule
    8.1 The chain rule
        8.1.1 Function composition
        8.1.2 The chain rule of differentiation
        8.1.3 Interpreting the chain rule
    8.2 Chain Rule applied to optimization problems
        8.2.1 Shortest path from food to nest
        8.2.2 Food choice and attention
    Exercises

9 Chain rule applied to related rates and implicit differentiation
    9.1 Applications of the chain rule to related rates
    9.2 Implicit differentiation
        9.2.1 Implicit and explicit definition of a function
        9.2.2 Slope of a tangent line at the point on a curve
    9.3 The power rule for fractional powers
    Exercises

10 Exponential functions
    10.1 Unlimited growth and doubling
        10.1.1 The Andromeda Strain
        10.1.2 The function $2^x$ and its relatives
    10.2 Derivatives of exponential functions and the function $e^x$
        10.2.1 Calculating the derivative of $a^x$
        10.2.2 The natural base $e$ is convenient for calculus
        10.2.3 Properties of the function $e^x$
        10.2.4 The function $e^x$ satisfies a new kind of equation
    10.3 Inverse functions and logarithms
        10.3.1 The natural logarithm is an inverse function for $e^x$
        10.3.2 Derivative of $\ln(x)$ by implicit differentiation
    10.4 Applications of the logarithm
        10.4.1 Using the logarithm for base conversion
        10.4.2 The logarithm helps to solve exponential equations
        10.4.3 Logarithms help plot data that varies on large scale
    Exercises

11 Differential equations for exponential growth and decay
    11.1 Introducing a new kind of equation
        11.1.1 Observations about the exponential function
        11.1.2 The solution to a differential equation
        11.1.3 Where do differential equations come from?
    11.2 Differential equation for unlimited population growth
        11.2.1 A simple model for human population growth
        11.2.2 A critique
        11.2.3 Growth and doubling
    11.3 Radioactive decay
        11.3.1 Deriving the model
        11.3.2 Solution to the decay equation
        11.3.3 The half life
    11.4 Summary and Review
    Exercises

12 Solving differential equations
    12.1 Introduction
    12.2 Given a function, check that it is a solution
    12.3 Equations of the form $y'(t) = a - by$
        12.3.1 Reduction to a simpler differential equation
        12.3.2 Newton's law of cooling
        12.3.3 Using Newton's Law of Cooling to solve a mystery
        12.3.4 Related applications and further examples
    12.4 Euler's Method and numerical solutions
        12.4.1 Euler's method applied to population growth
        12.4.2 Euler's method applied to Newton's law of cooling
    Exercises

13 Qualitative methods for differential equations
    13.1 Linear and nonlinear differential equations
        13.1.1 The logistic equation for population growth
        13.1.2 Linear versus nonlinear
        13.1.3 Law of mass action
        13.1.4 Scaling the variable can simplify the ODE
    13.2 The geometry of change
        13.2.1 Slope fields
        13.2.2 State-space diagrams
        13.2.3 Steady states and stability
    13.3 Applying qualitative analysis to biological models
        13.3.1 Qualitative analysis for the logistic equation
        13.3.2 A model for the spread of a disease
    Exercises

14 Trigonometric functions
    14.1 Basic trigonometry
        14.1.1 Angles and circles
        14.1.2 Defining the trigonometric functions $\sin(x)$ and $\cos(x)$
        14.1.3 Properties of $\sin(x)$ and $\cos(x)$
        14.1.4 Other trigonometric functions
    14.2 Periodic Functions
        14.2.1 Phase, amplitude, and frequency
        14.2.2 Rhythmic processes
    14.3 Inverse Trigonometric functions
    Exercises

15 Cycles, periods, and rates of change
    15.1 Derivatives of trigonometric functions
        15.1.1 Limits of trigonometric functions
        15.1.2 Derivatives of sine, cosine, and other trigonometric functions
        15.1.3 Derivatives of the inverse trigonometric functions
    15.2 Changing angles and related rates
    15.3 The Zebra danio's escape responses
        15.3.1 Visual angles
        15.3.2 The Zebra danio and a looming predator
        15.3.3 Alternate approach involving inverse trig functions
    15.4 For further study: Trigonometric functions and differential equations
    15.5 Additional examples: Implicit differentiation
    Exercises

16 Review Problems
    Exercises

Appendices

A A review of Straight Lines
    A.A Geometric ideas: lines, slopes, equations
    Exercises

B A precalculus review
    B.A Manipulating exponents
    B.B Manipulating logarithms

C A Review of Simple Functions
    C.A What is a function
    C.B Geometric transformations
    C.C Classifying
    C.D Power functions and symmetry
        C.D.1 Further properties of intersections
        C.D.2 Optional: Combining even and odd functions
    C.E Inverse functions and fractional powers
        C.E.1 Graphical property of inverse functions
        C.E.2 Restricting the domain
    C.F Polynomials
        C.F.1 Features of polynomials
    Exercises

D Limits
    D.A Limits for continuous functions
    D.B Properties of limits
    D.C Limits of rational functions
        D.C.1 Case 1: Denominator nonzero
        D.C.2 Case 2: zero in the denominator and holes in a graph
    D.D Right and left sided limits
    D.E Limits at infinity
    D.F Summary of special limits

E Proof of the chain rule

F Trigonometry review
    F.A Summary of the inverse trigonometric functions

G Short Answers to Problems
    G..1 Answers to Chapter 1 Problems
    G..2 Answers to Chapter 2 Problems
    G..3 Answers to Chapter 3 Problems
    G..4 Answers to Chapter 4 Problems
    G..5 Answers to Chapter 5 Problems
    G..6 Answers to Chapter 6 Problems
    G..7 Answers to Chapter 7 Problems
    G..8 Answers to Chapter 8 Problems
    G..9 Answers to Chapter 9 Problems
    G..10 Answers to Chapter 10 Problems
    G..11 Answers to Chapter 11 Problems
    G..12 Answers to Chapter 12 Problems
    G..13 Answers to Chapter 13 Problems
    G..14 Answers to Chapter 14 Problems
    G..15 Answers to Chapter 15 Problems
    G..16 Answers to Chapter 16 Problems
    G..17 Answers to Appendix A Problems
    G..18 Answers to Appendix B Problems

Bibliography

Index

Preface
This preface outlines the main philosophy of the course, and serves as a guide to the
instructor. It outlines reasons for the organization of the material and why this works for introducing first year students to the major concepts and many applications of the differential
calculus.
Calculus arose as an important tool in solving practical scientific problems through
the centuries. However, in many current courses, it is taught as a technical subject with
rules and formulas (and occasionally theorems), devoid of its connection to applications.
In this course, the applications form an important focal point, with an emphasis on the life sciences. This places the techniques and concepts into practical context, as well as motivating
quantitative approaches to biology taught to undergraduates. While many of the examples
have a biological flavour, the level of biology needed to understand those examples is kept
at a minimum. The problems are motivated with enough detail to follow the assumptions,
but are simplified for the purpose of pedagogy.
The mathematical philosophy is as follows: We start with elementary observations
about functions and graphs, with an emphasis on power functions and polynomials. This
introduces the idea of sketching of a graph from elementary properties of the function,
before calculus is discussed. It also leads to direct biological applications that illustrate the
idea of which terms in an expression (polynomial or rational function) dominate at which
range(s) of the independent variable.
We introduce the derivative in three complementary ways: (1) As a rate of change,
(2) as the slope we see when we zoom into the graph of a function, and (3) as a computational quantity that can be approximated by a finite difference. We discuss (1) by first
defining an average rate of change over a finite time interval. We use actual data to do so,
but then by refining the time interval, we show how this average rate of change approaches
the instantaneous rate, i.e. the derivative. This helps to make the idea of the limit more
intuitive, and not simply a formal calculation. We illustrate (2) using a sequence of graphs
or interactive graphs with increasing magnification. We illustrate (3) using simple computation that can be carried out on a spreadsheet. The actual formal definition of the derivative
(while presented and used) takes a back-seat to this discussion.
The next philosophical aspect of the course is that we develop all the ideas and applications of calculus using simple functions (power and polynomials) first, before introducing
the more elaborate technical calculations. The aim is to show our students the usefulness
of derivatives for understanding functions (sketching and interpreting their behaviour), and
for optimization problems, before having to grapple with the chain rule and more intricate
computation of derivatives. This helps to illustrate what calculus can achieve, and to decrease
the focus on rote mechanical calculations.


Once this entire tour of calculus is complete, we introduce the chain rule and its
applications, and then the transcendental functions (exponentials and trigonometric). Both
are used to illustrate biological phenomena (population growth and decay, then, later on,
cyclic processes). Both allow a repeated exposure to the basic ideas of calculus - curve
sketching, optimization, and applications to related rates. This means that the important
concepts picked up earlier in the context of simpler functions can be reinforced again. The
student also learns to practice and apply the chain rule, and to compute more technically
involved derivatives. But, even more than that, both these topics allow us to informally
introduce a powerful new idea, that of a differential equation.
By making the link between the exponential function and the differential equation
$dy/dx = ky$, we open the door to a host of applications in the slightly generalized form of
$dy/dx = a - by$. We demonstrate that understanding the first leads to understanding the
second, merely by changing the variable of interest (from $y$ to $z = y - (a/b)$). Applications
include the temperature of a cooling object, the level of drug in the bloodstream, simple
chemical reactions, and many more. Even though the student does not yet have the tools
to analytically solve a differential equation (tools developed only in a second semester),
he/she can appreciate the link between the statement about rates of change and predictions
for future behaviour of a system.
Ultimately, a first semester calculus course is all about the applications of a derivative. We use this fact to explore nonlinear differential equations of the first order, using
qualitative sketches of the direction field and the state space of the equation. These simple
yet powerful ideas allow us to gain intuition about the behaviour of more realistic biological
models, including density-dependent (logistic) growth and even spread of disease. Many
of the ideas here are geometric, and we return to interpreting the meaning of graphs and
slopes yet again in this context.
The idea of a computational approach is reintroduced in several places, as appropriate. We use simple examples to motivate linear approximation and Newton's method for
finding zeros of a function. Later, we use Euler's method to solve a simple differential equation computationally. All these methods are based on the derivative, and most introduce
the idea of an iterated (repeated) process that is ideally handled by computer or calculator.
The exposure to these computational methods, while novel and sometimes daunting, provides an important set of examples of how properly understanding the math can lead us to
effective design of computational algorithms.

Chapter 1

Power functions as building blocks

Some of the beautiful architectural marvels built by humans from ancient to modern times,
though very complicated as a whole, are made of simple component parts - bricks, beams
and joints. Similarly, some mathematical structures that seem complicated can be decomposed into simpler subunits whose properties are straightforward. Understanding these
component parts and how they fit together to form more interesting structures is an important step in appreciating properties of more complex (mathematical) structures. This
central idea forms the theme of the first chapter.
The components that we explore here are power functions. We first study these on
their own, and compare their shapes. We examine an immediate application of our analysis
to the biological problem of cell size. Then we expand our horizon to consider polynomials
and rational functions. Using the power functions as basic building blocks, we construct the
family of polynomials, and investigate how their features are inherited from the underlying
behaviour of power functions. Here, we begin to develop a few important curve-sketching
skills that will be useful throughout this calculus course.

1.1 Power functions

Learning goals (LG)


1. Understand the shapes of power functions relative to one another (Figs. 1.1, 1.3).
2. Understand the idea that power functions with low powers dominate near the origin,
and power functions with high powers dominate far away from the origin. (Figs. 1.1,
1.3).
3. Be able to find points of intersection of two power functions (Example 1.1).
Let us consider the power functions, that is, functions of the form
$y = f(x) = x^n$
where $n$ is a positive integer. Power functions are among the most elementary and elegant
functions [1]. They are easy to calculate, very predictable and smooth, and, from the point of
view of calculus, very easy to handle.
From Figure 1.1a, we see that the power functions ($y = x^n$ for powers $n = 2, \ldots, 5$)
intersect at $x = 0$ and $x = 1$. This is true for all positive integer powers. The same figure also
demonstrates another extremely important fact: the greater the power $n$, the flatter the
graph near the origin and the steeper the graph for $x > 1$. This can be restated in terms
of the relative size of the power functions. We say that close to the origin, the functions
with lower powers dominate, while far from the origin, the higher powers dominate.
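The dominance pattern can also be checked numerically. The following is a minimal sketch, not part of the original notes: the helper name `power` and the sample points $x = 0.5$ and $x = 2$ are our own choices for illustration. It evaluates $x^n$ for $n = 2, \ldots, 5$ at one point between 0 and 1 and at one point beyond 1, then ranks the powers by the size of the function value.

```python
def power(x, n):
    """Return x**n for a positive integer n, using multiplication only."""
    result = 1.0
    for _ in range(n):
        result *= x
    return result

powers = [2, 3, 4, 5]
near = {n: power(0.5, n) for n in powers}  # sample point between 0 and 1
far = {n: power(2.0, n) for n in powers}   # sample point beyond x = 1

# Rank the powers from largest to smallest function value at each point.
dominant_near = sorted(powers, key=lambda n: near[n], reverse=True)
dominant_far = sorted(powers, key=lambda n: far[n], reverse=True)

print("values at x = 0.5:", near)
print("values at x = 2.0:", far)
print("ranking near the origin:", dominant_near)
print("ranking far from the origin:", dominant_far)
```

At the point near the origin the ranking starts with the lowest power, and at the point beyond $x = 1$ it starts with the highest power, in line with the statement above.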

Figure 1.1. (a) Graphs of a few power functions $y = x^n$ (shown: $x^2$, $x^3$, $x^4$, $x^5$). All intersect at $x = 0, 1$.
As the power $n$ increases, the graphs become flatter close to the origin and steeper at large
$x$ values (LG 1). Near the origin, power functions with lower powers dominate over (have a
larger value compared to) power functions with higher powers. Far from the origin, power
functions with higher powers dominate (LG 2). (b) Graphs of the two power functions
($y = 5x^2$, $y = 2x^3$). Close to the origin, the quadratic power function has a larger value,
whereas for large $x$, the cubic function has larger values. The functions intersect when
$5x^2 = 2x^3$, which holds for either $x = 0$ or $x = 5/2 = 2.5$ (LG 3).
More generally, a power function has the form
$y = f(x) = K x^n$
where $n$ is a positive integer and $K$, sometimes called the coefficient, is a constant. So far,
we have compared power functions whose coefficient is $K = 1$. But we can extend our
discussion to a more general case as well.
[1] We only need to use multiplication to compute the value of these functions at any point.

Example 1.1 Find points of intersection and compare the sizes of the two power functions
$y_1 = a x^n$ and $y_2 = b x^m$,
where $a$ and $b$ are constants. You may assume that both $a$ and $b$ are positive.
Solution: This comparison is a slight generalization of what we have seen above. First,
we note that the coefficients $a$ and $b$ merely scale the vertical behaviour (i.e. stretch the
graph along the $y$ axis). It is still true that the higher the power, the flatter the graph close to
$x = 0$, and the steeper for large positive or negative values of $x$. However, now the points
of intersection of the graphs will occur at $x = 0$ and whenever
$a x^n = b x^m \Rightarrow x^{n-m} = b/a$.
We can solve this further to obtain a solution in the first quadrant [2],
$x = (b/a)^{1/(n-m)}$.
This is shown in Figure 1.1b for the specific example of $y_1 = 5x^2$, $y_2 = 2x^3$. Here we
point out that in general, if $b/a$ is positive then this value is a real number. Since we have
assumed that both $a$ and $b$ are positive, this will be true.
Example 1.2 Determine points of intersection for the following pairs of functions: (a)
$y_1 = 3x^4$ and $y_2 = 27x^2$, (b) $y_1 = (4/3)x^3$, $y_2 = 4x^2$.

Solution: (a) Intersections occur at $x = 0$ and at $\pm(27/3)^{1/(4-2)} = \pm\sqrt{9} = \pm 3$. (b)
These functions intersect only at $x = 0, 3$ but not for any negative values of $x$.
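The formula from Example 1.1 can be evaluated directly. The sketch below is our own illustration, not code from the notes: the function names `positive_intersection` and `curves_meet` are hypothetical, and we assume $a, b > 0$ and $n \neq m$ so that $x = (b/a)^{1/(n-m)}$ is defined. It computes the first-quadrant intersection for the pair in Figure 1.1b and for both pairs in Example 1.2, and confirms that the two curves take the same value there.

```python
import math

def positive_intersection(a, n, b, m):
    """Positive x with a*x**n == b*x**m, assuming a, b > 0 and n != m."""
    if a <= 0 or b <= 0:
        raise ValueError("a and b are assumed to be positive")
    if n == m:
        raise ValueError("the formula requires n != m")
    return (b / a) ** (1.0 / (n - m))

def curves_meet(a, n, b, m, x):
    """Check that both power functions have (nearly) the same value at x."""
    return math.isclose(a * x ** n, b * x ** m)

x_fig = positive_intersection(5, 2, 2, 3)    # y1 = 5x^2, y2 = 2x^3
x_a = positive_intersection(3, 4, 27, 2)     # y1 = 3x^4, y2 = 27x^2
x_b = positive_intersection(4 / 3, 3, 4, 2)  # y1 = (4/3)x^3, y2 = 4x^2

for label, x in (("Figure 1.1b", x_fig), ("Example 1.2a", x_a), ("Example 1.2b", x_b)):
    print(label, round(x, 6))
```

Up to floating-point rounding, the three values agree with the hand calculations (2.5, 3 and 3). The sketch returns only the positive solution; the intersection at $x = 0$ and any intersection at negative $x$ have to be considered separately.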
In many cases, the points of intersection will be irrational numbers whose decimal
approximations can only be obtained by a scientific calculator or by some approximation
method (such as Newton's Method).
The observations we have made so far already allow us to examine a biological problem related to the size of cells. We see that application of these ideas will provide insight
into why cells have a size limitation, as discussed in the next section.

1.2 How big can a cell be? A model for nutrient balance

The shapes of living cells are designed to be uniquely suited to their functions. Few cells are
really spherical. Many have long appendages, cylindrical parts, or branch-like structures.
But here, we will neglect all these beautiful complexities and look at a simple spherical
cell. The question we want to explore is what physical or biological constraints determine
the size of a cell and why some size limitations exist. Why should animals be made of
millions of tiny cells, instead of just a few hundred large ones?
[2] As we will shortly see, if $n$, $m$ are both even or both odd, there will also be an intersection at a
negative value of $x$, namely $x = -(b/a)^{1/(n-m)}$. This point lies in the third quadrant when both powers
are odd, and in the second quadrant when both are even.

Learning goals
1. Follow and understand the derivation of a mathematical model for cell nutrient absorption and consumption (Section 1.2.1).
2. Develop the skill of using parameters ($k_1$, $k_2$) rather than specific numbers in mathematical expressions.
3. Understand the link between power functions in Section 1.1 and cell nutrient balance
in the model (Eqs. 1.2).
4. Be able to verbally interpret the results of the model (Section 1.2.2).

Figure 1.2. A cell (assumed spherical) absorbs nutrients at a rate proportional to
its surface area $S$, but consumes nutrients at a rate proportional to its volume $V$. $k_1$, $k_2$ are
proportionality constants. The surface area and volume of a sphere of radius $r$ are given
by $S = 4\pi r^2$, $V = \frac{4}{3}\pi r^3$. These facts are used to assemble a simple model for nutrient
balance in a spherical cell.
While these questions seem extremely complicated, a relatively simple mathematical
argument can go a long way in illuminating the situation. To delve into this mystery of size
and shape, we will formulate a mathematical model. A model is just a representation of
a real situation which simplifies things by representing the most important aspects, while
neglecting or idealizing the other aspects. Below we follow a reasonable set of assumptions
and mathematical facts to explore how nutrient balance can affect and limit cell size.

1.2.1 Building the model


In order to build the model we make some simplifying assumptions and then restate them
mathematically. We base the model on the following assumptions:
1. The cell is roughly spherical (see Figure 1.2).
2. The cell absorbs oxygen and nutrients from the environment through its surface. If
the surface area, $S$, of the cell is bigger, it can absorb these substances at a faster
rate. We will assume that the rate at which nutrients (or oxygen) are absorbed is
proportional to the surface area of the cell.
3. The rate at which nutrients (or oxygen) are consumed (i.e., used up) in metabolism is assumed to be
proportional to the volume, $V$, of the cell. This means that the rate of consumption is some
constant multiple of the volume, and it also implies that the bigger the volume, the
more nutrients are needed to keep the cell alive.
We define the following quantities for our model of a single cell:
A = net rate of absorption of nutrients per unit time,
C = net rate of consumption of nutrients per unit time,
V = cell volume,
S = cell surface area,
r = radius of the cell.
We now rephrase the assumptions mathematically. By assumption (2), $A$ is proportional to $S$. This means that
$$A = k_1 S,$$
where $k_1$ is a constant of proportionality. Since absorption and surface area are positive
quantities, in this case only positive values of the proportionality constant make sense,
so $k_1$ must be positive. (The value of this constant would depend on the permeability of
the cell membrane, how many pores or channels it contains, and/or any active transport
mechanisms that help transfer substances across the cell surface into its interior.) By using
a generic parameter to represent this proportionality constant, we keep the model general
enough to apply to many different cell types (LG 2).
Further, by assumption (3), the rate of nutrient consumption, $C$, is proportional to $V$,
so that
$$C = k_2 V,$$
where $k_2$ is a second proportionality constant (also positive$^3$). The value of $k_2$ would
depend on the rate of metabolism of the cell, i.e. how quickly it consumes nutrients in
carrying out its activities.
Since we have assumed that the cell is spherical, by assumption (1), the surface area,
$S$, and volume, $V$, of the cell are:
$$S = 4\pi r^2, \qquad V = \frac{4}{3}\pi r^3. \tag{1.1}$$
Putting these facts together leads to the following relationships between nutrient absorption,
consumption, and cell radius:
$$A = k_1 (4\pi r^2) = (4\pi k_1) r^2, \qquad C = k_2 \left(\frac{4}{3}\pi r^3\right) = \left(\frac{4}{3}\pi k_2\right) r^3.$$

3 From now on, we will simply write "$k_2 > 0$ is a constant" when we mean this constant to be positive.


We note that $A$, $C$ are now quantities that depend on the radius of the cell:
$$A(r) = (4\pi k_1) r^2, \quad \text{and} \quad C(r) = \left(\frac{4}{3}\pi k_2\right) r^3. \tag{1.2}$$
Indeed, since the terms in brackets on the right hand sides are just constant coefficients, each of the above expressions is simply a power function (LG 3), with $r$ the independent variable, that is
$$A(r) = a r^2, \qquad C(r) = c r^3 \qquad \left(\text{where } a = 4\pi k_1,\ c = \frac{4}{3}\pi k_2 \text{ are constants}\right).$$
Each of these expressions has the form of a power function, $y = k r^n$ for some positive
constant coefficient $k$. Most importantly, the powers are $n = 3$ for consumption and $n = 2$
for absorption. We can now use the properties of power functions discussed previously to
understand how nutrient balance depends on cell size.

1.2.2 Nutrient balance depends on cell size


In our discussion of cell size, we found two power functions that depend on the cell radius, namely the nutrient absorption A(r) and consumption C(r) rates given by Eqs. (1.2).
Based on our discussion of power functions, we can characterize whether absorption or
consumption of nutrients dominates for small, medium, or large cells.
Example 1.3 Is the absorption rate or the consumption rate greater for small cells? For
large cells? For what cell size are the two rates equal?
Solution: For small $r$, the power function with the lower power of $r$ (namely $A(r)$) dominates, but for very large values of $r$, the power function with the higher power ($C(r)$)
dominates. The switch takes place at the point of intersection of the two graphs:
$$A(r) = C(r) \quad \Rightarrow \quad \left(\frac{4}{3}\pi k_2\right) r^3 = (4\pi k_1) r^2.$$
One trivial solution to this equation is $r = 0$. If $r \neq 0$, then we can cancel a factor of $r^2$
from both sides and solve for $r$ to obtain:
$$r = 3\,\frac{k_1}{k_2}.$$
For cells of this radius, absorption and consumption are equal. It follows that for smaller
cells the absorption $A \propto r^2$ is the dominant process, while for larger cells, the consumption
rate $C \propto r^3$ is higher than the absorption rate. We conclude that cells larger than the critical
size $r = 3k_1/k_2$ will be unable to keep up with the nutrient demand, and will not survive.
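As a quick numerical check of Example 1.3, the Python sketch below evaluates the two power functions in Eqs. (1.2) on either side of the critical radius. The function names and the parameter values $k_1 = 2$, $k_2 = 0.5$ (arbitrary units) are made up for illustration and are not part of the notes.

```python
import math

def absorption(r, k1):
    """Absorption rate A(r) = k1 * S = 4*pi*k1*r**2."""
    return 4.0 * math.pi * k1 * r**2

def consumption(r, k2):
    """Consumption rate C(r) = k2 * V = (4/3)*pi*k2*r**3."""
    return (4.0 / 3.0) * math.pi * k2 * r**3

def critical_radius(k1, k2):
    """Radius at which A(r) = C(r), namely r = 3*k1/k2."""
    return 3.0 * k1 / k2

# Illustrative (made-up) parameter values, in arbitrary units.
k1, k2 = 2.0, 0.5
r_crit = critical_radius(k1, k2)

# Compare the two rates below, at, and above the critical radius.
for r in (0.5 * r_crit, r_crit, 2.0 * r_crit):
    net = absorption(r, k1) - consumption(r, k2)
    print(f"r = {r:6.2f}   A - C = {net:12.3f}")
```

The sign of A - C should be positive below the critical radius, essentially zero at it, and negative above it, in line with the dominance argument for power functions.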
Thus, using this simple geometric argument, we have deduced that the size of the cell
has strong implications for its ability to absorb nutrients quickly enough to feed itself. The
restriction on oxygen absorption is even more critical than the replenishment of other substances such as glucose. For these reasons, cells larger than some maximal size (roughly 1
mm in diameter) rarely occur. Furthermore, organisms that are bigger than this size cannot
rely on simple diffusion to carry oxygen to their parts; they must develop a circulatory
system to allow more rapid dispersal of such life-giving substances or else they will perish.


1.2.3 Even and odd power functions


So far, we have considered power functions $y = x^n$ with $x > 0$. But in general, there
is no reason to restrict the independent variable x to positive values. Here we expand
the discussion to consider all real values of x. This brings up some new ideas, including
symmetry properties.
Figure 1.3. Graphs of power functions. (a) A few of the even power functions ($y = x^2$, $y = x^4$,
$y = x^6$). (b) Some odd power functions ($y = x$, $y = x^3$, $y = x^5$).
Note symmetry properties. Also observe that as the power increases, the graphs become
flatter close to the origin and steeper at large $x$ values. Two even power functions intersect
at $(\pm 1, 1)$ and $(0, 0)$. Two odd power functions intersect at $(1, 1)$, $(-1, -1)$ and $(0, 0)$.
From Fig. 1.3 we see that power functions with an even power, such as $y = x^2$, $y = x^4$,
$y = x^6$ (shown in panel a), are symmetric about the $y$ axis, whereas odd power functions such as $y = x$, $y = x^3$, $y = x^5$ (panel b) are symmetric when rotated through $180^\circ$
about the origin. We adopt the terms even function and odd function to describe such
symmetry properties. More formally, we say that $f$ is an even function if and only if
$f(-x) = f(x)$, whereas $f$ is an odd function if and only if $f(-x) = -f(x)$. Many
functions have no symmetry, and are neither even nor odd. See Appendix C.D for further
details.

Example 1.4 Show that the function $y = g(x) = x^2 - 3x^4$ is an even function.
Solution: We use the property that if $g$ is an even function, it should satisfy $g(-x) = g(x)$.
Let us calculate $g(-x)$ and see if this requirement holds. We find that
$$g(-x) = (-x)^2 - 3(-x)^4 = x^2 - 3x^4 = g(x).$$
Here we have used the fact that $(-x)^n = (-1)^n x^n$, and that when $n$ is even, $(-1)^n = 1$.
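The symmetry definitions can also be spot-checked numerically. The sketch below is only an illustration (the helper names `looks_even` and `looks_odd` and the sample points are our own choices); agreement at a handful of points is a sanity check, not a proof like the algebra of Example 1.4.

```python
def g(x):
    """The function from Example 1.4: g(x) = x**2 - 3*x**4."""
    return x**2 - 3 * x**4

def looks_even(f, samples, tol=1e-12):
    """True if f(-x) == f(x) (within tol) at every sample point."""
    return all(abs(f(-x) - f(x)) <= tol for x in samples)

def looks_odd(f, samples, tol=1e-12):
    """True if f(-x) == -f(x) (within tol) at every sample point."""
    return all(abs(f(-x) + f(x)) <= tol for x in samples)

samples = [0.25, 0.5, 1.0, 1.5, 2.0]

def cube(x):
    return x**3

def mixed(x):
    # Sum of an odd and an even power: neither even nor odd.
    return x**3 + x**2

print("g even?     ", looks_even(g, samples))
print("x^3 odd?    ", looks_odd(cube, samples))
print("x^3+x^2:    ", looks_even(mixed, samples), looks_odd(mixed, samples))
```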


All power functions are continuous and unbounded. For $x \to \infty$ both even and odd
power functions satisfy $y = x^n \to \infty$. For $x \to -\infty$, odd power functions tend to $-\infty$.
Odd power functions have the property that they are one-to-one. (That is, each value of $y$
is obtained from a unique value of $x$ and vice versa.) This is not the case for the even power
functions as we can see from Fig 1.3(a): for example, $y = 1$ is obtained by evaluating the
function $y = x^2$ at either $x = 1$ or $x = -1$, and every other positive value of $y$ is similarly
obtained by evaluating a given power function at a positive or a negative value of $x$. From
Fig 1.3 we see that all power functions go through the point $(0, 0)$. Even power functions
have a local minimum at the origin whereas odd power functions do not.
Definition 1.5 (Local Minimum). A local minimum of a function $f(x)$ is a point $x_{min}$
such that the value of $f$ is larger at all sufficiently close points. Formally, $f(x_{min} \pm \epsilon) >
f(x_{min})$ for $\epsilon > 0$ small enough.

1.3 Sustainability and Energy balance on Planet Earth
The sustainability of life on Planet Earth depends on a fine balance between the temperature
of its oceans and land masses and the ability of life forms to tolerate climate change. As a
follow-up to our model for nutrient balance, we briefly introduce a simple energy balance
model to track incoming and outgoing energy and to determine a rough estimate for the
Earth's temperature. We use the following basic facts:
1. Energy input from the sun to Earth, given the Earth's radius $r$, can be approximated as
$$E_{in} = (1 - a) S \pi r^2, \tag{1.3}$$
where $S$ is incoming radiation energy per unit area (also called the solar constant)
and $0 \le a \le 1$ is the fraction of that energy reflected. $a$ is also called the albedo,
and depends on cloud cover, and other aspects of the planet (such as percent forest,
snow, desert, and ocean).
2. Energy lost from Earth due to radiation into space depends on the current temperature
of the Earth, $T$, and is approximated as
$$E_{out} = 4\pi r^2 \epsilon \sigma T^4, \tag{1.4}$$
where $\epsilon$ is the emissivity of the Earth's atmosphere, which represents the Earth's tendency to emit radiation energy. This constant depends on cloud cover, water vapor, and
the concentration of greenhouse gases in the atmosphere, such as carbon dioxide
and methane. $\sigma$ is a physical constant (the Stefan-Boltzmann constant) which
is fixed for the purpose of our discussion.
Example 1.6 (Energy expressions are power functions) Explain in what sense the two
forms of energy above can be viewed as power functions, and what types of power functions
they represent.


Solution: Both $E_{in}$ and $E_{out}$ depend on Earth's radius as the power $\sim r^2$. Since
this radius is a constant, however, it will not be fruitful to consider it as an interesting variable for
this problem (unlike the cell size example we previously discussed). We do note that
$E_{out}$ depends on temperature as $\sim T^4$. (We might also select the albedo as a variable and
in that case, we note that $E_{in}$ depends linearly on the albedo $a$.$^4$)
Example 1.7 (Energy equilibrium for the Earth) Explain how the facts above can be
used to determine the equilibrium temperature of the Earth, that is, the temperature at
which the incoming and outgoing radiation energies are balanced.
Solution: The Earth will be at equilibrium when
$$E_{in} = E_{out} \quad \Rightarrow \quad (1 - a) S \pi r^2 = 4\pi r^2 \epsilon \sigma T^4.$$
We observe that the factors $\pi r^2$ cancel, and we obtain an equation that can be solved for the
temperature $T$ (see Exercise 21). It is instructive to examine how this temperature depends
on the constants in the problem, and how it is affected by cloud cover and greenhouse gas
level. We discuss these issues in the same exercise.
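For readers who want to see numbers, here is a minimal Python sketch of the balance $E_{in} = E_{out}$. After cancelling $\pi r^2$, the equation rearranges to $T = \left[(1-a)S/(4\epsilon\sigma)\right]^{1/4}$. The inputs below ($S \approx 1361$ W m$^{-2}$, $a = 0.3$, and the three emissivity values) are illustrative assumptions rather than values given in these notes, and the function name is hypothetical.

```python
SIGMA = 5.670374419e-8  # Stefan-Boltzmann constant, W m^-2 K^-4

def equilibrium_temperature(S, albedo, emissivity):
    """Solve (1 - a) S pi r^2 = 4 pi r^2 eps sigma T^4 for T (in kelvin).

    The radius r cancels, so it does not appear as an argument.
    """
    return ((1.0 - albedo) * S / (4.0 * emissivity * SIGMA)) ** 0.25

# Illustrative inputs (assumptions, not taken from the notes).
S = 1361.0   # incoming solar radiation per unit area, W m^-2
a = 0.3      # albedo

for eps in (1.0, 0.8, 0.6):
    T = equilibrium_temperature(S, a, eps)
    print(f"emissivity = {eps:.1f}   T = {T:6.1f} K")
```

Because $T$ depends on the quarter power of the ratio, lowering the emissivity or the albedo raises the equilibrium temperature, but only weakly.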

1.4 Combining power functions: first steps in graph sketching

Properties of the power functions lead to important consequences in functions made up
of such components. Here we discuss two important classes, simple polynomials (sums
of power functions) and rational functions (ratios of such functions). We show that the
ideas discussed in Section 1.1 lead directly and immediately to understanding the overall
behaviour of such functions. We also take some preliminary but fundamentally important steps in sketching the graphs of these functions, a skill that will prove of great value
throughout this course.

Learning goals
1. Be able to easily sketch the graph of a simple polynomial of the form $y = ax^n + bx^m$
(Fig. 1.4).
2. Be able to sketch a rational function such as $y = Ax^n/(b + x^m)$.

1.4.1 Sketching a simple (two-term) polynomial


Example 1.8 (Sketching a simple cubic polynomial) Sketch a graph of the polynomial
$$y = p(x) = x^3 + ax. \tag{1.5}$$
How would the sketch change if the constant $a$ changes from positive to negative?

Figure 1.4. The graph of the polynomial $y = p(x) = x^3 + ax$ can be obtained by
putting together its two power function components. The cubic "arms" $y \approx x^3$ (top row)
dominate for large $x$ (far from the origin), whereas the linear part $y \approx ax$ (middle row)
dominates near the origin. When these are smoothly connected (bottom row) we obtain a
sketch of the desired polynomial. Shown here are three possibilities, for $a < 0$, $a = 0$, $a > 0$,
left to right. The value of $a$ determines the slope of the curve near $x = 0$ and thus also
affects the presence of a local maximum and minimum (for $a < 0$).
Solution: The polynomial in (1.5) has two terms, each one a power function. Let us
consider their effects individually. Near the origin, for $x \approx 0$, the term $ax$ dominates so
that, close to $x = 0$, the function behaves as
$$y \approx ax.$$
This is a straight line with slope $a$. If $a > 0$ we should see a line with positive slope here,
whereas if $a < 0$ the slope of the line should be negative. Far away from the origin, the
cubic term dominates, so
$$y \approx x^3.$$
That means that we would see a nearly cubic curve when we look at large (positive or
negative) $x$ values. Figure 1.4 illustrates these ideas. In the top row we see the behaviour
of $y = p(x) = x^3 + ax$ for large $x$, and in the middle row for small $x$. The bottom row shows the graph for
an intermediate range. We might notice that for $a < 0$, the graph has a local minimum
as well as a local maximum. The simple arguments used above already lead us to a fairly
reasonable sketch of the function in (1.5). We can add further details by simple algebraic
steps as below.

4 We see that we have a variety of choices about which of the quantities to consider as the independent variable
in this example.
Example 1.9 (Zeros) Find the places at which the polynomial (1.5) crosses the $x$ axis, that
is, find the zeros of the function $y = x^3 + ax$.

Solution: The zeros of the polynomial can be found by setting
$$y = p(x) = 0 \quad \Rightarrow \quad x^3 + ax = 0 \quad \Rightarrow \quad x^3 = -ax.$$
The above equation always has a solution $x = 0$, but if $x \neq 0$, we can cancel and obtain
$$x^2 = -a.$$
This would have no solutions if $a$ is a positive number, so that in that case, the graph crosses
the $x$ axis only once, at $x = 0$, as shown in Figure 1.4. If $a$ is negative, then the negatives
cancel, so the equation can be written in the form
$$x^2 = |a|$$
and we would have two new zeros at
$$x = \pm\sqrt{|a|}.$$
For example, if $a = -1$ then the function $y = x^3 - x$ has zeros at $x = 0, 1, -1$.
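The case analysis in Example 1.9 translates directly into a few lines of Python. This is a sketch only; the function name `zeros_of_cubic` is our own, and the test values of $a$ are arbitrary.

```python
import math

def zeros_of_cubic(a):
    """Real zeros of p(x) = x**3 + a*x, listed in increasing order.

    For a >= 0 the only real zero is x = 0; for a < 0 there are two
    additional zeros at x = -sqrt(|a|) and x = +sqrt(|a|).
    """
    if a >= 0:
        return [0.0]
    root = math.sqrt(-a)  # equals sqrt(|a|) because a < 0 here
    return [-root, 0.0, root]

for a in (2.0, 0.0, -1.0, -4.0):
    zs = zeros_of_cubic(a)
    # Substitute each zero back into p(x) to confirm that p vanishes there.
    residuals = [z**3 + a * z for z in zs]
    print(f"a = {a:5.1f}   zeros = {zs}   p(zeros) = {residuals}")
```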


Example 1.10 (A more general case) Explain how you would use the ideas of Example 1.8 to sketch the polynomial $y = p(x) = ax^n + bx^m$. Without loss of generality,
you may assume that $n > m \ge 1$ are integers.
Solution: As in Example 1.8, this polynomial has two terms that dominate at different
ranges of the independent variable. Close to the origin, $y \approx bx^m$ (since $m$ is the lower
power) whereas for large $x$, $y \approx ax^n$. The full behaviour is obtained by smoothly connecting these pieces of the graph. Finding zeros can refine the graph. Some examples of this
type are discussed in the Exercises (see Exercise 6).
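The dominance argument can be checked numerically by comparing the polynomial with each of its two terms. In the sketch below, the coefficients (giving $p(x) = x^4 - 2x^2$) and the helper name `two_term` are made up for illustration; a ratio close to 1 means the corresponding term is a good approximation at that $x$.

```python
def two_term(x, a, n, b, m):
    """Evaluate p(x) = a*x**n + b*x**m (intended for integers n > m >= 1)."""
    return a * x**n + b * x**m

# Made-up coefficients for illustration: p(x) = x^4 - 2x^2.
a, n, b, m = 1.0, 4, -2.0, 2

for x in (0.01, 0.1, 10.0, 100.0):
    p = two_term(x, a, n, b, m)
    near = p / (b * x**m)  # close to 1 where the lower power dominates
    far = p / (a * x**n)   # close to 1 where the higher power dominates
    print(f"x = {x:7.2f}   p/(b x^m) = {near:14.4f}   p/(a x^n) = {far:14.4f}")
```

For small $x$ the first ratio is near 1 and the second is far from it; for large $x$ the roles are reversed.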
The reasoning used here is a very important first step in sketching a polynomial. Later
in this course we will develop specialized methods to find zeros of more complicated cases
(using an approximation called Newton's method). We will also use calculus to determine
points at which the function attains local maxima or minima (called critical points), and
how it behaves asymptotically, for large positive or negative values of x. The elementary
steps described here will remain useful in later work as a quick approach for visualizing
the overall shape of a graph.


1.4.2 Sketching a simple rational function


We use similar reasoning to consider the graphs of simple rational functions. A rational
function is a function that can be written as
$$y = \frac{p_1(x)}{p_2(x)},$$
where $p_1(x)$ and $p_2(x)$ are polynomials.

Example 1.11 (A rational function) Sketch the graph of the rational function
$$y = \frac{A x^n}{a^n + x^n}, \qquad x \ge 0. \tag{1.6}$$

What properties of your sketch depend on the power n? What would the graph look like
for n = 1, 2, 3?
Solution: We can break up the process of understanding this function into the following
steps:
1. The graph of the function (1.6) goes through the origin. (At $x = 0$, we see that
$y = 0$.)
2. For very small $x$ (i.e., $x \ll a$) we can approximate the denominator by the constant
term, $a^n + x^n \approx a^n$, since $x^n$ is negligible by comparison, so that
$$y = \frac{A x^n}{a^n + x^n} \approx \frac{A x^n}{a^n} = \left(\frac{A}{a^n}\right) x^n \quad \text{for small } x.$$
This means that near the origin, the graph looks like a power function, $C x^n$ (where
$C = A/a^n$).
3. For large $x$, i.e. $x \gg a$, we have $a^n + x^n \approx x^n$ so that
$$y = \frac{A x^n}{a^n + x^n} \approx \frac{A x^n}{x^n} = A \quad \text{for large } x.$$
This reveals that the graph has a horizontal asymptote $y = A$ at large values of $x$.
4. Since the function behaves like a simple power function close to the origin, we conclude directly that the higher the value of $n$, the flatter is its graph near 0. Further,
large $n$ means sharper rise to the eventual asymptote.
The results are displayed in Fig. 1.5.
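The two approximations in the steps above can be compared with the exact function numerically. The following Python sketch uses illustrative values $A = 3$, $a = 1$ and a hypothetical helper name `hill`; the sample points $x = 0.01a$ and $x = 100a$ are arbitrary stand-ins for "small" and "large" $x$.

```python
def hill(x, A, a, n):
    """Rational function (1.6): y = A*x**n / (a**n + x**n), for x >= 0."""
    return A * x**n / (a**n + x**n)

A, a = 3.0, 1.0  # illustrative values

for n in (1, 2, 3):
    x_small, x_large = 0.01 * a, 100.0 * a
    power_law = (A / a**n) * x_small**n  # small-x approximation C*x**n
    print(f"n = {n}:  y(small) = {hill(x_small, A, a, n):.3e}"
          f"  vs  C x^n = {power_law:.3e};"
          f"  y(large) = {hill(x_large, A, a, n):.5f}  vs  A = {A}")

# Flatness near the origin: at a fixed small x, a higher n gives a smaller y.
print([hill(0.1, A, a, n) for n in (1, 2, 3)])
```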

1.5 Rate of an enzyme-catalyzed reaction


Rational functions introduced in Example 1.11 often play a role in biochemistry. Here we
discuss two important examples and the contexts in which they appear. In both cases, we
consider the initial rise of the function as well as its eventual saturation.

Figure 1.5. The rational functions (1.6) with $n = 1, 2, 3$ are compared on this
graph. Close to the origin, the function behaves like a power function, whereas for large $x$
there is a horizontal asymptote at $y = A$. As $n$ increases, the graph becomes flatter close
to the origin, and steeper in its rise to the asymptote. (Panel titles: Small $x$; Large $x$; Smoothly connected.)

Learning goals
1. Understand the connection between Michaelis-Menten kinetics in biochemistry and
rational functions described in Section 1.4.2.
2. Be able to interpret properties of a graph such as Fig. 1.7 in terms of properties of an
enzyme-catalyzed reaction.

1.5.1 Saturation and Michaelis-Menten kinetics


Biochemical reactions are often based on the action of proteins known as enzymes that
catalyze many reactions in living cells. Shown in Fig. 1.6 is a typical scheme. The enzyme
E binds to its substrate S to form a complex C. The complex then breaks apart into a product, P, and an enzyme molecule that can repeat its action again. Generally, the substrate is
much more plentiful than the enzyme.

Figure 1.6. An enzyme (catalytic protein) is shown binding to a substrate molecule
(circular dot) and then processing it into a product (star shaped molecule). (Labels in the figure: $E$, $k_1$, $k_{-1}$, $k_2$.)
In the context of this example, $x$ represents the concentration of substrate in the reaction mixture. The speed of the reaction, $v$ (namely the rate at which product is formed),
depends on $x$. But the relationship is not linear, as shown in Fig. 1.7. In fact, this relationship, known as Michaelis-Menten kinetics, has the form
$$v = \frac{K x}{k_n + x}, \tag{1.7}$$
where $K, k_n > 0$ are positive constants that are specific to the enzyme and the experimental
conditions.
Figure 1.7. Left (panel title: Michaelis Menten Kinetics): The graph of reaction speed, $v$, versus substrate concentration,
$c$, in an enzyme-catalyzed reaction. This behaviour is called Michaelis-Menten kinetics.
Note that the graph at first rises almost like a straight line, but then it curves over and
approaches a horizontal asymptote. We refer to this as "saturation". This graph tells us
that the speed of the enzyme cannot exceed some maximal level, i.e. it cannot be faster
than $K$. See Eqn. 1.7. Right (panel title: Hill function Kinetics): Hill function kinetics with $A = 3$, $a = 1$ and Hill coefficient
$n = 1, 2, 3$. See also Fig 1.5 for an analysis of the shape of this graph. (Annotations in the left panel: initial rise, saturation, $K/2$, $k_n$.)
Equation (1.7) is a rational function. Since $x$ is a concentration, it cannot be negative,
so we restrict attention to $x \ge 0$. The expression in (1.7) is a special case of
the rational functions explored in Example 1.11, where $n = 1$, $A = K$, $a = k_n$. In the
left panel of Fig. 1.7, we used graphics software to plot this function for specific values of
$K$, $k_n$. The following observations can be made:
1. The graph of (1.7) goes through the origin. Indeed, when $x = 0$ we have $v = 0$.
2. Close to the origin, the graph looks like a straight line. We can see this by considering values of $x$ that are much smaller than $k_n$. Then the denominator $(k_n + x)$ is
well approximated by the constant $k_n$, so that $v \approx (K/k_n)x$. Thus for
small $x$ the graph resembles a straight line with slope $(K/k_n)$.
3. For large $x$, there is a horizontal asymptote. The reader can use a similar argument
for $x \gg k_n$ to show that $v$ is approximately constant.
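The three observations above can be confirmed numerically. The sketch below uses made-up constants $K = 1$ and $k_n = 50$ and a hypothetical function name; it also evaluates $v$ at $x = k_n$, where substituting into (1.7) gives $v = K k_n/(2k_n) = K/2$.

```python
def michaelis_menten(x, K, kn):
    """Reaction speed (1.7): v = K*x / (kn + x), for x >= 0."""
    return K * x / (kn + x)

K, kn = 1.0, 50.0  # made-up constants for illustration

# Observation 1: the graph passes through the origin.
print("v(0)        =", michaelis_menten(0.0, K, kn))

# Observation 2: for x much smaller than kn, v is close to (K/kn)*x.
x_small = 0.001 * kn
print("v(small x)  =", michaelis_menten(x_small, K, kn),
      " linear approx =", (K / kn) * x_small)

# Observation 3: for x much larger than kn, v is close to K.
x_large = 1000.0 * kn
print("v(large x)  =", michaelis_menten(x_large, K, kn), " K =", K)

# At x = kn the speed is exactly half of its maximal level K.
print("v(kn)       =", michaelis_menten(kn, K, kn), " K/2 =", K / 2)
```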


Michaelis-Menten kinetics thus represents one type of relationship in which the phenomenon of saturation occurs: the speed of the reaction increases for small increases in the
level of substrate, but it cannot increase indefinitely, i.e. the enzymes saturate and operate
at their fixed constant speed when the substrate concentration is very high.
It is worth pointing out the units of terms in (1.7). $x$ carries units of concentration
(e.g. nanomolar, written nM, which means $10^{-9}$ moles per litre). $v$ carries units of concentration over time (e.g. nM min$^{-1}$). $k_n$ must have the same units as $x$. (Only quantities with
identical units can be added or compared!) The units on the two sides of the relationship
(1.7) have to balance too, meaning that $K$ must have the same units as the speed of the
reaction, $v$.

1.5.2 Hill functions


The Michaelis-Menten kinetics we discussed above fit into a broader class of Hill functions, which are rational functions of the form shown in Eqn. (1.6) with $n \ge 1$ and
$A, a > 0$. This function is often referred to as a Hill function with coefficient $n$ (although
the "coefficient" is actually a power in terms of the terminology used in this chapter). Hill
functions occur in biology in situations where the rate of some enzyme-catalyzed reaction
is affected by cooperative behaviour of a number of subunits, or by a chain of steps.
We see that Michaelis-Menten kinetics corresponds to a Hill function with $n = 1$.
In biochemistry, expressions of the form (1.6) with $n > 1$ are often called sigmoidal
kinetics, and a few such functions are plotted on the right panel of Fig. 1.7. We have
already examined the shapes of these functions in Example 1.11. The right panel of Fig. 1.7 shows
these graphs as plotted by graphical software, with the Michaelian
case ($n = 1$) included for comparison.
All Hill functions have a horizontal asymptote $y = A$ at large values of $x$. If $y$ is
the speed of a chemical reaction (analogous to the variable we labeled $v$ on the left panel),
then $A$ is the maximal rate or maximal speed of the reaction. Since the Hill function
behaves like a simple power function close to the origin, the higher the value of $n$, the
flatter is its graph near 0, and the sharper the rise to the eventual asymptote. Hill functions
with large $n$ are often used to represent switch-like behaviour in genetic networks or
biochemical signal transduction pathways.
The constant $a$ is sometimes called the half-maximal activation level for the following reason: When $x = a$ then
$$y = \frac{A a^n}{a^n + a^n} = \frac{A a^n}{2 a^n} = \frac{A}{2}.$$
This shows that the level $x = a$ leads to the half-maximal level of $y$.

1.6 Analysis versus computational tools: two sides of a coin

Sections 1.4.2 and 1.5 illustrate the fact that mathematical understanding can be gained in a
variety of ways. Whereas in Section 1.4.2 we used reasoning and geometric analysis to sketch
graphs of interest, in Section 1.5 we relied on software to graph the same functions. The
two approaches complement one another: one helps to anticipate the shape of the function,
while the other provides greater accuracy provided we pick a reasonable range of values
for the plot. This idea of using distinct but complementary approaches will be used often.
Rough sketches will supplement the more precise graphing that we accomplish using
calculus, while software will help finalize our results and provide
computational support for calculations that are otherwise tedious or repetitive.

1.7 For further study: Michaelis-Menten transformed to a linear relationship
Michaelis-Menten kinetics that we explored in (1.7) is a nonlinear saturating function in
which the concentration $x$ is the independent variable on which the reaction velocity, $v$,
depends. As discussed in Section 1.5.1, the constants $K$ and $k_n$ depend on the enzyme and
are often quantified in a biochemical assay of enzyme action. In the past, a convenient
way to estimate the values of $K$ and $k_n$ was to measure $v$ for many different values of the
initial substrate concentration. Before nonlinear fitting software was widely available, the
expression (1.7) was transformed (meaning that it was rewritten) as a linear relationship.
We can do so with the following algebraic steps:
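$$v = \frac{K x}{k_n + x},$$
so, taking reciprocals and expanding leads to
$$\frac{1}{v} = \frac{k_n + x}{K x} = \frac{k_n}{K x} + \frac{x}{K x} = \left(\frac{k_n}{K}\right)\frac{1}{x} + \frac{1}{K}.$$
This suggests defining the two constants:
$$m = \frac{k_n}{K}, \qquad b = \frac{1}{K},$$
in which case the relationship between $1/v$ and $1/x$ becomes linear:
$$\left(\frac{1}{v}\right) = m \left(\frac{1}{x}\right) + b. \tag{1.8}$$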

Both the slope, $m$, and intercept, $b$, of the straight line provide information about the parameters. The relationship (1.8), which is a disguised variant of Michaelian kinetics, is called
the Lineweaver-Burk relationship. Later, we will see how this can be used to estimate the
values of $K$ and $k_n$ from biochemical data about an enzyme.
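To make the idea concrete, here is a minimal Python sketch of how (1.8) could be used. It is an illustration under stated assumptions, not the procedure developed later in the course: the "data" are synthetic and noise-free, generated from (1.7) with made-up constants $K = 2$, $k_n = 5$, and the straight line is fitted with an ordinary least-squares helper written out by hand (`fit_line` is our own name).

```python
def michaelis_menten(x, K, kn):
    """Reaction speed (1.7): v = K*x / (kn + x)."""
    return K * x / (kn + x)

def fit_line(xs, ys):
    """Ordinary least-squares slope m and intercept b for y = m*x + b."""
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(ys) / count
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    m = sxy / sxx
    b = mean_y - m * mean_x
    return m, b

# Synthetic, noise-free "measurements" from made-up constants.
K_true, kn_true = 2.0, 5.0
concentrations = [1.0, 2.0, 5.0, 10.0, 20.0, 50.0]
speeds = [michaelis_menten(x, K_true, kn_true) for x in concentrations]

# Transform to reciprocals and fit the straight line (1.8).
m, b = fit_line([1.0 / x for x in concentrations],
                [1.0 / v for v in speeds])

K_est = 1.0 / b   # because b = 1/K
kn_est = m / b    # because m = kn/K
print(f"estimated K = {K_est:.6f}, estimated kn = {kn_est:.6f}")
```

With real, noisy measurements the reciprocal transformation magnifies errors at small $x$ and small $v$, which is one reason nonlinear fitting is now generally preferred.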

1.8 For further study: Spacing of fish in a school

Many animals live or function best when they are in a group. Social groups include herds
of wildebeest, flocks of birds, and schools of fish, as well as swarms of insects. Life in a
group can affect the way that individuals forage (search for food), their success at detecting
or avoiding being eaten by a predator, and other functions such as mating, protection of
the young, etc. Biologists are interested in the ecological implications of groups on their
own members or on other species with whom they interact, and how individual behaviour,
combined with environmental factors and random effects affect the shape of the groups, the
spacing, and the function.
In many social groups, the spacing between individuals is relatively constant from
one part of the formation to another, because animals that get too close start to move away
from one another, whereas those that get too far apart are attracted back. These spacing
distances can be observed in a variety of groups, and were described in many biological
publications. For example, Emlen [10] found that in flocks, gulls are spaced at about one
body length apart, whereas Conder [11] observed a 2-3 body lengths spacing distance in
tufted ducks. Miller [13] observed that sandhill cranes try to keep about 5.8 ft apart in the
flock he observed.
To try to explain why certain spacing is maintained in a group of animals, it was
proposed that there are mutual attraction and repulsion interactions, (effectively acting like
simple forces) between individuals. Breder [3] followed a number of species of fish that
school, and measured the individual spacing in units of the fish body length, showing that
individuals are separated by 0.16-0.25 body length units. He suggested that the effective
forces between individuals were similar to inverse power laws for repulsion and attraction.
Breder considered a quantity he called cohesiveness, defined as:
$$c = \frac{A}{x^m} - \frac{R}{x^n}, \tag{1.9}$$
where $A$, $R$ are magnitudes of attraction and repulsion, $x$ is the distance between individuals, and $m$, $n$ are integer powers that govern how quickly the interactions fall off with
distance. We could re-express the formula (1.9) as
$$c = A x^{-m} - R x^{-n}.$$
Thus, the function shown in Breder's cohesiveness formula is related to our power functions, but the powers are negative integers. A specific case considered by Breder was
$m = 0$, $n = 2$, i.e. constant attraction and inverse square law repulsion,
$$c = A - (R/x^2).$$
Breder specifically considered the "point of neutrality", where $c = 0$. The distance at
which this occurs is:
$$x = (R/A)^{1/2},$$
where attraction and repulsion are balanced. This is the distance at which two fish would
be most comfortable: neither tending to move apart, nor get closer together.
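A short Python sketch of the cohesiveness formula (1.9) follows. Setting $c = 0$ in (1.9) gives $A x^{-m} = R x^{-n}$, so for $n \neq m$ the neutral distance is $x = (R/A)^{1/(n-m)}$; this general form is our own rearrangement, and it reduces to $(R/A)^{1/2}$ in Breder's case $m = 0$, $n = 2$. The values $A = 4$, $R = 1$ and the function names are made up for illustration.

```python
def cohesiveness(x, A, R, m, n):
    """Breder's cohesiveness (1.9): c = A/x**m - R/x**n, for x > 0."""
    return A / x**m - R / x**n

def neutral_distance(A, R, m, n):
    """Distance at which c = 0, i.e. x = (R/A)**(1/(n - m)); requires n != m."""
    return (R / A) ** (1.0 / (n - m))

# Breder's special case: constant attraction (m = 0), inverse-square repulsion (n = 2).
A, R, m, n = 4.0, 1.0, 0, 2   # A and R are made-up magnitudes
x_star = neutral_distance(A, R, m, n)
print("neutral distance:", x_star)

# Closer than x_star repulsion wins (c < 0); farther away attraction wins (c > 0).
for x in (0.5 * x_star, x_star, 2.0 * x_star):
    print(f"x = {x:5.2f}   c = {cohesiveness(x, A, R, m, n):7.3f}")
```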
Other ecologists studying a similar problem have used a variety of assumptions about
forces that cause group members to attract or repel one another.


Exercises
1.1. Power functions: Consider the power function
$$y = ax^n, \qquad -\infty < x < \infty.$$

Explain verbally (or using a sketch) how the shape of the function changes when
the coefficient a increases or decreases (for fixed n). How is this change in shape
different from the shape change that results from changing the power n?
1.2. Simple transformations: Consider the graphs of the simple functions $y = x$, $y = x^2$,
and $y = x^3$. What happens to each of these graphs when the functions are
transformed as follows:
(a) $y = Ax$, $y = Ax^2$, and $y = Ax^3$ where $A > 1$ is some constant.
(b) $y = x + a$, $y = x^2 + a$, and $y = x^3 + a$ where $a > 0$ is some constant.
(c) $y = (x - b)^2$, and $y = (x - b)^3$ where $b > 0$ is some constant.
1.3. Simple sketches: Sketch the graphs of the following functions:
(a) $y = x^2$,
(b) $y = (x + 4)^2$
(c) $y = a(x - b)^2 + c$ for the case $a > 0$, $b > 0$, $c > 0$.
(d) Comment on the effects of the constants $a$, $b$, $c$ on the properties of the graph
of $y = a(x - b)^2 + c$.
1.4. Sketching simple polynomials: Use arguments from Section 1.4 to sketch graphs
of the following simple polynomials:
(a) $y = 2x^5 - 3x^2$,
(b) $y = x^3 - 4x^5$.

1.5. Finding points of intersection (I):
(a) Consider the two functions $f(x) = 3x^2$ and $g(x) = 2x^5$. Find all points of
intersection of these functions.
(b) Repeat the calculation for the two functions $f(x) = x^3$ and $g(x) = 4x^5$.
Observe that finding these points of intersection is equivalent to calculating the zeros
of the functions in Problem 4.
1.6. Qualitative sketching skills:
(a) Sketch the graph of the function $y = ax - x^5$ for positive and negative values
of the constant $a$. Comment on behaviour close to zero and far away from
zero.
(b) What are the zeros of this function and how does this depend on $a$?
(c) For what values of $a$ would you expect that this function would have a local
maximum ("peak") and a local minimum ("valley")?


1.7. Finding points of intersection (II): Consider the two functions $f(x) = Ax^n$ and
$g(x) = Bx^m$. Suppose $m > n > 1$ are integers, and $A, B > 0$. Determine the
values of $x$ at which the values of the functions are the same. Are there two places
of intersection or three? How does this depend on the integer $m - n$? (Remark:
The point (0,0) is always an intersection point. Thus, we are asking when there is
only one more and when there are two more intersection points. See Problem 5 for
a simple example of both types.)
1.8. More intersection points: Find the intersection of each pair of functions.

(a) y = x, y = x2

(b) y = x, y = x2
(c) $y = x^2 - 1$, $\frac{x^2}{4} + y^2 = 1$

1.9. Crossing the x axis: Answer the following problem by solving for x in each case.
Find all values of x for which the following functions cross the x axis (also called
zeros of the function, or roots of the equation f (x) = 0.)
(a) f (x) = I x, where I, are positive constants.

(b) f (x) = I x + x2 , where I, , are positive constants. Are there cases


where this function does not cross the x axis?
(c) In the case where the root(s) exist in part (b), are they positive, negative or of
mixed signs?
1.10. Crossing the x axis, continued: Answer Problem 9 by sketching a rough graph of
each of the functions in parts (a-b) and using these sketches to answer the question
of how many real roots there can be and where they are located (on the positive or
negative x axis). Note: This problem provides very important qualitative analysis
skills that will become useful in later applications.
1.11. Power functions: Consider the functions $y = x^n$, $y = x^{1/n}$, $y = x^{-n}$, where $n$ is
an integer ($n = 1, 2, \ldots$). Which of these functions increases most steeply for values of
$x$ greater than 1? Which decreases for large values of $x$? Which functions are not
defined for negative $x$ values? Compare the values of these functions for $0 < x < 1$.
Which of these functions are not defined at $x = 0$?
1.12. Roots of a quadratic: Find the range of m such that the equation $x^2 - 2x - m = 0$
has two unequal roots.
1.13. Rational Functions: In support of Learning Goal 2 of Section 1.4, describe the
shape of the graph of the function $y = Ax^n/(b + x^m)$ in two cases: (a) n > m and
(b) m > n.
1.14. Power functions with negative powers: Consider the function
\[ f(x) = \frac{A}{x^a} \]
where A > 0, a > 1, with a an integer. This is the same as the function $f(x) = Ax^{-a}$,
which is a power function with a negative power.
(a) Sketch a rough graph of this function for x > 0.
(b) How does the function change if A is increased?

(c) How does the function change if a is increased?

1.15. Intersections of functions with negative powers: Consider two functions of the
form
\[ f(x) = \frac{A}{x^a}, \quad g(x) = \frac{B}{x^b}. \]
Suppose that A, B > 0, a, b > 1 and that A > B. Determine where these functions
intersect for positive x values.
1.16. Zeros of polynomials: Find all real zeros of the following polynomials:
(a) $x^3 - 2x^2 - 3x$
(b) $x^5 - 1$
(c) $3x^2 + 5x - 2$.
(d) Find the points of intersection of the functions $y = x^3 + x^2 - 2x + 1$ and $y = x^3$.
1.17. Inverse functions: The functions $y = x^3$ and $y = x^{1/3}$ are inverse functions.
(a) Sketch both functions on the same graph for $-2 < x < 2$ showing clearly
where they intersect.
(b) The tangent line to the curve $y = x^3$ at the point (1,1) has slope m = 3,
whereas the tangent line to $y = x^{1/3}$ at the point (1,1) has slope m = 1/3.
Explain the relationship of the two slopes.
1.18. Properties of a cube: The volume V and surface area S of a cube whose sides have
length a are given by the formulae
\[ V = a^3, \quad S = 6a^2. \]

Note that these relationships are expressed in terms of power functions. The independent variable is a, not x. We say that V is a function of a (and also S is a
function of a).
(a) Sketch V as a function of a and S as a function of a on the same set of axes.
Which one grows faster as a increases?
(b) What is the ratio of the volume to the surface area; that is, what is $\frac{V}{S}$ in terms
of a? Sketch a graph of $\frac{V}{S}$ as a function of a.

(c) The formulae above tell us the volume and the area of a cube of a given side
length. But suppose we are given either the volume or the surface area and
asked to find the side. Find the length of the side as a function of the volume
(i.e. express a in terms of V ). Find the side as a function of the surface area.
Use your results to find the side of a cubic tank whose volume is 1 litre (1 litre
= $10^3$ cm$^3$). Find the side of a cubic tank whose surface area is 10 cm$^2$.
1.19. Properties of a sphere: The volume V and surface area S of a sphere of radius r
are given by the formulae
\[ V = \frac{4}{3}\pi r^3, \quad S = 4\pi r^2. \]

Note that these relationships are expressed in terms of power functions with constant
multiples such as $4\pi$. The independent variable is r, not x. We say that V is a
function of r (and also S is a function of r).
(a) Sketch V as a function of r and S as a function of r on the same set of axes.
Which one grows faster as r increases?
(b) What is the ratio of the volume to the surface area; that is, what is $\frac{V}{S}$ in terms
of r? Sketch a graph of $\frac{V}{S}$ as a function of r.

(c) The formulae above tell us the volume and the area of a sphere of a given
radius. But suppose we are given either the volume or the surface area and
asked to find the radius. Find the radius as a function of the volume (i.e.
express r in terms of V ). Find the radius as a function of the surface area. Use
your results to find the radius of a balloon whose volume is 1 litre (1 litre =
$10^3$ cm$^3$). Find the radius of a balloon whose surface area is 10 cm$^2$.
1.20. The size of a cell: Consider a cell in the shape of a thin cylinder (length L and radius r). Assume that the cell absorbs nutrient through its surface at rate $k_1 S$ and
consumes nutrients at rate $k_2 V$ where S, V are the surface area and volume of the
cylinder. Here we assume that $k_1 = 12\ \mu$M $\mu$m$^{-2}$ per min and $k_2 = 2\ \mu$M $\mu$m$^{-3}$
per min. (Note: $\mu$M is $10^{-6}$ moles. $\mu$m is $10^{-6}$ meters.) Use the fact that a cylinder
(without end-caps) has surface area $S = 2\pi rL$ and volume $V = \pi r^2 L$ to determine
the cell radius such that the rate of consumption exactly balances the rate of absorption. What do you expect happens to cells with a bigger or smaller radius? How
does the length of the cylinder affect this nutrient balance?
1.21. Energy equilibrium for Earth: This problem focuses on Earth's temperature, climate change, and sustainability.
(a) Complete the calculation for Example 1.7 by solving for the temperature T of
the Earth at which incoming and outgoing radiation energies balance.
(b) Assume that greenhouse gases decrease the emissivity of the Earth's atmosphere. Explain how this would affect the Earth's temperature.
(c) Explain how the size of the Earth affects its energy balance according to the
model.
(d) Explain how the albedo a affects the Earth's temperature.
1.22. Allometric relationship: Properties of animals are often related to their physical
size or mass. For example, the metabolic rate of the animal (R), and its pulse rate
(P) may be related to its body mass m by the approximate formulae $R = Am^b$ and
$P = Cm^d$, where A, C, b, d are positive constants. Such relationships are known as
allometric relationships.
(a) Use these formulae to derive a relationship between the metabolic rate and the
pulse rate (Hint: eliminate m).
(b) A similar process can be used to relate the volume $V = (4/3)\pi r^3$ and surface
area $S = 4\pi r^2$ of a sphere to one another. Eliminate r to find the corresponding relationship between volume and surface area for a sphere.


1.23. Rate of a very simple chemical reaction: Here we consider a chemical reaction
that does not saturate, and consider the simple linear relationship between reaction
speed and reactant concentration. A chemical is being added to a mixture and is used
up by a reaction that occurs in that mixture. The rate of change of the chemical,
(also called the rate of the reaction) v (in units of M/sec where M stands for
Molar, which is the number of moles per litre) is observed to follow a relationship
$v = a - bc$ where c is the reactant concentration (in units of M) and a, b are positive
constants. (Note that here v is considered to be a function of c, and moreover, the
relationship between v and c is assumed to be linear.)
(a) What units should a and b have to make this equation consistent? (Remember:
in an equation such as $v = a - bc$, each of the three terms must have the same
units. Otherwise, the equation would not make sense.)
(b) Use the information in the graph shown in Figure 1.8 to find the values of a
and b. (To do so, you should find the equation of the line in the figure, and
compare it to the relationship $v = a - bc$.)
(c) What is the rate of the reaction when c = 0.005 M?
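[Figure 1.8: straight-line plot of reaction rate v (vertical axis) against concentration c (horizontal axis), annotated with slope -0.2 and the value 0.01 M on the concentration axis.]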

Figure 1.8. Figure for problem 23


1.24. Michaelis-Menten kinetics: Consider the Michaelis-Menten kinetics where the
speed of an enzyme-catalyzed reaction is given by $v = Kx/(k_n + x)$.
(a) Explain the statement that when x is large there is a horizontal asymptote,
and find the value of v that the graph approaches along that asymptote.
(b) Determine the reaction speed when $x = k_n$ and explain why the constant $k_n$
is sometimes called the half-max concentration.
1.25. A polymerization reaction: Consider the speed of a polymerization reaction shown
in Figure 1.9. Here the rate of the reaction is plotted as a function of the substrate
concentration. (The experiment concerned the polymerization of actin, an important
structural component of cells; data from [12].) The experimental points are shown as
dots, and a Michaelis-Menten curve has been drawn to best fit these points. Use the
data in the figure to determine approximate values of K and kn in the two treatments
shown.


Figure 1.9. Figure for problem 25


1.26. Hill functions: Hill functions are sometimes used to represent a biochemical switch,
that is a rapid transition from one state to another. Consider the Hill functions
\[ y_1 = \frac{x^2}{1 + x^2}, \qquad y_2 = \frac{x^5}{1 + x^5}. \]

(a) Where do these functions intersect?


(b) What are the asymptotes of these functions?
(c) Which of these functions increases fastest near the origin?
(d) Which is the sharpest switch and why?
1.27. Transforming a Hill function to a linear relationship: A Hill function is a nonlinear function. But if we redefine variables, we can transform it into a linear relationship. The process is analogous to transforming Michaelis-Menten kinetics into a
Lineweaver-Burk plot. Determine how to define appropriate variables X and Y (in
terms of the original variables x and y) so that the Hill function $y = Ax^3/(a^3 + x^3)$
is turned into a linear relationship between X and Y . Then indicate how the slope
and intercept of the line are related to the original constants A, a in the Hill function.
1.28. Hill function and sigmoidal chemical kinetics: It is known that the rate v at which
a certain chemical reaction proceeds depends on the concentration of the reactant c
according to the formula
\[ v = \frac{Kc^2}{a^2 + c^2} \]
where K, a are some constants. When the chemist plots the values of the quantity
1/v (on the y axis) versus the values of $1/c^2$ (on the x axis), she finds that the
points are best described by a straight line with y-intercept 2 and slope 8. Use this
result to find the values of the constants K and a.


1.29. Lineweaver-Burk plots: Shown in Figure 1.10 (a) and (b) are two Lineweaver-Burk
plots. By noting properties of these figures, compare
the following pairs of enzymes:
(a) Enzyme (1) and (2).
(b) Enzyme (1) and (3).
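[Figure 1.10: two panels, each plotting 1/v against 1/c. One panel shows lines for enzymes (1) and (2); the other shows lines for enzymes (1) and (3).]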

Figure 1.10. Figure for problem 29


1.30. Michaelis-Menten enzyme kinetics: The rate of an enzymatic reaction according
to the Michaelis-Menten kinetics assumption is
\[ v = \frac{Kc}{k_n + c}, \]
where c is concentration of substrate (shown on the x axis) and v is the reaction
speed (given on the y axis). Consider the data points given in the table below:

Substrate conc c (nM)       5.      10.     20.     40.     50.     100.
Reaction speed v (nM/min)   0.068   0.126   0.218   0.345   0.39    0.529

Convert this data to a Lineweaver-Burk (linear) relationship. Plot the transformed
data values on a graph or spreadsheet, and estimate the slope and y-intercept of the
line you get. Use these results to find the best estimates for K and $k_n$.
1.31. Spacing in a school of fish: According to the biologist Breder [2], two fish in
a school prefer to stay some specific distance apart. Breder suggested that the fish
that are a distance x apart are attracted to one another by a force $F_A(x) = A/x^a$ and
repelled by a second force $F_R(x) = R/x^r$, to keep from getting too close. He found
the preferred spacing distance (also called the individual distance) by determining
the value of x at which the repulsion and the attraction exactly balance. Find the
individual distance in terms of the quantities A, R, a, r (all assumed to be positive
constants.)

Chapter 2

Average rates of change, average velocity and the secant line
In this chapter, we introduce the idea of an average rate of change. To motivate ideas, we
examine data for two common processes, changes in temperature, and motion of a falling
object. Simple experiments are described in each case, and some features of the data are
discussed. Based on each example, we define and calculate net change over some time
interval and so define the average rate of change. This concept generalizes to functions of
any variable (not only time). We interpret this idea geometrically, in terms of the slope of
a secant line.
In both cases, we then ask how to use the idea of the average rate of change (over
a given interval) to find better and better approximations of the rate of change at a single
instant, (i.e. at a point). We will see that one way to arrive at this abstract concept entails
refining the dataset - collecting data at closer and closer time points. A second, more
abstract way, is to use the idea of a limit. Eventually, this procedure will allow us to arrive
at the definition of the derivative, which is the instantaneous rate of change.

2.1 Time-dependent data and rates of change

In this section we consider several time-dependent processes. We make several observations
about actual data collected in studying those processes, and we arrive at the ideas of rates of
change. We also use graphical software to represent the data for the purpose of visualization
and for computing desired rates of change.

Section 2.1 Learning goals


1. Be able to use (your favorite) graphical software package (spreadsheet, graphics calculator, online tools, etc) to plot data points such as those in Table 2.1.
2. Be able to describe verbally the trends seen in such data using words such as increasing, decreasing, linear, nonlinear, shallow, steep changes, etc.

2.1.1 Milk temperature in a recipe for yoghurt


Making yoghurt calls for heating milk to 190°F to kill off undesirable bacteria, and then
cooling to 110°F. Some pre-made yoghurt with live culture is added, and the mixture is
kept at 110°F for 7-8 hours. This is the ideal temperature for growth of Lactobacillus, a
useful micro-organism that turns milk into yoghurt^5.
Example 2.1 (Heating and cooling milk) Shown in Table 2.1 is a set of temperature measurements for milk over time in yoghurt preparation where (a) is the heating phase and (b)
the cooling phase6 . Use your favorite software to plot the data and describe the trends you
see in each graph.
(a) Heating                           (b) Cooling
time (min)   Temperature (°F)         time (min)   Temperature (°F)
0.0          44.3                     0            190
0.5          61                       2            176
1.0          77                       4            164.6
1.5          92                       6            155.4
2.0          108                      8            148
2.5          122                      10           140.9
3.0          135.3                    14           131
3.5          149.2                    18           123
4.0          161.9                    22           116
4.5          174.2                    26           111.2
5.0          186

Table 2.1. Temperature of the milk as it is (a) Heated and (b) Cooled.

Solution: The data is plotted in Fig. 2.1(a,b). The measurements are discrete, that is, we
only have a finite number of points at which the temperature was recorded, but we can
connect these points with line segments in (b) or approximate the entire collection by a
straight line in (a) to see the trend. In (a) the temperature increases at close to a constant
rate (the points appear to fit a straight line) whereas in (b) the temperature decreases, but
the steepness of the temperature drop becomes more shallow as time goes by.
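One way to check these verbal descriptions is to compute the slope of each line segment joining neighbouring data points. The short Python sketch below does this for the values in Table 2.1. The function name `successive_slopes` is our own choice for illustration, and any spreadsheet would serve equally well.

```python
# Data from Table 2.1: (time in min, temperature in deg F).
heating = [(0.0, 44.3), (0.5, 61), (1.0, 77), (1.5, 92), (2.0, 108), (2.5, 122),
           (3.0, 135.3), (3.5, 149.2), (4.0, 161.9), (4.5, 174.2), (5.0, 186)]
cooling = [(0, 190), (2, 176), (4, 164.6), (6, 155.4), (8, 148),
           (10, 140.9), (14, 131), (18, 123), (22, 116), (26, 111.2)]


def successive_slopes(points):
    """Slope of the line segment joining each pair of neighbouring data points."""
    return [(T2 - T1) / (t2 - t1)
            for (t1, T1), (t2, T2) in zip(points, points[1:])]


heating_slopes = successive_slopes(heating)
cooling_slopes = successive_slopes(cooling)

print("heating (deg F/min):", [round(s, 1) for s in heating_slopes])
print("cooling (deg F/min):", [round(s, 2) for s in cooling_slopes])
```

The heating slopes are all positive and of similar size, while the cooling slopes are all negative and shrink in magnitude from one segment to the next, in line with the trends described above.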
As part of our exploration in this chapter, we will address the following questions:
1. How fast is the temperature T (t) increasing in (a)?
2. How fast is it decreasing in (b)?
Before answering the questions posed here, we introduce other examples of time-dependent
data.
^5 The initial heating also denatures milk proteins, which prevents the milk from turning into curds.
^6 The data was collected by your instructor in her kitchen.

[Figure 2.1: two panels plotting temperature (°F) against time (min): (a) from 0 to 5 min, (b) from 0 to 26 min.]

Figure 2.1. Plots of the data shown in Table 2.1.

2.1.2 Data for swimming Tuna


Example 2.2 (Bluefin tuna swimming distances) The tuna fishing industry is of great
economic value, but danger of overfishing has been recognized. In an effort to support
sustainability, Prof Molly Lutcavage studied the swimming behaviour of Atlantic bluefin
tuna (Thunnus thynnus L.) in the Gulf of Maine. She recorded their position over a period
of 1-2 days. Some of her approximate data is given in Table 2.2. Plot the data points and
describe the trends these display.
time (hr)   distance Tuna 1 (km)   distance Tuna 2 (km)
0           0                      0
5           32                     29
10          55                     51
15          80                     78
20          111                    140
25          125                    160
30          150                    182
35          180                    218

Table 2.2. Data for tuna swimming distance collected by Molly Lutcavage in the
Gulf of Maine.

Solution: The data is plotted in Fig. 2.2. The distance travelled by Tuna 1 is roughly
proportional to time spent. We see this from the fact that the red trajectory is approximately
linear. A linear relationship between distance travelled and time is called uniform motion.
Tuna 2 started with much the same kind of uniform motion, but later it sped up and
travelled faster. During the time span $15 \le t \le 20$ h, it was moving much faster than at
other times.

[Figure 2.2: distance (km, from 0 to 250) plotted against time (hrs, from 0 to 35) for Tuna 1 and Tuna 2.]

Figure 2.2. Distance travelled by two bluefin tuna over 35 hrs

2.1.3 Data for a falling object


Observations recording the position of a falling object were made long ago by Galileo. He
devised some ingenious experiments to quantify the relationship between the total distance
fallen under the force of gravity over a given time. Here we examine his discovery.
Example 2.3 (Galileo's data for height of a falling object) Galileo discovered that the
distance fallen, y(t), is proportional to the square of the time t, that is
\[ y(t) = ct^2, \tag{2.1} \]
where c is a constant^7. When distance is measured in meters (m) and time in seconds (s)
the constant is found to be $c = 4.9$ m/s$^2$ ^8.
Use the relationship in (2.1) to plot a graph of the distance fallen y(t) versus time t
for $0 \le t \le 2$ seconds at intervals of 0.1 s. Connect the data points and comment on the
shape of the graph.
^7 Later in this course, we will see that this follows directly from the fact that gravity causes constant acceleration,
but Galileo did not realize this fact, nor did he have a clear idea about what acceleration meant.
^8 Although Galileo did not have formulae or graph-paper in his day (and was thus forced to express this
relationship in a cumbersome verbal way), what he had discovered was quite remarkable.

[Figure 2.3: graph of y(t) = 4.9 t^2 for t from 0 to 2, with the vertical axis running from 0 to 20.]

Figure 2.3. The height of an object falling under the force of gravity.
Solution: The graph is shown in Fig. 2.3. We recognize this as a parabola, resulting from
the quadratic relationship of y and t. (In fact the relationship is that of a simple power
function with a constant coefficient.)
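For readers who prefer to generate the points numerically before plotting, the following Python sketch tabulates Eqn. (2.1) at intervals of 0.1 s. The names `C`, `distance_fallen` and `table` are ours, chosen only for illustration.

```python
C = 4.9  # m/s^2, the constant c in Eqn. (2.1)


def distance_fallen(t):
    """Distance fallen (m) after t seconds, y(t) = c t^2."""
    return C * t**2


# 21 time points 0.0, 0.1, ..., 2.0, built from integers to avoid rounding drift.
times = [k / 10 for k in range(21)]
table = [(t, distance_fallen(t)) for t in times]

for t, y in table:
    print(f"{t:4.1f} s   {y:6.3f} m")
```

The distance gained in each successive 0.1 s step keeps growing, which is the numerical signature of the upward-curving parabola seen in the graph.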
Having looked at three examples of data for time-dependent processes, we now turn
to quantifying the rate at which change occurs in each process. We start with the notion
of average rate of change, and eventually refine this idea and idealize it to develop rates of
change at an instant in time.

2.2 The slope of a straight line is a rate of change

In the examples discussed so far, we have plotted data and used verbal statements to describe trends. Our goal now is to make more precise the idea of change and rate of change.
Let us consider the simplest case where a variable of interest, y depends linearly on time.
This was approximately the case in some examples seen previously (Fig. 2.1a, parts of
Fig. 2.2). We can describe this kind of relationship by the idealized equation
\[ y(t) = mt + b. \tag{2.2} \]

Moreover, the graph of y versus t is then a straight line with slope m and intercept b.
Definition 2.4 (Rate of change for a linear relationship). For a straight line, we define
the rate of change of y with respect to time t as the ratio:
\[ \frac{\text{Change in } y}{\text{Change in } t}. \]
We now make a fundamental observation whose importance cannot be overestimated.


Example 2.5 Show that the slope m of the straight line (2.2) corresponds to the above
definition of the rate of change of a linear relationship.
Solution: Taking any two points $(t_1, y_1)$ and $(t_2, y_2)$ on that line, and using the notation
$\Delta y$, $\Delta t$ to represent the change in y and t, we compute the ratio and simplify algebraically
to find:
\[ \frac{\text{Change in } y}{\text{Change in } t} = \frac{\Delta y}{\Delta t} = \frac{y_2 - y_1}{t_2 - t_1} = \frac{(mt_2 + b) - (mt_1 + b)}{t_2 - t_1} = \frac{mt_2 - mt_1}{t_2 - t_1} = m. \]
Thus the slope m corresponds exactly with the notion of change of y per unit time which
we call henceforth the rate of change of y with respect to time. It is important to notice
that this calculation leads to the same result no matter which two points we pick on the
graph of the straight line.

2.3 The slope of a secant line is the average rate of change

Section 2.3 Learning goals
1. Understand the definition of average rate of change and its connection with the concept of the slope of a secant line.
2. Be able to compute the average rate of change using time-dependent data over a
given time interval.
3. Given two points on the graph of a function, or two discrete data points, find both the
slope and the equation of a secant line through those points.
We generalize the ideas in Section 2.2 to consider rates of change for relationships
other than linear. Let f (t) describe some relationship between time t and a variable of
interest y. The relationship y = f (t) could describe some set of discrete data points (as in
Fig. 2.1), or a formula (as in (2.1)).
Let us pick any two points (a, f (a)), and (b, f (b)) satisfying y = f (t). Draw a line
through those two points^9. We refer to this line as the secant line, and we denote its slope
as an average rate of change over the interval $a \le t \le b$. Formally, we define
Definition 2.6 (Secant Line). A secant line is a straight line connecting any two specific
points on the graph of a function.
Definition 2.7 (Average rate of change). The average rate of change of y = f (t) over the
time interval $a \le t \le b$ is the slope of the secant line through the two points (a, f (a)), and
(b, f (b)).
^9 We have drawn secant lines between every pair of successive data points in Figs. 2.2, 2.3, for example, but in
general a secant line could join any two points.


Having reduced the definition to the slope of a straight line, we can compute the average
rate of change of f over the time interval $a \le t \le b$ as follows:
\[ \text{Average rate of change} = \frac{\text{Change in } f}{\text{Change in } t} = \frac{\Delta f}{\Delta t} = \frac{f(b) - f(a)}{b - a}. \]

Observe that the average rate of change will in general depend on which two points we
select, in contrast to the linear case. (See Left panel in Fig. 2.4.) We caution that the
word average sometimes causes confusion. One often speaks in a different context of the
average value of a set of numbers (e.g. the average of {7, 1, 3, 5} is (7 + 1 + 3 + 5)/4 = 4.)
However the average rate of change is always defined in terms of a pair of points. It is not
the average of some arbitrary set of values.


Figure 2.4. A secant line is a straight line connecting two points on the graph
of a function. Left: a set of time dependent data points (black circles) or smooth function
(dashed curve) f (t) showing the secant line through the points (a, f (a)), and (b, f (b)).
Right: The graph of some arbitrary function f (x) with a secant line through the points
(x0 , f (x0 )) and (x0 + h, f (x0 + h)). The slope of the secant line is defined as the average
rate of change of f over the given interval.
We use this definition to compute the average rate of change for each of the examples
presented earlier.
Example 2.8 (Average rate of change of milk temperature) Use the data in Table 2.1 to
find the average rates of change of the milk temperature over the time interval 2 < t < 4
for both the heating and the cooling phases.
Solution: Over a given time interval, the average rate of change of the temperature is
\[ \frac{\text{Change in temperature}}{\text{Time taken}} = \frac{\Delta T}{\Delta t}. \]
As the milk cools, over the interval $2 \le t \le 4$ min, the average rate of change is
\[ \frac{164.6 - 176}{4 - 2} = -5.7\ ^\circ\text{F/min}. \]


Over the same time interval for the heating milk, the average rate of change of the temperature is
\[ \frac{161.9 - 108}{4 - 2} = 26.95\ ^\circ\text{F/min}. \]

Were we to connect two points (2, T (2)) and (4, T (4)) on one of the graphs in Fig. 2.1,
we would obtain a secant line whose slope matches the average rate of change we have
computed here.
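The same calculation can be sketched in a few lines of Python. The helper `average_rate_of_change` below is a hypothetical name of our own. It simply evaluates the slope formula for two recorded times, using the relevant entries of Table 2.1.

```python
def average_rate_of_change(data, a, b):
    """Slope of the secant line through the data points at t = a and t = b.

    `data` maps each measured time to the measured value; both a and b
    must be times that were actually recorded.
    """
    return (data[b] - data[a]) / (b - a)


# Temperatures (deg F) at t = 2 min and t = 4 min, taken from Table 2.1.
heating = {2: 108, 4: 161.9}
cooling = {2: 176, 4: 164.6}

heating_rate = average_rate_of_change(heating, 2, 4)  # about 26.95 deg F/min
cooling_rate = average_rate_of_change(cooling, 2, 4)  # about -5.7 deg F/min
print(round(heating_rate, 2), round(cooling_rate, 2))
```

The sign of the result carries meaning: a positive rate indicates warming and a negative rate indicates cooling.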
Example 2.9 (Equation of a secant line) Write down the equation of the secant line using the fact that it goes through a known point (2, T (2)) and has a slope computed in
Example 2.8.
Solution: The secant line goes through the point (t, T) = (2, 108) and has slope 26.95.
Therefore
\[ \frac{y_T - 108}{t - 2} = 26.95 \quad\Rightarrow\quad y_T = 108 + 26.95(t - 2) \quad\Rightarrow\quad y_T = 26.95t + 54.1, \]
where we have used $y_T$ as the height of the secant line, to avoid confusion with T (t), which
is the actual temperature as a function of the time.
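As a cross-check, the slope and intercept of the line through the heating data points (2, 108) and (4, 161.9) can be computed directly. This is only a sketch, and `secant_line` is our own illustrative helper, not part of any library.

```python
def secant_line(p, q):
    """Return (slope, intercept) of the straight line through points p and q."""
    (t1, y1), (t2, y2) = p, q
    slope = (y2 - y1) / (t2 - t1)
    intercept = y1 - slope * t1  # from y1 = slope * t1 + intercept
    return slope, intercept


slope, intercept = secant_line((2, 108), (4, 161.9))
print(f"y_T = {slope:.2f} t + {intercept:.1f}")  # prints: y_T = 26.95 t + 54.1
```

Evaluating `slope * t + intercept` at t = 2 and t = 4 returns the two measured temperatures, which confirms that the line passes through both data points.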
Definition 2.10 (Average velocity). For a moving body, the average velocity over a time
interval $a \le t \le b$ is the average rate of change of distance over the given time interval.
Example 2.11 (Swimming velocity of Bluefin tuna) Use the tuna swimming data in Fig. 2.2
to answer the following questions: (a) Determine the average velocity of each of these two
fish over the 35h shown in the figure. (b) What is the fastest average velocity shown in this
figure, and over what time interval and for which fish did it occur?
Solution: (a) We find that Tuna 1 swam 180 km over the course of 35 hr, whereas Tuna
2 swam 218 km during the same time period. Thus the average velocity of Tuna 1 was
$\bar{v} = 180/35 \approx 5.14$ km/h, whereas a similar calculation for Tuna 2 yields 6.23 km/h.
(b) The fastest average velocity would correspond to the segment of the graph that has the
largest slope. We see that the blue curve (Tuna 2) has the greatest slope during the time
interval 15 < t < 20. Indeed, the distance covered by this tuna rose from 78 to 140 km
over that 5 hr interval, a displacement of 140 - 78 = 62 km. Its average velocity over that
time interval was thus 62/5 = 12.4 km/h.
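These velocities can also be obtained by letting software scan every 5 hr interval. The Python sketch below follows the labelling used in this solution, taking Tuna 1 to be the fish that covers 180 km in total and Tuna 2 the fish that covers 218 km. The function and variable names are our own.

```python
times = [0, 5, 10, 15, 20, 25, 30, 35]          # hr
tuna1 = [0, 32, 55, 80, 111, 125, 150, 180]     # km, fish covering 180 km in total
tuna2 = [0, 29, 51, 78, 140, 160, 182, 218]     # km, fish covering 218 km in total


def interval_velocities(t, d):
    """Average velocity (km/h) between each pair of successive data points."""
    return [(d[i + 1] - d[i]) / (t[i + 1] - t[i]) for i in range(len(t) - 1)]


def overall_velocity(t, d):
    """Average velocity (km/h) between the first and last data points."""
    return (d[-1] - d[0]) / (t[-1] - t[0])


v1 = interval_velocities(times, tuna1)
v2 = interval_velocities(times, tuna2)
fastest = max(range(len(v2)), key=lambda i: v2[i])  # index of the steepest segment

print(round(overall_velocity(times, tuna1), 2), round(overall_velocity(times, tuna2), 2))
print(times[fastest], times[fastest + 1], v2[fastest])
```

Comparing `max(v1)` with `max(v2)` indicates which fish achieved the fastest 5 hr stretch, and `fastest` locates the interval in which it occurred.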
Example 2.12 (Equation of secant line 2) Find the equation of the secant line connecting
the first and last data points for the swimming distances of Tuna 1 in Fig. 2.2.
Solution: Both Tuna 1 and 2 start at distance 0 at time t = 0, so that the y intercept of
the secant line is 0. We have already computed the slope of the secant line (average rate of
change) as 5.14 km/h. Hence the equation of the secant line is
yS = 5.14t.


We can extend the definition of the average rate of change to any function f (x).
Definition 2.13 (Average rate of change of a function). Suppose y = f (x) is a function
of some arbitrary variable x. The average rate of change of f between two points $x_0$ and
$x_0 + h$ is given by
\[ \frac{\text{change in } y}{\text{change in } x} = \frac{\Delta y}{\Delta x} = \frac{f(x_0 + h) - f(x_0)}{(x_0 + h) - x_0} = \frac{f(x_0 + h) - f(x_0)}{h}. \]
Here h is the difference of the x coordinates. The above ratio is the slope of the secant line
shown in the right panel of Fig. 2.4.
Example 2.14 (Average velocity of a falling object) Consider a falling object. Suppose
that the total distance fallen at time t is given by Eqn. (2.1). Find the average velocity $\bar{v}$ of
the object over the time interval $t_0 \le t \le t_0 + h$.

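[Figure 2.5: graph of y = 4.9 t^2 for t from 0 to 2 (vertical axis from 0 to 20), with a secant line drawn through the points at t0 and t0 + h; the slope of the secant line is the average velocity.]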

Figure 2.5. A secant line through two points on the graph of distance versus time
for an object falling under the force of gravity.

Solution: In Fig. 2.5, we reproduce the data for the falling object from Fig. 2.3 and superimpose a secant line connecting two points labeled $t_0$ and $t_0 + h$. We compute the average
velocity as follows:
\begin{align*}
\bar{v} &= \frac{y(t_0 + h) - y(t_0)}{h} \\
&= \frac{c(t_0 + h)^2 - c(t_0)^2}{h} \\
&= c\left(\frac{(t_0^2 + 2ht_0 + h^2) - (t_0^2)}{h}\right) \\
&= c\left(\frac{2ht_0 + h^2}{h}\right) \\
&= c(2t_0 + h). \tag{2.3}
\end{align*}

Thus the average velocity over the time interval $t_0 < t < t_0 + h$ is $\bar{v} = c(2t_0 + h)$.
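A quick numerical check of this algebra is sketched below in Python, assuming c = 4.9 m/s^2 as in Example 2.3. The sample values of $t_0$ and h are arbitrary choices of ours. For each pair, the slope of the secant line computed directly from y(t) is compared with the closed form $c(2t_0 + h)$.

```python
C = 4.9  # m/s^2, the constant c in Eqn. (2.1)


def y(t):
    """Distance fallen (m) at time t (s)."""
    return C * t**2


def average_velocity(t0, h):
    """Slope of the secant line through (t0, y(t0)) and (t0 + h, y(t0 + h))."""
    return (y(t0 + h) - y(t0)) / h


def closed_form(t0, h):
    """The simplified expression c(2 t0 + h) from Eqn. (2.3)."""
    return C * (2 * t0 + h)


samples = [(0.0, 0.5), (1.0, 0.5), (1.0, 0.1), (1.5, 0.25)]
for t0, h in samples:
    print(t0, h, average_velocity(t0, h), closed_form(t0, h))
```

Up to floating-point rounding, the two columns of output agree for every sample pair, as the derivation predicts.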

2.4 From average to instantaneous rate of change


This section could also be titled "Shrinking the timesteps between measurements". So far,
the average rates of change and average velocities were computed over a finite interval,
using two endpoints of the given interval. Our ultimate goal is to refine this idea and define
a rate of change at each point, i.e. an instantaneous rate of change. But to do so, we
first consider how a data set can be refined by making more frequent measurements, that is
decreasing the time steps between successive data points. This will provide a more accurate
notion of the rate of change close to a given point. We discuss two examples below.

Section 2.4 Learning goals


1. Understand that a data set with more frequent measurements corresponds to smaller
time intervals $\Delta t$ between data points (Figs. 2.6, 2.7).
2. Understand the connection between average rate of change over a very small time
interval and instantaneous rate of change at a single point.

2.4.1 Refined temperature data


In Fig. 2.6, the original data for the cooling milk temperature T versus time t (from Fig. 2.1)
is displayed, together with two refined data sets (repetitions of the experiment using more
closely spaced time points). Table 2.3 provides a sample of this refined data. When the
time between measurement is shortened, we can get a more accurate sense of the rate of
change of temperature close to a given time point, as the following example illustrates.
Example 2.15 (Refined average rate of change) Use the data in Table 2.3 to compute the
average rate of change of the temperature over a time interval $2 \le t \le 2 + h$ where
$h = \Delta t$ is the time increment between measured values in each case. ($\Delta t = 2, 1, 0.5$ min,
respectively.) Which of your calculations most accurately describes the behaviour close
to t = 2 min?

Figure 2.6. Three graphs of the temperature (°F) of cooling milk against time (min) showing (a) a coarse
data set (measurements every $\Delta t = 2$ min), (b) a more refined data set (measurements
every $\Delta t = 1$ min), (c) an even more refined dataset (measurements every $\Delta t = 0.5$ min).
In all cases, after about 10 min, fewer points were collected.
time    Temp         time    Temp         time    Temp
0       190          0       190          0       190
2       176          1       182          0.5     185.5
4       164.6        2       176          1       182
6       155.4        3       169.5        1.5     179.2
8       148          4       164.6        2       176
10      140.9        5       159.8        2.5     172.9

Table 2.3. Partial data for temperature in degrees Fahrenheit for the three graphs
shown in Fig. 2.6. The pairs of columns indicate that the data has been collected at more
and more frequent intervals $h = \Delta t$.

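Solution: In each of the three cases we calculate the ratio $\Delta T/\Delta t$ using successive time
points. We obtain, for $\Delta t = 2, 1, 0.5$ the following average rates of change (in degrees F
per min):
\begin{align*}
\Delta t = 2: &\quad \frac{\Delta T}{\Delta t} = \frac{164.6 - 176}{4 - 2} = -5.7, \\
\Delta t = 1: &\quad \frac{\Delta T}{\Delta t} = \frac{169.5 - 176}{3 - 2} = -6.5, \\
\Delta t = 0.5: &\quad \frac{\Delta T}{\Delta t} = \frac{172.9 - 176}{2.5 - 2} = -6.2.
\end{align*}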
The last of these has been calculated over the smallest time interval, and most closely
represents the rate of change of temperature close to the time t = 2 min. Problem 2(b)


leads to a similar comparison of this sort close to t = 0, and results in a similar set of finer
values for the average rate of change near the initial data point.
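The three ratios can be reproduced with a short loop. In the Python sketch below, each time increment is paired with the first measurement after t = 2 min in the corresponding columns of Table 2.3. The dictionary layout and names are our own choice for illustration.

```python
T_AT_2 = 176  # deg F, temperature at t = 2 min in all three data sets

# time increment (min) -> (time of next measurement after t = 2, its temperature)
next_measurement = {2: (4, 164.6), 1: (3, 169.5), 0.5: (2.5, 172.9)}

rates = {}
for dt, (t_next, T_next) in next_measurement.items():
    # Average rate of change over the interval from t = 2 to t = 2 + dt.
    rates[dt] = (T_next - T_AT_2) / (t_next - 2)

for dt, rate in rates.items():
    print(f"dt = {dt} min: {rate:.1f} deg F/min")
```

Only the first measurement after t = 2 min is needed in each case, because the interval of interest always starts at t = 2.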

2.4.2 Refined data for the height of a falling object


[Figure 2.7: panel (a) "Strobe images for height of a falling ball"; panel (b) "Refined data for height of a falling ball", with height from 0 to 20 plotted against time from 0 to 2.]

Figure 2.7. The height of an object falling under the effect of gravity is shown in
three time sequences. (a) Stroboscopic images. Each sequence starts with t = 0 at the top,
and proceeds to t = 2 at the lowest point. The time interval $\Delta t$ at which data is collected
is refined in the sequences from left to right ($\Delta t = 0.5, 0.2, 0.1$) to get more and more
accurate tracking of the object. (b) Three graphs showing Y versus t with the same time
increments as in (a).
Figure 2.7(a) displays a set of three stroboscopic images combined (for visualization
purposes) on a single graph. Each set of dots shows successive vertical positions of an
object falling from a height of 20 meters over a 2 second time period. In (a) the location
of the ball is given first at intervals of $\Delta t = 0.5$ s, then at intervals of $\Delta t = 0.2$ s,
and finally at intervals of $\Delta t = 0.1$ s. (A strobe flashing five, ten, and twenty times would produce
these three data sets, respectively.) In Fig. 2.7(b), each data set is graphed against time t
(side by side for easy visualization). The distance fallen is still described by the function
$y(t) = ct^2$, as before (see footnote 10). By determining the position of the ball at closer time points,
we can determine the trajectory of the ball as well as its velocity with greater accuracy.
Indeed, the idea of taking smaller and smaller time steps will allow us to define the notion of
instantaneous velocity, and will prove to be a fundamental part of the calculus
approach to quantifying rates of change of natural processes.

Footnote 10: Equivalently, the height of the object as shown in the figure would be described by $Y(t) = Y_0 - ct^2$.

2.4.3 Instantaneous velocity


To arrive at a notion of an instantaneous velocity at some time $t_0$, we consider
average velocities over time intervals $t_0 \le t \le t_0 + h$ that get smaller and smaller. For
example, we might make the strobe flash faster, so that the time between flashes, $\Delta t = h$,
decreases. (We use the notation $h \to 0$ to denote the fact that we are interested in
shrinking the time interval.) At each stage, we calculate an average velocity $\bar{v}$ over the
time interval $t_0 \le t \le t_0 + h$. As the interval between measurements gets smaller, i.e., as the
process of refining our measurements continues, we arrive at a number that we will call the
instantaneous velocity. This number represents the velocity of the ball at the very instant
$t = t_0$.
Definition 2.16 (Instantaneous velocity). The instantaneous velocity at time $t_0$, denoted
$v(t_0)$, is defined as
\[ v(t_0) = \lim_{h \to 0} \bar{v}, \]
where $\bar{v}$ is the average velocity over the time interval $t_0 < t < t_0 + h$. In other words,
\[ v(t_0) = \lim_{h \to 0} \frac{y(t_0 + h) - y(t_0)}{h}. \]

Example 2.17 (Computing an instantaneous velocity) Use Galileo's formula for the distance fallen, (2.1), to compute the instantaneous velocity of a falling object at time $t_0$.

Solution: We have already found the average velocity of the falling object over a time
interval $t_0 < t < t_0 + h$ in Example 2.14, obtaining (2.3),
\[ \bar{v} = c(2t_0 + h). \]
Then, by Definition 2.16,
\[ v(t_0) = \lim_{h \to 0} \bar{v} = \lim_{h \to 0} c(2t_0 + h) = 2ct_0. \]
This result holds for any time $t_0$. More generally, we could write that at time t, the instantaneous velocity is $v(t) = 2ct$. For example, using meters and seconds, where $c = 4.9$ m/s$^2$,
we would find that the velocity of an object 1 s after its initial release is $v(1) = 2 \cdot 4.9 \cdot 1 = 9.8$
m/s.
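
The limiting process in this example can also be watched numerically. The sketch below assumes that formula (2.1) is the distance-fallen law $y(t) = ct^2$ quoted earlier in this chapter, with $c = 4.9$ m/s$^2$; the function names are illustrative, not part of these notes. It computes the average velocity over $t_0 \le t \le t_0 + h$ for shrinking h.

```python
C = 4.9  # m/s^2, the value of c quoted in the text for meters and seconds


def distance_fallen(t, c=C):
    """Distance fallen by time t, y(t) = c t^2 (assumed form of formula (2.1))."""
    return c * t**2


def average_velocity(t0, h, c=C):
    """Average velocity over the time interval t0 <= t <= t0 + h."""
    return (distance_fallen(t0 + h, c) - distance_fallen(t0, c)) / h


t0 = 1.0
for h in [0.5, 0.1, 0.01, 0.001]:
    print(f"h = {h:<6} average velocity = {average_velocity(t0, h):.4f} m/s")

print("limit 2*c*t0 =", 2 * C * t0, "m/s")
```

Each printed value equals $c(2t_0 + h)$ up to rounding, so the averages approach $2ct_0$ as h shrinks.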

2.5 Introduction to the derivative

With the concepts introduced in this chapter, we are ready for the definition of the
derivative.

Section 2.5 Learning goals


1. Follow the first examples of calculation of the derivative.
2. Understand how the derivative is obtained from an average rate of change.
3. Be able to compute the derivative of very simple functions such as $y = x^2$ and
$y = Ax + B$.
Definition 2.18 (The derivative). The derivative of a function y = f(x) at a point $x_0$ is
the same as the instantaneous rate of change of f at $x_0$. It is denoted $\frac{dy}{dx}\Big|_{x_0}$ or $f'(x_0)$
and defined as
\[ \frac{dy}{dx}\Big|_{x_0} = f'(x_0) = \lim_{h \to 0} \frac{f(x_0 + h) - f(x_0)}{h}. \]

Definition 2.19. If y = f(t) is the position of an object at time t, then the derivative $f'(t_0)$
at time $t_0$ is the instantaneous velocity, also simply called the velocity of the object at that
time.
Example 2.20 (Formal calculation of velocity) Use Galileo's formula to set up and calculate the derivative of (2.1), and show that it corresponds to the instantaneous velocity
obtained in Example 2.17.
Solution: We set up the calculation using limit notation, that is, we compute
\[
\begin{aligned}
v(t_0) &= \lim_{h \to 0} \frac{y(t_0 + h) - y(t_0)}{h} \\
&= \lim_{h \to 0} \frac{c(t_0 + h)^2 - c(t_0)^2}{h} \\
&= \lim_{h \to 0} c\,\frac{(t_0^2 + 2ht_0 + h^2) - (t_0^2)}{h} \\
&= \lim_{h \to 0} c\,\frac{2ht_0 + h^2}{h} = \lim_{h \to 0} c(2t_0 + h) = 2ct_0. \qquad (2.4)
\end{aligned}
\]
All steps but the last are similar to the calculation (and algebraic simplification) of average velocity (compare with Example 2.14). In the last step, we formally allow the time
increment h to shrink, which is equivalent to taking $\lim_{h \to 0}$.
Example 2.21 (Calculating the derivative of a function) Compute the derivative of the
function $f(x) = Cx^2$ at some point $x = x_0$.
Solution: In the previous example, we calculated the derivative of the function $y = f(t) = ct^2$
with respect to t. Here we merely have a similar (quadratic) function of x. Thus, we
have already solved this problem. By switching notation ($t_0 \to x_0$ and $c \to C$) we can
write down the answer, $2Cx_0$, at once. However, as practice, we can rewrite the steps in the
case of the general point x.
For $y = f(x) = Cx^2$ we have
\[
\begin{aligned}
\frac{dy}{dx} &= \lim_{h \to 0} \frac{f(x + h) - f(x)}{h} \\
&= \lim_{h \to 0} \frac{C(x + h)^2 - Cx^2}{h} \\
&= \lim_{h \to 0} C\,\frac{(x^2 + 2xh + h^2) - x^2}{h} \\
&= \lim_{h \to 0} C\,\frac{(2xh + h^2)}{h} = \lim_{h \to 0} C(2x + h) = 2Cx.
\end{aligned}
\]
Evaluating this result for $x = x_0$ we obtain the answer $2Cx_0$.
We recognize from this definition that the derivative is obtained by starting with the
slope of a secant line (the average rate of change of f over the interval $x_0 < x < x_0 + h$)
and then shrinking the interval ($\lim_{h \to 0}$) so that it approaches a single point ($x_0$). The
resulting line is called the tangent line, and the value obtained is
the instantaneous rate of change of the function with respect to the variable x at the
point of interest, $x_0$. Another notation used for the derivative is
\[ \frac{df}{dx}\Big|_{x_0}. \]
We will explore properties and meanings of this concept in the next chapter.

Exercises
2.1. Heating milk: Consider the data gathered for heating milk in Table 2.1 and Fig. 2.1(a).
(a) Estimate the slope and the intercept of the straight line shown in the figure and
use this to write down the equation of this line. According to this approximate
straight line relationship, what is the average rate of change of the temperature
over the 5 min interval shown?
(b) Find a pair of points such that the average rate of change of the temperature is
smaller than your result in part (a).
(c) Find a pair of points such that the average rate of change of the temperature is
greater than your result in part (a).
(d) Milk boils at 212°F, and the recipe for yoghurt calls for avoiding a temperature
this high. Use your common knowledge to explain why the data for heating
milk is not actually linear.
2.2. Refining the data: Table 2.3 shows some of the data for cooling milk that was
collected and plotted in Fig. 2.6. Answer the following questions.
(a) Use the above table to determine the average rate of change of the temperature
over the first 10 min.
(b) Compute the average rate of change of the temperature over the intervals 0 <
t < 2, 0 < t < 1 and 0 < t < 0.5.
(c) Which of your results in (b) would be closest to the instantaneous rate of
change of the temperature at t = 0?
2.3. Height and distance dropped: We have defined the variable Y(t) as the height of the
object at time t and the variable y(t) as the distance dropped by time t. State the
connection between these two variables for a ball whose initial height is $Y_0$. How is
the displacement over some time interval a < t < b related between these two ways
of describing the motion? (Assume that the ball is in the air throughout this time
interval.)
2.4. Height of a ball: The vertical height of a ball, Y (in meters) at time t (seconds)
after it was thrown upwards was found to satisfy $Y(t) = 14.7t - (1/2)gt^2$ where
$g = 9.8$ m/s$^2$ for the first 3 seconds of its motion.
(a) What happens after 3 seconds?
(b) What is the average velocity of the ball between the times t = 0 and t = 1
second?
2.5. Falling ball: A ball is dropped from height $Y_0 = 490$ meters above the ground. Its
height, Y, at time t is known to follow the relationship $Y(t) = Y_0 - \frac{1}{2}gt^2$ where
$g = 9.8$ m/s$^2$.
(a) Find the average velocity of the falling ball between t = 1 and t = 2 seconds.
(b) Find the average velocity between t sec and $t + \epsilon$, where $0 < \epsilon < 1$ is some
small time increment. (Assume that the ball is in the air during this time interval.)
(c) Determine the time at which the ball hits the ground.
2.6. Average velocity at time t: A ball is thrown from the top of a building of height $Y_0$.
The height of the ball at time t is given by
\[ Y(t) = Y_0 + v_0 t - \frac{1}{2}gt^2 \]
where $Y_0$, $v_0$, g are positive constants. Find the average velocity of the ball for the
time interval $0 \le t \le 1$, assuming that it is in the air during this whole time interval.
Express your answer in terms of the constants given in the problem.
2.7. Tuna average velocity:
Find the average velocity of Tuna 1 over each of the time intervals shown in Table 2.2, that is for $0 \le t \le 5$ hr, $5 \le t \le 10$ hr, etc.
2.8. Average velocity and secant line: The two points on Figure 2.5 through which the
secant line is drawn are (1.3, 8.2810) and (1.4, 9.6040). Find the average velocity
over this time interval and then write down the equation of the secant line.
2.9. Human Population Growth: Table 2.4 gives data for the human population (in
billions) over recorded history (with some estimates where data was not available).
year    human population (billions)
1       0.2
1000    0.275
1500    0.45
1650    0.5
1750    0.7
1804    1
1850    1.2
1900    1.6
1927    2
1950    2.55
1960    3
1980    4.5
1987    5
1999    6
2011    7
2020    7.7

Table 2.4. The human population (billions) over the years AD 1 to AD 2020.

(a) Plot the human population (in billions) versus time (in years) using graphing
software of your choice.
(b) Determine the average rate of change of the human population over the successive time intervals.
(c) Plot the average rate of change versus time (in years) and determine over what
time interval that average rate of change was greatest.

(d) Over what period (i.e. time interval) was this average rate of change increasing
most rapidly? (Hint: you should be able to answer this question either by
looking at the graph you have drawn or by calculation.)

2.10. Average rate of change: A certain function takes values given in the table below.
t       0     0.5   1.0   1.5   2.0
f(t)    0     1     0     -1    0
Find the average rate of change of the function over the intervals
(a) 0 < t < 0.5,
(b) 0 < t < 1.0,
(c) 0.5 < t < 1.5,
(d) 1.0 < t < 2.0.
2.11. Consider the functions $f_1(x) = x$, $f_2(x) = x^2$, $f_3(x) = x^3$.
Find the average rate of change of these functions over each of the following intervals.
(a) Over $0 \le x \le 1$.
(b) Over $-1 \le x \le 1$.
(c) Over $0 \le x \le 2$.

2.12. Find the average rate of change for each of the following functions over the given
interval.
(a) $y = f(x) = 3x - 2$ from x = 3.3 to x = 3.5.

(b) $y = f(x) = x^2 + 4x$ over [0.7, 0.85].

(c) y = x4 and x changes from 0.75 to 0.5.


2.13. Trig Minireview: Consider the following table of values of the trigonometric functions sin(x) and cos(x):
x          sin(x)          cos(x)
$0$        $0$             $1$
$\pi/6$    $1/2$           $\sqrt{3}/2$
$\pi/4$    $\sqrt{2}/2$    $\sqrt{2}/2$
$\pi/3$    $\sqrt{3}/2$    $1/2$
$\pi/2$    $1$             $0$
Find the average rates of change of the given function over the given interval. Express your answer in terms of square roots and $\pi$. Do not compute decimal expressions.
(a) Find the average rate of change of sin(x) over $0 \le x \le \pi/4$.

(b) Find the average rate of change of cos(x) over $\pi/4 \le x \le \pi/3$.

(c) Is there an interval over which the functions sin(x) and cos(x) have the same
average rate of change? (Hint: consider the graphs of these functions over one
whole cycle, e.g. for $0 \le x \le 2\pi$. Where do they intersect?)

2.14.
(a) Consider the function $y = f(x) = 1 + x^2$. Consider the point (1, 2) on its
graph and some point nearby, for example $(1 + h, 1 + (1 + h)^2)$. Find the slope
of a secant line connecting these two points.
(b) The slope of a tangent line to y = f(x) is the derivative $f'(x)$. Use the slope
you calculated in (a) to figure out what the slope of the tangent line to the curve
at (1, 2) would be.
(c) Find the equation of the tangent line through the point (1, 2).
2.15. Given the function $y = f(x) = 2x^3 + x^2 - 4$, find the slope of the secant line
joining the points (4, f(4)) and (4 + h, f(4 + h)) on its graph, where h is a small
positive number. Then find the slope of the tangent line to the curve at (4, f(4)).
2.16. Average rate of change: Consider the function $f(x) = x^2 - 4x$ and the point
$x_0 = 1$.
(a) Sketch the graph of the function.
(b) Find the average rate of change over the intervals $[1, 3]$, $[-1, 1]$, $[1, 1.1]$, $[0.9, 1]$
and $[1 - h, 1]$, where h is some small positive number.
(c) Find $f'(1)$.

2.17. Given $y = f(x) = x^2 - 2x + 3$.

(a) Find the average rate of change over the interval [2, 2 + h].

(b) Find $f'(2)$.

(c) Using only the information from (a), (b) and f(2) = 3, approximate the value
of y when x = 1.99, without substituting x = 1.99 into f(x).
2.18. Find the average rates of change of the given function over the given interval. Express your
answer in terms of square roots and $\pi$. Do not compute the decimal expressions.
(a) Find the average rate of change of tan(x) over $0 \le x \le$ … (Hint: $\tan(x) = \frac{\sin(x)}{\cos(x)}$.)
(b) Find the average rate of change of cot(x) over … (Hint: $\cot(x) = \frac{\cos(x)}{\sin(x)}$.)
2.19.

(a) Find the slope of the secant line to the graph of y = 2/x between the points
x = 1 and x = 2.
(b) Find the average rate of change of y between x = 1 and $x = 1 + \epsilon$, where $\epsilon > 0$
is some positive constant.
(c) What happens to this slope as $\epsilon \to 0$?

(d) Find the equation of the tangent line to the curve y = 2/x at the point x = 1.
2.20. For each of the following motions where s is measured in meters and t is measured
in seconds, find the velocity at time t = 2 and the average velocity over the given
interval.
(a) $s = 3t^2 + 5$ and t changes from 2 to 3 s.
(b) $s = t^3 - 3t^2$ from t = 3 s to t = 5 s.
(c) $s = 2t^2 + 5t - 3$ on [1, 2].

2.21. The velocity v of an object attached to a spring is given by $v = A \sin(\omega t + \phi)$,
where A, $\omega$ and $\phi$ are constants. Find the average change in velocity (acceleration)
of the object for the time interval $0 \le t \le \frac{2\pi}{\omega}$.
2.22. Use the definition of derivative to calculate the derivative of the function
\[ f(x) = \frac{1}{x + 1} \]
(intermediate steps required).

Chapter 3

Three faces of the derivative: geometric, analytic, and computational

In Chapter 2 we used the concept of average rate of change (slope of secant line) to motivate
and then define the notion of an instantaneous rate of change (the derivative). We arrived
at a recipe for calculating the derivative algebraically. We thereby introduced the idea of
limits, a concept that merits further discussion. We take up some of the technical matters
related to limits to enable us to calculate the derivative algebraically.
Before doing so, we consider a distinct approach, which is geometric in flavour.
Namely, we show that the local behaviour of a continuous function is described by a tangent
line at a point on its graph. We arrive at that line by zooming into the graph of the function.
This duality, the geometric (graphical) and analytic (algebraic calculation) views, will
form an important theme throughout the discussions to follow; the two views are complementary
but closely related approaches to the calculus.

3.1 The geometric view: Zooming into the graph of a function

Section 3.1 Learning goals


1. Understand the connection between the local behaviour of a function (seen by zooming into the graph) at a point and the tangent line to the graph of the function at that
point.
2. Given the graph of a function, be able to sketch its derivative.

3.1.1 Locally, the graph of a function looks like a straight line


In this section we consider well-behaved functions whose graphs are smooth, in contrast
to the discrete data points of Chapter 2. We connect the derivative to the local shape of
the graph of a given function. By local behaviour we mean the behaviour seen when we
magnify the graph by zooming into a point. Imagine using a high-powered magnifying
glass or a microscope. The center of the field of vision is the point of interest. As we zoom
in, the curves in the graph tend to vanish, and the microscopic view looks more and more
like a straight line.
Definition 3.1 (Tangent line). The straight line that we see when we zoom into the graph
of a smooth function at some point x is called the tangent line to the graph of the function
at that point.
Definition 3.2 (Geometric definition of the derivative). The slope of the tangent line at
the point x is called the derivative of the function at the given point.

Figure 3.1. Zooming in on the graph of the function $y = f(x) = x^3 - x$ at the
point x = 1.5. As we zoom in, we see that locally, the graph looks like a straight line.
We refer to this line as the tangent line, and its slope is the derivative of the function at that
point.

Example 3.3 (Zoom 1) Consider the function $y = f(x) = x^3 - x$ and the point x = 1.5.
Find the tangent line to the graph of this function by zooming into the given point.
Solution: The graph of the function is shown in Figure 3.1(a), where we have indicated
the point of interest with a red dot. Now zoom in, and magnify the graph, centered on the
given point. Eventually, as we zoom in, the hills and valleys on the graph disappear off
screen, and locally, the graph resembles a straight line.
The slope we observe in our zoomed-in view will depend on the point of interest,
meaning that the derivative will vary from place to place. For this reason, the derivative,
denoted $f'(x)$, is itself a function of x.
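
Zooming in can be imitated numerically by shrinking a window centred on the point of interest and measuring the slope of the line that joins the two edges of the window. The sketch below does this for the function of Example 3.3 at x = 1.5; the helper `slope_across_window` and the chosen window sizes are illustrative assumptions, not part of these notes.

```python
def f(x):
    """The function of Example 3.3, y = x^3 - x."""
    return x**3 - x


def slope_across_window(func, x0, half_width):
    """Slope of the line joining the two edges of a window centred at x0."""
    left, right = x0 - half_width, x0 + half_width
    return (func(right) - func(left)) / (right - left)


x0 = 1.5
for half_width in [1.0, 0.1, 0.01, 0.001]:
    slope = slope_across_window(f, x0, half_width)
    print(f"window half-width {half_width:<6} slope = {slope:.6f}")
```

As the window narrows, the printed slopes settle toward a single value (close to 5.75), which is the slope of the straight line seen in the zoomed-in view.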
Example 3.4 (Zooming into the sine graph at the origin) Determine the derivative of the
function y = sin(x) at x = 0 by zooming into the origin on the graph of this function. Then
write down the equation of the tangent line at that point.
Solution: In Figure 3.2 we show a zoom into the graph of the function y = sin(x)
at the point x = 0. The sequence of zooms leads to a straight line (far right panel) that
we identify once more as the tangent line to the function at x = 0. From the graph it is
apparent that the slope of this tangent line is 1. We say that the derivative of the function
y = f(x) = sin(x) at x = 0 is 1, and write $f'(0) = 1$ to denote this fact. As this line goes
through (0, 0) and has slope 1, its equation is simply y = x. We can also say that close to
x = 0 the graph of y = sin(x) looks a lot like the line y = x.

Figure 3.2. Zooming into the graph of the function y = f(x) = sin(x) at the
point x = 0. Eventually, the graph resembles a line of slope 1. This is the tangent line at
x = 0 and its slope, the derivative of y = sin(x) at x = 0, is 1.

3.1.2 At a cusp or a discontinuity, the derivative is not defined

Figure 3.3. If we zoom into a function at a cusp, there is no one straight line
that describes local behaviour. No matter how far we zoom in, we see two distinct lines
meeting at a sharp corner. We say that the function has no tangent line at a cusp and
the derivative is not defined at that point.

3.1.3 From the graph of a function, we can sketch its derivative

The tangent line to the graph of a function varies from point to point along the graph of the
function: what we see when zooming in depends on the point at which we zoom in. This
is equivalent to saying that the derivative $f'(x)$ is, itself, also a function. Here we consider
the connection between these two functions by using the graph of one to sketch the other.
The sketch is meant to be approximate, but will contain some important elements.

Figure 3.4. The graph of a function. We will sketch its derivative.


Example 3.5 Consider the function whose graph is shown in Fig. 3.4. Reason about the
tangent lines at various points along this curve to arrive at a sketch of the derivative $f'(x)$.

Figure 3.5. Sketching the derivative of a function. (Panel labels: f(x); Tangents; Slopes; f'(x).)

Solution: In Figure 3.5 we start by sketching a number of tangent lines along the graph of
f(x). Pay special attention to the slopes (rather than height, length, or any other property)
of these dashes. Copying these lines in a row along the direction of the x axis, we estimate
their slopes with rather crude numerical values.
We notice that the slopes start out positive, decrease to zero, become negative, and
then increase again through zero back to positive values. (We see precisely two dashes that
are horizontal, and so have slope 0.) Next, we plot the numerical values (for slopes) that
we have recorded on a new graph. This is the beginning of the graph of the derivative,
$f'(x)$. Only a few points have been plotted in our figure of $f'(x)$; we could add other
values if we so chose, but the trend is fairly clear: the derivative function has two zeros
(places of intersection with the x axis). It dips down below the axis between these places.
In Figure 3.5 we show the original function f(x) and its derivative $f'(x)$. We have aligned
these graphs so that the slope of f(x) matches the value of $f'(x)$ shown directly below.
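
The procedure of this example (draw short tangent dashes, record their slopes, then plot the slopes) can be mimicked in a few lines of code. The curve in Fig. 3.4 is given only as a picture, so the sketch below substitutes a hypothetical function, $g(x) = x^3 - 3x$, that has one hill and one valley; the function, the sample points, and the helper names are all assumptions made for illustration.

```python
def g(x):
    """Hypothetical smooth curve with one hill and one valley (not Fig. 3.4)."""
    return x**3 - 3 * x


def dash_slope(func, x, half_length=1e-4):
    """Slope of a short tangent-like dash centred at x."""
    return (func(x + half_length) - func(x - half_length)) / (2 * half_length)


def sign_symbol(value, tolerance=1e-6):
    """Return '+', '0' or '-' according to the sign of the slope."""
    if abs(value) < tolerance:
        return "0"
    return "+" if value > 0 else "-"


sample_points = [-2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0]
slopes = [dash_slope(g, x) for x in sample_points]
signs = [sign_symbol(s) for s in slopes]

for x, s, sym in zip(sample_points, slopes, signs):
    print(f"x = {x:5.1f}   slope = {s:7.3f}   sign: {sym}")
```

For this stand-in curve the recorded slopes start positive, pass through zero, become negative, and return through zero to positive values, which is the same qualitative pattern described in the solution.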
Example 3.6 Sketch the derivative of the function shown in Fig. 3.6.

Figure 3.6. Sketching the derivative of a function

Solution: See Fig. 3.7 for the entire process, and note that this time, we represent only the
sign of the derivative (positive or negative) and places where it is zero. The thin vertical
lines demonstrate that $f'(x) = 0$ coincides with the tops of hills or bottoms of valleys on
the graph of the function f(x).

3.1.4 Constant and linear functions and their derivatives


Example 3.7 (Derivative of y = C) Use a geometric argument to determine the derivative
of the function y = f (x) = C at any point x0 on its graph.
Solution: The graph of this function is a horizontal straight line, whose slope is zero everywhere. Thus
zooming in at any point x leads to the same result, so the derivative is 0 everywhere.
Example 3.8 (Derivative of y = Bx)
Solution: The function y = Bx is a straight line of slope B. At any point on its graph, it
has the same slope, B. Thus the derivative is equal to B at any point on the graph of this
function.

Figure 3.7. Sketching the derivative of a function. (Panel labels: Function y = f(x); Tangent lines; Slopes (+, 0, -); Derivative y = f'(x).)


The reader will notice that in the above two examples, we have thereby found the
derivative for the two power functions, $y = x^0$ and $y = x^1$. We summarize:
The derivative of any constant function is zero. The derivative of the
function y = x is 1.

3.1.5 Molecular motors


Molecular motors are proteins that can move along molecular tracks inside living cells.
Here we will be concerned with the large cellular highway system, composed of large
straight tracks called microtubules. These structures form dynamically inside all cells,
but are especially pronounced in neurons, long cells responsible for our sensations and for
signaling to muscles. The length of neurons can span a meter, and thus, transporting substances from one end of the cell to the other is a real challenge. This is accomplished by
two kinds of molecular motors, kinesin and dynein. The former (represented in Fig. 3.8(a)
by the letter k) walks towards one end of the microtubule (the so-called plus end) and
dynein () walks toward the opposite (minus) end. Kinesin is roughly 5 times as powerful as dynein, and the two can simultaneously attach to cargo and play a tug of war
game. Cargo is generally a vesicle (a package of cellular substances surrounded by a lipid
envelope), so vesicles are often seen to move erratically backwards and forwards along
microtubular tracks as motors attach and detach randomly. We represent vesicles by the
spherical shapes in Fig. 3.8(a), and show the relative velocities of several vesicles: one attached
to a single dynein (moving to the minus end), one attached to both dynein and kinesin (moving
right, since kinesin is more powerful), and one attached to a single kinesin (rapidly moving to the plus end).
In Example 3.9, we study a sample of a vesicle track (displacement y over time t) and
decipher what caused this motion.

Figure 3.8. Molecular motors: displacement and velocity

Example 3.9 (Motion of molecular motors) Consider the displacement y(t) of a vesicle
shown in Fig. 3.8(b). Sketch the corresponding instantaneous velocity v(t) for the vesicle
and use your sketch to explain how many motors of each type could have been attached
over each of the time intervals in the plot.
Solution: The plot in Fig. 3.8(b) consists of straight line segments with sharp corners
(cusps). Over each of these line segments, the slope dy/dt, which corresponds to the
instantaneous velocity, v(t), is constant. Segments with positive slope (the first, the last
two) correspond to times when the velocity was positive, when the vesicle was moving
toward the plus end. Over times where the slope is negative, the vesicle moved toward
the minus end of the microtubule. Where the slope is zero (flat graph), the vesicle was
stationary. In Fig. 3.8(c), we sketch the rough graph of the instantaneous velocity, v(t).
Observe that v(t), which is the derivative of y(t), is not defined at the points where y(t)
has corners. The larger the slope, the faster the velocity.
Based on Fig. 3.8(c), we can surmise which motors were attached. At first, the fast
positive velocity implies that the strong motor, kinesin, was at work. Once it got accidentally detached, the vesicle was stationary (0 motors). The motion in the opposite direction
(negative velocity) suggests that 1-2 dyneins were then attached and working. After some
time, kinesin was also bound, pulling more forcefully towards the plus end. Finally, only kinesin remained bound, and the velocity was positive and fast. We summarize this sequence
of events in Fig. 3.8(d).

3.2 Analytic view: calculating the derivative


Section 3.2 Learning goals
1. Be able to explain the definition of a continuous function, and describe functions with various
types of discontinuities.
2. Understand how to evaluate simple limits of rational functions.
3. Be able to calculate the derivative of a simple function using the definition of the
derivative, Definition 2.18.

3.2.1 Technical matters: continuous functions and limits


We have studied two distinct types of functions so far. In Chapter 2, we found examples of
discrete data points, where a variable of interest (e.g. temperature) was defined only at a
finite set of time points. We also encountered continuous functions, such as the height of a
falling ball (2.1). Intuitively, we could describe a continuous function by saying that every
point of its graph is connected to neighboring points. But using the concept of limits we
can be more accurate.
Definition 3.10 (Continuous function). We say that y = f(x) is continuous at a point
x = a in its domain if
\[ \lim_{x \to a} f(x) = f(a). \]
By this we mean that the function is defined at x = a, that the above limit exists, and that
it matches the value that the function takes at the given point.
The function (2.1), for example, is continuous for all values of t, whereas a function
such as $y = \sqrt{x}$ is defined and continuous only for $x \ge 0$. The function y = 1/(x + 1) is
defined and continuous for $x \ne -1$.
As discussed in Section 2.5, calculating a derivative requires the use of limits. The
two statements below emphasize this fact.
1. A secant line connects two points on the graph of a function (e.g. x and x + h). In
the limit that those points get closer together ($h \to 0$), we obtain a tangent line.
2. The slope of a secant line is an average rate of change, but in the limit ($h \to 0$), we
obtain an instantaneous rate of change, which is the slope of the tangent line, namely
the derivative.
Tangent lines and derivatives are only defined at points where a function is continuous
and smooth (has no cusps). Here we briefly discuss the idea of points of discontinuity and
illustrate how limits are calculated, in preparation for understanding the way that derivatives
are computed.
Function with a hole in its graph
Consider a function of the form
\[ f(x) = \frac{(x - a)^2}{(x - a)}. \]
Then if $x \ne a$, we can cancel a common factor, and obtain $(x - a)$. If x = a, the function
is not defined (0/0). In short, we have
\[ f(x) = \frac{(x - a)^2}{(x - a)} = \begin{cases} x - a & x \ne a, \\ \text{undefined} & x = a. \end{cases} \]
Even though the function is not defined at x = a, we can evaluate the limit of f as x
approaches a. We write
\[ \lim_{x \to a} f(x) = \lim_{x \to a} \frac{(x - a)^2}{(x - a)} = \lim_{x \to a} (x - a) = 0, \]
and say that the limit as x approaches a exists and is equal to 0. We also say that the
function has a removable discontinuity. If we add the point (a, 0) to the set of points at
which the function is defined then we obtain a continuous function identical to the function
$x - a$. See also Appendix D.
Function with jump discontinuity
Consider the function
\[ f(x) = \begin{cases} -1 & x \le a, \\ 1 & x > a. \end{cases} \]
We say that the function has a jump discontinuity at x = a. As we approach the point
of discontinuity we observe that the function has two distinct values, depending on the
direction of approach. We formally capture this observation using right and left hand
limits,
\[ \lim_{x \to a^-} f(x) = -1, \qquad \lim_{x \to a^+} f(x) = 1. \]
Since the left and right limits are unequal, we say that the limit does not exist.
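
The two one-sided limits can be illustrated by evaluating the jump function at points that approach a from each side. In the sketch below the jump location a = 2 is a hypothetical choice (the text leaves a unspecified), and the helper `one_sided_values` is an illustrative name, not part of these notes.

```python
A = 2.0  # hypothetical location of the jump; the text leaves a unspecified


def jump(x, a=A):
    """The jump function from the text: -1 for x <= a, and 1 for x > a."""
    return -1.0 if x <= a else 1.0


def one_sided_values(func, a, side, steps=(0.1, 0.01, 0.001, 0.0001)):
    """Values of func at points approaching a from the 'left' or the 'right'."""
    sign = -1.0 if side == "left" else 1.0
    return [func(a + sign * step) for step in steps]


left_values = one_sided_values(jump, A, "left")
right_values = one_sided_values(jump, A, "right")
print("approaching from the left: ", left_values)
print("approaching from the right:", right_values)
```

However small the step, the values on the left stay at -1 and those on the right stay at 1, so no single number serves as the limit at a.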
Function with blow up discontinuity
Consider the function
\[ f(x) = \frac{1}{x - a}. \]
Then as x approaches a, the denominator approaches 0, and the value of the function goes
to $\pm\infty$. We say that the function blows up at x = a and that the limit $\lim_{x \to a} f(x)$ does
not exist (DNE). We also write
\[ \lim_{x \to a} f(x) = \pm\infty \]
to denote the same thing.


Figure 3.9 illustrates the differences between functions that are continuous everywhere, those that have a hole in their graph, and those that have a jump discontinuity or a
blow up at some point a.

Figure 3.9. Left to right: a continuous function, a function with a removable
discontinuity, a jump discontinuity, and a function with blow up discontinuity.

Examples of Limits
We now examine several examples of computations of limits. More details about properties
of limits are provided in Appendix D.
By Definition 3.10, to calculate the limit of any function at a point of continuity, we
simply evaluate the function at the given point.
Example 3.11 (Simple limit of a continuous function) Find the following limits:
(a) $\lim_{x \to 3} (x^2 + 2)$
(b) $\lim_{x \to 1} \frac{1}{x + 1}$
(c) $\lim_{x \to 10} \frac{x}{1 + x}$.

Solution: In each of the above, the function is continuous at the point of interest (at x =
3, 1, 10, respectively). Thus, we simply plug in the values of x in each case to obtain
(a) $\lim_{x \to 3} (x^2 + 2) = 3^2 + 2 = 11$
(b) $\lim_{x \to 1} \frac{1}{x + 1} = \frac{1}{2}$
(c) $\lim_{x \to 10} \frac{x}{1 + x} = \frac{10}{11}$.


Example 3.12 (Hole in graph limits) Calculate the limits of the following functions. Note
that each has a removable discontinuity (a hole in its graph).
(a) $\lim_{x \to 3} \frac{x^2 - 6x + 9}{x - 3}$
(b) $\lim_{x \to -1} \frac{x^2 + 3x + 2}{x + 1}$.

Solution: We first simplify algebraically by factoring the numerator, and then evaluate the
limit. Note that the simplification is possible so long as we evaluate the limit, rather than
the actual function, at the point of discontinuity.
(a) $\lim_{x \to 3} \frac{x^2 - 6x + 9}{x - 3} = \lim_{x \to 3} \frac{(x - 3)^2}{(x - 3)} = \lim_{x \to 3} (x - 3) = 0$
(b) $\lim_{x \to -1} \frac{x^2 + 3x + 2}{x + 1} = \lim_{x \to -1} \frac{(x + 1)(x + 2)}{(x + 1)} = \lim_{x \to -1} (x + 2) = 1$.

Example 3.13 (Limit involving sin(x)) Use the observation made in Example 3.4 to arrive at the value of the following limit:
\[ \lim_{x \to 0} \frac{\sin(x)}{x} \]
Solution: Example 3.4 illustrated the fact that close to x = 0 the function sin(x) has the
following behaviour:
\[ \sin(x) \approx x, \quad \text{or} \quad \frac{\sin(x)}{x} \approx 1. \]
This is equivalent to the result
\[ \lim_{x \to 0} \frac{\sin(x)}{x} = 1. \tag{3.1} \]
We read this as "as x approaches zero, the limit of sin(x)/x is 1." We will find this limit useful
in later calculations involving derivatives of trigonometric functions.
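The limit (3.1) can also be explored numerically. The following is a minimal sketch (not part of the original notes) that evaluates sin(x)/x for smaller and smaller x, with x measured in radians; the function name `sinc_ratio` and the particular sample points are illustrative choices.

```python
import math

def sinc_ratio(x):
    """Return sin(x)/x for a nonzero x (x in radians)."""
    return math.sin(x) / x

# Sample points approaching 0 from the right: 0.1, 0.01, ..., 0.00001.
xs = [10.0 ** (-k) for k in range(1, 6)]
ratios = [sinc_ratio(x) for x in xs]

for x, r in zip(xs, ratios):
    print(f"x = {x:<8g}  sin(x)/x = {r:.10f}")
```

The printed ratios creep up toward 1 as x shrinks, which is consistent with (3.1). A table like this suggests the value of a limit but does not prove it.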


3.2.2 Computing the derivative


We now apply the techniques so far to calculating a few derivatives based on the definition
of the derivative. We start with several simple examples, and work our way up to more
interesting cases.
Example 3.14 (Derivative of a linear function) Using Definition 2.18 of the derivative,
compute the derivative of the function y = f (x) = Bx + C.
Solution: We have already used a geometric approach to find the derivative of related
functions in Examples 3.7-3.8. Here we do the formal calculation as follows:
\[
\begin{aligned}
f'(x) &= \lim_{h \to 0} \frac{f(x + h) - f(x)}{h} && \text{(start with the definition)} \\
&= \lim_{h \to 0} \frac{[B(x + h) + C] - [Bx + C]}{h} && \text{(apply it to the function)} \\
&= \lim_{h \to 0} \frac{Bx + Bh + C - Bx - C}{h} && \text{(expand the numerator)} \\
&= \lim_{h \to 0} \frac{Bh}{h} && \text{(simplify)} \\
&= \lim_{h \to 0} B && \text{(cancel a factor of h)} \\
&= B && \text{(evaluate the limit)}
\end{aligned}
\tag{3.2}
\]
Hence, we confirmed that the derivative of f(x) = Bx + C is f'(x) = B. This agrees with
the sum of the derivatives of the two parts, Bx and C found in Examples 3.7-3.8. Indeed,
as we will establish shortly, the derivative of the sum of two functions is the same as the
sum of their derivatives.
Example 3.15 (Derivative of the cubic power function) Compute the derivative of the function $y = f(x) = Kx^3$.
Solution: For $y = f(x) = Kx^3$ we have
\[
\begin{aligned}
\frac{dy}{dx} &= \lim_{h \to 0} \frac{f(x + h) - f(x)}{h} \\
&= \lim_{h \to 0} \frac{K(x + h)^3 - Kx^3}{h} \\
&= \lim_{h \to 0} K \frac{(x^3 + 3x^2 h + 3xh^2 + h^3) - x^3}{h} \\
&= \lim_{h \to 0} K \frac{(3x^2 h + 3xh^2 + h^3)}{h} \\
&= \lim_{h \to 0} K(3x^2 + 3xh + h^2) \\
&= K(3x^2) = 3Kx^2.
\end{aligned}
\tag{3.3}
\]
Thus the derivative of $f(x) = Kx^3$ is $f'(x) = 3Kx^2$.


Example 3.16 Use the definition of the derivative to compute f'(x) for the function y =
f(x) = 1/x at the point x = 1.
Solution: We write down the formula for this calculation at any point x and then simplify
algebraically, using common denominators to combine fractions, and then, in the final step,
calculate the limit formally. Then we substitute the value x = 1.
\[
\begin{aligned}
f'(x) &= \lim_{h \to 0} \frac{f(x + h) - f(x)}{h} && \text{(the definition)} \\
&= \lim_{h \to 0} \frac{\frac{1}{(x + h)} - \frac{1}{x}}{h} && \text{(applied to the function)} \\
&= \lim_{h \to 0} \frac{\frac{[x - (x + h)]}{x(x + h)}}{h} && \text{(common denominator)} \\
&= \lim_{h \to 0} \frac{-h}{h\,x(x + h)} && \text{(algebraic simplification)} \\
&= \lim_{h \to 0} \frac{-1}{x(x + h)} && \text{(cancel factor of h)} && \text{(3.4)} \\
&= \frac{-1}{x^2} && \text{(limit evaluated)} && \text{(3.5)}
\end{aligned}
\]
Thus, the derivative of f(x) = 1/x is $f'(x) = -1/x^2$ and at the point x = 1 it takes the
value f'(1) = -1.
In Problem 10 we apply similar techniques to the derivative of the square-root function to show that
\[ y = f(x) = \sqrt{x} \quad \Rightarrow \quad f'(x) = \frac{1}{2\sqrt{x}}. \tag{3.6} \]
In the next chapter, we formalize some observations about derivatives of power functions
and rules of differentiation. This will allow us to avoid such tedious calculations in finding
simple derivatives.

3.3 Computational face of the derivative: software to the rescue!

Mathematical computations and analysis benefit greatly from software methods that help to
complement the algebraic and geometric approaches. Using a variety of software, we can
gain insight, experiment, as well as design methods for accurate computations that would be
too tedious to carry out by hand. Here we show a third face of the derivative: its numerical
implementation using a simple spreadsheet. This easy introduction to a spreadsheet will
later help us devise techniques for solving a variety of problems where many repetitive
calculations are involved.


Section 3.3 Learning goals


1. Appreciate the fact that software can numerically compute an approximation to the
derivative.
2. Understand that the approximation replaces a (true) tangent line with an (approximating) secant line.
3. Be able to use a spreadsheet (or your favorite software aid) to graph a function and
its derivative.
4. Be able to explain verbally how the shape of the derivative is connected with the shape
of the original function.
5. Interpret the differences between two types of biochemical kinetics, Michaelis-Menten and Hill function.
Definition 3.17 (Numerical derivative). By a numerical derivative calculation we mean a
numerical approximation to the value of the derivative, obtained by using a finite value of Δx,
\[ f'(x)_{\text{numerical}} \approx \frac{\Delta f}{\Delta x} \]
rather than the actual value
\[ f'(x)_{\text{actual}} = \lim_{\Delta x \to 0} \frac{\Delta f}{\Delta x}. \]

The numerical derivative approximates the true derivative provided Δx is small relative to the range over which the function f changes dramatically. Since Δf is the difference
of two values of f (Δf = f(x + Δx) - f(x)), it follows that the numerical derivative is the
same as the slope of a secant line. This important realization, associated with LG 2, means
that a secant line is often used to approximate a tangent line, and the slope of a secant line
is used to approximate a derivative in numerical computations. We will see this idea again
in several contexts.
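To make Definition 3.17 concrete, here is a minimal sketch of the secant-slope approximation, applied to f(x) = 1/x at x = 1, where Example 3.16 found the exact derivative to be -1. The helper name `numerical_derivative` and the chosen step sizes are illustrative assumptions, not part of the notes.

```python
def numerical_derivative(f, x, dx):
    """Slope of the secant line through (x, f(x)) and (x + dx, f(x + dx))."""
    return (f(x + dx) - f(x)) / dx

def f(x):
    return 1.0 / x

exact = -1.0  # f'(1) = -1/x^2 evaluated at x = 1 (Example 3.16)
errors = []
for dx in (0.1, 0.01, 0.001):
    approx = numerical_derivative(f, 1.0, dx)
    errors.append(abs(approx - exact))
    print(f"dx = {dx:<6g} secant slope = {approx:.6f}  error = {errors[-1]:.6f}")
```

The error shrinks roughly in proportion to Δx, which illustrates why a small step is needed for the secant slope to stand in for the tangent slope.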

3.3.1 Concentration-dependent rate of chemical reaction


In Section 1.5, we studied two types of enzyme-catalyzed reactions, Michaelis-Menten
kinetics, (1.7) and Hill functions, (1.6). Using c for the substrate concentration, vMM and
vHill for the reaction rates, we repeat their expressions here:
\[ v_{MM} = \frac{Kc}{a + c}, \tag{3.7} \]
\[ v_{Hill} = \frac{Kc^n}{a^n + c^n}. \tag{3.8} \]
In both cases, the reaction rate v depends on the chemical concentration c. In the left panel
of Fig. 3.10 we plot both functions (3.7) and (3.8) on the same coordinate system. (See
also Fig. 1.5 for the same kind of plot.)
We ask "how does the reaction speed change when we vary the substrate concentration?" or equivalently, "For each small increase in c, by how much does v increase?" This
question is equivalent to the theoretical question: what is the derivative of v with respect
to c, that is, what is dv/dc? Here we encounter an example of a rate of change whose
independent variable is not time, an illustration that the derivative is not restricted to
time-dependent processes.
We illustrate the answer to this problem using spreadsheet calculations.
Example 3.18 (The derivative on a spreadsheet) Use a spreadsheet (or your favorite software) to plot the derivatives of the functions (3.7) and (3.8).
concentration, c   v_MM     v_Hill   Δv_MM/Δc   Δv_Hill/Δc
0.0000             0.0000   0.0000   4.5455     0.0050
0.1000             0.4545   0.0005   3.7879     0.0749
0.2000             0.8333   0.0080   3.2051     0.3219
0.3000             1.1538   0.0402   2.7473     0.8463
0.4000             1.4286   0.1248   2.3810     1.6931
0.5000             1.6667   0.2941   2.0833     2.7954
0.6000             1.8750   0.5737   1.8382     3.9441
0.7000             2.0588   0.9681   1.6340     4.8483
0.8000             2.2222   1.4529   1.4620     5.2796
0.9000             2.3684   1.9809   1.3158     5.1914
1.0000             2.5000   2.5000   1.1905     4.7086

Table 3.1. Using a spreadsheet, we compute points along the graphs of biochemical kinetics functions. We then compute differences of successive points to approximate the
derivative. Here only a few data points are tabulated. Full results are plotted in Fig 3.10.

Solution: Figure 3.11 demonstrates typical spreadsheet manipulation that we use to numerically plot the two functions and their derivatives. In using such software, we must specify
values of any constants or parameters in the functions of interest, and in this example, we
take K = 5, a = 1 for both (3.7) and (3.8) and n = 4 for the latter. Spreadsheet cells are
labeled by row and column. We fill in the following entries in the sequence of steps shown
in the figure:
(A) First we define the two functions.
(1) We create values in column a to represent the horizontal (x) axis, which represents
the values of the chemical concentration c in steps of size Δc = 0.1. To do so, we
type 0 in cell a0 and a0 + 0.1 in cell a1.
(2) In cell b0 we type the formula for the first function: 5*a0/(1 + a0).
(3) We generate a whole list of (x,y) values (that is (c, v) points) by clicking on the black
square and dragging it down the column.
(4) We repeat the same process for the second function 5*a0^4/(1 + a0^4), in
column c. The symbol ^ denotes a power, * denotes multiplication, and braces are
used as needed in quotients or products of terms. The columns a-c now contain the
coordinates of points along the functions. To get a reasonable approximation for the
derivative, the points along the x axis should be close together, so that Δc is small.
(5) Plotting the above (x,y) data produces the graph shown in the left panel of Fig. 3.10.

Figure 3.10. A simple spreadsheet can be used to graph the (approximate) form
of the derivative of a given function. Here we show the Michaelis-Menten and Hill function
biochemical kinetics (left, panel (a)) where reaction rate v is plotted against concentration c. We then
use finite differences of successive points on the curve to compute and plot the derivatives dv/dc
of each of these functions (thick curves, right panel (b)).

(B) Next, we prepare to compute the numerical approximation for the derivative of each of
these functions.
(5) We use column d for the numerical derivative of (3.7). To do so, we approximate the
actual derivative by using a finite difference,
\[ \frac{\Delta f}{\Delta x} \approx \frac{df}{dx}. \]
Importantly, the above two expressions are not equal (!) However, for sufficiently
small Δx, they approximate one another well. The value of Δx is (by our choice
of x axis subdivision) Δx = 0.1, and is the value we saved in cell a1. To ensure
we point only to that cell, we use an absolute reference (typical syntax $a$1). The
values of Δf can be calculated by subtracting successive values of the function in the
b column. For example, pointing to cell d0, we type (b1 - b0)/$a$1. Dragging the
black square down the d column then generates all desired values of the numerical
derivative, for every value along the x axis.


(6) The process is repeated to generate the derivative of the function (3.8) in the e column. Observe that we use the same absolute reference $a$1 for Δc and successive
differences of the function in the c column (by typing (c1 - c0)/$a$1 in cell e1).
Results of the above process lead to the graphs shown on the right panel of Fig. 3.10.
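For readers who prefer a script to a spreadsheet, the same steps can be sketched in Python. This is only an illustration under the parameter values quoted in the text (K = 5, a = 1, n = 4, Δc = 0.1); the variable names are assumptions, and the lists `cs`, `mm`, `hill`, `d_mm`, `d_hill` play the roles of spreadsheet columns a through e.

```python
K, a, n = 5.0, 1.0, 4   # parameter values used in the text
dc = 0.1                # step size along the concentration axis

def v_mm(c):
    """Michaelis-Menten kinetics, Eqn. (3.7)."""
    return K * c / (a + c)

def v_hill(c):
    """Hill function kinetics, Eqn. (3.8)."""
    return K * c**n / (a**n + c**n)

# Column a: concentrations 0.0, 0.1, ..., 1.1 (one extra point for the last difference).
cs = [i * dc for i in range(12)]
# Columns b and c: the two functions evaluated at each concentration.
mm = [v_mm(c) for c in cs]
hill = [v_hill(c) for c in cs]
# Columns d and e: differences of successive values divided by the step size.
d_mm = [(mm[i + 1] - mm[i]) / dc for i in range(len(cs) - 1)]
d_hill = [(hill[i + 1] - hill[i]) / dc for i in range(len(cs) - 1)]

print("c       v_MM    v_Hill  dv_MM/dc  dv_Hill/dc")
for i in range(len(cs) - 1):
    print(f"{cs[i]:.4f}  {mm[i]:.4f}  {hill[i]:.4f}  {d_mm[i]:.4f}    {d_hill[i]:.4f}")
```

The printed rows can be compared against Table 3.1. A smaller `dc` would give a closer approximation to the true derivatives, at the cost of more rows.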

Figure 3.11. Spreadsheet used to calculate a derivative. Panel (A), steps (1)-(4): define the two functions. Panel (B), steps (5)-(6): compute the two derivatives.


Example 3.19 Interpret the graphs of the derivatives (shown as thick curves on the right
panel of Fig. 3.10) in terms of the way that reaction speed increases as the chemical concentration is increased in each of the two types of biochemical kinetics, Michaelis-Menten
and Hill function.
Solution: The Michaelis-Menten curve (red) has a derivative that is positive everywhere.



That derivative starts off at the value 5, and gradually decreases. We see this from the
decreasing thick red curve. We also get the same information from the fact that the actual
function (thin red curve) gradually levels off and flattens as c increases. We can interpret
this to mean that the reaction rate v increases as the substrate level c increases, but the rate
of increase, dv/dc, slows down as saturation takes place at higher c.
This contrasts with the derivative of a Hill function. From the thick blue curve,
we find that the Hill function derivative starts at zero, increases sharply, and only then
decreases to zero. Correspondingly, the Hill function (thin blue curve) is flat at first, then
becomes steeply increasing, and finally flattens to an asymptote. We can summarize this
biochemically by saying that the initial reaction rate v is small and hardly changes near
c ≈ 0. For larger c, the reaction rate depends sensitively on c (evidenced by large dv/dc)
but as c increases further, saturation leads to a drop in dv/dc. The reaction can no longer
increase with substrate, as the enzymes are once more saturated.

Exercises
3.1. Sketching the derivative (Geometric view): Shown in Figure 3.12 is the graph of
some function f(x). Sketch the graph of its derivative, f'(x).

Figure 3.12. Figure for Problem 1


3.2. Sketching the derivative (Geometric view): Shown in Figure 3.13 below are three
functions. Sketch the derivatives of these functions.
Figure 3.13. Figure for problem 2


3.3. What the sign of the derivative tells us: You are given the following information
about the signs of the derivative of a function, f(x). Use this information to sketch
a (very rough) graph of the function for -3 < x < 3.

x        -3   -2   -1    0    1    2    3
f'(x)     0    +    0    -    0    +    +


3.4. Sketching the function given its derivative: You are given the following information about the values of the derivative of a function, g(x). Use this information
to sketch a (very rough) graph of the function for -3 < x < 3.

x        -3   -2   -1    0    1    2    3
g'(x)    -1    0    2    1    0   -1   -2


3.5. Sketching the derivative (geometric view): Sketch the graph of the derivative of
the function shown in Figure 3.14.

Figure 3.14. Figure for Problem 5

3.6. Shallower or steeper rise: Shown in Fig. 3.15 are two similar functions, both increasing from 0 to 1 but at distinct rates. Sketch the derivatives of each one. Then
comment on what your sketch would look like for a discontinuous "step function",
defined as follows:
\[ f(x) = \begin{cases} 0 & x < 0 \\ 1 & x \ge 0. \end{cases} \]

Figure 3.15. Figure for Problem 6. Panels (a) and (b) show the two functions.


3.7. Geometric view, continued:
(a) Given the function in Figure 3.16(a), graph its derivative.
(b) Given the function in Figure 3.16(b), graph its derivative.
(c) Given the derivative f'(x) shown in Figure 3.16(c) graph the function f(x).
(d) Given the derivative f'(x) shown in Figure 3.16(d) graph the function f(x).
3.8. Introduction to velocity and acceleration: The acceleration of a particle is the
derivative of the velocity. Shown in Figure 3.17 is the graph of the velocity of a
particle moving in one dimension. Indicate directly on the graph any time(s) at
which the particle's acceleration is zero.

Figure 3.16. Figures for Problem 7. Panels (a) and (b) show y = f(x); panels (c) and (d) show f'(x).

3.9. Velocity, continued: The vertical height of a ball, d (in meters) at time t (seconds)
after it was thrown upwards was found to satisfy $d(t) = 14.7t - 4.9t^2$ for the first 3
seconds of its motion.
(a) What is the initial velocity of the ball (i.e. the instantaneous velocity at t = 0)?
(b) What is the instantaneous velocity of the ball at t = 2 seconds?
Figure 3.17. Figure for Problem 8

3.10. Computing the derivative of square-root (from the definition): Consider the
function
\[ y = f(x) = \sqrt{x}. \]
(a) Use the definition of the derivative to calculate f'(x). You will need to use the
following algebraic simplification:
\[ (\sqrt{a} - \sqrt{b}) = \frac{(\sqrt{a} - \sqrt{b})(\sqrt{a} + \sqrt{b})}{(\sqrt{a} + \sqrt{b})} = \frac{a - b}{(\sqrt{a} + \sqrt{b})}. \]
(b) Find the slope of the function at the point x = 4.
(c) Find the equation of the tangent line to the graph at this point.
3.11. Computing the derivative: Use the definition of the derivative to compute the
derivative of the function y = f(x) = C/(x + a) where C and a are arbitrary
constants. Show that your result is $f'(x) = -C/(x + a)^2$.
3.12. Computing the derivative: Consider the function $y = f(x) = \frac{x}{(x + a)}$.
(a) Show that this same function can be written as $f(x) = 1 - \frac{a}{(x + a)}$.
(b) Use the results of Problem 11 to determine the derivative of this function.
(Note: you do not need to use the definition of the derivative to do this computation.) Show that you get $f'(x) = \frac{a}{(x + a)^2}$.
3.13. Tangent line to a simple function: What is the slope of the tangent line to the
function y = f (x) = 5x + 2 when x = 2? when x = 4 ? How would this slope
change if a negative value of x was used? Why?
3.14. Slope of the tangent line: Use the definition of the derivative to compute the slope
of the tangent line to the graph of the function $y = 3t^2 - t + 2$ at the point t = 1.
3.15. Tangent line: Find the equation of the tangent line to the graph of $y = f(x) = x^3 - x$
at the point x = 1.5 shown in Fig. 3.1. You may use the fact that the tangent
line goes through (1.7, 1.47) as well as the point of tangency.
3.16. Molecular motors: Fig. 3.18 (a) shows the displacement of a vesicle carried by a
molecular motor. The motor can either walk right (R), left (L) along one of the microtubules or it can unbind (U) and be stationary, then rebind again to a microtubule.
Sketch a rough graph of the velocity of the vesicle v(t) and explain the sequence of
events (using the letters R, L, U) that resulted in this motion. Fig. 3.18 (b) shows the
velocity v(t) of another vesicle. Sketch a rough graph of its displacement starting
from y(0) = 0.
Figure 3.18. Figure for problem 16. Panels (a) and (b).

3.17. Concentration gradient: Certain types of tissues, called epithelia, are made up
of thin sheets of cells. Substances are taken up on one side of the sheet by some
active transport mechanism, and then diffuse down a concentration gradient by a
mechanism called facilitated diffusion on the opposite side. Shown in Figure 3.19
is the concentration profile c(x) of some substance across the width of the sheet (x
represents distance). Sketch the corresponding concentration gradient, i.e. sketch
c'(x), the derivative of the concentration with respect to x.

Figure 3.19. Figure for Problem 17. Concentration c(x) plotted against distance across the sheet, with regions labeled active transport and facilitated diffusion.


3.18. Numerically computed derivative: Consider the two Hill functions
\[ H_1(x) = \frac{x^2}{0.01 + x^2}, \qquad H_2(x) = \frac{x^4}{0.01 + x^4} \]
(a) Sketch a rough graph of these two functions on the same plot and/or describe
in words what the two graphs would look like.
(b) On a second plot, sketch a rough graph of both derivatives of these functions
and/or describe in words what the two derivatives would look like.
(c) Using a spreadsheet or your favorite software, plot the two functions over the
range 0 ≤ x ≤ 1.
(d) Use the spreadsheet to calculate an approximation for the derivatives H1'(x), H2'(x)
and plot these two functions together. (NOTE: In order to have a reasonably
accurate set of graphs, you will need to select a small step size of Δx ≈ 0.01.)
3.19. More numerically computed derivatives: As we will later find out, trigonometric
functions such as sin(t) and cos(t) can be used to describe biorhythms of various
types. Here we numerically compute the first and second derivative of y = sin(t)
and show the relationships between the trigonometric functions and their derivatives.
We will use only numerical methods (e.g. a spreadsheet), but later, in Chapter 14,
we will also study the analytical calculation of the same derivatives.
(a) Use a spreadsheet (or your favorite software) to plot, on the same graph, the
two functions
\[ y_1 = \sin(t), \quad y_2 = \cos(t), \qquad 0 \le t \le 2\pi \approx 6.28. \]
Note that you should use a fairly small step size, e.g. Δt = 0.01, to get a
reasonably accurate approximation of the derivatives.
(b) Use the same spreadsheet to (numerically) calculate (an approximate) derivative y1'(t) and add it to your graph.
(c) Now calculate y1''(t), that is (an approximation to) the derivative of the derivative of the sine function, and add this to your graph.

Chapter 4

Differentiation rules, simple antiderivatives and applications

In our investigation so far, we have defined the derivative of a function, y = f(x) by
\[ \frac{dy}{dx} = f'(x) = \lim_{h \to 0} \frac{f(x + h) - f(x)}{h}. \]
We used this formula to calculate derivatives of a few power functions. Here, we will gather
results so far, and observe a pattern, the "power rule", for derivatives of power functions. The
power rule also allows us to find successive derivatives (e.g. second derivative etc.), to
the rule in reverse (find a function that has a given derivative). All these calculations
are useful to common applications of accelerated motion, investigated later in this chapter.
We round out the technical material by stating several other useful rules of differentiation
(product and quotient), allowing us to easily calculate derivatives of more complicated and
interesting functions.

4.1 Rules of differentiation

Learning goals (LG) for Section 4.1


1. Learn and understand the power rule (Table 4.1) and be prepared to apply it to both
derivatives and antiderivatives of power functions and polynomials.
2. Be able to explain what is meant by the statement that the derivative is a linear
operation.
3. Understand the concept of an antiderivative and why it is defined only up to some
constant.
4. Learn the product and quotient rules and be able to apply these to calculating derivatives of products and of rational functions.

4.1.1 The derivative of power functions: the power rule


We have already computed the derivatives of several of the power functions. See Example 3.7 for $y = x^0 = 1$ and Example 3.8 for $y = x^1$. See also Example 2.21 for $y = x^2$
and Example 3.15 for $y = x^3$. We tabulate these results in Table 4.1.
We see from simple experimentation in Example 3.15 that the derivative of a power
function consists of reducing the power (by 1) and multiplying the result by the original
power. See Table 4.1, where we have taken all the coefficients to be 1 for simplicity. We
refer to this pattern as the power rule of differentiation.
Function f(x)    Derivative f'(x)
1                0
x                1
x^2              2x
x^3              3x^2
...              ...
x^n              n x^(n-1)
x^(n/m)          (n/m) x^((n/m)-1)

Table 4.1. The Power Rule of differentiation states that the derivative of the
power function $y = x^n$ is $nx^{n-1}$. For now, we have established this result for integer n.
Later, we will find that this result holds for other powers that are not integer.
We can show that this rule applies for any power function of the form $y = f(x) = x^n$
where n is an integer power. The calculation is essentially the same as the examples we
have shown, but the step of expanding the binomial $(x + h)^n$ entails lengthier algebra. Such
expansion contains terms of the form $x^{n-k}h^k$ multiplied by binomial coefficients, and we
omit the details here. From now on, we will use this convenient result to simply write down
the derivative of a power function, without having to recalculate it from the definition.
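For readers who want to see the shape of the omitted calculation, the following is a brief sketch, assuming n is a positive integer and using the standard binomial theorem. The notation $\binom{n}{2}$ for a binomial coefficient is introduced here for illustration and is not used elsewhere in these notes.

```latex
\begin{aligned}
(x + h)^n &= x^n + n x^{n-1} h + \binom{n}{2} x^{n-2} h^2 + \dots + h^n, \\
\frac{(x + h)^n - x^n}{h} &= n x^{n-1} + \binom{n}{2} x^{n-2} h + \dots + h^{n-1}, \\
\lim_{h \to 0} \frac{(x + h)^n - x^n}{h} &= n x^{n-1}.
\end{aligned}
```

Every term after the first in the second line still carries at least one factor of h, so those terms vanish in the limit and only $n x^{n-1}$ survives.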
Example 4.1 Find the equation of the tangent line to the graph of the power function
$y = f(x) = 4x^5$ at x = 1, and determine the y intercept of that tangent line.
Solution: The derivative of this function is
\[ f'(x) = 20x^4. \]
At the point x = 1, we have dy/dx = f'(1) = 20 and y = f(1) = 4. This means that the
tangent line goes through the point (1, 4) and has slope 20. Thus, its equation is
\[ \frac{y - 4}{x - 1} = 20 \quad \Rightarrow \quad y = 4 + 20(x - 1) = 20x - 16. \]
(At this point it is a good idea to do a quick check that the point (1, 4) satisfies this equation,
and that the slope of the line is 20.) Thus, we find that the y intercept of the tangent line is
y = -16.
Example 4.2 (Energy loss and Earth's temperature) In Section 1.3, we studied the energy balance on Earth. According to Eqn. (1.4), the rate of loss of energy from the surface
of the Earth depends on its temperature according to the rule
\[ E_{out}(T) = 4\pi r^2 \epsilon \sigma T^4. \]
Calculate the rate of change of this outgoing energy with respect to the temperature T.
Solution: The quantities $\epsilon$, $\sigma$, r are constants for this problem. Hence the rate of change of
energy with respect to T, denoted $E'_{out}(T)$, is
\[ E'_{out}(T) = (4\pi r^2 \epsilon \sigma) \cdot 4T^3 = (16\pi r^2 \epsilon \sigma) T^3. \]

Next, we find that the result for derivatives of power functions can be extended to derivatives of polynomials, using further simple properties of the derivative.

4.1.2 The derivative is a linear operation


The derivative satisfies several convenient properties: The sum of two functions or the constant multiple of a function has a derivative that is related simply to the original function(s).
The derivative of a sum is the same as the sum of the derivatives. A constant multiple of a
function can be brought outside the differentiation.
\[ \frac{d}{dx}\left(f(x) + g(x)\right) = \frac{df}{dx} + \frac{dg}{dx} \tag{4.1} \]
\[ \frac{d}{dx} Cf(x) = C \frac{df}{dx} \tag{4.2} \]
We can summarize these observations by saying that the derivative is a linear operation. In general, a linear operation L is a rule or process that satisfies two properties: (1)
L[f + g] = L[f] + L[g] and (2) L[cf] = cL[f], where f, g are objects (such as functions,
vectors, etc) on which L acts, and c is a constant multiple. We will refer to (4.1) and (4.2)
as the linearity properties of the derivative.

4.1.3 The derivative of a polynomial


Using the properties (4.1) and (4.2), we can extend our differentiation power rule to compute the derivative of any polynomial. Recall that polynomials are sums of power functions
multiplied by constants. A polynomial of degree n has the form
\[ p(x) = a_n x^n + a_{n-1} x^{n-1} + \dots + a_1 x + a_0 \tag{4.3} \]
where the coefficients $a_i$ are constant and n is an integer. Thus, by the above two properties, the derivative of a polynomial is just the sum of derivatives of power functions
(multiplied by constants). Thus the derivative of (4.3) is
\[ p'(x) = \frac{dy}{dx} = a_n n x^{n-1} + a_{n-1}(n - 1)x^{n-2} + \dots + a_1 \tag{4.4} \]
(Observe that each term consists of the coefficient times the derivative of a power function.
The constant term $a_0$ has disappeared since the derivative of any constant is zero.) The
derivative, p'(x), is also a function, and a polynomial as well. Its degree is
n - 1, one less than that of p(x). In view of this observation, we could ask what is the
derivative of the derivative, which we henceforth call the second derivative, written in
the notation p''(x) or, equivalently, $\frac{d^2y}{dx^2}$. Using the same rules, we can compute this easily,
obtaining
\[ p''(x) = \frac{d^2y}{dx^2} = a_n n(n - 1)x^{n-2} + a_{n-1}(n - 1)(n - 2)x^{n-3} + \dots + 2a_2 \tag{4.5} \]
We demonstrate the idea with a few examples.
We demonstrate the idea with a few examples


Example 4.3 Find the first and second derivatives of the function (a) $y = f(x) = 2x^5 + 3x^4 + x^3 - 5x^2 + x - 2$
with respect to x and (b) $y = f(t) = At^3 + Bt^2 + Ct + D$ with
respect to t.
Solution: We obtain the results (a) $f'(x) = 10x^4 + 12x^3 + 3x^2 - 10x + 1$ and
$f''(x) = 40x^3 + 36x^2 + 6x - 10$. (b) $f'(t) = 3At^2 + 2Bt + C$ and $f''(t) = 6At + 2B$. In (b) the
independent variable is t, but, of course, the rules of differentiation are the same.
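The term-by-term rule (4.4) is mechanical enough to sketch in a few lines of code. The following illustration stores a polynomial as a list of coefficients, lowest power first; this representation and the helper name `poly_derivative` are assumptions made for the example, not notation from the notes. It is applied to part (a) of Example 4.3.

```python
def poly_derivative(coeffs):
    """Differentiate a polynomial given as [a0, a1, a2, ...] (lowest power first).

    Each term a_k x^k becomes k * a_k x^(k-1); the constant term drops out.
    """
    return [k * c for k, c in enumerate(coeffs)][1:]

# f(x) = 2x^5 + 3x^4 + x^3 - 5x^2 + x - 2, written lowest power first.
f = [-2, 1, -5, 1, 3, 2]
f1 = poly_derivative(f)    # coefficients of f'(x)
f2 = poly_derivative(f1)   # coefficients of f''(x)

print("f'  coefficients:", f1)
print("f'' coefficients:", f2)
```

Reading the lists from highest power down recovers the derivatives found in the solution to Example 4.3(a).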

4.1.4 Antiderivatives of power functions and polynomials


So far, we have generally taken a familiar route from a function to its derivative(s), using
either definition or rules of differentiation to calculate the derivative. But this route can
be travelled in reverse. Namely, given a derivative, we can ask "what function was differentiated to lead to this result?" This reverse process is termed antidifferentiation, and the
function we seek is then called an antiderivative. As seen above, when we differentiate
a power function, we get a power function whose power is smaller. Antidifferentiation
reverses the operation. We ask, for example, which function has as its derivative
\[ y'(t) = At^n. \tag{4.6} \]
The original function, y(t), should have a power higher by 1 (of the form $t^{n+1}$), but the
guess $y_{guess} = At^{n+1}$ is not quite right, since differentiation results in $A(n + 1)t^n$. To
fix this, we revise the guess to
\[ y(t) = A \frac{1}{(n + 1)} t^{n+1}. \tag{4.7} \]
It is easily checked that the derivative of the function in (4.7) is indeed (4.6), so that the
function in (4.7) is an antiderivative of (4.6).


Is this the only function that has the desired property? Further thought leads to the
idea that there are other functions whose derivatives are the same. For example, consider
adding an arbitrary constant C to the function in (4.7) and note that we obtain the same
derivative, (4.6) (since the derivative of the constant is zero). We summarize our findings:
\[ \text{The antiderivative of } y'(t) = At^n \text{ is } y(t) = A \frac{1}{(n + 1)} t^{n+1} + C. \tag{4.8} \]

We also note an important result that holds for functions in general:

Given a function, f (x) we can only determine its antiderivative up to some (additive)
constant.
We can extend the same ideas to finding the antiderivative of a polynomial.
Example 4.4 (Antiderivative of a polynomial) Find an antiderivative of the polynomial
\[ y'(t) = At^2 + Bt + C. \]
Solution: Since differentiation is a linear operation, we can construct the antiderivative by
antidifferentiating each of the component power functions, obtaining
\[ y(t) = A \frac{1}{3} t^3 + B \frac{1}{2} t^2 + Ct + D \]
where D is an arbitrary constant. We see that the antiderivative of a polynomial is another polynomial whose degree is higher by 1. It is straightforward to check this result by
differentiation.
Example 4.5 The second derivative of some function is
\[ y''(t) = c_1 t + c_2. \]
Find a function y(t) for which this is true.
Solution: The above polynomial has degree 1. Evidently, this function resulted by taking
the derivative of y'(t), which had to be a polynomial of degree 2. We can check that
\[ y'(t) = \frac{c_1}{2} t^2 + c_2 t \]
could be such a function, but so could
\[ y'(t) = \frac{c_1}{2} t^2 + c_2 t + c_3 \]
for any constant $c_3$. In turn, the function y(t) had to be a polynomial of degree 3. We can
see that one such function is
\[ y(t) = \frac{c_1}{6} t^3 + \frac{c_2}{2} t^2 + c_3 t + c_4 \]
where $c_4$ is any constant. (This can be checked by differentiating.) The steps we have just
illustrated are antidifferentiation. In short, the relationship is:
for differentiation $y(t) \to y'(t) \to y''(t)$
whereas
for antidifferentiation $y''(t) \to y'(t) \to y(t)$.
(Arrows denote what is done to one function to arrive at the next.) These results will be
useful in an application to the acceleration, velocity, and displacement of a moving object
in Section 4.2.
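The two-step antidifferentiation of Example 4.5 can be sketched in code as well. The example below is an illustration only: it again stores polynomials as coefficient lists (lowest power first), picks the sample values $c_1 = 6$, $c_2 = 2$, and sets the arbitrary constants $c_3 = c_4 = 0$. The helper names are hypothetical, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction

def poly_antiderivative(coeffs, constant=0):
    """Antidifferentiate [a0, a1, ...] (lowest power first).

    Each term a_k t^k becomes a_k/(k+1) t^(k+1); `constant` is the
    arbitrary additive constant of Eqn. (4.8).
    """
    return [Fraction(constant)] + [Fraction(c) / (k + 1) for k, c in enumerate(coeffs)]

def poly_derivative(coeffs):
    """Differentiate term by term; used here only to check the result."""
    return [k * c for k, c in enumerate(coeffs)][1:]

y2 = [2, 6]                              # y''(t) = 6t + 2, i.e. c1 = 6, c2 = 2
y1 = poly_antiderivative(y2, constant=0) # y'(t) with c3 = 0
y0 = poly_antiderivative(y1, constant=0) # y(t) with c4 = 0

print("y'(t) coefficients:", [str(c) for c in y1])
print("y(t)  coefficients:", [str(c) for c in y0])
# Differentiating twice should bring us back to y''(t).
print("check:", poly_derivative(poly_derivative(y0)) == y2)
```

Changing either `constant` argument gives a different antiderivative with the same second derivative, which is the point made in the text about additive constants.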

4.1.5 Product and quotient rules for derivatives


So far, using a single rule for differentiation, the power rule, together with properties of
the derivative such as additivity and constant multiplication (described in Section 4.1.2),
we were able to calculate derivatives of polynomials. Here we state, without proof, two
other rules of differentiation that will prove to be useful in due time.

The product rule: If f(x) and g(x) are two functions, each differentiable in the domain
of interest, then
\[ \frac{d[f(x)g(x)]}{dx} = \frac{df(x)}{dx} g(x) + \frac{dg(x)}{dx} f(x). \]
Another notation for this rule is
\[ [f(x)g(x)]' = f'(x)g(x) + g'(x)f(x). \]
Example 4.6 Find the derivative of the product of the two functions f(x) = x and g(x) =
1 + x.
Solution: Using the product rule leads to
\[ \frac{d[f(x)g(x)]}{dx} = \frac{d[x(1 + x)]}{dx} = \frac{d[x]}{dx}(1 + x) + \frac{d[(1 + x)]}{dx} x = 1 \cdot (1 + x) + 1 \cdot x = 2x + 1. \]
(This can be easily checked by noting that $f(x)g(x) = x(1 + x) = x + x^2$, whose derivative
agrees with the above.)
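The result of Example 4.6 can also be checked numerically. In this sketch (an illustration, not part of the notes), the derivative of the product is estimated at the arbitrarily chosen point x = 2 in two ways: directly, by a finite difference of f(x)g(x), and through the product rule, using finite differences of f and g separately. The step size `h` and the helper name `forward_difference` are assumptions of the example.

```python
def f(x):
    return x

def g(x):
    return 1.0 + x

def product(x):
    return f(x) * g(x)

def forward_difference(func, x, h=1e-6):
    """Secant-slope approximation to the derivative of func at x."""
    return (func(x + h) - func(x)) / h

x0 = 2.0
numeric = forward_difference(product, x0)            # derivative of f*g directly
by_rule = (forward_difference(f, x0) * g(x0)
           + forward_difference(g, x0) * f(x0))      # f'g + g'f
exact = 2 * x0 + 1                                   # 2x + 1 from Example 4.6

print(f"direct estimate: {numeric:.6f}")
print(f"product rule:    {by_rule:.6f}")
print(f"exact value:     {exact:.6f}")
```

Both estimates should land close to the exact value 2x + 1 at x = 2; small discrepancies come from the finite step size.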


The quotient rule: If f(x) and g(x) are two functions, each differentiable in the domain
of interest, then
\[ \frac{d}{dx}\left[\frac{f(x)}{g(x)}\right] = \frac{\frac{df(x)}{dx} g(x) - \frac{dg(x)}{dx} f(x)}{[g(x)]^2}. \]
We can also write this in the form
\[ \left[\frac{f(x)}{g(x)}\right]' = \frac{f'(x)g(x) - g'(x)f(x)}{[g(x)]^2}. \]
Example 4.7 Find the derivative of the function $y = ax^{-n} = a/x^n$ where a is a constant
and n is a positive integer.
Solution: We can rewrite this as the quotient of the two functions $f(x) = a$ and $g(x) = x^n$.
Then $y = f(x)/g(x)$ so using the quotient rule leads to the derivative
\[ \frac{dy}{dx} = \frac{f'(x)g(x) - g'(x)f(x)}{[g(x)]^2} = \frac{0 \cdot x^n - (nx^{n-1}) \cdot a}{(x^n)^2} = -\frac{anx^{n-1}}{x^{2n}}. \]
After algebraic simplification, we obtain $dy/dx = a(-n)x^{n-1-2n} = a(-n)x^{-n-1}$. This
is an interesting result: The power rule of differentiation holds for negative integer
powers.
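As a quick sanity check of this result, the sketch below (our own Python illustration; the function names and the sample values of a, n and x are assumptions) evaluates the derivative of $a/x^n$ once with the quotient rule and once with the power rule applied to the exponent $-n$.

```python
def quotient_rule(f, df, g, dg):
    """Value of (f/g)' at a point, given the values of f, f', g, g' there."""
    return (df * g - dg * f) / g ** 2

def power_rule(a, p, x):
    """Derivative of a*x**p by the power rule: a*p*x**(p-1)."""
    return a * p * x ** (p - 1)

# Assumed sample values.
a, n, x = 2.0, 3, 1.5

# Numerator f(x) = a (constant), denominator g(x) = x**n.
via_quotient = quotient_rule(f=a, df=0.0, g=x ** n, dg=n * x ** (n - 1))

# Power rule applied directly to a*x**(-n).
via_power = power_rule(a, -n, x)

print(via_quotient, via_power)
```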
Example 4.8 (Dynamics of actin in the cell) Actin is a structural protein that forms long
filaments and networks in living cells. The actin network is continually assembling from
small components (actin monomers) and disassembling back again. To study this process,
scientists attach fluorescent markers to actin, and watch the fluorescence intensity change
over time. In a recent experiment, both red and green fluorescent labels were used. The
green label fluoresces only after it is activated by a pulse of light, whereas the red fluorescent protein is active continually.
It was noted that the red and green fluorescence intensities (R, G) satisfied the following relationships11 :
\[ \frac{dR}{dt} = (a - b)R, \qquad \frac{dG}{dt} = -bG \]
where a, b are constants that characterize the rate of assembly and disassembly (breakup)
of actin. Find the relationship satisfied by the ratio of the two fluorescent signals R/G and
the derivative of that ratio (d(R/G)/dt).
Solution: This is an application of the quotient rule. We write
\[ \frac{d(R/G)}{dt} = \frac{\frac{dR}{dt}\, G - \frac{dG}{dt}\, R}{G^2} = \frac{(a - b)RG - (-bGR)}{G^2} = \frac{aRG}{G^2} = a(R/G). \]

Thus, the derivative of the ratio is proportional to that same ratio and the constant of proportionality is the parameter a.
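The algebra in this example can be spot-checked numerically. In the hedged Python sketch below, the function name `ratio_rate` and the sample values of R, G, a and b are our own assumptions; only the two rate equations and the quotient rule come from the text.

```python
def ratio_rate(R, G, a, b):
    """d(R/G)/dt from the quotient rule, using dR/dt = (a - b)R and dG/dt = -bG."""
    dR_dt = (a - b) * R
    dG_dt = -b * G
    return (dR_dt * G - dG_dt * R) / G ** 2

# Assumed sample intensities and rate constants.
R, G, a, b = 2.0, 4.0, 0.3, 0.1

print(ratio_rate(R, G, a, b), a * (R / G))
```

Both printed numbers should coincide, reflecting the fact that the parameter b cancels out of the final expression.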
11 These relationships between a function of time and its own derivative are examples of differential equations,
a topic we will revisit in later chapters.


4.1.6 The power rule for fractional powers

Using the definition of the derivative, we have already shown that the derivative of $y = \sqrt{x}$ is
$y'(x) = \frac{1}{2\sqrt{x}}$ (see Problem 10). We restate this result in terms of (one specific case of) a
fractional power. Recall that $\sqrt{x} = x^{1/2}$.

The derivative of $y = \sqrt{x}$ is $y'(x) = \frac{1}{2} x^{-1/2}$.

This idea can be generalized to any fractional power. Indeed, we state here a useful result
(to be demonstrated in Chapter 9).
Derivative of fractional-power function: The derivative of
\[ y = f(x) = x^{m/n} \]
is
\[ \frac{dy}{dx} = \frac{m}{n}\, x^{\left(\frac{m}{n} - 1\right)}. \]
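Although the proof is deferred, the rule is easy to test numerically. The Python sketch below is our own illustration: the function names, the step size `h`, and the sample exponents are assumptions. It compares the stated formula with a central-difference estimate of the slope.

```python
def fractional_power_derivative(x, m, n):
    """Derivative of x**(m/n) according to the stated rule: (m/n) * x**(m/n - 1)."""
    return (m / n) * x ** (m / n - 1)

def central_difference(func, x, h=1e-6):
    """Numerical estimate of func'(x) (step size h is an assumption)."""
    return (func(x + h) - func(x - h)) / (2 * h)

# Square root (m/n = 1/2) at x = 4, and an assumed second case m/n = 2/3 at x = 8.
for m, n, x in [(1, 2, 4.0), (2, 3, 8.0)]:
    rule = fractional_power_derivative(x, m, n)
    estimate = central_difference(lambda s: s ** (m / n), x)
    print(m, n, x, rule, estimate)
```

For the square root at x = 4 the rule gives the slope 1/4, in agreement with $y'(x) = 1/(2\sqrt{x})$.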

Example 4.9 (Energy loss and Earth's temperature, revisited) In Example 4.2, we calculated the rate of change of energy lost per unit change in the Earth's temperature based
on Eqn. (1.4). Find the rate of change of Earth's temperature per unit energy loss based on
the same equation.
Solution: We are asked to find $dT/dE_{out}$. We will first rewrite the relationship (see footnote 12) to express
T as a function of $E_{out}$. To do so, we solve for T in Eqn. (1.4), obtaining
\[ T = \left(\frac{E_{out}}{4\pi r^2 \epsilon\sigma}\right)^{1/4} = \left(\frac{1}{4\pi r^2 \epsilon\sigma}\right)^{1/4} E_{out}^{1/4} = K E_{out}^{1/4}. \]
Then, as we have indicated, the first factor is a constant, K, and we use the rule for a derivative
of a fractional power to compute that
\[ \frac{dT}{dE_{out}} = \left(\frac{1}{4\pi r^2 \epsilon\sigma}\right)^{1/4} \cdot \frac{1}{4}\, E_{out}^{(1/4) - 1} = \frac{K}{4}\, E_{out}^{-3/4}. \]

4.2 Application: From acceleration to displacement


As we have seen already, the derivative of a function is also, itself, a function. This leads
to the idea that we can apply the same process of differentiation over again to construct the
derivative of that new function. We refer to the derivative of the derivative as the second
derivative. As we discuss here, a natural example of a second derivative is the acceleration
of an object: the rate of change of the velocity (which is, as we have seen, the rate of change
12 In a later chapter we will show that this step is not essential. In fact, implicit differentiation will be introduced as a method of finding the desired derivative without solving for the variable of interest.


of a displacement). We can also state the same relationships in terms of antiderivatives: the
velocity of an object is an antiderivative of the acceleration, and the displacement is an
antiderivative of the velocity.

Section 4.2.1 Learning goals


1. Understand that velocity and acceleration are first and second derivatives of the position with respect to time.
2. Understand that velocity and position are first and second antiderivatives of acceleration.
3. Given that the acceleration of an object is constant in time, be able to find the velocity
and displacement of that object.

4.2.1 Position, velocity, and acceleration


As an example of the relation between a function and its first and second derivative, we
return to the discussion of displacement, velocity and acceleration of an object falling under
the force of gravity. Here we will use the notation y(t) to denote the position of the object
at time t. From now on, we will refer to the instantaneous velocity of a particle or object at
time t simply as the velocity, v(t).
Definition 4.10 (The velocity). Given the position of some particle as a function of time,
y(t), we define the velocity as the rate of change of the position, i.e. the derivative of y(t):
\[ v(t) = \frac{dy}{dt} = y'(t). \]

Here we have just used two equivalent notations for the derivative. In general, v may
depend on time, a fact we indicated by writing v(t).
Definition 4.11 (The acceleration). We will also define the acceleration as the (instantaneous) rate of change of the velocity, i.e. as the derivative of v(t).
\[ a(t) = \frac{dv}{dt} = v'(t). \]
(Acceleration could also depend on time, hence a(t).)

Since the acceleration is the derivative of a derivative of the original function, we
also use the notation
\[ a(t) = \frac{d}{dt}\left(\frac{dy}{dt}\right) = \frac{d^2 y}{dt^2} = y''(t). \]
Here we have used three equivalent ways of writing a second derivative. (This notation
evolved for historical reasons, and is used interchangeably in science.) The acceleration is
hence the second derivative of the position.


In view of our discussion of antidifferentiation, given information about the acceleration as a function of t, we can obtain the velocity v(t) (up to some constant) by antidifferentiation. Similarly, we can use the velocity v(t) to determine the position y(t) (up to
some constant). The constants must be obtained from other information, as examples that
follow will illustrate.
Example 4.12 (Uniformly accelerated motion) Suppose that the acceleration of an object is constant in time, i.e. a(t) = g = constant. Use antidifferentiation to determine the
velocity and the position of the object as functions of time.
Solution: We ask: what function of time v(t) has the property that
\[ a(t) = v'(t) = g = \text{constant}? \]
The function $a(t) = v'(t)$ is a polynomial of degree 0 in the variable t. To find the velocity,
we apply antidifferentiation to obtain a polynomial of degree 1,
\[ v(t) = gt. \]
This is one antiderivative of the acceleration, but in fact, other functions such as
\[ v(t) = gt + c, \qquad (4.9) \]

would work for any constant c. How can we decide which value of the constant c to use?
To determine c we need additional information about the velocity, for example at t = 0.
Suppose we are told that $v(0) = v_0$ is the known value of the initial velocity (see footnote 13). Then,
substituting t = 0 into Eqn. (4.9), we find that $c = v_0$. Thus in general,
\[ v(t) = gt + v_0 \]
where v0 is the initial velocity of the object.
To determine the position of the particle as a function of the time t, we recall
that $v(t) = y'(t)$. Thus, using Eqn. (4.9) with $c = v_0$, we have
\[ y'(t) = v(t) = gt + v_0. \qquad (4.10) \]
Then, by antidifferentiation of Eqn. (4.10), we obtain a polynomial of degree 2,
\[ y(t) = \frac{1}{2} g t^2 + v_0 t + k, \qquad (4.11) \]
where, as before, we allow for some additive constant k. It is a simple matter to check that
the derivative of this function is the given expression for v(t). By reasoning as before, the
constant k can be determined from the initial position of the object, $y(0) = y_0$. As before
(plugging t = 0 into Eqn. (4.11)), we find that $k = y_0$, so that
\[ y(t) = \frac{1}{2} g t^2 + v_0 t + y_0. \qquad (4.12) \]
Here we use the acceleration due to gravity, g, but any other motion with constant acceleration would be treated in the same way.

13 The statement $v(0) = v_0$ will later be called an initial condition, since it specifies how fast the particle was
moving initially.


Summary, uniformly accelerated motion: If an object moves with constant acceleration
g, then given its initial velocity $v_0$ and initial position $y_0$ at time t = 0, the position at any
later time is described by:
\[ y(t) = \frac{1}{2} g t^2 + v_0 t + y_0. \]
This powerful and general result is a direct result of the assumption that the acceleration is constant, using the elementary rules of calculus, and the definitions of velocity and
acceleration as first and second derivatives of the position. We further illustrate these ideas
with examples of motion under the influence of gravity.
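The summary can be turned into a short computational check. The following Python sketch is our own illustration: the function names, the step size `h`, and the numerical values of g, v0 and y0 are assumptions. It evaluates the formulas for v(t) and y(t) and confirms numerically that the slope of the position is the velocity and the slope of the velocity is the constant acceleration.

```python
def velocity(t, g, v0):
    """v(t) = g*t + v0 for constant acceleration g and initial velocity v0."""
    return g * t + v0

def position(t, g, v0, y0):
    """y(t) = (1/2)*g*t**2 + v0*t + y0."""
    return 0.5 * g * t ** 2 + v0 * t + y0

def central_difference(func, t, h=1e-5):
    """Numerical estimate of func'(t) (step size h is an assumption)."""
    return (func(t + h) - func(t - h)) / (2 * h)

# Assumed illustrative values (not from the text).
g, v0, y0 = 2.0, 3.0, 1.0
t = 1.5

slope_of_position = central_difference(lambda s: position(s, g, v0, y0), t)
slope_of_velocity = central_difference(lambda s: velocity(s, g, v0), t)

print(slope_of_position, velocity(t, g, v0))  # these two should agree
print(slope_of_velocity, g)                   # and so should these
```

A negative value of g (for example, gravity in a coordinate system where up is positive) is handled by the same functions without change.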
Example 4.13 (The motion of a falling object, revisited) A falling object experiences uniform acceleration (downwards) with $a(t) = -g = \text{constant}$ (see footnote 14). Suppose that an object is
thrown upwards at initial velocity $v_0$ from a building of height $h_0$. Find the velocity and
the acceleration of the object at any time t.
Solution: By previous reasoning, the height of the object at time t, denoted y(t), is given
by
\[ y(t) = -\frac{1}{2} g t^2 + v_0 t + h_0. \]
The velocity is given by:
\[ v(t) = y'(t) = v_0 - 2\left(\frac{1}{2} g t\right) = v_0 - gt. \]
We may observe that at t = 0, the initial velocity is $v(0) = v_0$. If the object was thrown
upwards then $v_0 > 0$, i.e., it is initially heading up. Differentiating one more time, we find
that the acceleration is:
\[ a(t) = v'(t) = -g. \]

We observe that the acceleration is constant. The negative sign means that the object is
accelerating downwards, in the direction opposite to the positive direction of the y axis.
This makes sense, since the force of gravity acts downwards, causing this acceleration.

Example 4.14 Determine when the object reaches its highest point, and what is its velocity
at that time.
Solution: To find when the object reaches its highest point, we note that the object shoots
up, but it slows down with time. Eventually, it can no longer continue to go up: this happens
precisely when its velocity is zero. From then on it will start to fall to the ground. The top
of its trajectory is determined by finding when the velocity of the object is zero. Equating
\[ v(t) = v_0 - gt = 0 \]
we solve for t, to get
\[ t_{top} = \frac{v_0}{g}. \]

14 Here we have chosen a coordinate system in which the positive direction is upwards, and so the acceleration,
which is in the opposite direction, is negative. On Earth, $g = 9.8$ m/s$^2$.


Example 4.15 When does the object hit the ground? What is its velocity at that instant?

Solution: We will assume that the object hits the ground at level y = 0. Then we must
solve for t in the equation:
\[ y(t) = h_0 + v_0 t - \frac{1}{2} g t^2 = 0. \]
Here we must observe that the highest power of the independent variable is 2, so that y is a
quadratic function of t, and solving for t requires us to solve a quadratic equation, which
could be written in the form
\[ \frac{1}{2} g t^2 - v_0 t - h_0 = 0, \quad\Rightarrow\quad g t^2 - 2 v_0 t - 2 h_0 = 0. \]
Using the quadratic formula, we obtain
\[ t_{ground} = \frac{2 v_0 \pm \sqrt{4 v_0^2 + 8 g h_0}}{2g} \quad\Rightarrow\quad t_{ground} = \frac{v_0}{g} \pm \frac{\sqrt{v_0^2 + 2 g h_0}}{g}. \]
We have found two roots. One is positive and the other is negative. Since we are interested
in t > 0, we will reject the negative root, so
\[ t_{ground} = \frac{v_0}{g} + \frac{\sqrt{v_0^2 + 2 g h_0}}{g}. \]
To find the velocity of the object when it hits the ground, we need to use the time determined above. Substituting $t_{ground}$ into the expression for velocity, we obtain:
\[ v(t_{ground}) = v_0 - g\, t_{ground} = v_0 - g\left(\frac{v_0}{g} + \frac{\sqrt{v_0^2 + 2 g h_0}}{g}\right). \]
After some algebraic simplification, we obtain
\[ v(t_{ground}) = -\sqrt{v_0^2 + 2 g h_0}. \]

We observe that this velocity is negative, indicating (as expected) that the object is falling
down.
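The results of Examples 4.13 to 4.15 can be collected in a few lines of code. The Python sketch below is our own illustration: the function names and the sample values of $v_0$ and $h_0$ are assumptions, and only the formulas themselves come from the examples. It computes the time at the top of the trajectory, the time of impact, and the velocity at impact.

```python
import math

def height(t, v0, h0, g=9.8):
    """y(t) = h0 + v0*t - (1/2)*g*t**2, with the positive direction upwards."""
    return h0 + v0 * t - 0.5 * g * t ** 2

def velocity(t, v0, g=9.8):
    """v(t) = v0 - g*t."""
    return v0 - g * t

def time_at_top(v0, g=9.8):
    """Time at which the velocity is zero: t_top = v0/g."""
    return v0 / g

def time_at_ground(v0, h0, g=9.8):
    """Positive root of h0 + v0*t - (1/2)*g*t**2 = 0."""
    return (v0 + math.sqrt(v0 ** 2 + 2 * g * h0)) / g

# Assumed sample values: thrown upwards at 10 m/s from a height of 20 m.
v0, h0 = 10.0, 20.0
t_top = time_at_top(v0)
t_ground = time_at_ground(v0, h0)

print("time at top:", t_top)
print("time at ground:", t_ground)
print("height at ground:", height(t_ground, v0, h0))   # should be (nearly) zero
print("velocity at ground:", velocity(t_ground, v0))   # should be negative
```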
Figure 4.1 illustrates the relationship between the three functions.

4.3 Sketching first, second, and anti- derivatives


Section 4.3 Learning goals
1. Given a sketch of a function, be able to sketch both its first and its second derivatives.
2. Given the sketch of a function, be able to sketch its first and second antiderivatives.

Figure 4.1. The position, velocity, and acceleration of an object that is thrown
upwards and falls under the force of gravity.
We have already encountered the idea of sketching the derivative of a function, given
a sketch of the original function. Here we practice this skill further. In the examples below,
we make no attempt to be accurate about heights of peaks and valleys in our sketches (as
would be certainly possible using numerical methods like a spreadsheet). Rather, we are
aiming for qualitative features, where the most important aspects of the graphs (locations
of key points such as peaks and troughs) are indicated.
Example 4.16 (Sketching the derivative from the original function) Use the functions
shown on the top panels of Fig. 4.2 to sketch the first and second derivatives in each case.

Solution: In Figure 4.2 we show the functions $y(t)$ (top), their first derivatives $y'(t)$ (middle), and the second derivatives $y''(t)$ (bottom). (In each case, we determined the slopes
of tangent lines as a first step.) An important feature to notice is that wherever a tangent
line to a curve is horizontal, e.g. at the tops of peaks (local maxima) or bottoms of
valleys (local minima) or at flat parts of the graph, the derivative is zero. This is indicated
at several places in Figure 4.2. In (b), there are several cusps at which the first and second
derivatives are not defined.
Example 4.17 (Sketching a function from a sketch of its derivative) Use the sketches of
the functions $y'(x)$ in the top panels of Figure 4.3 (a,b) to sketch the original functions $y(x)$
in each case.

Figure 4.2. Figure for Example 4.16, panels (a) and (b).


Solution: Recall that if we are given the derivative of a function, $y'(x)$, we can only determine the original $y(x)$ up to some (additive) constant. In the bottom panels of Figure 4.3
we show the antiderivative for each case, (a) and (b). An important point is that there are
many possible ways to draw $f(x)$ given $f'(x)$, because $f'(x)$ only contains information
about changes in $f(x)$, not about how high the function is at any point. If we were given an
additional piece of information, for example that $y(0) = 0$, we would be able to select
one specific curve out of this family of solutions. A second point is that antidifferentiation
smooths a function. Even though $y'(x)$ has cusps in (b), we find that $y(x)$ is smooth.
We will later see that the points at which $y'(x)$ has a cusp correspond to places where the
concavity of $y(x)$ changes abruptly.

Figure 4.3. Using the sketch of a function $y'(x)$ to sketch the function $y(x)$, panels (a) and (b).


4.3.1 A biological speed machine

Figure 4.4. The parasite Listeria lives inside a host cell. It assembles a rocket-like tail made up of actin, and uses this assembly to move around the cell, and to pass
from one host cell to another.
Listeria monocytogenes is a parasite that lives inside cells of the host, causing a nasty
infection. It has been studied by cellular biologists for its amazingly fast propulsion, which
uses the host's actin filaments as rocket fuel. Actin is part of the structural component
of all animal cells, and is known to play a major role in cell motility. Listeria manages to
hijack this cellular mechanism, assembling it into its own comet tail, which can be used
to propel itself inside the cell and pass from one cell to the next. Figure 4.4 illustrates part of
these curious traits.
Researchers in cell biology use Listeria to find out more about motility at the cellular
level. It has been discovered that certain proteins on the external surface of this parasite
(ActA) are responsible for the ability of Listeria to assemble an actin filament tail. Surprisingly, even small plastic beads artificially coated in Listeria's ActA proteins can perform
the same trick: they assemble an actin tail which pushes the bead like a tiny rocket.
In a recent paper, Bernheim-Groswasser et al. [1] describe the motion
of these beads, shown in Figure 4.5. When the position of the bead is plotted on a graph
with time as the horizontal axis (see Figure 4.6), we find that the trajectory is not a simple
one: it appears that the bead slows down periodically, and then accelerates.
With the techniques of this chapter, we can analyze the experimental data shown
in Figure 4.6 to determine both the average velocity of the beads, and the instantaneous
velocity over the course of the motion.
Average velocity of the bead
We can get a rough idea of how fast the micro-beads are moving by computing an average
velocity over the time interval shown on the graph. We can use two (approximate) data
points $(t, D(t))$ at the beginning and end of the run, for example (45, 20) and (80, 35). Then
the average velocity is
\[ \bar{v} = \frac{\Delta D}{\Delta t} \]


Figure 4.5. Small spherical beads coated with part of Listeria's special actin-assembly kit also gain the ability to swim around. Based on Bernheim-Groswasser et al.
[1].

Figure 4.6. The distance traveled by a little bead is shown as a function of time.
The arrows point to times when the particle slowed down or stopped. We can use this data
to analyze the velocity of the particles. Based on Bernheim-Groswasser et al [1].

\[ \bar{v} = \frac{35 - 20}{80 - 45} \approx 0.43\ \mu\text{m}\,\text{min}^{-1} \]

so the beads move with average velocity 0.43 microns per minute. (One micron is $10^{-6}$
meters.)
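The same arithmetic can be written as a small reusable function. This Python sketch is our own illustration (the function name `average_velocity` is an assumption); it uses the two approximate data points quoted above.

```python
def average_velocity(t1, d1, t2, d2):
    """Average velocity between two data points (t1, d1) and (t2, d2): change in D over change in t."""
    return (d2 - d1) / (t2 - t1)

# Approximate data points read off the graph: (45, 20) and (80, 35).
v_avg = average_velocity(45, 20, 80, 35)
print(round(v_avg, 2))  # prints 0.43
```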
The changing instantaneous velocity:
Because the actual data points are taken at finite time increments, the curve shown in Figure 4.6 is not smooth. We smooth it, as shown in Figure 4.7, for a simpler treatment.
In Figure 4.8 we sketch this curve together with a collection of lines that represent the
slopes of tangents along the curve. A horizontal tangent has slope zero: this means that
at all such points (also indicated by the arrows for emphasis), the velocity of the beads is


zero. Between these spots, the bead has picked up speed and moved forward until the next
time in which it stops.
We show the velocity v(t), which is the derivative of the original function D(t) in
Figure 4.9. As shown here, the velocity has periodic increases and decreases.

Figure 4.7. The (slightly smoothed) bead trajectory is shown here.

Figure 4.8. We have inserted a sketch of the tangent line configurations along
the trajectory from beginning to end. We observe that some of these tangent lines are
horizontal, implying a zero derivative, and, thus, a zero instantaneous velocity at that time.

Figure 4.9. Here we have sketched the velocity v(t) on the same graph as D(t).


Exercises
4.1. Find the first derivative for each of the following functions.
(a) $f(x) = (2x^2 - 3x)(6x + 5)$

(b) $f(x) = (x^3 + 1)(1 - 3x)$

(c) $g(x) = (x - 8)(x^2 + 1)(x + 2)$

(d) $f(x) = (x - 1)(x^2 + x + 1)$

(e) $f(x) = \dfrac{x^2 - 9}{x^2 + 9}$

(f) $f(x) = \dfrac{2 - x^3}{1 - 3x}$

(g) $f(b) = \dfrac{b^3}{2 - b^{2/3}}$

(h) $f(m) = \dfrac{m^2}{3m - 1} - (m - 2)(2m - 1)$

(i) $f(x) = \dfrac{(x^2 + 1)(x^2 - 2)}{3x + 2}$
4.2. Logistic growth rate: In logistic growth, the rate of growth of a population, R,
depends on the population size N as follows:
\[ R = rN\left(1 - \frac{N}{K}\right), \]

where r and K are positive constants. Find the rate of change of the growth rate
with respect to the population size.
4.3. Michaelis-Menten and Hill kinetics: Compute the derivatives of the following
functions:
(a) The Michaelis-Menten kinetics of Eqn. (1.7),
\[ v = \frac{Kx}{k_n + x}. \]

(b) The Hill function of Eqn. (1.6), that is
\[ y = \frac{Ax^n}{a^n + x^n}. \]

4.4. Volume, surface area and radius of a sphere: The volume and surface area of a
sphere both depend on its radius:
\[ V = \frac{4}{3}\pi r^3, \qquad S = 4\pi r^2. \]

(a) Find the rate of change of the volume with respect to the radius and the rate of
change of the surface area with respect to the radius.


(b) Find the rate of change of the surface area to volume ratio S/V with respect
to the radius.
4.5. Derivative of Volume with respect to surface area: Consider the volume and
surface area of a sphere. (See Problem 4 for the formulae.)
(a) Eliminate the radius and express V as a function of S.
(b) Find the rate of change of the volume with respect to the surface area.
4.6. Surface area and volume of a cylinder: The volume of a cylinder and the surface
area of a cylinder with two flat end-caps are
\[ V = \pi r^2 L, \qquad S = 2\pi r L + 2\pi r^2 \]

where L is the length and r the radius of the cylinder.


(a) Find the rate of change of the volume and surface area with respect to the
radius, assuming that the length L is held fixed.
(b) Find the rate of change of the surface area to volume ratio S/V with respect
to the radius assuming that the length L is held fixed.
4.7. Growing circular colony: A bacterial colony has the shape of a circular disk with
radius r(t) = 2 + t/2 where t is time in hours and r is in units of mm. Express
the area of the colony as a function of time and then determine the rate of change of
area with respect to time at t = 2 hr.
4.8. Rate of change of energy during foraging: When a bee forages for nectar in a
patch of flowers, it gains energy. Suppose that the amount of energy gained during
a foraging time span t is
\[ f(t) = \frac{Et}{k + t} \]
where E, k > 0 are constants.
(a) If the bee stays in the patch for a very long time, how much energy can it gain?
(b) Use the quotient rule to calculate the rate of energy gain while foraging in the
flower patch.
4.9. Ratio of two species: In a certain lake it is found that the rate of change of the
population size of each of two species (N1 (t), N2 (t)) is proportional to the given
population size. That is
\[ \frac{dN_1}{dt} = k_1 N_1, \qquad \frac{dN_2}{dt} = k_2 N_2 \]

where $k_1$ and $k_2$ are constants. Find the rate of change of the ratio of population
sizes $(N_1/N_2)$ with respect to time, $\dfrac{d(N_1/N_2)}{dt}$. Your answer will be in terms of
$k_1$, $k_2$ and the ratio $N_1/N_2$.
4.10. Invasive species and sustainability: An invasive species is one that can outcompete and grow faster than the native species, resulting in takeover and displacement
of the local ecosystem, disrupting sustainability. Consider the two-lake system of


Problem 9. Suppose that initially, the ratio of the native species N1 to the invasive
species N2 is very large. Under what condition (on the constants k1 , k2 ) will that
ratio decrease with time, i.e. will the invasive species take over?
4.11. Numerical derivatives: Consider the function
\[ y(x) = 5x^3, \qquad 0 \le x \le 1. \]

Use a spreadsheet (or your favorite software) to compute an approximation of the
derivative of this function over the given interval for $\Delta x = 0.25$ and compare to the
true derivative, using the power rule. Comment on the comparison. Then recompute
the approximation to the derivative using $\Delta x = 0.05$ and comment on the results.
4.12. Antiderivatives: Find antiderivatives of the following functions, that is find y(t).
(a) $y'(t) = t^4 + 3t^2 - t + 3$.

(b) $y'(x) = x + 2$.
(c) $y' = |x|$.

4.13. The velocity of a particle is known to depend on time according to the relationship
\[ v(t) = A - Bt^2, \qquad A, B > 0 \text{ constants} \]

(a) Find the acceleration a(t).


(b) Suppose that the initial position of the particle is y(0) = 0. Find the position
at time t.
(c) At what time does the particle return to the origin?
(d) When is the particle farthest away from the origin?
(e) What is the largest velocity of the particle?
4.14. The position of a particle is given by the function $y = f(t) = t^3 + 3t^2$.
(a) Find the velocity and the acceleration of the particle.
(b) A second particle has position given by the function $y = g(t) = at^4 + t^3$
where a is some constant and a > 0. At what time(s) are the particles in the
same position?
(c) At what times do the particles have the same velocity?
(d) When do the particles have the same acceleration?
4.15. A ball is thrown from a tower of height h0 . The height of the ball at time t is
\[ h(t) = h_0 + v_0 t - \frac{1}{2} g t^2 \]
where h0 , v0 , g are positive constants.
(a) When does the ball reach its highest point?
(b) How high is it at that point?
(c) What is the instantaneous velocity of the ball at its highest point ?

Figure 4.10. Figure for Problem 16 (graph of $f'$ against x).
4.16. Sketch the graph of a function f (x) whose derivative is shown in Figure 4.10. Is
there only one way to draw this sketch? What difference might occur between the
sketches drawn by two different students?
4.17. Given the derivative $f'(x)$ shown in Figure 3.16(c), graph the second derivative
$f''(x)$.
4.18. Shown in Figure 4.11 is the graph of $f'(x)$, the derivative of some function. Use
this to sketch the graphs of the two related functions, $f(x)$ and $f''(x)$.

Figure 4.11. Figure for Problem 18
4.19. Sketching graphs: Consider the function shown in Fig. 4.12. Sketch the antiderivative and the derivative of this function, that is sketch F (x) and F (x).


Figure 4.12. Figure for problem 19.

Chapter 5

Tangent lines, linear approximation, and Newton's method
A straight line has the property that its slope is the same at every point on its graph. Thus,
given a known point (x1 , y1 ) on the line and the slope m, the equation of the line is found
from the statement that any other point (x, y) on the line should satisfy
\[ \frac{\text{rise}}{\text{run}} = \frac{y - y_1}{x - x_1} = m. \qquad (5.1) \]

A review of properties of straight lines is provided in Appendix A. Here we use Eqn. (5.1)
in many examples where we seek equations of tangent lines or properties of those lines.
Tangent lines approximate the local behaviour of a function near a point. This fact
will lead us to linear approximation, which is a way to estimate values of functions that
are not easy to calculate at a point of interest. A further application of the tangent line is
to Newton's method for approximating zeros of a function, that is, values of x for which
$f(x) = 0$.

5.1 The equation of a tangent line

We first consider a number of simple examples of equations of a tangent line that are easily
found.

Section 5.1 Learning goals


1. Given a simple function y = f (x) and a point x, be able to find the equation of the
tangent line to the graph at that point.
2. Be able to graph both the function and its tangent line using a spreadsheet or your
favorite software.
Figure 5.1. The graph of the parabola $y = f(x) = x^2$ and its tangent lines at
x = 1 and x = 2 (Tangent line 1 and Tangent line 2), with their intersection point marked. See Example 5.1 for the equations and point of intersection of these
tangent lines.

5.1.1 Simple functions and their tangent lines


Example 5.1 (Tangent to a parabola) Find the equations of the tangent lines to the parabola
$y = f(x) = x^2$ at the points x = 1 and x = 2. Then determine whether these tangent lines
intersect, and if so, where.
Solution: Let us denote by Line 1 the tangent line that goes through the point x = 1 and
Line 2 the line through x = 2.
The derivative of $f(x) = x^2$ is $f'(x) = 2x$, so the slopes, $m_i$, of these tangent lines
are $m_1 = f'(1) = 2 \cdot 1 = 2$ (for Line 1) and $m_2 = f'(2) = 2 \cdot 2 = 4$ (for Line 2). Moreover
each tangent line intersects the parabola at the point of tangency. Using the values x = 1
and x = 2 we find that the corresponding points on the curve, $(x, x^2)$, are (1, 1) and (2, 4)
for Line 1 and Line 2 respectively. Then we have that
\[ \text{Line 1:} \quad \frac{y - 1}{x - 1} = m_1 = 2 \quad\Rightarrow\quad y = 1 + 2(x - 1) \quad\Rightarrow\quad y = 2x - 1, \]
\[ \text{Line 2:} \quad \frac{y - 4}{x - 2} = m_2 = 4 \quad\Rightarrow\quad y = 4 + 4(x - 2) \quad\Rightarrow\quad y = 4x - 4. \]
Two lines intersect when their y values (and x values) are the same. Solving for x we get
\[ 2x - 1 = 4x - 4 \quad\Rightarrow\quad -2x = -3 \quad\Rightarrow\quad x = \frac{3}{2}, \]
so indeed the two tangent lines intersect at x = 3/2 (where y = 2), as shown in Fig. 5.1.
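The steps of this example can be retraced in code. The Python sketch below is our own illustration: the helper names `tangent_line` and `intersection`, and the representation of a line by its slope and y-intercept, are assumptions rather than anything prescribed by the notes.

```python
def tangent_line(f, df, x0):
    """Return (slope, intercept) of the tangent line y = slope*x + intercept at x0."""
    slope = df(x0)
    intercept = f(x0) - slope * x0
    return slope, intercept

def intersection(line1, line2):
    """Point (x, y) where two non-parallel lines, each given as (slope, intercept), cross."""
    m1, c1 = line1
    m2, c2 = line2
    x = (c2 - c1) / (m1 - m2)
    return x, m1 * x + c1

def f(x):
    return x ** 2

def df(x):
    return 2 * x

line1 = tangent_line(f, df, 1.0)  # tangent at x = 1
line2 = tangent_line(f, df, 2.0)  # tangent at x = 2
print(line1, line2, intersection(line1, line2))
```

Two tangent lines with equal slopes would make the denominator `m1 - m2` zero; for the parabola this cannot happen at two distinct points, since the slope 2x is different at each x.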
The next example points to the fact that a tangent line can be used to approximate the
zero of a function. This idea will be developed into a useful approximation method called
Newton's method.


Example 5.2 Draw the graph of the function $y = f(x) = x^3 - x$ together with its tangent
line at the point x = 1.5. Where does that tangent line intersect the x axis? Compare that
point of intersection with a zero of the function.
Solution: The coordinates of the point of interest $(x, f(x))$ are $(1.5, f(1.5)) = (1.5, 1.875)$.
Figure 5.2. The graph of the function $y = f(x) = x^3 - x$ is shown in black,
together with its tangent line at the point x = 1.5 and the point where that line intersects the x axis. In this low magnification view, we
see that the tangent line stays close to the graph of the function only close to the point of
tangency. Away from that point, it strays off.
Using differentiation rules, the derivative of the polynomial $f(x) = x^3 - x$ is $f'(x) = 3x^2 - 1$.
A tangent line at the point x = 1.5 has slope $m = f'(1.5) = 3(1.5)^2 - 1 = 5.75$.
Thus the equation of the tangent line is
\[ \frac{y - 1.875}{x - 1.5} = 5.75 \quad\Rightarrow\quad y = 1.875 + 5.75(x - 1.5) \quad\Rightarrow\quad y = 5.75x - 6.75. \]
The tangent line intersects the x axis when y = 0, which occurs at
\[ 0 = 5.75x - 6.75 \quad\Rightarrow\quad x = \frac{6.75}{5.75} \approx 1.1739. \]
This is close (but not equal) to one of the zeros of the function ($x^3 - x = 0$ at x = 1). Here
we can easily find all zeros by solving explicitly, but for more complicated functions we
will develop this idea to refine the approximation of a zero using Newton's method.
To graph the function together with its tangent line (as distinct from using a zoom
to simply view the function locally, as we had done in Fig. 3.1), we use software to graph
both $y = x^3 - x$ and $y = 5.75x - 6.75$ on the same coordinate system. In Fig. 5.2 we have
used a simple spreadsheet to do so.
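In place of a spreadsheet, the numbers in this example can be reproduced with a few lines of Python. This is a minimal sketch of our own (the variable names are assumptions): it builds the tangent line to $x^3 - x$ at x = 1.5 in slope-intercept form and solves for the point where that line crosses the x axis.

```python
def f(x):
    return x ** 3 - x

def df(x):
    # Derivative of x**3 - x, by the power rule.
    return 3 * x ** 2 - 1

x0 = 1.5
slope = df(x0)                    # slope of the tangent line at x0
intercept = f(x0) - slope * x0    # y-intercept, so the line is y = slope*x + intercept
x_crossing = -intercept / slope   # solve 0 = slope*x + intercept for x

print("tangent line: y =", slope, "* x +", intercept)
print("crosses the x axis at x =", x_crossing)
print("function value there:", f(x_crossing))
```

The function value at the crossing point is not zero, which illustrates that the crossing point only approximates the true zero at x = 1.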
Example 5.3 (a) Find the equation of the tangent line to
\[ y = f(x) = x^3 - ax \]
for a > 0 a constant, at the point x = 1. (b) Find where that tangent line intersects the x
axis.

Solution: The function given in the example is a simple polynomial, so we easily calculate
its derivative. The idea is very similar to that of the previous example, but the constant a
makes this calculation a little less straightforward. (a) $y = f(x) = x^3 - ax$ so the derivative
is
\[ \frac{dy}{dx} = f'(x) = 3x^2 - a \]
and at x = 1 the slope (in terms of the constant a) is $f'(1) = 3 - a$. The point of interest
on the curve has coordinates $x = 1$, $y = 1^3 - a \cdot 1 = 1 - a$.
We look for a line through $(1, 1 - a)$ with slope $m = 3 - a$. That is,
\[ \frac{y - (1 - a)}{x - 1} = 3 - a. \]
Simplifying algebraically leads to
\[ y = (3 - a)(x - 1) + (1 - a) \]
or simply
\[ y = (3 - a)x - 2. \]
[Remark: at this point it is wise to check that the tangent line goes through the desired
point and has the slope we found. One way to do this is to pick a simple value for a, e.g.
a = 1, and do a quick check that the answer matches what we have found.]
(b) To find the point of intersection, we set
\[ y = (3 - a)x - 2 = 0 \]
and solve for x. We find that
\[ x = \frac{2}{3 - a}. \]

Example 5.4 Find the equation of the tangent line to the function $y = f(x) = \sqrt{x}$ at the
point x = 4.

Solution: In Exercise 10 of Chapter 3, we verified that the derivative of $y = f(x) = \sqrt{x}$
is $f'(x) = 1/(2\sqrt{x})$ (see (3.6)). At x = 4 the slope of the function is $f'(4) = 1/(2\sqrt{4}) = 1/4$
and the point on the graph at which the tangent line is needed is (4, 2). Then the
equation of the tangent line is
\[ \frac{y - 2}{x - 4} = 0.25 \quad\Rightarrow\quad y = 2 + 0.25(x - 4). \]

5.2 Generic tangent line equation and properties

Section 5.2 Learning goals


1. Understand the generic form of the tangent line equation and be able to connect it to
the geometry of the tangent line.
2. Be able to find the coordinate of the point at which the tangent line intersects the $x$
axis (important for Newton's method later on in Section 5.4).

5.2.1 Generic tangent line equation


Based on the above examples, we develop the equation of a tangent line to an arbitrary
function at a point. Shown in Fig. 5.3 is some smooth function $y = f(x)$, which we will
assume to be differentiable at some point labeled $x_0$. At that point, a tangent line to the
graph has been drawn. We wish to write down the equation of this line. We use two facts,
as before: (1) The line goes through the point $(x_0, f(x_0))$. (2) The line has slope given by
the derivative of the function at the point of interest, that is, $m = f'(x_0)$. As before, we
write down
$$\frac{y - f(x_0)}{x - x_0} = m = f'(x_0).$$
Rearranging this and eliminating the notation $m$, we have
$$y = f(x_0) + f'(x_0)(x - x_0). \qquad (5.2)$$

Figure 5.3. The graph of an arbitrary function $y = f(x)$ and a tangent line at
$x = x_0$. The equation of this generic tangent line is (5.2).

Thus, in general, (5.2) is the desired tangent line equation.
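Equation (5.2) translates directly into a few lines of code. The sketch below is only an illustration: the helper name `tangent_line` is our own, and the caller is assumed to supply both the function and its derivative. It is applied to the square root function of Example 5.4.

```python
import math

def tangent_line(f, fprime, x0):
    """Return the tangent line of f at x0 as a function of x, following (5.2):
    y = f(x0) + f'(x0) (x - x0)."""
    y0 = f(x0)        # height of the curve at the point of tangency
    m = fprime(x0)    # slope of the tangent line

    def line(x):
        return y0 + m * (x - x0)

    return line

# Example 5.4: f(x) = sqrt(x), f'(x) = 1 / (2 sqrt(x)), tangent at x0 = 4.
line = tangent_line(math.sqrt, lambda x: 1 / (2 * math.sqrt(x)), 4.0)
print(line(4.0))  # value at the point of tangency: 2.0
print(line(6.0))  # value of y = 2 + 0.25 (x - 4) at x = 6: 2.5
```

Returning the line as a function (rather than as a slope and intercept) keeps the code close to the form of (5.2), which is the form used for linear approximation in Section 5.3.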

5.2.2 Where a tangent line intersects the x axis


In Example 5.2, we found a tangent line and then determined where it intersects the x axis.
We can do the same with the generic tangent line equation (5.2), as discussed in the next
example.


Example 5.5 Let y = f (x) be a smooth function, differentiable at x0 , and suppose that
(5.2) is the equation of the tangent line to the curve at x0 . Find the coordinate of the point
at which this tangent line intersects the x axis.
Solution: At the intersection with the $x$ axis, we have $y = 0$. Plugging this into
$y = f(x_0) + f'(x_0)(x - x_0)$ leads to
$$0 = f(x_0) + f'(x_0)(x - x_0) \quad\Rightarrow\quad (x - x_0) = -\frac{f(x_0)}{f'(x_0)} \quad\Rightarrow\quad x = x_0 - \frac{f(x_0)}{f'(x_0)}.$$
Thus the desired $x$ coordinate, which we will refer to as $x_1$, is
$$x_1 = x_0 - \frac{f(x_0)}{f'(x_0)}. \qquad (5.3)$$
This result will turn out to be of particular relevance in Section 5.4, where we discuss
Newton's method for approximating the zeros of a function.
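Formula (5.3) is simple to evaluate by computer. The following sketch (the name `x_intercept` and the error handling are our own illustrative choices) returns the point where the tangent line at $x_0$ meets the $x$ axis, and refuses to divide by zero when the tangent line is horizontal.

```python
def x_intercept(f, fprime, x0):
    """x coordinate at which the tangent line at x0 crosses the x axis, Eqn. (5.3)."""
    slope = fprime(x0)
    if slope == 0:
        # A horizontal tangent line has no single crossing point with the x axis.
        raise ValueError("f'(x0) = 0: the tangent line is horizontal")
    return x0 - f(x0) / slope

# Illustration with f(x) = x^2 - 6 and x0 = 1 (the function used later in Example 5.11).
x1 = x_intercept(lambda x: x**2 - 6, lambda x: 2 * x, 1.0)
print(x1)  # 1 - (-5)/2 = 3.5
```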

5.3 Close to a point, we can approximate a function by its tangent line

Section 5.3 Learning goals
1. Understand that a tangent line approximates the behaviour of a function close to the
point of tangency.
2. Be able to use this idea to find a linear approximation to a value of a given function
at some point.
3. Be able to determine whether the linear approximation over or underestimates the
value of the function.
In Section 5.3 we encountered the idea that the tangent line approximates the behaviour of a function. Here we utilize this idea in a formal procedure called linear approximation. In this technique, the tangent line is used to generate approximate values of
a function close to some point at which the value of the function and of its derivative are
known, or are easy to calculate. The essential ideas are these:
1. The generic equation of the tangent line to a curve at a point $(x_0, f(x_0))$ is given
by Eqn. (5.2). That line approximates the behaviour of the function close to $x_0$, and
leads to the so-called linear approximation of the function:
$$y = f(x_0) + f'(x_0)(x - x_0) \approx f(x) \quad\Rightarrow\quad f(x) \approx f(x_0) + f'(x_0)(x - x_0).$$
2. The tangent line can approximate the behaviour of a function close to the point of
tangency.
3. The approximation is exact at $x = x_0$, and holds well provided $x$ is close to $x_0$. (The
expression on the right hand side is precisely the value of $y$ on the tangent line based at
$x = x_0$).
For example, consider the function
$$y = f(x) = \sqrt{x}.$$
The exact value of this function is well known at a number of judiciously chosen values
of $x$, e.g. $\sqrt{1} = 1$, $\sqrt{4} = 2$, $\sqrt{9} = 3$, etc. Suppose we want to approximate the value
of the square root of 6. This is easily done with a scientific calculator, of course, but we
can also use a rough approximation which uses only simple known values of the square
root function and some elementary manipulations. We know the value of the function at
an adjoining point, i.e. at $x = 4$, since $f(4) = \sqrt{4} = 2$. In Example 5.9 we use these
facts, together with the tangent line equation, to estimate the decimal approximation of $\sqrt{6}$.
Before doing so, we discuss other simple examples.
Example 5.6 Use the fact that the derivative of the function $f(x) = x^2$ is $f'(x) = 2x$ (as
found in Example 2.21) to find a linear approximation for the value $(10.03)^2$.
Solution: We know that $10^2 = 100$, so that the point $(10, 100)$ is on the graph of the
function $f(x) = x^2$. Further, we know that the slope of the tangent line at that point is
$f'(10) = 2(10) = 20$. Thus the equation of the tangent line, and the linear approximation
of the function are:
$$\frac{y - 100}{x - 10} = 20 \quad\Rightarrow\quad y = 100 + 20(x - 10), \qquad f(x) \approx 100 + 20(x - 10).$$
As before, we determine the value of $y$ corresponding to $x = 10.03$ as an approximation to
the value of the function. We obtain
$$f(10.03) \approx 100 + 20(10.03 - 10) = 100 + 20(0.03) = 100.6$$
This compares well with the true value of 100.6009 found using a calculator for the actual
function.
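The arithmetic of this example can be reproduced in a few lines. This is a minimal sketch with variable names of our own choosing; it computes the linear approximation and the true value, and shows that their difference is the small quantity $(0.03)^2 = 0.0009$.

```python
# Linear approximation of (10.03)^2 based at x0 = 10, as in Example 5.6.

def f(x):
    return x**2

def fprime(x):
    return 2 * x

x0 = 10.0   # base point where f and f' are easy to evaluate
x = 10.03   # point at which we want an approximate value

approx = f(x0) + fprime(x0) * (x - x0)   # value on the tangent line
exact = f(x)                             # true value of the function

print("linear approximation:", round(approx, 6))
print("true value:          ", round(exact, 6))
print("error (true - approx):", round(exact - approx, 6))
```

The error is positive, so the tangent line lies below the curve here, in agreement with the discussion of underestimates in Section 5.3.1.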
Example 5.7 (Approximating the sine of a small angle) Use a linear approximation to
find a rough value for $\sin(0.1)$.
Solution: In Example 3.4, we found that when we are close to $x = 0$ the graph of the
sine function looks very similar to the graph of its tangent line, $y = x$. This equation is
the linear approximation of the function $y = \sin(x)$ near $x = 0$. It implies that for small
enough values of $x$, we can approximate the function $y = \sin(x)$ by $y \approx x$ (provided $x$ is
in radians; see Chapter 14). Thus, at $x = 0.1$ radians, we find that $\sin(0.1) = 0.09983 \approx 0.1$.
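A quick numerical comparison makes the quality of this approximation concrete. The sketch below uses Python's standard `math.sin`, which works in radians; the variable names are our own.

```python
import math

x = 0.1                 # angle in radians
approx = x              # linear approximation sin(x) ~ x near x = 0
exact = math.sin(x)     # value from the math library

error = approx - exact  # positive: the tangent line lies above the curve for x > 0
print("sin(0.1) =", round(exact, 5))
print("approximation =", approx)
print("error =", round(error, 6))
```

The error is roughly $1.7 \times 10^{-4}$, which is small compared with the value being approximated.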


5.3.1 Accuracy of the linear approximation


Example 5.8 (Over or underestimate?) Determine in each of the previous examples whether
the linear approximation over or underestimates the true value of the function near the point
of tangency.
Solution: We show the functions and their linear approximations in Fig. 5.4(a,b). In (a)
we find that the tangent line to $y = x^2$ is always underneath the graph of the function, so
that a linear approximation underestimates the true value of the function. In (b), we see
that the tangent line to $y = \sin(x)$ at $x = 0$ is above the graph for $x > 0$ and below the
graph for $x < 0$. This means that the linear approximation is larger than (overestimates)
the function for $x > 0$ and smaller than (underestimates) the function for $x < 0$. Later, we
will associate these properties with the concavity of the function, that is, whether the graph
is locally concave up or down.
Figure 5.4. Functions (black curves) and their linear approximations (red) for
Examples 5.6 and 5.7: (a) $y = x^2$, (b) $y = \sin(x)$. Whenever the tangent line is below
(above) the curve, we say that the linear approximation under (over)-estimates the value of
the function.

Example 5.9 Use linear approximation to estimate the value of $\sqrt{6}$. Then determine
whether the linear approximation under or over estimates the function.
Solution: We use the following steps:

The derivative of $y = f(x) = \sqrt{x} = x^{1/2}$ is $f'(x) = 1/(2\sqrt{x}) = (1/2)x^{-1/2}$.

Both the function and its derivative require us to evaluate a square root. Some numbers
have convenient square roots (perfect squares), and we seek one close to $x = 6$.
In particular, $x = 4$ is such a number, and its square root (2) is well known to us.

We use $x = 4$ as the base point for the linear approximation. This means that we
"glue" a tangent line to the graph at that point. The slope of that tangent line is the
derivative, which we (easily) find to be $f'(4) = 1/(2\sqrt{4}) = 1/4 = 0.25$.

Putting these facts together, we find that the equation of a tangent line to the curve
$y = f(x) = \sqrt{x}$ at the point $x = 4$ is
$$y = f(4) + f'(4)(x - 4) \quad\Rightarrow\quad y = 2 + 0.25(x - 4).$$
In Figure 5.5(a), we show the original curve with tangent line superimposed. In
Figure 5.5(b) we show a zoomed portion of the same graph, on which the true value of $\sqrt{6}$
(black dot) is compared to the value on the tangent line, which approximates it (red dot),
i.e. to
$$f(6) = \sqrt{6} \approx 2 + 0.25(6 - 4) = 2.5.$$
(The actual value, computed on a calculator, is $\sqrt{6} = 2.449\ldots$) Since the tangent line is
above the graph of the function, we find that the linear approximation overestimates the
true value of the function. It is also evident from Fig. 5.5 that there is some error in the
approximation, since the values are clearly different. However, if we do not stray too far
from the point of tangency ($x = 4$), the error will not be too large.
Figure 5.5. Linear approximation based at $x = 4$ to the function $y = f(x) = \sqrt{x}$.
(a) The curve with its tangent line at $x = 4$. (b) Zoomed view comparing the actual value
of $\sqrt{6}$ with the approximate value on the tangent line.

In Table 5.1, we collect values of the function $f(x) = \sqrt{x}$ (computed by scientific
calculator), and compare with values of the linear approximation using the tangent line
through the point $(4, 2)$. At $x = 4$, the values of the function and of its approximation
are identical (naturally, since we "rigged" it to match at this point). Close to $x = 4$, the
values of the approximation are fairly close to the values of the function. Further away,
however, the difference between these gets bigger, and the approximation is no longer very
good at all.
These remarks illustrate two features: (1) the method is easy to use, and involves only
determination of a derivative, and elementary arithmetic. (2) The method has limitations,
and works well only close to the point at which the tangent line is based.

x          f(x) = sqrt(x)     y = f(x0) + f'(x0)(x - x0)
           (exact value)      (approx value)
 0.0000    0.0000             1.0000
 2.0000    1.4142             1.5000
 4.0000    2.0000             2.0000
 6.0000    2.4495             2.5000
 8.0000    2.8284             3.0000
10.0000    3.1623             3.5000
12.0000    3.4641             4.0000
14.0000    3.7417             4.5000
16.0000    4.0000             5.0000

Table 5.1. Linear approximation to $\sqrt{x}$. The exact value is recorded in column 2
and the linear approximation in column 3.
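The entries of Table 5.1 can be regenerated with a short script instead of a calculator. This is a sketch under the same setup as the text (base point $x_0 = 4$); the function name `linear_approx` and the extra error column are our own additions.

```python
import math

def linear_approx(x, x0=4.0):
    """Tangent-line approximation of sqrt(x) based at x0:
    sqrt(x0) + (x - x0) / (2 sqrt(x0))."""
    return math.sqrt(x0) + (x - x0) / (2 * math.sqrt(x0))

print(f"{'x':>8} {'exact':>8} {'approx':>8} {'error':>8}")
for x in range(0, 17, 2):
    exact = math.sqrt(x)
    approx = linear_approx(x)
    # The error column (approx - exact) is never negative: the tangent line
    # lies on or above the graph of sqrt(x).
    print(f"{x:8.4f} {exact:8.4f} {approx:8.4f} {approx - exact:8.4f}")
```

The error column grows as $x$ moves away from 4 in either direction, which is the limitation noted above.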

5.4 Tangent lines can help approximate the zeros of a function

In examples so far, we solved equations of the form $f(x) = 0$ using simple algebra. However, in many practical applications this is not possible. Consequently, we are occasionally
forced to find approximation methods to accomplish this task. Newton's method is one
such method.

Section 5.4 Learning goals


1. Understand the geometry on which Newton's method is based (Fig. 5.6).
2. For a given function $f(x)$ and initial guess $x_0$, be able to find improved values of the
decimal approximation for the zero of $f(x)$ (root of the equation $f(x) = 0$).
3. For a given function $f(x)$, be able to decide on a suitable initial guess for Newton's
method, so that the method can be initiated.
Consider the function $y = f(x)$ shown in Figure 5.6. We want to find the value $x$
such that
$$f(x) = 0.$$
In Figure 5.6, the desired point is indicated with the notation $x^*$. Usually, the decimal
expansion for the coordinate $x^*$ is not known in advance: that is what we are trying to
find. We will see that by applying Newton's method several times, we can generate such a
decimal expansion to any desired level of accuracy.
Suppose we have some very rough idea of some initial guess for the value of this
root. (How to find this initial guess will be discussed later.) Newton's method is a recipe
for getting better and better approximations of the true value, $x^*$.

Figure 5.6. Sketch showing the idea behind Newton's method. A (very rough)
initial guess $x_0$ is refined by "sliding down" the tangent line to the curve at $x_0$. This brings
us to a new (better) guess $x_1$ which is closer to the desired root $x^*$. Repeating this again and
again allows us to find the root to any desired accuracy.

5.4.1 Newton's method

In the diagram shown in Figure 5.6, $x_0$ represents an initial starting guess. We observe that
a tangent line to the graph of $f(x)$ at the point $x_0$ gives a rough indication of the behaviour
of the function near that point. We will use the tangent line as an approximation of the
actual function. The point at which the tangent line intersects the $x$ axis (denoted $x_1$) is an
approximation of the desired zero. Repeating the same idea over and over again, we will
find values that get closer and closer to the root $x^*$. We have a formula for the point $x_1$, as
calculated in Section 5.2.2, where we obtained the equation (5.3). To summarize:

Given an initial guess, $x_0$, for the root of the equation
$$f(x) = 0,$$
an improved value based on Newton's method is
$$x_1 = x_0 - \frac{f(x_0)}{f'(x_0)}.$$
We can repeat this procedure to get a better value
$$x_2 = x_1 - \frac{f(x_1)}{f'(x_1)},$$
$$x_3 = x_2 - \frac{f(x_2)}{f'(x_2)},$$
$$\vdots$$

In general, we can refine the approximation using as many steps as it takes to get the accuracy we want. (We will see in upcoming examples how to recognize when this accuracy is
attained.)

Given an approximation $x_k$ for the root of the equation $f(x) = 0$, we can improve the
accuracy of that approximation using the Newton's method iteration as follows:
$$x_{k+1} = x_k - \frac{f(x_k)}{f'(x_k)}.$$

Figure 5.7. Newton's method applied to solving $y = f(x) = x^2 - 6 = 0$.
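The iteration above can be expressed as a short routine. The following is a minimal sketch, not a definitive implementation: the function name `newton`, the stopping tolerance `tol`, and the iteration cap `max_iter` are our own assumptions, chosen to reflect the idea (discussed in the examples below) of stopping once successive values no longer change at the desired accuracy.

```python
def newton(f, fprime, x0, tol=1e-10, max_iter=50):
    """Approximate a root of f(x) = 0 by repeating x_{k+1} = x_k - f(x_k)/f'(x_k).

    tol and max_iter are illustrative defaults, not values from the text."""
    x = x0
    for _ in range(max_iter):
        slope = fprime(x)
        if slope == 0:
            # The tangent line is horizontal and does not cross the x axis.
            raise ZeroDivisionError("f'(x) = 0 at the current guess")
        x_new = x - f(x) / slope
        if abs(x_new - x) < tol:
            # Successive guesses agree to within tol: accept the new value.
            return x_new
        x = x_new
    raise RuntimeError("no convergence within max_iter iterations")

# The function of Figure 5.7: f(x) = x^2 - 6, starting from x0 = 1.
root = newton(lambda x: x**2 - 6, lambda x: 2 * x, 1.0)
print(round(root, 4))  # approximately 2.4495
```

A poor initial guess can make the iteration wander or fail, which is why the routine caps the number of steps rather than looping indefinitely.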


Example 5.10 Find zeros of the function $y = f(x) = x^3 - x - 3$.
Solution: Recall that in Example 6.8, we gave up on simple factoring or other algebra. We
now apply Newton's method to this function. We use the fact that $f'(x) = 3x^2 - 1$ to
set up the Newton's iteration. Given a starting guess $x_0$, the improved guess would be
$$x_1 = x_0 - \frac{f(x_0)}{f'(x_0)} = x_0 - \frac{x_0^3 - x_0 - 3}{3x_0^2 - 1}.$$
We start with $x_0 = 2$, and obtain the following results for $x_1$, $x_2$, etc.
$$x_1 = 1.727272727, \quad x_2 = 1.673691174, \quad x_3 = 1.67170257, \quad x_4 = 1.671699882.$$
Evidently, the iterates converge (get closer and closer) to the result $x \approx 1.6717$. Such calculations are best handled using a spreadsheet, to avoid repetitive arithmetical operations.
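As an alternative to a spreadsheet, the same iterates can be produced by a short loop. This sketch assumes the starting guess $x_0 = 2$, which is the value that reproduces the listed $x_1 = 1.727272727$; the variable names are our own.

```python
# Newton's method for f(x) = x^3 - x - 3 (Example 5.10).

def f(x):
    return x**3 - x - 3

def fprime(x):
    return 3 * x**2 - 1

x = 2.0          # starting guess x0 (assumed; it reproduces the x1 quoted in the text)
iterates = []
for k in range(1, 5):
    x = x - f(x) / fprime(x)   # one Newton step
    iterates.append(x)
    print(f"x{k} = {x:.9f}")
```

After four steps the value of $f$ at the current guess is extremely close to zero, which is another way to recognize that the iteration has converged.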

k    x_k       f(x_k)    f'(x_k)   x_{k+1}
0    1.00      -5.00     2.00      3.5
1    3.5       6.250     7.00      2.6071
2    2.6071    0.7972    5.2143    2.4543
3    2.4543    0.0234    4.9085    2.4495
4    2.4495    0.000     4.8990    2.4495

Table 5.2. Newton's method applied to Example 5.11. We start with $x_0 = 1$ as
our initial approximation and refine it four times.

Example 5.11 Use Newton's method to find a decimal approximation of the square root
of 6.
Solution: It is first necessary to restate the problem in the form "Find a value of $x$ such
that a certain function $f(x) = 0$." Clearly, a function that would accomplish this is
$$f(x) = x^2 - 6$$
since $f(x) = 0$ means $x^2 - 6 = 0$, i.e. $x = \sqrt{6}$ (taking the positive root). We could also
find other functions that have the same property, e.g. $f(x) = x^4 - 36$, but the above is one
of the simplest such functions.
We compute the derivative for this function:
$$f'(x) = 2x.$$
Thus the iteration for Newton's method is
$$x_1 = x_0 - \frac{f(x_0)}{f'(x_0)} = x_0 - \frac{x_0^2 - 6}{2x_0}.$$
Suppose we start with the initial guess $x_0 = 1$ (which is actually not very close to the value
of the root) and see how well Newton's method performs. This is shown in Figure 5.7. In
Figure 5.7(a) we see the graph of the function, the position of our initial guess $x_0$, and
the result of the improved Newton's method approximation $x_1$. In Fig. 5.7(b), we see how
the value of $x_1$ is then used to obtain $x_2$ by applying a second iteration (i.e. repeating the
calculation with the new value used as initial guess).
A spreadsheet is ideal for setting up the rather repetitive calculations involved, as
shown in Table 5.2. Observe that the last column contains the computed (Newton's method)
values, $x_1$, $x_2$, etc. These values are then copied onto the $x_k$ column to be used as new initial
guesses. We also observe that after several repetitions, the numbers calculated converge
(i.e. get closer and closer) to 2.4495, and no longer change to that level of accuracy. This
is a signal that we need no longer repeat the iteration, if we are satisfied with 5 significant
figures of accuracy.
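The spreadsheet layout described above (current guess, function value, derivative, next guess) can be mimicked with a loop that stores one row per iteration. This is a sketch with names of our own choosing; it is meant to reproduce the columns of Table 5.2.

```python
# Reproduce the columns of Table 5.2 for f(x) = x^2 - 6, starting from x0 = 1.

def f(x):
    return x**2 - 6

def fprime(x):
    return 2 * x

x = 1.0
rows = []
for k in range(5):
    x_next = x - f(x) / fprime(x)                 # Newton step
    rows.append((k, x, f(x), fprime(x), x_next))  # one spreadsheet row
    x = x_next                                    # "copy" the new value into the x_k column

print(f"{'k':>2} {'x_k':>8} {'f(x_k)':>8} {'fprime':>8} {'x_k+1':>8}")
for k, xk, fx, dfx, xn in rows:
    print(f"{k:2d} {xk:8.4f} {fx:8.4f} {dfx:8.4f} {xn:8.4f}")
```

The last two rows agree to four decimal places, which is the stopping signal described in the text.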

5.5 Harder tangent line problems: Finding the point of tangency

In Section 5.1.1, calculations were relatively straightforward, since we were given a function and a point at which a tangent line was desired. We now turn to a number of problems
based on derivatives, tangent lines, and slopes of polynomials. We use these to build up
our problem-solving skills in examples where the calculations are more subtle. In the examples below, we use information about a function to identify the slope and/or equation of
its tangent line.
Partly, the examples below are more subtle, since finding the point of tangency is
part of the question. In other cases, the problem involves a parameter whose value is not
specified initially. We use the generic tangent line equation (5.2), and solve for the unknown
point x0 (or the unknown parameters) based on other information in the problem.
Example 5.12 Find any value(s) of the constant $a$ such that the line $y = ax$ is tangent to
the curve
$$y = f(x) = -x^2 + 3x - 2.$$

Figure 5.8. Figure for Example 5.12


Solution: This example, too, revolves around the properties of a polynomial, but the problem is somewhat more challenging. We must use some geometric properties of the function
and the tentative candidate for a tangent line to determine the value of the unknown constant
a.
As shown in Figure 5.8, there may be one (or more) points at which tangency occurs.
We do not know the coordinate of any such point, but we will label it x0 to denote that
it is some definite (as yet to be determined) value. Notation can sometimes be confusing.
We must remember that while we can compute the derivative of f at any point, only the
specific point at which the tangent touches the curve will have special properties that we
will outline below. Finding that point of tangency, x0 , will be part of the problem.
What we know is that, at $x_0$,
1. The straight line and the graph of the function $f(x)$ go through the same point.
2. The straight line $y = ax$ and the tangent line to the graph coincide, i.e. the derivative
of $f(x)$ at $x_0$ is the same as the slope of the straight line, which is clearly $a$.
Using these two facts, we can write down the following equations:
Equating slopes:
$$f'(x_0) = -2x_0 + 3 = a$$
Equating $y$ values on line and graph of $f(x)$:
$$f(x_0) = -x_0^2 + 3x_0 - 2 = ax_0$$
We now have two equations for two unknowns, ($a$ and $x_0$). We can solve this system easily
by substituting the value of $a$ from the first equation into the second, getting
$$-x_0^2 + 3x_0 - 2 = (-2x_0 + 3)x_0.$$
Simplifying:
$$-x_0^2 + 3x_0 - 2 = -2x_0^2 + 3x_0$$
so
$$x_0^2 - 2 = 0, \quad x_0 = \pm\sqrt{2}.$$
This shows that there are two points at which the conditions would apply. In Figure 5.9 we show two such points.

Figure 5.9. Figure for solution to Example 5.12

We can now find the slope $a$ using $a = -2x_0 + 3$. We get:
$$x_0 = \sqrt{2} \quad\Rightarrow\quad a = -2\sqrt{2} + 3,$$
and
$$x_0 = -\sqrt{2} \quad\Rightarrow\quad a = 2\sqrt{2} + 3.$$
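The two solutions can be checked numerically. The sketch below assumes the curve $f(x) = -x^2 + 3x - 2$ and uses an independent test of tangency that is our own addition: the line $y = ax$ is tangent to the parabola exactly when $f(x) - ax = 0$ has a double root, i.e. when the discriminant of $x^2 + (a - 3)x + 2 = 0$ vanishes.

```python
import math

def f(x):
    return -x**2 + 3 * x - 2

def fprime(x):
    return -2 * x + 3

results = []
for x0 in (math.sqrt(2), -math.sqrt(2)):
    a = fprime(x0)                      # slope of the candidate tangent line y = a x
    # Condition 1: line and curve pass through the same point at x0.
    assert abs(f(x0) - a * x0) < 1e-12
    # Condition 2 (independent check): f(x) = a x has a double root, so the
    # discriminant of x^2 + (a - 3) x + 2 = 0 must be zero.
    discriminant = (a - 3) ** 2 - 4 * 2
    assert abs(discriminant) < 1e-12
    results.append((x0, a))
    print(f"x0 = {x0:.4f}, a = {a:.4f}")
```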

Remark: This problem illustrates the idea that in some cases, we proceed by listing
properties that are known to be true, using the information to obtain a set of (algebraic)
equations, and then solving those equations. The challenge is to use these sequential steps
properly: each step on its own is relatively understandable and clear-cut. Most problems
encountered in scientific and engineering applications require a whole chain of reasoning,
calculation, or logic, so practicing such multi-step problems is an important part of training
for science, medicine, engineering, and other fields.
Example 5.13 Find the equation of the tangent line to the curve $y = f(x) = 1 - x^2$ that
goes through the point (1,1).
Solution: Finding the point of tangency $x_0$ is part of the problem, since this is not provided.
We use the following facts: (1) The tangent line goes through the point $(x_0, f(x_0))$ on the
graph of the function and has slope $f'(x_0)$. (2) Consequently, its equation will have the
form (5.2). For the given function and point of tangency $x_0$, we have
$$f(x_0) = 1 - x_0^2, \qquad f'(x_0) = -2x_0.$$
Hence the tangent line equation is
$$y = f(x_0) + f'(x_0)(x - x_0) = (1 - x_0^2) - 2x_0(x - x_0).$$
We are told that this line goes through the point $(x, y) = (1, 1)$ so that
$$1 = (1 - x_0^2) - 2x_0(1 - x_0) \quad\Rightarrow\quad 0 = x_0^2 - 2x_0 \quad\Rightarrow\quad x_0^2 = 2x_0.$$
Thus, there are two possible points of tangency, $x_0 = 0, 2$, and two tangent lines that satisfy
the given condition. Plugging in these two values of $x_0$ into the generic equation for $y$ leads
to the two tangent line equations
$$y = 1, \qquad y = (1 - 2^2) - 2 \cdot 2(x - 2) = -3 - 4(x - 2).$$
It is easily checked that both lines go through the point (1,1) as desired.
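The final check mentioned above is easy to automate. In this sketch (the helper name `tangent_at` is our own) we build the tangent line at each candidate point of tangency and confirm that it passes through $(1, 1)$.

```python
def f(x):
    return 1 - x**2

def fprime(x):
    return -2 * x

def tangent_at(x0):
    """Tangent line to y = 1 - x^2 at x0, using the generic form (5.2)."""
    return lambda x: f(x0) + fprime(x0) * (x - x0)

for x0 in (0.0, 2.0):
    line = tangent_at(x0)
    # Both candidate tangent lines should pass through the point (1, 1).
    print("x0 =", x0, " value of tangent line at x = 1:", line(1.0))
```

A point of tangency that does not solve $x_0^2 = 2x_0$, for example $x_0 = 1$, gives a tangent line that misses $(1, 1)$.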
Example 5.14 Shown in Fig. 5.10 is the function
$$f(x) = C\frac{x}{x + a}$$
together with one of its tangent lines. The tangent line goes through a point $(-d, 0)$ as well
as a point on the graph of the function. Find the point $x_0$ and the equation of the tangent
line.

Solution: Finding the point of tangency $x_0$ is part of the problem in this case too. We use
the same approach, and employ facts (1) and (2) from Example 5.13. We also use, for the
specific function in this example,
$$f(x_0) = C\frac{x_0}{x_0 + a}, \qquad f'(x_0) = C\frac{a}{(x_0 + a)^2}.$$
(See Problem 12 in Chapter 3). Hence, the equation of the tangent line is
$$y = f(x_0) + f'(x_0)(x - x_0) = C\frac{x_0}{x_0 + a} + C\frac{a}{(x_0 + a)^2}(x - x_0).$$

Figure 5.10. The graph of a function and its tangent line for Example 5.14.
We can simplify this equation by factoring out common factors to obtain:
$$y = \frac{C}{(x_0 + a)^2}\left(x_0(x_0 + a) + a(x - x_0)\right) = \frac{C}{(x_0 + a)^2}\left(x_0^2 + ax\right).$$
It is important to realize that in this equation, $x_0$, $C$ and $a$ represent fixed (known) constants,
and only $x$, $y$ are variables. This means that the equation expresses a linear relationship
between $x$ and $y$, as appropriate for a straight line.
We know that the point $(-d, 0)$ is on this line, so that (plugging in $x = -d$, $y = 0$),
we obtain
$$0 = \frac{C}{(x_0 + a)^2}\left(x_0^2 - ad\right).$$
Solving for $x_0$ leads to $x_0 = \sqrt{ad}$. Moreover, we can now find the equation of the tangent
line in terms of these parameters.
$$y = \frac{C}{(\sqrt{ad} + a)^2}(ad + ax).$$
This can be simplified by factoring $a$ from numerator and denominator to obtain
$$y = \frac{C}{a\left(\sqrt{d/a} + 1\right)^2}(d + x).$$
We can easily see that when $x = -d$, we get $y = 0$, as required. This forms one check that
our calculations are correct.
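A numerical spot check is a useful complement to the algebra. The sketch below picks arbitrary sample values for $C$, $a$ and $d$ (these numbers are our own assumptions, not from the text), builds the tangent line at $x_0 = \sqrt{ad}$ directly from (5.2), and compares it with the simplified closed form.

```python
import math

# Arbitrary positive sample values, chosen only for illustration.
C, a, d = 2.0, 3.0, 1.5

def f(x):
    return C * x / (x + a)

def fprime(x):
    return C * a / (x + a) ** 2

x0 = math.sqrt(a * d)   # point of tangency found in the example

def tangent(x):
    """Tangent line at x0 built directly from the generic form (5.2)."""
    return f(x0) + fprime(x0) * (x - x0)

def closed_form(x):
    """Simplified tangent line: y = C (d + x) / (a (sqrt(d/a) + 1)^2)."""
    return C * (d + x) / (a * (math.sqrt(d / a) + 1) ** 2)

print("tangent line at x = -d:", tangent(-d))            # should be (numerically) zero
print("difference at x = 1:", tangent(1.0) - closed_form(1.0))
```

Two lines that agree at two distinct points are identical, so agreement at $x = -d$ and at one other value of $x$ is enough to confirm the closed form for these sample parameters.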


Exercises
5.1. Find the equation of the tangent line to the function y = f (x) = |x + 1| at:
(a) x = 1,

(b) x = 2,
(c) x = 0.

If there is a problem finding a tangent line at one of these points, indicate what the
problem is.
5.2. A function $f(x)$ satisfies $f(1) = 1$ and $f'(1) = 2$. What is the equation of the
tangent line of $f(x)$ at $x = 1$?
5.3. Shown in Figure 5.11 is the graph of $y = x^2$ with one of its tangent lines.
(a) Show that the slope of the tangent to the curve $y = x^2$ at the point $x = a$ is
$2a$.
(b) Suppose that the tangent line intersects the $x$ axis at the point (1,0). Find the
coordinate, $a$, of the point of tangency.

Figure 5.11. Figure for Problem 3


5.4. Shown in Figure 5.12 is the function $f(x) = 1/x^4$ together with its tangent line at
$x = 1$.
(a) Find the equation of the tangent line.
(b) Determine the points of intersection of the tangent line with the x and the y
axes.
(c) Use the tangent line to obtain a linear approximation to the value of f (1.1).
Is this approximation larger or smaller than the actual value of the function at
x = 1.1?
5.5. Tangent line, continued: Shown in Figure 5.13 is the function $f(x) = x^3$ with a
tangent line at the point (1, 1).
(a) Find the equation of the tangent line.
(b) Determine the point at which the tangent line intersects the x axis.

Figure 5.12. Figure for problem 4


(c) Compute the value of the function at x = 1.1. Compare this with the value of y
on the tangent line at x = 1.1. (This latter value is the linear approximation of
the function at the desired point based on its known value and known derivative
at the nearby point x = 1.)
Figure 5.13. Figure for Problem 5


5.6. Shown in Figure 5.14 is the graph of a function and its tangent line at the point x0 .
(a) Find the equation of the tangent line expressed in terms of $x_0$, $f(x_0)$ and
$f'(x_0)$.
(b) Find the coordinate x1 at which the tangent line intersects the x axis.
5.7. Estimating a square root: Use Newton's method to find an approximate value for
$\sqrt{8}$. (Hint: First think of a function, $f(x)$, such that $f(x) = 0$ has the solution
$x = \sqrt{8}$).
5.8. Finding points of intersection: Find the point(s) of intersection of: $y_1 = 8x^3 -
10x^2 + x + 2$ and $y_2 = x^3 + 15x^2 - x - 4$ (Hint: an intersection point exists between
$x = 3$ and $x = 4$).
5.9. Roots of cubic equations: Find the roots for each of the following cubic equations
using Newton's method:
(a) $x^3 + 3x - 1 = 0$
(b) $x^3 + x^2 + x - 2 = 0$
(c) $x^3 + 5x^2 - 2 = 0$ (Hint: Find an approximation to a first root $a$ using Newton's
method, then divide the left hand side of the equation by $(x - a)$ to obtain a
quadratic equation, which can be solved by the quadratic formula.)

Figure 5.14. Figure for problem 6
5.10. The parabola $y = x^2$ has two tangent lines that intersect at the point (2, 3). These
are shown as the dark lines in Figure 5.15. [Remark: note that the point (2, 3) is
not on the parabola]. Find the coordinates of the two points at which the lines are
tangent to the parabola.

Figure 5.15. Figure for Problem 10


5.11. An approximation for the square root: Use a linear approximation to find a rough
estimate of the following functions at the indicated points.
(a) $y = \sqrt{x}$ at $x = 10$. (Use the fact that $\sqrt{9} = 3$.)


(b) y = 5x 2 at x = 1.
5.12. Use the method of linear approximation to find the cube root of
(a) 0.065 (Hint: $\sqrt[3]{0.064} = 0.4$)
(b) 215 (Hint: $\sqrt[3]{216} = 6$)


5.13. Use the data in the graph in Figure 5.16 to make the best approximation you can to
f (2.01).

Figure 5.16. Figure for Problem 13. The graph of $y = f(x)$ with the labeled points (2, 1)
and (3, 0).


5.14. Approximate the value of $f(x) = x^3 - 2x^2 + 3x - 5$ at $x = 1.001$ using the method
of linear approximation.
5.15. Approximate the volume of a cube whose length of each side is 10.1 cm.

Chapter 6

Sketching the graph of a function using calculus tools

The derivative of a function contains a lot of important information about the behaviour of
a function. In this chapter we will focus on how properties of the first and second derivative
can be used to help us refine curve-sketching techniques.

6.1 Overall shape of the graph of a function

Section 6.1 Learning goals


1. Understand that the sign of the first derivative corresponds to an increasing or decreasing property of a function.
2. Understand that the sign of the second derivative corresponds to the concavity (curvature) of a function.

6.1.1 Increasing and decreasing functions


Consider a function given by $y = f(x)$. We first make the following observations:
1. If $f'(x) > 0$ then $f(x)$ is increasing.
2. If $f'(x) < 0$ then $f(x)$ is decreasing.
Naturally, we read graphs from left to right, i.e. in the direction of the positive $x$ axis,
so when we say "increasing" we mean that as we move from left to right, the value of the
function gets larger.
We can use the same ideas to relate the second derivative to the first derivative.
1. If $f''(x) > 0$ then $f'(x)$ is increasing. This means that the slope of the original
function is getting steeper (from left to right). The function curves upwards: we say
that it is concave up. See Figure 6.1(a).
2. If $f''(x) < 0$ then $f'(x)$ is decreasing. This means that the slope of the original
function is getting shallower (from left to right). The function curves downwards:
we say that it is concave down. See Figure 6.1(b).

6.1.2 Concavity and points of inflection


The second derivative of a function provides information about the curvature of the graph
of the function, also called the concavity of the function. In Figure 6.1(a), f (x) is concave
up, and its second derivative (not shown) would be positive. In Figure 6.1(b), f (x) is
concave down, and second derivative would be negative.
Figure 6.1. In (a) the function is concave up, and its derivative thus increases
(in the positive direction). In (b), for a concave down function, we see that the derivative
decreases.
Definition 6.1. A point of inflection of a function f (x) is a point x at which the concavity
of the function changes. (See, for example, Fig. 6.2.)

Figure 6.2. An inflection point is a place where the concavity of a function changes.
(At the inflection point shown, $f''(x) = 0$; the graph is concave down where $f''(x) < 0$ and
concave up where $f''(x) > 0$.)
We can deduce from the definition and previous remarks that at a point of inflection
the second derivative changes sign. This is illustrated in Figure 6.2. Note carefully: It is
not enough to show that $f''(x) = 0$ to conclude that $x$ is an inflection point. We summarize
the one-way nature of this relationship in the box. Then, in Example 6.2 we show why this
is true.

Inflection points
1. If the function $y = f(x)$ has a point of inflection at $x_0$ then $f''(x_0) = 0$.
2. If the function $y = f(x)$ satisfies $f''(x_0) = 0$, we cannot conclude that it has a
point of inflection at $x_0$. We must actually check that $f''(x)$ changes sign at $x_0$.
Example 6.2 Consider the functions (a) $f_1(x) = x^3$ and (b) $f_2(x) = x^4$. Show that
for both functions, the second derivative is zero at the origin ($f''(0) = 0$) but that only one
of these functions actually has an inflection point at $x = 0$.
Solution: The functions are
$$\text{(a) } y = f_1(x) = x^3, \qquad \text{(b) } y = f_2(x) = x^4.$$
The first derivatives are
$$\text{(a) } y' = f_1'(x) = 3x^2, \qquad \text{(b) } y' = f_2'(x) = 4x^3,$$
and the second derivatives are:
$$\text{(a) } y'' = f_1''(x) = 6x, \qquad \text{(b) } y'' = f_2''(x) = 12x^2.$$
Thus, at $x = 0$ we have $f_1''(0) = 0$, $f_2''(0) = 0$. However, $x = 0$ is NOT an inflection
point of $x^4$. In fact, it is a local minimum, as is evident from Figure 6.3.

Figure 6.3. The functions (a) $f_1(x) = x^3$ and (b) $y = f_2(x) = x^4$ both satisfy
$f''(0) = 0$. However, only $x^3$ has an inflection point at $x = 0$, whereas $x^4$ has a local
minimum at that point. This results from the fact that $f_2''(x)$ does not change sign at $x = 0$.

6.1.3 Determining whether $f''(x)$ changes sign

In the previous section, we defined an inflection point as a point on the graph of a function
at which the second derivative changes sign. But how do we detect if this sign change
occurs at a given point? Here we address this question and provide a few helpful tools.
We first state the following important result.

Sign change in a product of factors:

If an expression is a product of factors, e.g. $g(x) = (x - a_1)^{n_1} (x - a_2)^{n_2} \dots (x - a_m)^{n_m}$,
then
1. The expression can be zero only at the points $x = a_1, a_2, \dots, a_m$.
2. The expression changes sign only at points $x = a_i$ for which $n_i$ is an odd integer
power.
Example 6.3 Determine where the expression $g(x) = x^2 (x + 2)(x - 3)^4$ changes sign.

Solution: The zeros of $g(x)$ are $x = 0, -2, 3$. However, $g(x)$ only changes sign at $x = -2$.
Close to this point, $g(x) \approx (-2)^2 (x + 2)(-5)^4 = 2500(x + 2)$. Clearly for
$x < -2$ this is negative and for $x > -2$ this is positive. Hence there is a sign change at
$x = -2$. At $x = 0$ and at $x = 3$ there is no sign change, as the factors $x^2$ and $(x - 3)^4$ are
never negative.
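The conclusion of Example 6.3 can also be checked numerically. The following is a minimal sketch, not part of the original notes: the helper names `sign` and `changes_sign_at` and the step size are assumptions chosen for illustration. It simply compares the sign of the expression slightly to the left and slightly to the right of each zero.

```python
def sign(value):
    """Return -1, 0, or +1 according to the sign of value."""
    return (value > 0) - (value < 0)


def g(x):
    """The expression from Example 6.3: x^2 (x + 2) (x - 3)^4."""
    return x**2 * (x + 2) * (x - 3)**4


def changes_sign_at(func, point, step=1e-3):
    """True if func has opposite signs just left and just right of point.

    The step size is an illustrative assumption; it must be smaller than
    the distance to the nearest other zero of func.
    """
    return sign(func(point - step)) * sign(func(point + step)) < 0


# Test each zero of g(x); only the odd-power factor (x + 2) flips the sign.
for zero in (-2, 0, 3):
    print(zero, changes_sign_at(g, zero))
```

Only the zero at $x = -2$, which comes from the factor with an odd power, is reported as a sign change, in agreement with the rule stated in the box.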
Example 6.4 Find all inflection points of the function $f(x) = (2/5)x^6 - x^4 + x$.
Solution: We compute the derivatives of the function, and find these to be
$f'(x) = (12/5)x^5 - 4x^3 + 1$,

$f''(x) = 12x^4 - 12x^2 = 12x^2 (x^2 - 1) = 12x^2 (x + 1)(x - 1)$.

Here we have completely factorized the second derivative so that it would be easy to identify factors with even and odd powers, to find locations where the second derivative changes
sign. We see that there is NO sign change at $x = 0$, whereas at both $x = -1$ and $x = 1$ there is a
sign-changing factor. Thus the inflection points are at $x = -1$ and $x = 1$.
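As a cross-check on Example 6.4, we can test each zero of the second derivative for a sign change directly. This is a sketch under stated assumptions: the second derivative is taken to be $12x^4 - 12x^2$ (obtained by differentiating $f'(x) = (12/5)x^5 - 4x^3 + 1$), and the function names and step size are illustrative choices, not part of the notes.

```python
def second_derivative(x):
    """f''(x) = 12x^4 - 12x^2 for f(x) = (2/5)x^6 - x^4 + x."""
    return 12 * x**4 - 12 * x**2


def changes_sign_at(func, point, step=1e-3):
    """True if func takes opposite signs on either side of point."""
    return func(point - step) * func(point + step) < 0


# Zeros of f''(x) = 12 x^2 (x + 1)(x - 1).
candidates = [-1.0, 0.0, 1.0]

# Keep only those zeros where f'' actually changes sign.
inflection_points = [c for c in candidates
                     if changes_sign_at(second_derivative, c)]
print(inflection_points)
```

The candidate $x = 0$ is discarded because the factor $x^2$ has an even power, so the second derivative has the same sign on both sides of it.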

6.2 Special points on the graph of a function


In this section we use tools of algebra and calculus to identify special points on the graph
of a function. We first consider the zeros of a function, and then its critical points.


Section 6.2 Learning goals


1. Understand the definition of a zero of a function and be able to identify zeros for
simple functions (factorizable polynomials).
2. Understand that a function $f(x)$ can have various types of critical points (maxima,
minima, and other types) at which $f'(x) = 0$.
3. Be able to find critical points for a given function.
4. Using first or second derivative tests, be able to classify a given critical point as a
maximum, minimum (or neither).

6.2.1 Zeros of a function


Definition 6.5 (Zero). Given a function y = f (x), we say that x0 is a zero of f if f (x0 ) =
0. In this case we also say that x0 is a root of the equation f (x) = 0.
Example 6.6 (Factoring) Find the zeros of the function $y = f(x) = x^2 - 5x + 6$.
Solution: This function is a polynomial that factors into $f(x) = (x - 3)(x - 2)$. Thus we
look for values of $x$ satisfying $0 = (x - 3)(x - 2)$. We use the fact that when a product of
factors is zero, at least one of the factors must be zero. This means that either $(x - 3) = 0$
or $(x - 2) = 0$, so $x = 2, 3$ are the two zeros of the function.
Example 6.7 Find zeros of the function $y = f(x) = x^3 - 3x^2 + x$.
Solution: We can factor this function into $f(x) = x(x^2 - 3x + 1)$. From this we see that
$x = 0$ is one of the desired zeros of $f$. To find the others, we use the quadratic formula on
the second factor, obtaining
$x_{1,2} = \frac{1}{2}\left(3 \pm \sqrt{3^2 - 4}\right) = \frac{1}{2}\left(3 \pm \sqrt{5}\right).$

Thus, there is a total of three zeros in this case.


Example 6.8 Find zeros of the function $y = f(x) = x^3 - x - 3$.
Solution: This polynomial does not factor, nor is it easy to apply a cubic formula (analogous to the quadratic formula) for such cases. Rather than use such a formula, we set aside
the elementary algebraic techniques and use an approximation method, discussed
in Example 5.10.
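To give a sense of what an approximation method looks like for Example 6.8, here is a minimal sketch using bisection. Bisection is an assumption made for this illustration and is not necessarily the method of Example 5.10; the bracketing interval $[1, 2]$ is chosen because $f(1) = -3 < 0$ and $f(2) = 3 > 0$, so a zero must lie between these values.

```python
def f(x):
    """The cubic from Example 6.8."""
    return x**3 - x - 3


def bisect(func, left, right, tolerance=1e-10):
    """Approximate a zero of func on [left, right] by repeated halving.

    Assumes func is continuous and has opposite signs at the endpoints.
    """
    if func(left) * func(right) > 0:
        raise ValueError("func must change sign on [left, right]")
    while right - left > tolerance:
        middle = (left + right) / 2
        # Keep the half-interval on which the sign change persists.
        if func(left) * func(middle) <= 0:
            right = middle
        else:
            left = middle
    return (left + right) / 2


root = bisect(f, 1.0, 2.0)
print(root, f(root))
```

Each pass halves the interval that is known to contain the zero, so the approximation improves steadily even though no formula for the root is used.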


Figure 6.4. A critical point (place where $f'(x) = 0$) can be a local maximum,
local minimum, or neither.

6.2.2 Critical points


Definition 6.9. A critical point of the function $f(x)$ is any point $x$ at which the first
derivative is zero, i.e. $f'(x) = 0$.
Clearly, this will occur whenever the slope of the tangent line to the graph of the
function is zero, i.e. the tangent line is horizontal. Figure 6.4 shows several possible shapes
of the graph of a function close to a critical point.
We will call the first of these (on the left) a local maximum, the second a local
minimum, and the last two cases (which are bends in the curve) inflection points.
In many scientific applications, critical points play a very important role. (We will
see examples of this sort shortly.) We would like some criteria for determining whether a
critical point is a local maximum, minimum, or neither. We will develop such diagnoses in
the next section.
Example 6.10 Consider the function $y = f(x) = x^3 + 3x^2 + ax + 1$. For what range of
values of $a$ are there no critical points?
Solution: We compute the first derivative $f'(x) = 3x^2 + 6x + a$. A critical point would
occur whenever $0 = f'(x)$, which implies $0 = 3x^2 + 6x + a$. This is a quadratic equation
whose solutions are
$x_{1,2} = \frac{-6 \pm \sqrt{36 - 4a \cdot 3}}{6}.$
This leads to two real solutions when $36 - 12a > 0$ and to a single repeated solution when $36 - 12a = 0$. When $36 - 12a < 0$, there are no real solutions.
Thus there will be no critical points when $36 - 12a < 0$, which corresponds to $a > 3$.
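The role of the discriminant in Example 6.10 can be made concrete with a short sketch. The function name `count_critical_points` is a hypothetical helper introduced here; it only inspects the sign of $36 - 12a$ for the quadratic $3x^2 + 6x + a = 0$.

```python
def count_critical_points(a):
    """Number of real solutions of f'(x) = 3x^2 + 6x + a = 0.

    The count is decided by the discriminant 6^2 - 4*3*a = 36 - 12a.
    """
    discriminant = 36 - 12 * a
    if discriminant > 0:
        return 2   # two distinct critical points
    if discriminant == 0:
        return 1   # one repeated root, at x = -1
    return 0       # no real roots, hence no critical points


for a in (0, 3, 4):
    print(a, count_critical_points(a))
```

The borderline value $a = 3$ still yields one critical point (at $x = -1$), which is why the condition for no critical points is the strict inequality $a > 3$.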

6.2.3 What happens close to a critical point


From Figure 6.5 we see the behaviour of the first and second derivatives of a function close
to critical points. We already know that at the point in question, $f'(x) = 0$, so clearly the
graph of $f'(x)$ crosses the $x$ axis at each critical point. However, note that next to a local
maximum, (and reading from left to right, as is the convention in any graph) the slope of
$f(x)$ is first positive (to the left), then becomes zero (at the critical point) and then becomes
negative (to the right of the point). This means that the derivative is decreasing from left to
right, as indicated in Figure 6.5.
Since the changes in the first derivative are measured by its derivative, i.e. by $f''(x)$,
we can say, equivalently, that the second derivative is negative at a local maximum.

[Figure 6.5: two columns of panels, local max (left) and local min (right), each showing $f(x)$, $f'(x)$, and $f''(x)$ plotted against $x$.]

Figure 6.5. Close to a local maximum, $f(x)$ is concave down, $f'(x)$ is decreasing, so that $f''(x)$ is negative. Close to a local minimum, $f(x)$ is concave up, $f'(x)$ is
increasing, so that $f''(x)$ is positive.
The opposite is true near any local minimum. This is shown in the right column of
Figure 6.5. We conclude from this discussion that the following diagnosis would distinguish a local maximum from a local minimum:
Summary: first derivative

$f'(x) < 0$ | $f'(x_0) = 0$ | $f'(x) > 0$
decreasing function | critical point at $x_0$ | increasing function

Summary: second derivative

$f''(x) < 0$ | $f''(x_0) = 0$ | $f''(x) > 0$
curve concave down | check for inflection point at $x_0$ if $f''$ changes sign | curve concave up

Summary: type of critical point


First derivative test: This test depends on the way that the sign of the first derivative
changes close to the critical point. Near a local maximum, the first derivative has a
transition from positive to zero to negative values reading across the graph from left
to right, as shown in the middle left panel of Fig. 6.5 and the table below:

$x < x_0$ | $x = x_0$ | $x > x_0$
$f'(x) > 0$ | $f'(x_0) = 0$ | $f'(x) < 0$

Near a local minimum, the first derivative goes from negative to zero to positive
values as shown in the middle right panel of Fig. 6.5 and the table below:

$x < x_0$ | $x = x_0$ | $x > x_0$
$f'(x) < 0$ | $f'(x_0) = 0$ | $f'(x) > 0$

Second derivative test: At a local maximum, the second derivative is negative. At a
local minimum, the second derivative is positive.
Here we assume that $x_0$ is a critical point, i.e. a point at which $f'(x_0) = 0$. Then the
following table summarizes what happens at that point:

$f''(x_0) < 0$ | $f''(x_0) = 0$ | $f''(x_0) > 0$
local maximum | inconclusive | local minimum

Inflection points:
We look for points at which $f''(x_0) = 0$ and check that $f''$ changes sign at $x_0$. When both
these conditions are satisfied, we conclude that $x_0$ is an inflection point.
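The two tests summarized above can be combined into one small routine. The following is a sketch only: the function `classify_critical_point` and its step size are assumptions made for illustration. It applies the second derivative test first and, when that test is inconclusive, falls back on the sign pattern of the first derivative on either side of the critical point.

```python
def classify_critical_point(f_prime, f_double_prime, x0, step=1e-4):
    """Classify a critical point x0, assuming f_prime(x0) = 0.

    Uses the second derivative test when it is conclusive, and otherwise
    the first derivative test with a small (assumed) step on each side.
    """
    curvature = f_double_prime(x0)
    if curvature < 0:
        return "local maximum"
    if curvature > 0:
        return "local minimum"
    # Second derivative test inconclusive: inspect the sign of f' nearby.
    left, right = f_prime(x0 - step), f_prime(x0 + step)
    if left > 0 > right:
        return "local maximum"
    if left < 0 < right:
        return "local minimum"
    return "neither"


# y = -x^2, y = x^4, and y = x^3, each with a critical point at x = 0.
print(classify_critical_point(lambda x: -2 * x, lambda x: -2.0, 0.0))
print(classify_critical_point(lambda x: 4 * x**3, lambda x: 12 * x**2, 0.0))
print(classify_critical_point(lambda x: 3 * x**2, lambda x: 6 * x, 0.0))
```

For $y = x^4$ and $y = x^3$ the second derivative vanishes at the origin, so the decision rests entirely on whether $f'$ changes sign there.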

6.3 Sketching the graph of a function


Recall that in Section 1.4, we used elementary reasoning about power functions to sketch
the graph of simple polynomials. Now that we have at our disposal more advanced calculus
techniques, we will be able to hone such methods to produce more detailed and more
accurate sketches of the graph of a function. We devote this section to illustrating these
new methods and their application.

Section 6.3 Learning goals


1. Given a function (polynomial, rational function, etc) be able to find its zeros, critical
points, inflection points, and determine where it is increasing or decreasing, concave
up or down.
2. Using a combination of the above techniques, together with methods of Section 1.4,
assemble a reasonably accurate sketch of the graph of the function.
3. Using these techniques, be able to identify all local as well as global extrema (minima and maxima) of a function $f(x)$ on an interval $a \le x \le b$.


Example 6.11 Sketch the graph of the function $B(x) = C(x^2 - x^3)$.

Solution: To prepare the way, we compute the derivatives:
$B'(x) = C(2x - 3x^2)$,

$B''(x) = C(2 - 6x)$.

The following set of steps will be a useful way to proceed:


1. We can easily find the zeros of the function by setting $B(x) = 0$. We find that
$C(x^2 - x^3) = 0$, i.e. $x^2 = x^3$,
so $x = 0$ or $x = 1$ are the solutions.


2. By considering powers, we note that close to the origin, the power $x^2$ would dominate (so we expect to see something resembling a parabola opening upwards close to
the origin), whereas, far away, where the term $-x^3$ dominates, we expect an (upside
down) cubic curve, as shown in a preliminary sketch in Figure 6.6.

Figure 6.6. Figure for the function $B(x) = C(x^2 - x^3)$ in Example 6.11 showing
which power dominates close to 0 and far from 0.
3. To find the critical points, we set $B'(x) = 0$, obtaining
$B'(x) = C(2x - 3x^2) = 0$, i.e. $2x - 3x^2 = 0$, or $2x = 3x^2$,
so either $x = 0$ or $x = 2/3$. From the sketch in Figure 6.6 it is clear that the
first is a local minimum, and the second a local maximum. (But we will also get a
confirmation of this fact from the second derivative.)
4. From the second derivative we find that $B''(0) = 2C > 0$ (taking $C > 0$, as the upward-opening parabola in the sketch assumes), so that $x = 0$ is indeed a
local minimum. Further, $B''(2/3) = C(2 - 6 \cdot (2/3)) = -2C < 0$, so that $x = 2/3$ is a
local maximum. This is the confirmation that our sketch makes sense.
5. Now identifying where $B''(x) = 0$, we find that
$B''(x) = C(2 - 6x) = 0$ when $2 - 6x = 0$, i.e. $x = \frac{2}{6} = \frac{1}{3}$.
We also note that the second derivative changes sign here: i.e. for $x < 1/3$, $B''(x) > 0$
and for $x > 1/3$, $B''(x) < 0$. Thus there is an inflection point at $x = 1/3$. The
final sketch would be as given in Figure 6.7.

Figure 6.7. Figure for the function $B(x) = C(x^2 - x^3)$ in Example 6.11, with the local min, the inflection point at $x = 1/3$, and the local max at $x = 2/3$ marked.

Figure 6.8. The function $y = f(x) = 8x^5 + 5x^4 - 20x^3$ of Example 6.12 behaves
roughly like the negative cubic near the origin, and like $8x^5$ for large $x$.
Example 6.12 Sketch the graph of the function $y = f(x) = 8x^5 + 5x^4 - 20x^3$.
Solution:
1. Consider the powers:
The highest power is $8x^5$, so that far from the origin we expect a typical positive odd
function behavior.
The lowest power is $-20x^3$, which means that close to zero, we would expect to see
a negative cubic. This already indicates to us that the function turns around, and
so, must have some local maxima and minima. We draw a rough sketch in Figure 6.8.
2. Zeros: Factoring the expression for $y$ leads to
$y = x^3 (8x^2 + 5x - 20)$.
Using the quadratic formula, we can find the places where $y = 0$, i.e. the zeros of
the function. They are
$x = 0, 0, 0, \; -\frac{5}{16} + \frac{1}{16}\sqrt{665}, \; -\frac{5}{16} - \frac{1}{16}\sqrt{665}.$

In decimal form, these are approximately $x = 0, 0, 0, 1.3, -1.92$.

[Figure 6.9: three panels, each plotted for $-2.0 \le x \le 1.5$.]

Figure 6.9. The function $y = f(x) = 8x^5 + 5x^4 - 20x^3$, and its first and second
derivatives, $f'(x)$ and $f''(x)$.
3. First derivative: Calculating the derivative of $f(x)$ and then factoring leads to
$\frac{dy}{dx} = f'(x) = 40x^4 + 20x^3 - 60x^2 = 20x^2 (2x + 3)(x - 1)$
so that the places where this derivative is zero are: $x = 0, 0, 1, -3/2$. We expect
critical points at these places.
4. Second derivative: We calculate the second derivative and factor to obtain
$\frac{d^2 y}{dx^2} = f''(x) = 160x^3 + 60x^2 - 120x = 20x(8x^2 + 3x - 6).$
Thus, we can find places where the second derivative is zero. This occurs at
$x = 0, \; -\frac{3}{16} + \frac{1}{16}\sqrt{201}, \; -\frac{3}{16} - \frac{1}{16}\sqrt{201}.$

The values of these roots can be approximated by: $x = 0, 0.69, -1.07$.


5. Classifying the critical points: To identify the types of critical points, we can use
the second derivative test, i.e. determine the sign of the second derivative at each of
the critical points.
At $x = 0$ we see that $f''(0) = 0$, so the test is inconclusive. Since $f'(x) = 20x^2 (2x + 3)(x - 1)$ does not change sign at $x = 0$ (the factor $x^2$ has an even power), this critical point is neither a maximum nor a minimum. At $x = 1$, we have
$f''(1) = 20(8 + 3 - 6) > 0$, implying that this is a local minimum. At $x = -3/2$
we have $f''(-1.5) = -225 < 0$, so this is a local maximum. In fact we find that the
value of the function at $x = -1.5$ is $y = f(-1.5) = 32.0625$, whereas at $x = 1$,
$f(1) = -7$.
The table below summarizes what we have found, and what we concluded. Each of
the values of $x$ across its top row has some significance in terms of the behaviour of the
function.

$x$      | -1.92 | -1.5 | -1.07      | 0 | 0.69       | 1   | 1.3
$f(x)$   | 0     | 32.0 |            | 0 |            | -7  | 0
$f'(x)$  |       | 0    |            | 0 |            | 0   |
$f''(x)$ |       | < 0  | 0          | 0 | 0          | > 0 |
type     | zero  | max  | inflection |   | inflection | min | zero

We can now sketch the shape of the function, and its first and second derivatives in
Figure 6.9.
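The numbers quoted in Example 6.12 are easy to reproduce. The sketch below assumes the function $f(x) = 8x^5 + 5x^4 - 20x^3$ and the derivatives obtained from it by the power rule; the variable names are chosen for illustration. It evaluates the function and its derivatives at the critical points and computes the nonzero roots from the quadratic formula.

```python
import math


def f(x):
    return 8 * x**5 + 5 * x**4 - 20 * x**3


def f_prime(x):
    return 40 * x**4 + 20 * x**3 - 60 * x**2


def f_double_prime(x):
    return 160 * x**3 + 60 * x**2 - 120 * x


# Nonzero zeros of f: roots of 8x^2 + 5x - 20 = 0.
zeros = [(-5 + s * math.sqrt(665)) / 16 for s in (-1, 1)]

# Nonzero zeros of f'': roots of 8x^2 + 3x - 6 = 0.
inflection_candidates = [(-3 + s * math.sqrt(201)) / 16 for s in (-1, 1)]

# Values at the critical points x = -3/2 and x = 1.
for x in (-1.5, 1.0):
    print(x, f(x), f_prime(x), f_double_prime(x))
print(zeros, inflection_candidates)
```

Evaluating at $x = -3/2$ and $x = 1$ reproduces the values $32.0625$ and $-7$ with second derivatives of opposite sign, matching the max and min entries of the table.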

6.3.1 Global maxima and minima, endpoints of an interval


Global (absolute) maxima and minima:
A global (or absolute) maximum of a function y = f (x) over some interval is the largest
value that the function attains on that interval. Similarly a global (or absolute) minimum is
the smallest value.
Comment: If the function is defined on a closed interval, we must check the
local maxima and minima as well as the endpoints of the interval to determine where the
global maxima and minima occur.
Example 6.13 Consider the function $y = f(x) = \frac{2}{x} + x^2$ on $0.1 \le x \le 4$. Find the largest
and smallest values that this function takes over the given interval.
Solution: We first compute the derivatives:
$f'(x) = -2\frac{1}{x^2} + 2x$,
$f''(x) = 4\frac{1}{x^3} + 2$.
We now determine where critical points $f'(x) = 0$ occur:
$f'(x) = -2\frac{1}{x^2} + 2x = 0$.

Simplifying, we find $2\frac{1}{x^2} = 2x$, so $x^3 = 1$ and the critical point is at $x = 1$. Observe that
the second derivative at this point is
$f''(1) = 4\frac{1}{1^3} + 2 = 6 > 0$,

so that $x = 1$ is a local minimum.

We now calculate the value of the function at the endpoints $x = 0.1$ and $x = 4$
as well as at the critical point $x = 1$ to determine where global and local minima and/or
maxima occur:

$f(0.1) = 20.01$ | $f(1) = 3$ | $f(4) = 16.5$
global maximum | global minimum |

We see that the global minimum occurs at $x = 1$. There are no local maxima. The
global maximum occurs at the left endpoint.


Exercises
6.1. A zero of a function is a place where $f(x) = 0$.
(a) Find the zeros, local maxima, and minima of the polynomial $y = f(x) = x^3 - 3x$.

(b) Find the local minima and maxima of the polynomial $y = f(x) = (2/3)x^3 - 3x^2 + 4x$.
(c) Determine whether each of the polynomials given in parts (a) and (b) has an
inflection point.
6.2. Find critical points, zeros, and inflection points of the function $y = f(x) = x^3 - ax$.
Then classify the types of critical points that you have found.
6.3. For each of the following functions, sketch the graph for $-1 < x < 1$, find
$f'(0)$, $f'(1)$, $f'(-1)$ and identify any local minima and maxima.
(a) $y = x^2$,
(b) $y = -x^3$,
(c) $y = -x^4$

(d) Using your observations above, when can you conclude that a function whose
derivative is zero at some point has a local maximum at that point?
6.4. Sketch a graph of the function $y = f(x) = x^4 - 2x^3$, using both calculus and
methods of Chapter 1.
6.5. Find the global maxima and minima for the function in Exercise 6.4 on the interval
$0 \le x \le 3$.
6.6. Find the absolute maximum and minimum values on the given interval:
(a) $y = 2x^2$ on $-3 \le x \le 3$

(b) $y = (x - 5)^2$ on $0 \le x \le 6$

(c) $y = x^2 - x - 6$ on $1 \le x \le 3$
(d) $y = \frac{1}{x} + x$ on $-4 \le x \le -\frac{1}{2}$.
6.7. A function $f(x)$ has as its derivative $f'(x) = 2x^2 - 3x$.
(a) In what regions is f increasing or decreasing?

(b) Find any local maxima or minima.


(c) Is there an absolute maximum or minimum value for this function?
6.8. Sketch the graph of $x^4 - x^2 + 1$ in the range $-3$ to $3$. Find its minimum value.
6.9. Identify all the critical points of the following function.
$y = x^3 - 27$
6.10. Consider the function $g(x) = x^4 - 2x^3 + x^2$. Determine locations of critical points
and inflection points.
6.11. Consider the polynomial $y = x^3 + 3x^2 + ax + 1$. Show that when $a > 3$ this
polynomial has no critical points.

6.12. Find the values of $a$, $b$, and $c$ if the parabola $y = ax^2 + bx + c$ is tangent to the line
$y = -2x + 3$ at $(2, -1)$ and has a critical point when $x = 3$.
6.13. Double Wells and Physics: In physics, a function such as
$f(x) = x^4 - 2x^2$
is often called a double well potential. Physicists like to think of this as a landscape with hills and valleys. They imagine a ball rolling along such a landscape:
with friction, the ball eventually comes to rest at the bottom of one of the valleys
in this potential. Sketch a picture of this landscape and use information about the
derivative of this function to predict where the ball might be found, i.e. where the
valley bottoms are located.
6.14. (From Final Exam, Math 100 Dec 1996) Find the first and second derivatives of the
function
$y = f(x) = \frac{x^3}{1 - x^2}.$
Use information about the derivatives to determine any local maxima and minima,
regions where the curve is concave up or down, and any inflection points.
6.15. Find all the critical points of the function
$y = f(x) = 2x^3 + 3ax^2 - 12a^2 x + 1$
and determine what kind of critical point each one is. Your answer should be given
in terms of the constant $a$, and you may assume that $a > 0$.
6.16. (From Final Exam Dec 1995) The function $f(x)$ is given by
$y = f(x) = x^5 - 10kx^4 + 25k^2 x^3$
where k is a positive constant.
(a) Find all the intervals on which f is either increasing or decreasing. Determine
all local maxima and minima.
(b) Determine intervals on which the graph is either concave up or concave down.
What are the inflection points of f (x) ?
6.17. Muscle shortening: In 1938 A.V. Hill proposed a mathematical model for the rate of
shortening of a muscle, $v$, (in cm/sec) when it is working against a load $p$ (in gms).
His so-called force-velocity curve is given by the relationship
$(p + a)v = b(p_0 - p)$
where $a$, $b$, $p_0$ are positive constants.
(a) Sketch the shortening velocity versus the load, i.e., $v$ as a function of $p$. (Note:
the best way to do this is to find the intercepts of the two axes, i.e. find the
value of $v$ corresponding to $p = 0$ and vice versa.)
(b) Find the rate of change of the shortening velocity with respect to the load, i.e.
calculate $dv/dp$.
(c) What is the largest load for which the muscle will contract? (Hint: A contracting muscle has a positive shortening velocity, whereas a muscle with a very
heavy load will stretch, rather than contract, i.e. will have a negative value of
$v$.)

6.18. Reaction kinetics: Chemists often describe the rate of a saturating chemical reaction by using simplified expressions. Two examples of such expressions are:
Michaelis-Menten kinetics: $R_m(c) = \frac{Kc}{k_n + c}$,
Sigmoidal kinetics: $R_s(c) = \frac{Kc^2}{k_n^2 + c^2}$,
where $c$ is the concentration of the reactant, $K > 0$, $k_n > 0$ are constants. $R(c)$
is the speed of the reaction (Observe that the speed of the reaction depends on the
concentration of the reactant).
(a) Sketch the two curves. To do this, you should analyze the behaviour for $c = 0$,
for small $c$, and for very large $c$. You will find a horizontal asymptote in both
cases. We refer to that asymptote as the maximal reaction speed. What is the
maximal reaction speed for each of the functions $R_m$, $R_s$? (Note: express
your answer in terms of the constants $K$, $k_n$.)
(b) Show that the value $c = k_n$ leads to a half-maximal reaction speed.
For the questions below, you may assume that $K = 1$ and $k_n = 1$.
(c) Sketch the curves $R_m(c)$, $R_s(c)$.
(d) Show that sigmoidal kinetics, but not Michaelis-Menten kinetics, has an inflection point.
(e) Explain how these curves would change if $K$ is increased; if $k_n$ is increased.
6.19. Checking the endpoints: Find the absolute maximum and minimum values of the
function
$f(x) = x^2 + \frac{1}{x^2}$
on the interval $[\frac{1}{2}, 2]$. Be sure to verify if any critical points are maxima or minima
and to check the endpoints of the interval.

Chapter 7

Optimization

In this chapter, we collect a variety of problems in which the ideas developed in earlier
material are put to use. In particular, we will use calculus to find local (and global) maxima,
and minima so as to get the best (optimal) values of some desirable quantity. Setting up
these problems, from first verbal description, to clear cut mathematical formulation is the
main challenge we will face. Often, we will use geometric ideas to express relationships
between variables leading to our solution.

7.1 Simple biological optimization problems

We start with relatively simple examples where the function to optimize is specified. The
student merely has to take care with differentiation, and apply the diagnostic tests properly.
An important skill to pick up at this point is distinguishing between variables and constant
parameters in the differentiation step. A skill we reinforce is the elementary curve sketching
from earlier chapters.

Section 7.1 Learning goals


1. Given a function of some independent variable, be able to find the derivative of that
function and identify all critical points.
2. Using a combination of sketching and tests for critical points developed in Section 6.2.3, be able to determine whether that critical point is a minimum, maximum
(or neither).

7.1.1 Density dependent (logistic) growth in a population


Biologists often notice that the growth rate of a population depends not only on the size of
the population, but also on how crowded it is. Constant growth is not sustainable. When
individuals have to compete for resources, nesting sites, mates, or food, they cannot invest
time or energy in reproduction, leading to a decline in the rate of growth of the population.
Such population growth is called density dependent growth.
One common example of density dependent growth is called the logistic growth law.
Here it is assumed that the growth rate of the population, $G$, depends on the density of the
population $N$ as follows:
$G(N) = rN \left( \frac{K - N}{K} \right).$
Here N is the independent variable, and G(N ) is the function of interest. All other
quantities are constant: r > 0 is a constant, called the intrinsic growth rate and K > 0
is a constant called the carrying capacity, which represents the population density that a
given environment can sustain. Importantly, when differentiating G, we treat r and K as
numbers. A generic sketch of G as a function of N is shown in Figure 7.1.
[Figure 7.1: sketch of $G$ versus $N$, with $K/2$ marked.]

Figure 7.1. In logistic growth, the population growth rate $G$ depends on
population size $N$ as shown here.
Example 7.1 (Logistic growth rate:) Answer the following questions:
Find the population density that leads to the maximal growth rate.
What is the maximal growth rate?
For what population size is the growth rate zero?

Solution: We can write $G(N)$ in an alternate form
$G(N) = rN \left( \frac{K - N}{K} \right) = rN - \frac{r}{K} N^2.$
Then we see that $G(N)$ is a polynomial in powers of $N$, with coefficients that depend on
the constants. To find the maximal value of $G(N)$ we differentiate $G$ with respect to the
variable $N$ and set this derivative to zero. For the differentiation step, we keep in mind that
$K, r > 0$ are here treated as constants. We get
$G'(N) = r - 2\frac{r}{K} N = 0.$

Solving for $N$ leads to
$r = 2\frac{r}{K} N \quad \Rightarrow \quad N = \frac{K}{2}.$

We found a critical point, but we must still confirm that it is a local maximum. We can
do so either using the sketch in Fig. 7.1, or using one of the diagnosis tools developed in
Section 6.2.3. Here we apply the second derivative test. The second derivative is
$G''(N) = -2\frac{r}{K}$

and is clearly negative for all population sizes. This tells us that the function $G(N)$ is
concave down, and that $N = K/2$ is a local maximum. Thus the density leading to the largest
growth rate is one half of the carrying capacity.
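The growth rate at this density is
$G\left(\frac{K}{2}\right) = r \, \frac{K}{2} \left( \frac{K - \frac{K}{2}}{K} \right) = r \, \frac{K}{2} \cdot \frac{1}{2} = \frac{rK}{4}.$
To find the population size at which the growth rate is zero, we set $G = 0$ and solve for $N$:
$G(N) = rN \left( \frac{K - N}{K} \right) = 0.$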
The two solutions are N = 0 (which is not very interesting, since when there is no population there is no growth) and N = K.
We will have more to say about this type of density dependent growth a little later on
in this course.
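The results of Example 7.1 can be verified numerically for any particular choice of constants. In the sketch below, the values $r = 0.5$ and $K = 1000$ are hypothetical and serve only as an illustration; the search over whole-number population sizes is likewise an assumption made to keep the example short.

```python
def growth_rate(N, r, K):
    """Logistic growth rate G(N) = r N (K - N) / K."""
    return r * N * (K - N) / K


# Hypothetical parameter values, for illustration only.
r, K = 0.5, 1000.0

# Search the whole-number population sizes 0, 1, ..., 1000 for the maximum.
best_N = max(range(0, 1001), key=lambda N: growth_rate(N, r, K))

print(best_N, growth_rate(best_N, r, K))   # compare with K/2 and rK/4
print(growth_rate(0, r, K), growth_rate(K, r, K))
```

With these values the search returns $N = 500 = K/2$ with growth rate $125 = rK/4$, and the growth rate vanishes at $N = 0$ and $N = K$, in agreement with the calculus.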

7.1.2 Cell size for maximal nutrient accumulation rate


We have seen in Section 1.2.2 that the absorption and consumption rates $A(r)$, $C(r)$ for a
simple spherical cell of radius $r$ are:
$A(r) = k_1 S = 4\pi k_1 r^2$,

$C(r) = k_2 V = \frac{4}{3}\pi k_2 r^3$,

where $k_1, k_2 > 0$ are constants. We define the net rate of increase of nutrients as the rate
of absorption minus the rate of consumption:
$N(r) = A(r) - C(r) = 4\pi k_1 r^2 - \frac{4}{3}\pi k_2 r^3.$    (7.1)

As we can see, this quantity depends on the radius $r$ of the cell.


Example 7.2 Consider a spherical cell that is absorbing nutrients at a rate proportional to
its surface area and consuming them at a rate proportional to its volume. Determine the
size of the cell for which the net rate of increase of nutrients is largest.


Solution: To find the size for greatest net nutrient increase rate, we find critical points of
$N(r)$, keeping in mind that $k_1$, $k_2$, and $\pi$ are constant for the purpose of
differentiation. Then the derivative of (7.1) is
$N'(r) = 8\pi k_1 r - 4\pi k_2 r^2.$
Critical points occur when $N'(r) = 0$, i.e.
$N'(r) = 8\pi k_1 r - 4\pi k_2 r^2 = 0.$
Simplifying leads to
$4\pi r(2k_1 - k_2 r) = 0.$
This is satisfied (trivially) when $r = 0$, and also when
$r = 2\frac{k_1}{k_2}.$

To check that this is a local maximum, we find the second derivative
$N''(r) = 8\pi k_1 - 8\pi k_2 r = 8\pi(k_1 - k_2 r).$
Plugging in $r = 2k_1/k_2$ we get
$N'' = 8\pi\left(k_1 - k_2 \frac{2k_1}{k_2}\right) = -8\pi k_1 < 0.$

Thus the second derivative is negative, and this verifies that we have a local maximum.
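A quick numerical check of Example 7.2 is sketched below. The constants $k_1 = 1.0$ and $k_2 = 0.5$ are hypothetical values chosen for illustration, and the formula for the net rate assumes the sphere's surface area $4\pi r^2$ and volume $(4/3)\pi r^3$.

```python
import math


def net_rate(r, k1, k2):
    """Net nutrient rate N(r) = 4 pi k1 r^2 - (4/3) pi k2 r^3."""
    return 4 * math.pi * k1 * r**2 - (4 / 3) * math.pi * k2 * r**3


# Hypothetical constants, for illustration only.
k1, k2 = 1.0, 0.5

# Critical radius predicted by setting the derivative to zero.
r_star = 2 * k1 / k2

# Compare the net rate at r_star with the rate slightly to either side.
nearby = [net_rate(r_star + dr, k1, k2) for dr in (-0.01, 0.0, 0.01)]
print(r_star, nearby)
```

The middle value is the largest of the three, which is what we expect at a local maximum of the net rate.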

7.2 Optimization with a constraint


We continue to build experience with optimization problems. Here we encounter slightly
more challenging examples, where identifying the function to optimize forms one of the
aspects of the problem. We also consider cases where there is more than one independent
variable, but where there is an additional constraint that can be used to eliminate all but
one. We include both biological and other practical examples in this analysis.

Section 7.2 Learning goals


1. Gain experience with setting up an optimization word problem involving formulae
for volume and surface area of geometric solids.
2. Understand the idea of a constraint in an optimization problem.
3. Be able to use the constraint to eliminate one of the independent variables, and find
a desired critical point. (As before, this includes classifying that critical point as a
local minimum, maximum or neither.)


7.2.1 A cylindrical cell with minimal surface area


Not all cells are spherical. Some are skinny cylindrical filaments, or sausage shapes. Some
even grow as helical tubes, but we shall leave such complicated examples aside here. We
will explore how minimization of surface area would determine the overall shape of a
cylindrical cell.
Consider a cell shaped like a cylinder with a circular cross-section. The volume
of the cell will be assumed to be fixed, because the cytoplasm in its interior cannot be
compressed. However, suppose that the cell has a rubbery membrane that tends to take
on the smallest surface area possible. (In physical language, the elastic energy stored in the
membrane tends to a minimum.) We want to find the proportions of the cylinder (e.g. the
ratio of length to radius) so that the cell has minimal surface area.
Recall the following properties for a cylinder:

Figure 7.2. Properties of a cylinder (radius $r$, perimeter of the circular cross-section $2\pi r$).

• The volume of a cylinder is the product of its base area $A$ and its height, $h$. That is,
$V = Ah$. For a cylinder of length $L$ with circular cross-section of radius $r$: $V = \pi r^2 L$.
• A cylinder can be cut and unrolled into a rectangle. One side of the rectangle has
length $L$ and the other has length equal to the perimeter of the circle, $2\pi r$. The
surface area of the unrolled rectangle is then $S_{side} = 2\pi r L$, as shown in Figure 7.2.
• If the ends of the cylinder are two flat circular caps, then the sum of the areas of
these two ends is $S_{ends} = 2\pi r^2$.
• The total surface area of the cylinder with flat ends is then
$S = 2\pi r L + 2\pi r^2.$
We would expect that in a cell surrounded by a rubbery membrane, the end caps
would not really be flat. However for simplicity, we will here neglect this issue and assume
that the ends are flat and circular. Then, mathematically, our problem can be restated as
follows
Example 7.3 Minimize the surface area $S = 2\pi r L + 2\pi r^2$ of the cell, given that its volume
$V = \pi r^2 L = K$ is constant. (See footnote 15.)

Footnote 15: I would like to thank Prof Nima Geffen (Tel Aviv University) for providing the inspiration for this example.


Solution: The shape of the cell depends on both the length L, and the radius r of the
cylinder. However, these are not independent. They are related to one another because the
volume of the cell has to be constant. This is an example of an optimization problem with a
constraint, i.e. a condition that has to be satisfied. The constraint will allow us to eliminate
one of the variables, as we show below.
The constraint is that the volume is fixed, i.e.,
$V = \pi r^2 L = K$
where $K > 0$ is a constant that represents the volume of the given cell. We can use this to
express one variable in terms of the other. For example, we can solve for $L$:
$L = \frac{K}{\pi r^2}.$    (7.2)

The function to minimize is
$S = 2\pi r L + 2\pi r^2.$
We substitute the expression (7.2) for $L$ as a function of $r$ to obtain $S$ as a function of $r$
alone:
$S(r) = 2\pi r \frac{K}{\pi r^2} + 2\pi r^2.$
Simplification leads to
$S(r) = 2\frac{K}{r} + 2\pi r^2.$
Observe that $S$ is now a function of only one independent variable, $r$ ($K$ and $\pi$ are constants).
In order to find local minima, we will look for critical points of the function $S(r)$.
We compute the relevant derivatives:
$S'(r) = -2\frac{K}{r^2} + 4\pi r.$

The second derivative will also be useful:
$S''(r) = 4\frac{K}{r^3} + 4\pi.$

From the last calculation, we observe that the second derivative is always positive since
$K, r > 0$, so the function $S(r)$ is concave up. Any critical point we find will thus be a
minimum automatically. (In Exercise 7 we also consider the first derivative test as practice.)
To find a critical point, set $S'(r) = 0$:
$S'(r) = -2\frac{K}{r^2} + 4\pi r = 0.$

Solving for $r$, we obtain:
$2\frac{K}{r^2} = 4\pi r \quad \Rightarrow \quad r^3 = \frac{K}{2\pi} \quad \Rightarrow \quad r = \left( \frac{K}{2\pi} \right)^{1/3}.$

We also find the length of this cell using Eqn. 7.2:
$L = \left( \frac{4K}{\pi} \right)^{1/3}.$

(Details of the algebra are left for Exercise 7.) We can finally characterize the shape of the
cell. One way to do this is to specify the ratio of its length to its radius. Based on our
previous results, we find that ratio to be:
$\frac{L}{r} = 2.$
(Exercise 7.) Thus, the length of this cylinder is the same as its diameter (which is twice
the radius). This means that in a cylindrical cell with a rubbery membrane, we find a short
and fat shape. In order for the cell to grow as a long skinny cylinder, it has to have some
structural support that prevents the surface area from contracting to the smallest possible
area. An example of this type occurs in fungal cells. These grow as long branched filaments. The outer cell wall contains structural components that prevent the cell surface from
contracting elastically.
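The shape found in Example 7.3 can be confirmed numerically. The sketch below assumes the reduced surface area function $S(r) = 2K/r + 2\pi r^2$ obtained by eliminating $L$ with the volume constraint; the volume $K = 10$ is a hypothetical value used only for illustration.

```python
import math


def surface_area(r, K):
    """Surface area S(r) = 2K/r + 2 pi r^2 of a cylinder of volume K."""
    return 2 * K / r + 2 * math.pi * r**2


# Hypothetical cell volume, for illustration only.
K = 10.0

# Optimal radius from S'(r) = 0, and the length from the volume constraint.
r_opt = (K / (2 * math.pi)) ** (1 / 3)
L_opt = K / (math.pi * r_opt**2)
ratio = L_opt / r_opt

print(r_opt, L_opt, ratio)
print(surface_area(0.9 * r_opt, K), surface_area(r_opt, K),
      surface_area(1.1 * r_opt, K))
```

The ratio of length to radius comes out as 2 regardless of the value chosen for $K$, and the surface area at the optimal radius is smaller than at radii 10% smaller or larger.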

7.2.2 Wine for Kepler's wedding


In 1613, Kepler set out to purchase a few barrels of wine for his wedding party. The
merchant selling the wine had an interesting way of computing the cost of the wine: He
would plunge a measuring rod through a hole in the barrel, as shown in Figure 7.3. The
price was proportional to the length of the wet part of the rod. We will refer to that length as
L in what follows.
Kepler noticed that barrels come in different shapes. Some are tall and skinny, while
others are squat and fat. He conjectured that some shapes would contain larger volumes
for a given length of the measuring rod, i.e. would contain more wine for the same price.
Knowing mathematics, he set out to determine which barrel shape would be the best bargain
for his wedding.

Figure 7.3. Barrels come in various shapes. But the cost of a barrel of wine was
determined by the length of the wet portion of the rod inserted into the barrel diagonally.
Some barrels contain larger volume, but have identical cost.
Clearly, the best bargain would be the wine barrel that contains the most wine for a
given cost. This is equivalent to asking which cylinder has the largest volume for a fixed
(constant) length L. Below, we show how this optimization problem can be solved. An
alternate approach is to seek the wine barrel that costs least for a given volume. We explore
this alternative in Exercise 14 and show that it leads to the same result.
Example 7.4 Find the proportions (height:radius) of the cylinder that contains the largest
volume for a fixed value of the length L of the wet rod in Fig. 7.3.
Solution: To simplify the problem, we will assume that the barrel is a simple cylinder, as
shown in Figure 7.4. We also assume that the tap-hole (normally covered to avoid leaks) is
half-way up the height of the barrel. We will define r as the radius and h as the height of
the barrel. These two variables uniquely determine the shape as well as the volume of the
barrel. We'll also assume that the barrel is full up to the top with delicious wine, so that the
volume of the cylinder is the same as the volume of wine.
The volume of a cylinder is
$$V = \text{base area} \times \text{height}.$$
The base is a circle of area $A = \pi r^2$, so that the volume of the barrel is:
$$V = \pi r^2 h. \tag{7.3}$$

The rod used to measure the amount of wine (and hence determine the cost of the barrel)
is shown as the diagonal of length L in Figure 7.4. Because the cylinder walls are perpendicular to its base, the length L is the hypotenuse of a right-angle triangle whose other sides
have lengths 2r and h/2. (This follows from the assumption that the tap hole is half-way
up the side.) Thus, by the Pythagorean theorem,
$$L^2 = (2r)^2 + \left(\frac{h}{2}\right)^2. \tag{7.4}$$

The problem can be restated: maximize V subject to a fixed value of L. The fact that
L is fixed means that we have a constraint as before. That constraint will be used to reduce
the number of variables in the problem.
The function to be maximized is:
$$V = \pi r^2 h.$$
After expanding the squares, the constraint is:
$$L^2 = 4r^2 + \frac{h^2}{4}.$$

We can use the constraint to eliminate one variable; in this case the simplest way to do
so is to solve Eqn. (7.4) for $r^2$ and substitute the result into $V$. We obtain
$$r^2 = \frac{1}{4}\left(L^2 - \frac{h^2}{4}\right).$$


Figure 7.4. Here we simplify and idealize the problem to consider a cylindrical
barrel with diameter $2r$ and height $h$. We assumed that the tap-hole is at height $h/2$. The
length $L$ denotes the wet portion of the merchant's rod, used to determine the cost of this
barrel of wine. We observe that the dotted lines form a right-angle triangle.
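Then
$$V = \pi r^2 h = \frac{\pi}{4}\left(L^2 - \frac{h^2}{4}\right) h = \frac{\pi}{4}\left(L^2 h - \frac{h^3}{4}\right).$$

We now have a function of one variable, namely
$$V(h) = \frac{\pi}{4}\left(L^2 h - \frac{h^3}{4}\right).$$
For this function, the variable $h$ could sensibly take on any value in the range $0 \le h \le 2L$.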
Outside this range, the volume is negative, and at the two endpoints the volume is zero.
Thus, we anticipate that somewhere inside this range of values we should find the desired
optimum.
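To find any critical points of the function $V(h)$, we calculate the derivative $V'(h)$
and set it to zero:
$$V'(h) = \frac{\pi}{4}\left(L^2 - \frac{3}{4}h^2\right) = 0.$$
This implies that $L^2 - \frac{3}{4}h^2 = 0$, i.e.
$$3h^2 = 4L^2 \quad\Rightarrow\quad h^2 = 4\frac{L^2}{3} \quad\Rightarrow\quad h = 2\frac{L}{\sqrt{3}}.$$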

Now we must check whether this solution is a local maximum (or a minimum).
The second derivative is:
$$V''(h) = \frac{\pi}{4}\left(0 - 2\cdot\frac{3}{4}h\right) = -\frac{3\pi}{8}h < 0.$$
From this we see that $V''(h) < 0$ for any positive value of $h$. Thus the function $V(h)$ is
concave down when $h > 0$. This verifies that the solution above is a local maximum.
According to the discussion of the relevant range of values of $h$, this local maximum is also
the optimal solution we need, i.e. there are no larger values at the endpoints of the interval
$0 \le h \le 2L$.
To finish the problem, we can find the radius of the barrel having this height by
plugging this result for $h$ into the constraint equation, i.e. using
$$r^2 = \frac{1}{4}\left(L^2 - \frac{h^2}{4}\right) = \frac{1}{4}\left(L^2 - \frac{L^2}{3}\right) = \frac{1}{4}\cdot\frac{2}{3}L^2.$$
After simplifying and rewriting, we get
$$r = \frac{1}{\sqrt{3}\sqrt{2}}L.$$
The shape of the wine barrel with largest volume for the given price can now be specified.
One way to do this is to specify the ratio of height to radius. (Tall skinny barrels have a
high ratio $h/r$ and squat fat ones have a low ratio.) By the above reasoning, the ratio of
$h/r$ for the optimal barrel is
$$\frac{h}{r} = \frac{2\frac{L}{\sqrt{3}}}{\frac{1}{\sqrt{3}\sqrt{2}}L} = 2\sqrt{2}. \tag{7.5}$$

The height of the barrel should be $2\sqrt{2} \approx 2.8$ times the radius in these most economical wine
barrels.
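
As a rough numerical cross-check of Example 7.4, the sketch below scans heights in the range $0 < h < 2L$ and compares the best grid value with the calculus result $h = 2L/\sqrt{3}$. It is only an illustration: the rod length $L = 1$, the grid size, and the function names are assumptions made here, and a grid scan is a stand-in for the derivative test used in the text.

```python
import math

def barrel_volume(h, L):
    """V(h) = (pi/4) * (L^2 * h - h^3 / 4), from Eqns. (7.3) and (7.4)."""
    return (math.pi / 4.0) * (L ** 2 * h - h ** 3 / 4.0)

def best_height_by_scan(L, n=20000):
    """Scan the n - 1 interior grid points of 0 < h < 2L; return the best h."""
    best_h, best_v = 0.0, 0.0
    for i in range(1, n):
        h = 2.0 * L * i / n
        v = barrel_volume(h, L)
        if v > best_v:
            best_h, best_v = h, v
    return best_h

L = 1.0  # illustrative rod length
h_scan = best_height_by_scan(L)
h_exact = 2.0 * L / math.sqrt(3.0)
r_exact = L / (math.sqrt(3.0) * math.sqrt(2.0))
print("h from scan:", h_scan, "h from calculus:", h_exact)
print("h/r:", h_exact / r_exact, "2*sqrt(2):", 2.0 * math.sqrt(2.0))
```

The two heights should agree to within the grid spacing, and the ratio h/r should match Eqn. (7.5).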

7.3 Checking endpoints


In some cases, the optimal value of a function will not occur at any of its local maxima, but
rather at one of the endpoints of an interval. Here we consider this situation.

Section 7.3 Learning goals


1. Understand the distinction between local and global extrema.
2. Be able to find the global minimum or maximum in a given word problem.
The following example illustrates this point:
Example 7.5 (maximal perimeter) The area of a rectangle having sides of length $x$ and
$y$ is $A = xy$. Suppose that the variable $x$ is only allowed to take values in the range
$0.5 \le x \le 4$. Find the dimensions of the rectangle having largest perimeter whose area
$A = 1$ is fixed. (The perimeter of a rectangle is the total length of its outer edge.)
Solution: The perimeter of a rectangle whose sides are length x, y is
P = x + y + x + y = 2x + 2y.


We are asked to maximize this quantity. Since the area of the rectangle is A = xy, and this
is given, we obtain $xy = 1$ as the constraint. Using the constraint, we can solve for $y$:
$$y = \frac{1}{x}.$$

Then, substituting this result leads to a function depending only on $x$:
$$P(x) = 2x + \frac{2}{x}.$$

To find critical points, we set
$$P'(x) = 2\left(1 - \frac{1}{x^2}\right) = 0.$$

Thus, $x^2 = 1$ or $x = \pm 1$. We reject the negative root as it is irrelevant for the (positive)
side length of the rectangle. Checking if this is a maximum we find that
$$P''(x) = \frac{4}{x^3} > 0$$

so we have found a local minimum! This is clearly not the maximum we were looking for.
We must thus check the endpoints of the interval for the maximal value of the function. We find that P (4) = 8.5 and P (0.5) = 5. The largest perimeter for the rectangle will
thus occur when x = 4, indeed at the endpoint of the domain, as shown in Figure 7.5.
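
The procedure used in this example (compare the function values at the endpoints and at any interior critical points) can be sketched in a few lines of Python. This is only an illustration; the helper name `global_max_on_interval` is hypothetical.

```python
def perimeter(x):
    """P(x) = 2x + 2/x for a rectangle of area 1 with one side of length x."""
    return 2.0 * x + 2.0 / x

def global_max_on_interval(f, a, b, critical_points):
    """Compare f at both endpoints and at the interior critical points."""
    candidates = [a, b] + [c for c in critical_points if a < c < b]
    best_x = max(candidates, key=f)
    return best_x, f(best_x)

# x = 1 is the only positive critical point found above.
x_best, p_best = global_max_on_interval(perimeter, 0.5, 4.0, [1.0])
print(x_best, p_best)  # prints 4.0 8.5
```

The critical point x = 1 gives P(1) = 4, the smallest of the three candidate values, which is consistent with it being a local minimum.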

Figure 7.5. In Example 7.5, the critical point we found is a local minimum. To
maximize the perimeter of the rectangle, we must consider the end points of the interval
$0.5 \le x \le 4$.


7.4 Optimal foraging


Section 7.4 Learning goals
1. Follow the development of a simple model for an animal collecting food (gaining
energy) while foraging in a food patch.
2. Understand the graphical representation of various types of food patches, and be able
to link those graphs to verbal descriptions of the situations.
3. In a specific example, where the animal optimizes the total energy gained over the
total time spent (including travel time to the patch), be able to find the optimal time
to spend foraging in the food patch.
Animals need to spend a considerable part of their time searching for food. There is
a limited time available for this activity, since when the sun goes down, risk of becoming
food (to a predator) increases, and chances of finding more food items decrease. There are
also limited resources, so those who are most successful at finding and utilizing these over
the available time will likely survive, produce offspring, and have an adaptive advantage.
It is argued by biologists that evolution tends to optimize animal behaviour by selecting in
favour of those that are faster, more efficient, stronger, or more fit. In this section we investigate how foraging behaviour is optimized. We follow the basic principles put forward by
Stephens and Krebs [14] and by Charnov [4].

Figure 7.6. A bird travels daily to forage in a food patch. We want to determine
how long it should stay in the patch to optimize its efficiency. (Diagram labels: nest, food
patch, travel time $\tau$, time $t$ in the patch.)

Notation for our model


The following notation will be useful in discussing this problem:
$\tau$ = travel time between nest and food patch. (This is considered as time that is
unavoidably wasted.)
$t$ = residence time in the patch (i.e. how long to spend foraging in one patch), also
called foraging time.
$f(t)$ = energy gained by foraging in a patch for time $t$.


Energy gain in food patches


In some patches, it is easy to quickly load up on resources: this would be true if it is easy
to find the nectar (or hunt the prey) or spot the berries. In other places, it may take some
effort to locate the food items or process them so they can be eaten. This is reflected by
a gain function f (t), that may have one of several shapes. Some examples are shown in
Figure 7.7.
Figure 7.7. Examples of various total energy gain $f(t)$ for a given foraging time
$t$, shown in six panels labelled (1) to (6), each plotting $f(t)$ against $t$. The shapes of
these functions determine how hard or easy it is to extract food from a food
patch. See text for details about what these functions imply about the given food patch.
In the examples shown in Figure 7.7 we see an assortment of cases, discussed below.
Example 7.6 (Energy gain versus patch residence time) For each panel in Fig. 7.7, describe the characteristics of a patch that would have the given graph of the energy gain f (t).
For example, in the first two panels we could say that
1. The energy gain is linearly proportional to time spent in the patch. In this case it
appears that the patch has so much food in it that it is never depleted. It would make
sense to stay in such a patch as long as possible, we might suspect.
2. Here the energy gain is independent of time spent. The animal gets the full quantity
as soon as it gets to the patch. (This is not very realistic from a biological perspective.)
It is good practice to interpret the graphs in terms of verbal descriptions in any biologically
motivated model.
Solution: The first two panels are already accounted for. In the other cases we can say the
following:


3. In this case, the food is gradually depleted in a given patch, (the total gain levels off
to some constant level as t increases). There is diminishing return for staying longer.
Here, we may expect to have some choice to make as to when to leave and look for
food elsewhere.
4. In this example, the rewards for staying longer actually multiply: the net energy gain
has an increasing slope (or, otherwise stated, $f''(t) > 0$). We will see that in this
case, there is no optimal residence time: some other strategy, such as staying in just
one patch, would be optimal.
5. It takes some time to begin to gain energy but later on the gain increases rapidly.
Eventually, the patch is depleted.
6. Here we have the case where staying too long in a patch is actually disadvantageous
in that it leads to a net loss of energy. This might happen if the animal spends more
energy looking for food that is already depleted. Here it is clear that leaving the patch
early enough is the best strategy.
For the purpose of a simple example, we will assume that the patch energy function is given
by
$$f(t) = \frac{E_{\max} t}{k + t}, \quad \text{where } E_{\max}, k > 0 \text{ are constants.} \tag{7.6}$$
Example 7.7 (Interpreting the assumed function $f(t)$) Match the function we have assumed in Eqn. (7.6) with one of the panels in Fig. 7.7. Then interpret the meanings of the
constants $E_{\max}$, $k$.
Solution: We recognize Eqn. (7.6) as a function that is similar to the graph shown in the
left panel of Fig. 1.7 (Michaelis-Menten kinetics in biochemistry). Panel (3) in Fig. 7.7
resembles this graph. Then, from our previous analysis in Chapter 1, we know that $E_{\max}$
is the value of the horizontal asymptote, and corresponds to the greatest possible energy that can be
extracted from the patch (if foraging continues indefinitely). The parameter $k$, which has
units of time in Eqn. (7.6), controls the steepness of the rising phase of the function. At
time $t = k$, we find that $f = E_{\max}/2$ (Exercise 26(a)).
Currency to optimize
We will assume that animals try to maximize the average rate of energy gain over the
foraging day, defined by the following ratio:
$$R(t) = \frac{\text{Total energy gained}}{\text{total time spent}},$$

i.e., $R$ is the average energy gain per unit time. This quantity will depend on the amount
of time $t$ that is spent foraging during a day. The question we ask is whether there is an
optimal foraging time (i.e. a value of the time, $t$) that maximizes $R(t)$. As we show below,
whether or not an optimum exists depends greatly on how hard it is to extract food from a
food patch. When an optimal foraging time exists, we will see that it also depends on how
much time is wasted in transit to such foraging sites.


The optimal residence time


We now turn to the task of finding the optimal residence time, i.e. time to spend in the
patch. We will make a simplifying assumption that all the patches are identical, making it
equally easy to utilize each one. Now suppose on average, the time spent in a patch is $t$.
Then, the total energy gained during the day is $f(t)$. It takes a time $\tau$ to travel between the
nest and the food patch and a time $t$ in the patch, so that the total time spent is $\tau + t$. Thus
$$R(t) = \frac{f(t)}{\tau + t}. \tag{7.7}$$

We wish to maximize this function with respect to the residence time, i.e. find the time $t$
such that $R(t)$ is as large as possible. For the patch function $f(t)$ assumed in Eqn. (7.6),
we have
$$R(t) = \frac{E_{\max} t}{(k + t)(\tau + t)}. \tag{7.8}$$
Example 7.8 Use tools of calculus and of simple sketching to find and classify critical
points of R(t) in Eqn. (7.8).
Solution: We first consider the elementary sketch of $R(t)$. Since we are concerned only
with positive values ($R(t) > 0$ for biological relevance), we consider the behaviour near
$t = 0$ and for large positive $t$.
For small time values, $t \approx 0$, we find that $R(t) \approx (E_{\max}/k\tau)\,t$ is a linear increasing
graph.
At $t \to \infty$, $R(t) \approx E_{\max} t/t^2 \to 0$, so the graph eventually decreases to zero.
These two facts are shown in the left panel of Fig. 7.8. Thus, we already see that somewhere
in $0 < t < \infty$, this function has a local maximum. To find that local maximum, we compute
$R'(t)$ using the quotient rule (see Exercise 26c), and set this derivative to zero:
$$R'(t) = E_{\max}\frac{k\tau - t^2}{(k + t)^2(\tau + t)^2} = 0. \tag{7.9}$$

Figure 7.8. In Example 7.8 we use sketching techniques and calculus to produce
this rough sketch of the average rate of energy gain $R(t)$ in Eqn. (7.8) for a saturating
patch energy function assumed in (7.6). The graph is linear near the origin, and decays to
zero at large $t$. Since the function is continuous for $t > 0$, this sketch verifies that there is
a local maximum for some positive $t$ value.


This can only be satisfied if the numerator is zero, that is
$$k\tau - t^2 = 0 \quad\Rightarrow\quad t_{1,2} = \pm\sqrt{k\tau}.$$

We reject the negative root, which is not relevant here, and deduce that the critical point
of the function $R(t)$ that we seek is at $t_{crit} = \sqrt{k\tau}$. According to our sketch in Fig. 7.8,
this critical point is a local maximum. We can also confirm this using the techniques of
diagnosis for critical points, as shown in the next example. (This example is for practice,
as Fig. 7.8 is sufficient evidence.)

Example 7.9 Use one of the calculus tests for critical points to show that $t = \sqrt{k\tau}$ is a
local maximum for the function $R(t)$ in Eqn. (7.8).
Solution: Since $R(t)$ is a rational function, it is messy to calculate its second derivative,
so we avoid using the second derivative test. Instead, we apply the first derivative test of
Section 6.2.3. We examine the sign of the first derivative to the left and to the right of the
critical point. Looking at Eqn. (7.9), we note that the denominator is positive so the sign of
$R'(t)$ is determined by the numerator, $k\tau - t^2$. Thus $R'(t)$ is positive (function increases)
whenever $0 < t < t_{crit}$, and $R'(t)$ is negative (function decreases) whenever $t > t_{crit}$. So by
the first derivative test, the critical point is a local maximum. (We henceforth refer to it as
$t_{\max}$.)
We have found that to optimize the average rate of energy gain $R(t)$, the animal should
stay in the patch for a time duration of $t = t_{\max} = \sqrt{k\tau}$. We next ask what is the value of
the average rate of energy gain for this optimal patch residence time.
Example 7.10 Find the optimal average rate of energy gain $R(t)$.
Solution: Here we are asked to compute $R(t)$ for $t = t_{\max} = \sqrt{k\tau}$. We find that
$$R(t_{\max}) = \frac{E_{\max} t_{\max}}{(k + t_{\max})(\tau + t_{\max})} = \frac{E_{\max}}{\tau}\,\frac{1}{\left(1 + \sqrt{k/\tau}\right)^2}. \tag{7.10}$$

The reader is asked to fill in the steps for this calculation in Exercise 26(d).
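
A quick numerical check of Eqns. (7.8) and (7.10) may be helpful before working through the algebra. The sketch below uses illustrative parameter values ($E_{\max} = 10$, $k = 2$, $\tau = 8$) that are assumptions chosen here so that $\sqrt{k\tau} = 4$; the function names are hypothetical.

```python
import math

def rate(t, e_max, k, tau):
    """Average rate R(t) = E_max * t / ((k + t) * (tau + t)), Eqn. (7.8)."""
    return e_max * t / ((k + t) * (tau + t))

def optimal_residence_time(k, tau):
    """Critical point t_max = sqrt(k * tau)."""
    return math.sqrt(k * tau)

def optimal_rate(e_max, k, tau):
    """Closed form of Eqn. (7.10): (E_max / tau) / (1 + sqrt(k / tau))^2."""
    return (e_max / tau) / (1.0 + math.sqrt(k / tau)) ** 2

# Illustrative values only (not taken from the text).
e_max, k, tau = 10.0, 2.0, 8.0
t_max = optimal_residence_time(k, tau)
print("t_max =", t_max)
print("R(t_max) from (7.8): ", rate(t_max, e_max, k, tau))
print("R(t_max) from (7.10):", optimal_rate(e_max, k, tau))
```

The two printed rates should agree up to rounding, and evaluating `rate` slightly to either side of `t_max` should give smaller values.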

7.4.1 For further study: Other patch functions


So far, we have carried out the calculations for a specific patch function f (t) given by
Eqn. (7.6). However, we can gain insight and obtain an interesting result without making
this assumption. Let us repeat our analysis with a more general example in mind.
Example 7.11 Carry out the calculations for the optimal patch residence time for a
general patch energy function $f(t)$, without using the formula (7.6).
Solution: We use the expression for $R(t)$ given by (7.7). Differentiating, we find the first
derivative,
$$R'(t) = \frac{f'(t)(\tau + t) - f(t)}{(\tau + t)^2} = \frac{G(t)}{H(t)}$$


where
$$G(t) = f'(t)(\tau + t) - f(t), \qquad H(t) = (\tau + t)^2.$$

(The calculation is easier with this notation.) To maximize $R(t)$ we set
$$R'(t) = 0$$
which can occur only when the numerator of the above equation is zero, i.e.
$$G(t) = 0.$$
This means that
$$f'(t)(\tau + t) - f(t) = 0$$
so that, after simplifying algebraically,
$$f'(t) = \frac{f(t)}{\tau + t}. \tag{7.11}$$

A geometric argument
In practice, we would need to specify a function for $f(t)$ in order to solve for the optimal
time $t$. However, we can also solve this problem using a geometric argument. The last
equation equates two quantities that can be interpreted as slopes. On the left is the slope
of a tangent line. On the right is the slope (rise over run) of some right triangle whose
height is $f(t)$ and whose base length is $\tau + t$. In Figure 7.9, we show each slope on its
own: in the left panel, $f'(t)$ is the slope of the tangent line to the graph of $f(t)$. In the
central panel, we have constructed some triangle with the property that its hypotenuse has
slope $f(t)/[\tau + t]$. In the right panel we have superimposed both, selecting a value of $t$ for
which the slope of the triangle is the same as the slope of the tangent line. Notice that in
order to fit the triangle on the same diagram, we had to place its tip at the point $-\tau$ along
the horizontal axis. When these slopes coincide, it means that we have satisfied equation
(7.11), and we have found the desired time $t$ for optimal foraging.
We can use this observation in general to come up with the following steps to solve
an optimal foraging problem:
1. A biologist conducts some field experiments to determine the mean travel time from
food to nest, $\tau$, and the shape of the energy gain function $f(t)$. (This may require
capturing the animal and examining the contents of its stomach. . . an unappetizing
thought; we will leave this task to our brave biological colleagues.)
2. We draw a sketch of $f(t)$ as shown in the rightmost panel of Figure 7.9 and extend the
$t$ axis in the negative direction. At the point $-\tau$ we draw a line that just touches the
curve $f(t)$ at some point (i.e. a tangent line). The slope of this line is $f'(t)$ for some
value of $t$.
3. The value of $t$ at the point of tangency is the optimal time to spend in the patch!
The diagram drawn in our geometric solution (right panel in Figure 7.9) is often called a
rooted tangent.
We have shown that the point labeled $t$ indeed satisfies the condition that we derived
above for $R'(t) = 0$, and hence is a critical point.
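
When $f(t)$ is known only as a formula or a fitted curve, the tangency condition (7.11) can also be solved numerically rather than graphically. The sketch below is one possible approach, not a method from the text: it applies bisection to $G(t) = f'(t)(\tau + t) - f(t)$, with $f'(t)$ approximated by a central difference. The function name, the bracketing interval, and the step size are all assumptions, and the test case reuses the saturating gain function of Eqn. (7.6) with illustrative parameters.

```python
import math

def rooted_tangent_time(f, tau, lo, hi, tol=1e-10, dx=1e-6):
    """Find t in [lo, hi] with f'(t) * (tau + t) - f(t) = 0 by bisection."""
    def G(t):
        slope = (f(t + dx) - f(t - dx)) / (2.0 * dx)  # central difference
        return slope * (tau + t) - f(t)

    g_lo, g_hi = G(lo), G(hi)
    if g_lo * g_hi > 0:
        raise ValueError("G does not change sign on [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        g_mid = G(mid)
        if g_lo * g_mid <= 0:
            hi = mid                 # sign change lies in [lo, mid]
        else:
            lo, g_lo = mid, g_mid    # sign change lies in [mid, hi]
    return 0.5 * (lo + hi)

# Illustrative check with the gain function of Eqn. (7.6).
e_max, k, tau = 10.0, 2.0, 8.0
f = lambda t: e_max * t / (k + t)
t_opt = rooted_tangent_time(f, tau, lo=1e-3, hi=100.0)
print(t_opt, math.sqrt(k * tau))
```

For this choice of $f$ the numerical root should agree closely with $\sqrt{k\tau}$; for gain functions with no concave-down portion the bracket may show no sign change, in which case the sketch raises an error instead of returning a value.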

Figure 7.9. The solution to the optimal foraging problem can be expressed geometrically in the form shown in this figure. The tangent line at the (optimal) time $t$ should
have the same slope as the hypotenuse of the right triangle shown above. The diagram on
the far right is sometimes termed the rooted tangent diagram. (Panel labels: energy gain $f(t)$
on the vertical axis, tangent slope $f'(t)$, triangle base $\tau + t$ and height $f(t)$.)
Checking the type of critical point
We still need to show that this solution leads to a maximum efficiency (rather than, say, a
minimum or some other critical point). We will do this by examining $R''(t)$.
Recall that
$$R'(t) = \frac{G(t)}{H(t)}$$
in terms of the notation used above. Then
$$R''(t) = \frac{G'(t)H(t) - G(t)H'(t)}{H^2(t)}.$$

But, according to our remark above, at the patch time of interest (the candidate for optimal
time)
$$G(t) = 0$$
so that
$$R''(t) = \frac{G'(t)H(t)}{H^2(t)} = \frac{G'(t)}{H(t)}.$$

Now we substitute $G'(t)$ and $H(t)$ into this ratio:
$$G(t) = f'(t)(\tau + t) - f(t) \quad\Rightarrow\quad G'(t) = f''(t)(\tau + t) + f'(t) - f'(t) = f''(t)(\tau + t).$$

We find that
$$R''(t) = \frac{f''(t)(\tau + t)}{(\tau + t)^2} = \frac{f''(t)}{(\tau + t)}.$$

The denominator of this expression is always positive, so that the sign of $R''(t)$ will be the
same as the sign of $f''(t)$. But in order to have a maximum efficiency at some residence
time, we need $R''(t) < 0$. This tells us that the gain function has to have the property that
$f''(t) < 0$, i.e. has to be concave down at the optimal residence time.
Going back to some of the shapes of the function f (t) that we discussed in our
examples, we see that only some of these will lead to an optimal solution. In cases (1),
(2), (4) the function f (t) has no points of downwards concavity on its graph. This means
that in such cases there will be no local maximum. The optimal efficiency would then be
attained by spending as much time as possible in just one patch, or as little time as possible
in any patch, i.e. it would be attained at the endpoints.
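
The claim about cases (1), (2) and (4) can be illustrated numerically. The sketch below evaluates $R(t) = f(t)/(\tau + t)$ on a grid and reports whether it is increasing, decreasing, or neither. The specific formulas used for the linear, constant, accelerating and saturating gains, the value $\tau = 2$, and the grid are all hypothetical choices made for illustration; they are not the functions plotted in Fig. 7.7.

```python
def classify_rate(f, tau, t_values):
    """Describe how R(t) = f(t) / (tau + t) behaves along the grid t_values."""
    rates = [f(t) / (tau + t) for t in t_values]
    steps = [b - a for a, b in zip(rates, rates[1:])]
    if all(s > 0 for s in steps):
        return "increasing"
    if all(s < 0 for s in steps):
        return "decreasing"
    return "not monotonic"

tau = 2.0                                    # illustrative travel time
t_values = [0.1 * i for i in range(1, 501)]  # grid from 0.1 to 50.0

linear = lambda t: 3.0 * t                   # like panel (1): never depleted
constant = lambda t: 5.0                     # like panel (2): full gain at once
accelerating = lambda t: t ** 2              # like panel (4): f''(t) > 0
saturating = lambda t: 10.0 * t / (1.0 + t)  # like panel (3): Eqn. (7.6) form

for name, f in [("linear", linear), ("constant", constant),
                ("accelerating", accelerating), ("saturating", saturating)]:
    print(name, "->", classify_rate(f, tau, t_values))
```

Under these assumptions only the saturating gain yields a rate that rises and then falls, which is the situation in which an interior optimum exists.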

7.5 Additional Examples of geometric optimization

7.5.1 Rectangular box with largest surface area


We consider several other examples of optimization where volumes, lengths, and/or surface
areas are considered.
Example 7.12 (Wrapping a rectangular box:) A box with square base and arbitrary height
has string tied around each of its perimeters. The total length of string so used is 10 inches.
Find the dimensions of the box with largest surface area. (That is, figure out what is the
largest amount of wrapping paper needed to wrap this box.)

Figure 7.10. A rectangular box is to be wrapped with paper. (The square base has side length $x$.)
Solution: The total length of string shown in Figure 7.10, consisting of three perimeters of
the box is as follows:
L = 2(x + x) + 2(x + y) + 2(x + y) = 8x + 4y = 10.
This total length is to be kept constant, so the above equation is the constraint in this
problem. This means that x and y are related to one another. We will use this fact to
eliminate one of them from the formula for surface area.
The surface area of the box is
$$S = 4(xy) + 2x^2,$$
since there are two faces (top and bottom) which are squares (area $x^2$) and four rectangular
faces with area $xy$. At the moment, the total surface area $S$ is expressed in terms of both
variables. Suppose we eliminate $y$ from $S$ by rewriting the constraint in the form:
$$y = \frac{5}{2} - 2x.$$


Then
$$S(x) = 4x\left(\frac{5}{2} - 2x\right) + 2x^2 = 10x - 8x^2 + 2x^2 = 10x - 6x^2.$$

We show the shape of this function in Figure 7.11. Note that $S(x) = 0$ at $x = 0$ and at
$10 - 6x = 0$, which occurs at $x = 5/3$. Now that $S$ is expressed as a function of one
variable, we can find its critical points by setting $S'(x) = 0$, i.e., solving
$$S'(x) = 10 - 12x = 0$$
for $x$: We get $x = 10/12 = 5/6$. To find the corresponding value of $y$ we can substitute
our result back into the constraint. We get
$$y = \frac{5}{2} - 2\left(\frac{5}{6}\right) = \frac{15 - 10}{6} = \frac{5}{6}.$$
Thus the dimensions of the box of interest are all the same, i.e. it is a cube with side length
5/6.
We can verify that
$$S''(x) = -12 < 0$$
(indeed this holds for all $x$), which means that $x = 5/6$ is a local maximum.
Further, we can find that
$$S = 4\left(\frac{5}{6}\right)\left(\frac{5}{6}\right) + 2\left(\frac{5}{6}\right)^2 = \frac{25}{6}$$
square inches. Figure 7.11 shows how the surface area varies as the dimension $x$ of the box
is varied.

Figure 7.11. Figure for Example 7.12: graph of $S(x)$ against $x$, with $x = 0$, $5/6$ and $5/3$ marked on the horizontal axis.
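
The arithmetic in this example can be reproduced exactly with rational numbers. The following sketch uses Python's standard `fractions` module; the function names are hypothetical, and the code simply re-evaluates the constraint $8x + 4y = 10$ and the surface area at the critical point.

```python
from fractions import Fraction

def height_from_constraint(x):
    """Solve the string constraint 8x + 4y = 10 for y."""
    return Fraction(5, 2) - 2 * x

def surface(x):
    """S = 4xy + 2x^2 with y eliminated using the constraint."""
    y = height_from_constraint(x)
    return 4 * x * y + 2 * x ** 2

x_opt = Fraction(10, 12)          # root of S'(x) = 10 - 12x
y_opt = height_from_constraint(x_opt)
s_opt = surface(x_opt)
print(x_opt, y_opt, s_opt)        # prints 5/6 5/6 25/6

# Nearby admissible values of x give a smaller surface area.
assert surface(x_opt - Fraction(1, 10)) < s_opt
assert surface(x_opt + Fraction(1, 10)) < s_opt
```

Exact fractions are used here so that the equality of $x$ and $y$ (the cube) is visible without rounding.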

7.5.2 A cylinder in a sphere


Example 7.13 (Fitting a cylinder inside a sphere) Find the cylinder of maximal volume
that would fit inside a sphere of radius R.

Figure 7.12. Definition of variables and geometry to consider. (Labels: cylinder radius $r$ and half-height $h/2$.)


Solution:
We sketch a cylinder inside a sphere as in Figure 7.12. It is helpful to add the radius
of the sphere and of the cylinder. We define the following:
h = height of cylinder,
r = radius of cylinder,
R = radius of sphere.
Then R is assumed a given fixed positive constant, and r and h are dimensions of the
cylinder to be determined.
From Figure 7.12 we see that the cylinder will fit if the top and bottom rims touch
the circle. When this occurs, the dark line in Figure 7.12 will be a radius of the sphere, and
so would have length R.
The connection between the variables (which will be our constraint) is given from
Pythagoras' theorem by:
$$R^2 = r^2 + \left(\frac{h}{2}\right)^2.$$
We would like to maximize the volume of the cylinder,
$$V = \pi r^2 h$$
subject to the above constraint.
Eliminating $r^2$ using Pythagoras' theorem leads to
$$V(h) = \pi\left(R^2 - \frac{h^2}{4}\right)h.$$
We see that the problem is very similar to our previous discussion. The reader can show by
working out the steps that
$$V'(h) = 0$$
occurs at the critical point
$$h = \frac{2}{\sqrt{3}}R$$
and that this is a local maximum.
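
For readers who want to confirm the critical point before working out the steps, the sketch below maximizes $V(h)$ numerically with a ternary search, which is suitable here because $V(h)$ rises and then falls on $0 \le h \le 2R$. The search routine, the choice $R = 1$, and the function names are assumptions made for illustration.

```python
import math

def cylinder_volume(h, R):
    """V(h) = pi * (R^2 - h^2 / 4) * h for a cylinder inscribed in a sphere."""
    return math.pi * (R ** 2 - h ** 2 / 4.0) * h

def ternary_search_max(f, lo, hi, iterations=200):
    """Locate the maximum of a function that rises and then falls on [lo, hi]."""
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if f(m1) < f(m2):
            lo = m1   # the maximum cannot lie to the left of m1
        else:
            hi = m2   # the maximum cannot lie to the right of m2
    return 0.5 * (lo + hi)

R = 1.0  # illustrative sphere radius
h_num = ternary_search_max(lambda h: cylinder_volume(h, R), 0.0, 2.0 * R)
h_exact = 2.0 * R / math.sqrt(3.0)
print("numerical h:", h_num, "exact h:", h_exact)
print("maximal volume:", cylinder_volume(h_exact, R))
```

Substituting $h = 2R/\sqrt{3}$ into $V(h)$ gives $4\pi R^3/(3\sqrt{3})$, which the last printed value should match.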


Exercises
7.1. The sum of two positive numbers is 20. Find the numbers
(a) if their product is a maximum.
(b) if the sum of their squares is a minimum.
(c) if the product of the square of one and the cube of the other is a maximum.
7.2. A tram ride at Disney World departs from its starting place at t = 0 and travels
to the end of its route and back. Its distance from the terminal at time t can be
approximately described by the expression
$$S(t) = 4t^3(10 - t)$$
(where t is in minutes, 0 < t < 10, and S is distance in meters.)
(a) Find the velocity as a function of time.
(b) When is the tram moving at the fastest rate?
(c) At what time does it get to the furthest point away from its starting position?
(d) Sketch the acceleration, the velocity, and the position of the tram on the same
set of axes.
7.3. At 9 A.M., car B is 25 km west of another car A. Car A then travels to the south at
30 km/h and car B travels east at 40 km/h. When will they be the closest to each
other and what is this distance?
7.4. A cannonball is shot vertically upwards from the ground with initial velocity $v_0 = 15$ m/sec.
It is determined that the height of the ball, $y$ (in meters), as a function of
the time, $t$ (in sec) is given by
$$y = v_0 t - 4.9t^2$$
Determine the following:
(a) The time at which the cannonball reaches its highest point,
(b) The velocity and acceleration of the cannonball at t = 0.5 s, and t = 1.5 s.
(c) The time at which the cannonball hits the ground.
7.5. Net nutrient increase rate: In Example 7.2, we considered the net rate of increase
of nutrients in a spherical cell of radius r. Here we further explore this problem.
(a) Draw a sketch of N (r) based on Eqn. (7.1). Use your sketch to verify that this
function has a local maximum.
(b) Use the first derivative test to show that the critical point $r = 2k_1/k_2$ is a local
maximum.
7.6. Nutrient increase in cylindrical cell: Consider a long skinny cell in the shape of
a cylinder with radius r and a fixed length L. Then the volume and surface area of
such a cell (neglecting endcaps) are $V = \pi r^2 L = K$ and $S = 2\pi r L$.


(a) Adapt the formula for net rate of increase of nutrients $N(r)$ for a spherical cell
(7.1) to the case of a cylindrical cell.
(b) Find the radius of the cylindrical cell that maximizes $N(r)$. Be sure to verify
that you have found a local maximum.
7.7. Cylinder of minimal surface area: In this problem we continue to explore Example 7.3.
(a) Reason that the surface area of the cylinder, $S(r) = 2\frac{K}{r} + 2\pi r^2$, is a function
that has a local minimum using curve-sketching skills.
(b) Use the first derivative test to show that $r = \left(\frac{K}{2\pi}\right)^{1/3}$ is a local minimum for
$S(r)$.
(c) Show the algebra required to find the value of $L$ corresponding to this $r$ value
and show that $L/r = 2$.

7.8. (From Final Exam, Math 100, Dec 1997) A closed 3-dimensional box is to be constructed in such a way that its volume is 4500 cm3 . It is also specified that the length
of the base is 3 times the width of the base. Find the dimensions of the box which
satisfy these conditions and have the minimum possible surface area. Justify your
answer.
7.9. A box with a square base is to be made so that its diagonal has length 1. See
Figure 7.13.
(a) What height y would make the volume maximal?
(b) What is the maximal volume?
[Hint: A box having side lengths $\ell$, $w$, $h$ has diagonal length $D$ where $D^2 =
\ell^2 + w^2 + h^2$ and volume $V = \ell w h$.]

Figure 7.13. Figure for Problem 9. (The square base has side length $x$.)


7.10. Find the minimum distance from a point on the positive x-axis (a, 0) to the parabola
$y^2 = 8x$.
7.11. The largest garden: You are building a fence to completely enclose part of your
backyard for a vegetable garden. You have already purchased material for a fence
of length 100 ft. What is the largest rectangular area that this fence can enclose?
7.12. Good Fences make Good Neighbors: A fence of length 100 ft is to be used to
enclose two gardens. One garden is to have a circular shape, and the other to be
square. Find out how the fence should be cut so that the sum of the areas inside both
gardens is as large as possible.
7.13. A rectangular piece of cardboard with dimension 12 cm by 24 cm is to be made into
an open box (i.e., no lid) by cutting out squares from the corners and then turning
up the sides. Find the size of the squares that should be cut out if the volume of the
box is to be a maximum.
7.14. Alternate solution to Kepler's wine barrel: In this problem we follow an alternate approach to the most economical wine barrel problem posed by Kepler (Example 7.4).
Our approach is to find the proportions (height:radius) of the cylinder that minimizes the length $L$ of the wet rod in Fig. 7.3 for a fixed volume.
(a) Explain why minimizing $L$ is equivalent to minimizing $L^2$ in Eqn. (7.4).
(b) Explain how Eqn. (7.3) can be used to specify a constraint for this problem.
(Hint: consider the volume, $V$, to be fixed and show that you can solve for $r^2$.)
(c) Use your result in (b) to eliminate $r$ from the formula for $L^2$. Now $L^2(h)$ will
depend only on the height of the cylindrical wine barrel.
(d) Use calculus to find any local minima for $L^2(h)$. Be sure to verify that your
result is a minimum.
(e) Find the corresponding value of $r$ using your result in (b).
(f) Find the ratio $h/r$. You should obtain the same result as in (7.5).
7.15. Rectangle with largest area: Find the side lengths, $x$ and $y$, of the rectangle with
largest area whose diagonal $L$ is given. (Hint: Eliminate one variable using the
constraint. Then, to simplify the derivative you can use the fact that critical points of
$A$ would also be critical points of $A^2$, where $A = xy$ is the area of the rectangle. Or
else, if you have already learned the chain rule, you can use it in the differentiation.)
7.16. Find the shortest path that would take a milk-maid from her house at (10, 10) to
fetch water at the river located along the x axis and then to the thirsty cow at (3, 5).
7.17. Water and ice: Why does ice float on water? Because the density of ice is lower!
In fact, water is the only common liquid whose maximal density occurs above its
freezing temperature. (This phenomenon favors the survival of aquatic life by preventing ice from forming at the bottoms of lakes.) According to the Handbook of
Chemistry and Physics, a mass of water that occupies one liter at 0°C occupies a
volume (in liters) of
$$V = -aT^3 + bT^2 - cT + 1$$
at $T$°C, where $0 \le T \le 30$ and where the coefficients are
$$a = 6.79 \times 10^{-8}, \quad b = 8.51 \times 10^{-6}, \quad c = 6.42 \times 10^{-5}.$$
Find the temperature between 0°C and 30°C at which the density of water is the
greatest. (Hint: maximizing the density is equivalent to minimizing the volume.
Why is this?)


7.18. Drug doses and sensitivity: The Reaction $R(x)$ of a patient to a drug dose of size
$x$ depends on the type of drug. For a certain drug, it was determined that a good
description of the relationship is:
$$R(x) = Ax^2(B - x)$$
where $A$ and $B$ are positive constants. The Sensitivity of the patient's body to the
drug is defined to be $R'(x)$.
(a) For what value of x is the reaction a maximum, and what is that maximum
reaction value?
(b) For what value of x is the sensitivity a maximum? What is the maximum
sensitivity?
7.19. Thermoregulation in a swarm of bees: In the winter, honeybees sometimes escape
the hive and form a tight swarm in a tree, where, by shivering, they can produce heat
and keep the swarm temperature elevated. Heat energy is lost through the surface of
the swarm at a rate proportional to the surface area ($k_1 S$ where $k_1 > 0$ is a constant).
Heat energy is produced inside the swarm at a rate proportional to the mass of the
swarm (which you may take to be a constant times the volume). We will assume
that the heat production is $k_2 V$ where $k_2 > 0$ is constant. Swarms that are not large
enough may lose more heat than they can produce, and then they will die. The heat
depletion rate is the loss rate minus the production rate. Assume that the swarm is
spherical. Find the size of the swarm for which the rate of depletion of heat energy
is greatest.
7.20. A right circular cone is circumscribed about a sphere of radius 5. Find the dimension
of this cone if its volume is to be a minimum. (Remark: this is a rather challenging
geometric problem.)
7.21. Optimal Reproductive Strategy: Animals that can produce many healthy babies
that survive to the next generation are at an evolutionary advantage over other, competing species. However, too many young produce a heavy burden on the parents
(who must feed and care for them). If this causes the parents to die, the advantage is
lost. Also, competition of the young with one another for food and parental attention
jeopardizes the survival of these babies. Suppose that the evolutionary Advantage
$A$ to the parents of having litter size $x$ is
$$A(x) = ax - bx^2.$$
Suppose that the Cost $C$ to the parents of having litter size $x$ is
$$C(x) = mx + e.$$
The Net Reproductive Gain $G$ is defined as
$$G = A - C.$$
(a) Explain the expressions for A, C and G.
(b) At what litter size is the advantage, A, greatest?

(c) At what litter size is there least cost to the parents?
(d) At what litter size is the Net Reproductive Gain greatest?

7.22. Behavioural Ecology: Social animals that live in groups can spend less time scanning for predators than solitary individuals. However, they waste time fighting with
the other group members over the available food. There is some group size at which
the net benefit is greatest because the animals spend least time on these unproductive
activities, and thus can spend time on feeding, mating, etc.
Assume that for a group of size $x$, the fraction of time spent scanning for predators
is
$$S(x) = A \frac{1}{(x + 1)}$$
and the fraction of time spent fighting with other animals over food is
$$F(x) = B(x + 1)^2$$
where A, B are constants. Find the size of the group for which the time wasted on
scanning and fighting is smallest.
7.23. Logistic growth: Consider a fish population whose density (individuals per unit
area) is N , and suppose this fish population grows logistically, so that the rate of
growth R satisfies
$$R(N) = rN(1 - N/K)$$
where r and K are positive constants.
(a) Sketch R as a function of N or explain Fig 7.1.
(b) Use a first derivative test to justify the claim that $N = K/2$ is a local maximum
for the function $R(N)$.
7.24. Logistic growth with harvesting: Consider a fish population of density $N$ growing
logistically, i.e. with rate of growth $R(N) = rN(1 - N/K)$ where $r$ and $K$ are
positive constants. The rate of harvesting (i.e. removal) of the population is
$$h(N) = qEN$$
where $E$, the effort of the fishermen, and $q$, the catchability of this type of fish, are
positive constants. At what density of fish does the growth rate exactly balance the
harvesting rate? (This density is called the maximal sustainable yield: MSY.)
7.25. Conservation of a harvested population: Conservationists insist that the density
of fish should never be allowed to go below a level at which growth rate of the
fish exactly balances with the harvesting rate. (At this level, the harvesting is at its
maximal sustainable yield. If more fish are taken, the population will keep dropping
and the fish will eventually go extinct.) What level of fishing effort should be used to
lead to the greatest harvest at this maximal sustainable yield? [Remark: you should
first do the previous problem.]
7.26. Optimal foraging: Consider Example 7.7 for the optimal foraging model.
(a) Show that the parameter $k$ in (7.6) is the time at which $f(t) = E_{max}/2$.


(b) Consider panel (5) of Fig. 7.7. Show that a function such as a Hill function
would have the shape shown in that sketch. Interpret any parameters in that
function.
(c) Use the quotient rule to calculate the derivative of the function R(t) given by
Eqn. (7.8) and show that you get (7.9).
(d) Fill in the missing steps in the calculation in Eqn. (7.10) to find the optimal
value of R(t).
7.27. Rate of net energy gain while foraging and traveling: Animals spend energy in
traveling and foraging. In some environments this energy loss is a significant portion
of the energy budget. In such cases, it is customary to assume that to survive, an
individual would optimize the rate of net energy gain, defined as
$$Q(t) = \frac{\text{Net energy gained}}{\text{total time spent}} = \frac{\text{Energy gained} - \text{Energy lost}}{\text{total time spent}} \qquad (7.12)$$

Assume that the animal spends p energy units per unit time in all activities (including
foraging and traveling). Assume that the energy gain in the patch (patch energy
function) is given by (7.6). Find the optimal patch time, that is the time at which
Q(t) is maximized in this scenario.
7.28. Maximizing net energy gain: Suppose that the situation requires an animal to maximize its net energy gained $E(t)$, defined as
$E(t)$ = energy gained while foraging − energy spent while foraging and traveling.
(So, this means that $E(t) = f(t) - r(t + \tau)$, where $r$ is the rate of energy spent per
unit time and $\tau$ is the fixed travel time.) Assume as before that the energy gained by
foraging for a time $t$ in the food patch is $f(t) = E_{max} t/(k + t)$. Find the amount
of time $t$ spent foraging that maximizes $E(t)$. Then indicate a condition of the form
$k <$ ?? that is required for existence of this critical point.


Chapter 8

Introducing the chain rule

So far, we have worked with simple functions such as power functions, polynomial, and
rational functions. This has made differentiation steps relatively easy. Now we want to
expand our horizons to deal with a variety of more interesting mathematical objects. Our
first step towards this goal is to learn how to differentiate composite functions. We dedicate
this chapter to the chain rule and its applications. Our first steps are to learn and understand
the new tool, and how it is used. Then we will use it on a variety of practical examples
where function composition is involved.

8.1 The chain rule

Section 8.1 Learning goals


1. Understand the concept of function composition and be able to express a composite
function in terms of the underlying composed functions.
2. Understand the chain rule of differentiation and be able to use it to find the derivative
of a composite function.

8.1.1 Function composition

Figure 8.1. Function composition: the input $x$ is sent to $u = f(x)$, which is then sent to $y = g(u)$.


Shown in Fig. 8.1 is an example of function composition: An independent variable,
x, is used to evaluate a function, and the result, u = f (x) then acts as an input to a second

function, g. The final value is y = g(u) = g(f (x)). We refer to the two-step function
operation as function composition.

Example 8.1 Consider the two functions $f(x) = \sqrt{x}$ and $g(x) = x^2 + 1$. Determine the
two new functions obtained by composing these, namely $h_1(x) = g(f(x))$ and $h_2(x) =
f(g(x))$.

Solution: For $h_1$ we apply $f$ first, followed by $g$, so $h_1(x) = (\sqrt{x})^2 + 1 = x + 1$. For $h_2$,
the functions are applied in the reversed order, so that $h_2(x) = \sqrt{x^2 + 1}$. We note that the
domains of the two functions are different: $h_1$ is only defined for $x \ge 0$, since $f(x)$
is not defined for negative $x$, whereas $h_2$ is defined for all $x$.
Example 8.2 Express the function $h(x) = 5(x^3 - x^2)^{10}$ as the composition of two simpler
functions. What are the domains of each of the functions?
Solution: We can write this in terms of the two functions $f(x) = x^3 - x^2$ and $g(x) = 5x^{10}$.
Then $h(x) = g(f(x))$. Both $f$ and $g$ are polynomials, so each of them, and hence $h$, is defined for all real $x$.
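Function composition is easy to experiment with on a computer. The following minimal Python sketch uses the square-root and quadratic functions of Example 8.1; the helper name `compose` and the names `g_of_f` and `f_of_g` are our own choices for illustration, not notation from the text.

```python
import math

def compose(outer, inner):
    """Return the composite function x -> outer(inner(x))."""
    return lambda x: outer(inner(x))

def f(x):
    return math.sqrt(x)      # defined only for x >= 0

def g(x):
    return x**2 + 1          # defined for all real x

g_of_f = compose(g, f)       # x -> (sqrt(x))^2 + 1, requires x >= 0
f_of_g = compose(f, g)       # x -> sqrt(x^2 + 1), defined for all x

print(g_of_f(4.0))           # apply f first, then g
print(f_of_g(-3.0))          # apply g first, then f
```

Evaluating `g_of_f` at a negative number raises an error, because the inner square root is undefined there. This is the domain restriction noted in Example 8.1.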

8.1.2 The chain rule of differentiation


Given a composite function y = f (g(x)), such as the ones we have encountered in our
examples, we are interested in understanding how changes in the variable x affect change
in y.

If $y = g(u)$ and $u = f(x)$ are both differentiable functions and $y = g(f(x))$ is the
composite function, then the chain rule of differentiation states that
$$\frac{dy}{dx} = \frac{dy}{du} \frac{du}{dx}$$
Informally, the chain rule states that the change in y with respect to x is a product of
two rates of change: (1) the rate of change of y with respect to its immediate input u, and
(2) the rate of change of u with respect to its input, x.
Why does it work this way? Although the derivative is not merely a quotient, we can
recall that it is arrived at from a quotient through a process of shrinking an interval. If we
write
$$\frac{\Delta y}{\Delta x} = \frac{\Delta y}{\Delta u} \frac{\Delta u}{\Delta x}$$
then it is apparent that the cancellation of the terms $\Delta u$ in numerator and denominator leads to
the correct fraction on the left. The proof of the chain rule uses this essential idea, but care
is taken to ensure that the quantity $\Delta u$ is nonzero, to avoid the embarrassment of dealing
with the nonsensical ratio 0/0.
Example 8.3 Apply the chain rule to differentiating the function $h(x) = 5(x^3 - x^2)^{10}$.


Solution: We express the function as $y = h(x) = 5u^{10}$ where $u = (x^3 - x^2)$. Then
$$\frac{dy}{dx} = \frac{dy}{du} \frac{du}{dx} = \frac{d(5u^{10})}{du} \, \frac{d(x^3 - x^2)}{dx} = 50u^9 (3x^2 - 2x).$$
Then, using the expression for $u$ leads to $dy/dx = 50(x^3 - x^2)^9 (3x^2 - 2x)$.
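A quick numerical experiment can catch slips in a chain rule calculation. The sketch below, in Python, compares the chain rule result for $h(x) = 5(x^3 - x^2)^{10}$, namely $50u^9(3x^2 - 2x)$ with $u = x^3 - x^2$, against a centered difference quotient. The function names, the test point $x = 1.5$ and the step size are our own illustrative choices.

```python
def h(x):
    return 5 * (x**3 - x**2) ** 10

def dh_chain(x):
    u = x**3 - x**2              # inner function u(x)
    dy_du = 50 * u**9            # derivative of 5u^10 with respect to u
    du_dx = 3 * x**2 - 2 * x     # derivative of the inner function
    return dy_du * du_dx         # chain rule: product of the two rates

def central_difference(func, x, step=1e-6):
    """Approximate func'(x) by a centered difference quotient."""
    return (func(x + step) - func(x - step)) / (2 * step)

x = 1.5
print(dh_chain(x), central_difference(h, x))
```

The two printed numbers should agree to many digits. A disagreement would signal an error in one of the two factors.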
Example 8.4 Compute the derivative of the function $y = f(x) = \sqrt{x^2 + d^2}$, where $d$ is
some positive real number.

Solution: This function can be considered as the composition of $g(u) = \sqrt{u}$ and $u(x) =
x^2 + d^2$. That is, we can write $f(x) = g(u(x))$. We rewrite $g$ in the form of a power function
and then use the chain rule to compute the derivative. We obtain
$$\frac{dy}{dx} = \frac{1}{2} (x^2 + d^2)^{-1/2} \cdot 2x = \frac{x}{(x^2 + d^2)^{1/2}} = \frac{x}{\sqrt{x^2 + d^2}}$$
Example 8.5 Compute the derivative of the function
$$y = f(x) = \frac{x}{\sqrt{x^2 + d^2}},$$
where $d$ is some positive real number.

Solution: We use both the quotient rule and the chain rule for this calculation.
$$\frac{dy}{dx} = \frac{[x]' \cdot \sqrt{x^2 + d^2} - [\sqrt{x^2 + d^2}]' \cdot x}{(\sqrt{x^2 + d^2})^2}$$
Here the $'$ denotes differentiation. Then
$$\frac{dy}{dx} = \frac{1 \cdot \sqrt{x^2 + d^2} - [\frac{1}{2} \cdot 2x \cdot (x^2 + d^2)^{-1/2}] \cdot x}{(x^2 + d^2)}$$
We simplify algebraically by multiplying top and bottom by $(x^2 + d^2)^{1/2}$ and canceling
factors of 2 to obtain
$$\frac{dy}{dx} = \frac{x^2 + d^2 - x^2}{(x^2 + d^2)^{1/2} (x^2 + d^2)} = \frac{d^2}{(x^2 + d^2)^{3/2}}$$
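The two derivative formulas obtained in Examples 8.4 and 8.5 can be spot-checked numerically. In the Python sketch below, the value $d = 1$, the sample points, and all function names are assumptions made only for illustration.

```python
import math

d = 1.0  # illustrative value of the positive constant d

def root(x):
    return math.sqrt(x**2 + d**2)            # function of Example 8.4

def root_prime(x):
    return x / math.sqrt(x**2 + d**2)        # derivative from Example 8.4

def ratio(x):
    return x / math.sqrt(x**2 + d**2)        # function of Example 8.5

def ratio_prime(x):
    return d**2 / (x**2 + d**2) ** 1.5       # derivative from Example 8.5

def central_difference(func, x, step=1e-6):
    """Approximate func'(x) by a centered difference quotient."""
    return (func(x + step) - func(x - step)) / (2 * step)

for x in (-2.0, 0.5, 3.0):
    print(x, root_prime(x), central_difference(root, x),
          ratio_prime(x), central_difference(ratio, x))
```

Note that the derivative found in Example 8.4 is exactly the function differentiated in Example 8.5, so the second formula is also the second derivative of $\sqrt{x^2 + d^2}$.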

8.1.3 Interpreting the chain rule


The following intuitive examples may help to motivate why the chain rule is based on a
product of two rates of change.
Example 8.6 (Pollution level in a lake) A species of fish is sensitive to pollutants in its
lake. As humans settle and populate the area adjoining the lake, one may see a decline in
the population of these fish due to increased levels of pollution. Quantify the rate at which
the pollution level changes with time based on the pollution produced per human and the
rate of increase of the human population.


Solution: The rate of decline of the fish would depend on the rate of change in the human
population around the lake, and the rate of change in the pollution created by each person.
If either of these factors increases, one would expect an increase in the effect on the fish
population and their possible extinction. The chain rule says that the net effect is a product
of the two interdependent rates. To be more specific, we could think of time t in years,
x = f (t) as the number of people living at the lake in year t, and p = g(x) as the pollution
created by $x$ people. Then the rate of change of the pollution $p$ over the years will be a
product of the rate of change of pollution per human, and the rate of increase of humans
over time:
$$\frac{dp}{dt} = \frac{dp}{dx} \frac{dx}{dt}$$
Example 8.7 (Population of carnivores, prey, and vegetation) The population of large
carnivores, C, on the African Savannah depends on the population of gazelles that are
prey, P . The population of these gazelles, in turn, depends on the abundance of vegetation
V , and this depends on the amount of rain in a given year, r. Quantify the rate of change
of the carnivore population with respect to the rainfall.

Figure 8.2. An example in which the population of carnivores, $C = h(P) = P^2$, depends on prey $P$, while the prey depend on vegetation, $P = f(V) = 2V$, and the
vegetation depends on rainfall, $V = g(r) = r^{1/2}$.
Solution: We can express these dependencies through functions; for instance, we could
write V = g(r), P = f (V ) and C = h(P ), where we understand that g, f, h are some
functions (resulting from measurement or data collection on the savanna).
As one specific example, shown in Figure 8.2, consider the case that
$$C = h(P) = P^2, \quad P = f(V) = 2V, \quad V = g(r) = r^{1/2}.$$

If there is a drought, and the rainfall changes, then there will be a change in the
vegetation. This will result in a change in the gazelle population, which will eventually
affect the population of carnivores on the savanna. We would like to compute the rate of
change in the carnivore population with respect to the rainfall, $dC/dr$.
According to the chain rule,
$$\frac{dC}{dr} = \frac{dC}{dP} \frac{dP}{dV} \frac{dV}{dr}.$$
The derivatives we need are
$$\frac{dV}{dr} = \frac{1}{2} r^{-1/2}, \quad \frac{dP}{dV} = 2, \quad \frac{dC}{dP} = 2P,$$
so that
$$\frac{dC}{dr} = \frac{dC}{dP} \frac{dP}{dV} \frac{dV}{dr} = \frac{1}{2} r^{-1/2} (2)(2P) = \frac{2P}{r^{1/2}}.$$
We can simplify this result by using the fact that $V = r^{1/2}$ and $P = 2V$. Plugging these
in, we obtain
$$\frac{dC}{dr} = \frac{2P}{V} = \frac{2(2V)}{V} = 4.$$
This example is simple enough that we can also express the number of carnivores
explicitly in terms of rainfall, by using the fact that $C = h(P) = h(f(V)) = h(f(g(r)))$.
We can eliminate all the intermediate variables and express $C$ in terms of $r$ directly:
$$C = P^2 = (2V)^2 = 4V^2 = 4(r^{1/2})^2 = 4r.$$
(This may be much more cumbersome in more complicated examples.) We can compute
the desired derivative in the simple old way, i.e.
$$\frac{dC}{dr} = 4.$$
We can see that our two answers agree.
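The three-link chain in this example can also be followed step by step in code. The Python sketch below uses the specific functions $V = r^{1/2}$, $P = 2V$, $C = P^2$ from the example; the sample rainfall values and the function name `dC_dr` are our own illustrative choices.

```python
import math

def g(r):
    return math.sqrt(r)      # vegetation V as a function of rainfall r

def f(V):
    return 2 * V             # prey P as a function of vegetation V

def h(P):
    return P**2              # carnivores C as a function of prey P

def dC_dr(r):
    """Chain rule: multiply the three rates of change along the chain."""
    V = g(r)
    P = f(V)
    dV_dr = 0.5 / math.sqrt(r)   # derivative of r^(1/2)
    dP_dV = 2.0                  # derivative of 2V
    dC_dP = 2 * P                # derivative of P^2
    return dC_dP * dP_dV * dV_dr

for r in (0.5, 1.0, 4.0):
    print(r, dC_dr(r), h(f(g(r))))   # rate of change, and C itself
```

For each rainfall value the computed rate is 4 up to rounding, and the last column equals $4r$, in agreement with both calculations in the example.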
Example 8.8 (Budget for coffee) The budget spent on coffee depends on the number of
cups consumed per day and on the price per cup. The total budget might change if the price
goes up or if the consumption goes up (e.g. during late nights preparing for midterm exams). Quantify the rate at which your budget for coffee would change if both consumption
and price change.
Solution: The total rate of change of the coffee budget is a product of the change in the
price and the change in the consumption. (In this example, we might think of time t in days
as the independent variable, x = f (t) as the number of cups of coffee consumed on day t,
and y = g(x) as the price for x cups of coffee.)
$$\frac{dy}{dt} = \frac{dy}{dx} \frac{dx}{dt}$$
Example 8.9 (Earth's temperature and greenhouse gases) In Exercise 21 of Chapter 1,
we found that the temperature of the Earth depends on the albedo $a$ (fraction of incoming
radiation energy reflected) according to the formula
$$T = \left( \frac{(1 - a)S}{\epsilon \sigma} \right)^{1/4} \qquad (8.1)$$
Suppose that the albedo $a$ depends on the level of greenhouse gases $G$ so that $da/dG$ is
known. If this is the only quantity that depends on $G$, determine how the temperature would
change as the level of greenhouse gases $G$ increases.


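Solution: The information provided specifies that $T$ depends on the level of greenhouse
gases via the chain of dependencies $G \to a \to T$. Let us write
$$T = \left( \frac{S}{\epsilon \sigma} \right)^{1/4} (1 - a)^{1/4}$$
In this problem the quantities $S$, $\epsilon$, $\sigma$ are all constants, so it simplifies calculation to write
the function in the form shown above. According to the chain rule,
$$\frac{dT}{dG} = \frac{dT}{da} \frac{da}{dG}.$$
We are given $da/dG$ and we can compute $dT/da$. Hence, we find that
$$\frac{dT}{dG} = \left( \frac{S}{\epsilon \sigma} \right)^{1/4} \frac{d}{da} \left[ (1 - a)^{1/4} \right] \frac{da}{dG} = \left( \frac{S}{\epsilon \sigma} \right)^{1/4} \cdot \frac{1}{4} (1 - a)^{(1/4) - 1} \cdot (-1) \cdot \frac{da}{dG}.$$
Rearranging leads to
$$\frac{dT}{dG} = -\frac{1}{4} \left( \frac{S}{\epsilon \sigma} \right)^{1/4} (1 - a)^{-3/4} \frac{da}{dG}.$$
In general, greenhouse gases affect both the Earth's albedo $a$ and its emissivity $\epsilon$. We
generalize our results in Exercise 2.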
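To see the formula for $dT/dG$ at work, one can pick some function $a(G)$ and compare the chain rule result with a difference quotient. The Python sketch below does this. Everything numerical in it is an assumption made only for illustration: the linear albedo function, the values assigned to `S`, `EPSILON` and `SIGMA`, and the level `G = 10` are hypothetical and are not data from the text.

```python
S = 340.0          # hypothetical value for the constant S
EPSILON = 0.6      # hypothetical value for the emissivity
SIGMA = 5.67e-8    # illustrative value for the constant sigma

def albedo(G):
    """Hypothetical albedo that decreases linearly with gas level G."""
    return 0.3 - 0.001 * G

DA_DG = -0.001     # derivative of the hypothetical albedo function

def temperature(a):
    """T as a function of albedo a, following Eqn. (8.1)."""
    return ((1 - a) * S / (EPSILON * SIGMA)) ** 0.25

def dT_dG_chain(G):
    """Chain rule: dT/dG = (dT/da) * (da/dG)."""
    a = albedo(G)
    dT_da = -0.25 * (S / (EPSILON * SIGMA)) ** 0.25 * (1 - a) ** (-0.75)
    return dT_da * DA_DG

def dT_dG_numeric(G, step=1e-4):
    """Centered difference quotient of the composite function T(a(G))."""
    return (temperature(albedo(G + step)) - temperature(albedo(G - step))) / (2 * step)

G = 10.0
print(dT_dG_chain(G), dT_dG_numeric(G))
```

With this made-up albedo function, $da/dG < 0$, so the two negative factors combine and the temperature increases with $G$. A different assumed $a(G)$ would change the numbers but not the structure of the calculation.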

8.2 Chain Rule applied to optimization problems


Now that the chain rule is available to us as a differentiation tool, we are able to extend
the repertoire of mathematical problems that are amenable to calculus solutions. Here we
show two examples where derivatives require the chain rule. In both cases, we consider
optimization problems based on a biological situation. Optimization will provide insight
into how animals organize to find their food.

Section 8.2 Learning goals


1. Read and follow the derivation of each optimization model.
2. Be able to carry out the calculations of derivatives appearing in the problems (using
the chain rule).
3. Using optimization, find each critical point and verify its type.
4. Understand and be able to explain the interpretation of the mathematical results.


8.2.1 Shortest path from food to nest


Ants are good mathematicians! They are able to find the shortest route that connects their
nest to a food source, to be as efficient as possible in bringing the food back home.
But how do they do it? It transpires that each ant secretes a chemical pheromone
that other ants like to follow. This marks up the trail that they use, and recruits nestmates to food sources. The pheromone (chemical message for marking a route) evaporates
after a while, so that, for a given number of foraging ants, a longer trail will have a less
concentrated chemical marking than a shorter trail. This means that whenever a shorter
route is found, the ants will favour it. After some time, this leads to selection of the shortest
possible trail.
Figure 8.3. Three ways to connect the ants' nest to two food sources, showing (a)
a V-shaped, (b) T-shaped, and (c) Y-shaped paths.
Shown in Figure 8.3 is a common laboratory test scenario, where ants at a nest
are offered two equivalent food sources to utilize. We will use the chain rule and other
results of this chapter to determine the shortest path that will emerge after the ants explore
for some time.
Example 8.10 (Minimizing the total path length for the ants) Use the diagram to determine the length of the shortest path that connects the nest to both food sources. Assume
that $d \ll D$.
Solution: We consider two possibilities before doing any calculus. The first is that the
shortest path has the shape of the letter T, whereas the second is that it has the shape of a
letter V. Then for a T-shaped path, the total length is $D + 2d$, whereas for a V-shaped path
it is $2\sqrt{D^2 + d^2}$. Now we consider a third possibility, namely that the path has the shape
of the letter Y. This means that the ants start to walk straight ahead and then veer off to the
food after a while.
It turns out to simplify our calculations if we label the distance from the nest to the
Y-junction as $D - x$. Then $x$ is the remaining distance shown in the diagram. The length
of the Y-shaped path is then given by
$$L = L(x) = (D - x) + 2\sqrt{d^2 + x^2}. \qquad (8.2)$$


Now we observe that when $x = 0$, then $L_T = D + 2d$, which corresponds exactly to the
T-shaped path, whereas when $x = D$ then $L_V = 2\sqrt{d^2 + D^2}$, which is the length of the
V-shaped path. Thus in this problem, we have $0 \le x \le D$ as the appropriate domain, and
we have determined the values of $L$ at the two domain endpoints.
To find the minimal path length, we look for critical points of the function $L(x)$.
Differentiating, we obtain (using results of Example 8.4)
$$L'(x) = \frac{dL}{dx} = -1 + 2 \frac{x}{\sqrt{x^2 + d^2}}.$$
Critical points occur at $L'(x) = 0$, which corresponds to
$$-1 + 2 \frac{x}{\sqrt{x^2 + d^2}} = 0.$$
We simplify this algebraically to obtain
$$\sqrt{x^2 + d^2} = 2x \quad \Rightarrow \quad x^2 + d^2 = 4x^2 \quad \Rightarrow \quad 3x^2 = d^2 \quad \Rightarrow \quad x = \frac{d}{\sqrt{3}}.$$

To determine the kind of critical point, we find the second derivative (see Example 8.5).
Then
$$L''(x) = 2 \frac{d^2}{(x^2 + d^2)^{3/2}} > 0.$$
Thus the second derivative is positive and the critical point is a local minimum.

To determine the actual length of the path, we substitute $x = d/\sqrt{3}$ into the function
$L(x)$ and obtain (after simplification, see Exercise 3)
$$L = L(x) = D + \sqrt{3} d.$$
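The result can be checked numerically by evaluating $L(x)$ on a fine grid of $x$ values and picking out the smallest value. The Python sketch below does this for $D = 2$, $d = 1$, the values used in Figure 8.4; the grid size and the function name `path_length` are our own illustrative choices.

```python
import math

def path_length(x, D, d):
    """Length of the Y-shaped path, Eqn. (8.2)."""
    return (D - x) + 2 * math.sqrt(d**2 + x**2)

D, d = 2.0, 1.0                       # values used in Figure 8.4
x_star = d / math.sqrt(3)             # critical point found by calculus

# Brute-force search over a grid of x values in [0, D]
n = 2000
grid = [D * i / n for i in range(n + 1)]
x_best = min(grid, key=lambda x: path_length(x, D, d))

print(x_star, x_best)                                   # locations of the minimum
print(path_length(x_star, D, d), D + math.sqrt(3) * d)  # minimal length, two ways
print(path_length(0.0, D, d), path_length(D, D, d))     # T-shaped and V-shaped paths
```

The grid search lands next to $x = d/\sqrt{3}$, and the Y-shaped path there is shorter than both the T-shaped and the V-shaped alternatives at the endpoints of the domain.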

8.2.2 Food choice and attention


The example described in this section is taken from actual biological research. It has several
noteworthy features: First, we put the chain rule to use. Second, we encounter a surprise in
some of our elementary calculations. Third, we find that not every problem has an elegant
or analytically simple solution, and arrive at an application of Newton's method. Finally,
we see that some very general observations can provide insight that we do not get as easily
from the specific cases. The problem is taken from the study of animal behaviour.
Paying attention
Behavioural ecologist Reuven Dukas (McMaster U) studies the choices that animals make
when deciding which food to look for. His work has resulted in both theoretical and experimental conclusions about choices and strategies that animals follow. The example described below is based on his work with blue jays described in several publications [7, 8, 9].
Figure 8.4. (a) In the configuration for the shortest path we found that $x = d/\sqrt{3}$.
(b) The total length of the path $L(x)$ as a function of $x$ for $D = 2$, $d = 1$. The minimal
path occurs when $x = 1/\sqrt{3} \approx 0.577$. The length of the shortest path is then $L = D + \sqrt{3} d = 2 + \sqrt{3} \approx 3.73$.

Many types of food are cryptic, i.e. hidden in the environment, and require time
and attention to find. Some types of food are more easily detected, but other foods might
provide more nourishment. Clearly, the animal that succeeds in gaining the greatest nourishment during a typical day will have the greatest chance of surviving and out-competing
others. Thus, it makes sense that animals should choose to divide their time and attention
between food types in such a way as to maximize the total gain over the given time period
available for foraging.
Setting up a model
Suppose that there are two types of food available in the environment. We will define a
variable that represents the attention that an animal can devote to finding a given food type.
• Let $x$ = attention devoted to finding food of some type. Assume that $0 \le x \le 1$, with
$x = 0$ representing no attention at all to that type of food and $x = 1$ full attention
devoted to finding that item.
• Let $P(x)$ denote the probability of finding the food given that attention $x$ is devoted
to the task. Then $0 \le P \le 1$, as is commonly assumed for a probability. $P = 0$
means that the food is never found, and $P = 1$ means that the food is always found.
• Consider foods that have the property $P(0) = 0$, $P(1) = 1$. This means that if no
attention is paid ($x = 0$) then there is no probability of finding the food ($P = 0$),
whereas if full attention is given to the task ($x = 1$) then there is always success
($P = 1$).
• Suppose that there is more than one food type in the animal's environment. Then we
will assume that the attentions paid to finding these foods, $x$ and $y$, sum up to 1: i.e.
since attention is limited, $x + y = 1$, or, simply, $y = 1 - x$.

In Figure 8.5, we show typical examples of the success versus attention curves for
four different types of food labeled 1 through 4. On the horizontal axis, we show the
attention $0 \le x \le 1$, and on the vertical axis, we show the probability of success at finding
food, $0 \le P \le 1$. We observe that all the curves share in common the features we have
described: Full success for full attention, and no success for no attention.
However the four curves shown here differ in their values at intermediate levels of
attention.
Figure 8.5. The probability, $P(x)$, of finding a food depends on the level of attention $x$ devoted to finding that food. Here $0 \le x \le 1$, with $x = 1$ being full attention
devoted to the task. We show possible curves for four types of foods (labeled 1 through 4), some easier to find
than others.

Questions:
1. What is the difference between foods of type 1 and 4?
2. Which food is easier to find, type 3 or type 4?
3. What role is played by the concavity of the curve?
You will have observed that some curves, notably those concave down, such as curves
3 and 4 rise rapidly, indicating that the probability of finding food increases a lot just by
increasing the attention by a little: These represent foods that are relatively easy to find. In
other cases, where the function is concave up, (curves 1 and 2), we must devote much more
attention to the task before we get an appreciable increase in the probability of success:
these represent foods that are harder to find, or more cryptic. We now explore what happens
when the attention is subdivided between several food types.
Suppose that two foods available in the environment can contribute relative levels of
nutrition 1 and $N$ per unit. We wish to determine the subdivision of the attention
for which the total nutritional value gained is as large as possible.
Suppose that P1 (x) and P2 (y) are probabilities of finding food of type 1 or 2 given
that we spend attention x or y in looking for that type.


Let $x$ = the attention devoted to finding food of type 1. Then attention $y = 1 - x$ can
be devoted to finding food of type 2.
Suppose that the relative nutritional values of the foods are 1 and $N$.
Then the total value gained by splitting up the attention between the two foods is:
$$V(x) = P_1(x) + N P_2(1 - x).$$
Example 8.11 ($P_1$ and $P_2$ as power functions with integer powers) Consider the case that
the probability of finding the food types is given by the simple power functions,
$$P_1(x) = x^2, \quad P_2(y) = y^3.$$
Find the optimal food value $V(x)$ that can be attained.


Solution: We note that these functions satisfy $P(0) = 0$, $P(1) = 1$, in accordance with
the sketches shown in Figure 8.5. Further, suppose that both foods are equally nutritious.
Then $N = 1$, and the total value is
$$V(x) = P_1(x) + N P_2(1 - x) = x^2 + (1 - x)^3.$$
We look for a maximum value of $V$: Setting $V'(x) = 0$ we get (using the Chain Rule)
$$V'(x) = 2x + 3(1 - x)^2 (-1) = 0.$$
We observe that a negative factor $(-1)$ comes from applying the chain rule to the factor
$(1 - x)^3$. The above equation can be expanded into a simple quadratic equation:
$$-3x^2 + 8x - 3 = 0$$
whose solutions are
$$x = \frac{4 \pm \sqrt{7}}{3} \approx 0.4514, \; 2.2153.$$
Since the attention must take on a value in $0 < x < 1$, we must reject the second of the two
solutions. It would appear that the animal may benefit most by spending a fraction 0.4514
of its attention on food type 1 and the rest on type 2.
However, to confirm our speculation, we must check whether the critical point is a
maximum. To do so, consider the second derivative,
$$V''(x) = \frac{d}{dx} \left[ 2x - 3(1 - x)^2 \right] = 2 - 3(2)(1 - x)(-1) = 2 + 6(1 - x).$$
(The factor $(-1)$ that appears in the computation is due to the Chain Rule applied to $(1 - x)$
as before.)
Observing the result, and recalling that $x < 1$, we note that the second derivative
is positive for all values of $x$! This is unfortunate, as it signifies a local minimum! The
animal gains least by splitting up its attention between the foods in this case. Indeed, from
Figure 8.6(a), we see that the most gain occurs at either $x = 0$ (only food of type 2 sought)
or $x = 1$ (only food of type 1 sought). Again we observe the importance of checking for
the type of critical point before drawing hasty conclusions.
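The conclusion of this example is easy to confirm by direct evaluation. The Python sketch below evaluates $V$, $V'$ and $V''$ for $P_1(x) = x^2$, $P_2(y) = y^3$, $N = 1$ at the critical point $x = (4 - \sqrt{7})/3$ and compares $V$ there with its values at the two endpoints. The function names are our own illustrative choices.

```python
import math

def V(x):
    return x**2 + (1 - x) ** 3        # total value with N = 1

def V_prime(x):
    return 2 * x - 3 * (1 - x) ** 2   # the (-1) comes from the chain rule

def V_second(x):
    return 2 + 6 * (1 - x)

x_c = (4 - math.sqrt(7)) / 3          # critical point inside 0 < x < 1

print(x_c, V_prime(x_c), V_second(x_c))
print(V(0.0), V(x_c), V(1.0))         # endpoint values versus the critical point
```

The first derivative vanishes at `x_c` up to rounding, the second derivative is positive there, and the value at the critical point is smaller than the value 1 attained at both endpoints.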

Figure 8.6. Nutritional value versus attention $x$ for $0 \le x \le 1$: (a) $V_1(x)$ for Example 8.11 and (b) $V_2(x)$ for Example 8.12. In (a) the
probabilities of finding foods of types 1 and 2 are concave up power functions, whereas
in (b) both functions are concave down. As a result there is a local maximum for the
nutritional value in (b) but not in (a).
Example 8.12 (Fractional-power functions for $P_1$, $P_2$) As a second example, consider
the case that the probability of finding the food types is given by the concave down power
functions,
$$P_1(x) = x^{1/2}, \quad P_2(y) = y^{1/3}$$
and both foods are equally nutritious ($N = 1$). Find the optimal food value $V(x)$.
Solution: These functions also satisfy $P(0) = 0$, $P(1) = 1$, in accordance with the
sketches shown in Figure 8.5. Then
$$V(x) = P_1(x) + P_2(1 - x) = \sqrt{x} + (1 - x)^{1/3},$$
$$V'(x) = \frac{1}{2\sqrt{x}} - \frac{1}{3 (1 - x)^{2/3}},$$
$$V''(x) = -\frac{1}{4 x^{3/2}} - \frac{2}{9 (1 - x)^{5/3}}.$$

At this point we would like to proceed to solve $V'(x) = 0$ to find the critical point. Unfortunately, this problem, while seemingly routine, turns out to be algebraically nasty. However,
rather than despair, we seek an approximate solution to the problem, for which Newton's
Method proves ideal, as shown next.
A plotting program is used to display the value obtained by splitting up the attention
in this way in Figure 8.6(b). It is clear from this figure that a maximum occurs in the middle
of the interval, i.e. for attention split between finding both foods. We further see from $V''(x)$
that the second derivative is negative for all values of $x$ in the interval, indicating that we
have obtained a local maximum, as expected.


Applying Newton's method to finding the critical point


Example 8.13 Use Newton's Method to find the critical point for the function $V(x)$ in
Example 8.12.
Solution: Let $f(x) = V'(x)$. Then finding the critical point of $V(x)$ reduces to finding the
zero of the function $V'(x)$, i.e. solving $f(x) = 0$. This is the precise type of problem that
Newton's Method addresses, as discussed in Section 5.4. Since the interval of interest is
$0 \le x \le 1$, we will start with an initial guess for the critical point at $x_0 = 0.5$, midway
along this interval. Then, according to Newton's method, the improved guess would be
$$x_1 = x_0 - \frac{f(x_0)}{f'(x_0)},$$
and, repeating this, at the $k$th stage,
$$x_{k+1} = x_k - \frac{f(x_k)}{f'(x_k)}.$$
To use this method, we must carefully note that
$$f(x) = V'(x) = \frac{1}{2\sqrt{x}} - \frac{1}{3 (1 - x)^{2/3}},$$
$$f'(x) = V''(x) = -\frac{1}{4 x^{3/2}} - \frac{2}{9 (1 - x)^{5/3}}.$$

Thus, we might use a spreadsheet in which cell A1 stores our initial guess, whereas B1,
C1, and D1 store the values of $f(x_0)$, $f'(x_0)$ and $x_0 - f(x_0)/f'(x_0)$. In the typical syntax
of spreadsheets, this might read something like the following.
A1 0.5
B1 =(1/(2*SQRT(A1))-1/(3*(1-A1)^(2/3)))
C1 =(-1/(4*A1^(3/2))-2/(9*(1-A1)^(5/3)))
D1 =A1-B1/C1
Applying this idea, and repeating the calculation by dragging the values to successive
rows would lead to iterated approximations as follows.
$x_0 = 0.50000$, $x_1 = 0.62599$, and thereafter, successive values 0.61982, 0.61978,
0.61978, ...
Thus, we see that the values converge to the location of the critical point, $x =
0.61978$ (and $y = 1 - x = 0.38022$), within the interval of interest.
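The same iteration can be carried out in a few lines of a programming language instead of a spreadsheet. The following Python sketch is one possible version; the function name `newton`, the stopping tolerance and the cap on the number of iterations are our own assumptions, not part of the method as described in the text.

```python
def f(x):
    """f(x) = V'(x) for V(x) = x^(1/2) + (1 - x)^(1/3)."""
    return 0.5 * x ** (-0.5) - (1 - x) ** (-2 / 3) / 3

def f_prime(x):
    """f'(x) = V''(x)."""
    return -0.25 * x ** (-1.5) - 2 * (1 - x) ** (-5 / 3) / 9

def newton(func, dfunc, x0, tol=1e-10, max_iter=50):
    """Repeat x <- x - func(x)/dfunc(x) until the step is smaller than tol."""
    x = x0
    for _ in range(max_iter):
        step = func(x) / dfunc(x)
        x = x - step
        if abs(step) < tol:
            break
    return x

root = newton(f, f_prime, 0.5)      # initial guess midway along the interval
print(round(root, 5), round(1 - root, 5))
```

The printed pair is the attention devoted to food types 1 and 2 at the critical point. Because $f'(x) = V''(x)$ is negative throughout $0 < x < 1$, this critical point is a local maximum of $V$.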
Epilogue
While the conclusions drawn above were disappointing in one specific case, it is not always
true that concentrating all one's attention on one type is optimal. We can examine the
problem in more generality to find when the opposite conclusion might be satisfied. In the
general case, the value gained is


\[ V(x) = P_1(x) + N P_2(1 - x). \]

A critical point occurs when
\[ V'(x) = \frac{d}{dx}\left[P_1(x) + N P_2(1 - x)\right] = P_1'(x) + N P_2'(1 - x)(-1) = 0. \]

(By now you realize where the extra term $(-1)$ comes from: yes, from the Chain Rule!)
Suppose we have found a value of $x$ in $0 < x < 1$ at which this is satisfied. We then
examine the second derivative:
\[ V''(x) = \frac{d}{dx}\left[V'(x)\right] = \frac{d}{dx}\left[P_1'(x) - N P_2'(1 - x)\right] \]
\[ = P_1''(x) - N P_2''(1 - x)(-1) = P_1''(x) + N P_2''(1 - x). \]

The concavity of the function $V$ is thus related to the concavity of the two functions $P_1(x)$
and $P_2(1 - x)$. If these are concave down (e.g. as in food types 3 or 4 in Figure 8.5), then
$V''(x) < 0$ and a local maximum will occur at any critical point found by our differentiation.
Another way of stating this observation is: if both food types are relatively easy to
find, one can gain most benefit by splitting up the attention between the two. Otherwise, if
both are hard to find, then it is best to look for only one at a time.

Exercises
8.1. Practicing the Chain Rule: Use the chain rule to calculate the following derivatives.
(a) $y = f(x) = (x + 5)^5$
(b) $y = f(x) = 4(x^2 + 5x - 1)^8$
(c) $y = f(x) = (\sqrt{x} + 2x)^3$


8.2. Earth's temperature: In this problem, we expand and generalize the results of
Example 8.9. As before, let $G$ denote the level of greenhouse gasses on Earth,
and consider the relationship of temperature of the earth to the albedo $a$ and the
emissivity $\epsilon$ given by Eqn. (8.1).
(a) Suppose that $a$ is constant, but $\epsilon$ depends on $G$. Assume that $d\epsilon/dG$ is given.
Determine the rate of change of temperature with respect to the level of greenhouse gasses in this case.
(b) Suppose that both $a$ and $\epsilon$ depend on $G$. Find $dT/dG$ in this more general
case. (Hint: the quotient rule as well as the chain rule will be needed in this
case.)
8.3. Shortest path from nest to food sources:
(a) Use the first derivative test to verify that the value $x = d/\sqrt{3}$ is a local minimum
of the function $L(x)$ given by Eqn (8.2).
(b) Show that the shortest path is $L = D + \sqrt{3}\,d$.

(c) In Section 8.2.1 we assumed that d << D, so that the food sources were close
together relative to the distance from the nest. Now suppose that D = d/2.
How would this change the solution to the problem?
8.4. Geometry of the shortest ant's path: Use the results of Section 8.2.1 to show that
in the shortest path, the angles between the branches of the Y-shaped path are all
$120^\circ$. You may find it helpful to recall that $\sin(30^\circ) = 1/2$, $\sin(60^\circ) = \sqrt{3}/2$.
8.5. More about the ant trail: Consider the lengths of the V and T-shaped paths in the
ant trail example of Section 8.2.1. We will refer to these as $L_V$ and $L_T$, and both
depend on the distances $d$ and $D$ in Fig. 8.3.
(a) Write down the expressions for each of these functions.
(b) Suppose the distance $D$ is fixed. How do the two lengths $L_V$, $L_T$ depend on
the distance $d$? Use your sketching skills to draw a rough sketch of $L_V(d)$, $L_T(d)$.
(c) Use your sketch to determine whether there is a value of $d$ for which the lengths
$L_V$ and $L_T$ are the same.
8.6. Divided attention:
This problem is based on the material on food choice and attention described in
Section 7.6. It is advisable to first read that section.
A bird in its natural habitat feeds on two kinds of seeds, whose nutritional values are
5 calories per seed of type 1 and 3 calories per seed of type 2. Both kinds of seeds
are hidden among litter on the forest floor and have to be found. If the bird splits its
attention into a fraction $x_1$ searching for seed type 1 and a fraction $x_2$ searching for
seed type 2, then its probability of finding 100 seeds of the given type is
\[ P_1(x_1) = (x_1)^3, \qquad P_2(x_2) = (x_2)^5. \]
Assume that the bird pays full attention to searching for seeds so that $x_1 + x_2 = 1$
where $0 \le x_1 \le 1$ and $0 \le x_2 \le 1$.
(a) Write down an expression for the total nutritional value $V$ gained by the bird
when it splits its attention. Use the constraint on $x_1$, $x_2$ to eliminate one of
these two variables. (For example, let $x = x_1$ and write $x_2$ in terms of $x_1$.)
(b) Find critical points of $V(x)$ and classify those points.
(c) Find absolute minima and maxima of $V(x)$ and use your results to explain
what is the bird's optimal strategy to maximize the nutritional value of the
seeds it can find.

Chapter 9

Chain rule applied to related rates and implicit differentiation

9.1 Applications of the chain rule to related rates

In many applications of the chain rule, we are interested in processes that take place over
time. We ask how the relationships between certain geometric (or physical) variables affect
the rates at which they change over time. Many of these examples are given as word
problems, and we are called on to assemble the required geometric or other relationships
in solving the problem.

Section 9.1 Learning goals


1. Understand the application of the chain rule to problems in which a composite relationship between variables is involved.
2. Given a geometric relationship and a rate of change of one of the variables, be able
to use the chain rule to find the rate of change of a related variable.
3. Understand how to use verbal information about rates of change to set up the required
relationships, and to solve a word problem involving an application of the chain rule
(related rates problem).
A few relationships that we will find useful are concentrated in Table 9.1.
Volume of sphere                 $V = \frac{4}{3}\pi r^3$
Surface area of sphere           $S = 4\pi r^2$
Area of circle                   $A = \pi r^2$
Perimeter of circle              $P = 2\pi r$
Volume of cylinder               $V = \pi r^2 h$
Volume of cone                   $V = \frac{1}{3}\pi r^2 h$
Area of rectangle                $A = xy$
Perimeter of rectangle           $P = 2x + 2y$
Volume of box                    $V = xyz$
Sides of Pythagorean triangle    $c^2 = a^2 + b^2$

Table 9.1. Common relationships on which problems about related rates are often based.

Example 9.1 (Tumor growth:) A tumor grows so that its radius expands at a constant
rate, $k$. Determine the rate of growth of the volume of the tumor when the radius is one
centimeter. Assume that the tumor is approximately spherical.

Figure 9.1. Growth of a spherical tumor. Since the radius changes with time, the
volume, too, changes with time. We use the chain rule to link $dV/dt$ to $dr/dt$.

Solution: The volume of a sphere depends on its radius $r$, a fact we can express with the
notation $V(r) = (4/3)\pi r^3$. In the situation under discussion, $r$ changes with time, resulting
in $V$ changing with time. We indicate this chain of dependencies using the notation $r(t)$
and $V(r(t))$. At any time $t$, the relationship is
\[ V(r(t)) = \frac{4}{3}\pi [r(t)]^3. \]
Here we have emphasized function composition to motivate the necessity of applying the
chain rule. Then
\[ \frac{d}{dt}V(r(t)) = \frac{dV}{dr}\frac{dr}{dt} = \frac{d}{dr}\left(\frac{4}{3}\pi r^3\right)\frac{dr}{dt} = \frac{4}{3}\pi \cdot 3r^2\,\frac{dr}{dt} = 4\pi r^2 \frac{dr}{dt}. \]
But we are told that the radius expands at a constant rate, $k$, so that
\[ \frac{dr}{dt} = k. \]
Hence
\[ \frac{dV}{dt} = 4\pi r^2 k. \]
We see that the rate of growth of the volume actually goes as the square of the radius.
(Indeed a more astute observation is that the volume grows at a rate proportional to the
surface area, since the quantity $4\pi r^2$ is precisely the surface area of the sphere.) At the
instant that the radius is $r = 1$ cm we find that
\[ \frac{dV}{dt} = 4\pi k. \]
An important note is that this numerical value for the radius holds only at one instant and
is used at the end of the calculation, after the differentiation and simplification steps are
completed.
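As a quick numerical sanity check of this result, the sketch below compares the chain-rule formula $dV/dt = 4\pi r^2 k$ with a finite-difference estimate of the rate of change of $V(r(t))$. The values of k and r0, and the linear form $r(t) = r_0 + kt$, are hypothetical illustration choices; the example itself fixes only that $dr/dt = k$.

```python
import math

def volume(r):
    """Volume of a sphere of radius r."""
    return 4.0 / 3.0 * math.pi * r ** 3

def dV_dt_formula(r, k):
    """Chain-rule result: dV/dt = 4*pi*r^2*k."""
    return 4.0 * math.pi * r ** 2 * k

def dV_dt_numeric(t, k, r0, dt=1e-6):
    """Central-difference estimate of d/dt V(r(t)), assuming r(t) = r0 + k*t."""
    r = lambda s: r0 + k * s
    return (volume(r(t + dt)) - volume(r(t - dt))) / (2.0 * dt)

k = 0.1                   # hypothetical radial growth rate (cm per day)
r0 = 0.5                  # hypothetical initial radius (cm)
t_star = (1.0 - r0) / k   # time at which the radius reaches 1 cm

exact = dV_dt_formula(1.0, k)
approx = dV_dt_numeric(t_star, k, r0)
print(exact, approx)
```

Note that the radius value $r = 1$ is inserted only after the derivative has been formed, mirroring the remark above.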

Figure 9.2. Convergent extension of tissue in embryonic development (panels: original tissue and extended tissue, with width $w$ and length $L$). Cells elongate along one axis (which increases $L$) while contracting along the other axis (decreasing
$w$). Since the volume and thickness remain fixed, the changes in $L$ can be related to changes
in $w$.
Example 9.2 (Convergent extension) Most animals are longer head to tail than side to
side. To obtain relative elongation along one axis, the embryo undergoes a process called
convergent extension whereby a block of tissue elongates (extends) along one axis and
narrows (converges) along the other axis as shown in Fig. 9.2. Here we consider this
process. Suppose a block of tissue originally having dimensions $L = w = 10$ mm and
thickness $\tau = 1$ mm extends at the rate of 1 mm per day, while the volume $V$ and thickness $\tau$
remain fixed. At what rate is the width $w$ of the tissue block changing when the length is
$L = 20$ mm?
Solution: Assume a rectangular block of tissue of length $L(t)$ and width $w(t)$. We are told
that the volume $V$ and the thickness $\tau$ remain constant. We easily find, using the initial
length, width and thickness, that the volume is $V = 10 \cdot 10 \cdot 1 = 100$ mm$^3$. Further, at any given
time $t$, the volume of the rectangular block is
\[ V = L(t)\, w(t)\, \tau. \]
While we have not explicitly indicated this, let us remember that $V$ depends on $L$ and $w$,
both of which depend on time. Hence, there is a chain of dependencies $t \to L, w \to V$,
motivating the chain rule. Differentiating both sides with respect to $t$ leads to
\[ \frac{d}{dt} V = \frac{d}{dt}\left( L(t)\, w(t)\, \tau \right) \quad\Rightarrow\quad 0 = \left( L'(t) w(t) + L(t) w'(t) \right) \tau. \]

(Here we have used the product rule to differentiate $L(t)w(t)$ with respect to $t$. We also
used the fact that $V$ is constant so its derivative is zero, and $\tau$ is constant, so it
multiplies the derivative of $L(t)w(t)$ as would any multiplicative constant.) Consequently,
canceling the constant factor and solving for $w'(t)$ results in
\[ L'(t) w(t) + L(t) w'(t) = 0 \quad\Rightarrow\quad w'(t) = -\frac{L'(t) w(t)}{L(t)}. \]
At the instant that $L(t) = 20$, $w(t) = V/(L(t)\tau) = 100/20 = 5$. Hence we find that
\[ w'(t) = -\frac{L'(t) w(t)}{L(t)} = -\frac{1\ \text{mm/day} \cdot 5\ \text{mm}}{20\ \text{mm}} = -0.25\ \text{mm/day}. \]
The negative sign indicates that $w$ is decreasing while $L$ is increasing.
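The arithmetic in this example can be checked with a short sketch. It uses the numbers given in the example (volume 100 mm$^3$, thickness 1 mm, extension rate 1 mm/day); the function names are hypothetical, and the finite-difference comparison is an added check rather than part of the original solution.

```python
V = 100.0     # mm^3, volume of the initial 10 x 10 x 1 block
THICK = 1.0   # mm, thickness (held fixed)
DL_DT = 1.0   # mm/day, rate at which the length extends

def width(L):
    """Width implied by constant volume and thickness: w = V / (L * thickness)."""
    return V / (L * THICK)

def dw_dt(L):
    """Related-rates result: w'(t) = -L'(t) * w(t) / L(t)."""
    return -DL_DT * width(L) / L

def dw_dt_numeric(L, dt=1e-6):
    """Central difference of w along the motion L -> L + L'(t) * dt."""
    return (width(L + DL_DT * dt) - width(L - DL_DT * dt)) / (2.0 * dt)

print(width(20.0), dw_dt(20.0), dw_dt_numeric(20.0))
```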


Example 9.3 (A spider's thread:) A spider moves horizontally across the ground at a
constant rate, $k$, pulling a thin silk thread with it. One end of the thread is tethered to a
vertical wall at height $h$ above ground and does not move. The other end moves with the
spider. Determine the rate of elongation of the thread.

Figure 9.3. The length of a spider's thread (labels: thread length $\ell$, tether height $h$, spider position $x$).
Solution: We use the Pythagorean Theorem to relate the height of the tether point $h$, the
position of the spider $x$, and the length of the thread $\ell$:
\[ \ell^2 = h^2 + x^2. \]
We note that $h$ is constant, and that $x$, $\ell$ are changing so that
\[ [\ell(t)]^2 = h^2 + [x(t)]^2. \]
Differentiating with respect to $t$ leads to
\[ \frac{d}{dt}[\ell(t)]^2 = \frac{d}{dt}\left( h^2 + [x(t)]^2 \right) \]
\[ 2\ell\,\frac{d\ell}{dt} = 0 + 2x\,\frac{dx}{dt} \quad\Rightarrow\quad \frac{d\ell}{dt} = \frac{2x}{2\ell}\frac{dx}{dt}. \]
Simplifying and using the fact that
\[ \frac{dx}{dt} = k \]
leads to
\[ \frac{d\ell}{dt} = \frac{x}{\ell}\,k = k\,\frac{x}{\sqrt{h^2 + x^2}}. \]
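A small sketch can confirm the elongation rate numerically. The values h = 3, x = 4, and k = 2 are hypothetical (chosen so that the thread length is exactly 5), and the function names are illustrative only.

```python
import math

def thread_length(x, h):
    """Thread length from the Pythagorean theorem: sqrt(h^2 + x^2)."""
    return math.hypot(h, x)

def elongation_rate(x, h, k):
    """Related-rates result: k * x / sqrt(h^2 + x^2)."""
    return k * x / math.sqrt(h ** 2 + x ** 2)

def elongation_rate_numeric(x, h, k, dt=1e-6):
    """Central difference of the thread length as the spider moves at speed k."""
    return (thread_length(x + k * dt, h) - thread_length(x - k * dt, h)) / (2.0 * dt)

h, x, k = 3.0, 4.0, 2.0   # hypothetical values for illustration
print(elongation_rate(x, h, k), elongation_rate_numeric(x, h, k))
```

With these values the thread lengthens at $2 \cdot 4/5 = 1.6$ length units per unit time, less than the spider's speed, as the factor $x/\sqrt{h^2 + x^2} < 1$ suggests.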

Example 9.4 (A conical cup:) Water is leaking out of a conical cup of height H and radius
R. Find the rate of change of the height of water in the cup at the instant that the cup is
full, if the volume is decreasing at a constant rate, k.

Figure 9.4. The geometry of a conical cup (cup of height $H$ and radius $R$; water of height $h$ and surface radius $r$).


Solution: Let us define $h$ and $r$ as the height and radius of water inside the cone. Then we
know that the volume of this (conically shaped) water in the cone is
\[ V = \frac{1}{3}\pi r^2 h, \]
or, in terms of functions of time,
\[ V(t) = \frac{1}{3}\pi [r(t)]^2 h(t). \]
We are told that
\[ \frac{dV}{dt} = -k, \]
where the negative sign indicates that volume is decreasing.
By similar triangles, we note that
\[ \frac{r}{h} = \frac{R}{H} \]
so that we can substitute
\[ r = \frac{R}{H}\,h \]
and get the volume in terms of the height alone:
\[ V(t) = \frac{1}{3}\pi \left(\frac{R}{H}\right)^2 [h(t)]^3. \]
We can now use the chain rule to conclude that
\[ \frac{dV}{dt} = \frac{1}{3}\pi \left(\frac{R}{H}\right)^2 3[h(t)]^2\,\frac{dh}{dt}. \]
Now using the fact that volume decreases at a constant rate, we get
\[ -k = \pi \left(\frac{R}{H}\right)^2 [h(t)]^2\,\frac{dh}{dt} \quad\Rightarrow\quad \frac{dh}{dt} = -\frac{kH^2}{\pi R^2 h^2}. \]
The rate computed above holds at any time as the water leaks out of the container. At
the instant that the cup is full, we have $h(t) = H$ and $r(t) = R$, and then
\[ \frac{dh}{dt} = -\frac{kH^2}{\pi R^2 H^2} = -\frac{k}{\pi R^2}. \]
For example, for a cone of height $H = 4$ and radius $R = 3$,
\[ \frac{dh}{dt} = -\frac{k}{9\pi}. \]
It is important to remember to plug in the information about the specific instant at the very
end of the calculation, after the derivatives are computed.
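The following sketch checks the conical-cup result for the sample cone with $H = 4$ and $R = 3$. The leak rate k = 1 and the function names are hypothetical; the check multiplies a finite-difference estimate of $dV/dh$ by the formula for $dh/dt$ and confirms that the product recovers $dV/dt = -k$.

```python
import math

def water_volume(h, H, R):
    """Volume of water of height h in a cone of height H and radius R."""
    return math.pi / 3.0 * (R / H) ** 2 * h ** 3

def dh_dt(h, H, R, k):
    """Related-rates result: dh/dt = -k * H^2 / (pi * R^2 * h^2)."""
    return -k * H ** 2 / (math.pi * R ** 2 * h ** 2)

H, R, k = 4.0, 3.0, 1.0   # k = 1 is a hypothetical leak rate
eps = 1e-6

# Evaluate at the instant the cup is full (h = H), only after differentiating.
rate_full = dh_dt(H, H, R, k)
dV_dh = (water_volume(H + eps, H, R) - water_volume(H - eps, H, R)) / (2.0 * eps)
print(rate_full, dV_dh * rate_full)
```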

9.2 Implicit differentiation


Section 9.2 Learning goals
1. Understand the distinction between a function that is defined explicitly and one that
is defined implicitly.
2. Understand the idea of implicit differentiation geometrically.
3. Be able to compute the slope of a curve at a given point using implicit differentiation,
find tangent line equations, and solve problems based on such ideas.

9.2.1 Implicit and explicit definition of a function


A review of the definition of a function (e.g. Appendix C) reminds us that for a given x
value, only one y value is permitted. For example, for
\[ y = x^2 \]
any value of $x$ leads to a single $y$ value (Fig. 9.5a). Geometrically, this means that the graph
of this function satisfies the vertical line property. Not all curves satisfy this property. The
elliptical curve in Fig. 9.5b clearly fails this test (intersecting some vertical lines twice).
This simply means that, while we can write down an equation for such a curve, e.g.
\[ \frac{(x - 1)^2}{4} + (y - 1)^2 = 1, \]
we cannot solve for a simple function that describes the entire curve. Nevertheless, the
idea of a tangent line to such a curve, and consequently the slope of such a tangent line is
a perfectly reasonable notion.
In order to make sense of this idea, we will restrict attention to a local part of the
curve, close to some point of interest (Fig. 9.5c). Then near this point, the equation of the
curve defines an implicit function, that is, close enough to the point of interest, a value
of x leads to a unique value of y. We will refer to this value as y(x) to remind us of the
relationship between the two variables.
How can we generalize the notion of a derivative to this case of implicit functions?
We observe from Fig. 9.5d that a small change in $x$ leads to a small change in $y$. Even if
we are not able to write down an explicit expression for $y$ versus $x$, we can still in principle
determine these small changes, and form the ratio $\Delta y/\Delta x$. Then, in the limit $\Delta x \to 0$, as the
change in $x$ becomes infinitesimally small, we arrive at the slope of a tangent line as before, $dy/dx$. In the
next section we show how to do this using an application of the chain rule called implicit
differentiation.

Figure 9.5. (a) A function has to satisfy the vertical line property: a given $x$
value can have at most one corresponding $y$ value. Hence the curve shown in (b) cannot
be a function. We can write down an equation for the curve, but we cannot solve for $y$
explicitly. (c) However, close to a given point on the curve (dark point), we can think of
how changing the $x$ coordinate of the point (shaded interval on $x$ axis) leads to a change
in the corresponding $y$ coordinate on the same curve (shaded interval on $y$ axis). (d)
We can also ask what is the slope of the curve at the given point. This corresponds to
$\lim_{\Delta x \to 0} \Delta y/\Delta x$. Implicit differentiation can be used to compute that derivative.

9.2.2 Slope of a tangent line at the point on a curve


We now show how to compute this slope in several examples where it is inconvenient, or
impossible to isolate y as a function of x.
Example 9.5 (Tangent to a circle:) In the first example, we find the slope of the tangent
line to a circle. This example can be solved in different ways, but here we focus on the
method of implicit differentiation.
(a) Find the slope of the tangent line to the point x = 1/2 in the first quadrant on a circle
of radius 1 and center at the origin.

(b) Find the second derivative $d^2y/dx^2$ at the above point.

Figure 9.6. The curve in (a) is not a function and hence it can only be described
implicitly. However, if we zoom in to a point in (b), we can define the derivative as the slope
of the tangent line to the curve at the point of interest.

Figure 9.7. Tangent line to a circle by implicit differentiation


Solution:
(a) The equation of a circle with radius 1 and center at the origin is
\[ x^2 + y^2 = 1. \]
When $x = 1/2$ we have $y = \pm\sqrt{1 - (1/2)^2} = \pm\sqrt{1 - (1/4)} = \pm\sqrt{3}/2$. However
only one of these two values is in the first quadrant, i.e. $y = +\sqrt{3}/2$, so we are
concerned with the behaviour close to this point.
In the original equation of the circle, we see that the two variables are linked in a
symmetric relationship: although we could solve for $y$, we would not be able to
express the relationship as a single function. Indeed, the top of the circle can be
expressed as
\[ y = f_1(x) = \sqrt{1 - x^2} \]
and the bottom as
\[ y = f_2(x) = -\sqrt{1 - x^2}. \]

However, this makes the work of differentiation more complicated than it needs be.
Here is how we can handle the issue conveniently: We will think of $x$ as the independent variable and $y$ as the dependent variable. That is, we will think of the behaviour
close to the point of interest as a small portion of the upper part of the circle, in which
$y$ varies locally as $x$ varies. Then the equation of the circle would look like this:
\[ x^2 + [y(x)]^2 = 1. \]
Now differentiate each side of the above with respect to $x$:
\[ \frac{d}{dx}\left( x^2 + [y(x)]^2 \right) = \frac{d1}{dx} = 0 \quad\Rightarrow\quad \frac{dx^2}{dx} + \frac{d}{dx}[y(x)]^2 = 0. \]
The second term has the following chain of dependencies: $x \to y \to y^2$. That is,
the value of $x$ determines $y$ which in turn determines $y^2$. To find $\frac{d}{dx}[y(x)]^2$ we must
hence apply the Chain Rule. We obtain
\[ \frac{dx^2}{dx} + \frac{dy^2}{dy}\frac{dy}{dx} = 0 \quad\Rightarrow\quad 2x + 2[y(x)]\frac{dy}{dx} = 0. \]
Thus
\[ 2y\,\frac{dy}{dx} = -2x \quad\Rightarrow\quad \frac{dy}{dx} = -\frac{2x}{2y} = -\frac{x}{y}. \]
Here the slope of the tangent line to the circle is expressed as a ratio of the coordinates
of the point on the circle. We could, in this case, simplify to
\[ y'(x) = \frac{dy}{dx} = -\frac{x}{\sqrt{1 - x^2}}. \]
(This will not always be possible. In many cases we will not have an easy way to
express $y$ as a function of $x$ in the final equation.)
The point of interest is $x = 1/2$, and the corresponding value of $y$ is $y = \sqrt{3}/2$.
Thus
\[ y' = \frac{dy}{dx} = -\frac{1/2}{\sqrt{3}/2} = -\frac{1}{\sqrt{3}} = -\frac{\sqrt{3}}{3}. \]
(b) The second derivative can be computed by differentiating
\[ y' = \frac{dy}{dx} = -\frac{x}{y}. \]
We use the quotient rule:
\[ \frac{d^2y}{dx^2} = \frac{d}{dx}\left( -\frac{x}{y} \right) = -\frac{1 \cdot y - x y'}{y^2} = -\frac{y - x\left(-\frac{x}{y}\right)}{y^2} = -\frac{y^2 + x^2}{y^3} = -\frac{1}{y^3}. \]
Substituting $y = \sqrt{3}/2$ from part (a) yields
\[ \frac{d^2y}{dx^2} = -\frac{1}{(\sqrt{3}/2)^3} = -\frac{8}{3^{3/2}}. \]
We have used the equation of the circle, and our previous result for the first derivative in simplifying the above. We can see from this last expression that the second
derivative is negative for y > 0, i.e. for the top semi-circle, indicating that this part of
the curve is concave down (as expected). Similarly, for y < 0, the second derivative
is positive, and this agrees with the concave up property of that portion of the circle.
As in the case of simple functions, the second derivative can thus help identify concavity of curves.
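Because the circle happens to have an explicit upper branch, both implicit results can be checked against finite differences of $y = \sqrt{1 - x^2}$. The sketch below is only such a check; the function names and step size are hypothetical choices.

```python
import math

def upper_semicircle(x):
    """Explicit upper branch of x^2 + y^2 = 1."""
    return math.sqrt(1.0 - x * x)

def implicit_slope(x, y):
    """dy/dx = -x/y from implicit differentiation."""
    return -x / y

def implicit_second_derivative(y):
    """d^2y/dx^2 = -1/y^3 from the quotient-rule calculation."""
    return -1.0 / y ** 3

x0 = 0.5
y0 = upper_semicircle(x0)   # equals sqrt(3)/2
step = 1e-4

numeric_slope = (upper_semicircle(x0 + step) - upper_semicircle(x0 - step)) / (2.0 * step)
numeric_second = (upper_semicircle(x0 + step) - 2.0 * y0 + upper_semicircle(x0 - step)) / step ** 2

print(implicit_slope(x0, y0), numeric_slope)
print(implicit_second_derivative(y0), numeric_second)
```

The negative second derivative on the upper branch matches the concave-down shape noted above.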
Example 9.6 (Energy loss and Earth's temperature; implicit differentiation) Redo Example 4.9 using implicit differentiation; that is, find the rate of change of Earth's temperature per unit energy loss based on Eqn. (1.4), but without solving for $T$ as a function of $E_{out}$.

Solution: We write the equation in the form
\[ E_{out}(T) = (4\pi r^2 \epsilon\sigma)\,T^4 \]
and observe that the term in braces is constant. Differentiate both sides with respect to
$E_{out}$. Then we find
\[ \frac{dE_{out}}{dE_{out}} = (4\pi r^2 \epsilon\sigma)\,\frac{dT^4}{dT}\frac{dT}{dE_{out}} \]
\[ 1 = (4\pi r^2 \epsilon\sigma) \cdot 4T^3\,\frac{dT}{dE_{out}}. \]
The calculation is completed by rearranging this result,
\[ \frac{dT}{dE_{out}} = \frac{1}{16\pi\epsilon\sigma r^2}\,\frac{1}{T^3}. \]

9.3 The power rule for fractional powers


Implicit differentiation can help in determining the derivatives of a number of new functions. In this case, we use what we know about the integer powers to determine the derivative for a fractional power such as 1/2. A similar idea will recur several times later on in
this course, when we encounter a new type of function and its inverse function.
Example 9.7 (Derivative of $\sqrt{x}$:) Consider the function
\[ y = \sqrt{x}. \]
Use implicit differentiation to compute the derivative of this function.

Solution: We can re-express this function in the form
\[ y = x^{1/2}. \]
In this example, we will show that the power rule applies in the same way to fractional
powers: That is, we show that
\[ y'(x) = \frac{1}{2}x^{-1/2}. \]
We rewrite the function $y = \sqrt{x}$ in the form
\[ y^2 = x \]
but we will continue to think of $y$ as the dependent variable, i.e. when we differentiate, we
will remember that
\[ [y(x)]^2 = x. \]
Taking derivatives of both sides leads to
\[ \frac{d}{dx}[y(x)]^2 = \frac{d}{dx}(x) \]
\[ 2[y(x)]\frac{dy}{dx} = 1 \]
\[ \frac{dy}{dx} = \frac{1}{2y}. \]
We now use the original relationship to eliminate $y$, i.e. we substitute $y = \sqrt{x}$. We find
that
\[ \frac{dy}{dx} = \frac{1}{2\sqrt{x}} = \frac{1}{2}x^{-1/2}. \]
This verifies the power rule for the above example.


A similar procedure can be applied to a power function with fractional power. When
we apply similar steps, we find that
Derivative of fractional-power function: The derivative of
\[ y = f(x) = x^{m/n} \]
is
\[ \frac{dy}{dx} = \frac{m}{n}\,x^{\left(\frac{m}{n} - 1\right)}. \]
This is left as an exercise for the reader.
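Without giving away the derivation, the formula can at least be checked numerically. The sketch below compares $(m/n)\,x^{m/n - 1}$ with a central-difference estimate for a few hypothetical choices of m, n, and x; the function names are illustrative only.

```python
def power_rule(x, m, n):
    """Claimed derivative of x^(m/n): (m/n) * x^(m/n - 1)."""
    p = m / n
    return p * x ** (p - 1.0)

def central_difference(x, m, n, step=1e-6):
    """Finite-difference estimate of the derivative of x^(m/n) for x > 0."""
    p = m / n
    return ((x + step) ** p - (x - step) ** p) / (2.0 * step)

# Hypothetical test cases: (m, n) pairs evaluated at x = 2.
cases = [(1, 2), (2, 3), (5, 3)]
for m, n in cases:
    print(m, n, power_rule(2.0, m, n), central_difference(2.0, m, n))
```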

Example 9.8 (The astroid:) The curve
\[ x^{2/3} + y^{2/3} = 2^{2/3} \]
has the shape of an astroid. It describes the shape generated by a ball of radius $\frac{1}{2}$ rolling
inside a ball of radius 2. Find the slope of the tangent line to a point on the astroid.

Solution: We use implicit differentiation as follows:
\[ \frac{d}{dx}\left( x^{2/3} + y^{2/3} \right) = \frac{d}{dx}\left( 2^{2/3} \right) \]
\[ \frac{2}{3}x^{-1/3} + \frac{d}{dy}\left( y^{2/3} \right)\frac{dy}{dx} = 0 \]
\[ \frac{2}{3}x^{-1/3} + \frac{2}{3}y^{-1/3}\frac{dy}{dx} = 0 \]
\[ x^{-1/3} + y^{-1/3}\frac{dy}{dx} = 0. \]
We can rearrange into the form:
\[ \frac{dy}{dx} = -\frac{x^{-1/3}}{y^{-1/3}}. \]
We can see from this form that the derivative fails to exist at both $x = 0$ (where $x^{-1/3}$
would be undefined) and at $y = 0$ (where $y^{-1/3}$ would be undefined). This stems from the
sharp points that the curve has at these places.
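Away from the sharp points, the slope formula can be compared with a finite difference along the branch of the astroid in the first quadrant, where the curve can be solved explicitly as $y = (2^{2/3} - x^{2/3})^{3/2}$. The sketch below does this at the hypothetical sample point $x = 1$; the function names and step size are illustrative.

```python
def astroid_first_quadrant(x):
    """Branch of x^(2/3) + y^(2/3) = 2^(2/3) with x, y > 0 (valid for 0 < x < 2)."""
    return (2.0 ** (2.0 / 3.0) - x ** (2.0 / 3.0)) ** 1.5

def implicit_slope(x, y):
    """dy/dx = -x^(-1/3) / y^(-1/3) from implicit differentiation."""
    return -(x ** (-1.0 / 3.0)) / (y ** (-1.0 / 3.0))

x0 = 1.0                      # hypothetical sample point away from the cusps
y0 = astroid_first_quadrant(x0)
step = 1e-6
numeric_slope = (astroid_first_quadrant(x0 + step) - astroid_first_quadrant(x0 - step)) / (2.0 * step)

print(implicit_slope(x0, y0), numeric_slope)
```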
In the next example we put the second derivative to work in an implicit differentiation
problem. The goal is as follows:
Example 9.9 (Horizontal tangent and concavity on a rotated ellipse:) Find the highest
point on the (rotated) ellipse
\[ x^2 + 3y^2 - xy = 1. \]

Solution: The highest point on the ellipse will have a horizontal tangent line, so we should
look for the point on this curve at which $dy/dx = 0$. We proceed as follows:
1. Finding the slope of the tangent line: By implicit differentiation,
\[ \frac{d}{dx}\left[ x^2 + 3y^2 - xy \right] = \frac{d}{dx} 1 \]
\[ \frac{d(x^2)}{dx} + \frac{d(3y^2)}{dx} - \frac{d(xy)}{dx} = 0. \]
We must use the product rule to compute the derivative of the last term on the LHS:
\[ 2x + 6y\frac{dy}{dx} - x\frac{dy}{dx} - \frac{dx}{dx}\,y = 0 \]
\[ 2x + 6y\frac{dy}{dx} - x\frac{dy}{dx} - 1 \cdot y = 0. \]
Grouping terms, we have
\[ (6y - x)\frac{dy}{dx} + (2x - y) = 0. \]
Thus
\[ \frac{dy}{dx} = \frac{(y - 2x)}{(6y - x)}. \]
We can also use the notation
\[ y'(x) = \frac{(y - 2x)}{(6y - x)} \]
to denote the derivative. Setting $dy/dx = 0$, we obtain $y - 2x = 0$ so that $y = 2x$
at the point of interest. However, we still need to find the coordinates of the point
satisfying this condition.
2. Determining the coordinates of the point we want: To do so, we look for a point
that satisfies the equation of the curve as well as the condition $y = 2x$. Plugging into
the original equation of the ellipse, we get:
\[ x^2 + 3y^2 - xy = 1 \]
\[ x^2 + 3(2x)^2 - x(2x) = 1. \]
After simplifying, this equation becomes $11x^2 = 1$, leading to the two possibilities
\[ x = \pm\frac{1}{\sqrt{11}}, \qquad y = \pm\frac{2}{\sqrt{11}}. \]
We need to figure out which one of these two points is the top. (Evidently, the other
point would also have a horizontal tangent, but would be at the bottom of the
ellipse.)

Figure 9.8. A rotated ellipse


3. Finding which point is the one at the top: The top point on the ellipse will be located at a portion of the curve that is concave down. We can determine the concavity
close to the point of interest by using the second derivative, which we will compute
(from the first derivative) using the quotient rule:
\[ y''(x) = \frac{d^2y}{dx^2} = \frac{[y - 2x]'(6y - x) - [6y - x]'(y - 2x)}{(6y - x)^2} \]
\[ y''(x) = \frac{[y' - 2](6y - x) - [6y' - 1](y - 2x)}{(6y - x)^2}. \]
In the above, we have used the prime notation ($'$) to denote a derivative.
4. Plugging in information about the point: Now that we have set down the form
of this derivative, we make some important observations about the specific point of
interest: (Note that this is done as a final step, only after all derivatives have been
calculated!)
We are only concerned with the sign of this derivative. The denominator is always
positive (since it is squared) and so will not affect the sign. (It is possible to work
with the sign of the numerator alone, though, in the interest of providing detailed
steps, we go through the entire calculation below.)
At the point of interest (top of ellipse) $y' = 0$, simplifying some of the terms above.
At the point in question, $y = 2x$ so the term $(y - 2x) = 0$.
We can thus simplify the above to obtain
\[ y''(x) = \frac{[-2](6y - x) - [-1](0)}{(6y - x)^2} = -\frac{[2](6y - x)}{(6y - x)^2} = -\frac{2}{(6y - x)}. \]
Using again the fact that $y = 2x$, we get the final form
\[ y''(x) = -\frac{2}{(6(2x) - x)} = -\frac{2}{11x}. \]
We see directly from this result that the second derivative is negative (implying concave
down curve) whenever $x$ is positive. This tells us that at the point with positive $x$ value,
$x = 1/\sqrt{11}$, we are at the top of the ellipse. A graph of this curve is shown in Figure 9.8.
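The conclusion can be cross-checked by brute force. In the sketch below, the upper half of the ellipse is obtained by solving $3y^2 - xy + (x^2 - 1) = 0$ for $y$ with the quadratic formula, which gives $y = (x + \sqrt{12 - 11x^2})/6$; this explicit branch, the grid resolution, and the function names are additions for illustration and are not part of the original solution.

```python
import math

def upper_branch(x):
    """Upper half of x^2 + 3y^2 - xy = 1, from the quadratic formula in y."""
    return (x + math.sqrt(12.0 - 11.0 * x * x)) / 6.0

def slope(x, y):
    """dy/dx = (y - 2x) / (6y - x) from implicit differentiation."""
    return (y - 2.0 * x) / (6.0 * y - x)

x_top = 1.0 / math.sqrt(11.0)
y_top = 2.0 / math.sqrt(11.0)

on_curve = x_top ** 2 + 3.0 * y_top ** 2 - x_top * y_top   # should equal 1
second_derivative = -2.0 / (11.0 * x_top)                   # should be negative

# Brute-force search for the largest y on a fine grid of x values in [-1, 1].
grid = [i / 10000.0 for i in range(-10000, 10001)]
y_max = max(upper_branch(x) for x in grid)

print(on_curve, slope(x_top, y_top), second_derivative)
print(y_max, y_top)
```

The grid maximum should agree closely with $y = 2/\sqrt{11}$, the height of the point identified by the calculus argument.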

Exercises
9.1. Consider the growth of a cell, assumed spherical in shape. Suppose that the radius
of the cell increases at a constant rate per unit time. (Call the constant k, and assume
that k > 0.)
(a) At what rate would the volume, $V$, increase?
(b) At what rate would the surface area, $S$, increase?
(c) At what rate would the ratio of surface area to volume S/V change? Would
this ratio increase or decrease as the cell grows? [Remark: note that the answers you give will be expressed in terms of the radius of the cell.]
9.2. Growth of a circular fungal colony: A fungal colony grows on a flat surface starting with a single spore. The shape of the colony edge is circular (with the initial site
of the spore at the center of the circle.) Suppose the radius of the colony increases
at a constant rate per unit time. (Call this constant C.)
(a) At what rate does the area covered by the colony change?
(b) The biomass of the colony is proportional to the area it occupies (factor of
proportionality $\alpha$). At what rate does the biomass increase?
9.3. Limb development: During early development, the limb of a fetus increases in size,
but has a constant proportion. Suppose that the limb is roughly a circular cylinder
with radius r and length l in proportion
l/r = C
where C is a positive constant. It is noted that during the initial phase of growth, the
radius increases at an approximately constant rate, i.e. that
dr/dt = a.
At what rate does the mass of the limb change during this time? [Note: assume that
the density of the limb is 1 gm/cm$^3$ and recall that the volume of a cylinder is
V = Al
where A is the base area (in this case of a circle) and l is length.]
9.4. A rectangular trough is 2 meters long, 0.5 meters across the top and 1 meter deep. At
what rate must water be poured into the trough such that the depth of the water is
increasing at 1 m/min when the depth of the water is 0.7 m?
9.5. Gas is being pumped into a spherical balloon at the rate of 3 cm$^3$/s.
(a) How fast is the radius increasing when the radius is 15 cm?
(b) Without using the result from (a), find the rate at which the surface area of the
balloon is increasing when the radius is 15 cm.
9.6. A point moves along the parabola $y = \frac{1}{4}x^2$ in such a way that at $x = 2$ the $x$-coordinate
is increasing at the rate of 5 cm/s. Find the rate of change of $y$ at this
instant.

9.7. Boyle's Law: In chemistry, Boyle's law describes the behaviour of an ideal gas:
This law relates the volume occupied by the gas to the temperature and the pressure
as follows:
\[ PV = nRT \]
where n, R are positive constants.
(a) Suppose that the pressure is kept fixed, by allowing the gas to expand as the
temperature is increased. Relate the rate of change of volume to the rate of
change of temperature.
(b) Suppose that the temperature is held fixed and the pressure is decreased gradually. Relate the rate of change of the volume to the rate of change of pressure.
9.8. Spread of a population: In 1905 a Bohemian farmer accidentally allowed several
muskrats to escape an enclosure. Their population grew and spread, occupying
increasingly larger areas throughout Europe. In a classical paper in ecology, it was
shown by the scientist Skellam (1951) that the square root of the occupied area
increased at a constant rate, k. Determine the rate of change of the distance (from
the site of release) that the muskrats had spread. For simplicity, you may assume
that the expanding area of occupation is circular.
9.9. A spherical piece of ice melts so that its surface area decreases at a rate of 1 cm$^2$/min.
Find the rate at which the diameter decreases when the diameter is 5 cm.
9.10. A Convex lens: A particular convex lens has a focal length of $f = 10$ cm. The
distance $p$ between an object and the lens, the distance $q$ between its image and the
lens and the focal length $f$ are related by the equation:
\[ \frac{1}{f} = \frac{1}{p} + \frac{1}{q}. \]
If an object is 30 cm away from the lens and moving away at 4 cm/sec, how fast is
its image moving and in which direction?
9.11. A conical cup: Water is leaking out of a small hole at the tip of a conical paper cup
at the rate of 1 cm$^3$/min. The cup has height 8 cm and radius 6 cm, and is initially
full up to the top. Find the rate of change of the height of water in the cup when the
cup just begins to leak. [Remark: the volume of a cone is $V = (\pi/3)r^2 h$.]
9.12. Conical tank: Water is leaking out of the bottom of an inverted conical tank at the
rate of $\frac{1}{10}$ m$^3$/min, and at the same time is being pumped in the top at a constant
rate of $k$ m$^3$/min. The tank has height 6 m and the radius at the top is 2 m. Determine
the constant $k$ if the water level is rising at the rate of $\frac{1}{5}$ m/min when the
height of the water is 2 m. Recall that the volume of a cone of radius $r$ and height $h$
is
\[ V = \frac{1}{3}\pi r^2 h. \]
9.13. The gravel pile: Gravel is being dumped from a conveyor belt at the rate of 30 ft$^3$/min
in such a way that the gravel forms a conical pile whose base diameter and height
are always equal. How fast is the height of the pile increasing when the height is
10 ft? (Hint: the volume of a cone of radius $r$ and height $h$ is $V = \frac{1}{3}\pi r^2 h$.)
9.14. The sand pile: Sand is piled onto a conical pile at the rate of 10 m$^3$/min. The
sand keeps spilling to the base of the cone so that the shape always has the same
proportions: that is, the height of the cone is equal to the radius of the base. Find
the rate at which the height of the sandpile increases when the height is 5 m. Note:
The volume of a cone with height $h$ and radius $r$ is
\[ V = \frac{\pi}{3} r^2 h. \]
9.15. Water is flowing into a conical reservoir at a rate of 4 m$^3$/min. The reservoir is 3 m
in radius and 12 m deep.
(a) How fast is the radius of the water surface increasing when the depth of the
water is 8 m?
(b) In (a), how fast is the surface rising?
9.16. A ladder 10 meters long leans against a vertical wall. The foot of the ladder starts to
slide away from the wall at a rate of 3 m/s.
(a) Find the rate at which the top of the ladder is moving downward when its foot
is 8 meters away from the wall.
(b) In (a), find the rate of change of the slope of the ladder.
9.17. Sliding ladder: A ladder 5 m long rests against a vertical wall. If the bottom of the
ladder slides away from the wall at the rate of 0.5 meter/min how fast is the top of
the ladder sliding down the wall when the base of the ladder is 1 m away from the
wall ?
9.18. Ecologists are often interested in the relationship between the area of a region (A)
and the number of different species S that can inhabit that region. Hopkins (1955)
suggested a relationship of the form
S = a ln(1 + bA)
where a and b are positive constants. Find the rate of change of the number of
species with respect to the area. Does this function have a maximum?
9.19. The burning candle: A candle is placed a distance l1 from a thin block of wood of
height H. The block is a distance l2 from a wall as shown in Figure 9.9. The candle
burns down so that the height of the flame, h1 decreases at the rate of 3 cm/hr. Find
the rate at which the length of the shadow y cast by the block on the wall increases.
(Note: your answer will be in terms of the constants l1 and l2 . Remark: This is a
challenging problem.)
9.20. Use implicit differentiation to show that the derivative of the function
$y = x^{1/3}$
is
$y' = \frac{1}{3}x^{-2/3}$.
First write the relationship in the form $y^3 = x$, and then find $dy/dx$.

Figure 9.9. Figure for Problem 19 (labels in the figure: $y$, $H$, $h_1$, $l_1$, $l_2$).


9.21. Generalizing the Power Law:
(a) Use implicit differentiation to calculate the derivative of the function
$y = f(x) = x^{n/m}$
where m and n are integers. (Hint: rewrite the equation in the form $y^m = x^n$
first.)
(b) Use your result to derive the formulas for the derivatives of the functions
$y = \sqrt{x}$ and $y = x^{-1/3}$.
9.22. The equation of a circle with radius r and center at the origin is
$x^2 + y^2 = r^2$
(a) Use implicit differentiation to find the slope of a tangent line to the circle at
some point (x, y).
(b) Use this result to find the equations of the tangent lines of the circle at the
points whose x coordinate is $x = r/\sqrt{3}$.
(c) Use the same result to show that the tangent line at any point on the circle is
perpendicular to the radial line drawn from that point to the center of the circle.
Note: Two lines are perpendicular if their slopes are negative reciprocals.
9.23. For each of the following, find the derivative of y with respect to x.
(a) $y^6 + 3y - 2x - 7x^3 = 0$
(b) $e^y + 2xy = 3$
9.24. The equation of a circle with radius 5 and center at (1, 1) is
$(x - 1)^2 + (y - 1)^2 = 25$
(a) Find the slope of the tangent line to this curve at the point (4, 5).


(b) Find the equation of the tangent line.


9.25. Tangent to a hyperbola: The curve
$x^2 - y^2 = 1$
is a hyperbola. Use implicit differentiation to show that for large x and y values, the
slope dy/dx of the curve is approximately 1.
9.26. An ellipse: Use implicit differentiation to find the points on the ellipse
$$\frac{x^2}{4} + \frac{y^2}{9} = 1$$
at which the slope is -1/2.
9.27. Motion of a cell: In the study of cell motility, biologists often investigate a type of
cell called a keratocyte, an epidermal cell that is found in the scales of fish. This
flat, elliptical cell crawls on a flat surface, and is known to be important in healing
wounds. The 2D outline of the cell can be approximated by the ellipse
$x^2/100 + y^2/25 = 1$
where x and y are distances in $\mu$m (Note: 1 $\mu$m, often called 1 micron, is $10^{-6}$
meters). When the motion of the cell is filmed, it is seen that points on the leading
edge (top arc of the ellipse) move in a direction perpendicular to the edge. Determine the direction of motion of the point (xp , yp ) on the leading edge.
9.28. The Folium of Descartes: A famous curve (see Figure 9.10) that was studied historically by many mathematicians (including Descartes) is
$x^3 + y^3 = 3axy$

Figure 9.10. The Folium of Descartes in Problem 28 (labels on the graph: the point $(1.5a, 1.5a)$ and the value $-a$).

You may assume that a is a positive constant.
(a) Explain why this curve cannot be described by a function such as y = f(x)
over the domain $-\infty < x < \infty$.

(b) Use implicit differentiation to find the slope of this curve at a point (x, y).



(c) Determine whether the curve has a horizontal tangent line anywhere, and if so,
find the x coordinate of the points at which this occurs.
(d) Does implicit differentiation allow you to find the slope of this curve at the
point (0,0) ?

9.29. Isotherms in the Van der Waals equation: In thermodynamics, the Van der
Waals equation relates the mean pressure, p, of a substance to its molar volume
v at some temperature T as follows:
$$\left(p + \frac{a}{v^2}\right)(v - b) = RT$$

where a, b, R are constants. Chemists are interested in the curves described by this
equation when the temperature is held fixed. (These curves are called isotherms).
(a) Find the slope, dp/dv, of the isotherms at a given point (v, p).
(b) Determine where points occur on the isotherms at which the slope is horizontal.
9.30. The circle and parabola: A circle of radius 1 is made to fit inside the parabola
y = x2 as shown in figure 9.11. Find the coordinates of the center of this circle,
i.e. find the value of the unknown constant c. [Hint: Set up conditions on the points
of intersection of the circle and the parabola which are labeled (a, b) in the figure.
What must be true about the tangent lines at these points?]
Figure 9.11. Figure for Problem 30 (labels in the figure: the center $(0, c)$ of the circle and the point of intersection $(a, b)$).


9.31. Consider the curve whose equation is
$x^3 + y^3 + 2xy = 4$, with $y = 1$ when $x = 1$.
(a) Find the equation of the tangent line to the curve when x = 1.
(b) Find $y''$ at x = 1.
(c) Is the graph of y = f(x) concave up or concave down near x = 1?
Hint: Differentiate the equation $x^3 + y^3 + 2xy = 4$ twice with respect to x.

Chapter 10

Exponential functions

10.1 Unlimited growth and doubling


In this chapter, we introduce an important new class of functions, the exponential functions.
We first describe the discrete process of population doubling and then show how this
simple idea leads to a continuous function that represents unlimited growth. An application
of this idea will be investigated in the context of a bacterial colony that grows without limit.
We first consider the discrete integer powers of 2, $2^n$, where n is an integer. We will
then generalize this to a continuous function $2^x$ where x is any real number. We investigate
how the base of the exponential function $a^x$, where a > 0 is arbitrary, affects the pattern.
Once a continuous function is defined, we can attach a meaning to the idea of its derivative.
Computing the derivative of an exponential function, we encounter a specially convenient
base denoted e, and then investigate the exponential function $e^x$.

Section 10.1 Learning goals


1. Follow and understand the example of doubling of a population and its link to integer
powers of base 2.
2. Given information about the doubling time of a population and its initial size, be able
to determine the size of that population at some later generation.
3. Appreciate the connection between $2^n$ for integer values of n and $2^x$ for a real number x.

10.1.1 The Andromeda Strain

The mathematics of uncontrolled growth are frightening. A single cell of the bacterium E.
coli would, under ideal circumstances, divide every twenty minutes. That is not particularly
disturbing until you think about it, but the fact is that bacteria multiply geometrically: one
becomes two, two become four, four become eight, and so on. In this way it can be shown

that in a single day, one cell of E. coli could produce a super-colony equal in size and
weight to the entire planet Earth.
Michael Crichton (1969) The Andromeda Strain, Dell, N.Y. p247
The Andromeda Strain scenario motivates our investigation of a new family of functions
that represent uncontrolled exponential growth. We start with values of the form $2^n$ where
n = 1, 2, ... is an integer. To get a sense of how $2^n$ grows with n, we list the first such
values (n = 0, ..., 10) in Table 10.1. It is clear that an initially gentle growth becomes
extremely steep in just a few steps.
n      2^n
0      1
1      2
2      4
3      8
4      16
5      32
6      64
7      128
8      256
9      512
10     1024

[Graph: $y = 2^n$, horizontal scale from -4 to 10, vertical scale from 0 to 1000.]

Table 10.1. Powers of 2 including both negative and positive integers: Here we
show $2^n$ for $-4 < n < 10$. Note that $2^{10} \approx 1000 = 10^3$. This is a useful approximation in
converting binary numbers (powers of 2) to decimal numbers (powers of 10).

Example 10.1 (Growth of E. coli:) Use the following facts to check the assertion made
by Michael Crichton in the quotation at the beginning of this chapter.
Mass of 1 E. coli cell: 1 nanogram = $10^{-9}$ gm = $10^{-12}$ kg.
Mass of Planet Earth: $6 \cdot 10^{24}$ kg.

Solution: Based on the above two facts, we surmise that the size of an E. coli colony
(number of cells) that together form a mass equal to Planet Earth would be
$$m = \frac{6 \cdot 10^{24}}{10^{-12}} = 6 \cdot 10^{36}.$$
Each hour corresponds to 3 twenty-minute generations. In a period of 24 hours, there are
$24 \cdot 3 = 72$ generations, with each one producing a doubling in colony size. After 1 day,
the number of cells is then $2^{72}$. It is easier to understand this number if we convert
it to an approximate decimal form. We use the fact that $2^{10} \approx 10^3$ as noted in Table 10.1.
We proceed as follows:
$$2^{72} = 2^2 \cdot 2^{70} = 4 \cdot (2^{10})^7 \approx 4 \cdot (10^3)^7 = 4 \cdot 10^{21}.$$
The actual value is found to be $4.7 \cdot 10^{21}$, so the approximation is relatively good.
Apparently, the estimate made by Crichton is not quite accurate: after one day the colony
is still far smaller than the $6 \cdot 10^{36}$ cells required. However, it can be
shown that it takes less than 2 days to produce a number far in excess of the desired size.
(The exact number of generations is left as an exercise for the reader, but we will return to
this in due time.)
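
The arithmetic in Example 10.1 can be checked with exact integer arithmetic. The following is a minimal Python sketch; the variable names are illustrative and not part of the notes, and the masses are the two values quoted in the example.

```python
# Check of the Andromeda Strain arithmetic using the facts in Example 10.1.
CELL_MASS_KG = 1e-12      # mass of one E. coli cell (from the example)
EARTH_MASS_KG = 6e24      # mass of Planet Earth (from the example)
DOUBLING_MIN = 20         # doubling time in minutes

cells_needed = EARTH_MASS_KG / CELL_MASS_KG       # about 6e36 cells
generations_per_day = 24 * 60 // DOUBLING_MIN     # number of doublings in one day
cells_after_one_day = 2 ** generations_per_day    # exact integer value of 2**72
estimate = 4 * (10 ** 3) ** 7                     # uses the rule of thumb 2**10 ~ 10**3

print("generations in one day:", generations_per_day)
print("cells after one day   :", cells_after_one_day)
print("rough estimate        :", estimate)
print("reaches Earth's mass? :", cells_after_one_day >= cells_needed)
```

The exact value of $2^{72}$ is printed as an integer, so it can be compared directly with the rough estimate $4 \cdot 10^{21}$.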

10.1.2 The function $2^x$ and its relatives

Figure 10.1. (a) Values of the function $2^x$ for x = 0, 0.5, 1, 1.5, etc. (b) The
function $2^x$ is shown extended to negative values of x and connected smoothly to form a
continuous curve. (Both panels: horizontal scale from -4 to 10, vertical scale from 0 to 1000.)
Properties of $2^n$ and related expressions are reviewed in Appendix B.A, where common manipulations are illustrated. Here we assume that the reader is familiar with this
elementary material. From previous familiarity with power functions such as $y = x^2$ (not
to be confused with $2^x$), we know the value of
$$2^{1/2} = \sqrt{2} \approx 1.41421\ldots$$
We can use this value to compute
$$2^{3/2} = (\sqrt{2})^3, \qquad 2^{5/2} = (\sqrt{2})^5,$$


and all other fractional exponents that are multiples of 1/2. We can add these to the graph
of our previous powers of 2 to fill in additional points. This is shown on Figure 10.1(a).
In this way, we could also calculate exponents that are multiples of 1/4 since
$$2^{1/4} = \sqrt{\sqrt{2}}$$

is a value that we can obtain. We show how adding these values leads to an even finer set
of points. By continuing in the same way, we fill in the graph of the emerging function.
Connecting the dots smoothly allows us to define a value for any real x, of a new continuous
function,
$y = f(x) = 2^x$.
This function is shown in Figure 10.1(b) as the smooth curve superimposed on the points
we have gathered.
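
The filling-in procedure described above can be sketched in a few lines of Python. This is an illustrative sketch only: the function name `two_to_dyadic` is hypothetical, and the code uses nothing but repeated square roots and integer powers to evaluate $2^x$ at exponents of the form $k/2^m$.

```python
import math

def two_to_dyadic(k: int, m: int) -> float:
    """Return 2**(k / 2**m) using only square roots and integer powers."""
    root = 2.0
    for _ in range(m):
        root = math.sqrt(root)      # after m steps, root = 2**(1 / 2**m)
    if k >= 0:
        return root ** k
    return 1.0 / root ** (-k)       # negative exponents are reciprocals

# Multiples of 1/4 between -0.5 and 1.5 (m = 2 gives steps of 1/4).
for k in range(-2, 7):
    x = k / 4
    print(f"x = {x:5.2f}   2^x = {two_to_dyadic(k, 2):.5f}")
```

Increasing `m` produces a finer and finer set of points, which is the sense in which the dots "fill in" the graph of $y = 2^x$.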
Example 10.2 (Generalization to other bases) Use similar ideas to plot the relatives of
$2^x$ that have other common bases, such as $y = 3^x$, $y = 4^x$ and $y = 10^x$, and comment about
the function $y = a^x$ where a > 0 is a constant (called the base).

Figure 10.2. The function $y = f(x) = a^x$ is shown here for a variety of bases,
a = 2, 3, 4, and 10.
Solution: We can generalize the above idea to form $a^n$ for integer values of n, simply by
multiplying a by itself n times. This would generate the discrete function $a^n$, analogous to
Fig. 10.1(a). So long as a is positive, we can fill in values of $a^x$ when x is rational (in the same
way as we did for $2^x$), and we can smoothly connect the points to lead to the continuous
function $a^x$ for any real x. Given some positive constant a, we will define the new function
$f(x) = a^x$ as the exponential function with base a. Shown in Figure 10.2 are the functions
$y = 2^x$, $y = 3^x$, $y = 4^x$ and $y = 10^x$.

10.2 Derivatives of exponential functions and the function $e^x$

Section 10.2 Learning goals
1. Be able to use the definition of the derivative to calculate the derivative of the function
$y = a^x$ for an arbitrary base a > 0.
2. Understand the significance of the special base e.
3. Learn the properties of the function $e^x$, its derivatives, and how to manipulate it
algebraically.
4. Note the fact that the function $y = e^{kx}$ has a derivative that is proportional to the
same function.

10.2.1 Calculating the derivative of $a^x$

In this section we show how to compute the derivative of the new exponential function just
defined. We first consider an arbitrary positive constant a that will be used for the base of
the function. Then for a > 0 let
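$$y = f(x) = a^x.$$
Then
$$\frac{d a^x}{dx} = \lim_{h \to 0} \frac{a^{x+h} - a^x}{h}$$
$$= \lim_{h \to 0} \frac{a^x a^h - a^x}{h}$$
$$= \lim_{h \to 0} a^x \, \frac{(a^h - 1)}{h}$$
$$= a^x \left[ \lim_{h \to 0} \frac{a^h - 1}{h} \right].$$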
Notice that the variable x appears only in the form of $a^x$. Everything inside the square
brackets does not depend on x at all! It is a constant that depends only on the base we used.
To summarize, we have found that
The derivative of an exponential function $a^x$ is of the form $C_a a^x$,
where $C_a$ is a constant that depends only on the base a.



We now examine this in more detail with the base 2 and the base 10.

Example 10.3 (Derivative of $2^x$) Compute the derivative for the base a = 2 using the
above result.
Solution: For base a = 2, we have
$$\frac{d 2^x}{dx} = C_2 \cdot 2^x$$
where
$$C_2 = \lim_{h \to 0} \frac{2^h - 1}{h} \approx \frac{2^h - 1}{h} \quad \text{for small } h.$$

Example 10.4 (The value of $C_2$) Find an approximation for the value of the constant $C_2$
in Example 10.3 by using small values of h, e.g., h = 1, 0.1, 0.01, etc. Does this value
approach a fixed real number?
Solution: We take these successively smaller values of h so as to approximate the constant
$C_2$ with increasing levels of accuracy. Using a calculator, we find that h = 1 leads to
$C_2 = 1.0$, h = 0.1 leads to $C_2 = 0.7177$, h = 0.001 leads to $C_2 = 0.6934$, and h = 0.00001
to $C_2 = 0.6931$. It is clear that this constant is approaching a fixed value. We represent
this by writing $C_2 \approx 0.6931$. Thus, the derivative of $2^x$ is
$$\frac{d 2^x}{dx} = C_2 \cdot 2^x = (0.6931) \cdot 2^x.$$
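
The calculator experiment in Example 10.4 is easy to reproduce. Below is a minimal Python sketch (the helper name `c_estimate` is an assumption made for illustration) that evaluates the difference quotient $(a^h - 1)/h$ for shrinking values of h.

```python
def c_estimate(a: float, h: float) -> float:
    """Difference quotient (a**h - 1) / h, which approximates C_a for small h."""
    return (a ** h - 1.0) / h

# Successively smaller h for the base a = 2.
for h in [1.0, 0.1, 0.01, 0.001, 0.0001, 0.00001]:
    print(f"h = {h:<8g}  C_2 is approximately {c_estimate(2.0, h):.4f}")
```

The same function can be called with any other positive base, for example `c_estimate(10.0, 0.00001)`, to repeat the experiment for $10^x$.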
Example 10.5 (The base 10 and the derivative of $10^x$) Determine the derivative of
$y = f(x) = 10^x$.
Solution: If we had chosen base 10 for our exponential function, we would have
$$C_{10} \approx \frac{10^h - 1}{h} \quad \text{for small } h.$$
We find, by similar approximation, that
$$C_{10} = 2.3026,$$
so that
$$\frac{d 10^x}{dx} = C_{10} \cdot 10^x = (2.3026) \cdot 10^x.$$
Thus, different bases come with different constant multipliers when derivatives are computed.

10.2.2 The natural base e is convenient for calculus

In Examples 10.3-10.5, we found that when we differentiate an exponential function such
as $a^x$, we get the same function multiplied by some base-dependent constant, i.e. $C_a a^x$.
These constants are somewhat inconvenient, but unavoidable if we use an arbitrary base.
Here we ask whether there exists a more convenient base (to be called e) for which the
constant is particularly simple, namely, such that $C_e = 1$. This, indeed, is the property of
the natural base that we identify as follows.
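Such a base would have to have the property that
$$C_e = \lim_{h \to 0} \frac{e^h - 1}{h} = 1,$$
i.e. that, for small h,
$$\frac{e^h - 1}{h} \approx 1.$$
This means that
$$e^h - 1 \approx h \quad \Rightarrow \quad e^h \approx h + 1 \quad \Rightarrow \quad e \approx (1 + h)^{1/h}.$$
More specifically,
$$e = \lim_{h \to 0} (1 + h)^{1/h}.$$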

We can find an approximate value for this interesting new base by calculating the expression
shown above for some very small (but finite) value of h, e.g. h = 0.00001. Using this value,
we find that
$$e \approx (1.00001)^{100000} \approx 2.71827$$
To summarize, we have found that for this special base, e, we have the following property:
The derivative of the function $e^x$ is $e^x$.
Remark: In the above computation, we came up with a little recipe for calculating
the value of the base e. The recipe involves shrinking some value h and computing a limit.
We can restate this recipe in another way. Let n = 1/h. Then as h shrinks, n will be a
growing number, i.e. $h \to 0$ implies $n \to \infty$. Restated, we have found the following two
equivalent definitions for the base e:
The base of the natural exponential function is the number defined as
follows:
$$e = \lim_{h \to 0} (1 + h)^{1/h} = \lim_{n \to \infty} \left(1 + \frac{1}{n}\right)^n$$
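
Both forms of the recipe for e can be tried numerically. The following Python sketch is illustrative only (the function names are assumptions); it evaluates $(1 + h)^{1/h}$ and $(1 + 1/n)^n$ and prints the library value `math.e` for comparison.

```python
import math

def e_from_h(h: float) -> float:
    """Approximate e by (1 + h)**(1/h) for a small positive h."""
    return (1.0 + h) ** (1.0 / h)

def e_from_n(n: int) -> float:
    """Approximate e by (1 + 1/n)**n for a large integer n."""
    return (1.0 + 1.0 / n) ** n

for n in [10, 1000, 100000]:
    print(f"n = {n:<7d} (1 + 1/n)^n = {e_from_n(n):.5f}   "
          f"(1 + h)^(1/h) with h = 1/n: {e_from_h(1.0 / n):.5f}")
print(f"math.e = {math.e:.5f}")
```

The approximations increase toward e as n grows, but slowly: even n = 100000 gives only about four correct decimal places.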
Using the derivative of ex and the chain rule, we can now differentiate composite
functions in which the exponential appears.
Example 10.6 Find the derivative of $y = e^{kx}$.
Solution: The simple chain rule with u = kx leads to
$$\frac{dy}{dx} = \frac{dy}{du} \frac{du}{dx}$$
but
$$\frac{du}{dx} = k \quad \text{so} \quad \frac{dy}{dx} = e^u k = k e^{kx}.$$
This is a useful result, which we highlight for future use:
The derivative of $y = f(x) = e^{kx}$ is $f'(x) = k e^{kx}$.

Example 10.7 (Chemical reactions) According to the collision theory of bimolecular gas
reactions, a reaction between two molecules occurs when the molecules collide with energy
greater than some activation energy, Ea , referred to as the Arrhenius activation energy.
Ea > 0 is constant for the given substance. The fraction of bimolecular reactions in which
this collision energy is achieved is
$F = e^{-(E_a/RT)}$
where T is temperature (in degrees Kelvin) and R > 0 is the gas constant. Suppose that
the temperature T increases at some constant rate C per unit time. Determine the rate of
change of the fraction F of collisions that result in a successful reaction.
Solution: This is a related rates problem involving an exponential function that depends
on the temperature, which itself depends on time. We are given
$$F = e^{-(E_a/RT)} \quad \text{and} \quad \frac{dT}{dt} = C.$$
Let $u = -E_a/RT$; then $F = e^u$. We use the chain rule to calculate:
$$\frac{dF}{dt} = \frac{dF}{du} \frac{du}{dT} \frac{dT}{dt}.$$
Further, we have
$$\frac{dF}{du} = e^u \quad \text{and} \quad \frac{du}{dT} = \frac{E_a}{RT^2}.$$
Assembling these parts, we have
$$\frac{dF}{dt} = e^u \, \frac{E_a}{RT^2} \, C = C \, \frac{E_a}{R} \, T^{-2} e^{-(E_a/RT)}.$$
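
As a sanity check on the chain-rule result, one can compare the formula with a numerical derivative. The Python sketch below does this under stated assumptions: the values of the activation energy, the heating rate, and the temperature are hypothetical numbers chosen only for illustration, and are not taken from the notes.

```python
import math

R = 8.314        # gas constant in J/(mol K)
EA = 50_000.0    # hypothetical activation energy in J/mol (illustration only)
C = 2.0          # hypothetical heating rate dT/dt, in K per unit time

def fraction(T: float) -> float:
    """F = exp(-Ea / (R T)), the fraction of sufficiently energetic collisions."""
    return math.exp(-EA / (R * T))

def dF_dt_formula(T: float) -> float:
    """Chain-rule result: dF/dt = C (Ea / R) T**(-2) exp(-Ea / (R T))."""
    return C * (EA / R) * T ** -2 * math.exp(-EA / (R * T))

def dF_dt_numeric(T: float, dt: float = 1e-3) -> float:
    """Central difference in time, with temperature T + C*t near t = 0."""
    return (fraction(T + C * dt) - fraction(T - C * dt)) / (2 * dt)

T0 = 300.0       # hypothetical temperature in K
print(dF_dt_formula(T0), dF_dt_numeric(T0))
```

The two printed numbers should agree to several significant digits, which supports the sign and the $T^{-2}$ factor in the formula.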

10.2.3 Properties of the function $e^x$


We list below some of the key features of this function. All of these stem from basic
manipulations of exponents as reviewed in Appendix B.A.
1. $e^a e^b = e^{a+b}$ as with all similar exponent manipulations.
2. $(e^a)^b = e^{ab}$ also stems from simple rules for manipulating exponents.
3. $e^x$ is a function that is defined, continuous, and differentiable for all real numbers x.
4. $e^x > 0$ for all values of x.
5. $e^0 = 1$, and $e^1 = e$.
6. $e^x \to 0$ for increasing negative values of x.
7. $e^x \to \infty$ for increasing positive values of x.
8. The derivative of $e^x$ is $e^x$. (Shown in this chapter).
Example 10.8 Find the derivative of $e^x$ at x = 0 and show that the tangent line at that
point is the line y = x + 1.
Solution: The derivative of $e^x$ is $e^x$, and at x = 0 the slope of the tangent line is therefore
$e^0 = 1$. The tangent line goes through the point $(0, e^0) = (0, 1)$, so it has a y intercept of 1. Thus the tangent
line at x = 0 with slope 1 is y = x + 1. This is shown in Figure 10.3.
Figure 10.3. The function $y = e^x$ has the property that its tangent line at x = 0
has slope 1. (Note that the horizontal scale on this graph is $-4 < x < 4$.)

10.2.4 The function $e^x$ satisfies a new kind of equation

We divert our attention to an interesting observation before going on in our development of
this chapter. We here introduce a new type of equation that will be called a differential
equation. This arises in connection with the exponential function when we consider the
link between this function and its own derivative. Here we will merely comment on this
fact, and later we revisit it in much more detail.
We have seen that the function
$$y = f(x) = e^x$$
satisfies the relationship
$$\frac{dy}{dx} = f'(x) = f(x) = y.$$
In other words, when differentiating, we get the same function back again.
The function $y = f(x) = e^x$ is equal to its own derivative and hence, it satisfies the
equation
$$\frac{dy}{dx} = y.$$
An equation linking a function and its derivative(s) is called a differential equation.
This is a new type of equation, unlike ones seen before in this course. We will see
later in this course that such equations have important applications.

10.3 Inverse functions and logarithms


So far in this chapter, we have defined a new function $e^x$ and computed its derivative.
Paired with this newcomer is an inverse function, the natural logarithm, ln(x). The reader
will find it helpful to review concepts of inverse functions in Appendix C. In particular, the
following key ideas are important:
Given a function y = f(x), its inverse function, denoted $f^{-1}(x)$, satisfies
$$f(f^{-1}(x)) = x, \quad \text{and} \quad f^{-1}(f(x)) = x.$$
The range of f(x) is the domain of $f^{-1}(x)$ (and vice versa), which implies that in many
cases, the relationship holds only on some subset of real numbers. As discussed in the
appendix, the domain of a function (such as $y = x^2$) has to be restricted (e.g. to $x \ge 0$) so
that its inverse function ($y = \sqrt{x}$) is defined. On that restricted domain, the graphs of
f and $f^{-1}$ are mirror images of one another about the line y = x, which essentially stems
from the fact that the roles of x and y are interchanged.

Section 10.3 Learning goals


1. Understand the concept of inverse function from both algebraic and geometric points
of view: given a function, be able to determine whether (and for what restricted domain) an inverse function can be defined and to sketch its inverse function. (Review
Appendix C.E).
2. Understand the relationship between the domain and range of a function and the
range and domain of its inverse function. (Review Appendix C.E).
3. Be able to apply these ideas to the logarithm, which is the inverse of an exponential
function.
4. Follow and be able to reproduce the calculation of the derivative of ln(x) using implicit differentiation.

10.3.1 The natural logarithm is an inverse function for $e^x$

For our newly defined function $y = f(x) = e^x$ we will define an inverse function, shown
in Figure 10.4. We will call this function the logarithm (base e), and write it as
$$y = f^{-1}(x) = \ln(x).$$
We have the following connection: $y = e^x$ implies $x = \ln(y)$. The fact that the functions
are inverses also implies that
$$e^{\ln(x)} = x \quad \text{and} \quad \ln(e^x) = x.$$

Figure 10.4. The function $y = e^x$ is shown here together with its inverse, $y = \ln x$.

The domain of $e^x$ is $-\infty < x < \infty$, and its range is $y > 0$. For the inverse function,
this domain and range are interchanged, meaning that ln(x) is only defined for x > 0 (its
domain) and returns values in $-\infty < y < \infty$ (its range). As shown in Fig. 10.4, the
functions $e^x$ and ln(x) are reflections of one another about the line y = x.
Properties of the logarithm stem directly from properties of the exponential function.
A brief review of these is provided in Appendix B.B. As for other bases, we have that
1. $\ln(ab) = \ln(a) + \ln(b)$
2. $\ln(a^b) = b \ln(a)$
3. $\ln(1/a) = \ln(a^{-1}) = -\ln(a)$


10.3.2 Derivative of ln(x) by implicit differentiation


Implicit differentiation is helpful whenever an inverse function appears. Knowing the
derivative of the original function allows us to easily compute the derivative of its inverse
by using the special relationship. Here we use implicit differentiation to find the derivative
of the newly defined function, y = ln(x) as follows: First, restate the relationship in the
inverse form, but consider y as the dependent variable, that is think of y(x) as a quantity
that depends on x:
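$$y = \ln(x) \quad \Rightarrow \quad e^y = x \quad \Rightarrow \quad \frac{d}{dx} e^{y(x)} = \frac{d}{dx} x$$
Here we apply the chain rule:
$$\frac{d e^y}{dy} \frac{dy}{dx} = 1 \quad \Rightarrow \quad e^y \frac{dy}{dx} = 1 \quad \Rightarrow \quad \frac{dy}{dx} = \frac{1}{e^y} = \frac{1}{x}$$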

We have thus shown the following:


The derivative of ln(x) is 1/x:
$$\frac{d \ln(x)}{dx} = \frac{1}{x}.$$
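
The result of the implicit differentiation can be checked numerically. This is a small Python sketch, not part of the derivation: it uses a central difference (a technique swapped in here for illustration, with a hypothetical helper name) to estimate the slope of ln(x) and compares it with 1/x.

```python
import math

def central_difference(f, x: float, h: float = 1e-6) -> float:
    """Numerical estimate of f'(x) from the values of f at x - h and x + h."""
    return (f(x + h) - f(x - h)) / (2 * h)

for x in [0.5, 1.0, 2.0, 10.0]:
    slope = central_difference(math.log, x)
    print(f"x = {x:5.1f}   numerical slope = {slope:.6f}   1/x = {1 / x:.6f}")
```

The two columns agree to the displayed precision at each of the sample points.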

10.4 Applications of the logarithm


Section 10.4 Learning goals
1. Understand the relationships between properties of $e^x$ and properties of its inverse
ln(x), and master manipulations of expressions involving both.
2. Be able to use the logarithm for base conversions, and for solving equations involving
the exponential function. (For instance, given an equation of the form $A = e^{bt}$, solve
for t.)
3. Given a relationship such as $y = ax^b$, show that ln(y) is related linearly to ln(x),
and use data points for (x, y) to determine the values of a and b.

10.4.1 Using the logarithm for base conversion


The logarithm is helpful in changing an exponential function from one base to another. We
give some examples here.
Example 10.9 Rewrite $y = 2^x$ in terms of base e.
Solution:
$$y = 2^x \quad \Rightarrow \quad \ln(y) = \ln(2^x) = x \ln(2)$$
$$e^{\ln(y)} = e^{x \ln(2)} \quad \Rightarrow \quad y = e^{x \ln(2)}$$
We find (using a calculator) that ln(2) = 0.6931... so we have
$$y = e^{kx} \quad \text{where } k = \ln(2) = 0.6931\ldots$$

Example 10.10 Find the derivative of $y = 2^x$.
Solution: We have expressed this function in the alternate form
$$y = 2^x = e^{kx} \quad \text{with } k = \ln(2).$$
From Example 10.6 we have
$$\frac{dy}{dx} = k e^{kx} = \ln(2) e^{\ln(2) x} = \ln(2) \, 2^x.$$
We have discovered that the constant $C_2$ in the derivative of $2^x$ that we computed in Example 10.4 is actually related to the natural logarithm: $C_2 = \ln(2)$.
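
The base conversion and the identification of the constant with a logarithm can both be illustrated numerically. The Python sketch below is a hedged illustration (the function name `as_base_e` is an assumption): it rewrites $a^x$ as $e^{x \ln a}$ and compares $\ln(a)$ with the difference quotient $(a^h - 1)/h$ for a small h.

```python
import math

def as_base_e(a: float, x: float) -> float:
    """Evaluate a**x through the base-e form exp(x * ln(a))."""
    return math.exp(x * math.log(a))

h = 1e-6  # small step for the difference quotient
for a in (2.0, 10.0):
    k = math.log(a)               # the constant in a**x = exp(k x)
    c_a = (a ** h - 1.0) / h      # numerical estimate of the constant C_a
    print(f"a = {a:4.1f}   ln(a) = {k:.4f}   (a^h - 1)/h = {c_a:.4f}   "
          f"a^3 = {a ** 3:.1f}   exp(3 ln a) = {as_base_e(a, 3.0):.1f}")
```

For both bases the difference quotient matches ln(a) to four decimal places, in line with the values 0.6931 and 2.3026 found earlier in the chapter.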

10.4.2 The logarithm helps to solve exponential equations

Equations involving the exponential function can sometimes be simplified and solved using
the logarithm. Here we provide a few examples of this kind.
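Example 10.11 Find zeros of the function $y = f(x) = e^{2x} - e^{5x^2}$.

Solution: We are being asked to find values of x for which $f(x) = e^{2x} - e^{5x^2} = 0$. We
write
$$e^{2x} - e^{5x^2} = 0 \quad \Rightarrow \quad e^{2x} = e^{5x^2} \quad \Rightarrow \quad \frac{e^{5x^2}}{e^{2x}} = 1 \quad \Rightarrow \quad e^{5x^2 - 2x} = 1$$
Taking the logarithm of both sides, and using the facts that
$$\ln(e^{5x^2 - 2x}) = 5x^2 - 2x, \qquad \ln(1) = 0$$
we obtain
$$5x^2 - 2x = x(5x - 2) = 0 \quad \Rightarrow \quad x = 0, \; 2/5.$$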

We see that the logarithm is useful in the last step of isolating x, after simplifying the
exponential expressions appearing in the equation.
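
A quick numerical check confirms the zeros obtained from $5x^2 - 2x = x(5x - 2) = 0$, namely x = 0 and x = 2/5. The Python sketch below simply evaluates f at these two values.

```python
import math

def f(x: float) -> float:
    """f(x) = exp(2x) - exp(5x^2), the function of Example 10.11."""
    return math.exp(2 * x) - math.exp(5 * x ** 2)

# Roots of 5x^2 - 2x = 0, obtained after taking logarithms.
roots = [0.0, 2 / 5]
for r in roots:
    print(f"x = {r:.2f}   f(x) = {f(r):.2e}")
```

Both values of f are zero up to floating-point rounding.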
Earlier in this chapter we had posed a question: How long will it take for the Andromeda
strain population to attain a size of $6 \cdot 10^{36}$ cells, i.e. to grow to an Earth-sized colony? We
can now address this question and solve it fully.
Example 10.12 (The Andromeda strain) How long will it take for an E. coli colony to
reach a size of $6 \cdot 10^{36}$ cells by unlimited doubling every 20 minutes?


Solution: We recall that the doubling time for the bacteria is 20 min, so that one generation
(one doubling) occurs every 20 minutes, and t/20 generations have elapsed after t minutes.
However, it is not necessarily true that
all cells will split in a synchronized way. This means that after t minutes, we expect that
the number, B(t), of bacteria would be roughly given by the smooth function:
$$B(t) = 2^{t/20}.$$
(Note that this function agrees with our previous table and graph for powers of 2 at all
integer multiples of the generation time, i.e. for t = 20, 40, 60, 80, ... minutes.)
We can compute this as follows:
$$6 \cdot 10^{36} = 2^{t/20} \quad \Rightarrow \quad \ln(6 \cdot 10^{36}) = \ln(2^{t/20}) \quad \Rightarrow \quad \ln(6) + 36 \ln(10) = \frac{t}{20} \ln(2)$$
so
$$t = 20 \, \frac{\ln(6) + 36 \ln(10)}{\ln(2)} \approx 20 \, \frac{1.79 + 36(2.3)}{0.693} \approx 2441.27$$
This is the time in minutes. In hours, it would take 2441.27/60, or about 40.7 hours, for the colony
to grow to such a size.
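
The same calculation can be carried out without rounding the logarithms. The following Python sketch (variable names are illustrative) solves $2^{t/20} = 6 \cdot 10^{36}$ for t and also reports the number of generations.

```python
import math

TARGET_CELLS = 6e36     # Earth-sized colony, from Example 10.1
DOUBLING_MIN = 20.0     # doubling time in minutes

# Solve 2**(t/20) = TARGET_CELLS for t by taking logarithms of both sides.
t_minutes = DOUBLING_MIN * math.log(TARGET_CELLS) / math.log(2)
t_hours = t_minutes / 60
generations = t_minutes / DOUBLING_MIN

print(f"time        = {t_minutes:.1f} minutes = {t_hours:.2f} hours")
print(f"generations = {generations:.1f}")
```

With unrounded logarithms the time comes out near 2443.5 minutes (about 40.7 hours, or roughly 122 generations), slightly different from the value obtained with logarithms rounded to two or three digits, and well under 2 days.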
Example 10.13 (Using base e:) Express the number of bacteria in terms of base e (for
practice with base conversions).
Solution: We would do this as follows:
$$B(t) = 2^{t/20} \quad \Rightarrow \quad \ln(B) = \frac{t}{20} \ln(2)$$
$$e^{\ln(B(t))} = e^{\frac{t}{20} \ln(2)} \quad \Rightarrow \quad B(t) = e^{kt} \quad \text{where } k = \frac{\ln(2)}{20}.$$
The constant k will be referred to as the growth rate of the bacteria. We observe that this
constant can be written as:
$$k = \frac{\ln(2)}{\text{doubling time}}.$$
We will see the usefulness of this approach very soon.

10.4.3 Logarithms help plot data that varies on large scale

Living organisms come in a variety of sizes, from the tiniest cells to the largest whales. Comparing
attributes across species of vastly different sizes poses a challenge, as visualizing such data
on a simple graph obscures both extremes. This is particularly true in allometry, where
comparisons are made of physiological properties across animals from mouse to elephant.
An example of such data for metabolic rate versus mass of the animal is shown in Table 10.2. It would be hard to see all data points clearly on a regular graph. For this reason,
it can be helpful to use logarithmic scaling for either or both variables. We show an example
of this kind of log-log plot in Figure 10.5.

animal     body weight (gm)     metabolic rate
mouse      25                   1580
rat        226                  873
rabbit     2200                 466
dog        11700                318
man        70000                202
horse      700000               106

Table 10.2. Animals of various sizes (mass in gm) have widely different metabolic
rates. How should we plot such data? A double-log scale graph of these data is shown in
Fig. 10.5.

Figure 10.5. A double-log plot of the data in Table 10.2, showing
ln(MR) = ln(metabolic rate) versus ln(body weight in grams). (Horizontal scale from 0 to 20, vertical scale from 2 to 9.)
In allometry, it is conjectured that such data fits some power function of the form
$$y \approx a x^b, \quad \text{where } a, b > 0. \qquad (10.1)$$

(Note that this is not an exponential function, but a power function with power b and coefficient a.) Finding the allometric constants a and b for such a relationship is sometimes
useful. Below we illustrate how this can be done based on the graph in Fig 10.5.
Example 10.14 (Log transformation) Define Y = ln(y), X = ln(x). Show that Eqn. (10.1)
can be rewritten as a linear relationship between the new variables.


Solution: We have
$$Y = \ln(y) = \ln(a x^b) = \ln(a) + \ln(x^b) = \ln(a) + b \ln(x) = A + bX$$
where A = ln(a). Thus, we have shown that X and Y are related linearly:
$$Y = A + bX, \quad \text{where } A = \ln(a).$$

This is the equation of a straight line whose slope is b and whose Y intercept is A.
Example 10.15 (Finding the allometric constants) Use the straight line superimposed on
the data in Fig. 10.5 to deduce the (approximate) values of the allometric constants a and
b.
Solution: We use the straight line that has been fitted to the data in Fig. 10.5. The Y intercept of this line is roughly 8.2. The line goes through the points (20, 3) and (0, 8.2), so
its slope is $(3 - 8.2)/20 = -0.26$. According to the relationship we found in Example 10.14,
$$8.2 = A = \ln(a) \quad \Rightarrow \quad a = e^{8.2} \approx 3640, \quad \text{and } b = -0.26.$$
Thus, reverting to the original allometric relationship leads to
$$y = a x^b = 3640 \, x^{-0.26} = \frac{3640}{x^{0.26}}.$$

It is clear that the metabolic rate y decreases with the size x of the animal, as indicated by
the data in Table 10.2.
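
The log transformation can also be applied directly to the numbers in Table 10.2. The Python sketch below is an illustration under a stated assumption: instead of reading a line off the graph by eye, it fits the line $Y = A + bX$ by ordinary least squares, a technique swapped in here that the text itself does not use. The variable names are hypothetical.

```python
import math

# (animal, body weight in gm, metabolic rate) from Table 10.2
data = [
    ("mouse", 25, 1580),
    ("rat", 226, 873),
    ("rabbit", 2200, 466),
    ("dog", 11700, 318),
    ("man", 70000, 202),
    ("horse", 700000, 106),
]

# Log-transform both variables: X = ln(x), Y = ln(y).
X = [math.log(weight) for _, weight, _ in data]
Y = [math.log(rate) for _, _, rate in data]

# Ordinary least-squares fit of the straight line Y = A + b X.
n = len(X)
x_bar = sum(X) / n
y_bar = sum(Y) / n
b = (sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y))
     / sum((x - x_bar) ** 2 for x in X))
A = y_bar - b * x_bar
a = math.exp(A)          # since A = ln(a)

print(f"slope b = {b:.3f}, intercept A = {A:.3f}, a = exp(A) = {a:.0f}")
```

The fitted slope and intercept come out close to the values read from the graph (slope about -0.26, intercept about 8.2), so the two approaches lead to nearly the same allometric constants.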


Exercises
10.1. Graph the following functions:
(a) f (x) = x2 ex
(b) f (x) = ln(x2 + 3)
(c) f (x) = ln(e2x )
10.2. Express the following in terms of base e:
(a) y = 3x
(b) y =

1
7x

(c) y = 15x

+2

Express the following in terms of base 2:


(d) y = 9x
(e) y = 8x
(f) y = ex

+3

Express the following in terms of base 10:


(g) y = 21x
(h) y = 100010x
(i) y = 50x

10.3. Compare the values of each pair of numbers (i.e. indicate which is larger):
(a) 50.75 , 50.65
(b) 0.40.2 , 0.40.2
(c) 1.0012, 1.0013
(d) 0.9991.5, 0.9992.3
10.4. Rewrite each of the following equations in logarithmic form:
(a) $3^4 = 81$
(b) $3^{-2} = \frac{1}{9}$
(c) $27^{-1/3} = \frac{1}{3}$
10.5. Solve the following equations for x:
(a) $\ln x = 2 \ln a + 3 \ln b$
(b) $\log_a x = \log_a b - \frac{2}{3} \log_a c$

10.6. Reflections and transformations: What is the relationship between the graph of
y = 3x and the graph of each of the following functions?
(a) y = 3x (b) y = 3x
(c) y = 31x
|x|
x
(d) y = 3
(e) y = 2 3
(e) y = log3 x
10.7. Solve the following equations for x:
(a) $e^{3-2x} = 5$
(b) $\ln(3x - 1) = 4$
(c) $\ln(\ln(x)) = 2$
(d) $e^{ax} = Ce^{bx}$, where $a \neq b$ and C > 0.


10.8. Find the first derivative for each of the following functions:
(a) y = ln(2x + 3)3
(b) y = ln3 (2x + 3)
(c) y = ln(cos 12 x)
(d) y = loga (x3 2x) (Hint :
(e) y = e3x

d
1
(loga x) =
)
dx
x ln a

(f) y = a 2 x
(g) y = x3 2x
x

(h) y = ee

et et
et + et
10.9. Find the maximum and minimum points as well as all inflection points of the following functions:
(i) y =

(a) f (x) = x(x2 4)

(b) f (x) = x3 ln(x), x > 0


(c) f (x) = xex

1
+ 1+x
, 1 < x < 1

(e) f (x) = x 3 3 x

(d) f (x) =

1
1x

(f) f (x) = e2x ex

10.10. Shown in Figure 10.6 is the graph of $y = Ce^{kt}$ for some constants C, k, and a tangent
line. Use data from the graph to determine C and k.
10.11. Consider the two functions
(a) y1 (t) = 10e0.1t ,
(b) y2 (t) = 10e0.1t .
Which one is decreasing and which one is increasing? In each case, find the value
of the function at t = 0. Find the time at which the increasing function has doubled
from this initial value. Find the time at which the decreasing function has fallen to
half of its initial value. [Remark: these values of t are called, the doubling time, and
the half-life, respectively]
10.12. Invasive species: An ecosystem with mature trees has a relatively constant population of beetles (species 1) that number around $10^9$. At t = 0, a single reproducing

Exercises


Figure 10.6. Figure for Problem 10.10: the graph of $y = Ce^{kt}$ with a tangent line; the points (0, 4) and (2, 0) are labelled.

invasive beetle (species 2) is introduced accidentally. If this population initially
grows at the exponential rate
$$N_2(t) = e^{rt}, \quad \text{where } r = 0.5 \text{ per month},$$

how long will it take for species 2 to overtake the population of the resident species
1? Assume exponential growth for the entire duration.
10.13. Human population growth: It is sometimes said that the population of humans on
Earth is growing exponentially. By this is meant that
$$P(t) = Ce^{rt}, \quad \text{where } r > 0.$$

In this problem we investigate this claim. To do so, we will consider the human
population starting in year 1800 (t = 0). Hence, we ask whether the data in Table 2.4
fits the relationship
$$P(t) = Ce^{r(t-1800)}, \quad \text{where } t \text{ is time in years and } r > 0.$$

(a) Show that the above relationship implies that ln(P ) is a linear function of
time, and that r is the slope of the linear relationship. (Hint: take the natural
logarithm of both sides of the relationship and simplify.)
(b) Use the data from Table 2.4 for the years 1800 to 2020 to investigate whether
P (t) fits an exponential relationship. (Hint: plot ln(P ), where P is human
population (in billions) against time t in years. We refer to this process as
transforming the data.)
(c) A spreadsheet can be used to fit a straight line through the transformed data
you produced in (b). Find the best fit for the growth rate parameter r using that
option. What are the units of r? What is the best fit value of C?



(d) Based on your plot of ln(P ) versus t and the best fit values of r and C, over
what time interval was the population growing more slowly than the overall
trend, and when was it growing more rapidly than this same overall trend?
(e) Under what circumstances could an exponentially growing population be sustainable?

10.14. A sum of exponentials:


Researchers that investigated the molecular motor dynein found that the number of
motors N (t) remaining attached to their microtubule tracks at time t (in sec) after a
pulse of activation was well described by a double exponential of the form
$$N(t) = C_1 e^{-r_1 t} + C_2 e^{-r_2 t}, \quad t \ge 0.$$

They found that r1 = 0.1, r2 = 0.01 per second, and C1 = 75, C2 = 25 percent.
(a) Plot this relationship for 0 < t < 8 min. Which of the two exponential terms
governs the behaviour over the first minute? Which dominates in the later
phase?
(b) Now consider a plot of ln(N (t)) versus t. Explain what you see and what the
slopes and other aspects of the graph represent.
10.15. Exponential Peeling:
time      N(t)
0.0000    100.0000
0.1000    57.6926
0.2000    42.5766
0.3000    35.8549
0.4000    31.8481
0.5000    28.8296
2.5000    4.7430
4.5000    0.7840
6.0000    0.2032
8.0000    0.0336

Table 10.3. Table for Problem 10.15.

You are given the data in Table 10.3 and told that it was generated by a double
exponential function of the form
$$N(t) = C_1 e^{-r_1 t} + C_2 e^{-r_2 t}, \quad t \ge 0.$$

Use the data to determine the values of the constants r1 , r2 , C1 , C2 .


10.16. Shannon Entropy: In a recent application of information theory to the field of
genomics, a function called the Shannon entropy, H, was considered. A given gene
is represented as a binary device: it can be either on or off (i.e. being expressed


or not). If x is the probability that the gene is on and y is the probability that it is
off, the Shannon entropy function for the gene is defined as
$$H = -x \log(x) - y \log(y).$$
[Remark: the fact that x and y are probabilities just means that they satisfy $0 < x \le 1$
and $0 < y \le 1$.] The gene can only be in one of these two states, so $x + y = 1$.
Use these facts to show that the Shannon entropy for the gene is greatest when the
two states are equally probable, i.e. for x = y = 0.5.
10.17. A threshold function: The response of a regulatory gene to inputs that affect it is
not simply linear. Often, the following so called squashing function or threshold
function is used to link the input x to the output y of the gene.
$$y = f(x) = \frac{1}{1 + e^{-(ax+b)}},$$
where a, b are constants.


(a) Show that 0 < y < 1.
(b) For b = 0 and a = 1 sketch the shape of this function.
(c) How does the shape of the graph change as a increases?
10.18. Sketch the graph of the function y = et sin t.
10.19. The Mexican Hat: Find the critical points of the function
$$y = f(x) = 2e^{-x^2} - e^{-x^2/3}$$
and determine the value of f at those critical points. Use these results and the fact
that for very large x, $f \to 0$ to draw a rough sketch of the graph of this function.
Comment on why this function might be called a Mexican Hat. (Note: The second derivative is not very informative here, and we will not ask you to use it for
determining concavity in this example. However, you may wish to calculate it just
for practice with the chain rule.)
10.20. The Ricker Equation: In studying salmon populations, a model often used is the
Ricker equation, which relates the size of a fish population this year, x, to the expected
size next year, y. (Note that these populations do not change continuously, since all
the parents die before the eggs are hatched.) The Ricker equation is
$$y = \alpha x e^{-\beta x},$$
where $\alpha, \beta > 0$.
(a) Find the value of the current population which maximizes the salmon population next year according to this model.
(b) Find the value of the current population which would be exactly maintained in
the next generation.
(c) Explain why a very large population is not sustainable.


10.21. Spacing in a fish school: Life in a social group has advantages and disadvantages:
protection from predators is one advantage. Disadvantages include competition with
others for food or resources. Spacing of individuals in a school of fish or a flock of
birds is determined by the mutual attraction and repulsion of neighbors from one
another: each individual does not want to stray too far from others, nor get too
close.
Suppose that when two fish are at distance x > 0 from one another, they are attracted
with force $F_a$ and repelled with force $F_r$ given by:
$$F_a = Ae^{-x/a}, \qquad F_r = Re^{-x/r},$$
where A, R, a, r are positive constants. A, R are related to the magnitudes of the
forces, and a, r to the spatial range of these effects.
(a) Show that at the distance x = a the first function has fallen to (1/e) times its
value at the origin. (Recall $e \approx 2.7$.) For what value of x does the second
function fall to (1/e) times its value at the origin? Note that this is the reason
why a, r are called spatial ranges of the forces.
(b) It is generally assumed that R > A and r < a. Interpret what this means
about the comparative effects of the forces and sketch a graph showing the two
functions on the same set of axes.
(c) Find the distance at which the forces exactly balance. This is called the comfortable distance for the two individuals.
(d) If either A or R changes so that the ratio R/A decreases, does the comfortable
distance increase or decrease? (Give reason.)
(e) Similarly comment on what happens to the comfortable distance if a increases
or r decreases.
10.22. Seed distribution: The density of seeds at a distance x from a parent tree is observed to be
$$D(x) = D_0 e^{-x^2/a^2},$$
where a > 0, $D_0 > 0$ are positive constants. Insects that eat these seeds tend to
congregate near the tree so that the fraction of seeds that get eaten is
$$F(x) = e^{-x^2/b^2},$$

where b > 0. (Remark: These functions are called Gaussian or Normal distributions.
The parameters a, b are related to the width of these bell-shaped curves.) The
number of seeds that survive (i.e. are produced and not eaten by insects) is
$$S(x) = D(x)(1 - F(x)).$$
Determine the distance x from the tree at which the greatest number of seeds survive.


10.23. Euler's "e": In 1748, Euler wrote a classic book on calculus ("Introductio in
Analysin Infinitorum") in which he showed that the function $e^x$ could be written
in an expanded form similar to an (infinitely long) polynomial:
$$e^x = 1 + x + \frac{x^2}{1 \cdot 2} + \frac{x^3}{1 \cdot 2 \cdot 3} + \ldots$$

Use as many terms as necessary to find an approximate value for the number e and
for 1/e to 5 decimal places. Remark: we will see later that such expansions, called
power series, are central to approximations of many functions.


Chapter 11

Differential equations for exponential growth and decay

In this chapter we capitalize on an important observation made about exponential functions
to open the door to a new kind of mathematical equation in which functions and their
derivatives are related, that is, differential equations. Here we merely acquaint the reader
with the special relationship between $e^{kx}$ and one simple example of this class of equations.
This relationship will lead us into a new and important link between scientific problems and
mathematical descriptions.

11.1 Introducing a new kind of equation


Section 11.1 Learning goals
1. Understand that the exponential function and its derivative are proportional to one
another, and thereby satisfy a relationship of the form dy/dx = ky.
2. Understand the definitions of a differential equation and of a solution to a differential
equation.
3. Understand that $y = e^{kt}$ is a solution to the differential equation $dy/dt = ky$.

11.1.1 Observations about the exponential function

In a previous chapter we made an observation about a special property of the exponential
function
$$y = f(x) = e^x,$$
namely, that it satisfies the relationship
$$\frac{dy}{dx} = e^x = y.$$
In this way, we encountered an important new class of equations,
$$\frac{dy}{dx} = y.$$
We call this a differential equation because it connects (one or more) derivatives of a
function with the function itself.
Definition 11.1 (Differential equation). A differential equation is a mathematical equation that relates one or more derivatives of some (possibly unknown) function to the function itself. Solving the differential equation is the process of identifying the function(s) that
satisfy the given relationship.
In this chapter we will study the implications of the above observation. Since most
of the applications that we examine will be time-dependent processes, we will here use t
(for time) as the independent variable.
Then we can make the following observations:
1. Let y be the function of time:
$$y = f(t) = e^t.$$
Then
$$\frac{dy}{dt} = e^t = y.$$
With this slight change of notation, we see that the function $y = e^t$ satisfies the
differential equation
$$\frac{dy}{dt} = y.$$
2. Now consider
$$y = e^{kt}.$$
Then, using the chain rule, and setting $u = kt$ and $y = e^u$, we find that
$$\frac{dy}{dt} = \frac{dy}{du}\frac{du}{dt} = e^u \cdot k = ke^{kt} = ky.$$
So we see that the function $y = e^{kt}$ satisfies the differential equation
$$\frac{dy}{dt} = ky.$$
3. If instead we had the function
$$y = e^{-kt},$$
we could similarly show that the differential equation it satisfies is
$$\frac{dy}{dt} = -ky.$$
4. Now suppose we had a constant in front, e.g. we were interested in the function
$$y = 5e^{kt}.$$
Then, by simple differentiation and rearrangement we have
$$\frac{dy}{dt} = 5\frac{d}{dt}e^{kt} = 5(ke^{kt}) = k(5e^{kt}) = ky.$$
So we see that this function with the constant in front also satisfies the differential
equation
$$\frac{dy}{dt} = ky.$$
5. The conclusion we reached in the previous step did not depend at all on the constant
out front. Indeed, if we had started with a function of the form
$$y = Ce^{kt},$$
where C is any constant, we would still have a function that satisfies the same differential equation.
6. While we will not prove this here, it turns out that these are the only functions that
satisfy this equation.
The differential equation
$$\frac{dy}{dt} = ky \qquad (11.1)$$
has as its solution the function
$$y = Ce^{kt}. \qquad (11.2)$$
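The claim that every function of the form $y = Ce^{kt}$ satisfies $dy/dt = ky$ can also be checked numerically. The following is a minimal sketch, not part of the original notes: the helper names, the sample values of C and k, and the step size h are illustrative assumptions. It compares a central-difference estimate of the derivative with $ky$ at a few times.

```python
import math

def y(t, C, k):
    """One member of the family y = C * e^(k t)."""
    return C * math.exp(k * t)

def derivative(f, t, h=1e-6):
    """Central-difference estimate of f'(t); h is an assumed small step."""
    return (f(t + h) - f(t - h)) / (2 * h)

def max_relative_residual(C, k, times):
    """Largest relative mismatch between dy/dt and k*y over the sample times."""
    worst = 0.0
    for t in times:
        lhs = derivative(lambda s: y(s, C, k), t)
        rhs = k * y(t, C, k)
        worst = max(worst, abs(lhs - rhs) / abs(rhs))
    return worst

# Hypothetical sample values of C and k (growth and decay, either sign of C).
times = [0.0, 0.5, 1.0, 2.0]
for C, k in [(1.0, 1.0), (5.0, 0.3), (-2.0, -0.7)]:
    print(C, k, max_relative_residual(C, k, times))
```

The printed residuals should be tiny for every choice of C, which illustrates (but does not prove) that the constant in front plays no role in the differential equation.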

A few comments are worth making: First, unlike algebraic equations (whose solutions are numbers), differential equations have solutions that are functions. We have seen
above that, depending on the sign of the constant k, we get functions with either a positive
or a negative exponent (assuming that time t > 0). This leads to the two distinct types of
behaviour, exponential growth or exponential decay, shown in Figures 11.1(a) and (b). In
each of these figures we see a family of curves, each of which represents a function that
satisfies one of the differential equations we have discussed.

11.1.2 The solution to a differential equation

Definition 11.2 (Solution to a differential equation). By a solution to a differential equation, we mean a function that satisfies that equation.
In the previous section we have seen a collection of solutions to each of the differential equations we discussed. For example, each of the curves shown in Figure 11.1(a) shares
the property that it satisfies the equation
$$\frac{dy}{dt} = ky.$$

Figure 11.1. Functions of the form $y = Ce^{kt}$: (a) for k > 0 these represent exponentially growing solutions, whereas (b) for k < 0 they represent exponentially decaying
solutions.
We now ask: what distinguishes one from the other? More specifically, how could we
specify one particular member of this family as the one of interest to us? As we saw above,
the differential equation does not distinguish these: we need some additional information.
For example, if we had some coordinates, say (a, b) that the function of interest should go
through, this would select one function out of the collection. It is common practice (though
not essential) to specify the starting value or initial value of the function i.e. its value at
time t = 0.
Definition 11.3 (Initial value). An initial value is the value at time t = 0 of the desired
solution of a differential equation.
Example 11.4 Suppose we are given the differential equation (11.1) and the initial value
$$y(0) = y_0,$$
where $y_0$ is some (known) fixed value. Find the value of the constant C in the solution
(11.2).
Solution: We proceed as follows:
$$y(t) = Ce^{kt}, \quad \text{so} \quad y(0) = Ce^{k \cdot 0} = Ce^0 = C \cdot 1 = C.$$
But, by the initial condition, $y(0) = y_0$. So then,
$$C = y_0,$$
and we have established that
$$y(t) = y_0 e^{kt}, \quad \text{where } y_0 \text{ is the initial value.}$$
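The same reasoning selects C when the known point is some general (a, b) rather than the initial value: setting $b = Ce^{ka}$ gives $C = be^{-ka}$. The sketch below illustrates this under assumed values; the function names and the numbers used for k, a and b are hypothetical, chosen only for the demonstration.

```python
import math

def constant_from_point(a, b, k):
    """Return C such that y(t) = C * e^(k t) passes through the point (a, b)."""
    return b * math.exp(-k * a)

def solution(t, C, k):
    """Evaluate y(t) = C * e^(k t)."""
    return C * math.exp(k * t)

k = 0.5  # hypothetical rate constant

# Initial value y(0) = 3: the point is (0, 3), so C equals the initial value.
C_initial = constant_from_point(0.0, 3.0, k)
print(C_initial)

# A point away from t = 0, here (2, 7): C is smaller than 7 because k > 0.
C_general = constant_from_point(2.0, 7.0, k)
print(C_general, solution(2.0, C_general, k))
```

With a = 0 the formula reduces to $C = y_0$, in agreement with Example 11.4.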

11.1.3 Where do differential equations come from?

[Figure 11.2: flow chart. Box labels: Scientific problem or system; Facts, observations, assumptions, hypotheses; "Laws of Nature" or statements about rates of change; Mathematical Model; Differential equation(s) describing the system; Solutions to the differential equations; Predictions about the system behaviour.]

Figure 11.2. A flow chart showing how differential equations originate from
scientific problems.
Figure 11.2 shows how differential equations arise in scientific investigations. The
process of going from initial vague observations about a system of interest (such as planetary motion) to a mathematical model, often involves a great deal of speculation, at first,
about what is happening, what causes the motion or the changes that take place, and what
assumptions might be fruitful in trying to analyze and understand the system.
Once the cloud of doubt and vague ideas settles somewhat, and once the right simplifying assumptions are made, we often find that the mathematical model leads to a differential equation. In most scientific applications, it may then be a huge struggle to figure out
which functions would be the appropriate class of solutions to that differential equation,
but if we can find those functions, we are in a position to make quantitative predictions about
the system of interest.
In our case, we have stumbled on a simple differential equation by noticing a property
of functions that we were already familiar with. This is a lucky accident, and we will exploit
it in an application shortly.
In many cases, the process of modelling hardly stops when we have found the link
between the differential equation and solutions. Usually, we would then compare the predictions to observations that may help us to refine the model, reject incorrect or inaccurate


assumptions, or determine to what extent the model has limitations.


A simple example of population growth modelling is given as motivation for some of
the ideas seen in this discussion.

11.2 Differential equation for unlimited population growth
Section 11.2 Learning goals
1. Follow the derivation of the model for human population growth and understand that
it leads to a differential equation.
2. Appreciate that the solution to that equation is an exponential function.
3. Understand the definitions of per capita birth rates and rates of mortality, and follow
the process of estimating their values from assumptions about a population.
4. Be able to compute the doubling time of the population from its growth rate and vice
versa.
In this section we will examine the way that a simple differential equation arises
when we study the phenomenon of uncontrolled population growth.
We will let N (t) be the number of individuals in a population at time t. The population will change with time. Indeed the rate of change of N will be due to births (that
increase N ) and deaths (that decrease it).


$$[\text{Change in } N \text{ per year}] = [\text{Number of births per year}] - [\text{Number of deaths per year}]$$

We will assume that all individuals are identical in the population, and that the average per capita birth rate, r, and average per capita mortality rate, m, are some fixed
positive constants. That is,
$$r = \text{per capita birth rate} = \frac{\text{number births per year}}{\text{population size}},$$
$$m = \text{per capita mortality rate} = \frac{\text{number deaths per year}}{\text{population size}}.$$

Consequently, we have
Number of births per year = rN
Number of deaths per year = mN
We will refer to constants such as r, m as parameters. In general, for a given population,
these would have specific numerical values that could be found from experiment, by collecting data, or by making simple assumptions. In Section 11.2.1, we will show how a set
of assumptions about birth and mortality could lead to such values.


Then in year t, the total number of births is rN , and the total number of deaths is
mN . This means that rN people per year enter the population while mN people per year
leave it. The rate of change of the population as a whole is given by the derivative dN/dt.
Thus we have arrived at:
$$\frac{dN}{dt} = rN - mN. \qquad (11.3)$$
This is a differential equation: it links the derivative of N (t) to the function N (t).
By solving the equation (i.e. identifying its solution), we will be able to make a projection
about how fast the world population is growing.
We can first simplify the above by noting that
$$\frac{dN}{dt} = rN - mN = (r - m)N = kN,$$
where
$$k = (r - m).$$
This means that we have shown that the population satisfies a differential equation of the
form
$$\frac{dN}{dt} = kN, \quad \text{for } k = (r - m).$$
Here k is the so-called net growth rate, i.e., birth rate minus mortality rate. This leads us
to the following conclusions:
• The function that describes population over time is (by previous results) simply
$$N(t) = N_0 e^{kt} = N_0 e^{(r-m)t}.$$
(The result is identical to what we saw previously, but with N rather than y as the
time-dependent function.)
• We are no longer interested in negative values of N since it now represents a quantity
that has to be positive to have biological relevance, i.e. population size.
• The population will grow provided k > 0, which happens when $r - m > 0$, i.e. when
birth rate exceeds mortality rate.
• If k < 0, or equivalently, r < m, then more people die on average than are born, so
that the population will shrink and (eventually) go extinct.
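The conclusions above can be illustrated with a short computation. This is a sketch under assumed numbers: the function name and the sample values of $N_0$, r and m are hypothetical and are not taken from the notes. It evaluates $N(t) = N_0 e^{(r-m)t}$ for a case with r > m, a case with r < m, and the borderline case r = m.

```python
import math

def population(t, N0, r, m):
    """N(t) = N0 * e^((r - m) t) for per capita birth rate r and mortality m."""
    k = r - m  # net growth rate
    return N0 * math.exp(k * t)

N0 = 100.0   # hypothetical starting population
t = 10.0     # hypothetical elapsed time in years

growing = population(t, N0, r=0.03, m=0.01)    # k > 0: population grows
shrinking = population(t, N0, r=0.01, m=0.03)  # k < 0: population shrinks
balanced = population(t, N0, r=0.02, m=0.02)   # k = 0: population is constant

print(growing, shrinking, balanced)
```

Only the difference r - m matters: doubling both rates while keeping their difference fixed would leave N(t) unchanged in this model.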

11.2.1 A simple model for human population growth

Let us apply the ideas developed in this chapter to the issue of human population expansion.
Our goal in this section is to make some simplifying assumptions about births and mortality
of humans so as to estimate values for the rates r and m that appear in Eqn. (11.3) (or
alternately for $k = r - m$ in Eqn. (11.2)). We list our assumptions and conclusions below.


Assumptions:
• The age distribution of the population is flat, i.e. there are as many 10-year-olds as
70-year-olds. (This is quite inaccurate, but will be a good place to start, as it will be
easy to estimate some of the quantities we need.) Figure 11.3 shows such a uniform
age distribution.

Figure 11.3. We assume a uniform age distribution to make it easy to determine
the fraction of people who are fertile (and can give birth) or who are old (and likely to die).
While slightly silly, this simplification will help estimate the desired parameters. (Axes: number of people versus age, from 0 to 80.)

• The sex ratio is roughly 50%. This means that half of the population is female and
half male.
• Women are fertile and can have babies only during part of their lives: We will assume
that the fertile years are between age 15 and age 55, as shown in Figure 11.4.

Figure 11.4. We have assumed that only women between the ages of 15 and 55
years old are fertile and can give birth. Then, according to our uniform age distribution
assumption, half of all women are between these ages and hence fertile. (Axes: number of people versus age, with the fertile range from 15 to 55 marked on a scale from 0 to 80.)

• A lifetime lasts 80 years. This means that for half of that time a given woman can
contribute to the birth rate, or that (55 - 15)/80 = 50% of women alive at any time are
able to give birth.
• During a woman's fertile years, we'll assume that on average, she has one baby every
10 years. (This is also a suspect assumption, since in the Western world, a woman
has on average 2-2.3 children over her lifetime, while in the Developing nations, the
number of children per woman is much higher.)


Based on the above assumptions, we can estimate the parameter r as follows:
$$r = \frac{\text{number women}}{\text{population}} \cdot \frac{\text{years fertile}}{\text{years of life}} \cdot \frac{\text{number babies per woman}}{\text{number of years}}.$$
Thus we compute that
$$r = \frac{1}{2} \cdot \frac{1}{2} \cdot \frac{1}{10} = 0.025 \text{ births per person per year.}$$
Note that this value is now a rate per person per year, averaged over the entire population
(male and female, of all ages). We need such an average rate since our model of Eqn. (11.3)
assumes that individuals are identical. We now have an approximate value for human per
capita birth rate, $r \approx 0.025$ per year.
Next, we estimate the mortality.
• We also assume that deaths occur only from old age (i.e. we ignore disease, war,
famine, and child mortality).
• We assume that everyone lives precisely to age 80, and then dies instantly. (Not an
assumption our grandparents would happily live with!)

Figure 11.5. We assume that the people in the age bracket 79-80 years old all die
each year, and that those are the only deaths. This, too, is a silly assumption, but makes it
easy to estimate mortality in the population. (Axes: number of people versus age; mortality occurs at age 80.)

But, with the flat age distribution shown in Figure 11.3, there would be a fraction of
1/80 of the population who are precisely removed by mortality every year (i.e. only those
in their 80th year). In this case, we can estimate that the per capita mortality is:
$$m = \frac{1}{80} = 0.0125 \text{ per person per year.}$$

Putting our results together, we have the net growth rate $k = r - m = 0.025 - 0.0125 = 0.0125$
per person per year. In the context of such growth problems, we will
often refer to the constant k as the rate constant, or the growth rate of the population. We
also say that the population grows at the rate of 1.25% per year in this case.
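The arithmetic behind these estimates can be retraced in a few lines. The sketch below simply encodes the stated assumptions (80-year lifespan, fertile years from 15 to 55, half the population female, one baby per 10 fertile years); the function and variable names are illustrative choices, not notation from the notes.

```python
def estimate_rates(lifespan=80.0, fertile_start=15.0, fertile_end=55.0,
                   fraction_female=0.5, babies_per_fertile_year=1.0 / 10.0):
    """Return (r, m, k) per person per year under the flat-age-distribution model."""
    # Fraction of a lifetime (and, with a flat age distribution, of all women)
    # that falls within the fertile years.
    fraction_fertile = (fertile_end - fertile_start) / lifespan
    r = fraction_female * fraction_fertile * babies_per_fertile_year
    # Only those in their final year of life are removed each year.
    m = 1.0 / lifespan
    k = r - m
    return r, m, k

r, m, k = estimate_rates()
print(f"r = {r:.4f}, m = {m:.4f}, k = {k:.4f}")
print(f"net growth of about {100 * k:.2f}% per year")
```

Changing a single assumption, for instance the number of babies per fertile year, shows how sensitive the net growth rate k is to the inputs.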
Example 11.5 Using the results of this section, find a prediction for the population size
N (t) as a function of time t.


Solution: We have found that our population satisfies the equation
$$\frac{dN}{dt} = 0.0125N,$$
so that
$$N(t) = N_0 e^{0.0125t}, \qquad (11.4)$$
where $N_0$ is the starting population size. Figure 11.6 illustrates how this function behaves,
using a starting value of $N(0) = N_0 = 6$ billion.

Figure 11.6. Projected world population (in billions) over the next 100 years,
based on our model of Eqn. (11.4) and assuming that the current population is 6 billion.
Example 11.6 (Human population in 100 years) Given the initial condition (IC) N (0) =
6 billion, determine the human population level in 100 years as predicted by the model.
Solution: We have that at time t = 0, $N(0) = N_0 = 6$ billion. Then in billions,
$$N(t) = 6e^{0.0125t},$$
so that when t = 100 we would have
$$N(100) = 6e^{0.0125 \cdot 100} = 6e^{1.25} \approx 6 \cdot 3.49 = 20.94.$$
Thus, with the population around 6 billion now, we should see about 21 billion people on
Earth in 100 years.
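The projection can be reproduced directly from Eqn. (11.4). This is a minimal sketch using the values from the example ($N_0 = 6$ billion, k = 0.0125 per year); the function name is an illustrative assumption.

```python
import math

def projected_population(t, N0=6.0, k=0.0125):
    """World population in billions after t years, from N(t) = N0 * e^(k t)."""
    return N0 * math.exp(k * t)

# Tabulate the model's prediction every 25 years over the next century.
for year in (0, 25, 50, 75, 100):
    print(year, round(projected_population(year), 2))
```

The value at t = 100 is about 20.94 billion, matching the hand calculation above.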

11.2.2 A critique
Before leaving our population model, we should remember that our projections hold only so
long as some rather restrictive assumptions are made. We have made many simplifications,
and ignored many features that would seriously affect these results.


These include variations in the birth and mortality rates that stem from competition
for the Earth's resources, epidemics that take hold when crowding occurs, uneven distributions of resources or space, and other factors. We have also assumed that the age distribution is uniform (flat), but that is clearly wrong: the population grows only by adding new
infants, and this would skew the distribution even if it starts out uniform. All these factors
would lead us to be skeptical, and to eventually think about more advanced ways of describing the population growth. Certainly, the uncontrolled exponential growth described
so far would not be sustainable in the long run.

11.2.3 Growth and doubling

In Chapter 10, we used base 2 to launch our discussion of exponential growth and population doublings. But later, we discovered that base e proves more convenient for calculus,
as its derivative is simplest. We also saw in Chapter 10, that bases of exponents can be
interconverted. These skills will prove helpful in our discussion of doubling times below.
We ask how long it would take for a population to double given that it is growing
exponentially, with growth rate k, as described above. That is, we ask at what time t it
would be true that N reaches twice its starting value, i.e. N (t) = 2N0 . We determine this
time as follows:
$$N(t) = 2N_0 \quad \text{and} \quad N(t) = N_0 e^{kt}$$
imply that the population has doubled when t satisfies
$$2N_0 = N_0 e^{kt}, \quad \Rightarrow \quad 2 = e^{kt}.$$
Taking the natural log of both sides leads to
$$\ln(2) = \ln(e^{kt}) = kt.$$
Thus, the doubling time, denoted $\tau$, is:
$$\tau = \frac{\ln(2)}{k}.$$

Example 11.7 (Human population doubling time) Determine the doubling time for the
human population based on the results of our approximate growth model.
Solution: We have found a growth rate of roughly k = 0.0125 per year for the human
population. Based on this, it would take
$$\tau = \frac{\ln(2)}{0.0125} \approx 55.45 \text{ years}$$

for the population to double. Compare this with the graph of Fig 11.6, and note that over
this time span, the population increases from 6 to 12 billion.
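As a quick check of this example, the doubling time can be computed and then substituted back into the growth law. The sketch below assumes the values used in the text (k = 0.0125 per year, a starting population of 6 billion); the function name is illustrative.

```python
import math

def doubling_time(k):
    """Doubling time ln(2)/k for exponential growth with rate constant k > 0."""
    if k <= 0:
        raise ValueError("doubling time is defined only for k > 0")
    return math.log(2) / k

k = 0.0125
tau = doubling_time(k)

N0 = 6.0                           # billions
N_at_tau = N0 * math.exp(k * tau)  # should be twice the starting value

print(round(tau, 2))        # about 55.45 years
print(round(N_at_tau, 6))   # about 12 billion
```

Note that the doubling time depends only on k, not on the starting value $N_0$.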



In general, an equation of the form
$$\frac{dy}{dt} = ky$$
that represents an exponential growth will have a doubling time of
$$\tau = \frac{\ln(2)}{k}.$$

Figure 11.7. Doubling time for exponential growth. (The graph marks the levels $y_0$ and $2y_0$ against time t.)


This is shown in Figure 11.7. The interesting thing that we discovered is that the
population doubles every 55 years! So that, for example, after 110 years there have been
two doublings, or a quadrupling of the population.
Example 11.8 (A ten year doubling time) Suppose we are told that some animal population doubles every 10 years. What growth rate would lead to such a trend?
Solution: Rearranging
$$t_2 = \frac{\ln(2)}{k},$$
where $t_2$ denotes the doubling time, we obtain
$$k = \frac{\ln(2)}{t_2} = \frac{0.6931}{10} \approx 0.07 \text{ per year.}$$

Thus, we may say that a growth rate of 7% leads to doubling roughly every 10 years.
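The inverse calculation, from doubling time to growth rate, is equally short. The sketch below uses the 10-year doubling time of Example 11.8; the function name is an illustrative assumption.

```python
import math

def growth_rate_from_doubling_time(t2):
    """Rate constant k = ln(2)/t2 that produces a doubling every t2 time units."""
    if t2 <= 0:
        raise ValueError("doubling time must be positive")
    return math.log(2) / t2

k = growth_rate_from_doubling_time(10.0)
print(round(k, 4))                  # about 0.0693 per year

# Round trip: growing at rate k for 10 years multiplies the population by 2.
print(round(math.exp(k * 10.0), 6))
```

The unrounded value is closer to 0.0693 than to 0.07, so the 7% figure is a convenient rule of thumb rather than an exact rate.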

11.3 Radioactive decay


A radioactive material consists of atoms that undergo a spontaneous change. Every so often, some atom will emit a particle, and decay into an inert form. We call this a process
of radioactive decay. For any one atom, it is impossible to predict when this event would
occur exactly, but based on the behaviour of a large number of atoms decaying spontaneously, we can assign a probability k of decay per unit time. In this section, we show
how simple book-keeping (keeping track of the number of radioactive atoms remaining)
leads naturally to a differential equation. Once we arrive at that equation, we use ideas
developed earlier to determine a likely candidate for its solution and to check its validity.
We then use these results to make a long-term prediction about the amount of radioactivity
remaining at any future time.

Section 11.3 Learning goals


1. Follow the model for the number of radioactive atoms and understand how this leads
to a differential equation.
2. Be able to determine the solution of the resulting differential equation.
3. Given the initial amount, be able to determine the amount of radioactivity remaining
at a future time.
4. Understand the link between half-life of the radioactive material and its decay rate,
and be able to find one when given the other.

11.3.1 Deriving the model

Let N (t) be the number of radioactive atoms at time t. Generally, we would know N (0),
the number present initially. Our goal is to make simple assumptions about the process of
decay, and arrive at a mathematical model that will help us to predict values of N (t) at any
later time t > 0.
Assumption 1a: The process is random, but on average, the probability of decay for a
given radioactive atom is k per unit time where k > 0 is some constant.
Assumption 1b: During each (small) time interval of length $h = \Delta t$, a radioactive atom
has probability kh of decaying. (This is merely a restatement of Assumption 1a.)
Suppose that at some time $t_0$, there are $N(t_0)$ radioactive atoms. Then according to the
above assumptions, on average $khN(t_0)$ atoms would decay during the time period
$t_0 \le t \le t_0 + h$. How many will there be at time $t_0 + h$? We can write the following word equation:
$$[\text{Amount left at time } t_0 + h] = [\text{Amount present at time } t_0] - [\text{Amount decayed during time interval } t_0 \le t \le t_0 + h]$$
or, restated in symbols,
$$N(t_0 + h) = N(t_0) - khN(t_0). \qquad (11.5)$$


Here we have assumed that h is a small time period. Rearranging Eqn. (11.5) leads to
$$\frac{N(t_0 + h) - N(t_0)}{h} = -kN(t_0).$$
Now let h get smaller and smaller ($h \to 0$) and recall that
$$\lim_{h \to 0} \frac{N(t_0 + h) - N(t_0)}{h} = \left.\frac{dN}{dt}\right|_{t_0} = N'(t_0),$$
where we have used the notation for a derivative of N with respect to t at the point $t = t_0$.
We have thus shown that a description of the population of radioactive atoms reduces to
$$\left.\frac{dN}{dt}\right|_{t_0} = -kN(t_0).$$
But this is true for any time $t_0$, so we can replace this with the more general equation, which
holds at any time t,
$$\frac{dN}{dt} = -kN. \qquad (11.6)$$
dt
We recognize this as a differential equation. As before, it provides a link between a function
of time N (t) and its own rate of change dN/dt. Indeed, this equation specifies that dN/dt
is proportional to N , but with a negative constant of proportionality. We will shortly see
that this implies a process of decay.
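The passage from the bookkeeping rule (11.5) to the differential equation (11.6) can be explored numerically by applying (11.5) repeatedly with smaller and smaller h. The following is a sketch under assumed values: $N_0 = 1000$ atoms, k = 0.1 per unit time and the list of step sizes are hypothetical. The comparison function $N_0 e^{-kt}$ is the member of the family (11.2) with rate constant $-k$ and initial value $N_0$.

```python
import math

def decay_by_small_steps(N0, k, t_end, h):
    """Apply N(t0 + h) = N(t0) - k*h*N(t0) repeatedly from t = 0 to t = t_end."""
    steps = round(t_end / h)
    N = N0
    for _ in range(steps):
        N = N - k * h * N
    return N

N0, k, t_end = 1000.0, 0.1, 10.0      # hypothetical values
exact = N0 * math.exp(-k * t_end)      # exponential decay, for comparison

for h in (1.0, 0.1, 0.01, 0.001):
    approx = decay_by_small_steps(N0, k, t_end, h)
    print(h, round(approx, 3), round(abs(approx - exact), 3))
```

As h shrinks, the step-by-step count approaches the exponential curve, which is the numerical counterpart of taking the limit $h \to 0$ in the derivation.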
Above we formulated the entire model in terms of the number of radioactive atoms.
However, as we show in the next example, the same equation holds regardless of the units we
choose to measure the amount of radioactivity.
Example 11.9 Define the number of moles of radioactive material by $y(t) = N(t)/A$,
where A is Avogadro's number (the number of molecules in 1 mole). Determine
the differential equation satisfied by y(t).
Solution: We write $y(t) = N(t)/A$ in the form $N(t) = Ay(t)$ and substitute this expression for N(t) in Eqn. (11.6). We use the fact that A is a constant to simplify the derivative.
Then
$$\frac{dN}{dt} = -kN \quad \Rightarrow \quad \frac{A\,dy(t)}{dt} = -k(Ay(t)) \quad \Rightarrow \quad A\frac{dy(t)}{dt} = A(-ky(t)).$$
Canceling the constant A from both sides of the equation leads to
$$\frac{dy(t)}{dt} = -ky(t), \quad \text{or simply} \quad \frac{dy}{dt} = -ky. \qquad (11.7)$$

Thus y(t) satisfies the same kind of differential equation, with the same negative proportionality between the derivative and the original function. We will henceforth denote (11.7)
as the decay equation.
Next, we ask what kind of function has this property, i.e. we seek the solution of this
differential equation.

11.3.2 Solution to the decay equation

Here we explore the solution to Eqn. (11.7). Suppose that initially, there was an amount
$y_0$. Then, together, the differential equation and initial condition are

\[
\frac{dy}{dt} = -ky, \qquad y(0) = y_0. \tag{11.8}
\]

We often refer to this pairing between a differential equation and an initial condition as
an initial value problem. Next, we show that an exponential function is an appropriate
solution to this problem.
Example 11.10 Show that the function

\[
y(t) = y_0 e^{-kt} \tag{11.9}
\]

is a solution to the initial value problem (11.8).


Solution: As always, we can verify that a function is a solution to a differential equation
by checking that it satisfies the equation. First we compute the derivative of the candidate
function, obtaining
\[
\frac{dy(t)}{dt} = \frac{d}{dt}\left[y_0 e^{-kt}\right] = y_0 \frac{de^{-kt}}{dt} = -ky_0 e^{-kt} = -ky(t).
\]

We have used the fact that $y_0$ is a constant, and applied the chain rule to differentiate
$e^{-kt}$. Then by the above algebra, we have verified that for the exponential function in
question, $dy/dt = -ky$. We can also check that the initial condition is satisfied:

\[
y(0) = y_0 e^{-k \cdot 0} = y_0 e^{0} = y_0 \cdot 1 = y_0.
\]
Hence, y(0) = y0 and the initial condition is also satisfied. We can conclude that the
function (11.9) is the solution to the initial value problem for radioactive decay. For k > 0
a constant, this is a decreasing function of time that we refer to as exponential decay.
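The same verification can be repeated numerically. The sketch below is an illustration rather than part of the original notes: it estimates $dy/dt$ with a central difference and compares it with $-ky$. The values `y0 = 5.0`, `k = 0.3` and the step `h` are arbitrary assumptions.

```python
import math


def y(t, y0, k):
    """Candidate solution (11.9): y(t) = y0 * exp(-k t)."""
    return y0 * math.exp(-k * t)


def derivative_estimate(t, y0, k, h=1e-6):
    """Central-difference estimate of dy/dt at time t."""
    return (y(t + h, y0, k) - y(t - h, y0, k)) / (2 * h)


y0, k = 5.0, 0.3  # illustrative values, not from the text
for t in (0.0, 1.0, 4.0):
    lhs = derivative_estimate(t, y0, k)  # left-hand side, dy/dt
    rhs = -k * y(t, y0, k)               # right-hand side, -k y
    print(t, lhs, rhs)
```

The two printed columns should agree to many digits at each sampled time. This is a numerical check under the stated assumptions, not a proof.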

11.3.3 The half life

Given a process of exponential decay, we can ask how long it would take for half of the
original amount to remain. Let us recall that the original amount (at time t = 0) is y0 .
Then we are looking for the time $t$ such that $y_0/2$ remains:

\[
y(t) = \frac{y_0}{2}.
\]

We will refer to the value of t that satisfies this as the half life.
Example 11.11 (Half life) Determine the half life in the exponential decay described by
(11.9).


Solution: We compute:

\[
\frac{y_0}{2} = y_0 e^{-kt}
\quad\Rightarrow\quad
\frac{1}{2} = e^{-kt}.
\]

Now taking reciprocals:

\[
2 = \frac{1}{e^{-kt}} = e^{kt}.
\]

Thus we find the same result as in our calculation for doubling times, namely,

\[
\ln(2) = \ln(e^{kt}) = kt,
\]

so that the half life is

\[
\tau = \frac{\ln(2)}{k}.
\]

This is shown in Figure 11.8.

Figure 11.8. Half-life in an exponentially decreasing process.
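The relation between the decay constant and the half life can be wrapped in two small helper functions. This is a minimal sketch; the names `half_life` and `decay_constant` and the sample value `k = 0.25` are illustrative assumptions.

```python
import math


def half_life(k):
    """Half life for decay constant k: ln(2) / k."""
    return math.log(2) / k


def decay_constant(tau):
    """Decay constant for a given half life tau: ln(2) / tau."""
    return math.log(2) / tau


k = 0.25                      # illustrative decay constant
tau = half_life(k)
remaining = math.exp(-k * tau)  # fraction of y0 left after one half life
print(tau, remaining)
```

Since the two formulas are inverses of each other, converting a half life to a decay constant and back returns the starting value (up to rounding).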


Example 11.12 (Chernobyl: April 1986) In 1986 the Chernobyl nuclear power plant exploded, and scattered radioactive material over Europe. Of particular note were the two
radioactive elements iodine-131 (I$^{131}$), whose half-life is 8 days, and cesium-137 (Cs$^{137}$),
whose half-life is 30 years. Use the model for radioactive decay to predict how much of
this material would remain over time.
Solution: We first determine the decay constants for each of these two elements, by noting
that

\[
k = \frac{\ln(2)}{\tau},
\]

and recalling that $\ln(2) \approx 0.693$. Then for I$^{131}$ we have

\[
k = \frac{\ln(2)}{\tau} = \frac{\ln(2)}{8} = 0.0866 \text{ per day}.
\]

This means that for $t$ measured in days, the amount of I$^{131}$ left at time $t$ would be

\[
y_I(t) = y_0 e^{-0.0866t}.
\]

For Cs$^{137}$,

\[
k = \frac{\ln(2)}{30} = 0.023 \text{ per year},
\]

so that for $T$ in years,

\[
y_C(T) = y_0 e^{-0.023T}.
\]
(We have used T rather than t to emphasize that units are different in the two calculations
done in this example.)
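The two decay constants can be reproduced in a few lines. In this sketch the half lives (8 days and 30 years) are taken from the example; the function name `fraction_remaining` and the sampled times are illustrative assumptions.

```python
import math

k_iodine = math.log(2) / 8.0   # per day, half life of 8 days
k_cesium = math.log(2) / 30.0  # per year, half life of 30 years


def fraction_remaining(k, elapsed):
    """Fraction y(t)/y0 left after `elapsed` time units (same units as 1/k)."""
    return math.exp(-k * elapsed)


print(round(k_iodine, 4), round(k_cesium, 3))
# Sample times chosen for illustration only.
for days in (8, 30, 80):
    print("I-131 after", days, "days:", fraction_remaining(k_iodine, days))
for years in (30, 100):
    print("Cs-137 after", years, "years:", fraction_remaining(k_cesium, years))
```

Note that the elapsed time passed to `fraction_remaining` must be in days for iodine and in years for cesium, mirroring the use of $t$ and $T$ above.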
Example 11.13 (Decay to 0.1% of the initial level) How long would it take for I$^{131}$ to decay to 0.1% of its initial level? (Assume that the initial level was just after the explosion at
Chernobyl.)
Solution: We must calculate the time $t$ such that $y_I = 0.001y_0$:

\[
0.001y_0 = y_0 e^{-0.0866t}
\quad\Rightarrow\quad
0.001 = e^{-0.0866t}
\quad\Rightarrow\quad
\ln(0.001) = -0.0866t.
\]

Therefore,

\[
t = \frac{\ln(0.001)}{-0.0866} = \frac{-6.9}{-0.0866} = 79.7 \text{ days}.
\]

Thus it would take about 80 days for the level of iodine-131 to decay to 0.1% of its initial
level.
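The same calculation works for any target fraction. The following sketch solves $e^{-kt} = \text{fraction}$ for $t$; the function name `time_to_fraction` is an illustrative assumption, and the decay constant 0.0866 per day is the rounded value used in the example.

```python
import math


def time_to_fraction(k, fraction):
    """Time t at which exp(-k t) equals `fraction` (0 < fraction <= 1)."""
    return math.log(fraction) / (-k)


t_days = time_to_fraction(0.0866, 0.001)
print(t_days)
```

The printed value is close to 80 days. It can differ slightly in the last digit from the hand calculation, which rounds $\ln(0.001)$ to $-6.9$.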

11.4 Summary and Review


Here is a brief review of what we have seen about differential equations so far:
1. A differential equation is a statement linking the rate of change of some state variable
with current values of that variable. An example is the simplest population growth
model: If $N(t)$ is population size at time $t$:
\[
\frac{dN}{dt} = kN.
\]
2. A solution to a differential equation is a function that satisfies the equation. For
instance, the function $N(t) = Ce^{kt}$ (for any constant $C$) is a solution to the above
unlimited growth model. (We checked this by the appropriate differentiation in a
previous chapter.) Graphs of such solutions (e.g. $N$ versus $t$) are called solution
curves.
3. To select a specific solution, more information is needed: Namely, some starting
value (initial condition) is needed. Given this information, e.g. $N(0) = N_0$, we can
fully characterize the desired solution.
4. So far, we have seen simple differential equations with simple functions for their
solutions. In general, it may be quite challenging to make the connection between the
differential equation (stemming from some application or model) and the solution
(which we want in order to understand and predict the behaviour of the system).


Example 11.14 (Exponential growth, revisited) Characterize the solutions to the exponential growth model with initial condition

\[
\frac{dy}{dt} = y, \qquad y(0) = y_0.
\]

Solution: We know that solutions are $y(t) = y_0 e^{t}$. These are functions that grow with
time, as shown on the left panel in Figure 11.9.
Example 11.15 (Exponential decay) What are solutions to the differential equation and
initial condition

\[
\frac{dy}{dt} = -y, \qquad y(0) = y_0?
\]

Solution: This differential equation has solutions of the form $y(t) = y_0 e^{-t}$, which are
functions that decrease with time. We show some of these on the right panel of Figure 11.9.
(Both graphs were produced with Euler's method and a spreadsheet.)
Figure 11.9. Simple exponential growth and decay. Panel (a): solutions to the differential equation $dy/dt = y$. Panel (b): solutions to the differential equation $dy/dt = -y$. Both panels plot $y$ against time, $t$.


Exercises
11.1. A differential equation is an equation in which some function is related to its own
derivative(s). For each of the following functions, calculate the appropriate derivative, and show that the function satisfies the indicated differential equation.
(a) $f(x) = 2e^{3x}$, $f'(x) = 3f(x)$

(b) $f(t) = Ce^{kt}$, $f'(t) = kf(t)$

(c) $f(t) = 1 - e^{-t}$, $f'(t) = 1 - f(t)$

11.2. Consider the function $y = f(t) = Ce^{kt}$ where $C$ and $k$ are constants. For what
value(s) of these constants does this function satisfy the equation
(a) $dy/dt = 5y$,
(b) $dy/dt = 3y$.

[Remark: an equation which involves a function and its derivative is called a differential equation.]
11.3. Find a function that satisfies each of the following differential equations. [Remark:
all your answers will be exponential functions, but they may have different dependent and independent variables.]
(a) $dy/dt = y$,
(b) $dc/dx = 0.1c$ and $c(0) = 20$,
(c) $dz/dt = 3z$ and $z(0) = 5$.
11.4. If 70% of a radioactive substance remains after one year, find its half-life.
11.5. Carbon 14: Carbon 14 has a half-life of 5730 years. This means that after 5730
years, a sample of Carbon 14, which is a radioactive isotope of carbon will have lost
one half of its original radioactivity.
(a) Estimate how long it takes for the sample to fall to roughly 0.001 of its original
level of radioactivity.
(b) Each gram of $^{14}$C has an activity given here in units of 12 decays per minute.
After some time, the amount of radioactivity decreases. For example, a sample
5730 years old has only one half the original activity level, i.e. 6 decays per
minute. If a 1 g sample of material is found to have 45 decays per hour,
approximately how old is it? (Note: $^{14}$C is used in radiocarbon dating, a
process by which the age of materials containing carbon can be estimated.
W. Libby received the Nobel prize in chemistry in 1960 for developing this
technique.)
11.6. Strontium-90: Strontium-90 is a radioactive isotope with a half-life of 29 years.
If you begin with a sample of 800 units, how long will it take for the amount of
radioactivity of the strontium sample to be reduced to



(a) 400 units
(b) 200 units
(c) 1 unit

11.7. More radioactivity: The half-life of a radioactive material is 1620 years.


(a) What percentage of the radioactivity will remain after 500 years?
(b) Cobalt 60 is a radioactive substance with half life 5.3 years. It is used in
medical application (radiology). How long does it take for 80% of a sample of
this substance to decay?
11.8. Assume the atmospheric pressure $y$ at a height $x$ meters above sea level satisfies
the relation $dy/dx = kx$. If one day at a certain location the atmospheric pressures
are 760 and 675 torr (a unit of pressure) at sea level and at 1000 meters above sea
level, respectively, find the value of the atmospheric pressure at 600 meters above
sea level.
11.9. Population growth and doubling: A population of animals has a per-capita birth
rate of b = 0.08 per year and a per-capita death rate of m = 0.01 per year. The
population density, $P(t)$, is found to satisfy the differential equation
\[
\frac{dP(t)}{dt} = bP(t) - mP(t).
\]
(a) If the population is initially P (0) = 1000, find how big the population will be
in 5 years.
(b) When will the population double?
11.10. Rodent population: The per capita birthrate of one species of rodent is 0.05 newborns per day. (This means that, on average, each member of the population will
result in 5 newborn rodents every 100 days.) Suppose that over the period of 1000
days there are no deaths, and that the initial population of rodents is 250. Write a
differential equation for the population size N (t) at time t (in days). Write down the
initial condition that N satisfies. Find the solution, i.e. express N as some function
of time t that satisfies your differential equation and initial condition. How many
rodents will there be after 1 year?
11.11. Growth and extinction of microorganisms:
(a) The population y(t) of a certain microorganism grows continuously and follows an exponential behaviour over time. Its doubling time is found to be
0.27 hours. What differential equation would you use to describe its growth?
(Note: you will have to find the value of the rate constant, $k$, using the
doubling time.)
(b) With exposure to ultra-violet radiation, the population ceases to grow, and the
microorganisms continuously die off. It is found that the half-life is then 0.1
hours. What differential equation would now describe the population?
11.12. A bacterial population: A bacterial population grows at a rate proportional to the
population size at time $t$. Let $y(t)$ be the population size at time $t$. By experiment
it is determined that the population at $t = 10$ min is 15,000 and at $t = 30$ min it is
20,000.
(a) What was the initial population?
(b) What will the population be at time $t = 60$ min?
11.13. Antibiotic treatment: A colony of bacteria is treated with a mild antibiotic agent so
that the bacteria start to die. It is observed that the density of bacteria as a function of
time follows the approximate relationship $b(t) = 85e^{-0.5t}$ where $t$ is time in hours.
Determine the time it takes for half of the bacteria to disappear. (This is called the
half life.) Find how long it takes for 99% of the bacteria to die.
11.14. Chemical breakdown: In a chemical reaction, a substance S is broken down. The
concentration of the substance is observed to change at a rate proportional to the current concentration. It was observed that 1 Mole/liter of S decreased to 0.5 Moles/liter
in 10 minutes. How long will it take until only 0.25 Moles per liter remain? Until
only 1% of the original concentration remains?
11.15. Two populations: Two populations are studied. Population 1 is found to obey the
differential equation
\[
\frac{dy_1}{dt} = 0.2y_1
\]
and population 2 obeys
\[
\frac{dy_2}{dt} = -0.3y_2
\]
where t is time in years.
(a) Which population is growing and which is declining?
(b) Find the doubling time (respectively half-life) associated with the given population.
(c) If the initial levels of the two populations were $y_1(0) = 100$ and $y_2(0) = 10{,}000$,
how big would each population be at time $t$?
(d) At what time would the two populations be exactly equal?
11.16. The human population: The human population on Earth doubles roughly every 50
years. In October 2000 there were 6.1 billion humans on earth. Determine what the
human population would be 500 years later under the uncontrolled growth scenario.
How many people would have to inhabit each square kilometer of the planet for this
population to fit on earth? (Take the circumference of the earth to be 40,000 km for
the purpose of computing its surface area and assume that the oceans have dried up.)
11.17. First order chemical kinetics: When chemists say that a chemical reaction follows
first order kinetics, they mean that the concentration of the reactant at time $t$, i.e.
$c(t)$, satisfies an equation of the form $dc/dt = -rc$ where $r$ is a rate constant, here
assumed to be positive. Suppose the reaction mixture initially has concentration 1M
(1 molar) and that after 1 hour there is half this amount.
(a) Find the half life of the reactant.
(b) Find the value of the rate constant r.
(c) Determine how much will be left after 2 hours.
(d) When will only 10% of the initial amount be left?


11.18. Fish in two lakes: Two lakes have populations of fish, but the conditions are quite
different in these lakes. In the first lake, the fish population is growing and satisfies
the differential equation
\[
\frac{dy}{dt} = 0.2y
\]
where $t$ is time in years. At time $t = 0$ there were 500 fish in this lake. In the second
lake, the population is dying due to pollution. Its population satisfies the differential
equation
\[
\frac{dy}{dt} = -0.1y,
\]
and initially there were 4000 fish in this lake. At what time will the fish populations
in the two lakes be identical?
11.19. A barrel initially contains 2 kg of salt dissolved in 20 L of water. Water flows in
at the rate of 0.4 L per minute, and the well-mixed salt water solution flows out at the
same rate. How much salt is present after 8 minutes?
11.20. A savings account: You deposit a sum P (the Principal) in a savings account
with an annual interest rate, r and make no withdrawals over the first year. If the
interest is compounded annually, after one year the amount in this account will be
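\[
A(1) = P + rP = P(1 + r).
\]
If the interest is compounded semi-annually (once every 1/2 year), then every 6
months half of the interest is added to your account, i.e.
\[
A\left(\tfrac{1}{2}\right) = P + \frac{r}{2}P = P\left(1 + \frac{r}{2}\right),
\]
\[
A(1) = A\left(\tfrac{1}{2}\right)\left(1 + \frac{r}{2}\right) = P\left(1 + \frac{r}{2}\right)\left(1 + \frac{r}{2}\right) = P\left(1 + \frac{r}{2}\right)^{2}.
\]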

(a) Suppose that you invest $500 in an account with interest rate 4% compounded
semi-annually. How much money would you have after 6 months? After 1
year? After 10 years? Roughly how long does it take to double your money
in this way? How would it differ if the interest was 8%?

(b) Interest can also be compounded more frequently, for example monthly (i.e.
12 times per year, each time with an increment of $r/12$). Answer the questions
posed in part (a) in this case.
(c) Is it better to save your money in a bank with 4% interest compounded monthly,
or 5% interest compounded annually?

Chapter 12

Solving differential equations

12.1 Introduction
In the previous chapters, we were introduced to differential equations. We saw that the verbal descriptions of the rate of change of a process (for example, the growth of a population
or the decay of a radioactive substance) can be expressed in the format of a differential
equation, and that the functions associated with such equations allow us to predict the behaviour of the process over time.
In this chapter, we will develop some of these ideas further. We will explore several
techniques for finding and verifying that a given function is a solution to a differential
equation. We will then examine a simple class of differential equations that have many
applications to processes of production and decay, and find their solutions. Finally we will
show how an approximation method provides for numerical solutions of such problems.

12.2 Given a function, check that it is a solution


In this section we concentrate on analytic solutions to a differential equation. By analytic
solution, we mean a formula in the form y = f (x) that satisfies the given differential
equation. We have already seen in previous chapters that we can check whether a function
satisfies a differential equation.
Suppose we encounter a new differential equation, and we are given a function that is
believed to satisfy that equation. We can always check and verify that this claim is correct
(or find it incorrect) by simple differentiation. Examples in this section show how this is
done.


Section 12.2 Learning goals


1. Given a function, be able to check whether that function does or does not satisfy
a given differential equation.
2. Be able to check whether a given function does or does not satisfy an initial condition.
Example 12.1 Show that the function $y(t) = (2t + 1)^{1/2}$ is a solution to the differential
equation and initial condition

\[
\frac{dy}{dt} = \frac{1}{y}, \qquad y(0) = 1.
\]

Solution: First, we check the derivative, obtaining

\[
\frac{dy(t)}{dt} = \frac{d(2t + 1)^{1/2}}{dt} = \frac{1}{2}(2t + 1)^{-1/2} \cdot 2 = (2t + 1)^{-1/2} = \frac{1}{(2t + 1)^{1/2}} = \frac{1}{y}.
\]

Next, we examine the initial condition, and find that $y(0) = (2 \cdot 0 + 1)^{1/2} = 1^{1/2} = 1$.
Hence the initial condition is also satisfied.
Example 12.2 Consider the differential equation and initial condition

\[
\frac{dy}{dt} = 1 - y, \qquad y(0) = y_0.
\]

Show that the function $y(t) = y_0 e^{-t}$ is not a solution to this differential equation, but that
the function $y(t) = 1 - (1 - y_0)e^{-t}$ is a solution.
Solution: (a) To check whether $y(t) = y_0 e^{-t}$ is a solution, we differentiate this function,
obtaining

\[
\frac{dy}{dt} = \frac{d[y_0 e^{-t}]}{dt} = -y_0 e^{-t} = -y \ne 1 - y.
\]

Thus the function does not satisfy the differential equation.
(b) We check if the second function satisfies the differential equation. We differentiate the function, and get

\[
\frac{dy}{dt} = \frac{d}{dt}\left[1 - (1 - y_0)e^{-t}\right] = -(1 - y_0)\frac{de^{-t}}{dt} = -(1 - y_0)(-e^{-t}) = (1 - y_0)e^{-t}.
\]

But the function is $y(t) = 1 - (1 - y_0)e^{-t}$ so, rearranging this leads to $1 - y(t) = (1 - y_0)e^{-t}$. Hence, we see that

\[
\frac{dy}{dt} = (1 - y_0)e^{-t} = 1 - y(t)
\quad\Rightarrow\quad
\frac{dy}{dt} = 1 - y.
\]

Next, let us show that the initial condition is also satisfied. At time $t = 0$ we have that

\[
y(0) = 1 - (1 - y_0)e^{0} = 1 - (1 - y_0) \cdot 1 = 1 - (1 - y_0) = y_0.
\]

Thus both the differential equation and the initial condition are satisfied.
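The check in Example 12.2 can also be carried out numerically. The sketch below, which is an illustration and not part of the original notes, computes the residual $dy/dt - (1 - y)$ for both candidate functions using a central difference. The name `residual`, the value `y0 = 3.0` and the step `h` are assumptions.

```python
import math


def residual(y, t, h=1e-6):
    """Estimate dy/dt - (1 - y) at time t; it is zero for a true solution."""
    dydt = (y(t + h) - y(t - h)) / (2 * h)
    return dydt - (1 - y(t))


y0 = 3.0  # illustrative initial value


def candidate_a(t):
    """First candidate: y0 * exp(-t), which is not a solution."""
    return y0 * math.exp(-t)


def candidate_b(t):
    """Second candidate: 1 - (1 - y0) * exp(-t), which is a solution."""
    return 1 - (1 - y0) * math.exp(-t)


for t in (0.0, 0.5, 2.0):
    print(t, residual(candidate_a, t), residual(candidate_b, t))
```

For the first candidate the residual stays near $-1$ (since $dy/dt = -y$, the residual is $-y - (1 - y) = -1$), while for the second it is zero up to rounding error.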


Example 12.3 (Height of water draining out of a cylindrical container) A cylindrical container with cross-sectional area $A$ contains water. When a small hole of area $a$ is opened
at its base, the water leaks out. It can be shown that the height of water $h(t)$ in the container
satisfies the differential equation

\[
\frac{dh}{dt} = -k\sqrt{h}, \tag{12.1}
\]

where $k$ is a constant that depends on the size and shape of the cylinder and its hole:
$k = \frac{a}{A}\sqrt{2g}$, where $g$ is the acceleration due to gravity. Show that the function

\[
h(t) = \left(\sqrt{h_0} - k\frac{t}{2}\right)^{2} \tag{12.2}
\]

is a solution to the differential equation (12.1) and the initial condition $h(0) = h_0$.

Solution: By plugging in t = 0, we see that h(0) = h0 in (12.2). Thus, the initial condition
is satisfied. To show that the differential equation (12.1) is satisfied, we differentiate the
function in (12.2):
\[
\frac{dh(t)}{dt} = \frac{d}{dt}\left(\sqrt{h_0} - k\frac{t}{2}\right)^{2}
= 2\left(\sqrt{h_0} - k\frac{t}{2}\right) \cdot \left(-\frac{k}{2}\right)
= -k\left(\sqrt{h_0} - k\frac{t}{2}\right)
= -k\sqrt{h(t)}.
\]

Here we have used the power law and the chain rule, remembering that $h_0$, $k$ are constants.
Now we notice that, using (12.2), the expression for $\sqrt{h(t)}$, multiplied by $-k$, exactly matches what
we have computed for $dh/dt$. Thus, we have shown that the function in (12.2) satisfies
both the initial condition and the differential equation.
Remark: The derivation of the differential equation from physical principles, and the calculation that discovers its solution, are
discussed in a second semester calculus course.
As shown in Examples 12.1- 12.3, if we are told that a function is a solution to a
differential equation, we can check the assertion and verify that it is correct or incorrect.
A much more difficult task is to find the solution of a new differential equation from first
principles. In some cases, the technique of integration, learned in second semester calculus,
can be used. In other cases, some transformation that changes the problem to a more familiar one is helpful. (An example of this type is presented in Section 12.3.1). In many cases,
particularly those of so-called non-linear differential equations, it requires great expertise
and familiarity with advanced mathematical methods to find the solution to such problems
in an analytic form, i.e. as an explicit formula. In such cases, approximation and numerical
methods are helpful.

12.3 Equations of the form $y'(t) = a - by$

In this section we introduce an important class of equations that have many applications in
physics, chemistry, biology, and other situations. All share a similar structure, namely all
are of the form
\[
\frac{dy}{dt} = a - by, \qquad y(0) = y_0. \tag{12.3}
\]


Methods for finding solutions to such differential equations are the same. We first introduce
a simple example of this type and show how a systematic process leads to the solution.
Then we examine a number of interesting applications, which we explore in more detail.

Section 12.3 Learning goals


1. Learn how to reduce a differential equation of the form (12.3) to a simple decay
equation, and thereby find its solution.
2. Understand the example of Newton's Law of Cooling (NLC), which is an equation
of the same type. Be able to find its solution and explain verbally what this solution
implies.
3. Be able to use the solution to NLC to solve problems involving the temperature of a
cooling body over time.
4. Understand a variety of related examples, and be able to use the same methods to
solve and interpret these. (Examples include chemical production and decay, the
velocity of a skydiver, the concentration of drug in the blood, and others).

12.3.1 Reduction to a simpler differential equation


Here we consider how to reduce the equation (12.3) to a simpler one that we already know
how to solve, namely to a decay equation. We start with a simple example, in which the
constants $a$, $b$ in (12.3) are 1. Then we consider the general case.
Example 12.4 Suppose we are given the differential equation

\[
\frac{dy}{dt} = 1 - y, \tag{12.4}
\]

with initial condition $y(0) = y_0$. Determine the solutions to this differential equation.
Solution: We use a simple transformation of the variable to restate (12.4) in a simpler
form. Let $z(t) = 1 - y(t)$. Then the derivatives of $z$ and $y$ are related:

\[
\frac{dz}{dt} = -\frac{dy}{dt}.
\]

But $dy/dt = 1 - y$, so that

\[
\frac{dz}{dt} = -(1 - y) = -z.
\]

The differential equation has been simplified (when written in terms of the variable $z$): It
is just

\[
\frac{dz}{dt} = -z.
\]

Figure 12.1. Solutions to (12.4) are functions that approach the value $y = 1$.

This means that we can write down its solution by inspection, since it has the same form as
the exponential decay equation studied previously:
\[
z(t) = z_0 e^{-t}.
\]
Observe, also, that the initial condition for $y$ implies that at time $t = 0$, we have $z(0) = 1 - y(0) = 1 - y_0$. We now have:
\[
z(t) = (1 - y_0)e^{-t},
\qquad
1 - y(t) = (1 - y_0)e^{-t}.
\]
Finally, rearranging this result, we can arrive at an expression for $y$ which is what we were
looking for originally:
\[
y(t) = 1 - (1 - y_0)e^{-t}.
\]
This is an exact formula that predicts the values of y through time, starting from any initial
value.
Example 12.5 Find the solution of (12.3) using a similar method.
Solution: We define $z(t) = a - by(t)$ and observe that

\[
\frac{dz}{dt} = -b\frac{dy}{dt} = -b(a - by) = -bz.
\]

Furthermore $z(0) = a - by(0) = a - by_0 = z_0$. Hence

\[
z(t) = z_0 e^{-bt} = (a - by_0)e^{-bt}.
\]

We complete the process by going back to the original variable, writing

\[
a - by(t) = (a - by_0)e^{-bt}.
\]

Once we isolate $y(t)$, we have the desired solution,

\[
y(t) = \frac{a}{b} - \left(\frac{a}{b} - y_0\right)e^{-bt}.
\]
See Problem 11 for the detailed steps.
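The general solution found in Example 12.5 can be packaged as a single function and tested against the differential equation. This is a minimal sketch under stated assumptions: the function name `solve_linear` and the sample parameter values are illustrative, and $b$ is assumed to be non-zero.

```python
import math


def solve_linear(a, b, y0, t):
    """y(t) = a/b - (a/b - y0) * exp(-b t), solving dy/dt = a - b*y, y(0) = y0."""
    steady = a / b  # requires b != 0
    return steady - (steady - y0) * math.exp(-b * t)


# With a = b = 1 this reduces to the formula of Example 12.4.
print(solve_linear(1.0, 1.0, 0.25, 0.0))

# Illustrative parameters: a = 2, b = 0.5, y0 = 10.
for t in (0.0, 1.0, 5.0, 50.0):
    print(t, solve_linear(2.0, 0.5, 10.0, t))
```

For positive $b$ the exponential term dies out, so the printed values drift toward $a/b$ (here $2/0.5 = 4$) as $t$ grows.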

12.3.2 Newton's law of cooling


Consider an object at temperature T (t) in an environment whose ambient temperature is E.
Depending on whether the object is cooler or warmer than the environment, the object will
heat up or cool down. From common experience we know that after a long time, we should
find that the temperature of the object will be essentially equal to that of its environment.
Newton formulated a hypothesis to describe the rate of change of temperature. He
assumed that

The rate of change of temperature $T$ of an object is proportional to the difference
between its temperature and the ambient temperature, $E$.

That is, $dT/dt$ is proportional to $(T(t) - E)$, so that

\[
\frac{dT}{dt} = k(E - T(t)), \qquad \text{where } k > 0. \tag{12.5}
\]

Here we have used the proportionality constant $k > 0$ to arrive at the appropriate sign
of the Right Hand Side (RHS). (Otherwise, if the expression on the right were $k(T(t) - E)$,
then the direction of the change would be incorrect: a hotter object would get hotter in a
cold room, etc.) The above differential equation links the current temperature $T(t)$ to its
rate of change. Generally, we are given the temperature at some initial time and desire to
predict T (t) for later time. For example, the initial value may be of the form T (0) = T0 .
Example 12.6 Consider the temperature $T(t)$ as a function of time. Solve the differential
equation for Newton's law of cooling together with the initial condition

\[
\frac{dT}{dt} = k(E - T), \qquad T(0) = T_0.
\]

Solution: As before, we transform the variable to reduce the differential equation to one
that we know how to solve. Let us define $z(t) = E - T(t)$. Then

\[
\frac{dz(t)}{dt} = -kz.
\]


(This is left as an exercise for the reader.) We can also see that $z(0) = E - T(0) = E - T_0$.
Just as in the previous example, when the dust clears, we can find the formula for the
solution, which turns out to be

\[
T(t) = E + (T_0 - E)e^{-kt}. \tag{12.6}
\]

In Figure 12.2 we show a number of the curves that describe this behaviour for five different
starting values of the temperature. (We have set E = 10 and k = 0.2 in this case.) This
family of curves is what we refer to as the solution curves to the differential equation.
Figure 12.2. Temperature versus time for a cooling object


Before moving forward to use our results, we interpret the behaviour of the solutions
described by (12.6).
Example 12.7 Explain in words what the form of the solution (12.6) of Newton's Law of
Cooling implies about the temperature of an object as it warms or cools.
Solution: We make the following remarks.
It is straightforward to verify that the initial temperature is $T(0) = T_0$. The time
dependence of the solutions (12.6) is contained in the term $e^{-kt}$, which is an exponentially decreasing term (since $k > 0$). As time increases, the term $(T_0 - E)e^{-kt}$
continually shrinks, so that as $t \to \infty$, $T \to E$. Thus the temperature of the object
always approaches the ambient temperature as time goes by. This is evident in the
example in Fig. 12.2.
We also observe that the direction of approach (decreasing or increasing) depends
on the sign of the constant $(T_0 - E)$. In the case that $T_0 > E$, the temperature
approaches $E$ from above, whereas if $T_0 < E$, it approaches from below.
In the specific case that $T_0 = E$, there is no change at all. $T = E$ is a solution
to the differential equation that also satisfies $dT/dt = 0$. We refer to such constant
solutions as steady states.
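The three remarks above can be observed by evaluating (12.6) directly. In this sketch the values $E = 10$ and $k = 0.2$ are the ones quoted for Figure 12.2. The five starting temperatures and the sample times are assumptions chosen for illustration, since the notes do not list them.

```python
import math


def temperature(t, T0, E=10.0, k=0.2):
    """Solution (12.6): T(t) = E + (T0 - E) * exp(-k t)."""
    return E + (T0 - E) * math.exp(-k * t)


# Starting temperatures are assumed values, not taken from the figure.
for T0 in (0.0, 5.0, 10.0, 15.0, 20.0):
    samples = [round(temperature(t, T0), 2) for t in (0, 5, 15, 50)]
    print(T0, samples)
```

Rows with $T_0 > E$ decrease toward 10, rows with $T_0 < E$ increase toward 10, and the row with $T_0 = E$ stays at the steady state.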


12.3.3 Using Newton's Law of Cooling to solve a mystery

Now that we have a detailed solution to the differential equation representing Newton's
Law of Cooling, we can apply it to making exact determinations of temperatures over time,
or of the time at which a certain temperature was attained. The following example shows how
the solution can help us with an important determination of time of death.
Example 12.8 (Murder mystery) It is a dark clear night. The air temperature is 10°C.
A body is discovered at midnight. Its temperature is then 27°C. One hour later, the body
has cooled to 24°C. Use Newton's law of cooling to determine the time of death.
Solution: We will assume that the temperature of the person just before death is 37°C, i.e.
normal body temperature in humans. Letting the time of death be $t = 0$, this would mean
that $T(0) = T_0 = 37$. We want to find how much time elapsed until the body was found,
i.e. the value of $t$ at which the temperature of the body was 27°C. We are told that the
ambient temperature is $E = 10$, and we will assume that this was constant over the time
span being considered. Newton's law of cooling states that

\[
\frac{dT}{dt} = k(10 - T).
\]

The solution to this equation is $T(t) = 10 + (37 - 10)e^{-kt}$. At the time of discovery, $T(t) = 27$, so

\[
27 = 10 + 27e^{-kt}, \qquad \text{i.e.} \qquad 17 = 27e^{-kt}.
\]

We do not know the value of the constant $k$, but we have enough information to find it,
since we know that at $t + 1$ (one hour after discovery) the temperature was 24°C, i.e.

\[
T(t + 1) = 10 + (37 - 10)e^{-k(t+1)} = 24,
\qquad
24 = 10 + 27e^{-k(t+1)}.
\]

Thus

\[
14 = 27e^{-k(t+1)}.
\]

We have two separate equations for the two unknowns $t$ and $k$. We can find both
unknowns from these. Taking the ratio of the two equations we obtained we get

\[
\frac{14}{17} = \frac{27e^{-k(t+1)}}{27e^{-kt}} = e^{-k}
\quad\Rightarrow\quad
k = -\ln\left(\frac{14}{17}\right) = 0.194.
\]

Thus we have found the constant that describes the rate of cooling of the body. Now to find
the time we can use

\[
17 = 27e^{-kt}
\quad\Rightarrow\quad
-kt = \ln\left(\frac{17}{27}\right) = -0.4626,
\]

so

\[
t = \frac{0.4626}{k} = \frac{0.4626}{0.194} = 2.384.
\]

Thus the body was discovered 2.384 hours (i.e. 2 hours and 23 minutes) after
death, so death occurred at about 9:37 pm.
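The two-step calculation in Example 12.8 (first $k$ from the ratio of the two equations, then $t$) can be reproduced as follows. The temperatures are those given in the example; the variable names are illustrative assumptions.

```python
import math

E = 10.0        # ambient temperature, degrees C
T_death = 37.0  # assumed body temperature at death
T_found = 27.0  # temperature at discovery (midnight)
T_later = 24.0  # temperature one hour after discovery

# Ratio of the two equations: (T_later - E) / (T_found - E) = exp(-k)
k = -math.log((T_later - E) / (T_found - E))  # per hour

# From (T_found - E) = (T_death - E) * exp(-k t)
t_found = -math.log((T_found - E) / (T_death - E)) / k  # hours since death

hours = int(t_found)
minutes = round((t_found - hours) * 60)
print(k, t_found, hours, minutes)
```

Carrying full precision gives a value of $t$ that differs slightly in the third decimal place from the hand calculation, which uses the rounded $k = 0.194$. The conclusion of about 2 hours and 23 minutes is unchanged.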

12.3.4 Related applications and further examples

Let us return to the general example of the differential equation


\[
\frac{dy}{dt} = a - by \tag{12.7}
\]
where $a, b > 0$ are constants. The initial condition $y(0) = y_0$ then leads to solutions that
we have already seen:
\[
y(t) = \frac{a}{b} - \left(\frac{a}{b} - y_0\right)e^{-bt}. \tag{12.8}
\]
Newton's Law of Cooling is a representative member of this class of differential equations.
This is easily seen by expanding the terms in (12.5): $k(E - T) = kE - kT$ and identifying
the constants with $a = kE$, $b = k$ in (12.7).
Hence, we can summarize the behaviour of solutions (12.8) by analogy to our interpretation of the solutions to Newton's Law of Cooling, namely:
The solutions should all satisfy $y(0) = y_0$. The time dependence of the solutions
is contained in a term $e^{-bt}$, which is an exponentially decreasing term (assuming
$b > 0$). As time increases, $t \to \infty$, $y \to a/b$.
If $y_0 > a/b$, then $y$ approaches $a/b$ from above, whereas if $y_0 < a/b$, it approaches
from below.
In the specific case that initially $y_0 = a/b$, there is no change at all. Thus $y = a/b$ is
a steady state of (12.7).
In this section we describe a few other examples that share the same structure, and
hence a similar kind of dynamics.
Friction and terminal velocity
The velocity of a falling object changes due to the acceleration of gravity, but friction has an
effect of slowing down this acceleration. The differential equation satisfied by the velocity
v(t) of the falling object is
\[
\frac{dv}{dt} = g - kv \tag{12.9}
\]
dt
where g > 0 is acceleration due to gravity and k > 0 is a constant that represents the effect
of friction.
Example 12.9 Use our general results to write down the solution to the differential equation (12.9) for the velocity of a skydiver given the initial condition v(0) = v0 . Interpret
your results in a simple verbal description of what happens over time.
Solution: Identifying v y, g a, k b, v0 y0 , we find the same kind of differential
equation and initial condition. Hence, without further calculation, we can conclude that the
solution of (12.9) together with the initial condition is:

$$v(t) = \frac{g}{k} - \left(\frac{g}{k} - v_0\right) e^{-kt}. \tag{12.10}$$


Then, as before, the velocity is initially v0 , and eventually approaches g/k which is the
steady state or terminal velocity for the object. The object will either slow down (if
v0 > g/k) or speed up (if v0 < g/k) as it approaches this constant velocity.
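As a check on this "without further calculation" step, one can substitute (12.10) back into (12.9). The short verification below is a sketch added for illustration; it uses only the symbols already defined in Example 12.9.

```latex
\begin{align*}
\frac{dv}{dt} &= \frac{d}{dt}\left[\frac{g}{k} - \left(\frac{g}{k} - v_0\right)e^{-kt}\right]
               = k\left(\frac{g}{k} - v_0\right)e^{-kt}, \\
g - kv &= g - k\left[\frac{g}{k} - \left(\frac{g}{k} - v_0\right)e^{-kt}\right]
        = k\left(\frac{g}{k} - v_0\right)e^{-kt}.
\end{align*}
```

The two right-hand sides agree, so (12.9) is satisfied, and setting $t = 0$ in (12.10) gives $v(0) = g/k - (g/k - v_0) = v_0$.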
Production and removal of a substance
An infusion containing a fixed concentration of substance is introduced into a fixed volume. Inside the volume, a chemical reaction results in decay of the substance at a rate
proportional to its concentration. Letting c(t) denote the time-dependent concentration of
the substance, we obtain a differential equation of the form
$$\frac{dc}{dt} = K_{in} - \gamma c \tag{12.11}$$

where $K_{in} > 0$ represents the rate of input of substance and $\gamma > 0$ the decay rate.
Example 12.10 Write down the solution to the differential equation (12.11) given the initial condition c(0) = c0 . Determine the steady state concentration of the substance.
Solution: We can understand the behaviour of these systems by translating our notation
from the general to the specific forms given above. For example,
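$$c(t) \to y(t), \quad K_{in} \to a, \quad \gamma \to b.$$

As before, we can write down the solution:

$$c(t) = \frac{K_{in}}{\gamma} - \left(\frac{K_{in}}{\gamma} - c_0\right) e^{-\gamma t}. \tag{12.12}$$

The steady state concentration is $c = K_{in}/\gamma$, and we expect that all initial chemical concentrations will approach this level as time goes by.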
As we have seen in this section, the behaviour found in the general case can be interpreted in each of the specific situations of interest. This points to one of the powerful
aspects of mathematics, namely the ability to use results in abstract general cases to solve a
variety of seemingly unrelated scientific problems that share the same mathematical structure.

12.4 Euler's Method and numerical solutions


So far, we have explored ways of finding a solution to a differential equation in the form of
an analytic expression, namely a formula for the solution as a function of time. In many
cases, this is difficult without extensive training, or impossible even for experts. Even if
we can find such a solution, it may be inconvenient to determine its numerical values at
arbitrary times, or to interpret its behaviour.
For that reason, we add a method for finding the desired numerical solution using a
technique called Euler's method, based on the fact that derivatives can be approximated by
finite differences.


Consider the general initial value problem (differential equation and initial condition) of the form
$$\frac{dy}{dt} = f(y), \quad y(0) = y_0.$$
Below, we explain how an approximate numerical solution is constructed using Euler's
method.

Section 12.4 Learning goals


1. Understand how Euler's method is based on approximating the derivative by the
slope of a secant line.
2. Understand the idea of a numerical solution and how this compares with an exact or
analytic solution.
3. Be able to use Euler's method to calculate a numerical solution (using calculator,
spreadsheet, or your favorite software) to a given initial value problem.
To set up the recipe for generating successive values of the desired solution, we first
have to pick a step size, $\Delta t$, and subdivide the $t$ axis into discrete steps of that size: we
then have a set of time points $t_1, t_2, \ldots$, spaced $\Delta t$ apart as shown in Figure 12.3. Our
procedure will be to start with the known initial value of $y = y_0$, and use it to generate the
value at the next time point, then the next and so on.

Figure 12.3. The time axis is subdivided into steps of size $\Delta t$.


We will replace the differential equation by an approximating finite difference equation:

$$\frac{dy}{dt} = f(y) \quad \text{approximated by} \quad \frac{y_{k+1} - y_k}{\Delta t} = f(y_k).$$

This approximation is reasonable only for a small time step size $\Delta t$. (In that case, the
derivative is well approximated by the slope of a secant line.) Rearranging this equation
leads to a recipe (also called a recursion relation) linking successive values of the solution:

$$y_{k+1} = y_k + \Delta t \, f(y_k). \tag{12.13}$$

How is this used in practice? We start with the known initial value, $y_0$. Then (using $k = 0$
in (12.13)) we obtain
$$y_1 = y_0 + f(y_0)\Delta t.$$


The quantities on the right are known, so we can compute the value of $y_1$, i.e. the value of
the approximate solution at the time point $t_1$. We can then continue to generate the value
at the next time point in the same way, by approximating the derivative again as a secant
slope. This leads to
$$y_2 = y_1 + f(y_1)\Delta t.$$
The approximation so generated, leading to values $y_1, y_2, \ldots$, is called Euler's method.
Applying this approximation repeatedly leads to the recipe
$$\begin{aligned}
y_1 &= y_0 + f(y_0)\Delta t, \\
y_2 &= y_1 + f(y_1)\Delta t, \\
&\;\;\vdots \\
y_{k+1} &= y_k + f(y_k)\Delta t.
\end{aligned}$$
This iterated technique yields approximate values of the function for as many time
steps as desired, starting from $t = 0$ in increments of $\Delta t$ up to some final time $T$.
It is customary to use the following notation to refer to the true ideal solution and the
one that is actually produced by this approximation method:
• $t_0$ = the initial time point, usually at $t = 0$.
• $h = \Delta t$ = common notations for the step size, i.e. the distance between the points
along the $t$ axis.
• $t_k$ = the $k$th time point. Since the points are just at multiples of the step size that
we have picked, it follows that $t_k = k\Delta t = kh$.
• $y(t)$ = the actual value of the solution to the differential equation at time $t$. This is
usually not known, but in the examples discussed in this chapter, we can solve the
differential equation exactly, so we have a formula for the function $y(t)$. In most
hard scientific problems, no such formula is known in advance.
• $y(t_k)$ = the actual value of the solution to the differential equation at one of the
discrete time points, $t_k$. (Again, not usually known.)
• $y_k$ = the approximate value of the solution obtained by Euler's method. We hope
that this approximate value is fairly close to the true value, i.e. that $y_k \approx y(t_k)$,
but there is always some error in the approximation. More advanced methods that
are specifically designed to reduce such errors are discussed in courses on numerical
analysis.
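The recursion (12.13) translates almost line for line into a short program. The sketch below is one possible way to write it in Python; the function name `euler` and the test equation $f(y) = 1 - y$ with $y(0) = 0$ are illustrative assumptions, not taken from the notes.

```python
def euler(f, y0, dt, n_steps):
    """Return [y_0, y_1, ..., y_n] generated by y_{k+1} = y_k + dt * f(y_k)."""
    ys = [y0]
    for _ in range(n_steps):
        ys.append(ys[-1] + dt * f(ys[-1]))
    return ys

# Hypothetical right-hand side f(y) = 1 - y with y(0) = 0, for illustration only.
approx = euler(lambda y: 1.0 - y, y0=0.0, dt=0.1, n_steps=3)
print(approx)
```

The list index plays the role of $k$, so `approx[k]` is the approximation $y_k$ to $y(t_k)$ at $t_k = k\Delta t$.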

12.4.1 Euler's method applied to population growth


Example 12.11 Apply Euler's method to approximating solutions for the simple exponential growth model that was studied in Chapter 11,
$$\frac{dy}{dt} = ay,$$
(where $a$ is a constant) with initial condition
$$y(0) = y_0.$$
(See Eqn 11.1.)
Solution: Let us subdivide the $t$ axis into steps of size $\Delta t$, starting with $t_0 = 0$, and
$t_1 = \Delta t$, $t_2 = 2\Delta t, \ldots$ From the above discussion, we note that the first value of $y$ is
known to us exactly, namely,
$$y_0 = y(0).$$
We replace the differential equation by the approximation
$$\frac{y_{k+1} - y_k}{\Delta t} = a y_k.$$
Then
$$y_{k+1} = y_k + a\Delta t \, y_k, \quad k = 0, 1, 2, \ldots$$
In particular,
$$\begin{aligned}
y_1 &= y_0 + a\Delta t \, y_0 = y_0(1 + a\Delta t), \\
y_2 &= y_1(1 + a\Delta t), \\
y_3 &= y_2(1 + a\Delta t),
\end{aligned}$$
and so on. At every stage, the quantity on the right hand side depends only on values of $y_k$
that are already known, so that this generates a recipe for moving from the initial value to
successive values of the approximation for $y$.
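Because each step multiplies the previous value by the same factor, the recursion in this example can be unwound into a closed form. The lines below are a short worked step added for illustration, using only the symbols of Example 12.11.

```latex
\begin{align*}
y_1 &= y_0 (1 + a\Delta t), \\
y_2 &= y_1 (1 + a\Delta t) = y_0 (1 + a\Delta t)^2, \\
    &\;\;\vdots \\
y_k &= y_0 (1 + a\Delta t)^k .
\end{align*}
```

In other words, for this particular equation Euler's method produces a geometric sequence with ratio $(1 + a\Delta t)$.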
Example 12.12 Consider the specific problem in which
$$\frac{dy}{dt} = -0.5y, \quad y(0) = 100.$$
Use step size $\Delta t = 0.1$ and Euler's method to approximate the solution for two time steps.

Solution: Euler's method applied to this example (with $a = -0.5$) would lead to
$$\begin{aligned}
y_0 &= 100, \\
y_1 &= y_0(1 + a\Delta t) = 100(1 + (-0.5)(0.1)) = 95, \\
y_2 &= y_1(1 + a\Delta t) = 95(1 + (-0.5)(0.1)) = 90.25,
\end{aligned}$$
and so on.
Clearly, these kinds of repeated calculations are best handled on a spreadsheet or
similar computer software.
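As one illustration of handing the repeated calculation to software, the sketch below redoes Example 12.12 in Python and prints the exact solution $y(t) = 100e^{-0.5t}$ alongside for comparison. The helper name `euler_values` is hypothetical and the code is only one of many ways to organize the computation.

```python
import math

a, dt, y0 = -0.5, 0.1, 100.0         # values from Example 12.12

def euler_values(n_steps):
    """Apply y_{k+1} = y_k * (1 + a*dt) n_steps times, starting from y0."""
    ys = [y0]
    for _ in range(n_steps):
        ys.append(ys[-1] * (1 + a * dt))
    return ys

ys = euler_values(2)
for k, yk in enumerate(ys):
    exact = y0 * math.exp(a * k * dt)    # exact solution y0 * e^(a t) at t = k*dt
    print(f"t = {k * dt:.1f}  Euler: {yk:.4f}  exact: {exact:.4f}")
```

The Euler values 95 and 90.25 fall slightly below the exact values (about 95.12 and 90.48), which shows the small error introduced by each step.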


12.4.2 Euler's method applied to Newton's law of cooling


We apply Euler's method to Newton's Law of Cooling. A motivation for doing so is that
we can directly compare the approximate numerical solution generated by Euler's method
to the true (analytic) solution that we have worked out in this chapter.
Example 12.13 (Newton's law of cooling:) Consider the temperature of an object $T(t)$
in an ambient temperature of $E = 10°$. Assume that $k = 0.2$/min. Use the initial value
problem
$$\frac{dT}{dt} = k(E - T), \quad T(0) = T_0$$
to write down the exact solution (12.6) in terms of the initial value $T_0$.
Solution: In this case, the differential equation has the form
$$\frac{dT}{dt} = 0.2(10 - T),$$
and its true solution (based on previous work) is
$$T(t) = 10 + (T_0 - 10)e^{-0.2t}.$$
Below, we investigate the solutions from several initial conditions, $T(0) = 0, 5, 15, 20$
degrees.
Example 12.14 (Euler's method applied to Newton's law of cooling:) Write down the formula for Euler's method to find an approximate solution for the problem outlined in Example 12.13.
Solution: Euler's method approximates the differential equation by
$$\frac{T_{k+1} - T_k}{\Delta t} = 0.2(10 - T_k),$$
or, in simplified form,
$$T_{k+1} = T_k + 0.2(10 - T_k)\Delta t.$$
Fig. 12.4 illustrates the time-stepping that this formula implies.
Example 12.15 Use the formula from Example 12.14 and time steps of size $\Delta t = 1.0$ to
find the first few values of temperature versus time.
Solution: Note that $\Delta t = 1.0$ is not a small step, and we use it here only to illustrate the
idea. Subdivide the horizontal ($t$) axis into steps of size $\Delta t$, and label the successive time
values as $t_0, t_1, t_2, \ldots, t_n$ where
$$t_0 = 0, \quad t_k = k\Delta t.$$
This is shown in Figure 12.3. Then the initial condition gives us the value $T_0 = T(0)$.
We find the temperatures at the successive times by


Figure 12.4. Using Euler's method to approximate the temperature over time. (Axes: temperature $T$ versus time; marked values $T_0$, $T_1$, $T_2$ and times $t_1$, $t_2$.)

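$$\begin{aligned}
T_1 &= T_0 + 0.2(10 - T_0)\Delta t, \\
T_2 &= T_1 + 0.2(10 - T_1)\Delta t, \\
T_3 &= T_2 + 0.2(10 - T_2)\Delta t, \\
&\;\;\vdots
\end{aligned}$$
By the time we get to the $k$th step, we have:

$$T_{k+1} = T_k + 0.2(10 - T_k)\Delta t.$$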


Again we note that at each step, the right hand side involves a calculation that depends only
on known quantities.
time          approx solution    exact soln
t_k           T_k                T(t)
0.0000        0.0000             0.0000
1.0000        2.0000             1.8127
2.0000        3.6000             3.2968
3.0000        4.8800             4.5119
4.0000        5.9040             5.5067
5.0000        6.7232             6.3212
6.0000        7.3786             6.9881
7.0000        7.9028             7.5340
8.0000        8.3223             7.9810

[Graph: Euler's method and true solution, $\Delta t = 1.0$.]

Table 12.1. Euler's method applied to Newton's law of cooling. The graph shows
the true solution (red) and the approximate solution (black).

In Table 12.1, we show a typical example of the method with initial value $T(0) = T_0 = 0$
and with a (large) step size $\Delta t = 1.0$. The true (red) and approximate (black)
solutions are then shown in the accompanying figure. We show four distinct solutions, each
one representing an experiment with a different initial temperature. (For the approximate
solution, point values are shown at each time step.) The approximate solution is close to,
but not identical to, the true solution.
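The comparison in Table 12.1 can be reproduced with a few lines of code. The following is a minimal sketch in Python using the values $E = 10$, $k = 0.2$, $\Delta t = 1.0$, $T_0 = 0$ from Examples 12.13 to 12.15; the variable names and the layout of the printout are choices made here for illustration.

```python
import math

E, k, dt, T0 = 10.0, 0.2, 1.0, 0.0   # values used in Table 12.1

def exact(t):
    """Exact solution T(t) = E + (T0 - E) * exp(-k t)."""
    return E + (T0 - E) * math.exp(-k * t)

rows = []                             # (t_k, Euler value T_k, exact value T(t_k))
T = T0
for step in range(9):
    t = step * dt
    rows.append((t, T, exact(t)))
    T = T + k * (E - T) * dt          # Euler update T_{k+1} = T_k + k (E - T_k) dt

for t, approx, true in rows:
    print(f"{t:6.4f}  {approx:8.4f}  {true:8.4f}")
```

Rerunning the loop with a smaller value of `dt` (and correspondingly more steps) is an easy way to see the Euler values move closer to the exact ones.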

Exercises
12.1. Consider the differential equation
$$\frac{dy}{dt} = a - by$$
where $a, b$ are constants.
(a) Show that the function
$$y(t) = \frac{a}{b} - Ce^{-bt}$$
satisfies the above differential equation for any constant $C$.
(b) Show that by setting
$$C = \frac{a}{b} - y_0$$
we also satisfy the initial condition
$$y(0) = y_0.$$
Remark: You have now shown that the function
$$y(t) = \left(y_0 - \frac{a}{b}\right)e^{-bt} + \frac{a}{b}$$
is a solution to the initial value problem (i.e. differential equation plus initial
condition)
$$\frac{dy}{dt} = a - by, \quad y(0) = y_0.$$

12.2. Steps in an example: Complete the algebraic steps in Example 12.5 to show that
the solution to Eqn. (12.3) can be obtained by the substitution $z(t) = a - by(t)$.
12.3. Verifying a solution: Show that the function
$$y(t) = \frac{1}{1 - t}$$
is a solution to the differential equation and initial condition
$$\frac{dy}{dt} = y^2, \quad y(0) = 1.$$
Comment on what happens to this solution as $t$ approaches 1.


12.4. For each of the following, show the given function $y$ is a solution to the given differential equation.
(a) $t\dfrac{dy}{dt} = 3y$, $y = 2t^3$.
(b) $\dfrac{d^2y}{dt^2} + y = 0$, $y = 2\sin t + 3\cos t$.
(c) $\dfrac{d^2y}{dt^2} - 2\dfrac{dy}{dt} + y = 6e^t$, $y = 3t^2e^t$.
12.5. Show the function determined by the equation $2x^2 + xy - y^2 = C$, where $C$ is a
constant and $2y \neq x$, is a solution to the differential equation $(x - 2y)\dfrac{dy}{dx} = -4x - y$.
12.6. Find the constant(s) $C$ that satisfy the given initial conditions.
(a) $2x^2 - 3y^2 = C$, $y|_{x=0} = 2$.
(b) $y = C_1e^{5t} + C_2te^{5t}$, $y|_{t=0} = 1$ and $\dfrac{dy}{dt}\Big|_{t=0} = 0$.
(c) $y = C_1\cos(t - C_2)$, $y|_{t=\pi/2} = 0$ and $\dfrac{dy}{dt}\Big|_{t=\pi/2} = 1$.

12.7. Friction and terminal velocity: The velocity of a falling object changes due to the
acceleration of gravity, but friction has an effect of slowing down this acceleration.
The differential equation satisfied by the velocity v(t) of the falling object is
$$\frac{dv}{dt} = g - kv$$
where g is acceleration due to gravity and k is a constant that represents the effect
of friction. An object is dropped from rest from a plane.
(a) Find the function v(t) that represents its velocity over time.
(b) What happens to the velocity after the object has been falling for a long time
(but before it has hit the ground)?
12.8. Alcohol level: Alcohol enters the blood stream at a constant rate k gm per unit time
during a drinking session. The liver gradually converts the alcohol to other, nontoxic byproducts. The rate of conversion per unit time is proportional to the current
blood alcohol level, so that the differential equation satisfied by the blood alcohol
level is
$$\frac{dc}{dt} = k - sc$$
where k, s are positive constants. Suppose initially there is no alcohol in the blood.
Find the blood alcohol level c(t) as a function of time from t = 0, when the drinking
started.
12.9. Newton's Law of Cooling: Newton's Law of Cooling states that the rate of change
of the temperature of an object is proportional to the difference between the temperature of the object, $T$, and the ambient (environmental) temperature, $E$. This leads
to the differential equation
$$\frac{dT}{dt} = k(E - T)$$
where $k > 0$ is a constant that represents the material properties and $E$ is the
ambient temperature. (We will assume that $E$ is also constant.)
(a) Show that the function
$$T(t) = E + (T_0 - E)e^{-kt},$$
which represents the temperature at time $t$, satisfies this equation.
(b) The time of death of a murder victim can be estimated from the temperature of
the body if it is discovered early enough after the crime has occurred. Suppose
that in a room whose ambient temperature is $E = 20$°C, the temperature
of the body upon discovery is $T = 30$°C, and that a second measurement,
one hour later, is $T = 25$°C. Determine the approximate time of death. (You
should use the fact that just prior to death, the temperature of the victim was
37°C.)
12.10. A cup of coffee: The temperature of a cup of coffee is initially 100 degrees C. Five
minutes later, (t = 5) it is 50 degrees C. If the ambient temperature is A = 20
degrees C, determine how long it takes for the temperature of the coffee to reach 30
degrees C.
12.11. Newton's Law of Cooling applied to data: The following data was gathered in
producing Fig. 2.1 for cooling milk during yoghurt production. According to Newton's Law of Cooling, this data can be described by the formula
$$T = E + (T(0) - E)e^{-kt},$$
where $T(t)$ is the temperature of the milk (in degrees Fahrenheit) at time $t$ (in min),
$E$ is the ambient temperature, and $k$ is some constant that we will determine in this
problem.

time (min)    Temp
0.0           190.0
0.5           185.5
1.0           182.0
1.5           179.2
2.0           176.0
2.5           172.9
3.0           169.5
3.5           167.0
4.0           164.6
4.5           162.2
5.0           159.8

(a) Rewrite this relationship in terms of the quantity $Y(t) = \ln(T(t) - E)$, and
show that $Y(t)$ is related linearly to the time $t$.
(b) Explain how the constant $k$ could be found from this converted form of the
relationship.
(c) Use the data in the table and your favorite spreadsheet (or similar software) to
show that the data so transformed appears to be close to linear. Assume that
the ambient temperature was $E = 20$°F.
(d) Use the same software to determine the constant $k$ by fitting a line to the transformed data.
12.12. Lake Fishing: Fish Unlimited is a company that manages the fish population in a
private lake. They restock the lake at a constant rate (to restock means to add fish to
the lake). $N$ fishers are allowed to fish in the lake per day. The population of fish in
the lake, $F(t)$, is found to satisfy the differential equation
$$\frac{dF}{dt} = I - \alpha N F \tag{12.14}$$
(a) At what rate is fish added per day according to Eqn. (12.14)? (Give value and
units.) What is the average number of fish caught by one fisher? (Give value
and units.) What is being assumed about the fish birth and mortality rates in
Eqn. (12.14)?
(b) If the fish input and number of fishers are constant, what is the steady state
level of the fish population in the lake?
(c) At time $t = 0$ the company stops restocking the lake with fish. Write down the
revised form of the differential equation (12.14) that takes this into account.
(Assume the same level of fishing as before.) How long would it take for the
fish to fall to 25% of their initial level?
(d) When the fish population drops to the level $F_{low}$, fishing is stopped and the
lake is restocked with fish at the same constant rate (Eqn. (12.14), with $\alpha = 0$).
Write down the revised version of (12.14) that takes this into account. How
long would it take for the fish population to double?
12.13. Glucose solution in a tank: A tank that holds 1 liter is initially full of plain water.
A concentrated solution of glucose, containing 0.25 gm/cm³, is pumped into the
tank continuously, at the rate 10 cm³/min, and the mixture (which is continuously
stirred to keep it uniform) is pumped out at the same rate. How much glucose will
there be in the tank after 30 minutes? After a long time? (Hint: write a differential
equation for c, the concentration of glucose in the tank by considering the rate at
which glucose enters and the rate at which glucose leaves the tank.)
12.14. Pollutant in a lake:
(From the Dec 1993 Math 100 Exam) A lake of constant volume V gallons contains
Q(t) pounds of pollutant at time t evenly distributed throughout the lake. Water
containing a concentration of k pounds per gallon of pollutant enters the lake at a
rate of r gallons per minute, and the well-mixed solution leaves at the same rate.
(a) Set up a differential equation that describes the way that the amount of pollutant in the lake will change.
(b) Determine what happens to the pollutant level after a long time if this process
continues.
(c) If k = 0 find the time T for the amount of pollutant to be reduced to one half
of its initial value.
12.15. A sugar solution: Sugar dissolves in water at a rate proportional to the amount of
sugar not yet in solution. Let Q(t) be the amount of sugar undissolved at time t.
The initial amount is 100 kg and after 4 hours the amount undissolved is 70 kg.
(a) Find a differential equation for Q(t) and solve it.
(b) How long will it take for 50 kg to dissolve?


12.16. Leaking water tank: A cylindrical tank with cross-sectional area $A$ has a small
hole through which water drains. The height of the water in the tank $y(t)$ at time $t$
is given by:
$$y(t) = \left(\sqrt{y_0} - \frac{kt}{2A}\right)^2$$
where $k, y_0$ are constants.
(a) Show that the height of the water, $y(t)$, satisfies the differential equation
$$\frac{dy}{dt} = -\frac{k}{A}\sqrt{y}.$$
(b) What is the initial height of the water in the tank at time $t = 0$?
(c) At what time will the tank be empty?
(d) At what rate is the volume of the water in the tank changing when $t = 0$?
12.17. Find those constants $a, b$ so that $y = e^x$ and $y = e^{-x}$ are both solutions of the
differential equation
$$y'' + ay' + by = 0.$$
12.18. Let $y = f(t) = e^{-t}\sin t$, $-\infty < t < \infty$.

(a) Show that $y$ satisfies the differential equation $y'' + 2y' + 2y = 0$.

(b) Find all critical points of $f(t)$.


Chapter 13

Qualitative methods for differential equations

Not all differential equations are easily solved analytically. Furthermore, even when we
find the analytic solution, it is not always easy to interpret, graph, or understand. This
motivates a number of qualitative methods that lead us to an overall understanding of the
behaviour directly from information contained in the differential equation, without the challenges of finding a full functional form of the solution. In this chapter we will expand our
familiarity with differential equations and assemble new techniques for understanding
them. When these equations are nonlinear, i.e. when the function $f(y)$ in
$$\frac{dy}{dt} = f(y)$$
is not a simple linear function of $y$, it can be quite challenging to discover analytic
solutions. We will therefore rely on qualitative methods. Geometric techniques will form the
core of the concepts discussed here.

13.1 Linear and nonlinear differential equations

Section 13.1 Learning goals


1. Understand the distinction between unlimited and density dependent population
growth. Be able to explain terms in the logistic equation in its original (13.1) and
rescaled (13.2) versions.
2. Be able to state the definition of a linear differential equation.
3. Understand the law of mass action, and be able to derive simple differential equations
for interacting species based on this law.

In our previous model for population growth, in Chapter 11, we encountered the
differential equation
$$\frac{dN}{dt} = kN,$$
where N (t) is population size at time t and k is a constant per capita growth rate. This
differential equation, as we have seen, has exponential solutions, which means that only
two possible behaviours are obtained: explosive growth if k > 0 or extinction if k < 0.
But this is unrealistic. Most natural populations do not grow indefinitely in an explosive,
exponential way. Due to limited resources or competition for territory, eventually the population may attain some static level rather than expanding continually. This motivates a
revision of our previous model to depict density dependent growth.

13.1.1 The logistic equation for population growth


Let $N(t)$ represent the size of a population at time $t$, as before. Consider the differential
equation
$$\frac{dN}{dt} = rN\frac{(K - N)}{K}. \tag{13.1}$$
We call this differential equation the logistic equation. Here the parameter $r$ is the intrinsic growth rate and $K$ is the carrying capacity. Both $r, K$ are assumed to be positive
constants for a given population in a given environment. The logistic equation has a long
history in modelling population growth of microorganisms, animals, and human populations.
We can also write the logistic equation in the form
$$\frac{dN}{dt} = \left[r\frac{(K - N)}{K}\right]N.$$
Then the term $R(N) = r(K - N)/K$ is a so-called density dependent growth rate. (It replaces the previously assumed constant
growth rate $k$, which leads to unlimited growth.)
We later show yet another interpretation that involves hostile interactions between
individuals in the population.

13.1.2 Linear versus nonlinear


The logistic equation introduces the first example of a nonlinear differential equation.
We explain the distinction and why it matters here.
Definition 13.1 (Linear differential equation). A first order differential equation is said
to be linear if it is a linear combination of terms of the form
$$\frac{dy}{dt}, \quad y, \quad 1,$$
that is, it can be written in the form
$$\alpha\frac{dy}{dt} + \beta y + \gamma = 0$$
where $\alpha, \beta, \gamma$ do not depend on $y$. (First order means that only up to the first derivative
occurs in the equation.)
So far, we have seen several examples of this type with constant coefficients $\alpha, \beta, \gamma$.
For example, $\alpha = 1$, $\beta = b$, $\gamma = -a$ in Eqn. (12.7) of Section 12.3, whereas $\alpha = 1$, $\beta = -k$, $\gamma = 0$ in Eqn. 11.1.
Any differential equation not of this simple form is said to be nonlinear.
Example 13.2 (Linear versus nonlinear differential equations) Which of the following
differential equations are linear and which are nonlinear?
(a) $\dfrac{dy}{dt} = y^2$, (b) $\dfrac{dy}{dt} - y = 5$, (c) $y\dfrac{dy}{dt} = 1$.

Solution: Any term of the form $y^2$, $\sqrt{y}$, $1/y$, etc. is nonlinear in $y$. A product $y\dfrac{dy}{dt}$ is also
nonlinear. Hence equations (a), (c) are nonlinear, while (b) is linear.
The significance of the distinction between linear and nonlinear differential equations
is that nonlinearities make it much harder to systematically find a solution to the given differential equation by analytic methods. Most linear differential equations have solutions
that are made of exponential functions or expressions involving such functions. This is not
true for nonlinear equations. However, as we will see shortly, geometric methods become
very helpful in understanding the behaviour of such nonlinear differential equations.

13.1.3 Law of mass action

Nonlinear terms in differential equations arise in various ways. One common source is
interaction between individuals or particles that affects their state. Here is a simple example
of this type.
Consider a chemical reaction in which molecules of type A bind to those of type B to
react chemically and form some product P. Suppose we start out with a test-tube containing
a mixture of A and B molecules at concentrations a(t), b(t). These concentrations depend
on time because the chemical reaction will use up both types in producing the product.
What can we say about the rate of the reaction? First, we note that the reaction only
occurs when A and B collide and interact. This happens randomly, but clearly the more
A is present, the more likely are such collisions, and similarly for B. Hence the rate of
reaction should be faster if the concentration a(t) is higher, and/or if the concentration b(t)
is higher. The simplest assumption that captures both of these ideas is
$$\text{rate of reaction} \propto ab \quad \Rightarrow \quad \text{rate of reaction} = k\,ab$$

where $k$ is some constant that represents the reactivity of the molecules.

We can formally state this result, known as the Law of Mass Action, as follows:

The Law of Mass Action: The rate of a chemical reaction involving an interaction of two
or more chemical species is proportional to the product of the concentrations of the given
species.


Example 13.3 (Differential equation for interacting chemicals) In a 1 litre chemical reactor, substance A is added at a constant rate of $I$ moles per hour. There, pairs
of molecules of A interact chemically to form some product. Assuming that the volume
does not change, write down a differential equation that keeps track of the concentration of A
in the reactor, $y(t)$.
Solution: First suppose that there is no reaction. Then the addition of A to the reactor at a
constant rate would lead to changing $y(t)$, which would satisfy the differential equation
$$\frac{dy}{dt} = I.$$
When the chemical reaction takes place, there is a depletion of A which depends on interaction of pairs of molecules. According to the law of mass action, such a term would
be of the form $k\,y \cdot y = ky^2$. This reduces the concentration of A, so it contributes a
negative rate of change, hence
$$\frac{dy}{dt} = I - ky^2.$$
This is a nonlinear differential equation, as it contains a term of the form $y^2$.
Example 13.4 (Logistic equation reinterpreted) Rewrite the logistic equation in the form
$$\frac{dN}{dt} = rN - bN^2$$
(where $b = r/K$ is a positive quantity). Interpret the meaning of this restated form of
the equation by explaining what each of the terms on the right hand side could represent.
Which of the two terms would be most significant for small versus for large population
levels?
Solution: This form of the equation has a linear growth term $rN$, which we have encountered before in exponentially growing populations. However, there is also a quadratic
(nonlinear) rate of loss (note minus sign) $-bN^2$. This term could describe interactions
between individuals that lead to mortality, e.g. through fighting or competition. From
familiarity with power functions ($N$, $N^2$) we can deduce that the linear term dominates for small values of $N$, while the quadratic term dominates for larger values of $N$. This means that when the population is crowded (specifically, when $N > K$), the loss
of individuals is greater than the rate of reproduction.
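To make the comparison of the two terms concrete, the sketch below evaluates $rN$ and $bN^2$ at a few population sizes. It is an illustration only: the function name `logistic_terms` and the values $r = 0.5$, $K = 1000$ are hypothetical and do not come from the notes.

```python
def logistic_terms(N, r, K):
    """Return (growth, loss) = (r*N, b*N**2) with b = r/K."""
    b = r / K
    return r * N, b * N**2

# Hypothetical parameter values, chosen only for illustration.
r, K = 0.5, 1000.0
for N in (10.0, 500.0, 1000.0, 2000.0):
    growth, loss = logistic_terms(N, r, K)
    print(N, growth, loss, growth - loss)
```

The last printed column is $dN/dt$: it is positive while $N < K$, zero at $N = K$ where the two terms balance, and negative once $N > K$.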

13.1.4 Scaling the variable can simplify the ODE


It is often desirable to formulate a differential equation in the simplest possible terms. We
can do this by a process called rescaling. For example, the logistic equation (13.1) contains
two constants, r and K. Since units on each side of an equation must balance, and must be
the same for terms that are added or subtracted, we can infer that K has the same units as
N , and indeed, that it is a population density (at which the growth rate dN/dt = 0). By
redefining the dependent variable in terms of this constant reference population level, we
can simplify the equation and leave a single constant, as shown in the next example.


Example 13.5 (Rescaling:) Define a new variable
$$y(t) = \frac{N(t)}{K}.$$
Interpret what this variable represents and show that the logistic equation can be written
in a simpler form in terms of this variable.

Solution: The rescaled variable, $y(t)$, is a population density expressed in units of the
carrying capacity. (For example, if the environment can sustain 1000 individuals, and the
current population size is $N = 950$, then the value of $y$ is $y = 0.950$.) Since $K$ is assumed
constant,
$$\frac{dy}{dt} = \frac{1}{K}\frac{dN}{dt}$$
and we can simplify the equation:
$$\frac{dy}{dt} = ry(1 - y). \tag{13.2}$$
We observe that indeed, this equation looks simpler and also has only one constant parameter left in it. It is generally the case that rescaling reduces the number of parameters in
a differential equation, as seen here.
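The simplification step in this example can be written out explicitly. The lines below are a worked sketch added for illustration, starting from the logistic equation (13.1) and using only $y = N/K$.

```latex
\begin{align*}
\frac{dy}{dt} = \frac{1}{K}\frac{dN}{dt}
  &= \frac{1}{K}\, rN\,\frac{(K - N)}{K} \\
  &= r\,\frac{N}{K}\left(1 - \frac{N}{K}\right) \\
  &= r\,y\,(1 - y).
\end{align*}
```

The carrying capacity $K$ has been absorbed into the variable $y$, which is why only $r$ remains as a parameter.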

13.2 The geometry of change


In this section, we turn to some new methods for understanding differential equations,
using graphical and geometric arguments that avoid the need for formulas. We resort to
concepts learned much earlier in this course: the derivative as a slope of a tangent line, in
order to use the differential equation itself to assemble a sketch of its predicted behaviour.
That is, rather than writing down y = F (t) as a solution to the differential equation (and
then graphing that function) we sketch the qualitative behaviour of such solution curves
directly from information contained in the differential equation.


Section 13.2 Learning goals


1. Understand the idea of a slope field of a differential equation. Given a differential
equation (linear or nonlinear), be able to construct such a diagram and use it to sketch
solution curves.
2. Understand the idea of a state-space diagram, and be able to construct such a diagram and use it to interpret the behaviour of solution curves to a given differential
equation.
3. Understand the relationships between the slope field, the state-space diagram, and families of solution curves to a given differential equation.
4. Be able to identify steady states of a differential equation and determine whether they
are stable or unstable.
5. Given a differential equation and initial condition, be able to predict the behaviour
of the solution for t > 0.

13.2.1 Slope fields


Here we discuss a geometric way of understanding what a differential equation is saying
using a slope field, also called a direction field. We have already seen that solutions to a
differential equation of the form
$$\frac{dy}{dt} = f(y)$$
are curves in the $(t, y)$ plane that describe how $y(t)$ changes over time. (Thus, these curves
are graphs of functions of time.) Each initial condition $y(0) = y_0$ is associated with one
of these curves, so that together, these curves form a family of solutions. What do these
curves have in common geometrically?
Simply stated, the slope of the tangent line (which is just $dy/dt$) at any point on any
of the curves has to be related to the value of the $y$ coordinate of that point, as specified by the
function $f(y)$. That is exactly what the differential equation is saying: at any point $(t, y(t))$
on a solution curve, the tangent line must have slope $f(y)$, which depends only on the $y$
value, and not on the time $t$ (see footnote 16). By sketching slopes at various values of $y$, we obtain the
slope field from which we can get a reasonable idea of the behaviour of the solutions to the
differential equation.
Example 13.6 Consider the differential equation
\[ \frac{dy}{dt} = 2y. \tag{13.3} \]

Compute some of the slopes for various y values and use this to sketch a slope field for the
differential equation (13.3).
16 In more general cases, the expression f (y) that appears in the differential equation might depend on t as well
as y. For the purpose of this course, we will not consider such examples in detail.

13.2. The geometry of change

Solution: Equation (13.3) states that, if a solution curve passes through a point (t, y), then
its tangent line at that point has a slope 2y, regardless of the value of t. This example is
simple enough that we can state the following: for positive values of y, the slope is positive,
for negative values of y, the slope is negative, and for y = 0 the slope is zero. We provide
some tabulated values of y indicating the values of the slope f (y), its sign, and what this
implies about the local behaviour of the solution and its direction. Then, in Figure 13.1
we combine this information to generate the direction field and the corresponding solution
curves. Note that the direction of the arrows (rather than their absolute magnitude) provides
the most important qualitative tendency for the slope field sketch.
y     f(y) = 2y   slope of tangent line   behaviour of y    direction of arrow
-2    -4          -ve                     decreasing        ↘
-1    -2          -ve                     decreasing        ↘
 0     0           0                      no change in y    →
 1     2          +ve                     increasing        ↗
 2     4          +ve                     increasing        ↗

Table 13.1. Table of derivatives and slopes for the differential equation (13.3) of
Example 13.6.

Figure 13.1. Direction field and solution curves for Example 13.6 (panels (a) and (b)).
In constructing the slope field and solution curves, the following basic rules should
be followed:
1. By convention, time flows from left to right along the t axis in our graphs, so the
direction of all arrows (not indicated explicitly on the slope field) is always from left
to right.
2. According to the differential equation, for any given value of the variable y, the slope
is given by the expression f (y) in the differential equation. The sign of that quantity
is particularly important in determining whether the solution is locally increasing,
decreasing, or neither. In the tables, we indicate this in the last column with the
notation ↗, ↘, or →.

3. There is a single arrow at any point in the ty plane, and consequently solution curves
cannot intersect anywhere (although they can get arbitrarily close to one another).
We will see some implications of these rules in our examples.
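The tabulation used in Example 13.6 can be reproduced with a few lines of code. The following is a minimal Python sketch, not part of the original notes; the function name `slope_table` and the text labels are illustrative assumptions. For each chosen value of y it evaluates f(y) and records whether a solution passing through that height is locally increasing, decreasing, or unchanging.

```python
def slope_table(f, y_values):
    """Tabulate slope and local behaviour for dy/dt = f(y).

    Hypothetical helper: returns one (y, slope, behaviour) tuple per y value.
    """
    rows = []
    for y in y_values:
        slope = f(y)  # slope of the tangent line at height y, independent of t
        if slope > 0:
            behaviour = "increasing"
        elif slope < 0:
            behaviour = "decreasing"
        else:
            behaviour = "no change"
        rows.append((y, slope, behaviour))
    return rows


# Example 13.6: dy/dt = 2y
rows = slope_table(lambda y: 2 * y, [-2, -1, 0, 1, 2])
for y, slope, behaviour in rows:
    print(f"y = {y:2d}   f(y) = {slope:2d}   {behaviour}")
```

Because f depends only on y, one row of this table describes every arrow drawn at that height in the slope field, whatever the value of t.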
Example 13.7 Consider the differential equation
\[ \frac{dy}{dt} = f(y) = y - y^3. \]
Create a slope field diagram for this differential equation.

y         sign of f(y) = y - y^3   behaviour of y    direction of arrow
y < -1    +ve                      increasing        ↗
-1        0                        no change in y    →
-0.5      -ve                      decreasing        ↘
0         0                        no change in y    →
0.5       +ve                      increasing        ↗
1         0                        no change in y    →
y > 1     -ve                      decreasing        ↘

Table 13.2. Table for Example 13.7.

Figure 13.2. Figure for Example 13.7 (panels (a) and (b)).


Solution: Based on the last example, we will pay attention to the sign, rather than the value,
of the derivative f (y), since that sign determines whether the solutions increase, decrease,
or stay constant. To determine the sign of f (y) it can help to factor the expression:
\[ \frac{dy}{dt} = f(y) = y - y^3 = y(1 - y^2) = y(1 + y)(1 - y). \]
The sign of f is hence determined by the signs of the factors y, (1 + y), (1 - y). Clearly,
f (y) = 0 at three points, y = 0, ±1. To the left of all three (for y < -1), two factors,
y and (1 + y), are negative, whereas (1 - y) is positive, so that the product is positive overall.
The sign of f (y) changes at each of the three points y = 0, ±1, where one or another of
the three factors changes sign. Eventually, to the right of all three
(when y > 1), the sign is negative. We summarize these observations in Table 13.2 and
show the slope field and solution curves in Fig 13.2.
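The sign analysis above can be checked numerically by evaluating f at one test point in each interval between consecutive zeros. The sketch below is illustrative only: the helper name `sign_chart` and the choice of test points are assumptions, and it relies on f being continuous with all of its zeros supplied.

```python
def sign_chart(f, zeros):
    """Sign of f on each open interval cut out by its zeros.

    Assumes f is continuous and `zeros` lists every zero of f, so the sign
    cannot change inside an interval. Returns (test_point, sign) pairs.
    """
    zeros = sorted(zeros)
    # One test point per interval: left of all zeros, between neighbours,
    # and right of all zeros.
    test_points = [zeros[0] - 1.0]
    test_points += [(a + b) / 2 for a, b in zip(zeros, zeros[1:])]
    test_points.append(zeros[-1] + 1.0)
    return [(p, "+ve" if f(p) > 0 else "-ve") for p in test_points]


# Example 13.7: f(y) = y - y^3 with zeros at y = -1, 0, 1
chart = sign_chart(lambda y: y - y**3, [-1.0, 0.0, 1.0])
for point, sign in chart:
    print(f"f({point:+.1f}) is {sign}")
```

The four signs alternate from left to right, matching the non-zero rows of Table 13.2.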
Example 13.8 Sketch a slope field and solution curves for the problem of a cooling object,
and specifically for
\[ \frac{dT}{dt} = f(T) = 0.2(10 - T). \tag{13.4} \]

T         sign of f(T) = 0.2(10 - T)   behaviour of T   direction of arrow
T < 10    +ve                          increasing       ↗
T = 10    0                            no change        →
T > 10    -ve                          decreasing       ↘

Table 13.3. Slopes for Example 13.8.

Figure 13.3. Slope field for a cooling object of Example 13.8. Panels (a) and (b); vertical axis: temperature.


Solution: The curves shown in Figure 12.2 are solution curves T (t) for this equation, and
the function f (T ) = 0.2(10 - T ) gives the slope of the tangent lines to
those curves. In Table 13.3, we tabulate the sign of the derivative f (T ) =
0.2(10 - T ) for temperatures below, equal to, and above 10. The slope field is then shown in
Figure 13.3(a) with solution curves in (b).

13.2.2 State-space diagrams


In Examples 13.6-13.8, we have already seen that we can understand qualitative features
of solutions to the differential equation
\[ \frac{dy}{dt} = f(y) \tag{13.5} \]

by examining the expression f (y) in this equation. Up to now, we used the sign of f (y)
to assemble a slope field diagram and sketch solution curves. The slope field informed us
about which initial values of y would increase, decrease or stay constant. We next show
another way of determining the same information. First, let us define a state space, also
called a phase line, which is essentially the y axis with arrows to denote the direction of flow
and points at which y is static.
Definition 13.9 (State space (or Phase line)). A line representing the dependent variable
(y) together with arrows to describe the flow along that line (increasing or decreasing
y) satisfying (13.5) is called the state space diagram or the phase line diagram for the
differential equation.
Rather than tabulating signs for f (y), we could arrive at similar conclusions by
sketching f (y) and observing where this function is positive (implying that y increases)
or negative (y decreases). Places where f (y) = 0 (zeros of f ) are important boundaries
between such regimes and also important in their own right for signifying static solutions
(no change in y). Along the y axis (which is now on the horizontal axis of the sketch) increasing y means motion to the right, decreasing y means motion to the left.
As we shall see, the information contained in this type of diagram provides a qualitative
description of solutions to the differential equation, but with the explicit time behaviour
suppressed. This is illustrated by Fig. 13.4, where we show the connection between the
slope field diagram and the state space diagram for a typical differential equation.
Example 13.10 Consider the differential equation
\[ \frac{dy}{dt} = f(y) = y - y^3. \tag{13.6} \]

Sketch f (y) versus y and use your sketch to determine where y is static, and where y
increases or decreases. Then describe in words what happens in case the initial condition
is (i) y(0) = 0.5, (ii) y(0) = 0.3, or (iii) y(0) = 2.
Solution: From a previous example, we know that f (y) = 0 for y = -1, 0, 1. This
means that y does not change at these values, i.e. if we start a system off with y(0) = 0, or
y(0) = ±1, the value of y will be static. The three places at which this happens are marked
by heavy dots in Figure 13.5(a).
We also see that f (y) < 0 for -1 < y < 0 and for y > 1. This means that the rate of
change of y is negative whenever -1 < y < 0 or y > 1, which, in turn, implies that if the
value of y(t) falls in either of these intervals at any time t, then y(t) must be a decreasing
function of time. On the other hand, for 0 < y < 1 or for y < -1, we have f (y) > 0, so
y(t) is increasing. See arrows on Figure 13.5(b). We see from the directions marked that
there is a tendency for y to move away from the value y = 0 and to approach either of the
values -1 or 1 as time goes by. Starting from the initial values given above, we have (i)
y(0) = 0.5 results in y → 1, (ii) y(0) = 0.3 leads to y → 1, and (iii) y(0) = 2 implies
y → 1.

Figure 13.4. The relationship of the slope field and state space diagrams. (a) A
typical slope field. A few arrows have been added to indicate the direction of time flow
along the tangent vectors. Now consider looking down the time axis as shown by the
eye in this diagram. Then the t axis points towards us, and we see only the y axis
as in (b). Arrows on the y axis indicate the directions of flow for various values of y
as determined in (a). Now rotate the y axis so it is horizontal, as shown in (c). The
direction of the arrows exactly corresponds to places where f (y), in (c), is positive (which
implies increasing y, →), or negative (which implies decreasing y, ←). The state space
diagram is the y axis in (b) or (c).
Example 13.11 (A cooling object:) Sketch the same type of diagram for the problem of
a cooling object and interpret its meaning.
Solution: Here, the differential equation is
\[ \frac{dT}{dt} = f(T) = 0.2(10 - T). \]
The function f (T ) = 0.2(10 - T ) is the rate of change associated with a given
temperature T . A sketch of the rate of change, f (T ), versus the temperature T is shown in
Figure 13.6(a).


Figure 13.5. Static points and intervals for which y increases or decreases for the
differential equation (13.6). Panels (a) and (b) show f (y). See Example 13.10.

Figure 13.6. (a) Figure for Example 13.11: f (T ) versus T , with the point E marked on the T axis.
(b) Qualitative sketch for Eqn. (13.7) in Example 13.12: f (y) versus y, with the point a/b marked on the y axis.

Example 13.12 Create a similar qualitative sketch for the more general form of linear
differential equation
\[ \frac{dy}{dt} = f(y) = a - by. \tag{13.7} \]
For what values of y would there be no change?

Solution: The rate of change of y is given by the function f (y) = a - by. This is shown in
the sketch in Figure 13.6(b). We see that there is one point at which f (y) = 0, namely at
y = a/b. Starting from an initial condition y(0) = a/b, there would be no change. We also
see from this figure (where the line slopes downward, that is, b > 0) that y approaches this
value over time. After a long time, the value of
y will be approximately a/b.
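The conclusion of Example 13.12 can be checked numerically. The sketch below is an illustration under stated assumptions: the constants a = 2 and b = 0.5 are hypothetical (any b > 0 would do), and `exact_solution` uses the formula given in the Remark of Exercise 13.1. It confirms that the rate of change is positive below a/b, negative above it, and that solutions from several starting values end up near a/b.

```python
import math


def exact_solution(t, y0, a, b):
    """Solution of dy/dt = a - b*y with y(0) = y0 (formula from Exercise 13.1)."""
    return (y0 - a / b) * math.exp(-b * t) + a / b


a, b = 2.0, 0.5          # hypothetical constants, chosen with b > 0
y_star = a / b           # the value of y at which f(y) = a - b*y = 0


def f(y):
    return a - b * y


rate_below = f(y_star - 1.0)   # positive: y increases towards a/b
rate_above = f(y_star + 1.0)   # negative: y decreases towards a/b

# After a long time, every solution is close to a/b regardless of y(0).
late_values = [exact_solution(40.0, y0, a, b) for y0 in (0.0, 4.0, 10.0)]
print(rate_below, rate_above, late_values)
```

If b were negative the two signs would swap, and solutions would move away from a/b instead.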

13.2.3 Steady states and stability

We notice from Figure 13.3 that for a certain initial temperature, namely T0 = 10 there will
be no change with time. Indeed, we find that at this temperature the differential equation
specifies that dT /dt = 0. Such a value is called a steady state.
Definition 13.13 (Steady state). A steady state is a state in which a system is not changing.
Example 13.14 Find the steady states of the equation (13.6).
Solution: To find steady states we look for y such that dy/dt = 0. But these are just points
that satisfy f (y) = 0, that is, zeros of f . Thus y = 0 and y = ±1 are the three steady states
of this differential equation.
From Figure 13.5, we see that solutions starting close to y = 1 tend to get closer and
closer to this value. We refer to this behaviour as stability of the steady state.
Definition 13.15 (Stability). We say that a steady state is stable if states that are initially
close enough to that steady state will get closer to it with time. We say that a steady state is
unstable, if states that are initially very close to it eventually move away from that steady
state.
Example 13.16 Determine which of the steady states of Eqn. (13.6) found in Example 13.14
are stable and which are unstable.
Solution: From any starting value of y > 0 in this example, we see that after a long time,
the solution curves tend to approach the value y = 1. States close to y = 1 get closer to
it, so this is a stable steady state. For the steady state y = 0, we see that initial conditions
close to y = 0 do not get closer, but rather move away over time. Thus, this steady state is
unstable. It turns out that there is also a stable steady state at y = -1.
As seen in Example 13.14, even though we do not have any formula that connects
y values with specific times, we can say qualitatively what happens to any positive initial
values after a long time: they all approach the value y = 1.
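The stability test described here amounts to reading the direction of flow just to the left and just to the right of a steady state. A minimal Python sketch of that test follows; the function name `classify_steady_state` and the offset `eps` are illustrative assumptions, and the test is only meaningful when `eps` is small enough that no other zero of f lies within that distance.

```python
def classify_steady_state(f, y_star, eps=1e-3):
    """Classify a steady state of dy/dt = f(y) from the flow on either side.

    Hypothetical helper: assumes f(y_star) = 0 and that f has no other zero
    within eps of y_star.
    """
    left, right = f(y_star - eps), f(y_star + eps)
    if left > 0 and right < 0:
        return "stable"        # flow points towards y_star from both sides
    if left < 0 and right > 0:
        return "unstable"      # flow points away from y_star on both sides
    return "neither"           # flow has the same direction on both sides


def f(y):
    return y - y**3            # right-hand side of Eqn. (13.6)


results = {y_star: classify_steady_state(f, y_star) for y_star in (-1.0, 0.0, 1.0)}
print(results)
```

For Eqn. (13.6) this reproduces the conclusion of Example 13.16: the steady states at y = -1 and y = 1 are stable, and the one at y = 0 is unstable.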

13.3 Applying qualitative analysis to biological models
The ideas developed in this chapter, and particularly the qualitative and geometric ideas,
can help us to understand a variety of differential equations that stem from biological,
physical, or chemical applications. In the following sections we will first use the methods
to obtain a thorough understanding of logistic population growth.
In a second direction, we consider a model for interactions of infected and healthy
individuals and the spread of an infection. After making simple assumptions, we derive
a pair of differential equations and show that they can be reduced to a model that greatly
resembles the structure of the logistic equation. Using methods of this chapter, we arrive
at qualitative predictions for cases when the disease would disappear or take hold of the
population.


Section 13.3 Learning goals


1. Practice applying the techniques of slope fields, state-space diagrams, and steady state analysis
to the logistic equation.
2. Follow the derivation of a model for interacting (healthy, infected) individuals based
on a set of assumptions.
3. Understand that the resulting set of two ODEs can be reduced to a single ODE. Be
able to use qualitative methods to analyse the model behaviour and to interpret the
results.

13.3.1 Qualitative analysis for the logistic equation


In this section we will familiarize ourselves with the behaviour predicted by the logistic
equation.
Example 13.17 Find the steady states of the Logistic Equation (13.1).
Solution: To determine the steady states of the equation (13.1), i.e. the level of population
that would not change over time, we look for values of N such that
\[ \frac{dN}{dt} = 0. \]
This leads to
\[ rN \frac{(K - N)}{K} = 0, \]
which has solutions N = 0 (no population at all) or N = K (the population is at its
carrying capacity).
The logistic equation is justified either by considering it to be a special case of the
density dependent growth equation
\[ \frac{dN}{dt} = R(N)N \]
(where the reproductive rate has the form R(N ) = r(K - N )/K), or, equivalently, it can
be considered to fall into a class of equations that have the form
\[ \frac{dN}{dt} = rN - bN^2 \]
(where the constant is b = r/K), which means that a constant rate of reproduction rN is
modified by a quadratic mortality rate bN^2. The mortality would tend to dominate only
for larger values of the population, i.e. if conditions are crowded so that animals have
to compete for resources or habitat. (This stems from the fact that the quadratic term is
smaller than the linear term near N = 0, but dominates for large N , as we have already
discussed in Chapter 1.)


Example 13.18 Draw a plot of the rate of change dy/dt versus the value of y for the
rescaled logistic equation (13.2).
Solution: This plot is shown in Figure 13.7. The steady states are located at y = 0, 1
(which correspond to N = 0 and N = K in the original variable.) We also find that in the
interval 0 < y < 1, the rate of change is positive, so that y increases, whereas for y > 1,
the rate of change is negative, so y decreases. Since y refers to population size, we need
not concern ourselves with behaviour for y < 0.

Figure 13.7. Plot of the rate of change dy/dt versus y for the rescaled logistic equation (13.2).
From Figure 13.7 we expect to see solutions to the differential equation that approach
the value y = 1 after a long time. (The only exception to this would be the case where there
is no population present at all, i.e. y = 0, in which case, there would be no change.) Restated in terms of the original quantities in the model, the population N (t) should approach
K after a long time. We now look at the same equation from the perspective of the slope
field.
Example 13.19 Draw a slope field for the rescaled logistic equation with r = 0.5, that is
for
\[ \frac{dy}{dt} = f(y) = 0.5y(1 - y). \tag{13.8} \]

Solution: We generate slopes in Table 13.4 for different values of y and plot the slope field
in Figure 13.8(a).
y           sign of f(y) = 0.5y(1 - y)   behaviour of y    direction of arrow
0           0                            no change in y    →
0 < y < 1   +ve                          increasing        ↗
1           0                            no change in y    →
y > 1       -ve                          decreasing        ↘

Table 13.4. Table for slope field for the logistic equation (13.8). See Fig 13.8(a)
for the resulting diagram.

Finally, we can use the numerical technique of Euler's method to graph out the full
solution to this differential equation from some set of initial conditions.
Example 13.20 (Numerical solutions to the logistic equation:) Use Euler's method to approximate the solutions to the logistic equation (13.8).
Solution: In Figure 13.8(b) we show a set of solution curves, obtained by solving the
equation numerically using Euler's method and the spreadsheet. To obtain these solutions,
a value of h = Δt = 0.1 was used, i.e. the time axis was discretized (subdivided) into steps of
size 0.1. A starting value of y(0) = y0 at time t = 0 was picked. The successive values of
y were calculated as follows:
\begin{align*}
y_1 &= y_0 + 0.5 y_0 (1 - y_0) h, \\
y_2 &= y_1 + 0.5 y_1 (1 - y_1) h, \\
&\;\;\vdots \\
y_{k+1} &= y_k + 0.5 y_k (1 - y_k) h.
\end{align*}
(The attractive feature of using a spreadsheet is that this repetition can be handled automatically by dragging the cell entry containing the results for one iteration down to generate
other iterations. Another attractive feature is that once the method is implemented, it is
possible to change the initial condition very easily, just by changing a single cell entry.)
From these results, we see that solution curves approach y = 1. This means (in terms
of the original variable, N ) that the population will approach the carrying capacity K for
all nonzero starting values, i.e. there will be a stable steady state with a fixed level of the
population.
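The Euler iteration described in Example 13.20 is carried out in a spreadsheet in these notes. As an alternative illustration, the sketch below performs the same update in Python, with r = 0.5 and h = 0.1 taken from the example; the function name `euler_logistic`, the number of steps, and the particular initial values are assumptions made for this sketch.

```python
def euler_logistic(y0, r=0.5, h=0.1, steps=200):
    """Euler's method for dy/dt = r*y*(1 - y), starting from y(0) = y0.

    Returns the list [y_0, y_1, ..., y_steps], where
    y_{k+1} = y_k + r*y_k*(1 - y_k)*h.
    """
    ys = [y0]
    for _ in range(steps):
        y = ys[-1]
        ys.append(y + r * y * (1 - y) * h)
    return ys


# A few initial conditions; 200 steps of size 0.1 cover 0 <= t <= 20.
for y0 in (0.0, 0.1, 0.5, 1.2):
    ys = euler_logistic(y0)
    print(f"y(0) = {y0:.1f}  ->  y(20) is approximately {ys[-1]:.4f}")
```

Changing the initial condition only requires changing the argument `y0`, which plays the role of the single spreadsheet cell mentioned above.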
Example 13.21 Some of the curves shown in Figure 13.8(b) have an inflection point, but
others do not. Use the differential equation to determine which of the solution curves will
have an inflection point.
Solution: From Figure 13.8(b) we might observe that the curves that emanate from initial
values in the range 0 < y0 < 1 are all increasing. Indeed, this follows from the fact that if y is
in this range, the rate of change ry(1 - y) is a positive quantity.
The logistic equation has the form
\[ \frac{dy}{dt} = ry(1 - y) = ry - ry^2. \]
This means that (by differentiating both sides and remembering the chain rule)
\[ \frac{d^2y}{dt^2} = r\frac{dy}{dt} - 2ry\frac{dy}{dt} = r\frac{dy}{dt}(1 - 2y). \]


Figure 13.8. (a) Slope field and (b) solution curves for the logistic equation (13.8). Axes in (a): time (horizontal) and population (vertical).

An inflection point would occur at places where the second derivative changes sign, and in
addition
\[ \frac{d^2y}{dt^2} = 0. \]
From the above we see that this is possible for dy/dt = 0 or for (1 - 2y) = 0. We have
already dismissed the first possibility because we have argued that the rate of change is
nonzero in the interval of interest. Thus we conclude that an inflection point would occur
whenever y = 1/2. Any solution with initial condition satisfying 0 < y0 < 1/2 would eventually pass
through y = 1/2 on its way up to the steady state level at y = 1, and in so doing, would
have an inflection point. Solutions with y0 > 1/2 (other than the steady state y0 = 1) never
cross y = 1/2 and therefore have no inflection point.
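An inflection point of an increasing solution is the point where the curve is steepest, so the conclusion of Example 13.21 can be checked by locating the largest Euler increment along a numerical solution. The sketch below is an illustration only: the helper names, step count, and initial values are assumptions, and the Euler update is the one from Example 13.20 with r = 0.5 and h = 0.1.

```python
def euler_logistic(y0, r=0.5, h=0.1, steps=300):
    """Euler's method for dy/dt = r*y*(1 - y) starting from y(0) = y0."""
    ys = [y0]
    for _ in range(steps):
        y = ys[-1]
        ys.append(y + r * y * (1 - y) * h)
    return ys


def steepest_point(ys):
    """Return the y value at which the increment y[k+1] - y[k] is largest."""
    increments = [b - a for a, b in zip(ys, ys[1:])]
    k = max(range(len(increments)), key=lambda i: increments[i])
    return ys[k]


low = steepest_point(euler_logistic(0.1))    # starts below y = 1/2
high = steepest_point(euler_logistic(0.7))   # starts above y = 1/2
print(low, high)
```

For the solution starting at 0.1 the steepest point lies close to y = 1/2, as predicted. For the solution starting at 0.7 the steepest point is the starting value itself, so the growth only slows down and no inflection point occurs.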


13.3.2 A model for the spread of a disease


In the era of human immunodeficiency virus (HIV), severe acute respiratory syndrome
(SARS), avian influenza (bird flu) and similar emerging infectious diseases, we are faced
with questions about how infection spreads, and how it can be controlled or suppressed.
Sustaining the health of the public at large requires an understanding of the dynamics of
disease, motivating a simple example discussed here.
We consider a population with two types of individuals, those that are healthy and
those that are currently infected. We will assume that all healthy individuals are susceptible
to catching the infection, and those that are currently infected are also infectious, which
means that they can transmit the infection to others through social interactions. We also
assume that the infection is mild enough that individuals recover at some constant rate, and
that there is no disease-related mortality. Furthermore, we will consider this scenario in
the context of a fixed population (with no birth, death or migration during the timescale
of interest). Our goal in this section is to predict whether the infection would spread and
take hold in the population or whether it would run its course and disappear. We will find
that this example illustrates the methods used so far and allows us to draw conclusions that
were not intuitively obvious to begin with.
Let us use the following notation:
S(t) = size of population of susceptible (healthy) individuals
I(t) = size of population of infected individuals
N (t) = S(t) + I(t) = total population size
We make a few simplifying assumptions.
1. The population mixes very well, so each individual is equally likely to contact and
interact with any other individual. The contact is random.
2. Other than the state (S or I), individuals are identical. They recover at the same
(constant) rate, and they have the same tendency to become infected.
3. On the timescale of interest, there is no birth, death or migration, only exchange
between S and I.
Example 13.22 Suppose that the process can be represented by the scheme
\[ S + I \to I + I, \qquad I \to S. \]
The first part, transmission of disease from I to S involves interaction. The second part
is recovery. Use the assumptions to track the two populations and to formulate a set of
differential equations for I(t) and S(t).
Solution: We first write down the following word equation to keep track of infected individuals:

[Rate of change of I(t)] = [Rate of gain due to disease transmission] - [Rate of loss due to recovery]


According to our assumption, recovery takes place at a constant rate. We denote that rate
by μ > 0 per unit time. By the law of mass action, the disease transmission rate should be
proportional to the product of the populations, (S · I). Assigning β > 0 to be the constant
of proportionality leads to the following differential equation for the infected population
(which simply restates the word equation in mathematical notation):
\[ \frac{dI}{dt} = \beta S I - \mu I. \]
Similarly, we can write a word equation that tracks the population of susceptibles:

[Rate of change of S(t)] = - [Rate of loss due to disease transmission] + [Rate of gain due to recovery]

Observe that loss from one group leads to (exactly balanced) gain in the other group. By
similar logic, the differential equation for S(t) is then
\[ \frac{dS}{dt} = -\beta S I + \mu I. \]
We have arrived at two differential equations that describe the changes in each of the
groups,
\begin{align}
\frac{dI}{dt} &= \beta S I - \mu I, \tag{13.9a} \\
\frac{dS}{dt} &= -\beta S I + \mu I. \tag{13.9b}
\end{align}

It is evident from Eqs. (13.9) that changes in one population are linked to the levels of both,
which means that the differential equations are coupled (linked to one another). Hence, we
cannot solve one independently of the other. We must treat them as a pair. However, as
we will observe in the next examples, we can simplify this system of equations using the
fact that the total population does not change.
Example 13.23 Use equations (13.9) to show that the total population does not change.
(Hint: show that the derivative of S(t) + I(t) is zero.)
Solution: Add the equations to one another. Then we obtain
\[ \frac{d}{dt}[I(t) + S(t)] = \frac{dI}{dt} + \frac{dS}{dt} = \beta S I - \mu I - \beta S I + \mu I = 0. \]
Hence
\[ \frac{d}{dt}[I(t) + S(t)] = \frac{dN}{dt} = 0, \]
which means that the total population does not change, so that N (t) = [I(t) + S(t)] =
N = constant.
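The conservation of the total population can also be observed numerically. The following sketch applies Euler's method to the pair of equations (13.9), writing β as `beta` and μ as `mu`. It is an illustration under assumptions: the function name `simulate_si`, the parameter values, the step size, and the initial populations are all hypothetical choices, not values from the notes.

```python
def simulate_si(S0, I0, beta, mu, h=0.01, steps=5000):
    """Euler's method for dI/dt = beta*S*I - mu*I, dS/dt = -beta*S*I + mu*I.

    Returns a list of (S, I) pairs, one per time step.
    """
    S, I = S0, I0
    history = [(S, I)]
    for _ in range(steps):
        new_infections = beta * S * I   # transmission term, moves S -> I
        recoveries = mu * I             # recovery term, moves I -> S
        net_change = new_infections - recoveries
        S, I = S - h * net_change, I + h * net_change
        history.append((S, I))
    return history


# Hypothetical values: N = 100, so mu/beta = 50 and N - mu/beta = 50.
history = simulate_si(S0=99.0, I0=1.0, beta=0.01, mu=0.5)
totals = [S + I for S, I in history]
S_end, I_end = history[-1]
print(min(totals), max(totals), S_end, I_end)
```

Whatever leaves one group in a step enters the other, so S + I stays at its initial value (up to rounding) throughout the simulation.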


Example 13.24 Use the fact that N is constant to express S(t) in terms of I(t) and N , and
eliminate S(t) from the differential equation for I(t). Your equation will contain only the
constants N, β, μ.
Solution: Since N = S(t) + I(t) is constant, we can write S(t) = N - I(t). Then,
plugging this into the differential equation for I(t) we obtain
\[ \frac{dI}{dt} = \beta (N - I) I - \mu I. \]
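Example 13.25 Show that the above equation can be written in the form
\[ \frac{dI}{dt} = \beta I (K - I), \]
where K is a constant, and determine how this constant depends on N, β, and μ. Is the
constant K positive or negative?
Solution: We rewrite the differential equation for I(t) as follows:
\[ \frac{dI}{dt} = \beta (N - I) I - \mu I = \beta I \left[ (N - I) - \frac{\mu}{\beta} \right] = \beta I \left( N - \frac{\mu}{\beta} - I \right). \]
Then, we identify the constant,
\[ K = N - \frac{\mu}{\beta}. \]
Evidently, K could be either positive or negative, that is
\[ N \ge \frac{\mu}{\beta} \;\Rightarrow\; K \ge 0, \qquad N < \frac{\mu}{\beta} \;\Rightarrow\; K < 0. \]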
Using the above process, we have reduced the system of two differential equations for the
two variables I(t), S(t) to a single differential equation for I(t), together with the statement
S(t) = N - I(t). We now examine implications of this result using qualitative methods
developed in this chapter.
Example 13.26 Consider the differential equation for I(t) given by
\[ \frac{dI}{dt} = \beta I (K - I), \quad \text{where } K = N - \frac{\mu}{\beta}. \tag{13.10} \]
Find the steady states of the differential equation (13.10) and draw a state space diagram in
each of the following two cases: (a) K ≥ 0, (b) K < 0. Use your diagram to determine
which steady state(s) are stable or unstable.
Solution: Steady states of Eqn. (13.10) satisfy dI/dt = 0, namely βI(K - I) = 0. The
possible roots are I = 0 (no infected individuals) and I = K. The latter can only make
sense if K ≥ 0. We plot the function f (I) = βI(K - I) in Eqn. (13.10) against the
state variable I in both cases. Observe that this function is quadratic in I, and, as in the
logistic equation, its graph is a parabola opening downwards. We add arrows pointing right
(→) in the regions where dI/dt > 0 and arrows pointing left (←) where dI/dt < 0. In
case (a), when K ≥ 0, we find that arrows point towards I = K, so this steady state is
stable. Arrows point away from I = 0, so this represents an unstable steady state. In
case (b), while we still have a parabolic graph with two steady states, the state I = K
is not admissible since K is negative. Hence only one steady state, at I = 0, is relevant
biologically, and all initial conditions will move towards this state.

Figure 13.9. Plot of dI/dt versus I as specified by the differential equation
(13.10) for (a) K ≥ 0, and (b) K < 0. The grey regions are not biologically meaningful
since I cannot be negative.
Example 13.27 Interpret the results of the model in terms of the disease, assuming that
initially most of the population is in the S group, and a small number of infected individuals
are present at t = 0.
Solution: In case (a), as long as the initial size of the infected group is positive (I > 0),
with time it will approach K, that is, I(t) → K = N - μ/β. This holds provided K > 0,
which is equivalent to N > μ/β. For this case, we also conclude that the rest of the
population, S(t) = N - I(t), will approach N - K, that is S(t) → N - (N - μ/β) =
μ/β. There will then be some infected and some healthy individuals in the population
indefinitely, according to the model. In this case, we say that the disease becomes endemic.
In case (b), which corresponds to N < μ/β, we see that I(t) → 0 regardless of the
initial size of the infected group. In that case, S(t) → N so with time, the infected group
will shrink and the healthy group will grow. From these two results, we can conclude that
the disease will be wiped out in a small population, whereas in a large population, it will
spread until a steady state is attained. In fact we have identified a threshold that separates
these two behaviours:
\[ \frac{N\beta}{\mu} > 1 \;\Rightarrow\; \text{disease becomes endemic}, \qquad \frac{N\beta}{\mu} < 1 \;\Rightarrow\; \text{disease is wiped out}. \]


The ratio of constants in these inequalities is called the reproductive number for the disease. Many current and much more detailed models for disease transmission also have
threshold behaviour, and the ratio that determines whether the disease spreads or disappears
is denoted R0 . This ratio represents the number of infections that arise when 1 infected individual interacts with a population of N susceptible individuals.
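The threshold condition can be turned into a small calculation. The sketch below is a hedged illustration: it assumes the model reduces to dI/dt = βI(K - I) with K = N - μ/β, so that the threshold ratio is Nβ/μ, and it writes β as `beta` and μ as `mu`. The function name `disease_outcome`, the dictionary keys, and the numerical values are hypothetical.

```python
def disease_outcome(N, beta, mu):
    """Predict the long-term outcome for dI/dt = beta*I*(K - I), K = N - mu/beta.

    Assumes a small positive number of infected individuals at t = 0.
    """
    K = N - mu / beta
    ratio = N * beta / mu   # threshold ratio: compare with 1
    if ratio > 1:
        # K > 0: the infected group approaches K and susceptibles approach mu/beta.
        return {"ratio": ratio, "outcome": "endemic",
                "I_limit": K, "S_limit": mu / beta}
    # ratio <= 1: K <= 0, so I = 0 is the only biologically relevant steady state.
    return {"ratio": ratio, "outcome": "wiped out",
            "I_limit": 0.0, "S_limit": float(N)}


large = disease_outcome(N=100, beta=0.01, mu=0.5)   # large population
small = disease_outcome(N=20, beta=0.01, mu=0.5)    # small population
print(large)
print(small)
```

With these illustrative values the same disease persists in the population of 100 but disappears from the population of 20, which is the threshold behaviour described above.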

Exercises

13.1. Consider the differential equation
\[ \frac{dy}{dt} = a - by \]
where a, b are constants.
(a) Show that the function
\[ y(t) = \frac{a}{b} - Ce^{-bt} \]
satisfies the above differential equation for any constant C.
(b) Show that by setting
\[ C = \frac{a}{b} - y_0 \]
we also satisfy the initial condition
\[ y(0) = y_0. \]
Remark: You have now shown that the function
\[ y(t) = \left( y_0 - \frac{a}{b} \right) e^{-bt} + \frac{a}{b} \]
is a solution to the initial value problem (i.e. differential equation plus initial
condition)
\[ \frac{dy}{dt} = a - by, \quad y(0) = y_0. \]

13.2. Steps in an example: Complete the algebraic steps in Example 12.5 to show that
the solution to Eqn. (12.3) can be obtained by the substitution z(t) = a - by(t).
13.3. Verifying a solution: Show that the function
\[ y(t) = \frac{1}{1 - t} \]
is a solution to the differential equation and initial condition
\[ \frac{dy}{dt} = y^2, \quad y(0) = 1. \]
Comment on what happens to this solution as t approaches 1.


13.4. For each of the following, show the given function y is a solution to the given differential equation.
(a) \( t \frac{dy}{dt} = 3y, \quad y = 2t^3. \)
(b) \( \frac{d^2y}{dt^2} + y = 0, \quad y = 2\sin t + 3\cos t. \)
(c) \( \frac{d^2y}{dt^2} - 2\frac{dy}{dt} + y = 6e^t, \quad y = 3t^2 e^t. \)
13.5. Show the function determined by the equation \( 2x^2 + xy - y^2 = C \), where C is a
constant and \( 2y \ne x \), is a solution to the differential equation
\( (x - 2y)\frac{dy}{dx} = -4x - y \).
13.6. Find the constants that satisfy the given initial conditions.
(a) \( 2x^2 - 3y^2 = C, \quad y|_{x=0} = 2. \)
(b) \( y = C_1 e^{5t} + C_2 t e^{5t}, \quad y|_{t=0} = 1 \) and \( \frac{dy}{dt}\big|_{t=0} = 0. \)
(c) \( y = C_1 \cos(t - C_2), \quad y|_{t=\pi/2} = 0 \) and \( \frac{dy}{dt}\big|_{t=\pi/2} = 1. \)

13.7. Friction and terminal velocity: The velocity of a falling object changes due to the
acceleration of gravity, but friction has an effect of slowing down this acceleration.
The differential equation satisfied by the velocity v(t) of the falling object is
\[ \frac{dv}{dt} = g - kv \]
where g is acceleration due to gravity and k is a constant that represents the effect
of friction. An object is dropped from rest from a plane.
(a) Find the function v(t) that represents its velocity over time.
(b) What happens to the velocity after the object has been falling for a long time
(but before it has hit the ground)?
13.8. Alcohol level: Alcohol enters the blood stream at a constant rate k gm per unit time
during a drinking session. The liver gradually converts the alcohol to other, nontoxic byproducts. The rate of conversion per unit time is proportional to the current
blood alcohol level, so that the differential equation satisfied by the blood alcohol
level is
\[ \frac{dc}{dt} = k - sc \]
where k, s are positive constants. Suppose initially there is no alcohol in the blood.
Find the blood alcohol level c(t) as a function of time from t = 0, when the drinking
started.
13.9. Newton's Law of Cooling: Newton's Law of Cooling states that the rate of change
of the temperature of an object is proportional to the difference between the temperature of the object, T , and the ambient (environmental) temperature, E. This leads
to the differential equation
\[ \frac{dT}{dt} = k(E - T) \]
where k > 0 is a constant that represents the material properties and E is the
ambient temperature. (We will assume that E is also constant.)
(a) Show that the function
\[ T(t) = E + (T_0 - E)e^{-kt}, \]
which represents the temperature at time t, satisfies this equation.


(b) The time of death of a murder victim can be estimated from the temperature of
the body if it is discovered early enough after the crime has occurred. Suppose
that in a room whose ambient temperature is E = 20°C, the temperature
of the body upon discovery is T = 30°C, and that a second measurement,
one hour later, is T = 25°C. Determine the approximate time of death. (You
should use the fact that just prior to death, the temperature of the victim was
37°C.)
13.10. A cup of coffee: The temperature of a cup of coffee is initially 100 degrees C. Five
minutes later, (t = 5) it is 50 degrees C. If the ambient temperature is A = 20
degrees C, determine how long it takes for the temperature of the coffee to reach 30
degrees C.
13.11. Newton's Law of Cooling applied to data: The following data was gathered in
producing Fig. 2.1 for cooling milk during yoghurt production. According to Newton's Law of Cooling, this data can be described by the formula
\[ T(t) = E + (T(0) - E)e^{-kt}, \]
where T (t) is the temperature of the milk (in degrees Fahrenheit) at time t (in min),
E is the ambient temperature, and k is some constant that we will determine in this
problem.
time (min)    Temp (°F)
0.0           190.0
0.5           185.5
1.0           182.0
1.5           179.2
2.0           176.0
2.5           172.9
3.0           169.5
3.5           167.0
4.0           164.6
4.5           162.2
5.0           159.8
(a) Rewrite this relationship in terms of the quantity $Y(t) = \ln(T(t) - E)$, and
show that $Y(t)$ is related linearly to the time $t$.
(b) Explain how the constant k could be found from this converted form of the
relationship.
(c) Use the data in the table and your favorite spreadsheet (or similar software) to
show that the data so transformed appears to be close to linear. Assume that
the ambient temperature was E = 20°F.
(d) Use the same software to determine the constant k by fitting a line to the transformed data.
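For readers who prefer a script to a spreadsheet, parts (c) and (d) can be sketched in Python as follows. This is only an illustration under the assumption E = 20°F stated in part (c). The helper name `fit_line` is ours and does not appear in the notes; it applies the ordinary least-squares formulas for a straight line.

```python
import math

# Data from the table in Exercise 13.11: time (min) and temperature (deg F).
times = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0]
temps = [190.0, 185.5, 182.0, 179.2, 176.0, 172.9, 169.5, 167.0, 164.6, 162.2, 159.8]
E = 20.0  # ambient temperature assumed in part (c)

def fit_line(xs, ys):
    """Least-squares slope and intercept of the line y = slope*x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Transformed data Y = ln(T - E); Newton's law predicts Y = ln(T(0) - E) - k t.
Y = [math.log(T - E) for T in temps]
slope, intercept = fit_line(times, Y)
k = -slope  # the slope of the fitted line is -k
print(f"k is approximately {k:.4f} per minute")
print(f"fitted T(0) - E is approximately {math.exp(intercept):.1f} deg F")
```

If the transformed points lie close to the fitted line, that supports the claim in part (c) that the data are nearly linear after the transformation.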
13.12. Lake Fishing: Fish Unlimited is a company that manages the fish population in a
private lake. They restock the lake at a constant rate. (To restock means to add fish to
the lake.) $N$ fishers are allowed to fish in the lake per day. The population of fish in
the lake, $F(t)$, is found to satisfy the differential equation
\[ \frac{dF}{dt} = I - \alpha N F \qquad (13.11) \]


(a) At what rate is fish added per day according to Eqn. (13.11)? (Give value and
units.) What is the average number of fish caught by one fisher? (Give value
and units.) What is being assumed about the fish birth and mortality rates in
Eqn. (13.11)?
(b) If the fish input and number of fishers are constant, what is the steady state
level of the fish population in the lake?
(c) At time t = 0 the company stops restocking the lake with fish. Write down the
revised form of the differential equation (13.11) that takes this into account.
(Assume the same level of fishing as before.) How long would it take for the
fish to fall to 25% of their initial level?
(d) When the fish population drops to the level $F_{\text{low}}$, fishing is stopped and the
lake is restocked with fish at the same constant rate (Eqn. (13.11), with $\alpha = 0$).
Write down the revised version of (13.11) that takes this into account. How
long would it take for the fish population to double?
13.13. Glucose solution in a tank: A tank that holds 1 liter is initially full of plain water.
A concentrated solution of glucose, containing 0.25 gm/cm$^3$, is pumped into the
tank continuously, at the rate 10 cm$^3$/min, and the mixture (which is continuously
stirred to keep it uniform) is pumped out at the same rate. How much glucose will
there be in the tank after 30 minutes? After a long time? (Hint: write a differential
equation for c, the concentration of glucose in the tank by considering the rate at
which glucose enters and the rate at which glucose leaves the tank.)
13.14. Pollutant in a lake:
(From the Dec 1993 Math 100 Exam) A lake of constant volume V gallons contains
Q(t) pounds of pollutant at time t evenly distributed throughout the lake. Water
containing a concentration of k pounds per gallon of pollutant enters the lake at a
rate of r gallons per minute, and the well-mixed solution leaves at the same rate.
(a) Set up a differential equation that describes the way that the amount of pollutant in the lake will change.
(b) Determine what happens to the pollutant level after a long time if this process
continues.
(c) If k = 0 find the time T for the amount of pollutant to be reduced to one half
of its initial value.
13.15. A sugar solution: Sugar dissolves in water at a rate proportional to the amount of
sugar not yet in solution. Let Q(t) be the amount of sugar undissolved at time t.
The initial amount is 100 kg and after 4 hours the amount undissolved is 70 kg.
(a) Find a differential equation for Q(t) and solve it.
(b) How long will it take for 50 kg to dissolve?


13.16. Leaking water tank: A cylindrical tank with cross-sectional area A has a small
hole through which water drains. The height of the water in the tank y(t) at time t
is given by:
\[ y(t) = \left( \sqrt{y_0} - \frac{kt}{2A} \right)^2 \]
where $k$, $y_0$ are constants.
(a) Show that the height of the water, $y(t)$, satisfies the differential equation
\[ \frac{dy}{dt} = -\frac{k}{A}\sqrt{y}. \]
(b) What is the initial height of the water in the tank at time t = 0 ?
(c) At what time will the tank be empty ?
(d) At what rate is the volume of the water in the tank changing when t = 0?
13.17. Find those constants $a$, $b$ so that $y = e^{x}$ and $y = e^{-x}$ are both solutions of the
differential equation
\[ y'' + ay' + by = 0. \]
13.18. Let $y = f(t) = e^{-t}\sin t$, $-\infty < t < \infty$.

(a) Show that $y$ satisfies the differential equation $y'' + 2y' + 2y = 0$.

(b) Find all critical points of f (t).


Chapter 14

Trigonometric functions

In this chapter we will explore trigonometric functions and their properties. This important
new class of functions will be introduced here; their basic properties and interconnections
will be discussed. Belonging to a wider class of periodic functions, these illustrate the ideas
of amplitude, frequency, period, and phase. We will find that many cyclic phenomena can
be described approximately by suitably adjusted basic functions such as sine and cosine. As
a second theme, we return to the idea of inverse functions and show that important restrictions must be applied to ensure the existence of an inverse, particularly for the trigonometric
functions. Then, in the next chapter, we calculate the derivatives of trigonometric functions
and show applications to rates of change of periodic phenomena or changing angles.

14.1 Basic trigonometry


Trigonometric functions are closely associated with angles and ratios of sides of a right-angle triangle. But they are also connected to motion of a point around a unit circle. Before
we can understand these connections, we agree on a universal way of measuring angles,
and then define the functions of interest.

Section 14.1 Learning goals


1. Understand the definition of the radian as a measure for angles.
2. Understand the correspondence between a point moving on a unit circle and the sine
and cosine of the angle it forms at the origin.
3. Be able to make correspondence between ratios of sides of a Pythagorean triangle
and the trigonometric functions of one of its angles.
4. Review properties of the functions sin(x) and cos(x) and other trigonometric functions. Understand and be able to state and apply the connections between these
functions (trigonometric identities).

14.1.1 Angles and circles


Angles can be measured in a number of ways. One way is to assign a value in degrees, with
the convention that one complete revolution is represented by 360°. Why 360? And what is
a degree exactly? Is this some universal measure that any intelligent being (say on Mars or
elsewhere) would find appealing? Actually, 360 is a rather arbitrary convention that arose
historically, and has no particular meaning. We could as easily have had mathematical
ancestors that decided to divide circles into 1000 equal pieces or 240 or some other
subdivision. It turns out that this measure is not particularly convenient, and we will replace
it by a more universal quantity.
The universal quantity stems from the fact that circles of all sizes have one common
geometric feature: they have the same ratio of circumference to diameter, no matter what
their size (or where in the universe they occur). We call that ratio $\pi$, that is
\[ \pi = \frac{\text{Circumference of circle}}{\text{Diameter of circle}}. \]
The diameter $D$ of a circle of radius $r$ is just twice the radius,
\[ D = 2r, \]
so this naturally leads to the familiar relationship of circumference, $C$, to radius
\[ C = 2\pi r. \]
(But we should not forget that this is merely a definition of the constant $\pi$. The more
interesting conclusion that develops from this definition is that the area of the circle is
$A = \pi r^2$, but we shall see the reason for this later, in the context of areas and integration.)

Figure 14.1. The angle $\theta$ in radians is related in a simple way to the radius $R$ of
the circle, and the length of the arc $S$ shown.
From Figure 14.1 we see that there is a correspondence between the angle ($\theta$) subtended in a circle of given radius and the length of arc along the edge of the circle. For a
circle of radius $R$ and angle $\theta$ we will define the arclength, $S$, by the relation
\[ S = R\theta, \]
where $\theta$ is measured in a convenient unit that we will now select. We now consider a circle
of radius $R = 1$ (called a unit circle) and denote by $S$ a length of arc around the perimeter
of this unit circle. In this case, the arc length is
\[ S = R\theta = \theta. \]
We note that when $S = 2\pi$, the arc consists of the entire perimeter of the circle. This
leads us to define the unit called a radian: we will identify an angle of $2\pi$ radians with one
complete revolution around the circle. In other words, we use the length of the arc in the
unit circle to assign a numerical value to the angle that it subtends.
We can now use this choice of unit for angles to assign values to any fraction of a
revolution, and thus, to any angle. For example, an angle of 90° corresponds to one quarter
of a revolution around the perimeter of a unit circle, so we identify the angle $\pi/2$ radians
with it. One degree is 1/360 of a revolution, corresponding to $2\pi/360$ radians, and so on.
To summarize our choice of units we have the following two points:

1. The length of an arc along the perimeter of a circle of radius $R$ subtended by an angle
$\theta$ is $S = R\theta$, where $\theta$ is measured in radians.
2. One complete revolution, or one full cycle, corresponds to an angle of $2\pi$ radians.
It is easy to convert between degrees and radians if we remember that 360° corresponds to $2\pi$ radians. (180° then corresponds to $\pi$ radians, 90° to $\pi/2$ radians, etc.)
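The conversion rule and the arc-length relation can be sketched in a few lines of Python. The function names `degrees_to_radians` and `arc_length` are our own choices for illustration; the only facts used are that 360 degrees corresponds to 2π radians and that S = Rθ when θ is in radians.

```python
import math

def degrees_to_radians(deg):
    """Convert degrees to radians using 360 degrees = 2*pi radians."""
    return deg * 2 * math.pi / 360

def arc_length(radius, theta):
    """Arc length S = R * theta, with theta measured in radians."""
    return radius * theta

quarter_turn = degrees_to_radians(90)
print(quarter_turn, math.pi / 2)                  # the two values agree
print(arc_length(1.0, degrees_to_radians(360)))   # perimeter of the unit circle, 2*pi
```

Python's standard library also provides `math.radians` and `math.degrees`, which perform the same conversions.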

14.1.2 Defining the trigonometric functions sin(x) and cos(x)

Figure 14.2. Shown above is the circle of radius 1, $x^2 + y^2 = 1$. The radius vector
that ends at the point (x, y) subtends an angle t (radians) with the x axis. The triangle is
also shown enlarged to the right, where the lengths of all three sides are labeled. The
trigonometric functions are just ratios of two sides of this triangle.
Consider a point $(x, y)$ moving around the rim of a circle of radius 1, and let $t$ be
some angle (measured in radians) formed by the $x$ axis and the radius vector to the point
$(x, y)$ as shown in Figure 14.2. We define the functions sine and cosine, both dependent on the
angle $t$ (abbreviated $\sin(t)$ and $\cos(t)$) as follows:
\[ \sin(t) = \frac{y}{1} = y, \qquad \cos(t) = \frac{x}{1} = x. \]
That is, the function sine tracks the y coordinate of the point as it moves around the unit
circle, and the function cosine tracks its x coordinate. (Remark: see also the review definitions of these trigonometric quantities as shown in Figure F.1 of Appendix F as the opposite
over hypotenuse and adjacent over hypotenuse in a right angle triangle. The hypotenuse in
our diagram is simply the radius of the circle, which is 1 by assumption.)

14.1.3 Properties of sin(x) and cos(x)


We now explore the consequences of these definitions:
Values of sine and cosine
The radius of the circle is 1. This means that the x coordinate cannot be larger
than 1 or smaller than -1. The same holds for the y coordinate. Thus the functions
sin(t) and cos(t) are always swinging between -1 and 1 ($-1 \le \sin(t) \le 1$ and
$-1 \le \cos(t) \le 1$ for all $t$). The peak (maximum) value of each function is 1, the
minimum is -1, and the average value is 0.
When the radius vector points along the x axis, the angle is t = 0 and we have
y = 0, x = 1. This means that cos(0) = 1, sin(0) = 0.
When the radius vector points up the y axis, the angle is $\pi/2$ (corresponding to
one quarter of a complete revolution), and here x = 0, y = 1 so that $\cos(\pi/2) = 0$,
$\sin(\pi/2) = 1$.
Using simple geometry, we can also determine the lengths of all sides, and hence the
ratios of the sides in a few particularly simple triangles, namely equilateral triangles
(in which all angles are 60°), and right triangles with two equal angles of 45°. These
types of calculations (omitted here) lead to some easily determined values for the
sine and cosine of such special angles. These values are shown in Table F.1 of the
Appendix F.
Connection between sine and cosine
The two functions, sine and cosine depict the same underlying motion, viewed from
two perspectives: cos(t) represents the projection of the circularly moving point onto
the x axis, while sin(t) is the projection of that point onto the y axis. In this sense, the
functions are a pair of twins, and we can expect many relationships to hold between
them.
The cosine has its largest value at the beginning of the cycle, when t = 0 (since
cos(0) = 1), while the sine reaches its peak value a little later ($\sin(\pi/2) = 1$).
Throughout their circular race, the sine function lags $\pi/2$ radians behind the cosine,
i.e.
\[ \cos(t) = \sin\left(t + \frac{\pi}{2}\right). \]
The point (x, y) is on a circle of radius 1, and, thus, its coordinates satisfy
\[ x^2 + y^2 = 1. \]
This implies that
\[ \sin^2(t) + \cos^2(t) = 1 \qquad (14.1) \]
for any angle t. This is an important relation (also called a trigonometric identity)
between the two trigonometric functions, and one that we will use quite often. See
Appendix F for a review of trigonometric identities.
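A quick numerical check of these two relations may help fix ideas. The sketch below uses only Python's `math` module; the helper name `point_on_unit_circle` and the sample angles are ours, chosen for illustration. Agreement at a handful of angles is of course not a proof.

```python
import math

def point_on_unit_circle(t):
    """Coordinates (x, y) = (cos t, sin t) of the point at angle t (radians)."""
    return math.cos(t), math.sin(t)

for t in [0.0, 0.5, math.pi / 2, 2.0, 4.0]:
    x, y = point_on_unit_circle(t)
    # Identity (14.1): the point lies on the circle x^2 + y^2 = 1.
    on_circle = math.isclose(x**2 + y**2, 1.0)
    # Shift relation: cos(t) = sin(t + pi/2).
    shifted = math.isclose(x, math.sin(t + math.pi / 2), abs_tol=1e-12)
    print(f"t = {t:.3f}: on circle: {on_circle}, cos(t) = sin(t + pi/2): {shifted}")
```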

14.1.4 Other trigonometric functions

Although we shall mostly be concerned with the two basic functions described above, several others are historically important and are encountered frequently in integral calculus.
These include the following:
\[ \tan(t) = \frac{\sin(t)}{\cos(t)}, \qquad \cot(t) = \frac{1}{\tan(t)}, \]
\[ \sec(t) = \frac{1}{\cos(t)}, \qquad \csc(t) = \frac{1}{\sin(t)}. \]


We review these and the identities that they satisfy in Appendix F. We also include the Law
of Cosines (F.1), and angle-sum identities in the same appendix.

14.2 Periodic Functions


Section 14.2 Learning goals
1. Understand the definition of a periodic function.
2. Given a periodic function, be able to determine its period, amplitude and phase.
3. Given a graph or description of a periodic or rhythmic process, be able to fit an
approximate sine or cosine function with the correct period, amplitude and phase.
A function is said to be periodic if
\[ f(t) = f(t + T) \]
for all t, where T is a constant that we call the period of the function. Graphically, this means
that if we shift the function by a constant distance along the horizontal axis, we
see the same picture again. All the trigonometric functions are periodic.


Figure 14.3. Periodicity of the sine and cosine. Note that the two curves are just
shifted versions of one another.

The point (x, y) in Figure 14.2 will repeat its trajectory every time a revolution
around the circle is complete. This happens when the angle t completes one full
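cycle of $2\pi$ radians. Thus, as expected, the trigonometric functions are periodic, that
is
\[ \sin(t) = \sin(t + 2\pi), \quad \text{and} \quad \cos(t) = \cos(t + 2\pi). \]
Similarly
\[ \tan(t) = \tan(t + 2\pi), \quad \text{and} \quad \cot(t) = \cot(t + 2\pi). \]
We say that the period is $T = 2\pi$ radians. The graphs of sine and cosine are displayed
in Fig. 14.3. The same applies to sec(t) and csc(t), that is, all six trigonometric
functions are periodic. (The tangent and cotangent already repeat after $\pi$ radians, so
$2\pi$ is a period for them, though not the shortest one.)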
We can make other observations about sine and cosine. For example, by noting the
symmetry of the functions relative to the origin, we can see that sin(t) is an odd function
and the cos(t) is an even function. This follows from the fact that for a negative angle (i.e.
an angle clockwise from the x axis) the sine flips sign while the cosine does not.
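The periodicity and symmetry properties above can be spot-checked numerically. The following is a minimal sketch using Python's `math` module; the helper `is_periodic_at` and the sample points are our own, and testing a few points only illustrates the properties rather than establishing them.

```python
import math

def is_periodic_at(f, t, period, tol=1e-9):
    """Check f(t) == f(t + period) at one sample point, up to rounding."""
    return math.isclose(f(t), f(t + period), abs_tol=tol)

samples = [0.3, 1.0, 2.5]
two_pi = 2 * math.pi

# Sine, cosine and tangent all repeat after 2*pi.
print(all(is_periodic_at(math.sin, t, two_pi) for t in samples))
print(all(is_periodic_at(math.cos, t, two_pi) for t in samples))
print(all(is_periodic_at(math.tan, t, two_pi) for t in samples))

# sin is odd: sin(-t) = -sin(t); cos is even: cos(-t) = cos(t).
print(all(math.isclose(math.sin(-t), -math.sin(t)) for t in samples))
print(all(math.isclose(math.cos(-t), math.cos(t)) for t in samples))
```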

Figure 14.4. Graphs of the functions (a) $y = \sin(t)$, (b) $y = A\sin(t)$ for $A > 1$,
(c) $y = A\sin(\omega t)$ for $\omega > 1$, (d) $y = A\sin(\omega(t - a))$.

14.2.1 Phase, amplitude, and frequency

In Appendix C we review how the appearance of functions changes when we shift their
graph in one direction or another, scale one of the axes, and so on. Using these ideas it will
be straightforward to follow the basic changes in shape of a typical trigonometric function.
A function of the form
\[ y = f(t) = A\sin(\omega t) \]
has both its t and y axes scaled, as shown in Fig. 14.4(c). The constant A, referred to as
the amplitude of the graph, scales the y axis so that the oscillation swings between a low
value of $-A$ and a high value of $A$. The constant $\omega$, called the frequency, scales the t axis.
This results in crowding together of the peaks and valleys (if $\omega > 1$) or stretching them out
(if $\omega < 1$). One full cycle is completed when
\[ \omega t = 2\pi, \]
and this occurs at time
\[ t = \frac{2\pi}{\omega}. \]
We have already used the symbol T to denote this special time, and defined T as the period
of the function. We note the connection between frequency and period:
\[ \omega = \frac{2\pi}{T}, \qquad T = \frac{2\pi}{\omega}. \]
If we examine a graph of the function
\[ y = f(t) = A\sin(\omega(t - a)) \]
we find that the graph has been shifted in the positive t direction by a, as in Fig. 14.4(d).
We note that at time t = a, the value of the function is
\[ y = f(a) = A\sin(\omega(a - a)) = A\sin(0) = 0. \]
This tells us that the cycle starts with a delay, i.e. the value of y goes through zero when
t = a.
Another common variant of the same function can be written in the form
\[ y = f(t) = A\sin(\omega t - \phi). \]
Here $\phi$ is called the phase shift of the oscillation. Comparing the above two related forms,
we see that they are the same if we identify $\phi$ with $\omega a$. The phase shift, $\phi$, is considered
to be a quantity without units, whereas the quantity a has units of time, same as t. When
$\phi = 2\pi$ (which happens when $a = 2\pi/\omega$), the graph has been moved over to the right
by one full period. (Naturally, when the graph is so moved, it looks the same as it did
originally, since each cycle is the same as the one before, and same as the one after.)
Some of the scaled, shifted, sine functions described here are shown in Figure 14.4.
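The relations between frequency, period, time shift and phase shift can be collected in a short Python sketch. The function names and the numerical values of A, ω and a below are illustrative assumptions of ours, not values taken from the notes.

```python
import math

def shifted_sine(t, A, omega, a):
    """y = A sin(omega (t - a)): amplitude A, frequency omega, time shift a."""
    return A * math.sin(omega * (t - a))

def period(omega):
    """Period T = 2*pi/omega."""
    return 2 * math.pi / omega

def phase_shift(omega, a):
    """Phase phi = omega * a, so that A sin(omega (t - a)) = A sin(omega t - phi)."""
    return omega * a

A, omega, a = 2.0, 3.0, 0.5   # illustrative values only
T = period(omega)
phi = phase_shift(omega, a)
t = 1.234

# The cycle starts (y passes through zero) at t = a.
print(shifted_sine(a, A, omega, a))
# The time-shift form and the phase-shift form give the same value.
print(math.isclose(shifted_sine(t, A, omega, a), A * math.sin(omega * t - phi)))
# Shifting t by one full period T leaves the value unchanged.
print(math.isclose(shifted_sine(t, A, omega, a), shifted_sine(t + T, A, omega, a)))
```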

14.2.2 Rhythmic processes


Many natural phenomena are cyclic. It is often convenient to represent such phenomena
with one or another simple periodic functions, and sine and cosine can be adapted for
the purpose. The idea is to pick the right function, the right frequency (or period), the
amplitude, and possibly the phase shift, so as to represent the desired behaviour.
To select one or another of these functions, it helps to remember that cosine starts a
cycle (at t = 0) at its peak value, while sine starts the cycle at 0, i.e., at its average value. A
function that starts at the lowest point of the cycle is $-\cos(t)$. In most cases, the choice of
function to use is somewhat arbitrary, since a phase shift can correct for the phase at which
the oscillation starts.
Next, we pick a constant $\omega$ such that the trigonometric function $\sin(\omega t)$ (or $\cos(\omega t)$)
has the correct period. Given a period for the oscillation, T, recall that the corresponding
frequency is simply $\omega = 2\pi/T$. We then select the amplitude, and horizontal and vertical
shifts to complete the mission. The examples below illustrate this process.


Example 14.1 (Daylight hours:) In Vancouver, the shortest day (8 hours of light) occurs
around December 22, and the longest day (16 hours of light) is around June 21. Approximate the cyclic changes of daylight through the season using the sine function.
Solution: On Sept 21 and March 21 the lengths of day and night are equal, and then there
are 12 hours of daylight. (Each of these days is called an equinox.) Suppose we
identify March 21 as the beginning of a yearly day-night length cycle. Let t be time in
days beginning on March 21. One full cycle takes a year, i.e. 365 days. The period of the
function we want is thus
\[ T = 365 \]
and its frequency is
\[ \omega = 2\pi/365. \]
Daylight shifts between the two extremes of 8 and 16 hours: i.e. $12 \pm 4$ hours. This means
that the amplitude of the cycle is 4 hours. The oscillation takes place about the average value
of 12 hours. We have decided to start a cycle on a day for which the number of daylight
hours is the average value (12). This means that the sine would be most appropriate, so the
function that best describes the number of hours of daylight at different times of the year
is:
\[ D(t) = 12 + 4\sin\left(\frac{2\pi}{365}\,t\right) \]
where t is time in days and D the number of hours of light.
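As a sanity check on this model, we can evaluate it at a few points of the cycle. The sketch below assumes the formula D(t) = 12 + 4 sin(2πt/365) with t counted in days from March 21; the function name `daylight_hours` is our own.

```python
import math

def daylight_hours(t):
    """D(t) = 12 + 4 sin(2*pi*t/365), with t in days after March 21."""
    return 12 + 4 * math.sin(2 * math.pi * t / 365)

print(daylight_hours(0))             # start of the cycle (March 21): the average, 12 hours
print(daylight_hours(365 / 4))       # a quarter cycle later: the 16-hour maximum
print(daylight_hours(3 * 365 / 4))   # three quarters through the cycle: the 8-hour minimum
```

In this simple model the extremes fall exactly a quarter and three quarters of a year after March 21, which is close to, but not exactly, the calendar dates quoted in the example.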
Example 14.2 (Hormone levels:) The level of a certain hormone in the bloodstream fluctuates between undetectable concentration at 7:00 and 100 ng/ml at 19:00 hours. Approximate the cyclic variations in this hormone level with the appropriate periodic trigonometric
function. Let t represent time in hours from 0:00 hrs through the day.
Solution: We first note that it takes one day (24 hours) to complete a cycle. This means
that the period of oscillation is 24 hours, so that the frequency is
\[ \omega = \frac{2\pi}{T} = \frac{2\pi}{24} = \frac{\pi}{12}. \]
The variation in the level of hormone is between 0 and 100 ng/ml, which can be expressed
as $50 \pm 50$ ng/ml. (The trigonometric functions are symmetric cycles, and we are here
finding both the average value about which cycles occur and the amplitude of the cycles.)
We could consider the time midway between the low and high points, namely 13:00 hours,
as the point corresponding to the upswing at the start of a cycle of the sine function. (See
Figure 14.5 for the sketch.) Thus, if we use a sine to represent the oscillation, we should
shift it by 13 hrs to the right.
Assembling these observations, we obtain the level of hormone, H, at time t in hours:
\[ H(t) = 50 + 50\sin\left(\frac{\pi}{12}(t - 13)\right). \]

Figure 14.5. Hormonal cycles. The full cycle is 24 hrs. The level H(t) swings
between 0 and 100 ng. From the given information, we see that the average level is 50 ng,
and that the origin of a representative sine curve should be at t = 13 (i.e. 1/4 of the cycle
which is 6 hrs past the time point t = 7) to depict this cycle.
In the expression above, the number 13 represents a shift along the time axis, and carries
units of time. We can express this same function in the form
\[ H(t) = 50 + 50\sin\left(\frac{\pi t}{12} - \frac{13\pi}{12}\right). \]
In this version, the quantity
\[ \phi = \frac{13\pi}{12} \]
is the phase shift.

In selecting the periodic function to use for this example, we could have made other
choices. For example, the same periodic behaviour can be represented by any of the functions listed
below:
\[ H_1(t) = 50 - 50\sin\left(\frac{\pi}{12}(t - 1)\right), \]
\[ H_2(t) = 50 + 50\cos\left(\frac{\pi}{12}(t - 19)\right), \]
\[ H_3(t) = 50 - 50\cos\left(\frac{\pi}{12}(t - 7)\right). \]
All these functions have the same values, the same amplitudes, and the same periods.
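The claim that these four expressions describe one and the same oscillation can be checked numerically. The sketch below is an illustration only: it assumes the frequency π/12 per hour and the shifts 13, 1, 19 and 7 hours used in this example, and compares the four functions at a few sample times of our choosing.

```python
import math

W = math.pi / 12  # frequency omega = 2*pi/24, in radians per hour

def H(t):
    return 50 + 50 * math.sin(W * (t - 13))

def H1(t):
    return 50 - 50 * math.sin(W * (t - 1))

def H2(t):
    return 50 + 50 * math.cos(W * (t - 19))

def H3(t):
    return 50 - 50 * math.cos(W * (t - 7))

for t in [0, 3.5, 7, 13, 19, 22]:
    values = [f(t) for f in (H, H1, H2, H3)]
    agree = all(math.isclose(v, values[0], abs_tol=1e-9) for v in values)
    print(f"t = {t:>4}: H = {values[0]:7.3f}, all four agree: {agree}")
```

The rows for t = 7 and t = 19 should show levels of about 0 and 100 ng/ml, matching the low and high points given in the problem.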
Example 14.3 (Phases of the moon:) A cycle of waxing and waning moon takes 29.5
days approximately. Construct a periodic function to describe the changing phases, starting
with a new moon (totally dark) and ending one cycle later.
Solution: The period of the cycle is T = 29.5 days, so
\[ \omega = \frac{2\pi}{T} = \frac{2\pi}{29.5}. \]

Figure 14.6. Periodic moon phases

For this example, we will use the cosine function, for practice. Let P(t) be the fraction
of the moon showing on day t in the cycle. Then we should construct the function so that
$0 \le P \le 1$, with $P = 1$ in mid cycle (see Figure 14.6). The cosine function swings
between the values -1 and 1. To obtain a positive function in the desired range for P(t), we
will add a constant and scale the cosine as follows:
\[ \frac{1}{2}[1 + \cos(\omega t)]. \]
This is not quite right, though, because at t = 0 this function takes the value 1, rather than
0, as shown in Figure 14.6. To correct this we can either introduce a phase shift, i.e. set
\[ P(t) = \frac{1}{2}[1 + \cos(\omega t + \pi)] \]
(then when t = 0, we get $P(0) = 0.5[1 + \cos \pi] = 0.5[1 - 1] = 0$), or we can write
\[ P(t) = \frac{1}{2}[1 - \cos(\omega t)], \]
which achieves the same result.
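A short numerical comparison illustrates that a phase shift of π and a sign change of the cosine lead to the same function. In the sketch below we assume the two forms are P(t) = ½[1 + cos(ωt + π)] and P(t) = ½[1 − cos(ωt)] with ω = 2π/29.5; the function names are ours.

```python
import math

T = 29.5                 # length of one lunar cycle in days
omega = 2 * math.pi / T  # corresponding frequency

def moon_phase_shift(t):
    """P(t) = (1/2)[1 + cos(omega t + pi)]."""
    return 0.5 * (1 + math.cos(omega * t + math.pi))

def moon_minus_cos(t):
    """P(t) = (1/2)[1 - cos(omega t)]."""
    return 0.5 * (1 - math.cos(omega * t))

# New moon at t = 0, full moon at mid cycle, new moon again at t = T.
for t in [0, T / 4, T / 2, T]:
    print(f"day {t:5.2f}: {moon_phase_shift(t):.3f}  {moon_minus_cos(t):.3f}")
```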

14.3 Inverse Trigonometric functions


The introduction of trigonometric functions in this chapter provides another opportunity
to illustrate the roles and properties of inverse functions17. In this section, we investigate
inverse trigonometric functions. As in other examples, the inverse of a given function leads
to exchange of the roles of the dependent and independent variables, as well as the roles
of the domain and range. Geometrically, an inverse function is obtained by reflecting the
original function about the line y = x. However, we must take care that the resulting graph
represents a true function, i.e. satisfies all the properties required of a function.
17 The material in this section could be omitted without loss of continuity in the next chapter. If this is done, the
instructor can merely skip Sections 15.1.3 and 15.3.3.


Section 14.3 Learning goals


1. Review the concept of an inverse function, and be able to apply this idea to trigonometric functions.
2. Understand the requirement of restricting the domain (of the original function) so as
to be able to define its inverse. Given any of the trigonometric functions, be able to
identify a suitably restricted domain on which an inverse function can be defined.
3. Be able to simplify and/or interpret the meaning of expressions involving the trigonometric and inverse trigonometric functions.
The domains of sin(x) and cos(x) are both $-\infty < x < \infty$ while their ranges are
$-1 \le y \le 1$. In the case of the function tan(x), the domain excludes the values $\pm\pi/2$ as
well as the angles $2n\pi \pm \pi/2$ (n an integer) at which the function is undefined. The range of tan(x) is
$-\infty < y < \infty$.
There is one difficulty in defining inverses for trigonometric functions: the fact that
these functions repeat their values in a cyclic pattern means that a given y value is obtained
from many possible values of x. For example, the values $x = \pi/2, 5\pi/2, 9\pi/2$, etc.
all have identical sine values sin(x) = 1. We say that these functions are not one-to-one.
Geometrically, this is just saying that the graphs of the trig functions intersect a horizontal
line in numerous places. When these graphs are reflected about the line y = x, they would
intersect a vertical line in many places, and would fail to be functions: the function would
have multiple y values corresponding to the same value of x, which is not allowed. The
reader may recall that a similar difficulty was encountered in an earlier chapter with the
inverse function for $y = x^2$.
We can avoid this difficulty by restricting the domains of the trigonometric functions
to a portion of their graphs that does not repeat. To do so, we select an interval over
which the given trigonometric function is one-to-one, i.e. over which there is a unique
correspondence between values of x and values of y. (This just means that we keep a
portion of the graph of the function in which the y values are not repeated.) We then define
the corresponding inverse function, as described below.
Arcsine is the inverse of sine
The function y = sin(x) is one-to-one on the interval $-\pi/2 < x < \pi/2$. We will define
the associated function y = Sin(x) (shown in red in Figures 14.7(a) and (b)) by restricting
the domain of the sine function to $-\pi/2 < x < \pi/2$. On the given interval, we have
$-1 < \mathrm{Sin}(x) < 1$. We define the inverse function, called arcsine,
\[ y = \arcsin(x), \qquad -1 < x < 1, \]
in the usual way, by reflection of Sin(x) through the line y = x as shown in Figure F.3(a).
To interpret this function, we note that arcsin(x) is the angle whose sine is x. In
Figure 14.8, we show a triangle in which $\theta = \arcsin(x)$. This follows from the observation
that the sine of $\theta$, opposite over hypotenuse, is x/1, which is simply x. The length of
the other side of the triangle is then $\sqrt{1 - x^2}$ by the Pythagorean theorem.

Figure 14.7. (a) The original trigonometric function, sin(x), in black, as well as
the portion restricted to a smaller domain, Sin(x), in red. The red curve is shown again
in part (b). (b) Relationship between the functions Sin(x), defined on $-\pi/2 < x < \pi/2$ (in
red) and arcsin(x) defined on $-1 < x < 1$ (in blue). Note that one is the reflection of the
other about the line y = x. The graphs in parts (a) and (b) are not on the same scale.

Figure 14.8. This triangle has been constructed so that $\theta$ is an angle whose sine
is x/1 = x. This means that $\theta = \arcsin(x)$.

For example, $\arcsin(\sqrt{2}/2)$ is the angle whose sine is $\sqrt{2}/2$, namely $\pi/4$. (We see
this by checking the values of trig functions of standard angles shown in Table 1.) A few
other inter-conversions are given by the examples below.
The functions sin(x) and arcsin(x) reverse (or invert) each other's effect, that is:
\[ \arcsin(\sin(x)) = x \quad \text{for } -\pi/2 < x < \pi/2, \]
\[ \sin(\arcsin(x)) = x \quad \text{for } -1 < x < 1. \]

There is a subtle point that the allowable values of x that can be plugged in are not exactly
the same for the two cases. In the first case, x is an angle whose sine we compute first, and
then reverse the procedure. In the second case, x is a number whose arc-sine is an angle.
We can evaluate arcsin(sin(x)) for any value of x, but the result may not agree with
the original value of x unless we restrict attention to the interval $-\pi/2 < x < \pi/2$. For
example, if $x = \pi$, then sin(x) = 0 and arcsin(sin(x)) = arcsin(0) = 0, which is not the
same as $x = \pi$. For the other case, i.e. for sin(arcsin(x)), we cannot plug in any value of
x outside of $-1 < x < 1$, since arcsin(x) is simply not defined at all outside this interval.
This demonstrates that care must be taken in handling the inverse trigonometric functions.
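These subtleties are easy to observe on a computer. The sketch below uses `math.asin` from Python's standard library as a stand-in for arcsin; the sample inputs 0.4, 0.3 and 2.0 are our own choices.

```python
import math

inside = 0.4                            # lies in -pi/2 < x < pi/2
print(math.asin(math.sin(inside)))      # recovers 0.4 up to rounding
print(math.asin(math.sin(math.pi)))     # close to 0, not pi
print(math.sin(math.asin(0.3)))         # recovers 0.3 up to rounding

try:
    math.asin(2.0)                      # 2 is outside the domain of arcsin
except ValueError as err:
    print("arcsin(2) is undefined:", err)
```

In floating-point arithmetic sin(π) is a tiny nonzero number rather than exactly 0, so the second line prints a value that is very close to, but not exactly, zero.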
Inverse cosine


Figure 14.9. (a) The original function cos(x), is shown in black; the restricted domain version, Cos(x) is shown in red. The same red curve appears in part (b) on a slightly
different scale. (b) Relationship between the functions Cos(x) (in red) and arccos(x) (in
blue). Note that one is the reflection of the other about the line y = x.
We cannot use the same interval to restrict the cosine function, because it has the
same y values to the right and left of the origin. If we pick the interval $0 < x < \pi$, this
difficulty is avoided, since we arrive at a one-to-one function. We will call the restricted-domain version of cosine by the name y = Cos(x) = cos(x) for $0 < x < \pi$. (See the red
curve in Figure 14.9(a).) On the interval $0 < x < \pi$, we have $1 > \mathrm{Cos}(x) > -1$ and we
define the corresponding inverse function
\[ y = \arccos(x), \qquad -1 < x < 1, \]
as shown in blue in Figure 14.9(b).


We understand the meaning of the expression y = arccos(x) as the angle (in
radians) whose cosine is x. For example, $\arccos(0.5) = \pi/3$ because $\pi/3$ is an angle
whose cosine is 1/2. In Figure 14.10, we show a triangle constructed specifically so that
$\theta = \arccos(x)$. Again, this follows from the fact that $\cos(\theta)$ is adjacent over hypotenuse.
The length of the third side of the triangle, $\sqrt{1 - x^2}$, is obtained using the Pythagorean theorem.

Figure 14.10. This triangle has been constructed so that $\theta$ is an angle whose
cosine is x/1 = x. This means that $\theta = \arccos(x)$.
The inverse relationship between the functions means that
\[ \arccos(\cos(x)) = x \quad \text{for } 0 < x < \pi, \]
\[ \cos(\arccos(x)) = x \quad \text{for } -1 < x < 1. \]

The same subtleties apply as in the previous case discussed for arc-sine.
Inverse tangent


Figure 14.11. (a) The function tan(x) is shown in black, and Tan(x) in red. The
same red curve is repeated in part (b). (b) Relationship between the functions Tan(x) (in red)
and arctan(x) (in blue). Note that one is the reflection of the other about the line y = x.
The function y = tan(x) is one-to-one on the interval −π/2 < x < π/2, which is
similar to the case for Sin(x). We therefore restrict the domain to −π/2 < x < π/2, that is,
we define
\[ y = \mathrm{Tan}(x) = \tan(x), \qquad -\pi/2 < x < \pi/2. \]
Unlike sine, as x approaches either endpoint of this interval, the value of Tan(x) approaches ±∞, i.e. −∞ < Tan(x) < ∞. This means that the domain of the inverse
function will be from −∞ to ∞, i.e. it will be defined for all values of x. We define the
inverse tan function:
\[ y = \arctan(x), \qquad -\infty < x < \infty. \]
As before, we can understand the meaning of the inverse tan function by constructing a
triangle in which θ = arctan(x), shown in Figure 14.12.

[Figure 14.12: right triangle with side 1 adjacent to the angle θ, opposite side x, and hypotenuse √(1 + x²).]
Figure 14.12. This triangle has been constructed so that θ is an angle whose tan
is x/1 = x. This means that θ = arctan(x).
The inverse tangent "inverts" the effect of the tangent on the relevant interval:
\[ \arctan(\tan(x)) = x \quad \text{for } -\pi/2 < x < \pi/2, \qquad \tan(\arctan(x)) = x \quad \text{for } -\infty < x < \infty. \]
The same comments hold in this case.
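
The role of the restricted domains can be seen in a short numerical experiment. The following is a minimal sketch using Python's standard math module (the helper names are ours, chosen for illustration): composing the inverse with the original function returns x only when x lies in the restricted interval.

```python
import math

def arccos_of_cos(x):
    """Return arccos(cos(x)); this equals x only when 0 <= x <= pi."""
    return math.acos(math.cos(x))

def arctan_of_tan(x):
    """Return arctan(tan(x)); this equals x only when -pi/2 < x < pi/2."""
    return math.atan(math.tan(x))

# Inside the restricted domains the round trip recovers x.
print(arccos_of_cos(2.0))   # approximately 2.0
print(arctan_of_tan(1.0))   # approximately 1.0

# Outside them, we get a different angle with the same cosine (or tangent).
print(arccos_of_cos(4.0))   # approximately 2*pi - 4, not 4
print(arctan_of_tan(2.0))   # approximately 2 - pi, not 2
```

In both "outside" cases the result is the unique angle in the restricted interval that shares the same cosine (or tangent) value.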


A summary of the above inverse trigonometric functions, showing their graphs on a
single page, is provided in Fig. F.3 in Appendix F. Some of the standard angles allow us
to define precise values for the inverse trig functions. A table of such standard values is
given in the same Appendix (See Table F.2). For other values of x, one has to calculate the
decimal approximation of the function using a scientific calculator.

Example 14.4 Simplify the following expressions: (a) arcsin(sin(π/4)), (b) arccos(sin(−π/6)).

Solution: (a) arcsin(sin(π/4)) = π/4, since the functions are simple inverses of one another
on the domain −π/2 < x < π/2.
(b) We evaluate this expression piece by piece: First, note that sin(−π/6) = −1/2.
Then arccos(sin(−π/6)) = arccos(−1/2) = 2π/3. The last equality is obtained from the
table of values prepared above.
Example 14.5 Simplify the expressions: (a) tan(arcsin(x)), (b) cos(arctan(x)).

Solution: (a) Consider first the expression arcsin(x), and note that this represents an angle
(call it θ) whose sine is x, i.e. sin(θ) = x. Refer to Figure 14.8 for a sketch of a triangle in
which this relationship holds. Now note that tan(θ) in this same triangle is the ratio of the
opposite side to the adjacent side, i.e.
\[ \tan(\arcsin(x)) = \frac{x}{\sqrt{1 - x^2}}. \]
(b) Figure 14.12 shows a triangle that captures the relationship tan(θ) = x or θ =
arctan(x). The cosine of this angle is the ratio of the adjacent side to the hypotenuse, so
that
\[ \cos(\arctan(x)) = \frac{1}{\sqrt{x^2 + 1}}. \]
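
The two simplifications in Example 14.5 can be spot-checked numerically. The sketch below (function names are ours, for illustration only) compares each composite expression with its algebraic form at a few sample values of x in −1 < x < 1.

```python
import math

def tan_arcsin(x):
    """Algebraic form of tan(arcsin(x)), valid for -1 < x < 1."""
    return x / math.sqrt(1 - x**2)

def cos_arctan(x):
    """Algebraic form of cos(arctan(x)), valid for all real x."""
    return 1 / math.sqrt(x**2 + 1)

samples = (-0.9, -0.3, 0.0, 0.5, 0.8)
for x in samples:
    lhs_a = math.tan(math.asin(x))   # composite expression (a)
    lhs_b = math.cos(math.atan(x))   # composite expression (b)
    print(x, lhs_a, tan_arcsin(x), lhs_b, cos_arctan(x))
```

Agreement at sample points is not a proof, but it is a quick way to catch a sign or algebra slip in the triangle argument.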


Exercises
14.1. Convert the following expressions in radians to degrees:
(a) π (b) 5π/3 (c) 21π/23 (d) 24π
Convert the following expressions in degrees to radians:
(e) 100° (f) 8° (g) 450° (h) 90°
Using a Pythagorean triangle, evaluate each of the following:
(i) cos(π/3) (j) sin(π/4) (k) tan(π/6)
14.2. Graph the following functions over the indicated ranges:
(a) y = x sin(x) for −2π < x < 2π

(b) y = e^x cos(x) for 0 < x < 4π.

14.3. Sketch the graph for each of the following functions:



1

(a) y = sin 3 x
2
4
(b) y = 2 sin x

(c) y = 3 cos 2x



1
x+
(d) y = 2 cos
2
4
14.4. The radian is an important unit associated with angles. One revolution about a circle
is equivalent to 360 degrees or 2π radians. Convert the following angles (in degrees)
to angles in radians. (Express these as multiples of π, not as decimal expansions):
(a) 45 degrees
(b) 30 degrees
(c) 60 degrees
(d) 270 degrees.
Find the sine and the cosine of each of these angles.
14.5. Find the appropriate trigonometric function to describe the following rhythmic processes:
(a) Daily variations in the body temperature T(t) of an individual over a single
day, with the maximum of 37.5°C at 8:00 am and a minimum of 36.7°C 12
hours later.
(b) Sleep-wake cycles with peak wakefulness (W = 1) at 8:00 am and 8:00pm
and peak sleepiness (W = 0) at 2:00pm and 2:00 am.
(For parts (a) and (b) express t as time in hours with t = 0 taken at 0:00 am.)
14.6. Find the appropriate trigonometric function to describe the following rhythmic processes:
(a) The displacement S cm of a block on a spring from its equilibrium position,
with a maximum displacement 3 cm and minimum displacement −3 cm, a
period of 2π/√(g/l), and at t = 0, S = 3.
(b) The vertical displacement y of a boat that is rocking up and down on a lake. y
was measured relative to the bottom of the lake. It has a maximum displacement of 12 meters and a minimum of 8 meters, a period of 3 seconds, and
an initial displacement of 11 meters when measurement was first started (i.e.,
t = 0).
14.7. Sunspot cycles: The number of sunspots (solar storms on the sun) fluctuates with
roughly 11-year cycles with a high of 120 and a low of 0 sunspots detected. A peak
of 120 sunspots was detected in the year 2000. Which of the following trigonometric
functions could be used to approximate this cycle?
(A) N = 60 + 120 sin((2π/11)(t − 2000) + π/2),
(B) N = 60 + 60 sin((11/(2π))(t + 2000)),
(C) N = 60 + 60 cos((11/(2π))(t + 2000)),
(D) N = 60 + 60 sin((2π/11)(t − 2000)),
(E) N = 60 + 60 cos((2π/11)(t − 2000)).

14.8. The inverse trigonometric function arctan(x) (also written tan⁻¹(x)) means the
angle θ, where −π/2 < θ < π/2, whose tan is x. Thus cos(arctan(x)) (or cos(tan⁻¹(x)))
is the cosine of that same angle. By using a right triangle whose sides have length
1, x and √(1 + x²), we can verify that
\[ \cos(\arctan(x)) = 1/\sqrt{1 + x^2}. \]
Use a similar geometric argument to arrive at a simplification of the following functions:
(a) sin(arcsin(x)),
(b) tan(arcsin(x)),
(c) sin(arccos(x)).
14.9. Inverse trig: The value of tan(arccos(x)) is which of the following?
(A) 1 − x², (B) x, (C) 1 + x², (D) √(1 − x²)/x, (E) √(1 + x²)/x.
14.10. Inverse trig functions: The function y = tan(arctan(x)) has the following domain
and range
(A) Domain 0 x ; Range y

(B) Domain x ; Range y

(C) Domain x ; Range y ;

(D) Domain /2 x /2; Range /2 y /2;


(E) Domain x ; Range 0 y


Chapter 15

Cycles, periods, and rates of change

15.1 Derivatives of trigonometric functions


Having acquainted ourselves with properties of the trigonometric functions and their inverses in Chapter 14, we are ready to compute their derivatives and apply our results to
understanding rates of change of these periodic functions. We compute derivatives in this
section, and use these results in a medley of problems on optima, related rates, and differential equations afterwards.

Section 15.1 Learning goals


1. Be able to use the definition of the derivative to calculate the derivatives of sin(x)
and cos(x).
2. Using the quotient rule, be able to compute derivatives of tan(x), sec(x), csc(x), and
cot(x).
3. Using properties of the inverse trigonometric functions and implicit differentiation,
be able to calculate derivatives of arcsin(x), arccos(x), and arctan(x).

15.1.1 Limits of trigonometric functions

In Chapter 3, we zoomed in on the graph of the sine function close to the origin (Fig. 3.2).
By doing so, we reasoned that
\[ \sin(x) \approx x \quad \text{for small } x. \]
Restated, with h replacing the variable x, we would have sin(h) ≈ h for small h, or in
more formal limit notation,
\[ \lim_{h \to 0} \frac{\sin(h)}{h} = 1. \tag{15.1} \]
(See (3.1).) This is a very important limit, which will be used directly in computing the
derivative of the trigonometric functions using the definition of the derivative.
A similar analysis of the graph of the cosine function (here omitted) leads to a second
important limit:
\[ \lim_{h \to 0} \frac{\cos(h) - 1}{h} = 0. \tag{15.2} \]
We can now apply these to computing derivatives.
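
Both limits can be illustrated numerically by evaluating the two ratios for a shrinking sequence of h values. This is only a sketch to build intuition (the function names are ours), not a substitute for the graphical argument.

```python
import math

def sin_ratio(h):
    """The ratio sin(h)/h that appears in limit (15.1)."""
    return math.sin(h) / h

def cos_ratio(h):
    """The ratio (cos(h) - 1)/h that appears in limit (15.2)."""
    return (math.cos(h) - 1) / h

# As h shrinks, the first ratio approaches 1 and the second approaches 0.
for h in (0.1, 0.01, 0.001, 0.0001):
    print(h, sin_ratio(h), cos_ratio(h))
```

The second ratio behaves roughly like −h/2 for small h, which is why it tends to zero.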

15.1.2 Derivatives of sine, cosine, and other trigonometric functions

Let y = f (x) = sin(x) be the function to differentiate, where x is now the independent
variable (previously called t). Below, we use the definition of the derivative to compute the
derivative of this function.
Example 15.1 (Derivative of sin(x):) Compute the derivative of y = sin(x) using the
definition of the derivative.
Solution: We apply the definition of the derivative, f'(x) = lim_{h→0} [f(x + h) − f(x)]/h, as follows:
\[
\begin{aligned}
\frac{d\sin(x)}{dx} &= \lim_{h \to 0} \frac{\sin(x + h) - \sin(x)}{h} \\
&= \lim_{h \to 0} \frac{\sin(x)\cos(h) + \sin(h)\cos(x) - \sin(x)}{h} \\
&= \lim_{h \to 0} \left( \sin(x)\,\frac{\cos(h) - 1}{h} + \cos(x)\,\frac{\sin(h)}{h} \right) \\
&= \sin(x) \left( \lim_{h \to 0} \frac{\cos(h) - 1}{h} \right) + \cos(x) \left( \lim_{h \to 0} \frac{\sin(h)}{h} \right) \\
&= \cos(x).
\end{aligned}
\]
Observe that the limits (15.1) and (15.2) were used in arriving at our final result.
A similar calculation using the function cos(x) leads to the result
\[ \frac{d\cos(x)}{dx} = -\sin(x). \]
(The same two limits appear in this calculation as well.) We can now calculate the derivative of any of the other trigonometric functions using the quotient rule.
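
For completeness, the analogous calculation for cos(x) mentioned above can be sketched as follows. It relies on the angle-addition formula cos(x + h) = cos(x) cos(h) − sin(x) sin(h), which is assumed here from standard trigonometry, together with the limits (15.1) and (15.2):
\[
\begin{aligned}
\frac{d\cos(x)}{dx} &= \lim_{h \to 0} \frac{\cos(x + h) - \cos(x)}{h} \\
&= \lim_{h \to 0} \frac{\cos(x)\cos(h) - \sin(x)\sin(h) - \cos(x)}{h} \\
&= \cos(x) \left( \lim_{h \to 0} \frac{\cos(h) - 1}{h} \right) - \sin(x) \left( \lim_{h \to 0} \frac{\sin(h)}{h} \right) \\
&= \cos(x) \cdot 0 - \sin(x) \cdot 1 = -\sin(x).
\end{aligned}
\]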
Example 15.2 (Derivative of the function tan(x):) Compute the derivative of y = tan(x).

Solution: We apply the quotient rule:
\[ \frac{d\tan(x)}{dx} = \frac{[\sin(x)]' \cos(x) - [\cos(x)]' \sin(x)}{\cos^2(x)}. \]
Using the recently found derivatives for the sine and cosine, we have
\[ \frac{d\tan(x)}{dx} = \frac{\sin^2(x) + \cos^2(x)}{\cos^2(x)}. \]
But the numerator of the above can be simplified using the trigonometric identity (14.1),
leading to
\[ \frac{d\tan(x)}{dx} = \frac{1}{\cos^2(x)} = \sec^2(x). \]
The derivatives of the six trigonometric functions are given in the table below. The
reader may wish to practice the use of the quotient rule by verifying one or more of the
derivatives of the relatives csc(x) or sec(x). In practice, the most important functions are
the first three, and their derivatives should be remembered, as they are frequently encountered in practical applications.

y = f(x)      f'(x)
sin(x)        cos(x)
cos(x)        −sin(x)
tan(x)        sec²(x)
csc(x)        −csc(x) cot(x)
sec(x)        sec(x) tan(x)
cot(x)        −csc²(x)

Table 15.1. Derivatives of the trigonometric functions
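
The entries of Table 15.1 can be spot-checked with a centered difference quotient. The sketch below is illustrative only: the step size h, the sample point x0, and all names are our own choices, and csc, sec, cot are written in terms of sin and cos because Python's math module does not provide them.

```python
import math

def central_difference(f, x, h=1e-6):
    """Approximate f'(x) by the centered difference (f(x+h) - f(x-h)) / (2h)."""
    return (f(x + h) - f(x - h)) / (2 * h)

# Each entry pairs a function with the derivative claimed in Table 15.1.
table = {
    "sin": (math.sin, math.cos),
    "cos": (math.cos, lambda x: -math.sin(x)),
    "tan": (math.tan, lambda x: 1 / math.cos(x) ** 2),
    "csc": (lambda x: 1 / math.sin(x),
            lambda x: -(1 / math.sin(x)) * (math.cos(x) / math.sin(x))),
    "sec": (lambda x: 1 / math.cos(x),
            lambda x: (1 / math.cos(x)) * math.tan(x)),
    "cot": (lambda x: math.cos(x) / math.sin(x),
            lambda x: -1 / math.sin(x) ** 2),
}

x0 = 0.7  # arbitrary sample point where all six functions are defined
for name, (f, fprime) in table.items():
    print(name, central_difference(f, x0), fprime(x0))
```

The two printed columns should agree to several decimal places for each function.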

15.1.3 Derivatives of the inverse trigonometric functions

Implicit differentiation can be used to determine all derivatives of the new functions we
have just defined. As an example, we demonstrate how to compute the derivative of
arctan(x). To do so, we will need to recall that the derivative of the function tan(x) is
sec²(x). We will also use the identity tan²(x) + 1 = sec²(x).
y = f(x)       f'(x)
arcsin(x)      1/√(1 − x²)
arccos(x)      −1/√(1 − x²)
arctan(x)      1/(x² + 1)

Table 15.2. Derivatives of the inverse trigonometric functions.


Let y = arctan(x). Then on the appropriate interval, we can replace this relationship
with the equivalent one:
\[ \tan(y) = x. \]
Differentiating implicitly with respect to x on both sides, we obtain
\[ \sec^2(y)\,\frac{dy}{dx} = 1, \]
\[ \frac{dy}{dx} = \frac{1}{\sec^2(y)} = \frac{1}{\tan^2(y) + 1}. \]
Now using again the relationship tan(y) = x, we obtain
\[ \frac{d\arctan(x)}{dx} = \frac{1}{x^2 + 1}. \]
This will form an important expression used frequently in integral calculus. The derivatives
of the important inverse trigonometric functions are shown in Table 15.2.
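
The same implicit-differentiation steps give the arcsin entry of Table 15.2. The following is a sketch of that calculation, assuming −1 < x < 1 and using the restricted range −π/2 < y < π/2, on which cos(y) > 0 (so the positive square root is the correct one):
\[
\begin{aligned}
y = \arcsin(x) \quad &\Longleftrightarrow \quad \sin(y) = x, \\
\cos(y)\,\frac{dy}{dx} &= 1, \\
\frac{dy}{dx} &= \frac{1}{\cos(y)} = \frac{1}{\sqrt{1 - \sin^2(y)}} = \frac{1}{\sqrt{1 - x^2}}.
\end{aligned}
\]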

15.2 Changing angles and related rates


The examples in this section will allow us to practice chain rule applications using the
trigonometric functions. We will discuss a number of problems, and show how the basic
properties of these functions, together with some geometry are used to arrive at desired
results.

Section 15.2 Learning goals


1. Understand how the chain rule is applied to problems in which geometric quantities
depend on angles that are changing in time (related rates).
2. Given a description of the geometry and rate of change of angle or side (e.g. in
a triangle) be able to set up the mathematical solution to the word-problem using
the ideas of related rates, analysis of the geometry and properties of trigonometric
functions (e.g. trigonometric identities).
Example 15.3 (A point on a circle:) A point moves around the rim of a circle of radius 1
so that the angle θ subtended by the radius vector to that point changes at a constant rate,
\[ \theta = \omega t, \]
where t is time. Determine the rate of change of the x and y coordinates of that point.
Solution: We have θ(t), x(t), and y(t) all functions of t. The fact that θ is proportional to
t means that
\[ \frac{d\theta}{dt} = \omega. \]
The x and y coordinates of the point are related to the angle by
\[ x(t) = \cos(\theta(t)) = \cos(\omega t), \qquad y(t) = \sin(\theta(t)) = \sin(\omega t). \]
This implies (by the chain rule) that
\[ \frac{dx}{dt} = \frac{d\cos(\theta)}{d\theta}\,\frac{d\theta}{dt}, \qquad \frac{dy}{dt} = \frac{d\sin(\theta)}{d\theta}\,\frac{d\theta}{dt}. \]
Performing the required calculations, we have
\[ \frac{dx}{dt} = -\sin(\theta)\,\omega, \qquad \frac{dy}{dt} = \cos(\theta)\,\omega. \]
We will see some interesting consequences of this in a later section.
Example 15.4 (Runners on a circular track:) Two runners start at the same position (call
it x = 0) on a circular race track of length 400 meters. Joe Runner takes 50 sec, while
Michael Johnson takes 43.18 sec to complete the 400 meter race. Determine the rate of
change of the angle formed between the two runners and the center of the track, assuming
that the runners are running at a constant rate.
Solution: The track is 400 meters in length (total). Joe completes one cycle around the
track (2π radians) in 50 sec, while Michael completes a cycle in 43.18 sec. This means
that Joe has a period of T = 50 sec, and a frequency of ω₁ = 2π/T = 2π/50 radians per sec.
Similarly, Michael's period is T = 43.18 sec and his frequency is ω₂ = 2π/T = 2π/43.18
radians per sec. From this, we find that
\[ \frac{d\theta_J}{dt} = \frac{2\pi}{50} \approx 0.126 \text{ radians per sec}, \qquad \frac{d\theta_M}{dt} = \frac{2\pi}{43.18} \approx 0.146 \text{ radians per sec}. \]
Thus the angle between the runners, θ_M − θ_J, changes at the rate
\[ \frac{d(\theta_M - \theta_J)}{dt} \approx 0.146 - 0.126 = 0.02 \text{ radians per sec}. \]
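
The arithmetic in this example is easy to reproduce. A minimal sketch (variable and function names are ours):

```python
import math

def angular_speed(period_seconds):
    """Angular speed in radians per second for one full lap every `period_seconds`."""
    return 2 * math.pi / period_seconds

omega_joe = angular_speed(50.0)        # Joe: one lap in 50 sec
omega_michael = angular_speed(43.18)   # Michael: one lap in 43.18 sec

# Rate of change of the angle between the runners.
gap_rate = omega_michael - omega_joe

print(round(omega_joe, 3), round(omega_michael, 3), round(gap_rate, 3))
```

Carrying more digits than the text shows that the difference is closer to 0.0198 radians per sec, which rounds to the 0.02 quoted above.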
Example 15.5 (Simple law of cosines:) The law of cosines applies to an arbitrary triangle, as reviewed in Appendix F. Consider the triangle shown in Figure 15.1. Suppose that
the angle θ increases at a constant rate, i.e. dθ/dt = k. If the sides a = 3, b = 4 are
of constant length, determine the rate of change of the length c opposite this angle at the
instant that c = 5.

[Figure 15.1: triangle with sides a, b, c and the angle θ opposite side c.]
Figure 15.1. Law of cosines states that c² = a² + b² − 2ab cos(θ).


Solution: Let a, b, c be the lengths of the three sides, with c the length of the side opposite
angle θ. The law of cosines states that
\[ c^2 = a^2 + b^2 - 2ab\cos(\theta). \]
We identify the changing quantities by writing this relation in the form
\[ c^2(t) = a^2 + b^2 - 2ab\cos(\theta(t)) \]
so it is evident that only c and θ will vary with time, while a, b remain constant. We are
also told that
\[ \frac{d\theta}{dt} = k. \]
Differentiating and using the chain rule leads to:
\[ 2c\,\frac{dc}{dt} = -2ab\,\frac{d\cos(\theta)}{d\theta}\,\frac{d\theta}{dt}. \]
But d cos(θ)/dθ = −sin(θ) so that
\[ \frac{dc}{dt} = -\frac{ab}{c}\,(-\sin(\theta))\,\frac{d\theta}{dt} = \frac{ab}{c}\,k\sin(\theta). \]
We now note that at the instant in question, a = 3, b = 4, c = 5, forming a Pythagorean
triangle in which the angle opposite c is θ = π/2. We can see this fact using the law of
cosines, and noting that
\[ c^2 = a^2 + b^2 - 2ab\cos(\theta), \qquad 25 = 9 + 16 - 24\cos(\theta). \]
This implies that 0 = −24 cos(θ), cos(θ) = 0 so that θ = π/2. Substituting these into our
result for the rate of change of the length c leads to
\[ \frac{dc}{dt} = \frac{ab}{c}\,k = \frac{3 \cdot 4}{5}\,k = \frac{12}{5}\,k. \]
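
This related-rates result can be checked against a direct numerical derivative. In the sketch below the value k = 1 is an assumption made purely for illustration (the example leaves k unspecified), and the names are ours.

```python
import math

def side_c(theta, a=3.0, b=4.0):
    """Length of the side opposite angle theta, from the law of cosines."""
    return math.sqrt(a * a + b * b - 2 * a * b * math.cos(theta))

def dc_dt(theta, k, a=3.0, b=4.0):
    """Related-rates formula dc/dt = (a*b/c) * k * sin(theta)."""
    return a * b * k * math.sin(theta) / side_c(theta, a, b)

k = 1.0                  # assumed rate d(theta)/dt, chosen only for illustration
theta0 = math.pi / 2     # the instant at which c = 5
h = 1e-6                 # small time step for the difference quotient

# With theta(t) = theta0 + k*t, approximate dc/dt at t = 0 by a centered difference.
numeric = (side_c(theta0 + k * h) - side_c(theta0 - k * h)) / (2 * h)

print(side_c(theta0), dc_dt(theta0, k), numeric)
```

With k = 1 both the formula and the difference quotient should be close to 12/5 = 2.4.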
Example 15.6 (Clocks:) Find the rate of change of the angle between the minute hand and
hour hand on a clock.

[Figure 15.2: two panels, (a) and (b).]
Figure 15.2. Figure for Examples 15.6 and 15.7.

Solution: We will call θ₁ the angle that the minute hand subtends with the x axis (horizontal direction) and θ₂ the angle that the hour hand makes with the same axis.
If our clock is working properly, each hand will move around at a constant rate.
The hour hand will trace out one complete revolution (2π radians) every 12 hours, while
the minute hand will complete a revolution every hour. Both hands move in a clockwise
direction, which (by convention) is towards negative angles. This means that
\[ \frac{d\theta_1}{dt} = -2\pi \text{ radians per hour}, \qquad \frac{d\theta_2}{dt} = -\frac{2\pi}{12} \text{ radians per hour}. \]
The angle between the two hands is the difference of the two angles, i.e.
\[ \theta = \theta_1 - \theta_2. \]
Thus,
\[ \frac{d\theta}{dt} = \frac{d}{dt}(\theta_1 - \theta_2) = \frac{d\theta_1}{dt} - \frac{d\theta_2}{dt} = -2\pi + \frac{2\pi}{12}. \]
We find that the rate of change of the angle between the hands is
\[ \frac{d\theta}{dt} = -2\pi \cdot \frac{11}{12} = -\frac{11\pi}{6} \text{ radians per hour}. \]
Example 15.7 (Clocks, continued:) Suppose that the length of the minute hand is 4 cm
and the length of the hour hand is 3 cm. At what rate is the distance between the hands
changing when it is 3:00 o'clock?
Solution: We use the law of cosines to give us the rate of change of the desired distance.
We have the triangle shown in Figure 15.2 in which side lengths are a = 3, b = 4, and c(t)
opposite the angle θ(t). From Example 15.5, we have
\[ \frac{dc}{dt} = \frac{ab}{c}\,\sin(\theta)\,\frac{d\theta}{dt}. \]
At precisely 3:00 o'clock, the angle in question is θ = π/2 and it can also be seen that the
Pythagorean triangle abc leads to
\[ c^2 = a^2 + b^2 = 3^2 + 4^2 = 9 + 16 = 25 \]
so that c = 5. We found in Example 15.6 that dθ/dt = −11π/6. Using this information leads to:
\[ \frac{dc}{dt} = \frac{3 \cdot 4}{5}\,\sin(\pi/2)\left(-\frac{11\pi}{6}\right) = -\frac{22\pi}{5} \text{ cm per hr}. \]
The negative sign indicates that at this time, the distance between the two hands is decreasing.

15.3 The Zebra danio's escape responses


We consider an example involving trigonometry and related rates with a biological application. We first consider the geometry on its own, and then link it to the biology of predator
avoidance and escape responses.

Section 15.3 Learning goals


1. Understand the geometry of a visual angle, and determine how that angle changes as
the distance to the viewed object (or the size of the object) changes (an application
of related rates).
2. Be able to determine the rate of change of the visual angle of a prey fish (zebra
danio) as a predator of a given size approaches it at some speed.
3. Understand the link between that rate of change and the triggering of an escape
response.
4. Using the results of the analysis, be able to explain in words under what circumstances the prey does (or does not) manage to escape from its predator.

15.3.1 Visual angles


Example 15.8 (Visual angle:) In the triangle shown in Figure 15.3, an object of height s
is moving towards an observer. Its distance from the observer at some instant is labeled
x(t) and it approaches at some constant speed, v. Determine the rate of change of the angle
θ(t) and how it depends on speed, size, and distance of the object. Often θ is called a visual
angle, since it represents the angle that an image subtends on the retina of the observer. A
more detailed example of this type is discussed in Section 15.3.2.

[Figure 15.3: right triangle with horizontal side x, vertical side s, and the angle θ at the observer.]
Figure 15.3. A visual angle θ would change as the distance x decreases. The size
s is assumed constant. See Example 15.8.

Solution: We are given the information that the object approaches at some constant speed,
v. This means that
\[ \frac{dx}{dt} = -v. \]
(The minus sign means that the distance x is decreasing.) Using the trigonometric relations,
we see that
\[ \tan(\theta) = \frac{s}{x}. \]
If the size, s, of the object is constant, then the changes with time imply that
\[ \tan(\theta(t)) = \frac{s}{x(t)}. \]
We differentiate both sides of this equation with respect to t, and obtain
\[ \frac{d\tan(\theta)}{d\theta}\,\frac{d\theta}{dt} = \frac{d}{dt}\left(\frac{s}{x(t)}\right), \]
\[ \sec^2(\theta)\,\frac{d\theta}{dt} = -s\,\frac{1}{x^2}\,\frac{dx}{dt}, \]
so that
\[ \frac{d\theta}{dt} = -s\,\frac{1}{\sec^2(\theta)}\,\frac{1}{x^2}\,\frac{dx}{dt}. \]
We can use the trigonometric identity
\[ \sec^2(\theta) = 1 + \tan^2(\theta) \]
to express our answer in terms only of the size, s, the distance of the object, x and the
speed:
\[ \sec^2(\theta) = 1 + \left(\frac{s}{x}\right)^2 = \frac{x^2 + s^2}{x^2} \]
so
\[ \frac{d\theta}{dt} = -s\,\frac{x^2}{x^2 + s^2}\,\frac{1}{x^2}\,\frac{dx}{dt} = \frac{s}{x^2 + s^2}\,v. \]
(Two minus signs cancelled above.) Thus, the rate of change of the visual angle is sv/(x² + s²).
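
As a sanity check, the formula sv/(x² + s²) can be compared with a numerical derivative of θ(t) = arctan(s/x(t)). The numbers below (s, v, x0) are arbitrary illustrative values, not data from the text, and the function names are ours.

```python
import math

def visual_angle(x, s):
    """Visual angle theta with tan(theta) = s/x."""
    return math.atan(s / x)

def visual_angle_rate(x, s, v):
    """Rate of change of the visual angle from Example 15.8: s*v / (x^2 + s^2)."""
    return s * v / (x**2 + s**2)

s, v, x0 = 0.5, 2.0, 3.0   # assumed size, approach speed, and current distance
h = 1e-6                   # small time step

# The object is at distance x0 - v*t, so it is closer at time +h than at time -h.
numeric = (visual_angle(x0 - v * h, s) - visual_angle(x0 + v * h, s)) / (2 * h)

print(visual_angle_rate(x0, s, v), numeric)
```

The rate is positive, as expected: the object looms larger as it approaches.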

Figure 15.4. A cartoon showing the visual angle, α(t), and how it changes as a
predator of size S approaches its prey, the zebra danio.

Figure 15.5. The geometry of the escape response problem.

15.3.2 The Zebra danio and a looming predator


Visual angles are important to predator avoidance. We use the ideas of Example 15.8 to
consider a problem in biology, studied by Larry Dill, a biologist at Simon Fraser University
in Burnaby, BC.
The Zebra danio is a small tropical fish, which has many predators (larger fish) eager
to have it for dinner. Surviving through the day means being able to sense danger quickly
enough to escape from a hungry pair of jaws. However, the danio cannot spend all its time
escaping. It too, must find food, mates, and carry on activities that sustain it. Thus, a
finely tuned mechanism which allows it to react to danger but avoid over-reacting would be
advantageous. We investigate the visual basis of an escape response, based on a hypothesis
formulated by Dill in his papers [5, 6].
Figure 15.5 shows the relation between the angle α subtended at the Danio's eye and
the size S of an approaching predator, currently located at distance x away. We will assume
that the predator has a profile of size S and that it is approaching the prey at a constant speed
v. This means that the distance x satisfies
\[ \frac{dx}{dt} = -v. \]
If we consider the top half of the triangle shown in Figure 15.5 we find a Pythagorean
triangle identical to the one we have seen in Example 15.8 provided we identify θ = α/2,
s = S/2. The side labeled x is identical in both pictures. Thus, the trigonometric relation
that holds is:
\[ \tan\left(\frac{\alpha}{2}\right) = \frac{(S/2)}{x}. \tag{15.3} \]
Furthermore, based on the results of Example 15.8 (and the fact that α = 2θ), we know that dα/dt is
\[ \frac{d\alpha}{dt} = 2\,\frac{S/2}{x^2 + (S/2)^2}\,v = \frac{Sv}{x^2 + (S^2/4)}. \]
We first observe that dα/dt depends on the size of the predator, S, its speed, v, and its
distance away at the given instant. In fact, we can plot the way that this expression depends
on the distance x by noting the following:
• When x = 0, i.e., when the predator has reached its prey,
\[ \frac{d\alpha}{dt} = \frac{Sv}{0 + (S^2/4)} = \frac{4v}{S}. \]
• For x → ∞, when the predator is very far away, we have a large value x² in the
denominator, so
\[ \frac{d\alpha}{dt} \to 0. \]
A rough sketch of the way that the rate of change of the visual angle depends on the
current distance to the predator is shown in the curve in Figure 15.6.
When to escape?
What sort of visual input should the danio respond to, if it is to be efficient at avoiding the
predator? In principle, we would like to consider a response that has the following features:
• If the predator is too far away, if it is moving slowly, or if it is moving in the opposite
direction, it should appear harmless and should not cause undue panic and inappropriate escape response, since this uses up the prey's energy to no good purpose.
• If the predator is coming quickly towards the danio, and approaching directly, it
should be perceived as a threat and should trigger the escape response.
In keeping with these reasonable expectations, the hypothesis proposed by Dill is
that:
The escape response is triggered when the predator approaches so quickly that
the rate of change of the visual angle is greater than some critical value.
We will call that critical value K_crit. This constant would depend on how skittish
the Danio is, given factors such as perceived risks of its environment. This means that the
escape response is triggered in the Danio when
\[ \frac{d\alpha}{dt} = K_{\text{crit}}, \]
i.e. when
\[ K_{\text{crit}} = \frac{Sv}{x^2 + (S^2/4)}. \]

Figure 15.6(a) illustrates geometrically a solution to this equation. We show the line y =
K_crit and the curve y = Sv/(x² + (S²/4)) superimposed on the same coordinate system.
The value of x, labeled x_react, will be the distance of the predator at the instant that the
Danio realizes that it is under threat and should escape. We can determine the value of this
distance, referred to as the reaction distance, by solving for x.
Large slow predators beat the Danio's escape response
Figure 15.6(b) illustrates a possibility where there is no distance at which K_crit =
Sv/(x² + (S²/4)). This may happen if either the Danio has a very high threshold of alert,
so that it fails to react to threats, or if the curve depicting dα/dt is too low. That happens
either if S is very large (big predator) or if v is small (slow moving predator sneaking up
on its prey). From this scenario, we find that in some situations, the fate of the Danio would
be sealed in the jaws of its pursuer.
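To determine how far away the predator is detected in the happier scenario of Figure 15.6(a), we solve for the reaction distance, x_react:
\[ x^2 + (S^2/4) = \frac{Sv}{K_{\text{crit}}} \quad \Rightarrow \quad x_{\text{react}} = \sqrt{\frac{Sv}{K_{\text{crit}}} - \frac{S^2}{4}} = \sqrt{S\left(\frac{v}{K_{\text{crit}}} - \frac{S}{4}\right)}. \]

[Figure 15.6: panels (a) and (b), each plotting dα/dt against x, with the levels 4v/S and K_crit marked on the vertical axis and x_react on the horizontal axis.]
Figure 15.6. The rate of change of the visual angle dα/dt in two cases, when the
quantity 4v/S is above (a) and below (b) some critical value.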
It is clear that the reaction distance of the Danio with reaction threshold K_crit would
be greatest for certain sizes of predators. In Figure 15.7, we plot the reaction distance x_react
(on the vertical axis) versus the predator size S (horizontal axis). We see that for very small
predators (S ≈ 0) or large predators (S ≈ 4v/K_crit), the distance at which the escape response
is triggered is very small. This means that the Danio may miss noticing such predators
until they are too close for a comfortable escape, resulting in calamity. Some predators will
be detected when they are very far away (large x_react). (We can find the most detectable
size by finding the value of S corresponding to a maximal x_react. The reader may show as
an exercise that this occurs for size S = 2v/K_crit.) At sizes S > 4v/K_crit, the reaction
distance is not defined at all: we have already seen this fact from Figure 15.6(b): when
K_crit > 4v/S, the straight line and the curve fail to intersect, and there is no solution.

Figure 15.7. (a) The reaction distance x_react (on the vertical axis) is shown as a
function of the predator size S (horizontal axis). (b) The reaction distance x_react is shown
as a function of the predator velocity v.

Figure 15.7(b) illustrates the dependence of the reaction distance x_react on the speed
v of the predator. We find that for small values of v, i.e. v < K_crit S/4, x_react is not defined:
the Danio would not notice the threat posed by predators that swim very slowly.
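
The dependence of the reaction distance on predator size can be explored with a small function. This is a sketch under assumed, purely illustrative parameter values (v = 1 and K_crit = 0.5 are not taken from Dill's data), and returning None when no reaction distance exists is our own convention.

```python
import math

def reaction_distance(S, v, K_crit):
    """Reaction distance sqrt(S*v/K_crit - S^2/4), or None when it is undefined.

    None corresponds to the case K_crit > 4v/S, where the line and the curve
    in Figure 15.6(b) never intersect.
    """
    radicand = S * v / K_crit - S**2 / 4
    if radicand < 0:
        return None
    return math.sqrt(radicand)

v, K_crit = 1.0, 0.5          # assumed speed and threshold, for illustration
S_best = 2 * v / K_crit       # size claimed to maximize the reaction distance
S_limit = 4 * v / K_crit      # size beyond which no reaction occurs

for S in (0.5, 2.0, S_best, 6.0, S_limit, 9.0):
    print(S, reaction_distance(S, v, K_crit))
```

Among the sizes listed, the reaction distance is largest at S = 2v/K_crit, drops to zero at S = 4v/K_crit, and is undefined for larger predators, in line with the discussion of Figure 15.7(a).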

15.3.3 Alternate approach involving inverse trig functions

The problem of the escape response was solved by using implicit differentiation and related
rates. But there are various approaches to solve a mathematical problem. Here we illustrate that an alternate approach is to express the relationship of interest in terms of inverse
trigonometric functions, and then use the derivative of that function to find the desired rate
of change18.
Example 15.9 In Section 15.3.2, we studied the escape response of the zebra danio and
showed that the connection between the visual angle α and the distance x to the predator satisfies
\[ \tan\left(\frac{\alpha}{2}\right) = \frac{(S/2)}{x}. \tag{15.4} \]
We also computed the rate of change of the visual angle per unit time using implicit differentiation. Here, we practice differentiation of inverse trigonometric functions and redo
the same calculation using these functions. Use the inverse function arctan to restate the
angle α in Eqn. 15.4 as a function of x. Then differentiate that function using the chain
rule to compute dα/dt.

18 This section is optional and can be skipped or left as an independent exercise for the student.
Solution: We can restate this relationship using the inverse trigonometric function arctan
as follows:
\[ \frac{\alpha}{2} = \arctan\left(\frac{S}{2x}\right). \]
Our experience with the derivative of this function will be useful below. Since both the
angle α and the distance from the predator x change with time, we indicate so by writing
\[ \alpha(t) = 2\arctan\left(\frac{S}{2x(t)}\right). \]
We apply the chain rule to this expression to calculate the rate of change of the angle α
with respect to time. Let u = S/(2x). Recall that S is a constant. Then the derivative of the
inverse trigonometric function,
\[ \frac{d\arctan(u)}{du} = \frac{1}{u^2 + 1}, \]
and the chain rule lead to
\[ \frac{d\alpha(t)}{dt} = 2\,\frac{d\arctan(u)}{du}\,\frac{du}{dx}\,\frac{dx}{dt} = 2\,\frac{1}{u^2 + 1}\left(-\frac{S}{2x^2(t)}\right)(-v). \]
By simplifying, we arrive at the same result, namely that
\[ \frac{d\alpha}{dt} = \frac{Sv}{x^2 + (S^2/4)}. \]
This is the rate of change of the visual angle, and agrees with the result obtained from Example 15.8.

15.4 For further study: Trigonometric functions and differential equations

As we have seen in this chapter, the functions sin(t) and cos(t) are related to one another
via differentiation: one is the derivative of the other (up to a factor of −1):
\[ \frac{d\sin(t)}{dt} = \cos(t), \qquad \frac{d\cos(t)}{dt} = -\sin(t). \]
The connection becomes even clearer when we examine the second derivatives of these
functions:
\[ \frac{d^2\sin(t)}{dt^2} = \frac{d\cos(t)}{dt} = -\sin(t), \qquad \frac{d^2\cos(t)}{dt^2} = -\frac{d\sin(t)}{dt} = -\cos(t). \]
Thus, for each of the functions y = sin(t), y = cos(t), we find that the function and its
second derivative are related to one another by the differential equation (DE) d²y/dt² =
−y. Here the highest derivative is a second derivative, and we refer to this as a second-order DE.
More generally, we make the following observations. These follow by the same
reasoning, where the chain rule is applied in differentiation.
The functions
\[ x(t) = \cos(\omega t), \qquad y(t) = \sin(\omega t) \]
satisfy a pair of differential equations,
\[ \frac{dx}{dt} = -\omega y, \qquad \frac{dy}{dt} = \omega x. \]
The functions
\[ x(t) = \cos(\omega t), \qquad y(t) = \sin(\omega t) \]
also satisfy a related differential equation with a second derivative
\[ \frac{d^2 x}{dt^2} = -\omega^2 x. \]
Students of physics will here recognize the equation that governs the behaviour of a
harmonic oscillator, and will see the connection between the circular motion of our point
on the circle, and the differential equation for periodic motion.
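
The steps behind these observations are short enough to write out. As a sketch, applying the chain rule to x(t) = cos(ωt) and y(t) = sin(ωt) gives
\[
\begin{aligned}
\frac{dx}{dt} &= -\omega \sin(\omega t) = -\omega y, \qquad
\frac{dy}{dt} = \omega \cos(\omega t) = \omega x, \\
\frac{d^2 x}{dt^2} &= \frac{d}{dt}\left(-\omega y\right) = -\omega\,\frac{dy}{dt} = -\omega\,(\omega x) = -\omega^2 x.
\end{aligned}
\]
The same steps applied to y(t) give d²y/dt² = −ω²y, so both coordinates of the point on the circle satisfy the same second-order equation.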

15.5 Additional examples: Implicit differentiation


This section is dedicated to practicing implicit differentiation in the context of trigonometric functions.
A surface that looks like an egg carton can be described by the function
\[ z = \sin(x)\cos(y). \]
See Figure 15.8(a) for the shape of this surface.
Suppose we slice through the surface at various levels. We would then see a collection
of circular contours, as found on a topographical map of a mountain range. Such contours
are called level curves, and some of these can be seen in Figure 15.8. We will here be
interested in the contours formed at some specific height, e.g. at height z = 1/2. This set
of curves can be described by the equation:
\[ \sin(x)\cos(y) = \frac{1}{2}. \]
Let us look at one of these, e.g. the curve shown in Figure 15.8(b). This is just
one of the contours, namely the one located in the portion of the graph for −1 < y < 1,
0 < x < 3. We practice implicit differentiation for this curve, i.e. we find the slope of
tangent lines to this curve.

[Figure 15.8: (a) surface plot over the x and y axes; (b) a single level curve in the xy-plane.]
Figure 15.8. (a) The surface sin(x) cos(y) = 1/2. (b) One level curve for this
surface. Note that the scales are not the same for parts (a) and (b).
Example 15.10 (Implicit differentiation:) Find the slope of the tangent line to a point on
the curve shown in Figure 15.8(b).
Solution: Differentiating, we obtain:
\[
\begin{aligned}
\frac{d}{dx}\left(\sin(x)\cos(y)\right) &= \frac{d}{dx}\left(\frac{1}{2}\right) \\
\frac{d\sin(x)}{dx}\,\cos(y) + \sin(x)\,\frac{d\cos(y)}{dx} &= 0 \\
\cos(x)\cos(y) + \sin(x)\,(-\sin(y))\,\frac{dy}{dx} &= 0 \\
\frac{dy}{dx} = \frac{\cos(x)\cos(y)}{\sin(x)\sin(y)} \quad &\Rightarrow \quad \frac{dy}{dx} = \frac{1}{\tan(x)\tan(y)}.
\end{aligned}
\]
We can now determine the slope of the tangent lines to the curve at points of interest.
Example 15.11 Find the slope of the tangent line to the same level curve at the point
x = π/2.
Solution: At this point, sin(x) = sin(π/2) = 1, which means that the corresponding y
coordinate of a point on the graph satisfies cos(y) = 1/2, so one value of y is y = π/3.
(There are other values, for example at −π/3 and at 2πn ± π/3, but we will not consider
these here.) Then we find that
\[ \frac{dy}{dx} = \frac{1}{\tan(\pi/2)\tan(\pi/3)}. \]
But tan(x) grows without bound as x approaches π/2, so that the ratio above leads to dy/dx = 0. The tangent line is horizontal
as it goes through the point (π/2, π/3) on the graph.

Example 15.12 Find the slope of the tangent line to the same level curve at the point
x = π/4.

Solution: Here we have sin(x) = sin(π/4) = √2/2, and we find that the y coordinate
satisfies
\[ \frac{\sqrt{2}}{2}\cos(y) = \frac{1}{2}. \]
This means that cos(y) = (1/2)(2/√2) = √2/2, so that y = π/4. Thus
\[ \frac{dy}{dx} = \frac{1}{\tan(\pi/4)\tan(\pi/4)} = \frac{1}{1} = 1 \]
so that the tangent line at the point (π/4, π/4) has slope 1.
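
The slopes found in Examples 15.11 and 15.12 can be checked numerically. The sketch below assumes the upper branch of the level curve, y = arccos(1/(2 sin(x))), which is defined where sin(x) ≥ 1/2; this explicit branch and the function names are our own choices for illustration.

```python
import math

def slope(x, y):
    """Slope from implicit differentiation: dy/dx = 1 / (tan(x) * tan(y))."""
    return 1 / (math.tan(x) * math.tan(y))

def y_on_curve(x):
    """Upper branch (y > 0) of the level curve sin(x) * cos(y) = 1/2."""
    return math.acos(1 / (2 * math.sin(x)))

x0 = math.pi / 4
y0 = y_on_curve(x0)   # should be close to pi/4
h = 1e-6

# Centered difference along the curve, for comparison with the formula.
numeric = (y_on_curve(x0 + h) - y_on_curve(x0 - h)) / (2 * h)

print(y0, slope(x0, y0), numeric)

# At x = pi/2 the floating-point value of tan(pi/2) is huge rather than infinite,
# so the computed slope is tiny rather than exactly zero.
print(slope(math.pi / 2, math.pi / 3))
```

Both the formula and the difference quotient should be close to 1 at (π/4, π/4), and the slope at (π/2, π/3) should be negligibly small.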


Exercises
15.1. Calculate the first derivative for the following functions.
(a) $y = \sin x^2$
(b) $y = \sin^2 x$
(c) $y = \cot^2 \sqrt[3]{x}$
(d) $y = \sec(x - 3x^2)$
(e) $y = 2x^3 \tan x$
(f) $y = \dfrac{x}{\cos x}$
(g) $y = x \cos x$
(h) $y = e^{-\sin^2 \frac{1}{x}}$
(i) $y = (2\tan 3x + 3\cos x)^2$
(j) $y = \cos(\sin x) + \cos x \sin x$
15.2. Take the derivative of the following functions.
(a) $f(x) = \cos(\ln(x^4 + 5x^2 + 3))$
(b) $f(x) = \sin\left(\sqrt{\cos^2(x) + x^3}\right)$
(c) $f(x) = 2x^3 + \log_3(x)$
(d) $f(x) = (x^2 e^x + \tan(3x))^4$
(e) $f(x) = x^2 \sqrt{\sin^3(x) + \cos^3(x)}$

15.3. A point is moving on the perimeter of a circle of radius 1 at the rate of 0.1 radians
per second. How fast is its x coordinate changing when x = 0.5? How fast is its y
coordinate changing at that time?
15.4. The derivatives of the two important trig functions are $[\sin(x)]' = \cos(x)$ and
$[\cos(x)]' = -\sin(x)$. Use these derivatives to answer the following questions.
Let $f(x) = \sin(x) + \cos(x)$, $0 \le x \le 2\pi$.
(a) Find all intervals where f (x) is increasing.

(b) Find all intervals where f (x) is concave up.


(c) Locate all inflection points.
(d) Graph f (x).

15.5. Find all points on the graph of $y = \tan(2x)$, $-\pi/4 < x < \pi/4$, where the slope of the
tangent line is 4.
15.6. A V shaped formation of birds forms a symmetric structure in which the distance
from the leader to the last birds in the V is $r = 10$ m, the distance between those
trailing birds is $D = 6$ m and the angle formed by the V is $\theta$, as shown in Figure 15.9
below. Suppose that the shape is gradually changing: the trailing birds start to get
closer so that their distance apart shrinks at a constant rate $dD/dt = -0.2$ m/min
while maintaining the same distance from the leader. (Assume that the structure is
always in the shape of a V as the other birds adjust their positions to stay aligned in
the flock.) What is the rate of change of the angle $\theta$?

Figure 15.9. Figure for Problem 6
15.7. A hot air balloon on the ground is 200 meters away from an observer. It starts rising
vertically at a rate of 50 meters per minute. Find the rate of change of the angle of
elevation of the observer when the balloon is 200 meters above the ground.
15.8. Match the differential equations given in parts (i-iv) with the functions in (a-f) which
are solutions for them. (Note: each differential equation may have more than one
solution)
Differential equations:
(i) d2 y/dt2 = 4y
(ii) d2 y/dt2 = 4y

(iii) dy/dt = 4y

(iv) dy/dt = 4y
Solutions:
(a) y(t) = 4 cos(t)
(b) y(t) = 2 cos(2t)
(c) y(t) = 4e2t
(d) y(t) = 5e2t
(e) y(t) = sin(2t) cos(2t),
(f) y(t) = 2e4t .

15.9. Jack and Jill have an on-again off-again love affair. The sum of their love for one
another is given by the function y(t) = sin(2t) + cos(2t).
(a) Find the times when their total love is at a maximum.
(b) Find the times when they dislike each other the most.
15.10. A ladder of length L is leaning against a wall so that its point of contact with the
ground is a distance x from the wall, and its point of contact with the wall is at
height y. The ladder slips away from the wall at a constant rate C.



(a) Find an expression for the rate of change of the height y.
(b) Find an expression for the rate of change of the angle formed between the
ladder and the wall.

15.11. A cannon-ball fired by a cannon at ground level at angle to the horizon (0


/2) will travel a horizontal distance (called the range, R) given by the formula
below:
1 2
v sin cos .
R=
16 0
Here v0 > 0, the initial velocity of the cannon-ball, is a fixed constant and air
resistance is neglected. (See Figure 15.10.) What is the maximum possible range?

Figure 15.10. Figure for problem 11


15.12. A wheel of radius 1 meter rolls on a flat surface without slipping. The wheel moves
from left to right, rotating clockwise at a constant rate of 2 revolutions per second.
Stuck to the rim of the wheel is a piece of gum, (labeled G); as the wheel rolls
along, the gum follows a path shown by the wide arc (called a cycloid curve) in
Figure 15.11. The (x, y) coordinates of the gum (G) are related to the wheels angle
of rotation by the formulae
x = sin ,
y = 1 cos ,
where 0 2. How fast is the gum moving horizontally at the instant that it
reaches its highest point? How fast is it moving vertically at that same instant?
15.13. In Figure 15.12, the point P is connected to the point O by a rod 3 cm long. The
wheel rotates around O in the clockwise direction at a constant speed, making 5
revolutions per second. The point Q, which is connected to the point P by a rod 5
cm long, moves along the horizontal line through O. How fast and in what direction
is Q moving when P lies directly above O? (Remember the law of cosines: c2 =
a2 + b2 2ab cos .)
15.14. A ship sails away from a harbor at a constant speed v. The total height of the ship
including its mast is h. See Figure 15.13.
(a) At what distance away will the ship disappear below the horizon?
(b) At what rate does the top of the mast appear to drop toward the horizon just
before this? (Note: In ancient times this effect led people to conjecture that
the earth is round (radius R), a fact which you need to take into account in
solving the problem.)

Figure 15.11. Figure for Problem 12

Figure 15.12. Figure for Problem 13

Figure 15.13. Figure for Problem 14


15.15. Find $dy/dx$ using implicit differentiation.

(a) y = 2 tan(2x + y)
(b) sin y = 2 cos x

(c) x sin y + y sin x = 1


15.16. Use implicit differentiation to find the equation of the tangent line to the following
curve at the point (1, 1):
\[ x\sin(xy - y^2) = x^2 - 1 \]
15.17. The function $y = \arcsin(ax)$ is a so-called inverse trigonometric function. It expresses the same relationship as does the equation $ax = \sin(y)$. (However, this
function is defined only for values of $x$ between $-1/a$ and $1/a$.) Use implicit differentiation to find $y'$.
15.18. Find the first derivative of the following functions.
1

(a) y = arcsin x 3
1

(b) y = (arcsin x) 3
(c) = arctan(2r + 1)
(d) y = x arcsec x1

(e) y = xa a2 x2 arcsin xa , a > 0.


2t
(f) y = arccos 1+t
2

15.19. Your room has a window whose height is 1.5 meters. The bottom edge of the window is 10 cm above your eye level. (See Figure 15.14.) How far away from the
window should you stand to get the best view? (Best view means the largest visual angle, i.e. angle between the lines of sight to the bottom and to the top of the
window.)

Figure 15.14. Figure for Problem 19
15.20. You are directly below English Bay during a summer fireworks event and looking
straight up. A single fireworks explosion occurs directly overhead at a height of
500 meters. (See Figure 15.15.) The rate of change of the radius of the flare is 100
meters/sec. Assuming that the flare is a circular disk parallel to the ground, (with its
center right overhead) what is the rate of change of the visual angle at the eye of an
observer on the ground at the instant that the radius of the disk is r = 100 meters?
(Note: the visual angle will be the angle between the vertical direction and the line
between the edge of the disk and the observer).
15.21. Periodic motion:

Figure 15.15. Figure for Problem 20
(a) Show that the function $y(t) = A\cos(wt)$ satisfies the differential equation
\[ \frac{d^2y}{dt^2} = -w^2 y \]
where $w > 0$ is a constant, and $A$ is an arbitrary constant. [Remark: Note
that $w$ corresponds to the frequency and $A$ to the amplitude of an oscillation
represented by the cosine function.]
(b) It can be shown using Newton's Laws of motion that the motion of a pendulum
is governed by a differential equation of the form
\[ \frac{d^2y}{dt^2} = -\frac{g}{L}\sin(y), \]
where $L$ is the length of the string, $g$ is the acceleration due to gravity (both
positive constants), and $y(t)$ is displacement of the pendulum from the vertical.
What property of the sine function is used when this equation is approximated
by the Linear Pendulum Equation:
\[ \frac{d^2y}{dt^2} = -\frac{g}{L}\,y. \]
(c) Based on this Linear Pendulum Equation, what function would represent the
oscillations? What would be the frequency of the oscillations?
(d) What happens to the frequency of the oscillations if the length of the string is
doubled?


Chapter 16

Review Problems


Exercises
16.1. Multiple Choice:
(1) : The equation of the tangent line to the function y = f (x) at the point x0 is
(a) y = f (x0 ) + f (x0 )(x x0 )
(b) y = x0 + f (x0 )/f (x0 )
(c) y = f (x) f (x)(x x0 )
(d) y = f (x0 ) + f (x0 )(x x0 )
(e) y = f (x0 ) f (x0 )(x x0 )

(2) : The functions $f(x) = x^2$ and $g(x) = x^3$ are equal at $x = 0$ and at $x = 1$.
Between $x = 0$ and $x = 1$, for what value of $x$ are their graphs furthest
apart?
(a) $x = 1/2$ (b) $x = 2/3$ (c) $x = 1/3$ (d) $x = 1/4$ (e) $x = 3/4$

(3) : Consider a point in the first quadrant on the hyperbola $x^2 - y^2 = 1$ with
$x = 2$. The slope of the tangent line at that point is
(a) $2/\sqrt{3}$ (b) $2/\sqrt{5}$ (c) $1/\sqrt{3}$ (d) 5/2 (e) 2/3

(4) : For $a, b > 0$, solving the equation $\ln(x) = 2\ln(a) - 3\ln(b)$ for $x$ leads to
(a) $x = e^{2a - 3b}$ (b) $x = 2a - 3b$ (c) $x = a^2/b^3$ (d) $x = a^2 b^3$ (e) $x = (a/b)^6$

(5) : The function $y = f(x) = \arctan(x) - (x/2)$ has local maXima (LX), local
minima (LM) and inflection points (IP) as follows:
(a) LX: x = 1, LM: x = 1, IP: x = 0.
(b) LX: x = 1, LM: x = 1, IP: x = 0.
(c) LX: x = 1, LM: x = 1, IP: none

(d) LX: x = 3, LM: x = 3, IP: x = 0.

(e) LX: x = 3, LM: x = 3, IP: x = 0.

(6) Consider the function $y = f(x) = 3e^{-2x} - 5e^{-4x}$


(a) The function has a local maximum at x = (1/2) ln(10/3)
(b) The function has a local minimum at x = (1/2) ln(10/3)
(c) The function has a local maximum at x = (1/2) ln(3/5)
(d) The function has a local minimum at x = (1/2) ln(3/5)
(e) The function has a local maximum at x = (1/2) ln(3/20)

(7) Let $m_1$ be the slope of the function $y = 3^x$ at the point $x = 0$ and let $m_2$ be
the slope of the function $y = \log_3 x$ at $x = 1$. Then
(a) m1 = ln(3)m2 (b) m1 = m2 (c) m1 = m2 (d) m1 = 1/m2 (e)
m1 = m2 / ln(3)
(8) Consider the curve whose equation is $x^4 + y^4 + 3xy = 5$. The slope of the
tangent line, dy/dx, at the point (1, 1) is
(a) 1 (b) -1 (c) 0 (d) -4/7 (e) 1/7

(9) Two kinds of bacteria are found in a sample of tainted food. It is found that
the population size of type 1, $N_1$, and of type 2, $N_2$, satisfy the equations
\[ \frac{dN_1}{dt} = -0.2 N_1, \quad N_1(0) = 1000, \]
\[ \frac{dN_2}{dt} = 0.8 N_2, \quad N_2(0) = 10. \]
Then the population sizes are equal, $N_1 = N_2$, at the following time:
(a) $t = \ln(40)$ (b) $t = \ln(60)$ (c) $t = \ln(80)$ (d) $t = \ln(90)$ (e) $t = \ln(100)$
(10) In a conical pile of sand the ratio of the base radius to the height is always
$r/h = 3$. (Recall that the volume of a cone with height $h$ and radius $r$ is
$V = (\pi/3) r^2 h$.) If the volume is increasing at rate 3 m$^3$/min, how fast (in
m/min) is the height changing when $h = 2$ m?
(a) $1/(12\pi)$ (b) $(1/\pi)^{1/3}$ (c) $27/(4\pi)$ (d) $1/(4\pi)$ (e) $1/(36\pi)$
16.2. Shown in Figure 16.1 is a function and its tangent line at x = x0 . The tangent line
intersects the x axis at the point x = x1 . Based on this figure, the coordinate of the
point x1 is

Figure 16.1. Graph for Problem 2.

[A] x1 = x0 +

f (x0 )
,
f (x0 )

[B] x1 = x0 f (x0 )(x x0 ),

[D] x1 = x0 +

f (x1 )
,
f (x1 )

[E] x1 = x0

[C] x1 = x0

f (x1 )
f (x1 )

f (x0 )
f (x0 )

16.3. Euler's Method: For the differential equation and initial condition
\[ \frac{dy}{dt} = (2 - y), \quad y(0) = 1 \]
using one time step of size $\Delta t = 0.1$ leads to which value of the solution at time
$t = 0.1$?
(A) $y(0.1) = 2$,
(B) $y(0.1) = 2.1$,
(C) $y(0.1) = 2.2$,
(D) $y(0.1) = 1.2$,

(E) y(0


16.4. Linear approximation: Consider the function $y = \cos(x)$ and the tangent line to
this function at the point $x = \pi/2$. Using that tangent line as a linear approximation
of the function would lead to
(A) Overestimating the value of the actual function for any nearby $x$.
(B) Underestimating the value of the actual function for any nearby $x$.
(C) Overestimating the function when $x > \pi/2$ and underestimating the function
when $x < \pi/2$.
(D) Overestimating the function when $x < \pi/2$ and underestimating the function
when $x > \pi/2$.
(E) Overestimating the function when $x < 0$ and underestimating the function
when $x > 0$.
16.5. Related Rates: Two spherical balloons are connected so that one inflates as the
other deflates, the sum of their volumes remaining constant. When the first balloon
has radius 10 cm and its radius is increasing at 3 cm/sec, the second balloon has
radius 20 cm. What is the rate of change of the radius of the second balloon? [The
volume of a sphere of radius $r$ is $V = (4/3)\pi r^3$].
16.6. Particle velocity: A particle is moving along the x axis so that its distance from the
origin at time t is given by
x(t) = (t + 2)3 + t
where is a constant
(a) Determine the velocity v(t) and the acceleration a(t).
(b) Determine the minimum velocity over all time.
16.7. Motion: A particle's motion is described by $y(t) = t^3 - 6t^2 + 9t$
where $y(t)$ is the displacement (in metres), $t$ is time (in seconds) and $0 \le t \le 4$
seconds.
(a) During this time interval, when is the particle furthest from its initial position
?
(b) During this time interval, what is the greatest speed of the particle?
(c) What is the total distance (including both forward and backward directions)
that the particle has travelled during this time interval?
16.8. Falling object: Consider an object thrown upwards with initial velocity $v_0 > 0$ and
initial height $h_0 > 0$. Then the height of the object at time $t$ is given by
\[ y = f(t) = -\frac{1}{2} g t^2 + v_0 t + h_0. \]
Find critical points of $f(t)$ and use both the second and first derivative tests to establish that this is a local maximum.
16.9. Critical points:

(a) Find critical points for the function $y = e^x(1 - \ln(x))$ for $0.1 \le x \le 2$ and
classify their types.
(b) The function $y = \ln(x) - e^x$ has a critical point in the interval $0.1 \le x \le 2$. It
is not possible to solve for the value of $x$ at that point, but it is possible to find
out what kind of critical point that is. Determine whether that point is a local
maximum, minimum, or inflection point.
16.10. Minima and Maxima:
(a) Consider the polynomial $y = 4x^5 - 15x^4$. Find all local minima, maxima, and
inflection points for this function.
(b) Find the global minimum and maximum for this function on the interval $[-1, 1]$.
16.11. Minima and Maxima: Consider the polynomial $y = x^5 - x^4 + 3x^3$. Use calculus
to find all local minima, maxima, and inflection points for this function.
16.12. Linear approximation: Find a linear approximation to the function $y = x^2$ at the
point whose $x$ coordinate is $x = 2$. Use your result to approximate the value of
$(2.0001)^2$.
16.13. HIV virus: High-risk activity leads to an HIV infection. Initially, the patient has
1000 copies of the virus. How long will it take until the HIV infection is detectable?
Assume that the number of virus particles $y$ grows according to the equation
\[ \frac{dy}{dt} = 0.05y \]
where $t$ is time in days, and that the smallest detectable viral load is 350,000 particles. Leave your answer in terms of logarithms.
16.14. Fish generations: In Fish River, the number of salmon (in thousands), $x$, in a given
year is linked to the number of salmon (in thousands), $y$, in the following year by
the function
\[ y = A x e^{-bx} \]
where $A, b > 0$ are constants.
(a) For what number of salmon is there no change in the number from one year to
the next?
(b) Find the number of salmon that would yield the largest number of salmon in
the following year.
16.15. Polynomial: Find a polynomial of third degree that has a local maximum at x = 1,
a zero and an inflection point at x = 0, and goes through the point (1,2). Hint:
assume $p(x) = ax^3 + bx^2 + cx + d$ and find the values of $a, b, c, d$.
16.16. Lennard-Jones potential: The Lennard-Jones potential, $V(x)$, is the potential energy associated with two uncharged molecules a distance $x$ apart, and is given by
the formula
\[ V(x) = \frac{a}{x^{12}} - \frac{b}{x^6} \]
where $a, b > 0$. Molecules would tend to adjust their separation distance so as to
minimize this potential. Find any local maxima or minima of this potential. Find the
distance between the molecules, $x$, at which $V(x)$ is minimized and use the second
derivative test to verify that this is a local minimum.
16.17. Rectangle inscribed in a circle: Find the dimensions of the largest rectangle that
can fit exactly into a circle whose radius is r.
16.18. Race track: Fig. 16.2 shows a 1 km race track with circular ends. Find the values
of x and y that will maximize the area of the rectangle.

Figure 16.2. This shape is investigated in both problems 18 and 19.


16.19. Leaf shape: Now suppose that Fig 16.2 shows the shape of a leaf of some plant. If
the plant grows so that x increases at the rate 2 cm/year and y increases at the rate 1
cm/year, at what rate will the leaf's entire area be increasing?
16.20. Shape of E. coli: A cell of the bacterium E. coli has the shape of a cylinder with two
hemispherical caps, as shown in Fig 16.3. Consider this shape, with $h$ the height of
the cylinder, and $r$ the radius of the cylinder and hemispheres.

Figure 16.3. Shape of the object described in Problem 4. Note: Useful volumes
and surface areas: For a hemisphere, $V = (2/3)\pi r^3$, $S = 2\pi r^2$. For a cylinder, $V = \pi r^2 h$
and $S = 2\pi r h$ (not including end caps).

(a) Find the values of $r$ and $h$ that lead to the largest volume for a fixed constant
surface area, $S =$ constant.

(b) Describe or sketch the shape you found in (a).
(c) A typical E. coli cell has $h = 1\,\mu$m and $r = 0.5\,\mu$m. Based on your results in
(a) and (b), would you agree that E. coli has a shape that maximizes its volume
for a fixed surface area? (Explain your answer).
16.21. Changing cell shape: Suppose the cell shown above in Fig 16.3 is growing so that the
height increases twice as fast as the radius. If the radius is growing at 1 $\mu$m per day,
at what rate will the volume of the cell increase? (Leave your answer in terms of the
height and radius of the cell.)
16.22. Growth of vine: A vine grows up a tree in the form of a helix as shown on the left
in Fig. 16.4. If the length of the vine increases at a constant rate cm/day, at what
rate is the height of its growing tip increasing? Assume that the radius of the tree is
r and the pitch of the helix (i.e. height increase for each complete turn of the helix)
is p, a positive constant. Note that the right panel in Fig. 16.4 shows the unwrapped
cylinder, with the vines location along it.

Figure 16.4. Growth of a vine in the shape of a spiral for problem 22.
16.23. Newton's Law of Cooling: Newton's Law of cooling leads to a differential equation
that predicts the temperature $T(t)$ of an object whose initial temperature is $T_0$ in
an environment whose temperature is $E$. The predicted temperature is given by
$T(t) = E + (T_0 - E)e^{-kt}$ where $t$ is time and $k$ is a constant. Shown in Fig 16.5
are some data points plotted as $\ln(T(t) - E)$ versus time in
minutes. The ambient temperature was $E = 22\,^\circ$C. Also shown on the graph is the
line that best fits those 11 points. Find the value of the constant $k$.
16.24. Blood alcohol: Blood alcohol level (BAL), the amount of alcohol in your blood
stream (here represented by $B(t)$), is measured in milligrams of alcohol per 10 millilitres of blood. At the end of a party (time $t = 0$), a drinker is found to have
$B(0) = 0.08$ (the legal level for driving impairment), and after that time, $B(t)$
satisfies the differential equation
\[ \frac{dB}{dt} = -kB, \quad k > 0 \]
where $k$ is a constant that represents the rate of removal of alcohol from the blood
stream by the liver.

Figure 16.5. Figure for Problem 23 (plot of $\ln(T(t) - E)$ versus time in minutes, with the best-fit line).

(a) If the drinker had waited for 3 hrs before driving (until $t = 3$), his BAL would
have dropped to 0.04. Determine the value of the rate constant $k$ (specifying
appropriate units) for this drinker.
(b) According to the model, how much longer would it take for the BAL to drop
to 0.01?
16.25. Population with immigration: An island has a bird population of density P (t).
New birds arrive continually with a constant colonization rate C birds per day. Each
bird also has a constant probability per day, , of leaving the Island. At time t = 0
the bird population is P (0) = P0
(a) Write down a differential equation that describes the rate of change of the bird
population on the island.
(b) Find the steady state of that equation and interpret this in terms of the bird
population.
(c) Write down the solution of the differential equation you found in (a) and show
that it satisfies the following two properties: (i) the initial condition, (ii) as
$t \to \infty$ it approaches the steady state you found in (b).

(d) If the island has no birds on it at time t = 0, how long would it take for the
bird population to grow to 80% of the steady state value?
16.26. Learning:

(a) It takes you 1 hr (total) to travel to and from UBC every day to study Philosophy 101. The amount of new learning (in arbitrary units) that you can get by
spending $t$ hours at the university is given approximately by
\[ L_P(t) = \frac{10t}{9 + t}. \]
How long should you stay at UBC on a given day if you want to maximize
your learning per time spent? (Time spent includes travel time.)

(b) If you take Math 10000 instead of Philosophy, your learning at time $t$ is
\[ L_M(t) = t^2. \]
How long should you stay at UBC to maximize your learning in that case?
16.27. Learning and forgetting: Knowledge can be acquired by studying, but it is forgotten over time. A simple model for learning represents the amount of knowledge,
$y(t)$, that a person has at time $t$ (in years) by a differential equation
\[ \frac{dy}{dt} = S - f y \]
where $S \ge 0$ is the rate of studying and $f \ge 0$ is the rate of forgetting. We will
assume that $S$ and $f$ are constants that are different for each person. [Your answers
to the following questions will contain constants such as $S$ or $f$.]
(a) Mary never forgets anything. What does this imply about the constants $S$ and
$f$? Mary starts studying in school at time $t = 0$ with no knowledge at all. How
much knowledge will she have after 4 years (i.e. at $t = 4$)?
(b) Tom learned so much in preschool that his knowledge when entering school
at time $t = 0$ is $y = 100$. However, once Tom is in school, he stops studying
completely. What does this imply about the constants $S$ and $f$? How long will
it take him to forget 75% of what he knew?
(c) Jane studies at the rate of 10 units per year and forgets at rate of 0.2 per year.
Sketch a direction field (slope field) for the differential equation describing Jane's knowledge. Add a few curves $y(t)$ to show how Jane's knowledge
changes with time.
16.28. Least cost: A rectangular plot of land has dimensions L by D. A pipe is to be built
joining points A and C. The pipe can be above ground along the border of the plot
(Section AB), but has to be buried underground along the segment BC. The cost
per unit length of the underground portion is 3 times that of the cost of the above
ground portion. Determine the distance y so that the cost of the pipe will be as low
as possible.
16.29. Ducks in a row:
Graduate student Ryan Lukeman studies behaviour of duck flocks swimming near
Canada Place in Vancouver, BC. This figure from his PhD thesis shows his photography set-up. Here H = 10 meters is the height from sea level up to his camera
aperture at the observation point, D = 2 meters is the width of a pier (a stationary
platform whose size is fixed), and x is the distance from the pier to the leading duck
in the flock (in meters). is a visual angle subtended at the camera, as shown. If
the visual angle is increasing at the rate of 1/100 radians per second, at what rate is
the distance x changing at the instant that x = 3 meters?

Figure 16.6. Figure for Problem 28

Figure 16.7. Figure for Problem 29


16.30. Human growth: Given a population of 6 billion people on Planet Earth, and using
the approximate growth rate of r = 0.0125 per year, how long ago was this population only 1 million? Assume that the growth has been the same throughout history
(which is not actually true).
16.31. Circular race track: Two runners are running around a circular race track whose
length is 400m, as shown in Fig. 16.8(a). The first runner make a full revolution
every 100s and the second runner every 150 s. They start at the same time at the
start position, and the angles subtended by each runner with the radius of the start
position are 1 (t), 2 (t), respectively. As the runners go around the track both 1 (t)
and 2 (t) will be changing with time.
(a) At what rate it the angle = 1 2 changing?

(b) What is the angle at t = 25s?

(c) What is the distance between the runners at t = 25s? (Here distance refers
to the length of the straight line connecting the runners.)
(d) At what rate is the distance between the runners changing at t = 25s?


Figure 16.8. Figure for problems 31 and 32. The angles in (a) are 1 (t), 2 (t).
In (b), the angle between the runners is .
16.32. Phase angle and synchrony: Suppose that the same two runners as in Problem 31
would speed up or slow down depending on the angle between them, . (See
Fig. 16.8). Then = (t) will change with time. We will assume that the angle
satisfies a differential equation of the form
d
= A B sin()
dt
where A, B > 0 are constants.
(a) What values of correspond to steady states (i.e. constant solutions) of this
differential equation?
(b) What restriction should be placed on the constants A, B for these steady states
to exist?
(c) Suppose A = 1, B = 2. Sketch the graph of f () = A B sin() for
and use it to determine what will happen if the two runners start
at the same point, ( = 0) at time t = 0.
16.33. Logistic equation and its solution:
(a) Show that the function
\[ y(t) = \frac{1}{1 + e^{-t}} \]
satisfies the differential equation
\[ \frac{dy}{dt} = y(1 - y). \]
(b) What is the initial value of $y$ at $t = 0$?
(c) For what value of $y$ is the growth rate largest?
(d) What will happen to $y$ after a very long time?


16.34. Tumor mass:


The figure (not drawn to scale) shows a tumor mass containing a necrotic (dead)
core (radius $r_2$), surrounded by a layer of actively dividing tumor cells. The entire
tumor can be assumed to be spherical, and the core is also spherical. (Recall that the
volume and surface area of a sphere are $V = (4/3)\pi r^3$, $S = 4\pi r^2$.)

Figure 16.9. Figure for Problem 34

(a) If the necrotic core increases at the rate 3 cm$^3$/year and the volume of the active
cells increases by 4 cm$^3$/year, at what rate is the outer radius of the tumor ($r_1$)
changing when $r_1 = 1$ cm? (Leave your answer as a fraction in terms of $\pi$;
indicate units with your answer.)
(b) At what rate (in cm$^2$/yr) does the outer surface area of the tumor increase when
$r_1 = 1$ cm?
16.35. Blood vessel branching: Shown in Fig. 16.10 is a major artery, (radius R) and one
of its branches (radius r). A labeled schematic diagram is also shown (right). The
length 0A is L, and the distance between 0 and P is d, where 0P is perpendicular
to 0A. The location of the branch point (B) is to be determined so that the total
resistance to blood flow in the path ABP is as small as possible. (R, r, d, L are
positive constants, and R > r.)

Figure 16.10. Figure for Problem 35

(a) Let the distance between 0 and B be x. What is the length of the segment BA
and what is the length of the segment BP?

(b) The resistance of any blood vessel is proportional to its length and inversely
proportional to its radius to the fourth power$^{19}$. Based on this fact, what is the
resistance, $T_1$, of segment BA and what is the resistance, $T_2$, of the segment
BP?
(c) Find the value of the variable $x$ for which the total resistance, $T(x) = T_1 + T_2$,
is a minimum.

$^{19}$ "$z$ is inversely proportional to $y$" means that $z = k/y$ for some constant $k$.

Appendices


Appendix A

A review of Straight
Lines

A.A Geometric ideas: lines, slopes, equations


Straight lines have some important geometric properties, namely:
The slope of a straight line is the same everywhere along its length.

Definition: slope of a straight line:

Figure A.1. The slope of a line (usually given the symbol $m$) is the ratio of the
change in the $y$ value, $\Delta y$, to the change in the $x$ value, $\Delta x$.
We define the slope of a straight line as follows:
\[ \text{Slope} = \frac{\Delta y}{\Delta x} \]
where $\Delta y$ means "change in the $y$ value" and $\Delta x$ means "change in the $x$ value" between
two points. See Figure A.1 for what this notation represents.

Equation of a straight line


Using this basic geometric property, we can find the equation of a straight line given any of
the following information about the line:
The $y$ intercept, $b$, and the slope, $m$:
\[ y = mx + b. \]
A point $(x_0, y_0)$ on the line, and the slope, $m$, of the line:
\[ \frac{y - y_0}{x - x_0} = m \]
Two points on the line, say $(x_1, y_1)$ and $(x_2, y_2)$:
\[ \frac{y - y_1}{x - x_1} = \frac{y_2 - y_1}{x_2 - x_1} \]
Remark: any of these can be rearranged or simplified to produce the standard form
y = mx + b, as discussed in the problem set.
The following examples will refresh your memory on how to find the equation of the
line that satisfies each of the given conditions.
Example A.1 In each case write down the equation of the straight line that satisfies the
given statements. (Note: you should also be able to easily sketch the line in each case.)
(a) The line has slope 2 and y intercept 4.
(b) The line goes through the points (1,1) and (3,-2).
(c) The line has y intercept -1 and x intercept 3.
(d) The line has slope -1 and goes through the point (-2,-5).

Solution:
(a) We can use the standard form of the equation of a straight line, $y = mx + b$, where
$m$ is the slope and $b$ is the $y$ intercept, to obtain the equation: $y = 2x + 4$.
(b) The line goes through the points (1,1) and (3,-2). We use the fact that the slope is the
same all along the line. Thus,
\[ \frac{y - y_0}{x - x_0} = \frac{y_1 - y_0}{x_1 - x_0} = m. \]
Substituting in the values $(x_0, y_0) = (1, 1)$ and $(x_1, y_1) = (3, -2)$,
\[ \frac{y - 1}{x - 1} = \frac{1 + 2}{1 - 3} = -\frac{3}{2}. \]
(Note that this tells us that the slope is $m = -3/2$.) We find that
\[ y - 1 = -\frac{3}{2}(x - 1) = -\frac{3}{2}x + \frac{3}{2}, \]
\[ y = -\frac{3}{2}x + \frac{5}{2}. \]
(c) The line has y intercept -1 and x intercept 3, i.e. goes through the points (0,-1) and
(3,0). We can use the method in (b) to get
\[ y = \frac{1}{3}x - 1. \]
Alternately, as a shortcut, we could find the slope,
\[ m = \frac{\Delta y}{\Delta x} = \frac{1}{3}. \]
(Note that $\Delta$ means "change in the value", i.e. $\Delta y = y_1 - y_0$.) Thus $m = 1/3$ and
$b = -1$ ($y$ intercept), leading to the same result.
(d) The line has slope -1 and goes through the point (-2,-5). Then,
\[ \frac{y + 5}{x + 2} = -1, \]
so that
\[ y + 5 = -1(x + 2) = -x - 2, \]
\[ y = -x - 7. \]
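The calculations in parts (b) to (d) can be reproduced with a few lines of code. The following is a minimal sketch, not part of the original notes: the function names `line_through_points` and `line_point_slope` are hypothetical, and exact fractions are used (an assumption made here for readability) so that slopes such as -3/2 are not rounded.

```python
from fractions import Fraction

def line_through_points(p0, p1):
    """Return (m, b) for the line y = m*x + b through two points."""
    (x0, y0), (x1, y1) = p0, p1
    if x0 == x1:
        raise ValueError("vertical line: slope is undefined")
    m = Fraction(y1 - y0, x1 - x0)   # slope = change in y / change in x
    b = y0 - m * x0                  # solve y0 = m*x0 + b for b
    return m, b

def line_point_slope(point, m):
    """Return (m, b) for the line with slope m through the given point."""
    x0, y0 = point
    m = Fraction(m)
    return m, y0 - m * x0

print(line_through_points((1, 1), (3, -2)))   # part (b)
print(line_through_points((0, -1), (3, 0)))   # part (c)
print(line_point_slope((-2, -5), -1))         # part (d)
```

Each call returns the slope and the y intercept, which can be compared with the equations found by hand above.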


Exercises
1.1. Find the slope and y intercept of the following straight lines:
(a) $y = 4x - 5$
(b) $3x - 4y = 8$
(c) $2x = 3y$
(d) $y = 3$
(e) $5x - 2y = 23$

1.2. Find the equations of the following straight lines


(a) Through the points (2,0) and (1,5).
(b) Through (3,-1) with slope 1/2.
(c) Through (-10,2) with y intercept 10.
(d) The straight line shown in Figure A.2.

Figure A.2. Figure for problem 2(d) (the plot carries the label $y = -8 + 18x - 9x^2$)


1.3. Find the equations of the following straight lines:
(a) Slope 4 and y intercept 3.

(b) Slope 3 and x intercept 2/3.

(c) Through the points (2, 7) and (1, 11).

(d) Through the point (1, 3) and the origin.

(e) Through the intersection of the lines 3x + 2y = 19 and y = 4x + 7 and


through the point (2, 7).
(f) Through the origin and parallel to the line 2x + 8y = 3.

1
(g) Through the point (2, 5) and perpendicular to the line y = x + 6.
2

1.4. Tangent to a circle: Shown in Figure A.3 is a circle of radius 1. The $x$ coordinate
of the point on the circle at which the line touches the circle is $x = \sqrt{2}/2$. Find
the equation of the tangent line. Use the fact that on a circle, the tangent line is
perpendicular to the radius vector.

Figure A.3. Figure for problem 4


Appendix B

A precalculus review

B.A Manipulating exponents


Recall that $2^n = 2 \cdot 2 \cdots 2$ (with $n$ factors of 2).
This means that $2^n \cdot 2^m = (2 \cdot 2 \cdots 2) \cdot (2 \cdot 2 \cdots 2) = 2^{n+m}$, where we have expanded
the product into $n$ and then $m$ factors of 2. Similarly, we can derive many properties of
manipulations of exponents. A list of these appears below; the same properties hold for any
positive base in place of 2.
1. $2^a 2^b = 2^{a+b}$, as with all similar exponent manipulations.
2. $(2^a)^b = 2^{ab}$ also stems from simple rules for manipulating exponents.
3. $2^x$ is a function that is defined, continuous, and differentiable for all real numbers $x$.
4. $2^x > 0$ for all values of $x$.
5. We define $2^0 = 1$, and we also have that $2^1 = 2$.
6. $2^x \to 0$ for increasing negative values of $x$.
7. $2^x \to \infty$ for increasing positive values of $x$.

B.B Manipulating logarithms


The following properties hold for logarithms of any base. Since we have used the base 2 in
our previous section, we keep the same base here as well. Properties of the logarithm stem
directly from properties of the exponential function, and include the following:
1. $\log_2(ab) = \log_2(a) + \log_2(b)$
2. $\log_2(a^b) = b \log_2(a)$
3. $\log_2(1/a) = \log_2(a^{-1}) = -\log_2(a)$
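The exponent and logarithm rules of this appendix can be spot-checked numerically. The sketch below assumes base 2 and positive test values; the function name `check_rules` is hypothetical, and floating-point results are compared with a tolerance rather than exactly.

```python
import math

def check_rules(a, b):
    """Compare both sides of each base-2 rule for positive numbers a and b."""
    return {
        "2^a * 2^b = 2^(a+b)": math.isclose(2**a * 2**b, 2**(a + b)),
        "(2^a)^b = 2^(ab)": math.isclose((2**a)**b, 2**(a * b)),
        "log2(ab) = log2(a) + log2(b)": math.isclose(math.log2(a * b), math.log2(a) + math.log2(b)),
        "log2(a^b) = b log2(a)": math.isclose(math.log2(a**b), b * math.log2(a)),
        "log2(1/a) = -log2(a)": math.isclose(math.log2(1 / a), -math.log2(a)),
    }

for rule, holds in check_rules(3.5, 1.25).items():
    print(rule, holds)
```

A check of this kind does not prove the rules, but it is a quick way to catch a misremembered sign or a swapped product and sum.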

Appendix C

A Review of Simple
Functions

Here we review a few basic concepts related to functions.

C.A What is a function


A function is just a way of expressing a special relationship between a value we consider
as the input (x) value and an associated output (y) value. We write this relationship in
the form
y = f (x)
to indicate that y depends on x. The only constraint on this relationship is that, for every
value of x we can get at most one value of y. This is equivalent to the vertical line
property: the graph of a function can intersect a vertical line at most at one point. The set
of all allowable x values is called the domain of the function, and the set of all resulting
values of y is the range.
Naturally, we will not always use the symbols x and y to represent independent and
dependent variables. For example, the relationship
\[ V = \frac{4}{3}\pi r^3 \]

expresses a functional connection between the radius, r, and the volume, V , of a sphere.
We say in such a case that V is a function of r.
All the sketches shown in Figure C.1 are valid functions. The first is merely a collection of points, x values and associated y values, the second a histogram. The third sketch
is here meant to represent the collection of smooth continuous functions, and these are the
variety of interest to us here in the study of calculus. On the other hand, the example shown
in Figure C.2 is not the graph of a function. We see that a vertical line intersects this curve
at more than one point. This is not permitted, since as we already said, a given value of x
should have only one corresponding value of y.

Figure C.1. All the examples above represent functions.

Figure C.2. The above elliptical curve cannot be the graph of a function. The
vertical line (shown dashed) intersects the graph at more than one point: This means that
a given value of x corresponds to too many values of y. If we restrict ourselves to the
top part of the ellipse only (or the bottom part only), then we can create a function which
has the corresponding graph.

C.B Geometric transformations


It is important to be able to easily recognize what happens to the graph of a function when
we change the relationship between the variables slightly. Often this is called applying
a transformation. Figures C.3 and C.4 illustrate what happens to a function when shifts,
scaling, or reflections occur:
Figure C.3. (a) The original function $f(x)$, (b) The function $f(x - a)$ shifts $f$ to
the right along the positive $x$ axis by a distance $a$, (c) The function $f(x) + b$ shifts $f$ up the
$y$ axis by height $b$.

Figure C.4. Here we see a function $y = f(x)$ shown in the black solid line. On
the same graph are superimposed the reflections of this graph about the $x$ axis, $y = -f(x)$
(dashed black), about the $y$ axis, $y = f(-x)$ (red), and about the $y$ and the $x$ axis, $y =
-f(-x)$ (red dashed). The latter is equivalent to a rotation of the original graph about the
origin.


C.C Classifying

[Figure C.5: chart with the classes constant, linear, power, smooth, wild, annotated with the properties constant slope, easily computed, has a derivative, unpredictable.]
Figure C.5. Classifying functions according to their properties.
While life offers amazing complexity, one way to study living things is to classify
them into related groups. A biologist looking at animals might group them according to
certain functional properties - being warm blooded, being mammals, having fur or claws,
or having some other interesting characteristic. In the same way, mathematicians often
classify the objects that they study, e.g., functions, into related groups. An example of the
way that functions might be grouped into very broad classes is also shown in Figure C.5.
From left to right, the complexity of behaviour in this chart grows: at left, we see constant
and linear functions (describable by one or two simple parameters such as intercepts or
slope): these linear functions are most convenient or simplest to describe. Further to the
right are functions that are smooth and continuous, while at the right, some more irregular,
discontinuous function represents those that are outside the group of the well-behaved.
We will study some of the examples along this spectrum, and describe properties that they
share, properties they inherit from their cousins, and new characteristics that appear at
distinct branches.

C.D Power functions and symmetry


We list some of the features of each family of power functions in this section.
Even integer powers
For $n = 2, 4, 6, 8, \ldots$ the shape of the graph of $y = x^n$ is as shown in Figure ??(a).
Here are some things to notice about these graphs:

1. The graphs of all the even power functions intersect at $x = 0$ and at $x = \pm 1$. The
value of $y$ corresponding to both $x = 1$ and $x = -1$ is $y = +1$. (Thus, the coordinates of the
three intersection points are $(0, 0)$, $(1, 1)$, $(-1, 1)$.)
2. All graphs have a lowest point, also called a minimum value, at $x = 0$.
3. As $x \to \pm\infty$, $y \to \infty$. We also say that the functions are unbounded from above.
4. The graphs are all symmetric about the y axis. This special type of symmetry will be
of interest in other types of functions, not just power functions. A function with this
property is called an even function.
Odd integer powers
For $n = 1, 3, 5, 7, \ldots$ and other odd powers, the graphs have shapes shown in Figure ??(b).
1. The graphs of the odd power functions intersect at $x = 0$ and at $x = \pm 1$. The three
points of intersection in common to all odd power functions are $(1, 1)$, $(0, 0)$, and
$(-1, -1)$.
2. None of the odd power functions have a minimum value.
3. As $x \to +\infty$, $y \to +\infty$. As $x \to -\infty$, $y \to -\infty$. The functions are unbounded
from above and below.
4. The graphs are all symmetric about the origin. This special type of symmetry will be
of interest in other types of functions, not just power functions. A function with this
type of symmetry is called an odd function.

C.D.1 Further properties of intersections


Here, and in Figure C.6 we want to notice that a horizontal line intersects the graph of a
power function only once for the odd powers but possibly twice for the even powers (we
have to allow for the case that the line does not intersect at all, or that it intersects precisely
at the minimum point). This observation will be important further on, once we want to
establish the idea of an inverse function.
A horizontal line has an equation of the form y = C where C is some constant. To
find where it intersects the graph of a power function $y = x^n$, we would solve an equation
of the form
\[ x^n = C. \qquad \text{(C.1)} \]
To do so, we take the $n$th root of both sides:
\[ (x^n)^{1/n} = C^{1/n}. \]
Simplifying, using algebraic operations on powers, leads to
\[ (x^n)^{1/n} = x^{n/n} = x^1 = x = C^{1/n}. \]


However, we have to allow for the fact that there may be more than one solution to equation C.1, as shown for some $C > 0$ in Figure C.6. Here we see the distinction between
odd and even power functions. If $n$ is even then the solutions to equation C.1 are
\[ x = \pm C^{1/n}, \]
whereas if $n$ is odd, there is but a single solution,
\[ x = C^{1/n}. \]
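The case analysis for equation C.1 can be written out as a small Python function. This is a sketch under the assumption that only real solutions are wanted and that n is a positive integer; the name `real_roots` is hypothetical. It also covers the cases the text mentions in passing: an even power with a negative C (no intersection) and C = 0 (a single intersection).

```python
import math

def real_roots(n, C):
    """Real solutions of x**n == C for a positive integer n."""
    if n <= 0:
        raise ValueError("n must be a positive integer")
    if C == 0:
        return [0.0]                       # the line touches the graph at the origin
    if n % 2 == 1:
        root = abs(C) ** (1.0 / n)         # odd power: exactly one solution,
        return [math.copysign(root, C)]    # with the same sign as C
    if C < 0:
        return []                          # even power: the graph never goes below y = 0
    root = C ** (1.0 / n)
    return [-root, root]                   # even power, C > 0: two solutions

print(real_roots(2, 2.25))   # horizontal line y = 2.25 against y = x^2
print(real_roots(3, 2.25))   # the same line against y = x^3
print(real_roots(2, -1.0))   # no intersection
```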

[Figure C.6: two panels, (a) and (b), showing $y = x^2$ and $y = x^3$, each with a horizontal line $y = C$.]

Figure C.6. The even power functions intersect a horizontal line in up to two
places, while the odd power functions intersect such a line in only one place.

Definition C.1 (Even and odd functions:). A function that is symmetric about the y axis
is said to be an even function. A function that is symmetric about the origin is said to be an
odd function.
Even functions satisfy the relationship
\[ f(-x) = f(x). \]
Odd functions satisfy the relationship
\[ f(-x) = -f(x). \]
Examples of even functions include $y = \cos(x)$, $y = x^8$, $y = |x|$. All these are
their own mirror images when reflected about the $y$ axis. Examples of odd functions are
$y = \sin(x)$, $y = x^3$, $y = x$. Each of these functions is its own double-reflection (about the $y$
and then the $x$ axes).
In a later calculus course, when we compute integrals, taking these symmetries into
account can help to simplify (or even avoid) calculations.

C.D.2 Optional: Combining even and odd functions


Not every function is either odd or even. However, if we start with symmetric functions,
certain manipulations either preserve or reverse the symmetry.
Example C.2 Show that the product of an even and an odd function is an odd function.

Solution: Let $f(x)$ be even. Then
\[ f(-x) = f(x). \]
Let $g(x)$ be an odd function. Then $g(-x) = -g(x)$. We define $h(x)$ to be the product of
these two functions,
\[ h(x) = f(x)g(x). \]
Using the properties of $f$ and $g$,
\[ f(-x)g(-x) = f(x)[-g(x)] \]
so, rearranging, we get
\[ h(-x) = f(-x)g(-x) = f(x)[-g(x)] = -[f(x)g(x)], \]
but this is just the same as $-h(x)$. We have established that
\[ h(-x) = -h(x) \]
so that the new function is odd.
A function is not always even or odd. Many functions are neither even nor odd.
However, by a little trick, we can show that given any function, y = f (x), we can write it
as a sum of an even and an odd function.
Hint: Suppose $f(x)$ is neither an even nor an odd function. Consider defining the two associated functions:
\[ f_e(x) = \frac{1}{2}\bigl(f(x) + f(-x)\bigr), \]
and
\[ f_0(x) = \frac{1}{2}\bigl(f(x) - f(-x)\bigr). \]
(Can you draw a sketch of what these would look like for the function given in Figure C.3(a)?) Show that $f_e(x)$ is even and that $f_0(x)$ is odd. Now show that
\[ f(x) = f_e(x) + f_0(x). \]
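The hint can be tried out numerically before proving it. The sketch below builds the even and odd parts of an arbitrary function exactly as defined in the hint; the helper names `even_part` and `odd_part` and the sample polynomial are assumptions made for illustration.

```python
import math

def even_part(f):
    """Return the function x -> (f(x) + f(-x)) / 2."""
    return lambda x: 0.5 * (f(x) + f(-x))

def odd_part(f):
    """Return the function x -> (f(x) - f(-x)) / 2."""
    return lambda x: 0.5 * (f(x) - f(-x))

# A sample function that is neither even nor odd.
def f(x):
    return x**3 + 2 * x**2 + 1

fe, fo = even_part(f), odd_part(f)
for x in (0.5, 1.0, 2.0):
    print(x,
          math.isclose(fe(-x), fe(x)),          # even: unchanged by x -> -x
          math.isclose(fo(-x), -fo(x)),         # odd: changes sign under x -> -x
          math.isclose(fe(x) + fo(x), f(x)))    # the two parts add back up to f
```

For this sample polynomial the even part collects the even powers, $2x^2 + 1$, and the odd part is $x^3$.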

C.E Inverse functions and fractional powers


Suppose we are given a function expressed in the form


y = f (x).
What this implies, is that x is the independent variable, and y is obtained from it by evaluating a function, i.e. by using the rule or operation specified by that function. The above
mathematical statement expresses a certain relationship between the two variables, x and
y, in which the roles are distinct. x is a value we pick, and y is then calculated from it.
However, sometimes we can express a relationship in more than one way: as an
example, if the connection between x and y is simple squaring, then provided x > 0, we
might write either
\[ y = x^2 \]
or
\[ x = y^{1/2} = \sqrt{y} \]
to express the same relationship. In other words
\[ y = x^2 \quad\Leftrightarrow\quad x = \sqrt{y}. \]
Observe that we have used two distinct functions in describing the relationship from
the two points of view: One function involves squaring and the other takes a square root.
We may also notice that, writing $f(x) = x^2$ and $g(x) = \sqrt{x}$, for $x > 0$
\[ f(g(x)) = (\sqrt{x})^2 = x, \]
\[ g(f(x)) = \sqrt{x^2} = x, \]
i.e. that these two functions invert each other's effect.
Functions that satisfy
\[ y = f(x) \quad\Leftrightarrow\quad x = g(y) \]
are said to be inverse functions. We will often use the notation
\[ f^{-1}(x) \]
to denote the function that acts as an inverse function to $f(x)$.

C.E.1 Graphical property of inverse functions


The graph of an inverse function $y = f^{-1}(x)$ is geometrically related to the graph of the
original function: it is a reflection of $y = f(x)$ about the 45° line, $y = x$. This relationship
is shown in Figure C.7 for a pair of functions $f$ and $f^{-1}$.
But why should this be true? The idea is as follows: Suppose that $(a, b)$ is any
point on the graph of $y = f(x)$. This means that $b = f(a)$. That, in turn, implies that
$a = f^{-1}(b)$, which then tells us that $(b, a)$ must be a point on the graph of $f^{-1}(x)$. But the
points $(a, b)$ and $(b, a)$ are related by reflection about the line $y = x$. This is true for any
arbitrary point, and so must be true for all points on the graphs of the two functions.

Figure C.7. The point $(a, b)$ is on the graph of $y = f(x)$. If the roles of $x$ and
$y$ are interchanged, this point becomes $(b, a)$. Geometrically, this point is the reflection of
$(a, b)$ about the line $y = x$. Thus, the graph of the inverse function $y = f^{-1}(x)$ is related
to the graph of the original function by reflection about the line $y = x$. In the left panel,
the inverse is not a function, as it does not satisfy the vertical line property. In the panel on
the right, both $f$ and its reflection satisfy that property, and thus the inverse, $f^{-1}$, is a true
function.

C.E.2 Restricting the domain


The above argument establishes that, given the graph of a function, its inverse is obtained
by reflecting the graph in an imaginary mirror placed along a line y = x.
However, a difficulty could arise. In particular, for the function
\[ y = f(x) = x^2, \]
a reflection of this type would lead to a curve that cannot be a function, as shown in Figure C.8. (The sideways parabola would not be a function if we included both its branches,
since a given value of x would have two associated y values.)
To fix such problems, we simply restrict the domain to x > 0, i.e. to the solid parts of
the curves shown in Figure C.8. For this subset of the x axis, we have no problem defining
the inverse function.
Observe that the problem described above would be encountered for any of the even
power functions (by virtue of their symmetry about the y axis) but not by the odd power
functions.
For example,
\[ y = f(x) = x^3, \qquad y = f^{-1}(x) = x^{1/3} \]
are inverse functions for all $x$ values: when we reflect the graph of $x^3$ about the line $y = x$
we do not encounter problems of multiple $y$ values.


[Figure C.8: graphs of $y = x^2$ and $y = x^{1/2}$ on axes running from -1.5 to 1.5, with the annotations "omit this branch" and "Blue curve not a function if this branch is included".]

Figure C.8. The graph of $y = f(x) = x^2$ (blue) and of its inverse function. We
cannot define the inverse for all $x$, because the red parabola does not satisfy the vertical
line property. However, if we restrict to positive $x$ values, this problem is circumvented.

This follows directly from the horizontal line properties that we discussed earlier, in
Figure C.6. When we reflect the graphs shown in Figure C.6 about the line y = x, the
horizontal lines will be reflected onto vertical lines. Odd power functions will have inverses that intersect a vertical line exactly once, i.e. they satisfy the vertical line property
discussed earlier.
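The difference between odd and even powers can also be seen by applying a function and then its proposed inverse. The following sketch uses hypothetical helper names and a handful of sample points; it assumes the real cube root is wanted for negative inputs, which is why the sign is handled explicitly.

```python
import math

def cube(x):
    return x ** 3

def cube_root(x):
    # Real cube root: take the root of |x| and restore the sign of x.
    return math.copysign(abs(x) ** (1.0 / 3.0), x)

def square(x):
    return x * x

for x in (-2.0, -0.5, 0.0, 0.5, 2.0):
    back_from_cube = cube_root(cube(x))        # recovers x for every x
    back_from_square = math.sqrt(square(x))    # recovers |x|, so only x >= 0 survives
    print(x, back_from_cube, back_from_square)
```

For negative inputs the square root returns the positive partner of x, which is the numerical face of the two-branch problem that restricting the domain removes.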

C.F Polynomials
A polynomial is a function of the form
\[ y = p(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0. \]
This form is sometimes referred to as superposition (i.e. simple addition) of the basic
power functions with integer powers. The constants ak are called coefficients. In practice
some of these may be zero. We will restrict attention to the case where all these coefficients
are real numbers. The highest power n (whose coefficient is not zero) is called the degree
of the polynomial.
We will be interested in these functions for several reasons. Primarily, we will find
that computations involving polynomials are particularly easy, since operations include
only the basic addition and multiplication.


C.F.1 Features of polynomials


Zeros of a polynomial are values of $x$ such that
\[ y = p(x) = 0. \]
If $p(x)$ is quadratic (a polynomial of degree 2) then the quadratic formula gives a
simple way of finding roots of this equation (also called zeros of the polynomial).
Generally, for most polynomials of degree 5 or higher, there is no analytical recipe
for finding zeros. Geometrically, zeros are places where the graph of the function
$y = p(x)$ crosses the $x$ axis. We will exploit this fact much later in the course to
approximate the values of the zeros using Newton's Method.
Critical Points: Places on the graph where the value of the function is locally larger
than those nearby (local maxima) or smaller than those nearby (local minima) will be
of interest to us. Calculus will be one of the main tools for detecting and identifying
such places.
Behaviour for very large x: All polynomials of degree at least 1 are unbounded as $x \to \infty$ and as
$x \to -\infty$. In fact, for large enough values of $x$, we have seen that the power function
$y = f(x) = x^n$ with the largest power, $n$, dominates over other power functions with
smaller powers. For
\[ p(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0 \]
the first (highest power) term will dominate for large $x$. Thus for large $x$ (whether
positive or negative)
\[ p(x) \approx a_n x^n \quad \text{for large } x. \]
Behaviour for small x: Close to the origin, we have seen that power functions with
smallest powers dominate. This means that for $x \approx 0$ the polynomial is governed by
the behaviour of the smallest (non-zero coefficient) power, i.e.,
\[ p(x) \approx a_1 x + a_0 \quad \text{for small } x. \]
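The two approximations above can be checked on a concrete polynomial. The sketch below is illustrative only: the cubic $2x^3 - 5x^2 + 3x + 7$ and the helper names are assumptions, and the polynomial is evaluated with Horner's rule from a coefficient list ordered from the highest power down to the constant.

```python
def poly(coeffs, x):
    """Evaluate a polynomial by Horner's rule; coeffs run from highest power to constant."""
    result = 0.0
    for c in coeffs:
        result = result * x + c
    return result

def leading_ratio(coeffs, x):
    """Ratio p(x) / (a_n x^n); it should approach 1 for large |x|."""
    n = len(coeffs) - 1
    return poly(coeffs, x) / (coeffs[0] * x**n)

def low_order_error(coeffs, x):
    """Size of p(x) - (a_1 x + a_0); it should be tiny for small |x|."""
    return abs(poly(coeffs, x) - (coeffs[-2] * x + coeffs[-1]))

coeffs = [2.0, -5.0, 3.0, 7.0]   # p(x) = 2x^3 - 5x^2 + 3x + 7 (an arbitrary example)
for x in (10.0, 1e3, 1e6):
    print("large x:", x, leading_ratio(coeffs, x))
for x in (0.1, 1e-3, 1e-6):
    print("small x:", x, low_order_error(coeffs, x))
```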


Exercises
3.1. Figure C.9 shows the graph of the function y = f (x). Match the functions (a)-(d)
below with their appropriate graph (1)-(4) in Figure C.10.
(a) y = |f (x)|,

(b) y = f (|x|),
(c) y = f (x),
(d) y = f (x).

Figure C.9. Plot for problem 1

[Figure C.10: four panels labelled (1) to (4).]

Figure C.10. Plot for problem 1

3.2. Even and odd functions: An even function is a function that satisfies the relationship $f(-x) = f(x)$. An odd function satisfies the relationship $g(-x) = -g(x)$.
Determine which of the following is odd, which is even, and which is neither.
(a) $h(x) = 3x$
(b) $p(x) = x^2 - 3x^4$
(c) $q(x) = 2$
(d) $w(x) = \sin(2x)$
(e) $s(x) = x + x^2$
3.3. Figure C.11 shows the graph for the function $y = f(x)$. Sketch the graph for $y = f(|x|)$.

Figure C.11. Plot for problem 3


3.4. Consider the function $y = Ax^n$ for $n > 0$ an odd integer and $A > 0$ a constant.
Find the inverse function. Sketch both functions on the same set of coordinates, and
indicate the points of intersection. How would your figure differ if n were an even
integer?


Appendix D

Limits

We have surreptitiously introduced some notation involving limits without carefully defining what was meant. Here, such technical matters are briefly discussed.
The concept of a limit helps us to describe the behaviour of a function close to some
point of interest. This proves to be most useful in the case of functions that are either not
continuous, or not defined somewhere. We will use the notation
\[ \lim_{x \to a} f(x) \]

to denote the value that the function f approaches as x gets closer and closer to the value
a.

D.A Limits for continuous functions


If $x = a$ is a point at which the function is defined and continuous (informally, has no
breaks in its graph), the value of the limit and the value of the function at that point are the
same, i.e.
If $f$ is continuous at $x = a$ then
\[ \lim_{x \to a} f(x) = f(a). \]

Example D.1 Find $\lim_{x \to 0} f(x)$ for the function $y = f(x) = 10$.

Solution: This function is continuous (and constant) everywhere. In fact, the value of the
function is independent of $x$. We conclude immediately that
\[ \lim_{x \to 0} f(x) = \lim_{x \to 0} 10 = 10. \]

Example D.2 Find $\lim_{x \to 0} f(x)$ for the function $y = f(x) = \sin(x)$.


Solution: This function is a continuous trigonometric function, and has the value $\sin(0) = 0$
at the origin. Thus
\[ \lim_{x \to 0} f(x) = \lim_{x \to 0} \sin(x) = 0. \]
Power functions are continuous everywhere. This motivates the next example.
Example D.3 Compute the limit $\lim_{x \to 0} x^n$ where $n$ is a positive integer.

Solution: The function in question, $f(x) = x^n$, is a simple power function that is continuous everywhere. Further, $f(0) = 0$. Hence the limit as $x \to 0$ coincides with the value of
the function at that point, so
\[ \lim_{x \to 0} x^n = 0. \]

D.B Properties of limits


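Suppose we are given two functions, $f(x)$ and $g(x)$. We will also assume that both functions have (finite) limits at the point $x = a$. Then the following statements follow.
1. \[ \lim_{x \to a} \bigl(f(x) + g(x)\bigr) = \lim_{x \to a} f(x) + \lim_{x \to a} g(x) \]
2. \[ \lim_{x \to a} \bigl(c f(x)\bigr) = c \lim_{x \to a} f(x) \]
3. \[ \lim_{x \to a} \bigl(f(x) \cdot g(x)\bigr) = \Bigl(\lim_{x \to a} f(x)\Bigr) \cdot \Bigl(\lim_{x \to a} g(x)\Bigr) \]
4. Provided that $\lim_{x \to a} g(x) \neq 0$, we also have that
\[ \lim_{x \to a} \frac{f(x)}{g(x)} = \frac{\lim_{x \to a} f(x)}{\lim_{x \to a} g(x)} \]
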

The first two statements are equivalent to linearity of the process of computing a
limit.
Example D.4 Find $\lim_{x \to 2} f(x)$ for the function $y = f(x) = 2x^2 - x^3$.

Solution: Since this function is a polynomial, and so continuous everywhere, we can simply plug in the relevant value of $x$, i.e.
\[ \lim_{x \to 2} \left(2x^2 - x^3\right) = 2 \cdot 2^2 - 2^3 = 0. \]


Thus when x gets closer to 2, the value of the function gets closer to 0. (In fact, the value
of the limit is the same as the value of the function at the given point.)


D.C Limits of rational functions


D.C.1 Case 1: Denominator nonzero
We first consider functions that are the quotient of two polynomials, $y = f(x)/g(x)$, at
points where $g(x) \neq 0$. This allows us to apply Property 4 of limits together with what
we have learned about the properties of power functions and polynomials. Much of this
discussion is related to the properties of power functions and dominance of lower (higher)
powers at small (large) values of $x$, as discussed in Chapter 1. In the examples below, we
consider both limits at the origin (at $x = 0$) and at infinity (for $x \to \infty$). The latter means
very large x. See Section ?? for examples of the informal version of the same reasoning
used to reach the same conclusions.
Example D.5 Find the limit as $x \to 0$ and as $x \to \infty$ of the quotients
\[ \text{(a)}\ \frac{Kx}{k_n + x}, \qquad \text{(b)}\ \frac{Ax^n}{a^n + x^n}. \]

Solution: We recognize (a) as an example of the Michaelis-Menten kinetics, found in (1.7),
and (b) as a Hill function in (1.6) of Chapter 1. We now compute, first for $x \to 0$,
\[ \text{(a)}\ \lim_{x \to 0} \frac{Kx}{k_n + x} = 0, \qquad \text{(b)}\ \lim_{x \to 0} \frac{Ax^n}{a^n + x^n} = 0. \]
This follows from the fact that, provided $a, k_n \neq 0$, both functions are continuous at $x = 0$,
so that their limits are the same as the actual values attained by the functions. Now for
$x \to \infty$,
\[ \text{(a)}\ \lim_{x \to \infty} \frac{Kx}{k_n + x} = \lim_{x \to \infty} \frac{Kx}{x} = K, \qquad \text{(b)}\ \lim_{x \to \infty} \frac{Ax^n}{a^n + x^n} = \lim_{x \to \infty} \frac{Ax^n}{x^n} = A. \]
This follows from the fact that the constants $k_n$, $a^n$ are always swamped out by the value
of $x$ as $x \to \infty$, allowing us to obtain the result. Other than the formal limit notation, there
is nothing new here that we have not already discussed in Section 1.5.
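The limits in Example D.5 can be illustrated by evaluating both quotients at very small and very large x. This is a numerical sketch, not a proof; the parameter values (K = 2, k_n = 0.5, A = 3, a = 2, n = 2) are arbitrary choices made for the illustration.

```python
def michaelis_menten(x, K, kn):
    """The quotient K x / (kn + x)."""
    return K * x / (kn + x)

def hill(x, A, a, n):
    """The quotient A x^n / (a^n + x^n)."""
    return A * x**n / (a**n + x**n)

K, kn = 2.0, 0.5          # assumed sample parameters
A, a, n = 3.0, 2.0, 2     # assumed sample parameters

for x in (1e-6, 1e-3, 1.0, 1e3, 1e6):
    print(x, michaelis_menten(x, K, kn), hill(x, A, a, n))
```

As x shrinks both columns head to 0, and as x grows they level off near K and A respectively, matching the saturation behaviour described above.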
Below we apply similar reasoning to other examples of rational functions.
Example D.6 Find the limit as $x \to 0$ and as $x \to \infty$ of the quotients
\[ \text{(a)}\ \frac{3x^2}{9 + x^2}, \qquad \text{(b)}\ \frac{1 + x}{1 + x^3}. \]

Solution: For part (a) we note that as $x \to \infty$, the quotient approaches $3x^2/x^2 = 3$. As
$x \to 0$, both numerator and denominator are defined and the denominator is nonzero, so
we can use the 4th property of limits. We thus find that
\[ \text{(a)}\ \lim_{x \to \infty} \frac{3x^2}{9 + x^2} = 3, \qquad \lim_{x \to 0} \frac{3x^2}{9 + x^2} = 0. \]
For part (b), we use the fact that as $x \to \infty$, the quotient approaches $x/x^3 = x^{-2} \to 0$. As
$x \to 0$ we can apply property 4 yet again to compute the (finite) limit, so that
\[ \text{(b)}\ \lim_{x \to \infty} \frac{1 + x}{1 + x^3} = 0, \qquad \lim_{x \to 0} \frac{1 + x}{1 + x^3} = 1. \]


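Example D.7 Find the limits of the following function at 0 and $\infty$:
\[ y = \frac{x^4 - 3x^2 + x - 1}{x^5 + x}. \]

Solution: For $x \to \infty$, powers with the largest power dominate, whereas for $x \to 0$, smaller
powers dominate. Hence, we find
\[ \lim_{x \to \infty} \frac{x^4 - 3x^2 + x - 1}{x^5 + x} = \lim_{x \to \infty} \frac{x^4}{x^5} = \lim_{x \to \infty} \frac{1}{x} = 0, \]
\[ \lim_{x \to 0} \frac{x^4 - 3x^2 + x - 1}{x^5 + x} = \lim_{x \to 0} \frac{-1}{x} = -\lim_{x \to 0} \frac{1}{x} = \mp\infty. \]
So in the latter case, the limit does not exist: the quotient tends to $-\infty$ as $x$ approaches 0 from the right and to $+\infty$ from the left.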

D.C.2 Case 2: zero in the denominator and holes in a graph


In the previous examples, evaluating the limit, where it existed, was as simple as plugging
the appropriate value of x into the function itself. The next example shows that this is not
always possible.
Example D.8 Compute the limit as $x \to 4$ of the function $f(x) = 1/(x - 4)$.
Solution: This function has a vertical asymptote at $x = 4$. Indeed, the value of the function
shoots off to $+\infty$ if we approach $x = 4$ from above, and $-\infty$ if we approach the same point
from below. We say that the limit does not exist in this case.
Example D.9 Compute the limit as $x \to 1$ of the function $f(x) = x/(x^2 - 1)$.
Solution: We compute
\[ \lim_{x \to 1} \frac{x}{x^2 - 1} = \lim_{x \to 1} \frac{x}{(x - 1)(x + 1)}. \]
It is evident (even before factoring as we have done) that this function has a vertical asymptote at $x = 1$ where the denominator approaches zero. Hence, the limit does not exist.
Next, we describe an extremely important example where the function has a hole in
its graph, but where a finite limit exists. This kind of limit plays a huge role in the definition
of a derivative.


Example D.10 Find $\lim_{x \to 2} f(x)$ for the function $y = (x - 2)/(x^2 - 4)$.

Solution: This function is a quotient of two rational expressions $f(x)/g(x)$ but we note that
$\lim_{x \to 2} g(x) = \lim_{x \to 2} (x^2 - 4) = 0$. Thus we cannot use property 4 directly. However, we
can simplify the quotient by observing that for $x \neq 2$ the function $y = (x - 2)/(x^2 - 4) =
(x - 2)/[(x - 2)(x + 2)]$ takes on the same values as the expression $1/(x + 2)$. At the
point $x = 2$, the function itself is not defined, since we are not allowed division by zero.
However, the limit of this function does exist:
\[ \lim_{x \to 2} f(x) = \lim_{x \to 2} \frac{x - 2}{x^2 - 4}. \]
Provided $x \neq 2$ we can factor the denominator and cancel:
\[ \lim_{x \to 2} \frac{x - 2}{x^2 - 4} = \lim_{x \to 2} \frac{x - 2}{(x - 2)(x + 2)} = \lim_{x \to 2} \frac{1}{x + 2}. \]
Now we can substitute $x = 2$ to obtain
\[ \lim_{x \to 2} f(x) = \frac{1}{2 + 2} = \frac{1}{4}. \]


Figure D.1. The function $y = \frac{x - 2}{x^2 - 4}$ has a hole in its graph at $x = 2$.
The limit of the function as $x$ approaches 2 does exist, and supplies the missing point:
$\lim_{x \to 2} f(x) = \frac{1}{4}$.
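A hole of this kind is easy to see numerically: the quotient cannot be evaluated at x = 2 itself, yet its values on either side settle down to 1/4. The sketch below uses a hypothetical function name and a few arbitrary step sizes.

```python
def hole_function(x):
    """The quotient (x - 2) / (x^2 - 4); undefined at x = 2 (and at x = -2)."""
    return (x - 2) / (x**2 - 4)

for h in (1e-1, 1e-3, 1e-5):
    # Values just above and just below x = 2 both approach 0.25.
    print(h, hole_function(2 + h), hole_function(2 - h))

try:
    hole_function(2.0)
except ZeroDivisionError:
    print("not defined at x = 2: division by zero")
```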

Example D.11 Compute the limit
\[ \lim_{h \to 0} \frac{K(x + h)^2 - Kx^2}{h}. \]

Solution: This is a calculation we would perform to compute the derivative of the function
$y = Kx^2$ from the definition of the derivative. Details have already been displayed in Example 2.21. The essential idea is that we expand the numerator and simplify algebraically
as follows:
\[ \lim_{h \to 0} K \frac{(2xh + h^2)}{h} = \lim_{h \to 0} K(2x + h) = 2Kx. \]
Even though the quotient is not defined at the value h = 0 (as the denominator is zero
there), the limit exists, and hence the derivative can be defined. See also Example 3.15 for
a similar calculation for the function Kx3 .

D.D Right and left sided limits


Some functions are discontinuous at a point, but we may still be able to define a limit that
the function attains as we approach that point from the right or from the left. (This is
equivalent to gradually decreasing or gradually increasing $x$ as we get closer to the point
of interest.)
Consider the function
\[ f(x) = \begin{cases} 0 & \text{if } x < 0; \\ 1 & \text{if } x > 0. \end{cases} \]
This is a step function, whose value is 0 for negative real numbers, and 1 for positive real
numbers. The function is not even defined at the point $x = 0$ and has a jump in its graph.
However, we can still define a right and a left limit as follows:
\[ \lim_{x \to 0^+} f(x) = 1, \qquad \lim_{x \to 0^-} f(x) = 0. \]
That is, the limit as we approach from the right is 1 whereas from the left it is 0. We also
state the following result:
If $f(x)$ has a right and a left limit at a point $x = a$ and if those limits
are equal, then we say that the limit at $x = a$ exists, and we write
\[ \lim_{x \to a^+} f(x) = \lim_{x \to a^-} f(x) = \lim_{x \to a} f(x). \]

Example D.12 Find $\lim_{x \to \pi/2} f(x)$ for the function $y = f(x) = \tan(x)$.

Solution: The function $\tan(x) = \sin(x)/\cos(x)$ cannot be continuous at $x = \pi/2$ because
$\cos(x)$ in the denominator takes on the value of zero at the point $x = \pi/2$. Moreover, the
value of this function becomes unbounded (grows without a limit) as $x \to \pi/2$. We say in
this case that the limit does not exist. We sometimes use the notation
\[ \lim_{x \to \pi/2} \tan(x) = \pm\infty. \]
(We can distinguish the fact that the function approaches $+\infty$ as $x$ approaches $\pi/2$ from
below, and $-\infty$ as $x$ approaches $\pi/2$ from higher values.)
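One-sided behaviour like this can be explored by sampling the function at points that creep towards the point of interest from one side only. The sketch below is a rough numerical probe under stated assumptions: the helper name `one_sided_values` and the step sizes 10^-1, ..., 10^-6 are illustrative choices, and sampling suggests a trend rather than proving a limit.

```python
import math

def one_sided_values(f, a, side, steps=6):
    """Sample f at a +/- 10^-k for k = 1..steps, approaching a from the given side."""
    sign = 1.0 if side == "right" else -1.0
    return [f(a + sign * 10.0 ** (-k)) for k in range(1, steps + 1)]

from_below = one_sided_values(math.tan, math.pi / 2, "left")
from_above = one_sided_values(math.tan, math.pi / 2, "right")

print("approaching pi/2 from below:", from_below)   # large positive, growing
print("approaching pi/2 from above:", from_above)   # large negative, growing in size
```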


D.E Limits at infinity


We can also describe the behaviour at infinity i.e. the trend displayed by a function for
very large (positive or negative) values of x. We consider a few examples of this sort below.
Example D.13 Find $\lim_{x \to \infty} f(x)$ for the function $y = f(x) = x^3 - x^5 + x$.

Solution: All polynomials grow in an unbounded way as $x$ tends to very large values. We
can determine whether the function approaches positive or negative unbounded values by
looking at the coefficient of the highest power of $x$, since that power dominates at large $x$
values. In this example, we find that the term $-x^5$ is that highest power. Since this has a
negative coefficient, the function will approach unbounded negative values as $x$ gets larger
in the positive direction, i.e.
\[ \lim_{x \to \infty} \left(x^3 - x^5 + x\right) = \lim_{x \to \infty} \left(-x^5\right) = -\infty. \]

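Example D.14 Determine the following two limits:
\[ \text{(a)}\ \lim_{x \to \infty} e^{-2x}, \qquad \text{(b)}\ \lim_{x \to -\infty} e^{5x}. \]

Solution: The function $y = e^{-2x}$ becomes arbitrarily small as $x \to \infty$. The function
$y = e^{5x}$ becomes arbitrarily small as $x \to -\infty$. Thus we have
\[ \text{(a)}\ \lim_{x \to \infty} e^{-2x} = 0, \qquad \text{(b)}\ \lim_{x \to -\infty} e^{5x} = 0. \]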

Example D.15 Find the limits below:
\[ \text{(a)}\ \lim_{x \to \infty} x^2 e^{-2x}, \qquad \text{(b)}\ \lim_{x \to 0} \frac{1}{x} e^{-x}. \]

Solution: For part (a) we state here the fact that as $x \to \infty$, the exponential function with
negative exponent decays to zero faster than any power function increases. For part (b) we
note that for the quotient $e^{-x}/x$ we have that as $x \to 0$ the top satisfies $e^{-x} \to e^0 = 1$,
while the denominator has $x \to 0$. Thus the limit at $x \to 0$ cannot exist. We find that
\[ \text{(a)}\ \lim_{x \to \infty} x^2 e^{-2x} = 0, \qquad \text{(b)}\ \lim_{x \to 0} \frac{1}{x} e^{-x} = \pm\infty. \]

D.F Summary of special limits


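As a reference, in the table below, we collect some of the special limits that are useful in a
variety of situations.

Function                 Point             Limit notation                              Value
$e^{-ax}$, $a > 0$       $x \to \infty$    $\lim_{x \to \infty} e^{-ax}$               $0$
$e^{-ax}$, $a > 0$       $x \to -\infty$   $\lim_{x \to -\infty} e^{-ax}$              $\infty$
$e^{ax}$, $a > 0$        $x \to \infty$    $\lim_{x \to \infty} e^{ax}$                $\infty$
$e^{kx}$                 $x \to 0$         $\lim_{x \to 0} e^{kx}$                     $1$
$x^n e^{-ax}$, $a > 0$   $x \to \infty$    $\lim_{x \to \infty} x^n e^{-ax}$           $0$
$\ln(ax)$, $a > 0$       $x \to \infty$    $\lim_{x \to \infty} \ln(ax)$               $\infty$
$\ln(ax)$, $a > 0$       $x \to 1$         $\lim_{x \to 1} \ln(ax)$                    $\ln(a)$
$\ln(ax)$, $a > 0$       $x \to 0$         $\lim_{x \to 0} \ln(ax)$                    $-\infty$
$x \ln(ax)$, $a > 0$     $x \to 0$         $\lim_{x \to 0} x \ln(ax)$                  $0$
$\ln(ax)/x$, $a > 0$     $x \to \infty$    $\lim_{x \to \infty} \frac{\ln(ax)}{x}$     $0$
$\sin(x)/x$              $x \to 0$         $\lim_{x \to 0} \frac{\sin(x)}{x}$          $1$
$(1 - \cos(x))/x$        $x \to 0$         $\lim_{x \to 0} \frac{1 - \cos(x)}{x}$      $0$

Table D.1. A collection of useful limits.

We can summarize the information in this table informally as follows: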

1. The exponential function $e^x$ grows faster than any power function as $x$ increases, and
conversely the function $e^{-x} = 1/e^x$ decreases faster than any power of $(1/x)$ as $x$
grows. The same is true for $e^{ax}$ provided $a > 0$.
2. The logarithm $\ln(x)$ is an increasing function that keeps growing without bound as
$x$ increases, but it does not grow as rapidly as the function $y = x$. The same is true
for $\ln(ax)$ provided $a > 0$. The logarithm is not defined for negative values of its
argument and as $x$ approaches zero, this function becomes unbounded and negative.
However, it approaches $-\infty$ more slowly than $x$ approaches 0. For this reason, the
expression $x \ln(x)$ has a limit of 0 as $x \to 0$.

Appendix E

Proof of the chain rule

Here we present a plausibility argument for the Chain Rule. First note that if a function
is differentiable, then it is also continuous. This means that when x changes a very little,
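$u$ can change only by a little. (There are no abrupt jumps.) Then $\Delta x \to 0$ means that
$\Delta u \to 0$.
Now consider the definition of the derivative $dy/du$:
\[ \frac{dy}{du} = \lim_{\Delta u \to 0} \frac{\Delta y}{\Delta u}. \]
This means that for any (finite) $\Delta u$,
\[ \frac{\Delta y}{\Delta u} = \frac{dy}{du} + \epsilon \]
where $\epsilon \to 0$ as $\Delta u \to 0$. Then
\[ \Delta y = \frac{dy}{du} \Delta u + \epsilon \Delta u. \]
Now divide both sides by some (nonzero) $\Delta x$. Then
\[ \frac{\Delta y}{\Delta x} = \frac{dy}{du} \frac{\Delta u}{\Delta x} + \epsilon \frac{\Delta u}{\Delta x}. \]
Taking $\Delta x \to 0$ we get $\Delta u \to 0$ (by continuity) and hence also $\epsilon \to 0$, so that as desired,
\[ \frac{dy}{dx} = \frac{dy}{du} \frac{du}{dx}. \]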
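The chain rule can be sanity-checked numerically by comparing a finite-difference slope of the composite function with the product of the slopes of its two pieces. In the sketch below the functions $u = x^2$ and $y = \sin(u)$, the sample point, and the step size are all assumptions chosen for illustration; a central difference is used as a stand-in for the derivative.

```python
import math

def derivative(f, x, h=1e-6):
    """Central-difference estimate of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

def inner(x):          # u = x^2
    return x ** 2

def outer(u):          # y = sin(u)
    return math.sin(u)

def composite(x):      # y = sin(x^2)
    return outer(inner(x))

x = 0.7
dy_dx = derivative(composite, x)                             # slope of the composite
product = derivative(outer, inner(x)) * derivative(inner, x) # (dy/du) * (du/dx)
exact = math.cos(x ** 2) * 2 * x                             # known closed form

print(dy_dx, product, exact)
```

All three numbers should agree to several decimal places, which is the numerical counterpart of the limiting argument above.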


Appendix F

Trigonometry review

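The definitions of the trigonometric functions in terms of the angle $\theta$ in a right triangle are
reviewed in Fig. F.1.

[Figure: right triangle with sides labeled opposite, adjacent, and hypotenuse, annotated with $\sin\theta$ = opp/hyp, $\cos\theta$ = adj/hyp, $\tan\theta$ = opp/adj.]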

Figure F.1. Review of the relation between ratios of side lengths (in a right triangle) and trigonometric functions of the associated angle.
Based on these definitions, we find certain angles for which sine and cosine
can be found explicitly (and similarly tangent, since $\tan(\theta) = \sin(\theta)/\cos(\theta)$). This is shown in Table F.1.
We also define the other trigonometric functions as follows:
\[ \tan(t) = \frac{\sin(t)}{\cos(t)}, \quad \cot(t) = \frac{1}{\tan(t)}, \quad \sec(t) = \frac{1}{\cos(t)}, \quad \csc(t) = \frac{1}{\sin(t)}. \]

Sine and cosine are related by the identity
\[ \cos(t) = \sin\left(t + \frac{\pi}{2}\right). \]
They also satisfy the Pythagorean identity $\sin^2(t) + \cos^2(t) = 1$, which leads to two others of similar form. Dividing each side of the Pythagorean identity by $\cos^2(t)$ yields
\[ \tan^2(t) + 1 = \sec^2(t), \]
whereas division by $\sin^2(t)$ gives us
\[ 1 + \cot^2(t) = \csc^2(t). \]
These will be important for simplifying expressions involving the trigonometric functions,
as we shall see.

degrees    radians    sin(t)          cos(t)          tan(t)
0          $0$        $0$             $1$             $0$
30         $\pi/6$    $1/2$           $\sqrt{3}/2$    $1/\sqrt{3}$
45         $\pi/4$    $\sqrt{2}/2$    $\sqrt{2}/2$    $1$
60         $\pi/3$    $\sqrt{3}/2$    $1/2$           $\sqrt{3}$
90         $\pi/2$    $1$             $0$             undefined

Table F.1. Values of the sines, cosines, and tangent for the standard angles.
Law of cosines
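This law relates the cosine of an angle to the lengths of sides formed in a triangle. (See
Figure F.2.)
\[ c^2 = a^2 + b^2 - 2ab\cos(\theta) \tag{F.1} \]
where the side of length c is opposite the angle $\theta$.

[Figure: triangle with sides a, b, c, where the side of length c is opposite the angle $\theta$.]

Figure F.2. Law of cosines states that $c^2 = a^2 + b^2 - 2ab\cos(\theta)$.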
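
A short numerical sketch of equation (F.1) follows. The function name `third_side` and the sample triangles are assumptions chosen for illustration; the angle is taken in radians.

```python
import math

def third_side(a, b, theta):
    """Length c of the side opposite the angle theta (radians), from (F.1)."""
    return math.sqrt(a**2 + b**2 - 2 * a * b * math.cos(theta))

# With a right angle the cosine term vanishes and (F.1) reduces to
# the Pythagorean theorem: a 3-4-5 triangle.
print(third_side(3.0, 4.0, math.pi / 2))

# Two unit sides enclosing an angle of pi/3 give an equilateral triangle.
print(third_side(1.0, 1.0, math.pi / 3))
```

Both printed values should agree with the expected side lengths (5 and 1) up to floating-point rounding.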


Here are other important relations between the trigonometric functions that should
be remembered. These are called trigonometric identities:
Angle sum identities
The trigonometric functions are nonlinear. This means that, for example, the sine of the
sum of two angles is not just the sum of the two sines. One can use the law of cosines and
other geometric ideas to establish the following two relationships:
\[ \sin(A + B) = \sin(A)\cos(B) + \sin(B)\cos(A) \]
\[ \cos(A + B) = \cos(A)\cos(B) - \sin(A)\sin(B) \]

x                arcsin(x)    arccos(x)
$-1$             $-\pi/2$     $\pi$
$-\sqrt{3}/2$    $-\pi/3$     $5\pi/6$
$-\sqrt{2}/2$    $-\pi/4$     $3\pi/4$
$-1/2$           $-\pi/6$     $2\pi/3$
$0$              $0$          $\pi/2$
$1/2$            $\pi/6$      $\pi/3$
$\sqrt{2}/2$     $\pi/4$      $\pi/4$
$\sqrt{3}/2$     $\pi/3$      $\pi/6$
$1$              $\pi/2$      $0$

Table F.2. Standard values of the inverse trigonometric functions.

These two identities appear in many calculations, and will be important for computing derivatives of the basic trigonometric functions.
Related identities
The identities for the sum of angles can be used to derive a number of related formulae.
For example, by replacing B by $-B$ we get the angle difference identities:
\[ \sin(A - B) = \sin(A)\cos(B) - \sin(B)\cos(A) \]
\[ \cos(A - B) = \cos(A)\cos(B) + \sin(A)\sin(B) \]
By setting $\theta = A = B$ in the angle sum identities we find the subsidiary double angle formulae:
\[ \sin(2\theta) = 2\sin(\theta)\cos(\theta) \]
\[ \cos(2\theta) = \cos^2(\theta) - \sin^2(\theta) \]
and these can also be written in the form
\[ 2\cos^2(\theta) = 1 + \cos(2\theta) \]
\[ 2\sin^2(\theta) = 1 - \cos(2\theta). \]
(The latter four are quite useful in integration methods.)
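
The identities in this section can be spot-checked numerically. The sketch below evaluates the difference between the two sides of each identity at arbitrary sample angles; the function names and the sample angles are assumptions made for this illustration, and a small residual is consistent with the identity but is not a proof.

```python
import math

def angle_sum_residuals(A, B):
    """Left side minus right side for the sine and cosine angle sum identities."""
    sin_gap = math.sin(A + B) - (math.sin(A) * math.cos(B) + math.sin(B) * math.cos(A))
    cos_gap = math.cos(A + B) - (math.cos(A) * math.cos(B) - math.sin(A) * math.sin(B))
    return sin_gap, cos_gap

def double_angle_residuals(theta):
    """Left side minus right side for sin(2θ) = 2 sin θ cos θ and 2cos²θ = 1 + cos(2θ)."""
    sin_gap = math.sin(2 * theta) - 2 * math.sin(theta) * math.cos(theta)
    cos_gap = 2 * math.cos(theta) ** 2 - (1 + math.cos(2 * theta))
    return sin_gap, cos_gap

for A, B in ((0.7, 1.9), (-2.0, 0.3)):
    print(A, B, angle_sum_residuals(A, B))
for theta in (0.4, 2.5):
    print(theta, double_angle_residuals(theta))
```

All residuals should be zero up to floating-point rounding.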

F.A Summary of the inverse trigonometric functions

We show the table of standard values of these functions (Table F.2). In Figure F.3 we summarize the relationships between the original trigonometric functions and their inverses.
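
The standard values in Table F.2 can be compared with a library implementation. The sketch below uses Python's `math.asin` and `math.acos`; the list name `standard_values` and the helper `max_table_error` are hypothetical names introduced for this example.

```python
import math

r2 = math.sqrt(2) / 2
r3 = math.sqrt(3) / 2

# Rows of Table F.2: (x, arcsin(x), arccos(x)).
standard_values = [
    (-1.0, -math.pi / 2, math.pi),
    (-r3, -math.pi / 3, 5 * math.pi / 6),
    (-r2, -math.pi / 4, 3 * math.pi / 4),
    (-0.5, -math.pi / 6, 2 * math.pi / 3),
    (0.0, 0.0, math.pi / 2),
    (0.5, math.pi / 6, math.pi / 3),
    (r2, math.pi / 4, math.pi / 4),
    (r3, math.pi / 3, math.pi / 6),
    (1.0, math.pi / 2, 0.0),
]

def max_table_error(rows):
    """Largest absolute difference between tabulated and computed values."""
    errors = []
    for x, asin_value, acos_value in rows:
        errors.append(abs(math.asin(x) - asin_value))
        errors.append(abs(math.acos(x) - acos_value))
    return max(errors)

print(max_table_error(standard_values))
```

In every row the two tabulated values add up to $\pi/2$, which offers a quick consistency check on the table.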

[Figure F.3: six panels (a)-(f) plotting y = sin(x), y = cos(x), y = tan(x), the curves y = Sin(x), y = Cos(x), y = Tan(x), the inverses y = arcsin(x), y = arccos(x), y = arctan(x), and the line y = x.]

Figure F.3. A summary of the trigonometric functions and their inverses. (a)

Appendix G

Short Answers to
Problems

G.1 Answers to Chapter 1 Problems


Problem 1.1:
Problem 1.2:

(a) Stretched in y direction by factor A; (b) Shifted up by a; (c) Shifted in positive x direction by b.

Problem 1.3:
Not Provided

Problem 1.4:
Problem 1.5:

(a) x = 0, (3/2)1/3 ; (b) x = 0, x =

Problem 1.6:

p
1/4.

(b) a < 0: x = 0; a 0: x = 0, a1/4 ; (c) a > 0.

Problem 1.7:

if m n even: x =

Problem 1.8:


A 1/(mn)
B

(a) (0, 0) and (1, 1); (b) (0, 0); (c) (

Problem 1.9:
(a) x = I/, (b) x =

, x = 0; if m n odd: x =

7 3
2 , 4 ),

7 3
2 , 4 ),


A 1/(mn)
B

and (0, 1).

p
2 4I
.
2

Problem 1.10:
Problem 1.11:

y = xn ; y = xn ; y = x1/n , n = 2, 4, 6, . . .; y = xn , n = 1, 2, 3, . . .

Problem 1.12:
m > 1

Problem 1.13:
Problem 1.14:
Not Provided

Problem 1.15:
$x = \left(\frac{B}{A}\right)^{\frac{1}{b-a}}$, $x = 0$

Problem 1.16:

(a) x = 0, 1, 3; (b) x = 1; (c) x = 2, 1/3; (d) x = 1.

Problem 1.17:

(a) Intersections x = 1, 0, 1.

Problem 1.18:

1
V
1
1
= a, a > 0; (c) a = V 3 ; a = ( 16 S) 2 ; a = 10 cm; a = 315 cm.
(a) V ; (b)
S
6
Problem 1.19:


r
3 1/3 1/3
1 1/2 1/2
(a) V ; (b) ; (c) r = 4
V ; r = 4
S ; r 6.2035 cm; r
3
0.8921 cm.
Problem 1.20:

r = 2k1 /k2 = 12m.

Problem 1.21:
1/4

(1 a)S
.
(a) T =

Problem 1.22:

R d/b
(a) P = C A
; (b) S = 4
Problem 1.23:


3V 2/3
.
4

(a) a: M s1 , b: s1 ; (b) b = 0.2, a = 0.002; (c) v = 0.001.

Problem 1.24:

(a) v K, (b) v = K/2.

Problem 1.25:

K 0.0048, kn 77 nM

Problem 1.26:

(a) x = 1, 0, 1 (b) 1 (c) y1 (d) y2 .

Problem 1.27:

Line of slope a3 /A and intercept 1/A

Problem 1.28:

K = 0.5, a = 2

Problem 1.29:
Not Provided

Problem 1.30:

m 67, b 1.2, K 0.8, kn 56

Problem 1.31:
1
 ra
.
x= R
A

G.2 Answers to Chapter 2 Problems


Problem 2.1:

(a) m = 28 /min, b = 50.

Problem 2.2:

(a) 4.91F/min. (b) -7, -8, -9 F/min. (c) -9 F/min.

Problem 2.3:
Displacements have same magnitude, opposite signs.
Problem 2.4:
(b) 9.8 m/s.

Problem 2.5:

(a) 14.7 m/s; (b) gt

g
2 ;

(c) t = 10 s.

Problem 2.6:
v0 g/2

Problem 2.7:
5.8, 4.4, 5.4, 12.4, 4, 4.4, 7.2 (km/hr)
Problem 2.8:

v = 13.23 m/s.; secant line is y = 13.23x 2.226

Problem 2.9:
Problem 2.10:

(a) 2; (b) 0; (c) 2; (d) 0.

Problem 2.11:

(a) 1; 1; 1; (b) 1; 0; 1; (c) 1; 2; 4.

Problem 2.12:

(a) 3; (b) 5.55; (c)

Problem 2.13:
(a)

2 2
;

(b)

32
3 .

6(1 2)
;

(c) /4 x 5/4 (one solution).

Problem 2.14:

(a) 2 + h; (b) 2; (c) y = 2x.

Problem 2.15:

2h2 + 25h + 104; slope =104

Problem 2.16:

(b) 0, 4, 1.9, 2.1, 2 h; (c) 2.

Problem 2.17:

(a) 2 + h; (b) 2; (c) 2.98.

Problem 2.18:
(a) 4 ; (b)

4 312
.

Problem 2.19:
(a) 1; (b)

2
1+ ;

Problem 2.20:

(c) Slope approaches -2; (d) y = 2x + 4.

(a) v(2) = 12 m/s; v = 15 m/s; (b) v(2) = 0 m/s; v = 25 m/s; (c) v(2) = 13 m/s;
v = 11 m/s.

Problem 2.21:
0

Problem 2.22:
1
(x+1)2

G.3 Answers to Chapter 3 Problems


Problem 3.1:
Not Provided

Problem 3.2:
Not Provided

Problem 3.3:
Not Provided

Problem 3.4:
Not Provided

Problem 3.5:
Not Provided

Problem 3.12:
Problem 3.7:
Not Provided

Problem 3.8:
Not Provided

Problem 3.9:

(a) 14.7 m/s; (b) 4.9 m/s.

Problem 3.10:

(a) $f'(x) = 1/(2\sqrt{x})$, (b) 0.25, (c) $y = 2 + 0.25(x - 4)$.

Problem 3.11:
Problem 3.12:
Problem 3.13:

5; 5; no change; linear function

Problem 3.14:
5.

Problem 3.15:

y = 5.8x 6.825.

Problem 3.16:
Problem 3.17:
Not Provided

Problem 3.18:
Problem 3.19:

G.4 Answers to Chapter 4 Problems


Problem 4.1:

(a) 36x2 16x 15; (b) 12x3 + 3x2 3; (c) 4x3 18x2 30x 6; (d) 3x2 ; (e)
36x
(x2 +9)2 ;

(f)

6x3 3x2 +6
18b2 7b 3
(13x)2 ; (g) 3(2b 32 )2

; (h) 36m

+72m2 36m+5
;
(3m1)2

(i)

9x4 +8x3 3x2 4x+6


.
(3x+2)2

Problem 4.2:
$\frac{dR}{dN} = r - 2\frac{r}{K}N.$

Problem 4.3:

(a) $\frac{dv}{dx} = K\frac{k_n}{(k_n + x)^2}$, (b) $\frac{dy}{dx} = A\frac{n x^{n-1} a^n}{(a^n + x^n)^2}$.

Problem 4.4:
Problem 4.5:
(a) $V = \frac{S^{3/2}}{6\pi^{1/2}}$, (b) $dV/dS = \frac{1}{4\pi^{1/2}} S^{1/2}$.

Problem 4.6:

(a) V (r) = 2L + 4r, S (r) = 2L + 4r, (b) 2/r2 .

Problem 4.7:

A (2) = 3mm2 /hr

Problem 4.8:
(a) E

Problem 4.9:
$\frac{d(N_1/N_2)}{dt} = (k_1 - k_2)\frac{N_1}{N_2}$

Problem 4.10:
k2 > k1 .

Problem 4.11:
Problem 4.12:

(a) y(t) = 51 t5 + t3 21 t2 + 3t+ C, (b) y(x) = 21 x2 + 2x+ C, (c) y = | 12 x2 |+ C.

Problem 4.13:

(a) a(t) = 2Bt, (b) y(t) = At (B/3)t3 , (c) t =


v = A.

Problem 4.14:

(a) v = 3t2 + 6t, a = 6t + 6; (b) t = 0,

p
p
3A/B, (d) t = A/B, (e)

p
p

3/a; (c) t = 0, 3/2a; (d) t = 1/ 2a.

Problem 4.15:

(a) $t = v_0/g$; (b) $h_0 + \frac{v_0^2}{2g}$; (c) $v = 0$.

Problem 4.16:
Not Provided

Problem 4.17:
Not Provided

Problem 4.18:
Not Provided

Problem 4.19:

G.5 Answers to Chapter 5 Problems


Problem 5.1:

(a) no tangent line; (b) y = (x + 1); (c) y = (x + 1).

Problem 5.2:
y = 2x 3

Problem 5.3:
(b) a = 2.

Problem 5.4:

(a) y = 4x + 5; (b) x = 5/4, y = 5; (c) y = 0.6, smaller.

Problem 5.5:

(a) y = 3x 2; (b) x = 2/3; (c) 1.331; 1.3.

Problem 5.6:
(a) $y = f'(x_0)(x - x_0) + f(x_0)$; (b) $x = x_0 - \frac{f(x_0)}{f'(x_0)}$.

Problem 5.7:
2.83

Problem 5.8:

(a) (3.41421, 207.237), (0.58580, 0.762), (0.42858, 0.895);

Problem 5.9:

(a) x = 0.32219; (b) x = 0.81054; (c) x = 0.59774, x = 0.68045, x = 4.91729.

Problem 5.10:
(3, 9), (1, 1)

Problem 5.11:
(a)

19
6 ;

(b) 3.

Problem 5.12:

(a) 0.40208; (b) 5.99074.

Problem 5.13:
0.99

Problem 5.14:
2.998

Problem 5.15:
1030 cm3

G.6 Answers to Chapter 6 Problems


Problem 6.1:

(a) zeros: x = 0, x = 3; loc. max.: x = 1; loc. min.: x = 1; (b) loc. min.:


x = 2; loc. max.: x = 1; (c) (a): x = 0; (b): x = 3/2.

Problem 6.2:
p

Zeros at 0, a; inflection point at 0, local maximum at a/3, local minimum at


sqrta/3.
Problem 6.3:

(a) $f'(x) = 2x$, $f'(0) = 0$, $f'(1) = 2 > 0$, $f'(-1) = -2 < 0$. Local minimum at
x = 0; (b) $f'(x) = -3x^2$, $f'(0) = 0$, $f'(1) = -3 < 0$, $f'(-1) = -3 < 0$. No local
maxima nor minima; (c) $f'(x) = -4x^3$, $f'(0) = 0$, $f'(1) = -4 < 0$, $f'(-1) = 4 > 0$.
Local maximum at x = 0.

Problem 6.4:

Zeros at 0, 2, Inflection point at 0, local min at 3/2.

Problem 6.5:

Global maximum at x = 3, global minimum at x = 3/2.

Problem 6.6:

(a) max.: 18; min.: 0; (b) max.: 25; min.: 0; (c) max.: 0; min.: 6; (d) max.: 2;
min.: 17/4.

Problem 6.7:

(a) increasing: < x < 0, 1.5 < x < ; decreasing for 0 < x < 1.5; (b) 0,
local maximum; 1.5, local minimum; (c) No.

Problem 6.8:
min.: 3/4

Problem 6.9:
x=0

Problem 6.10:

1
3
critical points: x = 0, 1, 1/2; inflection points: x =
2
6

Problem 6.11:
Not Provided
Problem 6.12:

a = 1, b = 6, c = 7


Problem 6.13:
Not Provided

Problem 6.14:

min. at x = 3; max. at x = 3; c.u.: x < 1, 0 < x < 1; c.d.: 1 < x <


0,x > 1; infl.pt.: x = 0

Problem 6.15:

loc. min.:x = a loc. max.: x = 2a

Problem 6.16:

(a) increasing: x < 0, 0 < x < 3k, x > 5k; decreasing:


3k < x < 5k;loc. max.:

x = 3k; loc. min.:x = 5k; (b) c.u.: 0 < x < (3 26 )k, x > (3 + 26 )k; c.d.:

x < 0, (3 26 )k < x < (3 + 26 )k; infl.pts.: x = 0, (3 26 )k.

Problem 6.17:

(a+p0 )
(b) dv/dp = b (p+a)
2 ; (c) p = p0 .

Problem 6.18:
Not Provided

Problem 6.19:

abs. max. of 4.25 at end points; abs. min. of 2 at x = 1

G.7 Answers to Chapter 7 Problems


Problem 7.1:

(a) 10, 10; (b) 10, 10; (c) 12, 8.

Problem 7.2:

(a) v(t) = 120t2 16t3 ; (b) t = 5; (c) t = 7.5.

Problem 7.3:

9 : 24A.M., 15 km

Problem 7.4:

(a) t 1.53 sec; (b) v(0.5) = 10.1 m/sec, v(1.5) = 0.3 m/sec, a(0.5) = 9.8 m/sec2 ,
a(1.5) = 9.8 m/sec2 ; (c) t 3.06 sec.

Problem 7.5:
See Example 7.2.
Problem 7.6:

(a) N (r) = 2k1 rL k2 r2 L, (b) r = k1 /k2 .

Problem 7.7:
Problem 7.8:

30 10 15 cm

Problem 7.9:

(a) y = (1/ 3); (b) 3/9.


Problem 7.10:

|a| if a < 4; 2 2a 4 if a 4
Problem 7.11:
A = 625 ft2

Problem 7.12:
All of the fencing used for a circular garden.
Problem 7.13:

Squares of side 6 2 3 cm.

Problem 7.14:
Problem 7.15:

A square with A = L2 /2


Problem 7.16:

Straight lines from (10, 10) to ( 16


3 , 0) then to (3, 5).

Problem 7.17:
4 C

Problem 7.18:

(a) x = 2B/3, R = (4/27)AB 3 ; (b) x = B/3, S = AB 2 /3.

Problem 7.19:
r = 2k1 /k2

Problem 7.20:

h = 20, r = 5 2
Problem 7.21:
(b) x =

a
2b ;

(c) x = 0; (d) x =

Problem 7.22:

x = (A/2B)1/3 1

Problem 7.23:
Problem 7.24:

NMSY = K(1 qE/r)

Problem 7.25:
E = r/2q

Problem 7.26:
Problem 7.27:

topt = k .
Problem 7.28:

am
2b .

G.8 Answers to Chapter 8 Problems


Problem 8.1:

(a) y (x) = 5(x + 5)4

Problem 8.2:
(a)

dT
1
=
dG
4

(1 a)S

1/4

5/4

d
.
dG

Problem 8.3:

(c) Global minimum occurs at an endpoint, rather than at a critical point.

Problem 8.4:
Not Provided

Problem 8.5:

(c) d = 3D/4.

Problem 8.6:

(a) V (x) = 500x3 +300(1x)5 , (b) Critical point at x1 =


(c) Best strategy is x1 = 1, x2 = 0.

3 5
2

is a local minimum.

G.9 Answers to Chapter 9 Problems


Problem 9.1:

(a) 4r2 k; (b) 8rk; (c) 3k


r2 .

Problem 9.2:

(a) dA/dt = 2rC; (b) dM/dt = 2rC.

Problem 9.3:
dM
dt

= C(3r2 )a

Problem 9.4:
dV
dt

= 1 m3 /min

Problem 9.5:
(a)

1
300

cm/s; (b)

2
5

cm2 /s.

Problem 9.6:
5 cm/s

Problem 9.7:
(a)

dV
dt

nR dT
P dt

; (b)

dV
dt

= nRT
P2

Problem 9.8:
1/2 k.

Problem 9.9:

1
cm/min
10

Problem 9.10:

1 cm/sec toward lens

Problem 9.11:
dh
dt

1
36

cm/min

Problem 9.12:
k=

1
10

4
45

Problem 9.13:
dh
dt

6
5

ft/min

Problem 9.14:
h (5) =

2
5

m/min

Problem 9.15:
(a)

1
4

m/min; (b)

m/min.

dP
dt

Problem 9.16:

(a) 4 m/s; (b) 25


32 per sec.

Problem 9.17:
1

4 6

m/min

Problem 9.18:
dS
dA

ab
1+bA ;

no.

Problem 9.19:
dy
dt

= 3 ll21 cm/hr

Problem 9.20:
Not Provided

Problem 9.21:
Not Provided

Problem 9.22:
(a)

dy
dx

x
= 2x
2y = y ; (b) y =

Problem 9.23:
(a)

dy
dx

21x2 +2
6y 5 +3 ;

(b)

dy
dx

1 (x
2

+ r 3); y = (1/ 2)(x r 3).

= ey2y
+2x ;

Problem 9.24:

(a) 3/4; (b) y = (3/4)x + 8.

Problem 9.25:
Not Provided

Problem 9.26:

( 210 , 910 ) and ( 210 , 910 )

Problem 9.27:
m=

4yp
xp

Problem 9.28:
(b)

dy
dx

(ayx2 )
(y 2 ax) ;

Problem 9.29:
(a)

dp
dv

(c) x = 0, x = 21/3 a; (d) No.

= (2 va3 ) (p +

a
v 2 )/(v

b).

Problem 9.30:
(0, 5/4)

Problem 9.31:

(a) y 1 = 1(x 1); (b) y = 54 ; (c) concave up.

G.10 Answers to Chapter 10 Problems


Problem 10.1:
Not Provided

Problem 10.2:
Not Provided

Problem 10.3:

(a) 50.75 > 50.65 ; (b) 0.40.2 > 0.40.2 ; (c) 1.0012 < 1.0013 ; (d) 0.9991.5 >
0.9992.3.

Problem 10.4:
Not Provided

Problem 10.5:

(a) x = a2 b3 ; (b) x =

b
2

c3

Problem 10.6:
Not Provided

Problem 10.7:
(a) x =

3ln(5)
;
2

(b) x =

Problem 10.8:

dy
dy
6
dx = 2x+3 ; (b) dx
2
dy
dy
(e) dx
= 6xe3x ; (f) dx
dy
= (et +e4t )2 .
(i) dx

(a)

e4 +1
3 ;

=
=

(c) x = e(e

= eee ; (d) x =

ln(C)
ab .

2
6[ln(2x+3)]2
dy
dy
2
; (c) dx
= 21 tan 12 x; (d) dx
= (x33x
2x+3
2x) ln a ;
x
1
dy
dy
21 a 2 x ln a; (g) dx = x2 2x (3+x ln 2); (h) dx = ee +x ;

Problem 10.9:

1
(a) min.: x = 23 ; max.: x = 23 ; infl.pt.: x = 0; (b) min.: x =
3 ; (c) max.:
3
x = 1; inf.pt.: x = 2; (d) min.: x = 0; (e) min.: x = 1; max.: x = 1; (f) min.:
x = ln(2); infl. pt.: x = ln(4).

Problem 10.10:

C = 4, k = 0.5

Problem 10.11:

(a) decreasing; (b) increasing; y1 (0) = y2 (0) = 10; y1 half-life = 10 ln(2); y2


doubling-time = 10 ln(2)

Problem 10.12:
41.45 months.

Problem 10.13:

(c) r = 0.0101 per year, C = 0.7145 billions.

Problem 10.14:
Problem 10.15:

r1 14.5, r2 0.9 per unit time, C1 = 55, C2 = 45.

Problem 10.16:
Not Provided

Problem 10.17:
Not Provided

Problem 10.18:
Not Provided

Problem 10.19:

crit.pts.: x = 0, x 1.64; f (0) = 1; f (1.64) 0.272

Problem 10.20:

(a) x = 1/, (b) x = ln()/.

Problem 10.21:

(a) x = r; (c) x =

ar
ar

ln

Problem 10.22:
p
x = b ln((a2 + b2 )/b2 )
Problem 10.23:
Not Provided

R
A


; (d) decrease; (e) decrease.

G.11 Answers to Chapter 11 Problems


Problem 11.1:
Not Provided

Problem 11.2:

(a) C any value, k = 5; (b) C any value, k = 3.

Problem 11.3:

(a) y(t) = Cet ; (b) c(x) = 20e0.1x; (c) z(t) = 5e3t .

Problem 11.4:

ln 2
t = ln 7ln
10

Problem 11.5:

(a) 57300 years; (b) 22920 years

Problem 11.6:

(a) 29 years; (b) 58 years; (c) 279.7 years.

Problem 11.7:

(a) 80.7%; (b) 12.3 years.

Problem 11.8:

y 707.8 torr

Problem 11.9:

(a) P (5) 1419; (b) t 9.9years.

Problem 11.10:
$\frac{dN}{dt} = 0.05N$; $N(0) = 250$; $N(t) = 250e^{0.05t}$; $2.1 \times 10^{10}$ rodents

Problem 11.11:

(a) dy/dt = 2.57y; (b) dy/dt = 6.93y.

Problem 11.12:

(a) 12990; (b) 30792 bacteria.

Problem 11.13:

1.39 hours; 9.2 hours

Problem 11.14:

20 min; 66.44 min

Problem 11.15:

(a) y1 growing, y2 decreasing; (b) 3.5, 2.3; (c) y1 (t) = 100e0.2t, y2 (t) = 10000e0.3t;
(d) t 9.2 years.

Problem 11.16:

12265 people/km2

Problem 11.17:

(a) 1 hour; (b) r = ln(2); (c) 0.25 M; (d) t = 3.322 hours.

Problem 11.18:
6.93 years

Problem 11.19:
1.7043 kg

Problem 11.20:

(a) $510, $520.20, $742.97; 17.5 years; for 8% interest: $520, $540.80, $1095.56;
(b) $510.08, $520.37, $745.42; (c) 5%.

G.12 Answers to Chapter 12 Problems


Problem 12.1:
Not Provided
Problem 12.11:
Problem 12.3:
Problem 12.4:
Not Provided
Problem 12.5:
Not Provided
Problem 12.6:

(a) C = 12; (b) C1 = 1, C2 = 5; (c) C1 = 1, C2 = 0.

Problem 12.7:

(a) v(t) = kg ekt + kg ; (b) v = kg .

Problem 12.8:

c(t) = ks est +

k
s

Problem 12.9:

(b) 46 minutes before discovery.

Problem 12.10:
10.6 min

Problem 12.11:

(a) Y = Y0 kt, (d) k = 0.0333 per min.

Problem 12.12:

(a) Input rate I, F fish caught per day. Birth and mortality neglected. (b) Steady
state level F = I/N . (c) 2 ln(2)/N days. (d) t = Flow /I days.

Problem 12.13:

64.795 gm, 250 gm

Problem 12.14:

(a) Q (t) = kr

Problem 12.15:
(a)

dQ
dt

Q
V r

= Vr [Q kV ]; (b) Q = kV ; (c) T = V ln 2/r.

= kQ; Q(t) = 100e(8.910

)t

; (b) 7.77 hr.

Problem 12.16:
(b) y0 ; (c) t =

2A y0
;
k

Problem 12.17:
a = 0, b = 1

Problem 12.18:

(b) t = /4 + n.

(d) k y0 .

G.13 Answers to Chapter 13 Problems


Problem 13.1:
Not Provided
Problem 13.11:
Problem 13.3:
Problem 13.4:
Not Provided
Problem 13.5:
Not Provided
Problem 13.6:

(a) C = 12; (b) C1 = 1, C2 = 5; (c) C1 = 1, C2 = 0.

Problem 13.7:

(a) v(t) = kg ekt + kg ; (b) v = kg .

Problem 13.8:

c(t) = ks est +

k
s

Problem 13.9:

(b) 46 minutes before discovery.

Problem 13.10:
10.6 min

Problem 13.11:

(a) Y = Y0 kt, (d) k = 0.0333 per min.

Problem 13.12:

(a) Input rate I, F fish caught per day. Birth and mortality neglected. (b) Steady
state level F = I/N . (c) 2 ln(2)/N days. (d) t = Flow /I days.

Problem 13.13:

64.795 gm, 250 gm

Problem 13.14:

(a) Q (t) = kr

Problem 13.15:
(a)

dQ
dt

Q
V r

= Vr [Q kV ]; (b) Q = kV ; (c) T = V ln 2/r.

= kQ; Q(t) = 100e(8.910

)t

; (b) 7.77 hr.

Problem 13.16:
(b) y0 ; (c) t =

2A y0
;
k

Problem 13.17:
a = 0, b = 1

Problem 13.18:

(b) t = /4 + n.

(d) k y0 .

G.14 Answers to Chapter 14 Problems


Problem 14.1:

(a) 180° (b) 300° (c) 164.35° (d) 4320°
(e) $5\pi/9$ (f) $2\pi/45$ (g) $5\pi/2$ (h) $\pi/2$
(i) 1/2 (j) $\sqrt{2}/2$ (k) $\sqrt{3}/3$

Problem 14.2:
Not Provided

Problem 14.3:
Not Provided

Problem 14.4:
Not Provided

Problem 14.5:

(a) $T(t) = 37.1 + 0.4\cos[\pi(t - 8)/12]$; (b) $W(t) = 0.5 + 0.5\cos[\pi(t - 8)/6]$.

Problem 14.6:
(a) S = 3 cos

Problem 14.7:

pg 
l t ; (b) y = 2 sin

(E)

Problem 14.8:

(a) $x$; (b) $x/\sqrt{1 - x^2}$; (c) $\sqrt{1 - x^2}$.


Problem 14.9:
(D)

Problem 14.10:
(B)

2
3 t

+ 10.

G.15 Answers to Chapter 15 Problems


Problem 15.1:
(a)

2
dy
dy
= 2x cos x2 ; (b) dx
= sin 2x; (c) dx
= 32 x 3 (cot 3 x)(csc2 3 x); (d)
dy
dy
= (1 6x) sec(x 3x2 ) tan(x 3x2 ); (e) dx
= 6x2 tan x+ 2x3 sec2 x; (f) dx
=
dy
dx

dy
dx
cos x+x sin x
;
cos2 x

(g)

dy
dx

= cos x x sin x; (h)

3 cos x)(2 sec2 3x sin x); (j)

dy
dx

dy
dx

2
sin x
sin2 1
2
x
x e

; (i)

dy
dx

= 6(2 tan 3x +

= sin(sin x) cos x + cos 2x.

Problem 15.2:
(a) f (x) =
(c) f (x) =

(3x2 2 cos(x) sin(x)) cos( cos2 (x)+x3


(4x3 +10x) sin(ln(x4 +5x2 +3))

; (b) f (x) =
(x4 +5x2 +3)
(2 cos2 (x)+x3 )
1
6x2 + x ln(3)
; (d) f (x) = 4(x2 ex +tan(3x))3 (2xex +x2 ex +3 sec2 (3x));
q

(e) f (x) = 2x sin3 (x) + cos3 (x) +

3x2 (sin2 (x) cos(x)cos2 (x) sin(x))

sin3 (x)+cos3 (x)

Problem 15.3:

3/20, 1/20
Problem 15.4:

(a) [0, /4], [5/4, 2]; (b) [3/4, 7/4]; (c) x = 3/4, 7/4.

Problem 15.5:
( 8 , 1)

Problem 15.6:

0.021 rad/min

Problem 15.7:

0.125 radians per minute

Problem 15.8:
Not Provided
Problem 15.9:

(a) /8; (b) 5/8.

Problem 15.10:
(a)

dy
dt

= LCx
; (b)
2 x2

Problem 15.11:
R=

1
32

v02

Problem 15.12:
8 m/s; 0 m/s

d
dt

C
y.


Problem 15.13:

30 cm/s; to the right

Problem 15.14:

(a) h2 + 2hR; (b) v h2 + 2hR/R.


Problem 15.15:
(a)

dy
dx

4 sec2 (2x+y)
12 sec2 (2x+y) ;

(b)

dy
dx

2 sin x
cos y ;

(c)

dy
dx

x+sin y
= xy cos
cos y+sin x .

Problem 15.16:
y = x + 2

Problem 15.17:

y = a/ 1 a2 x2
Problem 15.18:
d
dy
1
1
1
dy
p
; (c)
; (b)
=
=
= 2
;
(a)
2
2
2
2
dx
dx
dr
2r
+
2r + 1
3(arcsin x) 3 1 x
3x 3 1 x 3


1 + t2
dy
2x2 + a2 a
dy
dy
x

; (f)
= arcsec x1 1x
=
=
; (e)
(d)
2
dx
dx
dt
1 t2
a a2 x2
2(1 t2 )
.
(1 + t2 )2
Problem 15.19:
0.4 m

Problem 15.20:
5
26

rad/s

Problem 15.21:

(c) $y(t) = A\cos\left(\sqrt{g/L}\, t\right)$.

G.16 Answers to Chapter 16 Problems


Problem 16.1:

1(d), 2(b), 3(a), 4(c) , 5(a), 6(a), 7(d), 8(b), 9(e), 10(a).

Problem 16.2:
(E)

Problem 16.3:
(E)

Problem 16.4:
(D)

Problem 16.5:
3/4 cm/sec.

Problem 16.6:

(a) v(t) = 3(t + 2)2 + , a(t) = dv/dt = 6(t + 2). (b) .

Problem 16.7:

(a) t = 1 and t = 4, (b) 9 m/s (c) 12 m.

Problem 16.8:

Local max at t = v0 /g.

Problem 16.9:

(a) Inflection point at x = 1. (b) Local maximum.

Problem 16.10:

(a) Critical points at x = 0, 3, inflection at x = 9/4. (b) Global min x = 1, global


max x = 0.

Problem 16.11:

Local min x = 9/5, inflection point x = 0.

Problem 16.12:
4.0004

Problem 16.13:
$t = \frac{\ln(350)}{0.05}$ days.

Problem 16.14:

(a)x = 0, ln(A)/b (b)x = 1/b.

Problem 16.15:

p(x) = x3 + 3x.


Problem 16.16:

Local min at x = (2a/b)1/6 .

Problem 16.17:

Square of side length r/ 2.

Problem 16.18:
x = 41 P .

Problem 16.19:
dAleaf
dt

= 2y(t) + x(t) +

y(t)
2

Problem 16.20:
q
(a) r = 21 S and h = 0, (b) a sphere, (c) No.
Problem 16.21:
dV
dt


= 2[r(t)h(t) + r(t)2 ] + 4r(t)2 .

Problem 16.22:
Problem 16.23:
k 1/27

Problem 16.24:

(a) k = ln(2)/3 per hr (b) 6 more hrs.

Problem 16.25:

(a) dP/dt = C P (b) P = C/ (d) t = (ln(1/0.2)/.

Problem 16.26:

(a) t = 3h (b) 23 h.

Problem 16.27:

(a) y = 4S (b) T = 21/2 = 2 ln(2)/f (c) y 50

Problem 16.28:

y = D/ 8
Problem 16.29:
0.125m/s

Problem 16.30:
696 ys.

Problem 16.31:

(a) /150 radians/s (b) /6 radians (c) D =

200
(2

3)1/2 (d)

dD
dt

3(2 3)1/2

Problem 16.32:

(a) = arcsin(A/B), (b) 1 < arcsin(A/B) < 1, (c) /6

Problem 16.33:

(b) y(0) = 0.5 (c) at y = 0.5 (d) y 1.

Problem 16.34:
(a) r1 (t) =

7
4 cm/year,

Problem 16.35:

(b) S (t) = 14.

(a)p
BA length L x, BP length
4
d
is x2 + d2 /r4 . (c) Rr8 r
.
8

x2 + d2 . (b) Resistance of BA is (L x)/R4 , BP

G.17 Answers to Appendix A Problems


Problem 1:

(a) slope 4, y intercept 5; (b) slope 43 , y intercept 2; (c) slope 23 , y intercept 0; (d)
slope 0, y intercept 3; (e) slope 25 , y intercept 23
2 .

Problem 2:

(a) y = 5(x 2) = 5x + 10; (b) y =


(3/4)x + 1.

Problem 3:

1
2x

25 ; (c) y =

4
5x

+ 10; (d) y =

(a) y = 4x + 3; (b) y = 3x + 2; (c) y = 6x + 5; (d) y = 3x; (e) y = 6x + 5;


(f) y = x/4; (g) y = 2x + 9.

Problem 4:

y = 2x

G.18 Answers to Appendix B Problems


Problem 1:

Not Provided

Problem 2:

(a) Odd; (b) Even; (c) Even; (d) Odd; (e) Neither.

Problem 3:

Not Provided

Problem 4:

y = [(1/A)x]1/n ; x = 0, (1/A)1/(n1)



Index
Arrhenius, 202
differential equation, 221
Listeria
monocytogenes, 84

Avogadro's
number, 232
base, 198
of exponent, 195
biochemical
reaction, 13
bird flock, 17
birth rate
human, 227
box
rectangular, 149

acceleration, 76, 77, 81


uniform, 78
ActA, 84
actin, 75, 84
age distribution, 226
uniform, 226
albedo, 8, 21, 163, 173
allometric constants, 209
allometry, 208
ambient temperature, 246
amplitude, 295, 297
analytic solution, 241
Andromeda strain, 196, 207
angle
degrees, 292
radians, 293
ant trails, 165
antiderivative, 72, 77
antidifferentiation, 72, 78
approximation
linear, 98
arc length, 292
arccosine, 304
arcsine, 302
arctan, 306
argument
geometric, 6
astroid, 185
attention, 166
attraction, 17
average
rate of change, 33

carrying capacity, 132, 264


cell
length, 137
shape, 3
size, 133
spherical, 4
cesium-137, 234
chain rule, 159, 160, 175
Chernobyl, 234
circle
area of, 292
circumference of, 292
circumference
of circle, 292
clock hands, 316, 317
coefficient
power function, 2
coffee budget, 163
comet tail, 84
concave
down, 116
up, 116
concavity, 100, 116
cone, 179
constraint, 136, 138
converge, 104, 105
convergent extension, 177
cooling, 26
cooling object, 273
cosine, 384
derivative, 312
cosines
law of, 315
coupled
ODEs, 281
Crichton
Michael, 196
critical point, 120
critical points, 11
classifying, 126
cryptic food, 166
cubic, 56
curvature, 116
cycle, 293
periodic, 298
cylinder
surface area, 135
volume, 135
data
refined, 34
daylight cycle, 299
decay
equation, 232
exponential, 233
decreasing
function, 116
degree
of polynomial, 71
density dependent
growth, 132
growth rate, 264
derivative, 25, 38, 46
definition, 37
differential equation, 75, 203, 204, 220,
241, 325
Dill
Larry, 320
direction field, 268
discontinuity

jump, 54
removable, 53
disease, 280
doubling, 195
doubling time, 229, 230
Dukas
Reuven, 166
dynein, 51
E. coli, 196
Earth
temperature of, 8, 21, 71, 76, 163,
173, 184
ellipse
rotated, 186
emissivity, 8, 173
endemic
disease, 283
endpoints
maxima at, 140
energy
balance, 21
energy balance, 8, 71
energy gain, 143
enzyme, 13
equinox, 299
escape
response, 320
Euler's method, 236, 251, 252, 277
even
function, 7, 296
exponential decay, 236
exponential function, 199, 219
base 10, 200
base 2, 200
base e, 201
exponential growth, 252
extrema, 122
falling object, 77, 79, 249
fertility, 226
finite difference, 60
finite difference equation, 251
first derivative
test, 121
fish school, 17


fluorescence, 75
food patch, 142
food type, 167
foraging
optimal, 142
foraging time, 142
frequency, 297
function
composition, 159

initial value problem, 251


instantaneous
rate of change, 39
intercept, 29
intrinsic growth rate, 264
invasive species, 89, 213
inverse function, 205, 301
iodine-131, 234
iteration, 104, 105

Galileo, 28
geometric argument, 6, 147
geometric relationships, 175
gravity, 28, 77
greenhouse gas, 8, 21, 163, 173
growth
density dependent, 264
growth rate, 227
intrinsic, 132

Kepler, 137
wedding, 137
kinesin, 51

half life, 233


harmonic oscillator, 325
heating, 26
Hill
coefficient, 14
function, 15
HIV, 280
hormone cycle, 299
identity
trigonometric, 295
implicit
differentiation, 181
function, 181
implicit differentiation, 76, 325
increasing
function, 115
independent variable, 132
inflection
point, 116
influenza, 280
infusion, 250
initial
velocity, 78
initial value, 222
problem, 233

Lactobacillus, 26
law of cosines, 315, 384
Law of Mass Action, 265
level curves, 325
limit, 52
DNE, 54
exists, 53
right and left, 54
linear
approximation, 93
operation, 71
relationship, 29
linearity
of derivative, 71
of limits, 374
Lineweaver-Burk, 16
local
behaviour, 45
maximum, 120
minimum, 120
log-log plot, 208
logarithm
natural, 205
logistic
growth, 131
logistic equation, 264, 276
logistic growth, 88, 132
maximum, 131
absolute, 126
global, 126
Michaelis-Menten

kinetics, 14, 88
microtubules, 51
minimum, 131
absolute, 126
global, 126
local, 8
model
mathematical, 4
molecular collision, 202
moon phase, 300
mortality, 227
motion
uniform, 28
motor
molecular, 51
moving bead, 84
murder mystery, 248
net growth rate, 225
Newton's
law of cooling, 246, 254
method, 3
Newton's method, 11, 93-95, 98, 102
nM
nano Molar, 15
nonlinear
differential equation, 263, 264
nuclear power plant, 234
numerical solution, 250
nutrient
absorption, 5
balance, 4
consumption, 5
odd
function, 7, 296
one-to-one, 8, 302
optimal
foraging, 142
oxygen, 6
parameter, 224
per capita
birth rate, 224
mortality rate, 224
percent

growth, 227
perimeter
maximal, 140
of circle, 293
period, 295
periodic function, 295
phase, 295
phase line, 272
phase shift, 298
pheromone, 165
Pi ($\pi$), 292
pollution, 161
polynomial, 9, 71
derivative of, 71
population
density, 132
growth, 224, 235
position, 81
power
dominant, 2
function, 2, 7
power rule, 75, 184
powers
of 2, 196
predator
size, 321
Preface, xi
probability
of decay, 231
proportional, 5
proportionality
constant, 5
Pythagoras theorem, 151, 303, 305
Pythagorean
triangle, 291
race track, 315
radian, 99, 291, 293
radioactive decay, 230
rate
constant, 227
rate of change
average, 25, 29, 33
instantaneous, 25, 34
rational
function, 9

rational function, 12, 14
reaction
speed, 13
recursion relation, 251
related rates, 175, 314, 318
reproductive
number, 284
repulsion, 17
rescaling, 266, 267
residence time, 142, 145
restricting the domain, 302
Ricker
equation, 215
root, 19
of equation, 119
SARS, 280
saturation, 15
scientific problems, 223
secant
line, 25, 30
second derivative, 72, 76, 120
test, 122, 133
second order
DE, 325
shortest path, 165
sine, 384
derivative, 312
sketching
the derivative, 48, 81
slope
of straight line, 29
of tangent line, 46
slope field, 268
solar constant, 8
solution
to differential equation, 221
solution curve, 235, 247
spacing distance, 17
spreadsheet, 105, 253
stability, 275
stable
steady state, 275
state space, 268, 272
steady state, 247, 249, 275
step function, 64

step size, 251
straight line, 29
stroboscope, 36
substrate, 13
surface area
minimal, 135
sustainability, 8, 21, 27, 41, 71, 76, 89,
131, 156, 163, 173, 196, 207,
213, 215, 224, 229, 239, 259,
260, 264, 276, 280, 287, 288
system
of equations, 281
tangent line, 39, 46, 95
temperature
milk, 26
terminal velocity, 249
time of death, 248
trigonometric
functions, 291
identities, 291
trigonometric functions, 291, 301
trigonometric identities, 384
tug of war, 51
tumor growth, 175
unbounded, 8
unit circle, 292
unstable, 275
velocity, 38, 77, 81
instantaneous, 36, 37
terminal, 250
vertical line property, 180
vesicle, 51
visual angle, 318, 324
wine barrel, 137
yoghurt, 26
zebra danio, 320
zero, 19, 49, 118, 119
zeros, 11
of a function, 124
zoom, 46
