Sie sind auf Seite 1von 251

Abstract mathematics

M. Anthony
MT2116, 2790116

2011

Undergraduate study in
Economics, Management,
Finance and the Social Sciences
This subject guide is for a 200 course offered as part of the University of London
International Programmes in Economics, Management, Finance and the Social Sciences.
This is equivalent to Level 5 within the Framework for Higher Education Qualifications in
England, Wales and Northern Ireland (FHEQ).
For more information about the University of London International Programmes
undergraduate study in Economics, Management, Finance and the Social Sciences, see:
www.londoninternational.ac.uk

This guide was prepared for the University of London International Programmes by:
Martin Anthony, Department of Mathematics, London School of Economics and
Political Science.
This is one of a series of subject guides published by the University. We regret that due to
pressure of work the author is unable to enter into any correspondence relating to, or arising
from, the guide. If you have any comments on this subject guide, favourable or unfavourable,
please use the form at the back of this guide.

University of London International Programmes


Publications Office
Stewart House
32 Russell Square
London WC1B 5DN
United Kingdom
Website: www.londoninternational.ac.uk
Published by: University of London
University of London 2010
Reprinted with minor revisions 2011
The University of London asserts copyright over all material in this subject guide except where
otherwise indicated. All rights reserved. No part of this work may be reproduced in any form,
or by any means, without permission in writing from the publisher.
We make every effort to contact copyright holders. If you think we have inadvertently used
your copyright material, please let us know.

Contents

Contents
1 Introduction
1.1

This subject . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

1.1.1

Relationship to previous mathematics courses . . . . . . . . . . .

1.1.2

Aims . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

1.1.3

Learning outcomes . . . . . . . . . . . . . . . . . . . . . . . . . .

1.1.4

Topics covered . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

1.2

Reading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

1.3

Online study resources . . . . . . . . . . . . . . . . . . . . . . . . . . . .

1.3.1

The VLE . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

1.3.2

Making use of the Online Library . . . . . . . . . . . . . . . . . .

1.4

Using the guide . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

1.5

Examination . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

1.6

The use of calculators . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

Part 1

Numbers and Proof

2 Mathematical statements, proof, logic and sets

11

Essential reading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

11

2.1

Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

11

2.2

Mathematical statements and proof . . . . . . . . . . . . . . . . . . . . .

11

2.2.1

Examples of mathematical statements

. . . . . . . . . . . . . . .

11

2.2.2

Introduction to proving statements . . . . . . . . . . . . . . . . .

13

Some basic logic . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

17

2.3.1

Negation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

17

2.3.2

Conjunction and disjunction . . . . . . . . . . . . . . . . . . . . .

18

2.4

If-then statements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

19

2.5

Logical equivalence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

21

2.6

Converse statements . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

22

2.7

Contrapositive statements . . . . . . . . . . . . . . . . . . . . . . . . . .

22

2.8

Working backwards to obtain a proof . . . . . . . . . . . . . . . . . . . .

23

2.3

Contents

2.9

Sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

24

2.9.1

Basics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

24

2.9.2

Subsets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

24

2.9.3

Unions and intersections . . . . . . . . . . . . . . . . . . . . . . .

25

2.9.4

Universal sets and complements . . . . . . . . . . . . . . . . . . .

25

2.9.5

Sets and logic . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

25

2.9.6

Cartesian products . . . . . . . . . . . . . . . . . . . . . . . . . .

26

2.9.7

Power sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

26

2.10 Quantifiers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

26

2.10.1 Proof by contradiction . . . . . . . . . . . . . . . . . . . . . . . .

28

2.11 Some terminology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

29

2.12 General advice . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

29

2.12.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

29

2.12.2 How to write mathematics . . . . . . . . . . . . . . . . . . . . . .

30

2.12.3 How to do mathematics . . . . . . . . . . . . . . . . . . . . . . .

31

2.12.4 How to become better in mathematics . . . . . . . . . . . . . . .

32

Learning outcomes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

33

Sample examination questions . . . . . . . . . . . . . . . . . . . . . . . . . . .

34

Comments on selected activities . . . . . . . . . . . . . . . . . . . . . . . . . .

35

Sketch answers to or comments on sample questions . . . . . . . . . . . . . . .

36

3 Natural numbers and proof by induction

ii

41

Essential reading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

41

3.1

Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

41

3.2

Natural numbers: an axiomatic approach . . . . . . . . . . . . . . . . . .

41

3.3

Least and greatest members and the well-ordering principle . . . . . . . .

42

3.4

The principle of induction . . . . . . . . . . . . . . . . . . . . . . . . . .

43

3.4.1

Proof by induction . . . . . . . . . . . . . . . . . . . . . . . . . .

43

3.4.2

An example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

43

3.4.3

Variants . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

44

3.5

Summation formulae . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

45

3.6

Recursively defined sequences . . . . . . . . . . . . . . . . . . . . . . . .

46

3.7

Using the axioms for the natural numbers . . . . . . . . . . . . . . . . .

47

3.8

Why the Principle works . . . . . . . . . . . . . . . . . . . . . . . . . . .

49

Learning outcomes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

49

Sample examination questions . . . . . . . . . . . . . . . . . . . . . . . . . . .

50

Contents

Comments on selected activities . . . . . . . . . . . . . . . . . . . . . . . . . .

50

Sketch answers to or comments on sample questions . . . . . . . . . . . . . . .

52

4 Functions and counting

57

Essential reading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

57

4.1

Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

57

4.2

Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

57

4.2.1

Basic definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . .

57

4.2.2

Composition of functions . . . . . . . . . . . . . . . . . . . . . . .

58

Bijections, surjections and injections . . . . . . . . . . . . . . . . . . . .

58

4.3.1

An example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

59

Inverse functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

60

4.4.1

Definition, and existence . . . . . . . . . . . . . . . . . . . . . . .

60

4.4.2

Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

61

4.5

Counting as a bijection . . . . . . . . . . . . . . . . . . . . . . . . . . . .

61

4.6

The pigeonhole principle . . . . . . . . . . . . . . . . . . . . . . . . . . .

62

4.6.1

The principle . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

62

4.6.2

Some applications of the Pigeonhole Principle . . . . . . . . . . .

63

4.6.3

A generalised form of the Pigeonhole Principle . . . . . . . . . . .

65

Infinite sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

66

Learning outcomes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

66

Sample examination questions . . . . . . . . . . . . . . . . . . . . . . . . . . .

66

Comments on selected activities . . . . . . . . . . . . . . . . . . . . . . . . . .

68

Sketch answers to or comments on sample questions . . . . . . . . . . . . . . .

68

4.3
4.4

4.7

5 Equivalence relations and the integers

73

Essential reading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

73

5.1

Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

73

5.2

Equivalence relations . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

73

5.2.1

Relations in general . . . . . . . . . . . . . . . . . . . . . . . . . .

73

5.2.2

The special properties of equivalence relations . . . . . . . . . . .

74

5.3

Equivalence classes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

75

5.4

Construction of the integers from the natural numbers . . . . . . . . . .

76

5.5

Properties of the integers . . . . . . . . . . . . . . . . . . . . . . . . . . .

78

5.6

Ordering the integers . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

79

Learning outcomes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

80

iii

Contents

Sample examination questions . . . . . . . . . . . . . . . . . . . . . . . . . . .

80

Comments on selected activities . . . . . . . . . . . . . . . . . . . . . . . . . .

81

Sketch answers to or comments on sample questions . . . . . . . . . . . . . . .

81

6 Divisibility and prime numbers

83

Essential reading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

83

6.1

Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

83

6.2

Divisibility . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

83

6.3

Quotients and remainders . . . . . . . . . . . . . . . . . . . . . . . . . .

83

6.4

Representation of integers with respect to a base . . . . . . . . . . . . . .

84

6.5

Greatest common divisor . . . . . . . . . . . . . . . . . . . . . . . . . . .

85

6.6

The Euclidean algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . .

86

6.7

Some consequences of the Euclidean algorithm . . . . . . . . . . . . . . .

87

6.8

Prime numbers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

88

6.9

Prime factorisation: the Fundamental Theorem of Arithmetic . . . . . . .

89

6.9.1

The Fundamental Theorem . . . . . . . . . . . . . . . . . . . . .

89

6.9.2

Proof of the Fundamental Theorem . . . . . . . . . . . . . . . . .

90

Learning outcomes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

91

Sample examination questions . . . . . . . . . . . . . . . . . . . . . . . . . . .

91

Comments on selected activities . . . . . . . . . . . . . . . . . . . . . . . . . .

92

Sketch answers to or comments on sample questions . . . . . . . . . . . . . . .

93

7 Congruence and modular arithmetic

iv

97

Essential reading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

97

7.1

Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

97

7.2

Congruence modulo m . . . . . . . . . . . . . . . . . . . . . . . . . . . .

97

7.2.1

The congruence relation . . . . . . . . . . . . . . . . . . . . . . .

97

7.2.2

Congruence classes . . . . . . . . . . . . . . . . . . . . . . . . . .

99

7.3

Zm and its arithmetic . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

100

7.4

Invertible elements in Zm . . . . . . . . . . . . . . . . . . . . . . . . . . .

101

7.5

Solving equations in Zm . . . . . . . . . . . . . . . . . . . . . . . . . . .

102

7.5.1

Single linear equations . . . . . . . . . . . . . . . . . . . . . . . .

102

7.5.2

Systems of linear equations

. . . . . . . . . . . . . . . . . . . . .

103

Learning outcomes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

104

Sample examination questions . . . . . . . . . . . . . . . . . . . . . . . . . . .

105

Comments on selected activities . . . . . . . . . . . . . . . . . . . . . . . . . .

105

Contents

Sketch answers to or comments on sample questions . . . . . . . . . . . . . . .


8 Rational, real and complex numbers

106
109

Essential reading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

109

8.1

Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

109

8.2

Rational numbers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

109

8.2.1

An important equivalence relation . . . . . . . . . . . . . . . . . .

109

8.2.2

Rational numbers as equivalence classes

. . . . . . . . . . . . . .

110

8.2.3

Doing arithmetic . . . . . . . . . . . . . . . . . . . . . . . . . . .

111

Rational numbers and real numbers . . . . . . . . . . . . . . . . . . . . .

112

8.3.1

Real numbers: a sketchy introduction . . . . . . . . . . . . . . .

112

8.3.2

Rationality and repeating patterns . . . . . . . . . . . . . . . . .

113

8.3.3

Irrational numbers . . . . . . . . . . . . . . . . . . . . . . . . . .

115

8.3.4

Density of the rational numbers . . . . . . . . . . . . . . . . . .

116

8.4

Countability of rationals and uncountability of real numbers . . . . . . .

116

8.5

Complex numbers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

118

8.5.1

Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

118

8.5.2

Complex numbers: a formal approach . . . . . . . . . . . . . . . .

118

8.5.3

Complex numbers: a more usual approach . . . . . . . . . . . . .

118

8.5.4

Roots of polynomials . . . . . . . . . . . . . . . . . . . . . . . . .

120

8.5.5

The complex plane . . . . . . . . . . . . . . . . . . . . . . . . . .

121

8.5.6

Polar form of z . . . . . . . . . . . . . . . . . . . . . . . . . . . .

122

8.5.7

Exponential form of z

. . . . . . . . . . . . . . . . . . . . . . . .

124

Learning outcomes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

126

Sample examination questions . . . . . . . . . . . . . . . . . . . . . . . . . . .

126

Comments on selected activities . . . . . . . . . . . . . . . . . . . . . . . . . .

127

Sketch answers to or comments on sample questions . . . . . . . . . . . . . . .

129

8.3

Part 2

Analysis

131

9 Supremum and infimum

133

Essential reading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

133

9.1

Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

133

9.2

The triangle inequality . . . . . . . . . . . . . . . . . . . . . . . . . . . .

133

9.3

Properties of real numbers: supremum and infimum . . . . . . . . . . . .

134

Learning outcomes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

136

Contents

Sample examination questions . . . . . . . . . . . . . . . . . . . . . . . . . . .

136

Comments on selected activities . . . . . . . . . . . . . . . . . . . . . . . . . .

137

Sketch answers to or comments on sample questions . . . . . . . . . . . . . . .

138

10 Sequences and limits

141

Essential reading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

141

10.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

141

10.2 Examples of sequences . . . . . . . . . . . . . . . . . . . . . . . . . . . .

141

10.3 Sequences and limits . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

142

10.3.1 Sequences: formal definition . . . . . . . . . . . . . . . . . . . . .

142

10.3.2 The definition of a limit . . . . . . . . . . . . . . . . . . . . . . .

142

10.4 Some standard results on and properties of limits . . . . . . . . . . . . .

146

10.4.1 Using the formal definition of limit . . . . . . . . . . . . . . . . .

147

10.4.2 Bounded sequences . . . . . . . . . . . . . . . . . . . . . . . . . .

148

10.4.3 Monotonic sequences . . . . . . . . . . . . . . . . . . . . . . . . .

149

10.5 Algebra of limits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

150

10.6 Some useful limits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

151

10.7 The sandwich theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . .

153

10.8 Subsequences . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

154

Learning outcomes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

156

Sample examination questions . . . . . . . . . . . . . . . . . . . . . . . . . . .

156

Comments on selected activities . . . . . . . . . . . . . . . . . . . . . . . . . .

159

Sketch answers to or comments on sample questions . . . . . . . . . . . . . . .

160

11 Limits of functions and continuity

vi

167

Essential reading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

167

11.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

167

11.2 Limit of a function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

167

11.2.1 Definition of limit . . . . . . . . . . . . . . . . . . . . . . . . . . .

167

11.2.2 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

168

11.2.3 Algebra of limits . . . . . . . . . . . . . . . . . . . . . . . . . . .

169

11.2.4 More on limits

. . . . . . . . . . . . . . . . . . . . . . . . . . . .

169

11.3 Continuity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

170

11.4 Continuity and sequences . . . . . . . . . . . . . . . . . . . . . . . . . . .

172

11.4.1 Continuous functions on closed intervals . . . . . . . . . . . . . .

172

11.5 The Intermediate Value Theorem . . . . . . . . . . . . . . . . . . . . . .

173

Contents

Learning outcomes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

174

Sample examination questions . . . . . . . . . . . . . . . . . . . . . . . . . . .

175

Comments on selected activities . . . . . . . . . . . . . . . . . . . . . . . . . .

177

Sketch answers to or comments on sample questions . . . . . . . . . . . . . . .

177

Part 3

Algebra

181

12 Groups

183

Essential reading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

183

12.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

183

12.2 Definition of a group . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

183

12.2.1 Binary operations . . . . . . . . . . . . . . . . . . . . . . . . . . .

183

12.2.2 Groups . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

185

12.3 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

185

12.4 Group tables

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

187

12.5 Some elementary properties . . . . . . . . . . . . . . . . . . . . . . . . .

188

Learning outcomes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

189

Sample examination questions . . . . . . . . . . . . . . . . . . . . . . . . . . .

189

Comments on selected activities . . . . . . . . . . . . . . . . . . . . . . . . . .

191

Sketch answers to or comments on sample questions . . . . . . . . . . . . . . .

191

13 Subgroups

197

Essential reading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

197

13.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

197

13.2 Definition of a subgroup . . . . . . . . . . . . . . . . . . . . . . . . . . .

197

13.3 Powers and order of a group element . . . . . . . . . . . . . . . . . . . .

200

Learning outcomes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

202

Sample examination questions . . . . . . . . . . . . . . . . . . . . . . . . . . .

202

Comments on selected activities . . . . . . . . . . . . . . . . . . . . . . . . . .

204

Sketch answers to or comments on sample questions . . . . . . . . . . . . . . .

204

14 Homomorphisms and Lagranges theorem

209

Essential reading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

209

14.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

209

14.2 Homomorphisms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

209

14.3 Isomorphism . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

210

vii

Contents

14.4 Cosets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

211

14.5 Lagranges theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

213

Learning outcomes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

214

Sample examination questions . . . . . . . . . . . . . . . . . . . . . . . . . . .

215

Comments on selected activities . . . . . . . . . . . . . . . . . . . . . . . . . .

217

Sketch answers to or comments on sample questions . . . . . . . . . . . . . . .

217

A Sample examination paper

225

B Sketch solutions to the sample examination paper

231

viii

Chapter 1
Introduction
In this very brief introduction, I aim to give you an idea of the nature of this subject
and to advise on how best to approach it. I also give general information about the
contents and use of this subject guide, and on recommended reading and how to use the
textbooks.

1.1
1.1.1

This subject
Relationship to previous mathematics courses

If you are taking this course as part of a BSc Degree you will already have taken a
pre-requisite Mathematics subject, either a combination of 05A Mathematics 1 and
05B Mathematics 2 or 174 Calculus. Any references in the text to these courses for
prerequisite material will apply equally to whatever pre-requisite you have taken. Please
note: this course may not be taken with 95 Further mathematics for economists.
In 05A Mathematics 1 and 05B Mathematics 2 you will have learned about
techniques of calculus and linear algebra. In Abstract mathematics the emphasis is
on theory rather than method: we will want to understand why certain techniques work,
and how we might be able to prove that they do, for example. The main central topic in
this course is proof. This course is an introduction to formal mathematical reasoning, in
which proof is central. We will meet the fundamental concepts and constructions of
mathematics and see how to formulate mathematical statements in precise terms, and
we will see how such statements can be proved or disproved.
In this subject, we need to work with precise definitions and statements, and you will
need to know these. Not only will you need to know these, but you will have to
understand them, and be able (through the use of them) to demonstrate that you
understand them. Simply learning the definitions without understanding what they
mean is not going to be adequate. I hope that these words of warning dont discourage
you, but I think its important to make it clear that this is a subject at a higher level
than those prerequisite subjects.
In this subject, you will learn how to prove mathematical statements precisely. This is a
very different sort of mathematics from that which you encountered in 05A
Mathematics 1 and 05B Mathematics 2, where the emphasis is on solving problems
through calculation. In Abstract mathematics, one has to be able to produce
convincing mathematical arguments as to why a given mathematical statement is true
or false. For example, a prime number is a positive integer greater than 1 that is only
divisible by itself and the number 1 (so 7 is a prime number, but 8 is not). The
statement There are infinitely many prime numbers is a mathematical statement, and

1. Introduction

it is either true (there are infinitely many prime numbers) or false (there are only
finitely many prime numbers). In fact, the statement is true. But why? Theres no quick
calculation we can do to establish the truth of the statement. What is needed is a
proof: a watertight, logical argument. This is the type of problem we consider in this
subject.

1.1.2

Aims

This course is designed to enable you to:


develop your ability to think in a critical manner;
formulate and develop mathematical arguments in a logical manner;
improve your skill in acquiring new understanding and expertise;
acquire an understanding of basic pure mathematics, and the role of logical
argument in mathematics.

1.1.3

Learning outcomes

At the end of this course and having completed the Essential reading and activities, you
should:
have used basic mathematical concepts in discrete mathematics, algebra and real
analysis to solve mathematical problems in this subject
be able to use formal notation correctly and in connection with precise statements
in English
be able to demonstrate an understanding of the underlying principles of the subject
be able to solve unseen mathematical problems in discrete mathematics, algebra
and real analysis
be able to prove statements and formulate precise mathematical arguments.

1.1.4

Topics covered

Descriptions of topics to be covered appear in the relevant chapters. However, it is


useful to give a brief overview at this stage.
The first half, approximately, of the subject is concerned primarily with proof, logic,
and number systems. We shall refer to this part of the subject as the Numbers and
proof part. The rest of the subject falls into two parts: Analysis and Algebra. We
will be concerned, specifically, with elements of real analysis, and the theory of groups.
It is possible to give only a brief overview of these three sections at this stage, since a
more detailed description of each inevitably involves technical concepts that have not
yet been met.
In the Numbers and proof part (Chapters 2 to 8), we will first investigate how
precise mathematical statements can be formulated, and here we will use the language
and symbols of mathematical logic. We will then study how one can prove or disprove
mathematical statements. Next, we look at some important ideas connected with
functions, relations, and numbers. For example, we will look at prime numbers and

1.2. Reading

learn what special properties these important numbers have, and how one may prove
such properties.
In the Analysis part (Chapters 9 to 11), we will see how the intuitive idea of the limit
of a sequence of numbers can be made mathematically precise so that certain properties
can be proved to hold. We will also look at functions and the key concept of continuity
(which is intuitively appealing, but must be precisely mathematically formulated in
order to be useful).
The Algebra part (Chapters 12 to 14) is about the theory of groups. A group is an
abstract mathematical concept, but there are many concrete examples in the earlier
part of this subject. In this part of the subject, we study general properties of groups.
Not all chapters of the guide are the same length. It should not be assumed that you
should spend the same amount of time on each chapter. We will not try to specify how
much relative time should be spent on each: that will vary from person to person and
we do not want to be prescriptive.
As a very rough guide (bearing in mind that this must vary from individual to
individual), we would suggest that the percentages of time spent on each chapter are
something along the lines suggested in the table below. (This should not be taken as
any indication about the composition of the examination.)
Chapter
2
3
4
5
6
7
8
9
10
11
12
13
14

1.2

Title
Mathematical statements, proof, logic and sets
Natural numbers and proof by induction
Functions and counting
Equivalence relations and the integers
Divisibility and prime numbers
Congruence and modular arithmetic
Rational, real and complex numbers
Supremum and infimum
Sequences and limits
Limits of functions and continuity
Group
Subgroups
Homomorphisms and Lagranges theorem

% Time
10
5
5
5
10
5
5
5
15
10
10
5
10

Reading

You will have to read books in order to supplement your reading. This subject guide is
just a guide, and is not a textbook.
There are many books that would be useful for this subject, since numbers and proof,
analysis and algebra are components of almost all university-level mathematics degree
programmes.
For the Numbers and proof part of the subject, you should obtain copies of the
following two books. It is enough to have one of them, but best to have both. (I will
assume you have access to these, as they will be heavily cited in this guide):

R
R

1. Introduction

Biggs, Norman L. Discrete Mathematics. (Oxford Press: Oxford, 2002) second


edition [ISBN 9780198507178].
Eccles, P.J. An Introduction to Mathematical Reasoning: numbers, sets and
functions. (Cambridge University Press: Cambridge and New York, 1997) [ISBN
9780521597180].

For the Analysis part, there are many suitable books, with the words analysis, real
analysis or mathematical analysis in their title. The one I recommend most is the
following:

Bryant, Victor Yet Another Introduction to Analysis. (Cambridge University


Press: Cambridge, 1990) [ISBN 9780521388351].

This book is written informally and entertainingly, and it will be the one I cite in the
Analysis chapters. As I indicated, there are many other textbooks with titles such as
Real Analysis or Mathematical Analysis that you will find useful. Here are some:

R
R

Binmore, K.G. Mathematical Analysis: A Straightforward Approach. (Cambridge


University Press: Cambridge, 1982) [ISBN 97805212888278.
Bartle, R.G. and D.R. Sherbert Introduction to Real Analysis. (John Wiley and
Sons: New York, 1999) fourth edition [ISBN 9780471433316].

For the Algebra part of the subject, you should use the Biggs book, cited above.
There is one topic that neither of these covers, which is the topic of complex umbers.
However, this is a topic that is well-covered in a number of other textbooks and I have
included a fairly full treatment of it in the guide to compensate for the fact that it is
not covered in the recommended textbooks.
A text that covers this topic (and will also be very useful for the subject 118
Advanced linear algebra, a subject you might also be studying) is:

Anton, H. Elementary Linear Algebra. (John Wiley: Hoboken, NJ, 2010) tenth
edition [ISBN 9789470561577].1

So the ideal combination of texts consists of three main books: Biggs, Bryant and
Eccles, together with access to another book (such as Anton) that covers complex
numbers. Your study of this subject will be much enhanced if you have these.
Detailed reading references in this subject guide refer to the editions of the set
textbooks listed above. New editions of one or more of these textbooks may have been
published by the time you study this course. You can use a more recent edition of any
of the books; use the detailed chapter and section headings and the index to identify
relevant readings. Also check the VLE regularly for updated guidance on readings.

1.3

Online study resources

In addition to the subject guide and the Essential reading, it is crucial that you take
advantage of the study resources that are available online for this course, including the
virtual learning environment (VLE) and the Online Library.
1

There are many editions and variants of this book, such as the Applications version. Any one is
equally useful.

1.3. Online study resources

You can access the VLE, the Online Library and your University of London email
account via the Student Portal at:
http://my.londoninternational.ac.uk
You should have received your login details for the Student Portal with your official
offer, which was emailed to the address that you gave on your application form. You
have probably already logged in to the Student Portal in order to register! As soon as
you registered, you will automatically have been granted access to the VLE, Online
Library and your fully functional University of London email account.
If you forget your login details at any point, please email uolia.support@london.ac.uk
quoting your student number.

1.3.1

The VLE

The VLE, which complements this subject guide, has been designed to enhance your
learning experience, providing additional support and a sense of community. It forms an
important part of your study experience with the University of London and you should
access it regularly.
The VLE provides a range of resources for EMFSS courses:
Self-testing activities: Doing these allows you to test your own understanding of
subject material.
Electronic study materials: The printed materials that you receive from the
University of London are available to download, including updated reading lists
and references.
Past examination papers and Examiners commentaries: These provide advice on
how each examination question might best be answered.
A student discussion forum: This is an open space for you to discuss interests and
experiences, seek support from your peers, work collaboratively to solve problems
and discuss subject material.
Videos: There are recorded academic introductions to the subject, interviews and
debates and, for some courses, audio-visual tutorials and conclusions.
Recorded lectures: For some courses, where appropriate, the sessions from previous
years Study Weekends have been recorded and made available.
Study skills: Expert advice on preparing for examinations and developing your
digital literacy skills.
Feedback forms.
Some of these resources are available for certain courses only, but we are expanding our
provision all the time and you should check the VLE regularly for updates.

1.3.2

Making use of the Online Library

The Online Library contains a huge array of journal articles and other resources to help
you read widely and extensively.
To access the majority of resources via the Online Library you will either need to use
your University of London Student Portal login details, or you will be required to
register and use an Athens login:

1. Introduction

http://tinyurl.com/ollathens
The easiest way to locate relevant content and journal articles in the Online Library is
to use the Summon search engine.
If you are having trouble finding an article listed in a reading list, try removing any
punctuation from the title, such as single quotation marks, question marks and colons.
For further advice, please see the online help pages:
http://www.external.shl.lon.ac.uk/summon/about.php

1.4

Using the guide

As already mentioned, it is important that you read textbooks in conjunction with the
guide and that you try problems from the textbooks. The Sample examination
questions at the end of the chapters of this guide are a very useful resource. You should
try them once you think you have mastered a particular chapter. Really try them:
dont just simply read the solutions provided. Instead, make a serious attempt before
consulting the solutions. Note that the solutions are often just sketch solutions, to
indicate to you how to answer the questions. However, in the examination, you must
show all your reasoning. It is vital that you develop and enhance your problem-solving
skills and the only way to do this is to try lots of examples.
Finally, we often use the symbol to denote the end of a proof, where we have finished
explaining why a particular result is true. This is just to make it clear where the proof
ends and the following text begins.

1.5

Examination

Important: the information and advice given here are based on the examination
structure used at the time this guide was written. Please note that subject guides may
be used for several years. Because of this we strongly advise you to always check both
the current Regulations for relevant information about the examination, and the virtual
learning environment (VLE) where you should be advised of any forthcoming changes.
You should also carefully check the rubric/instructions on the paper you actually sit
and follow those instructions. Remember, it is important to check the VLE for:
up-to-date information on examination and assessment arrangements for this course
where available, past examination papers and Examiners commentaries for the
course which give advice on how each question might best be answered.
A Sample examination paper is given as an appendix to this guide. There are no
optional topics in this subject: you should study them all. The examination paper will
provide some element of choice as to which questions you attempt: see the Sample
examination paper at the end of the subject guide for an indication of the structure of
the examination paper.
Please do not assume that the questions in a real examination will necessarily be very
similar to these sample questions. An examination is designed (by definition) to test

1.6. The use of calculators

you. You will get examination questions unlike questions in this guide. The whole point
of examining is to see whether you can apply knowledge in familiar and unfamiliar
settings. The Examiners (nice people though they are) have an obligation to surprise
you! For this reason, it is important that you try as many examples as possible, from
the guide and from the textbooks. This is not so that you can cover any possible type of
question the Examiners can think of! Its so that you get used to confronting unfamiliar
questions, grappling with them, and finally coming up with the solutions.
Do not panic if you cannot completely solve an examination question. There are many
marks to be awarded for using the correct approach or method.

1.6

The use of calculators

You will not be permitted to use calculators of any type in the examination. This is not
something that you should worry about: the Examiners are interested in assessing that
you understand the key concepts, ideas, methods and techniques, and will set questions
which do not require the use of a calculator.

1. Introduction

Part 1
Numbers and Proof

Chapter 2
Mathematical statements, proof, logic
and sets
Essential reading

R
R

One or both of the following:

2.1

Biggs, N.L. Discrete Mathematics. Chapters 13.


Eccles, P.J. An Introduction to Mathematical Reasoning. Chapters 14 and 6.

Introduction

In this important chapter, we set the ground for much of what follows in this course.
Abstract mathematics is about making precise mathematical statements and
establishing, by proof or disproof, whether these statements are true or false. In this
chapter we look at what this means, concentrating on fairly simple types of
mathematical statement, in order to emphasise techniques of proof. In later chapters
(such as those on numbers, analysis and algebra) we will use these proof techniques
extensively. You might think that some of the things we prove in this chapter are very
obvious and hardly merit proving, but proving even obvious statements can be quite
tricky sometimes, and it is good preparation for proving more complicated things later
on.

2.2

Mathematical statements and proof

To introduce the topics of mathematical statements and proof, we start by giving some
explicit examples. Later in the chapter we give some general theory and principles. Our
discussion of the general theory is limited because this is not a course in logic, as such.
What we do need is enough logic to understand what mathematical statements mean
and how we might prove or disprove them.

2.2.1

Examples of mathematical statements

Consider the following statements (in which, you should recall that the natural numbers
are the positive integers):
(a) 20 is divisible by 4.

11

2. Mathematical statements, proof, logic and sets

(b) 21 is not divisible by 7.

(c) 21 is divisible by 4.
(d) 21 is divisible by 3 or 5.
(e) 50 is divisible by 2 and 5.
(f) n2 is even.
(g) For every natural number n, the number n2 + n is even.
(h) There is a natural number n such that 2n = 2n .
(i) If n is even, then n2 is even.
(j) For all odd numbers n, n2 is odd.
(k) For natural numbers n, n2 is even if and only if n is even.

(l) There are no natural numbers m and n such that 2 = m/n.


These are all mathematical statements, of different sorts (all of which will be discussed
in more detail in the remainder of this chapter).
Statements (a) to (e) are straightforward propositions about certain numbers, and these
are either true or false. Statements (d) and (e) are examples of compound statements.
Statement (d) is true precisely when either one (or both) of the statements 21 is
divisible by 3 and 21 is divisible by 5 is true. Statement (e) is true precisely when both
of the statements 50 is divisible by 2 and 50 is divisible by 5 are true.
Statement (f) is different, because the number n is not specified and whether the
statement is true or false will depend on the value of the so-called free variable n. Such
a statement is known as a predicate.
Statement (g) makes an assertion about all natural numbers and is an example of a
universal statement.
Statement (h) asserts the existence of a particular number and is an example of an
existential statement.
Statement (i) can be considered as an assertion about all even numbers, and so it is a
universal statement, where the universe is all even numbers. But it can also be
considered as an implication, asserting that if n happens to be even, then n2 is even.
Statement (j) is a universal statement about all odd numbers. It can also be thought of
(or rephrased) as an implication, for it says precisely the same as if n is odd, then n2 is
odd.
Statement (k) is an if and only if statement: what it says is that n2 is even, for a
natural number n, precisely when n is even. But this means two things: namely that n2
is even if n is even, and n is even if n2 is even. Equivalently, it means that n2 is even if
n is even and that n2 is odd if n is odd. So statement (k) will be true precisely if (i) and
(j) are true.
Statement (l) asserts the non-existence of a certain pair of numbers (m, n). Another
way of thinking about this statement is that it says that for all choices of (m, n), it is

12

2.2. Mathematical statements and proof

not the case that m/n = 2. (This is an example of the general rule that a
non-existence statement can be thought of as a universal statement, something to be
discussed later in more detail.)
Its probably worth giving some examples of things that are not proper mathematical
statements.
For example, 6 is a nice number is not a mathematical statement. This is because nice
number has no mathematical meaning. However, if, beforehand, we had defined nice
number in some way, then this would not be a problem. For example, suppose we said:
Let us say that a number is nice if it is the sum of all the positive numbers
that divide it and are less than it.
Then 6 is a nice number would be a proper mathematical statement, and it would be
true, because 6 has positive divisors 1, 2, 3, 6 and 6 = 1 + 2 + 3. But without defining
what nice means, its not a mathematical statement. Definitions are important.
n2 + n is not a mathematical statement, because it does not say anything about
n2 + n. It is not a mathematical statement in the same way that David Cameron is not
a sentence: it makes no assertion about what David Cameron is or does. However,
n2 + n > 0 is an example of a predicate with free variable n and, for a particular value
of n, this is a mathematical statement. Likewise, for all natural numbers n, n2 + n > 0
is a mathematical statement.

2.2.2

Introduction to proving statements

Weve seen, above, various types of mathematical statement, and such statements are
either true or false. But how would we establish the truth or falsity of these?
We can, even at this early stage, prove (by which we mean establish the truth of) or
disprove (by which we mean establish the falsity of) most of the statements given
above. Heres how we can do this.
(a) 20 is divisible by 4.
This statement is true. Yes, yes, I know its obvious, but stay with me. To give a
proper proof, we need first to understand exactly what the word divisible means.
You will probably most likely think that this means that when we divide 20 by 4
we get no remainder. This is correct: in general, for natural numbers n and d, to
say that n is divisible by d (or, equivalently, that n is a multiple of d) means
precisely that there is some natural number m for which n = md. Since 20 = 5 4,
we see that 20 is divisible by 4. And thats a proof! Its utterly convincing,
watertight, and not open to debate. Nobody can argue with it, not even a
sociologist! Isnt this fun? Well, maybe its not that impressive in such a simple
situation, but we will certainly prove more impressive results later.
(b) 21 is not divisible by 7.
This is false. Its false because 21 is divisible by 7, because 21 = 3 7.
(c) 21 is divisible by 4.

13

2. Mathematical statements, proof, logic and sets

This is false, as can be established in a number of ways. First, we note that if the
natural number m satisfies m 5, then m 4 will be no more than 20. And if
m 6 then m 4 will be at least 24. Well, any natural number m is either at most
5 or at least 6 so, for all possible m, we do not have m 4 = 21 and hence there is
no natural number m for which m 4 = 21. In other words, 21 is not divisible by 4.
Another argument (which is perhaps more straightforward, but which relies on
properties of rational numbers rather than just simple properties of natural
numbers) is to note that 21/4 = 5.25, and this is not a natural number, so 21 is not
divisible by 4. (This second approach is the same as showing that 21 has remainder
1, not 0, when we divide by 4.)

(d) 21 is divisible by 3 or 5.
As we noted above, this is a compound statement and it will be true precisely when
one (or both) of the following statements is true:
(i) 21 is divisible by 3
(ii) 21 is divisible by 5.
Statement (i) is true, because 21 = 7 3. Statement (ii) is false. Because at least
one of these two statements is true, statement (d) is true.
(e) 50 is divisible by 2 and 5.
This is true. Again, this is a compound statement and it is true precisely if both of
the following statements are true:
(i) 50 is divisible by 2
(ii) 50 is divisible by 5.
Statements (i) and (ii) are indeed true because 50 = 25 2 and 50 = 10 5. So
statement (e) is true.
(f) n2 is even.
As mentioned above, whether this is true or false depends on the value of n. For
example, if n = 2 then n2 = 4 is even, but if n = 3 then n2 = 9 is odd. So, unlike
the other statements (which are propositions), this is a predicate P (n). The
predicate will become a proposition when we assign a particular value to n to it,
and the truth or falsity of the proposition can then be established. Statements (i),
(j), (k) below do this comprehensively.
(g) For every natural number n, the number n2 + n is even.
Heres our first non-immediate, non-trivial, proof. How on earth can we prove this,
if it is true, or disprove it, if it is false? Suppose it was false. How would you
convince someone of that? Well, the statement says that for every natural number
n, n2 + n is even. So if you managed (somehow!) to find a particular N for which
N 2 + N happened to be odd, you could prove the statement false by simply
observing that When n = N , it is not the case that n2 + n is even. And that
would be the end of it. So, in other words, if a universal statement about natural
numbers is false, you can prove it is false by showing that its conclusion is false for
some particular value of n. But suppose the statement is true. How could you prove
it. Well, you could prove it for n = 1, then n = 2, then n = 3, and so on, but at

14

2.2. Mathematical statements and proof

some point you would expire and there would still be numbers n that you hadnt
yet proved it for. And that simply wouldnt do, because if you proved it true for
the first 9999 numbers, it might be false when n = 10000. So what you need is a
more sophisticated, general argument that shows the statement is true for any
arbitrary n.
Now, it turns out that this statement is true. So we need a nice general argument
to establish this. Well, heres one approach. We can note that n2 + n = n(n + 1).
The numbers n and n + 1 are consecutive natural numbers. So one of them is odd
and one of them is even. When you multiply any odd number and any even number
together, you get an even number, so n2 + n is even. Are you convinced? Maybe
not? We really should be more explicit. Suppose n is even. What that means is
that, for some integer k, n = 2k. Then n + 1 = 2k + 1 and hence
n(n + 1) = 2k(2k + 1) = 2 (k(2k + 1)) .
Because k(2k + 1) is an integer, this shows that n2 + n = n(n + 1) is divisible by 2;
that is, it is even. We supposed here that n was even. But it might be odd, in
which case we would have n = 2k + 1 for some integer k. Then
n(n + 1) = (2k + 1)(2k + 2) = 2 ((2k + 1)(k + 1)) ,
which is, again, even, because (2k + 1)(k + 1) is an integer.
Right, were really proving things now. This is a very general statement, asserting
something about all natural numbers, and we have managed to prove it. I find that
quite satisfying, dont you?
(h) There is a natural number n such that 2n = 2n .
This is an existential statement, asserting that there exists n with 2n = 2n . Before
diving in, lets pause for a moment and think about how we might deal with such
statements. If an existential statement like this is true we would need only to show
that its conclusion (which in this case is 2n = 2n ) holds for some particular n. That
is, we need only find an n that works. If the statement is false, we have a lot more
work to do in order to prove that it is false. For, to show that it is false, we would
need to show that, for no value of n does the conclusion holds. Equivalently, for
every n, the conclusion fails. So wed need to prove a universal statement and, as
we saw in the previous example, that would require us to come up with a suitably
general argument.
In fact, this statement is true. This is because when n = 1 we have
2n = 2 = 21 = 2n .
(i) If n is even, then n2 is even.
This is true. The most straightforward way to prove this is to assume that n is
some (that is, any) even number and then show that n2 is even. So suppose n is
even. Then n = 2k for some integer k and hence n2 = (2k)2 = 4k 2 . This is even
because it is 2(2k 2 ) and 2k 2 is an integer.
(j) For all odd numbers n, n2 is odd.
This is true. The most straightforward way to prove this is to assume that n is any
odd number and then show that n2 is also odd. So suppose n is odd. Then

15

2. Mathematical statements, proof, logic and sets

n = 2k + 1 for some integer k and hence n2 = (2k + 1)2 = 4k 2 + 4k + 1. To establish


that this is odd, we need to show that it can be written in the form 2K + 1 for
some integer K. Well, 4k 2 + 4k + 1 = 2(2k 2 + 2k) + 1. This is indeed of the form
2K + 1, where K is the integer 2k 2 + 2k. Hence n2 is odd.

Another way to prove this result is to prove that if n2 is even then n must be even.
We wont do that right now, because to do it properly requires a result we meet
later concerning the factorisation of numbers into prime numbers. But think about
the strategy for a moment. Suppose we were able to prove the following statement,
which well call Q:
Q:

If n2 is even then n is even.

Why would that establish what we want (namely that if n is odd then n2 is odd)?
Well, one way is to observe that Q is whats called the contrapositive of statement
(j) that were trying to prove, and the contrapositive is logically equivalent to the
initial statement. (This is a bit of formal logic, and we will discuss this in more
detail later). But theres another way of thinking about it, which is perhaps easier
to understand at this stage. Suppose we have proved statement Q and suppose that
n is odd. Then it must be the case that n2 is odd. For, if n2 was not odd, it would
be even and then Q would tell us that this means n is even. But we have assumed
n is odd. It cannot be both even and odd, so we have reached a contradiction. By
assuming that the opposite conclusion holds (n2 even) we have shown that
something impossible happens. This type of argument is known as a proof by
contradiction and it is often very powerful. We will see more about this later.
(k) For natural numbers n, n2 is even if and only if n is even.
This is true. What we have shown in proving (i) and (j) is that if n is even then n2
is even, and if n is odd then n2 is odd. The first, (statement (i)) establishes that if
n is even, then n2 is even. The second of these (statement (j)) establishes that n2 is
even only if n is even. This is because it shows that n2 is odd if n is odd, from
which it follows that if n2 is even, n must not have been odd, and therefore must
have been even. If and only if statements of this type are very important. As we
see here, the proof of such statements breaks down into the proof of two If-then
statements.
(l) There are no natural numbers m and n such that

2 = m/n.

This is, in fact, true, though we defer the proof for now, until we know more about
factorisation of numbers into prime numbers. We merely comment that the easiest
way to prove the statement is to use a proof by contradiction.
These examples hopefully demonstrate that there are a wide range of statements and
proof techniques, and in the rest of this chapter we will explore these further.
Right now, one thing I hope comes out very clearly from these examples is that to prove
a mathematical statement, you need to know precisely what it means. Well, that sounds
obvious, but you can see how detailed we had to be about the meanings (that is, the
definitions) of the terms divisible, even and odd. Definitions are very important.

16

2.3. Some basic logic

2.3

Some basic logic

Mathematical statements can be true or false. Lets denote true by T and false by F.
Given a statement, or a number of statements, it is possible to form other statements.
This was indicated in some of the examples above (such as the compound statements).
A technique known as the use of truth tables enables us to define logical operations
on statements, and to determine when such statements are true. This is all a bit vague,
so lets get down to some concrete examples.

2.3.1

Negation

The simplest way to take a statement and form another statement is to negate the
statement. The negation of a statement P is the statement P (sometimes just denoted
not P ), which is defined to be true exactly when P is false. This can be described in
the very simple truth table, Table 2.1:
P
T
F

P
F
T

Table 2.1: The truth table for negation or not

What does the table signify? Quite simply, it tells us that if P is true then P is false
and if P is false then P is true.
Example 2.1 If P is 20 is divisible by 3 then P is 20 is not divisible by 3.
Here, P is false and P is true.
It has, I hope, been indicated in the examples earlier in this chapter, that to disprove a
universal statement about natural numbers amounts to proving an existential
statement. That is, if we want to disprove a statement of the form for all natural
numbers n, property p(n) holds (where p(n) is some predicate, such as n2 is even) we
need only produce some N for which p(N ) fails. Such an N is called a counterexample.
Equally, to disprove an existential statement of the form there is some n such that
property p(n) holds, one would have to show that for every n, p(n) fails. That is, to
disprove an existential statement amounts to proving a universal one. But, now that we
have the notion of the negation of a statement we can phrase this a little more formally.
Proving that a statement P is false is equivalent to proving that the negation P is
true. In the language of logic, therefore, we have the following:
The negation of a universal statement is an existential statement.
The negation of an existential statement is a universal statement.
More precisely,
The negation of the universal statement for all n, property p(n) holds is the
existential statement there is n such that property p(n) does not hold.

17

2. Mathematical statements, proof, logic and sets

The negation of the existential statement there is n such that property p(n) holds
is the universal statement for all n, property p(n) does not hold.

We could be a little more formal about this, by defining the negation of a predicate p(n)
(which, recall, only has a definitive true or false value once n is specified) to be the
predicate p(n) which is true (for any particular n) precisely when p(n) is false. Then
we might say that
The negation of the universal statement for all n, p(n) is true is the existential
statement there is n such that p(n) is true.
The negation of the existential statement there is n such that p(n) is true is the
universal statement for all n, p(n) is true.
Now, lets not get confused here. None of this is really difficult or new. We meet such
logic in everyday life. If I say It rains every day in London then either this statement is
true or it is false. If it is false, it is because on (at least) one day it does not rain. The
negation (or disproof) of the statement On every day, it rains in London is simply
There is a day on which it does not rain in London. The former is a universal
statement (On every day, . . . ) and the latter is an existential statement (there is . . . ).
Or, consider the statement There is a student who enjoys reading these lecture notes.
This is an existential statement (There is . . . ). This is false if No student enjoys
reading these lecture notes. Another way of phrasing this last statement is Every
student reading these lecture notes does not enjoy it. This is a more awkward
expression, but it emphasises that the negation of the initial, existential statement, is a
universal one (Every student . . . ).
The former is an existential statement (there is something I will write that . . . ) and
the latter is a universal statement (everything I write will . . . ). This second example is
a little more complicated, but it serves to illustrate the point that much of logic is
simple common sense.

2.3.2

Conjunction and disjunction

There are two very basic ways of combining propositions: through the use of and
(known as conjunction) and the use of or (known as disjunction).
Suppose that P and Q are two mathematical statements. Then P and Q, also denoted
P Q, and called the conjunction of P and Q, is the statement that is true precisely
when both P and Q are true. For example, statement (e) above, which is
50 is divisible by 2 and 5
is the conjunction of the two statements
50 is divisible by 2
50 is divisible by 5.
Statement (e) is true because both of these two statements are true.
Table 2.2 gives the truth table for the conjunction P and Q. What Table 2.2 says is
simply that P Q is true precisely when both P and Q are true (and in no other
circumstances).

18

2.4. If-then statements

P
T
T
F
F

Q
T
F
T
F

P Q
T
F
F
F

Table 2.2: The truth table for and

Suppose that P and Q are two mathematical statements. Then P or Q, also denoted
P Q, and called the disjunction of P and Q, is the statement that is true precisely
when P , or Q, or both, are true. For example, statement (d) above, which is
21 is divisible by 3 or 5
is the disjunction of the two statements
21 is divisible by 3
21 is divisible by 5.
Statement (d) is true because at least one (namely the first) of these two statements is
true.
Note one important thing about the mathematical interpretation of the word or. It is
always used in the inclusive-or sense. So P Q is true in the case when P is true, or Q
is true, or both. In some ways, this use of the word or contrasts with its use in normal
everyday language, where it is often used to specify a choice between mutually exclusive
alternatives. (For example Youre either with us or against us.) But if I say Tomorrow
I will wear brown trousers or I will wear a yellow shirt then, in the mathematical way
in which the word or is used, the statement would be true if I wore brown trousers and
any shirt, any trousers and a yellow shirt, and also if I wore brown trousers and a yellow
shirt. You might have your doubts about my dress sense in this last case, but, logically,
it makes my statement true.
Table 2.3 gives the truth table for the disjunction P and Q.
P
T
T
F
F

Q
T
F
T
F

P Q
T
T
T
F

Table 2.3: The truth table for or

What Table 2.3 says is simply that P Q is true precisely when at least one of P and Q
is true.

2.4

If-then statements

It is very important to understand the formal meaning of the word if in mathematics.


The word is often used rather sloppily in everyday life, but has a very precise

19

2. Mathematical statements, proof, logic and sets

mathematical meaning. Let me give you an example. Suppose I tell you If it rains, then
I wear a raincoat, and suppose that this is a true statement. Well, then, suppose it
rains. You can certainly conclude I will wear a raincoat. But what if it does not rain?
Well, you cant conclude anything. My statement only tells you about what happens if
it rains. If it does not, then I might, or I might not, wear a raincoat: and whether I do or
not does not affect the truth of the statement I made. You have to be clear about this:
an if-then statement only tells you about what follows if something particular happens.
More formally, suppose P and Q are mathematical statements (each of which can
therefore be either true or false). Then we can form the statement denoted P Q (P
implies Q or, equivalently, if P , then Q), which has as its truth table Table 2.4. (This
type of statement is known as an if-then statement or an implication.)
P
T
T
F
F

Q
T
F
T
F

P Q
T
F
T
T

Table 2.4: The truth table for P Q

Note that the statement P Q is false only when P is true but Q is false. (To go back
to the previous example, the statement If it rains, I wear a raincoat is false precisely if
it does rain but I do not wear a raincoat.) This is tricky, so you may have to spend a
little time understanding it. As Ive suggested, perhaps the easiest way is to think about
when a statement if P , then Q is false.
The statement P Q can also be written as Q P . There are different ways of
describing P Q, such as:
if P then Q
P implies Q
P is sufficient for Q
Q if P
P only if Q
Q whenever P
Q is necessary for P .
All these mean the same thing. The first two are the ones I will use most frequently.
If P Q and Q P then this means that Q will be true precisely when P is. That is
Q is true if and only if P is. We use the single piece of notation P Q instead of the
two separate P Q and Q P . There are several phrases for describing what
P Q means, such as:
P if and only if Q (sometimes abbreviated to P iff Q)

20

2.5. Logical equivalence

P is equivalent to Q
P is necessary and sufficient for Q

Q is necessary and sufficient for P .


The truth table is shown in Table 2.5, where we have also indicated the truth or falsity
of P Q and Q P to emphasise that P Q is the same as the conjunction
(P Q) (Q P ).
P Q
T T
T F
F T
F F

P Q QP
T
T
F
T
T
F
T
T

P Q
T
F
F
T

Table 2.5: The truth table for P Q

What the table shows is that P Q is true precisely when P and Q are either both
true or both false.
Activity 2.1 Look carefully at the truth table and understand why the values for
P Q are as they are. In particular, try to explain in words why the truth table
is the way it is.

2.5

Logical equivalence

Two statements are logically equivalent if when either one is true, so is the other, and if
either one is false, so is the other. For example, for statements P and Q, the statements
(P Q) and P Q are logically equivalent. We can see this from the truth table,
Table 2.6, which shows that, in all cases, the two statements take the same logical value
T or F ). (This value is highlighted in bold.)
P
T
T
F
F

Q
T
F
T
F

P Q
T
T
T
F

(P Q)
F
F
F
T

P
F
F
T
T

Q
F
T
F
T

P Q
F
F
F
T

Table 2.6: The truth tables for (P Q) and P Q

The fact that (P Q) and P Q are logically equivalent is quite easy to


understand. The statement P Q is true if and only if at least one of P, Q is true. The
statement is therefore false precisely when both P and Q are false, which means P and
Q are both true, which means P Q is true. Again, we can understand these
things fairly easily with some common sense. If I tell you I will wear brown trousers or

21

2. Mathematical statements, proof, logic and sets

I will wear a yellow shirt then this is a false statement only if I do not wear brown
trousers and I do not wear a yellow shirt.

Now that we know the meaning of , we can see that to say that (P Q) and
P Q are logically equivalent is to say that (P Q) P Q.
Activity 2.2
Show that the statements (P Q) and P Q are logically equivalent. [This
shows that the negation of P Q is P Q. That is, (P Q) is equivalent to
P Q.]

2.6

Converse statements

Given an implication P Q, the reverse implication Q P is known as its converse.


Generally, there is no reason why the converse should be true just because the
implication is. For example, consider the statement If it is Tuesday, then I buy the
Guardian newspaper. The converse is If I buy the Guardian newspaper, then it is
Tuesday. Well, I might buy that newspaper on other days too, in which case the
implication can be true but the converse false. Weve seen, in fact, that if both P Q
and Q P then we have a special notation, P Q, for this situation. Generally,
then, the truth or falsity of the converse Q P has to be determined separately from
that of the implication P Q.
Activity 2.3 What is the converse of the statement if the natural number n
divides 4 then n divides 12 ? Is the converse true? Is the original statement true?

2.7

Contrapositive statements

The contrapositive of an implication P Q is the statement Q P . The


contrapositive is logically equivalent to the implication, as Table 2.7 shows. (The
columns highlighted in bold are identical.)
P
T
T
F
F

Q
T
F
T
F

P Q
T
F
T
T

P
F
F
T
T

Q
F
T
F
T

Q P
T
F
T
T

Table 2.7: The truth tables for P Q and Q P .

If you think about it, the equivalence of the implication and its contrapositive makes
sense. For, Q P says that if Q is false, P is false also. So, it tells us that we cannot
have Q false and P true, which is precisely the same information as is given by P Q.

22

2.8. Working backwards to obtain a proof

So whats the point of this? Well, sometimes you might want to prove P Q and it
will, in fact, be easier to prove instead the equivalent (contrapositive) statement
Q P . See Biggs, section 3.5 for an example.

2.8

Working backwards to obtain a proof

Weve already seen, in the examples earlier in this chapter, how some statements may
be proved directly. For example, in order to prove a universal statement for all n, P (n)
about natural numbers, we would need to provide a proof that starts by assuming that
n is any given (that is, arbitrary) natural number and show the desired conclusion
holds. To disprove such a statement (which is the same as proving its negation), we
would simply need to find a single value of n for which P (n) is false (and such an n is
known as a counterexample).
However, some statements are difficult to prove directly. It is sometimes easier to work
backwards. Suppose you are asked to prove something, such as an inequality or
equation. It might be easier to see how to do so if the end-result (the inequality or
equation you are required to prove) is simplified, or expanded, or re-written in some
way. Heres an example.

Example 2.2 Prove the statement that: if a, b are real numbers and a 6= b, then
ab < (a2 + b2 )/2.
Its certainly not immediately obvious how to approach this. But lets start with
what we want to prove. This is the inequality ab < (a2 + b2 )/2, which can be
rewritten as a2 + b2 2ab > 0. Now, this can be simplified as (a b)2 > 0 and maybe
now you can see why it is true: the given fact that a 6= b means that a b 6= 0 and
hence (a b)2 is a positive number. So we see why the statement is true. To write
down a nice proof, we can now reverse this argument, as follows:
Proof
Since a 6= b, a b 6= 0 and, hence, (a b)2 > 0. But (a b)2 = a2 + b2 2ab. So we
have a2 + b2 > 2ab and, therefore, ab < (a2 + b2 )/2, as required.
There are a few things to note here. First, mathematics is a language and what you
write has to make good sense. Often, it is tempting to make too much use of
symbols rather than words. But the words used in this proof, and the punctuation,
make it easy to read and give it a structure and an argument. You should find
yourself using words like so, hence, therefore, since, because, and so on. Do use
words and punctuation and, whatever you do, do not replace them by symbols of
your own invention! A second thing to note is the use of the symbol . There is
nothing particularly special about this symbol: others could be used. What it
achieves is that it indicates that the proof is finished. There is no need to use such a
symbol, but you will find that textbooks do make much use of symbols to indicate
when proofs have ended. It enables the text to be more readable, with proofs not
running into the main body of the text. Largely, these are matters of style, and you
will develop these as you practise and read the textbooks.

23

2. Mathematical statements, proof, logic and sets

2.9

2.9.1

Sets
Basics

You have probably already met some basic ideas about sets and there is not too much
more to add at this stage, but they are such an important idea in abstract mathematics
that they are worth discussing here.
Loosely speaking, a set may be thought of as a collection of objects. A set is usually
described by listing or describing its members inside curly brackets. For example, when
we write A = {1, 2, 3}, we mean that the objects belonging to the set A are the numbers
1, 2, 3 (or, equivalently, the set A consists of the numbers 1, 2 and 3). Equally (and this
is what we mean by describing its members), this set could have been written as
A = {n | n is a whole number and 1 n 3}.
Here, the symbol | stands for such that. Often, the symbol : is used instead, so that
we might write
A = {n : n is a whole number and 1 n 3}.
When x is an object in a set A, we write x A and say x belongs to A or x is a
member of A. If x is not in A we write x 6 A.
As another example, the set
B = {x N | x is even}
has as its members the set of positive even integers. Here we are specifying the set by
describing the defining property of its members.
Sometimes it is useful to give a constructional description of a set. For example,
C = {n2 | n N} is the set of natural numbers known as the perfect squares.
The set which has no members is called the empty set and is denoted by . The empty
set may seem like a strange concept, but it has its uses.

2.9.2

Subsets

We say that the set S is a subset of the set T , and we write S T , if every member of
S is a member of T . For example, {1, 2, 5} {1, 2, 4, 5, 6, 40}. (Be aware that some
texts use where we use .) What this means is that the statement
xSxT
is true.
A rather obvious, but sometimes useful, observation is that, given two sets A and B,
A = B if and only if A B and B A. So to prove two sets are equal, we can prove
that each of these two containments holds. That might seem clumsy, but it is, in many
cases, the best approach.
For any set A, the empty set, , is a subset of A. You might think this is strange,
because what it means is that every member of is also a member of A. But has no
members! The point, however, is that there is no object in that is not also in A
(because there are no objects at all in ).

24

2.9. Sets

2.9.3

Unions and intersections

Given two sets A and B, the union A B is the set whose members belong to A or B
(or both A and B): that is,
A B = {x | x A or x B}.
Equivalently, to use the notation weve learned,
x A B (x A) (x B).
Example 2.3 If A = {1, 2, 3, 5} and B = {2, 4, 5, 7}, then A B = {1, 2, 3, 4, 5, 7}.
Similarly, we define the intersection A B to be the set whose members belong to both
A and B:
A B = {x | x A and x B}.
So,
x A B (x A) (x B).

2.9.4

Universal sets and complements

Weve been a little informal about what the possible objects in a set might be.
Officially, we always work with respect to some universal set E. For example, if we are
thinking about sets of natural numbers, the universal set (the possible candidates for
membership of the sets we might want to consider) is the set N of all natural numbers.
This might seem like an unnecessary complication, but it is essential. Suppose I tell you
that the set A is the set of all even natural numbers. What are the objects that do not
belong to A? Well, in the context of natural numbers, it is all odd natural numbers. The
context is important (and it is this that is encapsulated in the universal set). Without
that context (or universal set), then there are many other objects that we could say do
not belong to A, such as negative integers, apples, bananas and elephants. (I could go
on, but I hope you get the point!)
Given a universal set E and a subset A of E, the complement of A (sometimes called
the complement of A in E) is denoted by E \ A and is
E \ A = {x E | x 6 A}.
If the universal set is clear, the complement of A is sometimes denoted by A or Ac (with
textbooks differing in their notation).
Suppose A is any subset of E. Because each member of E is either a member of A, or is
not a member of A, it follows that
A (E \ A) = E.

2.9.5

Sets and logic

There are a great many comparisons and analogies between set theory and logic. Using
the shorthand notation for complements, one of the De-Morgan laws of
complementation is that

A B = A B.

25

2. Mathematical statements, proof, logic and sets

This looks a little like the fact (observed in an earlier Activity) that (P Q) is
equivalent to P Q. And this is more than a coincidence. The negation operation,
the conjunction operation, and the disjunction operation on statements behave entirely
in the same way as the complementation, intersection, and union operations (in turn)
on sets. In fact, when you start to prove things about sets, you often end up giving
arguments that are based in logic.
We could argue as follows:
For example, how would we prove that A B = A B?
x A B

x 6 A B
(x A B)
((x A) (x B))
(x A) (x B)
(x B)

(x A)

x A B.

What the result says is, in fact, easy to understand: if x is not in both A and B, then
thats precisely because it fails to be in (at least) one of them.
For two sets A and B (subsets of a universal set E), the complement of B in A, denoted
by A \ B, is the set of objects that belong to A but not to B. That is,
A \ B = {x A | x 6 B}.
Activity 2.4 Prove that A \ B = A (E \ B).

2.9.6

Cartesian products

For sets A and B, the Cartesian product A B is the set of all ordered pairs (a, b),
where a A and b B. For example, if A = B = R then A B = R R is the set of
all ordered pairs of real numbers, usually denoted by R2 .

2.9.7

Power sets

For a set A, the set of all subsets of A, denoted P(A), is called the power set of A. Note
that the power set is a set of sets. For example, if A = {1, 2, 3}, then
P(A) = {, {1}, {2}, {3}, {1, 2}, {1, 3}, {2, 3}, {1, 2, 3}} .
Activity 2.5 Write down the power set of the set A = {1, 2, 3, 4}.
Activity 2.6
Suppose that A has n members, where n N. How many members does P(A) have?

2.10

Quantifiers

We have already met the ideas of universal and existential statements involving natural
numbers. More generally, given any set E, a universal statement on E is one of the form

26

2.10. Quantifiers

for all x E, P (x). This statement is true if P (x) is true for all x in E, and it is false
if there is some x in E (known as a counterexample) such that P (x) is false. We have a
special symbol that is used in universal statements: the symbol means for all. So
the typical universal statement can be written as
x E, P (x).
(The comma is not necessary, but I think it looks better.) An existential statement on E
is one of the form there is x E such that P (x), which is true if there is some x E
for which P (x) is true, and is false if for every x E, P (x) is false. Again, we have a
useful symbol, , meaning there exists. So the typical existential statement can be
written as
x E, P (x).
Here, we have omitted the phrase such that, but this is often included if the statement
reads better with it. For instance, we could write
n N, n2 2n + 1 = 0,
but it would probably be easier to read
n N such that n2 2n + 1 = 0.
Often such that is abbreviated to s.t.. (By the way, this statement is true because
n = 1 satisfies n2 2n + 1 = 0.)
We have seen that the negation of a universal statement is an existential statement and
vice versa. In symbols, (x E, P (x)) is logically equivalent to x E, P (x); and
(x E, P (x)) is logically equivalent to x E, P (x).
With these observations, we can now form the negations of more complex statements.
Consider the statement
n N, m N, m > n.
Activity 2.7 What does the statement n N, m N, m > n mean? Is it true?
What would the negation of the statement be? Lets take it gently. First, notice that
the statement is
n N, (m N, m > n).
The parentheses here do not change the meaning. According to the rules for negation of
universal statements, the negation of this is
n N, (m N, m > n).
But what is (m N, m > n)? According to the rules for negating existential
statements, this is equivalent to m N, (m > n). What is (m > n)? Well, its just
m n. So what we see is that the negation of the initial statement is
n N, m N, m n.
We can put this argument more succinctly, as follows:
(n N(m N, m > n)) n N, (m N, m > n)
n N, m N, (m > n)
n N, m N, m n.

27

2. Mathematical statements, proof, logic and sets

2.10.1

Proof by contradiction

Weve seen a small example of proof by contradiction earlier in the chapter. Suppose
you want to prove P Q. One way to do this is by contradiction. What this means is
that you suppose P is true but Q is false (in other words, that the statement P Q is
false) and you show that, somehow, this leads to a conclusion that you know, definitely,
to be false.
Heres an example.
Example 2.4 There are no integers m, n such that 6m + 8n = 1099.
Proof
To prove this by contradiction, we can argue as follows:
Suppose that integers m, n do exist such that 6m + 8n = 1099. Then since 6 is even,
6n is also even; and, since 8 is even, 8n is even. Hence 6m + 8n, as a sum of two even
numbers, is even. But this means 1099 = 6m + 8n is an even number. But, in fact, it
is not even, so we have a contradiction. It follows that m, n of the type required do
not exist.

This sort of argument can be a bit perplexing when you first meet it. Whats going on
in the example just given? Well, what we show is that if such m, n exist, then something
impossible happens: namely the number 1099 is both even and odd. Well, this cant be.
If supposing something leads to a conclusion you know to be false, then the initial
supposition must be false. So the conclusion is that such integers m, n do not exist.
Probably the most famous proof by contradiction is Eulers proof that there are
infinitely many prime numbers. A prime number is a natural number greater than 1
which is only divisible by 1 and itself. Such numbers have been historically of huge
importance in mathematics, and they are also very useful in a number of important
applications, such as information security. The first few prime numbers are
2, 3, 5, 7, 11, . . . . A natural question is: does this list go on forever, or is there a largest
prime number? In fact, the list goes on forever: there are infinitely many prime
numbers. Well mention this result again later. A full, detailed, understanding of the
proof requires some results well meet later, but you should be able to get the flavour of
it at this stage. So here it is, a very famous result:
There are infinitely many prime numbers.
Proof
(Informally written for the sake of exposition) Suppose not. That is, suppose there are
only a finite number of primes. Then theres a largest one. Lets call it M . Now consider
the number
X = (2 3 5 7 11 M ) + 1,
which is the product of all the prime numbers (2 up to M ), with 1 added. Notice that
X > M , so X is not a prime (because M is the largest prime). If a number X is not
prime, that means that it has a divisor p that is a prime number and which satisfies
1 < p < X. [This is the key observation: we havent seen this yet, but we will later.] But
p must therefore be one of the numbers 2, 3, 5, . . . , M . However, X is not divisible by any

28

2.11. Some terminology

of these numbers, because it has remainder 1 when divided by any of them. So we have
reached a contradiction: on the one hand, X must be divisible by one of these primes,
and on the other, it is not. So the initial supposition that there were not infinitely many
primes simply must be wrong. We conclude there are infinitely many primes.
This proof has been written in a fairly informal and leisurely way to help explain whats
happening. It could all be written more succinctly.

2.11

Some terminology

At this point, its probably worth introducing some important terminology. When, in
Mathematics, we prove a true statement, we often say we are proving a Theorem, or a
Proposition. (Usually the word Proposition is used if the statement does not seem
quite so significant as to merit the description Theorem.) A theorem that is a
preliminary result leading up to a Theorem is often called a Lemma, and a minor
theorem that is a fairly direct consequence of, or special case of, a theorem is called a
Corollary, if it is not significant enough itself to merit the title Theorem. For your
purposes, it is important just to know that these words all mean true mathematical
statements. You should realise that these terms are used subjectively: for instance, the
person writing the mathematics has to make a decision about whether a particular
result merits the title Theorem or is, instead, merely to be called a Proposition.

2.12

General advice

2.12.1

Introduction

Proving things is difficult. Inevitably, when you read a proof, in the textbooks or in
these notes, you will ask How did the writer know to do that? and you will often find
you asking yourself How can I even begin to prove this?. This is perfectly normal. This
is where the key difference between abstract mathematics and more methods-based
mathematics lies. If you are asked to differentiate a function, you just go ahead and do
it. It might be technically difficult in some cases, but there is no doubt about what
approaches you should use. But proving something is more difficult. You might try to
prove it, and fail. Thats fine: what you should do in that case is try another attack.
Keep trying until you crack it. (I suppose this is a little bit like integration. Youll know
that there are various methods, but you dont necessarily know which will work on a
particular integral, so you should try one, and keep trying until you manage to find the
integral.) Abstract mathematics should always be done with a large pile of scrap paper
at your disposal. You are unlikely to be able to write down a perfect solution to a
problem straightaway: some scratching around to get a feel for whats going on might
well be needed, and some false starts might be pursued first. If you expect to be able to
envisage a perfect solution in your head and then write it down perfectly, you are
placing too much pressure on yourself. Abstract mathematics is simply not done like
that.

29

2. Mathematical statements, proof, logic and sets

In this chapter I have tried to indicate that there are methodical approaches to proof
(such as proof by contradiction, for example). What you have to always be able to do is
to understand precisely what it is that you have to prove. That sounds obvious, but it is
something the importance of which is often underestimated. Once you understand what
you need to show (and, here, working backwards a little from that end-point might be
helpful, as weve seen), then you have to try to show it. And you must know when you
have done so! So it is inevitable that you will have to take a little time to think about
what is required: you cannot simply dive in like you might to a differentiation question.
All this becomes much easier as you practise it. You should attempt problems from the
textbooks (and also the problems below). Problems are a valuable resource and you are
squandering this resource if you simply turn to the answers (should these be available).
It is one thing to agree with an answer, or to understand a proof, but it is quite a
different thing to come up with a proof yourself. There is no point in looking at the
answer before you have tried hard yourself to answer the problem. By trying (and
possibly failing), you will learn more than simply by reading answers. Examination
questions will be different from problems you have seen, so there is no point at all in
learning answers. You need to understand how to approach problems and how to
answer them for yourself.

2.12.2

How to write mathematics

You should write mathematics in English!! You shouldnt think that writing
mathematics is just using formulae. A good way to see if your writing makes sense is by
reading it aloud (where you should only read what you really have written, not adding
extra words). If it sounds like nonsense, a sequence of loose statements with no obvious
relations, then you probably need to write it again.
Dont use more symbols than necessary.
Since many people seem to think that mathematics involves writing formulae, they
often use symbols to replace normal English words. An eternal favourite is the double
arrow = to indicate that one thing follows from the other. As in :
x2 = 1

x = 1 or x = 1.

This is not only pure laziness, since its just as easy to write :
x2 = 1, hence x = 1 or x = 1.
But it is even probably not what was meant! The implication arrow = has a logical
meaning if . . . , then . . . . So if you write x2 = 1 = x = 1 or x = 1, then that
really means if x2 = 1, then x = 1 or x = 1. And hence this gives no real information
about what x is. On the other hand, writing
I know x2 = 1, hence x = 1 or x = 1,
means that now we know x = 1 or x = 1 and can use that knowledge in what follows.
Some other unnecessary symbols that are sometimes used are and . They mean
something like therefore/hence and since/because. It is best not to use them, but to
write the word instead. It makes things so much easier to read.

30

2.12. General advice

Provide all the information required.


A good habit is to start by writing what information is given and what question needs
to be answered. For instance, suppose you are asked to prove the following :
For any natural numbers a, b, c with c 2, there is a natural number n such that
an2 + bn + c is not a prime.
A good start to an answer would be :
Given : natural numbers a, b, c, with c 2.
To prove : there is a natural number n such that an2 + bn + c is not a prime.
At this point you (and any future reader) has all the information required, and you can
start thinking what really needs to be done.

2.12.3

How to do mathematics

In a few words : by trying and by doing it yourself !!


Try hard
The kind of questions you will be dealing with in this subject often have no obvious
answers. There is no standard method to come to an answer. That means that you have
to find out what to do yourself. And the only way of doing that is by trial and error.
So once you know what you are asked to do (plus all the information you were given),
the next thing is to take a piece of paper and start writing down some possible next
steps. Some of them may look promising, so have a better look at those and see if they
will help you. Hopefully, after some (or a lot) of trying, you see how to answer the
question. Then you can go back to writing down the answer. This rough working is a
vital part of the process of answering a question (and, in an examination, you should
make sure your working is shown). Once you have completed this part of the process,
you will then be in a position to write the final answer in a concise form indicating the
flow of the reasoning and the arguments used.
Keep trying
You must get used to the situation that not every question can be answered
immediately. Sometimes you immediately see what to do and how to do it. But other
times you will realise that after a long time you havent got any further.
Dont get frustrated when that happens. Put the problem aside, and try to do another
question (or do something else). Look back at the question later or another day, and see
if it makes more sense then. Often the answer will come to you as some kind of ah-ha
flash. But you cant force these flashes. Spending more time improves the chances they
happen, though.
Finally, if you need a long time to answer certain questions, you can consider yourself in
good company. For the problem known as Fermats Last Theorem, the time between
when the problem was first formulated and when the answer was found was about 250
years !

31

2. Mathematical statements, proof, logic and sets

Do it yourself
Here is a possible answer to the previous example: Given : natural numbers a, b, c, with
c 2.
To prove : there is a natural number n such that an2 + bn + c is not a prime.
By definition (see page 70 of Biggs book, or the footnote on page 4 of Eccles book), a
number p is prime if p 2 and the only divisors of p are 1 and p itself.
Hence to prove : there is a natural number n for which an2 + bn + c is smaller than 2
or it has divisors other than 1 or itself.
Lets take n = c. Then we have an2 + bn + c = ac2 + bc + c.
But we can write ac2 + bc + c = c (ac + b + 1), which shows that ac2 + bc + c has c and
ac + b + 1 as divisors.
Moreover, its easy to see that neither c nor ac + b + 1 can be equal to 1 or to ac2 + bc + c.
Weve found a value of n for which an2 + bn + c has divisors other than 1 or itself.
The crucial step in the answer above is the one in which I choose to take n = c. Why
did I choose that? Because it works. How did I get the idea to take n = c? Ah, thats far
less obvious. Probably some rough paper and lots of trying was involved. In the final
answer, no information about how this clever idea was found needs to be given.
You probably have no problems following the reasoning given above, and hence you may
think that you understand this problem. But being able to understand the answer, and
being able to find the answer yourself are two completely different matters. And it
is the second skill you are supposed to acquire in this course. (And hence the skill that
will be tested in the examination.) Once you have learnt how to approach questions
such as the above and come up with the clever trick yourself, you have some hope of
being able to answer other questions of a similar type.
But if you only study answers, you will probably never be able to find new arguments
for yourself. And hence when you are given a question youve never seen before, how
can you trust yourself that you have the ability to see the trick that that particular
question requires ?
For many, abstract mathematics seems full of clever tricks. But these tricks have
always been found by people working very hard to get such a clever idea, not by people
just studying other problems and the tricks found by other people.

2.12.4

How to become better in mathematics

One thing you might consider is doing more questions. The books are a good source of
exercises. Trying some of these will give you extra practice.
But if you want to go beyond just being able to do what somebody else has written
down, you must try to explore the material even further. Try to understand the reason
for things that are perhaps not explicitly asked.
As an illustration of thinking that way, look again at the formulation of the example we
looked at before:
For any natural numbers a, b, c with c 2, there is a natural number n such that
an2 + bn + c is not a prime.

32

2.12. Learning outcomes

Why is it so important that c 2 ? If you look at the proof in the previous section, you
see that that proof goes wrong if c = 1. (Since we want to use that c is a divisor
different from 1.) Does that mean the statement is wrong if c = 1 ? (No, but a different
proof is required.)
And what happens if we allow one or more of a, b, c to be zero or negative?
And what about more complicated expression such as an3 + bn2 + cn + d for some
numbers a, b, c, d with d 2 ? Could it be possible that there is an expression like this
for which all n give prime numbers? If you found the answer to the original question
yourself, then you probably immediately see that the answer has to be no, since
similar arguments as before work. But if you didnt try the original question yourself,
and just studied the ready-made answer, youll be less well equipped to answer more
general or slightly altered versions.
Once you start thinking like this, you are developing the skills required to be good in
mathematics. Trying to see beyond what is asked, asking yourself new questions and
seeing which you can answer, is the best way to train yourself to become a
mathematician.

Learning outcomes
At the end of this chapter and the Essential reading and activities, you should be able
to:
demonstrate an understanding of what mathematical statements are
prove whether mathematical statements are true or false
negate statements, including universal statements and existential statements
construct truth tables for logical statements
use truth tables to determine whether logical statements are logically equivalent or
not
demonstrate knowledge of what is meant by conjunction and disjunction
demonstrate understanding of the meaning of if-then statements and be able to
prove or disprove such statements
demonstrate understanding of the meaning of if and only if statements and be
able to prove or disprove such statements
find the converse and contrapositive of statements
prove statements by proving their contrapositive
prove results by various methods, including directly, by the method of proof by
contradiction, and by working backwards
demonstrate understanding of the key ideas and notations concerning sets
prove results about sets
use existential and universal quantifiers
be able to negate statements involving several different quantifiers.

33

2. Mathematical statements, proof, logic and sets

Sample examination questions

Question 2.1
Is the following statement about natural numbers n true or false? Justify your answer
by giving a proof or a counterexample:
If n is divisible by 6 then n is divisible by 3.
What are the converse and contrapositive of this statement? Is the converse true? Is the
contrapositive true?
Question 2.2
Is the following statement about natural numbers n true or false? Justify your answer
by giving a proof or a counterexample:
If n is divisible by 2 then n is divisible by 4.
What are the converse and contrapositive of this statement? Is the converse true? Is the
contrapositive true?
Question 2.3
Prove that (P Q) and P Q are logically equivalent.
Question 2.4
Prove that the negation of P Q is P Q.
Question 2.5
Construct the truth tables for P (Q R) and (P Q) (P R). Are these two
statements logically equivalent?
Question 2.6
Suppose P, Q, R are three statements. Show that (P Q) R and P (Q R) are
not logically equivalent.
Question 2.7
Prove that for all real numbers a, b, c, ab + ac + bc a2 + b2 + c2 .
Question 2.8
Prove by contradiction that there is no largest natural number.
Question 2.9
Prove that there is no smallest positive real number.

34

2.12. Comments on selected activities

Question 2.10
Suppose A and B are subsets of a universal set E. Prove that

(E E) \ (A B) = ((E \ A) E) (E (E \ B)).

Question 2.11
Suppose that P (x, y) is a predicate involving two free variables x, y from a set E. (So,
for given x and y, P (x, y) is either true or false.) Find the negation of the statement
x E, y E, P (x, y)

Comments on selected activities


Feedback to activity 2.1
We can do this by constructing a truth table. Consider Table 2.8. This proves that
(P Q) and P Q are equivalent.
P
T
T
F
F

Q
T
F
T
F

P Q
T
F
F
F

(P Q)
F
T
T
T

P
F
F
T
T

Q
F
T
F
T

P Q
F
T
T
T

Table 2.8: The truth tables for (P Q) and P Q

Feedback to activity 2.2


The converse is if n divides 12 then n divides 4. This is false. For instance, n = 12 is a
counterexample. This is because 12 divides 12, but it does not divide 4. The original
statement is true, however. For, if n divides 4, then for some m Z, 4 = nm and hence
12 = 3 4 = 3nm = n(3m), which shows that n divides 12.
Feedback to activity 2.3
We have
x A \ B (x A) (x 6 B)
(x A) (x E \ B)
x A (E \ B).
Feedback to activity 2.4
P(A) is the set consisting of the following sets:
, {1}, {2}, {3}, {4}, {1, 2}, {1, 3}, {1, 4}, {2, 3}, {2, 4}, {3, 4},
{1, 2, 3}, {2, 3, 4}, {1, 3, 4}, {1, 2, 4}, {1, 2, 3, 4}.

35

2. Mathematical statements, proof, logic and sets

Feedback to activity 2.5


The members of P(A) are all the subsets of A. A subset S is determined by which of
the n members of A it contains. For each member x of A, either x S or x 6 S. There
are therefore two possibilities, for each x A. It follows that the number of subsets is
2 2 2 (where there are n factors, one for each element of A). Therefore P(A)
has 2n members.
Feedback to activity 2.6
The statement means that if we take any natural number n there will be some natural
number m greater than n. Well, this is true. For example, m = n + 1 will do.

Sketch answers to or comments on sample questions


Answer to question 2.1
The statement is true. For, suppose n is divisible by 6. Then for some m N, n = 6m,
so n = 3(2m) and since 2m N, this proves that n is divisible by 3.
The converse is If n is divisible by 3 then n is divisible by 6. This is false. For example,
n = 3 is a counterexample: it is divisible by 3, but not by 6.
The contrapositive is If n is not divisible by 3 then n is not divisible by 6. This is true,
because it is logically equivalent to the initial statement, which we have proved to be
true.
Answer to question 2.2
The statement is false. For example, n = 2 is a counterexample: it is divisible by 2, but
not by 4.
The converse is If n is divisible by 4 then n is divisible by 2. This is true. For, suppose
n is divisible by 4. Then for some m N, n = 4m, so n = 2(2m) and since 2m N, this
proves that n is divisible by 2.
The contrapositive is If n is not divisible by 4 then n is not divisible by 2. This is false,
because it is logically equivalent to the initial statement, which we have proved to be
false. Alternatively, you can see that its false because 2 is a counterexample: it is not
divisible by 4, but it is divisible by 2.
Answer to question 2.3
This can be established by using the truth table constructed in Activity 2.2. See the
solution above.
Answer to question 2.4
This is established by Table 2.6. That table shows that (P Q) is logically equivalent
to P Q. This is the same as saying that the negation of P Q is P Q.
Answer to question 2.5
Table 2.9 is the truth table for P (Q R) and Table 2.10 is the table for
(P Q) (P R). In all cases, these give the same truth value, so the statements are
logically equivalent.

36

2.12. Sketch answers to or comments on sample questions

P
T
T
T
T
F
F
F
F

Q
T
T
F
F
T
T
F
F

R
T
F
T
F
T
F
T
F

QR
T
F
F
F
T
F
F
F

P (Q R)
T
F
F
F
T
T
T
T

Table 2.9: The truth table for P (Q R).

P
T
T
T
T
F
F
F
F

Q
T
T
F
F
T
T
F
F

R
T
F
T
F
T
F
T
F

P Q
T
T
F
F
T
T
T
T

P R
T
F
T
F
T
T
T
T

(P Q) (P R)
T
F
F
F
T
T
T
T

Table 2.10: The truth table for (P Q) (P R).

Answer to question 2.6


We can show that (P Q) R and P (Q R) are not logically equivalent by
constructing their truth tables and showing that in some cases, they give different truth
values. See Table 2.6. This shows, for example, that if P is F , Q is T and R is F , then
the two statements take different logical values. For, in this case, P Q is T and
(P Q) R is F . On the other hand, P (Q R) is T .
P
T
T
T
T
F
F
F
F

Q
T
T
F
F
T
T
F
F

R
T
F
T
F
T
F
T
F

P Q
T
T
F
F
T
T
T
T

QR
T
F
T
T
T
F
T
T

(P Q) R
T
F
T
T
T
F
T
F

P (Q R)
T
F
T
T
T
T
T
T

Table 2.11: The truth table for Question 2.6

37

2. Mathematical statements, proof, logic and sets

Answer to question 2.7

We work backwards, since it is not immediately obvious how to begin. We note that
what were trying to prove is equivalent to
a2 + b2 + c2 ab ac bc 0.
This is equivalent to
2a2 + 2b2 + 2c2 2ab 2ac 2bc 0,
which is the same as
(a2 2ab + b2 ) + (b2 2bc + c2 ) + (a2 2ac + c2 ) 0.
You can perhaps now see how this is going to work, for (a2 2ab + b2 ) = (a b)2 and so
on. Therefore the given inequality is equivalent to
(a b)2 + (b c)2 + (a c)2 0.
We know this to be true because squares are always non-negative. If we wanted to write
this proof forwards we might argue as follows. For any a, b, c, (a b)2 0, (b c)2 0
and (a c)2 0, so
(a b)2 + (b c)2 + (a c)2 0
and hence
2a2 + 2b2 + 2c2 2ab 2ac 2bc 0,
from which we obtain
a2 + b2 + c2 ab + ac + bc,
as required.
Answer to question 2.8
Lets prove by contradiction that there is no largest natural number. So suppose there is
a largest natural number. Let us call it N . (What we want to do now is somehow show
that a conclusion, or something we know for sure must be false, follows.) Well, consider
the number N + 1. This is a natural number. But since N is the largest natural number,
we must have N + 1 N , which means that 1 0, and thats nonsense. So it follows
that we must have been wrong in supposing there is a largest natural number. (Thats
the only place in this argument where we could have gone wrong.) So there is no largest
natural number. We could have argued the contradiction slightly differently. Instead of
using the fact that N + 1 N to obtain the absurd statement that 1 0, we could
have argued as follows: N + 1 is a natural number. But N + 1 > N and this contradicts
the fact that N is the largest natural number.
Answer to question 2.9
We use a proof by contradiction. Suppose that there is a smallest positive real number
and lets call this r. Then r/2 is also a real number and r/2 > 0 because r > 0. But
r/2 < r, contradicting the fact that r is the smallest positive real number. (Or, we could
argue: because r/2 is a positive real number and r is the smallest such number, then we
must have r/2 r, from which it follows that 1 2, a contradiction.)

38

2.12. Sketch answers to or comments on sample questions

Answer to question 2.10


We need to prove that

(E E) \ (A B) = ((E \ A) E) (E (E \ B)).
Now,
(x, y) (E E) \ (A B)

((x, y) A B)
((x A) (y B))
(x A) (y B)
(x E \ A) (y E \ B)
((x, y) (E \ A) E) ((x, y) E (E \ B))
(x, y) ((E \ A) E) (E (E \ B)).

Answer to question 2.11


We deal first with the existential quantifier at the beginning of the statement. So, the
negation of the statement is
x E, (y E, P (x, y))
which is the same as
x E, y E, P (x, y).

39

2. Mathematical statements, proof, logic and sets

40

Chapter 3
Natural numbers and proof by
induction

Essential reading

RR

One or both of the following:

3.1

Biggs, N. L. Discrete Mathematics. Chapters 13.


Eccles, P.J. An Introduction to Mathematical Reasoning. Chapters 14 and 6.

Introduction

This chapter explores some of the properties of the natural numbers. These will not be
new to you, but they shall be explained a little more formally. The chapter also studies
a very powerful proof method, known as proof by induction. This enables us to prove
many universal statements about natural numbers that would be extremely difficult to
prove by other means.

3.2

Natural numbers: an axiomatic approach

This section is included for the sake of completeness. It is rather more abstract than the
rest of the subject. Do not worry if you find it challenging. The main matter of this
chapter is the method of proof by induction.
You know that the natural numbers are the positive integers, and you are certainly very
comfortable with them. But suppose an alien visited you and you had to explain to him
or her (or it, I suppose) what the natural numbers were. How would you do this? Well,
you could describe them by the properties they have. (To paraphrase Biggs, we describe
what they do rather than what they are.) We have the following axioms for the
natural numbers. (An axiom is a statement that is assumed to be true.)
1. For all a, b N we have a + b N. [Closure under Addition]
2. For all a, b N we have a b N. [Closure under Multiplication]
3. For all a, b N we have a + b = b + a. [Commutative Law for addition]
4. For all a, b, c N we have (a + b) + c = a + (b + c). [Associative Law for addition]
5. For all a, b N we have a b = b a. [Commutative Law for Multiplication]

41

3. Natural numbers and proof by induction

6. For all a, b, c N we have (a b) c = a (b c). [Associative Law for


Multiplication]
7. There is a special element of N, denoted by 1, which has the property that for all
n N we have n 1 = n.

8. For all a, b, c N, if a + c = b + c, then a = b. [Additive cancellation]


9. For all a, b, c N, if a c = b c, then a = b. [Multiplicative Cancellation]
10. For all a, b, c N we have a (b + c) = (a b) + (a c). [Distributive Law]
11. For all a, b N, a < b if and only if there is some c N with a + c = b.
12. For all a, b N, exactly one of the following is true: a = b, a < b, b < a.
We also will write a b for a b.
Other properties of the natural numbers follow from these axioms. (That is, they can be
proved assuming these axioms.)
For example, we can prove the following.
P1. For all a, b, c N, (a + b) c = (a c) + (b c).
P2. If a, b N satisfy a b = a, then b = 1.
P3. For a, b, c N, if a < b and b < c, then a < c. [Transitivity]
P4. For a, b, c N, if a < b then a + c < b + c and a c < b c.
P5. 1 is the least element of N.
P6. 1 is not equal to 1+1.
We dont need to add these to the axioms because they follow from the axioms we
already have. (We can prove them just from the axioms above.) Well come back to
this later in the chapter.
We have 1 + 1 6= 1. We will use the symbol 2 to represent 1 + 1. It can also be shown
that 1 + 1 + 1 is not equal to 1 or to 1 + 1. Well denote it by 3. And so on. . . .
You might find this all a bit weird. We already knew, after all, that 1 + 1 = 2 and have
done for some time!
But the point is that we can define the natural numbers axiomatically by a set of rules
or axioms and everything else about them follows from those axioms.

3.3

Least and greatest members and the


well-ordering principle

If S is a subset of N, then l is a least member or least element of S if l S and, for all


s S, l s. The natural number g is a greatest member of S if g S and, for all s S,
g s.
Its quite obvious that any non-empty set of natural numbers has a least member, but
it does not follow from the axioms given above. We therefore take this as an additional

42

3.4. The principle of induction

axiom of the set of natural numbers, to be added to those above. It is known as the
well-ordering principle or Least Element Axiom.
Well-ordering principle (or Least Element Axiom): (Axiom 13 for the natural
numbers): Every non-empty subset of N has a least element.
This is such an important property of natural numbers (not shared by sets of real
numbers, for instance).

Activity 3.1 Think of a set of real numbers that has no least member.

3.4
3.4.1

The principle of induction


Proof by induction

One particularly useful principle that follows from the axioms of the natural numbers
given above is the following one, known as the Induction principle. We can, in fact, take
the Induction Principle as one of the axioms of the natural numbers, in place of the
well-ordering principle: the two are equivalent. (See Eccles, Section 11.2.) This is, in
fact, the approach taken by Biggs. But we shall view the Induction Principle as a
consequence of the well-ordering principle along with the other axioms for the natural
numbers. (See the end of this chapter for more on this.)
The Induction Principle
Suppose P (n) is a statement involving natural numbers n. Then P (n) is true for all
n N if the following two statements are true:
(i) P (1) is true;
(ii) For all k N, P (k) P (k + 1).

Suppose you want to prove n N, P (n). Suppose you can prove two things:
P (1) is true, and k N, P (k) P (k + 1).
Then: because P (1) P (2) and P (1) is true, we must have that P (2) is true. Then,
because P (2) P (3) and P (2) is true, we must have P (3) is true, and so on. So what
you see is that this establishes the universal statement that for every n N, P (n) is
true.

3.4.2

An example

Heres an example of how we might prove by induction a result we proved directly


earlier, in the previous chapter, namely:
n N, n2 + n is even.

43

3. Natural numbers and proof by induction

Let P (n) be the statement n2 + n is even. Then P (1) is true, because 12 + 1 = 2, and
this is even. (Establishing P (1) is known as proving the base case or the induction
basis.) Next we show that P (k) P (k + 1) for any k N. So we show that if P (k) is
true, so will be P (k + 1). To do this we assume that P (k) is true and show that
P (k + 1) is then also true. (The assumption that P (k) is true is known as the inductive
hypothesis.) So suppose P (k) is true, which means that k 2 + k is even. What we need to
do now is show that this means that P (k + 1) is also true, namely that
(k + 1)2 + (k + 1) is even. So we need somehow to relate the expression
(k + 1)2 + (k + 1) to the one we are assuming we know something about, k 2 + k. Well,
(k + 1)2 + (k + 1) = k 2 + 2k + 1 + k + 1 = (k 2 + k) + (2k + 2).
Now, by the inductive hypothesis (the assumption that P (k) is true), k 2 + k is even.
But 2k + 2 = 2(k + 1) is also even, so (k + 1)2 + (k + 1) is an even number, in other
words P (k + 1) is true. So we have shown that k, P (k) P (k + 1). It now follows, by
the Principle of Induction, that for all n N, P (n) is true.
Once we get used to this technique, we can make our proofs more succinct.
The basic way of proving a result n N, P (n) by induction is as follows:
[The Base Case] Prove P (1) is true.
[The Induction Step] Prove that, for any k N, assuming P (k) is true (the
inductive hypothesis), then P (k + 1) is also true.
And thats all you need to do! The principle of induction then establishes that P (n) is
true for all n N.

3.4.3

Variants

Suppose N is some particular natural number and that P (n) is a statement involving
natural numbers n. Then P (n) is true for all n N if the following two statements are
true:
(i) P (N ) is true;
(ii) For all k N, k N , P (k) P (k + 1).
This is a version of the Induction Principle obtained from the standard one by
changing the base case. It can be used to prove a result like the following:
n 4, n2 2n .
(The inequality n2 2n is false when n = 3, so it does not hold for all n N.)
Activity 3.2 Prove that n 4, n2 2n .
Another variant of the Induction Principle is the following, known as the Strong
Induction Principle:

44

3.5. Summation formulae

The Strong Induction Principle


Suppose P (n) is a statement involving natural numbers n. Then P (n) is true for all
n N if the following two statements are true:
(i) P (1) is true;

(ii) For all k N, (P (s) true s k) P (k + 1).

The name is misleading, because, in fact, the strong induction principle follows from the
standard induction principle.
Activity 3.3 Try to understand why the strong induction principle follows from the
induction principle. Hint: consider Q(n), the statement s n, P (s) is true. [This
is difficult, so you may want to omit this activity at first.]
The strong induction principle is, as we shall see, often useful when it comes to proving
results about sequences that are defined recursively.

3.5

Summation formulae

Suppose
Pn a1 , a2 , a3 , . . . is a sequence (an infinite, ordered, list) of real numbers. Then the
sum r=1 ar is the sum of the first n numbers in the sequence. It is useful to define
these sums recursively or by induction, as follows:
1
X

ar = a1

and

for n N,

r=1

n+1
X

ar =

r=1

n
X

!
ar

+ an+1 .

r=1

With this observation, we can use proof by induction to prove many results about the
values and properties of such sums. Here is a simple, classical, example.
Example 3.1 For all n N,
n
X

1
r = n(n + 1).
2
r=1

This is simply the statement that the sum of the first n natural numbers is
n(n + 1)/2.
Proof
P
We prove the result by induction. Let P (n) be the statement that nr=1 r = 21 n(n + 1).
Then P (1) states that 1 = 1, which is true. So the base case P (1) is true. Now lets do
the induction step. Suppose that k N and that (the inductive hypothesis)

45

3. Natural numbers and proof by induction

Pk

r=1

r = 21 k(k + 1) holds. Consider


k+1
X

r =

r=1

k
X

Pk+1
r=1

r. We have

r + (k + 1)

r=1

=
=
=
=
=

1
k(k + 1) + (k + 1) by the induction hypothesis
2
1 2
(k + k + 2k + 2)
2
1 2
(k + 3k + 2)
2
1
(k + 1)(k + 2)
2
1
(k + 1)((k + 1) + 1).
2

This
Pk+1establishes that P (k + 1) is true (for P (k + 1) is precisely the statement that
r = (k + 1)((k + 1) + 1)/2.) Therefore, by induction, for all n N,
Pr=1
n
1
r=1 r = 2 n(n + 1).
Note how the the induction hypothesis was used. In the induction step, you always prove
P (k + 1) to be true assuming P (k) is. (Unless you do so, it isnt a proof by induction.)

Activity 3.4 Prove by induction that the sum of the first n terms of an arithmetic
progression with first term a and common difference d is
n(2a + (n 1)d)
.
2

3.6

Recursively defined sequences

Sequences of numbers are often defined recursively or by induction.


For example, suppose that the sequence xn is given by x1 = 9, x2 = 13 and, for n 3,
xn = 3xn1 2xn2 . We can prove by induction (using the strong induction principle)
that, for all n N, xn = 5 + 2n+1 . Heres how:
Since the inductive definition for xn only applies for n 3, the case step of our proof is
to verify the result for the cases n = 1 and n = 2. Now, when n = 1, 5 + 2n+1 = 9, which
is indeed x1 ; and when n = 2, 5 + 2n+1 = 13, which equals x2 , so these hold. Assume
inductively that k N and that, for all s k, xs = 5 + 2s+1 . (Note that, here, we use
strong induction. This is because xk+1 depends not only on xk but on xk1 too.) In

46

3.7. Using the axioms for the natural numbers

particular, therefore, we have xk = 5 + 2k+1 and xk1 = 5 + 2k . So,


xk+1 = 3xk 2xk1
= 3(5 + 2k+1 ) 2(5 + 2k )
= 15 + 3(2k+1 ) 10 2(2k )

= 5 + 3(2k+1 ) 2(2k )
= 5 + 6(2k ) 2(2k )
= 5 + 4(2k )
= 5 + 2k+2
= 5 + 2(k+1)+1 ,
which is exactly what we need. So the formula for xn holds for all n.

3.7

Using the axioms for the natural numbers

Earlier, we said that the following results follow from the axioms for N.
P1. For all a, b, c N, (a + b) c = (a c) + (b c).
P2. If a, b N satisfy a b = a, then b = 1.
P3. For a, b, c N, if a < b and b < c, then a < c. [Transitivity]
P4. For a, b, c N, if a < b then a + c < b + c and a c < b c.
P5. 1 is the least element of N.
P6. 1 is not equal to 1+1.
Lets see why.
Proof of P1. For all a, b, c N, (a + b) c = (a c) + (b c).
(a + b) c = c (a + b)

(by 5 [Commutative])

= (c a) + (c b)

(10 [Distributive])

= (a c) + (b c)

(by 5 [Commutative])

Proof of P2. If a, b N satisfy a b = a, then b = 1.


Suppose a b = a. Then, since (by 7), a = a 1, so
a b = a 1.
By 5 [Commutative],
b a = 1 a.
Then, by 9 [Cancellation],
b = 1.

47

3. Natural numbers and proof by induction

Proof of P3. For a, b, c N, if a < b and b < c, then a < c. [Transitivity]


If a < b and b < c then, by 11, there are x, y N such that a + x = b and b + y = c.
Then,
a + (x + y) = (a + x) + y by 4 [Associativity]
=b+y =c

and so, by 11, together with the fact (by Closure, 1) that x + y N , a < c.
Proof of P4. For a, b, c N, if a < b then a + c < b + c and a c < b c.
If a < b then (by 11) this means d N with a + d = b.
1. We prove a + c < b + c. We have:
(a + c) + d = d + (a + c)

(by 3)

= (d + a) + c

(by 4)

= (a + d) + c

(by 3)

= b + c.
This shows a + c < b + c.
2. We prove a c < b c. We have:
(a c) + (d c) = (a + d) c.
This is P1 from above.
So, since a + d = b, we have (a c) + (d c) = b c.
Since d c N (by 2), we have a natural number z (z = d c) such that
(a c) + z = (b c). So, by 11, a c < b c.
Proof of P5. 1 is the least element of N.
Certainly, by 13 [Well-ordering], N has a least member. Call it a. Suppose a 6= 1. Axiom
12 says a < 1 or a = 1 or 1 < a. We are assuming a 6= 1 and cant have 1 < a by
minimality of a. So a < 1. By P4,
a a < 1 a.
But (Commutativity, 5), 1 a = a 1 = a (by 7). So:
a a < a.
But a a N (by Closure, 2) and this contradicts the minimality of a.
Proof of P6. 1 is not equal to 1+1.
11 says a < b if and only if there is some c N with a + c = b. So 1 < 1 + 1.
But 12 says that for all a, b N, exactly one of the following is true: a = b, a < b, b < a.
So we do not have 1 = 1 + 1.

48

3.8. Why the Principle works

3.8

Why the Principle works

We can now prove the Principle of Induction from our axioms for the natural numbers
(including the least element axiom).
Theorem 3.1 (The Induction Principle) Suppose

(i) P (1) is true;


(ii) For all k N, P (k) P (k + 1).
Then P (n) is true for all n N.
Proof
Suppose its not the case that P (n) is true for all n N. Then the set S of n N for
which it is not true is non-empty and, by the Least Element Axiom (13), has a least
member a. Now, a 6= 1 because P (1) is true. And we cant have a < 1 because (P5) 1 is
the least member of N. So, by 12, 1 < a. Consider a 1: this is a natural number less
than a and is therefore not in S. So P (a 1) is true. But since P (k) P (k + 1) for all
k, it follows that P (a) is true, meaning a 6 S, a contradiction.
You might not be entirely satisfied with that proof. It used a 1 but we havent defined
subtraction! Heres another way of explaining the last bit:
Since 1 < a, there is some c N such that 1 + c = a (by Axiom 11). By Axiom 5
(Commutativity), c + 1 = a, which, by 11, means c < a. So, because a is the least
element of S, c 6 S and hence P (c) is true. But P (c) P (c + 1) and hence P (a) is
true, a contradiction.

Learning outcomes
At the end of this chapter and the Essential reading and activities, you should be able
to:
understand how the natural numbers can be defined by axioms and understand
that other properties of natural numbers can be proved from the axioms. (You are
not expected to remember all the axioms: you merely need to know that an
axiomatic approach to the natural numbers can be undertaken. You should not
anticipate examination questions involving extensive use of the axioms. This
material is included in this chapter for the sake of completeness of exposition.)
state what is meant by a greatest and least member of a set of natural numbers
and state what is meant by the well-ordering principle (or least element axiom)
state the Induction Principle and its variants
use Proof by Induction to prove a range of statements, including those involving
summation and recursive sequences.

49

3. Natural numbers and proof by induction

Sample examination questions


Question 3.1
Prove by induction that, for all n N, 2n n + 1.

Question 3.2
Prove by induction that the sum a + ar + ar2 + + arn1 of the first n terms of a
geometric progression with first term a and common ratio r 6= 1 is a(1 rn )/(1 r).
Question 3.3
Prove by induction that for all n N,
n
X
1
r2 = n(n + 1)(2n + 1).
6
r=1
Question 3.4
Prove by induction that for all n N,
n
X
i=1

n
1
=
.
i(i + 1)
n+1

Question 3.5
Suppose the sequence xn is given by x1 = 7, x2 = 23 and, for n 3,
xn = 5xn1 6xn2 . Prove by induction that, for all n N, xn = 3n+1 2n .
Question 3.6
Prove by induction that, for all n N, 2n+2 + 32n+1 is divisible by 7.
Question 3.7
Q
For a sequence of numbers x1 , x2 , x3 , . . . , and for n N, the number nr=1 xr is the
product of the first r numbers of the sequence. It can be defined inductively as follows:
!
1
k+1
k
Y
Y
Y
xr = x1 , and for k 1,
xr =
xr xk+1 .
r=1

r=1

r=1

Suppose that x 6= 1. Prove that


n
n
Y
1 x2
2r1
(1 + x
)=
.
1x
r=1

Comments on selected activities


Feedback to activity 3.2
When n = 4, n2 = 16 and 2n = 24 = 16, so in this base case, the statement is true.
Suppose we make the inductive hypothesis that for some k 4, k 2 2k . We want to

50

3.8. Comments on selected activities

show
(k + 1)2 2k+1 .
We have
(k + 1)2 = k 2 + 2k + 1 2k + 2k + 1
(by the inductive hypothesis). So well be done if we can show that 2k + 1 2k . This
will follow from 2k + 1 k 2 and the assumed fact that k 2 2k . Now,
2k + 1 k 2 k 2 2k 1 0 (k 1)2 2,
which is true for k 4. So, finally,
(k + 1)2 2k + 2k + 1 2k + k 2 2k + 2k = 2k+1 ,
as required. So the result is true for all n 4.
Feedback to activity 3.3
Let Q(n) be the statement s n, P (s) is true. Then Q(1) is true if and only if P (1)
is true. The statement Q(k) Q(k + 1) is the same as
(P (s) true s k) (P (s) true s k + 1).
But if P (s) is true for all s k then its truth for all s k + 1 follows just from its truth
when s = k + 1. That is, Q(k) Q(k + 1) is the same as (P (s) true s k) P (k + 1).
The (standard) Induction Principle applied to the statement Q(n) tells us that: Q(n) is
true for all n N if the following two statements are true:
(i) Q(1) is true;
(ii) For all k N, Q(k) Q(k + 1).
What weve established is that (i) and (ii) can be rewritten as:
(i) P (1) is true;
(ii) For all k N, (P (s) true s k) P (k + 1).
We deduce that: P (n) is true for all n N if the following two statements are true:
(i) P (1) is true;
(ii) For all k N, (P (s) true s k) P (k + 1).
This is exactly the Strong Induction Principle. So the Strong Induction Principle follows
from the standard one and is, therefore, not really stronger.
Feedback to activity 3.4
Let P (n) be the statement that the sum of the first n terms is (n/2)(2a + (n 1)d).
The base case is straightforward. The first term is a, and the formula
(n/2)(2a + (n 1)d) gives a when n = 1. Suppose that P (k) holds, so the sum of the

51

3. Natural numbers and proof by induction

first k terms is (k/2)(2a + (k 1)d). Now, the (k + 1)st term is a + kd, so the sum of
the first k + 1 terms is therefore
k
k(k 1)
a + kd + (2a + (k 1)d) = a + kd + ak +
d
2
2
= (k + 1)a +

k(k + 1)
d
2

(k + 1)
(2a + kd)
2

(k + 1)
(2a + ((k + 1) 1)d),
2

so P (k + 1) is true. The result follows for all n by induction.

Sketch answers to or comments on sample questions


Answer to question 3.1
Let P (n) be the statement 2n n + 1. When n = 1, 2n = 2 and n + 1 = 2, so P (1) is
true. Suppose P (k) is true for some k N. Then 2k k + 1. It follows that
2k+1 = 2.2k 2(k + 1) = 2k + 2 k + 2 = (k + 1) + 1,
so P (k + 1) is also true. Hence, by induction, for all n N, 2n n + 1.
Answer to question 3.2
Let P (n) be the statement that the sum of the first n terms is a(1 rn )/(1 r). P (1)
states that the first term is a(1 r1 )/(1 r) = a, which is true. Suppose P (k) is true.
Then the sum of the first k + 1 terms is the sum of the first k plus the (k + 1)st term,
which is ark , so this sum is
a(1 rk ) + (1 r)ark
a(1 rk )
+ ark =
1r
1r
=

a ark + ark ark+1


1r

a(1 rk+1 )
,
1r

which shows that P (k + 1) is true. Hence, for all n N, P (n) is true, by induction.
Answer to question 3.3
Let P (n) be the statement that
n
X

1
r2 = n(n + 1)(2n + 1).
6
r=1

52

3.8. Sketch answers to or comments on sample questions

Then P (1) states that 1 = 1(2)(3)/6, which is true. Suppose P (k) is true for k N.
Then
k
X
1
r2 = k(k + 1)(2k + 1)
6
r=1
and P (k + 1) is the statement that

k+1
X

1
1
r2 = (k + 1)(k + 2)(2(k + 1) + 1) = (k + 1)(k + 2)(2k + 3).
6
6
r=1

We have
k+1
X

r2 = (k + 1)2 +

r=1

k
X

r2

r=1

1
= (k + 1)2 + k(k + 1)(2k + 1)
6
(by the induction hypothesis)
1
= (k + 1) [6(k + 1) + k(2k + 1)]
6

1
= (k + 1) 2k 2 + 7k + 6
6
1
= (k + 1)(k + 2)(2k + 3),
6
so P (k + 1) is true. By induction, P (n) is true for all n N.
Answer to question 3.4
Let P (n) be the statement that

n
X
i=1

1
n
=
. Then P (1) states that
i(i + 1)
n+1

1
1
=
, which is true. Suppose P (k) is true for k N. Then
12
1+1
k
X
i=1

1
k
=
i(i + 1)
k+1

and P (k + 1) is the statement that


k+1
X
i=1

1
k+1
=
.
i(i + 1)
k+2

Now,
k+1
X
i=1

X
1
1
1
=
+
i(i + 1)
(k + 1)(k + 2) i=1 i(i + 1)
=

k
1
+
(k + 1)(k + 2) k + 1

53

3. Natural numbers and proof by induction

(by the induction hypothesis)

1 + k(k + 2)
(k + 1)(k + 2)

k 2 + 2k + 1
(k + 1)(k + 2)

(k + 1)2
=
(k + 1)(k + 2)
=

k+1
,
k+2

so P (k + 1) is true. By induction, P (n) is true for all n N.


Answer to question 3.5
Let P (n) be the statement that xn = 3n+1 2n . We use the Strong Induction Principle
to prove P (n) is true for all n N. The base cases are n = 1 and n = 2. When n = 1,
x1 = 7 and 3n+1 2n = 9 2 = 7. When n = 2, x2 = 23 and 3n+1 2n = 27 4 = 23, so
these are true. Suppose that k 2 and that for all s k, P (s) is true. In particular,
P (k) and P (k 1) are true and so
xk+1 = 5xk 6xk1
= 5(3k+1 2k ) 6(3k 2k1 )
= 5(3k+1 ) 5(2k ) 6(3k ) + 6(2k1 )
= 15(3k ) 6(3k ) 10(2k1 ) + 6(2k1 )
= 9(3k ) 4(2k1 )
= 3k+2 2k+1
= 3(k+1)+1 2k+1 ,
so P (k + 1) is true. Therefore, P (n) is true for all n N.
Answer to question 3.6
Let P (n) be the statement that 2n+2 + 32n+1 is divisible by 7. When n = 1,
2n+2 + 32n+1 = 8 + 27 = 35 and this is a multiple of 7 because 35 = 5 7. Suppose P (k)
is true, which means that for some m N, 2k+2 + 32k+1 = 7m. Now, when we take
n = k + 1,
2n+2 + 32n+1 = 2k+3 + 32k+3
= 2(2k+2 ) + 9(32k+1 )
= 2(2k+2 + 32k+1 ) + 7(32k+1 )
= 7m + 7(32k+1 )

= 7 m + 32k+1 ,

54

3.8. Sketch answers to or comments on sample questions

which is a multiple of 7. So the statement is true for P (k + 1) and hence, by induction,


for all n N.
Answer to question 3.7
Let P (n) be the statement
n
n
Y
1 x2
2r1
(1 + x
)=
.
1x
r=1

When n = 1, the left hand side is 1 + x2 = 1 + x and the right hand side is
(1 x2 )/(1 x) = 1 + x, so P (1) is true. Suppose P (k) is true, so that
k
k
Y
1 x2
2r1
(1 + x
)=
.
1x
r=1

Then
k+1
Y

r1

(1 + x2

(k+1)1

) = (1 + x2

r=1

k
Y
r1
(1 + x2 )
r=1

= (1 + x2 )

1 x2
1x

by the induction hypothesis,


k

1 (x2 )2
=
1x
where we have used (1 + y)(1 y) = 1 y 2 ,
k

1 x2 2
1x
k+1

1 x2
=
,
1x
which shows that P (k + 1) is true. So P (n) is true for all n N, by induction.

55

3. Natural numbers and proof by induction

56

Chapter 4
Functions and counting
Essential reading

RR

One or both of the following:


Biggs, N. L. Discrete Mathematics. Chapters 5 and 6.
Eccles, P.J. An Introduction to Mathematical Reasoning. Chapter 10, Sections
10.1 and 10.2, and Chapter 11.

4.1

Introduction

In this chapter we look at the theory of functions (in more detail than in Mathematics
1 and Mathematics 2), and we see how the idea of the size of a set can be formalised.

4.2
4.2.1

Functions
Basic definitions

You have worked extensively with functions in your previous mathematical study.
Chiefly, you will have worked with functions from the real numbers to the real numbers,
these being the primary objects of interest in calculus.
Well now take a more abstract approach to functions and their properties.
Here is the definition of what we mean by a function.
Definition 4.1 Suppose that X and Y are sets. Then a function (also known as a
mapping) from X to Y is a rule that associates a unique member of Y to each member
of X. We write f : X Y . The set X is called the domain of f and Y is called the
codomain.
The element of Y that is assigned to x X is denoted by f (x) and is called the image
of x. We can write x 7 f (x) to indicate that x maps to f (x).
There are various ways of describing a function. Sometimes, if X has only finitely many
members, we can simply list the images of the members of X. More usually, we can give
a formula for the function. For instance, f : R R given by f (x) = 2x is the function
that maps each real number a to the real number 2a.
Sometimes a function can be defined recursively. For example, we might define

57

4. Functions and counting

f : N N by
f (1) = 1

f (n) = 2 + 3f (n 1), (n 2).

(You can see that the sequence of numbers f (1), f (2), f (3), . . . is therefore given by a
first order difference equation.)
What does it mean to say that two functions f, g are equal? Well, first, they must have
the same domain X and codomain Y . Then, for each x X, we must have f (x) = g(x).
For example, if R+ is the set of positive real numbers, then the function f : R+ R
given by f (x) = x2 and the function g : R R given by g(x) = x2 are not equal
because their domains are different.

For any set X, the identity function i : X X is given by i(x) = x.

4.2.2

Composition of functions

Suppose that X, Y, Z are sets and that f : X Y and g : Y Z. Then the


composition gf , also denoted by g f , is the function from X to Z given by
(gf )(x) = g(f (x)) (x X).
Note the notation, which can sometimes cause confusion. For example, suppose
X = Y = Z = R. Then you might be tempted to think that gf denotes the product
function (gf )(x) = g(x)f (x). But this would be wrong. It should always be clear from
the context whether gf should be interpreted as a composition. Be aware of this. If I
need to talk about the product of the functions f and g I will denote this by f (x)g(x).
The notation g f leads to less confusion, but it is not used in all textbooks.
Example 4.1 Suppose f : N N and g : N N are given by f (x) = x2 + 1 and
g(x) = (x + 1)2 . Then,
(f g)(x) = f (g(x)) = f ((x + 1)2 ) = ((x + 1)2 )2 + 1 = (x + 1)4 + 1.
And,
(gf )(x) = g(f (x)) = g(x2 + 1) = ((x2 + 1) + 1)2 = (x2 + 2)2 .

4.3

Bijections, surjections and injections

There are three very important properties that a function might possess:
Definition 4.2 (Surjection) Suppose f is a function with domain X and codomain Y .
Then f is said to be a surjection (or f is surjective) if every y Y is the image of
some x X; that is, f is a surjection if and only if y Y, x X, s.t. f (x) = y.
Definition 4.3 (Injection) Suppose f is a function with domain X and codomain Y .
Then f is said to be an injection (or f is injective) if every y Y is the image of at
most one x X. In other words, the function is an injection if different elements of X
have different images under f . Thus, f is an injection if and only if
x, x0 X, x 6= x0 f (x) 6= f (x0 )

58

4.3. Bijections, surjections and injections

or (equivalently, taking the contrapositive), if and only if


x, x0 X, f (x) = f (x0 ) x = x0 .
This latter characterisation often provides the easiest way to verify that a function is an
injection.
Definition 4.4 (Bijection) Suppose f is a function with domain X and codomain Y .
Then f is said to be a bijection (or f is bijective) if it is both an injection and a
surjection. So this means two things: each y Y is the image of some x X, and each
y Y is the image of no more than one x X. Well, of course, this is equivalent to:
each y Y is the image of precisely one x X.
Example 4.2 f : N N given by f (x) = 2x is not a surjection, because there is no
n N such that f (n) = 1. (For, 2n = 1 has no solution where n N.) However, it is
an injection. To prove this, suppose that m, n N and f (m) = f (n). Then 2m = 2n,
which implies m = n.
Example 4.3 f : R R given by f (x) = 2x is a bijection.
Activity 4.1 Prove that f : R R given by f (x) = 2x is a bijection.

4.3.1

An example

Example 4.4 Let X = R, the set of real numbers, and let Y be the interval
(1, 1), the set of real numbers x such that 1 < x < 1. Then the function
f : X Y given by
x
f (x) =
1 + |x|
is a bijection from X to Y .
First, we prove f is injective. To do this, we prove that f (x) = f (y) implies x = y.
So, suppose f (x) = f (y). Then
x
y
=
.
1 + |x|
1 + |y|
So
x + x|y| = y + y|x|.
Because x/(1 + |x|) = y/(1 + |y|), x and y are both non-negative or both negative.
For, otherwise, one of x/(1 + |x|) and y/(1 + |y|) will be negative and the other one
will be non-negative, which cannot be the case since they are equal. So, x|y| = y|x|,
both being xy if x, y 0 and xy if x, y < 0. So, we have x = y.
Next, we show f is surjective. We need to prove that, for each y (1, 1), theres
x R such that x/(1 + |x|) = y. Consider separately the case in which y 0 and the
case in which y < 0.

59

4. Functions and counting

y 0. Then, to have x/(1 + |x|) = y, we need x 0. So |x| = x and we need to solve


x/(1 + x) = y. This has solution x = y/(1 y).
y < 0. Then well need to have x < 0 and the equation to solve is x/(1 x) = y,
which has solution x = y/(1 + y).

4.4

Inverse functions

4.4.1

Definition, and existence

Suppose f : X Y . Then g : Y X is an inverse function of f if (gf )(x) = x for all


x X and (f g)(y) = y for all y Y . An equivalent characterisation is that
y = f (x) x = g(y).
The following theorem tells us precisely when a function has an inverse. It also tells us
that if an inverse exists, then there is only one inverse. For this reason we can speak of
the inverse function, and give it a specific notation, namely f 1 .
Theorem 4.1 f : X Y has an inverse function if and only if f is a bijection. When
f is bijective, there is a unique inverse function.
First, we prove:
f : X Y has an inverse f is bijective.
Proof
This is an theorem, so there are two things to prove: the and the .
First, we show: f : X Y has an inverse f is bijective.
Suppose f is a bijection. For each y Y there is exactly one x X with f (x) = y.
Define g : Y X by g(y) = x. Then this is an inverse of f . Check this!
Next, we show: f : X Y has an inverse f is bijective.
Suppose f has an inverse function g. We know that for any y Y ,
f (g(y)) = (f g)(y) = y, so there is some x X (namely x = g(y)) such that f (x) = y.
So f is surjective.
Now suppose f (x) = f (x0 ). Then g(f (x)) = g(f (x0 )). But g(f (x)) = (gf )(x) = x and,
similarly, g(f (x0 )) = x0 . So: x = x0 and f is injective.
Now we prove that when f is bijective, the inverse is unique.
Proof
Suppose that g and h are inverses of f . Then hf is the identity function on X and f g is
the identity function on Y . So, for any y Y ,
g(y) = (hf )(g(y)) = ((hf )g)(y) = (h(f g))(y) = h((f g)(y)) = h(y),
so g = h.
Note that if f : X Y is a bijection, then its inverse function (which exists, by
Theorem 4.1) is also a bijection.

60

4.5. Counting as a bijection

4.4.2

Examples

Example 4.5 The function f : R R is given by f (x) = 3x + 1. Find the inverse


function.
To find a formula for f 1 , we use: y = f (x) x = f 1 (y). Now,
y = f (x) y = 3x + 1 x = (y 1)/3,
so

1
f 1 (y) = (y 1).
3

Let Z denote the set of all integers (positive, zero, and negative).
Example 4.6 The function f : Z N {0} is defined as follows:

2n
if n 0
f (n) =
2n 1 if n < 0.
Prove that f is a bijection and determine a formula for the inverse function f 1 .
First, we prove that f is injective: Suppose f (n) = f (m). Since 2n is even and 2n 1
is odd, either (i) n, m 0 or (ii) n, m < 0. (For otherwise, one of f (n), f (m) is odd and
the other even, and so they cannot be equal.)
In case (i), f (n) = f (m) means 2n = 2m, so n = m.
In case (ii), f (n) = f (m) means 2n 1 = 2m 1, so n = m. Therefore f is injective.
Next, we prove that f is surjective: We show that m N {0}, n Z such that
f (n) = m. Consider separately the case m even and the case m odd.
Suppose m is even. Then n = m/2 is a non-negative integer and f (n) is 2(m/2) = m).
If m odd, then n = (m + 1)/2 is a negative integer and


(m + 1)
1 = m.
f (n) = f ((m + 1)/2) = 2
2

4.5

Counting as a bijection

What does it mean to say that a set has three objects? Well, it means that I can take
an object from the set, and call that Object 1, then I can take a different object from
the set and call that Object 2, and then I can take a different object from the set and
call that Object 3, and then there will be no objects left in the set without a name.
Obvious, I know, but this is the fundamental way in which we can abstractly define
what we mean by saying that a set has m members.
For m N, let Nm be the set {1, 2, . . . , m} consisting of the first m natural numbers.
Then we can make the following formal definition:

61

4. Functions and counting

Definition 4.5 A set S has m members if there is a bijection from Nm to S.


So, the set has m members if to each number from 1 to m, we can assign a
corresponding member of the set S, and all members of S are accounted for in this
process. This is like the attachment of labels Object 1, etc, described above.

Note that an entirely equivalent definition is to say that S has m members if there is a
bijection from S to Nm . This is because if f : Nm S is a bijection, then the inverse
function f 1 : S Nm is a bijection also. In fact, because of this, we can simply say
that S has m members if there is a bijection between Nm and S. (Eccles uses the
definition that involves a bijection from Nm to S and Biggs uses the definition that
involves a bijection from S to Nm .)
For m N, if S has m members, we say that S has cardinality m (or size m). The
cardinality of S is denoted by |S|.

4.6
4.6.1

The pigeonhole principle


The principle

The pigeonhole principle is something that you might find obvious, but it is very
useful.
Informally, what it says is that if you have n letters and you place them into m
pigeonholes in such a way that no pigeonhole contains more than one letter, then
n m. Equivalently, if n > m (so that you have more letters than pigeonholes), then
some pigeonhole will end up containing more than one letter. This is very intuitive.
Obvious as it may be, however, can you think about how you would actually prove it?
We shall prove it below. But lets state the principle more formally, first. Recall that,
for r N, Nr = {1, 2, . . . , r}.
Theorem 4.2 (Pigeonhole Principle (PP)) Let m be a natural number. Then the
following statement is true for all n N: if there is an injection from Nn to Nm , then
n m.
Proof
We prove this by induction. The statement we want to prove is the statement P (n): if
there is an injection from Nn to Nm , then n m. The base case, n = 1, is true because
(since m N), 1 m. Suppose P (k) is true. We now want to show that P (k + 1) is also
true. So suppose there is an injection f : Nk+1 Nm . (What we want to show is that
k + 1 m.) Since k 1, k + 1 2. So m must be at least 2. (If m was 1, then, for
example, f (1) and f (2) would be equal, and f would not be an injection.) Since m 2
we can write m as m = s + 1 where s N. Now, either there is some
x Nk = {1, 2, . . . , k} with f (x) = s + 1, or there is not. Lets examine each case
separately.
Suppose, then, first, that for no x Nk do we have f (x) = s + 1. Then define
f : Nk Ns by f (x) = f (x) for x Nk . Then, because f is an injection, so too is
f . So there is an injection (namely, f ) from Nk to Ns . By the induction
hypothesis, therefore, k s and hence k + 1 s + 1 = m, as required.

62

4.6. The pigeonhole principle

Now suppose that there is some j Nk such that f (j) = s + 1. Then the value
y = f (k + 1) must be different from s + 1 and therefore y Ns . Define
f : Nk Ns by f (j) = y and f (x) = f (x) if x Nk \ {j}. Then f maps from Nk
to Nm and, furthermore, it is an injection. So, by the inductive hypothesis, k s
and hence k + 1 s + 1 = m.
A consequence of this is:
Theorem 4.3 Suppose n, m are two natural numbers. If there is a bijection from Nn
to Nm , then n = m.

Proof
Suppose f : Nn Nm is a bijection. Then f is an injection. So from Theorem 4.2,
n m.
But there is in inverse function f 1 : Nm Nn and this is also a bijection. In
particular, f 1 is an injection from Nm to Nn , and hence m n.
Now we have both n m and m n, hence n = m.
A slightly more general form of the pigeonhole principle, easy to prove from that above
is:
Theorem 4.4 Suppose that A and B are sets with |A| = n and |B| = m, where
m, n N. If there is an injection from A to B, then m n.
Proof
From the definition of counting, there are bijections g : Nn A and h : Nm B. We
also have an inverse bijection h1 : B Nm .
Suppose there is an injection f : A B. Consider the composite function
h1 f g : Nn Nm . If we can prove that this is an injection, then from Theorem 4.2 it
follows that n m.
So, let us prove injectivity. Suppose a, b Nn with a 6= b. Since g is a bijection
g(a), g(b) A with g(a) 6= g(b). Since f is an injection f (g(a)), f (g(b)) B with
f (g(a)) 6= f (g(b)). Since h1 is a bijection h1 (f (g(a))), h1 (f (g(b))) Nm with
h1 (f (g(a))) 6= h1 (f (g(b))). This last inequality is what we need.
The pigeonhole principle is remarkably useful (even in some very advanced areas of
mathematics). It has many applications. For most applications, it is the contrapositive
form of the principle that is used. This states:
If m < n then there is no injection f : Nn Nm .
So, if m < n, and f is any function f : Nn Nm , then there are x, y Nn with x 6= y
such that f (x) = f (y).

4.6.2

Some applications of the Pigeonhole Principle

We now prove some theorems using the pigeonhole principle.


We start with an easy example.

63

4. Functions and counting

Theorem 4.5 In any group of 13 or more people, there are two persons whose
birthday is in the same month.
Proof
Consider the function that maps the people to their months of birth. Since 13 > 12, this
cannot be a bijection, so two people are born in the same month.
This next one is not hard, but perhaps not immediately obvious.
Theorem 4.6 In a room full of people, there will always be at least two people who
have the same number of friends in the room.

Proof
Let X be the set of people in the room and suppose |X| = n 2. Consider the function
f : X N {0} where f (x) is the number of friends x has in the room.
Lets assume that a person cant be a friend of themselves. (We could instead assume
that a person is always friendly with themselves: we simply need a convention one way
or the other.)
Then f (X) = {f (x) : x X} {0, 1, . . . , n 1}. But there cant be x, y with
f (x) = n 1 and f (y) = 0. Why? Well, such a y would be a friend of all the others,
including x, which isnt possible since x has no friends in the room.
So either f (X) {0, 1, . . . , n 2} or f (X) {1, . . . , n 1}. In each case, since f (x)
can take at most n 1 values, there must, by the pigeonhole principle, be at least two
x, y X with f (x) = f (y). And thats what we needed to prove.
Heres an interesting geometrical example. For two points (x1 , y1 ), (x2 , y2 ) in the plane,
the midpoint of (x1 , y1 ) and (x2 , y2 ) is the point


1
1
(x1 + x2 ), (y1 + y2 )
2
2
(the point on the middle of the line connecting the two points ).
Theorem 4.7 If we have a set A of five or more points in the plane R2 with integer
coordinates, then there are two points in A whose midpoint has integer coordinates.
Proof
For two integers a, b, 21 (a + b) is an integer if and only if a + b is even, so if and only if
a, b are both even or both odd.
So the midpoint of (x1 , y1 ), (x2 , y2 ) has integer coordinates if and only if x1 , x2 are both
even or both odd, and also y1 , y2 are both even or both odd.
Lets label each of the points (a, b) of A with one of (even,even), (even,odd),
(odd,even) or (odd,odd).
Since |A| 5, there will be at least two points which receive the same label. Hence
these two points have the same parity (odd or even) for the first coordinate, and the
same parity for the second coordinate. This means the midpoint of these two points
must be integer as well.
By the way, this result would not necessarily hold if we only had four points in the set.
Consider (0, 0), (1, 0), (1, 0) and (1, 1).

64

4.6. The pigeonhole principle

Heres a very interesting number theory application (with a very sneaky proof). It uses
the notion of remainders on division by n, which well cover properly soon: for now, all
we need is that, for every natural number m, the remainder, r, upon division by n is
one of the numbers 0, 1, . . . , n 1, and that m r is divisible by n.
Theorem 4.8 Let a1 , a2 , . . . , an be n integers (where n 2). Then there exists a
non-empty collection of these integers whose sum is divisible by n.
Proof
Consider the numbers s0 , s1 , . . . , sn given by
s0 = 0,
s 1 = a1 ,
s 2 = a1 + a2 ,
s 3 = a1 + a2 + a3 ,

etc., until
s n = a1 + a2 + + an .
(It is not obvious, at all, why we should do this, but it will work!)
For each of these si , consider the remainder upon division by n. Since there are n + 1
numbers si , but only n possible remainders (0, 1, . . . , n 1), two of the si will have the
same remainder upon division by n.
So suppose sk and s` have the same remainder, where k < `. Then s` sk is divisible
by n. But since s` sk = ak+1 + ak+2 + + a` , this means that the sum
ak+1 + ak+2 + + a` is divisible by n. Se we have proved the result.
In fact we proved something even stronger than what we set out to prove :
Let a1 , a2 , . . . , an be a list of n integers ( where n 2 ). Then there exists a non-empty
collection of consecutive numbers from this list ak+1 , ak+2 , . . . , a` whose sum is
divisible by n.
The theorem isnt true if we have fewer than n integers. For instance, if for any n 2
we take the numbers a1 , . . . , an1 all equal to 1, then its impossible to find a sum that
adds up to something divisible by n.

4.6.3

A generalised form of the Pigeonhole Principle

We state without proof the following more general version of the pigenonhole principle.
Again, its rather obvious. Isnt it?
Theorem 4.9 Suppose f : A B and that |A| > k|B| where k N. Then there is
some element of B that is the image of at least k + 1 elements of A.
Last year, 232 students were registered for a course I was grading. I knew, before
marking the examinations, that at least three of them would get the same examination
mark.
Why? Well, apply the theorem, with A being the students, B being the set
{0, 1, . . . , 100} of all possible marks (which is of size 101) and f (x) the mark of student

65

4. Functions and counting

x. Since 232 > 2(101), theres some mark y such that at least 2 + 1 = 3 students will
have y = f (x), which means they get the same mark.

4.7

Infinite sets

We say that a set A is finite when there is some n N such that |A| = n. Otherwise, A
is said to be infinite.

For example, the set of natural numbers is infinite. You might think thats obvious, but
how would you prove it? (Remember that the formal definition that a set A has
cardinality n is that there is a bijection between Nn and A.)
One way to show this is to use a proof by contradiction. Suppose (for a contradiction)
that N is finite, of cardinality n N, and that f : Nn N is a bijection. Consider the
number N = f (1) + f (2) + + f (n). Since each f (i) is a natural number, for all
i Nn , N is also a natural number. But N > f (i) for all i Nn . So here is a natural
number, N , that is not equal to f (i) for any i Nn . But that contradicts the fact that
f is a bijection, because if its a bijection then its certainly a surjection and there
should be some i Nn with f (i) = N .

Learning outcomes
At the end of this chapter and the Essential reading and activities, you should be able
to:
describe precisely what is meant by a function
describe precisely what it means to say a function is a surjection, an injection and
a bijection, and be able to determine whether a given function has these properties
state the definition of the composite function gf
establish whether a function has an inverse or not
demonstrate that you understand the formal definition of the cardinality of a finite
set
state and use the pigeonhole principle
state what it means to say that a set is infinite; and be able to prove that a set is
infinite.

Sample examination questions


Question 4.1
Suppose that X, Y, Z are sets and that f : X Y and g : Y Z. Prove that if f and g
are injections, so is the composition gf . Prove also that if f and g are surjections, then
so is the composition gf .

66

4.7. Sample examination questions

Question 4.2
Let Z be the set of all integers and suppose that f : Z Z is given, for x Z, by

x+1
if x is even
f (x) =
x + 3 if x is odd.
Determine whether f is injective. Determine also whether f is surjective.
Question 4.3
The function f : N N is defined as follows:

f (n 1) + 3
2
f (1) = 1, and for n 2, f (n) =

f (n 1) + 5

if f (n 1) is odd
if f (n 1) is even.

Show that f is neither an injection nor a surjection.


Question 4.4
Suppose that X, Y, Z are sets, that f : X Y , g : Y Z, and h : Y Z. Suppose
that the compositions hf and gf are equal, and also that f is surjective. Prove that
g = h.
Question 4.5
Suppose that X, Y, Z are sets and that f : X Y and g : Y Z. Prove that if the
composition gf is injective, then f is injective. Prove that if gf is surjective, then g is
surjective.
Question 4.6
Suppose that A and B are non-empty finite sets and that they are disjoint (meaning
that A B = ). Prove, using the formal definition of cardinality, that
|A B| = |A| + |B|.
Question 4.7
Suppose that X, Y are any two finite sets. By using the fact that
X Y = (X \ Y ) (Y \ X) (X Y ),
together with the result of Question 4.6, prove that
|X Y | = |X| + |Y | |X Y |.
Question 4.8
Suppose that X is a set of n 2 people. Define f : X {0, 1, . . . , n 1} by: f (x) is the
number of people in X with whom x is friendly. (For the purposes of this question,
nobody is their own friend, and if x is friendly with y, then y is friendly with x.)
Explain why it is not possible for there to be x, y X with f (x) = 0 and f (y) = n 1.
Hence show that there are at least two members of X who have precisely the same
number of friends in X.

67

4. Functions and counting

Comments on selected activities


Feedback to activity 4.1
Given any y R, let x = y/2. Then f (x) = 2(y/2) = y. This shows that f is surjective.
Also, for x, y R,
f (x) = f (y) 2x = 2y x = y,
which shows that f is injective. Hence f is a bijection.

Sketch answers to or comments on sample questions


Answer to question 4.1
Suppose f and g are injective. Then, for x, y X,
(gf )(x) = (gf )(y) g(f (x)) = g(f (y))
f (x) = f (y) (because g is injective)
x = y (because f is injective).
This shows that gf is injective.
Suppose that f and g are surjective. Let z Z. Then, because g is surjective, there is
some y Y with g(y) = z. Because f is surjective, there is some x X with f (x) = y.
Then
(gf )(x) = g(f (x)) = g(y) = z,
so z is the image of some x X under the mapping gf . Since z was any element of Z,
this shows that gf is surjective.
Answer to question 4.2
Suppose one of x, y is even and the other odd. Without any loss of generality, we may
suppose x is even and y odd. (Without loss of generality signifies that there is no need
to consider also the case in which x is odd and y is even, because the argument wed use
there would just be the same as the one were about to give, but with x and y
interchanged.) So f (x) = x + 1 and f (y) = y + 3. But we cannot then have
f (x) = f (y) because x + 1 must be an odd number and y + 3 an even number. So if
f (x) = f (y), then x, y are both odd or both even. If x, y are both even, this means
x + 1 = y + 1 and hence x = y. If they are both odd, this means x + 3 = y + 3,
which means x = y. So we see that f is injective.
Is f surjective? Let z Z. If z is odd, then z 1 is even and so
f (z 1) = (z 1) + 1 = z.
If z is even, then 3 z is odd and so
f (3 z) = (3 z) + 3 = z.
So for z Z there is x Z with f (x) = z and hence f is surjective.

68

4.7. Sketch answers to or comments on sample questions

Answer to question 4.3


Its not at all clear how to answer this. Lets work out some values of f . We have
f (1)
f (2)
f (3)
f (4)
f (5)
f (6)
f (7)
f (8)
f (9)
f (10)

=
=
=
=
=
=
=
=
=
=
..
.

1
(1 + 3)/2 = 2
2+5=7
(7 + 3)/2 = 5
(5 + 3)/2 = 4
4+5=9
(9 + 3)/2 = 6
6 + 5 = 11
(11 + 3)/2 = 7
(7 + 3)/2 = 5

Do you see what happens? The values f (3) to f (8) repeat: because f (9) = 7, the values
of f (9) to f (14) are the same as f (3) to f (8), and so on, so the set of all values of f , in
order, are
1, 2, 7, 5, 4, 9, 6, 11, 7, 5, 4, 9, 6, 11, 7, 5, 4, 9, 6, 11, . . . .
|
{z
} |
{z
} |
{z
}
repeats
repeats
repeats
So it is clear that the function is not injective, since the values 7, 5, 4, 9, 6, 11 are taken
infinitely often. Furthermore, it is not surjective, because the value 3 is never taken, for
instance.
Answer to question 4.4
Suppose f is surjective and that hf = gf . Let y Y . We show g(y) = h(y). Since y is
any element of Y in this argument, this will establish that g = h. Because f is
surjective, there is some x X with f (x) = y. Then, because hf = gf , we have
h(f (x)) = g(f (x)), which means that h(y) = g(y). So weve achieved what we needed.
Answer to question 4.5
Suppose gf is injective. To show that f is injective we need to show that
f (x) = f (y) x = y. Well,
f (x) = f (y) g(f (x)) = g(f (y)) (gf )(x) = (gf )(y) x = y,
where the last implication is because gf is injective.
Now suppose gf is surjective. So for all z Z there is some x X with (gf )(x) = z. So
g(f (x)) = z. Denoting f (x) by y, we therefore see that there is y Y with g(y) = z.
Since z was any element of Z, this shows that g is surjective.
Answer to question 4.6
Suppose |A| = m and |B| = n. We need to show that |A B| = m + n which means,
according to the definition of cardinality, that we need to show there is a bijection from

69

4. Functions and counting

Nm+n to A B. Because |A| = m, there is a bijection f : Nm A and because |B| = n,


there is a bijection g : Nn B. Let us define h : Nm+n A B as follows:
for 1 i m, h(i) = f (i) and for m + 1 i m + n, h(i) = g(i m).
Then h is injective. We can argue this as follows: if 1 i, j m then
h(i) = h(j) f (i) = f (j) i = j,
because f is injective. If m + 1 i, j m + n then
h(i) = h(j) g(i m) = g(j m) i m = j m i = j,

because g is injective. The only other possibility is that one of i, j is between 1 and m
and the other between m + 1 and m + n. In this case, the image under h of one of i, j
belongs to A and the image of the other to B and these cannot be equal because
A B = . So h is indeed an injection. It is also a surjection. For, given a A, because
f is a surjection, there is 1 i m with f (i) = a. Then h(i) = a also. If b B then
there is some 1 j n such that g(j) = b. But then, this means that
h(m + j) = g((m + j) m) = b, so b is the image under h of some element of Nm+n . So
h is a bijection from Nm+n to A B and hence |A B| = m + n.
Answer to question 4.7
Note first that the two sets (X \ Y ) (Y \ X) and X Y are disjoint. Therefore,
|X Y | = |(X \ Y ) (Y \ X)| + |X Y |.
Now, (X \ Y ) and (Y \ X) are disjoint, so
|(X \ Y ) (Y \ X)| = |(X \ Y )| + |(Y \ X)|
and therefore
|X Y | = |(X \ Y )| + |(Y \ X)| + |X Y |.
Now, the sets X \ Y and X Y are disjoint and their union is X, so
|X| = |(X \ Y ) (X Y )| = |X \ Y | + |X Y |.
A similar argument shows that
|Y | = |(Y \ X) (X Y )| = |Y \ X| + |X Y |.
These mean that
|X \ Y | = |X| |X Y | and |Y \ X| = |Y | |X Y |.
So we have
|X Y | = |(X \ Y )| + |(Y \ X)| + |X Y |
= (|X| |X Y |) + (|Y | |X Y |) + |X Y |
= |X| + |Y | |X Y |.

70

4.7. Sketch answers to or comments on sample questions

Answer to question 4.8


Think about what f means. If we had f (y) = n 1 then that means y is friendly with
all of the other n 1 people. So there is no person in the group who is not friends with
y. In particular, therefore, there is nobody who has no friends. That is, there cannot be
x with f (x) = 0. So f cannot take both of the values 0 and n 1. Therefore f takes at
most n 1 different values. By the pigeonhole principle, since |X| = n and since f maps
into a set of at most n 1 values, f is not an injection and there are therefore x, y X
with f (x) = f (y). Then, x and y have the same number of friends in X.

71

4. Functions and counting

72

Chapter 5
Equivalence relations and the integers
Essential reading

R
R

One or both of the following:

5.1

Biggs, N.L. Discrete Mathematics. Chapter 7.

Eccles, P.J. An Introduction to Mathematical Reasoning. Chapter 22.

Introduction

In this chapter we study the important idea of an equivalence relation, a concept that is
central in abstract mathematics. As an important example, we look at how the integers
can be defined from the natural numbers through the use of an equivalence relation. We
also study some of the important properties of the integers.

5.2
5.2.1

Equivalence relations
Relations in general

The idea of a relation is quite a general one. For example, consider the set of natural
numbers N and let us say that two natural numbers m, n are related, denoted by m R n,
if m + n is even. So we have, for instance, 6 R 2 and 7 R 5, but that 6 and 3 are not
related. This relation has some special properties. For one thing, since 2n is even for all
n N, n R n for all n N. (We say such a relation is reflexive.) Also, if m R n, then
m + n is even. But m + n = n + m and hence, also, n R m. (We say such a relation is
symmetric.) It is because m R n n R m that we can simply say that m and n are
related rather than m is related to n or n is related to m. The relation R has other
important properties that we will come back to later.
Formally, a relation R on a set X is a subset of the Cartesian product X X (which,
recall, is the set of all ordered pairs of the form (x, y) where x, y X).
Example 5.1 Suppose R is the relation on R given by x R y x > y. Regarded
as a subset of R R, this is the set {(x, y) | x > y}. This relation does not possess
the reflexive and symmetric properties we met in the example above. For no x R
do we have x R x because x is not greater than x. Furthermore, if x R y then x > y,
and we cannot therefore also have y R x, for that would imply the contradictory
statement that y > x.

73

5. Equivalence relations and the integers

In many cases, we use special symbols for relations. For instance = is a relation, as is
>. It is often convenient to use a symbol other than R: for instance, many textbooks
use x y rather than x R y to denote the typical relation.

5.2.2

The special properties of equivalence relations

There are three special properties that a relation might have (two of which we saw in
one of the earlier examples):
Definition 5.1 Suppose that R is relation on a set X. Then
[The reflexive property] R is said to be reflexive if, for all x X, x R x.
[The symmetric property] R is said to be symmetric if, for all x, y X, x R y
implies y R x (equivalently, for all x, y X, x R y y R x).

[The transitive property] R is said to be transitive if, for all x, y, z X,


whenever x R y and y R z, we also have x R z; that is, (x R y) (y R z) xRz.
A relation that has all three of these properties is called an equivalence relation.
Definition 5.2 A relation is an equivalence relation if is reflexive, symmetric and
transitive.
Example 5.2 We saw earlier that the relation on N given by
m R n m + n is even
is reflexive and symmetric. It is also transitive. To prove that, suppose x, y, z are
three natural numbers and that x R y and y R z. Then x + y is even and y + z is
even. To show that x R z we need to establish that x + z is even. Well,
x + z = (x + y) + (y + z) 2y,
and all three terms on the right (x + y, y + z, and 2y) are even. Therefore, x + z is
even and so x R z.
Example 5.3

Let X be the set of n n real matrices. Define a relation on X by:


M N r, s N s.t. M r = N s .

Then is an equivalence relation.


Reflexivity and symmetry are easy to see: M 1 = M 1 and, if M r = N s , then N s = M r .
Proving transitivity requires more work. Suppose M N and N R. Then there are
r, s, t, u N with M r = N s and N t = Ru . Then
M rt = (M r )t = (N s )t = (N t )s = (Ru )s = Rus ,
so there are integers w = rt and x = us such that M w = Rx and hence M R.

74

5.3. Equivalence classes

5.3

Equivalence classes

Given an equivalence relation, it is natural to group together objects that are related to
each other. The resulting groupings are known as equivalence classes. In this section, we
formally define equivalence classes and discuss some of their properties.
Definition 5.3 Suppose R is an equivalence relation on a set X and, for x X, let [x]
be the set of all y X such that y R x. So,
[x] = {y X | y R x}.
Notice that each [x] is a subset of X.
Example 5.4 Consider again R on N given by m R n m + n is even. Any even
number is related to any other even number; and any odd number to any odd
number. So there are two equivalence classes:
[1] = [3] = [5] = = set of odd positive integers;
[2] = [4] = [6] = = set of even positive integers.

Example 5.5 Suppose that f : X Y is a surjection. Define relation R on X by


x R y f (x) = f (y). Then R is an equivalence relation and the equivalence classes
are given by
Cy = {x X : f (x) = y},
for y Y .
Activity 5.1 Check this!
The equivalence classes have a number of important properties. These are given in the
following result.
Theorem 5.1 Suppose R is an equivalence relation on a set X. Then
(i) For x, y X, [x] = [y] x R y
(ii) For x, y X, if x and y are not related by R, then [x] [y] = .
Proof
(i) This is an if and only if statement, so we have two things to prove: namely that
[x] = [y] x R y and that x R y [x] = [y].
Suppose, then, that [x] = [y]. The relation R is reflexive, so we have x R x. This means
that x [x]. But if [x] = [y], then we must have x [y]. But that means (by definition
of [y]) that x R y.
Conversely, suppose that x R y. We now want to show that [x] = [y]. So let z [x]. (We
will show that z [y].) Then z R x. But, because x R y and R is transitive, it follows

75

5. Equivalence relations and the integers

that z R y and hence z [y]. This shows [x] [y]. We now need to show that [y] [x].
Suppose w [y]. Then w R y and, since x R y, we also have, since R is symmetric, y R x.
So w R y and y R x. By transitivity of R, w R x and hence w [x]. This shows that
[y] [x]. Because [x] [y] and [y] [x], [x] = [y], as required.
(ii) Suppose x and y are not related. We prove by contradiction that [x] [y] = . So
suppose [x] [y] 6= . Let z be any member of the intersection [x] [y]. (The fact that
were assuming the intersection is non-empty means there is such a z.) Then z [x], so
z R x and z [y], so z R y. Because R is symmetric, x R z. So: x R z and z R y and,
therefore, by transitivity, x R y. But this contradicts the fact that x, y are not related by
R. So [x] [y] = .

Theorem 5.1 shows that either two equivalence classes are equal, or they are disjoint.
Furthermore, because an equivalence relation is reflexive, any x X is in some
equivalence class (since it certainly belongs to [x] because x R x). So what we see is that
the equivalence classes form a partition of X: their union is the whole of X, and no two
equivalence classes overlap.
Example 5.6 Consider again the equivalence relation R on N given by
m R n m + n is even.
We have seen that there are precisely two equivalence classes: the set of odd positive
integers and the set of even positive integers. Note that, as the theory predicted,
these form a partition of all of N (since every natural number is even or odd, but not
both).

5.4

Construction of the integers from the natural


numbers

This section might seem at first a little tricky. We know what the integers are, so why
do we have to define or construct them, you might well ask. Well, for one thing the
procedure were about to look at is used very often in mathematics. We will meet it
again when we study rational numbers and modular arithmetic. The big idea here is
that we can often define an interesting set of objects (and perform operations on them)
by considering the set of equivalence classes of some relation. That is, we have a set X
and an equivalence relation R on X, and we look at the set C of equivalence classes. In
the abstract, this is perhaps confusing, but we will see some examples over the course of
the next few chapters. We start with the integers.
(The informal motivation behind what follows is this: imagine you have deposits of a
and debts of b. Denote this by (a, b). Then you have a total of a b, which might be
negative.)
We can describe, or construct, the integers from the natural numbers, using an
equivalence relation. In fact, we consider an equivalence relation on the set N N of all
ordered pairs of natural numbers. Given (a, b) and (c, d) in X = N N, let us say that
(a, b) R (c, d) a + d = b + c.

76

5.4. Construction of the integers from the natural numbers

(Informal motivation: were thinking of [(a, b)] as the (familiar) integer a b. Thats
why we defined
(a, b) R (c, d) a + d = b + c.
This is the same as a b = c d. But we want to make this work, in the set of
natural numbers and using addition, not subtraction, which we havent defined.)
Ive said this is an equivalence relation, but lets check this.
First, R is reflexive because (a, b) R (a, b) if and only if a + b = b + a, which is clearly
true.
Next, R is symmetric, for
(a, b) R (c, d) a + d = b + c c + b = d + a (c, d) R(a, b).
Finally, R is transitive. For suppose that (a, b) R (c, d) and (c, d) R (e, f ). Then
a + d = b + c and c + f = d + e. Therefore,

(a + d) + (c + f ) = (b + c) + (d + e).
That is (after cancelling c and d from each side),
a + f = b + e,
which means (a, b) R (e, f ).
What are the equivalence classes of R? The typical equivalence class [(a, b)] contains all
(c, d) for which a + d = b + c. For example, [(2, 1)] will be
[(2, 1)] = {(3, 2), (4, 3), (5, 4), . . . }.
Now, for n N, let us denote the equivalence class [(n + 1, 1)] by n and let us denote
the equivalence class [(1, n + 1)] by n. Also, we denote by 0 the class [(1, 1)]. Then we
define the integers to be the set
{. . . , 3, 2, 1, 0, 1, 2, 3, . . . }.
We can do arithmetic (addition and multiplication) with the integers defined in this
way. First, lets define an addition operation (between equivalence classes) by
[(a, b)] + [(c, d)] = [(a + c, b + d)].
So, for example, what does this say about the sum of integers +3 and 1. Well,
+3 = [(4, 1)] and 1 = [(1, 2)] and therefore
+3 + 1 = [(4, 1)] + [(1, 2)] = [(5, 3)].
Now, (5, 3) R (3, 1), so [(5, 3)] = [(3, 1)], which is what we called +2. No surprise there,
then: 3 + (1) = 2 in the usual notation for integer arithmetic. (Henceforth, we can
simply write m + (n) as the subtraction m n.)
There is quite a subtle point about this definition of addition of equivalence classes. We
know that there are many (infinitely many) pairs (a0 , b0 ) such that [(a, b)] = [(a0 , b0 )].

77

5. Equivalence relations and the integers

Such an (a0 , b0 ) is called a representative of the equivalence class. (It is always the case,
for any equivalence relation, as Theorem 5.1 shows, that [x] and [x0 ] are the same
whenever x0 R x. So any equivalence class can be represented by potentially many
different representatives: any member of the class will do.) So we need to be sure that
the definition of addition will give us the same answer if we use different
representatives. In other words, suppose that [(a0 , b0 )] = [(a, b)] and [(c0 , d0 )] = [(c, d)].
The definition of addition,
[(a, b)] + [(c, d)] = [(a + c, b + d)],
will only make sense (or, it will only be consistent) if
[(a0 , b0 )] + [(c0 , d0 )] = [(a, b)] + [(c, d)].
Thus, we need to prove that if [(a0 , b0 )] = [(a, b)] and [(c0 , d0 )] = [(c, d)], then
[(a + c, b + d)] = [(a0 + c0 , b0 + d0 )].

We can see this easily enough in specific cases. For instance, [(4, 1)] = [(6, 3)] and
[(1, 2)] = [(2, 3)]. We have
[(4, 1)] + [(1, 2)] = [(5, 3)] and [(6, 3)] + [(2, 3)] = [(8, 6)].
Well, (5, 3) R (8, 6), because 5 + 6 = 3 + 8, so [(5, 3)] = [(8, 6)]. So, in this case, and with
these choices of representatives for each equivalence class, we end up with the same
class when we apply the addition operation.
We can prove, more generally, that the definition works, that it does not depend on the
choice of representatives of the classes. Remember, what we need to prove is that if
[(a0 , b0 )] = [(a, b)] and [(c0 , d0 )] = [(c, d)], then
[(a + c, b + d)] = [(a0 + c0 , b0 + d0 )].
Well, [(a0 , b0 )] = [(a, b)] and [(c0 , d0 )] = [(c, d)] mean that (a0 , b0 ) R (a, b) and
(c0 , d0 ) R (c, d), so that a0 + b = b0 + a and c0 + d = d0 + c. We need to show that
[(a + c, b + d)] = [(a0 + c0 , b0 + d0 )]. This means we need to show that
(a + c, b + d) R (a0 + c0 , b0 + d0 ). But
(a + c, b + d) R (a0 + c0 , b0 + d0 ) (a + c) + (b0 + d0 ) = (b + d) + (a0 + c0 )
(a + b0 ) + (d0 + c) = (a0 + b) + (c0 + d),
which is true, because a0 + b = b0 + a and c0 + d = d0 + c.
Multiplication, , is defined on these equivalence classes by
[(a, b)] [(c, d)] = [(ac + bd, ad + bc)].

5.5

Properties of the integers

We now revert to the usual notation for the integers. In particular, we denote +n by n
and consider N to be a subset of Z, the set of integers. Many of the usual properties of
integers follow from the formal definition of integers given in the previous section: see
Biggs, Section 7.5, for more details. We give one example:

78

5.6. Ordering the integers

Example 5.7 With the integers as defined formally in the previous section, we
show that for any integer z, z + 0 = z. Recall that, for some (a, b), we will have
z = [(a, b)] and that 0 is [(1, 1)]. Now, the definition of addition of integers (that is,
of the equivalence classes) means that
z + 0 = [(a, b)] + [(1, 1)] = [(a + 1, b + 1)].
But (a + 1, b + 1) R (a, b) because (a + 1) + b = (b + 1) + a, and hence
[(a + 1, b + 1)] = [(a, b)] = z. So z + 0 = z.
Activity 5.2 With integers defined in the formal way as these equivalence classes,
prove that for any integer z, z 0 = 0.

5.6

Ordering the integers

For integers x = [(a, b)] and y = [(c, d)], we say x < y if and only if a + d < b + c.
We noted in an earlier chapter that any non-empty subset of N has a least member. But
this is not true for subsets of integers.
For a subset S of Z, m is a lower bound for S if for all s S, m s; and M is an upper
bound for S if for all s S, s M . We say that S is bounded below if it has a lower
bound; and that it is bounded above if it has an upper bound. The natural number l is a
least member of S if l S and, for all s S, l s. So a least member will be a lower
bound that belongs to S. The natural number g is a greatest member of S if g S and,
for all s S, g s. So a greatest member will be an upper bound that belongs to S.
The following fact is a fundamental property of the integers, known as the well-ordering
principle. (The well-ordering principle was discussed earlier, when it was presented as
an axiom for the natural numbers. This is a generalisation of that principle.)
The Well-ordering Principle
If S is a non-empty set of integers that has a lower bound, then S has a least member.

(The same statement is true with lower replaced by upper and least replaced by
greatest.)
Furthermore, if S is bounded below, then there is precisely one least member. For, if l, l0
are least members then l, l0 S and so (since for all s S, l s and l0 s) we have
both l l0 and l0 l, so that l = l0 .

79

5. Equivalence relations and the integers

Learning outcomes
At the end of this chapter and the Essential reading and activities, you should be able
to:
demonstrate that you know what is meant by a relation
demonstrate that you know what it means to say a relation is reflexive, symmetric
or transitive, or that it is an equivalence relation
verify whether given relations are reflexive, symmetric or transitive
demonstrate that you know the definition of equivalence classes and that you know
some of their basic properties, in particular that they form a partition of the set on
which the relation is defined
determine the equivalence classes that correspond to an equivalence relation
demonstrate knowledge of the way in which the integers can be formally
constructed from the natural numbers through the use of an equivalence relation
state the Well-Ordering Principle.

Sample examination questions


Question 5.1
Define a relation R on Z by: for x, y Z, x R y x2 = y 2 . Prove that R is an
equivalence relation, and describe the corresponding equivalence classes.
Question 5.2
Define the relation R on the set N by x R y if and only if there is some n Z such that
x = 2n y. Prove that R is an equivalence relation.
Question 5.3
Let X be the set of n n real matrices. Define a relation on X by:
M N an invertible P X s.t. N = P 1 M P.
Prove that is an equivalence relation.
Question 5.4
Suppose that f : X Y is a surjection. Define the relation R on X by
x R y f (x) = f (y). Prove that R is an equivalence relation. What are the
equivalence classes? Let C denote the set of equivalence classes [x] for x X. Prove
that if [x] = [y] then f (x) = f (y). This means that we can define a function g : C Y
by: g([x]) = f (x). Prove that g is a bijection.
Question 5.5
Prove that the set {x Z | x is a multiple of 4} has no lower bound.

80

5.6. Comments on selected activities

Comments on selected activities


Feedback to activity 5.2
Suppose z = [(a, b)]. Also, 0 = [(1, 1)]. The definition of multiplication is that
[(a, b)] [(c, d)] = [(ac + bd, ad + bc)].
So,
z 0 = [(a, b)] [(1, 1)] = [(a + b, a + b)].
Now, (a + b, a + b) R (1, 1) because a + b + 1 = 1 + a + b and therefore
[(a + b, a + b)] = [(1, 1)] = 0. So we see that z 0 = 0.

Sketch answers to or comments on sample questions

Answer to question 5.1


R is reflexive because for any x, x2 = x2 . R is symmetric because x2 = y 2 y 2 = x2 .
To show R is transitive, suppose x, y, z Z and x R y and y R z. Then x2 = y 2 and
y 2 = z 2 , so x2 = z 2 , which means x R z. Thus R is an equivalence relation. Given any
x Z, the equivalence class [x] consists precisely of those integers y such that y 2 = x2 .
So [x] = {x, x}.
Answer to question 5.2
R is reflexive because for any x, x = 20 x. R is symmetric because if x R y then n Z
with x = 2n y. This means that y = 2n x and hence, taking m = n, m Z such that
y = 2m x. So y R x. To show R is transitive, suppose x, y, z Z and x R y and y R z.
Then there are m, n Z such that x = 2n y and y = 2m z, so x = 2n y = 2n (2m z) = 2m+n z
which, since m + n Z, shows that x R z. Thus R is an equivalence relation.
Answer to question 5.3
For any M , M = I 1 M I where I is the identity matrix, so M M . For matrices
M, N X, if M N then theres an invertible P with N = P 1 M P and so
M = P N P 1 , which can be written as M = (P 1 )1 M P 1 . So there is an invertible
matrix Q (equal to P 1 ) such that M = Q1 N Q and hence M N . This shows the
relation is symmetric. Suppose M N and N R. Then there are invertible matrices
P and Q such that N = P 1 M P and R = Q1 N Q. We therefore have
R = Q1 (P 1 M P )Q = (Q1 P 1 )M (P Q) = (P Q)1 M (P Q),
so there is an invertible matrix T = P Q so that R = T 1 M T and hence M R,
establishing that is transitive. It follows that is an equivalence relation. (We used
here the fact that (P Q)1 = Q1 P 1 . This follows from the fact that
(Q1 P 1 )(P Q) = Q1 (P 1 P )Q = Q1 IQ = Q1 Q = I.)
Answer to question 5.4
x R x because f (x) = f (x). If x R y then f (x) = f (y) so f (y) = f (x) and hence y R x. If
x R y and y R z then f (x) = f (y) and f (y) = f (z), so f (x) = f (z) and x R z. Hence R is
an equivalence relation.

81

5. Equivalence relations and the integers

For x X, [x] is the set of all y X with f (y) = f (x), so, since f is a surjection, the
equivalence classes are exactly the sets Cz for each z Y , where
Cz = {x X | f (x) = z} is the set of elements of X mapped onto z by f .
The fact that [x] = [y] implies f (x) = f (y) follows directly either from this description
of equivalence classes, or from the fact that [x] = [y] implies x R y, which implies
f (y) = f (x).
Let g be as defined. It is surjective because for each z Y , there is some x X such
that f (x) = z (since f is surjective) and hence g([x]) = f (x) = z. Also, g is bijective
because g([x]) = g([y]) implies f (x) = f (y), which means x R y and hence that [x] = [y].
Answer to question 5.5

We can prove this by contradiction. Suppose that the set


S = {x Z | x is a multiple of 4} has a lower bound, l. Then, for all x S, x l. Now,
one of l 1, l 2, l 3, l 4 must be a multiple of 4. So one of these numbers is in S.
However, each is less than l, contradicting the fact that l is a lower bound on S.

82

Chapter 6
Divisibility and prime numbers
Essential reading

R
R

One or both of the following:


Biggs, N.L. Discrete Mathematics. Chapter 8.
Eccles, P.J. An Introduction to Mathematical Reasoning. Chapters 15-17 and
Chapter 23 (except section 23.5).

6.1

Introduction

In this chapter we begin to study elements of number theory. We start with a discussion
of divisibility and this leads us to discuss common divisors. Prime numbers are the basic
building blocks in the theory of numbers: in particular, each number can be written in
essentially only one way as a product of primes.

6.2

Divisibility

For integers x, y we say that x is a multiple of y or that y divides x if, for some q Z,
x = yq. We use the notation y | x to signify that y divides x.
Note that, for every x Z, x | 0. But 0 | x only if x = 0.
When y does not divide x, we write y 6 | x. In this case, as you will know from elementary
arithmetic, dividing x by y will leave a remainder.

6.3

Quotients and remainders

The following theorem is very useful. It formalises the fact that one integer may be
divided by another, leaving a remainder.
Theorem 6.1 For any positive integers a and b, there are unique non-negative integers
q and r such that
a = bq + r and 0 r < b.
Proof
This can be proved using some standard properties of integers. Let N0 = N {0} and let
Q = {m N0 | bm a}.

83

6. Divisibility and prime numbers

Then Q is non-empty because 0 Q (since 0b = 0 a). Also, Q is finite, because if


qb a, then, given that b N, we have q a. So Q must have a maximum member.
Lets call this q. Let r = a bq. Because q Q, bq a and hence r 0. Now, q is the
maximum member of A, so q + 1 6 A, which means that (q + 1)b > a, so qb + b > a and
hence r = a bq < b. So we have established that a = bq + r, and that 0 r < b.
To show that q and r are unique, suppose we have a = bq + r = bq 0 + r0 where
0 r, r0 < b. (We show q = q 0 and r = r0 .) Either q q 0 or q q 0 . Lets suppose that
q q 0 (the argument is similar if q q 0 ). Then
0 r = a bq a bq 0 = r0 < b.
So
0 (a bq 0 ) (a bq) < b,
which simplifies to 0 b(q q 0 ) < b. This implies 0 q q 0 < 1. But q q 0 is an
integer, and so we must have q q 0 = 0. So q = q 0 . Then, r = a bq = a bq 0 = r0 .

The same result holds more generally, without the restriction that a > 0, but the proof
is simplest when we restrict to the positive case. The general Division Theorem is:
Theorem 6.2 (Division Theorem) For any integers a and b with b > 0, there are
unique integers q and r such that
a = bq + r and 0 r < b.

6.4

Representation of integers with respect to a base

Let t be a positive integer. Then any positive integer x can be represented uniquely in
the form
x = xn tn + xn1 tn1 + + r1 t + r0
for some integer n. The numbers xi are integers between 0 and t 1 and can be found
by repeated division by t: see Biggs Section 8.3. We write
x = (xn xn1 . . . x1 x0 )t .
Example 6.1 The number 60 (written in base 10) can be written in base 3 as
(2020)3 because:
60 = 2(33 ) + 0(32 ) + 2(3) + 0(1).
The representation can be found by repeated division.
60
20
6
2

84

=
=
=
=

3 20 + 0
36+2
32+0
3 0 + 2.

6.5. Greatest common divisor

Example 6.2 Whats the representation in base 4 of the number (201)10 ?


201
50
12
3

=
=
=
=

4 50 + 1
4 12 + 2
43+0
4 0 + 3.

So the answer is (3021)4 .


Check: 3(43 ) + 2(4) + 1 = 201.

6.5

Greatest common divisor

The greatest common divisor of two integers is defined as follows.


Definition 6.1 (Greatest common divisor) Suppose a, b are two integers, at least
one of which is not 0. Then the greatest common divisor (gcd) of a and b, denoted by
gcd(a, b), is the unique positive integer d with the following properties:
(i) d divides both a and b (that is, it is a common divisor of a and b)
(ii) d is greater than every other common divisor of a and b: that is, if c | a and c | b then
c d.
Implicit here is the fact that the gcd is unique. This easily follows from properties (i)
and (ii). For, suppose that d and d0 are two positive integers satisfying (i) and (ii).
Because d0 divides a and b and because d satisfies (ii), we have d0 d. But also, because
d divides a and b and because d0 satisfies (ii), we have d d0 . So we must have d = d0 .
Its not too hard to see that the gcd exists. For any n Z, let D(n) = {m Z : m | n}.
This is the set of positive divisors of n. Since 1 | n, D(n) 6= . Consider now the set
D(a, b) = D(a) D(b), which is the set of positive common divisors of a and b. Then,
D(a, b) 6= since 1 D(a, b). Suppose a 6= 0. (We know that at least one of a, b is
nonzero. If a 6= 0, a very similar argument will work using b in place of a.) Then
m D(a, b) m |a|. So D(a, b) is bounded above and hence has a (unique) maximal
element d. Thats the gcd.
Note that some textbooks use (a, b) to denote the gcd, rather than gcd(a, b).
Example 6.3 gcd(12, 20) = 4 because 4 | 12 and 4 | 20, but there is no common
divisor of 12 and 20 that is greater than 4.
Activity 6.1 Convince yourself that gcd(35, 77) = 7.
If two numbers a, b have gcd(a, b) = 1, then we say that a and b are coprime. In this
case, a and b have no common factors other than 1 and 1. For example, 72 and 77 are
coprime.

85

6. Divisibility and prime numbers

6.6

The Euclidean algorithm

There is a standard method for computing greatest common divisors, known as the
Euclidean algorithm. Before presenting this, we state two important properties of
greatest common divisors.
If b N and a | b, then gcd(a, b) = a.
For non-zero integers a and b, if a = bq + r where q, r are integers, then
gcd(a, b) = gcd(b, r).
The first fact is clear: for a is, in this case, a common divisor of a and b. And theres no
greater positive divisor of a than a itself.
Activity 6.2 Prove this second fact by proving that the set of common divisors of a
and b is the same as the set of common divisors of b and r (and hence both sets have
the same greatest member.)

These observations provide a way to determine gcds. Lets think about a simple
example. Suppose we want gcd(100, 15). (You can see immediately that the answer is 5,
but lets try to explain a method that will be useful in rather less easy examples.)
We have 100 = 15 6 + 10 so, by the second fact above, gcd(100, 15) = gcd(15, 10).
Next, 15 = 10 1 + 5, so gcd(15, 10) = gcd(10, 5). But 10 = 2 5, so 5 divides 10 and so,
by the first of the two facts above, gcd(10, 5) = 5. It follows, then, that gcd(100, 15) = 5.
The method, known as the Euclidean algorithm, therefore employs successive use of the
division theorem. Heres another, more substantial, example.
Example 6.4 Let us calculate gcd(2247, 581). We have
2247
581
504
77
42
35

=
=
=
=
=
=

581 3 + 504
504 1 + 77
77 6 + 42
42 1 + 35
35 1 + 7
7 5.

It follows that gcd(2247, 581) = 7. The reason is that these divisions establish the
following equalities:
gcd(2247, 581) =
=
=
=
=
this last equality because 7 | 35.

86

gcd(581, 504)
gcd(504, 77)
gcd(77, 42)
gcd(42, 35)
gcd(35, 7) = 7,

6.7. Some consequences of the Euclidean algorithm

6.7

Some consequences of the Euclidean algorithm

A useful consequence of the Euclidean algorithm is that we can use it to express d, the
gcd of a and b, as an integer linear combination of a and b, by which we mean the
following.
Theorem 6.3 Suppose a and b are integers (at least one of which is not 0) and let
d = gcd(a, b). Then there are m, n Z such that d = am + bn.
For a formal, general, proof of this, see Biggs Section 8.4 or Eccles Section 17.1. Ill give
an example here, which will demonstrate the way in which we can find m, n once we
have applied the Euclidean algorithm. Lets work with a = 2247 and b = 581. We want
to find integers m and n such that
2247m + 581n = 7.
We can use the calculation we had above, by working backwards through the sequence
of equations, as follows:
7 =
=
=
=
=
=

42 35
42 (77 42) = 42 2 77
(504 77 6) 2 77 = 504 2 77 13
504 2 (581 504) 13 = 504 15 581 13
(2247 581 3) 15 581 13 = 2247 15 581 58
2247 15 + 581 (58).

So we see that m = 15 and n = 58 will work. Not an answer you could easily have
guessed! Notice how, in each line of this calculation, we use one of the lines from the
calculation that arises from the Euclidean algorithm (and we also simplify as we go
along). You will get used to this with practice. There are many examples you could
make up for yourself in order to help you practise.
Activity 6.3 In the above procedure to find m and n, figure out exactly which part
of the Euclidean algorithm calculation is being used at each stage.
Activity 6.4 Choose two particular positive integers a and b. Use the Euclidean
algorithm to find gcd(a, b) and then use your calculation to find integers m and n
such that d = ma + nb. Do this several times with different choices of numbers until
you have mastered it.
The fact that there are m, n Z such that gcd(a, b) = am + bn (that is, that the gcd of
two integers is an integer linear combination of them) is very useful. Heres one nice
consequence.
We know that, by definition, if c | a and c | b, then c d = gcd(a, b). But we can say
something stronger, namely that c | d.
Theorem 6.4 Suppose that a, b N and let d = gcd(a, b). If c | a and c | b, then c | d.

87

6. Divisibility and prime numbers

Proof
There are integers m, n such that d = ma + nb. Suppose that c | a and c | b. Then c | ma
and c | nb, so c | (ma + nb). But this says c | d, as required.
Heres another consequence:
Theorem 6.5 For a, b N, let d = gcd(a, b). Then, for c N, there are integers m and
n such that c = am + bn if and only if d | c.
Proof
Suppose c = am + bn. Now, d = gcd(a, b) satisfies d | a and d | b, so, also, d | (ma + nb)
and hence d | c.
Conversely, suppose d | c, so that for some integer k, c = kd. Now, there are m, n Z
with d = ma + nb. Then,
c = kd = k(ma + nb) = (km)a + (kn)b = M a + N b,

where M, N Z. This shows that c can be written as an integer linear combination of a


and b, as required.
We also have:
Theorem 6.6 Suppose that a, b N are coprime (meaning gcd(a, b) = 1). If a | r and
b | r, then ab | r.
This is not generally true if the numbers a, b are not coprime. Think of a
counterexample!
Proof
Because gcd(a, b) = 1, there are integers m, n such that 1 = ma + nb. So
r = r 1 = r(ma + nb) = mra + nrb.
Because a | r and b | r, there are integers k1 , k2 such that r = k1 a and r = k2 b. So
r = mra + nrb = m(k2 b)a + n(k1 a)b = (mk2 + nk1 )ab,
which shows that ab | r.

6.8

Prime numbers

A prime number (or a prime) is a natural number p 2 with the property that the only
divisors of p are 1 and p. In a precise sense, which well see shortly, primes are the
building blocks of the natural numbers.
One important property of primes is that if a prime divides a product of numbers, then
it must divide at least one of the numbers in the product. This isnt true for
non-primes: for example, 4 | 12 = 2 6, but 4 does not divide either 2 or 6.

88

6.9. Prime factorisation: the Fundamental Theorem of Arithmetic

Theorem 6.7 Suppose that p is a prime number and that a, b N. If p | ab, then p | a
or p | b.
Proof
The proof makes use of the useful fact (seen above) that the gcd of any two numbers
can be written as an integer linear combination of the numbers (Theorem 6.3). Suppose,
then, that p is prime and that p | ab. If p | a, then the conclusion of the theorem holds, so
suppose p 6 | a. Then p and a have no common positive divisor other than 1. (The only
positive divisors of p are 1 and p because it is prime, and p does not divide a, by
assumption.) So gcd(p, a) = 1 and therefore there exist integers m and n with
1 = mp + na. So, b = b(mp + na) = (bm)p + n(ab). Now, p | ab and hence p | n(ab).
Clearly, also, p | (bm)p. It follows that p | b, as required.
This result can easily be extended: if a1 , a2 , . . . , an N and p | a1 a2 . . . an , then, for some
i between 1 and n, p | ai .
Activity 6.5 Prove this generalisation of Theorem 6.7.

6
6.9

6.9.1

Prime factorisation: the Fundamental Theorem of


Arithmetic
The Fundamental Theorem

The Fundamental Theorem of Arithmetic is the name given to the following Theorem.
(As its name suggests, this is an important theorem!)
Theorem 6.8 (Fundamental Theorem of Arithmetic)
Every integer n 2 can be expressed as a product of one or more prime numbers.
Furthermore, there is essentially only one such way of expressing n: the only way in
which two such expressions for n can differ is in the ordering of the prime factors.

The expression of an integer as a product of primes is known as its prime


decomposition. For example, the prime decomposition of 504 is
504 = 2 2 2 3 3 7 = 23 .32 .7.
(Note that, in this last expression, the dot, . denotes multiplication.)
The proof of the Fundamental Theorem is not very difficult, given the results we
already have about prime numbers. Establishing that each positive integer can be
written as a product of primes is easy. Showing that such a decomposition is essentially
unique (that is, unique up to the ordering of the factors) is a little trickier, but can be
established using Theorem 6.7.

89

6. Divisibility and prime numbers

6.9.2

Proof of the Fundamental Theorem

There are two things to prove:


Any n 2 can be written as a product of primes: n = pk11 pk22 . . . pkr r , where
p1 < p2 < < pr are primes and k1 , k2 , . . . , kr N. (Existence)
This is essentially unique: if pk11 pk22 . . . pkr r = q1l1 q2l2 . . . qsls , are two equal such
expressions, then r = s, pi = qi and ki = li for all i. (Uniqueness).
Fundamental Theorem: Existence
We use (strong) induction. For n 2, let P (n) be the statement:
n can be written as a product of primes.
Base case: n = 2 is a product of a single prime, so P (2) is true.

Assume, inductively, that k N and that P (s) is true for all s k. (Were using strong
induction.) Consider k + 1. This could be a prime number, in which case were done and
P (k + 1) is true. Otherwise k + 1 = ab where 1 < a, b < k + 1. But then P (a) and P (b)
are true (by assumption) so each of a, b is a product of primes. So, therefore, is k + 1.
Fundamental Theorem: Uniqueness
The idea here is simple enough, but it is notationally difficult to write a detailed proof.
Ill try to give you the basic idea.
Suppose p1k1 pk22 . . . pkr r = q1l1 q2l2 . . . qsls . Cancel as much as possible all (powers of) common
primes. We want to show both resulting sides are 1 (so that the expressions were the
same.)
Suppose not. Then the resulting LHS (left-hand side) is a product of some powers of
some pi and the RHS is a product of some powers of some qi , and no pi equals any qj
(because weve cancelled). Take any p appearing in the LHS. Then p | RHS so (by an
earlier result, If p | a1 a2 . . . an , then, for some i, p | ai ) p divides one of the qj . This isnt
possible since pi 6= qj .
We can use the Fundamental Theorem of Arithmetic to prove that there are infinitely
many primes. You can find an outline of this proof in Chapter 2.
Activity 6.6 Look again at the proof, in Chapter 2, that there are infinitely many
primes, and understand where the Fundamental Theorem of Arithmetic is used in
the proof.

90

6.9. Learning outcomes

Learning outcomes
At the end of this chapter and the Essential reading and activies, you should be able to:
state clearly what it means to say that one number divides another
state the Division Theorem and, given two numbers, find the remainder and
quotient when one is divided by the other
understand what is meant by the representation of an integer with respect to a
particular basis and be able to work with this definition
state the definition of the greatest common divisor of two numbers
use the Euclidean algorithm to find the gcd of two numbers
demonstrate that you know that the gcd of two numbers can always be expressed
as an integer linear combination of the two numbers; and be able to express the gcd
in this way for any two numbers
use the Euclidean algorithm to express the gcd of two numbers as an integer linear
combination of them
state what is meant by a prime number
state the Fundamental Theorem of Arithmetic.

Sample examination questions


Question 6.1
Find d = gcd(2406, 654). Express d in the form d = 2406m + 654n for integers m, n.
Question 6.2
Suppose that a, b N, both non-zero, and let d = gcd(a, b). We know that, by
definition, if c | a and c | b, then c d. Prove, in fact, that c | d.
Question 6.3
Suppose a, b N and that d = gcd(a, b). Prove that, for c N, there are integers m and
n such that c = am + bn if and only if d | c.
Question 6.4
Suppose a, b N. Prove that if there are integers m and n such that am + bn = 1 then a
and b are coprime.
Question 6.5
Prove that for all n N, the numbers 9n + 8 and 6n + 5 are coprime.
Question 6.6
Suppose that a, b N and that gcd(a, b) = 1. Suppose that a | r and b | r. Prove that
ab | r.

91

6. Divisibility and prime numbers

Question 6.7
The Fibonacci numbers f1 , f2 , f3 , . . . are defined as follows: f1 = f2 = 1 and, for n 3,
fn = fn1 + fn2 . Prove that for all n N, gcd(fn , fn+1 ) = 1.
Question 6.8
Suppose that p1 , p2 , . . . , pk are primes and that a, b N are given by
a = pl11 pl22 . . . plkk ,

mk
1 m2
b = pm
1 p2 . . . pk .

Prove that
gcd(a, b) = pr11 pr22 . . . prkk ,
where, for i = 1 to k, ri is the smaller of the two numbers li and mi .
Question 6.9
Suppose a, b N satisfy gcd(a, b) = 1 and, for some k N, ab = k 2 . Prove that for some
integers m, n, a = m2 and b = n2 .

6
Comments on selected activities
Feedback to activity 6.1
Certainly, 7 divides both 35 and 77, but there is no larger common divisor. (For, the
only larger divisor of 35 is 35 itself, and this does not divide 77.)
Feedback to activity 6.2
We prove that D(a, b) = D(b, r). The result on gcds will follow since gcd(x, y) is the
maximal element of D(x, y).
Suppose m D(a, b). Then m | a and m | b. It follows that m | (a bq); that is, m | r. So
m | b and m | r and hence m D(b, r). Therefore D(a, b) D(b, r).
Suppose m D(b, r). Then m | b and m | r. It follows that m | (bq + r); that is, m | a. So
m | b and m | a and hence m D(a, b). Therefore D(b, r) D(a, b).
Feedback to activity 6.5
To prove this, we can (unsurprisingly) use induction on n. Let P (n) be the statement:
If a1 , a2 , . . . , an N and p | a1 a2 . . . an , then, for some i between 1 and n, p | ai .
Then P (1) is clearly true (and P (2) is the theorem just proved). Suppose P (n) is true
and lets show P (n + 1) follows. So, suppose a1 , a2 , . . . , an+1 N and p | a1 a2 . . . an an+1 .
Well, since p | Aan+1 , where A = a1 a2 . . . an , we can apply the n = 2 case to see that p | A
or p | an+1 . But, by P (n), if p | A = a1 a2 . . . an then p | ai for some i between 1 and n. So
were done: p divides at least one of the ai for i between 1 and n + 1.
Feedback to activity 6.6
Heres the proof, with explicit reference to the Fundamental Theorem.
Suppose there were not infinitely many primes, so theres a largest prime, M , say. Let
X = (2 3 5 7 11 M ) + 1.

92

6.9. Sketch answers to or comments on sample questions

Since X > M , X is not a prime. By the Fundamental Theorem of Arithmetic, X has a


prime divisor p which satisfies 1 < p < X. This p must be one of the numbers
2, 3, 5, . . . , M (since these are the only primes). However, X is not divisible by any of
these numbers. So we have a contradiction. We conclude there are infinitely many
primes.

Sketch answers to or comments on sample questions


Answer to question 6.1
See Biggs, Section 8.4. The gcd is 6 and we have 6 = 28 2406 + (103) 654.
Answer to question 6.2
We know that there are integers m, n such that d = ma + nb. Suppose that c | a and c | b.
Then c | ma and c | nb, so c | (ma + nb). But this says c | d, as required.
Answer to question 6.3
Suppose first that c = am + bn for some some integers m and n. Now, d = gcd(a, b)
satisfies d | a and d | b, so we also have d | (ma + nb) and hence d | c. Conversely, suppose
d | c, so that for some integer k, c = kd. Now, the gcd d = gcd(a, b) can be written as an
integer linear combination of a and b, so there are m, n Z with d = ma + nb. Then,
c = kd = k(ma + nb) = (km)a + (kn)b = M a + N b,
where M, N Z. This shows that c can be written as an integer linear combination of a
and b, as required.
Answer to question 6.4
This follows from the previous question, but we can prove it directly. Suppose that
d N, that d | a and d | b. Then d | (am + bn), which means d | 1. Therefore, we must have
d = 1. That is, the only positive common divisor of a and b is 1 and hence gcd(a, b) = 1
and the numbers are coprime.
Answer to question 6.5
We have 2(9n + 8) 3(6n + 5) = 1. So, if d = gcd(9n + 8, 6n + 5), then d | (9n + 8) and
d | (6n + 5), so d | 2(9n + 8) 3(6n + 5). But this says d | 1 and hence d = 1.
Answer to question 6.6
Before we begin, lets just note that this property does not hold if a and b are not
coprime. For example, 6 | 12 and 4 | 12 but 24 6 | 12. Suppose then that a | r and b | r. The
fact that gcd(a, b) = 1 means that there are integers m, n such that 1 = ma + nb. So
r = r 1 = r(ma + nb) = mra + nrb.
Now, because a | r and b | r there are integers k1 , k2 such that r = k1 a and r = k2 b. So
r = mra + nrb = m(k2 b)a + n(k1 a)b = (mk2 + nk1 )ab,

93

6. Divisibility and prime numbers

which shows that r is an integer multiple of ab and hence ab | r.


Answer to question 6.7
We prove this by induction on n. Let P (n) be the statement that gcd(fn , fn+1 ) = 1.
When n = 1, this is true, because gcd(f1 , f2 ) = gcd(1, 1) = 1. It is true also when n = 2
because f3 = 2 and hence gcd(f2 , f3 ) = gcd(1, 2) = 1. Suppose, inductively, that k 2
and gcd(fk , fk+1 ) = 1. We want to show that gcd(fk+1 , fk+2 ) = 1. Now, fk+2 = fk+1 + fk .
Therefore, if d | fk+1 and d | fk+2 then d | fk+1 and d | (fk+2 fk+1 ) = fk . So any common
divisor of fk+1 and fk+2 is also a common divisor of fk and fk+1 . Also, if d | fk and
d | fk+1 then we also have d | fk+1 and d | (fk + fk+1 ) = fk+2 , so any common divisor of fk
and fk+1 is also a common divisor of fk+1 and fk+2 . This all shows that the common
divisors of the pair {fk+1 , fk+2 } are precisely the same as the common divisors of the
pair {fk+1 , fk }. Therefore, the greatest common divisors of each pair are equal. That is,
gcd(fk+1 , fk+2 ) = gcd(fk+1 , fk ) = gcd(fk , fk+1 ) = 1,
where we have used the inductive hypothesis for the last equality.

You can also establish this result by thinking about the way in which the Euclidean
algorithm would work in finding the gcd of fk and fk+1 .
Answer to question 6.8
Let d = pr11 pr22 . . . prkk . Then because ri is the smaller of li and mi , we have ri li and
ri mi . So pri | pli and pri | pmi , for each i. Therefore d | a and d | b. Explicitly, for
example,
a = pl11 pl22 . . . plkk
= pr11 pr22 . . . prkk pl11 r1 pl22 r2 . . . plkk rk
= d(pl11 r1 pl22 r2 . . . plkk rk ),
and because lk rk is a non-negative integer for each i, the number in parentheses is an
integer. This shows d | a. The fact that d | b can be similarly shown. So d is a common
divisor of a and b.
Suppose D is any common divisor of a and b. Then, by the Fundamental Theorem of
Arithmetic, D can be written as a product of primes. Let p be any one of these. Then
p | a and p | b. Now, we know that if p | a1 a2 . . . an then p | ai for some i. (This follows from
the results of Section 6.8.) The only primes appearing in the decomposition of a and b
are p1 , p2 , . . . , pk , so we can deduce that for some i, p | pi which means p = pi (given that
p and pi are primes). So the prime decomposition of D is of the form
D = ps11 ps22 . . . pskk
for some non-negative integers si . Now, suppose that, for some i, si > li . Then, for some
integers M and N ,
D = psi i M, a = plii N,
where, because they involve only products of the other primes, neither N or M is
divisible by pi . Now, the fact that D | a means that theres an integer L with a = LD, so
plii N = Lpsi i M

94

6.9. Sketch answers to or comments on sample questions

and hence
N = Lpsi i li M.
But this shows, since si li 1 (because si , li Z and sI > li ), that pi | N , contradicting
the observation that pi 6 | N . So we must have si li for all i. A similar argument shows
that si mi for all i. So si ri = min(li , mi ) and hence D d. The result follows.
Answer to question 6.9
We use the Fundamental Theorem of Arithmetic. Let k have prime decomposition
k = p1 1 p2 2 . . . pr r . Then
1 22
r
. . . p2
ab = k 2 = p2
1 p2
r .
It follows that a and b must have prime decompositions involving only the primes
p1 , p2 , . . . , pr , and that each of a, b takes the form ps11 ps22 . . . psrr where si is a non-negative
integer. But we cannot have, for any i, pi | a and also pi | b, for this would mean that
i
pi > 1 is a common divisor of a, b, contradicting gcd(a, b) = 1. So, for each i, p2
i
divides precisely one of a and b and pi does not divide the other of the two numbers. In
1 22
2r
other words, each of a, b takes the form p2
where i = 0 or i = i . This
1 p2 . . . pr
1 2
r 2
can be written as (p1 p2 . . . pr ) , and hence there are integers m, n such that a = m2
and b = n2 .

95

6. Divisibility and prime numbers

96

Chapter 7
Congruence and modular arithmetic
Essential reading

R
R

One or both of the following:

7.1

Biggs, N.L. Discrete Mathematics. Chapter 13, Sections 13.1-13.3.


Eccles, P.J. An Introduction to Mathematical Reasoning. Chapters 19-21.

Introduction

In this chapter, we study congruence and we describe modular arithmetic. This builds
on the ideas and results on divisibility and equivalence relations that we met in earlier
chapters.
You go to sleep at 10 oclock and you sleep for 8 hours. At what time do you wake?
Well, this is simple: you wake at 6 oclock. What youre doing in this calculation is
youre doing whats called arithmetic modulo 12. The answer is not 10 + 8 = 18, because
the clock re-starts once the hour of 12 is reached. This is a fairly simple idea, when
expressed in these terms, and its the key concept behind modular arithmetic, a topic we
cover later in this chapter. We now take a more abstract approach.

7.2
7.2.1

Congruence modulo m
The congruence relation

Suppose that m is a fixed natural number, and lets define a relation R on the integers
by a R b if and only if b a is a multiple of m. That is, a R b m | (b a). Then R is
an equivalence relation.
Activity 7.1 Prove that R is an equivalence relation on Z.
This relation is so important that it has a special notation. If a R b, we say that a and b
are congruent modulo m and we write a b (mod m).
If a and b are not congruent modulo m, then we write a 6 b (mod m).
The division theorem tells us that for any integers a and for any m N, there are
unique integers q and r such that
a = qm + r and 0 r < b.

97

7. Congruence and modular arithmetic

What this implies for congruence is that, for any a, there is precisely one integer r in
the range 0, 1, . . . , m 1 such that a r(mod m).
Congruence relations can be manipulated in many ways like equations, as the following
Theorem shows.
Theorem 7.1 Suppose that m N and that a, b, c, d Z with a b (mod m) and
c d (mod m). Then
(i) a + c b + d (mod m)
(ii) a c b d (mod m)
(iii) ac bd (mod m)
(iv) k Z, ka kb (mod m)
(v) n N, an bn (mod m)
Proof
I leave (i) and (ii) for you to prove. Heres how to prove (iii): because a b (mod m)
and c d (mod m), we have m | (b a) and m | (d c). So, for some integers k, l,
b a = km and d c = lm. That is, b = a + km and d = c + lm. So
bd = (a + km)(c + lm) = ac + (kmc + alm + klm2 ) = ac + m(kc + al + klm).

Now, kc + al + klm Z, so bd ac = (kc + al + klm)m is a multiple of m; that is,


m | (bd ac) and ac bd (mod m). Part (iv) follows from (iii) by noting that
k k (mod m), and part (v) follows by repeated application of (iii) (or, by (iii) and
induction.).
Activity 7.2 Prove parts (i) and (ii) of Theorem 7.1.
Theorem 7.1 is useful, and it enables us to solve a number of problems. Here are two
examples.
Example 7.1 Suppose that the natural number x has digits xn xn1 . . . x0 (when
written, normally, in base 10). So, for example, if x = 1246 then
x0 = 6, x1 = 4, x2 = 2, x3 = 1. Then 9 divides x if and only if x0 + x1 + + xn is a
multiple of 9. (So this provides a quick and easy way to check divisibility by 9. For
example, 127224 is divisible by 9 because 1 + 2 + 7 + 2 + 2 + 4 = 18 is.)
How do we prove that this test works? We can use congruence modulo 9. Note that
x = x0 + (10)x1 + (10)2 x3 + + (10)n xn .
Now, 10 1 (mod 9), so, for each k N, 10k 1 (mod 9). Hence
x = x0 + (10)x1 + (10)2 x2 + + (10)n xn x0 + x1 + + xn (mod 9).
A number is divisible by 9 if and only if it is congruent to 0 modulo 9, so
9 | x x 0 (mod 9)
x0 + x1 + + xn 0 (mod 9)
9 | (x0 + x1 + + xn ).

98

7.2. Congruence modulo m

This is precisely what the test says.


Example 7.2 We can use congruence to show that there are no integers x and y
satisfying the equation 7x2 15y 2 = 1.
We prove this by contradiction. So suppose such x and y did exist. Then, because
15y 2 is a multiple of 5, wed have 7x2 1 (mod 5). Now, x is congruent to one of the
numbers 0, 1, 2, 3, 4 modulo 5. That is, we have:
x 0 or x 1 or x 2 or x 3 or x 4 (mod 5).
So,
x2 0 or x2 1 or x2 4 or x2 9 or x2 16 (mod 5).
But 9 4 (mod 5) and 16 1 (mod 5), so, in every case, either x2 0 or x2 1 or
x2 4 (mod 5). It follows, then, that in all cases, we have
7x2 0 or 7x2 7 2 or 7x2 28 3 (mod 5).
So there does not exist an integer x with 7x2 1 (mod 5), and hence there are no
integer solutions to the original equation.

7.2.2

Congruence classes

What are the equivalence classes of the congruence relation, modulo a particular
positive integer m? These are often called the congruence classes modulo m. Lets
denote these by [x]m . For a particular x Z, [x]m will be all the integers y such that
y x (mod m). We know that each x is congruent to precisely one of the integers in the
range 0, 1, 2, . . . , m 1, and we know (from the general theory of equivalence relations)
that if x y (mod m) then [x]m = [y]m . So it follows that for each x Z, well have
[x]m = [0]m , or [x]m = [1]m , . . . or [x]m = [m 1]m .
So there are precisely m equivalence classes,
[0]m , [1]m , . . . , [m 1]m .
The theory of equivalence relations tells us that these form a partition of Z: they are
disjoint and every integer belongs to one of them. But what are they? Well, [0]m is the
set of all x such that x 0 (mod m), which means m | x. So [0]m is the set of all integers
divisible by m. Generally, for 0 r m 1, [r]m will be the set of x such that
x r (mod m). It follows that [r]m is the set of all integers x which have remainder r
on division by m.
Example 7.3 Suppose m = 4. Then the congruence classes are:
[0]4 = {. . . , 8, 4, 0, 4, 8, . . . },
[1]4 = {. . . , 7, 3, 1, 5, 9, . . . },
[2]4 = {. . . , 6, 2, 2, 6, 10, . . . },
[3]4 = {. . . , 5, 1, 3, 7, 11, . . . }.

99

7. Congruence and modular arithmetic

7.3

Zm and its arithmetic

When, in an earlier chapter, we looked at how the integers may be constructed from the
natural numbers through using an equivalence relation, we also saw that we could do
arithmetic with the equivalence classes. We can also do this here, and the resulting
addition and multiplication operations are known as modular arithmetic.
First, lets introduce a new piece of notation. For m N, Zm is called the set of integers
modulo m, and is the set of equivalence classes
[0]m , [1]m , . . . , [m 1]m .
So Zm has m members. We can define operations and on Zm as follows:
[x]m [y]m = [x + y]m , [x]m [y]m = [xy]m .
For example, when m = 4, [2]4 [3]4 = [5]4 = [1]4 and [2]4 [3]4 = [6]4 = [2]4 .
In practice, if we do not use the and symbols and we simply write x instead of
[x]m , using values of x between 0 and m 1. We would say that we are in Zm if thats
not clear from the context. So the above two calculations may be written:
in Z4 , 2 + 3 = 1, and 2 3 = 2.

Note that we use only the symbols 0, 1, . . . , m 1, so we do not write 2 + 3 = 5. Instead


we replace 5 by the number that it is congruent to modulo 5 and which lies between 0
and 4.
The equations weve just written are entirely equivalent to the statements
2 + 3 1 (mod 5), 2 3 2 (mod 5).
The addition and multiplication operations weve defined on Zm obey a number of rules
that are familiar from normal addition and multiplication of integers.
Theorem 7.2 Let m N. In Zm , for all a, b, c,
(i) a + b = b + a
(ii) a b = b a
(iii) (a + b) + c = a + (b + c)
(iv) (a b) c = a (b c)
(v) a + 0 = a
(vi) a 1 = a
(vii) a (b + c) = (a b) + (a c)
(viii) for each a Zm there is a unique element a Zm such that a + (a) = 0.

100

7.4. Invertible elements in Zm

Lets think a little about rule (viii) in this Theorem. Suppose were in Z4 and that
a = 3. What is a? Well, what we want is an element of Zm which when added to 3
gives 0 when we are doing arithmetic in Z4 . Now, 3 + 1 = 0 in Z4 because 3 + 1 is
congruent to 0 modulo 4, so 3 = 1. (Alternatively, we can note that 3 = 1(4) + 1,
so 3 has remainder 1 on division by 4, so 3 1 (mod 4).)
Activity 7.3 In Z9 , what is 4?
It is important to realise that arithmetic in Zm does not obey all the nice properties
that normal arithmetic of integers obeys. In particular, we cannot generally cancel. For
example, in Z4 ,
2 3 = 2 = 2 1,
but we cannot cancel the 2 (that is, divide both sides by 2) to deduce that 3 = 1,
because 3 6= 1 in Z4 . (The reason we cannot cancel the 2 is that 2 has no inverse in Z4 .
Existence of inverses is the topic of the next section.)

7.4

Invertible elements in Zm

A member x of Zm is invertible if there is some y Zm such that (in Zm ) xy = yx = 1.


If such a y exists, it is called the inverse of x and is denoted by x1 .
Example 7.4 In Z10 , 3 has inverse 7 because, in Z10 , 3 7 = 1 (because
3 7 = 21 1 (mod 10)).
Example 7.5 In Z10 , 5 has no inverse. There is no x such that 5x = 1. For, modulo
10, for any x Z, 5x 0 or 5. (This is just the familiar fact that any multiple of 5
has as the last digit 0 or 5.)
If x Zm is invertible, then it is possible to cancel x from both sides of an equation in
Zm . That is, we have
xa = xb a = b (in Zm ).
This is not, as we have seen, generally true, but it works when x has an inverse because
in this case
xa = xb x1 (xa) = x1 (xb) (x1 x)a = (x1 x)b 1a = 1b a = b.
Which x Zm are invertible? The answer is given by the following theorem.
Theorem 7.3 Suppose m N. Then an element x of Zm is invertible if and only if x
and m are coprime (that is, gcd(x, m) = 1).
Proof
Suppose x is invertible, so that there is y with xy = 1 in Zm . This means that
xy 1 (mod m), so, for some k Z, xy = 1 + km. Let d = gcd(x, m). Then d | x and
d | m, so d | (xy km). That is, d | 1, from which it follows that d = 1 and x and m are
coprime.

101

7. Congruence and modular arithmetic

Conversely, suppose gcd(x, m) = 1. Then (by the fact that the gcd can be written as an
integer linear combination), there are integers y, z such that 1 = yx + zm. But this
means yx 1 (mod m), so, in Zm , yx = 1 and x has inverse y.
As a result of this theorem, we see that if p is a prime, then every non-zero element x of
Zp is invertible. This is because gcd(x, p) = 1.

7.5
7.5.1

Solving equations in Zm
Single linear equations

Suppose we want to solve, in Zm , the equation ax = b. That is, we want to find x


between 0 and m 1 such that ax b (mod m). This may have no solutions. Indeed,
suppose we take b = 1. Then the equation were confronted with is ax = 1, which has a
solution if and only if a is invertible (by definition of inverse). So if a has no inverse in
Zm , then such a linear equation will not always have a solution. If, however, a is
invertible, then we can see that the equation ax = b in Zm has solution x = a1 b,
because a(a1 b) = (aa1 )b = 1b = b.

How do you find a solution? Trial and error is not efficient if the numbers involved are
large. But we can use the Euclidean algorithm. For suppose we want to solve ax = b in
Zm , where gcd(a, m) = 1. Then, by using the Euclidean algorithm, we have seen how we
can find integers k, l such that 1 = ak + ml. So, b = abk + mlb, from which it follows
that, modulo m, a(bk) 1. So it looks like bk will be a solution. But remember that
were looking for a solution in Zm , so we will want to find x Zm such that
x bk (mod m). Heres an example.
Example 7.6 Suppose we want to solve the equation 83x = 2 in Z321 . We can
check that 83 and 321 are coprime by the Euclidean algorithm, as follows: We have
321
83
72
11
6
5

=
=
=
=
=
=

83 3 + 72
72 1 + 11
11 6 + 6
61+5
51+1
1 5.

It follows that gcd(321, 83) = 1. Now, working backwards, we can express 1 as an


integer linear combination of 83 and 321:
1 =
=
=
=
=

102

65
6 (11 6) = 6 2 11
(72 11 6) 2 11 = 72 2 11 13
72 2 (83 72) 13 = 72 15 83 13
(321 83) 3) 15 83 13 = 321 15 83 58.

7.5. Solving equations in Zm

This tells us that


83 (58) 1 (mod 321).
So,
83 (116) 2 (mod 321).
Now, we want to find x in the range 0 to 321 such that 116 x (mod 321 ). The
answer is x = 205. So, finally, then, we see that, in Z321 , the equation 83x = 2 has
solution x = 205.
Activity 7.4 This calculation also reveals that 83 is invertible in Z321 . Why? And
what is the inverse of 83 in Z321 ?
More generally, we can ask: when does ax = b have a solution in Zm ?
The answer is: when d = gcd(a, m) divides b.
So a special case is when d = 1.
That is, we have the following theorem.
Theorem 7.4 In Zm , ax = b has a solution if and only if d | b, where d = gcd(a, m).
Proof
First part, =: Suppose ax0 = b in Zm . Then ax0 b = km for some k Z. So,
b = ax0 km. Since d = gcd(a, m), d | a and d | m, so d | (ax0 m). That is, d | b.

Second part, =: Suppose d | b, so b = db1 for some b1 Z. There are x1 , y1 such that
d = x1 a + y1 m. Then, b = db1 = (x1 b1 )a + (y1 b1 )m.
So, in Zm , a(x1 b1 ) = b. That is, x1 b1 (reduced modulo m) is a solution.
This theorem suggests a general method for solving ax = b in Zm :
Find d = gcd(a, m).
If d 6 | b, theres no solution.
If d | b, write b = db1 . Use Euclidean algorithm to find x, y Z such that
d = xa + ym. Then a solution is xb1 , reduced modulo m.

7.5.2

Systems of linear equations

We can also consider simultaneous linear equations in Zm . It should be realised that


there might be no solutions, or more than one solution.
Example 7.7 Lets solve the following two equations simultaneously in Z6 :
2x + 3y = 1, 4x + 3y = 5.
Subtracting the first equation from the second gives 2x = 4. You might be tempted
to cancel the 2 and deduce that x must be 2. But wait! You cant cancel unless 2
and 6 are coprime, and they are not (since their gcd is 2). Instead, you can check for

103

7. Congruence and modular arithmetic

each of the elements of Z6 whether 2x = 4. Of course, x = 2 is a solution, but so also


is x = 5 because, in Z6 , 2(5) = 10 = 4. You can also check by calculating the other
values of 2x that x = 2, 5 are the only solutions. Now, from the first equation,
3y = 1 2x. When x = 2 or 5, 2x = 4, so this is 3y = 1 4 = 3 = 3. So we now
have 3y = 3. Again, we cannot cancel the 3. Instead we check, for each y Z6 ,
whether it is a solution, and we find that 1, 3 and 5 are all solutions. What this
argument shows is that the possible solutions are
(x, y) = (2, 1), (2, 3), (2, 5), (5, 1), (5, 3), (5, 5).
In fact, it can easily be checked (by substituting these pairs of values into the original
equations) that these are indeed solutions. So this system has 6 different solutions.
Activity 7.5 Check, by substituting into the original equations, that each of these
six possible solution pairs (x, y) is indeed a solution.
Example 7.8 Consider the following system of simultaneous equations in Z7 :
3x + y = 1, 5x + 4y = 1.

If we multiply the first equation by 4, we obtain 12x + 4y = 4, which is the same (in
Z7 ) as 5x + 4y = 4. But the second equation says 5x + 4y = 1. Since 1 and 2 are not
equal in Z7 , these equations are inconsistent, so there are no solutions to this
system.

Learning outcomes
At the end of this chapter and the Essential reading and activities, you should be able
to:
state the definition of the equivalence relation congruence modulo m
prove that congruence modulo m is an equivalence relation
demonstrate an understanding of the links between congruence modulo m and
remainder on division by m
state and prove standard properties of congruence
apply congruence to show, for example, that equations have no solutions
demonstrate an understanding of what the congruence classes are
demonstrate an understanding of what Zm means and how its addition and
multiplication are defined
find the negatives of elements of Zm
state the definition of an invertible element of Zm
demonstrate that you know an element x in Zm is invertible if and only if x and m
are coprime
find the inverse of an invertible element
solve linear equations and systems of linear equations in Zm .

104

7.5. Sample examination questions

Sample examination questions


Question 7.1
Show that n 7 (mod 12) n 3 (mod 4). Is the converse true?
Question 7.2
Show that for all n Z, n2 0 or 1 (mod 3). Hence show that if 3 divides x2 + y 2 then
3 | x or 3 | y. Use this to prove that there are no integers x, y, z such that x2 + y 2 = 3z 2 ,
other than x = y = z = 0.
Question 7.3
Show that, for all n N, 33n+1 3 5n (mod 11) and that 24n+3 8 5n (mod 11).
Hence show that for all n N, 11 | (33n+1 + 24n+3 ).
Question 7.4
By working modulo 7, prove that 2n+2 + 32n+1 is divisible by 7. (This result was proved
in a different way, using induction, in the Questions at the end of Chapter 3.)
Question 7.5
Show that, for any m N, the element m 1 of Zm is invertible, with
(m 1)1 = m 1.

Question 7.6
Find the solution(s), if any, to the system of simultaneous equations
2x + y = 1, x + 2y = 3,
when
(i) the system is a system in Z7 ;
(ii) the system is a system in Z6 .
Question 7.7
Prove that 290 is an invertible element of Z357 and find its inverse.
Question 7.8
Solve the equation 10x = 3 in Z37 .

Comments on selected activities


Feedback to activity 7.1
We have m | 0 = a a, so the relation is reflexive. Symmetry follows from
m | (b a) m | (a b). Suppose that a R b and b R c. Then m | (b a) and m | (c b),
so m | ((b a) + (c b)) = (c a) and hence a R c. Thus, R is transitive.

105

7. Congruence and modular arithmetic

Feedback to activity 7.2


We have m | (b a) and m | (d c). So m | ((b a) + (d c)), which is the same as
m | ((b + d) (a + c)), so a + c b + d (mod m). We also have m | ((b a) (d c)),
which is the same as m | ((b d) (a c)) so a c b d (mod m).
Feedback to activity 7.3
Because 4 + 5 = 9 0 (mod 9), we have 4 = 5 in Z9 .
Feedback to activity 7.4
The calculation shows that 83 (58) 1 (mod 321). We also know that
58 263 (mod 321) because 58 + 263 = 312 0 (mod 321). So well have
83 263 1 (mod 321). This shows that 83 is invertible in Z321 and that its inverse is
263.

Sketch answers to or comments on sample questions


Answer to question 7.1
If n 7 (mod 12) then, for some integer k, m = 7 + 12k and so
m = 3 + 4 + 12k = 3 + 4(1 + 3k). This means that n 3 (mod 4), because 1 + 3k is an
integer. The converse is false, because, for example, 3 3 (mod 4), but 3 6 7 (mod 12).

Answer to question 7.2


We have, modulo 3, n 0 or n 1 or n 2. So, respectively, n2 02 = 0 or
n2 12 = 1 or n2 22 = 4 1. So in all cases n2 is congruent to 0 or 1. Suppose
3 | (x2 + y 2 ). Then, modulo 3, x2 + y 2 0. But each of x2 and y 2 is congruent to 0 or 1.
If either or both are congruent to 1, then wed have x2 + y 2 1 or x2 + y 2 2. So we
can see that we must have x2 0 and y 2 0. This means x 0 and y 0, which is the
same as 3 | x and 3 | y.
Now suppose that, for integers x, y, z, x2 + y 2 = 3z 2 , where not all of x, y, z are zero. If d
is a common factor of x, y and z, then we can write x = dx1 , y = dy1 and z = dz1 , where
x1 , y1 , z1 Z. We can then see that d2 x21 + d2 y12 = 3d2 z12 , so that x21 + y12 = 3z12 . What
this shows is that if there are any integer solutions, then there is one in which x, y, z
have no common divisors (for any common divisors can be cancelled). So assume were
dealing with such a solution. Now, x2 + y 2 = 3z 2 implies 3 | (x2 + y 2 ) (noting that neither
side of the equation is 0 because not all of x, y, z are). What weve shown earlier in this
question establishes that 3 | x and 3 | y. So x = 3x1 and y = 3y1 for some x1 , y1 Z.
Then the equation x2 + y 2 = 3z 2 becomes 9x21 + 9y12 = 3z 2 and so z 2 = 3x21 + 3y12 . This
implies 3 | z 2 . But this means 3 | z. (You can see this either by the Fundamental Theorem
of Arithmetic, or by the fact that if z 6 0 modulo 3 then z 2 6 0, as we see from the
calculations at the start of this solution.) So what we see, then, is that x, y, z are all
divisible by 3. But we assumed that they constituted a solution with no common factor
and weve reached a contradiction. So there are no solutions other than x = y = z = 0.
Answer to question 7.3
Modulo 11, we have 33 = 27 5 and so 33n (33 )n 5n and hence
33n+1 = 3(3n ) 3 5n . Also, 24 = 16 5 and so 24n+3 = 8 (24 )n 8 5n . It follows

106

7.5. Sketch answers to or comments on sample questions

that
33n+1 + 24n+3 3 5n + 8 5n = 11(5n ) 0 (mod 11),
which means that 11 | (33n+1 + 24n+3 ).
Answer to question 7.4
Modulo 7, 32 = 9 2, so 32n+1 = 3(32n ) 3(2n ) and 2n+2 = 4(2n ), so
2n+2 + 32n+1 4(2n ) + 3(2n ) = 7(2n ) 0,
and hence 7 | (2n+2 + 32n+1 ).
Answer to question 7.5
Modulo m,
(m 1)(m 1) = m2 2m + 1 0 0 + 1 1,
so (m 1) has inverse (m 1).
Answer to question 7.6
Lets first consider the system as a system in Z7 . Multiplying the second equation by 2
gives 2x + 4y = 6 and subtracting the first equation from this gives 3y = 5. Now, you
can either check each element of Z7 in turn to find the solution(s) of this. You should
find there is exactly one solution, y = 4. Or, we can note that 3y = 5 is equivalent to
3y 5 (modulo 7), which is the same as 3y 12. Since 3 and 7 are coprime, we can
cancel 3 and obtain y = 4. Then, 2x = 1 y = 3 = 4 and, since 2 and 7 are coprime,
we can cancel the 2 and obtain x = 2. So there is one solution in Z7 , namely
x = 2, y = 4.
Now lets work in Z6 . Here, the equation 3y = 5 has no solutions. For, modulo 6, 3y is
always either 0 or 3. So there are no solutions to the system if it is a system in Z6 .
Answer to question 7.7
By the Euclidean algorithm, we have
357 = 290 + 67
290 = 4 67 + 22
67 = 3 22 + 1,
so 290 and 357 are coprime, from which it follows that 290 is invertible. Now, from the
calculations just given,
1 = 67 3 22
= 67 3(290 4 67) = 13 67 3 290
= 13(357 290) 3 290 = 13 357 16 290.
The fact that 13 357 16 290 = 1 means that, modulo 357, 16 290 1. So, in
Z357 , (290)1 = 16 = 341.

107

7. Congruence and modular arithmetic

Answer to question 7.8


Because 37 is prime, we certainly know that 10 and 37 are coprime, so the equation has
a solution. The quickest way to find it is simply to note that the equation is equivalent
to the congruence, modulo 37, that 10x 40, and the 10 can then be cancelled because
10 and 37 are coprime. But suppose we didnt spot that. The Euclidean algorithm tells
us that:
37
10
7
3

=
=
=
=

3 10 + 7
17+3
23+1
3 1,

and so
1 =
=
=
=

723
7 2(10 7) = 3 7 2 10
3(37 3 10) 2 10
3 37 11 10.

So we see that 11 10 1 (mod 37) and hence 33 10 3 (mod 37). Now,


33 4 (mod 37), so the solution is x = 4. This is easily checked: 10(4) = 40 = 3 in Z37 .

108

Chapter 8
Rational, real and complex numbers
Essential reading

R
R

One or both of the following:


Biggs, N.L. Discrete Mathematics. Chapter 9.
Eccles, P.J. An Introduction to Mathematical Reasoning. Chapters 13 and 14.

The treatment in Biggs is probably better for the purposes of this Guide.
Neither of these books covers complex numbers. You do not have to know very much
about complex numbers for this subject, but because this topic is not in these books, I
have included quite a bit of material on complex numbers in this chapter. (Complex
numbers is also a topic you will meet in 118 Advanced linear algebra if you are
taking that subject.)
You can find useful reading on complex numbers in a number of books, including the
following:

8.1

Anton, H. Elementary Linear Algebra. Appendix B.

8
Introduction

In this chapter, we explore rational numbers, real numbers and complex numbers. In
this course, we started with natural numbers and then we showed how to construct the
set of all integers from these. This construction used an equivalence relation, together
with a suitable way of adding and multiplying the equivalence classes. In a similar way,
the rational numbers can be constructed from the integers by means of an equivalence
relation. In this course, we do not take a very formal approach to the definition or
construction of the real numbers (which can, in fact, be quite complicated). But we
study properties of real numbers, and in particular we shall be interested in whether
real numbers are rational or not. We also consider the cardinality of infinite sets.

8.2
8.2.1

Rational numbers
An important equivalence relation

Rational numbers are, essentially, fractions. Youll certainly be aware that there are
2
many ways of representing a given rational number. For instance, represents the same
5

109

8. Rational, real and complex numbers

4
. We can capture these sorts of equivalences more formally by using an
10
equivalence relation on pairs of integers (m, n), where n 6= 0. So let X = Z (Z \ {0}) be
the set of all pairs (m, n) where m, n Z and n 6= 0, and define a relation R on X by:
number as

(m, n) R (m0 , n0 ) mn0 = m0 n.


(Yes, it looks like what were really saying here is m/n = m0 /n0 , but we want to work in
the world of the integers for now, so we dont want to do division.) Then we can define
the set of rational numbers Q to be the set of equivalence classes of the relation R.
Lets just pause for a moment to prove that R is indeed an equivalence relation.
R is Reflexive: (m, n)R(m, n) because mn = nm.
R is Symmetric: (m, n)R(p, q) mq = np pn = qm (p, q)R(m, n).
R is Transitive: Suppose (m, n)R(p, q) and (p, q)R(s, t). Then mq = np and pt = qs.
So, (mq)(pt) = (np)(qs) and, after cancelling qp, this gives mt = ns, so (m, n)R(s, t).
But, wait a minute: can we cancel pq? Sure, if its nonzero. If it is zero then that means
p = 0 (since we know that q 6= 0). But then mq = 0, so m = 0; and qs = 0, so s = 0. So,
in this case mt = ns = 0.

8.2.2

Rational numbers as equivalence classes

m
We represent the equivalence class [(m, n)] by . For example, we then have the
n
2
4
(familiar) fact that =
which follows from the fact that [(2, 5)] = [(4, 10)],
5
10
something that is true because (2, 5) R (4, 10) (2 10 = 4 5). What weve done here is
construct the set of rational numbers without reference to division. In an abstract
approach, this is the logically sound thing to do. Once we have constructed the rational
numbers, we can then make sense of the division of integers: the division of m by n is
the rational number m/n.
What, for instance, is the equivalence class [(1, 2)]? Well, (m, n)R(1, 2) means
m 2 = n 1, or n = 2m. So it consists of
(1, 2), (1, 2), (2, 4), (2, 4), (3, 6), (3, 6) . . . .
Denoting the equivalence class [(m, n)] by

m
, we therefore have
n

1
= {(1, 2), (1, 2), (2, 4), (2, 4), (3, 6), (3, 6), . . . }.
2
Recall that if x0 [x] then [x0 ] = [x]. So we can say
1
1
2
2
3
3
=
= =
= =
= .
2
2
4
4
6
6
We can think of the integers as particular rational numbers by identifying the integer n
n
with the rational number (that is, with the equivalence class [(n, 1)]). So Z Q.
1

110

8.2. Rational numbers

8.2.3

Doing arithmetic

How do we do arithmetic with rational numbers? Well, youve been doing this for
years, but how would we define addition and multiplication of rational numbers in an
abstract setting? Just as we defined operations on equivalence classes in earlier chapters
(in the construction of Z from N and in the construction of Zm ), we can define addition
and multiplication as an operation on the equivalence classes of R. Heres how: let
and be defined on the set of rational numbers as follows:
ad + bc
a c
=
,
b d
bd

a c
ac
= .
b d
bd

In practice, we just use normal addition and multiplication symbols (and we often omit
the multiplication symbol), so we have
a c
ad + bc
+ =
,
b d
bd

a c
ac
= .
b d
bd

Well, no surprises there, but remember that what we are doing here is defining addition
and multiplication of rational numbers (and remember also that these rational numbers
are, formally, equivalence classes). Now, if you think hard about it, one issue that is
raised is whether these definitions depend on the choice of representatives from each
equivalence class. They should not, but we ought to check that. What I mean is that we
really should have
2 2
4
1
+ =
+ ,
5 6
10 3
for example, because
4
2
1
2
=
and = .
5
10
6
3
Well, lets see. Consider the addition definition. Suppose that
a
a0
c
c0
= 0 and = 0 .
b
b
d
d
What we need to check is that

a c
a0 c 0
+ = 0 + 0.
b d
b
d

a
a0
= 0 means precisely that [(a, b)] = [(a0 , b0 )], which means that
b
b
ab0 = a0 b. Similarly, we have cd0 = c0 d. Now,

Now, the fact that

a c
ad + bc
a0 c 0
a0 d0 + b0 c0
+ =
, and 0 + 0 =
b d
bd
b
d
b0 d 0
and we need to prove that
ad + bc
a0 d 0 + b 0 c 0
=
.
bd
b0 d 0
This means we need to prove that
(ad + bc, bd) R (a0 d0 + b0 c0 , b0 d0 ).

111

8. Rational, real and complex numbers

Now,
(ad + bc, bd) R (a0 d0 + b0 c0 , b0 d0 ) (ad + bc)b0 d0 = (a0 d0 + b0 c0 )bd
adb0 d0 + bcb0 d0 = a0 d0 bd + b0 c0 bd
(ab0 )dd0 + (cd0 )bb0 = (a0 b)dd0 + (c0 d)bb0 .
Now, the first terms on each side are equal to each other because ab0 = a0 b and the
second terms are equal to each other because cd0 = c0 d, so we do indeed have
consistency (that is, the definition of addition is independent of the choice of
representatives chosen for the equivalence classes).
Activity 8.1 Show that the definition of multiplication of rational numbers is
consistent: that is, that it does not depend on the choice of representatives chosen
for the equivalence classes. Explicitly, show that if
a0
c0
a
c
= 0 and = 0 ,
b
b
d
d
then

a0
c0
a c
= 0 0.
b d
b
d

By the way, the rational numbers are described as such because they are (or, more
formally, can be represented by) ratios of integers.

8.3
8.3.1

Rational numbers and real numbers


Real numbers: a sketchy introduction

For our purposes, we will assume that the set of real numbers R is given. That is, we
shall not construct the real numbers formally. (This can be done, but is outside the
scope of this course.) We will regard the rational numbers (or, as well often call them,
the rationals) Q as a subset of R. There are various ways of thinking about real
numbers. One way is to think about a real number as a point on the infinite real line.
Another is to think of real numbers as the limits of rational numbers. The formal idea
of limit will be encountered in a later chapter, but we can encapsulate most of what we
need by thinking about the decimal representation of real numbers.
First, lets note that if ai N {0} and ai 9 for 1 i n, then the (finite) decimal
expansion
a0 .a1 a2 . . . an
represents the rational number
a0 +

a2
an
a1
+
+ +
.
2
10 (10)
(10)n

For example, what we mean by 1.2546 is the number


1+

112

2
5
4
6
+
+
+
.
10 100 1000 10000

8.3. Rational numbers and real numbers

Every positive real number can be represented by an infinite decimal expansion


a0 .a1 a2 a3 . . . ai . . . ,
where ai N {0} and ai 9 for i 1. We allow for ai to be 0, so, in particular, it is
possible that ai = 0 for all i N where N is some fixed number: such an expansion is
known as a terminating expansion. Given such an infinite decimal expansion, we say
that it represents a real number a if, for all n N {0},
a0 .a1 a2 . . . an a a0 .a1 a2 . . . an + 1/(10)n .
This formalism allows us to see that the infinite decimal expansion 0.99999 . . . , all of
whose digits after the decimal point are 9, is in fact the same as the number
1.0000000 . . . .
For example, two infinite decimal expansions are
3.1415926535 . . .
and
0.1833333333333 . . . .
(Youll probably recognise the first as being the number .) Suppose, in this second
decimal expansion, that every digit is 3 after the first three (that is, ai = 3 for i 3).
We can extend this notation to
Then we write this as 0.183 (or, in some texts, 0.183).
cases in which there is a repeating pattern of digits. For example, suppose we have

0.1123123123123 . . . ,
where the 123 repeats infinitely. Then we denote this by 0.1123.

8.3.2

Rationality and repeating patterns

You probably have heard stories of strange, obsessive mathematicians working out the
expansion of to millions and millions of decimal places. (This has been the subject of
a novel, a play, a film, and a song!) This is relevant because the digits of have no
repeating pattern, which you might think quite remarkable. In fact, it turns out that a
real number will have an infinitely repeating pattern in its decimal expansion (which
includes the case in which the pattern is 0, so that it includes terminating expansions) if
and only if the number is rational.
Lets look at part of this statement: if a number is rational, then its decimal expansion
will have a repeating pattern (which might be 0). Why is that? Well, lets look at an
example.

113

8. Rational, real and complex numbers

Example 8.1 We find the decimal expansion of 4/7 by long-division.


0.5714285
7 4.0000000
3.5
.50
.49
10
7
30
28
20
14
60
56
40
35
50
So,
4/7 = 0.571428.
Notice: we must have the same remainder re-appear at some point, and then the
calculation repeats. Heres the calculation again, with the repeating remainder
highlighted in bold.
0.5714285
7 4.0000000
3.5
.50
.49
10
7
30
28
20
14
60
56
40
35
50

Next, we think about the second part of the statement: that if the decimal expansion
repeats, then the number is rational.
Clearly, if the decimal expansion is terminating, then the number is rational. But what
about the infinite, repeating, case? Weve given two examples above. Lets consider
these in more detail.

114

8.3. Rational numbers and real numbers

Example 8.2 Consider a = 0.183. Let x = 0.003. Then 10x = 0.03 and so
10x x = 0.03 0.003 = 0.03. So, 9x = 0.03 and hence x = (3/100)/9 = 1/300, so
0.183 = 0.18 + 0.003 =

1
55
11
18
+
=
= ,
100 300
300
60

and this is the rational representation of a.

Example 8.3 Consider the number 0.1123. If x = 0.0123, then 1000x = 12.3123
and 1000x x = 12.3. So 999x = 12.3 and hence x = 123/9990. So,
0.1123 =

1
1
123
1122
+x=
+
=
.
10
10 9990
9990

In general, if the repeating block is of length k, then an argument just like the previous
two, in which we multiply by 10k , will enable us to express the number as a rational
number.

8.3.3

Irrational numbers

A real number is irrational if it is not a rational number. (So, given what we said above,
an irrational number has no infinitely repeating pattern in its decimal expansion.)
Whats clear from above is that any real number can be approximated well by rational
numbers: for the rational number a0 .a1 a2 . . . an is within 1/(10)n of the real number
with infinite decimal expansion a0 .a1 a2 . . . .
We can, in some cases, prove that particular numbers are irrational. Here is a classic
result of this type.

Theorem 
8.1  The real number 2 is irrational. That is, there are no positive integers
m 2
m, n with
= 2.
n
Proof
Suppose there were such m, n. If m, n are divisible by some d > 1, we may divide both
m and n to obtain m0 , n0 such that the rational number m0 /n0 equals m/n. So we may
assume that m, n have no common divisors other than 1; that is, gcd(m, n) = 1. Now,
the equation (m/n)2 = 2 means m2 = 2n2 . So we see that m2 is even. We know (from
Chapter 2) that this means m must be even. (For, the square of an odd integer is easily
shown to be odd.) So there is some m1 such that m = 2m1 . Then, m2 = 2n2 becomes
4m21 = 2n2 , and so n2 = 2m21 . Well, this means n2 is even and hence n must be even. So
m, n are both divisible by 2. But we assumed that gcd(m, n) = 1, and this is
contradicted by the fact that m and n have 2 as a common divisor. So our assumption
that (m/n)2 = 2 must have been wrong and we can deduce no such integers m and n
exist.
Isnt this theorem a thing of beauty?

115

8. Rational, real and complex numbers

Activity 8.2 Make sure you understand that this is a proof by contradiction, and
that you understand what the contradiction is.

Many other important numbers in mathematics turn out to be irrational. Ive already
mentioned , and there is also e (the base of the natural logarithm).

8.3.4

Density of the rational numbers

As weve seen, some important numbers in mathematics are not rational. An intuitive
question that arises is how many real numbers are rational and this is a difficult
question to answer. There are infinitely many real numbers and infinitely many
rationals, and infinitely many real numbers are not rational. More on this in the next
section!
For the moment, lets make one important observation: not only are there infinitely
many rational numbers, but there are no gaps in the rational numbers. If you accept
the view of real numbers as (possibly) infinite decimal expansions, then this is quite
clear: you can get a very good approximation to any real number by terminating its
decimal expansion after a large number of digits. (And we know that a terminating
decimal expansion is a rational number.) The following theorem makes sense of the
statement that there are no rational-free zones in the real numbers. Precisely, between
any two rational numbers, no matter how close together they are, there is always
another rational number.

Theorem 8.2 Suppose q, q 0 Q with q < q 0 . Then there is r Q with q < r < q 0 .
Proof
Consider r = (1/2)(q + q 0 ). The details are left to you!
Activity 8.3 Complete this proof.

8.4

Countability of rationals and uncountability of


real numbers

We know that N Z Q R, and that each inclusion is strict (there are integers that
are not natural numbers, rational numbers that are not integers, and real numbers that
are not rational). All of these sets are infinite. But there is a sense in which the sets N, Z
and Q have the same size and R is larger. Clearly we have to define what this means
in precise terms, because right now all we know is that there are more real numbers
than rationals, for instance, but there are infinitely many of each type of number.
The following definition helps us.
Definition 8.1 (Countable sets) A set is countable if there is a bijection between the
set and N.

116

8.4. Countability of rationals and uncountability of real numbers

For instance, Z is countable: we can define f : N Z by


f (1) = 0, f (2) = 1, f (3) = 1, f (4) = 2, f (5) = 2, f (6) = 3, f (7) = 3, . . . .
(In general, f (n) = (1)n bn/2c, where bn/2c means the largest integer that is no more
than n/2.) It is straightforward to show that f is a bijection. Hence Z is countable. So,
in this sense, the sets Z and N have the same cardinality (even though Z is strictly
larger than N). Working with infinite sets is not the same as working with finite sets:
two finite sets, one of which was a strict subset of the other, could not have the same
cardinality!
What does countable mean? The formal definition is given above. But one way of
thinking about it is that if S is countable, then the members of S can be listed:
s1 , s2 , s3 , . . . , .
For, suppose S is countable. Then there is a bijection f : N S. Let si = f (i) for
i N. Then, because f is a bijection, the list s1 , s2 , s3 , . . . will include every element of
S, each precisely once.
What is more surprising is that Q is also countable. See, for instance, Biggs, Section 9.7.
Theorem 8.3 The set Q of rational numbers is countable.
So, informally speaking, N, Z and Q all have the same size. What about R? Well, here
it gets very interesting: the set of real numbers is not countable. (It is said to be
uncountable.)

Theorem 8.4 The set R is not countable. (That is, R is uncountable.)


This is probably not too surprising: a randomly written decimal expansion will not
terminate or have a repeating pattern, so, intuitively, most real numbers are not
rational.
The proof uses the famous Cantor diagonal argument.
Suppose that f : N R and that
f (n) = xn0 .xn1 xn2 xn3 . . . .
We show theres a number in R which isnt the image under f of any element of N (and
hence f is not a surjection). Consider
y = 0.y1 y2 y3 . . .
where

yi =

1 if xii =
6 1
2 if xii = 1.

For all n N, y 6= f (n) since yn is different from xnn .


Since f was arbitrary, this shows that there can be no function f : N R that is
surjective. Hence R is not countable.

117

8. Rational, real and complex numbers

8.5
8.5.1

Complex numbers
Introduction

Consider the two quadratic polynomials,


p(x) = x2 3x + 2

and

q(x) = x2 + x + 1.

If you sketch the graph of p(x) you will find that the graph intersects the x-axis at the
two real solutions (or roots) of the equation p(x) = 0, and that the polynomial factors
into the two linear factors,
p(x) = x2 3x + 2 = (x 1)(x 2).
Sketching the graph of q(x), you will find that it does not intersect the x-axis. The
equation q(x) = 0 has no solution in the real numbers, and it cannot be factorised (or
factored) over the reals. Such a polynomial is said to be irreducible. In order to solve
this equation, we need to define the complex numbers.

8.5.2

Complex numbers: a formal approach

In a formal approach to complex numbers, we define the set C of complex numbers to


be the set of all ordered pairs (x, y) of real numbers, with addition and multiplication
operations defined as follows:
(a, b) + (c, d) = (a + c, b + d), (a, b) (c, d) = (ac bd, ad + bc).

By identifying the complex number (x, 0) with the real number x, we regard R as a
subset of C. We give the special symbol i to the complex number (0, 1). Note that we
can then re-write the complex number (x, y) as (x, y) = x + yi.
The complex number i satisfies
i2 = i i = (0, 1) (0, 1) = (0 0 1 1, 0 1 + 1 0) = (1, 0),
so i2 is the real number 1.

8.5.3

Complex numbers: a more usual approach

Rather than the ordered pairs approach outlined above, it is more common to define the
complex numbers as follows. We begin by defining the imaginary number i which has
the property that i2 = 1. The term imaginary is historical, and not an indication
that this is a figment of someones imagination. With this, we can then say what we
mean by the complex numbers.
Definition 8.2 A complex number is a number of the form z = a + ib, where a and b
are real numbers, and i2 = 1. The set of all such numbers is
C = {a + ib : a, b R} .

118

8.5. Complex numbers

If z = a + ib is a complex number, then the real number a is known as the real part of
z, denoted Re(z), and the real number b is the imaginary part of z, denoted Im(z). Note
that Im(z) is a real number.
If b = 0, then z is a real number, so R C. If a = 0, then z is said to be purely
imaginary.
The quadratic polynomial q(z) = x2 + x + 1 can be factorised over the complex
numbers, because the equation q(z) = 0 has two complex solutions. Solving in the usual
way, we have

1 3
x=
.
2
p

We write, 3 = (1)3 = 1 3 = i 3, so that the solutions are

1
3
1
3
w = +i
and
w = i
.
2
2
2
2
Notice the form of these two solutions. They are what is called a conjugate pair. We
have the following definition.
Definition 8.3 If z = a + ib is a complex number, then the complex conjugate of z is
the complex number z = a ib.
We can see by the application of the quadratic formula, that the roots of an irreducible
quadratic polynomial with real coefficients will always be a conjugate pair of complex
numbers.
Addition, multiplication, division

Addition and multiplication of complex numbers are defined as for polynomials in i


using i2 = 1.
Example 8.4 If z = (1 + i) and w = (4 2i) then
z + w = (1 + i) + (4 2i) = (1 + 4) + i(1 2) = 5 i
and
zw = (1 + i)(4 2i) = 4 + 4i 2i 2i2 = 6 + 2i.
If z C, then zz is a real number:
zz = (a + ib)(a ib) = a2 + b2 .

Activity 8.4
zz.

Carry out the multiplication to verify this: let z = a + ib and calculate

Division of complex numbers is then defined by

z
zw
=
w
ww

since ww is real.

119

8. Rational, real and complex numbers

Example 8.5
(1 + i)(4 + 2i)
2 + 6i
1
3
1+i
=
=
=
+ i
4 2i
(4 2i)(4 + 2i)
16 + 4
10 10

Properties of the complex conjugate


A complex number is real if and only if z = z. Indeed, if z = a + ib, then z = z if and
only if b = 0.
The complex conjugate of a complex number satisfies the following properties:
z + z = 2 Re(z) is real
z z = 2i Im(z) is purely imaginary
z=z
z+w =z+w
zw = z w
z
z
=
w
w
Activity 8.5 Let z = a + ib, w = c + id and verify all of the above properties.

8.5.4

Roots of polynomials

The Fundamental Theorem of Algebra asserts that a polynomial of degree n with


complex coefficients has n complex roots (not necessarily distinct), and can therefore be
factorised into n linear factors. Explicitly, any equation
an z n + an1 z n1 + + a1 z + a0 = 0
where ai C has n solutions z C. Contrast this with the difficulty of solving
polynomial equations in R. So, the introduction of i enables us to solve all polynomial
equations: theres no need to introduce anything else. A fancy way of saying this is:
The field of complex numbers is algebraically closed.
If the coefficients of the polynomial are restricted to real numbers, the polynomial can
be factorised into a product of linear and irreducible quadratic factors over R and into a
product of linear factors over C. The proof of the Fundamental Theorem of Algebra is
beyond the scope of this course. However, we note the following useful result.
Theorem 8.5
pairs.

Complex roots of polynomials with real coefficients appear in conjugate

Proof
Let P (x) = a0 + a1 x + + an xn , ai R, be a polynomial of degree n. We shall show
that if z is a root of P (x), then so is z.

120

8.5. Complex numbers

Let z be a complex number such that P (z) = 0, then


a0 + a1 z + +a2 z 2 + an z n = 0.
Conjugating both sides of this equation,
a0 + a1 z + a2 z 2 + + an z n = 0 = 0.
Since 0 is a real number, it is equal to its complex conjugate. We now use the properties
of the complex conjugate: that the complex conjugate of the sum is the sum of the
conjugates, and the same is true for the product of complex numbers. We have
a0 + a1 z + a2 z 2 + + an z n = 0,
and
a0 + a1 z + a2 z 2 + + an z n = 0.
Since the coefficients ai are real numbers, this becomes
a0 + a1 z + a2 z 2 + + an z n = 0.
That is, P (z) = 0, so the number z is also a root of P (x).
Example 8.6 Let us consider the polynomial
x3 2x2 2x 3 = (x 3)(x2 + x + 1).

1
3
If w = + i
, then
2
2

x3 2x2 2x 3 = (x 3)(x w)(x w).

Activity 8.6 Multiply out the last two factors above to check that their product is
the irreducible quadratic x2 + x + 1.

8.5.5

The complex plane

The following theorem shows that a complex number is uniquely determined by its real
and imaginary parts.
Theorem 8.6 Two complex numbers are equal if and only if their real and imaginary
parts are equal.
Proof
Two complex numbers with the same real parts and the same imaginary parts are
clearly the same complex number, so we only need to prove this statement in one
direction. Let z = a + ib and w = c + id. If z = w, we will show that their real and
imaginary parts are equal. We have a + ib = c + id, therefore a c = i(d b). Squaring
both sides, we obtain (a c)2 = i2 (d b)2 = (d b)2 . But a c and (d b) are real
numbers, so their squares are non-negative. The only way this equality can hold is for
a c = d b = 0. That is, a = c and b = d.

121

8. Rational, real and complex numbers

As a result of this theorem, we can think of the complex numbers geometrically, as


points in a plane. For, we can associate the vector (a, b)T uniquely to each complex
number z = a + ib, and all the properties of a two-dimensional real vector space apply.
A complex number z = a + ib is represented as a point (a, b) in the complex plane; we
draw two axes, a horizontal axis to represent the real parts of complex numbers, and a
vertical axis to represent the imaginary parts of complex numbers. Points on the
horizontal axis represent real numbers, and points on the vertical axis represent purely
imaginary numbers.

(a, b)

7


z = a + ib


(0, 0)






Complex plane or Argand diagram

Activity 8.7 Plot z = 2 + 2i and w = 1 i 3 in the complex plane.

8.5.6

Polar form of z

If the complex number z = a + ib is plotted as a point (a, b) in the complex plane, then
we can determine the polar coordinates of this point. We have
a = r cos ,

b = r sin

where r = a2 + b2 is the length of the line joining the origin to the point (a, b) and is
the angle measured anticlockwise from the real (horizontal) axis to the line joining the
origin to the point (a, b). Then we can write z = a + ib = r cos + i r sin .
Definition 8.4 The polar form of the complex number z is
z = r(cos + i sin ).

The length r = a2 + b2 is called the modulus of z, denoted |z|, and the angle is
called the argument of z.
Note the following properties:
z and z are reflections in the real axis. If is the argument of z, then is the
argument of z.
|z|2 = zz.
and + 2n give the same complex number.
We define the principal argument of z to be the argument in the range, < .

122

8.5. Complex numbers

Activity 8.8 Express z = 2 + 2i, w = 1 i 3 in polar form.


Describe the following sets of z: (a) |z| = 3, (b) argument of z is 4 .
Multiplication and division using polar coordinates gives
zw = r(cos + i sin ) (cos + i sin )
= r(cos( + ) + i sin( + ))

r
z
=
cos( ) + i sin( ) .
w

Activity 8.9 Show these by performing the multiplication and the division as
defined earlier, and by using the facts that cos( + ) = cos cos sin sin and
sin( + ) = sin cos + cos sin .
DeMoivres Theorem
We can consider explictly a special case of the multiplication result above, in which
w = z. If we apply the multiplication to z 2 = zz, we have
z 2 = zz
= (r(cos + i sin ))(r(cos + i sin ))

= r2 (cos2 + i2 sin2 + 2i sin cos )


= r2 (cos2 sin2 + 2i sin cos )
= r2 (cos 2 + i sin 2).
Here we have used the double angle formulae for cos 2 and sin 2.
Applying the product rule n times, where n is a positive integer, we obtain DeMoivres
Formula.
Theorem 8.7
(cos + i sin )n = cos n + i sin n
Proof
z n = z| {z
z}
n times

n
r(cos + i sin )


= rn cos(| + {z
+ }) + i sin(| + {z
+ })
=

n times

n times

123

8. Rational, real and complex numbers

8.5.7

Exponential form of z

Functions of complex numbers can be defined by the power series (Taylor expansions) of
the functions:

ez = 1 + z +

z2 z3
+
+
2!
3!

sin z = z

z3 z5
+

3!
5!

cos z = 1

z2 z4
+

2!
4!

for all z C.
If we use the expansion for ez to expand ei , and then factor out the real and imaginary
parts, we find:
ei = 1 + (i) +

(i)2 (i)3 (i)4 (i)5


+
+
+
+
2!
3!
4!
5!

3 4
5
2
i +
+ i
2!
3!
4!
5!




2 4
3 5
=
1
+
+ i
+

2!
4!
3
5!

= 1 + i

From which we conclude:


Eulers Formula:

ei = cos + i sin

Definition 8.5 The exponential form of a complex number z = a + ib is


z = rei
where r = |z| is the modulus of z and is the argument of z.
In particular, the following equality is of note because it combines the numbers e,
and i in a single expression: ei = 1.
If z = rei , then its complex conjugate is given by z = rei . This is because, if
z = rei = r(cos + i sin ), then
z = r(cos i sin ) = r(cos() + i sin()) = rei .
We can use either the exponential form, z = rei , or the standard form, z = a + ib,
according to the application or computation we are doing. For example, addition is
simplest in the form z = a + ib, but multiplication and division are simpler in
exponential form. To change a complex number between rei and a + ib, use Eulers
formula and the complex plane (polar form).

124

8.5. Complex numbers

Example 8.7

+ i sin 2
= 12 + i 23 .
ei 3 = cos 2
3
3

e2+i 3 = e2 ei 3 = e2 cos 3 + ie2 sin 3.

Activity 8.10 Write each of the following complex numbers in the form a + ib:

ei 2

Example 8.8

ei 2

ei 4

ei

11
3

e1+i

e1

Let z = 2 + 2i = 2 2 ei 4 and w = 1 i 3 = 2ei 3 , then

w6 = (1 i 3)6 = (2ei 3 )6 = 26 ei2 = 64

zw = (2 2ei 4 )(2ei 3 ) = 4 2ei 12

and

7
z
= 2ei 12 .
w

Notice that in the above example we are using certain properties of the complex
exponential function, that if z, w C,
ez+w = ez ew

and

(ez )n = enz

for n Z.

This last property is easily generalised to include the negative integers.


Example 8.9

Solve the equation

Write z 6 = (rei )6 = r6 ei6 ,

z 6 = 1 to find the 6th roots of 1.

1 = ei = ei(+2n) .

Equating these two expressions, and using the fact that r is a real positive number,
we have
2n
r=1
6 = + 2n, = +
.
6
6
This will give the six complex roots by taking n = 0, 1, 2, 3, 4, 5.
Activity 8.11 Show this. Write down the six roots of 1 and show that any one
raised to the power 6 is equal to 1. Show that n = 6 gives the same root as n = 0.
Use this to factor the polynomial x6 + 1 into linear factors over the complex numbers
and into irreducible quadratics over the real numbers.

125

8. Rational, real and complex numbers

Learning outcomes
At the end of this chapter and the Essential reading and activities, you should be able
to:

demonstrate that you understand how the rational numbers can be formally
constructed by means of an equivalence relation and that addition and
multiplication of rational numbers can be defined as operations on the equivalence
classes
indicate that you know that a real number is rational if and only if it has an
infinitely repeating pattern in its decimal expansion
find the decimal expansion of a rational number
determine, in the form m/n, a rational number from its decimal expansion
prove that certain numbers are rational or irrational
demonstrate that you understand that there are rational numbers arbitrarily close
to any real number
state what it means to say that a set is countable or uncountable
demonstrate that you know that the rationals are countable and the reals
uncountable
show that you know what is meant by complex numbers, and demonstrate that you
can add, subtract, multiply and divide complex numbers
state the definition of the complex conjugate of a complex number
show that you know that every polynomial of degree n has n complex roots and
that these occur in conjugate pairs
indicate complex numbers on the complex plane
determine the polar and exponential form of complex numbers
state and use DeMoivres theorem and Eulers formula.

Sample examination questions


Question 8.1

Prove that 5 is irrational.


Question 8.2

Weve seen that 2 is irrational. Assuming


this, prove that for all rational numbers x

and y with y 6= 0, the number x + y 2 is irrational. Hence show that between any two
rational numbers, there is an irrational number.
Question 8.3
Express the complex number

1 + 2i
in the form a + bi.
4 5i

Question 8.4
Solve the equation x2 2ix + 3 = 0.

126

8.5. Comments on selected activities

Question 8.5
Write each of the following complex numbers in the form a + ib:

ei 2

ei 2

ei 4

ei

11
3

e1+i

e1 .

Question 8.6

Express 1 + 3i in exponential form. Hence find (1 + 3i)30 .

Comments on selected activities


Feedback to activity 8.1
Suppose that
a
a0
c0
c
= 0 and = 0 .
b
b
d
d
What we need to check is that

a0
c0
a c
= 0 0.
b d
b
d

a
a0
= 0 means precisely that [(a, b)] = [(a0 , b0 )], which means that
b
b
ab0 = a0 b. Similarly, we have cd0 = c0 d. Now,

Now, the fact that

ac
a0
c0
a0 c 0
a c
= , and 0 0 = 0 0
b d
bd
b
d
bd

and so we need to prove that


ac
a0 c 0
= 0 0.
bd
bd
This means we need to prove that
[(ac, bd)] R [(a0 c0 , b0 d0 )].
Now,
[(ac, bd)] R [(a0 c0 , b0 d0 )] acb0 d0 = a0 c0 bd
(ab0 )(cd0 ) = (a0 b)(c0 d),
which is true because ab0 = a0 b and cd0 = c0 d.
Feedback to activity 8.6
We have
(x w)(x w) = x2 (w + w)x + ww.
Now, w + w = 2 Re(w) = 2( 12 ) and ww =
is x2 + x + 1.

1
4

3
4

so the product of the last two factors

Feedback to activity 8.7


The plots are given in Figure 8.1.

127

8. Rational, real and complex numbers

z = 2 + 2i

2i
i

(0, 0)

w =1i 3

2i

Figure 8.1: Plots of the complex numbers z = 2 + 2i and w = 1 i 3.

Feedback to activity 8.8


Draw the line from the origin to the point z in the diagram above. Do the same for w.


For z, |z| = 2 2 and = 4 , so z = 2 2 cos( 4 ) + i sin( 4 ) . The modulus of w is |w| = 2
and the argument is 3 , so that


w = 2 cos( ) + i sin( ) = 2 cos( ) i sin( ) .


3
3
3
3
The set (a) |z| = 3, is the circle of radius 3 centered at the origin. The set (b), argument
of z is 4 , is the half line from the origin through the point (1,1).

Feedback to activity 8.10


The roots are:

z1 = 1 ei 6 ,
7

z4 = 1 ei 6 ,

z3 = 1 ei 6 ,

z6 = 1 ei

z5 = 1 ei 6 ,

These roots are in conjugate pairs, and e


z4 = z 3 = ei 6 ,

z2 = 1 ei 6 ,
i 13
6

11
6

i 6

=e :

z5 = z 2 = ei 2 ,

z6 = z 1 = ei 6 .

The polynomial factors as


x6 + 1 = (x z1 )(x z 1 )(x z2 )(x z 2 )(x z3 )(x z 3 ),

Using the a + ib form of each complex number, for example, z1 = 23 + i 12 , you can carry
out the multiplication of the linear terms pairwise (conjugate pairs) to obtain x6 + 1 as
a product of irreducible quadratics with real coefficients,

x6 + 1 = (x2 3 x + 1)(x2 + 3 x + 1)(x2 + 1).

128

8.5. Sketch answers to or comments on sample questions

Sketch answers to or comments on sample questions


Answer to question 8.1

Suppose we have 5 = m/n where m, n Z. Since 5 > 0, we may assume that


m, n > 0. (Otherwise, both are negative, and we can multiply each by 1.) We can also
suppose that m, n have greatest common divisor 1. (For, we can cancel any common
factors.) Then (m/n)2 = 5 means that m2 = 5n2 . So 5 | m2 . Now m can, by the
Fundamental Theorem of Arithmetic, be written as a product of primes m = p1 p2 . . . pk .
Then m2 = p21 p22 . . . p2k . If no pi is 5, then 5 does not appear as a factor in m2 and so 5
does not divide m2 . So some pi is equal to 5. So 5 | m. Now, this means that m = 5r for
some r N and hence m2 = (5r)2 = 25r2 and so 25r2 = 5n2 . Then, n2 = 5r2 , so 5 | n2 .
Arguing as before, 5 | n.
So 5 is a common factor if m and n, which contradicts
gcd(m, n) = 1. Hence 5 is not rational.
Answer to question 8.2

We can prove this by contradiction. Suppose r = x + y 2 is rational, for some rational


numbers x and y, with y 6= 0. Then, because x is rational, the number s = r x is
rational. (This uses the fact that q1 , q2 are rational,
so isq1 q2 , because
So y 2 is rational. Now, y 6= 0 and
a/b c/d = (ad bc)/(bd).) But s = r x = y 2.
y Q, so y = k/l for some k, l Z withl 6= 0. So, 2 = (l/k)s. Since
s is rational,
s = m/n for some m, n Z and hence 2 = (lm)/(kn). This shows 2 is rational,
something we know to be false. So a contradiction
has arisen. Therefore r must be
irrational. So for all x, y Q with y 6= 0, x + y 2 is irrational.
Now we want to show that between any two rational numbers q1 , q2 , with q1 < q2 , there
is an irrationalnumber z. The result weve just proved
suggests we aim

to find a z of
the form x + y 2. If wecan find a number r + s 2 where 0 < r + s 2 < 1, then the
number z = q1 + (r + s 2)(q2 q1 ) will be between q1 and q2 . For example, if we take
r = 1, s = 1, we obtain the number

1 + 2.

2
This number is positive because 2 > 1 (since 2 > 12 ) and it is less than 1 because

2
2 < 2 (because 2 = 2 < 22 = 4.) So consider



z = q1 + 1 + 2 (q2 q1 ) = 2q1 + 2(q2 q1 ).

Now, z is between q1 and q2 and z is of the form x + y 2 where x, y Q and y 6= 0.


(2q1 and q2 q1 are rational because q1 , q2 are, and q2 q1 6= 0 because q1 6= q2 .) (There
are many other ways to answer this question.)

129

8. Rational, real and complex numbers

Answer to question 8.3


We have
1 + 2i
1 + 2i 4 + 5i
=
4 5i
4 5i 4 + 5i
=

(1 + 2i)(4 + 5i)
(4 5i)(4 + 5i)

4 + 8i + 5i + 10i2
16 25i2

6 + 13i
41
6
13
= + i.
41 41
=

You can check that this is the correct answer by calculating the product


6
13
+ i (4 5i)
41 41
and observing that the answer is 1 + 2i.
Answer to question 8.4
To solve the equation x2 2ix + 3 = 0, we could use the formula for the solutions of a
quadratic equation. Or we could note that the equation is equivalent to (x i)2 = 4,
so the solutions are given by x i = 2i and x i = 2i, so they are x = 3i and x = i.
Answer to question 8.5
We have
ei/2 = i,
e

i(11/3)

ei3/2 = i,

1
1
ei3/4 = + i ,
2
2

3
1
,
e1+i = e1 ei = e cos(1) + i e sin(1),
=e
= i
2
2
e1 = e1 + 0i is real, so already in the form a + ib.
i(/3)

Answer to question 8.6

To express z = 1 + 3i in exponential form, we first note that


!

1
3
1 + 3i = 2
+
i
2
2
and this is r(cos + i sin ) when r = 2, = /3. So z = 2ei/3 . Then,
(1 +

130

30
30
3) = z 30 = 2ei/3
= 230 e30i/3 = 230 e10i = 230 .

Part 2
Analysis

131

Chapter 9
Supremum and infimum

Essential reading
Bryant, Victor. Yet Another Introduction to Analysis. Chapter 1.

(Or, similar reading from other analysis books. Use the index to find material on
supremum and infimum.)

9.1

Introduction

In this short chapter we explore some properties of sets of real numbers. Whereas
bounded sets of integers will always have a least and a greatest member (that is, a
minimum and a maximum), this is not true for bounded sets of real numbers. Instead,
the corresponding notions are infimum and supremum.

9.2

The triangle inequality

We start with a useful observation about real numbers. For any real number x, the
absolute value of x, denoted by |x|, is x itself if x 0 and x if x < 0. For instance,
|5| = 5 and | 3.5| = 3.5.
The basic triangle inequality is very simple, but extremely useful. It states that for any
two real numbers x, y, we have
|x + y| |x| + |y|.
Activity 9.1 Prove the triangle inequality. (Just consider each of four cases:
x > 0, y < 0, etc.)
This has many useful consequences, which we shall use often. In particular, for any
three real numbers x, y, z, we have
|x y| |x z| + |y z|
(simply because |x y| = |(x z) + (z y)| |x z| + |z y|).
Sometimes useful is the fact that for any two real numbers x, y,
|x y| ||x| |y||.

133

9. Supremum and infimum

9.3

Properties of real numbers: supremum and


infimum

Recall that for real numbers a, b, we define different types of interval as follows:
[a, b] = {x R | a x b}
(a, b] = {x R | a < x b}
(a, b) = {x R | a < x < b}
[a, b) = {x R | a x < b}
[a, ) = {x R | x a}
(a, ) = {x R | x > a}
(, b] = {x R | x b}
(, b) = {x R | x < b} .
We shall sometimes use the symbol R+ to denote the set of all positive real numbers.
It is very important (especially in what follows) that you understand that is not a
number; it is merely a symbol. You cannot perform arithmetic with it. It is quite simply
nonsense to write 1/ = 0 and so on.
Some of these intervals are bounded above, some bounded below and some bounded
above and below. The formal definitions of boundedness are:
Definition 9.1 S R is bounded above if and only if there is M R such that
for all x S, x M.

In this case, M is called an upper bound of (or on or for) S.


Definition 9.2 S R is bounded below if and only if there is m R such that
for all x S, x m.
In this case, m is called a lower bound of (or on or for) S.
A set is said simply to be bounded if it is bounded above and bounded below.
Definition 9.3 An element x S is the maximum of S if it is an upper bound of S
(similarly, we can define the minimum of S).
A set that is bounded above need not have a maximum; a set has a maximum element
if and only if it has an upper bound that belongs to the set. Similarly, a set that is
bounded below need not have a minimum; a set has a minimum element if and only if it
has a lower bound that belongs to the set.
Every bounded subset of the integers has a maximum and a minimum. The
corresponding notions for subsets of the real line are supremum and infimum, defined
below.

134

9.3. Properties of real numbers: supremum and infimum

We assume the following property of real numbers (as an axiom).


The continuum property
Every non-empty set of real numbers that is bounded above has a least upper bound,
and every non-empty set of real numbers that is bounded below has a greatest upper
bound.

Definition 9.4 For a non-empty subset S of R, if S is bounded above, the least upper
bound of S (which exists, by the continuum property) is called the supremum of S, and
is denoted sup S. If S is bounded below, the greatest lower bound is called the infimum
of S and is denoted inf S.
Note that the supremum and infimum of a set S (when they exist; i.e. when S is
bounded above/below) need not belong to S.
Example 9.1 Consider the real interval S = (1, 2). This is bounded above: for
example, 2 is an upper bound, since every member of S is at most 2. In fact, any
number greater than or equal to 2 is an upper bound, so the set of upper bounds is
[2, ). It is clear that the least upper bound is therefore 2. That is, sup S = 2. But
notice that the set S has no maximum. For, suppose that M is a maximum. Then
(two things!): (i) M S and, (ii) for all x S, x M . Now, because M S, we
must have M < 2. But now consider the number (M + 2)/2, half-way between M
and 2. This is still less than 2, so it belongs to S. Thus it is not true that x M for
all x S, because this doesnt hold when we take x = (M + 2)/2. So S has a
supremum but no maximum.
Example 9.2 On the other hand, the interval S = (1, 2] has a maximum: its
maximum (and its supremum) is 2.

Example 9.3 Consider the set S = {1/n | n N} = {1, 1/2, 1/3, . . . }. This is
clearly bounded below, by 0 for example. In fact, inf S = 0. To see this, we observe
(as we already have) that 0 is a lower bound on S. To prove it is the greatest lower
bound, we need to show that no number > 0 is a lower bound on S. So lets
suppose that is any positive number. To prove it is not a lower bound on S we
have to show that some member of S is less than . Well, this means we need to find
a number of the form 1/n such that 1/n < . But its clear that this inequality is
equivalent to n > 1/ . So if we take any natural number n that is greater than 1/ ,
then 1/n S and 1/n < . So we are done: any > 0 cant be a lower bound, and 0
is, so 0 is the greatest lower bound, the infimum.
Activity 9.2 Make sure you understand the preceding three examples by reading
them again and trying to reproduce the arguments given.
An alternative description of the supremum is: = sup S if and only if is an upper
bound for S and if for any 0 < , there is some x S with x > 0 . (A similar condition
can be given for infimum.)

135

9. Supremum and infimum

Activity 9.3 Write down the corresponding description of infimum.


A further characterisation (which follows from the one just given by taking 0 = )
is as follows.
Theorem 9.1 For S R, S non-empty, = sup S if and only if is an upper bound
for S and
 > 0, y S with y > .
Activity 9.4 Convince yourself that this characterisation of the supremum is
equivalent to the one given in the definition.
There is also a similar characterisation of the infimum.
Theorem 9.2 For S R, S non-empty, = inf S if and only if is a lower bound for
S and
 > 0, x S with x < + .

Learning outcomes
At the end of this chapter and the Essential reading and activities, you should be able
to:
understand and use the triangle inequality
know what is meant by a set of real numbers being bounded above, bounded below,
or bounded
explain exactly what is meant by the supremum and infimum of a non-empty set
and be able to determine these.

9
Sample examination questions
Question 9.1
The distance between two real numbers x and y is defined as the absolute value |x y|
of their difference. This is a nonnegative real number, and satisfies |x y| = |y x| and
|x + y| |x| + |y|. Use these properties to show the following triangle inequality:
|x z| |x y| + |y z|
and the inequality:
|x y| ||x| |y||.
Question 9.2
For each of the following sets, find the sup, inf, max and min whenever these exist.
[1, ),

136

9.3. Comments on selected activities

{x R | x2 2x 1 < 0},
{x2 2x 1 | x R}.
Question 9.3
Suppose A is a bounded subset of R. Let B be the set defined by
B = {b | b = 2a + 3, a A}.
Prove that
sup B = 2 sup A + 3.
[Hint: let A = sup A and start by proving that 2A + 3 is an upper bound on B.]
Question 9.4
A bounded subset A of R has the property that for all x, y in A, |x y| < 1. Prove by a
contradiction argument that
(sup A inf A) 1.
Question 9.5
Let S be a non-empty set of reals and let be a lower bound of S. Prove that
= inf S

 > 0 y S

y < + .

Comments on selected activities


Feedback to activity 9.4
What we need to prove is that if S is a nonempty set of reals and is an upper bound
of S, then
= sup S

 > 0 y S y > .
The proof is almost there once it is understood what the condition on the right hand
side means. It says for all positive  there is an element y of S so that y > . So let
us prove that this condition holds if is a supremum, i.e. show the implication = of
the equivalence that we want to prove. Suppose otherwise; that is, is a supremum but
 > 0 y S y >  is false, that is, there is some  > 0 so that
y S y >  is false, that is, for all y in S we have that y >  is false, that
is, for all y in S we have y . But this means that  is an upper bound for S,
which is less than since  > 0, contradicting our assumption that is the least upper
bound. So = indeed holds.
Conversely, assume that is an upper bound for S and that for all positive  there is an
element y of S so that y > . Then must be the least upper bound of S since if
there was an upper bound 0 of S with 0 < , taking  = 0 (which is > 0) would
give an  > 0 with y S y , contradicting the assumption. Hence = sup S
(since is an upper bound of S by assumption), showing the implication = as well.

137

9. Supremum and infimum

In fact, this theorem is not that difficult, but we have argued very carefully using the
logic of quantifiers and the negations of quantified formulae to get you used to that.
Remember that y P (y) means for all y property P (y) holds, and that its negation
y P (y) is equivalent to y P (y), meaning that P (y) is not true for all y if it is
false for at least one y (a rather obvious statement). Similarly, the statement y P (y)
means there is a y so that P (y) holds, and its negation there is no y such that P (y)
holds, that is, y P (y), is equivalent to y P (y) and means that for all y, P (y)
is false (again, rather obviously). When quantifiers start to pile up, it is useful to go
through these negation processes step by step, as illustrated in the preceding proof.

Sketch answers to or comments on sample questions


Answer to question 9.1
For the first inequality, note that, since |a + b| |a| + |b|, we have
|x z| = |(x y) + (y z)| |x y| + |y z|.
For the second inequality, we need to prove
|x y| |x| |y| |x y|.
So there are two inequalities to prove. The leftmost follows from the triangle inequality,
as follows:
|y| = |(y x) + x| |y x| + |x|,
so
|x y| = |y x| |x| |y|.
For the second, we observe that

|x| = |(x y) + y| |x y| + |y|.

Answer to question 9.2


For the first set, min=inf=1, and there
is nosup and no max. For the second, we note
that the set
(1 2, 1 + 2). There is no max or min, but
is the open interval

sup = 1 + 2, inf = 1 2. The third set equals [2, ). No max or sup, but
inf=min=2.
Answer to question 9.3
There are several ways of proving the result. It is given that A is bounded, so
A = sup A exists and, a A for all a A. So, a, 2a + 3 2A + 3. So B is bounded
above by 2A + 3. Since B is bounded above, B = sup B exists and, since 2A + 3 is an
upper bound, we must have
B 2A + 3.
We need to show =, rather than . We could now proceed to prove it by contradiction,
supposing that B < 2A + 3 and showing that, as a consequence of this supposition,

138

9.3. Sketch answers to or comments on sample questions

A0 = (B 3)/2 is an upper bound on A, which is less than A a contradiction.


Almost equivalently, we could simply observe that
A = {(b 3)/2 | b B}
so, arguing as at the beginning,
A (B 3)/2;
that is,
2A + 3 B .
Either way, we get what we need to complete the proof.
Answer to question 9.4
This is tricky. A diagram based on the real line might help. Suppose, for a contradiction,
that = sup A, = inf A, and > 1. Suppose that = 1 + r. Then, for any
s > 0, by the definition of sup and inf, there are x, y A with x < + s and y > s.
Lets choose s small enough that this will imply that y > x and also that y x > 1
(giving our contradiction). To make y > x, we need s < ( )/2. We then have
|x y| = y x = ( s) ( + s) = ( ) 2s,
and the right hand side will be at least 1 if s < r/2.
In other words, we can formally argue as follows: Suppose > 1, and that
= 1 + r where r > 0. Let 0 < s < min(r/2, ( )/2). Then, by the properties of
sup and inf, there exist y, x A with y > s and x < + s. We then have y > x
(because s < ( )/2) and, because s < r/2, we have
|x y| = y x > ( s) ( + s) = ( ) 2s > (1 + r) 2s > (1 + r) r = 1.
So |x y| 1, a contradiction to the given information.

Answer to question 9.5


The proof is almost there once it is understood what the condition on the right hand
side means. It says for all positive  there is an element y of S so that y < z + . So
let us prove that this condition holds if z is an infimum, i.e., show the implication =
of the equivalence that we want to prove. Suppose otherwise, that is, z is an infimum
but  > 0 y S y < z +  is false, that is, there is some  > 0 so that
y S y < z +  is false, that is, for all y in S we have that y < z +  is false, that
is, for all y in S we have y z + . But this means that z +  is a lower bound for S,
which is bigger than z since  > 0, contradicting our assumption that z is the greatest
lower bound. So = indeed holds.
Conversely, assume that for all positive  there is an element y of S so that y < z + .
Then z must be the greatest lower bound of S since if there was a lower bound w of S
with w > z, taking  = w z would give an  > 0 with the following property:
y S

z +  y,

contradicting the assumption. Hence z = inf S (since z is a lower bound of S by


assumption), showing the implication = as well.

139

9. Supremum and infimum

140

Chapter 10
Sequences and limits

Essential reading
Bryant, Victor. Yet Another Introduction to Analysis. Chapter 2 (but not the
material on Infinite Sums.

(Or, similar reading from other analysis books. Use the index to find material on
sequences and limits.)

10.1

Introduction

The single most important idea in the Analysis part of this subject is the notion of the
limit of a sequence, and this is the topic of this chapter. This idea can sometimes
appear daunting, and many students struggle with it when they first meet it. But it is
of such importance in this subject that you are urged to grapple with it until it makes
sense to you.

10.2

Examples of sequences

A sequence is an infinite ordered list a1 , a2 , a3 , . . . of real numbers. An example is given


by an = 1/n for all natural numbers (positive integers) n; that is, the sequence
1 1 1 1
1, , , , , . . . .
2 3 4 5
What happens to the numbers 1/n as n gets larger and larger? Clearly they approach
0. But how do we formalise this? Before considering limits of sequences in a formal way,
we consider another important sequence of numbers (which will show us that things are
not always so simple as they are for the sequence just considered).
For a fixed x > 0, let the numbers a1 , a2 , a3 , . . . be defined by
a1 = 1, a2 = 1 + x, a3 = 1 + x +

x2
x2 x3
, a4 = 1 + x +
+
2
2
6

and, in general,
n1

an = 1 + x +

X xi
x2
xn1
+ +
=
.
2
(n 1)!
i!
i=0

Although that sequence (obviously) increases with each step, it turns out that, as n
increases, the numbers approach a fixed number. That is, the sequence has a limit,

141

10

10. Sequences and limits

which we will denote by limn an . This is far from clear. The value of that limit
depends on x and defines a very important function in mathematics, which has its own
name, exp(x). That is,
n1 i
X
x
,
exp(x) = lim
n
i!
i=0
which we write as
exp(x) =

X
xi
i=0

i!

(Be aware that the infinite sum notation used here is just a shorthand for the limiting
value of the finite sum of the an ; we dont really add up an infinite sequence of
numbers. You will find out more about such matters if you take 41 Advanced
mathematical analysis, in which series are studied.) Considered as a function of x,
this is the so-called exponential function, a function of central importance in analysis
and calculus.

10.3

Sequences and limits

10.3.1

Sequences: formal definition

Let N = {1, 2, 3, . . .} be the set of natural numbers (positive integers). Formally, a


sequence is a function from N to R. Only the notation is unusual: Instead of writing
f (n) for the value of the function f , say, for the natural number n, that value is written
an . The entire sequence is then written in one of the following ways, all legitimate:
(an )nN ,

(an )
n=1 ,

(an )n1 ,

(an ),

where the last expression (an ), the one we will use most often, is an abbreviation that
should only be used when it is clear that the index n runs through the natural numbers.

10

The nth element (or nth term) an of a sequence may be defined explicitly by a formula
involving n, as in the two examples given above. Alternatively, a sequence might be
defined recursively (or inductively). For example, we might have a1 = 1 and
an+1 = an /2 for n 1.

10.3.2

The definition of a limit

As an example, let us calculate the first few terms of the sequence given by
a1 = 1, an+1 =

3
an
+
(n 1).
2
2an

We have
a1 = 1, a2 = 2, a3 = 1.75, a4 = 1.73214, a5 = 1.7320508, . . . .

142

10.3. Sequences and limits

Activity 10.1 Check these calculations of a2 , . . . , a5 .


Note that the terms get closer and closer to 1.73205
. . . as n increases. In fact, the
number the terms approach
can be shown to be
3 = 1.7320508057 . . . . We say that

the sequence has limit 3, or that it converges to 3.


Lets take things easy to start with. We intuitively know what we mean by saying that
the sequence (1/n) tends to the limit 0 as n tends to infinity. In symbols, we may write
1/n 0 as n . Well, I say we know what it means, but do we, really? Before we go
further, try the following learning activity.
Activity 10.2 Try to write down, precisely, what it means to say that a sequence
(an ) of numbers tends to 0 as n . Its not that easy.
You might suggest something like the terms get closer and closer to 0, but although
this is indeed a property of the sequence (1/n), it is not a sufficient definition of tending
towards 0. For, consider an = 1/2 + 1/n. This does not tend to 0. (It tends to 1/2, as
you can begin to see by calculating some of the numbers.) Although it is the case that
the terms get closer and closer to 0. (The point is, they dont get close enough.)
Consider now the sequence bn given as follows
1
1 1 1 1 1 1
, 1, , , , , , , . . . .
2
3 2 4 3 5 4
So, b2k = 1/k and b2k1 = 1/(k + 1). This sequence tends to the limit 0, but it is not the
case that the terms get closer and closer to 0, because we have, for any k, b2k > b2k1 .
Anyway, all this is just to try to convince you that we really do need a formal, precise,
definition of what it means for a sequence to tend to 0 if we are to have a notion of limit
that works well. And here it is. . . .
Definition 10.1 The sequence (an ) is said to tend to 0 (as n tends to infinity) if for all
positive real numbers  there is a number N = N () such that for any natural number
n > N , the distance of the element an from 0 is at most :
 > 0 N n > N

|an | < .

(?)

If this holds, we write


an 0 as n
and also
lim an = 0.

The definition may also be understood as saying that for any  > 0, there is N such
that if n > N then the size |an | of the nth term is smaller than . Another
interpretation is: given any error margin , the terms of the sequence are ultimately
equal to 0 within that error margin.
It should be understood in (?) that  is an arbitrary positive real number, whereas n is a
natural number. Also, N depends on , something I have stressed by writing N = N ().

143

10

10. Sequences and limits

As an example, consider the sequence (an ) defined by an = 1/n for n N. The limit of
this sequence is zero, which can be seen, formally, as follows: To show (?) holds,
consider any  > 0. Then we look for an N such that whenever n > N , then |an | < ,
that is, 1/n <  or, equivalently, 1/n < , that is, n > 1/. So a suitable value for N in
that case is N = 1/ (which may not be an integer, but this poses no problem since n in
(?) is always an integer).
In general, (?) states that no matter how close one wants to be to the limit 0 (that is,
no matter how small  is chosen), the elements an are eventually all closer to 0 than
distance . Eventually all means all except for a finite number, where these finite
exceptions must be among the elements an with 1 n N . In general, and fairly
obviously, the smaller  is, the larger N must be. We see this in the above example, in
which N = N () = 1/.
What about a sequence tending to some limit other than 0? The following definition
shows how we can easily adapt the one just given.
Definition 10.2 The sequence (an ) is said to have the limit a (as n tends to infinity) if
and only if an a tends to 0.
Explicitly, we therefore have:
Definition 10.3 The sequence (an ) is said to have the limit a (as n tends to infinity) if
for all positive real numbers  there is a number N = N () such that for any natural
number n > N , the distance of the element an from a is at most :
 > 0 N = N () n > N

|an a| < .

(??)

If this holds, one also says that the sequence tends to a as n tends to infinity, written as
an a as n
and also as
a = lim an .
n

A sequence that has a limit is called convergent, otherwise divergent.

10

In general, (??) states that no matter how close one wants to be to the limit a (that is,
no matter how small  is chosen), the elements an are eventually all closer to a than
distance .
Definition (??) is crucial and should be understood thoroughly. Note that the following
conditions are equivalent:
|an a| < ,
a  < an < a + ,
an (a , a + ) .
The interval (a , a + ) is also called the -neighbourhood of a.
Note also, for example, that the reference to N in (??) cannot be omitted: The (rather
useless) statement  > 0 n |an a| <  would be trivially true if a was one of the

144

10.3. Sequences and limits

elements an of the sequence. Furthermore, it is also not sufficient that infinitely many
elements of the sequence are close to a: The sequence (an ) defined by an = (1)n is
given by 1, 1, 1, 1, 1, . . . and alternates between 1 and 1. Either of these two
numbers has an infinite number of elements of the sequence close to it, but not
eventually all of them. (Take  = 1 in (??) where there is no N with the property stated
there.) This is an example of a divergent sequence.
We can give a diagrammatic representation of convergence of a sequence. Suppose we
have a sequence (an ) with limit L. Then, as illustrated in Figure 10.1, we can think of
convergence as follows. Pick any  > 0, and consider the shaded strip of width  around
the horizontal line passing through L. Then one can find some N N, large enough,
such that all the terms an of the sequence, for n > N lie in the shaded strip.

K
L
M

K
L
M
N

Figure 10.1: Convergence of a sequence with limit L.

Certain sequences diverge because their members become arbitrarily large. For these
sequences, a useful definition can be given.
Definition 10.4 The sequence (an ) tends to infinity, written an as n , if
K N n > N

an > K.

This is also written as


lim an = .

Similarly, the sequence tends to minus infinity if (an ) tends to infinity.


Note: If a sequence tends to infinity, it is divergent (not convergent). Note also that a
sequence may diverge because it tends to or , or because, while remaining
bounded like an = (1)n , it simply fails to tend towards any fixed real number. Do not
confuse divergence with tending to infinity or minus infinity: divergence is more general
than this. The key thing is to be clear that convergence means tending towards a
(finite) real number.

145

10

10. Sequences and limits

As I mentioned, the definition of a limit is the most important thing in this part of the
subject. Its even more important than any of the Theorems. Definitions matter. We
need them because we need to know precisely what we mean by saying, for example,
that a sequence converges to a number. You cannot even properly begin to prove things
about the limits of sequences until you know the formal definition. This should be clear:
how can you prove that a sequence converges to 0 if you dont even know what
converges to 0 means? Try to make sure, now, that you understand this definition. Its
meaning will become clearer as we work further with it. I know it might be difficult: this
is well-known to be one of the hardest conceptual hurdles in university mathematics.
Bryant (on the back cover of his book) claims that A first course in Analysis at college
is always regarded as one of the hardest in the curriculum. Anybody who has taught
or even studied university-level mathematics is very aware that it can initially be
difficult to handle the concept of limit. So, it might be difficult at first, but it is
nonetheless important that you do your best to grapple with this important concept.
Once the idea clicks into place you will wonder why you ever had any problems with it.

10.4

Some standard results on and properties of


limits

The notation limn an suggests that the limit of a sequence is unique. Indeed, this is
the case:
Theorem 10.1 A sequence has at most one limit.
Proof
Consider a sequence (an ) and assume, to the contrary, that it has two limits a and b
with a 6= b. We will arrive at a contradiction by choosing  in the definition of limit
small enough. Intuitively, if  is less than half of the distance between a and b, then an
element that is less than  away from a must be farther than  away from b. So it cant
be the case that eventually all elements of the sequence can be that close to both a and
b.

10

To formalize this, let  = |a b|/2, which is positive since a 6= b. Then for some N1 and
all n > N1 , since a is a limit of the sequence, |an a| < , and similarly for some N2 and
all n > N2 , since b is a limit of the sequence, |an b| < . But then for
n > N = max{N1 , N2 },
2 = |a b| |a an | + |an b| = |an a| + |an b| <  +  = 2,
which is a contradiction. Hence it is not possible that the sequence has more than one
limit.
A slightly different way to argue this, which is essentially the same as that just given, is
as follows. As above, let  > 0 be any given positive number. Then, with N1 and N2 as
above, we deduce that for any n > N = max{N1 , N2 }, |a b| < 2. Now, this must be
true for any  > 0. The only way that a non-negative number X can satisfy X < 2 for
all  > 0 is if X = 0. (For, if X > 0, then the inequality X < 2 will not hold if
 = X/2.)

146

10.4. Some standard results on and properties of limits

10.4.1

Using the formal definition of limit

We now look at a few examples in which the formal definition of limit is explicitly used.

Example 10.1 Suppose that xn = 1/( n + 1). We will prove that xn 0. Now,
this means that we need to show
() such that if
that, for any
 > 0, there is N = N
n > N , then xn < . Since 1/( n + 1) < 1/ n, it suffices to have 1/ n < , which
means n > N = 1/2 .
Note that we do not need to find the smallest such N (): it is enough to find any
suitable one. What this means is that it is fine
to use the inequalities we used above,
and we do not need to solve the equation 1/( n + 1) = . (This can, however, be done,
in this case. In other, more complicated examples, it would be a nightmare.)
Example 10.2 Suppose that xn = 1 n12 . We prove that xn 1. This means we
must show that for any  > 0, there is N = N () such that if n > N , then
2
2
|xn 1|< . We have |xn 1| = 1/n
. This is <  provided n > ; that is,
n > 1/ . So a suitable N () is 1/ 
Example 10.3 Let (xn ) be the sequence defined by
xn =

2n 3
.
n+3

We prove xn 2. What does it mean to say xn 2? It means that if somebody


gives us any positive , we have to produce a number N () with the property that if
n > N () then |xn 2| < . Now,




2n 3 2(n + 3) 9
2n 3




= 9 .

=

n + 3 2 =
n+3
n + 3 n + 3
We therefore want to know that 9/(n + 3) <  for all n greater than some number
N (). We could solve this inequality for n to obtain n > 9/ 3. So a suitable N ()
is 9/ 3. But we dont need to be so careful. Just approximate (in the right way, by
which is meant bounding above): since 9/(n + 3) < 9/n, it suffices to have 9/n < ;
that is, n > N () = 9/. This is a bit easier.
Example 10.4 Lets prove directly that
an =

4n2 n + 3
2.
2n2 n + 1

The difference between an and 2 is


2

4n n + 3

|an 2| = 2
2
2n n + 1
2

4n n + 3 4n2 + 2n 2


=

2n2 n + 1
n+1
=
.
2
2n n + 1

147

10

10. Sequences and limits

Now, we want to find N () such that this will be <  for all n > N (). Now, since the
inequality (n + 1)/(2n2 n + 1) <  is tricky, we approximate, by bounding above by
some simple bn . Now, wed like to bound the numerator above, and get the terms on
the numerator to be the same degree as each other. Well, n + 1 n + n = 2n. What
about the denominator? We can throw away the +1, but we cannot just throw away
the n, because we need to lower bound the denominator. Well, n n2 , so
2n2 n + 1 2n2 n2 + 1 n2 , so we have (for n 2)
|an 2|

2
2n
=
2
n
n

and this is <  provided n > N () = 2/.


Fortunately, there are easier ways of proving limits, but these examples are all good,
clean healthy fun and good practice in understanding the definition. Moreover, if an
examination question asks you to use the definition of limit, it is expected that you
answer in this sort of way.

10.4.2

Bounded sequences

A sequence is said to be bounded if all its elements belong to an interval of the form
[a, b]. (That is, if the set of terms of the sequence is a bounded set of real numbers.)
Then, a is any lower bound and b any upper bound of the set of elements of the
sequence. The case that only lower or upper bounds exist is also of interest.
Definition 10.5 Let (an ) be a sequence and S = {an | n N}. Then the sequence (an )
is said to be bounded below if S has a lower bound, bounded above if S has an upper
bound, and bounded if it is bounded above and below.
Theorem 10.2 Any convergent sequence is bounded.

10

Proof
Let (an ) be a convergent sequence with limit a. Then the sequence is bounded since for
sufficiently large n, the elements an of the sequence are close to a, whereas there is only
a finite number of them that are not close to a (and those finitely many elements are
also bounded). To make this precise, we use the definition of a limit with  = 1, so that
for some N and all n > N we have a 1 < an < a + 1 (any other positive number 
instead of 1 would do as well). Then we define as lower and upper bounds L and U
L = min ({an | 1 n N } {a 1}) ,
U = max ({an | 1 n N } {a + 1}) ,
where these are minima and maxima of finite sets and therefore exist. Then clearly,
L an U holds for all n in N, so the sequence is indeed bounded.
This result says that convergent sequences are bounded, but theres no reason at all
why a bounded sequence must converge. For example, the sequence given by
xn = (1)n is certainly bounded, but does not converge.

148

10.4. Some standard results on and properties of limits

Activity 10.3 Convince yourself that xn = (1)n is bounded, but does not
converge.

10.4.3

Monotonic sequences

The next definition concerns sequences a1 , a2 , a3 , . . . that satisfy


a1 a2 a3
or
a1 a2 a3
which are called increasing and decreasing, respectively. Note that some elements in
such a sequence can be equal to their predecessors, so the sequence does not have to go
strictly up or down.
Definition 10.6 A sequence (an ) is increasing if an an+1 for all n N, decreasing if
an an+1 for all n N, and monotonic if it is increasing or decreasing.
The following result is very useful.
Theorem 10.3 An increasing sequence that is bounded above has a limit.
Proof
Let (an ) be an increasing sequence that is bounded above and consider the set of its
elements S = {an | n N}. There is a natural candidate for the limit of (an ), namely
= sup S, which exists since S is non-empty and has an upper bound by assumption.
Because = sup S, if  > 0 there is some y S such that y > . But, because S is
simply the set of an , this means there is some N N such that aN > . Now, the
sequence is increasing, so for all n > N , we must have an aN > . Also, since is
an upper bound on S, an for all n. Thus, for all n > N ,  < an and, in
particular, |an | < . This shows that an as n .
We could also, alternatively, prove the result by contradiction. Suppose, to the contrary,
that is not the limit of (an ). Then
 > 0 N n > N

|an | .

So consider such an  > 0 and take any N N, so that, according to this condition,
there is some n > N with |an | . Since the sequence is increasing and since is an
upper bound for S, this means
aN an = |an | .
Because N is an arbitrary natural number here, aN  for all N N. But this
means that  is an upper bound for S that is strictly less than the supremum of
that set, and this is a contradiction. So is indeed the limit of the sequence.
We have not only shown that an increasing sequence that is bounded above has a limit,
but that this limit is the supremum of its elements. Similarly, one can see the following:
Theorem 10.4 A decreasing sequence that is bounded below converges to the
infimum of its elements.

149

10

10. Sequences and limits

10.5

Algebra of limits

When computing the limits of a sequence, it is useful to apply arithmetic rules if the
terms are the sum, product, etc., of terms of sequences that have a known limit
behaviour. For example, one can prove using the formal definition of a limit that the
sequence (an ) defined by
4n2 + 9
an = 2
3n + 7n + 11
converges to 4/3. However, it is simpler to observe that
an =

4 + 9/n2
3 + 7/n + 11/n2

where the terms 9/n2 , 7/n and 11/n2 all have limit zero and can be replaced by their
limit to obtain that

lim 4 + 9/n2
n

lim an =
n
lim 3 + 7/n + 11/n2
n

lim (4) + lim (9/n2 )

lim (3) + lim (7/n) + lim (11/n2 )

4+0
3+0+0

4
= .
3

Such an algebra of limits is possible because of the following observation.


Theorem 10.5 Let (an ) and (bn ) be convergent sequences with limits a and b,
respectively. Let C be a real number and let k be a positive integer. Then as n ,

10

(a) Can Ca,


(b) |an | |a|,
(c) an + bn a + b,
(d) an bn ab,
(e) akn ak ,
(f) if bn 6= 0 for all n and b 6= 0, then 1/bn 1/b as n .
Proof
We prove (d) as an example to illustrate the proof method in one of the more difficult
cases. Note that (d) implies (e) by induction on k, taking (bn ) to be the sequence
defined by bn = ak1
n . To show that an bn ab as n , consider any  > 0. The goal
is to show that |an bn ab| <  for all sufficiently large n. Here we need a trick in order

150

10.6. Some useful limits

to exploit that an a and bn b as n , which is all we know at this point. The


trick is to insert the term abn + abn , which is zero, into the sum to obtain
|an bn ab| = |an bn abn + abn ab|
|an bn abn | + |abn ab|
= |(an a) bn | + |a (bn b)|
= |an a| |bn | + |a| |bn b|.
We can make this less than  if each of the two terms on the right-hand side is less than
/2. Since the sequence (bn ) converges, it is bounded, as proved above, so that for some
real number B we have |bn | < B for all n, where B > 0 (we will need this shortly since
we divide by B). Then
|an a| |bn | < |an a| B,
so in order to have the left hand side smaller than /2 it suffices to have
|an a| <


,
2B

which holds for n > N1 , say, since (an ) converges to a, so that then
|an a| |bn | < |an a| B <


.
2

Similarly, we obtain
|a| |bn b| < /2
immediately if a = 0, otherwise for all n > N2 so that
|bn b| <


,
2 |a|

where N2 exists since (bn ) converges to b. Combining these observations, we let


N = max{N1 , N2 } so that for any n > N , |an bn ab| < , as desired. This reverse

instead of  in the definition of
construction of N1 and N2 using a modified term like 2B
limit is very typical in convergence proofs.
Be careful not to misinterpret (e). This works only for a fixed number k. So, for
example, since 1 + 1/n 1 as n , we have (1 + 1/n)k 1k = 1 for any k N.
However, (1 + 1/n)n e as n .
Activity 10.4 Prove (c) of the Theorem above by using the fact that
|(an + bn ) (a + b)| = |(an a) + (bn b)| |an a| + |bn b|.

10.6

Some useful limits

The following result is useful.

151

10

10. Sequences and limits

Theorem 10.6 If |a| < 1, then limn an = 0.


Proof
Suppose first that 0 < a < 1, and let xn = an . Then each term in the sequence (xn ) is
positive. Also, xn+1 = an+1 = axn < xn , so (xn ) is a decreasing sequence. Since xn 0
for all 0, the sequence is bounded below. So the sequence is bounded and monotonic
and hence it must converge to some limit L. Now (convince yourself!), if xn L then
xn+1 L too. But xn+1 = aL, so by the algebra of limits,
lim xn+1 = lim(axn ) = a lim xn = aL.
Hence L = aL, so L(1 a) = 0. But since 1 a 6= 0, this means L = 0. To sum up,
then, xn = an 0 as n . If 1 < a < 0 then for any n, |an 0| = |a|n . But
0 < |a| < 1, so we know from what weve just shown that |a|n 0 as n . Thus
|an 0| 0. Hence an 0. (This last step can be seen as an application of a general
result, namely that if |xn | 0 then xn 0. It should not be thought, however, that if
|xn | L for general L then xn L; that simply isnt true, as can be seen by
considering the sequence xn = (1)n .)
In this proof we used the simple observation that if xn L then xn+1 L. This is
often useful in determining what the limit of a recursively defined sequence must be (if
it converges).
Activity 10.5 Prove that xn L if and only if xn+1 L.
Example 10.5 Consider the example we met earlier:
a1 = 1, an+1 =

3
an
+
(n 1).
2
2an

If this sequence converges, to L, then an L and an+1 L too. But


an+1 =

10

an
3
L
3
+
+
,
2
2an
2 2L

so we must have
L=

L
3
+
,
2 2L

which may be written as


L2 =

L2 3
+ , L2 = 3,
2
2

so L = 3 or 3. Since (as can easily be shown) an 0 for all n, we have L 0


and therefore the limit (if it converges,as in fact it does, though to show this
requires extra work), then the limit is 3.
Another useful result is the following. This is often described as saying that nth powers
dominate powers of n.
Theorem 10.7 Given any a > 1, and any k N, an /nk as n . Also, given
any b with 0 < b < 1 and any k N, bn nk 0 as n .

152

10.7. The sandwich theorem

10.7

The sandwich theorem

A useful result is the sandwich theorem (also known as the squeeze theorem).
Theorem 10.8 (Sandwich Theorem) Let (an ), (bn ), (cn ) be sequences such that
an bn cn for all n 1 and
lim an = L = lim cn .

Then
lim bn = L.

Proof
Because an L and cn L, given any  > 0, there are N1 , N2 such that for n > N1 ,
|an L| <  and, for n > N2 , |cn L| < . If we set N = max{N1 , N2 }, the larger of N1
and N2 , then for n > N we have n > N1 and n > N2 , so both the above conditions hold:
|an L| <  and |cn L| < . These may be written as
L  < an < L + , L  < cn < L + ,
so for n > N ,
L  < an bn cn < L + ,
which says |bn L| < . Thus bn L as n .
As an example, consider the sequence (an )nN where
1
1
1
+ 2
+ ... + 2
.
an = 2
n +1 n +2
n +n
1
1
1
Clearly an n 2
=
. Furthermore, an 0. Thus, since
1
n +1
n
n+ n
1
,
n n

lim 0 = 0 = lim

we have by the sandwich theorem that




1
1
1
lim
+
+ ... + 2
= 0.
n
n2 + 1 n2 + 2
n +n
Another application of the sandwich theorem is to prove that xn tends to zero if |x| < 1.
One approach to proving this goes as follows: Let  > 0. We are looking for a number N
so that n > N implies |xn | < , which is equivalent to |x|n < . Here we can without loss
of generality assume x > 0, since otherwise we can replace x by x and if x = 0 the
claim holds trivially. Now observe the following equivalences, where the logarithm is
taken for any basis and log x < 0 since x < 1, so the inequality reverses when dividing
both sides by log x:
xn < 
log(xn ) < log 
n log x < log 
n>

log 
.
log x

153

10

10. Sequences and limits

That is, if N = log / log x, then n > N implies xn <  as desired. The problem with this
derivation of N is that it uses the definition of the logarithm, which if done formally
requires a number of other concepts in analysis that are not yet proven at this point.
In short, we want a more elementary proof that xn converges to zero if 0 < x < 1. To
achieve this, observe that, because 0 < x < 1, x can be written as 1/(1 + h) for some
h > 0. By induction on n, one can easily prove
(1 + h)n 1 + hn
which implies
1
1
1
<
.

n
(1 + h)
1 + hn
hn
Then taking an = 0, bn = xn and cn = 1/hn and applying the sandwich theorem shows
that xn 0 since cn 0 as n , as desired.
xn =

This is not the only way to show that xn 0 if 0 < x < 1. Weve seen another way
above.

10.8

Subsequences

Consider the sequence


1 1 1 1 1
1, , , , , , . . .
2 3 4 5 6
If we were to cross out every other term our sequence would become
1 1 1
, , ,...
2 4 6
The resulting sequence is called a subsequence of the original sequence. The first term of
the new sequence is the second term of the original sequence; the second term of the
new sequence is the fourth term of the original sequence, and so on. Thus the elements
of the new subsequence can be found by looking at the appropriate position in the
original sequence. More precisely, the nth term of the subsequence is simply the (2n)th
term of the original sequence. The formal definition of a subsequence is as follows.

10

Definition 10.7 Let (an )nN be a sequence and consider strictly increasing natural
numbers k1 , k2 , k3 , . . ., that is, k1 < k2 < k3 < k4 < . Then the sequence (akn )nN is
called a subsequence of the sequence (an )nN .
In the above example, we have obtained a subsequence given by the indices k1 = 2,
k2 = 4, etc., in general kn = 2n. The indices kn tell us precisely which terms to keep
from the original sequence.
An alternative definition of a subsequence changes the set of indices N to a subset S of
the natural numbers. Namely, let S be an infinite subset of N. Then a subsequence of
(an )nN is simply given by
(am )mS
with the understanding that the index m runs through the elements of S in increasing
order. Namely, here m = kn and the index set S is the set
S = {k1 , k2 , k3 , . . .} = {kn | n N} .

154

10.8. Subsequences

The set S can always be written in this way because it is infinite.


Theorem 10.9 Let (an ) be a sequence which tends to a limit a. Then any subsequence
also tends to the limit a.
Proof
The proof is almost immediate: Let an a as n , and let (akn ) be a subsequence.
Then by the definition of limits,
 > 0 N n > N

|an a| < .

Now, because the sequence k1 , k2 , . . . is increasing, we have kn n for all n. So, for all
n > N , since kn > N , we have |akn a| < . This means that the subsequence (ank )
converges to a.
Here are two examples. The sequences (1/2n )nN and (1/n!)nN are subsequences of
(1/n)nN . Secondly, ((1)2n )nN and ((1)2n1 )nN are subsequences of ((1)n )nN . The
latter are two constant sequences with constant terms 1 and 1, respectively. From this
we can see that ((1)n ) has no limit by using the result that each subsequence of a
convergent sequence has the same limit. This is because ((1)2n ) tends to 1 as n ,
whereas ((1)2n1 ) tends to 1.
Theorem 10.10 Every sequence has a monotonic subsequence.
Proof
We first give a nice illustration of this proof taken from Bryants book. Assume that
(an ) is the given sequence, and that an is the height of a hotel with number n, which is
followed by hotel n + 1, and so on, along an infinite line where at infinity (to the right)
there is the sea. A hotel is said to have the seaview property if it is higher than all
hotels following it. That is, a hotel without seaview is followed sooner or later by a hotel
of at least that height. See Figure 10.2.

10

Figure 10.2: The seaview property.

Now there are only two possibilities: Either there are infinitely many hotels with
seaview. Then they form a decreasing (in fact strictly decreasing) subsequence. Or there
is only a finite number of hotels with seaview, so that after the last hotel with seaview,
one can start with any hotel and then always find one later that is at least as high,
which is taken as the next hotel, then considering yet another that is at least as high as
that one, and so on. Then the subsequence of hotels generated in this way is increasing,
although not necessarily strictly.
Formally, let S = {m N | n > m am > an } (which is the set of numbers of hotels
with seaview). Consider the following two possible cases.
Case 1. S is infinite. Then clearly, (am )mS is a decreasing subsequence of (an )nN .

155

10. Sequences and limits

Case 2. S is finite. Then if S is empty, let k1 = 1, otherwise let k1 = max(S) + 1. This


means that for all n k1 , the sequence element an does not belong to S (hotel n does
not have seaview). Then define inductively for n = 1, 2, 3, . . .
kn+1 = min{m N | m > kn and am akn },
which is possible since the set on the right hand side is not empty (otherwise kn would
belong to S). In other words, kn+1 (which we intend to be the next index in the
subsequence) is the smallest number greater than kn so that akn+1 akn . This implies
that k1 , k2 , k3 , . . . are the indices of a subsequence of (an )nN so that
ak1 ak2 ak3 . That is, (akn )nN is an increasing subsequence of (an )nN .
We obtain from this as a corollary the following famous result.
Theorem 10.11 (The Bolzano-Weierstrass Theorem) Every bounded sequence of
real numbers has a convergent subsequence.
Proof
Let (an ) be a bounded sequence. By Theorem 10.10, it has a monotonic subsequence
(akn ). This subsequence is then also bounded and we have seen (Theorem 10.3) that
bounded monotonic sequences are convergent.

Learning outcomes
At the end of this chapter and the Essential reading and activities, you should be able
to:

10

explain what is meant by a sequence


work with sequences defined by a formula or defined recursively
explain precisely what it means to say that a sequence converges or tends to
infinity or minus infinity
determine limits and prove results using the formal definition of the limit of a
sequence
explain what is meant by a sequence being bounded, bounded above or bounded
below
comprehend the links between boundedness and convergence
prove and use the fact that bounded monotonic sequences converge
calculate limits of sequences using the algebra of limits
prove and use the Sandwich Theorem
know precisely what is meant by a subsequence of a sequence
know that any bounded sequence has a convergent subsequence.

Sample examination questions


Question 10.1
Suppose that (xn ) is a sequence of real numbers which converges to a limit L > 0. Show
that there is N N such that for all n N , xn > 0.

156

10.8. Sample examination questions

Question 10.2
Let (an ) be the sequence defined by
an =

4n2 + 9
.
3n2 + 7n + 11

Using the formal definition of the limit of a sequence, show explicitly that
an

4
as n .
3

[Like the previous exercise, this means you must produce N ().]
Question 10.3
Let (an )nN be a sequence, and let (bn )nN be the sequence defined by bn = |an | for
n N. Which of the following two statements implies the other?
(a) (an ) converges.
(b) (bn ) converges.
Question 10.4
Prove that a sequence which is decreasing and bounded below converges.
Question 10.5
Let (xn ) be a sequence of non-negative real numbers (i.e., xn 0 for each n). Suppose
that the sequence converges to x. Prove that x 0. [Note, however, that a sequence of
positive terms need not have a positive limit; for example, (1/n) is a sequence of
positive terms converging to 0.]
Question 10.6
Suppose the sequence (xn ) of positive terms converges to 0. Prove that if yn = 1/xn
then the sequence (yn ) tends to infinity.
Let (xn ) be a sequence of positive terms such that the sequence (1/xn ) tends to infinity.
Show that (xn ) converges to 0.
Question 10.7
Find the limits as n of:
4n 5
,
22n 7

5(32n ) 1
,
4(9n ) + 7

1 + 2 + + n
,
n2

n2 1
.
n3 + 1

Question 10.8
A sequence (xn ) is defined as follows. Let x1 be any positive real number and, for n 1,
let
x2 + K
xn+1 = n
,
2xn
where K is a fixed positive number.

157

10

10. Sequences and limits


2
2
(i) By using the
inequality a + b 2ab for real numbers a, b (or otherwise), prove
that xn K for n 2.

(ii) Prove that xn+1 xn for all n 2.


(iii) Deduce that (xn ) converges and find its limit.
Question 10.9
Find limn xn in the following cases:
xn =

2n3 + 1
,
3n3 + n + 2

xn =

n
(n + 1)
2n
+
+ + 2.
2
2
n
n
n

Question 10.10
Prove by induction that if n N and x 1 then
(1 + x)n 1 + nx.

Taking x = 1/ n in this, show that


1/n

1n

2/n
< 1+ n

1
1+
n

2
.

Deduce that
lim n1/n = 1.

Question 10.11
Using the fact that for a > 0, a1/n 1 as n , and a sandwiching argument, find
lim (1n + 2n + 3n + 4n )1/n .

Question 10.12

10

Find


lim

+
+ +
n2 + 1
n2 + 2
n2 + n
1


.

Question 10.13
For the positive sequence (an ), it is given that there exist numbers N and such that
< 1 and
an+1
0<
< n > N.
an
Prove that (an ) is eventually decreasing and that an 0 as n . Give an example
to show that if we relax this condition and only have
0<

an+1
< 1 n > N,
an

then we cannot conclude that an 0.

158

10.8. Comments on selected activities

Question 10.14
Discuss the behaviour as n of the following:
 n
1
2n
3
(n + n)
,
,
2
n3 + n
22n + n
,
n3 3n + 1

2n3 + 1
n+1

n+1

 n
3
,
4

n.

Question 10.15
Let (an )be a sequence of non-negative numbers. Prove that if an L as n then

an L as n . [This is not covered by the algebra of limits results (why not?):


prove the result explicitly using the formal definition of a limit.]

Comments on selected activities


Feedback to activity 10.4
In a proof like this, you need to know what information you have, precisely, and you
need to know where you want to get to, precisely. That all seems very obvious, but its
crucial. What we know is that an a and bn b. What that really means is the
following two statements: for all 1 > 0, there is N1 such that |an a| < 1 for all
n > N1 ; and, for all 2 > 0, there is N2 such that |bn b| < 2 for all n > N2 . (Why
didnt I just write plain old  and N in each? Well, I want to keep things as general as
possible at the moment until I see what choices of  Ill want to use.) What we want to
establish is that an + bn a + b, which means we must show that for any  > 0, there is
some N such that |(an + bn ) (a + b)| <  for all n > N . Now, if I want to do this given
the information I know, I have to relate |(an + bn ) (a + b)| to the two quantities
|an a| and |bn b|. From the clue given, we have
|(an + bn ) (a + b)| = |(an a) + (bn b)| |an a| + |bn b|.
How can I ensure that the right-hand side will be less than ? Well, I can do this if I can
guarantee that |an a| and |bn b| are both < /2. So take 1 = 2 = /2. There are
N1 , N2 such that |an a| < /2 for n > N1 and |bn b| < /2 for n > N2 . How big
should N be so that both of these hold for all n > N ? Well, take N = max(N1 , N2 ), the
larger of N1 and N2 . Then well have, for all n > N ,


|(an + bn ) (a + b)| = |(an a) + (bn b)| |an a| + |bn b| < + = ,
2 2
which is exactly what we need.
That was a bit long-winded, just to show you the thought process. A perfectly
acceptable way to write a proof would be as follows:
Let  > 0. Since an a, there is N1 such that |an a| < /2 for all n > N1 and, since
bn b, there is N2 so that |bn b| < /2 for all n > N2 . Then, for all
n > N = max(N1 , N2 ), we have


|(an + bn ) (a + b)| = |(an a) + (bn b)| |an a| + |bn b| < + = .
2 2
It follows that an + bn a + b.

159

10

10. Sequences and limits

Feedback to activity 10.5


Suppose first that xn+1 L as n . Then, given  > 0 there is N such that n > N
implies |xn+1 L| < . This means that if n > N + 1 then |xn L| < , and this shows
that xn L. For the other part, if xn L then for  > 0 there is N so that n > N
implies |xn L| < . Then its clearly true (because n + 1 > n think about it!) that if
n > N , we also have |xn+1 L| < .

Sketch answers to or comments on sample questions


Answer to question 10.1
Taking  = L/2 > 0 in the definition of a limit, there is N so that for n N ,
|xn L| < L/2. In particular, for n N , xn L > L/2 and so xn > L/2 > 0.
Answer to question 10.2
We have
|an 4/3| =
=
=



4n2 + 9

4


3n2 + 7n + 11 3


12n2 + 27 12n2 28n 44




3(3n2 + 7n + 11)


28n + 17


9n2 + 21n + 33


28n + 17n




9n2
45n
9n2
5
.
n

This is <  for n > N () = 5/.

10

If you do not make the approximations indicated (or something like them) then this
becomes much more difficult. Note the way the inequalities work: to be sure that
|an 4/3| < , we bound |an 4/3| from above by some simpler sequence bn , and we
then solve bn < . And, we bound |an 4/3| by above by upper bounding its numerator
and lower bounding its denominator. Note also that we do this in such a way that the
resulting bn has all terms in its numerator of the same degree, so that solving bn < 
becomes easy. Without approximations, you end up having to solve the inequality
28n + 17 < (9n2 + 21n + 33)
for n, and this is a messy quadratic.
Heres another (slightly different) example. Lets prove directly that
an =

4n2 n + 3
2.
2n2 n + 1

The difference between an and 2 is


2
2

4n n + 3
4n n + 3 4n2 + 2n 2
n+1
=
|an 2| = 2
2 =
.

2
2
2n n + 1
2n n + 1
2n n + 1

160

10.8. Sketch answers to or comments on sample questions

Now, we want to find N () such that this will be <  for all n > N (). Now, since the
inequality (n + 1)/(2n2 n + 1) <  is tricky, we approximate, by bounding above by
some simple bn . Now, wed like to bound the numerator above, and get the terms on the
numerator to be the same degree as each other. Well, n + 1 n + n = 2n. What about
the denominator? We can throw away the +1, but we cannot just throw away the n,
because we need to lower bound the denominator. Well, n n2 , so
2n2 n + 1 2n2 n2 + 1 n2 , so we have (for n 2)
|an 2|

2
2n
=
2
n
n

and this is <  provided n > N () = 2/.


Fortunately, there are easier ways of proving limits, but this is all good, clean healthy
fun and good practice in understanding the definition.
Answer to question 10.3
(a) implies (b). For, suppose that an L. Then for any  > 0 there is N such that
n > N implies |an L| < .
[Right, now for some scratching around: How do we prove that (bn ) converges? Well,
first, wed like some idea of what it converges to. But this is easy: if the an can be made
as close as we like to L then, the fact that bn = |an | suggests that bn should converge, if
it does converge, to |L|. So were going to try to show that |bn |L|| can be made as
small as we like. Back to the formalities...]
We have
|bn |L|| = ||an | |L||.
For any numbers x and y, we have
||x| |y|| |x y|,
so
|bn |L|| |an L|.
For n > N , this is < . So we have shown that for all  > 0, there is N such that
|bn |L|| <  for n > N . That is, bn |L|.
(b) does not imply (a). To see this, we just need a counterexample. Consider the
sequence (an ) given by an = (1)n . Then (bn ) is the sequence with every term equal to
1 and it is therefore, trivially, convergent. But (an ) does not converge.
Answer to question 10.4
Suppose the sequence is (xn ) and let = inf{xn | n N} (which exists, by
boundedness). Let  > 0. Since +  cannot be a lower bound on the set of xn , theres
N so that xN < + . But the sequence is decreasing, so for n > N ,
xn xN < +  and hence |xn | < . This shows the sequence converges to .
Answer to question 10.5
Proof by contradiction. We show x cant be negative. Suppose it was. Then x/2 > 0,
so taking  = x/2 in the definition of limit, there is N so that for n > N ,
3x
x
x
< xn < .
|xn x| < , i.e.,
2
2
2

161

10

10. Sequences and limits

But xn < x/2 says xn < 0 for n > N a contradiction. Draw a picture!
Answer to question 10.6
Let K > 0. Taking  = 1/K, there is N so that n > N implies |xn 0| < 1/K. Since
xn > 0 this means 0 < xn < 1/K for n > N . So, for such n, yn = 1/xn > K. So
yn . For the second part, let  > 0 and let K = 1/. Then, because 1/xn , N
such that for n > N , 1/xn > K. But then, 0 < xn < 1/K = , so |xn | < . So xn 0.
Answer to question 10.7
We use the algebra of limits.
4n 5
4n 5
1 5/4n
1
=
=
= 1.

22n 7
4n 7
1 7/4n
1
5 1/9n
5(32n ) 1
5
=
.
n
n
4(9 ) + 7
4 + 7/9
4
For the next one, you might be tempted to write the n term as
xn =

1
2
n
+ 2 + + 2,
2
n
n
n

1
n

;
n
n2
n2

and observe that

that is,
1
xn 1.
n
This is true, but is not of much use, because the sequence on the left tends to 0, which
is not the same as the limit on the right (which is 1). So although this bounds xn , it
does not sandwich it between two sequences that converge to the same limit. So the
Sandwich theorem is no use here. Instead:
1 + 2 + + n
n(n + 1)/2
n2 /2 + n/2
1 + 1/(2n)
1
=
=
=
= 1.
2
2
2
n
n
n
1
1

10

(For the numerator, we can use the formula for the sum of an arithmetic progression.)
Finally,
n2 1
0
1/n 1/n3
=

= 0.
n3 + 1
1 + 1/n3
1
Be careful how you argue here. Dont say something like
n2 1
n2
1
'
= 0,
3
3
n +1
n
n
where ' means approximately equal to. This isnt precise enough. Also, dont write
n2 1
n2
1

= 0.
3
3
n +1
n
n
This has no meaning: sequences tend to numbers, not other sequences. (I know what
you mean when you write these things, but they are too vague.)

162

10.8. Sketch answers to or comments on sample questions

Answer to question 10.8


We have, for n 2,

x2n1 + ( k)2
x2n1 + k
2xn1 k
=

= k.
xn =
2xn1
2xn1
2xn1
Alternatively, we can see that, since xn+1 = (x2n + K)/(2xn ), 2xn xn+1 = x2n + K. But we
also know that 2xn xn+1 x2n + x2n+1 . So,
x2n + x2n+1 x2n + K,
so x2n+1 K for alln. Since each xn is clearly positive, this implies xn+1
n, and hence xn K for all n 2.

K for all

We now show the sequence is decreasing. Again, there are various ways in which this
can be done. For n 1,
x2n + k
k x2n
xn =
0,
2xn
2xn

where we have used the fact that xn k. So xn+1 xn . Alternatively, we can note
that
x2 + x2n
x2 + K
n
= xn .
xn+1 = n
2xn
2xn

So the sequence is bounded below (by k) and it is decreasing. These two facts mean
that it converges. Suppose xn L. Then xn+1 L too. But
xn+1 xn =

xn+1 =

x2n + k
L2 + k

,
2xn
2L

and so L = (2L2 +
k)/2L, or L2 = k. Since xn is positive for all n, L 0, so L =
and we have xn k.

k,

Answer to question 10.9

10

First,
2n3 + 1
2 + 1/n3
2
=
.
3
2
3
3n + n + 2
3 + 1/n + 2/n
3
For the second, although it looks like its worth trying, the Sandwich Theorem doesnt
help much. Instead, we use
n + (n + 1) + + 2n =

3n2 3n
(n + 1)(n + 2n)
=
+ ,
2
2
2

so

3n2 /2 + 3n/2
3/2 + 3/(2n)
3
xn =
=
.
2
n
1
2
(Straightforward Sandwich theorem application only tells us that xn lies between 1 and
2.)

163

10. Sequences and limits

Answer to question 10.10


When n = 1 both sides are 1 + x, so it holds. Suppose its true for n = k. Then
(1 + x)k 1 + kx and
(1 + x)k+1 = (1 + x)(1 + x)k (1 + x)(1 + kx) = 1 + (k + 1)x + kx2 1 + (k + 1)x,

and so its true for n = k + 1. With x = 1/ n we get



n
n
n
1+
1+ ,
n
n
which is equivalent to
(1 +

n) (1 +

n)1/n ,

from which we get (on squaring),

(1 + n)2 (1 + n)2/n > ( n)2/n = n1/n .


Obviously, since n 1, n1/n 1. The chain of inequalities is now established. The
leftmost term is the constant 1 and the rightmost tends to (1 + 0)2 = 1. By
Sandwiching, n1/n 1 too.
Answer to question 10.11
We have, just using some crude bounding of the quantity inside the parentheses,
(4n )1/n xn = (1n + 2n + 3n + 4n )1/n (4n + 4n + 4n + 4n )1/n .
That is,
4 xn 4(41/n ).
41/n 1, so both sides tend to 4. Hence xn 4 as n .
Answer to question 10.12
Let xn denote the quantity we need to find the limit of. Then
n

1
1
xn n
.
2
+n
n +1

n2

Now,

10
and

n
1
1
=p
= 1,
1
n2 + n
1 + 1/n

n
1
1
=p
= 1.
1
n2 + 1
1 + 1/n2

By Sandwiching, xn 1.
Answer to question 10.13
For n > N , an+1 < an < an , so the sequence is decreasing after N . Since an > 0, the
sequence is bounded below (by 0). So it converges, to some limit L 0. (At this stage,
this is all we know about L. We have to prove that L = 0.) If the limit is L then we
must have also an+1 L. If L 6= 0, then an+1 /an L/L = 1, which is contrary to
an+1 /an < < 1. So L = 0. We may also argue, alternatively, that since an+1 < an for
n > N , then L L, from which it again must follow that L = 0 since < 1.

164

10.8. Sketch answers to or comments on sample questions

Answer to question 10.14


First,
 2
 n
 2
1
1
1
3
=n
+n
0 + 0 = 0.
(n + n)
2
2
2
3

The second example is the reciprocal of the first, so it tends to . For the third, we have
 n
 n
 n
2n3 + 1 3
3
3n3 3
2
0<
<
= 3n
0,
n+1
4
n
4
4
so the term tends to 0. Next,
22n + n
22 n
(4/3)n
>
=
,
n3 3n + 1
2n3 3n
2n3
so the term tends to . The last one is different. The clever thing is to realise that

( n + 1 n)( n + 1 + n) = (n + 1) n = 1,
so
0<

n+1

n=

1
1
< 0,
2 n
n+1+ n

so the expression tends to 0.


Answer to question 10.15
We use the fact that

( an L)( an + L) = an L.

First, we deal with the case L = 0. If an 0 then, given  > 0, there is N so that for

n > N , |an 0| < 2 . So, for n > N , 0 an < , and so an 0. Suppose L 6= 0.


Then (taking  = L/2), there is N1 so that for n > N1 , an > L/2. Further, given , since
an L, there is N2 so that |an L| <  for n > N2 . So, for n > N = max(N1 , N2 ),

|an L|

<p
.
| an L|
an + L
L/2 + L

The denominator is just a constant, so this proves an L.

10

Here is a different proof, this time based on the fact that

x+y x+ y
for x and y non-negative (to see this, square both sides).
Choose  > 0, and take N so that, for n > N , |an L| < 2 , or in other words
L 2 < an < L + 2 . Then we have

L  L 2 < an < L + 2 L + ,

which implies that | an L| < .


Again, this proof doesnt quite work if L = 0 (why not?), and we need to treat that case
separately, as above.
Make sure you see where each of the inequalities in the string above comes from. Of

165

10. Sequences and limits

course, when constructing the proof, you work from both ends, so the last thing to fill
in is that you need to start with 2 .
Do you see why this result doesnt follow from the Algebra of Limits theorem? That
tells us that if an L then for any fixed positive integer k, akn Lk , but it tells us
nothing about non-integer powers such as 1/2.

10

166

Chapter 11
Limits of functions and continuity

Essential reading
Bryant, Victor. Yet Another Introduction to Analysis. Chapter 3, the sections
entitled Limits of Functions and Continuous Improvements.

(Or, similar reading from other analysis books. Use the index to find material on
functions and continuity.)

11.1

Introduction

In the previous chapter we studied limits of sequences. In this chapter, we look at limits
of functions. We also look at a key property that functions might have, namely
continuity.

11.2

Limit of a function

11.2.1

Definition of limit

Definition 11.1 Let f : R R be a function. We say that L is the limit of f (x) as x


approaches a, denoted by limxa f (x) = L (or f (x) L as x a) if for each  > 0,
there exists > 0 such that
0 < |x a| < = |f (x) L| < .
The definition states that if someone gives us any arbitrarily small , then there is some
neighbourhood of a, (a , a + ), such that any x in this neighbourhood - other than a
itself - will have f (x) in the -neighbourhood (L , L + ) of L. We start by defining
what we mean by the limit of a function f (x) as the argument x approaches some value
a R. Note that we are not concerned in this definition about the value of the function
at the point a (indeed we dont even care if f is defined at a): the notion of limit applies
only to the behaviour of the function f in a neighbourhood near to a, and not at a itself.
Figure 11.1 indicates what f (x) L as x a means graphically.
We can also make the following definition.
Definition 11.2 Let f : R R be a function. We say that f (x) tends to infinity as
x a (denoted by f (x) as x a) if for each K, there exists > 0 such that
0 < |x a| < = f (x) > K.

167

11

11. Limits of functions and continuity

yu
y
fx
yl

cl

cr

Figure 11.1: A function with limit L as x tends to a. Given any  > 0 (which determines a

strip around the line y = L of width 2), there exists a > 0 (which determines an interval
around the point a of width 2 such that whenever x lies in this interval (but x 6= a), (so that
x satisfies 0 < |x a| < ), then f (x) satisfies L  < f (x) < L + , that is, |f (x) L| < .

Activity 11.1 Write down the definition of f (x) as x a.

11.2.2

Examples

Example 11.1 Suppose that f (x) = 3x 1. We can then show directly from the
definition of a limit that limx1 f (x) = 2. To prove this, we must argue that for any
 we can bound the value |f (x) 2| = |(3x 1) 2| <  in some neighbourhood of 1.
But we easily see that
|f (x) 2| = 3|x 1|.
In other words, the distance of f (x) from 2 is three times the distance of x from 1.
Thus if we choose x to be within distance /3 from 1, then f (x) is within distance 
of 2.
A formal proof would go as follows. Let  > 0. Then, let = /3. For any x s.t.
0 < |x 1| < we have:
|f (x) 2| = 3|x 1| < 3


< .
3

Therefore f (x) 2 as x 1.

11

Example 11.2 As a second example, consider f (x) = x2 + x. We show that


limx2 f (x) = 6. To see this, suppose that  > 0 is given. If x is such that
0 < |x 2| < , then
|f (x) 6| = |x2 + x 6| = |x + 3||x 2| < |x + 3|.
Now, if |x 2| < then |x + 3| 5 + |x 2| < 5 + , by the triangle inequality. So
to make |f (x) 6| less than , it suffices to have (5 + ) . There is no need to
solve a quadratic equation in . If we just put = min{1, /6}, the smaller of 1 and
/6, then we have 5 + 6 and /6, so
(5 + ) 6(/6) = ,
and were done.

168

11.2. Limit of a function

Another way of arguing this is as follows. We have


|f (x) 6| = |x + 3||x 2|.
If we assume that |x 2| < 1 then 1 < x < 3 and so 4 < x + 3 < 6 and so |x + 3| < 6.
It follows that |f (x) 6| < 6|x 2|. So to have |f (x) 6| < , it will suffice to have
both |x 2| < 1 and |x 2| < /6, which will hold if |x 2| < = min{1, /5}. (By
the way, there is nothing special about the choice of 1 in assuming |x 2| < 1 here:
we could equally well have assumed, for example, that |x 2| < 2.)
Activity 11.2 Prove that f (x) = x2 2 2 as x 2.

11.2.3

Algebra of limits

Let f, g : R R be two functions and c be any real number, then we derive other
functions by applying an algebra on the set of functions. For example, a new function
(f + g) is obtained by defining for each x, (f + g)(x) = f (x) + g(x). We say that (f + g)
is derived point-wise since the value of (f + g) at x is defined by the normal arithmetic
sum of the two real numbers f (x) and g(x). Similarly, we may define the functions
|f |, (cf ), (f g), (f + g), (f g) and (f /g), provided g(x) 6= 0.
Theorem 11.1 Let f, g : R R be two functions and c be any real number. Suppose
that limxa f (x) = L and limxa g(x) = M . Then
1. limxa (cf )(x) = cL
2. limxa (|f |)(x) = |L|
3. limxa (f + g)(x) = L + M
4. limxa (f g)(x) = L M
5. limxa f (x)g(x) = LM
6. limxa (f /g)(x) = L/M provided g(x) 6= 0 for each x in some neighbourhood of a.
Note that in this theorem, I have used f (x)g(x) rather than the more usual f g in order
to avoid confusion with the composition of the functions.

11
11.2.4

More on limits

Sometimes, we may have a situation where the limit from one side is different from that
from the other. We adapt the definition as follows.
Definition 11.3 Let f : R R be a function. We say that L is the limit of f (x) as x
approaches a from the left (or from below), denoted by limxa f (x) = L if for each
 > 0, there exists > 0 such that
a < x < a |f (x) L| < .

169

11. Limits of functions and continuity

A similar definition applies to limits from the right (or from above), denoted
limxa+ f (x) = L. For example, suppose the graph of a function looked like the
following. Then the left-hand and right-hand limits at a are L, M respectively.
6

M
L

So far weve discussed limits of a function as x approaches some value a. But we can
also make the following definition (similar to the definition of the limit of a sequence).
Definition 11.4 Let f : R R be a function. We say that L is the limit of f (x) as x
approaches , denoted by limx f (x) = L if for each  > 0, there exists M > 0 such
that
x M |f (x) L| < .

Example 11.3 Consider f (x) =

x2

1
. Then f (x) 0 as x . For,
+1

|f (x)|
if x > M =

11

11.3

1
<
x2

p
1/.

Continuity

Loosely speaking, a function is continuous if its graph can be drawn without lifting pen
from paper. For a formal definition of this notion, we first define continuity at a point.
Definition 11.5 A function is continuous at the point a if f (a) is defined and is equal
to limxa f (x).
Definition 11.6 A function is continuous if it is continuous at each point a.
Definition 11.7 A function is continuous on the interval [a, b] if it is continuous at
each point in (a, b) and (i) f (a) = limxa+ f (x) and (ii) f (b) = limxb f (x).

170

11.3. Continuity

As an example of a discontinuous function (that is, one that is not continuous), consider
the sign function

1,
x > 0,

f (x) = 0,
x = 0,

1, x < 0.
This function f (x) makes a jump when x = 0. This represents a discontinuity since
when we approach zero from the left (that is, with negative values for x), the function
always has value 1, no matter how close we are to zero, whereas at zero it has a
different value. That is, the function values when approaching zero tend to something
other than the value of the function. (The same happens when zero is approached from
the right, but already a strange behaviour like this when coming from one direction is
enough to make the function discontinuous.)
Figure 11.2 gives a diagrammatic representation of continuity.

yu
y
fx
yl

cl

cr

Figure 11.2: The definition of the continuity of a function at point a. If the function is
continuous at a, then given any  > 0 (which determines a strip around the line y = f (a)
of width 2), there exists a > 0 (which determines an interval around a of width 2)
such that whenever x lies in this interval (so that x satisfies a < x < a + , that is,
|x a| < ), f (x) satisfies f (a)  < f (x) < f (a) + , that is, |f (x) f (a)| < .

It follows from the results on the algebra of limits that:


Theorem 11.2 Let f, g : R R be functions which are continuous at a R and c be
any real number. Then |f |, (cf ), (f g), (f + g), (f (x)g(x)) are all continuous at a, and
(f /g) is continuous provided g(x) 6= 0 for any x in some neighbourhood of a.
P
As a corollary we see that any polynomial p(x) = ki=0 ai xi is continuous. This can be
proved by induction as follows. Clearly any constant function is continuous by the
above results. Moreover, the function f (x) = x is continuous. Thus by the product rule,
so is f (x) = x2 , and by induction f (x) = xk is continuous for any finite k. Hence so is
any function of the form f (x) = ai xi . Finally, by repeated application of the summation
rule, the polynomial p(x) is deduced to be continuous.
Recall that if f, g are functions, then the composition function f g is defined by
(f g)(x) = f (g(x)) for each x.
Theorem 11.3 If g is a function which is continuous at a, and f is a function which is
continuous at g(a). Then (f g) is continuous at a.

171

11

11. Limits of functions and continuity

Proof
Lets be clear about what we want to prove. To show that the composite function is
continuous at a, we must show that for any  > 0, there is > 0 such that |x a| <
implies |f (g(x)) f (g(a))| < . By the continuity of f at g(a), we know that for any
given  > 0 there is some 1 > 0 such that if |y g(a)| < 1 then |f (y) f (g(a))| < . So
if we can find > 0 such that |x a| < implies that (taking y = g(x))
|g(x) g(a)| < 1 , then well have that |x a| < implies |f (g(x)) f (g(a))| < . But
we can certainly find such a , simply from the definition of g being continuous at a.
(Note the use of the notation 1 here, so that we do not confuse intermediate -values
with the we need to finally produce.)

11.4

Continuity and sequences

We now give an alternative definition of continuity which ties in the concept of limits
for sequences.
Theorem 11.4 A function f is continuous at a if and only if for each sequence (xn )
such that limn xn = a we have limn f (xn ) = f (a).
Proof
Let (?) be the statement that for any sequence (xn ) such that limn xn = a,
limn f (xn ) = f (a).
Suppose first that f is continuous at a. We prove that this implies (?). Let (xn ) be a
sequence of reals converging to a. We want to show that f (xn ) f (a) as n , that
is,
 > 0 N n N |f (xn ) f (a)| < .
(??)
To prove this, let  > 0. Choose, according to the definition of continuity, a > 0 so
that for all x, whenever |x a| < , then |f (x) f (a)| < . Since xn a as n ,
there is an N so that n N implies |xn a| < , which in turn implies
|f (xn ) f (a)| < . This shows (??) as desired.
Conversely, assume that property (?) holds. In order to show continuity, we assume, to
the contrary, that the function is discontinuous at a. This means that there is an  > 0
so that for all > 0 there is an x with |x a| < but |f (x) f (a)| . In particular,
for every natural number n, letting = 1/n, there is a real number x, call it xn , with
|xn a| < 1/n but |f (xn ) f (a)| . But then clearly xn a as n , but we do
not have f (xn ) f (a) as n , a contradiction to (?).

11

We find this a very useful alternative for the sake of manipulation of limits since it
states that for any sequence xn a, the limit of the f (xn )s is simply f applied to the
limit of the xn s:


lim f (xn ) = f lim xn .
n

11.4.1

Continuous functions on closed intervals

Definition 11.8 For a subset X of the domain of a function f , we say that f is


bounded on X if there exists M such that |f (x)| M for each x M .

172

11.5. The Intermediate Value Theorem

Definition 11.9 We define the supremum (or maximum) of f on X as


sup{f (x) | x X} (or max{f (x) | x X} if it exists).
The following result is very important and useful. It is often known as the Extreme
Value Theorem.
Theorem 11.5 Let f be continuous on [a, b]. Then f is bounded on [a, b] and it
achieves its maximum; that is, the supremum is equal to the maximum.
Proof
Suppose first that f is unbounded above. For each n N, let xn be a point in [a, b] such
that f (xn ) > n. The sequence (xn ) is bounded, so has a convergent subsequence (xkn ),
tending to some limit c (by Theorem 10.11). Necessarily c [a, b]. Since f is continuous
at c, f (xkn ) f (c) as n . But this contradicts the construction of the sequence
(xn ), since f (xkn ) > n . So f is bounded above. Let M = sup{f (x) | x [a, b]}.
For each n N, let xn be a point in [a, b] such that f (xn ) > M n1 . Again take a
convergent subsequence (xkn ) of (xn ), tending to some limit c [a, b]. Arguing as before,
we see f (c) = M .
The same result holds with max/sup replaced by min/inf.

11.5

The Intermediate Value Theorem

In this section we prove one of the most fundamental (and obvious!) theorems in real
analysis. One useful property of continuous functions f lies in the fact that they have
solutions x to equations of the form f (x) = C for any given C where such a solution
might reasonably be expected, namely if f takes values below and above C. In other
words, a continuous function cannot hop over intermediate values as it moves from
one value to another. This central property of continuous functions is known as the
intermediate value theorem.
Theorem 11.6 (The Intermediate Value Theorem)
Let f be a continuous function on [a, b] and let K be such that f (a) < K < f (b).
Then for some c (a, b), f (c) = K.

11

Figure 11.3 helps us understand this theorem.


We prove a special case of the result, from which the full Theorem follows.
Theorem 11.7 Let f be a continuous function on [a, b] such that f (a) < 0 and
f (b) > 0. Then for some c (a, b), f (c) = 0.
Proof
We construct a sequence of intervals [an , bn ] such that
1. f (an ) < 0, f (bn ) > 0 for each n
2. [an+1 , bn+1 ] [an , bn ] for each n.

173

11. Limits of functions and continuity

B
y
A

c b

Figure 11.3: The Intermediate value theorem: the graph of f (x), passing from y-coordinate

f (a) to y-coordinate f (b) as x passes from a to b, must pass through all y-values in between
f (a) and f (b).
We start by letting [a1 , b1 ] = [a, b]. Then for each n 1, we define [an+1 , bn+1 ] as follows.
Let cn = (an + bn )/2, be the midpoint of the previous interval. If f (cn ) = 0, then the
theorem is proved and so we need not continue constructing intervals!
Otherwise, if f (cn ) < 0, we define an+1 = cn and bn+1 = bn . And if f (cn ) > 0, we define
bn+1 = cn and an+1 = an . Note that the condition 1. is satisfied by choosing our
intervals in this manner.
Moreover, note that the (n + 1)st interval is half the size of the nth interval and so
bn+1 an+1 (b1 a1 )/2n . It follows that
lim (bn an ) = 0.

(11.1)

Finally, note that (an ) is increasing and bounded above (by b1 ) and so it has a limit;
similarly (bn ) is decreasing and bounded below and so has a limit. Thus by (11.1) (and
algebra of limits) these limits are equal to, say, c. Thus by continuity (using
Theorem 11.4),
f (c) = lim f (bn ) 0,
n

where the last inequality follows from the fact that each f (bn ) 0 (in fact > 0).
Similarly,
f (c) = lim f (an ) 0.
n

Thus f (c) must be equal to zero, and the proof is complete.


Activity 11.3 Show that Theorem 11.7 implies Theorem 11.6.
[Hint: consider g defined by g(x) = f (x) K.]

11
Learning outcomes
At the end of this chapter and the Essential reading and activities, you should be able
to:
explain what is meant formally by the limit of a function at a point, or as x tends
to or , and by one-sided limits
determine limits and prove results using the formal definitions of the limit of a
function

174

11.5. Sample examination questions

calculate limits of functions using the algebra of limits


explain what is meant by continuity
prove functions are continuous or discontinuous using the formal definition and the
algebra of limits
explain and be able to use the interaction between continuous functions and
convergent sequences
use the fact that a continuous function has a maximum and a minimum value on a
closed bounded interval
state and use the Intermediate Value Theorem.

Sample examination questions


Question 11.1
Prove, from the formal definition of a limit, that f defined on R \ {2} by
f (x) =

5
(x 2)2

tends to infinity as x 2.
Question 11.2
Evaluate the following limit:
x3 + 5x + 7
.
x1 x4 + 6x2 + 8
lim

Question 11.3
Let bxc denote the largest integer n such that n x. Determine
lim (x bxc) and

x1+

lim (x bxc).

x1

Question 11.4
Let f : R R be defined by

(x 1)2 , x < 1,

f (x) = 1,
x = 1,

3x + 2,
x > 1.

11

Use the formal definitions of limits to prove that f (x) 0 as x 1 and f (x) 5 as
x 1+. Is the function
continuous
continuous on the right
continuous on the left
at the point 1?

175

11. Limits of functions and continuity

Question 11.5
The function f is defined on R by
f (x) =

0, if x is rational;
1

if x is irrational.

Prove that f is discontinuous at every point of R.


[Hint: use the fact that there are irrational numbers arbitrarily close to any rational
number and that there are rational numbers arbitrarily close to any irrational number.]
Question 11.6
Suppose the real function f is continuous at c and f (c) > 0. Prove, directly from the
definition of continuity, that there is > 0 such that f (x) is positive for x in the
interval (c , c + ).
Question 11.7
Suppose the real function f is continuous, positive and unbounded on R and that
inf{f (x) | x R} = 0. Use the Intermediate Value Theorem to prove that the range of f
is (0, ), the set of all positive real numbers. [It might be obvious, but give a watertight
proof. Explicitly, prove that for any y > 0 there is some c R such that y = f (c).]
Question 11.8
Let T (), for 0 2, be the surface temperature at the point at degrees longitude
on the equator. (Note that T (0) = T (2).) Assuming T is a continuous function of ,
prove that at any given time, there are two points on the equator which have the same
temperature and are diametrically opposite. [Regrettably, field trips are not available.
Instead, consider the function f () = T () T ( + ), and use the Intermediate Value
Theorem.]
Question 11.9
Suppose that the real function f is continuous on the closed interval [a, b] and that f
maps [a, b] into [a, b]. By considering the function h(x) = f (x) x, show that there is
c [a, b] with f (c) = c.

11

Suppose the real function g is continuous on R and that g maps [a, b] into [d, e] and
maps [d, e] into [a, b], where
a < b, d < e.
By considering the function
k(x) = g(g(x)),
prove that there are p, q R such that
g(p) = q, g(q) = p.
Hence show that there is c R such that g(c) = c.

176

11.5. Comments on selected activities

Comments on selected activities


Feedback to activity 11.1
We say that f (x) tends to minus infinity as x a (denoted by f (x) as x a)
if for each K, there exists > 0 such that
0 < |x a| < = f (x) < K.
Feedback to activity 11.2
We have
|f (x) 2| = |(x2 2) 2| = |x2 4| = |(x + 2)(x 2)| = |x + 2||x 2|.
Suppose that |x 2| < 1. Then 1 < x < 3 and so 3 < x + 2 < 5, so that |x + 2| < 5. It
follows that, then, |f (x) 2| < 5|x 2|. If, also, |x 2| < /5, then we will have
|f (x) 2| < . So if both |x 2| < 1 and |x 2| < /5, we have |f (x) 2| < .
Therefore, if |x 2| < = min{1, /5}, we have |f (x) 2| < .
Feedback to activity 11.3
The function g(x) = f (x) K will be such that 0 is between g(a) and g(b) if K is
between f (a) and f (b). Also, continuity of f on the interval [a, b] implies continuity of g
there too. So the version of the theorem weve proved shows there is some c (a, b)
with g(c) = 0. But g(c) = f (c) K, so this means f (c) = K, as required.

Sketch answers to or comments on sample questions


Answer to question 11.1
Let K > 0. We want to find so that if 0 < |x 2| < then f (x) > K. If |x 2| <
then
5
5
> 2,
f (x) =
2
(x 2)

p
so it suffices to have 5/ 2 K, so we may take = 5/K.
Answer to question 11.2
13 + 5(1) + 7
13
x3 + 5x + 7
=
=
.
x1 x4 + 6x2 + 8
14 + 6(1)2 + 8
15
lim

11

Answer to question 11.3


If 1 < x < 2 then x bxc = x 1, so
lim (x bxc) = lim (x 1) = 0.

x1+

x1+

If 0 < x < 1 then x bxc = x 0 = x, so


lim (x bxc) = lim (x) = 1.

x1

x1

177

11. Limits of functions and continuity

Answer to question 11.4

2
For x < 1,
|f
(x)

0|
=
(x

1)
.
This
is
<

if
0
<
|x

1|
<

where

=
. So, if

x (1 , 1) then |f (x) 0| < . Thus, the left-limit at 1 is 0. For x > 1,


|f (x) 5| = |3x + 2 5| = 3|x 1|. This is <  if |x 1| < = /3. So if x (1, 1 + /3)
then |f (x) 5| < , and hence the right-limit is 5. The function is not continuous at 1,
nor is it continuous on the left or on the right. (The function value at 1 is not equal to
either the left or the right limit.)
Answer to question 11.5
We have to show that for each x R, the function is not continuous at x. There are two
cases: x rational and x irrational. Suppose first that x is rational. Then, for any > 0
there are irrational numbers y such that |y x| < , and we would have
|f (y) f (x)| = |1 0| = 1. If f were continuous at the point x then (taking  = 1) there
should be such that |y x| < implies |f (y) f (x)| < 1. But our observations have
just shown this is impossible. The case of x irrational is analogous.
Answer to question 11.6
Taking  = f (c)/2 > 0 in the formal definition of continuity at c, there is some such
that if x (c , c + ) then |f (x) f (c)| < f (c)/2. Thus, for x (c , c + ),
f (x) > f (c) f (c)/2 = f (c)/2 > 0, so f is positive on that interval.
Answer to question 11.7
Let y (0, ). We show that there is some c R such that f (c) = y. This shows that
the range is the whole of (0, ). (The fact that it is no larger follows from the given fact
that f is positive.) Now, f (R) = inf{f (x) | x R} = 0, so, since y > 0, there must be
some y1 f (R) with y1 < y. This means there is some x1 R such that y1 = f (x1 ) < y.
Similarly, because f is unbounded, which means f (R) is unbounded, there must be
some y2 f (R) with y2 > y and there will be some x2 R such that y2 = f (x2 ) > y.
Then y lies between f (x1 ) and f (x2 ) and, since f is continuous, the Intermediate Value
Theorem shows that there is some c between x1 and x2 with f (c) = y.
Answer to question 11.8
Note that
f () = T () T (2) = T () T (0) = (T (0) T ()) = f (0),

11

so either f (0) = f () = 0, in which case the diamaterically opposite points at longitude


0 and have the same temperature, or f (0) and f () lie either side of 0, in which case
the Intermediate Value Theorem establishes that there is c such that f (c) = 0 that is,
T (c) = T (c + ).
Answer to question 11.9
The function h satisfies h(a) = f (a) a 0, since f (a) [a, b] implies f (a) a.
Similarly, h(b) 0. So a < b and h(a) 0 h(b). Since f is continuous, as is the
function x 7 x, it follows that h is continuous. By the Intermediate Value Theorem,
c [a, b] with h(c) = 0; that is, f (c) = c. Consider g now. The function k given by
k(x) = g(g(x)) maps [a, b] into [a, b], so, by the first part, there is some c [a, b] such

178

11.5. Sketch answers to or comments on sample questions

that k(c) = c; that is, g(g(c)) = c. Let p = c and q = g(c). Then g(p) = q (obviously),
and g(q) = g(g(c)) = c = p. For the last part, consider the function h(x) = g(x) x. If
p = q then g(c) = c and we are done. Otherwise,
h(q) = g(q) q = p g(p) = h(p),
so 0 lies between h(p) and h(q). By the Intermediate Value Theorem (and the
continuity of h) there is some c between p and q such that h(c) = 0; that is, g(c) = c.

179

11. Limits of functions and continuity

180

Part 3
Algebra

181

Chapter 12
Groups

Essential reading
Biggs, N.L. Discrete Mathematics. Chapter 20, Sections 20.120.3.

12.1

Introduction

In this part of the subject, we look at the theory of groups. A group is a set of objects
together with an associated way of operating on them, such that certain properties
hold. This all sounds vague, but will be made precise shortly.

12.2

Definition of a group

12.2.1

Binary operations

A binary operation ? on a set G is, formally, a function defined on the set of ordered
pairs G G. Instead of using a clumsy functional notation, ?(x, y), we write x ? y. So,
what a binary operation does is it takes two elements of G and returns a single object
(which, in our approach, might or might not belong to G).
Example 12.1 Multiplication is a binary operation on the set R of real numbers.
In primary school we wrote to denote this, so that the multiplication of x and y is
denoted x y. Now, we usually write just xy to denote the multiplication.
Example 12.2 Addition is a binary operation on the set R of real numbers. We
use the symbol +. So, x + y is the addition (or sum) of x and y.
Example 12.3 Let X be any set and let G be the set of all functions from X to X.
Composition of functions is a binary operation on G. We can denote this operation
by . Thus, f g is the function given by
(f g)(x) = f (g(x)) for all x X. (Note: f g equals f g in the notation of
Chapter 4.)
Note that, in each of these examples, we used specific symbols, , +, rather than the
generic ?. For most of our general results we will use ?, but bear in mind that in specific
concrete examples, special symbols are sometimes used.

183

12

12. Groups

Some binary operations are nicer than others. What I mean is that there are desirable
properties that some binary operations have that others do not. (These properties are,
of course, linked to the set on which the operation is defined. We shall always clearly
specify what set the binary operation operates on.
The first important property that the operation ? might have is closure. We say that G
is closed under ? if for all x, y G, x ? y is an element of G. Perhaps to make this
clear, we could give an example of a case in which it fails. Suppose that G is the set of
odd numbers and that ? is addition. Then G is not closed under ? because, for example,
1 + 3 = 4 is even, not odd. (Here, the operation always fails to return an element of G;
but, in general, for the closure property not to hold, it is enough to fail in a single case.
This is because closure is the property x, y G, x ? y G, the negation of which is
x, y G, x ? y 6 G.)
Activity 12.1 Make sure you understand the statement in parentheses at the end
of the previous sentence. To say that the closure property holds is to assert the truth
of a universal statement that for all x, y G, x ? y G. For this to fail, it is enough
that there are some particular x, y G such that x ? y 6 G.
Activity 12.2 Prove that the set of even integers is closed under addition.
The operation is said to be associative if for all x, y, z G,
(x ? y) ? z = x ? (y ? z).
Another useful property is existence of an identity. We say that there is an identity
element e G (for the operation ?) if
e ? x = x ? e = x, x G.
We say that G possesses inverses for ? if for all x G there is some element b of G
such that x ? b = b ? x = e. We usually denote b by x1 .
Finally, we say that the operation ? is commutative on G if x ? y = y ? x for all x, y G.
To sum up, nice properties that a binary operation ? on a set G might or might not
have are:
x, y G, x ? y G [closure property]

12

x, y, z G, (x ? y) ? z = x ? (y ? z) [associativity property]
e G such that x G, e ? x = x ? e = x [identity property]
x G, x1 G such that x ? x1 = x1 ? x = e [inverse property]
x, y G, x ? y = y ? x [commutative property]

184

12.3. Examples

12.2.2

Groups

We say that G is a group under the binary operation ? (or, simply, (G, ?) is a group) if
? has the closure, associativity, identity and inverse properties on G. It does not need to
be the case that the commutative property holds, but if it does, we have a special type
of group known as a commutative group or (named after the mathematician Abel),
an Abelian group. Explicitly, therefore:
Definition 12.1 Let G be a set and ? a binary operation on G. Then (G, ?) is a group
if:
1. x, y G, x ? y G.
2. x, y, z G, (x ? y) ? z = x ? (y ? z).
3. e G such that x G, e ? x = x ? e = x.
4. x G, x1 G such that x ? x1 = x1 ? x = e.
(G, ?) is an Abelian (or commutative) group if, additionally, x ? y = y ? x for all x, y G.
The first four properties expressed in this definition are called the group axioms.
If the group G is finite (meaning that G is a finite set), we call the cardinality |G| of G
the order of G.

12.3

Examples

Example 12.4 (R \ {0}, ) is a group. The identity element is 1 and, for all
x R \ {0}, the inverse x1 is (as the notation suggests) 1/x, because
x 1/x = 1/x x = 1. (Clearly (R, ) is not a group, because there is no inverse for
0.)
Example 12.5 (R, +) is a group. Here, the identity element is 0, because, for all
x R, x + 0 = 0 + x = x. The inverse x1 of x is x, because
x + (x) = (x) + x = 0, the identity element. (Note that the generic notation x1
is slightly confusing when the operation is addition.)
Example 12.6 For m 2, (Zm , ) is a group, where Zm is the set of integers
modulo m and is addition modulo m. The identity element is 0 and, for instance,
in the case m = 6, the inverse of 2 Z6 is 4 because 2 + 4 = 4 + 2 0 (mod 6).
Example 12.7 Let p be a prime and let Zp be the non-zero integers modulo p.
Then (Zp , ) is a group, where is multiplication modulo p. How can we prove
this? Well, we have to check that each of the four group axioms holds. To show
closure, we need to prove that if x, y Zp then x y Zp . Well, certainly,
x y Zp , so what we need to show is that x y 6= 0; that is, xy 6 0 (mod p). So
we need to show that for any nonzero x, y Zp , the product x y is also nonzero in

185

12

12. Groups

Zp . That is, we need to prove that if p does not divide x and does not divide y then
p does not divide xy. But we know, because p is prime, that
p | xy p | x or p | y,
and this is exactly what we need because it shows that if x y were the zero element
of Zp , then one of x and y would also have to be. Associativity clearly holds, since it
holds for normal multiplication of integers. The identity is 1. We next need to see
why the inverse property holds. For x Zp , we must produce some x1 Zp such
that x x1 = x1 x = 1. Now, if p does not divide x, then the greatest common
divisor gcd(x, p) is 1. By the properties of gcds, this means there are integers m and
n such that mx + np = 1. Suppose that m b (mod p) where b Zp . Clearly, b 6= 0
because if p | m then p | (mx + np); that is, p | 1, which cannot be true. So b Zp .
Furthermore, b x = x b = 1, because mx 1 (mod p).

Example 12.8 (Mn,m (R), +) is a group, where Mn,m (R) is the set of n m
matrices with real entries, and + is matrix addition. The identity element is the
matrix with all entries equal to 0, and the inverse of a given matrix A = [aij ] is
A = [aij ].

Example 12.9 The set GL(n, R) of invertible n n real matrices is a group under
the operation of matrix multiplication. The identity is the identity matrix, and the
fact that the matrices are explicitly invertible means that inverses exist.

Example 12.10 For any n N, the set of all bijections from {1, 2, . . . , n} to itself
is a group under the operation of composition of functions. (This is the group of
permutations of {1, 2, . . . , n}.) The identity is the identity mapping x 7 x for
1 x n. Since the functions are bijections, each has an inverse function.

Example 12.11 Symmetry groups of geometrical shapes provide interesting


examples of groups. Suppose we take an equilateral triangle T and label its vertices
A, B, C. We will represent the triangle as
A
BC

12

A symmetry of T is a mapping f : R2 R2 such that the image f (T ) of T occupies


the same space as T . There are six symmetries of T . (Clearly this is all there can be,
given that the number of ways of permuting the three vertices is six.) One, the
identity symmetry, i, leaves T unchanged. Then there are two rotations: r, which
rotates T by 2/3, and s, which rotates T by 4/3. There are also three reflections,
x, y, z, which reflect T through the lines which pass through vertices A, B, C,
respectively, and bisect the opposite side. These symmetries may be described as

186

12.4. Group tables

follows:
i :
r :
s :
x :
y :
z :

A
BC
A
BC
A
BC
A
BC
A
BC
A
BC

7
7
7
7
7
7

A
BC
B
CA
C
AB
A
CB
C
BA
B
AC

It can be checked that this set G = {i, r, s, x, y, z} of symmetries of T is a group


under the operation of composition of functions.
Activity 12.3 Check that this last example is indeed a group. What is the inverse
of each element? [See Biggs, page 261.]

12.4

Group tables

A group (G, ?) can be completely described by its group table. This indicates, for all
x, y G, the elements x ? y (where x corresponds to the row and y to the column).
Example 12.12 Suppose the group is (Z5 , ). Then we have the group table

1
2
3
4

1
1
2
3
4

2
2
4
1
3

3
3
1
4
2

4
4
3
2
1

12

You can see that this table is symmetric: this is because the group is Abelian
(xy = yx for all x, y Z5 ).
Example 12.13 Here is the table of the group of symmetries of an equilateral
triangle (as described above):

187

12. Groups

?
i
r
s
x
y
z

i
i
r
s
x
y
z

r
r
s
i
z
x
y

s
s
i
r
y
z
x

x
x
y
z
i
r
s

y
y
z
x
s
i
r

z
z
x
y
r
s
i

Here, for instance, the entry in the row labelled r and the column labelled x is rx,
which is y. Note that this group is not Abelian: for example, xr 6= rx.

12.5

Some elementary properties

The group axioms mean that many of the techniques we use in algebraic manipulation
of numbers can also be used in groups.
From now on, we shall sometimes write groups multiplicatively. That is, we sometimes
dispense with the notation ? for the binary operation of the group, and instead use
juxtaposition. In other words, well write xy instead of x ? y. Notice that, generally, the
order is important (since not all groups are Abelian). Be aware also that, although the
multiplicative notation is a useful shorthand, it can be a little confusing if the binary
operation is addition. Thus, in specific cases in which the operation is addition, we will
use the + sign. When using the + sign, we say that we are using additive notation, or
writing additively.
Theorem 12.1 Suppose that G is a group (written multiplicatively). [This means that
(G, ?) is a group but that we will signify ? by juxtaposition, as mentioned above.] Then,
denoting the identity element by e:
(a) If xy = xz then y = z [i.e., we can cancel];
(b) If xy = e then y = x1 [i.e., inverses are unique];
(c) If xy = x then y = e [so identity is unique];
(d) The equation ax = b has a unique solution for all a, b G (as does the equation
xa = b).

12

Activity 12.4 Prove statements (a) to (d) in this Theorem. [See Biggs, section
20.3].

188

12.5. Learning outcomes

Learning outcomes
At the end of this chapter and the Essential reading and activities, you should be able
to:
explain what is meant by a binary operation
explain what it means to say that a binary operation has the closure property, the
associativity property, the identity property, the inverse property, or the
commutative property
prove that specific binary operations have, or do not have, these properties
state the definition of a group
prove that a given set with a given binary operation forms, or does not form, a
group
explain what is meant by a group table
construct group tables for given groups
perform algebraic manipulations using only the properties of a group.

Sample examination questions


Note: Groups will often be written multiplicatively and identity elements will usually be
denoted by e.
Question 12.1
Let G denote the set of matrices
(
G=

a b

0 1

)
| a, b Z7 , a 6= 0 .

Show that G is a group with respect to the operation of matrix multiplication, where all
additions and multiplications are carried out in Z7 . Thus, for example,
!
!
3 4
6 6

2 3 =
,
0 1
0 1
the top-right entry of the product being 6 because, in Z7 , (3 3) (4 1) = 2 4 = 6.
(You may assume that the matrix multiplication is associative. You may also assume
any properties of the groups (Z7 , ) and (Z7 , ).)
Question 12.2
Suppose that G is the subset of the rational numbers Q given by
 2

a + b2
| a, b, c, d Z, ac =
6 0 .
G=
c2 + d 2

12

Prove that G is a group under the operation of multiplication. [You might find it useful
to use the fact that (x2 + y 2 )(z 2 + w2 ) = (xz yw)2 + (yx + xw)2 .]

189

12. Groups

Question 12.3
Let T denote the set of all functions f : R R of the form f (x) = ax + b for some
a, b R, with a 6= 0. Prove that T is a group with respect to the operation of
composition of functions.
Question 12.4
Suppose that n N and that SL(n, R) denotes the set of real n n matrices with
determinant equal to 1. Show that SL(n, R) is a group under the operation of matrix
multiplication. You may assume any properties of determinants you need, and you may
assume the associativity of matrix multiplication.
Question 12.5
Suppose that (G, ?) is a group and define a binary operation on G by x y = y ? x.
Is (G, ) a group? Justify your answer.
Question 12.6
Prove that, for any group G and any x G, (x1 )1 = x. Prove also that for any
x, y G, (xy)1 = y 1 x1 .
Question 12.7
In a group G, the commutator of x, y G is defined as [x, y] = x1 y 1 xy. Show that
[x, y] = e if and only if x and y commute (that is, xy = yx).
Question 12.8
Suppose G = {g1 , g2 , . . . , gn } is a finite group of order n and that x G. Show that
{xg1 , xg2 , . . . , xgn } = G.
Question 12.9
Suppose that G is a finite group and that H is a non-empty subset of G such that for
all x, y H, xy H. Prove that H is a subgroup of G.
[So, you have to establish, additionally, that for all x H, x1 H. Hint: take any
x H and consider all the elements of the form xn for n N. Since G is finite, two of
these powers must be equal. Proceed from this observation.]
Question 12.10
Suppose that p is a prime and that x is an integer between 1 and p 1. Show that if g
and h are integers between 1 and p 1 such that xg xh (mod p), then g = h. Hence
show that if x Zp = Zp \ {0}, then there exists g Zp such that x g = 1.

12

[This is another way of establishing the existence of inverses in (Zp , ).]


Question 12.11
Let R be a (non-square) rectangle in R2 with corners having coordinates (1, 1),
(1, 1), (1, 1) and (1, 1). Then there are four symmetries, i, a, b, c of R, as follows:
i is the identity

190

12.5. Comments on selected activities

a is reflection in the x-axis


b is reflection in the y-axis
c is a rotation of around the origin.
Construct a group table for the group G = {i, a, b, c} of symmetries of R. Show that
each non-identity element has order 2.

Comments on selected activities


Feedback to activity 12.2
All we need to show is that if x is even and y is even, then so is x + y. Well, thats easy.
If theyre even then there are integers m, n so that x = 2m and y = 2n. So
x + y = 2m + 2n = 2(m + n) and this is even.

Sketch answers to or comments on sample questions


Answer to question 12.1
To prove that G is a group, we need to check the four axioms. First closure: we need to
prove that the product of any two elements of G is an element of G. But
!
!
!
a b
c d
ac ad + b
=
,
0 1
0 1
0
1
which is of the required form, since ac 6= 0. The associative law does hold. You were
allowed to assume this, but its easy, though tedious, to verify:
"
!
!#
!
!
!
!
a b
c d
e f
ac ad + b
e f
ace acf + ad + b
=
=
,
0 1
0 1
0 1
0
1
0 1
0
1
and
a b

!"

0 1

c d

0 1

The identity matrix I =

e f

!#
=

0 1
1 0

a b

0 1

ce cf + d
0

!
=

ace acf + ad + b
0

!
.

is in G, and does what its supposed to.


0 1
!
!
!
!
a b
a b
c d
1 0
To find the inverse of
, we have to solve
=
, which
0 1
0 1
0 1
0 1
amounts to ac = 1, ad + b = 0. These have the solution c = a1 , d = a1 b in Z7 (note
that a is a non-zero
element of Z7 , so does have an inverse). Thus the inverse is
!
1
1
a
a b
, which is a member of M . Note that it wont do to quote a formula: it is
0
1
exceedingly unlikely that you have ever met matrices with entries in Z7 before, and if

191

12

12. Groups

not then you cant possibly have proved anything about their inverses. The formula
! you
a b
might expect to be true is indeed true (let f = ad bc; then the inverse of
is
c d
!
f 1 d f 1 b
if f 6= 0 can you prove this?). You might guess an answer using
f 1 c f 1 a
this formula; but you still would have to check it.
Answer to question 12.2
First we prove closure. Take two numbers from the set, say
x=

a2 + b 2
,
c2 + d 2

y=

r 2 + s2
.
t2 + u2

Then, following the hint, we note that


xy =

(ar bs)2 + (br + as)2


(a2 + b2 )(r2 + s2 )
=
.
(c2 + d2 )(t2 + u2 )
(ct du)2 + (dt + cu)2

Now, since x and y both have non-zero denominator and numerator, so does this
product expression. In other words, we can write xy as (A2 + B 2 )/(C 2 + D2 ) where,
since numerator and denominator are nonzero), at least one of A and B (which we may
assume to be A) is nonzero and at least one of C and D (which we may assume to be
C) is nonzero, so that AC 6= 0. So G is closed under multiplication. Now, multiplication
in G is associative because multiplication of rationals has this property. The identity is
1, and since this can be written as (12 + 02 )/(12 + 02 ), it belongs to G. G also possesses
inverses, because if x = (a2 + b2 )/(c2 + d2 ) G then x1 = (c2 + d2 )/(a2 + b2 ), and this
belongs to G (noting that ac 6= 0 is the same as ca 6= 0). So G is a group.
Answer to question 12.3
First, we show closure. Take two elements of T . Suppose these are given by
f (x) = ax + b and g(x) = cx + d. Then
(f g)(x) = f (cx + d) = a(cx + d) + b = (ac)x + (b + d).
This is also of the form x 7 Ax + B, where A = ac 6= 0, so it belongs to T .
Associativity follows from the fact that it holds for composition of functions. The
identity element is i(x) = x = 1(x) + 0, because for all f , f i = i f = f . (For,
i(f (x)) = f (x) and f (i(x)) = f (x).) Suppose that f T and that f (x) = ax + b. Then
the inverse of f is g given by g(x) = (x b)/a. This is an element of T because it can be
written as g(x) = (1/a)x + (b/a). So G is a group.

12

Answer to question 12.4


This is very straightforward if we use the fact that for any two n n matrices, M and
N , |M N | = |M ||N |, where |A| denotes the determinant of a matrix A. Suppose, then,
that M, N G = SL(n, R). Then |M | = |N | = 1 and so |M N | = |M ||N | = 1 and hence
M N G. Also, if M G then, since |M |M 1 | = |M M 1 | = |I| = 1, we have
|M 1 | G. (These two observations establish that G is a subgroup of GL(n, R) and G
is therefore a group, so we could stop here. But it is easy enough to verify the other two

192

12.5. Sketch answers to or comments on sample questions

group axioms for G directly.) Since matrix multiplication is associative, so is


multiplication in G. Furthermore, the identity matrix belongs to G since it has
determinant 1, and so G has an inverse.
Answer to question 12.5
We show that (G, ) is a group. For all x, y G, x y = y ? x G (by the closure
property of ?). So (G, ) has the closure property. Associativity holds because
(x y) z = (y ? x) z = z ? (y ? x) = (z ? y) ? x = x (z ? y) = x (y z).
(Note how we have used nothing more than the definition of and the associativity of
?. Suppose the identity element of (G, ?) is e. Then, for all x,
x ? e = e ? x = x,
which means that
e x=x e=x
and so e is also an identity element in (G, ). Finally, if x G, let x1 denote its
inverse in (G, ?). Then
x ? x1 = x1 ? x = e,
so
x1 x = x x1 = e,
and x1 is also an inverse in (G, ).
Answer to question 12.6
By definition, the inverse of a G is the element b of G such that ab = ba = e. This
means that the inverse, (x1 )1 of x1 , is the element b such that bx1 = x1 b = e. But
we know, by the definition of x1 that xx1 = x1 x = e. So, b = x fits the description
and, since inverses are unique, it must be the inverse of x1 . So (x1 )1 = x. Since
(xy)(y 1 x1 ) = x(yy 1 )x1 = xex1 = xx1 = e
and, similarly, (y 1 x1 )(xy) = e, we have that (xy)1 = y 1 x1 .
Answer to question 12.7
We have
[x, y] = e

x1 y 1 xy = e
xx1 y 1 xy = x
ey 1 xy = x
y 1 xy = x
yy 1 xy = yx
exy = yx
xy = yx.

12

193

12. Groups

Weve taken this very slowly, first multiplying on the left by x and then y. But a more
direct approach is to do both multiplications simultaneously:
[x, y] = e

x1 y 1 xy = e
yxx1 y 1 xy = yx
yey 1 xy = yx
yy 1 xy = x
xy = yx.

Answer to question 12.8


The key observation is that the n elements xgi (for i = 1, 2, . . . , n) are distinct. This is
because xgi = xgj gi = gj (and we can perform such cancellation because G is a
group). So every element of G must be of the form xgi for some i, since there are n
distinct such elements, and n elements of G. (And, of course, since G is a group, each
element of the form xgi also belongs to G.) The corresponding result is true for infinite
groups, too. That is, if G is any group and x G then {xg | g G} = G. To prove this,
and to give an alternative proof of the result for the finite case, we first note, by the
closure of multiplication, that {xg | g G} G. Then, also, for any h G, we can write
h as h = x(x1 h) = xg, where g = x1 h G. Thus, G {xg | g G} and we are done.
Answer to question 12.9
If xg xh (mod p), then p | (xg xh); that is, p | x(g h). By the fact that p is prime,
this means either that p | x or p | (g h). But since x is between 1 and p 1, we cannot
have p | x. Thus, p | (g h). But |g h| < p 1 and so the only way in which p can
divide g h is for g h to be 0; so h = g, as required. Suppose, for the next part, that
x Zp , and consider all the products xg as g ranges through Zp . By what has just been
shown, these p 1 products are distinct and hence they give all of the p 1 elements of
Zp , including 1; that is, there exists g Zp such that xg = 1 (and, since multiplication
is commutative, we also have gx = 1). The element g is therefore an inverse to x. Since
x was any element of Zp , we now see that (Zp , ) possesses inverses.
Answer to question 12.10
By the definition of ?, since for all x1 , x2 G, x1 x2 G and for all y1 , y2 H,
y1 y2 H, it follows that G H is closed under ?. The identity element in G H is
(e, e). (Here, were denoting the identities in G and H by the same symbol e. We could
be more precise and indicate that these may be different by using the notation
(eG , eH ).) This is because, for all (x, y) G H,
(x, y) ? (e, e) = (xe, ye) = (x, y)

12

and, similarly, (e, e) ? (x, y) = (x, y). Associativity follows from associativity in the
groups G and H: for,
(x1 , y1 ) ? ((x2 , y2 ) ? (x3 , y3 )) =
=
=
=

194

(x1 , y1 ) ? (x2 x3 , y2 y3 )
(x1 (x2 x3 ), y1 (y2 y3 ))
((x1 x2 )x3 , (y1 y2 )y3 )
((x1 , y1 ) ? (x2 , y2 )) ? (x3 , y3 ).

12.5. Sketch answers to or comments on sample questions

Given (x, y) G H, let z = (x1 , y 1 ) where x1 is the inverse of x in G and y 1 is


the inverse of y in H. Then
(x, y) ? z = (x, y) ? (x1 , y 1 ) = (xx1 , yy 1 ) = (e, e),
the identity in G H; and, similarly, z ? (x, y) = (e, e). It follows that G H is a group.
Answer to question 12.11
The group table is as indicated.

i
a
b
c

i
i
a
b
c

a
a
i
c
b

b
b
c
i
a

c
c
b
a
i

It can be seen that a2 = b2 = c2 = i, the identity of the group, so each non-identity


element has order 2. (They cannot have order 1, since only i has order 1.)

12

195

12. Groups

12

196

Chapter 13
Subgroups

Essential reading
Biggs, N.L. Discrete Mathematics. Chapter 20, Sections 20.4, 20.6 and 20.7.

13.1

Introduction

This chapter concerns subgroups. A subgroup is a subset of a group that is also, itself, a
group. This is a central idea in algebra.

13.2

Definition of a subgroup

Definition 13.1 If (G, ?) is a group and H G is such that (H, ?) is also a group,
then we say that H is a subgroup of G. (We could be pedantic and say that (H, ?) is a
subgroup of (G, ?), but this isnt necessary.) We write H 6 G to signify that H is a
subgroup of G.
Note that H 6 G means that H is a subgroup of G, whereas H G just means that H
is a subset of G.
Example 13.1 In the previous chapter, we looked at the group G of symmetries of
an equilateral triangle T (where the operation is composition of functions). Let
H = {i, r, s}. This consists of the two rotations and the identity transformation.
Then H is a subgroup of G.
Example 13.2 We mentioned earlier that the set GL(n, R) of invertible n n real
matrices is a group under the operation of matrix multiplication. The subset
GL(n, Q) consisting of n n invertible matrices with rational entries is a subgroup.
How do we know, or how can we check, that a given H G is in fact a subgroup? Well,
since the group operation is associative on G, it is certainly associative on H; so this is
never something to worry about. Whats left? Well, H must have the closure, identity
and inverse properties. That is, writing the group multiplicatively, the following must be
true:
x, y H = xy H (and not simply xy G, which we know is true because G is
closed under the group operation).

197

13

13. Subgroups

There must be some element eH H such that xeH = eH x = x for all x H.


For all x H, there must be some element x0 H such that xx0 = x0 x = eH .
Now, it turns out that eH must be the identity element e of G; and that, for each
x H, the inverse in H, x0 , is in fact the inverse x1 in G. Lets see why:
First, we must have eH eH = eH and eeH = eH , so
eH eH = eeH .
By cancellation, eH = e.
Next, let x H and let x0 denote its inverse in H and x1 its inverse in G. Then (using
the fact that eH = e) xx0 = eH = e. Then, x1 xx0 = x1 e = x1 ; that is, x0 = x1 .
We therefore have: H 6 G if and only if H G and
x, y H = xy H
eH
for all x H, x1 H.
Now, the second of these can be omitted (if H 6= ), because if we take any x H then
x1 H also and, by closure xx1 H. But xx1 = e, so the other two properties
imply that e H. We therefore have:
Theorem 13.1 Suppose G is a group and =
6 H G. Then, writing the group
operation multiplicatively, H is a subgroup of G (H 6 G) if and only if:
(1) x, y H = xy H, and
(2) x H = x1 H.
An equivalent, slightly different characterisation can be given.
Theorem 13.2 Suppose G is a group and =
6 H G. Then H is a subgroup of G if
and only if
x, y H = x1 y H.
Activity 13.1 Prove Theorem 13.2.
Lets do some examples.
Example 13.3 Let H = {i, r, s}, the subset of the group G of symmetries of an
equilateral triangle T . We claimed above that this was a subgroup of G. Note that
the three elements of H are rotations, of 0, 2/3 and 4/3 clockwise. H is clearly
closed under composition. It is clear that
i1 = i H, r1 = s H, s1 = r H.

13

(For, rs and sr are both equivalent to a rotation of 2, which is the same as i. That
is, rs = sr = i.) So, H is closed under the taking of inverses. The group table of H is
as follows:

198

13.2. Definition of a subgroup

i
r
s

i
i
r
s

r
r
s
i

s
s
i
r

Example 13.4 We claimed that GL(n, Q) 6 GL(n, R). To prove this, we need to
show H = GL(n, Q) is closed under matrix multiplication and inversion. Clearly the
product of two n n matrices with rational entries is itself an n n matrix with
rational entries. The inverse of any n n matrix with rational entries also has
rational entries. (You can see this either by thinking about the co-factor method or
the row-operation method for matrix inversion.)
Example 13.5 Suppose that
(
H=

x y
0 z

)
| x, y, z R and xz 6= 0 .

We prove that H is a subgroup of GL(2, R), the group of all invertible 2 2 matrices
under matrix multiplication. First, we note that H is indeed a subset of GL(2, R)
since for any A H, the condition xz 6= 0 ensures |A| =
6 0 and hence A is invertible.
Now, we show the closure of H under matrix multiplication. First, we note that
!
!
!
x y
x0 y 0
xx0 xy 0 + yz 0
=
.
0 z
0 z0
0
zz 0
Also, (xx0 )(zz 0 ) = (xz)(x0 z 0 ). So, if A, B H, then the matrix product is also in H,
since it takes the form
!
X Y
,
0 Z
with XZ 6= 0. Now we show that H is closed under taking inverses. By the usual
formula,
!1
!
x y
1/x y/xz
=
H.
0 z
0
1/z

Example 13.6 Let G be any group, written multiplicatively. Define


Z = {z G | gz = zg g G}. So, Z is the set of elements of the group that
commute with every member of the group. Z is called the centre of G. It is always
non-empty because e Z. If G is Abelian, then all elements commute with all
elements and Z = G. In all cases, Z is a subgroup of G. To prove this, we need to
show:
x, y Z = xy Z

199

13

13. Subgroups

and
x Z = x1 Z.
Suppose, then, x, y Z. Then, for all g G, xg = gx and yg = gy. We need to show
that for all g G, (xy)g = g(xy). We argue as follows, making use of associativity:
(xy)g = x(yg) = x(gy) = (xg)y = (gx)y = g(xy).
Now, suppose that x Z. We want to show x1 Z, which means x1 g = gx1 for
all g G. We know that for all g, gx = xg. Therefore x1 (gx)x1 = x1 (xg)x1 .
Now,
x1 (gx)x1 = x1 g(xx1 ) = x1 ge = x1 g
and, similarly, x1 xgx1 = gx1 . So, for all g G, we do indeed have x1 g = gx1 .
So Z 6 G.

13.3

Powers and order of a group element

Suppose that (G, ?) is a group and that x G and n N. Then the nth power of x is
x
| ? x ? x{z? ? x} .
n times

We can also define negative integer powers of an element x: for n N, xn is defined to


be (x1 )n .
If the group is written multiplicatively, then the notation xn makes sense, and is
familiar. When the operation is addition, however, the nth power is nx. In our general
(abstract) discussion of powers, we will use the multiplicative notation xn , but you
should be aware of what this really means when the operation is not multiplication. (It
could be addition, as already discussed, or it could be composition of functions, for
example.)
The properties xm xn = xm+n and (xm )n = xmn hold. (In an additive group these simply
say that mx + nx = (m + n)x and n(mx) = (mn)x.)
Powers of a fixed group element give us useful subgroups.
Theorem 13.3 Suppose that G is a group and that x G. Then H = {xn | n Z} is
a subgroup of G. We denote this subgroup by hxi and call it the subgroup generated by
x. The subgroup hxi is, moreover, the smallest subgroup of G containing the element
x, in the sense that if H 6 G and x H, then hxi H.
Example 13.7 Consider the subgroup of the symmetries of an equilateral triangle
generated by r. We have r2 = s and r3 = i. So,
hri = {rn | n Z} = {r, r2 , r3 } = {r, s, i},

13

the subgroup H considered earlier.


In this example, the third power of r is the identity, and no smaller power is. We say
that r has order 3.

200

13.3. Powers and order of a group element

Generally, if G is a group and x G, then we say that x has infinite order if xn 6= e for
all n N; and we say that x has order m N if xm = e and xn 6= e for
k = 1, 2, . . . , (m 1). (So the order of x is the least positive integer m such that xm = e,
where we interpret this as infinite in the case that no such m exists.)
Theorem 13.4 If x has infinite order, then xm 6= xn if m 6= n, for m, n Z. Therefore,
in particular, if the group is finite, it can have no element of infinite order.
Proof
xm = xn = xm (xn )1 = e and xn (xm )1 = e
= xmn = xnm = e
= x|mn| = e
But no positive power of x equals e, so |m n| = 0 and m = n. This shows that if x is
of infinite order, then the subgroup hxi generated by x is infinite, and so G is too, since
hxi G.
For elements of finite order, we have the following result.
Theorem 13.5 Suppose that the group element x has finite order m. Then:
(1) Let n Z. If n = km + r where k, r Z and 0 r m 1, then xn = xr .
(2) For n N, xn = 1 m | n.
(3) 1, x, x2 , . . . , xm1 is a complete, repetition-free, list of the elements of hxi.
(4) The subgroup hxi generated by x has cardinality m.
Example 13.8 Consider the group (Z5 , ). Take x = 2. Modulo 5 (and writing
multiplicatively) we have
22 = 4, 23 = 8 = 3, 24 = 16 = 1, 25 = 32 = 2.
The subgroup h2i is therefore the whole of the group. The subgroup h4i is smaller: it
is {1, 4}.
A group G with the property that G = hxi for some x G is called a cyclic group. The
Example just given shows that (Z5 , ) is cyclic. More generally, it is the case that if p is
any prime number, then (Zp , ) is cyclic. (This is a special case of a result that is
important in the theory of finite fields and in a number of the applications of fields to
coding and cryptography.)

13
201

13. Subgroups

Learning outcomes
At the end of this chapter and the Essential reading and activities, you should be able
to:
explain what is meant by a subgroup of a group
determine whether a given subset of a group is a subgroup of that group
explain what is meant by the order of a group element
demonstrate understanding of what is meant by the subgroup generated by a group
element
demonstrate knowledge of what is meant by a cyclic group.

Sample examination questions


Question 13.1
Suppose that G is a finite group and that H is a non-empty subset of G such that for
all x, y H, xy H. Prove that H is a subgroup of G. [So, you have to establish,
additionally, that for all x H, x1 H. Hint: take any x H and consider all the
elements of the form xn for n N. Since G is finite, two of these powers must be equal.
Proceed from this observation.]
Question 13.2
1 x

, for x R. Show that H is a


0 1
subgroup of GL(2, R), the group of 2 2 invertible real matrices, with the operation of
matrix multiplication.
Let H be the set of matrices of the form

Question 13.3
Suppose that G is a group and that x G. Show that
C(x) = {g G | xg = gx}
is a subgroup of G. Suppose now that G = GL(2, R) is the group of invertible 2 2 real
matrices and that
!
!
2 0
1 1
M=
, N=
.
0 1
0 1
Determine C(M ) and C(N ).
Question 13.4

13

Suppose that G is the group of permutations of a set . (So, G is the set of bijections
from to , and the group operation is composition of functions.) Prove that, for any
x , the set Gx = {f G | f (x) = x} is a subgroup of G.

202

13.3. Sample examination questions

Question 13.5
Suppose that H1 , H2 , . . . , Hk are subgroups of a group G. Show that the intersection
k
\

Hi = H1 H2 Hk

i=1

is a subgroup of H.
Question 13.6
Suppose H and K are subgroups of G such that neither H K nor K H. Prove that
H K is not a subgroup of G. [Hint: suppose that h H \ K and k K \ H and
consider the element hk.]
Question 13.7
Prove that every cyclic group is Abelian.
Question 13.8
Show that in an Abelian group G, the set of all elements of finite order forms a
subgroup. Suppose GL(2, R) is the group of invertible 2 2 matrices with real entries
(which you may assume to be a group). Let
!
!
0 1
0 1
A=
, B=
.
1 0
1 1
Show that A, B GL(2, R), that A has order 4, B has order 3, and AB has infinite
order. [Thus, the elements of finite order need not form a subgroup if G is not Abelian.]
Question 13.9
Suppose that the element x of a group G satisfies x12 = x2 . What are the possible
values for the order of x?
Question 13.10
Suppose G is a group. Prove that, for x G, x2 = e if and only if x = x1 . Hence prove
that if every non-identity element of G has order 2, then G is Abelian.
Question 13.11
Prove that for any x in a group G, x and x1 have the same order. Deduce that if G is
finite and has even order, then G contains at least one element of order 2.
Question 13.12
Suppose that the finite cyclic group G = hxi has order n. Show that, if r is a positive
integer, then G = hxr i if and only if the greatest common divisor of r and n is 1. How
many different generators has a cyclic group of order 2k (where k N)? How many
different generators has an infinite cyclic group?

203

13

13. Subgroups

Question 13.13
A group G is such that G contains at least 2 elements and the only subgroups of G are
{e} and G itself. Prove that G is a finite cyclic group of prime order. [There are three
things to prove: finite; cyclic; prime order].

Comments on selected activities


Feedback to activity 13.1
We need to show that the property
x, y H = x1 y H
holds if and only if both of the following hold:
(1) x, y H = xy H, and
(2) x H = x1 H.
First, suppose H has the property that if x, y H then x1 y H. Then lets take y to
be the identity element and we see that x1 y H means x1 H, so property (ii)
holds. Then, let x, y H. By what weve just shown, x1 H. So (x1 )1 y H. But
this is just xy H, so (i) holds.
Conversely, suppose (i) and (ii) hold. Let x, y H. By (ii), x1 H and by (i),
therefore, x1 y H.
So both are equivalent.

Sketch answers to or comments on sample questions


Answer to question 13.1
Suppose x H. Because H is closed under multiplication, all powers xn of x (for
n N) belong to H. But since H is finite, this list of powers necessarily contains
repetitions. So there are r, s N with r < s and xr = xs . Now, xr = xs = xr xsr and,
by cancellation in G, xsr = e. Since s > r the element xsr is in H; so weve shown
that H contains the identity. Now, consider the equation xsr = e. If s r = 1 then we
have x = e, the inverse of which is e, which belongs to H. Otherwise, s r 1 is a
positive integer, and since xxsr1 = e, it follows that x1 = xsr1 and this belongs to
H. Thus, H is a subgroup.
Answer to question 13.2

13

First we note that H is indeed a subset of GL(2, R), because each matrix of the given
type has determinant 1, and is therefore invertible. Now we show H is closed. This
follows from the observation that
!
!
!
1 x
1 y
1 x+y
=
,
0 1
0 1
0
1

204

13.3. Sketch answers to or comments on sample questions

which belongs to H. We can see that H is closed under the taking of inverses from the
fact that
!1
!
1 x
1 x
H.
=
0 1
0 1
So H is a subgroup of GL(2, R).
Answer to question 13.3
Be sure that you read the definition of the set carefully. The set C(f ) consists of those
group elements g such that gf = f g; i.e., g commutes with the fixed group element f .
(Note that we are writing the group multiplicatively.) For instance, the identity e is
certainly in C(f ), as are f 1 and f itself. The set C(f ) of elements of the group that
commute with f is called the centraliser of f . To prove that C(f ) is a subgroup of G,
we need to check three things: that it is nonempty, that it is closed under
multiplication, and that it is closed under the taking of inverses. Suppose that g and h
are in C(f ), so gf = f g and hf = f h. We need to check that gh is in C(f ), i.e., that
(gh)f = f (gh). Now
(gh)f = g(hf ) = g(f h) = (gf )h = (f g)h = f (gh),
using the associative law three times, along with the given equations gf = f g and
hf = f h. The identity element e is in C(f ), since ef = f = f e. The reason we need to
check this is to ensure that the set C(f ) is not empty: make sure that your proof rules
out this possibility. Suppose that g C(f ): we claim that g 1 is also in C(f ). We know
that gf = f g, so
g 1 gf g 1 = g 1 f gg 1 .
(The associative law tells us that these expressions are unambiguous without brackets.)
But this says that f g 1 = g 1 f , as required. To! find C(M ), we need to see which !
a b
2a b
matrices A satisfy AM = M A. Let A =
; we then see that AM =
and
c d
2c d
!
2a 2b
MA =
. Evidently these are equal if and only if b = c = 0. So C(M ) is the
c d
group of invertible diagonal matrices with real coefficients.
!
!
a a+b
a+c b+d
With A as above, we have AN =
, NA =
. These are equal
c c+d
c
d
if and only if
! c = 0 and a = d. Thus C(N ) is the set of real invertible matrices of the
a b
form
.
0 a
Answer to question 13.4
Suppose that f, g Gx . Then f (x) = x and g(x) = x. The composite function f g
satisfies
(f g)(x) = f (g(x)) = f (x) = x,
so f g Gx . For f G, the inverse function f 1 satisfies f 1 (x) = x, simply because
f (x) = x (and, by definition of the inverse function, y = f 1 (x) if and only if f (y) = x).
So Gx is a subgroup of G.

205

13

13. Subgroups

Answer to question 13.5


T
Let H = ki=1 Hi . Suppose x, y TH. Then, for each i, x, y Hi and since Hi is a
subgroup, x1 y Hi . So x1 y Hi = H. This shows that H is a subgroup of G.
[Note: we could, of course, show separately that H is closed under multiplication and
taking of inverses, but as we have noted, it suffices to show that x, y H implies
x1 y H.]
Answer to question 13.6
Quite tricky, this one. Note the strength of the result. Were not simply showing that
H K can sometimes fail to be a subgroup if H is not a subset of K or vice versa:
were showing that it always fails to be a subgroup unless H K or K H. To prove
that L = H K is not a subgroup when neither H K nor K H, we need to show
either that it is not closed under multiplication (or, more formally, the group operation,
which we shall write as multiplication) or that it is not closed under taking inverses.
Following the hint, suppose that h is an element of H not in K and that k is an element
of K not in H. (Such elements exist since were supposing that neither of H, K is a
subset of the other.) Consider hk. Since h H L and k K L, we should have
hk L. But since L = H K this would mean that hk H or hk K. Now, hk H
implies (since H is a subgroup) that h1 (hk) H; that is, k H, which is not true.
Similarly, hk K implies h = (hk)k 1 K, which is false. So we do not have hk L
and L is not closed under multiplication and is hence not a subgroup.
Answer to question 13.7
If G is cyclic then G = hxi for some x G. Suppose that a, b G. If one of these, say b,
is the identity element e, then ab = ae = a = ea = ba. Otherwise, there are m, n N
such that a = xm and b = xn . By the properties of powers,
ab = xm xn = xm+n = xn+m = xn xm = ba.
So for any a, b G, ab = ba and the group is Abelian.
Answer to question 13.8
Let H be the set of elements of finite order in the Abelian group G. Suppose x, y H
have order m, n respectively. Then, because the group is Abelian,
(xy)mn = xmn y mn = (xm )n (y n )m = en em = ee = e,
so xy has finite order (at most mn). Note that (xy)k = xk y k holds because G is Abelian.
If G is not Abelian, this would not necessarily be true: for instance, (xy)2 = xyxy and
we cannot, in general, rearrange the terms of this product to obtain x2 y 2 . However, if
the group is Abelian, we have xyxy = x(yx)y = x(xy)y = x2 y 2 . Now,
(x1 )m = (xm )1 = e1 = e, so x1 also has finite order. Note that the fact that
(x1 )m = (xm )1 follows from the observation that xk (x1 )k = e, something that is
easily seen to be generally true in any group. For example (the case k = 2),

13

x2 (x1 )2 = xxx1 x1 = x(xx1 )x1 = xex1 = xx1 = e.


Thus xy and x1 belong to H and H is therefore a subgroup of G. For the next part, it
is a simple matter of calculation to check that A4 = I but Ak 6= I for k = 1, 2, 3, so that

206

13.3. Sketch answers to or comments on sample questions

A has order 4. Similarly, we can see that B has order 3 because B 3 = I but B 2 6= I and
B 6= I. The matrices A and B certainly belong to the group GL(2, R) because they are
invertible. Now, the product AB is
!
1 1
C = AB =
.
0
1
We have
2

C =

!
1 2
0

, C =

!
1 3
0

and it can be proved by induction that for any n N,


!
1
n
Cn =
.
0
1
So, for no n N do we have C n = I, and C therefore has infinite order.
Answer to question 13.9
Because x12 = x2 , we have x10 = e. Thus the order of x is a divisor of 10, and hence
must be 1, 2, 5 or 10.
Answer to question 13.10
We have x2 = e if and only if xx = e. But this says, precisely, that x = x1 , by the
definition of an inverse. Suppose every non-identity element of G has order 2, and let
x, y G. Then x = x1 and y = y 1 . Using the fact (always true) that (xy)1 = y 1 x1 ,
together with (xy)1 = xy (true here because xy is either a non-identity element with
order 2, or it is e), we immediately have
xy = (xy)1 = y 1 x1 = yx.
Since x, y were any two elements of G, this shows that G is Abelian.
Answer to question 13.11
(We should not forget the case in which the order of x is infinite. We deal with this
first.) Suppose that x has infinite order. Then for all n N, xn 6= e. It follows also that
(x1 )n 6= e, because (x1 )n = (xn )1 and if this is e, we have (xn )1 = e and hence
xn = e1 = e. A similar argument shows that if x1 has infinite order, so too does x.
Now suppose x has finite order m. Then xm = e and so (x1 )m = (xm )1 = e1 = e;
hence x1 has finite order at most m. Conversely, if x1 has finite order k then
(x1 )k = e and so xk = ((x1 )k )1 = e1 = e and x therefore has order at most k. It
follows that both x and x1 have the same order. Now suppose G is finite, so every
element has finite order. If x has order equal to 2, then x = x1 . Only the identity has
order equal to 1. If x has order greater than 2 then x, x1 are different elements of G.
We can therefore decompose G into e, some number M of pairs {x, x1 } for all x of
order > 2, and some number N 0 of singletons {x}, one for each element of order 2.
We need to show that N 1. Counting, we have
|G| = 1 + 2M + N.

207

13

13. Subgroups

Since |G| is even, we therefore must have N odd, and, in particular, N 1. Note that
weve shown slightly more than required: namely that G has an odd number of elements
of order 2.
Answer to question 13.12
Let d = gcd(r, n). If d > 1 then n/d is a positive integer less than n, and r/d N and
we have
(xr )n/d = (xn )r/d = er/d = e,
which shows that xr has order at most n/d, less than n. Then the group generated by
xr has order less than n and is not, therefore, the whole of G. Conversely, suppose
d = 1, and that (xr )m = e. Then, xrm = e and, by a standard result on orders, we must
have n | rm. But if gcd(r, n) = 1, this means n | m and, in particular, m n. So
(xr )n = e and no smaller power of xr is e. That is, xr has order n. Therefore G = hxr i
(since both have the same cardinality).
Suppose G is a cyclic group of order 2k , and suppose G = hxi. All elements of G take
the form xr , for some r 2k . From what we have just shown, the generators of G are
precisely those xr for which gcd(2k , r) = 1. The r that satisfy this are all the odd
numbers less than 2k , and there are 2k1 of these. So the number of generators is 2k1 .
Suppose now that G = hxi is an infinite cyclic group. If r 2 (and r N) and xr
generates G, then for some m N, we have x = (xr )m = xrm , showing that xrm1 = e.
But, since r 2, rm 1 > 0 and this contradicts the fact that x has infinite order
(which means that no positive power of x is equal to e). So there is no generator of the
form xr for r > 1. Similarly, if r is a negative integer, with |r| 2, then xr would
generate G only if there is some m N such that (xr )m = x, and we therefore again
have xrm1 = e. In this case, 1 rm > 1 and x1mr = (xmr1 )1 = e1 = e and we
again contradict the fact that x has infinite order. The only remaining values of r are
r = 1, which clearly gives the generator x itself, and r = 1. Any element of G can be
written as xn for some n. So, if g G then there is some n N such that g 1 = xn .
Then g = (g 1 )1 = (xn )1 = (x1 )n . This establishes that x1 generates G. So there
are exactly 2 generators: x and x1 .
Answer to question 13.13
Suppose first that G were not cyclic. Then, for any x 6= e, H = hxi would be a subgroup
of G such that H 6= {e} and H 6= G. So G must be cyclic. Suppose, then, that G = hxi.
If G were infinite, then hx2 i would be a subgroup H with H 6= {e} and H 6= G. Clearly,
since x has infinite order, x2 6= e, so H 6= {e}. Additionally, if we had H = G then, for
some m N, wed have x = (x2 )m ; that is, x2m1 = e, a contradiction to the fact that x
has infinite order (since 2m 1 is a positive integer). So G is finite. Suppose that
|G| = n. Then the order of the generator x is n. If n = rs for r, s N with r, s > 1, then
xr has order at most s < n since (xr )s = xrs = xn = e, and so H = hxr i is a subgroup of
G such that H 6= {e} and H 6= G (since |H| s < n). So n must be prime, and we are
done.

13
208

Chapter 14
Homomorphisms and Lagranges
theorem

Essential reading
Biggs, N.L. Discrete Mathematics. Chapter 20, Sections 20.5 and 20.8.

14.1

Introduction

In this chapter we explore further the structure of groups. We formalise what it means
to say that two groups have essentially the same structure. We introduce cosets and
show how they can be used to derive Lagranges theorem, an important tool in the
study of groups and their subgroups.

14.2

Homomorphisms

A homomorphism from one group to another is a function that respects the group
operations. Before giving a general definition, lets think about a specific example. The
positive real numbers R+ form a group, G = (R+ , ), under multiplication, and the real
numbers form a group, H = (R, +) under addition. The (natural) log function
(x) = log x has the following property:
(x y) = (x) + (y).
(This is just the familiar log(xy) = log x + log y.) So this means that the image under
of the element of (R+ , ) which is obtained by performing the group operation of G on
x and y is related in a natural way to the images of x and y under : the image is
obtained by performing the group operation of H on the images (x) and (y).
Definition 14.1 Suppose that (G, ?) and (H, ) are groups. A function : G H is a
homomorphism if (x ? y) = (x) (y) for all x, y G.
The definition emphasises, through the use of ? and , that the group operations can be
different (for example, multiplication and addition, as in the example described above).
However, this is cumbersome and so we will usually write both groups multiplicatively,
in which case the defining property of a homomorphism can be expressed as
(xy) = (x)(y).
Theorem 14.1 Suppose is a homomorphism from G to H and that the identities in
the groups are, respectively, eG and eH . Then

209

14

14. Homomorphisms and Lagranges theorem

(i) (eG ) = eH ;
(ii) for all x G, (x1 ) = ((x))1 .
There are two very important subgroups associated with a homomorphism:
Definition 14.2 Suppose is a homomorphism from G to H. Then the kernel of ,
denoted ker , is
ker = {x G | (x) = eH }.
The image of , denoted (G) or im , is
(G) = {(x) | x G}.
Theorem 14.2 For any group homomorphism : G H, ker 6 G and (G) 6 H.
Activity 14.1 Try to prove this.
Note: You may, in some other courses, particularly 118 Advanced linear algebra,
come across the terminology kernel in connection with linear mappings between (real)
vector spaces V and W (where the kernel is also sometimes called the null space). The
notion is the same.

14.3

Isomorphism

Isomorphisms are special types of homomorphisms, namely those that are bijective;
that is, injective (or 1-to-1) and surjective (or onto). There is a sense in which groups
between which there is an isomorphism may be regarded as having the same algebraic
structure, as will be indicated below. First,a formal definition.
Definition 14.3 Let G and H be groups. A homomorphism : G H is an
isomorphism if it is bijective. If there is an isomorphism from G to H then we say that
the groups are isomorphic and we write G
= H.
(Some texts, such as Biggs Discrete Mathematics use slightly different notations; for
instance, Biggs uses G H.)
Note that if : G H is an isomorphism then the inverse function 1 (which exists
since is bijective) is an isomorphism H G. It is easily checked that isomorphism is
an equivalence relation between groups.
As mentioned, isomorphic groups are, as far as their algebra is concerned, essentially
the same as each other. Isomorphic groups have the same cardinality because there is a
bijection between them, but, more than this, the group operations act in the same sort
of way in each group. To explain this further, suppose that the groups are finite (just
for the sake of simplicity, though what follows can be generalised to infinite isomorphic
groups). Suppose we have a group table TG for G, formed by taking the elements of G
in some order x1 , x2 , . . . , xN , where N = |G|. Suppose that TH is a group table for H,
formed by listing the elements of H as (x1 ), . . . , (xN ) (which is legitimate, since is a
bijection). By the properties of the isomorphism , if TG , regarded as a matrix, has

14

210

14.4. Cosets

(i, j)th entry equal to z = xi xj , then the (i, j)th entry of TH is just (z). This is because
(z) = (xi xj ) = (xi )(xj ). In other words, if we take a group table for G and replace
each entry z of it by (z) then we have a group table for H. In this sense, the group
tables are essentially the same: just the symbols representing the elements are different.
That may all have been a little indistinct, so lets look at a very easy example. Consider
the group G = (Z3 , ) of addition modulo 3 and the group H = {i, r, s} of rotational
symmetries of an equilateral triangle, where the operation is function composition
(described above).
If : G H is defined as follows:
(0) = i, (1) = r, (2) = s,
then is an isomorphism. Clearly its a bijection and it is easy to check that is a
homomorphism. For instance, the facts that rs = i and 1 2 = 0 mean that
(1 2) = (1)(2).
The group tables are:

14.4

0
1
2

0
0
1
2

1
1
2
0

2
2
0
1

i
r
s

i
i
r
s

r
r
s
i

s
s
i
r

Cosets

Let H be a subgroup of a group (G, ?) and let a be an element of G. The left coset
a ? H is defined as {a ? h | h H}. The right coset H ? a is {h ? a | h H}. A coset is a
subset of G, but not necessarily a subgroup. If G is Abelian, then clearly a ? H = H ? a,
but this is not generally the case. We often write the group multiplicatively, in which
case left cosets and right cosets are denoted simply by aH and Ha. If the group
operation is addition, then the appropriate notations would be a + H and H + a.
Example 14.1 Let G be the group of symmetries of the equilateral triangle (as
earlier) and let H 6 G be {i, r, s}. Then the left cosets are
iH = rH = sH = {i, r, s} = H, xH = yH = zH = {x, y, z}.
(Note that {x, y, z} is a subset, but not a subgroup, of G.)

211

14

14. Homomorphisms and Lagranges theorem

Example 14.2 Suppose that (G, ?) = (Z6 , ) and let H = {0, 2, 4}. Then H 6 G
(since it is the group generated by 2 recall that the group operation is addition
modulo 6). Then, the left cosets aH, which in this case should be written a H, are
0 H = 2 H = 4 H = {0, 2, 4} = H,
1 H = 3 H = 5 H = {1, 3, 5}.
Of course, since the group is Abelian, the corresponding right cosets are the same as
the left cosets.
Example 14.3 Suppose that (G, ?) = (Z, +). Let H = 4Z be the subgroup
H = {4n | n Z}, consisting of all integer multiples of 4. Then (writing the cosets
additively), it can be seen that there are precisely 4 different left cosets (which
coincide with the right cosets since the group is Abelian). If g = 4k for some k N
(that is, g is a multiple of 4), then
g + H = {4k + 4n | n N}
= {4(k + n) | n N}
= {4m | m N} = H.
If g 1 (mod 4) then g = 4k + 1 for some k N and
g+H =
=
=
=

{4k + 1 + 4n | n N}
{4(k + n) + 1 | n N}
{4m + 1 | m N}
{. . . , 7, 3, 1, 5, . . . }.

If g 2 (mod 4) then g = 4k + 2 for some k N and


g+H =
=
=
=

{4k + 2 + 4n | n N}
{4(k + n) + 2 | n N}
{4m + 2 | m N}
{. . . , 6, 2, 2, 6, . . . }.

If g 3 (mod 4) then g = 4k + 3 for some k N and


g+H =
=
=
=

{4k + 3 + 4n | n N}
{4(k + n) + 3 | n N}
{4m + 3 | m N}
{. . . , 5, 1, 3, 7, . . . }.

Theorem 14.3 Let G be a group and H 6 G.


(a) If H is finite, then |aH| = |H| for all a G.

14

212

14.5. Lagranges theorem

(b) The set of distinct left cosets of H forms a partition of G. That is, their union is G
and no two of them intersect.
Proof
(a) The function f : H aH defined by f (h) = ah is a bijection.
(b) We have g gH for every g G, so the union of all cosets is G. Suppose
g aH bH. Then g = ah1 = bh2 for some h1 , h2 H, so a = bh2 h1
1 . We prove that
aH bH. Suppose j aH, with j = ah, say. Then j = bh2 h1
h

bH,
since h2 h1
1
1 h, as
a product of three elements of the subgroup H, lies in H. So aH bH, and similarly
bH aH, so aH = bH. Thus any two distinct cosets are disjoint.
The same results hold for right cosets.
There is an alternative approach to cosets, which we now outline. Suppose G is a group
(written multiplicatively) and H 6 G. Let us define a relation on G called left
congruence modulo H by:
xl y (mod H) x1 y H.
We can similarly define a relation right congruence modulo H by:
xr y (mod H) xy 1 H.
Then it can be shown that left congruence and right congruence modulo H are
equivalence relations on G. The equivalence classes are, respectively, the left cosets and
the right cosets.
Example 14.4 Suppose that (G, ?) = (Z, +) and that H = 4Z is the subgroup
H = {4n | n Z}, consisting of all integer multiples of 4. Then (recalling that the
operation is addition),
xl y (mod H) (x) + y H 4 | (y x) x y (mod 4).
This shows that left congruence is the same as congruence modulo 4. (Because the
group is Abelian, this is also true of right congruence.) Note that the left cosets
(listed above in the example) are precisely the equivalence classes of integers modulo
4.
Recall that the equivalence classes of an equivalence relation partition the set of which
the relation is defined. Because left congruence modulo H is an equivalence relation on
G, this means that the left cosets (which are the equivalence classes of this relation)
partition H. This provides another, indirect, way to arrive at part (b) of the Theorem
above.

14.5

Lagranges theorem

Now for an important consequence of cosets and their properties. (Indeed, this is the
main reason weve studied cosets):

213

14

14. Homomorphisms and Lagranges theorem

Theorem 14.4 (Lagranges theorem) Let G be a finite group, and let H be a


subgroup of G. Then |H| divides |G|.
Proof
Since the set of distinct left cosets partitions G, the order of G is the sum of the
cardinalities of the various left cosets. But each coset has cardinality |H|, so |G| = |H|
number of cosets. Therefore |G| is an integer multiple of |H|, as claimed.
It should not be supposed that a converse to this Theorem holds which would say that
if m | |G| then there is a subgroup H of G of cardinality m.
Corollary 14.5 Let G be a finite group, and let g be an element of G. Then the order
of g divides the order |G| of G.
Proof
Apply Lagranges theorem to the subgroup hgi.
Corollary 14.6 If G is a group of order n, then g n = e for every element g of G.
Example 14.5 Consider the group (Z11 , ) of non-zero integers modulo 11, under
multiplication modulo 11. (This is a group, since 11 is prime.) Consider the element
3. We see that 30 = 1, 31 = 3, 32 = 3 3 = 9, 33 = 9 3 = 5, 34 = 5 3 = 4,
35 = 4 3 = 1. Therefore the order of 3 in the group is 5. The subgroup H = h3i
generated by 3 is {1, 3, 9, 5, 4}: any power of 3, modulo 11, is an element of this set.
The coset 2 H is equal to {2 h | h H} = {2, 6, 7, 10, 8}. Notice that every
element of the group is in either H or 2 H: i.e., the two cosets do form a partition
of Z11 . Since the group has 10 elements, any subgroup must have order 1, 2, 5 or 10,
and the order of any element must also be 1, 2, 5 or 10. Also we see that a10 = 1 in
Z11 , for any a Z11 : this translates to saying that, for any integer a that is not a
multiple of 11, a10 1 modulo 11.
Example 14.6 Suppose that G is any group such that |G| = 27. Then G has a
subgroup of order 3. For, suppose that 1 6= x G. Then the order of x is greater than
1 and divides 27, and is therefore 3, 9 or 27. If x has order 3, then hxi is a subgroup
of order 3. If x has order 9, then hx3 i is a subgroup of order 3. If x has order 27,
then hx9 i is a subgroup of order 3. So there is certainly some subgroup of order 3.

Learning outcomes
At the end of this chapter and the Essential reading and activities, you should be able
to:
explain what is meant by a homomorphism and be able to prove that a mapping is
a homomorphism
prove some standard properties of homomorphisms
explain what is meant by an isomomorphism and be able to prove that a mapping
is an isomomorphism

14

214

14.5. Sample examination questions

explain what it means to say that two groups are isomorphic and be able to prove
that groups are isomorphic
demonstrate that you know what is meant by a coset and be able to determine
these for a particular subgroup
understand the proof of Lagranges theorem
demonstrate that you understand Lagranges theorem
apply Lagranges theorem to various problems.

Sample examination questions


Question 14.1
1 x

, for x R, where the


0 1
group operation is matrix multiplication. (See Exercise 13.2.) Prove that H is
isomorphic to the group H = (R, +) of real numbers under the operation of addition.
Let G be the group consisting of matrices of the form

Question 14.2
Suppose that : G H is a homomorphism. Prove that is injective if and only if
ker = {e} (that is, the kernel of is the trivial subgroup, consisting of just the identity
element of G).
Question 14.3
Let C denote the complex numbers. Let z = e(2/n)i C. Show that any cyclic group of
order n is isomorphic to the group hzi (where the operation in hzi is multiplication of
complex numbers).
Question 14.4

Let G = {a + b 2 | a, b Q} R and H = {a + bi | a, b Q} C. Show that the


groups (G, +) and (H, +) are isomorphic. (You may assume these are indeed groups.)
Question 14.5
Prove that the direct product G H of two groups has a subgroup isomorphic to G and
a subgroup isomorphic to H. (The direct product is defined in Exercise 12.10.)
Question 14.6
Suppose that G is any group and that to each element x G we associate a function
fx : G G defined by fx (g) = xg. Prove that, for all x, fx is a bijection. Let (H, ) be
the group of bijections of G with the operation of composition of functions, and define
: G H by (x) = fx . Prove that is an injective group homomorphism.
[This proves a famous result known as Cayleys theorem, which states that any group is
isomorphic to a group of permutations.]

215

14

14. Homomorphisms and Lagranges theorem

Question 14.7
Suppose that G is a group. An isomorphism from G to itself is called an automorphism.
Prove that the set Aut(G) of all automorphisms of G is a group under the operation of
composition of functions. For each x G, prove that x : G G given by
x (g) = x1 gx is an automorphism of G, and let Inn(G) = {x | x G} denote the set of
all such automorphisms (known as inner automorphisms). Prove that Inn(G) 6 Aut(G).
Question 14.8
Suppose that G = (Rn , +) is the set of real n-vectors with the operation of vector
addition. (You may assume this is an Abelian group.) Suppose that w Rn and let
H = {x Rn | wT x = 0}.
Prove that H is a subgroup of G. Suppose now that n = 3. Describe geometrically the
cosets of H.
Question 14.9
Suppose that G = (Z7 , ). Find the order of each element of G. Which elements are in
the subgroup H = h6i generated by 6? Find all the left cosets of H in G. For which of
m = 1, 2, 3, 4, 5, 6 is there a subgroup of G of order m? Justify your answers.
Question 14.10
Suppose that elements x, y of a group G have orders m and n respectively, where
gcd(m, n) = 1. Prove that hxi hyi = {e}. Hence show that if x and y commute with
each other (that is, xy = yx), then xy has order mn.
Question 14.11
A finite group contains elements of every order up to and including 12. What is the
least possible order of the group?
Question 14.12
Suppose that G is a group and that H 6 G. Prove that, for x G, the coset xH is
equal to H if and only if x H.
Question 14.13
Suppose that H 6 G and that |G| = n|H|, where n N. (We know, by Lagranges
theorem, that there is such an n.) Let x G. Prove that there is some k N with
1 k n such that xk H.
[Hint: consider the left cosets xi H for i = 1, 2, . . . , n. Use the fact that there are
precisely n distinct left cosets. Exercise 14.12 will also be useful.]
Question 14.14
Suppose that G is an Abelian group and that H 6 G. Let K be the set {xH | x G} of
cosets of H in G. This is a finite set of cardinality n = |G|/|H|. Define an operation ?
on the set K as follows: if C1 , C2 K and C1 = x1 H, C2 = x2 H, then
C1 ? C2 = (x1 x2 )H. Prove, first, that this is a well-defined operation. (The potential

14

216

14.5. Comments on selected activities

problem is that for cosets C1 , C2 K, there will be different choices of x1 and x2 such
that C1 = x1 H and C2 = x2 H, and it looks as if C1 ? C2 depends on that choice. Show
that it does not.) Prove that (K, ?) is a group.
[This is an important construction. Generally, K is denoted by G/H and is called the
quotient group. A similar construction is possible when the group G is not Abelian. But
in this case, H has to have an additional property, to the effect that xH = Hx for all x.
Such a subgroup is termed a normal subgroup.]

Comments on selected activities


Feedback to activity 14.1
Suppose that x, y ker . Then (x) = (y) = eH and hence
(xy) = (x)(y) = eH eH = eH ,
which shows xy ker . Also,
(x1 ) = ((x))1 = e1
H = eH ,
so x1 H. Thus ker 6 G.
Now consider (G). If x, y (G) then there are x0 , y 0 G with x = (x0 ) and y = (y 0 ).
Then
xy = (x0 )(y 0 ) = (x0 y 0 ),
which shows that xy (G). Also,
x1 = ((x0 ))1 = ((x0 )1 ),
which shows that x1 (G). Thus (G) 6 H.

Sketch answers to or comments on sample questions


Answer to question 14.1
Think about how multiplication of these matrices works:
!
!
!
1 x
1 y
1 x+y
=
.
0 1
0 1
0
1
This suggests that matrix multiplication in this group is essentially the same as the
action of adding the top-right entries. This suggests, therefore, the following possible
isomorphism:
!!
1 x

= x.
0 1
This is easily seen to be a bijection: it is injective because, for A, B G,
!
1 z
(A) = (B) = z = A = B =
.
0 1

217

14

14. Homomorphisms and Lagranges theorem

It is surjective because, for any real number x,


x=

1 x
0 1

!!
.

It is a homomorphism because
!
!!
!!
1 x
1 y
1 x+y

=
=x+y =
0 1
0 1
0
1

1 x

!!

0 1

1 y

!!

0 1

As a bijective homomorphism, is therefore an isomorphism, so the groups are


isomorphic.
Answer to question 14.2
We certainly have {e} ker . Suppose first that is injective. Then, x ker implies
(x) = 1 = (e) which, by injectivity, implies x = e. So ker = {e}. Conversely, suppose
ker = {e}. Then,
(x) = (y) =
=
=
=

(xy 1 ) = (x)(y 1 ) = (x)((y))1 = (x)((x))1 = e


xy 1 ker = {e}
xy 1 = e
x = y.

So is injective.
Answer to question 14.3
Let G = hxi be a cyclic group of order n (so the generator x has order n). Then, define
: G hzi by: if g = xm where 1 m n, then (g) = z n . (This defines on all of G
because every element of G takes this form.) The mapping is injective because, for r, s
between 1 and n, (xr ) = (xs ) implies z r = z s , which means (since r, s lie between 1
and n) that r = s. It is surjective because it is injective and the groups have the same
cardinality. The mapping is a homomorphism because
(xr xs ) = (xr+s ) = z r+s = z r z s = (xr )(xs ).
Answer to question 14.4

Define : G H by (a + b 2) = a + bi. Then

(a + b 2) = (c + d 2)

a + bi = c + di
a = c, b = d

a + b 2 = c + d 2,

so is injective. It is surjective because, for a, b Q, a + bi = (a + b 2). It is a


homomorphism because (recall the operations are addition here),

((a + b 2) + (c + d 2)) = ((a + c) + (b + d) 2)


= (a + c) + (b + d)i
= (a + bi) + (c + di)

= (a + b 2) + (c + d 2).

14

218

14.5. Sketch answers to or comments on sample questions

The groups are therefore isomorphic.


Answer to question 14.5
We prove there is a subgroup isomorphic to G. The proof that there is one isomorphic
to H is very similar. Define K = {(g, e) | g G}. Then it is easy to see that
K 6 G H: for,
(g, e) ? (h, e) = (gh, e) K
and
(g, e)1 = (g 1 , e) K.
Consider the mapping : G K given by (g) = (g, e). This is clearly a bijection and
it is an isomorphism because
(gh) = (gh, e) = (g, e) ? (h, e) = (g) ? (h).
It follows that K
= G.
Answer to question 14.6
We have
fx (g) = fx (h)

xg = xh
x1 xg = x1 xh
eg = eh
g = h,

so fx is injective. For each x, fx is surjective, for given any g G, we have


g = fx (x1 g), since x(x1 g) = (xx1 )g = eg = g. Now, suppose : x 7 fx . We first
show is a homomorphism. Let xy G. We need to prove that (xy) = (x) (y);
that is, that fxy = fx fy . Now, for all g G, fxy (g) = (xy)g and
(fx fy )(g) = fx (fy (g)) = fx (yg) = x(yg) = (xy)g,
so we have what we need. Now we show that is bijective. First, we note that if
(x) = (y) then fx = fy and, in particular, fx (e) = fy (e), so that xe = ye; that is,
x = y. So is injective. To expand a little on the comment about the meaning of this
result, let (G) be the image of G. Then clearly, since it is injective, is a bijection from
G to (G). Therefore it is an isomorphism between G and the group (G) of
permutations of G. [Recall that (G) is a subgroup of H for any homomorphism from G
to H.]
Answer to question 14.7
Suppose f, g Aut(G). Then f and g are bijections from G to G. It follows by standard
properties of functions that f g is also a bijection from G to G. Explicitly,
(f g)(x) = (f g)(y)

f (g(x)) = f (g(y))
g(x) = g(y)
x = y,

219

14

14. Homomorphisms and Lagranges theorem

where we have used (in turn) injectivity of f and g. Also, by surjectivity of f , for any
x G there is y G such that f (y) = x; then, by surjectivity of g, there is z such that
g(z) = y. Then we have (f g)(z) = f (g(z)) = f (y) = x. In this way we see that f g is
surjective. So, since f, g G = f g G, G is closed under the operation of
composition. The identity mapping i(x) = x is in G and acts as an identity in the
group, since f i = i f = f for all f G. Since composition of functions is associative,
we know that composition in G is associative. For any f G, the inverse function f 1
exists, because f is a bijection. Moreover, since f 1 is itself a bijection from G to G, we
have f 1 G. Thus Aut(G) is a group.
Now, fix any x G. We show that x Aut(G). This means we need to show that x is
a bijection from G to G. First, we have
x (g) = x (h) =
=
=
=

x1 gx = x1 hx
xx1 gxx1 = xx1 hkx1
ege = ehe
g = h,

so x is injective. Also, since g = x1 (xgx1 )x = x (xgx1 ), and since xgx1 G, we


see that x is surjective. So x Aut(G).
Now we show that Inn(G) = {x | x G} is a subgroup of Aut(G). Weve already seen
that it is a subset, so it remains to prove closure of Inn(G) under composition and
taking of inverses. First, suppose we take any two functions in Inn(G). For some
x, y G, these will be x and y . Then, for g G,
(x y )(g) =
=
=
=
=
=

x (y (g))
x (y 1 gy)
x1 (y 1 gy)x
(x1 y 1 )g(yx)
(yx)1 g(yx)
yx (g),

so x y = yx also belongs to Inn(G). Furthermore,


(x x1 )(g) = x (xgx1 ) = x1 xgx1 x = g = i(g),
so x x1 = i. Similarly, x1 x = i, showing that (x )1 = x1 Inn(G).
Answer to question 14.8
To show that H is a subgroup, we just need to prove that if x, y H then x + y H
and the inverse x of x is in H. Now, x, y H means wT x = 0. So,
wt (x + y) = wT x + wT y = 0 + 0 = 0
and hence x + y H. Also,
wT (x) = wT x = 0 = 0,
so x H. The cosets of H are the sets x + H. (The group operation is addition and,
since the group is Abelian, there is no distinction between left and right cosets.) In the

14

220

14.5. Sketch answers to or comments on sample questions

case n = 3, H is a 2-dimensional hyperplane passing through the origin (that is, a


2-dimensional subspace), and the cosets are the (infinitely many) planes parallel to this
one.
Answer to question 14.9
The identity element 1 has order 1, 6 has order 2, 2 and 4 have order 3, and 3 and 5
have order 6. H = {1, 6}, the left cosets are H itself, 2H = {2, 5} and 3H = {3, 4}.
Since |G| = 6, the sizes of all subgroups of G divide 6, by Lagranges theorem. That
rules out 4 and 5 as possible subgroup sizes. Examples of all the other sizes exist:
{1}, {1, 6}, {1, 2, 4} and {1, 2, 3, 4, 5, 6} are subgroups of sizes 1, 2, 3 and 6 respectively.
Dont try to say by Lagranges theorem, there is a subgroup of order 3: Lagranges
theorem says nothing of the sort: it says that if H 6 G then |H| | |G|, but it does not
say that if m | |G| then there is a subgroup of order m. (Indeed, as mentioned in the
notes, this is not the case.) Also, it is not in general true that every subgroup of a group
is of the form hgi for some g G.
Answer to question 14.10
By Lagranges theorem, if H = hxi hyi, then, since H is a subgroup of the groups hxi
and hyi, which are of order m and n, we have |H| | m and |H| | n. But because
gcd(m, n) = 1, this means |H| = 1 and hence H = {e}. Suppose xy = yx. Then
(xy)mn = xmn y mn = (xm )n (y n )m = en em = ee = e,
so the order of xy is at most mn. Suppose that (xy)r = e for r N. Then xr y r = e and
so xr = (y r )1 = z, say. Clearly, z hxi. Also, since y r hyi and hyi is a subgroup, we
also have z = (y r )1 hyi. So z H. But this means z = e. We therefore have xr = e,
from which it follows that m | r. Also, y r = z 1 = e1 = e and hence n | r. Since
gcd(m, n) = 1 and m | r and n | r, it follows that mn | r. So r mn. This, together
with the earlier observation that (xy)mn = e, establishes that the order of xy is mn.
Answer to question 14.11
The order of any element of the group divides |G|, the order of the group. (This follows
from Lagranges theorem, because for each x G, hxi 6 G, and the order of the
subgroup hxi is the order of x.) So, if |G| = n then, given the information in the
question, we must have k | n for all 1 k 12. It follows that n m, where m is the
least common multiple of 2, . . . , 12. The prime decompositions of these numbers are
2, 3, 22 , 5, 2 3, 7, 23 , 32 , 2 5, 11, 22 3.
The smallest number which these all divide is
23 32 5 7 11 = 27720.
This does not prove that there is a group of order 27720 which has subgroups of all
orders between 1 and 12. But this is the case. To see why, let Cm denote a cyclic group
of order m. Then the direct product
G = C2 C2 C2 C3 C3 C5 C7 C11

221

14

14. Homomorphisms and Lagranges theorem

is an Abelian group with subgroups of all orders between 1 and 12. (See Exercise 12.10
for the definition of the direct product of two groups. The direct product of more than
two groups is defined in the obvious manner, generalising this.) For example, a
subgroup of order 9 is given by
{(e, e, e, x, y, e, e, e) | x, y C3 }
and a subgroup of order 11 is
{(e, e, e, e, e, e, e, x) | x C1 1}.
Answer to question 14.12
Suppose first that x H. Then for all h H, xh H, so xH = {xh | h H} H.
Furthermore, for any h H, h = x(x1 h) and x1 h H, so h xH. Therefore
xH = H. Conversely, suppose that xH = H. Then, since e H, xe H. But this just
says x H. So were done.
Answer to question 14.13
Following the hint, consider the left cosets xi H for i = 1, 2, . . . , n. There are precisely n
distinct left cosets (because |G| = n|H| and the cosets are disjoint and each of
cardinality |H|). Now, the list
xH, x2 H, . . . , xn H
might contain repetitions. If it does not, then since H = eH is a coset, one of the cosets
in this list must be H: so, for some k, xk H = H. By the result of Exercise 14.12, this
means xk H. Now, if the list does contain repetitions then, for some r, s with r < s,
xr H = xs H. Then, xsr H = H and, therefore, xsr H. So in this case xk H where
k = s r.
Answer to question 14.14
Suppose that x1 H = x2 H and y1 H = y2 H. To show that the operation ? is well-defined,
we need to show that (x1 y1 )H = (x2 y2 )H.
Now, generally, aH = bH if and only if a1 b H. To see this, first suppose that
a1 b H. Then, for any x bH, we have x = bh for some h H. Then,
x = bh = (aa1 )(bh) = a(a1 b)h = a((a1 b)h) = ah1 ,
where h1 = (a1 b)h H since a1 b H. So x aH. Now we could invoke the fact that
the left cosets are disjoint to deduce that aH = bH. Alternatively, we can proceed
directly, as follows: let x aH, and suppose x = ah where h H. Then x = b(b1 a)h.
Now, a1 b H and H is a subgroup, so (a1 b)1 H; that is, b1 a H. So
(b1 a)h H and hence x bH.
1
Now, since x1 H = x2 H, h1 = x1
1 x2 H and, since y1 H = y2 H, h2 = y1 y2 H. We
need to show that (x1 y1 )H = (x2 y2 )H. By the above, this means we must show that
(x1 y1 )1 (x2 y2 ) H. Now,
1
1
(x1 y1 )1 (x2 y2 ) = (y11 x1
1 )x2 y2 = (x1 x2 )(y1 y2 ) = h1 h2 H,

14

222

14.5. Sketch answers to or comments on sample questions

where we have used the fact that G is Abelian. So the operation ? is well-defined.
Clearly, by definition, K is closed under ?, since the result of performing ? is a coset.
One of the cosets xi H must equal H, since H = eH is a coset of H. Then, this is the
identity element of K. For, for all xH K, xH ? H = xH ? eH = (xe)H = xH, and
H ? xH also equals xH. The inverse of xH is x1 H. The operation ? is associative
because the group operation (mulitplication) on G is. So (K, ?) is a group.

223

14

14. Homomorphisms and Lagranges theorem

14

224

Appendix A
Sample examination paper
Important note: This Sample examination paper reflects the examination and
assessment arrangements for this course in the academic year 20102011. The format
and structure of the examination may have changed since the publication of this subject
guide. You can find the most recent examination papers on the VLE where all changes
to the format of the examination are posted.

The format
Note carefully the format of the Sample examination paper. There are eight questions.
Four of these are from the Numbers and proof section of the course and the
remaining four are from the Analysis and Algebra parts. The first four questions (on
the Numbers and proof topics) form Section A. Section B consists of four questions
on Analysis and Algebra (not necessarily two whole questions on each of these two
topics). You are asked to answer six of the eight questions. What will count are your
best three questions from each section. So you should aim to write three good answers
from each section. It would not, for example, be a good idea to do all four questions
from Section A and only two from Section B, because only the best three of your
section A questions would count: the fourth would not contribute to your mark.

225

A. Sample examination paper

Sample examination paper


The following Sample examination paper is the 2007 Zone A paper. It is provided to
indicate the intended format of the examination. Sketch solutions are also given.
Time: 3 hours
This examination has two sections, Section A and Section B.
Attempt THREE questions in Section A. (If more than three are attempted, only your
best three will be taken into account.)
Attempt THREE questions in Section B. (If more than three are attempted, only your
best three will be taken into account.)
Each question carries the same number of marks.

SECTION A
1(a)

A positive integer z is said to be a perfect square if there is some a N


such that z = a2 . Consider the statement:
S: if n is prime then n + 1 is not a perfect square.
Explain what is meant by saying that n = 3 is a counterexample to S.
Write down the contrapositive of S and hence show that S is true for
all integers n > 3. [Do not try to use proof by induction.]

(b)

The statement T is as follows:


T : for all positive integers x, x2 + x + 41 is prime.
Write down the negation of T . Show that the negation is true by
finding a value of x such that x2 + x + 41 is a multiple of 41.

(c)

Construct the truth table for the expression p (q r). Is this


expression logically equivalent to (p q) (p r)?

The function f : N {0} N {0} is defined recursively by: f (0) = 0


and, for n 1, f (n) = 2f (n 1) + n.
Prove by induction that, for all n 0, f (n) = 2n+1 n 2. Explain
clearly why each of the following statements is true:

226

(i)

f is not a surjection;

(ii)

f is an injection.

A
Prove or disprove the following statements:

3(a)

(iii)

If f (n) is a prime then n is an odd number.

(iv)

If n is an odd number greater than 1, then f (n) is a prime.

What does it mean to say that integer a is divisible by integer b?


Which integers a are divisible by 0? For which integers b is 0 divisible
by b? Justify your answers.
What is the greatest common divisor, gcd(a, b), of integers a and b?
Let d be the greatest common divisor of a and b and suppose that
c = a + mb for some integer m. Show that gcd(b, c) = d.
Use the Euclidean algorithm to find the greatest common divisor d of
1155 and 882. Find also integers m, n such that d = 1155m + 882n.

(b)

4(a)

The relation T is defined on the set Z of integers by: aT b if and only if


either ab > 0 or a = b = 0. Show that T is an equivalence relation.
What are the equivalence classes?

Find all solutions in Z6 of the system of equations


2x + y = 3
x + 2y = 3.

(b)

Find positive integers r and s such that r/s is equal to the repeating
decimal
0.30024.
[You need not give the answer in its lowest terms.]

(c)

Find all the complex numbers that satisfy the quadratic equation
z 2 (2 + 4i)z 3 = 0.
Your answers should be in the form a + ib, where a and b are real
numbers.

227

A. Sample examination paper

SECTION B
Answer any three questions from this section.

5(a)

Suppose A is a bounded set of real numbers. What is meant by the


supremum sup A of A? What is meant by the infimum, inf A, of A?
Suppose that B = {1 a : a A}, where A is a bounded set of real
numbers. Prove that B is bounded above and that
inf B = 1 sup A.

(b)

What, precisely, does it mean to say that a sequence (xn ) of real


numbers has limit L as n tends to infinity?
Let

2n3 n2 + 3
.
2n3 n + 1
considering |xn 1| and using the formal definition of a limit, prove
that xn 1 as n .
xn =

(c)

6(a)

A sequence
(xn ) of numbers is defined as follows: x1 = 1 and, for n 2,

xn = 2xn1 . Prove that xn 2 for all n and that the sequence (xn ) is
increasing. It follows that the sequence converges. Determine the limit
of the sequence.

Find the limit as n of the sequence (xn ), where


2

xn =

n
X

i=1

1
n4 + i

1
n4 + 1

1
n4 + 2

+ +

1
.
n4 + n2

(b)

What, precisely, does it mean to say that the function f : R R tends


to the limit L as x tends to a? Suppose that f, g : R R and that, as
x a, f (x) L and g(x) M . Prove that f (x)g(x) LM as x a.

(c)

What does it mean to say that a function f : R R is continuous at


a R?
Suppose that f : R R is continuous at a R and that the sequence
(xn ) converges to a. Prove that f (xn ) f (a) as n .

228

A
Suppose that f : R R is continuous at 0. Let g : (0, ) R be
defined by g(x) = f (1/x). Show that if g(n) > 0 for all positive integers
n, then f (0) 0.

Suppose that G is a set and that is a binary operation on G. What


does it mean to say that (G, ) is a group? What does it mean to say
that the group is Abelian?
Suppose that R = R \ {0} is the set of nonzero real numbers. Let
G = R R be the set of all ordered pairs (x, y) where x R and
y R. Define a binary operation on G by:
(a, b) (c, d) = (ac, ad + c1 b).
Prove that (G, ) is a group.
Is (G, ) an Abelian group? Justify your answer.

8(a)

Suppose that G is a group (and that the group operation is


multiplication). What does it mean to say that H is a subgroup of G?
Let H = {x G : xg = gx for all g G}. Prove that H is a subgroup
of G.

(b)

State Lagranges theorem.


Suppose that G is the group Z37 of non-zero elements of Z37 , where the
group operation is multiplication.
Prove that the order of the element 2 in G is 36.
Describe a subgroup H of G of order 6. What can you say about the
cardinalities of the right cosets of H? How many such cosets are there?

229

A. Sample examination paper

230

Appendix B
Sketch solutions to the sample
examination paper

We now give some brief comments on the questions. Items such as definitions have not
always been given in full, since these may be found in the subject guide. (This is not to
suggest that the definitions are not important: they are absolutely critical.) What
follows are comments and sketch solutions and sometimes simply answers not full
model solutions.
1. (a) To say that n = 3 is a counterexample means that it is prime but it is not the
case that n + 1 = 4 is a perfect square (because it is: 4 = 22 ).
The contrapositive of S is:
If n + 1 is a perfect square, then n is not prime.
The contrapositive of S is logically equivalent to S, and so if we can prove that, for
n > 3, the contrapositive is true, it follows that S is true also for n > 3.
Now, suppose n > 3 and n + 1 is a perfect square, so that n + 1 = a2 for some positive
integer a. Then
n = a2 1 = (a + 1)(a 1).
Since n > 3, we have a > 2 and hence a 1 and a + 1 are both greater than 1. This
shows n is not prime (because it can be written as n = rs where r, s are integers with
r, s > 1.) So the contrapositive (and hence S) is true for n > 3.
(b) The negation of T is:
There is some positive integer x such that x2 + x + 41 is not prime.
To show this is true, we only need to find such an x. Clearly, x = 41 will work, since in
this case x2 + x + 41 is divisible by 41 and is not, therefore, prime.
(c) In the usual way, described in the subject guide, we can form the truth tables for S1
and S2 , noting that the statements take the same values on all 8 possible true/false
combinations of p, q, r. (It turns out that each statement is true except when
pqr = T F F .)
2. The base case can be taken to be n = 0, and is easily checked. (If you like, you can
also check the case n = 1 but this is not necessary.) Now, assuming f (k) = 2k+1 k 2,
we have
f (k + 1) =
=
=
=

2f (k) + (k + 1)
2(2k+1 k 2) + (k + 1)
2k+2 k 3
2(k+1)+1 (k + 1) 2,

which establishes what is required. The result follows.

231

B. Sketch solutions to the sample examination paper

The easiest way to proceed for the next two parts is to note that the fact that
f (n) = 2f (n 1) + n, together with the fact that f (n) 0 for all n, shows that f is a
strictly increasing function. (The formula we have derived for f can also be used, but I
think it is easier to use the fact that f is increasing.)
(i) We havef (0) = 0, f (1) = 1, f (2) = 4. Since f is increasing, there is therefore no n
such that f (n) = 3, so f is not surjective.
(ii) Injectivity is easy to show, given that f is strictly increasing: for,
x > y f (x) > f (y) f (x) 6= f (y)
and
y > x f (y) > f (x) f (x) 6= f (y),
so
x 6= y f (x) 6= f (y).
(iii) Suppose f (n) is prime. This means that 2n+1 n 2 is prime. But if n is even then
this is even, and so it is not prime. Hence we must have n odd. The statement is true.
(iv) Lets try a few values of n. We have f (3) = 24 3 2 = 11, which is prime;
f (5) = 26 5 2 = 57, which is not prime. Weve therefore found a counterexample and
the statement is false.
3. (a) a is divisible by b if there is an integer k such that a = bk.
For a to be divisible by 0 means that there is k Z such that a = k0. But this means
a = 0, so only 0 is divisible by 0.
For any integer b, 0 is divisible by b because we can write 0 as b0.
gcd(a, b) is the positive integer d with the properties that (i) d divides a and b and (ii) if
c divides a and b then c d.
Suppose c = a + mb and let D = gcd(b, c) and d = gcd(a, b). We want to show d = D.
First, since d | a and d | b, we have that d | (a + mb) = c. So d is a divisor of b and c and
hence we must have d D. On the other hand, since D | b and D | c = a + mb, it follows
that D | (a + mb) mb = a, so D is a divisor of both a and b and hence D d. So we
must have D = d.
The next part is a standard calculation of which there are examples in the subject guide
and textbooks, so I omit the details. You should find that gcd(1155, 882) = 21 and that
suitable m, n are m = 13 and n = 17.
(b) T is reflexive because aT a means that a2 > 0 if a 6= 0, which is true.
T is symmetric because
aT b ab > 0 or a = b = 0 ba > 0 or a = b = 0 bT a.
T is transitive. For, suppose that aT b and bT c. If a = b = 0 then we have, because bT c,
b = c = 0 and hence a = c = 0 and aT c. Otherwise, ab > 0 and bc > 0 and so
(ab)(bc) > 0, or ab2 c > 0. Since b > 0 this implies ac > 0 and hence aT c.
If n N then aT n an > 0, so the equivalence class containing n is all positive
integers, N. If n < 0 then the equivalence class containing n is all negative integers

232

{n : n N}. If n = 0, then the only a such that aT n is a = 0 and hence {0} is an


equivalence class. So these are the three equivalence classes. (Note that they partition
Z, as they ought to.)
4. Multiplying the first equation by 2 gives 4x + 2y = 6. Subtracting the second
equation gives 3x = 3. One obvious solution is x = 1, but there are others. Checking
each element of Z6 in turn, other solutions are 3 and 5. (For example, in Z6 , 3(3) = 3
because 3(3) = 9 3 (mod 6).) Now, we have y = 3 2x. Taking the three values of x
in turn gives 1, 3, 7 = 1, 3, 5. So potential solutions are
(x, y) = (1, 1), (3, 3), (5, 5).
For a complete answer, it should be checked whether each of these is in fact a solution,
and they all are indeed solutions of the system.
(b) This is fairly standard. Let x = 0.30024. Then
3
+ 0.00024.
10

x=

Let y = 0.00024. Then 100y = 0.024 and so 100y y = 0.024. That is,
99y =
and hence
y=

24
1000

24
.
99000

So,
3
24
29724
+
=
.
10 99000
99000
It is acceptable to leave the answer like this.
x=

(c) We can solve this by completing the square or by using the formula for the solutions
of a quadratic. (Indeed, the latter method derives from completing the square, so these
are equivalent.) What we find, and many candidates did this,
was that the solutions are
of the form 1 + 2i + w, where w2 = 4i. (Or, if you like, w = 4i, though that is a little
sloppy.) But the answer should not be left like this because the question asks that we
find the answers in the form a + ib. Now, expressing 4i in polar form,
4i = 4ei/2+2k
for k Z, and we want to find w such that w2 = 4i. So
w = 2ei/4+k
which gives two distinct solutions:
w = 2e

i/4


=2

and
i5/4

w = 2e


=2

1
1
+ i = 2 + 2i,
2
2

1 1
+ i = 2 2i.
2
2

233

B. Sketch solutions to the sample examination paper

So there are two solutions:

z = (1 + 2i) +
and
z = (1 + 2i)

2 + 2i = (1 + 2) + (2 + 2)i

2i = (1 2) + (2 2)i.

5. (a) For the definitions, see the subject guide. Suppose that = sup A is the
supremum of A. Then, for all a A, a and hence 1 a 1 . Thus, for all b B,
b 1 , showing that B is bounded below and that a lower bound is 1 . Since B is
bounded below, = inf B exists and, since 1 is a lower bound and, by definition, is
the greatest lower bound, we have 1 . We need to show that, in fact, = 1 .
There are several approaches, and you may find similar examples in the subject guide.
Perhaps the easiest way is to notice that A = {1 b : b B}. Since, for all b B, we
have b inf B = , we have 1 b 1 and hence for all a A, a 1 . It follows
that = sup A 1 which is equivalent to 1 . Combined with the earlier
inequality 1 , we have the desired equality: inf B = = 1 = 1 sup A.
(b) The required definition is that xn L as n if and only if for each  > 0 there
is N such that n > N implies |xn L| < . This is the precise definition and nothing
less precise than this will suffice. Using the formal definition means showing that for
any  > 0 there is some N (which will depend on  and which we will produce) such
that if n > N then |xn 1| < . (It is not appropriate to use results on the algebra of
limits here because the question specifically asks us to use the formal definition.) Now,

2
n + n + 2

|xn 1| = 3
2n n + 1
and we want to show this will be less than a given  provided n > N for some N . There
is no need to find the smallest suitable N , so what we will do is bound the quantity
|xn 1| above by something simple and make sure this latter quantity is smaller than .
Now,
2

2
2
2
2
n + n + 2

n + n + 2 n + n + 2n = 4 .
2n3 n + 1 2n3 n + 1
2n3 n3
n
(Here, we have used the facts that, for the numerator, n2 + n + 2 n2 + n2 + 2n2 and,
for the denominator, 2n3 n + 1 2n3 n3 + 0 = 2n3 n3 .) This tells us
|xn 1| 2/n and we can ensure this is smaller than  if n > 4/. So xn 1 as
n . Note the way the inequalities work: to be sure that |xn 1| < , we bound
|xn 1| from above by some simpler sequence bn , and we then solve bn < . And, we
bound |xn 1| by above by upper bounding its numerator and lower bounding its
denominator. Note also that we do this in such a way that solving bn <  becomes easy.
(c) We can use induction to show that xn 2 for all n. It is true when n = 1 because
x1 = 1. Assuming xk 2, we have
p

xk+1 = 2xk 2(2) = 2,


as required.
There is more than one way to show the sequence is increasing. For instance,
r

xn+1
2xn
2
=
=
1,
xn
xn
xn

234

because xn 2, and this shows xn+1 xn . Or, we could see that

xn+1 xn 2xn xn 2xn x2n 2 xn ,

which is true.
So, as an increasing sequence that is bounded above, it converges.
Suppose the limit is

2x

2L and hence we
L. Because xn
L,
we
also
have
x

L.
But
x
=
n
n+1
n+1

2
must have L = 2L. So 2L = L and L(L 2) = 0. But we cannot have L = 0 because
xn 1 for all n (since it is increasing and x1 = 1). So L = 2.
7. (a) The given sequence xn satisfies
n2
so

1
1
,
xn n2
n4 + n2
n4 + 1
n2
n2
.
xn
n4 + n2
n4 + 1

Now,
n2
1

=p
1
n4 + n2
1 + (1/n2 )
and

n2
1
1,
=p
n4 + 1
1 + (1/n4 )

so, by the Sandwich Theorem, xn 1 as n .


(b) The required definition is that f (x) L as x a if and only if for each  > 0 there
is > 0 such that 0 < |x a| < implies |f (x) L| < . Let  > 0. Now,
|h(x) LM | = |f (x)g(x) LM |
= |f (x)g(x) Lg(x) + Lg(x) LM |
|g(x)||f (x) L| + L|g(x) M |.
Since g(x) M as x a, there is 1 > 0 such that 0 < |x a| < 1 implies
|g(x) M | < 1, which means that, in particular, |g(x)| < 1 + |M |. Then,
|g(x)||f (x) L| + L|g(x) M | (1 + |M |)|f (x) L| + L|g(x) M |.
Suppose L 6= 0. Now, there are 2 , 3 > 0 such that 0 < |x a| < 2 implies
|f (x) L| < /(2(1 + |M |))
and 0 < |x a| < 3 implies |g(x) M | < /(2L). Let = min(1 , 2 , 3 ), the smallest of
1 , 2 and 3 . Then 0 < |x a| < implies
|f (x)g(x) LM | < (1 + |M |)



+L
= .
2(1 + |M |)
2L

This shows that f (x)g(x) LM as x a. (The proof when L = 0 is less complicated,


and doesnt need some of the above steps.)

235

B. Sketch solutions to the sample examination paper

(c) For the definitions and the proof, see the subject guide.

For the final part, we have that, for all n N, g(n) > 0 and hence f (1/n) > 0. Now,
1/n 0 as n and so, by continuity,
f (0) = lim f (1/n) 0,
n

since the limit of a sequence of positive numbers is non-negative.


7. (G, ) is a group if:
1. x, y G, x y G.
2. e G such that x G, e x = x e = x.
3. x, y, z G, (x y) z = x (y z).
4. x G, x1 G such that x x1 = x1 x = e.
(G, ) is an Abelian group if, additionally, x y = y x for all x, y G.
We need to check each of these properties in turn to verify that (G, ) is a group.
First, (a, b) (c, d) = ac, ad + c1 b), and this belongs to G because both components are
real numbers and the first, ac is nonzero because a and c are.
We now need to find an identity element, e = (e, f ). Lets try to figure out what works.
The identity element must satisfy
(a, b) (e, f ) = (e, f ) (a, b) = (a, b)
for all a, b. In other words, we need to have, for all a, b,
(ae, af + e1 b) = (a, b) = (ea, eb + a1 f ).
It can be seen that (e, f ) = (1, 0) will work. So there is an identity, e = (1, 0).
To check associativity, we need to verify that for all a, b, c, d, e, f ,
(a, b) ((c, d) (e, f )) = ((a, b) (c, d)) (e, f ).
This involves careful calculation using the definition of the operation and you should
find that both products are equal to
(ace, acf + e1 ad + e1 c1 b).
What should the inverse x1 be, of x = (a, b)? It must be x1 = (c, d) where
(a, b) (c, d) = (c, d) (a, b) = e = (1, 0).
So we need
(ac, ad + c1 b) = (ac, cb + a1 d) = (1, 0).
So, certainly, we need ac = 1 and hence c = a1 (which exists since a 6= 0) and we also
need ad + c1 b = 0, which means d = a1 c1 b = b. (It is then also the case that
cb + a1 d = 0.) So x1 exists and equals (a1 , b).

236

The group is not Abelian because, for instance


(1, 2) (3, 4) = (3, 4 + 2/3)

and
(3, 4) (1, 2) = (3, 6 + 4),
and these are not equal. (There are many other examples we could give.)
8. (a) H is a subgroup if it is a subset of G that is itself a group with the same group
operation as G. To test that H is a subgroup, we need to verify that H 6= and that for
all x, y H we have xy H and x1 H.
It is important to be clear what we need to establish. Clearly H 6= , since 1 H, and
we must show that for every x, y H, xy H and x1 H.
Suppose x, y H. Then, for all g G, xg = gx and yg = gy. What do we need to do to
establish that xy H? What we need is that, for all g G, (xy)g = g(xy). Now, for
any g,
(xy)g = x(yg) = x(gy) = (xg)y = (gx)y = g(xy),
as required. To show that x1 H we need to show that for all g G, x1 g = gx1 .
Well, xg = gx, so
x1 (xg)x1 = x1 (gx)x1 ,
which means that
(x1 x)gx1 = x1 g(xx1 )
and hence gx1 = x1 g, as required. So H is a subgroup.
(b) Lagranges theorem states that if G is a finite group and H is a subgroup of G, then
|H| divides |G|. (In fact, |G| = k|H| where k is the number of left or right cosets of H in
G, and each coset has the same cardinality as H.)
With G = Z37 , we have |G| = 36. Now, the order of any element of G divides |G|. (One
way to observe this is that the subgroup generated by an element is of cardinality equal
to the order of the element, and this must divide |G| by Lagranges theorem. But it is
perfectly acceptable just to state this as a fact.) So if d is the order of 2 then we must
have d = 1, 6 or 36. Now, 21 = 2 6= 1 so d 6= 1. Also,
26 = 64 = 27 6= 1 (mod 37),
so d 6= 6. Therefore, d = 36.
A subgroup of order 6 is the group generated by 26 = 27; that is H = h27i. The number
of right cosets of H in G is |G|/|H| = 6. Each coset has the same cardinality as H,
which is 6.

237

Notes

Untitled-3 8

23/12/2008 10:39:12

Notes

Untitled-3 9

23/12/2008 10:39:12

Notes

Untitled-3 8

23/12/2008 10:39:12

Comment form 2010.qxp

10/11/2010

10:58

Page 1

Comment form
We welcome any comments you may have on the materials which are sent to you as part of your
study pack. Such feedback from students helps us in our effort to improve the materials produced
for the International Programmes.
If you have any comments about this guide, either general or specific (including corrections,
non-availability of Essential readings, etc.), please take the time to complete and return this form.
Title of this subject guide: ..............................................................................................................................
................................................................................................................................................................................
Name ......................................................................................................................................................................
Address ..................................................................................................................................................................
................................................................................................................................................................................
Email ......................................................................................................................................................................
Student number ......................................................................................................................................................
For which qualification are you studying? ..............................................................................................................
Comments
................................................................................................................................................................................
................................................................................................................................................................................
................................................................................................................................................................................
................................................................................................................................................................................
................................................................................................................................................................................
................................................................................................................................................................................
................................................................................................................................................................................
................................................................................................................................................................................
................................................................................................................................................................................
................................................................................................................................................................................
................................................................................................................................................................................
................................................................................................................................................................................
................................................................................................................................................................................
................................................................................................................................................................................
................................................................................................................................................................................
................................................................................................................................................................................
................................................................................................................................................................................
Please continue on additional sheets if necessary.
Date: ......................................................................................................................................................................
Please send your comments on this form (or a photocopy of it) to:
Publishing Manager, International Programmes, University of London, Stewart House, 32 Russell Square, London WC1B
5DN, UK.

Das könnte Ihnen auch gefallen