FPM Notes

Foundations of Pure Mathematics
Notes for 2010-11

Andrew Baker & Richard Steiner
[15/03/2011]
School of Mathematics & Statistics, University of Glasgow.
E-mail address: a.baker@maths.gla.ac.uk & r.steiner@maths.gla.ac.uk
Contents
Chapter 1. Introduction to Number theory
1.1. Properties of the natural numbers
1.2. Greatest common divisors and the Euclidean algorithm
1.3. Prime numbers and prime factorisation
1.4. Congruences to a modulus
1.5. Arithmetic operation on congruence classes
1.6. Linear congruences
1.7. Algebraic structures on Z and Z/m
1.8. Simultaneous linear congruences and the Chinese Remainder Theorem
1.9. Further examples of congruences
Chapter 2. Sets, functions, cardinality and equivalence relations
2.1. Sets
2.2. Functions
2.3. Induction and the Well Ordering Principle
2.4. Finite sets and cardinality
2.5. Infinite sets
2.6. Countable sets
2.7. Power sets and their cardinality
2.8. The real numbers are uncountable
2.9. Equivalence relations
Chapter 3. Permutations and groups
3.1. Permutations
3.2. Cycles
3.3. Even and odd permutations
3.4. Groups and subgroups
Chapter 4. Groups and symmetry
4.1. Some 2-dimensional vector geometry
4.2. Isometries of the plane
4.3. Types of isometries
4.4. The Euclidean group of the plane
4.5. Matrices and isometries
4.6. Symmetry groups of plane figures
Bibliography
Mathematics 2F: Exercises
Exercises on Chapter 1
Mathematics 2F: Hints and solutions for the exercises
Solutions for Chapter 1
CHAPTER 1
Introduction to Number theory
In this chapter we introduce some basic ideas in Number Theory, which can loosely be
described as the study of properties of the integers and of Diophantine problems, whose solutions
are required to be integers. Although Number Theory has been studied for thousands of years and
has traditionally been seen as very pure mathematics, it provides the basis for many modern
techniques underlying cryptography and computational methods. The proof of Fermat's Last
Theorem by Andrew Wiles in the 1990s (thus solving a problem which had been outstanding
for over three hundred years) was the culmination of an enormous amount of mathematical
development involving some of the best mathematicians of their times.
1.1. Properties of the natural numbers
The natural numbers 1, 2, . . . form the most basic type of number and arise when counting
elements of finite sets. We denote the set of all natural numbers by
\[ \mathbb{N} = \{1, 2, 3, 4, \ldots\} \]
and nowadays this is very standard notation. It is perhaps worth remarking that some people
include 0 in the natural numbers since the empty set has 0 elements and later we will think
of natural numbers as being the sizes of finite sets. We will use the notation $\mathbb{N}_0$ for the set of
all natural numbers together with 0:
\[ \mathbb{N}_0 = \{0, 1, 2, 3, 4, \ldots\} = \{0\} \cup \mathbb{N}. \]
We can add and multiply natural numbers to obtain new ones, i.e., if $a, b \in \mathbb{N}$, then $a + b \in \mathbb{N}$
and $ab \in \mathbb{N}$. Of course we have the familiar properties of these operations such as
\[ a + b = b + a, \quad ab = ba, \quad a + 0 = a = 0 + a, \quad a1 = a = 1a, \quad a0 = 0 = 0a, \quad \text{etc.} \]
We can also compare natural numbers using inequalities. Given $x, y \in \mathbb{N}$ exactly one of the
following must be true:
\[ x = y, \quad x < y, \quad y < x. \]
As usual, if one of $x = y$ or $x < y$ holds then we write $x \leq y$ or $y \geq x$. Inequality is transitive
in the sense that
\[ x < y \text{ and } y < z \implies x < z. \]
The most subtle aspect of the natural numbers to deal with is the fact that they form an
infinite set. We can and usually do list the elements of $\mathbb{N}$ in the never ending sequence
\[ 1, 2, 3, 4, \ldots. \]
One of the most important properties of N is
The Principle of Mathematical Induction (PMI): Suppose that for each $n \in \mathbb{N}$ the
statement $P(n)$ is defined and also the following conditions hold:
• $P(1)$ is true;
• whenever $P(k)$ is true then $P(k + 1)$ is true.
Then $P(n)$ is true for all $n \in \mathbb{N}$.
Here is another important property of N which may seem intuitively obvious.
The Well Ordering Principle (WOP): Every non-empty subset $S \subseteq \mathbb{N}$ contains a least
element.
Definition 1.1. A least or minimal element of a subset $S \subseteq \mathbb{N}$ is an element $s_0 \in S$ for
which $s_0 \leq s$ whenever $s \in S$. Similarly, a greatest or maximal element of $S$ is one for which
$s \leq s_0$ whenever $s \in S$. Notice that $\mathbb{N}$ has a least element 1, but has no greatest element since
for each $n \in \mathbb{N}$, $n + 1 \in \mathbb{N}$ and $n < n + 1$. It is easy to see that least or greatest elements of an
ordered set are always unique if they exist.
In fact, the two properties PMI and WOP are essentially interchangeable.
Theorem 1.2. PMI and WOP are logically equivalent, i.e.,
\[ \text{PMI} \iff \text{WOP}. \]
We will discuss this again in Section 2.3, but it will be convenient to use WOP for some
proofs as an alternative to using induction. Note that PMI and WOP are sometimes formulated
with $\mathbb{N}_0$ in place of $\mathbb{N}$.
1.2. Greatest common divisors and the Euclidean algorithm
Let a and b be integers. We say that a divides b or a is a factor of b or that b is a multiple
of a if there is an integer k such that b = ka. The notation $a \mid b$ means that a divides b; the
notation $a \nmid b$ means that a does not divide b; for example
\[ 3 \mid 15, \quad (-6) \mid 18, \quad 7 \mid 7, \quad 8 \nmid 20. \]
It is worth noting that if $0 \mid t$ then $t = 0$.
Proposition 1.3. If $d \mid a$ and $d \mid b$ then $d \mid (ma + nb)$ for all integers m and n.
Proof. Since $d \mid a$ and $d \mid b$, there are integers $k$ and $\ell$ such that $a = kd$ and $b = \ell d$. Then
\[ ma + nb = mkd + n\ell d = (mk + n\ell)d \]
with $mk + n\ell$ an integer, so $d \mid (ma + nb)$.
In general, let d be a positive integer and let n be any integer. Then we can divide n by d
to obtain an integer quotient q and an integer remainder r such that
\[ n = qd + r \]
and $0 \leq r < d$. Clearly d divides n if and only if the remainder is zero. For example, if d = 5
and n = 37 then
\[ n = 7 \times 5 + 2, \]
so the quotient is 7 and the remainder is 2.
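Division with remainder is easy to experiment with on a computer. The following is a minimal sketch in Python (our choice of language, not something these notes prescribe). The helper name `quotient_remainder` is hypothetical; it relies on the built-in `divmod`, which for a positive divisor d returns a remainder r with 0 ≤ r < d, even when n is negative.

```python
def quotient_remainder(n, d):
    """Return (q, r) with n == q*d + r and 0 <= r < d, for a positive integer d."""
    if d <= 0:
        raise ValueError("d must be a positive integer")
    # Python's floor division gives a remainder with the same sign as d,
    # so for d > 0 the remainder always lies in the range 0 <= r < d.
    q, r = divmod(n, d)
    return q, r


print(quotient_remainder(37, 5))    # (7, 2), the example in the text
print(quotient_remainder(-37, 5))   # (-8, 3), since -37 = (-8)*5 + 3
```

Note that d divides n exactly when the second entry of the returned pair is 0.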
Let a and b be integers which are not both zero. If $c \in \mathbb{Z}$ then c is called a common divisor
(or common factor) of a and b if $c \mid a$ and $c \mid b$. Recall that the greatest common divisor
(or highest common factor) of a and b, denoted gcd(a, b), hcf(a, b) or just (a, b), is the largest
positive common divisor of a and b. For example,
\[ \gcd(18, 24) = 6, \quad \gcd(18, 35) = 1. \]
The greatest common divisor (gcd) of a and b can also be characterised as the unique natural
number d for which
• d is a common divisor of a and b;
• if $c \in \mathbb{Z}$, $c \mid a$ and $c \mid b$, then $c \mid d$.
Recall also that gcd(a, b) can be found by the Euclidean algorithm, i.e., by repeated division,
and that as a by-product of the Euclidean algorithm one can express gcd(a, b) in the form
\[ \gcd(a, b) = ma + nb \]
for some integers m and n.
Example 1.4. Find the greatest common divisor d of the integers 888 and 481, and then
find integers m and n such that
\[ d = 888m + 481n. \]
Solution. We divide 888 by 481 to get a remainder 407, then divide 481 by 407 to get a
remainder 74, and so on. Eventually we get a zero remainder, and the greatest common divisor
is the last non-zero remainder. In detail this comes out as follows:
\begin{align*}
888 &= 1 \times 481 + 407, \\
481 &= 1 \times 407 + 74, \\
407 &= 5 \times 74 + 37, \\
74 &= 2 \times 37 + 0;
\end{align*}
therefore d = gcd(888, 481) = 37. To express 37 in the form 888m + 481n we now work backwards
through these equalities, expressing 37 in terms of 407 and 74, then in terms of 481 and 407,
and then in terms of 888 and 481:
\begin{align*}
37 &= 407 - 5 \times 74 \\
   &= 407 - 5(481 - 407) = 6 \times 407 - 5 \times 481 \\
   &= 6(888 - 481) - 5 \times 481 = 6 \times 888 - 11 \times 481.
\end{align*}
Therefore 37 = 888m + 481n with m = 6 and n = -11.
It is important to note that the above procedure gives only one of infinitely many integer
solutions (x, y) of the equation
\[ 888x + 481y = 37. \]
The general solution is $x = 13t + 6$, $y = -24t - 11$ for $t \in \mathbb{Z}$; taking t = 0 recovers the
solution found above. That solution is special only because a particular method (use of the
Euclidean algorithm) produces it.
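The back-substitution in Example 1.4 can be mechanised. The sketch below is one common way of organising the computation (often called the extended Euclidean algorithm); it is an illustration under our own naming, with the function name `extended_gcd` being an assumption rather than anything used in these notes. It assumes a and b are non-negative integers, not both zero, and carries the coefficients forward alongside the remainders instead of working backwards at the end.

```python
def extended_gcd(a, b):
    """Return (d, m, n) with d = gcd(a, b) and d == m*a + n*b.

    Assumes a, b are non-negative integers, not both zero.
    """
    # Invariants maintained throughout the loop:
    #   old_r == old_m*a + old_n*b   and   r == m*a + n*b
    old_r, r = a, b
    old_m, m = 1, 0
    old_n, n = 0, 1
    while r != 0:
        q = old_r // r                    # quotient in the division old_r = q*r + remainder
        old_r, r = r, old_r - q * r       # the usual Euclidean step
        old_m, m = m, old_m - q * m       # update coefficients of a
        old_n, n = n, old_n - q * n       # update coefficients of b
    return old_r, old_m, old_n


d, m, n = extended_gcd(888, 481)
print(d, m, n)                            # 37 6 -11, as in Example 1.4

# Every member of the family x = 13t + 6, y = -24t - 11 solves 888x + 481y = 37.
print(all(888 * (13 * t + 6) + 481 * (-24 * t - 11) == 37 for t in range(-5, 6)))
```

The loop visits exactly the divisions displayed in the solution above (quotients 1, 1, 5, 2), so the remainders 407, 74, 37, 0 appear in turn as the value of `r`.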
The general theorem is as follows.
Theorem 1.5. Let a and b be integers which are not both zero, and let d = gcd(a, b). Then
there are integers m and n such that
\[ d = ma + nb. \]
Proof. Let
\[ T = \{sa + tb : s, t \in \mathbb{Z}\} \subseteq \mathbb{Z}. \]
Notice that $0, \pm a, \pm b \in T$; since a and b are not both zero, T contains at least one positive
integer, hence
\[ S = T \cap \mathbb{N} = \{x \in T : x \in \mathbb{N}\} \subseteq \mathbb{N} \]
is non-empty. By the Well Ordering Principle, S has a least element, d say. Then there are
integers m, n for which
\[ d = ma + nb. \]
Notice that if $c \in \mathbb{Z}$ is a common divisor of a and b (i.e., $c \mid a$ and $c \mid b$), then $c \mid d$ by
Proposition 1.3; hence d is greater than or equal to every common divisor of a and b.
Furthermore, if $s, t \in \mathbb{Z}$, then we can find $q, r \in \mathbb{Z}$ such that
\[ sa + tb = qd + r, \]
and $0 \leq r < d$. So
\[ 0 \leq r = sa + tb - qd = (s - qm)a + (t - qn)b \in T, \]
and since d is the least element of S, we must have r = 0. Thus $d \mid (sa + tb)$, showing that
d divides every element of T, including a and b, so that d is actually a common divisor of a and b.
Therefore d = gcd(a, b) and for some $m, n \in \mathbb{Z}$,
\[ \gcd(a, b) = ma + nb. \]
We remark that for a and b both non-zero, there are infinitely many pairs m, n for which
gcd(a, b) = ma + nb. For example, when a = 2 and b = 3, we have
\[ (-1) \times 2 + 1 \times 3 = 1, \]
and more generally for any $t \in \mathbb{Z}$,
\[ (3t - 1) \times 2 + (1 - 2t) \times 3 = 1. \]
In particular, if gcd(a, b) = 1 then a and b are called coprime or relatively prime. As a
special case, we have the following result.
Theorem 1.6. Let a and b be coprime integers. Then there are integers m and n such that
ma +nb = 1.
Here is a useful result about coprime integers whose proof illustrates how we can use the
greatest common divisor to prove results.
Proposition 1.7. Let a, b be coprime integers. If a | c and b | c, then ab | c.
Proof. Since gcd(a, b) = 1, there are integers u, v for which
ua +vb = 1,
and multiplying by c we have
uac +vbc = c.
Writing c = am and c = bn we have
(un +vm)ab = uanb +vamb = uac +vbc = c,
hence ab | c.
1.3. Prime numbers and prime factorisation
A prime number (or just a prime) is a natural number p > 1 whose only positive divisors
are 1 and p. Thus the prime numbers are 2, 3, 5, . . . . Note that 1 is not a prime and it behaves
quite differently from 2, 3, 5, . . . . Primes have exactly two positive integer divisors, but 1 has
only one positive integer divisor, namely itself.
By using Theorem 1.6 one can prove the following result.
Theorem 1.8. Let p be a prime. If $p \mid ab$, then $p \mid a$ or $p \mid b$. More generally, if
$a_1, \ldots, a_n \in \mathbb{Z}$ and $p \mid (a_1 \cdots a_n)$, then $p \mid a_r$ for some r.
Proof. Suppose that p is not a divisor of a; we must show that p is a divisor of b. Since
the only positive divisors of p are 1 and p, it follows that the only positive common divisor of p
and a is 1. Therefore gcd(a, p) = 1. By Theorem 1.6, there are integers u and v such that
\[ ua + vp = 1. \]
It follows that
\[ b = 1b = (ua + vp)b = uab + vbp. \]
Since p is a divisor of ab, it then follows that p is a divisor of b as required.
The second statement can be proved by induction on n, the number of factors $a_1, \ldots, a_n$.
If q is not prime, then it is possible to have $q \mid ab$ without $q \mid a$ or $q \mid b$. For example,
\[ 6 \mid (3 \times 4) \text{ but } 6 \nmid 3 \text{ and } 6 \nmid 4. \]
As a consequence of Theorem 1.8 we obtain the following important result known as the
Fundamental Theorem of Arithmetic. This was first proved by Gauss around 1800.
Theorem 1.9 (Fundamental Theorem of Arithmetic). Every natural number greater than 1 can
be expressed as a product of primes, and this expression is unique apart from the order of the
factors.
For example, 12 can be expressed as $2 \times 2 \times 3$ or as $2 \times 3 \times 2$ or as $3 \times 2 \times 2$, and 12 cannot
be expressed as a product of primes in any other way.
Proof. In our proof, we will deal with the existence of such a factorisation first, then show
uniqueness.
We will use the Well Ordering Principle, although this is often proved by Induction. Suppose
that there are natural numbers which cannot be factorised as claimed and let $S \subseteq \mathbb{N}$ be the
set of all such numbers. As $S \neq \emptyset$, it has a least element $s \in S$ say. Now s cannot be prime
since it would then be a product of primes, contrary to the definition of S. So there are natural
numbers u, v for which
\[ s = uv, \]
where 1 < u < s and 1 < v < s. Because s is the least element of S, neither u nor v lies in S,
so we can write
\[ u = p_1 \cdots p_k, \quad v = q_1 \cdots q_\ell, \]
where the $p_i$, $q_j$ are primes. But then
\[ s = uv = p_1 \cdots p_k q_1 \cdots q_\ell, \]
which is a product of primes, contradicting the fact that $s \in S$ and so cannot be factorised in
this way. Thus S must be empty.
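For the uniqueness, suppose that $n \in \mathbb{N}$ and
\[ n = p_1 \cdots p_k = q_1 \cdots q_\ell, \]
with the $p_i$ and $q_j$ all prime, where we may assume that $k \leq \ell$. Then $p_k \mid (q_1 \cdots q_\ell)$, so by
Theorem 1.8, $p_k \mid q_r$ for some r, and since $q_r$ is prime this gives $p_k = q_r$. By
reordering the $q_j$ if necessary, we can assume that $r = \ell$, hence $p_k = q_\ell$. It follows that
\[ \frac{n}{p_k} = p_1 \cdots p_{k-1} = q_1 \cdots q_{\ell-1}. \]
Similarly we find that after suitable reordering of the $q_j$,
\[ p_k = q_\ell, \quad p_{k-1} = q_{\ell-1}, \quad \ldots, \quad p_1 = q_{\ell-k+1}, \]
and
\[ q_1 \cdots q_{\ell-k} = 1. \]
If $\ell > k$, the latter equation would force
\[ q_1 = \cdots = q_{\ell-k} = 1, \]
which is impossible since each $q_j$ is a prime and so greater than 1. Hence we must have $\ell = k$
and therefore the prime factorisation of n is unique except possibly for the order of the factors.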
Remark 1.10. By grouping repeated primes together, we can express every integer n greater
than 1 as a product of prime powers
\[ n = p_1^{r_1} p_2^{r_2} \cdots p_k^{r_k}, \]
where we assume that $r_i \geq 1$ for each i and the primes $p_i$ are arranged in increasing order
\[ 2 \leq p_1 < p_2 < \cdots < p_k. \]
This factorisation is also unique and is sometimes called the prime power factorisation of n.
When working with prime power factorisations, it is sometimes useful to allow redundant
factors $p^0 = 1$.
Here is a result which is sometimes useful for finding the greatest common divisor of two
numbers.
Proposition 1.11. Let $a, b \in \mathbb{N}$ be non-zero with prime power factorizations
\[ a = p_1^{r_1} \cdots p_k^{r_k}, \quad b = p_1^{s_1} \cdots p_k^{s_k}, \]
where $r_j \geq 0$ and $s_j \geq 0$. Then
\[ \gcd(a, b) = p_1^{t_1} \cdots p_k^{t_k} \]
where $t_j = \min\{r_j, s_j\}$.
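Proof. For each j, we have $p_j^{t_j} \mid a$ and $p_j^{t_j} \mid b$, hence $p_j^{t_j} \mid \gcd(a, b)$. Hence from
Proposition 1.7,
\[ p_1^{t_1} \cdots p_k^{t_k} \mid \gcd(a, b). \]
If
\[ 1 < m = \frac{\gcd(a, b)}{p_1^{t_1} \cdots p_k^{t_k}}, \]
then $m \mid \gcd(a, b)$ and there is a prime q dividing m, hence $q \mid a$ and $q \mid b$. This means that
$q = p_\ell$ for some $\ell$ and so $p_\ell^{t_\ell + 1} \mid \gcd(a, b)$. But then, since $t_\ell$ is equal to $r_\ell$ or to $s_\ell$,
we get $p_\ell^{r_\ell + 1} \mid a$ or $p_\ell^{s_\ell + 1} \mid b$, which is impossible.
Therefore $\gcd(a, b) = p_1^{t_1} \cdots p_k^{t_k}$.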
The method suggested by this result can sometimes be faster than using the Euclidean
algorithm, although factoring large numbers is generally a hard problem.
Example 1.12. If a = 300 and b = 270, then
\[ 300 = 2^2 \times 3 \times 5^2, \quad 270 = 2 \times 3^3 \times 5, \]
and
\[ \gcd(300, 270) = 2 \times 3 \times 5 = 30. \]
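To see Proposition 1.11 in action, here is a small Python sketch. The function names `prime_factorisation` and `gcd_from_factorisations` are hypothetical, and the factorisation is done by naive trial division, which is only sensible for small numbers; this is an illustration of the proposition, not an efficient way to compute greatest common divisors.

```python
from collections import Counter


def prime_factorisation(n):
    """Return a Counter mapping each prime p to its exponent in n (n >= 1)."""
    factors = Counter()
    p = 2
    while p * p <= n:
        while n % p == 0:      # strip out every factor of p
            factors[p] += 1
            n //= p
        p += 1
    if n > 1:                  # whatever is left over is itself prime
        factors[n] += 1
    return factors


def gcd_from_factorisations(a, b):
    """Compute gcd(a, b) as the product of p**min(r, s) over the common primes p."""
    fa, fb = prime_factorisation(a), prime_factorisation(b)
    result = 1
    for p in fa.keys() & fb.keys():   # primes missing from one side have exponent 0
        result *= p ** min(fa[p], fb[p])
    return result


print(dict(prime_factorisation(300)))     # exponents 2, 1, 2 for the primes 2, 3, 5
print(gcd_from_factorisations(300, 270))  # 30, as in Example 1.12
```

Primes that occur in only one of the two numbers are skipped, which corresponds to allowing the redundant exponents $r_j = 0$ or $s_j = 0$ in the proposition.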
The next result says that there are lots of primes, in fact infinitely many. There are now
many proofs of this; the following result and proof are attributed to Euclid.
Theorem 1.13 (Euclid). There are infinitely many prime numbers.
Proof. Let $F = \{p_1, \ldots, p_k\}$ be a finite set of primes and let
\[ N = p_1 \cdots p_k + 1. \]
Then N is an integer greater than 1; hence, by the fundamental theorem of arithmetic, N is
divisible by some prime number p. But for $1 \leq i \leq k$ we have $p_i \nmid N$ (dividing N by $p_i$ leaves
remainder 1), so $p \neq p_i$; therefore p is a
prime number not belonging to F. It follows that there is no finite set containing all the prime
numbers; in other words, there are infinitely many prime numbers.
1.4. Congruences to a modulus
It is often convenient to identify two numbers which differ by a multiple of some given
integer m. For example, in saying that two events happen at the same time on possibly different
days, what is really meant is that their times differ by a multiple of 24 hours.
In general, let m be a positive integer, called a modulus, and let a and b be any integers. If
$a - b$ is divisible by m, then we say that a and b are congruent modulo m and write $a \equiv b \bmod m$;
if $a - b$ is not divisible by m, then we write $a \not\equiv b \bmod m$. For example,
\[ 17 \equiv 5 \bmod 4, \quad 17 \equiv -3 \bmod 5, \quad 1 \not\equiv 3 \bmod 5. \]
Note that there are several frequently encountered alternative notations for $a \equiv b \bmod m$,
for example $a \equiv b \pmod{m}$ and $a \equiv b \ (m)$.
Note that $a \equiv b \bmod m$ if and only if $a = b + mk$ for some integer k; note also that
$a \equiv b \bmod m$ if and only if a and b have the same remainder when divided by m.
The set of integers congruent modulo m to a given integer a is called a congruence (or
residue) class modulo m. For example, the congruence classes modulo 5 are the sets
\begin{gather*}
\{\ldots, -15, -10, -5, 0, 5, 10, 15, \ldots\}, \\
\{\ldots, -14, -9, -4, 1, 6, 11, 16, \ldots\}, \\
\{\ldots, -13, -8, -3, 2, 7, 12, 17, \ldots\}, \\
\{\ldots, -12, -7, -2, 3, 8, 13, 18, \ldots\}, \\
\{\ldots, -11, -6, -1, 4, 9, 14, 19, \ldots\}.
\end{gather*}
As one can see from this example, there are m congruence classes modulo m; they are disjoint
non-empty subsets of $\mathbb{Z}$ whose union is equal to $\mathbb{Z}$. They can be listed as the congruence classes
of the integers $0, 1, 2, \ldots, m - 1$. If $0 \leq r < m$ then the integers in the congruence class of r are
those which leave a remainder of r when divided by m. The set of congruence classes modulo m
is denoted $\mathbb{Z}/m$, although the alternative notation $\mathbb{Z}_m$ is often encountered.
Note that $a \equiv b \bmod m$ if and only if the congruence classes of a and b are the same.
The relation of congruence behaves somewhat like equality, in particular it has the following
properties:
• $a \equiv a \bmod m$ for all integers a;
• if $a \equiv b \bmod m$ then $b \equiv a \bmod m$;
• if $a \equiv b \bmod m$ and $b \equiv c \bmod m$ then $a \equiv c \bmod m$.
Binary relations with these properties are called equivalence relations, and they will be discussed
more generally in Section 2.9.
1.5. Arithmetic operation on congruence classes
An important property of congruence classes modulo m is that they have well-defined
operations of addition, subtraction and multiplication. This is because of the following result.
Theorem 1.14. If $a \equiv b \bmod m$ and $c \equiv d \bmod m$ then
\begin{align*}
a + c &\equiv b + d \bmod m, \\
a - c &\equiv b - d \bmod m, \\
ac &\equiv bd \bmod m.
\end{align*}
Proof. We have $a \equiv b \bmod m$ and $c \equiv d \bmod m$, so we have $a = b + km$ and $c = d + lm$
for some integers k and l. Then
\begin{align*}
a + c &= (b + km) + (d + lm) = (b + d) + (k + l)m, \\
a - c &= (b + km) - (d + lm) = (b - d) + (k - l)m, \\
ac &= (b + km)(d + lm) = bd + (bl + kd + klm)m
\end{align*}
with $k + l$, $k - l$ and $bl + kd + klm$ integers, so $a + c \equiv b + d \bmod m$, $a - c \equiv b - d \bmod m$ and
$ac \equiv bd \bmod m$.
Given two members of $\mathbb{Z}/m$ one can therefore define their sum and product, which are also
members of $\mathbb{Z}/m$. As an example, here are the addition and multiplication tables modulo 6.

 + | 0 1 2 3 4 5        × | 0 1 2 3 4 5
---+------------       ---+------------
 0 | 0 1 2 3 4 5        0 | 0 0 0 0 0 0
 1 | 1 2 3 4 5 0        1 | 0 1 2 3 4 5
 2 | 2 3 4 5 0 1        2 | 0 2 4 0 2 4
 3 | 3 4 5 0 1 2        3 | 0 3 0 3 0 3
 4 | 4 5 0 1 2 3        4 | 0 4 2 0 4 2
 5 | 5 0 1 2 3 4        5 | 0 5 4 3 2 1

The tables say, for example, that $5 + 4 \equiv 3 \bmod 6$ and that $5 \times 4 \equiv 2 \bmod 6$. From this, if
a and b are any integers with $a \equiv 5 \bmod 6$ and $b \equiv 4 \bmod 6$, then $a + b \equiv 3 \bmod 6$ and
$ab \equiv 2 \bmod 6$.
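Tables like these are tedious to write out by hand for larger moduli, so it can be helpful to generate them. The following Python sketch does this for any modulus m, representing each congruence class by its remainder in the range 0, ..., m - 1; the function names are our own and purely illustrative.

```python
def addition_table(m):
    """Row a, column b holds the remainder of a + b modulo m."""
    return [[(a + b) % m for b in range(m)] for a in range(m)]


def multiplication_table(m):
    """Row a, column b holds the remainder of a * b modulo m."""
    return [[(a * b) % m for b in range(m)] for a in range(m)]


# Print the multiplication table modulo 6, one row per line.
for row in multiplication_table(6):
    print(*row)

# Theorem 1.14 says the answer depends only on the congruence classes:
a, b = 5 + 6 * 7, 4 - 6 * 2          # a ≡ 5 and b ≡ 4 modulo 6
print((a + b) % 6, (a * b) % 6)      # 3 2, matching the table entries
```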
Example 1.15. Show that $n^2 - 1$ is divisible by 8 for every odd integer n.
Solution. Modulo 8 we have $n \equiv 1$ or $n \equiv 3$ or $n \equiv 5$ or $n \equiv 7$, hence $n^2 \equiv 1$ or $n^2 \equiv 9$ or
$n^2 \equiv 25$ or $n^2 \equiv 49$, hence $n^2 \equiv 1$ in all cases. Therefore $n^2 - 1$ is divisible by 8 in all cases.
Example 1.16. Prove that $4^{2n} + 6 \times 9^n$ is divisible by 7 for all nonnegative integers n.
Solution. The idea is that
\[ 4^{2n} \equiv (4^2)^n \equiv 16^n \equiv 2^n \bmod 7 \]
and
\[ 9^n \equiv 2^n \bmod 7, \]
so one can take out a common factor:
\[ 4^{2n} + 6 \times 9^n \equiv (4^2)^n + 6 \times 9^n \equiv 2^n + 6 \times 2^n \equiv 7 \times 2^n \equiv 0 \bmod 7. \]
Example 1.17. Show that if $n \equiv 3 \bmod 4$, then n cannot be expressed as the sum of two
squares.
Solution. Modulo 4, every integer is congruent to 0, 1, 2 or 3, so every square is congruent
to $0^2$, $1^2$, $2^2$ or $3^2$. Therefore every square is congruent to 0 or 1. A sum of two squares is therefore
congruent to 0, 1 or 2, and cannot be congruent to 3.
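Arguments such as those in Examples 1.15 to 1.17 reduce a statement about all integers to a finite check on congruence classes, and such finite checks are easy to confirm by machine. The Python sketch below does so; the helper names `squares_mod` and `sums_of_two_squares_mod` are hypothetical. For Example 1.16 the loop only tests finitely many n, so it is a sanity check and not a replacement for the proof.

```python
def squares_mod(m):
    """Sorted list of the congruence classes modulo m that contain a square."""
    return sorted({(r * r) % m for r in range(m)})


def sums_of_two_squares_mod(m):
    """Sorted list of the classes modulo m that contain a sum of two squares."""
    squares = squares_mod(m)
    return sorted({(x + y) % m for x in squares for y in squares})


# Example 1.15: every odd residue modulo 8 squares to 1.
print({(r * r) % 8 for r in (1, 3, 5, 7)})      # {1}

# Example 1.16: 4^(2n) + 6 * 9^n is divisible by 7 (checked for n = 0, ..., 99).
print(all((4 ** (2 * n) + 6 * 9 ** n) % 7 == 0 for n in range(100)))

# Example 1.17: squares are 0 or 1 modulo 4, so sums of two squares avoid 3.
print(squares_mod(4), sums_of_two_squares_mod(4))   # [0, 1] [0, 1, 2]
```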
Example 1.18. Show that a positive integer n congruent to 3 modulo 4 must have a prime
factor congruent to 3 modulo 4.
Solution. We know that $n = p_1 \cdots p_r$ for some prime numbers $p_i$. Since n is odd, so is
each $p_i$ and therefore it is congruent to 1 or 3 modulo 4. If every $p_i$ is congruent to 1 modulo 4,
then $p_1 \cdots p_r \equiv 1 \bmod 4$, so n cannot be congruent to 3 modulo 4. Hence at least one of the
prime factors $p_i$ must be congruent to 3 modulo 4.
Theorem 1.19. There are infinitely many prime numbers congruent to 3 modulo 4.
Proof. Let $E = \{p_1, \ldots, p_k\}$ be a finite set of primes congruent to 3 modulo 4, and let
\[ N = 4 p_1 \cdots p_k - 1. \]
Then $N \equiv 3 \bmod 4$, so N is divisible by some prime p such that $p \equiv 3 \bmod 4$ by Example 1.18.
For $1 \leq i \leq k$ one has N not divisible by $p_i$, hence $p \neq p_i$; therefore E does not contain all the
primes congruent to 3 modulo 4. The set of all primes congruent to 3 modulo 4 must therefore
be infinite.
1.6. Linear congruences
We will study linear congruences of the form
\[ ax \equiv b \bmod m, \]
where a, b and m are given and where x is unknown, for example
\[ 3x \equiv 4 \bmod 5. \]
To see how linear congruences behave, consider the rows in the multiplication table modulo 6.
In the row containing the multiples of 5, each of the congruence classes from 0 to 5 occurs
exactly once. Hence, for any integer b, the integers x satisfying the congruence
\[ 5x \equiv b \bmod 6 \]
form a single congruence class modulo 6. In particular there is a congruence class consisting of
the integers x such that $5x \equiv 1 \bmod 6$, so the congruence class of 5 has a multiplicative inverse
in $\mathbb{Z}/6$. The congruence class of 1 behaves in a similar way, but other congruence classes behave
differently. For example, there are no solutions to the congruence
\[ 4x \equiv 1 \bmod 6, \]
but there are two congruence classes of solutions to the congruence
\[ 4x \equiv 2 \bmod 6. \]
The solutions to this last congruence could also be regarded as a single congruence class
modulo 3, in fact
\[ 4x \equiv 2 \bmod 6 \text{ if and only if } x \equiv 2 \bmod 3. \]
The difference in behaviour arises because gcd(5, 6) = 1 and $\gcd(4, 6) \neq 1$. In general we have
the following result.
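Theorem 1.20. Let m be a positive integer and let a be an integer such that gcd(a, m) = 1.
Then there are integers r such that
\[ ra \equiv 1 \bmod m. \]
If r is such an integer and if b is any integer, then
\[ ax \equiv b \bmod m \iff x \equiv rb \bmod m. \]
Proof. By Theorem 1.6 there are integers r and s such that ra + sm = 1. From this one
sees that $ra \equiv 1 \bmod m$. Then
\begin{align*}
ax \equiv b \bmod m &\implies rax \equiv rb \bmod m \\
&\implies 1x \equiv rb \bmod m \\
&\implies x \equiv rb \bmod m
\end{align*}
and
\begin{align*}
x \equiv rb \bmod m &\implies ax \equiv rab \bmod m \\
&\implies ax \equiv 1b \bmod m \\
&\implies ax \equiv b \bmod m.
\end{align*}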
Definition 1.21. An integer r such that $ar \equiv 1 \bmod m$ is called an inverse of a mod m.
So Theorem 1.20 says that a has an inverse mod m if gcd(a, m) = 1. In fact, if s is another
such inverse, then since
\[ sa \equiv 1 \equiv ar \bmod m, \]
we also have
\[ r \equiv (sa)r \equiv s(ar) \equiv s \bmod m. \]
Hence the residue class of an inverse mod m is unique.
Example 1.22. Find an inverse mod 15 for 13.
Solution. Since 13 is a prime we see that 15 and 13 are coprime, i.e., gcd(13, 15) = 1. We
could use the Euclidean Algorithm to write 13r + 15s = 1, but here is a different way.
First note that $13 \equiv -2 \bmod 15$. Also
\[ 8 \times 2 = 16 \equiv 1 \bmod 15, \]
so
\[ 8 \times 13 \equiv 8 \times (-2) \equiv -1 \bmod 15. \]
Hence we have
\[ 7 \times 13 \equiv (-8) \times 13 \equiv (-1)^2 = 1 \bmod 15. \]
So 7 is an inverse of 13 mod 15. Every such inverse has the form 7 + 15k where $k \in \mathbb{Z}$.
Example 1.23. Find inverses mod 24 for each of the following integers: 5, 7, 11, 13, 17, 19, 23.
Solution. One approach which is certain to work is to follow the proof of Theorem 1.20.
So for example,
\[ 5 \times 5 + (-1) \times 24 = 1, \]
so mod 24, 5 is the inverse of 5.
Otherwise, trial and error can be quick: for example,
\[ 7 \times 7 = 49 \equiv 1 \bmod 24, \]
so 7 is the inverse of 7. Similarly,
\[ 11 \times 11 = 121 \equiv 1 \bmod 24, \quad 13 \times 13 = 169 \equiv 1 \bmod 24. \]
Also,
\[ 23 \equiv -1 \bmod 24, \]
so 23 is also its own inverse. Finally, since
\[ 17 \equiv -7 \bmod 24, \quad 19 \equiv -5 \bmod 24, \]
we see that these are also self-inverses mod 24.
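The proof of Theorem 1.20 is constructive: an inverse of a mod m is the coefficient r in an expression ra + sm = 1, and such an expression comes out of the Euclidean algorithm. Here is a minimal Python sketch along those lines. The name `inverse_mod` is hypothetical, and only the coefficient of a is tracked, since the coefficient s of m is not needed.

```python
def inverse_mod(a, m):
    """Return the r in range(m) with r*a ≡ 1 (mod m); requires gcd(a, m) == 1."""
    # Invariants: old_r ≡ old_s * a (mod m) and r ≡ s * a (mod m).
    old_r, r = a % m, m
    old_s, s = 1, 0
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_s, s = s, old_s - q * s
    if old_r != 1:                      # old_r is now gcd(a, m)
        raise ValueError("a has no inverse modulo m")
    return old_s % m                    # normalise to the range 0, ..., m - 1


print(inverse_mod(13, 15))                                     # 7, as in Example 1.22
print([inverse_mod(a, 24) for a in (5, 7, 11, 13, 17, 19, 23)])
# Each of these integers is its own inverse modulo 24, as in Example 1.23.
```

In Python 3.8 and later, the built-in call `pow(a, -1, m)` computes the same value.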
We now explain how to solve a congruence
\[ ax \equiv b \bmod m. \]
The idea is to divide both sides by a. Strictly speaking, this makes no sense; however, if
gcd(a, m) = 1 and r is an inverse of a mod m, then multiplying both sides by r achieves the
same effect.
Example 1.24. Solve the congruence $3x \equiv 5 \bmod 22$.
Solution. We must multiply both sides by an inverse of 3 modulo 22, which exists because
gcd(3, 22) = 1. One could find an inverse by using the Euclidean Algorithm, but it is probably
quicker to run through the numbers of the form 22s + 1 until one finds a multiple of 3. Indeed
$22 \times 2 + 1 = 15 \times 3$, so $15 \times 3 \equiv 1 \bmod 22$. The solution is therefore
\[ 3x \equiv 5 \bmod 22 \iff 45x \equiv 75 \bmod 22 \iff x \equiv 9 \bmod 22. \]
At the first step one multiplies both sides by 15; at the second step one observes that $45 \equiv 1$
and $75 \equiv 9$ modulo 22.
Example 1.25. Solve the congruence $7x \equiv 8 \bmod 12$.
Solution. One has $7 \times 7 \equiv 1 \bmod 12$, so we multiply both sides by 7. We get
\[ 7x \equiv 8 \bmod 12 \iff 49x \equiv 56 \bmod 12 \iff x \equiv 8 \bmod 12. \]
Example 1.26. Solve the congruence $6x \equiv 3 \bmod 9$.
Solution. Here 6 and 9 are not coprime, so one must begin by taking out the common
factor 3:
\begin{align*}
6x \equiv 3 \bmod 9 &\iff 9 \mid (6x - 3) \\
&\iff 9 \mid 3(2x - 1) \\
&\iff 3 \mid (2x - 1) \\
&\iff 2x \equiv 1 \bmod 3,
\end{align*}
and then, as before,
\[ 2x \equiv 1 \bmod 3 \iff 4x \equiv 2 \bmod 3 \iff x \equiv 2 \bmod 3. \]
Equivalently, the solution is x congruent to 2, 5 or 8 modulo 9.
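The method of these examples can be packaged into a single routine. The Python sketch below is an illustration under our own assumptions: the name `solve_linear_congruence` is hypothetical, and the code uses the standard fact (not proved in these notes) that ax ≡ b mod m has a solution exactly when gcd(a, m) divides b, in which case one cancels the common factor as in the last example. The inverse is found by trial, which is adequate only for small moduli.

```python
from math import gcd


def solve_linear_congruence(a, b, m):
    """Solve a*x ≡ b (mod m) for a positive modulus m.

    Returns a pair (x0, n) meaning that the solutions are exactly the
    integers x ≡ x0 (mod n), or None if there are no solutions.
    """
    g = gcd(a, m)
    if b % g != 0:
        return None                      # e.g. 4x ≡ 1 mod 6 has no solutions
    # Cancel the common factor g, leaving a congruence with gcd(a, n) == 1.
    a, b, n = a // g, b // g, m // g
    for r in range(n):                   # search for an inverse r of a modulo n
        if (r * a) % n == 1 % n:
            return (r * b) % n, n
    raise AssertionError("unreachable: a is invertible modulo n")


print(solve_linear_congruence(3, 5, 22))   # (9, 22)
print(solve_linear_congruence(7, 8, 12))   # (8, 12)
print(solve_linear_congruence(6, 3, 9))    # (2, 3), i.e. x ≡ 2, 5 or 8 modulo 9
print(solve_linear_congruence(4, 1, 6))    # None
```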
1.7. Algebraic structures on Z and Z/m
The sets $\mathbb{Z}$ and $\mathbb{Z}/m$ have addition and multiplication properties satisfying nearly all of
the field axioms. The exceptions are as follows: in $\mathbb{Z}$ there are non-zero elements without
multiplicative inverses; if m > 1 and m is not prime then there are non-zero elements in $\mathbb{Z}/m$
without multiplicative inverses; in $\mathbb{Z}/1$ the elements zero and one are not distinct. In $\mathbb{Z}/p$ with
p prime, these exceptions do not occur: non-zero elements have multiplicative inverses
by Theorem 1.20, and $0 \not\equiv 1 \bmod p$ because p > 1. We therefore get the following result.
Theorem 1.27. If p is a prime then $\mathbb{Z}/p$ is a field. If m is a positive integer which is not
a prime then $\mathbb{Z}/m$ is not a field.
If one removes the two problem axioms from the definition of a field, then one is left with
the definition of an algebraic structure called a commutative ring; thus $\mathbb{Z}$ is a commutative ring,
and $\mathbb{Z}/m$ is a commutative ring for every positive integer m. If one also removes the axiom that
multiplication should be commutative (xy = yx), then one is left with the definition of a ring;
for example, if n > 1 then the set of $n \times n$ matrices over $\mathbb{R}$ is a ring but not a commutative
ring. Actually, in some books, the definitions of rings and commutative rings also omit the
requirement for an identity element for multiplication; the examples we have mentioned are
rings in any case.
1.8. Simultaneous linear congruences and the Chinese Remainder Theorem
Example 1.28 (Sun Zi, 4th century). A number has remainder 2 when divided by 3, it
has remainder 3 when divided by 5, and it has remainder 2 when divided by 7. What is the
smallest such number?
Solution. We need x such that $x \equiv 2 \bmod 3$, $x \equiv 3 \bmod 5$ and $x \equiv 2 \bmod 7$, and we solve
these congruences successively. To get $x \equiv 2 \bmod 3$ we need x = 3u + 2, so assuming this we
have
\begin{align*}
x \equiv 3 \bmod 5 &\iff 3u + 2 \equiv 3 \bmod 5 \\
&\iff 3u \equiv 1 \bmod 5 \\
&\iff 2 \times 3u \equiv 2 \times 1 \bmod 5 \\
&\iff u \equiv 2 \bmod 5 \\
&\iff u = 5v + 2 \\
&\iff x = 3(5v + 2) + 2 \\
&\iff x = 15v + 8.
\end{align*}
Assuming that x = 15v + 8, we obtain
\begin{align*}
x \equiv 2 \bmod 7 &\iff 15v + 8 \equiv 2 \bmod 7 \\
&\iff v + 1 \equiv 2 \bmod 7 \\
&\iff v \equiv 1 \bmod 7 \\
&\iff v = 7w + 1 \\
&\iff x = 15(7w + 1) + 8 \\
&\iff x = 105w + 23 \\
&\iff x \equiv 23 \bmod 105.
\end{align*}
Thus the simultaneous solutions to all three congruences are given by $x \equiv 23 \bmod 105$, and the
smallest positive solution is given by x = 23.
Note that the general solution is a congruence class modulo $3 \times 5 \times 7 = 105$. This generalises
as follows.
Theorem 1.29 (Chinese Remainder Theorem). Suppose that $m_1, \ldots, m_r$ are pairwise
coprime. Then the general solution of the simultaneous congruences
\[ x \equiv a_1 \bmod m_1, \quad \ldots, \quad x \equiv a_r \bmod m_r \]
is a congruence class modulo $m_1 \cdots m_r$.
Proof. The proof is by induction on r. The result is trivial when r = 1. Suppose that
r > 1 and suppose that the simultaneous congruences
\[ x \equiv a_1 \bmod m_1, \quad \ldots, \quad x \equiv a_{r-1} \bmod m_{r-1} \]
have a unique solution modulo $m_1 \cdots m_{r-1}$. This means that the solutions to the first r - 1
congruences are the numbers
\[ x = \lambda m_1 \cdots m_{r-1} + u \]
for a particular choice of u and an arbitrary choice of $\lambda$. For this number x also to be a solution
of $x \equiv a_r \bmod m_r$ one needs
\[ \lambda m_1 \cdots m_{r-1} \equiv a_r - u \bmod m_r. \]
But $m_1 \cdots m_{r-1}$ is coprime to $m_r$, so there is a unique solution for $\lambda$ modulo $m_r$ by Theorem 1.20,
say $\lambda = \mu m_r + v$ for some $\mu$. Thus the solutions are given by
\[ x = (\mu m_r + v) m_1 \cdots m_{r-1} + u = \mu m_1 \cdots m_r + (v m_1 \cdots m_{r-1} + u), \]
i.e.,
\[ x \equiv v m_1 \cdots m_{r-1} + u \bmod m_1 \cdots m_r. \]
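Example 1.30. Solve the simultaneous congruences
\[ x \equiv 3 \bmod 4, \quad x \equiv 2 \bmod 5, \quad 3x \equiv 1 \bmod 11. \]
Solution. First we have
\[ x \equiv 3 \bmod 4 \iff x = 4u + 3. \]
Given that x = 4u + 3,
\begin{align*}
x \equiv 2 \bmod 5 &\iff 4u + 3 \equiv 2 \bmod 5 \\
&\iff 4u \equiv 4 \bmod 5 \\
&\iff u \equiv 1 \bmod 5 \\
&\iff u = 5v + 1 \\
&\iff x = 20v + 7.
\end{align*}
Finally, given that x = 20v + 7,
\begin{align*}
3x \equiv 1 \bmod 11 &\iff 60v + 21 \equiv 1 \bmod 11 \\
&\iff 5v \equiv 2 \bmod 11 \\
&\iff 9 \times 5v \equiv 9 \times 2 \bmod 11 \\
&\iff v \equiv 7 \bmod 11 \\
&\iff v = 11w + 7 \\
&\iff x = 220w + 147 \\
&\iff x \equiv 147 \bmod 220.
\end{align*}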
Example 1.31. Solve the congruence $24x \equiv 17 \bmod 91$ by solving the simultaneous
congruences
\[ 24x \equiv 17 \bmod 7, \quad 24x \equiv 17 \bmod 13. \]
Solution. Note that $91 = 7 \times 13$, where 7 and 13 are prime. It follows that $24x - 17$ is
divisible by 91 if and only if it is divisible by both 7 and 13, so it is indeed sufficient to solve
the two simultaneous congruences. We do so as follows. First,
\[ 24x \equiv 17 \bmod 7 \iff 3x \equiv 3 \bmod 7 \iff x \equiv 1 \bmod 7 \iff x = 7u + 1; \]
then, given that x = 7u + 1, we get
\begin{align*}
24x \equiv 17 \bmod 13 &\iff 168u + 24 \equiv 17 \bmod 13 \\
&\iff -u \equiv -7 \bmod 13 \\
&\iff u \equiv 7 \bmod 13 \\
&\iff u = 13v + 7.
\end{align*}
The solutions are therefore given by x = 7(13v + 7) + 1 = 91v + 50, i.e., $x \equiv 50 \bmod 91$.
Of course we can also solve the congruence $24x \equiv 17 \bmod 91$ in the usual way by finding an
inverse for 24 modulo 91.
Example 1.32. Solve the simultaneous linear congruences
\[ 3x \equiv 5 \bmod 34, \quad x \equiv 7 \bmod 20. \]
Solution. This is not quite the situation of the Chinese remainder theorem, because 34
and 20 are not coprime. The solution turns out to be a congruence class modulo a proper factor
of $34 \times 20$. Indeed, we need x = 20u + 7 such that
\begin{align*}
& 3(20u + 7) \equiv 5 \bmod 34, \text{ i.e.,} \\
& 60u + 21 \equiv 5 \bmod 34, \text{ i.e.,} \\
& 26u \equiv 18 \bmod 34.
\end{align*}
The difference occurs here because 26 and 34 are not coprime. Indeed $26u \equiv 18 \bmod 34$ is
equivalent to $34 \mid (26u - 18)$, i.e., $17 \mid (13u - 9)$ or equivalently
\begin{align*}
& 13u \equiv 9 \bmod 17, \text{ i.e.,} \\
& 4 \times 13u \equiv 4 \times 9 \bmod 17, \text{ i.e.,} \\
& u \equiv 2 \bmod 17, \text{ i.e.,} \\
& u = 17v + 2 \text{ for some } v \in \mathbb{Z},
\end{align*}
giving x = 340v + 47, and so
\[ x \equiv 47 \bmod 340. \]
1.9. Further examples of congruences
In this section and the Exercises we give a selection of congruences, some of which are
extremely important in Number Theory and its applications. Some more material such as a
proof of Fermat's Little Theorem can be found in the Exercises.
Example 1.33. Let p be a prime and m, n any integers with n > 1. Show that
\begin{align*}
(m + p)^n &\equiv m^n \bmod p, \\
(m + p)^n &\equiv m^n + m^{n-1} n p \bmod p^2.
\end{align*}
Solution. Using the Binomial Theorem we have
\[ (m + p)^n = m^n + \binom{n}{1} m^{n-1} p + \binom{n}{2} m^{n-2} p^2 + \cdots, \]
where all the higher terms in the expansion have the form
\[ \binom{n}{r} m^{n-r} p^r \equiv 0 \bmod p \]
with $\binom{n}{r}$ an integer. Hence
\[ (m + p)^n \equiv m^n \bmod p. \]
Similarly, the second congruence comes from the expansion
\begin{align*}
(m + p)^n &= m^n + m^{n-1} \binom{n}{1} p + \cdots \\
&= m^n + m^{n-1} n p + \cdots \\
&\equiv m^n + m^{n-1} n p \bmod p^2,
\end{align*}
where all the higher terms in the binomial expansion have the form
\[ \binom{n}{r} m^{n-r} p^r \equiv 0 \bmod p^r \]
for $r \geq 2$, and so are divisible by $p^2$.
Example 1.34. Let p be a prime. If $1 \leq k \leq p - 1$, show that
\[ \binom{p}{k} \equiv 0 \bmod p. \]
Solution. We have
\[ \binom{p}{k} = \frac{p!}{k!\,(p - k)!} = \frac{p \times (p - 1)!}{k!\,(p - k)!}, \]
where none of $(p - 1)!$, $k!$ and $(p - k)!$ is divisible by p. So there is no factor of p in the
denominator, but there is one in the numerator. Since $\binom{p}{k}$ is an integer, it must indeed be
divisible by p, i.e., $p \mid \binom{p}{k}$.
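This divisibility is easy to observe numerically. The sketch below uses `math.comb` from the Python standard library (available from Python 3.8); the function name `middle_binomials_divisible` is our own. It also shows that the hypothesis that p is prime matters, since the statement fails for p = 4.

```python
from math import comb


def middle_binomials_divisible(p):
    """True if p divides the binomial coefficient C(p, k) for every 1 <= k <= p - 1."""
    return all(comb(p, k) % p == 0 for k in range(1, p))


print([comb(7, k) for k in range(8)])        # [1, 7, 21, 35, 35, 21, 7, 1]
print(middle_binomials_divisible(7))         # True
print(middle_binomials_divisible(4))         # False: C(4, 2) = 6 is not divisible by 4
```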
Here is an important result usually known as Wilson's Theorem.
Theorem 1.35. Let p be an odd prime. Then
\[ (p - 1)! \equiv -1 \bmod p. \]
Proof. Recall that
\[ (p - 1)! = 1 \times 2 \times \cdots \times (p - 1). \]
Notice that p - 1 is even. Furthermore, each residue class which is not 0 mod p has the form
r mod p for exactly one of the numbers 1, 2, . . . , p - 1.
There are two solutions of the congruence
\[ x^2 \equiv 1 \bmod p, \]
namely 1 mod p and -1 mod p = p - 1 mod p. These are the only residue classes which are
their own inverses. The remaining nonzero residue classes can be arranged in pairs u mod p,
v mod p, where u, v are in the range 1, 2, . . . , p - 1, and also satisfy u < v and
\[ uv \equiv 1 \bmod p. \]
Thus by rearranging the order, we can write $(p - 1)!$ as a product of 1, p - 1 and then the product
of the pairs u, v as above. Working modulo p, each pair contributes a factor congruent to 1, so
this product is congruent to $1 \times (p - 1) \equiv -1$.
For example, when p = 11, we have
\[ 2 \times 6 \times 3 \times 4 \times 5 \times 9 \times 7 \times 8 \equiv 1 \bmod 11. \]
So the pairs u, v are 2, 6, 3, 4, 5, 9 and 7, 8.
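The pairing used in the proof can be listed explicitly for small primes. The following Python sketch is a simple brute-force illustration with hypothetical function names; it computes the pairs u < v with uv ≡ 1 mod p among 2, ..., p - 2, and the residue of (p - 1)! modulo p.

```python
from math import factorial


def inverse_pairs(p):
    """Pairs (u, v) with 2 <= u < v <= p - 2 and u*v ≡ 1 (mod p)."""
    return [(u, v)
            for u in range(2, p - 1)
            for v in range(u + 1, p - 1)
            if (u * v) % p == 1]


def wilson_residue(p):
    """The remainder of (p - 1)! on division by p."""
    return factorial(p - 1) % p


print(inverse_pairs(11))     # [(2, 6), (3, 4), (5, 9), (7, 8)]
print(wilson_residue(11))    # 10, which is congruent to -1 modulo 11
```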
Finally we state Fermat's Little Theorem (not to be confused with Fermat's Last Theorem,
now more properly called Wiles's Theorem).
Theorem 1.36. Let p be a prime and let $a \in \mathbb{Z}$. Then
\[ a^p \equiv a \bmod p. \]
If also $p \nmid a$, then
\[ a^{p-1} \equiv 1 \bmod p. \]
This result is often useful when calculating powers modulo a prime. For example, 37 is a
prime, hence
\[ 7^{111} = 7^{3 \times 36 + 3} = (7^{36})^3 \times 7^3 \equiv 7^3 \equiv 49 \times 7 \equiv 12 \times 7 \equiv 84 \equiv 10 \bmod 37. \]
This sort of calculation can often be done without a calculator, although for large primes
computers may need to be used. Fermat's Little Theorem is also used as a basis for primality
testing.
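The calculation of $7^{111}$ modulo 37 can be mirrored in a few lines of Python. In the sketch below the function names are hypothetical; `power_mod_prime` reduces the exponent modulo p - 1 exactly as in the worked example and so relies on p being prime with p not dividing a. The second function is the simplest form of a Fermat-style primality check; it is included only to illustrate the closing remark, and it can be fooled by some composite numbers (341 = 11 × 31 passes with base 2).

```python
def power_mod_prime(a, e, p):
    """Compute a**e modulo a prime p not dividing a, for an exponent e >= 0.

    Uses a**(p-1) ≡ 1 (mod p) to replace e by its remainder modulo p - 1.
    """
    if a % p == 0:
        raise ValueError("p must not divide a")
    return pow(a, e % (p - 1), p)       # three-argument pow reduces modulo p


def passes_fermat_test(n, a=2):
    """True if a**(n-1) ≡ 1 (mod n); every prime n not dividing a passes."""
    return pow(a, n - 1, n) == 1


print(power_mod_prime(7, 111, 37))      # 10, as computed in the text
print(passes_fermat_test(37))           # True
print(passes_fermat_test(15))           # False, so 15 is certainly not prime
print(passes_fermat_test(341))          # True, although 341 is composite
```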
CHAPTER 2
Sets, functions, cardinality and equivalence relations
Much of mathematics makes use of sets and functions (or mappings) between them. In
this chapter we will meet some important ideas about sets and functions, and learn how to
work with them. We will also study the basic notion of cardinality, which formalises the idea
of how large a set is, and in particular see how to make sense of the difference between finite
and infinite sets. We will also discuss equivalence relations on sets, which generalise the idea of
equality of elements.
2.1. Sets
A set is a collection $S$ of objects known as the elements (or members) of $S$. We write $x \in S$ if $x$ is an element of $S$, and $x \notin S$ if it is not. For every object $x$, exactly one of the statements $x \in S$ or $x \notin S$ is true and the other is false, i.e., $x$ is either a member of $S$ or not a member of $S$.
Two sets $X$, $Y$ are equal (we then write $X = Y$) if they have the same elements, i.e.,
\[ x \in X \text{ if and only if } x \in Y. \]
We sometimes write a set in the form
\[ X = \{x : P(x)\}, \]
where $P(x)$ is some statement about $x$, and then
\[ x \in X \iff P(x) \text{ is true}. \]
We will often be working with elements of a given set, $Y$ say, and then have
\[ X = \{x \in Y : P(x)\} = \{x : x \in Y \text{ and } P(x) \text{ is true}\}, \]
so for $x \in Y$,
\[ x \in X \iff P(x) \text{ is true}. \]
We then say that $X$ is a subset of $Y$ and write $X \subseteq Y$. So $X$ is a subset of $Y$ means that every element of $X$ is an element of $Y$. This allows for the possibility that $X = Y$. If $X \subseteq Y$ but $X \neq Y$ then we sometimes write $X \subset Y$ or $X \subsetneq Y$ and say that $X$ is a proper subset of $Y$. If $X$ is not a subset of $Y$ we write $X \nsubseteq Y$, and this means that at least one element of $X$ is not an element of $Y$.
In practice, to show that $X$, $Y$ are equal it is enough to show that $X \subseteq Y$ and $Y \subseteq X$, i.e.,
\[ (x \in X) \Longrightarrow (x \in Y) \quad\text{and}\quad (y \in Y) \Longrightarrow (y \in X). \]
We also have a transitivity law: if $X, Y, Z$ are sets, then
\[ (X \subseteq Y \text{ and } Y \subseteq Z) \Longrightarrow X \subseteq Z, \]
where ($X \subseteq Y$ and $Y \subseteq Z$) is often abbreviated to $X \subseteq Y \subseteq Z$.
Example 2.1. The following are sets with the stated elements.
• The empty set $\emptyset$, which has no elements; thus for any object $x$, $x \notin \emptyset$.
• The set of natural numbers $\mathbb{N} = \{1, 2, 3, \ldots\}$. We will also write $\mathbb{N}_0 = \{0, 1, 2, 3, \ldots\}$, which some people also call the set of natural numbers.
• The set of integers $\mathbb{Z} = \{\ldots, -3, -2, -1, 0, 1, 2, 3, \ldots\}$.
• The set of even integers $E = \{\ldots, -4, -2, 0, 2, 4, \ldots\} = \{2k : k \in \mathbb{Z}\}$.
• The set of integer solutions of the equation $(x^2 - 4)(x^2 - 9) = 0$ is
\[ \{n \in \mathbb{Z} : (n^2 - 4)(n^2 - 9) = 0\} = \{-3, -2, 2, 3\}. \]
• The set of subsets of the set $X = \{1, 2\}$ is $\{\emptyset, \{1\}, \{2\}, X\}$.
Definition 2.2. Given a set $X$, the set whose elements are all the subsets of $X$ (including $\emptyset$ and $X$) is called the power set of $X$ and is denoted $\mathcal{P}(X)$ or $2^X$.
The notation $2^X$ is used because a subset $U \subseteq X$ corresponds to its indicator or characteristic function, which will be used later in the proof of Theorem 2.38. Also, if $X$ is a finite set with $n$ elements, then $\mathcal{P}(X)$ is also finite and has $2^n$ elements.
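For small finite sets the power set can be listed explicitly, which makes the count $2^n$ easy to check. The following is a minimal sketch; the function name `power_set` is an illustrative assumption.

```python
from itertools import combinations


def power_set(xs):
    """Return a list of all subsets of the finite collection xs, smallest first."""
    items = list(xs)
    return [set(c) for r in range(len(items) + 1) for c in combinations(items, r)]


print(power_set({1, 2}))               # [set(), {1}, {2}, {1, 2}]
print(len(power_set(range(5))) == 2 ** 5)  # True
```

The subsets of size $r$ are produced for $r = 0, 1, \ldots, n$ in turn, so the empty set comes first and the whole set last.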
Unions and Intersections. Given two sets $A$, $B$, we can form their union
\[ A \cup B = \{x : x \in A \text{ or } x \in B\}. \]
Here the word "or" is the inclusive "or" which allows for both possibilities to be true. Thus
\[ \{0, 1, 2\} \cup \{1, 2, 3\} = \{x : x = 0, 1, 2 \text{ or } x = 1, 2, 3\} = \{0, 1, 2, 3\}. \]
If we have a finite collection of sets $A_1, \ldots, A_n$, then
\[ A_1 \cup \cdots \cup A_n = \{x : x \in A_1 \text{ or } x \in A_2 \text{ or } \ldots \text{ or } x \in A_n\} = \{x : \text{there is an } i = 1, 2, \ldots, n \text{ s.t. } x \in A_i\}, \]
where the phrase "there is an $i = 1, 2, \ldots, n$ s.t. $x \in A_i$" really means "there is at least one $i = 1, 2, \ldots, n$ s.t. $x \in A_i$".
Notice that the operation of union is symmetric:
\[ A \cup B = B \cup A, \]
and also associative:
\[ (A \cup B) \cup C = A \cup (B \cup C), \]
and under $\cup$, $\emptyset$ behaves like zero under addition:
\[ A \cup \emptyset = A = \emptyset \cup A. \]
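But for a non-empty set $A$, there is no inverse since we always have $A \subseteq A \cup B$ and $B \subseteq A \cup B$, so $A \cup B$ is at least as big as $A$ in the sense that it contains all the elements of $A$. The associative law allows us to set
\[ A \cup B \cup C = (A \cup B) \cup C = A \cup (B \cup C). \]
The intersection of $A$, $B$ is
\[ A \cap B = \{x : x \in A \text{ and } x \in B\}. \]
This time we have $A \cap B \subseteq A$ and $A \cap B \subseteq B$. Again we have some useful properties:
\[ A \cap B = B \cap A, \qquad (A \cap B) \cap C = A \cap (B \cap C), \qquad A \cap \emptyset = \emptyset = \emptyset \cap A, \]
so under $\cap$, $\emptyset$ behaves like zero under multiplication. Again we can write
\[ A \cap B \cap C = (A \cap B) \cap C = A \cap (B \cap C). \]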
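Example 2.3. Let $X$ and $Y$ be sets.
(a) If $X \cap Y = X$, show that $X \subseteq Y$.
(b) Show that $X \cap X = X = X \cup X$.
Solution. (a) If $X \cap Y = X$, then $X = X \cap Y \subseteq Y$, so $X \subseteq Y$.
(b) We have
\[ X = \{x : x \in X\} = \{x : x \in X \text{ and } x \in X\} = X \cap X. \]
Similarly,
\[ X \cup X = \{x : x \in X \text{ or } x \in X\} = \{x : x \in X\} = X. \]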
Let $J$ be a set. Let $A_j$, where $j \in J$, be a collection of sets defined for each element of $J$. We call $J$ the indexing set for the collection and say that the $A_j$ are indexed by $J$. We sometimes indicate that a collection of sets is indexed by $J$ by writing $A_j$ ($j \in J$).
We write $\bigcup_{j \in J} A_j$ for the union of all the sets $A_j$, which is defined by
\[ \bigcup_{j \in J} A_j = \{x : \text{there exists a } j \in J \text{ s.t. } x \in A_j\}. \]
Similarly, $\bigcap_{j \in J} A_j$ is the intersection of all the sets $A_j$, which is defined by
\[ \bigcap_{j \in J} A_j = \{x : \text{for all } j \in J,\ x \in A_j\}. \]
For indexing sets such as $\{1, 2\}$, $\{1, 2, 3\}$, $\{1, 2, \ldots, n\}$ and $\mathbb{N}$, variations are used such as
\[ A_1 \cup A_2, \quad A_1 \cup A_2 \cup A_3, \quad A_1 \cup A_2 \cup \cdots \cup A_n, \quad \bigcup_{j=1}^{n} A_j, \quad A_1 \cup A_2 \cup \cdots, \quad \bigcup_{j=1}^{\infty} A_j. \]
Two sets $A$ and $B$ are said to be disjoint if they have empty intersection, i.e., $A \cap B = \emptyset$. More generally, a family of sets $\{A_j : j \in J\}$ is said to be pairwise disjoint if $A_j \cap A_k = \emptyset$ whenever $j \neq k$.
The operations $\cup$ and $\cap$ obey various laws analogous to the laws of arithmetic which allow systematic manipulations with sets.
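Distributive laws. Let $A$, $A_j$ ($j \in J$) be sets. Then
\[ A \cap \bigcup_{j \in J} A_j = A \cap \Bigl( \bigcup_{j \in J} A_j \Bigr) = \bigcup_{j \in J} (A \cap A_j), \]
\[ A \cup \bigcap_{j \in J} A_j = A \cup \Bigl( \bigcap_{j \in J} A_j \Bigr) = \bigcap_{j \in J} (A \cup A_j). \]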
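Example 2.4. For three sets $A$, $B$, $C$, show the following equations are valid:
(a) $A \cap (B \cup C) = (A \cap B) \cup (A \cap C)$,
(b) $A \cup (B \cap C) = (A \cup B) \cap (A \cup C)$.
Solution. (a) We have
\begin{align*}
x \in A \cap (B \cup C) &\iff x \in A \text{ and } x \in B \cup C \\
&\iff x \in A \text{ and } (x \in B \text{ or } x \in C) \\
&\iff (x \in A \text{ and } x \in B) \text{ or } (x \in A \text{ and } x \in C) \\
&\iff (x \in A \cap B) \text{ or } (x \in A \cap C) \\
&\iff x \in (A \cap B) \cup (A \cap C),
\end{align*}
hence
\[ A \cap (B \cup C) = (A \cap B) \cup (A \cap C). \]
(b) We have
\begin{align*}
x \in A \cup (B \cap C) &\iff x \in A \text{ or } x \in B \cap C \\
&\iff x \in A \text{ or } (x \in B \text{ and } x \in C) \\
&\iff (x \in A \text{ or } x \in B) \text{ and } (x \in A \text{ or } x \in C) \\
&\iff x \in A \cup B \text{ and } x \in A \cup C \\
&\iff x \in (A \cup B) \cap (A \cup C),
\end{align*}
hence
\[ A \cup (B \cap C) = (A \cup B) \cap (A \cup C). \]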
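The two identities of Example 2.4 can also be tested exhaustively on a small universe. The sketch below is only a sanity check and not a proof; the helper names `subsets` and `distributive_laws_hold` are assumptions introduced for illustration.

```python
from itertools import combinations


def subsets(universe):
    """All subsets of a finite universe, as frozensets."""
    items = list(universe)
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]


def distributive_laws_hold(A, B, C):
    """Check both distributive laws of Example 2.4 for the given finite sets."""
    law_a = A & (B | C) == (A & B) | (A & C)
    law_b = A | (B & C) == (A | B) & (A | C)
    return law_a and law_b


# Test every triple of subsets of {0, 1, 2}: 8 * 8 * 8 = 512 cases.
all_subsets = subsets({0, 1, 2})
print(all(distributive_laws_hold(A, B, C)
          for A in all_subsets for B in all_subsets for C in all_subsets))  # True
```

Python's `&` and `|` on sets are intersection and union, so each line mirrors one of the stated equations.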
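Complements and symmetric differences. Let $A$ and $B$ be sets. We write $A \setminus B$ (sometimes denoted $A - B$) for the relative complement or the complement of $B$ in $A$, defined by
\[ A \setminus B = \{x : x \in A \text{ and } x \notin B\}. \]
When we are concerned only with subsets of a fixed universal set $X$ then $X \setminus A$ is called the complement of $A$ in $X$ and is denoted $\complement_X A$ or $\complement A$. Simple properties of (relative) complements include:
\[ \complement(\complement A) = A, \qquad A \setminus B = A \cap \complement B \]
and
\[ A \subseteq B \iff \complement B \subseteq \complement A. \]
The symmetric difference of the sets $A$ and $B$ is
\[ A \mathbin{\triangle} B = (A \setminus B) \cup (B \setminus A) = \{x \in A \cup B : x \notin A \cap B\}. \]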
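In contrast to $A \setminus B$, $A \mathbin{\triangle} B$ is symmetric in $A$, $B$.
Example 2.5. Show that for any two sets $A$, $B$, we have $B \mathbin{\triangle} A = A \mathbin{\triangle} B$.
Solution. Since $\cup$ is symmetric,
\begin{align*}
B \mathbin{\triangle} A &= (B \setminus A) \cup (A \setminus B) \\
&= (A \setminus B) \cup (B \setminus A) \\
&= A \mathbin{\triangle} B.
\end{align*}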
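De Morgan's laws. Let $A$, $A_j$ ($j \in J$) be sets. Then
\[ A \setminus \bigcup_{j \in J} A_j = A \setminus \Bigl( \bigcup_{j \in J} A_j \Bigr) = \bigcap_{j \in J} (A \setminus A_j), \]
\[ A \setminus \bigcap_{j \in J} A_j = A \setminus \Bigl( \bigcap_{j \in J} A_j \Bigr) = \bigcup_{j \in J} (A \setminus A_j). \]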
Cartesian products. Let $n \in \mathbb{N}$ and let $A_1, A_2, \ldots, A_n$ be sets. The Cartesian product (or just the product) of $A_1, A_2, \ldots, A_n$ is the set
\[ A_1 \times A_2 \times \cdots \times A_n = \{(a_1, a_2, \ldots, a_n) : a_1 \in A_1, \ldots, a_n \in A_n\} \]
which consists of all ordered $n$-tuples $(a_1, a_2, \ldots, a_n)$, where $a_k \in A_k$ ($k = 1, 2, \ldots, n$). When $A_1 = A_2 = \cdots = A_n = A$, we set
\[ A^n = A \times A \times \cdots \times A = \{(a_1, a_2, \ldots, a_n) : a_1, \ldots, a_n \in A\}. \]
Thus, in particular, for sets $A$ and $B$,
\[ A \times B = \{(x, y) : x \in A,\ y \in B\}, \qquad A^2 = \{(x, y) : x, y \in A\}. \]
For sets $A_1, A_2, \ldots, A_n$ and $B_1, B_2, \ldots, B_n$,
\[ (A_1 \times \cdots \times A_n) \cup (B_1 \times \cdots \times B_n) \subseteq (A_1 \cup B_1) \times \cdots \times (A_n \cup B_n), \]
\[ (A_1 \times \cdots \times A_n) \cap (B_1 \times \cdots \times B_n) = (A_1 \cap B_1) \times \cdots \times (A_n \cap B_n). \]
Strict inclusion is possible in the first of these.
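A small computation illustrates both the equality for intersections and the possibility of strict inclusion for unions. The particular sets below are an arbitrary choice made for this sketch, and `cartesian` is an illustrative helper name.

```python
from itertools import product


def cartesian(*sets):
    """Cartesian product of finitely many finite sets, as a set of tuples."""
    return set(product(*sets))


A1, A2 = {0, 1}, {0}
B1, B2 = {1, 2}, {0, 1}

union_of_products = cartesian(A1, A2) | cartesian(B1, B2)
product_of_unions = cartesian(A1 | B1, A2 | B2)
intersection_of_products = cartesian(A1, A2) & cartesian(B1, B2)
product_of_intersections = cartesian(A1 & B1, A2 & B2)

print(union_of_products < product_of_unions)                 # True: strict inclusion
print(product_of_unions - union_of_products)                 # {(0, 1)}
print(intersection_of_products == product_of_intersections)  # True
```

The pair $(0, 1)$ lies in $(A_1 \cup B_1) \times (A_2 \cup B_2)$ but in neither $A_1 \times A_2$ nor $B_1 \times B_2$, which is why the first inclusion is strict here.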
Venn diagrams. When working with sets, it is sometimes useful to represent them pictorially with Venn diagrams. For example, in the following diagram we have two subsets $A$, $B$ of a set $X$ (represented by the square) and the areas within the circles represent the elements of $A$, $B$. The overlapping region represents $A \cap B$ and the region in the square not inside the circles represents the complement $\complement_X(A \cup B)$.
[Figure: Venn diagram of two overlapping circles $A$, $B$ inside a square $X$]
The diagram
[Figure: Venn diagram of three overlapping circles $A$, $B$, $C$ inside a square $X$]
represents the intersection of three subsets A, B, C of X. It is instructive to work out the regions
corresponding to the subsets A B C, (A B) C, (A B) C.
2.2. Functions
We will review some basic ideas about functions (sometimes known as mappings), including properties of composition and inverse functions. Let $X$, $Y$ and $Z$ be sets.
Definition 2.6. A function (or mapping) $f : X \to Y$ from $X$ to $Y$ involves a rule which associates to each $x \in X$ a unique element $f(x) \in Y$. $X$ is called the domain of $f$, denoted $\operatorname{dom} f$, while $Y$ is called the codomain of $f$, denoted $\operatorname{codom} f$. Sometimes the rule is indicated by writing $x \mapsto f(x)$, where $f(x)$ means the element of $Y$ resulting from applying $f$ to the element $x$ of $X$.
Two functions $f$ and $g$ are equal if and only if the following three conditions are satisfied:
• their domains are equal, i.e., $\operatorname{dom} f = \operatorname{dom} g$;
• their codomains are equal, i.e., $\operatorname{codom} f = \operatorname{codom} g$;
• their rules agree, i.e., for every $x \in \operatorname{dom} f = \operatorname{dom} g$, $f(x) = g(x)$.
Definition 2.7. Given functions $f : X \to Y$ and $g : Y \to Z$, we can form their composition $g \circ f : X \to Z$ which has the rule
\[ g \circ f(x) = g(f(x)). \]
Note the order of composition here: we apply the functions successively from right to left.
[Diagram: arrows $f : X \to Y$ and $g : Y \to Z$, together with the composite $g \circ f : X \to Z$]
We often write $gf$ for $g \circ f$ when no confusion will arise. If $X$, $Y$, $Z$ are different it may not be possible to define $f \circ g$ since $\operatorname{dom} f = X$ may not be the same as $\operatorname{codom} g = Z$.
The composition of two functions can be generalised to include a sequence of functions
\[ f_1 : X_1 \to X_2, \quad f_2 : X_2 \to X_3, \quad \ldots, \quad f_n : X_n \to X_{n+1}, \]
and the composition
\[ f_n \circ f_{n-1} \circ \cdots \circ f_1 : X_1 \to X_{n+1}, \]
often denoted $f_n f_{n-1} \cdots f_1$, for which the rule is
\[ (f_n \circ f_{n-1} \circ \cdots \circ f_1)(x) = f_n(f_{n-1}(\cdots f_1(x) \cdots)). \]
This definition allows the value to be determined recursively. The order of bracketing here is essentially unimportant.
Proposition 2.8. Let $f : X \to Y$, $g : Y \to Z$ and $h : Z \to W$ be three functions. Then the functions $h \circ (g \circ f) : X \to W$ and $(h \circ g) \circ f : X \to W$ are equal.
Proof. The domains and codomains of $h \circ (g \circ f)$ and $(h \circ g) \circ f$ agree. For $x \in X$, we have
\begin{align*}
(h \circ (g \circ f))(x) &= h((g \circ f)(x)) \\
&= h(g(f(x))) \\
&= (h \circ g)(f(x)) \\
&= ((h \circ g) \circ f)(x),
\end{align*}
so the rules also agree. Therefore the functions are equal.
This result says that composition of functions is associative. We usually write $h \circ g \circ f$ for $h \circ (g \circ f) = (h \circ g) \circ f$. We can also express it by saying that we get the same answer if we take an element of $X$ and apply the functions in order by chasing around the solid arrows in the following diagram in any possible order.
[Diagram: solid arrows $f : X \to Y$, $g : Y \to Z$, $h : Z \to W$, $g \circ f : X \to Z$ and $h \circ g : Y \to W$, with a dotted arrow $X \to W$]
Such a diagram whose edges represent functions and where the compositions of successive arrows always give the same answer is called commutative or a commutative diagram. The dotted arrow represents $h \circ g \circ f$.
We will usually forget the brackets when composing more than two functions since it can be shown that the bracketing is irrelevant. This "obvious" fact is actually a theorem that can be proved by induction on the number of functions being composed.
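Composition and its associativity are easy to experiment with for concrete functions. The following is a minimal sketch; `compose` and the sample functions `f`, `g`, `h` are illustrative assumptions, and checking a few values is of course no substitute for the proof of Proposition 2.8.

```python
from functools import reduce


def compose(*funcs):
    """compose(h, g, f)(x) == h(g(f(x))): functions are applied from right to left."""
    def composite(x):
        return reduce(lambda value, fn: fn(value), reversed(funcs), x)
    return composite


f = lambda x: x + 1   # applied first
g = lambda x: 2 * x   # applied second
h = lambda x: x ** 2  # applied last

left = compose(h, compose(g, f))   # h o (g o f)
right = compose(compose(h, g), f)  # (h o g) o f
print(left(3), right(3), compose(h, g, f)(3))  # 64 64 64
```

All three bracketings send $3$ to $h(g(f(3))) = h(g(4)) = h(8) = 64$.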
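Definition 2.9. For any set $X$, the identity function $\mathrm{Id}_X : X \to X$ has the rule
\[ x \mapsto \mathrm{Id}_X(x) = x. \]
More generally, if $A \subseteq X$ then the function $\mathrm{inc}_A : A \to X$ with rule
\[ \mathrm{inc}_A(x) = x \]
for $x \in A$ is called the inclusion function of $A$ into $X$.
Note that although the rules of $\mathrm{Id}_X$ and $\mathrm{inc}_A$ are the same, they are different functions unless $A = X$.
Example 2.10. Let $X = Y = \mathbb{R}$. Then the following rules define functions $\mathbb{R} \to \mathbb{R}$:
\[ x \mapsto x + 1, \quad x \mapsto x^2, \quad x \mapsto \frac{1}{x^2 + 1}, \quad x \mapsto \sin x, \quad x \mapsto e^{5x^3}. \]
Example 2.11. Let $X = Y = \mathbb{R}^+$, the set of all positive real numbers. Then the following rules define two functions $\mathbb{R}^+ \to \mathbb{R}$:
\[ x \mapsto \sqrt{x}, \quad x \mapsto -\sqrt{x}. \]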
Example 2.12. Let $X$ and $Y$ be sets and suppose that $w \in Y$ is an element. The constant function $c_w : X \to Y$ taking value $w$ has the rule
\[ x \mapsto c_w(x) = w. \]
For example, if $X = Y = \mathbb{R}$ then $c_0$ is the function which returns the value $c_0(x) = 0$ for every real number $x \in \mathbb{R}$.
Definition 2.13. A function $f : X \to Y$ is
• injective (or an injection or one-to-one) if for $x_1, x_2 \in X$,
\[ f(x_1) = f(x_2) \Longrightarrow x_1 = x_2, \]
• surjective (or a surjection or onto) if for every $y \in Y$ there is an $x \in X$ for which $y = f(x)$,
• bijective (or a bijection or a one-to-one correspondence) if it is both injective and surjective.
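For functions between finite sets these three properties can be tested mechanically. In the sketch below a function is modelled as a Python dictionary from domain elements to values; this representation and the names `is_injective`, `is_surjective`, `is_bijective` are assumptions made for illustration.

```python
def is_injective(f):
    """f is a dict; injective means no value is hit twice."""
    values = list(f.values())
    return len(values) == len(set(values))


def is_surjective(f, codomain):
    """Surjective means every element of the codomain is a value of f."""
    return set(codomain) <= set(f.values())


def is_bijective(f, codomain):
    return is_injective(f) and is_surjective(f, codomain)


square = {x: x * x for x in (-2, -1, 1, 2)}  # a function {-2,-1,1,2} -> {1,4}
print(is_injective(square))           # False: (-1)^2 == 1^2
print(is_surjective(square, {1, 4}))  # True
shift = {n: n + 1 for n in range(5)}  # a function {0,...,4} -> {1,...,5}
print(is_bijective(shift, range(1, 6)))  # True
```

Note that surjectivity depends on the stated codomain, in line with the definition of equality of functions given earlier.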
We will use the following basic fact.
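Proposition 2.14. The function $f : X \to Y$ is a bijection if and only if there is an inverse function $Y \to X$, which is the unique function $h : Y \to X$ satisfying
\[ h \circ f = \mathrm{Id}_X, \qquad f \circ h = \mathrm{Id}_Y. \]
If such an inverse exists it is usual to denote it by $f^{-1} : Y \to X$, and then we have
\[ f^{-1} \circ f = \mathrm{Id}_X, \qquad f \circ f^{-1} = \mathrm{Id}_Y. \]
In the next example, we use the notations
\[ \mathbb{R}^+ = \{x \in \mathbb{R} : x > 0\} \subseteq \mathbb{R}, \qquad \mathbb{R}^- = \{x \in \mathbb{R} : x < 0\} \subseteq \mathbb{R}, \]
for the sets of positive and negative real numbers respectively.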
Example 2.15. Which of the following functions are bijections?
\begin{align*}
f &: \mathbb{R} \setminus \{0\} \to \mathbb{R}^+; \quad f(x) = x^2, \\
g &: \mathbb{R}^+ \to \mathbb{R}^+; \quad g(x) = x^2, \\
h &: \mathbb{R}^- \to \mathbb{R}^+; \quad h(x) = x^2.
\end{align*}
Solution. $f$ is not injective since for $x \in \mathbb{R}^+$, $f(-x) = x^2 = f(x)$. It is surjective since for every $y \in \mathbb{R}^+$, $f(\sqrt{y}) = y$.
$g$ is injective since if $x_1, x_2 \in \mathbb{R}^+$ satisfy $x_1^2 = x_2^2$, then $x_2 = x_1$. It is also surjective since every $y \in \mathbb{R}^+$ has a (unique) positive real square root. So $g$ is a bijection and its inverse is $g^{-1} : \mathbb{R}^+ \to \mathbb{R}^+$ with rule
\[ g^{-1}(y) = \sqrt{y}. \]
The details for $h$ are similar to $g$. It is injective since if $x_1, x_2 \in \mathbb{R}^-$ satisfy $x_1^2 = x_2^2$, then $x_1 = x_2$. It is surjective since if $y \in \mathbb{R}^+$, then $h(-\sqrt{y}) = y$. Hence $h$ is a bijection and its inverse $h^{-1} : \mathbb{R}^+ \to \mathbb{R}^-$ has the rule
\[ h^{-1}(y) = -\sqrt{y}. \]
Example 2.16. Consider the three functions
\[ E_1 : \mathbb{R} \to \mathbb{R}, \quad E_2 : \mathbb{R} \to \mathbb{R}^+, \quad E_3 : \mathbb{R}^+ \to \mathbb{R}^+, \]
which have the rules
\[ E_1(x) = E_2(x) = E_3(x) = e^x. \]
Investigate which of these is injective, surjective or bijective. For each of the bijective ones, find its inverse.
Solution. The first function is the usual exponential function. For real numbers $x_1, x_2$ we know that $e^{x_1} = e^{x_2}$ if and only if $x_1 = x_2$, hence $E_1$ is injective. In fact, the other two are as well since they have the same rule.
Since $e^x > 0$ for $x \in \mathbb{R}$, $E_1$ is not surjective since for example $-1 \neq e^x$ for any real $x$. On the other hand, if $y \in \mathbb{R}^+$ then
\[ y = e^{\log y} = E_2(\log y), \]
where $\log y \in \mathbb{R}$, so $E_2$ is surjective. But since $\log y > 0$ only if $y > 1$, $E_3$ is not surjective (for example $1 \neq e^x$ for any $x > 0$).
The function $E_2$ is bijective since it is both injective and surjective. Its inverse function is
\[ E_2^{-1} : \mathbb{R}^+ \to \mathbb{R}; \quad E_2^{-1}(y) = \log y. \]
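Definition 2.17. Let $f : X \to Y$ be a function.
• If $A \subseteq X$, then the image of $A$ under $f$ is
\[ fA = f(A) = \{f(a) : a \in A\} = \{y \in Y : \text{for some } a \in A,\ y = f(a)\} \subseteq Y. \]
When $A = X$, $fX$ is called the image of $f$, and is also denoted $\operatorname{im} f$ or $\operatorname{Im} f$. We always have $f\emptyset = \emptyset$.
• If $B \subseteq Y$, then the preimage or inverse image of $B$ under $f$ is
\[ f^{-1}B = f^{-1}(B) = \{x \in X : f(x) \in B\} \subseteq X. \]
Notice that $f^{-1}\emptyset = \emptyset$ and $f^{-1}Y = X$. Furthermore, if $f$ is surjective, then for any $y \in Y$, $f^{-1}\{y\} \neq \emptyset$.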
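Example 2.18. Let $f : X \to Y$ be a function. If $U \subseteq U' \subseteq X$ and $V \subseteq V' \subseteq Y$, show that: (a) $fU \subseteq fU'$, (b) $f^{-1}V \subseteq f^{-1}V'$.
Solution. (a) If $y \in fU$ then for some $u \in U$, $y = f(u)$, but since $u \in U \subseteq U'$, we have $u \in U'$ and so $y \in fU'$. Hence $fU \subseteq fU'$.
(b) Suppose that $x \in f^{-1}V$. Then $f(x) \in V \subseteq V'$, hence $f(x) \in V'$ and so $x \in f^{-1}V'$. Therefore $f^{-1}V \subseteq f^{-1}V'$.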
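Example 2.19. Let $f : X \to Y$ be a function. If $U, V \subseteq X$, investigate whether or not the following are always true:
\[ f(U \cup V) = fU \cup fV, \qquad f(U \cap V) \subseteq fU \cap fV. \]
Solution. If $x \in U \cup V$ then either $x \in U$ or $x \in V$. The first possibility gives $f(x) \in fU$, while the second gives $f(x) \in fV$, hence $f(x) \in fU \cup fV$. So we always have $f(U \cup V) \subseteq fU \cup fV$.
If $y \in fU \cup fV$ then either $y \in fU$ or $y \in fV$. The first possibility gives $y \in fU \subseteq f(U \cup V)$, while the second gives $y \in fV \subseteq f(U \cup V)$, so in either case we have $y \in f(U \cup V)$. Thus we always have $fU \cup fV \subseteq f(U \cup V)$.
Combining these results we find that
\[ f(U \cup V) \subseteq fU \cup fV \subseteq f(U \cup V) \]
and so $f(U \cup V) = fU \cup fV$ always holds.
If $x \in U \cap V$ then $x \in U$ and $x \in V$, so $f(x) \in fU$ and $f(x) \in fV$, hence $f(x) \in fU \cap fV$. This shows that $f(U \cap V) \subseteq fU \cap fV$ is always true.
This time $fU \cap fV \subseteq f(U \cap V)$ need not hold though. Here is a counterexample. Take $f : \mathbb{R} \to \mathbb{R}$; $f(x) = x^2$ and let
\[ U = \mathbb{R}^+ = \{t \in \mathbb{R} : t > 0\}, \qquad V = \mathbb{R}^- = \{t \in \mathbb{R} : t < 0\}. \]
Then $f(-1) = 1 = f(1)$, so $1 \in fU \cap fV$. But $U \cap V = \emptyset$, so $f(U \cap V) = f\emptyset = \emptyset$, hence $1 \notin f(U \cap V)$ and therefore $fU \cap fV \nsubseteq f(U \cap V)$.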
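A finite analogue of this counterexample can be run directly. In the sketch below the helper `image` and the finite sets standing in for $\mathbb{R}^+$ and $\mathbb{R}^-$ are assumptions chosen for illustration.

```python
def image(f, subset):
    """The image fA = {f(a) : a in A} of a finite set A under f."""
    return {f(a) for a in subset}


def square(x):
    return x * x


U = {1, 2, 3}     # finite stand-in for the positive reals
V = {-1, -2, -3}  # finite stand-in for the negative reals

print(image(square, U | V) == image(square, U) | image(square, V))  # True
print(image(square, U & V))                                          # set()
print(image(square, U) & image(square, V))                           # {1, 4, 9}
```

The image of the union equals the union of the images, while the image of the (empty) intersection is strictly smaller than the intersection of the images.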
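Lemma 2.20. Let $f : X \to Y$ be a surjective function and let $U, V \subseteq Y$. Then
\[ f^{-1}U \cap f^{-1}V = \emptyset \iff U \cap V = \emptyset. \]
Proof. Suppose that $f^{-1}U \cap f^{-1}V = \emptyset$. If $y \in U \cap V$ then $y \in U$ and so $\emptyset \neq f^{-1}\{y\} \subseteq f^{-1}U$. Similarly, $y \in V$ and so $f^{-1}\{y\} \subseteq f^{-1}V$. Thus
\[ f^{-1}\{y\} \subseteq f^{-1}U \cap f^{-1}V = \emptyset, \]
hence $f^{-1}\{y\} = \emptyset$, contradicting the fact that $f^{-1}\{y\} \neq \emptyset$. So $U \cap V = \emptyset$.
Suppose that $U \cap V = \emptyset$. Let $x \in f^{-1}U \cap f^{-1}V$. Then since $x \in f^{-1}U$, $f(x) \in U$. Similarly, $f(x) \in V$ and therefore $f(x) \in U \cap V = \emptyset$, giving a contradiction.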
2.3. Induction and the Well Ordering Principle
Before passing on to study finite and infinite sets, we discuss some properties of the natural numbers underlying many proofs and already mentioned briefly in Section 1.1. First we recall the familiar Principle of Mathematical Induction (PMI). This is the basic assumption about $\mathbb{N}$ which allows us to deal with the fact that $\mathbb{N}$ is infinite.
Suppose that for each $n \in \mathbb{N}$, $P(n)$ is a statement which may depend on $n$. Assume that
• the statement $P(1)$ is true,
• for $k \in \mathbb{N}$, if $P(k)$ is true then $P(k+1)$ is true.
Then $P(n)$ is true for every $n \in \mathbb{N}$.
Proposition 2.21 (Well Ordering Principle (WOP)). Let $S \subseteq \mathbb{N}$ be any non-empty subset. Then $S$ has a minimal (or least) element.
Proof. For $n \in \mathbb{N}$, consider the following statement $P(n)$:
If $S \subseteq \mathbb{N}$ and $S \cap \{1, 2, \ldots, n\} \neq \emptyset$, then $S$ has a minimal element.
The statement $P(1)$ is true since if $1 \in S$ then $1$ must be minimal since there are no natural numbers smaller than $1$.
Now suppose that $P(k)$ is true. Let $S \subseteq \mathbb{N}$ and suppose that $S \cap \{1, 2, \ldots, k+1\} \neq \emptyset$. Now either $S \cap \{1, 2, \ldots, k\} \neq \emptyset$, in which case $S$ has a minimal element because $P(k)$ is true, or $k+1 \in S$ and $S \cap \{1, 2, \ldots, k\} = \emptyset$. But then $k+1$ is clearly the minimal element of $S$. So $P(k+1)$ is true.
By PMI, $P(n)$ is true for every $n \in \mathbb{N}$. This implies that every non-empty subset of $\mathbb{N}$ has a minimal element.
Remark 2.22. In fact, PMI and WOP are logically equivalent. We have seen in the proof of Proposition 2.21 that
\[ \text{PMI} \Longrightarrow \text{WOP}. \]
The converse implication is also true. For if $P$ is a statement for which $P(1)$ is true and $P(k) \Longrightarrow P(k+1)$, let us assume that there is some $m$ for which $P(m)$ is false. Consider the set
\[ S = \{r \in \mathbb{N} : P(r) \text{ is false}\} \subseteq \mathbb{N}. \]
Then $m \in S$ so $S$ is non-empty. By WOP, $S$ has a minimal element, say $m_0$. Then $m_0 > 1$, and $(m_0 - 1) \notin S$, so $P(m_0 - 1)$ is true. Since $P(m_0 - 1) \Longrightarrow P(m_0)$, we have $P(m_0)$. But it is impossible to have $P(m_0)$ both true and false, so this contradiction shows that no such element can exist and therefore $S = \emptyset$, i.e., $P(n)$ is true for all $n \in \mathbb{N}$.
So as was claimed in Theorem 1.2, PMI and WOP are equivalent conditions on $\mathbb{N}$,
\[ \text{PMI} \iff \text{WOP}. \]
We will often use WOP when considering non-empty subsets of $\mathbb{N}$, and we simply assert that such a subset has a minimal element.
2.4. Finite sets and cardinality
The natural numbers originally arose from counting elements in sets. There are two very different possible sizes for sets, namely finite and infinite, and in this section we discuss these concepts in detail.
For a positive natural number $n \geq 1$, set
\[ \underline{n} = \{1, 2, 3, \ldots, n\}. \]
If $n = 0$, we set $\underline{0} = \emptyset$. Then the set $\underline{n}$ has $n$ elements and we can think of it as the standard set of that size. We will use the notations
\[ \mathbb{N} = \{1, 2, 3, \ldots\}, \qquad \mathbb{N}_0 = \{0, 1, 2, 3, \ldots\} = \{0\} \cup \mathbb{N}. \]
Definition 2.23. A set $X$ is finite if for some $n \in \mathbb{N}_0$ there is a bijection $\underline{n} \to X$. If $X$ is not finite, it is infinite.
Notice that the empty set $\emptyset = \underline{0}$ is finite.
The next result is a formal version of what is usually called the Pigeonhole Principle.
Theorem 2.24 (Pigeonhole Principle: first version). Let $m$ and $n$ be natural numbers.
(a) If there is an injection $\underline{m} \to \underline{n}$, then $m \leq n$.
(b) If there is a surjection $\underline{m} \to \underline{n}$, then $m \geq n$.
(c) If there is a bijection $\underline{m} \to \underline{n}$, then $m = n$.
Proof. We begin by remarking that for any $m$, there is no function $\underline{m} \to \underline{0}$ except when $m = 0$, in which case there is exactly one.
(a) Suppose that injections $\underline{m} \to \underline{n}$ where $m > n \geq 1$ exist. Then we may consider the non-empty set
\[ U = \{n \in \mathbb{N} : \text{there is an injection } \underline{m} \to \underline{n} \text{ for some } m > n\} \subseteq \mathbb{N}. \]
By the Well Ordering Principle, there is a minimal element $u \in U$. Suppose that $f : \underline{m} \to \underline{u}$ is an injection with $m > u$. If $f(m) \neq u$, then we can find a permutation $\sigma$ of $\underline{u}$ which interchanges $f(m)$ and $u$, and then $\sigma \circ f$ is another injection, so we can assume that $f(m) = u$. Now consider the function $f' : \underline{m-1} \to \underline{u-1}$ given by
\[ f'(j) = f(j). \]
It is easy to see that $f'$ is an injection and so $u - 1 \in U$. But $u - 1 < u$, contradicting the minimality of $u$. Therefore $U$ must be empty.
(b) Suppose that surjections $\underline{m} \to \underline{n}$ where $1 \leq m < n$ exist. Then we may consider the non-empty set
\[ V = \{n \in \mathbb{N} : \text{there is a surjection } \underline{m} \to \underline{n} \text{ for some } m < n\} \subseteq \mathbb{N}. \]
By the Well Ordering Principle, there is a minimal element $v \in V$. Suppose that $g : \underline{m} \to \underline{v}$ is a surjection with $m < v$. Again we can compose with a permutation to ensure that $g(m) = v$. Now consider the function $g' : \underline{m-1} \to \underline{v-1}$ given by
\[ g'(j) = g(j). \]
Then $g'$ is a surjection and so $v - 1 \in V$, contradicting the assumption that $v$ is minimal in $V$. Therefore $V$ is empty.
(c) This follows from (a) and (b) since a bijection is both injective and surjective.
Corollary 2.25. Suppose that $X$ is a finite set and that there are bijections $\underline{m} \to X$ and $\underline{n} \to X$. Then $m = n$.
Proof. Let $f : \underline{m} \to X$ and $g : \underline{n} \to X$ be bijections. Using the inverse $g^{-1} : X \to \underline{n}$, which is also a bijection, we can form a bijection $h = g^{-1} \circ f : \underline{m} \to \underline{n}$. By part (c) of Theorem 2.24, $m = n$.
Definition 2.26. For a finite set $X$, the unique $n \in \mathbb{N}_0$ for which there is a bijection $\underline{n} \to X$ is called the cardinality of $X$, denoted $|X|$, $\operatorname{card} X$ or $\#X$. If $X$ is infinite then we sometimes write $|X| = \infty$, while if $X$ is finite we write $|X| < \infty$.
Here is a reformulation of Theorem 2.24 which contains some important facts about cardinalities of finite sets.
Theorem 2.27 (Pigeonhole Principle). Let $X$, $Y$ be two finite sets.
(a) If there is an injection $X \to Y$, then $|X| \leq |Y|$.
(b) If there is a surjection $X \to Y$, then $|X| \geq |Y|$.
(c) If there is a bijection $X \to Y$, then $|X| = |Y|$.
The name Pigeonhole Principle alludes to its use when distributing $m$ letters into $n$ pigeonholes (e.g., office mailboxes). If each pigeonhole is to receive at most one letter, $m \leq n$; if each pigeonhole is to receive at least one letter, $m \geq n$.
Notice that if $X$ is a finite set and $S$ a subset, then the inclusion function $\mathrm{inc} : S \to X$ given by $\mathrm{inc}(j) = j$ is an injection. So we must have $|S| \leq |X|$. If $P$ is a proper subset then we have $|P| < |X|$ and this implies that there can be no injection $X \to P$ nor a surjection $P \to X$. These conditions actually characterise finite sets. In the next section we investigate how to recognise infinite sets.
Here are some important formulae for calculating cardinalities of finite sets which we state without proofs (you are probably already familiar with some of these). It is worthwhile trying to understand these formulae when one or both of the sets is the empty set $\emptyset$.
Theorem 2.28. Let $A$ and $B$ be finite sets. Then the following are valid.
(a) Inclusion-exclusion principle: $|A \cup B| = |A| + |B| - |A \cap B|$.
(b) Product formula: $|A \times B| = |A| \, |B|$.
(c) Exponential formula: $|B^A| = |B|^{|A|}$, where $B^A$ denotes the set of all functions from $A$ to $B$.
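The three formulae can be checked by brute force on small examples, including the empty-set cases mentioned above. This is a sketch under the assumption that a function $A \to B$ is modelled as a dictionary; the helper names `all_functions` and `check_formulae` are illustrative.

```python
from itertools import product


def all_functions(domain, codomain):
    """List every function domain -> codomain, each represented as a dict."""
    dom = sorted(domain)
    cod = sorted(codomain)
    return [dict(zip(dom, values)) for values in product(cod, repeat=len(dom))]


def check_formulae(A, B):
    """Verify the three cardinality formulae of Theorem 2.28 for finite sets A, B."""
    inclusion_exclusion = len(A | B) == len(A) + len(B) - len(A & B)
    product_formula = len(set(product(A, B))) == len(A) * len(B)
    exponential_formula = len(all_functions(A, B)) == len(B) ** len(A)
    return inclusion_exclusion and product_formula and exponential_formula


print(check_formulae({1, 2, 3}, {3, 4}))  # True
print(check_formulae(set(), {3, 4}))      # True: exactly one function from the empty set
print(check_formulae({1, 2, 3}, set()))   # True: no functions into the empty set
```

With $A = \emptyset$ there is exactly one function $A \to B$ (the empty one), in agreement with $|B|^0 = 1$.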
2.5. Infinite sets
As the name suggests, infinite sets are sets which are not finite. In this section we will examine some of the properties of infinite sets which are often very different from those enjoyed by finite sets and can sometimes appear very counterintuitive, which is perhaps not surprising since much of intuition is based on working with finite sets.
Theorem 2.29. Let $X$ be a set.
(a) $X$ is infinite if and only if there is an injection $X \to P$ where $P \subsetneq X$ is a proper subset.
(b) $X$ is infinite if and only if there is a surjection $Q \to X$, where $Q \subsetneq X$ is a proper subset.
(c) $X$ is infinite if and only if there is an injection $\mathbb{N} \to X$.
(d) $X$ is infinite if and only if there is a subset $T \subseteq X$ and an injection $\mathbb{N} \to T$.
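Example 2.30. Show that the set $\mathbb{N}_0 = \{0, 1, 2, \ldots\}$ is infinite.
Solution. Take the proper subset $\mathbb{N} = \{1, 2, 3, \ldots\} \subsetneq \mathbb{N}_0$ and define the function $f : \mathbb{N}_0 \to \mathbb{N}$ by $f(n) = n + 1$.
If $f(m) = f(n)$ then $m + 1 = n + 1$ so $m = n$, hence $f$ is injective. If $k \in \mathbb{N}$ then $k \geq 1$ and so $(k - 1) \geq 0$, implying $(k - 1) \in \mathbb{N}_0$, whence $f(k - 1) = k$. Thus $f$ is also surjective and hence bijective. In particular $f$ is an injection from $\mathbb{N}_0$ into a proper subset of itself, so $\mathbb{N}_0$ is infinite by Theorem 2.29(a).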
Example 2.31. Show that there are bijections between the set $\mathbb{N}_0$ and each of the sets
\[ S_1 = \{2n : n \in \mathbb{N}_0\}, \quad S_2 = \{2n + 1 : n \in \mathbb{N}_0\}, \quad S_3 = \{3n : n \in \mathbb{N}_0\}. \]
In every case find a bijection and its inverse.
Solution. For $S_1$, let $f_1 : \mathbb{N}_0 \to S_1$ be given by $f_1(n) = 2n$. Then $f_1$ is a bijection: it is injective since $2n_1 = 2n_2$ implies $n_1 = n_2$, and surjective since given $2m \in S_1$, $f_1(m) = 2m$. The inverse function is given by $f_1^{-1}(k) = k/2$.
For $S_2$, let $f_2 : \mathbb{N}_0 \to S_2$ be given by $f_2(n) = 2n + 1$. Then $f_2$ is a bijection: it is injective ($2n_1 + 1 = 2n_2 + 1$ implies $n_1 = n_2$) and surjective since given $2m + 1 \in S_2$, $f_2(m) = 2m + 1$. The inverse function is given by $f_2^{-1}(k) = (k - 1)/2$.
For $S_3$, let $f_3 : \mathbb{N}_0 \to S_3$ be given by $f_3(n) = 3n$. Then $f_3$ is a bijection: it is injective since $3n_1 = 3n_2$ implies $n_1 = n_2$, and surjective since given $3m \in S_3$, $f_3(m) = 3m$. The inverse function is given by $f_3^{-1}(k) = k/3$.
Notice that each of the sets $S_1$, $S_2$, $S_3$ is a proper subset of $\mathbb{N}_0$, yet each is in one-to-one correspondence with $\mathbb{N}_0$ itself.
2.6. Countable sets
Countable sets are either finite or their elements can be listed without repetition in an infinite sequence, so they have the same size as subsets of $\mathbb{N}$.
Definition 2.32. A set $X$ is countable if there is a bijection $S \to X$ where either $S = \underline{n}$ for some $n \in \mathbb{N}_0$ or $S = \mathbb{N}$. A countable infinite set is said to be countably infinite or of cardinality $\aleph_0$. An infinite set which is not countable is said to be uncountable.
It is easy to deduce from the definition that if $X$ is countable and if there is a bijection $X \to Y$, then $Y$ is countable. Also, every subset of a countable set is countable. See the exercises for more general results of this kind. Before giving some examples of countable sets, here is a useful result which is often helpful when verifying that a set is countable.
Proposition 2.33. Let $X$ be a non-empty set.
(a) If there is an injection $X \to \mathbb{N}$, then $X$ is countable. In particular, every non-empty subset of $\mathbb{N}$ is countable.
(b) If there is a surjection $\mathbb{N} \to X$, then $X$ is countable.
Proof. If $X$ is finite, both parts are easy to verify, so we will assume that $X$ is infinite.
(a) Suppose there is an injection $j : X \to \mathbb{N}$. Let $S_1 = \operatorname{im} j \subseteq \mathbb{N}$. As this is non-empty, $S_1$ has a least element $s_1$ say. Since $j$ is injective, there is a unique $x_1 \in X$ such that $s_1 = j(x_1)$. Now let $S_2 = \operatorname{im} j \setminus \{s_1\}$. Again $S_2$ has a least element $s_2 > s_1$ and there is a unique $x_2 \in X$ such that $s_2 = j(x_2)$. Continuing in this way, we obtain a sequence of distinct elements of $X$, $x_1, x_2, \ldots$, for which
\[ 1 \leq j(x_1) < j(x_2) < \cdots. \]
Furthermore, every element of $X$ occurs in this list since for any $x \in X$, eventually $s_k \geq j(x)$. Now we can define a function $f : \mathbb{N} \to X$ by $f(n) = x_n$, and this is a bijection, hence $X$ is countable.
For a subset of $\mathbb{N}$, the inclusion function is an injection so every non-empty subset of $\mathbb{N}$ is countable.
(b) Suppose that $q : \mathbb{N} \to X$ is a surjection. Then for each $x \in X$, the set $q^{-1}\{x\} \subseteq \mathbb{N}$ is non-empty and so it has a least element $t_x$. Now consider
\[ T = \{t_x : x \in X\} \subseteq \mathbb{N}. \]
By (a), $T$ is countable. The function $h : X \to T$ defined by $h(x) = t_x$ is a bijection and since $T$ is countable, so is $X$.
Example 2.34. Use the Fundamental Theorem of Arithmetic to show that the set of all ordered pairs of natural numbers $\mathbb{N} \times \mathbb{N}$ is countable. Generalise this to show that $\mathbb{N}^k = \mathbb{N} \times \cdots \times \mathbb{N}$ ($k$ factors) is countable.
Solution. Choose two distinct primes $p$, $q$ say. Now define $h : \mathbb{N} \times \mathbb{N} \to \mathbb{N}$ by
\[ h(r, s) = p^r q^s. \]
Then $h$ is an injection since the Fundamental Theorem of Arithmetic says that the prime power factorisation of a natural number is unique. Therefore the function $h' : \mathbb{N} \times \mathbb{N} \to \operatorname{im} h$ is a bijection. As $\operatorname{im} h \subseteq \mathbb{N}$ it is countable, hence $\mathbb{N} \times \mathbb{N}$ is countable. For the generalisation, choose any $k$ distinct primes (for example, we could require that $p_1 < p_2 < \cdots < p_k$) and define $h_k : \mathbb{N}^k \to \mathbb{N}$ by
\[ h_k(r_1, \ldots, r_k) = p_1^{r_1} \cdots p_k^{r_k}. \]
By the Fundamental Theorem of Arithmetic, this is an injection. We can now deduce that $\mathbb{N}^k$ is countable as for the case $k = 2$.
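The injection $h$ can be made concrete by choosing, say, $p = 2$ and $q = 3$; this choice, and the names `encode` and `decode`, are assumptions made for the sketch below. Decoding by repeated division recovers $(r, s)$, which is exactly the uniqueness of prime factorisation at work.

```python
def encode(r, s, p=2, q=3):
    """h(r, s) = p**r * q**s for distinct primes p and q."""
    return p ** r * q ** s


def decode(n, p=2, q=3):
    """Recover (r, s) from n = p**r * q**s; raise ValueError if n is not of that form."""
    r = 0
    while n % p == 0:
        n //= p
        r += 1
    s = 0
    while n % q == 0:
        n //= q
        s += 1
    if n != 1:
        raise ValueError("not in the image of h")
    return r, s


print(encode(4, 7))          # 34992
print(decode(encode(4, 7)))  # (4, 7)

# Injectivity on a finite grid: all 400 values are distinct.
codes = {encode(r, s) for r in range(1, 21) for s in range(1, 21)}
print(len(codes))  # 400
```

Numbers such as $5$ or $10$ are not in the image of $h$, so $h$ is an injection but not a surjection.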
The last result leads to a somewhat surprising general result.
Proposition 2.35. Suppose that $X_1, X_2, \ldots$ is a sequence of countable sets. Then the union
\[ X = \bigcup_{n \in \mathbb{N}} X_n \]
is countable.
Proof. We might as well assume that these sets are pairwise disjoint, i.e., $X_i \cap X_j = \emptyset$ if $i \neq j$. This means that for each $x \in X$ there is a unique $n$ such that $x \in X_n$. For each $n$ choose an injection $j_n : X_n \to \mathbb{N}$. Define $h : X \to \mathbb{N} \times \mathbb{N}$ by
\[ h(x) = (n, j_n(x)) \quad \text{if } x \in X_n. \]
As each $j_n$ is injective, so is $h$. Also, $\operatorname{im} h \subseteq \mathbb{N} \times \mathbb{N}$ and as $\mathbb{N} \times \mathbb{N}$ is countable, so are $\operatorname{im} h$ and $X$.
Example 2.36. The following sets are countably infinite.
(a) The set of integers $\mathbb{Z}$.
(b) $X \cup Y$ where $X$, $Y$ are countably infinite.
(c) $X \cup Y$ where $X$ is countably infinite and $Y$ is finite.
(d) The set of all ordered pairs of natural numbers
\[ \mathbb{N}_0 \times \mathbb{N}_0 = \{(m, n) : m, n \in \mathbb{N}_0\}. \]
(e) The set of all positive rational numbers
\[ \mathbb{Q}^+ = \left\{ \frac{a}{b} : a, b \in \mathbb{N} \right\}. \]
Solution.
(a) Consider the function $f : \mathbb{Z} \to \mathbb{N}$ given by
\[ f(n) = \begin{cases} 2n & \text{if } n \geq 1, \\ 1 - 2n & \text{if } n \leq 0. \end{cases} \]
It is straightforward to check that this is a bijection, so its inverse is a bijection $f^{-1} : \mathbb{N} \to \mathbb{Z}$.
(b) The simplest case is where $X \cap Y = \emptyset$. Then given bijections $f : \mathbb{N} \to X$ and $g : \mathbb{N} \to Y$ we construct a function $h : \mathbb{N} \to X \cup Y$ by
\[ h(n) = \begin{cases} f\left(\dfrac{n}{2}\right) & \text{if } n \text{ is even}, \\[1ex] g\left(\dfrac{n+1}{2}\right) & \text{if } n \text{ is odd}. \end{cases} \]
Then $h$ is a bijection.
If $Z = X \cap Y$ and $Y \setminus Z$ are both countably infinite, let $f : \mathbb{N} \to X$ and $g : \mathbb{N} \to Y \setminus Z$ be bijections. Then we define $h : \mathbb{N} \to X \cup Y$ by
\[ h(n) = \begin{cases} f\left(\dfrac{n}{2}\right) & \text{if } n \text{ is even}, \\[1ex] g\left(\dfrac{n+1}{2}\right) & \text{if } n \text{ is odd}. \end{cases} \]
Again this is a bijection.
The case where one of $X \setminus (X \cap Y)$ and $Y \setminus (X \cap Y)$ is finite is easy to deal with by the method used for (c).
(c) Since $Y$ is finite so is $Y \setminus (X \cap Y) \subseteq Y$. Let $f : \mathbb{N} \to X$ and $g : \underline{m} \to Y \setminus (X \cap Y)$ be bijections. Define $h : \mathbb{N} \to X \cup Y$ by
\[ h(n) = \begin{cases} g(n) & \text{if } 1 \leq n \leq m, \\ f(n - m) & \text{if } m < n. \end{cases} \]
Then $h$ is a bijection.
(d) Plot each pair $(a, b)$ as the point in the $xy$-plane with coordinates $(a, b)$; such points are all those with natural number coordinates. Starting at $(0, 0)$ we can now trace out a path passing through all of these points and we can arrange to do this without ever recrossing such a point.
[Figure: a path starting at $(0, 0)$ and passing once through each point $(a, b)$ with $a, b \in \mathbb{N}_0$]
This gives a sequence $\{(r_n, s_n)\}_{0 \leq n}$ of elements of $\mathbb{N}_0 \times \mathbb{N}_0$ which contains every pair of natural numbers exactly once. The function
\[ f : \mathbb{N}_0 \to \mathbb{N}_0 \times \mathbb{N}_0; \quad f(n) = (r_n, s_n), \]
is a bijection.
(e) This is demonstrated in a similar way to (d), but the argument is slightly more involved. For each $a/b \in \mathbb{Q}^+$, we can assume that $a$, $b$ are coprime (i.e., they have no common factors) and plot it as the point in the $xy$-plane with coordinates $(a, b)$. Starting at $(1, 1)$ we can now trace out a path passing through all of these points with coprime positive natural number coordinates and can even arrange to do this without ever recrossing such a point. This gives a sequence $r_n$ of elements of $\mathbb{Q}^+$ in which each rational number $r_n$ occurs only once.
[Figure: a path starting at $(1, 1)$ through the points $(a, b)$ with $a, b \in \mathbb{N}$]
The function
\[ f : \mathbb{N} \to \mathbb{Q}^+; \quad f(n) = r_n, \]
is a bijection.
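Parts (a) and (d) can be made concrete in a few lines. The sketch below implements the bijection $f : \mathbb{Z} \to \mathbb{N}$ of part (a) with its inverse, and one possible enumeration of $\mathbb{N}_0 \times \mathbb{N}_0$ along anti-diagonals. The diagonal order is an assumption: the path drawn in the figure may visit the points in a different order, and the names `z_to_n`, `n_to_z` and `pairs` are illustrative.

```python
from itertools import count, islice


def z_to_n(n):
    """The bijection f: Z -> N of part (a)."""
    return 2 * n if n >= 1 else 1 - 2 * n


def n_to_z(k):
    """Inverse of z_to_n: even k come from positive integers, odd k from n <= 0."""
    return k // 2 if k % 2 == 0 else (1 - k) // 2


def pairs():
    """Enumerate N_0 x N_0 without repetition, one anti-diagonal a + b = total at a time."""
    for total in count(0):
        for a in range(total + 1):
            yield (a, total - a)


print([z_to_n(n) for n in range(-3, 4)])  # [7, 5, 3, 1, 2, 4, 6]
print([n_to_z(k) for k in range(1, 8)])   # [0, 1, -1, 2, -2, 3, -3]
print(list(islice(pairs(), 6)))           # [(0, 0), (0, 1), (1, 0), (0, 2), (1, 1), (2, 0)]
```

Every pair $(a, b)$ appears exactly once, on the anti-diagonal with $a + b$ fixed, so the position of a pair in the list defines a bijection with $\mathbb{N}_0$.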
Here is something that shows how weird infinity can seem compared to what happens with finite sets.
Example 2.37. Welcome to the Hotel Infinity! (Manager Sam Cantori).
Hotel Infinity has infinitely many single rooms numbered $1, 2, 3, \ldots$. One night every room is occupied. A bus full of 100 new guests arrives late at night. How can they be accommodated?
The next night the hotel is again full up. A full bus with infinitely many passengers occupying seats numbered $1, 2, 3, \ldots$ arrives. How can they all be accommodated?
Solution. First night: The Manager asks each guest to move from room $n$ to room $n + 100$, then puts the passengers into the first 100 rooms.
Second night: The Manager asks each guest to move from room $n$ to room $2n$ (so the odd numbered rooms are all empty now), then puts the passenger in seat $k$ into room $2k - 1$.
For more on this wonderful establishment see
http://www.c3.lanl.gov/mega-math/workbk/infinity/inhotel.html
2.7. Power sets and their cardinality
For two sets $X$ and $Y$, let
\[ Y^X = \{f : f \text{ is a function } f : X \to Y\} \]
be the set of all functions from $X$ to $Y$. From Theorem 2.28, we know that if $X$ and $Y$ are finite sets, then $Y^X$ is finite and has cardinality
\[ |Y^X| = |Y|^{|X|}. \]
A particular case of this occurs when $Y$ has two elements, e.g., $Y = \{0, 1\}$. The set $\{0, 1\}^X$ has $2^{|X|}$ elements and indeed it is often denoted $2^X$. It has another important interpretation.
For any set $X$, we can consider the set of all its subsets
\[ \mathcal{P}(X) = \{U : U \subseteq X \text{ is a subset}\}; \]
this is called the power set of $X$. Notice that the function $X \to \mathcal{P}(X)$ defined by $x \mapsto \{x\}$ is always an injection, so $\mathcal{P}(X)$ is at least as big as $X$.
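Theorem 2.38. For a set $X$, the function
\[ \Theta \colon \mathcal{P}(X) \to \{0, 1\}^X; \quad \Theta(U) = \chi_U, \]
is a bijection. If $X$ is finite then so is $\mathcal{P}(X)$ and its cardinality is $|\mathcal{P}(X)| = 2^{|X|}$.
Proof. We will make use of the indicator function of a subset $U \subseteq X$,
\[ \chi_U \colon X \to \{0, 1\}; \quad \chi_U(x) = \begin{cases} 1 & \text{if } x \in U, \\ 0 & \text{if } x \notin U. \end{cases} \]
As the indicator function of $U \subseteq X$ is determined by $U$, the function $\Theta$ is well defined. Also, a function $f \in \{0, 1\}^X$ determines a corresponding subset of $X$, namely
\[ U_f = \{ x \in X : f(x) = 1 \} \]
with $\chi_{U_f} = f$, and conversely $U_{\chi_U} = U$ for every subset $U$. This shows that $\Theta$ is a bijection whose inverse function satisfies
\[ \Theta^{-1}(f) = U_f. \]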
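Using the standard finite sets $\mathbf{n} = \{1, \ldots, n\}$ ($n \in \mathbb{N}_0$) we have
\[ |\mathcal{P}(\mathbf{0})| = 2^0 = 1, \quad |\mathcal{P}(\mathbf{1})| = 2^1 = 2, \quad |\mathcal{P}(\mathbf{2})| = 2^2 = 4, \quad |\mathcal{P}(\mathbf{3})| = 2^3 = 8, \quad \ldots \]
where
\begin{align*}
\mathcal{P}(\mathbf{0}) &= \{\emptyset\}, \\
\mathcal{P}(\mathbf{1}) &= \{\emptyset, \{1\}\}, \\
\mathcal{P}(\mathbf{2}) &= \{\emptyset, \{1\}, \{2\}, \{1, 2\}\}, \\
\mathcal{P}(\mathbf{3}) &= \{\emptyset, \{1\}, \{2\}, \{3\}, \{1, 2\}, \{1, 3\}, \{2, 3\}, \{1, 2, 3\}\}, \\
&\;\;\vdots
\end{align*}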
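For a small finite set, the correspondence between subsets and $\{0, 1\}$-valued functions can be checked by brute force. This is a minimal sketch; the helper names `power_set`, `indicator` and `subset_of` are hypothetical, and a function $X \to \{0, 1\}$ is stored as a tuple of bits indexed by a fixed listing of $X$.

```python
from itertools import combinations, product


def power_set(xs):
    """All subsets of the list xs, as frozensets."""
    xs = list(xs)
    return [frozenset(c) for r in range(len(xs) + 1) for c in combinations(xs, r)]


def indicator(u, xs):
    """Indicator function of the subset u, as a tuple of bits along the listing xs."""
    return tuple(1 if x in u else 0 for x in xs)


def subset_of(f, xs):
    """The subset {x : f(x) = 1} determined by a bit tuple f."""
    return frozenset(x for x, bit in zip(xs, f) if bit == 1)


X = [1, 2, 3]
subsets = power_set(X)
images = {indicator(u, X) for u in subsets}

print(len(subsets))                                        # 2 ** 3 subsets
print(images == set(product((0, 1), repeat=len(X))))       # every bit tuple is hit
print(all(subset_of(indicator(u, X), X) == u for u in subsets))
```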
We will now see that for any set X the power set P(X) is always bigger than X.
Theorem 2.39 (Russell's Paradox). For a set $X$, there is no surjection $X \to \mathcal{P}(X)$.
Proof. Suppose that $g \colon X \to \mathcal{P}(X)$ is a surjection.
Consider the subset
\[ W = \{ x \in X : x \notin g(x) \} \subseteq X. \]
Then by surjectivity of $g$ there is a $w \in X$ such that $g(w) = W$. If $w \in W$, then by definition of $W$ we must have $w \notin g(w) = W$, which is impossible. On the other hand, if $w \notin W$, then $w \in g(w) = W$ and again this is impossible. But then $w$ cannot be in $W$ or the complement $X \setminus W$, contradicting the fact that every element of $X$ has to be in one or other of these subsets since $X = W \cup (X \setminus W)$. Thus no such surjection can exist.
When $X$ is finite, this result is not surprising since $2^n > n$ for $n \in \mathbb{N}$. For $X$ an infinite set, it leads to the idea that there are different sizes of infinity. Russell's Paradox is often stated in terms of the set of all sets, and the key ideas of this proof can also be used to show that no such set can exist (can you think of a suitable argument?). It shows that naive notions of sets can lead to problems when sets are allowed to be too large. Modern set theory sets out to axiomatise the idea of a set to avoid such problems.
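The set $W$ from the proof can be computed explicitly when $X$ is small. The sketch below (with hypothetical helper names `power_set` and `diagonal_set`) runs through every map $g$ from a 3-element set to its power set and confirms that $W$ is never a value of $g$; it illustrates the argument and does not replace it.

```python
from itertools import combinations, product


def power_set(xs):
    """All subsets of the list xs, as frozensets."""
    xs = list(xs)
    return [frozenset(c) for r in range(len(xs) + 1) for c in combinations(xs, r)]


def diagonal_set(xs, g):
    """W = {x in X : x not in g(x)} for a map g stored as a dict x -> subset."""
    return frozenset(x for x in xs if x not in g[x])


X = [0, 1, 2]
subsets = power_set(X)

# A map g: X -> P(X) is a choice of one subset for each element of X,
# so there are 8 ** 3 = 512 maps to test.
always_missed = all(
    diagonal_set(X, dict(zip(X, choice))) not in choice
    for choice in product(subsets, repeat=len(X))
)
print(always_missed)
```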
2.8. The real numbers are uncountable
Theorem 2.40 (Cantor). The set of real numbers $\mathbb{R}$ is uncountable, i.e., there is no bijection $\mathbb{N} \to \mathbb{R}$.
Proof. Suppose that $\mathbb{R}$ is countable and therefore the obviously infinite subset $(0, 1] \subseteq \mathbb{R}$ is countable. Then we can list the elements of $(0, 1]$:
\[ q_1, q_2, \ldots, q_n, \ldots. \]
For each $n$ we can uniquely express $q_n$ as a non-terminating infinite decimal expansion
\[ q_n = 0.q_{n,1} q_{n,2} \cdots q_{n,k} \cdots, \]
where for each $k$, $q_{n,k} \in \{0, 1, \ldots, 9\}$ and for every $k_0$ there is always a $k > k_0$ for which $q_{n,k} \neq 0$.
Now define a real number $p \in (0, 1]$ by requiring its decimal expansion
\[ p = 0.p_1 p_2 \cdots p_k \cdots \]
to have the property that for each $k \geq 1$,
\[ p_k = \begin{cases} 1 & \text{if } q_{k,k} \neq 1, \\ 2 & \text{if } q_{k,k} = 1. \end{cases} \]
Notice that this is also non-terminating. Then $p \neq q_1$ since $p_1 \neq q_{1,1}$, $p \neq q_2$ since $p_2 \neq q_{2,2}$, etc. So $p$ cannot be in the list of $q_n$'s, contradicting the assumption that $(0, 1]$ is countable.
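The rule defining the digits $p_k$ can be tried on a finite list of digit strings. This is only a sketch of the diagonal rule: the function name `diagonal_digits` and the sample rows are made up for illustration, and a finite list says nothing about countability by itself.

```python
def diagonal_digits(rows):
    """Apply the rule p_k = 1 if q_{k,k} != 1, and p_k = 2 if q_{k,k} = 1.

    rows[k] holds the decimal digits of the (k+1)-th listed number as a
    string, and must have more than k digits.
    """
    return "".join("2" if row[k] == "1" else "1" for k, row in enumerate(rows))


rows = ["1415", "7182", "3333", "0001"]   # arbitrary sample digit strings
p = diagonal_digits(rows)
print(p)

# p differs from the k-th row in the k-th digit, so it equals none of them.
print(all(p[k] != rows[k][k] for k in range(len(rows))))
```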
The method of proof used here is often referred to as Cantor's diagonalization argument. In particular this shows that $\mathbb{R}$ is much bigger than the familiar subset $\mathbb{Q} \subseteq \mathbb{R}$. However it can be hard to identify particular elements of the complement $\mathbb{R} \setminus \mathbb{Q}$. In fact the subset of all real algebraic numbers is countable, where such a real number is a root of a monic polynomial of positive degree,
\[ x^n + a_{n-1}x^{n-1} + \cdots + a_0 \in \mathbb{Q}[x]. \]
This means that most real numbers are not algebraic; the numbers $e$ and $\pi$ are not algebraic, but for any positive rational number $q$, $\sqrt{q}$ is algebraic since it is a root of $x^2 - q$.
At this point it is natural to ask the question "How big is $\mathbb{R}$?" and perhaps expect an answer such as "It has the same size as $\mathcal{P}(\mathbb{N})$". In fact this is true, i.e., there is a bijection between $\mathcal{P}(\mathbb{N})$ and $\mathbb{R}$ so these sets have the same cardinality. Related to this is the Continuum Hypothesis (CH), which says that every subset of $\mathbb{R}$ is countable or has a bijection to $\mathbb{R}$. By work of Gödel and Cohen, it is now known that the other axioms for Set Theory are logically independent of CH, so it is impossible to prove or disprove it using them.
2.9. Equivalence relations
Let $X$ be a set. We are familiar with the idea that two elements are equal. But sometimes we may want to group together elements in a set possessing some common property, and we might then think of them as related and introduce some symbol to indicate that, e.g., $x \sim y$ or $x \approx y$, whenever that is true. Here are some examples.
Example 2.41.
(a) For $x, y \in \mathbb{R}$, write $x \preceq y$ if and only if $|x| \leq |y|$.
(b) For $x, y \in \mathbb{Z}$, write $x \sim y$ if and only if $x \mid y$ (i.e., $x$ divides $y$).
(c) Let $Y$ be a finite set and let $X = \mathcal{P}(Y)$ be the power set of $Y$. For $A, B \in \mathcal{P}(Y)$ write $A \sim B$ if and only if $|A| \leq |B|$, where $|S|$ is the number of elements in the set $S$.
These are all examples of (binary) relations on sets. In order for this to be a useful idea we
usually require that a relation has more properties.
Definition 2.42. Let $\sim$ be a relation on a set $X$. Then $\sim$ is an equivalence relation if it has the following properties.
Reflexivity: If $x \in X$, then $x \sim x$.
Symmetry: If $x, y \in X$ and $x \sim y$, then $y \sim x$.
Transitivity: If $x, y, z \in X$, $x \sim y$ and $y \sim z$, then $x \sim z$.
It is obvious that equality has all of these properties, so for any non-empty set $X$, the relation $\sim$ defined by $x \sim y$ if and only if $x = y$ is an equivalence relation.
None of the relations of Example 2.41 is an equivalence relation as they all fail to satisfy the symmetry condition. Here are some examples of equivalence relations.
Example 2.43. Let $m$ be a positive integer. Then the relation $a \equiv b \bmod m$ is an equivalence relation on $\mathbb{Z}$ by Section 1.4.
Example 2.44. Let $\mathbb{R}^n$ be given its usual notion of length of vectors:
\[ |(x_1, \ldots, x_n)| = \sqrt{x_1^2 + \cdots + x_n^2}. \]
For $x, y \in \mathbb{R}^n$ write $x \sim y$ if and only if $|x| = |y|$. So with respect to $\sim$ two vectors are equivalent exactly when they have the same length.
When $n = 1$, $\mathbb{R}^1 = \mathbb{R}$. Then for $x, y \in \mathbb{R}$, $x \sim y$ exactly when $y = \pm x$. A similar definition works for $\mathbb{C}^n$.
Example 2.45. Let $F$ be a field and $V$ a vector space over $F$. Let $V_0 = V \setminus \{0\}$. For $u, v \in V_0$, write $u \sim v$ if and only if there is a scalar $t \in F \setminus \{0\}$ such that $v = tu$. Then $\sim$ is an equivalence relation on $V_0$. Let's check this.
Reflexivity: If $v \in V_0$, then $v = 1v$, so $v \sim v$.
Symmetry: If $u, v \in V_0$ and $u \sim v$, then for some $t \in F \setminus \{0\}$, $v = tu$. Since $t \neq 0$, there is an inverse $t^{-1} \in F$. So we have
\[ u = t^{-1}(tu) = t^{-1}v, \]
showing that $v \sim u$.
Transitivity: Let $u, v, w \in V_0$. Suppose that $u \sim v$ and $v \sim w$. Then for some $s, t \in F \setminus \{0\}$, $v = su$ and $w = tv$. This gives
\[ w = t(su) = (ts)u, \]
and as $ts \neq 0$, $u \sim w$.
It is easy to see that for a fixed $u$, the vectors $v$ equivalent to $u$ are precisely the non-zero vectors lying in the 1-dimensional subspace spanned by $u$.
Definition 2.46. Let $X$ be a non-empty set. A collection $P$ of non-empty subsets of $X$ is called a partition of $X$ or is said to partition $X$ if
\[ X = \bigcup_{U \in P} U \]
and for any two distinct sets $U, V \in P$,
\[ U \cap V = \emptyset. \]
So the sets of $P$ decompose $X$ into a union of disjoint, i.e., non-overlapping, subsets.
Example 2.47. Let $X = \mathbb{Z}$. Define
\[ U = \{ n \in \mathbb{Z} : n \text{ is even} \}, \quad V = \{ n \in \mathbb{Z} : n \text{ is odd} \}. \]
Then $\mathbb{Z} = U \cup V$ and $U \cap V = \emptyset$, so $\{U, V\}$ partitions $\mathbb{Z}$.
Example 2.48. Let $X$ be a non-empty finite set. Suppose that
\[ P = \{U_1, \ldots, U_k\}, \]
where the $U_i$ are non-empty subsets of $X$ which partition $X$. Then
\[ |X| = |U_1| + |U_2| + \cdots + |U_k|. \]
In particular, if all of the sets $U_i$ have the same size, then
\[ |X| = k|U_1|. \]
To see this, note that each element of $X$ occurs in exactly one of the $U_i$.
Now we see an important connection between equivalence relations and partitions. Later
we will see that every equivalence relation arises from a partition.
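Example 2.49. Let $X$ be a non-empty set and let $P$ be a partition of $X$. Define a relation $\sim$ on $X$ by
\[ x \sim y \iff \text{there is a } U \in P \text{ with } x \in U \text{ and } y \in U. \]
Show that $\sim$ is an equivalence relation.
Solution. Let $x, y, z \in X$. Then $x \in U$ for some $U \in P$, so $x \sim x$. It is also clear that
\[ x \sim y \implies y \sim x. \]
If $x \sim y$ and $y \sim z$, suppose that $U, V \in P$ with $x \in U$, $y \in U$, $y \in V$ and $z \in V$. As $y \in U \cap V$, $U \cap V \neq \emptyset$ and therefore $U = V$. So $x, z \in U$, hence $x \sim z$.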
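On a finite set, the three defining properties can be tested exhaustively. The sketch below uses hypothetical helpers `related` and `is_equivalence`; it checks the relation built from the even/odd partition (restricted to a finite window of integers) and, for contrast, the divisibility relation, which fails symmetry.

```python
def related(partition, x, y):
    """x ~ y when some block of the partition contains both x and y."""
    return any(x in block and y in block for block in partition)


def is_equivalence(xs, rel):
    """Brute-force check of reflexivity, symmetry and transitivity on xs."""
    reflexive = all(rel(x, x) for x in xs)
    symmetric = all(rel(y, x) for x in xs for y in xs if rel(x, y))
    transitive = all(
        rel(x, z)
        for x in xs for y in xs for z in xs
        if rel(x, y) and rel(y, z)
    )
    return reflexive and symmetric and transitive


window = list(range(-6, 7))
evens = {n for n in window if n % 2 == 0}
odds = {n for n in window if n % 2 != 0}
partition = [evens, odds]

print(is_equivalence(window, lambda x, y: related(partition, x, y)))

# Divisibility on {1, ..., 6} is reflexive and transitive but not symmetric.
print(is_equivalence(range(1, 7), lambda x, y: y % x == 0))
```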
Here is a very useful way to form equivalence relations.
Proposition 2.50. Let $q \colon X \to Y$ be a surjective function where $X$ is a non-empty set. Define a relation $\sim_q$ on $X$ by $x \sim_q y$ if and only if $q(x) = q(y)$. Then $\sim_q$ is an equivalence relation.
Proof. Let's verify the three properties.
Reflexivity: Let $x \in X$. Then $q(x) = q(x)$, so $x \sim_q x$.
Symmetry: Let $x, y \in X$ and suppose that $q(x) = q(y)$. But then $q(y) = q(x)$ and so $y \sim_q x$.
Transitivity: Let $x, y, z \in X$ and suppose that $q(x) = q(y)$, $q(y) = q(z)$. Then $q(x) = q(z)$, so $x \sim_q z$.
Notice that for given $x \in X$, the set of all elements which are equivalent to $x$ is
\[ \{ y \in X : x \sim_q y \} = \{ y \in X : q(x) = q(y) \} = q^{-1}\{q(x)\}. \]
We will see that every equivalence relation arises in this way.
Now let $\sim$ be an equivalence relation on a non-empty set $X$. For any $x \in X$, let
\[ [x]_\sim = \{ y \in X : x \sim y \}. \]
This is called the equivalence class of $x$ with respect to $\sim$. If the equivalence relation is clear from the context we sometimes just write $[x]$ for $[x]_\sim$. The set of all the distinct equivalence classes is sometimes denoted $X/{\sim}$; it is a subset of the power set $\mathcal{P}(X)$.
Theorem 2.51. The equivalence classes with respect to the equivalence relation $\sim$ on $X$ have the following properties.
• The equivalence classes $[x]$ and $[y]$ are either disjoint, $[x] \cap [y] = \emptyset$, or equal, $[x] = [y]$.
• The equivalence classes $[x]$ and $[y]$ are equal if and only if $x \sim y$.
• The set of all the distinct equivalence classes partitions $X$.
Proof. If $z \in [x] \cap [y]$, then $x \sim z$ and $y \sim z$. Therefore $z \sim y$ and so $x \sim y$. It is easy to see that $w \sim x$ if and only if $w \sim y$, hence $[x] = [y]$.
Any element $x \in X$ is in an equivalence class, for example $[x]$, so $X$ is the union of all the equivalence classes. As these are either equal or disjoint, they partition $X$.
Any equivalence class can be expressed as $[x]$ for any element $x$ in it; such an element is called a representative of the equivalence class.
Given an equivalence relation $\sim$ on $X$, there is a function $\pi \colon X \to X/{\sim}$ with rule
\[ \pi(x) = [x]. \]
Proposition 2.52. The function $\pi \colon X \to X/{\sim}$ is a surjection and the preimage $\pi^{-1}\{[x]\}$ agrees with the equivalence class $[x]$ as a subset of $X$, i.e.,
\[ \pi^{-1}\{[x]\} = [x]. \]
Proof. This is straightforward.
Notice that Propositions 2.50 and 2.52 together show that equivalence relations on a set $X$ are essentially equivalent to partitions of $X$ and to surjections with domain $X$. For example, consider the equivalence relation $a \equiv b \bmod m$ on $\mathbb{Z}$, where $m$ is a positive integer. The congruence classes modulo $m$ form a partition of $\mathbb{Z}$ (see Example 2.47 for the case $m = 2$). The surjective function $\pi$ is the function from $\mathbb{Z}$ to $\mathbb{Z}/m$ which takes each integer to its congruence class.
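The passage from a surjection to its equivalence classes is easy to carry out on a finite set: group the elements by their image. This is a minimal sketch, assuming a hypothetical helper `classes`, applied to reduction modulo 3 on a finite window of integers.

```python
def classes(xs, q):
    """Equivalence classes of x ~ y iff q(x) == q(y), i.e. the fibres of q."""
    fibres = {}
    for x in xs:
        fibres.setdefault(q(x), set()).add(x)
    return [frozenset(block) for block in fibres.values()]


window = list(range(-6, 6))            # the 12 integers -6, ..., 5
blocks = classes(window, lambda a: a % 3)

print(len(blocks))                                     # one class per residue
print(sorted(sorted(b) for b in blocks))
# The classes are pairwise disjoint and cover the window, so they partition it.
print(sum(len(b) for b in blocks) == len(window))
print(set().union(*blocks) == set(window))
```

In Python, `a % 3` is non-negative for negative `a`, so it picks the same residue for every member of a congruence class.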
CHAPTER 3
Permutations and groups
3.1. Permutations
Let $X$ be a finite set. A permutation of $X$ is a bijective function from $X$ to itself. The set of all permutations from $X$ to itself is denoted $\mathrm{Perm}(X)$. Using properties of bijections (such as the fact that compositions of bijections are bijections) we have the following result.
Proposition 3.1. Let $X$ be a set.
(i) If $f$ and $g$ are in $\mathrm{Perm}(X)$ then $f$ and $g$ have a composition $f \circ g$, which is also a member of $\mathrm{Perm}(X)$.
(ii) If $f$, $g$ and $h$ are in $\mathrm{Perm}(X)$ then
\[ (f \circ g) \circ h = f \circ (g \circ h). \]
(iii) There is an identity $\mathrm{Id}_X$ in $\mathrm{Perm}(X)$ such that
\[ f \circ \mathrm{Id}_X = \mathrm{Id}_X \circ f = f \]
for all $f \in \mathrm{Perm}(X)$.
(iv) If $f$ is in $\mathrm{Perm}(X)$ then there is an inverse $f^{-1}$ in $\mathrm{Perm}(X)$ such that
\[ f \circ f^{-1} = f^{-1} \circ f = \mathrm{Id}_X. \]
In terminology which will be introduced later, the proposition amounts to saying that the pair $(\mathrm{Perm}(X), \circ)$ is a group, called the permutation group of $X$.
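In particular, for a nonnegative integer $n$ we consider a standard set $\mathbf{n}$ with $n$ elements given by
\[ \mathbf{n} = \{1, 2, \ldots, n\}. \]
The permutation group $\mathrm{Perm}(\mathbf{n})$ is denoted $S_n$ and is called the symmetric group on $n$ objects or the symmetric group of degree $n$ or the permutation group on $n$ objects.
To describe a permutation $\sigma$ of an $n$-element set $X$ we write
\[ \sigma = \begin{pmatrix} a_1 & a_2 & \ldots & a_n \\ b_1 & b_2 & \ldots & b_n \end{pmatrix}, \]
where $a_1, \ldots, a_n$ is a list of the members of $X$ and where $b_i = \sigma(a_i)$. For example, if
\[ \sigma = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \end{pmatrix}, \]
then $\sigma$ is the permutation in $S_3$ given by $\sigma(1) = 2$, $\sigma(2) = 3$, $\sigma(3) = 1$. In general, to construct a permutation $\sigma$ in $S_3$, one has to choose one of the three members of $\mathbf{3} = \{1, 2, 3\}$ to be $\sigma(1)$, then one of the two remaining members to be $\sigma(2)$, and then the one remaining member must be $\sigma(3)$. Thus there are $3 \cdot 2 \cdot 1$ permutations in $S_3$, given by
\[
\begin{pmatrix} 1 & 2 & 3 \\ 1 & 2 & 3 \end{pmatrix}, \quad
\begin{pmatrix} 1 & 2 & 3 \\ 1 & 3 & 2 \end{pmatrix}, \quad
\begin{pmatrix} 1 & 2 & 3 \\ 2 & 1 & 3 \end{pmatrix}, \quad
\begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \end{pmatrix}, \quad
\begin{pmatrix} 1 & 2 & 3 \\ 3 & 1 & 2 \end{pmatrix}, \quad
\begin{pmatrix} 1 & 2 & 3 \\ 3 & 2 & 1 \end{pmatrix}.
\]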
By generalising this argument, we get the following result.
Theorem 3.2. The permutation group of an $n$-element set has $n!$ members.
Proof. Let $a_1, \ldots, a_n$ be a list of the members. To construct a permutation $\sigma$ one chooses one of the $n$ members to be $\sigma(a_1)$, then one of the remaining $n - 1$ members to be $\sigma(a_2)$, then one of the remaining $n - 2$ members to be $\sigma(a_3)$, and so on. The total number of permutations is therefore
\[ n(n - 1)(n - 2) \cdots (2)(1) = n!, \]
so there are $n!$ permutations in $S_n$.
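The count can be confirmed for small $n$ by listing the permutations. In this sketch a permutation of $\{1, \ldots, n\}$ is stored as a dict $i \mapsto \sigma(i)$ (a representation chosen here, mirroring the two-row notation), and the standard library functions `itertools.permutations` and `math.factorial` do the work.

```python
from itertools import permutations
from math import factorial


def all_permutations(n):
    """Every permutation of {1, ..., n}, each as a dict i -> sigma(i)."""
    points = range(1, n + 1)
    return [dict(zip(points, image)) for image in permutations(points)]


for n in range(6):
    print(n, len(all_permutations(n)), factorial(n))
```

For $n = 0$ there is exactly one permutation, the empty function, in agreement with $0! = 1$.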
3.2. Cycles
There is a more compact notation for permutations using cycles, which are permutations of a particularly simple kind. Let $r$ be a positive integer and let $a_1, \ldots, a_r$ be distinct elements in a finite set $X$. Then
\[ (a_1, a_2, \ldots, a_r) \]
denotes the permutation $\sigma$ of $X$ such that
\[ \sigma(a_1) = a_2, \quad \sigma(a_2) = a_3, \quad \ldots, \quad \sigma(a_{r-1}) = a_r, \quad \sigma(a_r) = a_1, \]
and such that $\sigma(x) = x$ for $x \neq a_1, \ldots, a_r$. For example, the permutations in $S_3$ can be listed as
\[ \mathrm{Id}, \quad (2, 3), \quad (1, 2), \quad (1, 2, 3), \quad (1, 3, 2), \quad (1, 3). \]
A permutation of the form $(a_1, a_2, \ldots, a_r)$ is called an $r$-cycle or just a cycle.
In $S_3$ one can also express $\mathrm{Id}$ as a cycle, for example $\mathrm{Id} = (1)$, so every permutation in $S_3$ is a cycle. In larger symmetric groups this is no longer true, but one can express any permutation as a composite of cycles. For example, in $S_4$, one has
\[ \begin{pmatrix} 1 & 2 & 3 & 4 \\ 3 & 4 & 1 & 2 \end{pmatrix} = (1, 3)(2, 4). \]
In general, for a permutation of $X$, one can find a decomposition of this kind such that each element of $X$ is a term in exactly one of the factors, provided that one allows 1-cycles as factors; for example, the permutations in $S_3$ can be listed as
\[ (1)(2)(3), \quad (1)(2, 3), \quad (1, 2)(3), \quad (1, 2, 3), \quad (1, 3, 2), \quad (1, 3)(2). \]
Theorem 3.3. Let $\sigma$ be a permutation of a finite set $X$. Then there is a decomposition $\sigma = \sigma_1 \cdots \sigma_k$ such that $\sigma_1, \ldots, \sigma_k$ are cycles and such that each element of $X$ is a term in exactly one of these cycles.
Proof. Suppose that $X$ is not empty. Choose an element $a_0$ of $X$ and let
\[ \sigma(a_0) = a_1, \quad \sigma(a_1) = a_2, \quad \ldots. \]
Since $X$ is finite the elements $a_0, a_1, \ldots$ cannot all be distinct; hence there must be a smallest $r$ such that $a_r = a_i$ with $0 \leq i < r$. For $1 \leq i < r$ one has $a_{i-1} \neq a_{r-1}$, hence $\sigma(a_{i-1}) \neq \sigma(a_{r-1})$, hence $a_i \neq a_r$, so in fact $a_r = a_0$, i.e., $\sigma(a_{r-1}) = a_0$. Let
\[ \sigma_1 = (a_0, a_1, \ldots, a_{r-1}); \]
we see that $\sigma(a_i) = \sigma_1(a_i)$ for $0 \leq i < r$.
Now suppose that $X$ has an element $b_0$ not in $\{a_0, \ldots, a_{r-1}\}$. Then, by a similar argument, there is a cycle
\[ \sigma_2 = (b_0, b_1, \ldots, b_{s-1}) \]
such that $\sigma(b_j) = \sigma_2(b_j)$ for all $j$. Since $\sigma$ is injective and $b_0 \notin \{a_0, \ldots, a_{r-1}\}$ one also has $b_1 \notin \{a_0, \ldots, a_{r-1}\}$, etc., so all the elements
\[ a_0, a_1, \ldots, a_{r-1}, b_0, b_1, \ldots, b_{s-1} \]
are distinct. It follows that $\sigma_1\sigma_2(a_i) = \sigma(a_i)$ for $0 \leq i < r$ and $\sigma_1\sigma_2(b_j) = \sigma(b_j)$ for $0 \leq j < s$.
One continues in this way until one has used up all the elements of $X$. One then has cycles $\sigma_1, \ldots, \sigma_k$ whose terms form a complete list of the elements of $X$ without repetition and such that $\sigma_1 \cdots \sigma_k(x) = \sigma(x)$ for all $x \in X$, hence $\sigma_1 \cdots \sigma_k = \sigma$.
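Example 3.4. In $S_5$, let
\[ \sigma = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 2 & 4 & 5 & 1 & 3 \end{pmatrix}, \quad \tau = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 3 & 4 & 1 & 5 & 2 \end{pmatrix}. \]
Express $\sigma$, $\tau$, $\sigma\tau$, $\tau\sigma$ and $\sigma^{-1}$ as composites of cycles.
Solution. We have
\[ \sigma = (1, 2, 4)(3, 5), \quad \tau = (1, 3)(2, 4, 5), \quad \sigma\tau = (1, 5, 4, 3, 2), \]
\[ \tau\sigma = (1, 4, 3, 2, 5), \quad \sigma^{-1} = (1, 4, 2)(3, 5). \]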
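The procedure in the proof of Theorem 3.3 (follow an element until it returns to its start, then begin again with an unused element) translates directly into code. The sketch below stores a permutation as a dict and names the two permutations of Example 3.4 `sigma` and `tau`; these names and the helpers `compose`, `inverse` and `cycles` are choices made for this illustration. Composites are read right to left, so `compose(s, t)` applies `t` first, which is the convention that reproduces the cycles listed in the solution.

```python
def compose(s, t):
    """The composite s t, where (s t)(x) = s(t(x)): apply t first, then s."""
    return {x: s[t[x]] for x in t}


def inverse(s):
    """The inverse permutation, sending s(x) back to x."""
    return {y: x for x, y in s.items()}


def cycles(s):
    """Disjoint cycles of s, each started at its smallest unused element."""
    seen, result = set(), []
    for start in sorted(s):
        if start in seen:
            continue
        cycle, x = [], start
        while x not in seen:          # follow start -> s(start) -> ... until it closes
            seen.add(x)
            cycle.append(x)
            x = s[x]
        result.append(tuple(cycle))
    return result


sigma = {1: 2, 2: 4, 3: 5, 4: 1, 5: 3}
tau = {1: 3, 2: 4, 3: 1, 4: 5, 5: 2}

print(cycles(sigma))                  # [(1, 2, 4), (3, 5)]
print(cycles(tau))                    # [(1, 3), (2, 4, 5)]
print(cycles(compose(sigma, tau)))    # [(1, 5, 4, 3, 2)]
print(cycles(compose(tau, sigma)))    # [(1, 4, 3, 2, 5)]
print(cycles(inverse(sigma)))         # [(1, 4, 2), (3, 5)]
```

Fixed points appear as 1-cycles in the output, as allowed in Theorem 3.3.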
3.3. Even and odd permutations
Let $X$ be an $n$-element set with its elements listed as $x_1, \ldots, x_n$, and let $\sigma$ be a permutation of $X$. One can measure the extent to which $\sigma$ mixes up the elements of $X$ by counting the inversions; these are the pairs $(x_i, x_j)$ such that $x_i$ is earlier than $x_j$ in the list but such that $\sigma(x_i)$ is later than $\sigma(x_j)$. For example, let $X$ be a 3-element set with its elements listed as $a, b, c$; then $\mathrm{Id}$ has 0 inversions, $(a, b)$ and $(b, c)$ each have 1 inversion, $(a, b, c)$ and $(a, c, b)$ each have 2 inversions, and $(a, c)$ has 3 inversions. One can count these inversions diagrammatically: take 6 dots in two rows of three columns; label the points in each row $a, b, c$ from left to right; for $x = a, b, c$ draw a line from $x$ in the top row to $\sigma(x)$ in the bottom row; then the number of inversions is the number of pairs of lines which cross each other.
[Figure: crossing diagram for $(a, c, b)$, with lines from $a, b, c$ in the top row to their images in the bottom row $a, b, c$.]
The number of inversions of a given permutation of a set X depends to some extent on the
listing of the elements of X. For example, let X be a 3-element set {a, b, c}. Then Id has zero
inversions with respect to any listing, and similarly (a, b, c) and (a, c, b) have 2 inversions with
respect to any listing. But (a, b), (a, c) and (b, c) can have either 1 or 3 inversions, depending
on which listing is chosen; for example (a, b) has 1 inversion with respect to the listing a, b, c
and 3 inversions with respect to the listing a, c, b. Note that 1 and 3 are both odd; in general
one has the following result.
Theorem 3.5. Let $i$ and $j$ be the numbers of inversions of a permutation $\sigma$ of a finite set $X$ with respect to two listings. Then $i$ and $j$ are both even or both odd.
Proof. Take a diagram of $4n$ points arranged in 4 rows and $n$ columns, where $n$ is the number of elements in $X$. Label the points in rows 1 and 4 in the order of the first listing, and the points in rows 2 and 3 in the order of the second listing. Join $x$ in row 1 to $x$ in row 2, join $x$ in row 2 to $\sigma(x)$ in row 3, and join $y$ in row 3 to $y$ in row 4. Let $k$ be the number of crossings between rows 1 and 2; then $k$ is also the number of crossings between rows 3 and 4. The number of crossings between rows 2 and 3 is $j$, so the total number of crossings is $j + 2k$.
Now let $i_1$, $i_2$ and $i_3$ be the number of pairs of points in row 1 such that the lines across to row 4 from these two points cross each other 1, 2 or 3 times respectively. The total number of crossings is then $i_1 + 2i_2 + 3i_3$, so
\[ i_1 + 2i_2 + 3i_3 = j + 2k. \]
We also observe that inversions with respect to the first listing correspond to pairs of lines from row 1 to row 4 which cross each other 1 or 3 times, so $i = i_1 + i_3$. It follows that
\[ i + 2(i_2 + i_3) = j + 2k. \]
From this equality $i$ and $j$ must be both even or both odd, as required.
Thus there are two fundamentally distinct kinds of permutations: even permutations, which have an even number of inversions with respect to any listing, and odd permutations, which have an odd number of inversions with respect to any listing. For an even permutation $\sigma$ we write
\[ \operatorname{sgn} \sigma = 1, \]
and for an odd permutation $\sigma$ we write
\[ \operatorname{sgn} \sigma = -1. \]
In both cases, $\operatorname{sgn} \sigma$ is called the sign of $\sigma$.
Example 3.6. The even permutations in $S_3$ are $\mathrm{Id}$, $(1, 2, 3)$, $(1, 3, 2)$; the odd permutations in $S_3$ are $(1, 2)$, $(1, 3)$ and $(2, 3)$.
Example 3.7. In $S_4$, which of the permutations
\[ \mathrm{Id}, \quad (1, 2), \quad (1, 2)(3, 4), \quad (1, 2, 3), \quad (1, 2, 3, 4) \]
are even and which are odd?
Solution. The even permutations are $\mathrm{Id}$, $(1, 2)(3, 4)$ and $(1, 2, 3)$; the odd ones are $(1, 2)$ and $(1, 2, 3, 4)$.
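Inversions can be counted directly from the definition, relative to any chosen listing. The following sketch (helper names `inversions` and `sign` are hypothetical; permutations are dicts) reproduces the signs in Example 3.7 and the earlier observation that $(a, b)$ has 1 inversion for the listing $a, b, c$ but 3 for the listing $a, c, b$.

```python
def inversions(s, listing):
    """Number of pairs earlier/later in the listing whose images are in reverse order."""
    pos = {x: k for k, x in enumerate(listing)}
    n = len(listing)
    return sum(
        1
        for a in range(n)
        for b in range(a + 1, n)
        if pos[s[listing[a]]] > pos[s[listing[b]]]
    )


def sign(s, listing):
    """+1 for an even number of inversions, -1 for an odd number."""
    return -1 if inversions(s, listing) % 2 else 1


# The transposition (a, b) with respect to two different listings.
swap_ab = {"a": "b", "b": "a", "c": "c"}
print(inversions(swap_ab, ["a", "b", "c"]), inversions(swap_ab, ["a", "c", "b"]))

# The five permutations of Example 3.7, written out as dicts on {1, 2, 3, 4}.
examples = {
    "Id": {1: 1, 2: 2, 3: 3, 4: 4},
    "(1,2)": {1: 2, 2: 1, 3: 3, 4: 4},
    "(1,2)(3,4)": {1: 2, 2: 1, 3: 4, 4: 3},
    "(1,2,3)": {1: 2, 2: 3, 3: 1, 4: 4},
    "(1,2,3,4)": {1: 2, 2: 3, 3: 4, 4: 1},
}
for name, s in examples.items():
    print(name, sign(s, [1, 2, 3, 4]))
```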
Proposition 3.8. An $r$-cycle is even if and only if $r$ is odd.
Proof. Let the $r$-cycle be $(a_1, \ldots, a_r)$ and take a listing which begins with $a_1, \ldots, a_r$. Then the number of inversions is $r - 1$.
Theorem 3.9. Let $\sigma$ and $\tau$ be permutations of a finite set $X$. Then
\[ \operatorname{sgn}(\sigma\tau) = (\operatorname{sgn} \sigma)(\operatorname{sgn} \tau). \]
Proof. Take a diagram of $3n$ points arranged in 3 rows and $n$ columns, where $n$ is the number of elements in $X$, and label the points in every row in the order of the same listing of $X$. Join $x$ in row 1 to $\tau(x)$ in row 2, and join $y$ in row 2 to $\sigma(y)$ in row 3. Let $i$, $j$ and $k$ be the numbers of inversions of $\sigma$, $\tau$ and $\sigma\tau$ respectively, so that
\[ \operatorname{sgn} \sigma = (-1)^i, \quad \operatorname{sgn} \tau = (-1)^j, \quad \operatorname{sgn}(\sigma\tau) = (-1)^k. \]
Let $k_1$ and $k_2$ be the number of pairs in row 1 such that the lines down to row 3 from these points cross each other once or twice respectively. The total number of crossings is then $k_1 + 2k_2$. The number of crossings between rows 1 and 2 is $j$, and the number of crossings between rows 2 and 3 is $i$, so
\[ i + j = k_1 + 2k_2. \]
We also have $k = k_1$, so
\[ i + j = k + 2k_2. \]
Therefore $(-1)^k = (-1)^i(-1)^j$, giving
\[ \operatorname{sgn}(\sigma\tau) = (\operatorname{sgn} \sigma)(\operatorname{sgn} \tau). \]
For example, $\operatorname{sgn}((a, c, b)(a, b, c))$ can be determined from the following diagram.
[Figure: three rows labelled $a, b, c$; solid lines join row 1 to row 2 according to $(a, b, c)$ and row 2 to row 3 according to $(a, c, b)$; dashed arrows show the composite.]
Then $\operatorname{sgn}((a, c, b)(a, b, c)) = (-1)^4 = 1$. Notice that $(a, c, b)(a, b, c) = \mathrm{Id}$, whose effect is given by the dashed arrows.
Using these results, it is easy to determine the sign of a composite of cycles, even if the cycles have common terms.
Example 3.10. The permutation
\[ (1, 2, 3, 4, 5)(2, 4, 3)(1, 5, 4, 2)(2, 4, 1) \]
is odd, because there is an odd number of factors which are $r$-cycles with $r$ even.
There is particular interest in cases such that the factors are transpositions, where a transposition is defined to be a 2-cycle. A composite $\tau_1 \cdots \tau_k$ such that $\tau_1, \ldots, \tau_k$ are transpositions will be an even permutation if and only if $k$ is an even number. Recalling that any permutation can be decomposed into cycles, and observing that
\[ (a_1, \ldots, a_r) = (a_1, a_2)(a_2, a_3) \cdots (a_{r-1}, a_r), \]
we see that any permutation can be expressed as a composite of transpositions. This gives the following result.
Theorem 3.11. Let $\sigma$ be a permutation. Then $\sigma$ is even if and only if there is a decomposition $\sigma = \tau_1 \cdots \tau_k$ such that $\tau_1, \ldots, \tau_k$ are transpositions and $k$ is even, and $\sigma$ is odd if and only if there is a decomposition $\sigma = \tau_1 \cdots \tau_k$ such that $\tau_1, \ldots, \tau_k$ are transpositions and $k$ is odd.
Let $n$ be a nonnegative integer. Then the alternating group on $n$ objects, denoted $A_n$, is the set of even permutations in $S_n$. We have the following properties, which amount to saying that $A_n$ is a subgroup of $S_n$.
Proposition 3.12.
(i) If $\sigma$ and $\tau$ are in $A_n$ then $\sigma$ and $\tau$ have a composite $\sigma\tau$, which is also a member of $A_n$.
(ii) If $\sigma$, $\tau$ and $\rho$ are in $A_n$ then
\[ (\sigma\tau)\rho = \sigma(\tau\rho). \]
(iii) There is an identity $\mathrm{Id}$ in $A_n$ such that
\[ \sigma\,\mathrm{Id} = \mathrm{Id}\,\sigma = \sigma \]
for all $\sigma \in A_n$.
(iv) If $\sigma$ is in $A_n$ then there is an inverse $\sigma^{-1}$ in $A_n$ such that
\[ \sigma\sigma^{-1} = \sigma^{-1}\sigma = \mathrm{Id}. \]
Proof. Part (i) holds because a composite of two even permutations is even. Part (ii) holds because composition in $S_n$ is associative. Since $\mathrm{Id}\,\mathrm{Id} = \mathrm{Id}$ we cannot have $\mathrm{Id}$ odd. Therefore $\mathrm{Id} \in A_n$, from which it follows that part (iii) holds. For $\sigma \in A_n$ we then have $\sigma\sigma^{-1}$ even, so we must also have $\sigma^{-1}$ even; therefore part (iv) holds.
Obviously every permutation in $S_0$ and $S_1$ is even. For $n \geq 2$, half of the permutations in $S_n$ are even.
Theorem 3.13. If $n \geq 2$ then the alternating group $A_n$ has $\frac{1}{2}(n!)$ elements.
Proof. Let $f \colon S_n \to S_n$ be the function given by
\[ f(\sigma) = (1, 2)\sigma. \]
Then
\[ (f \circ f)(\sigma) = (1, 2)(1, 2)\sigma = \mathrm{Id}\,\sigma = \sigma \]
for all $\sigma \in S_n$, so $f$ is a bijection with $f^{-1} = f$. Since $(1, 2)$ is odd, we have $f(\sigma)$ odd if and only if $\sigma$ is even. Therefore $f$ maps the set $A_n$ of even permutations in $S_n$ bijectively onto the set $S_n \setminus A_n$ of odd permutations in $S_n$. It follows that $A_n$ and $S_n \setminus A_n$ have the same number of members. Therefore $A_n$ has half as many members as $S_n$, so $A_n$ has $\frac{1}{2}(n!)$ members.
For example, the even permutations in $S_3$ are
\[ \mathrm{Id}, \quad (1, 2, 3), \quad (1, 3, 2), \]
while the odd permutations are
\[ (1, 2)\,\mathrm{Id} = (1, 2), \quad (1, 2)(1, 2, 3) = (2, 3), \quad (1, 2)(1, 3, 2) = (1, 3). \]
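The bijection $f$ used in the proof can be watched in action for small $n$. In this sketch a permutation of $\{1, \ldots, n\}$ is a tuple whose $i$-th entry is $\sigma(i)$, and the helper names `sign` and `swap_12` are hypothetical; `swap_12` computes $(1, 2)\sigma$ by applying $\sigma$ and then exchanging the values 1 and 2.

```python
from itertools import permutations
from math import factorial


def sign(image):
    """Sign of the permutation i -> image[i - 1], via its number of inversions."""
    n = len(image)
    inv = sum(1 for a in range(n) for b in range(a + 1, n) if image[a] > image[b])
    return -1 if inv % 2 else 1


def swap_12(image):
    """f(sigma) = (1, 2) sigma: apply sigma first, then swap the values 1 and 2."""
    return tuple({1: 2, 2: 1}.get(v, v) for v in image)


for n in range(2, 6):
    perms = list(permutations(range(1, n + 1)))
    evens = {p for p in perms if sign(p) == 1}
    odds = {p for p in perms if sign(p) == -1}
    # f carries the even permutations exactly onto the odd ones.
    print(n, len(evens), factorial(n) // 2, {swap_12(p) for p in evens} == odds)
```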
There is a connection between permutations and determinants. Let $A = [a_{ij}]$ be an $n \times n$ matrix; then $A$ has a determinant $\det A$ which is a polynomial in the entries $a_{ij}$. If $n = 2$ then
\[ \det A = a_{11}a_{22} - a_{12}a_{21}. \]
If $n = 3$ then
\[ \det A = a_{11}a_{22}a_{33} - a_{11}a_{23}a_{32} - a_{12}a_{21}a_{33} + a_{12}a_{23}a_{31} + a_{13}a_{21}a_{32} - a_{13}a_{22}a_{31}. \]
What is the general formula? The answer can be expressed in terms of permutations and their signs:
\[ \det A = \sum_{\sigma \in S_n} (\operatorname{sgn} \sigma)\, a_{1\sigma(1)} a_{2\sigma(2)} \cdots a_{n\sigma(n)}. \]
This formula is useful for certain purposes, though it does not generally give a good way to compute determinants of large matrices. It can be useful for matrices with only a few non-zero entries. For example, suppose there is a permutation $\sigma$ in $S_n$ such that $a_{ij} = 1$ for $j = \sigma(i)$ and $a_{ij} = 0$ otherwise; then $\det A = \operatorname{sgn} \sigma$. A matrix of this kind is called a permutation matrix. For example,
\[ \begin{vmatrix} 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \end{vmatrix} = 1 \]
because the permutation
\[ \begin{pmatrix} 1 & 2 & 3 & 4 \\ 4 & 3 & 2 & 1 \end{pmatrix} = (1, 4)(2, 3) \]
is even.
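The permutation formula for the determinant can be implemented in a few lines. This is a sketch for small matrices only (the sum has $n!$ terms); the function names `sign` and `det` are chosen here, and rows and columns are indexed from 0 as is usual in Python.

```python
from itertools import permutations


def sign(image):
    """Sign of the permutation i -> image[i], via its number of inversions."""
    n = len(image)
    inv = sum(1 for a in range(n) for b in range(a + 1, n) if image[a] > image[b])
    return -1 if inv % 2 else 1


def det(a):
    """Determinant as the sum over all sigma of sgn(sigma) * prod_i a[i][sigma(i)]."""
    n = len(a)
    total = 0
    for image in permutations(range(n)):
        term = sign(image)
        for i in range(n):
            term *= a[i][image[i]]
        total += term
    return total


print(det([[1, 2], [3, 4]]))                       # 1*4 - 2*3
print(det([[1, 2, 3], [4, 5, 6], [7, 8, 10]]))
print(det([[0, 0, 0, 1],
           [0, 0, 1, 0],
           [0, 1, 0, 0],
           [1, 0, 0, 0]]))                         # the permutation matrix above
```

For a permutation matrix only one term of the sum is non-zero, which is why the determinant reduces to the sign of the permutation.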
3.4. Groups and subgroups
In studying permutations and symmetry, we often encounter collections of invertible func-
tions from a set to itself which have especially nice algebraic properties, and these are sum-
marised in the notion of a group. Many of the examples will also be found as subgroups of other
groups, and these are particularly convenient to work with when we have good knowledge of
the larger group.
Definition 3.14. Let $G$ be a set and $*$ a binary operation which combines each pair of elements $x, y \in G$ to give another element $x * y \in G$. Then $(G, *)$ is a group if it satisfies the following conditions.
(Gp1) for all elements $x, y, z \in G$, $(x * y) * z = x * (y * z)$;
(Gp2) there is an element $\iota \in G$ such that for every $x \in G$, $\iota * x = x = x * \iota$;
(Gp3) for every $x \in G$, there is an element $y \in G$ such that $x * y = \iota = y * x$.
(Gp1) is usually called the associativity law. In (Gp2), $\iota$ is usually called the identity element of $(G, *)$. In (Gp3), the element $y$ associated to $x$ is called the inverse of $x$ and is denoted $x^{-1}$.
Example 3.15. By Proposition 3.1, if $X$ is a finite set then $(\mathrm{Perm}(X), \circ)$ is a group. In particular, by taking $X = \mathbf{n}$, we see that $(S_n, \circ)$ is a group.
Example 3.16. In each of the following cases, $(G, *)$ is a group.
(1) $G = \mathbb{Z}$, $* = +$.
(2) $G = \mathbb{Q}$, $* = +$.
(3) $G = \mathbb{R} \setminus \{0\}$, $* = \times$ (multiplication).
The last part is a special case of a general result. Recall the notion of a field. The proof of this result uses properties of multiplication in a field, specifically associativity, the existence of 1 and inverses.
Proposition 3.17. Let $F$ be a field. Then the set of non-zero elements $F^\times = F \setminus \{0\}$ forms an abelian group under multiplication.
The group $F^\times$ is called the group of units of $F$. For example, if $p$ is a prime, then in the finite field $\mathbb{Z}/p$, the group of units $(\mathbb{Z}/p)^\times$ has $p - 1$ elements.
Example 3.18. Let $F$ be any field. Let
\[ \mathrm{GL}_2(F) = \left\{ \begin{pmatrix} a & b \\ c & d \end{pmatrix} : a, b, c, d \in F, \; ad - bc \neq 0 \right\} \]
and let $*$ be matrix multiplication; then $(\mathrm{GL}_2(F), *)$ is a group.
Notice that
\[ \det \begin{pmatrix} a & b \\ c & d \end{pmatrix} = ad - bc, \]
so we could define
\[ \mathrm{GL}_2(F) = \{ A : A \text{ a } 2 \times 2 \text{ matrix with entries in } F \text{ and } \det A \neq 0 \}. \]
More generally, the set of all $n \times n$ matrices with non-zero determinant forms a group under matrix multiplication, known as the $n \times n$ general linear group over $F$ and denoted $\mathrm{GL}_n(F)$ or $\mathrm{GL}(F, n)$. Of course, the condition $\det A \neq 0$ is equivalent to the invertibility of $A$.
Example 3.19. An isometry of the plane is a bijective function $f$ from the set of 2-dimensional vectors to itself such that
\[ |f(u) - f(v)| = |u - v| \]
for any pair of vectors $u, v$. The set of isometries of the plane is denoted $\mathrm{Euc}(2)$. One finds that $(\mathrm{Euc}(2), \circ)$ is a group, called the Euclidean group of $\mathbb{R}^2$.
We will study these and other examples in more detail.
When discussing a group $(G, *)$, we will often write $xy$ for the product $x * y$ if no confusion seems likely to arise. For example, when dealing with a permutation group $(\mathrm{Perm}(X), \circ)$ we write $\sigma\tau$ for $\sigma \circ \tau$.
If a group $(G, *)$ has a finite underlying set $G$, then the number of elements in $G$ is called the order of $G$ and is denoted $|G|$. If $|G|$ is not finite, $G$ is said to be infinite.
A group $G$ is commutative or abelian if $x * y = y * x$ for every pair of elements $x, y \in G$. Most groups are not commutative.
Definition 3.20. Let $(G, *)$ be a group and let $H \subseteq G$ be a subset. Then $H$ is a subgroup of $G$ if $(H, *)$ is a group. In detail this means
• for $x, y \in H$, $x * y \in H$;
• $\iota \in H$;
• if $z \in H$ then $z^{-1} \in H$.
Sometimes the three conditions of Definition 3.20 are replaced by the following two equivalent conditions:
• $H$ is non-empty;
• for every pair $x, y \in H$, $xy^{-1} \in H$.
Usually it is easiest to show that $H \neq \emptyset$ by directly verifying that $\iota \in H$, but given any element $x \in H$, the second property implies that $\iota = xx^{-1} \in H$.
When checking whether a subset $H$ of a group $G$ is a subgroup, it is not necessary to check associativity, because (Gp1) holds for all elements of $G$ and so in particular for elements of $H$. For example, the alternating group $A_n$ is a subgroup of the symmetric group $S_n$ (see Proposition 3.12). We write $H \leq G$ if $H$ is a subgroup of $G$, and we write $H < G$ if $H \leq G$ and $H \neq G$. If $H < G$ then $H$ is called a proper subgroup of $G$.
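Example 3.21. Let $\mathrm{GL}_n(\mathbb{R})$ be the group of invertible $n \times n$ matrices over $\mathbb{R}$, and let
\[ \mathrm{SL}_n(\mathbb{R}) = \{ A \in \mathrm{GL}_n(\mathbb{R}) : \det A = 1 \} \]
($\mathrm{SL}_n(\mathbb{R})$ is called the special linear group). Show that $\mathrm{SL}_n(\mathbb{R})$ is a subgroup of $\mathrm{GL}_n(\mathbb{R})$.
Solution. Suppose that $A, B \in \mathrm{SL}_n(\mathbb{R})$, so that $\det A = 1$ and $\det B = 1$. Then
\[ \det(AB) = (\det A)(\det B) = 1, \]
so $AB \in \mathrm{SL}_n(\mathbb{R})$.
One has $I \in \mathrm{SL}_n(\mathbb{R})$ because $\det I = 1$.
Suppose that $A \in \mathrm{SL}_n(\mathbb{R})$, so that $\det A = 1$. Then $\det A^{-1} = 1$ because
\[ (\det A)(\det A^{-1}) = \det(AA^{-1}) = \det I = 1. \]
Therefore $A^{-1} \in \mathrm{SL}_n(\mathbb{R})$.
Because these three conditions hold, $\mathrm{SL}_n(\mathbb{R})$ is a subgroup of $\mathrm{GL}_n(\mathbb{R})$.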
Let $G$ be a group and let $g$ be a member of $G$. We write
\[ g^0 = \iota, \quad g^1 = g, \quad g^2 = gg, \quad g^3 = ggg, \quad \ldots \]
and
\[ g^{-1} = g^{-1}, \quad g^{-2} = g^{-1}g^{-1}, \quad g^{-3} = g^{-1}g^{-1}g^{-1}, \quad \ldots, \]
and we write $\langle g \rangle$ for the subset of $G$ given by
\[ \langle g \rangle = \{ \ldots, g^{-2}, g^{-1}, g^0, g^1, g^2, \ldots \}. \]
It is easy to see that $\langle g \rangle$ is a subgroup of $G$, and it is called the cyclic subgroup generated by $g$. The usual index laws apply:
\[ g^{m+n} = g^m g^n, \quad (g^m)^n = g^{mn}. \]
If a group $G$ is equal to $\langle c \rangle$ for some $c \in G$, then $G$ is called a cyclic group.
Note that the powers $g^n$ of a given group element $g$ do not have to be distinct. For example, suppose that $g = (1, 2, 3)$ in $S_3$; then
\[ g^0 = \mathrm{Id}, \quad g^1 = g, \quad g^2 = (1, 3, 2), \quad g^3 = \mathrm{Id}, \quad g^4 = g, \quad g^5 = g^2, \quad g^6 = \mathrm{Id}, \quad \ldots \]
and
\[ g^{-1} = g^2, \quad g^{-2} = g, \quad g^{-3} = \mathrm{Id}, \quad g^{-4} = g^2, \quad \ldots, \]
so in fact $\langle g \rangle$ has only three distinct members. If the number of members of $\langle g \rangle$ is a finite number $m$, then we say that $g$ has finite order $m$ and write $|g| = m$; if the number of members of $\langle g \rangle$ is infinite then we say that $g$ has infinite order and write $|g| = \infty$.
One gets cyclic groups of various orders from $\mathbb{Z}$ and $\mathbb{Z}/m$ under addition, as follows.
Proposition 3.22. Under addition, $\mathbb{Z}$ is an abelian group, and it is an infinite cyclic group generated by 1. If $m$ is a positive integer, then $\mathbb{Z}/m$ is an abelian group under addition, and it is a cyclic group of order $m$ generated by 1.
Proof. It is clear that $\mathbb{Z}$ satisfies the abelian group axioms. The cyclic subgroup generated by 1 consists of the integers
\[ 0, \quad \pm 1, \quad \pm(1 + 1), \quad \ldots, \]
so it is equal to the entire group $\mathbb{Z}$. This means that $\mathbb{Z}$ is a cyclic group generated by 1, and it is clear that $\mathbb{Z}$ is infinite.
In the same way, $\mathbb{Z}/m$ is an abelian group. It has $m$ elements
\[ 0, \quad 1, \quad 1 + 1, \quad \ldots, \quad 1 + 1 + \cdots + 1, \]
so it is a cyclic group of order $m$ generated by 1.
Every cyclic group behaves like $\mathbb{Z}$ or $\mathbb{Z}/m$.
Proposition 3.23. Let $g$ be a member of a group $G$. If $g$ has infinite order then the integer powers
\[ \ldots, g^{-2}, g^{-1}, g^0, g^1, g^2, \ldots \]
of $g$ are distinct. If $g$ has finite order $m$ then
\[ \langle g \rangle = \{ g^0, g^1, g^2, \ldots, g^{m-1} \} \]
with $g^0, g^1, g^2, \ldots, g^{m-1}$ distinct and with $g^m = \iota$.
Proof. Suppose that the integer powers of $g$ are not distinct; we must show that $g$ has finite order $m$ and that $\langle g \rangle$ is as described. To do this, let $i$ and $j$ be integers with $i < j$ such that $g^i = g^j$ and such that $j - i$ is as small as possible, and let $m = j - i$. By the choice of $i$ and $j$ we see that the powers $g^0, g^1, \ldots, g^{m-1}$ are distinct. We also see that
\[ g^m = g^{j-i} = g^j g^{-i} = g^i g^{-i} = g^0 = \iota. \]
For any integer $k$ we can write $k = qm + r$ with $0 \leq r < m$, and we then have
\[ g^k = (g^m)^q g^r = (g^0)^q g^r = g^r. \]
Therefore $g^0, g^1, \ldots, g^{m-1}$ is a complete list of the members of $\langle g \rangle$. This completes the proof.
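For an element of finite order, the list $g^0, g^1, \ldots, g^{m-1}$ can be generated by multiplying by $g$ until the identity reappears. The sketch below is generic in the group operation; the names `cyclic_subgroup` and `compose` are hypothetical, and the loop would not terminate for an element of infinite order.

```python
def cyclic_subgroup(g, op, identity):
    """List g^0, g^1, ..., g^(m-1) for an element g of finite order m."""
    powers = [identity]
    x = g
    while x != identity:
        powers.append(x)
        x = op(x, g)          # next power of g
    return powers


def compose(s, t):
    """Composite s t of permutations of {1..n} stored as tuples of images."""
    return tuple(s[t[i] - 1] for i in range(len(t)))


# g = (1, 2, 3) in S_3, stored as the tuple of images (g(1), g(2), g(3)).
powers = cyclic_subgroup((2, 3, 1), compose, (1, 2, 3))
print(powers, len(powers))

# The element 8 of Z/12 under addition.
print(cyclic_subgroup(8, lambda x, y: (x + y) % 12, 0))
```

The length of the returned list is the order $|g|$.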
Here is an important result about cyclic groups that shows how ideas from number theory come into group theory.
Proposition 3.24. Let $G = \langle g \rangle$ be a cyclic group. Then every subgroup of $G$ is cyclic.
Proof. In this proof we write 1 for the identity element. We can assume that $G \neq \{1\}$. Let $H$ be a subgroup of $G$.
If $H = \{1\}$ then $H = \langle 1 \rangle$. Otherwise there is some power $g^k \neq 1$ such that $g^k \in H$. Since $g^{-k} = (g^k)^{-1} \in H$, we can assume that $k \in \mathbb{N}$, hence the set
\[ S = \{ n \in \mathbb{N} : g^n \in H \} \]
is a non-empty subset of $\mathbb{N}$. By the Well Ordering Principle, $S$ has a minimal element, $d$ say. Now for any $g^n \in H$, we can write
\[ n = qd + r, \]
where $q, r \in \mathbb{Z}$ and $r = 0, 1, \ldots, d - 1$. Then
\[ g^r = g^{n - qd} = g^n g^{-qd} = g^n (g^d)^{-q} \]
which is in $H$ since $g^n \in H$, $g^d \in H$ and $H$ is a subgroup. This shows that if $r \neq 0$ then $r \in S$, contradicting the minimality of $d$, so $r = 0$. Hence
\[ g^n = g^{qd} = (g^d)^q, \]
so $H = \langle g^d \rangle$ and hence $H$ is cyclic.
Notice that the group of integers $\mathbb{Z}$ under addition has this property, so any additive subgroup $H$ of $\mathbb{Z}$ has the form
\[ H = \{ kd : k \in \mathbb{Z} \} \]
for some $d \in \mathbb{Z}$.
Example 3.25. In the additive group $\mathbb{Z}$, let
\[ H = \{3x + 7y : x, y \in \mathbb{Z}\}. \]
Show that $H$ is a subgroup and find a cyclic generator.
Solution. To show that $H$ is a subgroup, note that $0 \in H$ and if $x_1, x_2, y_1, y_2 \in \mathbb{Z}$ then
\[ (3x_1 + 7y_1) - (3x_2 + 7y_2) = 3(x_1 - x_2) + 7(y_1 - y_2) \in H. \]
Since $\gcd(3, 7) = 1$, we have $1 \in H$; in fact
\[ 1 = 7 - 2 \cdot 3. \]
This shows that $\mathbb{Z} = H$.
In general we have the following, whose proof we leave as an exercise.
Proposition 3.26. In the additive group $\mathbb{Z}$, let $a, b \in \mathbb{Z}$ be non-zero and let
\[ H = \{xa + yb : x, y \in \mathbb{Z}\}. \]
Then $H$ is a subgroup and
\[ H = \{z \gcd(a, b) : z \in \mathbb{Z}\}, \]
so it is cyclic with generator $\gcd(a, b)$.
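The generator in Proposition 3.26 can be computed explicitly. The following is a minimal sketch, assuming the extended Euclidean algorithm of Chapter 1; the function name `extended_gcd` is ours and not part of the notes. It returns $\gcd(a, b)$ together with integers $x, y$ such that $xa + yb = \gcd(a, b)$, which exhibits the generator as a member of $H$.

```python
def extended_gcd(a, b):
    """Return (g, x, y) with g = gcd(a, b) >= 0 and x*a + y*b == g.

    Hypothetical helper: iterative extended Euclidean algorithm.
    """
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        q = old_r // r
        # Each step keeps the invariants old_r = old_x*a + old_y*b
        # and r = x*a + y*b.
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    if old_r < 0:  # normalise so that the gcd is non-negative
        old_r, old_x, old_y = -old_r, -old_x, -old_y
    return old_r, old_x, old_y


# Example 3.25: H = {3x + 7y} contains 1, so H = Z.
g, x, y = extended_gcd(3, 7)
print(g, x, y)  # coefficients satisfy x*3 + y*7 == g
```

For $a = 3$, $b = 7$ this reproduces the relation $1 = 7 - 2 \cdot 3$ used in Example 3.25.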
Here is another source of useful finite cyclic groups. This result is usually called Gauss's Primitive Element Theorem. The proof can be found in many books. Recall from Proposition 3.17 the group of units of a field.
Theorem 3.27. Let $p$ be a prime number. Then the group of units $(\mathbb{Z}/p)^{\times}$ is cyclic and has order $p - 1$.
So there is a natural number $w$ with the property that every integer $t$ not divisible by $p$ can be expressed as a power of $w$ modulo $p$, i.e., for some $d$,
\[ t \equiv w^{d} \bmod p. \]
The multiplicative order of the residue class of $w$ is $p - 1$, and any residue class with that order is a generator. Such residue classes are called primitive generators modulo $p$.
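For example, if $p = 13$, then working modulo 13 we have
\[ 2, \quad 2^{2} = 4, \quad 2^{3} = 8, \quad 2^{4} = 16 \equiv 3, \quad 2^{5} \equiv 6, \quad 2^{6} \equiv 12, \quad 2^{7} \equiv 24 \equiv 11, \]
\[ 2^{8} \equiv 22 \equiv 9, \quad 2^{9} \equiv 18 \equiv 5, \quad 2^{10} \equiv 10, \quad 2^{11} \equiv 20 \equiv 7, \quad 2^{12} \equiv 14 \equiv 1. \]
So 2 is a generator modulo 13. In fact there are 4 such generators; it is a good exercise to find the other 3.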
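The table of powers above can be checked mechanically. The sketch below is illustrative only: the helper names `multiplicative_order` and `primitive_generators` are ours, and the brute-force search assumes $p$ is a small prime. It computes the multiplicative order of a residue class and lists those of order $p - 1$.

```python
def multiplicative_order(w, p):
    """Order of w in (Z/p)^x, assuming p is prime and p does not divide w."""
    power, order = w % p, 1
    while power != 1:
        power = (power * w) % p
        order += 1
    return order


def primitive_generators(p):
    """Residues 1 <= w < p whose multiplicative order is p - 1."""
    return [w for w in range(1, p) if multiplicative_order(w, p) == p - 1]


# Successive powers 2^1, ..., 2^12 modulo 13, as in the text.
powers_of_2 = [pow(2, d, 13) for d in range(1, 13)]
print(powers_of_2)
print(len(primitive_generators(13)))  # number of generators modulo 13
```

Calling `primitive_generators(13)` lists all the generators, so it can be used to check an answer to the exercise once it has been attempted by hand.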
CHAPTER 4
Groups and symmetry
In this chapter we will study groups of symmetries of subsets of the plane, $\mathbb{R}^{2}$. Actually, most of the discussion of isometries applies just as well to $\mathbb{R}^{n}$ for any $n \ge 1$.
4.1. Some 2-dimensional vector geometry
We will denote a point in the plane by a letter such as P. The origin O will be taken as the
centre of a coordinate system based on the x and y-axes in the usual way.
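The distance between points $P$ and $Q$ will be denoted $|PQ| = |QP|$. The (undirected) line segment joining $P$ and $Q$ will be denoted $PQ$, while the directed line segment joining $P$ and $Q$ will be denoted $\overrightarrow{PQ}$ (this is a vector). Of course, $\overrightarrow{QP} = -\overrightarrow{PQ}$.
Given a point $P$, its position vector is the vector $\mathbf{p} = \overrightarrow{OP}$ which we think of as joining $O$ to $P$. We will often write $P(\mathbf{p})$ to indicate that $P$ has position vector $\mathbf{p}$.
Given $P(\mathbf{p})$ and $Q(\mathbf{q})$, we have the diagram
(4.1) [Diagram: triangle $OPQ$ with sides $\mathbf{p} = \overrightarrow{OP}$, $\mathbf{q} = \overrightarrow{OQ}$ and $\overrightarrow{PQ}$.]
from which we see that $\mathbf{p} + \overrightarrow{PQ} = \mathbf{q}$. Hence we have
\[ \overrightarrow{PQ} = \mathbf{q} - \mathbf{p}. \tag{4.2} \]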
Each position vector $\mathbf{p}$ can be expressed in terms of its $x$ and $y$ coordinates $p_1, p_2$ and we will often write $\mathbf{p} = (p_1, p_2)$ or $\mathbf{p} = (x_P, y_P)$. With this notation, Equation (4.2) expands to
\[ \overrightarrow{PQ} = (q_1 - p_1, q_2 - p_2) = (x_Q - x_P, y_Q - y_P). \]
We will denote the set of all vectors $(x, y)$ by $\mathbb{R}^{2}$, so
\[ \mathbb{R}^{2} = \{(x, y) : x, y \in \mathbb{R}\}. \]
This set will be identified with the plane by the correspondence
\[ (x, y) \longleftrightarrow \text{the point with position vector } (x, y). \]
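The distance between two points $P$ and $Q$ can be found using the formula
\[ |PQ| = \text{length of } \overrightarrow{PQ} = \text{length of } (\mathbf{q} - \mathbf{p}) = \sqrt{(q_1 - p_1)^{2} + (q_2 - p_2)^{2}}. \]
In particular, the length of the vector $\mathbf{p} = \overrightarrow{OP}$ is
\[ |\mathbf{p}| = |OP| = \sqrt{p_1^{2} + p_2^{2}}. \tag{4.3} \]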
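To find the angle $\theta$ between two non-zero vectors $\mathbf{u} = (u_1, u_2)$ and $\mathbf{v} = (v_1, v_2)$ we can make use of the dot or scalar product which is defined to be
\[ \mathbf{u} \cdot \mathbf{v} = (u_1, u_2) \cdot (v_1, v_2) = u_1 v_1 + u_2 v_2. \tag{4.4} \]
Notice that
\[ |\mathbf{u}|^{2} = \mathbf{u} \cdot \mathbf{u}, \qquad |\mathbf{v}|^{2} = \mathbf{v} \cdot \mathbf{v}. \]
Then
\[ \theta = \cos^{-1}\left( \frac{\mathbf{u} \cdot \mathbf{v}}{|\mathbf{u}|\,|\mathbf{v}|} \right). \tag{4.5} \]
[Figure: the vectors $\mathbf{u}$ and $\mathbf{v}$ and the angle $\theta$ between them.]
Notice that we measure angles in radians and in the anti-clockwise direction; also $\theta \in [0, \pi]$ by definition of $\cos^{-1}$. However, (4.5) is unchanged if we switch the order of $\mathbf{u}$ and $\mathbf{v}$. Vectors $\mathbf{u}$ and $\mathbf{v}$ are perpendicular, normal or orthogonal if $\mathbf{u} \cdot \mathbf{v} = 0$, or equivalently if the angle between them is $\pi/2$.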
If $A, B, C$ are three distinct points then the angle between the lines $AB$ and $AC$ is given by
\[ \angle BAC = \cos^{-1}\left( \frac{\overrightarrow{AB} \cdot \overrightarrow{AC}}{|AB|\,|AC|} \right). \tag{4.6} \]
The lines $AB$ and $AC$ are perpendicular, normal or orthogonal if $\angle BAC = \pi/2$.
A line $L$ can be specified in several different ways. First by using an implicit equation $ax + by = c$ with $(a, b) \neq (0, 0)$; this gives
\[ L = \{(x, y) \in \mathbb{R}^{2} : ax + by = c\}. \tag{4.7a} \]
It is worth remarking that the vector $(a, b)$ is perpendicular to $L$. An alternative way to write the implicit equation is as $(a, b) \cdot (x, y) = c$, so we also have
\[ L = \{\mathbf{x} \in \mathbb{R}^{2} : (a, b) \cdot \mathbf{x} = c\}. \tag{4.7b} \]
To determine $c$ it suffices to know any point $\mathbf{x}_0$ on $L$; then $c = (a, b) \cdot \mathbf{x}_0$.
Second, if we have a vector $\mathbf{u}$ parallel to $L$ (and so perpendicular to $(a, b)$) then we can use the parametric equation $\mathbf{x} = t\mathbf{u} + \mathbf{x}_0$, where $t \in \mathbb{R}$ and $\mathbf{x}_0$ is some point on $L$. It is usual to take $\mathbf{u}$ to be a unit vector, i.e., $|\mathbf{u}| = 1$. Then
\[ L = \{t\mathbf{u} + \mathbf{x}_0 \in \mathbb{R}^{2} : t \in \mathbb{R}\}. \tag{4.7c} \]
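It is also useful to recall the idea of projecting a non-zero vector $\mathbf{v}$ onto another $\mathbf{w}$. To do this, we make use of the unit vector
\[ \hat{\mathbf{w}} = \frac{1}{|\mathbf{w}|}\,\mathbf{w} = \frac{\mathbf{w}}{|\mathbf{w}|}. \]
Then the component of $\mathbf{v}$ in the $\mathbf{w}$-direction, or the projection of $\mathbf{v}$ onto $\mathbf{w}$, is the vector
\[ \mathbf{v}_{\mathbf{w}} = (\mathbf{v} \cdot \hat{\mathbf{w}})\,\hat{\mathbf{w}} = \left( \frac{\mathbf{v} \cdot \mathbf{w}}{|\mathbf{w}|^{2}} \right) \mathbf{w}. \]
[Figure: the vectors $\mathbf{w}$, $\mathbf{v}$ and the projection $\mathbf{v}_{\mathbf{w}}$.]
Then $\mathbf{v}_{\mathbf{w}}$ is parallel to $\mathbf{w}$ and
\[ (\mathbf{v} - \mathbf{v}_{\mathbf{w}}) \cdot \mathbf{w} = 0, \]
so the vector $(\mathbf{v} - \mathbf{v}_{\mathbf{w}})$ is perpendicular to $\mathbf{w}$.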
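As a quick numerical illustration of the projection formula, here is a minimal sketch with vectors stored as pairs; the helper names `dot` and `project` are ours. It computes the projection of one vector onto another and checks that the remainder is perpendicular to the second vector.

```python
def dot(u, v):
    """Scalar product of two vectors in the plane."""
    return u[0] * v[0] + u[1] * v[1]


def project(v, w):
    """Projection (v.w / |w|^2) w of v onto a non-zero vector w."""
    scale = dot(v, w) / dot(w, w)
    return (scale * w[0], scale * w[1])


v, w = (3.0, 4.0), (2.0, 0.0)
v_w = project(v, w)
residual = (v[0] - v_w[0], v[1] - v_w[1])
print(v_w, residual, dot(residual, w))
```

Here `v_w` is the component of `v` along the $x$-axis, and the final printed value is the scalar product of the remainder with `w`, which vanishes as the text asserts.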
We can also project a point $P$ with position vector $\mathbf{p}$ onto a line $L$ which does not contain $P$. To do this, we consider the line $L'$ passing through $P$ and perpendicular to $L$,
\[ L' = \{s\mathbf{u}' + \mathbf{p} : s \in \mathbb{R}\}, \]
where $\mathbf{u}'$ is any non-zero vector perpendicular to $L$ (for example $(a, b)$ or the unit vector in the same direction). Then the projection of $P$ onto $L$ is the point of intersection of $L$ and $L'$, whose position vector $\mathbf{p}' = s'\mathbf{u}' + \mathbf{p}$ can be determined by solving the following equation for $s'$:
\[ (a, b) \cdot (s'\mathbf{u}' + \mathbf{p}) = c. \]
4.2. Isometries of the plane
Definition 4.1. A function $\alpha : \mathbb{R}^{2} \to \mathbb{R}^{2}$ is an isometry if it is distance preserving.
Here, $\alpha$ is distance preserving or preserves distance means that for points $P(\mathbf{p})$ and $Q(\mathbf{q})$,
\[ |\alpha(P)\alpha(Q)| = |PQ|, \quad \text{i.e.,} \quad |\alpha(\mathbf{p}) - \alpha(\mathbf{q})| = |\mathbf{p} - \mathbf{q}|. \]
Here is a first observation on isometries; later we will see that isometries are always bijections and have inverses which are isometries.
Lemma 4.2. Every isometry is injective.
Proof. Let $\alpha$ be an isometry and suppose that $P, Q$ are points for which $\alpha(P) = \alpha(Q)$. Then
\[ |PQ| = |\alpha(P)\alpha(Q)| = |\alpha(P)\alpha(P)| = 0, \]
which can only happen if $P = Q$.
Definition 4.3. The isometry $\alpha$ fixes or stabilizes a point $P$ if $\alpha(P) = P$.
Before considering examples, we give the following important result.
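Proposition 4.4. Let $\alpha : \mathbb{R}^{2} \to \mathbb{R}^{2}$ be an isometry which fixes the origin. Then $\alpha$ preserves scalar products and angles between vectors.
Proof. Let $\mathbf{u}, \mathbf{v}$ be vectors and let $U, V$ be the points with these as position vectors. If $\alpha(U)$ and $\alpha(V)$ have position vectors $\mathbf{u}' = \overrightarrow{O\alpha(U)}$ and $\mathbf{v}' = \overrightarrow{O\alpha(V)}$, then since for every pair of points $P, Q$ we have $|\alpha(P)\alpha(Q)| = |PQ|$,
\[ |\mathbf{u}' - \mathbf{v}'|^{2} = |\alpha(U)\alpha(V)|^{2} = |UV|^{2} = |\mathbf{u} - \mathbf{v}|^{2}, \]
hence
\[ |\mathbf{u}'|^{2} + |\mathbf{v}'|^{2} - 2\,\mathbf{u}' \cdot \mathbf{v}' = |\mathbf{u}|^{2} + |\mathbf{v}|^{2} - 2\,\mathbf{u} \cdot \mathbf{v}. \]
Since
\[ |\mathbf{u}'| = |O\alpha(U)| = |\alpha(O)\alpha(U)| = |OU| = |\mathbf{u}|, \]
\[ |\mathbf{v}'| = |O\alpha(V)| = |\alpha(O)\alpha(V)| = |OV| = |\mathbf{v}|, \]
we obtain
\[ \mathbf{u}' \cdot \mathbf{v}' = \mathbf{u} \cdot \mathbf{v}, \]
which shows that the scalar product of two position vectors is unchanged by an isometry which fixes the origin. Similarly, angles are preserved since the angle between the vectors $\mathbf{u}', \mathbf{v}'$ is
\[ \cos^{-1}\left( \frac{\mathbf{u}' \cdot \mathbf{v}'}{|\mathbf{u}'|\,|\mathbf{v}'|} \right) = \cos^{-1}\left( \frac{\mathbf{u} \cdot \mathbf{v}}{|\mathbf{u}|\,|\mathbf{v}|} \right). \]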
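Corollary 4.5. An isometry $\alpha : \mathbb{R}^{2} \to \mathbb{R}^{2}$ preserves angles between lines.
Proof. Consider the isometry $\alpha_0 : \mathbb{R}^{2} \to \mathbb{R}^{2}$ which sends $P \in \mathbb{R}^{2}$ to $\alpha_0(P)$ with position vector
\[ \overrightarrow{O\alpha_0(P)} = \overrightarrow{O\alpha(P)} - \overrightarrow{O\alpha(O)}. \]
Then $\alpha_0(O) = O$. For any two points $A, B$ we have
\[ \overrightarrow{\alpha(A)\alpha(B)} = \overrightarrow{\alpha_0(A)\alpha_0(B)} \]
and the result follows from Proposition 4.4.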
4.3. Types of isometries
There are four types of isometries of the plane. The three simplest are translations, reflections in lines and rotations about points. Examples of the fourth type, glide reflections, are built up as compositions of reflections and translations parallel to the lines of reflection.
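Translations. For any vector $\mathbf{t} \in \mathbb{R}^{2}$, the translation by $\mathbf{t}$ is the function
\[ \mathrm{Trans}_{\mathbf{t}} : \mathbb{R}^{2} \to \mathbb{R}^{2}; \quad \mathrm{Trans}_{\mathbf{t}}(\mathbf{x}) = \mathbf{x} + \mathbf{t}. \]
[Figure: points $U$, $V$, $W$ and their images $\mathrm{Trans}_{\mathbf{t}}(U)$, $\mathrm{Trans}_{\mathbf{t}}(V)$, $\mathrm{Trans}_{\mathbf{t}}(W)$, each displaced by $\mathbf{t}$.]
Notice that
\[ |\mathrm{Trans}_{\mathbf{t}}(\mathbf{x}) - \mathrm{Trans}_{\mathbf{t}}(\mathbf{y})| = |(\mathbf{x} + \mathbf{t}) - (\mathbf{y} + \mathbf{t})| = |\mathbf{x} - \mathbf{y}|, \]
so $\mathrm{Trans}_{\mathbf{t}}$ is an isometry. If $\mathrm{Trans}_{\mathbf{s}}$ is a second such translation function, we have
\[ \mathrm{Trans}_{\mathbf{t}} \circ \mathrm{Trans}_{\mathbf{s}}(\mathbf{x}) = \mathrm{Trans}_{\mathbf{t}}(\mathbf{x} + \mathbf{s}) = \mathbf{x} + \mathbf{s} + \mathbf{t} = \mathrm{Trans}_{\mathbf{s}+\mathbf{t}}(\mathbf{x}), \]
so
\[ \mathrm{Trans}_{\mathbf{t}} \circ \mathrm{Trans}_{\mathbf{s}} = \mathrm{Trans}_{\mathbf{s}+\mathbf{t}}. \]
Since $\mathbf{s} + \mathbf{t} = \mathbf{t} + \mathbf{s}$, we also have
\[ \mathrm{Trans}_{\mathbf{s}} \circ \mathrm{Trans}_{\mathbf{t}} = \mathrm{Trans}_{\mathbf{t}} \circ \mathrm{Trans}_{\mathbf{s}}. \]
So translations behave well with respect to composition. We also have
\[ \mathrm{Trans}_{\mathbf{0}} = \mathrm{Id}_{\mathbb{R}^{2}}, \qquad \mathrm{Trans}_{\mathbf{t}}^{-1} = \mathrm{Trans}_{-\mathbf{t}}. \]
We can summarize these observations in
Proposition 4.6. The set of all translations of $\mathbb{R}^{2}$, $\mathrm{Trans}(2)$, forms a commutative group under composition.
Notice that when $\mathbf{t} \neq \mathbf{0}$, every point in the plane is moved by $\mathrm{Trans}_{\mathbf{t}}$, so such a transformation has no fixed points.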
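Reflections. The next type of isometry is a reflection in a line $L$. Recall that a line in the plane has the form
\[ L = \{(x, y) \in \mathbb{R}^{2} : ax + by = c\}, \]
where $a, b, c \in \mathbb{R}$ with at least one of $a$ and $b$ non-zero. The reflection in $L$ is the function
\[ \mathrm{Refl}_{L} : \mathbb{R}^{2} \to \mathbb{R}^{2} \]
which sends every point on $L$ to itself and if $P$ lies on a line $L'$ perpendicular to $L$ and intersecting it at $M$ say, then $\mathrm{Refl}_{L}(P)$ also lies on $L'$ and satisfies $|M\,\mathrm{Refl}_{L}(P)| = |MP|$.
[Figure: the line $L$, a point $P$, the perpendicular line $L'$ meeting $L$ at $M$, and the image $\mathrm{Refl}_{L}(P)$.]
This is equivalent to saying that if $P$ and $M$ have position vectors $\mathbf{p}$ and $\mathbf{m}$, then
\[ \mathrm{Refl}_{L}(\mathbf{p}) - \mathbf{p} = 2(\mathbf{m} - \mathbf{p}) \]
or
\[ \mathrm{Refl}_{L}(\mathbf{p}) = 2\mathbf{m} - \mathbf{p}, \tag{4.8} \]
where $\mathrm{Refl}_{L}(\mathbf{p}) - \mathbf{p}$ is perpendicular to $L$.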
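In order to determine the effect of a reflection, recall that the vector $(a, b)$ is perpendicular to $L$. Consider the unit vector
\[ \mathbf{u} = \frac{1}{\sqrt{a^{2} + b^{2}}}\,(a, b). \]
Then we can find the point $M$ as follows. $L'$ is the line given in parametric form by
\[ \mathbf{x} = t\mathbf{u} + \mathbf{p} \quad (t \in \mathbb{R}), \]
and $M$ is the point on both $L$ and $L'$. So $\mathbf{m} = s\mathbf{u} + \mathbf{p}$, say, satisfies the linear equation in the unknown $s$,
\[ \mathbf{u} \cdot \mathbf{m} = \frac{c}{\sqrt{a^{2} + b^{2}}}. \]
This expands to give
\[ s + \mathbf{u} \cdot \mathbf{p} = \frac{c}{\sqrt{a^{2} + b^{2}}}. \]
Thus we have
\[ \mathbf{m} = \left( \frac{c}{\sqrt{a^{2} + b^{2}}} - \mathbf{u} \cdot \mathbf{p} \right) \mathbf{u} + \mathbf{p}. \tag{4.9} \]
Substituting into Equation (4.8) we obtain
\[ \mathrm{Refl}_{L}(\mathbf{p}) = 2\left( \frac{c}{\sqrt{a^{2} + b^{2}}} - \mathbf{u} \cdot \mathbf{p} \right) \mathbf{u} + \mathbf{p}. \tag{4.10} \]
Performing a reflection twice gives the identity transformation,
\[ (\mathrm{Refl}_{L})^{2} = \mathrm{Refl}_{L} \circ \mathrm{Refl}_{L} = \mathrm{Id}_{\mathbb{R}^{2}}. \tag{4.11} \]
Notice that points on the line $L$ are fixed by $\mathrm{Refl}_{L}$, while all other points are moved.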
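Example 4.7. Determine the effect of the reflection $\mathrm{Refl}_{L}$ on the point $P(1, 0)$, where
\[ L = \{(x, y) : x - y = 0\}. \]
Solution. First notice that the unit vector $\mathbf{u} = \frac{1}{\sqrt{2}}(1, -1)$ is perpendicular to $L$. Using this, we resolve $\mathbf{p} = (1, 0)$ into its components perpendicular and parallel to $L$. These are the vectors
\[ \mathbf{p}' = ((1, 0) \cdot \mathbf{u})\,\mathbf{u} = \frac{1}{\sqrt{2}}\,\mathbf{u}, \]
\[ \mathbf{p}'' = (1, 0) - \frac{1}{\sqrt{2}}\,\mathbf{u} = (1, 0) - \frac{1}{2}(1, -1) = \frac{1}{2}(1, 1). \]
Then we have
\[ \mathrm{Refl}_{L}(\mathbf{p}) = -\mathbf{p}' + \mathbf{p}'' = -\frac{1}{2}(1, -1) + \frac{1}{2}(1, 1) = (0, 1). \]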
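The answer can be cross-checked against the general formula (4.10). The following is a minimal sketch under our own naming (the function `reflect_in_line` is not from the notes); it reflects a point in the line $ax + by = c$ by first forming the unit normal $\mathbf{u}$ and then moving the point by twice its signed distance to the line.

```python
import math


def reflect_in_line(a, b, c, p):
    """Reflect the point p = (x, y) in the line ax + by = c via formula (4.10).

    Assumes (a, b) != (0, 0).
    """
    norm = math.hypot(a, b)
    u = (a / norm, b / norm)                       # unit vector perpendicular to L
    k = c / norm - (u[0] * p[0] + u[1] * p[1])     # c/sqrt(a^2+b^2) - u.p
    return (p[0] + 2 * k * u[0], p[1] + 2 * k * u[1])


# Example 4.7: the line x - y = 0 and the point P(1, 0).
image = reflect_in_line(1, -1, 0, (1.0, 0.0))
print(image)  # approximately (0, 1), up to floating-point rounding
```

Applying `reflect_in_line` twice with the same line returns the starting point (up to rounding), in line with Equation (4.11).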
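Example 4.8. If $\theta \in [0, \pi)$ and
\[ L_{\theta} = \{(t\cos\theta, t\sin\theta) : t \in \mathbb{R}\}, \]
find a formula for the effect of $\mathrm{Refl}_{L_{\theta}}$ on $P(x, y) \neq (0, 0)$.
Solution. The line $L_{\theta}$ contains the origin $O$ and the point $U(\cos\theta, \sin\theta)$. Also if $X(1, 0)$ is the point on the $x$-axis, then $\angle XOU = \theta$. If $\angle XOP = \varphi$, then on setting $r = |OP| = \sqrt{x^{2} + y^{2}}$ we have
\[ x = r\cos\varphi, \qquad y = r\sin\varphi. \]
If $P' = \mathrm{Refl}_{L_{\theta}}(P)$, with position vector $(x', y')$, we have
\[ \angle XOP' = \theta - (\varphi - \theta) = 2\theta - \varphi, \]
hence
\[ x' = r\cos(2\theta - \varphi), \qquad y' = r\sin(2\theta - \varphi). \]
Recall that
\[ \cos(\theta + \varphi) = \cos\theta\cos\varphi - \sin\theta\sin\varphi, \qquad \sin(\theta + \varphi) = \cos\theta\sin\varphi + \sin\theta\cos\varphi. \]
Using these we obtain
\[ x' = r(\cos 2\theta\cos\varphi + \sin 2\theta\sin\varphi), \qquad y' = r(\sin 2\theta\cos\varphi - \cos 2\theta\sin\varphi), \]
which yield
\[ x' = \cos 2\theta\, x + \sin 2\theta\, y, \tag{4.12a} \]
\[ y' = \sin 2\theta\, x - \cos 2\theta\, y. \tag{4.12b} \]
So applying $\mathrm{Refl}_{L_{\theta}}$ to $P$ we obtain the point
\[ P'(\cos 2\theta\, x + \sin 2\theta\, y,\ \sin 2\theta\, x - \cos 2\theta\, y). \]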
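We can also describe the composition of two reflections in two distinct parallel lines.
Proposition 4.9. Let $L_1$ and $L_2$ be distinct parallel lines. Then the two compositions $\mathrm{Refl}_{L_1} \circ \mathrm{Refl}_{L_2}$ and $\mathrm{Refl}_{L_2} \circ \mathrm{Refl}_{L_1}$ are translations perpendicular to the lines.
Proof. Let $\mathbf{p}$ be the position vector of a point $P$ on $L_1$ and let $\mathbf{v}$ be a vector perpendicular to $L_1$ and chosen so that $\mathbf{q} = \mathbf{p} + \mathbf{v}$ is the position vector of a point $Q$ on $L_2$. Clearly $\mathbf{v}$ is independent of which point $P$ on $L_1$ we start with.
[Figure: parallel lines $L_1$ through $P$ and $L_2$ through $Q$, the vector $\mathbf{v}$ from $P$ to $Q$, and the images $\mathrm{Refl}_{L_2} \circ \mathrm{Refl}_{L_1}(P)$ and $\mathrm{Refl}_{L_1} \circ \mathrm{Refl}_{L_2}(P)$.]
Then for any $t \in \mathbb{R}$,
\begin{align*}
\mathrm{Refl}_{L_1} \circ \mathrm{Refl}_{L_2}(\mathbf{p} + t\mathbf{v}) &= \mathrm{Refl}_{L_1} \circ \mathrm{Refl}_{L_2}(\mathbf{q} + (t - 1)\mathbf{v}) \\
&= \mathrm{Refl}_{L_1}(\mathbf{q} + (1 - t)\mathbf{v}) \\
&= \mathrm{Refl}_{L_1}(\mathbf{p} + (2 - t)\mathbf{v}) \\
&= \mathbf{p} + (t - 2)\mathbf{v} \\
&= (\mathbf{p} + t\mathbf{v}) - 2\mathbf{v}.
\end{align*}
So
\[ \mathrm{Refl}_{L_1} \circ \mathrm{Refl}_{L_2} = \mathrm{Trans}_{-2\mathbf{v}}. \]
Similarly we obtain
\[ \mathrm{Refl}_{L_2} \circ \mathrm{Refl}_{L_1} = \mathrm{Trans}_{2\mathbf{v}}. \]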
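Rotations. Let $C$ be a point with position vector $\mathbf{c}$. Then $\mathrm{Rot}_{C,\theta} : \mathbb{R}^{2} \to \mathbb{R}^{2}$ is the rotation of the plane around $C$ through the angle $\theta$ (measured in radians and taking the anti-clockwise direction to be positive).
[Figure: the centre $C$, a point $P$ and its image $\mathrm{Rot}_{C,\theta}(P)$.]
Notice that $C$ is fixed by $\mathrm{Rot}_{C,\theta}$ but unless $\theta = 2\pi k$ for some $k \in \mathbb{Z}$, no other point is fixed. For $k \in \mathbb{Z}$,
\[ \mathrm{Rot}_{C,2\pi k} = \mathrm{Id}_{\mathbb{R}^{2}}, \qquad \mathrm{Rot}_{C,\theta + 2\pi k} = \mathrm{Rot}_{C,\theta}. \]
Example 4.10. Find a formula for the effect of $\mathrm{Rot}_{O,\theta}$ on the point $P(x, y)$.
Solution. We assume that $P \neq O$ since the origin is fixed by this rotation. Recall that if $X(1, 0)$ is the point on the $x$-axis and $\angle XOP = \varphi$, then setting $r = |OP| = \sqrt{x^{2} + y^{2}}$ we have
\[ x = r\cos\varphi, \qquad y = r\sin\varphi. \]
If $P' = \mathrm{Rot}_{O,\theta}(P)$, with position vector $(x', y')$, we have
\[ x' = r\cos(\varphi + \theta), \qquad y' = r\sin(\varphi + \theta). \]
Using the addition formulas for $\cos$ and $\sin$ recalled in the solution of Example 4.8 we obtain
\[ \mathrm{Rot}_{O,\theta}(x, y) = (\cos\theta\, x - \sin\theta\, y,\ \sin\theta\, x + \cos\theta\, y). \tag{4.13} \]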
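Glide reflections. The composition of a reflection $\mathrm{Refl}_{L}$ and a translation $\mathrm{Trans}_{\mathbf{t}}$ parallel to the line of reflection $L$ (in either possible order) is called a glide reflection. If the translation is not by $\mathbf{0}$ then such a glide reflection has no fixed points.
[Figure: the line $L$, a point $P$, its reflection $\mathrm{Refl}_{L}(P)$ and the image $\mathrm{Trans}_{\mathbf{t}} \circ \mathrm{Refl}_{L}(P)$.]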
4.4. The Euclidean group of the plane

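We now record a useful fact about isometries that we have already seen for translations.
Proposition 4.11. Let $\alpha, \beta : \mathbb{R}^{2} \to \mathbb{R}^{2}$ be two isometries. Then the two compositions $\alpha \circ \beta, \beta \circ \alpha : \mathbb{R}^{2} \to \mathbb{R}^{2}$ are isometries which are not necessarily equal.
Proof. For any two points $P, Q$ we have
\[ |\alpha \circ \beta(P)\ \alpha \circ \beta(Q)| = |\alpha(\beta(P))\,\alpha(\beta(Q))| = |\beta(P)\beta(Q)| = |PQ|, \]
\[ |\beta \circ \alpha(P)\ \beta \circ \alpha(Q)| = |\beta(\alpha(P))\,\beta(\alpha(Q))| = |\alpha(P)\alpha(Q)| = |PQ|, \]
hence $\alpha \circ \beta$ and $\beta \circ \alpha$ are isometries. The non-commutativity will be illustrated in examples.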
We also record a somewhat less obvious fact that will be proved in the next section.
Proposition 4.12. Let $\alpha : \mathbb{R}^{2} \to \mathbb{R}^{2}$ be an isometry. Then $\alpha$ is a bijection and its inverse $\alpha^{-1}$ is also an isometry.
Proof. See Corollary 4.20 below for a proof that an isometry is invertible. Assuming that $\alpha^{-1}$ exists, for each $P, Q \in \mathbb{R}^{2}$,
\[ |\alpha^{-1}(P)\,\alpha^{-1}(Q)| = |\alpha(\alpha^{-1}(P))\,\alpha(\alpha^{-1}(Q))| = |PQ|, \]
hence $\alpha^{-1}$ is also an isometry.
We summarize these observations in
Theorem 4.13. The set $\mathrm{Euc}(2)$ of all isometries of $\mathbb{R}^{2}$ forms a group under composition.
The group $(\mathrm{Euc}(2), \circ)$ is called the Euclidean group or isometry group of the plane.
Example 4.14. The translations form a subgroup $\mathrm{Trans}(2)$ of $\mathrm{Euc}(2)$.
Proof. See Proposition 4.6 and the discussion leading up to it.
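For any point $P(\mathbf{p})$, we can consider the isometries which fix $P$, i.e., map $P$ to itself. The set of all such isometries is denoted $\mathrm{Euc}(2)_{P}$ and is called the stabilizer of $P$. Notice that
\[ \mathrm{Euc}(2)_{P} = \{\alpha \in \mathrm{Euc}(2) : \alpha(P) = P\} \subseteq \mathrm{Euc}(2). \]
Lemma 4.15. For any $P$, $\mathrm{Euc}(2)_{P}$ is a subgroup of $\mathrm{Euc}(2)$.
Proof. The main points are to show that for $\alpha, \beta \in \mathrm{Euc}(2)_{P}$, $\alpha \circ \beta \in \mathrm{Euc}(2)_{P}$ and $\alpha^{-1} \in \mathrm{Euc}(2)_{P}$. For the first we have
\[ \alpha \circ \beta(P) = \alpha(\beta(P)) = \alpha(P) = P, \]
and for the second,
\[ \alpha^{-1}(P) = \alpha^{-1}(\alpha(P)) = P. \]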
The following result is extremely important and has an analogue for any $\mathbb{R}^{n}$.
Theorem 4.16. The stabilizer of the origin $O$ consists of all linear mappings representable by orthogonal matrices relative to the standard basis; these are known as linear isometries.
Proof. See Theorem 4.18.
4.5. Matrices and isometries
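Consider an isometry $\alpha : \mathbb{R}^{2} \to \mathbb{R}^{2}$ which fixes the origin $O$, i.e., $\alpha(O) = O$ or equivalently $\alpha \in \mathrm{Euc}(2)_{O}$.
Suppose that $X(1, 0)$ is sent to $X'(\cos\theta, \sin\theta)$ by $\alpha$. Then $Y(0, 1)$ must be sent to one of the two points $Y'(\cos(\theta + \pi/2), \sin(\theta + \pi/2))$ and $Y''(\cos(\theta - \pi/2), \sin(\theta - \pi/2))$ since these are the only ones at unit distance from $O$ making the angle $\pi/2$ with $OX'$.
[Figure: the origin $O$, the points $X$, $X'$ and the two candidate images of $Y$.]
If $P(x, y)$, then writing
\[ x = r\cos\varphi, \qquad y = r\sin\varphi, \]
where $r = \sqrt{x^{2} + y^{2}} = |OP|$, we find that the point $P'(x', y')$ with $P' = \alpha(P)$ has
\[ x' = r\cos\varphi', \qquad y' = r\sin\varphi', \]
for some $\varphi'$ since $|OP'| = |OP| = r$.
If $\alpha(Y) = Y'$, then we must have $\varphi' = \varphi + \theta$, while if $\alpha(Y) = Y''$, we must have $\varphi' = \theta - \varphi$. This means that
\[ (x', y') = \begin{cases} r(\cos(\theta + \varphi), \sin(\theta + \varphi)) & \text{if } \alpha(Y) = Y', \\ r(\cos(\theta - \varphi), \sin(\theta - \varphi)) & \text{if } \alpha(Y) = Y'', \end{cases} \]
\[ \phantom{(x', y')} = \begin{cases} (\cos\theta\, x - \sin\theta\, y,\ \sin\theta\, x + \cos\theta\, y) & \text{if } \alpha(Y) = Y', \\ (\cos\theta\, x + \sin\theta\, y,\ \sin\theta\, x - \cos\theta\, y) & \text{if } \alpha(Y) = Y''. \end{cases} \]
The first case corresponds to a rotation about the origin $O$ through angle $\theta$, while the second corresponds to a reflection in the line
\[ \sin(\theta/2)\,x - \cos(\theta/2)\,y = 0 \]
through the origin. Notice that in either case, $\alpha$ is a linear transformation or linear mapping in that
\[ \alpha((x_1, y_1) + (x_2, y_2)) = \alpha(x_1, y_1) + \alpha(x_2, y_2), \tag{4.14a} \]
\[ \alpha(t(x, y)) = \alpha(tx, ty) = t\,\alpha(x, y). \tag{4.14b} \]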
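From now on, we will identify $(x, y)$ with the column vector $\begin{pmatrix} x \\ y \end{pmatrix}$. This allows us to represent $\alpha$ by a matrix. Notice that
\[ \alpha\begin{pmatrix} x \\ y \end{pmatrix} = \begin{cases} \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} & \text{if } \alpha(Y) = Y', \\[2ex] \begin{pmatrix} \cos\theta & \sin\theta \\ \sin\theta & -\cos\theta \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} & \text{if } \alpha(Y) = Y''. \end{cases} \]
So in each case we have $\alpha(\mathbf{x}) = A\mathbf{x}$ for a suitable matrix $A$. These matrices satisfy
\[ \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}^{T} \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} = \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix} \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} = I_2, \]
\[ \begin{pmatrix} \cos\theta & \sin\theta \\ \sin\theta & -\cos\theta \end{pmatrix}^{T} \begin{pmatrix} \cos\theta & \sin\theta \\ \sin\theta & -\cos\theta \end{pmatrix} = \begin{pmatrix} \cos\theta & \sin\theta \\ \sin\theta & -\cos\theta \end{pmatrix} \begin{pmatrix} \cos\theta & \sin\theta \\ \sin\theta & -\cos\theta \end{pmatrix} = I_2, \]
so they are both orthogonal matrices.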
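Definition 4.17. An $n \times n$ matrix $A$ is orthogonal if $A^{T}A = I_n$, or equivalently if $A$ is invertible with inverse $A^{-1} = A^{T}$.
It is easy to see that every $n \times n$ orthogonal matrix $A$ has $\det A = \pm 1$. For the above matrices we have
\[ \det\begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} = 1, \qquad \det\begin{pmatrix} \cos\theta & \sin\theta \\ \sin\theta & -\cos\theta \end{pmatrix} = -1. \tag{4.15} \]
It is also true that every $2 \times 2$ orthogonal matrix is of one or other of these two forms.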
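For a general isometry $\alpha : \mathbb{R}^{2} \to \mathbb{R}^{2}$, on setting $\mathbf{t} = \alpha(\mathbf{0})$ we can form the isometry $\alpha_0 = \mathrm{Trans}_{-\mathbf{t}} \circ \alpha : \mathbb{R}^{2} \to \mathbb{R}^{2}$ which fixes the origin and satisfies
\[ \alpha = \mathrm{Trans}_{\mathbf{t}} \circ \alpha_0. \]
Combining all of these ingredients we obtain
Theorem 4.18. Every isometry $\alpha : \mathbb{R}^{2} \to \mathbb{R}^{2}$ can be uniquely expressed as a composition
\[ \alpha = \mathrm{Trans}_{\mathbf{t}} \circ \alpha_0, \]
where $\alpha_0 : \mathbb{R}^{2} \to \mathbb{R}^{2}$ is an isometry that fixes $O$, hence there is an orthogonal matrix $[\alpha_0]$ for which
\[ \alpha(\mathbf{x}) = [\alpha_0]\mathbf{x} + \mathbf{t} \]
for all $\mathbf{x} \in \mathbb{R}^{2}$.
Proof. For uniqueness, observe that if the isometries $\alpha_1, \alpha_2$ given by
\[ \alpha_1(\mathbf{x}) = A\mathbf{x} + \mathbf{s}, \qquad \alpha_2(\mathbf{x}) = B\mathbf{x} + \mathbf{t} \]
agree on every point, then
\[ A\mathbf{x} + \mathbf{s} = B\mathbf{x} + \mathbf{t}. \]
In particular, taking $\mathbf{x} = \mathbf{0}$ we obtain $\mathbf{s} = \mathbf{t}$. In general this gives
\[ A\mathbf{x} = B\mathbf{x}. \]
Now choosing $\mathbf{x} = \mathbf{e}_1, \mathbf{e}_2$, the standard basis vectors, we obtain $A = B$ since $A\mathbf{e}_i$, $B\mathbf{e}_i$ are the $i$-th columns of $A$, $B$.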
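Definition 4.19. If the isometry $\alpha : \mathbb{R}^{2} \to \mathbb{R}^{2}$ can be expressed in the form
\[ \alpha(\mathbf{x}) = A\mathbf{x} + \mathbf{t}, \]
for some orthogonal matrix $A$, its Seitz symbol is $(A \mid \mathbf{t})$.
We will use Seitz symbols freely from now on and write
\[ (A \mid \mathbf{t})\mathbf{x} = A\mathbf{x} + \mathbf{t} = \alpha(\mathbf{x}). \]
For the composition we will write
\[ (A_1 \mid \mathbf{t}_1)(A_2 \mid \mathbf{t}_2) = (A_1 \mid \mathbf{t}_1) \circ (A_2 \mid \mathbf{t}_2). \]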
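Now we can deduce a long awaited result.
Corollary 4.20. Every isometry $\alpha : \mathbb{R}^{2} \to \mathbb{R}^{2}$ has an inverse, hence it is a bijection.
Proof. Express $\alpha$ in matrix form,
\[ \alpha(\mathbf{x}) = [\alpha_0]\mathbf{x} + \mathbf{t}, \]
where $[\alpha_0]$ is orthogonal and so has an inverse given by $[\alpha_0]^{-1} = [\alpha_0]^{T}$. Then the function $\beta : \mathbb{R}^{2} \to \mathbb{R}^{2}$ given by
\[ \beta(\mathbf{x}) = [\alpha_0]^{-1}(\mathbf{x} - \mathbf{t}) = [\alpha_0]^{-1}\mathbf{x} - [\alpha_0]^{-1}\mathbf{t} \tag{4.16} \]
satisfies
\[ \alpha \circ \beta = \mathrm{Id}_{\mathbb{R}^{2}} = \beta \circ \alpha, \]
and so is the inverse of $\alpha$. Therefore it is also an isometry (see the proof of Proposition 4.12).
What happens when we compose two Seitz symbols or find the symbol of the inverse function?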
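Proposition 4.21. We have the following algebraic rules for Seitz symbols of isometries.
\[ (A_1 \mid \mathbf{t}_1)(A_2 \mid \mathbf{t}_2) = (A_1 A_2 \mid \mathbf{t}_1 + A_1\mathbf{t}_2), \]
\[ (A \mid \mathbf{t})^{-1} = (A^{-1} \mid -A^{-1}\mathbf{t}) = (A^{T} \mid -A^{T}\mathbf{t}). \]
Proof. The formula for the inverse was given in (4.16). For any $\mathbf{x} \in \mathbb{R}^{2}$,
\begin{align*}
(A_1 \mid \mathbf{t}_1)(A_2 \mid \mathbf{t}_2)\mathbf{x} &= (A_1 \mid \mathbf{t}_1)(A_2\mathbf{x} + \mathbf{t}_2) \\
&= A_1(A_2\mathbf{x} + \mathbf{t}_2) + \mathbf{t}_1 \\
&= A_1 A_2\mathbf{x} + A_1\mathbf{t}_2 + \mathbf{t}_1 \\
&= (A_1 A_2 \mid \mathbf{t}_1 + A_1\mathbf{t}_2)\mathbf{x}.
\end{align*}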
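The two rules of Proposition 4.21 translate directly into code. The sketch below uses a representation of our own choosing: a Seitz symbol is a pair `(A, t)` with `A` a $2 \times 2$ matrix stored as a tuple of rows and `t` a pair. All function names are hypothetical. Integer entries are used so that the checks are exact.

```python
def mat_vec(A, x):
    """Product of a 2x2 matrix (tuple of rows) with a vector."""
    return (A[0][0] * x[0] + A[0][1] * x[1], A[1][0] * x[0] + A[1][1] * x[1])


def mat_mul(A, B):
    """Product of two 2x2 matrices."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def transpose(A):
    return ((A[0][0], A[1][0]), (A[0][1], A[1][1]))


def apply(seitz, x):
    """(A | t) x = A x + t."""
    A, t = seitz
    Ax = mat_vec(A, x)
    return (Ax[0] + t[0], Ax[1] + t[1])


def compose(s1, s2):
    """(A1 | t1)(A2 | t2) = (A1 A2 | t1 + A1 t2)."""
    (A1, t1), (A2, t2) = s1, s2
    A1t2 = mat_vec(A1, t2)
    return (mat_mul(A1, A2), (t1[0] + A1t2[0], t1[1] + A1t2[1]))


def inverse(seitz):
    """(A | t)^{-1} = (A^T | -A^T t), valid when A is orthogonal."""
    A, t = seitz
    At = transpose(A)
    Att = mat_vec(At, t)
    return (At, (-Att[0], -Att[1]))


quarter_turn = (((0, -1), (1, 0)), (1, 0))   # rotate by pi/2, then translate by (1, 0)
reflection = (((1, 0), (0, -1)), (0, 2))     # reflect in the x-axis, then translate by (0, 2)
x = (3, 5)
print(apply(compose(quarter_turn, reflection), x))
print(apply(quarter_turn, apply(reflection, x)))  # same point, computed step by step
```

Composing a symbol with its `inverse` gives the identity symbol $(I \mid \mathbf{0})$, which is a convenient sanity check on the sign of the translation part.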
We can now classify isometries of the plane in terms of their Seitz symbols. We will denote the $2 \times 2$ identity matrix by $I = I_2$.
Translations. These have the form $(I \mid \mathbf{t})$. To compose two translations, we have
\[ (I \mid \mathbf{t}_1)(I \mid \mathbf{t}_2) = (I \mid \mathbf{t}_1 + \mathbf{t}_2). \]
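Rotations. Consider a Seitz symbol $(A \mid \mathbf{t})$ where $A$ is orthogonal with $\det A = 1$, hence it has the form
\[ A = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}. \]
The equation $A\mathbf{x} + \mathbf{t} = \mathbf{x}$ is solvable if and only if $(I - A)\mathbf{x} = \mathbf{t}$ can be solved. Now
\begin{align*}
\det(I - A) = \det\begin{pmatrix} 1 - \cos\theta & \sin\theta \\ -\sin\theta & 1 - \cos\theta \end{pmatrix} &= (1 - \cos\theta)^{2} + \sin^{2}\theta \\
&= 1 - 2\cos\theta + \cos^{2}\theta + \sin^{2}\theta \\
&= 2 - 2\cos\theta = 2(1 - \cos\theta),
\end{align*}
so provided that $\cos\theta \neq 1$, $(I - A)$ is invertible. But $\cos\theta = 1$ if and only if $A = I$, so $(I - A)$ is invertible if and only if $A \neq I$.
So as long as $A \neq I$, we can find a vector $\mathbf{c} = (I - A)^{-1}\mathbf{t}$ for which $(A \mid \mathbf{t})\mathbf{c} = \mathbf{c}$. Then $(A \mid \mathbf{t})$ represents rotation about $\mathbf{c}$ through the angle $\theta$. Notice that once we know $A$ and $\mathbf{c}$ we can recover $\mathbf{t}$ using the formula $\mathbf{t} = (I - A)\mathbf{c}$.
If $A = I$, $(I \mid \mathbf{0})$ is a rotation through angle 0, while if $\mathbf{t} \neq \mathbf{0}$, $(I \mid \mathbf{t})$ is not a rotation.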
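To make the recipe for the centre concrete, here is a minimal sketch (the function name `rotation_centre` is ours) that solves $(I - A)\mathbf{c} = \mathbf{t}$ using the determinant $2(1 - \cos\theta)$ and the standard formula for the inverse of a $2 \times 2$ matrix. It assumes the rotation is given by its angle and translation part.

```python
import math


def rotation_centre(theta, t):
    """Fixed point c of x -> A x + t, where A is rotation through theta.

    Solves (I - A) c = t; raises ValueError when A = I (pure translation).
    """
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    det = 2 * (1 - cos_t)  # det(I - A)
    if math.isclose(det, 0.0, abs_tol=1e-12):
        raise ValueError("A = I: the isometry is a translation, not a rotation")
    # I - A = [[1 - cos, sin], [-sin, 1 - cos]], so its inverse is
    # (1/det) * [[1 - cos, -sin], [sin, 1 - cos]].
    c1 = ((1 - cos_t) * t[0] - sin_t * t[1]) / det
    c2 = (sin_t * t[0] + (1 - cos_t) * t[1]) / det
    return (c1, c2)


# A quarter turn about the point (1, 1) has t = (I - A)(1, 1) = (2, 0).
centre = rotation_centre(math.pi / 2, (2.0, 0.0))
print(centre)  # approximately (1, 1)
```

The example starts from a known centre, forms $\mathbf{t} = (I - A)\mathbf{c}$ by hand, and recovers the centre, mirroring the remark that $\mathbf{t}$ and $\mathbf{c}$ determine each other once $A \neq I$ is fixed.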
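Remark 4.22. When working with rotations it is useful to recall the following formula for finding the inverse of a $2 \times 2$ matrix, which is valid provided $ad - bc \neq 0$:
\[ \begin{pmatrix} a & b \\ c & d \end{pmatrix}^{-1} = \frac{1}{ad - bc}\begin{pmatrix} d & -b \\ -c & a \end{pmatrix} = \begin{pmatrix} d/(ad - bc) & -b/(ad - bc) \\ -c/(ad - bc) & a/(ad - bc) \end{pmatrix}. \tag{4.17} \]
In particular, provided $\cos\theta \neq 1$,
\[ \begin{pmatrix} 1 - \cos\theta & \sin\theta \\ -\sin\theta & 1 - \cos\theta \end{pmatrix}^{-1} = \frac{1}{2(1 - \cos\theta)}\begin{pmatrix} 1 - \cos\theta & -\sin\theta \\ \sin\theta & 1 - \cos\theta \end{pmatrix} = \begin{pmatrix} \dfrac{1}{2} & \dfrac{-\sin\theta}{2(1 - \cos\theta)} \\[2ex] \dfrac{\sin\theta}{2(1 - \cos\theta)} & \dfrac{1}{2} \end{pmatrix}. \tag{4.18a} \]
Using standard trigonometric identities we also have
\[ \begin{pmatrix} 1 - \cos\theta & \sin\theta \\ -\sin\theta & 1 - \cos\theta \end{pmatrix}^{-1} = \begin{pmatrix} \dfrac{1}{2} & \dfrac{-\cos(\theta/2)}{2\sin(\theta/2)} \\[2ex] \dfrac{\cos(\theta/2)}{2\sin(\theta/2)} & \dfrac{1}{2} \end{pmatrix} = \frac{1}{2}\begin{pmatrix} 1 & -\cot(\theta/2) \\ \cot(\theta/2) & 1 \end{pmatrix}. \tag{4.18b} \]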
Glide reflections. Consider a Seitz symbol $(A \mid \mathbf{t})$ where $A$ is orthogonal with $\det A = -1$, hence it has the form
\[ A = \begin{pmatrix} \cos\theta & \sin\theta \\ \sin\theta & -\cos\theta \end{pmatrix}. \]
Recall that this matrix represents $\mathrm{Refl}_{L_{\theta/2}}$, reflection in the line through the origin
\[ L_{\theta/2} = \{(x, y) \in \mathbb{R}^{2} : \sin(\theta/2)\,x - \cos(\theta/2)\,y = 0\}. \]
We will see that $(A \mid \mathbf{t})$ represents a glide reflection, i.e., the composition of a reflection in a line parallel to $L_{\theta/2}$ and a translation by a vector parallel to $L_{\theta/2}$.
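Express $\mathbf{t}$ in the form $\mathbf{t} = \mathbf{u} + 2\mathbf{v}$, where $\mathbf{v}$ is perpendicular to the line $L_{\theta/2}$ and $\mathbf{u}$ is parallel to it. To do this we may take the unit vectors
\[ \mathbf{w}' = (\cos(\theta/2), \sin(\theta/2)), \qquad \mathbf{w}'' = (\sin(\theta/2), -\cos(\theta/2)) \]
which are parallel and perpendicular respectively to $L_{\theta/2}$ and find the projections of $\mathbf{t}$ onto these unit vectors; then we have
\[ \mathbf{u} = \mathbf{t}_{\mathbf{w}'}, \qquad \mathbf{v} = \frac{1}{2}\,\mathbf{t}_{\mathbf{w}''}. \]
From the proof of Proposition 4.9 we know that if $L$ is the line parallel to $L_{\theta/2}$ containing $\mathbf{v}$, then
\[ \mathrm{Refl}_{L} = \mathrm{Trans}_{2\mathbf{v}} \circ \mathrm{Refl}_{L_{\theta/2}}, \]
and so
\begin{align*}
\mathrm{Trans}_{\mathbf{u}} \circ \mathrm{Refl}_{L} &= \mathrm{Trans}_{\mathbf{u}} \circ \mathrm{Trans}_{2\mathbf{v}} \circ \mathrm{Refl}_{L_{\theta/2}} \\
&= \mathrm{Trans}_{\mathbf{u} + 2\mathbf{v}} \circ \mathrm{Refl}_{L_{\theta/2}} \\
&= \mathrm{Trans}_{\mathbf{t}} \circ \mathrm{Refl}_{L_{\theta/2}} = (A \mid \mathbf{t}).
\end{align*}
This shows that $(A \mid \mathbf{t})$ represents reflection in $L$ followed by translation by $\mathbf{u}$ parallel to $L$; if we allow $\mathbf{u} = \mathbf{0}$ here, then a reflection can be interpreted as a special kind of glide reflection.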
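Remark 4.23. Here is another way to find the vectors $\mathbf{u}$ and $\mathbf{v}$ in the above situation. Notice that since $\mathbf{u}$ is parallel to $L_{\theta/2}$ and $\mathbf{v}$ is perpendicular to it,
\[ (A \mid \mathbf{0})\mathbf{t} = A(\mathbf{u} + 2\mathbf{v}) = A\mathbf{u} + 2A\mathbf{v} = \mathbf{u} - 2\mathbf{v}. \]
Hence we have
\[ \mathbf{u} = \frac{1}{2}(\mathbf{t} + A\mathbf{t}), \qquad \mathbf{v} = \frac{1}{4}(\mathbf{t} - A\mathbf{t}). \]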
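The formulas of Remark 4.23 are easy to evaluate numerically. The sketch below is illustrative and uses a function name of our own, `glide_decomposition`. Given the angle $\theta$ of the reflection matrix and the translation part $\mathbf{t}$, it returns the glide vector $\mathbf{u}$ and the offset $\mathbf{v}$ of the mirror line.

```python
import math


def glide_decomposition(theta, t):
    """Return (u, v) with t = u + 2v, u parallel and v perpendicular to L_{theta/2}.

    Uses u = (t + A t)/2 and v = (t - A t)/4, where
    A = [[cos theta, sin theta], [sin theta, -cos theta]].
    """
    c, s = math.cos(theta), math.sin(theta)
    At = (c * t[0] + s * t[1], s * t[0] - c * t[1])
    u = ((t[0] + At[0]) / 2, (t[1] + At[1]) / 2)
    v = ((t[0] - At[0]) / 4, (t[1] - At[1]) / 4)
    return u, v


# theta = 0: A is reflection in the x-axis; take t = (3, 4).
u, v = glide_decomposition(0.0, (3.0, 4.0))
print(u, v)
```

For this input the isometry $(x, y) \mapsto (x + 3, 4 - y)$ is a reflection in the line through the point with position vector `v`, parallel to the $x$-axis, followed by a translation by `u` along that line.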
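Summary of basic facts on Seitz symbols
Translations: $\mathrm{Trans}_{\mathbf{t}} = (I \mid \mathbf{t})$.
Rotations: $\mathrm{Rot}_{C,\theta} = (A \mid \mathbf{t})$, where
\[ A = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} \neq I, \qquad \mathbf{t} = (I - A)\mathbf{c}, \qquad \mathbf{c} = (I - A)^{-1}\mathbf{t}. \]
Glide reflections and reflections: $\mathrm{Trans}_{\mathbf{u}} \circ \mathrm{Trans}_{2\mathbf{v}} \circ \mathrm{Refl}_{L_{\theta/2}} = (A \mid \mathbf{t})$, where
\[ A = \begin{pmatrix} \cos\theta & \sin\theta \\ \sin\theta & -\cos\theta \end{pmatrix}, \]
$\mathbf{v}$ is perpendicular to the line
\[ L_{\theta/2} = \{(x, y) \in \mathbb{R}^{2} : \sin(\theta/2)\,x - \cos(\theta/2)\,y = 0\}, \]
and $\mathbf{u}$ is parallel to it. This represents a glide reflection in the line parallel to $L_{\theta/2}$ and containing the point with position vector $\mathbf{v}$; the translation is by $\mathbf{u}$. When $\mathbf{u} = \mathbf{0}$, this is a reflection.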
Example 4.24. If $(A_1 \mid \mathbf{t}_1)$ and $(A_2 \mid \mathbf{t}_2)$ are glide reflections, show that their composition $(A_1 \mid \mathbf{t}_1)(A_2 \mid \mathbf{t}_2)$ is a rotation or a translation.
Solution. We have
\[ \det A_1 = -1 = \det A_2, \]
\[ \det(A_1 A_2) = \det A_1 \det A_2 = 1, \]
\[ (A_1 \mid \mathbf{t}_1)(A_2 \mid \mathbf{t}_2) = (A_1 A_2 \mid \mathbf{t}_1 + A_1\mathbf{t}_2). \]
When $A_1 A_2 = I$, the composition $(A_1 \mid \mathbf{t}_1)(A_2 \mid \mathbf{t}_2)$ is a translation (or a trivial rotation if $\mathbf{t}_1 + A_1\mathbf{t}_2 = \mathbf{0}$). When $A_1 A_2 \neq I$, $(A_1 \mid \mathbf{t}_1)(A_2 \mid \mathbf{t}_2)$ is a rotation.
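Here are some useful results on inverses, obtained using Proposition 4.21.
Proposition 4.25. Suppose that
\[ A = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}, \qquad B = \begin{pmatrix} \cos\theta & \sin\theta \\ \sin\theta & -\cos\theta \end{pmatrix}. \]
The Seitz symbols of the inverses of the rotation $(A \mid \mathbf{s})$ and the (glide) reflection $(B \mid \mathbf{t})$ are
\[ (A \mid \mathbf{s})^{-1} = (A^{T} \mid -A^{T}\mathbf{s}), \qquad (B \mid \mathbf{t})^{-1} = (B \mid -B\mathbf{t}), \]
where
\[ A^{T} = \begin{pmatrix} \cos(-\theta) & -\sin(-\theta) \\ \sin(-\theta) & \cos(-\theta) \end{pmatrix} = \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix}. \]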
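Example 4.26. Let
\[ \mathrm{O}(2) = \{(A \mid \mathbf{0}) \in \mathrm{Euc}(2) : A \text{ is orthogonal}\}, \]
\[ \mathrm{SO}(2) = \{(A \mid \mathbf{0}) \in \mathrm{Euc}(2) : A \text{ is orthogonal and } \det A = 1\}. \]
Then $\mathrm{O}(2)$ and $\mathrm{SO}(2)$ are subgroups of $\mathrm{Euc}(2)$, and $\mathrm{SO}(2)$ is a subgroup of $\mathrm{O}(2)$.
Proof. For $(A \mid \mathbf{0}), (B \mid \mathbf{0}) \in \mathrm{O}(2)$ we have
\[ (A \mid \mathbf{0})(B \mid \mathbf{0}) = (AB \mid \mathbf{0}) \]
and
\[ (AB)^{T}(AB) = (B^{T}A^{T})(AB) = B^{T}(A^{T}A)B = B^{T}I_2 B = B^{T}B = I_2. \]
So $(A \mid \mathbf{0})(B \mid \mathbf{0}) \in \mathrm{O}(2)$. Also, $(I_2 \mid \mathbf{0}) \in \mathrm{O}(2)$ and
\[ (A \mid \mathbf{0})^{-1} = (A^{-1} \mid \mathbf{0}) \in \mathrm{O}(2) \]
since $A^{-1} = A^{T}$ and
\[ (A^{T})^{T}(A^{T}) = AA^{T} = AA^{-1} = I_2, \]
hence $A^{-1}$ is orthogonal.
If $(A \mid \mathbf{0}), (B \mid \mathbf{0}) \in \mathrm{SO}(2)$, then $(A \mid \mathbf{0})(B \mid \mathbf{0}) = (AB \mid \mathbf{0})$ and
\[ \det(AB) = \det A \det B = 1, \]
so $(A \mid \mathbf{0})(B \mid \mathbf{0}) \in \mathrm{SO}(2)$. Checking of the remaining points is left as an exercise.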
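$\mathrm{O}(2)$ is the orthogonal subgroup of $\mathrm{Euc}(2)$ and it consists of all the isometries of $\mathbb{R}^{2}$ which fix the origin. $\mathrm{SO}(2)$ is called the special orthogonal subgroup of $\mathrm{Euc}(2)$ and consists of all rotations about the origin.
Elements of $\mathrm{Euc}(2)$ of the form $(A \mid \mathbf{t})$ with $A \in \mathrm{SO}(2)$ are called direct isometries, while those with $A \notin \mathrm{SO}(2)$ are called indirect isometries. We denote the subset of direct isometries by $\mathrm{Euc}^{+}(2)$ and the subset of indirect isometries by $\mathrm{Euc}^{-}(2)$.
Proposition 4.27. The direct isometries form a subgroup of $\mathrm{Euc}(2)$.
Proof. If $(A_1 \mid \mathbf{t}_1), (A_2 \mid \mathbf{t}_2) \in \mathrm{Euc}^{+}(2)$, then
\[ (A_1 \mid \mathbf{t}_1)(A_2 \mid \mathbf{t}_2) = (A_1 A_2 \mid \mathbf{t}_1 + A_1\mathbf{t}_2) \]
with $A_1 A_2 \in \mathrm{SO}(2)$, so this product is in $\mathrm{Euc}^{+}(2)$. The identity $(I \mid \mathbf{0})$ is direct, and inverses are direct by Proposition 4.21 since $\det A^{T} = \det A$.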
4.6. Symmetry groups of plane gures
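If $S \subseteq \mathbb{R}^{2}$ is a non-empty subset, we can consider the subset
\[ \mathrm{Euc}(2)_{S} = \{\alpha \in \mathrm{Euc}(2) : \alpha S = S\} \subseteq \mathrm{Euc}(2). \]
Proposition 4.28. $\mathrm{Euc}(2)_{S}$ is a subgroup of $\mathrm{Euc}(2)$.
Proof. By definition, for $\alpha \in \mathrm{Euc}(2)$,
\[ \alpha S = \{\alpha(s) : s \in S\}. \]
So $\alpha S = S$ if and only if
• for every $s \in S$, $\alpha(s) \in S$;
• every $s \in S$ has the form $s = \alpha(s')$ for some $s' \in S$.
Since an isometry is injective, this really says that each $\alpha \in \mathrm{Euc}(2)_{S}$ acts by permuting the elements of $S$ and preserving distances between them.
If $\alpha, \beta \in \mathrm{Euc}(2)_{S}$ then for $s \in S$,
\[ \alpha \circ \beta(s) = \alpha(\beta(s)) \in \alpha S = S. \]
Also, there is an $s' \in S$ such that $s = \alpha(s')$ and similarly an $s'' \in S$ such that $s' = \beta(s'')$; hence
\[ s = \alpha(s') = \alpha(\beta(s'')) = \alpha \circ \beta(s''). \]
It is easy to see that $\mathrm{Id}_{\mathbb{R}^{2}} \in \mathrm{Euc}(2)_{S}$. Finally, if $\alpha \in \mathrm{Euc}(2)_{S}$ then $\alpha^{-1} \in \mathrm{Euc}(2)_{S}$ since
\[ \alpha^{-1}S = \alpha^{-1}(\alpha S) = (\alpha^{-1} \circ \alpha)S = S. \]
$\mathrm{Euc}(2)_{S}$ is called the symmetry subgroup of $S$ and is often referred to as the symmetry group of $S$ as a subset of $\mathbb{R}^{2}$. The following is easy to prove.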
Lemma 4.29. Trans(2)
S
S
.
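Example 4.30. Let $\triangle \subseteq \mathbb{R}^{2}$ be an equilateral triangle with vertices $A, B, C$.
[Figure: equilateral triangle with vertices $A$, $B$, $C$ and centre $O$.]
A symmetry of $\triangle$ is defined once we know where the vertices go, hence there are as many symmetries as permutations of the set $\{A, B, C\}$. Each symmetry can be described using permutation notation and we obtain the six distinct symmetries
\[ \begin{pmatrix} A & B & C \\ A & B & C \end{pmatrix} = \iota, \quad \begin{pmatrix} A & B & C \\ B & C & A \end{pmatrix} = (A, B, C), \quad \begin{pmatrix} A & B & C \\ C & A & B \end{pmatrix} = (A, C, B), \]
\[ \begin{pmatrix} A & B & C \\ A & C & B \end{pmatrix} = (B, C), \quad \begin{pmatrix} A & B & C \\ C & B & A \end{pmatrix} = (A, C), \quad \begin{pmatrix} A & B & C \\ B & A & C \end{pmatrix} = (A, B). \]
Therefore we have $|\mathrm{Euc}(2)_{\triangle}| = 6$. Notice that the identity and the two 3-cycles represent rotations about $O$, while each of the three transpositions represents a reflection in a line through $O$ and a vertex.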
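Example 4.31. Let $\square \subseteq \mathbb{R}^{2}$ be the square centred at the origin $O$ whose vertices $A, B, C, D$ are the points $(\pm 1, \pm 1)$, labelled in order around the square.
[Figure: square with vertices $A$, $B$, $C$, $D$ and centre $O$.]
Then a symmetry is defined by sending $A$ to any one of the 4 vertices then choosing how to send $B$ to one of the 2 adjacent vertices. This gives a total of $4 \times 2 = 8$ such symmetries, therefore $|\mathrm{Euc}(2)_{\square}| = 8$.
Again we can describe symmetries in terms of their effect on the vertices. Here are the eight elements of $\mathrm{Euc}(2)_{\square}$ described in permutation notation.
\[ \begin{pmatrix} A & B & C & D \\ A & B & C & D \end{pmatrix} = \iota, \qquad \begin{pmatrix} A & B & C & D \\ B & C & D & A \end{pmatrix} = (A, B, C, D), \]
\[ \begin{pmatrix} A & B & C & D \\ C & D & A & B \end{pmatrix} = (A, C)(B, D), \qquad \begin{pmatrix} A & B & C & D \\ D & A & B & C \end{pmatrix} = (A, D, C, B), \]
\[ \begin{pmatrix} A & B & C & D \\ A & D & C & B \end{pmatrix} = (B, D), \qquad \begin{pmatrix} A & B & C & D \\ D & C & B & A \end{pmatrix} = (A, D)(B, C), \]
\[ \begin{pmatrix} A & B & C & D \\ C & B & A & D \end{pmatrix} = (A, C), \qquad \begin{pmatrix} A & B & C & D \\ B & A & D & C \end{pmatrix} = (A, B)(C, D). \]
Each of the two 4-cycles represents a rotation through a quarter turn about $O$, and $(A, C)(B, D)$ represents a half turn. The transpositions $(B, D)$ and $(A, C)$ represent reflections in the diagonals while $(A, D)(B, C)$ and $(A, B)(C, D)$ represent reflections in the lines joining opposite midpoints of edges.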
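Example 4.32. Let $R \subseteq \mathbb{R}^{2}$ be the rectangle centred at the origin $O$ with vertices at $A(2, 1)$, $B(-2, 1)$, $C(-2, -1)$, $D(2, -1)$.
[Figure: rectangle with vertices $B$, $A$ along the top edge, $C$, $D$ along the bottom edge, and centre $O$.]
A symmetry can send $A$ to any of the vertices, and then the long edge $AB$ must go to the longer of the adjacent edges. This gives a total of 4 such symmetries, thus $|\mathrm{Euc}(2)_{R}| = 4$.
Again we can describe symmetries in terms of their effect on the vertices. Here are the four elements of $\mathrm{Euc}(2)_{R}$ described using permutation notation.
\[ \begin{pmatrix} A & B & C & D \\ A & B & C & D \end{pmatrix} = \iota, \qquad \begin{pmatrix} A & B & C & D \\ B & A & D & C \end{pmatrix} = (A, B)(C, D), \]
\[ \begin{pmatrix} A & B & C & D \\ C & D & A & B \end{pmatrix} = (A, C)(B, D), \qquad \begin{pmatrix} A & B & C & D \\ D & C & B & A \end{pmatrix} = (A, D)(B, C). \]
$(A, C)(B, D)$ represents a half turn about $O$ while $(A, B)(C, D)$ and $(A, D)(B, C)$ represent reflections in lines joining opposite midpoints of edges.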
Example 4.33. Given a regular $n$-gon (i.e., a regular polygon with $n$ sides all of the same length and $n$ vertices $V_1, V_2, \ldots, V_n$), the symmetry group is a dihedral group of order $2n$, with elements
\[ \iota,\ \alpha,\ \alpha^{2},\ \ldots,\ \alpha^{n-1},\ \beta,\ \alpha\beta,\ \alpha^{2}\beta,\ \ldots,\ \alpha^{n-1}\beta, \]
where $\alpha^{k}$ is an anticlockwise rotation through $2\pi k/n$ about the centre and $\beta$ is a reflection in the line through $V_1$ and the centre. In fact each of the elements $\alpha^{k}\beta$ is a reflection in a line through the centre. Moreover we have
\[ |\alpha| = n, \qquad |\beta| = 2, \qquad \beta\alpha\beta = \alpha^{n-1} = \alpha^{-1}. \]
In permutation notation this gives the $n$-cycle
\[ \alpha = (V_1, V_2, \ldots, V_n), \]
but $\beta$ is more complicated to describe since it depends on whether $n$ is even or odd.
For example, if $n = 6$ we have
\[ \alpha = (V_1, V_2, V_3, V_4, V_5, V_6), \qquad \beta = (V_2, V_6)(V_3, V_5), \]
while if $n = 7$
\[ \alpha = (V_1, V_2, V_3, V_4, V_5, V_6, V_7), \qquad \beta = (V_2, V_7)(V_3, V_6)(V_4, V_5). \]
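We have seen that when $n = 3$, $\mathrm{Euc}(2)_{\triangle}$ is the permutation group of the vertices and so $D_6$ is essentially the same group as $S_3$.
If we take the regular $n$-gon centred at the origin with the first vertex $V_1$ at $(1, 0)$, the generators $\alpha$ and $\beta$ can be represented as $(A \mid \mathbf{0})$ and $(B \mid \mathbf{0})$ using the matrices
\[ A = \begin{pmatrix} \cos 2\pi/n & -\sin 2\pi/n \\ \sin 2\pi/n & \cos 2\pi/n \end{pmatrix}, \qquad B = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \]
In this case the symmetry group is the dihedral group of order $2n$,
\[ D_{2n} = \{\iota, \alpha, \alpha^{2}, \ldots, \alpha^{n-1}, \beta, \alpha\beta, \alpha^{2}\beta, \ldots, \alpha^{n-1}\beta\}. \]
Notice that the subgroup of direct symmetries is
\[ D_{2n}^{+} = \{\iota, \alpha, \alpha^{2}, \ldots, \alpha^{n-1}\}. \]
Notice that $D_{2n}$ is a subgroup of $\mathrm{O}(2)$ and $D_{2n}^{+}$ is a subgroup of $\mathrm{SO}(2)$.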
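The structure of $D_{2n}$ can be explored computationally by working with the permutations of the vertices rather than with matrices, which avoids floating-point comparisons. In the sketch below (all names are ours) the vertex $V_{i+1}$ is represented by the index $i \in \{0, \ldots, n-1\}$, the rotation is $i \mapsto i + 1 \bmod n$ and the reflection fixing $V_1$ is $i \mapsto -i \bmod n$. The group is generated by closing up under right multiplication by the two generators.

```python
def compose(f, g):
    """Composite permutation f∘g (apply g first); permutations are tuples."""
    return tuple(f[g[i]] for i in range(len(g)))


def dihedral_group(n):
    """Return (alpha, beta, elements) for the symmetries of a regular n-gon.

    Vertex V_{i+1} is represented by index i; alpha is the rotation
    i -> i + 1 (mod n) and beta the reflection i -> -i (mod n) fixing V_1.
    """
    alpha = tuple((i + 1) % n for i in range(n))
    beta = tuple((-i) % n for i in range(n))
    identity = tuple(range(n))
    elements = {identity}
    frontier = [identity]
    while frontier:  # close up under multiplication by the generators
        g = frontier.pop()
        for h in (alpha, beta):
            gh = compose(g, h)
            if gh not in elements:
                elements.add(gh)
                frontier.append(gh)
    return alpha, beta, elements


alpha, beta, elements = dihedral_group(6)
print(len(elements))                         # order of the group for n = 6
print(compose(beta, compose(alpha, beta)))   # beta alpha beta, as a permutation
```

For $n = 6$ the reflection `beta` swaps the indices $1 \leftrightarrow 5$ and $2 \leftrightarrow 4$, which is the permutation $(V_2, V_6)(V_3, V_5)$ given in Example 4.33.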
More generally we have the following Theorem. A convex region is a subset $S \subseteq \mathbb{R}^{2}$ in which for each pair of points $\mathbf{x}, \mathbf{y} \in S$, the line segment joining them lies in $S$, i.e.,
\[ \{t\mathbf{x} + (1 - t)\mathbf{y} : 0 \le t \le 1\} \subseteq S. \]
Theorem 4.34. If $V_1, \ldots, V_n$ are the vertices in order of a polygon which bounds a convex region $P$ of $\mathbb{R}^{2}$ containing a point not on the boundary, then $\mathrm{Euc}(2)_{P}$ can be identified with a subgroup of the permutation group $\mathrm{Perm}_{\{V_1, \ldots, V_n\}}$ of the vertices.
Bibliography
[1] M. Liebeck (with a foreword by R. Guralnick), A Concise Introduction to Pure Mathematics, 2nd edition,
Chapman & Hall/CRC (2005); ISBN 1 58488 547 5.
[2] B. L. Johnston & F. Richman, Numbers and Symmetry, an Introduction to Algebra, CRC Press (1997); ISBN
0 8493 0301 X.
[3] T. Barnard & H. Neil, Teach Yourself Mathematical Groups, Hodder & Stoughton (1996); ISBN 0 340 67012 6.
Mathematics 2F: Exercises
Exercises on Chapter 1
1-1. For integers $a, b, c$, prove that
(a) $a \mid b$ and $b \mid c \implies a \mid c$,
(b) $a \mid b$ and $b \mid a \implies b = \pm a$,
(c) $a \mid b$ and $a \mid (b + c) \implies a \mid c$.
1-2. For each of the following pairs $a, b$, find $\gcd(a, b)$ and express it as a linear combination of $a$ and $b$.
(a) $a = 28$, $b = 23$; (b) $a = 55$, $b = 34$; (c) $a = 1492$, $b = 1066$.
1-3. Find the greatest common divisor of 2263 and 1643 by factorising both numbers into
primes. Check your answer by using the Euclidean algorithm. Which method is quicker?
1-4. Suppose that $a, b$ are natural numbers and write $a = a_0 \gcd(a, b)$, $b = b_0 \gcd(a, b)$. Show that $\gcd(a_0, b_0) = 1$.
1-5. Prove that if $\gcd(a, b) = 1$ and $a \mid bc$, then $a \mid c$. [This generalises the result that if $p$ is prime and $p \nmid b$ and $p \mid bc$ then $p \mid c$. It can be proved in the same way.]
1-6. Prove that the common divisors of a and a +b are the same as the common divisors of a
and b. Deduce that gcd(a, a +b) = gcd(a, b).
1-7. For two non-zero natural numbers $a, b$, a natural number $m$ is called a least (or lowest) common multiple of $a, b$ if
• $a \mid m$ and $b \mid m$,
• whenever $a \mid c$ and $b \mid c$, then $m \mid c$.
(a) Show that $\dfrac{ab}{\gcd(a, b)}$ is a natural number and that it is a least common multiple of $a, b$.
(b) Show that if $m$ is any least common multiple of $a, b$ then $m = \dfrac{ab}{\gcd(a, b)}$. [This allows us to talk about the least common multiple of $a, b$; it is usually denoted by $\operatorname{lcm}(a, b)$.]
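1-8. Determine whether the following are true or false.
(a) $a \equiv b \bmod m \implies 2a \equiv 2b \bmod m$.
(b) $2a \equiv 2b \bmod m \implies a \equiv b \bmod m$.
(c) $a \equiv b \bmod 2m \implies a \equiv b \bmod m$.
(d) $a \equiv b \bmod m \implies a \equiv b \bmod 2m$.
1-9. Write down the addition and multiplication tables for $\mathbb{Z}/7$ and $\mathbb{Z}/8$. In each case, decide whether it is true that $xy \equiv 0$ implies $x \equiv 0$ or $y \equiv 0$.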
1-10. For a positive integer $n$, show that
(a) $4 \cdot 11^{n} + 3 \cdot 5^{2n}$ is divisible by 7,
(b) $3^{4n+2} + 4^{2n+1}$ is divisible by 13,
(c) $5^{2n} + 3 \cdot 2^{2n-2}$ is divisible by 7.
1-11. In the EAN-13 barcode system, a code is a 13-digit sequence $a_1 a_2 \ldots a_{13}$ such that
$a_i \in \{0, 1, \ldots, 9\}$. A sequence of this type has a checksum s, also an integer belonging
to the set $\{0, 1, \ldots, 9\}$, given by
$$s \equiv a_1 + 3a_2 + a_3 + 3a_4 + a_5 + \cdots + a_{11} + 3a_{12} + a_{13} \bmod 10,$$
and the sequence is a valid code if and only if the checksum is zero.
(a) Find x such that 5 072605 31304x is a valid code.
(b) Let $a_1 a_2 \ldots a_{13}$ be a valid code, and suppose that one of the digits is changed to a different
value. Show that the result is not a valid code.
(c) Let $a_1 a_2 \ldots a_{13}$ be a valid code, and suppose that two adjacent digits are transposed, i.e.,
$a_i a_{i+1}$ is changed to $a_{i+1} a_i$. Show that the result is a valid code if and only if $a_i = a_{i+1}$ or
$a_i = a_{i+1} \pm 5$.
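The checksum rule of Exercise 1-11 can be sketched in a few lines of Python. The helper names (`ean13_checksum`, `is_valid_ean13`, `complete_ean13`) are illustrative and not part of the notes; spaces in a code string are ignored by assumption.

```python
def ean13_checksum(digits):
    """Checksum a1 + 3*a2 + a3 + ... + 3*a12 + a13 reduced mod 10."""
    if len(digits) != 13:
        raise ValueError("an EAN-13 code has exactly 13 digits")
    weights = [1, 3] * 6 + [1]          # weights 1, 3, 1, 3, ..., 3, 1
    return sum(w * d for w, d in zip(weights, digits)) % 10


def is_valid_ean13(code):
    """A code is valid exactly when its checksum is zero."""
    digits = [int(ch) for ch in code if ch.isdigit()]
    return ean13_checksum(digits) == 0


def complete_ean13(first_twelve):
    """Return the 13th digit that makes the 12 given digits a valid code."""
    digits = [int(ch) for ch in first_twelve if ch.isdigit()]
    partial = ean13_checksum(digits + [0])
    return (10 - partial) % 10


x = complete_ean13("5 072605 31304")
print(x, is_valid_ean13(f"507260531304{x}"))
```

Because the last digit has weight 1, the missing digit is simply the negative of the partial checksum modulo 10.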
1-12. In the ISBN-10 book-code system, a code is a 10-digit sequence $a_1 a_2 \ldots a_{10}$ such that
$a_i \in \{0, 1, \ldots, 9\}$ for $1 \le i \le 9$ and such that $a_{10} \in \{0, 1, \ldots, 9, X\}$. A sequence of
this type has a checksum s, belonging to the set $\{0, 1, \ldots, 9, 10\}$, given by
$$s \equiv a_1 + 2a_2 + 3a_3 + \cdots + 9a_9 + 10a_{10} \bmod 11,$$
where X is interpreted as 10. A sequence is a valid code if and only if its checksum is zero.
(a) Show that a code $a_1 a_2 \ldots a_{10}$ is valid if and only if
$$a_{10} \equiv a_1 + 2a_2 + \cdots + 9a_9 \bmod 11.$$
(b) Let $a_1 a_2 \ldots a_{10}$ be a valid code, and suppose that one of the digits is changed to a different
value. Show that the result is not a valid code.
(c) Let $a_1 a_2 \ldots a_{10}$ be a valid code, and suppose that two adjacent digits are transposed (so
$a_i a_{i+1}$ is changed to $a_{i+1} a_i$). Suppose also that $a_i \ne a_{i+1}$. Show that the result is not a valid
code.
(d) Which of the following ISBN-10 codes are valid?
1 58488 547 5, 0 8493 0301 X, 0 340 67012 4.
[These are taken from the course bibliography, but may have been copied wrongly.] Correct any
invalid codes by changing the last digit.
(e) An ISBN-10 code is converted to an ISBN-13 code in the following way: prefix the code
with 978; change the last digit so that the result is a valid EAN-13 code. Apply this process to
the codes in part (d) (corrected if necessary).
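The following sketch follows the conventions of Exercise 1-12: weights 1 to 10, with X read as 10. The function names are hypothetical helpers, and the conversion routine simply applies the recipe in part (e): prefix 978, then recompute the final digit with the EAN-13 weights.

```python
def _isbn10_chars(code):
    """Keep only digits and X, ignoring spaces or hyphens."""
    return [ch for ch in code.upper() if ch.isdigit() or ch == "X"]


def isbn10_checksum(code):
    """Checksum a1 + 2*a2 + ... + 10*a10 reduced mod 11 (X counts as 10)."""
    chars = _isbn10_chars(code)
    if len(chars) != 10 or "X" in chars[:9]:
        raise ValueError("expected 10 symbols with X allowed only in last place")
    values = [10 if ch == "X" else int(ch) for ch in chars]
    return sum(i * a for i, a in enumerate(values, start=1)) % 11


def isbn10_check_digit(first_nine):
    """Last symbol forced by part (a): a10 = a1 + 2*a2 + ... + 9*a9 mod 11."""
    digits = [int(ch) for ch in first_nine if ch.isdigit()]
    value = sum(i * a for i, a in enumerate(digits, start=1)) % 11
    return "X" if value == 10 else str(value)


def isbn10_to_isbn13(code):
    """Prefix 978, drop the old last symbol, append the EAN-13 check digit."""
    body = "978" + "".join(_isbn10_chars(code)[:9])
    weights = [1, 3] * 6
    total = sum(w * int(d) for w, d in zip(weights, body))
    return body + str((10 - total % 10) % 10)


for code in ["1 58488 547 5", "0 8493 0301 X", "0 340 67012 4"]:
    print(code, "valid" if isbn10_checksum(code) == 0 else "invalid")
```

The usual published description of ISBN-10 uses weights 10 down to 1; modulo 11 the two conventions accept the same codes, since the weights differ by the factor $-1$ after reindexing.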
1-13. (a) Find an inverse mod 101 for 9.
(b) By considering enough powers $2^r \bmod 21$, find an inverse mod 21 for 2. Hence determine all
the distinct powers $2^n \bmod 21$. [It is only necessary to consider finitely many exponents n.]
(c) Find inverses mod 47 for 5 and 6. Hence find an inverse mod 47 for 30.
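As a rough illustration, an inverse modulo m can be found with the extended Euclidean algorithm, tracking only the coefficient of a. The names `mod_inverse` and `distinct_powers` are illustrative helpers, not notation from these notes.

```python
def mod_inverse(a, m):
    """Return the inverse of a modulo m, or raise if gcd(a, m) != 1."""
    old_r, r = a % m, m
    old_s, s = 1, 0                  # invariant: old_r ≡ old_s * a (mod m)
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_s, s = s, old_s - q * s
    if old_r != 1:
        raise ValueError(f"{a} has no inverse modulo {m}")
    return old_s % m


def distinct_powers(base, m):
    """List base, base^2, ... mod m until a value repeats."""
    seen, x = [], base % m
    while x not in seen:
        seen.append(x)
        x = (x * base) % m
    return seen


print(mod_inverse(9, 101))
print(distinct_powers(2, 21))
print(mod_inverse(5, 47), mod_inverse(6, 47), mod_inverse(30, 47))
```

For part (c), the inverse of a product is the product of the inverses, so the inverse of 30 can be checked against the product of the inverses of 5 and 6 reduced mod 47.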
1-14. Solve each of the following congruences:
(a) $3x \equiv 4 \bmod 7$, (b) $3x \equiv 4 \bmod 14$, (c) $10x \equiv 15 \bmod 21$,
(d) $10x \equiv 15 \bmod 25$, (e) $17x \equiv 19 \bmod 23$.
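A hedged sketch of the standard method for a linear congruence $ax \equiv b \bmod m$: with $d = \gcd(a, m)$, there are no solutions unless $d \mid b$, and otherwise there are exactly d solutions modulo m. The function name is illustrative, and `pow(a, -1, m)` requires Python 3.8 or later.

```python
from math import gcd


def solve_linear_congruence(a, b, m):
    """Return all x in range(m) with a*x ≡ b (mod m), in increasing order."""
    d = gcd(a, m)
    if b % d != 0:
        return []                     # no solutions when d does not divide b
    a1, b1, m1 = a // d, b // d, m // d
    x0 = (pow(a1, -1, m1) * b1) % m1  # unique solution modulo m/d
    return [x0 + k * m1 for k in range(d)]


for a, b, m in [(3, 4, 7), (3, 4, 14), (10, 15, 21), (10, 15, 25), (17, 19, 23)]:
    print(f"{a}x ≡ {b} mod {m}:", solve_linear_congruence(a, b, m))
```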
1-15. Find the general solution of the simultaneous congruences
$$x \equiv 1 \bmod 5, \quad x \equiv 3 \bmod 6, \quad x \equiv 2 \bmod 11.$$
1-16. Find the general solution of the simultaneous congruences
$$x \equiv 3 \bmod 7, \quad x \equiv 5 \bmod 13, \quad x \equiv 4 \bmod 16.$$
1-17. Solve the congruence $16x \equiv 99 \bmod 221$ by solving the simultaneous congruences
$$16x \equiv 99 \bmod 17, \quad 16x \equiv 99 \bmod 13.$$
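One way to sketch the Chinese Remainder Theorem computationally is to merge the congruences one at a time. The function `crt` below is an illustrative helper that assumes the moduli are pairwise coprime, as they are in Exercises 1-15 and 1-16; `pow(m, -1, n)` needs Python 3.8 or later.

```python
def crt(congruences):
    """Solve x ≡ r_i (mod n_i) for pairwise coprime n_i.

    Returns (x, M) where M is the product of the moduli and 0 <= x < M;
    the general solution is then x + k*M for integers k.
    """
    x, m = 0, 1
    for r, n in congruences:
        # choose t so that x + m*t ≡ r (mod n); m is invertible mod n
        t = ((r - x) * pow(m, -1, n)) % n
        x, m = x + m * t, m * n
    return x % m, m


print(crt([(1, 5), (3, 6), (2, 11)]))
print(crt([(3, 7), (5, 13), (4, 16)]))
```

For Exercise 1-17, each of the two congruences would first be reduced to the form $x \equiv r \bmod n$ before being passed to such a routine.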
1-18. According to biorhythm theory, everybody has a physical cycle of 23 days with a maximum
on the 6-th day, an emotional cycle of 28 days with a maximum on the 7-th day, and an
intellectual cycle of 33 days with a maximum on the 8-th day. Find when a person first has all
the maxima on the same day. Estimate the number of simultaneous maxima in a lifetime.
1-19. (a) Let p be a prime number. Use Example 1.34 to show that
$$(a + b)^p \equiv a^p + b^p \bmod p$$
for all integers a and b. In particular deduce that
$$(a + 1)^p \equiv a^p + 1 \bmod p$$
for all integers a. Use induction on a to prove that
$$a^p \equiv a \bmod p$$
for all positive integers a, and deduce that it holds for all integers.
If $p \nmid a$, deduce that
$$a^{p-1} \equiv 1 \bmod p.$$
[These results are both known as Fermat's Little Theorem.]
(b) Show that $2^9 \equiv 1 \bmod 511$, and deduce that $2^{9n+7} \equiv 128 \bmod 511$ for all nonnegative
integers n. Use part (a) to show that 511 is not a prime.
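The computation in part (b) can be checked numerically with Python's built-in three-argument `pow`, which performs modular exponentiation. The helper `fails_fermat` is an illustrative name; it reports whether $a^n \not\equiv a \bmod n$, which by part (a) can only happen when n is not prime.

```python
def fails_fermat(a, n):
    """True when a^n is NOT congruent to a mod n, so n cannot be prime."""
    return pow(a, n, n) != a % n


n = 511
print(pow(2, 9, n))        # the congruence 2^9 mod 511 from part (b)
print(pow(2, n, n))        # 511 = 9*56 + 7, so this is 2^(9*56+7) mod 511
print(fails_fermat(2, n))
```

The converse does not hold in general: passing this test for one base a does not prove that n is prime.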
1-20. Let p be an odd prime for which $p \equiv 1 \bmod 4$, and set $m = (p - 1)/2$. Show that
$$(m!)^2 \equiv -1 \bmod p.$$
So for such a prime, $-1$ is congruent to a square modulo p. [Hint: express the residues modulo
p of each of the numbers $m + 1, m + 2, \ldots, p - 1$ in terms of $1, 2, \ldots, m$ and use Wilson's
Theorem 1.35.]
1-21. Let p be an odd prime. A primitive root modulo p is an integer u for which the powers
$u, u^2, \ldots, u^{p-1}$ are all distinct modulo p. Then $u^{p-1} \equiv 1 \bmod p$ and this is the smallest positive
power for which this is true; furthermore, if $p \nmid t$, then $t \equiv u^r \bmod p$ for exactly one value of
r in the range $1, 2, \ldots, p - 1$. Gauss's Primitive Element Theorem asserts that such primitive
generators exist for all primes.
Find primitive roots for each of the primes 3, 5, 7, 11, 13.
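A brute-force sketch that applies the definition directly: u is a primitive root modulo p when the powers $u, u^2, \ldots, u^{p-1}$ take $p - 1$ distinct values. The function names are illustrative, and this approach is only sensible for small primes.

```python
def is_primitive_root(u, p):
    """Check that u, u^2, ..., u^(p-1) are all distinct modulo the prime p."""
    seen = set()
    x = 1
    for _ in range(p - 1):
        x = (x * u) % p
        seen.add(x)
    return len(seen) == p - 1


def primitive_roots(p):
    """All primitive roots modulo p among the residues 1, ..., p-1."""
    return [u for u in range(1, p) if is_primitive_root(u, p)]


for p in [3, 5, 7, 11, 13]:
    print(p, primitive_roots(p))
```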
2-1. Which of the following formulae are true for arbitrary sets A, B, C? In each case, give
either a counterexample or a proof.
(i) (A B) C = (A C) (B C).
(ii) (A B) C = (A C) (B C).
(iii) (A B) \ C = (A\ C) (B \ C).
(iv) (A B) \ C = (A\ C) (B \ C).
(v) A\ (B C) = (A\ B) (A\ C).
(vi) A\ (B C) = (A\ B) (A\ C).
2-2. Can there exist sets U, V, W such that the following are all true?
U V = , U W = , (U V ) \ W = .
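2-3. Recall that the symmetric difference of two sets X, Y is defined by
$$X \triangle Y = (X \setminus Y) \cup (Y \setminus X).$$
(i) Show that $X \triangle Y = (X \cup Y) \setminus (X \cap Y)$.
(ii) What is $X \triangle X$?
(iii) Show that $\triangle$ is associative, i.e., show that for any three sets X, Y, Z,
$$X \triangle (Y \triangle Z) = (X \triangle Y) \triangle Z.$$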
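2-4. Let $f_1 \colon X_1 \to X_2$, $f_2 \colon X_2 \to X_3$, $f_3 \colon X_3 \to X_4$, $f_4 \colon X_4 \to X_5$ be functions. Show
that the following compositions are all equal:
$$(f_4 \circ f_3) \circ (f_2 \circ f_1), \quad f_4 \circ (f_3 \circ (f_2 \circ f_1)), \quad f_4 \circ ((f_3 \circ f_2) \circ f_1), \quad (f_4 \circ (f_3 \circ f_2)) \circ f_1.$$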
2-5. Let f : X Y and g : Y Z be two functions. Let A
, A
X, B
, B
Y
and C Z. Which of the following are always true? In each case give either a proof or a
counterexample.
f
1
(B
) = f
1
(B
) f
1
(B
), f
1
(B
) = f
1
(B
) f
1
(B
),
f(A
\ A
) = f(A
) \ f(A
), f
1
(B
\ B
) = f
1
(B
) \ f
1
(B
),
f
1
(fA
) A
, f
1
(fA
) A
, f(f
1
B
) B
, f(f
1
B
) B
,
g fX = g(fX), (g f)
1
C = f
1
(g
1
C).
2-6. For each of the following functions, investigate whether it is injective, surjective or bijective:
$$f \colon [0, 2\pi] \to [-1, 1], \quad g \colon [-\pi/2, \pi/2] \to [-1, 1], \quad h \colon [-\pi, 0] \to [-1, 1],$$
where the rules are
$$f(t) = g(t) = h(t) = \cos t.$$
2-7. Investigate the injectivity and surjectivity of the following functions. Where a function is
bijective, give the rule for its inverse.
$f \colon \mathbb{R} \to \mathbb{R}$; $f(x) = x^3$,
$g \colon \mathbb{R} \to \mathbb{R}$; $g(x) = x^4$,
$h \colon (0, \infty) \to (-\infty, 0)$; $h(x) = -x^4$,
$k \colon (0, \infty) \to (0, \infty)$; $k(x) = \dfrac{1}{x^2}$.
2-8. Let A and B be two non-empty sets.
(a) Show that the projection functions
$$p_1 \colon A \times B \to A;\ p_1(a, b) = a, \qquad p_2 \colon A \times B \to B;\ p_2(a, b) = b$$
are surjections.
(b) Show that the diagonal function
$$\Delta \colon A \to A \times A;\ \Delta(a) = (a, a)$$
is an injection.
(c) Find a bijection $A \times B \to B \times A$. Can you generalise this to more than two sets?
2-9. Let U and V be any two disjoint non-empty sets. Find an injection $j \colon U \to U \cup V$ and
a surjection $q \colon U \cup V \to U$ for which $q \circ j = \mathrm{Id}_U$. Is $j \circ q = \mathrm{Id}_{U \cup V}$?
2-10. Let $f \colon X \to Y$ be a function between two non-empty sets.
(a) Prove that f is an injection if and only if f has a left inverse, i.e., there is a function
$g \colon Y \to X$ such that $g \circ f = \mathrm{Id}_X$.
(b) Prove that f is a surjection if and only if f has a right inverse, i.e., there is a function
$h \colon Y \to X$ such that $f \circ h = \mathrm{Id}_Y$.
(c) Suppose that f has both a left inverse $g \colon Y \to X$ and a right inverse $h \colon Y \to X$. Prove
that h = g.
2-11. Let $f \colon X \to Y$ be a function between two non-empty sets.
(a) If f is an injection, prove that if $g, h \colon W \to X$ are two functions then
$$f \circ g = f \circ h \implies g = h.$$
(b) Suppose that for every pair of functions $g, h \colon W \to X$,
$$f \circ g = f \circ h \implies g = h.$$
Prove that f is injective.
(c) If f is a surjection, prove that if $k, \ell \colon Y \to Z$ are two functions then
$$k \circ f = \ell \circ f \implies k = \ell.$$
(d) Suppose that for every pair of functions $k, \ell \colon Y \to Z$,
$$k \circ f = \ell \circ f \implies k = \ell.$$
Prove that f is surjective.
2-12. Prove Theorem 2.28.
2-13. Let U be a finite set and let V be a subset of U. Prove that
$$|U \setminus V| = |U| - |V|.$$
2-14. Let X be the set of all natural numbers from 1 to 5000.
(a) Let A be the set of all elements of X which are not divisible by 3. How many elements does
A have?
(b) Let B be the set of all elements of X which are not divisible by 4. How many elements does
B have?
(c) How many elements of X are not divisible by at least one of 3 and 4?
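The counts in Exercise 2-14 can be cross-checked by brute force. This sketch builds the sets explicitly and confirms the two-set inclusion-exclusion identity $|A \cup B| = |A| + |B| - |A \cap B|$; the variable names mirror the exercise, and the set in part (c) is read here as $A \cup B$.

```python
X = range(1, 5001)
A = {n for n in X if n % 3 != 0}     # not divisible by 3
B = {n for n in X if n % 4 != 0}     # not divisible by 4

union = A | B                        # not divisible by at least one of 3 and 4
both = A & B                         # divisible by neither 3 nor 4

print(len(A), len(B), len(union), len(both))
print(len(union) == len(A) + len(B) - len(both))
```

An element fails to lie in the union only when it is divisible by both 3 and 4, that is, by 12, which gives an independent check on the size of the union.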
2-15. Use the Pigeonhole Principle to show the following.
(a) In any set of 9 distinct integers, it is possible to find two elements whose difference is divisible
by 8.
(b) For a natural number n, in any set of n + 1 distinct integers, it is possible to find two
elements whose difference is divisible by n.
(c) For a natural number n, suppose that $x_1, x_2, \ldots, x_n$ are distinct integers. Then there is a
non-empty subset of $\{x_1, x_2, \ldots, x_n\}$ whose elements have sum divisible by n. [Hint: consider
the numbers $0, x_1, x_1 + x_2, \ldots, x_1 + x_2 + \cdots + x_n$ and use (b).]
2-16. Let X and Y be sets and assume that X is countable.
(a) If there is a bijection $X \to Y$, show that Y is countable.
(b) If there is an injection $g \colon Y \to X$, show that Y is countable.
(c) If there is a surjection $h \colon X \to Y$, show that Y is countable.
2-17. Let X and Y be sets and assume that X is uncountable.
(a) If there is a bijection $X \to Y$, show that Y is uncountable.
(b) If there is an injection $j \colon X \to Y$, show that Y is uncountable.
(c) If there is a surjection $k \colon Y \to X$, show that Y is uncountable.
2-18. Show that each of the following sets is countable:
(a) the set of all integers which are squares of integers, $\{n^2 : n \in \mathbb{Z}\}$;
(b) the set of all non-zero integers, $\{n \in \mathbb{Z} : n \ne 0\}$;
(c) the set of all real numbers which are square roots of rational numbers, $\{x \in \mathbb{R} : x^2 \in \mathbb{Q}\}$.
2-19. Let X be a set. For any subset $U \subseteq X$, the indicator or characteristic function of
U is given by
$$\chi_U \colon X \to \{0, 1\}; \quad \chi_U(x) = \begin{cases} 0 & \text{if } x \notin U, \\ 1 & \text{if } x \in U. \end{cases}$$
(a) If $U, V \subseteq X$ are two subsets, show that for every $x \in X$,
$$\chi_{U \cap V}(x) = \chi_U(x)\chi_V(x),$$
so $\chi_{U \cap V}$ is the product function $\chi_U \chi_V$.
(b) If $U \subseteq X$, show that for every $x \in X$,
$$\chi_{X \setminus U}(x) = 1 - \chi_U(x),$$
so $\chi_{X \setminus U} = 1 - \chi_U$, where 1 is the constant function on X taking value 1.
(c) If $U, V \subseteq X$ are two subsets, show that for every $x \in X$,
$$\chi_{U \setminus V}(x) = \chi_U(x)(1 - \chi_V(x)).$$
(d) If $U \subseteq X$ is a finite set, show that
$$|U| = \sum_{x \in X} \chi_U(x),$$
where the right hand side really is a finite sum.
(e) Use (d) to prove the inclusion-exclusion principle of Theorem 2.28.
2-20. A village in which beards are banned has a resident barber. The barber shaves all the
men resident in the village who do not shave themselves, but of course does not shave the men
who do shave themselves. Is the barber a man?
2-21. Let $\sim$ be the relation on $\mathbb{R}$ given by
$$x \sim y \iff y = x \text{ or } y = -x.$$
Show that $\sim$ is an equivalence relation and describe the equivalence classes.
2-22. Let $m \in \mathbb{N}$ and define the relation $\sim_m$ on $\mathbb{Z}$ by
$$x \sim_m y \iff m \mid (y - x).$$
Show that $\sim_m$ is an equivalence relation. What are the equivalence classes?
2-23. Let Y be a set and let X = P(Y) be the power set of Y. Let $\doteq$ be the relation on X
given by
$$A \doteq B \iff \text{there is a bijection } A \to B.$$
(a) Show that $\doteq$ is an equivalence relation.
(b) Suppose that Y is finite. Describe the equivalence classes of $\doteq$ and state how big each is.
2-24. Let $M_{m \times n}(F)$ be the set of all $m \times n$ matrices with entries in a field F. Show that the
relation $\sim$ on $M_{m \times n}(F)$ is an equivalence relation, where
$A \sim B$ if and only if there is an invertible $m \times m$ matrix P such that $B = PA$.
Can you identify the equivalence classes? [Hint: what is the result of performing lots of elementary
row operations?]
2-25. Let $M_{n \times n}(F)$ be the set of all $n \times n$ matrices with entries in a field F. Show that the
relation $\sim$ on $M_{n \times n}(F)$ is an equivalence relation, where
$A \sim B$ if and only if there is an invertible $n \times n$ matrix Q such that $B = Q^{-1}AQ$.
2-26. (a) Consider the surjective function $\sin \colon \mathbb{R} \to [-1, 1]$ and let $\sim$ be the associated
equivalence relation as defined in Proposition 2.50. Describe the equivalence classes of $\sim$.
(b) Do the analogous exercise for the surjective function $\tan \colon X \to \mathbb{R}$, where
$$X = \mathbb{R} \setminus \left\{ \frac{(2k + 1)\pi}{2} : k \in \mathbb{Z} \right\}.$$
2-27. Let $\sim$ be the relation on $\mathbb{R}$ given by
$$x \sim y \iff y - x \in \mathbb{Z}.$$
Show that $\sim$ is an equivalence relation and describe the equivalence classes.
Let $T = \{z \in \mathbb{C} : |z| = 1\}$, the unit circle in the complex numbers. Show that the function
$$f \colon \mathbb{R} \to T; \quad f(x) = e^{2\pi x i}$$
is constant on each equivalence class, hence show that there is a bijection $\mathbb{R}/{\sim} \to T$.
3-1. Let $f \colon X \to Y$ and $g \colon Y \to Z$ be two bijections. Verify that $g \circ f \colon X \to Z$ is a
bijection and show that $(g \circ f)^{-1} = f^{-1} \circ g^{-1}$.
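3-2. In the symmetric group $S_6$, compute the permutations
$$\begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 \\ 2 & 3 & 1 & 5 & 6 & 4 \end{pmatrix}\begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 \\ 6 & 4 & 2 & 3 & 1 & 5 \end{pmatrix}, \qquad \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 \\ 2 & 3 & 1 & 5 & 6 & 4 \end{pmatrix}^{-1},$$
expressing your answers in the form
$$\begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 \\ a & b & c & d & e & f \end{pmatrix}.$$
3-3. In the symmetric group $S_6$, express the permutations
$$\begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 \\ 2 & 3 & 1 & 5 & 6 & 4 \end{pmatrix}, \qquad \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 \\ 6 & 4 & 2 & 3 & 1 & 5 \end{pmatrix}$$
as products of disjoint cycles.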
3-4. Calculate the following permutations in the symmetric group $S_6$, giving the answers as
products of disjoint cycles:
$$(2, 3, 5, 6)(1, 6, 2, 3), \qquad (2, 3)(1, 6, 2)(5, 6, 2, 4), \qquad (5, 6, 2, 4)^{-1}.$$
3-5. Write out the multiplication table for the symmetric group S
3
, part of which is shown
below and in which the entry in row and column is .
Id (1, 2, 3) (1, 3, 2) (1, 2) (1, 3) (2, 3)
Id (1, 2, 3)
(1, 2, 3) (1, 2, 3)
(1, 3, 2) (1, 2, 3)
(1, 2) (1, 2, 3)
(1, 3) (1, 2, 3)
(2, 3) (1, 2, 3)
Find a subset A of S
3
with three elements such that A whenever and are both in A.
3-6. In a symmetric group, let be a cycle given by
= (a
1
, a
2
, . . . , a
r
),
let be any permutation, and let
be the cycle given by
= (a
1
, a
2
, . . . , a
r
).
Show that =
. [You must check that both permutations take the same value on each
element a
i
and on each element not of the form a
i
.]
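3-7. In the symmetric group $S_6$, compute the signs of the permutations
$$\begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 \\ 2 & 3 & 1 & 5 & 6 & 4 \end{pmatrix}, \qquad \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 \\ 6 & 4 & 2 & 3 & 1 & 5 \end{pmatrix}.$$
3-8. In the symmetric group $S_6$, compute the signs of the permutations
$$(1, 3, 4, 6)(1, 3, 4, 2), \qquad (5, 6)(3, 5, 4)(1, 2, 3, 5), \qquad (1, 2, 3, 5)^{-1}.$$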
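Calculations with permutations such as those in Exercises 3-2 to 3-8 can be checked with a small sketch. Here a permutation of {1, ..., n} is stored as the tuple of its images, and products are composed right to left, as for functions; that convention is an assumption and should be matched against the one used in Chapter 3. All function names are illustrative.

```python
def compose(s, t):
    """Return the product s∘t, i.e. i -> s(t(i)): apply t first, then s."""
    return tuple(s[t[i] - 1] for i in range(len(t)))


def cycle(n, *elements):
    """The cycle (e1, e2, ..., er) as a permutation of {1, ..., n}."""
    images = list(range(1, n + 1))
    for j, a in enumerate(elements):
        images[a - 1] = elements[(j + 1) % len(elements)]
    return tuple(images)


def disjoint_cycles(p):
    """Disjoint cycle decomposition, omitting fixed points."""
    seen, result = set(), []
    for start in range(1, len(p) + 1):
        orbit, i = [], start
        while i not in seen:
            seen.add(i)
            orbit.append(i)
            i = p[i - 1]
        if len(orbit) > 1:
            result.append(tuple(orbit))
    return result


def sign(p):
    """Sign of p: a cycle of length r contributes (-1)^(r-1)."""
    return (-1) ** sum(len(c) - 1 for c in disjoint_cycles(p))


sigma = (2, 3, 1, 5, 6, 4)
tau = (6, 4, 2, 3, 1, 5)
print(compose(sigma, tau), disjoint_cycles(sigma), sign(tau))
```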
3-9. Let $\sigma$ be the cycle given by
$$\sigma = (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12)$$
in the symmetric group $S_{12}$. For i = 2, 3, 4, 5, express $\sigma^i$ as a product of disjoint cycles, and
compute the sign of $\sigma^i$.
3-10. Let be a permutation in a symmetric group S
n
with n > 0, let k = (1), and let be
the permutation in S
n
given by
(i) =
_
_
(i + 1) if i < n and (i + 1) < k,
(i + 1) 1 if i < n and (i + 1) > k,
n if i = n.
Show that
= (n, n 1, . . . , k)(1, 2, . . . , n),
and deduce that sgn = (1)
k+1
sgn . [This is related to expansions of determinants. Let
A = [a
ij
] be an nn matrix, and let B be the matrix got from A by omitting the rst row and
the kth column. Then sgn is the sign attached to the term a
2(2)
a
n(n)
in the expansion
of the determinant of B.]
3-11. Let E be the set of even integers and let F be the set of odd integers. Show that E is a
subgroup of the additive group Z, and that F is not a subgroup.
3-12. Let G be the set of non-zero complex numbers, let H be the set of complex numbers of
absolute value 1, and let K be the set of complex numbers z such that $z^3 = 1$. Show that G is
a group with the usual multiplication. Show also that H and K are subgroups of G.
3-13. Let F be a field and let V be a vector space over F. Show that V is a group under the
vector space addition +. Is it abelian?
3-14. For $n \ge 1$, let
$$O(n) = \{A : A \text{ is an } n \times n \text{ real matrix satisfying } A^T A = I_n\},$$
i.e., the set of all $n \times n$ real orthogonal matrices.
(a) Show that $O(n) \le \mathrm{GL}_n(\mathbb{R})$, i.e., O(n) is a subgroup of $\mathrm{GL}_n(\mathbb{R})$.
(b) Show that if $A \in O(n)$ then $\det A = \pm 1$. Deduce that
$$SO(n) = \{A \in O(n) : \det A = 1\}$$
is a subgroup of O(n) and that
$$O(n) = SO(n) \cup \{A \in O(n) : \det A = -1\}.$$
3-15. Let G be the group of non-zero complex numbers under multiplication, let n be a positive
integer, and let H be the set of complex numbers z such that $z^n = 1$. Show that H is a subgroup
of G. Show also that H is a cyclic group of order n.
3-16. Let H and K be subgroups of a group G. Show that $H \cap K$ is a subgroup of G.
3-17. In the symmetric group $S_3$, find elements $\sigma$ and $\tau$ of orders 2 and 3 respectively. Show
that the union of the cyclic groups $\langle \sigma \rangle$ and $\langle \tau \rangle$ is not a subgroup of $S_3$.
3-18. In the additive group of integers Z, find generators for each of the following subgroups:
$H_1 = \{70x - 130y : x, y \in \mathbb{Z}\}$,
$H_2 = \{10x + 6y : x, y \in \mathbb{Z}\}$,
$H_3 = \{70x - 130y + 6y : x, y \in \mathbb{Z}\}$.
What can you say about the subgroup
$$H = \{xa + yb + zc : x, y, z \in \mathbb{Z}\},$$
where a, b, c are three non-zero integers?
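A small numerical experiment can suggest the answer to Exercise 3-18. It relies on the fact from Chapter 1 that gcd(a, b) is an integer linear combination of a and b, so one expects the set of all integer combinations to be generated by the gcd. The helper names and the search window are illustrative choices, and a brute-force check over a finite window is evidence, not a proof.

```python
from functools import reduce
from itertools import product
from math import gcd


def candidate_generator(coefficients):
    """gcd of the given integers (signs are irrelevant to the gcd)."""
    return reduce(gcd, coefficients)


def combinations_in_window(coefficients, bound, search=30):
    """Integer combinations with |multipliers| <= search, kept if |value| <= bound."""
    values = set()
    for xs in product(range(-search, search + 1), repeat=len(coefficients)):
        v = sum(c * x for c, x in zip(coefficients, xs))
        if abs(v) <= bound:
            values.add(v)
    return values


for coeffs in [(70, -130), (10, 6)]:
    g = candidate_generator(coeffs)
    window = combinations_in_window(coeffs, bound=40)
    multiples = {k for k in range(-40, 41) if k % g == 0}
    print(coeffs, g, window == multiples)
```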
3-19. Investigate the primitive generators for your favorite prime numbers. Using a calculator
or computer might be helpful if the primes are large!
4-1. (a) Find a parametric equation for the line $L_1$ with implicit equation $2x - 3y = 1$.
(b) Find an implicit equation for the line $L_2$ which has parametric equation $x = (t - 1, 3t + 1)$.
(c) Find parametric and implicit equations for the line $L_3$ which contains the point P(1, 1)
and is parallel to the vector (1, 1).
(d) Find the point of intersection of the lines $L_1$ and $L_3$ and the angle between them.
4-2. Let u = (5, 0) and v = (2, 1).
(a) Find the angle between u and v.
(b) Find the projection of the vector u onto v.
(c) Find the projection of the vector v onto u.
4-3. Consider the lines
$$L_1 = \{(x, y) : x + y = 2\}, \qquad L_2 = \{(x, y) : x - y = 2\}.$$
Find the effects on the point P(1, 0) of the reflections $\mathrm{Refl}_{L_1}$ and $\mathrm{Refl}_{L_2}$.
4-4. Consider the lines
$$L_1 = \{(x, y) : 2x + y = 0\}, \qquad L_2 = \{(x, y) : 2x + y = 2\}.$$
Express each of the isometries $\mathrm{Refl}_{L_2} \circ \mathrm{Refl}_{L_1}$ and $\mathrm{Refl}_{L_1} \circ \mathrm{Refl}_{L_2}$ as translations, i.e., in the form
$\mathrm{Trans}_t$ for some $t \in \mathbb{R}^2$.
4-5. Recall the standard identification of the pair (x, y) with the column vector $\begin{pmatrix} x \\ y \end{pmatrix}$.
(a) Give a matrix interpretation of the dot product $(x_1, y_1) \cdot (x_2, y_2)$.
(b) Let $u \in \mathbb{R}^2$ be a unit vector. Show that the $2 \times 2$ matrix $U = uu^T$ satisfies
$$Ux = \begin{cases} 0 & \text{if } u \cdot x = 0, \\ x & \text{if } x = tu \text{ for some } t \in \mathbb{R}. \end{cases}$$
(c) Deduce that the matrix $U' = I_2 - 2U$ has the same effect on vectors as reflection in the line
$$L = \{x \in \mathbb{R}^2 : u \cdot x = 0\}.$$
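The matrix $I_2 - 2uu^T$ from part (c) can be explored numerically with plain Python lists. The function names are illustrative; the input vector is normalised first, so any non-zero normal vector to the line may be supplied.

```python
import math


def reflection_matrix(u):
    """Return I - 2*u*u^T for the unit vector in the direction of u."""
    ux, uy = u
    norm = math.hypot(ux, uy)
    ux, uy = ux / norm, uy / norm
    return [[1 - 2 * ux * ux, -2 * ux * uy],
            [-2 * ux * uy, 1 - 2 * uy * uy]]


def apply(matrix, x):
    """Multiply a 2x2 matrix by a column vector given as a pair."""
    return (matrix[0][0] * x[0] + matrix[0][1] * x[1],
            matrix[1][0] * x[0] + matrix[1][1] * x[1])


M = reflection_matrix((0, 1))   # the line u.x = 0 is then the x-axis
print(apply(M, (3, 4)))         # a general point
print(apply(M, (1, 0)))         # a point on the line
print(apply(M, (0, 1)))         # the normal vector itself
```

Vectors on the line are left fixed and the normal direction is reversed, which is the behaviour part (b) leads one to expect.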
4-6. (a) Let A be an $n \times n$ orthogonal matrix. Show that $\det A = \pm 1$.
(b) If P, Q are $n \times n$ orthogonal matrices, show that their product PQ is also orthogonal.
4-7. (a) Show that a $2 \times 2$ real orthogonal matrix A with determinant $\det A = 1$ has the form
$$A = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}$$
for some $\theta \in \mathbb{R}$.
[Hint: Write down a system of equations for the four entries of A, then solve it using the fact
that when a pair of real numbers x, y satisfies $x^2 + y^2 = 1$ there is a real number $\theta$ such that
$x = \cos\theta$, $y = \sin\theta$.]
(b) Show that a $2 \times 2$ real orthogonal matrix B with determinant $\det B = -1$ has the form
$$B = \begin{pmatrix} \cos\theta & \sin\theta \\ \sin\theta & -\cos\theta \end{pmatrix}$$
for some $\theta \in \mathbb{R}$.
[Observe that $C = B\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$ is orthogonal and satisfies $\det C = 1$, then apply (a).]
4-8. Let A be a $2 \times 2$ orthogonal matrix.
(a) If $\det A = -1$, show that the eigenvalues of A are $\pm 1$. What is the geometric interpretation
of the eigenvectors? What can you say about the powers of A?
(b) If $\det A = 1$, show that the eigenvalues have the form $e^{\pm i\theta}$ for some real number $\theta$. Are
there any real eigenvectors? What about the complex eigenvectors? If $\theta = 2\pi/n$ for some
$n \in \mathbb{N}$, what can you say about the powers of A?
4-9. For each of the following cases, describe the geometric effect of the isometry (A | t).
(a) A =
_
1/
2 1/
2
1/
2 1/
2
_
, t =
_
1
1
_
, (b) A =
_
1/
2 1/
2
1/
2 1/
2
_
, t =
_
1
1
_
,
(c) A =
_
1 0
0 1
_
, t =
_
0
1
_
.
In each case, determine the Seitz symbol of $(A \mid t)^2 = (A \mid t)(A \mid t)$ and describe the effect of
the corresponding isometry.
4-10. Given that
R =
_
0 1
1 0
_
, w =
_
1
3
_
,
find the angle and centre of the rotation whose Seitz symbol is (R | w).
4-11. Let : R
2
R
2
be an isometry that xes a point Pp.
(a) Show that the composition
= Trans
p
Trans
p
xes the origin and describe the eect this isometry geometrically in terms of that of .
(b) If Qq is a second point, show that the composition
= Trans
(qp)
Trans
(pq)
xes Q and describe the eect of this isometry geometrically in terms of that of .
4-12. Given the Seitz symbols for isometries (A | s), (B | t), determine the Seitz symbols of
the isometries
(a) $(A \mid 0)(I_2 \mid s)$, (b) $(B \mid t)(A \mid 0)(B \mid t)^{-1}$.
4-13. Let (A | t) be the Seitz symbol of an isometry $\mathbb{R}^2 \to \mathbb{R}^2$.
(a) If $s \in \mathbb{R}$ and $x, y \in \mathbb{R}^2$, show that
$$(A \mid t)(sx + (1 - s)y) = s(A \mid t)x + (1 - s)(A \mid t)y.$$
(b) For $n \ge 2$, show that if $s_1, \ldots, s_n \in \mathbb{R}$ satisfy $s_1 + \cdots + s_n = 1$ and $x_1, \ldots, x_n \in \mathbb{R}^2$, then
$$(A \mid t)(s_1 x_1 + \cdots + s_n x_n) = s_1 (A \mid t)x_1 + \cdots + s_n (A \mid t)x_n.$$
4-14. (a) Consider a regular pentagon P with vertices A, B, C, D, E appearing in anti-clockwise
order around its centre which is at the origin O.
[Figure: regular pentagon with vertices A, B, C, D, E and centre O.]
Find all ten symmetries of P, describing them geometrically and in permutation notation.
(b) Work out the effect of the two possible compositions of reflection in the line OA with
reflection in the line OC.
(c) Work out the effect of the two possible compositions of reflection in the line OA with rotation
through 3/5 of a turn anti-clockwise.
4-15. Determine the symmetry groups of each of the following plane figures.
(a) [Figure not recoverable from the source.]
(b) [Figure not recoverable from the source.]
4-16. (a) Consider a regular hexagon H with vertices A, B, C, D, E, F appearing in anti-clockwise
order around its centre which is at the origin O.
[Figure: regular hexagon with vertices A, B, C, D, E, F and centre O.]
Find all twelve symmetries of H, describing them both geometrically and in permutation notation.
(b) Work out the effect of the two possible compositions of reflection in the line passing through
the midpoints of the sides AF and CD with reflection in the line CF.
(c) Work out the effect of the two possible compositions of reflection in the line OA with rotation
through 1/3 of a turn anti-clockwise.
4-17. (a) Let S R
2
be a non-empty subset and P S any point in S. Show that
S,P
= { Euc(2)
S
: (P) = P} Euc(2)
S
S
.
(b) When S is a square with vertices A, B, C, D, determine
S,D
.
(c) When S is a rectangle with vertices A, B, C, D and sides |AB| = |CD| = 2|AD| = 2|BC|,
determine
S,D
.
4-18. Let
$$T = \{(x, y) \in \mathbb{R}^2 : x^2 + y^2 = 1\} \subseteq \mathbb{R}^2$$
be the unit circle. Determine the symmetry subgroup $\mathrm{Euc}(2)_T$. Does $\mathrm{Euc}(2)_T$ have any finite
subgroups?
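4-19. Let $\mathrm{Rot}_{C,\theta}$ be a non-trivial rotation through angle $\theta$ about the point C with position
vector c. If t is a non-zero vector, show that $\mathrm{Trans}_t \circ \mathrm{Rot}_{C,\theta}$ is rotation through $\theta$ about the
point $C'$ with position vector
$$c' = c + \frac{1}{2}\begin{pmatrix} 1 & -\cot(\theta/2) \\ \cot(\theta/2) & 1 \end{pmatrix} t.$$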
4-20. Consider the cyclic subgroup of Euc(2) generated by the isometry = (C | w), where
C =
_
1/2
3/2
3/2 1/2
_
, w =
_
3/2
3/2
_
.
Show that has order 3 and list its elements.
4-21. Let be a subgroup of Euc(2). Dene a relation on R
2
by
x y there is a s.t. y = x.
(a) Show that is an equivalence relation on R
2
. [The equivalence classes of are called the
orbits of .]
(b) Take your favourite subgroup of Euc(2), together with a point x R
2
. Determine the
orbit of containing x.