
MathsWorks for Teachers

MathsWorks for Teachers Series editor David Leigh-Lancaster

This book provides mathematics teachers with an elementary introduction to


matrix algebra and its uses in formulating and solving practical problems, solving
systems of linear equations, representing combinations of affine (including linear)
transformations of the plane and modelling finite state Markov chains. The basic
theory in each of these areas is explained and illustrated using a broad range of
examples. A feature of the book is the complementary use of technology, particularly
computer algebra systems, to do the calculations involving matrices required for
the applications. A selection of student activities with solutions and text and web
references are included throughout the book.

Pam Norton

Matrices are used in many areas of mathematics, and also have applications in diverse areas such as engineering, computer graphics, image processing, physical sciences, biological sciences and social sciences. Powerful calculators and computers can now carry out complicated and difficult numeric and algebraic computations involving matrix methods, and such technology is a vital tool in related real-life, problem-solving applications.

Matrices
Pam Norton

Series overview

MathsWorks for Teachers has been developed to provide a coherent and contemporary
framework for conceptualising and implementing aspects of middle and senior
mathematics curricula.

Titles in the series are:

Functional Equations by David Leigh-Lancaster
Contemporary Calculus by Michael Evans
Matrices by Pam Norton
Foundation Numeracy in Context by David Tout & Gary Motteram
Data Analysis Applications by Kay Lipson
Complex Numbers and Vectors by Les Evans

ISBN 978-0-86431-508-3



Contents
Introduction v
About the author vi

An introduction to matrices 1
History 2
Matrices in the senior secondary mathematics curriculum 5

Rectangular arrays, matrices and operations 11


Definition of a matrix 15
Operations on matrices 16
Addition and subtraction of two matrices 17
Multiplication by a number (scalar multiple) 17
Structure properties of matrix addition and scalar multiplication 19
Matrix multiplication 20
Zero and identity matrices 23
The transpose of a matrix 25
The inverse of a matrix 27
Applications of matrices 30

Solving systems of simultaneous linear equations 37
Solving systems of simultaneous linear equations using matrix inverse 43
The method of Gaussian elimination 50
Systems of simultaneous linear equations in various contexts 59


Transformations of the cartesian plane 75


Linear transformations 76
Linear transformation of a straight line 81
Linear transformation of a curve 87
Standard types of linear transformations 89
Composition of linear transformations 101
Affine transformations 103
Composition of affine transformations 104

Transition matrices 111


Conditional probability 111
Transition probabilities 112
The steady-state vector 119
Applications of transition matrices 121

Curriculum connections 136

Solution notes to student activities 141

References and further reading 163


Notes 165

Introduction
MathsWorks is a series of teacher texts covering various areas of study and
topics relevant to senior secondary mathematics courses. The series has been
specifically developed for teachers to cover helpful mathematical background,
and is written in an informal discussion style.
The series consists of six titles:
Functional Equations
Contemporary Calculus
Matrices
Data Analysis Applications
Foundation Numeracy in Context
Complex Numbers and Vectors
Each text includes historical and background material; discussion of key
concepts, skills and processes; commentary on teaching and learning
approaches; comprehensive illustrative examples with related tables, graphs
and diagrams throughout; references for each chapter (text and web-based);
student activities and sample solution notes; and a bibliography.
The use of technology is incorporated as applicable in each text, and a
general curriculum link between chapters of each text and Australian state
and territory as well as selected overseas courses is provided.
A Notes section has been provided at the end of the text for teachers to
include their own comments, annotations and observations. It could also be
used to record additional resources, references and websites.

About the author
Pam Norton is an experienced lecturer in university level mathematics. Her
mathematical interests are in the applications of mathematics to sport, and her
educational interests are in the use of technology in the teaching and learning
of mathematics. She has been involved in the setting and assessing of
examinations and extended assessment tasks in mathematics and curriculum
review at both secondary and tertiary levels.


Chapter 1

An introduction to matrices
Throughout history people have collected and recorded various data using sets
(unordered lists), vectors (ordered one-dimensional lists) and matrices
(ordered two-dimensional lists of lists, tables or rectangular arrays). Today
arrays and tables of numbers and other information are found widely in
everyday life. For example, sports ladders give numbers of wins, losses, draws,
points and other information for teams in a competition. Each day stock tables
are given in local newspapers. Results of opinion polls are usually given in
table form in newspapers. Information of many forms is held in tables, and the
spreadsheet is an electronic digital technology that is now widely used in
everyday life wherever data is to be entered, stored and manipulated using
tables and rectangular arrays.
The word matrix comes from the Latin word for womb. The term matrix
is also used in areas other than mathematics, and generally means an
environment, milieu, substance or place in which something is cast, shaped
or developed.
It has become conventional in contemporary mathematics to describe a
matrix in terms of the specification of its elements by reference to row by
column, following the use of this form in developments by 19th century
mathematicians, but it is interesting to speculate whether an element of a
matrix would most likely have been referenced by column first then by row,
had computerised spreadsheets been around prior to the modern development
of matrices! Indeed, Chinese mathematicians who first recorded the use of
matrices used an early form of computer algebra, the counting board, and did
reference elements by column before row (see Hoe, 1980).


History
In the history of mathematics, the explicit and formal development of matrices
is a relatively modern invention. Katz (1993), Grattan-Guinness (1994) and
Fraleigh and Beauregard (1995) provide useful historical background, as do
the websites referenced at the end of this chapter.
Although it is evident that ancient civilisations such as the Babylonians
and Chinese were able to solve systems of simultaneous linear equations, it is
not clear if systems with multiple solutions or no solutions were considered by
them. The Egyptians were able to solve simple linear equations directly and
by guessing an answer and adjusting it to the correct one.
The origins of matrices are found in the study of systems of simultaneous
linear equations, and can be traced back to the Chinese in the Han Dynasty
about 200 BCE. The Han dynasty, established in about 210 BCE, developed
two important texts for mathematical education: Zhoubi suanjing
(Arithmetical Classic of the Gnomon and the Circular Paths of Heaven) and
Jiuzhang suanshu (Nine Chapters on the Mathematical Art). In the eighth
chapter of the latter, systems of simultaneous linear equations were solved
with the aid of a counting board by arranging the equations in tabular form,
with each column containing the coefficients and the constant term for one of
the equations. The solution was then obtained by multiplying and subtracting
columns to get a triangular form, followed by back-substitution. English
translations and commentary on these texts have recently become available
(see Kangshen, Crossley & Lun, 1999).
Systems of simultaneous linear equations arise in many areas: economics,
social sciences, medicine, engineering, biological and physical sciences and
mathematics. The most useful method for solving such systems is known as
Gaussian elimination. It was first used in modern times by Gauss in 1809 to
solve a system of six simultaneous equations in six unknowns which he formulated
while studying the orbit of the asteroid Pallas. Gauss simply developed the method
first documented by the Chinese in about 200 BCE. Not only does this
method have historical significance, but it is also the basis for the best direct
methods for programming a computer to solve such systems today. While
software programs such as MATLAB have been specifically developed for
high-speed numerical computation with high-order matrices, other programs and
technologies such as spreadsheets, computer algebra systems (CAS) and
graphics and CAS calculators can also carry out matrix computations with
matrices that are too large for efficient by-hand computation. CAS
technologies can also carry out computations with algebraic as well as
numerical elements for matrices. Where quick and accurate computations are
required, modern calculator and computer technologies are indispensable
tools.
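The elimination-then-back-substitution procedure sketched above can be written in a few lines of code. The following is an illustrative Python sketch, not the method as any particular software implements it: the function name and the example system are the author's (hypothetical) choices, and the sketch assumes a square system with nonzero pivots, with no row interchanges or error handling.

```python
def gaussian_solve(a, b):
    """Solve the linear system a.x = b by Gaussian elimination with
    back-substitution (sketch: assumes a square system whose pivots
    are nonzero, so no row swaps are needed)."""
    n = len(a)
    # Work on copies so the caller's data is untouched.
    a = [row[:] for row in a]
    b = b[:]
    # Forward elimination: reduce the system to upper-triangular form.
    for k in range(n):
        for i in range(k + 1, n):
            factor = a[i][k] / a[k][k]
            for j in range(k, n):
                a[i][j] -= factor * a[k][j]
            b[i] -= factor * b[k]
    # Back-substitution: solve from the last equation upwards.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(a[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (b[i] - s) / a[i][i]
    return x

# Example: x + y = 3 and 2x - y = 0 has solution x = 1, y = 2.
print(gaussian_solve([[1.0, 1.0], [2.0, -1.0]], [3.0, 0.0]))
```

Production solvers add partial pivoting (row interchanges) for numerical stability; this sketch only shows the shape of the algorithm.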
Gauss's method was extended to become known as the Gauss–Jordan
method for solving systems of simultaneous linear equations. Jordan's
contribution to this method is the incorporation of a systematic technique for
back-substitution. The Gauss–Jordan method is used to obtain the reduced row
echelon form of a matrix and hence to solve a system of simultaneous linear
equations directly. The Jordan part was first described by Wilhelm Jordan, a
German professor of geodesy, in the third edition (1888) of his Handbook of
Geodesy. He used the method to solve symmetric systems of simultaneous
linear equations arising out of a least squares application in geodesy.
It is perhaps somewhat surprising that the idea of a matrix did not evolve
until well after that of a determinant. In 1545, Cardan gave a rule for solving
a system of two linear equations, which is essentially Cramer's rule using
determinants for solving a 2 × 2 system. The idea of a determinant appeared
in Japan and Europe at approximately the same time. Seki Kowa, in a
manuscript in 1683, introduced the notion of a determinant (without using
this name). He described their use and showed how to find
determinants of matrices up to order 5 × 5. In the same year in Europe,
Leibniz also introduced the idea of a determinant to explain when a system of
equations had a solution.
In 1750, Gabriel Cramer published his Introduction to the Analysis of
Algebraic Curves. He was interested in the problem of determining the
equation of a plane curve of a given degree that passes through a certain
number of points. He stated a general rule, now known as Cramer's rule, in an
appendix, but did not explain how it worked. The term determinant was first
introduced by Gauss in 1801 while discussing quadratic forms. Binary
quadratic forms are expressions such as ax² + 2bxy + cy², where x and y are
variables and a, b and c coefficients, which can be represented in matrix
form as:

            [a  b] [x]
    [x  y]  [b  c] [y]

Gauss considered linear substitutions for the variables x and y of the form

    x = αx₁ + βy₁
    y = γx₁ + δy₁

and composition of substitutions, which led to matrix multiplication.
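The matrix representation of a binary quadratic form can be checked numerically. The short Python sketch below (the function name and the sample values are illustrative choices, not from the text) carries out the row-vector-by-matrix-by-column-vector product explicitly and compares it with ax² + 2bxy + cy².

```python
def quadratic_form(a, b, c, x, y):
    """Evaluate [x y] [[a, b], [b, c]] [x, y]^T by explicit
    matrix multiplication."""
    # First multiply the row vector [x y] by the symmetric matrix.
    row = [x * a + y * b, x * b + y * c]
    # Then multiply the resulting row by the column vector [x, y]^T.
    return row[0] * x + row[1] * y

# The matrix product agrees with ax^2 + 2bxy + cy^2.
a, b, c, x, y = 2, 3, 5, 7, 11
print(quadratic_form(a, b, c, x, y))        # 1165
print(a * x**2 + 2 * b * x * y + c * y**2)  # 1165
```

The off-diagonal entry b appears twice in the symmetric matrix, which is why the cross term in the expanded form carries the coefficient 2b.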


Augustin-Louis Cauchy, in 1812, published work on the theory of
determinants, both using the term determinant and the abbreviation (a₁,ₙ) to
stand for the symmetric system

    a₁,₁  a₁,₂  …  a₁,ₙ
    a₂,₁  a₂,₂  …  a₂,ₙ
     ⋮     ⋮         ⋮
    aₙ,₁  aₙ,₂  …  aₙ,ₙ

that is associated with the determinant. Although many of the basic results in
calculating determinants were already known, he introduced work on minors
and adjoints, and the procedure for calculating a determinant by expanding
along any row or down any column (now called the Laplace expansion).
It was not until 1850 that James Joseph Sylvester used the term matrix to
refer to a rectangular array of numbers. He spent most of his time studying
determinants that arose from matrices.
Soon after, Arthur Cayley showed that matrices were useful to represent
systems of simultaneous linear equations, with the notion of an inverse
matrix for their solution. In 1858 Cayley gave the first abstract definition of a
matrix, and subsequently developed the algebra of matrices, defining the
operations of addition, multiplication, scalar multiplication and inverse.
He also showed that every matrix satisfies its characteristic equation, a result
which is now known as the Cayley–Hamilton theorem. Cayley proved this
theorem for 2 × 2 matrices, and had checked the result for 3 × 3 matrices,
while William Rowan Hamilton had proved the special case for 4 × 4 matrices
with his investigations into the quaternions. Georg Frobenius proved it for the
general case in 1878.
Many other mathematicians have contributed to the theory of matrices and
determinants, including Étienne Bézout, Colin Maclaurin, Alexandre-Théophile
Vandermonde, Pierre-Simon Laplace, Joseph-Louis Lagrange,
Ferdinand Gotthold Eisenstein, Camille Jordan, Jacques Sturm, Karl Gustav
Jacob Jacobi, Leopold Kronecker and Karl Weierstrass.
Markov chains are named after the Russian mathematician Andrei
Andreevich Markov, who first defined them in 1906 in a paper dealing with
the Law of Large Numbers. He used examples from literary texts, with the
two possible states being vowels and consonants. To illustrate his results, he
did a statistical study of the alternation of vowels and consonants in Pushkin's
Eugene Onegin.
Matrices and matrix methods are now used in many areas of practical and
theoretical application. The Harvard economist Wassily W. Leontief was
awarded the Nobel Prize in Economics for his work on input–output models,
which relied heavily on matrices and solving systems of simultaneous linear
equations. Matrices are also used extensively in business optimisation
contexts, in particular where networks (graph theory) are applied to problems
involving representation, connectedness and allocation. The mathematician
Olga Taussky Todd developed matrix applications to analyse vibrations on
airplanes during World War II, and made an important contribution to the
development and application of matrix theory.
The development of the electronic digital computer has had a big impact on
the use of matrices in many areas. Matrix methods are used extensively in
computer graphics, a developing area especially driven by the demands of the
movie and computer games industries. Matrix methods are also used
extensively in the communication industry (especially for encryption and
decryption), in engineering and the sciences, and in economic modelling
and industry.
In mathematics, the study of matrices and determinants is part of linear
algebra, and it is recognised that matrices and their natural operations provide
models for the algebraic structures of a vector space and of a non-commutative ring.

Matrices in the senior secondary mathematics curriculum
During the 1960s and 1970s matrices began to be incorporated into aspects of
the senior secondary mathematics curriculum, in particular following the
innovations of the new mathematics program from the 1950s (see, for
example, Allendoerfer & Oakley, 1955; Adler, 1958), and the increasing use of
them as a tool for business-related applications. In curriculum terms this can
also be related to a greater emphasis on discrete mathematics in curriculum
design during the second half of the 20th century.
There have been several purposes for which matrices have been introduced
into the senior secondary mathematics curriculum:
• to represent and solve systems of m simultaneous linear equations in
  n variables, where m, n ≥ 2
• to represent and apply transformations, and combinations of
  transformations, of the cartesian plane, in particular considering those
  subsets of the cartesian plane that represent graphs of functions and other
  relations


• as arrays to model and manipulate data related to practical situations such
  as stock inventories, sales and prices
• to represent and compute states and transitions between states, for example
  population modelling, tax scales and conditional probabilities, including
  Markov chains
• as a mathematical tool for analysis in graph theory (networks) and game
  theory
• to carry out numerical computation such as for approximation of irrational
  real numbers
• to provide a model for abstract mathematical structures such as
  non-commutative rings and vector spaces
• to provide a model for other mathematical entities such as complex
  numbers and vectors.
The incorporation of matrices within the school mathematics curriculum
has been both as objects for investigation in their own right, and for their
instrumental application in particular contexts. For example, in The New
Mathematics (Adler, 1958) matrices are introduced for analysis of linear
transformations of the plane, then considered as an algebra in their own right,
and finally used as a model for the complex numbers. In The Principles of
Mathematics (Allendoerfer & Oakley, 1955) matrices are briefly introduced
as an example of a non-commutative ring. During the 1970s in Aggregating
Algebra (Holton & Pye, 1976), matrices are introduced in the context of
systems of three simultaneous linear equations in three unknowns with an
emphasis on geometric interpretation of the solutions. Similarly, in School
Mathematics Research Foundation: Pure Mathematics (Cameron et al.,
1970), matrices are introduced in the context of linear transformations of the
plane and then applied to the solution of systems of simultaneous linear
equations (with a detailed discussion on elementary matrices) and finally a
brief consideration of matrix algebras.
Around this time matrices also began to be used increasingly for practical
applications within discrete business-oriented mathematics courses (see, for
example, Kemeny et al., 1972) including applications of transition matrices.
During the 1970s and early 1980s, within pure mathematics-oriented senior
secondary courses, especially those intended for students with interest and
aptitude in higher level mathematics, such as Pure Mathematics (Fitzpatrick
& Galbraith, 1971), the use of matrices covered transformations, solution of
systems of simultaneous linear equations and investigation of mathematical
structures (groups of transformations of the plane, complex numbers and
rings). In some cases this also included consideration of determinants for


equation solving (Cramer's rule) and finding inverses. However, a
comprehensive study of determinants is a substantial area of mathematical
investigation of its own, and it has typically not been included in senior
secondary mathematics curricula in its own right, but rather in relation to the
study of matrices (see, for example, Hodgson & Patterson, 1990). In terms of
contemporary senior secondary curricula, matrices are covered in, for
example, Further Mathematics and Mathematical Methods (CAS) studies in
Victoria, Australia, and in the Further Pure AS module: Matrices and
Transformations in the United Kingdom.
Matrix computations had the advantage, within the contexts typically used
in senior secondary mathematics courses, of requiring relatively simple
arithmetic calculations. However, for matrices of order 3 × 3 and greater, the
extent of these calculations meant that reliability of correct calculation
became problematic, and much of the time working on matrices was spent on
learning algorithms for the necessary computations and then carrying these
out mentally, by hand, or possibly with the assistance of arithmetic or
scientific calculators.
A consequence of this was that only matrices of small orders were used,
and computations involving finding inverses tended to be associated with
more formal rather than practically oriented senior secondary mathematics
courses. The advent from the late 1980s of powerful and readily accessible
mathematically able technology, initially in the form of mini-computers and
later as desk-top computers with software such as spreadsheets, numerical
computation software and CAS, and subsequently hand-held graphics and
CAS calculators, has meant that the computational load associated with
matrix work can be carried out by these technologies. Thus, without the time
and reliability constraints imposed by access only to mental and/or by hand
arithmetic calculation, senior secondary mathematics students across a broad
range of courses can utilise matrices, including those of higher orders, for a
variety of purposes.
The efficient and effective use of these technologies requires students to
have a sound conceptual understanding of key matrix definitions (such as
order, row, column, types of matrices) and conditions for computations (such
as conformability, existence of inverses), as well as practical mental and by
hand facility with computation in simple cases so that they can understand
what computations are taking place, verify the reasonableness of results, and
anticipate the nature of these results to verify their mathematical working.
While matrices continue to have a strong role in providing a unifying abstract
structure in contemporary senior secondary mathematics curricula, their


instrumental value in practical modelling and applications can be enhanced in
conjunction with the use of modern technology (see, for example, Kissane, 1997;
Garner, McNamara & Moya, 2004). Although purpose-designed computational
software such as MATLAB is used in complicated real-life applications where
high-speed computations involve matrices of very high orders, general CAS such as
Derive, Maple and Mathematica and spreadsheets also have powerful matrix
functionalities. Student versions of these general CAS, or graphics and CAS
calculators can be used by students for examples suitable for senior secondary
mathematics courses. The author has used the CAS software Derive (which also
underpins computation in the TI-89, TI-92 and Voyage 200 series of hand-held
CAS calculators) for matrix computations in this text. More recent hand-held
technology, such as the CASIO ClassPad 300 series, the TI-nspire CAS+ and the
HP 50G, has very user-friendly template-based matrix functionality.
SUMMARY

• Various data can be collected and recorded using sets (unordered
  lists), vectors (ordered one-dimensional lists) and matrices (ordered
  two-dimensional lists of lists).
• Matrices are ordered rectangular arrays, or lists of equal-sized lists,
  that are constructed by arranging data in rows and columns.
  Rectangular tables are simple examples of matrices.
• The origins of matrices can be found in Chinese mathematical
  instructional texts from around 200 BCE, with application to the
  solution of systems of simultaneous linear equations.
• In Europe, matrices arose from Gauss's work in the early 19th
  century on solving systems of simultaneous linear equations. This
  work provides a general method (known as Gaussian elimination).
• The term matrix was first applied to a rectangular array of numbers
  by Sylvester in 1850.
• Gaussian elimination, and related methods, form the basis for
  algorithms used by modern technology to solve systems of
  simultaneous linear equations.
• The idea of a determinant arose independently in both Japan and
  Europe in the latter part of the 17th century.
• The modern theory of matrices and determinants was developed
  significantly in the latter part of the 19th century and the early 20th
  century.


SUMMARY (cont.)

• Matrices began to be incorporated in senior secondary mathematics
  curricula from the 1960s, as discrete mathematics began to have a
  more significant role in curriculum design.
• Matrices have been used in the senior secondary mathematics
  curriculum to:
  - solve simple systems of simultaneous linear equations
  - apply transformations of the plane to sets of points
  - solve practical problems in which information can be modelled
    and manipulated using matrices and matrix operations
  - analyse transition states, such as in simple Markov chains
  - provide an example of an abstract mathematical structure
  - model complex numbers and vectors.
• From the 1960s through to the early 1980s, calculation of matrix
  operations in the senior secondary mathematics curriculum was generally
  carried out by hand (possibly with the assistance of a scientific
  calculator) and hence was restricted to applications involving
  matrices of low order.
• Access to spreadsheets, numeric processors, graphics and CAS
  calculators, and CAS software as enabling technology in the senior
  secondary mathematics curriculum from the late 1980s has
  permitted a broader range of applications involving higher order
  matrices to be addressed.

References
Adler, I 1958, The new mathematics, John Day, New York.
Allendoerfer, CB & Oakley, CO 1955, Principles of mathematics, McGraw-Hill, New
York.
Cameron, N, Clements, K, Green, LJ & Smith, GC 1970, School Mathematics
Research Foundation: Pure mathematics, Cheshire, Melbourne.
Fitzpatrick, JB & Galbraith, P 1971, Pure mathematics, Jacaranda, Milton, Queensland.
Fraleigh, JB & Beauregard, AR 1995, Linear algebra (3rd edition with historical notes
by VJ Katz), Addison-Wesley, Reading, MA.
Garner, S, McNamara, A & Moya, F 2004, CAS analysis supplement for
Mathematical Methods CAS Units 3 and 4, Pearson, Melbourne.
Grattan-Guinness, I (ed.) 1994, Companion encyclopedia of the history and
philosophy of the mathematical sciences, volume 1, Routledge, London.

MathsWorks for Teachers


Matrices
Hodgson, B & Patterson, J 1990, Change and approximation, Jacaranda, Melbourne.
Hoe, J 1980, Zhu Shijie and his jade mirror of the four unknowns, in JN Crossley
(ed.), First Australian conference on the history of mathematics: Proceedings,
Monash University, Clayton.
Holton, DA & Pye, W 1976, Aggregating algebra, Holt, Rinehart and Winston,
Sydney.
Kangshen, S, Crossley, JN & Lun, A 1999, The nine chapters on the mathematical
art: Companion and commentary, Oxford University Press, Oxford.
Katz, VJ 1993, A history of mathematics: An introduction, Harper-Collins,
Reading, MA.
Kemeny, JG, Schleifer, A, Snell, JL & Thompson, GC 1972, Finite mathematics with
business applications, Prentice-Hall, Englewood Cliffs, NJ.
Kissane, B 1997, More mathematics with a graphics calculator, Mathematical
Association of Western Australia, Claremont.

Websites
http://www.ualr.edu/lasmoller/matrices.html History Department, University of
Arkansas at Little Rock
This website includes a concise overview of the history of matrices with a good
list of links to related sites, including online applications for matrix computation.
http://www-groups.dcs.st-and.ac.uk/~history/HistTopics/Matrices_and_determinants.html
School of Mathematics and Statistics, University of St Andrews, Scotland
This website contains a comprehensive historical coverage of matrices,
determinants and related mathematics from ancient Babylonia and China
through to the modern era, with extensive cross references to key
mathematicians and related topics.


Chapter 2

Rectangular arrays, matrices and operations
The purpose of this chapter is to provide some practical contexts to motivate
the definition of matrices and their features, to introduce several operations
on matrices, and to discuss their properties. When a new mathematical
structure is introduced, students often need to explore the rationale behind
this process, and find a concrete model for the elements of the structure and
related operations defined on these elements that they can interpret in
context. They can then subsequently explore certain generalisations or
specialisations related to this structure as they become more confident with its
elements and operations in their own right.
There are many different ways in which arrays can be made in two dimensions, typically based on regular geometric shapes such as polygons or
circles. Such arrays can be devised by placing a finite set of physical objects
according to some established procedure or frame of reference. In texts,
depending on the culture, rows, such as in English (left to right), or in Arabic
(right to left) or columns, such as in Chinese (top to bottom) are used for
writing in a given direction. Rows and columns can be used together to cross
reference information in texts, and a rectangular table is a convenient way of
doing this. Information is frequently stored in such arrays, and conventions are
required as to how information is to be read from these tables, or when and
how information from such tables may be combined or otherwise manipulated.
In many contexts the information stored in specific locations within a
rectangular array is numerical. For example, suppose we record the number of
coins of the denominations 5 cents, 10 cents, 20 cents, 50 cents, one dollar and
two dollars, in that order, of spare change that several individuals Michael,
Jay, Sam and Lin, also in that order, collect over a week. We might write these
in an ordered list:
{Michael{4, 0, 1, 3, 2, 2}; Jay{5, 1, 0, 0, 4, 2}; Sam{0, 0, 0, 4, 3, 0}
and Lin{10, 4, 6, 0, 0, 1}}



Alternatively, this information could be displayed in a table:

            5 cent   10 cent   20 cent   50 cent   $1.00   $2.00
Michael          4         0         1         3       2       2
Jay              5         1         0         0       4       2
Sam              0         0         0         4       3       0
Lin             10         4         6         0       0       1

The corresponding more compact rectangular array of numbers is:

    [ 4  0  1  3  2  2]
    [ 5  1  0  0  4  2]
    [ 0  0  0  4  3  0]
    [10  4  6  0  0  1]
Similarly, we might consider a business that has two outlets and sells only
three products, A, B and C. We can represent the numbers of items of each
product held in stock by each outlet in a table:

            Product A   Product B   Product C
Outlet 1          150          40          10
Outlet 2           70          20          10

The rectangular array of numbers representing current stock is

    [150  40  10]
    [ 70  20  10]

The entry in each position corresponds to the number of products of a certain
type in stock at a particular outlet. This is an example of a matrix: a simple
rectangular array of numbers. Particular pieces of information can be obtained
by reference to the row and column in which the desired information is
located. For example, the number of items of Product C at Outlet 1, which is
10, is found in the first row, third column. It is common to designate matrices
by capital letters, so this matrix could be called S (the stock matrix for this
context) where:

    S = [150  40  10]
        [ 70  20  10]


The piece of information discussed earlier, the number of items of Product
C at Outlet 1, can be referred to as s₁₃, where use of the lowercase s indicates
an element of the matrix S, and the subscript 13 indicates this element is
found in the first row and third column of the matrix S. This matrix has two
rows and three columns, and this is summarised by saying that S is a matrix
of size (order or dimension) 2 by 3, or alternatively a 2 × 3 matrix.
Teachers can then use students' intuitive understanding of the practical context to provide natural definitions, and related conditions, for the processes of addition and subtraction of matrices, scalar multiples of a matrix and the product of matrices.
For example, if a sale is made, say, of two items of Product B from Outlet 1, this can be represented by the matrix

[0  2  0]
[0  0  0]

and we can easily update the stock record by subtracting this matrix from the current stock matrix:

[150  40  10]   [0  2  0]   [150  38  10]
[ 70  20  10] - [0  0  0] = [ 70  20  10]
Similarly, if new stock is delivered, say, 10 items of Product A and 5 items of Product B to each of the outlets, this additional stock can be represented by the matrix

[10  5  0]
[10  5  0]

and the total stock of each type can be determined by matrix addition:

[150  38  10]   [10  5  0]   [160  43  10]
[ 70  20  10] + [10  5  0] = [ 80  25  10]
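These two stock updates can be mirrored with technology. The following is a minimal plain-Python sketch (the `elementwise` helper and the variable names are ours, not the book's), treating each matrix as a list of rows:

```python
# Update a stock matrix by component-wise subtraction (sales) and
# addition (deliveries); each matrix is stored as a list of rows.

def elementwise(op, A, B):
    """Combine same-size matrices entry by entry using op."""
    return [[op(a, b) for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

stock = [[150, 40, 10], [70, 20, 10]]
sale = [[0, 2, 0], [0, 0, 0]]
delivery = [[10, 5, 0], [10, 5, 0]]

stock = elementwise(lambda a, b: a - b, stock, sale)
print(stock)   # [[150, 38, 10], [70, 20, 10]]
stock = elementwise(lambda a, b: a + b, stock, delivery)
print(stock)   # [[160, 43, 10], [80, 25, 10]]
```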
In this sort of context it is usual to have a periodic valuation of stock. For example, it might be the case that at the end of each month the business accountant values the stock held at each outlet. Suppose Product A costs $50 per item, Product B $30 per item and Product C $80 per item. This can be represented by the cost matrix

[50]
[30]
[80]

and we can easily calculate the value of the stock held by multiplication of the two matrices as follows:
[160  43  10]   [50]   [160 × 50 + 43 × 30 + 10 × 80]   [10090]
[ 80  25  10] × [30] = [ 80 × 50 + 25 × 30 + 10 × 80] = [ 5550]
                [80]
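This valuation can be checked with a short plain-Python sketch (the `mat_vec` helper is illustrative, not from the book): each entry of the result is the linear combination of a row of the stock matrix with the cost column.

```python
# Value the stock at each outlet: a 2 x 3 stock matrix times a
# 3 x 1 cost column, computed row by row as a linear combination.

def mat_vec(matrix, vector):
    """Multiply a matrix (list of rows) by a column vector (flat list)."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]

stock = [[160, 43, 10],
         [80, 25, 10]]
costs = [50, 30, 80]          # dollars per item of Products A, B, C

print(mat_vec(stock, costs))  # [10090, 5550]
```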


It should be noted that this form of product is essentially based on a linear combination of product numbers multiplied by their corresponding cost.
Students will likely wonder at some stage why multiplication of matrices is
not defined in terms of the corresponding arithmetic operation between
elements in the same position in two matrices of the same order, as is the case
for addition and subtraction of matrices. Practical examples such as the stock
context provide a basis for understanding why the linear combination
definition is used.
If at some time there is a 10% tax added to the cost of items, then the
matrix of costs can easily be updated by multiplying the cost matrix by
100%+10%=110% or a factor of 1.1:
1.1 × [50]   [1.1 × 50]   [55]
      [30] = [1.1 × 30] = [33]
      [80]   [1.1 × 80]   [88]
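The tax update is a scalar multiple of the cost matrix, which is a one-line operation in code. A plain-Python sketch (helper name ours; fractions keep the arithmetic exact rather than using a floating-point 1.1):

```python
# Apply a 10% tax: the scalar multiple (11/10) times the cost matrix.
from fractions import Fraction

def scalar_mul(k, matrix):
    """Multiply every entry of a matrix (list of rows) by the scalar k."""
    return [[k * entry for entry in row] for row in matrix]

costs = [[50], [30], [80]]                 # column matrix of costs
taxed = scalar_mul(Fraction(11, 10), costs)
print(taxed)                               # 55, 33 and 88 dollars
```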
The role of the scalar multiple can also be introduced through consideration
of repeated addition, such as S+S=2S, S+2S=3S, S+3S=4S and so
on; however, this has the limitation of restricting the scalar multiple to a
whole number.
It is important to distinguish between this type of multiplication, a scalar
multiple of a matrix, and the product of two matrices.
This simple example (see also VCAA 2005, 57-60, 149-50) demonstrates the use of matrices, and the operations of matrix addition, matrix multiplication and scalar multiplication of a matrix, in a practical context where the natural, or intuitive, interpretation of the processes being used motivates and models the development of matrix operations. Another context which is accessible to senior secondary school students is that of scoring for events in house or other sporting competitions. However, this approach also means that the matrices used are, by definition, conformable for the operation that is to be applied, and their order is known prior to any related calculations.
To deal with matrices in their own right (that is, as reified objects),
independently of a particular modelling context, and to explore their general
properties it is necessary to be able to describe and work with them abstractly.

Student activity 2.1

a  Use a suitable matrix product to calculate the total amount of change held by each of Michael, Jay, Sam and Lin in the given week.
b  If the Australian-US dollar exchange rate is A$1 = US$0.76, use a suitable scalar multiple of the matrix in part a to find the equivalent value of their change in US dollars.

Definition of a matrix

An m×n matrix, A, is a rectangular array of numbers with m rows and n columns. We say A is of order, dimension or size, m by n, and write m×n as shorthand for this. This does not mean that we wish to calculate the corresponding arithmetic product, although this would tell us the total number of elements in matrix A. Unfortunately it is not very helpful to know this, as many matrices can have the same total number of elements.
As in our practical example, the position of each entry, or element, in the matrix is uniquely determined by its row and column numbers. Thus, we write

    [ a11  a12  ...  a1n ]
A = [ a21  a22  ...  a2n ]
    [  :    :         :  ]
    [ am1  am2  ...  amn ]

where the entry in the ith row and jth column, called the (i, j) entry of A, is denoted aij. In this case the letters i and j are index variables denoting position, where i runs through 1 to m, that is 1 ≤ i ≤ m, and j runs through 1 to n, that is 1 ≤ j ≤ n.
There are various notations that can be used for matrices. In this text we will use square brackets to enclose the entries of a matrix. Curved brackets are also used; however, it is conventional to use only one notation in a given context. Matrices are designated using upper case letters, and the entries in a matrix are identified with the corresponding lower case letter and subscript indices indicating their position; thus, we sometimes write A = [aij] where, as above, i is the row index and j the column index. As before, aij is the entry in the ith row and jth column of A, and the ranges of i and j are understood to be those given by the order of the matrix A.


Two special cases of note are that an m×1 matrix is usually called a column matrix or column vector, while a 1×n matrix is usually called a row matrix or row vector. If a rectangular array is not available for visual display, then a matrix can be written as a list of lists of equal size, where a list is an ordered set. For example, the matrix

[ 4   0    2 ]
[ 1   1    1 ]
[-5  10    3 ]
[ 0   0  3.4 ]

is the 4×3 matrix uniquely defined by the list {{4, 0, 2}, {1, 1, 1}, {-5, 10, 3}, {0, 0, 3.4}}.
When technology is used, the data to specify a matrix is either entered into a template of a specified size (where the dimensions of the matrix need to be specified first to obtain the desired template), or as a list of lists.
Example 2.1

If A = [150  40  10]
       [ 70  20  10]

then A is a 2×3 matrix, where a21 = 70 and a12 = 40.

If B = [50]
       [30]
       [80]

then B is a 3×1 column matrix, or column vector, where b11 = 50, b21 = 30 and b31 = 80.

If C = [1  -2  4], then C is a 1×3 row matrix, or row vector, where c11 = 1, c12 = -2 and c13 = 4.

Operations on matrices
As we have seen in the earlier practical example, matrices may be added, subtracted, multiplied by a number (scalar), or multiplied by matrices. Some of these operations are not always possible; the sizes, or orders, of the matrices involved are important. That is to say, there are conditions to which two matrices need to conform for their sum, difference or product to be defined, or for them to be conformable for that operation. In practice, general computation with matrices of high order is carried out by technology, and the algorithms used by various programs to carry out these computations need the orders of the matrices involved and definitions for the relevant processes in terms of elements and their indices. Since the various operations on matrices are defined in terms of their elements, it is important to note that these elements, and any scalars which may also be involved, are usually regarded as being drawn from some field, often the real number field, R. Thus, the operations of addition, subtraction and multiplication defined on elements of matrices are the natural operations of the relevant field.
An interesting exercise for teachers to work through with students is to
devise programs using basic programming constructs in a suitable high-level
programming language that carry out the operations of matrix arithmetic.

Addition and subtraction of two matrices
If matrices A and B are of the same size m×n, then A + B is the m×n matrix with (i, j) entry aij + bij, for i = 1 to m, j = 1 to n. That is, A + B = [aij] + [bij] = [aij + bij]. In other words, we simply add all the entries in their corresponding positions throughout the matrix.
Subtraction can be defined in the same way, and A - B = [aij] - [bij] = [aij - bij]. In other words, we simply subtract all the entries in matrix B from their corresponding entries in matrix A.
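The component-wise definition translates directly into code. A plain-Python sketch (the helper names are ours), using the matrices of the worked example that follows:

```python
# Component-wise matrix addition and subtraction for same-size matrices,
# following the definitions A + B = [aij + bij] and A - B = [aij - bij].

def mat_add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def mat_sub(A, B):
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

A = [[1, -2], [3, 5], [-4, 1]]
B = [[7, 12], [-3, 1], [-1, 2]]
print(mat_add(A, B))  # [[8, 10], [0, 6], [-5, 3]]
print(mat_sub(A, B))  # [[-6, -14], [6, 4], [-3, -1]]
```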
Example 2.2

[ 1  -2]   [ 7  12]   [ 1 + 7       (-2) + 12]   [ 8  10]
[ 3   5] + [-3   1] = [ 3 + (-3)     5 + 1   ] = [ 0   6]
[-4   1]   [-1   2]   [(-4) + (-1)   1 + 2   ]   [-5   3]

[ 1  -2]   [ 7  12]   [ 1 - 7       (-2) - 12]   [-6  -14]
[ 3   5] - [-3   1] = [ 3 - (-3)     5 - 1   ] = [ 6    4]
[-4   1]   [-1   2]   [(-4) - (-1)   1 - 2   ]   [-3   -1]

Multiplication by a number (scalar multiple)

Given a matrix A of size m×n and a number (scalar) k, then kA is the m×n matrix with (i, j) entry kaij for i = 1 to m and j = 1 to n. That is, if A = [aij] and k is a scalar, then kA = k[aij] = [kaij].

Example 2.3

If k = 3 and A = [ 2  -2]
                 [ 3   7]
                 [-1   5]

then

3A = A + (A + A) = 3 [ 2  -2]   [ 6  -6]
                     [ 3   7] = [ 9  21]
                     [-1   5]   [-3  15]
Note that subtraction can also be expressed in terms of a scalar multiple, with k = -1, and the addition operation. If A and B have the same size, then

A - B = A + (-B)

with entries the sums aij + (-bij) of the corresponding entries in A and (-B).
Example 2.4

[ 1  -2]   [ 7  12]   [ 1 + (-7)   -2 + (-12)]   [-6  -14]
[ 3   5] - [-3   1] = [ 3 + 3       5 + (-1) ] = [ 6    4]
[-4   1]   [-1   2]   [-4 + 1       1 + (-2) ]   [-3   -1]

There is a special matrix, called the zero matrix O=[oij] where oij=0 for
all i and j. For any matrix A, A - A = O .
Example 2.5

If A = [1  -2   3]  and  B = [ 2  4  -1]
       [4   2  -1]           [-1  3   2]

then

i   A + B = [1  -2   3] + [ 2  4  -1]
            [4   2  -1]   [-1  3   2]

          = [1 + 2      (-2) + 4    3 + (-1)]   [3  2  2]
            [4 + (-1)    2 + 3     (-1) + 2 ] = [3  5  1]

ii  A - B = [1  -2   3] - [ 2  4  -1]
            [4   2  -1]   [-1  3   2]

          = [1 - 2      (-2) - 4    3 - (-1)]   [-1  -6   4]
            [4 - (-1)    2 - 3     (-1) - 2 ] = [ 5  -1  -3]

iii 2A - 3B = 2 [1  -2   3] - 3 [ 2  4  -1]
                [4   2  -1]     [-1  3   2]

            = [2  -4   6] - [ 6  12  -3]
              [8   4  -2]   [-3   9   6]

            = [2 - 6      -4 - 12    6 - (-3)]   [-4  -16   9]
              [8 - (-3)    4 - 9    -2 - 6   ] = [11   -5  -8]

iv  A - A = [1  -2   3] - [1  -2   3]   [0  0  0]
            [4   2  -1]   [4   2  -1] = [0  0  0]

Structure properties of matrix addition and scalar multiplication

Let A, B and C be any matrices of a given size m×n, where m and n are non-zero. Then:
1  the sum of any two such matrices is always defined  (Closure property for addition)
2  (A + B) + C = A + (B + C)  (Associative property for addition)
3  A + O = A = O + A  (Identity property for addition)
4  A + (-A) = O = (-A) + A  (Inverse property for addition)
5  A + B = B + A  (Commutative property for addition)
This collection of properties can be established by working from the
general definition of matrix addition as applied to the matrices A=[aij ],
B=[bij ], C=[cij ] and O=[oij ] of the same order. It may be helpful to have
students undertake some general case calculations for matrices of a given
order, for example 2×3 matrices. The results may appear to be obvious, or
even trivial to students; however, care should be taken to draw to their
attention that they do apply to matrices drawn from the same set of a given
order (for any order) by virtue of the component-wise definition of addition of
matrices, and the corresponding number properties of their elements and the
corresponding number operations. This may be summarised by saying that


such a set of matrices forms a commutative (or abelian) group under addition.
If we also consider multiplication of a matrix from this set by a scalar (scalar
multiple), then for any scalars r and s the following properties also hold:
1  r(sA) = (rs)A  (Associative property of scalar multiples)
2  (r + s)A = rA + sA  (Right distributive property of scalar multiple over scalar addition)
3  r(A + B) = rA + rB  (Left distributive property of scalar multiple over matrix addition)
Again, it may be useful for students to consider the general case for
matrices of a given order. Taken together, these properties show that such a set
of matrices with these operations of addition and scalar multiple form what is
called a vector space.

Matrix multiplication

As observed from the introductory example, the product matrix AB, or product of two matrices, can only be defined when the number of columns in matrix A is equal to the number of rows in matrix B. Alternatively, this may be expressed by saying that the matrices A and B are conformable for the product AB when the number of elements in the rows of matrix A is the same as the number of elements in the columns of matrix B.
If A = [aij] is an m×p matrix, and B = [bij] is a p×n matrix, then AB = C = [cij] is an m×n matrix with (i, j) entry the number

cij = ai1 b1j + ai2 b2j + ... + aip bpj

(the sum of aik bkj over k = 1 to p). That is, the (i, j) entry of the product is obtained by multiplying each of the entries in the ith row of A by the corresponding entries in the jth column of B, and then adding all these products.
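The row-by-column sum translates into a triple loop. A plain-Python sketch (the `mat_mul` helper is ours), checked against the matrices used in the next worked example:

```python
# The (i, j) entry of AB is the sum over k of A[i][k] * B[k][j],
# exactly as in the definition: an m x p times p x n product.

def mat_mul(A, B):
    m, p, n = len(A), len(B), len(B[0])
    assert len(A[0]) == p, "A's columns must equal B's rows"
    return [[sum(A[i][k] * B[k][j] for k in range(p)) for j in range(n)]
            for i in range(m)]

A = [[1, -2, 3], [4, 2, -1]]        # 2 x 3
B = [[-1, 2], [-2, 3], [1, 4]]      # 3 x 2
print(mat_mul(A, B))                # [[6, 8], [-9, 10]]
```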

Example 2.6

a  If A = [1  -2   3]  and  B = [ 2]
          [4   2  -1]           [-1]
                                [ 3]

   then

   AB = [1  -2   3] [ 2]   [1 × 2 + (-2) × (-1) + 3 × 3]   [13]
        [4   2  -1] [-1] = [4 × 2 + 2 × (-1) + (-1) × 3] = [ 3]
                    [ 3]

   Note that we cannot form the product BA, since the number of columns of B is not equal to the number of rows of A.

b  If A = [1  -2   3]  and  B = [-1  2]
          [4   2  -1]           [-2  3]
                                [ 1  4]

   then

   AB = [1  -2   3] [-1  2]
        [4   2  -1] [-2  3]
                    [ 1  4]

      = [1 × (-1) + (-2) × (-2) + 3 × 1    1 × 2 + (-2) × 3 + 3 × 4]
        [4 × (-1) + 2 × (-2) + (-1) × 1    4 × 2 + 2 × 3 + (-1) × 4]

      = [ 6   8]
        [-9  10]

   and

   BA = [-1  2] [1  -2   3]
        [-2  3] [4   2  -1]
        [ 1  4]

      = [(-1) × 1 + 2 × 4    (-1) × (-2) + 2 × 2    (-1) × 3 + 2 × (-1)]
        [(-2) × 1 + 3 × 4    (-2) × (-2) + 3 × 2    (-2) × 3 + 3 × (-1)]
        [ 1 × 1 + 4 × 4       1 × (-2) + 4 × 2       1 × 3 + 4 × (-1)  ]

      = [ 7   6  -5]
        [10  10  -9]
        [17   6  -1]

So we have the situation in Example 2.6 that AB is a 2×2 matrix and BA is a 3×3 matrix. While both products are defined, in this case they are not equal, since the product matrices are of different size (order). Thus, if A and B are arbitrary matrices and the product AB is defined, it may be the case that the product BA is either not defined or, if it is defined, it may not be of the same size as AB. It is important to point out this aspect of matrix multiplication to students at an early stage, and also that for two matrices each of a given order m×n where m and n are different, neither product will be defined. These observations motivate consideration of conditions under which matrix multiplication might be generally defined for a given set of matrices, and also those circumstances under which addition might be generally defined for the same set of matrices.
A matrix of size m×n where m = n is called a square matrix or an n×n matrix. If A and B are two square matrices of the same size, then we can form the products AB and BA by definition, since the numbers of rows and columns in both matrices are equal. This also ensures that addition is defined on these matrices, and we know addition is commutative anyway. However, it is not so clear for multiplication of square matrices whether AB is the same as BA or not. Consider a particular example for two 2×2 matrices such as

A = [ 2  3]  and  B = [-1  2]
    [-1  5]           [ 2  3]

AB = [ 2  3] [-1  2]   [2 × (-1) + 3 × 2        2 × 2 + 3 × 3    ]
     [-1  5] [ 2  3] = [(-1) × (-1) + 5 × 2    (-1) × 2 + 5 × 3  ]

   = [ 4  13]
     [11  13]

and

BA = [-1  2] [ 2  3]   [(-1) × 2 + 2 × (-1)    (-1) × 3 + 2 × 5  ]
     [ 2  3] [-1  5] = [2 × 2 + 3 × (-1)        2 × 3 + 3 × 5    ]

   = [-4   7]
     [ 1  21]
Clearly, AB ≠ BA in this case. Inspection of a range of other examples will generally show that it is not the case that AB is the same as BA. This is also the case for square matrices of other orders, a situation which students can readily investigate using suitable technology. They will readily develop a collection of cases which show that multiplication of square matrices is, in general, not commutative. A related investigation is to see if students can identify examples, and then sets, of square matrices for which the product is commutative, for example:

[1  0]  and  [3  0]
[0  2]       [0  6]

There are some important situations in which matrix products are commutative, for example when they are used to represent and compose certain types of transformations of the cartesian plane, such as rotations about the origin.
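Students investigating commutativity with technology can check both cases in a few lines. A plain-Python sketch using the non-commuting pair above and the commuting diagonal pair (the `mat_mul` helper is ours and is repeated here so the sketch stands alone):

```python
# Compare AB with BA: the 2 x 2 pair from the text does not commute,
# while the two diagonal matrices do.

def mat_mul(A, B):
    # zip(*B) yields the columns of B, so each result row pairs a row
    # of A with every column of B.
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

A = [[2, 3], [-1, 5]]
B = [[-1, 2], [2, 3]]
print(mat_mul(A, B))   # [[4, 13], [11, 13]]
print(mat_mul(B, A))   # [[-4, 7], [1, 21]]

D1 = [[1, 0], [0, 2]]
D2 = [[3, 0], [0, 6]]
print(mat_mul(D1, D2) == mat_mul(D2, D1))  # True
```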


Teachers may also wish to consider such arguments from the general definition of matrix multiplication for the case of, for example, 2×2 square matrices, and consideration of the equality of two matrices. Thus, if

A = [a  b]  and  B = [e  f]
    [c  d]           [g  h]

then

AB = [ae + bg   af + bh]
     [ce + dg   cf + dh]

and

BA = [ea + fc   eb + fd]
     [ga + hc   gb + hd]

If the elements of A and B are real numbers, then AB = BA when bg = fc, af + bh = eb + fd, and ce + dg = ga + hc.
Properties of matrix multiplication

If A, B and C are matrices of appropriate sizes, and k is a scalar, then:
1  A(B + C) = AB + AC  (Distributive property of left multiplication over addition)
2  (B + C)A = BA + CA  (Distributive property of right multiplication over addition)
3  (AB)C = A(BC)  (Associative property of multiplication)
4  k(AB) = (kA)B
5  AB ≠ BA in general

Zero and identity matrices

A matrix with all entries zero is called the zero matrix (of appropriate size), and denoted O. A square matrix of size n×n with all entries zero except the diagonal entries, that is, those in position (j, j) for j = 1 to n, which are all 1, is called the identity matrix of size n×n, denoted I. When the size (order) of matrices being considered is fixed, the symbols O and I can be used without ambiguity; otherwise the notation Om,n and In,n, or just On (when m = n) and In, can be used, as applicable.

Example 2.7

O3,2 = [0  0]  is the zero matrix of size 3×2.
       [0  0]
       [0  0]

I2,2 = [1  0]  is the identity matrix of size 2×2.
       [0  1]

I3 = [1  0  0]  is the identity matrix of size 3×3.
     [0  1  0]
     [0  0  1]
Zero and identity matrices have properties similar to the numbers 0 and 1 with respect to addition and multiplication. Students should be able to convince themselves that if A is a matrix and O is the zero matrix of the same size, then A + O = A = O + A and A + (-A) = O = (-A) + A, and hence A - A = O. Similarly, if A is an m×n matrix and I is the identity matrix of size n×n, then AI = A, and if B is an n×m matrix then IB = B. If A is also an n×n matrix, then students should also be able to observe that AI = A = IA. Initially this may be tested by a judicious range of examples, and subsequently argued in terms of the general definitions of the relevant operations.
Exploration of these operations and their conformable computations for various combinations of matrices will enable students to see matrices as rectangular arrays that can be thought of as a list of lists of the same size, as a rectangular array, and as abstract reified objects in their own right. They should also be aware that while particular computations involving matrices of relatively low order can be carried out fairly readily (if somewhat tediously) by hand, more complex computations and/or computations involving matrices of higher order are generally best carried out by technology designed for this purpose.
However, students should also be aware that general analysis of matrix operations is likely to involve component-wise operations with numbers using indexed sets of sums and/or products of these numbers. For example, students might be asked to show that for square matrices of a given size AO = O = OA, but AB = O does not necessarily imply that A = O or B = O.

Student activity 2.2

a  Given A = [ 1  3],  B = [2  -1],  C = [2   4   6]
             [-1  2]       [2   4]       [8  10  12]

   calculate:
   i    2A
   ii   A + B
   iii  A - B
   iv   AB
   v    BA
   vi   AC

b  Let M = [4  1]  and  P = [p]
           [8  6]           [q]

   where p and q are non-zero real numbers. Find all real values of the scalar k such that MP = kP.

c  Let A = [1  1]
           [1  1]

   Evaluate A^n for n = 2 to 5 and find a general form for n > 1.

d  Show that for square matrices of a given size AO = O = OA, but AB = O does not necessarily imply that A = O or B = O.

e  Let J = [0  -1]
           [1   0]

   and I be the identity matrix for multiplication for 2×2 matrices. Show that J^2 = -I and that J^4 = I.

f  Explain why, if X and Y are 2×2 matrices, then, in general, X^2 - Y^2 ≠ (X + Y)(X - Y), and illustrate this with a suitable (counter) example. Find two matrices X and Y for which this relationship is true.

The transpose of a matrix

It is a natural question to ask what happens if a matrix is written the other way round. The transpose of a matrix is obtained by interchanging its rows and columns. That is, the entries of the ith row become the entries of the ith column. So, if A = [aij] is an m×n matrix, then its transpose, denoted AT = [aTij], is an n×m matrix, with aTij = aji.
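The row-column interchange is a one-line operation in Python, since `zip(*A)` yields the columns of A. A minimal sketch (helper name ours), checked against the example that follows:

```python
# The transpose swaps rows and columns: entry (i, j) of the transpose
# is entry (j, i) of A. zip(*A) yields A's columns, which become rows.

def transpose(A):
    return [list(col) for col in zip(*A)]

A = [[1, 3], [2, 4], [5, 8]]        # 3 x 2
print(transpose(A))                  # [[1, 2, 5], [3, 4, 8]]
```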
Example 2.8

If A is the 3×2 matrix [1  3]
                       [2  4]
                       [5  8]

then AT is the 2×3 matrix [1  2  5]
                          [3  4  8]


Property of matrix transposes

If A and B are matrices such that we can form the product AB, then

(AB)T = BT AT

We illustrate this with an example.

Example 2.9

Let A = [1  -2]  and  B = [2  -1  4]
        [3   0]           [0   1  3]

Then AB = [1  -2] [2  -1  4]   [2  -3  -2]
          [3   0] [0   1  3] = [6  -3  12]

so (AB)T = [ 2   6]
           [-3  -3]
           [-2  12]

and BT AT = [ 2  0] [ 1  3]   [ 2   6]
            [-1  1] [-2  0] = [-3  -3]
            [ 4  3]           [-2  12]
Note that we cannot form the product AT BT, since the number of columns
in AT, namely 2, is not equal to the number of rows in BT, namely 3.
Most equations involving matrices can be written in two forms: the original and its equivalent using transposes. For example, the matrix equation

[160  43  10]   [50]   [10090]
[ 80  25  10] × [30] = [ 5550]
                [80]

that we have seen earlier can also be written in transposed form:

             [160  80]
[50  30  80] [ 43  25] = [10090  5550]
             [ 10  10]
Commonly, matrix equations involving transformations (see Chapter 4)
and transitions (see Chapter 5) may also be written in a form that is the
transpose of the form given in this book.
A symmetric matrix is the same as its transpose. If A=[aij] is a symmetric
matrix, then aij=aji and AT = A. A symmetric matrix must be a square
matrix.

Example 2.10

[1  3]  and  [ 1  -2  4]
[3  2]       [-2   2  0]
             [ 4   0  3]

are symmetric matrices.
A diagonal matrix is a square matrix that has non-zero entries only on the
main diagonal.
Example 2.11

[1  0]  and  [1   0  0]
[0  2]       [0  -2  0]
             [0   0  3]

are diagonal matrices.

The inverse of a matrix

A square n×n matrix A is said to be invertible if there is a matrix C, written as A^-1, of the same size as A, such that AC = I and CA = I, where I is the identity matrix of size n×n. C is said to be the inverse matrix of A.
Not all square matrices have inverses. For example,

[2  4]
[3  6]

does not have an inverse.
Properties of inverses

It is important that students are familiar with some of the key properties of matrices and their inverses:
1  A square matrix has at most one inverse, where A^-1 A = I = A A^-1.
2  If A is invertible, then so is AT, and (AT)^-1 = (A^-1)T.
3  If A is invertible, then so is A^-1, and (A^-1)^-1 = A.
4  If A and B are invertible matrices of the same size, then AB is invertible and (AB)^-1 = B^-1 A^-1.
These properties can be established fairly readily, and provide some good examples to students of proofs that are not lengthy, but are illustrative of important aspects of mathematical reasoning using structural properties such as uniqueness, identity and inverse, definitions and the use of previously established results. Proofs of these properties are as follows:
1  If C and D are both inverses of A, then AC = I and CA = I, and AD = I and DA = I. So C = CI = C(AD) = (CA)D = ID = D, that is, C = D and the inverse A^-1 is unique.
2  A A^-1 = I, so (A A^-1)T = (A^-1)T AT = IT = I. Hence (AT)^-1 = (A^-1)T.
3  A A^-1 = I and A^-1 A = I, so clearly A is the inverse matrix of A^-1.
4  Consider (B^-1 A^-1)(AB). By the associative property for multiplication, this is the same as B^-1(A^-1 A)B = B^-1 I B = B^-1 B = I, by the inverse and identity properties. Similarly, (AB)(B^-1 A^-1) = I. So by 1, since inverses are unique, (AB)^-1 = B^-1 A^-1.
There are many uses of inverse matrices. The following are just a few.
1  The inverse of a matrix can be used for cancellation purposes in matrix equations. If A is invertible, then:
   i   AB = AC implies that B = C (since we can multiply both sides on the left by A^-1).
   ii  BA = CA implies that B = C (since we can multiply both sides on the right by A^-1).
If A is not invertible, then this is not the case. For example, let

A = [2  4],  B = [1  2  1],  C = [5  10  1]
    [3  6]       [3  4  2]       [1   0  2]

Then

AB = [2  4] [1  2  1]   [14  20  10]
     [3  6] [3  4  2] = [21  30  15]

and

AC = [2  4] [5  10  1]   [14  20  10]
     [3  6] [1   0  2] = [21  30  15]

Hence AB = AC and yet B ≠ C.
2  The inverse matrix can be used to solve a system of simultaneous linear equations with a unique solution. Consider the following system of simultaneous equations:

   2x + 3y = 7
   4x + y = 3

   This system can be written in matrix form as AX = B, where

   A = [2  3],  X = [x],  B = [7]
       [4  1]       [y]       [3]

   If A^-1 exists, then we can multiply both sides of the equation on the left by A^-1, and we thus have X = A^-1 B, and so we can find the solution by matrix multiplication.


There are several ways to find the inverse of a matrix. The most useful one is via the Gauss-Jordan method (see, for example, Anton & Rorres, 2005, or Nicholson, 2003, for details). Since we shall generally only be concerned with the inverse of a 2×2 matrix, we will find it in a simple way.
Example 2.12

Find the inverse of A = [a  b]  if it exists.
                        [c  d]

Solution

Suppose [e  f]  is its inverse.
        [g  h]

Then [a  b] [e  f]   [1  0]
     [c  d] [g  h] = [0  1]

Hence:
(i)    ae + bg = 1
(ii)   af + bh = 0
(iii)  ce + dg = 0
(iv)   cf + dh = 1

Considering equation (ii), if we put f = -b and h = a, then the equation is satisfied. Similarly, for equation (iii) we can put e = d and g = -c.
Now consider the matrix product:

[a  b] [ d  -b]   [ad - bc      0   ]
[c  d] [-c   a] = [   0      ad - bc] = (ad - bc)I

and then it is easy to see that

[a  b]^-1                  [ d  -b]
[c  d]     = 1/(ad - bc) × [-c   a]

provided ad - bc ≠ 0, since if BA = kI, where k is a non-zero scalar, then ((1/k)B)A = I, and hence, by definition, (1/k)B = A^-1.

If ad - bc = 0, then the matrix A does not have an inverse. The number ad - bc is called the determinant of A and denoted det(A) or |A|. (For more information on determinants, see the references on linear algebra.)
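The formula just derived, and its use to solve the earlier system 2x + 3y = 7, 4x + y = 3, can be sketched in a few lines of Python (the helper name `inverse_2x2` is ours; fractions keep the arithmetic exact):

```python
# The 2 x 2 inverse formula: swap a and d, negate b and c, and divide
# by the determinant ad - bc.
from fractions import Fraction

def inverse_2x2(A):
    (a, b), (c, d) = A
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is not invertible")
    k = Fraction(1, det)
    return [[k * d, -k * b], [-k * c, k * a]]

A = [[2, 3], [4, 1]]             # coefficient matrix of the system above
b = [7, 3]                       # right-hand side
inv = inverse_2x2(A)
x, y = (sum(r * v for r, v in zip(row, b)) for row in inv)
print(x, y)                      # 1/5 11/5
```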


Inverse of a 2×2 matrix

A 2×2 matrix A = [a  b]
                 [c  d]

has an inverse if and only if ad - bc ≠ 0, and then

A^-1 = 1/(ad - bc) × [ d  -b]
                     [-c   a]

(The reader can check that A A^-1 = A^-1 A = I.)


Note that an n×n matrix A has an inverse if and only if det(A) ≠ 0; see further references on linear algebra for the definition and calculation of such an inverse.
Student activity 2.3

a  Find the inverse of each of the following matrices:

   i    [1  2]
        [3  4]

   ii   [5  3]
        [3  2]

   iii  [2   1]
        [3  -1]

b  Use matrix inverses to solve the following system of simultaneous linear equations:

   2x + 3y = 7
   4x + y = 3

Applications of matrices

Fibonacci's rabbits
Suppose that newborn pairs of rabbits produce no offspring during the first
month of their lives, but then produce one new pair every subsequent month.
Start with F1=1 newborn pair in the first month and determine the number,
Fr=number of pairs in the rth subsequent month, assuming that no rabbit dies.
Since the newborn pair do not produce offspring in the second month, we
have F2=F1=1. In the third month, the original pair will produce one pair of
offspring, so F3=2. In the fourth month, the pairs in the second month will
each produce another pair, so the total will be these newborn pairs added to
the number of pairs from the previous month, that is F4=1+2=3, or
F4=F2+F3.
Continuing in this manner, we see that
F5 =(Number of pairs alive in the third month, which each produce one
pair of offspring in the fifth month)+(Number of pairs alive in the
fourth month)
=F3+F4


and in general that

Fr = (Number of pairs alive in month r - 1) + (Number of pairs alive in the (r - 2)th month, which each produce one pair of offspring in the rth month)
   = Fr-1 + Fr-2

Thus the sequence for the number of pairs of rabbits is

1, 1, 2, 3, 5, 8, 13, 21, 34, ...

which is called the Fibonacci sequence, and has the property that from the third term on, each term is the sum of the preceding two terms in the sequence.
If Fr represents the rth term in this sequence, then

Fr = Fr-1 + Fr-2
We can express this in matrix form:

[ Fr ]   [1  1] [Fr-1]
[Fr-1] = [1  0] [Fr-2]

Then writing

fr = [ Fr ]  for r > 1,  and  A = [1  1]
     [Fr-1]                       [1  0]

we find that fr = A fr-1.
In particular, f3 = A f2, f4 = A f3 = A^2 f2, f5 = A f4 = A^3 f2 and, in general, fr = A^(r-2) f2. So elements of this sequence can be determined by finding powers of the matrix A, and multiplying by the column matrix f2.
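The derivation above can be run directly with technology. A plain-Python sketch (helper names ours) that steps the column f2 forward by repeated multiplication by A:

```python
# Powers of A = [[1, 1], [1, 0]] generate Fibonacci numbers: each
# multiplication by A steps the column (Fr, Fr-1) forward by one month.

def mat_mul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def fib_pair(r):
    """Return the column (Fr, Fr-1) for r > 1 by repeated multiplication."""
    A = [[1, 1], [1, 0]]
    f = [[1], [1]]                 # f2, the column (F2, F1)
    for _ in range(r - 2):
        f = mat_mul(A, f)
    return f

print(fib_pair(8))                 # [[21], [13]], so F8 = 21 and F7 = 13
```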

Student activity 2.4
Find the 29th and 30th numbers in the Fibonacci sequence.

Codes
Governments, national security agencies, telecommunications providers,
banks and other companies are often interested in the transmission of coded
messages that are hard to decode by others, if intercepted, yet easily decoded
at the receiving end. There are many interesting ways of coding a message,
most of which use number theory or linear algebra. We will discuss one that
is effective, especially when a large-size invertible matrix is used.
At a personal level, bank accounts, credit cards, superannuation funds, computer networks, phone companies, frequent flyer schemes, electronic motorway toll systems and other commercial enterprises (such as online bookstores) require PINs or passwords to be able to do a transaction over the phone, at a machine or over the Internet. Banks now warn people not to write down the PIN/password at all, but to remember it. In some cases, people record PINs/passwords disguised as phone numbers, but this is not very secure and people are encouraged not to do it. Nowadays, a person usually has many different PINs/passwords, which should be changed regularly. It is impossible to remember them all. However, it would be possible to write down the information in a coded form using the methods described below.
Let us start out with an invertible matrix M that is known only to the transmitting and receiving ends. For example:

M = [-3  4]
    [-1  2]

Suppose we want to code the message

LEAVE NOW

We make a table of letters of the alphabet with the number corresponding to the position of the letter in the alphabet under it. We use 0 for an empty space.
space   A   B   C   D   E   F   G   H   I   J   K   L   M
  0     1   2   3   4   5   6   7   8   9  10  11  12  13

  N   O   P   Q   R   S   T   U   V   W   X   Y   Z
 14  15  16  17  18  19  20  21  22  23  24  25  26

We can use this table to replace each letter with the number that corresponds to the letter's position in the alphabet:

 L   E   A   V   E  space   N   O   W  space
12   5   1  22   5    0    14  15  23    0

The message has now been converted into the sequence of numbers
12, 5, 1, 22, 5, 0, 14, 15, 23, 0, which we group as a sequence of column vectors:

[ 12 ]  [  1 ]  [ 5 ]  [ 14 ]  [ 23 ]
[  5 ], [ 22 ], [ 0 ], [ 15 ], [  0 ]

and multiply on the left by M:

M [ 12 ] = [ -16 ],  M [  1 ] = [ 85 ],  M [ 5 ] = [ -15 ],
  [  5 ]   [  -2 ]     [ 22 ]   [ 43 ]     [ 0 ]   [  -5 ]

M [ 14 ] = [ 18 ],  M [ 23 ] = [ -69 ]
  [ 15 ]   [ 16 ]     [  0 ]   [ -23 ]

giving the sequence of numbers -16, -2, 85, 43, -15, -5, 18, 16, -69, -23. This is
the coded message. Note that this could have been done in one step by the
matrix multiplication:

M [ 12  1  5 14 23 ] = [ -16 85 -15 18 -69 ]
  [  5 22  0 15  0 ]   [  -2 43  -5 16 -23 ]

To decode it, the recipient needs to compute M^(-1):

M^(-1) = [  -1    2  ]
         [ -1/2  3/2 ]

and multiply it by the vectors

[ -16 ]  [ 85 ]  [ -15 ]  [ 18 ]  [ -69 ]
[  -2 ], [ 43 ], [  -5 ], [ 16 ], [ -23 ]

to get back the original numbers:

M^(-1) [ -16 ] = [ 12 ],  M^(-1) [ 85 ] = [  1 ],  M^(-1) [ -15 ] = [ 5 ],
       [  -2 ]   [  5 ]           [ 43 ]   [ 22 ]           [  -5 ]   [ 0 ]

M^(-1) [ 18 ] = [ 14 ],  M^(-1) [ -69 ] = [ 23 ]
       [ 16 ]   [ 15 ]           [ -23 ]   [  0 ]

or

M^(-1) [ -16 85 -15 18 -69 ] = [ 12  1  5 14 23 ]
       [  -2 43  -5 16 -23 ]   [  5 22  0 15  0 ]
Note: It is possible to use 2×2 matrices so that both the matrix and its
inverse have integer entries; simply ensure that the determinant is equal to 1
or -1. It is also easy to extend this simple coding system to take account of
alphanumeric data such as PINs.
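The whole scheme is easy to automate. The following sketch uses Python with numpy (a tool choice of ours; the function names are illustrative) to encode and decode with the matrix M and letter-number table above:

```python
import numpy as np

M = np.array([[-3, 4], [-1, 2]])   # coding matrix from the text
M_inv = np.linalg.inv(M)           # [[-1, 2], [-0.5, 1.5]]

def to_numbers(message):
    # A=1, ..., Z=26, space=0, as in the table above
    return [0 if ch == " " else ord(ch) - ord("A") + 1 for ch in message]

def encode(message):
    nums = to_numbers(message)
    nums += [0] * (len(nums) % 2)           # pad with a space to an even length
    cols = np.array(nums).reshape(-1, 2).T  # group as 2x1 column vectors
    return (M @ cols).T.flatten().tolist()

def decode(coded):
    cols = np.array(coded).reshape(-1, 2).T
    # round because np.linalg.inv works in floating point
    nums = np.rint(M_inv @ cols).astype(int).T.flatten()
    return "".join(" " if n == 0 else chr(n + ord("A") - 1) for n in nums)

coded = encode("LEAVE NOW ")
print(coded)          # [-16, -2, 85, 43, -15, -5, 18, 16, -69, -23]
print(decode(coded))  # 'LEAVE NOW ' (the trailing space is the padded 0)
```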

Student activity 2.5
a  Based on this approach, code the message SAVE THE WHALES using the matrix

   M = [ 5  3 ]
       [ 3  2 ]

b  The following message was received and was coded using the matrix

   M = [ 5  3 ]
       [ 3  2 ]

   65, 42, 75, 50, 138, 87, 90, 54, 85, 54, 80, 49, 160, 99, 123, 76
   Determine the original message.

SUMMARY

•  Matrices are ordered rectangular arrays of numbers. An array with
   m rows of n elements (that is, with n columns) is said to be a matrix
   of size (order, dimension) m by n, written as m×n. Capital letters
   are typically used to designate matrices.
•  A matrix A of size m×n can be written as a list of lists:
   A = {{a11, a12, a13, ..., a1n}, {a21, a22, a23, ..., a2n}, {a31, a32, a33, ..., a3n},
   ..., {am1, am2, am3, ..., amn}}
   or using a template such as:

   A = [aij] = [ a11 a12 ... a1n ]
               [ a21 a22 ... a2n ]
               [  :   :       :  ]
               [ am1 am2 ... amn ]

   where aij is the element in the ith row and the jth column, and
   1 ≤ i ≤ m and 1 ≤ j ≤ n.
•  Matrices can be added (are conformable for addition) if they are of
   the same size. The sum of two matrices is obtained by adding the
   elements in corresponding positions, that is
   A + B = [aij] + [bij] = [aij + bij] for all 1 ≤ i ≤ m and 1 ≤ j ≤ n.
•  For any matrices A, B and C of the same size, A + B = B + A and
   (A + B) + C = A + (B + C).
•  The zero matrix of size m×n is defined by Om,n = [oij] where
   oij = 0 for all 1 ≤ i ≤ m and 1 ≤ j ≤ n, and is often written as O when
   the order is known (fixed) in a given context and there is no
   ambiguity.


SUMMARY (Cont.)

•  If A is a matrix of size m×n and k is a scalar then
   kA = k[aij] = [kaij] for all 1 ≤ i ≤ m and 1 ≤ j ≤ n. In particular
   1A = [aij] = A. For any matrix A and the corresponding zero
   matrix O, A + O = A = O + A, A + (-A) = O = (-A) + A and
   A - A = O. For any matrix A, and scalars r and s, r(sA) = (rs)A.
•  If A is an m×n matrix and B is a p×q matrix then the product
   C = AB is defined when n = p and is a matrix of size m×q. Matrix
   multiplication is defined row by column, where cij = Σ(k = 1 to p) aik bkj
   for all 1 ≤ i ≤ m and 1 ≤ j ≤ q.
•  In general, AB ≠ BA. If A, B and C are matrices and k is a scalar then
   (AB)C = A(BC) and k(AB) = (kA)B, provided the relevant matrix
   products are defined.
•  The n×n identity matrix, written In,n = In or just I when the order
   is given and no ambiguity arises, is defined as the matrix with 1 for
   each element along its leading diagonal (top left to bottom right) and
   0 for all other elements. For any square matrix A of a given order
   and the corresponding identity matrix I, AI = A = IA.
•  If A, B and C are square matrices of a given size, and r and s are
   scalars, then A(B + C) = AB + AC, (B + C)A = BA + CA,
   (r + s)A = rA + sA and r(A + B) = rA + rB.
•  The transpose of a matrix A = [aij] is the matrix A^T = [aij]^T = [aij^T]
   where aij^T = aji. If the product AB of two matrices is defined, then
   (AB)^T = B^T A^T.
•  The inverse of a square matrix A is a square matrix A^(-1) such that
   AA^(-1) = A^(-1)A = I, the identity matrix of the same size as A.
•  If AB is the product of two square matrices of the same size, then
   (AB)^(-1) = B^(-1)A^(-1).
•  For a 2×2 matrix

   A = [ a  b ],   A^(-1) = 1/(ad - bc) [  d  -b ],
       [ c  d ]                         [ -c   a ]

   provided ad - bc ≠ 0. (ad - bc is called the determinant of matrix A,
   written det(A) or |A|.)
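The 2×2 inverse formula in the last point can be verified symbolically, for example with Python's sympy library (one possible tool; the text itself uses Derive):

```python
from sympy import Matrix, eye, simplify, symbols

a, b, c, d = symbols("a b c d")
A = Matrix([[a, b], [c, d]])
A_inv = Matrix([[d, -b], [-c, a]]) / (a*d - b*c)  # the summary's formula

# A * A_inv simplifies to the 2x2 identity whenever ad - bc != 0
assert simplify(A * A_inv) == eye(2)
print(A.det())  # a*d - b*c
```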


References
Anton, H & Rorres, C 2005, Elementary linear algebra (applications version), 9th
edn, John Wiley and Sons, New York.
Cirrito, F (ed.) 1999, Mathematics higher level (core), 2nd edn, IBID Press, Victoria.
Hill, RO Jr 1996, Elementary linear algebra with applications, 3rd edn, Saunders
College, Philadelphia.
Lipschutz, S & Lipson, M 2000, Schaum's outline of linear algebra, 3rd edn,
McGraw-Hill, New York.
Nicholson, KW 2001, Elementary linear algebra, 1st edn, McGraw-Hill Ryerson,
Whitby, ON.
Nicholson, KW 2003, Linear algebra with applications, 4th edn, McGraw-Hill
Ryerson, Whitby, ON.
Poole, D 2006, Linear algebra: A modern introduction, 2nd edn, Thomson Brooks/
Cole, California.
Sadler, AJ & Thorning, DWS 1996, Understanding pure mathematics, Oxford
University Press, Oxford.
Victorian Curriculum and Assessment Authority (VCAA) 2005, VCE Mathematics
study design, VCAA, East Melbourne.
Wheal, M 2003, Matrices: Mathematical models for organising and manipulating
information, 2nd edn, Australian Association of Mathematics Teachers,
Adelaide.

Websites
http://wims.unice.fr/wims/en_tool~linear~matrix.html
This website provides a matrix calculator.
http://en.wikipedia.org/wiki/Matrix_(mathematics)
This website gives a concise introduction to matrices and matrix arithmetic, and
has links to other resources and references.
http://www.sosmath.com/matrix/matrix.html
This site has some notes on basic matrix concepts and operations at quite a simple
level.


Chapter 3

Solving systems of simultaneous linear equations
From the middle years of secondary schooling, students become familiar with
linear relations of the form 'a certain number of xs added to a certain number
of ys is equal to a given number', such as 2x+3y=24. Usually this is done
by considering a table of whole number ordered pairs of values that satisfy the
relation, and then plotting the corresponding points on a set of cartesian axes.
This is then typically extrapolated (by an implicit continuity assumption) to
consideration of the continuous straight line on which these points lie. Thus
students learn to draw the graph, part of which is shown in Figure 3.1, of such
Figure 3.1: Part of the graph of the linear relation 2x+3y=24


linear relations, by identification of their axis intercepts and drawing in the
line containing these two points. The corresponding working might go
something like: when x=0, 3y=24 and so y=8, hence (0, 8) are the
coordinates of the vertical, or y-axis, intercept. Similarly, when y=0, 2x=24
and so x=12, hence (12, 0) are the coordinates of the horizontal, or x-axis,
intercept.
Students will also learn how to identify the gradient of the straight line
from this form, the graph, or possibly by re-writing it in the function form
y = (-2/3)x + 8.
A single linear equation in two variables ax+by=k, where a, b and k are
real constants, is used to define the linear relation corresponding to the set of
points {(x, y): ax+by=k, x, y ∈ R} and this set of points can be used to draw
the graph of a straight line in the cartesian plane R^2. Any ordered pair (x, y)
that satisfies this equation is a point on the graph of the straight line. If we are
asked to consider the set of points which satisfy both this relation and another
relation of the same kind together, then we have a set of two simultaneous
linear equations in the variables x and y. Given that each relation corresponds
to the graph of a straight line in R^2, we can interpret this geometrically to see
that there are three possibilities:
•  The simultaneous linear equations corresponding to the rules of these
   relations have a unique solution, and their graphs have a unique point of
   intersection.
•  The simultaneous linear equations corresponding to the rules of these
   relations have no solution, and their graphs are parallel and distinct straight
   lines.
•  The simultaneous linear equations corresponding to the rules of these
   relations have infinitely many solutions, the equations represent the same
   relation and their graphs are the same straight line.
As a related activity students could be asked to identify the rules of several
other linear relations that correspond to each of these cases with respect to
2x+3y=24, and verify these by using technology to draw the corresponding
graphs. In such an activity, students are not considering the relationship in
terms of its component parts, but as an object in its own right: each relation
has a graph, and this graph may or may not have certain properties with
respect to the given relation 2x+3y=24 and its graph.
In the first and last cases described above, the system of simultaneous
linear equations is said to be consistent, in the second case it is said to be
inconsistent. It is likely that a set of three or more arbitrarily selected


simultaneous linear equations in two variables will be inconsistent; however,
this is not always the case. For example, the set of simultaneous linear
equations {3x+2y=5, x-y=0, -x+4y=3} has the unique solution (1, 1).
Indeed, teachers can ask students to form systems of several simultaneous
linear equations in two variables that are satisfied by a given ordered pair
(m, n), simply by choosing the constant k to be am+bn for each combination
a and b of coefficients for x and y in ax+by=k. Students will likely be
familiar with solving simultaneous systems of two linear equations in two
variables, using graphical, numerical and algebraic approaches, from their
work in the middle years of secondary mathematics. The linear equations
involved will have been alternatively presented in the forms y=mx+c,
ax+by=k and Ax+By+C=0, and students should be able to convert
algebraically between these forms. They should also be aware that the first
form only applies to linear relations that are functions; that is, it is not
possible to use this form to describe linear relations such as
{(x, y): x=6, y ∈ R}. It is perhaps useful in this context to explicitly point out
how this relates to the coordinate specification of the position of a point in the
cartesian plane. For example, as shown in Figure 3.2, the point with
coordinates (4, 7) corresponds to the solution of the pair of simultaneous
linear equations x = 4, or 1x + 0y = 4, and y = 7, or 0x + 1y = 7:
Figure 3.2: Parts of the graphs of the lines with equations x=4 and y=7
showing their point of intersection at (4, 7)


Practical and theoretical problems in many areas can be formulated in
terms of solving a system of simultaneous linear equations. A linear equation
in two variables is a special case of an equation of the form f(x, y)=k, where
k is a real constant and f(x, y) has the form ax+by for real constants a and b.
For example, f(x, y)=3x-2y=10 is a (linear) equation where k=10 and
a and b are 3 and -2 respectively. Students would be familiar with the use of
this form to specify the rule of a linear function in the cartesian plane, or R^2,
and its corresponding straight line graph, with gradient 3/2 and axis-intercepts
at (0, -5) and (10/3, 0). If g(x, y)=l also has the same form, for example
4x+5y=-3, then we have a pair of simultaneous linear equations
{f(x, y)=k, g(x, y)=l} or {3x-2y=10, 4x+5y=-3}. The solution to this
pair of simultaneous linear equations will be the set of all ordered pairs (x, y)
that satisfy both equations, in this case the ordered pair (44/23, -49/23).
The solution can be readily found using a computer algebra system (CAS),
such as Derive:

SOLVE([3·x - 2·y = 10, 4·x + 5·y = -3], [x, y])

[x = 44/23 ∧ y = -49/23]

This ordered pair represents the point of intersection of the corresponding
straight line graphs in the cartesian plane, as shown in Figure 3.3.
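Other CAS tools give the same result; for example, a sketch with Python's sympy library (our tool choice, not the text's):

```python
from sympy import Rational, solve, symbols

x, y = symbols("x y")
# The pair of simultaneous linear equations 3x - 2y = 10, 4x + 5y = -3
solution = solve([3*x - 2*y - 10, 4*x + 5*y + 3], [x, y])
print(solution)  # {x: 44/23, y: -49/23}
```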
Figure 3.3: Graphs of the two linear functions 3x - 2y = 10 and 4x + 5y = -3
showing their point of intersection


We can similarly define a linear equation in three variables as a special case
of an equation of the form f(x, y, z)=k, where k is a real constant and f(x, y, z)
has the form ax+by+cz for real constants a, b and c. For example, if
f(x, y, z)=3x-2y+z, then 3x-2y+z=0 is a (linear) equation where k=0
and a, b and c are 3, -2 and 1 respectively. Some students will have worked with
this form as representing the equation of a plane in three-dimensional cartesian
space, or R^3. Any point with coordinates (x, y, z) that satisfies this relation is part
of the corresponding plane. Again, this single equation has infinitely many
solutions: all the points in the plane that it defines.
Technologies such as CAS are useful tools to assist in graphically
representing three-dimensional shapes. We can similarly form sets of
simultaneous linear equations involving three variables, with two or more
equations. For example, the intersection of 3x-2y+z=0 with x-y-z=10
corresponds to the solution set:

SOLVE([3·x - 2·y + z = 0, x - y - z = 10], [x, y])

[x = -3·z - 20 ∧ y = -2·(2·z + 15)]

This is a parametric solution in terms of the variable z, where for each real
value of z the corresponding values of x and y are given in terms of z. Thus,
the solution set is {(-3z - 20, -2(2z + 15), z): z ∈ R}. Each value of z generates
the coordinates of a point in R^3, represented by an ordered triple, and these
points all lie on the straight line formed in the three-dimensional space, R^3,
where the two planes intersect, as shown in part in Figure 3.4.


Figure 3.4: Graph of parts of 3x - 2y + z = 0 and x - y - z = 10 showing their intersection
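The parametric solution can likewise be reproduced with sympy (again our tool choice), solving for x and y in terms of z:

```python
from sympy import simplify, solve, symbols

x, y, z = symbols("x y z")
# Intersection of the two planes 3x - 2y + z = 0 and x - y - z = 10
solution = solve([3*x - 2*y + z, x - y - z - 10], [x, y])
print(solution)  # {x: -3*z - 20, y: -4*z - 30}
```

Note that -4z - 30 is the expanded form of -2(2z + 15), agreeing with the Derive output above.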


This process can be extended to systems defined by sets of simultaneous
linear equations in as many variables as we wish, although it is not then
possible to provide simple geometric interpretations for more than three
variables.
CAS, and other technologies, can be used to solve systems of simultaneous
linear equations directly in 'black-box' mode (that is, providing results
without detailing intermediate steps of calculation). In this text the use of
such technologies is intended to elucidate the processes that are employed in
applying algorithms to obtain solutions. While mental and 'by hand'
computation is an important part of the experience of developing
understanding of key concepts, skills and processes, particularly in less
complicated illustrative cases, CAS and other technologies have been
developed to facilitate problem solving and analysis of the behaviour of
mathematical systems by application of their functionality. They are essential
tools to employ in contexts where many computations are required to be
carried out quickly and reliably, such as with matrices of large size (order).
Although judgments about when, where and how it is appropriate to
employ enabling technology (or not) will naturally vary with practical and
philosophical considerations in context, individuals will ultimately make their
own decisions on this matter where they have a choice. Teachers may find it
useful to discuss with students a range of perspectives and considerations on
this issue, including their own views and the rationales for these views.
Matrices provide a natural model for representations and manipulation of
systems of simultaneous linear equations. For example, the linear form of
f(x, y, z)=k corresponds to multiplying the variables x, y and z by their
respective coefficients a, b and c, and making the sum of these values equal to
k, that is, ax+by+cz=k. This is precisely the same as matrix row by
column multiplication:

            [ x ]
[ a  b  c ] [ y ] = [ k ]
            [ z ]

Moreover, this extends naturally to a system of simultaneous linear
equations, because for matrices such multiplication is defined for each
combination of rows and columns where the matrices involved are
conformable for multiplication. That is, the equations involved have the same
number of variables, even if some of their coefficients are zero. For example,
the equations 3x-2y+z=0 and x-z=4 can be represented as the two


matrix equations

             [ x ]                        [ x ]
[ 3  -2  1 ] [ y ] = [ 0 ]   and   [ 1  0  -1 ] [ y ] = [ 4 ]
             [ z ]                        [ z ]

respectively, or simultaneously via the single matrix equation

[ 3  -2   1 ] [ x ]   [ 0 ]
[ 1   0  -1 ] [ y ] = [ 4 ]
              [ z ]
Student activity 3.1
a  Write down a system of two simultaneous linear equations in two variables that has
   (5, 6) as its unique solution.
b  Write down a system of two simultaneous linear equations in two variables that has
   (5, 6) as one of its many solutions.
c  Write down a system of two simultaneous linear equations in three variables that
   have (0, 0, 0) and (1, 1, 1) as solutions.
d  Use the Solve functionality of a CAS, or a like functionality of other suitable
   technology, to find the intersection of 3x-2y+z=0 and x-y-z=10, and
   express the solution set in terms of y.

Solving systems of simultaneous linear equations using matrix inverse

If a system of n simultaneous linear equations in n variables is consistent and
has a unique solution, then square matrices and their inverse matrices may be
used to find this unique solution. We will begin by considering two simple
examples, give some geometric interpretations and then introduce some
general notation. To start with, it is important for students to see how a
familiar set of two simultaneous linear equations in two variables may be
written as an array, as is often the case for 'by hand' techniques, and
subsequently as a single matrix equation.

Example 3.1

Consider the system of two simultaneous linear equations in two
variables (often called unknowns in this context), x and y, defined by
requiring two numbers x and y to have a difference of 1 and a sum of 3.
This is given by {x-y=1, x+y=3} and can be written in a
rectangular array form, with corresponding variables vertically aligned, as

x - y = 1
x + y = 3

or in matrix form as

[ 1  -1 ] [ x ]   [ 1 ]
[ 1   1 ] [ y ] = [ 3 ]

In this case the solution of x=2 and y=1 can be readily identified
by inspection. However, this will not always be the case.
The system of three simultaneous linear equations in three variables,
x, y and z, given by {x-y-z=0, 6x+4y=20, -4y+2z=10} can be written
in a rectangular array form, with corresponding variables vertically aligned, as

 x -  y -  z = 0
6x + 4y + 0z = 20
0x - 4y + 2z = 10

or in matrix form as

[ 1  -1  -1 ] [ x ]   [  0 ]
[ 6   4   0 ] [ y ] = [ 20 ]
[ 0  -4   2 ] [ z ]   [ 10 ]

Other more complicated systems of n equations in n variables can also be
likewise represented.
The matrices

[ 1  -1 ]        [ 1  -1  -1 ]
[ 1   1 ]  and   [ 6   4   0 ]
                 [ 0  -4   2 ]

are commonly called the coefficient matrices, the matrices

[ 1 ]        [  0 ]
[ 3 ]  and   [ 20 ]
             [ 10 ]

the constant matrices, and

[ x ]        [ x ]
[ y ]  and   [ y ]
             [ z ]

the matrices of the variables or unknowns.
Students should be encouraged to note that the matrices involved hold
systematic information about the system of equations. Each column of the
coefficient matrix contains the coefficients of one variable, one coefficient
from each of the equations. The variables are thus ordered according to which
column of the matrix they correspond to. If we write A for the coefficient
matrix, B for the constant matrix and X for the matrix of unknowns, then
each of the above systems of simultaneous equations (and any other like
system) can be represented in the same form by a single matrix equation


AX=B. This equation can be solved for X by left multiplying both sides of the
matrix equation by A^(-1) and applying matrix algebra:

AX = B
A^(-1)(AX) = A^(-1)B      (multiplying by A^(-1) on left)
(A^(-1)A)X = A^(-1)B      (by associativity of matrix multiplication)
IX = A^(-1)B              (since A^(-1)A = I)
X = A^(-1)B               (since IX = X)

For the system of two simultaneous linear equations in two unknowns this
gives:

[ x ]   [ 1  -1 ]^(-1) [ 1 ]   [  1/2  1/2 ] [ 1 ]   [ 2 ]
[ y ] = [ 1   1 ]      [ 3 ] = [ -1/2  1/2 ] [ 3 ] = [ 1 ]

So x=2 and y=1 is the (simultaneous) solution to both the equations, as
expected. Using the matrix inverse, both values are obtained at the same time,
unlike other techniques in which the values are determined successively.
Geometrically, each equation corresponds to a straight line in the cartesian
plane, and they intersect at the point with coordinates (2, 1), as shown in
Figure 3.5.
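The inverse-matrix computation above can be checked numerically, for example with numpy (our tool choice; in practice np.linalg.solve is preferred over forming the inverse explicitly):

```python
import numpy as np

A = np.array([[1.0, -1.0], [1.0, 1.0]])  # coefficient matrix
B = np.array([1.0, 3.0])                 # constant matrix

A_inv = np.linalg.inv(A)                 # [[0.5, 0.5], [-0.5, 0.5]]
X = A_inv @ B
print(X)  # [2. 1.]
```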
Figure 3.5: Graphs of x - y = 1 and x + y = 3 and their point of intersection


For the system of three simultaneous linear equations in three unknowns
this gives:

[ x ]   [ 1  -1  -1 ]^(-1) [  0 ]   [  2/11   3/22   1/11 ] [  0 ]   [ 40/11 ]
[ y ] = [ 6   4   0 ]      [ 20 ] = [ -3/11   1/22  -3/22 ] [ 20 ] = [ -5/11 ]
[ z ]   [ 0  -4   2 ]      [ 10 ]   [ -6/11   1/11   5/22 ] [ 10 ]   [ 45/11 ]

So x = 40/11, y = -5/11 and z = 45/11 is the simultaneous solution to all three
linear equations, and this is not really evident by inspection. Geometrically,
each equation corresponds to a plane in the three-dimensional space of R^3, and
these planes intersect at the point with coordinates (40/11, -5/11, 45/11), as shown in
Figure 3.6.


Figure 3.6: Graphs of x - y - z = 0, 6x + 4y = 20 and -4y + 2z = 10 and their point of intersection
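The exact fractions in the three-variable case can be checked with sympy (our tool choice), which carries out the inversion in exact rational arithmetic:

```python
from sympy import Matrix, Rational

A = Matrix([[1, -1, -1], [6, 4, 0], [0, -4, 2]])  # coefficient matrix
B = Matrix([0, 20, 10])                           # constant matrix

X = A.inv() * B
print(X.T)  # Matrix([[40/11, -5/11, 45/11]])
```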


These are called consistent systems, as there is a solution. In both cases the
solution is also unique. This simple matrix method is ideal if there is a unique
solution. However, in many cases there may be no solution or infinitely many
solutions, and in these cases the coefficient matrix does not have an inverse.
Intuitively students should be able to discuss and identify geometric
interpretations of possible intersections of graphs of straight lines (using
rulers) and planes (using sheets of paper) corresponding to the two-dimensional and three-dimensional cases respectively. If two students write
down the equations of two straight lines of the form ax+by=c
independently and then compare, there are three possibilities:
•  They correspond to distinct lines with different gradients which have a
   unique point of intersection, for example {3x+2y=4, x+y=0}.
•  They correspond to distinct lines with the same gradient which have no
   points of intersection, that is they are parallel with different axis intercepts,
   for example, {3x+2y=4, 3x+2y=0}.
•  They correspond to the same line with the same gradient and have
   infinitely many points of intersection, that is they are parallel with the
   same axis intercepts, for example {3x+2y=4, 1.5x+y=2}.
If three students write down the equations of three planes of the form
ax+by+cz=d independently and then compare, there are also several
possibilities:
•  The three planes are distinct and intersect in a unique point.
•  The three planes are distinct and intersect in a line (like a three-page book).
•  The three planes are identical and intersect in infinitely many points that
   form a single common plane.
•  The three planes do not intersect all together. (There are several ways in
   which this might occur geometrically.)
Students should be able to discuss the idea that, as a single equation in
three variables of the form ax+by+cz=d represents a single plane in R3,
then a system of two simultaneous linear equations in three variables could
have zero or infinitely many solutions, the former because the planes are
parallel but distinct, the latter either because the planes are identical or
because they define a line in R^3 by their set of intersection points. Student
exploration of these possibilities will be aided by access to CAS with two-dimensional and three-dimensional graphing functionalities.

Example 3.2

Consider the following systems of two simultaneous linear equations in
the two variables x and y:

 x + y = 4             -6x + 2y = -8
 x + y = 2     and      3x -  y =  4

Their coefficient matrices are

[ 1  1 ]        [ -6   2 ]
[ 1  1 ]  and   [  3  -1 ]

respectively, and neither of these has an inverse. That is, they are both
singular matrices as their determinants are both equal to zero.
For the first system, the corresponding lines are parallel and do not
intersect, as shown in Figure 3.7, so there are no solutions. This system
is said to be inconsistent.
Figure 3.7: Graphs of x+y=2 and x+y=4, parallel straight lines with
no points of intersection

For the second system, the corresponding lines are in fact the same
line, as shown in Figure 3.8, hence each point on the line is a solution to
the system. This system is consistent, but with infinitely many
solutions.


Figure 3.8: Graphs of 3x - y = 4 and -6x + 2y = -8, identical straight lines with
infinitely many points of intersection

If we use y=k as the free variable, a set of solutions, or solution set, can be
written parametrically in the form {((4 + k)/3, k): k ∈ R}. There are infinitely
many other ways of writing the solution set for this system. For example,
another solution set, using x=t as the free variable, is {(t, 3t-4): t ∈ R}.
Each value of k or t, as applicable, generates the coordinates of a solution point.
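That both coefficient matrices in Example 3.2 are singular can be confirmed directly; a numpy sketch (our tool choice):

```python
import numpy as np

A1 = np.array([[1, 1], [1, 1]])      # from {x + y = 4, x + y = 2}
A2 = np.array([[-6, 2], [3, -1]])    # from {-6x + 2y = -8, 3x - y = 4}

# Both determinants are zero (up to floating-point rounding),
# so neither matrix has an inverse.
print(np.linalg.det(A1), np.linalg.det(A2))
```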

Student activity 3.2
a  The system of linear equations {ax+by=0, cx+dy=0}, where a, b, c and d are
   real constants, is called a homogeneous system. Show that for this system there is
   either a unique solution or infinitely many solutions.
b  Describe relationships between real constants a, b, c, d, e and f for which the
   system of simultaneous linear equations {ax+by=e, cx+dy=f} has:
   i    a unique solution
   ii   no solution
   iii  infinitely many solutions
c  A system of equations has solution set {(3t+1, 2t-1): t ∈ R}. Find the
   corresponding cartesian equation.
d  Write the solution set to the system of simultaneous equations

   -6x + 2y = -8
    3x -  y =  4

   in two ways that are different from the ways given above.


Although we cannot use the inverse of the coefficient matrix to solve the
above systems, we can use another technique called Gaussian elimination to
solve such systems. This is a generalisation of the process of eliminating
variables, carried out systematically and represented in matrix form. This
algorithm can be applied in cases where there are not the same number of
equations and variables, an important technical generalisation.

The method of Gaussian elimination
We have seen above that a system of simultaneous linear equations can be
written in matrix form. While the method involving the inverse of the matrix
of coefficients can be used when the system has the same number of equations
as variables within an equation, and the required inverse exists (that is, the
system is consistent and has an unique solution), it would be useful to have a
more general method of analysis. Such a method should:
•  enable us to determine whether a system is consistent or not, irrespective of
   its order
•  determine the solutions to the system, involving parametric forms as
   applicable
•  generalise existing methods for known simple cases.
Such a method exists, and is called Gaussian elimination. It is a numerical
method, involving only the coefficients and constants of the system of
equations, and can be effectively represented using matrices. This method
underpins the processes used in technology such as CAS for solving systems
of simultaneous linear equations and related operations. To do this we simply
use the coefficient matrix and the matrix of constants as discussed previously,
and place them side by side to form a new matrix called the augmented matrix
for the system of simultaneous linear equations.
Example 3.3

The system

x - y = 1
x + y = 3

has augmented matrix

[ 1  -1 | 1 ]
[ 1   1 | 3 ]

The system

 x -  y -  z = 0
6x + 4y + 0z = 20
0x - 4y + 2z = 10

has augmented matrix

[ 1  -1  -1 |  0 ]
[ 6   4   0 | 20 ]
[ 0  -4   2 | 10 ]


In general, the matrix equation AX=B, where A is an m×n coefficient
matrix, X is the n×1 matrix of variables and B is the m×1 matrix of
constants, gives rise to the m×(n + 1) (that is, one additional column)
augmented matrix:

          [ a11 a12 ... a1n | b1 ]
[A | B] = [ a21 a22 ... a2n | b2 ]
          [  :   :       :  |  : ]
          [ am1 am2 ... amn | bm ]
The important discussion is to lead from this form, which is only a
representation of all the coefficients and constants of the original system, to
another form which enables us to read off the solutions for x1 through to xn.
The processes by which we move from one form to another must ensure that
these forms are equivalent, where two systems of simultaneous linear
equations are said to be equivalent if each has the same set of solutions. The
idea of Gaussian elimination is to solve a system of simultaneous linear
equations by writing a sequence of systems, each one equivalent to the
previous system. Then each of these systems has the same set of solutions as
the original one. The aim is to end up with a system that is easy to solve, such
as one in what is called triangular form. Instead of writing the system of
equations out each time, we simply write the corresponding augmented
matrix, since the natural ordering of the matrix takes care of tracking what
happens to the coefficients of the variables.
To do this, only a certain type of operation, called an elementary
operation, can routinely be performed on systems of simultaneous linear
equations to produce equivalent systems. These are the operations we would
use in solving such a system by hand.
1 Interchange: Any two equations can be interchanged.
2 Scaling: We can multiply an equation by a non-zero constant.
3 Elimination: We can add a constant multiple of one equation to another
equation.
In practice, operations 2 and 3 are often applied in conjunction to add or
subtract a multiple of one equation from a multiple of another equation,
usually to eliminate a variable from one equation.
Elementary operations performed on a system of simultaneous equations
produce corresponding manipulations of the rows of the augmented matrix.
Hence, either by hand or using technology, we manipulate the rows of the


augmented matrix rather than the equations. These row operations have the
same effect as the operation on equations, where the equations have been
written as a rectangular array with the coefficients of the variables and the
constants vertically aligned. The following are the corresponding elementary
row operations for a matrix:
1 Interchange: Interchange any two rows in their entirety.
2 Scaling: Multiply any row by a non-zero constant.
3 Elimination: Add a constant multiple of one row to another row.
A matrix is said to be in (row) echelon form if it satisfies the following two
conditions:
1 If there are any zero rows, they are at the bottom of the matrix.
2 The first non-zero entry in each non-zero row (called the leading entry or
pivot) is to the right of the pivots in the rows above it.
Matrices which are in row echelon form have a 'staircase' appearance:

[ 0  *  *  *  * ]
[ 0  0  *  *  * ]
[ 0  0  0  0  * ]
[ 0  0  0  0  0 ]
Example 3.4

The following matrices are in echelon form:

[1 2]   [2 5]   [2 3 5]   [3 4 5]   [0 2 3]
[0 3]   [0 7]   [0 0 4]   [0 2 1]   [0 0 4]
        [0 0]   [0 0 0]
This idea can be extended further: a matrix is said to be in reduced (row)
echelon form if it is in echelon form and also:
3 Each pivot is 1.
4 Each pivot is the only non-zero entry in its column.
Example 3.5

The following matrices are in reduced row echelon form:

[1 0]   [1 0]   [1 3 0]   [1 0 5]   [0 0 1 0]
[0 1]   [0 1]   [0 0 1]   [0 1 1]   [0 0 0 1]
        [0 0]   [0 0 0]

52

chapter 3
Solving systems of simultaneous linear equations

Two matrices A and B are said to be equivalent if one can be obtained from
the other by a finite sequence of elementary row operations, and we write
A~B to denote this equivalence.
Gaussian elimination is a procedure for bringing a matrix to an equivalent row echelon form, and is described in steps 1-4 of the Gauss-Jordan algorithm given below. At this stage the corresponding system of simultaneous linear equations is in triangular form and could be solved by back-substitution. The Gauss-Jordan algorithm, described in Table 3.1, is an extension of Gaussian elimination which brings the matrix to its equivalent reduced row echelon form, from which the solution (if there is one) can be written down directly.
Table 3.1: Gauss-Jordan algorithm

Step 1: Identify the leftmost non-zero column.

Step 2: If the first row has a zero in the column of Step 1, interchange it with one that has a non-zero entry in the same column.

Step 3: Obtain zeros below the leading entry (also called a pivot) by adding suitable multiples of the top row to the rows below it.

Step 4: Cover the top row and repeat the same process with the leftover sub-matrix, starting at Step 1. Repeat this process with each row. (At this stage the matrix is in echelon form.)

Step 5: Starting with the last non-zero row and working upwards: for each row, obtain a leading 1 (by dividing the row by the value of its pivot) and introduce zeros above it by adding suitable multiples of the row with the leading 1 to the rows above it.

Example 3.6

Use the Gauss-Jordan algorithm to solve the system of simultaneous linear equations formed by requiring two numbers x and y to have a difference of 1 and a sum of 3:

x - y = 1
x + y = 3


As noted earlier, in this case the solution x = 2 and y = 1 can readily be obtained by inspection; however, this will not always be the case. A simple example such as this enables students to attend to the process being illustrated rather than focus on the manipulations involved.
Solution

This system has the corresponding augmented matrix:

[1 -1 1]
[1  1 3]

The leftmost non-zero column is the first column (Step 1). The top entry in this column is non-zero, so proceed to Step 3 of the algorithm (Step 2). The new second row will be the old second row minus the first row (Step 3). The resulting matrix is

[1 -1 1]
[0  2 2]

which is in row echelon form.

(This corresponds to the system of simultaneous linear equations

x - y = 1
   2y = 2

which is in a triangular form and could easily be solved by back-substitution. The last equation is 2y = 2, and so y = 1. Substituting this value for y in the first equation and solving for x gives x - 1 = 1, and so x = 2; hence the solution of the system is (2, 1).)
Now we must use Step 5 to convert the matrix to reduced row echelon form. First, we need to turn the leading entry in the second row into a leading 1, so divide the second row by 2 to obtain

[1 -1 1]
[0  1 1]

To convert this to reduced row echelon form, we need to turn the entry in the first row, second column to 0. This can be done by writing a new first row which is equal to the second row added to the first row, to obtain

[1 0 2]
[0 1 1]

The matrix is now in reduced row echelon form, corresponding to the equivalent system of simultaneous linear equations

x = 2
y = 1

which is the solution, as noted earlier.
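The same reduction can be checked with technology; a sketch using the SymPy computer algebra library (assumed available; `rref()` is its reduced-row-echelon-form function):

```python
from sympy import Matrix

# Augmented matrix for x - y = 1, x + y = 3 (Example 3.6).
aug = Matrix([[1, -1, 1],
              [1,  1, 3]])
rref, pivots = aug.rref()
print(rref)    # Matrix([[1, 0, 2], [0, 1, 1]])
print(pivots)  # (0, 1): leading 1s in the first and second columns
```

The last column of the reduced matrix gives the solution x = 2, y = 1 directly.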

Example 3.7

Use the Gauss-Jordan algorithm to solve the following system of simultaneous linear equations:

 x -  y -  z =  0
6x + 4y + 0z = 20
0x - 4y + 2z = 10
Solution

This system has the corresponding augmented matrix:

[1 -1 -1  0]
[6  4  0 20]
[0 -4  2 10]
The leftmost non-zero column is again the first column (Step 1) and the top entry in this column is non-zero (Step 2), so we proceed to Step 3 of the algorithm. The new second row will be the old second row with six times the first row subtracted from it, and since the element of the first column in the third row is already zero, we do not need to do anything to the third row at this stage. The equivalent matrix is:

[1 -1 -1  0]
[0 10  6 20]
[0 -4  2 10]
Now we are at Step 4 of the elimination process. If we cover the top row, the leftmost non-zero column is now the second column (Step 1) and the top entry is non-zero (Step 2), so the new third row will be the previous third row with 4/10 of the second row added to it:

[1 -1  -1   0]
[0 10   6  20]
[0  0 4.4  18]
This matrix is now in echelon form, and we could use back-substitution to solve the corresponding system of simultaneous linear equations:

x - y - z = 0
10y + 6z = 20
4.4z = 18


Now we are at Step 5. We begin by dividing the last row by 4.4 = 22/5 to obtain a leading 1. The matrix is then:

[1 -1 -1     0]
[0 10  6    20]
[0  0  1 45/11]
We must next turn the first two numbers in the third column to 0 by using elementary row operations. The new first row will be the previous first row plus the third row, and the new second row will be the previous second row with 6 times the third row subtracted from it. The new matrix will be:

[1 -1 0  45/11]
[0 10 0 -50/11]
[0  0 1  45/11]
Next, we need to divide the second row by 10 to obtain a leading 1:

[1 -1 0 45/11]
[0  1 0 -5/11]
[0  0 1 45/11]
Finally, to get the matrix into reduced row echelon form, we need to obtain a 0 in the first row, second column position. This we can do by adding the second row to the first row and replacing the previous first row with this:

[1 0 0 40/11]
[0 1 0 -5/11]
[0 0 1 45/11]
The final equivalent system of simultaneous linear equations is:

x = 40/11
y = -5/11
z = 45/11

This is the required solution.
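Working by hand, the fractions in Example 3.7 are easy to mistype; a CAS keeps them exact. A sketch using SymPy (assumed available):

```python
from sympy import Matrix

aug = Matrix([[1, -1, -1,  0],
              [6,  4,  0, 20],
              [0, -4,  2, 10]])
rref, pivots = aug.rref()
print(rref.col(-1))  # Matrix([[40/11], [-5/11], [45/11]])
```

Because the matrix is entered with integer entries, SymPy works in exact rational arithmetic throughout, so 45/11 appears as a fraction rather than as 4.0909….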


Students may well inquire what happens when this algorithm is applied to
a system of simultaneous linear equations that corresponds to a pair of
parallel lines or identical lines in the cartesian plane.
Example 3.8

Use the Gauss-Jordan algorithm to solve this system of simultaneous linear equations:

x + y = 4
x + y = 2

Solution

This system corresponds to a pair of distinct parallel lines (same gradient, different intercepts). The corresponding augmented matrix for this system is:

[1 1 4]
[1 1 2]

The first step in the Gauss-Jordan algorithm is to replace the second row with the previous second row minus the first row, to obtain

[1 1  4]
[0 0 -2]

(The system is now in triangular form, and we can see that the last row corresponds to the equation 0x + 0y = -2, which is impossible. Hence there are no solutions to this system.)

Next, we divide the elements in the second row by -2 to obtain

[1 1 4]
[0 0 1]

and finally we subtract four times the second row from the first row to obtain

[1 1 0]
[0 0 1]

Now we see that the last equation is 0x + 0y = 1, which is impossible, and so there are no solutions to this system of equations.

The above is typical of the result of elimination when there are no solutions: the last non-zero row of the augmented matrix will have zeros everywhere except in the rightmost position.
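A CAS reports the same thing: after reduction, the tuple of pivot columns contains the index of the rightmost (constants) column, which is a machine-readable version of the inconsistency test. A sketch using SymPy (assumed available):

```python
from sympy import Matrix

aug = Matrix([[1, 1, 4],
              [1, 1, 2]])
rref, pivots = aug.rref()
n_vars = aug.cols - 1       # number of variables (here 2)
print(rref)                 # Matrix([[1, 1, 0], [0, 0, 1]])
print(n_vars in pivots)     # True: a leading 1 in the constants column,
                            # so the system is inconsistent
```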

Example 3.9

Use the Gauss-Jordan algorithm to solve the system of simultaneous linear equations:

-6x + 2y = -8
 3x -  y =  4


Solution

This system corresponds to a pair of identical lines (same gradient, same intercepts). The corresponding augmented matrix for this system is:

[-6  2 -8]
[ 3 -1  4]

Replace the second row by itself plus 1/2 times the first row. This gives

[-6 2 -8]
[ 0 0  0]

We could complete the Gauss-Jordan algorithm by dividing the first row by -6, giving the reduced row echelon form

[1 -1/3 4/3]
[0    0   0]
The variables in this example are x and y. There is one leading 1, corresponding to the x variable (as the coefficients of x were in the first column), and so we describe the variable x as basic or leading and the other variable, y, as free. The second row now tells us that 0y = 0, which is true for any value of y, so we let y = k, where k ∈ R is an arbitrary constant.

Then, from the first row of the reduced matrix, we have x - (1/3)y = 4/3 and, since y = k, x = (1/3)k + 4/3 = (k + 4)/3. Thus, we can write the solution (in parametric form) as {((k + 4)/3, k) : k ∈ R}.

In general, the solution to consistent systems such as the one just considered, but also to more complicated cases where there are both leading variables (corresponding to leading 1s) and free variables, is written by assigning arbitrary constants, or parameters, to the free variables, and then writing the leading variables in terms of these constants.
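In CAS terms, the free variables are exactly the variable columns that do not appear in the pivot list; a sketch using SymPy on the system of Example 3.9 (assumed available):

```python
from sympy import Matrix

aug = Matrix([[-6,  2, -8],
              [ 3, -1,  4]])
rref, pivots = aug.rref()
n_vars = aug.cols - 1
free = [j for j in range(n_vars) if j not in pivots]
print(rref)    # Matrix([[1, -1/3, 4/3], [0, 0, 0]])
print(pivots)  # (0,): x is a leading variable
print(free)    # [1]: y is free, so the solution needs one parameter
```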
Most CAS have a function which automatically reduces an augmented
matrix to reduced row echelon form, from which the solution can be
determined. This is fine unless there is an arbitrary constant in the
augmented matrix itself (for example, arising from one of the linear equations
in the system involving an arbitrary constant for one of the coefficients or its


constant term). Then it is necessary to be wary that a division by a function of the constant may have taken place, and that this operation will only be valid if the function of the constant is non-zero. The resulting reduced row echelon matrix may have no trace of the constant. Some CAS have a fraction-free Gaussian-elimination function which effectively only gives the row echelon matrix, and this can be used to investigate what happens for such systems.

Student activity 3.3
For the following systems of equations, enter the augmented matrix into a CAS, or other suitable technology, and use this to obtain the reduced row echelon form. Hence solve the following systems of simultaneous linear equations.

a   x + 2y -  z = 2
    x + 4y - 3z = 3
   2x + 5y - 3z = 1

b   x +  y +  z =  1
    x -  y + 3z =  5
   3x - 2y +  z = -2

c   2x + 2y -  z =  5
   -2x +  y +  z =  7
   -4x +  y + 2z = 10

Systems of simultaneous linear equations in various contexts

Many different contexts give rise to systems of simultaneous linear equations in several variables, even when the relations involved may themselves be non-linear. Substitution of values of the variables in such contexts often results in a system of simultaneous linear equations relating coefficients and/or arbitrary constants.

Example 3.10

A circle has an equation of the form x² + y² + ax + by + c = 0, where a, b and c ∈ R. This circle passes through the points (-2, 3), (6, 3) and (2, 7) in the cartesian plane. Find the values of a, b and c.
Solution

Since (-2, 3) lies on the circle, substitution of these values into the equation for the circle gives (-2)² + 3² - 2a + 3b + c = 0. This simplifies to the linear equation:

-2a + 3b + c = -13

Similarly, as (6, 3) also lies on the circle we have 6² + 3² + 6a + 3b + c = 0, which simplifies to the linear equation:

6a + 3b + c = -45

For the point (2, 7), which also lies on the circle, we have 2² + 7² + 2a + 7b + c = 0, and hence:

2a + 7b + c = -53
We thus have a system of three simultaneous linear equations in a, b and c:

-2a + 3b + c = -13
 6a + 3b + c = -45
 2a + 7b + c = -53

The augmented matrix form is:

[-2 3 1 -13]
[ 6 3 1 -45]
[ 2 7 1 -53]
and using by-hand or CAS manipulation to bring this to reduced row echelon form yields:

[1 0 0 -4]
[0 1 0 -6]
[0 0 1 -3]

The solution can be read directly from this: a = -4, b = -6 and c = -3, and so the equation of the circle is x² + y² - 4x - 6y - 3 = 0. Completing the square on both x and y results in the alternative equation (x - 2)² + (y - 3)² = 16 for the relation. The graph of this relation is a circle with centre (2, 3) and radius 4, as shown in Figure 3.9.


Figure 3.9: Graph of the relation x² + y² - 4x - 6y - 3 = 0, or (x - 2)² + (y - 3)² = 16
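The fitting calculation in Example 3.10 is a one-liner once the augmented matrix is entered; a sketch using SymPy (assumed available):

```python
from sympy import Matrix

# Rows built from the points (-2, 3), (6, 3) and (2, 7).
aug = Matrix([[-2, 3, 1, -13],
              [ 6, 3, 1, -45],
              [ 2, 7, 1, -53]])
rref, _ = aug.rref()
print(rref)  # the last column gives a = -4, b = -6, c = -3
```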

Example 3.11

Three Toyotas, two Fords and four Holdens can be rented for $212 per
day. Alternatively, two Toyotas, four Fords and three Holdens can be
rented for $214 per day, or four Toyotas, three Fords and two Holdens
could be rented for $204 per day.
Assuming that the rate for renting any type of car is fixed by the
make, find the rental rates for each type of car per day.
Solution

Let a, b and c be the respective costs of renting a Toyota, a Ford and a


Holden per day. Then we have three simultaneous linear equations in
the three unknowns a, b and c.

3a + 2b + 4c = 212
2a + 4b + 3c = 214
4a + 3b + 2c = 204


The augmented matrix corresponding to this system is:

[3 2 4 212]
[2 4 3 214]
[4 3 2 204]

and using by-hand or CAS manipulation to return the reduced row echelon form yields:

[1 0 0 20]
[0 1 0 24]
[0 0 1 26]

Hence the rental rates are $20 per day for a Toyota, $24 per day for a Ford and $26 per day for a Holden.
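Since the coefficient matrix here is square and invertible, this system also suits the inverse method X = A⁻¹B (see the chapter Summary). A sketch using NumPy (assumed available); `linalg.solve` does the work without forming A⁻¹ explicitly:

```python
import numpy as np

A = np.array([[3, 2, 4],
              [2, 4, 3],
              [4, 3, 2]], dtype=float)
B = np.array([212, 214, 204], dtype=float)
X = np.linalg.solve(A, B)   # solves AX = B by LU factorisation
print(X)                    # [20. 24. 26.]
```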

Example 3.12

A girl finds $5.20 in coins: 50 cent coins, 20 cent coins and 10 cent coins.
She finds 21 coins in total. How many coins of each type must she have?
Solution

Suppose she has a 50-cent coins, b 20-cent coins and c 10-cent coins. Then, working in cents:

50a + 20b + 10c = 520

She has 21 coins in total, so:

a + b + c = 21

This gives us two simultaneous linear equations in three unknowns. We write the augmented matrix corresponding to the system:

[50 20 10 520]
[ 1  1  1  21]


and find its reduced row echelon form:

[1 0 -1/3 10/3]
[0 1  4/3 53/3]
Generally, there would be infinitely many possible solutions to these
equations, but we require non-negative integers as solutions. The leading
variables correspond to the columns with leading 1s, so are a and b.


The free variable is c, and it can take integer values between 0 and 21. We can write a and b in terms of c, from the reduced row echelon matrix above, as

a = 10/3 + c/3 = (10 + c)/3
b = 53/3 - 4c/3 = (53 - 4c)/3

To find the possible integer solutions, we need to consider integer values of c between 0 and 21, and determine when 10 + c and 53 - 4c are both divisible by 3. This could easily be done by technology, forming a 22×3 matrix with the first column containing c, the second (10 + c)/3 and the third (53 - 4c)/3.

 c    (10 + c)/3   (53 - 4c)/3
 0        3.3         17.7
 1        3.7         16.3
 2        4.0         15.0
 3        4.3         13.7
 4        4.7         12.3
 5        5.0         11.0
 6        5.3          9.7
 7        5.7          8.3
 8        6.0          7.0
 9        6.3          5.7
10        6.7          4.3
11        7.0          3.0
12        7.3          1.7
13        7.7          0.3
14        8.0         -1.0
15        8.3         -2.3
16        8.7         -3.7
17        9.0         -5.0
18        9.3         -6.3
19        9.7         -7.7
20       10.0         -9.0
21       10.3        -10.3


Thus we see that there are several possibilities:

{a = 4, b = 15, c = 2} or {a = 5, b = 11, c = 5} or {a = 6, b = 7, c = 8} or {a = 7, b = 3, c = 11}
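Rather than tabulating all 22 values of c, a short program can scan the free variable and keep only the non-negative integer solutions; a sketch in plain Python (no libraries needed):

```python
solutions = []
for c in range(0, 22):                              # 0 to 21 ten-cent coins
    if (10 + c) % 3 == 0 and (53 - 4 * c) % 3 == 0:
        a = (10 + c) // 3
        b = (53 - 4 * c) // 3
        if a >= 0 and b >= 0:                       # coin counts cannot be negative
            solutions.append((a, b, c))
print(solutions)  # [(4, 15, 2), (5, 11, 5), (6, 7, 8), (7, 3, 11)]
```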
Example 3.13

The scores of three players in a tournament have been lost. The only
information available is the total of the scores for players 1 and 2, the
total for players 2 and 3, and the total for players 3 and 1. Show that the
original scores can be recovered.
Solution

Let x, y and z be the scores for players 1, 2 and 3 respectively, and a, b and c the totals for players 1 and 2, 2 and 3, and 3 and 1 respectively. Then

x + y = a
y + z = b
z + x = c

is a system of three simultaneous linear equations in three unknowns x, y and z. The augmented matrix is:

[1 1 0 a]
[0 1 1 b]
[1 0 1 c]
Its corresponding reduced row echelon form is:

[1 0 0 (a - b + c)/2]
[0 1 0 (a + b - c)/2]
[0 0 1 (-a + b + c)/2]

So the original scores are x = (a - b + c)/2, y = (a + b - c)/2 and z = (-a + b + c)/2.
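A CAS can carry the symbols a, b and c through the whole reduction, so the general formulas drop out directly; a sketch using SymPy (assumed available):

```python
from sympy import Matrix, symbols

a, b, c = symbols('a b c')
aug = Matrix([[1, 1, 0, a],
              [0, 1, 1, b],
              [1, 0, 1, c]])
rref, _ = aug.rref()
print(rref.col(-1))  # x, y and z in terms of a, b and c
```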
Example 3.14

Find the rule for the family of parabolas which pass through the points
(1, 2) and (3, 4).
Solution

Let the rule for the family of parabolas be y = ax² + bx + c, where a is non-zero. Since (1, 2) lies on any member of this family of curves:

a + b + c = 2

Similarly, since (3, 4) lies on any member of the family of curves:

9a + 3b + c = 4

Hence we have a system of two equations in three unknowns a, b and c. We first write the augmented matrix:

[1 1 1 2]
[9 3 1 4]

and use CAS to reduce it to its reduced row echelon form:

[1 0 -1/3 -1/3]
[0 1  4/3  7/3]

Now a and b are the leading variables, and c is a free variable, so we let c = k where k ∈ R. Then the first row gives a - (1/3)c = -1/3, and so a = (c - 1)/3 = (k - 1)/3. The second row gives b + (4/3)c = 7/3, and so b = (7 - 4c)/3 = (7 - 4k)/3.

Hence the rule for the family of parabolas is:

y = ((k - 1)/3)x² + ((7 - 4k)/3)x + k,  k ∈ R

We will graph a few of these curves.

If k = 1, then a = 0 and we obtain the straight line with equation y = x + 1, which passes through the two points.
If k = 4, then we obtain the parabola with equation y = x² - 3x + 4.
If k = -5, then we obtain the parabola y = -2x² + 9x - 5.

The graphs of these curves are shown in Figure 3.10.


Figure 3.10: Graphs of f(x) = -2x² + 9x - 5, g(x) = x² - 3x + 4 and h(x) = x + 1, with the points (1, 2) and (3, 4) marked

In the above example, we found a and b in terms of c. We could have used the same procedure to find, say, b and c in terms of a. All we would need to do is to have the columns in the matrix corresponding to b and c come before the column corresponding to a. In this case, the augmented matrix would be

[1 1 1 2]
[3 1 9 4]

and the reduced row echelon form would be

[1 0  4 1]
[0 1 -3 1]

Now we let a = r, where r ∈ R. Then from the first row we have b = 1 - 4r and from the second row we have c = 1 + 3r, giving the form of the family of parabolas as y = rx² + (1 - 4r)x + 1 + 3r, where r ∈ R.
Students may find it of interest to explore what values of k and r are
required to produce a collection of given quadratic functions.
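The column reordering is easy to experiment with on a CAS; a sketch using SymPy (assumed available), with the columns ordered b, c, a:

```python
from sympy import Matrix

# b + c + a = 2 and 3b + c + 9a = 4, columns in the order b, c, a.
aug = Matrix([[1, 1, 1, 2],
              [3, 1, 9, 4]])
rref, pivots = aug.rref()
print(rref)    # Matrix([[1, 0, 4, 1], [0, 1, -3, 1]])
print(pivots)  # (0, 1): now b and c are the leading variables
```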
Example 3.15

Find the rule for the family of cubic polynomials which passes through the points (1, 0) and (-1, 0), with slope -4 when x = 1.
Solution

Let f(x) = ax³ + bx² + cx + d be the rule of a cubic polynomial function, with a, b, c and d the unknown coefficients. Since (1, 0) lies on the curve:


a+b+c+d = 0


Similarly, since (-1, 0) lies on the curve:


-a + b - c + d = 0

Now f'(x) = 3ax² + 2bx + c, and the slope at x = 1 is -4, so 3a + 2b + c = -4.
We now have a system of three simultaneous linear equations in four unknowns:

 a + b + c + d =  0
-a + b - c + d =  0
3a + 2b + c    = -4

We can write down the augmented matrix corresponding to this system:

[ 1 1  1 1  0]
[-1 1 -1 1  0]
[ 3 2  1 0 -4]
The reduced row echelon form of this is:

[1 0 0 -1 -2]
[0 1 0  1  0]
[0 0 1  1  2]

There are three leading variables, a, b and c, and one free variable, d. Let d = k where k ∈ R, and express the leading variables in terms of k. The first row of the reduced matrix tells us that a - d = -2, and so a = k - 2. The second row tells us that b + d = 0, and so b = -k, and the third row tells us that c + d = 2, so c = 2 - k. Thus, the rule for the family of polynomials is:
f: R → R, where f(x) = (k - 2)x³ - kx² + (2 - k)x + k, and k ∈ R.

If k = 2, the function will be the quadratic with rule f(x) = -2x² + 2. We can check that this is the general form by drawing the graphs of some members of this family:

If k = 0, f(x) = -2x³ + 2x.
If k = 3, f(x) = x³ - 3x² - x + 3.
If k = -1, f(x) = -3x³ + x² + 3x - 1.

The graphs of these, and for k = 2, are shown in Figure 3.11.
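The reduction in Example 3.15 can be confirmed with a CAS; a sketch using SymPy (assumed available):

```python
from sympy import Matrix

aug = Matrix([[ 1, 1,  1, 1,  0],
              [-1, 1, -1, 1,  0],
              [ 3, 2,  1, 0, -4]])
rref, pivots = aug.rref()
print(rref)    # Matrix([[1, 0, 0, -1, -2], [0, 1, 0, 1, 0], [0, 0, 1, 1, 2]])
print(pivots)  # (0, 1, 2): a, b and c lead; the fourth column (d) is free
```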

Figure 3.11: Graphs of f(x) for k = -1, 0, 2 and 3

Example 3.16

Find the rule for the family of quartic polynomials (polynomials of degree 4) that pass through the points (1, 2) and (-2, -1), and have slope 5 at x = 1.
Solution

Let f(x) = ax⁴ + bx³ + cx² + dx + e be the rule for the family of quartic polynomials.

Since they pass through (1, 2), f(1) = 2, so:

a + b + c + d + e = 2

Since they pass through (-2, -1), f(-2) = -1, so:

16a - 8b + 4c - 2d + e = -1

Now f'(x) = 4ax³ + 3bx² + 2cx + d. Since we must have f'(1) = 5:

4a + 3b + 2c + d = 5

We now have a system of three simultaneous linear equations in the five unknowns a, b, c, d and e:

  a +  b +  c +  d + e =  2
16a - 8b + 4c - 2d + e = -1
 4a + 3b + 2c +  d     =  5


The augmented matrix for this system is:

[ 1  1 1  1 1  2]
[16 -8 4 -2 1 -1]
[ 4  3 2  1 0  5]

which has reduced row echelon form:

[1 0 0 -1/2 -3/4  1/12]
[0 1 0    0 -1/2   5/6]
[0 0 1  3/2  9/4 13/12]
So the leading variables are a, b and c, and the free variables are d and e. Let d = s and e = t where s, t ∈ R. Our solution will now be given in terms of two parameters.

The first row of the reduced echelon matrix corresponds to the equation a - (1/2)d - (3/4)e = 1/12, hence a = (1/2)s + (3/4)t + 1/12.
The second row corresponds to the equation b - (1/2)e = 5/6, hence b = (1/2)t + 5/6.
The third row corresponds to the equation c + (3/2)d + (9/4)e = 13/12, hence c = -(3/2)s - (9/4)t + 13/12.

The solution set is

{((1/2)s + (3/4)t + 1/12, (1/2)t + 5/6, -(3/2)s - (9/4)t + 13/12, s, t) : s, t ∈ R}

and the family of functions has the form:

f(x) = ((1/2)s + (3/4)t + 1/12)x⁴ + ((1/2)t + 5/6)x³ + (-(3/2)s - (9/4)t + 13/12)x² + sx + t

where s, t ∈ R.
As a solution, this looks fairly formidable, so it's a useful strategy to plot a few of its members.

If s = 0 and t = 0, then we have the function f(x) = (1/12)x⁴ + (5/6)x³ + (13/12)x².
If s = 1 and t = 0, then we have the function f(x) = (7/12)x⁴ + (5/6)x³ - (5/12)x² + x.


If s = 0 and t = 1, then we have the function f(x) = (5/6)x⁴ + (4/3)x³ - (7/6)x² + 1.
If s = 1 and t = 1, then we have the function f(x) = (4/3)x⁴ + (4/3)x³ - (8/3)x² + x + 1.
If s = -1 and t = -1, then we have the function f(x) = -(7/6)x⁴ + (1/3)x³ + (29/6)x² - x - 1.

These members are shown in Figure 3.12 below.
Figure 3.12: Parts of the graphs of some members of the family of functions
for various values of s and t

Student activity 3.4

a  The scores of four players in a tournament have been lost. The only information available is the total of the scores for players 1 and 2, the total for players 2 and 3, the total for players 3 and 4 and the total for players 4 and 1. Can the original scores be recovered?
b  Find the equation of the cubic polynomial which passes through the points (1, 0) and (-1, 0), with slope -4 when x = 1 and slope 12 when x = -1.
c  Find the rule for the family of quartic polynomials (polynomials of degree 4) that pass through the points (1, 2), (-2, -1) and (2, 0), and have slope 5 at x = 1.
d  Find the rule for the quartic polynomial (polynomial of degree 4) that passes through the points (1, 2), (-2, -1), (2, 0) and (-1, 5), and has slope 5 at x = 1.


SUMMARY

A linear system of m simultaneous equations in n variables x1, x2, …, xn is a set of m equations of the form

a11x1 + a12x2 + … + a1nxn = b1
a21x1 + a22x2 + … + a2nxn = b2
  ⋮
am1x1 + am2x2 + … + amnxn = bm

The numbers a11, a12, …, a1n, …, amn are the coefficients of the system, and b1, b2, …, bm are the constant terms.
A system of simultaneous linear equations is said to be consistent if
it has either a unique solution or infinitely many solutions, and
inconsistent if it does not have a solution.
For a system of two simultaneous linear equations in two variables (m = n = 2):
there is a unique solution, which corresponds to the point of intersection of the graphs of the corresponding straight lines (different gradients) in the cartesian plane, R2
or
there are infinitely many solutions, which correspond to the infinite set of points that comprise the superimposition of the graphs of the same straight line (same gradient and same axis intercepts) specified by two equivalent linear relations, in the cartesian plane, R2
or
there are no solutions, and the graphs of the corresponding straight lines are parallel (same gradient) but distinct (different axis intercepts) straight lines in the cartesian plane, R2.
For a system of three simultaneous linear equations in three variables (m = n = 3):
there is a unique solution, which corresponds to the point of intersection of the graphs of the corresponding planes in three-dimensional space, R3
or
there are infinitely many solutions, which correspond to the graphs of three planes aligned like the pages of a book along a common spine (which may include a superimposed page or pages) in three-dimensional space, R3; where at least two of these pages are distinct, their intersection points form a straight line in R3



or
there are no solutions, and the graphs of the three planes are all
parallel but distinct; or one pair is parallel (and distinct) and the
other oblique to this pair; or they are configured like a triangular
prism.
The system of simultaneous linear equations can be written in matrix form AX = B, where

    [a11 a12 … a1n]
A = [a21 a22 … a2n]
    [ ⋮   ⋮     ⋮ ]
    [am1 am2 … amn]

is the m×n coefficient matrix,

    [x1]
X = [x2]
    [⋮ ]
    [xn]

the n×1 column matrix (vector) of variables, and

    [b1]
B = [b2]
    [⋮ ]
    [bm]

the m×1 column matrix (vector) of constant terms.
If A is an invertible (non-singular) square matrix (m = n) then the inverse method can be employed and X = A⁻¹B.
The m×(n + 1) augmented matrix of the system is the following matrix:

[a11 a12 … a1n b1]
[a21 a22 … a2n b2]
[ ⋮   ⋮     ⋮   ⋮]
[am1 am2 … amn bm]

The system of simultaneous linear equations of the form AX=O is


said to be homogeneous and is always consistent, with X=O
(the relevant zero vector) a solution.
To solve such systems of equations using the Gauss-Jordan method, there are three steps.
Step 1: Write the augmented matrix for the system of equations.
Step 2: Enter the augmented matrix into CAS, or other suitable technology, and obtain the reduced row echelon form of the matrix (using exact arithmetic where possible).



Step 3: Interpret the resulting reduced row echelon matrix, as follows:
Case 1: If the number of leading 1s is equal to the number of
variables, and the last leading 1 is not in the rightmost
column, then there is a unique solution which can be written
down directly from the matrix.
Case 2: If the number of leading 1s is less than the number
of variables, and the last leading 1 is not in the rightmost
column, then there will be an infinite number of solutions.
The solutions can be written by assigning an arbitrary
constant to each of the free variables (those not
corresponding to leading 1s), and writing the leading
variables in terms of these constants.
Case 3: If the last non-zero row of the reduced row echelon
matrix has a leading 1 in the rightmost column, then the
system of equations is inconsistent (that is, has no solution).
Non-linear functions and relations can be used to generate a system
of simultaneous linear equations where substitution of some values
for the variables leads to such a system expressed in terms of
coefficients used to specify the particular functions and/or relations
involved.

References
Anton, H & Rorres, C 2005, Elementary linear algebra (applications version),
9th edn, John Wiley and Sons, New York.
Cirrito, F (ed.) 1999, Mathematics higher level (core), 2nd edn, IBID Press, Victoria.
Hill, RO Jr 1996, Elementary linear algebra with applications, 3rd edn, Saunders
College, Philadelphia.
Lipschutz, S & Lipson, M 2000, Schaum's outline of linear algebra, 3rd edn,
McGraw-Hill, New York.
Nicholson, KW 2001, Elementary linear algebra, 1st edn, McGraw-Hill Ryerson,
Whitby, ON.
Nicholson, KW 2003, Linear algebra with applications, 4th edn, McGraw-Hill
Ryerson, Whitby, ON.
Poole, D 2006, Linear algebra: A modern introduction, 2nd edn, Thomson Brooks/
Cole, California.
Wheal, M 2003, Matrices: Mathematical models for organising and manipulating
information, 2nd edn, Australian Association of Mathematics Teachers,
Adelaide.


Websites
http://en.wikipedia.org/wiki/Gaussian_elimination Wikipedia
This site provides a comprehensive discussion with links to other resources and
references.
http://mathworld.wolfram.com/GaussianElimination.html Wolfram Research
This site is the online mathematical encyclopaedia from the developers of the
CAS Mathematica. It provides a concise but comprehensive mathematical
overview and includes links to related topics and a good list of other references.
http://www.sosmath.com/matrix/system1/system1.html SOS Mathematics
This site provides an accessible discussion with worked examples using Gaussian
elimination for some simple cases of systems of simultaneous linear equations.
http://aleph0.clarku.edu/~djoyce/ma105/simultaneous.html Department of
Mathematics and Computer Science, Clark University.
This site includes a first principles discussion of a practical problem based on
ancient Chinese methods.
http://www.jgsee.kmutt.ac.th/exell/PracMath/SimLinEq.html Practical
Mathematics.
This site covers straightforward examples for 22 and 33 systems, with a
collection of related exercises.
http://mathforum.org/linear/choosing.texts/ Drexel University
This site provides information on selected linear algebra texts, including those
that are technology based.
http://www.sosmath.com/matrix/matrix.html SOS Mathematics
This site has some notes with examples on solving systems of linear equations,
and is at quite a simple level.


Chapter 4

Transformations of the cartesian plane
A transformation on the cartesian plane, R × R, or R2 as it is also commonly
designated, is a correspondence or mapping from the set of points in the plane
to a set of points in the plane. That is, for every (original) point in the plane
before the transformation is applied, there is a corresponding unique point,
the image, in the plane after the transformation is applied.
In senior secondary mathematics curricula, particular importance is
assigned to the study of the effects of transformations on certain subsets of
the plane: those that correspond to the graphs of functions and other
relations. Possibly the first case of analysis related to transformations of
(graphs of) functions for middle school secondary students is the graph of the
function with rule g(x)=a(x+b)2+c, derived from the graph of the function
with rule f(x)=x 2 by a sequence of transformations involving a dilation from
the y-axis, possibly a reflection in the x-axis (depending on the sign of a), a
translation parallel to the x-axis and a translation parallel to the y-axis,
although a good case could be made for considering graphs of linear functions
of the form y=mx+c as a similar sequence of transformations of the graph
of y=x.
The first two transformations mentioned above, dilation and reflection, are examples of what are commonly called linear transformations, while all
three of these transformations are examples of what are called affine
transformations (linear transformations and also translations). Some care will
need to be taken for students to become clear that there are two types of
function involved in this context: the function of a single real variable whose
graph corresponds to a particular type of subset of the cartesian plane
{(x, y) where y=f(x) and xdomain(f)}, and the function which is a
transformation of the plane that maps an ordered pair (that is, a point) to
another unique ordered pair.


MathsWorks for Teachers


Matrices

Matrices are well suited to the analysis of linear transformations, and provide
a convenient notation for distinguishing between these two senses of function.
Indeed, linear transformations can be used to provide a strong motivation for the
definition of matrix multiplication, as applied to 2 × 2 matrices. The application
of matrices to the analysis of linear transformations involves the solution of
systems of simultaneous linear equations and matrix inverses.

Linear transformations
A linear transformation is a function T: R^2 → R^2, T(u) = w, where u and w are
ordered pairs corresponding to points in the plane, which satisfies the
following properties or axioms, called the linearity axioms:
T(u + v) = T(u) + T(v) for all u and v ∈ R^2
T(kv) = kT(v) for all v ∈ R^2 and all scalars k ∈ R
More generally a linear transformation is defined as a function from one
vector space (see Chapter 2, page 20) to another that satisfies the linearity
axioms above. In this text we will consider only the restricted case of R^2,
where the underlying vector space is that of coordinate vectors in the cartesian
plane. It is important that teachers clarify for students the nature of the
cartesian plane as R × R, or R^2. Students may or may not have come across the
notion of the cartesian product of two sets X and Y, where
X × Y = {(x, y) : x ∈ X and y ∈ Y}.
Even if they are familiar with this notion, for example, from listing the
event space for two events with a finite discrete set of possible outcomes, they
may not transfer this conceptually to the case of uncountable continuous sets,
or at least require reminding of its application in this context. Indeed, their
own practical experience is much more likely to have them familiar with a
well known subset of R^2, that is, the set of all integer (whole number) valued
grid points, part of which is shown in Figure 4.1, and which constitute
Z × Z = Z^2, where Z is the set of integers.
In the case of R^2 = {(x, y) : x, y ∈ R}, where X = Y = R, the corresponding
cartesian product can be regarded as the set of all points in the cartesian plane
or the set of all position vectors for these points with respect to the origin of
the cartesian plane. Clearly there is a one-to-one correspondence between
points and position vectors with respect to the origin, and matrix notation is
quite useful in this context.


chapter 4
Transformations of the cartesian plane
[Figure 4.1: Intersecting lines indicating a subset of grid points of Z × Z = Z^2, where Z is the set of integers]

Since any u = (x, y) ∈ R^2 can be written as a column matrix $u = \begin{pmatrix} x \\ y \end{pmatrix}$, we can
write $u = x\begin{pmatrix} 1 \\ 0 \end{pmatrix} + y\begin{pmatrix} 0 \\ 1 \end{pmatrix}$; that is, we can write u as a linear combination of
$\begin{pmatrix} 1 \\ 0 \end{pmatrix}$ and $\begin{pmatrix} 0 \\ 1 \end{pmatrix}$.
By the linearity properties, to determine the image of u under T we simply
need to determine the images of $\begin{pmatrix} 1 \\ 0 \end{pmatrix}$ and $\begin{pmatrix} 0 \\ 1 \end{pmatrix}$ under T. The set of vectors
{[1, 0], [0, 1]} or $\left\{\begin{pmatrix} 1 \\ 0 \end{pmatrix}, \begin{pmatrix} 0 \\ 1 \end{pmatrix}\right\}$ is said to be a basis of the vector space they
generate, in this case the set of all coordinate vectors in the cartesian plane, as
any coordinate vector $(x, y) = \begin{pmatrix} x \\ y \end{pmatrix}$ is a linear combination of these two vectors.
This simple but powerful concept underpins much of the work relating to
transformations of the plane, so it is useful to take some time to ensure that
students have a sound grasp of it. In work on vector representations in the
plane, $\left\{\begin{pmatrix} 1 \\ 0 \end{pmatrix}, \begin{pmatrix} 0 \\ 1 \end{pmatrix}\right\}$ corresponds to the unit vectors commonly denoted {i, j}
(see Evans, 2006, Chapter 8).

Let $T\begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} a \\ c \end{pmatrix}$ and $T\begin{pmatrix} 0 \\ 1 \end{pmatrix} = \begin{pmatrix} b \\ d \end{pmatrix}$ for some real numbers a, b, c and d. Then

$T\begin{pmatrix} x \\ y \end{pmatrix} = T\left(x\begin{pmatrix} 1 \\ 0 \end{pmatrix} + y\begin{pmatrix} 0 \\ 1 \end{pmatrix}\right) = xT\begin{pmatrix} 1 \\ 0 \end{pmatrix} + yT\begin{pmatrix} 0 \\ 1 \end{pmatrix} = x\begin{pmatrix} a \\ c \end{pmatrix} + y\begin{pmatrix} b \\ d \end{pmatrix} = \begin{pmatrix} ax + by \\ cx + dy \end{pmatrix} = \begin{pmatrix} a & b \\ c & d \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix}$
Such a linear transformation can be accomplished by a matrix
multiplication, with the matrix determined by the images of the two points
$\begin{pmatrix} 1 \\ 0 \end{pmatrix}$ and $\begin{pmatrix} 0 \\ 1 \end{pmatrix}$. Conversely, any 2 × 2 matrix $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ can be considered a linear
transformation that transforms $\begin{pmatrix} 1 \\ 0 \end{pmatrix}$ to the point $\begin{pmatrix} a \\ c \end{pmatrix}$ and $\begin{pmatrix} 0 \\ 1 \end{pmatrix}$ to the point $\begin{pmatrix} b \\ d \end{pmatrix}$.
Any linear transformation T: R^2 → R^2 can be written in the form
T(u) = Au, where A is a 2 × 2 matrix. If the transformation T is applied to a
region with area s, then the area of the transformed region is equal to the
product of s and the absolute value of the determinant of A, that is |det(A)|s.
Moreover, A can be uniquely determined if the images of two points u and v
are known, where u ≠ kv for any non-zero k ∈ R. If $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ is the matrix
for T, then:

$T\begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} a & b \\ c & d \end{pmatrix}\begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} a \\ c \end{pmatrix}$ and $T\begin{pmatrix} 0 \\ 1 \end{pmatrix} = \begin{pmatrix} a & b \\ c & d \end{pmatrix}\begin{pmatrix} 0 \\ 1 \end{pmatrix} = \begin{pmatrix} b \\ d \end{pmatrix}$
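The fact that the columns of the matrix are the images of the basis vectors, together with the linearity axioms, can be checked numerically. The sketch below assumes the third-party NumPy library is available (any CAS or matrix-capable calculator would serve equally well); the particular matrix is illustrative only.

```python
import numpy as np

# An illustrative 2x2 matrix regarded as the linear transformation T(u) = Au.
A = np.array([[1, 2],
              [4, -3]])

e1 = np.array([1, 0])  # the basis vector (1, 0)
e2 = np.array([0, 1])  # the basis vector (0, 1)

# The images of the basis vectors are exactly the columns of A.
image_e1 = A @ e1
image_e2 = A @ e2
assert (image_e1 == A[:, 0]).all()
assert (image_e2 == A[:, 1]).all()

# Linearity: T(x*e1 + y*e2) = x*T(e1) + y*T(e2) for any point (x, y).
x, y = 2, 5
u = x * e1 + y * e2
assert (A @ u == x * image_e1 + y * image_e2).all()
```

The same check works for any 2 × 2 matrix, which is one way for students to see that the matrix and the transformation carry identical information.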

Example 4.1

Let T be the transformation with matrix $\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$. Then $T\begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 0 \\ 1 \end{pmatrix}$ and $T\begin{pmatrix} 0 \\ 1 \end{pmatrix} = \begin{pmatrix} 1 \\ 0 \end{pmatrix}$.
a Find the image of the point (2, 4).
b Find the image of the point (x, y).
Solution

a To find the image of the point (2, 4), we simply find the matrix
product:

$\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}\begin{pmatrix} 2 \\ 4 \end{pmatrix} = \begin{pmatrix} 4 \\ 2 \end{pmatrix}$


So the point (2, 4) is transformed or mapped to the point (4, 2).


b Since $\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} y \\ x \end{pmatrix}$, any point (x, y) is mapped to the point (y, x).
Geometrically, this transformation corresponds to a reflection in the
line y = x, and can be used to determine inverse relations.
If the points whose images are known are not $\begin{pmatrix} 1 \\ 0 \end{pmatrix}$ and $\begin{pmatrix} 0 \\ 1 \end{pmatrix}$, then pairs of
simultaneous linear equations or inverse matrices are used to find the matrix
for the transformation.
Example 4.2

If T is a linear transformation that transforms (1, 2) to (-1, 1) and (3, 1)
to (0, 1), find the matrix for this transformation.

Solution

Let $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ be the matrix corresponding to this linear transformation.

Then $\begin{pmatrix} a & b \\ c & d \end{pmatrix}\begin{pmatrix} 1 \\ 2 \end{pmatrix} = \begin{pmatrix} -1 \\ 1 \end{pmatrix}$ and $\begin{pmatrix} a & b \\ c & d \end{pmatrix}\begin{pmatrix} 3 \\ 1 \end{pmatrix} = \begin{pmatrix} 0 \\ 1 \end{pmatrix}$.

That is, $a + 2b = -1$ and $3a + b = 0$, and $c + 2d = 1$ and $3c + d = 1$.

Combining the four equations, we have two equations involving
a and b, and two equations involving c and d. For integer values of
a, b, c and d, these can usually be readily, if somewhat tediously, solved
by hand.

$\begin{cases} a + 2b = -1 \\ 3a + b = 0 \end{cases}$ has solution $a = \frac{1}{5}$, $b = -\frac{3}{5}$, and $\begin{cases} c + 2d = 1 \\ 3c + d = 1 \end{cases}$ has

solution $c = \frac{1}{5}$, $d = \frac{2}{5}$, hence the required matrix is $\begin{pmatrix} \frac{1}{5} & -\frac{3}{5} \\ \frac{1}{5} & \frac{2}{5} \end{pmatrix}$.


Alternatively, $\begin{pmatrix} a & b \\ c & d \end{pmatrix}\begin{pmatrix} 1 \\ 2 \end{pmatrix} = \begin{pmatrix} -1 \\ 1 \end{pmatrix}$ and $\begin{pmatrix} a & b \\ c & d \end{pmatrix}\begin{pmatrix} 3 \\ 1 \end{pmatrix} = \begin{pmatrix} 0 \\ 1 \end{pmatrix}$ can be combined

and written as $\begin{pmatrix} a & b \\ c & d \end{pmatrix}\begin{pmatrix} 1 & 3 \\ 2 & 1 \end{pmatrix} = \begin{pmatrix} -1 & 0 \\ 1 & 1 \end{pmatrix}$ and then

$\begin{pmatrix} a & b \\ c & d \end{pmatrix} = \begin{pmatrix} -1 & 0 \\ 1 & 1 \end{pmatrix}\begin{pmatrix} 1 & 3 \\ 2 & 1 \end{pmatrix}^{-1} = \begin{pmatrix} -1 & 0 \\ 1 & 1 \end{pmatrix}\begin{pmatrix} -\frac{1}{5} & \frac{3}{5} \\ \frac{2}{5} & -\frac{1}{5} \end{pmatrix} = \begin{pmatrix} \frac{1}{5} & -\frac{3}{5} \\ \frac{1}{5} & \frac{2}{5} \end{pmatrix}$
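The alternative method above is a one-line computation with a CAS. As a sketch (assuming the SymPy library as the CAS), the matrix of Example 4.2 can be recovered by post-multiplying the matrix of image points by the inverse of the matrix of original points:

```python
from sympy import Matrix, Rational

# Columns of P are the original points (1, 2) and (3, 1);
# columns of B are their images (-1, 1) and (0, 1).
P = Matrix([[1, 3],
            [2, 1]])
B = Matrix([[-1, 0],
            [1, 1]])

# From M*P = B it follows that M = B*P^(-1).
M = B * P.inv()
assert M == Matrix([[Rational(1, 5), Rational(-3, 5)],
                    [Rational(1, 5), Rational(2, 5)]])
```

Working with exact rationals rather than decimals reproduces the hand answer without rounding.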

Example 4.3

The matrix $A = \begin{pmatrix} -1 & 2 \\ 3 & -8 \end{pmatrix}$ transforms the point P(x, y) onto the point
Q(1, -7). Find the coordinates of the point P.

Solution

We have $\begin{pmatrix} -1 & 2 \\ 3 & -8 \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 1 \\ -7 \end{pmatrix}$.

Hence $\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} -1 & 2 \\ 3 & -8 \end{pmatrix}^{-1}\begin{pmatrix} 1 \\ -7 \end{pmatrix} = \begin{pmatrix} -4 & -1 \\ -\frac{3}{2} & -\frac{1}{2} \end{pmatrix}\begin{pmatrix} 1 \\ -7 \end{pmatrix} = \begin{pmatrix} 3 \\ 2 \end{pmatrix}$

So P has coordinates (3, 2).
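A quick check of Example 4.3 with a CAS (sketched here with SymPy, one possible choice) confirms the preimage:

```python
from sympy import Matrix

A = Matrix([[-1, 2],
            [3, -8]])
Q = Matrix([1, -7])   # the given image point

# The original point P is recovered by applying the inverse matrix to Q.
P = A.inv() * Q
assert P == Matrix([3, 2])   # so P has coordinates (3, 2)
```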

Student activity 4.1

a Show that any linear transformation maps the origin to the origin.
b Explain why a linear transformation T with $T\begin{pmatrix} 1 \\ -1 \end{pmatrix} = \begin{pmatrix} 3 \\ -1 \end{pmatrix}$ and $T\begin{pmatrix} -1 \\ 1 \end{pmatrix} = \begin{pmatrix} -3 \\ 1 \end{pmatrix}$ is
not uniquely determined. Find at least two linear transformations that satisfy the
above conditions.
c Find the points that are mapped to the points (1, 0) and (0, 1) by the linear
transformation with matrix $\begin{pmatrix} 4 & 3 \\ 5 & 4 \end{pmatrix}$.


Linear transformation of a straight line
Although a linear transformation acts upon individual points or position
vectors, we usually want to see how a set of points corresponding to a subset
of the plane of interest to us is transformed, particularly curves and figures.
The transformation T: R^2 → R^2 with the rule T(x, y) = (x + y, x - y) can be
written in matrix form, with (x, y) as an original point and (x1, y1) as an
image point:

$\begin{pmatrix} x_1 \\ y_1 \end{pmatrix} = \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix}$, $\begin{pmatrix} x \\ y \end{pmatrix} = \frac{1}{2}\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}\begin{pmatrix} x_1 \\ y_1 \end{pmatrix}$

Hence $x = \frac{1}{2}(x_1 + y_1)$ and $y = \frac{1}{2}(x_1 - y_1)$. Under this transformation, the
image of the graph of the relation with the equation ax + by + c = 0 (which is
a straight line) is the graph of the relation with the equation
$\frac{a}{2}(x_1 + y_1) + \frac{b}{2}(x_1 - y_1) + c = 0$, or $\frac{1}{2}(a + b)x_1 + \frac{1}{2}(a - b)y_1 + c = 0$
(which is also a straight line). In particular, the straight line with equation
2x + 3y - 6 = 0 is transformed onto the straight line with equation
$\frac{5}{2}x - \frac{1}{2}y - 6 = 0$, as shown for part of the original line (that is, a line segment
subset of the original line) and the corresponding image points in Figure 4.2.
[Figure 4.2: Graph of 2x + 3y - 6 = 0, 0 < x < 3, and its image under T(x, y) = (x + y, x - y)]


An equation for any straight line can easily be written in parametric form.
Although not commonly used for straight lines in the cartesian plane (unless
as a simple application of vector kinematics), the parametric form of an
equation for a straight line is simply a vector equation for the line. It gives the
position vector of each point on the line with respect to the origin. In this
sense the coordinates of a point in the plane correspond to its position vector
relative to the origin (0, 0). Parametric forms are very useful in computer
graphic applications.
The following discussion should be developed through an exposition that
connects the graphical picture with the conceptual and symbolic argument.
Consider the straight line that passes through the two distinct points P, with
position vector p = (x1, y1), and Q, with position vector q = (x2, y2). A
direction vector for this line from P to Q is d = q - p = (x2 - x1, y2 - y1), and
the position vector r of any point on the line is given by r = p + td, t ∈ R. That
is, any point on the line that passes through P and Q must be some distance
(a scalar multiple of the length of the directed line segment PQ) along this
line from the point P. This is illustrated in Figure 4.3. If we restrict t to the
interval [0, 1], then we have exactly the directed line segment from P to Q.
As d = q - p, another way of writing this position vector is
r = p + t(q - p) = (1 - t)p + tq. When t = 0, r = p, and when t = 1, r = q, the
endpoints of the line segment. If 0 ≤ t ≤ 1, then the vector r is clearly the
position vector of some point on the directed line segment PQ.
[Figure 4.3: Vector representation of a line through two distinct points in the plane]

For example, consider the line passing through the points P(-1, 2) and
Q(3, 4). Then the position vector r of any point R on the line through P and
Q can be written r = (-1, 2) + t(4, 2), where PQ = (4, 2).
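The endpoint behaviour of the parametric form r = p + t(q - p) is easy to verify numerically. The following sketch (assuming the NumPy library) uses the points P(-1, 2) and Q(3, 4) of the example above:

```python
import numpy as np

p = np.array([-1.0, 2.0])   # position vector of P
q = np.array([3.0, 4.0])    # position vector of Q
d = q - p                   # direction vector PQ = (4, 2)

def r(t):
    """Position vector of the point on the line through P and Q with parameter t."""
    return p + t * d

assert np.allclose(r(0), p)              # t = 0 gives P
assert np.allclose(r(1), q)              # t = 1 gives Q
assert np.allclose(r(0.5), [1.0, 3.0])   # t = 1/2 gives the midpoint of PQ
```

Restricting t to values in [0, 1] traces out exactly the segment from P to Q, which is why this form is so convenient in computer graphics.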


It is natural for students to inquire how this relates to the more common
representation of a straight line by the rule y=mx+c, where c is interpreted
as the y-axis intercept and m represents the slope of the line (the ratio of
relative difference between the y values of two points with respect to their
corresponding difference in x values). A simple parameterisation of
y = mx + c is to let x = t, where t ∈ R; then y = mt + c and the corresponding
vector parametric form is (t, mt + c), t ∈ R. This simple parameterisation can
also be written in matrix form as $\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 0 \\ c \end{pmatrix} + t\begin{pmatrix} 1 \\ m \end{pmatrix}$.
In vector terms, the position vector of any point S on the line is the sum of
the vector $\begin{pmatrix} 0 \\ c \end{pmatrix}$, the position vector of the point (0, c), the y-axis intercept, and
a scalar multiple of the vector $\begin{pmatrix} 1 \\ m \end{pmatrix}$, which gives the direction of the line. Note
that the vector $\begin{pmatrix} 1 \\ m \end{pmatrix}$ has a horizontal component of 1 unit and a vertical
component of m units, as shown in Figure 4.4.

[Figure 4.4: Vector representation of y = mx + c in the plane]

Using matrices and vectors, it can easily be shown that under a linear
transformation with non-singular matrix:
1 The origin is mapped onto itself.
2 The transformation is a one-to-one mapping of R2 onto R2.
3 A straight line (line segment) is mapped onto a straight line (line segment).


4 Any pair of distinct parallel lines is mapped onto another pair of distinct
parallel lines.
5 A straight line that passes through the origin is mapped onto another
straight line that passes through the origin.
If the matrix of the transformation is singular, then any line will be
mapped to a line or point.
There are several methods for determining the equation of the image of a
line under a linear transformation. Some involve the use of the inverse of a
22 matrix, while others do not. The following discussion illustrates three
different approaches. Teachers should encourage students to think about the
mathematical strategies involved, and where various constructs arise in each
case.
Example 4.4

Consider the linear transformation associated with the matrix
$A = \begin{pmatrix} 1 & 2 \\ 4 & -3 \end{pmatrix}$. Find the image of the line with rule y = 2x + 3 under this
transformation.

Solution
Method 1

Since straight lines are transformed onto straight lines under a linear
transformation, we can find the image by finding the image of any two
distinct points on the original straight line. For example, the points (0, 3)
and (1, 5) clearly lie on the straight line with equation y = 2x + 3. The
corresponding image points are given by

$\begin{pmatrix} 1 & 2 \\ 4 & -3 \end{pmatrix}\begin{pmatrix} 0 \\ 3 \end{pmatrix} = \begin{pmatrix} 6 \\ -9 \end{pmatrix}$ and $\begin{pmatrix} 1 & 2 \\ 4 & -3 \end{pmatrix}\begin{pmatrix} 1 \\ 5 \end{pmatrix} = \begin{pmatrix} 11 \\ -11 \end{pmatrix}$

So the image of the line y = 2x + 3 passes through the points (6, -9)
and (11, -11).
Using the general form $y - y_1 = \left(\dfrac{y_2 - y_1}{x_2 - x_1}\right)(x - x_1)$, we get
$y + 9 = -\frac{2}{5}(x - 6)$ or $y = -\frac{1}{5}(2x + 33)$.

Method 2

Use a vector parametric form for the straight line,
$r = \begin{pmatrix} 0 \\ 3 \end{pmatrix} + t\begin{pmatrix} 1 \\ 2 \end{pmatrix} = \begin{pmatrix} t \\ 3 + 2t \end{pmatrix}$, t ∈ R.

Then $T(r) = \begin{pmatrix} 1 & 2 \\ 4 & -3 \end{pmatrix}\begin{pmatrix} t \\ 3 + 2t \end{pmatrix} = \begin{pmatrix} 6 + 5t \\ -9 - 2t \end{pmatrix} = \begin{pmatrix} 6 \\ -9 \end{pmatrix} + t\begin{pmatrix} 5 \\ -2 \end{pmatrix}$. This

corresponds to a line through the point (6, -9) in the direction of $\begin{pmatrix} 5 \\ -2 \end{pmatrix}$,
that is, with slope $-\frac{2}{5}$, and so has cartesian equation $y + 9 = -\frac{2}{5}(x - 6)$,
or $y = -\frac{1}{5}(2x + 33)$, as before.

Method 3

Consider an arbitrary point (x, y) on the line. This is mapped to the
point (x1, y1) where $A\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} x_1 \\ y_1 \end{pmatrix}$, that is, $\begin{pmatrix} 1 & 2 \\ 4 & -3 \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} x_1 \\ y_1 \end{pmatrix}$. This gives
us equations for x1 and y1 in terms of x and y. What we want is to solve
these for x and y in terms of x1 and y1, and substitute into the original
equation to give an equation involving x1 and y1. This can be done easily
by multiplying both sides of the matrix equation by A^{-1}.

$\begin{pmatrix} 1 & 2 \\ 4 & -3 \end{pmatrix}^{-1}\begin{pmatrix} 1 & 2 \\ 4 & -3 \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 1 & 2 \\ 4 & -3 \end{pmatrix}^{-1}\begin{pmatrix} x_1 \\ y_1 \end{pmatrix}$

$\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 1 & 2 \\ 4 & -3 \end{pmatrix}^{-1}\begin{pmatrix} x_1 \\ y_1 \end{pmatrix} = \begin{pmatrix} \frac{3}{11} & \frac{2}{11} \\ \frac{4}{11} & -\frac{1}{11} \end{pmatrix}\begin{pmatrix} x_1 \\ y_1 \end{pmatrix} = \begin{pmatrix} \frac{3x_1 + 2y_1}{11} \\ \frac{4x_1 - y_1}{11} \end{pmatrix}$

Hence y = 2x + 3 becomes $\dfrac{4x_1 - y_1}{11} = 2\left(\dfrac{3x_1 + 2y_1}{11}\right) + 3$, which
simplifies to $5y_1 = -2x_1 - 33$ or $y_1 = -\frac{1}{5}(2x_1 + 33)$, and so the equation
of the transformed line is $y = -\frac{1}{5}(2x + 33)$, as before.

This method can be implemented directly by finding $\begin{pmatrix} x \\ y \end{pmatrix} = A^{-1}\begin{pmatrix} x_1 \\ y_1 \end{pmatrix}$,
substituting into y = mx + c, and then solving for y1 in terms of x1. These
computations can be readily carried out in one step using a CAS. This method


will not work if we cannot find the inverse of the transformation matrix, that
is, if the transformation matrix is singular. In this case the transformation
maps the plane onto a line through the origin or onto the origin itself.
Method 3 is easy to adapt to finding the image of any function, and is the one
we will use in general.
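As noted, a CAS carries out Method 3 in essentially one step. A sketch using SymPy (one possible CAS; the symbols x1 and y1 stand for the image coordinates) reproduces the image of y = 2x + 3 under the matrix A of Example 4.4:

```python
from sympy import symbols, Matrix, Eq, solve, simplify

x1, y1 = symbols('x1 y1')
A = Matrix([[1, 2],
            [4, -3]])

# Express the original coordinates (x, y) in terms of the image coordinates.
x_expr, y_expr = A.inv() * Matrix([x1, y1])

# Substitute into y = 2x + 3 and solve for y1 in terms of x1.
image = solve(Eq(y_expr, 2 * x_expr + 3), y1)[0]

# The image line is y1 = -(2*x1 + 33)/5, as found by hand.
assert simplify(image + (2 * x1 + 33) / 5) == 0
```

The same three lines of algebra work for any invertible transformation matrix and any line, which is exactly why Method 3 generalises so readily.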
Example 4.5

Consider the linear transformation associated with the matrix $A = \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}$.
Under this transformation, what is the image of the line y = 2x + 3?

Solution

We cannot use Method 3 because the matrix A does not have an inverse,
since det(A) = 1 - 1 = 0. We begin in a similar way, and find the image
of the point with coordinates (x, y).

$\begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} x + y \\ x + y \end{pmatrix}$

Since the x1- and y1-coordinates are the same, and x + y is not
constant, the image of the line is y = x. In fact, one can show that the
image of any line other than those of the form y = -x + c is y = x, and
that the image of y = -x + c is the point (c, c), which of course is on the
line y = x. This transformation corresponds to a projection onto the line
y = x. Projections onto lines will not be considered in any detail, since
they do not occur when transforming functions.

Student activity 4.2

a Establish each of the properties of linear transformations (with non-singular
matrices) listed above.
b Find the equation of the image of the line with equation y = 5 - 3x under the linear
transformation with matrix $\begin{pmatrix} 2 & 1 \\ 1 & 3 \end{pmatrix}$.
c Find the equation of the image of the line with equation y = 5 - 3x under the linear
transformation with matrix $\begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}$.
d Find the equations of lines which are mapped to points under the linear
transformation with matrix $\begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}$.
e Find the image of the unit square (that is, the region bounded by line segments
joining vertices (0, 0), (0, 1), (1, 1) and (1, 0)) and the area of this region under the
transformation with matrix $\begin{pmatrix} 2 & 1 \\ 1 & 3 \end{pmatrix}$.

Linear transformation of a curve
To find the image of the graph of y = f(x) or f(x, y) = c under the linear
transformation with matrix A, we can proceed as for a straight line, provided
A has an inverse. Consider an arbitrary point (x, y) on the curve which is the
graph of the function or relation we are interested in. If this is mapped to the
point (x1, y1) where $A\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} x_1 \\ y_1 \end{pmatrix}$, we can equivalently write $\begin{pmatrix} x \\ y \end{pmatrix} = A^{-1}\begin{pmatrix} x_1 \\ y_1 \end{pmatrix}$,
which gives expressions for x and y. These can simply be substituted into
y = f(x) or f(x, y) = c to find the equation of the image function or relation.
Example 4.6

Find the image of the function y = x^2 and the relation 4x^2 + y^2 = 1 under
the linear transformation represented by the matrix $A = \begin{pmatrix} 1 & 2 \\ 4 & -3 \end{pmatrix}$.

Solution

Consider an arbitrary point (x, y) on the curve, which is mapped to the
point (x1, y1) where $\begin{pmatrix} 1 & 2 \\ 4 & -3 \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} x_1 \\ y_1 \end{pmatrix}$. This gives us equations for x1 and
y1 in terms of x and y. What we want is to solve these for x and y in
terms of x1 and y1, and substitute into the original equation of the curve
to give an equation involving x1 and y1. This can be done easily by
premultiplying both sides of the matrix equation by A^{-1}.

$\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 1 & 2 \\ 4 & -3 \end{pmatrix}^{-1}\begin{pmatrix} x_1 \\ y_1 \end{pmatrix} = \begin{pmatrix} \frac{3}{11} & \frac{2}{11} \\ \frac{4}{11} & -\frac{1}{11} \end{pmatrix}\begin{pmatrix} x_1 \\ y_1 \end{pmatrix} = \begin{pmatrix} \frac{3x_1 + 2y_1}{11} \\ \frac{4x_1 - y_1}{11} \end{pmatrix}$


Hence the rule of the image of the function y = x^2 is the relation
$\dfrac{4x_1 - y_1}{11} = \left(\dfrac{3x_1 + 2y_1}{11}\right)^2$. Expanding this expression and replacing x1 by
x and y1 by y gives 44x - 11y = 9x^2 + 12xy + 4y^2. The graphs of both
the original function and the image relation are shown in Figure 4.5.
[Figure 4.5: Graph of the function y = x^2 and its image relation under transformation by the matrix A]

The rule of the image of the relation 4x^2 + y^2 = 1 is the relation
$4\left(\dfrac{3x_1 + 2y_1}{11}\right)^2 + \left(\dfrac{4x_1 - y_1}{11}\right)^2 = 1$. Expanding this expression and
replacing x1 by x and y1 by y, we have 52x^2 + 40xy + 17y^2 = 121. The
graphs of the original curve and its image are shown in Figure 4.6.
[Figure 4.6: Graph of the relation 4x^2 + y^2 = 1 and its image relation under transformation by matrix A]
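The expansions in Example 4.6 are exactly the kind of computation a CAS does in one step. A SymPy sketch for the ellipse, clearing the denominator 11^2 = 121:

```python
from sympy import symbols, Matrix, expand

x1, y1 = symbols('x1 y1')
A = Matrix([[1, 2],
            [4, -3]])

# Original coordinates in terms of image coordinates: (x, y) = A^(-1)(x1, y1).
x_expr, y_expr = A.inv() * Matrix([x1, y1])

# Substitute into 4x^2 + y^2 - 1 = 0 and multiply through by 121.
lhs = expand((4 * x_expr**2 + y_expr**2 - 1) * 121)

# The image relation is 52x^2 + 40xy + 17y^2 = 121.
assert expand(lhs - (52*x1**2 + 40*x1*y1 + 17*y1**2 - 121)) == 0
```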

Student activity 4.3

a Find the image of y = sin(x) under the linear transformation with matrix $\begin{pmatrix} 3 & 5 \\ 1 & 2 \end{pmatrix}$.
b Find the image of y = x^2 under the transformation with matrix $\begin{pmatrix} 3 & 0 \\ 0 & 2 \end{pmatrix}$.
c Find the image of y = x^2 under the transformation with matrix $\begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}$.

Standard types of linear transformations
It is a key part of many senior secondary mathematics curricula to investigate
the effects of certain standard types of transformations on the graphs of
familiar functions and relations. In the following discussion, such an
investigation is carried out with respect to the unit square, defined as the
region bounded by and including the set of four line segments joining (0, 0)
with (1, 0); (1, 0) with (1, 1); (1, 1) with (0, 1); and (0, 1) with (0, 0), and the
graphs of the functions with domain R and rules f(x) = x^2 and g(x) = sin(x)
respectively. Note that if a linear transformation with matrix A maps a region
of area a1 onto a region of area a2, then a2 = |det(A)| a1.
Computer algebra systems can be used to good effect to apply transformations
to points, find algebraic relations corresponding to the transformation of
variables, carry out computation involving compositions of transformations
and their inverses, and draw graphs of original and transformed sets of points
in the plane.
Dilations from the axes
Dilation by a factor k from the x-axis

The transformation matrix is of the form $\begin{pmatrix} 1 & 0 \\ 0 & k \end{pmatrix}$ for k > 0.

Example 4.7

What effect does the transformation matrix $\begin{pmatrix} 1 & 0 \\ 0 & 3 \end{pmatrix}$ have on the unit
square and the graphs of the functions f and g?

Solution

Under this transformation, each point (x, y) is mapped to the point
(x, 3y), and so the corners (vertices) of the unit square (0, 0), (1, 0), (1, 1)


and (0, 1) are mapped to the points (0, 0), (1, 0), (1, 3), (0, 3) respectively.
The square has been stretched vertically, and the resulting rectangle has
area 3 square units, as shown in Figure 4.7.
[Figure 4.7: Graph of the unit square and its image under the transformation with matrix $\begin{pmatrix} 1 & 0 \\ 0 & 3 \end{pmatrix}$ (corresponding to a dilation by factor 3 from the x-axis)]

In general, the point (x, y) is mapped to the point (x1, y1), where
$\begin{pmatrix} x_1 \\ y_1 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 3 \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} x \\ 3y \end{pmatrix}$, and so $x = x_1$ and $y = \frac{y_1}{3}$.
The effect on the graph of y = f(x) is given by substituting for x and y
in y = x^2, which results in $\frac{y_1}{3} = x_1^2$. The corresponding rule for the
transformed function is y = f1(x) = 3x^2. Part of the graph of the original
function and its image function is shown in Figure 4.8.
[Figure 4.8: Graph of part of f(x) = x^2 and its image under the transformation with matrix $\begin{pmatrix} 1 & 0 \\ 0 & 3 \end{pmatrix}$]


The effect on the graph of y = g(x) is given by substituting for x and
y in y = sin(x), which results in $\frac{y_1}{3} = \sin(x_1)$ or $y_1 = 3\sin(x_1)$. The rule
for the transformed function is g1(x) = 3 sin(x).
The graphs of g and g1 are shown in Figure 4.9.

[Figure 4.9: Graph of part of g(x) = sin(x) and its image under the transformation with matrix $\begin{pmatrix} 1 & 0 \\ 0 & 3 \end{pmatrix}$]

In general, dilation by factor k from the x-axis results in y=f(x) being


transformed to y=kf(x).
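This general statement is easy to test numerically: under a dilation by factor k from the x-axis, points on y = x^2 land on y = kx^2. A sketch assuming NumPy:

```python
import numpy as np

k = 3
D = np.array([[1, 0],
              [0, k]])   # dilation by factor k from the x-axis

# Sample points on the graph of y = x^2, one point per column.
xs = np.linspace(-2.0, 2.0, 9)
points = np.vstack([xs, xs**2])

images = D @ points   # transform all the points at once

# Every image point satisfies y = k*x^2, i.e. the graph of y = k*f(x).
assert np.allclose(images[1], k * images[0]**2)
```

Storing one point per column means a single matrix product transforms the whole sample, mirroring how the matrix acts on individual column vectors.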
Dilation by a factor k from the y-axis

The transformation matrix is of the form $\begin{pmatrix} k & 0 \\ 0 & 1 \end{pmatrix}$ for k > 0.

Example 4.8

What effect does the transformation with matrix $\begin{pmatrix} 3 & 0 \\ 0 & 1 \end{pmatrix}$ have on the unit
square and the graphs of the functions f and g?

Solution

Under the transformation, $\begin{pmatrix} 3 & 0 \\ 0 & 1 \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 3x \\ y \end{pmatrix}$, so each point (x, y) is mapped
to the point (3x, y), and so the vertices of the unit square (0, 0), (1, 0),
(1, 1) and (0, 1) are mapped to the points (0, 0), (3, 0), (3, 1), (0, 1)
respectively. The square has been stretched horizontally, and the area of
the resulting rectangle is 3 square units, as shown in Figure 4.10.


[Figure 4.10: Graph of the unit square and its image under the transformation with matrix $\begin{pmatrix} 3 & 0 \\ 0 & 1 \end{pmatrix}$]

In general, the effect on any curve or region is described by
$\begin{pmatrix} x_1 \\ y_1 \end{pmatrix} = \begin{pmatrix} 3 & 0 \\ 0 & 1 \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 3x \\ y \end{pmatrix}$, and so $x = \frac{x_1}{3}$ and $y = y_1$.
The effect on the graph of y = f(x) is given by substituting for x and y
in y = x^2, which results in $y_1 = \left(\frac{x_1}{3}\right)^2 = \frac{x_1^2}{9}$. The corresponding rule for
the transformed function is $y = f_1(x) = \frac{x^2}{9}$. Part of the graph of the
original function and its image function are shown in Figure 4.11.
[Figure 4.11: Graph of part of f(x) = x^2 and its image under the transformation with matrix $\begin{pmatrix} 3 & 0 \\ 0 & 1 \end{pmatrix}$]


The effect on the graph of y = g(x) is given by substituting for x and
y in y = sin(x), which results in $y_1 = \sin\left(\frac{x_1}{3}\right)$.
The rule for the transformed function is $g_1(x) = \sin\left(\frac{x}{3}\right)$. The graphs
of g and g1 are shown in Figure 4.12.

[Figure 4.12: Graph of part of g(x) = sin(x) and its image under the transformation with matrix $\begin{pmatrix} 3 & 0 \\ 0 & 1 \end{pmatrix}$]

In general, dilation by a factor k from the y-axis results in y = f(x) being
transformed to $y = f\left(\dfrac{x}{k}\right)$.
Equal dilations from both axes (scalings)

These transformations are used in scale diagrams and maps. The
transformation matrix is of the form $\begin{pmatrix} k & 0 \\ 0 & k \end{pmatrix}$ for k > 0. This transformation is
the product of a dilation from the x-axis followed by a dilation from the
y-axis, or vice versa. That is, $\begin{pmatrix} k & 0 \\ 0 & k \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & k \end{pmatrix}\begin{pmatrix} k & 0 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} k & 0 \\ 0 & 1 \end{pmatrix}\begin{pmatrix} 1 & 0 \\ 0 & k \end{pmatrix}$.
Example 4.9

What effect does the transformation matrix $\begin{pmatrix} 3 & 0 \\ 0 & 3 \end{pmatrix}$ have on the unit
square and on the graphs of the functions f and g?


Solution

Under the transformation, $\begin{pmatrix} 3 & 0 \\ 0 & 3 \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 3x \\ 3y \end{pmatrix}$. Each point (x, y) is mapped to
the point (3x, 3y), and so the vertices of the unit square (0, 0), (1, 0),
(1, 1) and (0, 1) are mapped to the points (0, 0), (3, 0), (3, 3), (0, 3)
respectively. The square has been scaled by a factor of 3 and the
resultant square has an area of 9 square units, as shown in Figure 4.13.

[Figure 4.13: Graph of the unit square and its image under the transformation with matrix $\begin{pmatrix} 3 & 0 \\ 0 & 3 \end{pmatrix}$]

Since $\begin{pmatrix} x_1 \\ y_1 \end{pmatrix} = \begin{pmatrix} 3 & 0 \\ 0 & 3 \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 3x \\ 3y \end{pmatrix}$, we have $x = \frac{x_1}{3}$ and $y = \frac{y_1}{3}$, so the
graphs of y = x^2 and y = sin(x) are transformed to the graphs of the
functions $\frac{y_1}{3} = \left(\frac{x_1}{3}\right)^2$ and $\frac{y_1}{3} = \sin\left(\frac{x_1}{3}\right)$, or $y = \frac{x^2}{3}$ and $y = 3\sin\left(\frac{x}{3}\right)$
respectively.
In general, dilation by factor k from both axes results in y = f(x) being
transformed to $y = kf\left(\dfrac{x}{k}\right)$.
After a suitable range of specific examples has been explored from first
principles for a variety of functions and relations, students can be led to
consider the general case of a composition of dilations from both axes. This
can be used to naturally and informally introduce the notion of composition
of linear transformations. It also provides an example of a set of matrices for
which multiplication is commutative. Thus, given the transformation $\begin{pmatrix} k_y & 0 \\ 0 & 1 \end{pmatrix}$


for $k_y > 0$ (a dilation from the y-axis) and the transformation $\begin{pmatrix} 1 & 0 \\ 0 & k_x \end{pmatrix}$ for $k_x > 0$
(a dilation from the x-axis), the product

$\begin{pmatrix} k_y & 0 \\ 0 & 1 \end{pmatrix}\begin{pmatrix} 1 & 0 \\ 0 & k_x \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & k_x \end{pmatrix}\begin{pmatrix} k_y & 0 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} k_y & 0 \\ 0 & k_x \end{pmatrix}$

is a composite dilation, and this composition is commutative (it doesn't matter
in which order the transformations are applied, the final image is the same).
After such a composite dilation, y = f(x) becomes $y = k_x f\left(\dfrac{x}{k_y}\right)$.
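The commutativity claim can be confirmed directly; a NumPy sketch with illustrative factors kx = 3 and ky = 2:

```python
import numpy as np

kx, ky = 3.0, 2.0
Dx = np.array([[1, 0],
               [0, kx]])   # dilation by factor kx from the x-axis
Dy = np.array([[ky, 0],
               [0, 1]])    # dilation by factor ky from the y-axis

# The order of composition does not matter: both products equal diag(ky, kx).
assert np.allclose(Dx @ Dy, Dy @ Dx)
assert np.allclose(Dx @ Dy, np.diag([ky, kx]))
```

This is a special property of diagonal matrices; matrix multiplication is not commutative in general, which is worth emphasising to students.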
Reflections in lines through the origin

The matrices $\begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix}$, $\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$ and $\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$ represent (i) a reflection in the y-axis,
(ii) a reflection in the x-axis and (iii) a reflection in the line y = x respectively,
since for any (x, y) we have:

i $\begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} -x \\ y \end{pmatrix}$

ii $\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} x \\ -y \end{pmatrix}$

iii $\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} y \\ x \end{pmatrix}$

That these matrices are indeed the correct representations for the
corresponding transformation can readily be seen by considering a general
transformation matrix $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ and choosing a, b, c and d so that the requisite
transformation of coordinates applies. For example, reflection in the y-axis
maps the point (x, y) to the point (-x, y), or in matrix form $\begin{pmatrix} a & b \\ c & d \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} -x \\ y \end{pmatrix}$.
Thus, as we require ax + by = -x, this can be obtained by having a = -1 and
b = 0. Similarly, as we require cx + dy = y, this is obtained by having c = 0
and d = 1, so the required transformation matrix is $\begin{pmatrix} a & b \\ c & d \end{pmatrix} = \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix}$.
It is a useful exercise for students to similarly produce the other reflection
matrices mentioned above, and apply the same reasoning to the dilation
matrices covered earlier.

Example 4.10

What is the effect of the matrix $\begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix}$ corresponding to a reflection in
the y-axis on the graphs of the functions f and g?

Solution

Under this transformation $\begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} -x \\ y \end{pmatrix} = \begin{pmatrix} x_1 \\ y_1 \end{pmatrix}$, and so $x = -x_1$ and
$y = y_1$. Thus, in general, y = f(x) is transformed to y1 = f(-x1) or
y = f(-x). So y = x^2 is transformed to y = (-x)^2 = x^2 and y = sin(x) is
transformed to y = sin(-x) = -sin(x). When the image of a function or
relation is the same as the original function under a transformation, the
transformation illustrates a symmetry of the function or relation (see
Leigh-Lancaster, 2006). In this case, for y = x^2, the symmetry exhibited
is reflection (mirror) symmetry in the vertical coordinate axis.
Example 4.11

What is the effect of the matrix $\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$ corresponding to a reflection in
the x-axis on the graphs of the functions f and g?

Solution

Under this transformation $\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} x \\ -y \end{pmatrix} = \begin{pmatrix} x_1 \\ y_1 \end{pmatrix}$, and so $x = x_1$ and
$y = -y_1$. Thus, in general, y = f(x) is transformed to -y1 = f(x1) or
y = -f(x). So y = x^2 is transformed to y = -x^2 and y = sin(x) is
transformed to y = -sin(x).
Example 4.12

What is the effect of the matrix $\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$ corresponding to a reflection in
the line y = x on the graphs of the functions f and g?

Solution

Under this transformation $\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} y \\ x \end{pmatrix} = \begin{pmatrix} x_1 \\ y_1 \end{pmatrix}$, and so $x = y_1$ and
$y = x_1$. Thus, in general, y = f(x) is transformed to x1 = f(y1) or x = f(y).
So y = x^2 is transformed to x = y^2 and y = sin(x) is transformed to
x = sin(y).


The transformed functions are no longer functions but relations. In fact each
function and its transformed relation are, by definition, inverse relations. For
any one-to-one function h, the inverse relation is also a one-to-one function.
Such functions are useful in solving equations by hand and using technology,
since, for such a function h, the equation h(x) = k will have the corresponding
solution $x = h^{-1}(k)$.
Reflection in the line y=mx
To find the transformation matrix, we need to find the images of the point A
with coordinates (1, 0) and the point B with coordinates (0, 1) under the
transformation. We will assume in the following that 0 < m < 1. The reader
can adapt the argument for other values of m. Consider the diagram shown in
Figure 4.14.
[Figure 4.14: Finding the image of A(1, 0) under reflection in the line y = mx, where 0 < m < 1]

First we find the image of the point A with coordinates (1, 0). The line through
A perpendicular to the line y = mx cuts the line y = mx at Q and passes
through the point P, where length of PQ = length of AQ. So P is the image of
A after reflection in the line y = mx, and OP is also of length 1 unit. Let θ be
the angle that the line makes with the positive x-axis, so tan(θ) = m. Then the
angle POA = 2θ and so the coordinates of P are (cos(2θ), sin(2θ)) by
definition. Next we find the image of the point B with coordinates (0, 1), as shown
in Figure 4.15.

[Figure 4.15: Finding the image of B(0, 1) under reflection in the line y = mx, where 0 < m < 1]

The line through B perpendicular to y = mx intersects the line y = mx at
R, cuts the x-axis at T, and S is the image of B after reflection in the line
y = mx. Angles ORB and ORS are both right angles, and so, since the sum of
angles in a triangle is 180°, the magnitude of angle TOS = 90° - 2θ. Hence the
coordinates of S are (cos(-(90° - 2θ)), sin(-(90° - 2θ))). Using the compound
angle formulas

cos(A - B) = cos(A)cos(B) + sin(A)sin(B) and
sin(A - B) = sin(A)cos(B) - cos(A)sin(B)

we see that the coordinates of S are (sin(2θ), -cos(2θ)). Hence a reflection in
the line y = mx can be represented by the matrix $\begin{pmatrix} \cos(2\theta) & \sin(2\theta) \\ \sin(2\theta) & -\cos(2\theta) \end{pmatrix}$, where
tan(θ) = m and $-\frac{\pi}{2} < \theta < \frac{\pi}{2}$. In most cases it will not be possible to evaluate
2θ exactly, but $\cos(2\theta) = \dfrac{1 - m^2}{1 + m^2}$ and $\sin(2\theta) = \dfrac{2m}{1 + m^2}$ for a reflection in the
line y = mx.

Example 4.13

The graph of the function y = f(x) is reflected in the line y = 2x. Find the equation of the transformed function.

Solution

Since m = tan(θ) = 2, it follows that θ = arctan(2), sin(2θ) = 0.8 and cos(2θ) = −0.6. Consider

[x₁]   [−0.6  0.8][x]
[y₁] = [ 0.8  0.6][y]
chapter 4
Transformations of the cartesian plane

As reflection transformations are self-inverse, the inverse of a reflection matrix will be the reflection matrix itself (which can be readily verified by using the standard form for the inverse of a 2 × 2 matrix), so

[x]   [−0.6  0.8][x₁]
[y] = [ 0.8  0.6][y₁]

Thus, the equation of the transformed function will be
0.8x + 0.6y = f(−0.6x + 0.8y).
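The reflection calculation above lends itself to a quick numerical check. The sketch below (Python with NumPy; the helper name `reflection_matrix` is our own) builds the matrix for reflection in y = mx from the identities cos(2θ) = (1 − m²)/(1 + m²) and sin(2θ) = 2m/(1 + m²), and confirms that the matrix for m = 2 is self-inverse and fixes points on the mirror line.

```python
import numpy as np

def reflection_matrix(m):
    # Reflection in the line y = m*x, using
    # cos(2*theta) = (1 - m**2)/(1 + m**2) and sin(2*theta) = 2*m/(1 + m**2)
    c2 = (1 - m**2) / (1 + m**2)
    s2 = 2 * m / (1 + m**2)
    return np.array([[c2, s2],
                     [s2, -c2]])

M = reflection_matrix(2)            # the matrix of Example 4.13
print(M)                            # [[-0.6, 0.8], [0.8, 0.6]]
print(M @ M)                        # self-inverse: the product is the identity
print(M @ np.array([1, 2]))         # (1, 2) lies on y = 2x, so it is fixed
```

The same function reproduces any of the reflection matrices in this section once m is known.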
Rotations about the origin
What is the matrix representing rotation about the origin in the anticlockwise direction through an angle θ, as shown in Figure 4.16?
Figure 4.16: Rotation of points (1, 0) and (0, 1) anticlockwise about the origin through an angle θ > 0

The point (1, 0) is rotated to point P with coordinates (cos(θ), sin(θ)) and the point (0, 1) is rotated to the point Q with coordinates (cos(90° + θ), sin(90° + θ)) = (−sin(θ), cos(θ)). Hence the matrix corresponding to an anticlockwise rotation about the origin through the angle θ is:

[cos(θ)  −sin(θ)]
[sin(θ)   cos(θ)]

This result can be used to establish the compound angle formulas used in the previous section, by considering a rotation through an angle of θ₁ + θ₂ both as a single rotation through an angle of θ₁ + θ₂ and as a composition of two rotations, one through an angle of θ₁ followed by another through an angle of θ₂. By definition, the two corresponding matrices must be equal, so the corresponding elements give the required identities (see Leigh-Lancaster, 2006, pp. 66–8).
Example 4.14

Find the resultant function when the graph of y = f(x) is rotated anticlockwise about the origin through an angle of 60°. Find the image for the particular case when f(x) = 2x.

Solution

The matrix corresponding to an anticlockwise rotation of 60° about the origin is

[cos(60°)  −sin(60°)]
[sin(60°)   cos(60°)]

Then

[x₁]   [cos(60°)  −sin(60°)][x]
[y₁] = [sin(60°)   cos(60°)][y]

so

[x]   [cos(60°)  −sin(60°)]⁻¹ [x₁]   [ cos(60°)  sin(60°)][x₁]
[y] = [sin(60°)   cos(60°)]   [y₁] = [−sin(60°)  cos(60°)][y₁]

and y = f(x) is transformed to

−x sin(60°) + y cos(60°) = f(x cos(60°) + y sin(60°)).

Using the known exact surd values for sin(60°) and cos(60°) we obtain −x√3/2 + y/2 = f(x/2 + y√3/2). For the particular case f(x) = 2x, the image will be −x√3/2 + y/2 = 2(x/2 + y√3/2), which can be rearranged to give y = −x(5√3/11 + 8/11).
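The image found in Example 4.14 can be checked numerically by rotating sample points of y = 2x through 60° and testing them against the claimed image line; a sketch (Python with NumPy, variable names our own):

```python
import math
import numpy as np

theta = math.radians(60)
R = np.array([[math.cos(theta), -math.sin(theta)],
              [math.sin(theta),  math.cos(theta)]])   # anticlockwise rotation by 60 degrees

image_slope = -(5 * math.sqrt(3) / 11 + 8 / 11)       # gradient claimed in Example 4.14

for x in (-2.0, 0.5, 3.0):
    x1, y1 = R @ np.array([x, 2 * x])                 # rotate a point on y = 2x
    assert math.isclose(y1, image_slope * x1, abs_tol=1e-12)
print("rotated points lie on y = -x(5*sqrt(3)/11 + 8/11)")
```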
Students should be encouraged to explore these special cases:

1 When θ = 90° the rotation matrix becomes

[0  −1]
[1   0]

2 When θ = 180° the rotation matrix becomes

[−1   0]
[ 0  −1]
Composition of linear transformations
When several transformations are applied in a sequence, the matrix of the resulting transformation is simply the product of the corresponding matrices applied from right to left. If [x; y] is the coordinate vector of any point in the plane, and the linear transformation S is applied to [x; y], then the resultant image coordinates will be given by [x′; y′] = S[x; y]. If a second transformation T is applied to this, then the resultant image coordinates of the combined transformations will be given by T[x′; y′] = T(S[x; y]) = (TS)[x; y]. The process is likewise repeated if further linear transformations are applied.
Example 4.15

Find the matrix of the transformation consisting of a reflection in the y-axis followed by an anticlockwise rotation about the origin through an angle of 45°.

Solution

The matrix for the first transformation is

[−1  0]
[ 0  1]

while the matrix for the second transformation is

[cos(45°)  −sin(45°)]   [1/√2  −1/√2]
[sin(45°)   cos(45°)] = [1/√2   1/√2]

To obtain the image of each point (x, y) in the plane under the given sequence of these two transformations we use

[1/√2  −1/√2][−1  0][x]
[1/√2   1/√2][ 0  1][y]

So the matrix acting upon each point is the product of the rotation matrix and the reflection matrix:

    [1/√2  −1/√2][−1  0]   [−1/√2  −1/√2]
A = [1/√2   1/√2][ 0  1] = [−1/√2   1/√2]

If the sequence of application of these two transformations is reversed, that is, we seek to find the matrix of the transformation consisting of a rotation about the origin through an angle of 45° followed by a reflection in the y-axis, then the combined transformation matrix will be

    [−1  0][1/√2  −1/√2]   [−1/√2   1/√2]
A = [ 0  1][1/√2   1/√2] = [ 1/√2   1/√2]

which is not the same as the previous combined transformation matrix. The order of the transformations is important for the composition of these two transformations, unlike the case for composition of two rotations about the origin.

In general, the matrices corresponding to the linear transformations should be multiplied from right to left, matching the sequence of application of the transformations. Knowledge of the geometry of the transformations involved will provide insight into whether the composition of transformations in reverse order yields the same result as the composition of transformations in the original order, and consequently whether the matrix product is commutative. In general, as matrix multiplication is not commutative, the composition of transformations will not be the same when their sequence is reversed unless this is a consequence of the nature of the transformations involved.
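This order-dependence is quick to confirm by computing both products; a sketch in Python with NumPy:

```python
import numpy as np

F = np.array([[-1, 0],
              [ 0, 1]])               # reflection in the y-axis
c = 1 / np.sqrt(2)
R = np.array([[c, -c],
              [c,  c]])               # anticlockwise rotation through 45 degrees

A1 = R @ F                            # reflect first, then rotate
A2 = F @ R                            # rotate first, then reflect
print(A1)
print(A2)
print(np.allclose(A1, A2))            # False: these compositions differ
```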

Student activity 4.4

a Find the coordinates of the images of the vertices of the unit square under the transformation with matrix

[ky  0]
[0  kx]

and show that its area is given by kxky.

b Show that the rule of the image of y = f(x) under the transformation with matrix

[ky  0]
[0  kx]

where kx ≠ 0 and ky ≠ 0, is y = kx f(x/ky).

c Find the image of the relation x² + y² = 1 (the unit circle) under the transformation with matrix

[ky  0]
[0  kx]

where ky = a and kx = b and a ≠ 0 and b ≠ 0. Hence find the formula for the area of the ellipse with horizontal axis length 2a and vertical axis length 2b.

d Find the transformation matrix for an anticlockwise rotation about the origin through an angle α followed by an anticlockwise rotation about the origin through an angle β. Use the transformation matrix for an anticlockwise rotation about the origin through an angle α + β to show:
i sin(α + β) = sin(α)cos(β) + cos(α)sin(β)
ii cos(α + β) = cos(α)cos(β) − sin(α)sin(β)

e The matrix

[1  k]
[0  1]

represents a shear transformation in the x direction and the matrix

[1  0]
[k  1]

represents a shear transformation in the y direction. Find the image of the unit square under the shear transformation with matrix

[1  3]
[0  1]

and draw the original and its image on the same graph.

f Find the image of the function y = f(x) under the shear transformation with matrix

[1  3]
[0  1]

In particular, find the image of y = x² under this transformation.
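As a starting point for exploring the shear in part e, the vertex images are a single matrix product; the sketch below (Python with NumPy) lists the unit-square vertices as columns and applies the shear:

```python
import numpy as np

S = np.array([[1, 3],
              [0, 1]])                 # shear in the x direction with k = 3

square = np.array([[0, 1, 1, 0],       # x-coordinates of the unit square's vertices
                   [0, 0, 1, 1]])      # y-coordinates, one vertex per column

print(S @ square)                      # image vertices: (0,0), (1,0), (4,1), (3,1)
```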

Affine transformations
A function T: R² → R² is called an affine transformation if T(u) = Au + B, where u is an ordered pair (position vector) corresponding to a point in the plane, A is a 2 × 2 matrix and B is a fixed element of R². A is called the matrix for T, and B the constant vector. It follows that every linear transformation is an affine transformation, with B = (0, 0) = [0; 0]. However, affine transformations also include translations parallel to the coordinate axes, and combinations of these translations. With some discussion similar to the earlier case for linear transformations, it will be apparent to students that affine transformations also transform straight lines to straight lines and line segments to line segments. However, unlike linear transformations, they do not necessarily transform the origin (0, 0) = [0; 0] to itself, or straight lines through the origin to straight lines through the origin. Teachers may wish to explore some simple examples with students to establish this observation, for example, finding the image of y = x under the affine transformation T([x; y]) = [1 0; 0 2][x; y] + [0; 1], and the natural generalisation.
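That exploration is easy to script; the following sketch (Python with NumPy, function name ours) applies T([x; y]) = [1 0; 0 2][x; y] + [0; 1] to points on y = x and to the origin, showing that the image line misses the origin.

```python
import numpy as np

A = np.array([[1, 0],
              [0, 2]])
B = np.array([0, 1])

def T(u):
    # Affine transformation T(u) = A u + B
    return A @ u + B

for x in (0.0, 1.0, 2.0):
    print(T(np.array([x, x])))     # images all satisfy y = 2x + 1

print(T(np.array([0.0, 0.0])))     # [0. 1.]: the origin is not mapped to itself
```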
Translations parallel to the axes

To translate a point p units in the positive direction of the x-axis, we would use T([x; y]) = I[x; y] + [p; 0], where I is the identity matrix of order 2. To translate a point q units in the positive direction of the y-axis, we would use T([x; y]) = I[x; y] + [0; q]. We can form the composite of these transformations, to translate a point p units in the positive direction of the x-axis and q units in the positive direction of the y-axis, by T([x; y]) = I[x; y] + [p; q]. In this last case, T([x; y]) = I[x; y] + [p; q] = [x₁; y₁], so [x; y] = [x₁; y₁] − [p; q] = [x₁ − p; y₁ − q], and y = f(x) will be transformed to y − q = f(x − p).
Example 4.16

Find the image of the point (2, 6), the line y = 3x and the parabola y = x² under a translation by three units in the x-direction and two units in the y-direction.

Solution

T([x; y]) = [1 0; 0 1][x; y] + [3; 2]

so T([2; 6]) = [1 0; 0 1][2; 6] + [3; 2] = [2; 6] + [3; 2] = [5; 8]

Suppose (x₁, y₁) is the image of the point with coordinates (x, y).

Then T([x; y]) = [1 0; 0 1][x; y] + [3; 2] = [x₁; y₁]

Hence [x; y] = [x₁; y₁] − [3; 2] = [x₁ − 3; y₁ − 2]

Then x = x₁ − 3 and y = y₁ − 2, and the line y = 3x becomes the line y₁ − 2 = 3(x₁ − 3), or y = 3x − 7.

For the case of the parabola y = x², we have y₁ − 2 = (x₁ − 3)², or y = (x − 3)² + 2.
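Each image in Example 4.16 can be verified by pushing sample points through the translation; a Python/NumPy sketch:

```python
import numpy as np

B = np.array([3, 2])                   # 3 units in the x-direction, 2 in the y-direction

def T(u):
    # Translation written as an affine map with the identity matrix
    return np.eye(2) @ u + B

print(T(np.array([2, 6])))             # image of (2, 6) is (5, 8)

x1, y1 = T(np.array([4, 12]))          # (4, 12) lies on y = 3x
print(y1 == 3 * x1 - 7)                # its image lies on y = 3x - 7

x1, y1 = T(np.array([5, 25]))          # (5, 25) lies on y = x**2
print(y1 == (x1 - 3) ** 2 + 2)         # its image lies on y = (x - 3)**2 + 2
```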

Composition of affine transformations
If S and T are affine transformations, then so is the composite transformation T ∘ S defined by T ∘ S(u) = T(S(u)). Here S is applied to u first, and T is then applied to the result. Since u is a 2 × 1 coordinate (vector) matrix, S(u) is also a 2 × 1 coordinate (vector) matrix by conformability of matrix multiplication and addition. The same argument then applies for the application of T to the 2 × 1 coordinate matrix S(u).
Example 4.17

a Find the image of the point u = (x, y) after a reflection in the y-axis followed by a dilation from the x-axis by a factor of 2.
b Find the image of the point u = (x, y) after a dilation from the x-axis by a factor of 2 followed by a reflection in the y-axis.

Solution

a Let S be a reflection in the y-axis, so S has the matrix representation [−1 0; 0 1]; and let T be a dilation from the x-axis by a factor of 2, so T has the matrix representation [1 0; 0 2].

Then T ∘ S(u) = T(S(u)) = [1 0; 0 2][−1 0; 0 1][x; y] = [−x; 2y], where T ∘ S has the matrix representation [1 0; 0 2][−1 0; 0 1] = [−1 0; 0 2].

b In the reverse order of application we have S ∘ T(u) = S(T(u)) = [−1 0; 0 1][1 0; 0 2][x; y] = [−x; 2y], where S ∘ T also has the matrix representation [−1 0; 0 1][1 0; 0 2] = [−1 0; 0 2].

Thus composition of these two transformations is commutative, that is, T ∘ S = S ∘ T.

Students should be able to verify this for themselves by considering the geometric interpretation of both compositions.
Example 4.18

The following sequence of affine transformations is applied to the region bounded by the unit circle {(x, y): x² + y² = 1, −1 ≤ x ≤ 1} to obtain an obliquely oriented ellipse:
1 dilation by a factor of 3 from the y-axis and a factor of 2 from the x-axis
2 rotation through an angle of 45° anticlockwise about the origin
3 translation 1 unit parallel to the x-axis and 1 unit parallel to the y-axis
Find the equation of the resulting ellipse, find its area and then draw the corresponding graph.

Solution

The dilations are applied by multiplying an arbitrary position vector [x; y] by the matrix [3 0; 0 2]. The rotation is then applied by multiplying the result by the matrix [1/√2 −1/√2; 1/√2 1/√2], and the translation is applied by addition of the matrix [1; 1].

Hence, in combination, [x₁; y₁] = [1/√2 −1/√2; 1/√2 1/√2][3 0; 0 2][x; y] + [1; 1].
Reversing the order (that is, applying the inverse transformations) we obtain:

[x₁; y₁] − [1; 1] = [1/√2 −1/√2; 1/√2 1/√2][3 0; 0 2][x; y]

which gives

[x₁ − 1; y₁ − 1] = [1/√2 −1/√2; 1/√2 1/√2][3 0; 0 2][x; y]

and so

[x; y] = [3 0; 0 2]⁻¹[1/√2 −1/√2; 1/√2 1/√2]⁻¹[x₁ − 1; y₁ − 1] = [√2(x₁ + y₁ − 2)/6; √2(y₁ − x₁)/4].

Substituting these values for x and y into the equation for the unit circle x² + y² = 1 yields the relation:

13x² − 2x(5y + 8) + 13y² − 16y = 56

The area of the original region is π square units.

The area of the transformed region is

π × det[3 0; 0 2] × det[1/√2 −1/√2; 1/√2 1/√2] = 6π square units.

The graphs of both the original and the transformed relations are shown in Figure 4.17.

Figure 4.17: Composition of affine transformation from unit circle to an oblique translated ellipse

The processes of finding the rule of an (affine) transformed function or relation can be carried out readily using CAS, and the corresponding functions or relations graphed.
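For instance, Example 4.18 can be checked numerically: map points of the unit circle through the dilation, rotation and translation, and test that the images satisfy the derived relation. A sketch in Python with NumPy (not the book's own CAS commands):

```python
import math
import numpy as np

D = np.array([[3.0, 0.0],
              [0.0, 2.0]])                        # dilations of Example 4.18
c = 1 / math.sqrt(2)
R = np.array([[c, -c],
              [c,  c]])                           # anticlockwise rotation by 45 degrees
B = np.array([1.0, 1.0])                          # translation by (1, 1)

for t in np.linspace(0.0, 2 * math.pi, 12):
    u = np.array([math.cos(t), math.sin(t)])      # a point on the unit circle
    x, y = R @ (D @ u) + B                        # its image on the ellipse
    lhs = 13 * x**2 - 2 * x * (5 * y + 8) + 13 * y**2 - 16 * y
    assert math.isclose(lhs, 56, abs_tol=1e-9)

print("all images satisfy 13x^2 - 2x(5y + 8) + 13y^2 - 16y = 56")
```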

Student activity 4.5

a The graph of the function with rule f(x) = 1/x is transformed as follows:
1 a dilation by a factor of 0.5 from the y-axis
2 a reflection in the y-axis
3 a translation of 3 units parallel to the x-axis and a translation of 1 unit parallel to the y-axis.
Use matrices to find the rule of the transformed function.

b The graph of the relation x − y² = 0 is reflected in the x-axis and then translated 2 units to the right and 1 unit down. Find the rule of the relation for the transformed graph.

SUMMARY

• Matrices provide a natural representation for coordinate (position) vectors in the cartesian plane, where (x, y) = [x; y] = x[1; 0] + y[0; 1].
• The matrices (vectors) [1; 0] and [0; 1] are said to form a basis for the coordinate vectors of the cartesian plane, since any coordinate vector can be written as a linear combination of these two vectors.
• A linear transformation of the cartesian plane is a function T: R² → R², where T(x, y) = (ax + by, cx + dy) and a, b, c and d are real numbers. In matrix form this can be represented by:
  [a b; c d][x; y] = [ax + by; cx + dy]
• Linear transformations (with non-singular matrices) map straight lines onto straight lines and preserve the parallel relation between straight lines.
• The image of the origin under a linear transformation is itself, and a straight line passing through the origin is mapped onto another straight line passing through the origin (provided the transformation matrix is non-singular).
• If the images of [1; 0] and [0; 1] under a linear transformation T are [a; c] and [b; d] respectively, then T([x; y]) = [a b; c d][x; y].
• If the images of [p; q] and [r; s] under a linear transformation T with matrix A are [a; c] and [b; d] respectively, then A[p r; q s] = [a b; c d] and A = [a b; c d] × [p r; q s]⁻¹ if the inverse matrix exists.
• If P, with position vector p = (x₁, y₁), and Q, with position vector q = (x₂, y₂), are two distinct points, and d = q − p = (x₂ − x₁, y₂ − y₁), then the position vector r of any point on the line that contains P and Q is given by r = p + td, t ∈ R.


SUMMARY (Cont.)

• The linear function with rule y = mx + c can be written in matrix (vector) parametric form as [x; y] = [0; c] + t[1; m], where t ∈ R.
• To find the image of the graph of y = f(x) or f(x, y) = c under the linear transformation with matrix A, we substitute [x; y] = A⁻¹[x₁; y₁] into y = f(x) or f(x, y) = c, if A⁻¹ exists.
• [ky 0; 0 kx], where ky and kx are positive real numbers, represents a dilation by a factor ky from the y-axis and a dilation by a factor kx from the x-axis, in either order; [cos(2θ) sin(2θ); sin(2θ) −cos(2θ)] represents reflection in the line y = mx, where m = tan(θ); [cos(θ) −sin(θ); sin(θ) cos(θ)] represents a rotation through an angle θ anticlockwise about the origin; and [a; b] represents translation by the vector (a, b) (a units in the x-direction and b units in the y-direction when a > 0 and b > 0).
• An affine transformation of the cartesian plane is a function T: R² → R² such that T(u) = Au + B. In matrix form this can be represented by [a b; c d][x; y] + [e; f] = [ax + by + e; cx + dy + f]. Affine transformations (where A⁻¹ exists) map straight lines onto straight lines and preserve the parallel relation between straight lines; however, lines through the origin are not necessarily mapped to other lines through the origin.
• If S and T are affine transformations, then so is the composite transformation T ∘ S defined by T ∘ S(u) = T(S(u)). The composition may, or may not, be commutative. This may be determined by computation of the respective matrices, or by geometric interpretation of the transformations involved.


SUMMARY (Cont.)

• When a translation is part of an affine transformation, it may occur before or after composition with other linear transformations.
  – If the translation component occurs last, then the affine transformation has the form X₁ = AX + B, so X = A⁻¹(X₁ − B) will determine x and y in terms of x₁ and y₁ for substitution into the rule of the original function or relation, to determine the rule of the transformed relation or function.
  – If the translation component occurs first, then the affine transformation has the form X₁ = A(X + B), so X = A⁻¹X₁ − B will determine x and y in terms of x₁ and y₁ for substitution into the rule of the original function or relation, to determine the rule of the transformed function or relation.
References
Anton, H & Rorres, C 2005, Elementary linear algebra (applications version), 9th
edn, John Wiley and Sons, New York.
Cirrito, F (ed.) 1999, Mathematics higher level (core), 2nd edn, IBID Press, Victoria.
Evans, L 2006, Complex numbers and vectors, ACER Press, Camberwell.
Leigh-Lancaster, D 2006, Functional equations, ACER Press, Camberwell.
Nicholson, KW 2003, Linear algebra with applications, 4th edn, McGraw-Hill
Ryerson, Whitby, ON.
Sadler, AJ & Thorning, DWS 1996, Understanding pure mathematics, Oxford
University Press, Oxford.

Websites
http://wims.unice.fr/wims/en_tool~linear~matrix.html WIMS
This website provides a matrix calculator.
http://en.wikipedia.org/wiki/Linear_transformation Wikipedia
This site provides a comprehensive discussion on linear transformations with
links to other resources and references.
http://en.wikipedia.org/wiki/Affine_transformation Wikipedia
This site provides a comprehensive discussion on affine transformations with
links to other resources and references.
http://www.ies.co.jp/math/java/misc/don_trans/don_trans.html
This website contains an applet that shows how the shape of a dog is transformed by a 2 × 2 matrix.
http://merganser.math.gvsu.edu/david/linear/linear.html
This website contains an applet that allows you to move a slider adjusting coefficients in a 2 × 2 matrix and see the effect of the equivalent transformation on the unit square.


Chapter 5

Transition matrices
Conditional probability
One of the key ideas that students come across early in their study of probabilities related to compound events for a given event space is the notion of conditional probability and the associated ideas of dependent and independent events. While students' experience with subjective probability makes them well acquainted with events that may or may not be dependent, such as the likelihood of scoring on the second of two free shots in a basketball game, given that one may or may not have scored on the first shot, their school study of probability often begins with compound events that are (physically) independent events. On the other hand, a combination of experience and knowledge indicates that certain events are dependent, for example, gender and colour blindness to red and green, or having a disease and the likelihood of testing positive for the disease. Whether events in a given context are independent or not is not always clear. Thus, experiments with tossing coins and rolling dice, which are physically independent, can lead to an implicit willingness, or preference, to assume that two events from a given event space are independent. One of the more intriguing and counter-intuitive scenarios involving conditional probability is the game show problem called the Monty Hall dilemma (see the UCSD website). This is a good context for stimulating student interest; many people think the events involved are (or should be) independent.
If we consider two events A and B from the same event space, U, then the conditional probability of A given B, that is, the probability that event A occurs given that event B has occurred, written as Pr(A | B), corresponds to the proportion of B events that are also A events, relative to the proportion of B events; that is, Pr(A | B) = Pr(A ∩ B)/Pr(B). Conditional probability can be used to discuss both dependence and independence of events.


If A and B are independent events, then Pr(A | B) will be the same as Pr(A), and so in this case Pr(A | B) = Pr(A ∩ B)/Pr(B) = Pr(A), or Pr(A ∩ B) = Pr(A) × Pr(B). Similarly, Pr(B | A) will be the same as Pr(B), and so in this case Pr(B | A) = Pr(A ∩ B)/Pr(A) = Pr(B) or, as before, Pr(A ∩ B) = Pr(A) × Pr(B).

If Pr(A | B) is different from Pr(A), and likewise Pr(B | A) is different from Pr(B), then A and B are dependent events. Any two events A and B from a given event space may or may not be dependent; however, the following relationships (sometimes called the law of total probability for two events) hold irrespective of whether this is the case or not:

Pr(A) = Pr(A | B) × Pr(B) + Pr(A | B′) × Pr(B′)
Pr(B) = Pr(B | A) × Pr(A) + Pr(B | A′) × Pr(A′)

where A′ and B′ are the complements of A and B in U respectively.

In practice, these relationships can be represented using a Venn diagram or Karnaugh map, a tree diagram, or by matrices. It is important to ensure that students are familiar with each of these representations and their use, to assist in solving problems related to probabilities associated with simple compound events.
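A short numeric illustration of the law of total probability (the probabilities below are invented for illustration, not taken from the text):

```python
# Hypothetical values, chosen only to illustrate the identity
pr_B = 0.3              # Pr(B)
pr_A_given_B = 0.9      # Pr(A | B)
pr_A_given_notB = 0.2   # Pr(A | B')

# Law of total probability: Pr(A) = Pr(A|B)Pr(B) + Pr(A|B')Pr(B')
pr_A = pr_A_given_B * pr_B + pr_A_given_notB * (1 - pr_B)
print(pr_A)             # 0.9*0.3 + 0.2*0.7 = 0.41 (up to floating point)
```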

Transition probabilities
Many natural systems undergo a process of change where at any time the
system can be in one of a finite number of distinct states. For example, the
weather in a city could be sunny, cloudy and fine, or rainy. Such a system
changes with time from one state to another, and at scheduled times, or stages,
the state of the system is observed. At any given stage, the state to which it
changes at the next stage cannot be determined with certainty, but the
probability that a given state will occur next can be specified by knowing the
current state of the system. That is, the probability that the system will be in
a given state next is conditional only on the current state. Such a process of
change is called a Markov chain or Markov process. The conditional
probabilities involved (that is, the probabilities of going to one state given that
the system was in a certain state) are called transition probabilities, and the
process can be modelled using matrices.


We begin by considering the following example.


Jane always does the weekly shopping at one of two stores, A and B. She
never shops at A twice in a row. However, if she shops at B, she is three times
as likely to shop at B the next time as at A. Suppose that she initially shops
at A.
This is an example of a Markov chain, since the store at which she shops
next depends only on the store she shopped at the week before, and the
conditional probabilities for each possible outcome are the same on each
occasion. There are two states, state 1 which corresponds to shopping at
store A, and state 2 which corresponds to shopping at store B. We can
represent this by a tree diagram, as shown in Figure 5.1 and set up a
corresponding table of transition probabilities, as shown in Table 5.1.
Figure 5.1: Tree diagram representation for the first transition

Table 5.1: Summary of transition probabilities

                          Present week's store
                            A        B
Next week's store    A      0        0.25
                     B      1        0.75

Note that the columns of the table sum to 1. The store at which Jane shops in a given week is not determined. The most we can expect to know is the probability that she will shop at A or B in that week. Let s1(m) denote the probability that she shops at A in week m, and s2(m) the probability that she shops at B in week m.

Example 5.1

Use the law of total probability to find the probability that Jane shops at store A in week 1 and the probability that Jane shops at store B in week 1, given that she shops at store A initially.

Solution

As she shops at A initially, s1(0) = 1 and s2(0) = 0. For the next week (using the law of total probability):

s1(1) = Pr(shops at A in week 1 | shopped at A in week 0) × Pr(shopped at A in week 0)
      + Pr(shops at A in week 1 | shopped at B in week 0) × Pr(shopped at B in week 0)
      = 0 × 1 + 0.25 × 0 = 0

s2(1) = Pr(shops at B in week 1 | shopped at A in week 0) × Pr(shopped at A in week 0)
      + Pr(shops at B in week 1 | shopped at B in week 0) × Pr(shopped at B in week 0)
      = 1 × 1 + 0.75 × 0 = 1

If we write S0 = [s1(0); s2(0)] and S1 = [s1(1); s2(1)], then, in general, Sm = [s1(m); s2(m)] is called the state vector for week m, since it gives the probabilities of being in any state after m weeks (or transitions). It is convenient to let S0 correspond to the initial week. Since the definition of matrix multiplication corresponds naturally to the calculations we wish to carry out for this purpose, these calculations can be written in matrix form as follows:

S1 = [s1(1); s2(1)] = [0 0.25; 1 0.75][1; 0] = PS0 = [0; 1]

where P = [0 0.25; 1 0.75] is the matrix of transition probabilities, and is called the transition matrix.

Teachers should take care to ensure that students follow the modelling process and make the conceptual connections between the law of total probability, its application to the transition state problem, and the subsequent representation using matrices and their products.
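The same calculation takes one line per week once P and S0 are stored as arrays; a Python/NumPy sketch:

```python
import numpy as np

P = np.array([[0.0, 0.25],
              [1.0, 0.75]])       # transition matrix: columns correspond to the current store
S0 = np.array([1.0, 0.0])         # Jane shops at A initially

S1 = P @ S0
print(S1)                         # [0. 1.]: certain to be at B in week 1
print(P @ S1)                     # [0.25 0.75]: the week-2 state vector
```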


Hence, from S1 we can see that, given that she shopped at A in week 0, the probability that she shops at A in week 1 is 0 and the probability that she shops at B in week 1 is 1. What happens two weeks after shopping at A initially?

If we use a tree diagram representation, as shown in Figure 5.2, we can calculate the corresponding probabilities for the first two transitions.

Figure 5.2: Tree diagram representation for the first two transitions

s1(2) = Pr(shops at A in week 2 | shopped at A in week 1) × Pr(shopped at A in week 1)
      + Pr(shops at A in week 2 | shopped at B in week 1) × Pr(shopped at B in week 1)
      = 0 × 0 + 0.25 × 1 = 0.25

s2(2) = Pr(shops at B in week 2 | shopped at A in week 1) × Pr(shopped at A in week 1)
      + Pr(shops at B in week 2 | shopped at B in week 1) × Pr(shopped at B in week 1)
      = 1 × 0 + 0.75 × 1 = 0.75

In matrix terms, this is equivalent to

S2 = [s1(2); s2(2)] = [0 0.25; 1 0.75][0; 1] = PS1 = [0.25; 0.75].

Hence, given that Jane shopped at A in week 0, the probability that she shops at A in week 2 is 0.25 and the probability that she shops at B in week 2 is 0.75.

Moreover, S2 = PS1 = P(PS0) = P²S0 = [0.25 0.1875; 0.75 0.8125][1; 0] = [0.25; 0.75] (the first column of P²).

This can be extended to week 3; however, the process of calculation using tree diagrams becomes increasingly time and space consuming, whereas the matrix form offers a much more convenient representation for these calculations, where S3 = PS2 = P³S0 and, in general, for m weeks later:

Sm = PSm−1 = PᵐS0

Where many transitions may take place, the relevant matrix calculations are best done by technology, using, for example, a CAS.
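For example, NumPy's matrix_power computes Pᵐ directly, making the long-run behaviour easy to observe (a sketch, not the book's own CAS commands):

```python
import numpy as np

P = np.array([[0.0, 0.25],
              [1.0, 0.75]])
S0 = np.array([1.0, 0.0])                       # initially at store A

for m in (3, 4, 5, 10, 50):
    Sm = np.linalg.matrix_power(P, m) @ S0      # Sm = P**m applied to S0
    print(m, Sm)                                # approaches [0.2, 0.8]
```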
Example 5.2

Find the probabilities that Jane shops at A (i) 3, (ii) 4, (iii) 5, (iv) 10, (v) 50 and (vi) 100 weeks later.

Solution

i S3 = PS2 = [0 0.25; 1 0.75][0.25; 0.75] = [0.1875; 0.8125], and so the probability that she shops at A three weeks later is 0.1875.
ii S4 = PS3 = [0 0.25; 1 0.75][0.1875; 0.8125] = [0.203125; 0.796875], and so the probability that she shops at A 4 weeks later is 0.203125.
iii S5 = P⁵S0 ≈ [0.199; 0.801], and so the probability that she shops at A 5 weeks later is approximately 0.199.
iv S10 = P¹⁰S0 ≈ [0.200; 0.800], and so the probability that she shops at A 10 weeks later is approximately 0.200.
v S50 = P⁵⁰S0 ≈ [0.200; 0.800], and so the probability that she shops at A 50 weeks later is approximately 0.200.
vi S100 = P¹⁰⁰S0 ≈ [0.200; 0.800], and so the probability that she shops at A 100 weeks later is approximately 0.200. We can note that P¹⁰⁰ ≈ [0.2 0.2; 0.8 0.8]; in fact there seems to have been very little change, at this level of accuracy, for larger values of Pᵐ.

In this example, the state vectors S0, S1, S2, …, Sn appear to converge to S = [0.2; 0.8] as n increases.


If this is indeed the case, then we can say that the long-term probability of Jane shopping at A is 0.2 and shopping at B is 0.8. That is, in the long term she will shop at A 20% of the time and at B 80% of the time, provided that she doesn't change her pattern of behaviour in this regard.

If, instead of shopping at A in week 0, we knew she shopped at B in week 0, then S0 = [0; 1], and the corresponding probabilities are shown in Example 5.3.
Example 5.3

Find the probabilities that Jane shops at A (i) 1, (ii) 2, (iii) 3, (iv) 4, (v) 5, (vi) 10, (vii) 50 and (viii) 100 weeks later, given that she initially shopped at B.

Solution

i S1 = PS0 = [0 0.25; 1 0.75][0; 1] = [0.25; 0.75], and so the probability that she shops at A 1 week later is 0.25. This state vector is the same as that after two transitions if she shops at A initially.
ii S2 = PS1 = [0 0.25; 1 0.75][0.25; 0.75] = [0.1875; 0.8125], and so the probability that she shops at A 2 weeks later is 0.1875.
iii S3 = PS2 = [0 0.25; 1 0.75][0.1875; 0.8125] = [0.203125; 0.796875], and so the probability that she shops at A 3 weeks later is 0.203125.
iv S4 = P⁴S0 ≈ [0.199; 0.801], and so the probability that she shops at A 4 weeks later is approximately 0.199.
v S5 = P⁵S0 ≈ [0.200; 0.800], and so the probability that she shops at A 5 weeks later is approximately 0.200.
vi S10 = P¹⁰S0 ≈ [0.200; 0.800], and so the probability that she shops at A 10 weeks later is approximately 0.200.
vii S50 = P⁵⁰S0 ≈ [0.200; 0.800], and so the probability that she shops at A 50 weeks later is approximately 0.200.
viii S100 = P¹⁰⁰S0 ≈ [0.200; 0.800], and so the probability that she shops at A 100 weeks later is approximately 0.200.

The long-term probabilities are as in the previous case, in which we assumed Jane initially shopped at A. So it appears that, with this pattern of shopping behaviour, in the long term Jane will shop at A in any week with probability 0.2, and at B in any week with probability 0.8, regardless of where she shopped initially.

Table 5.2 summarises the state vectors for five transitions, given that Jane either shopped at A or B initially, as well as the transition matrix raised to the number of transitions, for the first five transitions.
Table 5.2: Summary of state and transition matrices for five transitions from either initial state

Number of    State vector given      State vector given      Pⁿ, where n is the
transitions  she shops at A          she shops at B          number of transitions
             initially               initially
1            [0; 1]                  [0.25; 0.75]            [0 0.25; 1 0.75]
2            [0.25; 0.75]            [0.1875; 0.8125]        [0.25 0.1875; 0.75 0.8125]
3            [0.1875; 0.8125]        [0.203125; 0.796875]    [0.1875 0.203125; 0.8125 0.796875]
4            [0.203125; 0.796875]    [0.199219; 0.800781]    [0.203125 0.199219; 0.796875 0.800781]
5            [0.199219; 0.800781]    [0.200195; 0.799805]    [0.199219 0.200195; 0.800781 0.799805]

We can see from this table that the columns of P^n contain the state vectors
(after n transitions) after initially shopping at A or at B respectively. So the
(i, j)th element of P^n gives the probability of starting in state j and moving to
state i after n transitions.

chapter 5
Transition matrices
Student activity 5.1
a  Suppose initially we are unsure of where Jane shops, but it could be at either A
   or B with equal probability. Then S0 = [0.5; 0.5]. Find S1, S2, S5, S10 and S50.
b  Consider a Markov chain, with two states 1 and 2, with transition probability
   matrix P, with pij = probability of going from state j to state i in one
   transition. Suppose P^3 = [0.445 0.444; 0.555 0.556]. What is the probability of
   i   going from state 1 to state 2 in three transitions?
   ii  going from state 2 to state 2 in three transitions?

The steady-state vector
It appears that in the long term, after many transitions, the state vectors
converge to the same vector regardless of where Jane initially shopped. Such a
vector is called a steady-state vector. If this is the case, how can we find the
steady-state vector for this problem?
We can phrase this question more generally, and suppose that P is the transition
matrix of a Markov chain, and assume that the state vectors S_m converge to a
limiting state vector S. Then S_m is very close to S for sufficiently large m, so
S_{m+1} is also very close to S. Then the equation

S_{m+1} = P S_m

is closely approximated by

S = PS
where S is a solution to this matrix equation. It is easily solved as it can be
written as a system of linear equations in matrix form

(I - P)S = O

where the entries of S are the unknowns and I is the identity matrix. This
homogeneous system has many solutions; the one we are most interested in is the one
whose entries sum to 1.

Example 5.4

Find the steady-state vector S for Jane's shopping.


Solution

I - P = [1 0; 0 1] - [0 0.25; 1 0.75] = [1 -0.25; -1 0.25]

Using Gaussian elimination, this reduces to [1 -0.25; 0 0], and so the solution for
S = [s1; s2] is s1 = 0.25 s2 with s1 + s2 = 1, hence s1 = 0.2, s2 = 0.8.
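A quick numerical cross-check of this hand computation (a sketch of the "take many transitions" approach, not the elimination used above) is to apply P repeatedly until the state vector stops changing:

```python
# Sketch: cross-check the steady-state vector by iterating P until the
# state vector stops changing (within a small tolerance).

def mat_vec(P, s):
    return [sum(p * x for p, x in zip(row, s)) for row in P]

P = [[0.0, 0.25],
     [1.0, 0.75]]

s = [1.0, 0.0]                       # start at A; the limit is the same from B
for _ in range(1000):
    nxt = mat_vec(P, s)
    if max(abs(u - v) for u, v in zip(nxt, s)) < 1e-12:
        break
    s = nxt

print(s)  # approximately [0.2, 0.8]
```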
It is a natural question to ask if we can always find a steady-state vector, and
whether the powers of the transition matrix converge to a matrix whose columns are
equal to the steady-state vector. To answer this we need to introduce the notion of a
regular transition probability matrix.

We say that a transition probability matrix P is regular if, for some positive
integer m, the matrix P^m has no zero entries. It can be shown that, if P is a
regular transition probability matrix, then it has a unique steady-state vector S
(see, for example, the Iowa State maths website). Further, the matrix defined by
L = lim_{m→∞} P^m exists, and is given by L = [S | S | ... | S], that is, a matrix
where each column is a copy of the steady-state vector S, and if L = [lij] then lij
is the long-term probability of being in state i if the system began in state j.
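The definition of regularity translates directly into a machine check. The sketch below (our own helper, with a pragmatic cutoff on the power searched) looks for a power P^m with no zero entries:

```python
# Sketch: decide regularity by searching for a power P^m with no zero
# entries, up to a chosen cutoff max_m.

def mat_mul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def is_regular(P, max_m=50):
    Q = [row[:] for row in P]        # Q runs through P, P^2, ..., P^max_m
    for _ in range(max_m):
        if all(entry > 0 for row in Q for entry in row):
            return True
        Q = mat_mul(Q, P)
    return False

print(is_regular([[0.0, 0.25], [1.0, 0.75]]))  # True  (P^2 is positive)
print(is_regular([[0.0, 1.0], [1.0, 0.0]]))    # False (powers alternate)
```

Note that a False result only shows that no power up to the cutoff is positive; a genuine proof of non-regularity still needs an argument like the ones given in the text.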
It is straightforward to determine the steady-state vector for a 2 × 2 transition
probability matrix, given that it exists. To do this we use a general formulation,
taking a transition probability matrix P = [1-a b; a 1-b], where 0 ≤ a ≤ 1 and
0 ≤ b ≤ 1, and solve (I - P)S = O for S, where I is the 2 × 2 identity matrix. Then

I - P = [1 0; 0 1] - [1-a b; a 1-b] = [a -b; -a b]

which reduces by Gaussian elimination to [a -b; 0 0].

Writing S = [x; y], we have ax = by. There are now two possible cases:

Case 1: If b > 0, then y = ax/b, and as x + y = 1, x + ax/b = 1, and so
x(1 + a/b) = 1 or x(a + b)/b = 1. Since b > 0, x = b/(a + b) and hence y = a/(a + b).


Note that if a = 0, then y = 0 and x = 1, and state 1 is called an absorbing state.
This means that once the system is in state 1 it will never leave it. Note also that
if a = 1 and b = 1, then P = [0 1; 1 0] is not regular, but
[0 1; 1 0][0.5; 0.5] = [0.5; 0.5], and so the state vectors will converge if and
only if the system has initial state vector [0.5; 0.5]; that is, there is no
steady-state vector.

Case 2: If b = 0, then ax = 0, so a = 0 or x = 0. If a = 0, then the transition
matrix is the identity matrix, so the system stays in whatever state it is in
initially. So there is no steady-state vector. If a ≠ 0 and x = 0, then y = 1, and
state 2 is an absorbing state.
In summary, if the transition probability matrix P = [1-a b; a 1-b] is regular, then
the steady-state vector is given by S = [b/(a+b); a/(a+b)].
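The closed form just derived is one line of code. A sketch, using exact rational arithmetic so the answers appear as fractions (for Jane's chain, a = 1 and b = 0.25):

```python
# Sketch: the 2x2 steady-state formula S = [b/(a+b); a/(a+b)],
# for the column-stochastic matrix [[1-a, b], [a, 1-b]].

from fractions import Fraction as F

def steady_state_2x2(a, b):
    if a + b == 0:
        raise ValueError("P is the identity: no unique steady state")
    return [b / (a + b), a / (a + b)]

print(steady_state_2x2(F(1), F(1, 4)))       # Jane's chain: [1/5, 4/5]
print(steady_state_2x2(F(8, 10), F(7, 10)))  # [7/15, 8/15]
```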
Student activity 5.2
a  Show that the transition probability matrix P = [1/2 1; 1/2 0] is regular. Find
   the steady-state vector S and the limit matrix L = lim_{n→∞} P^n.
b  Show that the transition probability matrix P = [0 1; 1 0] is not regular. Does
   the limit matrix L = lim_{m→∞} P^m exist? Does it have a steady-state vector?
c  For 0 < a < 1, show that the matrix P = [a 0; 1-a 1] is not regular. Find a
   steady-state vector and lim_{n→∞} P^n if it exists.

Applications of transition matrices
Examples such as the following, and others from various practical and
research contexts, or from the literature, can be used to help students develop
the formulation, solutions and interpretation skills associated with modelling
and problem solving that employs transition matrices and Markov chains.
Key aspects of these processes are:
- consideration of features of the context that indicate a Markov process is likely
  to provide a suitable model


- identification of relevant states and transition (conditional) probabilities and
  initial state (or states)
- formulation of the transition matrix and initial state vector, and computation of
  relevant powers of the transition matrix and subsequent state vectors
- analysis of long-run behaviour, including investigation and interpretation of
  possible steady-state or other behaviour of the system

While the first few transitions for a two-state system, and its long-run behaviour,
assuming convergence to a steady state, can readily be computed by hand calculation,
student familiarity with the use of a suitable technology such as CAS is required
for computation of higher powers in two-state problems, and problems where there are
more than two states.
Example 5.5

OzBank offers customers two choices of credit card: Ordinary and Gold.
Currently 70% of its customers have an Ordinary card and 30% have a
Gold card. The bank wants to increase the percentage of its customers
with a Gold card, as it gets higher fees from these customers, and so
sends out an offer to all Ordinary cardholders offering a free upgrade to
a Gold card for twelve months. It expects that each month for the next
three months, 10% of its Ordinary cardholders will upgrade to a Gold
card, but 1% of Gold cardholders will downgrade to an Ordinary card.
What percentage of its customers would have Gold cards at the end of
the three months?
Solution

This information can be summarised as a table:

                       Current
One month later        Ordinary     Gold
Ordinary               0.90         0.01
Gold                   0.10         0.99

The corresponding transition matrix is [0.90 0.01; 0.10 0.99], with initial state
vector [0.70; 0.30].


To find the percentages three months later, we calculate

[0.90 0.01; 0.10 0.99]^3 [0.70; 0.30] ≈ [0.52; 0.48].

So after three months the bank could expect approximately 48% of its customers to
have a Gold card.

We can also observe that if the number of customers was fixed during this period at,
say, 1000, then we could use [700; 300] instead of the initial state vector, since
[700; 300] = 1000 × [0.70; 0.30], and use this to calculate the numbers in each
state after each transition. This only works if the total number of objects remains
constant over transitions.
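The three-month calculation can be sketched in a few lines (plain Python; the matrix and initial proportions are those of the example):

```python
# Sketch: the OzBank calculation, P^3 applied to the initial proportions.

def mat_vec(P, s):
    return [sum(p * x for p, x in zip(row, s)) for row in P]

P = [[0.90, 0.01],
     [0.10, 0.99]]

s = [0.70, 0.30]                # proportions: [Ordinary, Gold]
for _ in range(3):              # three monthly transitions
    s = mat_vec(P, s)
print(s)                        # about [0.5203, 0.4797]

counts = [1000 * x for x in s]  # with a fixed base of 1000 customers
```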

Example 5.6

A wombat has its burrow beside a creek and each night it searches for
food on either the east or west side of the creek. The side on which it
searches for food each night depends only on the side on which it
searched the night before. If the wombat searches for food on the east
side one night, then the probability of the wombat searching on the east
side of the creek the next night is 0.2. The transition matrix for the
probabilities of the wombat searching for food on either side of the creek
given the side searched on the previous night is

[0.2 0.7; 0.8 0.3]


a If the wombat searches for food on the west side one night, what is the
probability that it searches for food on the west side the next night?
b If the wombat searches for food on the west side on the Monday
night, what is the probability it searches for food on the west side
again on the following Saturday (5 days later)?
c In the long term, what proportion of nights will it spend searching
for food on the west side?

Solution

a We can view the transition matrix as below. It is clear that if it searches on the
  west side one night then the probability that it searches for food on the west
  side the next night is 0.3.

                           Side searched for food
                           current night
  Side searched for
  food the next night      East         West
  East                     0.2          0.7
  West                     0.8          0.3

b We need to find [0.2 0.7; 0.8 0.3]^5 [0; 1] = [77/160; 83/160] = [0.48125; 0.51875],
  or alternatively, simply calculate
  [0.2 0.7; 0.8 0.3]^5 = [0.45 0.48125; 0.55 0.51875]. The probability the wombat
  searches for food on the west side on the following Saturday is 83/160 = 0.51875.

c Using the formula for the steady-state solution, with a = 0.8 and b = 0.7,
  x = 0.7/(0.7 + 0.8) = 7/15 and y = 0.8/(0.7 + 0.8) = 8/15. Hence in the long term
  the wombat will spend 8/15 of the nights searching for food on the west side.
  Alternatively, we can solve the matrix equation

  ([1 0; 0 1] - [0.2 0.7; 0.8 0.3])[x; y] = [0; 0] for x and y, with x + y = 1.

  The coefficient matrix reduces to [0.8 -0.7; 0 0] using Gaussian elimination
  (replacing row 2 by row 2 + row 1), and so 0.8x - 0.7y = 0. Then
  x = 0.7y/0.8 = 7y/8. As x + y = 1, 7y/8 + y = 1, 15y/8 = 1 and so y = 8/15 and
  x = 7/15. Hence in the long term the wombat will spend 8/15 of the nights
  searching for food on the west side.


  Also, we could take suitably large powers of P and observe whether there is
  significant change in the resultant values or not. Now

  P^20 ≈ [0.466667 0.466667; 0.533333 0.533333]  and
  P^30 ≈ [0.466667 0.466667; 0.533333 0.533333],

  so we appear to have convergence to 6 decimal places, and so the wombat will spend
  approximately 0.466667 (or just under 47%) of the nights searching for food on the
  east side and 0.533333 (or just over 53%) of the nights searching for food on the
  west side.
Example 5.7

A wombat has its burrow beside a creek and each night it searches for
food on either the other side of the creek or north or south of its burrow
on the same side of the creek. The area in which it searches for food each
night depends only on the area in which it searched for food the night
before. If the wombat searches for food on the other side of the creek one
night, then the probabilities of the wombat searching on the other side
of the creek, or north or south of its burrow the next night are 0.2, 0.4
and 0.4 respectively. If the wombat searches for food north of its burrow
one night, then the probability that it will search for food north of its
burrow the next night is 0.1. The transition matrix for the probabilities
of the wombat searching for food in each area given the area searched
for food on the previous night is
[0.2 0.5 0.5]
[0.4 0.1 0.3]
[0.4 0.4 0.2]
a If the wombat searches for food on the south side of its burrow one
night, what is the probability that he searches for food on the north
side the next night?
b If the wombat searches for food on the north side of its burrow on
the Monday night, what is the probability it searches for food on the
north side of its burrow again on the following Saturday (5 days
later)?
c In the long term, what proportion of nights will the wombat spend
searching for food north of its burrow?

Solution

a We can view the transition matrix as below, and it is clear that if it searches
  for food on the south side one night then the probability that it searches for
  food on the north side the next night is 0.3.

                           Side searched for food current night
  Side searched for
  food the next night      Other        North        South
  Other                    0.2          0.5          0.5
  North                    0.4          0.1          0.3
  South                    0.4          0.4          0.2

b We need to find

  [0.2 0.5 0.5; 0.4 0.1 0.3; 0.4 0.4 0.2]^5 [0; 1; 0]
      = [7711/20000; 28101/100000; 1042/3125] ≈ [0.38555; 0.28101; 0.33344],

  and so the probability it searches for food north of its burrow on the following
  Saturday is 28101/100000 = 0.28101.
c Solve

  ([1 0 0; 0 1 0; 0 0 1] - [0.2 0.5 0.5; 0.4 0.1 0.3; 0.4 0.4 0.2])[x; y; z] = [0; 0; 0]

  for x, y and z, with x + y + z = 1.

  [1 0 0; 0 1 0; 0 0 1] - [0.2 0.5 0.5; 0.4 0.1 0.3; 0.4 0.4 0.2]
      = [0.8 -0.5 -0.5; -0.4 0.9 -0.3; -0.4 -0.4 0.8]

  which reduces to [1 0 -15/13; 0 1 -11/13; 0 0 0] by Gaussian elimination.

  Hence we have x = 15z/13 and y = 11z/13 with x + y + z = 1. This gives 3z = 1, so
  z = 1/3, y = 11/39 and x = 5/13. Thus in the long term the wombat will spend 11/39
  (≈ 0.28205) of nights searching for food north of its burrow.


  Alternatively, consider suitably large powers of the transition matrix:

  P^15 ≈ [0.38462 0.38462 0.38462; 0.28205 0.28205 0.28205; 0.33333 0.33333 0.33333]

  and

  P^20 ≈ [0.38462 0.38462 0.38462; 0.28205 0.28205 0.28205; 0.33333 0.33333 0.33333]

  and so we have

  lim_{n→∞} P^n ≈ [0.38462 0.38462 0.38462; 0.28205 0.28205 0.28205; 0.33333 0.33333 0.33333].

  Hence the wombat will spend approximately 0.28205 of nights searching for food
  north of its burrow.
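Both parts can again be machine-checked; the sketch below does part b exactly with fractions, and approximates the long-run proportions by iterating many transitions (states ordered [Other, North, South] as in the table):

```python
# Sketch: the three-area wombat chain; part b exactly, long run by iteration.

from fractions import Fraction as F

def mat_vec(P, s):
    return [sum(p * x for p, x in zip(row, s)) for row in P]

P = [[F(2, 10), F(5, 10), F(5, 10)],   # states ordered [Other, North, South]
     [F(4, 10), F(1, 10), F(3, 10)],
     [F(4, 10), F(4, 10), F(2, 10)]]

s = [F(0), F(1), F(0)]          # Monday night: north of the burrow
for _ in range(5):
    s = mat_vec(P, s)
print(s[1])                     # 28101/100000

t = [F(0), F(1), F(0)]
for _ in range(80):             # long-run behaviour
    t = mat_vec(P, t)
print(float(t[1]))              # approaches 11/39 = 0.28205...
```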
Example 5.8

Consider a simple genetic model, involving just two types of alleles, A


and a, for a gene. Suppose that a physical trait such as eye colour is
controlled by a pair of these genes, one inherited from each parent. An
individual could then have one of three combinations of alleles of the
form AA, Aa or aa. A person may be classified as being in one of three
states:
Dominant (type AA): gene of type A from both parents
Hybrid (type Aa): gene of type A from one parent and gene of type
a from the other parent
Recessive (type aa): gene of type a from both parents
Assume that the gene inherited from a parent is a random choice from the parent's
two genes and that each parent is equally likely to transmit either of its two genes
to an offspring. We can form a
Markov chain by starting with a population and always crossing with
hybrids to produce offspring. The time required to produce a subsequent
generation is the time period for the chain.
What is the corresponding transition matrix? Suppose we start with a person with
dominant trait (type AA) and cross with a person with hybrid trait (type Aa). Type
AA will always contribute A to the offspring, and type Aa will contribute A one half
of the time and a one half of the time. If we start with a hybrid, and cross with a
hybrid, we have the following situation: the first hybrid will contribute either A or


a to the offspring, each with probability one-half. The second hybrid will also
contribute either A or a to the offspring, again each with probability one-half.
Hence we have one-quarter probability of AA, one-quarter probability of aa and
one-half probability of hybrid Aa. The transition matrix is as follows:

        D    H    R
    D [1/2  1/4   0 ]
    H [1/2  1/2  1/2] = P
    R [ 0   1/4  1/2]
a What proportion of the third generation offspring (that is, after two
time periods) of the recessive population has the dominant trait?
b What proportion of the third generation offspring (after two time
periods) of the hybrid population is not hybrid?
c If, initially, the entire population is hybrid, find the population
vector in the next generation.
d If, initially, the population is evenly divided among the three states,
find the population vector in the third generation (after two time
transitions).
e Show that this Markov chain is regular, and find the steady-state
population vector.
Solution

a We need to calculate P^2, and obtain the number in the (1, 3) position.

  P^2 = [3/8 1/4 1/8; 1/2 1/2 1/2; 1/8 1/4 3/8]

  Hence one-eighth of the third generation offspring of the recessive population has
  the dominant trait.
b From P^2, we see that one-half of the third generation offspring of the hybrid
  population has the hybrid trait, while one-quarter has the dominant and
  one-quarter the recessive trait. So one-half is not hybrid.

c Here S0 = [0; 1; 0], and so PS0 = [1/4; 1/2; 1/4], and this is the population
  distribution vector for the next period.
d Here S0 = [1/3; 1/3; 1/3], and so P^2 S0 = [1/4; 1/2; 1/4] is the population
  vector for the third generation. This seems familiar.
e As P^2 has no zero entries, P is regular. If we approximate a suitably high power
  of P, say P^20, we get [0.25 0.25 0.25; 0.5 0.5 0.5; 0.25 0.25 0.25]. So we can
  guess that the steady-state population distribution should be [0.25; 0.5; 0.25].
  This should be checked.

  Begin with I - P:

  [1 0 0; 0 1 0; 0 0 1] - [1/2 1/4 0; 1/2 1/2 1/2; 0 1/4 1/2]
      = [1/2 -1/4 0; -1/2 1/2 -1/2; 0 -1/4 1/2]

  and use technology to obtain the reduced row echelon form:

  [1 0 -1; 0 1 -2; 0 0 0]

  Let S = [x; y; z]. From the above, x = z and y = 2z. Then since x + y + z = 1,
  z = 1/4 and so S = [0.25; 0.5; 0.25].
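The check that PS = S for the guessed vector is immediate in exact arithmetic (a sketch, with states ordered [D, H, R] as in the matrix above):

```python
# Sketch: verify PS = S for the guessed steady state S = [1/4, 1/2, 1/4].

from fractions import Fraction as F

def mat_vec(P, s):
    return [sum(p * x for p, x in zip(row, s)) for row in P]

P = [[F(1, 2), F(1, 4), F(0)],         # states ordered [D, H, R]
     [F(1, 2), F(1, 2), F(1, 2)],
     [F(0),    F(1, 4), F(1, 2)]]

S = [F(1, 4), F(1, 2), F(1, 4)]
print(mat_vec(P, S) == S)              # True
```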
At this stage it is useful to recall that a state in a Markov chain is called an
absorbing state if it is not possible to leave that state over the next time period.
If state i is an absorbing state, then in the ith column of the transition matrix,
there will be a 1 in the ith row and zeroes everywhere else. When we take powers of
the transition matrix, the ith column will remain the same, and so the transition
matrix is not regular, and may not have a steady-state vector.
Now, suppose that we always cross with recessives. Then the transition matrix is
P = [0 0 0; 1 1/2 0; 0 1/2 1]. This has an absorbing state, the recessive state. Let
us take some powers of P, and see what happens.

P^2 = [0 0 0; 1/2 1/4 0; 1/2 3/4 1]

P^3 = [0 0 0; 1/4 1/8 0; 3/4 7/8 1]

P^10 = [0 0 0; 1/512 1/1024 0; 511/512 1023/1024 1]

Continuing with higher powers, it would appear that there will be a steady-state
vector, equal to [0; 0; 1]. In the long term, we will end up with recessives.

Check that S = [0; 0; 1] is a steady-state vector; that is, show PS = S.
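Both the powers and the suggested check can be done exactly with a short sketch:

```python
# Sketch: powers of the recessive-crossing matrix, and the PS = S check.

from fractions import Fraction as F

def mat_mul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

P = [[F(0), F(0),    F(0)],
     [F(1), F(1, 2), F(0)],
     [F(0), F(1, 2), F(1)]]

Q = [row[:] for row in P]
for _ in range(9):                     # Q becomes P^10
    Q = mat_mul(Q, P)
print(Q[2])                            # bottom row: 511/512, 1023/1024, 1

S = [[F(0)], [F(0)], [F(1)]]           # the candidate steady state, as a column
print(mat_mul(P, S) == S)              # True
```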
Example 5.9

Humans have two sets of chromosomes, one obtained from each parent,
which determine their genetic makeup. In this example we investigate
the inbreeding problem.


Assume that two individuals mate randomly. In the next generation,


two of their offspring of opposite sex mate randomly. Suppose the
process of brother and sister sibling mating continues each generation.
We can regard this as a Markov chain whose states consist of six mating
states:
State 1: AA × AA
State 2: AA × Aa
State 3: AA × aa
State 4: Aa × Aa
State 5: Aa × aa
State 6: aa × aa
Let P = [pij] be the corresponding transition matrix. Suppose that the parents are
both of type AA. Then all children will be of type AA, and so a mating of brother
and sister will only give AA. Hence p11 = 1.

Suppose that the parents are of type AA × Aa. Then half their children will be of
type AA and half will be of type Aa. A mating of these offspring will give 0.25 of
AA × AA, 0.5 of AA × Aa and 0.25 of Aa × Aa, and so p12 = 0.25, p22 = 0.5 and
p42 = 0.25.

Continuing in this way, we can obtain Table 5.3.
Table 5.3: Summary of parent, offspring and offspring mating combinations
for the inbreeding problem, and related probabilities

Parents      Offspring            Offspring mating
AA × AA      All AA               All AA × AA
AA × Aa      0.5 AA               0.25 AA × AA
             0.5 Aa               0.5 AA × Aa
                                  0.25 Aa × Aa
AA × aa      All Aa               All Aa × Aa
Aa × Aa      0.25 AA              0.0625 AA × AA
             0.25 aa              0.25 AA × Aa
             0.5 Aa               0.125 AA × aa
                                  0.25 Aa × Aa
                                  0.25 Aa × aa
                                  0.0625 aa × aa
Aa × aa      0.5 Aa               0.25 Aa × Aa
             0.5 aa               0.25 aa × aa
                                  0.5 Aa × aa
aa × aa      All aa               All aa × aa


Note that states 1 and 6 are absorbing states. Hence the transition matrix will be

    [1  1/4  0  1/16   0   0]
    [0  1/2  0  1/4    0   0]
P = [0   0   0  1/8    0   0]
    [0  1/4  1  1/4   1/4  0]
    [0   0   0  1/4   1/2  0]
    [0   0   0  1/16  1/4  1]

P is not regular, since all powers of P will have zeroes in the first column except
for the first position and in the last column except for the last position.
What happens in the long term? We can easily use technology to find powers of the
matrix P. To find the long-term steady-state solution, if it exists, we need to
solve the homogeneous system (I - P)S = O, where I is the 6 × 6 identity matrix, S
is the steady-state vector and O is the 6 × 1 zero matrix. This has been done with
technology. The reduced row echelon form of (I - P) is

[0 1 0 0 0 0]
[0 0 1 0 0 0]
[0 0 0 1 0 0]
[0 0 0 0 1 0]
[0 0 0 0 0 0]
[0 0 0 0 0 0]

and so, if S = [si] then s2 = s3 = s4 = s5 = 0, and all we know about s1 and s6 is
that they must sum to 1. In fact, for 0 ≤ a ≤ 1, S = [a; 0; 0; 0; 0; 1-a] is a
vector such that PS = S, but it is not a steady-state vector.


Using technology to compute large powers of P, we can guess that

                [1  0.75  0.5  0.5  0.25  0]
                [0   0     0    0    0    0]
lim  P^n   =    [0   0     0    0    0    0]
n→∞             [0   0     0    0    0    0]
                [0   0     0    0    0    0]
                [0  0.25  0.5  0.5  0.75  1]

This means that in the long term we will end up with some combination of AA × AA
and/or aa × aa, depending on the initial state vector.
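The guessed limit is easy to support numerically: a high power of the 6 × 6 matrix already agrees with it to many decimal places. A floating-point sketch:

```python
# Sketch: a high power of the 6x6 inbreeding matrix, supporting the
# guessed limit (all mass ends in the absorbing states 1 and 6).

def mat_mul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

P = [[1, 1/4, 0, 1/16, 0,   0],
     [0, 1/2, 0, 1/4,  0,   0],
     [0, 0,   0, 1/8,  0,   0],
     [0, 1/4, 1, 1/4,  1/4, 0],
     [0, 0,   0, 1/4,  1/2, 0],
     [0, 0,   0, 1/16, 1/4, 1]]

Q = [row[:] for row in P]
for _ in range(199):                   # Q becomes P^200
    Q = mat_mul(Q, P)

print([round(x, 4) for x in Q[0]])     # [1.0, 0.75, 0.5, 0.5, 0.25, 0.0]
print([round(x, 4) for x in Q[5]])     # [0.0, 0.25, 0.5, 0.5, 0.75, 1.0]
```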

Student activity 5.3
a  For Example 5.8, suppose that we always cross with dominants. Determine the
   transition matrix P, calculate P^10 and P^20, find lim_{n→∞} P^n and the
   steady-state vector if it exists.
b  For Example 5.9, find the long-term distribution of the population if the initial
   state vector is
   i    [0; 0.25; 0.35; 0.15; 0.25; 0]
   ii   [0; 0.2; 0.2; 0.4; 0.2; 0]
   iii  [0.1; 0.1; 0.3; 0.2; 0.2; 0.1]
c  There are three states in a country, called A, B and C. Each year 10% of the
   residents of state A move to state B and 30% to state C, 20% of the residents of
   state B move to state A and 20% to state C, and 5% of the residents of state C
   move to state A and 15% to state B. Suppose initially the population is equally
   divided between the three states.
   i   Find the percentage of the population in the states after 3 years.
   ii  Find the percentage of the population in the three states after a long period
       of time.


SUMMARY

- A transition matrix P = [pij] for a Markov chain is a square matrix with
  non-negative entries such that the sum of the entries in each column is 1.
- pij = probability of moving from state j to state i in one transition.
- If the column vector S0 is the initial population distribution vector between
  states in a Markov chain with transition matrix P, the population distribution
  vector after one time period of the chain is PS0.
- P^m is the transition matrix for m time periods, so the population distribution
  after m time periods is P^m S0, and if pij(m) is the (i, j)th element of P^m, then
  pij(m) gives the probability of moving from state j to state i after m
  transitions.
- A Markov chain and its associated matrix P is called regular if there exists an
  integer m such that P^m has no zero entries.
- If P is a regular transition matrix for a Markov chain:
  - The columns of P^m all approach the same probability distribution vector S as m
    becomes large.
  - S is the unique probability vector satisfying PS = S.
  - As the number of time periods increases, the population distribution vectors
    approach S regardless of the initial population distribution (provided P is
    regular). Thus S is called the steady-state population distribution vector.
- For a 2 × 2 regular transition matrix [1-a b; a 1-b], 0 < a < 1, 0 < b < 1, the
  steady-state vector is [x; y], where x = b/(a+b) and y = a/(a+b) (a + b > 0).
- A state is called an absorbing state if once the system is in this state it will
  never leave it. If state j is an absorbing state, then pjj = 1 and pij = 0 for
  i ≠ j.
References
Anton, H & Rorres, C 2005, Elementary linear algebra (applications version), 9th
edn, John Wiley and Sons, New York.
Nicholson, KW 2001, Elementary linear algebra, 1st edn, McGraw-Hill Ryerson,
Whitby, ON.

Nicholson, KW 2003, Linear algebra with applications, 4th edn, McGraw-Hill
Ryerson, Whitby, ON.
Poole, D 2006, Linear algebra: A modern introduction, 2nd edn, Brooks Cole,
California.
Wheal, M 2003, Matrices: Mathematical models for organising and manipulating
information, 2nd edn, Australian Association of Mathematics Teachers,
Adelaide.

Websites
http://math.ucsd.edu/~crypto/Monty/monty.html
This website simulates the Monty Hall dilemma.
http://orion.math.iastate.edu/msm/AthertonRMSMSS05.pdf#search=%22proof%20
that%20regular%20transition%20matrices%20converge%20to%20a%20steady
%20state%22 Iowa State Department of Mathematics
This pdf discusses Markov chains.
http://math.rice.edu/~pcmi/mathlinks/montyurl.html
This website has links to many other sites that discuss or simulate the Monty
Hall dilemma.

Chapter 6

Curriculum connections
Different school systems and educational jurisdictions have particular features
in their senior secondary mathematics curricula that have been developed over
decades, and even centuries in some cases, to meet the historical and
contemporary educational needs of their cultures and societies. When these
curricula are reviewed, it is often the case that this includes a process of
benchmarking with respect to corresponding curricula in other systems and
jurisdictions. This may be in a local, county, state, national or international
context.
Over the past few decades, particularly in conjunction with renewed
interest in comparative international assessments (such as TIMSS and the OECD's
PISA), curriculum benchmarking has been employed extensively by
educational authorities and ministries. Such benchmarking reveals much that
is common in curriculum design and purpose in senior secondary
mathematics courses around the world. Some key design constructs that are
used to characterise the nature of senior secondary mathematics courses are:
- content (areas of study, topics, strands)
- aspects of working mathematically (concepts, skills and processes, numerical,
  graphical, analytical, problem-solving, modelling, investigative, computation and
  proof)
- the use of technology, and when it is permitted, required or restricted
  (calculators, spreadsheets, statistical software, dynamic geometry software, CAS)
- the nature of related assessments (examinations, school-based and the relationship
  between these)
- the relationship between the final year subjects and previous years in terms of
  the acquisition of important mathematical background (assumed knowledge and
  skills, competencies, prerequisites and the like)
- the amount and nature of prescribed material within the course (completely
  prescribed, unitised, modularised, core plus options)


- the amount of in-class (prescribed) and out-of-class (advised) time that students
  are expected to spend for completion of the course
In broad terms, it is possible to characterise four main sorts of senior
secondary mathematics courses.
Type 1
Courses designed to consolidate and develop the foundation and numeracy
skills of students with respect to the practical application of mathematics in
other areas of study. These often have a thematic basis for course
implementation.
Type 2
Courses designed to provide a general mathematical background for students
proceeding to employment or further study with a numerical emphasis, and
likely to draw strongly on data analysis and discrete mathematics. Such
courses typically do not contain any calculus material, or only basic calculus
material, related to the application of average and instantaneous rates of
change. They may include, for example, business-related mathematics, linear
programming, network theory, sequences, series and difference equations,
practical applications of matrices and the like.
Type 3
Courses designed to provide a sound foundation in function, coordinate
geometry, algebra, calculus and possibly probability with an analytical
emphasis. These courses develop mathematical content to support further
studies in mathematics, the sciences and sometimes economics.
Type 4
Courses designed to provide an advanced or specialist background in
mathematics. These courses have a strong analytical emphasis and often
incorporate a focus on mathematical proof. They typically include complex
numbers, vectors, theoretical applications of matrices (for example
transformations of the plane), higher level calculus (integration techniques,
differential equations), kinematics and dynamics. In many cases Type 4
courses assume that students have previous or concurrent enrolment in a
Type 3 course, or subsume them.
Table 6.1 provides a mapping in terms of curriculum connections between
the chapters of this book, the four types of course identified above, and the
courses currently (2006) offered in various Australian states and territories.


As this book is a teacher resource, these connections are with respect to the
usefulness of material from the chapters in terms of mathematical background
of relevance, rather than direct mapping to curriculum content or syllabuses
in a particular state or territory.
Table 6.1: Curriculum connections for senior secondary final year mathematics
courses in Australia

State or territory       Type of course                              Relevant chapters
Victoria                 2: Further Mathematics                      1, 2 and 5
                         3: Mathematical Methods (CAS)               all
                         4: Specialist Mathematics                   1, 3 and 4
New South Wales          2: General Mathematics
                         3: Mathematics and Mathematics Extension 1
                         4: Mathematics Extension 2
Queensland               2: Mathematics A
                         3: Mathematics B                            3 and 4
                         4: Mathematics C                            all
South Australia/         2: Mathematical Applications                1, 2 and 5
Northern Territory       3: Mathematical Methods/Mathematical        all
                            Studies
                         4: Specialist Mathematics
Western Australia        2: Discrete Mathematics
                         3: Applicable Mathematics                   1, 3 and 4
                         4: Calculus
Tasmania                 2: Mathematics Applied
                         3: Mathematics Methods
                         4: Mathematics Specialised                  1, 2, 3 and 4

Table 6.2 provides a mapping in terms of curriculum connections between the chapters
of this book, the four types of course identified above, and some of the courses
currently (2006) offered in various English-speaking jurisdictions around the world.
Again, as this book is a teacher resource, these connections indicate the usefulness
of material from the chapters in terms of mathematical background of relevance,
rather than direct mapping to curriculum content, or syllabuses, in a particular
jurisdiction.
Table 6.2: Curriculum connections for senior secondary final year mathematics
courses in some jurisdictions around the world

Jurisdiction          Type of course                        Relevant chapters
College Board US      3: Advanced Placement Calculus AB
                      4: Advanced Placement Calculus BC
International         3: Mathematics SL                     1, 2, 3 and 4
Baccalaureate         4: Mathematics HL                     1, 2, 3 and 4
Organisation (IBO)
UK                    3: AS Mathematics                     1, 2, 3 and 4
                      4: Advanced level                     1, 2, 3 and 4

Content from the chapters of the book may be mapped explicitly to topics
within particular courses, and teachers may find it useful to make these more
specific connections informally, in terms of their intended implementation of
a given course of interest to them.
References
The following are the website addresses of Australian state and territory curriculum
and assessment authorities, boards and councils. These include various teacher
reference and support materials for curriculum and assessment.
The Senior Secondary Assessment Board of South Australia (SSABSA)
http://www.ssabsa.sa.edu.au/
The Victorian Curriculum and Assessment Authority (VCAA)
http://www.vcaa.vic.edu.au/
The Tasmanian Qualifications Authority (TQA)
http://www.tqa.tas.gov.au/
The Queensland Studies Authority (QSA)
http://www.qsa.qld.edu.au/
The Board of Studies New South Wales (BOS)
http://www.boardofstudies.nsw.edu.au/
The Australian Capital Territory Board of Senior Secondary Studies (ACTBSSS)
http://www.decs.act.gov.au/bsss/welcome.htm

The Curriculum Council Western Australia
http://www.curriculum.wa.edu.au/
The following are the website addresses of various international and overseas
curriculum and assessment authorities, boards, councils and organisations:
College Board US Advanced Placement (AP) Calculus
http://www.collegeboard.com/student/testing/ap/sub_calab.html?calcab
International Baccalaureate Organisation (IBO)
http://www.ibo.org/ibo/index.cfm
Qualifications and Curriculum Authority (QCA) UK
http://www.qca.org.uk/
OECD Program for International Student Assessment (PISA)
http://www.pisa.oecd.org
Trends in International Mathematics and Science Study (TIMSS)
http://nces.ed.gov/timss/


Chapter 7

Solution notes to student activities

Student activity 2.1


a  [4 0 1 3 2 2; 5 1 0 0 4 2; 0 0 0 4 3 0; 10 4 6 0 0 1] × [0.05; 0.1; 0.2; 0.5; 1; 2] = [7.90; 8.35; 5.00; 4.10]
   (here matrices are written row by row, with rows separated by semicolons)
   That is, Michael has $7.90, Jay has $8.35, Sam has $5.00 and Lin has $4.10.
b  0.76 × [7.90; 8.35; 5.00; 4.10] = [6.004; 6.346; 3.80; 3.116]
   That is, Michael has US$6.00, Jay US$6.35, Sam US$3.80 and Lin US$3.12.
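These totals can be checked with a few lines of code. Working in cents keeps the arithmetic exact; the coin denominations and counts are those given in the activity.

```python
# Coin values in cents, and each person's coin counts (one row per person).
values = [5, 10, 20, 50, 100, 200]
counts = [
    [4, 0, 1, 3, 2, 2],   # Michael
    [5, 1, 0, 0, 4, 2],   # Jay
    [0, 0, 0, 4, 3, 0],   # Sam
    [10, 4, 6, 0, 0, 1],  # Lin
]
# The matrix product of part a is a dot product of each row with the value vector.
totals = [sum(c * v for c, v in zip(row, values)) for row in counts]
print(totals)  # amounts in cents: [790, 835, 500, 410]
# Part b: convert to US dollars at 0.76 and round to the nearest cent.
us = [round(0.76 * t / 100, 2) for t in totals]
print(us)  # [6.0, 6.35, 3.8, 3.12]
```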
Student activity 2.2

a  i   2A = [2 6; -2 4]
   ii  A - B = [-1 4; -3 -2]
   iii A + B = [3 2; 1 6]
   iv  AB = [8 11; 2 9]
   v   BA = [3 4; -2 14]
   vi  AC = [26 34 42; 14 16 18]
b  Suppose MP = kP. Then we have two simultaneous equations
   4p + q = kp
   8p + 6q = kq
   and collecting terms in p and q, we have
   (4 - k)p + q = 0        (i)
   8p + (6 - k)q = 0       (ii)
   Multiply equation (i) by 6 - k and subtract equation (ii) from it:
   (6 - k)(4 - k)p - 8p = 0
   That is, (16 - 10k + k²)p = 0. Since p ≠ 0, 16 - 10k + k² = 0, so k = 2 or 8.
c  A² = [2 2; 2 2] = 2[1 1; 1 1], A³ = [4 4; 4 4], A⁴ = [8 8; 8 8], A⁵ = [16 16; 16 16]
   Hence Aⁿ = 2ⁿ⁻¹[1 1; 1 1] = 2ⁿ⁻¹A.
d  Suppose A and O are of size n × n. Consider AO. The element in the ith row
   and jth column of this product will be Σ_{k=1}^n a_ik o_kj = Σ_{k=1}^n a_ik × 0 = 0,
   since o_kj = 0 for all k = 1, 2, …, n and j = 1, 2, …, n. Hence AO = O. The element
   in the ith row and jth column of OA is Σ_{k=1}^n o_ik a_kj = Σ_{k=1}^n 0 × a_kj = 0,
   and so OA = O.
   Take A = [2 1; 6 3] and B = [1 1; -2 -2]; then AB = O, but neither A = O nor B = O.
e  J² = [0 -1; 1 0] × [0 -1; 1 0] = [-1 0; 0 -1] = -[1 0; 0 1] = -I
   J⁴ = J² × J² = [-1 0; 0 -1] × [-1 0; 0 -1] = [1 0; 0 1] = I
f  (X - Y)(X + Y) = X² - YX + XY - Y². Since generally YX ≠ XY for matrices,
   (X - Y)(X + Y) ≠ X² - Y².
   Take X = [4 1; 2 2] and Y = [3 0; 0 1].
   Then (X - Y)(X + Y) = [1 1; 2 1] × [7 1; 2 3] = [9 4; 16 5] and
   X² - Y² = [4 1; 2 2]² - [3 0; 0 1]² = [18 6; 12 6] - [9 0; 0 1] = [9 6; 12 5]
   and clearly (X - Y)(X + Y) ≠ X² - Y².
   Take X = [4 0; 0 2] and Y = [3 0; 0 1].
   Then (X - Y)(X + Y) = [1 0; 0 1] × [7 0; 0 3] = [7 0; 0 3] and
   X² - Y² = [4 0; 0 2]² - [3 0; 0 1]² = [16 0; 0 4] - [9 0; 0 1] = [7 0; 0 3]
   In this case (X - Y)(X + Y) = X² - Y².
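The pattern Aⁿ = 2ⁿ⁻¹A observed in part c of Student activity 2.2 is easy to confirm by repeated multiplication:

```python
def matmul2(A, B):
    """Multiply two 2x2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

A = [[1, 1], [1, 1]]
P = A
for n in range(1, 9):
    # A^n should equal 2^(n-1) * A, i.e. every entry is 2^(n-1).
    assert P == [[2**(n - 1)] * 2, [2**(n - 1)] * 2]
    P = matmul2(P, A)
print("pattern holds for n = 1 to 8")
```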

Student activity 2.3


a  i   -(1/2)[4 -2; -3 1]
   ii  [2 -3; -3 5]
   iii -(1/5)[-1 -1; 3 -2] = (1/5)[1 1; -3 2]
b  This system can be written in matrix form as AX = B, where A = [2 3; 4 1],
   X = [x; y] and B = [7; 3]. If A⁻¹ exists, then we can multiply both sides of the
   equation on the left by A⁻¹, and we thus have X = A⁻¹B. Now, the inverse of
   A = [2 3; 4 1] is A⁻¹ = [-0.1 0.3; 0.4 -0.2]. We can check this by matrix
   multiplication:
   [2 3; 4 1] × [-0.1 0.3; 0.4 -0.2] = [-0.2 + 1.2  0.6 - 0.6; -0.4 + 0.4  1.2 - 0.2] = [1 0; 0 1]
   Now, having found the inverse of the matrix A, we can solve the system of
   simultaneous linear equations:
   X = [x; y] = A⁻¹B = [-0.1 0.3; 0.4 -0.2] × [7; 3] = [0.2; 2.2]
   Then x = 0.2 and y = 2.2 is the solution of the system of simultaneous
   linear equations given.
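The inverse-matrix solution of part b follows directly from the 2×2 adjugate formula; a small sketch:

```python
def inv2(A):
    """Inverse of a 2x2 matrix via the adjugate formula (determinant must be non-zero)."""
    a, b = A[0]
    c, d = A[1]
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

A = [[2, 3], [4, 1]]
Ainv = inv2(A)                          # [[-0.1, 0.3], [0.4, -0.2]]
B = [7, 3]
x = Ainv[0][0] * B[0] + Ainv[0][1] * B[1]
y = Ainv[1][0] * B[0] + Ainv[1][1] * B[1]
print(x, y)  # approximately 0.2 and 2.2
```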


Student activity 2.4


A = [1 1; 1 0] and f₂ = [1; 1], so
f₃₀ = A²⁸f₂ = [514229 317811; 317811 196418] × [1; 1] = [832040; 514229]
Hence the 29th number is 514229 and the 30th is 832040. Note that, in addition, the
power of A also gives the 28th number 317811 and the 27th number 196418.
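The matrix power above is quick to verify by repeated 2×2 multiplication:

```python
def matmul2(A, B):
    """Multiply two 2x2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

A = [[1, 1], [1, 0]]
P = [[1, 0], [0, 1]]          # identity
for _ in range(28):
    P = matmul2(P, A)         # after the loop, P = A^28
assert P == [[514229, 317811], [317811, 196418]]
# A^28 * [1; 1] gives the 30th and 29th Fibonacci numbers.
f30 = P[0][0] + P[0][1]
f29 = P[1][0] + P[1][1]
print(f29, f30)  # 514229 832040
```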

Student activity 2.5


a  The message is converted to numbers (A = 1, …, Z = 26, space = 0):

   Letter:  S  A  V  E     T  H  E     W  H  A  L  E  S
   Number: 19  1 22  5  0 20  8  5  0 23  8  1 12  5 19  0

   To code the message, we find the matrix product:
   [5 3; 3 2] × [19 22 0 8 0 8 12 19; 1 5 20 5 23 1 5 0] = [98 125 60 55 69 43 75 95; 59 76 40 34 46 26 46 57]
   and so the message sent would be 98, 59, 125, 76, 60, 40, 55, 34, 69, 46, 43,
   26, 75, 46, 95, 57.
b  To find the original message, we must first find the inverse of the coding
   matrix:
   M⁻¹ = [5 3; 3 2]⁻¹ = [2 -3; -3 5]
   Arrange the received message in column vectors of length 2, and put them
   together into one matrix:
   [65 75 138 90 85 80 160 123; 42 50 87 54 54 49 99 76]
   Now multiply this on the left by M⁻¹ to recover the original message as a
   matrix:
   [2 -3; -3 5] × [65 75 138 90 85 80 160 123; 42 50 87 54 54 49 99 76] = [4 0 15 18 8 13 23 18; 15 25 21 0 15 5 15 11]
   So the original message is represented by the numbers
   4, 15, 0, 25, 15, 21, 18, 0, 8, 15, 13, 5, 23, 15, 18, 11
   Checking the coding listed previously, we see that the message reads DO
   YOUR HOMEWORK.
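The decoding in part b can be automated: each received pair is multiplied by M⁻¹ = [2 -3; -3 5] and the resulting numbers are converted back to letters.

```python
def decode_pair(c1, c2):
    # Multiply the column [c1; c2] on the left by M^-1 = [2 -3; -3 5].
    return 2 * c1 - 3 * c2, -3 * c1 + 5 * c2

received = [65, 42, 75, 50, 138, 87, 90, 54, 85, 54, 80, 49, 160, 99, 123, 76]
numbers = []
for i in range(0, len(received), 2):
    numbers.extend(decode_pair(received[i], received[i + 1]))
# 0 is a space, 1..26 are A..Z.
message = ''.join(' ' if n == 0 else chr(ord('A') + n - 1) for n in numbers)
print(message)  # DO YOUR HOMEWORK
```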


Student activity 3.1


a  {x + y = 1, 2x + y = 4} is one such system.
b  In this case, one equation must be a multiple of the other.
   {2x + y = 4, 4x + 2y = 8} is one such system.
c  Since (0, 0, 0) is a solution, the equations must be of the form
   ax + by + cz = 0. One example is {x - 2y + z = 0, 2x + 2y - 4z = 0}, and the
   corresponding planes will intersect in a line with a one-parameter family of
   solutions {(t, t, t): t ∈ R}.
   Another example is {x - 2y + z = 0, 2x - 4y + 2z = 0}, and the
   corresponding planes intersect in a plane (i.e. they are the same plane) with
   a two-parameter family of solutions {(s, t, 2t - s): s, t ∈ R}.
d  SOLVE([3x - 2y + z = 0, x - y - z = 10], [x, z]) gives
   x = (3y + 10)/4 and z = -(y + 30)/4
   Solution set is {((3y + 10)/4, y, -(y + 30)/4): y ∈ R}.
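A quick numerical spot-check of the one-parameter family in part d: every member should satisfy both equations.

```python
# Each point ((3y + 10)/4, y, -(y + 30)/4) should satisfy
# 3x - 2y + z = 0 and x - y - z = 10.
for y in [-3.0, 0.0, 2.5, 10.0]:
    x = (3 * y + 10) / 4
    z = -(y + 30) / 4
    assert abs(3 * x - 2 * y + z) < 1e-12
    assert abs(x - y - z - 10) < 1e-12
print("all points in the family check out")
```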

Student activity 3.2


a  It is clear that (0, 0) is a solution of this system of equations. The
   coefficient matrix is [a b; c d]. This is invertible if ad - bc ≠ 0, in which case
   [x; y] = [a b; c d]⁻¹ × [0; 0] = [0; 0], the unique solution.
   So we need to consider the case ad = bc. If c and d are both non-zero,
   then a/c = b/d and the equations are a multiple of one another, and so there
   will be infinitely many solutions. If, say, c is zero, then either a or d is zero.
   If d is zero, then the second equation is 0x + 0y = 0, which has R² as
   solution, and so the solution to the system is simply the set of points on the
   line ax + by = 0. If a is zero, then the equations are {by = 0, dy = 0}, so one
   is a multiple of the other, and the solution set is the set of points on the
   x-axis (provided one of b or d is non-zero). (A similar argument applies if
   d = 0.)
b  The system of linear equations {ax + by = e, cx + dy = f} will have
   infinitely many solutions when one equation is simply a non-zero multiple
   of the other. Assuming c ≠ 0, d ≠ 0 and f ≠ 0, then we need a/c = b/d = e/f.
   This can be written ad - bc = 0 and af - ce = 0 and bf - de = 0, and this
   way we do not have to assume c ≠ 0, d ≠ 0 and f ≠ 0.
   The system of linear equations {ax + by = e, cx + dy = f} will have no
   solutions if the corresponding lines are parallel but distinct. In this case we
   require ad - bc = 0 but either af - ce ≠ 0 or bf - de ≠ 0.
   The system of linear equations will have a unique solution if
   ad - bc ≠ 0.
c  We have x = 3t + 1 and y = 2t - 1. We need to eliminate t from these
   equations. From the first equation, t = (x - 1)/3. Use this to substitute for t in
   the second equation:
   y = 2((x - 1)/3) - 1, or 3y = 2x - 5
d  The underlying equation is 3x - y = 4, or y = 3x - 4. So simply choose any
   expression involving a parameter for x, and use the equation to write y as a
   function of the parameter. Equivalent solution sets would be
   {(r + 1, 3r - 1): r ∈ R} and {(s - 1, 3s - 7): s ∈ R}.
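The determinant conditions of part b translate directly into a classifier for 2×2 systems. A sketch, assuming the coefficients are not all zero (the helper name is ours):

```python
def classify(a, b, c, d, e, f):
    """Classify the system {ax + by = e, cx + dy = f} using the part-b conditions."""
    if a * d - b * c != 0:
        return "unique solution"
    if a * f - c * e == 0 and b * f - d * e == 0:
        return "infinitely many solutions"
    return "no solutions"

assert classify(1, 1, 2, 1, 1, 4) == "unique solution"          # part a's example
assert classify(2, 1, 4, 2, 4, 8) == "infinitely many solutions" # one eqn twice the other
assert classify(2, 1, 4, 2, 4, 9) == "no solutions"              # parallel, distinct lines
```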
Student activity 3.3

a  The augmented matrix is [1 2 -1 2; 1 4 -3 3; 2 5 -3 1], which has reduced
   row echelon form [1 0 1 0; 0 1 -1 0; 0 0 0 1]. Looking at the last row of the
   reduced form (corresponding to the equation 0x + 0y + 0z = 1) we see the
   equations are inconsistent. Hence there is no solution.
b  The augmented matrix is [-1 1 1 -1; 1 -1 3 5; 3 -2 1 -2], which has reduced
   row echelon form [1 0 0 -7; 0 1 0 -9; 0 0 1 1]. This corresponds to the
   system of equations x = -7, y = -9, z = 1, and hence we have a unique
   solution (-7, -9, 1).
c  The augmented matrix is [2 2 -1 5; -2 1 1 7; -4 1 2 10], which has reduced
   row echelon form [1 0 -1/2 -3/2; 0 1 0 4; 0 0 0 0]. There are two leading
   variables (corresponding to the two leading 1s), x and y, and one free
   variable z. So let z = k, an arbitrary real number. Then the solution set is
   {((k - 3)/2, 4, k): k ∈ R}.
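The reduced row echelon forms above are what a CAS produces; a small Gauss–Jordan routine using exact rational arithmetic reproduces them. (A sketch, not a production implementation.)

```python
from fractions import Fraction

def rref(rows):
    """Reduced row echelon form over the rationals (Gauss-Jordan elimination)."""
    M = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for c in range(len(M[0])):
        pivot = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]    # move pivot row into place
        M[r] = [x / M[r][c] for x in M[r]] # scale pivot row to leading 1
        for i in range(len(M)):
            if i != r and M[i][c] != 0:    # clear the rest of the column
                M[i] = [a - M[i][c] * b for a, b in zip(M[i], M[r])]
        r += 1
        if r == len(M):
            break
    return M

# Part b: unique solution (-7, -9, 1).
aug = [[-1, 1, 1, -1], [1, -1, 3, 5], [3, -2, 1, -2]]
assert rref(aug) == [[1, 0, 0, -7], [0, 1, 0, -9], [0, 0, 1, 1]]
```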
Student activity 3.4
a  Let v, x, y and z be the unknown scores of players 1, 2, 3 and 4 respectively,
   and a, b, c and d the known totals. Then
   v + x = a
   x + y = b
   y + z = c
   z + v = d
   is a system of four equations in four unknowns. The augmented matrix is
   [1 1 0 0 a; 0 1 1 0 b; 0 0 1 1 c; 1 0 0 1 d] and its reduced row echelon form
   using CAS is [1 0 0 1 0; 0 1 0 -1 0; 0 0 1 1 0; 0 0 0 0 1].
   Now we have a problem. There is no sign of a, b, c or d, and the last line
   corresponds to an equation which reads
   0v + 0x + 0y + 0z = 1
   which is clearly a contradiction, indicating there are no solutions. So to find
   out what is really happening, we need to use the Gaussian elimination
   procedure to reduce the augmented matrix to echelon form (but not
   reduced).
   To begin, the new row 4 will be the old row 4 with row 1 subtracted from
   it, giving [1 1 0 0 a; 0 1 1 0 b; 0 0 1 1 c; 0 -1 0 1 d - a].
   Next, the new row 4 should be the old row 4 with row 2 added to it:
   [1 1 0 0 a; 0 1 1 0 b; 0 0 1 1 c; 0 0 1 1 d - a + b].
   Next, the new row 4 will be the old row 4 with row 3 subtracted from it:
   [1 1 0 0 a; 0 1 1 0 b; 0 0 1 1 c; 0 0 0 0 d - a + b - c].
   Now the last equation is 0v + 0x + 0y + 0z = d - a + b - c.
   Generally, if a + c ≠ b + d, there will be no solution.
   In this example, (v + x) + (y + z) = a + c, and (x + y) + (z + v) = b + d, so
   a + c = b + d = the sum of all players' scores in the tournament, and this means
   that the echelon form of the augmented matrix is
   [1 1 0 0 a; 0 1 1 0 b; 0 0 1 1 c; 0 0 0 0 0], which is equivalent to the reduced
   row echelon matrix [1 0 0 1 a - b + c; 0 1 0 -1 b - c; 0 0 1 1 c; 0 0 0 0 0].
   So there will be three basic variables v, x and y, and one free variable z,
   and hence there will be an infinite number of solutions and thus the scores
   will not be able to be uniquely determined.
   Writing z = k, k ∈ R, the first row of the matrix tells us that
   v = -k + a - b + c, the second row that x = k + b - c, and the third row that
   y = -k + c.
b  Let f(x) = ax³ + bx² + cx + d be the equation of a cubic polynomial
   function, with a, b, c and d the unknown coefficients. From Example 3.15
   we have three equations:
   a + b + c + d = 0
   -a + b - c + d = 0
   3a + 2b + c = -4
   Since f′(x) = 3ax² + 2bx + c, and the slope at x = -1 is 12,
   3a - 2b + c = 12 is a fourth equation to add to the system of three above.
   The augmented matrix for this system of four equations is
   [1 1 1 1 0; -1 1 -1 1 0; 3 2 1 0 -4; 3 -2 1 0 12]
   which reduces to
   [1 0 0 0 2; 0 1 0 0 -4; 0 0 1 0 -2; 0 0 0 1 4]
   and so a = 2, b = -4, c = -2 and d = 4, giving
   f(x) = 2x³ - 4x² - 2x + 4
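The four conditions are quick to verify for the fitted cubic:

```python
a, b, c, d = 2, -4, -2, 4
f = lambda x: a * x**3 + b * x**2 + c * x + d
fprime = lambda x: 3 * a * x**2 + 2 * b * x + c   # derivative of f

assert f(1) == 0          # a + b + c + d = 0
assert f(-1) == 0         # -a + b - c + d = 0
assert fprime(1) == -4    # 3a + 2b + c = -4
assert fprime(-1) == 12   # 3a - 2b + c = 12
print("all four conditions satisfied")
```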
The graph of this function is given in the figure below.

c  Let f(x) = ax⁴ + bx³ + cx² + dx + e be the rule for the family of quartic
   polynomials.
   Since they pass through (1, 2) and (-2, -1) and have slope 5 at x = 1, we
   have the following equations (see Example 3.16 for details):
   a + b + c + d + e = 2
   16a - 8b + 4c - 2d + e = -1
   4a + 3b + 2c + d = 5
   In addition, they must pass through the point (2, 0), so f(2) = 0 and
   16a + 8b + 4c + 2d + e = 0.
   This gives us a system of 4 equations in 5 unknowns. The augmented
   matrix is [1 1 1 1 1 2; 16 -8 4 -2 1 -1; 4 3 2 1 0 5; 16 8 4 2 1 0], which
   reduces to [1 0 0 0 1/4 -35/24; 0 1 0 0 -1/2 5/6; 0 0 1 0 -3/4 137/24; 0 0 0 1 2 -37/12].
   Now there are four leading variables a, b, c and d, and one free variable
   e. Let e = t, t ∈ R. Then the first row of the reduced matrix gives the
   equation
   a + (1/4)e = -35/24
   and so a = -(1/4)t - 35/24, since e = t.
   The second row of the reduced matrix gives the equation
   b - (1/2)e = 5/6
   and so b = (1/2)t + 5/6.
   The third row of the reduced matrix gives the equation
   c - (3/4)e = 137/24
   and so c = (3/4)t + 137/24.
   The fourth row of the reduced matrix gives the equation
   d + 2e = -37/12
   and so d = -2t - 37/12.
   Hence the family of functions is of the form:
   f(x) = (-(1/4)t - 35/24)x⁴ + ((1/2)t + 5/6)x³ + ((3/4)t + 137/24)x² + (-2t - 37/12)x + t,  t ∈ R
   Again, we will plot some members of this family.
   When t = 0, f(x) = -(35/24)x⁴ + (5/6)x³ + (137/24)x² - (37/12)x
   When t = 1, f(x) = -(41/24)x⁴ + (4/3)x³ + (155/24)x² - (61/12)x + 1
   When t = -8, f(x) = (13/24)x⁴ - (19/6)x³ - (7/24)x² + (155/12)x - 8
The graphs of these are given in the figure on the following page.
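Exact arithmetic confirms that every member of this family meets all four conditions:

```python
from fractions import Fraction as F

def coeffs(t):
    """Coefficients (a, b, c, d, e) of the quartic family, from the reduced matrix."""
    return (-t / 4 - F(35, 24), t / 2 + F(5, 6),
            3 * t / 4 + F(137, 24), -2 * t - F(37, 12), t)

def f(x, t):
    a, b, c, d, e = coeffs(t)
    return a * x**4 + b * x**3 + c * x**2 + d * x + e

def fprime(x, t):
    a, b, c, d, _ = coeffs(t)
    return 4 * a * x**3 + 3 * b * x**2 + 2 * c * x + d

for t in (F(0), F(1), F(-8)):
    assert f(1, t) == 2 and f(-2, t) == -1 and f(2, t) == 0
    assert fprime(1, t) == 5
print("all members pass through the three points with slope 5 at x = 1")
```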


d  Let f(x) = ax⁴ + bx³ + cx² + dx + e be the rule for the family of quartic
   polynomials.
   Since they pass through (1, 2), (-2, -1) and (2, 0), and have slope 5 at
   x = 1, we have the following equations (see example above for details):
   a + b + c + d + e = 2
   16a - 8b + 4c - 2d + e = -1
   4a + 3b + 2c + d = 5
   16a + 8b + 4c + 2d + e = 0
   Now we are looking for the particular curve that passes through (-1, 5),
   so f(-1) = 5, and this gives a fifth equation a - b + c - d + e = 5.
   The augmented matrix for this system of 5 equations is
   [1 1 1 1 1 2; 16 -8 4 -2 1 -1; 4 3 2 1 0 5; 16 8 4 2 1 0; 1 -1 1 -1 1 5]
   which has reduced echelon form
   [1 0 0 0 0 -4/3; 0 1 0 0 0 7/12; 0 0 1 0 0 16/3; 0 0 0 1 0 -25/12; 0 0 0 0 1 -1/2].
   Hence f(x) = -(4/3)x⁴ + (7/12)x³ + (16/3)x² - (25/12)x - 1/2, and its graph is given in
   the figure on the following page.


Student activity 4.1


a  [a b; c d] × [0; 0] = [0; 0] and hence any linear transformation maps the origin to
   itself.
b  T cannot be uniquely determined since [3; -1] = -1 × [-3; 1], that is, one vector
   is a multiple of the other.
   Let [a b; c d] be the matrix for T.
   Then [a b; c d] × [3; -1] = [-1; 1] and [a b; c d] × [-3; 1] = [1; -1].
   This yields the equations 3a - b = -1, 3c - d = 1 and -3a + b = 1, -3c + d = -1.
   Grouping the equations for a and b and for c and d together, we have
   {3a - b = -1, -3a + b = 1} and {3c - d = 1, -3c + d = -1}
   But the two equations in each set are in fact a multiple of one another,
   so we only have two distinct equations in 4 unknowns. If we choose any
   values for b and d, then the values of a and c will be determined by these
   equations.
   Take b = 4 and d = -1. Then a = 1 and c = 0, giving [1 4; 0 -1] as a matrix
   for T.
   Take b = 4 and d = 2. Then a = 1 and c = 1, giving [1 4; 1 2] as a matrix
   for T.
   Take b = 4 and d = -4. Then a = 1 and c = -1, giving [1 4; -1 -4] as a
   matrix for T (and this matrix is singular).
c  Let [4 3; 5 4] × [x; y] = [1; 0] and [4 3; 5 4] × [r; s] = [0; 1].
   Then [x; y] = [4 3; 5 4]⁻¹ × [1; 0] and [r; s] = [4 3; 5 4]⁻¹ × [0; 1].
   Since [4 3; 5 4]⁻¹ = [4 -3; -5 4], we have [x; y] = [4; -5] and [r; s] = [-3; 4].
   So (4, -5) is mapped to (1, 0) and (-3, 4) is mapped to (0, 1).
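The three candidate matrices for T in part b all agree on the given vectors, which is exactly the non-uniqueness being illustrated; a quick check:

```python
candidates = [
    [[1, 4], [0, -1]],
    [[1, 4], [1, 2]],
    [[1, 4], [-1, -4]],   # this one is singular: its determinant is 0
]
for M in candidates:
    # The image of (3, -1) must be (-1, 1).
    image = (3 * M[0][0] - M[0][1], 3 * M[1][0] - M[1][1])
    assert image == (-1, 1)
print("all candidates map (3, -1) to (-1, 1)")
```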
Student activity 4.2
a  Property 1: [a b; c d] × [0; 0] = [0; 0] and hence any linear transformation maps the
   origin to itself.
   Property 2: A non-singular linear transformation is onto, since if A is
   the matrix of the transformation then any point v = (x, y) is the image of
   A⁻¹v.
   A non-singular linear transformation is one-to-one, since if A is the
   matrix of the transformation, then if Av₁ = Av₂ for any v₁, v₂ ∈ R², then
   A(v₁ - v₂) = O, so A⁻¹A(v₁ - v₂) = A⁻¹O, that is, v₁ - v₂ = O, and
   so v₁ = v₂.
   Property 3: Consider a straight line with vector equation r = u + tv, and
   a transformation T with matrix A. Then T(r) = Au + tAv = u₁ + tv₁, where
   u₁, v₁ ∈ R², and so this is the vector equation for a line.
   Property 4: Two distinct parallel lines will have vector equations of the
   form r₁ = u₁ + tv and r₂ = u₂ + tv, where u₁ ≠ u₂ and v gives the direction of
   the lines. Under the transformation with matrix A, the direction of both
   lines will be Av, so they will still be parallel and distinct, since Au₁ ≠ Au₂.
   Property 5: A line through the origin can be written in vector form
   r = tv, and if A is the matrix of the transformation then T(r) = tAv, which
   also passes through the origin.


b  [x₁; y₁] = [2 1; 1 3] × [x; y]
   [x; y] = [2 1; 1 3]⁻¹ × [x₁; y₁] = [3/5 -1/5; -1/5 2/5] × [x₁; y₁]
   so x = (3/5)x₁ - (1/5)y₁ and y = -(1/5)x₁ + (2/5)y₁.
   y = 5 - 3x becomes
   -(1/5)x₁ + (2/5)y₁ = 5 - 3((3/5)x₁ - (1/5)y₁)
   which simplifies to y₁ = 8x₁ - 25, or y = 8x - 25.
c  [x₁; y₁] = [1 1; 1 1] × [x; y]
   x₁ = x + y and y₁ = x + y
   Since x + y is not constant on the line y = 5 - 3x, and x₁ = y₁, the image of
   y = 5 - 3x is the line y = x.
d  If a line maps to a point, then from c, x₁ = y₁ = k, where k is a real
   constant. So lines of the form x + y = k are mapped to the point (k, k) under
   this transformation.
e  [2 1; 1 3] × [0; 0] = [0; 0]
   [2 1; 1 3] × [0; 1] = [1; 3]
   [2 1; 1 3] × [1; 1] = [3; 4]
   [2 1; 1 3] × [1; 0] = [2; 1]
   So the transformation maps the unit square to the parallelogram with
   vertices (0, 0), (1, 3), (3, 4) and (2, 1), and the area of the parallelogram is
   det[2 1; 1 3] × area of unit square = 5 square units.
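The area claim in part e can be cross-checked with the shoelace formula applied to the image parallelogram:

```python
# Vertices of the image of the unit square, in order around the parallelogram.
verts = [(0, 0), (1, 3), (3, 4), (2, 1)]
s = sum(x1 * y2 - x2 * y1
        for (x1, y1), (x2, y2) in zip(verts, verts[1:] + verts[:1]))
area = abs(s) / 2                      # shoelace formula
det = 2 * 3 - 1 * 1                    # determinant of [2 1; 1 3]
assert area == abs(det) == 5
print("parallelogram area =", area)
```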


Student activity 4.3


a  [x₁; y₁] = [3 0; 0 2] × [x; y]
   [x; y] = [1/3 0; 0 1/2] × [x₁; y₁], so x = x₁/3 and y = y₁/2.
   Hence the image of y = sin(x) is y₁/2 = sin(x₁/3), or y = 2 sin(x/3).
b  [x₁; y₁] = [3 5; 1 2] × [x; y]
   [x; y] = [2 -5; -1 3] × [x₁; y₁] = [2x₁ - 5y₁; -x₁ + 3y₁]
   Hence the image of y = x² is -x₁ + 3y₁ = (2x₁ - 5y₁)², or
   4x² - 20xy + 25y² + x - 3y = 0.
c  [x₁; y₁] = [1 1; 1 1] × [x; y]
   So x + y = x₁ = y₁, and so the image of y = x² is y = x.
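For part b, a numerical spot-check: points (t, t²) on y = x² are mapped by [3 5; 1 2], and their images should satisfy the conic equation found above.

```python
# Image of (t, t^2) under the matrix [3 5; 1 2] is (3t + 5t^2, t + 2t^2);
# each image point should satisfy 4x^2 - 20xy + 25y^2 + x - 3y = 0.
for t in [-2.0, -0.5, 0.0, 1.0, 3.0]:
    x = 3 * t + 5 * t**2
    y = t + 2 * t**2
    assert abs(4 * x**2 - 20 * x * y + 25 * y**2 + x - 3 * y) < 1e-9
print("all image points lie on the conic")
```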

Student activity 4.4


a  [ky 0; 0 kx] × [0; 0] = [0; 0]
   [ky 0; 0 kx] × [1; 0] = [ky; 0]
   [ky 0; 0 kx] × [1; 1] = [ky; kx]
   [ky 0; 0 kx] × [0; 1] = [0; kx]
   Hence the area of the rectangle formed by the transformed vertices is kx × ky.
b  [ky 0; 0 kx] × [x; y] = [x₁; y₁]
   So x = x₁/ky and y = y₁/kx, and so y = f(x) is transformed to
   y₁/kx = f(x₁/ky), that is, y = kx f(x/ky).
c  From part b, x = x₁/a and y = y₁/b, and so x² + y² = 1 is transformed to
   (x₁/a)² + (y₁/b)² = 1, that is, (x/a)² + (y/b)² = 1. This is the equation of an ellipse
   with horizontal axis length 2a and vertical axis length 2b.
   Hence area of ellipse = area of unit circle × det[a 0; 0 b] = πab.
d  The matrix for an anticlockwise rotation about the origin through an angle
   of θ is [cos(θ) -sin(θ); sin(θ) cos(θ)].
   The matrix for an anticlockwise rotation about the origin through an
   angle of φ is [cos(φ) -sin(φ); sin(φ) cos(φ)], and the matrix for an anticlockwise rotation
   about the origin through an angle of θ + φ is [cos(θ + φ) -sin(θ + φ); sin(θ + φ) cos(θ + φ)].
   Then, since a rotation through an angle of φ followed by a rotation
   through an angle θ is equivalent to a rotation through an angle θ + φ:
   [cos(θ) -sin(θ); sin(θ) cos(θ)] × [cos(φ) -sin(φ); sin(φ) cos(φ)] = [cos(θ + φ) -sin(θ + φ); sin(θ + φ) cos(θ + φ)]
   Expanding the left-hand product gives
   [cos(θ)cos(φ) - sin(θ)sin(φ)   -cos(θ)sin(φ) - sin(θ)cos(φ);
    sin(θ)cos(φ) + cos(θ)sin(φ)   cos(θ)cos(φ) - sin(θ)sin(φ)]
   And so cos(θ + φ) = cos(θ)cos(φ) - sin(θ)sin(φ) and
   sin(θ + φ) = sin(θ)cos(φ) + cos(θ)sin(φ).
e  Now [1 3; 0 1] × [x; y] = [x + 3y; y], so the points (0, 0), (1, 0), (1, 1) and (0, 1) are
   transformed to the points (0, 0), (1, 0), (4, 1) and (3, 1) respectively. The
   unit square and its image under this transformation are shown on the
   following page.
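The angle-addition identities derived in part d can be sanity-checked numerically by comparing the product of two rotation matrices with the single rotation through the combined angle:

```python
import math

def rotation(t):
    """Matrix of an anticlockwise rotation about the origin through angle t."""
    return [[math.cos(t), -math.sin(t)], [math.sin(t), math.cos(t)]]

def matmul2(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

theta, phi = 0.7, 1.1
product = matmul2(rotation(theta), rotation(phi))
combined = rotation(theta + phi)
assert all(abs(product[i][j] - combined[i][j]) < 1e-12
           for i in range(2) for j in range(2))
print("rotation(theta) * rotation(phi) == rotation(theta + phi)")
```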


f  [x₁; y₁] = [1 3; 0 1] × [x; y]
   so [x; y] = [1 3; 0 1]⁻¹ × [x₁; y₁] = [1 -3; 0 1] × [x₁; y₁] = [x₁ - 3y₁; y₁].
   Hence the function y = f(x) is transformed to y = f(x - 3y) under this
   transformation.
   The image of y = x² will be y = (x - 3y)².


Student activity 4.5

a  Dilation by factor 0.5 from the y-axis has matrix [0.5 0; 0 1].
   Reflection in the y-axis has matrix [-1 0; 0 1].
   Translation 3 units parallel to the x-axis and 1 unit parallel to the y-axis is
   represented by [3; 1].
   Hence [x₁; y₁] = [-1 0; 0 1] × [0.5 0; 0 1] × [x; y] + [3; 1]
   Then [x; y] = [0.5 0; 0 1]⁻¹ × [-1 0; 0 1]⁻¹ × [x₁ - 3; y₁ - 1] = [-2(x₁ - 3); y₁ - 1]
   So y = f(x) becomes y - 1 = f(-2(x - 3)), and hence y = 1/x is transformed
   to y - 1 = 1/(-2(x - 3)), or y = 1/(-2(x - 3)) + 1.
b  [1 0; 0 -1] gives reflection in the x-axis, and [2; -1] gives a translation of 2 units to
   the right and 1 unit down.
   Then [x₁; y₁] = [1 0; 0 -1] × [x; y] + [2; -1]
   and [x₁; y₁] - [2; -1] = [1 0; 0 -1] × [x; y]
   that is, [x₁ - 2; y₁ + 1] = [1 0; 0 -1] × [x; y], and so
   [x; y] = [1 0; 0 -1]⁻¹ × [x₁ - 2; y₁ + 1] = [1 0; 0 -1] × [x₁ - 2; y₁ + 1] = [x₁ - 2; -(y₁ + 1)]
   So x - y² = 0 is transformed to (x - 2) - (y + 1)² = 0.
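The composite transformation of part a sends (x, y) to (-0.5x + 3, y + 1); pushing sample points of y = 1/x through it confirms the transformed equation:

```python
for x in [-4.0, -1.0, 0.5, 2.0, 5.0]:
    y = 1 / x
    x1 = -0.5 * x + 3          # halve, reflect in the y-axis, then shift 3 right
    y1 = y + 1                 # shift 1 up
    # Image points should lie on y = -1/(2(x - 3)) + 1.
    assert abs(y1 - (-1 / (2 * (x1 - 3)) + 1)) < 1e-9
print("all transformed points lie on the image curve")
```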


Student activity 5.1


a  S₁ = PS₀ = [0 0.25; 1 0.75] × [0.5; 0.5] = [0.125; 0.875]
   S₂ = P²S₀ = [0.21875; 0.78125]
   S₅ = P⁵S₀ ≈ [0.19971; 0.80029]
   S₁₀ = P¹⁰S₀ ≈ [0.20000; 0.80000]
   S₅₀ = P⁵⁰S₀ ≈ [0.20000; 0.80000]
b  P³ = [0.445 0.444; 0.555 0.556]
   Hence the probability of going from state 1 to state 2 in 3 transitions is
   the element in the (2, 1) position of P³, which is 0.555.
   Similarly, the probability of going from state 2 to state 2 in 3 transitions
   is the element in the (2, 2) position of P³, which is 0.556.
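The state vectors of part a can be generated by repeated multiplication, showing the approach to the steady state [0.2; 0.8]:

```python
P = [[0.0, 0.25], [1.0, 0.75]]

def step(s):
    """One transition: multiply the state vector by P."""
    return [P[0][0] * s[0] + P[0][1] * s[1],
            P[1][0] * s[0] + P[1][1] * s[1]]

s = [0.5, 0.5]
snapshots = {}
for n in range(1, 51):
    s = step(s)
    if n in (1, 2, 5, 10, 50):
        snapshots[n] = [round(v, 5) for v in s]
print(snapshots)
assert snapshots[1] == [0.125, 0.875]
assert snapshots[2] == [0.21875, 0.78125]
assert abs(snapshots[50][0] - 0.2) < 1e-5
```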

Student activity 5.2


a  P² = [0.75 0.50; 0.25 0.50] has no zero entries, so P is regular.
   Since a = 1/2 and b = 1, S = [2/3; 1/3] and L = [2/3 2/3; 1/3 1/3].
b  P = [0 1; 1 0] = P³ = P⁵ = …
   P² = [1 0; 0 1] = I = P⁴ = P⁶ = …
   P is not regular. The limit matrix L does not exist, as powers of P
   oscillate between [0 1; 1 0] and [1 0; 0 1]. There is no limiting state vector;
   however, if S = [0.5; 0.5], then PS = S.
c  It is not regular since any power of P will have a zero in the (1, 2) position. (In
   fact, Pⁿ = [aⁿ 0; 1 - aⁿ 1].) The steady-state vector is S = [0; 1] and
   lim Pⁿ = [0 0; 1 1] as n → ∞.


Student activity 5.3


a  P = [1 1/2 0; 0 1/2 1; 0 0 0]
   P¹⁰ = [1 1023/1024 511/512; 0 1/1024 1/512; 0 0 0]
   P²⁰ = [1 1048575/1048576 524287/524288; 0 1/1048576 1/524288; 0 0 0]
   lim Pⁿ = [1 1 1; 0 0 0; 0 0 0] as n → ∞
   Steady-state vector S = [1; 0; 0]
   Check: PS = [1 1/2 0; 0 1/2 1; 0 0 0] × [1; 0; 0] = [1; 0; 0] = S
b  i   [1 0.75 0.5 0.5 0.25 0; 0 0 0 0 0 0; 0 0 0 0 0 0; 0 0 0 0 0 0; 0 0 0 0 0 0; 0 0.25 0.5 0.5 0.75 1] × [0.1; 0.2; 0.2; 0.2; 0.2; 0.1] = [0.5; 0; 0; 0; 0; 0.5]
   ii  [1 0.75 0.5 0.5 0.25 0; 0 0 0 0 0 0; 0 0 0 0 0 0; 0 0 0 0 0 0; 0 0 0 0 0 0; 0 0.25 0.5 0.5 0.75 1] × [0; 0.1; 0.3; 0.4; 0.2; 0] = [19/40; 0; 0; 0; 0; 21/40]
   iii [1 0.75 0.5 0.5 0.25 0; 0 0 0 0 0 0; 0 0 0 0 0 0; 0 0 0 0 0 0; 0 0 0 0 0 0; 0 0.25 0.5 0.5 0.75 1] × [0; 0.25; 0.35; 0.15; 0.25; 0] = [0.5; 0; 0; 0; 0; 0.5]
c  The transition probabilities each year are:

                        Current state
                        A      B      C
   State next year  A   0.6    0.2    0.05
                    B   0.1    0.6    0.15
                    C   0.3    0.2    0.8

   Then the transition matrix is P = [0.6 0.2 0.05; 0.1 0.6 0.15; 0.3 0.2 0.8], and the
   initial state vector is S₀ = [1/3; 1/3; 1/3], and so the distribution after 3 years is
   P³S₀ = [0.6 0.2 0.05; 0.1 0.6 0.15; 0.3 0.2 0.8]³ × [1/3; 1/3; 1/3] ≈ [0.226; 0.256; 0.518]
   After 3 years approximately 22.6% of the population will live in
   state A, 25.6% in state B and 51.8% in state C.
   To find the long term population distribution, we can investigate powers
   of P:
   P¹⁵ ≈ [0.1961 0.1961 0.1961; 0.2549 0.2549 0.2549; 0.5490 0.5490 0.5490]
   P²⁰ ≈ [0.1961 0.1961 0.1961; 0.2549 0.2549 0.2549; 0.5490 0.5490 0.5490]
   It appears that, in the long term, approximately 19.6% of the population
   will live in state A, 25.5% in state B and 54.9% in state C.
   Alternatively, consider
   I - P = [1 0 0; 0 1 0; 0 0 1] - [0.6 0.2 0.05; 0.1 0.6 0.15; 0.3 0.2 0.8] = [0.4 -0.2 -0.05; -0.1 0.4 -0.15; -0.3 -0.2 0.2]
   which has reduced echelon form [1 0 -5/14; 0 1 -13/28; 0 0 0], and if S = [x; y; z],
   then we have x = 5z/14, y = 13z/28 and x + y + z = 1, which has solution
   x = 10/51, y = 13/51, z = 28/51, or, approximately, x = 0.1961, y = 0.2549,
   z = 0.5490.
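The long-term distribution found above can be confirmed by power iteration, checking against the exact answer (10/51, 13/51, 28/51):

```python
P = [[0.6, 0.2, 0.05],
     [0.1, 0.6, 0.15],
     [0.3, 0.2, 0.8]]

s = [1 / 3, 1 / 3, 1 / 3]
for _ in range(200):                      # repeated transitions: s -> P s
    s = [sum(P[i][j] * s[j] for j in range(3)) for i in range(3)]

exact = [10 / 51, 13 / 51, 28 / 51]      # from solving (I - P)S = O with x + y + z = 1
assert all(abs(a - b) < 1e-9 for a, b in zip(s, exact))
print([round(v, 4) for v in s])  # [0.1961, 0.2549, 0.549]
```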
