
Representation and Summary of Data - Dispersion

The last chapter was based on calculating averages from sets of data.
The average alone does not give the full picture, though.
This chapter looks at measures of dispersion: how spread out the data is.

Teachings for Exercise 3A

Range and Quartiles

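The Quartiles Q1, Q2 and Q3 split the data into 4 parts, with 25% of the data in each:

Lowest Value | 25% | Q1 | 25% | Q2 | 25% | Q3 | 25% | Highest Value

For discrete data:
Q1 = the $\frac{n}{4}$th term
Q2 = the $\frac{n}{2}$th term
Q3 = the $\frac{3n}{4}$th term
Remember, if the result is whole, you need the midpoint of that term and the term above. If not, round up and find the corresponding term.

For continuous data:
Use interpolation (like with the median from chapter 2):
$$\text{LB} + \frac{\text{PL}}{\text{GF}} \times \text{CW}$$
Here LB is the lower boundary of the group containing the required term, PL is how many places into that group the term lies, GF is the group frequency and CW is the class width.

Inter-quartile range = Upper Quartile - Lower Quartile = Q3 - Q1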


Calculate the Range and Inter-quartile range of the following data.
7, 9, 4, 6, 3, 2, 8, 1, 10, 15, 11

Putting the data in order:
1, 2, 3, 4, 6, 7, 8, 9, 10, 11, 15

Range = 15 - 1 = 14

Lower Quartile: $\frac{n}{4} = \frac{11}{4} = 2.75$, so round up to the 3rd term: Q1 = 3
Upper Quartile: $\frac{3n}{4} = \frac{33}{4} = 8.25$, so round up to the 9th term: Q3 = 10

Inter-quartile Range = 10 - 3 = 7
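The discrete-data rule used above can be sketched in a few lines of Python. This is only an illustration of the rule as stated on the slides; the function name `quartile` and its argument `q` are made up for this example.

```python
def quartile(sorted_data, q):
    """Quartile by the slide rule; q is 1, 2 or 3 for Q1, Q2, Q3."""
    n = len(sorted_data)
    numerator = q * n                # the term number is numerator / 4
    if numerator % 4 == 0:
        # Whole number: midpoint of that term and the term above
        k = numerator // 4
        return (sorted_data[k - 1] + sorted_data[k]) / 2
    # Not whole: round up and take the corresponding term (1-indexed)
    k = numerator // 4 + 1
    return sorted_data[k - 1]

data = sorted([7, 9, 4, 6, 3, 2, 8, 1, 10, 15, 11])
data_range = data[-1] - data[0]
q1 = quartile(data, 1)
q3 = quartile(data, 3)
iqr = q3 - q1
print(data_range, q1, q3, iqr)  # 14 3 10 7
```

Integer arithmetic is used for the "is it whole?" check so that no floating-point comparison is needed.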

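Rebecca records the number of CDs in the collections of students in her year. The results are shown in the table below. Calculate the Inter-quartile range (IQR).

| Number of CDs | No. Students, f | Cumulative Frequency |
|---|---|---|
| 35 | 3 | 3 |
| 36 | 17 | 20 |
| 37 | 29 | 49 |
| 38 | 34 | 83 |
| 39 | 12 | 95 |

The data is discrete, so use the term rule.

Q1: $\frac{n}{4} = \frac{95}{4} = 23.75$, so round up to the 24th term: Q1 = 37
Q3: $\frac{3n}{4} = \frac{285}{4} = 71.25$, so round up to the 72nd term: Q3 = 38

IQR = Q3 - Q1
= 38 - 37
= 1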
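As a rough sketch, the same lookup can be done from the frequency table by walking the cumulative frequencies. The frequency of 3 for 35 CDs is inferred from the cumulative column (20 - 17), and the helper names `term` and `quartile` are hypothetical.

```python
from itertools import accumulate

values = [35, 36, 37, 38, 39]         # number of CDs
freqs = [3, 17, 29, 34, 12]           # number of students, f (3 is inferred)
cum_freqs = list(accumulate(freqs))   # running totals: [3, 20, 49, 83, 95]
n = cum_freqs[-1]

def term(k):
    """Return the k-th smallest value (1-indexed) using cumulative frequency."""
    for value, cum in zip(values, cum_freqs):
        if k <= cum:
            return value
    raise ValueError("k is larger than the number of observations")

def quartile(q):
    """q = 1, 2 or 3; applies the discrete-data rule to the (q*n/4)th term."""
    numerator = q * n
    if numerator % 4 == 0:
        k = numerator // 4
        return (term(k) + term(k + 1)) / 2   # midpoint with the term above
    return term(numerator // 4 + 1)          # round up

q1, q3 = quartile(1), quartile(3)
print(q1, q3, q3 - q1)  # 37 38 1
```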

The length of time spent on the internet each evening by a group of students is shown in the table below. Calculate the Inter-quartile range.

| Time (mins) | No. Students | Cumulative Frequency |
|---|---|---|
| 30-31 | 2 | 2 |
| 32-33 | 25 | 27 |
| 34-36 | 30 | 57 |
| 37-39 | 13 | 70 |

The data is continuous, so use interpolation. The 32-33 class has boundaries 31.5 and 33.5, and the 34-36 class has boundaries 33.5 and 36.5.

Lower Quartile: $\frac{n}{4} = \frac{70}{4} = 17.5$th term. This lies in the 32-33 class, 17.5 - 2 = 15.5 places into it.
$$Q_1 = \text{LB} + \frac{\text{PL}}{\text{GF}} \times \text{CW} = 31.5 + \frac{15.5}{25} \times 2 = 32.74$$

Upper Quartile: $\frac{3n}{4} = \frac{210}{4} = 52.5$th term. This lies in the 34-36 class, 52.5 - 27 = 25.5 places into it.
$$Q_3 = \text{LB} + \frac{\text{PL}}{\text{GF}} \times \text{CW} = 33.5 + \frac{25.5}{30} \times 3 = 36.05$$

IQR = Q3 - Q1
= 36.05 - 32.74
= 3.31
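The interpolation steps can be sketched as a small Python function. This is an illustrative sketch, not a standard library routine: the name `interpolate` is made up, and the outer boundaries 29.5 and 39.5 are assumed by extending the pattern of the boundaries quoted in the working (31.5, 33.5, 36.5). The frequency of 2 for the first class is inferred from the cumulative column.

```python
def interpolate(position, groups):
    """Estimate the value of the given term by linear interpolation.

    groups: list of (lower_boundary, upper_boundary, frequency) in order.
    """
    cumulative = 0                        # frequency below the current group
    for lb, ub, gf in groups:
        if gf > 0 and position <= cumulative + gf:
            pl = position - cumulative    # places into the group
            cw = ub - lb                  # class width
            return lb + pl / gf * cw
        cumulative += gf
    raise ValueError("position exceeds the total frequency")

# Class boundaries; 29.5 and 39.5 are assumptions
groups = [(29.5, 31.5, 2), (31.5, 33.5, 25), (33.5, 36.5, 30), (36.5, 39.5, 13)]
n = sum(gf for _, _, gf in groups)

q1 = interpolate(n / 4, groups)       # 17.5th term
q3 = interpolate(3 * n / 4, groups)   # 52.5th term
print(round(q1, 2), round(q3, 2), round(q3 - q1, 2))  # 32.74 36.05 3.31
```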


Teachings for Exercise 3B

Percentiles
A Percentile is similar to a quartile. The 70th percentile of a set of data is the value that has 70% of the data before it. It is normally written P70.
The 62nd percentile is the value that has 62% of the data before it, P62.

To calculate Px, you find the value of the $\frac{xn}{100}$th term.
For the 31st percentile: the $\frac{31n}{100}$th term
For the 90th percentile: the $\frac{90n}{100}$th term

You can calculate the n% to m% Inter-percentile range as Pm - Pn.

The Quartiles are effectively percentiles:
Q1 = P25
Q2 = P50
Q3 = P75


The height, in cm, of 70 eighteen-year-old boys was measured and the data put into the table below. Calculate the 90th percentile, the 10th percentile and the 10% to 90% Inter-percentile range.

| Height | Students | Cumulative Frequency |
|---|---|---|
| 150-160 | 4 | 4 |
| 160-170 | 21 | 25 |
| 170-180 | 32 | 57 |
| 180-190 | 9 | 66 |
| 190-200 | 4 | 70 |

90th percentile: $\frac{90n}{100} = \frac{6300}{100}$ = 63rd term. This lies in the 180-190 class, 63 - 57 = 6 places into it.
$$P_{90} = \text{LB} + \frac{\text{PL}}{\text{GF}} \times \text{CW} = 180 + \frac{6}{9} \times 10 = 186.67 \text{ (2dp)}$$

10th percentile: $\frac{10n}{100} = \frac{700}{100}$ = 7th term. This lies in the 160-170 class, 7 - 4 = 3 places into it.
$$P_{10} = \text{LB} + \frac{\text{PL}}{\text{GF}} \times \text{CW} = 160 + \frac{3}{21} \times 10 = 161.43 \text{ (2dp)}$$

The 10% to 90% Inter-percentile range = P90 - P10
= 186.67 - 161.43
= 25.24cm
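A minimal Python sketch of the percentile calculation, using the same interpolation idea. The function name `percentile` is hypothetical, and the frequencies 4, 9 and 4 are inferred from differences in the cumulative frequency column.

```python
def percentile(x, classes):
    """Estimate P_x by linear interpolation.

    classes: list of (lower_boundary, upper_boundary, frequency) in order.
    """
    n = sum(f for _, _, f in classes)
    position = x * n / 100               # the (xn/100)th term
    cumulative = 0                       # frequency below the current class
    for lb, ub, gf in classes:
        if gf > 0 and position <= cumulative + gf:
            pl = position - cumulative   # places into the class
            return lb + pl / gf * (ub - lb)
        cumulative += gf
    raise ValueError("x must be between 0 and 100")

heights = [(150, 160, 4), (160, 170, 21), (170, 180, 32),
           (180, 190, 9), (190, 200, 4)]
p90 = percentile(90, heights)
p10 = percentile(10, heights)
print(round(p90, 2), round(p10, 2), round(p90 - p10, 2))  # 186.67 161.43 25.24
```

Since Q1 = P25 and Q3 = P75, the same function also gives the quartiles of grouped data.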


Teachings for Exercise 3C

Variance and Standard Deviation
Variance and Standard Deviation are measures of how far the data is spread from the mean. If the mean is $\bar{x}$ and an observation is $x$, then the observation's dispersion from the mean is $x - \bar{x}$.

The variance is therefore given by:
$$\text{Variance} = \frac{\sum (x - \bar{x})^2}{n}$$
The numerator is the sum of the squared dispersions from the mean (squaring removes any negative values) and $n$ is the number of observations.

However, a formula which is more commonly used, especially with larger sets of data, is:
$$\text{Variance} = \frac{\sum x^2}{n} - \left(\frac{\sum x}{n}\right)^2$$
That is, the mean of the squares minus the square of the mean.

The Standard Deviation, $\sigma$, is given by $\sqrt{\text{Variance}}$.
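To see why the two variance formulas agree, expand the square and write $\bar{x} = \frac{\sum x}{n}$. The derivation below is a short worked sketch using only those definitions:

```latex
\begin{align*}
\frac{\sum (x - \bar{x})^2}{n}
  &= \frac{\sum x^2 - 2\bar{x}\sum x + n\bar{x}^2}{n} \\
  &= \frac{\sum x^2}{n} - 2\bar{x}\cdot\frac{\sum x}{n} + \bar{x}^2 \\
  &= \frac{\sum x^2}{n} - 2\bar{x}^2 + \bar{x}^2 \\
  &= \frac{\sum x^2}{n} - \left(\frac{\sum x}{n}\right)^2
\end{align*}
```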
Important point:
The Standard Deviation tells you the range either side of the mean which contains around 68% of the data (if the data is normally distributed).
For example, suppose 100 students have a mean height of 150cm and a standard deviation of 10cm.
Around 68 of the students are within one Standard Deviation of the mean (140cm to 160cm).
Around 95 of the students are within two Standard Deviations of the mean (130cm to 170cm).

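Given that for x:
$\sum x = 42$, $\sum x^2 = 720$, $n = 5$
Calculate the Variance and Standard Deviation of x.

$$\sigma^2 = \frac{\sum x^2}{n} - \left(\frac{\sum x}{n}\right)^2 = \frac{720}{5} - \left(\frac{42}{5}\right)^2 = 144 - 70.56$$
Which part is the mean? It is $\frac{\sum x}{n} = \frac{42}{5} = 8.4$.

Variance: $\sigma^2 = 73.44$
Standard Deviation: $\sigma = \sqrt{73.44} = 8.57$ (2dp)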

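Use the formula to calculate the variance and standard deviation of the following numbers:
3, 4, 6, 2, 8, 8, 5

| | | | | | | | | Total |
|---|---|---|---|---|---|---|---|---|
| $x$ | 3 | 4 | 6 | 2 | 8 | 8 | 5 | 36 |
| $x^2$ | 9 | 16 | 36 | 4 | 64 | 64 | 25 | 218 |

$\sum x = 36$, $\sum x^2 = 218$, $n = 7$

$$\sigma^2 = \frac{\sum x^2}{n} - \left(\frac{\sum x}{n}\right)^2 = \frac{218}{7} - \left(\frac{36}{7}\right)^2 = 31.14 - 26.45$$

Variance: $\sigma^2 = 4.69$
Standard Deviation: $\sigma = \sqrt{4.69} = 2.17$ (2dp)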
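The "mean of the squares minus the square of the mean" formula is easy to check numerically. The sketch below applies it to the summary values and the list of numbers from the worked examples in this section; `variance_from_sums` is a name invented for this illustration, and the last line cross-checks against Python's `statistics.pvariance` (population variance).

```python
import math
import statistics

def variance_from_sums(sum_x, sum_x2, n):
    """Mean of the squares minus the square of the mean."""
    return sum_x2 / n - (sum_x / n) ** 2

# Summary values: sum of x = 42, sum of x^2 = 720, n = 5
var_a = variance_from_sums(42, 720, 5)
sd_a = math.sqrt(var_a)

# Raw data: 3, 4, 6, 2, 8, 8, 5
data = [3, 4, 6, 2, 8, 8, 5]
var_b = variance_from_sums(sum(data), sum(x * x for x in data), len(data))
sd_b = math.sqrt(var_b)

print(round(var_a, 2), round(sd_a, 2))  # 73.44 8.57
print(round(var_b, 2), round(sd_b, 2))  # 4.69 2.17

# Cross-check against the standard library's population variance
assert math.isclose(var_b, statistics.pvariance(data))
```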


Teachings for Exercise 3D

Variance and Standard Deviation from a Table
As with the averages from Chapter 2, you need to be able to calculate the Variance and Standard Deviation from a frequency table, grouped or ungrouped.

This was the formula from before:
$$\text{Variance} = \frac{\sum x^2}{n} - \left(\frac{\sum x}{n}\right)^2$$

The formula for tabled data is similar:
$$\text{Variance} = \frac{\sum fx^2}{\sum f} - \left(\frac{\sum fx}{\sum f}\right)^2$$
Here $\sum fx^2$ is the sum of frequency times $x^2$, $\sum fx$ is the sum of frequency times $x$, and $\sum f$ is the sum of the frequencies.

The difference reflects the fact that each value of x will appear many times, rather than just once or a few times.

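Calculate the Variance and Standard Deviation of a set of data with the following values already calculated.

$\sum fx = 224$, $\sum fx^2 = 8731$, $\sum f = 25$

$$\sigma^2 = \frac{\sum fx^2}{\sum f} - \left(\frac{\sum fx}{\sum f}\right)^2 = \frac{8731}{25} - \left(\frac{224}{25}\right)^2$$

Variance: $\sigma^2 = 268.9584$
Standard Deviation: $\sigma = \sqrt{268.9584} = 16.40$ (2dp)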
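Sue records the time spent in town at lunchtime (mins) of students in her year. The results are in the table. Calculate the Standard Deviation of the time spent out of school.

| Time (mins), x | No. students (f) | fx | fx² |
|---|---|---|---|
| 35 | 3 | 105 (3 x 35) | 3675 (3 x 35²) |
| 36 | 17 | 612 (17 x 36) | 22032 (17 x 36²) |
| 37 | 29 | 1073 | 39701 |
| 38 | 34 | 1292 | 49096 |
| 39 | 26 | 1014 | 39546 |
| Total | 109 | 4096 | 154050 |

$\sum fx = 4096$, $\sum fx^2 = 154050$, $\sum f = 109$

$$\sigma^2 = \frac{\sum fx^2}{\sum f} - \left(\frac{\sum fx}{\sum f}\right)^2 = \frac{154050}{109} - \left(\frac{4096}{109}\right)^2 = 1.19805...$$

Variance: $\sigma^2 = 1.20$ (2dp)
Standard Deviation: $\sigma = \sqrt{1.19805...} = 1.09$ (2dp)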
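The table calculation can be reproduced with a short Python sketch. The variable names are illustrative, and the frequency of 3 for 35 minutes is taken from the "(3 x 35)" annotation in the table.

```python
import math

# (x, f) pairs from the table: time in minutes and number of students
table = [(35, 3), (36, 17), (37, 29), (38, 34), (39, 26)]

sum_f = sum(f for x, f in table)
sum_fx = sum(f * x for x, f in table)
sum_fx2 = sum(f * x * x for x, f in table)

# Variance = (sum of fx^2 / sum of f) - (sum of fx / sum of f)^2
variance = sum_fx2 / sum_f - (sum_fx / sum_f) ** 2
std_dev = math.sqrt(variance)

print(sum_f, sum_fx, sum_fx2)                 # 109 4096 154050
print(round(variance, 2), round(std_dev, 2))  # 1.2 1.09
```

Note that the variance rounds to 1.20, while the standard deviation (its square root) is about 1.09.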

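Andy recorded the lengths of telephone calls he made over the course of a month. Calculate an estimate of the Standard Deviation of his calls.

| Length | Calls, f | Midpoint (x) | fx | fx² |
|---|---|---|---|---|
| 0-5 | 4 | 2.5 | 10 (4 x 2.5) | 25 (4 x 2.5²) |
| 5-10 | 15 | 7.5 | 112.5 (15 x 7.5) | 843.75 (15 x 7.5²) |
| 10-15 | 5 | 12.5 | 62.5 | 781.25 |
| 15-20 | 2 | 17.5 | 35 | 612.5 |
| 20-25 | 0 | 22.5 | 0 | 0 |
| 25-30 | 1 | 27.5 | 27.5 | 756.25 |
| Total | 27 | | 247.5 | 3018.75 |

$\sum fx = 247.5$, $\sum fx^2 = 3018.75$, $\sum f = 27$

$$\sigma^2 = \frac{\sum fx^2}{\sum f} - \left(\frac{\sum fx}{\sum f}\right)^2 = \frac{3018.75}{27} - \left(\frac{247.5}{27}\right)^2 = 27.78$$

Standard Deviation: $\sigma = \sqrt{27.78} = 5.27$ (2dp)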
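For grouped data the only extra step is replacing each class by its midpoint, which is why the result is an estimate. A minimal sketch follows; the single-digit frequencies (4, 5, 2, 0, 1) are inferred from the fx column and the stated total of 27.

```python
import math

# (lower, upper, frequency) for each class of call length
classes = [(0, 5, 4), (5, 10, 15), (10, 15, 5),
           (15, 20, 2), (20, 25, 0), (25, 30, 1)]

sum_f = sum_fx = sum_fx2 = 0
for lower, upper, f in classes:
    x = (lower + upper) / 2   # midpoint stands in for every value in the class
    sum_f += f
    sum_fx += f * x
    sum_fx2 += f * x * x

variance = sum_fx2 / sum_f - (sum_fx / sum_f) ** 2
std_dev = math.sqrt(variance)

print(sum_f, sum_fx, sum_fx2)                 # 27 247.5 3018.75
print(round(variance, 2), round(std_dev, 2))  # 27.78 5.27
```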


Teachings for Exercise 3E

Coding
As with averages, coding can be used to make data easier to work with.
However, there is something extra to remember.
If you have a set of data with a range of 15, and reduce every number by 2, what will happen to the range?
Nothing!
Range measures the spread of data, and if all the numbers are 2 less, the spread will not have changed.
It is exactly the same for Standard Deviation. Because it measures the spread of data, any addition/subtraction in the coding will not need to be undone.
Any division or multiplication will have to be undone as normal.
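These rules can be checked numerically. The sketch below uses the data set 150, 160, 170, 180, 190 from the worked examples in this section and Python's `statistics.pstdev` (population standard deviation); the variable names are illustrative only.

```python
import math
import statistics

x = [150, 160, 170, 180, 190]
sd_x = statistics.pstdev(x)              # population standard deviation

shifted = [v - 100 for v in x]           # code y = x - 100
scaled = [v / 10 for v in x]             # code y = x / 10
both = [(v - 100) / 10 for v in x]       # code y = (x - 100) / 10

# Subtraction alone: nothing to undo
assert math.isclose(statistics.pstdev(shifted), sd_x)
# Division by 10: multiply the coded result by 10
assert math.isclose(statistics.pstdev(scaled) * 10, sd_x)
# Both: only the division needs undoing
assert math.isclose(statistics.pstdev(both) * 10, sd_x)

print(round(sd_x, 2))  # 14.14
```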

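Use the following code to calculate the Standard Deviation of this set of data:
150, 160, 170, 180, 190

Code: $y = \frac{x}{10}$
Coded data: 15, 16, 17, 18, 19

| $y$ | $y^2$ |
|---|---|
| 15 | 225 |
| 16 | 256 |
| 17 | 289 |
| 18 | 324 |
| 19 | 361 |
| Total: 85 | 1455 |

$\sum y = 85$, $\sum y^2 = 1455$, $n = 5$

$$\sigma_y^2 = \frac{\sum y^2}{n} - \left(\frac{\sum y}{n}\right)^2 = \frac{1455}{5} - \left(\frac{85}{5}\right)^2 = 2$$
$$\sigma_y = \sqrt{2} = 1.41 \text{ (2dp)}$$

But we had divided by 10, so we must undo this:
$$\sigma_x = \sqrt{2} \times 10 = 14.14 \text{ (2dp)}$$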

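Use the following code to calculate the Standard Deviation of this set of data:
150, 160, 170, 180, 190

Code: $y = x - 100$
Coded data: 50, 60, 70, 80, 90

| $y$ | $y^2$ |
|---|---|
| 50 | 2500 |
| 60 | 3600 |
| 70 | 4900 |
| 80 | 6400 |
| 90 | 8100 |
| Total: 350 | 25500 |

$\sum y = 350$, $\sum y^2 = 25500$, $n = 5$

$$\sigma_y^2 = \frac{\sum y^2}{n} - \left(\frac{\sum y}{n}\right)^2 = \frac{25500}{5} - \left(\frac{350}{5}\right)^2 = 200$$
$$\sigma_y = \sqrt{200} = 14.14 \text{ (2dp)}$$

We do not need to undo anything, as we only subtracted! So $\sigma_x = 14.14$ (2dp).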

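Use the following code to calculate the Standard Deviation of this set of data:
150, 160, 170, 180, 190

Code: $y = \frac{x - 100}{10}$
Coded data: 5, 6, 7, 8, 9

| $y$ | $y^2$ |
|---|---|
| 5 | 25 |
| 6 | 36 |
| 7 | 49 |
| 8 | 64 |
| 9 | 81 |
| Total: 35 | 255 |

$\sum y = 35$, $\sum y^2 = 255$, $n = 5$

$$\sigma_y^2 = \frac{\sum y^2}{n} - \left(\frac{\sum y}{n}\right)^2 = \frac{255}{5} - \left(\frac{35}{5}\right)^2 = 2$$
$$\sigma_y = \sqrt{2} = 1.41 \text{ (2dp)}$$

We only need to undo the divide by 10:
$$\sigma_x = \sqrt{2} \times 10 = 14.14 \text{ (2dp)}$$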

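Use the code below to calculate the Standard Deviation of this table of data.

Code: $y = \frac{x - 7.5}{5}$

| Call length | Calls, f | Midpoint, x | y | fy | fy² |
|---|---|---|---|---|---|
| 0-5 | 4 | 2.5 | -1 | -4 | 4 |
| 5-10 | 12 | 7.5 | 0 | 0 | 0 |
| 10-15 | 6 | 12.5 | 1 | 6 | 6 |
| 15-20 | 3 | 17.5 | 2 | 6 | 12 |
| 20-30 | 1 | 25 | 3.5 | 3.5 | 12.25 |
| Total | 26 | | | 11.5 | 34.25 |

$\sum fy = 11.5$, $\sum fy^2 = 34.25$, $\sum f = 26$

$$\sigma_y^2 = \frac{\sum fy^2}{\sum f} - \left(\frac{\sum fy}{\sum f}\right)^2 = \frac{34.25}{26} - \left(\frac{11.5}{26}\right)^2 = 1.1217...$$
$$\sigma_y = \sqrt{1.1217...} = 1.06 \text{ (2dp)}$$

Undo the divide by 5 only:
$$\sigma_x = 1.0591... \times 5 = 5.30 \text{ (2dp)}$$
Use the unrounded values throughout. Rounding the variance to 1.12 before taking the square root gives 5.29 instead.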
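As a check on the coded calculation, the sketch below computes the standard deviation both ways: from the coded values (then undoing the divide by 5) and directly from the midpoints. The helper name `std_dev` is made up, and the frequencies 6 and 3 for the 10-15 and 15-20 classes are reconstructed from the stated totals (sum of f = 26, sum of fy = 11.5), so treat them as an assumption.

```python
import math

# (midpoint x, frequency f); the frequencies 6 and 3 are reconstructed from the totals
table = [(2.5, 4), (7.5, 12), (12.5, 6), (17.5, 3), (25, 1)]

def std_dev(pairs):
    """Standard deviation from (value, frequency) pairs."""
    sum_f = sum(f for _, f in pairs)
    sum_fv = sum(f * v for v, f in pairs)
    sum_fv2 = sum(f * v * v for v, f in pairs)
    return math.sqrt(sum_fv2 / sum_f - (sum_fv / sum_f) ** 2)

coded = [((x - 7.5) / 5, f) for x, f in table]   # y = (x - 7.5) / 5
sd_y = std_dev(coded)
sd_x = sd_y * 5                                  # undo the divide by 5 only

print(round(sd_y, 2), round(sd_x, 2))  # 1.06 5.3
assert math.isclose(sd_x, std_dev(table))        # same as working with x directly
```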

Summary
We have now finished chapter 3.
We have seen how to calculate range and Inter-quartile range, including using interpolation from a table.
We have learnt how to calculate Percentiles and the Inter-percentile range.
We have calculated Variance and Standard Deviation from a table.
We have also used coding to simplify calculations.