
Bio Factsheet

www.curriculumpress.co.uk

April 2003

Number 122

Answering Exam Questions on Statistics


Examination questions may require the calculation and interpretation of statistical measures and tests. This Factsheet discusses
strategies for approaching such questions and gives guidance on common mistakes to avoid. Factsheets 79 and 85 cover the chi-squared
test and t-test specifically. A later Factsheet will cover diagrams and their interpretation.
What can they ask you?
Exactly what is examinable depends on the specification you are studying, but there are three main categories:
- basic statistical calculations and their interpretation
- chi-squared test
- t-test

Basic statistical calculations and their interpretation
All specifications require you to calculate the mean; some also require the standard deviation. You need to remember the formula for the mean, but will be given it for the standard deviation.

Calculating the mean
a) For a list of numbers, just add them all up and divide by how many there are.
b) For a table of grouped data, follow this procedure:
Step 1. Find the midpoint of each class by adding its endpoints and dividing by two. Add it to the table. Call this column "x".
Step 2. Add another column, and put in it the values of x × number of individuals (f).
Step 3. mean = (total of "x × f" column) ÷ (total of "f" column)

eg: Find the mean of the following data

Length (nearest cm)   Number of individuals (f)   x                     x × f
9 - 11                6                           (9 + 11) ÷ 2 = 10     60
12 - 14               11                          (12 + 14) ÷ 2 = 13    143
15 - 17               10                          (15 + 17) ÷ 2 = 16    160
18 - 20               8                           (18 + 20) ÷ 2 = 19    152
21 - 23               4                           (21 + 23) ÷ 2 = 22    88

mean = (60 + 143 + 160 + 152 + 88) ÷ (6 + 11 + 10 + 8 + 4) = 603 ÷ 39 = 15.46

Calculating the standard deviation
The formula for this that you will be given is:

$$\text{standard deviation} = \sqrt{\frac{\sum x^2}{n} - \text{mean}^2}$$

Σ means "sum of", so Σx² means "square each value then add them up".

a) For a list of numbers:
i) Square each number and add up the squares (this gives Σx²)
ii) Divide your answer to i) by how many numbers there are (this gives Σx²/n)
iii) Find the mean and square it
iv) Take the answer to iii) away from the answer to ii) (this gives everything inside the square root)
v) Square root the answer to iv) (this gives the standard deviation)

eg: Find the standard deviation of 2, 5, 6, 7, 8
i) Σx² = 2² + 5² + 6² + 7² + 8² = 178
ii) Σx²/n = 178 ÷ 5 = 35.6
iii) Mean = (2 + 5 + 6 + 7 + 8) ÷ 5 = 5.6, so mean² = 5.6² = 31.36
iv) Σx²/n - mean² = 35.6 - 31.36 = 4.24
v) Standard deviation = √4.24 = 2.0591
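The five steps for a list of numbers can also be sketched in Python as a way of checking hand calculations. This is only an illustration: the function names `mean` and `standard_deviation` are made up for this sketch, and the code uses the same "sum of squares over n, minus mean squared" formula as the worked example with 2, 5, 6, 7, 8.

```python
import math


def mean(values):
    """Add the values up and divide by how many there are."""
    return sum(values) / len(values)


def standard_deviation(values):
    """Standard deviation from sqrt(sum(x^2)/n - mean^2), following steps i) to v)."""
    n = len(values)
    sum_of_squares = sum(x * x for x in values)   # step i: square each value, add up
    mean_of_squares = sum_of_squares / n          # step ii: divide by how many values
    square_of_mean = mean(values) ** 2            # step iii: find the mean and square it
    inside_root = mean_of_squares - square_of_mean  # step iv: everything inside the root
    return math.sqrt(inside_root)                 # step v: take the square root


data = [2, 5, 6, 7, 8]
print(round(mean(data), 2))                # 5.6
print(round(standard_deviation(data), 4))  # 2.0591
```

Python's built-in `statistics.pstdev` should give the same answer, because it also divides by n; `statistics.stdev` divides by n - 1 instead and so gives a slightly larger value.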

b) For a table of grouped data:
i) Complete the columns "x" and "x × f" as for finding the mean
ii) Add another column, which is x² × f
iii) Find the total of the "x² × f" column (this gives Σx²)
iv) Divide your answer to iii) by the total of the "f" column (this gives Σx²/n)
v) Find the mean, as described above for grouped data, and square it
vi) Take the answer to v) away from the answer to iv) (this gives everything inside the square root)
vii) Square root the answer to vi) (this gives the standard deviation)

eg: Find the standard deviation of the following data

Length (nearest cm)   Number of individuals (f)   x     x × f   x² × f
9 - 11                6                           10    60      600
12 - 14               11                          13    143     1859
15 - 17               10                          16    160     2560
18 - 20               8                           19    152     2888
21 - 23               4                           22    88      1936

iii) Σx² = 600 + 1859 + 2560 + 2888 + 1936 = 9843
iv) Total of f column = 6 + 11 + 10 + 8 + 4 = 39, so Σx²/n = 9843 ÷ 39 = 252.3846
v) Mean = 603 ÷ 39 = 15.46, so mean² = 15.46² = 239.0116
vi) Σx²/n - mean² = 252.3846 - 239.0116 = 13.3730
vii) Standard deviation = √13.3730 = 3.657

Note that this working squares the mean after rounding it to 15.46. If you keep the unrounded mean (603 ÷ 39 = 15.4615...) in your calculator, the standard deviation comes out as 3.650, which shows how much difference early rounding can make.

Calculator Tip: To do the sum for a grouped mean on your calculator, you need to put brackets around all of the top and all of the bottom, like this:
(60 + 143 + 160 + 152 + 88) ÷ (6 + 11 + 10 + 8 + 4)

Calculator Tip: Most scientific or graphical calculators will allow you to calculate mean and standard deviation automatically. This can save a lot of time! However, not all calculators do it in the same way, so you need to consult your calculator instruction book and practise well in advance of the exam.

One of the commonest mistakes candidates make when using the calculator is not clearing all the data before starting a new calculation. You can usually do this on a scientific calculator by going into the statistics mode and then pressing SHIFT or 2ND and "AC". To check it has worked, press the button that you would normally use to get the mean - if it gives you a number, you haven't cleared the data properly!

Interpreting the mean and standard deviation
The mean, of course, is the average - but that does not mean half the values are below and half above it, or that it is a common value. For example, the mean of the values 1, 1, 2, 3, 100 is 21.4; this is nowhere near any of the actual values, and four out of the five values are below it!

The mean also does not distinguish between these two data sets:
A: 48, 49, 50, 51, 52
B: 35, 40, 50, 62, 63
Both sets of data have mean 50, but they are not very similar.

This is where the standard deviation comes in. This measures how spread out the data are - the bigger the standard deviation, the greater the spread. For example, for data set A above, the standard deviation is 1.414, and for set B, it is 11.296.

So, for example, if you know the following:
Data set 1: mean = 45.2, standard deviation = 2.13
Data set 2: mean = 43.7, standard deviation = 10.03

We know that data set 2 is more spread out than data set 1. Let's consider which would be more likely to have a value in it above 50, say.
For data set 1, 50 is more than 2 standard deviations away from the mean (45.2 + 2 × 2.13 = 49.46).
For data set 2, 50 is less than 1 standard deviation away from the mean (43.7 + 10.03 = 53.73).
This tells us that 50 is a less "extreme" or "uncommon" value for data set 2 than for data set 1. So data set 2 is more likely to have values above 50.

Statistical tests
In the exam, you will always be told which statistical test to use if you are being required to do calculations. You will be given any tables you need.
There are various types of questions:
- understanding statistical terms like degrees of freedom, significance, etc
- doing the calculations according to the test formula
- finding degrees of freedom
- using statistical tables
- interpreting results and drawing conclusions
Some of these are the same for both t-test and chi-squared; others are specific to the test.

Understanding statistical terms
Hypotheses: the purpose of a statistical test is to decide between the null hypothesis and the alternative hypothesis. The exact form of these hypotheses depends on the test. When you are carrying out the test, you accept the null hypothesis, unless you have convincing evidence otherwise (in a court of law, the "null hypothesis" is that the person is innocent - he is only decided to be guilty if there is enough evidence).

Test statistic: this is the value calculated from your data. The formula for it depends on the test you are doing.

Critical value: this is the value you compare the test statistic to, to decide whether you are going to accept or reject the null hypothesis. For both t-test and chi-squared test, you reject the null hypothesis if your test statistic is greater than the critical value. Critical values come from statistical tables.

Significance level: It is possible to reject the null hypothesis even if it is true, because "unusual" results can occur by chance (eg it is possible - although unlikely - to get 100 heads in succession when tossing a coin). The significance level is the chance of rejecting the null hypothesis when it is true. Significance levels may be written as percentages (10%, 5%, 1%) or as decimals (0.1, 0.05, 0.01).
The normal significance level in science is 5%. Use this unless you are told otherwise.

Degrees of freedom: you do not need to know the exact meaning, although you do need to know how to calculate them (see below). The idea is that the amount of data you have affects the critical value - this is because you are much more likely to get unusual results by chance if you only have a few observations, than if you have a lot of observations.

Interpreting results and drawing conclusions
You must remember that if the value you calculate (the test statistic) is greater than the value from the tables (the critical value), then you reject the null hypothesis. Otherwise you accept it.

You then need to relate this back to the original hypotheses; this will be discussed in more detail for each test.

Choose your words carefully - a statistical test does not "prove" a hypothesis is true - there is always a chance that a wrong decision could be made. It is normal to say "the result is significant at the 5% level" or "the alternative hypothesis was accepted at the 5% level".

The remainder of the section is divided between the chi-squared test and the t-test.

Chi-squared test
There are two main types of chi-squared test you may have to do:
a) Testing to see if there is a difference
b) Testing to see if the theoretical ratios predicted by genetics apply
The hypotheses for the tests are
a) H0: there is no difference between the different conditions
   H1: there is a difference between the different conditions
b) H0: the observations are in accordance with the predictions of genetics
   H1: the observations are not in accordance with the predictions of genetics

Calculations for the test formula
In chi-squared, you will need to calculate expected frequencies, and then the value of chi-squared, using the formula:

$$\chi^2 = \sum \frac{(O - E)^2}{E}$$

O is observed values - the data from the question
E is expected values - the ones you calculate
Σ means "sum of"

a) To calculate expected values when you are testing for a difference, you just add up all the values and divide by the number of them.

b) To calculate expected values for genetics, you have to use the genetic ratio. The procedure is:
i) Add up all the values from the data you are given
ii) Add up all the numbers in the genetic ratio (eg for 9:3:3:1, do 9 + 3 + 3 + 1 = 16). This tells you the number of parts you will be dividing your total from i) into.
iii) Find out how much one part is, by dividing your total from i) by your total from ii)
iv) Find out the expected frequencies, by multiplying one part by the numbers in the ratio (eg by 9, 3, 3 and 1)

Once you have calculated the expected frequencies, you substitute into the formula above to find the chi-squared value.

Finding degrees of freedom
You need to learn this formula:
For chi-squared: degrees of freedom = number of categories - 1
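As a rough illustration of the chi-squared steps, the sketch below works out expected frequencies from a 9:3:3:1 ratio, substitutes into the sum of (O - E)²/E, and compares the result with a critical value. The observed counts are invented for the example, the function names are hypothetical, and the critical value 7.81 is the entry for 3 degrees of freedom at the 5% level in this Factsheet's chi-squared table.

```python
def expected_from_ratio(observed, ratio):
    """Expected frequencies for a genetics test, following steps i) to iv)."""
    total = sum(observed)                 # i) add up all the observed values
    parts = sum(ratio)                    # ii) add up the numbers in the ratio
    one_part = total / parts              # iii) size of one part
    return [one_part * r for r in ratio]  # iv) one part times each ratio number


def expected_for_difference(observed):
    """Expected frequencies when testing for a difference: the same value for every category."""
    return [sum(observed) / len(observed)] * len(observed)


def chi_squared(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


observed = [95, 25, 32, 8]        # hypothetical counts, not from the Factsheet
ratio = [9, 3, 3, 1]

expected = expected_from_ratio(observed, ratio)
statistic = chi_squared(observed, expected)
degrees_of_freedom = len(observed) - 1   # number of categories - 1
critical_value = 7.81                    # 3 degrees of freedom, 5% level
reject_null = statistic > critical_value

print(expected)              # [90.0, 30.0, 30.0, 10.0]
print(round(statistic, 3))   # 1.644
print(reject_null)           # False
```

Here the test statistic is smaller than the critical value, so the null hypothesis is accepted: these made-up results are in accordance with the 9:3:3:1 ratio.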


Using statistical tables
All you have to do is to read down to find the number of degrees of freedom you have, and across to find the significance level (usually 5% = 0.05).

chi-squared table

        Significance level
df      0.10     0.05     0.025    0.01     0.005
1       2.71     3.84     5.02     6.63     7.88
2       4.61     5.99     7.38     9.21     10.60
3       6.25     7.81     9.35     11.34    12.84
4       7.78     9.49     11.14    13.28    14.86

For a chi-squared test with 1 degree of freedom at a significance level of 5%, the critical (tables) value is 3.84.

t-table

        Significance level
df      0.1      0.05     0.01
7       1.895    2.365    3.499
8       1.860    2.306    3.355
9       1.833    2.262    3.250
10      1.812    2.228    3.169
11      1.796    2.201    3.106

For a t-test with 10 degrees of freedom at a significance level of 5%, the critical (tables) value is 2.228.
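Reading a table "down for degrees of freedom, across for significance level" is the same as a two-step lookup, which the short sketch below imitates. The dictionary layout and the function names are assumptions for this illustration; the numbers are copied from the tables above (only the 0.10 and 0.05 columns of the chi-squared table are included, to keep the example short), and the test statistic 2.5 is made up.

```python
# Critical values keyed first by degrees of freedom, then by significance level.
CHI_SQUARED = {
    1: {0.10: 2.71, 0.05: 3.84},
    2: {0.10: 4.61, 0.05: 5.99},
    3: {0.10: 6.25, 0.05: 7.81},
    4: {0.10: 7.78, 0.05: 9.49},
}

T_TABLE = {
    7: {0.1: 1.895, 0.05: 2.365, 0.01: 3.499},
    8: {0.1: 1.860, 0.05: 2.306, 0.01: 3.355},
    9: {0.1: 1.833, 0.05: 2.262, 0.01: 3.250},
    10: {0.1: 1.812, 0.05: 2.228, 0.01: 3.169},
    11: {0.1: 1.796, 0.05: 2.201, 0.01: 3.106},
}


def critical_value(table, df, significance=0.05):
    """Read down to the degrees of freedom, then across to the significance level."""
    return table[df][significance]


def reject_null(test_statistic, table, df, significance=0.05):
    """Reject the null hypothesis if the test statistic exceeds the critical value."""
    return test_statistic > critical_value(table, df, significance)


print(critical_value(CHI_SQUARED, 1))   # 3.84
print(critical_value(T_TABLE, 10))      # 2.228
print(reject_null(2.5, T_TABLE, 10))    # True, because 2.5 > 2.228
```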

Common mistakes
These are some of the commonest errors candidates make:
- Rounding errors, due to rounding too early. If in doubt, use all the figures. It is useful to keep figures in your calculator, to avoid having to keep writing down and re-entering data. Learn how to use your calculator memory.
- Calculator errors - putting the correct figures into the calculator wrongly. See the calculator tips in this Factsheet and practise using your calculator well before the exam.
- Failure to show working - hence throwing away all the marks if there is even one tiny error in calculation.
- Failure to recall the formulae for degrees of freedom - these have to be learnt. If you get them wrong, they will invalidate your tables value and your conclusion.
- Not drawing conclusions correctly - you must learn that if your calculated value is larger than the tables value, you reject the null hypothesis.
- Getting the hypotheses the wrong way round - if your calculated result is greater than the tables value, then:
  for the t-test, there is a difference between the means
  for testing for a difference in chi-squared, there is a difference
  for genetics chi-squared, the results are not as predicted by genetics

t-test
There are two types of t-test, paired and unpaired. The exam will always make it clear which you should do. You will always be given the relevant formulae.

The hypotheses for both tests are
H0: mean 1 = mean 2
H1: mean 1 ≠ mean 2
(This is a 2-tailed test - you may also come across 1-tailed tests, but in the exam you will never have to choose between the two)

Calculations for the test formula
The calculations for either type of t-test are similar to those for finding means and standard deviations. You also need to be able to substitute into a formula. Provided you can do calculations like the ones for the mean and standard deviation earlier in this Factsheet, you will not have a problem with these. Remember, you will be given any formulae you require.

The paired t-test first requires you to find the differences between each pair of values. You then work with these differences only.

paired t-test:

$$t = \frac{\bar{x}\sqrt{n - 1}}{s}$$

$\bar{x}$ is the mean of the differences
$n$ is the number of pairs
$s$ is the standard deviation of the differences

In the unpaired t-test, you will need to use these formulae:

$$s = \sqrt{\frac{\sum x_1^2 - n_1\bar{x}_1^2 + \sum x_2^2 - n_2\bar{x}_2^2}{n_1 + n_2 - 2}}$$

$$t = \frac{\bar{x}_1 - \bar{x}_2}{s\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$$

$\bar{x}_1$ and $\bar{x}_2$ are the means of the two samples
$n_1$ and $n_2$ are the sizes of the two samples
Σ means "sum of"

Exam questions will get you to do these calculations bit by bit and "follow through" marks are likely to be awarded - so if you calculate s wrong, for example, but use your value correctly to calculate the value of t, then you will get the rest of the marks.

Calculator Tips:
To carry out any calculation that is set out as a fraction, you must put brackets round the top and round the bottom.
It is probably easier to work out the number inside the square root first, then take the square root, rather than trying to do it all in one go.
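The two t-test calculations can be sketched in Python as a check on hand working. This is an illustration under stated assumptions, not an exam method: the function names and all four samples are invented, and for the paired test s is assumed to be worked out with the "divide by n" standard deviation formula used for the basic calculations in this Factsheet. Each function also returns the degrees of freedom (number of pairs - 1, and n1 + n2 - 2).

```python
import math


def mean(values):
    return sum(values) / len(values)


def paired_t(first, second):
    """Paired t: work only with the differences between each pair of values."""
    diffs = [b - a for a, b in zip(first, second)]
    n = len(diffs)                                   # number of pairs
    d_bar = mean(diffs)                              # mean of the differences
    # standard deviation of the differences (divide-by-n formula, an assumption)
    s = math.sqrt(sum(d * d for d in diffs) / n - d_bar ** 2)
    t = d_bar * math.sqrt(n - 1) / s
    return t, n - 1                                  # t and degrees of freedom


def unpaired_t(sample1, sample2):
    """Unpaired t using the pooled s built from both samples."""
    n1, n2 = len(sample1), len(sample2)
    m1, m2 = mean(sample1), mean(sample2)
    top = (sum(x * x for x in sample1) - n1 * m1 ** 2
           + sum(x * x for x in sample2) - n2 * m2 ** 2)
    s = math.sqrt(top / (n1 + n2 - 2))               # number inside the root first
    t = (m1 - m2) / (s * math.sqrt(1 / n1 + 1 / n2))
    return t, n1 + n2 - 2                            # t and degrees of freedom


# Hypothetical data, chosen only to give easy numbers.
t_paired, df_paired = paired_t([10, 12, 9, 11, 13], [12, 15, 10, 14, 14])
t_unpaired, df_unpaired = unpaired_t([5, 7, 8, 9, 11], [3, 4, 6, 7, 5])

print(round(t_paired, 3), df_paired)      # 4.472 4
print(round(t_unpaired, 3), df_unpaired)  # 2.449 8
```

For the made-up unpaired samples, t = 2.449 is greater than the tables value of 2.306 for 8 degrees of freedom at the 5% level, so the null hypothesis would be rejected. For a 2-tailed test it is the size of t that matters, so a negative t would be compared with the tables value after dropping its minus sign.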

Finding degrees of freedom
You need to learn these formulae:
For paired t-test: degrees of freedom = number of pairs - 1
For unpaired t-test: degrees of freedom = number in 1st sample + number in 2nd sample - 2

Acknowledgments: This Factsheet was researched and written by Cath Brown.
Curriculum Press, Unit 305B The Big Peg, 120 Vyse Street, Birmingham B18 6NF
Bio Factsheets may be copied free of charge by teaching staff or students, provided that their school is a registered subscriber.
No part of these Factsheets may be reproduced, stored in a retrieval system, or transmitted, in any other form or by any other means, without the prior permission of the publisher.
ISSN 1351-5136
