
Computing in Archaeology
Session 10. Statistical tests of significance
© Richard Haddlesey www.medievalarchitecture.net
Aims
• To understand what we mean by statistical significance and archaeological significance
• To test significance (through the null hypothesis and Chi-squared testing)
• Key text: Fletcher & Lock (2nd Ed.) 2005. Digging Numbers. Oxford. 63-5, 128-38
Is it significant?
What do we mean by significant?
Choosing a test
normal distribution: parametric test
positively skewed distribution: non-parametric test
Hypothesis testing
• Before we can test significance we must formulate two hypotheses
• So what do we mean by hypothesis testing?
• Theories abound in archaeology, although many of them cannot be tested in any way, let alone in the formal way described throughout this lecture
Hypothesis testing
• A test must be repeatable, not just by you, but by anyone who has access to the data set
• A hypothesis, therefore, must represent a quantifiable relationship, and it is this relationship which is tested formally
• We could say that all hypotheses are theories, whereas not all theories are hypotheses
Example hypothesis
• In order to illustrate the logic of a hypothesis test, consider testing the hypothesis that at least 40% of all bronze spearheads come from burials (Fletcher & Lock, 63)
Step 1 – formulate two hypotheses
• null hypothesis (H0)
• alternative hypothesis (H1)
Step One: H0 & H1
• The two hypotheses should be formulated so that one and only one of them must be true
• In this case we would have:
  • H0: proportion of bronze spearheads from burials is ≥40%
  • H1: proportion of bronze spearheads from burials is <40%
Step 2 – take measurements


Step Two
• Take a suitable measurement or observation from which a test statistic and its associated probability (step 3) can be calculated
• Here we have a sample of 20 bronze spearheads, 7 of which have been found in burials (this is the observed result)
• So far so good!
Step 3 – calculate test statistic


Step 3: the difficult bit
• Here we calculate a test statistic which can then be tested for significance in step 4
• The test statistic allows us to calculate the probability of obtaining the observed result (or one more extreme) if H0 is true; this probability is often called the p-value
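Step 3: continued
• If H0 is true and at least 40% of all bronze spearheads do come from burials, what is the probability of a sample of 20 containing 7 from burials?
• P(burial) = 0.40 and so P(not burial) = 0.60
• P(not burial for 1st and 2nd) = (0.60)(0.60) = (0.60)^2
• hence P(not burial for 13 (20 - 7)) = (0.60)^13 = 0.0013
• This simplified calculation gives a p-value (probability of the observed result) of 0.0013 or 0.13%
• Note that (0.60)^13 covers only the 13 spearheads not from burials; an exact binomial calculation also includes the factor (0.40)^7 for the 7 from burials and the number of ways of choosing them, and gives a larger probability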
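As a cross-check on the arithmetic in this step, the exact binomial probabilities can be computed directly. The sketch below is illustrative only: the helper name `binom_pmf` and the variable names are assumptions made for this example, not part of the lecture or of Fletcher & Lock, and it uses only Python's standard library. It prints the simplified value (0.60)^13, the probability of exactly 7 burial finds in a sample of 20 when P(burial) = 0.40, and the probability of 7 or fewer, which is the one-sided tail relevant to H1 (proportion < 40%).

```python
from math import comb


def binom_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n independent trials (binomial)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Values taken from the worked example: 20 spearheads, 7 from burials,
# and P(burial) = 0.40 at the boundary of H0.
n, observed, p_burial = 20, 7, 0.40

# The simplified figure: 13 non-burial finds only.
simplified = (1 - p_burial) ** (n - observed)

# Exact binomial probability of exactly 7 burial finds.
exactly_7 = binom_pmf(observed, n, p_burial)

# One-sided tail: 7 or fewer burial finds (evidence in the direction of H1).
at_most_7 = sum(binom_pmf(k, n, p_burial) for k in range(observed + 1))

print(f"(0.60)^13           = {simplified:.4f}")
print(f"P(exactly 7 of 20)  = {exactly_7:.4f}")
print(f"P(7 or fewer of 20) = {at_most_7:.4f}")
```

Run as written, the exact figures come out much larger than 0.0013 (roughly 0.17 for exactly 7 and roughly 0.42 for 7 or fewer), because they also account for the 7 burial finds and the number of ways they can be arranged within the sample.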
Step 4 – calculate significance


Step 4: testing the hypotheses
• Remember that it is the null hypothesis that is being tested. The significance of the test statistic will determine whether the null hypothesis is accepted or rejected
• There are set conventions for significance testing and these will guide our discussion
Step 4: continued
• Common significance levels used in the social sciences are:
  • p<0.10 reject at the 10% level
  • p<0.05 reject at the 5% level
  • p<0.01 reject at the 1% level
  • p<0.001 reject at the 0.1% level
• The 5% level is often used within archaeology
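The convention above can be written as a small lookup. This is a minimal sketch, assuming the four levels listed on this slide; the names `LEVELS` and `strictest_level_rejected` are made up for illustration. It returns the strictest conventional level at which H0 would be rejected for a given p-value, or `None` if H0 is not rejected even at the 10% level.

```python
from typing import Optional

# Conventional significance levels from the slide, strictest first.
LEVELS = (0.001, 0.01, 0.05, 0.10)


def strictest_level_rejected(p_value: float) -> Optional[float]:
    """Return the strictest level at which H0 is rejected, or None."""
    for alpha in LEVELS:
        if p_value < alpha:
            return alpha
    return None


for p in (0.0005, 0.0013, 0.03, 0.2):
    level = strictest_level_rejected(p)
    if level is None:
        print(f"p = {p}: do not reject H0 at any of the listed levels")
    else:
        print(f"p = {p}: reject H0 at the {level:.1%} level")
```

Checking from the strictest level downwards means a single p-value is reported against the most demanding level it satisfies, which is how results are usually quoted.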
Step 4: continued
• If p<0.05 (5%), reject H0 at the 5% level and conclude that there is significant evidence to show that the percentage of bronze spearheads from burials is less than 40% (in other words, if H0 is rejected, H1 must be accepted)
What does this mean?
• If H0 is rejected at the 5% level, we can conclude with 95% confidence that the percentage of bronze spearheads from burials is less than 40%
• If, however, the p-value is greater than 0.05, the conclusion is not to reject H0 at the 5% level, and H1 is not accepted
Confidence interval    Probability
90%                    p=0.10
95%                    p=0.05
99%                    p=0.01
p<0.05 reject at the 5% level
p<0.10 reject at the 10% level
p<0.01 reject at the 1% level
Chi-squared test
• The Chi-squared test was developed by Karl Pearson in 1900 to test whether a contingency table provides significant evidence of an association between two variables
• It can be used for data at both the nominal and ordinal levels of measurement, though it is better suited to nominal data
Chi-squared test
• sample:
  • 40 spearheads
• variables:
  • material – iron/bronze
  • loop – yes/no
• First we need to display the data in a contingency table
Chi-squared explained
• It’s a method of comparing the observed frequencies (the data) with those expected under the null hypothesis of no association between two variables
bivariate frequency table
          No loop   Loop   Total
Iron         20       0      20
Bronze        9      11      20
Total        29      11      40

• Is there any association between the two variables?
• How strong is the association between the two variables?
expected frequency (E) = (row total)(column total) / (overall total)
e.g. Iron with loop: E = (20)(11) / (40) = 5.5

Observed frequencies, with expected frequencies in brackets:
          No loop      Loop        Total
Iron      20 (14.5)     0 (5.5)     20
Bronze     9 (14.5)    11 (5.5)     20
Total     29           11           40
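The expected frequencies can be reproduced with a few lines of code. This is a minimal sketch using only the standard library; the variable names are assumptions for illustration, and the observed counts are those from the spearhead table (rows: Iron, Bronze; columns: No loop, Loop).

```python
# Observed counts from the contingency table.
# Rows: Iron, Bronze. Columns: No loop, Loop.
observed = [
    [20, 0],
    [9, 11],
]

row_totals = [sum(row) for row in observed]        # 20, 20
col_totals = [sum(col) for col in zip(*observed)]  # 29, 11
overall_total = sum(row_totals)                    # 40

# E = (row total)(column total) / (overall total), for every cell.
expected = [
    [r * c / overall_total for c in col_totals]
    for r in row_totals
]

for label, row in zip(("Iron", "Bronze"), expected):
    print(label, row)
```

Each expected value depends only on the marginal totals, which is why both rows share the same expected frequencies here: the two row totals are equal.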

degrees of freedom (d.f.) = (r-1)(c-1)
= (2-1)(2-1)
= (1)(1)
= 1
where r is the number of rows and c is the number of columns in the table
Critical values of the χ² distribution
d.f. = 1
χ² = 15.18
The critical value of χ² at the 5% level for 1 d.f. is 3.84; since 15.18 is greater than this, H0 (no association) is rejected at the 5% level
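The χ² statistic is the sum of (O - E)² / E over all cells of the table, where O is the observed and E the expected frequency. The sketch below recomputes it for the spearhead data, as an illustration rather than a reference implementation. It assumes the standard 5% critical value of 3.841 for 1 d.f., and it uses the identity that for 1 d.f. the upper-tail probability equals erfc(√(χ²/2)), so only the standard library is needed. The variable names are hypothetical.

```python
from math import erfc, sqrt

# Observed and expected frequencies from the spearhead table.
# Rows: Iron, Bronze. Columns: No loop, Loop.
observed = [[20, 0], [9, 11]]
expected = [[14.5, 5.5], [14.5, 5.5]]

# Pearson's chi-squared: sum of (O - E)^2 / E over every cell.
chi2 = sum(
    (o - e) ** 2 / e
    for o_row, e_row in zip(observed, expected)
    for o, e in zip(o_row, e_row)
)

# Assumed constant: 5% critical value of chi-squared for 1 d.f.
CRITICAL_5PCT_DF1 = 3.841

# For 1 d.f. only, the upper-tail probability is erfc(sqrt(chi2 / 2)).
p_value = erfc(sqrt(chi2 / 2))

print(f"chi-squared = {chi2:.2f}")
print(f"reject H0 at 5% level: {chi2 > CRITICAL_5PCT_DF1}")
print(f"p-value (1 d.f.) = {p_value:.5f}")
```

Computed without intermediate rounding the statistic is about 15.17; the figure of 15.18 arises when each cell's contribution is rounded to two decimal places before summing.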
Cramer’s V statistic
• Cramer’s V statistic can be calculated to measure the strength of association
• This gives us a value V between 0 and 1, with values close to 1 indicating a strong relationship
Cramer’s V statistic
V = √(χ² / (n m))
Where:
n = total of all frequencies (40)
m = the smaller of (c-1) and (r-1)
Cramer’s V statistic
V = √(15.18 / ((40)(1)))
  = √0.3795
  = 0.62
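The same calculation as a short function. This is a sketch only; the name `cramers_v` and its parameters are assumptions for illustration. It takes the χ² value, the total number of observations n, and the table dimensions, and derives m as the smaller of (r-1) and (c-1).

```python
from math import sqrt


def cramers_v(chi2: float, n: int, n_rows: int, n_cols: int) -> float:
    """Cramer's V = sqrt(chi2 / (n * m)), with m = min(rows - 1, cols - 1)."""
    m = min(n_rows - 1, n_cols - 1)
    return sqrt(chi2 / (n * m))


# Spearhead example: chi-squared = 15.18, n = 40, 2 x 2 table.
v = cramers_v(15.18, 40, 2, 2)
print(f"V = {v:.2f}")
```

For a 2 x 2 table m is always 1, so V reduces to √(χ²/n).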
Summary
