Mendel’s Peas
and the
Goodness of Fit Test
We will develop the use of the χ2
distribution through an example from the
history of biology.
In Austria in the mid-1800s, an
Augustinian monk, Gregor Mendel, studied
the garden pea and seven of its traits,
such as shape and color of the peas,
position of flowers on the plant, etc.
He is credited with discovering
patterns of inheritance, the basis of
the field of genetics.
Curiously, Mendel studied seven traits,
one from each of the pea’s seven
chromosomes. The independent assortment
of genes that his theory describes occurs
only when genes are on different
chromosomes.
We will use one of Mendel’s studies, and
some of his original data, to explore
the χ2 test of significance.
[Figure: Punnett square for the cross, with gametes YR, Yr, yR, yr from each parent]

Phenotype        Genotypes                 Expected frequency
Yellow round     YYRR, YYRr, YyRR, YyRr    9/16
Yellow wrinkled  YYrr, Yyrr                3/16
Green round      yyRR, yyRr                3/16
Green wrinkled   yyrr                      1/16
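The 9:3:3:1 frequencies can be checked by enumerating the Punnett square directly. The following is a minimal sketch, assuming both parents are YyRr (so each produces the gametes YR, Yr, yR, yr with equal probability) and that yellow (Y) and round (R) are dominant; the function and variable names are illustrative only.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Assumption: each YyRr parent produces these four gametes equally often.
GAMETES = ["YR", "Yr", "yR", "yr"]

def phenotype(gamete_a, gamete_b):
    """Return the phenotype of the offspring formed by two gametes."""
    # One dominant (uppercase) allele from either parent is enough.
    color = "Yellow" if "Y" in (gamete_a[0], gamete_b[0]) else "Green"
    shape = "round" if "R" in (gamete_a[1], gamete_b[1]) else "wrinkled"
    return f"{color} {shape}"

def expected_frequencies():
    """Count phenotypes over all 16 cells of the Punnett square."""
    cells = Counter(phenotype(a, b) for a, b in product(GAMETES, repeat=2))
    total = sum(cells.values())
    return {name: Fraction(n, total) for name, n in cells.items()}

freqs = expected_frequencies()
for name, freq in freqs.items():
    print(name, freq)
```

Exact fractions are used so that the frequencies stay as 9/16, 3/16, 3/16 and 1/16 rather than rounded decimals.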
If Mendel’s understanding of genetics were
correct, and the crosses made as he
believed, the proportions of the four
phenotypes should fit the calculations
from the Punnett square.
Using the χ2 distribution, we are able to
test to see if groups of individuals are
present in the same proportions as expected.
This is rather like conducting multiple
Z-tests for proportions at once.
Phenotype        Observed count   Expected frequency
Yellow round     315              9/16
Yellow wrinkled  101              3/16
Green round      108              3/16
Green wrinkled   32               1/16
To make a χ2 test for “goodness of fit” we
start as with all other tests of
significance, with a null hypothesis.
Step 1:
H0: The F2 generation is comprised of
four phenotypes in the proportions
predicted by Mendelian genetics.
Ha: The F2 generation is not comprised of
four phenotypes in the proportions
predicted by Mendelian genetics.
Another way of saying this is that the
null hypothesis claims the population fits
our expected pattern, while the alternate
hypothesis says it does not.
Step 2:
Assumptions: Our first assumption is that
our data are counts. (We cannot use
proportions or means.) With χ2, we do not
always have a sample of a population, and
sometimes examine an entire population,
as with this example. When working from a
sample we must ensure that the sample is
representative.
In order to check assumptions for this
goodness of fit test we must calculate
the expected counts for each category.
Then we must meet two criteria:
1. All expected counts must be one or
more.
2. No more than 20% of the expected counts
may be less than 5.
We calculate the expected counts by finding
the total number of observations and
multiplying that by each expected
frequency.

Phenotype        Observed counts  Expected frequency  Expected counts
Yellow round     315              9/16                (9/16)(556) = 312.75
Yellow wrinkled  101              3/16                (3/16)(556) = 104.25
Green round      108              3/16                (3/16)(556) = 104.25
Green wrinkled   32               1/16                (1/16)(556) = 34.75
As you can see, all expected counts are
greater than 5, so all assumptions are
met.
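The expected counts and the two criteria can also be checked in a few lines. This is only a sketch using the observed counts and expected frequencies from this example; the helper names `expected_counts` and `assumptions_met` are hypothetical, not part of any library.

```python
from fractions import Fraction

# Observed counts and expected frequencies from this example.
observed = {
    "Yellow round": 315,
    "Yellow wrinkled": 101,
    "Green round": 108,
    "Green wrinkled": 32,
}
expected_freq = {
    "Yellow round": Fraction(9, 16),
    "Yellow wrinkled": Fraction(3, 16),
    "Green round": Fraction(3, 16),
    "Green wrinkled": Fraction(1, 16),
}

def expected_counts(observed, expected_freq):
    """Multiply the total number of observations by each expected frequency."""
    total = sum(observed.values())
    return {name: float(total * freq) for name, freq in expected_freq.items()}

def assumptions_met(expected):
    """Check: every expected count >= 1, and at most 20% of them below 5."""
    values = list(expected.values())
    all_at_least_one = all(v >= 1 for v in values)
    share_below_five = sum(v < 5 for v in values) / len(values)
    return all_at_least_one and share_below_five <= 0.20

expected = expected_counts(observed, expected_freq)
print(expected)
print("Assumptions met:", assumptions_met(expected))
```

The frequencies are kept as exact fractions until the last step, for the same reason the calculator instructions ask for fractions rather than rounded decimals.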
Step 3:
The formula for the χ2 test statistic is:
$$\chi^2 = \sum \frac{(o - e)^2}{e}$$
where o = observed counts and
e = expected counts.
This calculation should be made on the
graphing calculator.
Enter the observed counts in L1. Enter the
expected frequencies in L2, as exact
numbers. (Enter numbers like 1/3,
directly, as fractions, never round to
just .3 or .33.)
In L3 multiply L2 by 556. This will give
the expected counts. The sum of L1 can be
found using 1-Var Stats.
Now in L4, enter (L1-L3)^2/L3. This will
give you the χ2 contribution for each
category.
Finally, χ2 is the sum of L4.
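For readers without a graphing calculator, the same list steps can be sketched in Python. The names `list1` through `list4` are stand-ins for the calculator lists L1 through L4 and are assumptions of this sketch; the data are the observed counts and expected frequencies from this example.

```python
from fractions import Fraction

# list1: observed counts (yellow round, yellow wrinkled, green round, green wrinkled)
list1 = [315, 101, 108, 32]

# list2: expected frequencies, entered as exact fractions, never rounded
list2 = [Fraction(9, 16), Fraction(3, 16), Fraction(3, 16), Fraction(1, 16)]

# list3: expected counts = total observations times each expected frequency
total = sum(list1)  # 556 for this data set
list3 = [freq * total for freq in list2]

# list4: chi-square contribution of each category, (o - e)^2 / e
list4 = [(o - e) ** 2 / e for o, e in zip(list1, list3)]

# The test statistic is the sum of the contributions (about 0.47 here).
chi_square = sum(list4)

for contribution in list4:
    print(round(float(contribution), 4))
print("chi-square =", round(float(chi_square), 4))
```

Printing each contribution separately makes it easy to see which category departs most from its expected count.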