
Chi-Square Analysis

Mendel’s Peas
and the
Goodness of Fit Test
We will develop the use of the χ²
distribution through an example from the
history of biology.
In Austria in the mid-1800s, an
Augustinian monk, Gregor Mendel, studied
the garden pea and seven of its traits,
such as the shape and color of the peas,
position of flowers on the plant, etc.
He is credited with discovering
patterns of inheritance, the basis of
the field of genetics.
Curiously, Mendel studied seven traits,
and the pea has seven pairs of
chromosomes. Independent assortment of
genes, as his theory describes it, occurs
only when the genes are on different
chromosomes or far apart on the same one.
We will use one of Mendel’s studies, and
some of his original data, to explore
the χ² test of significance.

Consider two different characteristics of
peas, color and shape. The peas may be
yellow or green, round or wrinkled.
If we cross a plant with yellow round peas
with a plant having green wrinkled peas,
and examine the progeny, we will discover a
uniform F1 generation.

The traits yellow and round are each
dominant, while green and wrinkled are
recessive. We use the letter Y for color,
and R for pea shape, so the alleles are Y,
y, R, and r.
This is a Punnett square to illustrate
this dihybrid cross.
[Figure: Punnett square for the cross of a green wrinkled pea with a yellow round pea]

Notice the uniformity among the offspring,
as all are YyRr.
Now we cross the F1 among themselves to
produce the F2:

gametes YR Yr yR yr

YR YYRR YYRr YyRR YyRr

Yr YYRr YYrr YyRr Yyrr

yR YyRR YyRr yyRR yyRr

yr YyRr Yyrr yyRr yyrr


Now we identify the yellow round peas.
These are the 9 squares above with at
least one Y and at least one R:
YYRR, YYRr (2), YyRR (2), YyRr (4).


Now we identify the yellow wrinkled
peas. These are the 3 squares with at
least one Y and no R:
YYrr, Yyrr (2).


Next we identify the green round peas.
These are the 3 squares with no Y and at
least one R:
yyRR, yyRr (2).


Finally, the last type of pea is green
and wrinkled. This is the single square
with neither Y nor R:
yyrr.


So now we have four phenotypes (different
physical forms) of peas originating from
the single phenotype of the F1 generation.
They are, along with their genotypes and
expected frequencies:
Phenotype         Genotypes                 Expected frequency

Yellow round      YYRR, YYRr, YyRR, YyRr    9/16

Yellow wrinkled   YYrr, Yyrr                3/16

Green round       yyRR, yyRr                3/16

Green wrinkled    yyrr                      1/16
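
The 9:3:3:1 pattern can be checked by counting the squares programmatically. The following is a minimal Python sketch, not part of the original slides. It assumes complete dominance of Y and R, as stated earlier, and the names `GAMETES` and `phenotype` are illustrative only.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Gametes produced by a YyRr parent, as in the F2 Punnett square.
GAMETES = ["YR", "Yr", "yR", "yr"]

def phenotype(gamete_a, gamete_b):
    """Classify one square: a dominant allele from either gamete shows."""
    color = "yellow" if "Y" in (gamete_a[0], gamete_b[0]) else "green"
    shape = "round" if "R" in (gamete_a[1], gamete_b[1]) else "wrinkled"
    return f"{color} {shape}"

# Tally the phenotype of each of the 16 squares.
counts = Counter(phenotype(a, b) for a, b in product(GAMETES, repeat=2))

# Convert each tally to an exact expected frequency out of 16.
expected_frequency = {name: Fraction(n, 16) for name, n in counts.items()}

for name, freq in expected_frequency.items():
    print(name, freq)
```

Using `Fraction` keeps the frequencies exact, in the same spirit as entering fractions rather than rounded decimals.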
If Mendel’s understanding of genetics were
correct, and the crosses made as he
believed, the proportions of the four
phenotypes should fit the calculations
from the Punnett square.
Using the χ² distribution, we are able to
test to see if groups of individuals are
present in the same proportions as expected.
This is rather like conducting multiple
Z-tests for proportions at once.

In this example Mendel carried out the
dihybrid cross to produce an F1
generation, and as expected, the F1 were
all of the same phenotype, yellow and
round. Further, the F1 were crossed among
themselves to produce the F2 generation.
Mendel recorded the numbers of individuals;
the following table gives the observed
numbers of each category.
Phenotype         Observed   Expected frequency

Yellow round      315        9/16

Yellow wrinkled   101        3/16

Green round       108        3/16

Green wrinkled    32         1/16
To make a χ² test for “goodness of fit” we
start as with all other tests of
significance, with a null hypothesis.
Step 1:
H0: The F2 generation is composed of four
phenotypes in the proportions predicted by
Mendelian genetics.
Ha: The F2 generation is not composed of
four phenotypes in the proportions
predicted by Mendelian genetics.
Another way of saying this is that the
null hypothesis claims the population fits
our expected pattern, while the alternate
hypothesis says it does not.
Step 2:
Assumptions: Our first assumption is that
our data are counts. (We cannot use
proportions or means.) With χ², we do not
always have a sample of a population, and
sometimes examine an entire population, as
with this example. When working from a
sample we must ensure that the sample is
representative.
In order to check assumptions for this
goodness of fit test we must calculate
the expected counts for each category.
Then we must meet two criteria:
1. All expected counts must be one or
more.
2. No more than 20% of the expected counts
may be less than 5.
We calculate the expected counts by finding
the total number of observations and
multiplying that by each expected
frequency.

Phenotype         Observed counts   Expected frequency   Expected counts

Yellow round      315               9/16                 9/16 (556) = 312.75

Yellow wrinkled   101               3/16                 3/16 (556) = 104.25

Green round       108               3/16                 3/16 (556) = 104.25

Green wrinkled    32                1/16                 1/16 (556) = 34.75
As you can see, all expected counts are
greater than 5, so all assumptions are
met.
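
The expected counts and the two criteria can also be checked in a few lines of Python. This is a sketch under stated assumptions, not part of the original slides: the observed counts are taken from the table of Mendel's observed numbers, and the variable names are illustrative.

```python
from fractions import Fraction

# Observed F2 counts, from the table of observed numbers.
observed = {
    "yellow round": 315,
    "yellow wrinkled": 101,
    "green round": 108,
    "green wrinkled": 32,
}

# Expected frequencies from the Punnett square, kept as exact fractions.
expected_frequency = {
    "yellow round": Fraction(9, 16),
    "yellow wrinkled": Fraction(3, 16),
    "green round": Fraction(3, 16),
    "green wrinkled": Fraction(1, 16),
}

# Total number of observations, then expected count = frequency * total.
total = sum(observed.values())
expected = {name: float(freq * total) for name, freq in expected_frequency.items()}

# Criterion 1: every expected count is one or more.
all_at_least_one = all(e >= 1 for e in expected.values())

# Criterion 2: no more than 20% of the expected counts are below 5.
share_below_five = sum(e < 5 for e in expected.values()) / len(expected)

assumptions_met = all_at_least_one and share_below_five <= 0.20
print(total, expected, assumptions_met)
```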
Step 3:
The formula for the χ² test statistic is:
$$\chi^2 = \sum \frac{(o - e)^2}{e}$$
where o = observed counts and
e = expected counts.
This calculation needs to be made in the
graphing calculator.
Enter the observed counts in L1. Enter the
expected frequencies in L2, as exact
numbers. (Enter numbers like 1/3,
directly, as fractions, never round to
just .3 or .33.)
In L3 multiply L2 by 556. This will give
the expected counts. The sum of L1 can be
found using 1-Var Stats.
Now in L4, enter (L1-L3)²/L3. This will
give you the χ² contribution for each
category.
Finally, χ² is the sum of L4.

For this problem, the χ² statistic is
0.4700.
In χ², we always need to know and report
the degrees of freedom. The degrees of
freedom are the number of categories minus
one.
Here we have 3 degrees of freedom.
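
For readers without a graphing calculator, the same list arithmetic can be sketched in Python. This is an illustration rather than part of the original slides. The four lists stand in for the calculator lists L1 to L4, and their names are illustrative.

```python
from fractions import Fraction

# L1: observed counts (yellow round, yellow wrinkled, green round, green wrinkled).
observed = [315, 101, 108, 32]

# L2: expected frequencies, entered as exact fractions.
frequencies = [Fraction(9, 16), Fraction(3, 16), Fraction(3, 16), Fraction(1, 16)]

# L3: expected counts = frequency * total number of observations.
n = sum(observed)
expected = [float(f * n) for f in frequencies]

# L4: chi-square contribution of each category, (o - e)^2 / e.
contributions = [(o - e) ** 2 / e for o, e in zip(observed, expected)]

# The test statistic is the sum of the contributions.
chi_square = sum(contributions)

# Degrees of freedom: number of categories minus one.
degrees_of_freedom = len(observed) - 1

print(f"{chi_square:.4f}", degrees_of_freedom)
```

Looking at the individual contributions shows which category deviates most from its expected count. Here the largest contribution comes from the green wrinkled peas.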
Step 4:
Step 5:
P(χ² > 0.4700) = 0.9254
The area can also be found with
χ²cdf(.4700,10^99,3).
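
The upper-tail area can also be computed without a calculator. The sketch below is not part of the original slides. It uses a closed form that holds only for exactly 3 degrees of freedom, and the function name `chi_square_upper_tail_df3` is hypothetical. For other degrees of freedom a statistics library would be the practical choice.

```python
import math

def chi_square_upper_tail_df3(x):
    """Return P(chi-square > x) for exactly 3 degrees of freedom.

    Closed form: erfc(sqrt(x/2)) + sqrt(2x/pi) * exp(-x/2).
    This expression is NOT valid for other degrees of freedom.
    """
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)

# Upper-tail area beyond the test statistic from Step 3.
p_value = chi_square_upper_tail_df3(0.4700)
print(f"{p_value:.4f}")
```

The result can be compared with the area reported in Step 5.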
Step 6:
Fail to reject H0, as p = 0.9254 > α = .05.

Step 7:
We lack evidence that the pattern of pea
phenotypes is different from expected.
That is, the F2 counts are consistent with
the expected proportions, 9:3:3:1.
Gregor Mendel did not have modern
statistics to rely on for his data
analysis, but nonetheless analyzed
data in a way that led to this major
scientific discovery, important to this
day.
There has been speculation about his
studies, or how he reported them, as
the data is almost better than chance
variation would produce.
He was, however, an Augustinian monk,
so perhaps he had a little help…
