
Chi-square test

The topic of gene interaction includes a sometimes bewildering array of different
phenotypic ratios. Although these ratios are easily demonstrated in established
systems such as the ones illustrated in this chapter, in an experimental setting a
researcher may observe an array of different progeny phenotypes and not
initially know the meaning of this ratio. At this stage, a hypothesis is devised to
explain the observed ratio. The next step is to determine whether the observed
data are compatible with the expectations of the hypothesis.
In research generally, it is often necessary to compare experimentally observed
numbers of items in several different categories with numbers that are predicted
on the basis of some hypothesis. For example, you might want to determine
whether the sex ratio in some specific population of insects is 1:1 as expected. If
there is a close match, then the hypothesis is upheld, whereas, if there is a poor
match, then the hypothesis is rejected. As part of this process, a judgment has to
be made about whether the observed numbers are a close enough match to those
expected. Very close matches and blatant mismatches generally present no
problem in judgment, but inevitably there are gray areas in which the match is
not obvious. Genetic analysis often requires the interpretation of numbers in
various phenotypic classes. In such cases, a statistical procedure called the
χ² (chi-square) test is used to help in making the decision to hold onto or reject
the hypothesis.
The χ² test is simply a way of quantifying the various deviations expected by
chance if a hypothesis is true. For example, consider a simple hypothesis that a
certain plant is a heterozygote (monohybrid) of genotype A/a. To test this
hypothesis, we would make a testcross to a/a and predict a 1:1 ratio
of A/a and a/a in the progeny. Even if the hypothesis is true, we do not always
expect an exact 1:1 ratio. We can model this experiment with a barrel full of
equal numbers of red and blue marbles. If we blindly removed samples of 100
marbles, on the basis of chance we would expect samples to show small
deviations such as 52 red:48 blue quite commonly and larger deviations such as
60 red:40 blue less commonly. The χ² test allows us to calculate the probability
of such chance deviations from expectations if the hypothesis is true. But, if all
levels of deviation are expected with different probabilities even if the
hypothesis is true, how can we ever reject a hypothesis? It has become a general
scientific convention that a probability value of less than 5 percent is to be taken
as the criterion for rejecting the hypothesis. The hypothesis might still be true,
but we have to make a decision somewhere, and the 5 percent level is the
conventional decision line. The logic is that, although results this far from
expectations are expected 5 percent of the time even when the hypothesis is true,
we will mistakenly reject the hypothesis in only 5% of cases and we are willing
to take this chance of error.
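The marble model can be checked numerically. The sketch below assumes the barrel is large enough that each draw is effectively an independent 50:50 event, so the number of red marbles in a sample of 100 follows a binomial distribution; the function name is illustrative, not from the text. It computes the exact chance of a sample at least as lopsided as 52:48 and at least as lopsided as 60:40, counting deviations in either direction.

```python
from math import comb

def prob_deviation_at_least(n, k):
    """Exact probability that a sample of n marbles, drawn from a barrel that is
    half red and half blue, has a red count at least k away from the expected n/2.
    Assumes independent draws (binomial model)."""
    half = n / 2
    total = 2 ** n
    # Add up the binomial weights of every red count that is k or more away from n/2.
    favorable = sum(comb(n, r) for r in range(n + 1) if abs(r - half) >= k)
    return favorable / total

p_small = prob_deviation_at_least(100, 2)    # 52:48 or more lopsided
p_large = prob_deviation_at_least(100, 10)   # 60:40 or more lopsided
print(f"deviation of 2 or more:  {p_small:.3f}")
print(f"deviation of 10 or more: {p_large:.3f}")
```

Under this model the small deviation turns up in most samples, while the large one is rare, which is the pattern the text describes.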
Let's consider an example taken from gene interaction. We cross two pure lines
of plants, one with yellow petals and one with red. The F1 are all orange. When
the F1 is selfed to give an F2, we find the following result:

Compiled by: J. Gumalal for Advanced Genetics (Bio 240)


What hypothesis can we invent to explain the results? There are at least two
possibilities:
Hypothesis 1. Incomplete dominance

Hypothesis 2. Recessive epistasis of r (red) on Y (orange) and y (yellow)

The statistic χ² is always calculated from actual numbers, not from percentages,
proportions, or fractions. Sample size is therefore very important in the χ² test,
as it is in most considerations of chance phenomena. Samples to be tested
generally consist of several classes. The letter O is used to represent the
observed number in a class, and E represents the expected number for the same
class based on the predictions of the hypothesis. The general formula for
calculating χ² is as follows:

χ² = Σ (O - E)² / E, summed over all classes
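As a minimal sketch of that calculation, the function below sums (O - E)² / E over all classes. The function name is an assumption made for illustration; the two test cases reuse the marble samples described earlier, compared against an expected 50 red:50 blue.

```python
def chi_square(observed, expected):
    """Sum of (O - E)^2 / E over all classes.
    Both arguments are lists of counts (actual numbers, not proportions)."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected need one entry per class")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Marble samples from the text, tested against the expected 50:50 split.
small = chi_square([52, 48], [50, 50])   # small deviation
large = chi_square([60, 40], [50, 50])   # larger deviation
print(small, large)
```

The larger deviation gives the larger statistic. Because the counts enter directly, the same 60:40 proportion in a sample of 1000 would give a value ten times as large, which is why the statistic must be computed from actual numbers.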

To convert the χ² value into a probability, we use Table 4-1, which shows
χ² values for different degrees of freedom (df). For any total number of progeny,
if the number of individuals in two of the three phenotypic classes is known, then
the size of the third class is automatically determined. Hence, there are only 2
degrees of freedom in the distribution of individuals among the three classes.
Generally, the number of degrees of freedom (shown as the different rows
of Table 4-1) is the number of classes minus 1. In this case, it is 3 - 1 = 2. Looking
along the 2-df line, we find that the χ² value places the probability at less than
0.025, or 2.5 percent. This means that, if the hypothesis is true, then deviations
from expectations this large or larger are expected approximately 2.5 percent of
the time. As mentioned earlier, by convention the 5 percent level is used as the
cutoff line. When values of less than 5 percent are obtained, the hypothesis is
rejected as being too unlikely. Hence the incomplete dominance hypothesis must
be rejected.
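Where a printed table is not at hand, the probability can be computed directly. The sketch below is one possible approach, not the textbook's method: it evaluates the upper-tail probability of the χ² distribution for whole-number degrees of freedom using a standard recurrence and only the Python standard library. The function name `chi2_sf` is hypothetical.

```python
import math

def chi2_sf(x, df):
    """Probability of a chi-square value of x or larger with df degrees of freedom.
    Valid for integer df >= 1 (illustrative helper, standard library only)."""
    if x <= 0:
        return 1.0
    z = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-z)                 # df = 2 starting point
    else:
        a, q = 0.5, math.erfc(math.sqrt(z))      # df = 1 starting point
    # Step up two degrees of freedom at a time:
    # Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / Gamma(a + 1)
    while a < df / 2.0:
        q += z ** a * math.exp(-z) / math.gamma(a + 1.0)
        a += 1.0
    return q

# 5.991 is the standard 5 percent critical value for 2 df,
# so the result should be close to 0.05.
print(round(chi2_sf(5.991, 2), 3))
```

A χ² value above 5.991 with 2 df therefore gives a probability below 5 percent, which is the region in which the hypothesis is rejected.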



Critical Values of the χ² Distribution.

For hypothesis 2, the calculation is set up as follows.

The probability value (for 2 df) this time is greater than 0.9, or 90 percent. Hence
a deviation this large or larger is expected approximately 90 percent of the
time; in other words, very frequently. Formally, because 90 percent is greater
than 5 percent, we conclude that the results uphold the hypothesis of
recessive epistasis.

Reference:
An Introduction to Genetic Analysis. 7th edition.
Griffiths AJF, Miller JH, Suzuki DT, et al.
New York: W. H. Freeman; 2000.



HOW TO SOLVE CHI-SQUARE PROBLEMS

Chi-Square = sum of (observed - expected)² / expected


The problem is usually figuring out the expected.

To find the expected:


1. Do a Punnett Square.
2. Use the ratios from the Punnett Square. Put the ratios in the form of a
fraction or decimal.
For example, 9:3:3:1 is actually 9/16, 3/16, 3/16, and 1/16.
3. Multiply each fraction by the total number observed.
Example:
 916 tall, red       9/16 x 1621 = 911.8, about 912 expected
 325 tall, white     3/16 x 1621 = 303.9, about 304 expected
 295 short, red      3/16 x 1621 = 303.9, about 304 expected
  85 short, white    1/16 x 1621 = 101.3, about 101 expected

1621 total

4. Fill in the following chart:

Observed (o)   Expected (e)   (o - e)            (o - e)²        (o - e)² / e

916            912            916 - 912 = 4      4² = 16         16/912 = 0.018
325            304            325 - 304 = 21     21² = 441       441/304 = 1.451
295            304            295 - 304 = -9     (-9)² = 81      81/304 = 0.266
85             101            85 - 101 = -16     (-16)² = 256    256/101 = 2.535
1621           1621                                              total = 4.270

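5. Now look up 4.27 on a probability table that lists the critical values of a
Chi-Square distribution. Remember that the degrees of freedom are one
less than the number of classes (4 - 1 = 3). With 3 degrees of freedom, the
critical value at the 5 percent level is 7.815; because 4.27 is smaller, the
9:3:3:1 hypothesis is not rejected.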
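The five steps above can be sketched end to end in a few lines. This is an illustrative walk-through rather than a general-purpose tool: the helper name and the small dictionary of 5 percent critical values (taken from a standard chi-square table) are assumptions introduced here. Expected counts are rounded to whole numbers to match the chart; keeping the unrounded values gives a slightly different statistic.

```python
# 5 percent critical values from a standard chi-square table, keyed by df.
CRITICAL_5_PERCENT = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488}

def expected_counts(ratio, total):
    """Turn a Punnett-square ratio such as (9, 3, 3, 1) into expected counts,
    rounded to whole numbers as in the chart."""
    parts = sum(ratio)
    return [round(r * total / parts) for r in ratio]

# Observed classes: tall red, tall white, short red, short white.
observed = [916, 325, 295, 85]
expected = expected_counts((9, 3, 3, 1), sum(observed))

# Chi-square is the sum of (o - e)^2 / e over the four classes.
chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1                      # classes minus one
rejected = chi_sq > CRITICAL_5_PERCENT[df]  # True means p < 0.05

print(f"expected = {expected}")
print(f"chi-square = {chi_sq:.2f}, df = {df}, reject at 5 percent: {rejected}")
```

To test a different cross, change the ratio passed to `expected_counts` and the observed counts; the degrees of freedom follow from the number of classes.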



PENETRANCE AND EXPRESSIVITY
In the preceding examples, the dependence of one gene on another is deduced
from clear genetic ratios. In such cases, we can use the phenotype to distinguish
mutant and wild-type genotypes with 100 percent certainty. In these cases, we say
that the mutation is 100 percent penetrant. However, many mutations show
incomplete penetrance: not every individual with the genotype expresses the
corresponding phenotype. Thus penetrance is defined as the percentage of
individuals with a given allele who exhibit the phenotype associated with that
allele. Why would an organism have a particular genotype and yet not express the
corresponding phenotype? There are several possible reasons:

1. The influence of the environment. Individuals with the same genotype
may show a range of phenotypes depending on the environment. It is possible
that the range of phenotypes for mutant and wild-type individuals will overlap:
the phenotype of a mutant individual raised in one set of circumstances may
match the phenotype of a wild-type individual raised in a separate set of
circumstances. Should this happen, it becomes impossible to distinguish mutant
from wild type.
2. The influence of other genes. Modifiers, epistatic genes, or suppressors in
the rest of the genome may act to prevent the expression of the typical
phenotype.
3. The subtlety of the mutant phenotype. The subtle effects brought about
by the absence of a gene function may be difficult to measure in a laboratory
situation.
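Returning to the definition of penetrance given before this list, the percentage is a simple ratio of two counts. The sketch below uses a hypothetical function name and made-up counts that do not come from the text; it only illustrates the arithmetic.

```python
def penetrance_percent(with_phenotype, with_genotype):
    """Percentage of individuals carrying the genotype who show the phenotype."""
    if with_genotype <= 0:
        raise ValueError("need at least one individual with the genotype")
    if not 0 <= with_phenotype <= with_genotype:
        raise ValueError("phenotype count must lie between 0 and the genotype count")
    return 100.0 * with_phenotype / with_genotype

# Hypothetical counts, not from the text: 40 individuals carry the allele
# and 30 of them show the associated phenotype.
example = penetrance_percent(30, 40)
print(example)  # 75.0
```

A result of 100 corresponds to the fully penetrant case described above; anything lower is incomplete penetrance.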

Another measure for describing the range of phenotypic expression is called
expressivity. Expressivity measures the degree to which a given allele is
expressed at the phenotypic level; that is, expressivity measures the intensity of
the phenotype. For example, brown animals (genotype b/b) from different
stocks might show very different intensities of brown pigment from light to dark.
Different degrees of expression in different individuals may be due to variation
in the allelic constitution of the rest of the genome or to environmental factors.
Figure 6-25 illustrates the distinction between penetrance and expressivity. Like
penetrance, expressivity is integral to the concept of the norm of reaction. An
example of variable expressivity in dogs is found in Figure 6-26. The phenomena
of incomplete penetrance and variable expressivity can make any kind of genetic
analysis substantially more difficult, including human pedigree analysis and
predictions in genetic counseling. For example, it is often the case that a
disease-causing allele is not fully penetrant. Thus someone could have the allele
but not show any signs of the disease. If that is the case, it is difficult to give a
clean genetic bill of health to any individual in a disease pedigree (for example,
individual R in Figure 6-27). On the other hand, pedigree analysis can sometimes
identify individuals who do not express but almost certainly do have a disease
genotype (for example, individual Q in Figure 6-27).
