Sie sind auf Seite 1von 9

Math 227

8.1 Basics of Hypothesis Testing


____________________ is an eight-step decision-making process for evaluating claims about a
property of a population. The researcher must define the population under study, state the particular
hypothesis that will be investigated, give the significance level, select a sample from the population,
collect the data, perform the calculations required for the statistical test, and reach a conclusion.
Hypotheses concerning parameters such as means, proportions, variances and standard deviations
can be investigated. There are two specific statistical tests used for hypotheses concerning means:
the 𝑧-test and the 𝑡-test. A hypothesis-testing procedure for testing a single variance or standard
deviation involves using the chi-square distribution.
There are three methods used to test hypotheses:
1. The traditional method
2. The 𝑃-value method
3. The confidence interval method

For our purposes, we will primarily be using the 𝑃-value method.

Steps 1, 2, 3 Use the original claim to create a Null Hypothesis 𝑯𝟎 and an Alternative
Hypothesis 𝑯𝟏 , label one of them as the claim, and determine which type of test to run:

Every hypothesis-testing situation begins with the statement of a hypothesis.


A ____________________ is a conjecture about a population parameter. It may or may not be true.
There are two types of statistical hypotheses for each situation:
The ____________________, symbolized by 𝐻( , is a statistical hypothesis that states that there is no
difference between a parameter and a specific claimed value, or that there is no difference between
the two parameters. This is the working assumption for the purposes of conducting the test. It will
always contain one of these equality symbols: ≤, ≥, =.
The ____________________, symbolized by 𝐻- or 𝐻. or 𝐻/ , is a statistical hypothesis that states the
existence of a difference between a parameter and a specific claimed value, or states that there is a
difference between two parameters. In other words, it is a statement that the parameter has a value
that somehow differs from the null hypothesis. It will always contain one of these symbols: <, >, ≠.

There are three types of tests – two-tailed, right-tailed, and left-tailed. Here are examples of each:
Test 1: A medical researcher is interested in finding out whether a new medication will have any
undesirable side effects. The researcher is particularly concerned with the pulse rate of patients who
take the medication. Will the pulse rate increase, decrease, or remain unchanged after a patient
takes the medication?

Since the researcher knows that the mean pulse rate for the population under study is 82 beats per
minute, they hypotheses for this situation are:
𝐻( : 𝜇 = 82 and 𝐻- : 𝜇 ≠ 82
The null hypothesis specifies that the mean will remain unchanged, and the alternative hypothesis
states that it will be different. This test is called a two-tailed test since the possible side effects of the
medicine could be to raise or lower the pulse rate.

Pierce College // MDP 1


Math 227
Test 2: A chemist invents an additive to increase the life of an automobile battery. If the mean lifetime
of the automobile battery without the additive is 36 months, then her hypotheses are
𝐻( : 𝜇 ≤ 36 and 𝐻- : 𝜇 > 36
In this situation, the chemist is interested only in increasing the lifetime of the batteries, so her
alternative hypothesis is that the mean is greater than 36 months. The null hypothesis is that the
mean is less than or equal to 36 months. This is called a right-tailed test, since the interest is in an
increase only.

Test 3: A contractor wishes to lower heating bills by using a special type of insulation in houses. If
the average of the monthly heating bills is $78, her hypotheses about heating costs with the use of
insulation are
𝐻( : 𝜇 ≥ $78 and 𝐻- : 𝜇 < $78
This test is a left-tailed test, since the contractor is interested only in lowering the heating costs.

Example: State the null and alternative hypothesis for each conjecture.
(a) A researcher thinks that if expectant mothers use vitamin pills, the birth weight of babies will
increase. The average birth weight of the population is 8.6 pounds.

(b) An engineer hypothesizes that the mean number of defects can be decreased in a
manufacturing process of compact disks by using robots instead of humans for certain tasks.
The mean number of defective disks per 1000 is 18.

(c) A psychologist feels that playing soft music during a test will change the results of the test.
The psychologist is not sure whether the grades will be higher or lower. In the past, the mean
of these cores was 73.

Pierce College // MDP 2


Math 227
After stating the hypotheses, the researcher designs the study. The researcher selects the correct
statistical test, and then chooses an appropriate level of significance, and formulates a plan for
conducting the study.

Let’s examine Test 1 (above). The researcher will select a sample of patients who will be given the
drug. After allowing a suitable time for the drug to be absorbed, the researcher will measure each
person’s pulse rate.

Recall that when samples of a specific size are selected from a population, the means of these
samples will vary about the population mean, and the distribution of the sample means will be
approximately normal when the sample size is 30 or more (by the Central Limit Theorem). So even if
the null hypothesis is true, the mean of the pulse rates of the sample of patients will not, in most
cases, be exactly equal to the population mean of 82 beats per minute.

There are two possibilities. Either the null hypothesis is true, and the difference of the sample mean
and the population mean is due to chance; or the null hypothesis is false, and the sample came from
a population whose mean is not 82 beats per minute but is some other value that is not known:

The farther away the sample mean is from the population mean, the more evidence there would be
for rejecting the null hypothesis. The probability that the sample came from a population whose mean
is 82 decreases as the distance or absolute value of the difference between the means increases.

If the mean pulse rate of the sample were, say, 83, the researcher would probably conclude that this
difference was due to chance and would not reject the null hypothesis. But if the sample mean were,
say, 90, then in all likelihood the researcher would conclude that the medication increased the pulse
rate of the users and would reject the null hypothesis.

So, the question is, where does the researcher draw the line? Well, the decision is made statistically.
The difference must be significant and in all likelihood not due to chance.

Pierce College // MDP 3


Math 227
However, before we continue, we need to discuss the possibility of committing an error after running
a hypothesis test.
Types of Errors:
In the hypothesis-testing situation, there are four possible outcomes. In reality, the null hypothesis
may or may not be true, and a decision is made to reject or not reject it on the basis of the data
obtained from the sample. The four possibilities are shown below:
𝐻( true 𝐻( false
Error Correct
Reject 𝐻(
Type I decision
Correct Error
Fail to reject 𝐻(
decision Type II
If a null hypothesis is true and it is rejected, then a Type I error is made. In the first situation, for
example, the medication might not significantly change the pulse rate of all the users in the
population; but it might change the rate, by chance, of the subjects in the sample. In this case, the
researcher will reject the null hypothesis when it is really true, thus committing a Type I error.
On the other hand, the medication might not change the pulse rate of the subjects in the sample, but
when it is given to the general population, it might cause a significant increase or decrease in the
pulse rate of users. The researcher, on the basis of the data obtained from the sample, will fail to
reject the null hypothesis, thus committing a Type II error.
The hypothesis-testing situation can be likened to a jury trial. In a jury trial, there are four possible
outcomes. The defendant is either guilty or innocent, and he or she will be convicted or acquitted.
The hypotheses are:
𝐻( : The defendant is innocent.
𝐻- : The defendant is not innocent (i.e. guilty).
Next, the evidence is presented in court by the prosecutor, and based on this evidence, the jury
decides the verdict, innocent or guilty.
If the defendant is convicted but he or she did not commit the crime, then a Type I error has been
committed. If the defendant is convicted and he or she has committed the crime, then a correct
decision has been made.
If the defendant is acquitted and he or she did not commit the crime, then a correct decision has been
made by the jury. However, if the defendant is acquitted and he or she did commit the crime, then a
Type II error has been made.
The decision of the jury does not prove the defendant did or did not commit the crime. The decision
is based on the evidence presented. If the evidence is strong enough, the defendant will be
convicted in most cases. If the evidence is weak, the defendant will be acquitted in most cases.
Nothing is proved absolutely. Likewise, the decision to reject or fail to reject the null hypothesis
DOES NOT prove anything. The only way to prove anything statistically is to use the entire
population, which, in most cases, is not possible. The decision, then, is to be made on the basis of
probabilities. That is, when there is a large difference between the mean obtained from the sample
and the hypothesized mean, the null hypothesis is probably not true.

Pierce College // MDP 4


Math 227
Step 4 Select the Significance Level 𝜶:
So again, how large of a difference is necessary to reject the null hypothesis?
A ____________________ is the maximum probability of committing a Type I error. This probability
is symbolized by 𝛼 (Greek letter “alpha”). So, 𝑃(Type I error) = 𝑃(rejecting 𝐻( when 𝐻( is true) = 𝛼.
In other words, the significance level is the probability value used as the cutoff for determining when
the sample evidence constitutes significant evidence against the null hypothesis. So, by its nature,
the significance level 𝛼 is the probability of mistakenly rejecting the null hypothesis when it is true.
The probability of a Type II error (failing to reject a false null hypothesis) is symbolized by 𝛽 (Greek
letter “beta”). That is, 𝑃(Type II error) = 𝛽. In most hypothesis-testing situations, 𝛽 cannot be easily
computed; however 𝛼 and 𝛽 are related in that decreasing one increases the other. The ________ of
a hypothesis test is the probability 1 − 𝛽 of rejecting a false null hypothesis. Thus, the power is a
probability that is one measure of the effectiveness of a hypothesis test.
Statisticians generally agree on using three arbitrary significance levels: the 0.10, 0.05, and 0.01
levels. That is, if the null hypothesis is rejected, then the probability of a Type I error will be 10, 5, or
1% depending on which level of significance is used.
So, when 𝛼 = 0.10, there is a 10% chance of rejecting a true null hypothesis; when 𝛼 = 0.05, there is
a 5% chance of rejecting a true null hypothesis, and when 𝛼 = 0.01, there is a 1% chance of rejecting
a true null hypothesis.
After a significance level is chosen a _______________, C.V., which separates the critical region
from the non critical region, is selected from a table for the appropriate test (𝑧, 𝑡, or 𝜒 E ).
The _______________ or _______________ is the range of values of the test value that indicates
that there is a significant difference and that the null hypothesis should be rejected.
The _______________ or _______________ is the range of values of the test value that indicates
that the difference was probably due to chance and that the null hypothesis should not be rejected.
Step 5: Identify the test statistic relevant to the test you are running and determine its
sampling distribution:
A _______________ uses the data obtained from a sample to make a decision about whether or not
the null hypothesis should be rejected.
The numerical value obtained from a statistical test is called the ______________.
Based on the sample statistic that is relevant to the claim being tested, identify the sampling
distribution of the statistic (such as normal, 𝑡, or 𝜒 E ). Verify that any requirements are satisfied to
choose the correct sampling distribution:
Parameter Sampling Distribution Requirements Test Statistic
𝑝̂ − 𝑝
𝑧=
Proportion 𝑝 Normal (𝑧) 𝑛𝑝 ≥ 5 and 𝑛𝑞 ≥ 5 𝑝𝑞
J
𝑛
𝜎 not known and normally distributed 𝑥̅ − 𝜇
population 𝑡= 𝑠
Mean 𝜇 𝑡
or
𝜎 not known and 𝑛 > 30 √𝑛
𝜎 known and normally distributed 𝑥̅ − 𝜇
population 𝑧= 𝜎
Mean 𝜇 Normal (𝑧)
or
𝜎 known and 𝑛 > 30 √𝑛
Standard deviation 𝜎 Strict requirement: normally (𝑛 − 1)𝑠 E
𝜒E 𝜒E =
or Variance 𝜎 E distributed population 𝜎E
Pierce College // MDP 5
Math 227
Step 6 Find the value of the test statistic and then find the 𝑷-value:
The _______________ is a value used in making a decision about the null hypothesis. It is found by
converting the sample statistic (such as 𝑝̂ , 𝑥̅ , or 𝑠) to a score (such as 𝑧, 𝑡 or 𝜒 E ) with the assumption
that the null hypothesis is true.
With the ______________ of testing hypotheses, we make a decision by comparing the 𝑃-value to
the significance level.
In a hypothesis test, the _____________ is the probability of getting a value of the test statistic that is
at least as extreme as the test statistic obtained from the sample data, assuming that the null
hypothesis is true.

To find the 𝑃-value, first find the area beyond the test statistic, and then use the following procedure:
• Critical region in left tail: 𝑃-value = area to the left of the test statistic
• Critical region in right tail: 𝑃-value = area to the right of the test statistic
• Critical region in two tails: 𝑃-value = twice the area in the tail beyond the test statistic

Example: The test statistic of 𝑧 = 1.60 has an area of 0.0548 to its right, so a right-tailed test with test
statistic 𝑧 = 1.60 has a 𝑃-value of 0.0548. (Note: on your calculator, 𝑃-values show up when doing a
1-PropZTest)

Pierce College // MDP 6


Math 227
Step 7 Make a decision to either reject 𝑯𝟎 or fail to reject 𝑯𝟎 :
How do we decide to reject or fail to reject the null?
𝑷-value Method:
If 𝑃-value ≤ 𝛼, reject 𝐻( (“If the 𝑃 is low, the null must go”)
If 𝑃-value > 𝛼, fail to reject 𝐻(

Remember, the 𝑃-value is the probability of getting a sample result at least as extreme as the one
obtained, so if the 𝑃-value is low (less than or equal to 𝛼), the sample statistic is significantly
(unusually) low or significantly (unusually) high.

Step 8 Restate the decision using simple and nontechnical terms (address the claim)
Without using technical terms not understood by most people, state a final conclusion that addresses
the original claim with wording that can be understood by those without knowledge of statistical
procedures.

Here are some guidelines:


Condition Conclusion
Original claim does not include equality, and you “There is sufficient evidence to support the claim
reject 𝐻( that… (original claim)”
Original claim does not include equality, and you “There is not sufficient evidence to support the
fail to reject 𝐻( claim that… (original claim)”
Original claim includes equality, and you reject “There is sufficient evidence to warrant rejection
𝐻( of the claim that… (original claim)”
Original claim includes equality, and you fail to “There is not sufficient evidence to warrant
reject 𝐻( rejection of the claim that… (original claim)”

So, final conclusions of “reject the null hypothesis” or “fail to reject (aka don’t reject) the null
hypothesis” are not acceptable because they mean nothing to most people. The final statement
should address the original claim, and it should not involve technical terms like “null hypothesis.”
Notice we do not use phrases like “support” or “accept” when it comes to the hypotheses and claim.
The term “accept” is misleading because it implies incorrectly that the null hypothesis has been
proved, but we can NEVER prove a null hypothesis. The phrase “fail to reject” or “don’t reject” says
more correctly that the available evidence isn’t strong enough to warrant rejection of the null
hypothesis.

Final conclusions can include as many as three negative terms. For example, a conclusion may
read: “There is not sufficient evidence to warrant rejection of the claim of no difference between 0.5
and the population proportion.” For such confusing conclusions, it is better to restate them to be
understandable. Instead we may say something like “Until stronger evidence is obtained, continue to
assume that the population proportion is equal to 0.5.”

Pierce College // MDP 7


Math 227
Confidence Interval Method:
A confidence interval estimate of a population parameter contains the likely values of that parameter.
If a confidence interval does not include a claimed value of a population parameter, reject that claim.
For a two-tailed hypothesis test, construct a confidence interval with a confidence level of 1 − 𝛼, but
for a one-tailed hypothesis test with significance level 𝛼, construct a confidence interval with a
confidence level of 1 − 2𝛼. After constructing the confidence interval, use the following criterion: A
confidence interval estimate of a population parameter contains the likely values of that parameter.
We should therefore reject a claim that the population parameter has a value that is not included in
the confidence interval.

Example: Find the value of the test statistic.


a. Claim: Three-fourths of all adults believe that it is important to be involved in their communities.
Based on a USA Today/Gallup poll of 1021 randomly selected adults, 89% believe that it is
important to be involved in their communities.
b. Claim: Among those who file tax returns, less than one-half file them through an accountant or
other tax professional. A Fellowes survey of 1002 people who file tax returns showed that
48% of them file through an accountant or other tax professional.
c. Claim: For adult females, the standard deviation of their white blood cell counts is equal to
5.00. A random sample of 40 adult females has a standard deviation of 2.28.
d. Claim: For adult females, the mean of their white blood cell counts is equal to 8.00. A random
sample of 40 adult females has a mean of 7.15 and a standard deviation of 2.28 (the
population standard deviation 𝜎 is not known).

Pierce College // MDP 8


Math 227
Example: Assuming a significance level of 𝛼 = 0.05, use the given statement to find the 𝑃-value and
critical values.
a. The test statistic of 𝑧 = 2.00 is obtained when testing the claim that 𝑝 > 0.5.
b. The test statistic of 𝑧 = −2.00 is obtained when testing the claim that 𝑝 < 0.5.
-
c. The test statistic of 𝑧 = −1.75 is obtained when testing the claim that 𝑝 = S.
d. The test statistic of 𝑧 = 1.50 is obtained when testing the claim that 𝑝 ≠ 0.25.
e. With 𝐻- : 𝑝 ≠ 0.25, the test statistic is 𝑧 = −1.23.

Pierce College // MDP 9

Das könnte Ihnen auch gefallen