
Probability & Statistics for Engineers & Scientists, by Walpole, Myers, Myers & Ye ~ Chapter 10 Notes

Statistical Hypotheses
A statistical hypothesis is an assertion or conjecture concerning one or more populations.
To prove a hypothesis in statistics, we generally set up the opposite of the hypothesis and see if we can reject it. Acceptance of a hypothesis just means that there is not enough evidence to refute it. Rejection implies that the evidence truly does refute it.

The hypothesis that we wish to test is called the null hypothesis and denoted H0. Rejection of H0 leads to acceptance of the alternate hypothesis H1.
Acceptance or rejection will be based on the value of a sample statistic. The critical region is the range of the sample statistic for which we reject the null hypothesis.

Testing a Hypothesis
There are two types of error that can be made when testing a null hypothesis.
Type I error: Rejecting the null hypothesis when it is true. Type II error: Accepting the null hypothesis when it is false. We define α = P(type I error) and β = P(type II error).

In hypothesis testing, we generally want to minimize α, the probability of making a type I error.
The value of α can be reduced by adjusting the critical region. Decreasing α generally causes β to increase, and vice versa. Increasing the sample size n will decrease both α and β.
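The α/β trade-off above can be sketched numerically. This is a minimal illustration with invented numbers (the means, σ, n, and the critical values c are all hypothetical), using Python's standard-library normal distribution:

```python
# Sketch: alpha/beta trade-off for a one-sided test of H0: mu = 50
# vs H1: mu > 50, with sigma = 10 and n = 25. All numbers are invented.
from statistics import NormalDist
from math import sqrt

mu0, mu1 = 50.0, 54.0      # null mean and one specific alternative mean
sigma, n = 10.0, 25
se = sigma / sqrt(n)       # standard error of the sample mean

def alpha_beta(c):
    """Return (alpha, beta) when we reject H0 whenever xbar > c."""
    alpha = 1 - NormalDist(mu0, se).cdf(c)   # P(reject H0 | H0 true)
    beta = NormalDist(mu1, se).cdf(c)        # P(accept H0 | H1 true)
    return alpha, beta

a1, b1 = alpha_beta(53.0)   # a tighter critical region
a2, b2 = alpha_beta(52.0)   # a looser critical region
print(a1, b1)
print(a2, b2)
# Moving the cutoff c down enlarges the critical region:
# alpha goes up while beta goes down, and vice versa.
```

Note that 1 − β here is the power of the test against the specific alternative μ = 54.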

More Hypothesis Testing


The power of a hypothesis test is the probability of (correctly) rejecting H0 given that H1 is true, i.e., power = 1 - β.
For a given α, we would like a test with a high power value.

Summary of hypothesis test outcomes:

Decision     H0 true                    H1 true
Accept H0    Correct acceptance (1-α)   Type II error (β)
Reject H0    Type I error (α)           Correct rejection (power = 1-β)

One-Tailed and Two-Tailed Tests


For a one-tailed test, the alternate hypothesis is in a single direction, for example,
H0: μ = μ0 and H1: μ > μ0.

For a two-tailed test, the alternate hypothesis can be in either direction, for example,
H0: μ = μ0 and H1: μ ≠ μ0. Here α is equal to the sum of the probabilities in the two tails (typically α/2 in each tail). To calculate β, a specific alternative value, H1: μ = μ1 (either high or low), must be used. For a two-sided test, the high and low values will generally be symmetric around μ0.

Pre-selecting α vs. using a P-Value


In classical hypothesis testing, we typically preselect α to be .05 or .01 and then determine the critical region.
We can then reject the hypothesis at that level of significance. (Remember that in a two-sided test with α = .05, for example, the critical region would have .025 in either tail.)

The alternative is calculating the P-value, the probability of obtaining a result at least as extreme as the one calculated if H0 is true. The P-value provides more information than just whether the hypothesis was rejected or not.
If rejected, the P-value may be much less than .05 or .01, giving us additional confidence in our decision. If not rejected, the P-value may be very close to .05 or .01, allowing us the option of rejecting at a slightly reduced level. The judgment of the experimenter is used to interpret the P-value.
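A quick sketch of the P-value idea, with an invented test statistic. The two-sided P-value puts probability in both tails of the standard normal distribution:

```python
# Minimal sketch (the observed z is made up): two-sided P-value from
# a z statistic, using the standard normal CDF.
from statistics import NormalDist

z = 2.17  # hypothetical observed test statistic
p_value = 2 * (1 - NormalDist().cdf(abs(z)))
print(p_value)
# Here P is a bit under .05: we can reject at the .05 level, but the
# narrow margin itself is useful information that a bare reject/accept
# decision would hide.
```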

Single Mean Tests (σ known)


Suppose we have an unknown population distribution with mean μ and known variance σ², and that H0: μ = μ0 and H1: μ ≠ μ0. If either

    z = (x̄ - μ0) / (σ/√n) > z_{α/2}    or    z < -z_{α/2}

then reject H0. In this case, the probability of making a type I error (rejecting H0 when it is true) is α. If z falls within the above limits, then accept H0.
For a single-sided test with H1: μ > μ0, reject H0 if z > z_α. For a single-sided test with H1: μ < μ0, reject H0 if z < -z_α. The critical region for x̄ can also be written in terms of μ0 and σ rather than z.
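The two-sided z test above can be sketched as follows; the sample numbers (μ0 = 8, σ = 0.5, n = 50, x̄ = 7.8) are invented for illustration:

```python
# Sketch of the two-sided single-mean z test (sigma known), made-up data.
from statistics import NormalDist
from math import sqrt

mu0, sigma, n, xbar = 8.0, 0.5, 50, 7.8   # hypothetical values
alpha = 0.05

z = (xbar - mu0) / (sigma / sqrt(n))          # test statistic
z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # z_{alpha/2} ~ 1.96

reject = abs(z) > z_crit
print(z, reject)
```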

Single Mean Tests (σ unknown)


Suppose we have a normal population distribution with mean μ and unknown variance, and that H0: μ = μ0 and H1: μ ≠ μ0. If either

    t = (x̄ - μ0) / (s/√n) > t_{α/2, n-1}    or    t < -t_{α/2, n-1}

then reject H0. In this case, the probability of making a type I error (rejecting H0 when it is true) is α. If t falls within the above limits, then accept H0.
For a single-sided test with H1: μ > μ0, reject H0 if t > t_{α, n-1}. For a single-sided test with H1: μ < μ0, reject H0 if t < -t_{α, n-1}. The critical region for x̄ can also be written in terms of μ0 and s rather than t. If n ≥ 30, we can still use the normal distribution.
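A sketch of the one-sample t test with an invented data set; since the t distribution is not in the Python standard library, the critical value t_{.025, 9} = 2.262 is taken directly from a t table:

```python
# Sketch of the one-sample t test (sigma unknown); the data are made up
# and the critical value comes from a t table (alpha = .05, 9 df).
from statistics import mean, stdev
from math import sqrt

data = [10.2, 9.8, 10.1, 10.4, 9.9, 10.3, 10.0, 10.2, 9.7, 10.3]
mu0 = 10.0
n = len(data)

t = (mean(data) - mu0) / (stdev(data) / sqrt(n))  # s = sample std dev
t_crit = 2.262   # t_{alpha/2, n-1} for alpha = .05, n - 1 = 9
print(t, abs(t) > t_crit)
# |t| < 2.262 here, so we fail to reject H0 at the .05 level.
```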

Difference of 2 Means (σ1, σ2 known)


Suppose we have two unknown population distributions with means μ1, μ2 and known variances σ1², σ2², and that H0: μ1 - μ2 = d0 and H1: μ1 - μ2 ≠ d0. If either

    z = ((x̄1 - x̄2) - d0) / √(σ1²/n1 + σ2²/n2) > z_{α/2}    or    z < -z_{α/2}

then reject H0. In this case, the probability of making a type I error (rejecting H0 when it is true) is α. If z falls within the above limits, then accept H0.

For a single-sided test with H1: μ1 - μ2 > d0, reject H0 if z > z_α. For a single-sided test with H1: μ1 - μ2 < d0, reject H0 if z < -z_α. The critical region for x̄1 - x̄2 can also be written in terms of d0 and the σ's rather than z.
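The two-mean z test can be sketched the same way; the summary statistics below are invented, with d0 = 0 (testing for equal means):

```python
# Sketch of the difference-of-two-means z test (variances known),
# with invented summary statistics and d0 = 0.
from statistics import NormalDist
from math import sqrt

xbar1, sigma1, n1 = 85.0, 4.0, 40   # hypothetical sample 1
xbar2, sigma2, n2 = 83.0, 5.0, 50   # hypothetical sample 2
d0, alpha = 0.0, 0.05

z = ((xbar1 - xbar2) - d0) / sqrt(sigma1**2 / n1 + sigma2**2 / n2)
z_crit = NormalDist().inv_cdf(1 - alpha / 2)
print(z, abs(z) > z_crit)
```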

Test of a Single Proportion


Suppose we have a binomial experiment with a binomial random variable X with success probability p, and that H0: p = p0 and H1: p ≠ p0. If either

    z = (x - n·p0) / √(n·p0·q0) > z_{α/2}    or    z < -z_{α/2}

then reject H0. In this case, the probability of making a type I error (rejecting H0 when it is true) is α. If z falls within the above limits, then accept H0.

For a single-sided test with H1: p > p0, reject H0 if z > z_α. For a single-sided test with H1: p < p0, reject H0 if z < -z_α. The critical region for x can also be written in terms of n, p0 and q0 rather than z. Note: for small n, use the binomial distribution directly.
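A sketch of the single-proportion test, with invented counts and n large enough for the normal approximation:

```python
# Sketch of the single-proportion z test; trial and success counts invented.
from statistics import NormalDist
from math import sqrt

n, x = 200, 74          # trials and observed successes (hypothetical)
p0 = 0.30               # H0: p = 0.30
q0 = 1 - p0
alpha = 0.05

z = (x - n * p0) / sqrt(n * p0 * q0)
z_crit = NormalDist().inv_cdf(1 - alpha / 2)
print(z, abs(z) > z_crit)
```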

Goodness of Fit Test


A goodness of fit test helps answer questions such as "Is a die fair?" or "Is a population normally distributed?". For any situation comparing expected and observed frequencies in k different categories, if e_i is the expected and o_i is the observed frequency in category i, then

    χ² = Σ_{i=1}^{k} (o_i - e_i)² / e_i

is approximately χ² distributed with ν = k - 1 degrees of freedom.


The expected frequency, e_i, in each category must be ≥ 5. Categories with e_i < 5 may be combined, but without reference to the observed frequencies (don't cheat!).

Goodness of Fit Test Summary


Steps for a χ² goodness of fit test of the hypothesis H0 that data follow a given distribution:
1. Break the observed data into a logical group of categories based on the range of the data. Do not base the categories in any way on the observed frequency values.
2. Determine the total number of observations, n, for the observed data.
3. For the given distribution, calculate the probability of a randomly selected observation falling in each category. (These probabilities add to 1.)
4. Multiply each probability by n to get the expected number of observations, e_i, in each category. (These, like the o_i's, add to n.)
5. Determine the observed frequencies, o_i, in each category.
6. Combine categories if necessary to ensure that each e_i ≥ 5. Do not use the o_i's when deciding which categories to combine.
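The steps above can be sketched for the "Is a die fair?" example; the roll counts are invented, and the critical value χ²_{.05, 5} = 11.070 comes from a χ² table:

```python
# Sketch of the chi-square goodness-of-fit test for a fair die;
# observed roll counts are invented.
observed = [28, 36, 36, 30, 27, 23]    # rolls of faces 1..6 (made up)
n = sum(observed)                       # 180 rolls in total
expected = [n / 6] * 6                  # fair die: each e_i = 30 >= 5

chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
chi2_crit = 11.070   # chi-square table, alpha = .05, nu = 6 - 1 = 5
print(chi2, chi2 > chi2_crit)
# chi2 is well below the critical value, so we fail to reject "fair".
```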

Degrees of Freedom/Test for Independence


The χ² goodness of fit test can apply to many situations. However, the number of degrees of freedom must be adjusted for parameters estimated from the observed data and used to calculate the expected frequencies.
For example, to test data for normality, if we estimate μ and σ using x̄ and s, we must subtract 2 additional degrees of freedom and use ν = k - 3.

In a test for independence of two discrete variables, we write the data in a table and determine the expected table frequencies as if the variables were independent.
For this case with a 2-dimensional table, we use the row and column sums to determine the expected frequencies in the table. Here, we use ν = (rows - 1)(columns - 1).
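A sketch of the expected-frequency calculation for a test of independence; the 2 × 3 table below is invented. Under independence, each expected count is (row total × column total) / n:

```python
# Sketch of a 2 x 3 test for independence; the observed table is made up.
observed = [[30, 45, 25],
            [20, 35, 45]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected count in cell (i, j) = row_i total * col_j total / n
expected = [[r * c / n for c in col_totals] for r in row_totals]
chi2 = sum((o - e) ** 2 / e
           for orow, erow in zip(observed, expected)
           for o, e in zip(orow, erow))
nu = (len(observed) - 1) * (len(observed[0]) - 1)   # (2-1)(3-1) = 2 df
print(chi2, nu)
```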
