
# INFERENTIAL STATISTICS

Testing the significance of the difference between two means,
two standard deviations, two proportions, or two percentages
is an important area of inferential statistics.

Comparisons of two or more variables often arise in research and
experiments. To draw valid conclusions from the results of a
study, one has to apply a test statistic.
HYPOTHESIS
A hypothesis is a conjecture or statement that aims to
explain certain phenomena in the real world. Many hypotheses,
statistical or not, are products of human curiosity.

To seek answers to their questions, researchers gather and
present evidence, then test the resulting hypothesis using
statistical tools and analysis.
In statistical analysis, assumptions are stated in the form of a
null hypothesis, which is either accepted or rejected depending on
whether the computed test statistic falls in the critical region.
NULL and ALTERNATIVE HYPOTHESES
H0: The null hypothesis is the hypothesis being tested to determine
whether it can be accepted or rejected.
It states that there is no significant relationship or
significant difference between two or more variables, or that
one variable does not affect another variable.
Ha: The alternative hypothesis challenges the null hypothesis.
Examples:
Null Hypothesis: There is no significant difference between the
effectiveness of method A and method B.

Alternative Hypothesis: There is a significant difference
between method A and method B, or
method A is more effective than method B, or
method A is less effective than method B.
Significance Level
To test the null hypothesis of no significant difference
between the two methods, one must first set the level of
significance.

Type I error: accepting the alternative hypothesis when, in fact,
the null hypothesis is true. Its probability is denoted α, the
level of significance.

Type II error: accepting the null hypothesis when, in fact, it is
false. Its probability is denoted β.

The most common level of significance is 5%.

One-Tailed and Two-Tailed Tests
(a) One-Tailed Test: the rejection region lies on one extreme side of the distribution.

$H_a: \mu > \bar{x}$

$H_a: \mu < \bar{x}$

$H_a: \bar{x} > \bar{y}$

$H_a: \bar{x} < \bar{y}$

(b) Two-Tailed Test: the rejection region lies on both sides of the distribution.

$H_a: \mu \neq \bar{x}$

$H_a: \bar{x} \neq \bar{y}$
TESTING HYPOTHESIS
STEPS IN TESTING THE TRUTH OF A HYPOTHESIS
1. Formulate the hypotheses. Denote the null hypothesis as H0 and the
alternative hypothesis as Ha.
2. Set the desired level of significance.
3. Determine the appropriate statistic to be used in testing the null hypothesis.
4. Compute the value of the statistic to be used.
5. Compute the degrees of freedom.
6. Find the tabular value using the table of values for different tests from the
appendix tables.
7. Compare the computed value, CV, to the tabular value, TV.

Decision Rule: If |CV| is less than |TV|, accept the null hypothesis. If
|CV| is greater than |TV|, reject the null hypothesis. Draw a conclusion
from the result of the comparison.
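
The decision rule in step 7 can be sketched as a small helper. The function name `decide` is hypothetical, and treating the boundary case |CV| = |TV| as "accept" is an assumption, since the rule above does not say what to do when the two values are equal.

```python
def decide(computed_value: float, tabular_value: float) -> str:
    """Apply the decision rule: reject H0 when |CV| > |TV|, otherwise accept H0."""
    # Assumption: the tie |CV| == |TV| is treated as "accept H0".
    if abs(computed_value) > abs(tabular_value):
        return "reject H0"
    return "accept H0"


# Illustrative values only: a computed value far in the tail, and one that is not.
print(decide(-6.32, 1.96))
print(decide(1.20, 1.96))
```

Only the magnitudes are compared, so the sign of the computed value does not affect the decision.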
STATISTICS USED FOR TESTING HYPOTHESES

1. Z-TEST (Comparison between the Population Mean and the Sample Mean)
2. T-TEST (Comparison between the Population Mean and the Sample Mean)
3. T-TEST (Concerning Means of Independent Samples)
4. T-TEST (On the Significance of the Difference Between Two Correlated Means)
5. Z-TEST (On the Significance of the Difference Between Two Independent Proportions)
6. ANOVA (Significance of the Difference Between Variances)
DEGREES OF FREEDOM
The degrees of freedom (DF) give the number of independent pieces of
information available for computing variability.

The number of degrees of freedom varies with the size of the sample.

- For a single group, DF = N - 1.
- For two groups, DF = N1 + N2 - 2 for the t-test and DF = N - 2 for Pearson R.
Z-TEST (Comparison between the Population Mean and the Sample Mean)

If the population mean ($\mu$) and the population variance
($\sigma^2$) are known, $\mu$ is compared to a sample
mean $\bar{x}$ using the formula below:

$$Z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$$

Table of Critical Values for Z

| Test Type \ Level of Significance | 0.10 | 0.05 | 0.025 | 0.010 |
|---|---|---|---|---|
| One-tailed Test | ±1.28 | ±1.645 | ±1.96 | ±2.33 |
| Two-tailed Test | ±1.645 | ±1.96 | ±2.24 | ±2.58 |

DECISION RULE:
Reject H0 if |Z| > |Ztabular|
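
The tabular Z values can be cross-checked against the standard normal distribution. The sketch below uses `statistics.NormalDist` from the Python standard library; the helper name `z_critical` is hypothetical and not part of the original slides.

```python
from statistics import NormalDist


def z_critical(alpha: float, two_tailed: bool) -> float:
    """Return the positive critical Z value for a given level of significance."""
    # A two-tailed test splits alpha equally between the two tails.
    tail_area = alpha / 2 if two_tailed else alpha
    return NormalDist().inv_cdf(1 - tail_area)


for alpha in (0.10, 0.05, 0.025, 0.010):
    one = z_critical(alpha, two_tailed=False)
    two = z_critical(alpha, two_tailed=True)
    print(f"alpha = {alpha:<5}  one-tailed: {one:.3f}  two-tailed: {two:.3f}")
```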
Example:
A company that makes battery-operated toy cars claims that its products have a
mean life span of 5 years with a standard deviation of 2 years. Test the null
hypothesis that $\mu = 5$ years against the alternative hypothesis that
$\mu \neq 5$ years if a random sample of 40 toy cars was tested and found to have
a mean life span of only 3 years. Use the 0.05 level of significance.
Solution:
1. H0: The mean life span of the battery-operated toy cars is 5 years. ($\mu = 5$ years)
   Ha: The mean life span of the battery-operated toy cars is not 5 years. ($\mu \neq 5$ years)
2. Level of significance: $\alpha = 0.05$; two-tailed.
3. Given: $\bar{x} = 3$; $\mu = 5$; $n = 40$; $\sigma = 2$. Use the Z-test as the test statistic.
4. Computation:

   $$Z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} = \frac{3 - 5}{2 / \sqrt{40}} \approx -6.32$$

5. Critical regions from the table of Z values: $Z < -1.96$ and $Z > 1.96$.
6. Decision: Reject H0 and accept the proposition that the mean life span of
   the toys is not equal to 5 years, since |Z|, which is 6.32, is greater than
   |Ztabular|, which is 1.96.
7. The difference is significant. (The claim is false.)
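
The computation in this example can be reproduced with a few lines of Python. This is a minimal sketch; the function and variable names are hypothetical, and the tabular value 1.96 is the two-tailed 0.05 entry quoted in the solution.

```python
import math


def z_statistic(sample_mean, population_mean, population_sd, n):
    """Z = (sample mean - population mean) / (population sd / sqrt(n))."""
    standard_error = population_sd / math.sqrt(n)
    return (sample_mean - population_mean) / standard_error


# Values from the toy car example.
z = z_statistic(sample_mean=3, population_mean=5, population_sd=2, n=40)
Z_TABULAR = 1.96  # two-tailed, 0.05 level of significance

reject_h0 = abs(z) > Z_TABULAR
print(f"Z = {z:.2f}, reject H0: {reject_h0}")
```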
T-TEST (Comparison between the Population Mean and the Sample Mean)
The T-test can be used to compare the means when the population mean is
known but the population variance is unknown. When the population
standard deviation is unknown but the sample standard deviation can
be computed, the t-test is used instead of the z-test.
Example:
The average length of time for people to vote using the old procedure during a
presidential election period in precinct A is 55 minutes. Using computerization
as a new election method, a random sample of 20 registrants was found to have a
mean voting time of 30 minutes with a standard deviation of 1.5 minutes. Test
the significance of the difference between the population mean and the sample
mean.

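
A possible computation for the voting example is sketched below. It assumes the usual one-sample statistic t = (x̄ - μ) / (s / √n) with df = n - 1. The example does not state a level of significance, so the 5% two-tailed tabular value for df = 19 (2.093, from a standard t table) is an assumption made here for illustration.

```python
import math


def one_sample_t(sample_mean, population_mean, sample_sd, n):
    """Return (t, df) for a one-sample t-test using the sample standard deviation."""
    standard_error = sample_sd / math.sqrt(n)
    t = (sample_mean - population_mean) / standard_error
    return t, n - 1


# Values from the voting example.
t_value, df = one_sample_t(sample_mean=30, population_mean=55, sample_sd=1.5, n=20)

# Assumption: 5% level, two-tailed, df = 19 (not stated in the example).
T_TABULAR = 2.093

reject_h0 = abs(t_value) > T_TABULAR
print(f"t = {t_value:.2f}, df = {df}, reject H0: {reject_h0}")
```

Under these assumptions the difference between the old mean of 55 minutes and the sample mean of 30 minutes is significant.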
T-Test Concerning Means of Independent Samples:

When two samples are drawn from normally distributed
populations with the assumption that their variances are
equal, the T-test for independent samples can be used.
EXAMPLE
A course in Physics was taught to 10 students using the traditional method
(method A). Another group of 11 students went through the same course using
another method (method B). At the end of the semester, the same test was
administered to each group. The 10 students under method A got an average of 82
with a standard deviation of 5, while the 11 students under method B got an
average of 78 with a standard deviation of 6. Test the null hypothesis of no
significant difference in the performance of the two groups of students at the
5% level of significance.

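
The Physics example can be worked through with the standard pooled-variance t statistic, sketched below. The formula on the original slide was not recovered, so the pooled form (which matches the equal-variance assumption and df = N1 + N2 - 2) is an assumption, as is the function name `pooled_t`. The tabular value 2.093 is the two-tailed 5% entry for df = 19 in a standard t table.

```python
import math


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Return (t, df) for two independent samples with equal variances assumed."""
    df = n1 + n2 - 2
    # Pooled variance: weighted average of the two sample variances.
    pooled_variance = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    standard_error = math.sqrt(pooled_variance * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / standard_error, df


# Method A: 10 students, mean 82, sd 5. Method B: 11 students, mean 78, sd 6.
t_value, df = pooled_t(82, 5, 10, 78, 6, 11)

T_TABULAR = 2.093  # two-tailed, 5% level, df = 19
reject_h0 = abs(t_value) > T_TABULAR
print(f"t = {t_value:.3f}, df = {df}, reject H0: {reject_h0}")
```

With these inputs |t| stays below the tabular value, so under this sketch the null hypothesis of no significant difference is not rejected.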
T-Test on the Significance of the Difference Between Two
Correlated Means
When comparing two correlated means, the T-test is the
appropriate test statistic. A typical example is
comparing the results of a pre-test and a post-test
administered to the same group of individuals. The two tests
must be the same.
Example:

To determine whether the students' performance in College Algebra improved after
enrolling in the subject for one term, a 60-item pre-test and post-test were
administered to them on the first and the last days of classes, respectively. The same
test was given as pre-test and post-test.

| Student | Pre-test Score | Post-test Score | Difference, d | $d^2$ |
|---|---|---|---|---|
| A | 34 | 45 | -11 | 121 |
| B | 23 | 32 | -9 | 81 |
| C | 40 | 46 | -6 | 36 |
| D | 31 | 57 | -26 | 676 |
| E | 24 | 39 | -15 | 225 |
| F | 45 | 48 | -3 | 9 |
| G | 27 | 27 | 0 | 0 |
| H | 32 | 33 | -1 | 1 |
| I | 12 | 18 | -6 | 36 |
| J | 45 | 45 | 0 | 0 |
| Total | | | -77 | 1,185 |
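SOLUTION:
1. H0: The students' performance in Algebra did not improve.
   Ha: The students' performance in Algebra improved.

2. One-tailed test.

3. The T-test will be used.

4. Computations
   Mean difference: $\bar{d} = \frac{\sum d}{n} = \frac{-77}{10} = -7.7$
   Sample variance: $s_d^2 = \frac{\sum d^2 - (\sum d)^2 / n}{n - 1} = \frac{1{,}185 - (-77)^2 / 10}{9} \approx 65.79$
   Test statistic: $T = \frac{\bar{d}}{\sqrt{s_d^2 / n}} = \frac{-7.7}{\sqrt{65.79 / 10}} \approx -3.00$
5. df = N - 1 = 10 - 1 = 9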
6. Tabular value = 2.821 (one-tailed, df = 9, 1% level).

7. Reject H0 since |T| > |Ttabular|. This means that the performance of the
students in College Algebra significantly improved.
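
The column totals and the T value for this example can be checked from the raw scores. The sketch below assumes the usual paired-difference statistic, T = d̄ / √(s_d² / n) with d = pre-test minus post-test, as in the table; the variable names are hypothetical.

```python
import math

# Scores for students A to J, taken from the table above.
pre = [34, 23, 40, 31, 24, 45, 27, 32, 12, 45]
post = [45, 32, 46, 57, 39, 48, 27, 33, 18, 45]

d = [a - b for a, b in zip(pre, post)]   # differences, pre minus post
n = len(d)
sum_d = sum(d)
sum_d2 = sum(x * x for x in d)

mean_d = sum_d / n
# Sample variance of the differences (n - 1 in the denominator).
var_d = (sum_d2 - sum_d ** 2 / n) / (n - 1)
t_value = mean_d / math.sqrt(var_d / n)
df = n - 1

print(f"sum d = {sum_d}, sum d^2 = {sum_d2}")
print(f"variance = {var_d:.2f}, T = {t_value:.2f}, df = {df}")
```

The magnitude of T, about 3.00, exceeds the tabular value quoted in step 6, which agrees with the decision to reject H0.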
Z-TEST on the Significance of the Difference Between Two Independent
Proportions

To determine whether there is a significant difference between the proportions
of two variables, the Z-test is used.

Example:

A sample survey on a presidential candidate in the Philippines shows that 120 of 200
male voters dislike candidate X and 175 of 250 female voters dislike the same
candidate. Determine whether the difference between the sample proportions,
$\frac{120}{200}$ and $\frac{175}{250}$, is significant or not at the 1% level of
significance.

H0: There is no significant difference between the proportions of male and
female voters who dislike candidate X.
Ha: There is a significant difference between the proportions of male and
female voters who dislike candidate X.
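
One way to carry out this test is with the pooled two-proportion Z statistic, sketched below. The formula on the original slide was not recovered, so the pooled form is an assumption (some texts use an unpooled standard error instead), and the function name is hypothetical. The tabular value 2.58 is the two-tailed 0.010 entry from the Z table.

```python
import math


def two_proportion_z(x1, n1, x2, n2):
    """Z statistic for two independent proportions, using a pooled proportion."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / standard_error


# 120 of 200 male voters and 175 of 250 female voters dislike candidate X.
z = two_proportion_z(120, 200, 175, 250)

Z_TABULAR = 2.58  # two-tailed, 1% level of significance
reject_h0 = abs(z) > Z_TABULAR
print(f"Z = {z:.3f}, reject H0: {reject_h0}")
```

Under these assumptions |Z| is smaller than the tabular value, so the difference between the two proportions is not significant at the 1% level.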
SIGNIFICANCE OF THE DIFFERENCE BETWEEN VARIANCES

ANALYSIS OF VARIANCE

When the variances of two or more independent samples differ, the appropriate
test statistic to determine the significance of such difference is the Analysis
of Variance (ANOVA), which makes use of the F ratio or the variance ratio.

The various groups being compared are assumed to belong to a population with
normal distribution, each group randomly selected and independent from
the other groups. The variables from each group also have standard
deviations that are approximately equal.

Steps in Using the Analysis of Variance

1. State the null hypothesis.
2. Set the level of significance.
3. Complete the ANOVA Table. (See next slides)
4. Find the tabular value of F at the given level of significance (Appendix E).
5. Accept the null hypothesis if the computed value of F is less than the tabular
value and reject if it is greater than the tabular value.
6. Interpret the result.
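The ANOVA Table

| Source of Variation | Sum of Squares | df | Mean Square | F |
|---|---|---|---|---|
| Between | SSB | k - 1 | MSB = SSB / (k - 1) | F = MSB / MSW |
| Within | SSW | N - k | MSW = SSW / (N - k) | |
| Total | TSS | N - 1 | | |

Here k is the number of groups and N is the total number of observations.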
Example 7

Determine who among the three salesmen will most likely be promoted based
on their monthly sales in pesos. Use level of significance.

| | A | B | C | A^2 | B^2 | C^2 |
|---|---|---|---|---|---|---|
| | 18,800 | 19,000 | 16,000 | 353,440,000 | 361,000,000 | 256,000,000 |
| | ... | ... | ... | ... | ... | ... |
| Total | 130,000 | 129,188 | 128,599 | 1,951,220,000 | 1,895,218,544 | 1,858,674,201 |
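The ANOVA Table

| Source of Variation | Sum of Squares | df | Mean Square | F |
|---|---|---|---|---|
| Between | 109,966 | 2 | 54,983 | 0.00974 |
| Within | 135,419,173 | 24 | 5,642,465.542 | |
| Total | 135,529,139 | 26 | | |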
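
The entries of this ANOVA table can be recomputed from the column totals and sums of squares given in the example. The sketch below uses the standard one-way ANOVA sums-of-squares formulas; the function name is hypothetical, and the group size of 9 per salesman is an assumption inferred from the total df of 26.

```python
def anova_from_sums(totals, sums_of_squares, sizes):
    """One-way ANOVA from per-group totals, sums of squared values, and group sizes."""
    k = len(totals)                      # number of groups
    n_total = sum(sizes)                 # total number of observations
    correction = sum(totals) ** 2 / n_total
    tss = sum(sums_of_squares) - correction
    ssb = sum(t * t / n for t, n in zip(totals, sizes)) - correction
    ssw = tss - ssb
    msb = ssb / (k - 1)
    msw = ssw / (n_total - k)
    return {
        "SSB": ssb, "SSW": ssw, "TSS": tss,
        "df_between": k - 1, "df_within": n_total - k,
        "MSB": msb, "MSW": msw, "F": msb / msw,
    }


result = anova_from_sums(
    totals=[130_000, 129_188, 128_599],
    sums_of_squares=[1_951_220_000, 1_895_218_544, 1_858_674_201],
    sizes=[9, 9, 9],  # assumption: equal group sizes, inferred from total df = 26
)
for name, value in result.items():
    print(f"{name}: {value:,.5f}")
```

Small differences in the last digit of the sums of squares can arise from rounding in the printed table.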
Correlation Analysis
The measure of relationship between two variables is called
correlation.
Correlation analysis is a method of measuring the strength
of such a relationship between two variables.

When two social, physical, or biological phenomena increase
proportionately and simultaneously because of external factors, the
phenomena are positively correlated. If one increases in the same
proportion that the other decreases, the two phenomena are negatively
correlated. Investigators calculate the degree of correlation by applying a
coefficient of correlation to data concerning the two phenomena.
Examples of Correlated Variables

1. The students' mental ability and academic performance in school are
   related.
2. There is a close relationship between reading comprehension and
   mathematical ability.
3. The larger the mass of a body, the greater the amount of heat energy
   required to melt it.
4. In physics, the larger the force exerted to push a body, the greater the
   acceleration of the body.
5. In the linear equation y = x + 1, the higher the value assigned to x,
   the higher the corresponding value of the dependent variable y.
Measures of Correlation

1. Pearson Product-Moment Correlation Coefficient
2. Spearman's Rank Correlation Coefficient
Pearson Product-Moment Correlation Coefficient (Pearson R)
The most common statistical tool for measuring the linear
relationship between two random variables, x and y, is the
linear correlation coefficient, commonly called the Pearson
Product-Moment Correlation Coefficient or Pearson R for short.
The formula was developed and perfected by Karl Pearson, who made
behavioral studies of humans.
It became the basis of different theories in the fields of
heredity, psychology, anthropometry, and statistics. It can be
used to determine the linearity of the relationship between
two variables.
| r | Verbal Interpretation |
|---|---|
| 0.00 to 0.20 | Slight correlation |
| 0.61 to 0.80 | High correlation |

Example 1.
Test the hypothesis that there is no significant correlation between mental
ability and English Proficiency at 5% level of significance.

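Mental Ability and English Proficiency Test Scores

| Mental Ability (x) | English Proficiency (y) |
|---|---|
| 50 | 200 |
| 54 | 198 |
| 50 | 200 |
| 51 | 203 |
| 49 | 186 |
| 46 | 205 |
| 48 | 185 |
| 47 | 197 |
| 44 | 183 |
| 44 | 171 |
| 46 | 179 |
| 45 | 185 |
| 48 | 184 |
| 53 | 190 |
| 54 | 191 |
| 33 | 170 |
| 34 | 168 |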
Solution:
1. H0: There is no significant correlation between mental ability and English
proficiency.
2. $\alpha$ = 5%
3. Pearson r will be used to test the hypothesis
4. Computation

| Mental Ability (x) | English Proficiency (y) | xy | $x^2$ | $y^2$ |
|---|---|---|---|---|
| 50 | 200 | 10,000 | 2,500 | 40,000 |
| 54 | 198 | 10,692 | 2,916 | 39,204 |
| 50 | 200 | 10,000 | 2,500 | 40,000 |
| 51 | 203 | 10,353 | 2,601 | 41,209 |
| 49 | 186 | 9,114 | 2,401 | 34,596 |
| 46 | 205 | 9,430 | 2,116 | 42,025 |
| 48 | 185 | 8,880 | 2,304 | 34,225 |
| 47 | 197 | 9,259 | 2,209 | 38,809 |
| 44 | 183 | 8,052 | 1,936 | 33,489 |
| 44 | 171 | 7,524 | 1,936 | 29,241 |
| 46 | 179 | 8,234 | 2,116 | 32,041 |
| 45 | 185 | 8,325 | 2,025 | 34,225 |
| 48 | 184 | 8,832 | 2,304 | 33,856 |
| 53 | 190 | 10,070 | 2,809 | 36,100 |
| 54 | 191 | 10,314 | 2,916 | 36,481 |
| 33 | 170 | 5,610 | 1,089 | 28,900 |
| 34 | 168 | 5,712 | 1,156 | 28,224 |
| Total: 796 | 3,195 | 150,401 | 37,834 | 602,625 |
5. df = N - 2 = 17 - 2 = 15.
6. Tabular value = 0.482 (from the table).
7. Reject the null hypothesis because the computed value, 0.727, is greater
than the tabular value, 0.482.
8. There is a significant linear relationship between mental ability and
English proficiency. The verbal interpretation of r shows that there is a high
correlation between the two variables.
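
The computed value of 0.727 in step 7 can be reproduced from the raw scores. The sketch below assumes the usual raw-score form of Pearson r, r = (NΣxy - ΣxΣy) / √([NΣx² - (Σx)²][NΣy² - (Σy)²]); the variable names are hypothetical.

```python
import math

# Mental ability (x) and English proficiency (y) scores from the table.
x = [50, 54, 50, 51, 49, 46, 48, 47, 44, 44, 46, 45, 48, 53, 54, 33, 34]
y = [200, 198, 200, 203, 186, 205, 185, 197, 183, 171, 179, 185, 184, 190, 191, 170, 168]

n = len(x)
sum_x, sum_y = sum(x), sum(y)
sum_xy = sum(a * b for a, b in zip(x, y))
sum_x2 = sum(a * a for a in x)
sum_y2 = sum(b * b for b in y)

numerator = n * sum_xy - sum_x * sum_y
denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
r = numerator / denominator

R_TABULAR = 0.482  # tabular value quoted in step 6 (df = 15, 5% level)
print(f"r = {r:.3f}, df = {n - 2}, reject H0: {abs(r) > R_TABULAR}")
```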
Spearman's Rank Correlation Coefficient ($\rho$)

When the entries in a set of data are ranks, Spearman's
Rank Correlation Coefficient (also known as the Spearman
rho) is used in hypothesis testing.
Example:
Rank the performance of the following students in their history and literature
classes. Then use the Spearman rho coefficient to test the difference between
their ranks. Use the 5% level of significance.

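Performance in History and Literature

| Student | History | Literature |
|---|---|---|
| A | 78 | 79 |
| B | 77 | 80 |
| C | 88 | 85 |
| D | 84 | 78 |
| E | 80 | 89 |
| F | 85 | 80 |
| G | 79 | 80 |
| H | 88 | 85 |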
Solution:
1. H0: There is no significant difference between the performances of the
students in the two subjects.
2. $\alpha$ = 5%
3. Spearman rho will be used to test H0.
4. Computation

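Performance in History and Literature, with Ranks

| Student | History | Literature | Rx | Ry | D | D^2 |
|---|---|---|---|---|---|---|
| A | 78 | 79 | 7 | 7 | 0 | 0 |
| B | 77 | 80 | 8 | 5 | 3 | 9 |
| C | 88 | 85 | 1.5 | 2.5 | -1 | 1 |
| D | 84 | 78 | 4 | 8 | -4 | 16 |
| E | 80 | 89 | 5 | 1 | 4 | 16 |
| F | 85 | 80 | 3 | 5 | -2 | 4 |
| G | 79 | 80 | 6 | 5 | 1 | 1 |
| H | 88 | 85 | 1.5 | 2.5 | -1 | 1 |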
5. df = N - 2 = 8 - 2 = 6
6. Tabular value = 0.829
7. The null hypothesis is accepted, since the CV is less than the TV.
8. There is no significant difference between the
performances of the students.
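
The ranks, the D^2 column, and the computed value for this example can be checked with the sketch below. It assumes the common formula ρ = 1 - 6ΣD² / (N(N² - 1)), since the formula on the original slide was not recovered, and it gives tied scores the average of the ranks they occupy, as in the table. The helper name `average_ranks` is hypothetical.

```python
def average_ranks(scores):
    """Rank scores from highest (rank 1) to lowest; tied scores share the mean rank."""
    ordered = sorted(scores, reverse=True)
    ranks = []
    for score in scores:
        first = ordered.index(score) + 1   # best rank occupied by this score
        count = ordered.count(score)       # how many students share the score
        ranks.append(first + (count - 1) / 2)
    return ranks


# Scores for students A to H, taken from the table.
history = [78, 77, 88, 84, 80, 85, 79, 88]
literature = [79, 80, 85, 78, 89, 80, 80, 85]

rx = average_ranks(history)
ry = average_ranks(literature)
sum_d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))

n = len(history)
rho = 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

RHO_TABULAR = 0.829  # tabular value quoted in step 6
print(f"sum D^2 = {sum_d2}, rho = {rho:.3f}, reject H0: {abs(rho) > RHO_TABULAR}")
```

With ties present, this shortcut formula is only an approximation of the rank correlation, but it follows the ranking convention used in the table.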