
How does anxiety affect Statistics performance?

How does a student's IQ influence the student's result in a doctoral seminar?

How do the student's IQ and the student's preparation effort influence the
student's result in a doctoral seminar?

Is there a significant difference in the Mathematics achievement of
male and female students?

How does brand name affect sales?


ED 202
Tests of Significance or
Hypothesis test
Test of Significance

Test of Significance:
Assess the evidence provided by data in favor of
some claim about the population
Start by setting up a hypothesis
This is a statement about a population parameter
Results of the test are expressed in terms of a
probability
Usually called a p-value
This probability (p-value) measures how well the data
and the hypothesis agree.

HYPOTHESES

A hypothesis can be considered an educated guess, or
speculation, that assists the researcher in
seeking the answer to the research
problem
Expected outcome
Three parts: independent variable (IV), dependent variable (DV), population
Derived from theory
What do hypotheses really do?
It guides the direction of the study
It limits what shall be studied and what shall not
It identifies facts that are relevant and those
that are not
It suggests which form of research design is
likely to be most appropriate
Provides a framework for organizing the
conclusions that result from the conduct of the
research process
Since the hypothesis conjectures a
relationship between variables, a good
hypothesis should be
stated clearly and unambiguously in the
form of a declarative sentence
Kinds of Hypotheses

Null hypothesis
Alternative hypothesis
ALTERNATIVE HYPOTHESIS
Predicts a relationship between two or more
variables
It is called a research hypothesis because its
formulation is based on gathering empirical
evidence and deducing from theory
The statement we hope or suspect is true
What we are trying to prove or the effect we
are hoping to see
Also called the “research hypothesis”,
“empirical hypothesis” or “substantive
hypothesis”
Denoted by H1 or Ha
RESEARCH HYPOTHESIS
Examples
Non-directional: The educational history of high school
freshmen is related to their achievement
Affirms the relationship but does not specify its direction
Directional: A more favorable organizational climate is
associated with greater efficiency at work
Suggests directionality between variables
NULL HYPOTHESIS
The null hypothesis assumes that any kind of
difference or significance you see in a set of
data is due to chance.
It is considered null because it denies the
existence of any relationship between or among
variables
Denoted by H0.
Also called the “statistical hypothesis” because
it is more suitable for the application of
statistical tests
Why null hypothesis?
Statistically speaking, we temporarily adopt the
critical stance that our independent variable does
NOT matter, and then we test that stance. Rejecting
or failing to reject the null hypothesis provides
support or no support for the research
hypothesis. The fate of the research hypothesis
depends upon what happens to H0
Hypotheses flow from the
research problem

Research problem: What is the effect of a seminar-workshop
on the attitudes of teachers toward
seminars?
Research hypothesis: Teachers’ attitudes toward seminars will
improve as a result of attending a
seminar-workshop
Null hypothesis: There is no difference in teachers’
attitudes toward seminars before and after
the workshop
Types of error
True State | Test Result: H0 True | Test Result: H0 False
H0 True | Correct Decision | Type I Error
H0 False | Type II Error | Correct Decision

α = P(Type I Error)    β = P(Type II Error)

• Goal: Keep α, β reasonably small
Level of Significance
• The probability of a Type I error (α) is also called the level of significance of the test.
• It is conventional to adopt a level of significance of either
0.05 or 0.01. If, for example, a 5% level of significance is
chosen, then there are about 5 chances in 100 that we
would reject the hypothesis when it should be
accepted, i.e., we are 95% confident that we made the
right decision.
• In such a case the hypothesis has been rejected at the
0.05 level of significance, which means that we could
be wrong with a probability of 0.05.

Significance Level

α ≡ threshold for “significance”


We set α
For example, if we choose α = 0.05, we
require evidence so strong that it would occur
no more than 5% of the time when H0 is true
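
The meaning of α can be illustrated with a small simulation. This is only a sketch under stated assumptions: the data are drawn from a normal population with mean 0 and known standard deviation 1 (so H0 is true by construction), a two-tailed z-test is used, and the sample size, number of trials and seed are arbitrary choices that do not come from the slides.

```python
import random

def type_i_error_rate(z_cutoff=1.96, n=20, trials=20000, seed=0):
    """Share of samples drawn under a true H0 that a two-tailed z-test rejects."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        # H0 is true by construction: population mean 0, known sigma 1.
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        z = (sum(sample) / n) / (1.0 / n ** 0.5)
        if abs(z) > z_cutoff:  # 1.96 is the two-tailed cutoff for alpha = 0.05
            rejections += 1
    return rejections / trials

rate = type_i_error_rate()
print(f"rejection rate when H0 is true: {rate:.3f}")
```

The printed rate should land close to 0.05: about 5 samples in 100 lead to a rejection even though H0 is true.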

Basics of Significance Testing, 3/9/2018
One-tailed & Two-tailed test
A one-tailed test considers only one end of the distribution. It is also called a
directional test.

Ha: μ > μ0
or μ < μ0

One-tailed & Two-tailed test
A two-tailed test considers both ends of the distribution.
There is no prediction concerning the direction of the difference in
means.

Ha: μ ≠ μ0

Rejection Region or Critical Region
It is the area of the sampling distribution that lies beyond the test statistic’s
critical value; when the score falls within this region, the null
hypothesis is rejected (Jackson, 2012)

P-value
• P-value ≡ the probability that the test statistic
would take a value as extreme as or more extreme
than the observed test statistic, when H0 is true
Smaller-and-smaller P-values → stronger-and-stronger
evidence against H0
Conventions for interpretation
P > .10 → evidence against H0 not significant
.05 < P ≤ .10 → evidence marginally significant
.01 < P ≤ .05 → evidence against H0 significant
P ≤ .01 → evidence against H0 very significant
DECISION

Decision rule
P ≤ α → statistically significant evidence
P > α → nonsignificant evidence
For example, if we set α = 0.01, a P-value of
0.0006 is considered significant.
This means we reject the null
hypothesis in favor of the alternative
hypothesis.
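
The decision rule and the interpretation conventions for P-values can be written as two small functions. This is a minimal sketch; the function names `decide` and `interpret` and the wording of the returned labels are illustrative choices, not part of the slides.

```python
def interpret(p):
    """Label a P-value using the conventional cut-offs .10, .05 and .01."""
    if p > 0.10:
        return "not significant"
    if p > 0.05:
        return "marginally significant"
    if p > 0.01:
        return "significant"
    return "very significant"

def decide(p, alpha):
    """Apply the decision rule: reject H0 when P <= alpha."""
    return "reject H0" if p <= alpha else "fail to reject H0"

print(decide(0.0006, alpha=0.01))   # the example from the slide
print(decide(0.284, alpha=0.05))
print(interpret(0.03))
```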

Statistical Treatment
Data | Parametric test
Predicting a quantitative variable, Y, from a quantitative variable, X | Linear regression
Comparing the mean of one sample: determining whether a sample comes from a population with a specific mean | One sample t-test
One Quantitative Response Variable – Two Values from Paired Samples | Paired sample t-test
One Quantitative Response Variable – One Qualitative Independent Variable with two groups | Two independent sample t-test
One Quantitative Response Variable – One Qualitative Independent Variable with three or more groups | ANOVA
How does anxiety affect Statistics performance? | LINEAR REGRESSION
How does a student's IQ influence the student's result in a doctoral seminar? | SIMPLE REGRESSION
How do the student's IQ and the student's preparation effort influence the student's result in a doctoral seminar? | MULTIPLE REGRESSION
Is there a significant difference in the Mathematics achievement of male and female students? | INDEPENDENT TWO-SAMPLE T-TEST
How does brand name affect sales? | ANOVA
ONE SAMPLE TEST

A typical college student spends an average of 2.80 hours a day
using a computer. A sample of 13 students at The ABC
University revealed the following number of hours per day
using the computer:
• 3.15 3.25 2.00 2.50 2.65 2.75 2.35 2.85
2.95 2.45 1.95 2.35 3.75
• Can we conclude that the mean number of hours per day
using the computer by students at The University is the same
as the typical student’s usage? Use the hypothesis testing
procedure and the 0.05 significance level.
H0: The mean number of hours per day using the
computer by students at The University is the same as
the typical student’s usage
Ha: The mean number of hours per day using the
computer by students at The University is not the same
as the typical student’s usage
Statistical tool: ONE SAMPLE T-TEST
Analysis: [software output not reproduced]

DECISION:
We fail to reject the null hypothesis (p > 0.05)
Conclusion: There is not enough evidence to conclude that the
mean number of hours per day using
the computer by students at The University
differs from the typical student’s usage
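
The one-sample test above can be checked by hand from the 13 observations. The sketch below uses only the Python standard library and compares the t statistic with a critical value instead of computing a P-value; the critical value 2.179 (two-tailed, α = 0.05, df = 12) is taken from a standard t table and is an assumption added here, not a number from the slides.

```python
import math
import statistics

hours = [3.15, 3.25, 2.00, 2.50, 2.65, 2.75, 2.35,
         2.85, 2.95, 2.45, 1.95, 2.35, 3.75]
mu0 = 2.80        # hypothesized mean from the problem statement
t_crit = 2.179    # assumed t-table value: two-tailed, alpha = 0.05, df = 12

n = len(hours)
mean = statistics.mean(hours)
sd = statistics.stdev(hours)              # sample standard deviation (n - 1)
t = (mean - mu0) / (sd / math.sqrt(n))    # one-sample t statistic

print(f"n = {n}, mean = {mean:.3f}, sd = {sd:.3f}, t = {t:.2f}")
print("reject H0" if abs(t) > t_crit else "fail to reject H0")
```

Because |t| is well below the critical value, the sketch reaches the same decision as the slide: fail to reject H0.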
TWO SAMPLE INDEPENDENT T-TEST
• A teacher drew a sample of students in his Math class and randomly assigned 10
of them to an experimental group and 10 to a control group. The teacher
taught the female group (experimental) by use of PSI and the male group (control) with the traditional
technique. At the end of the semester, a standardized mathematics
achievement test was given to both groups. On the basis of these data, should
the teacher conclude that PSI is more effective than the traditional
method?
Female Male
25 30
23 18
40 20
50 25
48 45
27 20
30 16
Hypotheses
H0: There is no significant difference between
the two methods
Ha: There is a significant difference between
the two methods
• Statistical tool: TWO SAMPLE INDEPENDENT T-TEST
• Analysis:

Minitab output…
Two-sample T for experimental vs control

              N   Mean  StDev  SE Mean
experimental  10  28.8  13.3   4.2
control       10  22.7  11.4   3.6

Difference = mu experimental - mu control
Estimate for difference: 6.10
95% CI for difference: (-5.52, 17.72)
T-Test of difference = 0 (vs not =): T-Value = 1.10  P-Value = 0.284  DF = 18
Both use Pooled StDev = 12.4

DECISION: We fail to reject the null hypothesis (p > 0.05)
Conclusion:
Therefore, we do not have enough
evidence to conclude that there is a
significant difference between the two
methods.
The data do not show that the experimental group
outperformed the control group.
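
The T-Value and pooled standard deviation in the Minitab output can be recomputed from the summary statistics it reports. The sketch below assumes the equal-variance (pooled) formula that the output names; `pooled_t` is a hypothetical helper, and the critical value 2.101 (two-tailed, α = 0.05, df = 18) is taken from a standard t table rather than from the slides.

```python
import math

def pooled_t(n1, mean1, sd1, n2, mean2, sd2):
    """Pooled-variance two-sample t statistic from summary statistics."""
    df = n1 + n2 - 2
    pooled_sd = math.sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df)
    se_diff = pooled_sd * math.sqrt(1 / n1 + 1 / n2)
    return (mean1 - mean2) / se_diff, df, pooled_sd

# N, Mean and StDev rows for "experimental" and "control" in the output.
t, df, pooled_sd = pooled_t(10, 28.8, 13.3, 10, 22.7, 11.4)
t_crit = 2.101   # assumed t-table value: two-tailed, alpha = 0.05, df = 18

print(f"t = {t:.2f}, df = {df}, pooled sd = {pooled_sd:.1f}")
print("reject H0" if abs(t) > t_crit else "fail to reject H0")
```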

Significance of Difference
Between Two Means of
Correlated Samples

• Two related samples are compared. The matching may be achieved by using
each subject as his own control or by pairing subjects.
• This is sometimes called the difference method or PAIRED T-TEST

Example
• The following are the pre- and post-test scores for ten subjects in an
experiment to determine whether learning has taken place as a result of
some specific experience. Test the significance of the difference between
the two means at the 5% level using a directional test.
Subject no.  Pretest  Posttest
1            16       20
2            11       8
3            8        9
4            12       13
5            7        10
6            14       17
7            9        11
8            13       15
9            10       12
Hypotheses
H0: There is no significant difference
between the two tests
Ha: The post-test mean is significantly higher
than the pre-test mean
Significance level: 0.05
Analysis: [software output not reproduced]


DECISION
The p-value (0.008) is less than the 0.05 level of significance. Thus, we
reject the null hypothesis

Conclusion:
Therefore, we have enough evidence to
conclude that the post-test mean is
significantly higher than the pre-test mean.
This implies that exposure of the
students to the specific experience did
increase their achievement.
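
The paired (difference) method can be sketched as follows. Note the assumptions: only nine of the ten pre/post rows appear in the table, so the code uses those nine rows and its t statistic will not reproduce the reported p-value of 0.008 exactly; the critical value 1.860 (one-tailed, α = 0.05, df = 8) is taken from a standard t table and is not part of the slides.

```python
import math
import statistics

# Only the nine (pretest, posttest) rows that appear in the table.
pre = [16, 11, 8, 12, 7, 14, 9, 13, 10]
post = [20, 8, 9, 13, 10, 17, 11, 15, 12]

diffs = [b - a for a, b in zip(pre, post)]   # posttest minus pretest
n = len(diffs)
mean_d = statistics.mean(diffs)
sd_d = statistics.stdev(diffs)               # sample sd of the differences
t = mean_d / (sd_d / math.sqrt(n))           # paired t statistic, df = n - 1

t_crit = 1.860   # assumed t-table value: one-tailed, alpha = 0.05, df = 8
print(f"mean difference = {mean_d:.2f}, t = {t:.2f}, df = {n - 1}")
print("reject H0" if t > t_crit else "fail to reject H0")
```

Even on the nine available rows the directional test rejects H0, which agrees with the decision on the slide.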

Example
Suppose that a researcher wished to learn if a particular chemical is toxic to a
certain species of beetle. She believes that the chemical might interfere with the
beetle’s reproduction. She obtained beetles and divided them into two groups.
She then fed one group of beetles with the chemical and used the second group
as a control. After 2 weeks, she counted the number of eggs produced by each
beetle in each group. The egg counts for the beetles in each group are below.
Group 1          Group 2
fed chemical     not fed chemical
33               35
31               42
34               43
38               41
32
28
• The researcher believes that the chemical interferes with
beetle reproduction. She suspects that the chemical
reduces egg production. Her hypotheses are:
• H0: There is no significant difference in the mean number of
eggs between the two groups
• Ha: The mean number of eggs in group 1 is less than
the mean number of eggs in group 2.
• A t-test can be used to test whether the two
means differ.
• This is a 1-tailed test because her hypothesis proposes
that group 2 will have greater reproduction than group
1. If she had proposed that the two groups would have
different reproduction but was not sure which group
would be greater, then it would be a 2-tailed test.

Minitab output
Two-sample T for C1 vs C2

    N  Mean   StDev  SE Mean
C1  6  32.67  3.33   1.4
C2  4  40.25  3.59   1.8

Difference = mu C1 - mu C2
Estimate for difference: -7.58
95% upper bound for difference: -3.47
T-Test of difference = 0 (vs <): T-Value = -3.43  P-Value = 0.005  DF = 8
Both use Pooled StDev = 3.43

Decision
The researcher concludes that the mean
of group 1 is significantly less than the
mean of group 2 because the P-value
(0.005) < 0.05. She rejects the null hypothesis
in favor of her alternative hypothesis that the chemical
reduces egg production, because group 1
had significantly fewer eggs than the
control.
Conclusion
Therefore, the chemical interferes with
beetle reproduction. It can reduce egg
production.
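
The beetle example is small enough to recompute from the raw egg counts. The sketch below assumes the pooled-variance t-test named in the Minitab output and a one-tailed critical value of 1.860 (α = 0.05, df = 8) from a standard t table; that critical value and the variable names are assumptions added for illustration.

```python
import math
import statistics

fed = [33, 31, 34, 38, 32, 28]   # group 1: fed the chemical
control = [35, 42, 43, 41]       # group 2: not fed the chemical

n1, n2 = len(fed), len(control)
df = n1 + n2 - 2
# Pooled variance weights each sample variance by its degrees of freedom.
pooled_var = ((n1 - 1) * statistics.variance(fed)
              + (n2 - 1) * statistics.variance(control)) / df
se_diff = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
diff = statistics.mean(fed) - statistics.mean(control)
t = diff / se_diff

t_crit = 1.860                          # assumed t-table value: one-tailed, alpha = 0.05, df = 8
upper_bound = diff + t_crit * se_diff   # one-sided 95% upper bound for the difference

print(f"difference = {diff:.2f}, t = {t:.2f}, df = {df}")
print(f"95% upper bound for difference = {upper_bound:.2f}")
print("reject H0" if t < -t_crit else "fail to reject H0")
```

Since the alternative is "group 1 is less than group 2", H0 is rejected only when t falls below the negative critical value.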

ANOVA:
Analysis of Variance

ED 202
Lynn Mangin - Remo
What does ANOVA do?
At its simplest (there are extensions) ANOVA
tests the following hypotheses:
H0: The means of all the groups are equal.

Ha: Not all the means are equal
doesn’t say how or which ones differ.
Can follow up with “multiple comparisons”

Note: we usually refer to the sub-populations as “groups” when doing ANOVA.
Assumptions of ANOVA
• each group is approximately normal
  • check this by looking at histograms and/or
    normal quantile plots, or use assumptions
  • can handle some nonnormality, but not
    severe outliers
• standard deviations of each group are
approximately equal
  • rule of thumb: ratio of largest to smallest
    sample st. dev. must be less than 2:1
An example ANOVA situation
Subjects: 25 patients with blisters
Treatments: Treatment A, Treatment B, Placebo
Measurement: # of days until blisters heal

Data [and means]:


• A: 5,6,6,7,7,8,9,10 [7.25]
• B: 7,7,8,9,9,10,10,11 [8.875]
• P: 7,9,9,10,10,10,11,12,13 [10.11]

Are these differences significant?
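
Before running ANOVA on these data, the equal-spread rule of thumb (ratio of largest to smallest sample standard deviation below 2:1) can be checked directly. A minimal sketch using only the standard library; the dictionary layout and names are illustrative choices.

```python
import statistics

days = {
    "A": [5, 6, 6, 7, 7, 8, 9, 10],
    "B": [7, 7, 8, 9, 9, 10, 10, 11],
    "P": [7, 9, 9, 10, 10, 10, 11, 12, 13],
}

# Sample standard deviation (n - 1 denominator) for each group.
sds = {group: statistics.stdev(values) for group, values in days.items()}
ratio = max(sds.values()) / min(sds.values())

for group, sd in sds.items():
    print(f"{group}: sd = {sd:.3f}")
print(f"largest / smallest sd = {ratio:.2f}")
print("rule of thumb met" if ratio < 2 else "rule of thumb violated")
```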


Minitab ANOVA Output
Analysis of Variance for days
Source     DF     SS     MS     F      P
treatment   2  34.74  17.37  6.45  0.006
Error      22  59.26   2.69
Total      24  94.00

R ANOVA Output
           Df Sum Sq Mean Sq F value Pr(>F)
treatment   2   34.7    17.4    6.45 0.0063 **
Residuals  22   59.3     2.7
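
The sums of squares and the F value in these tables can be recomputed from the blister data. This is a sketch of one-way ANOVA by hand using plain Python; it compares F with a critical value instead of computing the P-value, and the critical value 3.44 (α = 0.05, df = 2 and 22) is taken from a standard F table, not from the slides.

```python
days = {
    "A": [5, 6, 6, 7, 7, 8, 9, 10],
    "B": [7, 7, 8, 9, 9, 10, 10, 11],
    "P": [7, 9, 9, 10, 10, 10, 11, 12, 13],
}

all_values = [x for values in days.values() for x in values]
grand_mean = sum(all_values) / len(all_values)
means = {g: sum(v) / len(v) for g, v in days.items()}

# Between-group (treatment) and within-group (error) sums of squares.
ss_treatment = sum(len(v) * (means[g] - grand_mean) ** 2 for g, v in days.items())
ss_error = sum((x - means[g]) ** 2 for g, v in days.items() for x in v)

df_treatment = len(days) - 1
df_error = len(all_values) - len(days)
f_stat = (ss_treatment / df_treatment) / (ss_error / df_error)

f_crit = 3.44   # assumed F-table value: alpha = 0.05, df = (2, 22)
print(f"SS treatment = {ss_treatment:.2f}, SS error = {ss_error:.2f}")
print(f"F = {f_stat:.2f} on ({df_treatment}, {df_error}) df")
print("reject H0" if f_stat > f_crit else "fail to reject H0")
```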
Where’s the Difference?
Once ANOVA indicates that the groups do not all
appear to have the same means, what do we do?

Analysis of Variance for days
Source     DF     SS     MS     F      P
treatment   2  34.74  17.37  6.45  0.006
Error      22  59.26   2.69
Total      24  94.00
                          Individual 95% CIs For Mean
                          Based on Pooled StDev
Level  N    Mean   StDev  ----------+---------+---------+------
A      8   7.250   1.669  (-------*-------)
B      8   8.875   1.458             (-------*-------)
P      9  10.111   1.764                      (------*-------)
                          ----------+---------+---------+------
Pooled StDev = 1.641               7.5       9.0      10.5

Clearest difference: P is worse than A (CI’s don’t overlap)
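
The individual 95% confidence intervals can be reproduced from the group sizes, the group means and the pooled standard deviation. A minimal sketch: the critical value 2.074 (two-tailed, α = 0.05, df = 22) comes from a standard t table and is an assumption added here, as are the variable names.

```python
import math

# Group sizes and means from the output; pooled SD from the same output.
groups = {"A": (8, 7.250), "B": (8, 8.875), "P": (9, 10.111)}
pooled_sd = 1.641
t_crit = 2.074   # assumed t-table value: two-tailed, alpha = 0.05, df = 22

intervals = {}
for name, (n, mean) in groups.items():
    half_width = t_crit * pooled_sd / math.sqrt(n)
    intervals[name] = (mean - half_width, mean + half_width)
    print(f"{name}: ({intervals[name][0]:.2f}, {intervals[name][1]:.2f})")

a_high = intervals["A"][1]
p_low = intervals["P"][0]
print("A and P overlap" if a_high >= p_low else "A and P do not overlap")
```

Non-overlapping intervals are an informal check only; a formal follow-up would use a multiple-comparison procedure.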
