
Appendix A: Calculating the t-Test and ANOVA

The t-Test

The t-test is used for comparing a sample mean to a population mean, comparing two means from the same sample, or comparing the means of two different samples.

In a t-test we establish a hypothesis to test, compute the value of t, and then compare it to a critical value of t.

Using a t-test also involves degrees of freedom. Degrees of freedom are based on how many values in the data set are free to vary, and are associated with how many population parameters we intend to estimate.

A. t-Test for Single Samples

This test uses only one sample, and we use this t-test when we don’t know σ (population standard deviation).

The formula for the single-sample t-test is:

$$t = \frac{M - \mu}{s/\sqrt{n}}$$

Key:

M = sample mean

µ = population mean

s = sample standard deviation

n = sample size

s/√n = standard error of the mean

Example:

The average IQ for children is 100. Dr. Hockler believes she can increase this level in a sample of children by providing them with extra intellectual stimulation during the preschool years. After 3 years of such stimulation, she measures the IQ scores of ten 6-year-olds. These children score a mean of 109, with s = 7.6. Evaluate her hypothesis at α = 0.1.

Step 1: State the hypotheses:

Ho: There is no significant difference in children’s IQ scores from the population average.

Step 2: Determine the critical value of t (cvt).

df = n – 1 = 10 – 1 = 9

cvt = 1.833

Step 3: Compute t:

$$t = \frac{M - \mu}{s/\sqrt{n}} = \frac{109 - 100}{7.6/\sqrt{10}} = \frac{9}{7.6/3.162} = \frac{9}{2.4035} = 3.74$$

What should we conclude? The calculated value of t, 3.74, is greater than the critical value of 1.833 at the 0.1 level of significance. This means that the null hypothesis is rejected. There is a significant difference in children’s IQ scores from the population average.
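The single-sample computation can also be checked with a few lines of Python. This is only a minimal sketch: the function name `single_sample_t` is an assumption made for illustration, and the critical value is simply the one quoted in Step 2 rather than looked up by the code.

```python
import math


def single_sample_t(sample_mean, population_mean, sample_sd, n):
    """Return (t, df) for a single-sample t-test."""
    standard_error = sample_sd / math.sqrt(n)  # s / sqrt(n)
    t = (sample_mean - population_mean) / standard_error
    return t, n - 1


# Values from the IQ example: M = 109, mu = 100, s = 7.6, n = 10
t, df = single_sample_t(109, 100, 7.6, 10)
critical_t = 1.833  # critical value quoted in Step 2 for df = 9

print(f"t = {t:.2f}, df = {df}")
print("reject Ho" if abs(t) > critical_t else "do not reject Ho")
```

Keeping the standard error as a separate variable mirrors the key above, where s/√n is named as its own quantity.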

B. t-Test for Dependent Samples

This test is used when we have only one sample, but plan to compare the group’s mean at one time to the group’s mean at another time to check for a significant mean difference. Another way this test is used is when the participants are paired or matched in some way. In this case, the dependent samples t-test is often called a paired samples t-test or a related samples t-test.

This t-test is based on difference scores, and the formula is:

$$t = \frac{M_D - \mu_D}{s_D/\sqrt{n}}$$

Key:

M_D = mean difference (the mean of the difference scores)

µ_D = population mean difference (which is always 0)

s_D = standard deviation of the difference scores

n = sample size, or number of pairs

s_D/√n = standard error of the mean difference

Since the value of µ_D is always zero, we can simply omit it from the formula:

$$t = \frac{M_D}{s_D/\sqrt{n}}$$

Steps:

  • 1. State the hypotheses

  • 2. Determine tcv (df = n – 1)

  • 3. Find the difference scores (subtract “before” scores from “after” scores)

  • 4. Square the difference scores

  • 5. Find M_D

  • 6. Find s_D

  • 7. Compute t and compare to tcv

Example:

Ms. Brooks modified her teaching methods for her 10th-grade Advanced English class to help improve her 7 students’ learning. To detect improvement, she wants to compare the students’ Test 1 scores with their Test 2 scores. At α = .01, find out if Ms. Brooks’ students learned better after she modified her teaching.

Step 1: Ho: There is no significant difference between the performance of the students in test 1 and test 2.

Step 2: Determine tcv:

df = n – 1 = 7 – 1 = 6

tcv = 3.143

 

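Step 3 and 4: Find difference scores and square them

| Test 1 | Test 2 | D | D² |
|---|---|---|---|
| 85 | 88 | (88 – 85) = 3 | 9 |
| 92 | 90 | (90 – 92) = -2 | 4 |
| 98 | 97 | -1 | 1 |
| 79 | 85 | 6 | 36 |
| 86 | 90 | 4 | 16 |
| 87 | 88 | 1 | 1 |
| 84 | 86 | 2 | 4 |
| | | ΣD = 13 | ΣD² = 71 |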

Step 5: Find M_D = ΣD / n = 13 / 7 = 1.86

Step 6: Find s_D

$$s_D = \sqrt{\frac{\sum D^2 - \frac{(\sum D)^2}{n}}{n - 1}} = \sqrt{\frac{71 - \frac{(13)^2}{7}}{7 - 1}} = \sqrt{\frac{71 - (169/7)}{6}} = \sqrt{7.81} = 2.79$$

Step 7: Compute t and compare to tcv

$$t = \frac{M_D}{s_D/\sqrt{n}} = \frac{1.86}{2.79/\sqrt{7}} = \frac{1.86}{2.79/2.65} = \frac{1.86}{1.05} = 1.77$$

 

What should we conclude?

The calculated value of t (1.77) is less than the tabulated value of t (3.143). Therefore the null hypothesis is accepted. There is no significant difference between the performance of the students in test 1 and test 2.
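Steps 3 to 7 of the dependent-samples procedure can be sketched in Python as follows. The function name `dependent_samples_t` is hypothetical, and the scores are the Test 1 and Test 2 values from Ms. Brooks’ example. Because the code does not round intermediate results, its t comes out near 1.76 rather than the hand-rounded 1.77.

```python
import math


def dependent_samples_t(before, after):
    """Return (mean_d, sd_d, t, df) for paired 'before' and 'after' scores."""
    diffs = [a - b for b, a in zip(before, after)]  # "after" minus "before"
    n = len(diffs)
    sum_d = sum(diffs)
    sum_d2 = sum(d * d for d in diffs)
    mean_d = sum_d / n
    # Standard deviation of the difference scores (raw-score formula)
    sd_d = math.sqrt((sum_d2 - sum_d ** 2 / n) / (n - 1))
    t = mean_d / (sd_d / math.sqrt(n))
    return mean_d, sd_d, t, n - 1


test1 = [85, 92, 98, 79, 86, 87, 84]
test2 = [88, 90, 97, 85, 90, 88, 86]
mean_d, sd_d, t, df = dependent_samples_t(test1, test2)

print(f"M_D = {mean_d:.2f}, s_D = {sd_d:.2f}, t = {t:.2f}, df = {df}")
```

The comparison with tcv = 3.143 is left to the reader, since the critical value comes from a t table rather than from the data.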

C. t-Test for Independent Samples

This test is used when we have two different samples, and we want to know if the means are significantly different from one another. Because we are using two samples, we must combine the variance of both.

The formula for this test is:

$$t = \frac{M_1 - M_2}{\sqrt{\left(\frac{SS_1 + SS_2}{df}\right)\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$$

Key:

M_1 = mean of first sample

M_2 = mean of second sample

SS_1 = sum of squares for first sample

SS_2 = sum of squares for second sample

df = degrees of freedom [(n_1 + n_2) – 2]

n_1 = number in first sample

n_2 = number in second sample

(SS_1 + SS_2) / df = pooled variance

Steps:

  • 1. State the hypotheses

  • 2. Determine tcv (df = (n_1 + n_2) – 2)

  • 3. Square each score for each group

  • 4. Sum the scores, and the squared scores, for each group

  • 5. Find M for each group

  • 6. Find the SS for each group

  • 7. Compute t and compare to tcv

Example:

Mr. Ande is a farmer looking for an effective way to store corn for a 9-month period. He plans to use half of his machine shed to stack the corn in a sheet metal bin. His main concern is moisture getting to the corn, but he doesn’t know whether the corn will hold up better if he uses the existing dirt floor, or constructs a floor made of lime chips like his neighbor suggests. He decides to split his 10,000 bushels between two bins, one with a dirt floor and one with a lime chip floor, and check them for moisture percentage at the end of each month. His data appear below. At α = .01, did one bin of corn suffer a significantly different amount of moisture damage than the other?

Step 1: State the Hypotheses

Ho: There will be no difference in the amount of moisture damage suffered by each bin of corn.

Step 2: Determine tcv:

df = (n_1 + n_2) – 2 = (9 + 9) – 2 = 16

tcv = 2.921

 

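Step 3 and 4:

| Dirt Floor (X_1) | Lime Floor (X_2) | X_1² | X_2² |
|---|---|---|---|
| 3 | 2 | 9 | 4 |
| 5 | 3 | 25 | 9 |
| 6 | 5 | 36 | 25 |
| 7.2 | 7 | 51.84 | 49 |
| 8.5 | 7 | 72.25 | 49 |
| 12 | 8.9 | 144 | 79.21 |
| 14 | 11 | 196 | 121 |
| 16.8 | 12 | 282.24 | 144 |
| 20 | 12.5 | 400 | 156.25 |
| ΣX_1 = 92.5 | ΣX_2 = 68.4 | ΣX_1² = 1216.33 | ΣX_2² = 636.46 |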

Step 5:

M_1 = 92.5 / 9 = 10.28

M_2 = 68.4 / 9 = 7.6

Step 6:

$$SS_1 = \sum X_1^2 - \frac{(\sum X_1)^2}{n_1} = 1216.33 - \frac{92.5^2}{9} = 265.64$$

$$SS_2 = \sum X_2^2 - \frac{(\sum X_2)^2}{n_2} = 636.46 - \frac{68.4^2}{9} = 116.62$$

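Step 7: Compute t and compare to tcv

$$t = \frac{M_1 - M_2}{\sqrt{\left(\frac{SS_1 + SS_2}{df}\right)\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$$

$$t = \frac{10.28 - 7.60}{\sqrt{\left(\frac{265.64 + 116.62}{16}\right)\left(\frac{1}{9} + \frac{1}{9}\right)}} = \frac{2.68}{\sqrt{\left(\frac{382.26}{16}\right)(0.11 + 0.11)}} = \frac{2.68}{\sqrt{5.25}} = \frac{2.68}{2.29} = 1.17$$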

What should we conclude?

The calculated value of t (1.17) is less than the tabulated value of t (2.921). Therefore the null hypothesis is accepted. There is no difference in the amount of moisture damage suffered by each bin of corn.
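The independent-samples computation can be sketched in Python in the same way. The helper names `sum_of_squares` and `independent_samples_t` are assumptions for illustration, and the two lists hold the monthly moisture readings from Mr. Ande’s example. The result is about 1.16 rather than 1.17 because the code keeps full precision where the hand computation rounds 1/9 to 0.11.

```python
import math


def sum_of_squares(scores):
    """SS = sum(X^2) - (sum(X))^2 / n for one sample."""
    return sum(x * x for x in scores) - sum(scores) ** 2 / len(scores)


def independent_samples_t(sample1, sample2):
    """Return (t, df) using the pooled variance of the two samples."""
    n1, n2 = len(sample1), len(sample2)
    df = n1 + n2 - 2
    pooled_variance = (sum_of_squares(sample1) + sum_of_squares(sample2)) / df
    standard_error = math.sqrt(pooled_variance * (1 / n1 + 1 / n2))
    t = (sum(sample1) / n1 - sum(sample2) / n2) / standard_error
    return t, df


dirt_floor = [3, 5, 6, 7.2, 8.5, 12, 14, 16.8, 20]
lime_floor = [2, 3, 5, 7, 7, 8.9, 11, 12, 12.5]
t, df = independent_samples_t(dirt_floor, lime_floor)

print(f"SS1 = {sum_of_squares(dirt_floor):.2f}")
print(f"SS2 = {sum_of_squares(lime_floor):.2f}")
print(f"t = {t:.2f}, df = {df}")
```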

In any given research situation, how do we decide which type of t-test to use?

The following chart will help in deciding which type of t-test to use.

One Sample:

  • Comparing to a population mean: Use Single Sample t-test

  • Comparing new mean to previous mean: Use Dependent Samples t-test

Two Samples:

  • Comparing means of matched subjects: Use Dependent Samples t-test

  • Comparing two separate means: Use Independent Samples t-test
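The chart can be read as a simple lookup from the research situation to the test. The sketch below encodes its four branches in Python; the function name `choose_t_test` and the string labels are hypothetical and serve only to illustrate how the chart is used.

```python
def choose_t_test(number_of_samples, comparison):
    """Return the t-test named in the chart for a given research situation.

    The argument values are illustrative labels, not standard terminology.
    """
    chart = {
        (1, "population mean"): "single sample t-test",
        (1, "previous mean"): "dependent samples t-test",
        (2, "matched subjects"): "dependent samples t-test",
        (2, "separate means"): "independent samples t-test",
    }
    try:
        return chart[(number_of_samples, comparison)]
    except KeyError:
        raise ValueError("situation not covered by the chart") from None


print(choose_t_test(1, "population mean"))
print(choose_t_test(2, "matched subjects"))
```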
Exercise: t-Test

A. Solve the following exercises.

    • 1. Ms. Chua owns a Chinese restaurant, and wants to increase business during the weekday lunch hours. She tends to average 45 lunch guests per day on weekdays. To increase this number she offers a 10% discount on weekdays. She keeps a count for 8 weeks, and finds she has an average of 48 customers with s = 3.46. Evaluate her hypothesis at α = .01.

Step 1: State the hypotheses:

Ho:

Step 2: Determine the cvt:

df = n – 1 =

cvt =

Step 3: Compute t:

$$t = \frac{M - \mu}{s/\sqrt{n}}$$

What is your conclusion?

  • 2. Last year on Halloween, a local hospital offered children a trade. They could bring the candy they collected trick-or-treating to the hospital, and trade it in for a free check-up from their nurse practitioners. This year, in order to increase the children’s visits, the hospital mailed letters to the parents of the same children asking the parents to encourage their children to make this trade. Each of 9 nurse practitioners was asked to report the number of children that came in for a check-up. Below is a sample of the data obtained last year, and this year. At α = .05, find out if significantly more of these children traded their candy for a free check-up this year, over last year.

Step 1: State the Hypotheses

Ho:

Step 2: Determine tcv:

df = n – 1 =

tcv =

Step 3 and 4: Find the difference and difference squared scores

Last Year

 

This Year

 

D

D 2

8

6

10

11

7

13

6

6

9

12

5

8

4

7

11

10

10

18

 

Step 5: M_D =

Step 6:

$$s_D = \sqrt{\frac{\sum D^2 - \frac{(\sum D)^2}{n}}{n - 1}}$$

Step 7: Compute t

$$t = \frac{M_D}{s_D/\sqrt{n}}$$

What is your conclusion?

3. Another hospital offered children a trade for their Halloween candy. This hospital, however, offered half the children a free check-up, and offered the other half a coupon for a free ice cream sundae from a local restaurant. A sample of the data, as reported by 8 nurse practitioners, appears below. At α = .05, which group of children was significantly more likely to trade in their candy?

Step 1: State the Hypotheses

Ho:

Step 2: Determine tcv:

df = (n_1 + n_2) – 2 =

tcv =

Step 3 and 4:

Check Up (X 1 )

Ice Cream (X 2 )

X 1 2

X 2

2

4

6

5

9

3

9

2

8

2

6

5

6

7

10

4

6

Step 5:

M_1 =

M_2 =

Step 6:

$$SS_1 = \sum X_1^2 - \frac{(\sum X_1)^2}{n_1}$$

$$SS_2 = \sum X_2^2 - \frac{(\sum X_2)^2}{n_2}$$

Step 7: Compute t and compare to tcv

$$t = \frac{M_1 - M_2}{\sqrt{\left(\frac{SS_1 + SS_2}{df}\right)\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$$

What is your conclusion?

B. Considering the following situations, what is the appropriate t-test to use? Encircle your answer.

  • 1. Many American adults have triglyceride levels that are higher than the healthy limit of 150 or less. Dr. Carter is a dietician who believes that 30 minutes of exercise, 5 days per week, will lower triglycerides. She gathers a sample of 6 adults whose triglyceride levels measure beyond 150, has them walk on a treadmill for ½ hour a day for 8 weeks, then measures their triglyceride levels again.

Single Sample Dependent Sample Independent Sample

  • 2. Mr. Garcia is interested in improving his track team’s 100-yard dash average. He divides his team of 20 students into two groups. He pairs his two best runners, and puts one in each group. He does the same with his two second best, his two third best, etc. He gives the first group extra training for two weeks, and compares them to the second group to find out if extra training improves the average speed of the runners.

Single Sample Dependent Sample Independent Sample

3. Mr. Sanchez wants to improve the playing of the flute-players in his 8th-grade orchestra. He randomly selects 6 flute-players, and spends an hour a day giving private music lessons to each child for 6 weeks. He then compares them to the remaining flute-players to check for improvement.

Single Sample Dependent Sample Independent Sample

4. Ms. Oliveros owns the Hotel Paradise in Hawaii, and wants to increase her occupancy for the summer months. In previous years, she has averaged 200 guests per night during the summer. In an effort to increase occupancy, she advertises her hotel nationally throughout the winter months, then takes a count of how many guests stay at her hotel the following summer.

Single Sample Dependent Sample Independent Sample

Analysis of Variance (ANOVA)

Table 1.1

In statistics, analysis of variance is a collection of statistical models, and their associated procedures, in which the observed variance is partitioned into components due to different explanatory variables. In its simplest form ANOVA gives a statistical test of whether the means of several groups are all equal, and therefore generalizes Student's two-sample t-test to more than two groups.

In practice, there are several types of ANOVA depending on the number of treatments and the way they are applied to the subjects in the experiment:

One-way ANOVA

Two-way ANOVA

Factorial ANOVA

Mixed Design ANOVA

Multivariate analysis of variance (MANOVA)

ANOVA, or the F-test, is an extension of the t-test that is used to determine whether the differences among three or more groups of values are significant.

Decision Rules to be followed:


One-way Analysis of Variance

In Jose’s Farmville, an experiment was devised to test whether the fruit yields from three experimental blocks differ significantly from each other. The fruit trees in the first block were treated with Super-Grow fertilizer, the trees in the second block were treated with Super-Duper fertilizer, and the third block served as the control. With the given values for the yield of the blocks, is there a significant difference among the treatments at the 5% level of significance?

Step 1: State the research question:

Is there a significant difference in the yield performance of the fruit trees using:

  • a. Super-Grow fertilizer;

  • b. Super-Duper fertilizer;

  • c. Natural fertilizer?

Step 2: State the hypotheses:

H_A: There is a significant difference in the yield performance of the fruit trees with regard to the mentioned criteria.

H_0: There is no significant difference in the yield performance of the fruit trees with regard to the mentioned criteria.

Step 3: Determine α, decision rule, and F - critical value. (α=0.05)

Decision rule:

If F_c is ___________ F_t or F_c is _________ -F_t, then reject H_0.

F_t: ________________

Step 4: Arrange the data as in the table below.

| x_A | x_B | x_C | x_A² | x_B² | x_C² |
|---|---|---|---|---|---|
| 20 | 15 | 20 | 400 | 225 | 400 |
| 22 | 18 | 22 | 484 | 324 | 484 |
| 18 | 15 | 18 | 324 | 225 | 324 |
| 20 | 17 | 20 | 400 | 289 | 400 |
| 23 | 23 | 23 | 529 | 529 | 529 |
| Σx_A = 103 | Σx_B = 88 | Σx_C = 103 | Σx_A² = 2137 | Σx_B² = 1592 | Σx_C² = 2137 |

N = number of cases = 15

Mean of Treatment A: (20 + 22 + 18 + 20 + 23) / 5 = 20.6

Mean of Treatment B: (15 + 18 + 15 + 17 + 23) / 5 = 17.6

Mean of Treatment C: (20 + 22 + 18 + 20 + 23) / 5 = 20.6

Sum of the Scores: Σx = 103 + 88 + 103 = 294

Total Sum of the Squared Scores: Σx² = 2137 + 1592 + 2137 = 5866

Average of the Mean Scores: (20.6 + 17.6 + 20.6) / 3 = 19.6

Step 5: Compute SS_t, SS_b, SS_w, df_b, df_w, MS_b, MS_w, and F.

Total Sum of the Squares

$$SS_t = \sum x^2 - \frac{(\sum x)^2}{N}$$

Solution:

$$SS_t = 5866 - \frac{(294)^2}{15} = 5866 - \frac{86436}{15} = 5866 - 5762.4 = 103.6$$

Sum of the Squares Between

$$SS_b = \sum \frac{(\sum x)^2}{n} - \frac{(\sum x_t)^2}{N}$$

N = total number of cases

n = number of cases in each treatment

Solution:

$$SS_b = \left(\frac{103^2}{5} + \frac{88^2}{5} + \frac{103^2}{5}\right) - \frac{294^2}{15}$$

$$SS_b = \left(\frac{10609}{5} + \frac{7744}{5} + \frac{10609}{5}\right) - \frac{86436}{15}$$

$$SS_b = (2121.8 + 1548.8 + 2121.8) - 5762.4 = 5792.4 - 5762.4 = 30$$

Sum of the Squares within Groups

$$SS_w = SS_A + SS_B + SS_C$$

Sum of the Squares in a Group

$$SS = \sum x^2 - \frac{(\sum x)^2}{n}$$

Solution:

$$SS_A = 2137 - \frac{(103)^2}{5} = 2137 - \frac{10609}{5} = 2137 - 2121.8 = 15.2$$

$$SS_B = 1592 - \frac{(88)^2}{5} = 1592 - \frac{7744}{5} = 1592 - 1548.8 = 43.2$$

$$SS_C = 2137 - \frac{(103)^2}{5} = 2137 - \frac{10609}{5} = 2137 - 2121.8 = 15.2$$

$$SS_w = 15.2 + 43.2 + 15.2 = 73.6$$

The within sum-of-squares added to the between sum-of-squares should equal the total sum-of-squares:

$$SS_t = SS_w + SS_b = 73.6 + 30 = 103.6$$

It follows, then, that the within sum-of-squares can be obtained directly by subtracting the between sum-of-squares from the total sum-of-squares:

$$SS_w = SS_t - SS_b$$
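The partition of the total sum of squares can be verified numerically. The following Python sketch recomputes the three sums of squares from the raw yields of treatments A, B and C and checks that the within and between parts add up to the total. The variable and function names are assumptions made for illustration.

```python
groups = {
    "A": [20, 22, 18, 20, 23],
    "B": [15, 18, 15, 17, 23],
    "C": [20, 22, 18, 20, 23],
}


def sum_of_squares(scores):
    """SS = sum(x^2) - (sum(x))^2 / n for one group."""
    return sum(x * x for x in scores) - sum(scores) ** 2 / len(scores)


all_scores = [x for scores in groups.values() for x in scores]
n_total = len(all_scores)
correction = sum(all_scores) ** 2 / n_total  # (sum x)^2 / N = 5762.4

ss_total = sum(x * x for x in all_scores) - correction
ss_between = sum(sum(s) ** 2 / len(s) for s in groups.values()) - correction
ss_within = sum(sum_of_squares(s) for s in groups.values())

print(f"SS_t = {ss_total:.1f}")
print(f"SS_b = {ss_between:.1f}")
print(f"SS_w = {ss_within:.1f}")
print(f"SS_w + SS_b = {ss_within + ss_between:.1f}")
```

Computing SS_w group by group, instead of by subtraction, makes the final line a genuine check on the arithmetic.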

Degrees of Freedom

There are 15 cases in the problem that we are working on, so the total has N – 1, or 14, degrees of freedom. In group A there are 5 cases; hence there are 4 degrees of freedom for this group, and since in this problem the number of cases is the same in each group, there are 4 degrees of freedom in each of the other groups. So far, we have accounted for 12 of the total number of degrees of freedom. We have three groups, so it follows that there are 2 degrees of freedom between the groups. To generalize:

df for total = number of cases in total (N) minus 1

df between groups = number of groups (k) minus 1

df within groups = sum, over the groups, of the number of cases in each group (n) minus 1: (n_1 – 1) + (n_2 – 1) + ... + (n_k – 1)

Therefore:

df for total = 15 – 1 = 14

df between groups = 3 – 1 = 2

df within groups = (5 – 1) + (5 – 1) + (5 – 1) = 12

Mean Square Between

$$MS_b = \frac{SS_b}{df_b} = \frac{30}{2} = 15$$

Mean Square Within

$$MS_w = \frac{SS_w}{df_w} = \frac{73.6}{12} = 6.13$$

F-Test

The F value is the ratio of the Mean Square Between and the Mean Square Within. In equation form:

$$F = \frac{MS_b}{MS_w} = \frac{15}{6.13} = 2.45$$

Step 6: Make a summary table.

| Source of Variation | Degrees of Freedom | Sum of Squares | Mean Squares | Computed F-Value | Tabulated F-Value* | VI | H_0 |
|---|---|---|---|---|---|---|---|
| Between Groups | 2 | 30 | 15 | 2.45 | 3.88 | NS | accepted |
| Within Groups | 12 | 73.6 | 6.13 | | | | |
| Total | 14 | 103.6 | | | | | |

*Please refer to the table of F distribution

Step 7: Make an interpretation.

The table shows that the computed value of F is 2.45 while the tabulated value of F at the 0.05 level of significance is 3.88. The computed value is less than the tabulated value. Therefore, there is no significant difference in the yield performance of the fruit trees with regard to the mentioned criteria, and the null hypothesis is accepted.
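The last part of the one-way computation, from sums of squares to the F ratio, can be sketched as a small Python function. The name `one_way_f` is hypothetical; the inputs are SS_b = 30 and SS_w = 73.6 from this example, with 3 groups and 15 cases. The within-groups degrees of freedom are written as N – k, which equals (n_1 – 1) + ... + (n_k – 1).

```python
def one_way_f(ss_between, ss_within, n_groups, n_total):
    """Return (df_b, df_w, MS_b, MS_w, F) for a one-way ANOVA."""
    df_between = n_groups - 1
    df_within = n_total - n_groups
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return df_between, df_within, ms_between, ms_within, ms_between / ms_within


df_b, df_w, ms_b, ms_w, f_value = one_way_f(30, 73.6, n_groups=3, n_total=15)
tabulated_f = 3.88  # value quoted in this example for alpha = 0.05

print(f"df_b = {df_b}, df_w = {df_w}")
print(f"MS_b = {ms_b:.2f}, MS_w = {ms_w:.2f}, F = {f_value:.2f}")
print("significant" if f_value > tabulated_f else "not significant")
```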

Two-way Analysis of Variance

The table below shows the responses of 40 male and 40 female high school students to an attitudinal scale. Each group of 40 was randomly divided into four groups of ten, and each of these groups of 10 was shown one of four different films on a controversial subject. Later the attitudinal scale on this subject was administered to each individual. For the study, we will use the 0.05 level of significance.

Step 1: State the research question:

Is there a significant difference in the responses of the groups of male and female students on the attitudinal scale when exposed to four different films on a controversial subject?

Step 2: State the hypotheses:

H_A: There is a significant difference in the responses of the groups of male and female students on the attitudinal scale when exposed to four different controversial films.

H_0: There is no significant difference in the responses of the groups of male and female students on the attitudinal scale when exposed to four different controversial films.

Step 3: Determine α, decision rule, and F - critical value. (α=0.05)

Decision rule:

If F_c is ___________ F_t or F_c is _________ -F_t, then reject H_0.

F_t: ________________

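Step 4: Arrange the data as in the following table.

| | Film 1 | Film 2 | Film 3 | Film 4 | Row total |
|---|---|---|---|---|---|
| Males | 10 | 14 | 13 | 16 | |
| | 8 | 12 | 12 | 12 | |
| | 6 | 8 | 10 | 10 | |
| | 4 | 4 | 9 | 9 | |
| | 4 | 4 | 9 | 9 | |
| | 4 | 4 | 7 | 7 | |
| | 2 | 3 | 4 | 7 | |
| | 2 | 2 | 4 | 6 | |
| | 2 | 2 | 4 | 5 | |
| | 1 | 1 | 2 | 5 | |
| Σ (Males) | 43 | 54 | 74 | 86 | 257 |
| Females | 14 | 14 | 10 | 18 | |
| | 12 | 13 | 10 | 16 | |
| | 12 | 12 | 10 | 15 | |
| | 10 | 11 | 9 | 14 | |
| | 8 | 10 | 9 | 13 | |
| | 6 | 9 | 7 | 10 | |
| | 4 | 7 | 7 | 10 | |
| | 2 | 6 | 6 | 10 | |
| | 2 | 3 | 5 | 10 | |
| | 1 | 2 | 4 | 9 | |
| Σ (Females) | 71 | 87 | 77 | 125 | 360 |
| Σ (Columns) | 114 | 141 | 151 | 211 | ΣX_t = 617 |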

Step 5: Compute the following:

Squares and their sum

$$\sum X_t^2 = 10^2 + 8^2 + 6^2 + \dots + 9^2 = 6159$$

Total Sum of Squares

$$SS_t = \sum X_t^2 - \frac{(\sum X_t)^2}{N_t}$$

$$SS_t = 6159 - \frac{(617)^2}{80} = 6159 - 4758.6 = 1400.4$$

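Sum of Squares Within

$$SS_w = \sum X_t^2 - \frac{\sum (\sum X_k)^2}{n}$$

Here ΣX_k is the sum of the scores in each of the eight groups, and n = 10 is the number of cases in each group.

$$SS_w = 6159 - \frac{43^2 + 54^2 + 74^2 + 86^2 + 71^2 + 87^2 + 77^2 + 125^2}{10}$$

$$SS_w = 6159 - \frac{51801}{10} = 6159 - 5180.1 = 978.9$$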

Sum of Squares between Columns

$$SS_c = \frac{\sum (\sum X_c)^2}{n_c} - \frac{(\sum X_t)^2}{N_t}$$

$$SS_c = \frac{114^2 + 141^2 + 151^2 + 211^2}{20} - \frac{(617)^2}{80}$$

$$SS_c = \frac{100199}{20} - 4758.61 = 5009.95 - 4758.61 = 251.34$$

Sum of Squares between Rows

$$SS_r = \frac{\sum (\sum X_r)^2}{n_r} - \frac{(\sum X_t)^2}{N_t}$$

$$SS_r = \frac{257^2 + 360^2}{40} - \frac{(617)^2}{80}$$

$$SS_r = \frac{195649}{40} - 4758.61 = 4891.22 - 4758.61 = 132.61$$

Sum of Squares for Interaction

$$SS_{c.r} = SS_t - SS_w - SS_c - SS_r$$

$$SS_{c.r} = 1400.4 - 978.9 - 251.34 - 132.61 = 37.55 \approx 37.6$$
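The two-way sums of squares can be recomputed from the raw scores with a short Python sketch. The layout `cells[row][column]` (rows for sex, columns for film) and all variable names are assumptions made for illustration; the scores are those of the data table in Step 4.

```python
males = [
    [10, 8, 6, 4, 4, 4, 2, 2, 2, 1],      # film 1
    [14, 12, 8, 4, 4, 4, 3, 2, 2, 1],     # film 2
    [13, 12, 10, 9, 9, 7, 4, 4, 4, 2],    # film 3
    [16, 12, 10, 9, 9, 7, 7, 6, 5, 5],    # film 4
]
females = [
    [14, 12, 12, 10, 8, 6, 4, 2, 2, 1],        # film 1
    [14, 13, 12, 11, 10, 9, 7, 6, 3, 2],       # film 2
    [10, 10, 10, 9, 9, 7, 7, 6, 5, 4],         # film 3
    [18, 16, 15, 14, 13, 10, 10, 10, 10, 9],   # film 4
]
cells = [males, females]  # cells[row][column] is one group of 10 scores

all_scores = [x for row in cells for cell in row for x in cell]
sum_squared_scores = sum(x * x for x in all_scores)
correction = sum(all_scores) ** 2 / len(all_scores)  # (sum X_t)^2 / N_t

ss_total = sum_squared_scores - correction
ss_within = sum_squared_scores - sum(
    sum(cell) ** 2 / len(cell) for row in cells for cell in row
)

n_columns = len(cells[0])
column_totals = [sum(sum(row[c]) for row in cells) for c in range(n_columns)]
column_sizes = [sum(len(row[c]) for row in cells) for c in range(n_columns)]
ss_columns = sum(t ** 2 / n for t, n in zip(column_totals, column_sizes)) - correction

row_totals = [sum(sum(cell) for cell in row) for row in cells]
row_sizes = [sum(len(cell) for cell in row) for row in cells]
ss_rows = sum(t ** 2 / n for t, n in zip(row_totals, row_sizes)) - correction

ss_interaction = ss_total - ss_within - ss_columns - ss_rows

for name, value in [("SS_t", ss_total), ("SS_w", ss_within), ("SS_c", ss_columns),
                    ("SS_r", ss_rows), ("SS_c.r", ss_interaction)]:
    print(f"{name} = {value:.2f}")
```

Without intermediate rounding the interaction term comes out as 37.54, which agrees with the hand-computed 37.55 to within rounding.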

Degrees of Freedom

df for total = N_t – 1

df within groups = k(n – 1), where k is the number of groups and n is the number of cases in each group

df between columns = c – 1

df between rows = r – 1

df for interaction = (c – 1)(r – 1)

Solution:

df_t = 80 – 1 = 79

df_w = 8(10 – 1) = 8(9) = 72

df_c = 4 – 1 = 3

df_r = 2 – 1 = 1

df_c.r = (4 – 1)(2 – 1) = 3(1) = 3

Mean Square Within

$$MS_w = \frac{SS_w}{df_w} = \frac{978.9}{72} = 13.6$$

Mean Square between Columns

$$MS_c = \frac{SS_c}{df_c} = \frac{251.34}{3} = 83.78$$

Mean Square between Rows

$$MS_r = \frac{SS_r}{df_r} = \frac{132.61}{1} = 132.61$$

Mean Square for Interaction

$$MS_{c.r} = \frac{SS_{c.r}}{df_{c.r}} = \frac{37.6}{3} = 12.53$$

F-Test

The F value is the ratio of the Mean Square Between (Columns, Rows and Interaction) and the Mean Square Within.

$$F = \frac{MS_b}{MS_w}$$

Solution:

$$F_c = \frac{83.78}{13.6} = 6.16$$

$$F_r = \frac{132.61}{13.6} = 9.75$$

$$F_{c.r} = \frac{12.53}{13.6} = 0.92$$

Step 6: Make a summary table.

| Source of Variation | Sum of Squares | Degrees of Freedom | Mean Squares | Computed F-Value | Tabulated F-Value* | VI | H_0 |
|---|---|---|---|---|---|---|---|
| Between Columns | 251.34 | 3 | 83.78 | 6.16 | 2.74 | S | rejected |
| Between Rows | 132.61 | 1 | 132.61 | 9.75 | 3.98 | S | rejected |
| Interaction | 37.6 | 3 | 12.53 | 0.92 | 2.74 | NS | accepted |
| Within Groups | 978.9 | 72 | 13.6 | | | | |
| Total | 1400.4 | 79 | | | | | |

*Please refer to the table of F distribution

Step 7: Make an interpretation.

The table reveals that the computed values of F between columns and between rows are greater than the tabulated values. Therefore we can say that there is a significant difference in the responses of the students on the attitude scale with regard to the controversial films and with regard to sex. For the interaction, the computed value is less than the tabulated value. This means that there is no significant difference in the scores of the students with regard to the interaction of sex with the controversial films.
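The mean squares, F ratios and decisions of the two-way example can be reproduced with a short Python sketch. The dictionary keys and the function name `f_ratios` are hypothetical. The sums of squares, degrees of freedom and tabulated F values are the ones quoted in this example, not values computed or looked up by the code.

```python
def f_ratios(ss, df):
    """Return (mean squares, F ratios); each F divides by the within mean square."""
    ms = {source: ss[source] / df[source] for source in ss}
    f = {source: ms[source] / ms["within"] for source in ss if source != "within"}
    return ms, f


ss = {"columns": 251.34, "rows": 132.61, "interaction": 37.6, "within": 978.9}
df = {"columns": 3, "rows": 1, "interaction": 3, "within": 72}
tabulated_f = {"columns": 2.74, "rows": 3.98, "interaction": 2.74}

ms, f = f_ratios(ss, df)
for source in ("columns", "rows", "interaction"):
    decision = "reject H0" if f[source] > tabulated_f[source] else "accept H0"
    print(f"{source}: MS = {ms[source]:.2f}, F = {f[source]:.2f}, {decision}")
```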