The T-Test in Dissertation & Thesis Research

In dissertation or thesis research, there are two types of inferential
statistics: parametric and nonparametric tests. Which type you
use for your data depends on the type of measurement scale
used and how your collected data are distributed.

T-Test Assumptions

The t-test is a parametric statistic and perhaps one of the
simplest analyses used in dissertation and thesis research. Prior
to using the t-test, you must make sure that your data does not
violate any of the three assumptions underlying the t-test:

1. The scores in your data represent a random sample from the
population under study.
2. The sampling distribution of the mean is normal.
3. The variances of the different groups studied are very similar.

If your data violates one or more of these assumptions, you may be
committing a Type I error more or less often than the alpha
probability you set (typically .01 or .05). This bias may undermine
the value of the t-test and, therefore, your results.

Types of T-Tests

The t-test is used when your data has only two levels of the
independent variable. There is a t-test for dissertations
involving experimental designs with randomized groups
(independent samples), and another t-test for dissertations with
experimental designs involving correlated groups (matched
pairs or within-subjects designs). Knowing what kind of sample
you have is key to selecting the appropriate t-test.

Let's say that your dissertation involves two groups of
people. If you obtained your subjects from multiple locations
and assigned each person to be in one group or the other
randomly, say through the use of a random numbers table, then
you would use the t-test for independent samples in your
analysis. The same test applies when you compare pre-existing
groups, such as men versus women in an undergraduate
introductory psychology course at your school, because no score
in one group is paired with a score in the other. If, however,
each score in one group is linked to a score in the other, as
with matched pairs or the same people measured twice, you must
use the t-test for correlated samples in your analysis.

Independent Samples

For example, let's suppose that your dissertation involves
two random groups of people, an experimental group and a
control group. You are examining whether seeing a recording
artist's face influences how people rate his/her song. All of your
subjects listen to the same song. The experimental group sees
the artist's face before hearing the song, while the control
group does not. You then collect data from the two groups
about how well they liked the song on a scale of 1-7. For your
analysis, you compute the mean of each group and find that the
experimental group's mean is 5.9, while the control group's
mean is 4.6.

For this analysis, you would use the t-test for independent
means. The crux of your paper is determining whether the 1.3
difference between these means is a statistically reliable
difference or if the means are different because of sampling
error.

Correlated Samples

Using the above example, let's say your work involved one
group of subjects, but each subject listened to the song first,
without seeing the artist's face, then rated how much they liked
it. Then, the same subject saw the artist's face and listened to
the song again. For your analysis, you would use the t-test for
correlated samples, because each person in your sample provided
two ratings. Obviously, the ratings for this sample are
correlated, because each pair came from the same individual. This
type of experimental design is called a "within-subjects" design.
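
As a rough illustration of the correlated-samples case, the sketch below computes a paired t from the difference between each subject's two ratings. The ratings are made up for illustration, and the function name `paired_t` is hypothetical; the calculation is the standard one (mean difference divided by its standard error, with degrees of freedom equal to the number of subjects minus one).

```python
import math
import statistics


def paired_t(first, second):
    """Return (t, df) for a correlated-samples (paired) t-test.

    Each position in `first` and `second` must belong to the same subject.
    """
    if len(first) != len(second):
        raise ValueError("each subject needs exactly two ratings")
    # Work with one difference score per subject.
    diffs = [b - a for a, b in zip(first, second)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    # Standard error of the mean difference (sample SD uses n - 1).
    se = statistics.stdev(diffs) / math.sqrt(n)
    return mean_diff / se, n - 1


# Hypothetical 1-7 ratings from eight subjects (not data from the text).
without_face = [4, 5, 3, 4, 6, 5, 4, 3]
with_face = [5, 6, 5, 4, 7, 6, 5, 5]

t, df = paired_t(without_face, with_face)
print(f"t = {t:.2f}, df = {df}")
```

Note that the function fails with a division by zero if every subject changes by exactly the same amount, because the differences then have no variability.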

Calculating Degrees of Freedom

Once you have calculated the t-score for your groups, you
need to know whether this t value is large enough to
conclude that the difference you found between the two groups
is significant. Most statistical packages used for analyses (SPSS,
etc.) will report a p-value for you. If, for your dissertation,
you have set your significance level (alpha) at .05, any p-value
smaller than this means that you have significant findings. Dissertation
committees and dissertation chairs love significant results!

If you do not have a statistical package, you must first find
the degrees of freedom for your sample. For the first example
given (the between-subjects design), the degrees of freedom is
the number of subjects minus two (N-2). You can then use a
t-test table, found in most statistics books, to determine the
"critical value" of t. If the absolute value of the t you obtained
in your sample is greater than or equal to the critical value in the
table matching the degrees of freedom for your sample, then the
difference between your two groups' means is statistically
significant at the alpha level you set.

If you cannot find a table that has the degrees of freedom
you have for your sample, you can use the next lowest degrees
of freedom in the table that you have, which gives a slightly
more conservative critical value. This strategy is
particularly appropriate if your sample is very large and the
degrees of freedom for your t-test is quite a bit larger than
those found in a table. Another option is to use a statistical
calculator to see if the t-value you got for your results is
significant.
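
The table lookup described above can be sketched in a few lines. The dictionary below holds a handful of two-tailed critical values at alpha = .05 as they appear in standard t tables; the names `CRITICAL_T_05`, `critical_t`, and `is_significant` are hypothetical, and a real table would have many more rows.

```python
import bisect

# Two-tailed critical values of t at alpha = .05 (selected rows of a
# standard t table; key = degrees of freedom).
CRITICAL_T_05 = {
    1: 12.706, 2: 4.303, 5: 2.571, 10: 2.228, 20: 2.086,
    30: 2.042, 40: 2.021, 60: 2.000, 120: 1.980,
}


def critical_t(df, table=CRITICAL_T_05):
    """Return the critical value for df, falling back to the next lowest row."""
    rows = sorted(table)
    # Index of the largest tabled df that does not exceed the requested df.
    idx = bisect.bisect_right(rows, df) - 1
    if idx < 0:
        raise ValueError("df is smaller than every row in the table")
    return table[rows[idx]]


def is_significant(t, n_total, table=CRITICAL_T_05):
    """Compare |t| with the critical value for an independent-samples design."""
    df = n_total - 2  # number of subjects minus two
    return abs(t) >= critical_t(df, table)


print(critical_t(48))             # uses the df = 40 row
print(is_significant(2.30, 50))   # hypothetical t from 50 subjects
```

Falling back to the next lowest row always yields a critical value at least as large as the exact one, so the shortcut can only make the test slightly harder to pass.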

In sum, if your dissertation or thesis involves two groups
with only two levels of the independent variable, the t-test may
be the ideal statistic for your analysis. Choose the appropriate
t-test for your analysis based on whether your samples are
independent or correlated.

The T-Test

The t-test assesses whether the means of two groups are statistically different from each other. This
analysis is appropriate whenever you want to compare the means of two groups, and especially appropriate
as the analysis for the posttest-only two-group randomized experimental design.

Figure 1. Idealized distributions for treated and comparison group posttest values.

Figure 1 shows the distributions for the treated (blue) and control (green) groups in a study. Actually, the
figure shows the idealized distribution -- the actual distribution would usually be depicted with a histogram or
bar graph. The figure indicates where the control and treatment group means are located. The question the
t-test addresses is whether the means are statistically different.

What does it mean to say that the averages for two groups are statistically different? Consider the three
situations shown in Figure 2. The first thing to notice about the three situations is that the difference
between the means is the same in all three. But you should also notice that the three situations don't look
the same -- they tell very different stories. The top example shows a case with moderate variability of scores
within each group. The second situation shows the high-variability case. The third shows the case with low
variability. Clearly, we would conclude that the two groups appear most different or distinct in the bottom or
low-variability case. Why? Because there is relatively little overlap between the two bell-shaped curves. In
the high-variability case, the group difference appears least striking because the two bell-shaped
distributions overlap so much.

Figure 2. Three scenarios for differences between means.

This leads us to a very important conclusion: when we are looking at the differences between scores for two
groups, we have to judge the difference between their means relative to the spread or variability of their
scores. The t-test does just this.

Statistical Analysis of the t-test

The formula for the t-test is a ratio. The top part of the ratio is just the difference between the two means or
averages. The bottom part is a measure of the variability or dispersion of the scores. This formula is
essentially another example of the signal-to-noise metaphor in research: the difference between the means
is the signal that, in this case, we think our program or treatment introduced into the data; the bottom part of
the formula is a measure of variability that is essentially noise that may make it harder to see the group
difference. Figure 3 shows the formula for the t-test and how the numerator and denominator are related to
the distributions.

Figure 3. Formula for the t-test.

The top part of the formula is easy to compute -- just find the difference between the means. The bottom
part is called the standard error of the difference. To compute it, we take the variance for each group and
divide it by the number of people in that group. We add these two values and then take their square root.
The specific formula is given in Figure 4:

Figure 4. Formula for the standard error of the difference between the means.

Remember that the variance is simply the square of the standard deviation.

The final formula for the t-test is shown in Figure 5:

Figure 5. Formula for the t-test.
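
Since the figures themselves are not reproduced here, the two formulas can be written out from the verbal description above. The subscripts T (treated group) and C (comparison group) and the symbol names are notational assumptions, not taken from the original figures.

```latex
\[
SE(\bar{X}_T - \bar{X}_C) = \sqrt{\frac{\mathrm{var}_T}{n_T} + \frac{\mathrm{var}_C}{n_C}},
\qquad
t = \frac{\bar{X}_T - \bar{X}_C}{SE(\bar{X}_T - \bar{X}_C)}
\]
```

Here each variance is the squared standard deviation of that group's scores, and each n is the number of people in that group.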

The t-value will be positive if the first mean is larger than the second and negative if it is smaller. Once you
compute the t-value you have to look it up in a table of significance to test whether the ratio is large enough
to say that the difference between the groups is not likely to have been a chance finding. To test the
significance, you need to set a risk level (called the alpha level). In most social research, the "rule of thumb"
is to set the alpha level at .05. This means that five times out of a hundred you would find a statistically
significant difference between the means even if there were none (i.e., by "chance"). You also need to
determine the degrees of freedom (df) for the test. In the t-test, the degrees of freedom is the sum of the
persons in both groups minus 2. Given the alpha level, the df, and the t-value, you can look the t-value up in
a standard table of significance (available as an appendix in the back of most statistics texts) to determine
whether the t-value is large enough to be significant. If it is, you can conclude that the means for the two
groups are different (even given the variability). Fortunately, statistical computer
programs routinely print the significance test results and save you the trouble of looking them up in a table.
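
The computation described above can be sketched as follows: divide the difference between the means by the standard error of the difference, and take the degrees of freedom as the total number of persons minus 2. The scores are invented for illustration and the function name `independent_t` is hypothetical.

```python
import math
import statistics


def independent_t(group1, group2):
    """Return (t, df) for two independent groups, as described in the text."""
    mean_diff = statistics.mean(group1) - statistics.mean(group2)
    # Standard error of the difference: each group's variance divided by
    # its size, summed, then square-rooted.
    se_diff = math.sqrt(
        statistics.variance(group1) / len(group1)
        + statistics.variance(group2) / len(group2)
    )
    df = len(group1) + len(group2) - 2  # persons in both groups minus 2
    return mean_diff / se_diff, df


# Hypothetical posttest scores (not data from the text).
treated = [6, 7, 5, 6, 7, 5, 6, 6]
control = [5, 4, 5, 3, 5, 4, 6, 4]

t, df = independent_t(treated, control)
print(f"t = {t:.2f}, df = {df}")
```

Swapping the order of the two groups flips the sign of t but not its size, which is why table lookups compare the absolute value of t with the critical value.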

A significance test determines the probability of obtaining a
difference at least as large as the one observed if the null
hypothesis is true. The researcher sets the probability level
at which the null hypothesis will be rejected.
Suppose, for our example, that we use a significance test and
find that this probability is
less than 5 in 100. This would be stated as p < .05,
where p stands for probability. Researchers
should always state the probability level used in their
research findings. In principle it can be set to any value
greater than 0 and less than 1; however, we play it safe by
setting it to .05 (the most widely used convention). Of
course, if a difference this large would occur less than 5 times in
100 when the null hypothesis is true, it’s a good bet that the
null hypothesis is not true. If it’s probably not
true, we reject the null hypothesis, leaving us with only the
first two explanations that we started with as viable
explanations for the difference.

There is no rule of nature that dictates at what probability
level the null hypothesis should be rejected. However,
conventional wisdom suggests that levels of .05 or less (such as .01
or .001) are reasonable.

When we fail to reject the null hypothesis because the
probability is greater than .05, we do just that: We "fail to
reject" the null hypothesis and it stays on our list of
possible explanations; we never "accept" the null
hypothesis as the only explanation. Remember, there are
three possible explanations (see above) and failing to reject
one of them does not mean that you are accepting it as the
only explanation.

An alternative way to say that we have rejected the null
hypothesis is to state that the difference is statistically
significant. Thus, if we state that a difference is statistically
significant at the .05 level (meaning .05 or less), it is
equivalent to stating that the null hypothesis has been
rejected at that level.

When you read research reported in academic journals, or
research papers, you will find that the null hypothesis is
seldom stated by researchers, who assume that you know
that the sole purpose of a significance test is to test a null
hypothesis. Instead, researchers tell you which differences
were tested for significance, which significance test they
used, and which differences were found to be statistically
significant. It is more common to find null hypotheses
stated in theses and dissertations since committee members
may wish to make sure that the students they are
supervising understand the reason they have conducted a
significance test.

Topic: The t Test

Suppose we have a research hypothesis that says "medical research
investigators who take a short course on the causes of HIV will be less fearful
of the disease than research investigators who have not taken the course," and
test it by conducting an experiment in which a random sample of research
investigators is assigned to take the course and another random sample is
designated as the control group (note: random sampling is preferred because it
precludes any bias in the assignment of subjects to the groups and because we
can test for the effect of random errors with significance tests; we cannot test
for the effects of bias).

Let’s suppose that at the end of the experiment the experimental group gets a
mean of 16.61 on a fear of HIV scale and the control group gets a mean of
29.67 (where the higher the score, the greater the fear of HIV). These
means support our research hypothesis. But can we be certain that our research
hypothesis is correct? If you’ve been reading various topics on statistics, you
already know that the answer is "no" because of the null hypothesis, which
says that there is no true difference between the means; that is, the difference
was created merely by the chance errors created by random sampling (these
errors are known as sampling errors). Put more simply,
unrepresentative groups may have been assigned to the two conditions
purely by chance.

The t test is often used to test the null hypothesis regarding the observed
difference between two means (to test the null hypothesis regarding the
difference between two medians, the median test is used; it is a specialized
form of the chi-square test).
For the example we are considering, a series of computations (which are
beyond the scope of this paper) would be performed to obtain a value
of t (which, in this case, is 5.38) and a value of degrees of freedom (which, in
this case, is df = 179). These values are not of any special interest to us except
that they are used to get the probability (p) of obtaining a difference at least
this large if the null hypothesis is true. In
this particular case, p is less than .05. Thus, in a research report, you may read a
statement such as this:

"The difference between the means is statistically significant (t = 5.38, df =
179, p < .05)."

The term statistically significant indicates that the null hypothesis has been
rejected. You will recall that when the probability of obtaining a difference this
large under the null hypothesis is .05 or less (such as .01 or .001), we reject the
null hypothesis. When the observed result would be unlikely if the null hypothesis
were true, we reject the null hypothesis.
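
For readers without a statistical package, the probability attached to the reported values (t = 5.38, df = 179) can be approximated directly. The sketch below is one possible approach, not the method used in the source: it integrates the t density numerically with Simpson's rule using only the standard library. The function names `t_pdf` and `two_tailed_p` and the step count are assumptions made for illustration.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (
        math.lgamma((df + 1) / 2)
        - math.lgamma(df / 2)
        - 0.5 * math.log(df * math.pi)
    )
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def two_tailed_p(t, df, steps=2000):
    """Approximate the two-tailed p for t using Simpson's rule (steps even)."""
    upper = abs(t)
    if upper == 0:
        return 1.0
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    # Area between -|t| and |t| is twice the area from 0 to |t|.
    central_area = 2 * total * h / 3
    return max(0.0, 1.0 - central_area)


p = two_tailed_p(5.38, 179)
print(f"p = {p:.2e}; significant at .05: {p < 0.05}")
```

The result is the two-tailed probability of a t at least this far from zero when the null hypothesis is true, which is the quantity compared with the .05 level.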

Having rejected the null hypothesis, we are in a position to assert that our
research hypothesis probably is true (assuming no procedural bias was allowed
to affect the results, such as testing the control group immediately after a major
news story on a celebrity with AIDS, while testing the experimental
group at an earlier time).

What leads a t test to give us a low probability? Three things:

1. Sample size. The larger the sample, the less likely that an observed
difference is due to sampling errors. Large samples provide more precise
information. Thus, when the sample is large, we are more likely to
reject the null hypothesis than when the sample is small.
2. The size of the difference between means. The larger the difference, the
less likely that the difference is due to sampling errors. Thus, when the
difference between the means is large, we are more likely to reject
the null hypothesis than when the difference is small.
3. The amount of variation in the population. When a population is very
heterogeneous (has much variability) there is more potential for
sampling error. Thus, when there is little variation (as indicated by
the standard deviations of the sample), we are more likely to reject
the null hypothesis than when there is much variation.
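
The three influences listed above can be seen by changing one ingredient of the t ratio at a time. The numbers below are hypothetical summary statistics chosen only for illustration, and `t_from_summary` is a hypothetical helper that applies the usual ratio of the mean difference to the standard error of the difference.

```python
import math


def t_from_summary(mean1, mean2, sd1, sd2, n1, n2):
    """t ratio computed from group means, standard deviations, and sizes."""
    se_diff = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (mean1 - mean2) / se_diff


# Hypothetical baseline: means 52 vs. 50, SD 10 in each group, 25 per group.
baseline = t_from_summary(52, 50, 10, 10, 25, 25)
larger_sample = t_from_summary(52, 50, 10, 10, 100, 100)    # factor 1
larger_difference = t_from_summary(55, 50, 10, 10, 25, 25)  # factor 2
less_variation = t_from_summary(52, 50, 5, 5, 25, 25)       # factor 3

for label, value in [
    ("baseline", baseline),
    ("larger sample", larger_sample),
    ("larger difference", larger_difference),
    ("less variation", less_variation),
]:
    print(f"{label:18s} t = {value:.2f}")
```

Each change raises t relative to the baseline, and a larger t corresponds to a lower probability under the null hypothesis.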

A special type of t test is also applied to correlation coefficients. Suppose we
drew a random sample of 50 medical students and correlated their hand size
with their GPAs and got an r of .19. The null hypothesis says that
the true correlation in the population is 0.00, that is, that we got .19 merely as the
result of sampling errors. For this example the t test indicates that p > .05.
Since an r this large would arise by chance more than 5 times in 100 if the null
hypothesis were true, we do not reject the null hypothesis; the correlation
coefficient is not statistically significant. In other words, for n = 50, an r of .19 is not
significantly different from an r of 0.00. When reporting the results of the t test
for the significance of a correlation coefficient, it is better not to mention the
value of t. Rather, it is better to indicate only whether or not the correlation is
significant at a given probability level.
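
The hand-size example can be checked with the standard formula for testing a correlation against zero, t = r * sqrt(n - 2) / sqrt(1 - r^2) with df = n - 2. This formula is not given in the text above; it is supplied here as the usual textbook form, and the function name `t_for_correlation` is hypothetical. The critical value 2.011 is the two-tailed .05 value for df = 48 from a standard t table.

```python
import math


def t_for_correlation(r, n):
    """Return (t, df) for testing whether a correlation differs from zero."""
    if n < 3 or not -1 < r < 1:
        raise ValueError("need n >= 3 and -1 < r < 1")
    df = n - 2
    return r * math.sqrt(df / (1 - r * r)), df


t, df = t_for_correlation(0.19, 50)
CRITICAL_T_DF48 = 2.011  # two-tailed, alpha = .05, df = 48

print(f"t = {t:.2f}, df = {df}")
print("significant at .05:", abs(t) >= CRITICAL_T_DF48)
```

Because the computed t falls short of the critical value, the sketch agrees with the conclusion that p > .05 for r = .19 with n = 50.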