statistics: parametric and nonparametric tests. Which type you

use for your data depends on the type of measurement scale

used and how your collected data are distributed.

T-Test Assumptions

simplest analyses used in dissertation and thesis research. Prior

to using the t-test, you must make sure that your data does not

violate any of the three assumptions underlying the t-test:

population under study.

The distribution of the mean of your sample is normal.

The variances of the different groups studied are very similar.

committing a Type I error more or less often than the alpha

probability you set (either .01 or .05). This bias may undermine

the value of the t-test, and therefore, the results of your

dissertation.

Types of T-Tests

The t-test is used when your data has only two levels of the

independent variable. There is a t-test for dissertations

involving experimental designs with randomized groups

(independent samples), and another t-test for dissertations with

experimental designs involving correlated groups (matched

pairs or within-subjects designs). Knowing what kind of sample

you have is key to selecting the appropriate t-test for your

analyses.

people. If you obtained your subjects from multiple locations

and assigned each person to be in one group or the other

randomly, say through the use of a random numbers table, then

you would use the t-test for independent samples in your

analysis. If, however, your dissertation measures the same

students twice, or pairs each student with a matched partner, in

an undergraduate introductory psychology course at your

school, you would use the t-test for correlated samples in

your analysis.

Independent Samples

two random groups of people, an experimental group and a

control group. You are examining whether seeing a recording

artist's face influences how people rate his/her song. All of your

subjects listen to the same song. The experimental group sees

the artist's face before hearing the song, while the control

group does not. You then collect data from the two groups

about how well they liked the song on a scale of 1-7. For your

analysis, you compute the mean of each group and find that the

experimental group's mean is 5.9, while the control group's

mean is 4.6.

For this analysis, you would use the t-test for independent

means. The crux of your paper is determining whether the 1.3

difference between these means is a statistically reliable

difference or if the means are different because of sampling

error.
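
The comparison described above can be sketched in a few lines of Python. This is only an illustration: it assumes the classic pooled-variance form of the independent-samples t-test, the function name independent_t is made up for this example, and the ratings are hypothetical numbers rather than data from any study.

```python
from math import sqrt
from statistics import mean, variance

def independent_t(group_a, group_b):
    """Return (t, df) for two independent samples using the pooled variance."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    # Pooled variance: the two sample variances weighted by their degrees of freedom.
    pooled = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    std_error = sqrt(pooled * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / std_error, df

# Hypothetical 1-7 ratings, invented for illustration only.
saw_face = [6, 7, 5, 6, 6, 5, 7, 6]
no_face = [5, 4, 5, 4, 6, 4, 5, 4]

t_value, df = independent_t(saw_face, no_face)
print(f"t = {t_value:.2f}, df = {df}")
```

The t value by itself does not settle the question; it still has to be compared with a critical value or converted to a probability by a statistical package.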

Correlated Samples

Using the above example, let's say your work involved one

group of subjects, but each subject listened to the song first,

without seeing the artist's face, then rated how much they liked

it. Then, the same subject saw the artist's face and listened to

the song again. For your analysis, you would use the t-test for

correlated samples, because each person in your sample provided

two ratings. The ratings for this sample are

correlated, because each pair came from the same individual. This

type of experimental design is called a "within-subjects" design.
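
A within-subjects comparison like this one is usually computed on each subject's change score. The sketch below assumes that standard approach; the function name paired_t and the ratings are hypothetical and serve only to show the arithmetic.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(first, second):
    """Return (t, df) for two sets of scores taken from the same subjects."""
    if len(first) != len(second):
        raise ValueError("each subject needs exactly two scores")
    # Work with each subject's change score (second rating minus first rating).
    diffs = [b - a for a, b in zip(first, second)]
    n = len(diffs)
    std_error = stdev(diffs) / sqrt(n)
    return mean(diffs) / std_error, n - 1

# Hypothetical 1-7 ratings from the same six subjects, invented for illustration.
before_face = [4, 5, 3, 5, 4, 4]
after_face = [5, 6, 5, 5, 6, 5]

t_value, df = paired_t(before_face, after_face)
print(f"t = {t_value:.2f}, df = {df}")
```

Note that the degrees of freedom here are the number of subjects minus one, because the test is run on a single set of difference scores.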

Once you have calculated the t-score for your groups, you

need to know whether this t value is large enough to

conclude that the difference you found between the two groups

is significant. Most statistical packages used for analyses (SPSS,

etc.) will provide a p-value for you. If, for your dissertation,

you have set your significance level (alpha) at .05, any p-value smaller

than this means that you have significant findings. Dissertation

committees and dissertation chairs love significant results!

the degrees of freedom for your sample. For the first example

given (the between-subjects design), the degrees of freedom is

the number of subjects minus two (N-2). You can then use a

t-test table, found in most statistics books, to determine the

"critical value" of t. If the absolute t value you obtained in your

sample is greater than or equal to the critical value in the table

matching the degrees of freedom for your sample, then your two

groups' means are statistically different at

the alpha level you set.

you have for your sample, you can use the next lowest degrees

of freedom in the table that you have. This strategy is

particularly appropriate if your sample is very large and the

degrees of freedom for your t-test is quite a bit larger than

those found in a table. Another option is to use a statistical

calculator to see if the t-value you got for your results is

significant.
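
The table lookup, including the fallback to the next lowest degrees of freedom, might be sketched as follows. The table is a deliberately abridged set of two-tailed critical values at alpha = .05 of the kind printed in statistics textbooks; the names CRITICAL_T_05, critical_t, and is_significant are illustrative assumptions, not part of any package.

```python
# Abridged two-tailed critical values of t at alpha = .05.
CRITICAL_T_05 = {10: 2.228, 20: 2.086, 30: 2.042, 40: 2.021, 60: 2.000, 120: 1.980}

def critical_t(df, table=CRITICAL_T_05):
    """Return the critical value for the largest tabled df not exceeding df."""
    usable = [row for row in table if row <= df]
    if not usable:
        raise ValueError("df is smaller than every row in this abridged table")
    return table[max(usable)]

def is_significant(t_value, df):
    # Compare the absolute value so that negative t-values are handled too.
    return abs(t_value) >= critical_t(df)

print(critical_t(150))            # falls back to the df = 120 row
print(is_significant(2.05, 45))   # falls back to the df = 40 row
```

Falling back to a lower row is conservative: the critical value there is slightly larger, so a result that passes would also pass at the exact degrees of freedom.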

with only two levels of the independent variable, the t-test may

be the ideal statistic for your analysis. Choose the appropriate

t-test for your analysis based on whether your samples are

independent or correlated.

The T-Test

The t-test assesses whether the means of two groups are statistically different from each other. This

analysis is appropriate whenever you want to compare the means of two groups, and especially appropriate

Figure 1. Idealized distributions for treated and comparison group posttest values.

Figure 1 shows the distributions for the treated (blue) and control (green) groups in a study. Actually, the

figure shows the idealized distribution -- the actual distribution would usually be depicted with a histogram or

bar graph. The figure indicates where the control and treatment group means are located. The question the

What does it mean to say that the averages for two groups are statistically different? Consider the three

situations shown in Figure 2. The first thing to notice about the three situations is that the difference

between the means is the same in all three. But, you should also notice that the three situations don't look

the same -- they tell very different stories. The top example shows a case with moderate variability of scores

within each group. The second situation shows the high variability case. The third shows the case with low

variability. Clearly, we would conclude that the two groups appear most different or distinct in the bottom or

low-variability case. Why? Because there is relatively little overlap between the two bell-shaped curves. In

the high variability case, the group difference appears least striking because the two bell-shaped

This leads us to a very important conclusion: when we are looking at the differences between scores for two

groups, we have to judge the difference between their means relative to the spread or variability of their

Statistical Analysis of the t-test

The formula for the t-test is a ratio. The top part of the ratio is just the difference between the two means or

averages. The bottom part is a measure of the variability or dispersion of the scores. This formula is

essentially another example of the signal-to-noise metaphor in research: the difference between the means

is the signal that, in this case, we think our program or treatment introduced into the data; the bottom part of

the formula is a measure of variability that is essentially noise that may make it harder to see the group

difference. Figure 3 shows the formula for the t-test and how the numerator and denominator are related to

the distributions.

The top part of the formula is easy to compute -- just find the difference between the means. The bottom

part is called the standard error of the difference. To compute it, we take the variance for each group and

divide it by the number of people in that group. We add these two values and then take their square root.

Figure 4. Formula for the Standard error of the difference between the means.

Remember that the variance is simply the square of the standard deviation.
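
Written out from the verbal description above, the two quantities would look roughly like this. The subscripts T (treated) and C (control) and the symbol SE are notation assumed here for illustration; the original figures may label things differently.

```latex
SE(\bar{X}_T - \bar{X}_C) = \sqrt{\frac{\mathrm{var}_T}{n_T} + \frac{\mathrm{var}_C}{n_C}},
\qquad
t = \frac{\bar{X}_T - \bar{X}_C}{SE(\bar{X}_T - \bar{X}_C)}
```

Here var is each group's variance (the squared standard deviation) and n is the number of people in that group.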

The t-value will be positive if the first mean is larger than the second and negative if it is smaller. Once you

compute the t-value you have to look it up in a table of significance to test whether the ratio is large enough

to say that the difference between the groups is not likely to have been a chance finding. To test the

significance, you need to set a risk level (called the alpha level). In most social research, the "rule of thumb"

is to set the alpha level at .05. This means that five times out of a hundred you would find a statistically

significant difference between the means even if there were none (i.e., by "chance"). You also need to

determine the degrees of freedom (df) for the test. In the t-test, the degrees of freedom is the sum of the

persons in both groups minus 2. Given the alpha level, the df, and the t-value, you can look the t-value up in

a standard table of significance (available as an appendix in the back of most statistics texts) to determine

whether the t-value is large enough to be significant. If it is, you can conclude that the means for

the two groups are different (even given the variability). Fortunately, statistical computer

programs routinely print the significance test results and save you the trouble of looking them up in a table.
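
For readers who want to see the whole procedure in one place, here is a minimal sketch that works from summary statistics. It follows the steps described above (standard error of the difference, t-value, degrees of freedom, comparison with a tabled value). The function name t_from_summary and the numbers fed to it are hypothetical.

```python
from math import sqrt

def t_from_summary(mean_1, var_1, n_1, mean_2, var_2, n_2):
    """Return (t, df) from two groups' means, variances, and sizes."""
    # Standard error of the difference: each variance divided by its group
    # size, the two results added, then the square root taken.
    std_error = sqrt(var_1 / n_1 + var_2 / n_2)
    t_value = (mean_1 - mean_2) / std_error
    df = n_1 + n_2 - 2
    return t_value, df

# Hypothetical summary statistics, invented for illustration.
t_value, df = t_from_summary(mean_1=52.0, var_1=16.0, n_1=25,
                             mean_2=50.0, var_2=9.0, n_2=25)

# Tabled two-tailed value for alpha = .05 at df = 40, the nearest lower row to df = 48.
CRITICAL_T = 2.021
print(f"t = {t_value:.2f}, df = {df}, significant: {abs(t_value) >= CRITICAL_T}")
```

In practice a statistical package would report an exact probability instead of relying on a single tabled value.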

hypothesis is true. The researcher sets the probability level.

Suppose, for our example, we use a significance test and

find that a difference as large as the one we observed would

occur less than 5 times in 100 if the null hypothesis were

true. This would be stated as p < .05, where p stands for

probability. Researchers should always state the probability

level used in their research findings. In principle it can be

set to any value between 0 and 1; however, we play it safe by

setting it to .05 (the most widely used convention). Of

course, if a result would occur less than 5 times in 100 when

the null hypothesis is true, it’s a good bet that the null

hypothesis is not true. If it’s probably not

true, we reject the null hypothesis, leaving us with only the

first two explanations that we started with as viable

explanations for the difference.

There is no rule of nature that dictates at what probability

level the null hypothesis should be rejected. However,

conventional wisdom suggests that .05 or less (such as .01

or .001) is reasonable.

When we fail to reject the null hypothesis because the

probability is greater than .05, we do just that: We "fail to

reject" the null hypothesis and it stays on our list of

possible explanations; we never "accept" the null

hypothesis as the only explanation. Remember, there are

three possible explanations (see above) and failing to reject

one of them does not mean that you are accepting it as the

only explanation.

An alternative way to say that we have rejected the null

hypothesis is to state that the difference is statistically

significant. Thus, if we state that a difference is statistically

significant at the .05 level (meaning .05 or less), it is

equivalent to stating that the null hypothesis has been

rejected at that level.
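
The decision rule and its wording can be captured in a small helper. This is only a sketch: the function name significance_decision is hypothetical, and the default of .05 simply mirrors the convention discussed above.

```python
def significance_decision(p_value, alpha=0.05):
    """Return the conventional wording for the outcome of a significance test."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must lie between 0 and 1")
    if p_value <= alpha:
        return f"reject the null hypothesis (statistically significant at the {alpha} level)"
    # A large p never lets us "accept" the null hypothesis; it only stays
    # on the list of possible explanations.
    return "fail to reject the null hypothesis"

print(significance_decision(0.03))
print(significance_decision(0.20))
```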

When you read research reported in academic journals, or

research papers, you will find that the null hypothesis is

seldom stated by researchers, who assume that you know

that the sole purpose of a significance test is to test a null

hypothesis. Instead, researchers tell you which differences

were tested for significance, which significance test they

used, and which differences were found to be statistically

significant. It is more common to find null hypotheses

stated in theses and dissertations since committee members

may wish to make sure that the students they are

supervising understand the reason they have conducted a

significance test.

Suppose we have a research hypothesis that says "medical research

investigators who take a short course on the causes of HIV will be less fearful

of the disease than research investigators who have not taken the course," and

test it by conducting an experiment in which a random sample of research

investigators is assigned to take the course and another random sample is

designated as the control group (note: random sampling is preferred, because it

precludes any bias in the assignment of subjects to the groups and because we

can test for the effect of random errors with a significance test; we cannot test

for the effects of bias).

Let’s suppose that at the end of the experiment the experimental group gets a

mean of 16.61 on a fear of HIV scale and the control group gets a mean of

29.67 (where the higher the score, the greater the fear of HIV). These

means support our research hypothesis. But can we be certain that our research

hypothesis is correct? If you’ve been reading various topics on statistics, you

already know that the answer is "no" because of the null hypothesis, which

says that there is no true difference between the means; that is, the difference

was created merely by the chance errors created by random sampling (these

errors are known as sampling errors). Put more simply,

unrepresentative groups may have been assigned to the two conditions

purely by chance.

The t test is often used to test the null hypothesis regarding the observed

difference between two means (to test the null hypothesis between

two medians, the median test is used; it is a specialized form of chi square test).

For the example we are considering, a series of computations (which are

beyond the scope of this paper) would be performed to obtain a value

of t (which, in this case, is 5.38) and a value of degrees of freedom (which, in

this case, is df = 179). These values are not of any special interest to us except

that they are used to get the probability (p) of obtaining a difference this large

if the null hypothesis were true. In this particular case, p is less than .05. Thus,

in a research report, you may read a

statement such as this:

"The difference between the means is statistically significant (t = 5.38, df =

179, p < .05)."

The term statistically significant indicates that the null hypothesis has been

rejected. You will recall that when a difference as large as the one observed

would occur with a probability of .05 or less (such as .01 or .001) if the null

hypothesis were true, we reject the null hypothesis. When a result is that

unlikely under the null hypothesis,

we reject the null hypothesis.

Having rejected the null hypothesis, we are in a position to assert that our

research hypothesis probably is true (assuming no procedural bias was allowed

to affect the results, such as testing the control group immediately after a major

news story on a celebrity with AIDS, while testing the experimental

group at an earlier time).

1. Sample size. The larger the sample, the less likely that an observed

difference is due to sampling errors. Large samples provide more precise

information. Thus, when the sample is large, we are more likely to

reject the null hypothesis than when the sample is small.

2. The size of the difference between means. The larger the difference, the

less likely that the difference is due to sampling errors. Thus, when the

difference between the means is large, we are more likely to reject

the null hypothesis than when the difference is small.

3. The amount of variation in the population. When a population is very

heterogeneous (has much variability) there is more potential for

sampling error. Thus, when there is little variation (as indicated by

the standard deviations of the sample), we are more likely to reject

the null hypothesis than when there is much variation.
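
The three factors listed above can be seen at work in a small numerical sketch. It assumes the simplest case, two equal-sized groups with the same standard deviation, and every number in it is hypothetical. The function name t_equal_groups is made up for this example.

```python
from math import sqrt

def t_equal_groups(mean_diff, sd, n_per_group):
    """t for two equal-sized groups that share the same standard deviation."""
    std_error = sqrt(2 * sd ** 2 / n_per_group)
    return mean_diff / std_error

baseline = t_equal_groups(mean_diff=2.0, sd=5.0, n_per_group=25)
larger_sample = t_equal_groups(mean_diff=2.0, sd=5.0, n_per_group=100)
larger_difference = t_equal_groups(mean_diff=4.0, sd=5.0, n_per_group=25)
less_variation = t_equal_groups(mean_diff=2.0, sd=2.5, n_per_group=25)

for label, t_value in [("baseline", baseline),
                       ("larger sample", larger_sample),
                       ("larger difference", larger_difference),
                       ("less variation", less_variation)]:
    print(f"{label:18s} t = {t_value:.2f}")
```

Each change on its own raises t relative to the baseline, which is why each one makes rejecting the null hypothesis more likely.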

drew a random sample of 50 medical students and correlated their hand size

with their GPAs and got an r of .19. The null hypothesis says that

the true correlation in the population is 0.00 - that we got .19 merely as the

result of sampling errors. For this example the t test indicates that p > .05.

Since a correlation this large would occur more than 5 times in 100 by chance alone, we

do not reject the null hypothesis; we have a statistically nonsignificant

correlation coefficient. In other words, for n = 50, an r of .19 is not

significantly different from an r of 0.00. When reporting the results of the t test

for the significance of a correlation coefficient, it is better not to mention the

value of t. Rather, it is better to indicate only whether or not the correlation is

significant at a given probability level.
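
For the curious, the t value behind this conclusion can be reproduced with the standard formula for testing a Pearson correlation against zero, t = r * sqrt(n - 2) / sqrt(1 - r^2). The formula is not given in the text above, so treat this as a supplementary sketch. The function name t_for_correlation is hypothetical.

```python
from math import sqrt

def t_for_correlation(r, n):
    """Return (t, df) for testing whether a Pearson r differs from zero."""
    df = n - 2
    return r * sqrt(df) / sqrt(1 - r ** 2), df

t_value, df = t_for_correlation(r=0.19, n=50)

# Tabled two-tailed value for alpha = .05 at df = 40, the nearest lower row to df = 48.
CRITICAL_T = 2.021
print(f"t = {t_value:.2f}, df = {df}, significant: {abs(t_value) >= CRITICAL_T}")
```

The computed t falls well short of the critical value, which agrees with the p > .05 result reported for this example.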

