
The Freedom to Vary

First, forget about statistics. Imagine you're a fun-loving person who loves to wear hats. You
couldn't care less what a degree of freedom is. You believe that variety is the spice of life.

Unfortunately, you have constraints. You have only 7 hats. Yet you want to wear a different hat
every day of the week.

On the first day, you can wear any of the 7 hats. On the second day, you can choose from the 6
remaining hats, on day 3 you can choose from 5 hats, and so on.

When day 6 rolls around, you still have a choice between 2 hats that you haven't worn yet that
week. But after you choose your hat for day 6, you have no choice for the hat that you wear on
day 7. You must wear the one remaining hat. You had 7 - 1 = 6 days of "hat freedom", in which
the hat you wore could vary!

That's kind of the idea behind degrees of freedom in statistics. Degrees of freedom are often
broadly defined as the number of "observations" (pieces of information) in the data that are
free to vary when estimating statistical parameters.

Degrees of Freedom: 1-Sample t test


Now imagine you're not into hats. You're into data analysis.

You have a data set with 10 values. If you're not estimating anything, each value can take on
any number, right? Each value is completely free to vary.

But suppose you want to test the population mean with a sample of 10 values, using a 1-
sample t test. You now have a constraint: the estimation of the mean. What is that constraint,
exactly? By definition of the mean, the following relationship must hold: the sum of all values
in the data must equal n × mean, where n is the number of values in the data set.

So if a data set has 10 values, the sum of the 10 values must equal the mean × 10. If the mean
of the 10 values is 3.5 (you could pick any number), this constraint requires that the sum of the
10 values must equal 10 × 3.5 = 35.

With that constraint, the first value in the data set is free to vary. Whatever value it is, it's still
possible for the sum of all 10 numbers to have a value of 35. The second value is also free to
vary, because whatever value you choose, it still allows for the possibility that the sum of all the
values is 35.

In fact, the first 9 values could be anything, including these two examples:

34, -8.3, -37, -92, -1, 0, 1, -22, 99


0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9

But to have all 10 values sum to 35, and have a mean of 3.5, the 10th value cannot vary. It must
be a specific number:

34, -8.3, -37, -92, -1, 0, 1, -22, 99 -> 10th value must be 61.3
0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9 -> 10th value must be 30.5
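This is easy to check numerically. Here is a minimal Python sketch (the function name is made up for illustration) that solves for the forced 10th value, given the target mean and the first nine values:

```python
def forced_last_value(first_values, mean, n):
    """Return the only value that gives a sample of size n the required mean.

    The constraint is: sum of all n values must equal n * mean,
    so the last value is forced to be n * mean minus the sum of the rest.
    """
    return n * mean - sum(first_values)

nine_a = [34, -8.3, -37, -92, -1, 0, 1, -22, 99]
nine_b = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]

print(forced_last_value(nine_a, 3.5, 10))  # approximately 61.3
print(forced_last_value(nine_b, 3.5, 10))  # approximately 30.5
```

However you fill in the first nine slots, the function always returns exactly one answer for the tenth, which is the whole point: nine values are free, one is not.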

Therefore, you have 10 - 1 = 9 degrees of freedom. It doesn't matter what sample size you use,
or what mean value you use: the last value in the sample is not free to vary. You end up
with n - 1 degrees of freedom, where n is the sample size.

Another way to say this is that the number of degrees of freedom equals the number of
"observations" minus the number of required relations among the observations (e.g., the
number of parameter estimates). For a 1-sample t-test, one degree of freedom is spent
estimating the mean, and the remaining n - 1 degrees of freedom estimate variability.
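As a concrete sketch of that accounting, the t statistic and its n - 1 degrees of freedom can be computed with Python's standard library alone (the sample and hypothesized mean below are invented for the example):

```python
import math
import statistics

def one_sample_t(sample, hypothesized_mean):
    """Return (t statistic, degrees of freedom) for a 1-sample t test."""
    n = len(sample)
    mean = statistics.mean(sample)
    s = statistics.stdev(sample)  # sample std dev: divides by n - 1, since one
                                  # degree of freedom was spent on the mean
    t = (mean - hypothesized_mean) / (s / math.sqrt(n))
    return t, n - 1

t, df = one_sample_t([3, 4, 5, 6, 7], hypothesized_mean=4)
print(round(t, 3), df)  # 1.414 4
```

Note that `statistics.stdev` already builds the "minus one" into its divisor: variability is estimated from the n - 1 observations that remain free after the mean is pinned down.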

The degrees of freedom then define the specific t-distribution that's used to calculate the p-
values and t-values for the t-test.

Notice that for small sample sizes (n), which correspond with smaller degrees of freedom (n - 1
for the 1-sample t test), the t-distribution has fatter tails. This is because the t distribution was
specially designed to provide more conservative test results when analyzing small samples
(such as in the brewing industry). As the sample size (n) increases, the number of degrees of
freedom increases, and the t-distribution approaches a normal distribution.
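Both claims, fatter tails at low degrees of freedom and convergence to the normal as the degrees of freedom grow, can be checked directly from the density formulas using only Python's standard library. This is a sketch for illustration, not a substitute for a statistics package:

```python
import math

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def normal_pdf(x):
    """Density of the standard normal distribution."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

# Fatter tails: with few degrees of freedom, the t density out at x = 3
# is well above the normal density at the same point.
print(t_pdf(3, 4) > normal_pdf(3))  # True

# Convergence: with many degrees of freedom, the t density is nearly normal.
print(abs(t_pdf(0, 100) - normal_pdf(0)))  # a small difference, near 0.001
```

The extra tail mass is exactly what makes small-sample t-tests more conservative: extreme t-values are treated as less surprising than they would be under a normal distribution.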
