Learning Goals:
1. Understand that if $\sigma$ is unknown, the distribution of the standardized $\bar{X}$ is not Z.
2. Learn to use the t-distribution chart.
3. Learn to calculate a confidence interval on your calculator using given formulas.
It might seem reasonable to calculate the data standard deviation and use it in
place of $\sigma$ in our confidence interval formula. It turns out that when we replace
the population standard deviation with the data standard deviation, the random variable
$$\frac{\bar{X} - \mu}{S/\sqrt{n}}$$
does not follow the Z-distribution.
The distribution of this random variable was developed by William Gosset. Thus, we
can get the desired axis numbers that trap any probability of interest. This will
allow us to determine confidence intervals for $\mu$ when $\sigma$ is unknown. This
distribution was named the t-distribution.
It turns out that the distribution of this random variable changes as the sample size
changes. Thus, there are many t-distributions. In our t-table we will be given the
axis numbers needed to determine the most commonly used confidence intervals.
Since the t-curves change with the sample size we will need a notation that
indicates which t-curve we are referring to. When we have a t-distribution that is
used with a sample of size n, we will call it the t-distribution with $n-1$ degrees
of freedom (or $t_{n-1}$). The reason for this name is not important to us and we will
not worry why it is named this way.
It turns out that if we use our z-chart numbers and replace $\sigma$ with s in our
z-confidence interval formula, we will not trap the mean 95% of the time. This is bad
since we are making decisions based on the fact we have 95% confidence in our
interval. The table below shows how we do not trap the mean 95% of the time. We trap
it less than expected, much less for small n.

Sample Size    % Times True Mean is Trapped
4              85.5%
8              90.9%
15             93.0%
20             93.5%
30             94.0%
40             94.3%
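
The shortfall in the table can be checked by simulation. The sketch below is only an illustration under stated assumptions: the function name `z_interval_coverage`, the standard normal population (mean 0, standard deviation 1), the z-chart value 1.96, the number of trials, and the seed are all choices made here, not part of the original notes. It builds the z-style interval with s in place of $\sigma$ many times and counts how often the true mean is trapped.

```python
import math
import random

def z_interval_coverage(n, trials=10000, mu=0.0, sigma=1.0, z=1.96, seed=1):
    """Estimate how often x_bar +/- z * s / sqrt(n) traps the true mean mu."""
    rng = random.Random(seed)          # fixed seed so the run is repeatable
    trapped = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        x_bar = sum(sample) / n
        # Data standard deviation (divides by n - 1), used in place of sigma.
        s = math.sqrt(sum((x - x_bar) ** 2 for x in sample) / (n - 1))
        margin = z * s / math.sqrt(n)
        if x_bar - margin <= mu <= x_bar + margin:
            trapped += 1
    return trapped / trials

# Same sample sizes as the table; results are estimates and vary with the seed.
for n in (4, 8, 15, 20, 30, 40):
    print(n, f"{100 * z_interval_coverage(n):.1f}%")
```

Because these are simulated estimates, the printed percentages should land near the table values rather than match them exactly.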
[Figure: Densities of Z, t3, t7, and t14. Vertical axis: Density (0.0 to 0.4); horizontal axis from -3.6 to 3.6.]
A graph of a few of the t-distributions along with the z-distribution shows us that as
the degrees of freedom increase, the given t-distribution looks more like the
standard normal (Z) distribution. We can see that the t-curves look bell shaped, but
they are not normally distributed.
Our new formula for a $(1-\alpha)100\%$ confidence interval for the population mean $\mu$,
when $\sigma$ is unknown, will not use axis numbers from the z-distribution. Instead the
new formula will use axis numbers from the t-distribution with n-1 degrees of
freedom.
$$\left( \bar{x} - t_{\alpha/2,\,n-1}\frac{s}{\sqrt{n}},\ \ \bar{x} + t_{\alpha/2,\,n-1}\frac{s}{\sqrt{n}} \right)$$
This formula is as easy to use as our last confidence interval formula. We use our
t-chart or t-table to determine the proper chart value to use in the formula. Go down
to the proper row, $n-1$. Read over to the proper column and use that value in the
formula above.
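
As a rough sketch of how the calculation goes once the chart value is in hand, the function below computes $\bar{x} \pm t \cdot s/\sqrt{n}$. The function name `t_confidence_interval` and the summary statistics in the usage lines are hypothetical, chosen only for illustration. The chart value 2.228 is $t_{.025,10}$ as read from a standard t-table and is supplied by the caller, just as it would be looked up by hand.

```python
import math

def t_confidence_interval(x_bar, s, n, t_value):
    """Return (lower, upper) for x_bar +/- t_value * s / sqrt(n).

    t_value is the chart value t_{alpha/2, n-1}, looked up by the caller.
    """
    if n < 2:
        raise ValueError("need at least 2 data values")
    margin = t_value * s / math.sqrt(n)
    return (x_bar - margin, x_bar + margin)

# Hypothetical summary statistics (not from the notes): n = 11, x_bar = 50.0, s = 4.0.
# 2.228 is t_{.025,10} from a standard t-table (row 10, column t.025).
lower, upper = t_confidence_interval(50.0, 4.0, 11, 2.228)
print(f"95% CI for the mean: ({lower:.3f}, {upper:.3f})")
```

Passing the t-value in as an argument keeps the sketch free of any statistics library and mirrors the chart-then-calculator workflow described here.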
Note that this is a Right Sided Chart. That is, this chart tells us how much area
(probability) is to the right of the axis value. To properly use these charts, the
population needs to be normally distributed. In this course, we will assume a
normal population for all problems and examples. Therefore, I do not
need to state normality.
How to use the t-table: Suppose that we collected 11 pieces of data. Then we
have $11 - 1 = 10$ degrees of freedom for this problem. Suppose that we are
constructing a 95% confidence interval. Then we need the t-value with .025 to the
right: $t_{.025,10}$

Df   t.10      t.05      t.025     t.01      t.005
2    1.88562   2.91999   4.30265   6.96456   9.92484
3    1.63774   2.35336   3.18245   4.54070   5.84091
:    :         :         :         :         :

What value would we want to look up if we wanted a 98% confidence interval and had
12 pieces of data?
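
The bookkeeping behind a table lookup (which row, which column) can be sketched in a few lines. The helper name `t_table_lookup` is hypothetical. It only computes the degrees of freedom and the right-tail area for a two-sided interval, and it does not contain the table itself.

```python
def t_table_lookup(confidence, n):
    """Return (degrees_of_freedom, right_tail_area) for a two-sided interval.

    confidence is a proportion such as 0.95; n is the number of data values.
    """
    if not 0 < confidence < 1:
        raise ValueError("confidence must be between 0 and 1")
    if n < 2:
        raise ValueError("need at least 2 data values")
    alpha = 1 - confidence
    # Half of alpha goes in each tail; round to hide floating-point noise.
    return (n - 1, round(alpha / 2, 6))

# The case worked in the text: 11 pieces of data, 95% confidence.
df, tail = t_table_lookup(0.95, 11)
print(f"row Df = {df}, column with {tail:g} to the right")
```

The same call with a different confidence level and sample size can be used to check an answer to the question above.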
Learning Goals:
1. Understand the importance of knowing the population standard deviation.
2. Learn to use the Chi-Square distribution chart.
3. Learn to calculate a confidence interval for $\sigma$ on your calculator using given
formulas.
Suppose that we are sampling from a normal distribution and wish to determine a
confidence interval for $\sigma$, the population standard deviation. This type of
confidence interval is very useful in manufacturing. A manufacturer should always
be interested in process variation. Recall that the population standard deviation is
our measure of variation.
For example, if we get on and off an electronic scale, what will the variation in
measurements be? After all, our weight has not changed between measurements.
Companies that make such products need to know the variation in the
measurements. If the variation is very low the user can take one measurement and
be confident in the reading. If the variation is high, the user will need to take
several measurements and take the average. Certainly this would be an
inconvenience.
The distribution of the random variable $\frac{(n-1)S^2}{\sigma^2}$ has been determined by the math
geeks. This allows us to use tables to determine what is rare and not rare, which
allows us to create a confidence interval formula for $\sigma$.
As with the t-distributions, the distributions used here also depend on the sample
size. So there are many of these distributions.
Our name for these distributions is the Chi-Square distributions with n-1 degrees of
freedom ($\chi^2_{n-1}$). Again we denote which particular distribution with the subscript n-1.
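For example, with 15 degrees of freedom (n = 16), the chart values 6.262 and 27.488
trap 95% of the probability:
$$P\left( 6.262 < \frac{(n-1)S^2}{\sigma^2} < 27.488 \right) = .95$$
Using algebra to isolate $\sigma^2$ and then taking the square root of all three terms gives
$$P\left( \sqrt{\frac{(n-1)S^2}{27.488}} < \sigma < \sqrt{\frac{(n-1)S^2}{6.262}} \right) = .95$$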
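
For readers who want to see the algebra, the steps can be written out as follows. This is a sketch of the standard manipulation, using the same chart values 6.262 and 27.488. The key point is that taking reciprocals of positive quantities reverses the inequalities, which is why the larger chart value ends up in the lower endpoint.

```latex
\begin{align*}
6.262 &< \frac{(n-1)S^2}{\sigma^2} < 27.488 \\
\frac{1}{27.488} &< \frac{\sigma^2}{(n-1)S^2} < \frac{1}{6.262}
  && \text{(reciprocals reverse the inequalities)} \\
\frac{(n-1)S^2}{27.488} &< \sigma^2 < \frac{(n-1)S^2}{6.262}
  && \text{(multiply through by $(n-1)S^2$)} \\
\sqrt{\frac{(n-1)S^2}{27.488}} &< \sigma < \sqrt{\frac{(n-1)S^2}{6.262}}
  && \text{(take square roots)}
\end{align*}
```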
Some books do not take the square root. Those books have found a confidence
interval for the population variance $\sigma^2$. Since we constantly refer to the standard
deviation, we will be using the formula with the square root and end up with a
confidence interval for the population standard deviation.
Of course, we will not need to perform this algebraic manipulation; we will just use
the formula below. Our general $(1-\alpha)100\%$ confidence interval for $\sigma$ when
sampling from a normal population is:
$$\left( \sqrt{\frac{(n-1)s^2}{\chi^2_{\alpha/2,\,n-1}}},\ \ \sqrt{\frac{(n-1)s^2}{\chi^2_{1-\alpha/2,\,n-1}}} \right)$$
Example: A new instrument used to measure pressure is being considered by a
company. The company wishes to determine a confidence interval for the standard
deviation of the instrument. That is, determine the standard deviation when the
instrument measures the same object repeatedly. The technician takes 17
measurements of a single object. The mean of the measurements is 16.85 and the
standard deviation of the measurements is .54. Determine a 95% confidence
interval for the population standard deviation.
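
One way to set up this example is sketched below. The function name `sigma_confidence_interval` is hypothetical, and the two chart values for 16 degrees of freedom (28.845 with .025 to the right, 6.908 with .975 to the right) are assumptions read from a standard Chi-Square chart rather than values given in these notes.

```python
import math

def sigma_confidence_interval(s, n, chi2_upper, chi2_lower):
    """Return (lower, upper) for the population standard deviation.

    chi2_upper: chart value with alpha/2 to its right (the larger value).
    chi2_lower: chart value with 1 - alpha/2 to its right (the smaller value).
    Both use n - 1 degrees of freedom and are looked up by the caller.
    """
    if not 0 < chi2_lower < chi2_upper:
        raise ValueError("expected 0 < chi2_lower < chi2_upper")
    numerator = (n - 1) * s ** 2
    # Dividing by the larger chart value gives the lower endpoint.
    return (math.sqrt(numerator / chi2_upper), math.sqrt(numerator / chi2_lower))

# n = 17 and s = .54 come from the example; the sample mean (16.85) is not needed.
# Assumed chart values for 16 degrees of freedom: 28.845 and 6.908.
lower, upper = sigma_confidence_interval(0.54, 17, 28.845, 6.908)
print(f"95% CI for sigma: ({lower:.3f}, {upper:.3f})")
```

Note that the mean of the measurements plays no role in the interval; only the sample size and the data standard deviation are used.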
Example: Determine a 98% confidence interval for $\sigma$, given that a sample of 15
pieces of data had a mean of 28.4 and a standard deviation of .21.
Suppose that we have two populations and wish to compare the means of these
populations as given in the previous examples.
In this course we will restrict ourselves to the case where we assume the two
populations have the same, but unknown, standard deviation. The purpose of this
is to limit the number of needed formulas. In the real world of data analysis, we will
be using the computer, so formulas will not be a problem.
With either a confidence interval or a hypothesis test, the first thing we will need to
do is estimate $\sigma$, the common standard deviation of the two populations. This is
done by pooling the individual standard deviations of each population.
Given a sample size and standard deviation of $n_1$ and $s_1$ from the first population
and a sample size and standard deviation of $n_2$ and $s_2$ from the second population,
our formula is:
$$S_p = \sqrt{\frac{(n_1-1)s_1^2 + (n_2-1)s_2^2}{n_1+n_2-2}}$$
Example: A sample of 5 pieces of data is taken from the first population. The
standard deviation of the data is 3.82. A sample of 8 pieces of data is taken from
the second population. The standard deviation of this data set is 2.41. Determine
the pooled standard deviation.
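
A minimal sketch of this calculation follows, assuming the usual pooled formula $S_p = \sqrt{\frac{(n_1-1)s_1^2 + (n_2-1)s_2^2}{n_1+n_2-2}}$, in which each sample variance is weighted by its degrees of freedom. The function name `pooled_std_dev` is hypothetical.

```python
import math

def pooled_std_dev(n1, s1, n2, s2):
    """Pool two sample standard deviations, weighting each variance by n - 1."""
    if n1 < 2 or n2 < 2:
        raise ValueError("each sample needs at least 2 data values")
    pooled_variance = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2)
    return math.sqrt(pooled_variance)

# Values from the example: n1 = 5, s1 = 3.82, n2 = 8, s2 = 2.41.
s_p = pooled_std_dev(5, 3.82, 8, 2.41)
print(f"S_p = {s_p:.4f}")
```

The pooled value always falls between the two individual standard deviations, and it sits closer to the one from the larger sample.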
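Suppose that we sample from two populations with means $\mu_1$ and $\mu_2$. Our
$(1-\alpha)100\%$ confidence interval for the difference in means, $\mu_1 - \mu_2$, is given by
the formula
$$\left( (\bar{x}_1 - \bar{x}_2) - t_{\alpha/2}\, S_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}},\ \ (\bar{x}_1 - \bar{x}_2) + t_{\alpha/2}\, S_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}} \right)$$
where $t_{\alpha/2}$ is taken from the t-table. The degrees of freedom used is $n_1 + n_2 - 2$.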
Example: A sample of 5 pieces of data is taken from the first population. The
standard deviation of the data is 3.82. The mean of the data is 1245. A sample of 8
pieces of data is taken from the second population. The standard deviation of this
data set is 2.41. The mean of the data is 1251. Determine a 95% confidence
interval for $\mu_1 - \mu_2$.
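
The pieces of this example can be put together as sketched below. The function name `pooled_mean_difference_interval` is hypothetical, the pooled standard deviation is computed with the usual degrees-of-freedom weighting, and the chart value 2.201 is $t_{.025,11}$ as read from a standard t-table (an assumption here, since that row does not appear in the excerpt of the table shown earlier).

```python
import math

def pooled_mean_difference_interval(x1_bar, s1, n1, x2_bar, s2, n2, t_value):
    """Return (lower, upper) for mu1 - mu2, assuming equal population sigmas.

    t_value is the chart value t_{alpha/2} with n1 + n2 - 2 degrees of freedom,
    looked up by the caller.
    """
    df = n1 + n2 - 2
    # Pooled standard deviation: variances weighted by their degrees of freedom.
    s_p = math.sqrt(((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / df)
    margin = t_value * s_p * math.sqrt(1 / n1 + 1 / n2)
    diff = x1_bar - x2_bar
    return (diff - margin, diff + margin)

# Values from the example; df = 5 + 8 - 2 = 11, and 2.201 is the assumed t_{.025,11}.
lower, upper = pooled_mean_difference_interval(1245, 3.82, 5, 1251, 2.41, 8, 2.201)
print(f"95% CI for mu1 - mu2: ({lower:.3f}, {upper:.3f})")
```

An interval that lies entirely below zero, as it should here, suggests that the first population mean is smaller than the second.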
Example: A sample of 6 pieces of data is taken from the first population. The mean
of the data is 12.5. A sample of 6 pieces of data is taken from the second
population. The mean of the data is 12.42. Determine a 95% confidence interval
for $\mu_1 - \mu_2$. Assume that the pooled standard deviation has been calculated to be
$S_p = .055$.