Data Information
Sample
— A sample is a set of data drawn from the
population.
— Potentially very large, but less than the population.
E.g. a sample of 765 voters exit polled on election day.
Copyright © 2005 Brooks/Cole, a division of Thomson Learning, Inc. 1.2
Key Statistical Concepts…
Parameter
— A descriptive measure of a population.
Statistic
— A descriptive measure of a sample.
[Figure: the sample, described by a statistic, shown as a subset of the population, described by a parameter]
Populations have Parameters,
Samples have Statistics.
[Figure: a sample statistic is used to draw an inference about a population parameter]
first class…
next class: .355+.185=.540
:
:
“around $35”
(Refer also to Fig. 2.13 in your textbook)
Scatter Diagram…
Example 2.9 A real estate agent wanted to know to what
extent the selling price of a home is related to its size…
Measures of Variability
Range, Standard Deviation, Variance, Coefficient of Variation
Sample Mean
Population Mean
        Population   Sample
Size    N            n
Mean    μ            x̄
E.g.
Data: {4, 4, 4, 4, 50} Range = 46
Data: {4, 8, 15, 24, 39, 50} Range = 46
The range is the same in both cases,
but the data sets have very different distributions…
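A minimal Python sketch of the point above: both data sets from the slide have the same range, yet their spread around the mean differs. The helper name `value_range` is illustrative, and the sample standard deviation is used here as the comparison measure.

```python
import statistics

# The two data sets from the slide.
data_a = [4, 4, 4, 4, 50]
data_b = [4, 8, 15, 24, 39, 50]

def value_range(data):
    """Range = largest observation minus smallest observation."""
    return max(data) - min(data)

range_a = value_range(data_a)
range_b = value_range(data_b)

# Sample standard deviation (divides by n - 1).
sd_a = statistics.stdev(data_a)
sd_b = statistics.stdev(data_b)

print("ranges:", range_a, range_b)
print("sample standard deviations:", round(sd_a, 2), round(sd_b, 2))
```

The range uses only the two extreme observations, which is why it cannot distinguish these two data sets.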
          Population   Sample
Size      N            n
Mean      μ            x̄
Variance  σ²           s²
Sample Variance
Note: increasing the sample size will not reduce this type of
error.
Approaches to Assigning Probabilities…
There are three ways to assign a probability, P(Oi), to an
outcome, Oi, namely:
P(A^C) = 1 – P(A)
P(A and B) = 0
Analogy:
Integers are Discrete, while Real Numbers are Continuous
2. E(X + c) = E(X) + c
3. E(cX) = cE(X)
• We can “pull” a constant out of the expected value expression
(either as part of a sum with a random variable X or as a coefficient
of random variable X).
2. V(X + c) = V(X)
• The variance of the sum of a random variable and a constant is just
the variance of the random variable (per 1 above).
3. V(cX) = c²V(X)
• The variance of a random variable multiplied by a constant coefficient
is the coefficient squared times the variance of the random variable.
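The laws of expected value and variance listed above can be checked numerically. The sketch below uses a fair six-sided die and the constant c = 3 purely as illustrative assumptions (neither comes from the slides), with exact fractions so the equalities hold without rounding.

```python
from fractions import Fraction

# Illustrative distribution (an assumption): one fair six-sided die.
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

def expected_value(pmf):
    """E(X) = sum of x * P(x)."""
    return sum(x * p for x, p in pmf.items())

def variance(pmf):
    """V(X) = sum of (x - mu)^2 * P(x)."""
    mu = expected_value(pmf)
    return sum((x - mu) ** 2 * p for x, p in pmf.items())

def transform(pmf, func):
    """Distribution of func(X); assumes func is one-to-one on the support."""
    return {func(x): p for x, p in pmf.items()}

c = 3  # illustrative constant
shifted = transform(pmf, lambda x: x + c)   # distribution of X + c
scaled = transform(pmf, lambda x: c * x)    # distribution of cX

law_e_shift = expected_value(shifted) == expected_value(pmf) + c
law_e_scale = expected_value(scaled) == c * expected_value(pmf)
law_v_shift = variance(shifted) == variance(pmf)
law_v_scale = variance(scaled) == c ** 2 * variance(pmf)

print(law_e_shift, law_e_scale, law_v_shift, law_v_scale)
```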
for x=0, 1, 2, …, n
P(X ≤ 4) = .967
Binomial Table…
“What is the probability that Pat gets two answers correct?”
i.e. what is P(X = 2), given P(success) = .20 and n=10 ?
# trials = 10
P(success) = .20
cumulative (i.e. P(X≤x)?): no
P(X=2)=.3020
=BINOMDIST() Excel Function…
There is a binomial distribution function in Excel that can
also be used to calculate these probabilities. For example:
What is the probability that Pat fails the quiz?
# successes = 4
# trials = 10
P(success) = .20
cumulative (i.e. P(X≤x)?): yes
P(X≤4)=.9672
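As a cross-check on the table and Excel values, here is a minimal Python sketch of the binomial probabilities for Pat's quiz (n = 10, P(success) = .20). The function names `binomial_pmf` and `binomial_cdf` are illustrative, not part of any library.

```python
import math

def binomial_pmf(x, n, p):
    """P(X = x) for a binomial random variable with n trials and success probability p."""
    return math.comb(n, x) * p ** x * (1 - p) ** (n - x)

def binomial_cdf(x, n, p):
    """Cumulative probability P(X <= x)."""
    return sum(binomial_pmf(k, n, p) for k in range(x + 1))

n_trials = 10
p_success = 0.20

p_exactly_two = binomial_pmf(2, n_trials, p_success)
p_at_most_four = binomial_cdf(4, n_trials, p_success)

print(round(p_exactly_two, 4))
print(round(p_at_most_four, 4))
```

The cumulative flag in the Excel function corresponds to choosing `binomial_cdf` over `binomial_pmf` here.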
Binomial Distribution…
As you might expect, statisticians have developed general
formulas for the mean, variance, and standard deviation of a
binomial random variable. They are:
μ = np
σ² = np(1 – p)
σ = √(np(1 – p))
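A short sketch, assuming the quiz setting used earlier (n = 10, p = .20), that compares the standard binomial results, mean np and variance np(1 – p), against values computed directly from the probability distribution. Variable names are illustrative.

```python
import math

n_trials = 10
p_success = 0.20

# Full probability distribution P(X = x) for x = 0, 1, ..., n.
pmf = [math.comb(n_trials, x) * p_success ** x * (1 - p_success) ** (n_trials - x)
       for x in range(n_trials + 1)]

# Mean and variance computed from the definition.
mean_from_pmf = sum(x * prob for x, prob in enumerate(pmf))
var_from_pmf = sum((x - mean_from_pmf) ** 2 * prob for x, prob in enumerate(pmf))

# Mean, variance, and standard deviation from the general formulas.
mean_formula = n_trials * p_success
var_formula = n_trials * p_success * (1 - p_success)
sd_formula = math.sqrt(var_formula)

print(mean_from_pmf, mean_formula)
print(var_from_pmf, var_formula)
print(sd_formula)
```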
FYI:
P(X=0) =
[Figure: density curve f(x) over the interval from a to b, with area = 1 under the curve]
2) The total area under the curve between a and b is 1.0
What is the probability that a computer is assembled in a
time between 45 and 60 minutes?
[Figure: standard normal curve with the area to the right of z = 1.6 indicated]
P(Z > 1.6) = .5 – P(0 < Z < 1.6)
= .5 – .4452
= .0548
Using the Normal Table (Table 3)…
What is P(Z < -2.23) ?
[Figure: standard normal curve marking -2.23, 0, and 2.23, with the area P(0 < Z < 2.23) indicated]
P(Z < -2.23) = P(Z > 2.23)
= .5 – P(0 < Z < 2.23)
= .5 – .4871
= .0129
Using the Normal Table (Table 3)…
What is P(Z < 1.52) ?
[Figure: standard normal curve with the area to the left of z = 1.52 indicated]
P(Z < 1.52) = .5 + P(0 < Z < 1.52)
= .5 + .4357
= .9357
Using the Normal Table (Table 3)…
What is P(0.9 < Z < 1.9) ?
[Figure: standard normal curve marking 0, 0.9, and 1.9, with the area P(0 < Z < 0.9) indicated]
P(0.9 < Z < 1.9) = P(0 < Z < 1.9) – P(0 < Z < 0.9)
=.4713 – .3159
= .1554
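The four table lookups above can be reproduced without a printed table. This is a minimal sketch that builds the standard normal cumulative probability from Python's `math.erf`; the helper name `phi` is illustrative. Because table entries are rounded to four decimals before subtracting, results may differ from the slides in the fourth decimal place.

```python
import math

def phi(z):
    """Standard normal cumulative probability P(Z < z), via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

p_right_tail = 1.0 - phi(1.6)        # P(Z > 1.6)
p_left_tail = phi(-2.23)             # P(Z < -2.23)
p_below = phi(1.52)                  # P(Z < 1.52)
p_between = phi(1.9) - phi(0.9)      # P(0.9 < Z < 1.9)

for label, value in [("P(Z > 1.6)", p_right_tail),
                     ("P(Z < -2.23)", p_left_tail),
                     ("P(Z < 1.52)", p_below),
                     ("P(0.9 < Z < 1.9)", p_between)]:
    print(label, round(value, 4))
```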
Finding Values of Z…
Other Z values are
Z.05 = 1.645
Z.01 = 2.33
Similarly
P(-1.645 < Z < 1.645) = .90
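A hedged sketch of how these Z values can be recovered in Python with the standard library's `statistics.NormalDist` (available from Python 3.8). The helper name `z_upper` is illustrative; it returns the value with area alpha to its right.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

def z_upper(alpha):
    """Value z such that P(Z > z) = alpha."""
    return std_normal.inv_cdf(1.0 - alpha)

z_05 = z_upper(0.05)
z_01 = z_upper(0.01)

# Central area between -z_05 and +z_05.
central_area = std_normal.cdf(z_05) - std_normal.cdf(-z_05)

print(round(z_05, 3), round(z_01, 2), round(central_area, 2))
```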
Student t Distribution,
Chi-Squared Distribution, and
F Distribution.
Figure 8.24
t.05,10 = 1.812
Sampling Distributions
x 1 2 3 4 5 6
P(x) 1/6 1/6 1/6 1/6 1/6 1/6
x̄     P(x̄)
3.5   6/36
4.0   5/36
4.5   4/36
5.0   3/36
5.5   2/36
6.0   1/36
[Figure: distribution of X (values 1 to 6) beside the sampling distribution of X̄ (values 1.0 to 6.0 in steps of 0.5)]
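The probabilities in 36ths suggest a sample of two throws of a fair die. Under that assumption, the sketch below enumerates all 36 equally likely pairs and tabulates the sampling distribution of the mean; variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

faces = range(1, 7)

# Count how many of the 36 equally likely (first, second) pairs give each sample mean.
counts = Counter(Fraction(a + b, 2) for a, b in product(faces, repeat=2))

# Sampling distribution of the mean: P(mean) = count / 36.
sampling_dist = {mean: Fraction(count, 36) for mean, count in sorted(counts.items())}

for mean in sampling_dist:
    print(float(mean), f"{counts[mean]}/36")
```

The enumeration also produces the values from 1.0 to 3.0, which mirror the upper half of the distribution.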
The larger the sample size, the more closely the sampling
distribution of X̄ will resemble a normal distribution.
2.
Things we know:
1) X is normally distributed, therefore so is the sample mean X̄.
2) The mean is 32.2 oz.
3)
What is the probability that one bottle will contain more than 32 ounces?
What is the probability that the mean of four bottles will exceed 32 oz?
mean:
standard deviation:
Point Estimator
Interval Estimator
Table 10.1
Example 10.1…
A computer company samples demand during lead time over
25 time periods:
235 374 309 499 253
421 361 514 462 369
394 439 348 344 330
261 374 302 466 535
386 316 296 332 334
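Given z.025 = 1.96, σ = 75, and n = 25, and with the sample mean of the data above x̄ = 370.16, therefore:
$\bar{x} \pm z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}} = 370.16 \pm 1.96\,\frac{75}{\sqrt{25}} = 370.16 \pm 29.40$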
The lower and upper confidence limits are 340.76 and 399.56.
The estimate of the mean demand during lead time lies
between 340.76 and 399.56; we can use this as input in
developing an inventory policy.
That is, we estimated that the mean demand during lead time
falls between 340.76 and 399.56, and this type of estimator
is correct 95% of the time. That also means that 5% of the
time the estimator will be incorrect.
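The confidence limits for Example 10.1 can be reproduced from the 25 observations. This is a minimal sketch using the values that appear on the slides (a z value of 1.96 and a population standard deviation of 75); variable names are illustrative.

```python
import math

# Demand during lead time over 25 periods (Example 10.1).
demand = [235, 374, 309, 499, 253,
          421, 361, 514, 462, 369,
          394, 439, 348, 344, 330,
          261, 374, 302, 466, 535,
          386, 316, 296, 332, 334]

z_value = 1.96   # z for 95% confidence, from the slide
sigma = 75       # population standard deviation, from the slide

n = len(demand)
sample_mean = sum(demand) / n
margin = z_value * sigma / math.sqrt(n)

lower_limit = sample_mean - margin
upper_limit = sample_mean + margin

print(round(sample_mean, 2), round(lower_limit, 2), round(upper_limit, 2))
```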
Since:
It follows that
The jury does not know which hypothesis is true. They must
make a decision on the basis of evidence presented.
Nonstatistical Hypothesis Testing…
There are two possible errors.
A Type I error occurs when we reject a true null hypothesis.
That is, a Type I error occurs when the jury convicts an
innocent person.
P(Type I error) = α
P(Type II error) = β
The null hypothesis (H0) will always state that the parameter
equals the value specified in the alternative hypothesis (H1).
Concepts of Hypothesis Testing…
Consider Example 10.1 (mean demand for computers during
assembly lead time) again. Rather than estimate the mean
demand, our operations manager wants to know whether the
mean is different from 350 units. We can rephrase this
request into a test of the hypothesis:
H0: μ = 350
                   H0 true          H0 false
Reject H0          Type I error     (correct)
Do not reject H0   (correct)        Type II error
We know:
n = 400,
x̄ = 178, and
σ = 65
p-value
H1:μ < 22
H0:μ = 22
and
$\bar{x} = \frac{\sum x_i}{220} = \frac{4{,}759}{220} = 21.63$
$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} = \frac{21.63 - 22}{6 / \sqrt{220}} = -.91$
p-value = P(Z < -.91) = .5 - .3186 = .1814
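A short sketch of the same p-value calculation in Python, using the figures on the slide (sum of observations 4,759, n = 220, σ = 6, hypothesized mean 22). It relies on `statistics.NormalDist` from the standard library; because the code keeps full precision for the sample mean, the result may differ from the table-based value in the fourth decimal place.

```python
import math
from statistics import NormalDist

total = 4759        # sum of the observations
n = 220             # sample size
sigma = 6           # population standard deviation
mu_null = 22        # value of the mean under H0

sample_mean = total / n
z_stat = (sample_mean - mu_null) / (sigma / math.sqrt(n))

# Left-tail test (H1: mu < 22), so the p-value is P(Z < z).
p_value = NormalDist().cdf(z_stat)

print(round(sample_mean, 2), round(z_stat, 2), round(p_value, 3))
```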
[Figure: standard normal curve centered at 0 with two-tail rejection regions beyond -z.025 and +z.025]
We find that:
Since z = 1.19 is neither greater than 1.96 nor less than –1.96,
we cannot reject the null hypothesis in favor of H1. That is,
“there is insufficient evidence to infer that there is a
difference between the bills of AT&T and the competitor.”
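The two-tail decision above can be sketched as follows, taking the test statistic z = 1.19 and the critical value 1.96 from the slide. The two-tail p-value computed here is an additional illustration that does not appear in the source.

```python
from statistics import NormalDist

z_stat = 1.19       # test statistic from the slide
z_critical = 1.96   # z.025, the two-tail critical value at the 5% level

# Reject H0 only if z falls in either tail of the rejection region.
reject_null = z_stat > z_critical or z_stat < -z_critical

# Two-tail p-value: twice the area beyond |z| (illustrative addition).
p_value = 2 * (1 - NormalDist().cdf(abs(z_stat)))

print(reject_null, round(p_value, 3))
```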
PLOT POWER CURVE
H0: μ = 450
Since
and so:
H1: σ² < 1
(so our null hypothesis becomes: H0: σ² = 1). We will use
this test statistic:
3. The variance of is
??
degrees of freedom
degrees of freedom
The sample variances are similar, hence we will assume that the
population variances are equal…
Example 13.2… COMPUTE
Since our calculated t-statistic does not fall into the rejection
region, we cannot reject H0 in favor of H1, that is, there is
not sufficient evidence to infer that the mean assembly times
differ.
Compare…
degrees of freedom.
Inference about the ratio of two variances
Our null hypothesis is always:
H0: σ₁²/σ₂² = 1
df1 = n1 - 1
df2 = n2 - 1
[Figure: F distribution with the values .58 and 1.61 marked on the F axis]
Doubling the one-tail p-value that Excel gives us yields the p-value of
the test we are conducting (i.e. 2 × 0.0004 = 0.0008). Refer to the text
and CD Appendices for more detail.