Estimation
Questions
• What is a sampling distribution?
• What is the standard error?
• What is the principle of maximum
likelihood?
• What is bias (in the statistical sense)?
• What is a confidence interval?
• What is the central limit theorem?
• Why is the number 1.96 a big deal?
Population
• Population & Sample Space
• Population vs. sample
• Population parameter, sample statistic
Parameter Estimation
We use statistics to estimate parameters,
e.g., effectiveness of pilot training,
effectiveness of psychotherapy.
$\bar{X}$ estimates $\mu$; $SD$ estimates $\sigma$.
Sampling Distribution (1)
• A sampling distribution is a distribution of a
statistic over all possible samples.
• To get a sampling distribution,
– 1. Take a sample of size N (a given number like
5, 10, or 1000) from a population
– 2. Compute the statistic (e.g., the mean) and
record it.
– 3. Repeat steps 1 and 2 many times (in principle, infinitely many for large populations).
– 4. Plot the resulting sampling distribution, a
distribution of a statistic over repeated samples.
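The four steps above can be sketched as a small simulation. This is only an illustration under assumptions: the population (the integers 1 to 6), the sample size, the number of repetitions, and the function name `simulate_sampling_distribution` are all made up for the example, and a finite number of repetitions only approximates the true sampling distribution.

```python
import random
from collections import Counter

def simulate_sampling_distribution(population, n, repetitions, seed=0):
    """Approximate the sampling distribution of the mean by repeated sampling."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    means = []
    for _ in range(repetitions):                 # step 3: repeat many times
        sample = rng.choices(population, k=n)    # step 1: sample of size n, with replacement
        means.append(sum(sample) / n)            # step 2: compute the mean and record it
    return Counter(means)

# Hypothetical population chosen only for illustration
population = [1, 2, 3, 4, 5, 6]
counts = simulate_sampling_distribution(population, n=2, repetitions=10_000)

# Step 4: a crude text plot of the resulting distribution
for value in sorted(counts):
    print(f"{value:4.1f} {'#' * (counts[value] // 100)}")
```

Each row of the printout is one possible value of the mean; the bar length is proportional to how often that value occurred across the repeated samples.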
Suppose
• Population has 6 elements: 1, 2, 3, 4, 5,
6 (like numbers on dice)
• We want to find the sampling
distribution of the mean for N=2
• If we sample with replacement, what
can happen?
1st  2nd  M      1st  2nd  M      1st  2nd  M
1    1    1      3    1    2      5    1    3
1    2    1.5    3    2    2.5    5    2    3.5
1    3    2      3    3    3      5    3    4
1    4    2.5    3    4    3.5    5    4    4.5
1    5    3      3    5    4      5    5    5
1    6    3.5    3    6    4.5    5    6    5.5
2    1    1.5    4    1    2.5    6    1    3.5
2    2    2      4    2    3      6    2    4
2    3    2.5    4    3    3.5    6    3    4.5
2    4    3      4    4    4      6    4    5
2    5    3.5    4    5    4.5    6    5    5.5
2    6    4      4    6    5      6    6    6
[Figure: Histogram of the possible outcomes]
Sampling
distribution for
mean of 2 dice.
1+2+3+4+5+6 = 21.
21/6 = 3.5
There is only 1
way to get a
mean of 1, but 6
ways to get a
mean of 3.5.
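Because the population is so small, the sampling distribution for N=2 can be enumerated exactly instead of simulated. The sketch below lists all 36 ordered samples (with replacement) and counts the ways each mean can occur; the variable names are illustrative only.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

population = [1, 2, 3, 4, 5, 6]

# All ordered samples of size 2, drawn with replacement (6 * 6 = 36)
samples = list(product(population, repeat=2))

# Number of ways each sample mean can occur (exact fractions avoid rounding)
ways = Counter(Fraction(a + b, 2) for a, b in samples)

for mean in sorted(ways):
    print(f"mean {float(mean):3.1f}: {ways[mean]} of {len(samples)} samples")
```

The counts rise from 1 way (mean of 1) to 6 ways (mean of 3.5) and fall back to 1 way (mean of 6), which is the triangular shape of the histogram.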
Sampling Distribution (2)
• The sampling distribution shows the relation
between the probability of a statistic and the
statistic’s value for all possible samples of
size N drawn from a population.
[Figure: Hypothetical Distribution of Sample Means; vertical axis f(M), horizontal axis Mean Value]
Sampling Distribution Mean
and SD
• The Mean of the sampling distribution is
defined the same way as any other
distribution (expected value).
• The SD of the sampling distribution is the
Standard Error. Important and useful.
• Variance of sampling distribution is the
expected value of the squared difference – a
mean square.
• Review: $\sigma_G^2 = E\left[(G - \bar{G})^2\right]$
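As a worked illustration of these definitions, the mean, variance, and standard error of the sampling distribution can be computed exactly for the two-dice example (population 1 to 6, N=2, sampling with replacement). This is a sketch that treats all 36 samples as equally likely; the variable names are assumptions made for the example.

```python
from fractions import Fraction
from itertools import product
from math import sqrt

population = [1, 2, 3, 4, 5, 6]
n = 2

# The mean of every equally likely sample of size n (with replacement)
means = [Fraction(sum(s), n) for s in product(population, repeat=n)]

# Mean of the sampling distribution: the expected value of the statistic
expected_mean = sum(means) / len(means)

# Variance of the sampling distribution: expected squared difference (a mean square)
sampling_variance = sum((m - expected_mean) ** 2 for m in means) / len(means)

# Standard error: the SD of the sampling distribution
standard_error = sqrt(sampling_variance)

print(expected_mean, sampling_variance, round(standard_error, 3))
```

The expected value comes out to 7/2, matching the population mean of 3.5, and the sampling variance is 35/24, half the population variance of 35/12.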
Review
[Figure: Likelihood plotted against Theta (p value), for theta from 0 to 1]
Maximum Likelihood (4)
• In the example, the best (maximum likelihood) estimate
would be 9/15 = .60.
• There is a general class called
maximum likelihood estimators that
find the value of theta that maximizes the
likelihood of a sample result.
• ML is one principle of ‘goodness’ of an
estimator
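A minimal sketch of the idea, assuming the example refers to 9 successes in 15 independent trials with a binomial likelihood (the data behind the 9/15 figure are not shown here, so this is an assumption). The function name `binomial_likelihood` and the grid of candidate values are illustrative only.

```python
from math import comb

def binomial_likelihood(theta, successes, trials):
    """Probability of the observed result if the true proportion is theta."""
    return comb(trials, successes) * theta**successes * (1 - theta)**(trials - successes)

successes, trials = 9, 15   # assumed data behind the 9/15 estimate

# Evaluate the likelihood on a grid of candidate theta values: 0.00, 0.01, ..., 1.00
grid = [i / 100 for i in range(101)]
likelihoods = [binomial_likelihood(t, successes, trials) for t in grid]

# The maximum likelihood estimate is the theta with the largest likelihood
best_theta = grid[likelihoods.index(max(likelihoods))]
print(best_theta, successes / trials)
```

Under this assumption the grid search lands on the same value as the sample proportion, 9/15. A grid is used only to make the "try every theta and keep the best" logic visible; for the binomial case the maximum can also be found analytically.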
More Goodness (1)
• Bias. If E(statistic)=parameter, the
estimator is unbiased. If it’s unbiased,
the mean of the sampling distribution
equals the parameter. The sample mean
has this property: $E(\bar{X}) = \mu$. The sample
variance is biased.
More Goodness (2)
• Efficiency – size of the sampling variance.
• Relative Efficiency. Relative efficiency is the
ratio of two sampling variances.
Efficiency of G relative to H $= \dfrac{\sigma_H^2}{\sigma_G^2}$
• Standard Error of the Mean: $\sigma_M = \dfrac{\sigma}{\sqrt{N}}$
• Law of large numbers: Large samples
produce sample estimates very close to
the parameter.
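These ideas can be checked with a rough simulation. The sketch below assumes a standard normal population (sigma = 1) and compares two estimators of its center, the sample mean and the sample median. The choice of population, the sample sizes, the seed, and the helper name `sampling_variance` are all assumptions for illustration, and simulated values only approximate the true ones.

```python
import random
import statistics

def sampling_variance(statistic, n, repetitions, rng):
    """Variance of `statistic` over repeated samples of size n from N(0, 1)."""
    values = [
        statistic([rng.gauss(0, 1) for _ in range(n)])
        for _ in range(repetitions)
    ]
    return statistics.pvariance(values)

rng = random.Random(42)      # fixed seed so the run is reproducible
repetitions = 5_000
results = {}

for n in (4, 25, 100):
    var_mean = sampling_variance(statistics.fmean, n, repetitions, rng)
    var_median = sampling_variance(statistics.median, n, repetitions, rng)
    results[n] = {
        "se_simulated": var_mean ** 0.5,            # SD of the simulated sample means
        "se_formula": 1 / n ** 0.5,                 # sigma / sqrt(N) with sigma = 1
        "rel_efficiency": var_median / var_mean,    # efficiency of mean relative to median
    }

for n, r in results.items():
    print(f"N={n:3d}  SE(sim)={r['se_simulated']:.3f}  "
          f"SE(formula)={r['se_formula']:.3f}  "
          f"rel. eff.={r['rel_efficiency']:.2f}")
```

The simulated standard error should track sigma/sqrt(N) and shrink as N grows, which is the law of large numbers at work. A relative efficiency above 1 means the mean has the smaller sampling variance for this population.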
Unbiased Estimate of
Variance
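• It can be shown that: $E(S^2) = \sigma^2 - \dfrac{\sigma^2}{N} = \dfrac{N-1}{N}\,\sigma^2$
• Unbiased estimate: $\hat{\sigma}^2 = \dfrac{N}{N-1}\,S^2 = \dfrac{\sum (X - \bar{X})^2}{N-1}$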
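The bias can be verified exactly on the two-dice example (population 1 to 6, N=2, with replacement) by averaging the sample variance over all 36 equally likely samples. This is a sketch; the helper names `mean` and `biased_variance` are illustrative, and `biased_variance` divides by the number of observations.

```python
from fractions import Fraction
from itertools import product

population = [1, 2, 3, 4, 5, 6]
N = 2

def mean(xs):
    return Fraction(sum(xs), len(xs))

def biased_variance(xs):
    """Mean squared deviation from the mean, dividing by len(xs)."""
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

sigma2 = biased_variance(population)            # population variance

samples = list(product(population, repeat=N))   # all 36 equally likely samples

# Expected value of S^2 (divide by N) over the sampling distribution
expected_S2 = sum(biased_variance(s) for s in samples) / len(samples)

# Expected value after the N / (N - 1) correction
expected_corrected = sum(
    Fraction(N, N - 1) * biased_variance(s) for s in samples
) / len(samples)

print(sigma2, expected_S2, expected_corrected)
```

Here the population variance is 35/12, the expected value of the uncorrected sample variance is 35/24 (exactly (N-1)/N times the population variance), and the corrected estimate averages to 35/12.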