Statistical Method
Descriptive Statistics
Inferential Statistics
Estimation
Test of hypotheses
Inferential statistics consists of the methods by which one makes inferences or generalizations about a population.
Classical Method - inferences are based strictly on information obtained from a random sample selected from the population.
Bayesian Method - utilizes prior subjective knowledge about the probability distribution of the unknown parameters in conjunction with the information provided by the sample data.
Estimation
- refers to the process by which one makes inferences about a population, based on information obtained from a sample.
Point Estimate
Interval Estimate
Point Estimate
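A point estimate is a single value of a sample statistic used to estimate a population parameter. For example, the sample mean $\bar{x}$ is a point estimate of the population mean $\mu$. Similarly, the sample proportion $p$ is a point estimate of the population proportion $P$.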
When the mean of the sampling distribution of a statistic is equal to a population parameter, that statistic is said to be an unbiased estimator of the parameter. For example, if credit card holders in a city were repeatedly sampled at random and asked what their account balances were as of a specific date, the average of the results across all samples would equal the population parameter. If, however, only credit card holders in one neighborhood were sampled, the average of the sample estimates would be a biased estimator of all account balances for the city and would not, in general, equal the population parameter.
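The idea of unbiasedness can be checked by brute force on a tiny population. The sketch below is only an illustration: the five account balances are made-up values, and "neighborhood" sampling is modeled, as an assumption, by drawing only from the three smallest balances.

```python
from itertools import combinations
from statistics import mean

# Hypothetical population of five account balances (illustrative values only).
population = [120, 340, 560, 780, 1000]
mu = mean(population)

# Every possible sample of size 2 drawn without replacement from the whole city.
sample_means = [mean(s) for s in combinations(population, 2)]
mean_of_sample_means = mean(sample_means)

# Assumed "one neighborhood" scheme: only the three smallest balances can be drawn.
neighborhood = population[:3]
biased_mean = mean(mean(s) for s in combinations(neighborhood, 2))

print("population mean:           ", mu)
print("mean of all sample means:  ", mean_of_sample_means)
print("mean under neighborhood sampling:", biased_mean)
```

Averaged over every possible sample from the full population, the sample mean reproduces the population mean; restricted to one neighborhood, it does not.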
If we consider all possible unbiased estimators of some parameter $\theta$, the one with the smallest variance is called the most efficient estimator of $\theta$.
Example - if $\hat{\theta}_1$ and $\hat{\theta}_2$ are two unbiased estimators of the same population parameter $\theta$, we would choose the estimator whose sampling distribution has the smaller variance. If $\sigma^2_{\hat{\theta}_1} < \sigma^2_{\hat{\theta}_2}$, we say $\hat{\theta}_1$ is a more efficient estimator of $\theta$ than $\hat{\theta}_2$.
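As a small illustration of efficiency, the sketch below compares two unbiased estimators of a population mean, the sample mean and the sample median, by enumerating every equally likely sample. The population values and the sample size of 3 are assumptions chosen so that the enumeration stays small and the population is symmetric.

```python
from itertools import product
from statistics import mean, median, pvariance

# Hypothetical population, symmetric about its mean of 3.
population = [1, 2, 3, 4, 5]

# All ordered samples of size 3 drawn with replacement (each equally likely).
samples = list(product(population, repeat=3))

means = [mean(s) for s in samples]
medians = [median(s) for s in samples]

# Both estimators are unbiased here: each sampling distribution is centered on 3.
center_of_means = mean(means)
center_of_medians = mean(medians)

# Variances of the two sampling distributions.
var_mean = pvariance(means)
var_median = pvariance(medians)

print(center_of_means, center_of_medians)
print(var_mean, var_median)
```

For this population the sample mean has the smaller variance, so it is the more efficient of the two estimators.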
A point estimate:
1. Gives no information about how close the value is to the unknown population parameter.
2. Example: the sample mean $\bar{x} = 3$ is a point estimate of the unknown population mean $\mu$.
Interval Estimate
An interval estimate is defined by two numbers, between which a population parameter is said to lie.
For example: $a < \mu < b$ is an interval estimate of the population mean $\mu$. It indicates that the population mean is greater than $a$ but less than $b$.
Example: the unknown population mean lies between 50 and 70 with 95% confidence.
Key elements of an interval estimate: a probability that the population parameter falls somewhere within the interval, the confidence interval itself, and the sample statistic (point estimate) on which the interval is built.
A confidence interval gives an estimated range of values which is likely to include an unknown population parameter, the estimated range being calculated from a given set of sample data. If independent samples are taken repeatedly from the same population, and a confidence interval is calculated for each sample, then a certain percentage (the confidence level) of the intervals will include the unknown population parameter. Confidence intervals are usually calculated so that this percentage is 95%, but 90%, 99%, 99.9% or any other confidence level can be used.
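The repeated-sampling interpretation can be illustrated with a short simulation. This is a sketch under assumed values: a normal population with mean 50 and known standard deviation 10, samples of size 25, and the usual known-sigma interval $\bar{x} \pm z_{\alpha/2}\sigma/\sqrt{n}$. The function name `coverage` and the seed are illustrative choices.

```python
import random
from statistics import NormalDist

def coverage(mu=50.0, sigma=10.0, n=25, level=0.95, trials=2000, seed=0):
    """Fraction of simulated confidence intervals that contain mu."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)   # z_{alpha/2}
    half_width = z * sigma / n ** 0.5
    hits = 0
    for _ in range(trials):
        # Draw one sample and compute its mean.
        xbar = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        if xbar - half_width <= mu <= xbar + half_width:
            hits += 1
    return hits / trials

rate = coverage()
print(f"share of intervals containing mu: {rate:.3f}")
```

With a 95% level, the printed share should land near 0.95, not on it exactly, because the number of trials is finite.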
The width of the confidence interval gives us some idea about how uncertain we are about the unknown parameter. A very wide interval may indicate that more data should be collected before anything very definite can be said about the parameter.
Confidence limits are the lower and upper boundaries of a confidence interval, that is, the values which define its range. The upper and lower bounds of a 95% confidence interval are the 95% confidence limits. Limits can be computed for other confidence levels as well, for example 90%, 99% or 99.9%.
The confidence level is the probability value $(1 - \alpha)$ associated with a confidence interval. It is often expressed as a percentage. For example, if $\alpha = 0.05 = 5\%$, then the confidence level is $(1 - 0.05) = 0.95$, i.e. a 95% confidence level.
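Confidence Interval for the Mean ($\sigma$ Known)
Assumptions
1. The population standard deviation $\sigma$ is known.
2. The population is normally distributed.
3. If the population is not normal, the sampling distribution of $\bar{X}$ can be approximated by a normal distribution ($n \ge 30$).
[Figure: sampling distribution of $\bar{X}$ with central area $1 - \alpha$ and area $\alpha/2$ in each tail]
Confidence interval estimate:
$$\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$$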
The mean of a random sample of $n = 25$ is $\bar{x} = 50$. Set up a 95% confidence interval estimate for $\mu$ if $\sigma = 10$.
$$50 - 1.96\frac{10}{\sqrt{25}} \le \mu \le 50 + 1.96\frac{10}{\sqrt{25}}$$
$$46.08 \le \mu \le 53.92$$
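The arithmetic of this example can be checked with a few lines of Python. The helper name `z_interval` is hypothetical; the sketch assumes a known population standard deviation and takes the critical value from the standard library's `statistics.NormalDist`, not from a printed table.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(xbar, sigma, n, level=0.95):
    """Confidence interval for the mean when sigma is known."""
    alpha = 1 - level
    z = NormalDist().inv_cdf(1 - alpha / 2)   # z_{alpha/2}, about 1.96 for 95%
    margin = z * sigma / sqrt(n)
    return xbar - margin, xbar + margin

low, high = z_interval(xbar=50, sigma=10, n=25, level=0.95)
print(round(low, 2), round(high, 2))   # 46.08 53.92
```

Raising `level` to 0.99 widens the interval, which matches the remark that higher confidence costs precision.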
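Example:
The average zinc concentration recovered from a sample of zinc measurements in 36 different locations was found to be 2.6 grams per milliliter. Find the 95% and 99% confidence intervals for the mean zinc concentration in the river. Assume that the population standard deviation is 0.3.
Solution: with $\bar{x} = 2.6$, $\sigma = 0.3$ and $n = 36$, the standard error is $0.3/\sqrt{36} = 0.05$.
95% confidence interval ($z_{0.025} = 1.96$): $2.6 \pm (1.96)(0.05)$, i.e. $2.50 < \mu < 2.70$.
99% confidence interval ($z_{0.005} = 2.575$): $2.6 \pm (2.575)(0.05)$, i.e. $2.47 < \mu < 2.73$.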
If $\bar{x}$ is used as an estimate of $\mu$, we can then be $(1 - \alpha)100\%$ confident that the error will not exceed $z_{\alpha/2}\,\sigma/\sqrt{n}$.
Theorem 9.2
If $\bar{x}$ is used as an estimate of $\mu$, we can be $(1 - \alpha)100\%$ confident that the error will not exceed a specified amount $e$ when the sample size is
$$n = \left(\frac{z_{\alpha/2}\,\sigma}{e}\right)^2$$
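Example: How large a sample is required in Example 9.2 if we want to be 95% confident that our estimate of $\mu$ is off by less than 0.05?
Solution: $\sigma = 0.3$, $z_{0.025} = 1.96$ and $e = 0.05$, so
$$n = \left(\frac{(1.96)(0.3)}{0.05}\right)^2 = 138.3$$
which is rounded up to 139. Therefore, we can be 95% confident that a random sample of size 139 will provide an estimate $\bar{x}$ differing from $\mu$ by an amount less than 0.05.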
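The sample-size calculation can be sketched as follows. The function name `sample_size` is an illustrative choice, and the critical value comes from `statistics.NormalDist` instead of a table; the result is rounded up because a fractional observation is not possible.

```python
from math import ceil
from statistics import NormalDist

def sample_size(sigma, e, level=0.95):
    """Smallest n for which the error of xbar stays below e at the given level."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)   # z_{alpha/2}
    return ceil((z * sigma / e) ** 2)

n_required = sample_size(sigma=0.3, e=0.05, level=0.95)
print(n_required)   # 139
```

Halving the allowed error `e` roughly quadruples the required sample size, since `n` grows with the square of `1/e`.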
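Confidence Interval for the Mean ($\sigma$ Unknown)
If $\bar{x}$ and $s$ are the mean and the standard deviation of a random sample from a normal population with unknown variance, a $(1 - \alpha)100\%$ confidence interval for $\mu$ is given by
$$\bar{x} - t_{\alpha/2}\frac{s}{\sqrt{n}} < \mu < \bar{x} + t_{\alpha/2}\frac{s}{\sqrt{n}}$$
where $t_{\alpha/2}$ is the t-value with $v = n - 1$ degrees of freedom, leaving an area of $\alpha/2$ to the right.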
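Example: The contents of 7 similar containers of sulfuric acid are 9.8, 10.2, 10.4, 9.8, 10.0, 10.2, and 9.6 liters. Find a 95% confidence interval for the mean of all such containers, assuming an approximate normal distribution.
Solution: The sample mean is $\bar{x} = 10.0$, the sample standard deviation is $s = 0.283$, and $t_{0.025} = 2.447$ for $v = 6$ degrees of freedom. Hence
$$10.0 - (2.447)\frac{0.283}{\sqrt{7}} < \mu < 10.0 + (2.447)\frac{0.283}{\sqrt{7}}$$
which reduces to $9.74 < \mu < 10.26$.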
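A minimal sketch of the same calculation in Python is given below. The standard library has no t-distribution, so the critical value 2.447 is passed in by hand from the example; the helper name `t_interval` is hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

def t_interval(data, t_crit):
    """Confidence interval for the mean when sigma is unknown.

    t_crit is the tabulated t_{alpha/2} value for len(data) - 1 degrees of freedom.
    """
    n = len(data)
    xbar = mean(data)
    s = stdev(data)               # sample standard deviation (n - 1 divisor)
    margin = t_crit * s / sqrt(n)
    return xbar - margin, xbar + margin

contents = [9.8, 10.2, 10.4, 9.8, 10.0, 10.2, 9.6]   # liters, from the example
low, high = t_interval(contents, t_crit=2.447)       # t_{0.025}, v = 6
print(round(low, 2), round(high, 2))                 # 9.74 10.26
```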
Estimating the Difference Between Two Means
- with $\sigma_1^2$ and $\sigma_2^2$ known
- with $\sigma_1^2$ and $\sigma_2^2$ unknown
Example: An experiment was conducted in which two types of engines, A and B, were compared. Gas mileage in miles per gallon was measured. Fifty experiments were conducted using engine type A and 75 experiments were done with engine type B. The gasoline used and other conditions were held constant. The average gas mileage for engine A was 36 miles per gallon and the average for engine B was 42 miles per gallon. Find a 96% confidence interval on $\mu_B - \mu_A$, where $\mu_A$ and $\mu_B$ are the population mean gas mileages for engines A and B, respectively. Assume that the population standard deviations are 6 and 8 for engines A and B, respectively.
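One way to work this example is sketched below. It assumes the standard interval for two independent samples with known standard deviations, $(\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2}\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}$. The function name `two_mean_z_interval` is hypothetical, and the critical value is computed with `statistics.NormalDist`, so the limits can differ in the second decimal from a hand calculation that uses a rounded table value such as 2.05.

```python
from math import sqrt
from statistics import NormalDist

def two_mean_z_interval(x1, sigma1, n1, x2, sigma2, n2, level):
    """Interval for mu1 - mu2 when both population standard deviations are known."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)       # z_{alpha/2}
    margin = z * sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    diff = x1 - x2
    return diff - margin, diff + margin

# Engine B is entered first, so the interval is for mu_B - mu_A.
low, high = two_mean_z_interval(x1=42, sigma1=8, n1=75,
                                x2=36, sigma2=6, n2=50, level=0.96)
print(round(low, 2), round(high, 2))
```

The interval comes out at roughly 3.4 to 8.6 miles per gallon. Both limits are positive, which suggests engine B has the higher mean mileage.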
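If $\bar{x}_1$ and $\bar{x}_2$ are the means of independent random samples of size $n_1$ and $n_2$, respectively, from approximate normal populations with unknown but equal variances, a $(1 - \alpha)100\%$ confidence interval for $\mu_1 - \mu_2$ is given by
$$(\bar{x}_1 - \bar{x}_2) - t_{\alpha/2}\, s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}} < \mu_1 - \mu_2 < (\bar{x}_1 - \bar{x}_2) + t_{\alpha/2}\, s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$
where $s_p$ is the pooled estimate of the population standard deviation and $t_{\alpha/2}$ is the t-value with $v = n_1 + n_2 - 2$ degrees of freedom, leaving an area of $\alpha/2$ to the right.
Pooled Variance:
$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$$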
Example: What is the difference between commuting patterns for students and professors? 11 students and 14 professors took part in a study to find mean commuting distances. The mean number of miles traveled by students was 5.6 and the standard deviation was 2.8. The mean number of miles traveled by professors was 14.3 and the standard deviation was 9.1. Construct a 95% confidence interval for the difference between the means. What assumption have we made?
$$s_p^2 = \frac{(11 - 1)(2.8)^2 + (14 - 1)(9.1)^2}{11 + 14 - 2} \approx 50.21$$
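The commuting example can be carried through with the pooled-variance interval, as in the sketch below. Two assumptions are made explicit: the populations are taken to be approximately normal with equal variances (this is the assumption the example asks about), and the critical value $t_{0.025} = 2.069$ for $v = 23$ degrees of freedom is a table value supplied by hand. The function names are hypothetical.

```python
from math import sqrt

def pooled_variance(s1, n1, s2, n2):
    """Pooled estimate of a common variance from two sample standard deviations."""
    return ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)

def pooled_t_interval(x1, s1, n1, x2, s2, n2, t_crit):
    """Interval for mu1 - mu2, assuming equal but unknown variances."""
    sp = sqrt(pooled_variance(s1, n1, s2, n2))
    margin = t_crit * sp * sqrt(1 / n1 + 1 / n2)
    diff = x1 - x2
    return diff - margin, diff + margin

sp2 = pooled_variance(2.8, 11, 9.1, 14)
# Professors are entered first, so the interval is for mu_prof - mu_student.
low, high = pooled_t_interval(14.3, 9.1, 14, 5.6, 2.8, 11, t_crit=2.069)
print(round(sp2, 2), round(low, 2), round(high, 2))
```

The interval is roughly 2.8 to 14.6 miles. Because the two sample standard deviations differ so much (2.8 versus 9.1), the equal-variance assumption is questionable for these data.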
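If $\bar{x}_1$ and $s_1^2$, and $\bar{x}_2$ and $s_2^2$, are the means and variances of small independent samples of size $n_1$ and $n_2$, respectively, from approximate normal distributions with unknown and unequal variances, an approximate $(1 - \alpha)100\%$ confidence interval for $\mu_1 - \mu_2$ is given by
$$(\bar{x}_1 - \bar{x}_2) - t_{\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} < \mu_1 - \mu_2 < (\bar{x}_1 - \bar{x}_2) + t_{\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$
where $t_{\alpha/2}$ is the t-value with
$$v = \frac{\left(s_1^2/n_1 + s_2^2/n_2\right)^2}{\dfrac{(s_1^2/n_1)^2}{n_1 - 1} + \dfrac{(s_2^2/n_2)^2}{n_2 - 1}}$$
degrees of freedom, leaving an area of $\alpha/2$ to the right.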
EXAMPLE: A Study on the Nutrient Retention and Microinvertebrate Community Response to Sewage Stress in the Stream Ecosystem was conducted to estimate the difference in the amount of the chemical orthophosphorus measured at two different stations on the James River. Orthophosphorus is measured in milligrams per liter. Fifteen samples were collected from station 1 and 12 samples were obtained from station 2. The 15 samples from station 1 had an average orthophosphorus content of 3.84 milligrams per liter and a standard deviation of 3.07 milligrams per liter, while the 12 samples from station 2 had an average content of 1.49 milligrams per liter and a standard deviation of 0.80 milligrams per liter. Find a 95% confidence interval for the difference in the true average orthophosphorus contents at these two stations, assuming that the observations came from normal populations with different variances.
Solution: $\bar{x}_1 - \bar{x}_2 = 3.84 - 1.49 = 2.35$. The degrees of freedom are
$$v = \frac{(3.07^2/15 + 0.80^2/12)^2}{\dfrac{(3.07^2/15)^2}{14} + \dfrac{(0.80^2/12)^2}{11}} = 16.3 \approx 16$$
so $t_{0.025} = 2.120$ and
$$2.35 - 2.120\sqrt{\frac{3.07^2}{15} + \frac{0.80^2}{12}} < \mu_1 - \mu_2 < 2.35 + 2.120\sqrt{\frac{3.07^2}{15} + \frac{0.80^2}{12}}$$
which reduces to $0.60 < \mu_1 - \mu_2 < 4.10$.
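The orthophosphorus example can be checked with the unequal-variance interval. In the sketch below the degrees of freedom are computed from the sample variances and rounded down, and the critical value $t_{0.025} = 2.120$ for $v = 16$ is a table value entered by hand, since the standard library has no t-distribution. The function names are hypothetical.

```python
from math import floor, sqrt

def unequal_var_df(s1, n1, s2, n2):
    """Approximate degrees of freedom when the two variances are not assumed equal."""
    v1, v2 = s1**2 / n1, s2**2 / n2
    return (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))

def unequal_var_interval(x1, s1, n1, x2, s2, n2, t_crit):
    """Interval for mu1 - mu2 with unknown, unequal variances."""
    margin = t_crit * sqrt(s1**2 / n1 + s2**2 / n2)
    diff = x1 - x2
    return diff - margin, diff + margin

df = unequal_var_df(3.07, 15, 0.80, 12)
low, high = unequal_var_interval(3.84, 3.07, 15, 1.49, 0.80, 12, t_crit=2.120)
print(floor(df), round(low, 2), round(high, 2))   # 16 0.6 4.1
```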
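Example: The copper contents of 10 pairs of fresh and canned tomatoes are given below, where d = canned - fresh.

Pair   Fresh Tomato   Canned Tomato   d
1      0.066          0.085            0.019
2      0.079          0.088            0.009
3      0.069          0.091            0.022
4      0.076          0.096            0.020
5      0.071          0.093            0.022
6      0.087          0.095            0.008
7      0.071          0.079            0.008
8      0.073          0.078            0.005
9      0.067          0.065           -0.002
10     0.062          0.068            0.006

Find a 98% confidence interval for the true difference in the mean copper contents of fresh and canned tomatoes, assuming the distribution of differences to be normal.
Given: $n = 10$, $\bar{d} = 0.0117$, $s_d = 0.0084$, and $t_{0.01} = 2.821$ for $v = 9$ degrees of freedom.
Solution:
$$\bar{d} - t_{\alpha/2}\frac{s_d}{\sqrt{n}} < \mu_D < \bar{d} + t_{\alpha/2}\frac{s_d}{\sqrt{n}}$$
$$0.0117 - (2.821)\frac{0.0084}{\sqrt{10}} < \mu_D < 0.0117 + (2.821)\frac{0.0084}{\sqrt{10}}$$
which reduces to $0.0042 < \mu_D < 0.0192$.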
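A sketch of the paired-difference calculation follows. It assumes the usual paired interval $\bar{d} \pm t_{\alpha/2}\, s_d/\sqrt{n}$ with $v = n - 1$ degrees of freedom and takes $t_{0.01} = 2.821$ for $v = 9$ from a table. The fresh value for pair 6 is entered as 0.087, which is the value consistent with the listed difference of 0.008.

```python
from math import sqrt
from statistics import mean, stdev

fresh = [0.066, 0.079, 0.069, 0.076, 0.071, 0.087, 0.071, 0.073, 0.067, 0.062]
canned = [0.085, 0.088, 0.091, 0.096, 0.093, 0.095, 0.079, 0.078, 0.065, 0.068]

# Paired differences, canned minus fresh.
d = [c - f for c, f in zip(canned, fresh)]

d_bar = mean(d)
s_d = stdev(d)                 # sample standard deviation of the differences
t_crit = 2.821                 # t_{0.01} for v = 9 (table value, entered by hand)

margin = t_crit * s_d / sqrt(len(d))
low, high = d_bar - margin, d_bar + margin
print(round(d_bar, 4), round(s_d, 4), round(low, 4), round(high, 4))
```

The interval lies entirely above zero, which points to a higher mean copper content in the canned tomatoes for these data.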