
Alfredo M. Angel Jr.

Reporter

Statistical Method

Descriptive Statistics

Inferential Statistics

Estimation

Test of hypotheses

Inferential statistics consists of the methods by which one makes inferences or generalizations about a population.

Classical method: inferences are based strictly on information obtained from a random sample selected from the population.

Bayesian method: utilizes prior subjective knowledge about the probability distribution of the unknown parameters in conjunction with the information provided by the sample data.

- Estimation refers to the process by which one makes inferences about a population, based on information obtained from a sample.

Point Estimate

Interval Estimate

- A point estimate of a population parameter is a single value of a statistic.

For example, the sample mean $\bar{x}$ is a point estimate of the population mean $\mu$. Similarly, the sample proportion $p$ is a point estimate of the population proportion $P$.

- When the mean of the sampling distribution of a statistic is equal to a population parameter, that statistic is said to be an unbiased estimator of the parameter.

For example, if credit card holders in a city were repeatedly sampled at random and asked what their account balances were as of a specific date, the average of the results across all samples would equal the population parameter. If, however, only credit card holders in one neighborhood were sampled, the average of the sample estimates would be a biased estimate of account balances for the whole city and would not, in general, equal the population parameter.
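The idea of unbiasedness can be illustrated with a small enumeration. This is only a sketch: the five account balances are hypothetical values chosen for illustration, and the "neighborhood" is simply assumed to be the three smallest balances.

```python
from itertools import combinations
from statistics import mean

# Hypothetical population of five account balances (illustrative values only).
population = [120.0, 250.0, 310.0, 475.0, 845.0]
mu = mean(population)

# Every possible sample of size 2 drawn from the whole population.
sample_means = [mean(s) for s in combinations(population, 2)]
mean_of_sample_means = mean(sample_means)

# Sampling from one "neighborhood" only (assumed: the three smallest balances).
neighborhood = population[:3]
biased_means = [mean(s) for s in combinations(neighborhood, 2)]
mean_of_biased_means = mean(biased_means)

print(f"population mean:               {mu:.2f}")
print(f"mean of all sample means:      {mean_of_sample_means:.2f}")
print(f"mean of neighborhood samples:  {mean_of_biased_means:.2f}")
```

Averaged over every possible sample from the whole population, the sample mean recovers the population mean, whereas samples restricted to the neighborhood settle on a different value.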

If we consider all possible unbiased estimators of some parameter $\theta$, the one with the smallest variance is called the most efficient estimator of $\theta$.

Example: if $\hat{\Theta}_1$ and $\hat{\Theta}_2$ are two unbiased estimators of the same population parameter $\theta$, we would choose the estimator whose sampling distribution has the smaller variance. If $\sigma^2_{\hat{\Theta}_1} < \sigma^2_{\hat{\Theta}_2}$, we say $\hat{\Theta}_1$ is a more efficient estimator of $\theta$ than $\hat{\Theta}_2$.

1. Provides a single value
   - Based on observations from one sample
2. Gives no information about how close the value is to the unknown population parameter
3. Example: the sample mean $\bar{X} = 3$ is a point estimate of the unknown population mean

An interval estimate is defined by two numbers, between which a population parameter is said to lie.

For example, $a < \mu < b$ is an interval estimate of the population mean $\mu$. It indicates that the population mean is greater than $a$ but less than $b$.

1. Provides a range of values
   - Based on observations from one sample
2. Gives information about closeness to the unknown population parameter
   - Stated in terms of probability
   - Knowing the exact closeness requires knowing the unknown population parameter
3. Example: the unknown population mean lies between 50 and 70 with 95% confidence

[Figure: a confidence interval drawn around the sample statistic (point estimate), bounded by the lower and upper confidence limits, with an attached probability that the population parameter falls somewhere within the interval.]

A confidence interval gives an estimated range of values which is likely to include an unknown population parameter, the estimated range being calculated from a given set of sample data. If independent samples are taken repeatedly from the same population, and a confidence interval is calculated for each sample, then a certain percentage (the confidence level) of the intervals will include the unknown population parameter. Confidence intervals are usually calculated so that this percentage is 95%, but we can also produce 90%, 99%, 99.9% (or any other level) confidence intervals for the unknown parameter.

The width of the confidence interval gives us some idea about how uncertain we are about the unknown parameter (see precision). A very wide interval may indicate that more data should be collected before anything very definite can be said about the parameter.

Confidence limits are the lower and upper boundaries of a confidence interval, that is, the values which define its range. The upper and lower bounds of a 95% confidence interval are the 95% confidence limits. These limits may be taken for other confidence levels, for example, 90%, 99%, 99.9%.

The confidence level is the probability value $(1 - \alpha)$ associated with a confidence interval. It is often expressed as a percentage. For example, if $\alpha = 0.05 = 5\%$, then the confidence level is equal to $(1 - 0.05) = 0.95$, i.e. a 95% confidence level.
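The link between the confidence level $(1 - \alpha)$ and the normal critical value $z_{\alpha/2}$ used in the intervals for a mean can be sketched as follows. The helper name `z_critical` is hypothetical; `NormalDist.inv_cdf` comes from Python's standard `statistics` module.

```python
from statistics import NormalDist

def z_critical(confidence_level: float) -> float:
    """Return z_{alpha/2}, the standard normal value leaving alpha/2 to its right."""
    alpha = 1.0 - confidence_level
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence level: z = {z_critical(level):.3f}")
```

A higher confidence level gives a larger critical value and therefore a wider interval.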

Confidence Intervals: Mean, Proportion, Variance

$\sigma_X$ known

$\sigma_X$ unknown

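1. Assumptions
   - The population standard deviation $\sigma_X$ is known
   - The population is normally distributed
   - If not normal, it can be approximated by the normal distribution ($n \ge 30$)

[Figure: sampling distribution of $\bar{X}$ with central area $1 - \alpha$ and an area of $\alpha/2$ in each tail.]

2. Confidence interval estimate

$$\bar{X} - z_{\alpha/2}\frac{\sigma_X}{\sqrt{n}} \le \mu_X \le \bar{X} + z_{\alpha/2}\frac{\sigma_X}{\sqrt{n}}$$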

The mean of a random sample of $n = 25$ is $\bar{X} = 50$. Set up a 95% confidence interval estimate for $\mu_X$ if $\sigma_X = 10$.

$$50 - 1.96\frac{10}{\sqrt{25}} \le \mu_X \le 50 + 1.96\frac{10}{\sqrt{25}}$$

$$46.08 \le \mu_X \le 53.92$$
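The calculation above can be reproduced with a short function. This is a minimal sketch assuming a known population standard deviation; the function name `z_interval` and its parameters are illustrative, not part of the original slides.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(x_bar, sigma, n, confidence_level):
    """Confidence interval for a mean when the population sigma is known."""
    alpha = 1.0 - confidence_level
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # z_{alpha/2}
    margin = z * sigma / sqrt(n)                 # half-width of the interval
    return x_bar - margin, x_bar + margin

# Values from the example: n = 25, sample mean 50, sigma 10, 95% confidence.
lower, upper = z_interval(50.0, 10.0, 25, 0.95)
print(f"{lower:.2f} <= mu <= {upper:.2f}")
```

The same function applies to any other example with a known standard deviation by changing the four arguments.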

Example:

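The average zinc concentration recovered from a sample of zinc measurements in 36 different locations was found to be 2.6 grams per milliliter. Find the 95% and 99% confidence intervals for the mean zinc concentration in the river. Assume that the population standard deviation is 0.3.

The 95% confidence interval, with $z_{0.025} = 1.96$:

$$2.6 - 1.96\frac{0.3}{\sqrt{36}} < \mu < 2.6 + 1.96\frac{0.3}{\sqrt{36}}, \quad\text{or}\quad 2.50 < \mu < 2.70$$

The 99% confidence interval, with $z_{0.005} = 2.575$:

$$2.6 - 2.575\frac{0.3}{\sqrt{36}} < \mu < 2.6 + 2.575\frac{0.3}{\sqrt{36}}, \quad\text{or}\quad 2.47 < \mu < 2.73$$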

If $\bar{x}$ is used as an estimate of $\mu$, we can then be $(1 - \alpha)100\%$ confident that the error will not exceed $z_{\alpha/2}\,\sigma/\sqrt{n}$.

Theorem 9.2
If $\bar{x}$ is used as an estimate of $\mu$, we can be $(1 - \alpha)100\%$ confident that the error will not exceed a specified amount $e$ when the sample size is

$$n = \left(\frac{z_{\alpha/2}\,\sigma}{e}\right)^2$$

Example: How large a sample is required in Example 9.2 if we want to be 95% confident that our estimate of $\mu$ is off by less than 0.05?

Solution: With $\sigma = 0.3$, $z_{0.025} = 1.96$ and $e = 0.05$,

$$n = \left(\frac{(1.96)(0.3)}{0.05}\right)^2 = 138.3$$

Rounding up, we can be 95% confident that a random sample of size 139 will provide an estimate $\bar{x}$ differing from $\mu$ by an amount less than 0.05.
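The sample-size rule can be sketched in a few lines. The function name `required_sample_size` is hypothetical; the result is rounded up because a fractional observation is not possible and rounding down would exceed the allowed error.

```python
from math import ceil
from statistics import NormalDist

def required_sample_size(sigma, max_error, confidence_level):
    """Smallest n for which z_{alpha/2} * sigma / sqrt(n) does not exceed max_error."""
    alpha = 1.0 - confidence_level
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return ceil((z * sigma / max_error) ** 2)

# Values from the example: sigma = 0.3, error below 0.05, 95% confidence.
n = required_sample_size(0.3, 0.05, 0.95)
print(n)
```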

Confidence interval for the mean, $\sigma_X$ unknown

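If $\bar{x}$ and $s$ are the mean and the standard deviation of a random sample from a normal population with unknown variance, a $(1 - \alpha)100\%$ confidence interval for $\mu$ is given by

$$\bar{x} - t_{\alpha/2}\frac{s}{\sqrt{n}} < \mu < \bar{x} + t_{\alpha/2}\frac{s}{\sqrt{n}}$$

where $t_{\alpha/2}$ is the t-value with $v = n - 1$ degrees of freedom, leaving an area of $\alpha/2$ to the right.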

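Example: The contents of 7 similar containers of sulfuric acid are 9.8, 10.2, 10.4, 9.8, 10.0, 10.2, and 9.6 liters. Find a 95% confidence interval for the mean of all such containers, assuming an approximately normal distribution.

Solution: The sample mean is $\bar{x} = 10.0$ and the standard deviation is $s = 0.283$; $t_{0.025} = 2.447$ for $v = 6$ degrees of freedom. Hence

$$10.0 - 2.447\frac{0.283}{\sqrt{7}} < \mu < 10.0 + 2.447\frac{0.283}{\sqrt{7}}, \quad\text{or}\quad 9.74 < \mu < 10.26$$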
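The sulfuric acid example can be checked directly from the raw measurements. In this sketch the critical value 2.447 is taken from the solution above rather than computed, since Python's standard library has no t-distribution quantile function.

```python
from math import sqrt
from statistics import mean, stdev

contents = [9.8, 10.2, 10.4, 9.8, 10.0, 10.2, 9.6]  # liters, from the example

n = len(contents)
x_bar = mean(contents)
s = stdev(contents)   # sample standard deviation (divides by n - 1)
t_crit = 2.447        # t_{0.025} with v = 6, as quoted in the solution

margin = t_crit * s / sqrt(n)
lower, upper = x_bar - margin, x_bar + margin
print(f"mean = {x_bar:.1f}, s = {s:.3f}")
print(f"{lower:.2f} < mu < {upper:.2f}")
```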

The estimator $\bar{x}$ of $\mu$ with $\sigma$ known

The estimator $\bar{x}$ of $\mu$ with $\sigma$ unknown

Example: An experiment was conducted in which two types of engines, A and B, were compared. Gas mileage in miles per gallon was measured. Fifty experiments were conducted using engine type A and 75 experiments were done using engine type B. The gasoline used and other conditions were held constant. The average gas mileage for engine A was 36 miles per gallon and the average for engine B was 42 miles per gallon. Find the 96% confidence interval on $\mu_B - \mu_A$, where $\mu_A$ and $\mu_B$ are the population mean gas mileages for engines A and B, respectively. Assume that the population standard deviations are 6 and 8 for engines A and B, respectively.

Solution: The point estimate of $\mu_B - \mu_A$ is $42 - 36 = 6$. With $\alpha = 0.04$, $z_{0.02} = 2.05$, so

$$6 - 2.05\sqrt{\frac{64}{75} + \frac{36}{50}} < \mu_B - \mu_A < 6 + 2.05\sqrt{\frac{64}{75} + \frac{36}{50}}$$

or $3.43 < \mu_B - \mu_A < 8.57$.
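A minimal sketch of the engine comparison, assuming known population standard deviations. The critical value 2.05 is the rounded $z_{0.02}$ quoted in the solution; an unrounded value would shift the limits slightly.

```python
from math import sqrt

# Values from the example (engine A: n = 50, engine B: n = 75).
mean_a, sigma_a, n_a = 36.0, 6.0, 50
mean_b, sigma_b, n_b = 42.0, 8.0, 75

z_crit = 2.05  # z_{0.02} as quoted in the solution (rounded table value)

point_estimate = mean_b - mean_a
# Standard error of the difference between two independent sample means.
std_error = sqrt(sigma_b**2 / n_b + sigma_a**2 / n_a)
margin = z_crit * std_error

lower, upper = point_estimate - margin, point_estimate + margin
print(f"{lower:.2f} < mu_B - mu_A < {upper:.2f}")
```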

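If $\bar{x}_1$ and $\bar{x}_2$ are the means of independent random samples of size $n_1$ and $n_2$, respectively, from approximately normal populations with unknown but equal variances, a $(1 - \alpha)100\%$ confidence interval for $\mu_1 - \mu_2$ is given by

$$(\bar{x}_1 - \bar{x}_2) - t_{\alpha/2}\,s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}} < \mu_1 - \mu_2 < (\bar{x}_1 - \bar{x}_2) + t_{\alpha/2}\,s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$

where $s_p$ is the pooled estimate of the population standard deviation and $t_{\alpha/2}$ is the t-value with $v = n_1 + n_2 - 2$ degrees of freedom, leaving an area of $\alpha/2$ to the right.

Pooled variance:

$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$$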

Example: What is the difference between the commuting patterns of students and professors? Eleven students and 14 professors took part in a study to find mean commuting distances. The mean number of miles traveled by students was 5.6 and the standard deviation was 2.8. The mean number of miles traveled by professors was 14.3 and the standard deviation was 9.1. Construct a 95% confidence interval for the difference between the means. What assumption have we made?

$\bar{x}_1 = 5.6$, $\bar{x}_2 = 14.3$, $s_1 = 2.8$, $s_2 = 9.1$, $n_1 = 11$, $n_2 = 14$
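One way to work through the commuting example with the pooled-variance interval is sketched below. The critical value $t_{0.025} = 2.069$ for 23 degrees of freedom is an assumption taken from a standard t-table; it does not appear in the original slides.

```python
from math import sqrt

# Summary statistics from the example (1 = students, 2 = professors).
x1, s1, n1 = 5.6, 2.8, 11
x2, s2, n2 = 14.3, 9.1, 14

df = n1 + n2 - 2
# Pooled variance: weighted average of the two sample variances.
sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df
sp = sqrt(sp2)

t_crit = 2.069  # assumed table value of t_{0.025} for v = 23

diff = x1 - x2
margin = t_crit * sp * sqrt(1 / n1 + 1 / n2)
lower, upper = diff - margin, diff + margin
print(f"df = {df}, pooled variance = {sp2:.2f}")
print(f"{lower:.2f} < mu_1 - mu_2 < {upper:.2f}")
```

The interval relies on the assumption stated in the formula: both populations are approximately normal with equal variances. The very different sample standard deviations (2.8 and 9.1) suggest that assumption deserves scrutiny here.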

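If $\bar{x}_1$ and $s_1^2$, and $\bar{x}_2$ and $s_2^2$, are the means and variances of small independent samples of size $n_1$ and $n_2$, respectively, from approximately normal distributions with unknown and unequal variances, an approximate $(1 - \alpha)100\%$ confidence interval for $\mu_1 - \mu_2$ is given by

$$(\bar{x}_1 - \bar{x}_2) - t_{\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} < \mu_1 - \mu_2 < (\bar{x}_1 - \bar{x}_2) + t_{\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$

where $t_{\alpha/2}$ is the t-value with

$$v = \frac{\left(s_1^2/n_1 + s_2^2/n_2\right)^2}{\dfrac{(s_1^2/n_1)^2}{n_1 - 1} + \dfrac{(s_2^2/n_2)^2}{n_2 - 1}}$$

degrees of freedom, leaving an area of $\alpha/2$ to the right.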

Example: A study on the Nutrient Retention and Macroinvertebrate Community Response to Sewage Stress in a Stream Ecosystem was conducted to estimate the difference in the amount of the chemical orthophosphorus measured at two different stations on the James River. Orthophosphorus is measured in milligrams per liter. Fifteen samples were collected from station 1 and 12 samples were obtained from station 2. The 15 samples from station 1 had an average orthophosphorus content of 3.84 milligrams per liter and a standard deviation of 3.07 milligrams per liter, while the 12 samples from station 2 had an average content of 1.49 milligrams per liter and a standard deviation of 0.80 milligrams per liter. Find a 95% confidence interval for the difference in the true average orthophosphorus contents at these two stations, assuming that the observations came from normal populations with different variances.

Solution: The point estimate of $\mu_1 - \mu_2$ is $3.84 - 1.49 = 2.35$. The degrees of freedom are

$$v = \frac{\left(3.07^2/15 + 0.80^2/12\right)^2}{\dfrac{(3.07^2/15)^2}{14} + \dfrac{(0.80^2/12)^2}{11}} = 16.3 \approx 16$$

With $t_{0.025} = 2.120$ for $v = 16$ degrees of freedom,

$$2.35 - 2.120\sqrt{\frac{3.07^2}{15} + \frac{0.80^2}{12}} < \mu_1 - \mu_2 < 2.35 + 2.120\sqrt{\frac{3.07^2}{15} + \frac{0.80^2}{12}}$$

or $0.60 < \mu_1 - \mu_2 < 4.10$.
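The unequal-variance calculation for the orthophosphorus example can be sketched as follows. The degrees of freedom are rounded down to a whole number, and the critical value $t_{0.025} = 2.120$ for 16 degrees of freedom is an assumption taken from a standard t-table.

```python
from math import floor, sqrt

# Summary statistics from the example (station 1 and station 2).
x1, s1, n1 = 3.84, 3.07, 15
x2, s2, n2 = 1.49, 0.80, 12

v1 = s1**2 / n1  # estimated variance of the first sample mean
v2 = s2**2 / n2  # estimated variance of the second sample mean

# Approximate degrees of freedom for unequal variances.
df_exact = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
df = floor(df_exact)

t_crit = 2.120  # assumed table value of t_{0.025} for v = 16

diff = x1 - x2
margin = t_crit * sqrt(v1 + v2)
lower, upper = diff - margin, diff + margin
print(f"v = {df_exact:.2f}, rounded down to {df}")
print(f"{lower:.2f} < mu_1 - mu_2 < {upper:.2f}")
```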

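For paired observations, if $\bar{d}$ and $s_d$ are the mean and standard deviation of the normally distributed differences of $n$ random pairs of measurements, a $(1 - \alpha)100\%$ confidence interval for $\mu_D = \mu_1 - \mu_2$ is

$$\bar{d} - t_{\alpha/2}\frac{s_d}{\sqrt{n}} < \mu_D < \bar{d} + t_{\alpha/2}\frac{s_d}{\sqrt{n}}$$

where $t_{\alpha/2}$ is the t-value with $v = n - 1$ degrees of freedom, leaving an area of $\alpha/2$ to the right.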

Example:
Pair   Fresh tomato   Canned tomato   d
1      0.066          0.085            0.019
2      0.079          0.088            0.009
3      0.069          0.091            0.022
4      0.076          0.096            0.020
5      0.071          0.093            0.022
6      0.087          0.095            0.008
7      0.071          0.079            0.008
8      0.073          0.078            0.005
9      0.067          0.065           -0.002
10     0.062          0.068            0.006

Find a 98% confidence interval for the true difference in the mean copper contents of fresh and canned tomatoes assuming the distribution of differences to be normal.

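Given: $\bar{d} = 0.0117$, $s_d = 0.0084$, $\alpha = 0.02$, $t_{0.01} = 2.821$, df $= 9$

Solution:

$$0.0117 - 2.821\frac{0.0084}{\sqrt{10}} < \mu_D < 0.0117 + 2.821\frac{0.0084}{\sqrt{10}}$$

or $0.0042 < \mu_D < 0.0192$.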
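The paired-difference figures can be recomputed from the d column of the table. In this sketch the critical value $t_{0.01} \approx 2.821$ for 9 degrees of freedom is an assumption taken from a standard t-table.

```python
from math import sqrt
from statistics import mean, stdev

# Differences d (canned minus fresh) for the 10 pairs in the table.
d = [0.019, 0.009, 0.022, 0.020, 0.022, 0.008, 0.008, 0.005, -0.002, 0.006]

n = len(d)
d_bar = mean(d)
s_d = stdev(d)   # sample standard deviation of the differences
t_crit = 2.821   # assumed table value of t_{0.01} for v = 9

margin = t_crit * s_d / sqrt(n)
lower, upper = d_bar - margin, d_bar + margin
print(f"d_bar = {d_bar:.4f}, s_d = {s_d:.4f}")
print(f"{lower:.4f} < mu_D < {upper:.4f}")
```

Because the whole interval lies above zero, the data suggest that the canned tomatoes have the higher mean copper content.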
