
Assignment

On
Estimating a Population Mean
Course Title: Research Methodology
Course Code : REM - 312
SUBMITTED TO:
Professor Gouranga Ch. Chanda
Lecturer
Faculty of Business Administration
BGC Trust University Bangladesh

SUBMITTED BY:
Md. Hasan Khan
Id No: 130105152
Sec: A
Semester: 6th
Batch: 5th
Faculty of Business Administration
BGC Trust University, Bangladesh.

Date of Submission: 19/12/2015

Population:
The entire pool from which a statistical sample is drawn. The information
obtained from the sample allows statisticians to develop hypotheses about
the larger population. Researchers gather information from a sample because
of the difficulty of studying the entire population. In statistical equations,
the population size is usually denoted with a capital 'N', while the sample
size is usually denoted with a lowercase 'n'.
In statistics, population refers to the total set of observations that can be
made.
For example, if we are studying the weight of adult women, the population
is the set of weights of all adult women in the world. If we are studying the
grade point average (GPA) of students at Harvard, the population is the set
of GPAs of all the students at Harvard.

Mean:
A mean score is an average score, often denoted by X̄ (X-bar). It is the sum
of individual scores divided by the number of individuals. The mean is the
average of the numbers: a calculated "central" value of a set of numbers.
To calculate: just add up all the numbers, then divide by how many numbers
there are.
Thus, if you have a set of N numbers (X1, X2, X3, ..., XN), the mean of
those numbers is defined as:
X̄ = (X1 + X2 + X3 + ... + XN) / N = [Σ Xi] / N
Example: what is the mean of 2, 7 and 9?
Add the numbers: 2 + 7 + 9 = 18
Divide by how many numbers (i.e. we added 3 numbers): 18 ÷ 3 = 6
So the mean is 6.
The mean is also described as the simple or arithmetic average of a range of
values or quantities, computed by dividing the total of all values by the
number of values. For example, the mean of 1, 2, 3, 4, and 5 is (15 ÷ 5) = 3.
It is the most common and best general-purpose measure of the mid-point
(around which all other values cluster) of a set of values, but it is prone
to distortion by extreme values and may need to be reported together with a
measure of dispersion (such as the mean deviation or standard deviation).
It is also called the arithmetic mean.
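
The two worked examples above can be checked with a few lines of Python. This is only a minimal sketch: the function name arithmetic_mean and the list with_outlier are made up for illustration, and the outlier data is invented to show how one extreme value distorts the mean.

```python
from statistics import median

def arithmetic_mean(values):
    """Sum of the values divided by how many values there are."""
    values = list(values)
    if not values:
        raise ValueError("the mean of an empty data set is undefined")
    return sum(values) / len(values)

small = arithmetic_mean([2, 7, 9])        # 18 / 3
run = arithmetic_mean([1, 2, 3, 4, 5])    # 15 / 5

# Invented data: one extreme value pulls the mean far from the bulk
# of the values, while the median stays where most values are.
with_outlier = [1, 2, 3, 4, 500]
mean_with_outlier = arithmetic_mean(with_outlier)
median_with_outlier = median(with_outlier)

print(small, run, mean_with_outlier, median_with_outlier)
```

The last line shows why the mean alone can mislead when extreme values are present.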

Types of Mean and Their Formulas:


In mathematics, the average of two or more numbers is called the mean. The
most commonly used mean is called the arithmetic mean, in which you add
up all the values and divide by the number of values. For example, the
arithmetic mean of 2, 5, and 14 is (2+5+14)/3 = 7. However, there is more
than one kind of mean, including the geometric mean, harmonic mean, root
mean square, and several others.
The essential property of any mean is that it must fall between the highest
value and the lowest value. When computing means, the type of mean you
need to use depends on the type of data you are analyzing. Several formulas
and examples are discussed below.

Geometric Mean:
The geometric mean of two numbers x and y is sqrt(xy). If you have three
numbers x, y, and z, their geometric mean is cbrt(xyz), where "cbrt" means
the cube root. For n numbers, the geometric mean is
(x1 x2 ... xn)^(1/n)
Business analysts and scientists use the geometric mean to find the average
growth rate of a process. For example, suppose a business's profits grow by
25% one year, and by 45.8% the next year. To find the average yearly
percent growth rate, you must take the geometric mean of 1.25 and 1.458.
sqrt[(1.25)(1.458)]
= sqrt[1.8225]
= 1.35

Thus, the average growth rate over the two years was 35%. Compare this to
the result you would get if you took the arithmetic mean of 25 and 45.8.
Since (25+45.8)/2 = 35.4, you would get an answer that is too high.
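
The growth-rate example can be reproduced as follows. This is a minimal sketch, assuming Python 3.8 or later for math.prod; the function name geometric_mean is chosen here for illustration only.

```python
import math

def geometric_mean(values):
    """n-th root of the product of n positive values."""
    values = list(values)
    if not values or any(v <= 0 for v in values):
        raise ValueError("this sketch needs at least one value, all positive")
    return math.prod(values) ** (1 / len(values))

growth_factors = [1.25, 1.458]            # +25% and +45.8% as multipliers
avg_factor = geometric_mean(growth_factors)
avg_growth_pct = (avg_factor - 1) * 100   # about 35

# Check: growing by the average factor twice gives the same total
# growth as the two actual years.
total_actual = math.prod(growth_factors)
total_from_avg = avg_factor ** 2

# The arithmetic mean of the two rates compounds to more than the
# actual total, which is why it is too high.
arithmetic_factor = 1 + (25 + 45.8) / 2 / 100
print(avg_growth_pct, total_actual, arithmetic_factor ** 2)
```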

Harmonic Mean:
In science and business applications, the harmonic mean is used to average
ratios. For two numbers x and y, the harmonic mean is 2xy/(x+y). For three
numbers x, y, and z, the harmonic mean is 3xyz/(xy+xz+yz). For n numbers,
the harmonic mean is
n/(1/x1 + 1/x2 + ... + 1/xn)
There are many instances where the harmonic mean is the most appropriate
way to find the average value.
For example, suppose a man drives at a speed of 80 km/h for 100 kilometers
(1.25 hours), and then drives at a speed of 40 km/h for the next 100 km (2.5
hours). The average speed of the car for the entire 200 km trip is total
distance divided by total time. Since 200/(1.25+2.5) = 53.33, the average
speed is 53.33 km/h. This is equivalent to the harmonic mean of 80 and 40.
Observe:
2(80)(40)/(80+40)
= 6400/120
= 53.33
In business, investors use the harmonic mean to compute the average
price/earnings ratio of a stock portfolio. For example, suppose you have three
stocks, and their P/E ratios are 8, 18, and 30. The average P/E ratio of the
three stocks is
3(8)(18)(30)/(144+240+540)
= 12960/924
= 14.026
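
Both harmonic mean examples can be verified with the general n-number formula. The sketch below is illustrative; the function name harmonic_mean is an assumption, and the speed check simply recomputes total distance over total time.

```python
def harmonic_mean(values):
    """n divided by the sum of the reciprocals of n positive values."""
    values = list(values)
    if not values or any(v <= 0 for v in values):
        raise ValueError("this sketch needs at least one value, all positive")
    return len(values) / sum(1 / v for v in values)

# Driving example: two legs of 100 km each at 80 km/h and 40 km/h.
avg_speed = harmonic_mean([80, 40])
total_time = 100 / 80 + 100 / 40          # 1.25 h + 2.5 h
speed_from_totals = 200 / total_time      # total distance / total time

# P/E example from the text.
avg_pe = harmonic_mean([8, 18, 30])

print(round(avg_speed, 2), round(speed_from_totals, 2), round(avg_pe, 3))
```

The harmonic mean matches total distance over total time here because both legs cover the same distance.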

Root Mean Square (Quadratic Mean):


The root mean square, aka quadratic mean, is used in many engineering and
statistical applications, especially when there are data points that can be
negative. The standard deviation of a set of numbers is an example of the
root mean square. (It is the root mean square of the differences between each
data point and the arithmetic mean.) If you have two numbers x and y, the
quadratic mean is sqrt[(x^2 + y^2)/2]. For n variables, it is
sqrt[(x1^2 + x2^2 + ... + xn^2)/n]
For example, suppose you have this set of numbers: -10, -5, -4, 1, 6, 7. The
root mean square is
sqrt[(100+25+16+1+36+49)/6]
= sqrt(227/6)
= 6.15
which can be interpreted as the typical magnitude of the values, regardless
of their sign.
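
The root mean square example, and the remark that the standard deviation is itself a root mean square, can be checked like this. It is a minimal sketch; root_mean_square is an illustrative name, and statistics.pstdev is the population standard deviation from the Python standard library.

```python
import math
from statistics import pstdev

def root_mean_square(values):
    """Square root of the average of the squared values."""
    values = list(values)
    if not values:
        raise ValueError("the RMS of an empty data set is undefined")
    return math.sqrt(sum(v * v for v in values) / len(values))

data = [-10, -5, -4, 1, 6, 7]
rms = root_mean_square(data)              # sqrt(227 / 6)
plain_mean = sum(data) / len(data)        # negatives and positives cancel

# The population standard deviation is the RMS of the deviations
# of each data point from the arithmetic mean.
deviations = [v - plain_mean for v in data]
rms_of_deviations = root_mean_square(deviations)

print(round(rms, 2), round(plain_mean, 2), round(rms_of_deviations, 2))
```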

Contraharmonic Mean:
The contraharmonic mean of x and y is (x^2 + y^2)/(x + y). For n values, the
contraharmonic mean is
(x1^2 + x2^2 + ... + xn^2)/(x1 + x2 + ... + xn)
For example, the contraharmonic mean of 1, 3, 5, and 7 is
(1+9+25+49)/(1+3+5+7) = 84/16 = 5.25
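
A short check of the contraharmonic example, as a sketch only (the function name contraharmonic_mean is chosen for illustration):

```python
def contraharmonic_mean(values):
    """Sum of the squares divided by the plain sum of the values."""
    values = list(values)
    total = sum(values)
    if total == 0:
        raise ValueError("undefined when the values sum to zero")
    return sum(v * v for v in values) / total

data = [1, 3, 5, 7]
c_mean = contraharmonic_mean(data)        # 84 / 16
a_mean = sum(data) / len(data)            # 16 / 4, for comparison

print(c_mean, a_mean)
```

For these values the contraharmonic mean is larger than the arithmetic mean, because squaring gives more weight to the larger values.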

Other Means:
[(x^p + y^p)/2]^(1/p)    (Power Mean)
[(x^p - y^p)/(p(x - y))]^(1/(p-1))    (Stolarsky Mean); equals sqrt[(x^2 + xy + y^2)/3] when p = 3
(x^p + y^p)/(x^(p-1) + y^(p-1))    (Lehmer Mean)
[(x^p + y^p)/(x^r + y^r)]^(1/(p-r))
[(r(x^p - y^p))/(p(x^r - y^r))]^(1/(p-r))
[(x^p y^r + x^r y^p)/2]^(1/(p+r))
(x - y)/(Ln(x) - Ln(y))    (Log Mean)
(x Ln(x) + y Ln(y))/(Ln(x) + Ln(y))
(x + sqrt(xy) + y)/3    (Heronian Mean)
(1/e)(x^x/y^y)^(1/(x-y)), e = 2.718281828...    (Identric Mean)
(e)(x^y/y^x)^(1/(y-x))
(x^x y^y)^(1/(x+y))
(x^y y^x)^(1/(x+y))
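
A few of these two-number formulas are easy to try out. The sketch below covers only the power, log, and Heronian means; the function names and the test values x = 2, y = 8 are assumptions made for illustration. It also shows that the power mean reproduces the arithmetic mean at p = 1, the root mean square at p = 2, and the harmonic mean at p = -1.

```python
import math

def power_mean(x, y, p):
    """[(x^p + y^p)/2]^(1/p) for positive x, y and p != 0."""
    if x <= 0 or y <= 0 or p == 0:
        raise ValueError("this sketch needs positive inputs and p != 0")
    return ((x ** p + y ** p) / 2) ** (1 / p)

def log_mean(x, y):
    """(x - y)/(ln x - ln y) for distinct positive x and y."""
    if x <= 0 or y <= 0 or x == y:
        raise ValueError("this sketch needs distinct positive inputs")
    return (x - y) / (math.log(x) - math.log(y))

def heronian_mean(x, y):
    """(x + sqrt(xy) + y)/3 for non-negative x and y."""
    return (x + math.sqrt(x * y) + y) / 3

x, y = 2.0, 8.0
arithmetic = power_mean(x, y, 1)
quadratic = power_mean(x, y, 2)
harmonic = power_mean(x, y, -1)

print(arithmetic, quadratic, harmonic, log_mean(x, y), heronian_mean(x, y))
```

Every value printed lies between 2 and 8, as any mean of 2 and 8 must.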

Mean Inequalities:
Some means are in a constant relationship to one another. If we denote the
arithmetic mean of x and y by A, their geometric mean by G, their
harmonic mean by H, their root mean square by R, and their contraharmonic
mean by C, then for positive x and y the following chain of inequalities is
always true:
C ≥ R ≥ A ≥ G ≥ H.

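
The chain of inequalities can be checked numerically for any pair of positive numbers. This is a minimal sketch; the function name two_number_means and the sample pair (2, 8) are chosen here for illustration.

```python
import math

def two_number_means(x, y):
    """Return (C, R, A, G, H) for two positive numbers."""
    if x <= 0 or y <= 0:
        raise ValueError("the chain is checked here for positive numbers only")
    c = (x * x + y * y) / (x + y)          # contraharmonic mean
    r = math.sqrt((x * x + y * y) / 2)     # root mean square
    a = (x + y) / 2                        # arithmetic mean
    g = math.sqrt(x * y)                   # geometric mean
    h = 2 * x * y / (x + y)                # harmonic mean
    return c, r, a, g, h

c, r, a, g, h = two_number_means(2, 8)
chain_holds = c >= r >= a >= g >= h

# When both numbers are equal, all five means coincide.
equal_case = two_number_means(3, 3)

print(c, r, a, g, h, chain_holds)
```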
Estimating a population mean:
We use the sample mean ȳ as an estimator of the population mean μ. Is it a
good estimator?
An estimator is good if:
1. It is unbiased.
2. It has a small standard error.

An estimator is unbiased if the mean of its sampling distribution equals
the parameter we are trying to estimate.
ȳ is unbiased for μ because E(ȳ) = μ_ȳ = μ.
In plain English: if we were to draw 100 samples of size n from some
population with mean μ, and were to compute ȳ in each of the 100
samples, the average of those 100 values of ȳ would be close to μ.
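
The "100 samples" idea can be simulated. The sketch below is only an illustration under stated assumptions: the population is taken to be normal with mean 50 and standard deviation 10, the sample size is 25, and the seed is fixed so the run is repeatable. None of these numbers come from the text.

```python
import random
from statistics import mean

random.seed(42)                 # fixed seed so the run is repeatable

MU, SIGMA = 50.0, 10.0          # assumed population mean and SD
N_SAMPLES, n = 100, 25          # 100 samples, each of size n

sample_means = []
for _ in range(N_SAMPLES):
    sample = [random.gauss(MU, SIGMA) for _ in range(n)]
    sample_means.append(mean(sample))

# Individual sample means scatter around MU, but their average
# should land close to MU, which is what unbiasedness predicts.
average_of_means = mean(sample_means)

print(round(min(sample_means), 2), round(max(sample_means), 2),
      round(average_of_means, 2))
```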

Population mean (contd):


Recall that if y ~ (μ, σ^2), then the sampling distribution of ȳ is
approximately N(μ, σ^2/n) for large n (exactly so when y itself is normal).
As n increases, σ^2/n decreases: the larger the sample, the more reliable
ȳ will be as an estimator of μ.
The parameter σ/sqrt(n) is called the standard error of the mean and is
estimated by S/sqrt(n).
If ȳ ~ N(μ, σ^2/n) then
Prob(ȳ - 2σ/sqrt(n) < μ < ȳ + 2σ/sqrt(n)) ≈ 0.95
(exactly equal to 0.95 if we use 1.96 instead of 2).
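
Two claims in this section can be checked directly: the standard error shrinks as n grows, and about 95% of a normal distribution lies within 2 standard errors of its mean. This is a sketch using statistics.NormalDist from the Python standard library (3.8 or later); the value sigma = 10 and the sample sizes are assumed for illustration.

```python
from math import sqrt
from statistics import NormalDist

sigma = 10.0                      # assumed population SD, for illustration

def standard_error(sigma, n):
    """Standard error of the mean: sigma / sqrt(n)."""
    return sigma / sqrt(n)

# Quadrupling the sample size halves the standard error.
se_by_n = {n: standard_error(sigma, n) for n in (25, 100, 400)}

std_normal = NormalDist()         # mean 0, SD 1

def central_probability(z):
    """Probability that a standard normal value lies between -z and +z."""
    return std_normal.cdf(z) - std_normal.cdf(-z)

p_two = central_probability(2.0)      # a little above 0.95
p_196 = central_probability(1.96)     # 0.95 to four decimals

print(se_by_n, round(p_two, 4), round(p_196, 4))
```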

Confidence intervals:
Since ȳ will fall within 2σ/sqrt(n) of the population mean μ approximately
95% of the time, the interval from ȳ - 2σ/sqrt(n) to ȳ + 2σ/sqrt(n) will
cover μ about 95% of the time in repeated sampling.
A 100(1 - α)% confidence interval for μ is ȳ ± z_(α/2) σ/sqrt(n),
where z_(α/2) is the z value with an area equal to α/2 to its right.
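
The phrase "in repeated sampling" can be made concrete with a simulation that builds many intervals and counts how often they cover the true mean. This is a sketch under assumed values: a normal population with mean 100 and known standard deviation 15, samples of size 30, 2000 repetitions, and a fixed seed.

```python
import random
from math import sqrt
from statistics import mean

random.seed(7)                        # fixed seed so the run is repeatable

MU, SIGMA, n = 100.0, 15.0, 30        # assumed population and sample size
REPEATS = 2000
half_width = 2 * SIGMA / sqrt(n)      # 2 standard errors, sigma known

covered = 0
for _ in range(REPEATS):
    sample = [random.gauss(MU, SIGMA) for _ in range(n)]
    y_bar = mean(sample)
    # Does the interval y_bar +/- half_width contain the true mean?
    if y_bar - half_width < MU < y_bar + half_width:
        covered += 1

coverage = covered / REPEATS          # expected to be near 0.95

print(round(coverage, 3))
```

Each single interval either covers the true mean or does not; the 95% describes the long-run share of intervals that do.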

Confidence intervals:
We can construct confidence intervals with any confidence coefficient
(1 - α).
For a 90% confidence interval, use 1.64 instead of 2 (or 1.96 to be
precise), because:
1. α = 0.10 and α/2 = 0.05
2. The z-value with 0.05 to its right (or 0.95 to its left) is 1.64 from the
standard normal table (1.645 to three decimals).
For a 99% confidence interval, use 2.58 (or z_0.005) instead of 2.
Note: the wider the interval, the higher the confidence that it will cover
μ. Thus, a 99% confidence interval for μ computed from the same sample will
always be wider than a 90% interval.
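
Instead of reading the z values from a table, they can be computed from the inverse of the standard normal distribution. This is a sketch using statistics.NormalDist (Python 3.8 or later); the function name z_critical is chosen here for illustration.

```python
from statistics import NormalDist

def z_critical(confidence):
    """z value with area alpha/2 to its right, where alpha = 1 - confidence."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)

z_values = {level: z_critical(level) for level in (0.90, 0.95, 0.99)}

for level, z in z_values.items():
    print(f"{level:.0%} confidence: z = {z:.3f}")
```

The computed values agree with the table values quoted in the text up to rounding in the last digit.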

Confidence intervals (contd)


Example: Attention times given by parents to sets of twin boys during
one week (Table 1.9, page 36).
n = 50, ȳ = 20.85 and S = 13.41.
A 90% CI for the true mean attention time is
ȳ ± 1.64 S/sqrt(n) = 20.85 ± 1.64 (13.41/sqrt(50)) = 20.85 ± 3.11.
95% CI: ȳ ± 2 S/sqrt(n) = 20.85 ± 2 (13.41/sqrt(50)) = 20.85 ± 3.79.
99% CI: ȳ ± 2.57 S/sqrt(n) = 20.85 ± 2.57 (13.41/sqrt(50)) = 20.85 ± 4.87.
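
The three intervals in this example can be recomputed from n, ȳ and S. This is a minimal sketch; the function name z_interval is illustrative, and the multipliers 1.64, 2 and 2.57 are the ones used in the example. The margins are computed from the unrounded S/sqrt(n), so the last digit can differ slightly from a hand calculation that rounds intermediate values.

```python
from math import sqrt

n, y_bar, s = 50, 20.85, 13.41        # sample size, sample mean, sample SD
standard_error = s / sqrt(n)          # estimated standard error of the mean

def z_interval(multiplier):
    """Return (lower, upper, margin) for y_bar +/- multiplier * SE."""
    margin = multiplier * standard_error
    return y_bar - margin, y_bar + margin, margin

ci90 = z_interval(1.64)
ci95 = z_interval(2.0)
ci99 = z_interval(2.57)

for label, (low, high, margin) in (("90%", ci90), ("95%", ci95), ("99%", ci99)):
    print(f"{label} CI: {low:.2f} to {high:.2f} (margin {margin:.2f})")
```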

Note that we used the sample standard deviation S in place of the
unknown population standard deviation σ to compute the CI.
This is OK only if n is large enough (more than 30).
If σ is unknown (as it usually is) and n < 30, we compute the CI using
t_(α/2) instead of z_(α/2) (Student's t table instead of the z table).
The value t_(α/2) is the upper-tail t value such that an area equal to
α/2 lies to its right.

Confidence intervals (contd)
To get the appropriate value out of a t table we need:
1. The degrees of freedom, which equal n - 1 in this type of application.
2. The desired confidence coefficient (1 - α).
For small n, a 100(1 - α)% CI for μ is ȳ ± t_(α/2, n-1) S/sqrt(n).
Example: silica concentration in saline water.
n = 5 (small!), ȳ = 239.2, S = 29.3 and df = 4.
For a 95% CI for the true silica concentration: t_(α/2, 4) = t_(0.05/2, 4) = 2.776.
Then, the 95% CI for μ is 239.2 ± 2.776 (29.3/sqrt(5)) = 239.2 ± 36.4.
If we had wished to obtain a 90% or a 99% CI for the mean, then the
corresponding t values (from the table) would have been 2.132 and
4.604, respectively (see Table C.2).
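
The small-sample interval can be recomputed the same way. In this sketch the t values for 4 degrees of freedom are simply copied from the text rather than computed, and the names T_TABLE_DF4 and t_interval are made up for illustration.

```python
from math import sqrt

# Upper-tail t values for 4 degrees of freedom, copied from the text
# (keys are confidence levels).
T_TABLE_DF4 = {0.90: 2.132, 0.95: 2.776, 0.99: 4.604}

n, y_bar, s = 5, 239.2, 29.3          # sample size, sample mean, sample SD
standard_error = s / sqrt(n)

def t_interval(confidence):
    """Return (lower, upper, margin) for y_bar +/- t * S/sqrt(n)."""
    margin = T_TABLE_DF4[confidence] * standard_error
    return y_bar - margin, y_bar + margin, margin

low95, high95, margin95 = t_interval(0.95)

for level in (0.90, 0.95, 0.99):
    low, high, margin = t_interval(level)
    print(f"{level:.0%} CI: {low:.1f} to {high:.1f} (margin {margin:.1f})")
```

With only 5 observations the t multipliers are much larger than the z values, so the intervals are noticeably wide.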

Confidence intervals (contd)


Continue with the same example, and now ask the following question: what
sample size would we have needed if we wished to estimate the true
mean silica concentration to within 10 ppm with 95% confidence?
We wish to know what n we would need in order to be able to
state that Prob(ȳ - 10 < μ < ȳ + 10) = 0.95.
This means that
t_(0.025, n-1) S/sqrt(n) = 10.
From the expression above, we need to solve for n.

Confidence intervals (contd)
We know that the desired n must be larger than 5 because with n = 5
we estimated μ to within 36.4 ppm with 95% confidence.
We need the following in order to come up with an answer:
1. Assume that S would not change with increased n.
2. Approximate the value of t_(0.025, n-1) to be about 2.
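
Under those two assumptions the equation can be solved for n by rearranging it to n = (t S / 10)^2. The sketch below only carries out that arithmetic; the result is an illustration that follows from the stated assumptions (S = 29.3 unchanged, t about 2), not a figure quoted from the source.

```python
from math import ceil, sqrt

s = 29.3                  # sample SD from the n = 5 sample, assumed unchanged
target_margin = 10.0      # desired half-width in ppm
t_approx = 2.0            # rough stand-in for t_(0.025, n-1)

# Solve t * S / sqrt(n) = target_margin for n, then round up because
# the sample size must be a whole number.
raw_n = (t_approx * s / target_margin) ** 2
required_n = ceil(raw_n)

# Margin actually achieved with that many observations.
achieved_margin = t_approx * s / sqrt(required_n)

print(round(raw_n, 2), required_n, round(achieved_margin, 2))
```

Because 2 is only an approximation to the t value and S may change with new data, the result is best read as a planning figure rather than an exact requirement.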
