Mean:
$$\bar{X} = \frac{X_1 + X_2 + \dots + X_n}{n} = \frac{1}{n}\sum_{i=1}^{n} X_i$$
Merits:
1. It is rigidly defined.
2. It is easy to calculate and understand.
3. It is based on all observations.
4. It is capable of further mathematical treatment.
5. It has sampling stability.
Demerits:
1. It is too sensitive to extreme observations.
2. It cannot be used with qualitative data.
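As a small illustration of the definition and of the first demerit, the sketch below computes the arithmetic mean and shows how a single extreme observation shifts it. The function name and the data values are made up for illustration only.

```python
def arithmetic_mean(values):
    """Return the sum of the observations divided by their count."""
    if not values:
        raise ValueError("mean of an empty data set is undefined")
    return sum(values) / len(values)


# Hypothetical data, chosen only for illustration.
data = [10, 12, 11, 13, 14]
data_with_outlier = data + [100]  # one extreme observation added

mean_plain = arithmetic_mean(data)
mean_outlier = arithmetic_mean(data_with_outlier)
print(mean_plain, mean_outlier)
```

In this made-up data set, one extreme value more than doubles the mean, which is the sensitivity described in the first demerit.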
Mode: The mode is defined as the most frequent observation in a given data set.
Its value can be identified by inspection: the observation that is repeated the
maximum number of times is the mode.
The following are some of the merits and demerits of the mode.
Merits:
1. Mode is easy to calculate and understand.
2. It is not affected by extreme observations.
Demerits:
1.
2.
3.
4.
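A minimal sketch of finding the mode by counting, assuming the definition above (the most frequent observation). The helper name `modes` and both data sets are hypothetical. It returns a list because several values can tie for the highest frequency.

```python
from collections import Counter


def modes(values):
    """Return every observation that occurs the maximum number of times."""
    counts = Counter(values)
    highest = max(counts.values())
    return [value for value, count in counts.items() if count == highest]


# Hypothetical data; counting works for qualitative observations too.
colours = ["red", "blue", "red", "green", "blue", "red"]
scores = [2, 3, 3, 5, 5, 7]

print(modes(colours))  # a single most frequent category
print(modes(scores))   # two values tie, so the mode is not unique
```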
Median: Median is defined as the middlemost observation in a given data set, when
observations are arranged in order.
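The median can be calculated using the following steps.
Step 1: Arrange the observations in ascending order (as order statistics).
Step 2: Count the number of observations, say n.
i) If n is odd,
$$\text{Median} = \left(\frac{n+1}{2}\right)^{\text{th}} \text{ observation}$$
ii) If n is even,
$$\text{Median} = \frac{\left(\frac{n}{2}\right)^{\text{th}} \text{ observation} + \left(\frac{n}{2}+1\right)^{\text{th}} \text{ observation}}{2}$$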
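The two steps can be sketched as follows. This assumes the usual convention for even n, namely the average of the two middle observations. The function name and the data are illustrative only.

```python
def median(values):
    """Median via the two steps: sort, then pick the middle by parity of n."""
    ordered = sorted(values)   # Step 1: arrange the observations in order
    n = len(ordered)           # Step 2: count the observations
    if n == 0:
        raise ValueError("median of an empty data set is undefined")
    if n % 2 == 1:
        # n odd: the ((n + 1) / 2)-th observation, converted to a 0-based index
        return ordered[(n + 1) // 2 - 1]
    # n even: average of the (n/2)-th and (n/2 + 1)-th observations
    return (ordered[n // 2 - 1] + ordered[n // 2]) / 2


# Hypothetical data sets with odd and even counts.
odd_data = [7, 3, 9, 1, 5]
even_data = [7, 3, 9, 1]
print(median(odd_data), median(even_data))
```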
Merits:
1. It is rigidly defined.
2. It is not affected by extreme observations.
3. It can be located graphically.
4. It can be calculated while dealing with qualitative type of data.
Demerits:
1. It is not based on all observations.
2. It is not suitable for further mathematical treatment.
3. It is affected more by fluctuations of sampling.
2.
Range = L - S,
where L = largest observation and S = smallest observation.
Range is an absolute measure of dispersion. The corresponding relative measure of
dispersion is the coefficient of range, defined as the difference between the largest
and smallest observations divided by their sum. The coefficient of range is a unitless
quantity, so it can be used to compare variation between two variables measured in
different units.
$$\text{Coefficient of Range} = \frac{L - S}{L + S}$$
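A short sketch of both measures, assuming range = L - S and coefficient of range = (L - S) / (L + S). The function name and the two data sets are hypothetical. They were chosen so that the ranges are equal while the coefficients differ.

```python
def range_and_coefficient(values):
    """Return (range, coefficient of range) for a non-empty data set."""
    largest, smallest = max(values), min(values)
    data_range = largest - smallest
    # Undefined when largest + smallest is zero; not handled in this sketch.
    coefficient = data_range / (largest + smallest)
    return data_range, coefficient


# Hypothetical variables measured in different units.
heights_cm = [150, 160, 170, 180, 190]
weights_kg = [50, 60, 70, 80, 90]

height_range, height_coef = range_and_coefficient(heights_cm)
weight_range, weight_coef = range_and_coefficient(weights_kg)
print(height_range, round(height_coef, 4))
print(weight_range, round(weight_coef, 4))
```

Both made-up variables have the same range, but the unitless coefficient shows that the weights vary more relative to their size.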
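Standard deviation: The standard deviation is defined as the positive square root of
the average of the squared deviations taken from the mean.
It is given by the formula:
$$\sigma = \sqrt{\frac{1}{n}\sum_{i=1}^{n} (X_i - \bar{X})^2}$$
The corresponding relative measure of dispersion is the coefficient of variation:
$$\text{C.V.} = \frac{\sigma}{\bar{X}} \times 100\%$$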
Merits:
1. It is rigidly defined.
2. It is based on all observations.
3. It is a reliable measure of dispersion.
4. It is less affected by the fluctuations of sampling.
5. It is capable of further statistical analysis.
Demerits:
1. It is relatively difficult to calculate and understand.
2. It is strongly affected by extreme observations.
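The sketch below computes the standard deviation as defined above, dividing by n (the "average of squares of deviations"), together with the coefficient of variation as a percentage. The function names and the data are illustrative. Python's standard `statistics.pstdev` follows the same divide-by-n definition.

```python
import math


def standard_deviation(values):
    """Positive square root of the average squared deviation from the mean."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((x - mean) ** 2 for x in values) / n)


def coefficient_of_variation(values):
    """Standard deviation relative to the mean, expressed in percent."""
    mean = sum(values) / len(values)
    return standard_deviation(values) / mean * 100


# Hypothetical data: mean 5, squared deviations sum to 32, so variance is 4.
data = [2, 4, 4, 4, 5, 5, 7, 9]
sd = standard_deviation(data)
cv = coefficient_of_variation(data)
print(sd, cv)
```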
3.
In probability theory, the central limit theorem (CLT) states that, given certain conditions,
the arithmetic mean of a sufficiently large number of iterates of independent random
variables, each with a well-defined expected value and well-defined variance, will be
approximately normally distributed. That is, suppose that a sample is obtained containing
a large number of observations, each observation being randomly generated in a way that
does not depend on the values of the other observations, and that the arithmetic average
of the observed values is computed. If this procedure is performed many times, the
central limit theorem says that the computed values of the average will be approximately
distributed according to the normal distribution (commonly known as a "bell curve"). The central
limit theorem has a number of variants. In its common form, the random variables must
be identically distributed. In variants, convergence of the mean to the normal distribution
also occurs for non-identical distributions, given that they comply with certain
conditions.
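The statement can be checked informally with a simulation. The sketch below is an illustration rather than a proof. It assumes observations drawn from a Uniform(0, 1) distribution (mean 0.5, variance 1/12), and the sample size, repetition count and seed are arbitrary choices. It compares the spread of the sample means with the theoretical value sigma / sqrt(n) and measures the share of means within one standard deviation of their centre, which is about 68% for a normal distribution.

```python
import random
import statistics


def sample_means(n, repetitions, rng):
    """Means of `repetitions` independent samples of size n from Uniform(0, 1)."""
    return [
        statistics.fmean([rng.random() for _ in range(n)])
        for _ in range(repetitions)
    ]


rng = random.Random(0)   # fixed seed so the run is repeatable
n = 50                   # observations per sample (arbitrary choice)
means = sample_means(n, 2000, rng)

centre = statistics.fmean(means)
spread = statistics.pstdev(means)
theory_spread = (1 / 12) ** 0.5 / n ** 0.5   # sigma / sqrt(n) for Uniform(0, 1)

# Share of sample means lying within one standard deviation of the centre.
within_one_sd = sum(abs(m - centre) <= spread for m in means) / len(means)
print(centre, spread, theory_spread, within_one_sd)
```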
4.
A random sample of hotel bills in Memphis has a mean of $275 and a sample standard
deviation of $65. This information is based on a sample of 45. Construct a
confidence interval based on the aforementioned information. You select the numerical
value for the level of confidence. Write your findings in words and depict your SPSS
output.
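The 95% confidence interval for hotel bills in Memphis is:
$$95\%\ \text{CI} = \bar{X} \pm 1.96 \times \frac{SD}{\sqrt{n}} = 275 \pm 1.96 \times \frac{65}{\sqrt{45}}$$
$$= 275 \pm 1.96 \times 9.6896 = 275 \pm 18.99167 = (256.0083,\ 293.9917)$$
That is, we are 95% confident that the mean hotel bill in Memphis lies between about
$256.01 and $293.99.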
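The arithmetic above can be reproduced with a few lines of Python. This is a sketch under the same assumption as the calculation, namely the normal critical value 1.96. The function name is illustrative. Because the standard deviation is estimated from the sample, a t-based interval with n - 1 = 44 degrees of freedom would be slightly wider.

```python
import math


def z_confidence_interval(mean, sd, n, z=1.96):
    """Return (lower, upper) for mean +/- z * sd / sqrt(n)."""
    standard_error = sd / math.sqrt(n)
    margin = z * standard_error
    return mean - margin, mean + margin


# Figures from the question: mean 275, standard deviation 65, n = 45.
lower, upper = z_confidence_interval(275, 65, 45)
print(round(lower, 4), round(upper, 4))
```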
5. I claim that the average dollars spent on child care is $140/week. You don't accept the
claim that I made. Thus you take a random sample of 49, with a sample mean of $133
and a sample standard deviation of $16. Alpha = .05. Show SPSS output and
explain your findings.
Let X: Average dollars spent on child care.
H0: The average dollars spent on child care is $140 per week.
H1: The average dollars spent on child care is not equal to $140 per week.
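Given:
$$\bar{X} = 133,\quad n = 49,\quad s = 16,\quad \alpha = 0.05$$
Test statistic:
$$t = \frac{\bar{X} - \mu}{s_{\bar{X}}}, \quad \text{where } s_{\bar{X}} = \frac{s}{\sqrt{n}} \text{ and } \mu = 140 \text{ is the hypothesized mean}$$
$$t = \frac{133 - 140}{16/\sqrt{49}} = \frac{-7}{2.2857} = -3.0625$$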
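A brief sketch of the same one-sample t computation, followed by the two-sided decision at alpha = 0.05. The function name is illustrative. The critical value is an assumption taken from a t table for n - 1 = 48 degrees of freedom (about 2.011), not something computed here.

```python
import math


def one_sample_t(sample_mean, hypothesised_mean, s, n):
    """t = (sample mean - hypothesised mean) / (s / sqrt(n))."""
    standard_error = s / math.sqrt(n)
    return (sample_mean - hypothesised_mean) / standard_error


# Figures from the question: mean 133, claimed mean 140, s = 16, n = 49.
t_stat = one_sample_t(133, 140, 16, 49)

# Assumption: two-sided critical value for alpha = 0.05 and 48 degrees of
# freedom, read from a t table (approximately 2.011).
T_CRITICAL = 2.011
reject_h0 = abs(t_stat) > T_CRITICAL
print(t_stat, reject_h0)
```

Under these assumptions the absolute value of t (about 3.06) exceeds the critical value, so H0 would be rejected at the 5% level: the sample suggests that average weekly spending on child care differs from $140.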
Descriptives
Output per day
            N    Mean      Std. Deviation   Std. Error   95% Confidence Interval for Mean   Minimum   Maximum
                                                         Lower Bound     Upper Bound
Employee1    5   38.2000   5.67450          2.53772      31.1542         45.2458            30.00     44.00
Employee2    5   36.6000   4.03733          1.80555      31.5870         41.6130            32.00     43.00
Employee3    5   32.0000   1.87083           .83666      29.6771         34.3229            29.00     34.00
Total       15   35.6000   4.71775          1.21812      32.9874         38.2126            29.00     44.00

ANOVA
Output per day
                 Sum of Squares   df   Mean Square   F       Sig.
Between Groups   103.600           2   51.800        2.988   .088
Within Groups    208.000          12   17.333
Total            311.600          14
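The ANOVA figures can be cross-checked by recomputing the mean squares and the F ratio from the sums of squares. In this sketch the between-groups degrees of freedom (2) are inferred from the three employee groups, and the variable names are illustrative.

```python
# Sums of squares and degrees of freedom from the ANOVA table.
ss_between, df_between = 103.600, 2   # df = groups - 1 = 3 - 1 (inferred)
ss_within, df_within = 208.000, 12    # df = N - groups = 15 - 3
ss_total = 311.600

ms_between = ss_between / df_between  # mean square between groups
ms_within = ss_within / df_within     # mean square within groups
f_ratio = ms_between / ms_within      # F statistic

print(round(ms_between, 3), round(ms_within, 3), round(f_ratio, 3))
```

The recomputed values agree with the table. If the 5% level used elsewhere in this document is assumed, the reported significance of .088 is above .05, so the difference in mean output between the three employees would not be judged significant.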
5. Interpret the following regression equation. Alpha = .05. Adjusted r-square = .67.
Average selling price of homes = $98,000 + 100*Square footage - 500*price of home
The p-values (significances) associated with the constant (y-intercept) and square footage are
.12 and .04, respectively. The p-value associated with price of home is .02.
Average selling price of homes is measured in dollars. Square footage is measured in units
of square feet (e.g. 100, 101, 103), and price of home is measured in units of 100 dollars.
P values same as level of significance
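The given regression equation is:
Average selling price of homes = $98,000 + 100*Square footage - 500*price of home
where average selling price of homes is the dependent variable and the independent variables are
square footage and price of home.
From the given information, we can conclude that the regression coefficients for square
footage (p = .04) and price of home (p = .02) are significant (p < 0.05), while the constant is
not (p = .12).
Holding the other variable fixed, each additional square foot is associated with a $100 increase
in average selling price, and each additional unit (100 dollars) of price of home is associated
with a $500 decrease.
Hence, square footage and price of home are statistically significant predictors of the average
selling price of homes.
The overall model explains around 67% of the variation in the dependent variable 'average
selling price of homes' (adjusted R-square = .67).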
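To make the coefficients concrete, the equation can be written as a small function and evaluated at a few points. The function name and the input values (2,000 square feet, 100 units of price of home) are hypothetical and serve only to show how the prediction changes when one predictor increases by one unit.

```python
def predicted_selling_price(square_footage, price_of_home_units):
    """Evaluate the regression equation from the question.

    price_of_home_units is in units of 100 dollars, as stated in the question.
    """
    return 98_000 + 100 * square_footage - 500 * price_of_home_units


# Hypothetical inputs, chosen only to illustrate the slopes.
base = predicted_selling_price(2_000, 100)
one_more_sqft = predicted_selling_price(2_001, 100)
one_more_price_unit = predicted_selling_price(2_000, 101)

print(base)
print(one_more_sqft - base)        # change per additional square foot
print(one_more_price_unit - base)  # change per additional 100-dollar unit
```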