
1. Discuss/define three measures of central tendency.


The three measures of central tendency are the mean, the mode and the median.
Mean: The mean is defined as the sum of all observations divided by the total number of
observations in a given data set. The formula for the mean is:

$$\bar{X} = \frac{X_1 + X_2 + \dots + X_n}{n} = \frac{1}{n}\sum_{i=1}^{n} X_i$$

The following are some of the merits and demerits of the mean.


Merits:
1. It is rigidly defined.
2. It is easy to calculate and understand.
3. It is based on all observations.
4. It is capable of further mathematical treatment.
5. It has sampling stability.

Demerits:
1. It is too sensitive to extreme observations.
2. It cannot be used with qualitative type of data.
Mode: The mode is defined as the most frequent observation in a given data set.
Its value can be identified by inspection: the observation that is repeated the
maximum number of times is the mode.
The following are some of the merits and demerits of the mode.
Merits:
1. Mode is easy to calculate and understand.
2. It is not affected by extreme observations.
Demerits:
1. It is sometimes ill-defined.
2. It is not based on all observations.
3. It is not capable of further mathematical treatment.
4. It is affected to a greater extent by fluctuations of sampling.

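Median: The median is defined as the middlemost observation in a given data set when the
observations are arranged in order.
The median can be calculated using the following steps.
Step 1: Arrange the observations in ascending order (as order statistics).
Step 2: Count the number of observations, say n.
i) If n is odd,

$$\text{Median} = \left(\frac{n+1}{2}\right)^{\text{th}} \text{ observation}$$

ii) If n is even,

$$\text{Median} = \frac{\left(\frac{n}{2}\right)^{\text{th}} \text{ observation} + \left(\frac{n}{2}+1\right)^{\text{th}} \text{ observation}}{2}$$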

The following are the merits and demerits of the median.

Merits:
1. It is rigidly defined.
2. It is not affected by extreme observations.
3. It can be located graphically.
4. It can be calculated even for qualitative data, provided the observations can be ranked.

Demerits:
1. It is not based on all observations.
2. It is not suitable for further mathematical treatment.
3. It is affected more by fluctuations of sampling.
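
As a minimal sketch of the three definitions above, the following Python snippet computes the mean, median and mode of a small data set. The data values and the helper names (`mean`, `median`, `mode`) are illustrative assumptions and are not part of the original assignment.

```python
from collections import Counter


def mean(obs):
    """Sum of all observations divided by the number of observations."""
    return sum(obs) / len(obs)


def median(obs):
    """Middlemost observation of the ordered data (Step 1 and Step 2)."""
    ordered = sorted(obs)          # Step 1: arrange in order
    n = len(ordered)               # Step 2: count the observations
    if n % 2 == 1:                 # n odd: the ((n + 1) / 2)-th observation
        return ordered[(n + 1) // 2 - 1]
    # n even: average of the (n / 2)-th and (n / 2 + 1)-th observations
    return (ordered[n // 2 - 1] + ordered[n // 2]) / 2


def mode(obs):
    """Most frequent observation (the first one seen if there is a tie)."""
    return Counter(obs).most_common(1)[0][0]


data = [4, 7, 7, 9, 13]            # hypothetical observations
print(mean(data), median(data), mode(data))

# One extreme value moves the mean a lot but leaves the median and mode alone.
with_outlier = [4, 7, 7, 9, 130]   # hypothetical observations
print(mean(with_outlier), median(with_outlier), mode(with_outlier))
```

The second data set illustrates the demerit listed for the mean: it is sensitive to extreme observations, while the median and mode are not.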

2. Also, discuss/define two measures of variation.


The range and the standard deviation are two measures of dispersion.
Range: The range is defined as the difference between the largest and the smallest
observation in a given data set.

$$\text{Range} = L - S$$

where L = largest observation and S = smallest observation.
The range is an absolute measure of dispersion. The corresponding relative measure of
dispersion is the coefficient of range. It is defined as the difference between the
largest and smallest observations divided by the sum of the largest and smallest
observations. The coefficient of range is a unitless quantity, so it can be used to
compare variation between two variables measured in different units.

$$\text{Coefficient of Range} = \frac{L - S}{L + S}$$


The following are the merits and demerits of the range.
Merits:
1. It is the simplest measure of dispersion.
2. It is rigidly defined.
Demerits:
1. It is not based on all observations.
2. It is very much affected by fluctuations of sampling.
3. It is an unreliable measure of dispersion.
4. It is not suitable for further mathematical treatment.

Standard deviation: The standard deviation is defined as the positive square root of the
average of the squared deviations taken from the mean.
It is given by the formula:

$$\sigma = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(X_i - \bar{X}\right)^2}$$

The standard deviation is an absolute measure of dispersion. The corresponding relative
measure of dispersion is called the coefficient of variation (C.V.) and is given by:

$$\text{C.V.} = \frac{\sigma}{\bar{X}} \times 100\%$$

It is used to check the consistency of the data under study or for comparison purposes:
the smaller the C.V., the more consistent the data; the larger the C.V., the less
consistent the data.
The following are the merits and demerits of the standard deviation:
Merits:
1. It is rigidly defined.
2. It is based on all observations.
3. It is a reliable measure of dispersion.
4. It is less affected by the fluctuations of sampling.
5. It is capable of further statistical analysis.

Demerits:
1. It is relatively difficult to calculate and understand.
2. It is more affected by extreme observations.
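
The dispersion measures defined in this answer can be sketched in a few lines of Python. The data set and function names below are hypothetical. The standard deviation divides by n, following the definition given here; statistical packages usually report the sample version, which divides by n - 1.

```python
import math


def value_range(obs):
    """Range = L - S (largest minus smallest observation)."""
    return max(obs) - min(obs)


def coefficient_of_range(obs):
    """(L - S) / (L + S), a unitless relative measure."""
    largest, smallest = max(obs), min(obs)
    return (largest - smallest) / (largest + smallest)


def standard_deviation(obs):
    """Positive square root of the average squared deviation from the mean."""
    n = len(obs)
    mean = sum(obs) / n
    return math.sqrt(sum((x - mean) ** 2 for x in obs) / n)


def coefficient_of_variation(obs):
    """Standard deviation as a percentage of the mean."""
    mean = sum(obs) / len(obs)
    return standard_deviation(obs) / mean * 100


data = [10, 12, 15, 18, 20]        # hypothetical observations
print("Range:", value_range(data))
print("Coefficient of range:", round(coefficient_of_range(data), 4))
print("Standard deviation:", round(standard_deviation(data), 4))
print("C.V. (%):", round(coefficient_of_variation(data), 2))
```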
3. In addition, define the importance of the central limit theorem.

In probability theory, the central limit theorem (CLT) states that, given certain conditions,
the arithmetic mean of a sufficiently large number of independent random variables, each
with a well-defined expected value and a well-defined variance, will be approximately
normally distributed. That is, suppose that a sample is obtained containing a large number
of observations, each observation being randomly generated in a way that does not depend
on the values of the other observations, and that the arithmetic average of the observed
values is computed. If this procedure is performed many times, the central limit theorem
says that the computed values of the average will be approximately distributed according
to the normal distribution (commonly known as a "bell curve").
The central limit theorem has a number of variants. In its common form, the random
variables must be identically distributed. In variants, convergence of the mean to the
normal distribution also occurs for non-identical distributions, given that they comply
with certain conditions.
Its importance is that it justifies using normal-based confidence intervals and hypothesis
tests for a sample mean even when the underlying population is not normally distributed.
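
The statement of the theorem can be illustrated with a small simulation. The sketch below is an assumption-laden example, not part of the original answer: it draws repeated samples from a skewed exponential distribution with mean 1 and standard deviation 1, then checks that the sample means behave as a normal distribution with mean 1 and standard error 1/sqrt(n) would predict. The sample size, number of repetitions and seed are arbitrary choices.

```python
import random
import statistics

random.seed(0)                      # fixed seed so the run is repeatable

SAMPLE_SIZE = 50                    # observations per sample (arbitrary)
REPETITIONS = 5000                  # number of samples drawn (arbitrary)


def sample_mean(n):
    """Mean of n independent draws from an exponential distribution (mean 1)."""
    return statistics.fmean(random.expovariate(1.0) for _ in range(n))


means = [sample_mean(SAMPLE_SIZE) for _ in range(REPETITIONS)]

# Under the CLT the sample means should centre on 1 with a spread of
# roughly 1 / sqrt(n), and about 95% should lie within 1.96 standard errors.
standard_error = 1 / SAMPLE_SIZE ** 0.5
share_within = sum(abs(m - 1) <= 1.96 * standard_error for m in means) / REPETITIONS

print("Mean of sample means:", round(statistics.fmean(means), 3))
print("SD of sample means:  ", round(statistics.stdev(means), 3))
print("Theoretical SE:      ", round(standard_error, 3))
print("Share within 1.96 SE:", round(share_within, 3))
```

Even though individual exponential observations are strongly skewed, the distribution of their averages is close to the bell curve described above.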

4. A random sample of hotel bills in Memphis has a mean of $275 and a sample standard
deviation of $65. This information is based on a sample of 45. Construct a
confidence interval based on the aforementioned information. You select the numerical
value for a level of confidence. WRITE IN WORDS YOUR FINDINGS. DEPICT
YOUR SPSS OUTPUT.
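The 95% confidence interval for hotel bills in Memphis is:

$$95\%\ \text{CI} = \bar{X} \pm 1.96 \times \frac{SD}{\sqrt{n}}$$

$$95\%\ \text{CI} = 275 \pm 1.96 \times \frac{65}{\sqrt{45}} = 275 \pm 1.96 \times 9.6896 = 275 \pm 18.9917 = (256.0083,\ 293.9917)$$

Hence, the required 95% confidence interval is (256.0083, 293.9917). In words: we are 95%
confident that the mean hotel bill in Memphis lies between about $256.01 and $293.99.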
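
The arithmetic can be checked with a short Python sketch. The function name `z_confidence_interval` is a hypothetical helper; it uses the same normal critical value of 1.96 as the calculation in this answer. With n = 45, a t critical value with 44 degrees of freedom would give a slightly wider interval.

```python
import math


def z_confidence_interval(sample_mean, sample_sd, n, z=1.96):
    """Return (lower, upper) for mean +/- z * sd / sqrt(n)."""
    standard_error = sample_sd / math.sqrt(n)
    margin = z * standard_error
    return sample_mean - margin, sample_mean + margin


# Values taken from the question: mean $275, SD $65, sample of 45.
lower, upper = z_confidence_interval(275, 65, 45)
print(f"95% CI: ({lower:.4f}, {upper:.4f})")
```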

5. I claim that the average dollars spent on child care is $140/week. You don't accept the
claim that I made. Thus you take a random sample of 49 with a sample mean of $133
and a sample standard deviation equal to $16. Alpha = .05. Show SPSS output and
explain your findings.
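Let X: Average dollars spent on child care.
H0: The average dollars spent on child care is $140 per week.
H1: The average dollars spent on child care is not equal to $140 per week.
Given:

$$\bar{X} = 133,\quad n = 49,\quad s = 16,\quad \alpha = 0.05$$

Test statistic:

$$t = \frac{\bar{X} - \mu}{s_{\bar{X}}}, \quad \text{where } s_{\bar{X}} = \frac{s}{\sqrt{n}} \text{ and } \mu = 140 \text{ is the hypothesized mean}$$

$$t = \frac{133 - 140}{16/\sqrt{49}} = \frac{-7}{16/7} = \frac{-7}{2.2857} = -3.0625$$

t-critical = 2.0106 (two-tailed, alpha = 0.05, df = 48)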


Since |t-cal| > t-critical, we reject the null hypothesis.
Hence we conclude that the average dollars spent on child care differs significantly from
$140 per week; the sample mean of $133 indicates that it is lower.
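
A minimal Python sketch of this one-sample t-test follows. The function name `one_sample_t` is hypothetical, and the critical value 2.0106 is copied from this answer rather than computed, since the Python standard library has no t distribution.

```python
import math


def one_sample_t(sample_mean, hypothesized_mean, sample_sd, n):
    """t = (sample mean - hypothesized mean) / (sd / sqrt(n))."""
    standard_error = sample_sd / math.sqrt(n)
    return (sample_mean - hypothesized_mean) / standard_error


# Values taken from the question.
t_statistic = one_sample_t(133, 140, 16, 49)

T_CRITICAL = 2.0106                 # two-tailed, alpha = 0.05, df = 48 (from the text)
reject_null = abs(t_statistic) > T_CRITICAL

print("t statistic:", round(t_statistic, 4))
print("Reject H0:", reject_null)
```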
6. I employ three different employees. I would like to know if the mean outputs of each
worker are equal. Thus, I collected the following information:
Employee 1. Output/day: 30, 35, 40, 42, 44.
Employee 2. Output/day: 36, 35, 43, 32, 37.
Employee 3. Output/day: 32, 33, 34, 32, 29.
Compare means and show results of your SPSS output. I want you to explain the
difference among the three groups and the difference within a group. (Copy and paste
the SPSS output.)
Solution:
The average output per day for employee 1 was 38.20 (SD = 5.674); the minimum output
was 30.00 and the maximum output was 44.00. The 95% confidence interval for the mean
output per day of employee 1 was (31.1542, 45.2458).
The average output per day for employee 2 was 36.60 (SD = 4.037), with a minimum
output of 32.00 and a maximum output of 43.00. The 95% confidence interval for the mean
output per day of employee 2 was (31.5870, 41.6130).
The average output per day for employee 3 was 32.00 (SD = 1.871), with a minimum output
of 29.00 and a maximum output of 34.00. The 95% confidence interval for the mean output
per day of employee 3 was (29.6771, 34.3229).
Across all 15 observations, the average output per day was 35.60 (SD = 4.718), with a 95%
confidence interval of (32.9874, 38.2126).
The output per day of the three employees was compared using one-way ANOVA. The variation
among the three groups (between-groups sum of squares) was 103.600 with a mean square of
51.800, and the variation within the groups (within-groups sum of squares) was 208.000 with
a mean square of 17.333. At the 5% level of significance, the result of the one-way ANOVA
indicates that there is no significant difference in the average output per day among the
three employees (F = 2.988, df = 2, 12, p = 0.088).
SPSS Output:

Descriptives
Output per day

Group      N   Mean     Std. Deviation  Std. Error  95% CI Lower  95% CI Upper  Minimum  Maximum
Employee1  5   38.2000  5.67450         2.53772     31.1542       45.2458       30.00    44.00
Employee2  5   36.6000  4.03733         1.80555     31.5870       41.6130       32.00    43.00
Employee3  5   32.0000  1.87083         .83666      29.6771       34.3229       29.00    34.00
Total      15  35.6000  4.71775         1.21812     32.9874       38.2126       29.00    44.00

ANOVA
Output per day

Source          Sum of Squares  df  Mean Square  F      Sig.
Between Groups  103.600         2   51.800       2.988  .088
Within Groups   208.000         12  17.333
Total           311.600         14
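
The sums of squares and the F statistic in the ANOVA table can be reproduced by hand. The sketch below is one way to do so in plain Python, using the output values from the question; the variable names are illustrative. It stops at the F statistic, because computing the significance value would need an F distribution, which the standard library does not provide.

```python
# Output per day for each employee, as given in the question.
groups = {
    "Employee1": [30, 35, 40, 42, 44],
    "Employee2": [36, 35, 43, 32, 37],
    "Employee3": [32, 33, 34, 32, 29],
}

all_values = [x for values in groups.values() for x in values]
grand_mean = sum(all_values) / len(all_values)
group_means = {name: sum(v) / len(v) for name, v in groups.items()}

# Between groups: how far each group mean lies from the grand mean.
ss_between = sum(
    len(values) * (group_means[name] - grand_mean) ** 2
    for name, values in groups.items()
)

# Within groups: how far each observation lies from its own group mean.
ss_within = sum(
    (x - group_means[name]) ** 2
    for name, values in groups.items()
    for x in values
)

df_between = len(groups) - 1                 # 3 groups - 1
df_within = len(all_values) - len(groups)    # 15 observations - 3 groups

ms_between = ss_between / df_between
ms_within = ss_within / df_within
f_statistic = ms_between / ms_within

print("SS between:", round(ss_between, 3), "df:", df_between)
print("SS within: ", round(ss_within, 3), "df:", df_within)
print("F:", round(f_statistic, 3))
```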

7. Interpret the following regression equation. Alpha = .05. Adjusted r-square = .67.
Average selling price of homes = $98,000 + 100*Square footage - 500*price of home
The p-values (significances) associated with the constant (y-intercept) and square footage
are .12 and .04 respectively. The p-value associated with price of home is .02.
Average selling price of homes is measured in dollars. Square footage is measured in units
of square feet (e.g. 100, 101, 103); price of home is measured in units of 100 dollars.
P values same as level of significance
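The given regression equation is:
Average selling price of homes = $98,000 + 100*Square footage - 500*price of home
where average selling price of homes is the dependent variable and the independent
variables are square footage and price of home.
The constant of $98,000 is not statistically significant (p = .12 > .05). Holding price of
home fixed, each additional square foot is associated with a $100 increase in the average
selling price; holding square footage fixed, each one-unit ($100) increase in price of home
is associated with a $500 decrease in the average selling price.
From the given information, we can conclude that the regression coefficients for square
footage and price of home are significant (p = .04 and p = .02, both < 0.05).
Hence, square footage and price of home are significant predictors of the average selling
price of homes.
The overall model explains about 67% of the variation in the dependent variable 'average
selling price of homes' (adjusted R-square = .67).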
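
To make the meaning of the coefficients concrete, the equation can be evaluated for a few inputs. In this sketch the function name `predicted_price` and the input values (2,000 square feet, 100 price units) are hypothetical and chosen only for illustration; the coefficients come from the question.

```python
def predicted_price(square_feet, price_units):
    """Evaluate the regression equation from the question.

    square_feet: square footage in square feet
    price_units: price of home in units of 100 dollars
    """
    return 98_000 + 100 * square_feet - 500 * price_units


baseline = predicted_price(2000, 100)          # hypothetical inputs
one_more_square_foot = predicted_price(2001, 100)
one_more_price_unit = predicted_price(2000, 101)

print("Baseline prediction:", baseline)
print("Effect of +1 square foot:", one_more_square_foot - baseline)
print("Effect of +1 price unit: ", one_more_price_unit - baseline)
```

The two differences are the slope coefficients themselves: each is the change in the predicted value when one predictor increases by one unit and the other is held fixed.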
