) of pass percentage is 72% in each case; we are unable to distinguish between the performance of students in the three universities.
However, on the basis of the weighted average of pass percentage, University C (72.55%) is the best, followed by University A (72.05%) and University B (70.61%).
29. From the results of two colleges A and B given below, state which of them is better and why?

Name of Examination    College A                College B
                       Appeared    Passed       Appeared
                       60          50           200
                       100         90           240
                       400         300          200
                       240         150          160
Total                  800         590          800
Taking 'higher pass percentage' as the criterion for better college, both the colleges A and B are equally good.
30. A travelling salesman made five trips in two months. The record of sales is given below:
The sales manager criticised the salesman's performance as not very good since his mean daily sales were only Rs. 54,000 (2,70,000/5). The salesman called this an unfair statement, for his daily mean sales were as high as Rs. 55,200 (13,80,000/25). What does each average mean here? Which average seems to be more appropriate in this case?
Trip     No. of days    Value of sales (in '00 Rs.)    Sales per day (in '000 Rs.)
1        5              3,000                          60
2        4              1,600                          40
3        3              1,500                          50
4        7              3,500                          50
5        6              4,200                          70
Total    25             13,800                         270
Ans. The manager obtained the simple arithmetic mean of the sales per day, while the salesman obtained the weighted arithmetic mean, the weights being the number of days in each trip. The latter (weighted average) seems to be more appropriate.
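The two averages in Problem 30 can be checked with a short sketch. The variable names are illustrative, and the per-day figures are taken as value of sales divided by number of days, expressed in rupees.

```python
# Trip records from Problem 30: days per trip and sales per day (in Rs.).
days = [5, 4, 3, 7, 6]
sales_per_day = [60_000, 40_000, 50_000, 50_000, 70_000]

# Manager's figure: simple mean of the five per-trip daily rates.
simple_mean = sum(sales_per_day) / len(sales_per_day)

# Salesman's figure: mean weighted by the number of days in each trip,
# which is the same as total sales divided by total days.
total_sales = sum(d * s for d, s in zip(days, sales_per_day))
weighted_mean = total_sales / sum(days)

print(simple_mean)    # 54000.0
print(weighted_mean)  # 55200.0
```

The simple mean treats a 3-day trip and a 7-day trip alike, which is why the two figures differ.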
In the words of L.R. Connor:
"The median is that value of the variable which divides the group into two equal parts, one part comprising all the values greater and the other, all the values less than median". Thus the median of a distribution may be defined as that value of the variable which exceeds and is exceeded by the same number of observations, i.e., it is the value such that the number of observations above it is equal to the number of observations below it. Thus, we see that as against the arithmetic mean, which is based on all the items of the distribution, the median is only a positional average, i.e., its value depends on the position occupied by a value in the frequency distribution.
Case (II) : Frequency Distribution. In case of a frequency distribution where the variable takes the values X1, X2, ..., Xn with respective frequencies f1, f2, ..., fn with Σf = N, the total frequency, median is the size of the (N + 1)/2th item or observation. In this case the use of the cumulative frequency (c.f.) distribution facilitates the calculations. The steps involved are:
(i) Prepare the 'less than' cumulative frequency (c.f.) distribution.
(ii) Find N/2.
(iii) See the c.f. just greater than N/2.
(iv) The corresponding value of the variable gives the median.
The example given below illustrates the method.
Example 5.18. Eight coins were tossed together and the number of heads (X) resulting was noted. The operation was repeated 256 times and the frequency distribution of the number of heads is given below:

No. of heads (X)    Frequency (f)    'Less than' c.f.
0                   1                1
1                   9                1 + 9 = 10
2                   26               10 + 26 = 36
3                   59               36 + 59 = 95
4                   72               95 + 72 = 167
5                   52               167 + 52 = 219
6                   29               219 + 29 = 248
7                   7                248 + 7 = 255
8                   1                255 + 1 = 256
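Steps (i) to (iv) can be sketched for the coin-tossing data as follows. The function name `discrete_median` is hypothetical; it simply returns the first value whose 'less than' c.f. exceeds N/2.

```python
from itertools import accumulate

def discrete_median(values, freqs):
    """Return the value whose 'less than' c.f. is the first to exceed N/2."""
    n = sum(freqs)
    for x, cf in zip(values, accumulate(freqs)):
        if cf > n / 2:
            return x
    raise ValueError("at least one frequency must be positive")

heads = list(range(9))                      # X = 0, 1, ..., 8
freq = [1, 9, 26, 59, 72, 52, 29, 7, 1]     # N = 256

print(discrete_median(heads, freq))  # 4
```

Here N/2 = 128 and the c.f. just greater than 128 is 167, which corresponds to X = 4.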
Median = l + (h/f) (N/2 - C)    ... (5.13)
where
l is the lower limit of the median class, f is the frequency of the median class,
h is the magnitude or width of the median class, N = Σf is the total frequency,
and C is the cumulative frequency of the class preceding the median class.
Remarks 1. The interpolation formula (5.13) is based on the following assumptions:
(i) The distribution of the variable under consideration is continuous with exclusive type classes without any gaps.
(ii) There is an orderly and even distribution of observations within each class.
However, if the data are given as a grouped frequency distribution where the classes are not continuous, then it must be converted into a continuous frequency distribution before applying the formula. This adjustment will affect only the value of l in (5.13).
i.e., the sum of the absolute deviations about any arbitrary point A is always greater than the sum of the absolute deviations about the median. For further discussion, see Mean Deviation in Chapter 6 on Dispersion.
observations, median is a better average to use than the arithmetic mean since the latter gives a distorted picture of the distribution.
(iv) Median can be computed while dealing with a distribution with open end classes.
(v) Median can sometimes be located by simple inspection and can also be computed graphically. (See
Example 5.20.
Age (years) below               :  10    20    30    40    50    60    70      70 and over
No. of persons (in thousands)   :  2     5     9     12    14    15    15.5    15.6
(i) Find the median age.
(ii) Why is the median a more suitable measure of central tendency than the mean in this case?
Solution.
(i) First of all we shall convert the given distribution into the continuous frequency distribution as given in the table below and then compute the median.

COMPUTATION OF MEDIAN
Age (in years)    Number of persons in '000 (f)    c.f. (less than)
0-10              2                                2
10-20             5 - 2 = 3                        5
20-30             9 - 5 = 4                        9
30-40             12 - 9 = 3                       12
40-50             14 - 12 = 2                      14
50-60             15 - 14 = 1                      15
60-70             15.5 - 15 = 0.5                  15.5
70 and over       15.6 - 15.5 = 0.1                15.6
                  N = Σf = 15.6

Here N/2 = 15.6/2 = 7.8. The cumulative frequency (c.f.) just greater than 7.8 is 9. Thus the corresponding class 20-30 is the median class. Hence, using the median formula (5.13), we get
Median = 20 + (10/4) (7.8 - 5) = 20 + (10/4) × 2.8
       = 20 + 5 × 1.4 = 27
(ii) In this case median is a more suitable measure of central tendency than mean because the last class, viz., 70 and over, is an open end class; as such we cannot obtain the class mark for this class and hence the arithmetic mean cannot be computed.
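The interpolation used in part (i) can be sketched as below. The function `grouped_median` and the `(lower, upper, f)` tuple layout are illustrative choices, not the book's notation; the open end class is represented with `None` as its upper limit, an assumption made here to show that it does not get in the way as long as the median falls elsewhere.

```python
def grouped_median(classes):
    """Median of a continuous frequency distribution by interpolation.

    `classes` is a list of (lower, upper, f) tuples in increasing order;
    `upper` may be None for an open end class.
    """
    n = sum(f for _, _, f in classes)
    c = 0.0  # cumulative frequency of the classes preceding the current one
    for lower, upper, f in classes:
        if c + f > n / 2:  # first class whose c.f. exceeds N/2: the median class
            if upper is None:
                raise ValueError("median falls in an open end class")
            return lower + (upper - lower) / f * (n / 2 - c)
        c += f
    raise ValueError("no observations")

# Age distribution of Example 5.20, frequencies in thousands.
age_classes = [(0, 10, 2), (10, 20, 3), (20, 30, 4), (30, 40, 3),
               (40, 50, 2), (50, 60, 1), (60, 70, 0.5), (70, None, 0.1)]

print(round(grouped_median(age_classes), 2))  # 27.0
```

The mean has no such escape: it needs a class mark for every class, including "70 and over".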
Example 5.21. The frequency distribution of weight in grams of mangoes of a given variety is given below. Calculate the arithmetic mean and the median.

Weight in grams    :  410-419   420-429   430-439   440-449   450-459   460-469   470-479
Number of mangoes  :  14        20        42        54        45        18        7

Solution. Since the interpolation formula for median is based on a continuous frequency distribution, we shall first convert the given inclusive class interval series into an exclusive class interval series.
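CALCULATIONS FOR MEAN AND MEDIAN
Class boundaries    Mid-value (X)    No. of mangoes (f)    d = (X - 444.5)/10    fd     'Less than' c.f.
409.5-419.5         414.5            14                    -3                    -42    14
419.5-429.5         424.5            20                    -2                    -40    34
429.5-439.5         434.5            42                    -1                    -42    76
439.5-449.5         444.5            54                    0                     0      130
449.5-459.5         454.5            45                    1                     45     175
459.5-469.5         464.5            18                    2                     36     193
469.5-479.5         474.5            7                     3                     21     200
                                     Σf = 200 = N                                Σfd = -22

Mean = A + h Σfd / N = 444.5 + (10 × (-22))/200 = 444.5 - 1.1 = 443.4 grams.
Here N/2 = 100. The c.f. just greater than 100 is 130. Hence, the corresponding class 439.5-449.5 is the median class.
Median = 439.5 + (10/54) (100 - 76) = 439.5 + (10 × 24)/54 = 439.5 + 4.44 = 443.94 grams.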
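As a rough check of Example 5.21, the sketch below recomputes the mean by the step-deviation method and the median by interpolation. The variable names are illustrative; A = 444.5 is the assumed mean and h = 10 the common class width, and the class boundaries are taken as mid-value minus h/2.

```python
mid_values = [414.5, 424.5, 434.5, 444.5, 454.5, 464.5, 474.5]
freqs = [14, 20, 42, 54, 45, 18, 7]
A, h = 444.5, 10          # assumed mean and class width
n = sum(freqs)

# Step deviations d = (X - A)/h and the shortcut mean A + h * sum(fd) / N.
d = [round((x - A) / h) for x in mid_values]
sum_fd = sum(f * di for f, di in zip(freqs, d))
mean = A + h * sum_fd / n

# Median: locate the first class whose c.f. exceeds N/2, then interpolate
# from its lower boundary (mid-value minus half the width).
c = 0
for x, f in zip(mid_values, freqs):
    if c + f > n / 2:
        median = (x - h / 2) + (h / f) * (n / 2 - c)
        break
    c += f

print(sum_fd)            # -22
print(round(mean, 2))    # 443.4
print(round(median, 2))  # 443.94
```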
Example 5.22. Find the missing frequency from the following distribution of daily sales of shops, given that the median sale of shops is Rs. 2,400.

Sale in hundred Rs.  :  0-10   10-20   20-30   30-40   40-50
No. of shops         :  5      25      ?       18      7

Solution. Let the missing frequency be 'a'.

Sales in hundred Rs.    No. of shops (f)    Cumulative frequency (c.f.)
0-10                    5                   5
10-20                   25                  30
20-30                   a                   30 + a
30-40                   18                  48 + a
40-50                   7                   N = 55 + a

Since median sales is Rs. 2,400 (24 hundred), 20-30 is the median class. Using the median formula, we get
24 = 20 + (10/a) ((55 + a)/2 - 30)
=> 4 = (10/a) × (a - 5)/2 = 5(a - 5)/a
=> 4a = 5a - 25
=> a = 25.
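The algebra of Example 5.22 can be sketched in code. Rearranging median = l + (h/a)((T + a)/2 - C), where T is the total of the known frequencies, gives a = h(T/2 - C) / ((median - l) - h/2). The function name and parameter names below are hypothetical, and the rearrangement assumes the missing frequency belongs to the median class itself.

```python
from fractions import Fraction

def missing_median_class_frequency(median, lower, width, cf_before, other_total):
    """Solve median = lower + (width/a) * ((other_total + a)/2 - cf_before) for a."""
    m = Fraction(median) - lower
    denominator = m - Fraction(width, 2)
    if denominator == 0:
        raise ValueError("median at the class mid-point does not determine a")
    return Fraction(width) * (Fraction(other_total, 2) - cf_before) / denominator

# Example 5.22: median 24, median class 20-30, c.f. before it 30,
# known frequencies 5 + 25 + 18 + 7 = 55.
a = missing_median_class_frequency(median=24, lower=20, width=10,
                                   cf_before=30, other_total=55)
print(a)  # 25
```

Exact fractions are used so that the result can be compared with the hand calculation without rounding.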
Example 5.23. In the frequency distribution of 100 families given below, the number of families corresponding to expenditure groups 20-40 and 60-80 are missing from the table. However, the median is known to be 50. Find the missing frequencies.

Expenditure      :  0-20   20-40   40-60   60-80   80-100
No. of families  :  14     ?       27      ?       15

Solution. Let the missing frequencies for the classes 20-40 and 60-80 be f1 and f2 respectively.

COMPUTATION OF MEDIAN
Expenditure (in Rupees)    No. of families (f)    Cumulative frequency (c.f.)
0-20                       14                     14
20-40                      f1                     14 + f1
40-60                      27                     41 + f1
60-80                      f2                     41 + f1 + f2
80-100                     15                     56 + f1 + f2
                           N = 100 = 56 + f1 + f2

Since the median is 50, the median class is 40-60. Using the median formula, we get
50 = 40 + (20/27) (50 - (14 + f1))
=> 27 = 2(36 - f1) = 72 - 2f1
=> 2f1 = 72 - 27 = 45
=> f1 = 22.5
5.6.3. Partition Values. The values which divide the series into a number of equal parts are called the partition values. Thus median may be regarded as a particular partition value which divides the given data into two equal parts.
Quartiles. The values which divide the given data into four equal parts are known as quartiles. Obviously there will be three such points Q1, Q2 and Q3 such that Q1 ≤ Q2 ≤ Q3, termed as the three quartiles. Q1, known as the lower or first quartile, is the value which has 25% of the items of the distribution below it and consequently 75% of the items are greater than it. Incidentally Q2, the second quartile, coincides with the median and has an equal number of observations above it and below it. Q3, known as the upper or third quartile, has 75% of the observations below it and consequently 25% of the observations above it.
The working principle for computing the quartiles is basically the same as that of computing the median.
To compute Q1, the following steps are required:
(i) Find N/4, where N = Σf is the total frequency.
(ii) See the (less than) cumulative frequency (c.f.) just greater than N/4.
(iii) The corresponding value of X gives the value of Q1. In case of continuous frequency distribution, the corresponding class contains Q1 and the value of Q1 is obtained by the interpolation formula:
Q1 = l + (h/f) (N/4 - C)
where
l is the lower limit, f is the frequency, and h is the magnitude of the class containing Q1,
and
C is the cumulative frequency (c.f.) of the class preceding the class containing Q1.
Similarly to compute Q3, see the (less than) c.f. just greater than 3N/4. The corresponding value of X gives Q3. In case of continuous frequency distribution, the corresponding class contains Q3 and the value of Q3 is given by the formula:
Q3 = l + (h/f) (3N/4 - C)    ... (5.15)
where
l is the lower limit, h is the magnitude, and f is the frequency of the class containing Q3,
and
C is the c.f. of the class preceding the class containing Q3.
Deciles. Deciles are the values which divide the series into ten equal parts. Obviously there are nine deciles D1, D2, D3, ..., D9 (say), such that D1 ≤ D2 ≤ ... ≤ D9. Incidentally D5 coincides with the median.
The method of computing the deciles Di, (i = 1, 2, ..., 9) is the same as discussed for Q1 and Q3. To compute the ith decile Di, (i = 1, 2, ..., 9) see the c.f. just greater than iN/10. The corresponding value of X gives Di. In case of continuous frequency distribution, the corresponding class contains Di and its value is obtained by the interpolation formula:
Di = l + (h/f) (iN/10 - C), (i = 1, 2, ..., 9)
where
l is the lower limit, f is the frequency and h is the magnitude of the class containing Di,
and
C is the c.f. of the class preceding the class containing Di.
Percentiles. Percentiles are the values which divide the series into 100 equal parts. Obviously, there are 99 percentiles P1, P2, ..., P99 such that P1 ≤ P2 ≤ ... ≤ P99. The ith percentile Pi, (i = 1, 2, ..., 99) is the value of X corresponding to the c.f. just greater than iN/100. In case of continuous frequency distribution, the corresponding class contains Pi and its value is obtained by the interpolation formula:
Pi = l + (h/f) (iN/100 - C), (i = 1, 2, ..., 99)
where
l is the lower limit, f is the frequency and h is the magnitude of the class containing Pi,
and
C is the c.f. of the class preceding the class containing Pi.
In particular, we shall have:
P25 = Q1, P50 = D5 = Q2, P75 = Q3,
D1 = P10, D2 = P20, D3 = P30, ...
Remark. Importance of partition values. Partition values, particularly the percentiles are specially useful in the scaling and ranking of test scores in
psychological and educational statistics. In the data relating to business and economic statistics, these partition values, specially quartiles, are useful in
personnel work and productivity ratings.
5.6.4. Graphic Method of Locating Partition Values. The various partition values, viz., quartiles, deciles and percentiles, can be easily located graphically with the help of a curve called the cumulative frequency curve or ogive. The procedure involves the following steps:
Remark. If we draw a perpendicular from the point of intersection of the two ogives on the x-axis, the foot of the perpendicular gives the value of median.
Example 5.24. The following data gives the distribution of marks of 100 students. Calculate the most suitable average, giving the reason for your choice. Also obtain the values of quartiles, 6th decile and 70th percentile from the following data.

Marks less than  :  10   20   30   40   50   60   70   80
No. of students  :  5    13   20   32   60   80   90   100
Solution. We are given the 'less than' cumulative frequency distribution. We shall first convert it into a grouped frequency distribution. Since 'marks' is a discrete random variable taking only integral values, the classes are: Less than 10, 10-19, ..., 70-79. Further, since the formulae for median, quartiles and percentiles are based on a continuous frequency distribution, we convert the distribution into exclusive type classes with class boundaries below 9.5, 9.5-19.5, ..., 69.5-79.5 as given in the following table.
Since the first class 'less than 10' is an open end class, we cannot compute any of the mathematical averages like mean, geometric mean or harmonic mean. The only averages we can compute in this case are median and mode. We compute below the median of the above distribution.
Class           Class boundaries    Frequency (f)        'Less than' c.f.
Less than 10    Below 9.5           5                    5
10-19           9.5-19.5            13 - 5 = 8           13
20-29           19.5-29.5           20 - 13 = 7          20
30-39           29.5-39.5           32 - 20 = 12         32
40-49           39.5-49.5           60 - 32 = 28         60
50-59           49.5-59.5           80 - 60 = 20         80
60-69           59.5-69.5           90 - 80 = 10         90
70-79           69.5-79.5           100 - 90 = 10        100
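Median. N/2 = 100/2 = 50. The c.f. just greater than 50 is 60. Hence, the corresponding class 39.5-49.5 is the median class.
Median = 39.5 + (10/28) (50 - 32) = 39.5 + (10 × 18)/28 = 39.50 + 6.43 = 45.93
Hence, median marks are 45.93.
Quartiles. N/4 = 25 and 3N/4 = 75. The c.f. just greater than 25 is 32 and the c.f. just greater than 75 is 80, so Q1 lies in the class 29.5-39.5 and Q3 in the class 49.5-59.5.
Q1 = 29.5 + (10/12) (25 - 20) = 29.5 + (10 × 5)/12 = 29.5 + 4.17 = 33.67
Q3 = 49.5 + (10/20) (75 - 60) = 49.5 + (10 × 15)/20 = 49.5 + 7.5 = 57
70th percentile. 70N/100 = (70 × 100)/100 = 70. The c.f. just greater than 70 is 80. Hence, the corresponding class 49.5-59.5 contains P70 which is given by:
P70 = 49.5 + (10/20) (70 - 60) = 49.5 + (10 × 10)/20 = 49.5 + 5 = 54.5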
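Since quartiles, deciles and percentiles all use the same interpolation with N/4, iN/10 or iN/100 in place of N/2, one function can cover them. The sketch below is an illustration under that reading; `partition_value` and its `(lower, upper, f)` layout are hypothetical names, and the open first class of Example 5.24 is marked with `None` as its lower boundary.

```python
def partition_value(classes, k, parts):
    """k-th of the values dividing the distribution into `parts` equal parts.

    `classes` is a list of (lower, upper, f) class boundaries and frequencies;
    a boundary may be None for an open end class.
    """
    n = sum(f for _, _, f in classes)
    target = k * n / parts          # N/4, 3N/4, iN/10, iN/100, ...
    c = 0                           # c.f. of the classes before the current one
    for lower, upper, f in classes:
        if c + f > target:          # first class with c.f. just greater than target
            if lower is None or upper is None:
                raise ValueError("value falls in an open end class")
            return lower + (upper - lower) / f * (target - c)
        c += f
    raise ValueError("k must be less than parts")

# Marks distribution of Example 5.24 (class boundaries, frequency).
marks = [(None, 9.5, 5), (9.5, 19.5, 8), (19.5, 29.5, 7), (29.5, 39.5, 12),
         (39.5, 49.5, 28), (49.5, 59.5, 20), (59.5, 69.5, 10), (69.5, 79.5, 10)]

q1 = partition_value(marks, 1, 4)
median = partition_value(marks, 2, 4)
q3 = partition_value(marks, 3, 4)
p70 = partition_value(marks, 70, 100)

print(round(q1, 2), round(median, 2), round(q3, 2), round(p70, 2))
# 33.67 45.93 57.0 54.5
```

Calling `partition_value(marks, 25, 100)` returns the same number as `q1`, which mirrors the identity P25 = Q1.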
Marks               :  10 marks or less   20   30   40 marks or less   50   60
Number of students  :  4                  10   30   40                 47   50
Draw a 'less than' ogive curve on the graph paper and show therein: (i) The range of marks obtained by middle 80% of the students.
(ii) The median.
Also verify your results by direct formula calculations.
Solution. The above data can be arranged in the form of a continuous frequency distribution as given in the adjoining table.
Less Than Ogive. Plot the 'less than' c.f. against the corresponding value of the variable in the original table (or against the upper limit of the corresponding class in the adjoining table) and join these points by a smooth free hand curve to obtain the ogive. [See Fig. 5.1]
(i) At the frequency N/2 = 25 (along the y-axis), draw a line parallel to the x-axis meeting the ogive at point P. Draw PM perpendicular to the x-axis. Then OM = 27.5 is the median marks.
Marks     Frequency (f)       'Less than' c.f.
0-10      4                   4
10-20     10 - 4 = 6          10
20-30     30 - 10 = 20        30
30-40     40 - 30 = 10        40
40-50     47 - 40 = 7         47
50-60     50 - 47 = 3         50
          N = Σf = 50
[Fig. 5.1. 'Less than' ogive: marks along the x-axis, cumulative frequency along the y-axis.]
P90 = 47 (app.)
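10N/100 = 5. The c.f. just greater than 5 is 10. Hence, P10 lies in the corresponding class 10-20.
P10 = 10 + (10/6) (5 - 4) = 10 + 10/6 = 11.67
90N/100 = 45. The c.f. just greater than 45 is 47. Hence, P90 lies in the corresponding class 40-50.
P90 = 40 + (10/7) (45 - 40) = 40 + (10 × 5)/7 = 47.14
Hence, the range of the marks obtained by the middle 80% of the students is P90 - P10 = 47.14 - 11.67 = 35.47.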
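Reading a value off the ogive can be imitated numerically. The sketch below joins the plotted points by straight lines rather than a free hand curve, which is an assumption; under it, reading the ogive reproduces the interpolation formula exactly. The function name `read_ogive` is hypothetical, and the starting point (0, 0) assumes the first class is 0-10.

```python
def read_ogive(points, target_cf):
    """x at which a piecewise-linear 'less than' ogive reaches target_cf."""
    for (x0, c0), (x1, c1) in zip(points, points[1:]):
        if c0 <= target_cf <= c1 and c1 > c0:
            return x0 + (x1 - x0) * (target_cf - c0) / (c1 - c0)
    raise ValueError("target_cf lies outside the ogive")

# (upper limit of class, 'less than' c.f.) for the marks of 50 students.
points = [(0, 0), (10, 4), (20, 10), (30, 30), (40, 40), (50, 47), (60, 50)]
n = 50

median = read_ogive(points, n / 2)       # c.f. = 25
p10 = read_ogive(points, n * 10 / 100)   # c.f. = 5
p90 = read_ogive(points, n * 90 / 100)   # c.f. = 45

print(median)                            # 27.5
print(round(p10, 2), round(p90, 2))      # 11.67 47.14
```

The middle 80% of the students therefore lie between P10 and P90, matching the direct calculation above up to rounding.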
Example 5.27. For a group of 5000 workers, the hourly wages vary from Rs. 20 to Rs. 80. The wages of 4 per cent of the workers are under Rs. 25 and those of 10 per cent are under Rs. 30; 15 per cent of the workers earn Rs. 60 and over, and 5 per cent of them get Rs. 70 and over. The quartile wages are Rs. 40 and Rs. 54, and the sixth decile is Rs. 50. Put this information in the form of a frequency table.
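Solution. We are given: N = 5000.
(a) Q1 = Rs. 40  =>  25% of the workers, i.e., (25/100) × 5000 = 1250 workers, earn less than Rs. 40.
(b) D6 = Rs. 50  =>  60% of the workers, i.e., (60/100) × 5000 = 3000 workers, earn less than Rs. 50.
(c) Q3 = Rs. 54  =>  75% of the workers, i.e., (75/100) × 5000 = 3750 workers, earn less than Rs. 54.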
Using the above information, we can compute the frequencies for the following class intervals:
Wages in Rs. : Under 25, 25-30, 30-40, 40-50, 50-54, 54-60, 60-70, 70 and over, as given in the following table:
Hourly wages (in Rs.)    No. of workers (f)
20-30
30-40
40-50
50-60
60-70
70-80
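The frequencies for the class intervals of Example 5.27 follow from differences of cumulative percentages. A minimal sketch, assuming the wages start at Rs. 20 and end at Rs. 80 as stated, and converting "15 per cent earn Rs. 60 and over" and "5 per cent get Rs. 70 and over" into 85% and 95% below those limits; the dictionary name `below_pct` is illustrative.

```python
n = 5000

# Percentage of workers earning less than each wage limit (in Rs.).
below_pct = {20: 0, 25: 4, 30: 10, 40: 25, 50: 60,
             54: 75, 60: 85, 70: 95, 80: 100}

limits = sorted(below_pct)
table = []
for lo, hi in zip(limits, limits[1:]):
    # Class frequency = difference of the two cumulative percentages of N.
    workers = (below_pct[hi] - below_pct[lo]) * n // 100
    table.append((lo, hi, workers))

for lo, hi, workers in table:
    print(f"{lo}-{hi}: {workers}")
# 20-25: 200
# 25-30: 300
# 30-40: 750
# 40-50: 1750
# 50-54: 750
# 54-60: 500
# 60-70: 500
# 70-80: 250
```

Adjacent rows can be merged to regroup the data into other class intervals, since class frequencies simply add.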
The mean is the most common measure of central tendency of the data. It satisfies almost all the requirements of a good average. The median is also an
average, but it does not satisfy all the requirements of a good average. However, it carries certain merits and hence is useful in particular fields. Critically examine both
the averages.
3. What do you understand by central tendency? Under what conditions is median more suitable than other measures of central tendency?
4. In each of the following cases, explain whether the description applies to mean, median or both:
(i) Can be calculated from a frequency distribution with open end classes.
(ii) The values of all items are taken into consideration in the calculation.
(iii) The values of extreme items do not influence the average.
(iv) In a distribution with a single peak and moderate skewness to the right, it is closer to the concentration of the distribution.
Ans. (i) median, (ii) mean, (iii) median, (iv) median.
5. (a) Find the medians of the following two series:
(i)
(ii)
38
30
34
31
39
36
35
33
32
29
31
28
37
35
30
36
41
(Y).
Roll No. X
y
Md (X) = 465, Md. (Y) = 55. Level of knowledge of students is higher in Accountancy.