
Measures of Central Tendency for Grouped Data

Data that are arranged in a frequency distribution are called grouped data.


Observations belonging to each class interval are represented by the class mark
(midpoint) of the interval.

How do we obtain the mean, median and mode from a frequency table?

Mean from Grouped Data

Steps in computing the mean:

1. Calculate the midpoint (class mark) of each class interval.
2. Multiply each class mark by its corresponding frequency.
3. Add the products from step 2.
4. Divide the sum by the total number of cases (n) to obtain the mean.
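
The four steps above can be sketched in Python. This is a minimal illustration, not a definitive implementation: the function name `grouped_mean` and the small table are hypothetical and do not come from the handout.

```python
def grouped_mean(class_limits, frequencies):
    """Estimate the mean of grouped data from class limits and frequencies."""
    # Step 1: class mark (midpoint) of each class interval
    class_marks = [(lower + upper) / 2 for lower, upper in class_limits]
    # Step 2: multiply each class mark by its frequency
    products = [x * f for x, f in zip(class_marks, frequencies)]
    # Step 3: add the products
    total = sum(products)
    # Step 4: divide by the total number of cases n
    n = sum(frequencies)
    return total / n


# Hypothetical table, for illustration only
limits = [(11, 15), (16, 20), (21, 25)]
freqs = [2, 5, 3]
print(grouped_mean(limits, freqs))
```

The class marks here are 13, 18 and 23, so the sum of products is 26 + 90 + 69 = 185 and the estimate is 185 / 10.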

Median from Grouped Data

Steps in computing the median from grouped data:

1. Determine the median class:
   a. Divide n by 2 (n/2).
   b. Construct the less-than cumulative frequency column in the table.
2. Locate n/2 in the cumulative frequency column. The first class whose
cumulative frequency reaches n/2 is the median class.
3. Get the lower boundary of the median class.
4. From the computed n/2, subtract <F, the less-than cumulative frequency of
the class just before the median class.
5. Divide the difference by the frequency of the median class, then multiply
the quotient by the class size (i).
6. Add the value obtained in step 5 to the lower boundary of the median
class.

$$Md = 25.5 + \frac{(40/2) - 9}{14} \times 5 = 29.43$$
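
The worked value above can be checked with a short Python sketch. The function and parameter names are hypothetical; the numbers (lower boundary 25.5, n = 40, <F = 9, median class frequency 14, class size 5) are the ones used in the example.

```python
def grouped_median(lower_boundary, n, cf_before, freq, class_size):
    """Median of grouped data: L + ((n/2 - <F) / f) * i."""
    return lower_boundary + ((n / 2 - cf_before) / freq) * class_size


md = grouped_median(lower_boundary=25.5, n=40, cf_before=9, freq=14, class_size=5)
print(round(md, 2))  # 29.43
```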

Mode for Grouped Data

In Increasing Order

Steps in computing the mode from grouped data (increasing order)

1. Determine the modal class. The modal class is the class with the highest
frequency in the distribution. In the example, this is 26 - 30, which has a
frequency of 14.
2. Get the lower boundary of the modal class, then determine delta 1 and delta 2.
Delta 1 = the difference between the highest frequency and the frequency just
above it in the table. In the example, 14 - 7 = 7.
Delta 2 = the difference between the highest frequency and the frequency just
below it in the table. In the example, 14 - 8 = 6.
3. Divide delta 1 by the sum of delta 1 and delta 2.
4. Multiply the result of step 3 by the class size (i). In the example, i = 5.
5. Add the result of step 4 to the lower boundary of the modal class.
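
A minimal Python sketch of this computation, assuming the usual grouped-mode formula: lower boundary + delta 1 / (delta 1 + delta 2) × class size. The function name is hypothetical, and the lower boundary 25.5 for the class 26 - 30 is an assumption (half a unit below the lower limit).

```python
def grouped_mode(lower_boundary, delta1, delta2, class_size):
    """Mode of grouped data: L + delta1 / (delta1 + delta2) * i."""
    return lower_boundary + delta1 / (delta1 + delta2) * class_size


# delta 1 = 14 - 7 and delta 2 = 14 - 8, as in the example
mode = grouped_mode(lower_boundary=25.5, delta1=14 - 7, delta2=14 - 8, class_size=5)
print(round(mode, 2))
```

With these inputs the estimate lands inside the modal class, a little above its midpoint, because delta 1 is slightly larger than delta 2.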

(In Decreasing Order)

Steps in computing the mode from grouped data (decreasing order)

1. Determine the modal class. The modal class is the class with the highest
frequency in the distribution. In the example, this is 26 - 30, which has a
frequency of 14.
2. Get the lower boundary of the modal class, then determine delta 1 and delta 2.
Delta 1 = the difference between the highest frequency and the frequency just
below it in the table. In the example, 14 - 7 = 7.
Delta 2 = the difference between the highest frequency and the frequency just
above it in the table. In the example, 14 - 8 = 6.
3. Divide delta 1 by the sum of delta 1 and delta 2.
4. Multiply the result of step 3 by the class size (i). In the example, i = 5.
5. Add the result of step 4 to the lower boundary of the modal class.

Mean, Median and Mode from Grouped Frequencies

Explained with Three Examples

The Race and the Naughty Puppy


This starts with some raw data (not a grouped frequency yet) ...

Alex timed 21 people in the sprint race, to the nearest second:

59, 65, 61, 62, 53, 55, 60, 70, 64, 56, 58, 58, 62, 62, 68, 65, 56, 59, 68, 61,
67

To find the Mean Alex adds up all the numbers, then divides by how many
numbers:

$$\text{Mean} = \frac{59 + 65 + 61 + 62 + 53 + 55 + 60 + 70 + 64 + 56 + 58 + 58 + 62 + 62 + 68 + 65 + 56 + 59 + 68 + 61 + 67}{21} = 61.38095...$$

 To find the Median Alex places the numbers in value order and finds the middle
number.

In this case the median is the 11th number:

53, 55, 56, 56, 58, 58, 59, 59, 60, 61, 61, 62, 62, 62, 64, 65, 65, 67, 68, 68,
70

Median = 61 

To find the Mode, or modal value, Alex places the numbers in value order then
counts how many of each number. The Mode is the number which appears most
often (there can be more than one mode):

53, 55, 56, 56, 58, 58, 59, 59, 60, 61, 61, 62, 62, 62, 64, 65, 65, 67, 68, 68,
70

62 appears three times, more often than the other values, so Mode = 62
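
Alex's three calculations on the raw data can be reproduced with Python's standard `statistics` module. This is only a sketch; the variable names are ours.

```python
import statistics

# The 21 sprint times recorded by Alex, to the nearest second
times = [59, 65, 61, 62, 53, 55, 60, 70, 64, 56, 58,
         58, 62, 62, 68, 65, 56, 59, 68, 61, 67]

mean = statistics.mean(times)        # sum of the values divided by the count
median = statistics.median(times)    # middle value of the sorted data
modes = statistics.multimode(times)  # every value tied for most frequent

print(mean, median, modes)
```

`multimode` returns a list because, as noted above, there can be more than one mode.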

Grouped Frequency Table


Alex then makes a Grouped Frequency Table:

Seconds Frequency
51 - 55 2
56 - 60 7
61 - 65 8
66 - 70 4

So 2 runners took between 51 and 55 seconds, 7 took between 56 and 60
seconds, etc.
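
As a sketch of how such a table can be built from the raw times, each value is counted into the class interval that contains it. The variable names are hypothetical.

```python
times = [59, 65, 61, 62, 53, 55, 60, 70, 64, 56, 58,
         58, 62, 62, 68, 65, 56, 59, 68, 61, 67]

# Class intervals of the grouped frequency table (both limits inclusive)
class_limits = [(51, 55), (56, 60), (61, 65), (66, 70)]

# Count how many times fall inside each interval
table = {
    (lower, upper): sum(1 for t in times if lower <= t <= upper)
    for lower, upper in class_limits
}

for (lower, upper), freq in table.items():
    print(f"{lower} - {upper}  {freq}")
```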

Oh No!

Suddenly all the original data gets lost (naughty pup!)

Only the Grouped Frequency Table survived ...

... can we help Alex calculate the Mean, Median and Mode from just that table?

The answer is ... no we can't. Not accurately anyway. But we can make estimates.

Estimating the Mean from Grouped Data

So all we have left is:

Seconds Frequency
51 - 55 2
56 - 60 7
61 - 65 8
66 - 70 4

The groups (51-55, 56-60, etc), also called class intervals, are of width 5

The midpoints are in the middle of each class: 53, 58, 63 and 68

We can estimate the Mean by using the midpoints.

So, how does this work?


Think about the 7 runners in the group 56 - 60: all we know is that they ran
somewhere between 56 and 60 seconds:

 Maybe all seven of them did 56 seconds,


 Maybe all seven of them did 60 seconds,
 But it is more likely that there is a spread of numbers: some at 56,
some at 57, etc

So we take an average and assume that all seven of them took 58 seconds.

Let's now make the table using midpoints:

Midpoint Frequency
53 2
58 7
63 8
68 4

Our thinking is: "2 people took 53 sec, 7 people took 58 sec, 8 people took 63
sec and 4 took 68 sec". In other words we imagine the data looks like this:

53, 53, 58, 58, 58, 58, 58, 58, 58, 63, 63, 63, 63, 63, 63, 63, 63, 68, 68, 68,
68

Then we add them all up and divide by 21. The quick way to do it is to multiply
each midpoint by each frequency:

Midpoint (x)   Frequency (f)   Midpoint × Frequency (fx)
53             2               106
58             7               406
63             8               504
68             4               272
Totals:        21              1288

And then our estimate of the mean time to complete the race is:

$$\text{Estimated Mean} = \frac{1288}{21} = 61.333...$$

Very close to the exact answer we got earlier.
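
The midpoint shortcut can be checked in a few lines of Python. The sketch also expands the "imagined" data list to show that both routes give the same total. Variable names are ours.

```python
midpoints = [53, 58, 63, 68]
frequencies = [2, 7, 8, 4]

n = sum(frequencies)                                          # total number of runners
total_fx = sum(x * f for x, f in zip(midpoints, frequencies)) # sum of midpoint x frequency
estimated_mean = total_fx / n

# The imagined data: each midpoint repeated as many times as its frequency
imagined = [x for x, f in zip(midpoints, frequencies) for _ in range(f)]

print(n, total_fx, estimated_mean)
```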

Estimating the Median from Grouped Data

Let's look at our data again:

Seconds Frequency
51 - 55 2
56 - 60 7
61 - 65 8
66 - 70 4

The median is the middle value, which in our case is the 11th one, which is in
the 61 - 65 group:

We can say "the median group is 61 - 65"



But if we want an estimated Median value we need to look more closely at the
61 - 65 group.

We call it "61 - 65", but it really includes values from 60.5 up to (but not
including) 65.5.
Why? Well, the values are in whole seconds, so a real time of 60.5 is measured
as 61. Likewise 65.4 is measured as 65.

At 60.5 we already have 9 runners, and by the next boundary at 65.5 we
have 17 runners. By drawing a straight line in between we can pick out where
the median frequency of n/2 runners is.

And this handy formula does the calculation:

$$\text{Estimated Median} = L + \frac{(n/2) - B}{G} \times w$$

where:

 L is the lower class boundary of the group containing the median
 n is the total number of values
 B is the cumulative frequency of the groups before the median group
 G is the frequency of the median group
 w is the group width

For our example:

 L = 60.5
 n = 21
 B = 2 + 7 = 9
 G = 8
 w = 5

$$\text{Estimated Median} = 60.5 + \frac{(21/2) - 9}{8} \times 5 = 60.5 + 0.9375 = 61.4375$$
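
The same calculation can be sketched in Python, letting the code locate the median group from the cumulative frequencies. The function name and the `gap` parameter (the measuring unit, here 1 second, used to turn class limits into class boundaries) are assumptions made for this illustration.

```python
from itertools import accumulate

classes = [(51, 55), (56, 60), (61, 65), (66, 70)]
frequencies = [2, 7, 8, 4]


def estimated_median(classes, frequencies, gap=1):
    """Estimated Median = L + ((n/2) - B) / G * w."""
    n = sum(frequencies)
    cumulative = list(accumulate(frequencies))
    # Index of the first group whose cumulative frequency reaches n/2
    m = next(i for i, cf in enumerate(cumulative) if cf >= n / 2)
    L = classes[m][0] - gap / 2              # lower class boundary
    B = cumulative[m - 1] if m > 0 else 0    # cumulative frequency before the group
    G = frequencies[m]                       # frequency of the median group
    w = classes[m][1] - classes[m][0] + gap  # group width
    return L + (n / 2 - B) / G * w


print(estimated_median(classes, frequencies))  # 61.4375
```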

Estimating the Mode from Grouped Data

Again, looking at our data:

Seconds Frequency
51 - 55 2
56 - 60 7
61 - 65 8
66 - 70 4

We can easily find the modal group (the group with the highest frequency),
which is 61 - 65

We can say "the modal group is 61 - 65"

But the actual Mode may not even be in that group! Or there may be more
than one mode. Without the raw data we don't really know.

But, we can estimate the Mode using the following formula:

$$\text{Estimated Mode} = L + \frac{f_m - f_{m-1}}{(f_m - f_{m-1}) + (f_m - f_{m+1})} \times w$$

where:

 L is the lower class boundary of the modal group


 fm-1 is the frequency of the group before the modal group
 fm is the frequency of the modal group
 fm+1 is the frequency of the group after the modal group
 w is the group width

In this example:

 L = 60.5
 fm-1 = 7
 fm = 8
 fm+1 = 4
 w=5

$$\text{Estimated Mode} = 60.5 + \frac{8 - 7}{(8 - 7) + (8 - 4)} \times 5 = 60.5 + \frac{1}{5} \times 5 = 61.5$$
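
A Python sketch of the mode formula follows. Two choices here are assumptions of this illustration rather than part of the text: a missing neighbouring group (when the modal group is first or last) is treated as having frequency 0, and ties for the highest frequency are resolved by taking the first such group.

```python
classes = [(51, 55), (56, 60), (61, 65), (66, 70)]
frequencies = [2, 7, 8, 4]


def estimated_mode(classes, frequencies, gap=1):
    """Estimated Mode = L + (fm - fm-1) / ((fm - fm-1) + (fm - fm+1)) * w."""
    # Index of the modal group (first group with the highest frequency)
    m = max(range(len(frequencies)), key=lambda i: frequencies[i])
    f_m = frequencies[m]
    f_prev = frequencies[m - 1] if m > 0 else 0                     # assumed 0 if absent
    f_next = frequencies[m + 1] if m + 1 < len(frequencies) else 0  # assumed 0 if absent
    L = classes[m][0] - gap / 2              # lower class boundary
    w = classes[m][1] - classes[m][0] + gap  # group width
    return L + (f_m - f_prev) / ((f_m - f_prev) + (f_m - f_next)) * w


print(estimated_mode(classes, frequencies))
```

If the modal group and both neighbours have equal frequencies the denominator is zero, so a real implementation would need to handle that case separately.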

Our final result is:

 Estimated Mean: 61.333...
 Estimated Median: 61.4375
 Estimated Mode: 61.5

(Compare that with the true Mean, Median and Mode of 61.38..., 61 and
62 that we got at the very start.)

And that is how it is done.

Now let us look at two more examples, and get some more practice along the
way!

Baby Carrots Example


 

Example: You grew fifty baby carrots using special soil. You dig them
up and measure their lengths (to the nearest mm) and group the
results:

Length (mm)   Frequency
150 - 154 5
155 - 159 2
160 - 164 6
165 - 169 8
170 - 174 9
175 - 179 11
180 - 184 6
185 - 189 3

Mean
Length (mm)   Midpoint (x)   Frequency (f)   fx
150 - 154     152            5               760
155 - 159     157            2               314
160 - 164     162            6               972
165 - 169     167            8               1336
170 - 174     172            9               1548
175 - 179     177            11              1947
180 - 184     182            6               1092
185 - 189     187            3               561
Totals:                      50              8530

$$\text{Estimated Mean} = \frac{8530}{50} = 170.6 \text{ mm}$$

Median

The Median is the mean of the 25th and the 26th length, so is in the 170 - 174
group:

 L = 169.5 (the lower class boundary of the 170 - 174 group)
 n = 50
 B = 5 + 2 + 6 + 8 = 21
 G = 9
 w = 5

$$\text{Estimated Median} = 169.5 + \frac{(50/2) - 21}{9} \times 5 = 169.5 + 2.22... = 171.7 \text{ mm (to 1 decimal)}$$

Mode

The Modal group is the one with the highest frequency, which is 175 - 179:

 L = 174.5 (the lower class boundary of the 175 - 179 group)


 fm-1 = 9
 fm = 11
 fm+1 = 6
 w=5

$$\text{Estimated Mode} = 174.5 + \frac{11 - 9}{(11 - 9) + (11 - 6)} \times 5 = 174.5 + 1.42... = 175.9 \text{ mm (to 1 decimal)}$$

Age Example
Age is a special case.

When we say "Sarah is 17" she stays "17" up until her eighteenth birthday.
She might be 17 years and 364 days old and still be called "17".

This changes the midpoints and class boundaries.



Example: The ages of the 112 people who live on a tropical island are
grouped as follows:

Age Number
0-9 20
10 - 19 21
20 - 29 23
30 - 39 16
40 - 49 11
50 - 59 10
60 - 69 7
70 - 79 3
80 - 89 1

A child in the first group 0 - 9 could be almost 10 years old. So the midpoint for
this group is 5 not 4.5

The midpoints are 5, 15, 25, 35, 45, 55, 65, 75 and 85

Similarly, in the calculations of Median and Mode, we will use the class
boundaries 0, 10, 20 etc
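
The difference between the two conventions can be sketched in Python. The function name `class_boundaries` and its `kind` and `unit` parameters are hypothetical, introduced only to contrast values rounded to the nearest unit (like the sprint times) with ages, which are counted in completed years.

```python
def class_boundaries(lower_limit, upper_limit, kind="rounded", unit=1):
    """Return (lower, upper) class boundaries for a class like '61 - 65'."""
    if kind == "rounded":
        # Values rounded to the nearest unit: extend half a unit each way
        return lower_limit - unit / 2, upper_limit + unit / 2
    if kind == "age":
        # Ages in completed years: '0 - 9' covers 0 up to (not including) 10
        return lower_limit, upper_limit + unit
    raise ValueError(f"unknown kind: {kind}")


def midpoint(boundaries):
    lower, upper = boundaries
    return (lower + upper) / 2


age_classes = [(10 * k, 10 * k + 9) for k in range(9)]  # 0 - 9, 10 - 19, ..., 80 - 89
age_midpoints = [midpoint(class_boundaries(lo, hi, kind="age")) for lo, hi in age_classes]
print(age_midpoints)
```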

Mean
Age       Midpoint (x)   Number (f)   fx
0 - 9     5              20           100
10 - 19   15             21           315
20 - 29   25             23           575
30 - 39   35             16           560
40 - 49   45             11           495
50 - 59   55             10           550
60 - 69   65             7            455
70 - 79   75             3            225
80 - 89   85             1            85
Totals:                  112          3360

$$\text{Estimated Mean} = \frac{3360}{112} = 30$$



Median

The Median is the mean of the ages of the 56th and the 57th people, so is in the
20 - 29 group:

 L = 20 (the lower class boundary of the class interval containing the
median)
 n = 112
 B = 20 + 21 = 41
 G = 23
 w = 10

$$\text{Estimated Median} = 20 + \frac{(112/2) - 41}{23} \times 10 = 20 + 6.52... = 26.5 \text{ (to 1 decimal)}$$

Mode

The Modal group is the one with the highest frequency, which is 20 - 29:

 L = 20 (the lower class boundary of the modal class)


 fm-1 = 21
 fm = 23
 fm+1 = 16
 w = 10

$$\text{Estimated Mode} = 20 + \frac{23 - 21}{(23 - 21) + (23 - 16)} \times 10 = 20 + 2.22... = 22.2 \text{ (to 1 decimal)}$$

Summary
 For grouped data, we cannot find the exact Mean, Median and Mode;
we can only give estimates.

 To estimate the Mean use the midpoints of the class intervals:

$$\text{Estimated Mean} = \frac{\text{Sum of (Midpoint} \times \text{Frequency)}}{\text{Sum of Frequency}}$$

 To estimate the Median use:

$$\text{Estimated Median} = L + \frac{(n/2) - B}{G} \times w$$

where:

o L is the lower class boundary of the group containing the median
o n is the total number of data
o B is the cumulative frequency of the groups before the median
group
o G is the frequency of the median group
o w is the group width

 To estimate the Mode use:

$$\text{Estimated Mode} = L + \frac{f_m - f_{m-1}}{(f_m - f_{m-1}) + (f_m - f_{m+1})} \times w$$

where:

o L is the lower class boundary of the modal group


o fm-1 is the frequency of the group before the modal group
o fm is the frequency of the modal group
o fm+1 is the frequency of the group after the modal group
o w is the group width
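
The three summary formulas can be combined into one Python sketch and checked against the Baby Carrots and Age examples. The function takes class boundaries directly (so the age convention needs no special handling). The function name is hypothetical, and treating a missing neighbour of the modal group as frequency 0 is an assumption of this sketch.

```python
from itertools import accumulate


def grouped_estimates(boundaries, frequencies):
    """Return (mean, median, mode) estimates from (lower, upper) class boundaries."""
    n = sum(frequencies)

    # Mean: sum of midpoint x frequency, divided by the sum of frequencies
    midpoints = [(lo + hi) / 2 for lo, hi in boundaries]
    mean = sum(x * f for x, f in zip(midpoints, frequencies)) / n

    # Median: L + ((n/2) - B) / G * w
    cumulative = list(accumulate(frequencies))
    i = next(k for k, cf in enumerate(cumulative) if cf >= n / 2)
    lo, hi = boundaries[i]
    before = cumulative[i - 1] if i > 0 else 0
    median = lo + (n / 2 - before) / frequencies[i] * (hi - lo)

    # Mode: L + (fm - fm-1) / ((fm - fm-1) + (fm - fm+1)) * w
    m = max(range(len(frequencies)), key=lambda k: frequencies[k])
    f_m = frequencies[m]
    f_prev = frequencies[m - 1] if m > 0 else 0                     # assumed 0 if absent
    f_next = frequencies[m + 1] if m + 1 < len(frequencies) else 0  # assumed 0 if absent
    lo, hi = boundaries[m]
    mode = lo + (f_m - f_prev) / ((f_m - f_prev) + (f_m - f_next)) * (hi - lo)

    return mean, median, mode


# Baby carrots: classes 150 - 154, ..., 185 - 189 measured to the nearest mm
carrot_boundaries = [(149.5 + 5 * k, 154.5 + 5 * k) for k in range(8)]
carrot_freqs = [5, 2, 6, 8, 9, 11, 6, 3]

# Ages: boundaries 0, 10, 20, ... as explained in the Age example
age_boundaries = [(10 * k, 10 * k + 10) for k in range(9)]
age_freqs = [20, 21, 23, 16, 11, 10, 7, 3, 1]

print([round(v, 1) for v in grouped_estimates(carrot_boundaries, carrot_freqs)])
print([round(v, 1) for v in grouped_estimates(age_boundaries, age_freqs)])
```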

https://www.mathsisfun.com/data/frequency-grouped-mean-median-mode.html

Quartiles, Deciles and Percentiles

Introduction:

All of us are aware of the concept of the median in Statistics: the middle value, or
the mean of the two middle values, of an array. We have learned that the median divides a
set of data into two equal parts. In the same way, there are other values which
divide a set of data into four, ten or one hundred equal parts. Such values are referred to as
quartiles, deciles and percentiles respectively.

Collectively, the quartiles, deciles, percentiles and other values obtained by
equal sub-division of the data are called quantiles.

Quartiles:

The values which divide an array (a set of data arranged in ascending or descending
order) into four equal parts are called Quartiles. The first, second and third quartiles are
denoted by Q1, Q2 and Q3 respectively. The first and third quartiles are also called the lower
and upper quartiles respectively. The second quartile represents the median, the middle
value.

Quartiles for Ungrouped Data:

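Quartiles for ungrouped data are calculated by the following formulae:

$$Q_1 = \text{value of the } \frac{n+1}{4}\text{th item}$$
$$Q_2 = \text{value of the } \frac{2(n+1)}{4}\text{th item}$$
$$Q_3 = \text{value of the } \frac{3(n+1)}{4}\text{th item}$$

For Example:
Following is the data of marks obtained by 20 students in a test of statistics:

53 74 82 42 39 20 81 68 58 28

67 54 93 70 30 55 36 37 29 61

In order to apply the formulae, we need to arrange the above data in ascending order, i.e. in
the form of an array.

20 28 29 30 36 37 39 42 53 54

55 58 61 67 68 70 74 81 82 93

Here, n = 20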

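i. $Q_1$ = value of the $\frac{n+1}{4}$th item = value of the $\frac{20+1}{4}$th item = value of the 5.25th item

The value of the 5th item is 36 and that of the 6th item is 37. Thus, the
first quartile is a value 0.25 of the way between 36 and 37, which is
36.25. Therefore, $Q_1$ = 36.25. Similarly,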

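ii. $Q_2$ = value of the $\frac{2(n+1)}{4}$th item = value of the $\frac{2(20+1)}{4}$th item = value of the 10.5th item

The value of the 10th item is 54 and that of the 11th item is 55. Thus the
second quartile is a value 0.5 of the way between 54 and 55. Since the difference
between 54 and 55 is 1, the second quartile is 54 + 1(0.5) = 54.5. Hence, $Q_2$ =
54.5. Likewise,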

iii. $Q_3$ = value of the $\frac{3(n+1)}{4}$th item = value of the $\frac{3(20+1)}{4}$th item = value of the 15.75th item

The value of the 15th item is 68 and that of the 16th item is 70. Thus the
third quartile is a value 0.75 of the way between 68 and 70. As the
difference between 68 and 70 is 2, the third quartile will be 68 +
2(0.75) = 69.5. Therefore, $Q_3$ = 69.5.
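
The "fractional item" reasoning used in the three cases above can be sketched in Python. The function names and the `parts` parameter are hypothetical. The sketch assumes the position k(n+1)/parts falls between the 1st and the nth item, which holds for the cases worked here.

```python
def value_at_position(array, position):
    """Value at a 1-based, possibly fractional, position in a sorted array."""
    whole = int(position)          # e.g. 5 for the 5.25th item
    fraction = position - whole    # e.g. 0.25
    lower = array[whole - 1]
    if fraction == 0:
        return lower
    upper = array[whole]
    # Move the given fraction of the way from the lower item to the upper item
    return lower + fraction * (upper - lower)


def quantile(data, k, parts=4):
    """k-th quantile when the data are split into `parts` equal parts."""
    array = sorted(data)
    n = len(array)
    return value_at_position(array, k * (n + 1) / parts)


marks = [20, 28, 29, 30, 36, 37, 39, 42, 53, 54,
         55, 58, 61, 67, 68, 70, 74, 81, 82, 93]

print(quantile(marks, 1), quantile(marks, 2), quantile(marks, 3))
```

Passing `parts=10` or `parts=100` applies the same position rule to deciles and percentiles, the other values mentioned in the introduction.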

Quartiles for Grouped Data:

The quartiles may be determined from grouped data in the same way
as the median except that in place of n/2 we will use n/4 (for the first
quartile) or 3n/4 (for the third quartile). For calculating quartiles from
grouped data we will form a cumulative frequency column. Quartiles for
grouped data will be calculated from the following formulae:

$$Q_1 = l + \frac{h}{f}\left(\frac{n}{4} - C.F\right)$$
$$Q_2 = l + \frac{h}{f}\left(\frac{2n}{4} - C.F\right) = \text{Median}$$
$$Q_3 = l + \frac{h}{f}\left(\frac{3n}{4} - C.F\right)$$

Where,
l = lower class boundary of the class containing $Q_1$ or $Q_3$, i.e. the class
corresponding to the cumulative frequency in which n/4 or 3n/4 lies
h = class interval size of the class containing $Q_1$ or $Q_3$.
f = frequency of the class containing $Q_1$ or $Q_3$.
n = number of values, or the total frequency.
C.F = cumulative frequency of the class preceding the class
containing $Q_1$ or $Q_3$.
For Example:
We will calculate the quartiles from the frequency distribution for the
weight of 120 students as given in the following Table 18;
Table 18

Weight (lb)   Frequency (f)   Class Boundaries   Cumulative Frequency
110 – 119     1               109.5 – 119.5      1
120 – 129     4               119.5 – 129.5      5
130 – 139     17              129.5 – 139.5      22
140 – 149     28              139.5 – 149.5      50
150 – 159     25              149.5 – 159.5      75
160 – 169     18              159.5 – 169.5      93
170 – 179     13              169.5 – 179.5      106
180 – 189     6               179.5 – 189.5      112
190 – 199     5               189.5 – 199.5      117
200 – 209     2               199.5 – 209.5      119
210 – 219     1               209.5 – 219.5      120

∑f = n = 120

i. The first quartile $Q_1$ is the value of the $\frac{n}{4}$th item, i.e. the 30th item from the
lower end. From Table 18 we see that the cumulative frequency of the
third class is 22 and that of the fourth class is 50. Thus $Q_1$ lies in the
fourth class, i.e. 140 – 149.

$$Q_1 = 139.5 + \frac{10}{28}(30 - 22) = 139.5 + 2.86 = 142.36$$

ii. The third quartile $Q_3$ is the value of the $\frac{3n}{4}$th item, i.e. the 90th item from
the lower end. The cumulative frequency of the fifth class is 75 and
that of the sixth class is 93. Thus, $Q_3$ lies in the sixth class, i.e. 160 –
169.

$$Q_3 = 159.5 + \frac{10}{18}(90 - 75) = 159.5 + 8.33 = 167.83$$

Conclusion
From $Q_1$ and $Q_3$ we conclude that 25% of the students weigh 142.36
pounds or less and 75% of the students weigh 167.83 pounds or less.
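
The two quartiles quoted in the conclusion can be reproduced from Table 18 with a short Python sketch. The function name is hypothetical, and the class size h = 10 is read off the class boundaries in the table.

```python
from itertools import accumulate

# Table 18: lower class boundaries and frequencies; every class has size h = 10
lower_boundaries = [109.5, 119.5, 129.5, 139.5, 149.5, 159.5,
                    169.5, 179.5, 189.5, 199.5, 209.5]
frequencies = [1, 4, 17, 28, 25, 18, 13, 6, 5, 2, 1]
h = 10


def grouped_quartile(k):
    """Q_k = l + (h / f) * (k * n / 4 - C.F) for k = 1, 2, 3."""
    n = sum(frequencies)
    target = k * n / 4                       # the (kn/4)th item
    cumulative = list(accumulate(frequencies))
    # First class whose cumulative frequency reaches the target item
    i = next(j for j, cf in enumerate(cumulative) if cf >= target)
    cf_before = cumulative[i - 1] if i > 0 else 0
    return lower_boundaries[i] + h / frequencies[i] * (target - cf_before)


print(round(grouped_quartile(1), 2), round(grouped_quartile(3), 2))
```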

Deciles:

The values which divide an array into ten equal parts are called
deciles. The first, second, …, ninth deciles are denoted by $D_1, D_2, \ldots, D_9$ respectively.
The fifth decile ($D_5$) corresponds to the median. The second, fourth, sixth
and eighth deciles, which collectively divide the data into five equal
parts, are called quintiles.

Deciles for Ungrouped Data:

Deciles for ungrouped data will be calculated from the following
formulae:

$$D_k = \text{value of the } \frac{k(n+1)}{10}\text{th item}, \quad k = 1, 2, \ldots, 9$$

For Example:
We will calculate second, third and seventh deciles from the following
array of data.
20 28 29 30 36 37 39 42 53 54

55 58 61 67 68 70 74 81 82 93

i. $D_2$ = value of the $\frac{2(n+1)}{10}$th item = value of the $\frac{2(20+1)}{10}$th item = value of the 4.2th item

The value of the 4th item is 30 and that of the 5th item is 36. Thus the
second decile is a value 0.2 of the way between 30 and 36. The second
decile will be 30 + 6(0.2) = 31.2. Therefore, $D_2$ = 31.2.

ii. $D_3$ = value of the $\frac{3(20+1)}{10}$th item = value of the 6.3th item

The value of the 6th item is 37 and that of the 7th item is 39. Thus the
third decile is 0.3 of the way between 37 and 39. The third decile will
be 37 + 2(0.3) = 37.6. Hence, $D_3$ = 37.6.

iii. $D_7$ = value of the $\frac{7(20+1)}{10}$th item = value of the 14.7th item

The value of the 14th item is 67 and that of the 15th item is 68. Thus the
7th decile is 0.7 of the way between 67 and 68, which will be 67 +
0.7 = 67.7. Therefore, $D_7$ = 67.7.

Decile for Grouped Data

Deciles for grouped data can be calculated from the following formulae:

$$D_k = l + \frac{h}{f}\left(\frac{kn}{10} - C.F\right), \quad k = 1, 2, \ldots, 9$$

Where,
l = lower class boundary of the class containing $D_k$, i.e. the class
corresponding to the cumulative frequency in which kn/10 (for example 2n/10 or 9n/10)
lies
h = class interval size of the class containing $D_k$.
f = frequency of the class containing $D_k$.
n = number of values, or the total frequency.
C.F = cumulative frequency of the class preceding the class containing
$D_k$.

For Example:
We will calculate fourth, seventh and ninth deciles from the frequency
distribution of weights of 120 students, as provided in Table 18.
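i. $D_4$ is the value of the $\frac{4 \times 120}{10}$ = 48th item, which lies in the class 140 – 149.

$$D_4 = 139.5 + \frac{10}{28}(48 - 22) = 139.5 + 9.29 = 148.79$$

ii. $D_7$ is the value of the $\frac{7 \times 120}{10}$ = 84th item, which lies in the class 160 – 169.

$$D_7 = 159.5 + \frac{10}{18}(84 - 75) = 159.5 + 5 = 164.5$$

iii. $D_9$ is the value of the $\frac{9 \times 120}{10}$ = 108th item, which lies in the class 180 – 189.

$$D_9 = 179.5 + \frac{10}{6}(108 - 106) = 179.5 + 3.33 = 182.83$$

Conclusion:

From $D_4$, $D_7$ and $D_9$ we conclude that 40% of the students weigh 148.79 pounds
or less, 70% of the students weigh 164.5 pounds or less and 90% of the students
weigh 182.83 pounds or less.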

 
Percentiles:

The values which divide an array into one hundred equal parts are
called percentiles. The first, second, …, ninety-ninth percentiles are
denoted by $P_1, P_2, \ldots, P_{99}$ respectively. The 50th percentile ($P_{50}$) corresponds to the
median. The 25th percentile ($P_{25}$) corresponds to the first quartile and
the 75th percentile ($P_{75}$) corresponds to the third quartile.

Percentiles for Ungrouped Data:


Percentiles for ungrouped data can be calculated from the following
formulae:

$$P_k = \text{value of the } \frac{k(n+1)}{100}\text{th item}, \quad k = 1, 2, \ldots, 99$$

For Example:

We will calculate the fifteenth, thirty-seventh and sixty-fourth percentiles
from the following array:

20 28 29 30 36 37 39 42 53 54

55 58 61 67 68 70 74 81 82 93

i. $P_{15}$ = value of the $\frac{15(20+1)}{100}$th item = value of the 3.15th item

The value of the 3rd item is 29 and that of the 4th item is 30. Thus the
15th percentile is 0.15 of the way between 29 and 30, which will be
calculated as 29 + 0.15 = 29.15. Hence, $P_{15}$ = 29.15.

ii. $P_{37}$ = value of the $\frac{37(20+1)}{100}$th item = value of the 7.77th item

The value of the 7th item is 39 and that of the 8th item is 42. Thus the
37th percentile is 0.77 of the way between 39 and 42, which will be
calculated as 39 + 3(0.77) = 41.31. Hence, $P_{37}$ = 41.31.

iii. $P_{64}$ = value of the $\frac{64(20+1)}{100}$th item = value of the 13.44th item

The value of the 13th item is 61 and that of the 14th item is 67.
Thus, the 64th percentile is 0.44 of the way between 61 and
67. Since the difference between 61 and 67 is 6, the
64th percentile will be calculated as 61 + 6(0.44) = 63.64.
Hence, $P_{64}$ = 63.64.

Percentiles for Grouped Data:


Percentiles can also be calculated for grouped data with
the help of the following formulae:

$$P_k = l + \frac{h}{f}\left(\frac{kn}{100} - C.F\right), \quad k = 1, 2, \ldots, 99$$

Where,
l = lower class boundary of the class containing $P_k$,
i.e. the class corresponding to the cumulative frequency in
which kn/100 (for example 35n/100 or 99n/100) lies
h = class interval size of the class containing $P_k$.
f = frequency of the class containing $P_k$.
n = number of values, or the total frequency.
C.F = cumulative frequency of the class preceding the class
containing $P_k$.
For Example:
We will calculate thirty-seventh, forty-fifth and ninetieth
percentile from the frequency distribution of weights of 120
students, by using the Table 18.
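i. $P_{37}$ is the value of the $\frac{37 \times 120}{100}$ = 44.4th item, which lies in the class 140 – 149.

$$P_{37} = 139.5 + \frac{10}{28}(44.4 - 22) = 139.5 + 8 = 147.5$$

ii. $P_{45}$ is the value of the $\frac{45 \times 120}{100}$ = 54th item, which lies in the class 150 – 159.

$$P_{45} = 149.5 + \frac{10}{25}(54 - 50) = 149.5 + 1.6 = 151.1$$

iii. $P_{90}$ is the value of the $\frac{90 \times 120}{100}$ = 108th item, which lies in the class 180 – 189.

$$P_{90} = 179.5 + \frac{10}{6}(108 - 106) = 179.5 + 3.33 = 182.83$$

Conclusion

From $P_{37}$, $P_{45}$ and $P_{90}$ we conclude that 37% of the
students weigh 147.5 pounds or less. Similarly, 45% of the students
weigh 151.1 pounds or less and 90% of the students weigh 182.83
pounds or less.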
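
Because the quartile, decile and percentile formulae for grouped data differ only in the divisor (4, 10 or 100), one Python sketch can cover all of them. The function name and the `parts` parameter are hypothetical. The data are those of Table 18, with class size h = 10 read off the class boundaries.

```python
from itertools import accumulate

# Table 18: lower class boundaries and frequencies; every class has size h = 10
lower_boundaries = [109.5, 119.5, 129.5, 139.5, 149.5, 159.5,
                    169.5, 179.5, 189.5, 199.5, 209.5]
frequencies = [1, 4, 17, 28, 25, 18, 13, 6, 5, 2, 1]
h = 10


def grouped_quantile(k, parts):
    """l + (h / f) * (k * n / parts - C.F); parts = 4, 10 or 100."""
    n = sum(frequencies)
    target = k * n / parts                   # position of the required item
    cumulative = list(accumulate(frequencies))
    # First class whose cumulative frequency reaches the target item
    i = next(j for j, cf in enumerate(cumulative) if cf >= target)
    cf_before = cumulative[i - 1] if i > 0 else 0
    return lower_boundaries[i] + h / frequencies[i] * (target - cf_before)


for k in (37, 45, 90):
    print(f"P{k} = {grouped_quantile(k, 100):.2f}")
```

Since 90n/100 and 9n/10 are the same item, `grouped_quantile(90, 100)` and `grouped_quantile(9, 10)` return the same weight.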

https://econtutorials.com/blog/quartiles-deciles-and-percentiles/