DESCRIPTIVE STATISTICS
MEASURES OF CENTRAL TENDENCY
ROUNDING OF DATA
Rounding a number to the nearest unit (tenth, or other decimal place) reduces it to the
number of significant digits warranted in the particular computation. When the remainder
to be rounded off is “exactly 5,” the convention is to round to the nearest even number.
By this practice the additions due to rounding will tend to counterbalance the subtractions
due to rounding in the long run.
Examples:
1. 18.758 rounded to the nearest tenth = 18.8
2. 15.449 rounded to the nearest hundredth = 15.45
3. 15.449 rounded to the nearest tenth = 15.4
4. 18.05 rounded to the nearest tenth = 18.0
5. 89.1750 rounded to the nearest hundredth = 89.18
6. $63.50 rounded to the nearest dollar = $64 since 3 is odd and 5 is followed by
zero.
7. $64.50 rounded to the nearest dollar = $64 since 4 is even.
8. $64.52 rounded to the nearest dollar = $65 since 5 is not followed by zero.
9. 27.27 to the nearest tenth = 27.3
10. 27.27 to the nearest unit = 27
11. 188.549 to four significant digits = 188.5
12. 325.455 to the nearest hundredth = 325.46
13. 325.455 to the nearest tenth = 325.5 since 5 (hundredth position) is not
followed by zero.
14. 325.455 to the nearest unit = 325
15. 0.05049 to two significant digits = 0.050
16. 0.05050 to two significant digits = 0.050 (zero before second 5 is considered
even)
17. 0.05050 to one significant digit = 0.05
The result of rounding a number such as 72.8 to the nearest unit is 73 since 72.8 is closer
to 73 than to 72. Similarly, 72.8146 rounded to the nearest hundredth, or to two decimal
places, is 72.81, since 72.8146 is closer to 72.81 than to 72.82.
In rounding 72.465 to the nearest hundredth, however, we are faced with a dilemma since
72.465 is just as far from 72.46 as from 72.47. It has become the practice in such cases to
round to the even integer preceding the 5. Thus 72.465 is rounded to 72.46, 183.575 is
rounded to 183.58, and 116,500,000 rounded to the nearest million is 116,000,000, which can
also be written as 116 million. This practice is especially useful in minimizing cumulative
rounding errors when a large number of operations is involved.
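The round-to-even rule can be tried out with a short sketch. It assumes Python's standard `decimal` module, and the helper name `round_half_even` is made up for illustration. Values are passed as strings so that binary floating point does not disturb the exact "5" before rounding.

```python
from decimal import Decimal, ROUND_HALF_EVEN

def round_half_even(value: str, unit: str) -> Decimal:
    """Round `value` to the precision of `unit` ("0.1", "0.01", "1", ...).

    A remainder of exactly 5 goes to the even neighbour; any other
    remainder goes to the nearest neighbour.
    """
    return Decimal(value).quantize(Decimal(unit), rounding=ROUND_HALF_EVEN)

print(round_half_even("72.465", "0.01"))   # 72.46
print(round_half_even("183.575", "0.01"))  # 183.58
print(round_half_even("63.50", "1"))       # 64
print(round_half_even("64.50", "1"))       # 64
print(round_half_even("64.52", "1"))       # 65
print(round_half_even("18.05", "0.1"))     # 18.0
```

Python's built-in `round` also rounds halves to even, but it works on binary floats, so a value typed as 72.465 may not be an exact tie; the string-based version above avoids that surprise.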
DECIMAL POINT
LEFT                  RIGHT
Unit                  Tenth
Ten                   Hundredth
Hundred               Thousandth
Thousand              Ten Thousandth
Ten Thousand          Hundred Thousandth
Hundred Thousand      Millionth
Million               Ten Millionth
Ten Million           Hundred Millionth
Hundred Million
SIGNIFICANT DIGITS
As a general rule, leading zeros, such as in $00389 or in 0.00389, are never considered to
be significant digits. Both have only 3 significant figures. Embedded zeros that are
followed by at least one significant digit, such as in $3800.57, are all considered to be
significant (here we have 6 significant figures). Trailing zeros may or may not be
significant, according to the level of accuracy in the original data.
Examples:
1. If “$3800” has two significant digits, then the measurement was made to the
nearest hundred dollars, the maximum error is $50, and the true value is located
between $3750 and $3850.
2. If “$3800” has three significant digits, then the measurement was made to the
nearest ten dollars, the maximum error is $5, and the true value is located
between $3795 and $3805.
3. If “$3800” has four significant digits, then the measurement was made to the
nearest dollar, the maximum error is 50 cents, and the true value is located
between $3799.50 and $3800.50.
If a height is accurately recorded as 65.4 inches, it means that the true
height lies between 65.35 and 65.45 inches. The accurate digits, apart from zeros needed
to locate the decimal point, are called the significant digits or significant figures of the
number.
Examples:
1. 65.4 has 3 significant figures.
2. 4.5300 has 5 significant figures.
3. .0018 = 0.0018 = 1.8 x 10^-3 has 2 significant figures.
4. .001800 = 0.001800 = 1.800 x 10^-3 has 4 significant figures.
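Rounding to a given number of significant digits, as in rounding examples 11 and 15 to 17 earlier, can be sketched as follows. This is a minimal illustration that assumes Python's `decimal` module and the round-to-even convention; `round_significant` is a hypothetical helper name, not a library function.

```python
from decimal import Decimal, ROUND_HALF_EVEN

def round_significant(value: str, digits: int) -> Decimal:
    """Round `value` to `digits` significant digits (ties go to even)."""
    d = Decimal(value)
    if d == 0:
        return d
    # adjusted() is the power of ten of the first non-zero digit,
    # so leading zeros are never counted as significant.
    quantum = Decimal(1).scaleb(d.adjusted() - digits + 1)
    return d.quantize(quantum, rounding=ROUND_HALF_EVEN)

print(round_significant("188.549", 4))  # 188.5
print(round_significant("0.05049", 2))  # 0.050
print(round_significant("0.05050", 2))  # 0.050
print(round_significant("0.05050", 1))  # 0.05
```

The trailing zero in 0.050 is kept in the output because it is a significant digit.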
SCIENTIFIC NOTATION
Note that multiplying a number by 10 raised to +8 has the effect of moving the decimal
point of the number 8 places to the right. Multiplying a number by 10 raised to –6 has the
effect of moving the decimal point of the number 6 places to the left.
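As a quick worked illustration of the two rules, take the number 3.5 (chosen here only as an example; it does not come from the text):

```latex
\[
3.5 \times 10^{8} = 350{,}000{,}000
\qquad\text{and}\qquad
3.5 \times 10^{-6} = 0.0000035
\]
```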
COMPUTATIONS
In performing additions and subtractions of numbers, the final result has no more
significant figures after the decimal point than the number with the fewest significant
figures after the decimal point.
Examples:
1. 3.16 + 2.7 = 5.9
2. 83.42 – 72 = 11
3. 47.816 – 25 = 22.816 if 25 is exact.
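The three examples can be reproduced with a small sketch. It assumes Python's `decimal` module; the function `add_measured` and its `exact` parameter are hypothetical names used only for this illustration. Subtraction is written as adding a negative term.

```python
from decimal import Decimal, ROUND_HALF_EVEN

def add_measured(terms, exact=()):
    """Add measured values, keeping only as many decimal places as the
    least precise measured term. Values listed in `exact` take part in
    the sum but do not limit its precision."""
    measured = [Decimal(t) for t in terms]
    total = sum(measured, Decimal(0)) + sum((Decimal(e) for e in exact), Decimal(0))
    # The exponent of a Decimal is minus its number of decimal places.
    coarsest = max(m.as_tuple().exponent for m in measured)
    return total.quantize(Decimal(1).scaleb(coarsest), rounding=ROUND_HALF_EVEN)

print(add_measured(["3.16", "2.7"]))          # 5.9
print(add_measured(["83.42", "-72"]))         # 11
print(add_measured(["47.816"], exact=["-25"]))  # 22.816
```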
Let the symbol Xj (read “X sub j”) denote any of the n values X1, X2, X3, ..., Xn assumed
by a variable X. The letter j in Xj, which can stand for any of the numbers 1, 2, 3, 4, ..., n,
is called a subscript or index. Clearly any letter other than j, such as i, k, p, q, could have
been used as well.
SUMMATION NOTATIONS
The symbol \(\sum_{j=1}^{n} X_j\) is used to denote the sum of all the Xj’s from j = 1 to j = n, i.e. by
definition
\[ \sum_{j=1}^{n} X_j = X_1 + X_2 + X_3 + \cdots + X_n \]
When no confusion can result, we shall denote this sum simply by ΣX, ΣXj or \(\sum_j X_j\).
In statistics, the term average is quite precise. It is a single figure that represents a group
of data values. It condenses these values into one value: the center point around which the
values cluster, and which therefore typifies a set of individual observations.
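Weighted arithmetic mean, with weights w:
x’ = Σwx / Σw
4. If A is any guessed or assumed arithmetic mean (which
may be any number) and if dj = xj – A are the deviations of xj
from A, then:
x’ = A + Σd / n (ungrouped data)
x’ = A + Σfd / n (grouped data with frequencies f, where n = Σf)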
Ungrouped Data
Mean
Example 1: The following data show the mathematics grades of 7 first year high school
students taking tutorial classes: 72%, 69%, 89%, 65%, 76%, 76%, and 83%.
Compute the mean, median and mode of the given data.
x’ = Σx / n
where
x’ = mean
n = the number of data values
Σx = the sum of the data values
x’ = 530/7 = 75.71%
Therefore, the average mathematics grade of the 7 first year high school students who are
taking tutorial classes is 75.71%. This value summarizes and represents the grades of all
the students in this example. Although widely used, the mean is sensitive to extreme
values: a single very high or very low value can pull it noticeably.
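The computation in Example 1, together with the assumed-mean property stated earlier (mean = A + sum of deviations / n), can be checked with a short sketch. The function names are illustrative only, and the assumed mean A = 76 is an arbitrary guess chosen for the demonstration.

```python
from fractions import Fraction

grades = [72, 69, 89, 65, 76, 76, 83]  # Example 1, in percent

def mean(values):
    """Arithmetic mean as an exact fraction: sum of values over n."""
    return Fraction(sum(values), len(values))

def mean_from_assumed(values, assumed):
    """Mean via deviations d = x - A from an assumed mean A."""
    deviations = [x - assumed for x in values]
    return assumed + Fraction(sum(deviations), len(values))

print(sum(grades))                                    # 530
print(round(float(mean(grades)), 2))                  # 75.71
print(mean_from_assumed(grades, 76) == mean(grades))  # True
```

Any other guess for A gives the same result, since the deviations adjust to compensate.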
Weighted Mean
Example 2: If a college student got a grade of 1.25 in a 5-unit chemistry course, a 1.5 in a
3-unit algebra course, a 1.0 in a 1-unit P.E. course, a 2.25 in a 3-unit history course and a
1.5 in a 3-unit course in English, find his average grade.
x’ = Σ(fx) / n
f = the respective weight of each individual observation (here, the number of units)
n = Σf, the sum of the weights
x’ = [1.25(5) + 1.5(3) + 1.0(1) + 2.25(3) + 1.5(3)] / 15 = 23 / 15 = 1.53
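Example 2 can be worked through in a few lines. This is a sketch only: the units are taken as the weights f, and `weighted_mean` is an illustrative name.

```python
from fractions import Fraction

# (grade, units) pairs from Example 2; the units act as the weights f.
courses = [("1.25", 5), ("1.5", 3), ("1.0", 1), ("2.25", 3), ("1.5", 3)]

def weighted_mean(pairs):
    """Sum of grade x weight, divided by the sum of the weights."""
    total = sum(Fraction(grade) * units for grade, units in pairs)
    weight = sum(units for _, units in pairs)
    return total / weight

print(weighted_mean(courses))                   # 23/15
print(round(float(weighted_mean(courses)), 2))  # 1.53
```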
Median
Position of md = (n + 1) / 2
The median (md) is simply the value of the middle item (or the mean of the values of
the two middle items) when the data are arranged in an increasing or
decreasing order of magnitude.
If we have an odd number of items, there is always a middle item whose value
is the median. For example, the median of the five numbers 5, 10, 2, 7, and 8
is 7, and the median of the nine numbers 3, 5, 6, 9, 9, 10, 10, 12, and 13 is 9.
Note that there are two 9’s in this last example and that we do not refer to
either of them as the median. The median is a number and not an item,
namely, the value of the middle item. Generally speaking, if there are n items
and n is odd, the median is the value of the (n + 1)/2th largest item. Thus, the
median of 25 numbers is given by the value of the (25 + 1)/2 = 13th largest, and the
median of 49 numbers is given by the value of the (49 + 1)/2 = 25th largest.
If we have an even number of items, there is never a middle item and the
median is defined as the mean of the values of the two middle items. For
instance, the median of the six numbers 3, 6, 8, 10, 13, and 15 is (8 + 10)/2 =
9. It is halfway between the two middle values (here the 3rd and the 4th) and if
we interpret it correctly, the formula (n + 1)/2 again gives the position of the
median. For the six given numbers the median is thus the value of the
(6 + 1)/2 = 3.5th largest, and we interpret this as “halfway between the values of the
third and the fourth.” Similarly, the median of 100 numbers is given by the
value of the (100 + 1)/2 = 50.5th largest item, or halfway between the values of the 50th
and the 51st.
It is important to note that the formula (n + 1)/2 is not a formula for the median itself;
it merely tells us the position of the median, namely, the number of items we
have to count until we reach the item whose value is the median (or the two
items whose values have to be averaged to obtain the median).
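For the seven grades in Example 1, arranged in increasing order (65, 69, 72, 76, 76, 83, 89),
the median is the value of the (7 + 1)/2 = 4th item, or 76%.
Therefore, about 50% of the students got grades above 76%, while about 50% of
them got grades below 76%. Take note that the median is considered
positional because it is only concerned with the middle or midpoint value. Unlike
the mean, it is not affected by extreme high or low values.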
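The position rule (n + 1)/2 translates directly into code. The sketch below assumes a non-empty list of numbers; the function name `median` is our own, though Python's `statistics.median` gives the same results.

```python
def median(values):
    """Median via the (n + 1)/2 position rule, on the sorted data."""
    ordered = sorted(values)
    n = len(ordered)
    position = (n + 1) / 2              # 1-based position of the median
    lower = ordered[int(position) - 1]  # item at, or just before, that position
    if position.is_integer():           # odd n: a single middle item
        return lower
    upper = ordered[int(position)]      # even n: average the two middle items
    return (lower + upper) / 2

print(median([5, 10, 2, 7, 8]))              # 7
print(median([3, 6, 8, 10, 13, 15]))         # 9.0
print(median([72, 69, 89, 65, 76, 76, 83]))  # 76
```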
3. Just as the median divides the data into two parts, a data set can likewise be divided
into quartiles (four parts), deciles (ten parts) or percentiles (one hundred parts).
4. To compute the position of the first quartile:
Q1 = 1(n + 1) / 4
= 1(7 + 1) / 4
= 2, hence Q1 is the 2nd data value, or 69%. Take note that if the answer
has a decimal part, then we have to interpolate.
With the data arranged in ascending order, each quartile (Q), decile (D) or percentile (P)
is the value at or below which the stated share of the data falls. For example, for Q3,
where the grade is 83%: 75% of the students (the share corresponding to Q3) scored 83% or
lower (or 25% scored 83% or higher).
For D2, where the grade is 67%: 20% of the students scored 67% or
lower (or 80% scored 67% or higher).
For P90, where the grade is 83%: 90% of the students scored
83% or lower (or 10% scored 83% or higher).
Q3 = 3(n + 1) / 4
= 3(7 + 1) / 4
= 24 / 4
= 6, hence the 6th data or 83%.
D8 = 8(n + 1) / 10
= 8(7 + 1) / 10
= 6.4, hence D8 lies between the 6th and 7th data values (83% and 89%).
Interpolating: 83 + 0.4(89 – 83) = 85.4%.
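The position formula k(n + 1)/parts used above for Q1, Q3 and D8 can be sketched as one function, with linear interpolation when the position has a decimal part. The function name and the choice to raise an error when the position falls outside the data are assumptions made for this illustration.

```python
from fractions import Fraction
from math import floor

grades = sorted([72, 69, 89, 65, 76, 76, 83])  # 65, 69, 72, 76, 76, 83, 89

def positional_value(ordered, k, parts):
    """k-th of `parts` divisions (parts = 4, 10 or 100) of sorted data."""
    n = len(ordered)
    position = Fraction(k * (n + 1), parts)  # 1-based, may be fractional
    if not 1 <= position <= n:
        raise ValueError("position falls outside the data")
    whole = floor(position)
    fraction = position - whole
    lower = ordered[whole - 1]
    if fraction == 0:
        return float(lower)
    upper = ordered[whole]
    # Move the fractional part of the way from the lower to the upper item.
    return float(lower + fraction * (upper - lower))

print(positional_value(grades, 1, 4))   # 69.0  (Q1, 2nd item)
print(positional_value(grades, 3, 4))   # 83.0  (Q3, 6th item)
print(positional_value(grades, 8, 10))  # 85.4  (D8, position 6.4)
```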
GROUPED DATA
To compute the mean, median and mode of grouped data, let us use the
frequency distribution we arrived at earlier. Take note that there are other formulas for
solving these measures. However, since we want to make statistics as simple as possible,
we use the following:
Mean
x’ = A + (Σfd / n)(i)
where
A = the assumed mean (the class mark of the class interval where d = 0)
f = frequency value
d = deviation coded value
i = the class size
n = the number of data values
Example 1:
Class interval   Class Mark   Frequency (f)   Cumulative Frequency (F)   Deviation (d)   fd
25-31            28.0         4               4                          -2              -8
32-38            35.0         7               11                         -1              -7
39-45            42.0         3               14                          0               0
46-52            49.0         2               16                          1               2
53-59            56.0         2               18                          2               4
60-66            63.0         2               20                          3               6
                              ------                                                     ------
                              20                                                         -3
1. In computing the mean, deviation coded values, d, are assigned to each
class interval.
2. More specifically, the deviation coded value 0 is assigned to the middle class
interval. Since the number of class intervals in the example is even (6),
there is no middle class interval. Instead, move one class interval up from the
middle and assign 0 there (39-45). Then number the remaining intervals
consecutively: -1, -2 going up the table and 1, 2, 3 going down.
3. Multiply the deviation coded values by the respective
frequencies to get fd, then add these products: Σfd = -3.
4. Get the assumed mean, 42.0. This is the midpoint (class mark) of the class
interval 39-45, where d = 0.
5. Compute the mean, where i = 7 and n = 20.
Thus, x’ = 42.0 + (-3 / 20)(7) = 42.0 – 1.05 = 40.95
Therefore, the average age of the 20 students pursuing graduate studies is 41.0 years.
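The same result follows directly from the class marks x:
x’ = Σfx / Σf
= 819 / 20
= 40.95
= 41 (rounded)
(2). Using uncoded deviations from the assumed mean, d = x - A: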
Class Marks (x)   Deviation, d = x - A   Frequency (f)   fd
28.0              -14                    4               -56.0
35.0               -7                    7               -49.0
42.0*               0                    3                 0.0
49.0                7                    2                14.0
56.0               14                    2                28.0
63.0               21                    2                42.0
                                         ----            ------
                                         20              -21.0
*Assumed Mean, A
x’ = A + Σfd / n
= 42.0 + (-21) / 20
= 42.0 - 1.05 = 40.95
= 41.0
The average age of the 20 students pursuing graduate studies is 41.0 years.
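The grouped mean can be cross-checked in code using the class marks, frequencies and coded deviations from the tables above. This is a sketch; the function names are illustrative.

```python
class_marks = [28, 35, 42, 49, 56, 63]
frequencies = [4, 7, 3, 2, 2, 2]
coded_d     = [-2, -1, 0, 1, 2, 3]   # coded deviations, 0 at the assumed mean

def grouped_mean_direct(marks, freqs):
    """Sum of f*x over sum of f."""
    return sum(f * x for f, x in zip(freqs, marks)) / sum(freqs)

def grouped_mean_coded(assumed, coded, freqs, class_size):
    """A + (sum of f*d / n) * i, with coded deviations d."""
    n = sum(freqs)
    sum_fd = sum(f * d for f, d in zip(freqs, coded))
    return assumed + sum_fd * class_size / n

print(round(grouped_mean_direct(class_marks, frequencies), 2))      # 40.95
print(round(grouped_mean_coded(42, coded_d, frequencies, 7), 2))    # 40.95
```

Both routes agree because multiplying each coded deviation by the class size 7 recovers the uncoded deviation x - A.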
Median
Class Interval   Real Class Limits   Frequency (f)   Cumulative Frequency (F)
25-31            24.5-31.5           4               4
32-38            31.5-38.5           7               11
39-45            38.5-45.5           3               14
46-52            45.5-52.5           2               16
53-59            52.5-59.5           2               18
60-66            59.5-66.5           2               20
                                     ------
                                     20
1. In computing the median, first divide n by 2. The divisor 2 is
constant.
2. Using the resulting quotient, 10, as a basis, refer to the cumulative frequency
column. Find the value that is equal to, or less than but nearest to, 10. This
value, 4, will represent F in the formula.
3. Then move one class interval down and get its frequency value, f = 7, and its real
lower limit, L.L. = 31.5.
4. Compute the median, where i = 7. Thus,
md = L.L. + [(n/2 – F) / f](i)
= 31.5 + [(10 – 4) / 7](7)
= 31.5 + 6
= 37.5
Similarly, we can compute the quartiles Q1 and Q3 as shown in the following formulas:
Q1 = L.L. + [(n/4 – F) / f](i)
and
Q3 = L.L. + [(3n/4 – F) / f](i)
The formulas for finding the kth decile and the kth percentile of a given data set follow
the same pattern:
Dk = L.L. + [(kn/10 – F) / f](i)
Pk = L.L. + [(kn/100 – F) / f](i)
Refer to Dean Young’s Statistics Made Simple for illustrations, pages 36-37.
Take note that all these statistical positional formulas are extensions of the median. They
differ only with respect to how the data values are divided, whether 25%, 75%, 10%, or
1%.
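Because these positional measures share the pattern of the Q3 formula above, L.L. + [(kn/parts – F) / f](i), a single function can cover the median, quartiles, deciles and percentiles. The sketch below uses the age distribution from the table; the function name and argument layout are assumptions for illustration.

```python
lower_limits = [24.5, 31.5, 38.5, 45.5, 52.5, 59.5]  # real lower limits
frequencies  = [4, 7, 3, 2, 2, 2]
class_size   = 7

def grouped_position_value(lowers, freqs, width, k, parts):
    """L.L. + [(k*n/parts - F) / f] * i for grouped data."""
    n = sum(freqs)
    target = k * n / parts
    F = 0  # cumulative frequency before the current class
    for lower, f in zip(lowers, freqs):
        if F + f > target:  # F is the largest cumulative frequency <= target
            return lower + (target - F) * width / f
        F += f
    return lowers[-1] + width  # target equals n: upper limit of the last class

print(grouped_position_value(lower_limits, frequencies, class_size, 1, 2))  # 37.5 (median)
print(grouped_position_value(lower_limits, frequencies, class_size, 1, 4))  # 32.5 (Q1)
print(grouped_position_value(lower_limits, frequencies, class_size, 3, 4))  # 49.0 (Q3)
```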
Mode
When we want to find the mode of a frequency distribution, we first specify the modal
class, which is defined as the class interval containing the largest number of values.
mo = L.L. + [du / (du + dl)](i)
where
mo = the mode
L.L. = the real lower limit of the modal class interval
du = the difference of the highest frequency with the frequency above it
dl = the difference of the highest frequency with the frequency below it
i = the class size
Class Interval   Real Class Limits   Frequency (f)
25-31            24.5-31.5           4
32-38            31.5-38.5           7    (modal class)   du = 7 – 4 = 3
39-45            38.5-45.5           3                    dl = 7 – 3 = 4
46-52            45.5-52.5           2
53-59            52.5-59.5           2
60-66            59.5-66.5           2
                                     ------
                                     20
1. To compute the mode, refer to the frequency column and determine the
highest value, 7.
2. Get the difference of this value, 7 and the frequency above it. This will yield,
du = 3.
3. Get the difference of this value, 7 and the frequency below it. This will yield,
dl = 4.
4. Compute for the mode with i = 7.
mo = 31.5 + [3 / (3 + 4)](7)
= 31.5 + 3
= 34.5
Therefore, the ages of the graduate students cluster most heavily around 34.5 years.
In summary, the mean is 41 years, the median is 37.5 years and the mode is
34.5 years.
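The four steps for the mode can be sketched as a function. Two choices here are assumptions not stated in the text: a missing neighbour (when the modal class is the first or last one) is treated as frequency 0, and ties for the highest frequency are resolved in favour of the first such class.

```python
lower_limits = [24.5, 31.5, 38.5, 45.5, 52.5, 59.5]  # real lower limits
frequencies  = [4, 7, 3, 2, 2, 2]

def grouped_mode(lowers, freqs, width):
    """L.L. + [du / (du + dl)] * i for the modal class."""
    m = max(range(len(freqs)), key=freqs.__getitem__)  # index of the modal class
    above = freqs[m - 1] if m > 0 else 0               # class listed above it
    below = freqs[m + 1] if m + 1 < len(freqs) else 0  # class listed below it
    du, dl = freqs[m] - above, freqs[m] - below
    return lowers[m] + du * width / (du + dl)

print(grouped_mode(lower_limits, frequencies, 7))  # 34.5
```

If the modal class has the same frequency as both neighbours, du + dl is zero and the formula is undefined; the sketch does not guard against that case.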
The R.M.S. (root mean square), or quadratic mean, of a set of numbers x1, x2, x3, ..., xn is given by:
\[ \text{R.M.S.} = \sqrt{\frac{\sum_{j=1}^{n} x_j^{2}}{n}} \]
Example: for the numbers 1, 3, 4, 5, 7 (n = 5):
1 squared = 1
3 squared = 9
4 squared = 16
5 squared = 25
7 squared = 49
(1 + 9 + 16 + 25 + 49) / 5 = 20
R.M.S. = square root of 20 = 4.47
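The worked squares above (for 1, 3, 4, 5, 7) give the mean of the squares; the R.M.S. is the square root of that figure. A minimal sketch, with `rms` as an illustrative name:

```python
from math import sqrt

def rms(values):
    """Square root of the mean of the squared values."""
    return sqrt(sum(x * x for x in values) / len(values))

print(sum(x * x for x in [1, 3, 4, 5, 7]) / 5)  # 20.0
print(round(rms([1, 3, 4, 5, 7]), 2))           # 4.47
```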
GEOMETRIC MEAN
The geometric mean G of a set of n numbers, x1, x2, x3, ..., xn is the nth root of
the product of the numbers.
Example: for the numbers 2, 4, 8 (n = 3):
(2)(4)(8) = 64
G = cube root of 64 = 4
Grouped data:
\[ G = \sqrt[n]{x_1^{f_1}\, x_2^{f_2}\, x_3^{f_3} \cdots x_k^{f_k}} \]
\[ \log G = \frac{\sum f \log x}{n} \]
where x1, x2, ..., xk = class marks (midpoints), f1, f2, ..., fk are
the corresponding frequencies, and
n = Σf
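The logarithmic form is the practical way to compute a geometric mean, since it avoids forming a very large product. The sketch below uses natural logarithms, which is an arbitrary choice; any base gives the same G as long as the antilog uses the same base. The function name and optional `freqs` argument are illustrative.

```python
from math import exp, log

def geometric_mean(values, freqs=None):
    """Antilog of (sum of f * log x) / n, with n = sum of f.

    With no frequencies given, every value counts once (ungrouped data).
    All values must be positive.
    """
    if freqs is None:
        freqs = [1] * len(values)
    n = sum(freqs)
    return exp(sum(f * log(x) for x, f in zip(values, freqs)) / n)

print(round(geometric_mean([2, 4, 8]), 6))        # 4.0
print(round(geometric_mean([2, 4], [2, 1]), 4))   # 2.5198, cube root of 2*2*4
```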
HARMONIC MEAN
The harmonic mean H of a set of n numbers x1, x2, x3, ..., xn is the reciprocal of the
arithmetic mean of the reciprocals of the numbers.
H = n / Σ(1/x)
or
1/H = (1/n)[Σ(1/x)]
Example:
The harmonic mean of the numbers 2, 4, 8 (n = 3) is:
H = 3 / (1/2 + 1/4 + 1/8) = 3 / (7/8) = 24/7 = 3.43
Relation between the means (for positive numbers):
G ≤ x’
G ≥ H
H ≤ G ≤ x’
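The ordering of the three means can be checked on the numbers 2, 4, 8 used in the examples. This sketch relies on Python's standard `statistics` module (`geometric_mean` requires Python 3.8 or later); the variable names are our own.

```python
from statistics import geometric_mean, harmonic_mean, mean

data = [2, 4, 8]
h = harmonic_mean(data)   # 24/7
g = geometric_mean(data)  # cube root of 64
x_bar = mean(data)        # 14/3

print(round(h, 3), round(g, 3), round(x_bar, 3))  # 3.429 4.0 4.667
print(h <= g <= x_bar)                            # True
```

Equality across all three holds only when every number in the set is the same.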