Frequency Distributions and Graphs


1. Frequency Distributions
A frequency distribution is an organization of observations produced by sorting them
into classes and showing the frequency (number of occurrences) in each class.
Constructing a frequency distribution is the most convenient way of organizing data.

1.1 Basic Types of Frequency Distribution:

1. Categorical frequency distribution: used for data that can be placed in
specific categories, such as nominal or ordinal level data.
   Nominal data consist of names, labels, or categories only.
   Ordinal data can be arranged in some order, but differences between data
   values either cannot be determined or are meaningless.

Example of Categorical Frequency Distribution:


The following results were obtained from a sample survey with categories
A, B, and C. The third column gives the frequency (f) of each category, and the
fourth column gives the cumulative frequency (CF).

Category   Tally             Frequency (f)   CF
A          llll-l                  6           6
B          llll-llll               9          15
C          llll-llll-llll         15          30
                             Sum = n = 30

2. Frequency distribution for ungrouped data: observations are sorted into
classes of single values.

3. Frequency distribution for grouped data: observations are sorted into
classes of more than one value.

The following are the basic terms associated with frequency tables.
a) Lower class limit: the smallest data value that can be included in the
class.
b) Upper class limit: the largest data value that can be included in the
class.
c) Class boundaries: the values used to separate the classes so that there are no
gaps in the frequency distribution.
d) Class mark: the midpoint of the class,
$$X_m = \frac{\text{lower limit} + \text{upper limit}}{2}$$
e) Class width: the difference between two consecutive lower class
limits.

Example 1:

Weekly Expenses of 80 Employees


Weekly Expenses (variable)   Number of Employees (frequency)
101 - 300                      5
301 - 500                     16
501 - 700                     11
701 - 900                     40
901 - 1100                     8

The 2nd class is 301 - 500, and the frequency of the 2nd class is 16.
The lower limit of the 5th class is 901, and the upper limit of the 5th class is 1100.

Class width = 301 - 101 = 200

Example 2: When 40 people were surveyed at Greenbelt 3, they reported
the distance they drove to the mall. The results (in kilometers) are given below.

2 8 1 5 9 5 14 10 31 20
15 4 10 6 5 5 1 8 12 10
25 40 31 24 20 20 3 9 15 15
25 8 1 1 16 23 18 25 21 12

Construct a frequency distribution table.
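
One possible way to carry out this construction is sketched below in Python. The class width of 8 km, the starting lower limit of 1, and the helper name `grouped_frequency` are assumptions chosen for illustration; the exercise itself does not prescribe them, and other class choices are equally valid.

```python
from collections import Counter

# Distances (km) from Example 2.
distances = [
    2, 8, 1, 5, 9, 5, 14, 10, 31, 20,
    15, 4, 10, 6, 5, 5, 1, 8, 12, 10,
    25, 40, 31, 24, 20, 20, 3, 9, 15, 15,
    25, 8, 1, 1, 16, 23, 18, 25, 21, 12,
]


def grouped_frequency(data, start, width, num_classes):
    """Return (lower limit, upper limit, frequency, cumulative frequency) rows."""
    # Map each integer value to the index of the class it falls into.
    counts = Counter((x - start) // width for x in data)
    rows, cumulative = [], 0
    for i in range(num_classes):
        lower = start + i * width
        upper = lower + width - 1
        cumulative += counts[i]  # Counter returns 0 for empty classes
        rows.append((lower, upper, counts[i], cumulative))
    return rows


# Assumed classes: 1-8, 9-16, 17-24, 25-32, 33-40.
table = grouped_frequency(distances, start=1, width=8, num_classes=5)
for lower, upper, f, cf in table:
    print(f"{lower:>2}-{upper:<2}  f = {f:>2}  cf = {cf:>2}")
```

With these assumed classes the frequencies come out as 15, 12, 7, 5, and 1, which sum to the 40 people surveyed.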

Cumulative Frequency for a table whose classes are in increasing order is the sum
of the frequencies for that class and all previous classes. It is used when cumulative
totals are desired.

Cumulative frequency for a table whose classes are in decreasing order is the sum of
the frequencies for that class and all succeeding classes.
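
The two definitions of cumulative frequency can be illustrated with a short Python sketch. The frequencies are those of Example 1 (5, 16, 11, 40, 8); the variable names are illustrative only.

```python
from itertools import accumulate

# Class frequencies from Example 1, listed from the lowest class to the highest.
frequencies = [5, 16, 11, 40, 8]

# Sum of the frequency of each class and all previous classes.
cf_previous = list(accumulate(frequencies))

# Sum of the frequency of each class and all succeeding classes:
# accumulate from the last class backwards, then restore the original order.
cf_succeeding = list(accumulate(reversed(frequencies)))[::-1]

print(cf_previous)    # [5, 21, 32, 72, 80]
print(cf_succeeding)  # [80, 75, 59, 48, 8]
```

In both directions the final total equals n = 80, which is a quick check that no class was skipped.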

2. Histograms, Frequency Polygons, and Ogives

Most people comprehend the meaning of data more easily when they are presented
graphically rather than numerically.

2.1 Histogram displays the data using vertical bars of various heights to represent the
frequencies.

[Figure: histogram; horizontal axis: class boundaries, vertical axis: frequency]

2.2 Frequency Polygon displays the data by using lines that connect points plotted
for the frequencies at the midpoints of the classes.

[Figure: frequency polygon; horizontal axis: class midpoints, vertical axis: frequency]

2.3 Ogive a graph that represents the cumulative frequencies of the classes.

[Figure: ogive; horizontal axis: class boundaries, vertical axis: cumulative frequency]

2.4 Pie Graph a circle that is divided into sections or wedges according to the
percentage of frequencies in each category of the distribution.

Example 3: In a survey, 500 families were asked the question "Where are you
planning to spend your vacation this summer?" The survey resulted in the following
distribution and the corresponding pie graph.

Place               Number of People   Percentage
Boracay                   200             40%
Palawan                   125             25%
Tagaytay                   90             18%
Baguio                     35              7%
None of the Above          50             10%

[Figure: pie graph with wedges Boracay 40%, Palawan 25%, Tagaytay 18%, None of the above 10%, Baguio 7%]
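
The percentages in Example 3 can be reproduced from the counts, as in the Python sketch below. The central angle of each wedge (the fraction of the total times 360 degrees) is not given in the example and is added here only as an illustration of how the wedges would be drawn; the variable names are likewise illustrative.

```python
# Survey counts from Example 3.
survey = {
    "Boracay": 200,
    "Palawan": 125,
    "Tagaytay": 90,
    "Baguio": 35,
    "None of the Above": 50,
}

total = sum(survey.values())

# For each place: (percentage of the total, central angle of the wedge in degrees).
shares = {
    place: (100 * count / total, 360 * count / total)
    for place, count in survey.items()
}

for place, (percent, angle) in shares.items():
    print(f"{place:<18} {percent:5.1f}%  {angle:6.1f} degrees")
```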

3. Data Description

Measures of Central Tendency (Averages) focus on the average or center of the data.


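Ungrouped Data: Mean:

$$\mu = \frac{\sum X}{N}$$ (population mean)

$$\bar{x} = \frac{\sum X}{n}$$ (sample mean)

N = total number of observations in the population
n = total number of observations in the sample

Grouped Mean:

$$\bar{x} = \frac{\sum f \cdot x_m}{n}$$

where f is the class frequency and $x_m$ is the class mark (midpoint).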

Ungrouped Data: The median is the midpoint of the data array. Before finding this
value, the data is arranged in order, from least to greatest or vice versa. The median will
either be a specific value or will fall between two values.

Grouped Median:

$$Md = \left(\frac{\frac{n}{2} - cf}{f}\right) w + L_{md}$$

Where n = sum of frequencies
      cf = cumulative frequency of the class preceding the median class
      f = frequency of the median class
      w = class width
      $L_{md}$ = lower boundary of the median class

The median class is the one that contains the midpoint of the data.

Ungrouped Data: The mode is the value that occurs most often in the data set. A data
set can have more than one mode or none at all.

Grouped Mode:

$$Mo = L_{Mo} + \left(\frac{d_1}{d_1 + d_2}\right) w$$

Where $L_{Mo}$ = lower boundary of the modal class
      w = class width
      $d_1$ = difference between the frequency of the modal class and that of the class
      preceding it
      $d_2$ = difference between the frequency of the modal class and that of the class
      succeeding it
The modal class is the class with the largest frequency.
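
The grouped mean, median, and mode formulas can be tried out on the Weekly Expenses table of Example 1, as in the Python sketch below. It assumes integer-valued class limits, so each lower boundary is taken as the lower limit minus 0.5, and it treats a missing neighbor of the modal class as having frequency 0. The function names are illustrative, not part of the handout.

```python
# (lower limit, upper limit, frequency) for each class in Example 1.
classes = [(101, 300, 5), (301, 500, 16), (501, 700, 11),
           (701, 900, 40), (901, 1100, 8)]


def grouped_mean(classes):
    """Sum of f * class mark, divided by n."""
    n = sum(f for _, _, f in classes)
    return sum(f * (lo + hi) / 2 for lo, hi, f in classes) / n


def grouped_median(classes):
    """Md = ((n/2 - cf) / f) * w + lower boundary of the median class."""
    n = sum(f for _, _, f in classes)
    width = classes[1][0] - classes[0][0]  # difference of consecutive lower limits
    cf_before = 0
    for lo, _, f in classes:
        if cf_before + f >= n / 2:  # first class that reaches the midpoint
            return (n / 2 - cf_before) / f * width + (lo - 0.5)
        cf_before += f


def grouped_mode(classes):
    """Mo = lower boundary of the modal class + d1 / (d1 + d2) * w."""
    width = classes[1][0] - classes[0][0]
    freqs = [f for _, _, f in classes]
    i = freqs.index(max(freqs))  # modal class: largest frequency
    d1 = freqs[i] - (freqs[i - 1] if i > 0 else 0)
    d2 = freqs[i] - (freqs[i + 1] if i < len(freqs) - 1 else 0)
    return (classes[i][0] - 0.5) + d1 / (d1 + d2) * width


print(grouped_mean(classes))
print(grouped_median(classes))
print(round(grouped_mode(classes), 2))
```

Under these assumptions the sketch gives a mean of 675.5, a median of 740.5, and a mode of about 795.58, all of which fall in or near the 701 - 900 class that holds half of the employees.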

Midrange: This is a rough estimate of the middle value.

$$MR = \frac{\text{lowest value} + \text{highest value}}{2}$$

Weighted Mean: This is used to find the mean of a data set whose values are not
equally represented. The weighted mean is found by multiplying each value by its
corresponding weight and dividing the sum of the products by the sum of the weights.

$$\bar{x}_w = \frac{\sum w x}{\sum w}$$

Geometric Mean: $$G.M. = \sqrt[n]{x_1 \cdot x_2 \cdots x_n}$$


Harmonic Mean: $$H.M. = \frac{n}{\sum \frac{1}{x}}$$

Example 4: A recent survey of a new cola reported the following percentages of people who
liked the taste. Find the weighted mean of the percentages.

Area   % favored   No. surveyed
1         40          1000
2         30          3000
3         50           800
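
A minimal Python sketch of the weighted mean for Example 4 follows, using the number surveyed in each area as the weight. The function name `weighted_mean` is illustrative.

```python
def weighted_mean(values, weights):
    """Sum of value * weight, divided by the sum of the weights."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


percent_favored = [40, 30, 50]       # % favored in areas 1, 2, 3
number_surveyed = [1000, 3000, 800]  # weights

result = weighted_mean(percent_favored, number_surveyed)
print(round(result, 2))  # 35.42
```

The result, 170000 / 4800, is about 35.42%. It lies closer to 30% than the unweighted average of 40% would, because Area 2 contributes by far the most respondents.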

Shapes of Distribution
a. Positively Skewed Distribution: the majority of the data values fall to
the left of the mean and cluster at the lower end of the distribution.
b. Symmetrical Distribution: the data values are evenly distributed on
both sides of the mean. Also, when the distribution is unimodal, the
mean, median, and mode are the same and are at the center of the
distribution.
c. Negatively Skewed Distribution: the majority of the data values fall
to the right of the mean and cluster at the upper end of the
distribution.

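[Figure: three distribution curves. Positively skewed: mode, median, and mean in that order from left to right. Symmetrical: mean, median, and mode coincide at the center. Negatively skewed: mean, median, and mode in that order from left to right.]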

Measures of Variation for Grouped Data



Range: the difference between the largest and the smallest values in a given
data set.
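Variance and Standard Deviation

Ungrouped Data:

$$\sigma^2 = \frac{\sum (X - \mu)^2}{N}$$ (population variance)

$$\sigma = \sqrt{\sigma^2}$$ (population standard deviation)

Unbiased estimator of the population variance:

$$s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$$ (sample variance)

Where $\bar{x}$ = sample mean; x = observed value; n = sample size

s = sample standard deviation = $\sqrt{s^2}$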

Grouped Data:

$$s^2 = \frac{n \sum f x^2 - \left(\sum f x\right)^2}{n(n - 1)}$$

Where: x = class midpoint

Example 5: For 108 randomly selected high school students, the following IQ
frequency distribution was obtained.

Class Limits   Frequency
90-98              6
99-107            22
108-116           43
117-125           28
126-134            9
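
The handout gives Example 5 without a stated task. As an illustration, the Python sketch below applies the grouped-data sample variance, s^2 = (n * sum(f x^2) - (sum(f x))^2) / (n (n - 1)), to this table, with x taken as the class midpoint. The function name is illustrative.

```python
from math import sqrt

# (lower limit, upper limit, frequency) for each IQ class in Example 5.
iq_classes = [(90, 98, 6), (99, 107, 22), (108, 116, 43),
              (117, 125, 28), (126, 134, 9)]


def grouped_variance(classes):
    """Sample variance for grouped data, using class midpoints as x."""
    n = sum(f for _, _, f in classes)
    sum_fx = sum(f * (lo + hi) / 2 for lo, hi, f in classes)
    sum_fx2 = sum(f * ((lo + hi) / 2) ** 2 for lo, hi, f in classes)
    return (n * sum_fx2 - sum_fx ** 2) / (n * (n - 1))


variance = grouped_variance(iq_classes)
std_dev = sqrt(variance)
print(round(variance, 2), round(std_dev, 2))
```

Here n = 108, the sum of f x is 12204 (so the grouped mean is 113), and the sum of f x^2 is 1387854, which gives a variance of about 82.26 and a standard deviation of about 9.07.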

Coefficient of Variation: a statistic that allows us to compare the variability of two
data sets that have different units of measurement.

For samples: $$CV = \frac{s}{\bar{x}} \cdot 100\%$$

For populations: $$CV = \frac{\sigma}{\mu} \cdot 100\%$$

The data set with the larger CV is more variable.

Coefficient of Skewness

A measure used to determine the skewness of a distribution is called Pearson's
coefficient of skewness. The formula is

$$SK = \frac{3(\bar{X} - Md)}{s}$$

Where $\bar{X}$ = mean, Md = median, s = standard deviation

When the distribution is symmetrical, the coefficient is zero; when the
distribution is positively skewed, the coefficient is positive; when the distribution
is negatively skewed, the coefficient is negative.

Example 6: Find the coefficient of skewness of a distribution with mean 10,
median 8, and standard deviation 3.
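
Example 6 can be checked with a few lines of Python that apply SK = 3(mean - median) / s directly. The function name `pearson_skewness` is illustrative.

```python
def pearson_skewness(mean, median, std_dev):
    """Pearson coefficient of skewness: 3 * (mean - median) / s."""
    return 3 * (mean - median) / std_dev


sk = pearson_skewness(mean=10, median=8, std_dev=3)
print(sk)  # 2.0
```

Since 3(10 - 8) / 3 = 2 is positive, the distribution in Example 6 is positively skewed.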

3.2.5 Measure of Kurtosis


Even if the curves of distributions have the same coefficient of skewness,
these curves may still differ in the sharpness of their peaks. The following figures
show different types of symmetrical curves.

[Figure: three symmetrical curves: mesokurtic (normal), leptokurtic (more peaked), platykurtic (flat-topped)]

Ungrouped Data: $$K = \frac{\sum (x - \bar{x})^4}{n s^4}$$

Grouped Data: $$K = \frac{\sum f (x - \bar{x})^4}{n s^4}$$

Where x = class midpoint and s = sample standard deviation

A distribution is said to be:
   Mesokurtic if K = 3
   Leptokurtic if K > 3
   Platykurtic if K < 3

Example 7: Calculate the measure of kurtosis for the data in Example 5.
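
A hedged Python sketch for Example 7 follows. It assumes the grouped kurtosis has the form K = sum(f (x - mean)^4) / (n s^4), with x the class midpoint and s the sample standard deviation (divisor n - 1); the formula in the source is partly illegible, so a different denominator convention would change the value slightly. The function name is illustrative.

```python
# (lower limit, upper limit, frequency) for each IQ class in Example 5.
iq_classes = [(90, 98, 6), (99, 107, 22), (108, 116, 43),
              (117, 125, 28), (126, 134, 9)]


def grouped_kurtosis(classes):
    """K = sum(f * (x - mean)^4) / (n * s^4), assumed form; x = class midpoint."""
    freqs = [f for _, _, f in classes]
    mids = [(lo + hi) / 2 for lo, hi, _ in classes]
    n = sum(freqs)
    mean = sum(f * x for f, x in zip(freqs, mids)) / n
    # Sample variance s^2 (divisor n - 1); s^4 is its square.
    s2 = sum(f * (x - mean) ** 2 for f, x in zip(freqs, mids)) / (n - 1)
    fourth = sum(f * (x - mean) ** 4 for f, x in zip(freqs, mids))
    return fourth / (n * s2 ** 2)


k = grouped_kurtosis(iq_classes)
print(round(k, 2))
```

Under this assumed form K is roughly 2.56, which is below 3, so the IQ distribution would be classified as platykurtic.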

Measures of Position for Grouped Data

Standard Scores or Z-scores measure the distance between an observation and the mean,
in units of standard deviation.

$$z = \frac{\text{value} - \text{mean}}{\text{standard deviation}} = \frac{x - \bar{x}}{s}$$

If the z-score is positive, the score is above the mean. If z = 0, the score equals the
mean. If z < 0, the score is below the mean.

Example 8: An IQ test has a mean of 105 and a standard deviation of 20. Find the
corresponding z score for each IQ.
a) 88 b) 122 c) 110
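
Example 8 can be worked through with a short Python sketch that applies z = (value - mean) / standard deviation to each IQ. The function and variable names are illustrative.

```python
def z_score(value, mean, std_dev):
    """Distance between a value and the mean, in units of standard deviation."""
    return (value - mean) / std_dev


# IQ test from Example 8: mean 105, standard deviation 20.
z_scores = {iq: z_score(iq, mean=105, std_dev=20) for iq in (88, 122, 110)}

for iq, z in z_scores.items():
    position = "above" if z > 0 else "below" if z < 0 else "equal to"
    print(f"IQ {iq}: z = {z:+.2f} ({position} the mean)")
```

The z-scores are -0.85, 0.85, and 0.25, so an IQ of 88 lies as far below the mean as 122 lies above it.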

Grouped Data
The quartiles, deciles, and percentiles can be determined using the following
formula.

$$L + \left(\frac{kn - cf}{f}\right) w$$

Where k is equal to: i/4 for quartiles; i/10 for deciles; i/100 for percentiles
      i = index of the quartile, decile, or percentile (the ith one)
      L = lower boundary of the quartile, decile, or percentile class
      n = total number of observations
      w = class width
      cf = cumulative frequency of the class preceding the quartile, decile, or percentile class
      f = frequency of the quartile, decile, or percentile class

Example 9: Find the third quartile, 4th decile and 7th percentile for the given
frequency distribution below.

Class Boundaries   Frequency   cf
52.5-63.5              6        6
63.5-74.5             12       18
74.5-85.5             25       43
85.5-96.5             28       71
96.5-107.5            14       85
107.5-118.5            5       90
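
The quantile formula, L + ((kn - cf) / f) w, can be applied to Example 9 as in the Python sketch below. It reads cf as the cumulative frequency of the class preceding the quantile class and picks the quantile class as the first one whose cumulative frequency reaches kn; both readings, and the function name, are assumptions made for illustration.

```python
# (lower boundary, upper boundary, frequency) for each class in Example 9.
classes = [(52.5, 63.5, 6), (63.5, 74.5, 12), (74.5, 85.5, 25),
           (85.5, 96.5, 28), (96.5, 107.5, 14), (107.5, 118.5, 5)]


def grouped_quantile(classes, i, parts):
    """i-th quantile when the data are split into `parts` parts (4, 10, or 100)."""
    n = sum(f for _, _, f in classes)
    target = i * n / parts  # this is k * n with k = i / parts
    cf_before = 0           # cumulative frequency of the preceding classes
    for lower, upper, f in classes:
        if cf_before + f >= target:  # first class that reaches k * n
            return lower + (target - cf_before) / f * (upper - lower)
        cf_before += f
    raise ValueError("quantile index is out of range")


q3 = grouped_quantile(classes, 3, 4)     # third quartile
d4 = grouped_quantile(classes, 4, 10)    # 4th decile
p7 = grouped_quantile(classes, 7, 100)   # 7th percentile
print(round(q3, 3), round(d4, 3), round(p7, 3))
```

With n = 90 and w = 11, this gives a third quartile of 95.125, a 4th decile of about 82.42, and a 7th percentile of about 63.775.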
