
Amity Business School

Lecture 2: Diagrammatic & Graphical Presentation of Data

Department of Decision Sciences



Objective of the Lecture

• To introduce diagrammatic and graphical statistical methods that
allow managers to summarize data visually and produce useful
information.
• To understand the importance of the graphical methods commonly
used to summarize both qualitative and quantitative data.
• To know how these graphs are prepared and how they should be
interpreted.

Introduction
• The most common and simple forms of pictorial representation of
data are:
(i) Bar diagram
(ii) Histogram
(iii) Pie diagram
(iv) Stem-leaf display
(v) Frequency polygon
(vi) Ogive
• Though the first two approaches above look similar, the bar
diagram is meant for categorical data, whereas the histogram and
stem-leaf display are meant solely for quantitative data. The
pie diagram, on the other hand, can be used for both types of data.

Example 1.1: University Placement Office Survey

• The student placement office at a university conducted a survey of
last year's business school graduates to determine the general areas
in which the graduates found jobs. The placement office intended to
use the resulting information to help decide where to concentrate
its efforts in attracting companies to campus to conduct job
interviews. Each graduate was asked in which area he or she found
a job. The areas of employment are
Accounting
Finance
General management
Marketing/Sales
Other
The responses were recorded using the codes 1, 2, 3, 4, and 5,
respectively. Construct a frequency and relative frequency
distribution for these data and graphically summarize the data by
producing a bar chart and a pie chart.

Data on the next slide…



Data

1 1 2 4 1 4 2 4 5 2 5 4 1 1 4 2 3
4 5 1 4 1 3 2 4 3 1 2 5 4 2 3 3 2
5 4 1 4 1 4 5 5 1 4 2 4 2 2 5 2 5
1 5 3 4 1 4 1 2 1 3 4 2 4 5 5 1 2
2 1 4 3 3 1 4 1 1 1 1 2 4 1 4 3 2
2 4 1 1 2 4 4 4 5 4 5 1 1 3 2 1 3
3 1 5 3 1 3 2 1 1 1 5 3 2 3 4 2 5
1 3 1 1 1 4 2 4 4 2 1 4 4 5 5 2 1
4 4 2 5 3 2 4 1 1 4 3 2 4 2 3 1 1
1 2 1 1 4 1 4 3 4 4 2 3 1 4 5 3 3
1 4 1 2 4 1 4 5 2 2 2 5 4 4 4 1 4
4 1 4 4 1 2 4 2 2 3 2 1 4 4 3 4 1
3 4 5 3 3 1 5 1 4 2 2 1 5 5 4 1 1
1 4 3 2 2 1 1 4 2 3 1 3 3 2 2 3

4 2 2 1 4 2 3 1 5 1 1 2 1 1 1

• Scanning the raw data yields no real information. Extracting
information requires the application of a statistical or graphical
technique. To choose the appropriate technique we must first
identify the type of data. In this example the data are nominal
because the numbers represent categories. The only calculation
permitted on nominal data is to count the number of occurrences of
each category. The list of categories and their counts constitutes
the frequency distribution. The relative frequency distribution is
produced by converting the frequencies into proportions. The
frequency and relative frequency distributions are combined in
Table 1.1.

Table 1.1 Frequency and Relative Frequency Distribution for Example 1.1

Area                  Frequency   Relative Frequency
Accounting                73           28.85%
Finance                   52           20.55%
General Management        36           14.23%
Marketing / Sales         64           25.30%
Other                     28           11.07%
Total                    253          100.00%

Interpretation: Accounting is the most popular area of
employment, followed by marketing/sales, finance, general
management, and other areas.
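
As a cross-check on Table 1.1, the relative frequencies can be recomputed from the counts. The following is a minimal Python sketch, not part of the original lecture; the variable names are illustrative, and the counts are copied from the table rather than tallied from the raw data.

```python
# Frequencies copied from Table 1.1 (not re-tallied from the raw codes).
frequencies = {
    "Accounting": 73,
    "Finance": 52,
    "General Management": 36,
    "Marketing / Sales": 64,
    "Other": 28,
}

total = sum(frequencies.values())

# Relative frequency = class frequency / total number of observations.
relative = {area: count / total for area, count in frequencies.items()}

for area, count in frequencies.items():
    print(f"{area:<20}{count:>5}{relative[area]:>9.2%}")
print(f"{'Total':<20}{total:>5}{sum(relative.values()):>9.2%}")
```

Each proportion is the category count divided by the total of 253, which is the only arithmetic the nominal scale allows.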

MS Excel Commands for Frequency

• Type the data into one or more columns.
• Activate any empty cell. Click fx and select the category Statistical
and the function COUNTIF.
• In the Range box specify the input range of the data (A1:A254). In
the Criteria box type the code you want to count (1, 2, 3, 4, or 5).
The frequency will appear in the dialog box. Change the criteria to
produce the frequency of another category.
• Alternatively, type the following into any active cell:
=COUNTIF([Data range], [Criteria])
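
For readers without a spreadsheet at hand, the same tally can be sketched in Python. This is an illustrative analogue of COUNTIF, not a command from the lecture. To keep it short it uses only the first row of the data slide (17 codes), so the counts below are for that row and not for the full sample.

```python
from collections import Counter

# First row of the data slide only; the full sample is much longer.
codes = [1, 1, 2, 4, 1, 4, 2, 4, 5, 2, 5, 4, 1, 1, 4, 2, 3]

# Code-to-area mapping as stated in Example 1.1.
labels = {
    1: "Accounting",
    2: "Finance",
    3: "General management",
    4: "Marketing/Sales",
    5: "Other",
}

# Counter plays the role of one COUNTIF call per code.
counts = Counter(codes)

for code in sorted(labels):
    print(f"{code}  {labels[code]:<20}{counts[code]:>3}")
```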

Minitab Commands for Frequency

• Type the data into one or more columns.
• Click Stat, Tables, and Tally Individual Variables.
• Type or use the Select button to specify the name of the variable or
the column where the data are stored in the Variable box (Area).
Under Display click Counts and Percents.

SPSS Commands for Frequency

• Click on Analyze at the SPSS menu bar.
• Click on Descriptive Statistics, followed by Frequencies.
• In the dialogue box that appears, select the variables for which
frequency tables are required by clicking on the right arrow to
transfer them from the variable list on the left to the Variables box
on the right.
• Click OK to get the tables with counts and percentages for each of
the selected variables.

Bar and Pie Chart

• Graphical techniques generally catch a reader's eye more quickly
than a table of numbers does. Two graphical techniques can be used
to display the results shown in the table. A bar chart is often used to
display frequencies; a pie chart graphically shows relative
frequencies.

Bar Chart for Example 1.1



Pie Chart
• If we wish to emphasize the relative frequencies, we draw a pie
chart instead of a bar chart. A pie chart is simply a circle
subdivided into slices that represent the categories. It is drawn so
that the size of each slice is proportional to the percentage
corresponding to that category. For example, since the entire circle
is composed of 360 degrees, a category that contains 25% of the
observations is represented by a slice of the pie that contains 25%
of 360 degrees, which is equal to 90 degrees. The number of
degrees for each category in Example 1.1 is shown in Table 1.2.

Table 1.2 Proportion in Each Category in Example 1.1

Area                  Proportion of Graduates   Slice of Pie (degrees)
Accounting                  28.85%                    103.9
Finance                     20.55%                     74.0
General Management          14.23%                     51.2
Marketing / Sales           25.30%                     91.1
Other                       11.07%                     39.8
Total                      100.00%                    360.0
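
The slice angles in Table 1.2 follow from multiplying each proportion by 360 degrees. A minimal Python sketch of that arithmetic is given here as an illustration; the variable names are assumptions, and the counts come from Table 1.1.

```python
# Counts from Table 1.1.
frequencies = {
    "Accounting": 73,
    "Finance": 52,
    "General Management": 36,
    "Marketing / Sales": 64,
    "Other": 28,
}

total = sum(frequencies.values())

# Slice angle = proportion of observations * 360 degrees.
angles = {area: 360 * count / total for area, count in frequencies.items()}

for area, angle in angles.items():
    print(f"{area:<20}{angle:>7.1f}")
print(f"{'Total':<20}{sum(angles.values()):>7.1f}")
```

Working from the raw counts rather than the rounded percentages avoids small rounding differences in the last decimal place.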



Pie Chart for Example 1.1



MS Excel Commands for Bar and Pie Chart

• After creating the frequency distribution, highlight the column of
frequencies.
• For a bar chart click the Chart Wizard, Column, and Finish. For a
pie chart click Pie instead of Column.
• Click Chart (on the tool bar) and Chart Options, and make whatever
changes you think make the chart look best.

Minitab Commands for Bar and Pie Chart

For a bar chart:
• Click Graph and Bar Chart.
• In the Bars represent box click Counts of unique values and select Simple.
• Type or use the Select button to specify the variable in the Variables box
(Area).
• We clicked Labels and added the title, then clicked Data Labels and used
y-value labels to display the frequencies at the top of the columns.
For a pie chart:
• Click Graph and Pie Chart.
• Click Chart raw data and in the Categorical variables box type or use the
Select button to specify the variable (Area).
• We clicked Labels and added the title. We clicked Slice Labels and
clicked Category name and Percent.

SPSS Commands for Bar and Pie Chart

• Click on Analyze at the SPSS menu bar.
• Click on Descriptive Statistics, followed by Frequencies.
• In the dialogue box that appears, select the variables for which
frequency tables are required by clicking on the right arrow to transfer
them from the variable list on the left to the Variables box on the right.
• Click OK to get the tables with counts and percentages for each of the
selected variables.
• Charts can be requested by clicking on Charts in the main dialogue box,
selecting the required type of chart, and clicking Continue before
clicking OK.
• Alternatively, click on Graphs at the SPSS menu bar, followed by Chart
Builder.

Histogram
• Example 1.2: A random sample of 40 days gave the following
information about the total number of people treated per day at a
community hospital emergency room (ER).

40 35 42 6 13 50 60 27
8 42 53 17 25 23 24 12
26 32 28 28 31 29 30 28
21 46 22 19 20 30 31 30
36 30 40 38 30 29 31 41

Here,
the population = the collection of days over a long period of time,
the sample = the collection of 40 days, and
the (quantitative) variable = the number of people treated at the ER
per day.

Since the variable is quantitative and can take many possible values
(many more than a typical categorical variable), it does not make
sense to tabulate frequencies for distinct entries (we might end up with
40 distinct entries, each with frequency 1). So, we first
find the minimum (min) and maximum (max) entries to get the spread
of the variable in the sample.
There is a systematic way of finding the min and max. First, find
the min and max of each column, which is easy to do since there
are far fewer entries in a single column than in the whole
array. Next, find
(overall) min = minimum of column minimums
and
(overall) max = maximum of column maximums.
By this method, we get
Column minimums = 8, 30, 22, 6, 13, 23, 24 and 12;
Column maximums = 40, 46, 53, 38, 31, 50, 60 and 41;
and hence
min = 6 and max = 60.
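
The column-by-column search can be sketched in a few lines of Python. This is an illustration under the assumption that the ER data are stored row by row as in Example 1.2; the names are not from the lecture.

```python
# ER data from Example 1.2, one inner list per printed row.
er_rows = [
    [40, 35, 42, 6, 13, 50, 60, 27],
    [8, 42, 53, 17, 25, 23, 24, 12],
    [26, 32, 28, 28, 31, 29, 30, 28],
    [21, 46, 22, 19, 20, 30, 31, 30],
    [36, 30, 40, 38, 30, 29, 31, 41],
]

# zip(*rows) transposes the array so each tuple is one column.
columns = list(zip(*er_rows))

column_mins = [min(col) for col in columns]
column_maxs = [max(col) for col in columns]

# Overall extremes are the extremes of the column extremes.
overall_min = min(column_mins)
overall_max = max(column_maxs)

print("Column minimums:", column_mins)
print("Column maximums:", column_maxs)
print("min =", overall_min, "max =", overall_max)
```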

• Note that the unit here (i.e., the smallest possible increment of the
quantitative variable) is 1 (one patient). We modify the range (6,
60) by extending it by one half of a unit on both sides. This is called
the modified range, and for the present data set the modified range is
(5.5, 60.5). The lower limit of the modified range is 5.5, and the
upper limit is 60.5. The idea behind the modified range is that it
properly includes the boundary values (6 and 60). The length (L) of
the modified range is
L = upper limit – lower limit
= 60.5 – 5.5 = 55

• This length L is now divided into several subintervals, which gives
us a few classes. The number of classes, say k, is a convenient
number, usually taken between 5 and 8. For the present case take k
= 5, and then
l = length of each class = L/k = 55/5 = 11
(The notation l denotes the length of each class or subinterval.)
Therefore, we can divide the modified range (5.5, 60.5) into
successive contiguous classes: (5.5, 5.5 + l) = (5.5, 16.5), (16.5,
16.5 + l) = (16.5, 27.5), (27.5, 27.5 + l) = (27.5, 38.5), (38.5, 38.5 +
l) = (38.5, 49.5) and (49.5, 49.5 + l) = (49.5, 60.5).
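
The construction of the modified range and the class boundaries can be sketched as follows. This is a minimal Python illustration with assumed variable names; the inputs (unit 1, min 6, max 60, k = 5) are the values used in the example.

```python
unit = 1                  # smallest possible increment of the variable
data_min, data_max = 6, 60
k = 5                     # number of classes chosen in the example

# Extend the range by half a unit on each side.
lower = data_min - unit / 2
upper = data_max + unit / 2

length = upper - lower    # L
width = length / k        # l, the length of each class

# k + 1 boundaries define k contiguous classes.
boundaries = [lower + i * width for i in range(k + 1)]
classes = list(zip(boundaries[:-1], boundaries[1:]))

print("L =", length, "l =", width)
for lo, hi in classes:
    print(f"({lo}, {hi})")
```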
Table 1.3: Frequency table for number of individuals treated at ER per day

Class           Frequency
5.5 – 16.5          4
16.5 – 27.5        10
27.5 – 38.5        17
38.5 – 49.5         6
49.5 – 60.5         3
Total              40
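
The class frequencies in Table 1.3 can be reproduced by counting how many observations fall between each pair of boundaries. The Python sketch below is illustrative only. Because the boundaries sit at half-units, no observation can coincide with a boundary, so strict inequalities are safe here.

```python
# ER data from Example 1.2, flattened row by row.
er_counts = [
    40, 35, 42, 6, 13, 50, 60, 27,
    8, 42, 53, 17, 25, 23, 24, 12,
    26, 32, 28, 28, 31, 29, 30, 28,
    21, 46, 22, 19, 20, 30, 31, 30,
    36, 30, 40, 38, 30, 29, 31, 41,
]

boundaries = [5.5, 16.5, 27.5, 38.5, 49.5, 60.5]

# Count the observations strictly between each pair of boundaries.
frequencies = [
    sum(lo < x < hi for x in er_counts)
    for lo, hi in zip(boundaries, boundaries[1:])
]

for (lo, hi), f in zip(zip(boundaries, boundaries[1:]), frequencies):
    print(f"{lo:>5} - {hi:<5}{f:>4}")
print(f"{'Total':<13}{sum(frequencies):>4}")
```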

Histogram for Example 1.2

• Now we use the emergency room data to illustrate a histogram. Draw and label
the x and y axes. As in the bar diagram, the y-axis usually represents the relative
frequency, and the x-axis represents the class (or interval) boundaries. Using
the relative frequencies as the heights, draw a vertical bar for each class.

• Given a frequency table (with a fixed number of classes and fixed class
boundaries), the histogram of a dataset is unique (unlike the bar
diagram). This is due to the natural ordering of the classes. Another
departure from the bar diagram is the absence of a gap between
two successive classes.
• A bar graph and a histogram are essentially the same thing; both are
graphical presentations of the data in a frequency distribution. A
histogram is just a bar graph with no separation between bars. The
separation between bars is appropriate for qualitative data because
the data are discrete; no intermediate values are possible. For
discrete quantitative data, a separation between bars is also
appropriate.

Frequency Polygon
• The histogram gives rise to another simple concept called the relative
frequency polygon. Find the midpoint of each class (the midpoint of a
class is found by adding the two endpoints of the class and then
dividing by 2), and then plot the relative frequencies (on the y-axis)
against the midpoints (on the x-axis). Connect the adjacent points with
straight line segments; the resulting diagram is a frequency
polygon. A frequency polygon shows the trend in the data in terms of
frequency (which is also evident in the histogram).
• From the frequency polygon in Figure 1.3 it is clear that for the
emergency room dataset, the frequency or relative frequency increases
as the number of patients per day increases to 33, and beyond this the
frequency starts falling. Roughly, we see that there are more days
with about 25 patients than days with about 15 patients. Similarly,
there are fewer days with about 45 patients than days with about 35 patients.
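
The points of the relative frequency polygon can be computed directly from the class boundaries and frequencies of Table 1.3. The Python sketch below is an illustration with assumed names; it only produces the (midpoint, relative frequency) pairs and leaves the plotting to whatever tool is available.

```python
boundaries = [5.5, 16.5, 27.5, 38.5, 49.5, 60.5]
frequencies = [4, 10, 17, 6, 3]          # from Table 1.3
n = sum(frequencies)

# Midpoint of a class = (lower endpoint + upper endpoint) / 2.
midpoints = [(lo + hi) / 2 for lo, hi in zip(boundaries, boundaries[1:])]

# Relative frequency = class frequency / sample size.
relative = [f / n for f in frequencies]

# Points to be joined by straight line segments.
points = list(zip(midpoints, relative))
for x, y in points:
    print(f"midpoint {x:>4}  relative frequency {y:.3f}")

# The class midpoint at which the polygon peaks.
peak_midpoint = max(points, key=lambda p: p[1])[0]
print("peak at midpoint", peak_midpoint)
```

The peak at the midpoint 33 is the value referred to in the interpretation above.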

Relative Frequency Polygon for the data in Example 1.2

If a frequency polygon with a single hump has a longer right tail than
left tail, the frequency polygon (or the histogram) is called positively
skewed; if the left tail is longer, it is called negatively skewed. If a
frequency polygon with a single hump has approximately equal left and
right tails (i.e., looks symmetric), it is said to have a bell shape.

MS Excel Commands for Histogram

• Type the data into one column. In another column type the upper limits of the
class intervals. Excel calls them bins.
• Click Tools, Data Analysis ..., and Histogram. If Data Analysis does not appear
in the menu box, you have to install it by using Excel Options and Add-ins.
• Specify the Input Range and the Bin Range. Click Chart Output. Click Labels if
the first row contains names.
• To remove the gaps, place the cursor over one of the rectangles and click the right
button of the mouse. Click (with the left button) Format Data Series .... Click
Options, move the pointer to Gap Width, and change the number from 150 to 0.
• Click Chart and Chart Options ... to make cosmetic changes.
• Note that the numbers along the horizontal axis represent the upper limits of each
class, although they appear to be placed in the centers. Except for the first class,
Excel counts the number of observations in each class that are greater than the
lower limit and less than or equal to the upper limit.

Minitab Commands for Histogram

• Note that Minitab counts the number of observations in each class that are
strictly less than the upper limit and greater than or equal to the lower
limit.
• Type or import the data into one column.
• Click Graph, Histogram ..., and Simple.
• Type or use the Select button to specify the name of the variable in the
Graph variables box. Click Data View.
• Click Data Display and Bar. Minitab will create a histogram using its own
choices of class intervals.
• To choose your own classes, double-click the horizontal axis. Click
Binning.
• Under Interval Type choose Cutpoint. Under Interval Definition choose
Midpoint/Cutpoint positions and type in your choices.
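
The Excel and Minitab notes describe two different conventions for observations that fall exactly on a class limit: greater than the lower limit and at most the upper limit, versus at least the lower limit and strictly less than the upper limit. The Python sketch below illustrates the difference. The data and cutpoints are hypothetical, the function names are made up for this example, and values outside the cutpoints are simply dropped, which simplifies how either package treats its first and last classes.

```python
from bisect import bisect_left, bisect_right

def count_right_closed(values, cutpoints):
    """Count values in classes of the form (lower, upper]."""
    counts = [0] * (len(cutpoints) - 1)
    for v in values:
        i = bisect_left(cutpoints, v) - 1
        if 0 <= i < len(counts):
            counts[i] += 1
    return counts

def count_left_closed(values, cutpoints):
    """Count values in classes of the form [lower, upper)."""
    counts = [0] * (len(cutpoints) - 1)
    for v in values:
        i = bisect_right(cutpoints, v) - 1
        if 0 <= i < len(counts):
            counts[i] += 1
    return counts

# Hypothetical data chosen so that several values sit on a cutpoint.
values = [10, 15, 20, 20, 25, 30]
cutpoints = [10, 20, 30]

right_closed = count_right_closed(values, cutpoints)
left_closed = count_left_closed(values, cutpoints)

print("(lower, upper]:", right_closed)
print("[lower, upper):", left_closed)
```

With half-unit boundaries such as 5.5, 16.5, and so on, no observation lands on a limit and the two conventions give identical counts.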

Stem and Leaf Display

• The stem-leaf display is an extremely useful way of studying the
structure of data for a quantitative variable. A frequency table and the
corresponding histogram provide a useful organization and pictorial
representation of data. However, in a frequency table (like Table
1.3) we lose the individual values of the observations. A stem-leaf
display is a simple device that groups the whole dataset and
produces a picture like a histogram or bar diagram, yet allows us to
recover the original dataset if required. We illustrate this with the
following example.

• Example 1.3: Table 1.4 gives the one-way commuting distance (to the
nearest mile) of 30 working mothers in a large city.
Table 1.4 Commuting Distance Data

13 47 10 3 16
7 25 8 21 19
12 45 1 8 4
6 2 14 13 7
34 13 41 28 50
14 26 10 24 36

• To make a stem-leaf display, we partition the digits of each
individual observation (numeric value) into two components: stem
and leaf. The left group of digits of the entry is called the stem,
and the right group of digits is called the leaf. The number of
digits to be included in the stem is chosen conveniently so that the
number of stems in the display is between 5 and 20.
an entry = 8
stem          leaf
 0             8
(tens digit)  (units digit)
• (A single-digit entry, say 8, is read as 08 before being broken into
'stem' and 'leaf'.)

• For the data in Table 1.4, where all entries are one- or two-digit
numbers, we use the tens digit of an entry to form the stem and the
units digit to form the corresponding leaf. For the first entry, 13, the
stem is 1 and the leaf is 3. The entry 8 is treated as 08, meaning 0
for its stem and 8 for its leaf. Figure 1.5 gives the stem-leaf display
of these data. From Figure 1.5, it is clear that the largest number
of entries is in the 10-mile range [i.e., (10, 19) miles], followed
by the 0-mile range [i.e., (0, 9) miles]. The horizontal length of the
leaves represents the frequency for the corresponding stem, which is
essentially a class. The stem 1 represents the class 10-19 miles, or
more correctly the class 9.5-19.5 miles, since the data entries are
rounded values and hence anyone commuting 9.5 (or 9.6 or 9.7 or
9.8 or 9.9) miles would be assigned the value 10.

A Stem-and-Leaf display for the data in Example 1.3

Stem   Leaf
0      7, 6, 2, 8, 1, 3, 8, 4, 7
1      3, 2, 4, 3, 0, 4, 0, 3, 6, 9
2      5, 6, 1, 8, 4
3      4, 6
4      7, 5, 1
5      0
6
7
8
9
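
The display can be generated mechanically by splitting each entry into its tens digit (stem) and units digit (leaf). The Python sketch below is one possible way to do this, with illustrative names. It sorts the leaves within each stem, so the leaf order differs from the display above, which lists leaves in the order they were read from the table.

```python
from collections import defaultdict

# Commuting distances from Table 1.4.
distances = [
    13, 47, 10, 3, 16,
    7, 25, 8, 21, 19,
    12, 45, 1, 8, 4,
    6, 2, 14, 13, 7,
    34, 13, 41, 28, 50,
    14, 26, 10, 24, 36,
]

stems = defaultdict(list)
for d in distances:
    # divmod(8, 10) gives (0, 8), i.e. 8 is read as 08.
    stem, leaf = divmod(d, 10)
    stems[stem].append(leaf)

print("Stem  Leaf")
for stem in range(min(stems), max(stems) + 1):
    leaves = " ".join(str(leaf) for leaf in sorted(stems[stem]))
    print(f"{stem:>4}  {leaves}")
```

Because every leaf is kept, the original 30 values can be recovered from the display, which is the advantage over a frequency table.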

MS Excel Commands for Stem and Leaf Display

• Type the data into one column.
• Click Tools, Data Analysis Plus, and Stem and Leaf Display.
• Specify the Input Range. Click one of the values of Increment. (The
increment is the difference between stems.)

Minitab Commands for Stem and Leaf Display

• Type the data into one column.
• Click Graph and Stem-and-Leaf…
• Type or use the Select button to specify the variable in the Variable
box. Type the increment in the Increment box.

SPSS Commands for Stem and Leaf Display

• Enter the data into one column.
• Click on Analyze at the SPSS menu bar.
• Click on Descriptive Statistics, followed by Explore…
• Select the variable, transfer it into the Dependent List box, and
select Plots under Display.
• Click on Plots to open the Explore: Plots dialog box. Select None
for Boxplots and Stem-and-Leaf for Descriptive, and then click on
Continue to return to the Explore dialog box. Then click OK.

Ogive
• The frequency distribution lists the number of observations that fall into each
class interval. In some situations we may wish to highlight the number of
observations that lie below each of the class limits. In such cases we create the
cumulative frequency distribution. Table 1.5 displays this type of distribution for
Example 1.2.
Table 1.5 Cumulative frequency table for number of individuals treated at ER per day

Class           Frequency   Cumulative Frequency
5.5 – 16.5          4               4
16.5 – 27.5        10              14
27.5 – 38.5        17              31
38.5 – 49.5         6              37
49.5 – 60.5         3              40

From Table 1.5 we can see that, for example, 77.5% of the data are less than or equal to 38.5
and that 92.5% are less than or equal to 49.5.
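
The cumulative column of Table 1.5 is a running total of the class frequencies, and the quoted percentages are those totals divided by the sample size. A minimal Python sketch of that computation follows, with illustrative variable names.

```python
from itertools import accumulate

upper_limits = [16.5, 27.5, 38.5, 49.5, 60.5]
frequencies = [4, 10, 17, 6, 3]          # from Table 1.3
n = sum(frequencies)

# Running total of the class frequencies.
cumulative = list(accumulate(frequencies))

# Cumulative percentage of observations at or below each upper limit.
cumulative_pct = [100 * c / n for c in cumulative]

for limit, c, pct in zip(upper_limits, cumulative, cumulative_pct):
    print(f"<= {limit:>4}: {c:>2} observations ({pct:.1f}%)")
```

Plotting the cumulative values against the upper class limits and joining the points gives the ogive.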

Ogive for the data in Example 1.2

• Another way of presenting this information is the ogive, which is a
graphical representation of the cumulative frequencies. Figure 1.5
illustrates an ogive for the cumulative frequencies in Example 1.2.

Summary
• A set of data, even if modest in size, is often difficult to interpret
directly in the form in which it is gathered. Graphical methods
provide procedures for organizing and summarizing data so that
patterns are revealed and the data are more easily interpreted.
Frequency distributions, relative frequency distributions, percent
frequency distributions, bar graphs, and pie charts were presented
as tabular and graphical procedures for summarizing qualitative
data. Frequency distributions, relative frequency distributions,
percent frequency distributions, histograms, cumulative frequency
distributions, and ogives were presented as ways of summarizing
quantitative data. A stem-and-leaf display provides an exploratory
data analysis technique that can be used to summarize quantitative
data.

Self Test
1. A frequency distribution is a tabular summary of data showing the
a. fraction of items in several classes
b. percentage of items in several classes
c. relative percentage of items in several classes
d. number of items in several classes
2. Qualitative data can be graphically represented by using a(n)
a. histogram
b. frequency polygon
c. ogive
d. bar graph

3. The relative frequency of a class is computed by
a. dividing the midpoint of the class by the sample size
b. dividing the frequency of the class by the midpoint
c. dividing the sample size by the frequency of the class
d. dividing the frequency of the class by the sample size
4. The percent frequency of a class is computed by
a. multiplying the relative frequency by 10
b. dividing the relative frequency by 100
c. multiplying the relative frequency by 100
d. adding 100 to the relative frequency

5. Fifteen percent of the students in a School of Business
Administration are majoring in Economics, 20% in Finance, 35% in
Management, and 30% in Accounting. The graphical device(s)
which can be used to present these data is (are)
a. a line graph
b. only a bar graph
c. only a pie chart
d. both a bar graph and a pie chart

6. A cumulative relative frequency distribution shows
a. the proportion of data items with values less than or equal to the
upper limit of each class
b. the proportion of data items with values less than or equal to the
lower limit of each class
c. the percentage of data items with values less than or equal to the
upper limit of each class
d. the percentage of data items with values less than or equal to the
lower limit of each class

7. The most common graphical presentation of quantitative data is a
a. histogram
b. bar graph
c. relative frequency
d. pie chart
8. In constructing a frequency distribution, the approximate class
width is computed as
a. (largest data value - smallest data value)/number of classes
b. (largest data value - smallest data value)/sample size
c. (smallest data value - largest data value)/sample size
d. largest data value/number of classes

9. A histogram is said to be skewed to the left if it has a
a. longer tail to the right
b. shorter tail to the right
c. shorter tail to the left
d. longer tail to the left
10. When a histogram has a longer tail to the right, it is said to be
a. symmetrical
b. skewed to the left
c. skewed to the right
d. none of these alternatives is correct

• Exhibit 1
Michael's Rent-A-Car, a national car rental company, has kept a
record of the number of cars it has rented over a period of 80
days. The rental records are shown below:

Number of Cars Rented   Number of Days
0 - 19                        5
20 - 39                      15
40 - 59                      30
60 - 79                      20
80 - 99                      10
Total                        80

11. Refer to Exhibit 1. The class width of the above distribution is
a. 0 to 100
b. 20
c. 80
d. 5
12. Refer to Exhibit 1. The lower limit of the first class is
a. 5
b. 80
c. 0
d. 20

13. Refer to Exhibit 1. If one develops a cumulative frequency
distribution for the above data, the last class will have a frequency of
a. 10
b. 100
c. 0 to 100
d. 80
14. Refer to Exhibit 1. The percentage of days in which the company
rented at least 40 cars is
a. 37.5%
b. 62.5%
c. 90.0%
d. 75.0%

15. Refer to Exhibit 1. The number of days in which the company
rented fewer than 60 cars is
a. 20
b. 30
c. 50
d. 60

16. There are 800 students in the School of Business
Administration. There are four majors in the School: Accounting,
Finance, Management, and Marketing. The following shows the
number of students in each major.

Major          Number of Students
Accounting          240
Finance             160
Management          320
Marketing            80

Develop a percent frequency distribution and construct a bar chart
and a pie chart.

17. To help determine the need for more golf courses, a survey
was undertaken. A sample of 75 self-declared golfers was asked
how many rounds of golf they played last year. The data are as
follows:
18 26 16 35 30 15 18 15 18 19 25
30 35 14 20 18 24 21 25 18 29 23
15 19 27 28 9 17 28 25 23 20 24
28 36 20 30 26 12 31 13 26 22 30
29 26 17 32 36 24 29 18 38 31 36
24 30 20 13 23 3 28 5 14 24 13
18 10 14 16 28 19 10 42 22
a. Draw a histogram.
b. Draw a stem-and-leaf display.
c. Draw an ogive.
d. Describe what you have learned.
