
Chapter 2

Presenting Data in Tables and Charts
David Chow
Sep 2014

A Picture Is Worth a Thousand Words

Categorical Data

Organizing Categorical Data: Summary Table

A summary table indicates the frequency, amount, or percentage of items in a set of categories.

This makes differences between categories easy to see.

How do you spend the holidays?    Percent
At home with family               45%
Travel to visit family            38%
Vacation                           5%
Catching up on work                5%
Other                              7%
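A summary table like this one is usually tallied from raw responses. The sketch below shows one way to do that in Python. The function name `summary_table` and the response list are hypothetical and are not the survey data behind the slide.

```python
from collections import Counter

def summary_table(responses):
    """Tally raw responses into (category, count, percent) rows, largest first."""
    counts = Counter(responses)
    total = sum(counts.values())
    return [(category, n, 100 * n / total) for category, n in counts.most_common()]

# Hypothetical raw responses, chosen only to illustrate the tally.
responses = ["At home"] * 9 + ["Travel"] * 8 + ["Other"] * 2 + ["Vacation"]
rows = summary_table(responses)
for category, n, percent in rows:
    print(f"{category:<10}{n:>3}{percent:>7.1f}%")
```

Each row carries both the frequency and the percentage, so either column can be shown in the table.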

Organizing Categorical Data: Bar Chart & Pie Chart

In a bar chart, each category is shown as a bar whose length represents the amount, frequency, or percentage.

A pie chart is a circle broken up into slices representing categories.

The size of each slice corresponds to the percentage share.
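As a rough illustration of both ideas, the sketch below draws a plain-text bar chart and computes pie slice angles for the holiday percentages. The helper names `text_bar_chart` and `slice_angles` are made up for this example, and a real chart would normally come from Excel or a plotting library.

```python
def text_bar_chart(data, width=45):
    """Return one line per category; bar length is proportional to the value."""
    largest = max(data.values())
    label_width = max(len(label) for label in data)
    lines = []
    for label, value in data.items():
        bar = "#" * round(value / largest * width)
        lines.append(f"{label:<{label_width}} | {bar} {value}%")
    return lines

def slice_angles(data):
    """Return the central angle (in degrees) of each pie slice."""
    total = sum(data.values())
    return {label: 360 * value / total for label, value in data.items()}

# Percentages from the holiday summary table.
holidays = {
    "At home with family": 45,
    "Travel to visit family": 38,
    "Vacation": 5,
    "Catching up on work": 5,
    "Other": 7,
}
chart = text_bar_chart(holidays)
angles = slice_angles(holidays)
print("\n".join(chart))
```

With these numbers the 45% category gets a slice of about 162 degrees, since 45% of 360 is 162.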
[Figures: pie chart and bar chart, both titled "How Do You Spend the Holidays?". Categories: At home with family 45%, Travel to visit family 38%, Other 7%, Vacation 5%, Catching up on work 5%. The bar chart's percentage axis runs from 0% to 50%.]

Organizing Categorical Data: Pareto Diagram

A Pareto diagram is also used for categorical data.

Essentially, it is a bar chart and a cumulative polygon in the same graph.

Categories are shown in descending order of frequency.

This makes it easy to see the vital few versus the trivial many.
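The two ingredients of a Pareto diagram, descending order and a running cumulative percentage, can be sketched as follows. The function name `pareto_table` is an assumption for illustration, and the data are the holiday percentages from the summary table earlier in the chapter.

```python
def pareto_table(data):
    """Sort categories by descending value and attach cumulative percentages."""
    ordered = sorted(data.items(), key=lambda item: item[1], reverse=True)
    total = sum(data.values())
    rows, running = [], 0
    for label, value in ordered:
        running += value
        rows.append((label, value, 100 * running / total))
    return rows

holidays = {
    "At home with family": 45,
    "Travel to visit family": 38,
    "Vacation": 5,
    "Catching up on work": 5,
    "Other": 7,
}
pareto = pareto_table(holidays)
for label, value, cumulative in pareto:
    print(f"{label:<24}{value:>4}%{cumulative:>7.0f}%")
```

The first two categories already account for 83% of the responses, which is the "vital few" pattern the diagram is meant to expose.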

Organizing Categorical Data: Pareto Diagram

[Figure: Pareto diagram, "Current Investment Portfolio". Bars show the % invested in each category (bar graph) in descending order of frequency: Stocks, Bonds, Savings, CD. A line shows the cumulative % invested (line graph). A Pareto diagram is a bar chart and a cumulative polygon together, so it is easy to see the vital few.]

Numerical Data
Ordered Array & Stem-and-Leaf

Organizing Numerical Data: Ordered Array

An ordered array is a sequence of data, in rank order, from the smallest value to the largest value.

Age of Surveyed College Students

Day Students
16  17  17  18  18  18
19  19  20  20  21  22
22  25  27  32  38  42

Night Students
18  18  19  19  20  21
23  28  32  33  41  45

Organizing Numerical Data: Stem-and-Leaf Display

A stem-and-leaf display organizes data into groups (called stems) so that the values within each group (the leaves) branch out to the right on each row.

Age of College Students (stem: 10s column)

Day Students          Night Students
Stem  Leaf            Stem  Leaf
1     67788899        1     8899
2     0012257         2     0138
3     28              3     23
4     2               4     15
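A minimal sketch of how such a display can be built, assuming whole-number data with the tens digit as the stem and the units digit as the leaf. The function name `stem_and_leaf` is illustrative, and the data are the Day Students ages from the ordered array.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Group values by tens digit (stem); collect units digits (leaves) in order."""
    groups = defaultdict(list)
    for value in sorted(values):
        stem, leaf = divmod(value, 10)
        groups[stem].append(str(leaf))
    # Stems with no observations are simply omitted in this sketch.
    return {stem: "".join(leaves) for stem, leaves in sorted(groups.items())}

day_students = [16, 17, 17, 18, 18, 18, 19, 19, 20,
                20, 21, 22, 22, 25, 27, 32, 38, 42]
display = stem_and_leaf(day_students)
for stem, leaves in display.items():
    print(f"{stem} | {leaves}")
```

Data with decimals, such as dollar amounts, would need a choice of stem unit and a rounding rule first, which this sketch does not cover.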

Stem-and-Leaf Display

Construct a stem-and-leaf display for the following data sets:

1. Midterm scores: 50, 74, 74, 76, 81
2. Average daily expenditure: $36.15, $31.00, $35.05, $40.25, $33.75

Numerical Data:
Tables & Charts

Organizing Numerical Data: Frequency Distribution

The frequency distribution is a summary table in which the data are arranged into numerically ordered class groupings.

You must select an appropriate number of class groupings, determine a suitable class width, and establish the boundaries of each class so that classes do not overlap.

To determine the width of a class interval, divide the range (highest value - lowest value) by the number of class groupings desired.
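The width rule can be sketched in a few lines. The function name `class_width` and the `round_to` parameter (rounding up to a convenient multiple, here 10) are assumptions for illustration. The data are the 20 daily high temperatures used in the example that follows.

```python
import math

def class_width(values, num_classes, round_to=10):
    """Range divided by number of classes, rounded up to a multiple of round_to."""
    data_range = max(values) - min(values)
    raw_width = data_range / num_classes
    return math.ceil(raw_width / round_to) * round_to

temperatures = [24, 35, 17, 21, 24, 37, 26, 46, 58, 30,
                32, 13, 12, 38, 41, 43, 44, 27, 53, 27]
width = class_width(temperatures, num_classes=5)
print(width)
```

Here the range is 58 - 12 = 46, so the raw width is 46/5 = 9.2, which rounds up to 10.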

Organizing Numerical Data: Frequency Distribution Example

Example: A manufacturer of insulation randomly selects 20 winter days and records the daily high temperature (in degrees Fahrenheit):

24, 35, 17, 21, 24, 37, 26, 46, 58, 30, 32, 13, 12, 38, 41, 43, 44, 27, 53, 27

Organizing Numerical Data: Frequency Distribution Example

Sort raw data in ascending order:
12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58

Find range: 58 - 12 = 46
Select number of classes: 5 (usually between 5 and 15)
Compute class interval (width): 10 (46/5 = 9.2, then round up)
Determine class boundaries (limits): 10, 20, 30, 40, 50, 60
Compute class midpoints: 15, 25, 35, 45, 55
Count observations & assign to classes
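The last three steps can be sketched as below. This assumes each class includes its lower boundary and excludes its upper boundary. The function name `frequency_distribution` is illustrative.

```python
def frequency_distribution(values, boundaries):
    """Return (lower, upper, midpoint, count) for each class [lower, upper)."""
    table = []
    for lower, upper in zip(boundaries[:-1], boundaries[1:]):
        count = sum(1 for v in values if lower <= v < upper)
        midpoint = (lower + upper) / 2
        table.append((lower, upper, midpoint, count))
    return table

temperatures = [24, 35, 17, 21, 24, 37, 26, 46, 58, 30,
                32, 13, 12, 38, 41, 43, 44, 27, 53, 27]
boundaries = [10, 20, 30, 40, 50, 60]
distribution = frequency_distribution(temperatures, boundaries)
for lower, upper, midpoint, count in distribution:
    print(f"{lower} but less than {upper}: midpoint {midpoint:g}, frequency {count}")
```

Using half-open classes means a value that falls exactly on a boundary, such as 30, is counted once, in the class that starts at that boundary.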

Organizing Numerical Data: Frequency Distribution Example

Class                  Frequency   Relative Frequency   Percentage
10 but less than 20        3              .15                15
20 but less than 30        6              .30                30
30 but less than 40        5              .25                25
40 but less than 50        4              .20                20
50 but less than 60        2              .10                10
Total                     20             1.00               100
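The relative frequency and percentage columns follow directly from the frequency column: divide each frequency by the total, then multiply by 100. A short sketch, with `relative_table` as a hypothetical helper name:

```python
def relative_table(frequencies):
    """Map each class to (frequency, relative frequency, percentage)."""
    total = sum(frequencies.values())
    return {
        label: (count, count / total, 100 * count / total)
        for label, count in frequencies.items()
    }

frequencies = {
    "10 but less than 20": 3,
    "20 but less than 30": 6,
    "30 but less than 40": 5,
    "40 but less than 50": 4,
    "50 but less than 60": 2,
}
table = relative_table(frequencies)
for label, (count, relative, percent) in table.items():
    print(f"{label:<22}{count:>3}{relative:>7.2f}{percent:>6.0f}")
```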

Organizing Numerical Data: The Histogram

The graphical version of a frequency distribution is called a histogram.

The class boundaries (or class midpoints) are shown on the horizontal axis. The vertical axis can be frequency, relative frequency, or percentage.

Bars of the appropriate heights are used to represent the number of observations within each class.

Organizing Numerical Data: The Histogram

[Figure: "Histogram: Daily High Temperature". Vertical axis: Frequency (0 to 7). Horizontal axis: Class, labeled 5, 15, 25, 35, 45, 55, More. The bars plot the frequency distribution of the temperature example: 3, 6, 5, 4, 2.]

Histogram in Excel: Step 1

Excel 2010 version: Data > Data Analysis*
* You may need to activate Data Analysis yourself. Simply click File > Options > Add-Ins.

Earlier versions: select Tools > Data Analysis.

Histogram in Excel: Steps 2-4

2. Choose Histogram.
3. Input the data range and bin range (the bin range is a cell range containing the upper class boundaries for each class grouping).
4. Select Chart Output and click OK.

Organizing Numerical Data: The Polygon

A percentage polygon is formed by having the midpoint of each class represent the data in that class and then connecting the sequence of midpoints at their respective class percentages.

The cumulative percentage polygon, or ogive, displays the variable of interest along the X axis, and the cumulative percentages along the Y axis.
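The points that each polygon connects can be computed from the class percentages. The sketch below uses the boundaries and percentages of the daily high temperature example. The function names are illustrative only.

```python
from itertools import accumulate

def polygon_points(boundaries, percentages):
    """Pair each class midpoint with its class percentage."""
    midpoints = [(lo + hi) / 2 for lo, hi in zip(boundaries[:-1], boundaries[1:])]
    return list(zip(midpoints, percentages))

def ogive_points(boundaries, percentages):
    """Pair each class boundary with the percentage of observations below it."""
    cumulative = [0] + list(accumulate(percentages))
    return list(zip(boundaries, cumulative))

boundaries = [10, 20, 30, 40, 50, 60]
percentages = [15, 30, 25, 20, 10]
polygon = polygon_points(boundaries, percentages)
ogive = ogive_points(boundaries, percentages)
print(polygon)
print(ogive)
```

The ogive starts at 0% at the lowest boundary and reaches 100% at the highest, because no observation lies below the first class or above the last.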

Organizing Numerical Data: The Polygon

[Figure: "Frequency Polygon: Daily High Temperature". Vertical axis: Frequency (0 to 7). Horizontal axis: Class, labeled 5, 15, 25, 35, 45, 55, More. The polygon plots the frequency distribution of the temperature example. In a percentage polygon, the vertical axis would be defined to show the percentage of observations per class.]

Organizing Numerical Data: The Cumulative Percentage Polygon

Class                  Lower Boundary   % Less Than Lower Boundary
10 but less than 20         10                    0
20 but less than 30         20                   15
30 but less than 40         30                   45
40 but less than 50         40                   70
50 but less than 60         50                   90
                            60                  100

[Figure: "Ogive: Daily High Temperature". Vertical axis: Cumulative Percentage (0 to 100). Horizontal axis: class boundaries 10, 20, 30, 40, 50, 60.]

Cross Tabulation

Cross Tabulations: The Contingency Table

A cross-classification (or contingency) table presents the results of two categorical variables.

The categories of one variable are located in the rows; the categories of the other are located in the columns.

The joint responses are classified and shown in the cells.

A graphical representation is the side-by-side bar chart.

Cross Tabulations: The Contingency Table

A survey was conducted to study the importance of brand name to consumers as compared to a few years ago. The results, classified by gender, were as follows:

Importance of Brand Name    Male   Female   Total
More                         450      300     750
Equal or Less               3300     3450    6750
Total                       3750     3750    7500
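The row and column totals can be derived from the four joint cells. The sketch below does that and also expresses each cell as a percentage of its column total, which is one common way to compare groups of this kind. The nested-dictionary layout and the function names are assumptions made for this example.

```python
def with_totals(table):
    """Add a 'Total' column to every row and a 'Total' row at the bottom."""
    columns = list(next(iter(table.values())))
    result = {row: dict(cells, Total=sum(cells.values())) for row, cells in table.items()}
    result["Total"] = {
        col: sum(result[row][col] for row in table) for col in columns + ["Total"]
    }
    return result

def column_percentages(table):
    """Express each cell as a percentage of its column total."""
    totals = with_totals(table)["Total"]
    return {
        row: {col: 100 * count / totals[col] for col, count in cells.items()}
        for row, cells in table.items()
    }

brand = {
    "More": {"Male": 450, "Female": 300},
    "Equal or Less": {"Male": 3300, "Female": 3450},
}
totals = with_totals(brand)
percents = column_percentages(brand)
print(totals["Total"])
print(percents["More"])
```

By this calculation 12% of the male respondents and 8% of the female respondents answered "More".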

Cross Tabulations: Side-By-Side Bar Charts

[Figure: side-by-side bar chart, "Importance of Brand Name". Vertical axis: Response (More, Less or Equal), with separate bars for Male and Female. Horizontal axis: Number of Responses, with ticks up to 4000.]

Numerical Data:
Scatter Plots & Time Series Plots

To create scatter plots and time-series plots in Excel, use the XY (Scatter) option in the chart wizard.

Scatter Plots

Scatter plots are used for numerical data consisting of paired observations taken from two numerical variables.

One variable is measured on the vertical axis and the other variable is measured on the horizontal axis.

Scatter Plot Example

Volume per day   Cost per day
23               125
26               140
29               146
33               160
38               167
42               170
50               188
55               195
60               200

[Figure: scatter plot, "Cost per Day vs. Production Volume". Vertical axis: Cost per Day (0 to 250). Horizontal axis: Volume per Day (20 to 70).]

Time Series Plot

A time-series plot is used to study patterns in the values of a numerical variable over time.

Attendance (in millions) at USA amusement/theme parks from 2000-2005

Year   Attendance
2000   317
2001   319
2002   324
2003   322
2004   328
2005   335

[Figure: time-series plot, "Attendance (in millions) at US Theme Parks". Vertical axis: Attendance (316 to 336). Horizontal axis: Year (Since 2000).]

Principles of Excellent Graphs

The graph should not distort the data.
The graph should not contain unnecessary adornments (chart junk).
The scale on the vertical axis should begin at zero.
The graph should contain a title and be properly labeled.
Use the simplest possible graph.
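One way to see why the vertical axis should begin at zero is to compare the true ratio of two values with the ratio of their drawn heights on a truncated axis. The sketch below does this for two hypothetical index levels. The function name `apparent_ratio` and the numbers are illustrative only, not data from the slides.

```python
def apparent_ratio(value_a, value_b, axis_start=0):
    """Ratio of drawn heights when the vertical axis begins at axis_start."""
    return (value_a - axis_start) / (value_b - axis_start)

# Hypothetical index levels, chosen only to illustrate the effect.
high, low = 20000, 15000
true_ratio = apparent_ratio(high, low)                     # axis begins at zero
distorted = apparent_ratio(high, low, axis_start=10000)    # axis begins at 10000
print(f"true ratio {true_ratio:.2f}, drawn ratio {distorted:.2f}")
```

On the truncated axis the higher value is drawn twice as tall as the lower one, although it is only about 1.33 times as large.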

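Graphical Errors: Chart Junk
Example 1

Which one is a better presentation?

[Figures: two presentations titled "Minimum Wage", one annotated with the values listed below and one plotted in dollars against the years 1960, 1970, 1980, 1990.]

Minimum Wage
1960: $1.00
1970: $1.60
1980: $3.10
1990: $3.80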

Graphical Errors: No Relative Basis
Example 2

Which one is a better presentation?

[Figures: two bar charts, each captioned "A's received by students.", by class (FR, SO, JR, SR). One uses a frequency axis (0 to 300); the other uses a percentage axis (0% to 30%).]

FR = Freshman, SO = Sophomore, JR = Junior, SR = Senior

Graphical Errors: Compressing the Vertical Axis
Example 3

Which one is a better presentation?

[Figures: two charts titled "Quarterly Sales" ($) for Q1 to Q4. One has a vertical axis from 0 to 200; the other has a vertical axis from 0 to 50.]

Graphical Errors: No Zero Point on the Vertical Axis
Example 4

Which one is a better presentation?

[Figures: two time-series plots of the Hang Seng Index (HSI), titled "Impact of Financial Tsunami to HSI", with weekly dates from 9/1/2008 to 12/29/2008. One has a vertical axis from 0 to 25000; the other has a vertical axis from 10000 to 24000.]

How to Sell a Lie

Point to pictures or graphs
Pictures (relevant or not) often alter our perceptions of truth
Present numbers or tables
Use words like "because"
Tell a story
