
4-1

Chapter Four

2005 The McGraw-Hill Companies, Inc.

Chapter Four

4-2

Describing Data: Displaying and Exploring Data
GOALS
When you have completed this chapter, you will be able to:
ONE Develop and interpret a dot plot.
TWO Develop and interpret a stem-and-leaf display.
THREE Compute and interpret quartiles, deciles, and percentiles.
FOUR Construct and interpret box plots.

4-3

FIVE Compute and understand the coefficient of variation and the coefficient of skewness.
SIX Draw and interpret a scatter diagram.
SEVEN Set up and interpret a contingency table.

4-4

Dot Plot

Dot plots:
Report the details of each observation
Are useful for comparing two or more data sets
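
As a rough illustration of both points, the sketch below draws a text dot plot with one mark per observation, so two data sets can be read side by side. It assumes integer-valued observations; the function name `dot_plot` and both sample lists are invented for illustration and are not the labor force data.

```python
from collections import Counter


def dot_plot(values):
    """Return a text dot plot: one row per integer value, one '*' per observation."""
    counts = Counter(values)
    rows = []
    for v in range(min(values), max(values) + 1):
        # Right-align the value, then stack one mark per occurrence.
        rows.append(f"{v:>3} {'*' * counts[v]}".rstrip())
    return "\n".join(rows)


# Hypothetical samples (not from the slides): group_b is more spread out.
group_a = [72, 73, 73, 74, 74, 74, 75]
group_b = [68, 70, 73, 74, 76, 78, 80]

plot_a = dot_plot(group_a)
plot_b = dot_plot(group_b)
print(plot_a)
print()
print(plot_b)
```

The plot for `group_b` spans more rows than the one for `group_a`, which is how a dot plot shows the difference in dispersion.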

4-5

Example 1

This example gives the percentages of men and women participating in the workforce in a recent year for the fifty states of the United States. Compare the dispersions of labor force participation by gender.


4-6

4-7

Example 1 (continued)

[Dot plot: percentage of women participating in the labor force for the 50 states]

[Dot plot: percentage of men participating in the labor force for the 50 states]

4-8

Stem-and-leaf Displays

Stem-and-leaf display: A statistical technique for displaying a set of data. Each numerical value is divided into two parts: the leading digits become the stem and the trailing digits the leaf.

Note: an advantage of the stem-and-leaf display over a frequency distribution is that we do not lose the identity of each observation.

4-9

Example 2

Stock prices on twelve consecutive days for a major publicly traded company:

86, 79, 92, 84, 69, 88, 91, 83, 96, 78, 82, 85.

[Chart: stock price (50 to 100) by day (1 to 12)]

4-10

Example 2 (continued)

Stem-and-leaf display of stock prices

stem  leaf
6     9
7     89
8     234568
9     126
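
The display above can be reproduced with a short sketch. It assumes two-digit integer prices, so the tens digit serves as the stem and the units digit as the leaf; the function name `stem_and_leaf` is illustrative.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Map each stem (tens digit) to a string of its sorted leaves (units digits)."""
    groups = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)  # e.g. 86 -> stem 8, leaf 6
        groups[stem].append(str(leaf))
    return {stem: "".join(leaves) for stem, leaves in groups.items()}


prices = [86, 79, 92, 84, 69, 88, 91, 83, 96, 78, 82, 85]
display = stem_and_leaf(prices)

print("stem  leaf")
for stem, leaves in display.items():
    print(f"{stem:<5} {leaves}")
```

Every observation can still be read back from the display, for example stem 8 with leaf 6 is the price 86.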

4-11

Quartiles

Divide a set of observations into four equal parts.

4-12

Quartiles (continued)

Locate the median (50th percentile), the first quartile (25th percentile), and the third quartile (75th percentile).

4-15

Quartiles (continued)

$$L_p = (n + 1)\frac{P}{100}$$

where P is the desired percentile.

4-16

Example 2 (continued)

Using the twelve stock prices, we can find the median, 25th, and 75th percentiles as follows:

Quartile 1: $L_{25} = (12 + 1)\frac{25}{100} = 3.25$th observation

Median: $L_{50} = (12 + 1)\frac{50}{100} = 6.50$th observation

Quartile 3: $L_{75} = (12 + 1)\frac{75}{100} = 9.75$th observation

4-17

Example 2 (continued)

Position  Price  Quarter
12        96     Q4
11        92     Q4
10        91     Q4
9         88     Q3
8         86     Q3
7         85     Q3
6         84     Q2
5         83     Q2
4         82     Q2
3         79     Q1
2         78     Q1
1         69     Q1

75th percentile: price at the 9.75th observation = 88 + .75(91 - 88) = 90.25
50th percentile (median): price at the 6.50th observation = 84 + .5(85 - 84) = 84.50
25th percentile: price at the 3.25th observation = 79 + .25(82 - 79) = 79.75
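
The location formula and the interpolation step can be sketched together as follows. This is a minimal sketch of the (n + 1)P/100 method used in this chapter; the function name `percentile` is illustrative, and other software may use a different interpolation rule and return slightly different values.

```python
def percentile(values, p):
    """Percentile by the location method L_p = (n + 1) * P / 100 with linear interpolation."""
    data = sorted(values)
    n = len(data)
    location = (n + 1) * p / 100      # 1-based position, possibly fractional
    whole = int(location)             # observation just below the location
    fraction = location - whole       # how far to move toward the next observation
    if whole < 1:                     # location falls before the first observation
        return data[0]
    if whole >= n:                    # location falls at or beyond the last observation
        return data[-1]
    lower, upper = data[whole - 1], data[whole]
    return lower + fraction * (upper - lower)


prices = [86, 79, 92, 84, 69, 88, 91, 83, 96, 78, 82, 85]
q1 = percentile(prices, 25)
median = percentile(prices, 50)
q3 = percentile(prices, 75)
print(q1, median, q3)
```

For the twelve stock prices this gives 79.75, 84.5, and 90.25.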

4-18

Interquartile Range

The interquartile range is the distance between the third quartile Q3 and the first quartile Q1.

This distance will include the middle 50 percent of the observations.

Interquartile range = Q3 - Q1

4-19

Example 3

For a set of observations the third quartile is 24 and the first quartile is 10. What is the interquartile range?

The interquartile range is 24 - 10 = 14. Fifty percent of the observations will occur between 10 and 24.

4-20

Box Plots

A box plot is a graphical display, based on quartiles, that helps to picture a set of data.

Five pieces of data are needed to construct a box plot: the minimum value, the first quartile, the median, the third quartile, and the maximum value.

4-21

Example 4

Based on a sample of 20 deliveries, Buddy's Pizza determined the following information. The minimum delivery time was 13 minutes and the maximum 30 minutes. The first quartile was 15 minutes, the median 18 minutes, and the third quartile 22 minutes. Develop a box plot for the delivery times.

4-22

Example 4 continued

[Box plot of delivery times: horizontal axis from 12 to 32 minutes, with the minimum, median, and maximum marked]
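
A box plot for the delivery times can be roughed out in plain text from the five values given in Example 4. The sketch below assumes integer values and one character per minute on an axis from 12 to 32; the function name `text_box_plot` and the choice of symbols are illustrative.

```python
def text_box_plot(minimum, q1, median, q3, maximum, lo=12, hi=32):
    """Draw a one-line box plot: '-' whiskers, '#' box ends, '=' box, '|' median."""
    cells = []
    for x in range(lo, hi + 1):
        if x == median:
            cells.append("|")
        elif x in (q1, q3):
            cells.append("#")
        elif q1 < x < q3:
            cells.append("=")
        elif minimum <= x < q1 or q3 < x <= maximum:
            cells.append("-")
        else:
            cells.append(" ")  # outside the range of the data
    return "".join(cells)


# Five-number summary from Example 4 (minutes).
line = text_box_plot(minimum=13, q1=15, median=18, q3=22, maximum=30)
iqr = 22 - 15
print(line)
print("Interquartile range:", iqr)
```

The right whisker (22 to 30) is much longer than the left one (13 to 15), and the median sits closer to the first quartile than to the third, so the delivery times are stretched toward the longer times.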

Relative dispersion

4-24
Coefficient of Variation

The coefficient of variation is the ratio of the standard deviation to the arithmetic mean, expressed as a percentage:

$$CV = \frac{s}{\bar{X}}(100\%)$$

where s is the standard deviation and $\bar{X}$ is the mean.

4-25

Skewness is the measurement of the lack of symmetry of the distribution.

The coefficient of skewness can range from -3.00 up to 3.00 when using the following formula:

$$sk = \frac{3(\bar{X} - \text{Median})}{s}$$

A value of 0 indicates a symmetric distribution.

Some software packages use a different formula which results in a wider range for the coefficient.

4-26

Example 2 revisited

Using the twelve stock prices, we find the mean to be 84.42, the standard deviation 7.18, and the median 84.5.

Coefficient of variation:

$$CV = \frac{s}{\bar{X}}(100\%) = 8.5\%$$

Coefficient of skewness:

$$sk = \frac{3(\bar{X} - \text{Median})}{s} = -0.035$$
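
These two coefficients can be checked with Python's standard `statistics` module. The sketch assumes s is the sample standard deviation (divisor n - 1), which is consistent with the 7.18 quoted above; the variable names are illustrative.

```python
import statistics

prices = [86, 79, 92, 84, 69, 88, 91, 83, 96, 78, 82, 85]

mean = statistics.mean(prices)      # arithmetic mean
s = statistics.stdev(prices)        # sample standard deviation (n - 1 divisor)
median = statistics.median(prices)

cv = s / mean * 100                 # coefficient of variation, in percent
sk = 3 * (mean - median) / s        # coefficient of skewness used in this chapter

print(f"mean={mean:.2f} s={s:.2f} median={median}")
print(f"CV={cv:.1f}% sk={sk:.3f}")
```

The slightly negative skewness reflects a mean just below the median, so the distribution of prices is very close to symmetric.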

4-27

Scatter diagram

Scatter diagram: A technique used to show the relationship between variables.

Variables must be at least interval scaled.

Relationship can be positive (direct) or negative (inverse).

Example
The twelve days of stock prices and the overall market index on each day are given as follows:

4-28

Example 2 revisited

Index (000s)  Price
8.0           96
7.5           92
7.5           91
7.3           88
7.2           86
7.2           85
7.1           84
7.1           83
7.0           82
6.2           79
6.2           78
5.1           69

[Scatter diagram: Relationship between Market Index and Stock Price; horizontal axis Index (5 to 10), vertical axis Price (50 to 100)]
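
Whether the relationship is positive or negative can also be checked numerically. The sketch below computes a correlation coefficient for the twelve (index, price) pairs. That measure is not introduced on these slides and is used here only as an assumption-laden check of the direction of the relationship.

```python
import math

index = [8.0, 7.5, 7.5, 7.3, 7.2, 7.2, 7.1, 7.1, 7.0, 6.2, 6.2, 5.1]
price = [96, 92, 91, 88, 86, 85, 84, 83, 82, 79, 78, 69]


def correlation(xs, ys):
    """Pearson correlation: co-movement of xs and ys scaled to the range -1..1."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


r = correlation(index, price)
direction = "positive (direct)" if r > 0 else "negative (inverse)"
print(f"r = {r:.2f}, relationship is {direction}")
```

A value of r close to +1 agrees with what the scatter diagram shows: higher index values go with higher stock prices.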

4-29

Contingency table

A contingency table is used to classify observations according to two identifiable characteristics.

Contingency tables are used when one or both variables are nominally scaled.

A contingency table is a cross tabulation that simultaneously summarizes two variables of interest.
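
A cross tabulation of two nominal variables can be sketched by counting pairs. The records below (a region and a preferred brand for each respondent) are entirely hypothetical and only illustrate the mechanics; the function name `cross_tab` is also illustrative.

```python
from collections import Counter

# Hypothetical observations: (region, preferred brand). Not data from the slides.
records = [
    ("North", "A"), ("North", "B"), ("South", "A"), ("South", "A"),
    ("North", "A"), ("South", "B"), ("South", "B"),
]


def cross_tab(pairs):
    """Return row labels, column labels, and a table of counts for each (row, column) pair."""
    counts = Counter(pairs)
    rows = sorted({r for r, _ in pairs})
    cols = sorted({c for _, c in pairs})
    table = [[counts[(r, c)] for c in cols] for r in rows]
    return rows, cols, table


rows, cols, table = cross_tab(records)
print("Region", *cols)
for label, row in zip(rows, table):
    print(label, *row)
```

Each cell holds the number of observations that share one row category and one column category, which is what a contingency table summarizes.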

4-30

Example 5

Weight Loss
45 adults, all 60 pounds overweight, are randomly assigned to three weight loss programs. Twenty weeks into the program, a researcher gathers data on weight loss and divides the loss into three categories: less than 20 pounds, 20 up to 40 pounds, 40 or more pounds. Here are the results.

4-31
Weight Loss Plan | Less than 20 pounds | 20 up to 40 pounds | 40 pounds or more

Plan 1

Plan 2

12

Plan 3

12

Compare the weight loss under the three plans.


Example 5 continued
