10th Edition
Chapter 3
Numerical Descriptive Measures
Learning Objectives
In this chapter, you learn:
To describe the properties of central tendency,
variation, and shape in numerical data
To calculate descriptive summary measures for a
population
To calculate the coefficient of variation and Z scores
To construct and interpret a box-and-whisker plot
To calculate the covariance and the coefficient of
correlation
Chapter Topics
Summary Measures
Describing Data Numerically
Central Tendency: Arithmetic Mean, Median, Mode, Geometric Mean
Quartiles
Variation: Range, Interquartile Range, Variance, Standard Deviation, Coefficient of Variation
Shape: Skewness
Arithmetic Mean: $\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n}$
Median: midpoint of ranked values
Mode: most frequently observed value
Geometric Mean: $\bar{X}_G = (X_1 \times X_2 \times \cdots \times X_n)^{1/n}$
Arithmetic Mean
$\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n} = \frac{X_1 + X_2 + \cdots + X_n}{n}$
where n = sample size and $X_1, X_2, \ldots, X_n$ = observed values
Arithmetic Mean (continued)
[Figure: two dot plots on a 0 to 10 axis]
Mean = 3: $\frac{1 + 2 + 3 + 4 + 5}{5} = \frac{15}{5} = 3$
Mean = 4: $\frac{1 + 2 + 3 + 4 + 10}{5} = \frac{20}{5} = 4$
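The two mean calculations above can be checked numerically. The following is a minimal sketch using Python's standard `statistics` module; the variable names are illustrative only.

```python
from statistics import mean

# The two five-value samples from the slide; only the largest value differs.
sample_a = [1, 2, 3, 4, 5]
sample_b = [1, 2, 3, 4, 10]

mean_a = mean(sample_a)  # (1 + 2 + 3 + 4 + 5) / 5
mean_b = mean(sample_b)  # (1 + 2 + 3 + 4 + 10) / 5

print(mean_a, mean_b)
```

Changing a single observation from 5 to 10 moves the mean from 3 to 4, which illustrates how strongly the mean responds to one large value.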
Median
[Figure: two dot plots on a 0 to 10 axis, each with Median = 3]
Note that $\frac{n+1}{2}$ is not the value of the median, only the position of the median in the ranked data
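To make the distinction between the position and the value of the median concrete, here is a small sketch. The sample is hypothetical (it is not taken from the slides) and the helper name `median_position` is an assumption for illustration.

```python
from statistics import median

def median_position(n: int) -> float:
    """1-based position of the median in ranked data: (n + 1) / 2."""
    return (n + 1) / 2

# Hypothetical sample, not taken from the slides.
values = [4, 1, 10, 3, 2]
ranked = sorted(values)              # [1, 2, 3, 4, 10]

pos = median_position(len(ranked))   # position in the ranked list
med = median(ranked)                 # value found at that position

print(f"position = {pos}, median = {med}")
```

For an even sample size the position ends in .5, and the median is then the average of the two middle ranked values.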
Mode
[Figure: dot plot on a 0 to 14 axis, Mode = 9]
[Figure: dot plot on a 0 to 6 axis, No Mode]
Review Example
House Prices:
$2,000,000
500,000
300,000
100,000
100,000
Review Example:
Summary Statistics
House Prices:
$2,000,000
500,000
300,000
100,000
100,000
Sum: $3,000,000
Mean: $3,000,000 / 5 = $600,000
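The summary statistics for the five house prices can be reproduced with a short sketch. The sum and mean are quoted on the slide; the median and mode below are computed from the listed prices rather than quoted.

```python
from statistics import mean, median, mode

# House prices from the review example, in dollars.
house_prices = [2_000_000, 500_000, 300_000, 100_000, 100_000]

total = sum(house_prices)          # sum of all five prices
avg = mean(house_prices)           # total / 5
mid = median(house_prices)         # middle value of the ranked prices
most_common = mode(house_prices)   # most frequently observed price

print(total, avg, mid, most_common)
```

The single $2,000,000 house pulls the mean well above the median, so the median is arguably the more representative "typical" price here.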
Quartiles
[Figure: ranked data divided by Q1, Q2, and Q3 into four segments, each holding 25% of the observations]
The first quartile, Q1, is the value for which 25% of the
observations are smaller and 75% are larger
Q2 is the same as the median (50% are smaller, 50% are
larger)
Only 25% of the observations are greater than the third
quartile
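Quartile Formulas
Find a quartile by determining the value in the appropriate position in the ranked data, where
First quartile position: Q1 = (n+1)/4
Second quartile position: Q2 = (n+1)/2 (the median position)
Third quartile position: Q3 = 3(n+1)/4
where n is the number of observed values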
Quartiles
(n = 9)
Q1 is in the (9+1)/4 = 2.5 position of the ranked data, so use the value halfway between the 2nd and 3rd values: Q1 = 12.5
Q1 and Q3 are measures of noncentral location
Q2 = median, a measure of central tendency
Quartiles
(continued)
Example:
(n = 9)
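Q1 is in the (9+1)/4 = 2.5 position of the ranked data, so Q1 = 12.5
Q2 is in the (9+1)/2 = 5th position of the ranked data, so Q2 = median = 16
Q3 is in the 3(9+1)/4 = 7.5 position of the ranked data, so Q3 = 19.5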
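The position-based quartile rule can be sketched in code. The nine data values are not preserved in this text, so the sample below is hypothetical, chosen only so that the results match the quoted Q1 = 12.5, Q2 = 16, and Q3 = 19.5. The function names are illustrative, and rounding a position that is neither whole nor ending in .5 is an assumption not stated in the slides.

```python
def ranked_value(ranked, position):
    """Value at a 1-based position in ranked data.

    Whole positions return that ranked value; positions ending in .5 return
    the average of the two neighbouring values. Any other fraction is rounded
    to the nearest whole position (an assumption, not stated in the slides).
    """
    lower = int(position)        # whole part of the position
    frac = position - lower
    if frac == 0:
        return ranked[lower - 1]
    if frac == 0.5:
        return (ranked[lower - 1] + ranked[lower]) / 2
    return ranked[round(position) - 1]


def quartiles(data):
    """Q1, Q2, Q3 from the (n+1)/4, (n+1)/2, and 3(n+1)/4 positions."""
    ranked = sorted(data)
    n = len(ranked)
    q1 = ranked_value(ranked, (n + 1) / 4)
    q2 = ranked_value(ranked, (n + 1) / 2)
    q3 = ranked_value(ranked, 3 * (n + 1) / 4)
    return q1, q2, q3


# Hypothetical nine-value sample chosen to reproduce the quoted results.
sample = [11, 12, 13, 16, 16, 17, 18, 21, 22]
q1, q2, q3 = quartiles(sample)
print(q1, q2, q3)
```

Statistical packages use several different quartile conventions, so library results may differ slightly from this position rule.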
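Geometric Mean
Geometric mean: $\bar{X}_G = (X_1 \times X_2 \times \cdots \times X_n)^{1/n}$
Geometric mean rate of return: $\bar{R}_G = [(1 + R_1) \times (1 + R_2) \times \cdots \times (1 + R_n)]^{1/n} - 1$
where $R_i$ is the rate of return in time period i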
Example
An investment of $100,000 declined to $50,000 at the
end of year one and rebounded to $100,000 at end
of year two:
X1 = $100,000
X2 = $50,000 (50% decrease)
X3 = $100,000 (100% increase)
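Example (continued)
Arithmetic mean rate of return: $\bar{X} = \frac{(-50\%) + (100\%)}{2} = 25\%$ (misleading result)
Geometric mean rate of return: $\bar{R}_G = [(1 + R_1) \times (1 + R_2) \times \cdots \times (1 + R_n)]^{1/n} - 1 = [(1 + (-0.5)) \times (1 + 1)]^{1/2} - 1 = [1.0]^{1/2} - 1 = 0\%$ (more accurate result)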
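The investment example can be verified with a short sketch that compares the arithmetic and geometric mean rates of return. The function names are illustrative; rates are expressed as decimals (a 50% decrease is -0.50).

```python
from math import prod

def arithmetic_mean_return(rates):
    """Simple average of the period rates of return."""
    return sum(rates) / len(rates)

def geometric_mean_return(rates):
    """[(1 + R1)(1 + R2)...(1 + Rn)]^(1/n) - 1."""
    growth = prod(1 + r for r in rates)   # overall growth factor
    return growth ** (1 / len(rates)) - 1

# Year one: 50% decrease; year two: 100% increase.
rates = [-0.50, 1.00]

arith = arithmetic_mean_return(rates)
geo = geometric_mean_return(rates)

print(f"arithmetic: {arith:.0%}, geometric: {geo:.0%}")
```

The investment starts and ends at $100,000, so the overall growth factor is 1 and the geometric mean rate of return is zero, while the arithmetic mean suggests a 25% average gain.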
Measures of Variation
Variation: Range, Interquartile Range, Variance, Standard Deviation, Coefficient of Variation
[Figure: two distributions with the same center, different variation]
Range
Example:
[Figure: dot plot on a 0 to 14 axis] Range = 14 - 1 = 13
[Figure: two dot plots, each with Range = 12 - 7 = 5]
Sensitive to outliers
1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,3,3,3,3,4,5
Range = 5 - 1 = 4
1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,3,3,3,3,4,120
Range = 120 - 1 = 119
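The effect of a single outlier on the range can be reproduced directly from the two data sets listed above. This is a minimal sketch; the helper name `data_range` is illustrative.

```python
def data_range(values):
    """Range = largest value - smallest value."""
    return max(values) - min(values)

# Eleven 1s, eight 2s, four 3s, then the last two values from the slide.
common = [1] * 11 + [2] * 8 + [3] * 4
base = common + [4, 5]
with_outlier = common + [4, 120]

range_base = data_range(base)
range_outlier = data_range(with_outlier)

print(range_base, range_outlier)
```

Only one of the 25 values changed, yet the range grew from 4 to 119, because the range depends solely on the two extreme values.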
Interquartile Range
Example:
X minimum = 12, Q1 = 30, Median (Q2) = 45, Q3 = 57, X maximum = 70 (25% of the observations lie in each of the four segments)
Interquartile range = Q3 - Q1 = 57 - 30 = 27
Variance
Sample variance: $S^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}$
Where
$\bar{X}$ = mean
n = sample size
$X_i$ = ith value of the variable X
Standard Deviation
Sample standard deviation: $S = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}}$
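Calculation Example: Sample Standard Deviation
Sample Data ($X_i$): 10 12 14 15 17 18 18 24
n = 8, Mean = $\bar{X}$ = 16
$S = \sqrt{\frac{(10 - 16)^2 + (12 - 16)^2 + (14 - 16)^2 + \cdots + (24 - 16)^2}{8 - 1}} = \sqrt{\frac{130}{7}} = 4.3095$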
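The calculation above can be traced step by step in code. This sketch computes the sample standard deviation by hand and, as a cross-check, compares it with `statistics.stdev`, which also divides by n - 1. Variable names are illustrative.

```python
from math import sqrt
from statistics import stdev

# Sample data from the calculation example.
data = [10, 12, 14, 15, 17, 18, 18, 24]

n = len(data)
x_bar = sum(data) / n                          # sample mean
sum_sq = sum((x - x_bar) ** 2 for x in data)   # sum of squared deviations
sample_variance = sum_sq / (n - 1)             # divide by n - 1, not n
s = sqrt(sample_variance)                      # sample standard deviation

print(f"mean = {x_bar}, sum of squares = {sum_sq}, S = {s:.4f}")
print(f"statistics.stdev gives {stdev(data):.4f}")
```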
Measuring variation
Small standard deviation
[Figure: dot plots of Data A, Data B, and Data C on an 11 to 21 axis]
Data A: Mean = 15.5, S = 3.338
Data B: Mean = 15.5, S = 0.926
Data C: Mean = 15.5, S = 4.567
Coefficient of Variation
$CV = \left(\frac{S}{\bar{X}}\right) \cdot 100\%$
Comparing Coefficient of Variation
Stock A:
Average price last year = $50
Standard deviation = $5
$CV_A = \left(\frac{S}{\bar{X}}\right) \cdot 100\% = \frac{\$5}{\$50} \cdot 100\% = 10\%$
Stock B:
Average price last year = $100
Standard deviation = $5
$CV_B = \left(\frac{S}{\bar{X}}\right) \cdot 100\% = \frac{\$5}{\$100} \cdot 100\% = 5\%$
Both stocks have the same standard deviation, but stock B is less variable relative to its price
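The stock comparison can be reproduced with a small helper. This is a sketch only; the function name `coefficient_of_variation` is illustrative.

```python
def coefficient_of_variation(std_dev, mean_value):
    """CV = (S / mean) * 100, expressed as a percentage."""
    return 100 * std_dev / mean_value

# Both stocks have a standard deviation of $5 but different average prices.
cv_a = coefficient_of_variation(5, 50)    # Stock A
cv_b = coefficient_of_variation(5, 100)   # Stock B

print(f"CV_A = {cv_a:.0f}%, CV_B = {cv_b:.0f}%")
```

Because the CV divides by the mean, it is unit free and allows variability to be compared across data measured on different scales.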
Z Scores
$Z = \frac{X - \bar{X}}{S}$
Z Scores (continued)
Example: for X = 18.5, $\bar{X}$ = 14.0, and S = 3.0,
$Z = \frac{X - \bar{X}}{S} = \frac{18.5 - 14.0}{3.0} = 1.5$
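The Z score example can be expressed as a one-line function. This is a minimal sketch with an illustrative function name.

```python
def z_score(x, mean_value, std_dev):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean_value) / std_dev

# Values from the example: X = 18.5, mean = 14.0, S = 3.0.
z = z_score(18.5, 14.0, 3.0)
print(z)
```

A positive Z score means the value lies above the mean; here 18.5 is 1.5 standard deviations above 14.0.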
Shape of a Distribution
Measures of shape
Symmetric or skewed
Left-Skewed: Mean < Median
Symmetric: Mean = Median
Right-Skewed: Median < Mean
Using Minitab
1. Select Stat / Basic Statistics / Display Descriptive Statistics
3. Click OK
[Figure: Descriptive Statistics Options dialog and Minitab output]
Numerical Measures for a Population
Population mean: $\mu = \frac{\sum_{i=1}^{N} X_i}{N} = \frac{X_1 + X_2 + \cdots + X_N}{N}$
Where
$\mu$ = population mean
N = population size
$X_i$ = ith value of the variable X
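Population Variance
Population variance: $\sigma^2 = \frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}$
Where
$\mu$ = population mean
N = population size
$X_i$ = ith value of the variable X
Population standard deviation: $\sigma = \sqrt{\frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}}$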
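The population formulas divide by N, while the sample formulas divide by n - 1. The sketch below illustrates the difference by treating the eight values from the earlier standard deviation example as if they were a complete population. That treatment is an assumption made purely for illustration.

```python
from math import sqrt
from statistics import pvariance, variance

# Eight values, treated here as an entire population (an assumption).
values = [10, 12, 14, 15, 17, 18, 18, 24]

N = len(values)
mu = sum(values) / N                              # population mean
pop_var = sum((x - mu) ** 2 for x in values) / N  # divide by N
pop_std = sqrt(pop_var)

# Cross-check against the standard library and compare with the n - 1 version.
print(f"population variance = {pop_var}, pvariance = {pvariance(values)}")
print(f"sample variance (n - 1) = {variance(values):.4f}")
```

Dividing by N gives a smaller value than dividing by n - 1 for the same data, so it matters which formula applies.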
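The Empirical Rule
For bell-shaped distributions, approximately:
68% of the observations fall within 1 standard deviation of the mean ($\mu \pm 1\sigma$)
95% of the observations fall within 2 standard deviations of the mean ($\mu \pm 2\sigma$)
99.7% of the observations fall within 3 standard deviations of the mean ($\mu \pm 3\sigma$)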
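Chebyshev Rule
Regardless of how the data are distributed, at least $(1 - 1/k^2) \times 100\%$ of the values fall within k standard deviations of the mean (for k > 1)
Examples:
At least 75% within k = 2 ($\mu \pm 2\sigma$)
At least 89% within k = 3 ($\mu \pm 3\sigma$)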
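Chebyshev's inequality guarantees that, for any distribution and any k > 1, at least 1 - 1/k² of the values lie within k standard deviations of the mean. The sketch below tabulates that lower bound for a few values of k; the function name is illustrative.

```python
def chebyshev_lower_bound(k):
    """Minimum fraction of values within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("the bound is only informative for k > 1")
    return 1 - 1 / k**2

bounds = {k: chebyshev_lower_bound(k) for k in (2, 3, 4)}

for k, bound in bounds.items():
    print(f"k = {k}: at least {bound:.1%} of the values")
```

These bounds are weaker than the 95% and 99.7% figures quoted for bell-shaped data because they must hold for every possible distribution shape.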
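Approximating the mean and standard deviation from a frequency distribution:
$\bar{X} = \frac{\sum_{j=1}^{c} m_j f_j}{n}$
$S = \sqrt{\frac{\sum_{j=1}^{c} (m_j - \bar{X})^2 f_j}{n - 1}}$
Where
n = sample size
c = number of classes
$m_j$ = midpoint of the jth class
$f_j$ = frequency of the jth class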
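Example:
[Figure: box-and-whisker plot marking Minimum, 1st Quartile, Median, 3rd Quartile, and Maximum, with 25% of the data in each segment]
[Figure: box-and-whisker plots for left-skewed, symmetric, and right-skewed distributions, showing the relative positions of Q1, Q2, and Q3]
[Figure: box-and-whisker plot example with Min = 0, Q1 = 2, Q2 = 3, Q3 = 5, Max = 27]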
Using Minitab
1. Select Graph / Boxplot
3. Click OK
Minitab output:
Minitab displays outliers with a * symbol
Data: 27, 10, 5, 5, 4, 3, 3, 2, 2, 2, 0
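For the eleven data values listed above, the five-number summary behind a box-and-whisker plot can be sketched as follows. The code assumes the (n+1)-based quartile positions given earlier in the chapter; the helper names are illustrative, and rounding positions that are neither whole nor ending in .5 is an assumption not stated in the slides.

```python
def ranked_value(ranked, position):
    """Value at a 1-based position in ranked data.

    Whole positions return that value; positions ending in .5 average the two
    neighbours; other fractions are rounded (an assumption).
    """
    lower = int(position)
    frac = position - lower
    if frac == 0:
        return ranked[lower - 1]
    if frac == 0.5:
        return (ranked[lower - 1] + ranked[lower]) / 2
    return ranked[round(position) - 1]


def five_number_summary(data):
    """Return (minimum, Q1, median, Q3, maximum)."""
    ranked = sorted(data)
    n = len(ranked)
    return (
        ranked[0],
        ranked_value(ranked, (n + 1) / 4),
        ranked_value(ranked, (n + 1) / 2),
        ranked_value(ranked, 3 * (n + 1) / 4),
        ranked[-1],
    )


data = [27, 10, 5, 5, 4, 3, 3, 2, 2, 2, 0]
summary = five_number_summary(data)
print(summary)
```

The long distance from Q3 to the maximum, compared with the short distance from the minimum to Q1, is what a right-skewed box-and-whisker plot looks like in numbers.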
$\text{cov}(X, Y) = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{n - 1}$
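Interpreting Covariance
cov(X,Y) > 0: X and Y tend to move in the same direction
cov(X,Y) < 0: X and Y tend to move in opposite directions
cov(X,Y) = 0: no linear relationship between X and Y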
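Coefficient of Correlation
$r = \frac{\text{cov}(X, Y)}{S_X S_Y}$
where
$\text{cov}(X, Y) = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{n - 1}$
$S_X = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}}$
$S_Y = \sqrt{\frac{\sum_{i=1}^{n} (Y_i - \bar{Y})^2}{n - 1}}$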
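The covariance and correlation formulas translate directly into code. The paired data below are hypothetical (the slides' own data are not preserved in this text), and the function names are illustrative.

```python
from math import sqrt

def sample_covariance(xs, ys):
    """Sum of (X - X_bar)(Y - Y_bar) divided by n - 1."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)

def sample_std(values):
    """Sample standard deviation with the n - 1 divisor."""
    n = len(values)
    v_bar = sum(values) / n
    return sqrt(sum((v - v_bar) ** 2 for v in values) / (n - 1))

def correlation(xs, ys):
    """r = cov(X, Y) / (S_X * S_Y)."""
    return sample_covariance(xs, ys) / (sample_std(xs) * sample_std(ys))

# Hypothetical paired observations, not taken from the slides.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

cov_xy = sample_covariance(x, y)
r = correlation(x, y)
print(f"cov = {cov_xy}, r = {r:.4f}")
```

Unlike the covariance, r is unit free and always lies between -1 and +1, so it can be compared across different pairs of variables.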
Features of
Correlation Coefficient, r
Unit free
[Figure: scatter plots of Y against X for r = -1, r = -.6, r = 0, r = +.3, r = +1, and r = 0]
Select Stat / Basic Statistics / Correlation
Click OK
Minitab output: r = .733
There is a relatively strong positive linear relationship between test score #1 and test score #2
Pitfalls in Numerical
Descriptive Measures
Ethical Considerations
Numerical descriptive measures:
Chapter Summary
Discussed quartiles