Numerical
Descriptive Measures
PowerPoint to accompany:
Learning Objectives
After studying this Chapter you should have a better
understanding of:
How to calculate and interpret numerical descriptive
measures of central tendency, variation and shape for
numerical data
How to calculate and interpret descriptive summary
measures for a population
How to construct and interpret a box-and-whisker plot
How to calculate and interpret the covariance and the
coefficient of correlation for bivariate data
2 Copyright 2013 Pearson Australia (a division of Pearson Australia Group Pty Ltd) 9781442549272/Berenson/Business Statistics /2e
Describing Data
Describing data by its central tendency, variation and shape
Central Tendency: Arithmetic Mean, Median, Mode, Geometric Mean
Quartiles
Variation: Range, Interquartile Range, Variance, Standard Deviation, Coefficient of Variation
Shape: Skewness
Measures of Central Tendency
Central Tendency
Arithmetic Mean: $\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n}$
Median: Midpoint of ranked values
Mode: Most frequently observed value
The Arithmetic Mean
For a sample of size n, the sample mean, denoted $\bar{X}$, is calculated:
$$\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n} = \frac{X_1 + X_2 + \cdots + X_n}{n}$$
Where $\sum$ means to sum or add up, and the $X_i$ are the observed values.
The Median
In an ordered array, the median is the middle
number (50% above, 50% below).
Its main advantage over the arithmetic mean is that
it is not affected by extreme values.
[Figure: two dot plots on an axis from 0 to 10, each with Median = 3]
Finding the Median
The location of the median:
$$\text{Median} = \frac{n + 1}{2}\ \text{ranked value}$$
Note that $\frac{n + 1}{2}$ is not the value of the median, only the position of the median in the ranked data.
Rule 1: if the number of values in the data set is odd, the median is the middle ranked value.
Rule 2: if the number of values in the data set is even, the median is the mean (average) of the two middle ranked values.
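The position rule and the two rules for odd and even n can be sketched in Python. This is a minimal illustration, not part of the original slides: the function name `median_by_rank` and the sample values are assumptions made for the example.

```python
def median_by_rank(values):
    """Median found through the (n + 1) / 2 ranked-position rule."""
    ranked = sorted(values)
    n = len(ranked)
    if n == 0:
        raise ValueError("median of empty data is undefined")
    position = (n + 1) / 2  # 1-based position in the ranked data, not the median itself
    if n % 2 == 1:
        # Rule 1: odd n, the position is a whole number
        return ranked[int(position) - 1]
    # Rule 2: even n, average the two ranked values either side of the position
    lower = ranked[int(position) - 1]
    upper = ranked[int(position)]
    return (lower + upper) / 2


# Illustrative (hypothetical) data
odd_result = median_by_rank([7, 1, 3])       # ranked: 1 3 7, position 2
even_result = median_by_rank([4, 1, 3, 8])   # ranked: 1 3 4 8, position 2.5
print(odd_result, even_result)
```

For even n the position is a half value (2.5 in the second call), which is why the two neighbouring ranked values are averaged.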
The Mode
A measure of central tendency
Value that occurs most often (the most frequent)
Not affected by extreme values
Unlike mean and median, there may be no unique (single) mode for
a given data set
Used for either numerical or categorical (nominal) data
An example of no mode:
[Figure: dot plot on an axis from 0 to 6; no mode]
An example of several modes:
[Figure: dot plot on an axis from 0 to 14; Modes = 5 and 9]
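A short Python sketch of the three cases (no mode, several modes, categorical data). The function name `modes` and all data values are hypothetical, and treating "every value occurs exactly once" as "no mode" is an assumed convention for this example.

```python
from collections import Counter


def modes(values):
    """Return a sorted list of the most frequent values; empty list if no mode."""
    counts = Counter(values)
    if not counts:
        return []
    highest = max(counts.values())
    if highest == 1:
        # Assumed convention: every value occurs once, so there is no mode
        return []
    return sorted(v for v, c in counts.items() if c == highest)


no_mode = modes([0, 1, 2, 3, 4, 5, 6])
several_modes = modes([1, 5, 5, 5, 7, 9, 9, 9, 12])
categorical = modes(["red", "blue", "red", "green"])
print(no_mode, several_modes, categorical)
```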
Review Example
Prices for 5 houses located near the beach:
$2,000,000
$500,000
$300,000
$100,000
$100,000
Review Example
House Prices:
$1,000,000
$500,000
$300,000
$100,000
$100,000
Mean = (sum of the five prices) / 5
Median (position = 6/2 = 3) = $300,000
Mode = $100,000
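The review example can be checked with Python's standard `statistics` module. The slides list the highest price as $2,000,000 on one slide and $1,000,000 on the next, so this sketch computes the measures for both values rather than assuming which one was intended. The variable names are illustrative.

```python
from statistics import mean, median, mode

# The four lower prices are the same on both slides
lower_prices = [100_000, 100_000, 300_000, 500_000]

results = {}
for top_price in (2_000_000, 1_000_000):
    prices = lower_prices + [top_price]
    results[top_price] = (mean(prices), median(prices), mode(prices))

for top_price, (avg, mid, most_common) in results.items():
    print(f"top price {top_price:,}: mean={avg:,} median={mid:,} mode={most_common:,}")
```

Only the mean changes with the highest price; the median and mode are the same in both cases, which illustrates that they are not affected by the extreme value.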
The Interquartile Range (IQR)
$$\text{IQR} = Q_3 - Q_1$$
[Figure: number line from X minimum to X maximum, divided by Q1, Q2 = Median and Q3 into four segments of 25% each]
Example: X minimum = 10, Q1 = 30, Q2 = 45, Q3 = 60, X maximum = 200
Range = 200 - 10 = 190 (Misleading)
IQR = 60 - 30 = 30
Even if the value of 200 changes to 300, IQR remains
the same, hence resistant to changes in extreme values.
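A small Python sketch of this comparison, using the summary values from the example (minimum 10, Q1 30, Q3 60, maximum 200). The helper name `range_and_iqr` is an assumption for illustration.

```python
def range_and_iqr(minimum, q1, q3, maximum):
    """Return (range, interquartile range) from summary values."""
    return maximum - minimum, q3 - q1


original = range_and_iqr(10, 30, 60, 200)
changed = range_and_iqr(10, 30, 60, 300)   # only the maximum changes
print(original, changed)
```

The range moves from 190 to 290 while the IQR stays at 30.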
The Sample Variance $S^2$
Measures average scatter around the mean
Units are also squared
$$S^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}$$
Where
$\bar{X}$ = mean
n = sample size
$X_i$ = i-th value of the variable X
The Sample Standard Deviation $S$
Most commonly used measure of variation
Shows variation about the mean
Has the same units as the original data
$$S = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}}$$
Calculation Example: Sample Standard Deviation
Sample Data ($X_i$): 10 12 14 15 17 18 18 24
n = 8, Mean = $\bar{X}$ = 16
$$S = \sqrt{\frac{(10 - \bar{X})^2 + (12 - \bar{X})^2 + (14 - \bar{X})^2 + \cdots + (24 - \bar{X})^2}{n - 1}}$$
$$= \sqrt{\frac{(10 - 16)^2 + (12 - 16)^2 + (14 - 16)^2 + \cdots + (24 - 16)^2}{8 - 1}} = \sqrt{\frac{130}{7}} = 4.3095$$
A measure of the average scatter around the mean
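The same calculation can be followed step by step in Python. This sketch uses the eight sample values from the slide and compares the hand calculation with `statistics.stdev`, which also divides by n - 1. The variable names are illustrative.

```python
from math import sqrt
from statistics import stdev

data = [10, 12, 14, 15, 17, 18, 18, 24]

n = len(data)
x_bar = sum(data) / n                               # sample mean
sum_squares = sum((x - x_bar) ** 2 for x in data)   # sum of squared deviations
s_squared = sum_squares / (n - 1)                   # sample variance
s = sqrt(s_squared)                                 # sample standard deviation

print(x_bar, sum_squares, round(s_squared, 4), round(s, 4))
print(round(stdev(data), 4))                        # library result for comparison
```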
Measuring Variation
[Figure: two distributions compared, one with a small standard deviation and one with a large standard deviation]
Comparing Standard Deviations
[Figure: three dot plots on a common axis from 11 to 21]
Data A: Mean = 15.5, S = 3.338
Data B: Mean = 15.5, S = 0.926
Data C: Mean = 15.5, S = 4.567
Variance and Standard Deviation
Advantages
Each value in the data set is used in the calculation
Values far from the mean are given extra weight as
deviations from the mean are squared
Disadvantages
Sensitive to extreme values (outliers)
Measure absolute variation, not relative variation
The Coefficient of Variation
Measures relative variation i.e. shows variation
relative to mean.
Can be used to compare two or more sets of data
measured in different units.
Always expressed as a percentage (%).
$$CV = \left(\frac{S}{\bar{X}}\right) \cdot 100\%$$
Coefficient of Variation Example
Stock A:
Average price last year = $50; standard deviation = $5
$$CV_A = \left(\frac{S}{\bar{X}}\right) \cdot 100\% = \frac{\$5}{\$50} \cdot 100\% = 10\%$$
Stock B:
Average price last year = $100; standard deviation = $5
$$CV_B = \left(\frac{S}{\bar{X}}\right) \cdot 100\% = \frac{\$5}{\$100} \cdot 100\% = 5\%$$
Both stocks have the same standard deviation, but stock B is less variable relative to its price.
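A minimal Python sketch of the stock comparison, using the figures from the slide. The function name `coefficient_of_variation` is an assumption for illustration.

```python
def coefficient_of_variation(std_dev, mean):
    """Standard deviation as a percentage of the mean."""
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return std_dev * 100 / mean


cv_a = coefficient_of_variation(5, 50)    # Stock A
cv_b = coefficient_of_variation(5, 100)   # Stock B
print(f"CV_A = {cv_a}%  CV_B = {cv_b}%")
```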
The Z Score
The difference between a given observation and the mean, divided by the standard deviation.
$$Z = \frac{X - \bar{X}}{S}$$
E.g. a Z score of 2.0 means that a value is 2.0 standard deviations from the mean.
A Z score above 3.0 or below -3.0 is considered an outlier.
Z Score Example
If the mean is 14.0 and the standard deviation is 3.0, what is the Z score for the value 18.5?
$$Z = \frac{X - \bar{X}}{S} = \frac{18.5 - 14.0}{3.0} = 1.5$$
The value 18.5 is 1.5 standard deviations above the mean.
A negative Z score would indicate that a value is below the mean.
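The example can be reproduced in a few lines of Python. The function name `z_score` is illustrative, and the outlier check applies the cut-off of 3.0 standard deviations in either direction.

```python
def z_score(value, mean, std_dev):
    """Number of standard deviations a value lies from the mean."""
    return (value - mean) / std_dev


z = z_score(18.5, 14.0, 3.0)
is_outlier = abs(z) > 3.0      # beyond 3 standard deviations either side
z_below = z_score(9.5, 14.0, 3.0)   # hypothetical value below the mean

print(z, is_outlier, z_below)
```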
The Shape of a Distribution
Describes how data are distributed.
Measures of shape: symmetric or skewed.
Left-Skewed: Mean < Median
Symmetric: Mean = Median
Right-Skewed: Median < Mean
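The mean-versus-median comparison can be sketched in Python as a rough rule of thumb. The function name `shape_from_mean_and_median` and the three data sets are hypothetical. Equal mean and median suggest symmetry but do not prove it.

```python
from statistics import mean, median


def shape_from_mean_and_median(values):
    """Classify shape by comparing mean and median (rule of thumb only)."""
    m, med = mean(values), median(values)
    if m < med:
        return "left-skewed"
    if m > med:
        return "right-skewed"
    return "symmetric"


# Hypothetical data sets
symmetric = shape_from_mean_and_median([1, 2, 3, 4, 5])
right = shape_from_mean_and_median([1, 2, 3, 4, 20])     # long tail to the right
left = shape_from_mean_and_median([1, 17, 18, 19, 20])   # long tail to the left
print(symmetric, right, left)
```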
Using Microsoft Excel
Use menu choice: Data / Data Analysis / Descriptive Statistics
Numerical Measures for a Population
Population summary measures are called
parameters.
The population mean, $\mu$, is the sum of the values in the population divided by the population size, N.
$$\mu = \frac{\sum_{i=1}^{N} X_i}{N} = \frac{X_1 + X_2 + \cdots + X_N}{N}$$
Population Variance vs. Standard Deviation
Population variance:
The average of the squared deviations of values from the mean
$$\sigma^2 = \frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}$$
Population Standard Deviation:
Shows variation about the mean
Is the square root of the population variance
Has the same units as the original data
$$\sigma = \sqrt{\frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}}$$
Where $\mu$ = population mean; N = population size; $X_i$ = i-th value of the variable X
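To show the effect of dividing by N rather than n - 1, the sketch below treats eight values as if they were an entire population. That treatment is an assumption made purely for illustration. `statistics.pvariance` and `statistics.pstdev` are the standard-library population versions.

```python
from statistics import pstdev, pvariance, variance

# Assumption: these eight values form the whole population
values = [10, 12, 14, 15, 17, 18, 18, 24]

N = len(values)
mu = sum(values) / N                                 # population mean
sigma_squared = sum((x - mu) ** 2 for x in values) / N   # divide by N
sigma = sigma_squared ** 0.5

print(mu, sigma_squared, round(sigma, 4))
print(pvariance(values), round(pstdev(values), 4))   # library results
print(round(variance(values), 4))                    # sample version, divides by n - 1
```

The sample variance (divisor n - 1) is larger than the population variance (divisor N) for the same values.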
The Empirical Rule
If the data distribution is approximately bell-shaped, then the interval $\mu \pm 1\sigma$ contains about 68% of the values in the population.
[Figure: bell-shaped curve with 68% of the area between $\mu - 1\sigma$ and $\mu + 1\sigma$]
The Empirical Rule
$\mu \pm 2\sigma$ contains about 95% of the values in the population
$\mu \pm 3\sigma$ contains about 99.7% of the values in the population
[Figure: bell-shaped curves with 95% of the area within $\mu \pm 2\sigma$ and 99.7% within $\mu \pm 3\sigma$]
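The 68%, 95% and 99.7% figures can be checked against an exact normal distribution, which is an assumption here because the rule itself only requires approximately bell-shaped data. For a normal variable the share within k standard deviations equals erf(k / sqrt(2)). The function name is illustrative.

```python
from math import erf, sqrt


def normal_share_within(k):
    """Share of a normal distribution within k standard deviations of the mean."""
    return erf(k / sqrt(2))


shares = {k: 100 * normal_share_within(k) for k in (1, 2, 3)}
for k, percent in shares.items():
    print(f"within {k} standard deviation(s): {percent:.1f}%")
```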
Chebyshev Rule and Examples
Regardless of how the data are distributed, the percentage of values within k standard deviations of the mean must be at least:
$$\left(1 - \frac{1}{k^2}\right) \times 100\% \quad (\text{for } k > 1)$$
Within k = 1 ($\mu \pm 1\sigma$): at least $(1 - 1/1^2) \times 100\% = 0\%$
Within k = 2 ($\mu \pm 2\sigma$): at least $(1 - 1/2^2) \times 100\% = 75\%$
Within k = 3 ($\mu \pm 3\sigma$): at least $(1 - 1/3^2) \times 100\% \approx 89\%$
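A short Python sketch of the bound. The function name `chebyshev_min_percent` is an assumption, and k = 1 is allowed here only to reproduce the 0% row of the examples (the bound is informative only for k > 1).

```python
def chebyshev_min_percent(k):
    """Minimum percentage of values within k standard deviations of the mean."""
    if k < 1:
        raise ValueError("the bound is only meaningful for k >= 1")
    return (1 - 1 / k ** 2) * 100


bounds = {k: chebyshev_min_percent(k) for k in (1, 2, 3)}
for k, percent in bounds.items():
    print(f"k = {k}: at least {percent:.1f}%")
```

Compared with the empirical rule (about 95% within two standard deviations for bell-shaped data), the Chebyshev bound of 75% is weaker because it holds for any distribution.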
Approximating the Mean
Sometimes only a frequency distribution is available, not the raw data.
Use the midpoint of a class interval to approximate the values in that class.
$$\bar{X} = \frac{\sum_{j=1}^{c} m_j f_j}{n}$$
Where n = number of values or sample size
c = number of classes in the frequency distribution
$m_j$ = midpoint of the j-th class
$f_j$ = number of values in the j-th class
Approximating the Standard Deviation
Assume that all values within each class interval are located at the midpoint of the class.
$$S = \sqrt{\frac{\sum_{j=1}^{c} (m_j - \bar{X})^2 f_j}{n - 1}}$$
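Both approximations can be sketched in Python for a frequency distribution. The class midpoints and frequencies below are hypothetical values chosen for illustration; they do not come from the slides.

```python
from math import sqrt

# Hypothetical frequency distribution: class midpoints and class frequencies
midpoints = [15, 25, 35, 45, 55]
frequencies = [3, 6, 5, 4, 2]

n = sum(frequencies)   # sample size

# Approximate mean: each class contributes midpoint * frequency
approx_mean = sum(m * f for m, f in zip(midpoints, frequencies)) / n

# Approximate standard deviation: squared deviations of midpoints, weighted by frequency
weighted_squares = sum((m - approx_mean) ** 2 * f for m, f in zip(midpoints, frequencies))
approx_sd = sqrt(weighted_squares / (n - 1))

print(n, approx_mean, round(approx_sd, 3))
```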
Exploratory Data Analysis
Box-and-Whisker Plot: A graphical display of data using the 5-number summary:
Minimum ($X_{smallest}$), Q1, Median, Q3, Maximum ($X_{largest}$)
[Figure: box-and-whisker plot from X minimum to X maximum, with Q1, Q2 = Median and Q3 dividing the data into four segments of 25% each]
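A five-number summary can be sketched with `statistics.quantiles` (Python 3.8 or later). Quartile conventions differ between textbooks and software, so the `method="exclusive"` choice (positions based on n + 1, with interpolation) is an assumption and may not match the rounding rules used elsewhere in the chapter. The data values are illustrative.

```python
from statistics import quantiles

data = sorted([10, 12, 14, 15, 17, 18, 18, 24])

# Quartiles by the "exclusive" method: positions at (n + 1) * p, interpolated
q1, q2, q3 = quantiles(data, n=4, method="exclusive")

five_number_summary = (data[0], q1, q2, q3, data[-1])
iqr = q3 - q1

print(five_number_summary, iqr)
```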
Distribution Shape and BoxandWhisker Plot
[Figure: box-and-whisker plots for left-skewed, symmetric and right-skewed distributions, each marking Q1, Q2 and Q3]
The Covariance
The sample covariance measures the strength of
the linear relationship between two numerical
variables.
Only concerned with the direction of the relationship
No causal effect is implied
Is affected by units of measurement
$$\text{cov}(X, Y) = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{n - 1}$$
Correlation
Measures the relative strength of the linear relationship between two variables
$$r = \frac{\text{cov}(X, Y)}{S_X S_Y}$$
Where
$$\text{cov}(X, Y) = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{n - 1}$$
$$S_X = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}}$$
$$S_Y = \sqrt{\frac{\sum_{i=1}^{n} (Y_i - \bar{Y})^2}{n - 1}}$$
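The covariance and correlation formulas can be sketched in plain Python. The function names and the paired data are hypothetical. The last lines rescale X by 100 to show that the covariance depends on the units of measurement while r does not.

```python
from math import sqrt


def sample_covariance(xs, ys):
    """Sample covariance with divisor n - 1."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 values")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)


def sample_std(xs):
    """Sample standard deviation: square root of cov(X, X)."""
    return sqrt(sample_covariance(xs, xs))


def correlation(xs, ys):
    """Coefficient of correlation r = cov(X, Y) / (S_X * S_Y)."""
    return sample_covariance(xs, ys) / (sample_std(xs) * sample_std(ys))


# Hypothetical paired data
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

cov_xy = sample_covariance(x, y)
r = correlation(x, y)

# Same X measured in units 100 times smaller
x_rescaled = [100 * v for v in x]
cov_rescaled = sample_covariance(x_rescaled, y)
r_rescaled = correlation(x_rescaled, y)

print(cov_xy, round(r, 4), cov_rescaled, round(r_rescaled, 4))
```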
Features of Correlation Coefficient, r
Also called Standardised Covariance,
i.e. invariant to units of measure.
Ranges between -1 and +1:
The closer to -1, the stronger the negative linear relationship.
The closer to +1, the stronger the positive linear relationship.
The closer to 0, the weaker the linear relationship.
Scatter Plots of Data with Various Correlation
Coefficients
[Figure: scatter plots of Y against X for r = -1, r = -0.6, r = 0, r = +0.3 and r = +1, plus a second panel with r = 0]
Industry Application
Skyscrapers 'linked with impending financial crashes'
There is an "unhealthy correlation" between the building of skyscrapers and
subsequent financial crashes, according to Barclays Capital.
Examples include the Empire State building, built as the Great Depression
was under way, and the current world's tallest, the Burj Khalifa, built just
before Dubai almost went bust.
China is currently the biggest builder of skyscrapers, the bank said.
India also has 14 skyscrapers under construction.
"Often the world's tallest buildings are simply the edifice of a broader
skyscraper building boom, reflecting a widespread misallocation of capital
and an impending economic correction," Barclays Capital analysts said.
(source: http://www.bbc.co.uk/news/business-16494013)
Pitfalls and Ethical Issues
Data analysis is objective.
Should report the summary measures that best meet the
assumptions about the data set
Data interpretation is subjective.
Should be done in a fair, neutral and transparent manner
Should document both good and bad results.
Results should be presented in a fair, objective and neutral manner.
Should not use inappropriate summary measures to distort facts.
Do not fail to report pertinent findings even if such findings do not
support original argument.
Chapter Summary
Described measures of central tendency.
Mean, median, mode, geometric mean
Described quartiles.
Described measures of variation.
Range, interquartile range, variance and standard deviation,
coefficient of variation, Z scores
Illustrated shape of distribution.
Symmetric, skewed, box-and-whisker plots
Discussed covariance and correlation coefficient.
Addressed pitfalls in numerical descriptive measures and ethical
considerations.