
FIN5SBF

Topic 1:

Time Value of Money
(Part I)
Associate Professor Ishaq Bhatti
La Trobe Business School

E-Mail: i.bhatti@latrobe.edu.au

Slides have been drafted by the La Trobe University, School of
Business based on DeFusco et al (2007)
Statistics for Business and Finance
Chapter 1
Time Value of Money
1.2
1. INTRODUCTION
What is the time value of money?
Is $1 today equal to $1 tomorrow?
Would you agree to pay $500 to a friend
and receive $500 back 1 year from now?
Would you agree to pay $500 to a friend
and receive $1000 back 1 year from now?
2. INTEREST RATES
Would you agree to receive $9,500 now
and pay $10,000 now?
What if you receive $9,500 now and pay
$10,000 one year from now?
An interest rate is a rate of return that
reflects the relationship between
differently dated cash flows.
How much is your return in the previous
example?


Interest rate may be referred to as:
Required rate of return:
The minimum rate of return an investor must receive in
order to accept the investment.
Discount rate:
The rate we use to discount future cash flows.
Opportunity cost:
Value that investors forgo by choosing a particular course of
action.
3. FUTURE VALUE OF A SINGLE CASH FLOW
You invest $9,500 now and receive $10,000
one year from now.
This $10,000 includes the initial $9,500 plus
$500 interest on that.
Future Value is equal to the Present Value of
the investment plus interest on the investment.
\[ FV = PV + r \cdot PV = PV(1 + r) \]
3. FUTURE VALUE OF A SINGLE CASH FLOW
Now what if you invest that money for one
more year?
\[ FV = PV(1 + r) + r\,[PV(1 + r)] = PV(1 + r)^2 \]
Hence, the formula for the future value of a single
cash flow after N periods is:
\[ FV = PV(1 + r)^N \]
Example 1: FV of a Lump Sum
An institution promises to pay you a lump sum,
six years from now at an 8% annual interest
rate, if you invest $2,500,000 today.
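The lump-sum formula FV = PV(1 + r)^N can be checked numerically. A minimal sketch; the function name `future_value` is illustrative and not part of the slides:

```python
def future_value(pv, r, n):
    """Future value of a single cash flow: FV = PV * (1 + r) ** n."""
    return pv * (1 + r) ** n

# Example 1: $2,500,000 invested for 6 years at 8% compounded annually
fv_annual = future_value(2_500_000, 0.08, 6)
print(f"FV = ${fv_annual:,.0f}")
```

The result should agree with the annual-compounding figure of $3,967,186 quoted in section 3.3.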
3.1 The Frequency of Compounding
Some investments pay interest more than
once a year.
Financial institutions often quote an annual
interest rate.
If your bank states the annual interest rate is
8%, compounded monthly, how much is the
monthly interest rate?
With more than one compounding period per
year, the future value formula can be
expressed as:
\[ FV = PV\left(1 + \frac{r_s}{m}\right)^{mN} \]
Example 2: FV of a Lump Sum with Monthly Compounding
An investment has a six-year maturity and
annual quoted interest rate is 8% compounded
monthly. FV if you invest $2,500,000 is:
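A minimal sketch of this calculation, assuming r_s is the stated annual rate, m the number of compounding periods per year and N the number of years (the function name is illustrative):

```python
def future_value_periodic(pv, stated_rate, periods_per_year, years):
    """FV = PV * (1 + r_s / m) ** (m * N)."""
    periodic_rate = stated_rate / periods_per_year
    return pv * (1 + periodic_rate) ** (periods_per_year * years)

# Example 2: 8% stated annual rate, compounded monthly, for 6 years
fv_monthly = future_value_periodic(2_500_000, 0.08, 12, 6)
print(f"FV = ${fv_monthly:,.0f}")
```

The monthly rate is 8%/12, about 0.667%, and there are 72 compounding periods.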
3.2 Continuous Compounding
If the number of compounding periods per
year becomes infinite, then interest is said to
compound continuously.
Formula for FV of a sum in N years with
continuous compounding is:
\[ FV = PV\,e^{r_s N}, \quad \text{in which } e \approx 2.7182818 \]
Example 3: FV of a Lump Sum with Continuous
Compounding
An investment has a six-year maturity and
annual quoted interest rate is 8% compounded
continuously. FV if you invest $2,500,000 is:
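The continuous-compounding case can be sketched the same way, using `math.exp` for e raised to r_s times N (the function name is illustrative):

```python
import math

def future_value_continuous(pv, stated_rate, years):
    """FV = PV * e ** (r_s * N)."""
    return pv * math.exp(stated_rate * years)

# Example 3: 8% stated annual rate, compounded continuously, for 6 years
fv_continuous = future_value_continuous(2_500_000, 0.08, 6)
print(f"FV = ${fv_continuous:,.0f}")
```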
3.3 Stated and Effective Rates
In Examples 1-3 the stated interest rate and the
maturity were the same (8% and 6 years), yet the
future values differ:
PV = $2,500,000
FV Compounded annually = $3,967,186
FV Compounded Monthly = $4,033,755
FV Compounded Continuously = $4,040,186
Examples 1-3 illustrate that when interest is
compounded more than once a year (monthly or continuously),
the effective annual rate exceeds the stated rate.
The effective annual rate is calculated as:
\[ EAR = (1 + \text{Periodic interest rate})^m - 1 \]
Or, for continuous compounding:
\[ EAR = e^{r_s} - 1 \]
Example 4: Effective Annual Rate
An investment has a six-year maturity and annual
quoted interest rate is 8%. What is the EAR if interest
rate is compounded monthly? What if it is
compounded continuously:


For continuous compounding:
8% = r
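A short sketch of both EAR calculations for the 8% stated rate; the function names are illustrative:

```python
import math

def ear_periodic(stated_rate, periods_per_year):
    """EAR = (1 + periodic rate) ** m - 1."""
    return (1 + stated_rate / periods_per_year) ** periods_per_year - 1

def ear_continuous(stated_rate):
    """EAR = e ** r_s - 1."""
    return math.exp(stated_rate) - 1

ear_monthly = ear_periodic(0.08, 12)
ear_cont = ear_continuous(0.08)
print(f"Monthly compounding:    {ear_monthly:.4%}")
print(f"Continuous compounding: {ear_cont:.4%}")
```

Both effective rates come out a little above 8.3%, with continuous compounding giving the higher of the two.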
4. FUTURE VALUE OF A SERIES OF
CASH FLOWS
Common terms used in this topic:

Annuity: a finite set of sequential cash flows

Ordinary annuity: first cash flow occurs one year from now

Annuity due: first cash flow occurs immediately (at t=0)

Perpetuity: an infinite set of cash flows beginning one year
from now.

Consider an ordinary annuity paying 5%
annually. Suppose we have 5 annual deposits
of $100 starting one year from now.

We are interested in FV of this ordinary
annuity.
Timeline: t = 0 (now), 1, 2, 3, 4, 5
A $100 deposit is made at each of t = 1, 2, 3, 4 and 5
FV at t = 5 = ?
Total FV is equal to the sum of FV of every
single payment:

Each deposit earns interest only from its own date until t = 5, so the
payment made at time t compounds for 5 - t years:

FV of 1st payment: \( FV = \$100(1 + 0.05)^4 \)

FV of 2nd payment: \( FV = \$100(1 + 0.05)^3 \)

FV of 3rd payment: \( FV = \$100(1 + 0.05)^2 \)

FV of 4th payment: \( FV = \$100(1 + 0.05)^1 \)

FV of 5th payment: \( FV = \$100(1 + 0.05)^0 = \$100 \)

But if all the payments are equal, we can
arrive at a general annuity formula:
\[ FV = A\left[\frac{(1 + r)^N - 1}{r}\right] \]
Hence, in this example we have:
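A minimal sketch that evaluates the ordinary-annuity formula FV = A[((1 + r)^N - 1)/r] for this example and cross-checks it against the payment-by-payment sum (names are illustrative):

```python
def annuity_future_value(payment, r, n):
    """FV of an ordinary annuity: A * ((1 + r) ** n - 1) / r."""
    return payment * ((1 + r) ** n - 1) / r

fv_annuity = annuity_future_value(100, 0.05, 5)

# Cross-check: the deposit made at time t compounds for (5 - t) years
fv_by_payment = sum(100 * 1.05 ** (5 - t) for t in range(1, 6))

print(f"FV (formula)     = ${fv_annuity:.2f}")
print(f"FV (sum of FVs)  = ${fv_by_payment:.2f}")
```

Both routes give about $552.56.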

5. PRESENT VALUE OF A SINGLE CASH FLOW
Present value is the discounted value of a future
cash flow.

PV of a lump sum can be found through the
following equation:
\[ PV = FV\left[\frac{1}{(1 + r)^N}\right] = FV(1 + r)^{-N} \]
Example 5: PV of a Lump Sum
An institution promises to pay you $100,000 in
six years with an 8% annual interest rate. How
much should they invest today to have this
money at the end of the 6th year?
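The discounting step PV = FV(1 + r)^(-N) can be sketched as follows (the function name is illustrative):

```python
def present_value(fv, r, n):
    """Present value of a single cash flow: PV = FV * (1 + r) ** -n."""
    return fv * (1 + r) ** -n

# Example 5: $100,000 due in 6 years, discounted at 8% per year
pv_lump = present_value(100_000, 0.08, 6)
print(f"PV = ${pv_lump:,.2f}")
```

The required investment today is roughly $63,017.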
5.1 The Frequency of Compounding
With more than one compounding period per
year, the present value formula can be
expressed as:
\[ PV = FV\left(1 + \frac{r_s}{m}\right)^{-mN} \]
Example 6: PV of a Lump Sum with Monthly Compounding
A company must make a $5 million payment
10 years from now. How much should they
invest today if the annual interest rate is 6%,
compounded monthly?
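A sketch of the same discounting with monthly compounding, assuming r_s = 6%, m = 12 and N = 10 as stated in the example (the function name is illustrative):

```python
def present_value_periodic(fv, stated_rate, periods_per_year, years):
    """PV = FV * (1 + r_s / m) ** (-m * N)."""
    periodic_rate = stated_rate / periods_per_year
    return fv * (1 + periodic_rate) ** (-periods_per_year * years)

# Example 6: $5 million due in 10 years, 6% compounded monthly
pv_monthly = present_value_periodic(5_000_000, 0.06, 12, 10)
print(f"PV = ${pv_monthly:,.2f}")
```

The monthly rate is 0.5% over 120 periods, giving a present value of about $2.75 million.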
6. PRESENT VALUE OF A SERIES OF
CASH FLOWS
Consider an ordinary annuity paying 5%
annually. Suppose we have 5 annual deposits
of $100 starting one year from now.

We are interested in PV of this ordinary
annuity.
Timeline: t = 0 (now), 1, 2, 3, 4, 5
A $100 deposit is made at each of t = 1, 2, 3, 4 and 5
PV at t = 0 = ?
Total PV is equal to the sum of PV of every
single payment:

PV of 1st payment: \( PV = \$100(1 + 0.05)^{-1} \)

PV of 2nd payment: \( PV = \$100(1 + 0.05)^{-2} \)

PV of 3rd payment: \( PV = \$100(1 + 0.05)^{-3} \)

PV of 4th payment: \( PV = \$100(1 + 0.05)^{-4} \)

PV of 5th payment: \( PV = \$100(1 + 0.05)^{-5} \)

But if all the payments are equal, we can
arrive at a general annuity formula:
\[ PV = A\left[\frac{1 - (1 + r)^{-N}}{r}\right] \]
Hence, in this example we have:
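A minimal sketch that evaluates the annuity formula PV = A[(1 - (1 + r)^(-N))/r] for this example and cross-checks it against the sum of the five discounted payments (names are illustrative):

```python
def annuity_present_value(payment, r, n):
    """PV of an ordinary annuity: A * (1 - (1 + r) ** -n) / r."""
    return payment * (1 - (1 + r) ** -n) / r

pv_annuity = annuity_present_value(100, 0.05, 5)

# Cross-check: discount each $100 payment back from its own date
pv_by_payment = sum(100 * 1.05 ** -t for t in range(1, 6))

print(f"PV (formula)    = ${pv_annuity:.2f}")
print(f"PV (sum of PVs) = ${pv_by_payment:.2f}")
```

Both routes give about $432.95.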

Present Value of an Infinite Series of Equal
Cash Flows
Present value of an infinite series of equal
cash flows can be calculated as:

\[ PV = \frac{A}{r} \]
Example 7: An Infinite Series of Equal Cash Flows
A type of British government bond pays $100
per year in perpetuity. What would it be worth
today if the interest rate were 5%?

Topic 2:

Time Value of Money (Part II)
Discounted Cash Flow Applications
Associate Professor Ishaq Bhatti
La Trobe Business School

E-Mail: i.bhatti@latrobe.edu.au

Slides have been drafted by the La Trobe University, School of
Business based on DeFusco et al (2007)
Statistics for Business and Finance
Chapter 2
Discounted Cash Flow Applications
1. INTRODUCTION
Key Time Value of Money concepts: NPV
and IRR
Making investment decision
Portfolio return measurement
Calculation of money market yields

2.1 Net Present Value (NPV)
Invest or Not?

You first need to know the present value of the
cash flows
Second, you need to know how much the project
costs you
If the project costs less than the PV of the cash
flows, you will invest; otherwise you will not
NPV is a method for choosing among
alternative investments. The Net
Present Value is the present value of
cash inflows, minus the present value of
cash outflows.



\[ NPV = CF_0 + \sum_{t=1}^{N} \frac{CF_t}{(1 + r)^t} = \sum_{t=0}^{N} \frac{CF_t}{(1 + r)^t} \]
Example 1: NPV
A project generates $100m next year, $150m
in year 2 and $120m in year 3. If the project
costs $310m and you can finance the
investment with an opportunity cost of 10%,
will you invest in the project?
What if the opportunity cost is 5%?




Would you invest in this project?
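The NPV of Example 1 at both opportunity costs can be sketched as below. The cash-flow list places the $310m cost at t = 0 as a negative number; the function name `npv` is illustrative:

```python
def npv(rate, cash_flows):
    """NPV = sum of CF_t / (1 + r) ** t, with cash_flows[0] occurring at t = 0."""
    return sum(cf / (1 + rate) ** t for t, cf in enumerate(cash_flows))

project = [-310, 100, 150, 120]  # $m: cost today, then inflows in years 1 to 3

npv_10 = npv(0.10, project)
npv_5 = npv(0.05, project)
print(f"NPV at 10%: {npv_10:.2f}m")
print(f"NPV at  5%: {npv_5:.2f}m")
```

The NPV is slightly negative at 10% (do not invest) and clearly positive at 5% (invest).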
2.2 Internal Rate of Return (IRR)

Internal Rate of Return is the rate of return that
makes the NPV equal to zero



\[ NPV = CF_0 + \frac{CF_1}{(1 + IRR)^1} + \frac{CF_2}{(1 + IRR)^2} + \cdots + \frac{CF_N}{(1 + IRR)^N} = 0 \]
IRR can be calculated using financial software, a
financial calculator, or trial and error.

The investment decision is to invest if the IRR is greater than the
interest rate (opportunity cost) and not to invest if the IRR is
lower than the interest rate.
If the discount rate is greater than the IRR, NPV will be negative,
and if the discount rate is lower than the IRR, NPV will be positive.
Therefore, for a single project with conventional cash flows, NPV and
IRR lead to the same invest/not invest decision.
Example 2: IRR
A project generates $100m next year, $150m
in year 2 and $120m in year 3. The project
costs $310m. What is the internal rate of
return? Is it 8.34%, 9.12% or 10.27%?
Will you invest if the opportunity cost is 10%?




What if the interest rate is 5%?
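The trial-and-error method can be automated with a simple bisection search. This is only a sketch: it assumes a conventional project (one outflow followed by inflows), so that NPV falls as the rate rises and changes sign once between the two bounds. The names `npv` and `irr_bisection` are illustrative:

```python
def npv(rate, cash_flows):
    """NPV with cash_flows[0] occurring at t = 0."""
    return sum(cf / (1 + rate) ** t for t, cf in enumerate(cash_flows))

def irr_bisection(cash_flows, low=0.0, high=1.0, tol=1e-8):
    """Find the rate at which NPV = 0, assuming NPV(low) > 0 > NPV(high)."""
    if not (npv(low, cash_flows) > 0 > npv(high, cash_flows)):
        raise ValueError("NPV does not change sign between the bounds")
    while high - low > tol:
        mid = (low + high) / 2
        if npv(mid, cash_flows) > 0:
            low = mid   # NPV still positive: the IRR is higher
        else:
            high = mid  # NPV negative: the IRR is lower
    return (low + high) / 2

irr = irr_bisection([-310, 100, 150, 120])
print(f"IRR = {irr:.2%}")
```

The search converges on the 9.12% option. Since 9.12% is below a 10% opportunity cost but above 5%, the IRR rule gives the same decisions as the NPV rule in Example 1.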


2.3 Problems with IRR rule

IRR and NPV always lead to the same invest/not invest
decision, but sometimes they rank the projects differently
If the scale of the projects differs
If projects have different timing of future cash flows

If there is a conflict between IRR and NPV, we should
follow NPV, as it reflects the real change in investors'
wealth
If the sign of the cash flows changes more than once, we
may get more than one IRR.


3. Portfolio Return Measurement

Holding Period Return (HPR), the fundamental
concept

Return that an investor earns over a specified
holding period

\[ HPR = \frac{P_1 - P_0 + D_1}{P_0} \]
where
\(P_0\) is the initial investment
\(P_1\) is the price at the end of the period
\(D_1\) is the cash paid by the investment
3.1 Money Weighted Rate of Return
IRR is called Money Weighted Rate of Return
because it depends on timing and the dollar value of
cash flows

IRR is not a good measure for investment
managers
Usually, client decides when and how much to invest or
withdraw

An evaluation tool should judge the investment manager
only on his or her own decisions, not on the client's
3.2 Time Weighted Rate of Return
The preferred measurement tool in investment
management industry

Measures the compound rate of growth of
each $1 of initial investment over the period
Does not depend on the dollar value of the investment

Not affected by withdrawals or additions to the portfolio
Calculation of Time Weighted Rate of Return

Price the portfolio before any additions or
withdrawals, breaking the period into subperiods
Calculate the HPR for each subperiod
Take the geometric mean of the calculated Holding
Period Returns (HPR)
\[ TWRR = \sqrt[n]{(1 + r_1)(1 + r_2) \cdots (1 + r_n)} - 1 \]
Example 3: Money Weighted Rate of Return
At t = 0, an investor buys one share at $200.
At time t = 1, he purchases an additional
share at $225,
At the end of Year 2, t = 2, he sells both
shares for $235 each.
During both years, the share pays a per-share
dividend of $5.
Calculate the Money Weighted Rate of Return
Timeline:
t = 0: buy one share for $200
t = 1: buy a second share for $225; receive a $5 dividend
t = 2: receive $10 in dividends (2 x $5); sell both shares for 2 x $235
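The money-weighted return is the IRR of the investor's net cash flows. A sketch, assuming the $5 dividend received at t = 1 offsets part of the $225 purchase so the net flows are -200, -220 and +480; the bisection helper is illustrative:

```python
def npv(rate, cash_flows):
    """NPV with cash_flows[0] occurring at t = 0."""
    return sum(cf / (1 + rate) ** t for t, cf in enumerate(cash_flows))

# Net cash flows from the investor's point of view
flows = [
    -200,              # t = 0: buy one share
    -225 + 5,          # t = 1: buy a second share, receive one $5 dividend
    2 * 235 + 2 * 5,   # t = 2: sell two shares, receive two $5 dividends
]

low, high = 0.0, 1.0   # NPV is positive at 0% and negative at 100%
while high - low > 1e-8:
    mid = (low + high) / 2
    if npv(mid, flows) > 0:
        low = mid
    else:
        high = mid
mwrr = (low + high) / 2
print(f"Money-weighted return = {mwrr:.2%}")
```

The result is roughly 9.4% per year.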
Example 4: Time Weighted Rate of Return
Use the cash flows of Example 3.
First period (t = 0 to t = 1): \(P_0 = \$200\), \(P_1 = \$225\), dividend $5 per share
Second period (t = 1 to t = 2): \(P_0 = \$225\), \(P_1 = \$235\), dividend $5 per share
\[ TWRR = \sqrt[n]{(1 + r_1)(1 + r_2) \cdots (1 + r_n)} - 1 \]
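A sketch of the time-weighted calculation for the same data: compute the holding period return of each year on a per-share basis, then take the geometric mean. The helper name `hpr` is illustrative:

```python
def hpr(p0, p1, dividend):
    """Holding period return: (P1 - P0 + D1) / P0."""
    return (p1 - p0 + dividend) / p0

r1 = hpr(200, 225, 5)   # first year, per share
r2 = hpr(225, 235, 5)   # second year, per share (same for both shares)

n = 2
twrr = ((1 + r1) * (1 + r2)) ** (1 / n) - 1
print(f"HPR year 1 = {r1:.2%}, HPR year 2 = {r2:.2%}")
print(f"Time-weighted return = {twrr:.2%}")
```

The time-weighted return (about 10.8%) is higher than the money-weighted return because it ignores the fact that more money was invested during the weaker second year.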
4. Money Market Yields
Consider two 1-year bonds of a company:
One with a $100 face value and $10 coupon at maturity;
The other with a $110 face value but no coupon
Are they selling at the same price today?
What is the interest of the second bond?

Many short-term debts (one-year maturity or less) pay
no explicit coupon but they are sold at a discount.
Pure discount instruments
Pure discount instruments such as T-bills are
quoted on a Bank Discount basis, rather than
on a price basis:

\[ r_{BD} = \frac{D}{F} \cdot \frac{360}{t} \]
\(r_{BD}\): annualized yield on a bank discount basis
D: dollar discount = face value - purchase price
F: face value
t: actual number of days remaining to maturity
360: bank convention for the number of days in a year
Topic 3:

Statistical Concepts and Market
Returns
Associate Professor Ishaq Bhatti
La Trobe Business School

E-Mail: i.bhatti@latrobe.edu.au

Slides have been drafted by the La Trobe University, School of
Business based on DeFusco et al (2007)
Statistics for Business and Finance
Chapter 3
Statistical Concepts and Market Returns
2.59
INTRODUCTION
Statistical methods provide a powerful set of tools for
analyzing data.
Descriptive statistics includes basics of describing and
analyzing data.
We explore four properties of return distributions:
Where the returns are centered (central tendency)
How far returns are dispersed from their center (dispersion)
Whether the distribution of returns is symmetrically shaped or not
(skewness)
Whether extreme outcomes are likely (kurtosis)

Descriptive and Inferential Statistics
Descriptive statistics is the study of how data
can be summarized effectively to describe the
important aspects of a large dataset.
Statistical inference is making forecasts,
estimates or judgments about a larger
dataset (population) from a smaller group
(sample) actually observed.
Frequency Distributions
The simplest way of summarizing data is
the frequency distribution.
A frequency distribution is a tabular
display of data summarized into a
relatively small number of intervals.



Example: Frequency Distributions
Banna Insurance Pty Ltd has 4 sales incentive
programs: A, B, C and D. 40 salespeople were
asked which program they preferred.



Responses:
B A D C A C D B D B
D D B A D B D A D C
D B C D A D B D B C
B A D B A B A C D B

Fill in the following table:
Program   Frequency   Relative Frequency   Cumulative Frequency   Cumulative Relative Frequency
A
B
C
D
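The table can be filled in by counting the 40 responses. A minimal sketch using `collections.Counter`; the variable names are illustrative:

```python
from collections import Counter

responses = (
    "B A D C A C D B D B "
    "D D B A D B D A D C "
    "D B C D A D B D B C "
    "B A D B A B A C D B"
).split()

freq = Counter(responses)
n = len(responses)

cumulative = 0
print("Program  Freq  RelFreq  CumFreq  CumRelFreq")
for program in sorted(freq):
    cumulative += freq[program]
    print(f"{program:<7}  {freq[program]:4d}  {freq[program] / n:7.3f}"
          f"  {cumulative:7d}  {cumulative / n:10.3f}")
```

Program D is the most frequently chosen (14 of 40 responses), followed by B (12), A (8) and C (6).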
Histogram & Bar Chart
A histogram consists of adjacent rectangles whose bases
are marked off by the class widths and whose heights are
proportional to the class frequencies.
A bar chart is a special form of histogram in which the bars are
not adjacent and the data have been grouped into a
frequency distribution.
The following two slides display the bar charts of the
absolute and relative frequency distributions of Example
1. Similarly, you can construct a histogram of ASX200
returns; Text page
[Figure: bar chart of the absolute frequencies, Banna Insurance example]
[Figure: bar chart of the relative frequencies, Banna Insurance example]
Measures of Central Tendency
Arithmetic Mean:

The (arithmetic) mean is the sum of the observations divided by the number
of observations.
The population mean is given by
\[ \mu = \frac{\sum_{i=1}^{N} X_i}{N} \]
The sample mean is the arithmetic average of the sample data:
\[ \bar{X} = \frac{\sum_{i=1}^{n} X_i}{n} \]
The mean return of ASX200 is 0.70%
Example: Arithmetic Mean

Find the mean of 46, 54, 42, 46, 32:
Median
Median is the value of the middle item of a set of
items, sorted by ascending or descending
order.
If n is an odd number it occupies the (n+1)/2 position.
If n is an even number it is the average of items in the
n/2 and (n+2)/2 positions.
Unlike the mean, the median is not affected by a few
large observations.

Example: Median

Find the median of 46, 54, 42, 46, 32:
Mode
The mode is the most frequently occurring value
in a distribution.
A distribution can have more than one mode or
even no mode.
Stock returns or other data from continuous
distribution may not have a modal outcome but
we often find the modal interval (intervals)
Which interval in our Banna Insurance example is the
modal interval? (Hint: see the Histogram in slide 9)
Example: Mode

Find the mode of 46, 54, 42, 46, 32:
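The three measures of central tendency for the running dataset can be checked with Python's standard `statistics` module. A minimal sketch:

```python
import statistics

data = [46, 54, 42, 46, 32]

mean_value = statistics.mean(data)      # sum of observations / n
median_value = statistics.median(data)  # middle item of the sorted data
mode_value = statistics.mode(data)      # most frequently occurring value

print(f"sorted data: {sorted(data)}")
print(f"mean = {mean_value}, median = {median_value}, mode = {mode_value}")
```

With n = 5 (odd), the median is the item in position (n + 1)/2 = 3 of the sorted data.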
Weighted Mean
The weighted mean allows us to place greater
importance on different observations.
For example, we may choose to give larger
companies greater weight in our computation of an
index. In this case, we would weight each
observation based on its relative size.


\[ \bar{X}_w = \sum_{i=1}^{n} w_i X_i, \quad \text{where } \sum_{i} w_i = 1 \]
The arithmetic mean is a special case where
each observation is given the same weight.
Geometric Mean
The geometric mean is most frequently
used to average rates of change over time
or to compute the growth rate of a
variable.
\[ G = [X_1 X_2 \cdots X_n]^{1/n}, \quad \text{with } X_i > 0 \text{ for } i = 1, 2, \ldots, n \]
which can also be calculated by
\[ \ln G = \frac{1}{n} \sum_{i=1}^{n} \ln X_i \]
Geometric Mean Return
The geometric mean requires that all observations be
positive, so it cannot be applied directly to
data that can be negative, such as returns.
The geometric mean return works with the growth factors
(1 + R_t) instead, which allows us to compute the average
return when there is compounding.
\[ 1 + R_G = [(1 + R_1)(1 + R_2)(1 + R_3) \cdots (1 + R_T)]^{1/T} \]
\[ R_G = \left[\prod_{t=1}^{T} (1 + R_t)\right]^{1/T} - 1 \]
Quartiles and Percentiles
If your lecturer tells you that your exam mark is
in top 10%, what does it mean?
Median divides the data in half. The dataset
can be also divided into:
Quartiles; Quintiles; Deciles; Percentiles
To find the value of the y-th percentile with n
observations:
first, organize the data in ascending order
then the location of the y-th percentile is at \(L_y = (n + 1) \times y\%\)
however, some software packages, such as Excel, use the location
\(L_y = [(n - 1) \times y\%] + 1\)
if \(L_y\) is an integer, the value of the y-th percentile, \(P_y\), is equal to the
observation at \(L_y\)
if \(L_y\) is not an integer, it can be written as \(L_y = k.d\), where k is the
integer part and .d is the decimal part, and
\[ P_y = V_k + .d\,(V_{k+1} - V_k) \]
where \(V_k\) and \(V_{k+1}\) are the values of the k-th and (k + 1)-th
observations.
Example: Percentiles
Find the values of the 25th and 40th
percentiles of 46, 54, 42, 46, 32:

Note: median = \(P_{50} = Q_2 = D_5\) and \(P_{75} = Q_3\);
here Q and D denote Quartile and Decile.
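The (n + 1) location rule with linear interpolation can be sketched as below. The function name `percentile` is illustrative, and the sketch deliberately rejects locations that fall outside the range 1 to n rather than guessing a convention for them:

```python
def percentile(data, y):
    """y-th percentile using the location L_y = (n + 1) * y / 100."""
    values = sorted(data)
    n = len(values)
    location = (n + 1) * y / 100
    if not 1 <= location <= n:
        raise ValueError("location falls outside the observations")
    k = int(location)   # integer part
    d = location - k    # decimal part
    if d == 0:
        return values[k - 1]
    # Interpolate between the k-th and (k + 1)-th observations (1-indexed)
    return values[k - 1] + d * (values[k] - values[k - 1])

data = [46, 54, 42, 46, 32]
p25 = percentile(data, 25)   # location 1.5
p40 = percentile(data, 40)   # location 2.4
print(f"P25 = {p25:.1f}, P40 = {p40:.1f}")
```

The sorted data are 32, 42, 46, 46, 54, so P25 lies halfway between 32 and 42, and P40 lies four tenths of the way from 42 to 46.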
Measures of Dispersion
One of simplest measures of dispersion is
the range, which is the difference between
the maximum and minimum values in a
dataset:
Range = Maximum value - Minimum value.
Mean Absolute Deviation
Mean Absolute Deviation (MAD) measures
the average distance that each observation
is from the mean:



\[ MAD = \frac{\sum_{i=1}^{n} |X_i - \bar{X}|}{n} \]
Population Variance and Standard
Deviation
Variance measures the average squared deviation
from the mean:
\[ \sigma^2 = \frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N} \]
Because the variance is not in the same units as the
mean, sometimes we prefer the standard deviation, the
square root of variance, which is in the same units as
the mean
\[ \sigma = \sqrt{\frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}} \]
Sample Variance and Standard Deviation
Sample Variance and standard deviation
slightly differ from population variance and
standard deviation:


\[ s^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1} \]
\[ s = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}} \]
Example: MAD and Standard Deviation
Find the MAD for 46, 54, 42, 46, 32:



Table columns: \(X_i\), \(X_i - \bar{X}\), \(|X_i - \bar{X}|\)
Find Std Dev. for 46, 54, 42, 46, 32:



Table columns: \(X_i\), \(X_i - \bar{X}\), \((X_i - \bar{X})^2\)
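Both tables can be reproduced with a few lines of code. A minimal sketch that treats the five observations as a sample, so the variance divides by n - 1:

```python
import math

data = [46, 54, 42, 46, 32]
n = len(data)
mean = sum(data) / n

deviations = [x - mean for x in data]

# Mean absolute deviation: average distance from the mean
mad = sum(abs(dev) for dev in deviations) / n

# Sample variance and standard deviation (divide by n - 1)
sample_variance = sum(dev ** 2 for dev in deviations) / (n - 1)
sample_std = math.sqrt(sample_variance)

print(f"mean = {mean}, MAD = {mad}")
print(f"s^2 = {sample_variance}, s = {sample_std}")
```

The deviations from the mean of 44 are 2, 10, -2, 2 and -12, which gives MAD = 5.6, a sample variance of 64 and a sample standard deviation of 8.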
Semivariance
Often, observations above the mean
are good outcomes, so the variance is not a good
measure of risk. Semivariance looks at
the average squared deviation below the
mean:
\[ \text{Semivariance} = \frac{\sum_{\text{for all } X_i < \bar{X}} (X_i - \bar{X})^2}{n^* - 1} \]
where n* is the number of observations
smaller than the mean.
Coefficient of Variation
The coefficient of variation is the ratio of the
standard deviation of a set of observations to their mean value.
It is a measure of relative dispersion
It can compare the dispersion of data with
different scales
\[ CV = \frac{s}{\bar{X}} \]
What is the coefficient of variation of the dataset in the
previous example?
Sharpe Ratio
The Sharpe ratio is the ratio of mean excess
return to risk, an application of mean and
standard deviation analysis.
Risk averse investors who make decisions only
in terms of mean and standard deviation prefer
portfolios with larger Sharpe ratios:
\[ S_h = \frac{\bar{R}_p - \bar{R}_F}{s_p} \]
Assuming the monthly risk-free interest rate is
0.3%, the Sharpe ratio of ASX200 is 0.121.
Skewness
Skewness measures the symmetry of a
distribution.
\[ S_K = \left[\frac{n}{(n - 1)(n - 2)}\right] \frac{\sum_{i=1}^{n} (X_i - \bar{X})^3}{s^3} \]
A symmetric distribution has a skewness of 0.
Positive skewness indicates that the mean is greater
than the median (more than half the deviations from
the mean are negative)
Negative skewness indicates that the mean is less
than the median (less than half the deviations from
the mean are negative)
[Figure: graphic illustration of skewness]
Sample Excess Kurtosis
Kurtosis measures how peaked the distribution
is relative to the normal distribution.
Using the sample excess kurtosis formula
\[ K_E = \left[\frac{n(n + 1)}{(n - 1)(n - 2)(n - 3)} \cdot \frac{\sum_{i=1}^{n} (X_i - \bar{X})^4}{s^4}\right] - \frac{3(n - 1)^2}{(n - 2)(n - 3)} \]
\(K_E = 0\) is called mesokurtic: the distribution has the same
kurtosis as the normal distribution.
\(K_E > 0\) is called leptokurtic: the distribution is
more peaked, with fatter tails.
\(K_E < 0\) is called platykurtic: the distribution is less peaked.
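As an illustration, the sample skewness and sample excess kurtosis formulas can be applied to the small dataset used in the earlier examples (46, 54, 42, 46, 32). The slides do not work this case, and with only five observations the results are purely a check on the arithmetic. Function names are illustrative:

```python
import math

def sample_std(data):
    n = len(data)
    mean = sum(data) / n
    return math.sqrt(sum((x - mean) ** 2 for x in data) / (n - 1))

def sample_skewness(data):
    """S_K = [n / ((n-1)(n-2))] * sum((X - mean)^3) / s^3."""
    n = len(data)
    mean = sum(data) / n
    s = sample_std(data)
    return n / ((n - 1) * (n - 2)) * sum((x - mean) ** 3 for x in data) / s ** 3

def sample_excess_kurtosis(data):
    """K_E = [n(n+1)/((n-1)(n-2)(n-3))] * sum((X - mean)^4)/s^4 - 3(n-1)^2/((n-2)(n-3))."""
    n = len(data)
    mean = sum(data) / n
    s = sample_std(data)
    first = n * (n + 1) / ((n - 1) * (n - 2) * (n - 3))
    first *= sum((x - mean) ** 4 for x in data) / s ** 4
    return first - 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))

data = [46, 54, 42, 46, 32]
skew = sample_skewness(data)
kurt = sample_excess_kurtosis(data)
print(f"skewness = {skew:.4f}, excess kurtosis = {kurt:.4f}")
```

The skewness is negative, consistent with the mean (44) being below the median (46).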
[Figure: leptokurtic (fat-tailed) distribution]


Topic 4:

Probability Concepts
Associate Professor Ishaq Bhatti
La Trobe Business School

E-Mail: i.bhatti@latrobe.edu.au

Slides have been drafted by the La Trobe University, School of
Business based on DeFusco et al (2007)
Statistics for Business and Finance
Chapter 4
Probability Concepts
Uncertainty and Probability
A random variable is a quantity whose outcomes are
uncertain.
An event is a specified set of outcomes.
Probability: the likelihood or chance that something is
the case or will happen
The probability of any event, E, is a number between 0 and 1: 0 ≤ P(E) ≤ 1
The sum of the probabilities of any set of mutually exclusive &
exhaustive events equals 1.
Probability of an event A is equal to:
Example
An experiment is conducted in which a coin is tossed three times -
the uppermost face recorded on each toss. Draw the tree diagram.

First throw Second throw Third throw
If A is the event of throwing 2 heads and 1 tail in any order, then:


A=


and P(A) =
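The tree diagram has 2 x 2 x 2 = 8 equally likely branches, which can be enumerated directly. A minimal sketch, assuming a fair coin:

```python
from fractions import Fraction
from itertools import product

# All outcomes of three tosses, e.g. "HHT"
sample_space = ["".join(tosses) for tosses in product("HT", repeat=3)]

# Event A: exactly 2 heads and 1 tail, in any order
event_a = [outcome for outcome in sample_space if outcome.count("H") == 2]

p_a = Fraction(len(event_a), len(sample_space))
print(f"A = {event_a}")
print(f"P(A) = {p_a}")
```

Three of the eight outcomes (HHT, HTH, THH) belong to A.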
Uncertainty and Probability
Mutually exclusive events are those only one of
which can occur at a time.
Exhaustive events are the events that cover all
possible outcomes.
How to estimate probability?
An empirical probability is estimated by relative frequency of occurrence
based on historical data
A subjective probability is one drawing on personal or subjective
judgment.
A priori probability is one based on logical analysis rather than on
observation or personal judgment.
A priori or an empirical probability is also called objective probability.
Example
A die is thrown. Determine the probability of obtaining a number 2
or a 5.
[Venn diagram: events A and B within sample space S]
Example
A die is thrown. Determine the probability of obtaining a multiple of 2
or a multiple of 3.
[Venn diagram: events A and B within sample space S]
Example
A die is tossed twice. Determine the probability of obtaining a 3 on
the first toss and number > 5 on the second toss.
[Venn diagram: events A and B within sample space S]
Example
A die is tossed twice. Determine the probability of obtaining a 3 on
the first toss and a total of 5 on both tosses.
[Venn diagram: events A and B within sample space S]
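The four die examples can be verified by counting outcomes in the sample space. A minimal sketch, assuming a fair six-sided die; the helper `prob` is illustrative:

```python
from fractions import Fraction
from itertools import product

def prob(event, space):
    """Probability of an event when all outcomes in space are equally likely."""
    space = list(space)
    favourable = sum(1 for outcome in space if event(outcome))
    return Fraction(favourable, len(space))

one_toss = range(1, 7)
two_tosses = list(product(range(1, 7), repeat=2))

# One toss: mutually exclusive events, so the probabilities simply add
p_2_or_5 = prob(lambda x: x in (2, 5), one_toss)

# One toss: overlapping events (6 is a multiple of both 2 and 3)
p_mult_2_or_3 = prob(lambda x: x % 2 == 0 or x % 3 == 0, one_toss)

# Two tosses: a 3 first and a number greater than 5 second
p_3_then_gt5 = prob(lambda o: o[0] == 3 and o[1] > 5, two_tosses)

# Two tosses: a 3 first and a total of 5
p_3_and_total_5 = prob(lambda o: o[0] == 3 and sum(o) == 5, two_tosses)

print(p_2_or_5, p_mult_2_or_3, p_3_then_gt5, p_3_and_total_5)
```

In the second example the outcome 6 must be counted only once, which is why the answer is 4/6 rather than 3/6 + 2/6.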
Multiplication and Addition Rules for
Independent Events
When two events are independent, the joint
probability is the product of the two probabilities:
\[ P(AB) = P(A)\,P(B) \]
Consequently, the addition rule becomes
\[ P(A \text{ or } B) = P(A) + P(B) - P(A)\,P(B) \]
Example: What is the probability of tossing
two heads in a row?
Independent Events
Two events are independent if and only if
\[ P(A \mid B) = P(A) \quad \text{or, equivalently,} \quad P(B \mid A) = P(B) \]
Multiplication Rule for Probability
The joint probability can be found using the
multiplication rule:
\[ P(AB) = P(A \mid B)\,P(B) \]
On the other hand, if the joint probability P(AB)
and the unconditional probability P(B) are known,
the conditional probability is
\[ P(A \mid B) = \frac{P(AB)}{P(B)}, \quad P(B) \neq 0 \]
Unconditional and Conditional
Probabilities

Unconditional or marginal probability answers the
question, "What is the probability of event A?"
Conditional probability answers the question,
"What is the probability of event A, given that
event B occurs?"
Joint probability answers the question, "What is
the probability of both events A and B
happening?"
Example
A die is tossed twice. Determine the probability of obtaining a total
of 5 on both tosses if a 3 is obtained on the first toss.
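[Venn diagram in sample space S:
A (a 3 on the first toss) = {(3,1), (3,2), (3,3), (3,4), (3,5), (3,6)}
B (a total of 5) = {(1,4), (2,3), (3,2), (4,1)}
A and B share the single outcome (3,2)]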
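The conditional probability can be computed from the definition P(B | A) = P(AB) / P(A) by counting outcomes. A minimal sketch, assuming a fair die; the event names follow the diagram:

```python
from fractions import Fraction
from itertools import product

space = list(product(range(1, 7), repeat=2))   # 36 equally likely outcomes

a = {o for o in space if o[0] == 3}            # a 3 on the first toss
b = {o for o in space if sum(o) == 5}          # a total of 5

p_a = Fraction(len(a), len(space))
p_ab = Fraction(len(a & b), len(space))
p_b_given_a = p_ab / p_a

print(f"P(A) = {p_a}, P(AB) = {p_ab}, P(B | A) = {p_b_given_a}")
```

Given a 3 on the first toss, only a 2 on the second toss produces a total of 5, so the answer is 1/6.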
Example:
A sample of Business Degree evening students was surveyed in order to
investigate the relationship between age and marital status. The results of
the survey are tabled below:







Answer the following questions
               Marital Status
               S       M
Age   < 30     25      15
      ≥ 30     20      40
What is the sample size?

P(Age < 30) =
P(Being Single) =
P(Being Single AND < 30) =
P(Being Single GIVEN < 30) =

Are the events "being single" and "being < 30" independent?
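The questions can be answered directly from the table counts. A sketch, assuming the second age row (whose label was lost in extraction) covers students aged 30 and over; the dictionary keys are illustrative labels:

```python
from fractions import Fraction

# (age group, marital status) -> number of students
counts = {
    ("<30", "S"): 25, ("<30", "M"): 15,
    ("30+", "S"): 20, ("30+", "M"): 40,   # "30+" row label is an assumption
}

n = sum(counts.values())
p_under_30 = Fraction(counts[("<30", "S")] + counts[("<30", "M")], n)
p_single = Fraction(counts[("<30", "S")] + counts[("30+", "S")], n)
p_single_and_under_30 = Fraction(counts[("<30", "S")], n)
p_single_given_under_30 = p_single_and_under_30 / p_under_30

# Independence requires P(S and <30) = P(S) * P(<30)
independent = p_single_and_under_30 == p_single * p_under_30

print(f"n = {n}")
print(f"P(<30) = {p_under_30}, P(S) = {p_single}")
print(f"P(S and <30) = {p_single_and_under_30}, P(S | <30) = {p_single_given_under_30}")
print(f"independent: {independent}")
```

Because P(S | <30) = 5/8 differs from P(S) = 9/20, the two events are not independent.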
The Total Probability Rule
1. \[ P(A) = P(AS) + P(AS^C) = P(A \mid S)P(S) + P(A \mid S^C)P(S^C) \]
where S is an event and \(S^C\) is the event not-S, or the
complement of S
2. \[ P(A) = P(AS_1) + P(AS_2) + \cdots + P(AS_n) = P(A \mid S_1)P(S_1) + P(A \mid S_2)P(S_2) + \cdots + P(A \mid S_n)P(S_n) \]
where \(S_1, S_2, \ldots, S_n\) are mutually exclusive and
exhaustive scenarios or events
Expected Value
The expected value of a random variable is
the probability weighted average of the
possible outcomes of the random variable.

\[ E(X) = P(X_1)X_1 + P(X_2)X_2 + \cdots + P(X_n)X_n = \sum_{i=1}^{n} P(X_i)X_i \]
Variance
The variance of a random variable is the
expected value of squared deviations from
the random variable's expected value:
\[ \sigma^2(X) = E\{[X - E(X)]^2\} = E(X^2) - [E(X)]^2 \]
\[ \sigma^2(X) = \sum_{i=1}^{n} P(X_i)[X_i - E(X)]^2 \]
Note, a better notation is \(\sigma_X^2\)
Variance
(i) Var(c) = 0

(ii) Var(cX) = c^2 Var(X)

(iii) Var(c + X) = Var(X)

(iv) Var(X + Y) = Var(X) + Var(Y), if X and Y are independent

     Var(X + Y) = Var(X) + Var(Y) + 2 Cov(X, Y), if X and Y
     are not independent
Standard Deviation
Standard deviation is the positive square
root of variance.
\[ \sigma = \sqrt{\sigma^2} \]
Example
A random variable, X, has the following probability distribution:
X = x_i    P(x_i) or P(X = x_i)    x_i P(x_i)    x_i^2 P(x_i)
0          0.1
1          0.6
2          0.3
Total      1.0
Find the following:
E(X)

E(X^2)

E(Y), if Y = aX - cX^2, where a = 1, c = 2

E(W), if W = (d - c)X + a, where d = 8

Var(X)

Var(W)
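These quantities follow from the probability table and the properties of expected value and variance. A sketch under one stated assumption: the operators lost in extraction are taken to be minus signs, so Y = aX - cX^2 and W = (d - c)X + a:

```python
values = [0, 1, 2]
probs = [0.1, 0.6, 0.3]

e_x = sum(p * x for x, p in zip(values, probs))        # E(X)
e_x2 = sum(p * x ** 2 for x, p in zip(values, probs))  # E(X^2)
var_x = e_x2 - e_x ** 2                                # Var(X) = E(X^2) - [E(X)]^2

a, c, d = 1, 2, 8

# Assumed form: Y = aX - cX^2, so E(Y) = a E(X) - c E(X^2) by linearity
e_y = a * e_x - c * e_x2

# Assumed form: W = (d - c)X + a
e_w = (d - c) * e_x + a
var_w = (d - c) ** 2 * var_x   # Var(cX) = c^2 Var(X); adding a constant changes nothing

print(f"E(X) = {e_x:.2f}, E(X^2) = {e_x2:.2f}, Var(X) = {var_x:.2f}")
print(f"E(Y) = {e_y:.2f}, E(W) = {e_w:.2f}, Var(W) = {var_w:.2f}")
```

If the original operators were plus signs instead, only E(Y), E(W) and Var(W) change; the first three results do not depend on the assumption.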
Conditional Expected Value
A conditional expected value is the
expected value of a random variable X
given an event or scenario S. It is denoted
E(X|S):
\[ E(X \mid S) = P(X_1 \mid S)X_1 + P(X_2 \mid S)X_2 + \cdots + P(X_n \mid S)X_n \]
Total probability rule for expected value
\[ E(X) = E(X \mid S_1)P(S_1) + E(X \mid S_2)P(S_2) + \cdots + E(X \mid S_n)P(S_n) \]
Since variance is the expected value of the
random variable
\[ [X - E(X)]^2 \]
we can define conditional variance
accordingly
\[ Var(X \mid S) = E\{[X - E(X \mid S)]^2 \mid S\} \]
Portfolio Expected Return and Variance
Investment diversification and portfolio
Portfolio is the profile of the investment
If there are n assets and you invest a portion \(w_i\) (i = 1, 2, ..., n)
of your wealth in asset i, then the investment
portfolio is \((w_1, w_2, \ldots, w_n)\). The portfolio return is
\[ R_p = w_1 R_1 + w_2 R_2 + \cdots + w_n R_n \]
where \(R_i\) is the return of asset i.
Modern portfolio theory often uses expected
return as the measure of reward and the
variance of returns as a measure of risk.
Properties of Expected Value
The expected value of a constant times a
random variable equals the constant times
the expected value of the random variable:
$E(w_i R_i) = w_i E(R_i)$

The expected value of the sum of random
variables is equal to the sum of the expected
values of the random variables:
$E(w_1 R_1 + w_2 R_2 + \dots + w_n R_n) = w_1 E(R_1) + w_2 E(R_2) + \dots + w_n E(R_n)$
Probability Concepts
4.121
Calculation of Portfolio Expected Return
Given a portfolio with n securities, the
expected return on the portfolio is a
weighted average of the expected
returns on the component securities.

$E(R_P) = w_1 E(R_1) + w_2 E(R_2) + \dots + w_n E(R_n)$
Probability Concepts
4.122
Covariance and Correlation
Covariance of two random variables is defined as
$\mathrm{Cov}(R_i, R_j) = E[(R_i - ER_i)(R_j - ER_j)]$

Correlation coefficient of two random variables is
defined as
$\rho_{ij} = \mathrm{Cov}(R_i, R_j) / [\sigma(R_i)\,\sigma(R_j)]$
Probability Concepts
4.123
Interpretation of Return Covariance
If the covariance is 0, the returns on the assets
are linearly unrelated.
If the covariance is negative (positive), when the
return on one asset is above its expected
value, the return on the other asset tends to be
below (above) its expected value; i.e., the two
returns tend to move in opposite (the same)
directions.
The covariance of a random variable with itself is its own
variance.
Probability Concepts
4.124
Interpretation of Correlation Coefficient
Correlation is a scaled covariance that
falls between -1 and +1.
A correlation of +1 means the variables
are perfectly positively correlated.
A correlation of -1 means the variables
are perfectly negatively correlated.
A correlation of 0 means the variables
are uncorrelated.
Probability Concepts
4.125
Portfolio Variance
Unlike portfolio expected return, portfolio
variance is not a weighted average of the
variances of the securities in the portfolio.
To compute portfolio variance, we need to
incorporate the interaction between each
pair of variables (correlation or
covariance).
Probability Concepts
4.126
Portfolio Variance
Portfolio variance for a two-security
portfolio:
$\sigma^2(R_P) = w_1^2\sigma_1^2 + w_2^2\sigma_2^2 + 2w_1w_2\,\mathrm{Cov}(R_1, R_2)$
$\qquad\quad\; = w_1^2\sigma_1^2 + w_2^2\sigma_2^2 + 2w_1w_2\,\rho_{12}\,\sigma_1\sigma_2$

Portfolio variance for an n-security
portfolio:
$\sigma^2(R_P) = \sum_{i=1}^{n}\sum_{j=1}^{n} w_i w_j\,\mathrm{Cov}(R_i, R_j)$
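
The expected-return and variance formulas can be sketched in a few lines. Everything numeric below (weights, expected returns, standard deviations, correlation) is hypothetical and chosen only to show that the double-sum formula reproduces the two-security formula.

```python
def portfolio_expected_return(weights, expected_returns):
    """Weighted average of the component expected returns."""
    return sum(w * r for w, r in zip(weights, expected_returns))


def portfolio_variance(weights, cov):
    """Double sum of w_i * w_j * Cov(R_i, R_j) over all pairs (i, j)."""
    n = len(weights)
    return sum(weights[i] * weights[j] * cov[i][j]
               for i in range(n) for j in range(n))


# Hypothetical two-asset inputs (not from the slides).
weights = [0.6, 0.4]
expected_returns = [0.10, 0.05]
sigma = [0.20, 0.10]
rho12 = 0.3

cov12 = rho12 * sigma[0] * sigma[1]           # Cov = rho * sigma_1 * sigma_2
cov = [[sigma[0] ** 2, cov12],
       [cov12, sigma[1] ** 2]]                # diagonal holds the variances

er_p = portfolio_expected_return(weights, expected_returns)
var_p = portfolio_variance(weights, cov)

# Two-security formula written out term by term, for comparison.
var_two = (weights[0] ** 2 * sigma[0] ** 2
           + weights[1] ** 2 * sigma[1] ** 2
           + 2 * weights[0] * weights[1] * rho12 * sigma[0] * sigma[1])

print(f"E(R_P) = {er_p:.4f}, Var(R_P) = {var_p:.5f}, two-asset check = {var_two:.5f}")
```

The covariance matrix carries each off-diagonal term twice, which is where the factor 2 in the two-security formula comes from.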
Topic 5:

Common Probability Distributions
Associate Professor Ishaq Bhatti
La Trobe Business School

E-Mail: i.bhatti@latrobe.edu.au

Slides have been drafted by the La Trobe University, School of
Business based on DeFusco et al (2007)
Statistics for Business and Finance
Chapter 5
Common Probability Distributions
5.129
Random Variables and Distributions

A probability distribution specifies the probabilities of the
possible outcomes of a random variable.

There are two types of random variables:
A discrete random variable can take on at most a
countable number of possible values;
A continuous random variable takes infinitely many
values on an interval, say [0, 1] or $(-\infty, +\infty)$
Common Probability Distributions
5.130
Random Variables and Distributions
For discrete random variable, the probability
function specifies the probability that the random
variable takes on a specific value.
For continuous variable,
The probability density function p(x) specifies the
probability density the random variable takes on the
value x or the approximate probability the random
variable takes on values around x of a unit length.
The cumulative distribution function P(x) gives the
probability that the random variable is less than or equal
to x.
Common Probability Distributions
5.131
Bernoulli Random Variable
Sometimes a random variable can only take on
two values, success or failure. This is referred to
as a Bernoulli random variable.
A Bernoulli trial is an experiment that produces
only two outcomes.
Y = 1 for success and Y = 0 for failure.


$p(1) = P(Y = 1) = p$
$p(0) = P(Y = 0) = 1 - p$
Common Probability Distributions
5.132
Binomial Distribution
A binomial random variable X is defined as the number of successes in n
Bernoulli trials:
$X = Y_1 + Y_2 + \dots + Y_n$

The probability of x successes out of n trials is
$p(x) = P(X = x) = \binom{n}{x} p^x (1-p)^{n-x} = \frac{n!}{(n-x)!\,x!}\, p^x (1-p)^{n-x}$

The mean and variance of B(n, p) are:
$\mu = np$
$\sigma^2 = np(1-p)$
Common Probability Distributions
5.133
Binomial Distribution
General notations (pages 166-168):
n factorial: $n! = n(n-1)(n-2)\cdots 1$ and $0! = 1$
Combination (x choices out of n options):
${}_nC_x = \binom{n}{x} = \frac{n!}{(n-x)!\,x!}$ (also written ${}_nC_r$)

Binomial distribution assumes
The probability, p, of success is constant for all trials
The trials are independent
Common Probability Distributions


5.134
Example: Binomial Distribution
Flipping a fair coin: Probability of head = 50%
and probability of tail = 50%
If you flip three coins in a row, what is the
probability you have two heads and one tail?



So, can we answer the question?
Common Probability Distributions
5.135
Example: Binomial Distribution
Three customers enter a clothing store. The probability that a customer
will make a purchase p(s) is 0.30. investigate the probability distribution.
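
A minimal sketch of the binomial probability function applied to the coin and clothing-store examples. The function name `binomial_pmf` is chosen for this illustration; `math.comb` supplies the combination term.

```python
from math import comb


def binomial_pmf(x, n, p):
    """P(X = x) for X ~ B(n, p): C(n, x) * p^x * (1 - p)^(n - x)."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)


# Coin example: two heads in three flips of a fair coin.
two_heads = binomial_pmf(2, 3, 0.5)

# Store example: distribution of the number of purchases among 3 customers.
store_dist = [binomial_pmf(x, 3, 0.30) for x in range(4)]

print(f"P(2 heads in 3 flips) = {two_heads:.3f}")
for x, prob in enumerate(store_dist):
    print(f"P({x} purchases) = {prob:.3f}")
```

The store probabilities sum to 1, and their mean is np = 0.9 purchases.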
Common Probability Distributions
5.136
Example: A Binomial Model of Stock
Price Movements
If the probability of stock price moving up is 60% and down is 40%, what is the
probability that the stock price goes up in exactly two years?
Find the probability of an upward movement in the first two years followed by a fall in
the price in the third year.
What is the probability that the price goes down at least twice?
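
The three stock-price questions can be worked through as follows. This sketch assumes a three-year horizon with independent yearly moves (the slide mentions a "third year" but does not state the horizon explicitly) and reads the first question as "up in exactly two of the three years".

```python
from math import comb

P_UP, P_DOWN, YEARS = 0.6, 0.4, 3   # three-year horizon is an assumption


def binomial_pmf(x, n, p):
    """P(X = x) for X ~ B(n, p)."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)


# Up in exactly two of the three years (any order).
up_exactly_two = binomial_pmf(2, YEARS, P_UP)

# One specific path: up, up, then down. No combination term is needed.
up_up_down = P_UP * P_UP * P_DOWN

# Down at least twice: two or three down moves.
down_at_least_twice = sum(binomial_pmf(k, YEARS, P_DOWN) for k in (2, 3))

print(f"{up_exactly_two:.3f} {up_up_down:.3f} {down_at_least_twice:.3f}")
```

The specific path (up, up, down) is one of the three orderings counted in the first answer, which is why it is exactly one third of it.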
Common Probability Distributions
5.137
Example: A Binomial Model of Stock
Price Movements
Common Probability Distributions
5.138
Continuous Uniform Distribution
Probability density function (pdf) and cumulative distribution
function (cdf) of uniform distribution on [a, b] are:
$f(x) = \begin{cases} \dfrac{1}{b-a} & \text{for } a < x < b \\ 0 & \text{otherwise} \end{cases}$

$F(x) = \begin{cases} 0 & \text{for } x \le a \\ \dfrac{x-a}{b-a} & \text{for } a < x < b \\ 1 & \text{for } x \ge b \end{cases}$

The mean and variance of a continuous uniform distribution:
Mean: $E(X) = \mu = (a+b)/2$
Variance: $\mathrm{Var}(X) = \sigma^2 = (b-a)^2/12$
Common Probability Distributions
5.139
Normal Distribution
Random variable X follows a normal distribution with mean $\mu$ and
variance $\sigma^2$ ($X \sim N(\mu, \sigma^2)$) if it has a probability density function as:
$f(x) = \dfrac{1}{\sigma\sqrt{2\pi}} \exp\!\left(-\dfrac{(x-\mu)^2}{2\sigma^2}\right)$ for $-\infty < x < +\infty$

There is no closed-form cdf for a normal distribution, so we have
to use a table of cumulative probabilities for a normal distribution
A normal distribution is completely determined by its mean and
variance.
Common Probability Distributions
5.140
Normal Distribution
A normal distribution has a skewness of 0 (it is
symmetric)
its mean, median and mode (slightly abusing the term) are equal

It has a kurtosis of 3 (or excess kurtosis of 0).

A linear combination of two or more (jointly) normal random
variables is also normally distributed
This property is very useful for determining the distribution of
the portfolio return, given each asset's return.
Common Probability Distributions
5.141
Graphic Illustration of Two Normal
Distributions
Common Probability Distributions
5.142
Units of Standard Deviation
Common Probability Distributions
5.143
Units of Standard Deviation
Approximately 50 percent of all observations fall in
the interval $\mu \pm (2/3)\sigma$.
Approximately 68 percent of all observations fall in
the interval $\mu \pm \sigma$.
Approximately 95 percent of all observations fall in
the interval $\mu \pm 2\sigma$.
Approximately 99 percent of all observations fall in
the interval $\mu \pm 3\sigma$.
Common Probability Distributions
5.144
Confidence Intervals for Values of a Normal
Random Variable X
We expect
90 percent of the values of X to lie within the interval
$\bar{X} \pm 1.65s$

95 percent of the values of X to lie within the interval
$\bar{X} \pm 1.96s$

99 percent of the values of X to lie within the interval
$\bar{X} \pm 2.58s$

These intervals are called 90%, 95% and 99% confidence
intervals for X.
Common Probability Distributions
5.145
Standard Normal Distribution
Standard normal distribution has a mean of zero and
a standard deviation of 1.

If X is a normal random variable with $X \sim N(\mu, \sigma^2)$,
then Z follows the standard normal distribution (i.e., $Z \sim N(0, 1)$), if:
$Z = \dfrac{X - \mu}{\sigma}$
Common Probability Distributions
5.146
Example: Normal Distribution
Find the following probabilities: (Z table is available in the next slide)
1. $P(0 \le z \le 1.4) =$



2. $P(0 \le z \le 1.46) =$



3. $P(-1.5 \le z \le 1.5) =$
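
As a cross-check on the Z table, the three probabilities can be computed with the standard library's `statistics.NormalDist` (Python 3.8 or later). This is a sketch for verification, not a replacement for reading the table.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1


def prob_between(lo, hi):
    """P(lo <= Z <= hi) as a difference of cumulative probabilities."""
    return Z.cdf(hi) - Z.cdf(lo)


p1 = prob_between(0, 1.4)
p2 = prob_between(0, 1.46)
p3 = prob_between(-1.5, 1.5)

print(f"P(0 <= z <= 1.4)    = {p1:.4f}")
print(f"P(0 <= z <= 1.46)   = {p2:.4f}")
print(f"P(-1.5 <= z <= 1.5) = {p3:.4f}")
```

By symmetry, the third probability is twice P(0 ≤ z ≤ 1.5).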
Common Probability Distributions
5.147
Common Probability Distributions
5.148
Example: Normal Distribution
A portfolio has an estimated mean return of 12% and standard
deviation of return of 22%.

What is the probability that portfolio return will exceed 20%?


What is the probability that the portfolio return will be between
5.5% and 20%?


What is the return's 90% confidence interval?
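
A sketch of the three portfolio questions, assuming returns are normally distributed with mean 12% and standard deviation 22%. Results computed this way may differ slightly from table-based answers that round z to two decimals or use 1.65 as the 90% reliability factor.

```python
from statistics import NormalDist

returns = NormalDist(mu=12.0, sigma=22.0)   # returns in percent

# P(R > 20%): upper tail beyond z = (20 - 12) / 22.
p_exceed_20 = 1 - returns.cdf(20.0)

# P(5.5% <= R <= 20%): difference of two cumulative probabilities.
p_between = returns.cdf(20.0) - returns.cdf(5.5)

# 90% interval: 5% of probability is left in each tail.
z_90 = NormalDist().inv_cdf(0.95)
ci_low = returns.mean - z_90 * returns.stdev
ci_high = returns.mean + z_90 * returns.stdev

print(f"P(R > 20%) = {p_exceed_20:.4f}")
print(f"P(5.5% <= R <= 20%) = {p_between:.4f}")
print(f"90% interval = ({ci_low:.2f}%, {ci_high:.2f}%)")
```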



Common Probability Distributions
5.149
Application: Safety-First Rule
Roy demonstrates that if the portfolio return, $R_P$, is normally distributed,
then minimizing the probability of $R_P$ falling below a threshold level, $R_L$,
requires maximizing
$\text{SFRatio} = [E(R_P) - R_L] / \sigma_P$
Common Probability Distributions
5.150
Application: Safety-First Rule
You are managing an $800,000 portfolio for an investor whose
objective is long-term growth. But she may want to liquidate $30,000
at the end of a year. If that need arises, she hopes the liquidation of
$30,000 would not invade the initial capital of $800,000. If return on
the portfolio is 8% with 3% std deviation:




To protect the initial investment, portfolio managers rank the
investments based on their SFRatios
Common Probability Distributions
5.151
Application: Safety-First Rule (Cont.)
There are three investment alternatives:
                              A     B     C
Expected annual return (%)    25    11    14
Standard deviation (%)        27    8     20

What is the shortfall level ($R_L$)?
According to the safety-first criterion, which of the three allocations is the
best?
What is the probability that the return on the safety-first optimal portfolio
will be less than the shortfall level?
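
The safety-first questions can be sketched as below. The shortfall level is taken as the return needed to fund the $30,000 withdrawal without touching the $800,000 of initial capital, and returns are assumed to be normally distributed, as Roy's result requires.

```python
from statistics import NormalDist

# Shortfall level in percent: the withdrawal relative to initial capital.
shortfall = 30_000 / 800_000 * 100

# (expected annual return %, standard deviation %) for each allocation.
allocations = {"A": (25, 27), "B": (11, 8), "C": (14, 20)}

# SFRatio = [E(R_P) - R_L] / sigma_P
sf_ratios = {name: (mean - shortfall) / sd
             for name, (mean, sd) in allocations.items()}

best = max(sf_ratios, key=sf_ratios.get)

# Under normality, P(R_P < R_L) = N(-SFRatio).
p_shortfall = NormalDist().cdf(-sf_ratios[best])

print(f"R_L = {shortfall:.2f}%")
for name, ratio in sf_ratios.items():
    print(f"SFRatio {name} = {ratio:.4f}")
print(f"Best allocation: {best}, P(return < R_L) = {p_shortfall:.4f}")
```

Allocation A has the highest expected return, but its larger standard deviation gives it a lower SFRatio than B.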
Common Probability Distributions
5.152
Lognormal Distribution
The lognormal distribution is widely used for modeling
asset prices.

A random variable Y follows a lognormal distribution if
and only if X = lnY is normally distributed.

Note, $Y = e^X$.

If X has mean $\mu$ and variance $\sigma^2$, then Y has mean
$\exp(\mu + 0.5\sigma^2)$ and variance $\exp(2\mu + \sigma^2)[\exp(\sigma^2) - 1]$.
Common Probability Distributions
5.153
Two Lognormal Distributions
Topic 6:

Sampling and Estimation
Associate Professor Ishaq Bhatti
La Trobe Business School

E-Mail: i.bhatti@latrobe.edu.au

Slides have been drafted by the La Trobe University, School of
Business based on DeFusco et al (2007)
Statistics for Business and Finance
Chapter 6
Sampling and Estimation
6.156
Sampling
In statistics we are often interested in obtaining
information about the value of some parameters of a
population.

To obtain this information we usually take a small
subset of the population and try to draw some
conclusions from this sample.

A sampling plan is the set of rules used to select a
sample.
Sampling and Estimation
6.157
Simple Random Sampling
A simple random sample is a subset of a larger
population created in such a way that each element
of the population has an equal probability of being
selected.

Sampling and Estimation
6.158
Stratified Random Sampling
Stratified random sampling occurs when the
population is divided into subpopulations
(strata) and a simple random sample is drawn
from each stratum.
It guarantees that population subdivisions of
interest are represented in the sample.
It generates more accurate estimates (smaller
variance) than simple random sampling
Sampling and Estimation
6.159
Types of Sample Data
Cross-sectional data represent observations
over individual units at a point in time;
Time series data are a set of observations on a
variable's outcomes in different time periods;
Panel data have both time-series and cross-sectional
aspects and consist of observations
through time on a single characteristic of
multiple observational units.
Sampling and Estimation
6.160
Sampling Error and Statistic
Sampling error is the difference between the
observed value of a statistic and the quantity it is
intended to estimate.

Sampling distribution of a statistic is the distribution of
all the distinct possible values that the statistic can
assume when computed from samples of the same
size randomly drawn from the same population.
Sampling and Estimation
6.161
Example: Distribution of Sample Mean
Suppose we have a 'population' of 5 elements with
values 1, 2, 3, 4, 5

What are the average and standard deviation of this population?




Now, consider all possible samples of size 3 to provide a point
estimate of the population mean, and find the average of each
sample.
Sampling and Estimation
6.162
Example: Distribution of Sample Mean
Now, consider all possible samples of size 3 to provide a point
estimate of the population mean, and find the average of each
sample. What is the average and standard deviation of the new
distribution?
Possible Samples, Size 3        Sample mean ($\bar{x}$)
1, 2, 3
1, 2, 4
1, 2, 5
1, 3, 4
1, 3, 5
1, 4, 5
2, 3, 4
2, 3, 5
2, 4, 5
3, 4, 5
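
The sampling distribution in this example is small enough to enumerate directly. The sketch below lists every sample of size 3 drawn without replacement, computes each sample mean, and compares the spread of those means with the population spread; it uses population-style (divide by N) standard deviations throughout.

```python
from itertools import combinations
from statistics import fmean, pstdev

population = [1, 2, 3, 4, 5]

pop_mean = fmean(population)
pop_sd = pstdev(population)             # population standard deviation

# All 10 possible samples of size 3 and their means.
samples = list(combinations(population, 3))
sample_means = [fmean(s) for s in samples]

mean_of_means = fmean(sample_means)
sd_of_means = pstdev(sample_means)      # spread of the sampling distribution

for s, m in zip(samples, sample_means):
    print(s, f"{m:.3f}")
print(f"population: mean={pop_mean:.3f}, sd={pop_sd:.3f}")
print(f"sample means: mean={mean_of_means:.3f}, sd={sd_of_means:.3f}")
```

The mean of the sample means equals the population mean, while their standard deviation is much smaller than the population standard deviation. Because sampling here is without replacement from a tiny population, that standard deviation is smaller than σ/√n by a finite-population factor.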
Sampling and Estimation
6.163
Shape of the Sampling Distribution of Sample Mean
If n is large enough (> 30), the sampling distribution of $\bar{X}$
will be approximately normal regardless of the distribution
type exhibited by the population.

If the population distribution is normal, the sampling
distribution of $\bar{X}$ will be normal regardless of the sample
size.
Sampling and Estimation
6.164
Standard Error of the Sample Mean
When we use the sample mean to estimate the population mean,
there are some errors.

The standard error of the sample mean is the standard deviation of
the difference between the sample mean and the population mean.

For a sample mean calculated from a sample generated from a
population with standard deviation $\sigma$, the standard error of the
sample mean is
$\sigma_{\bar{X}} = \sigma / \sqrt{n}$
when the population standard deviation ($\sigma$) is known.
Sampling and Estimation
6.165
Standard Error of the Sample Mean
In practice, the population variance is almost always
unknown. The standard error of the sample mean is
estimated by
$s_{\bar{X}} = s / \sqrt{n}$, where $s^2 = \sum_{i=1}^{n} (X_i - \bar{X})^2 / (n - 1)$

Note,
$\sigma_{\bar{X}} \approx s_{\bar{X}}$
Sampling and Estimation
6.166
Central Limit Theorem
The central limit theorem: Given a population
described by any probability distribution having
mean $\mu$ and finite variance $\sigma^2$, the sampling
distribution of the sample mean $\bar{X}$ computed
from samples of size n from this population will
be approximately normal with mean $\mu$ (the
population mean) and variance $\sigma^2/n$ (the
population variance divided by n) when the
sample size n is large.
Sampling and Estimation
6.167
Example: Central Limit Theorem
Electronics Associates Industry has 2500 managers on salaries such that
μ = $31,800 and s = $4,000.
What is the probability that a random sample of 30 managers will have a
mean salary that lies within $1,000 of the population mean?
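
A sketch of the calculation, treating $4,000 as the population standard deviation and relying on the central limit theorem for n = 30. The finite-population correction is ignored here, which is a simplifying assumption given that the sample is a small fraction of the 2500 managers.

```python
from math import sqrt
from statistics import NormalDist

sd = 4_000          # standard deviation of salaries (treated as sigma)
n = 30              # sample size
tolerance = 1_000   # "within $1,000 of the population mean"

# Standard error of the sample mean: sigma / sqrt(n).
standard_error = sd / sqrt(n)

# Standardize the tolerance and take the central probability.
z = tolerance / standard_error
prob_within = NormalDist().cdf(z) - NormalDist().cdf(-z)

print(f"standard error = {standard_error:.2f}")
print(f"z = {z:.3f}")
print(f"P(|sample mean - mu| <= 1000) = {prob_within:.4f}")
```

The population mean of $31,800 does not enter the calculation, because only the distance from the mean matters.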


Sampling and Estimation
6.168
Confidence Intervals
Any estimate has errors. But we know that the
estimated parameter must be around the
estimate with high probability.
A confidence interval is an interval for which we
can assert with a given probability $1 - \alpha$, called
the degree of confidence, that it will contain the
parameter it is intended to estimate.
Note, here we move from point estimation to
interval estimation.
Sampling and Estimation
6.169
Confidence Intervals
A $100(1 - \alpha)\%$ confidence interval for a
parameter has the following structure:
Point estimate ± Reliability factor × Standard error
the reliability factor is a number based on the
assumed distribution of the point estimate and the
degree of confidence $(1 - \alpha)$ for the confidence
interval
standard error is the standard error of the sample
statistic providing the point estimate.
Sampling and Estimation
6.170
Confidence Intervals for the Population Mean
For a normally distributed population with
known variance:
$\bar{X} \pm z_{\alpha/2}\,\dfrac{\sigma}{\sqrt{n}}$

For a large sample, population variance
unknown:
$\bar{X} \pm z_{\alpha/2}\,\dfrac{s}{\sqrt{n}}$
Sampling and Estimation


6.171
Confidence Intervals for the Population Mean
When the population variance is unknown, we
have to use the t-distribution:
$\bar{X} \pm t_{\alpha/2}\,\dfrac{s}{\sqrt{n}}$

The t-distribution is a symmetrical
probability distribution defined by a single
parameter known as degrees of freedom
(df).
Sampling and Estimation


6.172
(Student's) t-Distribution versus the Standard
Normal Distribution
Sampling and Estimation
6.173
Choose Which Reliability Factor?
Sampling from              Small Sample Size    Large Sample Size
Normal, Known Var          z                    z
Normal, Unknown Var        t                    t (or z)
Nonnormal, Known Var       N/A                  z
Nonnormal, Unknown Var     N/A                  t (or z)
Sampling and Estimation
6.174
Example: Confidence Interval for
Population Mean
Now reconsidering that the population
variance of the distribution of Sharpe ratio
is unknown, the analyst decides to
calculate the confidence interval using the
theoretically correct t-statistic.
Compare the result obtained under normal
distribution.
Sampling and Estimation
6.175
Example: Confidence Interval for
Population Mean
ABC insurance surveys 36 policyholders in order to obtain an estimate of
the average age of all policyholders. If the average age of the sample of
policyholders is 39.5 years with a standard deviation of 1.8 years, determine a 90%
confidence interval for the average policyholder's age.
Sampling and Estimation
6.176
Example: Confidence Interval for
Population Mean
Banana Bank is interested in introducing a new computer-based training
program for use by its employees. A sample of 15 employees is selected
to undergo training on the new system and on completion it is found that
the mean duration of training is 53.87 days and s = 6.82 days. Determine a 95%
confidence interval estimate.
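
Both interval examples follow the same structure: point estimate ± reliability factor × standard error. The sketch below uses a z factor for the 36-policyholder sample (large sample) and a t factor for the 15-employee sample (small sample, assuming training times are approximately normal). The t critical value 2.145 for 14 degrees of freedom is taken from a standard t table, because the Python standard library has no t-distribution.

```python
from math import sqrt
from statistics import NormalDist


def confidence_interval(mean, sd, n, reliability_factor):
    """Return (lower, upper) for mean +/- factor * sd / sqrt(n)."""
    half_width = reliability_factor * sd / sqrt(n)
    return mean - half_width, mean + half_width


# Insurance example: n = 36, 90% confidence, z-based interval.
z_90 = NormalDist().inv_cdf(0.95)              # 5% in each tail
abc_ci = confidence_interval(39.5, 1.8, 36, z_90)

# Bank example: n = 15, 95% confidence, t-based interval with df = 14.
T_025_DF14 = 2.145                             # table value, not computed
bank_ci = confidence_interval(53.87, 6.82, 15, T_025_DF14)

print(f"ABC insurance 90% CI: ({abc_ci[0]:.2f}, {abc_ci[1]:.2f}) years")
print(f"Banana Bank 95% CI:   ({bank_ci[0]:.2f}, {bank_ci[1]:.2f}) days")
```

For the insurance sample a t factor with 35 degrees of freedom would give a slightly wider interval.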
Sampling and Estimation
6.177
Example: Confidence Interval for
Population Mean
City insurance has found that from a sample of size 36, with mean age
39.5 years and standard deviation 1.8 years, we can say with 90%
confidence that the sampling error associated with the sample mean is
about 0.5 years. What sample size would we need to obtain this sampling error with
95% confidence?
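
One way to answer the sample-size question is to solve the margin-of-error expression $E = z_{\alpha/2}\, s / \sqrt{n}$ for n. The sketch below assumes a z-based reliability factor and treats the sample standard deviation of 1.8 years as a fixed planning value.

```python
from math import ceil
from statistics import NormalDist


def required_sample_size(sd, margin, confidence):
    """Smallest n with z * sd / sqrt(n) <= margin at the given confidence."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil((z * sd / margin) ** 2)


n_95 = required_sample_size(sd=1.8, margin=0.5, confidence=0.95)
n_90 = required_sample_size(sd=1.8, margin=0.5, confidence=0.90)

print(f"n needed for 95% confidence: {n_95}")
print(f"n needed for 90% confidence: {n_90}")
```

Raising the confidence level while holding the margin fixed requires a larger sample, because the reliability factor grows from about 1.645 to about 1.96.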
Sampling and Estimation
6.178
Selection of Sample Size
What conclusion can we draw from the
previous example?
All else equal, a larger sample size
decreases the width of the confidence
interval because
Standard error of the sample mean = Sample standard deviation / $\sqrt{\text{Sample size}}$
and the reliability factor (the t critical value)
declines as the degrees of freedom increase.
Topic 7:

Hypothesis Testing
Associate Professor Ishaq Bhatti
La Trobe Business School

E-Mail: i.bhatti@latrobe.edu.au

Slides have been drafted by the La Trobe University, School of
Business based on DeFusco et al (2007)
Statistics for Business and Finance
Chapter 7
Hypothesis Testing
Hypothesis Testing
Statistical inference has two subdivisions: estimation and
hypothesis testing
Estimation addresses the question: what is this
parameter's value?
For example, what is the population mean of annual returns for
company XYZ?
The answer is usually a confidence interval built around a point
estimate.
A hypothesis testing question is: "Is the value of the
parameter equal to a specified value?"
For example, is the average annual return of XYZ equal to 10%?
The statement "Average annual return of XYZ is equal to
10%" is called a hypothesis
Hypothesis Testing
Steps in Hypothesis Testing
1. Stating the hypotheses.
2. Identifying the appropriate test statistic
and its probability distribution.
3. Specifying the significance level.
4. Stating the decision rule.
5. Collecting the data and calculating the
test statistic.
6. Making the statistical decision.
Hypothesis Testing
Null vs. Alternative Hypothesis (Step 1)
The null hypothesis is the hypothesis to be
tested.
Null hypothesis can never be accepted. We can either reject the null
(hence accepting the alternative) or not reject the null (and not
accepting the alternative either!)
The alternative hypothesis is the hypothesis
accepted when the null hypothesis is rejected.
E.g.:
$H_0$: Population average risk premium for Canadian equities is less
than or equal to zero.
$H_A$: Population average risk premium for Canadian equities is greater
than zero.

Hypothesis Testing
Formulation of Hypotheses
(Step 1)
There are three ways to formulate hypotheses:
1. $H_0: \mu = \mu_0$ versus $H_a: \mu \ne \mu_0$
2. $H_0: \mu \le \mu_0$ versus $H_a: \mu > \mu_0$
3. $H_0: \mu \ge \mu_0$ versus $H_a: \mu < \mu_0$

The first formulation is a two-sided test. The other two are one-sided tests.

E.g., we may want to test
$H_0$: Population average risk premium for Canadian equities is less
than or equal to zero ($H_0: \mu \le 0$).
$H_A$: Population average risk premium for Canadian equities is
greater than zero ($H_a: \mu > 0$).
Hypothesis Testing
Test Statistic (Step 2)
A test statistic is a quantity, calculated
based on a sample, whose value is the
basis for deciding whether or not to reject
the null hypothesis.

Test statistic = (Sample statistic − Value of the population parameter under $H_0$) / Standard error of the sample statistic
Hypothesis Testing
Test Statistic (Step 2)
Note, if we test a hypothesis about the population mean, the
standard error of the sample statistic is calculated by
the same formulas as we used in the last topic:
$\sigma_{\bar{X}} = \sigma / \sqrt{n}$, if the standard deviation of the population is known
$s_{\bar{X}} = s / \sqrt{n}$, if the standard deviation of the population is unknown

In our example, we are testing whether the average can be 0. The test
statistic for our example is
$\dfrac{\bar{X} - 0}{\sigma_{\bar{X}}}$ or $\dfrac{\bar{X} - 0}{s_{\bar{X}}}$
Hypothesis Testing
Two types of Errors (Step 3)
In reaching a statistical decision, we can
make two possible errors:

We may reject a true null hypothesis (a Type I
error), or
Probability of a Type I error is denoted by the Greek letter alpha, $\alpha$
We may fail to reject a false null hypothesis (a
Type II error).
Probability of a Type II error is denoted by the Greek letter beta, $\beta$

Hypothesis Testing
Level of Significance (Step 3)
The level of significance of a test is the
probability of a Type I error that we are
prepared to accept in conducting a hypothesis
test, and is denoted by $\alpha$.
The standard approach to hypothesis testing
involves specifying a level of significance
(probability of Type I error) only.
Conventional significance levels: 0.1 (some
evidence), 0.05 (strong evidence), 0.01 (very
strong evidence).
Hypothesis Testing
Level of Significance (Step 3)
Trade-off: all else equal, if we decrease
the probability of a Type I error by
specifying a smaller
significance level, we increase the
probability of making a Type II error.
The power of a test is the probability of
correctly rejecting the null (rejecting the
null when it is false).
It is equal to 1 minus the probability of a
Type II error.
Hypothesis Testing
Rejection Points (Step 4)
A rejection point (critical value) for a test
statistic is a value with which the
computed test statistic is compared to
decide whether to reject the null
hypothesis or not.
For a one-tailed test, we indicate a rejection
point using the symbol for the test statistic
with a subscript of the significance level (e.g.,
$z_\alpha$, $t_\alpha$)
For a two-tailed test, the subscript is half of
the significance level (e.g., $z_{\alpha/2}$, $t_{\alpha/2}$)
Hypothesis Testing
Example: Rejection Points of a One-
Sided z-test
If the Canadian equity risk premium is normally distributed and its
variance is known, then we can use a z-test to test the
hypotheses at the 0.05 level of significance:

$H_0: \mu \le 0$ (average risk premium is less than or equal to zero)
versus
$H_A: \mu > 0$ (average risk premium is greater than zero)

One rejection point exists: $z_{0.05} = 1.645$
Hypothesis Testing
Example: Rejection Points of a One-
Sided z-test
So if our sample data yield
$\dfrac{\bar{X} - 0}{\sigma_{\bar{X}}} > 1.645$
we reject the null hypothesis.
Otherwise, we do not reject the null hypothesis.
We can't accept it either!
This is illustrated by the following slide.

Note, for $H_0: \mu \ge 0$ versus $H_a: \mu < 0$, the rejection point is $-z_{0.05} = -1.645$
and we reject the null hypothesis if z < -1.645
Hypothesis Testing
Rejection Point, 0.05 Significance Level, One-Sided Test of the
Population Mean Using a z-Test
Hypothesis Testing
Example: Rejection Point of a Two-Sided
z-test
On the other hand, if the hypotheses are

$H_0: \mu = 0$ (The average Canadian equity risk premium is zero)
versus
$H_a: \mu \ne 0$ (The average Canadian equity risk premium is not
equal to zero)

There are two rejection points. If the significance level is
still 0.05, the rejection points are:
$z_{0.025} = 1.96$ and $-z_{0.025} = -1.96$
from the normal distribution table.
Hypothesis Testing
Rejection Points, 0.05 Significance Level, Two-Sided Test of
the Population Mean Using a z-Test
Hypothesis Testing
Example: Rejection Points of a Two-
Sided z-test
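So if sample data yield either
$\dfrac{\bar{X} - 0}{\sigma_{\bar{X}}} > 1.96$ or $\dfrac{\bar{X} - 0}{\sigma_{\bar{X}}} < -1.96$
we reject the null hypothesis as illustrated by the next
slide. But we do not reject the null hypothesis if
$-1.96 \le \dfrac{\bar{X} - 0}{\sigma_{\bar{X}}} \le 1.96$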
Hypothesis Testing
Confidence Interval
The $(1 - \alpha)$ confidence interval represents
the range of values of the test statistic for
which the null hypothesis will not be rejected
at an $\alpha$ significance level.
In our previous examples, the 95%
confidence intervals are, respectively
$(-\infty,\ 0 + 1.645\,\sigma_{\bar{X}}]$ for the one-sided test
$[0 - 1.96\,\sigma_{\bar{X}},\ 0 + 1.96\,\sigma_{\bar{X}}]$ for the two-sided test
Hypothesis Testing
Collecting Data and Calculating the Test
Statistic (Step 5)
Data collection issues:
Measurement errors
Sample selection bias and time-period bias
Test statistic calculation has been shown in the
previous examples.

Hypothesis Testing
Making Statistical Decision (Step 6)
Compare the calculated test statistic
with the corresponding critical value to
decide whether to reject the null
hypothesis or not.
Although we will meet other tests below,
the basic principle of statistical decision
making is the same.
Hypothesis Testing
p-Value
An alternative approach is called p-value
approach.
The p-value is the smallest level of
significance at which the null hypothesis
can be rejected.
The smaller the p-value, the stronger the
evidence against the null hypothesis and
in favor of the alternative hypothesis.
Hypothesis Testing
Hypothesis Tests Concerning the Mean:
t-tests

Can test that the mean of a population is equal
to or differs from some hypothesized value.
Can test to see if the sample means from two
different populations differ.
Hypothesis Testing
Tests Concerning a Single Mean: t-test
A t-test is used to test a hypothesis
concerning the value of a population
mean, if the variance is unknown and
the sample is large, or
the sample is small but the population is
normally distributed, or approximately
normally distributed.
Hypothesis Testing
Tests Concerning a Single Mean


$t_{n-1} = \dfrac{\bar{X} - \mu_0}{s / \sqrt{n}}$
where,
$t_{n-1}$ = t-statistic with n − 1 degrees of freedom
$\bar{X}$ = sample mean
$\mu_0$ = the hypothesized value of the population mean
$s$ = sample standard deviation

Hypothesis Testing
Example: Testing Rio Tinto Mean Return
Sender Equity Fund has achieved a mean monthly return
of 1.50% with a sample standard deviation of 3.60% during a 24-month
period. Given its level of systematic risk, the fund
is expected to have earned a 1.10% mean monthly return.
Assuming returns are normally distributed, is the actual result
consistent with an underlying or population mean monthly
return of 1.10% at the 10% level of significance?
Hypothesis Testing
Example: Testing Rio Tinto Mean Return
cont.
The hypothesis statement is: $H_0$: the mean monthly return is equal to 1.10%
The alternative hypothesis is $H_a$: the mean monthly return is not equal to 1.10%
We use a two-sided t-test for the 24-month period (n = 24)

Step 1: $H_0: \mu = 1.10$
$H_a: \mu \ne 1.10$

Step 2: test statistic = $t_{n-1} = \dfrac{\bar{X} - \mu_0}{s / \sqrt{n}} = \dfrac{1.50 - 1.10}{3.60 / \sqrt{24}} = 0.544$

Step 3: a two-tailed test has two rejection points:
$t_{\alpha/2,\,df} = t_{0.05,\,23} = 1.714$ and $-t_{\alpha/2,\,df} = -t_{0.05,\,23} = -1.714$

Step 4: reject if 0.544 < -1.714 or 0.544 > 1.714

Can we reject?
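
The steps of this example can be traced in code. The sketch assumes the null value is the 1.10% expected return and takes the critical value 1.714 (t with 23 degrees of freedom, 5% in each tail) from a t table, since the Python standard library does not provide t quantiles.

```python
from math import sqrt


def t_statistic(sample_mean, hypothesized_mean, sample_sd, n):
    """t with n - 1 degrees of freedom: (mean - mu0) / (s / sqrt(n))."""
    return (sample_mean - hypothesized_mean) / (sample_sd / sqrt(n))


t_stat = t_statistic(sample_mean=1.50, hypothesized_mean=1.10,
                     sample_sd=3.60, n=24)

T_CRITICAL = 1.714   # table value for alpha/2 = 0.05, df = 23

# Two-sided decision rule: reject when |t| exceeds the critical value.
reject_null = abs(t_stat) > T_CRITICAL

print(f"t = {t_stat:.3f}, reject H0: {reject_null}")
```

Since 0.544 lies between the two rejection points, the null hypothesis is not rejected at the 10% level.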
Hypothesis Testing
Example: Testing average Days of
Receivables
FashionDesigns is concerned about a possible slowdown
in payments from its customers. The rate of payment is
measured by the average number of days in receivables.
FashionDesigns has generally maintained an average of
45 days in receivables. A recent random sample of 50
accounts shows a mean number of days in receivables of
49 with a standard deviation of 8 days. Determine whether
the evidence supports the suspected condition that
customer payments have slowed at 5% level of
significance.
Hypothesis Testing
Example: Testing average Days of
Receivables
The hypothesis statement is: $H_0$: the number of days in receivables is less than or equal to 45
The alternative hypothesis is $H_a$: the number of days in receivables is more than 45
We use a one-sided t-test for the 50 accounts (n = 50)

Step 1: $H_0: \mu \le 45$
$H_a: \mu > 45$

Step 2: test statistic = $t_{n-1} = \dfrac{\bar{X} - \mu_0}{s / \sqrt{n}} = \dfrac{49 - 45}{8 / \sqrt{50}} = 3.536$

Step 3: a one-tailed test has one rejection point:
$t_{\alpha,\,df} = t_{0.05,\,49} = 1.677$

Step 4: reject if 3.536 > 1.677

Can we reject?
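
A short sketch of the receivables test. The critical value 1.677 (t with 49 degrees of freedom, 5% in the upper tail) is the table value quoted in the example and is hard-coded here as an input.

```python
from math import sqrt

sample_mean, hypothesized_mean = 49, 45   # days in receivables
sample_sd, n = 8, 50

# t statistic with n - 1 = 49 degrees of freedom.
t_stat = (sample_mean - hypothesized_mean) / (sample_sd / sqrt(n))

T_CRITICAL = 1.677   # table value for alpha = 0.05, df = 49

# One-sided (upper-tail) decision rule: reject when t exceeds the critical value.
reject_null = t_stat > T_CRITICAL

print(f"t = {t_stat:.3f}, reject H0: {reject_null}")
```

Here the statistic is well beyond the rejection point, so the data support the claim that payments have slowed at the 5% level.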
Hypothesis Testing
The z-Test Alternative
If the population sampled is normally
distributed with known variance, then the
test statistic for a hypothesis test
concerning a single population mean, , is


$z = \dfrac{\bar{X} - \mu_0}{\sigma / \sqrt{n}}$
where,
$\mu_0$ = the hypothesized value of the population mean
$\sigma$ = known population standard deviation
Hypothesis Testing
The z-Test Alternative
If the population sampled has unknown
variance and the sample is large, in place
of a t-test, an alternative statistic is


$z = \dfrac{\bar{X} - \mu_0}{s / \sqrt{n}}$
where
$s$ = sample standard deviation

Hypothesis Testing
Rejection Points for a z-Test
For $\alpha = 0.10$
1. $H_0: \mu = \mu_0$ versus $H_a: \mu \ne \mu_0$
Reject the null hypothesis if z > 1.645 or
if z < -1.645.
2. $H_0: \mu \le \mu_0$ versus $H_a: \mu > \mu_0$
Reject the null hypothesis if z > 1.282
3. $H_0: \mu \ge \mu_0$ versus $H_a: \mu < \mu_0$
Reject the null hypothesis if z < -1.282

Hypothesis Testing
Rejection Points for a z-Test
For $\alpha = 0.05$
1. $H_0: \mu = \mu_0$ versus $H_a: \mu \ne \mu_0$
Reject the null hypothesis if z > 1.96 or if
z < -1.96.
2. $H_0: \mu \le \mu_0$ versus $H_a: \mu > \mu_0$
Reject the null hypothesis if z > 1.645
3. $H_0: \mu \ge \mu_0$ versus $H_a: \mu < \mu_0$
Reject the null hypothesis if z < -1.645

Hypothesis Testing
Rejection Points for a z-Test
For $\alpha = 0.01$
1. $H_0: \mu = \mu_0$ versus $H_a: \mu \ne \mu_0$
Reject the null hypothesis if z > 2.576 or
if z < -2.576
2. $H_0: \mu \le \mu_0$ versus $H_a: \mu > \mu_0$
Reject the null hypothesis if z > 2.326.
3. $H_0: \mu \ge \mu_0$ versus $H_a: \mu < \mu_0$
Reject the null hypothesis if z < -2.326
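
The rejection points listed for the three significance levels can be reproduced from the inverse standard normal cdf. The sketch below uses `statistics.NormalDist.inv_cdf` (Python 3.8 or later); the function name `z_rejection_points` is chosen for this illustration.

```python
from statistics import NormalDist


def z_rejection_points(alpha):
    """Return (two_sided, one_sided) positive rejection points for a z-test."""
    z = NormalDist()
    two_sided = z.inv_cdf(1 - alpha / 2)   # alpha/2 in each tail
    one_sided = z.inv_cdf(1 - alpha)       # all of alpha in one tail
    return two_sided, one_sided


for alpha in (0.10, 0.05, 0.01):
    two_sided, one_sided = z_rejection_points(alpha)
    print(f"alpha={alpha:.2f}: two-sided +/-{two_sided:.3f}, "
          f"one-sided {one_sided:.3f}")
```

For a lower-tail test the rejection point is the negative of the one-sided value.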

Hypothesis Testing
t-test or z-test?
Population normal and variance known: z-test,
for both small and large samples.
Variance unknown:
                         Large Sample (n ≥ 30)    Small Sample (n < 30)
Population normal        t-test (or z-test)       t-test
Population non-normal    t-test (or z-test)       Not available
Hypothesis Testing
Tests Concerning the Differences
between Means
Sometimes we are interested in testing
whether the mean value differs between
two groups.
If it is reasonable to assume
populations are normally distributed
samples are independent
we can combine observations from both
samples to get a pooled estimate of the
unknown population variance.
Hypothesis Testing
Formulation of Hypotheses
1. $H_0: \mu_1 - \mu_2 = 0$ versus $H_A: \mu_1 - \mu_2 \ne 0$
2. $H_0: \mu_1 - \mu_2 \le 0$ versus $H_A: \mu_1 - \mu_2 > 0$
3. $H_0: \mu_1 - \mu_2 \ge 0$ versus $H_A: \mu_1 - \mu_2 < 0$

Hypothesis Testing
Test Statistics for Difference between Two Population
Means: equal variances
For normally distributed populations with
unknown variances, but if the variances
can be assumed to be equal, the t-statistic
is
$t = \dfrac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\left(\dfrac{s_p^2}{n_1} + \dfrac{s_p^2}{n_2}\right)^{1/2}}$
where $s_p^2 = \dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$
and the degrees of freedom are $n_1 + n_2 - 2$
Hypothesis Testing
Test Statistics for Difference between Two Population
Means: unequal variances
For normally distributed populations, if population
variances are unequal and unknown, the t-statistic is
$t = \dfrac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^{1/2}}$

and the degrees of freedom are given by
$df = \dfrac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}{\dfrac{(s_1^2/n_1)^2}{n_1} + \dfrac{(s_2^2/n_2)^2}{n_2}}$
Hypothesis Testing
Example: Mean Return on S&P 500
The realized mean monthly return on the S&P
500 in the 1980s appears to have been
substantially different from that in the 1970s.
Was the difference statistically significant?
The data, shown on the next slide, indicate that
assuming equal population variances for returns
in the two decades is not unreasonable. But if
you assumed unequal variances, would you
reach a different conclusion?
Assume 5% level of significance
Hypothesis Testing
S&P 500 Monthly Return and Standard
Deviation
Decade No. of months Mean return St. Dev.
1970s 120 0.580 4.598
1980s 120 1.470 4.738
Hypothesis Testing
Example: Mean Return on S&P 500 cont.
We assume equal variance to answer the
question: Was the difference statistically
significant?
That is, is the difference between the average returns in the 1980s and the 1970s significantly different from zero?
In other words, is $\mu_{80s} - \mu_{70s} = 0$ or is $\mu_{80s} - \mu_{70s} \ne 0$?
Hypothesis Testing
Example: Mean Return on S&P 500 cont.


Step 1: $H_0: \mu_{80s} - \mu_{70s} = 0$ versus $H_a: \mu_{80s} - \mu_{70s} \ne 0$

Step 2: test statistic
$$t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\left(\frac{s_p^2}{n_1} + \frac{s_p^2}{n_2}\right)^{1/2}} = 1.4767$$
Note that you must find $s_p^2$ first and then use it to calculate the test statistic.

Step 3: a two-tailed test has two rejection points:
$t_{\alpha/2,df} = t_{0.025,238} = 1.97$ and $-t_{\alpha/2,df} = -t_{0.025,238} = -1.97$

Step 4: reject if 1.477 < -1.97 or 1.477 > 1.97

Can we reject?
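The S&P 500 calculation can be checked numerically. The sketch below is only an illustration: the function names are hypothetical, and the inputs are the means, standard deviations and sample sizes from the slide table. It computes the t-statistic under both the equal-variance and the unequal-variance assumptions.

```python
import math

def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """t-statistic and df for H0: mu1 - mu2 = 0, assuming equal variances."""
    sp2 = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2)
    se = math.sqrt(sp2 / n1 + sp2 / n2)
    return (mean1 - mean2) / se, n1 + n2 - 2

def unequal_var_t(mean1, sd1, n1, mean2, sd2, n2):
    """t-statistic and approximate df for H0: mu1 - mu2 = 0, unequal variances."""
    v1, v2 = sd1**2 / n1, sd2**2 / n2
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1**2 / n1 + v2**2 / n2)
    return t, df

# Sample 1 = 1980s, sample 2 = 1970s (values from the slide table)
t_eq, df_eq = pooled_t(1.470, 4.738, 120, 0.580, 4.598, 120)
t_uneq, df_uneq = unequal_var_t(1.470, 4.738, 120, 0.580, 4.598, 120)
print(round(t_eq, 4), df_eq)
print(round(t_uneq, 4), round(df_uneq, 1))
```

Because the two samples have the same size, both versions give the same t-statistic here; only the degrees of freedom differ slightly.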
Hypothesis Testing
Mean Differences Samples Not Independent
Reminder: In the previous two t-tests, samples
are assumed to be independent
If the samples are not independent, a test of
mean difference is done using paired
observations.
1. $H_0: \mu_d = \mu_{d0}$ versus $H_A: \mu_d \ne \mu_{d0}$
2. $H_0: \mu_d \le \mu_{d0}$ versus $H_A: \mu_d > \mu_{d0}$
3. $H_0: \mu_d \ge \mu_{d0}$ versus $H_A: \mu_d < \mu_{d0}$
where $\mu_d$ stands for the population mean difference and $\mu_{d0}$ is a hypothesized value for the population mean difference.
Hypothesis Testing
t-statistic for Mean Differences
Samples Not Independent
To calculate the t-statistic, we first need to find the sample mean difference:
$$\bar{d} = \frac{1}{n}\sum_{i=1}^{n} d_i$$
where $d_i$ is the difference between two paired observations (the ith pair).
The sample variance is
$$s_d^2 = \frac{\sum_{i=1}^{n} (d_i - \bar{d})^2}{n - 1}$$
Hypothesis Testing
t-statistic for Mean Differences
Samples Not Independent
The standard error of the mean difference is
$$s_{\bar{d}} = \frac{s_d}{\sqrt{n}}$$
The test statistic, with n - 1 df, is
$$t = \frac{\bar{d} - \mu_{d0}}{s_{\bar{d}}}$$
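The paired-comparisons steps can be sketched in a few lines of code. The function name and the sample data below are hypothetical and chosen only so that the arithmetic is easy to follow by hand.

```python
import math

def paired_t(sample_a, sample_b, mu_d0=0.0):
    """t-statistic and df for a test of the mean of paired differences."""
    d = [a - b for a, b in zip(sample_a, sample_b)]
    n = len(d)
    d_bar = sum(d) / n                                   # sample mean difference
    s_d2 = sum((x - d_bar) ** 2 for x in d) / (n - 1)    # sample variance
    se = math.sqrt(s_d2) / math.sqrt(n)                  # standard error
    return (d_bar - mu_d0) / se, n - 1

# Hypothetical paired observations; the differences are 1, 2, 3, 4, 5
t_paired, df_paired = paired_t([2, 4, 6, 8, 10], [1, 2, 3, 4, 5])
print(round(t_paired, 4), df_paired)
```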
Hypothesis Testing
Hypothesis Tests Concerning Variance
We examine two types:
tests concerning the value of a single
population variance and
tests concerning the differences between
two population variances.
Hypothesis Testing
Tests Concerning a Single Population
Variance
We can formulate hypotheses as follows:
1. $H_0: \sigma^2 = \sigma_0^2$ versus $H_a: \sigma^2 \ne \sigma_0^2$
2. $H_0: \sigma^2 \le \sigma_0^2$ versus $H_a: \sigma^2 > \sigma_0^2$
3. $H_0: \sigma^2 \ge \sigma_0^2$ versus $H_a: \sigma^2 < \sigma_0^2$
Hypothesis Testing
Test-Statistic for Tests Concerning a Single
Population Variance
If we have n independent observations from a normally distributed population, the appropriate test statistic is the chi-squared statistic
$$\chi^2 = \frac{(n - 1)s^2}{\sigma_0^2}, \text{ with } n - 1 \text{ degrees of freedom,}$$
where $s^2$ is the sample variance,
$$s^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}$$
Hypothesis Testing
Rejection Points for Tests Concerning a
Single Population Variance
1. "Equal to" $H_0$: Reject the null if the statistic is greater than $\chi^2_{\alpha/2}$ or smaller than $\chi^2_{1-\alpha/2}$.
2. "Not greater than" $H_0$: Reject the null if the statistic is greater than $\chi^2_{\alpha}$.
3. "Not less than" $H_0$: Reject the null if the statistic is less than $\chi^2_{1-\alpha}$.
Hypothesis Testing
Tests Concerning the Equality (Inequality) of
Two Variances
Suppose we want to know the relative
values of the variances of two populations,
we can formulate one of the following
hypotheses:
1. $H_0: \sigma_1^2 = \sigma_2^2$ versus $H_a: \sigma_1^2 \ne \sigma_2^2$
2. $H_0: \sigma_1^2 \le \sigma_2^2$ versus $H_a: \sigma_1^2 > \sigma_2^2$
3. $H_0: \sigma_1^2 \ge \sigma_2^2$ versus $H_a: \sigma_1^2 < \sigma_2^2$
Hypothesis Testing
Test Statistic for Tests Concerning the
Equality (Inequality) of Two Variances
Suppose we have two samples, the first has $n_1$ observations with sample variance $s_1^2$ and the second has $n_2$ observations with sample variance $s_2^2$. If both populations are normal, the test statistic is
$$F = \frac{s_1^2}{s_2^2}, \text{ with } (n_1 - 1) \text{ and } (n_2 - 1) \text{ degrees of freedom.}$$
Hypothesis Testing
Rejection Points for Tests Concerning the
Equality (Inequality) of Two Variances
Convention: Let the sample with the larger variance be sample 1 and the other be sample 2, so that the F-statistic is always greater than or equal to 1.
Thus, the decision rule is:
1. "Equal to" $H_0$: Reject the null if the statistic is greater than $F_{\alpha/2}$.
2. "Not greater than" and "Not less than" $H_0$: Reject the null if the statistic is greater than $F_{\alpha}$.
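As a rough illustration of the larger-variance-on-top convention, the sketch below computes the F-statistic for the two S&P 500 decades from the earlier example (standard deviations 4.738 and 4.598, 120 months each). The function name is hypothetical, and the critical value would still have to be read from an F table.

```python
def variance_ratio_f(sd_a, n_a, sd_b, n_b):
    """F-statistic with the larger sample variance in the numerator."""
    var_a, var_b = sd_a**2, sd_b**2
    if var_a >= var_b:
        return var_a / var_b, (n_a - 1, n_b - 1)
    return var_b / var_a, (n_b - 1, n_a - 1)

# 1970s first, 1980s second: the function reorders them so that F >= 1
f_stat, f_df = variance_ratio_f(4.598, 120, 4.738, 120)
print(round(f_stat, 4), f_df)
```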

Hypothesis Testing
5.232
Thank You!
233

Slide
Chapter 8
Correlation and Simple Linear Regression, pp. 283-300
The Simple Linear Regression Model
The Least Squares Method
The Coefficient of Determination
Model Assumptions
Testing for Significance
Using the Estimated Regression Equation for Estimation and Prediction
Residual Analysis: Validating Model Assumptions
Residual Analysis: Outliers and Influential Observations

234

Slide
The Simple Linear Regression Model
Simple Linear Regression Model
$y = \beta_0 + \beta_1 x + \varepsilon$
Simple Linear Regression Equation
$E(y) = \beta_0 + \beta_1 x$
Estimated Simple Linear Regression Equation
$\hat{y} = b_0 + b_1 x$
where y is the dependent variable.
235

Slide
The Least Squares Method
Least Squares Criterion
$$\min \sum (y_i - \hat{y}_i)^2$$
where
$y_i$ = observed value of the dependent variable for the ith observation
$\hat{y}_i$ = estimated value of the dependent variable for the ith observation
236

Slide
The Least Squares Method
Slope for the Estimated Regression Equation
$$b_1 = \frac{\sum x_i y_i - (\sum x_i \sum y_i)/n}{\sum x_i^2 - (\sum x_i)^2/n}$$
y-Intercept for the Estimated Regression Equation
$b_0 = \bar{y} - b_1 \bar{x}$
where
$x_i$ = value of independent variable for ith observation
$y_i$ = value of dependent variable for ith observation
$\bar{x}$ = mean value for independent variable
$\bar{y}$ = mean value for dependent variable
n = total number of observations
237

Slide
Example: Reed Auto Sales
Reed Auto periodically has a special week-long sale.
As part of the advertising campaign Reed runs one or
more television commercials during the weekend
preceding the sale. Data from a sample of 5 previous
sales showing the number of TV ads run and the
number of cars sold in each sale are shown below.
Number of TV Ads Number of Cars Sold
1 14
3 24
2 18
1 17
3 27
238

Slide
Example: Reed Auto Sales
Slope for the Estimated Regression Equation
$$b_1 = \frac{220 - (10)(100)/5}{24 - (10)^2/5} = 5$$
y-Intercept for the Estimated Regression Equation
$b_0 = 20 - 5(2) = 10$
Estimated Regression Equation
$\hat{y} = 10 + 5x$
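The Reed Auto coefficients can be reproduced with a short script. This is a minimal sketch of the slope and intercept formulas applied to the five observations in the data table; the function and variable names are illustrative only.

```python
def least_squares(x, y):
    """Return (b0, b1) for the estimated regression equation y_hat = b0 + b1 * x."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    b1 = (sum_xy - sum_x * sum_y / n) / (sum_x2 - sum_x**2 / n)
    b0 = sum_y / n - b1 * sum_x / n
    return b0, b1

tv_ads = [1, 3, 2, 1, 3]
cars_sold = [14, 24, 18, 17, 27]
b0, b1 = least_squares(tv_ads, cars_sold)
print(b0, b1)
```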
239

Slide
The Coefficient of Determination
Relationship Among SST, SSR, SSE
SST = SSR + SSE
$$\sum (y_i - \bar{y})^2 = \sum (\hat{y}_i - \bar{y})^2 + \sum (y_i - \hat{y}_i)^2$$
Coefficient of Determination
$r^2 = SSR/SST$
where
SST = total sum of squares
SSR = sum of squares due to regression
SSE = sum of squares due to error
240

Slide
Example: Reed Auto Sales
Coefficient of Determination
$r^2 = SSR/SST = 100/114 = .88$

The regression relationship is very strong since 88% of the variation in number of cars sold can be explained by the linear relationship between the number of TV ads and the number of cars sold.
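The sums of squares behind this result can be verified directly. The sketch below assumes the estimated equation with intercept 10 and slope 5 from the earlier slide and uses the five Reed Auto observations; the variable names are illustrative.

```python
tv_ads = [1, 3, 2, 1, 3]
cars_sold = [14, 24, 18, 17, 27]
b0, b1 = 10.0, 5.0  # estimated regression equation from the slides

y_bar = sum(cars_sold) / len(cars_sold)
y_hat = [b0 + b1 * x for x in tv_ads]

sst = sum((y - y_bar) ** 2 for y in cars_sold)            # total sum of squares
ssr = sum((yh - y_bar) ** 2 for yh in y_hat)              # due to regression
sse = sum((y - yh) ** 2 for y, yh in zip(cars_sold, y_hat))  # due to error
r_squared = ssr / sst
print(sst, ssr, sse, round(r_squared, 3))
```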
241

Slide
The Correlation Coefficient & Hypothesis Testing
The sample correlation coefficient is plus or minus the square root of the coefficient of determination.
Sample Correlation Coefficient
$r_{xy} = \pm\sqrt{r^2}$
Testing for Sample Correlation Coefficient (p. 297)
242

Slide
Model Assumptions
Assumptions About the Error Term $\varepsilon$
The error $\varepsilon$ is a random variable with mean of zero.
The variance of $\varepsilon$, denoted by $\sigma^2$, is the same for all values of the independent variable.
The values of $\varepsilon$ are independent.
The error $\varepsilon$ is a normally distributed random variable.
243

Slide
Testing for Significance: F Test
Hypotheses
$H_0: \beta_1 = 0$
$H_a: \beta_1 \ne 0$
Test Statistic
F = MSR/MSE
Rejection Rule
Reject $H_0$ if $F > F_{\alpha}$
where $F_{\alpha}$ is based on an F distribution with 1 d.f. in the numerator and n - 2 d.f. in the denominator.
244

Slide
Testing for Significance: t Test (p.312)
Hypotheses
$H_0: \beta_1 = 0$
$H_a: \beta_1 \ne 0$
Test Statistic
$$t = \frac{b_1}{s_{b_1}}$$
Rejection Rule
Reject $H_0$ if $t < -t_{\alpha/2}$ or $t > t_{\alpha/2}$
where $t_{\alpha/2}$ is based on a t distribution with n - 2 degrees of freedom.
245

Slide
Using the Estimated Regression Equation
for Estimation and Prediction
Confidence Interval Estimate of $E(y_p)$
$\hat{y}_p \pm t_{\alpha/2} \, s_{\hat{y}_p}$
Prediction Interval Estimate of $y_p$
$\hat{y}_p \pm t_{\alpha/2} \, s_{ind}$
where the confidence coefficient is $1 - \alpha$ and $t_{\alpha/2}$ is based on a t distribution with n - 2 d.f.
246

Slide
Example: Reed Auto Sales
F Test
Hypotheses
$H_0: \beta_1 = 0$
$H_a: \beta_1 \ne 0$
Rejection Rule
For $\alpha = .05$ and d.f. = 1, 3: $F_{.05} = 10.13$
Reject $H_0$ if F > 10.13.
Test Statistic
F = MSR/MSE = 100/4.667 = 21.43
Conclusion
We can reject $H_0$.
247

Slide
Example: Reed Auto Sales
t Test
Hypotheses
$H_0: \beta_1 = 0$
$H_a: \beta_1 \ne 0$
Rejection Rule
For $\alpha = .05$ and d.f. = 3, $t_{.025} = 3.182$
Reject $H_0$ if t < -3.182 or t > 3.182
Test Statistic
t = 5/1.08 = 4.63
Conclusion
Reject $H_0: \beta_1 = 0$
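Both test statistics for the Reed Auto example can be recomputed from the raw data. The sketch below assumes the fitted equation from the slides (intercept 10, slope 5) and uses the standard textbook formula for the standard error of the slope, which the slides do not spell out: the square root of MSE divided by the square root of the sum of squared deviations of x.

```python
import math

tv_ads = [1, 3, 2, 1, 3]
cars_sold = [14, 24, 18, 17, 27]
b0, b1 = 10.0, 5.0  # estimated regression equation from the slides
n = len(tv_ads)

x_bar = sum(tv_ads) / n
y_bar = sum(cars_sold) / n
y_hat = [b0 + b1 * x for x in tv_ads]

ssr = sum((yh - y_bar) ** 2 for yh in y_hat)
sse = sum((y - yh) ** 2 for y, yh in zip(cars_sold, y_hat))
msr = ssr / 1            # one independent variable
mse = sse / (n - 2)

f_stat = msr / mse
s_b1 = math.sqrt(mse) / math.sqrt(sum((x - x_bar) ** 2 for x in tv_ads))
t_stat = b1 / s_b1
print(round(f_stat, 2), round(s_b1, 2), round(t_stat, 2))
```

In simple linear regression the two tests are equivalent, which shows up here as the F-statistic being the square of the t-statistic.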
248

Slide
Example: Reed Auto Sales
Point Estimation
If 3 TV ads are run prior to a sale, we expect the mean number of cars sold to be:
$\hat{y} = 10 + 5(3) = 25$ cars
Confidence Interval for $E(y_p)$
95% confidence interval estimate of the mean number of cars sold when 3 TV ads are run is:
25 ± 4.61 = 20.39 to 29.61 cars
Prediction Interval for $y_p$
95% prediction interval estimate of the number of cars sold in one particular week when 3 TV ads are run is:
25 ± 8.28 = 16.72 to 33.28 cars
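The two interval half-widths can be reproduced as follows. This is a sketch under stated assumptions: the slides do not give the formulas for the two standard errors, so the standard textbook expressions are used, and the critical value 3.182 is taken from the earlier t test slide rather than computed.

```python
import math

tv_ads = [1, 3, 2, 1, 3]
cars_sold = [14, 24, 18, 17, 27]
b0, b1 = 10.0, 5.0   # estimated regression equation from the slides
t_crit = 3.182       # t_.025 with 3 d.f., taken from the slides
x_p = 3              # number of TV ads for the prediction
n = len(tv_ads)

x_bar = sum(tv_ads) / n
sxx = sum((x - x_bar) ** 2 for x in tv_ads)
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(tv_ads, cars_sold))
s = math.sqrt(sse / (n - 2))   # standard error of the estimate

y_p = b0 + b1 * x_p
leverage = 1 / n + (x_p - x_bar) ** 2 / sxx
margin_mean = t_crit * s * math.sqrt(leverage)       # for E(y_p)
margin_pred = t_crit * s * math.sqrt(1 + leverage)   # for an individual y_p
print(y_p, round(margin_mean, 2), round(margin_pred, 2))
```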
249

Slide
Residual Analysis
Residual for Observation i
$y_i - \hat{y}_i$
Standardized Residual for Observation i
$$\frac{y_i - \hat{y}_i}{s_{y_i - \hat{y}_i}}$$
where
$$s_{y_i - \hat{y}_i} = s\sqrt{1 - h_i}$$
250

Slide
Residual Analysis
Detecting Outliers
An outlier is an observation that is unusual in comparison with the other data.
Minitab classifies an observation as an outlier if its standardized residual value is < -2 or > +2.
This standardized residual rule sometimes fails to identify an unusually large observation as being an outlier.
This rule's shortcoming can be circumvented by using studentized deleted residuals.
The |ith studentized deleted residual| will be larger than the |ith standardized residual|.
251

Slide
The End of Chapter 8
252

Slide
Chapter 9a
Multiple Regression & Issues in Regression Analysis (p. 325 of the text)
The Multiple Linear Regression Model
The Least Squares Method
The Multiple Coefficient of Determination
Model Assumptions
Testing for Significance
Using the Estimated Regression Equation for Estimation and Prediction
Qualitative Independent Variables
Residual Analysis

253

Slide
The Multiple Regression Model
The Multiple Regression Model
$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \ldots + \beta_p x_p + \varepsilon$
The Multiple Regression Equation
$E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \ldots + \beta_p x_p$
The Estimated Multiple Regression Equation
$\hat{y} = b_0 + b_1 x_1 + b_2 x_2 + \ldots + b_p x_p$
254

Slide
The Least Squares Method
Least Squares Criterion
$$\min \sum (y_i - \hat{y}_i)^2$$
Computation of Coefficients Values
The formulas for the regression coefficients $b_0, b_1, b_2, \ldots, b_p$ involve the use of matrix algebra. We will rely on computer software packages to perform the calculations.
A Note on Interpretation of Coefficients
$b_i$ represents an estimate of the change in y corresponding to a one-unit change in $x_i$ when all other independent variables are held constant.
255

Slide
The Multiple
Coefficient of Determination
Relationship Among SST, SSR, SSE
SST = SSR + SSE
$$\sum (y_i - \bar{y})^2 = \sum (\hat{y}_i - \bar{y})^2 + \sum (y_i - \hat{y}_i)^2$$
Multiple Coefficient of Determination
$R^2 = SSR/SST$
Adjusted Multiple Coefficient of Determination
$$R_a^2 = 1 - (1 - R^2)\frac{n - 1}{n - p - 1}$$
256

Slide
Model Assumptions
Assumptions About the Error Term $\varepsilon$
The error $\varepsilon$ is a random variable with mean of zero.
The variance of $\varepsilon$, denoted by $\sigma^2$, is the same for all values of the independent variables.
The values of $\varepsilon$ are independent.
The error $\varepsilon$ is a normally distributed random variable reflecting the deviation between the y value and the expected value of y given by
$\beta_0 + \beta_1 x_1 + \beta_2 x_2 + \ldots + \beta_p x_p$
257

Slide
Testing for Significance: F Test
Hypotheses
$H_0: \beta_1 = \beta_2 = \ldots = \beta_p = 0$
$H_a$: One or more of the parameters is not equal to zero.
Test Statistic
F = MSR/MSE
Rejection Rule
Reject $H_0$ if $F > F_{\alpha}$
where $F_{\alpha}$ is based on an F distribution with p d.f. in the numerator and n - p - 1 d.f. in the denominator.
258

Slide
Testing for Significance: t Test
Hypotheses
$H_0: \beta_i = 0$
$H_a: \beta_i \ne 0$
Test Statistic
$$t = \frac{b_i}{s_{b_i}}$$
Rejection Rule
Reject $H_0$ if $t < -t_{\alpha/2}$ or $t > t_{\alpha/2}$
where $t_{\alpha/2}$ is based on a t distribution with n - p - 1 degrees of freedom.
259

Slide
Testing for Significance: Multicollinearity
The term multicollinearity refers to the correlation
among the independent variables.
When the independent variables are highly
correlated (say, |r | > .7), it is not possible to
determine the separate effect of any particular
independent variable on the dependent variable.
If the estimated regression equation is to be used
only for predictive purposes, multicollinearity is
usually not a serious problem.
Every attempt should be made to avoid including
independent variables that are highly correlated.
260

Slide
Using the Estimated Regression Equation
for Estimation and Prediction
The procedures for estimating the mean value of y and predicting an individual value of y in multiple regression are similar to those in simple regression.
We substitute the given values of $x_1, x_2, \ldots, x_p$ into the estimated regression equation and use the corresponding value of $\hat{y}$ as the point estimate.
The formulas required to develop interval estimates for the mean value of y and for an individual value of y are beyond the scope of the text.
Software packages for multiple regression will often provide these interval estimates.
261

Slide
Example: Programmer Salary Survey
A software firm collected data for a sample of 20 computer programmers. A suggestion was made that regression analysis could be used to determine if salary was related to the years of experience and the score on the firm's programmer aptitude test.
The years of experience, score on the aptitude test, and corresponding annual salary ($1000s) for a sample of 20 programmers are shown on the next slide.

262

Slide
Example: Programmer Salary Survey
Exper. Score Salary Exper. Score Salary
4 78 24 9 88 38
7 100 43 2 73 26.6
1 86 23.7 10 75 36.2
5 82 34.3 5 81 31.6
8 86 35.8 6 74 29
10 84 38 8 87 34
0 75 22.2 4 79 30.1
1 80 23.1 6 94 33.9
6 83 30 3 70 28.2
6 91 33 3 89 30
263

Slide
Example: Programmer Salary Survey
Multiple Regression Model
Suppose we believe that salary (y) is related to the years of experience ($x_1$) and the score on the programmer aptitude test ($x_2$) by the following regression model:
$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \varepsilon$
where
y = annual salary ($000)
$x_1$ = years of experience
$x_2$ = score on programmer aptitude test

264

Slide
Example: Programmer Salary Survey
Multiple Regression Equation
Using the assumption $E(\varepsilon) = 0$, we obtain
$E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2$
Estimated Regression Equation
$b_0, b_1, b_2$ are the least squares estimates of $\beta_0, \beta_1, \beta_2$
Thus
$\hat{y} = b_0 + b_1 x_1 + b_2 x_2$
265

Slide
Example: Programmer Salary Survey
Solving for the Estimates of $\beta_0, \beta_1, \beta_2$
[Diagram: the input data (columns $x_1$, $x_2$, y; rows 4 78 24, 7 100 43, ..., 3 89 30) are passed to the SPSS computer package for solving multiple regression problems, which produces the least squares output ($b_0$, $b_1$, $b_2$, $R^2$, etc.).]
266

Slide
Example: Programmer Salary Survey
Minitab Computer Output
The regression is
SALARY = 3.17 + 1.40 EXPER + 0.251 SCORE

Predictor Coef Stdev t-ratio p
Constant 3.174 6.156 .52 .613
EXPER 1.4039 .1986 7.07 .000
SCORE .25089 .07735 3.24 .005

s = 2.419 R-sq = 83.4% R-sq(adj) = 81.5%
267

Slide
Example: Programmer Salary Survey
Minitab Computer Output (continued)
Analysis of Variance

SOURCE DF SS MS F P
Regression 2 500.33 250.16 42.76 0.000
Error 17 99.46 5.85
Total 19 599.79
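The summary figures in the Minitab output can be cross-checked from the analysis of variance table alone. The sketch below only re-derives F, R-sq and R-sq(adj) from the sums of squares and degrees of freedom shown in the table; the variable names are illustrative.

```python
# Values copied from the analysis of variance table
ssr, df_regression = 500.33, 2
sse, df_error = 99.46, 17
sst, df_total = 599.79, 19

msr = ssr / df_regression
mse = sse / df_error
f_stat = msr / mse

r_sq = ssr / sst
# Adjusted R-sq compares mean squares rather than sums of squares
r_sq_adj = 1 - (sse / df_error) / (sst / df_total)
print(round(f_stat, 2), round(100 * r_sq, 1), round(100 * r_sq_adj, 1))
```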
268

Slide
Example: Programmer Salary Survey
F Test
Hypotheses
$H_0: \beta_1 = \beta_2 = 0$
$H_a$: One or both of the parameters is not equal to zero.
Rejection Rule
For $\alpha = .05$ and d.f. = 2, 17: $F_{.05} = 3.59$
Reject $H_0$ if F > 3.59.
Test Statistic
F = MSR/MSE = 250.16/5.85 = 42.76
Conclusion
We can reject $H_0$.
269

Slide
Example: Programmer Salary Survey
t Test for Significance of Individual Parameters
Hypotheses
$H_0: \beta_i = 0$
$H_a: \beta_i \ne 0$
Rejection Rule
For $\alpha = .05$ and d.f. = 17, $t_{.025} = 2.11$
Reject $H_0$ if t < -2.11 or t > 2.11
Test Statistics
$b_1/s_{b_1} = 1.4039/.1986 = 7.07$
$b_2/s_{b_2} = .25089/.07735 = 3.24$
Conclusions
Reject $H_0: \beta_1 = 0$. Reject $H_0: \beta_2 = 0$.
270

Slide
Qualitative Independent Variables
In many situations we must work with qualitative independent variables such as gender (male, female), method of payment (cash, check, credit card), etc.
For example, $x_2$ might represent gender where $x_2 = 0$ indicates male and $x_2 = 1$ indicates female.
In this case, $x_2$ is called a dummy or indicator variable.
If a qualitative variable has k levels, k - 1 dummy variables are required, with each dummy variable being coded as 0 or 1.
For example, a variable with levels A, B, and C would be represented by $x_1$ and $x_2$ values of (0, 0), (1, 0), and (0, 1), respectively.
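The k - 1 coding rule can be illustrated with a small helper. The function below is a hypothetical sketch, not part of any statistics package: it treats the first listed level as the baseline (all zeros) and creates one 0/1 variable for each remaining level.

```python
def dummy_code(values, levels):
    """Code each value with k - 1 dummy variables; the first level is the baseline."""
    _baseline, *others = levels
    return [tuple(1 if v == level else 0 for level in others) for v in values]

coded = dummy_code(["A", "B", "C"], levels=["A", "B", "C"])
print(coded)  # [(0, 0), (1, 0), (0, 1)]
```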
271

Slide
Example: Programmer Salary Survey (B)
As an extension of the problem involving the
computer programmer salary survey, suppose that
management also believes that the annual salary is
related to whether or not the individual has a graduate
degree in computer science or information systems.
The years of experience, the score on the programmer
aptitude test, whether or not the individual has a
relevant graduate degree, and the annual salary ($000)
for each of the sampled 20 programmers are shown on
the next slide.
272

Slide
Example: Programmer Salary Survey (B)
Exp. Score Degr. Salary Exp. Score Degr. Salary
4 78 No 24 9 88 Yes 38
7 100 Yes 43 2 73 No 26.6
1 86 No 23.7 10 75 Yes 36.2
5 82 Yes 34.3 5 81 No 31.6
8 86 Yes 35.8 6 74 No 29
10 84 Yes 38 8 87 Yes 34
0 75 No 22.2 4 79 No 30.1
1 80 No 23.1 6 94 Yes 33.9
6 83 No 30 3 70 No 28.2
6 91 Yes 33 3 89 No 30
273

Slide
Example: Programmer Salary Survey (B)
Multiple Regression Equation
$E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3$
Estimated Regression Equation
$\hat{y} = b_0 + b_1 x_1 + b_2 x_2 + b_3 x_3$
where
y = annual salary ($000)
$x_1$ = years of experience
$x_2$ = score on programmer aptitude test
$x_3$ = 0 if the individual does not have a graduate degree
      1 if the individual does have a graduate degree
Note: $x_3$ is referred to as a dummy variable.
274

Slide
Example: Programmer Salary Survey (B)
Minitab Computer Output
The regression is
SALARY = 7.95 + 1.15 EXP + 0.197 SCORE + 2.28 DEG

Predictor Coef Stdev t-ratio p
Constant 7.945 7.381 1.08 .298
EXP 1.1476 .2976 3.86 .001
SCORE .19694 .0899 2.19 .044
DEG 2.280 1.987 1.15 .268

s = 2.396 R-sq = 84.7% R-sq(adj) = 81.8%
275

Slide
Example: Programmer Salary Survey (B)
SPSS or Minitab Computer Output (continued)
Analysis of Variance

SOURCE DF SS MS F P
Regression 3 507.90 169.30 29.48 0.000
Error 16 91.89 5.74
Total 19 599.79
276

Slide
Residual Analysis
Residual for Observation i
$y_i - \hat{y}_i$
Standardized Residual for Observation i
$$\frac{y_i - \hat{y}_i}{s_{y_i - \hat{y}_i}}$$
where
$$s_{y_i - \hat{y}_i} = s\sqrt{1 - h_i}$$
The standardized residual for observation i in multiple regression analysis is too complex to be done by hand. However, this is part of the output of most statistical software packages.
277

Slide
Residual Analysis
Detecting Outliers
An outlier is an observation that is unusual in comparison with the other data.
Minitab classifies an observation as an outlier if its standardized residual value is < -2 or > +2.
This standardized residual rule sometimes fails to identify an unusually large observation as being an outlier.
This rule's shortcoming can be circumvented by using studentized deleted residuals.
The |ith studentized deleted residual| will be larger than the |ith standardized residual|.
278

Slide

Chapter 9b
Regression Analysis: Model Building
The General Linear Model
Determining When to Add or Delete Variables
First Steps in the Analysis of a Larger Problem
Variable-Selection Procedures
Residual Analysis
Multiple Regression Approach to Analysis of Variance and Experimental Design

279

Slide
The General Linear Model
Models in which the parameters ($\beta_0, \beta_1, \ldots, \beta_p$) all have exponents of one are called linear models.

First-Order Model with One Predictor Variable
$y = \beta_0 + \beta_1 x_1 + \varepsilon$
Second-Order Model with One Predictor Variable
$y = \beta_0 + \beta_1 x_1 + \beta_2 x_1^2 + \varepsilon$
Second-Order Model with Two Predictor Variables with Interaction
$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1^2 + \beta_4 x_2^2 + \beta_5 x_1 x_2 + \varepsilon$
280

Slide
The General Linear Model
Often the problem of nonconstant variance can be
corrected by transforming the dependent variable to a
different scale.

Logarithmic Transformations
Most statistical packages provide the ability to apply
logarithmic transformations using either the base-10
(common log) or the base e = 2.71828... (natural log).
Reciprocal Transformation
Use 1/y as the dependent variable instead of y.

281

Slide
The General Linear Model
Models in which the parameters ($\beta_0, \beta_1, \ldots, \beta_p$) have exponents other than one are called nonlinear models.
In some cases we can perform a transformation of variables that will enable us to use regression analysis with the general linear model.
Exponential Model
The exponential model involves the regression equation:
$E(y) = \beta_0 \beta_1^x$
We can transform this nonlinear model to a linear model by taking the logarithm of both sides:
$\log E(y) = \log \beta_0 + x \log \beta_1$
282

Slide
Determining When to Add or Delete Variables
F test to test whether the addition of $x_2$ to a model involving $x_1$ (or the deletion of $x_2$ from a model involving $x_1$ and $x_2$) is statistically significant
$$F = \frac{\left(SSE(x_1) - SSE(x_1, x_2)\right)/1}{SSE(x_1, x_2)/(n - p - 1)}$$
In general,
$$F = \frac{\left(SSE(\text{reduced}) - SSE(\text{full})\right)/\text{number of extra terms}}{MSE(\text{full})}$$
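As an illustration, the general formula can be applied to the two programmer salary regressions shown earlier: the reduced model (experience and score, SSE = 99.46) and the full model that adds the graduate-degree dummy (SSE = 91.89 with 16 error d.f.). The function name is hypothetical, and the critical value would still have to come from an F table with 1 and 16 d.f.

```python
def partial_f(sse_reduced, sse_full, extra_terms, df_error_full):
    """F-statistic for testing whether the extra terms in the full model are needed."""
    mse_full = sse_full / df_error_full
    return ((sse_reduced - sse_full) / extra_terms) / mse_full

# SSE values taken from the two analysis of variance tables in the slides
f_add_degree = partial_f(sse_reduced=99.46, sse_full=91.89,
                         extra_terms=1, df_error_full=16)
print(round(f_add_degree, 2))
```

With a single extra term, this F value is approximately the square of the t-ratio reported for that variable (1.15 for DEG in the Minitab output).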
283

Slide
Variable-Selection Procedures
Stepwise Regression
At each iteration, the first consideration is to see
whether the least significant variable currently in
the model can be removed because its F value,
FMIN, is less than the user-specified or default F
value, FREMOVE.
If no variable can be removed, the procedure
checks to see whether the most significant variable
not in the model can be added because its F value,
FMAX, is greater than the user-specified or default
F value, FENTER.
If no variable can be removed and no variable can
be added, the procedure stops.


284

Slide
Forward Selection
This procedure is similar to stepwise-regression,
but does not permit a variable to be deleted.
This forward-selection procedure starts with no
independent variables.
It adds variables one at a time as long as a
significant reduction in the error sum of squares
(SSE) can be achieved.
Variable-Selection Procedures
285

Slide
Backward Elimination
This procedure begins with a model that includes
all the independent variables the modeler wants
considered.
It then attempts to delete one variable at a time by
determining whether the least significant variable
currently in the model can be removed because its
F value, FMIN, is less than the user-specified or
default F value, FREMOVE.
Once a variable has been removed from the model
it cannot reenter at a subsequent step.
Variable-Selection Procedures
286

Slide
Variable-Selection Procedures
Best-Subsets Regression
The three preceding procedures are one-variable-at-a-time methods offering no guarantee that the best model for a given number of variables will be found.
Some software packages include best-subsets regression that enables the user to find, given a specified number of independent variables, the best regression model.
Minitab output identifies the two best one-variable estimated regression equations, the two best two-variable equations, and so on.
287

Slide
Residual Analysis: Autocorrelation
Durbin-Watson Test for Autocorrelation
Statistic
$$d = \frac{\sum_{t=2}^{n} (e_t - e_{t-1})^2}{\sum_{t=1}^{n} e_t^2}$$
The statistic ranges in value from zero to four.
If successive values of the residuals are close together (positive autocorrelation), the statistic will be small.
If successive values are far apart (negative autocorrelation), the statistic will be large.
A value of two indicates no autocorrelation.
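The behaviour of the statistic at its two extremes can be seen with made-up residuals. The sketch below is illustrative only; the residual series are hypothetical and the function name is not taken from any package.

```python
def durbin_watson(residuals):
    """Durbin-Watson d: squared successive differences over squared residuals."""
    numerator = sum((residuals[t] - residuals[t - 1]) ** 2
                    for t in range(1, len(residuals)))
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator

# Hypothetical residuals: alternating signs (negative autocorrelation)
d_alternating = durbin_watson([1, -1, 1, -1])
# Hypothetical residuals: long runs of the same sign (positive autocorrelation)
d_persistent = durbin_watson([1, 1, 1, -1, -1, -1])
print(d_alternating, round(d_persistent, 3))
```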
288

Slide
Example: PGA Tour Data
The Professional Golfers Association keeps a variety
of statistics regarding performance measures. Data
include the average driving distance, percentage of
drives that land in the fairway, percentage of greens hit
in regulation, average number of putts, percentage of
sand saves, and average score.
The variable names and definitions are shown on the
next slide.
289

Slide
Example: PGA Tour Data
Variable Names and Definitions
DRIVE: average length of a drive in yards
FAIR: percentage of drives that land in the fairway
GREEN: percentage of greens hit in regulation (a par-3 green is hit in regulation if the player's first shot lands on the green)
PUTT: average number of putts for greens that have been hit in regulation
SAND: percentage of sand saves (landing in a sand trap and still scoring par or better)
SCORE: average score for an 18-hole round
290

Slide
Sample Data
DRIVE FAIR GREEN PUTT SAND SCORE
277.6 .681 .667 1.768 .550 69.10
259.6 .691 .665 1.810 .536 71.09
269.1 .657 .649 1.747 .472 70.12
267.0 .689 .673 1.763 .672 69.88
267.3 .581 .637 1.781 .521 70.71
255.6 .778 .674 1.791 .455 69.76
272.9 .615 .667 1.780 .476 70.19
265.4 .718 .699 1.790 .551 69.73
Example: PGA Tour Data
291

Slide
Sample Data (continued)
DRIVE FAIR GREEN PUTT SAND SCORE
272.6 .660 .672 1.803 .431 69.97
263.9 .668 .669 1.774 .493 70.33
267.0 .686 .687 1.809 .492 70.32
266.0 .681 .670 1.765 .599 70.09
258.1 .695 .641 1.784 .500 70.46
255.6 .792 .672 1.752 .603 69.49
261.3 .740 .702 1.813 .529 69.88
262.2 .721 .662 1.754 .576 70.27
Example: PGA Tour Data
292

Slide
Sample Data (continued)
DRIVE FAIR GREEN PUTT SAND SCORE
260.5 .703 .623 1.782 .567 70.72
271.3 .671 .666 1.783 .492 70.30
263.3 .714 .687 1.796 .468 69.91
276.6 .634 .643 1.776 .541 70.69
252.1 .726 .639 1.788 .493 70.59
263.0 .687 .675 1.786 .486 70.20
263.0 .639 .647 1.760 .374 70.81
253.5 .732 .693 1.797 .518 70.26
266.2 .681 .657 1.812 .472 70.96
Example: PGA Tour Data
293

Slide
Sample Correlation Coefficients

SCORE DRIVE FAIR GREEN PUTT
DRIVE -.154
FAIR -.427 -.679
GREEN -.556 -.045 .421
PUTT .258 -.139 .101 .354
SAND -.278 -.024 .265 .083 -.296
Example: PGA Tour Data
294

Slide
Best Subsets Regression of SCORE
Vars R-sq R-sq(a) C-p s D F G P S
1 30.9 27.9 26.9 .39685 X
1 18.2 14.6 35.7 .43183 X
2 54.7 50.5 12.4 .32872 X X
2 54.6 50.5 12.5 .32891 X X
3 60.7 55.1 10.2 .31318 X X X
3 59.1 53.3 11.4 .31957 X X X
4 72.2 66.8 4.2 .26913 X X X X
4 60.9 53.1 12.1 .32011 X X X X
5 72.6 65.4 6.0 .27499 X X X X X
Example: PGA Tour Data
295

Slide
Minitab Output
The regression equation
SCORE = 74.7 - .0398(DRIVE) - 6.69(FAIR)
- 10.3(GREEN) + 9.86(PUTT)
Predictor Coef Stdev t-ratio p
Constant 74.678 6.952 10.74 .000
DRIVES -.0398 .01235 -3.22 .004
FAIR -6.686 1.939 -3.45 .003
GREEN -10.342 3.561 -2.90 .009
PUTT 9.858 3.180 3.10 .006
s = .2691 R-sq = 72.4% R-sq(adj) = 66.8%
Example: PGA Tour Data
296

Slide
Minitab Output (cont.)
Analysis of Variance

SOURCE DF SS MS F P
Regression 4 3.79469 .94867 13.10 .000
Error 20 1.44865 .07243
Total 24 5.24334
Example: PGA Tour Data
Chapter 10:

Time Series Analysis
Introduction
A time series is a set of observations on a variable's outcomes in different time periods.
We often use time series data to make decisions:
to explain the past and
to predict the future of a time series
This topic addresses issues including:
trend models
autoregressive models
forecast and forecast performance
Hypothesis Testing
7.299
Problems with Time-Series Data
We would like to use linear regression on the
time series, but quite often the assumptions of
the model are violated for time-series data.
The residual errors are correlated instead of being
uncorrelated.
The mean and/or variance of the time series changes
over time.
There may be seasonal factors.
So, we must be very careful of these problems
when we use time series data.
Hypothesis Testing
7.300
Linear Trend Models
In a linear trend, the dependent variable changes at a constant rate with time:
$y_t = b_0 + b_1 t + \varepsilon_t, \quad t = 1, 2, \ldots, T$
In this model, we use time t to explain the variation of the dependent variable, y.
Hypothesis Testing
7.301
Log-Linear Trend Models
When there is exponential growth, a log-linear trend model tends to fit the data better:
$\ln y_t = b_0 + b_1 t + \varepsilon_t, \quad t = 1, 2, \ldots, T$
This is because exponential growth implies, on average, that
$y_t = e^{b_0 + b_1 t}$
Taking the natural log of both sides yields
$\ln y_t = b_0 + b_1 t, \quad t = 1, 2, \ldots, T$
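To see why the log-linear form suits exponential growth, the sketch below fits both trend models to a made-up series that grows by exactly 5% per period. The data, the starting level of 100 and the helper function are all hypothetical; with real sales data the fit would not be exact.

```python
import math

def ols(x, y):
    """Simple least squares fit: returns (b0, b1, r_squared)."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((a - x_bar) ** 2 for a in x)
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    sst = sum((b - y_bar) ** 2 for b in y)
    sse = sum((b - (b0 + b1 * a)) ** 2 for a, b in zip(x, y))
    return b0, b1, 1 - sse / sst

periods = list(range(1, 21))
sales = [100 * 1.05 ** t for t in periods]   # hypothetical 5% growth per period

b0_lin, b1_lin, r2_lin = ols(periods, sales)
b0_log, b1_log, r2_log = ols(periods, [math.log(s) for s in sales])
print(round(r2_lin, 4), round(r2_log, 4), round(b1_log, 4))
```

The slope of the log-linear fit recovers the continuously compounded growth rate, ln(1.05), while the linear trend leaves a systematic curve in its residuals.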
Hypothesis Testing
7.302
Trend Models and Testing for Correlated
Errors
The linear trend model and the log-linear
trend model are single-variable regression
models.
Independent variable is time t.
To be correctly specified, the regression-
model assumptions must be satisfied.
Durbin-Watson test should be used to test
for serial correlation.
Hypothesis Testing
7.303
Example of Trend Models
The first 20 cells of Column D in sheet Data of book FIN5SBF-Sheet10 give the actual quarterly sales of a company over 5 years, and Column G presents the natural logarithm of the sales.
Sheets Linear and Log-Linear show the results of the linear and log-linear trend models. The log-linear model seems better than the linear model as judged by $R^2$.
The DW-statistic of both models is greater than the
upper critical value. Should we reject the null
hypothesis of no serial correlation?
How to explain the regression coefficients? Are they
significantly different from zero?
Rows 23-26 give the predictions of sales from the two
models.
Hypothesis Testing
7.304
Autoregressive (AR)
Time Series Models
An autoregressive model (AR), a time series
regressed on its own past values, effectively
represents the relationship between current-
values and previous-period values.
It must assume the time series we are modeling is covariance stationary, which means the mean and variance do not change over time and the covariance depends on the lag only, i.e.,
$\mathrm{Cov}(y_t, y_{t-k}) = \gamma(k), \quad |\gamma(k)| < \infty$
Hypothesis Testing
7.305
Autoregressive (AR)
Time Series Models
First-order autoregression AR(1) is given by
$x_t = b_0 + b_1 x_{t-1} + \varepsilon_t$
pth-order autoregression AR(p) is given by
$x_t = b_0 + b_1 x_{t-1} + b_2 x_{t-2} + \cdots + b_p x_{t-p} + \varepsilon_t$
Hypothesis Testing
7.306
Detecting Serially Correlated Errors
Durbin-Watson statistic is invalid for testing
serial correlation when the independent
variables include past values of the dependent
variable
We can determine whether we are using the
correct time-series model by testing whether the
autocorrelations of the error term differ
significantly from 0.
The null hypothesis is that an error
autocorrelation at a specified lag (say k lags)
equals 0.
Detecting Serially Correlated Errors
The construction of the t-test for the null
hypothesis includes the following steps:
1. Estimate the error autocorrelation by:
$\rho_{\varepsilon,k} = \dfrac{\sum_{t=k+1}^{T} \varepsilon_t \varepsilon_{t-k}}{\sum_{t=1}^{T} \varepsilon_t^2}$
where $\varepsilon_t$ is the sample residual (error) and T is
the sample size.
Detecting Serially Correlated Errors
2. The standard error of the residual
correlation is equal to $1/\sqrt{T}$.
3. The t-statistic for the null hypothesis is
$t = \dfrac{\rho_{\varepsilon,k}}{1/\sqrt{T}} = \sqrt{T}\,\dfrac{\sum_{t=k+1}^{T} \varepsilon_t \varepsilon_{t-k}}{\sum_{t=1}^{T} \varepsilon_t^2}$
The critical value of the t-test is 2 in
practice for large samples; i.e., reject
the null hypothesis if $|t| > 2$.
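The three steps above can be sketched as follows. This is an illustrative implementation under the slide's formulas (standard error 1/sqrt(T) and the rule-of-thumb critical value of 2); the function name and the `max_lag` parameter are assumptions introduced for the example.

```python
import math


def residual_autocorr_tstats(residuals, max_lag=4):
    """Return (lag, autocorrelation, t-statistic) for lags 1..max_lag."""
    T = len(residuals)
    denom = sum(e * e for e in residuals)  # sum of squared residuals
    results = []
    for k in range(1, max_lag + 1):
        # Step 1: error autocorrelation at lag k.
        num = sum(residuals[t] * residuals[t - k] for t in range(k, T))
        rho = num / denom
        # Steps 2 and 3: standard error is 1/sqrt(T), so t = rho * sqrt(T).
        t_stat = rho * math.sqrt(T)
        results.append((k, rho, t_stat))
    return results


# Made-up residuals, for illustration only.
for lag, rho, t_stat in residual_autocorr_tstats([0.3, -0.1, 0.2, -0.4, 0.1, 0.0, -0.2, 0.1]):
    verdict = "reject" if abs(t_stat) > 2 else "do not reject"
    print(lag, round(rho, 3), round(t_stat, 3), verdict)
```

If none of the t-statistics exceeds 2 in absolute value, the error autocorrelations are not significantly different from 0 and the model can be treated as correctly specified in this respect.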
Autoregressive (AR)
Time Series Model of Rio Tinto
We consider a first-order autoregression model
for the Rio Tinto monthly return:
$x_t = b_0 + b_1 x_{t-1} + \varepsilon_t$
Using the data we had in the previous topic, our
estimation yields $b_0$ = 3.53 and $b_1$ = -0.14
(see sheet AR(1)). But does $b_1$ differ
significantly from 0? What does $R^2$ tell you?
From the t-scores of the first four
autocorrelations, can we conclude that none of
these autocorrelations differs significantly from
0?
Mean Reversion
A time series shows mean reversion if it tends
to fall when its level is above its mean and rise
when its level is below its mean.
If a time series is currently at its mean-reverting
level, then the model predicts that the value of
the time series will be the same in the next
period.
This implies that the mean-reverting level is
$x_t = \dfrac{b_0}{1 - b_1}$
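The step behind this result can be written out as a short derivation. Writing x for the level at which the AR(1) forecast equals the current value (a symbol introduced here for illustration), and plugging in the Rio Tinto estimates quoted earlier ($b_0$ = 3.53, $b_1$ = -0.14) as a worked example:

```latex
\begin{aligned}
x &= b_0 + b_1 x \\
x\,(1 - b_1) &= b_0 \\
x &= \frac{b_0}{1 - b_1}, \qquad b_1 \neq 1 \\[4pt]
\text{Rio Tinto estimates: } x &= \frac{3.53}{1 - (-0.14)} = \frac{3.53}{1.14} \approx 3.10
\end{aligned}
```

The condition $b_1 \neq 1$ matters: when $b_1 = 1$ the denominator is zero and the mean-reverting level is undefined.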
Multiperiod Forecasts
Sometimes we need forecasts of more than
one period.
The chain rule of forecasting is a process in
which the next period's value, predicted by the
forecasting equation, is substituted into the
equation to give a predicted value two periods
ahead (not limited to AR models).
If it were Jan 2008, what would be the forecasted return
of Rio in Feb and March by the AR(1) model?
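The chain rule can be sketched in a few lines. The coefficients below are the AR(1) estimates quoted in the slides (3.53 and -0.14); the January return of 5.0 is a hypothetical placeholder, since the actual observation lives in the course workbook, and the function name is an assumption for the example.

```python
def ar1_chain_forecast(b0, b1, x_last, steps):
    """Forecast `steps` periods ahead by feeding each forecast back into the AR(1) equation."""
    forecasts = []
    x = x_last
    for _ in range(steps):
        x = b0 + b1 * x   # the one-step forecast becomes the next input
        forecasts.append(x)
    return forecasts


# Hypothetical January return of 5.0 (not the actual Rio Tinto figure).
feb, mar = ar1_chain_forecast(3.53, -0.14, 5.0, 2)
print(round(feb, 4), round(mar, 4))
```

The same loop works for any horizon; with |b_1| < 1 the forecasts move towards the mean-reverting level as the horizon grows.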
Computing Forecast Model Performance
One way to compare the forecast
performance of two models is to
compare the variance of the forecast
errors that the two models make.
The model with the smaller forecast error
variance will be the more accurate model,
and
It will also have the smaller standard error of
the time-series regression.
Forecast Model Performance: In-Sample
Forecast
In-sample forecast errors are the
residuals from a fitted time-series model.
The standard error of a regression model,
reported by software packages, can be
used as the criterion to measure forecast
performance based on in-sample forecast
accuracy.
A model with smaller standard error is
better.
Forecast Model Performance: Out-of-Sample
Forecast
An out-of-sample forecast error is still the
difference between the actual
observation and the forecasted value, but
the error is from a forecast period that
does not include the estimation period.
Out-of-sample performance is critical for
evaluating a forecasting model's real-world
contribution.

Computing Forecast Model Performance
Typically, we compare the out-of-sample
forecasting performance of forecasting
models by comparing their root mean squared
error (RMSE):
the square root of the average squared error.
The model with the smallest RMSE is judged most
accurate.
Sheet Rio demonstrates the calculation of
in-sample and out-of-sample errors and RMSE.
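A minimal sketch of the RMSE calculation described above; the function name and the sample numbers are illustrative assumptions and do not come from sheet Rio.

```python
import math


def rmse(actual, forecast):
    """Root mean squared error: square root of the average squared forecast error."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("actual and forecast must be non-empty and equally long")
    squared_errors = [(a - f) ** 2 for a, f in zip(actual, forecast)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Made-up out-of-sample data comparing two hypothetical models.
actual = [2.1, -0.5, 3.0, 1.2]
model_a = [1.8, 0.1, 2.5, 1.0]
model_b = [3.0, 1.5, 1.0, 2.5]
print(rmse(actual, model_a), rmse(actual, model_b))
```

The model with the smaller value would be judged the more accurate of the two on this hold-out period.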
Instability of Regression Coefficients
The estimates of the regression coefficients of the
time-series model can change substantially
across the different sample periods used for estimating
the model:
because of using relatively shorter or longer
sample periods
because of the choice of model (e.g., AR(1) vs
AR(2))
We can get some guidance, if we remember
that our models are valid only for covariance-
stationary time series.
Random Walks
A random walk is a time series in which
the value of the series in one period is
the value of the series in the previous
period plus an unpredictable random
error.
$x_t = x_{t-1} + \varepsilon_t, \quad E(\varepsilon_t) = 0, \quad E(\varepsilon_t^2) = \sigma^2, \quad E(\varepsilon_t \varepsilon_s) = 0 \text{ for } t \neq s$
It is a special case of AR(1): $b_1 = 1$ and $b_0 = 0$.
If $b_0 \neq 0$, it is called a random walk
with drift.
Random Walks
We cannot use standard regression
analysis on a time series that is a random
walk:
It has an undefined mean-reverting level.
The variance of $x_t$ grows without an upper
bound as t grows large.
However, we can convert the random
walk time series by differencing the data.
Differencing
Create a new time series, $y_t$, so that
$y_t = x_t - x_{t-1} = \varepsilon_t, \quad E(\varepsilon_t) = 0, \quad E(\varepsilon_t^2) = \sigma^2, \quad E(\varepsilon_t \varepsilon_s) = 0 \text{ for } t \neq s$
The first-differenced variable is covariance
stationary.
An AR(1) estimate of the first difference can be
produced.
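To illustrate why differencing helps, the sketch below simulates a random walk and then takes first differences, which recover the underlying random errors. The function names, the seed and the normal distribution for the shocks are assumptions made for the example only.

```python
import random


def simulate_random_walk(n, sigma=1.0, drift=0.0, x0=0.0, seed=0):
    """Simulate x[t] = drift + x[t-1] + e[t] with independent normal shocks."""
    rng = random.Random(seed)
    shocks = [rng.gauss(0.0, sigma) for _ in range(n)]
    x = [x0]
    for e in shocks:
        x.append(drift + x[-1] + e)
    return x, shocks


def first_difference(x):
    """Return y[t] = x[t] - x[t-1]."""
    return [x[t] - x[t - 1] for t in range(1, len(x))]


walk, shocks = simulate_random_walk(200)
diffs = first_difference(walk)
# With zero drift, each difference is just the shock of that period.
print(max(abs(d - e) for d, e in zip(diffs, shocks)))
```

With a nonzero `drift`, the differenced series equals the drift plus the shock, which is still covariance stationary.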
Unit Root Test of Nonstationarity
The key problem of a random walk is the unit
root: $b_1 = 1$.
Dickey and Fuller (1979) developed a
regression-based unit root test.
Limited by the scope of this topic, we do
not discuss it in detail.
Thank You!
Topic 11:

Portfolio Theory
Associate Professor Ishaq Bhatti
La Trobe Business School

E-Mail: i.bhatti@latrobe.edu.au

Slides have been drafted by the La Trobe University, School of
Business based on DeFusco et al (2007)
Statistics for Business and Finance
Chapter 11
Portfolio Theory
Introduction
Issues regarding a portfolio:
Important characteristics
How to model the risk
Optimal asset allocation
Choice of dataset to predict returns
Risk factors to consider

Portfolio Theory
How to use diversification to reduce
risk
Investors' risk appetite and portfolio
choice
The value of any investment opportunity can
be measured in terms of the mean return
and the variance of the return:
mean-variance analysis
Markowitz Mean-Variance Analysis Assumptions
All investors are risk averse
Expected returns for all assets are known
Variance and covariances of all assets are known
Investors only need to know the average return,
variance and covariances of assets to choose their
optimal portfolio
No transaction costs and taxes


Mean-Variance Frontier
An efficient portfolio is the one that provides the
highest return for a given level of risk measured by
variance or standard deviation
To find the efficient portfolio, first we need to identify
portfolios with minimum variance for each level of
return
These portfolios are called minimum-variance
portfolios
The set of efficient portfolios is a subset of minimum
variance portfolios
Mean-Variance Frontier
The expected return and variance of the portfolio can be
calculated using the following formulas:
$E(R_P) = w_1 E(R_1) + w_2 E(R_2)$
$\sigma_P^2 = w_1^2 \sigma_1^2 + w_2^2 \sigma_2^2 + 2 w_1 w_2 \rho_{1,2} \sigma_1 \sigma_2$
or
$\sigma_P = \sqrt{w_1^2 \sigma_1^2 + w_2^2 \sigma_2^2 + 2 w_1 w_2 \rho_{1,2} \sigma_1 \sigma_2}$
Note: $\mathrm{Cov}_{1,2} = \rho_{1,2} \sigma_1 \sigma_2$
Example 1
                   Asset 1   Asset 2
Expected return    15%       5%
Variance           225       100
Std deviation      15%       10%
Correlation between Asset 1 and Asset 2: 0.5
Find the expected return and variance of the portfolio for
different weights allocated to asset 1 and asset 2.
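Example 1 can be worked through with a short script. This is a sketch that applies the two-asset formulas to the inputs given in the example (returns and standard deviations in percent, correlation 0.5); the function name and the grid of weights are assumptions for illustration.

```python
import math


def two_asset_portfolio(w1, er1, er2, sd1, sd2, rho):
    """Return (expected return, variance, standard deviation) for weight w1 in asset 1."""
    w2 = 1.0 - w1
    exp_ret = w1 * er1 + w2 * er2
    variance = (w1 ** 2 * sd1 ** 2 + w2 ** 2 * sd2 ** 2
                + 2.0 * w1 * w2 * rho * sd1 * sd2)
    return exp_ret, variance, math.sqrt(variance)


# Inputs from Example 1: E(R) of 15% and 5%, std deviations of 15% and 10%, correlation 0.5.
for w1 in (0.0, 0.25, 0.5, 0.75, 1.0):
    er, var, sd = two_asset_portfolio(w1, 15.0, 5.0, 15.0, 10.0, 0.5)
    print(f"w1={w1:.2f}  E(R)={er:.2f}%  variance={var:.2f}  sd={sd:.2f}%")
```

For instance, equal weights give an expected return of 10% and a variance of 118.75, while putting everything into one asset simply reproduces that asset's own figures.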
Mean-Variance Frontier
The effect of correlation on the efficient frontier must not be
ignored.
Excel worksheet Correlation illustrates the effect of
correlation.
The minimum-variance frontier bows out towards the left for lower
correlation values.
Diversification is meaningless when the correlation is 1.
The negatively sloped parts of the frontiers for correlations
0, 0.5 and -1 illustrate the benefit of diversification.
The efficient frontier is the positively sloped part of the
minimum-variance frontier.
Mean-Variance Frontier: Three Assets
The expected return and variance of the portfolio can be
calculated using the following formulas:
$E(R_P) = w_1 E(R_1) + w_2 E(R_2) + w_3 E(R_3)$
$\sigma_P^2 = w_1^2 \sigma_1^2 + w_2^2 \sigma_2^2 + w_3^2 \sigma_3^2 + 2 w_1 w_2 \rho_{1,2} \sigma_1 \sigma_2 + 2 w_1 w_3 \rho_{1,3} \sigma_1 \sigma_3 + 2 w_2 w_3 \rho_{2,3} \sigma_2 \sigma_3$
or
$\sigma_P = \sqrt{w_1^2 \sigma_1^2 + w_2^2 \sigma_2^2 + w_3^2 \sigma_3^2 + 2 w_1 w_2 \rho_{1,2} \sigma_1 \sigma_2 + 2 w_1 w_3 \rho_{1,3} \sigma_1 \sigma_3 + 2 w_2 w_3 \rho_{2,3} \sigma_2 \sigma_3}$
Example 2
                   Asset 1   Asset 2   Asset 3
Expected return    15%       5%        15%
Variance           225       100       225
Std deviation      15%       10%       15%
Correlations:
                   Asset 1   Asset 2   Asset 3
Asset 1                      0.5       0.8
Asset 2                                0.5
Asset 3
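The three-asset formulas generalise to any number of assets as a double sum over weights, correlations and standard deviations. The sketch below applies that form to the Example 2 inputs, reading the correlation table as 0.5 between assets 1 and 2, 0.8 between assets 1 and 3, and 0.5 between assets 2 and 3. The function name and the equal-weight portfolio are assumptions chosen for illustration; finding the efficient weights would still need an optimizer.

```python
def portfolio_stats(weights, exp_returns, std_devs, corr):
    """Return (expected return, variance) of a portfolio of n assets."""
    n = len(weights)
    exp_ret = sum(w * r for w, r in zip(weights, exp_returns))
    # Double sum: diagonal terms are the variances, off-diagonal terms the covariances.
    variance = sum(weights[i] * weights[j] * corr[i][j] * std_devs[i] * std_devs[j]
                   for i in range(n) for j in range(n))
    return exp_ret, variance


exp_returns = [15.0, 5.0, 15.0]   # percent
std_devs = [15.0, 10.0, 15.0]     # percent
corr = [[1.0, 0.5, 0.8],
        [0.5, 1.0, 0.5],
        [0.8, 0.5, 1.0]]

equal = [1.0 / 3.0] * 3
er, var = portfolio_stats(equal, exp_returns, std_devs, corr)
print(round(er, 4), round(var, 4))
```

With only two nonzero weights the function reduces to the two-asset formula, which gives a quick consistency check.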
Mean-Variance Frontier: Two Assets vs Three Assets
We need to use optimizers to determine the efficient
weights.
The three-asset frontier dominates the two-asset frontier.
We can usually improve the risk-return trade-off by
expanding the set of assets we can invest in.
The composition of minimum-variance portfolios depends on
the number of assets we can invest in, in addition to the
returns, variances and covariances of the
assets.
The more assets we add to the portfolio, the larger the
effect of the covariance terms and the smaller the effect of the
variance terms: this is diversification.
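The last point can be made precise for an equally weighted portfolio of n assets. The derivation below is a standard result, written here with $\bar{\sigma}^2$ for the average variance and $\overline{\mathrm{Cov}}$ for the average covariance (both symbols are introduced for this illustration).

```latex
\begin{aligned}
\sigma_P^2 &= \sum_{i=1}^{n} w_i^2 \sigma_i^2
            + \sum_{i=1}^{n} \sum_{j \neq i} w_i w_j \,\mathrm{Cov}(R_i, R_j),
            \qquad w_i = \frac{1}{n} \\
           &= \frac{1}{n}\,\bar{\sigma}^2 + \frac{n-1}{n}\,\overline{\mathrm{Cov}}
\end{aligned}
```

As n grows, the first term shrinks towards zero and the portfolio variance approaches the average covariance, so the individual variances matter less and less.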

Final exam Structure & the Sample Exam questions
Semester 2, 2010
Unit Code: FIN5SBF

Reading Time: 15 Minutes

Writing Time: 2:00 (Two hours)

ALLOWABLE MATERIALS AND INSTRUCTIONS


1) Formula sheet (two-sided A4 paper) is allowed

2) This exam contains three parts:
Part A: 20 multiple-choice questions, 20 marks
Part B: 4 short-answer working questions
Part C: Computer question based on Assignments 1 &
2, worth 15 marks

3) Attempt all questions in the space provided at the end of
each question under the heading "Solution". Show all working.
No working, no marks.

Part A: Multiple-choice questions (20 questions)

Please circle the MOST correct answer.

1. Given that event E has a probability of 25%, the probability of
the complement of event E:

a) can have any value between zero and one
b) must be 0.75
c) is 0.251
d) is 0.275
e) is unknown

2. The heights (in inches) of 25 individuals were recorded and
the following statistics were calculated: mean = 70, mode = 73,
median = 74, range = 20, variance = 784. The coefficient of
variation equals:

a) 11.2%
b) 12%
c) 0.4%
d) 40%
e) None of the above

3. A statistics teacher asked EACH student in his class their age.
On the basis of this information, he states that the average age
of ALL the students in the University is 22 years. This is an
example of:

a) A sample
b) Descriptive statistics
c) Statistical inference
d) An experiment
e) None of the above

4. A Time Series forecast is based on:

a) Past values
b) Past variables
c) Past forecast errors
d) All of the above
e) None of the above

5. Which component of a Time Series represents a gradual
shifting of a time series to relatively higher or lower values
over time?


a) Trend Component
b) Cyclical Component
c) Seasonal Component
d) Irregular Component

Part B: Short answer and working questions. (4 Questions)


Lecture notes

Text/Tute changes scenario

FHW
Part C: Computer-based questions.
Q.2,C) Here is an incomplete SPSS output related to your assignment, with sample size
(n) equal to 40.
Looking at the printout, answer the following questions:

a) Fill in the empty cells in the output tables; that is, find the missing values.

b) Interpret the coefficient of determination, and briefly explain the difference
between $R^2$ and the Pearson correlation coefficient r.
c) Does it appear that the normality requirement is satisfied? Why or why not?
d) Test the overall significance of the model. Explain the null and alternative
hypotheses.
e) Carefully explain the meanings of the estimated slope coefficients. Do they have
the signs you would expect? Why or why not?
f) Develop and test appropriate hypotheses concerning the slope coefficients using
t-tests at the 5 percent level.
Model Summary
Model   R       R Square   Adjusted R Square   Std. Error of the Estimate   Durbin-Watson
1       ???     .956       .954                7.042                        1.597

R = ???
$R = \sqrt{R^2} = \sqrt{0.956} = 0.977$

Model Summary (completed)
Model   R       R Square   Adjusted R Square   Std. Error of the Estimate   Durbin-Watson
1       0.977   .956       .954                7.042                        1.597
ANOVA
Model 1      Sum of Squares   df   Mean Square   F       Sig.
Regression   40090.08         ??   ??????        ?????   .000(a)
Residual     1834.894         ??   49.592
Total        41924.97         ??
a Predictors: (Constant), FS, WI (this gives the idea of k: how many predictors are there?)
b Dependent Variable: WEF

df = ???
$df_{\text{total}} = n - 1 = 40 - 1 = 39$
$df_{\text{regression}} = k = 2$ (the number of slopes: FS and WI)
$df_{\text{total}} = df_{\text{regression}} + df_{\text{residual}}$, so $df_{\text{residual}} = 39 - 2 = 37$
Equivalently, df of residual = n - k - 1, or
df = SS Residual / MS Residual = 1834.894 / 49.592 = 37

MSR = ?????
$MSR = \dfrac{SSR}{df_{\text{regression}}} = \dfrac{40090.08}{2} = 20045.04$

F = ???
$F = \dfrac{MSR}{MSE} = \dfrac{20045.04}{49.592} = 404.20$

ANOVA (completed)
Model 1      Sum of Squares   df   Mean Square   F        Sig.
Regression   40090.08         2    20045.04      404.20   .000(a)
Residual     1834.894         37   49.592
Total        41924.97         39
a Predictors: (Constant), FS, WI
b Dependent Variable: WEF
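The fill-in steps for the ANOVA table can be checked with a short script. This is a sketch using the sums of squares, sample size (n = 40) and number of slopes (k = 2) given in the question; the function name and the dictionary keys are assumptions made for the example.

```python
def complete_anova(ssr, sse, n, k):
    """Fill in degrees of freedom, mean squares, F and R-squared from the sums of squares."""
    df_reg = k               # one degree of freedom per slope
    df_res = n - k - 1
    df_tot = n - 1
    msr = ssr / df_reg       # mean square regression
    mse = sse / df_res       # mean square error (residual)
    return {
        "df_reg": df_reg,
        "df_res": df_res,
        "df_tot": df_tot,
        "msr": msr,
        "mse": mse,
        "f": msr / mse,
        "r_squared": ssr / (ssr + sse),
    }


table = complete_anova(ssr=40090.08, sse=1834.894, n=40, k=2)
for name, value in table.items():
    print(name, round(value, 3))
```

The computed mean square error and R-squared agree, after rounding, with the 49.592 and .956 already printed in the SPSS output, which is a useful cross-check on the degrees of freedom.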
Coefficients
             B        Std.Err   t       Sig.   95% C.I. for B
                                               L-Bound   U-Bound
(Constant)   15.557   ?????     4.855   .000   9.065     22.049
WI           .031     .013      ????    .017   .006      .057
FS           20.679   ?????     13.54   .000   17.585    23.773

Std Errors = ???
Since $t = B_i / \text{Std.Error}_i$, we have $\text{Std.Error}_i = B_i / t$:
(Constant): 15.557 / 4.855 = 3.204
FS: 20.679 / 13.543 = 1.527

t = ???
$t = \dfrac{B}{\text{Std.Error}} = \dfrac{.031}{.013} = 2.495$
(2.495 is the value reported in the output; the rounded figures shown give about 2.38, so the reported value is presumably based on unrounded coefficients.)

Coefficients (completed)
             B        Std.Err   t       Sig.   95% C.I. for B
                                               L-Bound   U-Bound
(Constant)   15.557   3.204     4.855   .000   9.065     22.049
WI           .031     .013      2.495   .017   .006      .057
FS           20.679   1.527     13.54   .000   17.585    23.773
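The relationship between B, its standard error and the t-statistic can be sketched in code. The helper names are assumptions for illustration, and the inputs are the rounded figures from the table, so results can differ slightly from SPSS values that use unrounded numbers.

```python
def std_error_from_t(b, t_stat):
    """Back out the standard error from a coefficient and its t-statistic."""
    return b / t_stat


def t_from_std_error(b, std_error):
    """Compute the t-statistic for the null hypothesis that the coefficient is zero."""
    return b / std_error


print(round(std_error_from_t(15.557, 4.855), 3))   # (Constant)
print(round(std_error_from_t(20.679, 13.543), 3))  # FS
# WI: uses the rounded B and Std.Err shown in the table.
print(round(t_from_std_error(0.031, 0.013), 2))
```

Because the WI inputs are heavily rounded, the last line gives roughly 2.38 rather than the 2.495 shown in the output; either value exceeds 2, so the conclusion about significance is the same.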

b) Interpret the coefficient of determination, and briefly explain the
difference between $R^2$ and the Pearson correlation coefficient r.

$R^2$ = SSR/SST:
Shows how well the regression model fits the data, that is, what
percentage of the variation of the dependent variable can be
explained by variation of the independent variables.

Pearson correlation coefficient r:
Is a measure of linear association; it shows the sign and strength
of the relationship.

c) Does it appear that the normality requirement is satisfied? Why or
why not?
There is a linear relationship between the dependent and
independent variables.
There is no linear relationship between any two
independent variables.
The sample size is large enough.
The P-P table suggests that the error terms are homoscedastic:
most of the residuals are within a narrow horizontal bound
around zero, which means the variance of the errors is constant.
The Durbin-Watson value is 1.6; we cannot reject the
hypothesis of positive serial correlation between the errors.
The histogram appears to be normal.

d) Test the overall significance of the model. Explain the null and
alternative hypotheses.
The significance of the model is tested by the F-test.

The F-test tests the null hypothesis that all the slopes are equal to
zero against the alternative that at least one slope is
not equal to zero.

In this example, with more than 99.9% confidence (Sig. =
0.000), we can reject the null hypothesis.

e) Carefully explain the meanings of the estimated slope coefficients. Do
they have the signs you would expect? Why or why not?
Refer to Assignment 2

f) Develop and test appropriate hypotheses concerning the slope
coefficients using t-tests at the 5 percent level.
Using the Coefficient Table:
             t        Sig.   95% Confidence Interval for B
                             Lower Bound   Upper Bound
(Constant)   4.855    .000   9.065         22.049
WI           2.495    .017   .006          .057
FS           13.543   .000   17.585        23.773
$\hat{b}_i - t_c s_{\hat{b}_i} < b_i < \hat{b}_i + t_c s_{\hat{b}_i}$
0.006 < B1 < 0.057

17.585 < B2 < 23.773
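The confidence interval formula above can be checked against the coefficient table. This sketch assumes a two-tailed 5% critical value of 2.026 for 37 residual degrees of freedom (taken from a standard t table, not from the slides) and uses the rounded coefficients and standard errors; the function names are illustrative.

```python
T_CRIT_37_DF = 2.026  # assumed two-tailed 5% critical value for 37 degrees of freedom


def conf_interval(b, std_error, t_crit=T_CRIT_37_DF):
    """Return (lower, upper) bounds: b minus/plus t_crit times the standard error."""
    half_width = t_crit * std_error
    return b - half_width, b + half_width


def significant(lower, upper):
    """A coefficient differs significantly from zero if the interval excludes zero."""
    return lower > 0 or upper < 0


for name, b, se in [("(Constant)", 15.557, 3.204), ("FS", 20.679, 1.527)]:
    lo, hi = conf_interval(b, se)
    print(name, round(lo, 3), round(hi, 3), significant(lo, hi))

# WI: use the interval reported in the table directly.
print("WI", significant(0.006, 0.057))
```

Both slope intervals exclude zero, which matches the t-test decisions at the 5 percent level (|t| above the critical value and Sig. below 0.05).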
Thank You!
