Chapter 2 Simple Linear Regression


Ray-Bing Chen
Institute of Statistics
National University of Kaohsiung
2.1 Simple Linear Regression Model

The model is

    y = \beta_0 + \beta_1 x + \varepsilon

x: regressor variable
y: response variable
$\beta_0$: the intercept (unknown)
$\beta_1$: the slope (unknown)
$\varepsilon$: random error with $E(\varepsilon) = 0$ and $Var(\varepsilon) = \sigma^2$ (unknown)
The errors are uncorrelated.
Given x,

    E(y \mid x) = E(\beta_0 + \beta_1 x + \varepsilon) = \beta_0 + \beta_1 x
    Var(y \mid x) = Var(\beta_0 + \beta_1 x + \varepsilon) = \sigma^2

The responses are also uncorrelated.
Regression coefficients: $\beta_0$, $\beta_1$
$\beta_1$: the change in $E(y \mid x)$ for a unit change in x
$\beta_0$: $E(y \mid x = 0)$

2.2 Least-Squares Estimation of the Parameters

2.2.1 Estimation of $\beta_0$ and $\beta_1$

n pairs: $(y_i, x_i)$, i = 1, ..., n
Method of least squares: minimize

    S(\beta_0, \beta_1) = \sum_{i=1}^{n} [y_i - (\beta_0 + \beta_1 x_i)]^2
Least-squares normal equations:

    n\hat{\beta}_0 + \hat{\beta}_1 \sum_i x_i = \sum_i y_i
    \hat{\beta}_0 \sum_i x_i + \hat{\beta}_1 \sum_i x_i^2 = \sum_i x_i y_i
The least-squares estimators:

    \hat{\beta}_1 = S_{xy} / S_{xx},   \hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}

where $S_{xx} = \sum_i (x_i - \bar{x})^2$ and $S_{xy} = \sum_i (x_i - \bar{x}) y_i$.
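A minimal numerical sketch of these formulas; the arrays x and y are hypothetical illustrative data, not the book's:

    import numpy as np

    # Hypothetical data -- any paired sample works here.
    x = np.array([3.0, 6.0, 9.0, 12.0, 15.0, 18.0])
    y = np.array([2450., 2300., 2250., 2100., 1950., 1800.])

    x_bar, y_bar = x.mean(), y.mean()
    S_xx = np.sum((x - x_bar) ** 2)        # corrected sum of squares of x
    S_xy = np.sum((x - x_bar) * y)         # corrected sum of cross products

    beta1_hat = S_xy / S_xx                # slope estimate
    beta0_hat = y_bar - beta1_hat * x_bar  # intercept estimate
    print(beta0_hat, beta1_hat)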
The fitted simple linear regression model:

    \hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x

It gives a point estimate of the mean of y for a particular x.
Residual:

    e_i = y_i - \hat{y}_i

Residuals play an important role in investigating the adequacy of the fitted regression model and in detecting departures from the underlying assumptions.
Example 2.1: The Rocket Propellant Data

Shear strength is related to the age in weeks of the batch of sustainer propellant.
20 observations.
The scatter diagram shows a strong relationship between shear strength (y) and propellant age (x).
Assumed model: $y = \beta_0 + \beta_1 x + \varepsilon$
The least-squares fit:

    S_{xx} = \sum_i x_i^2 - n\bar{x}^2 = 1106.56
    S_{xy} = \sum_i x_i y_i - n\bar{x}\bar{y} = -41112.65
    \hat{\beta}_1 = S_{xy} / S_{xx} = -37.15
    \hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x} = 2627.82
    \hat{y} = 2627.82 - 37.15\, x
How well does this equation fit the data?
Is the model likely to be useful as a predictor?
Are any of the basic assumptions violated, and if so, how serious is this?
2.2.2 Properties of the Least-Squares Estimators and the Fitted Regression Model

$\hat{\beta}_0$ and $\hat{\beta}_1$ are linear combinations of the $y_i$:

    \hat{\beta}_1 = \sum_{i=1}^{n} c_i y_i,   c_i = (x_i - \bar{x}) / S_{xx}
    \hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}

$\hat{\beta}_0$ and $\hat{\beta}_1$ are unbiased estimators.
Unbiasedness:

    E(\hat{\beta}_1) = E\left(\sum_{i=1}^{n} c_i y_i\right) = \sum_i c_i E(y_i) = \sum_i c_i (\beta_0 + \beta_1 x_i) = \beta_1

(using $\sum_i c_i = 0$ and $\sum_i c_i x_i = 1$), and

    E(\hat{\beta}_0) = E(\bar{y} - \hat{\beta}_1 \bar{x}) = \beta_0 + \beta_1 \bar{x} - \beta_1 \bar{x} = \beta_0

Variances:

    Var(\hat{\beta}_1) = Var\left(\sum_i c_i y_i\right) = \sum_i c_i^2 Var(y_i) = \sigma^2 \frac{\sum_i (x_i - \bar{x})^2}{S_{xx}^2} = \frac{\sigma^2}{S_{xx}}

    Var(\hat{\beta}_0) = \sigma^2 \left( \frac{1}{n} + \frac{\bar{x}^2}{S_{xx}} \right)
The Gauss-Markov Theorem: $\hat{\beta}_0$ and $\hat{\beta}_1$ are the best linear unbiased estimators (BLUE).
Some useful properties:
The sum of the residuals in any regression model that contains an intercept $\beta_0$ is always zero:

    \sum_i e_i = \sum_i (y_i - \hat{y}_i) = \sum_i [y_i - \bar{y} - \hat{\beta}_1 (x_i - \bar{x})] = 0

It follows that $\sum_i y_i = \sum_i \hat{y}_i$.
The regression line always passes through the centroid $(\bar{x}, \bar{y})$ of the data.
The residuals are orthogonal to the regressor and to the fitted values:

    \sum_i x_i e_i = 0,   \sum_i \hat{y}_i e_i = 0

(These identities are checked numerically in the sketch below.)
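A small check of the residual properties, again with hypothetical data and illustrative names:

    import numpy as np

    x = np.array([3.0, 6.0, 9.0, 12.0, 15.0, 18.0])
    y = np.array([2450., 2300., 2250., 2100., 1950., 1800.])

    beta1 = np.sum((x - x.mean()) * y) / np.sum((x - x.mean()) ** 2)
    beta0 = y.mean() - beta1 * x.mean()
    y_hat = beta0 + beta1 * x
    e = y - y_hat

    print(np.isclose(e.sum(), 0.0))            # sum of residuals is zero
    print(np.isclose((x * e).sum(), 0.0))      # residuals orthogonal to x
    print(np.isclose((y_hat * e).sum(), 0.0))  # residuals orthogonal to fitted values
    print(np.isclose(y.sum(), y_hat.sum()))    # observed and fitted totals match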
2.2.3 Estimator of $\sigma^2$

Residual sum of squares:

    SS_{Res} = \sum_i e_i^2 = \sum_i (y_i - \hat{y}_i)^2 = \sum_i y_i^2 - n\bar{y}^2 - \hat{\beta}_1 S_{xy} = SS_T - \hat{\beta}_1 S_{xy}

where $SS_T = \sum_i (y_i - \bar{y})^2$.
Since $E(SS_{Res}) = (n-2)\sigma^2$, the unbiased estimator of $\sigma^2$ is

    \hat{\sigma}^2 = \frac{SS_{Res}}{n-2} = MS_E

$MS_E$ is called the residual mean square.
This estimate is model-dependent.
Example 2.2: for the rocket propellant data, $\hat{\sigma}^2 = MS_E = 9244.59$ (this value is used again in Example 2.3).
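A short check that the two expressions for $SS_{Res}$ agree, and the resulting $\hat{\sigma}^2$ (same hypothetical data as above):

    import numpy as np

    x = np.array([3.0, 6.0, 9.0, 12.0, 15.0, 18.0])
    y = np.array([2450., 2300., 2250., 2100., 1950., 1800.])
    n = len(x)

    S_xx = np.sum((x - x.mean()) ** 2)
    S_xy = np.sum((x - x.mean()) * y)
    beta1 = S_xy / S_xx
    e = y - (y.mean() - beta1 * x.mean()) - beta1 * x

    SS_res_direct = np.sum(e ** 2)          # sum of squared residuals
    SS_T = np.sum((y - y.mean()) ** 2)
    SS_res_short = SS_T - beta1 * S_xy      # SS_T - beta1_hat * S_xy
    sigma2_hat = SS_res_direct / (n - 2)    # MS_E, unbiased for sigma^2
    print(SS_res_direct, SS_res_short, sigma2_hat)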
2.2.4 An Alternate Form of the Model

The new regression model:

    y = \beta_0 + \beta_1 \bar{x} + \beta_1 (x - \bar{x}) + \varepsilon
      = \beta_0' + \beta_1 (x - \bar{x}) + \varepsilon,   where \beta_0' = \beta_0 + \beta_1 \bar{x}

Normal equations:

    n\hat{\beta}_0' = \sum_i y_i
    \hat{\beta}_1 \sum_i (x_i - \bar{x})^2 = \sum_i y_i (x_i - \bar{x})

The least-squares estimators:

    \hat{\beta}_0' = \bar{y},   \hat{\beta}_1 = S_{xy} / S_{xx}
Some advantages:
The normal equations are easier to solve.
$\hat{\beta}_0'$ and $\hat{\beta}_1$ are uncorrelated.
The fitted model is

    \hat{y} = \bar{y} + \hat{\beta}_1 (x - \bar{x})
2.3 Hypothesis Testing on the Slope and Intercept

Assume the $\varepsilon_i$ are normally distributed, so $y_i \sim N(\beta_0 + \beta_1 x_i, \sigma^2)$.

2.3.1 Use of t-Tests

Test on the slope:

    H_0: \beta_1 = \beta_{10}   vs.   H_1: \beta_1 \neq \beta_{10}

Here $\hat{\beta}_1 \sim N(\beta_1, \sigma^2 / S_{xx})$.
If $\sigma^2$ is known, under the null hypothesis

    Z_0 = \frac{\hat{\beta}_1 - \beta_{10}}{\sqrt{\sigma^2 / S_{xx}}} \sim N(0, 1)

Also, $(n-2) MS_E / \sigma^2$ follows a $\chi^2_{n-2}$ distribution.
If $\sigma^2$ is unknown,

    t_0 = \frac{\hat{\beta}_1 - \beta_{10}}{\sqrt{MS_E / S_{xx}}} = \frac{\hat{\beta}_1 - \beta_{10}}{se(\hat{\beta}_1)} \sim t_{n-2}

Reject $H_0$ if $|t_0| > t_{\alpha/2, n-2}$.
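A sketch of the slope t-test under these formulas; the data and the null value beta10 are hypothetical, and scipy supplies the reference distribution:

    import numpy as np
    from scipy import stats

    x = np.array([3.0, 6.0, 9.0, 12.0, 15.0, 18.0])
    y = np.array([2450., 2300., 2250., 2100., 1950., 1800.])
    n = len(x)

    S_xx = np.sum((x - x.mean()) ** 2)
    beta1 = np.sum((x - x.mean()) * y) / S_xx
    beta0 = y.mean() - beta1 * x.mean()
    MS_E = np.sum((y - beta0 - beta1 * x) ** 2) / (n - 2)

    beta10 = 0.0                                 # null value for the slope
    se_beta1 = np.sqrt(MS_E / S_xx)
    t0 = (beta1 - beta10) / se_beta1
    p_value = 2 * stats.t.sf(abs(t0), df=n - 2)  # two-sided p-value
    print(t0, p_value)                           # reject H0 if |t0| > t_{alpha/2, n-2}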
Test on the intercept:

    H_0: \beta_0 = \beta_{00}   vs.   H_1: \beta_0 \neq \beta_{00}

If $\sigma^2$ is unknown,

    t_0 = \frac{\hat{\beta}_0 - \beta_{00}}{\sqrt{MS_E (1/n + \bar{x}^2 / S_{xx})}} = \frac{\hat{\beta}_0 - \beta_{00}}{se(\hat{\beta}_0)} \sim t_{n-2}

Reject $H_0$ if $|t_0| > t_{\alpha/2, n-2}$.
2.3.2 Testing Significance of Regression

    H_0: \beta_1 = 0   vs.   H_1: \beta_1 \neq 0

Failing to reject $H_0$: there is no linear relationship between x and y.
Rejecting $H_0$: x is of value in explaining the variability in y.
The test statistic is

    t_0 = \frac{\hat{\beta}_1}{se(\hat{\beta}_1)} \sim t_{n-2}

Reject $H_0$ if $|t_0| > t_{\alpha/2, n-2}$.
Example 2.3: The Rocket Propellant Data

Test the significance of regression.

    \hat{\beta}_1 = -37.15,   MS_E = 9244.59
    se(\hat{\beta}_1) = \sqrt{MS_E / S_{xx}} = 2.89

The test statistic is

    t_0 = \hat{\beta}_1 / se(\hat{\beta}_1) = -12.85

Since $t_{0.025, 18} = 2.101$ and $|t_0| > 2.101$, reject $H_0$.
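The arithmetic of this example can be reproduced from the reported summary numbers alone:

    import math

    MS_E, S_xx, beta1_hat = 9244.59, 1106.56, -37.15  # values reported above
    se_beta1 = math.sqrt(MS_E / S_xx)                 # ~2.89
    t0 = beta1_hat / se_beta1                         # ~-12.85
    print(se_beta1, t0)                               # |t0| = 12.85 > t_{0.025,18} = 2.101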
2.3.3 The Analysis of Variance (ANOVA)

Use an analysis-of-variance approach to test the significance of regression, based on the identity

    \sum_i (y_i - \bar{y})^2 = \sum_i (\hat{y}_i - \bar{y})^2 + \sum_i (y_i - \hat{y}_i)^2

$SS_T$: the corrected sum of squares of the observations; it measures the total variability in the observations.
$SS_{Res}$: the residual or error sum of squares; the residual variation left unexplained by the regression line.
$SS_R$: the regression or model sum of squares; the amount of variability in the observations accounted for by the regression line.

    SS_T = SS_R + SS_{Res}
The degrees of freedom:

    df_T = n - 1,   df_R = 1,   df_{Res} = n - 2,   df_T = df_R + df_{Res}

Test the significance of regression by ANOVA:

    SS_{Res}/\sigma^2 = (n-2) MS_{Res}/\sigma^2 \sim \chi^2_{n-2}
    SS_R/\sigma^2 \sim \chi^2_1   (under H_0)

$SS_R$ and $SS_{Res}$ are independent, and $SS_R = \hat{\beta}_1 S_{xy}$.

    F_0 = \frac{SS_R / 1}{SS_{Res} / (n-2)} = \frac{MS_R}{MS_{Res}} \sim F_{1, n-2}
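A sketch of the ANOVA computation (hypothetical data; scipy supplies the F reference distribution):

    import numpy as np
    from scipy import stats

    x = np.array([3.0, 6.0, 9.0, 12.0, 15.0, 18.0])
    y = np.array([2450., 2300., 2250., 2100., 1950., 1800.])
    n = len(x)

    S_xy = np.sum((x - x.mean()) * y)
    beta1 = S_xy / np.sum((x - x.mean()) ** 2)

    SS_T = np.sum((y - y.mean()) ** 2)
    SS_R = beta1 * S_xy           # regression sum of squares, 1 df
    SS_res = SS_T - SS_R          # residual sum of squares, n-2 df

    F0 = (SS_R / 1) / (SS_res / (n - 2))
    p_value = stats.f.sf(F0, 1, n - 2)
    print(F0, p_value)            # reject H0 if F0 > F_{alpha, 1, n-2}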
    E(MS_{Res}) = \sigma^2
    E(MS_R) = \sigma^2 + \beta_1^2 S_{xx}

Reject $H_0$ if $F_0 > F_{\alpha, 1, n-2}$.
If $\beta_1 \neq 0$, $F_0$ follows a noncentral F distribution with 1 and n-2 degrees of freedom and noncentrality parameter

    \lambda = \frac{\beta_1^2 S_{xx}}{\sigma^2}
Example 2.4: The Rocket Propellant Data
More About the t Test

    t_0 = \frac{\hat{\beta}_1}{se(\hat{\beta}_1)} = \frac{\hat{\beta}_1}{\sqrt{MS_{Res}/S_{xx}}}

so

    t_0^2 = \frac{\hat{\beta}_1^2 S_{xx}}{MS_{Res}} = \frac{\hat{\beta}_1 S_{xy}}{MS_{Res}} = \frac{SS_R}{MS_{Res}} = \frac{MS_R}{MS_{Res}} = F_0

The square of a t random variable with f degrees of freedom is an F random variable with 1 and f degrees of freedom.
2.4 Interval Estimation in Simple Linear Regression

2.4.1 Confidence Intervals on $\beta_0$, $\beta_1$, and $\sigma^2$

Assume that the $\varepsilon_i$ are normally and independently distributed.
The 100(1-$\alpha$)% confidence intervals on $\beta_1$ and $\beta_0$ are given by

    \hat{\beta}_1 \pm t_{\alpha/2, n-2}\, se(\hat{\beta}_1)
    \hat{\beta}_0 \pm t_{\alpha/2, n-2}\, se(\hat{\beta}_0)

Interpretation of the C.I.
Confidence interval for $\sigma^2$:

    \frac{(n-2) MS_{Res}}{\chi^2_{\alpha/2, n-2}} \le \sigma^2 \le \frac{(n-2) MS_{Res}}{\chi^2_{1-\alpha/2, n-2}}
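A sketch of these interval computations (hypothetical data; alpha = 0.05 is an illustrative choice):

    import numpy as np
    from scipy import stats

    x = np.array([3.0, 6.0, 9.0, 12.0, 15.0, 18.0])
    y = np.array([2450., 2300., 2250., 2100., 1950., 1800.])
    n, alpha = len(x), 0.05

    S_xx = np.sum((x - x.mean()) ** 2)
    beta1 = np.sum((x - x.mean()) * y) / S_xx
    beta0 = y.mean() - beta1 * x.mean()
    MS_res = np.sum((y - beta0 - beta1 * x) ** 2) / (n - 2)

    t_crit = stats.t.ppf(1 - alpha / 2, n - 2)
    se_b1 = np.sqrt(MS_res / S_xx)
    se_b0 = np.sqrt(MS_res * (1 / n + x.mean() ** 2 / S_xx))
    print(beta1 - t_crit * se_b1, beta1 + t_crit * se_b1)  # CI for beta_1
    print(beta0 - t_crit * se_b0, beta0 + t_crit * se_b0)  # CI for beta_0

    # Chi-square interval for sigma^2:
    lo = (n - 2) * MS_res / stats.chi2.ppf(1 - alpha / 2, n - 2)
    hi = (n - 2) * MS_res / stats.chi2.ppf(alpha / 2, n - 2)
    print(lo, hi)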
Example 2.5: The Rocket Propellant Data
2.4.2 Interval Estimation of the Mean Response

Let $x_0$ be the level of the regressor variable for which we wish to estimate the mean response, with $x_0$ within the range of the original data on x.
An unbiased estimator of $E(y \mid x_0)$ is

    \hat{\mu}_{y|x_0} = \hat{\beta}_0 + \hat{\beta}_1 x_0

$\hat{\mu}_{y|x_0}$ follows a normal distribution, with

    Var(\hat{\mu}_{y|x_0}) = \sigma^2 \left[ \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}} \right]
A 100(1-$\alpha$)% confidence interval on the mean response at $x_0$:

    \hat{\mu}_{y|x_0} \pm t_{\alpha/2, n-2} \sqrt{MS_{Res} \left[ \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}} \right]}
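A sketch for the mean-response interval at a chosen $x_0$ (hypothetical data and x0):

    import numpy as np
    from scipy import stats

    x = np.array([3.0, 6.0, 9.0, 12.0, 15.0, 18.0])
    y = np.array([2450., 2300., 2250., 2100., 1950., 1800.])
    n, alpha, x0 = len(x), 0.05, 10.0   # x0 lies within the range of x

    S_xx = np.sum((x - x.mean()) ** 2)
    beta1 = np.sum((x - x.mean()) * y) / S_xx
    beta0 = y.mean() - beta1 * x.mean()
    MS_res = np.sum((y - beta0 - beta1 * x) ** 2) / (n - 2)

    mu_hat = beta0 + beta1 * x0
    half = stats.t.ppf(1 - alpha / 2, n - 2) * np.sqrt(
        MS_res * (1 / n + (x0 - x.mean()) ** 2 / S_xx))
    print(mu_hat - half, mu_hat + half)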
Example 2.6: The Rocket Propellant Data
The interval width is a minimum at $x_0 = \bar{x}$ and widens as $|x_0 - \bar{x}|$ increases.
Extrapolation: the interval becomes unreliable for $x_0$ outside the range of the original data.
2.5 Prediction of New Observations

    \hat{y}_0 = \hat{\beta}_0 + \hat{\beta}_1 x_0

is the point estimate of the new value of the response $y_0$.
$\psi = y_0 - \hat{y}_0$ follows a normal distribution with mean 0 and variance

    Var(y_0 - \hat{y}_0) = \sigma^2 \left[ 1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}} \right]
The 100(1-$\alpha$)% interval on a future observation at $x_0$ (a prediction interval for the future observation $y_0$):

    \hat{y}_0 \pm t_{\alpha/2, n-2} \sqrt{MS_{Res} \left[ 1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}} \right]}
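The prediction interval differs from the mean-response interval only by the extra 1 under the square root; a sketch with hypothetical data and x0:

    import numpy as np
    from scipy import stats

    x = np.array([3.0, 6.0, 9.0, 12.0, 15.0, 18.0])
    y = np.array([2450., 2300., 2250., 2100., 1950., 1800.])
    n, alpha, x0 = len(x), 0.05, 10.0

    S_xx = np.sum((x - x.mean()) ** 2)
    beta1 = np.sum((x - x.mean()) * y) / S_xx
    beta0 = y.mean() - beta1 * x.mean()
    MS_res = np.sum((y - beta0 - beta1 * x) ** 2) / (n - 2)

    y0_hat = beta0 + beta1 * x0
    half = stats.t.ppf(1 - alpha / 2, n - 2) * np.sqrt(
        MS_res * (1 + 1 / n + (x0 - x.mean()) ** 2 / S_xx))  # note the extra 1
    print(y0_hat - half, y0_hat + half)  # always wider than the mean-response CI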
Example 2.7: The Rocket Propellant Data

The 100(1-$\alpha$)% confidence interval on $y_0$ follows the prediction-interval formula above.
2.6 Coefficient of Determination
The coefficient of determination:


The proportion of variation explained by the
regressor x
0 R
2
1
T
s
T
R
SS
SS
SS
SS
R
Re 2
1 = =
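$R^2$ falls out of quantities already computed; a sketch with the same hypothetical data:

    import numpy as np

    x = np.array([3.0, 6.0, 9.0, 12.0, 15.0, 18.0])
    y = np.array([2450., 2300., 2250., 2100., 1950., 1800.])

    S_xy = np.sum((x - x.mean()) * y)
    beta1 = S_xy / np.sum((x - x.mean()) ** 2)
    SS_T = np.sum((y - y.mean()) ** 2)
    SS_R = beta1 * S_xy
    R2 = SS_R / SS_T        # equivalently 1 - SS_res / SS_T
    print(R2)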
In Example 2.1, $R^2 = 0.9018$. It means that 90.18% of the variability in strength is accounted for by the regression model.
$R^2$ can be increased by adding terms to the model.
For a simple linear regression model,

    E(R^2) \approx \frac{\beta_1^2 S_{xx}}{\beta_1^2 S_{xx} + \sigma^2}

so $E(R^2)$ increases (decreases) as $S_{xx}$ increases (decreases).
$R^2$ does not measure the magnitude of the slope of the regression line; a large value of $R^2$ does not imply a steep slope.
$R^2$ does not measure the appropriateness of the linear model.
2.7 Some Considerations in the Use of Regression

Regression models are only suitable for interpolation over the range of the regressors, not for extrapolation.
Important: the disposition of the x values. The slope is strongly influenced by the remote values of x.
Outliers and bad values can seriously disturb the least-squares fit (the intercept and the residual mean square in particular).
Regression does not imply a cause-and-effect relationship.
For example, the fitted model

    \hat{y} = 4.582 + 2.204\, x

has $t_0 = 27.312$ for testing $H_0: \beta_1 = 0$ and $R^2 = 0.9842$, yet a highly significant regression still does not establish cause and effect.
x may be unknown at prediction time. For example, consider predicting the maximum daily load on an electric power generation system from a regression model relating the load to the maximum daily temperature.
2.8 Regression Through the Origin

A no-intercept model is

    y = \beta_1 x + \varepsilon

Given $(y_i, x_i)$, i = 1, 2, ..., n, the least-squares estimator is

    \hat{\beta}_1 = \frac{\sum_i y_i x_i}{\sum_i x_i^2}

and the residual mean square is $MS_{Res} = \sum_i (y_i - \hat{y}_i)^2 / (n-1)$, with n-1 degrees of freedom.
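A sketch of the no-intercept fit under these formulas (hypothetical data, chosen roughly proportional to x):

    import numpy as np

    x = np.array([3.0, 6.0, 9.0, 12.0, 15.0, 18.0])
    y = np.array([31., 58., 92., 118., 152., 181.])

    n = len(x)
    beta1 = np.sum(y * x) / np.sum(x ** 2)  # slope of the line through the origin
    e = y - beta1 * x
    MS_res = np.sum(e ** 2) / (n - 1)       # note n-1 df, not n-2
    print(beta1, MS_res)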
The 100(1-$\alpha$)% confidence interval on $\beta_1$:

    \hat{\beta}_1 \pm t_{\alpha/2, n-1} \sqrt{MS_{Res} / \textstyle\sum_i x_i^2}

The 100(1-$\alpha$)% confidence interval on $E(y \mid x_0)$:

    \hat{\mu}_{y|x_0} \pm t_{\alpha/2, n-1} \sqrt{x_0^2\, MS_{Res} / \textstyle\sum_i x_i^2}

The 100(1-$\alpha$)% prediction interval on $y_0$:

    \hat{y}_0 \pm t_{\alpha/2, n-1} \sqrt{MS_{Res} \left( 1 + x_0^2 / \textstyle\sum_i x_i^2 \right)}
Misuse: the no-intercept model is often misused when the data lie in a region of x-space remote from the origin.
The residual mean square $MS_{Res}$ is the more useful basis for comparing the intercept and no-intercept models.
Generally $R^2$ is not a good comparative statistic for the two models.
For the intercept model,

    R^2 = \frac{\sum_i (\hat{y}_i - \bar{y})^2}{\sum_i (y_i - \bar{y})^2}

For the no-intercept model,

    R_0^2 = \frac{\sum_i \hat{y}_i^2}{\sum_i y_i^2}

Occasionally $R_0^2 > R^2$ even when the no-intercept model is the worse fit, because the two ratios have different denominators; compare the $MS_{Res}$ values instead.
Example 2.8: The Shelf-Stocking Data
2.9 Estimation by Maximum Likelihood

Assume that the errors are NID(0, $\sigma^2$). Then $y_i \sim N(\beta_0 + \beta_1 x_i, \sigma^2)$.
The likelihood function:

    L(\beta_0, \beta_1, \sigma^2) = (2\pi\sigma^2)^{-n/2} \exp\left( -\frac{1}{2\sigma^2} \sum_{i=1}^{n} (y_i - \beta_0 - \beta_1 x_i)^2 \right)
Maximizing L gives the maximum-likelihood estimators

    \tilde{\beta}_0 = \bar{y} - \tilde{\beta}_1 \bar{x},   \tilde{\beta}_1 = S_{xy}/S_{xx},   \tilde{\sigma}^2 = \frac{1}{n} \sum_i (y_i - \tilde{\beta}_0 - \tilde{\beta}_1 x_i)^2

so $\tilde{\beta}_0$ and $\tilde{\beta}_1$ coincide with the least-squares estimators, while $\tilde{\sigma}^2$ divides by n and is biased.

MLE vs. LSE:
In general MLEs have better statistical properties than LSEs.
MLEs are unbiased (or asymptotically unbiased) and have minimum variance when compared to all other unbiased estimators.
They are also consistent estimators.
They are a set of sufficient statistics.
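A small numerical illustration of the one place the estimates differ, assuming the closed forms above (hypothetical data):

    import numpy as np

    x = np.array([3.0, 6.0, 9.0, 12.0, 15.0, 18.0])
    y = np.array([2450., 2300., 2250., 2100., 1950., 1800.])
    n = len(x)

    beta1 = np.sum((x - x.mean()) * y) / np.sum((x - x.mean()) ** 2)
    beta0 = y.mean() - beta1 * x.mean()   # MLE = LSE for the coefficients
    SS_res = np.sum((y - beta0 - beta1 * x) ** 2)

    sigma2_mle = SS_res / n        # biased (divides by n)
    sigma2_lse = SS_res / (n - 2)  # unbiased residual mean square
    print(sigma2_mle, sigma2_lse)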
However, MLE requires more stringent statistical assumptions than LSE.
LSE needs only second-moment assumptions.
MLE requires a full distributional assumption.
2.10 Case Where the Regressor x Is Random

2.10.1 x and y Jointly Distributed

x and y are jointly distributed random variables, and this joint distribution is unknown.
All of our previous results hold if:
$y \mid x \sim N(\beta_0 + \beta_1 x, \sigma^2)$
The x's are independent random variables whose probability distribution does not involve $\beta_0$, $\beta_1$, $\sigma^2$.
2.10.2 x and y Jointly Normally Distributed: the Correlation Model

The estimator of the correlation coefficient $\rho$ is the sample correlation

    r = \frac{S_{xy}}{(S_{xx}\, SS_T)^{1/2}}

where $SS_T = \sum_i (y_i - \bar{y})^2$; note that $\hat{\beta}_1 = (SS_T / S_{xx})^{1/2}\, r$.
Test on $\rho$:

    H_0: \rho = 0   vs.   H_1: \rho \neq 0,   t_0 = \frac{r\sqrt{n-2}}{\sqrt{1 - r^2}} \sim t_{n-2} under H_0

For $H_0: \rho = \rho_0$, use the Fisher z-transform $z = \operatorname{arctanh}(r)$, approximately $N(\operatorname{arctanh}(\rho), 1/(n-3))$:

    Z_0 = [\operatorname{arctanh}(r) - \operatorname{arctanh}(\rho_0)] \sqrt{n-3}

The 100(1-$\alpha$)% C.I. for $\rho$:

    \tanh\left( \operatorname{arctanh}(r) - \frac{z_{\alpha/2}}{\sqrt{n-3}} \right) \le \rho \le \tanh\left( \operatorname{arctanh}(r) + \frac{z_{\alpha/2}}{\sqrt{n-3}} \right)
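A sketch of the correlation test and the Fisher-z interval (hypothetical data; the t-test here matches what scipy.stats.pearsonr reports):

    import numpy as np
    from scipy import stats

    x = np.array([3.0, 6.0, 9.0, 12.0, 15.0, 18.0])
    y = np.array([2450., 2300., 2250., 2100., 1950., 1800.])
    n, alpha = len(x), 0.05

    r = np.corrcoef(x, y)[0, 1]                    # sample correlation
    t0 = r * np.sqrt(n - 2) / np.sqrt(1 - r ** 2)
    p_value = 2 * stats.t.sf(abs(t0), df=n - 2)

    # Fisher z interval for rho
    z = np.arctanh(r)
    half = stats.norm.ppf(1 - alpha / 2) / np.sqrt(n - 3)
    ci = (np.tanh(z - half), np.tanh(z + half))
    print(t0, p_value, ci)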
Example 2.9: The Delivery Time Data