Lecture 12
Modeling Spatially Continuous Data
October 17, 2002
Bailey and Gatrell, Chapter 5

Generalized Least Squares
- First fit the trend surface model to the observed data using ordinary least squares
- Then estimate a variogram model using the residuals from the ordinary least squares estimation
- Then compute a covariogram using the variogram model
- From the covariogram, estimate the covariance matrix and refit the model using generalized least squares
The validity of the final model depends on:
- appropriate trend surface model choice
- appropriate variogram model choice
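The refit step of the procedure above can be sketched in a few lines of NumPy. This is a minimal illustration, not the lecture's own code: the exponential covariance function `exp_cov`, the simulated coordinates, and the coefficient values are all assumptions made for the example, and the covariance matrix is taken as known rather than estimated from a fitted variogram.

```python
import numpy as np

def ols(X, y):
    """Ordinary least squares coefficients."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

def gls(X, y, Sigma):
    """Generalized least squares: (X' S^-1 X)^-1 X' S^-1 y."""
    Si_X = np.linalg.solve(Sigma, X)   # S^-1 X without forming the inverse
    Si_y = np.linalg.solve(Sigma, y)   # S^-1 y
    return np.linalg.solve(X.T @ Si_X, X.T @ Si_y)

def exp_cov(D, sill=1.0, range_=2.0):
    """Hypothetical exponential covariogram, used only for this sketch."""
    return sill * np.exp(-D / range_)

rng = np.random.default_rng(0)
coords = rng.uniform(0, 10, size=(30, 2))            # simulated sample sites
X = np.column_stack([np.ones(30), coords])           # linear trend surface
D = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
Sigma = exp_cov(D)                                   # assumed known here
beta_true = np.array([5.0, 0.5, -0.3])
y = X @ beta_true + np.linalg.cholesky(Sigma) @ rng.standard_normal(30)

beta_ols = ols(X, y)            # step 1: ordinary least squares fit
residuals = y - X @ beta_ols    # step 2 would fit a variogram to these
beta_gls = gls(X, y, Sigma)     # step 4: refit with the covariance matrix
```

With an identity covariance matrix the two estimators coincide, which is a quick sanity check on the `gls` function.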
Objectives
- Understanding and describing the nature of the spatial variation in an attribute
- Prediction of the attribute values at locations where it has not been sampled

Generalized Least Squares
Want to predict a value for the attribute at location s
Knowledge of the trend and covariance structure is sufficient
How to use the derived model for prediction purposes
Developing Models for Prediction
Assume we have the model
$$Y(s) = x^T(s)\beta + U(s)$$
with U(s) a zero-mean process with covariance function C()
From this model we can do better than a prediction based only on the mean value
$$\mu(s) = x^T(s)\beta$$
Can add a local component to the mean based on knowledge of the covariance structure and the observed values at sampled points
This approach is known as kriging

Kriging
Matheron coined the term in honor of D. G. Krige, a South African mining engineer
An optimal spatial linear prediction method
Optimal in the sense that it is unbiased and minimizes the mean squared prediction error
Based primarily on the second order properties of the process Y
Prediction weights are based on the spatial dependence between observations modeled by the variogram
Kriging Models
- Simple Kriging
  Mean is known and constant throughout the study region
- Ordinary Kriging
  Mean is fixed but unknown and needs to be estimated
- Universal Kriging
  Mean varies, is an unknown linear combination of known functions

Simple Kriging
Assumes mean is known and does not need to be estimated
- Subtract the mean from the sample observations to derive a set of residuals $u_i = y_i - \mu(s_i)$
- We assume these zero-mean residuals also have a known variance $\sigma^2$ and covariance function C()
- Find an estimate $\hat{u}(s)$ for a value $u(s)$ of the random variable U(s) at the location s, given observed values $u_i$ of the random variables $U(s_i)$ at the n sample locations $s_i$
- With an estimate $\hat{u}(s)$, the predicted value $\hat{y}(s)$ is derived by adding $\hat{u}(s)$ to the known trend $\mu(s)$ at the point s
Simple Kriging
Weighted sum of n random variables at sample sites $s_i$
$$\hat{U}(s) = \sum_{i=1}^{n} \lambda_i(s)\, U(s_i)$$
$\lambda_i(s)$: different weights applied at different locations s
$\hat{U}(s)$, as a sum of random variables, is a random variable itself
Estimates are weighted linear combinations of the observed residuals
$$\hat{u}(s) = \sum_{i=1}^{n} \lambda_i(s)\, u_i$$
[Figure: unmeasured point $s_0$ to be estimated, surrounded by the observed point locations $s_1, \dots, s_n$, each labeled with its weight $\lambda_i$]
$\hat{U}(s)$ should have a mean value of zero for any set of weights, since by assumption the mean of U(s) is zero and the weights are constants
Simple Kriging
Want to choose weights to minimize the mean square error
Assuming $\hat{U}(s)$ and $U(s)$ both have zero mean, the expected mean square error is:
$$E\big((\hat{U}(s) - U(s))^2\big) = E(\hat{U}^2(s)) + E(U^2(s)) - 2E(\hat{U}(s)U(s))$$
$$= \sum_{i=1}^{n}\sum_{j=1}^{n} \lambda_i(s)\lambda_j(s)\, C(s_i, s_j) + \sigma^2 - 2\sum_{i=1}^{n} \lambda_i(s)\, C(s, s_i)$$
$$= \lambda^T(s)\, C\, \lambda(s) + \sigma^2 - 2\lambda^T(s)\, c(s)$$
C is an (n x n) matrix of covariances between all possible pairs of the n sample points
c(s) is an (n x 1) column vector of covariances between the prediction point s and each of the n sample points

Simple Kriging
Differentiating with respect to $\lambda(s)$ and setting the result to zero in order to minimize gives
$$\lambda(s) = C^{-1} c(s)$$
$$\hat{U}(s) = \lambda^T(s)\, U = c^T(s)\, C^{-1} U$$
The minimized expected mean square error corresponding to this choice of weights is
$$E\big((\hat{U}(s) - U(s))^2\big) = \sigma^2 - c^T(s)\, C^{-1} c(s)$$
This is the mean square prediction error or kriging variance $\sigma_e^2$
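The simple kriging weights, prediction, and kriging variance can be computed directly from C, c(s), and the residuals. The sketch below is illustrative only: the function name `simple_kriging`, the three sites on a line, the residual values, and the exponential covariance `cov` are assumptions chosen to keep the example small.

```python
import numpy as np

def simple_kriging(C, c0, u, sigma2):
    """Return (weights, predicted residual, kriging variance)."""
    lam = np.linalg.solve(C, c0)       # lambda(s) = C^{-1} c(s)
    u_hat = float(lam @ u)             # weighted sum of observed residuals
    var = float(sigma2 - c0 @ lam)     # sigma^2 - c'(s) C^{-1} c(s)
    return lam, u_hat, var

sigma2 = 2.0

def cov(h):
    """Hypothetical exponential covariance with variance sigma2."""
    return sigma2 * np.exp(-np.abs(h))

sites = np.array([0.0, 1.0, 3.0])      # assumed 1-D sample locations
u = np.array([0.4, -0.2, 0.7])         # assumed zero-mean residuals
C = cov(sites[:, None] - sites[None, :])

# Prediction at an unsampled location and at an existing sample site
lam_new, u_new, var_new = simple_kriging(C, cov(sites - 1.5), u, sigma2)
lam_obs, u_obs, var_obs = simple_kriging(C, cov(sites - 0.0), u, sigma2)
```

Predicting at a sampled site puts all the weight on that site and gives zero kriging variance, while an unsampled site gives a variance strictly between 0 and $\sigma^2$.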
Worked Example Simple Kriging

Sample data (u = value - 160.37, the known mean)
ID    Y     Z     value    u
1     29    82    122      -38.37
2     73    105   183      22.63
3     82    65    148      -12.37
4     91    52    160      -0.37
5     22    22    176      15.63

Distances between sample points
      1         2        3        4       5
1     0
2     49.2      0
3     53.9      40.3     0
4     65.2      55.9     15.8     0
5     60.795    97.1     72.09    69.6    0

Covariances between sample points (matrix C, with $\sigma^2 = 20$)
      1        2        3         4        5
1     20
2     4.571    20
3     3.970    5.970    20
4     2.828    3.739    12.450    20
5     3.228    1.086    2.300     2.479    20

C^{-1}
 0.054914  -0.01004   -0.00646   -0.00095   -0.00746
-0.01004    0.056751  -0.01508    0.000168   0.000252
-0.00646   -0.01508    0.087403  -0.05044   -0.00194
-0.00095    0.000168  -0.05044    0.082023  -0.00422
-0.00746    0.000252  -0.00194   -0.00422    0.051936

Covariances to the prediction point and weights $\lambda(s) = C^{-1} c(s)$
ID     c(s)      lambda(s)
1      6.895     0.225
2      5.032     0.066
3      10.15     0.351
4      8.083     0.128
5      4.079     0.107
Sum              0.877
Simple kriging weights need not sum to 1

Prediction
$$\hat{u}(s) = \sum_{i=1}^{n} \lambda_i(s)\, u_i$$
$$\hat{u}(s) = 0.225(-38.37) + 0.066(22.63) + 0.351(-12.37) + 0.128(-0.37) + 0.107(15.63) = -9.86$$
$$\hat{y}(s) = 160.37 - 9.86 = 150.5$$
$$\sigma_e^2 = \sigma^2 - c^T(s)\, C^{-1} c(s) = 20.0 - 6.92 = 13.08$$
95 percent confidence interval is $\hat{y}(s) \pm 1.96\,\sigma_e$
95 percent CI = $150.5 \pm 7.09$
[Figure: prediction point $s_0$ with predicted value 150.5, surrounded by sample points $s_1, \dots, s_5$ labeled with their weights 0.225, 0.066, 0.351, 0.128, 0.107]
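The numbers in this worked example can be checked by solving the simple kriging system with the covariance matrix, the vector c(s), the observed values, and the known mean 160.37 quoted on the slides. This is a sketch; the variable names are illustrative, and small differences in the last digit are expected because the slide values are rounded.

```python
import numpy as np

# Lower triangle of the covariance matrix C as tabulated (sigma^2 = 20)
lower = [
    [20.0],
    [4.571, 20.0],
    [3.970, 5.970, 20.0],
    [2.828, 3.739, 12.450, 20.0],
    [3.228, 1.086, 2.300, 2.479, 20.0],
]
C = np.zeros((5, 5))
for i, row in enumerate(lower):
    C[i, : i + 1] = row
C = C + np.tril(C, -1).T               # mirror to get the symmetric matrix

c0 = np.array([6.895, 5.032, 10.15, 8.083, 4.079])   # c(s) from the slide
z = np.array([122.0, 183.0, 148.0, 160.0, 176.0])    # observed values
mean = 160.37                                        # known mean
sigma2 = 20.0

u = z - mean                           # residuals u_i
lam = np.linalg.solve(C, c0)           # lambda(s) = C^{-1} c(s)
u_hat = lam @ u                        # predicted residual, about -9.8
y_hat = mean + u_hat                   # prediction, about 150.5
var = sigma2 - c0 @ lam                # kriging variance, about 13.08
half_width = 1.96 * np.sqrt(var)       # 95 percent half-width, about 7.09

print(np.round(lam, 3), round(lam.sum(), 3))
print(round(y_hat, 1), round(var, 2), round(half_width, 2))
```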
Ordinary Kriging
Optimal local spatial prediction
- First order effects are implicitly estimated as part of the prediction process
- Prediction of y occurs in one step using a weighted linear combination of the observed values $y_i$
$$\hat{Y}(s) = \sum_{i=1}^{n} \lambda_i(s)\, Y(s_i)$$
In this case we choose weights so that the mean value of $\hat{Y}(s)$ is constrained to be the mean $\mu$
[Figure: prediction point surrounded by the observed point locations $s_1, \dots, s_n$]
Ordinary Kriging
The mean of Y(s) and each of the $Y(s_i)$ are all $\mu$
The mean of $\hat{Y}(s)$ will also be $\mu$ as long as the weights sum to one
With this constraint we minimize the mean squared error between values of Y(s) and $\hat{Y}(s)$
The expected mean square error is
$$E\big((\hat{Y}(s) - Y(s))^2\big) = \lambda^T(s)\, C\, \lambda(s) + \sigma^2 - 2\lambda^T(s)\, c(s)$$
C is an (n x n) matrix of covariances $C(s_i, s_j)$ between all possible pairs of the n sample sites and c(s) is an (n x 1) column vector of covariances $C(s, s_i)$ between the prediction point s and each of the n sample sites
To carry out the optimization with the constraint, use the method of Lagrange multipliers with multiplier v(s), minimizing
$$\lambda^T(s)\, C\, \lambda(s) + \sigma^2 - 2\lambda^T(s)\, c(s) + 2\big(\lambda^T(s)\mathbf{1} - 1\big)\, v(s)$$
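Ordinary Kriging
Minimizing
$$\lambda^T(s)\, C\, \lambda(s) + \sigma^2 - 2\lambda^T(s)\, c(s) + 2\big(\lambda^T(s)\mathbf{1} - 1\big)\, v(s)$$
leads to 2 simultaneous equations
$$C\,\lambda(s) + \mathbf{1}\,v(s) = c(s)$$
$$\lambda^T(s)\,\mathbf{1} = 1$$
These equations are expressed using a modified matrix $C_+$ and vectors $\lambda_+(s)$ and $c_+(s)$
$$C_+ = \begin{pmatrix} C(s_1,s_1) & \cdots & C(s_1,s_n) & 1 \\ \vdots & \ddots & \vdots & \vdots \\ C(s_n,s_1) & \cdots & C(s_n,s_n) & 1 \\ 1 & \cdots & 1 & 0 \end{pmatrix}, \quad \lambda_+(s) = \begin{pmatrix} \lambda_1(s) \\ \vdots \\ \lambda_n(s) \\ v(s) \end{pmatrix}, \quad c_+(s) = \begin{pmatrix} C(s,s_1) \\ \vdots \\ C(s,s_n) \\ 1 \end{pmatrix}$$
so that $C_+\,\lambda_+(s) = c_+(s)$, and a solution is
$$\lambda_+(s) = C_+^{-1}\, c_+(s)$$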
Ordinary Kriging
To obtain the prediction $\hat{y}(s)$
- Solve the equation for $\lambda_+(s)$
- Extract from this the vector $\lambda(s)$
- Then $\hat{y}(s) = \lambda^T(s)\, y$, where y is the original set of observations $y_i$
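Worked Example Ordinary Kriging
Source: http://www.msu.edu/~ashton/466/2002_notes/3-18-02/ok_ill.html

ID    x    y    z
1     2    4    3
2     4    7    4
3     8    9    2
4     7    4    4
Only four of the five data points survive in this table; the distance matrix and the reported kriging estimate are consistent with a fifth point at (6, 4) with z = 6.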
The distance matrix for the 5 data points is:

[1] [2] [3] [4] [5]
[1] 0.000 3.605 7.810 5.000 4.000
[2] 3.605 0.000 4.472 4.243 3.605
[3] 7.810 4.472 0.000 5.099 5.385
[4] 5.000 4.243 5.099 0.000 1.000
[5] 4.000 3.605 5.385 1.000 0.000
The distance vector between the data points and the

unknown point is:
[1] 3.162 2.236 5.000 2.236 1.414

The vector of covariances between the data points and the unknown point, with a final 1 for the sum-to-one constraint, is:
4.061023 5.026350 2.343750 5.026350 5.919616 1.000000
The vector of weights lambda, with the Lagrange multiplier in row [6]:
[1] 0.17289193
[2] 0.26523729
[3] 0.05887157
[4] 0.16986833
[5] 0.33313088
[6] -0.13471033

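Variogram model is a spherical model with nugget = 2.5, sill = 7.5, range = 10.0
$$\gamma(h) = \begin{cases} 0 & h = 0 \\ 2.5 + (7.5 - 2.5)\left(\dfrac{3h}{20} - \dfrac{h^3}{2000}\right) & 0 < h \le 10 \\ 7.5 & \text{otherwise} \end{cases}$$
$$\gamma(h) = \sigma^2 - C(h)$$
$$C(h) = 7.5 - \left[ 2.5 + (7.5 - 2.5)\left(\dfrac{3h}{20} - \dfrac{h^3}{2000}\right) \right]$$
Note that the covariances tabulated in this example correspond to $C(h) = 7.5\,(1 - 3h/20 + h^3/2000)$ for $0 < h \le 10$ (for example, h = 1.000 gives 6.378750), that is, to 7.5 treated as a partial sill on top of the nugget rather than to the formula as printed. The reported weights and kriging variance likewise correspond to a total sill of 2.5 + 7.5 = 10 at h = 0.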
The covariance matrix for the data points for the given model, augmented with a row and column of ones for the constraint:
          [1]        [2]        [3]        [4]        [5]     [6]
[1] 7.5000000  3.6200650  0.5001733  2.3437500  3.2400000       1
[2] 3.6200650  7.5000000  2.8043796  3.0130760  3.6200650       1
[3] 0.5001733  2.8043796  7.5000000  2.2607737  2.0274579       1
[4] 2.3437500  3.0130760  2.2607737  7.5000000  6.3787500       1
[5] 3.2400000  3.6200650  2.0274579  6.3787500  7.5000000       1
[6] 1.0000000  1.0000000  1.0000000  1.0000000  1.0000000       0

Calculate the kriging variance as the transpose of $c_+$ times the inverse of $C_+$ times $c_+$, subtracted from the sill:
$$\sigma_e^2 = \sigma^2 - c_+^T(s)\, C_+^{-1}\, c_+(s) = 5.135453$$
The kriging standard error is the square root of this: 2.266154
Then multiply the weight for each data point by the attribute value of that point and sum to determine the ordinary kriging estimate: 4.375627
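The ordinary kriging example can be reproduced by building the augmented system from the tabulated distances and solving it. The sketch below rests on assumptions that are not stated on the slides: the reported weights and variance are matched when 7.5 is treated as a partial sill with C(0) = 2.5 + 7.5 = 10, and the fifth attribute value (6) is inferred from the reported estimate rather than read from the data table. The names `cov`, `C_plus`, and `c_plus` are illustrative.

```python
import numpy as np

# Distances between the 5 data points, as tabulated
D = np.array([
    [0.000, 3.605, 7.810, 5.000, 4.000],
    [3.605, 0.000, 4.472, 4.243, 3.605],
    [7.810, 4.472, 0.000, 5.099, 5.385],
    [5.000, 4.243, 5.099, 0.000, 1.000],
    [4.000, 3.605, 5.385, 1.000, 0.000],
])
d0 = np.array([3.162, 2.236, 5.000, 2.236, 1.414])   # distances to unknown point
z = np.array([3.0, 4.0, 2.0, 4.0, 6.0])              # last value is inferred

NUGGET, PSILL, RANGE = 2.5, 7.5, 10.0                # assumed interpretation

def cov(h):
    """Spherical covariance: nugget + partial sill at h = 0, zero beyond range."""
    h = np.asarray(h, dtype=float)
    sph = np.where(h < RANGE, 1.5 * h / RANGE - 0.5 * (h / RANGE) ** 3, 1.0)
    return np.where(h == 0, NUGGET + PSILL, PSILL * (1.0 - sph))

n = len(z)
C_plus = np.ones((n + 1, n + 1))       # border of ones for the constraint
C_plus[:n, :n] = cov(D)
C_plus[n, n] = 0.0
c_plus = np.append(cov(d0), 1.0)

lam_plus = np.linalg.solve(C_plus, c_plus)           # lambda_+ = C_+^{-1} c_+
lam, v = lam_plus[:n], lam_plus[n]                   # weights and multiplier
estimate = lam @ z                                   # ordinary kriging estimate
variance = (NUGGET + PSILL) - c_plus @ lam_plus      # kriging variance
std_error = np.sqrt(variance)
```

Under these assumptions the weights sum to one and agree with the listed values to about three decimal places.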
Ordinary versus Simple Kriging

Primarily a local neighborhood estimator
There is no need to estimate a first-order trend; instead, the
mean is estimated from nearby data only.
Estimates are not as sensitive to non-stationarity (though the
covariance model may be affected, it is not as strongly affected
at short ranges as at long ones).
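Universal Kriging
Includes a first order trend component
$$Y(s) = \beta_0 + \beta_1 x_1(s) + \beta_2 x_2(s) + \cdots + \beta_p x_p(s) + \varepsilon(s)$$
Forms a prediction for y in one step
$$\hat{Y}(s) = \sum_{i=1}^{n} \lambda_i(s)\, Y(s_i)$$
As for ordinary kriging, the weights are chosen to minimize the mean squared error
$$\mathrm{MSE} = E\,\big|\hat{Y}(s) - Y(s)\big|^2$$
subject to the constraint that $\hat{Y}(s)$ is unbiased for Y(s), that is $E\{\hat{Y}(s)\} = E\{Y(s)\}$

Universal Kriging
Writing $s_0$ for the prediction location, $\hat{Y}(s_0)$ is unbiased for all $\beta_0, \beta_1, \beta_2, \dots, \beta_p$ if and only if
$$\sum_{i=1}^{n} \lambda_i = 1 \quad \text{and} \quad \sum_{i=1}^{n} \lambda_i\, x_k(s_i) = x_k(s_0); \quad k = 1, 2, \dots, p$$
To obtain the weights that minimize the mean square error subject to this constraint, again use the method of Lagrange multipliers
Taking the derivatives with respect to $\lambda$ and v, setting the expressions to zero and re-arranging terms, we get the kriging equations (using the variogram)
$$\sum_{i=1}^{n} \lambda_i\, \gamma(s_i - s_k) + v_0 + \sum_{j=1}^{p} v_j\, x_j(s_k) = \gamma(s_k - s_0); \quad k = 1, 2, \dots, n$$
$$\sum_{i=1}^{n} \lambda_i = 1$$
$$\sum_{i=1}^{n} \lambda_i\, x_k(s_i) = x_k(s_0); \quad k = 1, 2, \dots, p$$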
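Universal Kriging
In matrix notation
$$\begin{pmatrix}
0 & \gamma(s_1,s_2) & \cdots & \gamma(s_1,s_n) & 1 & x_1(s_1) & \cdots & x_p(s_1) \\
\gamma(s_2,s_1) & 0 & \cdots & \gamma(s_2,s_n) & 1 & x_1(s_2) & \cdots & x_p(s_2) \\
\vdots & \vdots & \ddots & \vdots & \vdots & \vdots & & \vdots \\
\gamma(s_n,s_1) & \gamma(s_n,s_2) & \cdots & 0 & 1 & x_1(s_n) & \cdots & x_p(s_n) \\
1 & 1 & \cdots & 1 & 0 & 0 & \cdots & 0 \\
x_1(s_1) & x_1(s_2) & \cdots & x_1(s_n) & 0 & 0 & \cdots & 0 \\
\vdots & \vdots & & \vdots & \vdots & \vdots & & \vdots \\
x_p(s_1) & x_p(s_2) & \cdots & x_p(s_n) & 0 & 0 & \cdots & 0
\end{pmatrix}
\begin{pmatrix} \lambda_1 \\ \lambda_2 \\ \vdots \\ \lambda_n \\ v_0 \\ v_1 \\ \vdots \\ v_p \end{pmatrix}
=
\begin{pmatrix} \gamma(s_1 - s_0) \\ \gamma(s_2 - s_0) \\ \vdots \\ \gamma(s_n - s_0) \\ 1 \\ x_1(s_0) \\ \vdots \\ x_p(s_0) \end{pmatrix}$$

Universal Kriging
In terms of covariances
$$C_+ = \begin{pmatrix}
C(s_1,s_1) & \cdots & C(s_1,s_n) & x_1(s_1) & \cdots & x_p(s_1) \\
\vdots & \ddots & \vdots & \vdots & & \vdots \\
C(s_n,s_1) & \cdots & C(s_n,s_n) & x_1(s_n) & \cdots & x_p(s_n) \\
x_1(s_1) & \cdots & x_1(s_n) & 0 & \cdots & 0 \\
\vdots & & \vdots & \vdots & \ddots & \vdots \\
x_p(s_1) & \cdots & x_p(s_n) & 0 & \cdots & 0
\end{pmatrix}, \quad
\lambda_+(s) = \begin{pmatrix} \lambda_1(s) \\ \vdots \\ \lambda_n(s) \\ v_1(s) \\ \vdots \\ v_p(s) \end{pmatrix}, \quad
c_+(s) = \begin{pmatrix} C(s,s_1) \\ \vdots \\ C(s,s_n) \\ x_1(s) \\ \vdots \\ x_p(s) \end{pmatrix}$$
In this form the sum-to-one constraint can be carried by taking one of the $x_j$ to be the constant function 1.
The solution is
$$\lambda_+(s) = C_+^{-1}\, c_+(s)$$
Prediction is
$$\hat{y}(s) = \lambda^T(s)\, y$$
MSE
$$\sigma_e^2 = \sigma^2 - c_+^T(s)\, C_+^{-1}\, c_+(s)$$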
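The covariance form of the universal kriging system can be sketched as follows. The function name `universal_kriging`, the one-dimensional sites, the exponential covariance `cov`, and the exactly linear data are assumptions for illustration; the constant term is included as the first trend function so that the weights sum to one.

```python
import numpy as np

def universal_kriging(C, c0, X, x0, y, sigma2):
    """Solve C_+ lambda_+ = c_+ and return (weights, prediction, variance).

    C: (n, n) covariances between sites; c0: (n,) covariances to target;
    X: (n, p) trend functions at the sites; x0: (p,) trend functions at target.
    """
    n, p = X.shape
    C_plus = np.zeros((n + p, n + p))
    C_plus[:n, :n] = C
    C_plus[:n, n:] = X
    C_plus[n:, :n] = X.T
    c_plus = np.concatenate([c0, x0])
    lam_plus = np.linalg.solve(C_plus, c_plus)
    lam = lam_plus[:n]                              # drop the multipliers
    y_hat = float(lam @ y)
    var = float(sigma2 - c_plus @ lam_plus)         # sigma^2 - c_+' C_+^{-1} c_+
    return lam, y_hat, var

sigma2 = 1.0

def cov(h):
    """Hypothetical exponential covariance used only for this sketch."""
    return sigma2 * np.exp(-np.abs(h) / 2.0)

sites = np.array([0.0, 1.0, 2.0, 4.0, 7.0])         # assumed sample locations
s0 = 3.0                                            # prediction location
C = cov(sites[:, None] - sites[None, :])
c0 = cov(sites - s0)
X = np.column_stack([np.ones_like(sites), sites])   # trend functions 1 and s
x0 = np.array([1.0, s0])
y = 2.0 + 3.0 * sites                               # data lying exactly on a line

lam, y_hat, var = universal_kriging(C, c0, X, x0, y, sigma2)
```

Because the unbiasedness constraints force the weights to reproduce every trend function at the prediction location, data that lie exactly on the trend are predicted exactly, here 2 + 3(3) = 11.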
Universal Kriging
- May make more sense to estimate the trend explicitly
- Need to estimate the trend to derive residuals for variogram modeling in any case, since it is only safe to estimate a variogram model from y when the mean is assumed constant
- Does not make sense to use it for local neighborhoods
Best to use the generalized least squares approach and remove the trend explicitly
