
Introduction to applied geostatistics

Short version

Overheads

D G Rossiter
Department of Earth Systems Analysis
International Institute for Geo-information Science & Earth Observation (ITC)
<http://www.itc.nl/personal/rossiter>

March 21, 2007



Topic: Resources
There are many resources, at various mathematical levels, some aimed at
particular applications. These lists are not comprehensive but should be good
starting points:

• Texts

• Web pages

• Computer programmes


Texts: Mathematical

• Chilès, J.-P. and Delfiner, P., 1999. Geostatistics: modeling spatial uncertainty.
Wiley series in probability and statistics. John Wiley & Sons, New York.

• Christakos, G., 2000. Modern spatiotemporal geostatistics. Oxford University
Press, New York.

• Cressie, N., 1993. Statistics for spatial data. John Wiley & Sons, New York.

• Ripley, B.D., 1981. Spatial statistics. John Wiley & Sons, New York.


Texts: In the context of a particular application field

• Davis, J.C., 2002. Statistics and data analysis in geology. John Wiley & Sons,
New York.

• Fotheringham, A.S., Brunsdon, C. and Charlton, M., 2000. Quantitative
geography: perspectives on spatial data analysis. Sage Publications, London;
Thousand Oaks, Calif.

• Stein, A., Meer, F.v.d. and Gorte, B.G.F. (Editors), 1999. Spatial statistics for
remote sensing. Kluwer Academic, Dordrecht.

• Kitanidis, P.K., 1997. Introduction to geostatistics: applications to
hydrogeology. Cambridge University Press, Cambridge, England.


Texts: Application-oriented but mathematical

• Webster, R., and Oliver, M. A., 2001. Geostatistics for environmental scientists.
Wiley & Sons, Chichester.

• Goovaerts, P., 1997. Geostatistics for natural resources evaluation. Oxford
University Press, Oxford and New York.

• Isaaks, E.H. and Srivastava, R.M., 1990. An introduction to applied
geostatistics. Oxford University Press, New York.


Texts: Emphasis on computational methods

• Venables, W.N. & Ripley, B.D., 2002. Modern applied statistics with S, 4th
edition. Springer-Verlag, New York.

• Deutsch, C. V., & Journel, A. G., 1992. GSLIB: Geostatistical software library
and user’s guide. Oxford University Press, Oxford.


Web pages

• R: http://www.r-project.org/

• R spatial projects: http://sal.uiuc.edu/csiss/Rgeo/

• gstat: http://www.gstat.org/

• gslib: http://www.gslib.com/

• GEOEAS: http://www.epa.gov/ada/csmos/models/geoeas.html

• ILWIS: http://www.itc.nl/ilwis/

• ArcGIS Geostatistical Analyst:
http://www.esri.com/software/arcgis/arcgisxtensions/geostatistical/

• Geostatistical analysis tutor [Colorado (USA) School of Mines]:
http://uncert.mines.edu/tutor/

Computer programmes

• ILWIS 3.3 (ITC)

• R open-source environment for statistical computing and visualisation;
includes several relevant libraries, including

* gstat, by Pebesma
* spatial, by Ripley
* geoR, by Ribeiro & Diggle
* spdep, by Bivand
* spatstat, by Baddeley & Turner (point pattern analysis)
* sp, underlying spatial data structures (used by others)

• ArcGIS Geostatistical Analyst (ESRI) [requires ArcGIS base]

• PCRaster + gstat (Utrecht) [free]

• GeoEAS, GSLIB, Variowin, VESPER . . .


Topic: Introduction to Spatial Analysis

1. Concepts of space: geographic and feature spaces

2. What is special about spatial data?

3. Key concepts in spatial analysis

4. Measuring spatial correlation


What is “space”?

• A set of n continuous dimensions; dimension i has range $[x_{i,\min}, \ldots, x_{i,\max}]$

• Points are mathematical n-dimensional vectors: $\mathbf{x} = (x_1, x_2, \ldots, x_n)$

• Depending on how we choose the axes, we can speak of both geographic and
feature spaces . . .


Feature space
This “space” is not geographic space, but rather a mathematical space formed by
any set of variables:

• Axes are the range of each variable

• Coordinates are values of variables, possibly transformed or combined

• Not included in the common use of the term “spatial” data or analysis

• But observations may be related in this ‘space’ . . .

• . . . and we often plot variables in this space, e.g. 2-D scatterplots

This is the “space” in which univariate, bivariate, or multivariate analyses are
carried out.


Geographic space

• Axes are 1-d lines

• One-dimensional: coordinates are on a line with respect to some origin (0):
$(x_1) = x$

• Two-dimensional: coordinates are on a grid with respect to some origin (0, 0):
$(x_1, x_2) = (x, y) = (E, N)$

• Three-dimensional: coordinates are grid and elevation from a reference
elevation: $(x_1, x_2, x_3) = (x, y, z) = (E, N, H)$

• Must transform latitude-longitude to grid coordinates in some 2-d projection;
distortions occur over large areas

• Can work directly with geographic coordinates, but not as a grid


What is special about spatial data? (1)

1. The location of a sample is an intrinsic part of its definition.

2. All data sets from a given area are implicitly related by their coordinates →
models of spatial structure

3. Values at sample points cannot be assumed to be independent

4. That is, there may be a spatial structure to the data

• Classical statistics assumes independence, at least within sampling strata

• Major implications for sampling design and statistical inference

5. Data values may be related to their coordinates → spatial trend


Key Concepts

• Spatial dependence: the value of a variable at a point in space is related to its
value at nearby points; knowing the value of these points allows us to predict
(with some degree of certainty) the value at the chosen point

• Spatial structure: the nature of the spatial relation: how far, and in what
directions, is the spatial dependence? How does the dependence vary with
distance and direction between points?

• Support of a sample: the physical dimensions it represents (n.b. may try to
predict to coarser or finer resolutions)


Topic: Exploratory spatial data analysis


Since spatial data were collected at known points in geographic space, we should
visualise them in that space.

• Distribution of sample points

• Postplots (values vs. locations): where are which values?

• Geographic postplots: with images, landuse maps etc. as background: does
there appear to be any explanation for the distribution of values?

• Spatial structure: range, direction, strength . . .

• Is there anisotropy? In what direction(s)?

• Do there seem to be several populations with distinct geographic distribution?


Point distribution
This shows how sample points are distributed in space.

• What was the sampling plan?

• Random or clustered?

• Are some areas over– or under–sampled?


Example: Walker Lake: Distribution of points – All points


The Postplot: distribution of values in space


The so-called postplot shows how the data values are distributed in space.

• Are values of nearby points similar to each other, or do the values appear to
be random?

• Does there appear to be a trend?

• Are there distinct clusters of high or low values?

• Is there any directional difference in clustering? (anisotropy)


Meuse – Distribution of Log(Cadmium) in soils


Geographic postplot
This shows the postplot against a background that may explain the distribution of
samples or values. Examples:

• land cover or land use

• geologic or soil units

• structural geology


Meuse – Log(Cadmium) on a false-colour composite


Topic: Spatial correlation

1. What is spatial auto-correlation?

2. Evidence of spatial correlation

3. Computing spatial correlation and covariance

4. Summarizing and visualising spatial covariance; the empirical variogram

Topics for later units:

1. modelling spatial correlation

2. predicting using the modelled structure


Spatial Correlation

• Question: are nearby points in geographic space also ‘nearby’ in feature
space?

• That is, does knowing the value of some variable at some location give us
information on the value at ‘nearby’ locations?

• The concept of correlation between variables can be applied to correlation
within a variable, using distance to model the relation


Covariance and Correlation


Recall: for two non-spatial variables X and Y:

• Sample covariance:

$$ s_{XY} = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x}) \cdot (y_i - \bar{y}) $$

• Sample correlation coefficient: the covariance normalized by sample
standard deviations; range [−1 . . . 1]:

$$ r_{XY} = \frac{s_{XY}}{s_X \cdot s_Y} = \frac{\sum (x_i - \bar{x}) \cdot (y_i - \bar{y})}{\sqrt{\sum (x_i - \bar{x})^2} \cdot \sqrt{\sum (y_i - \bar{y})^2}} $$
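As a quick check of these two formulas, the following base-R sketch computes them by hand and compares the results with R's built-in cov() and cor(). The two short vectors are made-up illustrative values, not data from these notes.

```r
# Made-up values for two non-spatial variables (illustrative only)
x <- c(2, 4, 6, 8, 10)
y <- c(1, 3, 2, 5, 4)
n <- length(x)

# Sample covariance: sum of cross-products of deviations, divided by n - 1
s_xy <- sum((x - mean(x)) * (y - mean(y))) / (n - 1)

# Sample correlation: the covariance normalised by both standard deviations
r_xy <- s_xy / (sd(x) * sd(y))

# Both hand computations should agree with the built-in functions
all.equal(s_xy, cov(x, y))
all.equal(r_xy, cor(x, y))
```

For these values the covariance is 4 and the correlation is 0.8.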

Can we extend this idea to a single variable, which is then correlated with itself?


Auto-correlation
We want to apply the idea of correlation to one variable (auto-correlation); the
prefix auto- means “self”, here referring to the single variable.

Here, the correlation is controlled by some other dimension:

• time – if the variable is collected as a time series

• space – if the variable is collected at points in space

So we will get a measure of how much the variable is correlated to itself,
considering the other factor (time or space).


Auto-covariance

• The spatial auto-covariance is computed within the same variable, using
pairs of observations.

• Each pair of observations $(x_i, x_j)$ has a covariance, showing how they jointly
differ from the variable’s mean $\bar{x}$:

$$ (x_i - \bar{x})(x_j - \bar{x}) $$

• There are (n · (n − 1))/2 point pairs for which this can be calculated

• This is a large number! For example, with 200 points this is 19,900 point pairs.
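The pair count, and the individual products of deviations, can be checked with a few lines of base R. This is only a sketch: the four observations are made-up values and the helper name n_pairs is hypothetical.

```r
# Number of distinct point pairs among n observations: n(n - 1)/2
n_pairs <- function(n) n * (n - 1) / 2
n_pairs(200)       # the 200-point example above
choose(200, 2)     # the same count via the binomial coefficient

# Made-up observations of a single variable (illustrative only)
x <- c(3.1, 2.7, 4.0, 3.6)
dev <- x - mean(x)

# Matrix of products (x_i - mean)(x_j - mean); keep each pair once (i < j)
prods <- outer(dev, dev)
pair_cov <- prods[upper.tri(prods)]
length(pair_cov)   # equals n_pairs(4)
```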


Modelling the auto-covariance

• By themselves the individual auto-covariances are not useful; they just
quantify the covariance of each point pair.

• We need to summarize the individual covariances as a covariance function of
spatial separation

• Theory: the covariance depends only on the separation between points.

• If we can model this function . . .

• . . . we can then predict the covariance between any two locations in space.


Semivariances
It is easier to model semivariances than covariances:

• Each pair of observation points has a semivariance, usually symbolized by the
Greek letter “gamma”, i.e. γ, defined as:

$$ \gamma(x_i, x_j) = \frac{1}{2} [z(x_i) - z(x_j)]^2 $$

• Each point pair is separated by a known distance, so . . .

• We can plot the semivariances against distance as a variogram “cloud”, with
(n · (n − 1))/2 points in the graph

• Can also summarize in a variogram

• (The ‘semi’ refers to the factor 1/2, because there are two ways to compute for
the same point pair)
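A variogram cloud can be built from the definition with base R alone. The sketch below uses four made-up points (coordinates and values are illustrative assumptions, not the Meuse data) and relies on dist(), which returns each point pair exactly once.

```r
# Made-up sample: coordinates (m) and attribute values z (illustrative only)
pts <- data.frame(x = c(0, 100, 100, 300),
                  y = c(0,   0, 100, 400),
                  z = c(1.0, 1.4, 0.8, 2.0))

# Separation distance of every point pair
h <- as.vector(dist(pts[, c("x", "y")]))

# Semivariance of every pair: half the squared difference in z.
# dist() on the single column z gives |z_i - z_j| in the same pair order.
semivar <- 0.5 * as.vector(dist(pts$z))^2

cloud <- data.frame(dist = h, gamma = semivar)
nrow(cloud)   # n(n - 1)/2 pairs
# plot(cloud$dist, cloud$gamma)   # draws the variogram "cloud"
```

With n = 4 points the cloud has 6 rows; the first pair, for example, is 100 m apart with semivariance 0.5 × (1.4 − 1.0)² = 0.08.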


The gstat package of R


We illustrate the concepts of spatial correlation with the gstat package of the R
environment and the meuse example data set.

The meuse data frame has coördinates in fields x and y; these are used to
promote the object to class SpatialPointsDataFrame.

> # view package information
> library(help=gstat)
> # load the packages; sp provides the spatial classes and the meuse data
> library(sp)
> library(gstat)
> ?meuse
> # load sample data
> data(meuse)
> # as loaded is a data frame
> summary(meuse)
> # promote to class SpatialPointsDataFrame
> coordinates(meuse) <- ~ x+y
> # now has explicit coordinates
> summary(meuse)

Object of class SpatialPointsDataFrame
Coordinates:
     min    max
x 178605 181390
y 329714 333611
Is projected: NA
proj4string : [NA]
Number of points: 155
Data attributes:
    cadmium          copper           lead            zinc           elev
 Min.   : 0.20   Min.   : 14.0   Min.   : 37.0   Min.   : 113   Min.   : 5.18
 1st Qu.: 0.80   1st Qu.: 23.0   1st Qu.: 72.5   1st Qu.: 198   1st Qu.: 7.55
 Median : 2.10   Median : 31.0   Median :123.0   Median : 326   Median : 8.18
 Mean   : 3.25   Mean   : 40.3   Mean   :153.4   Mean   : 470   Mean   : 8.17
 3rd Qu.: 3.85   3rd Qu.: 49.5   3rd Qu.:207.0   3rd Qu.: 674   3rd Qu.: 8.96
 Max.   :18.10   Max.   :128.0   Max.   :654.0   Max.   :1839   Max.   :10.52

      dist              om         ffreq  soil   lime       landuse       dist.m
 Min.   :0.0000   Min.   : 1.00   1:84   1:97   0:111   W      :50   Min.   :  10
 1st Qu.:0.0757   1st Qu.: 5.30   2:48   2:46   1: 44   Ah     :39   1st Qu.:  80
 Median :0.2118   Median : 6.90   3:23   3:12           Am     :22   Median : 270
 Mean   :0.2400   Mean   : 7.48                         Fw     :10   Mean   : 290
 3rd Qu.:0.3641   3rd Qu.: 9.00                         Ab     : 8   3rd Qu.: 450
 Max.   :0.8804   Max.   :17.00                         (Other):25   Max.   :1000
                  NA's   : 2.00                         NA's   : 1


The empirical variogram

• To summarize the variogram cloud, compute average semivariance at various
separations (‘lags’); this is the empirical variogram

$$ \gamma(\mathbf{h}) = \frac{1}{2 m(\mathbf{h})} \sum_{i=1}^{m(\mathbf{h})} [z(x_i) - z(x_j)]^2 $$

• m(h) is the number of point pairs separated by vector h

• In practice, we have to define the set of vectors in each “bin” (to have enough
points); that is, we collect a distance range into one bin.

• (Note: there are other ways to estimate the variogram from the variogram
cloud; in particular so-called robust estimators.)
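The binning step can be sketched in base R. This is not how gstat is implemented internally; it only illustrates the averaging formula on a made-up one-dimensional transect, with bin limits chosen by hand.

```r
# Made-up transect: positions (m) and values (illustrative only)
pos <- c(0, 10, 20, 30, 40)
z   <- c(1, 2, 4, 3, 5)

# Variogram cloud: one separation and one semivariance per point pair
h       <- as.vector(dist(pos))
semivar <- 0.5 * as.vector(dist(z))^2

# Collect separations into bins (0,15], (15,25], (25,35], (35,45]
bins <- cut(h, breaks = c(0, 15, 25, 35, 45))

# Per bin: number of pairs m(h), mean separation, mean semivariance
emp <- data.frame(np    = as.vector(table(bins)),
                  dist  = as.vector(tapply(h, bins, mean)),
                  gamma = as.vector(tapply(semivar, bins, mean)))
emp
```

The mean of the per-pair semivariances in a bin equals the sum of squared differences divided by 2m(h), so the gamma column is the empirical variogram. Note how the number of pairs drops from 4 at the shortest lag to 1 at the longest.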


Example of an experimental variogram


> (v <- variogram(log(cadmium)~1, data=meuse))
    np       dist     gamma
1   57   79.29244 0.6650872
2  299  163.97367 0.8584648
3  419  267.36483 1.0064382
4  457  372.73542 1.1567136
5  547  478.47670 1.3064732
6  533  585.34058 1.5135658
7  574  693.14526 1.6040086
8  564  796.18365 1.7096998
9  589  903.14650 1.7706890
10 543 1011.29177 1.9875659
11 500 1117.86235 1.8259154
12 477 1221.32810 1.8852099
13 452 1329.16407 1.9145967
14 457 1437.25620 1.8505336
15 415 1543.20248 1.8523791

np is the number of point pairs in the bin; dist is the average separation of
these pairs; gamma is the average semivariance in the bin.


Plotting the experimental variogram


This can be plotted as semivariance gamma against average separation dist,
along with the number of points that contributed to each estimate np:

> plot(v, plot.numbers=T)

(Note: gstat defaults to 15 equally-spaced bins and a maximum distance (cutoff) of
one-third of the diagonal of the box spanning the data. These can be over-ridden
with the width= and cutoff= arguments, respectively; or explicit bin limits can be
set with the boundaries= argument.)


Default variogram of Log(Cd)

[Figure: empirical variogram of log(Cd); semivariance (0.0 to 2.0) plotted against
distance (0 to 1500), each of the 15 points labelled with its number of point pairs.]


Features of the experimental variogram


Later we will look at fitting a theoretical model to the experimental variogram;
but even without a model we can notice some features, which we define here only
qualitatively:

• Sill: maximum semi-variance

* represents variability in the absence of spatial dependence

• Range: separation between point-pairs at which the sill is reached

* distance at which there is no evidence of spatial dependence

• Nugget: semi-variance as the separation approaches zero

* represents variability at a point that can’t be explained by spatial structure

In the previous slide, we can estimate the sill ≈ 1.9, the range ≈ 1200 m, and the
nugget ≈ 0.5 i.e. ≈ 25% of the sill.


Defining the bins (1)

• Distance interval, specifying the centres. E.g. (0, 100, 200, . . .) means intervals
of [0 . . . 50], [50 . . . 150], . . .

• All point pairs whose separation is in the interval are used to estimate γ(h) for
h as the interval centre

• Narrow intervals: more resolution but fewer point pairs for each estimate

> v <- variogram(log(cadmium)~1, meuse, boundaries = seq(50, 2050, by = 100))
> plot(v, plot.numbers=TRUE)
> par(mfrow = c(2, 3)) # show all six plots together
> for (bw in seq(20, 220, by = 40)) {
+   v <- variogram(log(cadmium)~1, meuse, width=bw)
+   plot(v$dist, v$gamma, xlab=paste("bin width", bw))
+ }


Variograms of Log(Cd) with different bin widths

[Figure: six panels of v$gamma plotted against distance, one each for bin widths
20, 60, 100, 140, 180 and 220.]


Defining the bins (2)

• Each bin should have > 100 point pairs; > 300 is much more reliable

> v <- variogram(log(cadmium)~1, meuse, width=20)
> plot(v, plot.numbers=T)
> v$np
 [1]   6  19  27  27  51  65  58  62  62  82  76  75  86  81  76
[16]  91  92  90  88  92 112 103  80 116 108 106  79  94 117  99
[31] 100 101 108 117 110 117 114 107  96 110 109 106 114 117 104
[46]  98  94 117  92 110 105  91  89  98  89  91 103 102  93  92
[61]  73  85  88  91  88  84  75  81  90  73  93  95  76  85  67
[76]  77  88  60
> v <- variogram(log(cadmium)~1, meuse, width=120)
> v$np
 [1]  79 380 485 577 583 642 654 648 609 572 522 491 493 148
> plot(v, plot.numbers=T)


Topic: Modelling the variogram


From the empirical variogram we now derive a variogram model which
expresses semivariance as a function of separation vector.

The model allows us to:

• Infer the characteristics of the underlying process from the functional form
and its parameters;

• Compute the semi-variance between any point-pair, separated by any vector . . .

• . . . which is used in an ‘optimal’ interpolator (“kriging”) to predict at
unsampled locations.


A variogram model, with parameters


Authorized variogram models

• Only some functional forms can be used to model the variogram (theoretical
and mathematical constraints)

• The permitted forms are called authorized models

• Simplest: The exponential model; sill c, effective range 3a

$$ \gamma(h) = c \left\{ 1 - e^{-h/a} \right\} $$

E.g. if the effective range is estimated as 120, the parameter a is 40.

• Another common model: The Spherical model; sill c, range a

$$ \gamma(h) = \begin{cases} c \left[ \dfrac{3h}{2a} - \dfrac{1}{2} \left( \dfrac{h}{a} \right)^{3} \right] & : h < a \\ c & : h \ge a \end{cases} $$
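The two model formulas translate directly into R functions. The function names exp_model and sph_model are hypothetical, chosen for this sketch; they are not part of gstat, which represents models with vgm().

```r
# Exponential model: sill c, distance parameter a (effective range 3a)
exp_model <- function(h, c, a) c * (1 - exp(-h / a))

# Spherical model: sill c, range a
sph_model <- function(h, c, a) {
  ifelse(h < a, c * (1.5 * (h / a) - 0.5 * (h / a)^3), c)
}

# Effective range 120 implies a = 40; at h = 3a the exponential model
# has reached 1 - exp(-3), i.e. about 95% of the sill
exp_model(120, c = 1, a = 40)

# The spherical model reaches its sill exactly at h = a and stays there
sph_model(c(0, 60, 120, 200), c = 1, a = 120)
```

This shows why the exponential model needs an "effective" range: it approaches the sill only asymptotically, whereas the spherical model attains it at a finite distance.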


Graphs of authorized variogram models


[Figure: six panels, one per authorized model: Linear-with-sill, Circular, Spherical,
Pentaspherical, Exponential, Gaussian. Each plots semivariance (0.0 to 1.0) against
separation distance (0 to 10), with nugget, sill and range marked.]

Comparison of variogram models

[Figure: the Exponential, Gaussian, Circular, Spherical, Pentaspherical and
Linear-with-sill models drawn on common axes, semivariance (0.0 to 1.0) against
separation distance (0 to 8), with nugget, sill and range marked.]

Models vary considerably, from origin to range


Comparison of models available in gstat


> show.vgms()


Choosing a model (1)


The empirical variogram should be one realization of a random process. So,
what do we expect from the process that is supposed to be responsible for the
spatial structure represented in the variogram?

• Exponential: First-order autoregressive process: values are random but with
dependency on the nearest neighbour; boundaries according to a Poisson
process

• Gaussian: as exponential, but with strong close-range dependency, very
smooth at each point.


Choosing a model (1) – continued

• Spherical, circular, pentaspherical: Patches of similar values; patches have
similar size (≈ range) with transition zones (overlap of processes). These differ
mainly in the “shoulder” transition to the sill


Choosing a model (2)

• Which has been successfully applied with this kind of data?
(This is evidence for the nature of this kind of process)

• What do we expect from the supposed process, if we have some other
evidence of its spatial behaviour?
For example, a Gaussian model might be expected for a phenomenon which
physically must be very continuous, e.g. the surface of a ground-water table.

• Visual estimate of functional form from the variogram

• (Fit various models, pick the statistically-best fit)


Fitting the model


Once a model form is selected, then the model parameters must be adjusted for
a ‘best’ fit of the experimental variogram.

• By eye, adjusting parameters for good-looking fit

* Hard to judge the relative value of each point
* This is all that’s possible in ILWIS

• Automatically, looking for the best fit according to some objective criterion

* Various criteria possible in gstat

• In both cases, favour sections of the variogram with more pairs and at shorter
ranges (because kriging is a local interpolator).

• Mixed: adjust by eye, evaluate statistically; or vice versa
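To make "best fit according to some objective criterion" concrete, here is a base-R sketch of a weighted least-squares fit of a spherical-plus-nugget model to the empirical variogram of log(Cd) tabulated earlier in these notes. The weights (number of pairs divided by squared distance) are an assumed choice that favours well-populated, short-range bins; the names sph_nug and wss are hypothetical, and this is not a reproduction of gstat's own fitting routine.

```r
# Empirical variogram of log(Cd), copied from the table earlier in these notes
np  <- c(57, 299, 419, 457, 547, 533, 574, 564, 589, 543, 500, 477, 452, 457, 415)
h   <- c(79.29244, 163.97367, 267.36483, 372.73542, 478.47670, 585.34058,
         693.14526, 796.18365, 903.14650, 1011.29177, 1117.86235, 1221.32810,
         1329.16407, 1437.25620, 1543.20248)
gam <- c(0.6650872, 0.8584648, 1.0064382, 1.1567136, 1.3064732, 1.5135658,
         1.6040086, 1.7096998, 1.7706890, 1.9875659, 1.8259154, 1.8852099,
         1.9145967, 1.8505336, 1.8523791)

# Spherical model plus nugget: p = (nugget c0, partial sill c1, range a)
sph_nug <- function(h, p) {
  p[1] + ifelse(h < p[3], p[2] * (1.5 * h / p[3] - 0.5 * (h / p[3])^3), p[2])
}

# Weighted sum of squared residuals; weights np / h^2 are an assumption
wss <- function(p) sum(np / h^2 * (gam - sph_nug(h, p))^2)

# Starting values read off the plot: nugget 0.5, partial sill 1.4, range 1200
start <- c(0.5, 1.4, 1200)
fit   <- optim(start, wss, control = list(parscale = c(1, 1, 1000)))

fit$par                                     # adjusted nugget, partial sill, range
c(start = wss(start), fitted = fit$value)   # criterion before and after
```

The parscale control only tells optim() that the range is roughly a thousand times larger than the other two parameters, which helps the default Nelder-Mead search take sensible steps.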


Fitting a variogram model in gstat


We’ve decided on a spherical + nugget model:

> # Calculate the experimental variogram and display it
> v1 <- variogram(log(cadmium)~1, meuse); plot(v1, plot.numbers=T)
> # Fit by eye, display fit
> m1 <- vgm(1.4, "Sph", 1200, 0.5); plot(v1, plot.numbers=T, model=m1)
> # Let gstat adjust the parameters, display fit
> m2 <- fit.variogram(v1, m1); m2
  model   psill    range
1   Nug 0.54785    0.000
2   Sph 1.33980 1149.4
> plot(v1, plot.numbers=T, model=m2)
> # Fix the nugget, fit only the sill of spherical model
> m2a <- fit.variogram(v1,m1,fit.sills=c(F,T),fit.range=F); m2a
  model  psill range
1   Nug 0.5000     0
2   Sph 1.4651  1200

In this case, the eyeball did a pretty good job . . .


[Figure: the empirical variogram of log(Cd) (semivariance against distance, points
labelled with their number of point pairs) in two side-by-side panels, one for each
of the two fitted models summarised below.]

By eye: c0 = 0.5, c1 = 1.4, a = 1200; total sill c0 + c1 = 1.9

Automatic: c0 = 0.548, c1 = 1.340, a = 1149; total sill c0 + c1 = 1.888

The total sill was almost unchanged; gstat raised the nugget and lowered the
partial sill of the spherical model a bit; the range was shortened by 51 m.


What sample size to fit a variogram model?

• Can’t use non-spatial formulas for sample size, because spatial samples are
correlated, and each sample is used multiple times in the variogram estimate

• Stochastic simulation from an assumed random field with a known variogram
suggests:

1. < 50 points: not at all reliable
2. 100 to 150 points: more or less acceptable
3. > 250 points: almost certainly reliable

• More points are needed to estimate an anisotropic variogram.

This is very worrying for many environmental datasets (soil cores, vegetation
plots, . . . ) especially from short-term fieldwork, where sample sizes of 40 – 60
are typical. Should variograms even be attempted on such small samples?


Topic: Approaches to spatial prediction


This is the prediction of the value of some variable at an unsampled point,
based on the values at the sampled points.

This is often called interpolation, but strictly speaking:

• Interpolation: prediction is only for points that are geographically inside the
(convex hull of the) sample set;

• Extrapolation: prediction outside this geographic area

(Note: same usage as in feature-space predictions)


A taxonomy of spatial prediction methods

Strata: divide the area to be mapped into ‘homogeneous’ strata; predict within each
stratum from all samples in that stratum

Global predictors: use all samples to predict at all points; also called regional
predictors

Local predictors: use only ‘nearby’ samples to predict at each point

Mixed predictors: some of the structure is explained by strata or globally, some
locally


Which approach is “best”?

• No theoretical answer

• Depends on how well the approach models the ‘true’ spatial structure, and
this is unknown (but we may have prior evidence)

• Should correspond with what we know about the process that created the
spatial structure


Polynomial trend surfaces

• A global predictor which models a regional trend

• The value of a variable at each point depends only on its coördinates and
parameters of a fitted surface

• This is modelled with a smooth function of position, z = f (x, y) = f (E, N) for
grid coördinates; this is called the trend surface

• Simple form (plane, 1st order):

z = β0 + βx E + βy N

• Higher-order surfaces may also be fitted (beware of fitting the noise!)


Fitting trend surfaces

• The trend surface is fitted by linear regression, with the coördinates as the
predictor variables and the variable of interest as the response, using data from
all sample points.

• All samples participate equally in the prediction

• We can measure the goodness of fit of the trend surface to the sample by the
residual sum of squares

• The same cautions as in feature-space regression analysis!

• Ordinary Least Squares (OLS) is often used but is not really correct, since it
ignores possible correlation among closely-spaced samples; better is
Generalised Least Squares (GLS)
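A trend surface can be fitted with R's lm(). The sketch below uses OLS for simplicity, so the caution above about correlated samples applies; the nine sample points and their values are made up for illustration (a plane plus small fixed disturbances).

```r
# Made-up samples on a 3 x 3 grid (illustrative only)
samp <- expand.grid(E = c(0, 50, 100), N = c(0, 50, 100))
samp$z <- 2 + 0.01 * samp$E - 0.02 * samp$N +
  c(0.1, -0.2, 0.1, 0.0, 0.1, -0.1, 0.2, -0.1, -0.1)

# First-order trend surface (plane): z = b0 + bE * E + bN * N
ts1 <- lm(z ~ E + N, data = samp)
coef(ts1)

# Second-order surface adds squared and cross-product terms
ts2 <- lm(z ~ E + N + I(E^2) + I(N^2) + I(E * N), data = samp)

# Goodness of fit by residual sum of squares; the higher-order surface
# always fits the sample at least as well, which is not evidence it is better
c(first = deviance(ts1), second = deviance(ts2))

# Predict the first-order surface at an unsampled location
predict(ts1, newdata = data.frame(E = 20, N = 80))
```

Because the second-order model contains the first-order one, its residual sum of squares cannot be larger on the same data; this is the sense in which higher orders risk fitting the noise.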


Predictions of 1st and 2nd order Trend Surfaces in the study area
[Figure: maps of the 1st and 2nd order trend surfaces (panels TS1 and TS2) over the
study area; x from 178500 to 181500, y from 330000 to 333000.]


Predictions of 1st and 2nd order Trend Surfaces in the bounding box
[Figure: maps of the 1st and 2nd order trend surfaces (panels TS1 and TS2) over the
bounding box; x from 178500 to 181500, y from 330000 to 333000.]


Approaches to prediction: Local predictors

• No strata

• No regional trend

• Value of the variable is predicted from “nearby” samples

* Example: concentrations of soil constituents (e.g. salts, pollutants)
* Example: vegetation density


Local Predictors
Each interpolator has its own assumptions, i.e. theory of spatial variability

• Nearest neighbour (Thiessen polygons)

• Average within a radius

• Average of the n nearest neighbours

• Distance-weighted average within a radius

• Distance-weighted average of n nearest neighbours

• ...

• “Optimal” weighting ⇒ Kriging


Local predictor: Nearest neighbour (Thiessen polygons)

• Predict each point from its single nearest sample point

• Conceptually-simple, makes the minimal assumptions about spatial structure

• No way to estimate prediction variances, ignores other ‘nearby’ information

• Maps show abrupt discontinuities at boundaries, so don’t look very realistic

• But may be a more accurate predictor than poorly-modelled predictors
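The nearest-neighbour rule is short enough to write out in base R. The function name nn_predict and the four samples are hypothetical, made up for this sketch.

```r
# Made-up samples: coordinates and values (illustrative only)
samp <- data.frame(x = c(0, 10, 0, 10),
                   y = c(0, 0, 10, 10),
                   z = c(1.0, 2.0, 3.0, 4.0))

# Predict at (x0, y0) from the single nearest sample point
nn_predict <- function(x0, y0, samp) {
  d <- sqrt((samp$x - x0)^2 + (samp$y - y0)^2)
  samp$z[which.min(d)]
}

nn_predict(2, 3, samp)   # nearest sample is (0, 0)
nn_predict(8, 9, samp)   # nearest sample is (10, 10)
```

Every location inside one Thiessen polygon receives the same value, which is why maps made this way change abruptly at polygon boundaries.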


Local predictor: Average within a radius

• Use the set of all neighbouring sample points within some radius r

• Predict by averaging:

$$ \hat{x}_0 = \frac{1}{n} \sum_{i=1}^{n} x_i , \quad d(x_0, x_i) \le r $$

• Although we can calculate prediction variances from the neighbours, these
assume no spatial structure closer than the radius

• Problem: How do we select a radius?
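A base-R sketch of the radius average follows; the function name radius_predict and the sample values are hypothetical. Returning NA when no sample lies within the radius is a choice made for this sketch, and it shows one side of the radius problem: too small a radius leaves locations unpredicted.

```r
# Made-up samples: coordinates and values (illustrative only)
samp <- data.frame(x = c(0, 10, 0, 10),
                   y = c(0, 0, 10, 10),
                   z = c(1.0, 2.0, 3.0, 4.0))

# Predict at (x0, y0) as the mean of all samples within radius r
radius_predict <- function(x0, y0, samp, r) {
  d <- sqrt((samp$x - x0)^2 + (samp$y - y0)^2)
  near <- d <= r
  if (!any(near)) return(NA_real_)   # no sample within the radius
  mean(samp$z[near])
}

radius_predict(5, 0, samp, r = 6)   # uses the samples at (0, 0) and (10, 0)
radius_predict(5, 5, samp, r = 8)   # all four samples are within 8 units
radius_predict(5, 5, samp, r = 1)   # no neighbours, so NA
```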


Local predictors: Distance-weighted average

• Inverse of distance to some set of n nearest-neighbours:

$$ \hat{x}_0 = \sum_{i=1}^{n} \frac{x_i}{d(x_0, x_i)} \Bigg/ \sum_{i=1}^{n} \frac{1}{d(x_0, x_i)} $$

• Inverse of distance to some set of n nearest-neighbours, to some power k

$$ \hat{x}_0 = \sum_{i=1}^{n} \frac{x_i}{d(x_0, x_i)^k} \Bigg/ \sum_{i=1}^{n} \frac{1}{d(x_0, x_i)^k} $$

• Implicit theory of spatial structure (a power model), but this is not testable

• Can select all points within some limiting distance (radius), or some fixed
number of nearest points, or . . .

• How to select radius or number and power?
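The distance-weighted average can be sketched in base R as below. The function name idw_predict and the three samples are hypothetical; the rule of returning the sample value when the prediction location coincides with a sample point is an assumption added here to avoid division by zero.

```r
# Made-up samples: coordinates and values (illustrative only)
samp <- data.frame(x = c(0, 30, 0),
                   y = c(0, 0, 40),
                   z = c(1, 4, 10))

# Inverse-distance-weighted prediction from the n nearest samples, power k
idw_predict <- function(x0, y0, samp, k = 1, n = nrow(samp)) {
  d <- sqrt((samp$x - x0)^2 + (samp$y - y0)^2)
  if (any(d == 0)) return(samp$z[which(d == 0)[1]])   # exactly at a sample
  nearest <- order(d)[seq_len(min(n, length(d)))]     # the n nearest samples
  w <- 1 / d[nearest]^k                               # inverse-distance weights
  sum(w * samp$z[nearest]) / sum(w)
}

# Two nearest samples are 10 and 20 units away, with values 1 and 4
idw_predict(10, 0, samp, k = 1, n = 2)
idw_predict(10, 0, samp, k = 2, n = 2)   # higher power: closer to the nearest value
```

With k = 1 the prediction is 2.0; with k = 2 it moves to 1.6, closer to the value of the nearest sample. The choice of k and n changes the map, but nothing in the method says which choice is right.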



Inverse distance in gstat


The idw method is used. There is no model of spatial variability, so there is no
way to estimate a prediction variance.

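> # lattice provides levelplot(); meuse.grid holds the prediction locations
> library(lattice)
> data(meuse.grid); coordinates(meuse.grid) <- ~ x+y
> kid <- idw(log(cadmium) ~ 1, meuse, meuse.grid)
[inverse distance weighted interpolation]
> levelplot(var1.pred ~ x+y, as.data.frame(kid), aspect="iso")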

The weights are computed only from the inverse distance; they do not account for
spatial structure nor for the relative positions of the sample points.

Compare inverse distance (linear) to Ordinary Kriging with a spherical model
(range = 1150 m): OK gives a much smoother map.


[Figure: maps of predicted log(Cd) with sample points overlaid; x from 178500 to
181500, y from 330000 to 333000. Top row: Inverse distance. Bottom row: Ordinary
kriging.]


Approaches to prediction: Mixed predictors

• For situations where there is both long-range structure (trend) or strata and
local structure

* Example: Particle size in the soil: strata (rock type), trend (distance from a
river), and local variation in depositional or weathering processes

• One approach: model strata or global trend, subtract from each value, then
model residuals → Regression Kriging.

• Another approach: model everything together → Universal Kriging or Kriging
with External Drift


Topic: Ordinary Kriging


The theory of regionalised variables leads to an “optimal” interpolation method, in
the sense that the prediction variance is minimized.

This is based on the theory of random functions, and requires certain
assumptions.


Kriging

• A “Best Linear Unbiased Predictor” (BLUP) that satisfies a certain optimality
criterion (so it’s “best” with respect to the criterion)

• It is only “optimal” with respect to the chosen model and the chosen
optimality criterion

• Based on the theory of random processes, with covariances depending only
on separation (i.e. a variogram model)

• Theory developed several times (Kolmogorov 1930s, Wiener 1949) but current
practice dates back to Matheron (1963), formalizing the practical work of the
mining engineer D G Krige (RSA).

* Should really be written as “krigeing” (Fr. krigeage) but it’s too late for that.


What is so special about kriging?

• Predicts at any point as the weighted average of the values at sampled points

* as for inverse distance (to a power)

• Weights given to each sample point are optimal, given the spatial covariance
structure as revealed by the variogram model (in this sense it is “best”)

• So, the prediction is only as good as the model of spatial structure!

• The prediction error at each point is automatically generated as part of the
process of computing the weights.


How do we use Kriging?

1. Sample, preferably at different resolutions

2. Calculate the experimental variogram

3. Model the variogram with one or more authorized functions

• N.b. the variogram model may already be known from other studies or
theoretical considerations

4. Apply the kriging system of equations, with the variogram model of spatial
dependence, at each point to be predicted

• Predictions are often at each point on a regular grid (e.g. a raster map)

5. Calculate the variance of each prediction; this is based only on the sample
point locations, not their data values.
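
Steps 4 and 5 can be illustrated for a single prediction point by solving the ordinary kriging system directly in base R. This is a sketch under stated assumptions, not gstat's implementation: the function names sph and ok_predict and the five sample points are hypothetical, and the variogram parameters are simply the starting values used in this course's vgm call, not a fitted model.

```r
# Spherical variogram model with nugget, partial sill and range; gamma(0) = 0.
sph <- function(h, nugget, psill, range) {
  g <- ifelse(h < range,
              nugget + psill * (1.5 * h / range - 0.5 * (h / range)^3),
              nugget + psill)
  g[h == 0] <- 0
  g
}

# Ordinary kriging at one point x0: solve for the weights lambda and the
# Lagrange multiplier psi, then form the prediction and its variance.
ok_predict <- function(x0, pts, vals, gamma) {
  n  <- nrow(pts)
  D  <- as.matrix(dist(pts))                        # sample-to-sample distances
  d0 <- sqrt((pts[, 1] - x0[1])^2 + (pts[, 2] - x0[2])^2)
  A  <- rbind(cbind(gamma(D), 1), c(rep(1, n), 0))  # semivariances plus unbiasedness row
  b  <- c(gamma(d0), 1)
  sol <- solve(A, b)
  lambda <- unname(sol[1:n])
  psi    <- unname(sol[n + 1])
  list(pred = sum(lambda * vals),
       var  = sum(lambda * gamma(d0)) + psi,
       weights = lambda)
}

# Toy data (hypothetical coordinates in metres and values)
pts  <- cbind(x = c(0, 400, 0, 400, 200), y = c(0, 0, 400, 400, 900))
vals <- c(1.2, 0.4, 2.1, 1.7, 0.9)
gam  <- function(h) sph(h, nugget = 0.5, psill = 1.4, range = 1200)

p1 <- ok_predict(c(200, 200), pts, vals, gam)
p2 <- ok_predict(c(200, 200), pts, vals * 10, gam)  # same locations, different data
p0 <- ok_predict(pts[1, ], pts, vals, gam)          # prediction at a sample point
```

In this sketch the weights sum to 1, p1$var and p2$var agree because the variance depends only on the point configuration and the model, and p0 reproduces the first sample value with zero variance.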


OK in gstat
The krige method is used with a variogram model:

# compute experimental variogram
v <- variogram(log(cadmium) ~ 1, meuse)
# estimated model: partial sill, model type, range, nugget
m <- vgm(1.4, "Sph", 1200, 0.5)
# fitted model
m.f <- fit.variogram(v, m)
data(meuse.grid); coordinates(meuse.grid) <- ~ x + y # interpolation grid
kr <- krige(log(cadmium) ~ 1, locations=meuse, newdata=meuse.grid, model=m.f)
# prints: [using ordinary kriging]
# visualize interpolation; note aspect option to get correct geometry
levelplot(var1.pred ~ x+y, as.data.frame(kr), aspect="iso")
# visualize prediction variance
levelplot(var1.var ~ x+y, as.data.frame(kr), aspect="iso")

Note the model specification (model=m.f); this gives the assumed covariance
structure with which to compute the optimal weights.


Ordinary kriging (OK) results for Meuse log(Cd)


[Figure: maps of the OK predictions of log(Cd); axes x, y]


Kriging prediction errors for Meuse log(Cd)


[Figure: maps of the kriging prediction errors for log(Cd); axes x, y]


How realistic are maps made by Ordinary Kriging?

• The resulting surface is smooth and shows no noise, even if there is a
nugget effect in the variogram model

• So the field is the best at each point taken separately, but taken as a whole is
not a realistic map

• The sample points are predicted exactly; they are assumed to be without
error, again even if there is a nugget effect in the variogram model


Non-parametric geostatistics
A non-parametric statistic is one that does not assume any underlying data
distribution.

For example:

• a mean is an estimate of a parameter of location of some assumed
distribution (e.g. mid-point of normal, expected proportion of success in a
binomial, . . . )

• a median is simply the value at which half the samples are smaller and half
larger, without knowing anything about the distribution underlying the process
which produced the sample.

In geostatistics, “non-parametric” refers to methods that make no assumptions
about the distribution of the data values, only about spatial structure.


Non-parametric geostatistics: Motivation (1)


There is some positive motivation . . .

• In some applications, we may be most interested in finding areas with values
above a certain threshold (e.g. polluted areas), and not really care if we get
accurate predictions in other areas (as long as we are sure they are below the
threshold)

• So the form of the distribution is not important, just whether a value is above
or below some threshold.

• In these applications, we often want a probability that an interpolated point
exceeds the threshold; this is directly useful for probabilistic decision-making

* e.g. whether or not to clean up a polluted site


Non-parametric geostatistics: Motivation (2)


. . . and there is also some negative motivation:

• The outlier problem: a dataset may contain a few very high values

• These can make the area mean arbitrarily high (n.b. not the median)

• These contribute a disproportionate amount to the total variance as well

• These can make the experimental semivariogram unreliable for “typical” values
and useless for unusual values:

* the point-pairs where the outliers are included will have very high
semivariances
* these contribute disproportionately to the average semivariance in a bin . . .
* . . . so that the variogram is very difficult to model


• E.g. a random sample of 15 N(10, 1) variates with one outlier at 100 (i.e. 10x
the expected value) replacing the last value:

1. Without outlier: $\bar{x} = 9.95$, $s_x^2 = 0.57$

2. With outlier: $\bar{x} = 15.98$, $s_x^2 = 540.8$

• The one point with value 100 accounts for $(100 - \bar{x})^2/14 \approx 504$ of the
variance (which uses the divisor n − 1 = 14), i.e. 504/540.8 ≈ 93% of it

• So, it will make semi-variances of point pairs involving this point much
higher than others; these can be seen in the variogram cloud

• Note: the median is only slightly affected: if the outlier replaces a value
at or below the median, the next-highest value is now the median; if it
replaces a value above the median, the median does not change
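
The effect can be checked with a simulated sample in base R. This is a sketch only: the seed is arbitrary, so the numbers will not match the sample quoted above, but the pattern should be the same.

```r
set.seed(1)                        # arbitrary seed; not the sample quoted above
x <- rnorm(15, mean = 10, sd = 1)  # 15 N(10, 1) variates
x_out <- x
x_out[15] <- 100                   # replace the last value with the outlier

n <- length(x_out)
# Share of the total sum of squared deviations contributed by the outlier
share <- (x_out[n] - mean(x_out))^2 / ((n - 1) * var(x_out))

stats_clean   <- c(mean = mean(x),     var = var(x),     median = median(x))
stats_outlier <- c(mean = mean(x_out), var = var(x_out), median = median(x_out))
```

Printing stats_clean, stats_outlier and share shows the mean and variance jumping while the median moves little or not at all.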


“Solutions” to the outlier problem

1. Ignore (assume that they represent a different population and remove from the
dataset before further analysis)

• → under-estimation, can’t find “hot spots”

2. Set to some arbitrary maximum, nearer the bulk of the population; same
problem

3. Transform the variable to logarithms for modelling; transform back for the
final maps and estimates

• Good solution if the whole distribution is lognormal


• Not optimal if the aim is just to bring some outliers closer (i.e. the rest of
the distribution is not lognormal)

4. → Transform to indicator variables, interpolate by Indicator Kriging (IK)


Lognormal Kriging

1. Transform the data to their (natural) logarithms: $y(\vec{x}_i) = \log z(\vec{x}_i)$; this
should be approximately normally distributed

2. Model and interpolate with the transformed variable (OK, block kriging, UK,
KED, trend surfaces . . . )

3. Optional: Back-transform to original units of measure

Back-transformation is not required if we don’t care about the original variable,
e.g. if the logarithm itself is a useful index.

(Back-transformation of prediction variances is only possible for SK.)


Indicator kriging
This is a simple non-parametric (also called distribution-free) method
of interpolation.

It is used primarily to estimate the probability of exceeding some pre-defined
threshold value.

It can also be used to estimate an entire cumulative probability distribution
(CDF).

Note that there are other non-parametric methods, e.g. disjunctive kriging;
these have a much more difficult theory.


Distribution-free estimates
So far we have assumed an approximately normal or lognormal distribution of the
target spatially-correlated random variable. But this may be demonstrably not
true.

A non-parametric approach does not attempt to fit a distribution to the data, but
rather works directly with the experimental CDF, by dividing it into sample
quantiles.

To work with these, we introduce the idea of indicator variables.


Indicator variables

• Binary variables: Take one of the values {1, 0} depending on whether the point
is ‘in’ or ‘out’ of the set; i.e. if it does or does not meet some criterion

* These are suitable for binary nominal variables, e.g. {“urban”, “not urban”};
{“land use changed”, “land use did not change”}

• A continuous variable can be converted to an indicator $z_t$ by a threshold or
cut-off value $x_t$: $z_t = 1 \iff x \le x_t$

* e.g. $x_t = 350$ to cut off at 350 mg kg⁻¹

* Formally: $I(\vec{x}_i, z_t) = 1$ iff $Z(\vec{x}_i) \le z_t$; 0 otherwise
* By convention 1 indicates values below the threshold (to model the CDF);
inverting reverses the sense


Setting up indicators in gstat


> mind <- as.data.frame(meuse)[c("x","y","cadmium")]; str(mind)
‘data.frame’: 155 obs. of 3 variables:
$ x : num 181072 181025 181165 181298 181307 ...
$ y : num 333611 333558 333537 333484 333330 ...
$ cadmium: num 11.7 8.6 6.5 2.6 2.8 3 3.2 2.8 2.4 1.6 ...
> attach(mind)
> quantile(cadmium, seq(0,1,.1))
0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
0.20 0.20 0.64 1.20 1.56 2.10 2.64 3.10 5.64 8.26 18.10
> for (q in seq(.1,.9,.1)) mind <-
+ cbind(mind, as.numeric(cadmium<=quantile(cadmium,q)))
> names(mind)[4:12] <- paste("q",seq(1:9),sep="")
> mind$q5[1:30]
[1] 0 0 0 0 0 0 0 0 0 1 1 1 0 0 1 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1

So field q5 of data frame mind contains a 1 if the corresponding value of field
cadmium is ≤ 2.10, the fifth decile (i.e. the median).


Indicator map

• Every sample point is either 1 (‘in’) or 0 (‘out’); a binary map

• No measure of ‘how far’ in or out

• Prepare a series of indicator maps, with increasing thresholds, to visualise
the cumulative sample distribution

• A common strategy is to divide the range of the sample values into quartiles
or deciles and prepare an indicator for each

• The proportion of 1’s will increase with increasing quantile.


The Indicator variogram

• Compute as for a parametric variogram; every sample point has either value 1
(below the cutoff, in the set) or 0.

• The semivariance of each point pair is either 0 (both above or below; both out
or in) or 0.5 (one above, one below; one out, one in).

• For a quantized continuous variable, each indicator variable (quantile) might
well have different spatial structure

• Variograms near the two ends of the CDF have few 1’s or 0’s (depending on the
end), so few point-pairs will have semivariance 0.5 → hard to model (fluctuates)

• Model as for parametric variogram; however the total sill must be < 0.5
(generally it’s a lot lower)
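
The 0 / 0.5 semivariances, and why the sill is usually far below 0.5, can be seen in a small base-R sketch. It uses a hypothetical simulated sample and pools all point pairs regardless of separation, so it is not a variogram, only the overall level around which an indicator variogram would be expected to fluctuate.

```r
set.seed(42)                             # arbitrary seed
x <- rlnorm(50)                          # hypothetical skewed sample, no coordinates
z <- as.numeric(x <= quantile(x, 0.9))   # indicator at the 9th decile

pairs <- combn(length(z), 2)             # all point pairs
g <- 0.5 * (z[pairs[1, ]] - z[pairs[2, ]])^2

g_values   <- sort(unique(g))            # only 0 and 0.5 occur
mean_gamma <- mean(g)                    # equals the sample variance of the indicator
p <- mean(z)                             # proportion of 1's
indicator_variance <- p * (1 - p)        # at most 0.25, reached at p = 0.5
```

Here mean_gamma is close to p(1 − p), about 0.09 for the 9th decile, which is of the same order as the total sill fitted for q9 in the gstat example in this course.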


Probability kriging using indicator variables

1. Calculate the indicator at the required threshold

2. Calculate the empirical variogram for that indicator (not the median)
• (May have to use a threshold closer to the median if there are too few 1’s so
that the variogram is erratic)

3. Model the variogram

4. Solve the kriging system at each point to be predicted, using Simple Kriging
(SK) with the quantile proportion as the expected value (e.g., in the 6th
decile, 0.6 of the values are expected to be 1’s)
• Note! this is only true if the original sampling scheme was unbiased! If not,
also estimate the mean (use OK).

5. If necessary, limit the results to the range [0 . . . 1]

6. This may be interpreted as the probability that the point does not exceed the
threshold
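
Steps 5 and 6 amount to simple post-processing of the kriged indicator. A base-R sketch, with hypothetical prediction values standing in for the var1.pred column:

```r
# Hypothetical kriged indicator values; kriging can overshoot [0, 1]
p_below <- c(-0.03, 0.12, 0.87, 1.04)

# Step 5: limit the results to the range [0, 1]
p_below <- pmin(pmax(p_below, 0), 1)

# Step 6: probability of not exceeding the threshold, and its complement
p_exceed <- 1 - p_below
```

Whether clipping is an acceptable fix for out-of-range values is a judgment call; it only repairs the symptom of a poorly behaved variogram model.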

Indicator kriging in gstat


> # convert to spatial object
> coordinates(mind) <- ~ x +y
> #compute the variogram for the 90th percentile
> vq9 <- variogram(q9~1, mind)
> plot(vq9, plot.numbers=T)
> mq9 <- vgm(0.05,"Sph",500,0.04)
> plot(vq9, plot.numbers=T, model=mq9)
> mq9f <- fit.variogram(vq9, mq9)
> plot(vq9, plot.numbers=T, model=mq9f)
> # erratic around sill, leads to very short range variogram
> mq9f
model psill range
1 Nug -0.006447182 0.0000
2 Sph 0.090135765 167.9867
> # krige this quantile; note expected proportion of 1’s is known
> k9 <- krige(q9~1, mind, meuse.grid, beta=0.9, model=mq9f)
[using simple kriging]
> levelplot(var1.pred~x+y, as.data.frame(k9), aspect="iso")
> # this is the probability of being *below* the cutoff of 8.26ppm


[Figure: indicator variogram (9th decile), semivariance vs. distance: experimental variogram; with estimated model; with fitted model (note unrealistic nugget). Map of probability < 8.26 mg kg⁻¹; axes x, y]


Summary: Advantages of IK

• Makes no assumption about the theoretical distribution of the data values,
yet still gives realistic probability estimates

• Outlier-resistant: outliers cannot arbitrarily increase the estimate or the
prediction variance of an indicator; for data values they only affect one quantile

• Simple Kriging is used at each quantile, which improves the estimate.


Summary: Disadvantages of IK

• Variograms may be difficult to model, especially at the highest and lowest
quantiles (few pairs with different 0/1 values)
