REN R 690 Lab – Geostatistics Lab

The objective of this lab is to try out some basic geostatistical tools with R. Geostatistics is used
primarily in the resource and environmental areas for estimation, uncertainty quantification
and integrating data with different volumetric supports, precisions and sources. This lab will
introduce you to variograms and kriging for spatial data using the R package gstat.

These concepts will be introduced using a 2D data set from a West Texas oil deposit. This is a
nice small data set consisting of 62 wells from a carbonate-siltstone reservoir in West Texas. It
includes measurements of porosity (void fraction) and permeability (measure of how easily
fluid flows through the rock). The X and Y coordinates in the data set correspond to Easting and
Northing values (in ft), respectively. We are interested in mapping porosity for this reservoir
since this directly correlates to the oil in place. If you have time at the end, you can try the same
procedure with permeability since this directly correlates with our ability to extract the oil. Even
better, you could apply this procedure to some of your own spatial data!

1. Getting Started
Download the dataset “2dwelldata.csv”. Load this data set into R and check that it imported
correctly.

welldata = read.csv("2dwelldata.csv")
fix(welldata)
attach(welldata)
Let’s have a look at a map of the well locations. We can plot a simple map of where the wells
were drilled.

plot(Y~X, xlab="Easting",ylab="Northing")
This isn’t very interesting though. So we could colour this map by porosity. The porosity varies
between about 4 and 12%. To make a plot coloured by porosity where the high porosity rock is
coloured red and the low porosity rock is coloured yellow, a set of possible commands would
be:

mycol=seq(12,4,-0.01)
mycol=heat.colors(length(mycol))[rank(mycol)]
porcol=round((Porosity-4.0)*100.)
porcol=mycol[porcol]

plot(Y~X,pch=16,col=porcol,xlab="Easting", ylab="Northing")
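The index arithmetic above assumes every porosity value lies strictly inside the 4 to 12% range; a value at or just below 4 would round to index 0 and silently drop a point. A minimal sketch of a more defensive mapping follows. The helper name value_to_colour and the example values are hypothetical and not part of the lab data.

```r
# Hypothetical helper: map values onto a pale-yellow-to-red ramp,
# clamping anything outside [lo, hi] to the ends of the ramp.
value_to_colour <- function(v, lo = 4, hi = 12, n = 801) {
  ramp <- rev(heat.colors(n))                        # first = pale yellow, last = red
  idx  <- round((v - lo) / (hi - lo) * (n - 1)) + 1  # 1..n inside the range
  idx  <- pmin(pmax(idx, 1), n)                      # clamp out-of-range values
  ramp[idx]
}

# Made-up porosity values, including two outside the 4-12 range
example_cols <- value_to_colour(c(3.9, 4, 8, 12, 12.5))
```

With the lab data attached, something like plot(Y~X, pch=16, col=value_to_colour(Porosity)) should then give a comparable map.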

Jared Deutsch - 2013 1


You can colour your map any way you wish. A map of porosity coloured with this scheme:

[Figure: map of well locations coloured by porosity; x-axis Easting (0 to 10000), y-axis Northing (0 to 10000).]

The high porosity rock is all in the East of the area and concentrated in the North-East of the
reservoir – now the cluster of wells in the North-East makes sense. A large number of wells
were drilled in the high porosity region.

2. Variograms
Download and install the package “gstat”. This is an R package with basic geostatistics
functionality. Install then load the package in R.

install.packages('gstat')
library(gstat)
library(sp)  # load sp explicitly in case your gstat version does not attach it
The gstat package relies on the “sp” package, which provides spatial data frame classes. To run
the gstat library with this data, the data needs to be coerced into a spatial data frame. This is
done by first assigning the coordinates as a set of spatial points and then adding on the porosity
data. R code to do this:

xyspatial=SpatialPoints(cbind(X,Y))
porspatial=data.frame(Porosity)
spatialdata=SpatialPointsDataFrame(xyspatial,porspatial)
Check the spatial data to make sure that it assembled the data correctly. The first few rows
should look like:

> spatialdata

coordinates Porosity
1 (9856, 5652) 9.474762
2 (9767, 4271) 10.096829
3 (8062, 9321) 9.816727
A variogram can be calculated with gstat using the command:

porvario=variogram(Porosity~1,spatialdata)
This is an omnidirectional (isotropic) variogram. This means that we are assuming that the
spatial variability is the same in all directions. This is fine for this exercise although we can tell
by the map that this is not the case! The reservoir is much more continuous in the North-South
direction than the East-West direction.

A variogram plot should always include a line with the sill. The sill is the variogram value at
which there is no remaining spatial correlation; it corresponds to the data variance. (Figure
from University of Alberta MIN E 310 Lecture Notes.)
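To make concrete what an omnidirectional experimental semivariogram measures, here is a minimal base R sketch: for each lag bin, take half the mean squared difference between all pairs of values whose separation falls in that bin. The data are synthetic (random locations and values), and the 1000 ft lag bins are an arbitrary assumption; gstat chooses its own cutoff and bin width, so this is not a reproduction of its output.

```r
set.seed(1)
n <- 50
x <- runif(n, 0, 10000)            # synthetic Easting
y <- runif(n, 0, 10000)            # synthetic Northing
z <- rnorm(n, mean = 8, sd = 2)    # synthetic values (no spatial structure)

d     <- as.matrix(dist(cbind(x, y)))   # pairwise separation distances
sqd   <- outer(z, z, "-")^2             # pairwise squared differences
pairs <- upper.tri(d)                   # count each pair once

# Half the mean squared difference within each lag bin
lag_breaks <- seq(0, 8000, by = 1000)
lag_bin    <- cut(d[pairs], breaks = lag_breaks)
gamma_hat  <- tapply(sqd[pairs], lag_bin, mean) / 2

# Over ALL pairs, the same statistic equals the sample variance,
# which is why the variance is used as the sill.
gamma_all <- mean(sqd[pairs]) / 2
```

Comparing gamma_all with var(z) shows the two agree, which is the identity behind drawing the sill at the data variance.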

We can now calculate the variogram sill (variance) and plot the variogram. One quick note:
technically the semivariogram (what we are calculating) is the variogram value divided by 2,
but in practice the terms variogram and semivariogram are used interchangeably.

porsill=var(Porosity)
plot(porvario$dist,porvario$gamma,xlim=c(0,8000),ylim=c(0,porsill+1),
xlab="Distance (ft)",ylab="Semivariogram")
abline(h = porsill)
text(6000,porsill+0.2,paste("Sill =",round(porsill,3)))

The abline command with h tells R to plot a horizontal line at the sill value. We are adding text
with the sill value above; the paste command tells R to concatenate the strings. Your plot should
look like:

[Figure: experimental semivariogram of porosity; x-axis Distance (ft), y-axis Semivariogram, with a horizontal line labelled Sill = 3.679.]

Right now we know experimental variogram values at a few specific distances – but we need to
model this variogram so that we know the variogram values at all distances. This means that we
need to determine:

• The nugget effect
• The variogram shape
• The variogram contribution
• The variogram range

For simple cases like this, it is a good idea to pick the nugget effect yourself based on your
knowledge of the variable. Recall that the nugget effect can be thought of as the y-axis intercept.
In this case, the nugget effect looks pretty low. We might estimate a nugget effect of about 0.3.

There are a number of permissible variogram models – the reason we need to pick a defined
variogram model rather than fitting the curve with any function is that the calculated
covariance matrix must be positive definite. With the defined variogram models (spherical,
exponential, Gaussian) this is always true. If we were to use another function this might not be
the case. Here we could choose the spherical variogram model.
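The positive definiteness requirement can be checked numerically. The sketch below builds the covariance matrix implied by a spherical model, C(h) = sill - γ(h), for a set of random 2D locations and looks at its smallest eigenvalue. The locations are synthetic, the function name sph_cov is hypothetical, and the nugget, contribution and range values are illustrative assumptions rather than fitted results.

```r
# Covariance implied by a nugget + spherical variogram model.
# At h = 0 the covariance is the full sill (nugget + contribution).
sph_cov <- function(h, nugget = 0.3, psill = 3.379, range = 8769) {
  r <- pmin(h / range, 1)   # beyond the range the covariance is zero
  ifelse(h == 0, nugget + psill, psill * (1 - (1.5 * r - 0.5 * r^3)))
}

set.seed(7)
pts <- cbind(runif(30, 0, 10500), runif(30, 0, 10500))  # synthetic locations
C   <- sph_cov(as.matrix(dist(pts)))                    # 30 x 30 covariance matrix

# All eigenvalues of a positive definite matrix are strictly positive
min_eig <- min(eigen(C, symmetric = TRUE, only.values = TRUE)$values)
```

A positive min_eig means the kriging system built from this matrix has a unique solution; an arbitrary curve fitted through the experimental points carries no such guarantee.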

The variogram contribution should sum with the nugget effect to the data variance (sill). Since
we picked a nugget effect of 0.3, the variogram contribution would be 3.679-0.3=3.379. If you
picked a different nugget effect then adjust your variogram contribution accordingly.

The variogram range is the distance at which the variogram reaches the sill. It looks like this
occurs at around 8000 ft, but we can use the variogram fitting function in the gstat package to
help us pick the range. The range is fitted by:

porvm=fit.variogram(porvario, model = vgm(3.379, "Sph", 8000, 0.3),
fit.sills=FALSE)
You can see the values we determined in the variogram model function. 8000 is our guess for
the range. This function will try and fit the range using 8000 as a starting point. We can look at
the calculated range by:

> porvm
model psill range
1 Nug 0.300 0.000
2 Sph 3.379 8769.015
So it fit a range of 8769 ft. This means that our variogram model equation is zero for a distance
of zero (no variability at zero distance!) and is the nugget effect plus our spherical variogram
model equation for distances larger than 0:

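$$
\gamma(h) =
\begin{cases}
0, & h = 0 \\
0.3 + 3.379\left[1.5\left(\dfrac{h}{8769}\right) - 0.5\left(\dfrac{h}{8769}\right)^{3}\right], & 0 < h \le 8769 \\
3.679, & h > 8769
\end{cases}
$$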
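The model can also be written as a small R function, which makes it easy to evaluate at any distance and to see what fitting only the range means. This is a sketch: sph_model is a hypothetical name, the "observations" are generated from the model itself rather than taken from the well data, and the fit uses plain unweighted least squares via optimize(), which is a simplification and not necessarily the criterion fit.variogram uses.

```r
# Nugget + spherical variogram model; flat at the sill beyond the range
sph_model <- function(h, range = 8769, nugget = 0.3, psill = 3.379) {
  r <- pmin(h / range, 1)
  ifelse(h == 0, 0, nugget + psill * (1.5 * r - 0.5 * r^3))
}

# Evaluate at zero distance, at the range, and well beyond it
model_values <- sph_model(c(0, 8769, 20000))

# Synthetic "experimental" points generated from the model itself
lag_h     <- seq(500, 8000, by = 500)
gamma_obs <- sph_model(lag_h)

# Fit the range only, holding nugget and contribution fixed
sse <- function(a) sum((gamma_obs - sph_model(lag_h, range = a))^2)
fit <- optimize(sse, interval = c(1000, 20000))
fitted_range <- fit$minimum
```

Because the points come from the model, fitted_range should land very close to 8769; with real experimental points the fit would be a compromise across lags.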

We can now plot the variogram model and the experimental points together to check if our fit is
reasonable. There is a built in plotting capability in gstat, but the plots aren’t that pretty so we
can do this ourselves given the above function for the spherical variogram model. If you still
have your variogram plot open you can add a plot of the model with:

curve(0.3+3.379*(1.5*(x/8769)-0.5*(x/8769)^3),add=TRUE)
You should now have a reasonable variogram model that looks like:

[Figure: experimental semivariogram with the fitted spherical model curve; x-axis Distance (ft), y-axis Semivariogram, horizontal line labelled Sill = 3.679.]

3. Kriging
We can now use our variogram model to estimate porosity over the entire area by kriging. To
do this we need a list of locations at which we are going to estimate. We can do this by creating
a regular grid which “paves” the area. Look back at the area quickly. The area could be
summarized as spanning Easting (X) values of 0 to 10500 ft and Northing (Y) values of 0 to
10500 ft. We could consider estimating a grid where the cells were 250 ft by 250 ft. This would
mean we would have 42 cells each in the X and Y directions. The procedure for generating this
grid is then:

gt = GridTopology(cellcentre.offset=c(125,125), cellsize=c(250,250),
cells.dim=c(42,42))
grd=SpatialGrid(gt)
The generated grid can be checked:

> summary(grd)
Object of class SpatialGrid
Coordinates:
min max
[1,] 0 10500
[2,] 0 10500
Is projected: NA
proj4string : [NA]

Grid attributes:
cellcentre.offset cellsize cells.dim
1 125 250 42
2 125 250 42
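The grid arithmetic can be double-checked in base R without any spatial classes. The sketch below only reproduces the cell count and cell-centre coordinates implied by a 250 ft cell over a 0 to 10500 ft extent; the variable names are illustrative.

```r
cell    <- 250                    # cell size in ft
extent  <- 10500                  # area spans 0 to 10500 ft in X and Y
n_cells <- extent / cell          # cells per direction

# Cell centres start half a cell in from the origin
centres <- seq(cell / 2, by = cell, length.out = n_cells)

# Every combination of X and Y centres is one estimation location
nodes <- expand.grid(X = centres, Y = centres)
```

The first centre is at 125 ft, which is why cellcentre.offset is c(125,125), and nrow(nodes) gives the total number of locations that will be estimated.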
We will use simple kriging. This means that we have to provide a mean (the normal procedure
for doing this is declustering, but we will use a naïve mean here). The mean is:

> mean(Porosity)
[1] 8.401975
To krige we provide the variable we are kriging, porosity data, the grid of points to estimate,
the variogram model and the mean (beta parameter here). The results can be plotted using the
spatial plotting utility.

krigedpor=krige(formula=Porosity~1, spatialdata, grd,
model=vgm(3.379, "Sph", 8769, 0.3),
beta=8.402)
spplot(krigedpor["var1.pred"], cuts=length(mycol),
col.regions=mycol)

[Figure: map of the kriged porosity estimates (var1.pred) with colour scale.]
Kriging is an exact interpolator, so it reproduces the data values at the data locations. We
should have a look and check that this is the case. To do this we first extract the kriged estimates:

krigedpoints=cbind(coordinates(grd),krigedpor$var1.pred)

We can use the same level plot from the lattice library to plot this (we did this in an earlier lab).
Install and load the lattice library if you need to.

install.packages('lattice')
library(lattice)
We can make a level plot using the same colors as before:

levelplot(krigedpoints[,3]~krigedpoints[,1]*krigedpoints[,2],
cuts=length(mycol), col.regions=mycol,xlab="Easting",
ylab="Northing",groups=1)
To add the points, we can use the lattice trellis method:

trellis.focus("panel", 1, 1, highlight=FALSE)
panel.xyplot(X,Y,pch=21,cex=1.3,fill=porcol)
trellis.unfocus()

[Figure: level plot of kriged porosity with the well data points overlaid; x-axis Easting, y-axis Northing, with colour scale.]

We see a transition from low to high porosity values moving to the East and all the data values
are reproduced. You should see that in the far South East corner where there is very little data,
the estimated values tend towards the mean. Now if you have time you could try the same
procedure with the permeability data and see if you can estimate permeability over the area.
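Both behaviours described above (data reproduction and reversion to the mean far from data) follow directly from the simple kriging equations, and can be checked with a small base R sketch. This is not gstat's implementation: the data are synthetic, the function names sph_cov and simple_krige are hypothetical, and the nugget is treated as part of the covariance at zero distance, which is an assumption of this sketch.

```r
# Covariance implied by the nugget + spherical model (sill minus variogram)
sph_cov <- function(h, nugget = 0.3, psill = 3.379, range = 8769) {
  r <- pmin(h / range, 1)
  ifelse(h == 0, nugget + psill, psill * (1 - (1.5 * r - 0.5 * r^3)))
}

# Simple kriging at one location x0 with known mean m
simple_krige <- function(x0, xy, z, m) {
  C  <- sph_cov(as.matrix(dist(xy)))               # data-to-data covariances
  c0 <- sph_cov(sqrt(colSums((t(xy) - x0)^2)))     # data-to-target covariances
  w  <- solve(C, c0)                               # kriging weights
  c(pred = m + sum(w * (z - m)),                   # estimate
    var  = sph_cov(0) - sum(w * c0))               # kriging variance
}

set.seed(42)
xy <- cbind(runif(20, 6000, 10000), runif(20, 6000, 10000))  # synthetic wells
z  <- rnorm(20, mean = 8.4, sd = 1.9)                        # synthetic porosity
m  <- 8.402

at_datum <- simple_krige(xy[1, ], xy, z, m)            # at a data location
far_away <- simple_krige(c(-50000, -50000), xy, z, m)  # far beyond the range
```

At a data location the estimate equals the datum and the kriging variance is zero; far beyond the range all weights vanish, so the estimate is the mean and the variance is the sill.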
