
Geoderma

journal homepage: www.elsevier.com/locate/geoderma

profile data sampled over varying depth intervals

T.G. Orton a,b,*, M.J. Pringle b, T.F.A. Bishop a

a Faculty of Agriculture and Environment, The University of Sydney, 1 Central Avenue, Australia Technology Park, Eveleigh, NSW 2015, Australia

b EcoSciences Precinct, Department of Science, Information Technology and Innovation, GPO Box 5078, Brisbane, QLD 4001, Australia

Article info

Article history:

Received 16 April 2015

Received in revised form 31 July 2015

Accepted 10 August 2015

Available online xxxx

Keywords:

Geostatistics

Spatial variability

Soil profile

Depth function

Area-to-point kriging

Abstract

Datasets for modelling and mapping soil properties often consist of samples from many spatial locations, collected from several different soil depth intervals. However, interest may lie in the spatial distribution of the property for a particular target depth interval, which may or may not correspond to the sampled intervals. It is the task of the data analyst to put the data together in such a way that useful and reliable conclusions can be drawn for the soil depths of practical interest. Previous studies to tackle this problem include multi-stage approaches and point-data-based 3-dimensional geostatistical approaches. One disadvantage of a multi-stage approach (for example, first fitting splines to the data for sampled profiles, then imputing new data for the target interval, before considering a spatial analysis with the imputed data) is that the imputation generally ignores any uncertainty in the imputed data, which might give misleading conclusions. Point geostatistical methods, on the other hand, assume that the data represent the value of the target variable at a specific point in the profile, rather than its average over a sampling interval; this too could give misleading estimates. In this work, we present a statistical method that properly deals with the sample support of soil profile data so that all data can be considered in a single geostatistical analysis. The approach is based on the area-to-point kriging framework, which can be used to represent the uncertainty from data that are averages over non-negligible sample supports (in our case, the different sampled depth intervals). We combine a covariance model for the increment-averaged data in the vertical domain with another model for the horizontal variation. This enables us to (i) process all data in a single analysis, and (ii) calculate predictions for any target depth and support based on the same statistical model. We test the approach on data from the Murray–Darling basin in eastern Australia, where interest lies in mapping various soil properties that could have an effect on water salinity of the nearby Muttama Creek; we illustrate the methodology for predicting clay content. Finally, we discuss a number of possible extensions of the methodology to broaden its applicability, which should provide the basis of further studies.

© 2015 Elsevier B.V. All rights reserved.

1. Introduction

Soil properties vary significantly both across the landscape and through the soil profile, and interest lies in characterizing and mapping this variation to provide land users with useful information. Datasets often consist of samples from many spatial locations, at several different depth intervals. Within a particular study, these depth intervals may be fixed (e.g. 0–10 cm, 10–20 cm, and 20–30 cm). Other studies may consider different fixed intervals, or sampling intervals that are defined according to soil horizons and therefore vary between locations within the study. It is then the task of the data analyst to draw useful and reliable conclusions for soil depths of practical interest. For example, the GlobalSoilMap project specifications (Arrouays et al., 2014) dictate that soil properties should be mapped for depth intervals of 0–5 cm, 5–15 cm, 15–30 cm, 30–60 cm, 60–100 cm and 100–200 cm.

* Corresponding author.

E-mail address: Thomas.Orton@dsiti.qld.gov.au (T.G. Orton).

http://dx.doi.org/10.1016/j.geoderma.2015.08.013

0016-7061/© 2015 Elsevier B.V. All rights reserved.

task. A common approach is to fit splines to the profile data for each site (Bishop et al., 1999), use the spline to impute data for the soil property over the depth interval of interest, and then proceed with the analysis as if the value were known without error (e.g. Malone et al., 2009, 2011a; Adhikari et al., 2013; Orton et al., 2014; Bishop et al., 2015). We refer to this as a spline-then-krige (STK) approach. This process does not account for the uncertainty in the values inferred from the spline, and could yield misleading conclusions.

Another possible approach to the problem is 3-dimensional (3-D) geostatistics. However, this has been applied as if the data collected from soil depth intervals were concentrated at a single point (e.g. Hengl et al., 2014), at either one of the bounds of the sampling interval, or at the interval's mid-point. This approach also fails to properly represent the support on which the data were originally collected (over an interval, rather than from a point), and could again yield misleading conclusions. Breidt et al. (2007) recognized the dangers of using midpoint assignment to represent increment-averaged data, and proposed a mixed-model approach for estimating depth functions, whilst properly accounting for the interval support of the data; their focus was on the estimation of depth profiles, whereas our focus is more on the use of such data for modelling and mapping using spatial datasets of soil horizon data. Other 3-D approaches (e.g. Poggio and Gimona, 2014; Veronesi et al., 2012) generally suffer the same drawback; all data are assumed to have identical vertical support, which ignores their different uncertainties.

In a geostatistical framework, the sample support of data that are averages of an attribute over non-negligible areal units can be dealt with by area-to-point kriging (ATP kriging; Kyriakidis, 2004). This method allows the sampling units and prediction supports to all have different sizes and shapes. It has been applied in several case studies in recent years to analyse areal-averaged data (Kyriakidis and Yoo, 2005; Kerry et al., 2012; Schirrmann et al., 2012; Truong et al., 2014). Although usually carried out to account for the horizontal support (i.e. the data are areal averages), there is no reason that the same methodology cannot be carried out to deal with the vertical support of soil profile data (i.e. for data that are measurements of the average value of a soil property over depth intervals). This was noted in Heuvelink (2014), although we are unaware of any studies that have implemented such an approach.

In this work, we combine the ATP approach for the vertical distribution with standard kriging approaches for the horizontal distribution. Thus, a statistical model for the complete dataset (all spatial locations and all depth intervals) is defined, with the support of each datum (a combination of spatial location and depth interval) properly represented. We refer to this model for increment-averaged data, and the predictions built on the model, as increment-averaged kriging (IAK). We propose that this all-in-one model should provide a better assessment of prediction uncertainty compared with a two-stage approach, or an approach that represents interval data by their mid-points (although comparison of the different methodologies is not undertaken in the current study).

We consider the methodology in the framework of a linear mixed model (LMM; Lark et al., 2006). Thus, part of the variation of the target variable can be explained by a collection of explanatory variables, with the remainder being modelled as spatially dependent (i.e. data close to each other in horizontal space and at similar depths are more likely to be similar than data far apart in space and at different depths). We allow interactions between depth and the spatial explanatory variables, so that different relationships can be modelled at different depths in the profile. We also allow the variance parameters of residuals to depend on depth, which provides a mechanism to represent different uncertainties at different depths in the profile.

Usually in ATP-kriging studies, the average covariances must be calculated numerically (by a discretization approach), due to the complex nature of the areal data units in 2-D space. However, for our increment-averaged data, the average covariances can be computed analytically. We derive an expression for the covariance of the increment-averaged data, based on an exponential model for the point covariances. This significantly reduces the computational load of maximum likelihood methods compared with numerical procedures. Nonetheless, for large datasets (when the total number of data is more than a few thousand), likelihood approximation techniques may have to be used (e.g. Stein et al., 2004; Eidsvik et al., 2014); we do not consider these here though.

We test the proposed IAK approach on data from the Murray–Darling basin in eastern Australia, where interest lies in mapping soil properties that could have an effect on water salinity of the nearby Muttama Creek. Soil cores (to a depth of 1 m) were collected from 55 spatial locations over the Muttama catchment, and each core was divided into horizons, giving a total of 192 samples. We use this case study to illustrate the IAK approach, mapping clay content and its attendant uncertainty based on the data from these samples.


2. Theory

Throughout the following, we will assume that the horizontal

support of the data and of the prediction is point support. The method

can be extended to deal with data that are both areal- and depth-wise

averages, if this were to be required in another study. We begin our

presentation of the methodology with a simple stationary model for

the point covariances. We then extend this model to a more realistic

one, allowing variances to depend on depth, before describing how

this relates to the average covariances required to model the variation

of increment-averaged data.

2.1. IAK model: initial stationary model for point covariances

We begin our development towards a statistical model for the analysis of depth interval-averaged data by considering a 3-D model for point data (i.e. with depths, $d$, taken to be fixed points):

$$y(x, d) = \mu(x, d) + \varepsilon(x, d) \qquad (1)$$

known covariates to give the trend function, $\mu(x, d)$, which can be written as:

$$\mu(x, d) = X(x, d)\,\beta \qquad (2)$$

where $X(x, d)$ contains the known values of the covariates and $\beta$ is the vector of associated parameters (to be estimated). This is known as the fixed-effect function, and $X(x, d)$ constitutes a row of the fixed-effect design matrix. We assume that the residuals, $\varepsilon(x, d)$, follow a multivariate normal distribution with mean zero and covariances depending only on the horizontal and vertical separation distances (this assumption will be relaxed in Section 2.2). As a first approach, we assume a separable (product) covariance model (De Iaco et al., 2011):

$$\mathrm{Cov}[Y(x, d), Y(x', d')] = \sigma^2\, \rho_x(h_x; \theta_x)\, \rho_d(h_d; \theta_d) \qquad (3)$$

where $\rho_x(h_x; \theta_x)$ is a correlation function for locations separated by distance $h_x = |x' - x|$, and $\rho_d(h_d; \theta_d)$ is another correlation function of the vertical separation distances, $h_d = |d' - d|$. The parameter, $\sigma^2$, is the variance. This simple model assumes that the covariances can be written as the product of a function that depends only on the horizontal separation distances, and one that depends only on the vertical separation distances. Although this can be restrictive, it provides a useful starting point, and we suggest possible alternatives in the discussion.

In the product covariance model, we can choose any permissible correlation functions (see e.g. Webster and Oliver, 2001) for $\rho_x(h_x; \theta_x)$ and $\rho_d(h_d; \theta_d)$. However, as Truong et al. (2014) point out, there is no information in areal-averaged data (or in our case, depth interval-averaged data) to define a nugget effect. We therefore assume that the depth-wise correlation function, $\rho_d(h_d; \theta_d)$, has zero nugget, and model it with a single spatial autocorrelation structure. We can still include a nugget effect for the horizontal variation though, and we write $\rho_x(h_x; \theta_x)$ as the sum of a nugget component and $N_m$ spatial autocorrelation structures:

$$\rho_x(h_x; \theta_x) = s_0\, \rho_{x,0}(h_x) + \sum_{i=1}^{N_m} s_i\, \rho_{x,i}(h_x; \theta_{x,i}) \qquad (4)$$

where:

$$\rho_{x,0}(h_x) = \begin{cases} 1 & \text{if } h_x = 0 \\ 0 & \text{otherwise} \end{cases}$$

is the nugget correlation function; $\rho_{x,i}(h_x; \theta_{x,i})$, $i = 1, \ldots, N_m$, are $N_m$ spatial correlation functions, with parameter vectors $\theta_{x,i}$; parameters $s_i$, $i = 1, \ldots, N_m$, are the proportions of variance associated with each of the $N_m$ spatial correlation functions; and the parameter $s_0 = 1 - \sum_{i=1}^{N_m} s_i$ gives the proportion associated with the nugget variation.
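The nugget-plus-structures correlation function described above can be sketched in a few lines. This is only an illustration: the exponential form for the horizontal structures, the function name `rho_x`, and the parameter values in the usage note are assumptions, not choices reported in the text.

```python
import math

def rho_x(h_x, s, ranges):
    """Horizontal correlation: a nugget plus N_m spatial structures.

    s      -- proportions s_1..s_Nm of the spatial structures (sum <= 1)
    ranges -- distance parameters, one per structure (exponential form assumed)
    """
    s0 = 1.0 - sum(s)                      # proportion left for the nugget
    nugget = 1.0 if h_x == 0 else 0.0      # nugget correlation: 1 only at zero lag
    spatial = sum(si * math.exp(-h_x / ai) for si, ai in zip(s, ranges))
    return s0 * nugget + spatial
```

With a hypothetical single structure, `rho_x(0.0, [0.6], [500.0])` evaluates to 1, whereas at any positive lag the nugget share (0.4 here) drops out, giving the discontinuity at the origin.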

Reparameterizing in terms of $c_i = s_i \sigma^2$, $i = 0, \ldots, N_m$ (i.e. the variances associated with the nugget and each of the $N_m$ spatial correlation functions), we can write the full product covariance model as:

$$\mathrm{Cov}[Y(x, d), Y(x', d')] = \begin{cases} \left( \sum_{i=0}^{N_m} c_i \right) \rho_d(h_d; \theta_d) & \text{if } h_x = 0 \\ \left( \sum_{i=1}^{N_m} c_i\, \rho_{x,i}(h_x; \theta_{x,i}) \right) \rho_d(h_d; \theta_d) & \text{otherwise.} \end{cases} \qquad (5)$$

We work with this form herein.

For the vertical correlation, we will assume an exponential function. One property of the exponential function that we will make use of in this work is that it is integrable; this will allow us to derive an analytical function for the average covariances (that represent the correlation between observations of depth-interval averages), rather than requiring a numerical procedure to approximate them.

2.2. IAK model: non-stationary variances

The stationarity assumption of Section 2.1 might be particularly restrictive for analysing 3-D data that vary both horizontally and vertically. In particular, we might expect different horizontal (spatial) variability at different depths (e.g. for a ploughed field, we might expect smooth variability in clay content in the cultivated top-soil, and a larger degree of variability at depth). The stationary model given in Eq. (5) can be generalized to accommodate variances, $c_i(x, d, x', d')$, that depend on the spatial locations and depths, which provides a mechanism to model such effects.

We follow various authors (e.g. Lark, 2009; Haskard and Lark, 2009; Marchant et al., 2009) in assuming that variance terms can be modelled by the product of a (positive) function applied at the two locations in question:

$$c_i(x, d, x', d') = f_i(x, d)\, f_i(x', d'). \qquad (6)$$

Although the $f_i(x, d)$ functions can in theory be chosen to model variances that depend on spatially-varying covariates, here we consider only the dependence on depth. In particular, we will consider:

$$f_i(x, d) = f_i(d) = P_{r_i}(d; \alpha_i) \qquad (7)$$

where $P_{r_i}(d; \alpha_i)$ is a polynomial in $d$ of order $r_i$ with coefficient vector, $\alpha_i$, of length $r_i + 1$. In this work, we take each polynomial to be of order $r_i = 2$ (although there is no theoretical reason for not including higher-order terms, if deemed to be necessary). Lower-degree polynomials are achieved by setting the higher-order coefficients to zero. In order for the resulting covariance function to be positive definite, we must ensure that $f_i(x, d) > 0$ for all $i$ and for all locations in the study area (both data and prediction locations).

Substituting these polynomial functions into Eq. (5) gives:

$$\mathrm{Cov}[Y(x, d), Y(x', d')] = \begin{cases} \left( \sum_{i=0}^{N_m} P_{r_i}(d; \alpha_i)\, P_{r_i}(d'; \alpha_i) \right) \rho_d(h_d; \theta_d) & \text{if } h_x = 0 \\ \left( \sum_{i=1}^{N_m} P_{r_i}(d; \alpha_i)\, P_{r_i}(d'; \alpha_i)\, \rho_{x,i}(h_x; \theta_{x,i}) \right) \rho_d(h_d; \theta_d) & \text{otherwise.} \end{cases} \qquad (8)$$

We write the complete set of covariance parameters as $\theta = \{\theta_x, \theta_d, \alpha\}$, where $\alpha$ contains all of the $\alpha_i$s and $\theta_x$ contains all of the $\theta_{x,i}$s.

Eq. (1) presented the statistical model for the point-support variable, with mean function given by Eq. (2) and covariance function Eq. (8). However, our data are given as depth interval averages, and we must use these point-support models to calculate expectations and covariances for the interval support. In the geostatistical literature, this process is known as 'regularization' (e.g. Goovaerts, 2008), with its reverse (the use of interval-support data to infer point-support models) known as 'deconvolution'.

We assume that the measurement of variable $Y$ for an interval $I = [u, l]$ (where $l > u$) represents the arithmetic mean of point values of $Y$ within this interval. The depth interval-averaged variable is then a linear combination of multivariate normal variables, and is also multivariate normal (see e.g. Kyriakidis and Yoo, 2005). Its mean and covariance matrix are given by interval averages of the respective statistics for the point-support variable.

First, the expectation on interval support is:

$$\mu(x, I) = \bar{X}(x, I)\, \beta \qquad (9)$$

where $\bar{X}(x, I)$ is the average of the point-support design matrix, $X(x, d)$,

for all data. We will consider spatial covariates that do not depend on depth. However, we will allow interactions between these spatial covariates and depth, as detailed in Section 3.2; for example, to model the interaction between a spatial covariate $w$ and depth squared ($d^2$), we would include the values $\overline{w d^2} = w\, \overline{d^2} = w (u^2 + ul + l^2)/3$ in $\bar{X}(x, I)$. Note that this is not the same as using depth interval mid-points to

By using average values of $X(x, d)$ over the sample supports here, trend calibration will explicitly account for the change of support: the expectation $X(x, d)\beta$ will be relevant for a point-support variable and $\bar{X}(x, I)\beta$ relevant for a variable on increment-averaged support.
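As a worked illustration of why the interval average of $d^2$ differs from the squared mid-point, the integral can be evaluated directly. The bounds used in the numerical line are hypothetical and are not taken from the dataset.

```latex
\overline{d^2} = \frac{1}{l-u}\int_u^l d^2\,\mathrm{d}d
              = \frac{l^3-u^3}{3(l-u)}
              = \frac{u^2+ul+l^2}{3},
\qquad
\overline{d^2} - \left(\frac{u+l}{2}\right)^2 = \frac{(l-u)^2}{12}.

% Example with u = 0.1 m, l = 0.5 m:
% \overline{d^2} = (0.01 + 0.05 + 0.25)/3 \approx 0.1033, whereas the squared mid-point is 0.3^2 = 0.09;
% the difference, 0.4^2/12 \approx 0.0133, grows with the square of the interval thickness.
```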

Second, the covariance for a pair of observations on interval support is given by the average of the covariances from the point-support model over the respective depth intervals for the two observations in question. That is:

$$\mathrm{Cov}[Y(x, I), Y(x', I')] = \frac{1}{|I|\,|I'|} \int_{d \in I} \int_{d' \in I'} \mathrm{Cov}[Y(x, d), Y(x', d')]\; \mathrm{d}d'\, \mathrm{d}d \qquad (10)$$

where $|I|$ and $|I'|$ are the lengths of the intervals $I$ and $I'$, respectively (Kyriakidis, 2004). We use $C$ as shorthand to denote the full data covariance matrix with elements defined by Eq. (10).
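For comparison with the analytical route taken in the paper, the double integral in Eq. (10) can also be approximated by discretization. The sketch below uses a mid-point rule and, as the integrand, the same-site ($h_x = 0$) case of Eq. (8) with an exponential depth correlation; the function names, the coefficient layout (lowest order first) and the number of discretization points are assumptions made for illustration.

```python
import math

def poly(d, coeffs):
    """Polynomial in depth d; coeffs are ordered from lowest to highest power."""
    return sum(a * d ** k for k, a in enumerate(coeffs))

def point_cov_same_site(d, d2, alpha, a_d):
    """Eq. (8) at h_x = 0: products of standard-deviation polynomials summed
    over all components, times an exponential depth correlation."""
    var = sum(poly(d, a) * poly(d2, a) for a in alpha)
    return var * math.exp(-abs(d - d2) / a_d)

def averaged_cov_numeric(I, I2, cov, n=200):
    """Mid-point-rule approximation of the interval-averaged covariance, Eq. (10)."""
    (u, l), (u2, l2) = I, I2
    ds = [u + (k + 0.5) * (l - u) / n for k in range(n)]
    ds2 = [u2 + (k + 0.5) * (l2 - u2) / n for k in range(n)]
    return sum(cov(d, dd) for d in ds for dd in ds2) / (n * n)
```

Each evaluation costs `n * n` point covariances, which is the computational burden an analytical expression avoids when the covariance matrix has to be rebuilt many times during likelihood maximization.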

In ATP kriging, these average covariances are usually computed numerically, by discretizing the areal units into a number of points, calculating covariances between these points, and averaging the values. Such a numerical procedure is necessary because the integrals (to compute these averages) of the point covariance function over irregular two-dimensional areal units are analytically intractable. However, in our case, the integrals are in one dimension (depth) and as a result are analytically tractable for certain point-covariance functions. In this work, we consider the exponential model for the depth-wise correlation function, $\rho_d(h_d; \theta_d)$, where $\theta_d = a_d$ is a single distance parameter (approximately one third of the effective range of correlation), which allows analytical results for the 1-D interval-averaged covariance. This significantly reduces the computational load compared with a numerical procedure, particularly when maximum likelihood methods are used for parameter estimation. The interval-averaged exponential covariance function is derived in the Supplementary material and presented in Appendix A. Herein, we refer to the methods described in this section for increment-averaged data, and the predictions built on them (see the following section), as increment-averaged kriging (IAK).
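To illustrate why the one-dimensional case is tractable, the sketch below gives a closed form for the simplest setting only: a stationary exponential correlation with unit variance. It is not the paper's Appendix A expression, which also carries the depth-dependent polynomial terms. The derivation uses a function `K` (a name introduced here) whose second derivative is $\exp(-|h|/a)$, so the double integral reduces to four evaluations of `K` at the interval end-point differences.

```python
import math

def K(h, a):
    """Even function with K''(h) = exp(-|h| / a)."""
    return a * abs(h) + a * a * math.exp(-abs(h) / a)

def averaged_exp_corr(I, I2, a):
    """Average of exp(-|d - d'| / a) over d in I = (u, l) and d' in I2 = (u2, l2).

    Valid for identical, overlapping or disjoint intervals.
    """
    (u, l), (u2, l2) = I, I2
    integral = K(l - u2, a) - K(u - u2, a) - K(l - l2, a) + K(u - l2, a)
    return integral / ((l - u) * (l2 - u2))
```

For an interval of thickness $L$ paired with itself this reduces to $2a[L - a(1 - e^{-L/a})]/L^2$, which is below 1: the variance of an interval average is smaller than the point variance, and more so for thicker intervals.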


3. Methods

Parameter estimation for kriging is often carried out by a method-of-moments approach. For ATP-kriging, Goovaerts (2008) presents an iterative method-of-moments approach to perform deconvolution and estimate parameters of a point-support variogram. For point-support data, Lark (2000) demonstrated theoretical advantages of maximum likelihood (compared to method-of-moments) to estimate parameters, and this approach was suggested by Kyriakidis (2004) as an alternative for ATP-kriging parameter estimation. A further improvement over maximum likelihood is residual maximum likelihood (REML), introduced by Patterson and Thompson (1971) to reduce the bias in variance parameters as a result of the unknown fixed-effect parameters (Lark et al., 2006). We fit parameters for the IAK model using REML; the REML formula is exactly the same as in the usual case, but with $\bar{X}(x, I)$ in place of $X(x, d)$ to give the fixed-effect design matrix $\bar{X}$, and Eq. (A6) used to calculate the elements of the covariance matrix, $C$:

$$\ln L_R(\theta \mid y) = k - \frac{1}{2} \ln |C| - \frac{1}{2} \ln \left| \bar{X}^T C^{-1} \bar{X} \right| - \frac{1}{2}\, y^T \left( C^{-1} - C^{-1} \bar{X} \left( \bar{X}^T C^{-1} \bar{X} \right)^{-1} \bar{X}^T C^{-1} \right) y, \qquad (11)$$

where $k$ is a constant. Covariance models fitted by REML with the same fixed-effects structure can be compared using the Akaike information criterion (AIC; Akaike, 1974), and we select covariance models based on this criterion.

The fixed-effect parameters can be estimated, conditionally on the REML-estimated covariance parameters, as:

$$\hat{\beta} = \left( \bar{X}^T C^{-1} \bar{X} \right)^{-1} \bar{X}^T C^{-1} y, \qquad (12)$$

with covariance matrix:

$$\mathrm{var}[\hat{\beta}] = \left( \bar{X}^T C^{-1} \bar{X} \right)^{-1}. \qquad (13)$$

Eqs. (12) and (13) can be used in Wald tests to determine the significance of particular covariates.

3.2. IAK fixed-effect and covariance model selection algorithm

We model the horizontal and vertical trends by considering interactions between the spatial covariates and depth. By doing this, different spatial trends can be represented at different depths. We assume that our spatial covariates vary in space only; thus, the only mechanism to represent different trends with depth is some kind of interaction between these spatial covariates and depth. Throughout the following, we reserve the term 'predictors' to refer to the columns of the fixed-effect design matrix (which may include interactions with depth), and use 'covariates' or 'input variables' to refer to the original spatial covariates (without interactions with depth). If a model includes predictors based on a categorical input variable of three classes (without depth interactions), then removing this single input variable from the model would reduce the number of predictors by two.

We begin with a large pool of potential spatial covariates, from which we initially remove redundant (highly correlated) covariates. Following Bishop et al. (2015), we identify pairs of highly correlated covariates and remove from the pool the one with the least explanatory power for the target variable; that is, we formulate the multiple linear regression model:

$$y_i = \beta_0 + \beta_1 d_i + \beta_2 d_i^2 + \beta_3 w_i + \beta_4 d_i w_i + \beta_5 d_i^2 w_i + \varepsilon_i, \quad i = 1, \ldots, N, \qquad (14)$$

for each covariate of the pair, $w_i^{(1)}$ and $w_i^{(2)}$ (using depth-interval midpoints as a crude approximation for each $d_i$), and remove the one giving the smaller $R^2$.

Based on this reduced pool of $n_c$ spatial covariates, a full fixed-effect design matrix is formulated with interactions between the $n_c$ covariates, and depth, $d$, and $d^2$. (Here, we do not consider interactions between the spatial covariates themselves.) For a datum sampled on interval $[u, l]$, the mean of $d$ (equal to the interval midpoint), and the mean of $d^2$ (equal to $(u^2 + ul + l^2)/3$) are used to compute elements of the design matrix. This gives a design matrix with $3n_c + 3$ columns (the predictors).
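The construction of one row of this design matrix can be sketched as follows. The column ordering and the function name `design_row` are assumptions made for illustration; only the interval averages of $d$ and $d^2$ come from the text.

```python
def design_row(u, l, covariates):
    """One row of the increment-averaged design matrix for a datum on [u, l].

    Columns: 1, mean(d), mean(d^2), then w, w * mean(d), w * mean(d^2) for
    each spatial covariate w, giving 3 * n_c + 3 columns in total.
    """
    mean_d = (u + l) / 2.0                     # interval mid-point
    mean_d2 = (u * u + u * l + l * l) / 3.0    # interval average of d^2
    row = [1.0, mean_d, mean_d2]
    for w in covariates:
        row += [w, w * mean_d, w * mean_d2]
    return row
```

For example, `design_row(0.1, 0.5, [2.0, -1.0])` returns nine values for two hypothetical covariates, with the third entry equal to 0.31/3 rather than the squared mid-point 0.09.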

It is at this point that we consider choice of an appropriate covariance model structure. We fit several alternative covariance models with the full fixed-effect design matrix, and select the one giving the smallest AIC. For these alternatives, we consider exponential and Gaussian correlation models, each with nugget and spatial standard deviations given by polynomials in $d$ of order $r_i \le 2$, $i = 0, 1$ ($i = 0$ for the nugget component, $i = 1$ for the spatial component; Eq. (7)). We also consider a pure nugget model, again with $r_0 \le 2$; this gives 21 models to fit and compare.
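The count of 21 candidates follows from two correlation families, each with three possible orders for the nugget and three for the spatial standard deviation (18 models), plus three pure nugget models. A small enumeration makes this explicit; the dictionary layout and the helper `aic` are illustrative, and the number of parameters passed to `aic` would depend on how a given model is parameterized.

```python
from itertools import product

orders = (0, 1, 2)   # polynomial orders allowed for each standard deviation

# Exponential and Gaussian models: every combination of r0 (nugget) and r1 (spatial).
candidates = [
    {"correlation": corr, "r0": r0, "r1": r1}
    for corr, r0, r1 in product(("exponential", "gaussian"), orders, orders)
]
# Pure nugget models: only the nugget polynomial order varies.
candidates += [{"correlation": "pure nugget", "r0": r0, "r1": None} for r0 in orders]

def aic(log_likelihood, n_params):
    """Akaike information criterion; smaller values are preferred."""
    return 2.0 * n_params - 2.0 * log_likelihood
```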

The full design matrix is then reduced based on an iterative procedure with Wald tests. The selected covariance model is fitted with the full $(3n_c + 3)$-column design matrix. Wald tests are then applied and the least significant of these $3n_c + 3$ predictors removed, if its p-value is greater than 0.05. Note that we only allow removal of the highest-order term, so for instance $dw$ cannot be removed if $d^2 w$ is still in the model, and $d^2$ cannot be removed if the model still contains any interactions between $d^2$ and the spatial covariates. Also note that categorical predictors based on an input variable with more than two categories occupy more than one column of the design matrix, and these columns are considered for removal together. This process is repeated, refitting the covariance model each time a column is removed from the design matrix, until all remaining variables have a significance of $p < 0.05$.
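One pass of this procedure needs the generalized least squares estimates of Eq. (12), their covariance from Eq. (13), and a p-value per predictor. The sketch below assumes NumPy is available, treats the covariance matrix as known, and uses a standard normal reference distribution for single-column Wald tests; the choice of reference distribution and the function name `gls_wald` are assumptions. Multi-column categorical predictors would need a joint test, which is not shown.

```python
import math
import numpy as np

def gls_wald(X, C, y):
    """GLS estimates (Eq. 12), their covariance (Eq. 13) and Wald p-values."""
    Ci_X = np.linalg.solve(C, X)               # C^{-1} X without forming C^{-1}
    Ci_y = np.linalg.solve(C, y)               # C^{-1} y
    var_beta = np.linalg.inv(X.T @ Ci_X)       # (X' C^{-1} X)^{-1}
    beta = var_beta @ (X.T @ Ci_y)
    z = beta / np.sqrt(np.diag(var_beta))      # Wald statistic per predictor
    # Two-sided p-value under a standard normal reference distribution.
    p = np.array([math.erfc(abs(zi) / math.sqrt(2.0)) for zi in z])
    return beta, var_beta, p
```

In the iterative reduction, the predictor with the largest p-value among those eligible for removal would be dropped, the covariance parameters refitted, and the function called again.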

3.3. IAK prediction

With the methods described in Section 2 used to define the design matrix, $X$, and covariance matrix, $C$, prediction of the primary soil property at unsampled locations follows the standard universal kriging equations (see e.g. Webster and Oliver, 2001). That is, the prediction is:

$$\hat{y}_0 = X_0 \hat{\beta} + C_{0,d}\, C_{d,d}^{-1} \left( y - X_d \hat{\beta} \right), \qquad (15)$$

and variance:

$$\mathrm{var}[\hat{y}_0 - y_0] = C_{0,0} - C_{0,d}\, C_{d,d}^{-1} C_{d,0} + \left( X_0 - C_{0,d}\, C_{d,d}^{-1} X_d \right) \left( X_d^T C_{d,d}^{-1} X_d \right)^{-1} \left( X_0 - C_{0,d}\, C_{d,d}^{-1} X_d \right)^T, \qquad (16)$$

where the subscripts 0 and $d$ refer to the prediction and data, respectively, and $C_{0,d}$ refers to the submatrix of $C$ representing covariances between the prediction and data (similarly for $C_{d,d}$, $C_{d,0}$, and $C_{0,0}$). Note that the final additive term in Eq. (16) accounts for uncertainty in the estimated trend parameters, $\hat{\beta}$. We must also consider the desired prediction support in both the horizontal and vertical dimensions. Here, we consider point-support predictions in the


For validation we choose the vertical supports of the validation data, and for producing maps we choose the vertical supports of 0–10 cm, 50–60 cm, and 90–100 cm for illustrative purposes. All predictions are calculated using all estimation data (i.e. a global search window is used).
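The universal kriging prediction and its variance (Eqs. (15) and (16)) can be sketched as below, assuming NumPy and assuming that the design and covariance matrices have already been built for the required supports. The function name `uk_predict` and the argument layout are illustrative.

```python
import numpy as np

def uk_predict(X0, Xd, y, C00, C0d, Cdd):
    """Universal kriging prediction (Eq. 15) and prediction variance (Eq. 16).

    X0: (m, p) design rows at prediction supports; Xd: (n, p) for the data;
    C00: (m, m), C0d: (m, n) and Cdd: (n, n) covariance blocks.
    """
    Ci_X = np.linalg.solve(Cdd, Xd)
    Ci_y = np.linalg.solve(Cdd, y)
    var_beta = np.linalg.inv(Xd.T @ Ci_X)           # covariance of the trend estimate
    beta = var_beta @ (Xd.T @ Ci_y)                 # GLS trend parameters
    pred = X0 @ beta + C0d @ np.linalg.solve(Cdd, y - Xd @ beta)
    resid = X0 - C0d @ Ci_X                         # X0 - C0d Cdd^{-1} Xd
    var = C00 - C0d @ np.linalg.solve(Cdd, C0d.T) + resid @ var_beta @ resid.T
    return pred, var
```

A quick sanity check is that predicting at a data support with no nugget returns the observed value with zero variance, since the covariance blocks for the prediction then coincide with a row of `Cdd`.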

some indication of the importance of each covariate in the fixed-effect function (albeit with some reservations about collinearity of predictors).

difficult to interpret and compare the estimated fixed-effect coefficients. For a simple means of comparing regression coefficients, the continuous input variables can be standardized to have means of 0 and standard deviations of 1 (Schielzeth, 2010). Gelman (2008) proposed standardizing to have standard deviations of 0.5 (i.e. by dividing by two standard deviations), so that coefficients for binary variables are more comparable to those for the continuous predictors, and we follow this approach
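The two-standard-deviation scaling attributed to Gelman (2008) above amounts to the following, shown here as a minimal sketch. The use of the sample (n − 1) standard deviation and the function name are assumptions.

```python
import statistics

def standardize_2sd(values):
    """Centre on the mean and divide by two sample standard deviations."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)      # sample (n - 1) standard deviation
    return [(v - mean) / (2.0 * sd) for v in values]
```

The rescaled variable has mean 0 and standard deviation 0.5, comparable with a balanced binary (0/1) variable.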

4. Case study

basin, eastern Australia, due to its threat to ecosystem health and agricultural productivity. Several sub-catchments have previously been flagged as having high salt exports and high stream salinities (Department of Environment and Climate Change NSW, 2009), and one of these (the 1025 km² Muttama Creek sub-catchment of the Murrumbidgee River, Fig. 1) was selected as the focus of the current study. Knowledge and understanding of the key causes of salt mobilisation from landscapes, and its spatial variability, are vital for salinity control and effective management of the land. Many soil variables affect the release of salt into waterways; here we focus on soil texture, in particular clay content.

Fig. 1. The study area in the Murray–Darling basin, eastern Australia. Coordinates are relative to an origin south west of the study area. The fifty estimation data profiles are shown by crosses and the five numbered validation data profiles by open circles. Note the close proximity of four of the validation locations to data points, so that their symbols overlap.

Soil cores were collected in 2013 from 55 locations across the study area (Fig. 1). Each soil core was taken to a depth of 1 m (or less where shallower soil did not permit this), and divided into horizons. All locations provided between three and six horizons, giving a total of 192 samples available for laboratory analysis. Amongst a number of soil properties, clay content was measured using the hydrometer method. Fig. 2 shows histograms summarizing these data at three depths in the profile (based on the midpoints of sampling intervals, $d$, for display purposes only); upper ($d \le 0.15$ m), middle ($0.15 < d \le 0.5$ m), and lower ($d > 0.5$ m). The data at each depth appear reasonably symmetric, and we proceed with analysis under the assumption that clay content is a Gaussian random variable. The right-hand panel of this figure shows all of the data plotted with the lengths of lines indicating the sampling intervals. This shows the range of sampling intervals in the dataset, and the increasing trend in clay content down the profile. Thicker bars occur where multiple data are very similar.

The aim of this study is to model the clay content data in all samples and map it over the study area for any required depth interval of interest. As detailed previously, we choose the three depths of 0–10 cm, 50–60 cm, and 90–100 cm for mapping, to illustrate differences in the spatial distribution of clay content down the soil profile. For modelling and mapping, we utilize spatial covariate data on 29 covariates, as listed in Table 1. For further details of these covariates, we refer to Bishop et al. (2015). We acknowledge here that this dataset of just 55 spatial locations does not provide the sternest test of the methodology. At this stage, the aim of the case study is more to illustrate the potential of the methodology to deal with a dataset of various sampled depth


the horizontal and vertical domains.

4.2. Validation and mapping

Validation is also an important part of any geostatistical analysis. The support at which validation is carried out should ideally represent the support on which predictions are ultimately required (Bishop et al., 2015). For us, the required support is point support in the horizontal domain and interval supports of 0–10 cm, 50–60 cm, and 90–100 cm vertically. However, since our data were sampled based on soil horizons, the depth intervals are irregular between spatial locations; indeed only six, one and three of the 55 locations produced data directly for the three mapping depth intervals, respectively. Furthermore, the data are clustered in horizontal space, which provides another complication for calculating validation statistics (Brus et al., 2011). Therefore, validation (either by data-splitting or by a full cross-validation exercise) using our dataset (of clustered data from a non-probability sampling design) would not answer the specific questions relevant to our case study. Brus et al. (2011) recommended that when calibration data are a non-probability sample, validation of digital soil maps should be through an additional probability sample, which would be useful in our case to calculate model-free estimates of validation statistics for specific depth intervals of interest. However, without such an independent validation set, we consider the following data-splitting exercise, and stress that this serves more as an illustration than a full validation exercise.

We split the data into estimation and validation data as follows. We remove five of the profiles, one selected randomly from each of five subregions defined by a k-means clustering on the spatial coordinates of the entire study area. A stratified random sample was preferred

Fig. 2. Histogram plots (left) of the clay content data in the upper (sample midpoint $d \le 0.15$ m), middle ($0.15$ m $< d \le 0.5$ m), and lower ($d > 0.5$ m) soil profiles. The right-hand plot shows all data plotted with vertical lines representing the sample depth intervals.


Table 1

Summary information of the available covariates.

Category

Spatial

support/scale

Source

Spatial coordinates

Digital terrain attributes

Elevation, slope, aspect, plan curvature, profile curvature, wetness index, altitude above channel network, length-slope factor, multi-resolution valley bottom flatness index, multi-resolution ridge-top flatness index, topographic position index (11 variables)

Potassium (K, %), thorium (Th, ppm), uranium (U, ppm), ratio Th:K, ratio U:K, ratio U:Th, ratio U2:Th, dose rate (terrestrial sources of radiation), total dose (terrestrial and cosmic sources of radiation), weathering intensity index (10 variables)

Three categories (cropping/grazing/other)

Three categories (felsic/mafic/other)

n/a

90-m raster

n/a

NASA

105-m raster

1:250 000 polygons

ABARES

Geoscience Australia

Radiometrics

Land use

Geology

data over the study area. The locations of the five removed profiles are shown by the open circles in Fig. 1; the remaining 50 profiles (with a total of 174 horizons) are used as estimation data, and shown as the asterisks in Fig. 1. As can be seen, four of the validation locations fall very close to locations in the estimation dataset (between 15 m and 50 m from their nearest estimation data locations). The other location was on the edge of the study area, 1.7 km from its nearest neighbour.
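The data split described above can be sketched in a few lines of Python. This is a minimal illustration only, not the authors' implementation: the function names (`kmeans_centres`, `stratified_holdout`), the plain Lloyd's-algorithm k-means, the fixed seeds and the handling of empty subregions are all assumptions made for the example.

```python
import math
import random


def kmeans_centres(points, k, n_iter=50, seed=0):
    """Lloyd's algorithm on 2-D points; returns k cluster centres."""
    rng = random.Random(seed)
    centres = rng.sample(points, k)
    for _ in range(n_iter):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: math.dist(p, centres[c]))
            groups[nearest].append(p)
        # Move each centre to the mean of its members; keep it if the group is empty.
        centres = [
            (sum(x for x, _ in g) / len(g), sum(y for _, y in g) / len(g)) if g else centres[c]
            for c, g in enumerate(groups)
        ]
    return centres


def stratified_holdout(area_points, profile_xy, k=5, seed=0):
    """Pick at most one profile index at random from each k-means subregion."""
    centres = kmeans_centres(area_points, k, seed=seed)
    strata = [[] for _ in range(k)]
    for i, p in enumerate(profile_xy):
        nearest = min(range(k), key=lambda c: math.dist(p, centres[c]))
        strata[nearest].append(i)
    rng = random.Random(seed + 1)
    # Subregions that contain no profile contribute nothing (an assumption of this sketch).
    return sorted(rng.choice(members) for members in strata if members)
```

Here `area_points` stands for coordinates covering the study area (for example, raster cell centres) and `profile_xy` for the profile locations; the returned indices would form the validation set and all remaining profiles the estimation set.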

To provide a numerical evaluation of errors for each predicted profile, we calculate a root mean squared error (RMSE) for each of the five validation profiles, with contributions weighted according to the thickness of each sampled depth interval:

$$\mathrm{RMSE}_i = \sqrt{\frac{1}{l_{in_i} - u_{i1}} \sum_{j=1}^{n_i} \left(l_{ij} - u_{ij}\right)\left(\hat{y}_{ij} - y_{ij}\right)^2}, \tag{17}$$

model was best, with orders r0 = 2 and r1 = 1 selected for the nugget and spatial standard deviations, respectively (the black and grey dotted lines in Fig. 3). Both components were smallest at the top of the profile. The nugget, represented by a quadratic function of depth, reached a maximum at around 60 cm before decreasing, whilst the spatial standard deviation continued to increase down the profile. The selected Gaussian model gave a smaller AIC than the three pure nugget models, indicating that there is some spatial correlation in the residuals from the fitted trend model. We work with the Gaussian spatial correlation model with r0 = 2 and r1 = 1 for IAK herein.

5.3. Fixed-effect model selection

5. Results

Wald tests were used to remove predictors from the full design matrix that did not contain useful information for predicting the target variable. The procedure presented in Section 3.2 resulted in a design matrix with 21 columns (reduced from 69), based on 8 different spatial variables (4 digital terrain model variables, 3 radiometrics variables, and the geological classes, Fig. 4). We report the fitted fixed-effect model, applied for three depth intervals: 0–10 cm, 50–60 cm and 90–100 cm (Table 3a). To apply the model for depth interval [u, l], where the model contains the three terms, β1 elev, β2 d elev and β3 d² elev (where elev is the elevation), for example, we present the coefficient

where $\hat{y}_{ij}$ is the prediction of $y_{ij}$ (the validation datum for the jth of the $n_i$ layers of validation profile i), $u_{ij}$ and $l_{ij}$ are the bounds of its sampled depth interval, $u_{i1}$ is the upper bound of the top layer for profile i (0 in each case), and $l_{in_i}$ is the lower bound of the bottom layer for profile i.
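As a small illustration of the thickness-weighted RMSE defined above, the following Python sketch computes it for one profile. The function name and argument layout are hypothetical, and the layers are assumed to be contiguous and ordered from top to bottom, so that the total depth equals the sum of the layer thicknesses.

```python
import math


def profile_rmse(upper, lower, observed, predicted):
    """Thickness-weighted RMSE for one validation profile.

    upper, lower: layer bounds (same units), ordered from the top layer down.
    observed, predicted: layer-average values for the same layers.
    """
    total_depth = lower[-1] - upper[0]
    weighted_squares = sum(
        (l - u) * (p - y) ** 2
        for u, l, y, p in zip(upper, lower, observed, predicted)
    )
    return math.sqrt(weighted_squares / total_depth)
```

For example, a profile with layers 0–0.1 m, 0.1–0.4 m and 0.4–1.0 m and prediction errors of 2, 0 and −3 percentage points gives a weighted mean squared error of 0.1 × 4 + 0.3 × 0 + 0.6 × 9 = 5.8, so the thick bottom layer dominates the result.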

The full pool of spatial covariates (Table 1) consisted of 23 continuous variables (horizontal coordinates, DTM-derived and radiometrics variables) and two categorical variables (land use and geology, both with three classes). Highly-correlated covariates were removed according to the algorithm detailed in Section 3.2. This resulted in five of the 23 continuous covariates being removed from the predictor pool (total dose, Th and U from the radiometrics variables because of high correlations with dose rate; ratio U:Th because of its high correlation with ratio U2:Th; length-slope factor due to its correlation with slope). A full design matrix was formulated for IAK based on interactions between these spatial covariates and both depth, d, and d². (Recall that interactions between the spatial covariates themselves were not considered.) This matrix contained 174 rows (for the 50 spatial locations with data for three to six horizons at each) and 69 columns: 22 (18 columns for continuous predictors + 4 columns for categorical predictors) multiplied by 3 (no interaction, interaction with d, interaction with d²) plus 3 (terms for the constant, d and d²).
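The column count can be checked with a short sketch that builds one row of such a design matrix. This is an illustrative assumption about the layout rather than the authors' code: the function name and column ordering are hypothetical, and the depth terms are replaced by their averages over the sampled interval [u, l], namely (u + l)/2 for d and (u² + ul + l²)/3 for d².

```python
def design_row(covariates, upper, lower):
    """One design-matrix row for a sample averaged over the depth interval [upper, lower].

    covariates: the 22 spatial predictor columns at this location
    (continuous values plus dummy variables for the categorical classes).
    """
    mean_d = (upper + lower) / 2.0                          # average of d over the interval
    mean_d2 = (upper**2 + upper * lower + lower**2) / 3.0   # average of d^2 over the interval
    row = [1.0, mean_d, mean_d2]                            # constant, d and d^2 terms
    for x in covariates:
        row.extend([x, x * mean_d, x * mean_d2])            # no interaction, x*d, x*d^2
    return row
```

With 22 predictor columns the row has 3 + 22 × 3 = 69 entries, matching the count given above.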

5.2. Spatial covariance model selection

Twenty-one different covariance models were fitted using the full design matrices to give fixed effects. Exponential and Gaussian spatial covariance models were compared through their AICs, with different-order polynomials of d used to give the nugget and spatial standard deviations (Table 2). A pure nugget model, representing no spatial correlation, was also compared. The results suggest that the Gaussian

(u² + ul + l²)/3). The changing coefficients demonstrate the ability of the model to represent different relationships at different depths. For instance, the elevation had a small positive effect on clay content in the topsoil, but a larger negative impact lower down the profile.
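The interval-specific coefficient follows from averaging the depth-dependent coefficient over the target interval. A short derivation is sketched here, in which β1, β2 and β3 are assumed labels for the coefficients of elev, d·elev and d²·elev:

$$\frac{1}{l-u}\int_u^l \left(\beta_1 + \beta_2 d + \beta_3 d^2\right)\mathrm{d}d = \beta_1 + \beta_2\,\frac{l^2-u^2}{2(l-u)} + \beta_3\,\frac{l^3-u^3}{3(l-u)} = \beta_1 + \beta_2\,\frac{u+l}{2} + \beta_3\,\frac{u^2+ul+l^2}{3}.$$

For the 0.5–0.6 m interval, for instance, the multipliers of β2 and β3 are (0.5 + 0.6)/2 = 0.55 and (0.25 + 0.30 + 0.36)/3 ≈ 0.303.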

The coefficients presented in Table 3a were calculated with input variables defined on their original scales. To allow some comparison of the effects of each variable, we re-estimate the fixed-effect function with the continuous input variables standardized to have means of 0 and standard deviations of 0.5 (Table 3b). The three variables deemed to be most important at each depth interval are highlighted, suggesting that geology and weathering intensity index are important in the topsoil, whilst geology, potassium and the radiometrics dose rate are important for mapping in the 0.9–1.0 m depth.
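Standardizing a continuous input to mean 0 and standard deviation 0.5 amounts to centring it and dividing by two standard deviations. A minimal sketch follows; the function name is hypothetical and the use of the sample (n − 1) standard deviation is an assumption.

```python
import statistics


def standardize_half_sd(values):
    """Centre a continuous covariate and divide by two standard deviations,
    so that the result has mean 0 and standard deviation 0.5."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)  # sample standard deviation (assumed)
    return [(v - mean) / (2.0 * sd) for v in values]
```

A coefficient fitted on this scale then describes the change in the response for a two-standard-deviation change in the covariate, which makes it roughly comparable with the coefficient of a binary indicator such as a geology class.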

Table 2
AICs of the 21 tested covariance models; r1 is the order of the polynomial used to model the square root of the spatial variance, r0 is the order for the square root of the nugget. Selected model is shown in bold type.

| Nugget r0 | Pure nugget | Exponential, r1 = 0 | Exponential, r1 = 1 | Exponential, r1 = 2 | Gaussian, r1 = 0 | Gaussian, r1 = 1 | Gaussian, r1 = 2 |
|---|---|---|---|---|---|---|---|
| 0 | 691.9 | 680.5 | 664.4 | 661.4 | 679.2 | 664.5 | 661.2 |
| 1 | 666.1 | 665.7 | 658.1 | 657.9 | 661.7 | 657.0 | 657.8 |
| 2 | 662.6 | 664.3 | 656.6 | 657.8 | 660.1 | **654.2** | 655.8 |


The residuals from the IAK fixed-effect function were modelled with a Gaussian correlation model (with effective range of correlation 12 km) with nugget effect. The variance parameters of this model depended on depth, as shown in Fig. 3 (the black and grey solid lines, for the nugget and spatial standard deviations, respectively); the functions are very similar to those fitted based on the full fixed-effect design matrix. The increasing standard deviations down the profile reflect larger uncertainty with depth. The vertical correlation model had an effective range of 70 cm.

5.4. Validation

Fig. 3. Fitted functions for the nugget (f0(d), black lines) and spatial (f1(d), grey lines) standard deviations (see Eqs. (6) and (7)). Dotted lines are the functions fitted with the full fixed-effect design matrix; solid lines are fitted after removal of insignificant fixed effects.

are shown in Fig. 5. The shapes of the continuous predicted profiles (on 1-cm average prediction support) show reasonable agreement with the validation data (shaded horizontal bars). The plots also show the IAK predictions at the support of the validation data (vertical solid black lines), and their associated 95% prediction intervals (vertical dotted lines). The RMSE was smallest for locations 1 and 4 and largest for location 5, the validation location that was the farthest from its nearest estimation datapoint. The 95% prediction intervals for 12 of the 18 validation horizons (67%) captured the true clay content (ideally this proportion would be close to 95%). However, with just five validation profiles, numerical measures of the adequacy of predictions and their uncertainty assessments should not be over-interpreted. Usually one would expect smaller prediction variances when the target depth interval is wide. However, this effect is not apparent in our prediction intervals, since we allowed variance parameters to be functions of depth, which had a larger effect on prediction variances in our study.
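The coverage statistic quoted above can be computed as in the following sketch. It assumes Gaussian prediction errors, so that a 95% interval is the prediction plus or minus about 1.96 prediction standard deviations; the function names are hypothetical.

```python
from statistics import NormalDist


def prediction_interval(prediction, variance, level=0.95):
    """Symmetric Gaussian prediction interval for one validation horizon."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    half_width = z * variance ** 0.5
    return prediction - half_width, prediction + half_width


def coverage(observed, predictions, variances, level=0.95):
    """Proportion of observations that fall inside their prediction intervals."""
    hits = 0
    for y, p, v in zip(observed, predictions, variances):
        lower, upper = prediction_interval(p, v, level)
        hits += lower <= y <= upper
    return hits / len(observed)
```

With 12 of 18 horizons inside their intervals, `coverage` returns 12/18 ≈ 0.67, the figure reported for this case study.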

Fig. 5 also illustrates the coherency of the IAK predictions. That is, the

area to the left-hand side of the continuous prediction line is equal to

the area to the left of each vertical black bar in Fig. 5. Coherence (also

known as the pycnophylactic property) of ATP-kriging point predictions

with areal data is a property of the methodology (Kyriakidis, 2004),

Fig. 4. The eight selected covariates. aacn: altitude above channel network, mrrtf: multi-resolution ridge-top flatness index, wndx: weathering intensity index.


Table 3
Coefficients and standardized coefficients of the fitted fixed-effect function, presented for three illustrative depths. The three largest standardized coefficients are highlighted for each depth.

Depth, m

Int

(a)

0–0.1

0.5–0.6

0.9–1.0

50.4

3.62

0.0463

67.5

21.5

0.666

127

35.7

1.24

dosef

wndx

(b)

0–0.1

0.5–0.6

0.9–1.0

26.4

4.16

1.78

9.07

52.2

24.7

25.7

9.07

58.6

41.1

47.6

9.07

5.87

5.87

5.87

elev

mrrtf

0.0121

0.0193

0.207

1.81

2.89

31.0

0.608

4.64

8.84

1.08

8.22

15.6

aacn

0.0112

0.365

0.666

slope

g1

g2

46.1

132

275

5.92

30.6

50.3

14.5

20.0

24.3

5.92

30.6

50.3

14.5

20.0

24.3

0.373

12.2

22.3

3.41

9.79

20.4

Int: intercept, mean for reference geological class, other; K: potassium; dosef: dose rate; wndx: weathering intensity index; elev: elevation; mrrtf: multi-resolution ridge-top flatness index; aacn: altitude above channel network; slope: slope; g1: mean for felsic geological class in comparison to reference class, other; g2: mean for mafic geological class in comparison to reference class, other.

which guarantees the agreement between predictions at different supports demonstrated by the IAK approach in this study. This coherence

would seem to be a desirable property of a method that is to be

employed for prediction over various scales (in terms of the widths

of target interval).
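The coherence property can be checked numerically: averaging the 1-cm predictions that fall within an interval should reproduce the prediction made directly at the interval support. The sketch below is illustrative only; the function names, the whole-centimetre bounds and the tolerance are assumptions.

```python
def interval_mean(fine_predictions, upper_cm, lower_cm):
    """Average consecutive 1-cm predictions over the interval [upper_cm, lower_cm)."""
    cells = fine_predictions[upper_cm:lower_cm]
    return sum(cells) / len(cells)


def is_coherent(fine_predictions, upper_cm, lower_cm, interval_prediction, tol=1e-6):
    """True if the aggregated 1-cm predictions match the interval-support prediction."""
    aggregated = interval_mean(fine_predictions, upper_cm, lower_cm)
    return abs(aggregated - interval_prediction) <= tol
```

Here `fine_predictions[k]` is taken to be the predicted average for the depth slice from k cm to k + 1 cm.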

(e.g. spline-then-krige, STK; Malone et al., 2011a) that fail to account

for the uncertainty in the interval-sampled data in the formulation of

prediction models.

6.1. Variance with depth

5.5. Mapping

Fig. 6 (left) shows maps of the predicted clay contents at depths of 0–10 cm, 50–60 cm and 90–100 cm, and Fig. 6 (right) shows the associated widths of the 95% prediction intervals. For presentation, values less than 0% or greater than 100% were truncated to these limits. The topsoil map is relatively homogeneous for large parts of the study area. It was suggested (Table 3) that geology and weathering intensity index were the most important predictors for defining the spatial distribution in the 0–10-cm depth interval, and the effects of the geology can clearly be seen in the resultant map. Lower in the profile, the predicted clay contents are more variable, with more extreme predictions; geology, potassium and the radiometrics dose rate were suggested (Table 3) as being the most important for the 90–100-cm depth. The maps of prediction interval widths show the smallest uncertainties in the topsoil and the largest uncertainties lower in the profile.

6. Discussion

We have presented a framework for analysis of soil profile data, in which averages of the soil property over depth intervals (horizons or fixed depth intervals), rather than observations at exact points in the profile, are measured. The increment-averaged kriging (IAK) approach, built on the methodology of area-to-point kriging (Kyriakidis, 2004), accounts properly for the vertical support of the data. This is in contrast to approaches built on assuming samples were collected from interval midpoints that ignore the thicknesses of

depend on depth. By doing this, different patterns of spatial variability in the residuals could be modelled at different depths in the soil profile. This led to larger variances, with a greater proportion of spatial variance, being fitted for the horizontal variability deeper in the soil profile. Malone et al. (2011b) validated predictions and prediction intervals from a spline-then-krige (STK) approach, in which prediction uncertainty was estimated empirically from the residuals (after model fitting) by a fuzzy k-means approach. Their results suggested underestimation of variance for deeper soils. A possible reason for this underestimation is the generally wider sampling intervals at depth, meaning that the imputed data in an STK approach should convey less information about the soil property in the 90–100 cm interval than a true measurement of the 90–100 cm interval. This uncertainty in the values extracted from the spline is not propagated into model fitting in an STK approach. The IAK approach presented here could combat this, and it would be interesting to compare results in a more extensive validation exercise.

6.2. Extensions of the covariance model

In contrast to the variance parameters, the range of spatial correlation was assumed constant for all depths. Haskard and Lark (2009)

present a spectral tempering approach, which allows the range of a

covariance model to depend on covariates. This could be applied to extend the covariance modelling approach used here so that the range can

depend on depth. Also, we used a separable covariance model to model

Fig. 5. Validation data (shaded bars) and predictions for the five validation locations (left to right). Continuous lines show predictions of 1-cm averages. Solid and dashed vertical lines show predictions and 95% prediction intervals, respectively, at the support of the validation data.


Fig. 6. Predictions (left) and widths of 95% prediction intervals (right) for clay content (%) at depths 0–10 cm (top), 50–60 cm (middle), and 90–100 cm (bottom).

distance. This simple model provided a reasonable starting point, but

has been criticised for being based on unrealistic assumptions when

this assumption are possible; for instance, product–sum covariance

models have been suggested as a flexible generalization for modelling


sum-metric model (e.g. Heuvelink and Griffith, 2010), although implementation of this model for increment-averaged data might require numerical rather than analytical calculation of average covariances, which

could prove challenging for large datasets. Nonetheless, such extensions

could be implemented and may improve covariance modelling in our

context.

We assumed that the vertical component of the covariance model had zero nugget variance, since it has no effect on the likelihood of increment-averaged data (all white-noise variation in the vertical dimension is averaged out). However, it is likely that there is variation in the soil property occurring over finer scales than the sampling interval widths, and the data contain no information to model this. Truong et al. (2014) suggested that expert opinion could be used to define a nugget variance in such situations, and this could be used if predictions are required on a point support. In our case predictions are only required on a block (or increment average) support, therefore the nugget variance is also averaged out of the prediction variance. Nonetheless, similar ideas might be used to define some short-range spatially-correlated component of variation (perhaps by assuming an exponential model for this short-range component with a range of 5 cm, and eliciting information about the expected difference between values of the soil property at 5-cm intervals). Prediction variances might be sensitive to the amount of short-range variation.

In this study we used a multiple linear regression model to give the trend, allowing interactions between the spatial covariates and depth (and depth squared) so that different spatial trends would be modelled at different depths. The use of a linear model of the covariates in the spatial domain could be a limitation. Incorporation of interactions between the spatial covariates provides one possible remedy to this. Other non-linear terms in d could be included if, for example, an exponential change in a soil property was expected down the soil profile, although some trend parameters will then have to be fitted numerically, along with the covariance parameters, by maximum likelihood (rather than REML). Alternatively, machine-learning techniques (e.g. artificial neural networks, random forests and other regression tree methods) have demonstrated the ability to model non-linear relationships between predictors and target variables. A common approach is to use these techniques in a two-stage methodology, first fitting the trend model with the machine-learning technique assuming independence of model residuals, and second performing a spatial analysis of the residuals and kriging these (e.g. Malone et al., 2009; Lacarce et al., 2012; Martin et al., 2014). It is possible that the trend-fitting step will overfit, because of the assumption of independence made in this stage. To combat this, regression tree methods could be implemented as more of a one-stage approach by performing the model fitting whilst accounting for spatial correlation of residuals. For instance, the output of a regression tree analysis is a collection of splits of the covariates, and the predicted value for each branch end (or terminal node) of the tree. The collection of covariate splits effectively defines a design matrix for the data (and for predictions at unsampled locations), so that the means for the terminal nodes (the vector β̂) can be refitted by REML in the IAK framework. By adopting this approach, the trend will be (partly) fitted whilst accounting for both the spatial correlation in the residuals and the varying uncertainties arising from the different sampled depth intervals. However, this is beyond the scope of the current work.
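To illustrate how a fitted tree defines a design matrix, the sketch below maps each sample to its terminal node and writes a row of indicator values. The nested-dictionary tree representation and the function names are hypothetical choices for the example, and only splits on spatial covariates are handled; a split on depth would require the fraction of each sampled interval falling in each node instead of a 0/1 indicator.

```python
def leaf_index(sample, node):
    """Follow the splits of a fitted tree until a terminal node is reached."""
    while "leaf" not in node:
        go_left = sample[node["covariate"]] <= node["threshold"]
        node = node["left"] if go_left else node["right"]
    return node["leaf"]


def leaf_design_matrix(samples, tree, n_leaves):
    """Indicator design matrix: one row per sample, one column per terminal node."""
    rows = []
    for sample in samples:
        row = [0.0] * n_leaves
        row[leaf_index(sample, tree)] = 1.0
        rows.append(row)
    return rows
```

The coefficients attached to these columns are the terminal-node means, which could then be re-estimated alongside the covariance parameters rather than taken from the tree-fitting stage.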

An additional advantage of regression-tree approaches is the

discretization of covariates into classes, so that extrapolation to extreme

values of the covariates is not an issue. In this work, in which measured

clay contents ranged from 8 to 83%, many predictions fell outside this

described above, would deal with this, as would curtailing the covariates to their ranges observed in the estimation dataset.

6.4. Transformations

For modelling soil-texture variables, an additional criterion comes into play: that each compositional variable (a percentage) must be between 0 and 100%, and if modelling sand, silt and clay contents simultaneously, their sum must be 100%. Lark and Bishop (2007) suggested the additive log-ratio transform as an appropriate variable for analysis of such soil texture data. This consists of analysing the two transformed variables, y1 = ln(clay/sand) and y2 = ln(silt/sand), as Gaussian variables, before back-transforming predictions of these two variables; predicted values for the three fractions are then guaranteed to be between 0 and 100% and sum to 100%. To consider such an analysis within the framework of the IAK approach, we would have to consider the scale of averaging. For instance, we assumed in this study that the clay content for an interval [u, l] represented the arithmetic mean of point values within this interval. This is no longer a valid assumption for the transformed variables, y1 and y2. The same is true for log-transformed variables. Orton et al. (in press) considered the effects of the scale of averaging on composite-sampled soil data (i.e. samples that are formed by aggregating a number of basic soil aliquots, before these composite samples are measured). This approach could be combined with the IAK approach presented here to deal with lognormal, or other transformed variables, and is something that we plan to investigate in further work.
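The transform and its inverse can be sketched as follows; the function names are hypothetical, and all three fractions are assumed to be strictly positive percentages.

```python
import math


def alr(clay, silt, sand):
    """Additive log-ratio transform of a three-part texture composition."""
    return math.log(clay / sand), math.log(silt / sand)


def alr_inverse(y1, y2):
    """Back-transform to clay, silt and sand percentages that sum to 100."""
    denominator = math.exp(y1) + math.exp(y2) + 1.0
    clay = 100.0 * math.exp(y1) / denominator
    silt = 100.0 * math.exp(y2) / denominator
    sand = 100.0 / denominator
    return clay, silt, sand
```

The averaging issue is easy to see with two equally thick layers of composition (10, 20, 70) and (50, 20, 30): the mean of their y1 values is about −0.72, whereas y1 of the mean composition (30, 20, 50) is about −0.51. The transform of an interval average is therefore not the interval average of the transformed point values.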

Adhikari et al. (2013) considered mapping soil texture in Denmark using a STK approach, in which they expected more or less uniform soil texture in the top ploughed layer of agricultural soils. To deal with this, they introduced artificial data at the top and bottom of the ploughed layer, both of thickness 1 cm and with the same measurement as the ploughed layer. This forced the splines of soil texture variables to be constant in the top layer, and only change below the second 1-cm datum. A similar approach could be applied with the IAK method presented here. For example, to deal with texture-contrast soils (which are common in many parts of Australia) in which there is an abrupt change between the A and B horizons, 1-cm thick data could be imputed at the bottom of the A and top of the B horizons. This will have the effect of forcing the profile to have an abrupt change, which should be damped to some degree as prediction locations move further from this data location.

An alternative solution was proposed by Kempen et al. (2011),

who dealt with the issue by constructing piecewise depth functions;

the parameters of these functions were interpolated and then applied to model the depth function at unsampled locations. Such a

piecewise model might be incorporated in the trend component of

our framework, perhaps allowing the horizon thicknesses to be predicted as functions of environmental covariates. This would give a

non-linear trend function, therefore its parameters would have to

be estimated by maximum likelihood (rather than REML) using a numerical method. This seems a more elegant solution than the insertion of artificial data, but may also be more computationally

demanding due to the number of non-linear parameters. The general

effectiveness of these approaches for dealing with texture-contrast

soils warrants further investigation.

6.6. Use of the method as an alternative imputation approach

Finally, although the main message of this paper has been that we

want to avoid multi-stage procedures, we thought it would be of interest to note that the IAK method also has potential to be used for

be done by fitting parameters of a pure-nugget covariance model with correlation only in the vertical domain for all of the available soil-profile data. This requires at least two parameters to be fitted based on all soil-profile data: a variance parameter vector of length at least one (modelling the effect of depth on standard deviation), and the distance parameter for the vertical correlation, a_d. In contrast, to utilize the spline approach requires just one parameter to be fitted to all profile data, the smoothness parameter, λ. Predictions over increments within sampled soil profiles are then very similar to those of equal-area splines. One benefit of IAK imputation is its natural assessment of uncertainty in the imputed values, which offers the opportunity to propagate this uncertainty into model fitting, if the model-fitting methodology allows. For instance, when modelling soil carbon stocks, we must consider the bulk densities. Missing bulk density data in parts of the profile can lead to problems, but having a reliable means of imputing these values, whilst accounting for their uncertainty, would be advantageous. Clifford et al. (2014) considered a non-parametric simulation approach for imputing missing values in soil profiles, whilst accounting for uncertainty. The work presented here provides an alternative in the framework of the linear mixed model.

and dene:

0

Si I ; I0 ; i ; d ad Hlu0 H l u

0

P 5 min l ; l ; i i P 5 maxu ; u0 ; i i

"

( 0 )

l l

0

0

2

ad P 2 max l ; l ; i P 2 min l ; l ; i exp

ad

0

ju uj

P 2 maxu ; u0 ; i P 2 minu ; u0 ; i exp

ad

( 0

)

l

u

0

0

P 2 max u ; l ; i P 2 min u ; l ; i exp

ad

#

0

ju lj

0

0

P 2 maxl ; u ; i P 2 minl ; u ; i exp

ad

i 4

data sampled over various depth increments. This allows all data to

be incorporated into a single statistical analysis, whilst properly

accounting for their differing uncertainties. A number of extensions

have been suggested, and we believe the work constitutes an

interesting avenue of research as an alternative procedure for

multi-depth soil mapping to the commonly applied multi-stage

(spline-then-krige) procedures.

A2

Nm spatial variance functions (Nm = 1 in our case study). The func(

1

if z 0

tion, Hz

is the Heaviside function. The param0

otherwise

2

7. Conclusions


3

i0 i1 ad 2 i2 a2d

5

i1 2 i2 ad

i2

3

i0 i1 ad 2 i2 a2d

5

i 4

i1 2 i2 ad

i2

A3

2

2

6

6

6

i i 6

6

6

4

Acknowledgements

A4

i0

i0 i0

i0

i0 i1 i1 i0 =2 i0 i1 i1 i0 =2

i0 i2 i1 i1 i2 i0 =3 i0 i2 i1 i1 i2 i0 =3

i1 i2 i2 i1 =4 i1 i2 i2 i1 =4

i2 i2 =5 i2 i2 =5

3

7

7

7

7

7

7

5

A5

Award (APA) for International Postgraduate Research Scholarship (IPRS) recipients, funded by the Commonwealth Department of Innovation, Industry, Science and Research (DIISR). We would like to acknowledge the NSW Department of Agriculture for funding to support the field work to collect the soil samples used in this study. We would also like to thank staff and students at the University of Sydney who assisted with the field and lab work for the dataset presented in this work, in particular Ana Horta, Farzina Akter and Dipangkar Kundu.

Cov Y x; I ; Y x0 ; I 0

8

Nm

X

>

1

>

>

Si I ; I 0 ; i ; d

>

< lu l0 u0

>

>

>

>

:

if hx 0

i0

Nm

X

otherwise

1

0

0

S

I

;

I

;

h

;

x

i

i

x;i

d

x;i

lu l u0 i1

A6

Supplementary material

Appendix A

Here we present an expression for the average covariance for Y(x, I) and Y(x′, I′), where I = [u, l] is the depth interval (of an observation or prediction) at horizontal (point) location x, and I′ = [u′, l′] is the depth interval at x′. We assume that the point covariances, i.e. for Y(x, d) and Y(x′, d′), are given by Eq. (8), with order-2 polynomial functions (i.e. quadratic equations) of depth for the standard deviations associated with the nugget and spatial variances (see Eqs. (6) and (7)). We also assume that ρ_d(h_d; φ_d) is the exponential correlation function:

$$\rho_d(h_d; \phi_d) = \rho_d(h_d; a_d) = \exp\left(-\frac{h_d}{a_d}\right). \tag{A1}$$
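As a cross-check on analytical expressions of this kind, the average covariance between two depth intervals can be approximated numerically. The sketch below is not the authors' implementation: it assumes a single variance component at one horizontal location, a polynomial standard deviation function of depth with coefficients `sd_coeffs`, the exponential vertical correlation above, and a simple midpoint rule with `n` cells per interval. All names are hypothetical.

```python
import math


def average_covariance(u, l, u2, l2, sd_coeffs, a_d, n=200):
    """Midpoint-rule average of sd(d) * sd(d') * exp(-|d - d'| / a_d)
    over d in [u, l] and d' in [u2, l2]."""
    def sd(depth):
        # Polynomial standard deviation: sd_coeffs[0] + sd_coeffs[1] * d + ...
        return sum(c * depth ** k for k, c in enumerate(sd_coeffs))

    step1 = (l - u) / n
    step2 = (l2 - u2) / n
    depths1 = [u + (i + 0.5) * step1 for i in range(n)]
    depths2 = [u2 + (j + 0.5) * step2 for j in range(n)]
    sds1 = [sd(d) for d in depths1]
    sds2 = [sd(d) for d in depths2]
    total = 0.0
    for d, s1 in zip(depths1, sds1):
        for d_prime, s2 in zip(depths2, sds2):
            total += s1 * s2 * math.exp(-abs(d - d_prime) / a_d)
    # Dividing by the number of cell pairs gives the double average over both intervals.
    return total / (n * n)
```

For a constant unit standard deviation and identical intervals of length L, the exact average is 2a_d/L − 2(a_d/L)²(1 − exp(−L/a_d)), which provides a simple test of the numerical routine.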

doi.org/10.1016/j.geoderma.2015.08.013.

References

Adhikari, K., Kheir, R.B., Greve, M.B., Bocher, P.K., Malone, B.P., Minasny, B., McBratney, A.B., Greve, M.H., 2013. High-resolution 3-D mapping of soil texture in Denmark. Soil Sci. Soc. Am. J. 77, 860–876.

Akaike, H., 1974. A new look at the statistical model identification. IEEE Trans. Autom. Control 19, 716–723.

Arrouays, D., McBratney, A.B., Minasny, B., Hempel, J.W., Heuvelink, G.B.M., MacMillan, R.A., Hartemink, A.E., Lagacherie, P., McKenzie, N.J., 2014. The GlobalSoilMap project specifications. In: Arrouays, D., McKenzie, N.J., Hempel, J.W., Richer de Forges, A.C., McBratney, A.B. (Eds.), GlobalSoilMap: basis of the global spatial soil information system.

Bishop, T.F.A., McBratney, A.B., Laslett, G.M., 1999. Modelling soil attribute depth functions with equal-area quadratic smoothing splines. Geoderma 91, 27–45.

Bishop, T.F.A., Horta, A., Karunaratne, S.B., 2015. Validation of digital soil maps at different spatial supports. Geoderma 241–242, 238–249.


Breidt, F.J., Hsu, N.-J., Ogle, S., 2007. Semiparametric mixed models for incrementaveraged data with application to carbon sequestration in agricultural soils. J. Am.

Stat. Assoc. 102, 803812.

Brus, D.J., Kempen, B., Heuvelink, G.B.M., 2011. Sampling for validation of digital soil

maps. Eur. J. Soil Sci. 62, 394407.

Clifford, D., Dobbie, M.J., Searle, R., 2014. Non-parametric imputation of properties for soil

proles with sparse observations. Geoderma 232, 1018.

De Iaco, S., Myers, D.E., Posa, D., 2001. Spacetime analysis using a general productsum

model. Stat. Probab. Lett. 52, 2128.

De Iaco, S., Myers, D.E., Posa, D., 2011. Strict positive deniteness of a product of covariance functions. Commun. Stat. 40, 44004408.

Department of Environment and Climate Change NSW, 2009. Salinity Audit: Upland

catchments of the New South Wales MurrayDarling Basin Available at: www.

environment.nsw.gov.au/resources/salinity/09153SalinityAudit.pdf.

Eidsvik, J., Shaby, B.A., Reich, B.J., Wheeler, M., Niemi, J., 2014. Estimation and prediction in

spatial models with block composite likelihoods. J. Comput. Graph. Stat. 23, 295315.

Gelman, A., 2008. Scaling regression inputs by dividing by two standard deviations. Stat.

Med. 27, 28652873.

Goovaerts, P., 2008. Kriging and semivariogram deconvolution in the presence of irregular

geographical units. Math. Geosci. 40, 101128.

Haskard, K.A., Lark, R.M., 2009. Modelling non-stationary variance of soil properties by

tempering an empirical spectrum. Geoderma 153, 1828.

Hengl, T., de Jesus, J.M., MacMillan, R.A., Batjes, N.H., Heuvelink, G.B.M., Ribeiro, E.,

Samuel-Rosa, A., Kempen, B., Leenaars, J.G.B., Walsh, M.G., Gonzalez, M.R., 2014.

SoilGrids1km global soil information based on automated mapping. PLoS ONE 9,

e105992.

Heuvelink, G.B.M., 2014. Uncertainty quantication of GlobalSoilMap products. In:

GlobalSoilMap: basis of the global spatial soil information system. In: Arrouays, D.,

McKenzie, N.J., Hempel, J.W., Richer de Forges, A.C., McBratney, A.B. (Eds.).

Heuvelink, G.B.M., Grifth, D.A., 2010. Spacetime geostatistics for geography: a case

study of radiation monitoring across parts of Germany. Geogr. Anal. 42, 161179.

Kempen, B., Brus, D.J., Stoorvogel, J.J., 2011. Three-dimensional mapping of soil organic

matter content using soil type-specic depth functions. Geoderma 162, 107123.

Kerry, R., Goovaerts, P., Rawlins, B.G., Marchant, B.P., 2012. Disaggregation of legacy soil

data using area to point kriging for mapping soil organic carbon at the regional

scale. Geoderma 170, 347358.

Kyriakidis, P.C., 2004. A geostatistical framework for area-to-point spatial interpolation.

Geogr. Anal. 36, 259289.

Kyriakidis, P.C., Yoo, E.-H., 2005. Geostatistical prediction and simulation of point values

from areal data. Geogr. Anal. 37, 124151.

Lacarce, E., Saby, N.P.A., Martin, M.P., Marchant, B.P., Boulonne, L., Meersmans, J., Jolivet, C.,

Bispo, A., Arrouays, D., 2012. Mapping soil Pb stocks and availability in mainland

France combining regression trees with robust geostatistics. Geoderma 170, 359368.

Lark, R.M., 2000. Estimating variograms of soil properties by the method-of-moments and

maximum likelihood. Eur. J. Soil Sci. 51, 717728.

Lark, R.M., 2009. Kriging a soil variable with a simple nonstationary variance model.

J. Agric. Biol. Environ. Stat. 14, 301321.

Lark, R.M., Bishop, T.F.A., 2007. Cokriging particle size fractions of the soil. Eur. J. Soil Sci.

58, 763774.

Lark, R.M., Cullis, B.R., Welham, S.J., 2006. On spatial prediction of soil properties in the

presence of a spatial trend: the empirical best linear unbiased predictor (E-BLUP)

with REML. Eur. J. Soil Sci. 57, 787799.

Malone, B.P., McBratney, A.B., Minasny, B., Laslett, G.M., 2009. Mapping continuous depth

functions of soil carbon storage and available water capacity. Geoderma 154,

138152.

Malone, B.P., McBratney, A.B., Minasny, B., 2011a. Empirical estimates of uncertainty for

mapping continuous depth functions of soil attributes. Geoderma 160, 614626.

Malone, B.P., de Gruijter, J.J., McBratney, A.B., Minasny, B., Brus, D.J., 2011b. Using

additional criteria for measuring the quality of predictions and their uncertainties

in a digital soil mapping framework. Soil Sci. Soc. Am. J. 75, 10321043.

Marchant, B.P., Newman, S., Corstanje, R., Reddy, K.R., Osborne, T.Z., Lark, R.M., 2009.

Spatial monitoring of a non-stationary soil property: phosphorus in a Florida water

conservation area. Eur. J. Soil Sci. 60, 757769.

Martin, M.P., Orton, T.G., Lacarce, E., Meersmans, J., Saby, N.P.A., Paroissien, J.B., Jolivet, C.,

Boulonne, L., Arrouays, D., 2014. Evaluation of modelling approaches for predicting

the spatial distribution of soil organic carbon stocks at the national scale. Geoderma

223225, 97107.

Minty, B., Franklin, R., Milligan, P., Richardson, M., Wilford, J., 2009. The radiometric map

of Australia. Explor. Geophys. 40, 325–333.

Orton, T.G., Pringle, M.J., Page, K.L., Dalal, R.C., Bishop, T.F.A., 2014. Spatial prediction of soil

organic carbon stock using a linear model of coregionalisation. Geoderma 230–231,

119–130.

Orton, T.G., Pringle, M.J., Allen, D.E., Dalal, R.C., Bishop, T.F.A., in press. A geostatistical method

to account for the number of aliquots in composite samples for normal and lognormal variables. Eur. J. Soil Sci. http://dx.doi.org/10.1111/ejss.12297.

Patterson, H.D., Thompson, R., 1971. Recovery of inter-block information when block sizes

are unequal. Biometrika 58, 545–554.

Poggio, L., Gimona, A., 2014. National scale 3D modelling of soil organic carbon stocks

with uncertainty propagation – an example from Scotland. Geoderma 232, 284–299.

Schielzeth, H., 2010. Simple means to improve the interpretability of regression coefficients. Methods Ecol. Evol. 1, 103–113.

Schirrmann, M., Herbst, R., Wagner, P., Gebbers, R., 2012. Area-to-point kriging of soil

phosphorus composite samples. Commun. Soil Sci. Plant Anal. 43, 1024–1041.

Stein, M.L., 2005. Space–time covariance functions. J. Am. Stat. Assoc. 100, 310–321.

Stein, M.L., Chi, Z., Welty, L.J., 2004. Approximating likelihoods for large spatial datasets.

J. R. Stat. Soc. Ser. B 66, 275–296.

Truong, P.N., Heuvelink, B.M., Pebesma, E., 2014. Bayesian area-to-point kriging using

expert knowledge as informative priors. Int. J. Appl. Earth Obs. Geoinf. 30, 128–138.

Veronesi, F., Corstanje, R., Mayr, T., 2012. Mapping soil compaction in 3D with depth

functions. Soil Tillage Res. 124, 111–118.

Webster, R., Oliver, M.A., 2001. Geostatistics for environmental scientists. John Wiley &

Sons, Chichester, UK.
