Geog 210C: Analytical Methods in Geography III: Geostatistical Spatial Prediction via Simple Kriging

Spring 2009 Geog 210C: Analytical Methods in Geography III
Introduction
Data: measurements of a spatial attribute available at various sampling units (supports).
Often, such supports correspond to point locations that are few and scattered across the study region,
e.g., monitoring stations
Objective: predict the unknown attribute value(s) at arbitrary non-sampled support(s),
and provide measures of the associated uncertainty. Often, interpolation is required at point
locations coinciding with the nodes of a very fine grid, to construct an attribute “surface”
[Figure: three map panels, Longitude (−123.5 to −121.0) vs. Latitude (36.0 to 39.0): Bay Area rain gauge precipitation; predicted precipitation via Kriging; Kriging variance of precipitation predictions]
In this handout: a brief introduction to spatial interpolation via Kriging with a known
expected attribute value (think of long-term average or climatology) at any (with or
without measurements) support; this procedure is also termed Simple Kriging
Some Notation & Terminology
Data: set of $N$ attribute measurements $\{y(s_n), n = 1, \ldots, N\}$, available at $N$ supports
$\{s_n, n = 1, \ldots, N\}$; here $s_n$ denotes the $n$-th support, e.g., a point; these $N$ measurements
can be arranged into an $(N \times 1)$ data vector $\mathbf{y}_s = [y(s_n), n = 1, \ldots, N]^T$
Objective: predict (estimate), using the $N$ sample data, the unknown attribute values
$\{y(t_m), m = 1, \ldots, M\}$ at a set of $M$ supports $\{t_m, m = 1, \ldots, M\}$, e.g., at points
coinciding with the nodes of a regular raster
Terminology:
• the sample attribute measurements in vector $\mathbf{y}_s$ are often termed source data, and
the unknown attribute values target values (sampling units: source & target supports)
• for now, it is assumed that both source data & target values are defined over
quasi-point supports, and pertain to the same attribute $Y$;
it is also assumed that no other data on related attributes (covariates) are available
• global vs local interpolation: data at all $N$ source supports can be used for prediction
at $t_m$, or only the subset available at the $N(t_m) \ll N$ closest source supports
Phaedon C. Kyriakidis Geostatistical Spatial Prediction: Simple Kriging total # of slides = 42

Expected Attribute Values & Residuals

Expected attribute values or first-order effects:
• at the $N$ source supports $\{s_n, n = 1, \ldots, N\}$: $(N \times 1)$ expectation vector
$\boldsymbol{\mu}_s = [\mu(s_n), n = 1, \ldots, N]^T$, with $\mu(s_n)$ denoting the expected attribute value at $s_n$;
think of a long-term average or climatology
• at the $M$ target supports $\{t_m, m = 1, \ldots, M\}$: $(M \times 1)$ expectation vector
$\boldsymbol{\mu}_t = [\mu(t_m), m = 1, \ldots, M]^T$, with $\mu(t_m)$ being the expected attribute value at $t_m$
In what follows, it is assumed that $\boldsymbol{\mu}_s$ and $\boldsymbol{\mu}_t$ are known
Implication: if $\boldsymbol{\mu}_s$ and $\boldsymbol{\mu}_t$ are estimated, their uncertainty is taken to be very small or is ignored
Residuals from known expected values:
• at the $N$ source supports $\{s_n, n = 1, \ldots, N\}$:
$(N \times 1)$ vector $\mathbf{r}_s = [r(s_n), n = 1, \ldots, N]^T = \mathbf{y}_s - \boldsymbol{\mu}_s$, with $r(s_n) = y(s_n) - \mu(s_n)$
denoting the residual (observation minus expectation) from the known expected value at $s_n$
• at the $m$-th target support $t_m$: $r(t_m) = y(t_m) - \mu(t_m)$ is the unknown residual at $t_m$
When no information is available to populate $\boldsymbol{\mu}_s$ and $\mu(t_m)$,
ignore them by setting them to 0; this implies that $\mathbf{y}_s = \mathbf{r}_s$
Linear Spatial Prediction

of residuals after the known first-order effects have been accounted for
Linear prediction: the predicted target residual $\hat{r}(t_m)$ at the $m$-th target support $t_m$ is
expressed as a weighted sum of the $N$ source residuals $\{r(s_n), n = 1, \ldots, N\}$:

$$\hat{r}(t_m) = \sum_{n=1}^{N} w_m(s_n)\, r(s_n) \quad \text{and} \quad \hat{y}(t_m) = \mu(t_m) + \hat{r}(t_m)$$

$w_m(s_n)$ = weight given to the $n$-th source residual $r(s_n)$ for prediction at the $m$-th target support $t_m$
When target supports are points & defined everywhere in 2D,
the corresponding (infinite) set of predicted values forms an interpolated attribute “surface”
Determining the weights: in almost all spatial interpolation methods, the $N$ weights
$\{w_m(s_n), n = 1, \ldots, N\}$ are functions of the spatial configuration of target and source
supports. More precisely, such weights are often functions of the $N$ (Euclidean)
target-to-source distances $\{d_{mn}, n = 1, \ldots, N\}$ between the $m$-th target support $t_m$ and
the $N$ source supports. In other words, weights are first derived from the locations of the source
supports and are then applied to the source data or residuals
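As a minimal sketch of the weighted-sum form above, the snippet below predicts one target value from three source residuals. Inverse-distance weights are used here only as an illustrative stand-in for "weights that are functions of target-to-source distances"; they are not the Kriging weights derived later in this handout, and the coordinates, residuals, and function names are made up for the example.

```python
import numpy as np

def inverse_distance_weights(source_xy, target_xy, power=2.0):
    """Illustrative weights only: w_n proportional to 1 / d_mn**power, scaled to sum to 1.
    Breaks down (division by zero) if the target coincides with a source support."""
    d = np.linalg.norm(source_xy - target_xy, axis=1)
    w = 1.0 / d**power
    return w / w.sum()

def linear_prediction(weights, residuals, mu_target=0.0):
    """y_hat(t_m) = mu(t_m) + sum_n w_m(s_n) * r(s_n)."""
    return mu_target + float(np.dot(weights, residuals))

# Hypothetical source supports, source residuals, and one target support
source_xy = np.array([[0.0, 0.0], [2.0, 0.0], [0.0, 4.0]])
r_s = np.array([1.0, -2.0, 0.5])
target_xy = np.array([1.0, 0.0])

w_m = inverse_distance_weights(source_xy, target_xy)   # step 1: weights from geometry only
y_hat = linear_prediction(w_m, r_s, mu_target=10.0)    # step 2: apply weights to residuals
print(w_m, y_hat)
```

Note the two-step structure: the weights are computed without looking at the data values, and only then applied to the residuals.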

Geostatistical Spatial Prediction
Flowchart, assuming only source data on attribute Y exist:
1. is there information on expected attribute values at both source and target supports?
• if yes, compute the source residuals (data) from such known expectations
• if not, assume that all such expected values are equal to each other, and for
convenience set them to 0
2. compute the sample semivariogram (of the residuals or the actual data) = discrete
version of an upside-down kernel; the complete kernel is typically missing due to
sparse data
3. fit a valid semivariogram model to the sample semivariogram, i.e., estimate the
parameters of a (physically justified) parametric upside-down kernel
4. use Simple Kriging for spatial interpolation and associated prediction error variance
assessment
Steps 2 & 3 constitute the “art” component of geostatistics
(particularly in the case of sparse source data),
and take up most of the time devoted to analysis
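Step 2 can be sketched as follows, assuming the usual estimator in which the semivariance for a lag bin is half the average squared difference between all data pairs whose separation distance falls in that bin. The function name, the binning scheme, and the toy data are illustrative choices, not taken from the handout.

```python
import numpy as np

def sample_semivariogram(coords, values, lag_edges):
    """Return (mean lag distance, semivariance, number of pairs) for each lag bin.

    For each bin: gamma = sum of squared differences / (2 * number of pairs),
    over pairs whose separation distance d satisfies lo <= d < hi.
    """
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    i, j = np.triu_indices(len(values), k=1)           # every pair counted once
    dist = np.linalg.norm(coords[i] - coords[j], axis=1)
    sqdiff = (values[i] - values[j]) ** 2
    lags, gammas, counts = [], [], []
    for lo, hi in zip(lag_edges[:-1], lag_edges[1:]):
        in_bin = (dist >= lo) & (dist < hi)
        n_pairs = int(in_bin.sum())
        counts.append(n_pairs)
        lags.append(dist[in_bin].mean() if n_pairs else np.nan)
        gammas.append(sqdiff[in_bin].sum() / (2 * n_pairs) if n_pairs else np.nan)
    return np.array(lags), np.array(gammas), np.array(counts)

# Toy data: four points on a line with a linear trend in the values
coords = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0], [3.0, 0.0]])
values = np.array([0.0, 1.0, 2.0, 3.0])
lags, gammas, counts = sample_semivariogram(coords, values, lag_edges=[0.5, 1.5, 2.5])
print(lags, gammas, counts)
```

With sparse data many bins hold few pairs, which is one reason fitting a model in step 3 takes judgment.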
Spatial Prediction with Kriging Example

Precipitation data and semivariogram:
[Figure: map of NDJ 1981-82 average precipitation (in mm), Longitude vs. Latitude; sample and model semivariogram of precipitation, γ vs. Distance (degrees)]
Kriging predictions and error variances:
[Figure: maps of Kriging predictions and Kriging error variances, Longitude (−123.5 to −121.0) vs. Latitude (36.0 to 39.0)]

Semivariogram / Covariogram / Correlogram Model

[Figure: three panels plotted against lag distance d: semivariogram model γ(d), levelling off at the sill γ(∞) = σ(0) (the variance) beyond the range; covariogram model σ(d), starting at σ(0) (the variance) and reaching 0 at the range; correlogram model ρ(d), starting at ρ(0) = 1 (unit correlation) and reaching 0 at the range]
Conversion between models: when the sill $\sigma(0) = \gamma(\infty)$ of the semivariogram model is finite:
• Semivariogram → covariogram: $\sigma(d) = \sigma(0) - \gamma(d)$
• Covariogram → correlogram: $\rho(d) = \sigma(d) / \sigma(0)$
• Semivariogram → correlogram: $\rho(d) = 1 - \gamma(d) / \sigma(0)$
• Covariogram → semivariogram: $\gamma(d) = \sigma(0) - \sigma(d)$
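The conversions above translate directly into code. The sketch below assumes an exponential semivariogram with sill 10 and range 10 (the counterpart of the exponential covariogram used in later examples); the function names are illustrative.

```python
import numpy as np

def exponential_semivariogram(d, sill, corr_range):
    """gamma(d) = sill * (1 - exp(-3 d / range)); an assumed model choice."""
    return sill * (1.0 - np.exp(-3.0 * np.asarray(d, dtype=float) / corr_range))

def semivariogram_to_covariogram(gamma, sill):
    return sill - gamma            # sigma(d) = sigma(0) - gamma(d)

def covariogram_to_correlogram(cov, sill):
    return cov / sill              # rho(d) = sigma(d) / sigma(0)

def semivariogram_to_correlogram(gamma, sill):
    return 1.0 - gamma / sill      # rho(d) = 1 - gamma(d) / sigma(0)

def covariogram_to_semivariogram(cov, sill):
    return sill - cov              # gamma(d) = sigma(0) - sigma(d)

d = np.array([0.0, 5.0, 10.0, 50.0])
sill, corr_range = 10.0, 10.0
gamma = exponential_semivariogram(d, sill, corr_range)
cov = semivariogram_to_covariogram(gamma, sill)
rho = covariogram_to_correlogram(cov, sill)
print(gamma, cov, rho)
```

All four conversions require a finite sill; a semivariogram that keeps growing with distance has no covariogram counterpart.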
Spatial Prediction via Simple Kriging (SK)

• known expected attribute values (“climatologies”) at any (source & target) support.
Often, in the absence of other information, it is assumed that the expected attribute
value is the same everywhere (constant); this implies no first-order effects
• target prediction at target support = weighted sum of source data residuals
(“anomalies”) from known attribute expectations (“climatologies”)
+ expectation at target support
• weights account for:
– correlation between source and target supports
– redundancy (inverse correlation = precision) between source supports
– functional form, e.g., exponential, of semivariogram or correlogram model
• SK variance = reliability of target prediction = overall attribute variance reduced by
“weighted” influence of “nearby” source supports:
– “nearby” = statistical proximity (correlation)
– “weighted” = source redundancy (SK weights)

Simple Kriging (SK) Target Prediction

SK prediction:

$$\hat{y}(t_m) - \mu(t_m) = \sum_{n=1}^{N} w_m(s_n)\,[y(s_n) - \mu(s_n)] = \mathbf{w}_m^T \mathbf{r}_s = \mathbf{w}_m^T [\mathbf{y}_s - \boldsymbol{\mu}_s]$$

• $\mathbf{w}_m = [w_m(s_n), n = 1, \ldots, N]^T$: $(N \times 1)$ vector of SK weights assigned to the $N$ source
supports for prediction at target support $t_m$
• $\mathbf{r}_s = [y(s_n) - \mu(s_n), n = 1, \ldots, N]^T$: $(N \times 1)$ vector of residual data from known
expectations $\mu(s_n)$ at source supports

$$\underbrace{\hat{y}(t_m) - \mu(t_m)}_{\hat{r}(t_m)} = \underbrace{\begin{bmatrix} w_m(s_1) & \cdots & w_m(s_n) & \cdots & w_m(s_N) \end{bmatrix}}_{\mathbf{w}_m^T} \underbrace{\begin{bmatrix} y(s_1) - \mu(s_1) \\ \vdots \\ y(s_n) - \mu(s_n) \\ \vdots \\ y(s_N) - \mu(s_N) \end{bmatrix}}_{\mathbf{r}_s}$$

Use the semivariogram model to determine the $N$ weights at each target support $t_m$;
typically, the covariogram model (the kernel) is used instead, for computational reasons
Requisites for Kriging (I)
Source-to-target & source-to-source distances:

$$\mathbf{d}_m = \begin{bmatrix} d_{1m} \\ \vdots \\ d_{nm} \\ \vdots \\ d_{Nm} \end{bmatrix} \quad \text{and} \quad \mathbf{D} = \begin{bmatrix} 0 & \cdots & d_{1n} & \cdots & d_{1N} \\ \vdots & \ddots & \vdots & \ddots & \vdots \\ d_{n1} & \cdots & 0 & \cdots & d_{nN} \\ \vdots & \ddots & \vdots & \ddots & \vdots \\ d_{N1} & \cdots & d_{Nn} & \cdots & 0 \end{bmatrix}$$

• as in any other spatial interpolation method, one accounts for the proximity of the $N$
source supports to the target support $t_m$. Note: Vector $\mathbf{d}_m$ changes from one target
support $t_m$ to another, hence the subscript $m$
• unlike other interpolation methods, Kriging also accounts for the proximity between
source supports themselves (sample configuration or data layout). Note: For global
interpolation, matrix $\mathbf{D}$ of source-to-source distances is the same for all target supports

Requisites for Kriging (I): Example

[Figure: local data configuration: seven source supports, labeled (1) to (7) and posted with data values 791, 696, 606, 477, 227, 646, 783, around a target support marked “?”; horizontal axis 55.0 to 80.0, vertical axis 125.0 to 145.0]
$(N \times 1)$ vector of source-to-target distances:

$$\mathbf{d}_m = \begin{bmatrix} 3.61 & 4.47 & 6.71 & 8.06 & 8.94 & 9.49 & 13.45 \end{bmatrix}^T$$

the $n$-th element of $\mathbf{d}_m$ is $d_{nm}$
$(N \times N)$ matrix of source-to-source distances:

$$\mathbf{D} = \begin{bmatrix}
0.00 & 2.24 & 8.00 & 11.05 & 10.05 & 13.00 & 16.97 \\
2.24 & 0.00 & 10.05 & 10.44 & 12.17 & 13.04 & 17.80 \\
8.00 & 10.05 & 0.00 & 13.04 & 2.24 & 12.37 & 12.65 \\
11.05 & 10.44 & 13.04 & 0.00 & 15.00 & 4.12 & 11.05 \\
10.05 & 12.17 & 2.24 & 15.00 & 0.00 & 13.93 & 13.15 \\
13.00 & 13.04 & 12.37 & 4.12 & 13.93 & 0.00 & 7.00 \\
16.97 & 17.80 & 12.65 & 11.05 & 13.15 & 7.00 & 0.00
\end{bmatrix}$$

the $(n, n')$-th element of $\mathbf{D}$ is $d_{nn'}$
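The distance vector and matrix of this example can be computed from coordinates as sketched below. The coordinates are an assumption: they do not appear in the text and were chosen so that they reproduce the distances printed above (they are also consistent with the axis ranges of the figure).

```python
import numpy as np

# ASSUMED (x, y) coordinates of source supports (1)..(7) and of the target "?";
# not given in the handout, picked to reproduce the printed distances.
source_xy = np.array([
    [63.0, 140.0], [61.0, 139.0], [71.0, 140.0], [64.0, 129.0],
    [73.0, 141.0], [68.0, 128.0], [75.0, 128.0],
])
target_xy = np.array([65.0, 137.0])

# (N,) vector of source-to-target distances d_m
d_m = np.linalg.norm(source_xy - target_xy, axis=1)

# (N x N) matrix of source-to-source distances D, via broadcasting
D = np.linalg.norm(source_xy[:, None, :] - source_xy[None, :, :], axis=2)

print(np.round(d_m, 2))
print(np.round(D, 2))
```

For global interpolation, `D` is computed once, while `d_m` is recomputed for every target support.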
Requisites for Kriging (II)
From distance matrices to model covariance matrices: Take any distance value $d_{nm}$
and $d_{nn'}$, i.e., any entry in $\mathbf{d}_m$ and $\mathbf{D}$, and transform it, via the covariogram model or
kernel, to a covariance value $\sigma(d_{nm})$ and $\sigma(d_{nn'})$
Source-to-target & source-to-source model covariances:

$$\boldsymbol{\sigma}_m = \begin{bmatrix} \sigma(d_{1m}) \\ \vdots \\ \sigma(d_{nm}) \\ \vdots \\ \sigma(d_{Nm}) \end{bmatrix} \quad \text{and} \quad \boldsymbol{\Sigma} = \begin{bmatrix} \sigma(0) & \cdots & \sigma(d_{1n}) & \cdots & \sigma(d_{1N}) \\ \vdots & \ddots & \vdots & \ddots & \vdots \\ \sigma(d_{n1}) & \cdots & \sigma(0) & \cdots & \sigma(d_{nN}) \\ \vdots & \ddots & \vdots & \ddots & \vdots \\ \sigma(d_{N1}) & \cdots & \sigma(d_{Nn}) & \cdots & \sigma(0) \end{bmatrix}$$

• source-to-target covariance vector $\boldsymbol{\sigma}_m$: $(N \times 1)$ vector with model covariance values
$\sigma(d_{nm})$ between the $N$ source supports and target support $t_m$
• source-to-source covariance matrix $\boldsymbol{\Sigma}$: $(N \times N)$ matrix with model covariance values
$\sigma(d_{nn'})$ between any two source supports separated by distance $d_{nn'}$

Requisites for Kriging (II): Source-to-Target Example

[Figure: local data configuration (seven source supports, labeled (1) to (7), around the target support “?”) next to the correlogram model $\rho(d) = \exp(-3d/10)$ plotted against lag distance $d$]
Source-to-target distances transformed to model covariances, $\boldsymbol{\sigma}_m = \text{sill} \times \exp(-3\,\mathbf{d}_m / \text{range})$, here with sill = 1 and range = 10:

$$\mathbf{d}_m = \begin{bmatrix} 3.61 \\ 4.47 \\ 6.71 \\ 8.06 \\ 8.94 \\ 9.49 \\ 13.45 \end{bmatrix} \;\rightarrow\; 1 \times \begin{bmatrix} \exp(-3 \times 3.61/10) \\ \exp(-3 \times 4.47/10) \\ \exp(-3 \times 6.71/10) \\ \exp(-3 \times 8.06/10) \\ \exp(-3 \times 8.94/10) \\ \exp(-3 \times 9.49/10) \\ \exp(-3 \times 13.45/10) \end{bmatrix} = \begin{bmatrix} 0.34 \\ 0.26 \\ 0.13 \\ 0.09 \\ 0.07 \\ 0.06 \\ 0.02 \end{bmatrix} = \boldsymbol{\sigma}_m$$

These would be the weights, had one ignored auto-correlation between source data
Requisites for Kriging (II): Source-to-Source Example

[Figure: local data configuration (seven source supports, labeled (1) to (7), around the target support “?”) next to the correlogram model $\rho(d) = \exp(-3d/10)$ plotted against lag distance $d$]
Source-to-source distances transformed to model covariances, $\boldsymbol{\Sigma} = \text{sill} \times \exp(-3\,\mathbf{D} / \text{range})$, here with sill = 1 and range = 10:

$$\mathbf{D} = \begin{bmatrix}
0.00 & 2.24 & 8.00 & 11.05 & 10.05 & 13.00 & 16.97 \\
2.24 & 0.00 & 10.05 & 10.44 & 12.17 & 13.04 & 17.80 \\
8.00 & 10.05 & 0.00 & 13.04 & 2.24 & 12.37 & 12.65 \\
11.05 & 10.44 & 13.04 & 0.00 & 15.00 & 4.12 & 11.05 \\
10.05 & 12.17 & 2.24 & 15.00 & 0.00 & 13.93 & 13.15 \\
13.00 & 13.04 & 12.37 & 4.12 & 13.93 & 0.00 & 7.00 \\
16.97 & 17.80 & 12.65 & 11.05 & 13.15 & 7.00 & 0.00
\end{bmatrix} \;\rightarrow\; \boldsymbol{\Sigma} = \begin{bmatrix}
1.00 & 0.51 & 0.09 & 0.04 & 0.05 & 0.02 & 0.01 \\
0.51 & 1.00 & 0.05 & 0.04 & 0.03 & 0.02 & 0.00 \\
0.09 & 0.05 & 1.00 & 0.02 & 0.51 & 0.02 & 0.02 \\
0.04 & 0.04 & 0.02 & 1.00 & 0.01 & 0.29 & 0.04 \\
0.05 & 0.03 & 0.51 & 0.01 & 1.00 & 0.02 & 0.02 \\
0.02 & 0.02 & 0.02 & 0.29 & 0.02 & 1.00 & 0.12 \\
0.01 & 0.00 & 0.02 & 0.04 & 0.02 & 0.12 & 1.00
\end{bmatrix}$$

Matrix $\boldsymbol{\Sigma}$ quantifies source-to-source interaction
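The distance-to-covariance transformation is a single elementwise call. The sketch below applies the exponential model of this example (sill 1, range 10) to the printed source-to-target distances and, to keep the listing short, to the 3 × 3 block of D for supports (1) to (3) only; the function name is illustrative.

```python
import numpy as np

def exponential_covariogram(d, sill=1.0, corr_range=10.0):
    """sigma(d) = sill * exp(-3 d / range), applied elementwise."""
    return sill * np.exp(-3.0 * np.asarray(d, dtype=float) / corr_range)

# Source-to-target distances as printed in the example
d_m = np.array([3.61, 4.47, 6.71, 8.06, 8.94, 9.49, 13.45])

# Upper-left 3 x 3 block of D (supports 1, 2, 3), as printed in the example
D_block = np.array([
    [0.00, 2.24, 8.00],
    [2.24, 0.00, 10.05],
    [8.00, 10.05, 0.00],
])

sigma_m = exponential_covariogram(d_m)          # source-to-target covariances
Sigma_block = exponential_covariogram(D_block)  # source-to-source covariances

print(np.round(sigma_m, 2))
print(np.round(Sigma_block, 2))
```

The diagonal of the covariance matrix equals the sill because the distance of a support to itself is 0.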

Requisites for Kriging (III)
Source-to-target & source-to-source model covariances:

$$\boldsymbol{\sigma}_m = \begin{bmatrix} \sigma(d_{1m}) \\ \vdots \\ \sigma(d_{nm}) \\ \vdots \\ \sigma(d_{Nm}) \end{bmatrix} \quad \text{and} \quad \boldsymbol{\Sigma} = \begin{bmatrix} \sigma(0) & \cdots & \sigma(d_{1n}) & \cdots & \sigma(d_{1N}) \\ \vdots & \ddots & \vdots & \ddots & \vdots \\ \sigma(d_{n1}) & \cdots & \sigma(0) & \cdots & \sigma(d_{nN}) \\ \vdots & \ddots & \vdots & \ddots & \vdots \\ \sigma(d_{N1}) & \cdots & \sigma(d_{Nn}) & \cdots & \sigma(0) \end{bmatrix}$$

• source-to-target covariance vector $\boldsymbol{\sigma}_m$: encapsulates statistical proximity (correlation)
between source data and the unknown target value $y(t_m)$; that correlation is a function
of distance between source and target supports, not of the actual (unknown) target
value $y(t_m)$; the larger the entries of vector $\boldsymbol{\sigma}_m$, the stronger the predictive power of
source data (had each source datum been considered in isolation)
• source-to-source covariance matrix $\boldsymbol{\Sigma}$: encapsulates redundancy between source data;
for positive spatial auto-correlation, the more clustered the source supports are, the
more redundant the corresponding source data are (less information content); a
clustered source support layout typically translates into larger entries in $\boldsymbol{\Sigma}$
A Note on Stationarity
There is no requirement that a stationary specification of the covariance vector $\boldsymbol{\sigma}_m$ and
the covariance matrix $\boldsymbol{\Sigma}$ be adopted. In other words, one can specify unequal source
variances (diagonal elements of $\boldsymbol{\Sigma}$), as well as covariances between two supports
(source-to-target or source-to-source) which are not functions of distance
General specification of source-to-target & source-to-source covariances:

$$\boldsymbol{\sigma}_m = \begin{bmatrix} \sigma(s_1, t_m) \\ \vdots \\ \sigma(s_n, t_m) \\ \vdots \\ \sigma(s_N, t_m) \end{bmatrix} \quad \text{and} \quad \boldsymbol{\Sigma} = \begin{bmatrix} \sigma(s_1) & \cdots & \sigma(s_1, s_n) & \cdots & \sigma(s_1, s_N) \\ \vdots & \ddots & \vdots & \ddots & \vdots \\ \sigma(s_n, s_1) & \cdots & \sigma(s_n) & \cdots & \sigma(s_n, s_N) \\ \vdots & \ddots & \vdots & \ddots & \vdots \\ \sigma(s_N, s_1) & \cdots & \sigma(s_N, s_n) & \cdots & \sigma(s_N) \end{bmatrix}$$

Second-order stationarity: a typical working hypothesis when dealing with spatial data
(or a single cross-section from spatio-temporal data), whereby all diagonal entries of $\boldsymbol{\Sigma}$
are equal to the variogram sill $\sigma(0)$, and all covariances (off-diagonal entries of $\boldsymbol{\Sigma}$ and
entries of $\boldsymbol{\sigma}_m$) are functions only of the distance between supports

The Simple Kriging (SK) System of Equations

$$\begin{bmatrix} \sigma(s_1) & \cdots & \sigma(s_1, s_N) \\ \vdots & \ddots & \vdots \\ \sigma(s_N, s_1) & \cdots & \sigma(s_N) \end{bmatrix} \begin{bmatrix} w_m(s_1) \\ \vdots \\ w_m(s_N) \end{bmatrix} = \begin{bmatrix} \sigma(s_1, t_m) \\ \vdots \\ \sigma(s_N, t_m) \end{bmatrix}, \quad \text{i.e.,} \quad \boldsymbol{\Sigma}\, \mathbf{w}_m = \boldsymbol{\sigma}_m$$

• a system of $N$ equations in $N$ unknowns (the weights in $\mathbf{w}_m$) for prediction at
support $t_m$; there are $M$ such systems for $M$ target supports, since $\boldsymbol{\sigma}_m$ changes from
one target support to another
• a version of the system of normal equations used in multiple linear regression. For
Kriging, the dependent variable pertains to the target support $t_m$, and there are $N$
predictor (lagged) variables pertaining to the $N$ source supports
• the SK system is known under different names in different disciplines, e.g.,
collocation in surveying, Yule-Walker equations in time-series modeling, Wiener
prediction in electrical engineering, objective interpolation in atmospheric sciences
• under 2nd-order stationarity: $\sigma(s_n) = \sigma(0)$, $\sigma(s_n, s_{n'}) = \sigma(0)\rho(d_{nn'})$ and
$\sigma(s_n, t_m) = \sigma(0)\rho(d_{nm})$; in this case, the weights do not depend on the sill $\sigma(0)$
Requisites for Kriging (III): SK System Example

[Figure: local data configuration (seven source supports, labeled (1) to (7), around the target support “?”) next to the correlogram model $\rho(d) = \exp(-3d/10)$ plotted against lag distance $d$]
SK system $\boldsymbol{\Sigma}\, \mathbf{w}_m = \boldsymbol{\sigma}_m$:

$$\begin{bmatrix}
1.00 & 0.51 & 0.09 & 0.04 & 0.05 & 0.02 & 0.01 \\
0.51 & 1.00 & 0.05 & 0.04 & 0.03 & 0.02 & 0.00 \\
0.09 & 0.05 & 1.00 & 0.02 & 0.51 & 0.02 & 0.02 \\
0.04 & 0.04 & 0.02 & 1.00 & 0.01 & 0.29 & 0.04 \\
0.05 & 0.03 & 0.51 & 0.01 & 1.00 & 0.02 & 0.02 \\
0.02 & 0.02 & 0.02 & 0.29 & 0.02 & 1.00 & 0.12 \\
0.01 & 0.00 & 0.02 & 0.04 & 0.02 & 0.12 & 1.00
\end{bmatrix} \begin{bmatrix} w_m(s_1) \\ w_m(s_2) \\ w_m(s_3) \\ w_m(s_4) \\ w_m(s_5) \\ w_m(s_6) \\ w_m(s_7) \end{bmatrix} = \begin{bmatrix} 0.34 \\ 0.26 \\ 0.13 \\ 0.09 \\ 0.07 \\ 0.06 \\ 0.02 \end{bmatrix}$$

the $(n, n')$-th element of matrix $\boldsymbol{\Sigma}$: $\sigma_{nn'} = 1 \times \exp(-3 \times d_{nn'}/10)$

Interpreting the SK System (I)

A different setting: Consider 3 time profiles of an attribute measured at three supports $s_1$, $s_2$ and $t_m$, i.e.,
three $(Q \times 1)$ attribute vectors $\mathbf{x}_1$, $\mathbf{x}_2$, and $\mathbf{y}$, each having zero mean, $\mu = 0$. The objective is to express the
y-values at support $t_m$ as a linear combination of the x-values at supports $s_1$ and $s_2$. This is tantamount to
regression of the y-values on the x-values via a linear model of the form $\mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \mathbf{e}$, where $\mathbf{y}$ holds the $Q$
data on the dependent variable, $\mathbf{X} = [\mathbf{x}_1\ \mathbf{x}_2]$ is the $(Q \times 2)$ design matrix with data on the predictor variables,
$\boldsymbol{\beta} = [\beta_1\ \beta_2]^T$ is the $(2 \times 1)$ vector of regression coefficients (slopes only, since this is a no-intercept model due
to the dependent variable and the predictors having 0-mean), and $\mathbf{e}$ is the $(Q \times 1)$ vector of errors
System of normal equations: Per classical (ordinary) least squares regression theory, the vector $\boldsymbol{\beta}$
of regression coefficients is estimated by solving:

$$\mathbf{X}^T \mathbf{X} \boldsymbol{\beta} = \mathbf{X}^T \mathbf{y} \quad \text{or} \quad \frac{1}{Q} \mathbf{X}^T \mathbf{X} \boldsymbol{\beta} = \frac{1}{Q} \mathbf{X}^T \mathbf{y}$$

Covariance-based version: Since all variables are assumed to have 0-mean, the matrix-matrix
product $\frac{1}{Q} \mathbf{X}^T \mathbf{X}$ is an estimate of the $(2 \times 2)$ covariance matrix $\boldsymbol{\Sigma}$ between the predictors, and
the matrix-vector product $\frac{1}{Q} \mathbf{X}^T \mathbf{y}$ is an estimate of the $(2 \times 1)$ covariance vector $\boldsymbol{\sigma}$ between the
predictors and the dependent variable. The above system of normal equations can therefore
also be written as:

$$\boldsymbol{\Sigma} \boldsymbol{\beta} = \boldsymbol{\sigma}$$
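The equivalence between the least squares solution and the covariance-based system can be checked numerically. The sketch below uses synthetic, centered time profiles; the sample size, coefficients, and noise level are arbitrary choices made only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
Q = 500                                        # number of time instants (arbitrary)

# Synthetic time profiles at two source supports and one target support
x1 = rng.normal(size=Q)
x2 = 0.6 * x1 + rng.normal(size=Q)             # correlated predictors
y = 0.5 * x1 + 0.3 * x2 + rng.normal(scale=0.1, size=Q)

# Center everything so that the no-intercept model applies
X = np.column_stack([x1 - x1.mean(), x2 - x2.mean()])
y = y - y.mean()

# (a) ordinary least squares on the data
beta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)

# (b) covariance-based normal equations: Sigma beta = sigma
Sigma = X.T @ X / Q                            # predictor-to-predictor covariances
sigma = X.T @ y / Q                            # predictor-to-response covariances
beta_cov = np.linalg.solve(Sigma, sigma)

print(beta_ols, beta_cov)
```

Version (b) needs only covariances, not the data themselves, which is what makes the single cross-section case workable once a covariogram model supplies those covariances.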
Interpreting the SK System (II)

The single cross-section case: Consider now the scenario whereby only a single snapshot (or
cross-section) of the x-data is available, coupled with the complete absence of data at $t_m$, i.e.,
the vectors reduce to scalars, $\mathbf{x}_1 = x_1$ and $\mathbf{x}_2 = x_2$, and $\mathbf{y} = [\ ]$ is empty. The objective is now to predict the unavailable value $y(t_m)$ from
the two data values $x_1 = y(s_1)$ and $x_2 = y(s_2)$
Regression with cross-section data: It is still possible to estimate the vector $\boldsymbol{\beta}$ of regression
coefficients if $\boldsymbol{\Sigma}$ and $\boldsymbol{\sigma}_m$ are known, since the version of normal equations $\boldsymbol{\Sigma}\boldsymbol{\beta} = \boldsymbol{\sigma}_m$ does not
call for any data. The entries of $\boldsymbol{\Sigma}$ and $\boldsymbol{\sigma}_m$ are simply obtained from the covariogram model $\sigma(d)$
since the source-to-source distances $d_{11} = 0$, $d_{12} = d_{21}$, $d_{22} = 0$ and the source-to-target
distances $d_{1m}$, $d_{2m}$ are known
Link to Simple Kriging: The SK system $\boldsymbol{\Sigma}\boldsymbol{\beta} = \boldsymbol{\sigma}_m$ can therefore be seen as the covariance-based
version of the normal equations for the case of regression with lagged variables, i.e., regression
involving a dependent variable defined at a target support $t_m$ and predictor variables defined at
source supports $s_1$ and $s_2$. The SK weights, previously denoted as $\mathbf{w}_m$, are none other than the
corresponding vector $\boldsymbol{\beta}$ of regression coefficients
The SK regression model considers only pair-wise (two-point) interactions, i.e., there is no interaction column
$\mathbf{x}_1 \circ \mathbf{x}_2$, where $\circ$ denotes element-by-element multiplication, in the design matrix $\mathbf{X}$. In other words, (i) only imaginary
$d$-specific scatter plots between two variables (locations) at a time are considered, and (ii) such scatter plots are
summarized by the corresponding covariance model value $\sigma(d)$, which is then used to populate $\boldsymbol{\Sigma}$ and $\boldsymbol{\sigma}_m$

Solving the Simple Kriging (SK) System of Equations

$$\begin{bmatrix} w_m(s_1) \\ \vdots \\ w_m(s_N) \end{bmatrix} = \begin{bmatrix} \sigma(s_1) & \cdots & \sigma(s_1, s_N) \\ \vdots & \ddots & \vdots \\ \sigma(s_N, s_1) & \cdots & \sigma(s_N) \end{bmatrix}^{-1} \begin{bmatrix} \sigma(s_1, t_m) \\ \vdots \\ \sigma(s_N, t_m) \end{bmatrix}, \quad \text{i.e.,} \quad \mathbf{w}_m = \boldsymbol{\Sigma}^{-1} \boldsymbol{\sigma}_m$$

• the weights vector $\mathbf{w}_m$ is obtained by solving the SK system anew for each target
support $t_m$, since the entries of $\boldsymbol{\sigma}_m$ change from one target support to another
• the SK system has a unique solution (there is one and only one weights vector $\mathbf{w}_m$) if
and only if the source-to-source covariance matrix $\boldsymbol{\Sigma}$ is positive definite; for 2nd-order
stationarity, this implies that a valid covariogram model $\sigma(d; \boldsymbol{\theta})$, e.g., exponential
distance decay, with $\boldsymbol{\theta}$ containing the sill and range, is used to populate $\boldsymbol{\Sigma}$; in this
case, $\sigma(d; \boldsymbol{\theta}) = \sigma(0)\rho(d; \boldsymbol{\theta})$ and the weights do not depend on the sill $\sigma(0)$:

$$\begin{bmatrix} w_m(s_1) \\ \vdots \\ w_m(s_N) \end{bmatrix} = \frac{1}{\sigma(0)} \begin{bmatrix} 1 & \cdots & \rho(d_{1N}) \\ \vdots & \ddots & \vdots \\ \rho(d_{N1}) & \cdots & 1 \end{bmatrix}^{-1} \sigma(0) \begin{bmatrix} \rho(d_{1m}) \\ \vdots \\ \rho(d_{Nm}) \end{bmatrix}$$
Requisites for Kriging (III): SK Weights Example

$$\begin{bmatrix} w_m(s_1) \\ w_m(s_2) \\ w_m(s_3) \\ w_m(s_4) \\ w_m(s_5) \\ w_m(s_6) \\ w_m(s_7) \end{bmatrix} = \begin{bmatrix}
1.36 & -0.69 & -0.09 & -0.02 & 0.00 & 0.00 & -0.01 \\
-0.69 & 1.35 & 0.00 & -0.02 & 0.00 & -0.01 & 0.01 \\
-0.09 & 0.00 & 1.36 & -0.01 & -0.69 & -0.01 & -0.01 \\
-0.02 & -0.02 & -0.01 & 1.09 & 0.00 & -0.32 & -0.01 \\
0.00 & 0.00 & -0.69 & 0.00 & 1.35 & -0.01 & -0.01 \\
0.00 & -0.01 & -0.01 & -0.32 & -0.01 & 1.11 & -0.12 \\
-0.01 & 0.01 & -0.01 & -0.01 & -0.01 & -0.12 & 1.02
\end{bmatrix} \begin{bmatrix} 0.34 \\ 0.26 \\ 0.13 \\ 0.09 \\ 0.07 \\ 0.06 \\ 0.02 \end{bmatrix}, \quad \text{i.e.,} \quad \mathbf{w}_m = \boldsymbol{\Sigma}^{-1} \boldsymbol{\sigma}_m$$

[Figure: SK weights posted at the source supports: (1) 0.267, (2) 0.116, (3) 0.102, (4) 0.064, (5) −0.001, (6) 0.028, (7) 0.007; at the target support “?”: prediction = 592.17, variance = 8.58]
original weights vector ($\mathbf{w}_m = \boldsymbol{\sigma}_m$) modified by $\boldsymbol{\Sigma}^{-1}$ to account for sample redundancy;
e.g., $w_m(s_1) = 0.27$ instead of $\rho(d_{1m}) = 0.34$
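The SK weights of this example can be reproduced, up to rounding, with a single linear solve. The sketch below uses the distances exactly as printed in the worked example (two decimals) and the exponential correlogram with range 10; since the weights do not depend on the sill, the correlogram is used directly. Solving the system with `np.linalg.solve` instead of forming the inverse explicitly is an implementation choice, not something the handout prescribes.

```python
import numpy as np

# Distances as printed in the worked example (rounded to 2 decimals)
d_m = np.array([3.61, 4.47, 6.71, 8.06, 8.94, 9.49, 13.45])
D = np.array([
    [0.00, 2.24, 8.00, 11.05, 10.05, 13.00, 16.97],
    [2.24, 0.00, 10.05, 10.44, 12.17, 13.04, 17.80],
    [8.00, 10.05, 0.00, 13.04, 2.24, 12.37, 12.65],
    [11.05, 10.44, 13.04, 0.00, 15.00, 4.12, 11.05],
    [10.05, 12.17, 2.24, 15.00, 0.00, 13.93, 13.15],
    [13.00, 13.04, 12.37, 4.12, 13.93, 0.00, 7.00],
    [16.97, 17.80, 12.65, 11.05, 13.15, 7.00, 0.00],
])

def correlogram(d, corr_range=10.0):
    """rho(d) = exp(-3 d / range)."""
    return np.exp(-3.0 * d / corr_range)

R = correlogram(D)        # source-to-source correlations
rho_m = correlogram(d_m)  # source-to-target correlations

# Solve R w = rho_m; equivalent to w = R^{-1} rho_m but numerically preferable
w_m = np.linalg.solve(R, rho_m)

print(np.round(w_m, 3))   # compare with the weights posted in the figure
print(np.round(rho_m, 2)) # weights one would get by ignoring redundancy
```

Supports (1) and (2) are close to each other, so each ends up with less weight than its correlation with the target alone would suggest.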

Limit Cases for SK Weights

$$\begin{bmatrix} w_m(s_1) \\ \vdots \\ w_m(s_N) \end{bmatrix} = \frac{1}{\sigma(0)} \begin{bmatrix} 1 & \cdots & \rho(d_{1N}) \\ \vdots & \ddots & \vdots \\ \rho(d_{N1}) & \cdots & 1 \end{bmatrix}^{-1} \sigma(0) \begin{bmatrix} \rho(d_{1m}) \\ \vdots \\ \rho(d_{Nm}) \end{bmatrix}$$

• if all source-to-source distances $d_{nn'}$, $n \neq n'$, are larger than the correlogram range, then
$\rho(d_{nn'}) = 0$, and $\boldsymbol{\Sigma} = \sigma(0)\mathbf{I}$, the $(N \times N)$ identity matrix scaled by $\sigma(0)$;
this entails that $w_m(s_n) = \rho(d_{nm})$, i.e., weights are equal to correlogram values
• in general, $\boldsymbol{\Sigma} \neq \sigma(0)\mathbf{I}$, i.e., some source-to-source distances are within correlation range,
hence $\boldsymbol{\Sigma}^{-1}$ modulates $\boldsymbol{\sigma}_m$: the influence of source supports in clusters is downplayed
• for source supports far away (beyond correlation range) from the target support $t_m$,
$\rho(d_{nm}) = 0$ and $w_m(s_n) = 0$, $\forall n$: all weights are equal to 0
• for prediction at a source support, $t_m \equiv s_n$, the source-to-target covariance vector
$\boldsymbol{\sigma}_m$ is the same as the $n$-th column $\boldsymbol{\sigma}_n$ of $\boldsymbol{\Sigma}$; this yields $w_m(s_{n'}) = 1$ if $s_{n'} \equiv t_m$, 0
otherwise: only the source support coinciding with the target support receives
non-zero (= 1) weight
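Two of these limit cases are easy to verify numerically. The sketch below uses a hypothetical one-dimensional layout of three source supports and the exponential correlogram with range 10; note that the exponential model only decays towards 0, so "beyond the range" is approximated here by a very distant target.

```python
import numpy as np

def sk_weights(source_x, target_x, corr_range=10.0):
    """Solve R w = rho_m for supports on a line, with rho(d) = exp(-3 d / range)."""
    def rho(d):
        return np.exp(-3.0 * d / corr_range)
    R = rho(np.abs(source_x[:, None] - source_x[None, :]))   # source-to-source
    rho_m = rho(np.abs(source_x - target_x))                  # source-to-target
    return np.linalg.solve(R, rho_m)

source_x = np.array([0.0, 2.0, 5.0])          # hypothetical source supports

# Target coinciding with the second source support: unit weight there, 0 elsewhere
w_coincident = sk_weights(source_x, target_x=2.0)

# Target very far from all source supports: all weights vanish
w_far = sk_weights(source_x, target_x=1000.0)

print(np.round(w_coincident, 6), w_far)
```

In the coincident case the right-hand side equals a column of the correlation matrix, so the solution is the matching unit vector.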
Covariogram Models & SK Weights (I)

Two covariogram models (with different ranges):

$$\text{A: } \sigma(d) = 10 \exp\!\left(\frac{-3d}{10}\right) \qquad \text{B: } \sigma(d) = 10 \exp\!\left(\frac{-3d}{20}\right)$$

[Figure: covariogram models (A) and (B), covariance σ(d) vs. lag distance d from 0 to 30]
Two sets of Simple Kriging weights, posted at source supports (1) to (7):
• model (A): 0.267, 0.116, 0.102, 0.064, −0.001, 0.028, 0.007; prediction = 592.17, variance = 8.58
• model (B): 0.372, 0.167, 0.193, 0.131, −0.009, 0.058, 0.019; prediction = 571.93, variance = 5.74
the shorter range (model A) tends to decrease the SK weights towards 0
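The effect of the range can be explored by re-solving the SK system under both models. In the sketch below the support coordinates are an assumption: they are not given in the text and were chosen to reproduce the distances of the earlier worked example. The function name is illustrative.

```python
import numpy as np

# ASSUMED coordinates of source supports (1)..(7) and of the target support
source_xy = np.array([
    [63.0, 140.0], [61.0, 139.0], [71.0, 140.0], [64.0, 129.0],
    [73.0, 141.0], [68.0, 128.0], [75.0, 128.0],
])
target_xy = np.array([65.0, 137.0])

D = np.linalg.norm(source_xy[:, None, :] - source_xy[None, :, :], axis=2)
d_m = np.linalg.norm(source_xy - target_xy, axis=1)

def sk_weights_and_variance(sill, corr_range):
    """SK weights and error variance under sigma(d) = sill * exp(-3 d / range)."""
    Sigma = sill * np.exp(-3.0 * D / corr_range)
    sigma_m = sill * np.exp(-3.0 * d_m / corr_range)
    w = np.linalg.solve(Sigma, sigma_m)
    return w, sill - w @ sigma_m

w_A, var_A = sk_weights_and_variance(sill=10.0, corr_range=10.0)   # model (A)
w_B, var_B = sk_weights_and_variance(sill=10.0, corr_range=20.0)   # model (B)

print(np.round(w_A, 3), round(float(var_A), 2))
print(np.round(w_B, 3), round(float(var_B), 2))
print(w_A.sum(), w_B.sum())   # the remainder, 1 - sum, is the weight of the mean
```

With the longer range the weights sum closer to 1, so the prediction leans more on the data and less on the known mean, and the error variance drops.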

Covariogram Models & SK Weights (II)

Two covariogram models (with different shapes):

$$\text{A: } \sigma(d) = 10 \exp\!\left(\frac{-3d}{10}\right) \qquad \text{B: } \sigma(d) = 10 \exp\!\left(\frac{-3d^2}{10^2}\right)$$

[Figure: covariogram models (A), exponential, and (B), Gaussian, covariance σ(d) vs. lag distance d from 0 to 30]
Two sets of Simple Kriging weights, posted at source supports (1) to (7):
• model (A): 0.267, 0.116, 0.102, 0.064, −0.001, 0.028, 0.007; prediction = 592.17, variance = 8.58
• model (B): 0.664, −0.045, 0.435, 0.140, −0.315, −0.025, 0.005; prediction = 599.91, variance = 4.73
the Gaussian covariogram (model B) yields larger weights for nearby data; negative weights are possible
Simple Kriging Prediction & Error Variance (I)

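Once the SK weights are computed as $\mathbf{w}_m = \boldsymbol{\Sigma}^{-1} \boldsymbol{\sigma}_m$, they are substituted in the
equations below to compute the SK prediction $\hat{y}(t_m)$ and associated error variance $\hat{\sigma}(t_m)$
SK prediction:

$$\hat{y}(t_m) = \mu(t_m) + \mathbf{w}_m^T \mathbf{r}_s = \mu(t_m) + \begin{bmatrix} w_m(s_1) & \cdots & w_m(s_N) \end{bmatrix} \begin{bmatrix} y(s_1) - \mu(s_1) \\ \vdots \\ y(s_N) - \mu(s_N) \end{bmatrix}$$

SK prediction error variance:

$$\hat{\sigma}(t_m) = \sigma(t_m) - \mathbf{w}_m^T \boldsymbol{\sigma}_m = \sigma(t_m) - \begin{bmatrix} w_m(s_1) & \cdots & w_m(s_N) \end{bmatrix} \begin{bmatrix} \sigma(s_1, t_m) \\ \vdots \\ \sigma(s_N, t_m) \end{bmatrix}$$

In summation form:

$$\hat{y}(t_m) = \mu(t_m) + \sum_{n=1}^{N} w_m(s_n)\,[y(s_n) - \mu(s_n)] \quad \text{and} \quad \hat{\sigma}(t_m) = \sigma(t_m) - \sum_{n=1}^{N} w_m(s_n)\,\sigma(s_n, t_m)$$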

Simple Kriging Prediction & Error Variance (II)

For the 2nd-order stationary case, where $\mu(s_n) = \mu = \mu(t_m)$, $\forall n, m$
SK prediction:

$$\hat{y}(t_m) = \mu + \sum_{n=1}^{N} w_m(s_n)\,[y(s_n) - \mu] = \mu + \sum_{n=1}^{N} w_m(s_n)\, y(s_n) - \sum_{n=1}^{N} w_m(s_n)\, \mu$$

$$\phantom{\hat{y}(t_m)} = \sum_{n=1}^{N} w_m(s_n)\, y(s_n) + \Big[1 - \sum_{n=1}^{N} w_m(s_n)\Big] \mu$$

SK prediction = weighted sum of source data + weighted expectation $\mu$
weight of expectation $\mu$ = complement to 1 of the sum of SK weights
SK prediction error variance:

$$\hat{\sigma}(t_m) = \sigma(0) - \begin{bmatrix} w_m(s_1) & \cdots & w_m(s_N) \end{bmatrix} \begin{bmatrix} \sigma(d_{1m}) \\ \vdots \\ \sigma(d_{Nm}) \end{bmatrix} = \sigma(0) - \sum_{n=1}^{N} w_m(s_n)\, \sigma(d_{nm})$$

which can also be written as: $\hat{\sigma}(t_m) = \sigma(0) \Big[ 1 - \sum_{n=1}^{N} w_m(s_n)\, \rho(d_{nm}) \Big]$
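The stationary formulas can be exercised with the numbers of the worked example: the SK weights reported there (three decimals), the printed source-to-target distances, and covariogram model (A) with sill 10 and range 10. Two inputs are assumptions: the pairing of data values with supports (1) to (7) is read off the example figure, and the constant mean μ is not stated in the handout, so the sample mean of the seven data is used as a hypothetical value.

```python
import numpy as np

# Data values at supports (1)..(7), as read from the example figure (assumed pairing)
y_s = np.array([696.0, 477.0, 606.0, 227.0, 791.0, 646.0, 783.0])
# SK weights as reported in the worked example (rounded to 3 decimals)
w_m = np.array([0.267, 0.116, 0.102, 0.064, -0.001, 0.028, 0.007])
# Source-to-target distances as printed in the worked example
d_m = np.array([3.61, 4.47, 6.71, 8.06, 8.94, 9.49, 13.45])

sill, corr_range = 10.0, 10.0
mu = y_s.mean()   # ASSUMPTION: the handout does not state the mean that was used

# Two equivalent forms of the SK prediction
y_hat_residual_form = mu + w_m @ (y_s - mu)
y_hat_weighted_form = w_m @ y_s + (1.0 - w_m.sum()) * mu

# SK error variance: sigma(0) * [1 - sum_n w_n rho(d_nm)]
rho_m = np.exp(-3.0 * d_m / corr_range)
var_hat = sill * (1.0 - w_m @ rho_m)

print(y_hat_residual_form, y_hat_weighted_form, var_hat)
```

Under these assumptions the variance agrees with the reported 8.58 and the prediction lands close to the reported 592.17; the variance never uses the data values, only the weights and the covariogram model.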
Limit Cases for the SK Prediction and Error Variance

$$\hat{y}(t_m) = \mu(t_m) + \sum_{n=1}^{N} w_m(s_n)\,[y(s_n) - \mu(s_n)] \qquad \hat{\sigma}(t_m) = \sigma(t_m) - \sum_{n=1}^{N} w_m(s_n)\, \sigma(s_n, t_m)$$

• for sample data far away (beyond correlation range) from target support $t_m$, all
weights are equal to 0. In this case, $\hat{r}(t_m) = 0$, hence $\hat{y}(t_m) = \mu(t_m)$ and
$\hat{\sigma}(t_m) = \sigma(t_m)$ (equal to $\sigma(0)$ under 2nd-order stationarity): the SK target prediction equals the known mean $\mu(t_m)$ and the SK
variance equals the known variance $\sigma(t_m)$; away from the sample data, SK yields
back the (assumed known) attribute mean and variance at the target support $t_m$
• for prediction at a source support $t_m \equiv s_n$, $\hat{y}(s_n) = y(s_n)$ and $\hat{\sigma}(s_n) = 0$: the SK
prediction reproduces the known source datum and the SK variance is zero (provided
there is no measurement error); SK is an exact interpolation algorithm
• for all other target supports, the SK predictions depend on the source support
configuration and the source data, while the SK variances depend only on the source
support configuration; both SK predictions and variances depend on the covariogram
model $\sigma(d; \boldsymbol{\theta})$ adopted (in the case of 2nd-order stationarity)

Flowchart for Spatial Prediction via Simple Kriging
1. if 1st-order effects (“climatologies”) are known, compute residuals at source supports
2. compute the sample semivariogram of residuals or actual data, and fit a theoretical model
to that experimental semivariogram; convert the fitted model to a covariogram model
3. compute the $(N \times N)$ source-to-source distance matrix $\mathbf{D}$, and use the covariogram
model to transform $\mathbf{D}$ into the matrix $\boldsymbol{\Sigma}$ of source-to-source covariance values
4. consider a set of $M$ target supports $\{t_m, m = 1, \ldots, M\}$, on a regular grid or not
5. visit the $m$-th target support $t_m$:
(a) compute the $(N \times 1)$ vector of distances $\mathbf{d}_m$ between $t_m$ and all $N$ source supports,
and use the covariogram model to transform $\mathbf{d}_m$ into the vector $\boldsymbol{\sigma}_m$ of
source-to-target covariance values
(b) solve the SK system for the $N$ weights; compute the SK prediction $\hat{y}(t_m)$ and the SK prediction
error variance $\hat{\sigma}(t_m)$
6. move to another target support $t_m$, and repeat steps 5a and 5b
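Steps 3 to 6 of the flowchart can be sketched as one short function. The version below assumes 2nd-order stationarity with an exponential covariogram whose sill and range are already known (steps 1 and 2 are taken as done); the function names and the toy data are illustrative.

```python
import numpy as np

def exp_cov(d, sill, corr_range):
    """Assumed covariogram model: sigma(d) = sill * exp(-3 d / range)."""
    return sill * np.exp(-3.0 * d / corr_range)

def simple_kriging(source_xy, y_s, mu_s, target_xy, mu_t, sill, corr_range):
    """Global SK: returns predictions and error variances at all target supports."""
    # Step 3: source-to-source distances and covariances (computed once)
    D = np.linalg.norm(source_xy[:, None, :] - source_xy[None, :, :], axis=2)
    Sigma = exp_cov(D, sill, corr_range)
    r_s = y_s - mu_s                                   # source residuals
    y_hat = np.empty(len(target_xy))
    var_hat = np.empty(len(target_xy))
    # Steps 5-6: visit each target support in turn
    for m, t in enumerate(target_xy):
        d_m = np.linalg.norm(source_xy - t, axis=1)    # step 5a
        sigma_m = exp_cov(d_m, sill, corr_range)
        w_m = np.linalg.solve(Sigma, sigma_m)          # step 5b
        y_hat[m] = mu_t[m] + w_m @ r_s
        var_hat[m] = sill - w_m @ sigma_m
    return y_hat, var_hat

# Hypothetical data: 3 source supports, constant known mean of 10
source_xy = np.array([[0.0, 0.0], [3.0, 0.0], [0.0, 4.0]])
y_s = np.array([12.0, 9.0, 15.0])
target_xy = np.array([[0.0, 0.0], [1.0, 1.0], [1000.0, 1000.0]])
y_hat, var_hat = simple_kriging(source_xy, y_s, np.full(3, 10.0),
                                target_xy, np.full(3, 10.0), sill=4.0, corr_range=5.0)
print(y_hat, var_hat)
```

The first target coincides with a source support and the last lies far from all data, so the output illustrates exact interpolation and reversion to the known mean and variance.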
Simple Kriging with Local Search Neighborhoods

Same procedure as before, but: at each target support $t_m$, use only the closest $N_m \ll N$
source data $\{y(s_n), n = 1, \ldots, N_m\}$ within a neighborhood around $t_m$ to compute the
$(N_m \times N_m)$ source-to-source covariance matrix $\boldsymbol{\Sigma}$ and the $(N_m \times 1)$ source-to-target
covariance vector $\boldsymbol{\sigma}_m$
Pros:
• no need to store and invert a large $(N \times N)$ source-to-source covariance matrix $\boldsymbol{\Sigma}$ in
the case of large $N$ (many source data), only $M$ much smaller $(N_m \times N_m)$
sub-matrices of $\boldsymbol{\Sigma}$
• considering a local neighborhood can lead to the inclusion of more relevant source
data into the prediction exercise
Cons:
• need to define rules for specifying the search neighborhood;
not a big issue: use a circle with radius ∼ range of the covariogram model
• too small a search neighborhood reduces the number $N_m$ of source data considered
for prediction, and might lead to more uncertain predictions
Kriging with local neighborhoods is widely used in applications of geostatistics
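A local-neighborhood variant can be sketched as below. It selects the `n_closest` source supports by distance, which is one possible neighborhood rule (a fixed search radius, as suggested above, is another); the exponential covariogram, the constant known mean, and the synthetic data are assumptions made for the example.

```python
import numpy as np

def exp_cov(d, sill, corr_range):
    """Assumed covariogram model: sigma(d) = sill * exp(-3 d / range)."""
    return sill * np.exp(-3.0 * d / corr_range)

def sk_at_target(source_xy, y_s, mu, target_xy, sill, corr_range, n_closest=None):
    """SK prediction and variance at one target, using only the n_closest source data."""
    d_all = np.linalg.norm(source_xy - target_xy, axis=1)
    if n_closest is None:
        idx = np.arange(len(y_s))                 # global: use all source data
    else:
        idx = np.argsort(d_all)[:n_closest]       # local: closest N_m supports
    xy = source_xy[idx]
    D = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=2)
    Sigma = exp_cov(D, sill, corr_range)          # (N_m x N_m) sub-matrix
    sigma_m = exp_cov(d_all[idx], sill, corr_range)
    w = np.linalg.solve(Sigma, sigma_m)
    return mu + w @ (y_s[idx] - mu), sill - w @ sigma_m

# Synthetic data: 40 scattered source supports (hypothetical)
rng = np.random.default_rng(1)
source_xy = rng.uniform(0.0, 50.0, size=(40, 2))
y_s = rng.normal(10.0, 2.0, size=40)
target_xy = np.array([25.0, 25.0])

pred_global, var_global = sk_at_target(source_xy, y_s, 10.0, target_xy, 4.0, 10.0)
pred_local, var_local = sk_at_target(source_xy, y_s, 10.0, target_xy, 4.0, 10.0, n_closest=8)
print(pred_global, var_global, pred_local, var_local)
```

Dropping data cannot reduce the SK variance, so the local variance is at least as large as the global one; the gain is that only small systems need to be solved.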

Spatial Prediction with Kriging Example

Precipitation data and semivariogram:
[Figure: map of NDJ 1981-82 average precipitation (in mm), Longitude vs. Latitude; sample and model semivariogram of precipitation, γ vs. Distance (degrees)]
Kriging predictions and error variances:
[Figure: maps of Kriging predictions and Kriging error variances, Longitude vs. Latitude (36.0 to 39.0)]
Drawbacks of Kriging with Constant Mean (I)

Statistics of predicted precipitation values:
[Figure: Q-Q plot of predicted versus sample precipitation values (mm); variogram of predicted precipitation values compared with the input model variogram, γ vs. Distance (degrees)]
Smoothing effect of interpolation:
• proportions of high or low values are not the same as those of the source data
• the map of target predictions exhibits much smoother spatial variability than that
quantified by the variogram model used for interpolation (Kriging); the amount of excess
smoothness depends on the source support configuration and density
Kriging variance at target support $t_m$ = measure of average (covariance) distance
between $t_m$ and all $N$ source supports $\{s_n, n = 1, \ldots, N\}$;
a glorified way of finding “holes” in the spatial configuration of source supports

Drawbacks of Kriging with Constant Mean (II)

What is Kriging really?
• Kriging solves a local problem, i.e., provides the best (in the least squares sense)
attribute prediction at a target support, one target support at a time. But a map of
locally optimal predictions is not necessarily the optimal map, due to smoothing. In
other words, the map of Kriging predictions should be conveyed as a table (not
displayed on a computer screen), especially in the case of sparse source data
• the Kriging variance is a realistic measure of uncertainty in target predictions only
under very specific conditions
Remedies:
• one can introduce more realism into Kriging predictions by incorporating data on
covariates; the attribute mean then becomes a function of co-located predictors
(it is no longer constant) and is estimated via a regression model with constant coefficients;
a local approach (with spatially varying regression coefficients) is also available
• spatial uncertainty is better modeled via conditional stochastic simulation, i.e.,
generation of alternative attribute fields, conditioned on (reproducing) the source data
Plausible Expectation Surfaces

[Figure: four expectation-surface maps, Longitude (−123.5 to −121.0) vs. Latitude (36.0 to 39.0): local regression using nearby precip data (OK); regression using predictor X2 (elevation); regression using predictor X4 (shum x elev); regression using predictors X3, X4, and X7]

Plausible Predicted Surfaces

[Figure: four maps of plausible predicted surfaces (Longitude −123.5 to −121.0, Latitude 36.0 to 39.0; color scale 1.00 to 11.00). Panels: OK precipitation estimates; Kriging(2) precipitation estimates; Kriging(4) precipitation estimates; Kriging(347) precipitation estimates]
Interpreting the SK Prediction & Variance

SK prediction: ŷ(tm ) is an estimate of the conditional expectation (mean) of random
variable (RV) Y (tm ), given a single joint realization (the N entries of source data vector
ys ) from N RVs {Y (sn ), n = 1, . . . , N }:
$$\hat{y}(t_m) = \hat{E}\{Y(t_m) \mid \mathbf{y}_s\}$$
SK variance: σ̂(tm ) is an estimate of the conditional variance of RV Y (tm ), given a
single joint realization (the N entries of source data vector ys ) from N RVs
{Y (sn ), n = 1, . . . , N }:
$$\hat{\sigma}(t_m) = \widehat{\mathrm{Var}}\{Y(t_m) \mid \mathbf{y}_s\}$$
Implications:
• when the joint distribution (PDF) of {Y (tm ), Y (sn ), n = 1, . . . , N } is (N + 1)-variate
Gaussian, (i) the conditional expectation of Y (tm ) given ys is a linear function of the data in
ys and equals the SK prediction ŷ(tm ), and (ii) the conditional variance of Y (tm ) given ys
does not depend on the data vector ys (homoscedasticity) and equals the SK variance σ̂(tm )
• if the joint PDF is not multivariate Gaussian, the SK prediction and variance approximate the
conditional mean and variance of Y (tm ) given the data vector ys
• Note: one could report as the best attribute estimate any other value (quantile) from the
conditional PDF of Y (tm ) given the source data in ys , instead of the SK-derived mean ŷ(tm )
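The implications above can be checked numerically. The sketch below is illustrative only: it assumes a Gaussian model, an exponential covariogram with unit sill and range 10, and made-up coordinates and data; `sk_point` is a hypothetical helper, not a function from any library. It computes the SK prediction and variance for two different data vectors at the same supports, then reads off a 0.9 quantile from the Gaussian conditional PDF.

```python
import numpy as np
from statistics import NormalDist

def exp_cov(a, b, sill=1.0, corr_range=10.0):
    """Exponential covariogram sill * exp(-3 d / range) between two point sets."""
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return sill * np.exp(-3.0 * d / corr_range)

def sk_point(src_xy, y_src, tgt_xy, mean, sill=1.0, corr_range=10.0):
    """Simple Kriging prediction and variance at a single target point."""
    Sigma = exp_cov(src_xy, src_xy, sill, corr_range)           # source-to-source
    sigma_m = exp_cov(src_xy, tgt_xy[None, :], sill, corr_range)[:, 0]  # source-to-target
    w = np.linalg.solve(Sigma, sigma_m)                         # SK weights
    pred = mean + w @ (y_src - mean)     # known mean + weighted sum of residuals
    var = sill - w @ sigma_m             # does not involve y_src at all
    return pred, var

src_xy = np.array([[0.0, 0.0], [4.0, 1.0], [1.0, 5.0]])
tgt = np.array([2.0, 2.0])

# Same supports, two different (hypothetical) data vectors
pred_a, var_a = sk_point(src_xy, np.array([3.0, 5.0, 4.0]), tgt, mean=4.0)
pred_b, var_b = sk_point(src_xy, np.array([9.0, 1.0, 7.0]), tgt, mean=4.0)

# Under the Gaussian assumption, any quantile of the conditional PDF is available
q90_a = NormalDist(mu=pred_a, sigma=var_a ** 0.5).inv_cdf(0.9)
```

The two predictions differ, but `var_a` and `var_b` coincide, which is the homoscedasticity property stated in item (ii).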

Recap
Geostatistical spatial prediction via Simple Kriging accounts for:
• known expected values (“climatologies”) at any source or target support
• statistical proximity (correlation) between source and target supports (for the residual
component)
• redundancy (inverse correlation) between source supports, distributing weights among
nearby supports (again for the residual component)
Simple Kriging is a (no-intercept) linear regression model with (0-mean) lagged variables
Fundamental differences from other methods:
• everything depends on the model covariogram (which is typically built from a sample
semivariogram), thus reflecting the notion that weights should account for the nature
of residual spatial variability (smooth versus rough) of a particular attribute
• an output of Kriging is a set of reliability measures for the predicted values. Such
measures are encoded in the prediction error variances, which are independent of the
actual data values and only depend on the sample configuration and covariogram
model; Kriging error variances are often used in sampling design applications
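Because the SK variance depends only on the sample configuration and the covariogram model, it can be evaluated before any new measurement is taken. The following sketch illustrates one simple sampling-design use under stated assumptions: an exponential covariogram with unit sill and range 10, a hypothetical clustered network, and a criterion (mean SK variance over a target grid) chosen here for illustration rather than taken from the handout.

```python
import numpy as np

def exp_cov(a, b, sill=1.0, corr_range=10.0):
    """Exponential covariogram sill * exp(-3 d / range) between two point sets."""
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return sill * np.exp(-3.0 * d / corr_range)

def sk_variance(src_xy, tgt_xy, sill=1.0, corr_range=10.0):
    """SK variance at every target point; no data values are needed."""
    Sigma = exp_cov(src_xy, src_xy, sill, corr_range)
    S_st = exp_cov(src_xy, tgt_xy, sill, corr_range)   # (N, M)
    W = np.linalg.solve(Sigma, S_st)                   # one weight vector per column
    return sill - np.sum(W * S_st, axis=0)

def best_new_site(src_xy, candidates, tgt_xy):
    """Pick the candidate site that minimizes the mean SK variance over targets."""
    scores = [sk_variance(np.vstack([src_xy, c]), tgt_xy).mean() for c in candidates]
    return int(np.argmin(scores)), scores

# Hypothetical network clustered in one corner of a 10 x 10 region
src_xy = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
g = np.linspace(0.0, 10.0, 6)
gx, gy = np.meshgrid(g, g)
tgt_xy = np.column_stack([gx.ravel(), gy.ravel()])

# Candidate 0 sits inside the cluster, candidate 1 fills a "hole"
candidates = np.array([[0.5, 0.5], [7.0, 7.0]])
baseline = sk_variance(src_xy, tgt_xy).mean()
best, scores = best_new_site(src_xy, candidates, tgt_xy)
```

In this configuration the candidate in the unsampled part of the region is selected, since a site next to existing supports is largely redundant with them.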
Simple Kriging with Data of Different Supports

Setting:
• source data of the same attribute Y measured at different sampling units (supports);
e.g., polygons of different size and shape, points & pixels, points & polygons
• target values of arbitrary support; e.g., points, pixels, or polygons, or combinations
Corresponding SK system:
• the SK weights are obtained through the same SK system of equations Σwm = σm ;
in other words, the normal equations do not “see” the type of data or unknowns
considered: normal equations require source-to-source and source-to-target
covariances, no matter the complexity of such source and target supports
• the only modification in the SK system is the way in which Σ and σm are populated;
the most consistent way to proceed is to compute any source-to-source covariance
σ(sn , sn′ ) or any source-to-target covariance σ(sn , tm ) as a function of a point
support covariogram model σ(d; θ)
Some particular cases: point-to-point (punctual) Kriging, point-to-block (block) Kriging,
block-to-block Kriging (areal interpolation), block-to-point Kriging (downscaling)
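One common way to derive covariances between arbitrary supports from a point-support covariogram is to discretize each support into points and average the point covariances over all pairs. The sketch below assumes that approach, which the handout does not spell out, together with an exponential model of unit sill and range 10; the helper names and the square block are hypothetical.

```python
import numpy as np

def exp_cov(a, b, sill=1.0, corr_range=10.0):
    """Point-support exponential covariogram sill * exp(-3 d / range)."""
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return sill * np.exp(-3.0 * d / corr_range)

def discretize_square(center, side, n=4):
    """n x n regular grid of points inside a square block."""
    offs = ((np.arange(n) + 0.5) / n - 0.5) * side
    gx, gy = np.meshgrid(offs, offs)
    return np.column_stack([gx.ravel(), gy.ravel()]) + np.asarray(center)

def support_cov(pts_a, pts_b, sill=1.0, corr_range=10.0):
    """Covariance between two supports, each given by its discretizing points."""
    return exp_cov(pts_a, pts_b, sill, corr_range).mean()

point = np.array([[0.0, 0.0]])                  # a point support is a single point
block = discretize_square((5.0, 5.0), side=2.0) # a 2 x 2 block centered at (5, 5)

c_pp = support_cov(point, point)   # point-to-point: equals the sill
c_pb = support_cov(point, block)   # point-to-block
c_bb = support_cov(block, block)   # block-to-block: variance of the block average
```

These values would populate Σ and σm exactly as in the point case; only the way each entry is computed changes. The block variance `c_bb` is smaller than the point variance, reflecting the averaging over the block.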

Simple Kriging with Data Corrupted by Measurement Error

Setting:
• source data of the same attribute Y measured at arbitrary supports, but now such
data are corrupted by measurement error
• assumptions: (i) additive measurement error, i.e., data = signal + error,
(ii) mean error (bias) and error (co)variance between source supports are known, and
(iii) measurement error does not depend on (is uncorrelated with) the signal
Corresponding SK system:
• the SK weights are obtained through the same SK system of equations Σwm = σm ;
in other words, the normal equations do not “see” the quality of the data or
unknowns considered: normal equations require source-to-source and source-to-target
covariances, no matter the quality of the source data; there is no quality issue for the
target values, since the objective is typically to predict error-free y-values at target supports
• the only modification in the SK system is the way in which Σ and σm are populated;
one accounts for the measurement error covariance only in the source-to-source
covariance matrix, Σd = Σs + Σe , and not in the source-to-target covariance vector σm ;
this is known as Factorial Kriging
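A minimal sketch of this modification follows. It assumes zero mean error (no bias), an exponential signal covariogram with unit sill and range 10, and uncorrelated errors with a hypothetical variance of 0.5; `sk_with_error` and the data values are illustrative names and numbers only.

```python
import numpy as np

def exp_cov(a, b, sill=1.0, corr_range=10.0):
    """Exponential covariogram sill * exp(-3 d / range) between two point sets."""
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return sill * np.exp(-3.0 * d / corr_range)

def sk_with_error(src_xy, y_src, tgt_xy, mean, err_cov, sill=1.0, corr_range=10.0):
    """SK of the error-free signal from noisy data (zero-bias errors assumed)."""
    Sigma_s = exp_cov(src_xy, src_xy, sill, corr_range)   # signal covariance
    Sigma_d = Sigma_s + err_cov                           # data covariance
    sigma_m = exp_cov(src_xy, tgt_xy[None, :], sill, corr_range)[:, 0]  # unchanged
    w = np.linalg.solve(Sigma_d, sigma_m)
    return mean + w @ (y_src - mean), sill - w @ sigma_m

src_xy = np.array([[0.0, 0.0], [4.0, 1.0], [1.0, 5.0]])
y_src = np.array([3.0, 5.0, 4.0])
tgt = src_xy[0]                       # predict at the first source support

no_error = np.zeros((3, 3))
iid_error = 0.5 * np.eye(3)           # hypothetical error variance of 0.5
pred_clean, var_clean = sk_with_error(src_xy, y_src, tgt, 4.0, no_error)
pred_noisy, var_noisy = sk_with_error(src_xy, y_src, tgt, 4.0, iid_error)
```

Without measurement error the prediction at a source support reproduces the datum with zero variance. With error, the datum is pulled toward the known mean and the neighboring data, and the SK variance at that support is positive.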
Kriging with Covariates (I)

• source data = N measurements of attribute Y (dependent variable) stored in vector
ys and of K auxiliary variables (covariates) stored in a (N × K) design matrix Xs
• objective: predict target y-values at M target supports where only M sets of
measurements of these K covariates are available and stored in a (M × K) matrix Xt
Regression with correlated errors + Kriging of residuals:
• (i) consider a regression model between the source data of the dependent and auxiliary
variables: ys = Xs βs + es , and let β̂s denote the (K × 1) vector of coefficients estimated
via Generalized Least Squares; (ii) compute source residuals as rs = ys − Xs β̂s and
perform SK to predict the (M × 1) vector of target residuals r̂t ; (iii) export β̂s to the target
supports (this can only be done under certain conditions) and estimate the target mean
vector as μ̂t = Xt β̂s ; (iv) compute final y-predictions as ŷt = μ̂t + r̂t
• three Kriging variants, depending on the covariates: Ordinary Kriging (covariates = vector
of 1s; this corresponds to an unknown but constant mean), Universal Kriging (when the
auxiliary data are functions of coordinates), or Kriging with External Drift (general case)
• Requisite: a covariogram model for the regression errors (difficult to get); to compute
the final Kriging variance, the SK system needs to be modified. . .
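Steps (i) to (iv) can be sketched as follows. The example assumes the covariogram of the regression errors is already known (exponential, unit sill, range 10), which is precisely the difficult requisite noted above, and it returns predictions only, not the modified Kriging variance. The function name, coordinates, elevation covariate, and data values are all hypothetical.

```python
import numpy as np

def exp_cov(a, b, sill=1.0, corr_range=10.0):
    """Exponential covariogram sill * exp(-3 d / range) between two point sets."""
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return sill * np.exp(-3.0 * d / corr_range)

def regression_kriging(src_xy, X_s, y_s, tgt_xy, X_t, sill=1.0, corr_range=10.0):
    """GLS trend + Simple Kriging of residuals; returns (predictions, beta_hat)."""
    Sigma = exp_cov(src_xy, src_xy, sill, corr_range)    # error covariance, sources
    S_st = exp_cov(src_xy, tgt_xy, sill, corr_range)     # sources to targets (N, M)
    # (i) GLS estimate: beta = (X' Sigma^-1 X)^-1 X' Sigma^-1 y
    Si_X = np.linalg.solve(Sigma, X_s)
    Si_y = np.linalg.solve(Sigma, y_s)
    beta = np.linalg.solve(X_s.T @ Si_X, X_s.T @ Si_y)
    # (ii) source residuals and SK of target residuals (residual mean is 0)
    r_s = y_s - X_s @ beta
    r_t = S_st.T @ np.linalg.solve(Sigma, r_s)
    # (iii) target mean from exported coefficients, (iv) final prediction
    return X_t @ beta + r_t, beta

src_xy = np.array([[0.0, 0.0], [4.0, 1.0], [1.0, 5.0], [6.0, 6.0], [8.0, 2.0]])
elev_s = np.array([0.1, 0.5, 0.9, 1.4, 0.3])             # hypothetical covariate
X_s = np.column_stack([np.ones(5), elev_s])              # intercept + elevation
y_s = np.array([3.0, 4.2, 5.1, 6.3, 3.6])

tgt_xy = np.array([[2.0, 2.0], [7.0, 4.0]])
X_t = np.column_stack([np.ones(2), np.array([0.6, 0.8])])
y_hat, beta_hat = regression_kriging(src_xy, X_s, y_s, tgt_xy, X_t)
```

Setting `X_s` and `X_t` to a single column of 1s corresponds to the Ordinary Kriging case described above, with the GLS estimate playing the role of the unknown constant mean.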

Kriging with Covariates (II)

• source data = N measurements of attribute Y (dependent variable) stored in vector
ys and of K auxiliary variables (covariates) stored in a (N × K) matrix Xs
• objective: predict target y-values at M target supports where only M sets of
measurements of these K covariates are available and stored in a (M × K) matrix Xt
Simple Co-Kriging (SCK):
• instead of considering deterministic (fixed) auxiliary data, one can treat them as
stochastic; in this case the cross-covariances between y-data and lagged (or not)
x-data can be computed and jointly modeled (cumbersome, but doable)
• the SCK weights are obtained through the same system of normal equations
Σwm = σm ; in other words, the normal equations do not “see” the type of the data
or unknowns considered: normal equations require source-to-source and
source-to-target covariances, no matter the attribute pertaining to the source data or
target values
• the only modification is that Σ now contains (N + K × N + K) entries, actually
(K × K) blocks of size (N × N ) each, corresponding to (N × N ) pairs of
supports for different attribute combinations; similarly for σm
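A small Simple Co-Kriging sketch with one covariate is given below. To keep the joint covariance model valid, it assumes an intrinsic coregionalization, i.e., every auto- and cross-covariance is the same exponential correlogram (range 10) scaled by an entry of a positive-definite matrix B. This modeling choice, the function names, and all numbers are assumptions made for illustration; the handout does not prescribe them.

```python
import numpy as np

def exp_corr(a, b, corr_range=10.0):
    """Exponential correlogram exp(-3 d / range) between two point sets."""
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return np.exp(-3.0 * d / corr_range)

def simple_cokriging(y_xy, y_obs, x_xy, x_obs, tgt, mu_y, mu_x, B):
    """SCK prediction and variance of Y at one target point, using y- and x-data."""
    t = tgt[None, :]
    # Block covariance matrix: [y-y, y-x; x-y, x-x]
    Sigma = np.block([
        [B[0, 0] * exp_corr(y_xy, y_xy), B[0, 1] * exp_corr(y_xy, x_xy)],
        [B[1, 0] * exp_corr(x_xy, y_xy), B[1, 1] * exp_corr(x_xy, x_xy)],
    ])
    # Covariances of all data with the target y-value
    sigma_m = np.concatenate([B[0, 0] * exp_corr(y_xy, t)[:, 0],
                              B[0, 1] * exp_corr(x_xy, t)[:, 0]])
    w = np.linalg.solve(Sigma, sigma_m)            # same normal equations as SK
    resid = np.concatenate([y_obs - mu_y, x_obs - mu_x])
    return mu_y + w @ resid, B[0, 0] - w @ sigma_m

B = np.array([[1.0, 0.8], [0.8, 1.0]])             # hypothetical (co)variances
y_xy = np.array([[0.0, 0.0], [4.0, 1.0], [1.0, 5.0]])
y_obs = np.array([3.0, 5.0, 4.0])
tgt = np.array([6.0, 6.0])
x_xy = np.vstack([y_xy, tgt])                      # covariate also known at target
x_obs = np.array([1.2, 2.9, 2.1, 3.4])

pred_sck, var_sck = simple_cokriging(y_xy, y_obs, x_xy, x_obs, tgt, 4.0, 2.0, B)

# Plain SK variance from the y-data alone, for comparison
r_m = exp_corr(y_xy, tgt[None, :])[:, 0]
var_sk = B[0, 0] - r_m @ np.linalg.solve(exp_corr(y_xy, y_xy), r_m)
```

Because the covariate is available at the target support and is correlated with Y, the co-Kriging variance is smaller than the variance obtained from the y-data alone.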
Kriging with non-Gaussian Data
Setting:
• source data and target values of the same attribute Y measured at arbitrary supports,
but now the multivariate Gaussian assumption is not applicable
Two avenues for prediction:
• Kriging is applied to transformations of the original source data; several
transformations exist, all aiming at making the data distribution more symmetric and
eventually Gaussian; when the transformation is the logarithmic function, one talks
about Lognormal Kriging
• models of non-Gaussian spatial processes exist but are rather complicated; e.g.,
generalized linear models with spatial correlation (model-based geostatistics),
indicator Kriging or disjunctive Kriging
The Kriging variance σ̂(tm ) is no longer homoscedastic,
e.g., in the log-normal case σ̂(tm ) depends on the corresponding target prediction ŷ(tm )
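The log-normal case can be illustrated with the standard moments of a lognormal distribution. The sketch assumes Simple Kriging has been carried out on log-transformed data under a Gaussian model, so that the conditional distribution in log space is Gaussian with the SK prediction as mean and the SK variance as variance; the function name and input values are hypothetical.

```python
import numpy as np

def lognormal_backtransform(log_pred, log_var):
    """Original-scale conditional mean and variance from log-space SK output."""
    mean = np.exp(log_pred + 0.5 * log_var)        # lognormal mean
    var = (np.exp(log_var) - 1.0) * mean ** 2      # lognormal variance
    return mean, var

# Two targets with the same log-space SK variance but different predictions
log_pred = np.array([0.5, 2.0])
log_var = np.array([0.3, 0.3])
mean_orig, var_orig = lognormal_backtransform(log_pred, log_var)
```

Although both targets share the same SK variance in log space, their variances on the original scale differ, because the back-transformed variance is proportional to the squared back-transformed prediction.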

Phaedon C. Kyriakidis Geostatistical Spatial Prediction: Simple Kriging total # of slides = 42
