Introduction
Data: measurement of a spatial attribute available at various sampling units (supports).
Often, such supports correspond to (few & scattered in the study region) point locations,
e.g., monitoring stations
[Figure: maps of the study region (Longitude −123.5 to −121.0, Latitude 36.0 to 38.5).]
In this handout: a brief introduction to spatial interpolation via Kriging with a known
expected attribute value (think of long-term average or climatology) at any (with or
without measurements) support; this procedure is also termed Simple Kriging
Objective: predict (estimate), using the N sample data, the unknown attribute values
{y(tm ), m = 1, . . . , M } at a set of M supports {tm , m = 1, . . . , M }, e.g., at points
coinciding with the nodes of a regular raster
Terminology:
• the sample attribute measurements in vector ys are often termed source data, and
the unknown attribute values target values (sampling units: source & target supports)
• for now, it is assumed that both source data & target values are defined over
quasi-point supports, and pertain to the same attribute Y ;
it is also assumed that no other data on related attributes (covariates) are available
• global vs local interpolation: data at all N source supports can be used for prediction
at tm , or only a subset available at N (tm ) << N closest source supports
Linear prediction: the predicted target residual r̂(tm ) at the m-th target support tm is
expressed as a weighted sum of the N source residuals {r(sn ), n = 1, . . . , N }:
$$\hat{r}(t_m) = \sum_{n=1}^{N} w_m(s_n)\, r(s_n) \quad\text{and}\quad \hat{y}(t_m) = \mu(t_m) + \hat{r}(t_m)$$
where wm (sn ) is the weight given to the n-th source residual r(sn ) for prediction at the m-th target support tm
Determining the weights: in almost all spatial interpolation methods, the N weights
{wm (sn ), n = 1, . . . , N } are functions of the spatial configuration of target and source
supports. More precisely, such weights are often functions of the N (Euclidean)
target-to-source distances {dmn , n = 1, . . . , N } between the m-th target support tm and
the N source supports. In other words, weights are first derived based on the source
supports and are then applied to the source data or residuals. . .
[Figure: sample variogram and model variogram γ versus Distance (degrees), with maps of the study region (Longitude, Latitude).]
[Figure: model functions versus lag distance d, including covariance σ(d) and correlation ρ(d), each panel with the range marked.]
Conversion between models: when the sill σ(0) = γ(∞) of the semivariogram model is finite:
• Covariogram → correlogram: ρ(d) = σ(d) / σ(0)
• Semivariogram → correlogram: ρ(d) = 1 − γ(d) / σ(0)
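The two conversions can be checked numerically. The sketch below assumes an exponential covariogram σ(d) = σ(0) exp(−3d/a) and uses the standard relation γ(d) = σ(0) − σ(d) for a finite sill; the sill and range values, as well as the function names, are illustrative assumptions rather than values taken from the handout.

```python
import math

SILL = 10.0    # sigma(0); illustrative value (assumption)
RANGE = 10.0   # practical range a; illustrative value (assumption)

def covariogram(d):
    """Exponential covariogram: sigma(d) = sill * exp(-3 d / range)."""
    return SILL * math.exp(-3.0 * d / RANGE)

def semivariogram(d):
    """Semivariogram for a finite sill: gamma(d) = sigma(0) - sigma(d)."""
    return SILL - covariogram(d)

def correlogram_from_covariogram(d):
    """rho(d) = sigma(d) / sigma(0)."""
    return covariogram(d) / covariogram(0.0)

def correlogram_from_semivariogram(d):
    """rho(d) = 1 - gamma(d) / sigma(0)."""
    return 1.0 - semivariogram(d) / covariogram(0.0)

# Both routes should give the same correlation at every lag.
for d in (0.0, 2.5, 5.0, 10.0):
    print(d,
          round(correlogram_from_covariogram(d), 4),
          round(correlogram_from_semivariogram(d), 4))
```

Both routes lead to the same correlogram, which is why any one of the three models determines the other two once the sill is known.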
SK prediction:
$$\hat{y}(t_m) - \mu(t_m) = \sum_{n=1}^{N} w_m(s_n)\,[y(s_n) - \mu(s_n)] = \mathbf{w}_m^T \mathbf{r}_s = \mathbf{w}_m^T [\mathbf{y}_s - \boldsymbol{\mu}_s]$$
• as in any other spatial interpolation method, one accounts for the proximity of the N
source supports to the target support tm . Note: Vector dm changes from one target
support tm to another, hence the subscript m
• unlike other interpolation methods, Kriging also accounts for the proximity between
source supports themselves (sample configuration or data layout). Note: For global
interpolation, matrix D of source-to-source distances is the same for all target supports
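The two distance objects mentioned above, the target-to-source vector dm and the source-to-source matrix D, can be sketched as follows. The coordinates are an assumption: the text does not list them, and they were picked here because they reproduce the distance values used in this handout's seven-support example.

```python
import math

# Assumed coordinates (not given in the text); chosen so that the
# resulting distances match the handout's seven-support example.
sources = [(63, 140), (61, 139), (71, 140), (64, 129),
           (73, 141), (68, 128), (75, 128)]
target = (65, 137)

# Target-to-source distance vector d_m: one per target support.
d_m = [math.dist(s, target) for s in sources]

# Source-to-source distance matrix D: for global interpolation it is
# computed once and reused for every target support.
D = [[math.dist(p, q) for q in sources] for p in sources]

print([round(d, 2) for d in d_m])
for row in D:
    print([round(v, 2) for v in row])
```

Only `d_m` has to be recomputed when the target moves; `D` depends on the sample layout alone.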
[Figure: map of the seven source supports (1) to (7) with data values 791, 696, 606, 477, 227, 646 and 783, and the target support marked "?".]
From distance matrices to model covariance matrices: Take any distance value dnm
or dnn′ , i.e., any entry in dm or D, and transform it, via the covariogram model or
kernel, to a covariance value σ(dnm ) or σ(dnn′ )
[Figure: map of the source supports and target, next to the correlogram ρ(d) = exp(−3d/10) plotted against d.]
$$
\mathbf{d}_m =
\begin{bmatrix} 3.61 \\ 4.47 \\ 6.71 \\ 8.06 \\ 8.94 \\ 9.49 \\ 13.45 \end{bmatrix}
\;\rightarrow\;
\boldsymbol{\sigma}_m = 1 \times
\begin{bmatrix}
\exp(-3 \times 3.61/10) \\ \exp(-3 \times 4.47/10) \\ \exp(-3 \times 6.71/10) \\ \exp(-3 \times 8.06/10) \\ \exp(-3 \times 8.94/10) \\ \exp(-3 \times 9.49/10) \\ \exp(-3 \times 13.45/10)
\end{bmatrix}
=
\begin{bmatrix} 0.34 \\ 0.26 \\ 0.13 \\ 0.09 \\ 0.07 \\ 0.06 \\ 0.02 \end{bmatrix}
$$
with σm = sill × exp(−3dm /range)
These would be the weights, had one ignored auto-correlation between source data
[Figure: map of the source supports and target, next to the correlogram ρ(d) = exp(−3d/10) plotted against d.]
$$
\mathbf{D} =
\begin{bmatrix}
0.00 & 2.24 & 8.00 & 11.05 & 10.05 & 13.00 & 16.97 \\
2.24 & 0.00 & 10.05 & 10.44 & 12.17 & 13.04 & 17.80 \\
8.00 & 10.05 & 0.00 & 13.04 & 2.24 & 12.37 & 12.65 \\
11.05 & 10.44 & 13.04 & 0.00 & 15.00 & 4.12 & 11.05 \\
10.05 & 12.17 & 2.24 & 15.00 & 0.00 & 13.93 & 13.15 \\
13.00 & 13.04 & 12.37 & 4.12 & 13.93 & 0.00 & 7.00 \\
16.97 & 17.80 & 12.65 & 11.05 & 13.15 & 7.00 & 0.00
\end{bmatrix}
\;\rightarrow\;
\boldsymbol{\Sigma} =
\begin{bmatrix}
1.00 & 0.51 & 0.09 & 0.04 & 0.05 & 0.02 & 0.01 \\
0.51 & 1.00 & 0.05 & 0.04 & 0.03 & 0.02 & 0.00 \\
0.09 & 0.05 & 1.00 & 0.02 & 0.51 & 0.02 & 0.02 \\
0.04 & 0.04 & 0.02 & 1.00 & 0.01 & 0.29 & 0.04 \\
0.05 & 0.03 & 0.51 & 0.01 & 1.00 & 0.02 & 0.02 \\
0.02 & 0.02 & 0.02 & 0.29 & 0.02 & 1.00 & 0.12 \\
0.01 & 0.00 & 0.02 & 0.04 & 0.02 & 0.12 & 1.00
\end{bmatrix}
$$
with Σ = sill × exp(−3D/range)
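The transformation from distances to covariances shown above can be reproduced with a few lines of code. This is a minimal sketch: the coordinates are assumed (they are not listed in the text, but they reproduce D and dm), the sill of 1 and range of 10 are read from the worked example, and the function name `cov` is hypothetical.

```python
import math

# Assumed coordinates (not given in the text); they reproduce D and d_m.
sources = [(63, 140), (61, 139), (71, 140), (64, 129),
           (73, 141), (68, 128), (75, 128)]
target = (65, 137)
SILL, RANGE = 1.0, 10.0   # sill = 1 turns covariances into correlations

def cov(d, sill=SILL, a=RANGE):
    """Exponential covariogram: sill * exp(-3 d / a)."""
    return sill * math.exp(-3.0 * d / a)

# Source-to-target covariance vector sigma_m (entry-wise transform of d_m).
sigma_m = [cov(math.dist(p, target)) for p in sources]

# Source-to-source covariance matrix Sigma (entry-wise transform of D).
Sigma = [[cov(math.dist(p, q)) for q in sources] for p in sources]

print([round(v, 2) for v in sigma_m])
for row in Sigma:
    print([round(v, 2) for v in row])
```

The same `cov` function populates both objects, which is what keeps the source-to-source and source-to-target covariances mutually consistent.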
A Note on Stationarity
There is no requirement that a stationary specification of the covariance vector σm and
the covariance matrix Σ be adopted. In other words, one can specify unequal source
variances (diagonal elements of Σ), as well as covariances between two supports
(source-to-target or source-to-source) which are not functions of distance
Second-order stationarity is a typical working hypothesis when dealing with spatial data
(or a single cross-section from spatio-temporal data), whereby all diagonal entries of Σ
are equal to the variogram sill σ(0), and all covariances (off-diagonal entries of Σ and
entries of σm ) are functions only of the distance between supports
Σwm = σm
• a system of N equations in N unknowns (the weights in wm ) for prediction at
support tm ; there are M such systems for M target supports, since σm changes from
one target support to another
• a version of the system of normal equations used in multiple linear regression. For
Kriging, the dependent variable pertains to the target support tm , and there are N
predictor (lagged) variables pertaining to the N source supports
• the SK system is also known with different names in different disciplines, e.g.,
collocation in surveying, Yule-Walker equations in time-series modeling, Wiener
prediction in electrical engineering, objective interpolation in atmospheric sciences
• under 2nd-order stationarity: σ(sn ) = σ(0), σ(sn , sn′ ) = σ(0)ρ(dnn′ ) and
σ(sn , tm ) = σ(0)ρ(dnm ); in this case, the weights do not depend on the sill σ(0)
[Figure: map of the source supports and target, next to the correlogram ρ(d) = exp(−3d/10) plotted against d.]
SK system:
$$
\underbrace{\begin{bmatrix}
1.00 & 0.51 & 0.09 & 0.04 & 0.05 & 0.02 & 0.01 \\
0.51 & 1.00 & 0.05 & 0.04 & 0.03 & 0.02 & 0.00 \\
0.09 & 0.05 & 1.00 & 0.02 & 0.51 & 0.02 & 0.02 \\
0.04 & 0.04 & 0.02 & 1.00 & 0.01 & 0.29 & 0.04 \\
0.05 & 0.03 & 0.51 & 0.01 & 1.00 & 0.02 & 0.02 \\
0.02 & 0.02 & 0.02 & 0.29 & 0.02 & 1.00 & 0.12 \\
0.01 & 0.00 & 0.02 & 0.04 & 0.02 & 0.12 & 1.00
\end{bmatrix}}_{\boldsymbol{\Sigma}}
\underbrace{\begin{bmatrix}
w_m(s_1) \\ w_m(s_2) \\ w_m(s_3) \\ w_m(s_4) \\ w_m(s_5) \\ w_m(s_6) \\ w_m(s_7)
\end{bmatrix}}_{\mathbf{w}_m}
=
\underbrace{\begin{bmatrix}
0.34 \\ 0.26 \\ 0.13 \\ 0.09 \\ 0.07 \\ 0.06 \\ 0.02
\end{bmatrix}}_{\boldsymbol{\sigma}_m}
$$
System of normal equations: Per classical (ordinary) least squares regression theory, the vector β
of regression coefficients can be estimated as:
$$\mathbf{X}^T\mathbf{X}\boldsymbol{\beta} = \mathbf{X}^T\mathbf{y} \quad\text{or}\quad \frac{1}{Q}\mathbf{X}^T\mathbf{X}\boldsymbol{\beta} = \frac{1}{Q}\mathbf{X}^T\mathbf{y}$$
Covariance-based version: Since all variables are assumed to have 0-mean, the matrix-matrix
product (1/Q) XT X is an estimate of the (2 × 2) covariance matrix Σ between the predictors, and
the matrix-vector product (1/Q) XT y is an estimate of the (2 × 1) covariance vector σ between the
predictors and the dependent variable. The above system of normal equations can therefore be
also written as:
Σβ = σ
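The equivalence of the two forms of the normal equations can be illustrated with simulated data. The sketch below is only an illustration under stated assumptions: Q is taken to be the number of observations, the two predictors and the dependent variable are synthetic, and the helper names (`centre`, `dot`, `solve2`) are hypothetical.

```python
import random

random.seed(0)
Q = 500  # number of observations (assumed meaning of Q)

# Synthetic predictors and dependent variable (illustrative only).
x1 = [random.gauss(0, 1) for _ in range(Q)]
x2 = [0.5 * a + random.gauss(0, 1) for a in x1]
y = [0.7 * a + 0.2 * b + random.gauss(0, 0.5) for a, b in zip(x1, x2)]

def centre(v):
    """Subtract the sample mean so the variable has 0-mean."""
    m = sum(v) / len(v)
    return [e - m for e in v]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def solve2(A, b):
    """Solve a 2x2 linear system A x = b by Cramer's rule."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [(b[0] * A[1][1] - A[0][1] * b[1]) / det,
            (A[0][0] * b[1] - A[1][0] * b[0]) / det]

x1, x2, y = centre(x1), centre(x2), centre(y)

# Data-based normal equations: X^T X beta = X^T y
XtX = [[dot(x1, x1), dot(x1, x2)], [dot(x2, x1), dot(x2, x2)]]
Xty = [dot(x1, y), dot(x2, y)]
beta_data = solve2(XtX, Xty)

# Covariance-based normal equations: Sigma beta = sigma
Sigma = [[e / Q for e in row] for row in XtX]
sigma = [e / Q for e in Xty]
beta_cov = solve2(Sigma, sigma)

print(beta_data, beta_cov)
```

Once Σ and σ are available, the data matrix X is no longer needed to obtain β, which is the property Simple Kriging exploits.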
Regression with cross-section data: It is still possible to estimate the vector β of regression
coefficients if Σ and σm are known, since the version of normal equations Σβ = σm does not
call for any data. The entries of Σ and σm are simply obtained from the covariogram model σ(d)
since the source-to-source distances d11 = 0, d12 = d21 , d22 = 0 and the source-to-target
distances d1m , d2m are known
Link to Simple Kriging: The SK system Σβ = σm can therefore be seen as the covariance-based
version of the normal equations for the case of regression with lagged variables, i.e., regression
involving a dependent variable defined at a target support tm and predictor variables defined at
source supports s1 and s2 . The SK weights, previously denoted as wm , are none other than the
corresponding regression coefficients vector β
The SK regression model considers only pair-wise (two-point) interactions, i.e., there is no interaction column
x1 ⊙ x2 , where ⊙ denotes element-by-element multiplication, in the design matrix X. In other words, (i) only imaginary
d-specific scatter plots between two variables (locations) at a time are considered, and (ii) such scatter plots are
summarized by the corresponding covariance model value σ(d), which is then used to populate Σ and σm
$$\mathbf{w}_m = \boldsymbol{\Sigma}^{-1}\boldsymbol{\sigma}_m$$
• the weights vector wm is obtained by solving the SK system anew for each target
support tm since the entries of σm change from one target support to another
• the SK system has a unique solution (there is one and only one weights vector wm ) if
and only if the source-to-source covariance matrix Σ is positive definite; for 2nd-order
stationarity, this implies that a valid covariogram model σ(d; θ), e.g., exponential
distance decay, with θ containing the sill and range, is used to populate Σ; in this
case, σ(d; θ) = σ(0)ρ(d; θ) and the weights do not depend on the sill σ(0):
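$$
\begin{bmatrix} w_m(s_1) \\ \vdots \\ w_m(s_N) \end{bmatrix}
= \frac{1}{\sigma(0)}
\begin{bmatrix}
1 & \cdots & \rho(d_{1N}) \\
\vdots & \ddots & \vdots \\
\rho(d_{N1}) & \cdots & 1
\end{bmatrix}^{-1}
\sigma(0)
\begin{bmatrix} \rho(d_{1m}) \\ \vdots \\ \rho(d_{Nm}) \end{bmatrix}
$$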
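The cancellation of the sill can be verified on a small case. The sketch below uses only two source supports so that the SK system can be solved in closed form; the coordinates, the range of 10 and the function name `sk_weights` are illustrative assumptions.

```python
import math

# Two source supports and one target; coordinates are illustrative assumptions.
s1, s2, t = (63, 140), (61, 139), (65, 137)
RANGE = 10.0

def sk_weights(sill):
    """Solve the 2x2 SK system Sigma w = sigma_m (exponential covariogram)."""
    def cov(d):
        return sill * math.exp(-3.0 * d / RANGE)
    c11 = c22 = cov(0.0)                      # source variances = sill
    c12 = cov(math.dist(s1, s2))              # source-to-source covariance
    b1, b2 = cov(math.dist(s1, t)), cov(math.dist(s2, t))  # source-to-target
    det = c11 * c22 - c12 * c12
    # Cramer's rule for the two weights.
    return ((b1 * c22 - c12 * b2) / det, (c11 * b2 - c12 * b1) / det)

w_unit_sill = sk_weights(1.0)
w_large_sill = sk_weights(10.0)
print(w_unit_sill, w_large_sill)
```

The two weight vectors agree up to floating-point error: under 2nd-order stationarity, only the shape of the correlogram (here its range) matters for the weights.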
SK weights
[Figure: map of the source supports annotated with their SK weights (−0.001, 0.267, 0.102, 0.116, 0.064, 0.028, 0.007); prediction = 592.17, variance = 8.58.]
original weights vector (wm = σm ) modified by Σ−1 to account for sample redundancy;
e.g., wm (s1 ) = 0.27 instead of ρ(d1m ) = 0.34
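The effect of Σ−1 on the weights can be reproduced by solving the seven-support SK system numerically. This is a sketch under assumptions: the coordinates are not given in the text (they were chosen because they reproduce D and dm), the correlogram is the exponential model with range 10 from the worked example, and `solve` is a small hypothetical helper written here to avoid external dependencies.

```python
import math

# Assumed coordinates (not given in the text); they reproduce D and d_m.
sources = [(63, 140), (61, 139), (71, 140), (64, 129),
           (73, 141), (68, 128), (75, 128)]
target = (65, 137)

def rho(d, a=10.0):
    """Exponential correlogram exp(-3 d / a) with range a = 10."""
    return math.exp(-3.0 * d / a)

def solve(A, b):
    """Gaussian elimination with partial pivoting; returns x with A x = b."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]   # augmented matrix
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(M[i][k]))
        M[k], M[p] = M[p], M[k]
        for i in range(k + 1, n):
            f = M[i][k] / M[k][k]
            for j in range(k, n + 1):
                M[i][j] -= f * M[k][j]
    x = [0.0] * n
    for i in reversed(range(n)):                     # back substitution
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

Sigma = [[rho(math.dist(p, q)) for q in sources] for p in sources]
sigma_m = [rho(math.dist(p, target)) for p in sources]
w = solve(Sigma, sigma_m)

for n, (naive, sk) in enumerate(zip(sigma_m, w), start=1):
    print(f"s{n}: rho(d) = {naive:.3f}   SK weight = {sk:.3f}")
```

Supports that sit close to another support (for instance s1 and s2, which are 2.24 units apart) share their influence, so each receives less weight than its correlation with the target alone would suggest.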
[Figure: two covariogram models σ(d) versus lag distance d and the SK weights each produces. First model: prediction = 592.17, variance = 8.58. Second model: weights 0.372, 0.193, 0.167, 0.131, 0.058, 0.019 and −0.009; prediction = 571.93, variance = 5.74.]
[Figure: same comparison for another pair of covariogram models. First model: prediction = 592.17, variance = 8.58. Second model: weights 0.664, 0.435, 0.140, 0.005, −0.025, −0.045 and −0.315; prediction = 599.91, variance = 4.73.]
Gaussian covariogram yields larger weights for nearby data; negative weights possible
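SK prediction:
$$
\hat{y}(t_m) = \mu(t_m) + \mathbf{w}_m^T \mathbf{r}_s
= \mu(t_m) + \begin{bmatrix} w_m(s_1) & \cdots & w_m(s_N) \end{bmatrix}
\begin{bmatrix} y(s_1) - \mu(s_1) \\ \vdots \\ y(s_N) - \mu(s_N) \end{bmatrix}
$$
$$
\hat{y}(t_m) = \mu(t_m) + \sum_{n=1}^{N} w_m(s_n)\,[y(s_n) - \mu(s_n)]
\quad\text{and}\quad
\hat{\sigma}(t_m) = \sigma(t_m) - \sum_{n=1}^{N} w_m(s_n)\,\sigma(s_n, t_m)
$$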
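SK prediction:
$$
\begin{aligned}
\hat{y}(t_m) &= \mu + \sum_{n=1}^{N} w_m(s_n)\,[y(s_n) - \mu]
= \mu + \sum_{n=1}^{N} w_m(s_n)\, y(s_n) - \sum_{n=1}^{N} w_m(s_n)\,\mu \\
&= \sum_{n=1}^{N} w_m(s_n)\, y(s_n) + \Big[1 - \sum_{n=1}^{N} w_m(s_n)\Big]\mu
\end{aligned}
$$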
SK prediction = weighted sum of source data + weighted expectation μ
weight of expectation μ = complement to 1 of sum of SK weights
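The two equivalent forms of the SK prediction can be compared numerically. In the sketch below, the weights and data values are read from the handout's figures and their pairing with supports s1 to s7 is inferred, so it should be treated as an assumption; the constant mean μ is not stated in the text and the average of the seven data values is used here purely as a stand-in.

```python
# SK weights and data values as read from the figures; the pairing with
# supports s1..s7 is inferred (assumption).
w = [0.267, 0.116, 0.102, 0.064, -0.001, 0.028, 0.007]
y = [696.0, 477.0, 606.0, 227.0, 791.0, 646.0, 783.0]

# Known constant mean: not given in the text, the data average is assumed.
mu = sum(y) / len(y)

# Form 1: mean plus weighted sum of residuals.
pred_from_residuals = mu + sum(wn * (yn - mu) for wn, yn in zip(w, y))

# Form 2: weighted sum of data plus weighted mean.
weight_of_mean = 1.0 - sum(w)
pred_from_data = sum(wn * yn for wn, yn in zip(w, y)) + weight_of_mean * mu

print(round(pred_from_residuals, 2), round(pred_from_data, 2),
      round(weight_of_mean, 3))
```

With these rounded weights the mean receives a weight of about 0.417, and the prediction lands close to the 592.17 reported in the figure; the small gap is consistent with rounding of the weights and with the assumed value of μ.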
$$
\hat{y}(t_m) = \mu(t_m) + \sum_{n=1}^{N} w_m(s_n)\,[y(s_n) - \mu(s_n)]
\qquad
\hat{\sigma}(t_m) = \sigma(t_m) - \sum_{n=1}^{N} w_m(s_n)\,\sigma(s_n, t_m)
$$
• for sample data far away (beyond correlation range) from target support tm , all
weights are equal to 0. In this case, r̂(tm ) = 0, hence ŷ(tm ) = μ(tm ) and
σ̂(tm ) = σ(tm ) (= σ(0) under 2nd-order stationarity): the SK target prediction equals
the known mean μ(tm ) and the SK variance equals the known variance σ(tm ); away from
the sample data, SK yields back the (assumed known) attribute mean and variance at the
target support tm
• for all other target supports, the SK predictions depend on the source support
configuration and their source data, while the SK variances depend only on the source
support configuration; both SK predictions and variances depend on the covariogram
model σ(d; θ) adopted (in the case of 2nd-order stationarity)
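Both bullets can be checked on a two-support example small enough to solve in closed form. Everything numeric in this sketch is an illustrative assumption (coordinates, data values, mean, sill and range), and `simple_kriging` is a hypothetical helper name.

```python
import math

# Illustrative layout and covariogram parameters (all assumptions).
s1, s2 = (63, 140), (61, 139)
MU, SILL, RANGE = 600.0, 10.0, 10.0

def cov(d):
    """Exponential covariogram: sill * exp(-3 d / range)."""
    return SILL * math.exp(-3.0 * d / RANGE)

def simple_kriging(t, y1, y2):
    """Return (prediction, variance) at target t from the 2x2 SK system."""
    c11 = c22 = cov(0.0)
    c12 = cov(math.dist(s1, s2))
    b1, b2 = cov(math.dist(s1, t)), cov(math.dist(s2, t))
    det = c11 * c22 - c12 * c12
    w1 = (b1 * c22 - c12 * b2) / det
    w2 = (c11 * b2 - c12 * b1) / det
    prediction = MU + w1 * (y1 - MU) + w2 * (y2 - MU)
    variance = SILL - (w1 * b1 + w2 * b2)   # sigma(t) - sum of w * sigma(s, t)
    return prediction, variance

near = simple_kriging((65, 137), 696.0, 477.0)
near_other_data = simple_kriging((65, 137), 0.0, 1000.0)
far = simple_kriging((500, 500), 696.0, 477.0)

print("near target:", near)
print("same target, different data:", near_other_data)
print("far target:", far)
```

The far target returns the mean and the sill, while changing the data at the near target alters the prediction but leaves the variance untouched, since the variance is computed from covariances only.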
[Figure: sample variogram and model variogram γ versus Distance (degrees), maps of the study region (Longitude, Latitude), and a plot of predicted precipitation (mm).]
• the Kriging variance is a realistic measure of uncertainty in target predictions only
under very specific conditions. . .
Remedies:
• one can introduce more realism into Kriging predictions by incorporating data on
covariates; the attribute mean then becomes a function of co-located predictors
(it is no longer constant) and is estimated via a regression model with constant coefficients. . .
a local approach (with spatially varying regression coefficients) is also available
[Figure: maps of the study region (Longitude, Latitude), including the panels "Regression using predictor X4 (shum x elev)" and "Regression using predictors X3, X4, and X7".]
Recap
Geostatistical spatial prediction via Simple Kriging accounts for:
• known expected values ("climatologies") at any source or target support
• statistical proximity (correlation) between source and target supports (for the residual
component)
• redundancy (inverse correlation) between source supports, distributing weights among
nearby supports (again for the residual component)
Simple Kriging is a (no-intercept) linear regression model with (0-mean) lagged variables
Corresponding SK system:
• the SK weights are obtained through the same SK system of equations Σwm = σm ;
in other words, the normal equations do not “see” the type of data or unknowns
considered: normal equations require source-to-source and source-to-target
covariances, no matter the complexity of such source and target supports
• the only modification in the SK system is the way in which Σ and σm are populated;
the most consistent way to proceed is to compute any source-to-source covariance
σ(sn , sn′ ) or any source-to-target covariance σ(sn , tm ) as a function of a point
support covariogram model σ(d; θ)
Some particular cases: point-to-point (punctual) Kriging, point-to-block (block) Kriging,
block-to-block Kriging (areal interpolation), block-to-point Kriging (downscaling)
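One common way to derive covariances for non-point supports from a point-support covariogram is to average point-to-point covariances over a set of points discretizing each block. The sketch below follows that idea; the covariogram parameters, the block geometry, the 4 × 4 discretization and all function names are illustrative assumptions, not part of the handout.

```python
import math

SILL, RANGE = 10.0, 10.0   # illustrative covariogram parameters (assumption)

def cov(d):
    """Point-support exponential covariogram: sill * exp(-3 d / range)."""
    return SILL * math.exp(-3.0 * d / RANGE)

def block_points(x0, y0, size, k=4):
    """Centres of a k-by-k grid of cells discretizing a square block
    whose lower-left corner is (x0, y0)."""
    step = size / k
    return [(x0 + (i + 0.5) * step, y0 + (j + 0.5) * step)
            for i in range(k) for j in range(k)]

def point_to_block_cov(point, block):
    """Average covariance between a point and the block's discretization points."""
    return sum(cov(math.dist(point, q)) for q in block) / len(block)

def block_to_block_cov(block_a, block_b):
    """Average covariance over all pairs of discretization points."""
    total = sum(cov(math.dist(p, q)) for p in block_a for q in block_b)
    return total / (len(block_a) * len(block_b))

source = (63, 140)                        # hypothetical point source support
block = block_points(64, 136, size=2.0)   # hypothetical 2 x 2 target block

print(point_to_block_cov(source, block))  # entry of sigma_m for block Kriging
print(block_to_block_cov(block, block))   # block "variance", below the point sill
```

These averaged covariances would simply replace the corresponding entries of Σ and σm; the SK system itself is solved in the same way as for point supports.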
• the only modification in the SK system is the way in which Σ and σm are populated;
one adds the measurement error covariance only to the source-to-source
covariance matrix, Σd = Σs + Σe , and not to σs ; this is known as Factorial Kriging
• Requisite: a covariogram model for the regression errors (difficult to get); to compute
the final Kriging variance, the SK system needs to be modified. . .
Setting:
• source data and target values of the same attribute Y measured at arbitrary supports,
but now the multivariate Gaussian assumption is not applicable
• models of non-Gaussian spatial processes exist but are rather complicated; e.g.,
generalized linear models with spatial correlation (model-based geostatistics),
indicator Kriging or disjunctive Kriging