Technical References
Contributing authors:
Catherine Bleins
Matthieu Bourges
Jacques Deraisme
François Geffroy
Nicolas Jeanne
Ophélie Lemarchand
Sébastien Perseval
Frédéric Rambert
Didier Renard
Yves Touffait
Laurent Wagner
Table of Contents
Introduction
1. Hints on Learning Isatis
2. Getting Help
Generalities
3. Structure Identification in the Intrinsic Case
    3.1 The Experimental Variability Functions
    3.2 Variogram Model
    3.3 The Automatic Sill Fitting Procedure
4. Non-stationary Modeling
    4.1 Unique Neighborhood
    4.2 Moving Neighborhood
    4.3 Case of External Drift(s)
    4.4 Case of Kriging With Bayesian Drift
5. Automatic Variogram Fitting
    5.1 General Optimization
    5.2 Quadratic Optimization under Linear Constraints
    5.3 Minimization of a Sum of Squares
6. Quick Interpolations
    6.1 Inverse Distances
    6.2 Least Square Polynomial Fit
    6.3 Moving Projected Slope
    6.4 Discrete Splines
    6.5 Bilinear Grid Interpolation
7. Grid Transformations
    7.1 List of the Grid Transformations
    7.2 Filters
8. Linear Estimation
9. Gaussian
10. Non
11. Krig
12. Turn
13. Trun
14. Plurigaussian Simulations
    14.1 Principle
    14.2 Variography
    14.3 Simulation
    14.4 Implementation
15. Impalas Multiple-Point Statistics
16. Fractal Simulations
    16.1 Principle
    16.2 Midpoint Displacement Method
    16.3 Interpolation Method
    16.4 Spectral Synthesis
17. Annealing Simulations
18. Spill Point Calculation
    18.1 Introduction
    18.2 Basic Principle
    18.3 Maximum Reservoir Thickness Constraint
    18.4 The "Forbidden types" of control points
    18.5 Limits of the algorithm
    18.6 Converting Unknown volumes into Inside ones
19. Multivariate Recoverable Resources Models
    19.1 Theoretical reminders on Discrete Gaussian model applied to Uniform Conditioning
    19.2 Theoretical reminders on Discrete Gaussian model applied to block simulations
20. Localized Uniform Conditioning
    20.1 Algorithm
21. Skin algorithm
22. Meandering Channel Simulation (Flumy)
23. Automatic Variogram Modeling of the Residuals for Universal Kriging
24. Isatoil
Introduction
- for new users, to get familiar with Isatis and to give some guidelines for carrying a study through,
- for all users, to improve their geostatistical knowledge by presenting detailed geostatistical workflows.
Basically, each case study describes, as precisely as possible, how to carry out some specific calculations in Isatis. You may either:
- replay the case study proposed in the manual yourself, as all the data sets are installed on your disk together with the software,
- or simply be guided by the descriptions and apply the workflow to your own data sets.
2. Getting Help
You have 3 options for getting help while using Isatis: the On-Line Help system, the Frequently
Asked Questions and the Technical Support team (support@geovariances.com).
Generalities
This technical reference reviews the main tools available in Isatis to describe the spatial variability
(regularity, continuity, ...) of the variable(s) of interest, commonly referred to as the "Structure", in
the Intrinsic Case.
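In the formulas below, n is the number of pairs of data separated by the considered distance, $Z_\alpha$ and $Z_\beta$ are the values of the variable at the two data points of a pair, and $m_Z$ and $\sigma_Z$ are the mean and the standard deviation over the whole data set.
$m_Z^+$ is the mean calculated over the first points of the pairs (head)
$m_Z^-$ is the mean calculated over the second points of the pairs (tail)
$\sigma_Z^+$ and $\sigma_Z^-$ are the corresponding standard deviations over the heads and the tails
The Transitive Covariogram
$$\sum_{n} Z_\alpha Z_\beta$$
(fig. 3.1-1)
The Variogram
$$\frac{1}{2n}\sum_{n} (Z_\alpha - Z_\beta)^2$$
(fig. 3.1-2)
The Covariance (centered)
$$\frac{1}{n}\sum_{n} (Z_\alpha - m_Z)(Z_\beta - m_Z)$$
(fig. 3.1-3)
The Non-Centered Covariance
$$\frac{1}{n}\sum_{n} Z_\alpha Z_\beta$$
(fig. 3.1-4)
The Non-Ergodic Covariance
$$\frac{1}{n}\sum_{n} (Z_\alpha - m_Z^+)(Z_\beta - m_Z^-)$$
(fig. 3.1-5)
The Correlogram
$$\frac{1}{n}\sum_{n} \frac{(Z_\alpha - m_Z)(Z_\beta - m_Z)}{\sigma_Z^2}$$
(fig. 3.1-6)
The Non-Ergodic Correlogram
$$\frac{1}{n}\sum_{n} \frac{(Z_\alpha - m_Z^+)(Z_\beta - m_Z^-)}{\sigma_Z^+ \sigma_Z^-}$$
(fig. 3.1-7)
The Madogram (First Order Variogram)
$$\frac{1}{2n}\sum_{n} |Z_\alpha - Z_\beta|$$
(fig. 3.1-8)
The Rodogram (1/2 Order Variogram)
$$\frac{1}{2n}\sum_{n} \sqrt{|Z_\alpha - Z_\beta|}$$
(fig. 3.1-9)
The Relative Variogram
$$\frac{1}{2n}\sum_{n} \frac{(Z_\alpha - Z_\beta)^2}{m_Z^2}$$
(fig. 3.1-10)
The Non-Ergodic Relative Variogram
$$\frac{1}{2n}\sum_{n} \frac{(Z_\alpha - Z_\beta)^2}{\left(\frac{m_Z^+ + m_Z^-}{2}\right)^2}$$
(fig. 3.1-11)
The Pairwise Relative Variogram
$$\frac{1}{2n}\sum_{n} \frac{(Z_\alpha - Z_\beta)^2}{\left(\frac{Z_\alpha + Z_\beta}{2}\right)^2}$$
(fig. 3.1-12)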
Although the appeal of the madogram and the rodogram, compared with the variogram, is quite obvious (at least graphically), since they tend to smooth out the function, the user must always keep in mind that the only tool that corresponds to the statement of kriging (namely minimizing a variance) is the variogram. This is particularly clear when looking at the variability values (measured along the vertical axis) on the different figures, remembering that the experimental variance of the data is represented as a dashed line on the variogram picture.
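The difference in scale between these functions can be illustrated with a minimal Python sketch. It assumes a regularly sampled 1D series, so that the pairs at a given lag are simply the values separated by that number of steps. The helper names (`lag_pairs`, `variogram`, `madogram`, `rodogram`) are hypothetical and are not part of Isatis.

```python
import math

def lag_pairs(values, lag):
    """Pairs (head, tail) of a regularly sampled 1D series, `lag` steps apart."""
    return [(values[i], values[i + lag]) for i in range(len(values) - lag)]

def variogram(pairs):
    # Half the mean squared difference over the pairs
    return sum((za - zb) ** 2 for za, zb in pairs) / (2 * len(pairs))

def madogram(pairs):
    # Half the mean absolute difference (first order variogram)
    return sum(abs(za - zb) for za, zb in pairs) / (2 * len(pairs))

def rodogram(pairs):
    # Half the mean square root of the absolute difference (1/2 order variogram)
    return sum(math.sqrt(abs(za - zb)) for za, zb in pairs) / (2 * len(pairs))

if __name__ == "__main__":
    series = [1.0, 3.0, 2.0, 5.0, 4.0, 4.5]  # illustrative values only
    for lag in (1, 2, 3):
        pairs = lag_pairs(series, lag)
        print(lag, len(pairs), variogram(pairs), madogram(pairs), rodogram(pairs))
```

Only the variogram values are directly comparable to the experimental variance of the data; the madogram and rodogram live on different scales.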
Z Z 2
1---
n
---------------------------------------------2
(eq. 3.1-1)
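In the multivariate case, $Y_\alpha$ and $Y_\beta$ designate the values of the second variable at the two points of a pair, with $m_Y$, $\sigma_Y$, $m_Y^+$, $m_Y^-$ defined as for Z.
The Transitive Cross-Covariogram
$$\sum_{n} Z_\alpha Y_\beta$$
(fig. 3.1-13)
The Cross-Variogram
$$\frac{1}{2n}\sum_{n} (Z_\alpha - Z_\beta)(Y_\alpha - Y_\beta)$$
(fig. 3.1-14)
The Cross-Covariance (centered)
$$\frac{1}{n}\sum_{n} (Z_\alpha - m_Z)(Y_\beta - m_Y)$$
(fig. 3.1-15)
The Non-Centered Cross-Covariance
$$\frac{1}{n}\sum_{n} Z_\alpha Y_\beta$$
(fig. 3.1-16)
The Non-Ergodic Cross-Covariance
$$\frac{1}{n}\sum_{n} (Z_\alpha - m_Z^+)(Y_\beta - m_Y^-)$$
(fig. 3.1-17)
The Cross-Correlogram
$$\frac{1}{n}\sum_{n} \frac{(Z_\alpha - m_Z)(Y_\beta - m_Y)}{\sigma_Z \sigma_Y}$$
(fig. 3.1-18)
The Non-Ergodic Cross-Correlogram
$$\frac{1}{n}\sum_{n} \frac{(Z_\alpha - m_Z^+)(Y_\beta - m_Y^-)}{\sigma_Z^+ \sigma_Y^-}$$
(fig. 3.1-19)
The Cross-Madogram
$$\frac{1}{2n}\sum_{n} \sqrt{|(Z_\alpha - Z_\beta)(Y_\alpha - Y_\beta)|}$$
(fig. 3.1-20)
The Cross-Rodogram
$$\frac{1}{2n}\sum_{n} \sqrt[4]{|(Z_\alpha - Z_\beta)(Y_\alpha - Y_\beta)|}$$
(fig. 3.1-21)
The Relative Cross-Variogram
$$\frac{1}{2n}\sum_{n} \frac{(Z_\alpha - Z_\beta)(Y_\alpha - Y_\beta)}{m_Z m_Y}$$
(fig. 3.1-22)
The Non-Ergodic Relative Cross-Variogram
$$\frac{1}{2n}\sum_{n} \frac{(Z_\alpha - Z_\beta)(Y_\alpha - Y_\beta)}{\left(\frac{m_Z^+ + m_Z^-}{2}\right)\left(\frac{m_Y^+ + m_Y^-}{2}\right)}$$
(fig. 3.1-23)
The Pairwise Relative Cross-Variogram
$$\frac{1}{2n}\sum_{n} \frac{(Z_\alpha - Z_\beta)(Y_\alpha - Y_\beta)}{\left(\frac{Z_\alpha + Z_\beta}{2}\right)\left(\frac{Y_\alpha + Y_\beta}{2}\right)}$$
(fig. 3.1-24)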
This time most of the curves are no longer symmetrical. In the case of the covariance, it is even
convenient to split it into its odd and even parts as represented below. If h designates the distance
(vector) between the two data points constituting a pair, we then consider:
The Even Part of the Covariance
$$\frac{1}{2}\left[C_{ZY}(h) + C_{ZY}(-h)\right]$$
(fig. 3.1-25)
The Odd Part of the Covariance
$$\frac{1}{2}\left[C_{ZY}(h) - C_{ZY}(-h)\right]$$
(fig. 3.1-26)
Note - The cross-covariance function is a more powerful tool than the cross-variogram in terms of structural analysis, as it allows the identification of delay effects. However, as it requires stronger hypotheses (stationarity, estimation of means), it is not really used in the estimation steps.
In fact, the cross-variogram can be derived from the covariance as follows:
$$\gamma_{ZY}(h) = C_{ZY}(0) - \frac{1}{2}\left[C_{ZY}(h) + C_{ZY}(-h)\right]$$
and is therefore similar to the even part of the covariance. All the information carried by the odd
part of the covariance is simply ignored.
A last remark concerns the presence of information on all variables at the same data points: this
property is known as isotopy. The opposite case is heterotopy: one variable (at least) is not defined
at all the data points.
The kriging procedure in the multivariate case can cope nicely with the heterotopic case. Nevertheless, in the meantime one has to calculate cross-variograms which can obviously be established
from the common information only. This consideration is damaging in a strong heterotopic case
where the structure, only inferred on a small part of the information, is used for a procedure which
possibly operates on the whole data set.
(fig. 3.1-27)
When this ratio is constant, the variable corresponding to the simple variogram is "self-krigeable".
This means that in the isotopic case (both variables measured at the same locations) the kriging of
this variable is equal to its cokriging. This property can be extended to more than 2 variables: the
ratio should be considered for any pair of variables which includes the self-krigeable variable.
The ratio between the square root of the variogram and the madogram:
(fig. 3.1-28)
This ratio is constant and equal to $\sqrt{\pi}$ for a standard normal variable, when its pairs satisfy the hypothesis of binormality. A similar result is obtained in the case of a bigamma hypothesis.
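The binormal value of this ratio can be checked by simulation. The sketch below draws pairs from a bivariate standard normal distribution with a chosen correlation and computes the ratio of the square root of the variogram to the madogram. The function name and its parameters are hypothetical, and the Monte Carlo result is only approximate.

```python
import math
import random

def sqrt_variogram_over_madogram(correlation, n_pairs=200_000, seed=0):
    """Monte Carlo estimate of sqrt(variogram) / madogram for binormal pairs."""
    rng = random.Random(seed)
    sum_sq = 0.0
    sum_abs = 0.0
    for _ in range(n_pairs):
        z1 = rng.gauss(0.0, 1.0)
        noise = rng.gauss(0.0, 1.0)
        # z2 is standard normal and correlated with z1
        z2 = correlation * z1 + math.sqrt(1.0 - correlation ** 2) * noise
        diff = z1 - z2
        sum_sq += diff * diff
        sum_abs += abs(diff)
    variogram = sum_sq / (2 * n_pairs)
    madogram = sum_abs / (2 * n_pairs)
    return math.sqrt(variogram) / madogram

if __name__ == "__main__":
    for rho in (0.2, 0.5, 0.9):
        print(rho, sqrt_variogram_over_madogram(rho), math.sqrt(math.pi))
```

The estimate stays close to the same constant whatever the correlation, which is what makes the ratio usable as a diagnostic of binormality.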
The ratio between the variogram and the madogram:
(fig. 3.1-29)
If the data obey a mosaic model with tiles valued identically and independently, this ratio is constant.
The ratio between the cross-variogram and the square root of the product of the two simple variograms:
When two variables are in intrinsic correlation, the two simple variograms and the cross-variogram are proportional to the same basic variogram. This means that, in the case of intrinsic correlation, this ratio must be constant. When two variables are in intrinsic correlation, cokriging and kriging are equivalent in the isotopic case.
its name.
A coefficient which gives the order of magnitude of the variability along the vertical axis
(homogenous to the variance). In the case of bounded functions (covariances), this value is
simply the level of the plateau reached and is called the sill. The same concept has been kept
even for the non-bounded functions and we continue to call it sill for convenience. The interest of this value is that it always comes as a multiplicative coefficient and therefore can be
calculated using automatic procedures, as explained further. The sill is equal to "C" in the
following models.
A parameter which affects the horizontal axis by normalizing the distances: hence the name
of scale factor. This term avoids having to normalize the space where the variable is defined
beforehand (for example when data are given in microns whereas the field extends on several kilometers). This scale factor is also linked to the physical parameter of the selected
basic function.
When the function is bounded, it reaches a constant level (sill) or even changes its expression after a given distance: this distance value is the range (or correlation distance in statistical language) and is equal to the scale factor. For the bounded functions where the sill is
reached asymptotically, the scale factor corresponds to the distance where the function
reaches 95% of the sill (also called practical range). For functions where the sill is reached
asymptotically in a sinusoidal way (hole-effect variogram), the scale factor is the distance
from which the variation of the function does not exceed 5% around the sill value.
This is why, in the variogram formulae, we systematically introduce the coefficient $\rho$ (norm), which gives the relationship between the Scale Factor (SF) and the parameter a:
$$SF = \rho \cdot a$$
For homogeneity of the notations, the norm and a are kept even for the functions which
depend on a single parameter (linear variogram for example): the only interest is to manipulate distances "standardized" by the scaling factor and therefore to reduce the risk of numerical instabilities.
Finally, the scale factor is used in case of anisotropy. For bounded functions, it is easy to say
that the variable is anisotropic if the range varies with the direction. This concept is generalized to any basic function using the scale factor which depends on the direction, in the calculation of the distance.
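a chart representing the shape of the function for various values of the parameters.
Spherical Variogram
$$\gamma(h) = \begin{cases} C\left[\dfrac{3}{2}\dfrac{h}{a} - \dfrac{1}{2}\left(\dfrac{h}{a}\right)^3\right] & h < a \\ C & h \ge a \end{cases} \qquad \rho = 1$$
(eq. 3.2-1)
(fig. 3.2-1)
Exponential Variogram
$$\gamma(h) = C\left[1 - \exp\left(-\frac{h}{a}\right)\right] \qquad \rho = 2.996$$
(eq. 3.2-2)
(fig. 3.2-2)
Gaussian Variogram
$$\gamma(h) = C\left[1 - \exp\left(-\left(\frac{h}{a}\right)^2\right)\right] \qquad \rho = 1.731$$
(eq. 3.2-3)
(fig. 3.2-3)
Cubic Variogram
$$\gamma(h) = \begin{cases} C\left[7\left(\dfrac{h}{a}\right)^2 - \dfrac{35}{4}\left(\dfrac{h}{a}\right)^3 + \dfrac{7}{2}\left(\dfrac{h}{a}\right)^5 - \dfrac{3}{4}\left(\dfrac{h}{a}\right)^7\right] & h < a \\ C & h \ge a \end{cases} \qquad \rho = 1$$
(eq. 3.2-4)
(fig. 3.2-4)
Cardinal Sine Variogram
$$\gamma(h) = C\left[1 - \frac{\sin(h/a)}{h/a}\right] \qquad \rho = 20.371$$
(eq. 3.2-5)
(fig. 3.2-5)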
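Stable Variogram
$$\gamma(h) = C\left[1 - \exp\left(-\left(\frac{h}{a}\right)^\alpha\right)\right] \qquad \rho = (2.996)^{1/\alpha}$$
(eq. 3.2-6)
(fig. 3.2-6)
Variograms (SF = 8 and $\alpha$ = 0.25, 0.50, 0.75, 1, 1.25, 1.5, 1.75, 2)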
Note - The technique for simulating stable variograms is not implemented in the Turning Bands
method.
Gamma Variogram
$$\gamma(h) = C\left[1 - \frac{1}{\left(1 + \dfrac{h}{a}\right)^\alpha}\right] \qquad \rho = 20^{1/\alpha} - 1 \qquad \alpha > 0$$
(eq. 3.2-7)
(fig. 3.2-7)
Variograms (SF = 8 and $\alpha$ = 0.5, 1, 2, 5, 10, 20) and Simulation (SF = 10 and $\alpha$ = 2)
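J-Bessel Variogram
$$\gamma(h) = C\left[1 - 2^\alpha\,\Gamma(\alpha + 1)\,\frac{J_\alpha(h/a)}{(h/a)^\alpha}\right] \qquad \alpha > \frac{d}{2} - 1 \qquad \rho = 1$$
(eq. 3.2-8)
where (from Chilès J.P. & Delfiner P., 1999, Geostatistics: Modeling Spatial Uncertainty, Wiley Series in Probability and Statistics, New York):
- the Gamma function is defined by (Euler's integral)
$$\Gamma(\alpha) = \int_0^\infty e^{-u} u^{\alpha - 1}\,du$$
(eq. 3.2-9)
- the Bessel function of the first kind is defined by
$$J_\alpha(x) = \left(\frac{x}{2}\right)^\alpha \sum_{k=0}^{\infty} \frac{(-1)^k}{k!\,\Gamma(\alpha + k + 1)}\left(\frac{x}{2}\right)^{2k}$$
(eq. 3.2-10)
- the modified Bessel function of the first kind, used below, is defined by
$$I_\alpha(x) = \left(\frac{x}{2}\right)^\alpha \sum_{k=0}^{\infty} \frac{1}{k!\,\Gamma(\alpha + k + 1)}\left(\frac{x}{2}\right)^{2k}$$
(eq. 3.2-11)
- the modified Bessel function of the second kind, used in the K-Bessel variogram hereafter, is defined by
$$K_\alpha(x) = \frac{\pi}{2}\,\frac{I_{-\alpha}(x) - I_\alpha(x)}{\sin(\alpha\pi)}$$
(eq. 3.2-12)
(fig. 3.2-8)
K-Bessel Variogram
$$\gamma(h) = C\left[1 - \frac{(h/a)^\alpha\,K_\alpha(h/a)}{2^{\alpha - 1}\,\Gamma(\alpha)}\right] \qquad \alpha > 0 \qquad \rho = 1$$
(eq. 3.2-13)
(fig. 3.2-9)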
$$C(h) = \exp\left(-\frac{|h_z|}{a_1}\right)\cos\left(2\pi\frac{h_z}{a_2}\right)\exp\left(-\frac{|h_{xy}|}{a_1}\right) \qquad a_1 > 0,\ a_2 > 0$$
Note that C(h) is a covariance in R2 only under a condition on $a_1$ and $a_2$ (Chilès, Delfiner, Geostatistics, 1999).
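Generalized Cauchy Variogram
$$\gamma(h) = C\left[1 - \left(1 + \left(\frac{h}{a}\right)^2\right)^{-\alpha}\right] \qquad \rho = \sqrt{20^{1/\alpha} - 1} \qquad \alpha > 0$$
(eq. 3.2-14)
(fig. 3.2-10)
Linear Variogram
$$\gamma(h) = C\,\frac{h}{a} \qquad \rho = 1$$
(eq. 3.2-15)
(fig. 3.2-11)
Power Variogram
$$\gamma(h) = C\left(\frac{h}{a}\right)^\alpha \qquad 0 < \alpha < 2 \qquad \rho = 1$$
(eq. 3.2-16)
(fig. 3.2-12)
Variograms (SF = 5 and $\alpha$ = 0.25, 0.5, 0.75, 1, 1.25, 1.5, 1.75)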
Note - The technique for simulating Power variograms is not implemented in the Turning Bands
method.
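For the models that reach their sill asymptotically, the norm follows from the 95% rule stated earlier. The Python sketch below solves that condition in closed form for the stable and gamma models; the exponential and Gaussian values are the stable cases with exponent 1 and 2. Function names are hypothetical, and the closed forms assume the model expressions as written in this section.

```python
import math

SILL_FRACTION = 0.95  # practical range: distance where 95% of the sill is reached

def stable_variogram(h, a, alpha, sill=1.0):
    return sill * (1.0 - math.exp(-((h / a) ** alpha)))

def gamma_variogram(h, a, alpha, sill=1.0):
    return sill * (1.0 - 1.0 / (1.0 + h / a) ** alpha)

def stable_norm(alpha):
    # Solve 1 - exp(-norm**alpha) = 0.95, i.e. norm = (ln 20)**(1/alpha)
    return math.log(1.0 / (1.0 - SILL_FRACTION)) ** (1.0 / alpha)

def gamma_norm(alpha):
    # Solve 1 - (1 + norm)**(-alpha) = 0.95, i.e. norm = 20**(1/alpha) - 1
    return (1.0 / (1.0 - SILL_FRACTION)) ** (1.0 / alpha) - 1.0

if __name__ == "__main__":
    print("exponential norm:", stable_norm(1.0))
    print("gaussian norm:   ", stable_norm(2.0))
    a, alpha = 8.0, 0.5
    scale_factor = stable_norm(alpha) * a  # SF = norm * a
    print("stable model at SF:", stable_variogram(scale_factor, a, alpha))
```

Evaluating a model at a distance equal to the norm times a returns 95% of the sill, which is a quick consistency check on any parameterization.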
(fig. 3.2-13)
(fig. 3.2-14)
Practical calculations
The anisotropy consists of a rotation and the ranges along the different axes of the rotated system.
The rotation can be defined either globally or for each basic structure.
In the 2D case, for one basic structure, and if "u" and "v" designate the two components of the distance vector in the rotated system, we first calculate the equivalent distance:
$$d^2 = \left(\frac{u}{a_u}\right)^2 + \left(\frac{v}{a_v}\right)^2$$
(eq. 3.2-17)
where au and av are the ranges of the model along the two rotated axes.
Then this distance is used directly in the isotropic variogram expression where the range is normalized to 1.
In the case of geometric anisotropy, the value au/av corresponds to the ratio between the two main
axes of the anisotropy ellipse.
For zonal anisotropy, we can consider that the contribution of the distance component along one of
the rotated axes is discarded: this is obtained by setting the corresponding range to "infinity".
Obviously, in nature, both anisotropies can be present, and, moreover, simultaneously.
Finally the setup of any anisotropy requires the definition of a system: this is the system carrying
the anisotropy ellipsoid in case of geometric (or elliptic) anisotropy, or the system carrying the
direction or plane of zonal anisotropy.
This new system is defined by one rotation angle in 2D, or by 3 angles (dip, azimuth and plunge) in
3D. It is possible to attach the anisotropy rotation system globally or individually to each one of the
nested basic structures. This possibility leads to an enormous variety of different textures.
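The 2D calculation described above can be sketched in Python as follows. The rotation is taken counter-clockwise from the x axis, which is an assumption: the angle convention used by Isatis may differ. The function names are hypothetical. A zonal anisotropy is obtained by passing an infinite range along one rotated axis.

```python
import math

def anisotropic_distance(dx, dy, angle_deg, range_u, range_v):
    """Equivalent distance of the vector (dx, dy) in the rotated system (u, v)."""
    theta = math.radians(angle_deg)
    u = dx * math.cos(theta) + dy * math.sin(theta)
    v = -dx * math.sin(theta) + dy * math.cos(theta)
    # An infinite range discards the corresponding component (zonal anisotropy)
    return math.hypot(u / range_u, v / range_v)

def spherical_unit_range(d, sill=1.0):
    """Isotropic spherical variogram with the range normalized to 1."""
    return sill * (1.5 * d - 0.5 * d ** 3) if d < 1.0 else sill

if __name__ == "__main__":
    d = anisotropic_distance(30.0, 10.0, angle_deg=45.0, range_u=100.0, range_v=25.0)
    print(d, spherical_unit_range(d))
    # Zonal case: the component along v does not contribute
    print(anisotropic_distance(0.0, 5.0, 0.0, 3.0, math.inf))
```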
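$$A = \int C(h)\,dh$$
A is a function of the dimension of the space. The following table gives the integral ranges of the main basic structures when the sill C is set to 1, with b the distance parameter of the structure.

Structure      | 1-D | 2-D | 3-D
Nugget Effect  |  |  |
Exponential    | $2b$ | $2\pi b^2$ | $8\pi b^3$
Spherical      | $3b/4$ | $\frac{\pi}{5} b^2$ | $\frac{\pi}{6} b^3$
Gaussian       | $\sqrt{\pi}\,b$ | $\pi b^2$ | $\pi\sqrt{\pi}\,b^3$
Cardinal Sine  |  |  |
Stable         | $2b\,\Gamma\left(\frac{\alpha+1}{\alpha}\right)$ | $\pi b^2\,\Gamma\left(\frac{\alpha+2}{\alpha}\right)$ | $\frac{4\pi}{3} b^3\,\Gamma\left(\frac{\alpha+3}{\alpha}\right)$
Gamma          | $\frac{2b}{\alpha-1}$ if $\alpha > 1$, else $+\infty$ | $\frac{2\pi b^2}{(\alpha-1)(\alpha-2)}$ if $\alpha > 2$, else $+\infty$ | $\frac{8\pi b^3}{(\alpha-1)(\alpha-2)(\alpha-3)}$ if $\alpha > 3$, else $+\infty$
J-Bessel       | $2\sqrt{\pi}\,b\,\frac{\Gamma(\alpha+1)}{\Gamma(\alpha+\frac{1}{2})}$ if $\alpha > \frac{1}{2}$, else $+\infty$ | $4\pi\alpha b^2$ if $\alpha > \frac{3}{2}$, else $+\infty$ | $8\pi\sqrt{\pi}\,b^3\,\frac{\Gamma(\alpha+1)}{\Gamma(\alpha-\frac{1}{2})}$ if $\alpha > \frac{5}{2}$, else $+\infty$
K-Bessel       | $2\sqrt{\pi}\,b\,\frac{\Gamma(\alpha+\frac{1}{2})}{\Gamma(\alpha)}$ | $4\pi\alpha b^2$ | $8\pi\sqrt{\pi}\,b^3\,\frac{\Gamma(\alpha+\frac{3}{2})}{\Gamma(\alpha)}$
Gen. Cauchy    |  |  |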
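The integral range of an isotropic structure can be checked numerically by integrating the covariance over shells of radius r. The sketch below does this with a midpoint rule for the spherical and exponential models (sill 1). It is an illustrative check under these assumptions, not the way Isatis computes the values, and the function names are hypothetical.

```python
import math

def spherical_covariance(r, b):
    if r >= b:
        return 0.0
    t = r / b
    return 1.0 - 1.5 * t + 0.5 * t ** 3

def exponential_covariance(r, b):
    return math.exp(-r / b)

def integral_range(covariance, b, dimension, r_max, n_steps=20000):
    """Midpoint-rule integral of an isotropic covariance over R^d (d = 1, 2, 3)."""
    # Measure of the shell of radius r in each dimension
    shell = {
        1: lambda r: 2.0,
        2: lambda r: 2.0 * math.pi * r,
        3: lambda r: 4.0 * math.pi * r * r,
    }[dimension]
    step = r_max / n_steps
    total = 0.0
    for i in range(n_steps):
        r = (i + 0.5) * step
        total += shell(r) * covariance(r, b)
    return total * step

if __name__ == "__main__":
    b = 2.0
    for d in (1, 2, 3):
        print(d, integral_range(spherical_covariance, b, d, r_max=b))
```

For the spherical model the integration can stop at r = b, since the covariance vanishes beyond the range; for the exponential model the upper bound must be large enough for the tail to be negligible.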
Convolution
If we know that the measured variable Z is the result of a convolution p applied on the underlying
variable
Z = Y*p
(eq. 3.2-18)
We can demonstrate that the variogram of Z can be deduced from the variogram of Y as follows:
$$\gamma_Z = \gamma_Y * P \quad \text{with} \quad P = p * \check{p}$$
(eq. 3.2-19)
Therefore, if the convolution function is fully determined (its type and the corresponding parameters), specifying a model for Y will lead to the corresponding model for Z.
3.2.4 Incrementation
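In order to introduce the concept of incrementation, we must recall the link between the variogram and the covariance:
$$\gamma(h) = C(0) - C(h)$$
(eq. 3.2-20)
where $\gamma(h)$ is half the variance of the increment $Z(x+h) - Z(x)$. The generalized variogram of order k is defined in the same way from the increment of order k+1:
$$\Gamma(h) = \frac{1}{M_k}\,\mathrm{Var}\left[\Delta_h^{k+1} Z(x)\right]$$
(eq. 3.2-21)
where
$$\Delta_h^{k+1} Z(x) = \sum_{q=0}^{k+1} (-1)^q\, C_{k+1}^{q}\, Z\big(x + (k+1-q)h\big)$$
(eq. 3.2-22)
$C_n^p$ denotes the binomial coefficient, and the scaling factor $M_k = C_{2k+2}^{k+1}$ is such that, for a pure nugget effect $C_0$:
$$\Gamma(h) = \begin{cases} 0 & h = 0 \\ C_0 & h \neq 0 \end{cases}$$
(eq. 3.2-23)
The benefit of the incrementation is that the generalized variogram can be derived using the generalized covariance:
$$\Gamma(h) = \frac{1}{M_k} \sum_{p=-k-1}^{k+1} (-1)^p\, C_{2k+2}^{k+1+p}\, K(ph)$$
Then, we make explicit the relationships between $\Gamma$ and K for several orders k:
k = 0: $$\Gamma(h) = K(0) - K(h)$$
k = 1: $$\Gamma(h) = K(0) - \frac{4}{3} K(h) + \frac{1}{3} K(2h)$$
k = 2: $$\Gamma(h) = K(0) - \frac{3}{2} K(h) + \frac{3}{5} K(2h) - \frac{1}{10} K(3h)$$
(eq. 3.2-24)
If K(h) is a generalized covariance of the $|h|^\alpha$ type, then $\Gamma(h)$ is of the same type: the only difference comes from its coefficient, which is multiplied by:
$$\frac{1}{M_k} \sum_{p=-k-1}^{k+1} (-1)^p\, C_{2k+2}^{k+1+p}\, |p|^\alpha$$
(eq. 3.2-25)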
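The coefficients linking the generalized variogram to the generalized covariance can be generated for any order. The Python sketch below builds them as exact fractions from the binomial expression with the scaling factor, then folds the terms in p and -p together (K being even). The function names are hypothetical.

```python
from fractions import Fraction
from math import comb

def generalized_variogram_weights(k):
    """Weight of K(p*h) for p = -(k+1), ..., k+1 in the generalized variogram of order k."""
    m_k = comb(2 * k + 2, k + 1)  # scaling factor M_k
    weights = {}
    for p in range(-k - 1, k + 2):
        sign = -1 if p % 2 else 1  # (-1)**p, also valid for negative p
        weights[p] = Fraction(sign * comb(2 * k + 2, k + 1 + p), m_k)
    return weights

def folded_weights(k):
    """Weights of K(0), K(h), K(2h), ... after grouping p and -p (K is even)."""
    weights = generalized_variogram_weights(k)
    folded = {0: weights[0]}
    for p in range(1, k + 2):
        folded[p] = weights[p] + weights[-p]
    return folded

if __name__ == "__main__":
    for order in (0, 1, 2):
        print(order, folded_weights(order))
```

For orders 0, 1 and 2 this gives back the coefficients of the three relationships written out above.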
$b^p_{ij}$ is the sill of the cross-variogram between variables "i" and "j" (or the sill of the variogram of variable "i" for $b^p_{ii}$) for the basic structure "p".
Note - The cross-covariance value at the origin may be badly defined in the heterotopic case, or
even undefined in the fully heterotopic case. It is possible to specify the values of the simple and
cross-covariances at the origin, using for instance the knowledge about the variance-covariance
coming from another dataset.
Lajaunie (See Lajaunie C., Béhaxétéguy J.P., Elaboration d'un programme d'ajustement semi-automatique d'un modèle de corégionalisation - Théorie, Technical report N21/89/G, Paris: ENSMP, 1989, 6p).
This technique can be used, when the set of basic structures has been defined, in order to establish
the matrix of sills.
It obviously also works for a single variable. Nevertheless, we must note that it can only be used to infer the sill coefficients of the model, and does not help for all the other types of parameters, such as:
- the number and types of basic structures,
- for each one of them, the range or third coefficient (if any),
- finally, the anisotropy.
This is why the term "automatic fitting" is somewhat of a misnomer.
Considering a set of N second order stationary regionalized random functions Zi(x) we wish to
establish the multivariate model taking into account all the simple and cross covariances Cij(h).
If the variables $Z_i(x)$ are intrinsic, the covariances no longer exist and the model must then be derived from the simple and cross-variograms $\gamma_{ij}(h)$. Nevertheless, this chapter will be developed in the stationary case.
A well-known result is that the matrix of the simple and cross-covariances must be positive definite in order to ensure the positivity of the variance of any linear combination of the random variables $Z_i(x)$.
In order to build this linear model of coregionalization, we assume that the variables $Z_i$ are decomposed on a basis of stationary and orthogonal random variables generically denoted Y. These variables are grouped into P groups of random functions $Y^p$ characterized by the same covariance $C_p(h)$, called the basic structure. The count of variables within each group is equal to the number of variables N. We will then write:
$$Z_i(x) = \sum_{p=1}^{P} \sum_{k=1}^{N} a^p_{ik}\, Y^p_k(x)$$
(eq. 3.3-1)
The coefficients $a^p_{ik}$ are the coefficients of the linear model. The covariance between two variables $Z_i$ and $Z_j$ can be written:
$$C_{ij}(h) = \sum_{p=1}^{P} \sum_{k=1}^{N} a^p_{ik}\, a^p_{jk}\, C_p(h)$$
(eq. 3.3-2)
that is:
$$C_{ij}(h) = \sum_{p=1}^{P} b^p_{ij}\, C_p(h)$$
(eq. 3.3-3)
with
$$b^p_{ij} = \sum_{k=1}^{N} a^p_{ik}\, a^p_{jk}$$
3.3.1 Procedure
Assuming that the number of basic structures P, as well as all the characteristics of each basic model $C_p(h)$, are defined, the procedure determines all the coefficients $a^p_{ik}$ and derives the variance-covariance matrices.
Starting from the experimental simple and cross-covariances $C^*_{ij}(h_u)$ calculated for the lags $h_u$, the procedure minimizes the quantity:
$$\chi = \sum_{i,j} \sum_{u} \left[ C^*_{ij}(h_u) - C_{ij}(h_u) \right]^2 \omega(h_u)$$
(eq. 3.3-4)
where $\omega(h_u)$ is a weighting function chosen in order to reduce the importance of the lags with few pairs, and to increase the weight of the first lags corresponding to short distances. For more information on the choice of these weights, the user should refer to the next paragraph.
Each matrix $B_p$ is decomposed as:
$$B_p = X_p \Lambda_p X_p^T$$
(eq. 3.3-5)
where $X_p$ is the matrix composed of the normalized eigenvectors and $\Lambda_p$ is the diagonal matrix of the eigenvalues. Instead of minimizing (eq. 3.3-4) under the constraint that $B_p$ is positive definite, we prefer writing that:
$$b^p_{ij} = \sum_{k=1}^{N} a^p_{ik}\, a^p_{jk}$$
(eq. 3.3-6)
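with
$$a^p_{ik} = \sqrt{\lambda_{pk}}\; x^p_{ik}$$
(eq. 3.3-7)
where $\lambda_{pk}$ is the k-th term of the diagonal of $\Lambda_p$ and $x^p_{ik}$ the corresponding term of $X_p$. The quantity to minimize becomes:
$$\chi = \sum_{i,j} \sum_{u} \left[ C^*_{ij}(h_u) - \sum_{p=1}^{P} \sum_{k=1}^{N} a^p_{ik}\, a^p_{jk}\, C_p(h_u) \right]^2 \omega(h_u)$$
(eq. 3.3-8)
under the orthogonality constraints:
$$\sum_{i} a^p_{ik}\, a^p_{il} = 0 \qquad (k \neq l)$$
(eq. 3.3-9)
Introducing the terms:
$$K_{ij} = \sum_{u} \left[C^*_{ij}(h_u)\right]^2 \omega(h_u)$$
$$T^{pq} = \sum_{u} C_p(h_u)\, C_q(h_u)\, \omega(h_u)$$
(eq. 3.3-10)
$$A^p_{ij} = \sum_{u} C_p(h_u)\, C^*_{ij}(h_u)\, \omega(h_u)$$
the criterion can be written:
$$\chi = \sum_{i,j} K_{ij} + \sum_{i,j} \sum_{p,q} \sum_{k,l} a^p_{ik}\, a^p_{jk}\, a^q_{il}\, a^q_{jl}\, T^{pq} - 2 \sum_{i,j} \sum_{p} \sum_{k} a^p_{ik}\, a^p_{jk}\, A^p_{ij}$$
(eq. 3.3-11)
Setting its derivative with respect to each $a^p_{ik}$ to zero gives:
$$\sum_{j} \sum_{l} \sum_{q} a^p_{jk}\, a^q_{il}\, a^q_{jl}\, T^{pq} = \sum_{j} a^p_{jk}\, A^p_{ij}$$
(eq. 3.3-12)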
We shall describe the case of a single structure first before reviewing the more general case of several nested basic structures.
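With a single basic structure, the indices p and q can be dropped and the previous system becomes:
$$\sum_{j} \sum_{l} a_{jk}\, a_{il}\, a_{jl}\, T = \sum_{j} a_{jk}\, A_{ij} \qquad \forall\, i, k$$
(eq. 3.3-1)
Using the orthogonality constraints, the only non-zero term in the left-hand side of the equality is obtained when l=k:
$$a_{ik} \left[\sum_{j} a_{jk}^2\right] T = \sum_{j} a_{jk}\, A_{ij} \qquad \forall\, i, k$$
(eq. 3.3-2)
If we introduce:
$$P_k = \sum_{j} a_{jk}^2$$
(eq. 3.3-3)
then:
$$a_{ik}\, P_k\, T = \sum_{j} A_{ij}\, a_{jk} \qquad \forall\, i, k$$
(eq. 3.3-4)
so that each vector $a_{\cdot k}$ is an eigenvector of the matrix $[A_{ij}]$, associated with the eigenvalue $\lambda_k = P_k T$. With $x_{ik}$ the normalized eigenvectors:
$$a_{ik} = \begin{cases} \sqrt{\dfrac{\lambda_k}{T}}\; x_{ik} & \lambda_k > 0 \\ 0 & \lambda_k \le 0 \end{cases}$$
(eq. 3.3-5)
The minimum of $\chi$ is then, K denoting the set of indices k for which $\lambda_k > 0$:
$$\sum_{i,j} K_{ij} - \sum_{k \in K} \frac{\lambda_k^2}{T}$$
(eq. 3.3-6)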
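Step (1): for each basic structure p, the contribution of the other structures is removed from the experimental covariances:
$$K^p_{ij}(h_u) = C^*_{ij}(h_u) - \sum_{q \neq p} b^q_{ij}\, C_q(h_u)$$
(eq. 3.3-1)
and the single-structure procedure is used to minimize:
$$\sum_{i,j} \sum_{u} \left[ K^p_{ij}(h_u) - \sum_{k} a^p_{ik}\, a^p_{jk}\, C_p(h_u) \right]^2 \omega(h_u)$$
(eq. 3.3-2)
which gives:
$$b^p_{ij} = \sum_{k} a^p_{ik}\, a^p_{jk}$$
(eq. 3.3-3)
Step (2): the following quantity is minimized with respect to the coefficients $m_p$:
$$\sum_{i,j} \sum_{u} \left[ C^*_{ij}(h_u) - \sum_{p} m_p\, b^p_{ij}\, C_p(h_u) \right]^2 \omega(h_u)$$
(eq. 3.3-4)
and the coefficients are updated:
$$b^p_{ij} \rightarrow b^p_{ij}\, m_p \qquad a^p_{ik} \rightarrow \sqrt{m_p}\; a^p_{ik}$$
(eq. 3.3-5)
Return to step (1)
Step (2) is used to equalize the weight of each basic structure as the first structure processed in step (1) has more influence than the next ones.
The coefficient $m_q$ is the solution of the linear system:
$$\sum_{q} m_q \sum_{i,j} b^p_{ij}\, b^q_{ij}\, T^{pq} = \sum_{i,j} b^p_{ij}\, A^p_{ij}$$
(eq. 3.3-6)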
Note - This procedure ensures that $\chi$ converges but does not imply that the $b^p_{ij}$ converge.
performed giving different weights to different lags. The determination of these weights depends on one of the four following rules.
- The weight for each lag of each direction is proportional to the total number of pairs for all the lags of this direction.
- The weight for each lag of each direction is proportional to the number of pairs and inversely proportional to the average distance of the lag.
- The weight for each lag of each direction is inversely proportional to the number of lags in this direction.
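The three rules listed above can be sketched in Python as follows. The data layout (a dictionary mapping each direction to a list of `(n_pairs, mean_distance)` tuples) and the rule names are hypothetical; the weights are returned unnormalized, since only their proportions are specified.

```python
def lag_weights(directions, rule):
    """Unnormalized weight of each lag of each direction under one of three rules."""
    weights = {}
    for name, lags in directions.items():
        if rule == "pairs_in_direction":
            # Same weight for all lags: total number of pairs of the direction
            total_pairs = sum(n_pairs for n_pairs, _ in lags)
            w = [float(total_pairs) for _ in lags]
        elif rule == "pairs_over_distance":
            # Number of pairs divided by the average distance of the lag
            w = [n_pairs / mean_distance for n_pairs, mean_distance in lags]
        elif rule == "inverse_lag_count":
            # Inverse of the number of lags of the direction
            w = [1.0 / len(lags) for _ in lags]
        else:
            raise ValueError("unknown rule: " + rule)
        weights[name] = w
    return weights

if __name__ == "__main__":
    experimental = {
        "N0": [(120, 10.0), (200, 20.0), (150, 30.0)],
        "N90": [(80, 12.0), (60, 25.0)],
    }
    for rule in ("pairs_in_direction", "pairs_over_distance", "inverse_lag_count"):
        print(rule, lag_weights(experimental, rule))
```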
= 2
Variance-Covariance matrix :
              Variable 1   Variable 2
Variable 1        1.1347       0.5334
Variable 2        0.5334       1.8167

Variance-Covariance matrix :
              Variable 1   Variable 2
Variable 1        0.2562       0.0927
Variable 2        0.0927       0.1224

              Variable 1   Variable 2
E.Vect 1          0.8904       0.4552
E.Vect 2         -0.4552       0.8904
For each basic structure, the printout contains the following information:
In the Variance-Covariance matrix, the sill of the simple variogram for the first variable "Pb" and for the exponential basic structure is equal to 1.1347. This sill is equal to 1.8167 for the second variable "Zn" and the same exponential basic structure. The cross-variogram has a sill of 0.5334. These values correspond to the $b^p_{ij}$ matrix for the first basic structure.
This Variance-Covariance matrix is decomposed on the orthogonal normalized vectors Y1 and Y2. In this example and for the first basic structure, we can read that:
$$Pb = 0.6975\,Y_1 + 0.8051\,Y_2$$
$$Zn = 1.2737\,Y_1 - 0.4409\,Y_2$$
(eq. 3.3-7)
These coefficients are the $a^p_{ik}$ coefficients of the linear model. We can check, for example, that for the first basic structure (p=1):
$$a^1_{11} = \sqrt{\lambda_{11}}\; x^1_{11}: \quad 0.6975 = \sqrt{2.1087} \times 0.4803$$
$$a^1_{12} = \sqrt{\lambda_{12}}\; x^1_{12}: \quad 0.8051 = \sqrt{0.8426} \times 0.8771$$
(eq. 3.3-8)
$$a^1_{21} = \sqrt{\lambda_{11}}\; x^1_{21}: \quad 1.2737 = \sqrt{2.1087} \times 0.8771$$
$$a^1_{22} = \sqrt{\lambda_{12}}\; x^1_{22}: \quad -0.4409 = \sqrt{0.8426} \times (-0.4803)$$
(eq. 3.3-9)
We can easily check that the vectors $x^1_{\cdot 1}$ and $x^1_{\cdot 2}$ are orthogonal and normalized.
Each eigenvector corresponds to a line and is attached to an eigenvalue. They are displayed in decreasing order of the eigenvalues. As the variance-covariance matrix is positive semi-definite, the eigenvalues are positive or zero. Their sum is equal to the trace of the matrix, and it makes sense to express them as a percentage of the total trace. This value is called "Var. Perc.".
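The figures of this printout can be re-derived from the variance-covariance matrix of the first basic structure. The Python sketch below uses the closed-form eigen decomposition of a symmetric 2x2 matrix (it assumes a non-zero off-diagonal term) and rebuilds the coefficients as the square root of each eigenvalue times the eigenvector component. The function names are hypothetical.

```python
import math

def symmetric_2x2_eigen(b11, b12, b22):
    """Eigenvalues (decreasing) and normalized eigenvectors of [[b11, b12], [b12, b22]]."""
    trace = b11 + b22
    det = b11 * b22 - b12 * b12
    root = math.sqrt(trace * trace - 4.0 * det)
    result = []
    for lam in ((trace + root) / 2.0, (trace - root) / 2.0):
        vx, vy = b12, lam - b11  # direction solving (B - lam*I) v = 0 when b12 != 0
        norm = math.hypot(vx, vy)
        result.append((lam, (vx / norm, vy / norm)))
    return result

def lmc_coefficients(b11, b12, b22):
    """Matrix a[i][k] = sqrt(lambda_k) * x_ik for the two variables i and factors k."""
    eigen = symmetric_2x2_eigen(b11, b12, b22)
    return [[math.sqrt(max(lam, 0.0)) * vec[i] for lam, vec in eigen] for i in range(2)]

if __name__ == "__main__":
    eigen = symmetric_2x2_eigen(1.1347, 0.5334, 1.8167)
    total = sum(lam for lam, _ in eigen)
    for lam, vec in eigen:
        print("eigenvalue", lam, "vector", vec, "Var. Perc.", 100.0 * lam / total)
    print(lmc_coefficients(1.1347, 0.5334, 1.8167))
```

Summing the squares of the coefficients of one variable gives back its sill, and the cross products give back the cross-variogram sill.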
(eq. 3.4-1)
(eq. 3.4-2)
where $\nabla F(x)$ and H(x) are respectively the gradient vector and the Hessian matrix of F computed at x. Let $q_x(h)$ denote this quadratic approximation:
(snap. 3.4-1)
At the (k+1)th iteration, we obtain $x_{k+1}$ by optimizing $q_{x_k}$ with respect to h. Differentiating $q_{x_k}$ with respect to h and setting the result to zero leads to:
(snap. 3.4-2)
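For reference, the standard second-order expansion behind a Newton step can be sketched as follows. The notation follows the text; that these are the exact expressions used in the original equations is an assumption.

```latex
% Quadratic approximation of F around x (second-order Taylor expansion)
F(x + h) \approx q_x(h) = F(x) + \nabla F(x)^T h + \tfrac{1}{2}\, h^T H(x)\, h

% Stationarity of q_{x_k} with respect to h gives the Newton step
\nabla F(x_k) + H(x_k)\, h = 0
\quad \Longrightarrow \quad
x_{k+1} = x_k - H(x_k)^{-1}\, \nabla F(x_k)
```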
(eq. 3.4-4)
Newton-type methods are known to converge toward a local optimum at a very good rate when the current value is not far from this optimum. Indeed, in that case, the quadratic approximation $q_x(h)$ of F(x + h) is a very accurate approximation of F. But for a general starting value $x_0$, this method can be quite inefficient. For this reason, we use a trust-region based method.
(eq. 3.4-5)
(eq. 3.4-6)
where the radius of the trust region is a positive constant. Then we compare the gain in the objective function with the predicted gain by computing the ratio:
(snap. 3.4-3)
If r < 0, we set $x_{k+1} = x_k$: we reject the candidate value $x_c$ because $F(x_c) > F(x_k)$ (since the denominator of r is always positive). If r ≥ 0, we set $x_{k+1} = x_c$.
For the value of the radius at the next iteration: if r > 0.75, the radius is doubled.
Note - In order to simplify the constrained optimization problem, we work with ||h|| = max|hi|. With this choice, the inequality constraints become linear.
Note - Trust-region based methods are intermediate between the gradient method (robust but with a slow convergence) for small
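The accept/reject logic described above can be illustrated with a small Python sketch. It is restricted to a function whose Hessian is diagonal and positive, so that the quadratic subproblem under the max-norm constraint is solved exactly by clipping each Newton component to the box. The halving of the radius when r < 0.25 and all names are assumptions made for the illustration, not taken from the text.

```python
def trust_region_newton(f, grad, hess_diag, x0, radius=0.5, tol=1e-10, max_iter=100):
    """Trust-region Newton iteration with ||h|| = max|h_i| and a diagonal Hessian."""
    x = list(x0)
    for _ in range(max_iter):
        g = grad(x)
        if max(abs(gi) for gi in g) < tol:
            break
        d = hess_diag(x)
        # Newton step per coordinate, clipped to the box max|h_i| <= radius
        h = [max(-radius, min(radius, -gi / di)) for gi, di in zip(g, d)]
        # Gain predicted by the quadratic approximation (always positive here)
        predicted = -sum(gi * hi + 0.5 * di * hi * hi for gi, di, hi in zip(g, d, h))
        candidate = [xi + hi for xi, hi in zip(x, h)]
        r = (f(x) - f(candidate)) / predicted
        if r >= 0.0:
            x = candidate          # accept the candidate
        if r > 0.75:
            radius *= 2.0          # good agreement: enlarge the region
        elif r < 0.25:
            radius *= 0.5          # assumption: shrink on poor agreement
    return x

if __name__ == "__main__":
    f = lambda x: (x[0] - 1.0) ** 2 + 10.0 * (x[1] + 2.0) ** 2
    grad = lambda x: [2.0 * (x[0] - 1.0), 20.0 * (x[1] + 2.0)]
    hess_diag = lambda x: [2.0, 20.0]
    print(trust_region_newton(f, grad, hess_diag, [0.0, 0.0]))
```

With a full Hessian the box-constrained subproblem is a genuine quadratic program under linear constraints, which is why the max-norm is convenient.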
(eq. 3.4-1)
(eq. 3.4-2)
(eq. 3.4-3)
Ax b where means that all the components of the left-hand side vector
The principle of the algorithm is to produce a finite sequence x1,...,xv of points which satisfy the
inequality constraints and such that
(snap. 3.4-1)
where $w_1, \ldots, w_n$ is a set of weights, $\gamma_1, \ldots, \gamma_n$ is the set of experimental variogram values for the distances $h_1, \ldots, h_n$, f
is the model and x is the set of parameters. For such a sum of squares the gradient vector can be
written:
(eq. 3.4-4)
(eq. 3.4-5)
where the ith component of f is given by $f(h_i, x)$, W is the diagonal matrix carrying
the weights $w_i$ on its diagonal, and the (i, j)th term of J(x) is equal to $\partial f(h_i, x) / \partial x_j$.
(snap. 3.4-2)
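As an illustration of the quantities just defined, the sketch below evaluates the gradient of a weighted sum of squares as $-2 J^T W (\gamma - f)$, which is the standard result for this kind of objective. The exponential model, the data values and all function names are hypothetical choices made for the example.

```python
import math

def model(h, x):
    """Hypothetical exponential variogram model: sill * (1 - exp(-h / range))."""
    sill, rng = x
    return sill * (1.0 - math.exp(-h / rng))

def jacobian(hs, x):
    """J[i][j] = d f(h_i, x) / d x_j for the model above."""
    sill, rng = x
    rows = []
    for h in hs:
        e = math.exp(-h / rng)
        rows.append([1.0 - e, -sill * e * h / rng ** 2])
    return rows

def objective(hs, gammas, ws, x):
    """Weighted sum of squares between experimental values and the model."""
    return sum(w * (g - model(h, x)) ** 2 for h, g, w in zip(hs, gammas, ws))

def gradient(hs, gammas, ws, x):
    """Gradient of the objective: -2 J^T W (gamma - f)."""
    J = jacobian(hs, x)
    res = [g - model(h, x) for h, g in zip(hs, gammas)]
    return [-2.0 * sum(ws[i] * J[i][j] * res[i] for i in range(len(hs)))
            for j in range(len(x))]

hs = [1.0, 2.0, 4.0, 8.0]          # lag distances (made-up values)
gammas = [0.3, 0.55, 0.8, 1.0]     # experimental variogram values (made-up)
ws = [4.0, 3.0, 2.0, 1.0]          # weights (made-up)
x = [1.2, 3.0]                     # current parameters: sill, range
grad = gradient(hs, gammas, ws, x)
```

Comparing `grad` with a finite-difference estimate of `objective` is a convenient way to check a hand-written Jacobian before using it in an optimizer.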
3.4.4.4 Introduction
The Akaike information criterion is a measure of the relative goodness of fit of a statistical model.
It was developed by Hirotsugu Akaike, under the name of "an information criterion" (AIC), and
was first published by Akaike in 1974. It is grounded in the concept of information entropy, in
effect offering a relative measure of the information lost when a given model is used to describe
reality.
It can be said to describe the trade-off between bias and variance in model construction, or loosely
speaking between accuracy and complexity of the model. Given a data set, several candidate models may be ranked according to their AIC values. From the AIC values one may also infer that, e.g.,
the top two models are roughly in a tie and the rest are far worse. Thus, AIC provides a means for
comparison among models: a tool for model selection. AIC does not provide a test of a model in the
usual sense of testing a null hypothesis; i.e. AIC can tell nothing about how well a model fits the
data in an absolute sense. Hence, if all the candidate models fit poorly, AIC will not give any warning
of that.
3.4.4.5 Definition
In the general case, the AIC is:
AIC = 2k - 2 ln(L)
where k is the number of parameters in the statistical model, and L is the maximized value of the
likelihood function for the estimated model. Given a set of candidate models for the data, the preferred model is the one with the minimum AIC value. Hence AIC not only rewards goodness of fit,
but also includes a penalty that is an increasing function of the number of estimated parameters.
This penalty discourages over-fitting (increasing the number of free parameters in the model
improves the goodness of the fit, regardless of the number of free parameters in the data-generating
process).
The estimate, though, is only valid asymptotically: if the number of data points is small, then some
correction is often necessary.
where O stands for the observation, E for the theoretical data and $\sigma^2$ is the known variance of
the observation. The sum runs over all the observations. This definition is only useful when one
has estimates for the error on the measurements, but it leads to a situation where a chi-squared distribution can be used to test goodness of fit, provided that the errors can be assumed to have a normal distribution.
3.4.6 Application
In the usual case, the problem is stated as follows: the theoretical model that we are looking for is a
family of functions. We then wish to find the optimal function (or equivalently the optimal set of parameters)
which minimizes the sum of quadratic errors between the data and the prediction performed with
this function (called residuals):
(eq. 3.4-1)
This value can be considered as the distance between the data and the theoretical model
used to predict the data. Ideally, this distance should be as small as possible.
If we know the standard deviation $\sigma_i$ of the noise attached to each datum $y_i$, we can use it to
weight the contribution of each datum to the global distance: a sample will have a large influence if
its uncertainty is small. This weighted distance is referred to as the chi-squared statistic:
(eq. 3.4-2)
, and
its density
. Given the set of observations (x1,x2,...,xN) following the law of the random
variable X, we define the likelihood:
(eq. 3.4-3)
(eq. 3.4-4)
(eq. 3.4-5)
If we consider the error between the real data and the predicted value
(for each observation) as a random variable, the likelihood function can be
written:
(eq. 3.4-6)
Note that, for the sake of generality, each observation carries its own error variance.
(eq. 3.4-7)
(eq. 3.4-8)
and as only differences in AIC are meaningful (when comparing several parametric models), we
can write:
(eq. 3.4-9)
Let us now consider that all observation errors share the same variance. According to Burnham and
Anderson, in the special case of least squares (LS) estimation with normally distributed errors, AIC
is expressed as:
(eq. 3.4-10)
where
(eq. 3.4-11)
(eq. 3.4-12)
(eq. 3.4-13)
which should be used unless n/k>40 for the model with the largest value of k. Thus, AICc is AIC
with a greater penalty for extra parameters. Burnham & Anderson (2002) strongly recommend
using AICc, rather than AIC, if n is small or k is large. Since AICc converges to AIC as n gets
large, AICc generally should be employed regardless. Using AIC (instead of AICc) when n is not
many times larger than k, increases the probability of selecting models that have too many parameters, i.e. of overfitting. The probability of AIC overfitting can be substantial, in some cases.
AICc was first proposed by Hurvich & Tsai (1989). Different derivations of it are given by Brockwell
& Davis (2009), Burnham & Anderson (2002), and Cavanaugh (1997). All the derivations assume a
univariate linear model with normally-distributed errors (conditional upon regressors); if that
assumption does not hold, then the formula for AICc will usually change. Further discussion of
this, with examples of other assumptions, is given by Burnham & Anderson (2002, ch.7). In particular, bootstrap estimation is usually feasible.
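A small numerical sketch may help to see the size of the correction. It assumes the usual least-squares forms attributed to Burnham & Anderson, AIC = n ln(RSS/n) + 2k and AICc = AIC + 2k(k+1)/(n-k-1), where RSS is the residual sum of squares; the function names and the numbers are illustrative only.

```python
import math

def aic_least_squares(rss, n, k):
    """AIC for least-squares fitting with normal errors (assumed form)."""
    return n * math.log(rss / n) + 2 * k

def aicc_least_squares(rss, n, k):
    """Small-sample corrected AIC: adds 2k(k+1)/(n-k-1) to the AIC."""
    return aic_least_squares(rss, n, k) + 2.0 * k * (k + 1) / (n - k - 1)

# Made-up example: 20 observations, 3 parameters, residual sum of squares 5.
aic = aic_least_squares(rss=5.0, n=20, k=3)
aicc = aicc_least_squares(rss=5.0, n=20, k=3)
correction = aicc - aic   # 2*3*4 / 16 = 1.5
```

With n/k below 40, as here, the correction is of the same order as the differences one typically compares between candidate models, which is why the corrected form is recommended.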
(eq. 3.4-14)
Under the assumption that the model errors are independent and identically distributed according to
a normal distribution, this becomes:
(eq. 3.4-15)
Given any two estimated models, the model with the lower value of BIC is the one to be preferred.
The BIC is an increasing function of the error variance $\sigma_e^2$ and an increasing function of k. That is, unexplained
variation in the dependent variable and the number of explanatory variables increase the value of
BIC. Hence, lower BIC implies either fewer explanatory variables, better fit, or both. The BIC generally penalizes free parameters more strongly than does the Akaike information criterion, though it
depends on the size of n and the relative magnitude of n and k.
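The comparison of penalties can be made concrete with a short sketch. It assumes the usual form of the BIC under independent normal errors, BIC = n ln(RSS/n) + k ln(n), with RSS/n playing the role of the error variance; names and values are illustrative.

```python
import math

def bic_least_squares(rss, n, k):
    """BIC under i.i.d. normal errors (assumed form): n ln(RSS/n) + k ln(n)."""
    return n * math.log(rss / n) + k * math.log(n)

def penalty_per_parameter(n):
    """Cost of one extra parameter: (AIC penalty, BIC penalty)."""
    return 2.0, math.log(n)

bic = bic_least_squares(rss=5.0, n=20, k=3)
aic_pen, bic_pen = penalty_per_parameter(20)
```

Since ln(n) exceeds 2 as soon as n is at least 8, the BIC charges more than the AIC for each additional parameter in all but very small data sets.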
l
References
- [1] Akaike, Hirotugu (1974). "A new look at the statistical model identification". IEEE Transactions on Automatic Control 19 (6): 716-723. doi:10.1109/TAC.1974.1100705. MR0423716.
- [2] Brockwell, P. J., and Davis, R. A. (2009). Time Series: Theory and Methods, 2nd ed. Springer.
- [3] Burnham, K. P., and Anderson, D. R. (2002). Model Selection and Multimodel Inference: A Practical Information-Theoretic Approach, 2nd ed. Springer-Verlag. ISBN 0-387-95364-7.
- [4] Burnham, K. P., and Anderson, D. R. (2004). "Multimodel inference: understanding AIC and BIC in Model Selection". Sociological Methods and Research 33: 261-304.
- [5] Cavanaugh, J. E. (1997). "Unifying the derivations of the Akaike and corrected Akaike information criteria". Statistics and Probability Letters 31: 201-208.
- [6] Hurvich, C. M., and Tsai, C.-L. (1989). "Regression and time series model selection in small samples". Biometrika 76: 297-307.
- [7] Schwarz, Gideon (1978). "Estimating the Dimension of a Model". Annals of Statistics 6: 461-464.
4. Non-stationary Modeling
This page constitutes an add-on to the User Guide for Statistics / Non-stationary Modeling
This technical reference describes the non-stationary variogram modeling approach, where both the
Drift and the Covariance part of the Structure are directly derived in a calculation procedure.
In the non-stationary case (the variable shows either a global trend or local drifts), the correct tool
cannot be the variogram any more as we must deal with variables presenting much larger fluctuations. Generalized covariances are used instead. As they can be specified only when the drift
hypotheses are given, a Non-stationary Model is constituted of both the drift and the generalized
covariance parameters.
The general framework used for the non-stationary case is known as the Intrinsic Random Functions of order k (IRF-k for short). In this scope, the structural analysis is split into two steps:
- determination of the degree of the drift,
- inference of the optimal generalized covariance compatible with the degree of the drift.
The procedure described hereafter only concerns the univariate aspect. However, it has been developed
to enable the use of the external drift feature.
$$E[Z(x)] = m(x) = \sum_l a_l f^l(x)$$
(eq. 4.1-1)
$$\sum_\alpha \left[ Z_\alpha - m(x_\alpha) \right]^2$$
(eq. 4.1-2)
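$$\sum_\alpha Z_\alpha^2 - 2 \sum_l a_l \sum_\alpha f^l_\alpha Z_\alpha + \sum_{l,m} a_l a_m \sum_\alpha f^l_\alpha f^m_\alpha$$
(eq. 4.1-3)
$$\sum_m a_m \sum_\alpha f^l_\alpha f^m_\alpha = \sum_\alpha f^l_\alpha Z_\alpha$$
(eq. 4.1-4)
In matrix notation:
$$F^T F A = F^T Z$$
(eq. 4.1-5)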
The principle of this drift identification phase consists in selecting data points as targets, fitting
polynomials of several orders to their neighboring information, and deriving the
least squares error for each order assumption. The optimal drift assumption is the one which produces, on average, the smallest error variance.
The drawback to this method is its lack of robustness against possible outliers. As a matter of fact,
an outlier will produce large variances whatever the degree of the polynomial and will reduce the
discrepancy between results.
A more efficient criterion is, for each target point, to rank the least squares errors obtained for the various
polynomial orders. The first rank is assigned to the order producing the smallest error, the second
rank to the second smallest one, and so on. These ranks are finally averaged over the different target
points, and the smallest averaged rank corresponds to the optimal degree of the drift.
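The difference between the two criteria can be seen on a toy table of squared errors, sketched below. The numbers are invented so that one outlying target dominates the averaged errors; function names are hypothetical and ties between errors are not handled.

```python
def mean_errors(errors):
    """errors[t][d] = squared error at target t for drift degree d."""
    n_degrees = len(errors[0])
    return [sum(row[d] for row in errors) / len(errors) for d in range(n_degrees)]

def average_ranks(errors):
    """Rank the degrees at each target (1 = smallest error), then average the ranks."""
    n_degrees = len(errors[0])
    totals = [0.0] * n_degrees
    for row in errors:
        order = sorted(range(n_degrees), key=lambda d: row[d])
        for rank, d in enumerate(order, start=1):
            totals[d] += rank
    return [t / len(errors) for t in totals]

def argmin(values):
    return min(range(len(values)), key=lambda d: values[d])

errors = [
    [4.0, 1.0, 2.0],
    [3.0, 1.0, 1.5],
    [5.0, 2.0, 2.5],
    [100.0, 120.0, 90.0],   # outlier target: large errors whatever the degree
]
best_by_mean = argmin(mean_errors(errors))     # degree 2, driven by the outlier
best_by_rank = argmin(average_ranks(errors))   # degree 1
```

Here three targets out of four favour degree 1, but the outlier pulls the averaged errors toward degree 2; the averaged ranks (2.75, 1.5, 1.75) are not affected by the magnitude of the outlier's errors.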
$$K(h) = \sum_p b_p K_p(h)$$
(eq. 4.1-6)
$$b_p = Z^T A_p Z$$
(eq. 4.1-7)
Z = X + U
(eq. 4.1-8)
Let us first review the MINQUE approach. The covariance matrix of Z can be expanded on a basis
of authorized basic models:
$$\mathrm{Cov}(Z, Z) = \sigma_1^2 V_1 + \cdots + \sigma_r^2 V_r$$
(eq. 4.1-9)
introducing the variance components $\sigma_p^2$. We can estimate them using a quadratic form:
$$\hat{\sigma}_p^2 = Z^T A_p Z$$
(eq. 4.1-10)
= pq
V =
p Vp
2
(eq. 4.1-11)
The MINQUE is reached when the coefficients p coincide with the variance components p , but
this is precisely what we are after.
Using the vector which constitutes an increment of the data Z we can refer Ap by:
Sp T
where:
TX = 0
and check that the norm V is only involved through:
W = TV
If A and B designate real symmetric n*n matrices, we define the scalar product
<A, B>n = Tr(AVBV)
(eq. 4.1-12)
If A and B satisfy invariance conditions, then we can find respectively S and T, such that:
(eq. 4.1-13)
Then:
(eq. 4.1-14)
which defines a scalar product on the (n-k)*(n-k) matrix if k designates the number of drift terms.
With these notations, we can reformulate the MINQUE theory:
(eq. 4.1-15)
(eq. 4.1-16)
l
(eq. 4.1-17)
(eq. 4.1-18)
then
(eq. 4.1-19)
l
(eq. 4.1-20)
(eq. 4.1-21)
(eq. 4.1-22)
(eq. 4.1-23)
If designates the subspace spanned on the Hi, the optimality condition induces that Sp belongs
to this space and can be written:
(eq. 4.1-24)
(eq. 4.1-25)
This system has solutions as soon as the matrix H, with $H(i,j) = \langle H_i, H_j \rangle$, is non-singular.
When the coefficients have been calculated, the matrices $S_p$ and $A_p$ are determined and finally
the value of $b_p$ is obtained.
These coefficients must then be replaced in the formulation of the norm V and therefore in W. This
leads to new matrices $H_i$ and to new estimates of the coefficients. The procedure is iterated
until the estimates
It can be demonstrated however that the coefficients linked to a single basic structure covariance
lead to positive results which produce authorized generalized covariances.
The procedure resembles the one used in the moving neighbourhood case. All the possible combinations are tested and the ones which lead to non-authorized generalized covariances are dropped.
In order to select the optimal generalized covariance, a cross-validation test is performed and the
model which leads to the standardized error closest to 1 is finally retained.
(eq. 4.2-1)
This establishes that this estimate is a linear combination of the neighboring data. The set of
weights is given by:
(eq. 4.2-2)
As the residual from the least squares polynomial of order k coincides with a kriging estimation
using a pure nugget effect in the scope of the intrinsic random functions of order k, and as the nugget effect is an authorized model for any degree k of the drift, then:
(eq. 4.2-3)
Z 0 .
1 .
We have found a convenient way to generate one set of weights which, given a set of points, constitutes an authorized linear combination of order k (ALC-k).
(eq. 4.2-4)
introducing the generalized covariance K(h), where $K_{\alpha\beta}$ designates the value of this function K for
the distance between the points $x_\alpha$ and $x_\beta$.
We assume that the generalized covariance K(h) that we are looking for is a linear combination of a
given set of generic basic structures Kp(h), the coefficients bp (equivalent to sills) of which still
need to be determined:
(eq. 4.2-5)
We use the theorem for each one of the measures previously established, that we denote by using
the index "m":
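$$\mathrm{Var}\left[ \sum_\alpha \lambda^m_\alpha Z_\alpha \right] = \sum_\alpha \sum_\beta \lambda^m_\alpha \lambda^m_\beta K_{\alpha\beta} = \sum_p b_p \sum_\alpha \sum_\beta \lambda^m_\alpha \lambda^m_\beta K^p_{\alpha\beta}$$
(eq. 4.2-6)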
If we assume that each generic basic structure Kp(h) is entirely determined with a sill equal to 1,
each quantity:
(eq. 4.2-7)
(eq. 4.2-8)
are known.
Then the problem is to find the coefficients such that
(eq. 4.2-9)
for all the measures generated around each test data. This is a multivariate linear regression problem that we can solve by minimizing:
(eq. 4.2-10)
The weight attached to each measure m is a normalization weight introduced to reduce the influence of the ALC-k with a large variance. Unfortunately this variance is equal to:
(eq. 4.2-11)
which depends on the precise coefficients that we are looking for. This calls for an iterative procedure.
Moreover we wish to obtain a generalized covariance as a linear combination of the basic structures. As each one of the basic structures is individually authorized, we are in fact looking for a set
of weights which are positive or null. We can demonstrate that, in certain circumstances, some coefficients may be slightly negative. But in order to keep this automatic procedure flexible, we simply ignore this possibility. We should therefore perform the regression under
positivity constraints. Instead, we prefer to calculate all the possible regressions with one non-zero coefficient only, then with two non-zero coefficients, and so on. Each one of these regressions is called a subproblem.
As mentioned before, each subproblem is treated using an iterative procedure in order to reach a
correct normalization weight.
The principle is to initialize all the non-zero coefficients of the subproblem to 1. We can then derive
an initial value for the normalization weights. Using these initial weights, we can solve the
regression subproblem and derive the new coefficients, and therefore the new value of
the normalization weights. The iteration is stopped when the coefficients $b_p$ remain unchanged
between two consecutive iterations.
We must still check that the solution is authorized as the resulting coefficients, although stable, may
still be negative. The non-authorized solutions are discarded.
Anyhow, it can easily be seen that the monovariate regressions always lead to authorized solutions.
Let us assume that the generalized covariance is reduced to one basic structure
K(h) = bK0(h)
(eq. 4.2-12)
(eq. 4.2-13)
(eq. 4.2-14)
(eq. 4.2-15)
$$E[Z(x)] = m(x) = a_0 + \sum_l a_l f^l(x)$$
(eq. 4.3-1)
where the $f^l$ denote both standard monomials and external deterministic functions.
When this new decomposition has been stated, the determination of the number of terms in the drift
expansion as well as the corresponding generalized covariance is similar to the procedure explained
in the previous paragraph.
Nevertheless some additional remarks need to be mentioned.
The inference (as well as the kriging procedure) will not work properly as soon as some of the
basic drift functions, evaluated at the data locations, are linearly dependent.
In the case of a standard polynomial drift these cases are directly linked to the geometry of the data
points: a first order IRF will fail if all the neighboring data points are located on a line; a second
order IRF will fail if they belong to any quadric such as a circle, an ellipse or a set of two lines.
In the case of external drift(s), this condition involves the value of these deterministic functions at
the data points and is not always easy to check. In particular, we can imagine the case where only
the external drift is used and where the function is constant for all the samples of a (moving) neighborhood: this property with the universality condition will produce an instability in the inference of
the model or in its use via the kriging procedure.
Another concern is the degree that we can attribute to the IRF when the drift is represented by one
or several external functions. As an illustration we could imagine using two external functions corresponding respectively to the first and second coordinates of the data. This would transform the
target variable into an IRF-1 and would therefore authorize the fitting of generalized covariances
such as $K(h) = |h|^3$. As a general rule we consider that the presence of an external drift function
does not modify the degree of the IRF, which can only be determined using the standard monomials:
this is a conservative position, as we recall that a generalized covariance that can be used for an
IRF-k can always be used for an IRF-(k+1).
(eq. 4.4-1)
Where
known a priori and
is the drift,
is the residual.
(eq. 4.4-2)
The unbiasedness condition, which aims at filtering out the drift, leads to adding the following equations:
(eq. 4.4-3)
(eq. 4.4-4)
(eq. 4.4-5)
Using the optimality condition and minimizing the prediction variance, we get the following Bayesian kriging system:
(eq. 4.4-6)
(eq. 4.4-7)
(eq. 4.4-8)
With:
5. Quick Interpolations
This page constitutes an add-on to the User Guide for Interpolate / Interpolation / Quick Interpolation
The term Quick Interpolation is used to characterize an estimation technique that does not require
any explicit model of spatial structure. Such techniques usually correspond to very basic estimation algorithms
that are widespread in the literature. For simplicity, only univariate estimation techniques are
proposed.
$$Z^* = \sum_\alpha \lambda_\alpha Z_\alpha$$
(eq. 5.1-1)
The weight attached to each datum is inversely proportional to the distance $d_\alpha$ from the datum to the
target, raised to a given power p:
$$\lambda_\alpha = \frac{1 / d_\alpha^p}{\sum_\beta 1 / d_\beta^p}$$
(eq. 5.1-2)
If the smallest distance is smaller than a given threshold, the value of the corresponding sample is
simply copied at the target point:
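The inverse distance estimate and its short-distance rule can be sketched as follows. This is a minimal illustration, not the program's implementation: the function name, the default power and the default threshold are assumptions.

```python
import math

def inverse_distance(target, points, values, power=2.0, threshold=1e-6):
    """Inverse distance estimate at `target` from data `values` located at `points`."""
    dists = [math.dist(target, p) for p in points]
    nearest = min(range(len(points)), key=lambda a: dists[a])
    if dists[nearest] < threshold:
        # The closest sample is (almost) at the target: copy its value.
        return values[nearest]
    inv = [1.0 / d ** power for d in dists]
    total = sum(inv)
    # Weights inv[a] / total sum to 1.
    return sum(w / total * z for w, z in zip(inv, values))

points = [(1.0, 0.0), (2.0, 0.0)]
values = [1.0, 4.0]
estimate = inverse_distance((0.0, 0.0), points, values)      # weights 0.8 and 0.2
collocated = inverse_distance((2.0, 0.0), points, values)    # copies the sample value
```

The threshold test also protects the weight formula from a division by zero when the target coincides with a data point.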
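$$\sum_\alpha \left[ Z_\alpha - \sum_l a_l f^l_\alpha \right]^2 \text{ minimum}$$
(eq. 5.2-1)
$$\sum_l a_l \sum_\alpha f^l_\alpha f^m_\alpha = \sum_\alpha Z_\alpha f^m_\alpha$$
(eq. 5.2-2)
When the coefficients $a_l$ of the polynomial expansion are obtained, the estimation is:
$$Z^* = \sum_l a_l f^l_0$$
(eq. 5.2-3)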
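if we interpolate the top $Z = \varphi(x, y)$ of a geological stratigraphic layer, as such layers are generally nearly horizontal, it is wise to assume that the interpolator $\varphi$ is such that:
$$R_1 = \left| \frac{\partial \varphi}{\partial x} \right|^2 \text{ and } R_2 = \left| \frac{\partial \varphi}{\partial y} \right|^2 \text{ are minimum}$$
(eq. 5.4-1)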
if we consider the layer as an elastic beam that has been deformed under the action of geological stresses, it is known that shearing stresses in the layer are proportional to second order derivatives. At any point where the shearing stresses exceed a given threshold, rupture will occur. For
this reason, it is wise to assume the following condition at any point where no discontinuity
exists:
2
R3 = ,
x2
(eq. 5.4-2)
R = R1 + R2 + 1 R3 + R4 + R5
where
(eq. 5.4-3)
R 5 has little influence on the result. For this reason, the term
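$$\left( \frac{\partial \varphi}{\partial x} \right)_{i,j} = \varphi(i+1, j) - \varphi(i-1, j)$$
$$\left( \frac{\partial \varphi}{\partial y} \right)_{i,j} = \varphi(i, j+1) - \varphi(i, j-1)$$
$$\left( \frac{\partial^2 \varphi}{\partial x^2} \right)_{i,j} = \varphi(i+1, j) - 2\varphi(i, j) + \varphi(i-1, j)$$
$$\left( \frac{\partial^2 \varphi}{\partial y^2} \right)_{i,j} = \varphi(i, j+1) - 2\varphi(i, j) + \varphi(i, j-1)$$
$$\left( \frac{\partial^2 \varphi}{\partial x \partial y} \right)_{i,j} = \varphi(i+1, j+1) - \varphi(i-1, j+1) - \varphi(i+1, j-1) + \varphi(i-1, j-1)$$
(eq. 5.4-4)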
Due to this limited neighborhood for the constraints, we can minimize the global roughness in an
iterative process, using the Gauss-Seidel Method.
(fig. 5.5-1)
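$$Z^* = \frac{\Delta y}{dy} \left[ \frac{\Delta x}{dx} Z(i+1, j+1) + \left( 1 - \frac{\Delta x}{dx} \right) Z(i, j+1) \right] + \left( 1 - \frac{\Delta y}{dy} \right) \left[ \frac{\Delta x}{dx} Z(i+1, j) + \left( 1 - \frac{\Delta x}{dx} \right) Z(i, j) \right]$$
(eq. 5.5-1)
$$Z^* = Z(i, j)$$
(eq. 5.5-2)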
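The bilinear formula (eq. 5.5-1) can be checked numerically with the sketch below. It assumes that the target lies in the grid cell whose lower-left node is (i, j), and that `tx` and `ty` are the offsets of the target from that node divided by the mesh along X and Y; the function name and the array layout `z[i][j]` are choices made for the example.

```python
def bilinear(z, i, j, tx, ty):
    """Bilinear interpolation inside the cell whose lower-left node is (i, j).

    z[i][j] holds the grid values; tx and ty are relative offsets in [0, 1].
    """
    upper = tx * z[i + 1][j + 1] + (1.0 - tx) * z[i][j + 1]   # along the row j + 1
    lower = tx * z[i + 1][j] + (1.0 - tx) * z[i][j]           # along the row j
    return ty * upper + (1.0 - ty) * lower

# Values taken from the plane z = 2*i + j, which bilinear interpolation reproduces exactly.
z = [[0.0, 1.0],
     [2.0, 3.0]]
center = bilinear(z, 0, 0, 0.5, 0.5)     # average of the four nodes
inside = bilinear(z, 0, 0, 0.25, 0.5)    # 2 * 0.25 + 0.5
corner = bilinear(z, 0, 0, 0.0, 0.0)     # value at node (i, j)
```

At a relative offset of (0, 0) the result is the value of the node itself, consistent with (eq. 5.5-2).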
6. Grid Transformations
This page constitutes an add-on to the On-Line Help for: Interpolate / Interpolation / Grid Operator
/ Tools / Grid or Line Smoothing.
Except for the Grid filters, located in the Tools / Grid or Line Smoothing window and discussed in
the last section, all the Grid Transformations can be found in Interpolate / Interpolation / Grid
Operator and are performed on two different variable types:
- The binary variables, which only take the values 0 and 1,
- The real variables (sometimes called colored variables) which correspond to any numeric variable, no matter how many bits the information is coded on.
Any binary variable can be considered as a real variable; the converse is obviously wrong.
The specificity of these transformations is the use of two other sets of information:
- The threshold interval: it consists of a pair of values defining a semi-open interval of the type
[a,b[. This threshold interval is used as a cutoff in order to transform a real variable into its indicator (which is a binary variable).
- The structuring element: it consists of three parameters defining the extension of the neighborhood, expressed in terms of pixels. Each dimension is entered as the radius of the ball by which
the target pixel is dilated: when the radius is zero, the target pixel is considered alone; when the
radius is equal to 1, the neighborhood extension is 3 pixels, and so on.
An additional flag distinguishes the type of the structuring element: cross or block. The following
scheme gives an example of a 2-D structuring element with radius of 1 along X (horizontal) and 2
along Y (vertical). The left side corresponds to a cross type and the right side to a block type.
(fig. 6.0-1)
When considering a target cell located on the edge of the grid, the structuring element is reduced to
only those nodes which belong to the field: this produces an edge effect.
(fig. 6.1-1)
The previous figure presents the two initial simulations on the upper part and the corresponding
binary simulations on the bottom part. The initial simulations have been generated (using the Turning Band method) in order to reproduce:
- a spherical variogram on the left side
- a gaussian variogram on the right side
Both variograms have the same scale factor (10 pixels) and the same variance.
Each transformation will be presented using one of the previous simulations (either in its initial or
binary form) on the left and the result of the transformation on the right.
In this paragraph, the types of the arguments and the results of the grid transformations are specified using the following coding:
- v: binary variable
- w: real variable
- s: structuring element
- t: threshold
v = real2binary(w)
converts the real variable w into the binary variable v. The principle is that the output variable is
set to 1 (true) as soon as the corresponding input variable is different from zero.
w = binary2real(v)
converts the binary variable v into the real variable w.
v = thresh(w,t)
transforms the real variable w into its indicator v through the cutoff interval t. A sample is set to
1 if it belongs to the cutoff interval and to 0 otherwise.
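The two conversions toward binary variables can be sketched on flat lists of values. This is an illustration only: the Python functions below mimic the behaviour described above and are not the program's own code; the threshold is assumed to be passed as a pair (a, b) standing for the semi-open interval [a,b[.

```python
def real2binary(w):
    """Output 1 wherever the input value is different from zero."""
    return [1 if value != 0 else 0 for value in w]

def thresh(w, t):
    """Indicator of the semi-open interval [a, b[ given as the pair t = (a, b)."""
    a, b = t
    return [1 if a <= value < b else 0 for value in w]

binary = real2binary([0.0, 2.5, -1.0])                    # [0, 1, 1]
indicator = thresh([0.5, 1.0, 1.9, 2.0], (1.0, 2.0))      # [0, 1, 1, 0]
```

The lower bound belongs to the interval while the upper bound does not, so a value equal to b is coded 0.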
v2 = erosion(s,v1)
performs the erosion on the input binary image v1, using the structuring element s, storing the
result in the binary image v2. A grain is transformed into a pore if there is at least one pore in its
neighborhood, defined by the structuring element. The next figure shows an erosion with a
cross structuring element (size 1).
(fig. 6.1-2)
v2 = dilation(s,v1)
v2 is the binary image resulting from the dilation of the binary image v1 using the structuring
element s. A pore is replaced by a grain if there is at least one grain in its neighborhood, defined
by the structuring element. The next figure shows a dilation with a cross structuring element
(size 1).
(fig. 6.1-3)
l
v2 = opening(s,v1)
v2 is the binary image resulting from the opening of the binary image v1 using the structuring
element s. It is equivalent to erosion followed by a dilation, using the same structuring element.
The next figure shows an opening with a cross structuring element (size 1).
(fig. 6.1-4)
l
v2 = closing(s,v1)
v2 is the binary image resulting from the closing of the binary image v1 using the structuring element s. It is equivalent to a dilation followed by an erosion, using the same structuring element.
The next figure shows a closing with a cross structuring element (size 1).
(fig. 6.1-5)
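The four morphological operators can be sketched on a small 2D binary image (1 = grain, 0 = pore). The code assumes a cross structuring element of size 1, i.e. the target pixel and its four face neighbours, and reduces the element to the nodes inside the grid at the edges; the Python names and the argument order are choices made for this example.

```python
CROSS = [(0, 0), (1, 0), (-1, 0), (0, 1), (0, -1)]   # cross element of size 1

def _neighbors(img, ix, iy, offsets):
    """Values of the structuring element nodes that fall inside the grid."""
    ny, nx = len(img), len(img[0])
    for dx, dy in offsets:
        jx, jy = ix + dx, iy + dy
        if 0 <= jx < nx and 0 <= jy < ny:
            yield img[jy][jx]

def erosion(img, offsets=CROSS):
    """A pixel stays a grain only if there is no pore in its neighborhood."""
    return [[1 if all(_neighbors(img, ix, iy, offsets)) else 0
             for ix in range(len(img[0]))] for iy in range(len(img))]

def dilation(img, offsets=CROSS):
    """A pixel becomes a grain if there is at least one grain in its neighborhood."""
    return [[1 if any(_neighbors(img, ix, iy, offsets)) else 0
             for ix in range(len(img[0]))] for iy in range(len(img))]

def opening(img, offsets=CROSS):
    return dilation(erosion(img, offsets), offsets)

def closing(img, offsets=CROSS):
    return erosion(dilation(img, offsets), offsets)

# 5 x 5 image holding a 3 x 3 block of grains.
block = [[1 if 1 <= ix <= 3 and 1 <= iy <= 3 else 0 for ix in range(5)]
         for iy in range(5)]
eroded = erosion(block)    # only the central pixel survives
opened = opening(block)    # the central pixel dilated back into a cross
```

The opening does not restore the corners of the block: shapes that cannot contain the structuring element are lost, which is the intended filtering effect.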
l
v3 = intersect(v1,v2)
v3 is the binary image resulting from the intersection of two binary images v1 and v2. A pixel is
considered as a grain if it belongs to the grain in both initial images.
v3 = union(v1,v2)
v3 is the binary image resulting from the union of two binary images v1 and v2. A pixel is considered as a grain if it belongs to the grain in one of the initial images at least.
v2 = negation(v1)
v2 is the binary image where the grains and the pores of the binary image v1 have been inverted.
w2 = gradx(w1)
w2 is the real image which corresponds to the partial derivative of the initial real image w1 along
the X axis, obtained by comparing pixels at each side of the target node.
$$w_2(ix, iy) = \frac{w_1(ix+1, iy) - w_1(ix-1, iy)}{2\,dx}$$
(eq. 6.1-1)
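The centered difference of (eq. 6.1-1) can be sketched as below. The image is assumed to be stored as `w1[iy][ix]`; how the first and last columns are treated is not stated here, so this example simply leaves them undefined (`None`), which is an assumption.

```python
def gradx(w1, dx=1.0):
    """Centered difference along X; columns without two neighbours are left as None."""
    ny, nx = len(w1), len(w1[0])
    w2 = [[None] * nx for _ in range(ny)]
    for iy in range(ny):
        for ix in range(1, nx - 1):
            w2[iy][ix] = (w1[iy][ix + 1] - w1[iy][ix - 1]) / (2.0 * dx)
    return w2

# One row sampled from x**2 at x = 0, 1, 2, 3, 4: the exact derivative is 2x.
row = [[float(x * x) for x in range(5)]]
g = gradx(row)              # g[0][2] is 4.0
g_half = gradx(row, dx=0.5) # same pixel values on a finer mesh: 8.0
```

For a quadratic profile the centered difference returns the exact derivative, which makes this a convenient check.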
w2 = grad_xm(w1)
w2 is the real image which corresponds to the partial derivative of the initial real image w1 along
the X axis, obtained by comparing the value at target with the previous adjacent pixel.
Practically on a 2D grid:
$$w_2(ix, iy) = \frac{w_1(ix, iy) - w_1(ix-1, iy)}{dx}$$
(eq. 6.1-2)
w2 = grad_xp(w1)
w2 is the real image which corresponds to the partial derivative of the initial real image w1 along
the X axis, obtained by comparing the value at target with the next adjacent pixel.
Practically, on a 2D grid:
$$w_2(ix, iy) = \frac{w_1(ix+1, iy) - w_1(ix, iy)}{dx}$$
(eq. 6.1-3)
Note - The rightmost vertical column of the image is arbitrarily set to pore (edge effect).
The next figure represents the gradient along the X axis of the initial (real) simulation.
(fig. 6.1-6)
l
w2 = grady(w1)
w2 is the real image which corresponds to the partial derivative of the initial real image w1 along
the Y axis, obtained by comparing pixels at each side of the target node.
Practically on a 2D grid:
$$w_2(ix, iy) = \frac{w_1(ix, iy+1) - w_1(ix, iy-1)}{2\,dy}$$
(eq. 6.1-4)
w2 = grad_ym(w1)
w2 is the real image which corresponds to the partial derivative of the initial real image w1 along
the Y axis, obtained by comparing the value at target with the previous adjacent pixel.
Practically on a 2D grid:
$$w_2(ix, iy) = \frac{w_1(ix, iy) - w_1(ix, iy-1)}{dy}$$
(eq. 6.1-5)
w2 = grad_yp(w1)
w2 is the real image which corresponds to the partial derivative of the initial real image w1 along
the Y axis, obtained by comparing the value at target with the next adjacent pixel.
Practically on a 2D grid:
$$w_2(ix, iy) = \frac{w_1(ix, iy+1) - w_1(ix, iy)}{dy}$$
(eq. 6.1-6)
Note - The upper line of the image is arbitrarily set to pore (edge effect).
The next figure represents the gradient along the Y axis of the initial (real) simulation.
(fig. 6.1-7)
l
w2 = gradz(w1)
w2 is the real image which corresponds to the partial derivative of the initial real image w1 along
the Z axis, obtained by comparing pixels at each side of the target node.
Practically on a 2D grid:
$$w_2(ix, iz) = \frac{w_1(ix, iz+1) - w_1(ix, iz-1)}{2\,dz}$$
(eq. 6.1-7)
w2 = grad_zm(w1)
w2 is the real image which corresponds to the partial derivative of the initial real image w1 along
the Z axis, obtained by comparing the value at target with the previous adjacent pixel.
Practically on a 2D grid:
$$w_2(ix, iz) = \frac{w_1(ix, iz) - w_1(ix, iz-1)}{dz}$$
(eq. 6.1-8)
w2 = grad_zp(w1)
w2 is the real image which corresponds to the partial derivative of the initial real image w1 along
the Z axis, obtained by comparing the value at target with the next adjacent pixel.
Practically on a 2D grid:
$$w_2(ix, iz) = \frac{w_1(ix, iz+1) - w_1(ix, iz)}{dz}$$
(eq. 6.1-9)
w2 = laplacian(w1)
w2 is the real image which corresponds to the laplacian of the initial image w1. The next figure
represents the laplacian of the initial (real) simulation.
Practically on a 2D grid:
$$w_2(ix, iy) = \frac{w_1(ix+1, iy) - 2 w_1(ix, iy) + w_1(ix-1, iy)}{dx^2} + \frac{w_1(ix, iy+1) - 2 w_1(ix, iy) + w_1(ix, iy-1)}{dy^2}$$
(eq. 6.1-10)
Note - The one-pixel-thick frame of the image is arbitrarily set to pore (edge effect).
(fig. 6.1-8)
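The discrete laplacian of (eq. 6.1-10) can be sketched as follows. The image is assumed to be stored as `w1[iy][ix]`, and "set to pore" is interpreted here as the value 0.0 for the one-pixel frame; both points are assumptions of this example.

```python
def laplacian(w1, dx=1.0, dy=1.0):
    """Five-point laplacian; the one-pixel frame is left at 0.0 (assumed pore value)."""
    ny, nx = len(w1), len(w1[0])
    w2 = [[0.0] * nx for _ in range(ny)]
    for iy in range(1, ny - 1):
        for ix in range(1, nx - 1):
            d2x = (w1[iy][ix + 1] - 2.0 * w1[iy][ix] + w1[iy][ix - 1]) / dx ** 2
            d2y = (w1[iy + 1][ix] - 2.0 * w1[iy][ix] + w1[iy - 1][ix]) / dy ** 2
            w2[iy][ix] = d2x + d2y
    return w2

# Image sampled from x**2 + y**2, whose laplacian is 4 everywhere.
w1 = [[float(ix * ix + iy * iy) for ix in range(4)] for iy in range(4)]
w2 = laplacian(w1)
```

Only the interior pixels carry the value 4.0; the frame illustrates the edge effect mentioned in the note.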
l
w4 = divergence(w1,w2,w3)
w4 is the real image which corresponds to the divergence of a 3-D field, whose components are
expressed respectively by w1 along X, w2 along Y and w3 along Z.
Practically on a 3D grid:
$$w_4(ix, iy, iz) = \frac{w_1(ix+1, iy, iz) - w_1(ix, iy, iz)}{dx} + \frac{w_2(ix, iy+1, iz) - w_2(ix, iy, iz)}{dy} + \frac{w_3(ix, iy, iz+1) - w_3(ix, iy, iz)}{dz}$$
(eq. 6.1-11)
w4 = rotx(w1,w2,w3)
w4 is the real image which corresponds to the component along X of the rotational of 3D field,
whose components are expressed respectively by w1 along X, w2 along Y and w3 along Z.
Practically on a 3D grid:
$$w_4(ix, iy, iz) = \frac{w_2(ix, iy+1, iz) - w_2(ix, iy, iz)}{dy} - \frac{w_3(ix, iy, iz+1) - w_3(ix, iy, iz)}{dz}$$
(eq. 6.1-12)
w4 = roty(w1,w2,w3)
w4 is the real image which corresponds to the component along Y of the rotational of 3D field,
whose components are expressed respectively by w1 along X, w2 along Y and w3 along Z.
Practically on a 3D grid:
$$w_4(ix, iy, iz) = \frac{w_3(ix, iy, iz+1) - w_3(ix, iy, iz)}{dy} - \frac{w_1(ix+1, iy, iz) - w_1(ix, iy, iz)}{dx}$$
(eq. 6.1-13)
w4 = rotz(w1,w2,w3)
w4 is the real image which corresponds to the component along Z of the rotational of 3D field,
whose components are expressed respectively by w1 along X, w2 along Y and w3 along Z.
Practically on a 3D grid:
$$w_4(ix, iy, iz) = \frac{w_2(ix, iy+1, iz) - w_2(ix, iy, iz)}{dy} - \frac{w_1(ix+1, iy, iz) - w_1(ix, iy, iz)}{dx}$$
(eq. 6.1-14)
w2=gradient(w1)
w2 is the real image containing the modulus of the 2D gradient of w1.
Practically on a 2D grid:
$$w_2(ix, iy) = \sqrt{ \left[ \frac{w_1(ix+1, iy) - w_1(ix, iy)}{dx} \right]^2 + \left[ \frac{w_1(ix, iy+1) - w_1(ix, iy)}{dy} \right]^2 }$$
(eq. 6.1-15)
(fig. 6.1-9)
w2=azimuth2d(w1)
w2 is the real image containing the azimuth (in radians) of the 2D gradient of w1.
In practice, on a 2D grid:
\[ w_2(i_x, i_y) = \operatorname{atan}\left(\frac{\left(w_1(i_x, i_y+1) - w_1(i_x, i_y)\right)/dy}{\left(w_1(i_x+1, i_y) - w_1(i_x, i_y)\right)/dx}\right) \]
(eq. 6.1-16)
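The two finite-difference formulas above can be sketched with NumPy as follows. This is only an illustration, not the Isatis implementation: the function name, the array layout `w1[ix, iy]`, the use of NaN on the last row and column (where the forward difference is not available) and the quadrant-aware `arctan2` are all assumptions.

```python
import numpy as np

def gradient_modulus_and_azimuth(w1, dx=1.0, dy=1.0):
    """Forward-difference gradient of a 2D grid stored as w1[ix, iy]."""
    w1 = np.asarray(w1, dtype=float)
    gx = np.full(w1.shape, np.nan)  # NaN where ix + 1 falls outside the grid
    gy = np.full(w1.shape, np.nan)  # NaN where iy + 1 falls outside the grid
    gx[:-1, :] = (w1[1:, :] - w1[:-1, :]) / dx
    gy[:, :-1] = (w1[:, 1:] - w1[:, :-1]) / dy
    modulus = np.sqrt(gx**2 + gy**2)
    # Assumption: arctan2 is used instead of atan(gy / gx) to avoid a
    # division by zero when the X component vanishes.
    azimuth = np.arctan2(gy, gx)
    return modulus, azimuth

# Planar test surface w1 = 3 * ix + 4 * iy: the gradient is (3, 4) everywhere.
ix, iy = np.meshgrid(np.arange(4), np.arange(5), indexing="ij")
plane = 3.0 * ix + 4.0 * iy
modulus, azimuth = gradient_modulus_and_azimuth(plane)
```

On this planar surface the modulus is 5 at every node where both forward differences exist.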
(fig. 6.1-10)
w = labelling_cross(v)
w is the real image which contains the ranks (or labels) attached to each grain component that
can be distinguished in the binary image v. A grain component is the union of all the grain pixels
that can be connected using a grain path. Here two grains are connected as soon as they share a
common face. The labels are strictly positive quantities such that two pixels belong to the same
grain component if and only if they have the same label. The grain component labels are ordered
so that the largest component receives the label 1, the second largest the label 2, and so on. The
pore is given the label 0. In the following figure, only the 14 largest components are represented separately (using different colors); all the smaller ones are displayed using the same pale
grey color.
A printout of the grain component dimensions is provided: for each component dimension,
sorted in decreasing order, the program gives the count of occurrences and the total count of
pixels involved.
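The labelling rule described above can be sketched in a few lines for a 2D binary image. This is a minimal illustration, not the Isatis code: the function name and the `connectivity` argument are hypothetical, "cross" connects pixels sharing a side, and any other value also connects pixels sharing a corner (the behaviour described for labelling_block).

```python
from collections import deque

def label_components(v, connectivity="cross"):
    """Label grain components of a 2D binary image; label 1 is the largest."""
    rows, cols = len(v), len(v[0])
    if connectivity == "cross":
        steps = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    else:  # also connect pixels that only share a corner
        steps = [(di, dj) for di in (-1, 0, 1) for dj in (-1, 0, 1)
                 if (di, dj) != (0, 0)]
    comp = [[0] * cols for _ in range(rows)]
    sizes = []
    for i in range(rows):
        for j in range(cols):
            if v[i][j] and comp[i][j] == 0:
                ident = len(sizes) + 1          # provisional label
                comp[i][j] = ident
                queue, size = deque([(i, j)]), 0
                while queue:                    # breadth-first flood fill
                    a, b = queue.popleft()
                    size += 1
                    for di, dj in steps:
                        p, q = a + di, b + dj
                        if (0 <= p < rows and 0 <= q < cols
                                and v[p][q] and comp[p][q] == 0):
                            comp[p][q] = ident
                            queue.append((p, q))
                sizes.append(size)
    # Re-rank the provisional labels by decreasing component size.
    order = sorted(range(len(sizes)), key=lambda k: -sizes[k])
    rank = {old + 1: new + 1 for new, old in enumerate(order)}
    labels = [[rank[c] if c else 0 for c in row] for row in comp]
    return labels, sorted(sizes, reverse=True)

image = [
    [1, 1, 0, 0],
    [1, 0, 0, 1],
    [0, 0, 1, 0],
    [0, 0, 0, 0],
]
labels_cross, sizes_cross = label_components(image, "cross")
labels_block, sizes_block = label_components(image, "block")
```

With corner connectivity the two isolated pixels merge into a single component, which illustrates why that option yields fewer but larger components.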
(fig. 6.1-11)
w = labelcond_cross(v1,v2)
This function is similar to the labelling_cross function, but restricted to v2.
w = labelling_block(v)
This function is similar to the labelling_cross function, but, this time, two grains are connected
as soon as they share a common face or vertex. Therefore the connectivity probability is larger
here which leads to fewer but larger components.
(fig. 6.1-12)
w = labelcond_block(v1,v2)
This function is similar to the labelling_block function, but restricted to v2.
w2 = moving_average(s,w1)
w2 is a real image where each pixel is obtained as the average of the real image w1 performed
over a moving neighborhood centered on the target pixel, whose dimensions are given by the
structuring element s. The next figure represents the moving average transformation applied to
the initial simulation using the smallest cross structuring element.
(fig. 6.1-13)
w2=moving_average_cond(s,w,wf)
Performs the same operation as the moving_average function, but the neighborhood of a target
cell is reduced to the peripheral cells (included in the structuring element s) where the secondary
variable wf has the same value as in the target cell.
w2 = moving_median(s,w1)
w2 is a real image where each pixel is obtained as the median of the real image w1 performed
over a moving neighborhood centered on the target pixel, whose dimensions are given by the
structuring element s. The next figure represents the moving median transformation applied to
the initial simulation using the reference structuring element.
(fig. 6.1-14)
w2=moving_median_cond(s,w,wf)
Performs the same operation as the moving_median function, but the neighborhood of a target
cell is reduced to the peripheral cells (included in the structuring element s) where the secondary
variable wf has the same value as in the target cell.
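The four moving-window functions above share the same mechanics, which the following sketch tries to make explicit. It is not the Isatis implementation: the function name, the representation of the structuring element as a list of offsets, and the truncation of the neighborhood at the grid edges are assumptions.

```python
import numpy as np

# Smallest cross structuring element: the target cell and its four neighbors.
CROSS = [(0, 0), (-1, 0), (1, 0), (0, -1), (0, 1)]

def moving_stat(w1, offsets=CROSS, stat=np.mean, wf=None):
    """Apply `stat` over a moving neighborhood of a 2D grid.

    If `wf` is given, only the cells sharing the target cell's `wf` value
    are kept (conditional variant).
    """
    w1 = np.asarray(w1, dtype=float)
    nx, ny = w1.shape
    out = np.empty_like(w1)
    for i in range(nx):
        for j in range(ny):
            values = []
            for di, dj in offsets:
                p, q = i + di, j + dj
                inside = 0 <= p < nx and 0 <= q < ny
                if inside and (wf is None or wf[p][q] == wf[i][j]):
                    values.append(w1[p, q])
            out[i, j] = stat(values)  # never empty: the target always qualifies
    return out

w = [[1, 2, 3], [4, 5, 6], [7, 8, 100]]
facies = [[0, 0, 0], [0, 0, 0], [0, 0, 1]]
avg = moving_stat(w)
med = moving_stat(w, stat=np.median)
avg_cond = moving_stat(w, wf=facies)
```

In the corner holding the outlier 100, the average is pulled up to 38 while the median stays at 8; with the conditioning variable, the corner cell is averaged with itself only.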
w2=fill_average(w1)
When w1 is an image containing several holes (uninformed areas), you may wish to complete
the grid before performing other operations. A convenient solution is to use the fill_average
option, which replaces any unknown grid value by the average of the first non-empty rings
located around the target node.
(fig. 6.1-15)
v=imagestat(v)
This operation provides an easy way to get basic statistics on a binary image. The output binary
image is equal to the input binary image.
w2=shadowing(w1,a,b)
The resulting variable w2 corresponds to the image of the input variable w1 considered as a relief
and represented with the shadow created by a light source. The source location is characterized
by its longitude (a) and its latitude (b). The longitude angle (a) is counted from the north,
whereas the latitude (b) is positive when the source is located above the ground level. The following image
shows the shadowed image with a light source located at 10 degrees longitude and 20 degrees
latitude.
(fig. 6.1-16)
w2=integrate(w1)
This operation considers only the active grid nodes, where the input variable w1 is defined. The
output variable returns the rank of the active node. Moreover, some statistics are printed out (in
the message area): the count of active nodes is given together with the cumulative and
average quantity of the variable and of the positive variable (where only its positive values are taken
into account). The following picture shows the resulting variable represented with a grey color
map (black for low values and white for large values).
(fig. 6.1-17)
v=dynprogx(w1)
This function creates a binary image (selection) where one pixel is valid per YOZ plane, which
corresponds to the continuous path of the maximum value between the first and last YOZ planes
of the 3-D block. In the next figure, remember that the large values correspond to the lighter
color of the left side image.
(fig. 6.1-18)
v=dynprogy(w1)
This function creates a binary image (selection) where one pixel is valid per XOZ plane, which
corresponds to the continuous path of the maximum value between the first and last XOZ planes
of the 3-D block. In the next figure, remember that the large values correspond to the lighter
color of the left side image.
(fig. 6.1-19)
v=dynprogz(w1)
This function creates a binary image (selection) where one pixel is valid per XOY plane, which
corresponds to the continuous path of the maximum value between the first and last XOY
planes of the 3-D block. In the next figure, remember that the large values correspond to the
lighter color of the left side image.
v=maxplanex(w)
This function creates a binary image (selection) where one pixel is valid per YOZ plane, which
corresponds to the largest value of the function in this plane. In the next figure, remember that
the large values correspond to the lighter color of the left side image.
(fig. 6.1-20)
v=maxplaney(w)
This function creates a binary image (selection) where one pixel is valid per XOZ plane, which
corresponds to the largest value of the function in this plane. In the next figure, remember that
the large values correspond to the lighter color of the left side image.
(fig. 6.1-21)
v=maxplanez(w)
This function creates a binary image (selection) where one pixel is valid per XOY plane, which
corresponds to the largest value of the function in this plane. In the next figure, remember that
the large values correspond to the lighter color of the left side image.
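The maxplane selections can be illustrated with a short NumPy sketch for the X case (one valid pixel per YOZ plane). The array layout `w[ix, iy, iz]` and the rule that ties are resolved by the first occurrence are assumptions; the Y and Z variants follow by moving the corresponding axis to the front.

```python
import numpy as np

def maxplanex(w):
    """Binary selection flagging, for each ix, the largest value of the YOZ plane."""
    w = np.asarray(w, dtype=float)
    nx = w.shape[0]
    selection = np.zeros(w.shape, dtype=bool)
    flat_index = w.reshape(nx, -1).argmax(axis=1)      # position inside each plane
    iy, iz = np.unravel_index(flat_index, w.shape[1:])
    selection[np.arange(nx), iy, iz] = True
    return selection

block = np.arange(24, dtype=float).reshape(2, 3, 4)
selected = maxplanex(block)
```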
6.2 Filters
A last type of transformation corresponds to the filters. They can be used on regularly spaced data
of any dimension: they can therefore be applied on grids or on points sampled along a line (in this
case, we consider that the samples are regularly spaced, without paying attention to their actual coordinates). The three elementary filters provided in Isatis are described hereafter.
\[ Z_i \rightarrow Z_i + \lambda \left( \frac{Z_{i-1} + Z_{i+1}}{2} - Z_i \right) \]
(eq. 6.2-1)
The parameter \lambda is automatically set to 0.5. Nevertheless, we can ask to perform a two-pass procedure where a second pass is performed with \lambda = -0.5.
The next picture shows an illustration of the low-pass filtering procedure performed on a 2D grid of
100 X 100 nodes. The top figure presents a simulated variable (spherical variogram with a range of
10) after thresholding (positive values in white and negative values in black). The bottom figure
shows the variable after 50 iterations of the two_passes procedure.
(fig. 6.2-1)
Note - The filter is applied in a given direction at a node only if its neighborhood is complete;
otherwise the initial value is left unchanged. Therefore all the nodes can be treated and the output
grid is complete.
value over its neighborhood (whose extension radius "n" is given by the user) according to the formula:
\[ Z_i \rightarrow \mathrm{Median}\left( Z_{i-n}, \ldots, Z_i, \ldots, Z_{i+n} \right) \]
(eq. 6.2-2)
Note - The final size of the neighborhood is equal to 2n+1 nodes. The neighborhood is truncated
when it intersects the edge so that all the nodes can be treated and the output grid is complete.
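A minimal sketch of this median filter along a single line of regularly spaced values is given below. The function name is hypothetical, and the behaviour for truncated windows with an even number of values (the mean of the two central values, as returned by `statistics.median`) is an assumption.

```python
from statistics import median

def median_filter_1d(z, n, iterations=1):
    """Replace each value by the median of its 2n+1 neighborhood.

    The neighborhood is truncated at both ends of the line, so every node
    receives a value.
    """
    values = list(z)
    for _ in range(iterations):
        values = [median(values[max(0, i - n): i + n + 1])
                  for i in range(len(values))]
    return values

spike_removed = median_filter_1d([1, 1, 9, 1, 1], n=1)
step_kept = median_filter_1d([0, 0, 5, 5, 5], n=1)
```

The isolated spike is removed while the step edge is preserved, which is the usual motivation for preferring a median to an average.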
The next picture shows an illustration of the median filtering procedure performed on a 2D grid of
100 X 100 nodes. The top figure presents a simulated variable (spherical variogram with a range of
10) after thresholding (positive values in white and negative values in black). The bottom figure
shows the variable after 5 iterations of the median filter where n=2.
(fig. 6.2-2)
An additional feature consists in constraining the first two transformations using an auxiliary cutoff
variable: the deformation for each pixel (distance between the initial value and the modified value)
may not exceed an amplitude given by the cutoff variable. Moreover, when the cutoff variable lies
below a given threshold, no deformation is performed. This allows the user to filter the result of a
kriging estimation performed on a grid with respect to the estimation standard deviation.
\[ Z_i \rightarrow \frac{Z_{i+1} - Z_i}{\Delta} \]
where \Delta denotes the grid mesh in the direction of calculation.
(eq. 6.2-3)
Note - This procedure induces an edge effect as the last value along each 1D row (in the direction
of calculation) is not processed and is left undefined in the output grid. The procedure can be
iterated to provide higher order increments.
The next picture shows an illustration of the incrementing procedure performed on a 2D grid of 100
X 100 nodes. The top figure presents a simulated variable (spherical variogram with a range of 10)
after thresholding (positive values in white and negative values in black). The bottom left figure
represents the gradient (iterated once) along the X direction whereas the bottom right figure is the
gradient along the Y direction.
(fig. 6.2-3)
7. Linear Estimation
This page constitutes an add-on to the User's Guide for Interpolate / Estimation / (Co-)Kriging
(unless specified).
This technical reference presents the outline of the main kriging applications. In fact, by the generic
term "kriging", we designate all the procedures based on the Minimum Variance Unbiased Linear
Estimator, for one or several variables. The following cases are presented:
ordinary kriging,
simple kriging,
drift estimation,
factorial kriging,
block kriging,
polygon kriging,
gradient estimation,
lognormal kriging,
cokriging,
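We designate by Z_\alpha the data values and by Z^* the kriging estimator, a linear combination of the data with the weights \lambda^\alpha:
\[ Z^* = \sum_\alpha \lambda^\alpha Z_\alpha \]
(eq. 7.1-1)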
For a better legibility, we will omit the summation symbol when possible using the Einstein notation. We consider the estimation error, i.e. the difference between the estimation and the true value
Z* - Z0.
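We impose the estimator at the target (denoted "0") to be:
unbiased:
\[ E[Z^* - Z_0] = E[\lambda^\alpha Z_\alpha - Z_0] = 0 \]
(eq. 7.1-2)
optimal:
\[ \mathrm{Var}[Z^* - Z_0] \ \text{minimum} \]
(eq. 7.1-3)
With a constant unknown mean,
\[ E[Z] = m \]
(eq. 7.1-4)
the unbiasedness condition becomes
\[ E[Z^* - Z_0] = m \left( \sum_\alpha \lambda^\alpha - 1 \right) = 0 \]
(eq. 7.1-5)
which requires
\[ \sum_\alpha \lambda^\alpha = 1 \]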
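The optimality condition is written:
\[ \sigma^2 = \mathrm{Var}[Z^* - Z_0] = \lambda^\alpha \lambda^\beta C_{\alpha\beta} - 2 \lambda^\alpha C_{\alpha 0} + C_{00} \ \text{minimum} \]
(eq. 7.1-6)
Introducing the Lagrange parameter \mu, we minimize
\[ \phi = \lambda^\alpha \lambda^\beta C_{\alpha\beta} - 2 \lambda^\alpha C_{\alpha 0} + C_{00} + 2 \mu \left( \sum_\alpha \lambda^\alpha - 1 \right) \]
against the unknowns \lambda^\alpha and \mu.
(eq. 7.1-7)
\[ \frac{\partial \phi}{\partial \lambda^\alpha} = 0 \Rightarrow \lambda^\beta C_{\alpha\beta} + \mu = C_{\alpha 0}, \qquad \frac{\partial \phi}{\partial \mu} = 0 \Rightarrow \sum_\alpha \lambda^\alpha = 1 \]
(eq. 7.1-8)
The kriging system and the kriging variance are then:
\[ \begin{cases} \lambda^\beta C_{\alpha\beta} + \mu = C_{\alpha 0} \\ \sum_\alpha \lambda^\alpha = 1 \end{cases} \qquad \sigma^2 = C_{00} - \lambda^\alpha C_{\alpha 0} - \mu \]
(eq. 7.1-9)
In matrix notation:
\[ \begin{pmatrix} C_{\alpha\beta} & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} \lambda^\beta \\ \mu \end{pmatrix} = \begin{pmatrix} C_{\alpha 0} \\ 1 \end{pmatrix} \quad \text{and} \quad \sigma^2 = C_{00} - \begin{pmatrix} \lambda^\alpha \\ \mu \end{pmatrix}^T \begin{pmatrix} C_{\alpha 0} \\ 1 \end{pmatrix} \]
(eq. 7.1-10)
With \gamma(h) = C(0) - C(h), we can then rewrite the kriging system:
\[ \begin{cases} -\lambda^\beta \gamma_{\alpha\beta} + \mu = -\gamma_{\alpha 0} \\ \sum_\alpha \lambda^\alpha = 1 \end{cases} \qquad \sigma^2 = -\gamma_{00} + \lambda^\alpha \gamma_{\alpha 0} - \mu \]
(eq. 7.1-12)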
In the intrinsic case, there are two ways of expressing kriging equations: either in covariance terms
or in variogram terms. In view of the numerical solution of these equations, the formulation in
covariance terms should be preferred because it endows the kriging matrix with the virtue of positive definiteness and makes the practical inversion easier.
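The ordinary kriging system in covariance terms can be assembled and solved directly with NumPy. The sketch below is an illustration under stated assumptions, not the Isatis solver: the data are located on a 1D line, the exponential covariance `exp_cov` and all function names are hypothetical, and the system is solved with `numpy.linalg.solve` for a single target.

```python
import numpy as np

def exp_cov(h, sill=1.0, scale=10.0):
    """Hypothetical exponential covariance model C(h)."""
    return sill * np.exp(-np.abs(h) / scale)

def ordinary_kriging(x, z, x0, cov=exp_cov):
    """Solve [[C, 1], [1, 0]] [lambda, mu] = [C0, 1] for one target x0."""
    x = np.asarray(x, dtype=float)
    z = np.asarray(z, dtype=float)
    n = len(x)
    A = np.zeros((n + 1, n + 1))
    A[:n, :n] = cov(x[:, None] - x[None, :])   # C_alpha_beta
    A[:n, n] = 1.0                             # unbiasedness column
    A[n, :n] = 1.0                             # unbiasedness row
    b = np.append(cov(x - x0), 1.0)            # C_alpha_0 and the constraint
    solution = np.linalg.solve(A, b)
    weights, mu = solution[:n], solution[n]
    estimate = weights @ z
    variance = cov(0.0) - weights @ b[:n] - mu  # C_00 - lambda C_alpha_0 - mu
    return estimate, variance, weights

x_data = [0.0, 10.0, 25.0]
z_data = [1.0, 3.0, 2.0]
est_mid, var_mid, w_mid = ordinary_kriging(x_data, z_data, 5.0)
est_at_datum, var_at_datum, _ = ordinary_kriging(x_data, z_data, 10.0)
```

At a data location the estimate honors the datum and the variance vanishes, while the weights always sum to one.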
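\[ \lambda^\beta C_{\alpha\beta} = C_{\alpha 0}, \qquad \sigma^2 = C_{00} - \lambda^\alpha C_{\alpha 0} \]
(eq. 7.2-1)
In matrix notation:
\[ \left[ C_{\alpha\beta} \right] \left[ \lambda^\beta \right] = \left[ C_{\alpha 0} \right] \quad \text{and} \quad \sigma^2 = C_{00} - \left[ \lambda^\alpha \right]^T \left[ C_{\alpha 0} \right] \]
(eq. 7.2-2)
\[ Z_0^* = \lambda^\alpha Z_\alpha + \left( 1 - \sum_\alpha \lambda^\alpha \right) m \]
(eq. 7.2-3)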
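\[ E[Z(x)] = a_l f^l(x) \]
(eq. 7.3-1)
where f^l(x) are the basic monomials and a_l are the unknown coefficients.
Before applying the kriging conditions, we must make sure that the mean and the variance of the
kriging error exist. We need this error to be an authorized linear combination (ALC) for the degree
k of the polynomial to be filtered:
\[ \lambda^\alpha Z_\alpha \ \text{is an ALC-}k \iff \lambda^\alpha f^l_\alpha = 0 \ \text{for} \ l \le k \]
(eq. 7.3-2)
If we now consider the kriging error, the combination consists of the neighboring points
Z_\alpha with the weights \lambda^\alpha and of the target point with the weight -1, so that:
\[ \lambda^\alpha f^l_\alpha - f^l_0 = 0, \qquad l \le k \]
(eq. 7.3-3)
\[ E[Z^* - Z_0] = E[\lambda^\alpha Z_\alpha - Z_0] = \lambda^\alpha a_l f^l_\alpha - a_l f^l_0 = a_l \left( \lambda^\alpha f^l_\alpha - f^l_0 \right) \]
(eq. 7.3-4)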
Due to (eq. 7.3-3), the expectation of the estimation error is always zero.
The optimality condition (eq. 7.1-3) leads to:
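\[ \sigma^2 = \mathrm{Var}[Z^* - Z_0] = \lambda^\alpha \lambda^\beta K_{\alpha\beta} - 2 \lambda^\alpha K_{\alpha 0} + K_{00} \ \text{minimum} \]
(eq. 7.3-5)
where K(h) is the new structural tool called the "generalized covariance". This variance must be
minimized under the existence equations. Introducing as many Lagrange parameters
\mu_l as there are existence equations, we minimize
\[ \phi = \lambda^\alpha \lambda^\beta K_{\alpha\beta} - 2 \lambda^\alpha K_{\alpha 0} + K_{00} + 2 \mu_l \left( \lambda^\alpha f^l_\alpha - f^l_0 \right) \]
against the unknowns \lambda^\alpha and \mu_l.
(eq. 7.3-6)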
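\[ \frac{\partial \phi}{\partial \lambda^\alpha} = 0 \Rightarrow \lambda^\beta K_{\alpha\beta} + \mu_l f^l_\alpha = K_{\alpha 0}, \qquad \frac{\partial \phi}{\partial \mu_l} = 0 \Rightarrow \lambda^\alpha f^l_\alpha = f^l_0 \]
(eq. 7.3-7)
\[ \begin{cases} \lambda^\beta K_{\alpha\beta} + \mu_l f^l_\alpha = K_{\alpha 0} \\ \lambda^\alpha f^l_\alpha = f^l_0 \end{cases} \qquad \sigma^2 = K_{00} - \lambda^\alpha K_{\alpha 0} - \mu_l f^l_0, \qquad l \le k \]
(eq. 7.3-8)
In matrix notation:
\[ \begin{pmatrix} K_{\alpha\beta} & f^l_\alpha \\ f^l_\beta & 0 \end{pmatrix} \begin{pmatrix} \lambda^\beta \\ \mu_l \end{pmatrix} = \begin{pmatrix} K_{\alpha 0} \\ f^l_0 \end{pmatrix} \quad \text{and} \quad \sigma^2 = K_{00} - \begin{pmatrix} \lambda^\alpha \\ \mu_l \end{pmatrix}^T \begin{pmatrix} K_{\alpha 0} \\ f^l_0 \end{pmatrix} \]
(eq. 7.3-9)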
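\[ Z(x) = Y(x) + m(x) \]
(eq. 7.4-1)
where m(x) = a_l f^l(x) is the drift.
We wish to estimate the value of the drift at the target point by kriging:
\[ m^*(x_0) = \lambda^\alpha Z_\alpha \]
(eq. 7.4-2)
The unbiasedness condition must hold whatever the coefficients a_l:
\[ E[m^* - m_0] = \lambda^\alpha a_l f^l_\alpha - a_l f^l_0 = 0 \quad \forall a_l, \quad \text{therefore} \quad \lambda^\alpha f^l_\alpha = f^l_0 \]
(eq. 7.4-3)
\[ \mathrm{Var}[m^* - m_0] = \lambda^\alpha \lambda^\beta C_{\alpha\beta} \ \text{minimum} \]
(eq. 7.4-4)
Finally the kriging system is derived:
\[ \begin{cases} \lambda^\beta C_{\alpha\beta} + \mu_l f^l_\alpha = 0 \\ \lambda^\alpha f^l_\alpha = f^l_0 \end{cases} \qquad \sigma^2 = -\mu_l f^l_0 \]
(eq. 7.4-5)
In matrix notation:
\[ \begin{pmatrix} C_{\alpha\beta} & f^l_\alpha \\ f^l_\beta & 0 \end{pmatrix} \begin{pmatrix} \lambda^\beta \\ \mu_l \end{pmatrix} = \begin{pmatrix} 0 \\ f^l_0 \end{pmatrix} \]
(eq. 7.4-6)
and
\[ \sigma^2 = - \begin{pmatrix} \lambda^\alpha \\ \mu_l \end{pmatrix}^T \begin{pmatrix} 0 \\ f^l_0 \end{pmatrix} \]
(eq. 7.4-7)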
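\[ Z(x) = Y(x) + m(x) \]
(eq. 7.5-1)
\[ a^*_{l_0} = \lambda^\alpha Z_\alpha \]
(eq. 7.5-2)
\[ E[a^*_{l_0} - a_{l_0}] = \lambda^\alpha a_l f^l_\alpha - a_{l_0} = 0 \]
(eq. 7.5-3)
\[ \lambda^\alpha f^l_\alpha = \begin{cases} 0 & \text{for } l \ne l_0 \\ 1 & \text{for } l = l_0 \end{cases} \]
(eq. 7.5-4)
\[ \mathrm{Var}[a^*_{l_0} - a_{l_0}] = \lambda^\alpha \lambda^\beta C_{\alpha\beta} \ \text{minimum} \]
(eq. 7.5-5)
\[ \begin{cases} \lambda^\beta C_{\alpha\beta} + \mu_l f^l_\alpha = 0 \\ \lambda^\alpha f^l_\alpha = 0 \quad \text{for } l \ne l_0 \\ \lambda^\alpha f^{l_0}_\alpha = 1 \end{cases} \]
(eq. 7.5-6)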
In the previous paragraphs, the expectation of Z(x) is expanded using a basis of polynomials: E[Z(x)] = a_l f^l(x), with unknown coefficients a_l.
Here, the basic hypothesis is that the expectation of the variable can be written:
\[ E[Z(x)] = a_0 + a_1 S(x) \]
(eq. 7.6-1)
where S(x) is a known variable (background) and where a0 and a1 are unknown.
Once again, before applying the kriging conditions, we must make sure that the mean and the variance of the kriging error exist. We need this error to be a linear combination authorized for the drift
to be filtered. This leads to the equations:
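\[ \sum_\alpha \lambda^\alpha = 1, \qquad \lambda^\alpha S_\alpha = S_0 \]
(eq. 7.6-2)
The optimality condition is written:
\[ \mathrm{Var}[Z^* - Z_0] = \lambda^\alpha \lambda^\beta K_{\alpha\beta} - 2 \lambda^\alpha K_{\alpha 0} + K_{00} \ \text{minimum} \]
(eq. 7.6-3)
Introducing the Lagrange parameters \mu_0 and \mu_1, we minimize
\[ \phi = \lambda^\alpha \lambda^\beta K_{\alpha\beta} - 2 \lambda^\alpha K_{\alpha 0} + K_{00} + 2 \mu_0 \left( \sum_\alpha \lambda^\alpha - 1 \right) + 2 \mu_1 \left( \lambda^\alpha S_\alpha - S_0 \right) \]
against the unknowns \lambda^\alpha, \mu_0 and \mu_1:
(eq. 7.6-4)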
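\[ \frac{\partial \phi}{\partial \lambda^\alpha} = 0 \Rightarrow \lambda^\beta K_{\alpha\beta} + \mu_0 + \mu_1 S_\alpha = K_{\alpha 0}, \qquad \frac{\partial \phi}{\partial \mu_0} = 0 \Rightarrow \sum_\alpha \lambda^\alpha = 1, \qquad \frac{\partial \phi}{\partial \mu_1} = 0 \Rightarrow \lambda^\alpha S_\alpha = S_0 \]
(eq. 7.6-5)
\[ \begin{cases} \lambda^\beta K_{\alpha\beta} + \mu_0 + \mu_1 S_\alpha = K_{\alpha 0} \\ \sum_\alpha \lambda^\alpha = 1 \\ \lambda^\alpha S_\alpha = S_0 \end{cases} \]
(eq. 7.6-6)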
In matrix notation:
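\[ \begin{pmatrix} K_{\alpha\beta} & 1 & S_\alpha \\ 1 & 0 & 0 \\ S_\beta & 0 & 0 \end{pmatrix} \begin{pmatrix} \lambda^\beta \\ \mu_0 \\ \mu_1 \end{pmatrix} = \begin{pmatrix} K_{\alpha 0} \\ 1 \\ S_0 \end{pmatrix} \]
(eq. 7.6-7)
and
\[ \sigma^2 = K_{00} - \begin{pmatrix} \lambda^\alpha \\ \mu_0 \\ \mu_1 \end{pmatrix}^T \begin{pmatrix} K_{\alpha 0} \\ 1 \\ S_0 \end{pmatrix} \]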
(eq. 7.6-8)
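The external drift system can be checked numerically. The sketch below is a hedged illustration in 1D: the function name, the exponential covariance `cov` and the data are assumptions. It relies on one property that follows from the two constraints: if the data are exactly a_0 + a_1 S, the estimate at the target is a_0 + a_1 S_0.

```python
import numpy as np

def cov(h):
    """Hypothetical exponential covariance used for the illustration."""
    return np.exp(-np.abs(h) / 5.0)

def kriging_external_drift(x, z, s, x0, s0):
    """Solve the (n + 2) x (n + 2) system with constraints on 1 and S."""
    x, z, s = (np.asarray(a, dtype=float) for a in (x, z, s))
    n = len(x)
    A = np.zeros((n + 2, n + 2))
    A[:n, :n] = cov(x[:, None] - x[None, :])
    A[:n, n] = A[n, :n] = 1.0          # sum of weights equals 1
    A[:n, n + 1] = A[n + 1, :n] = s    # weighted background equals S_0
    b = np.concatenate([cov(x - x0), [1.0, s0]])
    solution = np.linalg.solve(A, b)
    weights = solution[:n]
    variance = cov(0.0) - solution @ b
    return weights @ z, variance, weights

x_data = np.array([0.0, 4.0, 9.0, 15.0])
s_data = np.array([1.0, 2.0, 0.5, 3.0])
z_data = 2.0 + 3.0 * s_data            # data that follow the drift exactly
estimate, variance, weights = kriging_external_drift(
    x_data, z_data, s_data, x0=6.0, s0=1.7)
```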
(eq. 7.7-1)
The Kriging conditions of unbiasedness and optimality lead to the following linear Kriging System:
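\[ \begin{pmatrix} C_{\alpha\beta} & f^l_\alpha \\ f^l_\beta & 0 \end{pmatrix} \begin{pmatrix} \lambda^\beta \\ \mu_l \end{pmatrix} = \begin{pmatrix} \bar{C}_{\alpha 0} \\ f^l_0 \end{pmatrix} \]
(eq. 7.7-2)
\[ \sigma^2 = \bar{\bar{C}}_{00} - \begin{pmatrix} \lambda^\alpha \\ \mu_l \end{pmatrix}^T \begin{pmatrix} \bar{C}_{\alpha 0} \\ f^l_0 \end{pmatrix} \]
(eq. 7.7-3)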
C_{\alpha\beta}: The value of the covariance part of the structural model expressed for the distance between the data points \alpha and \beta.
f^l_\alpha: The value of the drift function ranked "l" applied to the data point \alpha.
\bar{C}_{\alpha 0}: The value of the modified covariance part of the structural model expressed for the distance between the point \alpha and the target point.
f^l_0: The value of the drift function ranked "l" applied to the target point.
\bar{\bar{C}}_{00}: The value of the modified covariance part of the structural model (iterated twice) expressed between the target point and itself.
The terms \bar{C}_{\alpha 0} and \bar{\bar{C}}_{00} depend on the type of quantity to be estimated:
Quantity | \bar{C}_{\alpha 0} | \bar{\bar{C}}_{00}
punctual | C_{\alpha 0} | C_{00}
drift | 0 | 0
block average | \frac{1}{v} \int_v C_{\alpha x} \, dx | \frac{1}{v^2} \int_v \int_v C_{xy} \, dx \, dy
gradient (along x) | \frac{\partial C_{\alpha 0}}{\partial x} | \frac{\partial^2 C_{00}}{\partial x^2}
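\[ A X = B \]
(eq. 7.7-4)
where:
A is the left-hand side matrix of the kriging system,
X is the vector of kriging weights (including the Lagrange multipliers),
B is the right-hand side vector.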
The left-hand side matrix depends on the mutual location of the data points present in the neighborhood of the target point.
The right-hand side depends on the location of the data points of the neighborhood with regard
to the location of the target point.
The choice of the calculation option only influences the right-hand side and leaves the left-hand
side matrix unchanged.
In the Moving Neighborhood case, the data points belonging to the neighborhood vary with the
location of the target point. Then the left-hand matrix A, as well as the right-hand side vector B
must be established each time and the vector of kriging weights X is obtained by solving the linear
kriging system. The estimation is derived by calculating the product of the first part of the vector
X (excluding the Lagrange multipliers) by the vector of the variable value measured at the neighboring data samples Z, that we can write in matrix notation as:
\[ Z^* = X^T \cdot \tilde{Z} \]
(eq. 7.7-5)
where \tilde{Z} is the vector of the variable value complemented by as many zero values as there are drift
equations (and therefore Lagrange multipliers) and \cdot designates the scalar product.
Finally the variance of the estimation error is derived by calculating another scalar product:
\[ \sigma^2 = \bar{\bar{C}}_{00} - X^T \cdot B \]
(eq. 7.7-6)
In the Unique Neighborhood case, the neighboring data points remain the same whatever the target
point. Therefore the left-hand side matrix A is unchanged and it seems reasonable to invert it once
and for all, which gives A^{-1}. For each target point, the right-hand side vector must be established, but this time the
vector of kriging weights X is obtained by a simple matrix product:
\[ X = A^{-1} B \]
(eq. 7.7-7)
Then, the rest of the procedure is similar to the Moving Neighborhood case:
\[ Z^* = X^T \cdot \tilde{Z} \]
(eq. 7.7-8)
\[ \sigma^2 = \bar{\bar{C}}_{00} - X^T \cdot B \]
(eq. 7.7-9)
If the variance of the estimation error is not required, the vector of kriging weights does not even
have to be established. As a matter of fact, we can solve the following system:
\[ A C = \tilde{Z} \]
(eq. 7.7-10)
The estimation is immediately obtained by calculating the scalar product (this is usually referred to as the
dual kriging system):
\[ Z^* = C^T \cdot B \]
(eq. 7.7-11)
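The equivalence between the weight-based estimate and the dual form can be verified on a small ordinary kriging example in unique neighborhood. This is a sketch under assumptions: 1D data, a hypothetical exponential covariance `cov`, and an explicit inverse kept only to mirror the text (a factorization would normally be preferred).

```python
import numpy as np

def cov(h):
    """Hypothetical exponential covariance used for the illustration."""
    return np.exp(-np.abs(h) / 8.0)

x = np.array([0.0, 3.0, 7.0, 12.0])
z = np.array([1.2, 0.7, 1.9, 1.4])
n = len(x)

# Left-hand side A, built and inverted once for all targets.
A = np.ones((n + 1, n + 1))
A[:n, :n] = cov(x[:, None] - x[None, :])
A[n, n] = 0.0
A_inv = np.linalg.inv(A)

z_ext = np.append(z, 0.0)   # data complemented by one zero per drift equation
dual = A_inv @ z_ext        # vector C solving A C = Z~

def rhs(x0):
    """Right-hand side B for the target x0."""
    return np.append(cov(x - x0), 1.0)

def primal(x0):
    """Estimate and variance through the kriging weights X = A^-1 B."""
    B = rhs(x0)
    X = A_inv @ B
    return X @ z_ext, cov(0.0) - X @ B

def dual_estimate(x0):
    """Estimate through the dual coefficients, no weights needed."""
    return dual @ rhs(x0)

targets = [1.0, 5.0, 10.0]
differences = [abs(primal(t)[0] - dual_estimate(t)) for t in targets]
```

Both routes give the same estimate because A is symmetric; only the primal route provides the variance.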
\[ Z = m + Y_1 + Y_2 \]
(eq. 7.8-1)
where Y_1 is centered (its mean is zero) and characterized by the variogram \gamma_1, and Y_2 by \gamma_2. If the two
variables are independent, it is easy to see that the variogram of the variable Z is given by:
\[ \gamma = \gamma_1 + \gamma_2 \]
(eq. 7.8-2)
Instead of estimating Z, we may be interested in estimating one of the two components; the estimation of the mean has been covered in the previous paragraph. We are going to describe the estimation of one scale component (say the first one):
\[ Y_1^* = \lambda^\alpha Z_\alpha \]
(eq. 7.8-3)
Here again, we will have to distinguish whether the mean is a known quantity or not. If the mean is
a known constant, then it is obvious to see that the unbiasedness of the estimator is fulfilled automatically without implying additional constraints on the kriging weights. If the mean is constant but
unknown, the unbiasedness condition leads to the equation:
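\[ \sum_\alpha \lambda^\alpha = 0 \]
(eq. 7.8-4)
Note that the formalism can be extended to the scope of IRF-k (i.e. defining the set of monomials
f^l(x) which compose the drift) by imposing that:
\[ \lambda^\alpha f^l_\alpha = 0 \]
(eq. 7.8-5)
Nevertheless the rest of this paragraph will be developed in the intrinsic case of order 0 and we can
establish the optimality condition:
\[ \mathrm{Var}[Y_1^* - Y_1(x_0)] = -\lambda^\alpha \lambda^\beta \gamma_{\alpha\beta} + 2 \lambda^\alpha \gamma^1_{\alpha 0} - \gamma^1_{00} \ \text{minimum} \]
(eq. 7.8-6)
This leads to the kriging system:
\[ \begin{cases} -\lambda^\beta \gamma_{\alpha\beta} + \mu = -\gamma^1_{\alpha 0} \\ \sum_\alpha \lambda^\alpha = 0 \end{cases} \]
(eq. 7.8-7)
The estimation of the second scale component Y_2^* will be obtained by simply changing \gamma^1_{\alpha 0} into \gamma^2_{\alpha 0} in the right-hand side of the kriging system, keeping the left-hand side unchanged.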
Similarly, rather than extracting a scale component, we can also be interested in filtering a scale
component. Usually this happens when the available data measure the variable together with an
acquisition noise. This noise is considered as independent from the variable and characterized by its
own scale component, the nugget effect. The technique is applied to produce an estimate of the
variable, filtering out the effect of this noise, hence the name. In Isatis instead of selecting one scale
component to be estimated, the user has to filter out components.
Because of the linearity of the kriging system, we can easily check that:
\[ Z^* = m^* + Y_1^* + Y_2^* \]
(eq. 7.8-8)
This technique is obviously not limited to two components per variable, nor to one single variable.
We can even perform components filtering using the cokriging technique.
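The linearity argument can be checked numerically. The sketch below works in covariance terms (an assumption: a stationary model is used so that covariances exist) with a hypothetical exponential structure for Y_1 and a nugget effect for Y_2. Each component is obtained by changing only the right-hand side and the value of the constraint (1 for Z and for the mean, 0 for a scale component).

```python
import numpy as np

def c1(h):
    """Hypothetical structured component (exponential covariance)."""
    return np.exp(-np.abs(h) / 10.0)

def c2(h):
    """Hypothetical noise component: nugget effect with sill 0.3."""
    return 0.3 * (np.abs(h) < 1e-12)

x = np.array([0.0, 2.0, 5.0, 9.0])
z = np.array([1.0, 1.6, 0.8, 1.3])
n = len(x)
H = x[:, None] - x[None, :]

# The left-hand side uses the full model and is shared by all estimates.
A = np.ones((n + 1, n + 1))
A[:n, :n] = c1(H) + c2(H)
A[n, n] = 0.0

def solve_for(rhs_cov, constraint):
    """Estimate obtained for a given right-hand side and weight-sum constraint."""
    weights = np.linalg.solve(A, np.append(rhs_cov, constraint))[:n]
    return weights @ z

def decompose(x0):
    h0 = x - x0
    z_star = solve_for(c1(h0) + c2(h0), 1.0)   # kriging of Z
    m_star = solve_for(np.zeros(n), 1.0)       # estimation of the mean
    y1_star = solve_for(c1(h0), 0.0)           # first scale component
    y2_star = solve_for(c2(h0), 0.0)           # second scale component (noise)
    return z_star, m_star, y1_star, y2_star

between_data = decompose(4.0)
at_datum = decompose(2.0)
```

Filtering the noise amounts to returning m_star + y1_star. Away from the data this equals the kriging of Z, whereas at a data location the two differ by the estimated noise.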
P .
For each structure p, we introduce a set of orthogonal variables Y^p_1, ..., Y^p_N (means 0 and variances
1), mutually independent and characterized by the same variogram \gamma_p, and write:
\[ Z_i = m_i + \sum_{p=1}^{P} \sum_{k=1}^{N} a^p_{ik} Y^p_k \]
(eq. 7.9-1)
Because of the mutual independence, we can easily derive the simple and cross-variograms of the
different variables:
\[ \gamma_{Z_i Z_j} = \sum_{p=1}^{P} \sum_{k=1}^{N} a^p_{ik} a^p_{jk} \gamma_p \]
(eq. 7.9-2)
Setting b^p_{ij} = \sum_{k=1}^{N} a^p_{ik} a^p_{jk}, this becomes:
\[ \gamma_{Z_i Z_j} = \sum_{p=1}^{P} b^p_{ij} \gamma_p \]
(eq. 7.9-3)
These coefficients correspond to the sill matrices (B_p), which are symmetric and positive definite.
Note that the decomposition of the Z_i is not unique and thus the Y^p_i have no physical meaning.
For a given scale component "p", we usually derive the Y^p_i from the decomposition of the (B_p)
matrix into a basis of orthogonal eigenvectors. Each Y^p_i then corresponds to an eigen factor. The Y^p_i
are finally sorted by decreasing eigenvalue (percentage of variance of the scale component).
The principal task of the Factorial Analysis is to estimate, through the traditional cokriging, a given
factor for a given scale component. Two remarks should be made:
As the factors are mutually independent, we can recover the kriging estimates of the variables
by applying the linear decomposition on the estimated factors:
\[ Z_i^* = m_i^* + \sum_{p=1}^{P} \sum_{k=1}^{N} a^p_{ik} Y^{p*}_k \]
(eq. 7.9-4)
The estimation of the mean is the multivariate extension of the drift estimation (previous paragraphs).
For a given scale component, some eigenvalues may happen to be equal to 0 or almost null.
This means that the contribution of these factors to the estimator (or to the simulated value in
the simulation process) is null.
K_{\alpha 0} by K_{\alpha v}, which corresponds to the integral of the covariance function between the data
point and a point which describes the volume v:
\[ K_{\alpha v} = \frac{1}{v} \int_v K_{\alpha x} \, dx \]
(eq. 7.10-1)
The integral must be expanded over the number of dimensions of the space in which v is defined.
f^l_0 by f^l_v, which correspond to the mean values of the drift functions over the volume:
\[ f^l_v = \frac{1}{v} \int_v f^l_x \, dx \]
(eq. 7.10-2)
The kriging system becomes:
\[ \begin{cases} \lambda^\beta K_{\alpha\beta} + \mu_l f^l_\alpha = K_{\alpha v} \\ \lambda^\alpha f^l_\alpha = f^l_v \end{cases} \]
(eq. 7.10-3)
\[ \sigma^2 = K_{vv} - \lambda^\alpha K_{\alpha v} - \mu_l f^l_v \]
(eq. 7.10-4)
It requires the calculation of the term K_{vv} instead of the term K_{00}:
\[ K_{vv} = \frac{1}{v^2} \int_v \int_v K_{xy} \, dx \, dy \]
(eq. 7.10-5)
For each block v, the K_{vv} integral needs to be calculated once, whereas
K_{\alpha v} needs to be calculated
as many times as there are points in the block neighborhood. Therefore these integral calculations
have to be optimized.
Formal expressions of these integrals exist for a few basic structures. Unfortunately, this is not true
for most of them, and moreover these formal expressions sometimes lead to time-consuming calculations. Furthermore, the same type of numerical integration MUST be used for the K_{vv} and the K_{\alpha v} terms.
In the regular discretization case, the block is partitioned into equal cells and the target is replaced
by the union of the cell centers c_i. This allows the calculation of the K_{\alpha v} terms:
\[ K_{\alpha v} = \frac{1}{N} \sum_{i=1}^{N} K_{\alpha c_i} \]
(eq. 7.10-6)
\[ K_{vv} = \frac{1}{N^2} \sum_{i=1}^{N} \sum_{j=1}^{N} K_{c_i c_j} \]
(eq. 7.10-7)
Applying the regular discretization alone in this case sometimes leads to over-estimating the nugget
effect. A random discretization is therefore substituted, where the first point of the discretization
describes the centers of the previous regular cells whereas the second point is randomly located
within its cell. In this case, there is almost no chance that a point c_i coincides with the point c_j, and
the function K(h) is never called for a zero distance. The nugget effect of the structure therefore
vanishes as soon as the covariance is integrated. This option is recommended as soon as the dimension of the block is much larger than the dimension of the sample, which is usually the case.
Note - The drawback of this method is linked to its random aspect. For each calculation of a K_{vv}
term, the set of points requires a set of random values to be drawn, which will vary from one trial to
another. This is why it is recommended that the user tests this calculation to determine the
optimum as a trade-off between accuracy and stability of the result on the one hand, and
computation time on the other: this possibility is provided in the Neighborhood procedure.
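The effect of the two discretizations on the nugget effect can be illustrated on a 1D block. The covariance model (exponential plus nugget), the function names and the uniform random displacement within each cell are assumptions made for this sketch only.

```python
import numpy as np

def cov(h, nugget):
    """Hypothetical model: exponential structure plus a nugget effect."""
    h = np.abs(h)
    return np.exp(-h / 5.0) + nugget * (h == 0.0)

def kvv_regular(length, n, nugget):
    """K_vv from the centers of n equal cells (zero distances on the diagonal)."""
    centers = (np.arange(n) + 0.5) * length / n
    return cov(centers[:, None] - centers[None, :], nugget).mean()

def kvv_random(length, n, nugget, rng):
    """K_vv where the second point is randomly moved within its cell."""
    centers = (np.arange(n) + 0.5) * length / n
    moved = centers + rng.uniform(-0.5, 0.5, n) * length / n
    return cov(centers[:, None] - moved[None, :], nugget).mean()

regular_with = kvv_regular(10.0, 5, nugget=0.5)
regular_without = kvv_regular(10.0, 5, nugget=0.0)
random_with = kvv_random(10.0, 5, 0.5, np.random.default_rng(0))
random_without = kvv_random(10.0, 5, 0.0, np.random.default_rng(0))
```

With the regular scheme the nugget contributes nugget / N to K_vv (here 0.5 / 5), whereas with the random scheme it does not contribute at all; the random value itself changes with the seed, which is the instability mentioned in the note.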
\[ K_{\alpha v} = \frac{1}{\sum_i w_i} \sum_{i=1}^{N} w_i K_{\alpha c_i} \]
(eq. 7.11-1)
\[ K_{vv} = \frac{1}{\sum_i \sum_j w_i w_j} \sum_{i=1}^{N} \sum_{j=1}^{N} w_i w_j K_{c_i c_j} \]
(eq. 7.11-2)
where each weight w_i corresponds to the surface of the intersection between the cell v_i centered in c_i
and the polygon.
A random discretization is also performed for the computation of the K_{vv} term.
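The objective is to estimate the derivative \partial Z / \partial u in the direction of the unit vector u whose components are u = (\cos\theta, \sin\theta). Letting \theta = 0 will give us \partial Z / \partial x and \theta = \pi / 2 will give us \partial Z / \partial y.
From a mathematical standpoint, it is necessary to clearly define what is meant by \partial Z / \partial u. There are
two concepts involved:
One is the ordinary concept of a two-sided directional derivative of a fixed function z(x),
defined as the limit, if it exists, of:
\[ \frac{z(x + ru) - z(x)}{r} \quad \text{as} \ r \to 0 \]
(eq. 7.12-1)
This derivative \partial z / \partial u is really what we are after. Contrary to our
usual notation, we have used a lower-case letter "z" to emphasize the difference from the random
field Z(x).
The other is the derivative in the mean square sense, defined as the random variable \partial Z / \partial u such that:
\[ \lim_{r \to 0} E\left[ \left( \frac{Z(x + ru) - Z(x)}{r} - \frac{\partial Z}{\partial u} \right)^2 \right] = 0 \]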
(eq. 7.12-2)
It can be shown that if Z(x) has a stationary and isotropic covariance K(h), then Z(x) is differentiable in the mean square sense if and only if K(h) is twice differentiable at h = 0. Then K''(h) exists
for every h and -K''(h) is the covariance of Z'(x).
Unfortunately, common covariance models (like the spherical, the exponential, ...) are not twice differentiable. Strictly speaking, it is then impossible to estimate \partial Z / \partial u, which is
not defined. In practice however, we cannot let this theoretical difficulty rule out the estimation of
derivatives. The derivative is therefore taken as:
\[ \frac{\partial Z}{\partial u} = \lim_{r \to 0^+} \frac{Z(x + ru) - Z(x - ru)}{2r} \]
(eq. 7.12-3)
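The notation r \to 0^+ means that r decreases to zero from above, i.e. takes on positive values only.
This formula may be best justified in terms of one-sided directional derivatives. We now turn to the
kriging system:
\[ \begin{cases} \lambda^\beta K_{\alpha\beta} + \mu_l f^l_\alpha = \dfrac{\partial K_{\alpha 0}}{\partial u} \\ \lambda^\alpha f^l_\alpha = \dfrac{\partial f^l_0}{\partial u} \end{cases} \qquad \sigma^2 = -\lambda^\alpha \frac{\partial K_{\alpha 0}}{\partial u} - \mu_l \frac{\partial f^l_0}{\partial u} + \mathrm{var}\left( \frac{\partial Z}{\partial u} \right) \]
where \mathrm{var}(\partial Z / \partial u) is equal to -K''(0).
(eq. 7.12-4)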
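Cov(Z, Z) = K
Cov(\partial z / \partial x, \partial z / \partial x) = -\partial^2 K / \partial x^2
Cov(\partial z / \partial y, \partial z / \partial y) = -\partial^2 K / \partial y^2
Cov(Z, \partial z / \partial x) = -Cov(\partial z / \partial x, Z) = \partial K / \partial x
Cov(Z, \partial z / \partial y) = -Cov(\partial z / \partial y, Z) = \partial K / \partial y
Cov(\partial z / \partial x, \partial z / \partial y) = Cov(\partial z / \partial y, \partial z / \partial x) = -\partial^2 K / \partial x \partial y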
We immediately see that this requires the mathematical function to be at least twice differentiable, which discards basic structures such as the nugget effect, the spherical variogram, the linear
variogram which are not differentiable at the origin (the function must be extended by symmetry
around the vertical axis for negative distances).
To overcome the problem, we will replace the (punctual) gradient by a dip which represents the
average slope integrated over a small surface (ball) centered on the data point.
\[ G_x = \frac{\partial}{\partial x} \iint_B Z(x - u, y - v) \, p(u, v) \, du \, dv \]
(eq. 7.13-1)
where p(u,v) stands for the convolution weighting function. The integral and the derivation signs
can be inverted and this leads to the interesting differentiability feature as soon as p is differentiable. Therefore, we consider the gaussian weighting function:
\[ p(u, v) = \frac{1}{2 \pi a^2} \exp\left( -\frac{u^2 + v^2}{2 a^2} \right) \]
(eq. 7.13-2)
Where a is the radius of the integration ball B on which the dip measurement is integrated.
The structural analysis should therefore be performed with these constraints in mind. Moreover,
this implies that if the depth is an IRF-2, then its two gradient components are IRF-0 ; if the depth is
an IRF-1, then its two gradient components are stationary. This property reinforces the difficulty of
the inference.
The principle in this software is to perform the structural analysis on the depth variable and to
derive (without any check) the structures of the gradient components.
Note - The multivariate structure does not belong to the class of linear coregionalization models.
There are also some constraints on the drift equations.
In fact if we write the cokriging system of depth and gradient.
Z =
-
Z + ----x- + ---y
(eq. 7.13-3)
E Z Z 0 = 0
al
l
(eq. 7.13-4)
f-l
f-l
f l +
--------
=0
x y
(eq. 7.13-5)
f l
fl
- + ------
a a fal + ----x y
(eq. 7.13-6)
Finally the cokriging system can be expressed as follows, assuming that the depth variable is an
IRF-1.
----x
-----1 x y
y
2
----- -------2- ---------- 0 1 0
x x2 x y
2
2- ------------- ---------y x y y2
1 0
0
x
1
0
y
0
1
----x
0 0 1 = -----y
1
0 0 0 0
x0
0 0 0 x
y0
0 0 0 y
(eq. 7.13-7)
The same type of cokriging system and the same numerical recipe is used when cokriging two variables Z and Y such that:
l
Z = Y
Y
Y
Z = a ------ + -----x
y
Note - The Gibbs Sampler is an iterative algorithm which consists in starting with an authorized
vector of gaussian values consistent with the inequality constraints. Each value is then modified in
turn using a kriging procedure and adding a random value. In Isatis the parameters attached to this
algorithm are fixed; a unique neighborhood is compulsory for the simple kriging step.
The simulations can only be performed in the gaussian space. The user has previously to transform
the hard data into a gaussian variable and keep the anamorphosis attached to this transformation.
The intervals represented by 2 variables have also to be transformed in the gaussian space by the
same anamorphosis function.
After the simulation, the program has just to calculate the average value of the realizations (after
back transformation in the raw space) at each soft data point. These average values are called the
conditional expectation. This conditional expectation is in fact the most probable value of the variable at the soft data locations. The standard deviation of these realizations is also calculated and
stored.
Then, the final step is to krige the target variable using both the hard data and the conditional expectation values.
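following conditions:
\[ E[\varepsilon_\alpha] = 0, \qquad \mathrm{Cov}(\varepsilon_\alpha, \varepsilon_\beta) = 0 \ \text{if} \ \alpha \ne \beta, \qquad \mathrm{Var}(\varepsilon_\alpha) = V_\alpha, \qquad \mathrm{Cov}(\varepsilon_\alpha, Z_\beta) = 0 \]
(eq. 7.15-1)
Then the kriging estimator of Z can be written Z_0^* = \lambda^\alpha (Z_\alpha + \varepsilon_\alpha) and the variance becomes:
\[ \mathrm{Var}[Z_0^* - Z_0] = \lambda^\alpha \lambda^\beta K_{\alpha\beta} + \sum_\alpha (\lambda^\alpha)^2 V_\alpha - 2 \lambda^\alpha K_{\alpha 0} + K_{00} \]
(eq. 7.15-2)
Then the kriging system of Z_0 remains the same except that V_\alpha is now added to the diagonal terms K_{\alpha\alpha}.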
nearest distance between data points. This is the reason why one could conveniently replace this
nugget effect by a transition scheme (say a spherical variogram) with a very short range.
But the "nugget effect" (as used in the modeling phase) can also be due to another factor: the measurement error. In this case, the discontinuity is real and is due to errors of the type \varepsilon_\alpha. This time,
the discontinuity remains whatever the size of the structure investigation. If the same type of measurement error is attributed to all data, the estimate is the same whether:
you do not use any nugget effect in your model and you provide the same V_\alpha for each data, or
you define a nugget effect component in your model whose sill C is precisely equal to V_\alpha.
Unlike the estimate itself, the kriging variance differs depending on which option is chosen. Indeed,
the measurement error V is considered as an artefact and is not a part of the phenomenon of interest. Therefore, a kriging with a variance of measurement error equal for each data and no nugget
effect in the model will lead to smaller kriging variances than the estimation with a nugget component equal to V .
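The equivalence of the two options can be checked numerically. The following is a minimal sketch, not the Isatis implementation: it assumes a one-dimensional data set, a hypothetical spherical covariance with unit sill, and made-up data values. Both options lead to the same data-to-data matrix, so the weights and the estimate coincide; only the variance attributed to the target differs.

```python
import numpy as np

def spherical_cov(h, sill=1.0, a=10.0):
    """Spherical covariance with the given sill and range a (hypothetical model)."""
    r = np.minimum(np.abs(h) / a, 1.0)
    return sill * (1.0 - 1.5 * r + 0.5 * r**3)

def ordinary_kriging(x, z, x0, diag_add, c00):
    """Ordinary kriging at x0 with diag_add added to the data-to-data diagonal.

    c00 is the target variance: C(0), plus the nugget when it belongs to the model.
    """
    n = len(x)
    A = np.zeros((n + 1, n + 1))
    A[:n, :n] = spherical_cov(x[:, None] - x[None, :]) + np.diag(diag_add)
    A[:n, n] = 1.0   # Lagrange column (unbiasedness)
    A[n, :n] = 1.0   # weights sum to one
    k0 = spherical_cov(x - x0)
    sol = np.linalg.solve(A, np.append(k0, 1.0))
    lam, mu = sol[:n], sol[n]
    return lam @ z, c00 - lam @ k0 - mu

x = np.array([0.0, 3.0, 7.0, 12.0])   # data locations (made-up)
z = np.array([1.2, 0.7, 1.9, 1.4])    # data values (made-up)
V = 0.3                                # common error variance
x0 = 5.0                               # target, not a data location

# Option 1: no nugget in the model, V supplied as data error variance.
est_err, var_err = ordinary_kriging(x, z, x0, np.full(4, V), c00=1.0)
# Option 2: nugget of sill V in the model; the data diagonal is identical,
# but the nugget now counts as part of the target variance.
est_nug, var_nug = ordinary_kriging(x, z, x0, np.full(4, V), c00=1.0 + V)
```

In this sketch the two estimates are identical and the two kriging variances differ by exactly V, which illustrates the statement above.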
The use of data error variances V really makes sense when the data is of different qualities. Many
situations may occur. For example, the data may come from several surveys: old ones and new
ones. Or the measurement techniques may be different: depth measured at wells or by seismic,
porosities from cores or from log interpretation, etc ...
In such cases error variances may be computed separately for each sub-population and, if we are
lucky, the better quality data will allow identification of the underlying structure (possibly including a nugget effect component), while the variogram attached to the poorer quality data will show
the same previous structure incremented by a nugget effect corresponding to the specific measurement error variance V .
In other cases, it could be possible to evaluate directly the precision of each measurement and derive $V_\alpha$: if we are told that the absolute error on $Z$ is $\Delta Z$, we may consider that $\Delta Z = 2\sigma$ and take $V = (\Delta Z / 2)^2$.
Another use of this technique is in the post-processing of macro kriging, where we calculate "equivalent samples" with measurement error variances. These variances are in fact calculated from a fitted model depending on the number of initial samples inside pseudo blocks.
$$Z = \exp(Y) - \eta$$
(eq. 7.16-1)
$$Y = \ln(Z + \eta)$$
(eq. 7.16-2)
where $\eta$ is the shift which makes $Z$ a positive variable, and $Y$ is supposed to be normally distributed.
In this paragraph, we refer to the value of the mean of the raw punctual variable (denoted $M_Z$) and the mean and dispersion variance of the log-variable (denoted $m_Y$ and $s_Y^2$), with the theoretical relationship:
$$M_Z = \exp\left(m_Y + \frac{s_Y^2}{2}\right) - \eta$$
(eq. 7.16-3)
Using the variogram of the $Y$ variable (denoted $\gamma_Y$), we can estimate the value of $Y^*$ at any point of the space as a linear combination of the data $Y_\alpha$:
$$Y^* = \sum_\alpha \lambda^\alpha Y_\alpha + \left(1 - \sum_\alpha \lambda^\alpha\right) m_Y$$
(eq. 7.16-4)
either in the strict stationary case (simple kriging) or in the intrinsic case (ordinary kriging), then honoring the condition $\sum_\alpha \lambda^\alpha = 1$.
The derivation of the estimate and the corresponding variance of estimation on the raw scale is less
trivial than simply taking the antilog (exponential) of the log-scale quantities. The following formulae consider the cases of Simple or Ordinary Kriging, for point or block estimations.
For block estimation, values for block v are supposed to be lognormally distributed according to the
formula:
$$Z_v = \exp\left(\frac{a}{v} \int_v Y(x)\,dx + b\right) - \eta$$
(eq. 7.16-5)
where values of coefficients $a$ and $b$ are calculated in order to honor the appropriate mean and variance for $Z_v$, i.e.:
$$E[Z_v] = E[Z], \qquad \mathrm{Var}[Z_v] = \mathrm{Var}[Z] - \bar\gamma_Z(v,v)$$
(eq. 7.16-6)
$$a^2 = \frac{s_Y^2 + \ln\left(\frac{1}{v^2} \int_v \int_v e^{-\gamma_Y(x-y)}\,dx\,dy\right)}{s_Y^2 - \bar\gamma_Y(v,v)}$$
(eq. 7.16-7)
$$b = (1 - a)\, m_Y + \frac{1}{2}\left[(1 - a^2)\, s_Y^2 + a^2\, \bar\gamma_Y(v,v)\right]$$
(eq. 7.16-8)
The same formulae include the punctual case with $a = 1$ and $b = \bar\gamma_Y(v,v) / 2 = 0$.
$\frac{1}{v} \int_v Y(x)\,dx$ with a tractable estimation variance (no change of support model is required). The relative standard deviation of estimation $\sigma^*$, which corresponds to $\sigma_x / (M_Z + \eta)$, is saved in Isatis.
Simple Kriging
$$Z^* = \exp\left(Y^* + \frac{1}{2} \sigma_Y^2\right) - \eta$$
(eq. 7.16-9)
$$\sigma^{*2} = e^{s_Y^2}\left(1 - e^{-\sigma_Y^2}\right)$$
(eq. 7.16-10)
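Ordinary Kriging
$$Z^* = \exp\left(Y^* + \frac{1}{2} \sigma_Y^2 + \mu\right) - \eta$$
(eq. 7.16-11)
$$\sigma^{*2} = e^{s_Y^2}\left[1 + e^{-(\sigma_Y^2 + \mu)}\left(e^{-\mu} - 2\right)\right]$$
(eq. 7.16-12)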
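Simple Kriging
$$Z_v^* = \exp\left(a Y^* + \frac{1}{2} a^2 \sigma_Y^2 + b\right) - \eta$$
(eq. 7.16-13)
$$\sigma^{*2} = e^{a^2 \left(s_Y^2 - \bar\gamma_Y(v,v)\right)}\left[1 - e^{-a^2 \sigma_Y^2}\right]$$
(eq. 7.16-14)
Ordinary Kriging
$$Z_v^* = \exp\left(a Y^* + \frac{1}{2} a^2 \sigma_Y^2 + b + a^2 \mu\right) - \eta$$
(eq. 7.16-15)
$$\sigma^{*2} = e^{a^2 \left(s_Y^2 - \bar\gamma_Y(v,v)\right)}\left[1 + e^{-a^2 (\sigma_Y^2 + \mu)}\left(e^{-a^2 \mu} - 2\right)\right]$$
(eq. 7.16-16)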
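As an illustration of why the back-transformation is more than a plain exponential, here is a small sketch for the punctual case. It assumes the usual lognormal relations: estimate exp(Y* + variance/2 + mu) minus the shift, with the Lagrange parameter mu entering the system with a plus sign. The function names and numerical values are hypothetical.

```python
import math

def lognormal_backtransform(y_est, var_est, mu=0.0, eta=0.0):
    """Raw-scale estimate from a log-scale kriging result.

    y_est: kriged log value, var_est: log-scale kriging variance,
    mu: Lagrange parameter (0 for simple kriging), eta: shift.
    """
    return math.exp(y_est + 0.5 * var_est + mu) - eta

def relative_std(s2_y, var_est, mu=None):
    """Relative standard deviation of estimation, i.e. sigma / (M_Z + eta)."""
    if mu is None:   # simple kriging
        rel_var = math.exp(s2_y) * (1.0 - math.exp(-var_est))
    else:            # ordinary kriging
        rel_var = math.exp(s2_y) * (
            1.0 + math.exp(-(var_est + mu)) * (math.exp(-mu) - 2.0)
        )
    return math.sqrt(rel_var)

naive = math.exp(0.4)                                # plain antilog
corrected = lognormal_backtransform(0.4, 0.2)        # simple kriging
rel_sk = relative_std(0.5, 0.2)                      # simple kriging
rel_ok = relative_std(0.5, 0.2, mu=0.05)             # ordinary kriging
```

With a zero Lagrange parameter the ordinary kriging expression reduces to the simple kriging one, which is a quick consistency check on the two formulas.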
7.17 Cokriging
This time, we consider two random variables $Z_1$ and $Z_2$ characterized by:
$C^{11}$ / $\gamma^{11}$
$C^{12}$ (where $C^{12} = C^{21}$)
Note - It is because the cross covariance is supposed to be symmetrical, which is a particular case, that the cokriging system can be easily translated from covariance to variograms.
We assume that the variables have unknown and unrelated means:
$$E[Z_1] = m_1 \quad \text{and} \quad E[Z_2] = m_2$$
Let us now estimate the first variable at a target point denoted "0", as a linear combination of the neighboring information concerning both variables, using respectively the weights $\lambda_1^\alpha$ and $\lambda_2^\alpha$:
$$Z_1^* = \sum_\alpha \lambda_1^\alpha Z_\alpha^1 + \sum_\alpha \lambda_2^\alpha Z_\alpha^2$$
(eq. 7.17-1)
The first variable is also called the main variable. We still apply the unbiasedness condition (eq. 7.12):
$$E[Z_1^* - Z_0^1] = 0$$
(eq. 7.17-2)
$$\sum_\alpha \lambda_1^\alpha m_1 + \sum_\alpha \lambda_2^\alpha m_2 - m_1 = 0 \qquad \forall\, m_1, m_2$$
(eq. 7.17-3)
$$\sum_\alpha \lambda_1^\alpha = 1, \qquad \sum_\alpha \lambda_2^\alpha = 0$$
(eq. 7.17-4)
Let us consider the optimality condition (eq. 7.1-3) and minimize the variance of the estimation error:
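$$\mathrm{Var}(Z_1^* - Z_0^1) = \sum_\alpha \sum_\beta \lambda_1^\alpha \lambda_1^\beta C^{11}_{\alpha\beta} + 2 \sum_\alpha \sum_\beta \lambda_1^\alpha \lambda_2^\beta C^{12}_{\alpha\beta} + \sum_\alpha \sum_\beta \lambda_2^\alpha \lambda_2^\beta C^{22}_{\alpha\beta} - 2 \sum_\alpha \lambda_1^\alpha C^{11}_{\alpha 0} - 2 \sum_\alpha \lambda_2^\alpha C^{12}_{\alpha 0} + C^{11}_{00}$$
(eq. 7.17-5)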
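$$\begin{cases} \sum_\beta \lambda_1^\beta C^{11}_{\alpha\beta} + \sum_\beta \lambda_2^\beta C^{12}_{\alpha\beta} + \mu_1 = C^{11}_{\alpha 0} & \forall \alpha \\ \sum_\beta \lambda_1^\beta C^{12}_{\alpha\beta} + \sum_\beta \lambda_2^\beta C^{22}_{\alpha\beta} + \mu_2 = C^{12}_{\alpha 0} & \forall \alpha \\ \sum_\alpha \lambda_1^\alpha = 1 \\ \sum_\alpha \lambda_2^\alpha = 0 \end{cases}$$
(eq. 7.17-6)
In matrix notations:
$$\begin{pmatrix} C^{11} & C^{12} & 1 & 0 \\ C^{12} & C^{22} & 0 & 1 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \end{pmatrix} \begin{pmatrix} \lambda_1 \\ \lambda_2 \\ \mu_1 \\ \mu_2 \end{pmatrix} = \begin{pmatrix} C^{11}_0 \\ C^{12}_0 \\ 1 \\ 0 \end{pmatrix}$$
(eq. 7.17-7)
$$\sigma_1^2 = C^{11}_{00} - \sum_\alpha \lambda_1^\alpha C^{11}_{\alpha 0} - \sum_\alpha \lambda_2^\alpha C^{12}_{\alpha 0} - \mu_1$$
(eq. 7.17-8)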
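In the intrinsic case with symmetrical cross-covariances, the cokriging system may be written using variograms:
$$\begin{cases} \sum_\beta \lambda_1^\beta \gamma^{11}_{\alpha\beta} + \sum_\beta \lambda_2^\beta \gamma^{12}_{\alpha\beta} - \mu_1 = \gamma^{11}_{\alpha 0} & \forall \alpha \\ \sum_\beta \lambda_1^\beta \gamma^{12}_{\alpha\beta} + \sum_\beta \lambda_2^\beta \gamma^{22}_{\alpha\beta} - \mu_2 = \gamma^{12}_{\alpha 0} & \forall \alpha \\ \sum_\alpha \lambda_1^\alpha = 1 \\ \sum_\alpha \lambda_2^\alpha = 0 \end{cases}$$
(eq. 7.17-9)
$$\sigma_1^2 = -\gamma^{11}_{00} + \sum_\alpha \lambda_1^\alpha \gamma^{11}_{\alpha 0} + \sum_\alpha \lambda_2^\alpha \gamma^{12}_{\alpha 0} - \mu_1$$
(eq. 7.17-10)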
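Note - If instead of $Z_1^*$, we want to estimate $Z_2^*$, the matrix is unchanged and only the right-hand side is modified:
$$\begin{pmatrix} C^{11}_0 \\ C^{12}_0 \\ 1 \\ 0 \end{pmatrix} \rightarrow \begin{pmatrix} C^{12}_0 \\ C^{22}_0 \\ 0 \\ 1 \end{pmatrix}$$
(eq. 7.17-11)
$$\sigma_2^2 = C^{22}_{00} - \sum_\alpha \lambda_1^\alpha C^{12}_{\alpha 0} - \sum_\alpha \lambda_2^\alpha C^{22}_{\alpha 0} - \mu_2$$
(eq. 7.17-12)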
Note first that the variables $Z_1$ and $Z_2$ do not both have to be defined at all the data points. The only constraint is that, when estimating $Z_1$, the number of data where $Z_2$ is defined is strictly positive.
This system can easily be generalized to more than two variables. The only constraint lies in the "multivariate structure": the system is guaranteed to be regular if this structure comes from a linear coregionalization model.
- not using any Y information: obviously this is of no interest,
- using all the Y information contained within the neighborhood: this may lead to an intractable solution because of too much information,
- the initial solution (as mentioned in Xu, W., Tran, T. T., Srivastava, R. M., and Journel, A. G., 1992, Integrating seismic data in reservoir modeling: The collocated cokriging alternative. SPE paper 24742, 67th Annual Technical Conference and Exhibition, p. 833-842) consists in using the single value located at the target grid node location: hence the term collocated. Its contribution to the kriging estimate relies on the cross-correlation between the two variables at zero distance. But, in the intrinsic case, the weights attached to the secondary variable must add up to zero and therefore, if only one data value is used, its single weight (or influence) will be zero.
- the solution used in Isatis is to use the Y variable at the target location and at all the locations where the Z variable is defined (Multi Collocated Cokriging). This neighborhood search has given the most reliable and stable results so far.
In general, collocated cokriging is less precise than a full cokriging, which makes use of the auxiliary variable at all target points when estimating each of these.
Exceptions are models where the cross variogram (or covariance) between the two variables is proportional to the variogram (or covariance) of the auxiliary variable.
In this case collocated cokriging coincides with full cokriging, and is also strictly equivalent to the simple method consisting in kriging the residual of the linear regression of the target variable on the auxiliary variable.
The user interested in the different approaches to Collocated Cokriging can refer to Rivoirard J., Which Models for Collocated Cokriging?, In Math. Geology, Vol. 33, No 2, 2001, pp. 117-131.
for variable transformation into the gaussian space useful in the simulation processes (normal score transformation),
for histogram modeling and a further use in non linear techniques (D.K., U.C., Global Support Correction, grade-tonnage curves, ...),
For information on the theory of Non Linear Geostatistics see Rivoirard J., Introduction to Disjunctive Kriging and Non-linear Geostatistics (Oxford: Clarendon, 1994, 181p).
$$\Phi(Y) = \sum_{i=0}^{\infty} \Psi_i H_i(Y)$$
(eq. 8.1-1)
where the $H_i(Y)$ are called the Hermite Polynomials. In practice, this polynomial expansion is stopped at a given order. Instead of being strictly increasing, the function $\Phi$ consequently shows maxima and minima outside an interval of interest, that is for very low probability of Y, for instance outside [-2.5, 3.] in (fig. 8.1-1) (horizontal axis for the gaussian variable and vertical axis for the raw variable).
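The truncated expansion can be sketched as follows. This is an illustration only: it assumes the normalized Hermite polynomials with the convention H_1(y) = -y, a hypothetical lognormal anamorphosis exp(0.5 y) chosen because its coefficients are known in closed form, and a plain rectangle rule for the integrals. None of these names come from Isatis.

```python
import math

def hermite(n_max, y):
    """Normalized Hermite polynomials H_0..H_n_max at y (convention H_1(y) = -y)."""
    h = [1.0, -y]
    for n in range(1, n_max):
        # H_{n+1} = -(y H_n + sqrt(n) H_{n-1}) / sqrt(n+1)
        h.append(-(y * h[n] + math.sqrt(n) * h[n - 1]) / math.sqrt(n + 1))
    return h[: n_max + 1]

def gauss_pdf(y):
    return math.exp(-0.5 * y * y) / math.sqrt(2.0 * math.pi)

def hermite_coefficients(phi, n_max, lo=-8.0, hi=8.0, steps=4000):
    """psi_i = integral of phi(y) H_i(y) g(y) dy, by a rectangle rule."""
    dy = (hi - lo) / steps
    psi = [0.0] * (n_max + 1)
    for k in range(steps + 1):
        y = lo + k * dy
        w = gauss_pdf(y) * phi(y) * dy
        for i, h in enumerate(hermite(n_max, y)):
            psi[i] += w * h
    return psi

def anamorphosis(psi, y):
    """Evaluate the truncated expansion sum of psi_i H_i(y)."""
    return sum(p * h for p, h in zip(psi, hermite(len(psi) - 1, y)))

psi = hermite_coefficients(lambda y: math.exp(0.5 * y), n_max=10)
z_at_03 = anamorphosis(psi, 0.3)
point_variance = sum(p * p for p in psi[1:])
```

For this lognormal example the first coefficient is the mean of the raw variable, and the sum of the squared coefficients from order 1 upwards approximates its variance.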
(fig. 8.1-1)
The modeling of the anamorphosis starts with the discrete version of the curve on the true data set
(fig. 8.1-2). The only available parameters are 2 control points (A and B in (fig. 8.1-2)) which possibly allow the user to modify the behaviour of the model (fig. 8.1-2) on the edges. But this opportunity is in practice important only when the number of samples is small. The other parameters
available are the Authorized Interval on the Raw Variable (defined between a minimum value
Zamin and a maximum one Zamax) and the order of the Hermite Polynomial Expansion (number of
polynomials). The default values for the authorized interval are the minimum and the maximum of
the data set. In this configuration, the 2 control points do not modify the experimental anamorphosis
previously calculated.
(fig. 8.1-2)
After the definition of this discretized anamorphosis, the program calculates the $\Psi_i$ coefficients of the expansion in Hermite Polynomials. It draws the curve and calculates the Practical Interval of Definition and the Absolute Interval of Definition:
- the bounds of the Practical Interval of Definition are delimited by the two points [Ypmin, Zpmin] and [Ypmax, Zpmax] (fig. 8.1-3). The two calculated points are the points where the curve crosses the upper and lower authorized limits on raw data (Zamin and Zamax) or the points where the curve is no longer increasing with Y.
- the bounds of the Absolute Interval of Definition are delimited by the two points [Yamin, Zamin] and [Yamax, Zamax] (fig. 8.1-3). These two points are the intersections of the curve with the horizontal lines defined by the Authorized Interval on the Raw variable. The values generated using the anamorphosis function will never be outside this Absolute Interval of Definition.
(fig. 8.1-3) explains how the anamorphosis will be truncated later during use.
(fig. 8.1-3)
Condition on Y                              Result on Z
$Y \le Y_{amin}$                            $Z = Z_{amin}$
$Y_{amin} \le Y \le Y_{pmin}$
$Y_{pmin} \le Y \le Y_{pmax}$               $Z = \sum_{i=0}^{NH-1} \Psi_i H_i(Y)$
$Y_{pmax} \le Y \le Y_{amax}$
$Y \ge Y_{amax}$                            $Z = Z_{amax}$
Z =
with
d2v
Y +
1 d2v u g u du
as a consequence
(eq. 8.1-2)
2 = C0 2
dv
sk
and
2 = .
1 dv
sk
Condition on Z                              Result on Y
$Z \le Z_{amin}$                            $Y = Y_{amin}$
$Z_{amin} \le Z \le Z_{pmin}$
$Z_{pmin} \le Z \le Z_{pmax}$               $Y$ such that $Z = \sum_{i=0}^{NH-1} \Psi_i H_i(Y)$
$Z_{pmax} \le Z \le Z_{amax}$
$Z \ge Z_{amax}$                            $Y = Y_{amax}$
Frequency Inversion
The program simply sorts the raw values. A cumulative frequency $FC_i$ is then calculated for each sample, starting from the smallest value and adding the frequency of each sample:
$$FC_i = FC_{i-1} + W_i$$
(eq. 8.1-3)
The frequency $W_i$ is given by the user (the Weight Variable) or calculated as $W_i = 1/N$. Note that two samples with the same value will get different cumulative frequencies. Finally, the program calculates the gaussian value:
$$Y_i = \frac{G^{-1}(FC_i) + G^{-1}(FC_{i-1})}{2}$$
(eq. 8.1-4)
In this way, two equal raw data have different gaussian values. The resulting variable is "more" gaussian. This inversion method is generally recommended in Isatis.
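The two formulas above can be sketched in a few lines. This is a hedged illustration rather than the Isatis code: the clipping of the cumulative frequencies by a small `eps` is an assumption added here to keep the inverse gaussian c.d.f. finite at 0 and 1, and the function name is hypothetical.

```python
from statistics import NormalDist

def frequency_inversion(values, weights=None, eps=1e-6):
    """Gaussian scores from cumulative frequencies; ties get distinct scores."""
    n = len(values)
    if weights is None:
        weights = [1.0 / n] * n            # W_i = 1/N by default
    total = sum(weights)
    inv = NormalDist().inv_cdf

    def clip(p):
        # Assumption: keep p strictly inside (0, 1) so that inv(p) is finite.
        return min(max(p, eps), 1.0 - eps)

    order = sorted(range(n), key=lambda i: values[i])   # stable sort
    y = [0.0] * n
    fc_prev = 0.0
    for i in order:
        fc = fc_prev + weights[i] / total               # FC_i = FC_{i-1} + W_i
        y[i] = 0.5 * (inv(clip(fc)) + inv(clip(fc_prev)))
        fc_prev = fc
    return y

scores = frequency_inversion([3.0, 1.0, 2.0, 2.0])
```

In this example the two samples equal to 2.0 receive different gaussian values, as described above.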
Empirical Inversion
The empirical inversion calculates, for each raw value, the attached empirical frequency and the corresponding gaussian value. This time, two equal raw values will have the same gaussian transformed value.
Note - Even if the gaussian transformed values can be calculated without any anamorphosis model, a user who performs this operation for a simulation (for example) will have to back-transform the gaussian simulated values, and this time the anamorphosis model will be necessary. So it is very important to check during this step that this back transformation $Y \rightarrow Z$ will not be a problem, particularly from an interval of definition point of view. Indeed, one has to keep in mind that a simulation generates gaussian values on an interval often larger than the interval of definition of the initial data: so the Practical Interval should be carefully checked if the model has to be used later, after a simulation process.
i coefficients
$$Z = \Phi(Y) = \sum_{i=0}^{\infty} \Psi_i H_i(Y)$$
(eq. 8.2-1)
$$Z_v = \Phi_r(Y_v) = \sum_{i=0}^{\infty} \Psi_i r^i H_i(Y_v)$$
(eq. 8.2-2)
A simple support correction coefficient can allow the user to get this new anamorphosis and at the same time a model of the histogram of the blocks. In fact this coefficient "r" is determined from the variance of the blocks:
$$\mathrm{var}\, Z_v = \sum_{i=1}^{\infty} \Psi_i^2 r^{2i}$$
(eq. 8.2-3)
. varZ
i2
(eq. 8.2-4)
i=1
The only problem in the calculation of the coefficient "r" is that we need the anamorphosis model $\Psi_i$ and a variogram model. And unfortunately the variance of the points can be calculated with the anamorphosis (see above) or can be considered as the sill of the variogram (in a strict stationary case). In Isatis we calculate the block variance in the following way:
$$\mathrm{var}\, Z = \sum_{i=1}^{\infty} \Psi_i^2$$
(eq. 8.2-5)
$$\mathrm{var}\, Z_v = \mathrm{var}\, Z - \bar\gamma(v,v)$$
(eq. 8.2-6)
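Once the block variance is known, the coefficient r is the root of the block variance equation. The sketch below finds it by bisection, which is one possible choice and not necessarily what Isatis does; the lognormal coefficients used as input are a hypothetical example.

```python
import math

def block_variance(psi, r):
    """Sum over i >= 1 of psi_i^2 r^(2i)."""
    return sum(p * p * r ** (2 * i) for i, p in enumerate(psi) if i >= 1)

def support_coefficient(psi, var_zv, tol=1e-12):
    """Solve block_variance(psi, r) = var_zv for r in [0, 1] by bisection."""
    if var_zv > block_variance(psi, 1.0) or 0.0 > var_zv:
        raise ValueError("block variance must lie between 0 and the point variance")
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if var_zv > block_variance(psi, mid):
            lo = mid        # variance too small: r must increase
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical lognormal anamorphosis exp(0.5 y - 0.125), expanded to order 20.
psi = [(-0.5) ** i / math.sqrt(math.factorial(i)) for i in range(21)]
target = block_variance(psi, 0.8)      # pretend this came from var Z - gamma_bar(v,v)
r_est = support_coefficient(psi, target)
```

The bisection works because every term of the sum is non-negative and increasing in r, so the block variance is a monotone function of r on [0, 1].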
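When the sill of the punctual variogram is different from var Z, the value of $\bar\gamma(v,v)$ can be normalized.
Once the anamorphosis $\Phi(Y)$ has been modelled, the different quantities available for a given cutoff $z_c$ are:
The tonnage above the cutoff
$$T(z_c) = 1 - G(y_c) \quad \text{with} \quad y_c = \Phi^{-1}(z_c)$$
The metal quantity above the cutoff
$$Q(z_c) = \int_{y_c}^{+\infty} \Phi(y)\, g(y)\, dy$$
The mean grade above the cutoff
$$m(z_c) = \frac{Q(z_c)}{T(z_c)}$$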
Obviously these quantities can be calculated for the punctual anamorphosis but also with a given
block support. In this way, the user can have access to global recoverable reserves.
Note - These two parameters can be calculated in "Interpolate / Estimation / (Co-)kriging", "Test
Window" option "Print Complete Information", when kriging a block with the future configuration
of the samples... These two values are called in the kriging output: Variance of Z* (Estimated Z)
and Covariance between Z and Z*.
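The formulae used in this case are:
$$\mathrm{var}\, Z_v^* = \sum_{i=1}^{\infty} \Psi_i^2 s^{2i}$$
(eq. 8.2-7)
$$\mathrm{cov}(Z_v, Z_v^*) = \sum_{i=1}^{\infty} \Psi_i^2 r^i s^i \rho^i$$
(eq. 8.2-8)
This time, the different quantities for a given cutoff "zc" are:
- $T(z_c) = 1 - G(y_c)$ with $y_c = \Phi_s^{-1}(z_c)$
- $Q(z_c) = \int_{y_c}^{+\infty} \Phi_{r\rho}(y)\, g(y)\, dy$
- $m(z_c) = \frac{Q(z_c)}{T(z_c)}$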
Isatis gives the values of the two gaussian correlation coefficients "s" and "$\rho$" in the "Calculate" window for information.
Note - In the case where the future block estimates have no conditional bias, then $s = r\rho$, and the estimated recoverable reserves are the same as in the case of larger virtual blocks that would be perfectly known ("equivalent blocks", having a variance equal to the variance of the future estimates).
$$C(h) = \sum_{i=1}^{\infty} \Psi_i^2 \rho^i(h)$$
(eq. 8.3-1)
where $\rho(h)$ is the covariance of the gaussian variable.
This relationship is valid if the pair of variables (Y(x), Y(x+h)) can be considered as bivariate normal. From the relationship on covariances, we can derive the relationship on variograms.
This relationship has three uses:
- One can calculate the covariances (or variograms) on gaussian transformed values and raw values and check if the relationship holds, in order to confirm the binormality of the (Y(x), Y(x+h)) pairs.
- One can calculate the gaussian variogram on the gaussian transformed values and deduce the raw variogram. This is interesting because the variogram of the gaussian variable is often more clearly structured and easier to fit than the raw variogram derived from the raw values.
- One can calculate the gaussian variogram from the raw variogram. That transformation is not as immediate as the previous one, as the relationship needs to be inverted at each lag (the secant method can be used for instance). This use of the relationship between gaussian and raw covariance is compulsory to achieve disjunctive kriging on gaussian transformed values after change of support. It means that the program calculates the gaussian covariance for the block support v from an analogous relationship:
$$C_v(h) = \sum_{i=1}^{\infty} \Psi_i^2 r^{2i} \rho_v^i(h)$$
(eq. 8.3-2)
In each case, the relationship has to be applied using a discretization of the space (namely h values).
1
Ind Z z c = 1 Z z c =
0
if
Z zc
if
Z zc
(eq. 9.1-1)
No specific problem occurs when processing indicators instead of real variable(s), through the kriging algorithm. Nevertheless in case of multi-indicators, we must provide a multivariate model
which is not always easy to establish.
Instead, we can imagine using a generic model (obtained say for the indicator of the median value)
and tune its sill for all the indicators of interest; moreover, if we are only interested in the estimation
(rather than in its variance) we do not even have to bother about the tuning. Indicator variables
being necessarily correlated, indicator kriging is only an approximation of indicator cokriging
(which takes into account the other indicators when estimating one particular indicator), except
when all simple and cross structures are proportional (autokrigeability).
Hence, the only work consists in finding the kriging weights and applying them to each set of indicators to obtain an estimated indicator. This approach is used in the window: "Interpolate / Estimation / Bundled Indicator Kriging".
When the user prefers to fit a multivariate model two options are given in Isatis. In any case, the
user must first use the window "Statistics / Indicator Pre Processing" to create in the data file the
indicators and also to create the variables in the output grid file for kriging. When the indicators
have been created the user can fit one multivariate model using the standard multivariate approach
("Statistics / Exploratory Data Analysis" and "Statistics / Variogram Fitting") or fit each indicator
separately and use "Statistics / Univariate to Multivariate Variogram" to get the multivariate model.
The standard "Interpolate / Estimation / (Co-)Kriging" window will then be used to get the kriged
indicators.
In all the cases, the final problem comes in the interpretation of these results: in order to consider
the kriged indicators as conditional cumulative distribution functions (ccdf), we have to ensure that
the following constraints are fulfilled:
l
Definition
Ind Z z c 0 1
(eq. 9.1-2)
Inequality
Ind Z z 1 Ind Z z 2
if
z1 z2
(eq. 9.1-3)
The results may fail to verify these constraints (for example, because of negative weights) and
therefore the results need to be corrected. The correction used in Isatis in "Statistics / Indicator Post
Processing" has been exhaustively described in the GSLIB User's Guide (by Deutsch and Journel; p
77-81): it consists of the average of an upward and a downward correction of the cdf.
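The averaging of an upward and a downward correction can be sketched as below. The sketch assumes that the kriged indicators are given as a list ordered by increasing cutoff and read as a cumulative distribution (so the corrected values must be non-decreasing and lie in [0, 1]); the function name and the input values are hypothetical.

```python
def correct_order_relations(cdf):
    """Average of an upward and a downward monotonic correction of a kriged cdf."""
    clipped = [min(max(p, 0.0), 1.0) for p in cdf]      # enforce [0, 1]

    upward = []
    running = 0.0
    for p in clipped:                                   # sweep by increasing cutoff
        running = max(running, p)
        upward.append(running)

    downward = [0.0] * len(clipped)
    running = 1.0
    for k in range(len(clipped) - 1, -1, -1):           # sweep by decreasing cutoff
        running = min(running, clipped[k])
        downward[k] = running

    return [0.5 * (u + d) for u, d in zip(upward, downward)]

corrected = correct_order_relations([0.1, 0.4, 0.35, 0.7, 1.05])
```

Here the violation between the second and third cutoffs (0.4 followed by 0.35) is resolved by giving both the value 0.375, and the last value is brought back to 1.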
The kriged indicators are primarily used to generate conditional probabilities but the user may wish
to transform these results into the probability of exceeding fixed cutoffs, the average value above or
below these cutoffs, accounting for a possible change of support. These transformations of the indicator (co-)kriging results, have been also inspired by the GSLIB methods and we strongly encourage the user to refer to the paragraph (v.1.6) "Going Beyond a Discrete cdf" to understand the set of
"recipes" and the corresponding parameters.
$$Z(x) = \Phi\left(Y^*(x) + \sigma(x)\, W(x)\right)$$
(eq. 9.2-1)
where:
- Y*and
$$P[Z(x) \ge s] = P\left[Y^*(x) + \sigma(x) W(x) \ge \Phi^{-1}(s)\right] = P\left[W(x) \ge \frac{\Phi^{-1}(s) - Y^*(x)}{\sigma(x)}\right] = 1 - G\left(\frac{\Phi^{-1}(s) - Y^*(x)}{\sigma(x)}\right)$$
(eq. 9.2-2)
where G is the c.d.f. for the gaussian distribution.
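A direct transcription of this probability is given below as a sketch. It assumes that the inverse anamorphosis is available as a function; the logarithm used in the example corresponds to a hypothetical lognormal anamorphosis exp(y), chosen only so that the example runs on its own.

```python
import math
from statistics import NormalDist

def prob_exceed(s, y_est, sigma, inv_anamorphosis):
    """P[Z(x) >= s] = 1 - G((Phi^-1(s) - Y*(x)) / sigma(x))."""
    G = NormalDist().cdf
    return 1.0 - G((inv_anamorphosis(s) - y_est) / sigma)

# Hypothetical anamorphosis Z = exp(Y), whose inverse is the natural logarithm.
p_centre = prob_exceed(1.0, y_est=0.0, sigma=1.0, inv_anamorphosis=math.log)
p_high = prob_exceed(1.0, y_est=1.0, sigma=1.0, inv_anamorphosis=math.log)
```

When the kriged gaussian value equals the gaussian equivalent of the threshold, the probability is 0.5; it increases with the kriged value.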
1 G 1 s .
Y = 1 Z Hn Y
(eq. 9.3-1)
These polynomials will be used for the kriging step. For each panel we krige the polynomials:
H n DK =
Hn Y
(eq. 9.3-2)
+
where
r 2n v v
1
= ---- r n v v i n
N i
for all
(eq. 9.3-3)
is the block gaussian covariance model. Using these kriged polynomials, we can eas v
Tonnage
$$T(z_c) = 1 - G(y_c) - \sum_{n=1}^{\infty} \frac{H_{n-1}(y_c)\, g(y_c)}{\sqrt{n}}\, H_n^K$$
(eq. 9.3-4)
Metal Quantity
Q zc =
Hn K i y Hj y g y dy
j=0 i=0
with:
(eq. 9.3-5)
$$y_c = \Phi_v^{-1}(z_c)$$
(eq. 9.3-6)
For these calculations, the program needs an anamorphosis modeled on the block support and a variogram model. This variogram model is obtained in several steps:
- transformation of this regularized variogram into the corresponding gaussian variogram using the block anamorphosis.
It is important to notice that the kriging step is performed without universality conditions. The weights of the polynomials decrease quickly with the order: this means that in practice the number of kriged polynomials does not need to be large; 6 or 7 polynomials are generally enough. Conversely, the absence of universality conditions can lead to strange results in under-sampled zones (attraction to the mean).
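$$Z_v = \Phi_r(Y_v) = \sum_{i=0}^{\infty} \Psi_i r^i H_i(Y_v)$$
(eq. 9.4-1)
$$Z_V^* = \Phi_S(Y_V^*) = \sum_{i=0}^{\infty} \Psi_i S^i H_i(Y_V^*)$$
(eq. 9.4-2)
We get:
$$Y_V^* = \Phi_S^{-1}(Z_V^*)$$
(eq. 9.4-3)
$$y_c = \Phi_r^{-1}(z_c)$$
(eq. 9.4-4)
and:
Tonnage
$$T(z_c) = 1 - G\left(\frac{y_c - \frac{S}{r} Y_V^*}{\sqrt{1 - \left(\frac{S}{r}\right)^2}}\right)$$
Metal Quantity
$$Q(z_c) = \sum_{i=0}^{\infty} \left(\frac{S}{r}\right)^i H_i(Y_V^*) \sum_{j=0}^{\infty} \Psi_j r^j \int_{y_c}^{+\infty} H_j(y)\, H_i(y)\, g(y)\, dy$$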
Like for the global recoverable reserves, the calculations can be performed using an information
effect assumption. When fitting the anamorphosis function and using the support effect option, the
user will toggle on the information effect button. In the specific case of the uniform conditioning,
the user will have to fit the anamorphosis function of the blocks and the anamorphosis function of
the kriged panels. In the first case, the fit will have to be performed using the Information Effect
option switched ON. But while the user has to enter the variance of the kriged blocks with the ultimate information in the case of the small units as well as the covariance between the true blocks
and the kriged blocks, only the variance of the kriged panels with the current available information
will need to be entered for the panels.
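We have for the blocks:
$$\mathrm{var}\, Z_v^* = \sum_{i=1}^{\infty} \Psi_i^2 s^{2i}$$
(eq. 9.4-5)
$$\mathrm{cov}(Z_v^*, Z_v) = \sum_{i=1}^{\infty} \Psi_i^2 r^i s^i \rho^i$$
(eq. 9.4-6)
$$\mathrm{var}\, Z_V^* = \sum_{i=1}^{\infty} \Psi_i^2 S^{2i}$$
(eq. 9.4-7)
$$\mathrm{cov}(Z_v^*, Z_V^*) = \sum_{i=1}^{\infty} \Psi_i^2 s^i S^i t^i$$
(eq. 9.4-8)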
We make the assumption that the gaussian variables $Y_v$ and $Y_V^*$ are independent conditionally on $Y_v^*$, and we can write:
(eq. 9.4-9)
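$$E[Z_v^* \mid Z_V^*] = E[Z_V \mid Z_V^*] = Z_V^*$$
(eq. 9.4-10)
and so:
$$\mathrm{cov}(Z_v - Z_V^*,\ Z_V^*) = 0$$
(eq. 9.4-11)
$$\mathrm{cov}(Z_v, Z_V^*) = \mathrm{var}\, Z_V^*$$
(eq. 9.4-12)
So:
$$\mathrm{cor}(Y_v, Y_V^*) = \frac{S}{r}$$
(eq. 9.4-13)
$$\frac{S}{r} = \rho\, t$$
(eq. 9.4-14)
$$t = \frac{S}{r \rho}$$
(eq. 9.4-15)
We have also:
$$Y_V^* = \Phi_S^{-1}(Z_V^*)$$
(eq. 9.4-16)
and:
$$y_c = \Phi_s^{-1}(z_c)$$
(eq. 9.4-17)
Tonnage
$$T(z_c) = 1 - G\left(\frac{y_c - t\, Y_V^*}{\sqrt{1 - t^2}}\right)$$
Metal Quantity
$$Q(z_c) = \sum_{i=0}^{N} t^i H_i(Y_V^*) \sum_{j=0}^{N} \Psi_j r^j \rho^j \int_{y_c}^{+\infty} H_j(y)\, H_i(y)\, g(y)\, dy$$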
$$E\left[I_{Z_v \ge z} \mid Z(x)\right] = E\left[I_{Y_v \ge y} \mid Y(x)\right] = 1 - G\left(\frac{y - r\, Y(x)}{\sqrt{1 - r^2}}\right)$$
(eq. 9.5-1)
where:
m
m
m
(eq. 9.5-2)
n r n Hn y
(eq. 9.5-3)
$$E\left[Z_v\, I_{Z_v \ge z} \mid Z(x)\right] = E\left[\Phi_v(Y_v)\, I_{Y_v \ge y} \mid Y(x)\right]$$
(eq. 9.5-4)
Once the block ore and metal above cut-off have been calculated at data locations, the same quantities can be estimated by kriging at the target points. It only requires fitting variogram models on
both variables.
The advantage of this method, besides its simplicity, is that it does not require strict stationarity. By
using ordinary kriging the attraction towards the mean that occurs in simple kriging is avoided.
r 2 V V
= r V
for all
(eq. 9.6-1)
with:
Y VK =
(eq. 9.6-2)
and:
K2 = 1 r V
(eq. 9.6-3)
Knowing these two values, we can derive any confidence interval on the kriged gaussian values from the gaussian density function. For instance, for a 95% confidence level, we have:
$$\Pr\left\{Y_V^K - 2\sigma_K \le Y_V \le Y_V^K + 2\sigma_K\right\} = 95\,\%$$
(eq. 9.6-4)
which, using the anamorphosis, is equivalent for the raw values to:
$$\Pr\left\{\Phi_r(Y_V^K - 2\sigma_K) \le \Phi_r(Y_V) \le \Phi_r(Y_V^K + 2\sigma_K)\right\} = 95\,\%$$
(eq. 9.6-5)
$$Z_{min} = \Phi_r(Y_V^K - 2\sigma_K)$$
(eq. 9.6-6)
$$Z_{max} = \Phi_r(Y_V^K + 2\sigma_K)$$
(eq. 9.6-7)
(eq. 10.0-1)
Where
known a priori and
is the drift,
is the residual.
(eq. 10.0-2)
The unbiasedness condition aiming at filtering out on the drift, leads to add the following equations:
(eq. 10.0-3)
(eq. 10.0-4)
(eq. 10.0-5)
Using the optimality condition and minimizing the prediction variance, we get the following Bayesian kriging system:
(eq. 10.0-6)
(eq. 10.0-7)
(eq. 10.0-8)
With:
The priors are used in Bayesian kriging and in Bayesian simulations. This page presents the prior initialization option. The Bayesian techniques offer the possibility of adding some prior information on the coefficients of the basic drift functions (monomials or external drift functions). Let us call N their number; P denotes the number of samples.
The principle of the priors is to consider them as a set of Gaussian random variables, which must therefore be defined by specifying their individual means and variances, as well as their two-by-two correlation coefficients.
The mean value is simply obtained by solving the regression of the data on the set of basic drift functions. This requires solving an N x N system, which always has a valid solution unless the basic drift functions are linearly dependent.
For the variance and correlation, the work is slightly more difficult. The principle is to obtain them through a leave-one-point-out algorithm. In this technique, we remove each point in turn from the data set. On the remaining P-1 samples, we apply the regression (as mentioned above). This leads to an estimate of the N coefficients of the basic drift functions.
When the P trials have been performed, we have P sets of N coefficients. It is then easy to calculate the variance and correlation between these series.
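The leave-one-point-out initialization can be sketched for the simplest case. The example below assumes N = 2 basic drift functions (a constant and a linear term in one coordinate), made-up data, and a division by P when computing the variances; the normalization actually used by Isatis is not stated in the text.

```python
import math

def fit_line(x, z):
    """Least-squares regression z = a0 + a1 x; returns (a0, a1)."""
    n = len(x)
    mx, mz = sum(x) / n, sum(z) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxz = sum((xi - mx) * (zi - mz) for xi, zi in zip(x, z))
    a1 = sxz / sxx
    return (mz - a1 * mx, a1)

def leave_one_out_priors(x, z):
    """Prior means from the full regression; variances and correlation
    from the P regressions obtained by removing one sample at a time."""
    mean = fit_line(x, z)
    trials = [fit_line(x[:i] + x[i + 1:], z[:i] + z[i + 1:]) for i in range(len(x))]
    p = len(trials)
    centre = [sum(t[k] for t in trials) / p for k in range(2)]
    cov = [[sum((t[j] - centre[j]) * (t[k] - centre[k]) for t in trials) / p
            for k in range(2)] for j in range(2)]
    var = [cov[0][0], cov[1][1]]
    corr = cov[0][1] / math.sqrt(var[0] * var[1])
    return mean, var, corr

x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]            # sample coordinates (made-up)
z = [2.1, 4.9, 8.2, 10.8, 14.1, 17.0]         # sample values (made-up)
mean, var, corr = leave_one_out_priors(x, z)
```

The same scheme extends to N drift functions by replacing `fit_line` with a general least-squares solve of the N x N system.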
For the theoretical background, the user should refer to Matheron G., The intrinsic random functions and their application (In Adv. App. Prob. Vol.5, pp. 439-468, 1973).
11.1 Principle
The Turning Band method is a stereological device designed to reduce a multidimensional simulation to unidimensional ones: if $C_3$ stands for the (polar) covariance to be produced in $\mathbb{R}^3$, it is sufficient to simulate on lines a covariance $C_1$ given by:
$$C_1(r) = \frac{d}{dr}\left[r\, C_3(r)\right]$$
(eq. 11.1-1)
(eq. 11.1-2)
$$Y^{(n)} = \frac{Y_1 + \dots + Y_n}{\sqrt{n}}$$
(eq. 11.2-1)
tends to become Multigaussian with covariance C as n becomes very large, according to the Central Limit Theorem.
Several algorithms are available to simulate the elementary random functions $Y_i$ with a given covariance C. The user will find much more information in Lantuéjoul C., Geostatistical Simulation (Springer Berlin, 2002. 256p).
The choice of the method to generate the random function X is theoretically free. However, in Isatis, a particular method is preferred for each specific covariance model in order to optimize its generation. The selection of the method is automatic.
Spectral Method
The Spectral Method generates a distribution the covariance of which is expressed as the Fourier transform of a positive distribution. This method is rather general and is implemented in Isatis where the covariance is regular at the origin. This is the case for the Gaussian, Cardinal Sine,
J-Bessel or Cauchy models of covariance.
Any covariance is a positive definite function which can be written as the Fourier transform of a
positive spectral measure:
(eq. 11.2-2)
Yx =
where:
(eq. 11.2-3)
Dilution Method
The Dilution Method generates a numerical function F and partitions into intervals with constant length. Each interval is randomly valuated with F or -F. This method is suitable to simulate covariances with bounded ranges. In Isatis, it is used to generate Spherical or Cubic models
of covariance.
When the covariance corresponds to a geometrical covariogram i.e.:
(eq. 11.2-4)
(eq. 11.2-5)
where:
g is a numerical function.
Migration Method
The Migration Method generates a Poisson process that partitions the line into independent exponential intervals, which are valuated according to the model of covariance to be simulated. In Isatis it is used for:
- the exponential model: each interval is split into two halves which are alternately valuated with +1 and -1;
- the Stable and Gamma models: the intervals are valuated according to an exponential law;
- the generalized Covariance models: the intervals are valuated with the sum of gaussian processes.
The simulation of the covariance is then obtained by summing the projections of the simulations onto a given number of lines. Each line is in fact called a "turning band", and the problem of the optimal count of Turning Bands remains, although Ch. Lantuéjoul provides some hints in Lantuéjoul C., Non Conditional Simulation of Stationary Isotropic Multigaussian Random Functions (In M. Armstrong & P.A. Dowd eds., Geostatistical Simulations, Kluwer Dordrecht, 1994, pp.147-167).
11.3 Conditioning
If we consider the kriging estimation of Z(x) using the value of the variable at the data points $z(x_\alpha)$, at each point we can write the following decomposition:
$$Z(x) = Z(x)^K + [Z(x) - Z(x)^K]$$
(eq. 11.3-1)
In the Gaussian framework, the residual $[Z(x) - Z(x)^K]$ is not correlated with any data value. It is therefore independent from any linear combination of these data values, such as the kriging estimate. Finally the estimate and the residual are two independent random functions, not necessarily stationary: for example at a data point, the residual is zero.
If we consider a non-conditional simulation $Z_S(x)$ of the same random function, known over the whole domain of interest, and its kriging estimation based on the value of this simulation at the data points, we can write similarly:
$$Z_S(x) = Z_S(x)^K + [Z_S(x) - Z_S(x)^K]$$
(eq. 11.3-2)
where estimate and residual are independent, with the same structure.
By combining the simulated residual with the initial kriging estimation, we obtain:
$$Z_{SC}(x) = Z(x)^K + [Z_S(x) - Z_S(x)^K] = Z_S(x) + [Z(x)^K - Z_S(x)^K]$$
(eq. 11.3-3)
which is another random function, conditional this time as it honors the data values at the data
points.
Note - This conditioning method is not concerned about how the non-conditional simulation Zs(x)
has been obtained.
As non correlation is equivalent to independence in the gaussian context, a simulation of a gaussian
random function with nested structures can be obtained by adding independent simulations of the
elementary structures.
For the same reason, combining linearly independent gaussian random functions with elementary
structures gives, under a linear model of coregionalization, a multivariate simulation of different
variables.
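The conditioning step can be illustrated on a one-dimensional grid. This sketch makes several assumptions that are not in the text: simple kriging with a known zero mean, a hypothetical exponential covariance, and a non-conditional simulation drawn by Cholesky factorization instead of turning bands (the conditioning itself does not depend on how that simulation was obtained).

```python
import numpy as np

def cov(h, a=10.0):
    """Exponential covariance with unit sill and scale a (hypothetical model)."""
    return np.exp(-np.abs(h) / a)

def simple_kriging_weights(x_data, x_grid):
    """Columns hold the simple kriging weights for each grid node."""
    K = cov(x_data[:, None] - x_data[None, :])
    k0 = cov(x_data[:, None] - x_grid[None, :])
    return np.linalg.solve(K, k0)

gen = np.random.default_rng(0)
x_grid = np.arange(0.0, 50.0, 1.0)
idx = np.array([5, 20, 37])                  # grid nodes carrying data
x_data = x_grid[idx]
z_data = np.array([1.0, -0.5, 0.8])          # data values (made-up)

# Non-conditional simulation Z_S on the grid (Cholesky, for illustration only).
C = cov(x_grid[:, None] - x_grid[None, :])
L = np.linalg.cholesky(C + 1e-10 * np.eye(len(x_grid)))
z_s = L @ gen.standard_normal(len(x_grid))

W = simple_kriging_weights(x_data, x_grid)
z_k = W.T @ z_data            # kriging of the real data
z_s_k = W.T @ z_s[idx]        # kriging of the simulation sampled at the data points
z_sc = z_k + (z_s - z_s_k)    # conditional simulation
```

At the data nodes the kriged simulation equals the simulation itself, so the simulated residual vanishes there and the conditional simulation returns the data values.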
12. Truncated Gaussian Simulations
This page constitutes an add-on to the Users Guide for:
m
This model could be considered as a discrete version of the multigaussian one, for more theoretical
explanations, the user should refer to Galli A. et al., The Pros and Cons of the Truncated Gaussian
Method (In Geostatistical Simulations, M. Armstrong & P.A. Dowd eds, Kluwer, p 217, 1994).
The simulation method must produce a discrete variable (each value represents lithofacies) and be
controlled by a limited number of parameters that can be inferred from the usual data available
(core drills).
The leading idea comes from the two following observations:
- The lithofacies constitute a partition of the space: at a given point (cell of a grid) we may only have one lithofacies. The different lithofacies can be ordered: for example using the quantity of clay as a criterion, which in fact characterizes the quality of the reservoir.
- The Spatial Distribution of these lithofacies is different in the horizontal and in the vertical: along the vertical axis, it reproduces the sedimentary process, whereas horizontally it characterizes the homogeneity of the field.
If we consider a continuous random function Y(x) and the case of two lithofacies, A and its complementary $\bar{A}$, we can write that:
$$x \in A \iff Y(x) \le S_a$$
(eq. 12.0-1)
$$x \in \bar{A} \iff Y(x) > S_a$$
(eq. 12.0-2)
where $S_a$ is the threshold corresponding to the gaussian transform of the proportion $p_a$ of the facies A: $p_a = G(S_a)$, G being the cumulative distribution function of a gaussian distribution.
The thresholds are derived from the proportion of the different facies.
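The derivation of the thresholds from the proportions, and the truncation itself, can be sketched as follows. The facies are assumed to be ordered, and the function names and proportions are hypothetical.

```python
from bisect import bisect_left
from statistics import NormalDist

def thresholds(proportions):
    """Gaussian thresholds from ordered facies proportions (summing to 1)."""
    inv = NormalDist().inv_cdf
    cumulated = 0.0
    out = []
    for p in proportions[:-1]:          # the last facies needs no upper threshold
        cumulated += p
        out.append(inv(cumulated))      # S = G^-1(cumulated proportion)
    return out

def truncate(y, thr):
    """Facies index of a gaussian value: index 0 when y does not exceed thr[0]."""
    return bisect_left(thr, y)

two_facies = thresholds([0.5, 0.5])
three_facies = thresholds([0.2, 0.5, 0.3])
facies = [truncate(y, three_facies) for y in (-1.5, 0.0, 1.2)]
```

With three facies in proportions 20 %, 50 % and 30 %, the two thresholds are the gaussian quantiles at 0.2 and 0.7, and the three sample values fall in facies 0, 1 and 2 respectively.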
The problem is to find the gaussian random variable Y which corresponds to the indicators of the
different facies that we observe at the data points.
When this random variable is found, we must simply truncate the gaussian values to the thresholds
that characterize each facies, to obtain the simulation. Unfortunately the transformation which goes
from the indicator of a facies to the gaussian value is not bijective. Therefore, we cannot convert
each facies at the data point into its gaussian equivalent.
On the other hand, we can derive the covariance $\rho(h)$ of the truncated gaussian from the covariance $C_A(h)$ of the indicators of the different facies. In the case of two facies, we have:
$$C_A(h) = g(S_a)^2 \sum_{n=1}^{\infty} \frac{H_{n-1}^2(S_a)}{n!}\, \rho^n(h)$$
(eq. 12.0-3)
where:
- g is the gaussian density function,
- $H_n$ is the Hermite polynomial of degree n.
Therefore we can fit the underlying covariance $\rho(h)$ through its impact on the covariance of the
indicator of the facies A.
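The series linking the indicator covariance to the underlying gaussian correlation can be evaluated numerically. The sketch below is only an illustration: it assumes that the polynomials are the probabilists' Hermite polynomials (generated by the recurrence He_{n+1}(s) = s He_n(s) - n He_{n-1}(s)) and truncates the series at a hypothetical `n_terms`. For a threshold at 0 the result can be checked against the classical closed form arcsin(rho) / (2 pi).

```python
import math


def indicator_covariance(rho, s, n_terms=60):
    """Indicator covariance for a gaussian correlation rho and threshold s.

    Truncated series g(s)^2 * sum_{n>=1} He_{n-1}(s)^2 / n! * rho^n,
    with He_n the probabilists' Hermite polynomials (assumption).
    """
    g = math.exp(-0.5 * s * s) / math.sqrt(2.0 * math.pi)
    h_prev, h_curr = 0.0, 1.0  # He_{-1} (unused placeholder) and He_0
    total = 0.0
    for n in range(1, n_terms + 1):
        # At this point h_curr holds He_{n-1}(s)
        total += h_curr * h_curr / math.factorial(n) * rho ** n
        h_prev, h_curr = h_curr, s * h_curr - (n - 1) * h_prev
    return g * g * total


c_half = indicator_covariance(0.5, 0.0)
closed_form = math.asin(0.5) / (2.0 * math.pi)
```

Fitting the underlying structure then amounts to choosing rho(h) so that this model matches the experimental indicator covariance.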
For the domain of application of the Truncated Gaussian Method (fluvio-deltaic environment),
where the behavior along the vertical and the horizontal is quite different, we have chosen a factorized covariance for $\rho(h)$:
$$\rho(h_x, h_z) = \rho_x(h_x)\, \rho_z(h_z)$$
(eq. 12.0-4)
It expresses that, knowing the value of the gaussian variable at a point P, there is independence
between:
l
Moreover, the choice of exponential basic structures for both $\rho_x$ and $\rho_z$ leads to interesting screen
effect properties which drastically reduce the computing time of the algorithm.
The Conditional Simulation is finally performed using the random function Y(x) characterized by
its structure $\rho(h)$, and such that at a data point where the facies is known, the value of the gaussian variable must respect the thresholds that correspond to this facies. Finally, the gaussian realization is converted back into facies by truncation.
Implementation:
No neighborhood,
Migrated data.
13. Plurigaussian Simulations
Add-on to the On-Line Help for: Interpolate / Conditional Simulations / Plurigaussian.
This documentation is meant to explain the technical procedure involved in the Plurigaussian simulations. It partly refers to Armstrong M., Galli A., Le Loc'h G., Geffroy F., Eschard R., Plurigaussian Simulations in Geosciences (Springer, Berlin, 2003, 149 p.).
13.1 Principle
The principle of the categorical simulations is to obtain a variable on a set of target locations (usually the nodes of a regular grid), each category being represented by an integer value called lithotype. Without loss of generality, we will consider that the lithotype values are the consecutive integers
ranging from 1 to NLIT (where NLIT stands for the number of lithotypes to be simulated).
One plurigaussian simulation is obtained as the posterior coding of the combination of several
underlying stationary Gaussian Random Functions (GRF). In Isatis, the number of these GRF is
limited to 2 (denoted Y1 and Y2), each characterized by its individual structure. These two GRF are
usually independent but they can also be correlated (with a correlation coefficient $\rho$) according to
the following scheme. Let W1 and W2 be two independent GRF, then we set:
$$Y_1 = W_1$$
(eq. 13.1-1)
$$Y_2 = \rho\, W_1 + \sqrt{1 - \rho^2}\; W_2$$
(eq. 13.1-2)
In the rest of this chapter, we will consider (except when stated explicitly) that the two GRF are
independent.
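The combination scheme used for two correlated GRF can be checked numerically. The following sketch is a simplified illustration: it uses series of independent standard gaussian values in place of spatially structured GRF, and the helper names `correlate` and `sample_correlation` are hypothetical.

```python
import math
import random


def correlate(w1, w2, rho):
    """Build (Y1, Y2) from two independent standard gaussian series."""
    scale = math.sqrt(1.0 - rho * rho)
    y1 = list(w1)
    y2 = [rho * a + scale * b for a, b in zip(w1, w2)]
    return y1, y2


def sample_correlation(a, b):
    """Empirical correlation coefficient between two series."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)


rng = random.Random(0)
w1 = [rng.gauss(0.0, 1.0) for _ in range(20000)]
w2 = [rng.gauss(0.0, 1.0) for _ in range(20000)]
y1, y2 = correlate(w1, w2, rho=0.6)
r = sample_correlation(y1, y2)  # close to 0.6
```

The construction keeps a unit variance for Y2 because the squared coefficients sum to 1.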
The different lithotypes constitute a partition of the 2D gaussian space. Each lithotype (denoted $F_i$)
is attached to a domain $D_i$:
$$x \in F_i \iff (Y_1(x), Y_2(x)) \in D_i$$
(eq. 13.1-3)
In Isatis we have decided to realize a partition of the 2D gaussian space into rectangles with sides
parallel to the main axes. The projections of these rectangles on the gaussian axes define the thresholds attached to each lithotype and each GRF (denoted $t_j^i$ and $s_j^i$ respectively for the lower and
upper bounds of the GRF "j" for the lithotype "i"). Therefore the previous proposition can be stated
as:
$$x \in F_i \iff \begin{cases} t_1^i \le Y_1(x) < s_1^i \\ t_2^i \le Y_2(x) < s_2^i \end{cases}$$
(eq. 13.1-4)
The 2D gaussian space, with a rectangle representing each lithotype, is usually referred to as the
lithotype rule and corresponds to the next figure (7 lithotypes example):
(fig. 13.1-1)
For each lithotype, the thresholds on both GRF ($t_1^i$, $s_1^i$, $t_2^i$, $s_2^i$) are related to the proportions of the
lithotypes:
$$P_{F_i}(x) = E[1_{F_i}(x)] = P[(Y_1(x), Y_2(x)) \in D_i] = \int_{t_1^i}^{s_1^i} \int_{t_2^i}^{s_2^i} g_\Sigma(u, v)\, du\, dv$$
(eq. 13.1-5)
where $g_\Sigma(u, v)$ is the bivariate gaussian density function with mean 0, variance 1, and $\Sigma$ as correlation matrix:
$$\Sigma = \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix}$$
(eq. 13.1-6)
When the correlation between the two GRF is 0, the previous equation can be factorized:
$$P_{F_i}(x) = \int_{t_1^i}^{s_1^i} g(u)\, du \int_{t_2^i}^{s_2^i} g(v)\, dv = [G(s_1^i) - G(t_1^i)]\, [G(s_2^i) - G(t_2^i)]$$
(eq. 13.1-7)
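For independent GRF, the proportion of a lithotype is the product of two differences of the gaussian cumulative distribution function. The sketch below applies this to a hypothetical three-lithotype rule; the dictionary layout `rule` and the function names are illustrative assumptions and do not reflect the Isatis file format.

```python
from statistics import NormalDist

G = NormalDist().cdf
INF = float("inf")


def lithotype_proportion(t1, s1, t2, s2):
    """Probability of the rectangle [t1, s1] x [t2, s2] for independent GRF."""
    return (G(s1) - G(t1)) * (G(s2) - G(t2))


def lithotype_of(y1, y2, rule):
    """Lithotype whose rectangle contains the pair of gaussian values."""
    for code, (t1, s1, t2, s2) in rule.items():
        if t1 <= y1 < s1 and t2 <= y2 < s2:
            return code
    raise ValueError("the rule does not cover the gaussian plane")


# Hypothetical rule: lithotype -> (t1, s1, t2, s2)
rule = {
    1: (-INF, 0.0, -INF, INF),
    2: (0.0, INF, -INF, 0.5),
    3: (0.0, INF, 0.5, INF),
}
proportions = {code: lithotype_proportion(*box) for code, box in rule.items()}
```

Because the rectangles form a partition of the gaussian plane, the proportions sum to 1.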
In the non-stationary case, the only difference is that the thresholds are not constant anymore.
Instead they vary as a function of the target point. For example, a point "x" belongs to the lithotype
Fi if:
$$x \in F_i \iff \begin{cases} t_1^i(x) \le Y_1(x) < s_1^i(x) \\ t_2^i(x) \le Y_2(x) < s_2^i(x) \end{cases}$$
(eq. 13.1-8)
13.2 Variography
We assume at this stage that the proportions of each lithotype are known at any point in space and
that the lithotype rule is chosen. We must now determine the structure of the two GRF by trial and
error, comparing the experimental variograms to their expressions in the model.
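For the experimental quantities, we can compute all the simple variograms of the indicators for all
lithotypes:
$$\gamma_{F_i}(x, x+h) = \frac{1}{2N} \sum_{|x - x'| = h} [1_{F_i}(x) - 1_{F_i}(x')]^2$$
(eq. 13.2-1)
and the cross-variograms:
$$\gamma_{F_i F_j}(x, x+h) = \frac{1}{2N} \sum_{|x - x'| = h} [1_{F_i}(x) - 1_{F_i}(x')]\, [1_{F_j}(x) - 1_{F_j}(x')]$$
(eq. 13.2-2)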
In general, the previous expressions are not strictly valid, as the indicator function is neither stationary nor ergodic.
However, as the underlying GRF are stationary, we will still use the previous equations.
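The experimental simple and cross-variograms of indicators can be sketched as follows for the simplest configuration, a regularly sampled line where the lag is a number of samples. The function names and the small facies sequence are hypothetical.

```python
def indicator(facies, code):
    """Indicator series of one lithotype along a regularly sampled line."""
    return [1 if f == code else 0 for f in facies]


def cross_variogram(ind_i, ind_j, lag):
    """Experimental cross-variogram of two indicators for an integer lag."""
    pairs = len(ind_i) - lag
    total = sum(
        (ind_i[k] - ind_i[k + lag]) * (ind_j[k] - ind_j[k + lag])
        for k in range(pairs)
    )
    return total / (2.0 * pairs)


def simple_variogram(ind_i, lag):
    """Experimental simple variogram: cross-variogram of an indicator with itself."""
    return cross_variogram(ind_i, ind_i, lag)


facies = [1, 1, 2, 2, 2, 3, 3, 1, 1, 2]
ind1, ind2 = indicator(facies, 1), indicator(facies, 2)
gamma_11 = simple_variogram(ind1, 1)
gamma_12 = cross_variogram(ind1, ind2, 1)
```

The cross-variogram of two different lithotypes cannot be positive, since a point leaving one lithotype can only enter another one.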
In order to match the expression of the experimental simple variogram, we can write the simple variogram model expression as follows:
$$\gamma_{F_i}(x, x+h) = \frac{1}{2} Var[1_{F_i}(x) - 1_{F_i}(x+h)] = \frac{1}{2} E\{[1_{F_i}(x) - 1_{F_i}(x+h)]^2\}$$
(eq. 13.2-3)
$$\gamma_{F_i}(x, x+h) = \frac{1}{2} \{E[1_{F_i}(x)] + E[1_{F_i}(x+h)] - 2E[1_{F_i}(x)\, 1_{F_i}(x+h)]\}$$
$$= \frac{1}{2} \left\{ P_{F_i}(x) + P_{F_i}(x+h) - 2 \int_{t_1^i(x)}^{s_1^i(x)} \int_{t_2^i(x)}^{s_2^i(x)} \int_{t_1^i(x+h)}^{s_1^i(x+h)} \int_{t_2^i(x+h)}^{s_2^i(x+h)} g_\Sigma(u_1, u_2, v_1, v_2)\, du_1\, du_2\, dv_1\, dv_2 \right\}$$
(eq. 13.2-4)
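where the 4-variable gaussian density $g_\Sigma$ corresponds to the 4x4 covariance matrix:
$$\Sigma = \begin{pmatrix} 1 & \rho & C_1(h) & \rho\, C_1(h) \\ \rho & 1 & \rho\, C_1(h) & C_2(h) \\ C_1(h) & \rho\, C_1(h) & 1 & \rho \\ \rho\, C_1(h) & C_2(h) & \rho & 1 \end{pmatrix}$$
(eq. 13.2-5)
$C_1(h)$ and $C_2(h)$ being the covariances of the two GRF. Similarly,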
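$$\gamma_{F_i F_j}(x, x+h) = \frac{1}{2} E\{[1_{F_i}(x) - 1_{F_i}(x+h)]\, [1_{F_j}(x) - 1_{F_j}(x+h)]\}$$
(eq. 13.2-6)
which expands as follows because we cannot have different lithotypes at the same point:
$$\gamma_{F_i F_j}(x, x+h) = -\frac{1}{2} \{E[1_{F_i}(x)\, 1_{F_j}(x+h)] + E[1_{F_i}(x+h)\, 1_{F_j}(x)]\}$$
(eq. 13.2-7)
and:
$$\gamma_{F_i F_j}(x, x+h) = -\frac{1}{2} \left[ \int_{t_1^i(x)}^{s_1^i(x)} \int_{t_2^i(x)}^{s_2^i(x)} \int_{t_1^j(x+h)}^{s_1^j(x+h)} \int_{t_2^j(x+h)}^{s_2^j(x+h)} g_\Sigma(u_1, u_2, v_1, v_2)\, du_1\, du_2\, dv_1\, dv_2 \right.$$
$$\left. + \int_{t_1^i(x+h)}^{s_1^i(x+h)} \int_{t_2^i(x+h)}^{s_2^i(x+h)} \int_{t_1^j(x)}^{s_1^j(x)} \int_{t_2^j(x)}^{s_2^j(x)} g_\Sigma(u_1, u_2, v_1, v_2)\, du_1\, du_2\, dv_1\, dv_2 \right]$$
(eq. 13.2-8)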
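In the stationary case, the thresholds no longer depend on x and the previous expressions reduce to:
$$\gamma_{F_i}^{Stat}(h) = P_{F_i} - \int_{t_1^i}^{s_1^i} \int_{t_2^i}^{s_2^i} \int_{t_1^i}^{s_1^i} \int_{t_2^i}^{s_2^i} g_\Sigma(u_1, u_2, v_1, v_2)\, du_1\, du_2\, dv_1\, dv_2$$
(eq. 13.2-9)
$$\gamma_{F_i F_j}^{Stat}(h) = - \int_{t_1^i}^{s_1^i} \int_{t_2^i}^{s_2^i} \int_{t_1^j}^{s_1^j} \int_{t_2^j}^{s_2^j} g_\Sigma(u_1, u_2, v_1, v_2)\, du_1\, du_2\, dv_1\, dv_2$$
(eq. 13.2-10)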
13.2.2 Optimization
Both simple and cross-variograms use the same quadruple gaussian integral I:
$$I(h) = \int_{t_1^i(x)}^{s_1^i(x)} \int_{t_2^i(x)}^{s_2^i(x)} \int_{t_1^j(x+h)}^{s_1^j(x+h)} \int_{t_2^j(x+h)}^{s_2^j(x+h)} g_\Sigma(u_1, u_2, v_1, v_2)\, du_1\, du_2\, dv_1\, dv_2$$
(eq. 13.2-11)
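This quantity can be optimized according to the calculation environment. We can interchange the
integrals and rewrite the previous formula:
$$I(h) = \int_{t_1^i(x)}^{s_1^i(x)} \int_{t_1^j(x+h)}^{s_1^j(x+h)} \int_{t_2^i(x)}^{s_2^i(x)} \int_{t_2^j(x+h)}^{s_2^j(x+h)} g_\Sigma(u_1, u_2, v_1, v_2)\, du_1\, dv_1\, du_2\, dv_2$$
(eq. 13.2-12)
When the two GRF are independent, this integral factorizes:
$$I(h) = \int_{t_1^i(x)}^{s_1^i(x)} \int_{t_1^j(x+h)}^{s_1^j(x+h)} g_{\Sigma_1}(u, v)\, du\, dv \times \int_{t_2^i(x)}^{s_2^i(x)} \int_{t_2^j(x+h)}^{s_2^j(x+h)} g_{\Sigma_2}(u, v)\, du\, dv$$
(eq. 13.2-13)
with:
$$\Sigma_1 = \begin{pmatrix} 1 & C_1(h) \\ C_1(h) & 1 \end{pmatrix} \quad \text{and} \quad \Sigma_2 = \begin{pmatrix} 1 & C_2(h) \\ C_2(h) & 1 \end{pmatrix}$$
(eq. 13.2-14)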
Similarly, each integral can be optimized in the case where C1(h) or C2(h) is null. For example, when $C_1(h) = 0$:
$$\int_{t_1^i(x)}^{s_1^i(x)} \int_{t_1^j(x+h)}^{s_1^j(x+h)} g_{\Sigma_1}(u, v)\, du\, dv = \int_{t_1^i(x)}^{s_1^i(x)} g(u)\, du \int_{t_1^j(x+h)}^{s_1^j(x+h)} g(u)\, du$$
(eq. 13.2-15)
and each integral can be calculated directly using the Gauss integral function G:
$$\int_{t_1^i(x)}^{s_1^i(x)} g(u)\, du = G(s_1^i(x)) - G(t_1^i(x))$$
(eq. 13.2-16)
13.2.3 Calculations
This factorization is crucial as far as the CPU time is concerned: the 4 terms integral is approximately 100 times more expensive than twice the 2 terms integral.
As one can easily check, in the non-stationary case, the model of the simple and cross-variograms
can only be calculated at the data points, as we need to know the thresholds for the integral of the
gaussian multivariate density. This explains the technique used in the structural analysis of the
plurigaussian model: for each pair of data points, we calculate simultaneously the experimental variogram and the model. These two quantities are then regrouped by multiples of the lags and represented graphically.
13.3 Simulation
The plurigaussian simulation consists in simulating two independent random functions and coding
their combination into lithotypes, according to the lithotype rule. When the two GRF are correlated,
we can still work with independent primary GRF and combine them afterwards as described earlier.
When running conditional simulations, we must convert the lithotypes into values in the gaussian
scale beforehand. This involves an iterative technique known as the Gibbs sampler. We will now
describe this method for a set of lithotypes data (say "N") in the case of non-correlated GRF. We
will focus on one GRF in particular.
We first define two vectors of gaussian bounds, $g_{min}$ and $g_{max}$, corresponding to the given lithotype for the first GRF at the data points. Obviously
the bounds of the interval depend on the values of the proportions at this location. The gaussian values Y are initialized within these intervals:
$$g_{min} \le Y \le g_{max}$$
(eq. 13.3-1)
These values belong to the correct gaussian intervals (by construction) but are not correct with
respect to their covariance. The next steps are meant to fulfill this covariance requirement while
keeping the constraints on the intervals.
We then enter in an iterative process where the following operations are performed:
l
Y =
of this estimation.
(eq. 13.3-2)
If the two GRF are correlated the interval of the third step becomes:
g min Y Y 1 2 g max Y Y 1 2
------------------------------------------------------------- ------------------------------------------------------------- 1 2
1 2
(eq. 13.3-3)
This iterative procedure needs a stopping criterion. In Isatis the number of iterations is fixed
by the user.
When the gaussian values are defined at the conditioning data points, the rest of the process is standard. We must first perform the conditional simulations of two independent GRF. Then at each grid
node their outcomes are combined according to the lithotype rule in order to produce lithotype
information.
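The Gibbs iterations described in this section can be sketched under strong simplifying assumptions, which are not those of Isatis: the data are regularly spaced along a single line, and the GRF has an exponential covariance with correlation `r` between adjacent samples. In that particular case the conditional distribution of one value given all the others depends only on its two neighbours, so the kriging step is replaced here by closed-form expressions. All names are hypothetical.

```python
import math
import random
from statistics import NormalDist

GAUSS = NormalDist()


def draw_truncated(mean, std, low, high, rng):
    """Draw a gaussian value with the given mean and std inside [low, high]."""
    p_low = GAUSS.cdf((low - mean) / std)
    p_high = GAUSS.cdf((high - mean) / std)
    p = p_low + rng.random() * (p_high - p_low)
    p = min(max(p, 1e-12), 1.0 - 1e-12)  # keep inv_cdf inside its domain
    value = mean + std * GAUSS.inv_cdf(p)
    return min(max(value, low), high)  # guard against round-off


def gibbs_sampler(g_min, g_max, r, n_iter, seed=0):
    """Gaussian values honouring the intervals (at least two data points)."""
    rng = random.Random(seed)
    n = len(g_min)
    # Initialization: inside the intervals, covariance ignored
    y = [draw_truncated(0.0, 1.0, g_min[i], g_max[i], rng) for i in range(n)]
    for _ in range(n_iter):
        for i in range(n):
            if 0 < i < n - 1:
                # Conditional law given both neighbours (screen effect)
                mean = r * (y[i - 1] + y[i + 1]) / (1.0 + r * r)
                var = (1.0 - r * r) / (1.0 + r * r)
            else:
                neighbour = y[1] if i == 0 else y[n - 2]
                mean, var = r * neighbour, 1.0 - r * r
            y[i] = draw_truncated(mean, math.sqrt(var), g_min[i], g_max[i], rng)
    return y


INF = float("inf")
g_min = [-INF, -INF, 0.0, 0.0, -INF]
g_max = [0.0, 0.0, INF, INF, 0.0]
y = gibbs_sampler(g_min, g_max, r=math.exp(-0.5), n_iter=100)
```

Each value is redrawn from its conditional distribution restricted to its interval, so the interval constraints hold at every iteration while the covariance is progressively honoured.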
13.4 Implementation
The Gibbs sampler is the difficult part of the algorithm. In theory, it requires all the information to be
taken into account simultaneously (unique neighborhood). This constraint rapidly becomes intractable when the number of data is too large.
However there is a strong advantage in considering a unique neighborhood. As a matter of fact, the
covariance matrix CN can be established and inverted once for all (for the N data locations). The
inverse of the kriging matrix
We first consider each well/line individually. The iterative process is performed several times on
the data along this well before the next well is tackled.
When the number of data along one line is too large (more than 400 points), the well/line is subdivided into several pieces using a moving window (of 400 points). The starting 400 samples
are processed first, then the window is moved by 200 samples further before the iterative process is performed again. The window is moved until all the data have been processed.
As the wells/lines are mainly vertical, this first step ensures that the vertical behavior of the
covariance is fulfilled.
In order to reproduce the horizontal behavior of the covariance, we must now run the Gibbs
sampler in a more isotropic way. This is achieved by selecting in a standard Isatis neighborhood
a set of points around the first data point. The Gibbs iterations are performed on this subset of
points and then the program selects another subset around the second point and so on until all
the data points have been the center of a subset.
In Isatis the numbers of iterations for the two steps may be different and are given by the user.
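The subdivision of a long well into overlapping windows can be sketched as below. The window size (400) and shift (200) come from the text; the treatment of the last, possibly shorter, window is an assumption, and the function name is hypothetical.

```python
def moving_windows(n_samples, size=400, shift=200):
    """(start, end) index pairs, end exclusive, covering samples 0..n_samples-1."""
    if n_samples <= size:
        return [(0, n_samples)]
    windows, start = [], 0
    while start + size < n_samples:
        windows.append((start, start + size))
        start += shift
    windows.append((start, n_samples))  # last window, possibly shorter
    return windows


windows = moving_windows(1000)
```

The Gibbs iterations would then be run on the samples of each window in turn, consecutive windows sharing half of their samples.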
(eq. 14.4-1)
(eq. 14.4-2)
defines the data event at u. Note that a data event with undefined components (for nodes which are
not yet simulated) can be considered.
To attribute a facies to a node u in the simulation grid, we retain the nodes v in the training image
(TI) where the data event d(v) has the same components as those of d(u). Then the occurrences of
all the facies at the nodes v are counted. This provides a probability distribution function (pdf)
that can be used to draw a facies at the node u randomly. More precisely, if the positions of the known
components in d(u) are $i_1 < \dots < i_m$ (with $0 \le m \le N$), then the probability of drawing the facies k at
the node u is:
(eq. 14.4-3)
for $0 \le k < M$, where M is the number of facies. Note that if the number of matching data events is
too small (less than the min replicates parameter, see Impala's mps help page), we consider the last
component in d(u) as undefined (the last informed node is dropped) and we repeat this operation
until this number is acceptable.
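The counting procedure can be illustrated on a one-dimensional training image, scanning it directly rather than using a stored list of patterns. This is a sketch under assumptions: the pdf is taken as the relative frequency of each facies among the matching nodes, the data event is a list of hypothetical (lag, facies) pairs, and the function name is illustrative.

```python
from collections import Counter


def facies_pdf(training, data_event, n_facies, min_replicates):
    """Relative frequencies of the facies at the nodes matching a data event.

    data_event is a list of (lag, facies) pairs in the order of d(u);
    its last pair is dropped until enough replicates are found.
    """
    event = list(data_event)
    while True:
        counts = Counter()
        for v in range(len(training)):
            if all(
                0 <= v + lag < len(training) and training[v + lag] == f
                for lag, f in event
            ):
                counts[training[v]] += 1
        total = sum(counts.values())
        if total >= min_replicates or not event:
            break
        event.pop()  # drop the last informed node
    if total == 0:
        raise ValueError("empty training image")
    return [counts[k] / total for k in range(n_facies)]


training = [0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 1, 1]
pdf_full = facies_pdf(training, [(-1, 0), (-2, 0)], 2, min_replicates=1)
pdf_reduced = facies_pdf(training, [(-1, 0), (-2, 0)], 2, min_replicates=4)
```

With only three replicates of the full data event, requiring four replicates forces the second informed node to be dropped, which changes the resulting pdf.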
In addition, the multigrid approach is used to capture structures within the training image that are
defined at different scales. Let us introduce some terminology.
An Isatis grid file is a box-shaped set of pixels:
(eq. 14.4-4)
where Nx, Ny, Nz are the dimensions along the x axis, y axis (and z axis) respectively. In the
main grid G, the i-th subgrid is defined as:
(eq. 14.4-5)
From the original search template t(u) and data event d(u), define the search template
(eq. 14.4-6)
(eq. 14.4-7)
Where the lag vectors are magnified according to the scale of the subgrid SGi. Then, the simulation
proceeds successively with the simulation of all the nodes in the multigrid Gm-1, using the search
template tm-1. The process continues with the nodes in the multigrid Gm-2, using the search template tm-2 and repeats for all the multigrid levels (in decreasing order) similarly. The simulation
starts with the coarsest multigrid level and finishes with the finest one.
The pdf used to draw a facies at a node u of the simulation grid is computed from the list. The formula is used and a minimal number of replicates, Cmin, is given. Assume that the simulated nodes in
the data event centered at u, d(u), are $u + h_{i_1}, \dots, u + h_{i_n}$, and let $d^{(j)}(u)$ be the data event whose
defined components are $s(u + h_{i_1}), \dots, s(u + h_{i_j})$ (only the first j simulated nodes in d(u) are taken into
account), $1 \le j \le n$. For $1 \le j \le n$ and $0 \le k < M$, let $C_k^{(j)}$ be the number of nodes v in the training
image with facies k such that the data event centered at v, d(v), is compatible with $d^{(j)}(u)$ (i.e. $s(v + h_{i_l}) = s(u + h_{i_l})$, $1 \le l \le j$). Then the greatest index j such that the number of replicates is
greater than or equal to Cmin is retained, and the corresponding pdf is:
(eq. 14.4-8)
(fig. 14.4-1)
(fig. 14.4-2)
For each conditioning data, the attributed facies is assigned to the node in the simulation grid
whose corresponding region contains the data point. (If more than one conditioning data
lead to the same node location, we retain only the conditioning data whose point is the
closest to the center of the corresponding region.)
Each conditioning node Uc in G obtained by the step above is spread in the subgrid SG1 as
follows:
- We select all the closest nodes in the subgrid SG1 to Uc, i.e. the nodes in SG1 that realize
the minimum $\min_{u \in SG_1}$ ..., where d is the dimension (2 or 3) and u(j) (resp. Uc(j)) are the
integer coordinates in G of the node u (resp. Uc).
- If all the selected nodes are unsimulated, we choose one randomly and simulate a facies
for this chosen node using the search template t0, corresponding to the scale of the subgrid SG0. Otherwise, nothing is done (no need to spread this data).
The previous step is repeated for the subgrids SG2,...,SGm-1 successively: the conditioning
nodes in G obtained in the first step are spread in the subgrid SGi using the search template
ti-1 corresponding to the scale of the subgrid SGi-1.
The simulation continues with the simulation of all the unsimulated nodes in the multigrids Gm-1,
Gm-2,...,G0 successively.
(fig. 14.4-3)
(fig. 14.4-3): (a) The position of the conditioning data is presented (encircled). (b) The gray scale
represents the index of the node in the path; the gray scale goes from 0 (white) to 49999 (black).
The multiple point statistics simulation described above is valid for stationary training images,
because spatial patterns (pixel configurations) according to a search template are stored (in a list)
regardless of their location in the training image. However, Impala's mps allows the use of non-stationary training images. In this case, an auxiliary variable is used to describe the non-stationarity. The
use of a non-stationary training image TI (containing facies, the primary variable) requires an auxiliary
variable, which must be exhaustively known in the simulation grid to guide the simulation.
Below, the method is presented for one auxiliary variable t; the associated maps are called TIaux
and Gaux for the training image TI and the simulation grid G respectively.
In the presence of an auxiliary variable, a vector m is appended to each element of the list. An element
of the list is then a triplet of vectors (d,c,m) where d = (s1,...,sN) defines a data event, c = (c0,...,cM-1) is a list of occurrence counters for each facies and m = (m0,...,mM-1) is a list of means for the
auxiliary variable: ci is the number of data events d(v) equal to d found in the training image with
facies i at the reference node v, and mi is the mean of the auxiliary variable at these nodes v.
Before starting the simulation and building the list, the auxiliary variable, say t, is normalized to the
interval [0,1] via the linear transformation from [a,b] to [0,1], t → (t-a)/(b-a), where a and b are respectively the minimum and the maximum of the auxiliary variable t(v), v in TIaux ∪ Gaux.
To simulate a facies at a node u of the grid G, knowing the data event d(u) and the auxiliary variable
t(u) (provided by the auxiliary grid Gaux), the following is done. A tolerance error u between 0
and 1 is fixed. For each facies k, the set Ek of the elements in the list that are compatible with the
data event d(u) and that satisfy:
(eq. 14.4-9)
are retained. Then, the sum Ck of the occurrence counters ck of the elements of Ek and the resulting
mean Mk for the auxiliary variable are computed:
(eq. 14.4-10)
where the exponent (e) denotes the corresponding element in the list. The resulting means are used
to penalize the counters. For each facies k, the penalized counter
is defined as
(eq. 14.4-11)
Then the conditional pdf knowing the data event d(u) and the auxiliary variable t(u), used to draw
the facies at the node u, is given by:
(eq. 14.4-12)
where :
(eq. 14.4-13)
In summary, the presence of an auxiliary variable adds a step of selection and a step of penalization
for retrieving the conditional pdf.
14.4.5 Servo-System
Impala allows target global or local proportions / probabilities to be specified. A servo-system is used to
modify the pdf employed for simulating each node, so that the simulation tends towards the given target proportions.
The method is inspired by Strebelle (2000).
, let:
m
m
the current global proportion for the facies k, computed over all the already
informed nodes in the simulation grid,
(eq. 14.4-1)
Thus, when the current global proportion for a facies k is below (resp. above) the target proportion,
the corresponding original probability is increased (resp. reduced), except if it is equal to zero,
which means that the facies k is incompatible. Finally, the pdf
used for simulating the node u is obtained after truncating the values
,
in [0, 1] and a normalization:
(eq. 14.4-2)
Note that the normalization constant C can vanish because the correction is not necessarily applied
to the original probability of each facies. In this case (C = 0), the pdf used for simulating the node u
is set to the (global) target proportion.
Note - The current global proportions are simply updated when a node in the simulation grid is
simulated. For the simulation of the first node of an unconditional simulation, the current global
proportion is not defined and the servo-system is not activated for this node.
(eq. 14.4-3)
where
denotes the proportion of the facies k in the training image, d is a constant defined at
1000 and
is a weight specified by the user. The constant d is a scale factor: for example d
= 1000 means that the difference between the target and the prior distribution (given by the training
image) is set in
. Hence, the correction factor t automatically takes into account the deviation of the target proportion from the facies proportion in the training image (a greater correction will be applied for a greater deviation). The power
values for
in the interval [0, 1]: the value
= 1 corresponds to the maximal strength, which
is set according to the scale parameter d (hard-coded).
Note - The servo-system is a tool that perturbs the pdfs used during the simulation. Hence,
using the servo-system can deteriorate the structures to be simulated.
15. Fractal Simulations
This page constitutes an add-on to the User's Guide for Interpolate / Non-Conditional Simulations /
Random Functions / Fractals.
For more information concerning the Fractal Model, please refer to Peitgen H.O., Saupe D., Barnsley M.F., The Science of Fractal Images (Springer N.Y., 1998).
15.1 Principle
A fractional Brownian motion $V_H(t)$ is a single-valued function of one variable t (referred to as the
time), such that its increments have a Gaussian distribution with variance:
$$E[V_H(t_2) - V_H(t_1)]^2 = k\, |t_2 - t_1|^{2H} \quad \text{with} \quad 0 < H < 1$$
(eq. 15.1-1)
H increases.
$$E[V_H(x_2) - V_H(x_1)]^2 = k\, |x_2 - x_1|^{2H}$$
(eq. 15.1-2)
n , which
$$D = n + 1 - H$$
(eq. 15.1-3)
k represents a positive constant that we assimilate to a variance and denote as $\sigma^2$ in the next
paragraph.
Fractals come in two major variations. Some are composed of several scaled-down and rotated copies of themselves, such as the well-known von Koch snowflake, or the Julia sets where the whole set can
be obtained by applying a non-linear iterated map to an arbitrarily small section of it. These are
called the deterministic fractals. Their generation simply requires the use of a particular mapping
or rule which is then repeated over and over in a usually recursive scheme.
We can also include an additional element of randomness allowing the simulation of random fractals. Given the fact that fractals have infinite details at all scales, a complete computation of a fractal
is clearly impossible. Instead it is sufficient to approximate these computations down to a precision
which matches the size of the pixels of the grid that we wish to simulate. Several simulation algorithms are used. These algorithms will be briefly described in 1 dimension; they are usually
extended to higher dimensions without any major problem.
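We set X(0) = 0 and X(1) as a sample of a gaussian variable with mean 0 and variance $\sigma^2$.
We then set the value of X(1/2) to the average of X(0) and X(1) plus some Gaussian random offset $D_1$ with mean 0 and variance $\Delta_1^2$:
$$X(1/2) = \frac{1}{2} [X(0) + X(1)] + D_1$$
(eq. 15.2-1)
$$Var[X(1/2) - X(0)] = \frac{1}{4} \sigma^2 + \Delta_1^2$$
(eq. 15.2-2)
This variance must equal the variance $\frac{1}{2} \sigma^2$ required for an increment over a distance 1/2 (case H = 1/2), so that:
$$\frac{1}{4} \sigma^2 + \Delta_1^2 = \frac{1}{2} \sigma^2$$
(eq. 15.2-3)
Therefore:
$$\Delta_1^2 = \frac{1}{4} \sigma^2$$
(eq. 15.2-4)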
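The midpoint displacement recursion can be sketched as follows. The offset variance used for a general exponent H and level n, sigma^2 (1 - 2^(2H-2)) / 2^(2Hn), is the form usually given in the literature on random fractals and is an assumption here; it reduces to sigma^2 / 4 at the first level when H = 1/2, as derived above. Function names are hypothetical.

```python
import math
import random


def offset_std(level, hurst, sigma):
    """Standard deviation of the random offset at a refinement level (assumed form)."""
    variance = sigma ** 2 * (1.0 - 2.0 ** (2.0 * hurst - 2.0)) / 2.0 ** (2.0 * hurst * level)
    return math.sqrt(variance)


def midpoint_displacement(levels, hurst=0.5, sigma=1.0, seed=0):
    """Simulate X on 2**levels + 1 regularly spaced points of [0, 1]."""
    rng = random.Random(seed)
    n = 2 ** levels
    x = [0.0] * (n + 1)
    x[n] = rng.gauss(0.0, sigma)  # X(1); X(0) stays at 0
    step = n
    for level in range(1, levels + 1):
        half = step // 2
        delta = offset_std(level, hurst, sigma)
        for mid in range(half, n, step):
            # Average of the two neighbours plus a random offset
            x[mid] = 0.5 * (x[mid - half] + x[mid + half]) + rng.gauss(0.0, delta)
        step = half
    return x


x = midpoint_displacement(levels=4)
```

Each level halves the spacing, so the resolution is chosen by stopping when the spacing matches the mesh of the target grid.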
The additional parameter r will change the appearance of the fractal: it is called the lacunarity.
Following the ideas for the midpoint displacement method, we set X(0) = 0 and X(1) as a sample of
a gaussian variable with mean 0 and variance $\sigma^2$. As before, we obtain:
$$\Delta_n^2 = \frac{1}{2} \left(1 - r^{2 - 2H}\right) (r^n)^{2H}\, \sigma^2$$
(eq. 15.3-1)
$$F(f) = \int_0^T X(t)\, e^{-2\pi i f t}\, dt$$
(eq. 15.4-1)
and the spectral density of X is:
$$S(f) = \lim_{T \to \infty} \frac{1}{T} |F(f)|^2$$
(eq. 15.4-2)
$$H = \frac{\beta - 1}{2}$$
(eq. 15.4-3)
For a practical algorithm we have to translate the above into conditions on the coefficients $a_k$ of the
discrete Fourier transform:
$$X(t) = \sum_{k=0}^{N-1} a_k\, e^{2\pi i k t}$$
(eq. 15.4-4)
The condition to be imposed on the coefficients in order to obtain S(f) proportional to $1/f^\beta$
now becomes:
$$E|a_k|^2 \text{ proportional to } \frac{1}{k^\beta}$$
(eq. 15.4-5)
This relation holds for 0 < k < N/2 and, for $k \ge N/2$, we must have $a_k = \overline{a_{N-k}}$ as X is a real function.
The method thus simply consists of randomly choosing coefficients subject to the condition on their
expectation and then computing the inverse Fourier transform to obtain X in the time domain.
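A minimal sketch of this spectral method is given below, using a direct (slow) inverse discrete Fourier transform so that it only needs the standard library. It assumes the relation beta = 2H + 1 between the spectral exponent and H, gaussian amplitudes and uniform random phases; these choices and the function names are illustrative.

```python
import cmath
import math
import random


def spectral_coefficients(n, hurst, seed=0):
    """Random Fourier coefficients with E|a_k|^2 proportional to 1 / k^beta."""
    rng = random.Random(seed)
    beta = 2.0 * hurst + 1.0
    a = [0j] * n
    for k in range(1, (n + 1) // 2):
        radius = rng.gauss(0.0, 1.0) * k ** (-beta / 2.0)
        phase = 2.0 * math.pi * rng.random()
        a[k] = cmath.rect(radius, phase)
        a[n - k] = a[k].conjugate()  # conjugate symmetry gives a real signal
    return a


def inverse_dft(a):
    """Values of sum_k a_k exp(2 i pi k t) at t = j / N, j = 0..N-1."""
    n = len(a)
    return [
        sum(a[k] * cmath.exp(2j * math.pi * k * j / n) for k in range(n))
        for j in range(n)
    ]


a = spectral_coefficients(16, hurst=0.8)
x = inverse_dft(a)
signal = [value.real for value in x]
```

For large N a fast Fourier transform would replace `inverse_dft`; the direct sum is kept here only for readability.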
In contrast to the previous algorithms, this method is not iterative and does not proceed in stages of
increasing spatial resolution. We may, however, interpret the addition of more and more Fourier
coefficients $a_k$ as a process of adding higher frequencies, thus increasing the resolution in the frequency domain.
16. Annealing Simulations
This page constitutes an add-on to the User's Guide for Interpolate / Conditional Simulations /
Annealing Simulations
Note - This method cannot be actually considered as simulations in the usual sense, as it is meant
to transform an input realization according to several criteria.
The Annealing procedure is similar to the Auto Regressive Deterministic one except that it does not
require any model and that the result is a discrete variable.
The data points are migrated to the nodes of the grid in order to improve the performance of the
method. If several data points are migrated to the same node, the closest one prevails: the remaining ones are simply ignored.
It requires the definition of several cutoff intervals which must realize a partition of R. If some
intervals overlap, the results are unpredictable. Each cutoff interval corresponds to a facies, numbered
starting from 1.
The procedure starts with an Initial Image which is internally converted into facies.
It requires a Training Image, which will be internally converted into facies, and which will be used
to derive the statistics concerning the proportions and the transitions: they will serve as references.
The principle is to modify iteratively a non conditioning pixel from its current facies value to
another facies value, in order to reduce the gap between experimental quantities and the reference.
These quantities are:
l
the transition probabilities between the different facies, for several steps defined by the three
increments expressed in terms of grid meshes.
For each grid node, the principle is to establish the energy of the current image:
$$\varepsilon = \omega_\gamma^2 E_\gamma + \omega_p^2 E_p$$
(eq. 16.0-1)
where $\omega^2$ designates the weight, E the normalized energy and the indices $\gamma$ and p the variograms
of the indicators and the proportions.
The weights are used to increase or to reduce the relative influence of each component in the calculation of the energy.
When calculating these transition probabilities, the procedure makes a distinction whether the
quantities are:
l
(eq. 16.0-2)
where:
l
the indices c and nc respectively refer to the constrained and unconstrained statistics.
E integrates the difference between the experimental transition and the ref-
E = k t ijs t ijs
s i j
(eq. 16.0-3)
where:
l
k is a normalization value which takes into account the number of steps and the number of transitions.
1
E p=---------------------- e x p ix p x i 2 + e y p iy pi y 2 + e z p iz pi z 2
d Nxyz N i
N i
N i
x
where:
(eq. 16.0-4)
e x indicates if the grid has an extension in the x direction (1 for true ; 0 otherwise),
p ix is the proportion of facies integrated over the pile of cells located at , whatever their or
indices,
(eq. 16.0-5)
The global process consists in iterating, following a random path, over each of the grid nodes
which are not constrained by data. If we denote by $f_0$ the current facies at a node, by $f_n$ the value of a facies drawn at
random and different from $f_0$, and by $e_0$ and $e_n$ the corresponding energies, the Metropolis algorithm gives the following substitution rule:
- if $e_n \le e_0$ we substitute $f_n$ for $f_0$,
- if $e_n > e_0$ we substitute $f_n$ for $f_0$ with probability p and keep $f_0$ with probability 1 - p.
p incorporates the difference of energy and the temperature of the system (monotonic function
decreasing with the duration of the process) through the following equation:
$$p = \exp\left(-\frac{e_n - e_0}{kT}\right)$$
(eq. 16.0-6)
where k is Boltzmann's constant and T the temperature.
Instead of the temperature function, we ask the user to specify a maximum number of iterations
(eq. 16.0-7)
This method has been proven to converge towards the minimum energy state if the cooling speed
(ruled by Boltzmann's constant) is small enough.
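The Metropolis substitution rule can be written in a few lines. In this sketch the function name `accept_change` and its arguments are hypothetical, and the constant k is left as a plain parameter defaulting to 1.

```python
import math
import random


def accept_change(e_old, e_new, temperature, k=1.0, rng=random):
    """Decide whether the candidate facies replaces the current one."""
    if e_new <= e_old:
        return True  # the energy does not increase: always accept
    p = math.exp(-(e_new - e_old) / (k * temperature))
    return rng.random() < p  # accept an energy increase with probability p


rng = random.Random(0)
always = accept_change(1.0, 0.5, temperature=1.0, rng=rng)
frozen = accept_change(0.0, 1.0, temperature=1e-6, rng=rng)
```

At high temperature most energy increases are accepted, which lets the image escape local minima; as the temperature decreases the rule becomes a pure descent.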
17.1 Introduction
In Oil & Gas applications, the spill point calculation enables the user to delineate a potential reservoir knowing that some control points are inside or outside the reservoir.
For the illustration of this feature, we will consider one map (which may be one of the outcomes of
a simulation process) where the variable is the topography of the top of a reservoir. We will consider the depth as counted positively downwards: the top of the structure corresponds to the lowest
value in the field.
Moreover, we assume that we have a collection of control points whose locations are known and
which belong to one of the following two categories:
l
Note - All the points located outside the frame where the image is defined are considered as
outside.
(fig. 17.2-1)
The Spill Point corresponds to the location of the saddle below volumes A and B. As a matter of
fact, if we consider a deeper spill, these two volumes will connect and the constraints induced by
the control points will not be fulfilled any more as the same location cannot be simultaneously
inside and outside the reservoir.
The volume A is considered as outside whereas B is inside. An interesting feature comes from the
volumes C1 and C2:
l
they are first connected (as elevation of the separation saddle is located above the spill point)
and therefore constitute a single volume C,
Hence, after the spill point elevation has been calculated, each point in the frame can only correspond to one of the following four statuses:
l
(fig. 17.3-1)
The new spill point is shifted upwards as otherwise the maximum reservoir thickness constraint
would be violated. Note that the Spill elevation is clearly known whereas the location of the Spill
Point is rather arbitrary this time. It is the last point that may be included in the reservoir: if the next
one (sorted by increasing depth) was included, the thickness of the reservoir would exceed the
maximum admissible value.
(fig. 17.5-1)
Let us now consider the opposite case where the outside control point is at the top of the structure
and the inside control point is on the flank. In principle, the situation should be symmetric with the
same result for the elevation of the Spill. However, the volume of the reservoir controlled by the inside control point and located above the spill is now
reduced to zero. That is the reason why such a map is considered as not acceptable.
(fig. 17.5-2)
(fig. 17.6-1)
The volume A is considered outside and B inside the reservoir. The volumes C and D are initially
unknown. If we convert them into inside, the maximum reservoir thickness constraint will not be
fulfilled any more for the volume C. We could imagine moving the spill upwards until the constraint is satisfied, but then we should also move the Spill Point to a new location. Instead, we have
decided that such a map should be considered as not acceptable.
Let:
- v be the generic selection block (SMU),
- Z(v) its true grade,
- and Z(v)* its ultimate estimate,
$1_{Z(v)^* \ge z}$
from the sample point anamorphosis $Z(x) = \Phi(Y(x))$ through the integral relation:
$$\Phi_r(y) = \int \Phi\left(r y + \sqrt{1 - r^2}\, u\right) g(u)\, du$$
(eq. 18.7-1)
(this expresses Cartier's relation $E[Z(x) \mid Z(v)] = Z(v)$ for a point random in a block)
m
summing to 1 and
the correlation
$T_V(z)^* = E\left[ 1_{Z(v)^* \ge z} \mid Z(V)^* \right]$
(eq. 18.7-1)
The idea is to impose the panel grade, estimated for instance by Ordinary Kriging, in order to avoid
the attraction to the mean that may be caused by some techniques in case of deviation from stationarity. The estimation of the metal at 0 cutoff must then satisfy the relation:
E[Z(v) | Z(V)*] = Z(V)*
This fundamental relation has several important consequences:
- Note that, having no conditional bias, Z(V)* cannot take negative values (as may be caused, in
kriging, by negative weights). Negative values would in any case not be supported by the
anamorphosis that follows.
- The Gaussian anamorphosis of Z(V)* is necessarily of the same form as that of Z(v).
Let:
$Z(x) = \Phi(Y(x))$
(eq. 18.7-3)
$Z(v) = \Phi_r(Y_v)$
(eq. 18.7-4)
$Z(V)^* = \Phi_S(Y_{V^*})$
(eq. 18.7-5)
where Y(x), Yv and YV* are standard Gaussian variables. The fundamental relation gives:
$S = r\, \rho_{vV^*} = r\, \operatorname{corr}(Y_v, Y_{V^*})$
(eq. 18.7-6)
Hence the anamorphosis of Z(V)* is inherited from that of Z(v). This holds whatever the estimate,
not only for linear combinations of Z sample values. In practice S is obtained from the panel estimation and the previous relation is used to compute:
$\operatorname{corr}(Y_v, Y_{V^*}) = \rho_{vV^*} = \frac{S}{r}$
(eq. 18.7-7)
from r and S.
(eq. 18.7-8)
assuming that Z(v) and Z(V)* can be considered independent, conditionally on Z(v)*,
that is:
$\rho_{vV^*} = \rho_{vv^*}\, \rho_{v^*V^*} = \frac{S}{r}$
(eq. 18.7-9)
$\rho_{v^*V^*} = \frac{\rho_{vV^*}}{\rho_{vv^*}} = \frac{S}{r\, \rho_{vv^*}}$
(eq. 18.7-10)
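The chain of correlations above reduces to two divisions. The following minimal Python sketch evaluates them; the function name `panel_correlations` and its arguments are hypothetical, and the inputs S, r and the correlation between Yv and Yv* are assumed to be already available from the panel estimation and the change of support model.

```python
def panel_correlations(S, r, rho_v_vstar):
    """Return (rho_vV*, rho_v*V*) from S, r and rho_vv*.

    Hypothetical helper: it only applies rho_vV* = S / r and
    rho_v*V* = S / (r * rho_vv*), and checks that the results are
    admissible correlations.
    """
    if not 0.0 < r <= 1.0:
        raise ValueError("the change of support coefficient r must lie in (0, 1]")
    if rho_v_vstar == 0.0:
        raise ValueError("rho_vv* must be non-zero")
    rho_v_Vstar = S / r
    rho_vstar_Vstar = rho_v_Vstar / rho_v_vstar
    for name, value in (("rho_vV*", rho_v_Vstar), ("rho_v*V*", rho_vstar_Vstar)):
        if abs(value) > 1.0:
            raise ValueError(f"{name} = {value} is not a valid correlation")
    return rho_v_Vstar, rho_vstar_Vstar
```

A result outside [-1, 1] indicates that the three inputs are not mutually consistent, which is why the sketch raises an error rather than clipping the value.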
$Z_2(v)\, 1_{Z_1(v)^* \ge z}$
(eq. 18.7-1)
It requires:
m
Assuming that Y1v* and Y2v are independent, conditional on Y1v, this will be deduced from:
$\rho_{1v^*\,2v} = \rho_{1v\,1v^*}\, \rho_{1v\,2v}$
(eq. 18.7-2)
are conditional on Z1 (V )* , independent from the auxiliary metal panel grade, so that
the UC estimates for the selection variable correspond to the univariate case. It results that:
(eq. 18.7-1)
Z 2 (v )
(this holds
denoting :
(eq. 18.7-4)
In practice S2 is obtained from the panel estimation Z2(V)*, and the previous relation will be used
to get:
$\rho_{2v\,2V^*} = \frac{S_2}{r_2}$
(eq. 18.7-5)
Since Z2(v) and Z1(V)* are considered independent, conditional on Z2(V)*, we have:
i.e.
$\rho_{2v\,1V^*} = \rho_{2v\,2V^*}\, \rho_{1V^*2V^*} = \frac{S_2}{r_2}\, \rho_{1V^*2V^*}$
(eq. 18.7-7)
By symmetry between the metals, we can assume Z1(v) and Z2(V)* independent, conditional on
Z1(V)*, that is:
$\rho_{1v\,2V^*} = \rho_{1v\,1V^*}\, \rho_{1V^*2V^*} = \frac{S_1}{r_1}\, \rho_{1V^*2V^*}$
(eq. 18.7-8)
We will finally assume that Z1(v)* and Z2(V)* are independent, conditional on Z1(V)*, that is:
(eq. 18.7-9)
the block anamorphosis: $Z(v) = \Phi_r(Y_v)$ with change of support coefficient r, deduced from
the sample point anamorphosis $Z(x) = \Phi(Y(x))$ through the integral relation:
$\Phi_r(y) = \int \Phi\left(r y + \sqrt{1 - r^2}\, u\right) g(u)\, du$
(eq. 18.8-1)
(this expresses Cartier's relation $E[Z(x) \mid Z(v)] = Z(v)$ for a point random in a block)
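The integral relation can be evaluated numerically for any point anamorphosis. The sketch below is one possible way to do it, assuming that g is the standard Gaussian density and using Gauss-Hermite quadrature from NumPy; the function name `block_anamorphosis` and the number of quadrature nodes are illustrative choices, not part of the original method description.

```python
import numpy as np

def block_anamorphosis(phi, r, y, n_nodes=40):
    """Evaluate Phi_r(y) = integral of phi(r*y + sqrt(1 - r^2)*u) g(u) du.

    phi : vectorized point anamorphosis (callable on NumPy arrays)
    r   : change of support coefficient, 0 <= r <= 1
    y   : scalar or array of Gaussian values
    """
    # Nodes and weights for the weight function exp(-u^2 / 2)
    u, w = np.polynomial.hermite_e.hermegauss(n_nodes)
    w = w / np.sqrt(2.0 * np.pi)  # rescale so the weights integrate the N(0, 1) density
    y = np.atleast_1d(np.asarray(y, dtype=float))
    arg = r * y[:, None] + np.sqrt(1.0 - r ** 2) * u[None, :]
    return (phi(arg) * w[None, :]).sum(axis=1)
```

For a lognormal point anamorphosis exp(s y - s^2/2), the block anamorphosis is known in closed form, exp(r s y - r^2 s^2/2), which gives a convenient check of the quadrature.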
The model is also characterized by the relationships between block covariances and point covariances, i.e.:
- Cov[Yv(x), Yv(x+h)] represents the covariance between any pair of blocks v(x) and
v(x+h).
- The point-block covariance and the point covariance can be shown to be:
- The point-point covariance is then:
Cov[Y(x), Y(x+h)] = r Cov[Y(x), Yv(x+h)] = r^2 Cov[Yv(x), Yv(x+h)], with x within vi
and x+h within vj,
- except for a point and itself (h=0) where Cov[Y(x), Y(x+h)] = Var[Y(x)] = 1:
$\operatorname{Var}[Z(v)] = \sum_{n=1}^{N} \psi_n^2\, r^{2n}$
(eq. 18.8-2)
where $\psi_n$ are the coefficients of the expansion of the anamorphosis function into N Hermite polynomials.
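The variance formula is a polynomial in r that increases on [0, 1], so the change of support coefficient matching a given block variance can be found by bisection. The sketch below assumes normalized Hermite polynomials and a coefficient array whose first entry is the mean; the names `block_variance` and `support_coefficient` are hypothetical.

```python
import numpy as np

def block_variance(coeffs, r):
    """Sum over n >= 1 of coeffs[n]^2 * r^(2n); coeffs[0] (the mean) is ignored."""
    coeffs = np.asarray(coeffs, dtype=float)
    n = np.arange(1, coeffs.size)
    return float(np.sum(coeffs[1:] ** 2 * r ** (2 * n)))

def support_coefficient(coeffs, target_variance, tol=1e-12):
    """Find r in [0, 1] such that block_variance(coeffs, r) equals target_variance."""
    if not 0.0 <= target_variance <= block_variance(coeffs, 1.0):
        raise ValueError("the block variance must lie between 0 and the point variance")
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if block_variance(coeffs, mid) < target_variance:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

In practice the target block variance would come from the regularized variogram model; here it is simply passed as a number.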
The block gaussian covariance Cov[Yvi , Yvj] is determined by inversion from cov(Z (v(x)), Z
(v(x))), which is the regularized covariance of Z(x).
the standard block gaussian variable corresponds to the (normalized) regularized point gaussian
(here univariate):
$Y_v = \frac{Y(v)}{r}$
(eq. 18.8-1)
$Y(v) = \frac{1}{|v|} \int_v Y(x)\, dx$
(eq. 18.8-2)
(eq. 18.8-3)
It is derived from the variogram of the gaussian variable, instead of the variogram of the raw variable in the classical method. In the same spirit, we can directly calculate the block gaussian
covariances and cross-covariances from the regularized covariances and cross-covariances of the
gaussian data:
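$\operatorname{cov}(Y_{1v}, Y_{1v_h}) = \frac{\operatorname{cov}(Y_1(v), Y_1(v_h))}{r_1^2}$
(eq. 18.8-4)
$\operatorname{cov}(Y_{1v}, Y_{2v_h}) = \frac{\operatorname{cov}(Y_1(v), Y_2(v_h))}{r_1\, r_2}$
(eq. 18.8-5)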
Non conditional simulation using Turning Bands of the block gaussian variogram.
Transform of the simulated block values into point values at data locations.
According to that model, Yx and Yv make a pair of bi-gaussian variables. If we consider two
variables "i" and "j", the Gaussian point and block values are obtained by means of a linear
regression. In the formula below the variables Gi and Gj are the normal residuals of the linear
regression.
(eq. 18.8-6)
(eq. 18.8-7)
Knowing the covariance model Cijv(h) of the block Gaussian values we can derive the covariance between point Gaussian values Cij(h).
(eq. 18.8-8)
These covariances are used to establish the cokriging system when conditioning the block simulated values by point Gaussian values considered as random in a block.
The cokriging matrix has to be transformed to account for the case where the two data points are
not only located in the same block but are the same point. The modifications are then:
- When the two variables (i and j) are the same we use Civ(0) (equal to 1 when the block
Gaussian variogram is normalized).
- When the two variables are different we use Cij, that is the covariance between point
gaussian values.
Using the relationship between Y(x) and Yv we can derive the covariance between the residuals
Gi and Gj.
(eq. 18.8-9)
(eq. 18.8-10)
This correlation matrix must be positive definite, i.e. with positive eigenvalues.
This property is checked and has to be satisfied before using the model for direct block
simulation.
- Optionally a sample randomly located in the block can be calculated from the simulated
block values. This comes directly from the discrete gaussian model, where the point and
the block values are linked by the relationships above (eq. 18.8-6) (eq. 18.8-7). The two normal variables Gi and Gj are drawn at random from a bi-gaussian distribution with the coefficient of correlation given in (eq. 18.8-10).
19. Localized Uniform Conditioning
Note - This technical reference is based on: Abzalov, M.Z. (2006) Localised Uniform Conditioning
(LUC): A New Approach to Direct modelling of Small Blocks. Mathematical Geology 38(4), p. 393-411.
Localized Uniform Conditioning (LUC) is designed for non-linear estimation in the mining industry and
should be used as a post-processing of Uniform Conditioning. It is a specific method which
makes it possible to estimate grades at the block scale.
The classic Uniform Conditioning method estimates only the panel proportion of recoverable mineralization, without identifying the location of the recoverable blocks. Besides, it is commonly accepted that
using Ordinary Kriging to estimate small blocks is inappropriate when the data are too sparse
compared to the block size. As a matter of fact, neither of these two methods (Uniform Conditioning and Ordinary Kriging) can efficiently determine the actual location of the economically
extractable blocks. The Localized Uniform Conditioning method aims at filling this gap. It
estimates the localized block grades by using the grade-tonnage curves given by Uniform Conditioning and by reproducing the spatial grade distribution obtained by Ordinary Kriging.
In other words, the concept of LUC is to use the grade ranking provided by Ordinary Kriging while
keeping the grade distribution given by Uniform Conditioning.
19.1 Algorithm
Uniform Conditioning estimates the grade-tonnage curves for each panel. The grade-tonnage
curves give the tonnage and grade of mineralization which can be recovered above a given cut-off
value. The Localized Uniform Conditioning algorithm then estimates the mean grades of the grade
classes in each panel. Next, the algorithm ranks the SMU blocks of
each panel in increasing order of their grade (estimated by Ordinary Kriging). Finally, the mean
grades (Mi) of the grade classes (Gci) which have been deduced from the UC model are assigned to
the SMU blocks whose rank matches the grade class. The grade class is the portion of the panel
whose grade lies between a given cut-off (Zci) and the following cut-off (Zci+1).
In other words:
(eq. 19.1-1)
With Gci = grade class, Ti (Zci) = the recoverable tonnage at cut-off (Zci) and Ti+1(Zci+1) = the
recoverable tonnage at cut-off (Zci+1). By defining the SMU ranks as proportions of the panel tonnage Tv, the SMU ranks can be converted into the grade classes:
(eq. 19.1-2)
With SMUK = the SMU of rank K, TK = the proportion of the panel tonnage distributed in SMU
blocks whose rank is equal to or lower than K, and TK+1 = the proportion of the panel distributed in
SMU blocks having a higher rank.
Then the UC model makes it possible to deduce the mean grades (Mi) of the grade classes (MGci) in the panels. Finally, by matching the class indexes MGci and TGci, the mean grade (Mi) of each class can be
transferred to the SMUk blocks.
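The ranking and assignment steps for a single panel can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the reference implementation: the function name `localize_uc` is hypothetical, all SMUs are assumed to carry the same tonnage, the first cut-off is assumed to recover the whole panel, the class mean grade is taken as the metal difference divided by the tonnage difference between consecutive cut-offs, and each SMU is assigned to the class containing the mid-point of its tonnage interval.

```python
import numpy as np

def localize_uc(tonnage, metal, ok_grades):
    """Assign UC class mean grades to the SMUs of one panel.

    tonnage   : proportions of the panel recovered above each cut-off (first one = 1)
    metal     : metal quantities recovered above the same cut-offs
    ok_grades : Ordinary Kriging grades of the SMUs, used only for ranking
    """
    T = np.append(np.asarray(tonnage, dtype=float), 0.0)
    Q = np.append(np.asarray(metal, dtype=float), 0.0)
    if not np.isclose(T[0], 1.0):
        raise ValueError("the first cut-off must recover the whole panel (T = 1)")
    class_tonnage = T[:-1] - T[1:]
    class_metal = Q[:-1] - Q[1:]
    # Mean grade of each class; empty classes get 0 and are never selected
    mean_grade = np.divide(class_metal, class_tonnage,
                           out=np.zeros_like(class_metal), where=class_tonnage > 0)
    ok_grades = np.asarray(ok_grades, dtype=float)
    n = ok_grades.size
    rank = np.empty(n, dtype=int)
    rank[np.argsort(ok_grades, kind="stable")] = np.arange(n)  # 0 = lowest grade
    # Proportion of the panel with a higher rank, taken at the SMU mid-point
    tonnage_above = (n - rank - 0.5) / n
    class_index = (T[:-1][None, :] >= tonnage_above[:, None]).sum(axis=1) - 1
    return mean_grade[np.clip(class_index, 0, None)]
```

With four equal classes and four SMUs, each SMU receives the mean grade of exactly one class, and the average of the localized grades reproduces the panel grade given by the UC model.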
(fig. 19.1-1)
Example of LUC on 16 SMU blocks per panel, and six cut-off values on the grade-tonnage curve
coming from the uniform conditioning.
20.1.1 VOCABULARY
The Seismic Grid Filling algorithm only makes sense on a regular grid, which is characterized by
its geometry and its total number of cells (N1).
The variable of interest is initially defined on a set of grid nodes that we will call the Initial Domain
of Definition (N2).
The grid may also contain an active selection (masking off N3 cells) which reduces the number of
cells that can be ultimately filled (N4).
An obvious formula states that:
Each cell of the grid (already filled or not) may also be assigned an attribute (say its permeability)
which acts as a weighting factor to speed up or slow down the propagation. This property is not
defined in the basic case of grid filling.
This weighting factor can also be influenced by a speed coefficient which describes the velocity
with which a fluid (known in a cell already filled) would tend to invade the adjacent cell. This feature can be used to introduce some anisotropy in the weighting factor or discriminating among several fluids. This feature, essential in the Fluid propagation algorithm, is not used for Grid Filling.
In this paper, we will refer to the neighborhood of a target cell. This refers to all the cells which are
adjacent to the target cell and share a face with it (an edge in 2-D), not merely a corner. Therefore, in 1-D, the
neighborhood is limited to the 2 adjacent cells, 4 in 2-D and 6 in 3-D.
20.1.2 INITIALIZATION
We start with the cells contained in the Initial Domain of Definition and establish the skin. The skin
is composed of all the cells immediately contiguous to a cell contained in the Initial Domain of
Definition. In other words, a cell belongs to the skin if one of its neighboring cells belongs to the
Initial Domain of Definition.
The next step consists in filling one cell of this skin. Not all the cells of the skin have the same
probability of being elected. For a given target cell belonging to the skin, this probability is computed as the sum of the weights induced only by the cells which are already filled and which belong to the neighborhood of the target cell. This weight is given by the (permeability) property carried by the cell (if
defined) or 1 if not defined. This sum of weights will serve as the energy of each cell of the skin. The non-weighted version of the energy leads to the Skin length.
The Initial Skin Energy is the sum of these energies for all the cells that constitute the skin. When
the initialization step is ended, a message is produced (in the verbose option) giving:
- The total number of cells (N1)
- The number of cells already filled (N2)
- The number of cells masked off (N3)
- The number of cells to be processed (N4)
- The initial energy of the skin
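The initialization step can be sketched with NumPy on a grid of any dimension. The function name `initialize_skin` and its return layout are hypothetical. Two points are assumptions of this sketch rather than statements of the text: the weight is taken from the filled neighbor (not from the target cell), and N4 is counted as the cells that are neither filled nor masked.

```python
import numpy as np

def initialize_skin(filled, masked=None, weight=None):
    """Return the skin mask, cell energies, skin lengths and the N1..N4 counts."""
    filled = np.asarray(filled, dtype=bool)
    masked = (np.zeros(filled.shape, dtype=bool) if masked is None
              else np.asarray(masked, dtype=bool))
    weight = (np.ones(filled.shape) if weight is None
              else np.asarray(weight, dtype=float))

    # Pad with zeros so that cells on the grid border simply have fewer neighbors
    padded_weight = np.pad(np.where(filled, weight, 0.0), 1)
    padded_count = np.pad(filled.astype(int), 1)

    energy = np.zeros(filled.shape)
    length = np.zeros(filled.shape, dtype=int)
    core = [slice(1, 1 + n) for n in filled.shape]
    for axis, n in enumerate(filled.shape):
        for start in (0, 2):  # neighbor at index - 1, then at index + 1
            window = list(core)
            window[axis] = slice(start, start + n)
            energy += padded_weight[tuple(window)]
            length += padded_count[tuple(window)]

    # A skin cell is unfilled, not masked, and touches at least one filled cell
    skin = ~filled & ~masked & (length > 0)
    energy = np.where(skin, energy, 0.0)
    length = np.where(skin, length, 0)
    counts = {
        "N1": filled.size,
        "N2": int(filled.sum()),
        "N3": int(masked.sum()),
        "N4": int((~filled & ~masked).sum()),
    }
    return skin, energy, length, counts
```

The initial skin energy is then `energy.sum()`, and dividing `energy` by that total gives the probability of each skin cell being elected at the next step.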
21. Meandering Channel Simulation (Flumy)
Flumy is a meandering channel simulation technique that uses both a stochastic and a process-based
approach to produce realistic models. Flumy uses hydraulic equations to generate a continuous channel
that evolves through time, and regional avulsions that add discontinuity to the system. While simulating
the channel path, Flumy also considers the depositional process of sedimentary bodies (point bar, crevasse splays, overbank...). The stochastic nature of Flumy makes it possible to generate multiple realizations
from one set of parameters.
Note - To get more information about the simulation algorithm, please refer to:
- Cojan, I., Fouché, O., Lopez, S., Rivoirard, J. (2005) Process-based reservoir modelling in the
example of meandering channel. In: O. Leuangthong and C.V. Deutsch (eds.), Geostatistics Banff
2004, Dordrecht: Springer, p. 611-619.
- Cojan, I., Beaudelot, C., Geffroy, F., Laratte, S., Rigollet, Ch. & Rivoirard, J. (2006) Process-based and stochastic modeling of fluvial meandering system. From model to field case
study: example of the Loranca Miocene succession (Spain).
Flumy simulates 10 facies that can optionally be regrouped into lithotype sets of 2, 3, 5
or 8 in order to simplify the output. The following table shows the optional lithotype sets:
(fig. 21.1-2)
The simulation can be conditioned by input data. By default the conditioning might not be
honored at 100%. The facies of the input data are only used to attract and repel the channels.
Nevertheless, a post-processing procedure can correct the simulation to reach 100% conditioning. Note that the Flumy conditioning method is based on 3 Flumy litho-facies:
- The meander bars Flumy litho-facies, regrouping the following facies: channel lag, point
bar, sand plug and crevasse splay I. This Flumy litho-facies attracts the closest meander to
the considered wells. When a regional avulsion occurs, its trajectory goes through
the sample.
- The levee deposits Flumy litho-facies, regrouping the following facies: crevasse splay II
channels, crevasse splay II and levee. This Flumy litho-facies attracts the closest meander
to the considered wells and repels it if the meander is too close. When a regional avulsion occurs, its trajectory passes close to this facies.
- The silts Flumy litho-facies, regrouping the following facies: overbank, mud plug, channel fill and wetland. This Flumy litho-facies repels the channel by slowing down the
migration. When a regional avulsion occurs, its trajectory cannot go through this
facies.
To avoid altering the genetic part of the process and to keep a realistic channel shape, the conditioning is not totally honored during the simulation.
Since there is a conversion table between a grain size value and a Flumy litho-facies, a post-processing procedure can be applied to the grain size variable, which is then back-transformed into a
Flumy litho-facies. A grain size error can be computed at the well locations and, using a residual kriging method, this error can be kriged and removed from the result. An exponential model
is used; the ranges and the neighborhood can be edited in the application.
22.2 Notations
Suppose that we observe z = (z(x1),...,z(xn)), the realization of a random function Z(·) at points
x1,..., xn. We assume that at any location:
(eq. 22.2-1)
. We denote:
the (p + 1)-vector of drift coefficients,
(eq. 22.2-2)
(eq. 22.2-3)
22.2.1 Methodology
The methodology to infer the model consists of the following steps:
(eq. 22.2-4)
l
(eq. 22.2-5)
(eq. 22.2-6)
l
(eq. 22.2-7)
where:
(eq. 22.2-8)
and V(h) is the set of pairs (i, j) such that xi - xj ≈ h, and N(h) stands for the number of such
pairs.
(eq. 22.3-1)
(eq. 22.3-2)
where bij , the bias associated to the pair (i; j), can be written:
(eq. 22.3-3)
(snap. 22.3-1)
To compute the total bias associated to the experimental variogram at lag h, we need to know the
value of the variogram at all the distances which is precisely the quantity that we are trying to compute. To circumvent this difficulty, we use the iterative procedure described in the next section.
Initialisation:
At step 0 do:
(a) Compute
(b) Set
l
At iteration n:
(a) Fit a model
on
.
by using equations (eq. 22.3-2) and (eq. 22.3-3)
(eq. 22.4-1)
Note that step (a) of iteration n can be performed with the algorithm of Desassis and Renard (2012).
23. Isatoil
One of the main problems during the exploration and the development phases of a reservoir is to
construct a complex multi-layer geological faulted model recognized by seismic campaigns and
several wells (often deviated and sometimes even horizontal). This is the general framework within
which the methodology of Isatoil has been established.
The main sources of uncertainty come from the quality as well as the quantity of the information,
but also the reservoir's geological structure, the variability of the petrophysical properties and the
location of the gas-oil and oil-water contacts.
Therefore the Isatoil methodology has been developed in order to:
- calculate a base case geological model by using estimation techniques (Kriging) which produce
a set of smooth surfaces that honor all the available data. The graphical display of the general
shape of the reservoirs is used to check the suitability of the model from a geological point of
view,
- apply the same concepts by means of simulation techniques so as to obtain reliable distribution
curves of reservoir volumes, over different exploitation segments which constitute a partition of
the field. The series of volume estimates obtained through simulations for a given layer and a
given segment can be represented as risk curves.
The aim of this geostatistical technique, which works with deviated wells, is to provide accurate
estimates, whatever the number of surfaces involved in the geological model.
- the layer cake hypothesis implies working with a geological sequence of strata which starts from
a given surface named the Top Layering reference surface.
- the same vertical sequence of strata is defined over homogeneous areas of the field. Nevertheless, a missing stratum corresponding to a pinch-out can be handled.
- the sequence is divided vertically into Layers. The Layers are stacked successively with no gap, so that the bottom of one Layer always matches the top of the next Layer. Generically, the
top and bottom of a Layer will be called a surface. Some surfaces can be picked using the seismic information (seismic surface), others cannot. All the Layers, when intersected, must be
"visible" on the well information.
The procedure makes it possible to take into account the following information in order to produce a consistent geological model:
- Seismic Time maps. They are provided by a series of picks coming from 3D seismic sections
which cover the whole field. These picks are interpolated in order to produce 2D surfaces which
cover the entire field. When the time map is absent, the corresponding Layer will be missing.
The seismic sections also serve in picking the fault events which create a disruption in the time
map. Within the faulted area, the quality of the time map is questionable and therefore the procedure may alter its values. These faults can be represented on 2D maps as fault polygons
which correspond to the projection on the horizontal plane of the areas where the faults have
perturbed the seismic time surface. The procedure is restricted to normal faults.
- Well data. A well consists of a continuous well path through the 3D geological model. The
locations where the well path intersects each surface of the sequence (whether they are reflected
in the seismic time map or not) are recorded; we call them the intercepts. The wells can be vertical or deviated with no limitation (they can even be horizontal).
At some locations along the well path, the values of (static) petrophysical parameters are measured
(porosity and net to gross ratio). The saturation obviously cannot be considered as
vertically homogeneous within a layer since it strongly depends on the vertical distance to the oil-water contact. The saturation is therefore calculated using a formula tabulated within the procedure. The petrophysical measurements are usually located within a layer and are therefore different
from the intercepts.
- Some Control Surfaces given in depth and which cannot be deduced from the layer cake environment, such as unconformities, erosion surfaces or major fault boundaries.
For volume calculation, it is also possible to provide a set of 2D areas which subdivide the field
into compartments where the volumes must be calculated. The procedure allows the reservoir to
contain up to three different phases (gas, oil and water). If present, the order relationship reflects
their density: gas is always located above oil and oil above water. Their presence depends upon the
definition of contact surfaces which delineate the transition between two consecutive phases.
These contacts can be defined for each layer and within each area, either as a constant value or a
map (possibly adding a randomization factor).
23.2 Workflow
The Isatoil methodology has been developed in order to reflect the nature of the data and the geological structure of the field. This is reflected in the workflow which is used:
1. select a Top Layering reference surface from which the whole sequence will be derived. This
surface usually corresponds to a good seismic marker
2. perform a Depth Conversion of the set of markers for which seismic time maps are available
3. build the (normal) fault surfaces starting from the fault polygons
4. subdivide a unit (between two consecutive Layers) into zones within which the petrophysical
parameters can be considered as vertically homogeneous, as well as non correlated between
zones
5. populate each zone with petrophysical variables (porosity, net to gross ratio and saturation) in order to derive the in-situ volumes above contact surfaces.
All the results are considered as 2D surfaces (or maps) which must match any information collected
along the deviated wells. For petrophysics, this coarse assumption only holds if the variables can be
considered as vertically homogeneous within each layer.
The same workflow is applied for estimation as well as for simulations. In the latter case, each
sequence produces several outcomes. Caution is required when combining the outcomes of different sequences in order to keep the consistency of the global geological model: let us recall, for
example, that, by construction, the sum of the thicknesses of the layers within a unit must match the
total thickness of this unit.
well
and
, while $D_i(x)$ is the true vertical depth down to the layer i at the point x counted from
the Top Layering reference surface: it also corresponds to the sum of the thicknesses of all the layers above i, measured at point x:
$D_i(x) = \sum_{j=0}^{i} T_j(x)$
(eq. 23.3-1)
These definitions are illustrated in the next figure showing two representative well geometries: the
well is vertical while the well
is strongly deviated:
(fig. 23.3-1)
map.
When computing the thickness of the second layer, the vertical wells provide a valid thickness
information $T_2(x)$. Conversely, the deviated well does not provide any valuable direct information on this thickness; it can only be related to the depth through:
$T_2(x) = D_2(x) - T_1^*(x)$
(eq. 23.3-2)
where $T_1^*(x)$ designates the estimation of the thickness of the first layer at the point x. We
recall that this value only represents an estimate (with an attached uncertainty) and is therefore not
an exact quantity.
$D(x) = \sum_{l=0} p_l(x)\, T_l(x)$
(eq. 23.3-3)
$D(x)$ or not: in other words, if the information at the point x is an intercept with a layer deeper
than l or not. For example, for an intercept with the first surface, the p-vector is
$p = (1\ 0\ 0\ 0)$ and $p = (1\ 1\ 0\ 0)$ for an intercept with the second layer.
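The cokriging system follows, established for the target layer i and the target node $x_0$:
$T_i^*(x_0) = \sum_{\alpha=1}^{N} \lambda_\alpha^i\, D(x_\alpha)$, with $D(x_\alpha) = \sum_{l=0} p_l(x_\alpha)\, T_l(x_\alpha)$
$E[T_i^*(x_0) - T_i(x_0)] = 0$
$\operatorname{Var}[T_i^*(x_0) - T_i(x_0)]$ minimum
(eq. 23.3-4)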
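When expanding the previous set of equations, we introduce the following covariance terms:
$\sigma_{ij}(h) = \operatorname{Cov}[T_i(x+h), T_j(x)]$
$\bar{C}_{ij}(h) = \operatorname{Cov}[T_i(x+h), D_j(x)] = \sum_{l \le j} p_l(x)\, \sigma_{il}(h)$
$\bar{\bar{C}}_{ij}(h) = \operatorname{Cov}[D_i(x+h), D_j(x)] = \sum_{l \le i} \sum_{m \le j} p_l(x+h)\, p_m(x)\, \sigma_{lm}(h)$
(eq. 23.3-5)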
The cokriging system (eq. 23.3-4) can then be expanded as follows, according to the strict stationarity hypothesis:
$\sum_\beta \lambda_\beta^i\, \bar{\bar{C}}_{\alpha\beta} = \bar{C}_\alpha^i$
$\sigma_i^2 = \sigma_{ii}(0) - \sum_\alpha \lambda_\alpha^i\, \bar{C}_\alpha^i$
(eq. 23.3-6)
where $\bar{C}_\alpha^i = \operatorname{Cov}[D_\alpha, T_i]$ depends on the rank of the layer intercepted at the data point
$x_\alpha$ and the target layer i at point $x_0$ and the distance between them $h_{\alpha 0}$. For simplicity, the distance arguments are omitted.
Note that the system (eq. 23.3-6) also provides the variance of the estimation and can be established
for the estimation of each target layer i, hence the additional index attached to the cokriging weights
$\lambda_\alpha^i$.
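As an illustration, the system can be assembled and solved with NumPy once a covariance model is chosen. The sketch below is not the Isatoil implementation: the function name `cokrige_layer_thickness` is hypothetical, the layer thickness covariances are assumed to follow an intrinsic correlation model (a sill matrix `B` multiplied by a single exponential correlation with parameter `scale`), and the layer means are assumed known so that the data can be centered.

```python
import numpy as np

def cokrige_layer_thickness(coords, P, depths, x0, i, B, scale, means):
    """Estimate the thickness of layer i at x0 from cumulative depth data.

    coords : (n, d) data locations        P      : (n, L) p-vectors of the data
    depths : (n,) cumulative depths D     x0     : (d,) target location
    B      : (L, L) sill matrix           means  : (L,) layer thickness means
    """
    coords = np.asarray(coords, dtype=float)
    P = np.asarray(P, dtype=float)
    depths = np.asarray(depths, dtype=float)
    B = np.asarray(B, dtype=float)
    means = np.asarray(means, dtype=float)
    x0 = np.asarray(x0, dtype=float)

    h_data = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    h_target = np.linalg.norm(coords - x0, axis=-1)

    # Cov[D(x_a), D(x_b)] = sum_l sum_m p_l(x_a) p_m(x_b) sigma_lm(h)
    C_dd = np.exp(-h_data / scale) * (P @ B @ P.T)
    # Cov[D(x_a), T_i(x0)] = sum_l p_l(x_a) sigma_li(h)
    C_dt = np.exp(-h_target / scale) * (P @ B[:, i])

    weights = np.linalg.solve(C_dd, C_dt)
    estimate = means[i] + weights @ (depths - P @ means)
    variance = B[i, i] - weights @ C_dt
    return estimate, variance, weights
```

With a single vertical intercept of the first layer located at the target node, the estimate of that layer reproduces the datum with zero variance, while the estimate of the next layer only benefits from the cross-covariance between the two thicknesses.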
$v_i = \frac{T_i}{t_i}$
(eq. 23.3-7)
Once more, the actual information provided by a deviated well consists in the true vertical depth
$D_i$. We can therefore introduce the apparent velocity $V_i$ which corresponds to the velocity
averaged over the layers intercepted down to the layer i, i.e. the cumulative thickness divided by the
cumulative time thickness $\tau_i$:
$V_i = \frac{D_i}{\tau_i} = \frac{\sum_{l \le i} T_l}{\sum_{l \le i} t_l} = \frac{\sum_{l \le i} t_l\, v_l}{\tau_i} = \sum_{l \le i} \frac{t_l}{\tau_i}\, v_l$
(eq. 23.3-8)
For better legibility, we can use the same formalism as before, introducing the p-vector where each
element represents the proportion of the time thickness spent in an intercepted layer, and zero for a
layer not intercepted. Note that, this time, the elements of a p-vector always add up to 1.
$p = \left( \frac{t_1}{\tau_2}\ \ \frac{t_2}{\tau_2}\ \ 0 \right)$
(eq. 23.3-9)
$V_i(x) = \sum_{l=0} p_l(x)\, v_l(x)$
(eq. 23.3-10)
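A short numerical sketch of the p-vector and of the apparent velocity follows. The function name `apparent_velocity` and the sample values are illustrative only; layers are assumed to be listed from the top, with the first `n_intercepted` of them crossed by the well.

```python
import numpy as np

def apparent_velocity(time_thickness, interval_velocity, n_intercepted):
    """Return the p-vector and the apparent velocity down to the last intercepted layer."""
    t = np.asarray(time_thickness, dtype=float)
    v = np.asarray(interval_velocity, dtype=float)
    p = np.zeros_like(t)
    # Proportion of the cumulative time thickness spent in each intercepted layer
    p[:n_intercepted] = t[:n_intercepted] / t[:n_intercepted].sum()
    return p, float(p @ v)
```

For time thicknesses 0.1, 0.3 and 0.2 with interval velocities 2000, 3000 and 2500, an intercept with the second layer gives p = (0.25, 0.75, 0) and an apparent velocity equal to the cumulative thickness (1100) divided by the cumulative time thickness (0.4).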
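Hence the cokriging system, expressed using velocities, and written for the target layer i and the target node $x_0$:
$v_i^*(x_0) = \sum_\alpha \lambda_\alpha^i\, V(x_\alpha)$, with $V(x_\alpha) = \sum_{l=0} p_l(x_\alpha)\, v_l(x_\alpha)$
$E[v_i^*(x_0) - v_i(x_0)] = 0$
$\operatorname{Var}[v_i^*(x_0) - v_i(x_0)]$ minimum
(eq. 23.3-11)
$\sum_\beta \lambda_\beta^i\, \bar{\bar{C}}_{\alpha\beta} = \bar{C}_\alpha^i$
$\sigma_i^2 = \sigma_{ii}(0) - \sum_\alpha \lambda_\alpha^i\, \bar{C}_\alpha^i$
(eq. 23.3-12)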
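$\sigma_{ij}(h) = \operatorname{Cov}[v_i(x+h), v_j(x)]$
$\bar{C}_{ij}(h) = \operatorname{Cov}[v_i(x+h), V_j(x)] = \sum_{l \le j} p_l(x)\, \sigma_{il}(h)$
$\bar{\bar{C}}_{ij}(h) = \operatorname{Cov}[V_i(x+h), V_j(x)] = \sum_{l \le i} \sum_{m \le j} p_l(x+h)\, p_m(x)\, \sigma_{lm}(h)$
(eq. 23.3-13)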
Another interest for working in velocities (rather than in depth) is when the model for the interval
velocities must reflect some physical behavior (rock compaction rule for example).
$E[v_i] = a_i + b_i\, T_i$
(eq. 23.3-14)
which relates the mathematical expectation of each interval velocity field to the seismic time
through a linear equation. The coefficients a i and b i are assumed to be constant over the field. Note
that this formalism enables the use of a more complex function, and even the possibility of involving more terms.
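Introducing the assumptions (eq. 23.3-14) in the cokriging formalism (eq. 23.3-11) leads to the new
formulation of the cokriging system, expressed for the target layer i and the target node $x_0$:
$\sum_\beta \lambda_\beta^i\, \bar{\bar{C}}_{\alpha\beta} + \sum_l \mu_l\, p_l(x_\alpha) + \sum_l \nu_l\, p_l(x_\alpha)\, T_l(x_\alpha) = \bar{C}_\alpha^i$
$\sum_\alpha \lambda_\alpha^i\, p_l(x_\alpha) = \delta_{li}$
$\sum_\alpha \lambda_\alpha^i\, p_l(x_\alpha)\, T_l(x_\alpha) = \delta_{li}\, T_i(x_0)$
$\sigma_i^2 = \sigma_{ii}(0) - \sum_\alpha \lambda_\alpha^i\, \bar{C}_\alpha^i - \mu_i - \nu_i\, T_i(x_0)$
(eq. 23.3-15)
where $\delta_{li}$ stands for the Kronecker symbol which is equal to 1 when i = l and to 0 otherwise. We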
unknowns. As the second and third type of equations must be repeated for each layer index l, the
dimension of this system is now equal to the count of intercepts incremented by twice the count of
layers.
(fig. 23.3-2)
Isatoil interpolates the fault surface from its traces on the depth maps of two consecutive seismic
markers. These top and bottom surfaces are then extrapolated throughout the fault in a pre-faulted
scenario. Any intermediate surface is then estimated throughout the fault in a pre-faulted scenario
and the fault is finally applied afterwards to reproduce the observed throws.
The Zonation process is similar to the one explained in the previous paragraph, using the cokriging
formalism. This time, it must be performed using thickness instead of velocity (as
there is no time map information for each layer). Note that the external drift feature is still applicable as long as the geologist has a sound intuition of a set of variables which can serve as external
drifts: there must be as many variables as there are layers within a unit.
The important difference of this second step comes from the additional constraint that must be considered: the cumulative thickness of the layers within a unit must be equal to the thickness of the
unit. This is achieved by considering a collocation option added to the previous cokriging formalism.
(fig. 23.3-3)
In principle, when the unit is subdivided into N layers, it suffices to calculate the thicknesses of the
N-1 layers: the thickness of the last one is obtained by comparison to the thickness of the total unit.
Nevertheless, in the case of the collocated option, we consider the problem of estimating the N
thicknesses (as if the thickness of the total unit were unknown). This is the reason why we consider
all the intercepts of the wells with the layers, as well as those with the bottom of the unit. The top of
the unit serves as the reference from where the true vertical depth is counted.
Moreover, at the target node location, we simply tell the system that one additional piece of information
must be considered: the thickness of the total unit. When working with true vertical thicknesses, the
task is even simpler as it means that the cumulative depth down to the layer "N" at the target
node is known.
As the estimation (using the cokriging technique) provides an exact interpolation (all the information is honored), the intercept with the bottom of the unit, collocated with the target node, ensures
that the sum of the estimated thicknesses matches the thickness of the total unit.
- the information provided refers to linear transforms of the variables of interest (cumulative thickness or apparent velocity),
- the variables are not all sampled at the same points and therefore it is not possible to use a linear inversion in order to transform the model back to the variables of interest.
For simplicity, we focus on the problem of modelling the thicknesses of several layers, starting
from information on cumulative thickness. Let us denote by
layers which are only measured in a set of points $x_1$
$Z(x) = \sum_i p_i(x)\, Y_i(x)$
(eq. 23.3-16)
The weights $p_i(x)$ are known at each sample point and for all layers.
The problem is to fit a linear model of coregionalization on the $Y_i$ variables:
$K_{ij}^Y(x+h, x) = \sum_u a_u^{ij}\, K_u(h)$
(eq. 23.3-17)
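$a_u^{ij} = \sum_p \lambda_{pu}\, x_{pu}^i\, x_{pu}^j = \sum_p r_{pu}^i\, r_{pu}^j$
(eq. 23.3-18)
with $r_{pu}^i = \sqrt{\lambda_{pu}}\, x_{pu}^i$, which makes sense as $\lambda_{pu} \ge 0$.
$K^Z(x, y) = \sum_{i,j} p_i(x)\, p_j(y) \sum_{p,u} r_{pu}^i\, r_{pu}^j\, K_u(x - y)$
(eq. 23.3-19)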
When the variables have been carefully centered beforehand, in the scope of a stationary model, the
previous term corresponds to the covariance
will be obtained by minimizing the quantity:
$J(A) = \sum_{x,y} \left[ Z(x)\, Z(y) - \sum_{i,j} p_i(x)\, p_j(y) \sum_u a_u^{ij}\, K_u(x - y) \right]^2$
(eq. 23.3-20)
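The criterion $J(A)$ is quadratic with respect to the coefficients $a_u^{ij}$. If we use k for the set of indices (i, j, u), with:
$\varphi_k = p_i(x)\, p_j(y)\, K_u(x - y)$
(eq. 23.3-21)
$\psi = Z(x)\, Z(y)$
then we can write:
$J(a) = \left\langle \psi - \sum_k a_k \varphi_k,\ \psi - \sum_k a_k \varphi_k \right\rangle$
where $\langle f, g \rangle = \sum_{x,y} f\, g$
(eq. 23.3-22)
The whole set of matrices is obtained simultaneously by solving the linear system:
$\sum_l \langle \varphi_k, \varphi_l \rangle\, a_l = \langle \varphi_k, \psi \rangle$
(eq. 23.3-23)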
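The least-squares fit can be sketched with NumPy as follows. This is an illustration under stated assumptions: the function name `fit_coregionalization` is hypothetical, the data are assumed centered, the basic structures are passed as callables of the distance, and the normal equations are solved through `numpy.linalg.lstsq` applied to the design matrix, which yields the same solution in a numerically safer way. The sketch performs only the unconstrained fit; it does not enforce the positive definiteness of the sill matrices.

```python
import numpy as np
from itertools import product

def fit_coregionalization(points, P, Z, structures):
    """Unconstrained least-squares fit of the sill coefficients.

    points     : (n, d) sample locations
    P          : (n, nvar) known weights p_i(x) at the samples
    Z          : (n,) centered data
    structures : list of callables K_u(h) acting on a distance array
    Returns A with A[u, i, j] = fitted coefficient, plus the design matrix and psi.
    """
    points = np.asarray(points, dtype=float)
    P = np.asarray(P, dtype=float)
    Z = np.asarray(Z, dtype=float)
    nvar, nstruct = P.shape[1], len(structures)
    h = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)

    psi = np.outer(Z, Z).ravel()  # one value per ordered pair (x, y)
    index = list(product(range(nvar), range(nvar), range(nstruct)))
    # One column per index k = (i, j, u): p_i(x) * p_j(y) * K_u(x - y)
    phi = np.column_stack([
        (np.outer(P[:, i], P[:, j]) * structures[u](h)).ravel()
        for i, j, u in index
    ])
    a, *_ = np.linalg.lstsq(phi, psi, rcond=None)
    A = a.reshape(nvar, nvar, nstruct).transpose(2, 0, 1)
    return A, phi, psi
```

Because the fit is unconstrained, the resulting matrices may fail to be positive definite, which is the situation the iterative procedure is meant to handle.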
Therefore we use an iterative procedure where each matrix is optimized in turn while keeping the
positive definite condition fulfilled for all the other matrices. The procedure consists of:
l
(eq. 23.4-1)
where:
l
l