International Journal of Mining and Geological Engineering, 1987, 5, 121-130

SHORT COMMUNICATION

Modelling of cyclical stratigraphy using Markov chains

Summary

The state of the art in modelling cyclical stratigraphy with first-order Markov chains is reviewed,
and the shortcomings of the presently available procedures are identified. A procedure that
eliminates all of the identified shortcomings is presented. The statistical tests required to
perform this modelling are given in detail, and an example illustrates the procedure.

Introduction

Proper characterization of subsurface stratification is important in many disciplines, such as
geotechnical engineering, petroleum engineering, mining engineering, mineral sciences,
hydrology and water resources. Some stratigraphic sections show evidence of cyclical or
recurrent sedimentation. Markov chains have been applied to model such stratigraphic
sequences (Krumbein and Dacey, 1969; Ethier, 1975; Ali et al., 1980). Two parameters, state
and time, are needed to describe a Markov chain. In the application of Markov chains to
analyse stratigraphy, the state parameter is used to identify different lithology types, and the
time parameter is changed to a location parameter in space and is used to record the transitions
among different lithologies in space. For a stratigraphic sequence to possess the Markov
property, the lithology type observed at a location in space should depend probabilistically on
the states the sequence occupied at previous locations. At this point it is important to identify
the two extreme models which lie on either side of Markov models: (1) if the state of the system
at any point in space can be predicted with 100% certainty, it is a deterministic model, and (2) if
the state at any point in space is independent of previous states, it is known as a Poisson process.
If the transition of a Markov chain depends only on the immediately preceding state, the chain
is a first order Markov chain. If the transition depends on more than one previous state, then it is
a higher order model. If the transition model of a Markov chain does not vary with the spatial
parameter, it is called a homogeneous or stationary chain.

Keywords: Cyclical stratigraphy; Markov chains; geomathematics; computer modelling.
0269-0316/87 $03.00+.12 © 1987 Chapman and Hall Ltd.

Homogeneous first order Markov chains have been used in modelling stratigraphy in the
vertical direction. The papers by Krumbein and Dacey (1969) and Yu (1984) form the state of
the art of modelling stratigraphy in one dimension. Two types of Markov chains have been
employed in these studies. The first approach considers the stratification at discrete points that
are spaced equally along a vertical profile. The points are numbered consecutively, and the use
of the Markov chain is based on the assumption that the lithology or state at point n depends
upon the lithology at the preceding point (n - 1). Because the same lithology may be observed
at successive points, the transition matrix that gives the probability of going from one lithology
to another generally has non-zero elements on the main diagonal. This type of Markov chain is
known as a conventional or ordinary Markov chain. If stratigraphy follows a first order
conventional Markov chain, then the thicknesses of lithologies should follow geometric
distributions (Krumbein and Dacey, 1969). This important property can be used in testing
whether a stratigraphy follows a first order conventional Markov chain.
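As a brief worked illustration of why the thicknesses are geometric (a sketch only; the symbol $K_i$, for the number of consecutive sampling intervals occupied by a bed of lithology i, and $p_{ii}$, for the probability of remaining in state i over one interval, are notation assumed here): a bed spans k intervals when the chain stays in state i for k - 1 steps and then leaves.

```latex
\[
P(K_i = k) = p_{ii}^{\,k-1}\,(1 - p_{ii}), \qquad k = 1, 2, \ldots,
\qquad E[K_i] = \frac{1}{1 - p_{ii}}
\]
```

Multiplying $K_i$ by the sampling interval gives the thickness, so a diagonal probability of 0.8 would correspond to a mean thickness of five intervals.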
The second approach considers only the succession of lithologies, and because each
transition is to a different lithology within the system, the diagonal elements are all zero. This
Markov chain is known as an embedded Markov chain. In this case, the distributions for
lithologic thicknesses need not follow geometric distributions. Thus, some stratigraphic
sequences may be modelled by using an embedded Markov chain to describe the transitions
between different lithologies, and using different probability distributions to describe
thicknesses of different lithologies. Such a process is known as a semi-Markov process.
Investigations performed so far on modelling stratigraphy in 1-D contain one or several of the
following shortcomings: (a) have assumed homogeneity without performing suitable statistical
tests to check the applicability of homogeneity, (b) have used conventional Markov chains
without satisfying the requirement that the thicknesses of lithologies follow geometric
distributions, (c) have not considered semi-Markov chains when they are more suitable than
conventional Markov chains, or (d) have not used proper statistical tests to check the Markov
property of embedded Markov chains. The stratigraphy modelling procedure suggested in this
paper eliminates all these shortcomings. The sections which follow describe the modelling
procedure. An example in the 'Application Section' illustrates the use of the modelling
procedure.

Overview of the suggested procedure

The suggested procedure is applicable only to stratigraphy data which show cyclical
sedimentation. The first step in the procedure is to check the homogeneity of the stratigraphic
column. The test suggested by Anderson and Goodman (1957) for stationarity can be used in
checking the homogeneity of conventional Markov chains. However, this test is not
appropriate to check the homogeneity of embedded Markov chains. At present, no suitable test
is available to check the stationarity of embedded Markov chains. Until such a test becomes
available, the above mentioned test may be used as the test for homogeneity. If the test result
indicates non-homogeneity, then further investigations should be carried out to separate the
entire stratigraphy into regions where the homogeneity property is applicable. Then each
homogeneous stratigraphic section should be analysed for the other properties explained
below.
Stratigraphic data generally fall into one of the following groups: (a) the observed data have a
first order conventional Markov dependency in the succession of lithologies, and geometric
distributions for all lithologic thicknesses, (b) the observed data have a first order embedded
Markov dependency in the succession of lithologies, but they do not have geometric
distributions for all lithologic thicknesses, (c) the observed data have neither first order
conventional Markov dependency nor embedded Markov dependency in the succession of
lithologies, but they do have geometric distributions for all lithologic thicknesses, or (d) the
observed data neither satisfy Markov dependency in the succession of lithologies, nor do they
have geometric distributions for lithologic thicknesses. If combination (a) holds, then first order
conventional Markov chains should be used for modelling. If combination (b) holds, then the
appropriate model is the semi-Markov chain. For combination (c), an independent event model
such as the multinomial model is suitable. Combination (d) cannot be modelled appropriately
by either the conventional or the semi-Markov model.
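The four groups above amount to a small decision table. The following minimal sketch encodes it; the function name `choose_model` and its boolean arguments are hypothetical and simply stand for the outcomes of the statistical tests described in the later sections.

```python
def choose_model(conventional_markov: bool,
                 embedded_markov: bool,
                 geometric_thickness: bool) -> str:
    """Map test outcomes to the model suggested for groups (a) to (d)."""
    if conventional_markov and geometric_thickness:
        return "first-order conventional Markov chain"   # group (a)
    if embedded_markov and not geometric_thickness:
        return "semi-Markov chain"                        # group (b)
    if not conventional_markov and not embedded_markov:
        if geometric_thickness:
            return "independent-event (multinomial) model"  # group (c)
        return "neither conventional nor semi-Markov model"  # group (d)
    # Combinations the text does not list explicitly.
    return "outside groups (a) to (d)"


print(choose_model(conventional_markov=True,
                   embedded_markov=False,
                   geometric_thickness=True))
```

Combinations not covered by groups (a) to (d) are reported as such rather than forced into one of the four models.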
Tests given by Anderson and Goodman (1957) and Billingsley (1961) for the Markov
property can be used to test the independence against dependence of states for the conventional
Markov transition matrix. If the results of these two tests lead to the rejection of the null
hypothesis, then it only implies the presence of some dependence between state transitions. If
one wants to prove that the transition matrix follows a first order conventional Markov chain,
then it is necessary to show that the distribution of thickness of each lithology in the stratigraphy
sequence follows a geometric distribution. This can be checked by performing goodness-of-fit
tests (Ang and Tang, 1975) on the thickness data. To check the presence of an embedded
Markov chain, tests given by Yu (1984) seem the best. These tests originate from Goodman
(1968) in the context of incomplete contingency tables.
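One way the goodness-of-fit check on thicknesses might be carried out is sketched below. It is an illustration only, not the procedure of Ang and Tang (1975): thicknesses are expressed as whole numbers of sampling intervals, the geometric parameter is estimated by maximum likelihood, and the function name, the pooling of long beds into a last class, and the sample data are all assumptions.

```python
def geometric_chi_square(run_lengths, max_k):
    """Pearson chi-square of bed lengths (in intervals) against a fitted
    geometric distribution. Lengths >= max_k are pooled in the last class."""
    n = len(run_lengths)
    q = n / sum(run_lengths)  # MLE of the probability of leaving the state
    stat = 0.0
    for k in range(1, max_k + 1):
        if k < max_k:
            observed = sum(1 for r in run_lengths if r == k)
            prob = (1.0 - q) ** (k - 1) * q
        else:
            observed = sum(1 for r in run_lengths if r >= k)
            prob = (1.0 - q) ** (k - 1)  # tail probability P(K >= max_k)
        expected = n * prob
        stat += (observed - expected) ** 2 / expected
    dof = max_k - 2  # classes, minus one, minus one estimated parameter
    return stat, dof, q


# Hypothetical bed lengths for one lithology, in numbers of intervals.
lengths = [1, 1, 1, 1, 2, 2, 3, 4]
stat, dof, q = geometric_chi_square(lengths, max_k=4)
print(round(stat, 3), dof, round(q, 3))
```

The statistic would then be compared with a chi-square table at the returned degrees of freedom. If every bed is exactly one interval long, the fitted tail probability is zero and the sketch fails with a division by zero.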

Stratigraphy modelling using first order conventional Markov chains

Basic concepts
In the case of a stratigraphic column, observations of the state are usually made starting at the
bottom at discrete intervals of vertical distance. Each interval represents one step in the index
space of the conventional Markov chain. A first order Markov chain is one where the transition
from state i to state j depends only on the previous state of the chain. If the number of observed
transitions from state i to state j is $f_{ij}$, then the tally matrix, F, is given by
\[ F = [f_{ij}]; \quad i, j = 1, 2, \ldots, m \tag{1} \]
where m is the total number of states. The transition probability from state i to state j, $p_{ij}$, is
given by
\[ p_{ij} = f_{ij} \Big/ \sum_{j=1}^{m} f_{ij} \tag{2} \]
The transition probability matrix, P, is defined by
\[ P = [p_{ij}]; \quad i, j = 1, 2, \ldots, m \tag{3} \]
If the transition matrix does not vary with the location of the space parameter, then the Markov
chain is stationary or homogeneous. The diagonal probabilities, $p_{ii}$, are related to the relative
thicknesses of the lithologic units (Harbaugh and Bonham-Carter, 1970). The transition
probability matrix is sensitive to the interval employed. If the interval is too small, the resulting
$p_{ii}$ tend to be one for any finite sample, with zero in the off-diagonals; if too large, some
important layers may be missed. Therefore, one should carefully inspect the stratigraphic
column before choosing an appropriate size for the interval. Thicknesses of lithologies can be
considered as either discrete random variables (Krumbein and Dacey, 1969) or continuous
random variables (Ethier, 1975). If they are treated as discrete random variables, then they
should follow geometric distributions for stratigraphy which can be modelled by conventional
Markov chains (Krumbein and Dacey, 1969). If they are treated as continuous random
variables, then they should follow exponential distributions.
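The tally matrix of Equation (1) and the transition probabilities of Equation (2) can be computed directly from a coded column. The sketch below assumes the lithologies are coded as integers 0 to m - 1 and logged bottom to top at a fixed interval; the function names and the short example column are hypothetical.

```python
def tally_matrix(states, m):
    """Count transitions i -> j between successive equally spaced observations."""
    F = [[0] * m for _ in range(m)]
    for i, j in zip(states, states[1:]):
        F[i][j] += 1
    return F


def transition_matrix(F):
    """Row-normalise the tally matrix: p_ij = f_ij / (row total of f_ij)."""
    P = []
    for row in F:
        total = sum(row)
        # A state that is never left (no outgoing tallies) gets a zero row.
        P.append([f / total if total else 0.0 for f in row])
    return P


# Hypothetical column; codes follow the paper's labels: 0 = A, 1 = B, 2 = C, 3 = D.
column = [0, 0, 0, 1, 1, 2, 1, 1, 1, 3, 0, 0]
F = tally_matrix(column, m=4)
P = transition_matrix(F)
for row in P:
    print([round(p, 2) for p in row])
```

Re-sampling the same column at a coarser interval (for example `column[::2]`) and rebuilding `P` is a quick way to see the sensitivity to interval size described above.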
So far we have considered only single-step transition probabilities. Multiple-step transition
probabilities can be obtained by powering a single-step transition probability matrix. If a
matrix of transition probabilities is successively powered with the result that each row is the
same as every other row, the resultant matrix is termed a regular or steady-state transition
matrix (Harbaugh and Bonham-Carter, 1970; Ang and Tang, 1984). The fixed probability row
vector of this matrix provides the proportion of each lithology.
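The powering described above can be sketched in a few lines. This is an illustration under stated assumptions: the matrix is squared repeatedly (so the powers grow as 2, 4, 8, ...), the stopping tolerance is arbitrary, and the two-state matrix is hypothetical rather than taken from the paper.

```python
def matmul(A, B):
    """Multiply two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def steady_state(P, tol=1e-10, max_squarings=60):
    """Square P repeatedly until all rows agree within tol; return one row."""
    M = [row[:] for row in P]
    n = len(M)
    for _ in range(max_squarings):
        M = matmul(M, M)
        spread = max(abs(M[i][j] - M[0][j]) for i in range(n) for j in range(n))
        if spread < tol:
            return M[0]
    raise ValueError("rows did not converge; the chain may not be regular")


# Hypothetical two-lithology transition matrix.
P = [[0.8, 0.2],
     [0.3, 0.7]]
pi = steady_state(P)
print([round(p, 4) for p in pi])
```

For this matrix the fixed row vector is (0.6, 0.4), which can be checked by hand from the balance condition 0.2 x 0.6 = 0.3 x 0.4.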

Testing for homogeneity


This is based on the test suggested by Anderson and Goodman (1957) for stationarity. The
stratigraphic column is divided into T subintervals. Then the following statistic is computed.

\[ S_1 = 2 \sum_{t=1}^{T} \sum_{i=1}^{m} \sum_{j=1}^{m} f_{ij}(t) \log_e \left[ \frac{p_{ij}(t)}{p_{ij}} \right] \tag{4} \]

where t refers to the tth subinterval, $f_{ij}(t)$ and $p_{ij}(t)$ are the tally and the transition
probability for that subinterval, and $p_{ij}$ is the transition probability for the whole sequence. If
the null hypothesis of homogeneity holds, then $S_1$ is asymptotically chi-square distributed with
$(T-1)m(m-1)$ degrees of freedom (DF). The significance level at which $S_1$ equals the
theoretical value given in the chi-square table provides the maximum significance level
(Benjamin and Cornell, 1970) at which the null hypothesis can be accepted.
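A minimal sketch of the homogeneity statistic follows. It assumes the column is coded as integers 0 to m - 1 and that the list of transitions, rather than the list of observations, is cut into T consecutive groups of nearly equal size; that splitting rule and the function names are assumptions, not details given in the paper.

```python
import math


def tally(pairs, m):
    """Tally matrix for a list of (from_state, to_state) pairs."""
    F = [[0] * m for _ in range(m)]
    for i, j in pairs:
        F[i][j] += 1
    return F


def homogeneity_statistic(states, m, T):
    """Return (S1, DF) for a state sequence split into T subintervals."""
    pairs = list(zip(states, states[1:]))
    F_all = tally(pairs, m)
    row_all = [sum(r) for r in F_all]
    size = math.ceil(len(pairs) / T)
    total = 0.0
    for t in range(T):
        F_t = tally(pairs[t * size:(t + 1) * size], m)
        for i in range(m):
            row_t = sum(F_t[i])
            for j in range(m):
                if F_t[i][j] > 0:  # zero tallies contribute nothing
                    p_t = F_t[i][j] / row_t
                    p = F_all[i][j] / row_all[i]
                    total += F_t[i][j] * math.log(p_t / p)
    return 2.0 * total, (T - 1) * m * (m - 1)


# Hypothetical column whose lower half is uniform and whose upper half alternates.
column = [0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 1, 0, 1]
s1, dof = homogeneity_statistic(column, m=2, T=2)
print(round(s1, 2), dof)
```

With T = 5 and m = 4 the same function returns 48 degrees of freedom, the figure quoted in the Application section.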

Tests for Markov property of a conventional Markov chain


A test, recommended by Billingsley (1961), is the Pearson statistic, given by

\[ S_2 = \sum_{i=1}^{m} \sum_{j=1}^{m} (f_{ij} - e_{ij})^2 / e_{ij} \tag{5} \]

where $e_{ij}$ is the expected number of i to j transitions under the null hypothesis that the state
transitions come from independent multinomial trials. The maximum likelihood estimate of $e_{ij}$
is given by
\[ e_{ij} = (f_{iR} f_{Cj}) / N \tag{6a} \]
where
\[ f_{iR} = \sum_{j=1}^{m} f_{ij} \quad \text{for } i = 1, 2, \ldots, m \tag{6b} \]
\[ f_{Cj} = \sum_{i=1}^{m} f_{ij} \quad \text{for } j = 1, 2, \ldots, m \tag{6c} \]
and N is the total number of transitions. Another test, recommended by Anderson and
Goodman (1957), is the likelihood-ratio statistic

\[ S_3 = 2 \sum_{i=1}^{m} \sum_{j=1}^{m} f_{ij} \log_e (f_{ij} / e_{ij}). \tag{7} \]

Under the null hypothesis of independence both these statistics become asymptotically
chi-square distributed with $(m-1)^2$ DF. If the results of these two tests lead to the rejection of
the null hypothesis, then it only implies the presence of some dependence between state
transitions. To show that this dependency is a conventional first order Markov dependency it is
necessary to prove that the thicknesses of lithologies follow geometric distributions. Chi-square
and Kolmogorov-Smirnov goodness-of-fit tests (Ang and Tang, 1975) can be performed to
check this.
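Equations (5) to (7) translate into a short routine. The sketch below is an illustration with a hypothetical function name; the tally matrix used as input is the one printed in the first column of Table 2, so the output should come out close to the S2 and S3 values listed in that column.

```python
import math


def independence_statistics(F):
    """Return (S2, S3, DF) for a square tally matrix F."""
    m = len(F)
    row = [sum(F[i]) for i in range(m)]                         # f_iR
    col = [sum(F[i][j] for i in range(m)) for j in range(m)]    # f_Cj
    N = sum(row)
    s2 = 0.0
    s3 = 0.0
    for i in range(m):
        for j in range(m):
            e = row[i] * col[j] / N      # expected tally under independence
            if e == 0:
                continue                 # empty row or column: no contribution
            s2 += (F[i][j] - e) ** 2 / e
            if F[i][j] > 0:
                s3 += 2.0 * F[i][j] * math.log(F[i][j] / e)
    return s2, s3, (m - 1) ** 2


# Tally matrix as printed in the first column of Table 2 (states A, B, C, D).
F_table2 = [[104, 3, 8, 9],
            [5, 74, 8, 14],
            [5, 6, 13, 7],
            [10, 18, 2, 21]]
s2, s3, dof = independence_statistics(F_table2)
print(round(s2, 1), round(s3, 1), dof)
```

Converting the statistics to significance levels needs a chi-square table or a statistics library, which is left out here to keep the sketch dependency-free.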

Stratigraphy modelling using first order semi-Markov chains

In structuring an embedded transition probability matrix from observational data, only the
lithology transitions are tallied. Hence, the number of entries in the embedded matrix is smaller
than for the equal interval matrix. As a result, the off-diagonal probabilities have different
numerical values, but the relative probabilities for $p_{ij}$ where $i \neq j$ are the same in the two types of
matrices. For semi-Markov chains, the thicknesses of lithologies need not follow geometric
distributions. In order to use the semi-Markov model, the stratigraphy data should satisfy the
Markov property for embedded Markov chains.
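To illustrate how an embedded tally matrix and the accompanying thickness data might be extracted from an equal-interval log, the sketch below collapses runs of identical states into beds. The function names, the integer coding of lithologies and the example column are hypothetical.

```python
from itertools import groupby


def beds_from_column(states, interval):
    """Collapse runs of identical states into (lithology, thickness) beds."""
    return [(state, sum(1 for _ in run) * interval)
            for state, run in groupby(states)]


def embedded_tally(beds, m):
    """Tally only the lithology changes, so the main diagonal stays zero."""
    F = [[0] * m for _ in range(m)]
    for (i, _), (j, _) in zip(beds, beds[1:]):
        F[i][j] += 1
    return F


# Hypothetical column logged every 60 cm; 0 = A, 1 = B, 2 = C, 3 = D.
column = [0, 0, 0, 1, 1, 2, 1, 1, 1, 3, 0, 0]
beds = beds_from_column(column, interval=60)
F_embedded = embedded_tally(beds, m=4)
print(beds)
print(F_embedded)
```

The thicknesses in `beds` are the data to which separate probability distributions would be fitted in a semi-Markov model.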

Tests for Markov property of an embedded Markov chain


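In this case the Pearson statistic takes the following form (Yu, 1984)

\[ S_4 = \sum_{i=1}^{m} \sum_{\substack{j=1 \\ j \neq i}}^{m} (f_{ij} - e_{ij})^2 / e_{ij} \tag{8a} \]

where
\[ e_{ij} = \begin{cases} a_i b_j & \text{for } i \neq j \\ 0 & \text{for } i = j \end{cases} \tag{8b} \]
Values of $a_i$ and $b_j$ are computed using the iterative scheme given below (Yu, 1984), in which
the superscripts in parentheses denote the iteration number.

Step 1:
\[ a_i^{(0)} = f_{iR} \Big/ \sum_{j=1}^{m} \delta_{ij} \quad \text{for } i = 1, 2, \ldots, m \tag{9a} \]
where
\[ \delta_{ij} = \begin{cases} 0 & \text{for } i = j \\ 1 & \text{for } i \neq j \end{cases} \tag{9b} \]
Step 2n ($n \geq 1$):
\[ b_j^{(2n-1)} = f_{Cj} \Big/ \sum_{i=1}^{m} \delta_{ij} a_i^{(2n-2)} \quad \text{for } j = 1, 2, \ldots, m \tag{9c} \]
Step 2n+1 ($n \geq 1$):
\[ a_i^{(2n)} = f_{iR} \Big/ \sum_{j=1}^{m} \delta_{ij} b_j^{(2n-1)} \quad \text{for } i = 1, 2, \ldots, m \tag{9d} \]

The iteration can be repeated until the required accuracy is obtained. The likelihood-ratio
statistic in this case is given by (Yu, 1984):

\[ S_5 = 2 \sum_{i=1}^{m} \sum_{\substack{j=1 \\ j \neq i}}^{m} f_{ij} \log_e f_{ij} - 2 \sum_{i=1}^{m} f_{iR} \log_e a_i - 2 \sum_{j=1}^{m} f_{Cj} \log_e b_j \tag{10} \]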

Under the null hypothesis that the state transitions come from independent multinomial trials,
both statistics approximately follow the chi-square distribution with $(m-1)^2 - m$ degrees of
freedom. Results should indicate rejection of the null hypothesis in order to satisfy the Markov
property.
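The iterative scheme and the two statistics can be sketched as follows. This is an illustration rather than Yu's (1984) implementation: a fixed number of iterations (`n_iter`, an assumed parameter) replaces an explicit accuracy check, every state is assumed to have at least one outgoing and one incoming transition, and S5 is accumulated in the equivalent form 2 Σ f_ij log(f_ij / (a_i b_j)) over the off-diagonal cells.

```python
import math


def quasi_independence_statistics(F, n_iter=200):
    """Return (S4, S5, DF) for an embedded tally matrix F (diagonal ignored)."""
    m = len(F)
    f = [[F[i][j] if i != j else 0 for j in range(m)] for i in range(m)]
    row = [sum(f[i]) for i in range(m)]                        # f_iR
    col = [sum(f[i][j] for i in range(m)) for j in range(m)]   # f_Cj
    a = [row[i] / (m - 1) for i in range(m)]                   # Step 1
    b = [0.0] * m
    for _ in range(n_iter):
        # Step 2n: update b from the current a, skipping the diagonal cell.
        b = [col[j] / sum(a[i] for i in range(m) if i != j) for j in range(m)]
        # Step 2n+1: update a from the new b, skipping the diagonal cell.
        a = [row[i] / sum(b[j] for j in range(m) if j != i) for i in range(m)]
    s4 = 0.0
    s5 = 0.0
    for i in range(m):
        for j in range(m):
            if i == j:
                continue
            e = a[i] * b[j]
            s4 += (f[i][j] - e) ** 2 / e
            if f[i][j] > 0:
                s5 += 2.0 * f[i][j] * math.log(f[i][j] / e)
    return s4, s5, (m - 1) ** 2 - m


# Hypothetical embedded tally matrix for four lithologies.
F_embedded = [[0, 12, 5, 3],
              [9, 0, 7, 6],
              [4, 8, 0, 2],
              [6, 3, 2, 0]]
s4, s5, dof = quasi_independence_statistics(F_embedded)
print(round(s4, 2), round(s5, 2), dof)
```

For a table whose off-diagonal tallies factor exactly as a_i b_j, both statistics shrink to zero, which gives a simple sanity check on the iteration.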

Application
Table 1 provides stratigraphy data from the Oficina Formation of eastern Venezuela (Scherer,
1968). These data were used to illustrate the modelling procedure given in the paper. The
stratigraphy column given in Table 1 was divided into five equal subintervals to perform the test
for homogeneity. The following results were obtained:
Degrees of freedom = 48; $S_1$ = 55.5; $\alpha$ = 0.22.
This shows that the homogeneity can be accepted at a fairly high significance level. This allows us
to treat the whole stratigraphy column under one homogeneity set.
Next, the tally matrices, transition probability matrices, and chi-square statistics for an
ordinary Markov chain were computed for interval sizes of 60, 120 and 240 cm. Results are
given in Table 2. The results clearly show the influence of interval size on the transition
probability matrix and on the chi-square values. As the interval size increases, $p_{ii}$ decreases. In
this particular example, the influence is most pronounced on lignite and siltstone. A careful
inspection of Table 1 shows that both lignite and siltstone have fairly high frequencies for
thicknesses less than 120 and 240 cm. For this example, an interval size of 60 cm seems a
good choice. Values obtained for $\alpha$ clearly show the strong rejection of the null hypothesis of
independence. The transition probability matrix obtained for the interval size of 60 cm was used
to compute the regular transition matrix. The results provided the following proportions of
lithologies for the stratigraphic column:
Sandstone = 0.27; Shale = 0.49; Siltstone = 0.12; Lignite = 0.12
Frequency distributions for lithology thicknesses are given in Fig. 1. Thickness data were
subjected to chi-square and Kolmogorov-Smirnov (K & S) goodness-of-fit tests for geometric
distributions (Table 3).

Table 2. Transition probability matrices and chi-square statistics for modelling stratigraphy
by conventional Markov chains.

Interval size used for observation:
                         60 cm                     120 cm                     240 cm
                    A     B     C     D        A     B     C     D        A     B     C     D
Tally matrix A    104     3     8     9       84    22    14    10       31    18    11     7
             B      5    74     8    14       26   171    21    25       21    72     7    19
             C      5     6    13     7        9    21    13    12        4    10     3     6
             D     10    18     2    21       11    29     7    15       11    19     2     4

Transition   A   0.84  0.03  0.06  0.07     0.64  0.17  0.11  0.08     0.46  0.27  0.16  0.11
probability  B   0.05  0.73  0.08  0.14     0.11  0.70  0.09  0.10     0.18  0.60  0.06  0.16
matrix       C   0.16  0.19  0.42  0.23     0.16  0.38  0.24  0.22     0.17  0.44  0.13  0.26
             D   0.20  0.35  0.04  0.41     0.18  0.47  0.11  0.24     0.30  0.53  0.06  0.11

S2                                247.6                      171.4                       33.4
DF                                    9                          9                          9
α                               < 0.005                    < 0.005                    < 0.005
S3                                251.0                      162.3                       32.9
DF                                    9                          9                          9
α                               < 0.005                    < 0.005                    < 0.005

Fig. 1. Observed relative frequencies and geometric distribution fittings on lithology thickness
data from Table 1. Panels: sandstone (state A), shale (state B), siltstone (state C) and lignite
(state D); horizontal axis: thickness (cm); curves: observed and theoretical (geometric).

Table 3. Results of goodness-of-fit tests on thickness of lithologies.

Lithology type       Sandstone     Shale Siltstone   Lignite
Chi-square value          3.78      8.05      6.36      0.99
DF                           7         9         4         2
α                         0.80      0.50      0.19      0.62
K & S value               0.04      0.06      0.05      0.03
DF                          15        18         7         5
α                       > 0.95    > 0.95    > 0.95    > 0.95

Using the tally matrix, $S_4$ and $S_5$ were computed according to Equations (8) and (10),
respectively. The associated degrees of freedom and $\alpha$ values were also determined. Results are
given below.
$S_4$ = 14.3; DF = 5; $\alpha$ = 0.015
$S_5$ = 14.8; DF = 5; $\alpha$ = 0.012
The results show a rejection of the null hypothesis of independence. However, this rejection is
not as strong as the rejection indicated by Table 2. Therefore, it can be concluded that the
conventional Markov chain is better than the embedded Markov chain to model the considered
stratigraphy data.

Conclusions

The paper provides a procedure to analyse cyclical stratigraphy data. If lithology transitions
satisfy the Markov property, then the stratigraphy can be modelled using either first-order
conventional Markov chains or first-order embedded Markov chains. The Markov chain type
which should be used depends on the structure of the stratigraphy. Statistical tests, which
should be performed to choose the proper type of Markov chain, are given in detail. If lithology
transitions do not show any Markov dependence, then the stratigraphy should be modelled by
an independent multinomial model.
Once the model is constructed, it can be used to generate stratigraphy using a Monte Carlo
simulation. Characterization of stratigraphy is an essential element in any geological or
geotechnical engineering analysis or design.
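A minimal sketch of such a Monte Carlo simulation for a conventional chain is given below. The function name, the seed handling and the choice of starting state are assumptions; the transition matrix is the one printed in the first column of Table 2, used here purely as example input.

```python
import random


def simulate_column(P, start, n_steps, seed=None):
    """Generate a state sequence by sampling each transition from row P[current]."""
    rng = random.Random(seed)
    states = [start]
    for _ in range(n_steps):
        row = P[states[-1]]
        # random.choices draws one index with probability proportional to the weights.
        states.append(rng.choices(range(len(row)), weights=row)[0])
    return states


# Transition probabilities as printed in the first column of Table 2 (A, B, C, D).
P_table2 = [[0.84, 0.03, 0.06, 0.07],
            [0.05, 0.73, 0.08, 0.14],
            [0.16, 0.19, 0.42, 0.23],
            [0.20, 0.35, 0.04, 0.41]]
labels = "ABCD"
column = simulate_column(P_table2, start=0, n_steps=40, seed=1)
print("".join(labels[s] for s in column))
```

Each simulated step represents one sampling interval, so bed thicknesses follow from the run lengths in the generated string. A semi-Markov simulation would instead draw the next lithology from the embedded matrix and the thickness from a separately fitted distribution.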

Acknowledgements

USAE Waterways Experiment Station provided financial assistance for this study. This support
is gratefully acknowledged. Any opinions, findings, conclusions, or recommendations
expressed in this paper are those of the author and do not necessarily reflect the views of the
Waterways Experiment Station. Sue Wiedenbeck, a graduate student in Systems Engineering at
the University of Arizona, assisted with most of the calculations. The writer also gratefully
acknowledges John B. Palmerton of the Waterways Experiment Station for his assistance and
interest over the course of the study.

References

Ali, E.M., Wu, T.H. and Chang, N.Y. (1980) Stochastic model of flow through stratified soils, Journal of
the Geotechnical Engineering Division, ASCE 106, 593-610.
Anderson, T.W. and Goodman, L.A. (1957) Statistical inference about Markov chains, Ann. Math.
Statist. 28, 89-110.
Ang, A.H-S. and Tang, W.H. (1975) Probability Concepts in Engineering Planning and Design 1, John
Wiley and Sons.
Ang, A.H-S. and Tang, W.H. (1984) Probability Concepts in Engineering Planning and Design 2, John
Wiley and Sons.
Benjamin, J.R. and Cornell, C.A. (1970) Probability, Statistics, and Decision for Civil Engineers,
McGraw-Hill.
Billingsley, P. (1961) Statistical methods in Markov chains, Ann. Math. Statist. 32, 12-40.
Ethier, V.G. (1975) Application of Markov analysis to the Banff Formation (Mississippian), Alberta,
Mathematical Geology 7, 47-61.
Goodman, L.A. (1968) The analysis of cross-classified data: independence, quasi-independence, and
interactions in contingency tables with and without missing entries, Jour. Amer. Statist. Assoc. 63,
1091-131.
Harbaugh, J.W. and Bonham-Carter, G. (1970) Computer Simulation in Geology, John Wiley and Sons.
Krumbein, W.C. and Dacey, M.F. (1969) Markov chains and embedded Markov chains in geology,
Journal of Mathematical Geology 1, 79-96.
Scherer, W. (1968) Application of Markov chains to cyclical sedimentation in the Oficina Formation, eastern
Venezuela, unpublished MS thesis, Northwestern University, Evanston, Illinois.
Yu, J. (1984) Tests for quasi-independence of embedded Markov chains, Journal of Mathematical Geology
16, 267-82.

PINNADUWAH H.S.W. KULATILAKE
Department of Mining and Geological Engineering,
University of Arizona,
Tucson,
Arizona 85721, USA

Received 30 May 1986
