889–905, 1998
Printed in Great Britain
0148-9062/98 $19.00 + 0.00 © 1998 Elsevier Science Ltd. All rights reserved
PII: S0148-9062(98)00011-4
R. E. HAMMAH and J. H. CURRAN: AUTOMATIC IDENTIFICATION OF JOINT SETS

On the basis of geological genetic criteria, discontinuities encountered in rock masses can be classified into faults, bedding planes, joints, fractures and cleavages. The existence of the different types of discontinuities can be attributed to a range of geological processes that might have occurred in the rock mass being mapped. For example, joints are formed as a result of cooling in igneous rock and shrinkage due to drying in sedimentary rocks, and for all rock types can be attributed to tectonic stresses. Detailed accounts of discontinuity types can be found in Ref. [3]. Discontinuities generally occur in sets (sub-parallel planar groups) and tend to have a pattern. As geologists or geotechnical engineers map rock surfaces, they record observed geological relations between groups of discontinuities, and information on their geological genetic types. Therefore, the information provided by geologists or geotechnical engineers about the discontinuities encountered during mapping is of great importance to the analysis of joint survey data.

A typical joint survey data set consists of information on the orientation of the discontinuities encountered, the types of discontinuities, the roughness of their surfaces, persistence, spacing and other attributes deemed important by the recording geologist or geotechnical engineer. Some of the characteristics recorded during the mapping of discontinuities, such as orientation and spacing, are quantitative in nature, while others, like color and geological genetic type, are qualitative variables. Based on the similarity across variables, the discontinuities are then separated into subgroups or sets.

Tools for analyzing survey data and delineating them into subgroups or clusters are mainly based on contouring, on stereographic plots, of the projections of the discontinuity poles (normals to the planes of discontinuities). Computer software packages exist today that make the plotting and contouring of pole information on stereograms a near trivial issue. (One such package is DIPS, a program developed by the Rock Engineering Group of the University of Toronto [5]. It was used for displaying and contouring all the pole data in this paper.) The data analyst thereafter proceeds to identify the different fracture sets revealed on the plots.

The above-described process has two principal weaknesses. The first deficiency is that all the other information, some or all of which may also assist in delineating the discontinuities into sets, is excluded from the analysis. Also, there exists the possibility that some joint sets may be very similar in orientation but differ greatly in some other attribute, say, joint roughness, spacing or geological genetic type. For example, the orientation of a fault may be close to that of a set of joints, but it cannot be grouped with the joints, because it is a major discontinuity with properties very different from those of joints. A separation based only on orientations may therefore lead to significant error. Such errors can be sources of severe problems during construction of structures in rock masses, or during the course of their use. This disadvantage can, in principle, be overcome by sorting the survey information on a spreadsheet in addition to the contouring [6]. However, this approach can be very tedious if the data set is sufficiently large and/or the number of variables numerous, and it can fail to reveal the true structure apparent in the data.

One other disadvantage of separating fracture sets by contouring pole orientations on stereographic plots is that the process is very subjective. Different data analysts can arrive at very different answers based on their background, experience, and personal biases and inconsistencies [3]. The differences in results are even more pronounced in cases where the boundaries between clusters are unclear [3].

Past attempts at tackling the problem of subjectivity have tried to use cluster analysis. Shanley and Mahtab [7] were the first to propose an algorithm that provided some objectivity to the process. That algorithm was further enhanced by Shanley and Yegulalp [8]. However, this algorithm still lacks the capability of including the recorded additional data in the analysis. Dershowitz et al. [9] describe a stochastic algorithm for clustering discontinuities which can handle the extra information provided. The method is based on defining probability distributions for each of the fracture characteristics, and involves the integration of these probability distribution functions. Numerical integration is usually very computationally intensive. The method also requires knowledge of the number of discontinuity sets in the data under analysis.

Statistical tools such as discriminant analysis, regression, decision analysis and cluster analysis exist for the exploratory analysis of data. More recently, artificial intelligence techniques such as neural networks have also been developed for the purposes of analyzing and interpreting data. However, in an environment where data has to be classified into homogeneous groups of objects in the absence of a priori information on the groups, cluster analysis is the most suitable tool [10, 11]. Techniques for clustering data abound in the literature, and they can be separated into two broad categories: hierarchical cluster algorithms and partitional cluster methods. Hierarchical clustering techniques generate nested clusters of data, i.e. some clusters may be embedded in others, and as a result different numbers of clusters can be obtained depending on the level at which they are observed. Partitional algorithms generate a unitary partitioning of the data into homogeneous groups, with all of them at the same level. The authors believe that, of the major types of clustering algorithms, a K-means partitional algorithm would best suit the purposes of the rock mechanics expert. This can be attributed to the nature of the classification problem in the exploratory analysis of discontinuity data for rock mechanics purposes, and to the fact that K-means algorithms are less susceptible to outliers than hierarchical schemes [12].

On the basis of the nature of the boundaries between the clusters identified, partitional algorithms can be divided into "hard" or "crisp" algorithms and "soft" classification schemes. In "hard" clustering an observation is assigned to one cluster or another. However, this "all or nothing" classification does not adequately represent reality, because it is often the case that a data object in one cluster may bear similarity in some characteristics to objects in a different cluster [13]. Soft cluster algorithms, on the other hand, assign to every observation degrees of membership to the clusters in a data set. These membership degrees often assume real values between zero and one. When observations belong to a cluster, they tend to have high degrees of membership to that cluster.

The cluster algorithm proposed in this paper belongs to this class of cluster techniques. It is founded on one of the most widely used and successful classes of cluster analysis, the fuzzy K-means algorithm [13–15]. Fuzzy cluster algorithms have generally been developed and used in solving problems of computer vision and medical imaging, and are gaining popularity in other areas [11, 13, 14]. In general, it is believed that fuzzy clustering techniques yield results superior to those of hard cluster methods [13]. This can be partially attributed to the fact that no observation is fully assigned to any cluster in any iteration of the cluster process. Also, the soft boundaries of fuzzy algorithms allow them to escape traps in data that can fool hard classification schemes [15].

The fuzzy K-means approach is especially appealing for solving the problem of separating discontinuities into sets, because it explicitly accounts for the uncertainty present in both the collection and the analysis of the survey data. It also provides a computationally attractive alternative to the algorithm proposed by Dershowitz et al. in Ref. [9].

Fuzzy set theory provides the framework within which uncertainty in data can be accounted for in a natural and realistic manner [13]. It originated with Zadeh [16], who defined a new concept of sets that allowed an object to belong to a set with a degree of membership lying between zero and one. Under fuzzy theory, the greater the certainty that an object belongs to a set, the closer its membership value is to one. Based on this idea of fuzzy sets, Ruspini [17] defined an algorithm for separating data into sets through the minimization of an objective function. Practically all subsequent work on fuzzy clustering derives its foundation from Ruspini's work [13].

The earliest work known to the authors on the application of fuzzy K-means clustering to the analysis of discontinuity data was by Harrison [18]. The present work has made a number of modifications to the fuzzy K-means algorithm which are specific to the clustering of orientations. These modifications include a different distance norm, a novel approach to determining the centroids (means) of clusters of directional (spherical) data, and modifications of existing cluster performance measures peculiar to the clustering of discontinuity orientations.

THE FUZZY K-MEANS ALGORITHM

The fuzzy K-means algorithm uses Picard iterations to solve for the minimum of an objective function [13]. Given a data set comprising N observations, each described by a vector of P attributes, X_j = (X_j1, X_j2, ..., X_jP), the algorithm seeks to partition the data set into K subgroups or clusters on the basis of the measured similarities among the vectors (observations) of the data. The algorithm basically seeks regions of high density in the data. The prototype or centroid of each cluster is the vector most representative of the cluster, i.e. it is the geometric mean of all the vectors belonging to that cluster. Cluster prototypes are represented in the K-means algorithm as V_i. The solution to the problem of separating data objects into K subgroups can be achieved by minimizing the fuzzy objective function:

$$J_m(U,V)=\sum_{j=1}^{N}\sum_{i=1}^{K}(u_{ij})^m\,d^2(\mathbf{X}_j,\mathbf{V}_i),\qquad K\le N\tag{1}$$

The quantity d^2(X_j, V_i) is the distance between observation X_j and the cluster centroid V_i. The distance is a measure of the dissimilarity between the two points. When the two points exactly coincide, the distance between them is zero. It increases as the two objects become more and more dissimilar.

The choice of the distance measure is determined by the space in which the variables lie. In R^P space, the Euclidean norm

$$d^2(\mathbf{X}_j,\mathbf{V}_i)=\sum_{p=1}^{P}(X_{jp}-V_{ip})^2\tag{2}$$

can be used. In order to measure the distance between orientations, they are first converted to unit normals, and the space thus becomes points lying on the surface of a unit sphere (a non-Euclidean space). In this space the sine of the angle between two normals is the measure that represents how far apart they are (the cosine is a similarity and not a dissimilarity measure). The distance norm therefore becomes:

$$d^2(\mathbf{X}_j,\mathbf{V}_i)=1-(\mathbf{X}_j\cdot\mathbf{V}_i)^2\tag{3}$$

where X_j·V_i is the dot product of the vectors X_j and V_i. (In Ref. [19] reasons are supplied as to why Equation (3) is a valid distance metric.)

It must be noted that the distance measure used in a cluster algorithm implicitly imposes a topology on the data under analysis [13]. A distance metric causes a …
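The unit-normal conversion and the sine-based distance of Equation (3) can be sketched as follows. This is a minimal illustration, not code from the paper; in particular, the dip/dip-direction survey convention and the axis orientation (x = east, y = north, z = up) are assumptions made here:

```python
import numpy as np

def pole_to_unit_normal(dip_deg, dip_dir_deg):
    """Convert a plane's dip and dip direction (degrees) into an
    upward-pointing unit normal (direction cosines).  The axis
    convention (x = east, y = north, z = up) is an assumption of
    this sketch, not taken from the paper."""
    dip = np.radians(dip_deg)
    ddir = np.radians(dip_dir_deg)
    return np.array([np.sin(dip) * np.sin(ddir),
                     np.sin(dip) * np.cos(ddir),
                     np.cos(dip)])

def orientation_distance_sq(x, v):
    """Equation (3): d^2(X_j, V_i) = 1 - (X_j . V_i)^2, the squared sine
    of the acute angle between two unit normals.  Squaring the dot
    product makes antipodal normals (the same physical plane) coincide."""
    return 1.0 - np.dot(x, v) ** 2

x = pole_to_unit_normal(45.0, 90.0)
v = pole_to_unit_normal(45.0, 270.0)
print(orientation_distance_sq(x, -x))  # antipodal normals: distance ~ 0
print(orientation_distance_sq(x, v))   # perpendicular normals: distance ~ 1
```

Note how the squared dot product, unlike the plain cosine, is insensitive to the arbitrary sign of a pole vector, which matters throughout the algorithm below.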
… method outlined in Refs [23–25]. Given N unit vectors in the form of direction cosines (x_j, y_j, z_j), where j = 1, ..., N, their mean can be determined in the following manner:

(i) Compute the orientation matrix S using the formula

$$\mathbf{S}=\begin{bmatrix}\sum_{j=1}^{N}x_jx_j & \sum_{j=1}^{N}x_jy_j & \sum_{j=1}^{N}x_jz_j\\ \sum_{j=1}^{N}x_jy_j & \sum_{j=1}^{N}y_jy_j & \sum_{j=1}^{N}y_jz_j\\ \sum_{j=1}^{N}x_jz_j & \sum_{j=1}^{N}y_jz_j & \sum_{j=1}^{N}z_jz_j\end{bmatrix}$$

(ii) Find the eigenvalues (t_1, t_2, t_3) of S and their respective normalized eigenvectors (ξ_1, ξ_2, ξ_3), where t_1 ≤ t_2 ≤ t_3. Vector ξ_3, which is the eigenvector associated with the maximum eigenvalue, will be the mean vector of the group of N vectors.

In the form described above, the eigenanalysis procedure cannot be used to determine centroids, since it would give the overall mean of the N vectors being analyzed. The authors have successfully adapted this method for computing the prototypes of clusters of unit vectors in the fuzzy K-means routine by including the weighting factors, (u_ij)^m, in the orientation matrix. A modified orientation matrix S* is defined to be

$$\mathbf{S}^{*}=\begin{bmatrix}\sum_{j=1}^{N}(u_{ij})^m x_jx_j & \sum_{j=1}^{N}(u_{ij})^m x_jy_j & \sum_{j=1}^{N}(u_{ij})^m x_jz_j\\ \sum_{j=1}^{N}(u_{ij})^m x_jy_j & \sum_{j=1}^{N}(u_{ij})^m y_jy_j & \sum_{j=1}^{N}(u_{ij})^m y_jz_j\\ \sum_{j=1}^{N}(u_{ij})^m x_jz_j & \sum_{j=1}^{N}(u_{ij})^m y_jz_j & \sum_{j=1}^{N}(u_{ij})^m z_jz_j\end{bmatrix}\tag{7}$$

The normalized eigenvector corresponding to the largest eigenvalue of S* is the new cluster centroid, i.e.

$$\hat{\mathbf{V}}_i=\boldsymbol{\xi}_3\tag{8}$$

Proof of the validity of using this approach to computing cluster centroids for spherical data is given in Appendix A at the end of this paper. There are several compelling reasons for using this approach as opposed to a routine involving Equation (5) with suitable housekeeping of the signs of vectors. One reason for using the eigenanalysis approach is that the resulting eigenvalues and eigenvectors contain important information on cluster shapes [19]. This information can be utilized with an appropriate distance metric for finding elliptical clusters on a sphere that are ill-placed relative to each other [19]. (For example, two elliptical clusters are ill-placed relative to each other when their centroids are close and their principal axes are perpendicular to each other.) Secondly, the two smallest eigenvalues obtained from the spectral decomposition of the matrix S* have a physical meaning that allows them to be used in forming critical measures for determining the validity of various cluster partitions. This will be discussed in greater detail in Section 5. Furthermore, these advantages are gained without paying steep penalties in computational effort or speed. Memory requirements are not much more than those needed for a sign-tracking subroutine based on Equation (5), and there are very fast subroutines in the numerical analysis literature for finding the eigenvalues and eigenvectors of matrices that can be easily implemented.

SEQUENCE OF STEPS FOR EXECUTING THE ALGORITHM

The following scheme gives the sequence of computations needed to solve for the minimum of the fuzzy objective function.

(i) The algorithm starts off with the selection of initial guesses for the K cluster centroids. The initial prototypes are chosen such that they are realistic vectors that fall in the general region of the data being analyzed. Their selection can be realized in different ways. One method would be to select the first K input vectors as the initial guesses of the centroids [11]. Another way of picking initial guesses is to randomly select K vectors from the data set as seed points for the algorithm [11]. To guarantee the selection of K well-separated initial prototypes, the mean of the data set can be chosen as the first initial centroid. Thereafter, each subsequent initial guess is picked such that its distance from each of those already chosen is not less than a specified minimum [11, 21].

The authors experimented by initializing the algorithm with K randomly chosen cluster prototypes. One way of choosing a random vector in R^P space is by selecting each of the P attributes of the prototype as a random real number in the interval (X̄_p − 3s_p, X̄_p + 3s_p), where X̄_p is the mean of the pth attribute of the N input vectors X_j, and s_p is its corresponding standard deviation. To select a random point on the surface of a unit sphere, all it takes is to select three random real numbers between (−1, 1) and normalize them. Vectors selected in the manner described are realistic in that they lie in the space of the input vectors and in the general region of the data.

Different choices of initial guesses of cluster prototypes can lead to different partitions of the same data. This is because the algorithm may or may not converge on the global minimum of the objective function, Equation (1); it is only guaranteed to settle at a local minimum [11, 21]. This comes into play especially for data sets with poorly separated clusters. However, confidence in cluster results can be built by running the algorithm a few times and carefully observing the resulting partitions. The authors have observed in their practice that this problem is much less pronounced when the
number of clusters K is correct, i.e. corresponds to the actual number of clusters in the data set.

(ii) Compute the distances d^2(X_j, V_i) of all N observations from the K cluster centroids using Equation (3) and/or Equation (2).

(iii) Calculate the degrees of membership, u_ij, of all N observations in the K clusters with Equation (4).

(iv) Evaluate new cluster prototypes using the eigenanalysis of the fuzzy orientation matrix, S*, for directional (spherical) data. If non-orientation data (extra recorded information) is also involved in the analysis, then Equation (5) can be used to determine those components of the coordinates of the cluster prototypes. The validity of this treatment of mixed-type data (data with both spherical and non-spherical components) can be proved if it is recognized that the objective function for such data consists of two distinct sums. The first sum involves the spherical components of observations (distances for this part of the objective function are measured with Equation (3)), while the second sum represents the contribution of the Euclidean components of points (distances involved in this component are computed using Equation (2)). Since the variables of the two sums that form the composite objective function are completely different from each other, minimization of the objective function can be accomplished through the minimization of the individual sums. Thus the proof for mixed-type data is a combination of that found in Appendix A of the current paper and that provided by Bezdek [13].

(v) Compute new distances using Equation (3) and/or Equation (2), and new degrees of membership, û_ij (Equation (4)), for all N observations.

(vi) If

$$\max_{ij}\,\lvert u_{ij}-\hat{u}_{ij}\rvert<\varepsilon$$

stop [12]; otherwise go to step (iv) of the procedure. ε is a tolerance limit that acts as the criterion for terminating the iterations. It lies between 0 and 1. For all examples given in this paper, ε = 0.001.

The fuzzy objective function for any specified number of clusters tends to have multiple stationary points (local minima). There is no guarantee that the Picard iterations of the algorithm will converge on the global minimum of J_m. Also, even if the algorithm converged on the global minimum, it would still not be guaranteed that this minimum of the objective function would necessarily provide the best possible partitioning of the data under examination [13]. However, in the practice of the authors, the proposed algorithm has a strong tendency to converge on the correct solution when the number of clusters is correct for the data set, and especially when the clusters are well separated. This may be attributed to the general advantages of fuzzy algorithms given in the introduction to this paper. The examples provided in the paper show that there is at least strong evidence that the technique provides rational answers in clustering discontinuity data. Since the main thrust of the current work is to present the theoretical aspects of the algorithm highlighted with two examples, an in-depth analysis of its performance over a broader range of geological conditions shall be provided in Ref. [26].

An alternative way of implementing the algorithm would be to start off by choosing random values for all the degrees of membership, u_ij, of the N observations and proceeding to compute in sequence the cluster prototypes, distances, and new membership values, û_ij [13]. The approach of selecting initial cluster centroids, as opposed to initializing memberships, is preferred in this work because it allows the data analyst to easily utilize any a priori knowledge that may exist on the means of discontinuity sets (cluster prototypes) to start off the algorithm. It usually is the case that when information is available on the structure of a data set, it is in the form of the number of clusters and their centroids.

ASSIGNMENT OF OBSERVATIONS TO CLUSTERS

Observations are assigned to one cluster or the other based on the values of the degrees of membership. Vector X_j is considered to belong to cluster I if its membership, u_Ij, in cluster I is higher than its membership in all the other clusters. By definition, outliers do not fit the general patterns of any of the clusters very well. In fuzzy K-means clustering, they can be detected because they tend not to have high membership values in any cluster. It is therefore possible to exclude such ambiguous observations from being assigned to any cluster by establishing a threshold value for the maximum memberships of data observations. When the largest cluster membership value of a vector falls below the threshold, it may be excluded from being assigned. Great care must be exercised if points are not going to be assigned, since not every vector with low cluster memberships is an outlier.

Jain and Dubes [11] advocate that outliers be identified and removed from cluster analysis because they tend to distort cluster shapes. Other experts caution against the exclusion of outliers from data sets. It is their belief that unless there are very compelling reasons for suspecting that an observation far from the rest of the data is the result of erroneous measurement, it must be left alone. It might just be a rare occurrence of a feature that is present in the data, and its removal can have serious repercussions. If an outlier is known to be a measurement error and it can be corrected, then this should be done. All that can be stated here by the authors is that, if so desired, outliers can be removed from data sets after clustering with the fuzzy K-means algorithm.

CLUSTER VALIDITY

So far in the discussion of the method no mention has been made of how to determine the number of clusters in a data set, or how to establish that the partitioning
obtained is correct. After all, the fuzzy K-means algorithm will provide K clusters whether that partitioning is true or not (in certain cases null clusters can exist). This question is one of the most important in cluster analysis and is not an easy one to answer. The answer to it is fundamentally tied to the definition of what a cluster is.

One important issue to bear in mind is that any cluster algorithm has the potential to impose a structure on a data set that might not reflect the actual cluster structure present in the data. For example, as a result of the distance measure (Equation (3)) used in the above-described algorithm, it primarily seeks rotationally symmetric clusters, whether or not they exist in the data under analysis. If the clusters possess rotational asymmetry and are unfavorably oriented relative to each other, the algorithm may be incapable of ever producing the right answers, and the resulting validity measures would not be of much use. (If the placement of non-circular clusters relative to each other is not critical, which is the most commonly encountered case, the algorithm as it is can still perform very well in isolating these clusters [19].) However, the authors have proposed a distance measure that is capable of detecting rotationally asymmetric clusters of spherical data. Discussion of this measure lies outside the scope of this paper, but a comprehensive exposé on it is given in Ref. [19]. It suffices to mention, however, that the cluster validity measures that will be proposed in this section of the current paper for spherical data, and which do not themselves suffer from the above-described limitations, can be used in conjunction with the methods discussed in Ref. [19] for finding ill-placed non-circular clusters.

Oftentimes the data analyst uses a clustering algorithm to acquire information and knowledge on the structure existing in the data set [11]. It is therefore imperative that the algorithm provides the analyst with performance measures on the "validity" of the answers that are obtained from running the algorithm on data. Cluster validity is the section of cluster analysis that attempts to deal with the issues of validating clustering results.

A number of validity measures have been proposed in the literature in an attempt to establish indices of cluster validity, but in this paper we shall restrict ourselves to indices for fuzzy cluster algorithms. This is because the algorithm under consideration belongs to this group of classification schemes, and because these performance measures have been widely accepted among researchers of fuzzy clustering.

Gath and Geva [21] have proposed indices that are based on the "crispness" of resulting clusters, their hypervolume, and their densities. Their idea is that the clusters determined should be as little fuzzy as possible, should possess low hypervolumes (the term hypervolume is used because the variable space is not necessarily 3-dimensional), and should be of the highest densities. Although their original performance measures were designed for a fuzzy implementation of maximum likelihood estimation, they can be generalized to apply to other algorithms in the family of fuzzy K-means methods.

In R^P space the fuzzy covariance matrix [27] is defined as

$$\mathbf{F}_i=\frac{\displaystyle\sum_{j=1}^{N}(u_{ij})^m(\mathbf{X}_j-\mathbf{V}_i)(\mathbf{X}_j-\mathbf{V}_i)^{\mathrm{T}}}{\displaystyle\sum_{j=1}^{N}(u_{ij})^m}\tag{9}$$

The fuzzy hypervolume can then be defined as

$$F_{HV}=\sum_{i=1}^{K}\bigl[\det(\mathbf{F}_i)\bigr]^{1/2}\tag{10}$$

The average partition density of the clustering is computed using the formula

$$D_{PA}=\frac{1}{K}\sum_{i=1}^{K}\frac{S_i}{\bigl[\det(\mathbf{F}_i)\bigr]^{1/2}}\tag{11}$$

where S_i is known as the "sum of central members", and is calculated as

$$S_i=\sum_{j=1}^{N}u_{ij},\qquad\forall\,\mathbf{X}_j\in\bigl\{\mathbf{X}_j:(\mathbf{X}_j-\mathbf{V}_i)^{\mathrm{T}}\mathbf{F}_i^{-1}(\mathbf{X}_j-\mathbf{V}_i)<1\bigr\}\tag{12}$$

The partition density, which is representative of the general density of the clustering, is defined by Gath and Geva [21] as

$$P_D=\frac{S}{F_{HV}}\tag{13}$$

where

$$S=\sum_{i=1}^{K}\sum_{j=1}^{N}u_{ij},\qquad\forall\,\mathbf{X}_j\in\bigl\{\mathbf{X}_j:(\mathbf{X}_j-\mathbf{V}_i)^{\mathrm{T}}\mathbf{F}_i^{-1}(\mathbf{X}_j-\mathbf{V}_i)<1\bigr\}\tag{14}$$

The optimal partitioning based on these criteria is that for which the hypervolume is minimal and the density measures are maximal.

For orientation data, the hypervolume and "sum of central members" used in computing the other performance measures of Ref. [21] cannot be used in the form in which they are originally given. Analogs of these cluster validity measures specific to the clustering of spherical data (or for determining the contribution of the spherical components of mixed-type vectors) have to be proposed. The eigenvalues obtained from the spectral decomposition of the orientation matrix, S*, are of great significance in the derivation of the hypervolume and "central members" for spherical data. Although a full discussion of the derivations of the spherical analogs of the fuzzy hypervolume and "sum of central members" lies outside the
scope of the current work, an attempt shall be made to explain the rationale behind the derived equations.

First, we shall reacquaint ourselves with the fact that the spectral decomposition of S* for a cluster identifies three orthogonal axes, ξ_1, ξ_2 and ξ_3, and their corresponding eigenvalues, t_1, t_2 and t_3. As is already known from the discussion of the calculation of prototypes, ξ_3 is the mean of a cluster of points. The other two vectors coincide with the minor and major principal axes of the elliptical distribution of points generated when the observations in the cluster are projected onto a plane perpendicular to the mean. The eigenvalues can be expressed as (see Appendix A):

$$t_t=\boldsymbol{\xi}_t^{\mathrm{T}}\mathbf{S}^{*}\boldsymbol{\xi}_t=\sum_{j=1}^{N}(u_{ij})^m(\mathbf{X}_j\cdot\boldsymbol{\xi}_t)^2,\qquad t=1,\,2\text{ or }3\tag{15}$$

From the expression Equation (15) it can be seen that the eigenvalues are measures of the spread of the projections of the vectors in the principal directions. The eigenvalues t_1 and t_2 provide information on cluster shapes. If the distribution of vectors in a cluster has rotational symmetry, i.e. if its shape is circular or near circular, then t_1 ≈ t_2, or t_2/t_1 ≈ 1. The greater the ratio t_2/t_1 is, the greater the departure of the cluster shape from a circle (the eccentricity of the resulting elliptical shape increases). Also, the eigenvalues represent a measure of the deviation of the vectors from the centroid of a cluster in the directions of the principal axes. When a vector exactly coincides with the prototype of a cluster, its projections onto principal directions 1 and 2 are zero. Large deviations from the cluster mean result in larger projections on either or both of these two directions. This, precisely, is the idea captured by the variance of a set of observations in Euclidean (R^P) …

Using the understanding of what the eigenvalues are, the "sums of central members" for spherical data can be defined as:

$$S_i=\sum_{j=1}^{N}u_{ij},\qquad\forall\,\mathbf{X}_j\in\Bigl\{\mathbf{X}_j:\sum_{t=1}^{2}\frac{(\mathbf{X}_j\cdot\boldsymbol{\xi}_{it})^2}{s_t^2}<1\Bigr\}\tag{18}$$

The spherical analogue of the numerator of Equation (13) for computing the partition density becomes:

$$S=\sum_{i=1}^{K}\sum_{j=1}^{N}u_{ij},\qquad\forall\,\mathbf{X}_j\in\Bigl\{\mathbf{X}_j:\sum_{t=1}^{2}\frac{(\mathbf{X}_j\cdot\boldsymbol{\xi}_{it})^2}{s_t^2}<1\Bigr\}\tag{19}$$

When mixed-type data is being analyzed (when extra data columns of information on discontinuities are included in an analysis), the cluster performance measures become composites of those measured for spherical data and for Euclidean data. To facilitate the definition of the composite measures, the vector X_j representing the jth data point will conceptually be divided into two components, such that it can be represented as X_j = (X_j^(s), X_j^(e)). The superscripts (s) and (e) refer to the spherical component (direction cosines) and Euclidean component (extra data), respectively. The hypervolume for clusters then becomes the product of the hypervolumes of the spherical and Euclidean constituents, i.e.:

$$F_i=F_i^{(s)}F_i^{(e)}\tag{20}$$

where F_i^(s) is the hypervolume determined from Equation (17), and F_i^(e) that obtained from Equation (9). The hybrid condition for establishing "central members" assumes the form:

$$\sum_{t=1}^{2}\frac{(\mathbf{X}_j^{(s)}\cdot\boldsymbol{\xi}_{it})^2}{s_t^2}+(\mathbf{X}_j^{(e)}-\mathbf{V}_i^{(e)})^{\mathrm{T}}(\mathbf{F}_i^{(e)})^{-1}(\mathbf{X}_j^{(e)}-\mathbf{V}_i^{(e)})<1\tag{21}$$

Fig. 2. Pole and contour plots of observations in the Example 1 data set. (a) Poles of discontinuities in the first data set. (b) Contouring results for the poles.
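The eigenvalue-based shape diagnostics described above can be sketched as follows. This is an illustrative sketch, assuming the cluster's memberships and unit normals are already available; the fuzzifier m = 2 is an assumption, not a value prescribed by the paper:

```python
import numpy as np

def cluster_shape_diagnostics(X, u, m=2.0):
    """Spectral decomposition of the weighted orientation matrix S*
    (Equation (7)).  Returns the cluster mean (Equation (8)) and the
    eigenvalue ratio t2/t1 of Equation (15), which stays near 1 for a
    rotationally symmetric cluster and grows as the cluster elongates.

    X : (N, 3) array of unit normals
    u : (N,) degrees of membership of the observations in this cluster
    """
    w = u ** m
    S_star = (X * w[:, None]).T @ X   # S* = sum_j w_j x_j x_j^T, symmetric 3x3
    t, xi = np.linalg.eigh(S_star)    # eigenvalues ascending: t[0] <= t[1] <= t[2]
    mean = xi[:, 2]                   # eigenvector of the largest eigenvalue
    elongation = t[1] / t[0]          # the t2/t1 shape ratio
    return mean, elongation
```

For an isotropic cap of poles the ratio stays near 1; a girdle-like (elongated) cluster drives it up, signalling that the rotationally symmetric distance of Equation (3) may distort the partition. The sign of the returned mean is arbitrary (v and −v describe the same pole), which is exactly why the sign-invariant distance of Equation (3) is used.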
ation of the data set to the number of observations. The minimum distance between the prototypes of the clusters is known as the separation [28]. When a good partitioning of the data set is obtained, the numerator of the index is small, because the membership degree, u_{ij}, is high whenever \|X_j - V_i\|^2 assumes a small value and is small whenever \|X_j - V_i\|^2 becomes large. In the case where the clusters obtained are well separated, the minimum distance between the cluster centres is relatively high. It is therefore taken that small values of \nu_{XB} indicate a better clustering of the data set than larger values.

The Fukuyama–Sugeno validity functional is calculated from the formula

FS(U, V; X) = \sum_{i=1}^{K} \sum_{j=1}^{N} u_{ij}^{m} ( \|X_j - V_i\|^2 - \|V_i - \bar{V}\|^2 ) = J_m - K_m,   (23)

where \bar{V} is the geometric mean of the cluster centroids. J_m in the above equation is the fuzzy objective function defined in Equation (1) and

K_m = \sum_{i=1}^{K} \sum_{j=1}^{N} u_{ij}^{m} \|V_i - \bar{V}\|^2.

The physical interpretation of the Fukuyama–Sugeno validity index is generally not very clear [20]. However, since the objective function, J_m, decreases monotonically for increasing K (the number of clusters), the function K_m may be interpreted as a cost term intended to penalize the use of increasing values of K to minimize the objective function. Fukuyama and Sugeno propose that small values (minima) of \nu_{FS} attest to good partitionings of the data set into clusters. \nu_{FS} has the tendency to decrease monotonically with increasing K.

In order to use the Xie–Beni and Fukuyama–Sugeno validity indices for establishing the optimal partitioning of directional data, the sine-squared norm 1 - (X_j \cdot V_i)^2 has to be used in place of the Euclidean metric \|X_j - V_i\|^2.

SAMPLE APPLICATIONS OF FUZZY K-MEANS ALGORITHM

The algorithm was run on two sample data sets. The first example is a demonstration of the use of the fuzzy K-means clustering algorithm for delineating a sample joint survey data set into distinct clusters based on orientation data.

The second example illustrates how the fuzzy clustering algorithm facilitates the automatic classification of joints into sets based not only on orientation information, but also when an extra data column is present. The extra data column could be any numeric property of discontinuities such as trace length, spacing, discontinuity frequency, or even joint roughness coefficient (JRC). In Section 6.2, the possible misclassification of discontinuities when delineation is founded solely on discontinuity orientation is contrasted with the case when the inclusion of additional data on discontinuities significantly improves the quality of cluster separation.

For both examples, it is assumed that all samples in the data set come from one structural domain (a zone of a rock mass homogeneous or uniform in properties). If the discontinuities being analyzed are not from a single structural domain, then they must be divided into homogeneous sub-groups and each of them analyzed separately. Failure to do so introduces error into the analysis of discontinuity data [2]. Hoek, Kaiser and Bawden in Ref. [6] recommend that the number of discontinuities measured in any structural domain be not less than 100.

Example 1

The first data set consists of 195 rock joints (this data set is saved under the name Exampmin.dip in the EXAMPLES subdirectory of the DIPS program). For purposes of illustration, only orientation information
898 R. E. HAMMAH and J. H. CURRAN: AUTOMATIC IDENTIFICATION OF JOINT SETS
was used in the analysis. Figure 2 contains illustrations of a plot of the poles of the discontinuities in data set 1 and also a contour plot of these normals. The contours on the stereogram in Fig. 2(b) and on subsequent stereographic plots in this paper are contours of equal probability (in percent) of pole occurrence [25]. On the contour plot it can be seen that there are 4 possible fracture sets.

The data set was analyzed for values of K between 2 and 7. For each number of clusters, K, the algorithm was run multiple times and performance values averaged. The values of the validity indices are displayed in Table 1 for the various numbers of clusters, K. In the table, values of the criteria that correspond to the optimal number of clusters, as determined by each of the criteria, have been highlighted. The results in Table 1 are also shown graphically in Fig. 3. Other than the Xie–Beni index, all the other validity indices indicate the optimal number of fracture sets to be 4. The fact that one index did not identify K = 4 as the optimal partition underlines the necessity of using more than one performance measure. No single cluster validity index
Fig. 3. Graphs of cluster validity indices for data set in Example 1. (a) Plot of fuzzy hypervolume, FHV, against number of clusters, K. Minimum is seen at K = 4. (b) Plot of partition density, PD, against number of clusters, K. Maximum is seen at K = 4. (c) Plot of average partition density, DPA, against number of clusters, K. Maximum is seen at K = 4. (d) Plot of Xie–Beni index, \nu_{XB}, against number of clusters, K. Minimum is seen at K = 3. Note that the value of the index for K = 4 is second best. (e) Plot of Fukuyama–Sugeno index, \nu_{FS}, against number of clusters, K. Minimum is seen at K = 4.
Fig. 4. Contour plots of poles of resulting clusters after fuzzy cluster analysis of first data set. (a) Contour plot of poles of discontinuities assigned to Set 1. (b) Contour plot of poles of discontinuities assigned to Set 2. (c) Contour plot of poles of discontinuities assigned to Set 3. (d) Contour plot of poles of discontinuities assigned to Set 4.
Table 3. Performance measures for Example 2 for number of clusters, K = 2 to 7, for orientation data only

K   Fuzzy hypervolume   Partition density   Average partition density   Xie–Beni    Fukuyama–Sugeno
2   0.172562            473.4203            512.3597                    0.147898    −43.7507
3   0.120048            566.159             608.8765                    0.075526    −56.0455
4   0.119768            669.8643            724.988                     0.368519    −48.8921
5   0.142071            561.2366            651.3806                    0.363421    −45.5348
6   0.132959            589.6222            647.515                     0.386207    −45.3712
7   0.140767            539.7078            618.295                     0.32375     −43.3776
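The selection rule behind the table can be mechanized: each index is scanned over K and its extremum (minimum or maximum, as appropriate for that index) marks the suggested number of clusters. A small sketch using the Table 3 values, with a hypothetical helper `best_k` that is not part of the paper:

```python
# Validity-index values from Table 3 (Example 2, orientation data only),
# listed for K = 2..7, paired with the rule used to pick the optimum.
ks = [2, 3, 4, 5, 6, 7]
table3 = {
    "fuzzy hypervolume":         ([0.172562, 0.120048, 0.119768, 0.142071, 0.132959, 0.140767], min),
    "partition density":         ([473.4203, 566.159, 669.8643, 561.2366, 589.6222, 539.7078], max),
    "average partition density": ([512.3597, 608.8765, 724.988, 651.3806, 647.515, 618.295], max),
    "Xie-Beni":                  ([0.147898, 0.075526, 0.368519, 0.363421, 0.386207, 0.32375], min),
    "Fukuyama-Sugeno":           ([-43.7507, -56.0455, -48.8921, -45.5348, -45.3712, -43.3776], min),
}

def best_k(values, choose):
    # Return the K at which the index attains its extremum.
    return ks[values.index(choose(values))]

suggestions = {name: best_k(v, rule) for name, (v, rule) in table3.items()}
```

For these orientation-only data the indices disagree: the density-based measures favour K = 4, while Xie–Beni and Fukuyama–Sugeno favour K = 3, which is the ambiguity the Fig. 6 plots illustrate.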
Fig. 6. Graphs of cluster validity indices for data set in Example 2 when analysis is performed without extra data column. (a) Plot of fuzzy hypervolume, FHV, against number of clusters, K. Values of hypervolume are practically equal when K = 3 or K = 4. (b) Plot of partition density, PD, against number of clusters, K. Maximum is seen at K = 4. (c) Plot of average partition density, DPA, against number of clusters, K. Maximum is seen at K = 4. (d) Plot of Xie–Beni index, \nu_{XB}, against number of clusters, K. Minimum is seen at K = 3. (e) Plot of Fukuyama–Sugeno index, \nu_{FS}, against number of clusters, K. Minimum is seen at K = 3.
properly delineate these two clusters leads to biased results for the distributions of the extra data column for these joint sets. (Also, some skewness in the distributions of the orientations for these two discontinuity sets arises.) The average value of the non-orientation characteristic determined for the two superimposed joint sets (sets 2 and 3), if the K = 3 partitioning is chosen, is 9.86 when clustering of the data is based on orientation information only.

Performing the analysis of the data set anew, but this time with the inclusion of the extra data column, leads to the correct recovery of the true structure of the data set (the assignment of observations to clusters is nearly identical to the original structure of the simulated data set). As depicted by the graphs in Fig. 7 and the results in Table 4, all the validity measures select the partitioning of the data into 4 clusters as the best clustering of the observations. The contour plots in Fig. 8 show the resulting 4 clusters. A comparison of Fig. 8(b) and (c) reveals the overlap of joint sets 2 and 3 in orientation. However, the algorithm correctly assigned the observations to their appropriate clusters
Fig. 7. Plots of performance measures as functions of number of clusters, K, for data set 2, when cluster analysis is performed with extra data column included. (a) Plot of fuzzy hypervolume, FHV, against number of clusters, K. Clearer minimum is attained when K = 4. (b) Plot of partition density, PD, against number of clusters, K. Maximum is observed at K = 4. (c) Plot of average partition density, DPA, against number of clusters, K. Maximum is observed at K = 4. (d) Plot of Xie–Beni index, \nu_{XB}, against number of clusters, K. Minimum occurs at K = 4. (e) Plot of Fukuyama–Sugeno index, \nu_{FS}, against number of clusters, K. Minimum occurs at K = 4.
as a result of the information contributed by the additional data column.

It must be noted that the contours of the plots in Fig. 8 do not exactly match the contours of the initial data set in Fig. 5. The differences in contouring can be attributed to the fixed number of grid points used for contouring poles in the program DIPS. An analogous situation arises in the drawing of histograms for Euclidean data. When observations from a number of statistical distributions are grouped together into one data set, histograms of the observations after they have been separated into clusters would not exactly match the histogram of the grouped observations, if the number of bins for all histogram plots is kept constant.
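The binning effect described above is easy to reproduce. If each cluster is histogrammed with its own ten bins, the per-cluster counts cannot be stacked back into the histogram of the pooled data, because the bin edges differ; sharing one set of edges makes the counts add up exactly. A minimal sketch with synthetic data (not the paper's data set):

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.normal(2.0, 0.5, 300)   # first "cluster"
b = rng.normal(8.0, 1.0, 300)   # second "cluster"
pooled = np.concatenate([a, b])

# Same *number* of bins, but edges computed from each data set's own range:
ca, _ = np.histogram(a, bins=10)
cb, _ = np.histogram(b, bins=10)
cp, edges = np.histogram(pooled, bins=10)

# With shared edges, the cluster histograms sum exactly to the pooled one.
ca_shared, _ = np.histogram(a, bins=edges)
cb_shared, _ = np.histogram(b, bins=edges)
exact = np.array_equal(ca_shared + cb_shared, cp)
```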
Table 4. Performance measures for Example 2 for number of clusters, K = 2 to 7, when extra data column is included in analysis

K   Fuzzy hypervolume   Partition density   Average partition density   Xie–Beni     Fukuyama–Sugeno
2   0.149958            517.4812            520.15858                   0.13512      −41.0821
3   0.075335            1142.174            1316.936                    0.127685     −140.8
4   0.027801            2640.082            2974.768                    0.0509083    −209.137
5   0.034258            2034.298            2496.128                    0.525081     −200.292
6   0.036405            1946.632            2224.118                    0.917034     −170.757
7   0.036514            1871.21             2162.864                    0.861568     −153.365
In the case where the extra data column is included in the analysis, the average value for this discontinuity characteristic for joint set 2 is 18.1, and 1.61 for joint set 3! These values differ greatly from the averaged value (9.86) obtained when the cluster analysis of the data set was performed without the extra data column. Use of the averaged value of the non-orientation discontinuity attribute in engineering calculations could have serious repercussions. This example is only a simple illustration of the extent to which answers can be biased when information pertinent to the identification of clusters is omitted from analysis. At this stage it must be noted that key information for correctly separating clusters might reside in some geological data such as genetic type. In such a case, the appropriate conversion of the geological data into quantitative form, so that it could be incorporated into
Fig. 8. Contour plots of poles of clusters obtained from analysis of second data set, with extra data column included. (a)
Contour plot of poles of discontinuities assigned to Set 1. (b) Contour plot of poles of discontinuities assigned to Set 2. (c)
Contour plot of poles of discontinuities assigned to Set 3. (d) Contour plot of poles of discontinuities assigned to Set 4.
a subsequent cluster analysis, would be greatly beneficial.

and most certainly to enhancements, by the research community, of its various components.
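The distortion discussed above can be reproduced in a line of arithmetic: when two populations with very different means (18.1 and 1.61, as found for joint sets 2 and 3) are merged into a single cluster of roughly equal parts, the resulting average sits near the reported 9.86, a value representative of neither set. A sketch assuming hypothetical equal-sized samples:

```python
# Per-set averages of the extra discontinuity attribute, from the analysis.
set2_mean, set3_mean = 18.1, 1.61

# If the two overlapping sets are lumped into one cluster of equal parts,
# the pooled average is close to the biased 9.86 reported for K = 3.
merged_mean = (set2_mean + set3_mean) / 2
```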
However, before proceeding with the necessary derivations, we shall briefly look at the definition of eigenvalues and eigenvectors. If A is an n \times n matrix, and

A X = \omega X,   (A.2)

then there exist choices for the scalar quantity \omega, called eigenvalues of A, that produce non-zero solutions X, called eigenvectors of A. Also, the eigenvalues \omega can be expressed in terms of X and A as:

\omega = X^T A X.   (A.3)

From the theory of the eigenanalysis of square matrices, it is established that there exist n eigenvalues, \omega_p (p = 1, 2, ..., n), of the matrix A, with n corresponding eigenvectors, x_p. The eigenvalues are arranged in order of magnitude so that \omega_1 \le \omega_2 \le ... \le \omega_n. If A is symmetric, then all its eigenvalues are real numbers.

For the minimization of J_m(U, V) with respect to cluster centroids, the degrees of membership of observations, the u_{ij}, of the objective function are fixed while the cluster prototypes, the V_i, are variables. For greater clarity, let each directional data observation and cluster centroid expressed in direction cosines be represented as:

X_j = (x_j, y_j, z_j)^T

and

V_i = (l_i, m_i, n_i)^T

respectively, with the constraint that l_i^2 + m_i^2 + n_i^2 = 1.

In that case, the distance measured between observation X_j and prototype V_i for the fuzzy algorithm becomes:

d^2(X_j, V_i) = 1 - (X_j \cdot V_i)^2 = 1 - (x_j l_i + y_j m_i + z_j n_i)^2.   (A.4)

Rewriting the objective function using the new notations and expressions from above, we obtain:

J_m(U, V) = \sum_{i=1}^{K} \sum_{j=1}^{N} u_{ij}^{m} \{ 1 - (x_j l_i + y_j m_i + z_j n_i)^2 \}   (A.5)

or

J_m(U, V) = \sum_{i=1}^{K} \sum_{j=1}^{N} u_{ij}^{m} - \sum_{i=1}^{K} \sum_{j=1}^{N} u_{ij}^{m} (x_j l_i + y_j m_i + z_j n_i)^2.   (A.6)

Upon close examination of Equation (A.6) it can be seen that the extrema of J_m occur only when the second term on its right-hand side, \sum_{i=1}^{K} \sum_{j=1}^{N} u_{ij}^{m} (x_j l_i + y_j m_i + z_j n_i)^2, assumes an extremal value. The problem then of optimizing the fuzzy objective function reduces to that of solving the constrained optimization problem for the second term.

The solution of the constrained optimization problem can be obtained using the method of Lagrange multipliers. For each ith term of the expression to be optimized, let the Lagrangian be:

The stationary points of F_i(\omega, V_i) are attained only when its partial derivatives are equal to zero. Equating each of Equations (A.8), (A.9) and (A.10) to zero results in the system of equations:

\sum_{j=1}^{N} u_{ij}^{m} (x_j^2 l_i + x_j y_j m_i + x_j z_j n_i) = \omega l_i
\sum_{j=1}^{N} u_{ij}^{m} (x_j y_j l_i + y_j^2 m_i + y_j z_j n_i) = \omega m_i   (A.11)
\sum_{j=1}^{N} u_{ij}^{m} (x_j z_j l_i + y_j z_j m_i + z_j^2 n_i) = \omega n_i

which can be rewritten in matrix form as:

[ \sum_j u_{ij}^m x_j^2     \sum_j u_{ij}^m x_j y_j   \sum_j u_{ij}^m x_j z_j ] [ l_i ]            [ l_i ]
[ \sum_j u_{ij}^m x_j y_j   \sum_j u_{ij}^m y_j^2     \sum_j u_{ij}^m y_j z_j ] [ m_i ]  = \omega  [ m_i ]   (A.12)
[ \sum_j u_{ij}^m x_j z_j   \sum_j u_{ij}^m y_j z_j   \sum_j u_{ij}^m z_j^2   ] [ n_i ]            [ n_i ]

Comparing Equation (A.12) with Equation (A.2), it can be seen that \omega and

V_i = (l_i, m_i, n_i)^T

are respectively the eigenvalues and eigenvectors of the symmetric matrix:

S* = [ \sum_j u_{ij}^m x_j^2     \sum_j u_{ij}^m x_j y_j   \sum_j u_{ij}^m x_j z_j ]
     [ \sum_j u_{ij}^m x_j y_j   \sum_j u_{ij}^m y_j^2     \sum_j u_{ij}^m y_j z_j ]   (A.13)
     [ \sum_j u_{ij}^m x_j z_j   \sum_j u_{ij}^m y_j z_j   \sum_j u_{ij}^m z_j^2   ]

Using the relationship of eigenvalues to eigenvectors given by Equation (A.3), it is possible to establish that:

\omega = \sum_{j=1}^{N} u_{ij}^{m} (x_j l_i + y_j m_i + z_j n_i)^2.   (A.14)

Therefore, from Equation (A.6), the fuzzy objective function, J_m(U, V), is minimized when the eigenvalue is largest, i.e. when \omega = \omega_3. As a result, the eigenvector, x_3, that corresponds to the largest eigenvalue is the mean vector of a fuzzy cluster of directional (spherical) data points, since it is the vector that minimizes the fuzzy objective function.
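The appendix result translates directly into code: form the membership-weighted orientation matrix S* of Equation (A.13) and take the eigenvector associated with its largest eigenvalue as the cluster's mean direction. A sketch for a single cluster, using NumPy for the eigen-decomposition (the helper name is illustrative, not from the paper):

```python
import numpy as np

def fuzzy_cluster_mean(X, u, m=2.0):
    """Mean direction of a fuzzy cluster of unit vectors.

    X : (N, 3) array of direction cosines (unit vectors).
    u : (N,) memberships of the observations in this cluster.
    Returns the eigenvector of S* with the largest eigenvalue, which
    minimizes the fuzzy objective under the sine-squared norm (up to sign).
    """
    w = u ** m
    # S* = sum_j u_ij^m * X_j X_j^T, the weighted orientation matrix (A.13).
    S = (w[:, None, None] * X[:, :, None] * X[:, None, :]).sum(axis=0)
    evals, evecs = np.linalg.eigh(S)   # eigenvalues in ascending order
    return evecs[:, -1]                # eigenvector of the largest eigenvalue

# Poles scattered about the z-axis recover approximately (0, 0, +/-1).
X = np.array([[0.1, 0.0, 1.0], [-0.1, 0.0, 1.0],
              [0.0, 0.1, 1.0], [0.0, -0.1, 1.0]])
X /= np.linalg.norm(X, axis=1, keepdims=True)
mean_dir = fuzzy_cluster_mean(X, np.ones(4))
```

Note the sign ambiguity inherent to eigenvectors: a pole and its antipode represent the same plane, so either sign of the returned vector is an acceptable mean direction.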