
Int. J. Rock Mech. Min. Sci. Vol. 35, No. 7, pp. 889–905, 1998
© 1998 Elsevier Science Ltd. All rights reserved
Printed in Great Britain
0148-9062/98 $19.00 + 0.00

PII: S0148-9062(98)00011-4

Fuzzy Cluster Algorithm for the Automatic Identification of Joint Sets

R. E. HAMMAH†
J. H. CURRAN†
The task of identifying and isolating the joint sets, or subgroups of discontinuities, existing in data collected from joint surveys is not trivial and is fundamental to rock engineering design. Traditional methods for carrying out the task have mostly been based on the analysis of plots of the discontinuity orientations or their clustering. However, they suffer from their inability to incorporate the extra data columns collected, and they also lack objectivity. This paper proposes a fuzzy K-means algorithm, which has the capability of using the extra information on discontinuities, as well as their orientations, in exploratory data analysis. Apart from taking into account the hybrid nature of the information gathered on joints (orientation and non-orientation information), the new algorithm also makes no a priori assumptions as to the number of joint sets available. It provides validity indices (performance measures) for assessing the optimal delineation of the data set into fracture subgroups. The proposed algorithm was tested on two simulated data sets in the paper. In the first example, the data set demanded the analysis of discontinuity orientation only, and the algorithm identified both the number of joint sets present and their proper partitioning. In the second example, additional information on joint roughness was necessary to recover the true structure of the data set. The algorithm was able to converge on the correct solution when the extra information was included in the analysis. © 1998 Elsevier Science Ltd. All rights reserved

INTRODUCTION

When underground or surface excavations are made in rock masses, the behavior of the surrounding rock material can be greatly influenced by the presence of discontinuities [1, 2]. Various modes of rock slope and wedge failure can be attributed to the existence of fractures in a rock mass [3]. Discontinuities also play a crucial role in the classification of rock masses [4]. It is therefore of paramount importance, in both civil engineering and mining applications, to carefully collect and analyze data on the fractures present in a rock medium. The results of the analysis of joint survey data are subsequently used in the design of excavations in rock [1–3].

Methods for taking the measurements of the characteristics of discontinuities can be divided into two broad classes: borehole sampling methods and the mapping of exposed rock surfaces. Borehole sampling methods involve the extraction of rock cores from subsurface regions by means of diamond drilling. The cores recovered can, among other things, be analyzed for discontinuity orientation, surface geometry, etc. In addition, the walls of boreholes can be examined with remote cameras or television equipment [1, 3]. Boreholes may also be used to provide access for geophysical equipment in order to gather data on the structure of rock masses [1, 3].

When discontinuities intersect rock surfaces, linear traces of these discontinuities arise. Consequently, the mapping of exposed rock surfaces allows direct measurements of discontinuity properties to be made [3]. Exposed rock surfaces can be mapped in systematic fashion using either scanline or window sampling. In scanline sampling, a scanline (usually a tape of length between 2 and 30 m) [3] is stretched along an exposed planar or near-planar surface. Discontinuity traces that intersect the scanline are recorded and measurements of their properties are made. Window or scanarea sampling is very similar to scanline sampling, with the exception that the properties of a discontinuity are recorded if a section of its trace falls within a defined area of the exposed rock surface being mapped. More comprehensive discussions of the various sampling methods can be found in Refs [1–3].

†Rock Engineering Group, Department of Civil Engineering, University of Toronto, Toronto, Ont., Canada M5S 1A4.

On the basis of geological genetic criteria, discontinuities encountered in rock masses can be classified into faults, bedding planes, joints, fractures and cleavages. The existence of the different types of discontinuities can be attributed to a range of geological processes that might have occurred in the rock mass being mapped. For example, joints are formed as a result of cooling in igneous rock, shrinkage due to drying in sedimentary rocks, and, for all rock types, can be attributed to tectonic stresses. Detailed accounts of discontinuity types can be found in Ref. [3]. Discontinuities generally occur in sets (sub-parallel planar groups) and tend to have a pattern. As geologists or geotechnical engineers map rock surfaces, they record observed geological relations between groups of discontinuities, and information on their geological genetic types. Therefore, the information provided by geologists or geotechnical engineers about the discontinuities encountered during mapping is of great importance to the analysis of joint survey data.

A typical joint survey data set consists of information on the orientation of the discontinuities encountered, the types of discontinuities, the roughness of their surfaces, persistence, spacing and other attributes deemed important by the recording geologist or geotechnical engineer. Some of the characteristics recorded during the mapping of discontinuities, such as orientation and spacing, are quantitative in nature, while others, like color and geological genetic type, are qualitative variables. Based on the similarity across variables, the discontinuities are then separated into subgroups or sets.

Tools for analyzing survey data and delineating them into subgroups or clusters are mainly based on contouring, on stereographic projections, of the discontinuity poles (normals to the planes of discontinuities). Computer software packages exist today that make the plotting and contouring of pole information on stereograms a near trivial issue. (One such package is DIPS, a program developed by the Rock Engineering Group of the University of Toronto [5]. It was used for displaying and contouring all the pole data in this paper.) The data analyst thereafter proceeds to identify the different fracture sets revealed on the plots.

The above-described process has two principal weaknesses. The first deficiency is that all the other information, some or all of which may also assist in delineating the discontinuities into sets, has been excluded from the analysis. Also, there exists the possibility that some joint sets may be very similar in orientation, but greatly differ in some other attributes, say, joint roughness, spacing or geologic genetic type. For example, the orientation of a fault may be close to that of a set of joints, but it cannot be grouped with the joints, because it is a major discontinuity with properties very different from those of joints. A separation based only on orientations may therefore lead to significant error. Such errors can be sources of severe problems during the construction of structures in rock masses, or during the course of their use. This disadvantage can, in principle, be overcome by sorting the survey information on a spreadsheet in addition to the contouring [6]. However, this approach can be very tedious if the data set is sufficiently large and/or the number of variables numerous, and it can fail to reveal the true structure apparent in the data.

One other disadvantage of the separation of fracture sets using the contouring of pole orientations on stereographic plots is that the process is very subjective. Different data analysts can arrive at very different answers based on their background, experience, and personal biases and inconsistencies [3]. The differences in results are even more pronounced in cases where the boundaries between clusters are unclear [3].

Past attempts at tackling the problem of subjectivity have tried to use cluster analysis. Shanley and Mahtab [7] were the first to propose an algorithm which provided some objectivity to the process. That algorithm was further enhanced by Shanley and Yegulalp [8]. However, this algorithm still lacks the capability of including the recorded additional data in the analysis. Dershowitz et al. [9] describe a stochastic algorithm for clustering discontinuities which can handle the extra information provided. The method is based on defining probability distributions for each of the fracture characteristics, and involves the integration of these probability distribution functions. Numerical integration usually is very computationally intensive. Also, the method requires knowledge of the number of discontinuity sets in the data under analysis.

Statistical tools such as discriminant analysis, regression, decision analysis and cluster analysis exist for the exploratory analysis of data. More recently, artificial intelligence techniques such as neural networks have also been developed for the purposes of analyzing and interpreting data. However, in an environment where data has to be classified into homogeneous groups of objects in the absence of a priori information on the groups, cluster analysis is the tool most suitable for application [10, 11]. Techniques for clustering data abound in the literature, and they can be separated into two broad categories: hierarchical cluster algorithms and partitional cluster methods. Hierarchical clustering techniques generate nested clusters of data, i.e. some clusters may be embedded in others, and as a result different numbers of clusters can be obtained based on the level at which they are observed. Partitional algorithms generate a unitary partitioning of the data into homogeneous groups, with all of them being at the same level. The authors are of the belief that, of the major types of clustering algorithms, a K-means partitional algorithm would best suit the purposes of the rock mechanics expert. This can be attributed to the nature of the classification problem in the exploratory analysis of discontinuity data for rock mechanics purposes, and to the fact that K-means algorithms are less susceptible to outliers than hierarchical schemes [12].

On the basis of the nature of the boundaries between the clusters identified, partitional algorithms can be divided into "hard" or "crisp" algorithms and "soft" classification schemes. In "hard" clustering an observation is assigned to one or the other cluster. However, this "all or nothing" classification does not adequately represent reality, because it is often the case that a data object in one cluster may bear similarity in some characteristics to objects in a different cluster [13]. Soft cluster algorithms, on the other hand, assign to every observation degrees of membership to the clusters in a data set. These membership degrees often assume real values between zero and one. When observations belong to a cluster, they tend to have high degrees of membership to that cluster.

The cluster algorithm proposed in this paper belongs to this class of cluster techniques. It is founded on one of the most widely used and successful classes of cluster analysis, the fuzzy K-means algorithm [13–15]. Fuzzy cluster algorithms have generally been developed and used in solving problems of computer vision and medical imaging, and are gaining popularity in other areas [11, 13, 14]. In general, it is believed that fuzzy clustering techniques yield results superior to those of hard cluster methods [13]. This can be partially attributed to the fact that no observation is fully assigned to any cluster in any iteration of the cluster process. Also, the soft boundaries of fuzzy algorithms allow them to escape traps in data that can fool hard classification schemes [15].

The fuzzy K-means approach is especially appealing in solving the problem of separating discontinuities into sets, because it explicitly accounts for the uncertainty present in both the collection and the analysis of the survey data. It also provides a computationally attractive alternative to the algorithm proposed by Dershowitz et al. in Ref. [9].

Fuzzy set theory provides the framework within which uncertainty in data can be accounted for in a natural and realistic manner [13]. It was originated by Zadeh [16], who defined a new concept of sets that allowed an object to belong to a set with a degree of membership lying between zero and one. Under fuzzy theory, the greater the certainty that an object belongs to a set, the closer its membership value is to one. Based on this idea of fuzzy sets, Ruspini [17] defined an algorithm for separating data into sets through the minimization of an objective function. Practically all the subsequent work on fuzzy clustering can derive its foundation from Ruspini's work [13].

The earliest work known to the authors on the application of fuzzy K-means clustering to the analysis of discontinuity data was by Harrison [18]. The present work has made a number of modifications to the fuzzy K-means algorithm which are specific to the clustering of orientations. These modifications include a different distance norm, a novel approach to determining the centroids (means) of clusters of directional (spherical) data, and modifications of existing cluster performance measures peculiar to the clustering of discontinuity orientations.

THE FUZZY K-MEANS ALGORITHM

The fuzzy K-means algorithm uses Picard iterations to solve for the minimum of an objective function [13]. Given a data set comprising N observations, each described by a vector of P attributes, X_j = (X_{j1}, X_{j2}, ..., X_{jP}), the algorithm seeks to partition the data set into K subgroups or clusters on the basis of the measured similarities among the vectors (observations) of the data. The algorithm basically searches for regions of high density in the data. The prototype or centroid of each cluster is the vector most representative of the cluster, i.e. it is the geometric mean of all the vectors belonging to that cluster. Cluster prototypes are represented in the K-means algorithm as V_i. The solution to the problem of separating data objects into K subgroups can be achieved by minimizing the fuzzy objective function:

    J_m(U, V) = \sum_{j=1}^{N} \sum_{i=1}^{K} (u_{ij})^m \, d^2(X_j, V_i); \quad K \le N    (1)

The quantity d^2(X_j, V_i) is the distance between observation X_j and the cluster centroid V_i. The distance is a measure of the dissimilarity between the two points. When the two points exactly coincide, the distance between them is zero. It increases as the two objects become more and more dissimilar.

The choice of the distance measure is determined by the space in which the variables lie. In R^P space, the Euclidean norm

    d^2(X_j, V_i) = \sum_{p=1}^{P} (X_{jp} - V_{ip})^2,    (2)

can be used. In order to measure the distance between orientations, they are first converted to unit normals, and thus the space becomes points lying on the surface of a unit sphere (a non-Euclidean space). In this space the sine of the angle between two normals is the measure that represents how far apart they are (the cosine is a similarity and not a dissimilarity measure). The distance norm therefore becomes:

    d^2(X_j, V_i) = 1 - (X_j \cdot V_i)^2,    (3)

where X_j \cdot V_i is the dot product of vectors X_j and V_i. (In Ref. [19] reasons are supplied as to why Equation (3) is a valid distance metric.)

It must be noted that the distance measure used in a cluster algorithm implicitly imposes a topology on the data under analysis [13]. A distance metric causes a fuzzy cluster algorithm to search for clusters that exhibit shapes defined by the metric [13, 18, 19].
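As a concrete illustration of Equation (3), the squared sine-of-angle distance can be computed directly from the dot product of two unit normals. The following Python sketch (the function name and test vectors are illustrative, not from the paper) shows that the measure is unchanged when the sign of either normal flips, which is the required behavior for pole (axial) data:

```python
import numpy as np

def orientation_distance_sq(x, v):
    """Squared sine-of-angle distance between two unit normals,
    d^2 = 1 - (x . v)^2  (Equation (3)).

    Squaring the dot product makes the measure independent of the
    sign of either normal: a pole and its antipode represent the
    same plane and are at distance zero from each other.
    """
    return 1.0 - np.dot(x, v) ** 2
```

For two poles 90 degrees apart the distance attains its maximum of 1, while antipodal poles, which represent the same plane, are at distance 0.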

The metrics described by Equations (2) and (3) tend to favor "spherical" clusters, or clusters that are geometrically homogeneous, i.e. clusters whose shapes do not exhibit preferred directions around their means. For example, in two-dimensional space, elliptical clusters show preferred directions (along their principal axes) whereas circular clusters show no such preferences. There are ways of dealing with this problem, but their discussion is beyond the scope of the current paper (see Ref. [19] for more comprehensive coverage of this topic).

u_{ij} is the degree of membership (measure of belongingness) of observation X_j in cluster i and is dependent on the distance. It is computed from the formula:

    u_{ij} = \left[ \frac{1}{d^2(X_j, V_i)} \right]^{1/(m-1)} \left[ \sum_{k=1}^{K} \left( \frac{1}{d^2(X_j, V_k)} \right)^{1/(m-1)} \right]^{-1}    (4)

U on the left-hand side of Equation (1) is a K x N matrix of all the membership degrees, u_{ij} [20], while V is a vector of the K cluster centroids. These prototypes are computed based on the membership matrix U (details of their determination follow shortly).

The parameter m is known as the degree of fuzzification and is a real number greater than 1. This weighting exponent m controls the "fuzziness" of the memberships. For the same partitioning, the closer m is to 1, the crisper the membership values are, i.e. the membership values will be close to either 0 or 1. As the values of m become progressively higher, the resulting memberships become fuzzier [13]. No theoretical optimal value for m has been determined [13], but m = 2 is believed to be the best for most applications [13] and is the value most commonly used by researchers [13, 14, 20, 21].

For vectors in R^P space, new prototypes for the clusters, which have been defined as the geometric centroids of the clusters, can be calculated using the formula:

    \hat{V}_i = \frac{\sum_{j=1}^{N} (u_{ij})^m X_j}{\sum_{j=1}^{N} (u_{ij})^m}    (5)

Equation (5) simply computes the coordinates of the cluster prototypes as the weighted average of the observations in the data set, the weights being the memberships, u_{ij}, raised to the mth power. (The derivation of Equation (5) can be found in Ref. [13].)

The extent to which observations influence the determination of cluster prototypes is controlled by m, the fuzziness exponent. When Equation (5) is used for computing the locations of cluster centroids, the weights of observations that are far from cluster centroids approach zero more rapidly than those of pattern vectors in the vicinity of cluster prototypes. This allows for the much-reduced influence of outlying points on the determination of centroids [22]. This establishes one of the reasons why the fuzzy K-means algorithm is less susceptible to outliers than some other classification methods.

In order to compute new cluster centroids for points on the surface of a unit sphere, Equation (5), even with normalization (to ensure that the centroids are also unit vectors), cannot be used, since it can lead to completely erroneous prototypes being determined. An example of a case where an erroneous mean can be calculated is considered in Fig. 1. The vectors shown on the diagram are normals to two sub-vertical planes perpendicular to the y–z plane. If Equation (5) (with the weighting factors, u_{ij}, all set equal to 1) is used in determining the mean, the answer obtained is (0.0, -0.0089, 0.4256). When normalized we get the unit vector (0.0, -0.0209, 0.9998), which is the normal to a near-horizontal plane! To arrive at the correct answer, the sign of vector V_1 would have to be reversed before the application of Equation (5). When several vectors are involved, keeping track of signs becomes very cumbersome.

An approach for finding the means of vectors that avoids the need for reversing signs is the eigenanalysis method outlined in Refs [23–25].

Fig. 1. Calculation of the mean of normals to two sub-vertical planes.
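The membership update of Equation (4) can be written compactly in vectorized form. The sketch below is a minimal illustration; the array layout and the guard against zero distances are implementation choices of this example, not part of the paper:

```python
import numpy as np

def memberships(d2, m=2.0):
    """Membership degrees u_ij from squared distances (Equation (4)).

    d2 : (K, N) array with d2[i, j] = d^2(X_j, V_i).
    Returns a (K, N) matrix U whose columns each sum to one.
    """
    d2 = np.maximum(d2, 1e-12)                 # guard: observation coincides with a centroid
    inv = (1.0 / d2) ** (1.0 / (m - 1.0))
    return inv / inv.sum(axis=0)

# A single observation close to centroid 0 (d^2 = 0.01) and far
# from centroid 1 (d^2 = 0.25) receives a high membership in cluster 0.
U = memberships(np.array([[0.01], [0.25]]))
# For m = 2, U[:, 0] is approximately [0.962, 0.038]
```

For m = 2 the inverse distances are simply reciprocals, so the memberships are the normalized values 100/(100 + 4) and 4/(100 + 4).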



Given N unit vectors in the form of direction cosines (x_j, y_j, z_j), where j = 1, ..., N, their mean can be determined in the following manner:

(i) Compute the orientation matrix S using the formula

    S = \begin{bmatrix}
    \sum_{j=1}^{N} x_j x_j & \sum_{j=1}^{N} x_j y_j & \sum_{j=1}^{N} x_j z_j \\
    \sum_{j=1}^{N} x_j y_j & \sum_{j=1}^{N} y_j y_j & \sum_{j=1}^{N} y_j z_j \\
    \sum_{j=1}^{N} x_j z_j & \sum_{j=1}^{N} y_j z_j & \sum_{j=1}^{N} z_j z_j
    \end{bmatrix}    (6)

(ii) Find the eigenvalues (t_1, t_2, t_3) of S and their respective normalized eigenvectors (\xi_1, \xi_2, \xi_3), where t_1 \le t_2 \le t_3. Vector \xi_3, which is the eigenvector associated with the maximum eigenvalue, will be the mean vector of the group of N vectors.

In the form described above, the eigenanalysis procedure cannot be used to determine centroids, since it would give the overall mean of the N vectors being analyzed. The authors have successfully adapted this method for computing the prototypes of clusters of unit vectors in the fuzzy K-means routine by including the weighting factors, (u_{ij})^m, in the orientation matrix. A modified orientation matrix S* is defined to be

    S* = \begin{bmatrix}
    \sum_{j=1}^{N} (u_{ij})^m x_j x_j & \sum_{j=1}^{N} (u_{ij})^m x_j y_j & \sum_{j=1}^{N} (u_{ij})^m x_j z_j \\
    \sum_{j=1}^{N} (u_{ij})^m x_j y_j & \sum_{j=1}^{N} (u_{ij})^m y_j y_j & \sum_{j=1}^{N} (u_{ij})^m y_j z_j \\
    \sum_{j=1}^{N} (u_{ij})^m x_j z_j & \sum_{j=1}^{N} (u_{ij})^m y_j z_j & \sum_{j=1}^{N} (u_{ij})^m z_j z_j
    \end{bmatrix}    (7)

The normalized eigenvector corresponding to the largest eigenvalue of S* is the new cluster centroid, i.e.

    \hat{V}_i = \xi_3.    (8)

Proof of the validity of this approach to computing cluster centroids for spherical data is given in Appendix A at the end of this paper. There are several compelling reasons for using this approach as opposed to a routine involving Equation (5) with suitable housekeeping of the signs of vectors. One of the reasons for the usage of the eigenanalysis approach is that the resulting eigenvalues and eigenvectors contain important information on cluster shapes [19]. This information can be utilized with an appropriate distance metric for finding elliptical clusters on a sphere that are ill-placed relative to each other [19]. (For example, two elliptical clusters are ill-placed relative to each other when their centroids are close and their principal axes are perpendicular to each other.)

Secondly, the two smallest eigenvalues obtained from the spectral decomposition of the matrix S* have a physical meaning that allows them to be used in forming critical measures for determining the validity of various cluster partitions. This will be discussed in greater detail in Section 5. Furthermore, these advantages are gained without paying steep penalties in computational effort or speed. Memory requirements are not much more than those needed for a sign-tracking sub-routine based on Equation (5), and there are very fast subroutines in the Numerical Analysis literature for finding the eigenvalues and eigenvectors of matrices that can be easily implemented.

SEQUENCE OF STEPS FOR EXECUTING THE ALGORITHM

The following scheme gives the sequence of computations needed to solve for the minimum of the fuzzy objective function.

(i) The algorithm starts off with the selection of initial guesses for the K cluster centroids. The initial prototypes are chosen such that they are realistic vectors that fall in the general region of the data being analyzed. Their selection can be realized in different ways. One method would be to select the first K input vectors as the initial guesses of the centroids [11]. Another way of picking initial guesses is to randomly select K vectors from the data set as seed points for the algorithm [11]. To guarantee the selection of K well separated initial prototypes, the mean of the data set can be chosen as the first initial centroid. Thereafter, each subsequent initial guess is picked such that its distance from each of those already chosen is not less than a specified minimum [11, 21].

The authors experimented by initializing the algorithm with K randomly chosen cluster prototypes. One way of choosing a random vector in R^P space is by selecting each of the P attributes of the prototype as a random real number in the interval (\bar{X}_p - 3s_p, \bar{X}_p + 3s_p), where \bar{X}_p is the mean of the pth attribute of the N input vectors X_j, and s_p is its corresponding standard deviation. To select a random point on the surface of a unit sphere, all it takes is to select three random real numbers between (-1, 1) and normalize them. Vectors selected in the manner described are realistic in that they lie in the space of the input vectors and in the general region of the data.

Different choices of initial guesses of cluster prototypes can lead to different partitions of the same data. This is because the algorithm may or may not converge on the global minimum of the objective function, Equation (1); it is only guaranteed to settle at a local minimum [11, 21]. This comes into play especially for data sets with poorly separated clusters. However, confidence in cluster results can be built by running the algorithm a few times and carefully observing the resulting partitions. The authors have observed in their practice that this problem is much less pronounced when the number of clusters K is correct, i.e. corresponds to the actual number of clusters in the data set.
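The weighted orientation matrix of Equation (7) and the centroid rule of Equation (8) can be sketched as follows. This is a minimal illustration, and the two example normals are hypothetical, chosen only to reproduce the kind of failure shown in Fig. 1 (they are not the exact vectors of that figure); the naive normalized average of two sub-vertical pole normals with opposing signs points near vertical, while the eigenvector mean stays sub-horizontal:

```python
import numpy as np

def fuzzy_orientation_centroid(X, u, m=2.0):
    """Centroid of a cluster of unit normals via the weighted
    orientation matrix S* (Equation (7)); the centroid is the
    eigenvector of the largest eigenvalue (Equation (8)).

    X : (N, 3) array of unit normals; u : (N,) memberships in this cluster.
    """
    w = u ** m
    S_star = (X * w[:, None]).T @ X            # S* = sum_j (u_ij)^m x_j x_j^T
    eigvals, eigvecs = np.linalg.eigh(S_star)  # eigenvalues in ascending order
    return eigvecs[:, -1]                      # eigenvector of the largest eigenvalue

# Hypothetical normals to two sub-vertical planes with opposing signs:
X = np.array([[0.0,  0.9986, 0.0523],
              [0.0, -0.9986, 0.0523]])
X /= np.linalg.norm(X, axis=1, keepdims=True)

naive = X.mean(axis=0)
naive /= np.linalg.norm(naive)                 # near-vertical normal: the wrong mean
v = fuzzy_orientation_centroid(X, np.ones(2))  # sub-horizontal normal: the correct mean
```

The eigenvector is defined only up to sign, which is exactly what makes the approach insensitive to the arbitrary signs of the input normals.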

(ii) Compute the distances d^2(X_j, V_i) of all N observations from the K cluster centroids using Equation (3) and/or Equation (2).

(iii) Calculate the degrees of membership, u_{ij}, of all N observations in the K clusters with Equation (4).

(iv) Evaluate new cluster prototypes using the eigenanalysis of the fuzzy orientation matrix, S*, for directional (spherical) data. If non-orientation data (the extra recorded information) is also involved in the analysis, then Equation (5) can be used to determine those components of the coordinates of the cluster prototypes. The validity of this treatment of mixed-type data (data with both spherical and non-spherical components) can be proved if it is recognized that the objective function for such data consists of two distinct sums. The first sum involves the spherical components of observations (distances for this part of the objective function are measured with Equation (3)), while the second sum represents the contribution of the Euclidean components of points (distances involved in this component are computed using Equation (2)). Since the variables of the two sums that form the composite objective function are completely different from each other, minimization of the objective function can be accomplished through the minimization of the individual sums. Thus the proof for mixed-type data is a combination of that found in Appendix A of the current paper and that provided by Bezdek [13].

(v) Compute new distances using Equation (3) and/or Equation (2), and new degrees of membership, \hat{u}_{ij} (Equation (4)), for all N observations.

(vi) If

    \max_{ij} |u_{ij} - \hat{u}_{ij}| < \varepsilon

stop [12]; otherwise go to step (iv) of the procedure. \varepsilon is a tolerance limit that acts as the criterion for terminating the iterations. It lies between 0 and 1. For all examples given in this paper, \varepsilon = 0.001.

The fuzzy objective function for any specified number of clusters tends to have multiple stationary points (local minima). There is no guarantee that the Picard iterations of the algorithm will converge on the global minimum of J_m. Also, even if the algorithm converged on the global minimum, it would still not be guaranteed that this minimum of the objective function would necessarily provide the best possible partitioning of the data under examination [13]. However, in the practice of the authors, the proposed algorithm has a strong tendency to converge on the correct solution when the number of clusters is correct for the data set, and especially when the clusters are well separated. This may be attributed to the general advantages of fuzzy algorithms given in the introduction to this paper. The examples provided in the paper show that there is at least strong evidence that the technique provides rational answers in clustering discontinuity data. Since the main thrust of the current work is to present the theoretical aspects of the algorithm highlighted with two examples, an in-depth analysis of its performance over a broader range of geological conditions shall be provided in Ref. [26].

An alternate way of implementing the algorithm would be to start off by choosing random values for all the degrees of membership, u_{ij}, of the N observations and proceeding to compute in sequence the cluster prototypes, distances, and new membership values, \hat{u}_{ij} [13]. The approach of selecting initial cluster centroids, as opposed to initializing memberships, is preferred in this work because it allows the data analyst to easily utilize any a priori knowledge that may exist on the means of the discontinuity sets (cluster prototypes) to start off the algorithm. It usually is the case that when information is available on the structure of a data set, it is in the form of the number of clusters and their centroids.

ASSIGNMENT OF OBSERVATIONS TO CLUSTERS

Observations are assigned to one cluster or the other based on the values of the degrees of membership. Vector X_j is considered to belong to cluster I if its membership, u_{Ij}, in cluster I is higher than its membership in all the other clusters. By definition, outliers do not fit the general patterns of any of the clusters very well. In fuzzy K-means clustering, they can be detected because they tend not to have high membership values in any cluster. It is therefore possible to exclude such ambiguous observations from being assigned to any cluster by establishing a threshold value for the maximum memberships of data observations. When the largest cluster membership value of a vector falls below the threshold, it may be excluded from getting assigned. Great care must be exercised if points are not going to be assigned, since not every vector with low cluster memberships is an outlier.

Jain and Dubes [11] advocate that outliers be identified and removed from cluster analysis because they tend to distort cluster shapes. Other experts caution against the exclusion of outliers from data sets. It is their belief that unless there are very compelling reasons for suspecting that an observation far from the rest of the data is the result of an erroneous measurement, it must be left alone. It might just be a rare occurrence of a feature that is present in the data, and its removal can have serious repercussions. If an outlier is known to be a measurement error and it can be corrected, then this should be done. All that can be stated here by the authors is that, if so desired, outliers can be removed from data sets after clustering with the fuzzy K-means algorithm.

CLUSTER VALIDITY

So far in the discussion of the method no mention has been made of how to determine the number of clusters in a data set, or establish that the partitioning obtained is correct.
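Steps (i) through (vi), together with the maximum-membership assignment rule described above, can be combined into a compact sketch for purely orientation-based data. This is an illustrative implementation under simplifying assumptions (initialization from the first K observations, which is one of the options mentioned in step (i); no outlier thresholding), not the authors' code:

```python
import numpy as np

def fuzzy_kmeans_sphere(X, K, m=2.0, eps=1e-3, max_iter=100):
    """Fuzzy K-means for unit normals on the sphere (steps (i)-(vi)).

    X : (N, 3) array of unit normals.
    Returns (centroids, membership matrix, hard labels).
    """
    N = len(X)
    V = X[:K].copy()                        # step (i): first K observations as seeds [11]
    U = np.full((K, N), 1.0 / K)
    for _ in range(max_iter):
        # step (ii): squared sine distances, d^2 = 1 - (x . v)^2 (Equation (3))
        d2 = np.maximum(1.0 - (V @ X.T) ** 2, 1e-12)
        # steps (iii)/(v): membership update (Equation (4))
        inv = (1.0 / d2) ** (1.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=0)
        # step (iv): centroids from the weighted orientation matrix S* (Equations (7)-(8))
        for i in range(K):
            w = U_new[i] ** m
            S_star = (X * w[:, None]).T @ X
            V[i] = np.linalg.eigh(S_star)[1][:, -1]
        # step (vi): termination test on the change in memberships
        done = np.max(np.abs(U_new - U)) < eps
        U = U_new
        if done:
            break
    return V, U, U.argmax(axis=0)           # hard assignment by maximum membership
```

On two well-separated simulated sets of poles the loop converges in a few iterations and the hard labels recover the two groups; poorly separated clusters would, as noted above, call for several runs from different initializations.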

obtained is correct. After all, the fuzzy K-means algorithm will provide K clusters whether that partitioning is true or not (in certain cases null clusters can exist). This question is one of the most important in cluster analysis and is not an easy one to answer. The answer to it is fundamentally tied to the definition of what a cluster is.

One important issue to bear in mind is that any cluster algorithm has the potential to impose a structure on a data set that might not reflect the actual cluster structure present in the data. For example, as a result of the distance measure (Equation (3)) used in the above-described algorithm, it primarily seeks rotationally symmetric clusters whether or not they exist in the data under analysis. If the clusters possess rotational asymmetry and are unfavorably oriented relative to each other, the algorithm may be incapable of ever producing the right answers, and the resulting validity measures would not be of much use. (If the placement of non-circular clusters relative to each other is not critical, which is the most commonly encountered case, the algorithm as it is can still perform very well in isolating these clusters [19].) However, the authors have proposed a distance measure that is capable of detecting rotationally asymmetric clusters of spherical data. Discussion of this measure lies outside the scope of this paper, but a comprehensive expose on it is given in Ref. [19]. It suffices to mention, however, that the cluster validity measures that will be proposed in this section of the current paper for spherical data do not in themselves suffer from the above-described limitations, and can be used in conjunction with the methods discussed in Ref. [19] for finding ill-placed non-circular clusters.

Oftentimes the data analyst uses a clustering algorithm to acquire information and knowledge on the structure existing in the data set [11]. It is therefore imperative that the algorithm provides the analyst with performance measures on the "validity" of the answers that are obtained from running the algorithm on data. Cluster validity is the branch of cluster analysis that attempts to deal with the issues of validating clustering results.

A number of validity measures have been proposed in the literature in an attempt to establish indices of cluster validity, but in this paper we shall restrict ourselves to indices for fuzzy cluster algorithms. This is because the algorithm under consideration belongs to this group of classification schemes, and these performance measures have been widely accepted among researchers of fuzzy clustering.

Gath and Geva [21] have proposed indices that are based on the "crispness" of resulting clusters, their hypervolume, and their densities. Their idea is that the clusters determined should be as little fuzzy as possible, should possess low hypervolumes (the term hypervolume is used because the variable space is not necessarily 3-dimensional), and should be of the highest densities. Although their original performance measures were designed for a fuzzy implementation of maximum likelihood estimation, they can be generalized to apply to other algorithms in the family of fuzzy K-means methods.

In R^P space the fuzzy covariance matrix [27] is defined as

F_i = \frac{\sum_{j=1}^{N} (u_{ij})^m (X_j - V_i)(X_j - V_i)^T}{\sum_{j=1}^{N} (u_{ij})^m}.    (9)

The fuzzy hypervolume can then be defined as

F_{HV} = \sum_{i=1}^{K} [\det(F_i)]^{1/2}.    (10)

The average partition density of the clustering is computed using the formula

D_{PA} = \frac{1}{K} \sum_{i=1}^{K} \frac{S_i}{[\det(F_i)]^{1/2}},    (11)

where S_i is known as the "sum of central members", and is calculated as

S_i = \sum_{j=1}^{N} u_{ij}, \quad \forall X_j \in \{ X_j : (X_j - V_i)^T F_i^{-1} (X_j - V_i) < 1 \}.    (12)

The partition density, which is representative of the general density of the clustering, is defined by Gath and Geva [21] as

P_D = \frac{S}{F_{HV}},    (13)

where

S = \sum_{i=1}^{K} \sum_{j=1}^{N} u_{ij}, \quad \forall X_j \in \{ X_j : (X_j - V_i)^T F_i^{-1} (X_j - V_i) < 1 \}.    (14)

The optimal partitioning based on these criteria is that for which the hypervolume is minimal and the density measures are maximal.

For orientation data, the hypervolume and "sum of central members" used in computing the other performance measures of Ref. [21] cannot be used in the form in which they are originally given. Analogs of these cluster validity measures specific to the clustering of spherical data (or for determining the contribution of the spherical components of mixed-type vectors) have to be proposed. The eigenvalues obtained from the spectral decomposition of the orientation matrix, S*, are of great significance in the derivation of the hypervolume and "central members" for spherical data. Although a full discussion of the derivations of the spherical analogs of the fuzzy hypervolume and "sum of central members" lies outside the scope of the current work, an attempt shall be made to explain the rationale behind the derived equations.
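The Euclidean-space measures of Equations (9)-(14) are straightforward to compute. The sketch below is not from the paper; it is our own Python/NumPy illustration, with hypothetical function names, of the fuzzy covariance matrix, the fuzzy hypervolume, and the two density measures for a given K-partition:

```python
import numpy as np

def fuzzy_covariance(X, v_i, u_i, m=2.0):
    # Fuzzy covariance matrix F_i of cluster i, Equation (9).
    w = u_i ** m                                  # (u_ij)^m
    D = X - v_i                                   # X_j - V_i for every observation
    return np.einsum('j,jp,jq->pq', w, D, D) / w.sum()

def gath_geva_measures(X, V, U, m=2.0):
    # Fuzzy hypervolume F_HV (Eq. 10), average partition density D_PA (Eq. 11)
    # and partition density P_D (Eq. 13) for a K-partition of Euclidean data.
    vols, dens, S_total = [], [], 0.0
    for v_i, u_i in zip(V, U):
        F = fuzzy_covariance(X, v_i, u_i, m)
        vol = np.sqrt(np.linalg.det(F))
        # "Sum of central members" S_i, Eq. (12): memberships of the points
        # whose Mahalanobis distance from the prototype is less than 1.
        D = X - v_i
        d2 = np.einsum('jp,pq,jq->j', D, np.linalg.inv(F), D)
        S_i = u_i[d2 < 1.0].sum()
        vols.append(vol)
        dens.append(S_i / vol)
        S_total += S_i
    FHV = sum(vols)
    return FHV, np.mean(dens), S_total / FHV      # F_HV, D_PA, P_D
```

Per the criteria above, the preferred partitioning would be the one that minimizes F_HV while maximizing the two density measures.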
First, we shall reacquaint ourselves with the fact that the spectral decomposition of S* for a cluster identifies three orthogonal axes, ξ_1, ξ_2 and ξ_3, and their corresponding eigenvalues, τ_1, τ_2 and τ_3. As is already known from the discussion of the calculation of prototypes, ξ_3 is the mean of a cluster of points. The other two vectors coincide with the minor and major principal axes of the elliptical distribution of points generated when the observations in the cluster are projected onto a plane perpendicular to the mean. The eigenvalues can be expressed as (see Appendix A):

\tau_t = \xi_t^T S^* \xi_t = \sum_{j=1}^{N} (u_{ij})^m (X_j \cdot \xi_t)^2, \quad t = 1, 2 \text{ or } 3.    (15)

From Equation (15) it can be seen that the eigenvalues are measures of the spread of the projections of vectors in the principal directions. The eigenvalues τ_1 and τ_2 provide information on cluster shapes. If the distribution of vectors in a cluster has rotational symmetry, i.e. if its shape is circular or near circular, then τ_1 ≈ τ_2, or τ_2/τ_1 ≈ 1. The greater the ratio τ_2/τ_1 is, the greater the departure of the cluster shape from a circle (the eccentricity of the resulting elliptical shape increases). Also, the eigenvalues represent a measure of the deviation of vectors from the centroid of a cluster in the directions of the principal axes. When a vector exactly coincides with the prototype of a cluster, its projections onto principal directions 1 and 2 are zero. Large deviations from the cluster mean result in larger projections on either or both of these two directions. This, precisely, is the idea captured by the variance of a set of observations in Euclidean (R^P) space. Thus it can be deduced from Equation (15) that the eigenvalues divided by the quantity \sum_{j=1}^{N} (u_{ij})^m, i.e.

\sigma_t^2 = \frac{\tau_t}{\sum_{j=1}^{N} (u_{ij})^m},    (16)

are measures of the variance of the projections of spherical data observations in three orthogonal directions for the different clusters. Therefore, the hypervolume of a cluster of points on the surface of a sphere is the surface area determined from the integral:

F_i = \int_0^{\sigma_2} \int_0^{\sigma_1} \frac{dx \, dy}{\sqrt{1 - x^2 - y^2}}.    (17)

Because a closed-form solution of this integral cannot be easily established, it is handier to evaluate it using numerical quadrature methods. (It is possible to approximate this surface by simply calculating the area of the projection of the above-defined surface onto a plane, i.e. by using the product σ_1 σ_2. This approximation, however, works well only for small values of σ_1 and σ_2.)

Using the understanding of what the eigenvalues are, the "sums of central members" for spherical data can be defined as:

S_i = \sum_{j=1}^{N} u_{ij}, \quad \forall X_j \in \left\{ X_j : \sum_{t=1}^{2} \frac{(X_j \cdot \xi_{it})^2}{\sigma_t^2} < 1 \right\}.    (18)

The spherical analogue of the numerator of Equation (13) for computing the partition density becomes:

S = \sum_{i=1}^{K} \sum_{j=1}^{N} u_{ij}, \quad \forall X_j \in \left\{ X_j : \sum_{t=1}^{2} \frac{(X_j \cdot \xi_{it})^2}{\sigma_t^2} < 1 \right\}.    (19)

When mixed-type data are being analyzed (when extra data columns of information on discontinuities are included in an analysis), the cluster performance measures become composites of those measured for spherical data and for Euclidean data. To facilitate the definition of the composite measures, the vector X_j representing the jth data point will conceptually be divided into two components, such that it can be represented as X_j = (X_j^{(s)}, X_j^{(e)}). The superscripts (s) and (e) refer to the spherical component (direction cosines) and Euclidean component (extra data), respectively. The hypervolume for clusters then becomes the product of the hypervolumes of the spherical and Euclidean constituents, i.e.:

F_i = F_i^{(s)} F_i^{(e)},    (20)

where F_i^{(s)} is the hypervolume determined from Equation (17), and F_i^{(e)} that obtained from Equation (9). The hybrid condition for establishing "central members" assumes the form:

X_j \in \left\{ X_j : \sum_{t=1}^{2} \frac{(X_j^{(s)} \cdot \xi_{it})^2}{\sigma_t^2} + (X_j^{(e)} - V_i^{(e)})^T (F_i^{(e)})^{-1} (X_j^{(e)} - V_i^{(e)}) < 1 \right\}.    (21)

Two other fuzzy cluster validity measures were used in determining the optimal partitioning of our sample data sets. These were the Xie-Beni validity index [28] and the Fukuyama-Sugeno validity functional [20]. The Xie-Beni index is defined as

\nu_{XB}(U, V; X) = \frac{\sum_{i=1}^{K} \sum_{j=1}^{N} (u_{ij})^m \|X_j - V_i\|^2}{N \, \min_{i \neq k} \{ \|V_i - V_k\|^2 \}}.    (22)

\|\cdot\| is the norm for measuring distances in the classification space, and therefore can be replaced with either Equation (3) (for spherical data only), or the sum of Equations (2) and (3) for mixed-type data. This index measures the compactness and separation of the fuzzy K-partition obtained from the clustering algorithm. The compactness is the ratio of the weighted total variation of the data set to the number of observations.
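To make the spherical quantities of Equations (15)-(18) concrete, the sketch below (our own Python/NumPy illustration, not from the paper; all names are hypothetical) builds the fuzzy orientation matrix for one cluster of unit pole vectors, extracts the projection variances of Equation (16), and evaluates the hypervolume integral of Equation (17) by a simple midpoint quadrature rule, as the text recommends:

```python
import numpy as np

def spherical_cluster_measures(X, u, m=2.0, n=200):
    # X: (N, 3) unit pole vectors; u: memberships u_ij of one cluster i.
    w = u ** m
    S_star = np.einsum('j,jp,jq->pq', w, X, X)    # fuzzy orientation matrix S*
    tau, xi = np.linalg.eigh(S_star)              # eigenvalues in ascending order
    mean_pole = xi[:, 2]                          # xi_3, the cluster mean direction
    sigma = np.sqrt(tau[:2] / w.sum())            # sigma_1, sigma_2 from Eq. (16)
    # Midpoint quadrature of Eq. (17) over [0, sigma_1] x [0, sigma_2];
    # valid as long as sigma_1^2 + sigma_2^2 < 1 (i.e. for tight clusters).
    gx = (np.arange(n) + 0.5) * sigma[0] / n
    gy = (np.arange(n) + 0.5) * sigma[1] / n
    GX, GY = np.meshgrid(gx, gy)
    F_i = (1.0 / np.sqrt(1.0 - GX**2 - GY**2)).mean() * sigma[0] * sigma[1]
    return mean_pole, sigma, F_i
```

Because the integrand is at least 1 everywhere, the computed hypervolume is always slightly larger than the flat-plane approximation σ_1 σ_2 mentioned above, and the two agree closely for small σ_1 and σ_2.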
Fig. 2. Pole and contour plots of observations in Example 1 data set. (a) Poles of discontinuities in first data set. (b) Contouring results of the poles.

The minimum distance between the prototypes of the clusters is known as the separation [28]. When a good partitioning of the data set is obtained, the numerator of the index is small, because the membership degree u_{ij} is high whenever ||X_j - V_i||^2 assumes a small value, and is small whenever ||X_j - V_i||^2 becomes large. In the case where the clusters obtained are well separated, the minimum distance between the cluster centres is relatively high. It is therefore taken that small values of ν_XB indicate a better clustering of the data set than larger values.

The Fukuyama-Sugeno validity functional is calculated from the formula

\nu_{FS}(U, V; X) = \sum_{i=1}^{K} \sum_{j=1}^{N} (u_{ij})^m \left( \|X_j - V_i\|^2 - \|V_i - \bar{V}\|^2 \right) = J_m - K_m,    (23)

where \bar{V} is the geometric mean of the cluster centroids. J_m in the above equation is the fuzzy objective function defined in Equation (1), and

K_m = \sum_{i=1}^{K} \sum_{j=1}^{N} (u_{ij})^m \|V_i - \bar{V}\|^2.

The physical interpretation of the Fukuyama-Sugeno validity index is generally not very clear [20]. However, since the objective function J_m decreases monotonically for increasing K (the number of clusters), the function K_m may be interpreted as a cost term intended to penalize the use of increasing values of K to minimize the objective function. Fukuyama and Sugeno propose that small values (minima) of ν_FS attest to good partitionings of the data set into clusters. ν_FS has the tendency to decrease monotonically with increasing K.

In order to use the Xie-Beni and Fukuyama-Sugeno validity indices for establishing the optimal partitioning of directional data, the sine-squared norm 1 - (X_j · V_i)^2 has to be used in place of the Euclidean metric ||X_j - V_i||^2.

SAMPLE APPLICATIONS OF FUZZY K-MEANS ALGORITHM

The algorithm was run on two sample data sets. The first example is a demonstration of the use of the fuzzy K-means clustering algorithm for delineating a sample joint survey data set into distinct clusters based on orientation data.

The second example illustrates how the fuzzy clustering algorithm facilitates the automatic classification of joints into sets based not only on orientation information, but also when an extra data column is present. The extra data column could be any numeric property of discontinuities, such as trace length, spacing, discontinuity frequency, or even joint roughness coefficient (JRC). In Section 6.2, the possible misclassification of discontinuities when delineation is solely founded on discontinuity orientation is contrasted with the case when the inclusion of additional data on discontinuities significantly improves the quality of cluster separation.

For both examples, it is assumed that all samples in the data set come from one structural domain (a zone of a rock mass homogeneous or uniform in properties). If the discontinuities being analyzed are not from a single structural domain, then they must be divided into homogeneous sub-groups and each of them analyzed separately. Failure to do so introduces error into the analysis of discontinuity data [2]. Hoek, Kaiser and Bawden [6] recommend that the number of discontinuities measured in any structural domain be not less than 100.

Example 1

The first data set consists of 195 rock joints (this data set is saved under the name Exampmin.dip in the EXAMPLES subdirectory of the DIPS program). For purposes of illustration, only orientation information was used in the analysis.
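The Xie-Beni and Fukuyama-Sugeno indices of Equations (22) and (23) translate directly into code. The sketch below (our own Python/NumPy illustration, not from the paper; the function names are hypothetical) uses the sine-squared quantity 1 - (X_j · V_i)^2 in place of the Euclidean metric, as prescribed above for directional data; taking the normalized mean of the prototypes as the spherical stand-in for V-bar is our assumption:

```python
import numpy as np

def xie_beni(X, V, U, m=2.0):
    # Xie-Beni index, Eq. (22), with the sine-squared norm for unit vectors.
    d2 = 1.0 - (X @ V.T) ** 2                     # (N, K) distances to prototypes
    compactness = ((U.T ** m) * d2).sum()
    sep = 1.0 - (V @ V.T) ** 2                    # pairwise prototype distances
    sep = sep[~np.eye(len(V), dtype=bool)].min()  # separation: smallest of them
    return compactness / (X.shape[0] * sep)

def fukuyama_sugeno(X, V, U, m=2.0):
    # Fukuyama-Sugeno functional, Eq. (23): J_m - K_m.
    v_bar = V.mean(axis=0)
    v_bar /= np.linalg.norm(v_bar)                # keep the mean prototype on the sphere
    W = U.T ** m
    J_m = (W * (1.0 - (X @ V.T) ** 2)).sum()
    K_m = (W * (1.0 - (V @ v_bar) ** 2)[None, :]).sum()
    return J_m - K_m
```

Here X holds the unit pole vectors row-wise, V the K cluster prototypes, and U the K-by-N membership matrix; smaller values of either index indicate a better partitioning.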
Figure 2 contains illustrations of a plot of the poles of the discontinuities in data set 1 and also a contour plot of these normals. The contours on the stereogram in Fig. 2(b) and on subsequent stereographic plots in this paper are contours of equal probability (in percent) of pole occurrence [25]. On the contour plot it can be seen that there are 4 possible fracture sets.

The data set was analyzed for values of K between 2 and 7. For each number of clusters, K, the algorithm was run multiple times and performance values averaged. The values of the validity indices are displayed in Table 1 for the various numbers of clusters, K. In the table, values of the criteria that correspond to the optimal number of clusters, as determined by each of the criteria, have been highlighted. The results in Table 1 are also shown graphically in Fig. 3. Other than the Xie-Beni index, all the other validity indices indicate the optimal number of fracture sets to be 4. The fact that one index did not identify K = 4 as the optimal partition underscores the necessity of using more than one performance measure. No single cluster validity index can correctly detect the optimal structure of data sets all the time.

Fig. 3. Graphs of cluster validity indices for data set in Example 1. (a) Plot of fuzzy hypervolume, F_HV, against number of clusters, K. Minimum is seen at K = 4. (b) Plot of partition density, P_D, against number of clusters, K. Maximum is seen at K = 4. (c) Plot of average partition density, D_PA, against number of clusters, K. Maximum is seen at K = 4. (d) Plot of Xie-Beni index, ν_XB, against number of clusters, K. Minimum is seen at K = 3. Note that the value of the index for K = 4 is second best. (e) Plot of Fukuyama-Sugeno index, ν_FS, against number of clusters, K. Minimum is seen at K = 4.
Table 1. Performance measures for number of clusters, K = 2 to 7, for Example 1

K   Fuzzy hypervolume   Partition density   Average partition density   Xie-Beni   Fukuyama-Sugeno
2   0.237409            331.9678            333.9582                    0.192009   -24.3891
3   0.171884            429.9436            445.6096                    0.095237   -59.8552
4   0.146379            564.7448            568.5423                    0.146124   -66.5256
5   0.161952            501.1604            514.8446                    0.355183   -66.373
6   0.180644            452.5918            501.0386                    0.378323   -64.0172
7   0.198443            398.2612            474.5068                    0.390902   -58.7873
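Reading Table 1 amounts to taking, for each criterion, the K at which it is optimized (the minimum for F_HV, ν_XB and ν_FS; the maximum for the two densities). A small sketch of this selection step, using the values transcribed from Table 1 (our own Python illustration, not from the paper):

```python
import numpy as np

# Table 1 values: K -> (F_HV, P_D, D_PA, nu_XB, nu_FS)
table1 = {
    2: (0.237409, 331.9678, 333.9582, 0.192009, -24.3891),
    3: (0.171884, 429.9436, 445.6096, 0.095237, -59.8552),
    4: (0.146379, 564.7448, 568.5423, 0.146124, -66.5256),
    5: (0.161952, 501.1604, 514.8446, 0.355183, -66.3730),
    6: (0.180644, 452.5918, 501.0386, 0.378323, -64.0172),
    7: (0.198443, 398.2612, 474.5068, 0.390902, -58.7873),
}
Ks = sorted(table1)
vals = np.array([table1[k] for k in Ks])
# F_HV, nu_XB and nu_FS are minimized; P_D and D_PA are maximized.
optimal_K = {
    'FHV': Ks[vals[:, 0].argmin()],
    'PD':  Ks[vals[:, 1].argmax()],
    'DPA': Ks[vals[:, 2].argmax()],
    'XB':  Ks[vals[:, 3].argmin()],
    'FS':  Ks[vals[:, 4].argmin()],
}
# Four of the five criteria select K = 4; only the Xie-Beni index selects K = 3.
```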

This is what necessitates the use of multiple validity criteria. Notwithstanding, the K = 4 partition is still ranked second by the Xie-Beni index.

Figure 4 has contour plots of the clusters ensuing from the optimal partitioning of the data set. The algorithm has not only determined the correct number of clusters, but has also apparently identified the structure in the data. The results of clustering with the K-means algorithm, we believe, are in agreement with the partitioning most human analysts would come up with.

Example 2

In the second example of the application of the fuzzy K-means clustering algorithm, a simulated data set consisting of four joint sets, each of which had 50 observations, was generated (Fig. 5). In addition to data on discontinuity orientation, values of a quantitative characteristic were generated from uniform distributions for each of the joint sets. The parameters necessary for generating the observations in this artificial data set are given in Table 2. Before clustering the data, the values of the extra data column were standardized using the formula:

Fig. 4. Contour plots of poles of resulting clusters after fuzzy cluster analysis of ®rst data set. (a) Contour plot of poles of
discontinuities assigned to Set 1. (b) Contour plot of poles of discontinuities assigned to Set 2. (c) Contour plot of poles of
discontinuities assigned to Set 3. (d) Contour plot of poles of discontinuities assigned to Set 4.
Z_{1j} = \frac{X_j - \bar{X}}{S},    (24)

where \bar{X} and S are, respectively, the mean and standard deviation of the values of the additional data column. The standardization was performed in order to scale down these values, so as not to allow them to overwhelm the orientation component in the cluster analysis of the mixed data. (The above-mentioned issue of equalizing, in some way, the influence of variables on clustering results introduces the topic of standardization and weighting of variables in cluster analysis. Its detailed treatment in relation to the clustering of discontinuities can be found in Ref. [29].)

Joint sets 2 and 3 overlap in orientation, but differ greatly in the additional quantitative characteristic. From viewing the plot of all the poles in the data set, it would be correct to say that 3 clusters were present, but with a possibility of the existence of a fourth cluster. The possible existence of a fourth joint set is suspected because the contour plot of the poles exhibits twin peaks in the region of orientation 225/65. This may be cause for further investigation, since the presence of the twin peaks may be either caused by the presence of two close-lying joint sets that differ in the extra property, or may be simply due to a sampling aberration.

Fig. 5. Contour plot of poles of joints in second data set.

Table 3 contains the values of the validity indices for K partitions, K ranging from 2 to 7, based on orientation only. The first three indices (those based on hypervolumes and densities) point to the existence of 4 clusters of discontinuities (although the hypervolume index assigns close values for the cases of K = 3 and K = 4). These observations are characterized in Fig. 6(a)-(c) by the stationary points of the plots of the performance indices against the number of clusters. The other two cluster performance measures (the Xie-Beni and Fukuyama-Sugeno indices) indicate 3 clusters. The results in Table 3 and the plots in Fig. 6(d),(e) evidence this choice. This example highlights the ability of the algorithm to provide very logical answers. The selection of 3 or 4 clusters by the validity indices mirrors the debate that would occur in the minds of data analysts as to how many clusters exist in the data. It is also noticeable that different validity indices indicate the various partitions with differing strength or conviction. For example, in Table 3 the Xie-Beni index strongly points to the 3 cluster partitioning as being the best, whereas the hypervolume criterion barely distinguishes between the cases of K = 3 and K = 4.

At this stage of the analysis of data set 2, it is impossible to correctly separate joint sets 2 and 3, because the additional discontinuity property has not been included in the clustering process. The inability to properly delineate these two clusters leads to biased results for the distributions of the extra data column for these joint sets.
Table 2. Simulation parameters for generating discontinuities in Example 2

              Orientation                                          Extra data column
Joint set #   Distribution   k(a)   Mean direction (dip/dip dir)   Distribution   Min. value   Max. value
1             Fisher         21.0   57/047                         Uniform        10           15
2             Fisher         41.5   60/205                         Uniform        16           20
3             Fisher         32.5   69/217                         Uniform        0            3
4             Fisher         21.5   87/312                         Uniform        4            8

(a) k is the concentration parameter of the Fisher distribution.
Note: For each of the discontinuity sets in the table, 50 points were generated.
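Equation (24) is the familiar z-score standardization, applied to the pooled values of the extra data column before clustering. A minimal sketch (our own Python illustration, not from the paper; the function name is hypothetical):

```python
import numpy as np

def standardize(column):
    # Equation (24): Z_j = (X_j - mean) / (standard deviation),
    # applied to the pooled values of the extra data column.
    x = np.asarray(column, dtype=float)
    return (x - x.mean()) / x.std()
```

After standardization the column has zero mean and unit standard deviation, so it no longer overwhelms the orientation component of the distance measure.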

Table 3. Performance measures for Example 2 for number of clusters, K = 2 to 7, for orientation data only

K   Fuzzy hypervolume   Partition density   Average partition density   Xie-Beni   Fukuyama-Sugeno
2   0.172562            473.4203            512.3597                    0.147898   -43.7507
3   0.120048            566.159             608.8765                    0.075526   -56.0455
4   0.119768            669.8643            724.988                     0.368519   -48.8921
5   0.142071            561.2366            651.3806                    0.363421   -45.5348
6   0.132959            589.6222            647.515                     0.386207   -45.3712
7   0.140767            539.7078            618.295                     0.32375    -43.3776
Fig. 6. Graphs of cluster validity indices for data set in Example 2 when analysis is performed without extra data column. (a) Plot of fuzzy hypervolume, F_HV, against number of clusters, K. Values of hypervolume are practically equal when K = 3 or K = 4. (b) Plot of partition density, P_D, against number of clusters, K. Maximum is seen at K = 4. (c) Plot of average partition density, D_PA, against number of clusters, K. Maximum is seen at K = 4. (d) Plot of Xie-Beni index, ν_XB, against number of clusters, K. Minimum is seen at K = 3. (e) Plot of Fukuyama-Sugeno index, ν_FS, against number of clusters, K. Minimum is seen at K = 3.

(Also, some skewness in the distributions of the orientations for these two discontinuity sets arises.) The average value of the non-orientation characteristic determined for the two superimposed joint sets (sets 2 and 3), if the K = 3 partitioning is chosen, is 9.86 when clustering of the data is based on orientation information only.

Performing the analysis of the data set anew, but this time with the inclusion of the extra data column, leads to the correct recovery of the true structure of the data set (the assignment of observations to clusters is near identical to the original structure of the simulated data set). As depicted by the graphs in Fig. 7 and the results in Table 4, all the validity measures select the partitioning of the data into 4 clusters as the best clustering of the observations. The contour plots in Fig. 8 show the resulting 4 clusters. A comparison of Fig. 8(b),(c) reveals the overlap of joint sets 2 and 3 in orientation. However, the algorithm correctly assigned the observations to their appropriate clusters as a result of the information contributed by the additional data column.
Fig. 7. Plots of performance measures as functions of number of clusters, K, for data set 2, when cluster analysis is performed with extra data column included. (a) Plot of fuzzy hypervolume, F_HV, against number of clusters, K. A clearer minimum is attained when K = 4. (b) Plot of partition density, P_D, against number of clusters, K. Maximum is observed at K = 4. (c) Plot of average partition density, D_PA, against number of clusters, K. Maximum is observed at K = 4. (d) Plot of Xie-Beni index, ν_XB, against number of clusters, K. Minimum occurs at K = 4. (e) Plot of Fukuyama-Sugeno index, ν_FS, against number of clusters, K. Minimum occurs at K = 4.

It must be noted that the contours of the plots in Fig. 8 do not exactly match the contours of the initial data set in Fig. 5. The differences in contouring can be attributed to the fixed number of grid points used for contouring poles in the program DIPS. An analogous situation arises in the drawing of histograms for Euclidean data. When observations from a number of statistical distributions are grouped together into one data set, histograms of the observations after they have been separated into clusters would not exactly match the histogram of the grouped observations, if the number of bins for all histogram plots is kept constant.
Table 4. Performance measures for Example 2 for number of clusters, K = 2 to 7, when extra data column is included in analysis

K   Fuzzy hypervolume   Partition density   Average partition density   Xie-Beni    Fukuyama-Sugeno
2   0.149958            517.4812            520.15858                   0.13512     -41.0821
3   0.075335            1142.174            1316.936                    0.127685    -140.8
4   0.027801            2640.082            2974.768                    0.0509083   -209.137
5   0.034258            2034.298            2496.128                    0.525081    -200.292
6   0.036405            1946.632            2224.118                    0.917034    -170.757
7   0.036514            1871.21             2162.864                    0.861568    -153.365

In the case where the extra data column is included in the analysis, the average value for this discontinuity characteristic for joint set 2 is 18.1, and 1.61 for joint set 3! These values differ greatly from the averaged value (9.86) obtained when the cluster analysis of the data set was performed without the extra data column. Use of the averaged value of the non-orientation discontinuity attribute in engineering calculations could have serious repercussions. This example is only a simple illustration of the extents to which answers can be biased when information pertinent to the identification of clusters is omitted from analysis. At this stage it must be noted that key information for correctly separating clusters might reside in some geological data such as genetic type. In such a case, the appropriate conversion of the geological data into quantitative form, so that it could be incorporated into a subsequent cluster analysis, would be greatly beneficial.
Fig. 8. Contour plots of poles of clusters obtained from analysis of second data set, with extra data column included. (a)
Contour plot of poles of discontinuities assigned to Set 1. (b) Contour plot of poles of discontinuities assigned to Set 2. (c)
Contour plot of poles of discontinuities assigned to Set 3. (d) Contour plot of poles of discontinuities assigned to Set 4.
DISCUSSION AND CONCLUSIONS

The application of the popular and widely used fuzzy K-means algorithm [13-15] offers an elegant, simple and yet very powerful tool for overcoming some of the most fundamental problems of delineating rock fracture data into sets. Through the use of the algorithm the data analysis process acquires objectivity, and can be performed at several times the speeds human analysts are capable of. Boredom and tedium can be either eliminated or greatly reduced by using an automatic classifier, especially in the case where the number of variables and categories existing in the data set is large. The approach also does not require any a priori knowledge of the structure present in the data set. In cases where a priori information exists on the structure of a data set, it can be utilized in a cluster analysis by supplying the algorithm with the known number of clusters. In addition, the algorithm can be initialized with the known mean features of the clusters as the seed points for prototypes.

Information on other features useful in separating discontinuities into distinct clusters can now be included in the automated analysis of joint survey data. It places in the hands of the user a powerful tool for trying several different options of clustering, which may otherwise escape detection by even the most skilled and experienced human analysts.

Performance measures of the partitioning results of fuzzy K-means algorithms allow the data analyst to obtain an idea of which particular clustering (or clusterings) is best or sensible. The plethora of validity indices available implies that the possibility of being misled by any one index is significantly reduced. Although certain data configurations or structures can make the algorithm susceptible to the "seed" points used in initializing it, when it is combined with human perception in practice, it greatly facilitates the recovery of the structure of discontinuity data.

The fuzzy K-means algorithm provides the appropriate framework for creating an excellent data analysis tool that provides the rock mechanics expert with the opportunity to obtain objective and accurate answers in the task of delineating clusters. In addition, it is also capable of revealing structures in data that might otherwise escape human attention.

The use of the fuzzy K-means classification tool for analyzing discontinuity data must be encouraged, since it can be performed in a fraction of the time it would take a human analyst to examine the data for patterns. As a result, the analyst can instead spend his/her time in reviewing cluster results, which allows him/her to gain deeper appreciation of and insight into the structure of discontinuity data. Also, greater use of the method will lead to improved understanding of its behavior, and most certainly to enhancements, by the research community, of its various components.

Accepted for publication 18 January 1998

REFERENCES

1. Hoek, E. and Bray, J. W., Rock Slope Engineering. Chapman and Hall, London, 1994.
2. Hoek, E. and Brown, E. T., Underground Excavations in Rock. Chapman and Hall, London, 1994.
3. Priest, S. D., Discontinuity Analysis for Rock Engineering. Chapman and Hall, London, 1993.
4. Palmstrom, A., The volumetric joint count: a useful and simple measure of the degree of rock mass jointing. In Proceedings of the IV Congress of the International Association of Engineering Geology, Vol. V. New Delhi, 1982, pp. 221-228.
5. Diederichs, M. and Hoek, E., DIPS User's Guide, Version 4.0. Dept. of Civil Eng., University of Toronto, 1996.
6. Hoek, E., Kaiser, P. K. and Bawden, W. F., Support of Underground Excavations in Hard Rock. Balkema, Rotterdam, 1995.
7. Shanley, R. J. and Mahtab, M. A., Delineation and analysis of clusters in orientation data. J. Math. Geol., 1976, 8(3), 9-23.
8. Mahtab, M. A. and Yegulalp, T. M., A rejection criterion for definition of clusters in orientation data. In Issues in Rock Mechanics, Proceedings of the 22nd Symposium on Rock Mechanics, Berkeley, ed. R. E. Goodman and F. E. Heuze. American Institute of Mining, Metallurgy and Petroleum Engineers, New York, 1982, pp. 116-123.
9. Dershowitz, W., Busse, R., Geier, J. and Uchida, M., A stochastic approach for fracture set definition. In Rock Mechanics Tools and Techniques, Proceedings of the 2nd North American Rock Mechanics Symposium, Montreal, ed. M. Aubertin, F. Hassani and H. Mitri. Brookfield, 1996, pp. 1809-1813.
10. Anderberg, M. R., Cluster Analysis for Applications. Academic Press, New York, 1973.
11. Jain, A. K. and Dubes, R. C., Algorithms for Clustering Data. Prentice Hall, Englewood Cliffs, 1988.
12. Milligan, G. W., Clustering validation: results and implications for applied analyses. In Clustering and Classification, ed. P. Arabie, L. J. Hubert and G. De Soete. World Scientific, River Edge, 1996, pp. 341-375.
13. Bezdek, J. C., Pattern Recognition with Fuzzy Objective Function Algorithms. Plenum, New York, 1981.
14. Krishnapuram, R. and Keller, J., Fuzzy and possibilistic clustering methods for computer vision. In Neural and Fuzzy Systems, Vol. IS 12, ed. S. Mitra, M. Gupta and W. Kraske. SPIE Institute Series, 1994, pp. 133-159.
15. Bezdek, J. C. and Pal, S. K., Fuzzy Models for Pattern Recognition: Methods that Search for Structure in Data. IEEE Press, New York, 1992.
16. Zadeh, L. A., Fuzzy sets. Inf. Control, 1965, 8(3), 338-353.
17. Ruspini, E. H., A new approach to clustering. Inf. Control, 1969, 15(3), 22-32.
18. Harrison, J. P., Fuzzy objective functions applied to the analysis of discontinuity orientation data. In ISRM Symposium: Eurock '92, Rock Characterization, ed. J. A. Hudson. British Geotechnical Society, London, 1992, pp. 25-30.
19. Hammah, R. E. and Curran, J. H., On distance measures for the fuzzy K-means algorithm for joint data. 1997, submitted for publication.
20. Pal, N. R. and Bezdek, J. C., On cluster validity for the fuzzy c-means model. IEEE Trans. Fuzzy Syst., 1995, 3(3), 370-379.
21. Gath, I. and Geva, A. B., Unsupervised optimal fuzzy clustering. IEEE Trans. Pattern Anal. Machine Intell., 1989, 11(7), 773-781.
22. Full, W. E., Ehrlich, R. and Bezdek, J. C., Fuzzy QModel: a new approach for linear unmixing. Math. Geol., 1982, 14(3), 259-270.
23. Markland, J., The analysis of principal components of orientation data. Int. J. Rock Mech. Min. Sci. Geomech. Abstr., 1974, 11(3), 157-163.
24. Fisher, N. I., Lewis, T. and Embleton, B. J., Statistical Analysis of Spherical Data. Cambridge University Press, New York, 1987.
25. Diederichs, M. S., DIPS, an interactive and graphical approach to the analysis of orientation based data. M.A.Sc. thesis, Dept. of Civil Engineering, University of Toronto, 1990.
26. Hammah, R. E. and Curran, J. H., Optimal delineation of joint sets using a fuzzy clustering algorithm. To be presented at NARMS'98, 1998.
27. Gustafson, D. E. and Kessel, W. C., Fuzzy clustering with a fuzzy covariance matrix. In Proceedings of the IEEE Conference on Decision and Control, San Diego, 1979, pp. 761-766.
28. Xie, X. L. and Beni, G., A validity measure for fuzzy clustering. IEEE Trans. Pattern Anal. Machine Intell., 1991, 13(8), 841-847.
29. Curran, J. H. and Hammah, R. E., Standardization and weighting of variables for the fuzzy clustering of discontinuity data. 1997, submitted for publication.

APPENDIX A

The proof of the use of the fuzzy orientation matrix for determining new cluster prototypes for directional data can be established through the constrained optimization of the fuzzy objective function,

    J_m(U,V) = \sum_{i=1}^{K} \sum_{j=1}^{N} (u_{ij})^m \, d^2(X_j, V_i).    (A.1)

However, before proceeding with the necessary derivations, we shall briefly look at the definition of eigenvalues and eigenvectors. If A is an n x n matrix, and

    A X = \omega X,    (A.2)

then there exist choices of the scalar quantity \omega, called the eigenvalues of A, that produce non-zero solutions X, called the eigenvectors of A. Also, the eigenvalues \omega can be expressed in terms of X and A as

    \omega = X^T A X.    (A.3)

From the theory of the eigenanalysis of square matrices, it is established that there exist n eigenvalues, \omega_p (p = 1, 2, ..., n), of the matrix A, with n corresponding eigenvectors, x_p. The eigenvalues are arranged in order of magnitude so that \omega_1 \le \omega_2 \le ... \le \omega_n. If A is symmetric, then all its eigenvalues are real numbers.

For the minimization of J_m(U,V) with respect to the cluster centroids, the degrees of membership of the observations, the u_{ij}, are held fixed while the cluster prototypes, the V_i, are the variables. For greater clarity, let each directional observation and each cluster prototype, expressed in direction cosines, be represented as

    X_j = (x_j, y_j, z_j)^T  and  V_i = (l_i, m_i, n_i)^T,

respectively, with the constraint l_i^2 + m_i^2 + n_i^2 = 1.

In that case, the distance between observation X_j and prototype V_i for the fuzzy algorithm becomes

    d^2(X_j, V_i) = 1 - (X_j \cdot V_i)^2 = 1 - (x_j l_i + y_j m_i + z_j n_i)^2.    (A.4)

Rewriting the objective function with this notation, we obtain

    J_m(U,V) = \sum_{i=1}^{K} \sum_{j=1}^{N} (u_{ij})^m \{ 1 - (x_j l_i + y_j m_i + z_j n_i)^2 \}    (A.5)

or

    J_m(U,V) = \sum_{i=1}^{K} \sum_{j=1}^{N} (u_{ij})^m - \sum_{i=1}^{K} \sum_{j=1}^{N} (u_{ij})^m (x_j l_i + y_j m_i + z_j n_i)^2.    (A.6)

Upon close examination of Equation (A.6) it can be seen that the extrema of J_m occur only when the second term on its right-hand side, \sum_{i=1}^{K} \sum_{j=1}^{N} (u_{ij})^m (x_j l_i + y_j m_i + z_j n_i)^2, assumes an extremal value. The problem of optimizing the fuzzy objective function therefore reduces to solving the constrained optimization problem for this second term.

The solution of the constrained optimization problem can be obtained using the method of Lagrange multipliers. For the ith term of the expression to be optimized, let the Lagrangian be

    F_i(\omega, V_i) = \sum_{j=1}^{N} (u_{ij})^m (x_j l_i + y_j m_i + z_j n_i)^2 - \omega (l_i^2 + m_i^2 + n_i^2 - 1),    (A.7)

where \omega is a Lagrange multiplier.

Partial differentiation of Equation (A.7) with respect to l_i yields the result:

    \frac{\partial F_i}{\partial l_i} = 2 \sum_{j=1}^{N} (u_{ij})^m (x_j^2 l_i + x_j y_j m_i + x_j z_j n_i) - 2 \omega l_i.    (A.8)

In similar fashion, partial differentiation of F_i(\omega, V_i) with respect to m_i and n_i leads to the following equations:

    \frac{\partial F_i}{\partial m_i} = 2 \sum_{j=1}^{N} (u_{ij})^m (x_j y_j l_i + y_j^2 m_i + y_j z_j n_i) - 2 \omega m_i,    (A.9)

and

    \frac{\partial F_i}{\partial n_i} = 2 \sum_{j=1}^{N} (u_{ij})^m (x_j z_j l_i + y_j z_j m_i + z_j^2 n_i) - 2 \omega n_i.    (A.10)

The stationary points of F_i(\omega, V_i) are attained only when its partial derivatives are equal to zero. Equating each of Equations (A.8), (A.9) and (A.10) to zero results in the system of equations

    \sum_{j=1}^{N} (u_{ij})^m (x_j^2 l_i + x_j y_j m_i + x_j z_j n_i) = \omega l_i,
    \sum_{j=1}^{N} (u_{ij})^m (x_j y_j l_i + y_j^2 m_i + y_j z_j n_i) = \omega m_i,    (A.11)
    \sum_{j=1}^{N} (u_{ij})^m (x_j z_j l_i + y_j z_j m_i + z_j^2 n_i) = \omega n_i,

which can be rewritten in matrix form as

    \begin{bmatrix}
    \sum_{j=1}^{N} (u_{ij})^m x_j^2   & \sum_{j=1}^{N} (u_{ij})^m x_j y_j & \sum_{j=1}^{N} (u_{ij})^m x_j z_j \\
    \sum_{j=1}^{N} (u_{ij})^m x_j y_j & \sum_{j=1}^{N} (u_{ij})^m y_j^2   & \sum_{j=1}^{N} (u_{ij})^m y_j z_j \\
    \sum_{j=1}^{N} (u_{ij})^m x_j z_j & \sum_{j=1}^{N} (u_{ij})^m y_j z_j & \sum_{j=1}^{N} (u_{ij})^m z_j^2
    \end{bmatrix}
    \begin{bmatrix} l_i \\ m_i \\ n_i \end{bmatrix}
    = \omega \begin{bmatrix} l_i \\ m_i \\ n_i \end{bmatrix}.    (A.12)

Comparing Equation (A.12) with Equation (A.2), it can be seen that \omega and (l_i, m_i, n_i)^T are, respectively, an eigenvalue and eigenvector of the symmetric matrix

    S^* = \begin{bmatrix}
    \sum_{j=1}^{N} (u_{ij})^m x_j^2   & \sum_{j=1}^{N} (u_{ij})^m x_j y_j & \sum_{j=1}^{N} (u_{ij})^m x_j z_j \\
    \sum_{j=1}^{N} (u_{ij})^m x_j y_j & \sum_{j=1}^{N} (u_{ij})^m y_j^2   & \sum_{j=1}^{N} (u_{ij})^m y_j z_j \\
    \sum_{j=1}^{N} (u_{ij})^m x_j z_j & \sum_{j=1}^{N} (u_{ij})^m y_j z_j & \sum_{j=1}^{N} (u_{ij})^m z_j^2
    \end{bmatrix}.    (A.13)

Using the relationship of eigenvalues to eigenvectors given by Equation (A.3), it is possible to establish that

    \omega = \sum_{j=1}^{N} (u_{ij})^m (x_j l_i + y_j m_i + z_j n_i)^2.    (A.14)

Therefore, from Equation (A.6), the fuzzy objective function J_m(U,V) is minimized when this eigenvalue is largest, i.e. when \omega = \omega_3. As a result, the eigenvector x_3 that corresponds to the largest eigenvalue is the mean vector of a fuzzy cluster of directional (spherical) data points, since it is the vector that minimizes the fuzzy objective function.
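The eigenvector result above is straightforward to check numerically. The sketch below is an illustration of the derivation, not the authors' implementation; the function names and the use of NumPy are assumptions. It builds the fuzzy orientation matrix S* of Equation (A.13) from unit direction-cosine vectors and cluster memberships, then returns the eigenvector belonging to the largest eigenvalue as the updated prototype, per Equations (A.12) and (A.14):

```python
import numpy as np

def fuzzy_orientation_matrix(X, u, m=2.0):
    """S* of Equation (A.13) for one cluster i: sum_j (u_ij)^m X_j X_j^T.
    X is an (N, 3) array of unit direction-cosine vectors, u the (N,)
    memberships of the observations in the cluster, m the fuzzification
    exponent."""
    w = u ** m                                 # fuzzified weights (u_ij)^m
    outer = np.einsum('ja,jb->jab', X, X)      # per-observation outer products X_j X_j^T
    return np.einsum('j,jab->ab', w, outer)    # weighted sum -> 3x3 symmetric matrix

def update_prototype(X, u, m=2.0):
    """New prototype V_i: the eigenvector of S* with the largest eigenvalue,
    which maximizes sum_j (u_ij)^m (X_j . V)^2 over unit vectors V
    (Equations (A.12) and (A.14))."""
    S = fuzzy_orientation_matrix(X, u, m)
    _, vecs = np.linalg.eigh(S)                # eigh: symmetric input, eigenvalues ascending
    return vecs[:, -1]                         # column for the largest eigenvalue
```

Because maximizing the weighted sum of squared dot products over unit vectors is a Rayleigh-quotient problem, the top eigenvector of the symmetric matrix S* is the exact maximizer; `numpy.linalg.eigh` returns eigenvalues in ascending order, so the last eigenvector column is the prototype.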
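The distance of Equation (A.4) equals the squared sine of the angle between the two unit vectors, so a pole and its antipode are equidistant from any prototype, which is the symmetry required of axial orientation data. A minimal check (the numbers are hypothetical, not taken from the paper):

```python
import numpy as np

def orientation_distance_sq(X, V):
    """Squared distance of Equation (A.4): d^2 = 1 - (X . V)^2,
    i.e. sin^2 of the angle between the two unit vectors."""
    return 1.0 - (X @ V) ** 2

n = np.array([0.6, 0.0, 0.8])              # a unit pole
V = np.array([0.0, 0.0, 1.0])              # a cluster prototype
d_pos = orientation_distance_sq(n, V)
d_neg = orientation_distance_sq(-n, V)     # antipodal pole, same orientation
# d_pos == d_neg == 1 - 0.8**2 == 0.36
```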