Investigating The Validity of Conventional Joint Set Clustering Methods

Engineering Geology 118 (2011) 75–81

Behzad Tokhmechi a,b,*, Hossein Memarian a, Behzad Moshiri c, Vamegh Rasouli d, Hossein Ahmadi Noubari c
a School of Mining Engineering, University of Tehran, P.O. Box 11365-4563, Tehran, Iran
b School of Mining, Petroleum and Geophysics Engineering, Shahrood University of Technology, P.O. Box 3619995161-316, Shahrood, Iran
c School of Electrical & Computer Engineering, Control and Intelligent Processing, Center of Excellence, University of Tehran, P.O. Box 11365-4563, Tehran, Iran
d Department of Petroleum Engineering, Curtin University of Technology, P.O. Box U1987, WA, Australia
Article info
Article history:
Received 11 April 2010
Received in revised form 26 December 2010
Accepted 15 January 2011
Available online 22 January 2011
Keywords:
Joint set
Joint properties
Parzen
K-means clustering
Principal component analysis
Abstract
Up to 10 properties of joints can be recorded in the field, yet only two (dip and dip direction) are commonly
used to identify joint sets. This paper investigates some of the shortcomings of commonly employed methods
for joint set clustering, based on an analysis of synthetic and field data. First, eight synthetic joint sets were
generated using a normal distribution of joint orientations. Each joint was defined in terms of four properties
(dip, dip direction, infilling material and infilling percentage). A Parzen classifier was used to confirm the
importance of using all the joint properties in identifying the joint sets. To investigate the generalization
ability of this approach, the analysis was extended to 178 joints measured in the field, with seven properties
available for each joint. Joints were clustered based on rose diagrams, stereonets, and K-means clustering
methods, yielding three, five, and seven joint sets, respectively. Calculation of the coefficient of variation and
principal component analysis (PCA) of joint properties resulted in an improvement in clustering, provided
that a large number of joint properties are considered.
© 2011 Elsevier B.V. All rights reserved.
1. Introduction
One of the essential steps during the early stages of field
investigations in most geological, mining, geotechnical, and petroleum exploration projects is to undertake a survey of joints, for which
one of the aims is to distinguish different joint sets. The dip and dip
direction are the most common geometrical properties employed in
joint clustering, with clusters of joints being displayed graphically on
rose diagrams and stereonets. Individual rose diagrams enable just
one joint property (i.e., dip direction or strike) to be plotted, whereas
two joint properties (dip and dip direction or strike) can be shown on
a stereonet. Rose diagrams and stereonets are interpreted visually,
meaning that the interpretation may be subjective, depending on the
interpreter's experience.
The shortcomings of rose diagrams and stereonets can be better
understood when we consider that on some occasions it is necessary
to consider more than two properties of joints for clustering (e.g., two
joints with similar dip and dip direction, but different apertures, have
contrasting effects on fluid flow). Of course, it should be mentioned
that from an engineering viewpoint, the most important attribute is
3D orientation, which answers the question "Is failure kinematically
possible?". It should also be noted that clustering is not practiced only
for failure analysis. It could well be used to identify the direction in
which a drilled well trajectory experiences minimum intersection
with the natural fractures or formation interbeds. Again, from an
engineering perspective, it is very likely that interfaces are studied in
terms of their different properties. Would all fracture planes have
similar strength or roughness, or will they differ significantly from
each other?

* Corresponding author at: School of Mining Engineering, University of Tehran, P.O. Box 11365-4563, Tehran, Iran. Fax: +98 21 88637621.
E-mail address: tokhmechi@ut.ac.ir (B. Tokhmechi).
0013-7952/$ – see front matter © 2011 Elsevier B.V. All rights reserved.
doi:10.1016/j.enggeo.2011.01.002

Further joint properties such as roughness, hardness, aperture, and
infilling can give an indication of the frictional strength of the discontinuity.
Clustering based on continuity (although extremely difficult) can give
an indication of the cohesive strength of the discontinuity.
Consequently, it has been argued that clustering using two
properties is not always sufficient and can lead to misleading results
(Sirat and Talbot, 2001; Zhou and Maerz, 2001, 2002). To address this
problem, joint surveys may record other properties in addition to dip
and dip direction, including continuity, aperture, hardness, roughness,
infilling percent, and infilling material (Brown, 1981).
Multiple joint properties can be utilized in modern clustering
approaches. One of the early attempts in this regard characterized
the geometry of rock joints based on joint system models
(Dershowitz and Einstein, 1988). Harrison (1992) investigated the
application of fuzzy objective functions to the analysis of orientation
data for discontinuities, and Hammah and Curran (1996, 1998)
presented a fuzzy clustering algorithm for the automatic identification
of joint sets. The algorithm was tested on two simulated joint
sets, revealing that in addition to dip and dip direction, roughness
should be taken into account to ensure the identification of the
correct number of joint sets. Hammah and Curran (1998) investigated
the optimal delineation of sets of discontinuities using a fuzzy
clustering algorithm, and Hammah and Curran (1999, 2000)
investigated the use of a fuzzy K-means algorithm for joint set
clustering.
Sirat and Talbot (2001) studied the application of self-organizing
maps of artificial neural networks in joint set clustering, as applied at
the Äspö Hard Rock Laboratory, Sweden, where they observed a good
match between their model results and field observations. Zhou and
Maerz (2001, 2002) applied multivariate clustering analysis to
discontinuity data collected from various sites. The authors developed
various visualization tools, including a 3D stereonet that enables the
use of three joint properties for the identification of joint sets.
Marcotte and Henry (2002) developed an automatic procedure for the
identification of joint sets. Jimenez-Rodriguez and Sitar (2006)
proposed a spectral method for clustering sets of rock discontinuities.
The performance of their algorithm was assessed from benchmark
test cases, using data sets compiled from field measurements. Jimenez
(2007) investigated the importance of fuzzy logic in joint set
clustering, using both synthetic and real data, and demonstrated
that a fuzzy approach is applicable in this regard. Tokhmechi et al.
(2008) used a K-means clustering algorithm to cluster real joint data,
obtaining six joint sets when seven joint properties are taken into
account, but just three sets when only dip and dip direction are
considered. Finally, Tokhmechi et al. (2009a,b) utilized Bayesian and
multi-layered perceptron neural networks to investigate shortcomings
in conventional joint set clustering methods in which various
joint properties are considered.

Table 1
Semi-quantitative coding used to describe infilling percentage and infilling material for synthetic joints.

Infilling material   Code    Infilling percentage   Code
Empty                0       No filling             0
Clay                 0.5     Half filled            0.5
Calcite              1       Fully filled           1

In the present paper, a Parzen classifier is adopted to classify eight
synthetically generated joint sets. Subsequently, a K-means clustering
approach is applied to the clustering of real joint set data. The aim of
this investigation is to assess the importance of using all measurable
joint properties when identifying joint sets.

Fig. 1. a) Rose diagram and b) contoured stereonet of pole density for synthetically
generated joints.

2. Methodology
In undertaking any system of classification, classes are employed
that contain members defined in terms of their properties. The aim is
to determine whether new, undefined members belong to any of the
primary classes. In clustering, data are also defined based on their
properties; however, the number of classes and the dependency of
data on the classes remain unknown. The aim of clustering is to
determine the optimum number of classes and the optimum
distribution of the data among the classes. Parzen and K-means are
examples of classification and clustering techniques, respectively, and
each is considered below.

2.1. Parzen classification

The steps involved in the Parzen classification algorithm are as
follows (Fukunaga and Hayes, 1989; Duda et al., 2003):
a) Normalize the data via the following equation:
$$X_i = \frac{x_i - \mu_i}{\sigma_i}, \tag{1}$$
where $X_i$ and $x_i$ are the normalized and primary vectors of the
same data, respectively, and $\mu_i$ and $\sigma_i$ are the mean and standard
deviation vectors of the training data, respectively.
b) Fix a hyper-space centred on $x_0$ and with the size of $h_n$, where
$x_0$ is the position of the test point, based on its properties.
c) Calculate the volume of the hyper-space as follows:
$$V_n = h_n^{j}, \tag{2}$$
where $j$ is the dimension of the property space. For example, in this
study $j$ is equal to four (dip, dip direction, infilling percentage, and
infilling material).
d) Find the number of training data of each class that lie within the hyper-space.
e) Calculate the function $f(x_0)$ as follows:
$$f(x_0) = \frac{k_n}{N_n V_n}, \tag{3}$$
where $k_n$ is the number of data in class $n$ located in the hyper-space,
$N_n$ is the total number of training data in class $n$, and $V_n$ is the volume
of the hyper-space, as found from Eq. (2).
f) $x_0$ belongs to the class with the largest value of $f(x_0)$.
g) Repeat the above steps for the entire test data (test data must be
selected randomly).
Optimizing the size of the hyper-space is necessary to achieve the
greatest accuracy in the Parzen algorithm.
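
Steps a) to g) of the Parzen procedure can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the hyper-space is taken to be a hypercube of edge `h` centred on the normalized test point, the population standard deviation is used in Eq. (1), and the function names and the toy data are hypothetical.

```python
from statistics import mean, pstdev


def fit_scaler(train):
    """Mean and standard deviation of each property in the training data (Eq. 1)."""
    cols = list(zip(*train))
    mu = [mean(c) for c in cols]
    sigma = [pstdev(c) or 1.0 for c in cols]  # guard against constant columns
    return mu, sigma


def normalize(x, mu, sigma):
    return [(v - m) / s for v, m, s in zip(x, mu, sigma)]


def parzen_classify(x0, train, labels, h):
    """Return the class with the largest f(x0) = k_n / (N_n * V_n) (Eq. 3)."""
    mu, sigma = fit_scaler(train)
    z0 = normalize(x0, mu, sigma)
    volume = h ** len(z0)  # Eq. (2): V_n = h_n^j
    inside, total = {}, {}
    for x, c in zip(train, labels):
        total[c] = total.get(c, 0) + 1
        z = normalize(x, mu, sigma)
        # Assumption: a training point counts if it lies in the hypercube
        # of edge h centred on the test point.
        if all(abs(a - b) <= h / 2 for a, b in zip(z, z0)):
            inside[c] = inside.get(c, 0) + 1
    density = {c: inside.get(c, 0) / (total[c] * volume) for c in total}
    # Ties (including the case where no training point falls in the window)
    # resolve to the first class encountered.
    return max(density, key=density.get)


# Toy data: [dip direction, dip, infilling percentage, infilling material]
train = [[30, 40, 1.0, 1.0], [32, 42, 1.0, 1.0], [35, 45, 1.0, 1.0],
         [200, 50, 0.0, 0.0], [205, 52, 0.0, 0.0], [210, 55, 0.0, 0.0]]
labels = ["A", "A", "A", "B", "B", "B"]
print(parzen_classify([33, 43, 1.0, 1.0], train, labels, h=1.0))
```

In practice `h` would be tuned, for example by repeating the classification over a grid of values and keeping the one that gives the largest confusion-matrix trace.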
2.2. K-means clustering
Among the various techniques developed to simplify computation
and accelerate convergence, we considered one elementary, approximate
method. K-means clustering (KMC) is one of the most widely
used clustering methods. In this approach, the number of joint sets
(K) is known, and the algorithm is used to optimize the assignment of
joints to each of the K sets. To optimize the number of joint sets (K), the
algorithm is applied again, after changing the putative number of joint
sets. The KMC algorithm involves the following steps (Lloyd, 1982):
a) Normalize the data using Eq. (1).
b) Randomly distribute the joints into K putative joint sets, which
results in a primary clustering.
c) Calculate the mean vector $\mu_k$ of joints in each of the created joint
sets:
$$\mu_k = \frac{1}{N_k} \sum_{x_j \in w_k} x_j, \tag{4}$$
where $N_k$ is the number of joints in the kth joint set, and $w_k$ and
$x_j$ are the kth joint set and the property vector of the jth joint in that
set, respectively.
d) Calculate the distances between a randomly chosen joint $x_j$ and
the means of the K joint sets, $\mu_k$:
$$d_{jk} = \lVert x_j - \mu_k \rVert^2. \tag{5}$$
e) Move the joint to the joint set for which the mean vector distance
to the joint is minimized.
f) Repeat the above steps for all joints.
g) Calculate the cost function (J), which shows the accumulated
distance of the mean of all joint sets with their joints, as follows:
$$J = \sum_{k=1}^{K} \sum_{x_j \in w_k} d_{jk}. \tag{6}$$
h) Repeat the above algorithm to minimize J. The cost function is
minimized when the joints are correctly assigned to the joint sets.
i) The process stops when the rate of reduction in J, with each
repetition of step g, falls below a certain threshold.
The optimum number of clusters (K) is selected based on the
concept of cluster validity (Theodoridis and Koutroumbas, 2006):
$$D(U) = \frac{\min_{i=1,\dots,K} \; \min_{j=1,\dots,K} \; d_{\min}(w_i, w_j)}{\max_{l=1,\dots,K} \; d_{\max}(w_l, w_l)} \quad \text{for } i \neq j. \tag{7}$$
The numerator in Eq. (7) is a measure of the distances between
clusters, and the denominator is a measure of the within-cluster distance.
The optimal number of clusters is that which maximizes $D(U)$.

Table 2
Range of properties of synthetic joint sets.

Joint set   Dip direction (degree)   Dip (degree)   Infilling percentage   Infilling material
1           15–95                    20–60          1                      1
2           20–90                    35–65          1                      0.5
3           70–130                   30–60          1                      1
4           70–130                   20–70          1                      0.5
5           90–140                   30–80          0.5                    1
6           190–270                  30–55          1                      1
7           180–270                  35–65          0                      0
8           200–300                  40–70          1                      1

Fig. 2. Eight synthetic joint sets plotted in 3D space defined by a) dip, dip direction, and
infilling percentage, and b) dip, dip direction, and infilling material.
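
The K-means steps a) to i) of Section 2.2 can be sketched as follows. The sketch is a batch (Lloyd-style) variant in which all joints are reassigned before the means are recomputed, rather than one randomly chosen joint at a time; the input is assumed to be already normalized, and the function names, the tolerance `tol`, and the toy data are hypothetical.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two property vectors (Eq. 5)."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def centroid(points):
    """Mean vector of the joints in one set (Eq. 4)."""
    return [sum(col) / len(points) for col in zip(*points)]


def kmeans(data, k, tol=1e-6, max_iter=100, seed=0):
    rng = random.Random(seed)
    # Step b: random primary clustering; the first k joints seed one set each
    # so that no set starts empty.
    assign = [i if i < k else rng.randrange(k) for i in range(len(data))]
    prev_cost = float("inf")
    means, cost = [], prev_cost
    for _ in range(max_iter):
        # Step c: mean vector of each set (an emptied set is re-seeded with a random joint).
        means = []
        for c in range(k):
            members = [x for x, a in zip(data, assign) if a == c]
            means.append(centroid(members) if members else list(rng.choice(data)))
        # Steps d-f: move every joint to the set with the nearest mean.
        assign = [min(range(k), key=lambda c: sq_dist(x, means[c])) for x in data]
        # Step g: cost function J (Eq. 6).
        cost = sum(sq_dist(x, means[a]) for x, a in zip(data, assign))
        # Steps h-i: stop when J no longer decreases appreciably.
        if prev_cost - cost < tol:
            break
        prev_cost = cost
    return assign, means, cost


points = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
assignment, centres, J = kmeans(points, k=2)
print(assignment, round(J, 3))
```

Because the result depends on the random primary clustering, the routine is normally run several times for each K and the best run is retained.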
3. Analysis of synthetic data
3.1. Data generation
Eight synthetic joint sets were generated, each including 200 joints,
and four properties (dip, dip direction, infilling material, and infilling
percentage) were assigned to each joint. The different classes of infilling
material and infilling percentage are listed in Table 1. These data were
generated using a Gaussian distribution, and the range of variation in
these properties is listed in Table 2. The synthetic joint sets were
generated with overlapping orientations; consequently, rose diagrams
and stereonets were not useful in identifying the different sets.
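
A data set of this kind could be generated along the following lines. The paper states only that a Gaussian distribution and the ranges of Table 2 were used, so the details here are assumptions made for illustration: the mean is placed at the midpoint of each range, the standard deviation is one sixth of the range, values outside the range are redrawn, and the infilling codes are held constant within a set. All names are hypothetical.

```python
import random

# (dip direction range, dip range, infilling percentage, infilling material) from Table 2
JOINT_SETS = [
    ((15, 95), (20, 60), 1.0, 1.0),
    ((20, 90), (35, 65), 1.0, 0.5),
    ((70, 130), (30, 60), 1.0, 1.0),
    ((70, 130), (20, 70), 1.0, 0.5),
    ((90, 140), (30, 80), 0.5, 1.0),
    ((190, 270), (30, 55), 1.0, 1.0),
    ((180, 270), (35, 65), 0.0, 0.0),
    ((200, 300), (40, 70), 1.0, 1.0),
]


def sample_in_range(rng, low, high):
    """Draw from N(midpoint, (range / 6)^2), redrawing until the value lies in [low, high]."""
    mu, sigma = (low + high) / 2, (high - low) / 6
    while True:
        value = rng.gauss(mu, sigma)
        if low <= value <= high:
            return value


def generate_joints(n_per_set=200, seed=0):
    """Return (joints, labels); each joint is [dip direction, dip, percentage, material]."""
    rng = random.Random(seed)
    joints, labels = [], []
    for set_id, (dip_dir, dip, pct, mat) in enumerate(JOINT_SETS, start=1):
        for _ in range(n_per_set):
            joints.append([sample_in_range(rng, *dip_dir),
                           sample_in_range(rng, *dip), pct, mat])
            labels.append(set_id)
    return joints, labels


joints, labels = generate_joints()
print(len(joints), "joints in", len(set(labels)), "sets")
```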
3.2. Interpretation of data
Fig. 1 shows a rose diagram and stereonet in which data are plotted
for the synthetic joints. The rose diagram reveals only two joint sets
(N20°E and N40°W) based on the strike of the joints, and the stereonet
reveals at most three joint sets (clusters at azimuths of 050°, 110°,
and 230°) based on the dip and dip direction of the joints. It is clear that
these conventional methods of joint clustering are unable to identify
the eight synthetic joint sets.
Fig. 2 shows the synthetic joint data plotted in three dimensions,
based on dip, dip direction, and infilling percentage (Fig. 2a) or
infilling material (Fig. 2b), revealing four joint sets in both cases.
A combined analysis of Fig. 2a and b reveals at least five joint sets:
(1 and 3), (2 and 4), (5), (6 and 8), and (7). This simple example
emphasizes the importance of using as many joint properties as
possible in identifying joint sets.
Fig. 3. Optimizing $h_n$ in a Parzen classifier (all four joint properties, i.e., dip, dip
direction, infilling material and infilling percentage, are considered). Axes: $h_n$ versus trace of the confusion matrix.
3.3. Data classification

We investigated the capability of a Parzen classifier in discriminating
the eight synthetic joint sets using different sets of properties.

(a) Rows: real classes; columns: decided classes.
      1     2     3     4     5     6     7     8
1   0.92  0     0.08  0     0     0     0     0
2   0     0.96  0     0.04  0     0     0     0
3   0.04  0     0.96  0     0     0     0     0
4   0     0.04  0     0.96  0     0     0     0
5   0     0     0     0     1     0     0     0
6   0.02  0     0     0     0     0.92  0     0.06
7   0     0     0     0     0     0     1     0
8   0     0     0     0     0     0.12  0     0.88

(b) Rows: real classes; columns: decided classes.
      1     2     3     4     5     6     7     8
1   0.70  0.22  0.08  0     0     0     0     0
2   0.08  0.86  0.02  0.04  0     0     0     0
3   0.04  0     0.78  0.18  0     0     0     0
4   0.04  0.02  0.54  0.40  0     0     0     0
5   0     0     0     0     1     0     0     0
6   0.02  0     0     0     0     0.92  0     0.06
7   0     0     0     0     0     0     1     0
8   0     0     0     0     0     0.12  0     0.88

(c) Rows: real classes; columns: decided classes.
      1     2     3     4     5     6     7     8
1   0.70  0.22  0.08  0     0     0     0     0
2   0.08  0.86  0.02  0     0.04  0     0     0
3   0.04  0     0.76  0.20  0     0     0     0
4   0.04  0.02  0.54  0.26  0.14  0     0     0
5   0     0     0.20  0.10  0.70  0     0     0
6   0.02  0     0     0     0     0.64  0.30  0.04
7   0     0     0     0     0     0.38  0.52  0.10
8   0     0     0     0     0     0.06  0.14  0.80

(d) Rows: real classes; columns: decided classes.
      1     2     3     4     5     6     7     8
1   0.58  0.36  0.06  0     0     0     0     0
2   0.66  0.28  0.02  0.04  0     0     0     0
3   0     0.04  0.46  0.24  0.26  0     0     0
4   0     0     0.58  0.28  0.14  0     0     0
5   0     0     0.12  0.02  0.86  0     0     0
6   0     0     0     0     0     0.54  0.32  0.14
7   0     0     0     0     0     0.56  0.32  0.12
8   0     0     0     0     0     0.14  0.24  0.62

Fig. 4. Parzen confusion matrices constructed with optimum $h_n$ and considering a) four joint properties (dip, dip direction, infilling material, and infilling percentage), b) three joint
properties (dip, dip direction, and infilling percentage), c) two joint properties (dip and dip direction), and d) one joint property (dip direction).

The classifier was trained using 70% of the joints in each joint set
(selected randomly) and tested using the remaining 30% of the joints.
The results of such classification methods are generally presented
using a confusion matrix. The value of each element in the matrix shows
how effectively the data in each decided class (shown in columns) are
assigned to the actual class (shown in rows). A class is perfectly classified if
the diagonal element corresponding to that class is equal to 1, which
means that the other elements in that row become zero (i.e., the accuracy of
classification is 100%). The ideal classification occurs when the
corresponding confusion matrix is a unit matrix (i.e., the matrix trace is
equal to the number of classes, n). Any deviation from this ideal situation
reduces the matrix trace from n; hence, the trace of the confusion matrix
can be used as an indication of classification performance.
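
The relation between the confusion matrix, its trace, and the classification accuracy can be illustrated with a short sketch. The helper names are hypothetical; the matrix is row-normalized so that each row (real class) sums to 1, as in the description above, and the diagonal used in the example is the one shown in Fig. 4a.

```python
def confusion_matrix(real, decided, classes):
    """Row-normalized confusion matrix: rows are real classes, columns decided classes."""
    index = {c: i for i, c in enumerate(classes)}
    counts = [[0] * len(classes) for _ in classes]
    for r, d in zip(real, decided):
        counts[index[r]][index[d]] += 1
    return [[v / max(sum(row), 1) for v in row] for row in counts]


def trace(matrix):
    return sum(matrix[i][i] for i in range(len(matrix)))


def accuracy_percent(matrix):
    """Trace divided by the number of classes n, expressed as a percentage."""
    return 100.0 * trace(matrix) / len(matrix)


# Small example with two classes.
cm = confusion_matrix(["a", "a", "b", "b"], ["a", "b", "b", "b"], ["a", "b"])
print(cm, trace(cm), accuracy_percent(cm))

# Diagonal of the four-property matrix (Fig. 4a): trace 7.6 out of 8 classes.
diag_4a = [0.92, 0.96, 0.96, 0.96, 1.0, 0.92, 1.0, 0.88]
print(round(sum(diag_4a), 2), round(100 * sum(diag_4a) / len(diag_4a), 2))
```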
As mentioned in Section 2, the optimization of $h_n$ is important when
using a Parzen classifier. Fig. 3 shows the optimization of $h_n$ in the case that
four properties are used in the classification. The maximum trace of the
confusion matrix (i.e., greatest accuracy) is 7.6 (95% accuracy); this occurs
when $h_n$ is equal to 0.4.
The classification was first performed in 4D space using all the joint
properties. It was then repeated in 3D space using the properties of dip,
dip direction, and one of infilling percentage or infilling material.
Classifications using dip and dip direction (in 2D space) and dip
direction (in 1D space) were carried out to investigate the capabilities of
the stereonet and rose diagram in joint set classification, respectively.
The results of classification using the Parzen algorithm are shown
in Fig. 4. The trace of the confusion matrix is reduced with decreasing
number of properties employed in the classification. The classification
accuracy decreases from 95% when four properties are considered, to
about 49% when only one property is taken into account
(corresponding to the use of a rose diagram). In Fig. 4b and c, the
Parzen confusion matrix is shown considering three properties (dip,
dip direction, and infilling percentage) and two properties (dip and
dip direction), respectively. Table 3 summarizes the performance of
the Parzen classifier when different numbers of joint properties are
taken into account. A comparison of the second and third rows in
Table 3 reveals that infilling material, which represents more
variability (based on Table 2), has a greater influence on the
classification than does infilling percentage.
4. Clustering of real data

We assessed the performance of K-means clustering (KMC) using
real data. In this section, the importance of using various joint properties
is investigated in terms of clustering the joint sets in real space.
Table 3
Effect of the number and choice of joint properties on the performance of the Parzen classifier.

Row   No. of properties   Joint properties used                                           Confusion matrix trace   Accuracy (%)
1     4                   Dip, dip direction, infilling material, infilling percentage   7.60                     95.00
2     3                   Dip, dip direction, infilling percentage                        6.54                     81.75
3     3                   Dip, dip direction, infilling material                          6.78                     84.75
4     2                   Dip, dip direction                                              5.24                     65.50
5     1                   Dip direction                                                   3.94                     49.25

Table 4
Quantification of survey data for joints.

Property     Description        Quantity
Continuity   <1 m               1
             1–3 m              2
             3–10 m             3
             10–30 m            4
Ends         No ends visible    1
             One end visible    2
             Two ends visible   3
Roughness    Stepped            1
             Rough              2
             Moderately rough   3
             Slightly rough     4
             Smooth             5
             Slickenside        6
Hardness     Extremely hard     1
             Very hard          2
             Hard               3
             Moderately hard    4
             Moderately soft    5
             Soft               6
             Very soft          7
Aperture     Tight              1
             <1 mm              2
             1–3 mm             3
             3–10 mm            4
             10–30 mm           5
             >30 mm             6

Fig. 5. Orientation data for joints within the Asmari Formation at the Seymareh Dam site. a) Rose diagram, b) poles to joints, and c) contoured poles to joints.
Table 5
Absolute and relative coefficients of variation for the properties of joints in the study area.

Coefficient of variation    Dip direction   Dip    Continuity   Ends   Roughness   Aperture   Hardness
Absolute                    0.81            0.48   0.43         0.37   0.33        0.29       0.12
Relative (%)                28.6            17.0   15.2         13.1   11.6        10.3       4.2
Accumulative relative (%)   28.6            45.6   60.8         73.9   85.5        95.8       100

4.1. Structure and properties of real data
In this analysis, we consider seven properties (dip, dip direction,
continuity, ends, roughness, hardness, and aperture), defined in
Table 4, for 178 joints in the Asmari Formation at the site of the
Seymareh Dam, southwest Iran. Fractured carbonate of the
Oligocene–Miocene Asmari Formation hosts the most productive oil and
gas reservoirs within the Zagros Basin (Alavi, 2004). Many major
dams and civil structures have been constructed upon the Asmari
Formation in other parts of Iran. In geotechnical investigations of dam
foundations, tunnels, slopes, etc., the orientation (dip and dip
direction) of joints is not the only factor that influences stability; for
example, open and closed joints with similar orientations have
contrasting impacts on seepage rates and stability.
Table 4 shows the format employed in quantifying the joint data.
In this table the classes of joint properties are selected based on Brown
(1981) and subclasses are numbered from one to the number of
subclasses. A rose diagram and stereonet projections of the data are
shown in Fig. 5. The rose diagram reveals just three joint sets (N20°W,
N50°W, and E–W), and the stereonets reveal four or five joint sets
(azimuths of 020°, 070°, 180°, 250°, and 270°). It is clear that a degree of
uncertainty exists in interpreting the stereonet.

Fig. 6. Optimization of the number of joint sets based on Eq. (7). Axes: number of joint sets versus optimum index.

4.2. Improved clustering using multiple joint properties
Table 5 lists the absolute and relative coefficients of variation of
seven joint properties. The strongest joint property is dip direction,
which contributes 28.6% of the total variation in joint properties. This
finding means that no single joint property could represent the
majority of the variation in joints observed throughout the study area.
For this reason, the rose diagram is unsuitable as a method of joint set
clustering. The relative coefficient of variation for the two strongest
joint properties (dip and dip direction) is 45.6%; therefore, the use of a
stereonet does not take into account the majority of the variation in
joint properties, suggesting that the stereonet is an incomplete
approach to joint set clustering.

Table 6
Eigenvalues of the PCA covariance matrix for the surveyed joint properties.

Sorted principal components   Eigenvalue   Eigenvalue (%)   Accumulative eigenvalue (%)
1                             2.58         36.9             36.9
2                             1.34         19.1             56.0
3                             1.04         14.9             70.9
4                             0.73         10.4             81.3
5                             0.55         7.9              89.2
6                             0.40         5.7              94.9
7                             0.36         5.1              100

Fig. 7. Discrimination of the seven optimum joint sets using seven joint properties,
based on KMC results. Axes: dip direction, dip, and continuity.

Table 7
Properties of the seven joint sets.

Joint set   No. of joints   Dip direction (degree)   Dip (degree)   Continuity   Ends   Roughness   Hardness   Aperture
i           34              5–40                     40–80          1–2          2–3    1–3         3–4        2–3
ii          16              20–50                    45–75          3–4          1–2    3–5         2–3        3–5
iii         22              50–80                    45–85          1–2          1–2    2–5         3–4        3–5
iv          32              60–90                    40–70          3–4          1–2    3–5         2–3        3–5
v           25              170–200                  40–85          1–2          1–3    2–4         3–4        1–4
vi          14              250–300                  70–85          1–2          1–3    2–4         3–4        3
vii         29              250–290                  10–50          1–4          1–3    1–3         2–4        2–4

Principal component analysis (PCA) was performed to investigate
the importance of the coefficients of the various linear components of joint
properties in joint set clustering. Seven principal components (PCs)
were created, corresponding to the seven joint properties considered
in the analysis (Table 6). Each PC is an unbiased weighted average of
all the joint properties. The strongest PC represents the greatest
variability in 1D space, which is about 37%. The two strongest PCs
represent 56% of the variability. Therefore, no 1D or 2D space can
represent the strong majority of the variability in joint properties
throughout the study area. This finding confirms the shortcomings of
the use of rose diagrams and stereonets in terms of joint set clustering
for joints in the present study area.
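
The relative and accumulative percentages quoted for the coefficients of variation (Table 5) and for the PCA eigenvalues (Table 6) are each value's share of the column total. The following minimal sketch reproduces that arithmetic from the tabulated absolute values; the function names are hypothetical, and small differences in the last digit arise because the paper accumulates rounded percentages.

```python
def relative_shares(values):
    """Each value as a percentage of the total."""
    total = sum(values)
    return [100.0 * v / total for v in values]


def cumulative(values):
    """Running sum of a list."""
    out, running = [], 0.0
    for v in values:
        running += v
        out.append(running)
    return out


# Absolute coefficients of variation (Table 5), ordered from strongest to weakest property.
cv_absolute = [0.81, 0.48, 0.43, 0.37, 0.33, 0.29, 0.12]
# Eigenvalues of the sorted principal components (Table 6).
eigenvalues = [2.58, 1.34, 1.04, 0.73, 0.55, 0.40, 0.36]

cv_share = relative_shares(cv_absolute)
pc_share = relative_shares(eigenvalues)
print([round(v, 1) for v in cumulative(cv_share)])
print([round(v, 1) for v in cumulative(pc_share)])
```

Reading the second element of each accumulated list gives the share captured by the best two properties or components, which is the quantity compared against a stereonet in the text.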
4.3. Clustering of real data

We applied the KMC algorithm to the real data. For each of various
putative joint sets (K), the algorithm was run more than 10 times [22,
23] and the optimum result was used as the value of D(U), which was
calculated via Eq. (7) (Figure 6). The resulting optimum number of
joint sets is seven, for which D(U) is maximized.
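
The selection of K described above can be sketched as follows. The sketch assumes that $d_{\min}$ in Eq. (7) is the smallest distance between joints of two different sets and that $d_{\max}$ is the largest distance between two joints of the same set (the usual Dunn-type index); the function names and the toy data are hypothetical, and the candidate assignments would normally come from repeated K-means runs.

```python
from itertools import combinations
from math import dist


def validity_index(data, assign):
    """D(U): smallest between-set distance divided by the largest within-set distance."""
    groups = {}
    for x, a in zip(data, assign):
        groups.setdefault(a, []).append(x)
    sets = list(groups.values())
    if len(sets) < 2:
        raise ValueError("at least two joint sets are required")
    separation = min(dist(p, q)
                     for g1, g2 in combinations(sets, 2) for p in g1 for q in g2)
    diameter = max((dist(p, q) for g in sets for p, q in combinations(g, 2)),
                   default=0.0)
    return separation / diameter if diameter > 0 else float("inf")


def best_k(data, runs_by_k):
    """runs_by_k maps K to a list of assignments from repeated runs; keep the best D(U)."""
    scores = {k: max(validity_index(data, a) for a in runs)
              for k, runs in runs_by_k.items()}
    return max(scores, key=scores.get), scores


points = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
runs = {2: [[0, 0, 0, 1, 1, 1]], 3: [[0, 0, 2, 1, 1, 1]]}
k_opt, scores = best_k(points, runs)
print(k_opt, scores)
```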
Given that it is impossible to visualize the discrimination of seven
joint sets in 7D space, Fig. 7 shows, as an example, the distribution of
joints in 3D space, based on their dip, dip direction, and continuity.
The seven joint sets are discriminated based on the optimum KMC
results. The properties of the seven joint sets are summarized in
Table 7, based on the quantification scheme outlined in Table 4.
Stereonets of the seven joint sets are shown in Fig. 8. The sets (i and ii)
and (iii and iv) are separated based on properties other than dip and
dip direction. Joint set (vi) cannot be recognized from stereonet
analysis (Fig. 5c) because of the small number of joints in the set
and the wide range of dips and dip directions.

Fig. 8. Contoured stereoplots of poles to joints in the seven optimum joint sets (i to vii) identified when applying KMC to real joint data in 7D space.

The dip and dip direction of joints in sets i and ii are similar
(Table 7), consistent with the geological setting of the study area. The
folds in this part of the Zagros mountain belt trend NW–SE, having
resulted from collision between the Arabian and Iranian plates (Alavi,
2004). The dip and dip direction of joints in sets i and ii are in good
agreement with the orientation of tensile joints expected to develop
parallel to axial planes. The other five joint properties have
contrasting values between sets i and ii (Table 7). For example, the
general properties of joint set i are as follows: length < 3 m, maximum
of one visible end, large value of roughness, medium hardness, and
aperture < 3 mm. In contrast, these properties for joint set ii are as
follows: length > 3 m, at least one visible end, medium roughness, high
value of hardness, and aperture between 1 and 10 mm.
In terms of engineering applications, these properties could
significantly influence the design process of the Seymareh Dam. The
possibility of shear sliding along the joint surfaces is one of the major
concerns arising from these structures, as such sliding could lead to
instability of the dam abutments, reservoir slopes, access tunnels, etc.
The roughness of the joint surface has a strong influence on the sliding
potential; therefore, it must be considered when instability parameters
are estimated (Barton et al., 1985). The data in Tables 4 and 7
indicate that the surfaces of joints in set i are less rough than those in
set ii, making them more susceptible to shear failure. Another
example of the differences between the two joint sets is the larger
aperture of joints in set ii, which could provide a path for infiltrating
water, thereby causing further instability due to the hydrostatic
weight of fluids. These examples clearly show the importance of
identifying the properties of different joint sets in terms of their
impact on engineering applications, which was one of the motivations
behind the present research.
5. Conclusions
Although up to 10 joint properties are generally recorded during
joint surveys, only 1 or 2 (i.e., dip and dip direction) are conventionally
used for joint clustering when the data are plotted on a rose diagram or
stereonet. In this paper, we argued that to achieve a more accurate
clustering for geological or engineering applications, additional
properties of joints should be considered. This expanded approach
results in the identification of joint sets in nD space, where n is the
number of joint properties. Two well-defined classification and
clustering methods, Parzen and K-means clustering, were used to
investigate the importance of considering all the available joint
properties when identifying joint sets. We analyzed both synthetically
generated joint sets and real data collected from the Seymareh Dam,
southwest Iran. The proposed method was able to differentiate all
eight synthetic joint sets, whereas conventional methods (i.e., joint
strike or dip and dip direction) revealed a maximum of three sets.
Similarly, the proposed method was able to identify sets in the real
data that were not recognized using conventional methods. The
correct recognition and characterization of joint sets can help
geologists and engineers to improve their analyses and design
processes in civil, mining, and petroleum projects.
Acknowledgements
We thank the Mahab Ghods Company of Iran, especially Mr Solgi,
for kindly providing data for analysis. We thank Dr. R. Freij-Ayoub and
Dr. B. Ciftci for critical reviews of the manuscript.
References
Alavi, M., 2004. Regional stratigraphy of the Zagros fold-thrust belt of Iran and its
proforeland evolution. AJS 304, 1–20.
Barton, N., Bandis, S., Bakhtar, K., 1985. Strength, Deformation and Permeability of rock
joints. Int J Rock Mech Min Sci Geomech Abstr. 22, 121–140.
Dershowitz, W.S., Einstein, H.H., 1988. Characterizing Rock Joint Geometry with Joint
System Models. Rock Mech. Rock Eng. 21, 21–51.
Duda, R.O., Hart, P.E., Stork, D.G., 2003. Pattern Classification, second ed. John Wiley &
Sons, New York.
Fukunaga, K., Hayes, R.R., 1989. The Reduced Parzen Classifier. TPAMI 11 (4), 423–425.
Hammah, R.E., Curran, J.H., 1996. Optimal delineation of discontinuity sets using a fuzzy
clustering algorithm. Int J Rock Mech Min Sci Geomech Abstr. 35 (4), 495–496.
Hammah, R.E., Curran, J.H., 1998. Fuzzy Cluster Algorithm for the Automatic
Identification of Joint Sets. Int. J. Rock Mech. Min. Sci. 35 (7), 889–905.
Hammah, R.E., Curran, J.H., 1999. On distance measures for the fuzzy K-means
algorithm for joint data. Rock Mech. Rock Eng. 32 (1), 1–27.
Hammah, R.E., Curran, J.H., 2000. Validity measures for the fuzzy cluster analysis of
orientation. TPAMI 22 (12), 1467–1472.
Harrison, J.P., 1992. Fuzzy objective functions applied to the analysis of discontinuity
orientation data. ISRM Symposium, Eurock 92, Chester, England, pp. 25–30.
Brown, E.T. (editor), 1981. Rock Characterization, Testing and Monitoring: ISRM
Suggested Methods, Pergamon Press, 211 pp.
Jimenez, R., 2007. Fuzzy spectral clustering for identification of rock discontinuity sets.
Tech. note Rock Mech Rock Eng. 41 (6), 929–939.
Jimenez-Rodriguez, R., Sitar, N., 2006. A Spectral Method for Clustering of Rock
Discontinuity Sets. Int. J. Rock Mech. Min. Sci. 43, 1052–1061.
Lloyd, S.P., 1982. Least squares Quantization in PCM. IEEE TIT. IT 28 (2), 129–137.
Marcotte, D., Henry, E., 2002. Automatic Joint Set Clustering Using a Mixture of Bivariate
Normal Distribution. Int. J. Rock Mech. Min. Sci. 39, 323–334.
Sirat, M., Talbot, C.G., 2001. Application of Artificial Neural Networks to Fracture
Analysis at the Äspö HRL, Sweden: Fracture Sets Classification. Int. J. Rock Mech.
Min. Sci. 38, 621–639.
Theodoridis, S., Koutroumbas, K., 2006. Pattern Recognition, Third ed. Academic Press,
San Diego, USA.
Tokhmechi, B., Memarian, H., Ahmadi Noubari, H., Moshiri, B., 2008. Joint study based
on K means clustering, Asmari Formation, south west Iranian oil fields. 5th Asian
Rock Mechanics Symposium (ARMS5), Tehran, Iran, pp. 1303–1308.
Tokhmechi, B., Memarian, H., Moshiri, B., Ahmadi Noubari, H., 2009a. New logic in the
joint set classification using MLP neural network and discussion about their
uncertainties. J. Earth Persian 4 (1), 11–27.
Tokhmechi, B., Memarian, H., Ahmadi Noubari, H., 2009b. A new method for Joint set
classification based on Bayesian optimum classifier. Geosciences, Accepted for
publication (in Persian), 17 pp.
Zhou, W., Maerz, N.H., 2001. Multivariate clustering analysis of discontinuity data:
Implementation and applications. Rock Mechanics in the National Interest;
Proceedings of the 38th U.S. Rock Mechanics Symposium, Washington, D.C, pp.
861–868.
Zhou, W., Maerz, N.H., 2002. Implementation of multivariate clustering methods for
characterizing discontinuities data from scanlines and oriented boreholes. C&G 28,
827–839.
