$$Z^{*} = \frac{\sum_{i=1}^{n} Z_i / d_i^{a}}{\sum_{i=1}^{n} 1 / d_i^{a}} \qquad (1)$$
where:
$Z^{*}$ is the estimated quantity, $Z_i$ is the observed quantity at the i-th station, $d_i$ is the distance between the i-th station and the unsampled point, $a$ is the power, usually between 1 and 3, and $n$ is the number of sampled points involved in the interpolation. The power $a$ influences the accuracy of the estimates: nearby points are given greater weights as $a$ increases. The interpolation was carried out at a 500-m pixel size.
In the IDW method, the weights of the sampled points are
determined solely by their distance to the unsampled
point; the position and spatial distribution of the stations
have no effect on the estimate. However, to study the
dependence of the results on station density, four
station-density scenarios with different distances from the
region boundary were considered. Thus, 45, 33, 22, and 15
stations were involved in scenarios 1 to 4, respectively.
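The IDW estimate described above can be illustrated with a minimal Python sketch. The function name `idw_estimate`, the station coordinates, and the rainfall values are hypothetical and chosen only for illustration; returning the observed value when the target coincides with a station is also an assumption, made to avoid division by zero.

```python
import math

def idw_estimate(stations, target, power=2.0):
    """Inverse-distance-weighted estimate at `target` from (x, y, z) stations."""
    num = 0.0
    den = 0.0
    for x, y, z in stations:
        d = math.hypot(x - target[0], y - target[1])
        if d == 0.0:
            # Assumption: a target that coincides with a station takes its value.
            return z
        w = 1.0 / d ** power  # weight 1 / d_i^a
        num += w * z
        den += w
    return num / den

# Hypothetical stations: (x in m, y in m, annual rainfall in mm).
stations = [(0.0, 0.0, 100.0), (1000.0, 0.0, 200.0), (0.0, 2000.0, 400.0)]
target = (250.0, 0.0)
estimate_a1 = idw_estimate(stations, target, power=1.0)
estimate_a3 = idw_estimate(stations, target, power=3.0)
```

With the larger power, the estimate moves toward the value of the nearest station, which is the behavior the text attributes to increasing a.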
To investigate the annual rainfall patterns, a clustering
and a classification method were applied. Clustering
algorithms can be divided into hard clustering and fuzzy
clustering. In hard clustering, each feature vector is
assigned to one of the clusters with a degree of membership
equal to one. This is based on the assumption that feature
vectors can be divided into non-overlapping clusters with
well-defined boundaries between them. Fuzzy clustering
allows a feature vector to belong to all the clusters
simultaneously, each with a degree of membership in the
[0, 1] interval, which means that the cluster boundaries
overlap.
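The difference between the two kinds of membership can be shown with a small Python sketch. The membership values and the helper `harden` are hypothetical; assigning each vector to the cluster with the largest grade, with ties going to the first such cluster, is an assumption made for illustration.

```python
# Hypothetical fuzzy memberships: one row per feature vector, one column per cluster.
fuzzy_u = [[0.9, 0.1], [0.5, 0.5], [0.2, 0.8]]

def harden(memberships):
    """Turn fuzzy grades into hard ones: 1 for the largest grade, 0 elsewhere."""
    hard = []
    for row in memberships:
        best = row.index(max(row))  # ties go to the first cluster (assumption)
        hard.append([1 if i == best else 0 for i in range(len(row))])
    return hard

hard_u = harden(fuzzy_u)
```

The second vector sits exactly on the overlap between the two clusters; hard clustering has to discard that information, whereas the fuzzy grades keep it.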
Generally, all clustering methods are designed to maximize
within-group similarity and to minimize between-group
similarity. To achieve this purpose, some measures of
similarity or distance between pairs of observations/objects
must be established. The most commonly used distance
measure is the Euclidean distance (Bunkers et al. 1996).
In this study, the fuzzy c-means (FCM) method described by
Bezdek (1981) is used as the pattern clustering method, with
Euclidean distance as the measure of similarity. Natural
breaks are used for classification: Jenks' optimization
method (Jenks 1967) is employed, so that the class boundary
values are determined in such a way that the average squared
deviation within each class is minimized.
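The idea behind natural breaks can be sketched in Python as follows. This is not Jenks' own algorithm: it is an exhaustive search over all contiguous splits of the sorted data, practical only for small samples. It assumes the criterion is the total squared deviation from the class means, and the function names and rainfall values are hypothetical.

```python
from itertools import combinations

def sum_sq_dev(values):
    """Sum of squared deviations of `values` from their mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

def natural_breaks(values, n_classes):
    """Exhaustively find the contiguous classes with the smallest total squared deviation."""
    data = sorted(values)
    best_cost, best_classes = None, None
    # Each combination of cut positions splits the sorted data into contiguous classes.
    for cuts in combinations(range(1, len(data)), n_classes - 1):
        edges = (0,) + cuts + (len(data),)
        classes = [data[a:b] for a, b in zip(edges, edges[1:])]
        cost = sum(sum_sq_dev(c) for c in classes)
        if best_cost is None or cost < best_cost:
            best_cost, best_classes = cost, classes
    return best_classes

# Hypothetical annual rainfall values (mm).
rain = [110, 120, 130, 300, 310, 320, 700, 720]
classes = natural_breaks(rain, 3)
```

For these values the search groups the three visibly separated ranges into three classes; the class boundaries then follow from the largest value in each class.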
2.1 Clustering with FCM algorithm
The determination of the number of clusters is the most
important issue in clustering algorithms. Here, we use the
cluster validity index (CVI) criterion proposed by Fukuyama
and Sugeno (1989), as follows:
$$S(c) = \sum_{k=1}^{N} \sum_{i=1}^{c} (\mu_{ik})^{m} \left( \lVert x_k - v_i \rVert^{2} - \lVert v_i - \bar{x} \rVert^{2} \right) \qquad (2)$$
Where:
$N$ is the number of data to be clustered; $c$ is the number of clusters, $c \ge 2$; $x_k$ is the k-th data, usually a vector; $\bar{x}$ is the average of the data $x_1, x_2, \ldots, x_n$; $v_i$ is the vector expressing the center of the i-th cluster; $\lVert \cdot \rVert$ is the norm; $\mu_{ik}$ is the grade of membership of the k-th data in the i-th cluster; and $m$ is an adjustable weight (usually $m$ = 1.5 to 3).
The number of clusters, c, is determined so that S(c)
reaches a minimum as c increases. It is also imposed that:
$$\sum_{i=1}^{c} \mu_{ik} = 1 \qquad (3)$$
which means that the memberships of a chosen input feature
vector over all the c fuzzy clusters should sum up to 1.0.
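As a rough illustration, the validity index S(c) and the sum-to-one constraint can be evaluated in Python as sketched below. The function names, the data layout (`u[i][k]` holds the membership of the k-th data in the i-th cluster), the default `m=2.0`, and the sample values are assumptions made for this sketch, not part of the original method description.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))

def fs_index(data, centers, u, m=2.0):
    """Validity index S(c): compactness minus separation, weighted by u[i][k]**m."""
    n = len(data)
    dim = len(data[0])
    mean = [sum(x[j] for x in data) / n for j in range(dim)]  # average of all data
    total = 0.0
    for i, v in enumerate(centers):
        spread = sq_dist(v, mean)  # ||v_i - x_bar||^2
        for k, x in enumerate(data):
            total += u[i][k] ** m * (sq_dist(x, v) - spread)
    return total

def memberships_sum_to_one(u, tol=1e-9):
    """Check that, for every data point, the memberships over all clusters sum to 1."""
    n = len(u[0])
    return all(abs(sum(row[k] for row in u) - 1.0) <= tol for k in range(n))

# Hypothetical one-dimensional data with two candidate clusters.
data = [[0.0], [1.0], [9.0], [10.0]]
centers = [[0.5], [9.5]]
u = [[0.9, 0.8, 0.1, 0.2], [0.1, 0.2, 0.9, 0.8]]
valid = memberships_sum_to_one(u)
s_two = fs_index(data, centers, u)
```

In practice the index would be evaluated for several values of c, each with its own centers and memberships, and the c giving the smallest S(c) would be retained.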
The procedure for determining the cluster centers and the
grade of membership of the k-th data in the i-th cluster is
as follows (Sugeno and Yasukawa 1993):
1. Set t (iteration index) to unity.
2. Set an initial vector for cluster centers: $V^{0} = (v_1, v_2, \ldots, v_c)$.
3. Calculate the membership matrix $U^{t}_{c \times N}$ from the vector of cluster centers determined in the previous step (c is the number of clusters and N is the number of data):
$$\mu_{ik} = \frac{1}{\sum_{j=1}^{c} \left( \dfrac{\lVert x_k - v_i \rVert}{\lVert x_k - v_j \rVert} \right)^{\frac{2}{m-1}}} \qquad (4)$$
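4. Calculate the new vector of cluster centers $V^{t}$ from the matrix $U^{t}_{c \times N}$; for each cluster i:
$$v_i^{t} = \frac{\sum_{k=1}^{N} (\mu_{ik})^{m} x_k}{\sum_{k=1}^{N} (\mu_{ik})^{m}} \qquad (5)$$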
5. If V
t
V
t1