

CHAPTER 3

CLUSTERING APPROACHES ON VLSI CIRCUIT PARTITIONING

3.1. Overview

Circuit partitioning is one of the critical areas of VLSI physical design automation. The principal function of VLSI circuit partitioning is to divide the circuit into a number of sub-circuits with minimum interconnections between them. Designing complex logic circuits requires sub-division of a multi-million transistor design into manageable pieces within a hierarchy. The presence of hierarchy gives rise to natural clusters of cells. However, most of the widely used algorithms tend to ignore this fact and divide the netlist into balanced partitions, resulting in partitions which are not optimal.

This chapter begins with an overview of existing approaches to the graph and hypergraph partitioning problems. Since graph and hypergraph partitioning are NP-complete, this study considers the need for clustering models that yield good sub-optimal solutions. To address this problem, a clustering-based approach to the VLSI circuit partitioning problem is proposed, in which sub-circuits with the lowest amount of interconnection between them are sought. Finally, three different data clustering methods, namely two-step clustering, hierarchical clustering and K-medoids clustering, were considered for dividing the circuits into sub-circuits.

3.2. Clustering Methods

Clustering is one of the techniques for mining data from data warehouses. In the clustering process, a set of objects is partitioned into subsets, also known as clusters, such that objects are similar within a cluster but dissimilar to objects in different subsets [45]. A cluster is considered more distinct when the objects within it are very similar (that is, highly homogeneous) and very dissimilar to those in other groups. Data represented by a few clusters inevitably lose finer detail but, at the same time, achieve simplification. Clustering techniques are used in the analysis of gene expression data [46], data compression [47], anomaly detection [48], statistical data analysis, pattern recognition, machine learning, image analysis, bio-informatics, information retrieval [49] and structuring the results of search engines [50].

In addition to the term clustering, several other terms with similar meanings are used, such as numerical taxonomy, automatic classification, botryology and typological analysis. Clustering is thus a form of unsupervised learning that results in a concept of the data [51]-[54]. Cluster analysis is not one specific algorithm; the solution can be obtained by a variety of algorithms that differ significantly in their notion of what constitutes a cluster and in how efficiently clusters can be found.

The most common notions of a cluster include groups whose members are separated by short distances, dense areas of the data space, and groups following particular statistical distributions. Clustering can therefore be formulated as a multi-objective optimization problem. The intended use of the individual data set dictates the type of algorithm and the parameters used. Moreover, clustering is an iterative process of knowledge discovery and interactive multi-objective optimization that proceeds by trial and error; as a consequence, it is not an automatic process. Preprocessing and parameters are modified until the desired properties are achieved. Clustering methods are usually based on measuring distances between records and clusters. Records are assigned to clusters using a methodology that tends to reduce the distance between records that belong to the same cluster. The clusters, subsets or segments created through the clustering models are then used as inputs for subsequent analyses.

According to Duda et al. [55], clustering helps to reduce the amount of data by grouping or categorizing similar data into one group. The main objective of using clustering algorithms is to construct automated tools that support the creation of categories or taxonomies. The automation in turn helps to reduce human intervention in the process [56], [57].

Clustering methods are usually classified into two basic types: hierarchical and partitional [53], [56]-[61]. Both types contain many subtypes, and entirely different algorithms are used to find the clusters. Figure 3.1 illustrates the clustering methods.

Figure 3.1: Clustering methods



In data mining, hierarchical clustering is a method of cluster analysis which seeks to build a hierarchy of clusters. The strategies for hierarchical clustering generally fall into the following two types (Figure 3.2):

 Agglomerative [58]: Merges smaller clusters into larger ones with a “bottom-up” approach, where each observation starts in its own cluster and pairs of clusters are merged as one moves up the hierarchy.

 Divisive: Splits larger clusters into smaller ones with a “top-down” approach, where all observations start in one cluster and splits are performed recursively as one moves down the hierarchy.

Figure 3.2: Agglomerative and divisive clustering

These methods produce a tree of clusters, commonly called a dendrogram, that reveals the way in which the clusters are related to each other. There are three main linkage criteria for hierarchical algorithms: the single-linkage method, where two clusters are merged based on the closest pair made up of one individual from each cluster; the complete-linkage method, where two clusters are merged based on the most distant such pair; and the group-average method, where two clusters are merged based on the average distance over all pairs consisting of one individual from each cluster.

Partitional clustering [58], on the other hand, directly decomposes the data set into a set of smaller disjoint clusters. In this approach, an integer number of partitions is determined so as to optimize a certain criterion function. The optimization is an iterative process which highlights the local or global structure of the data. Generally, the global criteria involve minimizing some measure of dissimilarity among the samples within each cluster while maximizing the dissimilarity between different clusters.

In K-means clustering [62], the criterion function is calculated as the average squared distance of the data items from their nearest cluster centroids,

E = (1/N) * Σ_{k=1}^{N} || x_k − m_{c(x_k)} ||²

where x_k is the k-th of the N data items and c(x_k) is the index of the centroid that is closest to x_k. One possible algorithm for minimizing this cost function begins by initializing a set of K cluster centroids denoted by m_i, i = 1, ..., K.

The positions of the m_i are then adjusted iteratively by first assigning each data sample to its nearest centroid and then recomputing the respective centroids. The iteration stops when E no longer changes. In an alternative, on-line version of the algorithm, randomly chosen samples are considered one at a time and only the nearest centroid is updated. The same equation is also used to represent the objective of a related method called vector quantization [63]-[65].
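The following C++ sketch illustrates this batch K-means procedure. It is a minimal illustration rather than the implementation used in this study; the toy data set, the value of K, the initialization and the convergence tolerance are assumptions for demonstration.

// Minimal batch K-means sketch (illustrative only; the point set, K and
// the convergence tolerance below are assumptions for demonstration).
#include <cmath>
#include <cstdio>
#include <vector>

struct Point { double x, y; };

static double sqDist(const Point& a, const Point& b) {
    double dx = a.x - b.x, dy = a.y - b.y;
    return dx * dx + dy * dy;
}

int main() {
    // Toy data set: two loose groups of 2-D points.
    std::vector<Point> data = {{1.0, 1.1}, {0.9, 0.8}, {1.2, 1.0},
                               {8.0, 8.2}, {7.8, 7.9}, {8.3, 8.1}};
    const int K = 2;
    std::vector<Point> m(K);                       // centroids m_i
    for (int i = 0; i < K; ++i) m[i] = data[i];    // simple initialization

    std::vector<int> c(data.size(), 0);            // c(x_k): index of nearest centroid
    double prevE = -1.0;
    for (int iter = 0; iter < 100; ++iter) {
        // Assignment step: attach every sample to its nearest centroid.
        double E = 0.0;
        for (size_t k = 0; k < data.size(); ++k) {
            int best = 0;
            double bestD = sqDist(data[k], m[0]);
            for (int i = 1; i < K; ++i) {
                double d = sqDist(data[k], m[i]);
                if (d < bestD) { bestD = d; best = i; }
            }
            c[k] = best;
            E += bestD;
        }
        E /= data.size();                          // average squared distance (criterion E)

        // Update step: recompute each centroid as the mean of its members.
        std::vector<Point> sum(K, {0.0, 0.0});
        std::vector<int> cnt(K, 0);
        for (size_t k = 0; k < data.size(); ++k) {
            sum[c[k]].x += data[k].x;
            sum[c[k]].y += data[k].y;
            ++cnt[c[k]];
        }
        for (int i = 0; i < K; ++i)
            if (cnt[i] > 0) m[i] = {sum[i].x / cnt[i], sum[i].y / cnt[i]};

        // Stop when E no longer changes.
        if (prevE >= 0.0 && std::fabs(prevE - E) < 1e-9) break;
        prevE = E;
    }
    for (int i = 0; i < K; ++i)
        std::printf("centroid %d: (%.2f, %.2f)\n", i, m[i].x, m[i].y);
    return 0;
}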

In these clustering methods, interpreting the clusters is considered difficult and is perceived as one of the major problems of clustering. Most available clustering algorithms favour certain predefined cluster shapes, and such algorithms will always assign the data to clusters of those shapes even if no such clusters are present in the data. When the goal is to compress the data set as well as to make inferences about its cluster structure, it is therefore important to ascertain beforehand whether the data set displays a clustering tendency [58].

The number of clusters used is also important, since different kinds of clusters may be obtained if K is changed. Cluster centroids must be initialized carefully, otherwise some clusters may even be left empty if their centroids initially lie far from the distribution of the data.

Though clustering is used to induce categorization and reduce the amount of data, the resulting categories have only limited value on their own. It is essential that the clusters are analyzed in a manner that makes the underlying concepts understandable. Shih et al. [66] gave an example with the K-means algorithm, wherein additional illustration methods are required to visualize a cluster whose centroid is defined over high-dimensional variables.

Further, hierarchical clustering algorithms may not be suitable for clustering data sets containing categorical attributes [58].

3.3. Two-Step Cluster Method

Many clustering algorithms have been proposed to combine smaller groups of data into larger clusters in various domains. These clustering methods have been found to be highly effective for purely numeric data or for purely categorical data. However, they do not perform well on mixed numeric and categorical data types [62].

The two-step clustering method was first proposed to find clusters in mixed numeric and categorical data and was designed to analyze large datasets [67]. In the two-step method, data items described by categorical attributes are first examined to establish similarity and are then converted into numeric attributes based on the constructed relationship. The two-step method differs from traditional clustering methods in that it handles both continuous and categorical variables and automatically determines the optimum number of clusters. It works on the principle of building sub-clusters and then clusters in a sequential manner.

The two-step clustering method uses a two-stage approach in which the algorithm first performs a procedure very similar to the K-means algorithm (Figure 3.3). The algorithm then conducts a modified hierarchical agglomerative clustering procedure, combining the objects sequentially to form homogeneous clusters; the pre-clustering stage is supported by a cluster feature tree whose “leaves” represent distinct objects in the dataset [68]-[70].

Figure 3.3: Two-step clustering (Step 1: Pre-Clustering; Step 2: Outlier Handling, optional; Step 3: Clustering)
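A highly simplified C++ sketch of this two-stage idea is shown below: a sequential leader-style pass forms small sub-clusters (standing in for the cluster feature tree), and an agglomerative pass then merges the sub-cluster centroids. The thresholds, data and target cluster count are assumptions for illustration; the actual two-step algorithm (for example, as implemented in SPSS) is considerably more elaborate.

// Simplified two-stage ("two-step") clustering sketch: a leader-style
// pre-clustering pass followed by agglomerative merging of sub-clusters.
// Thresholds and data are illustrative assumptions only.
#include <cmath>
#include <cstdio>
#include <vector>

struct Pt { double x, y; };
struct Sub {
    Pt sum{0, 0};
    int n = 0;
    Pt centroid() const { return {sum.x / n, sum.y / n}; }
};

static double dist(const Pt& a, const Pt& b) { return std::hypot(a.x - b.x, a.y - b.y); }

int main() {
    std::vector<Pt> data = {{1, 1}, {1.2, 0.9}, {0.8, 1.1}, {5, 5}, {5.1, 4.9},
                            {9, 1}, {9.2, 1.1}, {8.9, 0.8}, {5.2, 5.2}};
    // Step 1: pre-clustering. Assign each point to the nearest existing
    // sub-cluster if it is within 'threshold', otherwise start a new one.
    const double threshold = 1.0;
    std::vector<Sub> subs;
    for (const Pt& p : data) {
        int best = -1; double bestD = threshold;
        for (size_t i = 0; i < subs.size(); ++i) {
            double d = dist(p, subs[i].centroid());
            if (d <= bestD) { bestD = d; best = (int)i; }
        }
        if (best < 0) { subs.push_back({}); best = (int)subs.size() - 1; }
        subs[best].sum.x += p.x; subs[best].sum.y += p.y; ++subs[best].n;
    }
    // Step 2: agglomerative clustering of the sub-cluster centroids down
    // to a requested number of final clusters.
    const size_t finalK = 3;
    while (subs.size() > finalK) {
        size_t a = 0, b = 1;
        double bestD = dist(subs[0].centroid(), subs[1].centroid());
        for (size_t i = 0; i < subs.size(); ++i)
            for (size_t j = i + 1; j < subs.size(); ++j) {
                double d = dist(subs[i].centroid(), subs[j].centroid());
                if (d < bestD) { bestD = d; a = i; b = j; }
            }
        subs[a].sum.x += subs[b].sum.x; subs[a].sum.y += subs[b].sum.y; subs[a].n += subs[b].n;
        subs.erase(subs.begin() + b);
    }
    for (size_t i = 0; i < subs.size(); ++i) {
        Pt c = subs[i].centroid();
        std::printf("cluster %zu: centroid (%.2f, %.2f), size %d\n", i, c.x, c.y, subs[i].n);
    }
    return 0;
}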



In traditional methods, every attribute in the clustering algorithm is treated as a single entity without considering the relationships among attributes. In order to overcome this inefficiency, Shih et al. [66] proposed a two-step method that integrates hierarchical and partitional clustering algorithms by adding attributes to cluster objects. They believe that their method can be used to cluster mixed numeric and categorical data.

The two-step clustering method has been used in various fields in the past. Schiopu [71] proposed the use of the SPSS two-step clustering method to analyze information about the customers of a bank, dividing them into three clusters.

3.4. Agglomerative Clustering Algorithm

Agglomerative clustering has been used as a clustering strategy since the 1950s [72], [73], and subsequently found its way into biological taxonomy for classifying organisms [65]. As discussed earlier, agglomerative clustering is a bottom-up clustering process. Every input object initially forms its own cluster, and the two closest clusters are repeatedly combined until only one cluster remains, creating a hierarchy of clusters.

To define an agglomerative strategy, it is essential to specify the distance measure between clusters. Three linkage strategies are commonly used to form agglomerative clusters, as illustrated in the sketch below. In the single-linkage strategy, the distance between two clusters is defined as the distance between their closest pair of data objects. In the complete-linkage strategy, the distance between two clusters is defined as the distance between their furthest pair of data objects. In the average-linkage strategy, the distance is defined as the average distance between data objects from the two clusters [74].
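The following C++ sketch, assuming a small set of one-dimensional points and a user-chosen linkage, shows how the three linkage criteria plug into the same agglomerative merge loop; it is a naive O(n³) illustration rather than an optimized implementation.

// Naive agglomerative clustering sketch with selectable linkage
// (single, complete, average). Data and linkage choice are illustrative.
#include <algorithm>
#include <cmath>
#include <cstdio>
#include <vector>

enum class Linkage { Single, Complete, Average };

// Distance between two clusters of 1-D points under the chosen linkage.
static double clusterDist(const std::vector<double>& A,
                          const std::vector<double>& B, Linkage link) {
    double best = (link == Linkage::Single) ? 1e300 : 0.0, sum = 0.0;
    for (double a : A)
        for (double b : B) {
            double d = std::fabs(a - b);
            if (link == Linkage::Single)   best = std::min(best, d);
            if (link == Linkage::Complete) best = std::max(best, d);
            sum += d;
        }
    return (link == Linkage::Average) ? sum / (A.size() * B.size()) : best;
}

int main() {
    Linkage link = Linkage::Average;   // pick any of the three strategies
    std::vector<std::vector<double>> clusters = {{1.0}, {1.3}, {2.1}, {7.8}, {8.2}, {15.0}};

    // Merge the two closest clusters until only one remains, printing the
    // merge order (this order defines the dendrogram).
    while (clusters.size() > 1) {
        size_t bi = 0, bj = 1;
        double bestD = clusterDist(clusters[0], clusters[1], link);
        for (size_t i = 0; i < clusters.size(); ++i)
            for (size_t j = i + 1; j < clusters.size(); ++j) {
                double d = clusterDist(clusters[i], clusters[j], link);
                if (d < bestD) { bestD = d; bi = i; bj = j; }
            }
        std::printf("merge clusters %zu and %zu at distance %.2f\n", bi, bj, bestD);
        clusters[bi].insert(clusters[bi].end(), clusters[bj].begin(), clusters[bj].end());
        clusters.erase(clusters.begin() + bj);
    }
    return 0;
}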

3.5. Divisive Clustering Algorithm

Divisive clustering repeatedly divides large clusters into smaller sub-clusters until it reaches a specified stopping point. The divisive clustering algorithm represents a top-down approach, which is conceptually more complex than the bottom-up approach (Figure 3.4). The divisive approach uses an efficient flat-clustering algorithm as a subroutine. It starts from the top, with all the documents in one cluster. This cluster is split using the flat-clustering algorithm, and splitting is then applied recursively to the resulting clusters until each document is in its own cluster or a stopping criterion is met. The criteria used to stop the analysis include a specific number of iterations, a maximum number of levels to which the data set is divided, and a minimum required number of instances for further partitioning; a simplified sketch of this recursive splitting is given after Figure 3.4.

Figure 3.4: Divisive clustering tree
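As a rough illustration of this top-down idea, the C++ sketch below recursively bisects a set of one-dimensional points using a tiny 2-means flat-clustering subroutine until clusters fall below a minimum size; the data, the split procedure and the minimum-size stopping criterion are assumptions for demonstration only.

// Divisive (top-down) clustering sketch: recursively bisect a cluster with
// a simple 2-means flat-clustering subroutine until a minimum size is reached.
// Data and the minimum-size stopping criterion are illustrative assumptions.
#include <cmath>
#include <cstdio>
#include <vector>

// One flat-clustering step: split 'pts' into two groups around two centers.
static void bisect(const std::vector<double>& pts,
                   std::vector<double>& left, std::vector<double>& right) {
    double c0 = pts.front(), c1 = pts.back();      // crude initialization
    for (int iter = 0; iter < 20; ++iter) {
        left.clear(); right.clear();
        for (double p : pts)
            (std::fabs(p - c0) <= std::fabs(p - c1) ? left : right).push_back(p);
        double s0 = 0, s1 = 0;
        for (double p : left)  s0 += p;
        for (double p : right) s1 += p;
        if (!left.empty())  c0 = s0 / left.size();
        if (!right.empty()) c1 = s1 / right.size();
    }
}

static void divisive(const std::vector<double>& pts, int level, size_t minSize) {
    std::printf("level %d: cluster of %zu point(s)\n", level, pts.size());
    if (pts.size() <= minSize) return;             // stopping criterion
    std::vector<double> left, right;
    bisect(pts, left, right);
    if (left.empty() || right.empty()) return;     // cannot split further
    divisive(left, level + 1, minSize);
    divisive(right, level + 1, minSize);
}

int main() {
    std::vector<double> data = {1.0, 1.2, 2.0, 7.5, 8.0, 8.4, 15.0, 15.5};
    divisive(data, 0, /*minSize=*/2);
    return 0;
}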

The divisive method can be used even when a complete hierarchy down to individual document leaves is not generated. In comparison with agglomerative clustering algorithms, which require at least quadratic time, this method runs faster. Further, agglomerative clustering makes decisions based on local patterns and does not consider the global distribution in the initial stages; once a decision is made, it cannot be undone. In the case of divisive clustering, partitioning decisions are taken based on complete information about the global distribution and hence produce better results [75].

3.6. Hierarchical Clustering Method

Partitioning methods are based on specifying an initial number of groups and iteratively reallocating objects among groups until convergence. In contrast, hierarchical methods combine or divide existing groups, creating a hierarchical structure that reflects the order in which groups are merged or divided. Hierarchical clustering is a method of cluster analysis which seeks to build a hierarchy of clusters. Strategies for hierarchical clustering generally fall into two types:

 Agglomerative: This is a "bottom-up" approach. Each observation starts in its

own cluster, and pairs of clusters are merged as one moves up the hierarchy.

 Divisive: This is a "top-down" approach. All observations start in one cluster, and

splits are performed recursively as one moves down the hierarchy.

In general, the merges and splits are determined in a greedy manner. The results of hierarchical clustering are usually presented in a dendrogram. In general, the time complexity of agglomerative clustering is O(n³), which makes it too slow for large datasets.

3.7. Agglomerative Hierarchical Clustering Method

In agglomerative hierarchical clustering, each item x1, ..., xn initially lies in its own cluster C1, ..., Cn. The two nearest clusters, say Ci and Cj, are then merged, and this is repeated until only one cluster is left, resulting in a cluster tree. The cluster tree can be cut at any level to produce a new set of clusters; a small sketch illustrating such a cut follows Figure 3.5. Figure 3.5 illustrates an example of the agglomerative hierarchical clustering method.

Figure 3.5: Agglomerative hierarchical clustering method
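The C++ sketch below illustrates cutting the cluster tree at a chosen level: nearest clusters are merged only while their distance stays below a cut height, which yields the clustering that a dendrogram cut at that height would produce. The data, the single-linkage distance and the cut height are assumptions for illustration.

// Sketch of "cutting" the agglomerative cluster tree at a chosen level:
// nearest clusters are merged only while their distance stays below the
// cut threshold, yielding the clustering that a dendrogram cut at that
// height would produce. Data and threshold are illustrative assumptions.
#include <algorithm>
#include <cmath>
#include <cstdio>
#include <vector>

int main() {
    std::vector<std::vector<double>> clusters = {{1.0}, {1.4}, {2.0}, {7.0}, {7.5}, {14.0}};
    const double cutHeight = 3.0;                 // level at which the tree is cut

    while (clusters.size() > 1) {
        size_t bi = 0, bj = 1;
        double bestD = 1e300;
        for (size_t i = 0; i < clusters.size(); ++i)
            for (size_t j = i + 1; j < clusters.size(); ++j) {
                // single-linkage distance between clusters i and j
                double d = 1e300;
                for (double a : clusters[i])
                    for (double b : clusters[j]) d = std::min(d, std::fabs(a - b));
                if (d < bestD) { bestD = d; bi = i; bj = j; }
            }
        if (bestD > cutHeight) break;             // stop: this merge lies above the cut
        clusters[bi].insert(clusters[bi].end(), clusters[bj].begin(), clusters[bj].end());
        clusters.erase(clusters.begin() + bj);
    }
    std::printf("clusters after cutting at height %.1f: %zu\n", cutHeight, clusters.size());
    for (size_t i = 0; i < clusters.size(); ++i) {
        std::printf("cluster %zu:", i);
        for (double v : clusters[i]) std::printf(" %.1f", v);
        std::printf("\n");
    }
    return 0;
}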

The properties of the hierarchy within the final clustering are listed below [76]:

 The clusters that are generated in the early stages are nested with those generated

in the later stages.

 Clusters with different sizes in the tree can be valuable for geographic knowledge

discovery.

Advantages

 It can produce an ordering of the objects, which may be informative for data

display.

 Smaller clusters are generated, which may be helpful for discovery.



Disadvantages

 No provision can be made for a relocation of objects that may have been

incorrectly grouped at an early stage. The result should be examined closely to

ensure it makes sense.

 Use of different distance metrics for measuring distances between clusters may

generate different results. Performing multiple experiments and comparing the

results is recommended to support the veracity of the original results.

3.8. Divisive Hierarchical Clustering Method

The divisive hierarchical clustering method is a top-down clustering method which is not often used. The principle of the divisive approach is similar to that of agglomerative clustering, but it works in the opposite direction. The method starts with a single cluster containing all objects and then successively splits the resulting clusters until only clusters of individual objects remain. Figure 3.6 illustrates an example of the divisive clustering method.

Figure 3.6: Divisive hierarchical clustering method



3.9. K-Medoids Clustering Method

The K-medoids algorithm is a clustering algorithm related to the K-means algorithm and the medoid shift algorithm. In the case of the K-means algorithm, only a locally optimal solution is attained in terms of clustering quality; moreover, its evolution depends highly on the initial positions of the centroids. The K-medoids method addresses this problem by using medoids rather than centroids to represent the clusters. A medoid is the most centrally located data object in a cluster. Here, k data objects are selected randomly as medoids to represent k clusters, and every remaining data object is placed in the cluster whose medoid is nearest (or most similar) to it. After all data objects have been processed, a new medoid is determined for each cluster which represents the cluster better, and the entire process is repeated. All data objects are again bound to the clusters based on the new medoids. In each iteration, the medoids change their locations step by step; in other words, the medoids move in every iteration. This process continues until no medoid moves any more. As a result, k clusters are found representing the set of n data objects. Figure 3.7 illustrates the steps used in K-medoids clustering.

The K-means technique uses the centroid to represent a cluster, and it is highly sensitive to data objects that lie far from the rest of the observations (outliers). This issue is addressed by the K-medoids clustering technique, in which medoids, instead of centroids, are used to represent the clusters. A medoid is the most centrally located object in a cluster (Figure 3.7).

Figure 3.7: K-medoids clustering

Strengths

K-medoid is more robust than K-means in the presence of noise and outliers, because a

medoid is less influenced by outliers or other extreme values than a mean.
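A minimal PAM-style C++ sketch of the K-medoids idea is given below; it tries swapping each medoid with non-medoid objects and keeps a swap whenever the total distance of objects to their nearest medoid decreases. The data, the value of k and the distance measure are assumptions for illustration, not the configuration used in this study.

// Minimal K-medoids (PAM-style) sketch: iteratively try swapping medoids
// with non-medoids and keep swaps that reduce the total assignment cost.
// Data, k and the distance measure are illustrative assumptions.
#include <algorithm>
#include <cmath>
#include <cstdio>
#include <vector>

struct Pt { double x, y; };

static double dist(const Pt& a, const Pt& b) { return std::hypot(a.x - b.x, a.y - b.y); }

// Total distance of every object to its nearest medoid.
static double cost(const std::vector<Pt>& data, const std::vector<int>& medoids) {
    double total = 0.0;
    for (const Pt& p : data) {
        double best = dist(p, data[medoids[0]]);
        for (int m : medoids) best = std::min(best, dist(p, data[m]));
        total += best;
    }
    return total;
}

int main() {
    std::vector<Pt> data = {{1, 1}, {1.2, 0.8}, {0.9, 1.1}, {8, 8}, {8.1, 7.9},
                            {7.8, 8.2}, {50, 50} /* outlier */};
    const int k = 2;
    std::vector<int> medoids = {0, 3};          // initial medoids (indices into data)

    bool improved = true;
    while (improved) {                          // repeat until no medoid moves
        improved = false;
        double current = cost(data, medoids);
        for (size_t mi = 0; mi < medoids.size(); ++mi)
            for (size_t o = 0; o < data.size(); ++o) {
                // skip objects that are already medoids
                bool isMedoid = false;
                for (int m : medoids) if (m == (int)o) isMedoid = true;
                if (isMedoid) continue;

                int saved = medoids[mi];
                medoids[mi] = (int)o;           // try swapping medoid mi with object o
                double trial = cost(data, medoids);
                if (trial + 1e-12 < current) { current = trial; improved = true; }
                else medoids[mi] = saved;       // undo the swap if it does not help
            }
    }
    for (int m : medoids)
        std::printf("medoid at (%.1f, %.1f)\n", data[m].x, data[m].y);
    return 0;
}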

3.10. Results and Observation

This study attempted to identify the best clustering method for the partitioning task. The performance of K-medoids was compared with two-step and hierarchical clustering, and the results are provided in the subsequent sections.

3.10.1. Comparison of Three Clustering Methods

In order to compare the performance of the proposed approach, three clustering methods were used in this study:

1. Two-step clustering

2. Hierarchical clustering

3. K-medoids clustering

The clustering algorithms were compared based on the following factors: the size of the data set and the run time. For each factor, a test was conducted for each algorithm. The partitioning problem was applied to the circuit shown in Figure 3.8; its run-time analysis and data analysis are shown in Figures 3.9 and 3.10, respectively. Figure 3.8 shows a small VLSI circuit, which is converted into a hypergraph partitioning instance. The three clustering methods are then used to obtain sub-circuits with the lowest amount of interconnection between them (a sketch of how such interconnection counts can be computed is given below). The experiments were implemented in Visual C++ within the Visual Studio IDE environment.

Figure 3.8: Comparing clustering algorithms



Figure 3.9: Run time performance analysis of proposed clustering methods

Figure 3.10: Data analysis of proposed clustering methods
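To make the notion of interconnection between sub-circuits concrete, the C++ sketch below counts, for a small hypothetical netlist, how many hyperedges (nets) span more than one sub-circuit once every cell has been assigned to a cluster. The netlist, the cluster assignment and the use of cut nets as the interconnection measure are assumptions for illustration and do not reproduce the circuit of Figure 3.8.

// Illustrative cut-size computation for a hypergraph netlist partition:
// a net (hyperedge) contributes to the interconnection count when its
// cells are spread over more than one sub-circuit. The netlist and the
// cluster assignment below are hypothetical.
#include <cstdio>
#include <set>
#include <vector>

int main() {
    // Each net is the list of cell indices it connects.
    std::vector<std::vector<int>> nets = {
        {0, 1, 2},      // net A
        {2, 3},         // net B
        {3, 4, 5},      // net C
        {0, 5}          // net D
    };
    // Sub-circuit (cluster) assigned to each cell, e.g. by one of the
    // clustering methods discussed in this chapter.
    std::vector<int> cluster = {0, 0, 0, 1, 1, 1};

    int cut = 0;
    for (size_t n = 0; n < nets.size(); ++n) {
        std::set<int> spanned;      // distinct sub-circuits touched by the net
        for (int cell : nets[n]) spanned.insert(cluster[cell]);
        if (spanned.size() > 1) ++cut;
        std::printf("net %zu spans %zu sub-circuit(s)\n", n, spanned.size());
    }
    std::printf("interconnections (cut nets): %d\n", cut);
    return 0;
}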



3.11. Discussion

The main objective of the study was to compare K-medoids clustering with two-step and hierarchical clustering. The results were analyzed based on the size of the data set and the run time of each algorithm. The key characteristics of the K-medoids method are summarized below:

 The K-medoids method overcomes the sensitivity of K-means to centroid initialization and outliers by using medoids rather than centroids to represent the clusters.

 K data objects are selected randomly as medoids to represent k clusters, and every remaining data object is placed in the cluster whose medoid is nearest to it.

 After all data objects have been processed, a new medoid is determined for each cluster which represents the cluster better, and the entire process is repeated until no medoid moves.

 As a result, k clusters are found representing the set of n data objects.

K-medoids clustering requires less execution time, as shown in Figure 3.9, and provides more data coverage, as shown in Figure 3.10. Hence, this study shows that K-medoids clustering achieves better performance than two-step and hierarchical clustering.

3.12. Publication

Based on the research work done on “Clustering approaches on VLSI Circuit partitioning”, the following paper was published.

1. Manikandan, R. and Swaminathan, P. 2012, Comparative study of clustering methods

in VLSI circuit partitioning, International Conference on Electrical, Electronics &

Information Technology, 11th Nov 2012, Trivandrum, India.
