
Title of Proposal: Extending the Possibilities of Graph Theoretic Clustering and Segmentation:
from Parallel Computing Expedition to Interdisciplinary Applications

Nature of Proposal: Innovative Student Micro Grant Research Proposal.


2013

PI Student’s Name, PI student’s academic program, position (undergraduate, masters, PhD):
Can Jin, PhD student of Imaging Science, Chester F. Carlson Center for Imaging Science

Faculty or Staff Advisor: Professor David Messinger (Potential)

Abstract of Proposal:

Graph theoretic clustering and segmentation has been widely used and has proved to be a
powerful tool. It has an intuitive mathematical representation and requires no prior knowledge.
The results are often better than those of other algorithms, but graph based algorithms are
also computationally heavy. Some parallelized algorithms exist that are based on CUDA, the
parallel computing framework for the Graphics Processing Unit (GPU), but some newly developed
algorithms do not yet have parallelized counterparts. In addition, some parallelized graph
theoretic algorithms recently proven to be efficient are not widely used in imaging science.
Building on the history of mutual inspiration among interdisciplinary research areas related to
imaging science, we propose to extend the application of the latest graph theoretic algorithms,
especially their parallelized, accelerated versions, to a wider range of problems in imaging
science. This includes a series of hierarchical objectives: 1) applying the latest parallelized
graph based clustering, implemented in CUDA or OpenCL, to hyperspectral remotely sensed data or
color image segmentation; 2) porting the latest graph based clustering algorithms to a parallel
CUDA implementation; 3) conducting a pilot study to explore potential extensions of graph
theoretic clustering across different application scenarios and disciplines within imaging
science. Through these steps, the newly proposed parallel algorithms are expected to speed up
the clustering of large volumes of hyperspectral data and, beyond that, to bring mutual benefits
to the subfields of imaging science that make intensive use of clustering and segmentation
techniques.

Dollar Request: 5000.00

Desired Funding Dates: 12/01/2013 - 07/31/2014

PROPOSAL

Scientific Justification

Graph theory based image segmentation methods generally represent the image properties in
terms of a graph
G = (V, E, w)
where each vertex v_i ∈ V corresponds to a pixel in the image, and the edges in the edge set E
connect pairs of vertices, i.e., pixels. A weight w is assigned to each edge based on the
properties or relationship of the vertices it joins, for example, the distance between the two
pixels in the feature space. The subgraphs are then split, merged or clustered, and the
resulting partition of the graph gives the segmentation of the image.
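
The representation above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name build_pixel_graph, the 4-connected neighbourhood and the Euclidean distance in feature space are assumptions made for the example, not choices fixed by the proposal.

```python
from math import sqrt


def build_pixel_graph(image):
    """Build G = (V, E, w) for a small image.

    image: list of rows, each row a list of feature tuples (e.g. RGB or spectra).
    Returns the vertex list and a dict mapping an edge (u, v) to its weight.
    """
    rows, cols = len(image), len(image[0])
    vertices = [(r, c) for r in range(rows) for c in range(cols)]
    edges = {}
    for r, c in vertices:
        # Assumed 4-connectivity: link each pixel to its right and lower neighbour.
        for dr, dc in ((0, 1), (1, 0)):
            rr, cc = r + dr, c + dc
            if rr < rows and cc < cols:
                a, b = image[r][c], image[rr][cc]
                # Edge weight: Euclidean distance between the two feature vectors.
                edges[((r, c), (rr, cc))] = sqrt(
                    sum((x - y) ** 2 for x, y in zip(a, b))
                )
    return vertices, edges


# A 2x2 toy image whose top and bottom rows differ strongly.
image = [
    [(0, 0, 0), (0, 0, 0)],
    [(3, 4, 0), (3, 4, 0)],
]
vertices, edges = build_pixel_graph(image)
```

In this toy image the horizontal edges have weight zero and the vertical edges have a large weight, so any method that separates the graph along heavy edges would split the top row from the bottom row.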
Based on this representation and goal, many approaches have been proposed, and new approaches
are still being proposed. The method based on the minimal spanning tree (MST) proposed by
Zahn [1] is one of the earliest. By deleting the edges with the largest weights, the graph is
partitioned into subgraphs. An image can be mapped onto a graph in a number of possible ways.
Urquhart [2] proposed a locally sensitive, hierarchical method based on the concept of limited
neighborhood sets. Cooper [3] proposed an approach of splitting and merging regions based on a
uniformity criterion. Felzenszwalb [4] proposed an improved splitting and merging method that
takes global properties into account.
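
The MST idea can be illustrated with a minimal sketch. It builds the spanning tree with Kruskal's algorithm and removes the k-1 heaviest tree edges; this simplified rule is an assumption made for illustration and is not the full edge inconsistency criterion of [1]. The name mst_clusters and the parameter k are hypothetical.

```python
def mst_clusters(n, edges, k):
    """Split n vertices into k clusters by cutting the k-1 heaviest MST edges.

    edges: list of (weight, u, v) tuples with vertices numbered 0..n-1.
    Returns a list of cluster labels, one per vertex.
    """
    parent = list(range(n))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Kruskal's algorithm: add edges in increasing weight if they join two trees.
    mst = []
    for w, u, v in sorted(edges):
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            mst.append((w, u, v))

    # Keep only the lightest tree edges, i.e. delete the k-1 heaviest ones.
    kept = sorted(mst)[: max(len(mst) - (k - 1), 0)]

    # Rebuild the components from the kept edges.
    parent[:] = range(n)
    for w, u, v in kept:
        parent[find(u)] = find(v)
    return [find(i) for i in range(n)]


# Six one-dimensional samples forming two obvious groups.
values = [0, 1, 2, 10, 11, 12]
edges = [
    (abs(values[i] - values[j]), i, j)
    for i in range(len(values))
    for j in range(i + 1, len(values))
]
labels = mst_clusters(len(values), edges, k=2)
```

Here the only heavy tree edge is the one bridging the two groups, so deleting it leaves the first three and the last three samples in separate clusters.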
Newer methods based on graph cuts take a different approach from the MST methods. They can use
different cost functions in a variety of applications and take global information into account.
A graph cut is defined as a set of edges whose removal partitions the graph into two disjoint
sets. The dissimilarity between partitions, and hence the segmentation, can be defined by a set
of graph cuts in this way. Minimizing the graph cut thus provides a way to maximize the
dissimilarity between the partitioned sets, which connects the problem to network maximum-flow
theory. By choosing different cost functions for the graph cut, the criteria can be improved.
Wu and Leahy [5] first proposed a graph cut criterion that minimizes a cost function named the
minimum cut. This cost function is biased toward finding small components. One well-known
attempt to address this problem is the normalized cut (N-cut) function proposed by Shi and
Malik [6]. Newman [7] proposed modularity as the cost function to improve efficiency.
Besides these methods, other methods can also deliver high performance in graph segmentation,
such as Markov random field (MRF) models [8] and graph theory and dynamic programming
(GTDP) [9]. Like the graph cut algorithms, they essentially work by minimizing some cost
function.
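
The bias of the minimum cut and the effect of normalization can be seen on a toy graph. The sketch below assumes that edge weights are similarities (larger means more alike), as in [6]; the helper names make_weights, cut_value, assoc and ncut, and the example graph itself, are hypothetical.

```python
def make_weights(n, edge_list):
    """Symmetric similarity matrix from (u, v, similarity) triples."""
    w = [[0.0] * n for _ in range(n)]
    for u, v, s in edge_list:
        w[u][v] = w[v][u] = s
    return w


def cut_value(w, a, b):
    """Total weight of the edges running between vertex sets a and b."""
    return sum(w[i][j] for i in a for j in b)


def assoc(w, a):
    """Total weight from the vertices in a to all vertices of the graph."""
    return sum(w[i][j] for i in a for j in range(len(w)))


def ncut(w, a, b):
    """Normalized cut: the cut weight relative to the size of each side."""
    c = cut_value(w, a, b)
    return c / assoc(w, a) + c / assoc(w, b)


# Two tight triangles joined by a weak bridge, plus one loosely attached vertex (6).
w = make_weights(7, [
    (0, 1, 1.0), (0, 2, 1.0), (1, 2, 1.0),
    (3, 4, 1.0), (3, 5, 1.0), (4, 5, 1.0),
    (2, 3, 0.5),
    (5, 6, 0.4),
])
isolated = ({6}, {0, 1, 2, 3, 4, 5})
balanced = ({0, 1, 2}, {3, 4, 5, 6})

cut_isolated, cut_balanced = cut_value(w, *isolated), cut_value(w, *balanced)
ncut_isolated, ncut_balanced = ncut(w, *isolated), ncut(w, *balanced)
```

On this graph the plain cut is smaller for the partition that isolates the single vertex, while the normalized cut is smaller for the balanced partition between the two triangles.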

Graph cut algorithms are very computation-intensive. There have been efforts to migrate graph
cut algorithms onto the graphics processing unit (GPU), which has huge computational power and
is efficient on parallel tasks. Vineet published CUDA cuts, an algorithm performing fast graph
cuts on a GPU [10]. By applying the same operation to data grids that are split among the
threads or cores of the GPU, parallel processing is achieved and high performance is delivered.
Vineet [10] used a kernel based algorithm with push, pull, local relabel and global relabel
kernels operating on the data grids to implement the minimum-cut/maximum-flow algorithm.
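
To make the push and relabel operations concrete, the following is a minimal serial sketch of a generic push-relabel maximum-flow routine. It is not the CUDA implementation of [10]: there are no pull or global relabel steps and no grid layout, and the function name and the dense capacity matrix are assumptions chosen for brevity.

```python
def push_relabel_max_flow(capacity, source, sink):
    """Maximum flow from source to sink on a dense capacity matrix."""
    n = len(capacity)
    flow = [[0] * n for _ in range(n)]
    height = [0] * n
    excess = [0] * n
    height[source] = n

    # Initial preflow: saturate every edge leaving the source.
    for v in range(n):
        c = capacity[source][v]
        if c > 0:
            flow[source][v] = c
            flow[v][source] = -c
            excess[v] += c
            excess[source] -= c

    active = [v for v in range(n) if v not in (source, sink) and excess[v] > 0]
    while active:
        u = active.pop()
        while excess[u] > 0:
            pushed = False
            for v in range(n):
                residual = capacity[u][v] - flow[u][v]
                # Push: move excess along a residual edge that goes one level down.
                if residual > 0 and height[u] == height[v] + 1:
                    delta = min(excess[u], residual)
                    flow[u][v] += delta
                    flow[v][u] -= delta
                    excess[u] -= delta
                    excess[v] += delta
                    if v not in (source, sink) and excess[v] == delta:
                        active.append(v)  # v has just become active
                    pushed = True
                    if excess[u] == 0:
                        break
            if not pushed:
                # Relabel: lift u just above its lowest residual neighbour.
                height[u] = 1 + min(
                    height[v] for v in range(n) if capacity[u][v] - flow[u][v] > 0
                )
    return excess[sink]


# Toy network: vertex 0 is the source, vertex 3 is the sink.
capacity = [
    [0, 3, 2, 0],
    [0, 0, 1, 2],
    [0, 0, 0, 3],
    [0, 0, 0, 0],
]
max_flow = push_relabel_max_flow(capacity, source=0, sink=3)
```

Each push or relabel touches only a vertex and its neighbours, which is the property that makes this family of algorithms attractive for a per-thread GPU formulation.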
There have been some efforts to implement graph theoretic algorithms in CUDA. The typical
average speed-up is about 5-10 times over the original serial algorithms [11, 12]. Hu [13]
proposed a multislice modularity optimization on a GPU, which divides the modularity
calculations into multiple slices and considers the inter-slice edges at a higher level of the
hierarchy. Hu [13] also commented that taking advantage of data sparsity is a further way to
optimize the process. Cheong [12] proposed a modularity-based hierarchical parallel algorithm,
which parallelizes the method proposed by Blondel [14] at 3 levels corresponding to the 3 steps
of Blondel’s method.
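
For reference, the modularity that these methods optimize can be computed directly from its standard definition, Q = (1/2m) Σ_ij [A_ij − k_i k_j / 2m] δ(c_i, c_j). The sketch below is a plain serial evaluation on a dense adjacency matrix, written only to illustrate the quantity; the function name and the example graph are assumptions, and none of the optimizations of [12], [13] or [14] are reproduced.

```python
def modularity(adj, labels):
    """Modularity Q of a partition of an undirected graph.

    adj: symmetric adjacency (or weight) matrix as a list of lists.
    labels: community label of each vertex.
    """
    n = len(adj)
    degree = [sum(row) for row in adj]
    two_m = sum(degree)  # twice the total edge weight
    q = 0.0
    for i in range(n):
        for j in range(n):
            if labels[i] == labels[j]:
                # Observed weight minus the weight expected by chance.
                q += adj[i][j] - degree[i] * degree[j] / two_m
    return q / two_m


# Two triangles joined by a single edge (between vertices 2 and 3).
adj = [[0] * 6 for _ in range(6)]
for u, v in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    adj[u][v] = adj[v][u] = 1

q_two_communities = modularity(adj, [0, 0, 0, 1, 1, 1])
q_single_community = modularity(adj, [0, 0, 0, 0, 0, 0])
```

Splitting the graph into its two triangles gives a clearly positive modularity, while putting every vertex into one community gives zero. The double loop over vertex pairs is independent per pair, which hints at why the computation lends itself to parallel evaluation.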
Christophe [15] discussed some important principles for migrating remote sensing image
analysis algorithms onto multicore processors and GPUs. By following these principles, it is
possible to apply GPU based fast graph clustering to remotely sensed imagery, along with other
remote sensing algorithms. Sun et al. [16] implemented a GPU based pan-sharpening method that
achieves a 25 times speedup over the common algorithm. With these works combined, it is
possible to obtain a GPU speedup for commonly used remote sensing image analysis algorithms,
especially computation-intensive ones like graph theoretic algorithms.
CUDA is a high performance parallel computing framework for Nvidia GPUs. It is widely used in
scientific computing, but it runs only on CUDA-capable Nvidia GPUs. OpenCL is a vendor-neutral
counterpart of CUDA that can be used on most GPUs and CPUs, though it is often reported to be
somewhat slower than CUDA. We will also consider implementing various graph theoretic
algorithms in OpenCL.

Accelerating graph theoretic clustering through parallelism is meaningful and productive. It is
even better if we take full advantage of it and apply it in the subfields of imaging science.
Thus, in this research, we also propose a pilot study on extending the application of the
latest graph theoretic clustering algorithms in imaging science. The initial idea is to carry
the latest applications of these algorithms over to other related areas of imaging science.
Below are some preliminary ideas; similar proposals could be made for the pilot study.
Hyperspectral imagery in remote sensing can naturally be described as a graph in the spectral
space. Graph based clustering relies only on the dataset, without any prior assumptions, and
has been used in anomaly detection [17] and clustering [18]. Spectral clustering using graph
theoretic methods is also widely used in medical imaging. In neuroscience, the study of
functional connectivity aims to understand the organized behavior of cortical regions by
imaging the activity of the brain via fMRI (functional magnetic resonance imaging). Graph
theory provides a versatile tool to analyze the connectivity [19]. Dodel [20] applied graph
theoretic concepts to fMRI to identify functional clusters of activated brain areas [21].
In graph theory, connectivity is a measure of how connected or spatially continuous a corridor
is [22]. Yue’s study of landscape change detection using a connectivity model [22] showed that
connectivity includes not only the inter-relationships within and between communities but also
the network and flows between them, from an ecological system perspective. If we could use
remote sensing techniques to monitor the connectivity of a large scale ecosystem, the problem
would resemble that of functional connectivity in fMRI, discussed previously. Consider part of
a forest affected by some kind of pest. After capturing a series of multispectral images at
different times, it is possible to use these datasets to predict the infection routes, i.e.,
how the pest population moves from one region to another. This could be done by calculating an
autocorrelation matrix for each selected temporal lag τ. The time-delayed autocorrelations will
serve as the weights of the edges between pixels. After a similar process, the areas that have
high temporal correlation will be grouped together, which shows the connectivity of the system.
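
A minimal sketch of the time-lagged edge weights follows. It assumes that each pixel (or region) is summarized by one scalar time series and that the weight is the Pearson correlation between one series and another series shifted by the lag τ; the function names and the toy data are hypothetical.

```python
from math import sqrt


def lagged_correlation(x, y, lag):
    """Pearson correlation between x[t] and y[t + lag]."""
    a = x[: len(x) - lag]
    b = y[lag:]
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a constant series carries no temporal information
    return cov / sqrt(var_a * var_b)


def lagged_edge_weights(series, lag):
    """Directed edge weights (i, j) -> correlation of series i with series j at the lag."""
    n = len(series)
    return {
        (i, j): lagged_correlation(series[i], series[j], lag)
        for i in range(n)
        for j in range(n)
        if i != j
    }


# Toy infestation index per region over 10 acquisitions.
region_a = [0, 1, 0, 0, 5, 0, 0, 2, 0, 0]
region_b = [0, 0, 0, 1, 0, 0, 5, 0, 0, 2]  # same pattern, two acquisitions later
region_c = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1]  # unaffected region
weights = lagged_edge_weights([region_a, region_b, region_c], lag=2)
```

In this toy case the edge from region A to region B has a weight close to 1 at lag 2, while the reverse edge does not, which is the kind of directional signal that could indicate a route from A to B.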
The second idea concerns the application of constrained spectral partitioning based on graph
theory. Phillips [23] introduced constrained spectral partitioning (CSP) to fMRI. CSP takes
advantage of more than one graph for the same scene. Wang [24] proposed a transfer learning
method that uses a variety of criteria to produce a set of constraints from one dataset and
applies these constraints to the graph partitioning of a second dataset, using the normalized
cut method. The method has been applied to fMRI and proved effective in reducing noise. The
basic idea is to find proper constraints from one graph, since directly combining the two
graphs would greatly amplify the noise because of the differences between them. The earliest
methods used binary Must-Link/Cannot-Link constraints. Wang extended the criteria into a
flexible framework and solved the optimization by converting the problem into a generalized
eigenvalue system [24]. In remote sensing, a series of different aerial images of the same
scene is sometimes captured and registered together. For example, multiple oblique images are
captured by aircraft to provide views from various perspectives. It is possible to take
advantage of the multiple images of the same scene, using the transfer learning technique or
developing more appropriate flexible methods based on multiple graphs from multiple images, to
improve the performance of automatic object detection.
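
The binary Must-Link/Cannot-Link constraints can be illustrated with a small sketch. It assumes a common encoding in which a constraint matrix Q holds +1 for must-link pairs and -1 for cannot-link pairs, and a two-way partition is written as a vector u with entries +1 or -1, so that the quadratic form u^T Q u grows as more constraints are satisfied. The function names are hypothetical, and the eigenvalue-based optimization of [24] is not reproduced here.

```python
def constraint_matrix(n, must_link, cannot_link):
    """Symmetric matrix with +1 for must-link pairs and -1 for cannot-link pairs."""
    q = [[0] * n for _ in range(n)]
    for i, j in must_link:
        q[i][j] = q[j][i] = 1
    for i, j in cannot_link:
        q[i][j] = q[j][i] = -1
    return q


def constraint_satisfaction(q, u):
    """Quadratic form u^T Q u for a partition vector u with entries +1 or -1."""
    n = len(q)
    return sum(u[i] * q[i][j] * u[j] for i in range(n) for j in range(n))


# Constraints that might be derived from a first dataset of the same scene.
q = constraint_matrix(4, must_link=[(0, 1)], cannot_link=[(1, 2)])

respects_both = constraint_satisfaction(q, [1, 1, -1, -1])
violates_both = constraint_satisfaction(q, [1, -1, -1, -1])
```

A partition that respects both constraints scores higher than one that violates them, so a score of this kind can be used as a side condition when partitioning the graph built from a second dataset.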

Research Innovation Justification

The first objective is to optimize current graph based clustering algorithms in a parallel
paradigm. Some results have been achieved in the community, yet they do not cover all of the
latest graph theoretic algorithms and have not been widely applied to hyperspectral data. There
is a risk that the latest algorithms turn out to be unsuitable for parallelization, or not very
efficient once parallelized. There is also a risk that parallel computing on hyperspectral data
is not as efficient as expected. However, we keep a hierarchical task list to minimize the
risk. The tasks are ordered by increasing level of risk: 1) implementing current graph
theoretic algorithms in CUDA and migrating them to OpenCL; 2) applying them to hyperspectral
data clustering; 3) parallelizing the latest algorithms.
The second objective of our research is to extend the applications of the latest approaches in
the area of graph theoretic image segmentation and clustering. The PI is taking the Algorithms
course in the Computer Science Department of RIT, which features a large number of graph based
algorithms. The PI is also willing to make connections with the Mathematics Department. A
theoretical basis for the study could be established this way. Given the active research in
remote sensing, medical imaging and computer vision at the Center for Imaging Science of
Rochester Institute of Technology, new approaches can be quickly shared, tried and adapted to
new application areas. New methods based on similar techniques can thus be proposed, the
subfields of imaging science will benefit mutually, and contributions could be made back to the
graph theoretic clustering community. Several potential examples of this research have been
proposed in the previous section. However, those proposals still need preliminary research on
their feasibility and novelty. A pilot study is therefore needed for this “trial and error”
procedure. Since this is a multidisciplinary study, inspiration could emerge during discussions
with researchers in other related areas. If some of the ideas are feasible, we would ideally
already have the optimized graph theoretic tools implemented earlier, ready to apply to them.

Budget Request

A total budget of $5000 is requested. Of this, $1000 will be used to cover the expense of a
device that supports CUDA (or the OpenCL GPU parallel computing framework). The remaining $4000
will cover the PhD student stipend for the intersession and part of the summer quarter.

Budget Justification

The parallel acceleration algorithms will use CUDA, which needs a CUDA-enabled computing
device. The device could be a CUDA-enabled computer or an external CUDA-enabled Nvidia GPU
card. This would enable us to implement CUDA/OpenCL parallel algorithms in the first period and
to try out newly developed algorithms conveniently on a local machine. The requested budget for
this part is around $1000. The project might cover the intersession and the first month of the
summer. The second part of the budget, $4000, could be used as the PhD student stipend for
those two months.

Project Plan

The project can be split into two parallel timelines: the development of the parallel
acceleration algorithms, and the extension of the applications of newly reported graph
theoretic algorithms in imaging science. Since the latter will be a pilot study, once the
proposed ideas are shown to be sound and feasible, the earlier outcomes of the parallel
acceleration algorithms can be applied right away.

12/01/2013 - 01/31/2014: Reproduce and implement currently available CUDA based graph cut
algorithms, and then apply them to hyperspectral clustering of remotely sensed data. Try an
OpenCL implementation and determine whether we will use CUDA or OpenCL.
Perform a literature review and discuss the proposed ideas on extending the applications of
graph theoretic algorithms with faculty from imaging science, biomedical engineering and
mathematics.

02/01/2014 - 04/30/2014: Develop parallelized versions of newly proposed graph theoretic
algorithms. The parallelization of modularity based and random process based algorithms are
potential objectives. The data will still come from remotely sensed hyperspectral imagery,
because of its natural representation in a spectral space and the verified success of graph
theoretic clustering on it. In this period the focus will be on the CUDA implementation.
Determine one feasible interdisciplinary application of graph theoretic algorithms. Complete
the data acquisition for the project.

05/01/2014 - 06/30/2014: Continue work on the implementation of the parallelized algorithms. In
this period the RIT Research Computing clusters will be used.
Apply the initial outcomes of the parallelized algorithm development to the selected topic of
the interdisciplinary project.

07/01/2014 - 07/31/2014: Wrap up the work. The results of the parallelized algorithms and the
original algorithms will be compared and evaluated. A conference paper could potentially be
written.
Based on the outcome of the pilot study on the interdisciplinary application of graph theoretic
algorithms, a potential new project could be proposed.

References
[1] Charles T. Zahn. Graph-theoretical methods for detecting and describing gestalt clusters.
Computers, IEEE Transactions on, C-20(1):68–86, 1971.
[2] Roderick Urquhart. Graph theoretical clustering based on limited neighbourhood sets.
Pattern Recognition, 15(3):173–187, 1982.
[3] Martin C. Cooper. The tractability of segmentation and scene analysis. International
Journal of Computer Vision, 30(1):27–42, 1998.
[4] Pedro F. Felzenszwalb and Daniel P. Huttenlocher. Efficient graph-based image
segmentation. International Journal of Computer Vision, 59(2):167–181, 2004.

[5] Zhenyu Wu and Richard Leahy. An optimal graph theoretic approach to data clustering:
Theory and its application to image segmentation. Pattern Analysis and Machine Intelligence,
IEEE Transactions on, 15(11):1101–1113, 1993.
[6] Jianbo Shi and Jitendra Malik. Normalized cuts and image segmentation. Pattern Analysis
and Machine Intelligence, IEEE Transactions on, 22(8):888–905, 2000.
[7] Mark EJ Newman. Modularity and community structure in networks. Proceedings of the
National Academy of Sciences, 103(23):8577–8582, 2006.
[8] Bo Peng, Lei Zhang, and David Zhang. A survey of graph theoretical approaches to image
segmentation. Pattern Recognition, 46(3):1020–1038, 3 2013.

[9] Stephanie J. Chiu, Cynthia A. Toth, Catherine Bowes Rickman, Joseph A. Izatt, and Sina
Farsiu. Automatic segmentation of closed-contour features in ophthalmic images using graph
theory and dynamic programming. Biomedical optics express, 3(5):1127, 2012.
[10] Vibhav Vineet and P. J. Narayanan. CUDA cuts: Fast graph cuts on the GPU. In Computer
Vision and Pattern Recognition Workshops, 2008. CVPRW’08. IEEE Computer Society
Conference on, pages 1–8. IEEE, 2008.

[11] Kenneth A. Hawick, Arno Leist, and Daniel P. Playne. Parallel graph component labelling
with GPUs and CUDA. Parallel Computing, 36(12):655–678, 2010.
[12] Chun Yew Cheong, Huynh Phung Huynh, David Lo, and Rick Siow Mong Goh. Hierarchical
parallel algorithm for modularity-based community detection using GPUs. In Euro-Par 2013
Parallel Processing, pages 775–787. Springer, 2013.

[13] Huiyi Hu, Yves van Gennip, Blake Hunter, Andrea L. Bertozzi, and Mason A. Porter.
Multislice modularity optimization in community detection and image segmentation. In
Data Mining Workshops (ICDMW), 2012 IEEE 12th International Conference on, pages
934–936. IEEE, 2012.

[14] Vincent D. Blondel, Jean-Loup Guillaume, Renaud Lambiotte, and Etienne Lefebvre.
Fast unfolding of communities in large networks. Journal of Statistical Mechanics: Theory
and Experiment, 2008(10):P10008, 2008.
[15] Emmanuel Christophe, Julien Michel, and Jordi Inglada. Remote sensing processing: From
multicore to gpu. IEEE Journal of Selected Topics in Applied Earth Observations and
Remote Sensing, 4(3):643–652.
[16] Weihua Sun, Bin Chen, and David W. Messinger. Pan-sharpening of spectral image with
anisotropic diffusion for fine feature extraction using GPU. In SPIE Defense, Security, and
Sensing, pages 87431H–87431H. International Society for Optics and Photonics, 2013.
[17] Bill Basener, Emmett J. Ientilucci, and David W. Messinger. Anomaly detection using
topology. In Defense and Security Symposium, pages 65650J–65650J. International Society
for Optics and Photonics, 2007.
[18] Ryan A. Mercovich, Anthony Harkin, and David Messinger. Utilizing the graph modularity
to blind cluster multispectral satellite imagery. In Image Processing Workshop (WNYIPW),
2010 Western New York, pages 66–69. IEEE, 2010.

[19] L. Astolfi, F. de Vico Fallani, F. Cincotti, D. Mattia, M. G. Marciani, S. Bufalari, S. Salinari,
A. Colosimo, L. Ding, J. C. Edgar, W. Heller, G. A. Miller, B. He, and F. Babiloni. Imaging
functional brain connectivity patterns from high-resolution EEG and fMRI via graph theory.
Psychophysiology, 44(6):880–893, 11 2007.
[20] Silke Dodel, J. Michael Herrmann, and Theo Geisel. Functional connectivity by cross-
correlation clustering. Neurocomputing, 44:1065–1070, 2002.
[21] Cornelis J. Stam and Jaap C. Reijneveld. Graph theoretical analysis of complex networks
in the brain. Nonlinear biomedical physics, 1(1):3, 2007.
[22] Tian Xiang Yue, Ji Yuan Liu, Sven Erik Jørgensen, and Qin Hua Ye. Landscape change
detection of the newly created wetland in Yellow River Delta. Ecological Modelling,
164(1):21–31, 6 2003.
[23] Henry L. Phillips, Peter B. Walker, Carrie H. Kennedy, Owen Carmichael, and Ian N.
Davidson. Guided learning algorithms: An application of constrained spectral partitioning
to functional magnetic resonance imaging (fMRI). In Foundations of Augmented Cognition,
pages 709–716. Springer, 2013.
[24] Xiang Wang, Buyue Qian, and Ian Davidson. On constrained spectral clustering and its
applications. Data Mining and Knowledge Discovery, pages 1–30, 2012.
