
Feature Detection and Graph Simplification

RACHEL LEVANGER
An active area of research in mathematics is the problem of identifying features in
graphs or networks. Two such features of importance are bottlenecks and clusters.
We examine the mathematics involved in detecting each of these types of features,
and also show how the techniques utilized for the latter naturally inform a method
of graph simplification.

Graph Theory, Spectral Graph Theory, Graph Clustering
Contents
1 Introduction
2 Graph Theory and Linear Algebra
2.1 Introduction to Graph Theory
2.2 The Adjacency Matrix
2.3 The Laplacian
3 Spectral Graph Theory
3.1 Important Eigenvalues of the Laplacian
3.2 The Isoperimetric Number
4 Graph Simplification
4.1 Clusters and Total Density
4.2 Nearly Block Diagonal Form
4.3 Inducing a Simplified Graph
Bibliography
1 Introduction
Given a graph, or a network, is it possible to use mathematics to identify highly
connected sets of vertices? In a social network, these sets could represent groups of
people affiliated by friendships or who have had a reason to be in close proximity to one
another for a time, e.g. work, school, or other organized activity. In other networks,
these sets would imply other important information about the objects and connections
the graph represents. What mathematical tools are at our disposal to answer such a
question? We explore two such ideas in this paper.
We begin with an introduction to graph theory, where we lay out common definitions
and explore some basic ideas. This then moves to a discussion of linear algebra
and the connection between a graph and two associated matrices: the adjacency matrix
and the Laplacian matrix. From these matrices, we are able to determine some key
properties of a graph, such as the distance between any two vertices and the number of
disjoint components.

We then move to our first in-depth exposition, concerning the Laplacian matrix of a
graph and what is called the isoperimetric number, or Cheeger constant. This number
gives a value corresponding to the existence of a bottleneck in the graph, or how easy
it is to divide the graph into two clusters that are not very connected to one another.
The proofs in this section nicely illustrate the mechanics of working with vectors and
matrices indexed by the vertices of a graph.
Figure 1: Example of graph with a bottleneck.
For our second discussion in graph clustering, we will work with the adjacency matrix.
There are methods of converting an adjacency matrix into a best-fit block diagonal form,
where the matrices on the diagonal in the decomposition correspond to possible clusters
in the graph. Unfortunately there is not always a mathematically optimal solution, but
various techniques in computer science are used to create algorithms that generally turn
up a natural decomposition. As our focus is on the mathematics concerning adjacency
matrices which have already been transformed, we will not go into the details behind
these algorithms, and in fact only explore the mathematical foundation behind one such
algorithm.
Figure 2: Example of a graph with four clusters.
Finally, we discuss a way to use the permuted adjacency matrix and corresponding
block decomposition to inform a simplified version of the original graph, where each
cluster is represented by a vertex. This method also provides a notion of the strength
of the connection between two clusters, which may be used as a filter for inducing the
simplified graph. The ideas in this section are our own.
2 Graph Theory and Linear Algebra
2.1 Introduction to Graph Theory
We begin our discussion with a brief introduction to concepts in graph theory, focusing
primarily on those parts that will be used within the paper, though at times we include
some interesting results not directly related to our goals. The following definitions
come from the presentation in Giblin [2].
Definition 2.1 A graph is a pair $(V, E)$ where $V$ is a finite set and $E$ is a set of
unordered pairs of distinct elements of $V$. To make the association between the sets $V$
and $E$ and their associated graph $G$ clear, we write $V(G)$ and $E(G)$. We write elements
in $E(G)$ in the form $\{v, w\}$ where $v, w \in V(G)$ and $v \neq w$. Elements of $V(G)$ are called
vertices (a single element of $V(G)$ is a vertex), and an element $\{v, w\} \in E(G)$ is called
the edge connecting $v$ and $w$. The number of distinct edges in $E(G)$ in which a vertex
$v$ is contained is called its degree, and it is denoted $d_v$. Denote the minimum and
maximum vertex degrees of $G$ by $\delta(G)$ and $\Delta(G)$, respectively.
Remark Given a graph $G$ where $V(G)$ has $n$ elements, the maximum number of
elements in $E(G)$ is
$$\binom{n}{2} = \frac{1}{2}n(n-1).$$
Definition 2.2 A graph $G$ with the maximum number of edges is called a complete
graph: every pair of vertices $v, w \in V(G)$ is connected by an edge in $E(G)$.
Figure 3: Examples of complete graphs of orders 3, 5, and 10.
Definition 2.3 Let $G$ be a graph. A realization of $G$ is a set of points in a real vector
space $\mathbb{R}^n$, one point for each vertex, together with straight line segments joining those
pairs of points which correspond to edges in $E(G)$. Formally, a realization must honor
two additional conditions:

i) two edges may meet only at a common end-point, and

ii) a vertex may only intersect an edge at an end-point, if at all.
A question that naturally arises from the above definition is whether or not an arbitrary
graph $G$ may be realized in $\mathbb{R}^3$, since the intersection conditions may provide quite
a restriction for complicated graphs. It turns out, as the following theorem proves,
that any graph has a realization in $\mathbb{R}^3$, no matter the complexity! While we will not
be utilizing the realization given in the construction in the proof, it contains a nice
argument that is interesting to think about.
Theorem 2.4 Every graph can be realized in $\mathbb{R}^3$.
Proof Consider the "twisted cubic" in $\mathbb{R}^3$, the set $C = \{(x, x^2, x^3) \mid x \in \mathbb{R}\}$ (Figure 4).
We will first show that no four distinct points in $C$ lie on the same plane. To see
this, let $P_1, P_2, P_3, P_4 \in C$ be distinct points, where $P_i = (x_i, x_i^2, x_i^3)$. Since two
intersecting lines determine a plane, if the four points are coplanar, then the vectors connecting them
(there are three) must be linearly dependent. That is, if the vectors are arranged in a
matrix $M$, we must have
$$\det(M) =
\begin{vmatrix}
x_1 - x_2 & x_1^2 - x_2^2 & x_1^3 - x_2^3 \\
x_1 - x_3 & x_1^2 - x_3^2 & x_1^3 - x_3^3 \\
x_1 - x_4 & x_1^2 - x_4^2 & x_1^3 - x_4^3
\end{vmatrix} = 0.$$
Using simple factoring techniques and the fact that all $x_i$ are distinct, we convert $M$ to
upper-triangular form to get
$$M =
\begin{pmatrix}
x_1 - x_2 & x_1^2 - x_2^2 & x_1^3 - x_2^3 \\
x_1 - x_3 & x_1^2 - x_3^2 & x_1^3 - x_3^3 \\
x_1 - x_4 & x_1^2 - x_4^2 & x_1^3 - x_4^3
\end{pmatrix}
\longrightarrow
\begin{pmatrix}
1 & x_1 + x_2 & x_1^2 + x_1 x_2 + x_2^2 \\
0 & 1 & x_1 + x_2 + x_3 \\
0 & 0 & 1
\end{pmatrix}$$
and, keeping track of elementary row operations, it follows that
$$\det(M) = (x_1 - x_2)(x_1 - x_3)(x_1 - x_4)(x_3 - x_2)(x_4 - x_2)(x_4 - x_3) = 0$$
if and only if the points are not distinct. Thus, four distinct points in $C$ cannot be
coplanar.
Now we consider a graph $G$, where $V(G) = \{v_1, v_2, \ldots, v_n\}$ has $n$ elements. Assign
to each element $v_m \in V(G)$ the point $P_m = (m, m^2, m^3) \in C \subset \mathbb{R}^3$, and connect
these points via straight lines corresponding to the edges in $E$. It remains to show that
the intersection conditions (i) and (ii), above, hold in this realization. Consider two
distinct edges $\{P_i, P_j\}$ and $\{P_k, P_l\}$ in the realization of $G$. Then we know that $i \neq j$
and $k \neq l$, and, by distinctness, the edges can share at most one point, so we can also
assume that $i \neq l$. Consider the case when $j = k$, so that the edges share a common
end-point. By the argument above, since no four distinct points are coplanar,
three distinct points cannot be collinear (since a fourth point could then render a plane).
Thus, the edges cannot intersect away from the shared end-point. When $j \neq k$, the end-points of $\{P_i, P_j\}$ and
$\{P_k, P_l\}$ are all distinct, and so cannot be coplanar. Hence, the edges cannot intersect
at a point, since this can only happen when two lines fall on the same plane.
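The coplanarity criterion at the heart of this proof is easy to check numerically. Below is a minimal sketch (assuming NumPy is available; the helper name `coplanar` and the sample points are ours, not from the paper) that tests whether four points in $\mathbb{R}^3$ are coplanar by taking the determinant of the three connecting vectors, and confirms it never vanishes for distinct points on the twisted cubic.

```python
import numpy as np

def coplanar(points):
    """True if the four given 3-D points lie on a common plane, i.e. the
    three vectors connecting the first point to the others are linearly
    dependent (zero determinant)."""
    p = np.asarray(points, dtype=float)
    m = p[1:] - p[0]            # the three connecting vectors, as rows of M
    return abs(np.linalg.det(m)) < 1e-9

# Four distinct points on the twisted cubic C = {(x, x^2, x^3)}
cubic = lambda x: (x, x ** 2, x ** 3)
print(coplanar([cubic(x) for x in (1.0, 2.0, 3.0, 4.0)]))  # False

# By contrast, four points in the plane z = 0 are coplanar
print(coplanar([(0, 0, 0), (1, 0, 0), (0, 1, 0), (1, 1, 0)]))  # True
```

For the cubic points the determinant is the product of differences computed in the proof, which is nonzero exactly when the four parameters are distinct.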
Figure 4: A section of the twisted cubic.
Unfortunately, realizing almost any arbitrary graph as the above proof lays out would
be counterproductive to gaining insight into the structural properties of the graph. As
the figure of the twisted cubic shows, the points would be stretched along a single
curve, and so a natural arrangement could not be accomplished. As a result, the
intersection conditions are often dropped, and vertices and edges are taken only to
mean the structures explicitly given by the corresponding graph. We now give some
additional useful definitions in graph theory.
Definition 2.5 A path on a graph $G$ from $v_1$ to $v_{n+1}$ is a sequence of vertices and
edges
$$v_1 e_1 v_2 e_2 \ldots v_n e_n v_{n+1}$$
where $e_i = \{v_i, v_{i+1}\}$. A path is called simple if the edges in the path are all distinct
and the vertices are all distinct, except for possibly $v_1$ and $v_{n+1}$. That is, a simple path
is one that does not cross itself. A simple path for which $v_1 = v_{n+1}$ and $n > 0$ is called
a loop, and thus a path that is a loop must always have $n \geq 3$. A loop with $n = 3$ is
called a triangle.
A graph $G$ is called connected if, given two vertices $v, w \in V(G)$, there is a path on
$G$ from $v$ to $w$. For any non-empty graph $G$, a component of $G$ consists of all the
edges and vertices which occur in paths starting at some vertex $v$ of $G$. It follows that
a non-empty connected graph has exactly one component.

Given a graph $G$, a graph $H$ is called a subgraph of $G$ if the vertices and edges of $H$
are vertices and edges of $G$. The graph $H$ is a proper subgraph of $G$ if, in addition,
$H \neq G$. We now explore our first connection between graph theory and linear algebra.
2.2 The Adjacency Matrix
Since the definition of a graph does not lend itself to direct computation, as with
many other areas of mathematics, the desire is to tie the results into a familiar (and
computationally forgiving) subject, such as linear algebra. This tie is created in a few
ways, one of which we shall explore now in the manner of Biggs [1] and Schaeffer [4].
In all cases in this section, consider a graph $G$ where $V(G) = \{v_1, v_2, \ldots, v_n\}$.
Definition 2.6 The adjacency matrix of $G$ is the $n \times n$ matrix $A = A(G)$, whose
entries $a_{ij}$ are given by
$$a_{ij} =
\begin{cases}
1 & \text{if } v_i \text{ and } v_j \text{ are adjacent;} \\
0 & \text{otherwise,}
\end{cases}$$
where the matrix $A$ is indexed by the vertices in $V(G)$.
Remark From this definition we note that $A$ is a real symmetric matrix with zero
trace, since all entries are defined to be real numbers and a vertex $v_i$ is assumed not
to be connected to itself, so each $a_{ii} = 0$. Another item of note is that the matrix $A$
depends upon the order in which the vertices of $G$ are labeled. Thus, in order to study
the structure of a graph, which doesn't depend on vertex labeling, it is important to look
for properties of $A$ invariant under vertex permutations. One such set of properties are
referred to as the spectral properties of $A$.
Definition 2.7 The spectrum of a graph $G$ is the set of eigenvalues of $A(G)$, together
with their multiplicities. If the distinct eigenvalues of $A(G)$ are $\lambda_0 > \lambda_1 > \ldots > \lambda_{s-1}$,
and their corresponding multiplicities are $m(\lambda_0), m(\lambda_1), \ldots, m(\lambda_{s-1})$, then we shall
denote the spectrum of $G$ as
$$\operatorname{Spec} G =
\begin{pmatrix}
\lambda_0 & \lambda_1 & \ldots & \lambda_{s-1} \\
m(\lambda_0) & m(\lambda_1) & \ldots & m(\lambda_{s-1})
\end{pmatrix}.$$
For brevity, we shall refer to the eigenvalues of $A(G)$ as the eigenvalues of $G$, which
shall not cause confusion. Likewise, the associated characteristic polynomial of $A(G)$
will be denoted by $\chi(G; \lambda)$, and will be referred to as the characteristic polynomial of
$G$. If we write the characteristic polynomial of $G$ as
$$\chi(G; \lambda) = \lambda^n + c_1 \lambda^{n-1} + c_2 \lambda^{n-2} + c_3 \lambda^{n-3} + \ldots + c_n,$$
then we can interpret the coefficients $c_i$ in terms of the $i \times i$ principal minors of $A$:
each $(-1)^i c_i$ is the sum of the $i \times i$ principal minors.
We now prove a simple, but interesting, result.
We now prove a simple, but interesting, result.
Proposition 2.8 Given a characteristic polynomial (G; ) as expressed above, we
have:
(1) c
1
= 0;
(2) c
2
is the number of edges of G;
(3) c
3
is twice the number of triangles in G.
Proof From the definition of the characteristic polynomial in linear algebra, we have
that for $i \in \{1, 2, \ldots, n\}$, each $(-1)^i c_i$ is given by the sum of the principal minors of $A$
which have $i$ rows and columns. Then we have:

(1) This follows since the diagonal elements of $A$ are all zero, and hence $c_1 = 0$.

(2) Again, since all diagonal elements of $A$ are zero, and since $A$ is symmetric,
any non-zero $2 \times 2$ principal minor will be of the form
$$\begin{vmatrix} 0 & 1 \\ 1 & 0 \end{vmatrix}.$$
Each pair of adjacent vertices corresponds to one such minor, and since the value of
each is $-1$, we have $(-1)^2 c_2 = -|E(G)|$, where $|E(G)|$ denotes the number of edges
(or pairs of adjacent vertices) of $G$. Hence, $-c_2$ is the number of edges of $G$.

(3) We follow the same method as in (2) and examine the possible non-trivial $3 \times 3$
principal minors of $A$. In this case, there are three possibilities:
$$\begin{vmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{vmatrix}, \quad
\begin{vmatrix} 0 & 1 & 1 \\ 1 & 0 & 0 \\ 1 & 0 & 0 \end{vmatrix}, \quad
\begin{vmatrix} 0 & 1 & 1 \\ 1 & 0 & 1 \\ 1 & 1 & 0 \end{vmatrix},$$
and, computing determinants, only the final one is non-zero (its value is $2$). Notice that this last
matrix corresponds to a set of three vertices which are all mutually connected by edges,
which is the definition of a triangle. Since $(-1)^3 c_3 = 2 \cdot (\text{number of triangles})$, it follows that $-c_3$ is twice the number of
such triangles.
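As a numerical sanity check of Proposition 2.8, the coefficients of the characteristic polynomial can be recovered from the eigenvalues of $A$. A minimal sketch (assuming NumPy; the four-vertex example graph is ours):

```python
import numpy as np

# Adjacency matrix of a 4-vertex graph: a triangle {0,1,2} plus the edge {2,3}
edges = [(0, 1), (1, 2), (0, 2), (2, 3)]
A = np.zeros((4, 4))
for u, v in edges:
    A[u, v] = A[v, u] = 1

# Coefficients of chi(G; t) = t^n + c1 t^(n-1) + ... + cn,
# reconstructed from the (real, symmetric) eigenvalues of A
coeffs = np.poly(np.linalg.eigvalsh(A))
c1, c2, c3 = (int(round(c)) for c in coeffs[1:4])

print(c1)   # 0  (zero trace)
print(-c2)  # 4  (the number of edges)
print(-c3)  # 2  (twice the single triangle)
```

The graph has 4 edges and 1 triangle, matching $-c_2 = 4$ and $-c_3 = 2$.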
We will now take a look at another interesting application of the adjacency matrix.

Proposition 2.9 The number of walks of length $\ell$ in $G$ joining $v_i$ to $v_j$ is given by
the entry in position $(i, j)$ of the matrix $A^\ell$.

Proof We first establish that the result holds for $\ell = 0$, since $A^0 = I$, and this describes
the fact that there is one zero-length walk from each $v_i$ to itself and none otherwise. Now we
proceed via induction. Suppose the result holds for some $\ell = L$. Then, using the
matrix identity
$$(A^{L+1})_{ij} = \sum_{k=1}^{n} (A^L)_{ik} \, a_{kj},$$
we see that $(A^{L+1})_{ij}$ counts the number of walks of length $L + 1$ joining $v_i$ to
$v_j$: every such walk is a walk of length $L$ from $v_i$ to some vertex $v_k$ followed by an edge from $v_k$ to $v_j$. Thus, the result holds for all $\ell$.
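The walk-counting statement is easy to try on a small example, say the path graph $v_1 - v_2 - v_3$ (a minimal sketch assuming NumPy; the example graph is ours):

```python
import numpy as np

# Adjacency matrix of the path graph v1 - v2 - v3
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]])

A2 = np.linalg.matrix_power(A, 2)  # entries count walks of length 2
print(A2[0, 2])  # 1: the single walk v1 -> v2 -> v3
print(A2[0, 0])  # 1: the walk v1 -> v2 -> v1
print(A2[1, 1])  # 2: v2 -> v1 -> v2 and v2 -> v3 -> v2
```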
Since we are hoping to use the adjacency matrix to locate densely connected sections
of a graph, we will take the time to develop some of these ideas now. The following
definition is derived from the literature, though we introduce some of our own terminology.
The proposition that follows is also our own attempt at connecting ideas in spectral
clustering to graph theory and the adjacency matrix; the literature makes these claims
without proof, and so we establish the ideas rigorously here.
Definition 2.10 A block diagonal matrix is an $n \times n$ matrix $M_n$ wherein the diagonal
elements are $m \times m$ matrices with $1 \leq m \leq n$, and the off-diagonal elements are zero
matrices. We will say that such a matrix is in block diagonal form. When the square
matrices are of the form $[1]_{m \times m} - I_m$, where $[1]_{m \times m}$ is the matrix with 1s in each
entry, we'll say the matrix is in complete block diagonal form (our terminology).
Proposition 2.11 A graph $G$ is composed of complete components $K_0, K_1, \ldots, K_m \subseteq G$
if and only if it is possible to order the rows and columns of $A(G)$ so that it is in
complete block diagonal form.
Proof Let $G$ be a graph composed of complete components $K_0, K_1, \ldots, K_m \subseteq G$.
Since each $K_i$ is complete, the corresponding adjacency matrix $A(K_i)$ is of the form
$[1]_{m \times m} - I_m$, where $m$ is the number of vertices in $K_i$. Thus, $A(K_i)$ is in complete
block diagonal form with only a single block on the diagonal. Consider the matrix
given by
$$\begin{pmatrix}
A(K_0) & 0 & 0 & \cdots & 0 \\
0 & A(K_1) & 0 & \cdots & 0 \\
0 & 0 & A(K_2) & \cdots & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & 0 & \cdots & A(K_m)
\end{pmatrix}$$
which, by definition, is in complete block diagonal form. This matrix is equivalent to
$A(G)$ since each vertex $v \in V$ and each edge in $E$ is included in one and only one
complete subgraph $K_i$.
Consider now a graph $G$ which has an adjacency matrix $A(G)$ that is in complete block
diagonal form. Then $A(G)$ may be decomposed into block matrices such that each
diagonal block has the form $A(K_i) = [1]_{m_i \times m_i} - I_{m_i}$ and entries are zero elsewhere.
Then each diagonal block $A(K_i)$, if viewed as an adjacency matrix for a subgraph of
$G$, is complete, since the only edges that are not included are of the form $\{v, v\}$ for a
single $v \in V$ (and these types of edges aren't allowed in our treatment). To show that
each $K_i$ is also a component, note that only edges within a single complete graph $K_i$
are accounted for in $A(G)$, since otherwise this would contradict complete block diagonal
form. Thus, there cannot be an edge connecting $K_i$ to $K_j$ for $i \neq j$. Hence, $G$ is
composed of complete components.
2.3 The Laplacian
Another important tie between a graph and linear algebra is through what is called the
Laplacian of a graph. This matrix holds information about the vertices and connectivity
of the graph, as we saw with the adjacency matrix and its associated characteristic
equation. In this section we will explore basic properties of the Laplacian of a graph.
Later, we will use this matrix to deduce properties relating to network simplification.
The exposition in this section is from [3].
Definition 2.12 The diagonal degree matrix of a graph $G$ is a diagonal $n \times n$ matrix
wherein each entry on the diagonal corresponds to the degree of a vertex in $V(G)$. This
matrix is denoted $D(G)$, and is defined by
$$d_{ij} =
\begin{cases}
d_{v_i} & \text{if } i = j, \\
0 & \text{otherwise,}
\end{cases}$$
where the matrix is indexed by $V(G)$.
Definition 2.13 The Laplacian of a graph $G$ is the $n \times n$ matrix given by
$$L(G) = D(G) - A(G).$$
Thus, $L(G)$ is the matrix with the degrees of the vertices on the diagonal and negative ones
off the diagonal, each corresponding to a pair of adjacent vertices. When the associated graph
is clear and it would be cumbersome to write out $L(G)$, we will just write $L$ for the
Laplacian of $G$.
Remark Notice that since $L(G)$ is real and symmetric, its eigenvalues are all real;
moreover, as Proposition 2.15 below makes evident, $L(G)$ is positive semidefinite, so its
eigenvalues are all non-negative. Denote the (Laplace) eigenvalues of $L(G)$ in increasing order by
$$0 \leq \lambda_1 = \lambda_1(G) \leq \lambda_2 = \lambda_2(G) \leq \ldots \leq \lambda_n = \lambda_n(G),$$
where each value is repeated according
to its multiplicity. We will sometimes write $\lambda_1 = \lambda_{\min}$ and $\lambda_n = \lambda_{\max}$ to refer to
the smallest and largest eigenvalues of $L(G)$, respectively. It is possible to get an
immediate result about the smallest eigenvalue, $\lambda_1$, and its associated eigenvector.
Lemma 2.14 The Laplacian $L(G)$ of a graph $G$ always has smallest eigenvalue
$\lambda_1 = 0$. The corresponding eigenvector is $\mathbf{1} = (1, 1, \ldots, 1)^T$.

Proof By the definition of the Laplacian of a graph $G$, indexed by $V(G) = \{v_1, v_2, \ldots, v_n\}$,
we have
$$L(G) \cdot \mathbf{1} = (D(G) - A(G)) \cdot \mathbf{1}
= D(G) \cdot \mathbf{1} - A(G) \cdot \mathbf{1}
= (d_{v_1}, d_{v_2}, \ldots, d_{v_n})^T - (d_{v_1}, d_{v_2}, \ldots, d_{v_n})^T
= 0 \cdot \mathbf{1}.$$
Hence, $\mathbf{1}$ is an eigenvector with eigenvalue 0.
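The construction $L = D - A$ and the identity $L \cdot \mathbf{1} = 0$ take only a few lines to verify numerically. A minimal sketch (assuming NumPy; the example graph is ours):

```python
import numpy as np

# A 4-vertex graph: a triangle {0,1,2} plus the edge {2,3}
edges = [(0, 1), (1, 2), (0, 2), (2, 3)]
n = 4
A = np.zeros((n, n))
for u, v in edges:
    A[u, v] = A[v, u] = 1

D = np.diag(A.sum(axis=1))  # diagonal degree matrix D(G)
L = D - A                   # the Laplacian L(G) = D(G) - A(G)

ones = np.ones(n)
print(L @ ones)             # prints the zero vector: 1 is an eigenvector for 0
```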
Before we move to the next results, we must take a minute to introduce a new idea.
Recall that the inner product of two vectors $x = (x_1, x_2, \ldots, x_n),\; y = (y_1, y_2, \ldots, y_n) \in \mathbb{R}^n$
is given by
$$\langle x, y \rangle = \sum_{i=1}^{n} x_i y_i.$$
There is another, and in our case more convenient, way to express a vector in $\mathbb{R}^n$ when
given a particular index on the vertices $V(G)$. Consider the family of functions
$$\mathbb{R}^V = \{ f : V(G) \to \mathbb{R} \}$$
which take a vertex in $V(G)$ to a real number. If we enumerate through all vertices
$v \in V(G)$, preserving the order in which they are enumerated, the function values
$f(v) = f_v$ form a vector in $\mathbb{R}^n$. Thinking in this way, we write $f = (f_{v_1}, f_{v_2}, \ldots, f_{v_n})^T$,
and think of the function $f$ exactly as a vector in $\mathbb{R}^n$, where the value $f_v$ is the coordinate
corresponding to the vertex $v$. Notice that if we consider the operations of addition and
scalar multiplication (by real numbers) of functions in $\mathbb{R}^V$ as $(f + g)_v = f_v + g_v$
and $(\alpha f)_v = \alpha (f_v)$ respectively, then $\mathbb{R}^V$ is a real vector space of dimension $n$. With
this notation, the inner product of two functions becomes
$$\langle f, g \rangle = \sum_{v \in V} f_v g_v.$$
Notice the similarity between this definition and the one given previously for vectors
encountered in traditional linear algebra. From this point forward, we shall use the
function definitions of vectors that are indexed by elements in $V(G)$.
We can also look at $L(G)$ as a linear operator on the vectors $f \in \mathbb{R}^V$. Consider the
following definition of matrix-vector multiplication, where $g = Lf$:
$$g_u = (Lf)_u = \sum_{v \in V} L_{uv} f_v, \quad u \in V.$$
Using the above definitions, it turns out that we may write the inner product of $f$ with
$Lf$ in a convenient way, which we give here without proof.
Proposition 2.15 Given a function $f \in \mathbb{R}^V$ and the Laplacian $L(G)$, we have
$$\langle f, Lf \rangle = \sum_{uv \in E(G)} (f_u - f_v)^2.$$
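Though we omit the proof, the identity is easy to spot-check numerically for an arbitrary function $f \in \mathbb{R}^V$. A minimal sketch (assuming NumPy; the example graph and the vector $f$ are ours):

```python
import numpy as np

# Same 4-vertex example graph: a triangle {0,1,2} plus the edge {2,3}
edges = [(0, 1), (1, 2), (0, 2), (2, 3)]
n = 4
A = np.zeros((n, n))
for u, v in edges:
    A[u, v] = A[v, u] = 1
L = np.diag(A.sum(axis=1)) - A

f = np.array([3.0, -1.0, 0.5, 2.0])  # an arbitrary function in R^V

lhs = f @ L @ f                                  # <f, Lf>
rhs = sum((f[u] - f[v]) ** 2 for u, v in edges)  # sum over edges of (f_u - f_v)^2
print(np.isclose(lhs, rhs))  # True
```

This also makes the positive semidefiniteness of $L$ evident: the right-hand side is a sum of squares.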
This concludes our discussion on two fundamental connections between graph theory
and linear algebra. Next, we will see how these tools are used to determine structural
properties of a graph G.
3 Spectral Graph Theory
We now come to the section wherein our first big result will be proved. As discussed
in the introduction, our goal is to locate features of a graph through analyzing its
underlying adjacency or Laplacian matrices. Techniques which use the eigenvalues of
these matrices belong to the subject of what is called spectral graph theory, a topic of
intense study since the 1980s. All results in this section are from [3].
3.1 Important Eigenvalues of the Laplacian
Many results may be found that concern only the eigenvalues of the Laplacian. These
results will be used to prove our bigger theorem. Since the Laplacian is real and
symmetric, the theory of positive semidefinite matrices (symmetric matrices with non-negative
eigenvalues) applies, though we will not present the theory explicitly in this
discussion.
Surprisingly, the Laplacian matrix unveils many hidden structures of the graph G.
Following is an interesting result tying the number of connected components of G to
the zero eigenvalues of L(G).
Proposition 3.1 The multiplicity of the value 0 as an eigenvalue of $L(G)$ is equal to
the number of connected components of $G$.

Proof Let $H$ be a connected component of $G$. Consider the characteristic function
$f^H \in \mathbb{R}^V$ of $H$, where $f^H_v = 1$ if $v \in V(H)$, and 0 otherwise. Then $Lf^H = 0$, using
the proof of Lemma 2.14 as a guide, together with Proposition 2.15. Say that $G$ has
$m$ different connected components. Then since the vectors $f^{H_k}$, $1 \leq k \leq m$, are all
linearly independent, the eigenvalue 0 has multiplicity at least $m$, since the vectors $f^{H_k}$
are a basis of a subspace of the eigenspace corresponding to the eigenvalue 0.

Now, let $f \in \mathbb{R}^V$ be a vector such that $Lf = 0$, and so $\langle f, Lf \rangle = 0$. By Proposition 2.15,
we see that $f_v = f_u$ when the vertices $v, u$ are in the same component (since otherwise
there would exist adjacent vertices $v, u$ such that $(f_v - f_u)^2 > 0$, contradicting the inner product
being 0). Hence, $f$ is a linear combination of characteristic vectors of the connected
components of $G$ (it is constant on each connected component), and so the eigenspace
associated with the eigenvalue 0 is spanned by the $f^{H_k}$. Hence, the multiplicity of 0 is exactly
the number of connected components of $G$.
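This gives a purely spectral way to count components: compute the Laplacian spectrum and count the (numerically) zero eigenvalues. A minimal sketch (assuming NumPy; the two-component example graph is ours):

```python
import numpy as np

# Two components: a triangle {0,1,2} and a single edge {3,4}
edges = [(0, 1), (1, 2), (0, 2), (3, 4)]
n = 5
A = np.zeros((n, n))
for u, v in edges:
    A[u, v] = A[v, u] = 1
L = np.diag(A.sum(axis=1)) - A

eigvals = np.linalg.eigvalsh(L)  # real eigenvalues, sorted ascending
n_components = int(np.sum(np.abs(eigvals) < 1e-9))
print(n_components)  # 2
```

In floating point the "zero" eigenvalues come out as tiny nonzero numbers, hence the tolerance.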
Corollary 3.2 The eigenvalue $\lambda_1(G) = 0$ is simple (multiplicity 1) if and only if $G$
is a connected graph.
The eigenvalues of $L(G)$ most commonly encountered in spectral graph theory are $\lambda_2$
and $\lambda_{\max}$, the second-smallest and the largest eigenvalues. It is not a coincidence that
these two are considered as a pair, since it turns out that $\lambda_2(G) = n - \lambda_{\max}(\bar{G})$,
where $\bar{G}$ is the complement of $G$, and so
the importance of one implies the importance of the other.
Following are two results concerning $\lambda_2(G)$, which we present here without proof
(though they are clearly presented in [3] for reference).
Definition 3.3 The number of edges connecting a set of vertices $S \subseteq V(G)$ to its
complement set, $\bar{S}$, is denoted by $e(S, \bar{S})$.
Lemma 3.4 Let $G$ be a graph of order $n$ and $S \subseteq V(G)$. Then
$$\lambda_2(G) \, \frac{|S|(n - |S|)}{n} \;\leq\; e(S, \bar{S}) \;\leq\; \lambda_{\max}(G) \, \frac{|S|(n - |S|)}{n}.$$
Lemma 3.5 Let $s, t \in V(G)$ be nonadjacent vertices of a graph $G$. Then
$$\lambda_2(G) \leq \frac{1}{2}(d_s + d_t).$$
There are many other applications and bounds for the eigenvalues of L(G), though the
above are the only ones we need in order to examine the isoperimetric number, which
we present now.
3.2 The Isoperimetric Number
One question that arises concerning the general shape or structure of a graph is the
existence of what is called a bottleneck: whether the graph may be split into two subsets
of vertices with relatively few edges between them. A number that gives us an idea of
whether this is possible is called the isoperimetric number, or Cheeger constant, of the
graph.
Definition 3.6 The isoperimetric number $i(G)$ of a graph $G$ of order $n \geq 2$ is defined
to be
$$i(G) = \min\left\{ \frac{e(S, \bar{S})}{|S|} \;:\; S \subseteq V(G),\; 0 < |S| \leq \frac{n}{2} \right\}.$$
Remark Notice that the number $i(G)$ will be small if there is a fairly large set of
vertices $S \subseteq V(G)$ with relatively few edges connecting $S$ to $\bar{S}$, the complement of that
set. When $G$ is not connected, there exists a set $S \subseteq V(G)$ such that $e(S, \bar{S}) = 0$,
and so in this case $i(G) = 0$.
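On small graphs the definition can be evaluated directly by enumerating all candidate sets $S$. A brute-force sketch (the helper name and the barbell example are ours; this is exponential in $n$ and only practical for tiny graphs):

```python
from itertools import combinations

def isoperimetric_number(n, edges):
    """Brute-force i(G): minimize e(S, S-bar)/|S| over all vertex sets S
    with 0 < |S| <= n/2. Exponential in n -- small graphs only."""
    edge_set = {frozenset(e) for e in edges}
    best = float("inf")
    for k in range(1, n // 2 + 1):
        for S in combinations(range(n), k):
            S = set(S)
            # e(S, S-bar): edges with exactly one end-point in S
            cut = sum(1 for e in edge_set if len(e & S) == 1)
            best = min(best, cut / len(S))
    return best

# Two triangles joined by a single bridge edge: an obvious bottleneck
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
print(isoperimetric_number(6, edges))  # 1/3: cut the bridge, |S| = 3
```

Taking $S$ to be one of the triangles cuts a single edge at cost $1/3$, and no other choice of $S$ does better.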
Unfortunately, for large graphs G, computing i(G) is not feasible, as minimizing this
function is NP-hard [3]. However, we are able to get some relatively good bounds on
i(G) by computing properties of L(G).
Corollary 3.7 Let $G$ be a graph of order $n$. Then
$$i(G) \geq \frac{\lambda_2(G)}{2}.$$
Proof Consider a nonempty subset $S \subseteq V(G)$. By Lemma 3.4, we have
$$\frac{e(S, \bar{S})}{|S|} \geq \lambda_2(G) \, \frac{n - |S|}{n}.$$
From the definition of $i(G)$, if $|S| \leq \frac{n}{2}$, then we know that $(n - |S|)/n \geq \frac{1}{2}$. Plugging
this into the above inequality, we see that $e(S, \bar{S})/|S| \geq \lambda_2(G)/2$. Since this is true
for an arbitrary $S \subseteq V(G)$ with no more than half the vertices of $V(G)$, we have
$i(G) \geq \lambda_2(G)/2$.
Obtaining an upper bound for $i(G)$ takes more work, but illustrates many good
techniques used in proofs of results in spectral graph theory. We now present such an upper
bound with full proof.
Theorem 3.8 Let $G$ be a graph of order $n$ that is not complete, and let $\Delta = \Delta(G)$ be the
largest vertex degree of $G$. Then
$$i(G) \leq \sqrt{\lambda_2 (2\Delta - \lambda_2)}.$$
Proof We may assume that $G$ is connected, since otherwise $\lambda_2 = i(G) = 0$. Let
$f \in \mathbb{R}^V$ be an eigenvector corresponding to $\lambda_2$. Define the set $W = \{v \in V(G) \mid f_v > 0\}$,
the set of vertices in $V(G)$ at which $f$ evaluates to a positive number. Notice that we
may assume that $|W| \leq \frac{n}{2}$, since otherwise we may replace $f$ by $-f$, which is still an
eigenvector of the Laplacian. Then
$$\begin{aligned}
\lambda_2 \sum_{u \in W} f_u^2
&= \sum_{u \in W} (Lf)_u f_u
= \sum_{u \in W} (Df - Af)_u f_u \\
&= \sum_{u \in W} \Big( d_u f_u - \sum_{v \in V} a_{uv} f_v \Big) f_u
= \sum_{u \in W} \sum_{v \in V} a_{uv} (f_u - f_v) f_u \\
&= \sum_{u \in W} \sum_{v \in W} a_{uv} (f_u - f_v) f_u + \sum_{u \in W} \sum_{v \notin W} a_{uv} (f_u - f_v) f_u \\
&\geq \sum_{u \in W} \sum_{v \in W} a_{uv} (f_u - f_v) f_u + \sum_{u \in W} \sum_{v \notin W} a_{uv} f_u^2,
\end{aligned}$$
where the last step uses the fact that $f_v \leq 0$ for $v \notin W$, so that $(f_u - f_v) f_u \geq f_u^2$.

Notice that it is possible to transform a double sum over vertices into a single sum over
edges. However, since the double sum doesn't take into consideration the order of the
end-points of the edges, it counts each edge twice when indexed over
a single set (we don't have this problem if the sums are performed over disjoint sets).
Now, to use a trick often seen in analysis, we define $g \in \mathbb{R}^V$ by $g_v = f_v$ if $v \in W$ and
$g_v = 0$ otherwise. Thus, $g$ is just the positive part of $f$, with zeros in all other
coordinates. We can now re-write the above double sums as sums over edges, and
then, replacing the vector $f$ with $g$, together with the fact that $a_{uv} = 1$ if and only if
$uv \in E(G)$ and zero otherwise, we have
$$\begin{aligned}
\lambda_2 \sum_{u \in W} f_u^2
&\geq \sum_{u \in W} \sum_{v \in W} a_{uv} (f_u - f_v) f_u + \sum_{u \in W} \sum_{v \notin W} a_{uv} f_u^2 \\
&= \sum_{uv \in E(W, W)} (f_u - f_v)^2 + \sum_{uv \in E(W, \bar{W})} f_u^2 \\
&= \sum_{uv \in E(W, W)} (g_u - g_v)^2 + \sum_{uv \in E(W, \bar{W})} (g_u - g_v)^2 \\
&= \sum_{uv \in E} (g_u - g_v)^2 = \langle Lg, g \rangle.
\end{aligned}$$
But now $\sum_{v \in W} f_v^2 = \sum_{v \in V} g_v^2 = \langle g, g \rangle$, and so the inequality becomes
$$\lambda_2 \geq \frac{\langle Lg, g \rangle}{\langle g, g \rangle} =: K.$$
Then, since $\sum_{uv \in E} (g_u^2 + g_v^2) = \sum_{v \in V} \sum_{u \in V} a_{uv} g_v^2 = \sum_{v \in V} d_v g_v^2$, we
have
$$\sum_{uv \in E} (g_u + g_v)^2
= 2 \sum_{uv \in E} (g_u^2 + g_v^2) - \sum_{uv \in E} (g_u - g_v)^2
= 2 \sum_{v \in V} d_v g_v^2 - \langle Lg, g \rangle
\leq 2\Delta \sum_{v \in V} g_v^2 - \langle Lg, g \rangle
= (2\Delta - K)\langle g, g \rangle.$$
Substituting the identity $\langle Lg, g \rangle = \sum_{uv \in E} (g_u - g_v)^2$ into the definition of $K$, then
applying the Cauchy–Schwarz inequality to the numerator and the previous inequality to
the denominator, we have
$$K = \frac{\sum_{uv \in E} (g_u - g_v)^2}{\langle g, g \rangle} \cdot
\frac{\sum_{uv \in E} (g_u + g_v)^2}{\sum_{uv \in E} (g_u + g_v)^2}
\geq \frac{\Big( \sum_{uv \in E} |g_u^2 - g_v^2| \Big)^2}{(2\Delta - K)\langle g, g \rangle^2}.$$
Next, we borrow an idea from integration. Let $0 = t_0 < t_1 < \cdots < t_m$ be the distinct
values of $g_v$ for $v \in V(G)$. For $k = 0, 1, \ldots, m$, consider the set $V_k := \{v \in V \mid g_v \geq t_k\}$,
the preimage under $g$ of all values at least $t_k$. Notice that $|V_k| \leq |W| \leq \frac{n}{2}$
when $k > 0$. Then
$$\begin{aligned}
\sum_{uv \in E} |g_u^2 - g_v^2|
&= \sum_{k=1}^{m} \sum_{\substack{uv \in E \\ g_v < g_u = t_k}} (g_u^2 - g_v^2) \\
&= \sum_{k=1}^{m} \sum_{\substack{g_u = t_k \\ g_v = t_l,\, l < k}} a_{uv}
\big( t_k^2 - t_{k-1}^2 + t_{k-1}^2 - \cdots - t_{l+1}^2 + t_{l+1}^2 - t_l^2 \big) \\
&= \sum_{k=1}^{m} \sum_{u \in V_k} \sum_{v \notin V_k} a_{uv} (t_k^2 - t_{k-1}^2) \\
&= \sum_{k=1}^{m} e(V_k, \bar{V_k}) \, (t_k^2 - t_{k-1}^2) \\
&\geq i(G) \sum_{k=1}^{m} |V_k| (t_k^2 - t_{k-1}^2) \\
&= i(G) \sum_{k=1}^{m} t_k^2 (|V_k| - |V_{k+1}|) \qquad (\text{with } |V_{m+1}| := 0) \\
&= i(G) \sum_{v \in W} g_v^2.
\end{aligned}$$
Now, we can combine the previous two inequalities to get
$$K \geq \frac{\Big( \sum_{uv \in E} |g_u^2 - g_v^2| \Big)^2}{(2\Delta - K)\langle g, g \rangle^2}
\geq \frac{i(G)^2 \Big( \sum_{v \in V} g_v^2 \Big)^2}{(2\Delta - K)\langle g, g \rangle^2}
= \frac{i(G)^2}{2\Delta - K}.$$
Since $\lambda_2 \geq K$, rewriting $K(2\Delta - K) \geq i(G)^2$ as a quadratic in $K$ and applying the inequality yields
$$\lambda_2 \geq \Delta - \sqrt{\Delta^2 - i(G)^2}.$$
From Lemma 3.5, we know that $\lambda_2 \leq \Delta$. Thus, we have
$$\begin{aligned}
\lambda_2 \geq \Delta - \sqrt{\Delta^2 - i(G)^2}
&\implies \sqrt{\Delta^2 - i(G)^2} \geq \Delta - \lambda_2 \\
&\implies \Delta^2 - i(G)^2 \geq (\Delta - \lambda_2)^2 = \Delta^2 - 2\Delta\lambda_2 + \lambda_2^2 \\
&\implies i(G)^2 \leq 2\Delta\lambda_2 - \lambda_2^2 = \lambda_2(2\Delta - \lambda_2),
\end{aligned}$$
whereby taking the square root gives us the upper bound.
Thus, we have established upper and lower bounds on $i(G)$ that do not require massive
amounts of calculation. The beauty here is that if one would like to detect obvious
bottleneck behavior (or know that it doesn't exist), one of these bounds may be able
to provide enough information about the graph without having to compute the actual
value of $i(G)$.
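The sandwich $\lambda_2/2 \leq i(G) \leq \sqrt{\lambda_2(2\Delta - \lambda_2)}$ can be observed numerically on a small bottlenecked graph, comparing a brute-force $i(G)$ against the two spectral bounds. A sketch (assuming NumPy; the barbell example is ours, and the brute-force minimization is only viable for tiny graphs):

```python
import numpy as np
from itertools import combinations

# Barbell graph: two triangles joined by a bridge (an obvious bottleneck)
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
n = 6
A = np.zeros((n, n))
for u, v in edges:
    A[u, v] = A[v, u] = 1
L = np.diag(A.sum(axis=1)) - A

lam2 = np.linalg.eigvalsh(L)[1]    # second-smallest Laplacian eigenvalue
Delta = int(A.sum(axis=1).max())   # maximum vertex degree

# Brute-force i(G) for comparison
i_G = min(
    sum(A[u][v] for u in S for v in range(n) if v not in S) / len(S)
    for k in range(1, n // 2 + 1)
    for S in map(list, combinations(range(n), k))
)

lower = lam2 / 2
upper = np.sqrt(lam2 * (2 * Delta - lam2))
print(lower <= i_G <= upper)  # True
```

Here $i(G) = 1/3$ (cut the bridge), and the Corollary 3.7 and Theorem 3.8 bounds bracket it.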
4 Graph Simplification
In this section, we will show how the adjacency matrix and techniques in clustering
may be combined to identify the prominent features, or cliques, of a given graph. We
begin by establishing the definitions used in the literature, and then expand on them by
creating and proving some propositions that tie back to the material presented previously
on graph theory and the adjacency matrix. We also construct our own mathematical
object called the weight of a graph, and discover properties of this formulation. As in
prior sections, consider a graph $G = (V, E)$ where $V = \{v_1, v_2, \ldots, v_n\}$.
4.1 Clusters and Total Density
The definitions we will explore in this section are inspired by Schaeffer [5] unless
otherwise noted. The propositions and proofs are our own.

Definition 4.1 A cluster is a set of vertices $\mathcal{C} \subseteq V$. The order of $\mathcal{C}$ is given by the
number of vertices in $\mathcal{C}$, and is denoted $|\mathcal{C}|$.
Remark We will often find it convenient to view a cluster $\mathcal{C}$ as a subgraph of $G$,
by which we mean the set $\mathcal{C}$ together with the set of the edges in $E$ connecting vertices
in $\mathcal{C}$. We will not make a distinction between a cluster and the subgraph induced by
the cluster.

Intuitively, we think of a feature, or dense area of $G$, as a cluster with a high degree
of internal connectivity and a lower degree of external connectivity. This is exactly the
idea on which the following definitions are based.
Definition 4.2 The internal degree of a cluster $\mathcal{C}$ is the number of edges connecting
vertices in $\mathcal{C}$. We write
$$\deg_{\mathrm{int}}(\mathcal{C}) = |\{\{u, v\} \in E : u, v \in \mathcal{C}\}|.$$
The external degree of a cluster $\mathcal{C}$ is the number of edges connecting vertices in $\mathcal{C}$ to
vertices in $V \setminus \mathcal{C}$, and is given by
$$\deg_{\mathrm{ext}}(\mathcal{C}) = |\{\{u, v\} \in E : u \in \mathcal{C},\; v \in V \setminus \mathcal{C}\}|.$$
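Both degrees can be read straight off the adjacency matrix. A minimal sketch (assuming NumPy; the helper name `degrees` and the five-vertex example are ours):

```python
import numpy as np

# Triangle {0,1,2} with a tail 2 - 3 - 4
edges = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4)]
A = np.zeros((5, 5))
for u, v in edges:
    A[u, v] = A[v, u] = 1

def degrees(A, C):
    """Internal and external degree of a cluster C (a set of vertex indices)."""
    C = set(C)
    rest = set(range(len(A))) - C
    deg_int = sum(A[u][v] for u in C for v in C if u < v)  # each edge once
    deg_ext = sum(A[u][v] for u in C for v in rest)
    return int(deg_int), int(deg_ext)

print(degrees(A, {0, 1, 2}))  # (3, 1): the triangle, plus the edge {2,3} leaving it
```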
Proposition 4.3 For ( a cluster of G, ( is path connected to V ( if and only if
deg
ext
(() > 0.
Proof Let C be a cluster of G which is path connected to V \ C. Then there exists
a vertex v ∈ V \ C connected by a path to some vertex u ∈ C. Denote this path by
u e_0 w_1 e_1 w_2 e_2 ... w_m e_m v for some vertices w_1, w_2, ..., w_m ∈ V and edges
e_0, e_1, e_2, ..., e_m ∈ E. We aim to show that there exists a vertex in C connected by an
edge to a vertex in V \ C. If w_1 ∈ V \ C then we're done, since e_0 would be an edge
connecting C to V \ C. If not, then we look to see if w_2 ∈ V \ C. If so, we're done. If
not, then we look to the next vertex, and so on. If the process continues to the vertex
w_m ∈ C, then e_m is an edge connecting C to V \ C. Thus, there is at least one edge
connecting C to V \ C, and hence deg_ext(C) > 0.

Now, suppose that deg_ext(C) > 0. Then there exists an edge e = (v, w) ∈ E such that
v ∈ C and w ∈ V \ C. Thus, vew is a path connecting C to V \ C, and the proposition
is proved.
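The internal and external degrees are straightforward to compute from an edge list. The following sketch is our own illustration (the edge-list representation and function names are not from [5]), applied to a triangle with one pendant edge:

```python
def deg_int(cluster, edges):
    """Number of edges with both endpoints in the cluster (Definition 4.2)."""
    return sum(1 for (u, v) in edges if u in cluster and v in cluster)

def deg_ext(cluster, edges):
    """Number of edges joining the cluster to the rest of the graph."""
    return sum(1 for (u, v) in edges if (u in cluster) != (v in cluster))

# Triangle {0, 1, 2} plus a pendant edge out to vertex 3.
edges = [(0, 1), (1, 2), (0, 2), (2, 3)]
C = {0, 1, 2}

print(deg_int(C, edges))  # 3
print(deg_ext(C, edges))  # 1, so C is path connected to V \ C by Proposition 4.3
```

Since deg_ext(C) > 0 here, Proposition 4.3 says the triangle is path connected to the rest of the graph, as the pendant edge confirms.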
It is easy to measure the degree of connectivity directly, but how can we use this
information to identify the dense areas in a graph? To do this, we define the notion of
density.
Definition 4.4 The local density of a cluster C is given by

    δ_ℓ(C) = deg_int(C) / (|C| choose 2) = 2 deg_int(C) / (|C|(|C| − 1)).

It is essentially a function that computes how "complete" the cluster C is, i.e., the ratio
of the number of edges in C to the maximum number of edges possible. The relative
density of a cluster C is given by

    δ_r(C) = deg_int(C) / (deg_int(C) + deg_ext(C)),

the ratio of the internal degree of C to the sum of internal and external edges of C.

We now have two density functions at our disposal, which both take values in [0, 1].
Schaeffer combines these in a natural way, enabling us to determine the features in the
graph that are most prominent.
Definition 4.5 The total density of a cluster C is given by the product of the local and
relative densities

    δ(C) = δ_ℓ(C) δ_r(C) = 2 deg_int(C)² / (|C|(|C| − 1)(deg_int(C) + deg_ext(C))).
Thus, if both the local and relative densities are large (close to 1), the total density will
also be large (close to 1). On the other hand, even if C is a complete subgraph of G, if
there is a high degree of connectivity of the subgraph to other vertices in G, the total
density could be quite small. We prove the following proposition relating to complete
subgraphs, which shows that this is a desirable density function for distinguishing areas
of high connectivity and distinction in a graph.
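The three density functions can be computed in a few lines. The code below is our own sketch (not Schaeffer's), and it illustrates both halves of the intuition: an isolated complete triangle has total density 1, while adding a single external edge leaves the local density untouched but pulls the total density down.

```python
def densities(cluster, edges):
    """Return (local, relative, total) density of a cluster, per Defs. 4.4-4.5.
    Assumes the cluster has at least 2 vertices and at least one incident edge."""
    n = len(cluster)
    d_int = sum(1 for (u, v) in edges if u in cluster and v in cluster)
    d_ext = sum(1 for (u, v) in edges if (u in cluster) != (v in cluster))
    local = 2 * d_int / (n * (n - 1))        # deg_int(C) / (|C| choose 2)
    relative = d_int / (d_int + d_ext)
    return local, relative, local * relative  # total density is the product

# K_3 on {0, 1, 2} as a complete component (plus an unrelated edge (3, 4)).
edges = [(0, 1), (1, 2), (0, 2), (3, 4)]
print(densities({0, 1, 2}, edges))   # (1.0, 1.0, 1.0)

# Same triangle with one edge out to vertex 3: total density drops.
print(densities({0, 1, 2}, edges + [(2, 3)]))  # (1.0, 0.75, 0.75)
```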
Proposition 4.6 Given a cluster C ⊆ V in G, we have δ(C) = 1 if and only if C is a
complete component of G.
Proof Let C ⊆ V be a cluster in G with δ(C) = 1. Suppose that C is not a complete
subgraph of G, whereby δ_ℓ(C) < 1. Since δ_r(C) is at most 1, we have

    δ(C) = δ_ℓ(C) δ_r(C) < 1,

which contradicts our initial assumption. Thus, C must be a complete subgraph of G.
Now, suppose C is complete, but that C is path connected to V \ C. By Proposition 4.3,
deg_ext(C) > 0. Thus, δ_r(C) < 1, and we have

    δ(C) = δ_ℓ(C) δ_r(C) = δ_r(C) < 1,
which again contradicts our initial assumption. Hence, C must be both a complete
subgraph of G and disconnected from V \ C, and so it must be a complete component
of G.

Now, consider a cluster C of order n that is a complete component of G. Since C is
a complete subgraph of G, it contains the maximum number of edges. That is,
deg_int(C) = (n choose 2). Since C is a component of G it is not path connected to V \ C, and
so by Proposition 4.3, deg_ext(C) = 0. Using the definition of the total density we have

    δ(C) = [deg_int(C) / (|C| choose 2)] · [deg_int(C) / (deg_int(C) + deg_ext(C))]
         = [(n choose 2) / (n choose 2)] · [(n choose 2) / ((n choose 2) + 0)]
         = 1,

which proves the result.
Schaeffer now uses the total density formula to permute the adjacency matrix of a graph
so that it is in block diagonal form, or at least close to block diagonal form (where a
few edges may be missing from dense clusters and there may be edges connecting the
clusters). The algorithm maximizes the total density function for a set of clusters, and
is implemented with what are known as local search and simulated annealing methods
in computer science. While the details of the algorithm are beyond the scope of this
paper, the important thing to note is that these methods imply that the algorithm is a
type of heuristic, and not one that guarantees an optimal arrangement will be found.
For the rest of this paper, we will assume that an algorithm has been employed that
produces a "best fit" block diagonal form of the original adjacency matrix.
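To give a rough flavor of the local-search idea (and only that: Schaeffer's algorithm [5] uses stochastic simulated annealing and is considerably more sophisticated than this sketch of ours), a greedy hill-climb might grow a cluster around a seed vertex by repeatedly toggling whichever vertex most improves the total density:

```python
def total_density(cluster, edges):
    """Total density of Definition 4.5, with degenerate cases sent to 0."""
    n = len(cluster)
    if n < 2:
        return 0.0
    d_int = sum(1 for (u, v) in edges if u in cluster and v in cluster)
    d_ext = sum(1 for (u, v) in edges if (u in cluster) != (v in cluster))
    if d_int + d_ext == 0:
        return 0.0
    return (2 * d_int / (n * (n - 1))) * (d_int / (d_int + d_ext))

def greedy_cluster(seed, vertices, edges):
    """Hill-climb: add or remove single vertices while total density improves.
    A heuristic, like the methods above -- no optimality guarantee."""
    cluster = {seed}
    improved = True
    while improved:
        improved = False
        for v in vertices:
            trial = cluster ^ {v}  # symmetric difference: toggle v
            if seed in trial and total_density(trial, edges) > total_density(cluster, edges):
                cluster, improved = trial, True
    return cluster

# Two triangles joined by a single bridge edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
print(greedy_cluster(0, range(6), edges))  # {0, 1, 2}
```

Starting from seed 0, the climb recovers the left triangle and correctly declines to absorb the bridge vertex, since doing so would lower the total density.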
4.2 Nearly Block Diagonal Form
Consider now an adjacency matrix A(G) which has been run through an algorithm that
permutes the columns in order to place it into a best fit block diagonal form, and call the
permuted matrix A_D(G). We want to find ways to understand how close the permuted
matrix is to a true block diagonal form (and ideally, a complete block diagonal form).
Since, from Proposition 2.11, the only time A_D(G) is in complete block diagonal form
is when G is made of disjoint complete components, typically A_D(G) will be, at best, in
what we'll call nearly block diagonal form, which we define now. From this definition,
we may glean some information about the nature of the clusters and the number of
inter-cluster edges.

Remark Denote the sum of the entries in a matrix A as S(A).
Definition 4.7 A block diagonalized adjacency matrix A_D(G) is in what we will call
nearly block diagonal form with parameters (ε, ζ) if it may be decomposed into blocks
such that for each square block K_i of dimension m_i on the diagonal, we have

    S(K_i) + m_i ≥ m_i²(1 − ε),

and for each block B_ij of dimension m_i × m_j off of the diagonal, we have

    S(B_ij) / (m_i m_j) ≤ ζ,

where ε, ζ > 0. In this case, we will say that such a matrix A is NBD(ε, ζ) for brevity.

Thus, A_D(G) is in nearly block diagonal form if the diagonal blocks in the block
decomposition of A_D(G) + I are nearly all 1s, and the entries in the off-diagonal
blocks are nearly all 0s, where ε, ζ > 0 give the tolerance of on- and off-diagonal
blocks respectively.

Remark A matrix A in complete block diagonal form has 1 ≥ 1 − ε for the first
inequality and 0 ≤ ζ for the second, which holds for all ε, ζ > 0. Thus, a matrix in
complete block diagonal form is also NBD with arbitrarily small parameters ε, ζ > 0.
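The nearly block diagonal condition can be checked mechanically for a proposed block decomposition. The sketch below is our own (a plain list-of-lists matrix; eps and zeta play the roles of the on- and off-diagonal tolerance parameters):

```python
def block_sum(A, rows, cols):
    """S restricted to a block: sum of the entries A[r][c] over the block."""
    return sum(A[r][c] for r in rows for c in cols)

def is_nbd(A, sizes, eps, zeta):
    """True if A, with diagonal blocks of the given orders, satisfies the
    two NBD inequalities for tolerances eps and zeta."""
    offsets = [0]
    for m in sizes:
        offsets.append(offsets[-1] + m)
    idx = [range(offsets[i], offsets[i + 1]) for i in range(len(sizes))]
    for i, m_i in enumerate(sizes):
        # Diagonal block: S(K_i) + m_i must be at least m_i^2 (1 - eps).
        if block_sum(A, idx[i], idx[i]) + m_i < (1 - eps) * m_i ** 2:
            return False
        for j, m_j in enumerate(sizes):
            # Off-diagonal block: S(B_ij) must be at most zeta * m_i * m_j.
            if i != j and block_sum(A, idx[i], idx[j]) > zeta * m_i * m_j:
                return False
    return True

# Two K_3 blocks joined by one bridge edge (entries A[2][3] = A[3][2] = 1).
A = [[0] * 6 for _ in range(6)]
for u, v in [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]:
    A[u][v] = A[v][u] = 1
print(is_nbd(A, [3, 3], eps=0.1, zeta=0.2))   # True
print(is_nbd(A, [3, 3], eps=0.1, zeta=0.05))  # False: the bridge edge exceeds the tolerance
```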
We give a proposition for the cases in which A is not in complete block diagonal form,
and tie these notions to the associated graph of the adjacency matrix.
Proposition 4.8 Let G be a graph for which the best-fit block-diagonalized adjacency
matrix A_D(G) is NBD(ε, ζ). Then for each cluster C_i of order m_i corresponding to
the block decomposition of A_D(G), the number of edges required to complete C_i has
an upper bound of (ε m_i²)/2. Additionally, the number of edges connecting clusters C_i
to C_j with respective orders m_i, m_j for i ≠ j, is at most ζ m_i m_j.
Proof Let A_D(G) be NBD(ε, ζ). By definition, for each cluster C_i in G we have

    S(K_i) + m_i ≥ m_i²(1 − ε),

where m_i = |C_i|, and K_i is the square matrix on the diagonal of A_D(G) corresponding
to C_i. Thus, we have

    S(K_i) ≥ m_i²(1 − ε) − m_i.

Notice that if d represents the number of edges needed to complete K_i (so d is
half of the number of 0s off the diagonal of K_i), then S(K_i) may be expressed as
S(K_i) = m_i² − m_i − 2d. Substituting into the above inequality yields

    m_i² − m_i − 2d ≥ m_i²(1 − ε) − m_i
    m_i² − m_i − 2d ≥ m_i² − ε m_i² − m_i
              ε m_i² ≥ 2d,

and dividing by two gives an upper bound for d.
To see the bound for the edges connecting cluster C_i to C_j with respective orders m_i, m_j
for i ≠ j, we look again at the definition of nearly block diagonal form. If B_ij represents
the m_i × m_j block matrix corresponding to the edges between clusters C_i and C_j, then

    S(B_ij) / (m_i m_j) ≤ ζ,

and so, since S(B_ij) gives the number of edges connecting C_i to C_j, multiplying both
sides by m_i m_j gives the desired upper bound.
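The arithmetic in the proof is easy to check on a concrete case. The sketch below is our own; it uses exact rationals to avoid floating-point noise, takes K_5 with one edge deleted, and confirms that the smallest tolerance admitting that block (eps below) makes the bound d ≤ ε m_i²/2 exact:

```python
from fractions import Fraction

def min_eps(m, d):
    """Smallest eps tolerating a diagonal block of order m missing d edges:
    from S(K_i) + m >= m^2 (1 - eps) with S(K_i) = m^2 - m - 2d."""
    S = m * m - m - 2 * d
    return 1 - Fraction(S + m, m * m)

m, d = 5, 1                   # K_5 with one edge deleted
eps = min_eps(m, d)
print(eps)                    # 2/25
print(d <= eps * m * m / 2)   # True: the bound of Proposition 4.8 is attained
```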
4.3 Inducing a Simplied Graph
We will now construct the weight of an adjacency matrix, which will assign numbers
in [0, 1] to each of the blocks in the decomposition of A_D(G). This weight function is
meant to encode the relative size of the clusters and the strength of the links connecting
them within a smaller matrix, denoted W[A_D(G)]. Since we are assigning a value to
each block in the decomposition of A_D(G), it will have the same dimension as the
number of clusters in G with respect to A_D(G).

To give a weight to the diagonal blocks of A_D(G), we would like to assign a higher
value to larger clusters and a smaller value as the clusters reduce in size. For this, we
will take a ratio of the order of a cluster C_i to the maximum order of all clusters in
G. We would also like to assign a higher value to clusters which are more complete,
and smaller values to clusters which have internal edges missing. This points us to the
local density function, δ_ℓ(C_i). The weights of blocks off of the diagonal of A_D(G)
will correspond to the links between clusters in G. For this part, we will only take a
ratio of the number of edges between two clusters C_i and C_j to the maximum number
of edges possible between the two clusters, namely the product of the orders of the
clusters. Thus, the weight of an adjacency matrix is defined as follows.
Definition 4.9 The weight of an n × n block diagonalized adjacency matrix A_D(G)
is an m × m matrix, where m represents the number of blocks on the diagonal of the
block decomposition of A_D(G). The matrix is denoted by W[A_D(G)] and consists of
entries in [0, 1]. It is populated entrywise as

    w_ij = (|C_i| / max_{1 ≤ k ≤ m} |C_k|) · δ_ℓ(C_i)   for i = j,
    w_ij = S(B_ij) / (|C_i| |C_j|)                        for i ≠ j,

where C_i represents the cluster associated to the i-th diagonal block of A_D(G), and B_ij
represents the ij-th block of A_D(G).
Recall that the main idea behind our development in this section is to find a way to
simplify a given graph into its distinguishing features. To do this, we will induce graphs
based on threshold parameters tied to the on- and off-diagonal weights of W[A_D(G)].
Definition 4.10 The graph induced by W[A_D(G)] with parameters [s_1, s_2] ⊆ [0, 1]
and [t_1, t_2] ⊆ [0, 1] is denoted G_{[s_1,s_2],[t_1,t_2]} and is given by

    V = {v_i : w_ii ∈ [s_1, s_2]}
    E = {(v_i, v_j) : w_ii, w_jj ∈ [s_1, s_2], w_ij ∈ [t_1, t_2], and i ≠ j},

where v_1, v_2, ..., v_m represent vertices corresponding to the m diagonal blocks of
W[A_D(G)].
Another way to look at the graphs induced by W[A_D(G)] is through the adjacency
matrix. The adjacency matrix of a graph G_{[s_1,s_2],t} induced by W[A_D(G)] (here
with a single lower threshold t on the off-diagonal weights) may be determined by the
following algorithm:

- Remove the rows and columns in W[A_D(G)] corresponding to w_ii ∉ [s_1, s_2].
- Set all diagonal entries to zero.
- Set w_ij = f(w_ij) = 1 for w_ij ≥ t and 0 for w_ij < t, where i ≠ j.
In this way, one could have in mind the nature of the clusters one is interested in (or
the links between them) and then place a filter on the weight matrix that induces the
simplified graph. For instance, one may only be interested in the structure of the larger
or more densely connected clusters, and then only when they are strongly connected
to one another. Inducing a graph via G_{[.5,1],[.33,1]} could achieve such a result. In this
case, it would only show clusters that have at least half of their maximal number of
edges populated and are at least half as large as the largest cluster, and which share at
least a third of the maximal number of edges that could lie between them.
This type of definition also paves the way for examining a Morse-theoretic type
application for these simplified graphs. Fixing the upper or lower endpoints of the parameters,
we may vary the non-fixed endpoints over all remaining intervals in [0, 1] and produce
a poset of simplified graphs. These sequences of induced graphs may prove interesting
in determining the nature of the global structure of a given graph, such as an exploration
via persistent homology. We look forward to performing such an investigation in the
future.
References

[1] Norman Biggs, Algebraic Graph Theory, Cambridge University Press, London, 1974,
pp. 9–13.

[2] P. J. Giblin, Graphs, Surfaces and Homology, Chapman and Hall, New York, 1981,
pp. 1–48.

[3] B. Mohar, Some Applications of Laplace Eigenvalues of Graphs, 1997, pp. 1–18.

[4] S. E. Schaeffer, Graph Clustering, Computer Science Review I, 2007, pp. 27–64.

[5] S. E. Schaeffer, Stochastic Local Clustering for Massive Graphs, Advances in Knowledge
Discovery and Data Mining, Lecture Notes in Computer Science, Springer, 2005,
pp. 413–424.
rachel.levanger@unf.edu