
Finding the Maximal Cliques in a graph

SHASHIDHAR G - 4JC07CS104 Dept. of Computer Science and Engineering SJCE, Mysore - 570006 gshashidhar125@gmail.com
Abstract

The problem of finding the maximal cliques in a graph arises in many biological and other applications, such as gene expression analysis, network analysis, information systems and image processing. The problem consists of finding the groups of mutually connected objects in a given system.


A clique of a graph is defined as an induced complete subgraph. A clique that is not contained in any larger complete subgraph is called a maximal clique; in this report the maximal cliques of interest are those having the maximum number of vertices. In a graph G(V, E), where V and E are the vertex and edge sets respectively, a subgraph G'(V', E') is called an induced subgraph if G' contains all the edges xy ∈ E where x, y ∈ V'. That is, all the edges that run between the vertices of V' in graph G must be present in the subgraph G'. In a complete graph every pair of vertices is connected by an edge. So a clique is a subgraph in which every pair of vertices is connected by an edge, and which leaves out no edge that the original graph has between those vertices.
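The definition above can be checked directly on an adjacency matrix. The following is a minimal sketch in C, not the code of this project: the function name is_clique and the bound MAX_N are assumptions made for illustration. It tests whether a given set of vertices is pairwise adjacent, which is exactly the condition for the induced subgraph on those vertices to be complete.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_N 8 /* hypothetical upper bound on the number of vertices */

/* Returns true if the k vertices listed in set[] are pairwise adjacent
 * in the n-vertex graph described by the adjacency matrix adj, i.e. if
 * the subgraph induced by set[] is complete. */
bool is_clique(int n, int adj[MAX_N][MAX_N], const int set[], int k)
{
    assert(n >= 0 && n <= MAX_N && k >= 0 && k <= n);
    for (int i = 0; i < k; i++) {
        for (int j = i + 1; j < k; j++) {
            if (!adj[set[i]][set[j]])
                return false; /* a missing edge: not a complete subgraph */
        }
    }
    return true;
}
```

For a graph with edges 0-1, 0-2, 1-2 and 2-3, the set {0, 1, 2} passes this check, while {0, 1, 3} does not because the edges 0-3 and 1-3 are absent.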

Figure 1: Graph G(V, E)

In the graph G shown above, the vertex set V' = {1, 4, 2} (marked in red) forms an induced subgraph G'(V', E'), as every edge connecting these vertices in graph G is also present in the subgraph G'. The subgraph G' is also the complete graph K3. Hence G' forms a clique, and it is also a maximal clique. Finding cliques is hard in general: deciding whether a graph contains a clique of a given size is an NP-complete problem.

1. generate random graph : This module takes the number of vertices N as input and produces an adjacency matrix Graph (N × N) as a representation of the graph. The edges are chosen randomly. The complexity of this algorithm is of order n.
2. clique : This module takes the adjacency matrix Graph (N × N) as input and applies the backtracking principle to determine those vertex sets that form a clique. It produces an adjacency matrix Cliques (N × N) representing all the cliques in the graph. The complexity of this algorithm is of order 2^n.
3. maximal clique : This module takes the adjacency matrix Cliques (N × N) produced by the module clique, finds the maximal clique(s) [the ones containing the maximum number of vertices] and stores them in a Max cliques (N × N) matrix. The complexity of this algorithm is of order n^2.
4. display : This module prints the obtained maximal cliques into a .dot file that can be used with the DOTTY application.
5. main : This is the controlling and coordinating center for the whole system.
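The first and fourth modules can be illustrated with a short sketch in C. This is only an illustration under stated assumptions, not the project's source: the names generate_random_graph and write_dot, the bound MAX_N, the edge probability of about one half and the red coloring are all hypothetical choices. Note that this sketch visits every pair of vertices when filling the matrix.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

#define MAX_N 16 /* hypothetical upper bound on the number of vertices */

/* Fill adj with a random simple undirected graph on n vertices.
 * Each pair of distinct vertices gets an edge with probability of
 * about 1/2 (an assumed choice); the matrix is kept symmetric. */
void generate_random_graph(int n, int adj[MAX_N][MAX_N], unsigned seed)
{
    assert(n >= 0 && n <= MAX_N);
    srand(seed);
    for (int i = 0; i < n; i++) {
        adj[i][i] = 0; /* no self-loops */
        for (int j = i + 1; j < n; j++)
            adj[i][j] = adj[j][i] = rand() % 2;
    }
}

/* Write the graph to out in DOT format. Vertices v with
 * in_clique[v] != 0 are colored red so that a clique stands out. */
void write_dot(FILE *out, int n, int adj[MAX_N][MAX_N], const int in_clique[])
{
    fprintf(out, "graph G {\n");
    for (int v = 0; v < n; v++) {
        if (in_clique[v])
            fprintf(out, "  %d [color=red];\n", v);
        else
            fprintf(out, "  %d;\n", v);
    }
    for (int i = 0; i < n; i++)
        for (int j = i + 1; j < n; j++)
            if (adj[i][j])
                fprintf(out, "  %d -- %d;\n", i, j);
    fprintf(out, "}\n");
}
```

A caller would typically open a .dot file with fopen, pass it to write_dot, and then open that file in a Graphviz viewer.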

Figure 2: Test case 1

In this test case there is only one maximal clique, 0 - 1 - 4, of size 3. The system starts from a NULL string and keeps adding vertices to the current clique as long as it finds that they form a complete subgraph.

Figure 3: Simulation of Test case 1

This figure shows an abstract view of how the system works. All the red and blue nodes are the cliques obtained from the test case, and the one node colored blue is the maximal clique, as it has the largest number of vertices. The test case shows the backtracking at the various possible levels. The same is done for the other test cases. The algorithm saves work by generating only those combinations of vertices that form a complete graph, instead of generating the whole power set of the vertex set of the given graph. Hence the backtracking technique is more efficient than a system that generates the power set.
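The backtracking idea described above can be sketched in C as follows. This is a minimal illustration under assumptions, not the project's clique module: the names clique_stats, extend and count_cliques and the bound MAX_N are hypothetical. The search starts from the empty set and adds a vertex only if it is adjacent to every vertex already chosen, so a vertex set that is not a clique is never extended further.

```c
#include <assert.h>

#define MAX_N 16 /* hypothetical upper bound on the number of vertices */

typedef struct {
    long cliques; /* number of non-empty cliques visited */
    int largest;  /* size of the largest clique seen */
} clique_stats;

/* Try to extend the clique cur[0..size-1] with every vertex v >= start.
 * A vertex is added only if it is adjacent to all vertices in cur. */
static void extend(int n, int adj[MAX_N][MAX_N], int cur[], int size,
                   int start, clique_stats *st)
{
    for (int v = start; v < n; v++) {
        int ok = 1;
        for (int i = 0; i < size && ok; i++)
            if (!adj[cur[i]][v])
                ok = 0;
        if (!ok)
            continue; /* backtrack: cur plus v is not complete */
        cur[size] = v;
        st->cliques++;
        if (size + 1 > st->largest)
            st->largest = size + 1;
        extend(n, adj, cur, size + 1, v + 1, st);
    }
}

/* Enumerate every clique of the graph exactly once (vertices are
 * added in increasing order) and return the collected statistics. */
clique_stats count_cliques(int n, int adj[MAX_N][MAX_N])
{
    assert(n >= 0 && n <= MAX_N);
    int cur[MAX_N];
    clique_stats st = {0, 0};
    extend(n, adj, cur, 0, 0, &st);
    return st;
}
```

For a four-vertex graph with edges 0-1, 0-2, 1-2 and 2-3, this search visits 9 cliques (four single vertices, four edges and one triangle), whereas the power set of the vertex set has 16 members. In the worst case of a complete graph every non-empty subset is a clique, so the saving depends on how sparse the graph is.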


Table 1: Without using the optimization flag

Number of vertices    Time
500                   0.107705
1000                  0.127530
1500                  0.185504
2000                  0.275556
2500                  0.400049
3000                  0.541453
3500                  0.698194
4000                  0.897348
4500                  2.965529
5000                  13.205885

Figure 4: Graph for various inputs

Coverage using gcov

This is the output of the commands

$ gcc -fprofile-arcs -ftest-coverage -pg main.c input.c clique.c maximal_cliques.c verify.c
$ ./a.out 10
$ gcov main.c input.c clique.c maximal_cliques.c verify.c

File maximal_cliques.c
Lines executed:100.00% of 14
maximal_cliques.c:creating maximal_cliques.c.gcov

File input.c
Lines executed:100.00% of 8
input.c:creating input.c.gcov

File main.c
Lines executed:81.93% of 83
main.c:creating main.c.gcov

File clique.c
Lines executed:100.00% of 25
clique.c:creating clique.c.gcov

File verify.c
Lines executed:83.33% of 12
verify.c:creating verify.c.gcov

Table 2: With the optimization flag (-O2)

Number of vertices    Time
500                   0.119210
1000                  0.125603
1500                  0.187739
2000                  0.271474
2500                  0.373859
3000                  0.513728
3500                  0.664836
4000                  0.852942
4500                  3.814043
5000                  12.197197

Figure 5: Graph for various inputs with the optimization flag enabled

Profiling using gprof

This is the output of the commands

$ gcc -fprofile-arcs -ftest-coverage -pg main.c input.c maximal_cliques.c cliques.c verify.c
$ ./a.out
$ gprof a.out

Table 3: Flat profile

Each sample counts as 0.01 seconds.

% time  cumulative  self      calls  self     total    name
        seconds     seconds          ms/call  ms/call
65.57   2.19        2.19      10     219.00   219.00   find_cliques
23.05   2.96        0.77      10                       main
10.78   3.32        0.36      10     36.00    36.00    display_in_dotty
0.60    3.34        0.02      10     2.00     2.00     Rand_graph_generator
0.00    3.34        0.00      10     0.00     0.00     display_maximal_cliques_in_dotty
0.00    3.34        0.00      10     0.00     0.00     find_maximal_cliques
0.00    3.34        0.00      4      0.00     0.00     verify