
GRAPHS

Arrays, stacks, queues and linked lists are examples of linear data structures, where each element has at most one next element. Nonlinear data structures are those where each element may have several next elements. Branching data structures such as graphs and trees are examples of nonlinear data structures.

Definitions

Graph: A graph is a set of points and a set of lines, with each line joining one point to another. The points are called the nodes of the graph, and the lines are called the edges. The set of nodes of a given graph G is denoted by VG and the set of edges by EG. Ex: Fig-1

Here VG = {a,b,c,d} and EG = {1,2,3,4,5,6,7,8}. The number of elements in VG is called the order of graph G. A null graph is a graph of order zero. An edge is determined by the nodes it connects. Edge 4, for example, connects nodes c and d and is said to be of the form (c,d). A graph is completely determined by its set of nodes and its set of edges. The actual positioning of these elements on the page is unimportant. Fig-2

SLICA

There may be multiple edges connecting two nodes, e.g., edges 5, 6 and 7 are all of the form (b,d). Some pairs of nodes may not be connected, e.g., there is no edge of the form (a,c) or (a,d). Any two nodes which are connected by an edge in the graph are called adjacent nodes. Ex: (a,b). A node which is not adjacent to any other node is called an isolated node. A null graph contains only isolated nodes. Some edges may connect a node with itself; such an edge is called a loop or sling. Ex: (a,a)

Simple Graph

A graph G is called a simple graph if both of the following conditions are true:
1. It has no loops, that is, there does not exist an edge in EG of the form (v,v) where v is in VG.
2. No more than one edge joins any pair of nodes, that is, there does not exist more than one edge in EG of the form (v1,v2) for any pair of elements v1 and v2 in VG.

The figure shows the simple graph derived from the previous example. Fig-3

A graph that is not a simple graph is sometimes called a multigraph. Edges are sometimes referred to as arcs, and nodes are sometimes termed vertices.
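The two conditions for a simple graph can be checked mechanically. The following is a minimal sketch, not part of the original notes: the edge list is inferred from the statements made about Fig-1 in the text (one loop at a, three edges between b and d, and so on), and the function names `is_simple` and `to_simple` are hypothetical.

```python
from collections import Counter

# Undirected edge list inferred from the text about Fig-1 (an assumption;
# the figure itself is not available, and edge numbers are omitted).
EDGES = [("a", "a"), ("a", "b"), ("b", "c"), ("b", "c"),
         ("c", "d"), ("b", "d"), ("b", "d"), ("b", "d")]


def is_simple(edges):
    """Return True if the undirected edge list has no loops and no repeated edges."""
    # Condition 1: no edge of the form (v, v).
    if any(u == v for u, v in edges):
        return False
    # Condition 2: at most one edge joins any unordered pair of nodes.
    counts = Counter(frozenset(edge) for edge in edges)
    return all(n == 1 for n in counts.values())


def to_simple(edges):
    """Drop loops and keep one edge per unordered pair, in first-seen order."""
    seen, result = set(), []
    for u, v in edges:
        key = frozenset((u, v))
        if u != v and key not in seen:
            seen.add(key)
            result.append((u, v))
    return result


simple_edges = to_simple(EDGES)
print(is_simple(EDGES), simple_edges, is_simple(simple_edges))
```

Under these assumptions the derived edge list contains one edge each for (a,b), (b,c), (c,d) and (b,d), which matches the edges used for the simple graph in the later path examples.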

Connected graph: A graph in which every pair of nodes is joined by a path, so that it cannot be partitioned into two separate graphs without removing edges. Unconnected graph: A graph that is not connected, for example one that has one or more isolated nodes.

Paths

A path in a graph is a sequence of one or more edges that connects two nodes. We denote by P(vi,vj) a path connecting nodes vi and vj. For P(vi,vj) to exist, there must be in EG a sequence of edges of the following form:

P(vi,vj) = (vi,x1)(x1,x2)...(xn-1,xn)(xn,vj)

The length of a path is the number of edges that it comprises. In the simple graph of Fig-3, the following are paths between nodes b and d:

1. P(b,d) = (b,c)(c,d)              length = 2
2. P(b,d) = (b,c)(c,b)(b,c)(c,d)    length = 4
3. P(b,d) = (b,d)                   length = 1
4. P(b,d) = (b,c)(c,b)(b,d)         length = 3

A path in which all the edges are distinct is known as a simple path (edge simple). A path in which all the nodes through which it traverses are distinct is called an elementary path (node simple). We are interested only in paths in which a given node is visited no more than once. The second path visits both nodes b and c twice and the fourth path visits node b twice, so only the first and third paths can be considered.
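The distinction between simple and elementary paths can be illustrated on the four example paths. This is a sketch under one stated assumption: the graph is simple and undirected, so an edge is identified by its unordered pair of end nodes and (b,c) and (c,b) are the same edge. The function names are hypothetical.

```python
# The four example paths between b and d, each as a list of (from, to) steps.
PATHS = [
    [("b", "c"), ("c", "d")],
    [("b", "c"), ("c", "b"), ("b", "c"), ("c", "d")],
    [("b", "d")],
    [("b", "c"), ("c", "b"), ("b", "d")],
]


def is_chain(path):
    """Each step must start at the node where the previous step ended."""
    return all(path[i][1] == path[i + 1][0] for i in range(len(path) - 1))


def is_edge_simple(path):
    """Simple path: no edge is used more than once (edges are unordered pairs)."""
    edges = [frozenset(step) for step in path]
    return len(set(edges)) == len(edges)


def is_elementary(path):
    """Elementary path: no node is visited more than once."""
    nodes = [path[0][0]] + [to for _, to in path]
    return len(set(nodes)) == len(nodes)


for number, path in enumerate(PATHS, start=1):
    print(number, len(path), is_chain(path), is_edge_simple(path), is_elementary(path))
```

Only the first and third paths pass the elementary test, which agrees with the conclusion of the paragraph above.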

Cycles

A cycle is a path in which both of the following conditions are true:
1. No edge appears more than once in the sequence of edges.
2. The initial node of the path is the same as the terminal node of the path; i.e., P(v,v).

A cycle returns to where it started. The graph of Fig-2 has several cycles, such as:

P(a,a) = (a,a)
P(b,b) = (b,c)(c,b)
P(b,b) = (b,c)(c,d)(d,b)
P(d,d) = (d,b)(b,c)(c,d)
P(d,d) = (d,b)(b,d)

A graph with no cycles is said to be acyclic. The following figures represent acyclic graphs. Fig-4

Directed Graphs

A directed graph is a graph in which directionality is assigned to the graph's edges. Directed graphs are also known as digraphs. Fig-5


Each edge of a directed graph includes an arrow. When all edges are undirected, the graph is called an undirected graph. A graph that combines directed and undirected edges is known as a mixed graph. In some directed as well as undirected graphs, certain pairs of nodes are joined by more than one edge; such edges are called parallel edges. In a directed graph, parallel edges must have the same source and the same destination; if they have different destinations they are not parallel. In the directed graph of Fig-5, edges 5 and 6 are parallel.

A node v of a simple digraph is said to be reachable from the node u of the same digraph if there exists a path from u to v.

The in-degree of a node in a directed graph is the number of edges that terminate at that node; the out-degree of a node is the number of edges that originate from that node. The degree of a node is the sum of its in-degree and out-degree. Ex:

Node   In-degree   Out-degree   Degree
a      1           2            3
b      4           2            6
c      1           2            3
d      2           2            4
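The degree definitions can be sketched as a short computation over a directed edge list. The edge list below is hypothetical: the figure is not available here, so the edges were chosen only to be consistent with the out-degrees and total degrees listed above (eight edges on nodes a to d, with one loop at a). The function name `degrees` is also an assumption.

```python
from collections import Counter

# Hypothetical directed edges (source, destination); an assumption chosen to
# agree with the listed out-degrees and total degrees, not read from Fig-5.
EDGES = [("a", "a"), ("a", "b"), ("b", "c"), ("c", "b"),
         ("c", "d"), ("d", "b"), ("d", "b"), ("b", "d")]


def degrees(edges):
    """Return {node: (in_degree, out_degree, degree)} for a directed edge list."""
    out_degree = Counter(source for source, _ in edges)
    in_degree = Counter(destination for _, destination in edges)
    nodes = sorted(set(out_degree) | set(in_degree))
    return {n: (in_degree[n], out_degree[n], in_degree[n] + out_degree[n])
            for n in nodes}


for node, (indeg, outdeg, deg) in degrees(EDGES).items():
    print(node, indeg, outdeg, deg)
```

Because every edge has exactly one source and one destination, the in-degrees and the out-degrees must each sum to the number of edges, which is a quick consistency check on any degree table.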



Adjacency matrix representation

Consider a graph G with set of nodes VG and set of edges EG. Assume the graph is of order N, for N >= 1. One approach to representing this graph is to use an adjacency matrix, which is an N-by-N array A, where

$$A(i,j) = \begin{cases} 1 & \text{if edge } (v_i, v_j) \text{ is in } E_G \\ 0 & \text{otherwise} \end{cases}$$

That is, A(i,j) = 1 if and only if there is an edge connecting nodes i and j. The adjacency matrix for the undirected graph is:

An edge of a directed graph has its source in one node and terminates in another node. By convention, edge (vi,vj) denotes direction from node vi to node vj. The adjacency matrix for the directed graph is:


Any element of the adjacency matrix is either 0 or 1. Any matrix whose elements are either 0 or 1 is called a bit matrix or a Boolean matrix. The ith row of the adjacency matrix is determined by the edges which originate at node vi. The number of elements in the ith row whose value is 1 is equal to the out-degree of node vi. Similarly, the number of elements whose value is 1 in a column, say the jth column, is equal to the in-degree of node vj.

Node Directory Representation

The node directory representation includes two parts: a directory and a set of linked lists. There is one entry in the directory for each node of the graph. The directory entry for node i points to a linked list that represents the nodes that are connected to node i. Each record of the linked list has two fields: a node identifier and a link to the next element on the list. The directory represents nodes; the linked lists represent edges. A node directory representation for the undirected graph:


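An undirected graph of order N with E edges requires N entries in the directory and 2E linked list entries, since each edge joining two distinct nodes appears in the lists of both of its end nodes.

A node directory representation of the directed graph: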

A directed graph of order N with E edges requires N entries in the directory and E linked list entries. The linked list headed by the ith directory entry corresponds to the ith row of the adjacency matrix representation. The directory entries are ordered sequentially by node identifier. The linked list entries are shown ordered by node identifier, but they could appear in any order. The out-degree of a node is the number of entries in its own linked list, while its in-degree is the number of times the node appears across all the linked lists of the directory.
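The correspondence between the two representations can be sketched as follows. Everything here is illustrative: the four-node digraph is hypothetical, nodes are numbered from 0, and plain Python lists stand in for the linked lists of the node directory.

```python
# Hypothetical simple digraph on nodes 0..3 (an assumption for illustration).
N = 4
EDGES = [(0, 1), (0, 3), (1, 0), (2, 0), (2, 1), (2, 3), (3, 1)]


def adjacency_matrix(n, edges):
    """A[i][j] = 1 if and only if edge (i, j) is present."""
    matrix = [[0] * n for _ in range(n)]
    for i, j in edges:
        matrix[i][j] = 1
    return matrix


def node_directory(n, edges):
    """One list of destination nodes per source node (lists replace linked lists)."""
    directory = [[] for _ in range(n)]
    for i, j in edges:
        directory[i].append(j)
    for destinations in directory:
        destinations.sort()  # ordering is a convenience, not a requirement
    return directory


A = adjacency_matrix(N, EDGES)
D = node_directory(N, EDGES)

# Out-degree: 1s in row i, or equivalently the length of list i.
out_degree = [len(D[i]) for i in range(N)]
# In-degree: 1s in column j, or how often j appears across all lists.
in_degree = [sum(lst.count(j) for lst in D) for j in range(N)]

print(A, D, out_degree, in_degree)
```

The directory stores one entry per edge, so it tends to use less space than the N-by-N matrix when the graph has few edges, at the cost of a list scan to test whether a particular edge exists.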


Weighted Edges

There are many graph applications where edges carry weights. Ex: In transportation applications, nodes represent cities and edge weights represent the distance, say in kilometers, between two cities. An edge which carries a weight is called a weighted edge, and a graph in which all the edges are weighted is called a weighted graph.

Node directory representation: Representing a graph with weighted edges requires making provision in the data structure for storage of those weights. The figure shows a directed graph with weighted edges. This is an example of an activity graph: each node represents an event and each edge represents a task whose completion helps to trigger the next event, which is the start of other tasks. Each edge's weight is its required time. The figure shows the node directory representation of the graph. Each edge entry contains the identifier of the destination node, the weight of the edge, and a pointer to the next edge with the same source node.
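A weighted node directory can be sketched by storing a (destination, weight) pair in each edge entry. The activity graph below is hypothetical, and a Python dictionary of lists is used in place of the directory and its linked lists.

```python
# Hypothetical activity graph: (source event, destination event, required time).
WEIGHTED_EDGES = [(1, 2, 3), (1, 3, 2), (2, 4, 2), (3, 4, 4), (4, 5, 2)]


def weighted_directory(edges):
    """Map each node to a list of (destination, weight) entries."""
    directory = {}
    for source, destination, weight in edges:
        directory.setdefault(source, []).append((destination, weight))
        directory.setdefault(destination, [])  # nodes with no outgoing edge
    return directory


directory = weighted_directory(WEIGHTED_EDGES)
for node in sorted(directory):
    print(node, directory[node])
```

Each list entry carries the same information as the edge record described above, except that the pointer to the next edge is implicit in the list order.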


Path matrix or reachability matrix

The adjacency matrix for the digraph of the following figure is:

An entry of 1 in the ith row and jth column of A shows the existence of an edge $(v_i, v_j)$, that is, a path of length 1 from $v_i$ to $v_j$. Consider the second power of the adjacency matrix, $A^2$, and denote its element in the ith row and jth column by $a_{ij}^{(2)}$. Then

$$a_{ij}^{(2)} = \sum_{k=1}^{n} a_{ik} a_{kj}$$

For any fixed k, $a_{ik} a_{kj} = 1$ if and only if both $a_{ik}$ and $a_{kj}$ equal 1; that is, $(v_i, v_k)$ and $(v_k, v_j)$ are edges of the graph. For each such k we get a contribution of 1 to the sum. Now $(v_i, v_k)$ and $(v_k, v_j)$ imply that there is a path from $v_i$ to $v_j$ of length 2. Therefore $a_{ij}^{(2)}$ is equal to the number of different paths of exactly length 2 from $v_i$ to $v_j$. The diagonal element $a_{ii}^{(2)}$ shows the number of cycles of length 2 at the node $v_i$, for i = 1, 2, ..., n.

$$A^2 = \begin{pmatrix} 1 & 1 & 0 & 0 \\ 0 & 1 & 0 & 1 \\ 1 & 2 & 0 & 1 \\ 1 & 0 & 0 & 0 \end{pmatrix}, \quad A^3 = \begin{pmatrix} 1 & 1 & 0 & 1 \\ 1 & 1 & 0 & 0 \\ 2 & 2 & 0 & 1 \\ 0 & 1 & 0 & 1 \end{pmatrix}, \quad A^4 = \begin{pmatrix} 1 & 2 & 0 & 1 \\ 1 & 1 & 0 & 1 \\ 2 & 3 & 0 & 2 \\ 1 & 1 & 0 & 0 \end{pmatrix}$$

By a similar argument one can show that the element in the ith row and jth column of $A^3$ gives the number of paths of exactly length 3 from $v_i$ to $v_j$. In general, the following statement can be made:

Let A be the adjacency matrix of a digraph G. The element in the ith row and jth column of $A^n$ (n > 1) is equal to the number of paths of length n from the ith node to the jth node.

Given a simple digraph G = (V,E), let $v_i$ and $v_j$ be any two nodes of G. From the adjacency matrix A we can determine whether there exists an edge from $v_i$ to $v_j$ in G. Also, from the matrix $A^r$, where r is some positive integer, we can establish the number of paths of length r from $v_i$ to $v_j$. If we add the matrices $A, A^2, A^3, \ldots, A^r$ to get $B_r$,

$$B_r = A + A^2 + A^3 + \cdots + A^r$$

then from the matrix $B_r$ we can determine the number of paths of length less than or equal to r from $v_i$ to $v_j$. If this element is nonzero, then $v_j$ is reachable from $v_i$. To determine reachability, only the existence of a path needs to be known, not the number of paths between two nodes.

Let G = (V,E) be a simple digraph which contains n nodes that are assumed to be ordered. An n × n matrix P whose elements are given by

$$P_{ij} = \begin{cases} 1 & \text{if there exists a path from } v_i \text{ to } v_j \\ 0 & \text{otherwise} \end{cases}$$

is called the path matrix (reachability matrix) of the graph G. The path matrix can be obtained from $B_n$ by choosing $P_{ij} = 1$ if the element in the ith row and jth column of $B_n$ is nonzero, and $P_{ij} = 0$ otherwise. It only shows the presence or absence of at least one path or cycle between a pair of nodes.

$$B_4 = \begin{pmatrix} 3 & 5 & 0 & 3 \\ 3 & 3 & 0 & 2 \\ 6 & 8 & 0 & 5 \\ 2 & 3 & 0 & 1 \end{pmatrix}, \qquad P = \begin{pmatrix} 1 & 1 & 0 & 1 \\ 1 & 1 & 0 & 1 \\ 1 & 1 & 0 & 1 \\ 1 & 1 & 0 & 1 \end{pmatrix}$$
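The construction of the path matrix from powers of the adjacency matrix can be sketched in a few lines. The matrix A below is an assumption: the figure is not available in this text, and A was chosen because its first four powers sum to the B4 quoted above. The helper names are hypothetical.

```python
# Assumed adjacency matrix (not taken from the missing figure); its powers
# A + A^2 + A^3 + A^4 reproduce the B4 given in the text.
A = [[0, 1, 0, 1],
     [1, 0, 0, 0],
     [1, 1, 0, 1],
     [0, 1, 0, 0]]


def mat_mul(x, y):
    """Ordinary matrix product of two square matrices of the same size."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_add(x, y):
    """Element-wise sum of two matrices of the same size."""
    return [[a + b for a, b in zip(row_x, row_y)] for row_x, row_y in zip(x, y)]


def path_matrix(a):
    """Return (B_n, P) where B_n = A + A^2 + ... + A^n and P marks nonzero entries."""
    n = len(a)
    power = a                      # current power A^r, starting at r = 1
    b = [row[:] for row in a]      # running sum B_r
    for _ in range(n - 1):
        power = mat_mul(power, a)
        b = mat_add(b, power)
    p = [[1 if value else 0 for value in row] for row in b]
    return b, p


A2 = mat_mul(A, A)
B4, P = path_matrix(A)
print(A2, B4, P)
```

The third column of P is all zeros because no edge enters the third node in this assumed graph, so it cannot be reached from any node.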

Breadth First Search

Breadth First Search (BFS) can be used to find the shortest distance between some starting node and the remaining nodes of the graph. This shortest distance is the minimum number of edges traversed in order to travel from the start node to the specific node being examined. Starting at a node v, this distance is calculated by examining all edges incident to node v, and then moving on to an adjacent node w and repeating the process. The traversal continues until all nodes in the graph have been examined. The following figure shows the traversal using the BFS strategy, assuming node A is the start position and each edge is assigned a value of one. The shortest distance from the start is given by the number associated with each node. All nodes adjacent to the current node are numbered before the search is continued. This ensures every node will be examined at least once.

This graph can be represented by a node table directory structure in which each node has the structure:

REACH | NODENO | DATA | DIST | LISTPTR

REACH specifies whether a node has been reached in the traversal, and its initial value is false. NODENO identifies the node number. DATA contains the information pertaining to this node, and DIST is the variable which will contain the distance from the start node. LISTPTR is a pointer to a list of adjacent edges for the node. The edges are represented by the structure:

DESTIN | EDGEPTR

DESTIN contains the number of the terminal node for this edge, and EDGEPTR points to the next edge in the list. The storage representation of the above graph after the BFS traversal is:

Before the BFS traversal all REACH values are false. The algorithm uses two helping procedures, QINSERT and QDELETE. QINSERT enters a value onto the rear of a queue, in this case a node whose incident edges have not yet been examined. The procedure has two parameters, the queue name and the value to be inserted. QDELETE removes a value from the front of the specified queue, placing it in INDEX. This value will be the next node to be processed.

Procedure BFS(INDEX).

Given the structure as described above and the queue handling procedures QINSERT and QDELETE, this algorithm generates the shortest path length for each node using a breadth first search. INDEX denotes the current node being processed and LINK points to the edge being examined. It is assumed that the REACH field has been set to false when the structure was created. QUEUE denotes the name of the queue.

INDEX     Current node being processed
LINK      Points to edge being examined
QUEUE     Name of queue
REACH     Specifies whether the node has been reached or not
NODENO    Node number
DATA      Info of the node
DIST      Distance from first node
LISTPTR   Pointer to list of adjacent edges of node
DESTIN    Number of terminal node
EDGEPTR   Pointer to the next edge in the list

1) [Initialize the first node's DIST number and place the node in the queue]
      REACH[INDEX] ← true
      DIST[INDEX] ← 0
      Call QINSERT(QUEUE, INDEX)
2) [Repeat until all nodes have been examined]
      Repeat thru step 5 while QUEUE is not empty
3) [Remove current node to be examined from queue]
      Call QDELETE(QUEUE, INDEX)
4) [Find all unlabeled nodes adjacent to current node]
      LINK ← LISTPTR[INDEX]
      Repeat step 5 while LINK ≠ NULL
5) [If this is an unvisited node, label it and add it to the queue]
      If not REACH[DESTIN(LINK)]
      then DIST[DESTIN(LINK)] ← DIST[INDEX] + 1
           REACH[DESTIN(LINK)] ← true
           Call QINSERT(QUEUE, DESTIN(LINK))
      LINK ← EDGEPTR(LINK)      (Move down edge list)
6) [Finished]
      Return

Using the previous graph representation, the algorithm initially places node A into the queue. Step 3 removes the front element from the queue (initially node A). Nodes C and E are placed in the queue during the list traversal loop of step 4. The algorithm then removes C from the queue and begins processing its incident edges. Since the first node in the list (A) has already been labeled, it is ignored. This ensures that the algorithm will not label a node again after it has been labeled. The algorithm terminates when the queue is emptied. If the distance for one specific node is required, an extra condition can be inserted in the loop at step 5, comparing the current node being labeled to the specific node required. If the values are equal the algorithm can be stopped, saving a traversal of the remaining portion of the graph.
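The procedure can be sketched in Python as follows. This is not the notes' storage structure: a dictionary of adjacency lists replaces the node table directory, `collections.deque` plays the role of QUEUE, and membership in the `dist` dictionary stands in for the REACH flag. The six-node graph is hypothetical and only loosely follows the narrative (A is adjacent to C and E, and A is first in C's list).

```python
from collections import deque

# Hypothetical undirected graph as adjacency lists (an assumption; the
# original figure is not available).
GRAPH = {
    "A": ["C", "E"],
    "B": ["C", "F"],
    "C": ["A", "B", "D"],
    "D": ["C", "E"],
    "E": ["A", "D"],
    "F": ["B"],
}


def bfs_distances(adjacency, start):
    """Return the minimum number of edges from start to each reachable node."""
    dist = {start: 0}               # step 1: label the start node
    queue = deque([start])          # QINSERT(QUEUE, INDEX)
    while queue:                    # step 2: until the queue is empty
        index = queue.popleft()     # step 3: QDELETE(QUEUE, INDEX)
        for destin in adjacency[index]:           # step 4: walk the edge list
            if destin not in dist:                # step 5: unvisited node
                dist[destin] = dist[index] + 1
                queue.append(destin)
    return dist


print(bfs_distances(GRAPH, "A"))
```

Nodes that cannot be reached from the start node simply never receive an entry in `dist`, which corresponds to their REACH field staying false.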


Depth First Search

A depth first search (DFS) can be used to perform a traversal of a general graph. As each new node is encountered, it is marked to show that the node has been visited. The DFS strategy is as follows. A node s is picked as a start node and marked. An unmarked node adjacent to s is then selected and marked, becoming the new start node, possibly leaving the original start node with unexplored edges for the present. The search continues in the graph until the current path ends at a node with out-degree zero or at a node whose adjacent nodes are all already marked. Then the search returns to the last node which still has unmarked adjacent nodes and continues marking until all nodes are marked. For the same graph, the DFS strategy results in the traversal indicated by the arrows in the figure below, assuming each edge has been assigned a value of one.

Starting at node A, the search numbers all nodes down to node F, where all adjacent nodes have already been marked. The algorithm returns to node C, which still has an unlabeled adjacent node D. After nodes D and E are labeled, all nodes are numbered and the search is complete. The same data structure as presented for algorithm BFS will be used, with the DIST variable in the node table directory renamed to DFN.

Procedure DFS(INDEX, COUNT).

Given the structure as described before, this recursive procedure calculates the depth first search numbers for a graph. INDEX is the current index into the node table directory and is assumed to be initialized to one outside the procedure. COUNT is used to keep track of the current DFN number and is initially set to zero outside the procedure. Finally, it is assumed the DFN field was initialized to zero when the adjacency structure was created.

INDEX     Current node being processed
LINK      Points to edge being examined
REACH     Specifies whether the node has been reached or not
DATA      Info of the node
LISTPTR   Pointer to list of adjacent edges of node
DESTIN    Number of terminal node
EDGEPTR   Pointer to the next edge in the list
DFN       Depth first search number (same role as DIST in BFS)

1) [Update the depth first search number, set and mark current node]
      COUNT ← COUNT + 1
      DFN[INDEX] ← COUNT
      REACH[INDEX] ← true
2) [Set up loop to examine each neighbor of current node]
      LINK ← LISTPTR(INDEX)
      Repeat step 3 while LINK ≠ NULL
3) [If node has not been marked, label it and make recursive call]
      If not REACH[DESTIN(LINK)]
      then Call DFS(DESTIN(LINK), COUNT)
      LINK ← EDGEPTR(LINK)      (Examine next adjacent node)
4) [Return to point of call]
      Return
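A recursive Python sketch of the same procedure follows. As assumptions, a dictionary of adjacency lists replaces the node table directory, the `dfn` dictionary serves as both the DFN and REACH fields, and COUNT is derived from the number of nodes already numbered so that it is shared across recursive calls. The graph is hypothetical, built to follow the narrative (down from A to F, back to C, then D and E).

```python
# Hypothetical undirected graph as adjacency lists (an assumption; the
# original figure is not available).
GRAPH = {
    "A": ["C", "E"],
    "B": ["C", "F"],
    "C": ["A", "B", "D"],
    "D": ["C", "E"],
    "E": ["A", "D"],
    "F": ["B"],
}


def dfs_numbers(adjacency, start):
    """Return the depth first search number (DFN) of every reachable node."""
    dfn = {}

    def visit(index):
        dfn[index] = len(dfn) + 1          # step 1: COUNT + 1 becomes DFN
        for destin in adjacency[index]:    # step 2: examine each neighbor
            if destin not in dfn:          # step 3: unmarked, so recurse
                visit(destin)

    visit(start)
    return dfn


print(dfs_numbers(GRAPH, "A"))
```

Python limits recursion depth, so for very deep graphs an explicit stack would be the safer design; the recursive form is kept here because it mirrors the procedure above.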


Spanning Trees

A spanning tree of a graph is an undirected tree consisting of only those edges necessary to connect all the nodes in the original graph. A spanning tree has the properties that for any pair of nodes there exists only one path between them, and that the insertion of any edge into a spanning tree forms a unique cycle. The particular spanning tree for a graph depends on the criteria used to generate it. If a depth first search is used, those edges traversed by the algorithm form the edges of the tree, referred to as a depth first spanning tree. If a breadth first search is used, the spanning tree is formed from those edges traversed during the search, producing a breadth first spanning tree. The following figure shows BFS and DFS spanning trees for the previous graph.

Fig: DFS spanning tree and BFS spanning tree
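The two kinds of spanning tree can be sketched by recording the edge through which each node is first reached. The graph is hypothetical (the original figure is not available), and the function names are assumptions.

```python
from collections import deque

# Hypothetical connected undirected graph as adjacency lists (an assumption).
GRAPH = {
    "A": ["C", "E"],
    "B": ["C", "F"],
    "C": ["A", "B", "D"],
    "D": ["C", "E"],
    "E": ["A", "D"],
    "F": ["B"],
}


def bfs_spanning_tree(adjacency, start):
    """Edges by which a breadth first search first reaches each node."""
    seen, tree, queue = {start}, [], deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                tree.append((node, neighbor))
                queue.append(neighbor)
    return tree


def dfs_spanning_tree(adjacency, start):
    """Edges by which a depth first search first reaches each node."""
    seen, tree = {start}, []

    def visit(node):
        for neighbor in adjacency[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                tree.append((node, neighbor))
                visit(neighbor)

    visit(start)
    return tree


print(bfs_spanning_tree(GRAPH, "A"))
print(dfs_spanning_tree(GRAPH, "A"))
```

For a connected graph of order N, both trees contain exactly N - 1 edges, although in general they are different sets of edges.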

Minimal spanning tree

In a weighted graph, the cost of a graph or of a spanning tree is calculated by adding the weights of all its edges. A spanning tree whose sum of weights is minimum is known as a minimal cost spanning tree.
Fig: a weighted graph with edge weights 10, 20, 30, 40 and 50, and spanning trees labeled (I) to (V) with SUM=60, SUM=80, SUM=90, SUM=110 and SUM=120. Of the trees shown, the one with SUM=60 has the minimum cost.
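The definition can be illustrated by exhaustive enumeration on a small graph. This is only a sketch: the topology below is hypothetical (the figure is not available), chosen so that the five edge weights are 10 to 50 and the sums 60, 80, 90, 110 and 120 all occur among its spanning trees. Trying every subset of N - 1 edges is practical only for very small graphs.

```python
from itertools import combinations

# Hypothetical weighted undirected graph: (node, node, weight).
NODES = ["a", "b", "c", "d"]
EDGES = [("a", "c", 10), ("a", "b", 20), ("a", "d", 30),
         ("b", "c", 40), ("c", "d", 50)]


def is_spanning_tree(nodes, edges):
    """True if the edges form a tree on all nodes (N - 1 edges and no cycle)."""
    if len(edges) != len(nodes) - 1:
        return False
    parent = {node: node for node in nodes}

    def find(node):
        while parent[node] != node:
            node = parent[node]
        return node

    for u, v, _ in edges:
        root_u, root_v = find(u), find(v)
        if root_u == root_v:      # u and v already connected: edge closes a cycle
            return False
        parent[root_u] = root_v
    return True


def cost(tree):
    """Sum of the edge weights of a tree."""
    return sum(weight for _, _, weight in tree)


def spanning_trees(nodes, edges):
    """All spanning trees, found by trying every subset of N - 1 edges."""
    return [subset for subset in combinations(edges, len(nodes) - 1)
            if is_spanning_tree(nodes, subset)]


trees = spanning_trees(NODES, EDGES)
minimal = min(trees, key=cost)
print(sorted(cost(t) for t in trees), minimal, cost(minimal))
```

With these assumed edges the minimal cost spanning tree consists of the three lightest edges, which is possible here because they do not form a cycle.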


Application of Graphs - PERT or CPM

Graphs can be used for project scheduling in the project management techniques known as PERT (Program Evaluation and Review Technique) or CPM (Critical Path Method). A directed graph is a natural way of describing, representing, and analyzing complex projects which consist of many interrelated activities, and many management techniques employ a graph as the structure on which analysis is based. A PERT graph is a finite digraph, with no parallel edges or cycles, in which there is exactly one source (a node whose in-degree is 0) and one sink (a node whose out-degree is 0). Each edge in the graph is assigned a weight (time value). The directed edges represent activities, with each directed edge joining the nodes which represent the start time and the finish time of the activity. The weight of each edge is taken to be the time it takes to complete the activity. The graph can have independent activities as well as certain dependencies with respect to time, say activity ai must be completed before activity aj starts. The figure shows all such dependencies:


The project has eight activities. Each node is called an event and represents a point in time. V1 denotes the start of the project and V6 its completion. The number associated with each edge represents the number of days required to do that particular activity. We process the PERT graph by computing the earliest completion time TE of each event under the restriction that, before an activity can begin, every activity upon which it depends must be completed. The value zero is assigned to the source node, and for the other nodes TE is the length of time associated with the longest path from the source:

$$TE(V_1) = 0, \qquad TE(V_j) = \max \{ t(P) \}, \quad j \neq 1$$

where t(P) denotes the sum of the time durations for a path P and where the maximum is taken over all paths from V1 to Vj.

We can next calculate the latest completion time TL. This is the latest time an activity can be completed without causing a delay in the earliest completion date of the project. The TL value of the sink node is equal to its TE value:

$$TL(V_n) = TE(V_n), \qquad TL(V_j) = TE(V_n) - \max \{ t(P) \}, \quad j \neq n$$

where the maximum is taken over all paths from Vj to Vn.

After computing TE and TL for each event node we can determine the critical path. A critical path is a path from the source node to the sink node such that if any activity on the path is delayed by an amount t, then the entire project is delayed by t. Each node on the critical path has its TL value equal to its TE value. In our example the nodes on the critical path are V1, V3, V4 and V6, and the critical path is (V1, V3, V4, V6). The nodes which are not on the critical path have slack time. The slack time of a node is the difference between its TL and TE values, and it indicates the amount of spare time available to a particular activity, by which the activity can be delayed without affecting the overall time duration. For example, ST(V2) = 4 - 3 = 1, which means V2 can be delayed by one day.

Node          V1   V2   V3   V4   V5   V6
TE             0    3    2    6    6    8
TL             0    4    2    6    7    8
Slack time     0    1    0    0    1    0
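The TE, TL and slack computations can be sketched with one forward and one backward pass over the activities. The activity durations below are hypothetical: the figure is not available, so eight durations were chosen that reproduce the TE and TL values in the table. The sketch also assumes that events are numbered so that every activity goes from a lower to a higher number.

```python
# Hypothetical activities (start event, finish event, days); an assumption
# chosen to reproduce the TE/TL table, not read from the original figure.
ACTIVITIES = [(1, 2, 3), (1, 3, 2), (2, 4, 2), (2, 5, 3),
              (3, 4, 4), (3, 5, 4), (4, 6, 2), (5, 6, 1)]


def pert_times(n, activities):
    """Return (TE, TL, slack) dictionaries for events 1..n.

    Assumes every activity runs from a lower-numbered to a higher-numbered
    event, so sorting by start event gives a valid processing order.
    """
    te = {v: 0 for v in range(1, n + 1)}
    for start, finish, days in sorted(activities):
        # Longest path from the source to 'finish' seen so far.
        te[finish] = max(te[finish], te[start] + days)

    tl = {v: te[n] for v in range(1, n + 1)}
    for start, finish, days in sorted(activities, reverse=True):
        # Latest time 'start' can occur without delaying 'finish'.
        tl[start] = min(tl[start], tl[finish] - days)

    slack = {v: tl[v] - te[v] for v in te}
    return te, tl, slack


te, tl, slack = pert_times(6, ACTIVITIES)
critical_nodes = [v for v in sorted(slack) if slack[v] == 0]
print(te, tl, slack, critical_nodes)
```

With these assumed durations the events with zero slack are V1, V3, V4 and V6, in agreement with the critical path stated above.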