Path Length

1) Introduction
In calculating the average path length, we must find the shortest path from a source node
to all other nodes contained within the graph. Previously, we found that by using an
inefficient algorithm, experimentally calculating the path length of a graph can be time
consuming. Since then, I have looked further into the necessary algorithms for solving
this problem.
2) Verification of the Algorithm and Code

Algorithm
The algorithm I used to solve the single-source shortest path problem within the
polymeric gel was a breadth first search. A breadth first search begins at a starting node
and explores all of the neighboring nodes. Then, for each of those nearest nodes, it
explores their unexplored neighbors, and so on, until every reachable node has been visited.
For every node s within the graph do {
    Initialize the distances to all other vertices as -1 (not computed)
    Initialize the queue to empty
    Store s (start node) in the queue
    Set the distance to s to be 0 in the Distance Table
    While there are vertices in the queue {
        Remove a vertex v from the front of the queue
        For all adjacent vertices w {
            If distance to w is -1 (not computed) do {
                Make distance to w equal to (distance to v) + 1
                Add w to the back of the queue
            }
        }
    }
}
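The pseudocode above can be sketched in Python for a single start node. This is a minimal illustration, not the FORTRAN code used in this work; the function name `bfs_distances`, the dictionary-of-lists adjacency format, and the 5-node example network are assumptions made for the sketch.

```python
from collections import deque

def bfs_distances(adjacency, start):
    """Return the hop count from start to every node; -1 marks 'not computed'."""
    distance = {node: -1 for node in adjacency}
    distance[start] = 0
    queue = deque([start])
    while queue:
        v = queue.popleft()              # remove v from the front of the queue
        for w in adjacency[v]:
            if distance[w] == -1:        # w has not been reached yet
                distance[w] = distance[v] + 1
                queue.append(w)          # add w to the back of the queue
    return distance

# Hypothetical network: a chain 0-1-2-3 plus one isolated node 4.
example = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2], 4: []}
print(bfs_distances(example, 0))
```

Node 4 keeps the value -1 because it cannot be reached from node 0, which is how a disconnection shows up in the distance table.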
The algorithm can be designed to generate a distribution of path lengths between nodes.
This is done by tracking the following process within a given timestep: starting from a
given node, the algorithm counts how many nodes are immediate neighbors of this
starting node. The path length between these immediate neighbors and the starting
node is one. The same is then done for the second nearest neighbors, whose path length
to the starting node is two. This process continues until all reachable neighbors are
visited.
With the polymeric gel, it was necessary to repeat this algorithm N times (N being the
number of aggregates within a timestep), starting from each node within a given timestep.
It is well known that within our gel network, not all aggregates are necessarily connected.
Rattlers and a disconnected graph containing a giant component are two examples of this
situation. By repeating the algorithm from each aggregate and continually updating shorter
path lengths for a given timestep, the shortest paths between all of the aggregates are
covered and the discontinuous nature of the polymeric gel network is accounted for.
From here a distribution is created by counting the number of occurrences of each specific
path length within each timestep.
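The counting step can be sketched as follows for one timestep. This is an illustrative Python sketch rather than the FORTRAN code; the name `path_length_distribution` and the example network are assumptions, and the sketch counts ordered (start, target) pairs, so each undirected path appears twice. Unreachable pairs are simply left out here.

```python
from collections import Counter, deque

def path_length_distribution(adjacency):
    """Count ordered (start, target) pairs by their shortest path length."""
    counts = Counter()
    for start in adjacency:                  # repeat the search from every node
        distance = {start: 0}
        queue = deque([start])
        while queue:
            v = queue.popleft()
            for w in adjacency[v]:
                if w not in distance:
                    distance[w] = distance[v] + 1
                    queue.append(w)
        for node, d in distance.items():
            if node != start:                # skip the start node itself
                counts[d] += 1
    return dict(counts)

# Hypothetical network: a chain 0-1-2-3 plus one isolated node 4.
example = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2], 4: []}
print(path_length_distribution(example))     # {1: 6, 2: 4, 3: 2}
```

The isolated node contributes nothing to the counts, so the distribution only describes pairs that are actually connected.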
Erdős-Rényi Random Graph
In an effort to validate the accuracy of the FORTRAN code, I compared experimental
results for an Erdős-Rényi random graph to the values calculated from well-known formulas.
Starting with N disconnected nodes, Erdős-Rényi random graphs are generated by
connecting pairs of randomly selected nodes, prohibiting multiple connections, until
the number of edges equals K (S. Boccaletti et al./ Physics Reports 424 (2006) 175-308).
Connections between randomly chosen nodes were made with the exception of loops:
any connection resulting in a loop was not allowed.
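The construction just described can be sketched in Python. This is not the code used in this work; the function name `erdos_renyi_edges` and the `seed` parameter are assumptions, and "loop" is read here as a connection from a node to itself. The node and link counts in the usage lines are the T = 2.00 values as I read them from Table 1.

```python
import random

def erdos_renyi_edges(n_nodes, n_edges, seed=None):
    """Draw n_edges distinct links between n_nodes nodes, with no self-connections."""
    if n_edges > n_nodes * (n_nodes - 1) // 2:
        raise ValueError("more edges requested than distinct node pairs exist")
    rng = random.Random(seed)
    edges = set()
    while len(edges) < n_edges:
        i = rng.randrange(n_nodes)
        j = rng.randrange(n_nodes)
        if i == j:
            continue                         # reject a self-connection
        edges.add((min(i, j), max(i, j)))    # the set ignores repeated pairs
    return edges

edges = erdos_renyi_edges(1413, 971, seed=0)
connected = {node for edge in edges for node in edge}
print(len(edges), len(connected))
```

The second printed number is the count of nodes that received at least one link, which is in general smaller than the number of nodes the graph started with.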
Using this definition of an E.R. random graph, I created networks that contain the same
number of nodes and links as our gel at each given temperature. I looked at two
experimental values and three calculated values for each temperature. The experimental
values were gathered using the FORTRAN code of the breadth first search algorithm.
The first was the average path length as calculated from the probability distribution of
path lengths. I calculated the probability distribution by dividing each entry of the path
length distribution by the sum of all of its entries (this includes any disconnections),
$$P_i(k) = \frac{L_i}{\sum_j L_j}$$
From here the average path length is defined as
$$l = \sum_k P(k)\, k. \qquad [1]$$
In the following tables of data, this value is labeled Experimental 1.
The second method of experimentally calculating the average path length is defined as
follows
$$l = \frac{1}{N(N-1)} \sum_{i,j} D_{i,j} \qquad [2]$$
In an unweighted graph, $D_{i,j}$ is the shortest distance between node i and node j. This
definition assumes that $D_{i,j} = 0$ if node i cannot be reached from node j or if i = j. N in
this definition is the total number of nodes that have connections. In the following tables
of data, this value is labeled Experimental 2.
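The two experimental averages can be compared on a small example. This Python sketch is illustrative only: the function names are hypothetical, the distance matrix is a hand-built 4-node chain, and it assumes that the zero bin of the distribution in Eq.[1] collects every zero entry of the matrix, including i = j. The text above says only that disconnections are included, so that last point is my assumption.

```python
from collections import Counter

def experimental_1(D):
    """Eq.[1]: sum of k * P(k), where P(k) is the fraction of entries of D equal to k."""
    counts = Counter(d for row in D for d in row)
    total = sum(counts.values())             # includes the k = 0 bin (assumption)
    return sum(k * c for k, c in counts.items()) / total

def experimental_2(D):
    """Eq.[2]: sum of all D[i][j] divided by N(N-1)."""
    n = len(D)
    return sum(sum(row) for row in D) / (n * (n - 1))

# Shortest distances for a hypothetical chain 0-1-2-3.
D = [[0, 1, 2, 3],
     [1, 0, 1, 2],
     [2, 1, 0, 1],
     [3, 2, 1, 0]]
print(experimental_1(D), experimental_2(D))
```

Under this assumption the first value is the distance sum divided by N^2 and the second is the same sum divided by N(N-1), so the two differ by the factor N/(N-1), which approaches one for large N.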
I compared these two methods of experimentally gathering average path lengths to
values calculated for a random E.R. graph using formulas found in M. Newman et al. /
Phys. Rev. E 64 026118 (2001). In each case $z_m$ is the average number of neighbors at
distance m. The first is as follows
$$l = \frac{\ln\left[(N-1)(z_2 - z_1) + z_1^2\right] - \ln z_1^2}{\ln(z_2/z_1)} \qquad [3]$$
and will be labeled Calculated 1 in the following table of data (Table 1). In the special
circumstance where the following two conditions hold,
$$N \gg z_1, \qquad z_2 \gg z_1,$$
Eq.[3] reduces to
$$l = \frac{\ln(N/z_1)}{\ln(z_2/z_1)} + 1. \qquad [4]$$
This value will be labeled Calculated 2. In the special case of an E.R. random graph,
for which $z_1 = \langle k \rangle$ and $z_2 = \langle k \rangle^2$, Eq.[4] reduces to the following
$$l = \frac{\ln N}{\ln \langle k \rangle} \qquad [5]$$
(S. Boccaletti et al./ Physics Reports 424 (2006) 175-308). In the following tables of
data, this value will be labeled Calculated 3.
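The three formulas are straightforward to evaluate side by side. The Python sketch below is only an illustration with hypothetical function names and made-up inputs (N = 1000, z1 = 10, z2 = 100), chosen so that z2 equals z1 squared as in the E.R. special case.

```python
import math

def calculated_1(n, z1, z2):
    """Eq.[3]."""
    numerator = math.log((n - 1) * (z2 - z1) + z1 ** 2) - math.log(z1 ** 2)
    return numerator / math.log(z2 / z1)

def calculated_2(n, z1, z2):
    """Eq.[4], the reduction of Eq.[3] when N >> z1 and z2 >> z1."""
    return math.log(n / z1) / math.log(z2 / z1) + 1

def calculated_3(n, mean_degree):
    """Eq.[5], the E.R. special case z1 = <k>, z2 = <k>^2."""
    return math.log(n) / math.log(mean_degree)

# Made-up inputs, not data from this work.
n, z1, z2 = 1000, 10.0, 100.0
print(calculated_1(n, z1, z2), calculated_2(n, z1, z2), calculated_3(n, z1))
```

With z2 equal to z1 squared, Eq.[4] and Eq.[5] give the same value (3 for these inputs), while Eq.[3] comes out slightly lower (about 2.95), which shows the size of the terms dropped in the reduction.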
There are a couple of considerations to take into account. In the creation of the E.R.
random networks we started with the same number of nodes and links as the gel at each
given temperature. Importantly, when we randomly choose to make connections between
N nodes, not all of the N nodes will be selected. There will be some nodes that do not
have connections to others. This results in a network with K links, but the total number of
connected nodes is less than the number of desired nodes. This fact is important when
comparing calculated values to experimental results.
I am assuming that based upon the definition of an E.R. random graph, according to S.
Boccaletti et al., the value for N must include those nodes that do not have connections.
A second consideration is that the created E.R. random graph might be disconnected.
The formulas used to calculate average path lengths assume that all nodes
are reachable from any randomly chosen starting node. As stated in M. Newman et al. /
Phys. Rev. E 64 026118 (2001), in general this will not be true and Eq.[4] is
meaningless. A better approximation to l may therefore be given by replacing N in
Eq.[4] by NS, where S is the fraction of the graph occupied by the giant component.
Therefore, I made this approximation. I averaged the per-timestep size of the largest
component of the random graph and included this factor in each of the three calculated values.
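The giant component fraction S and the substitution of NS for N can be sketched as below. This is an illustrative Python sketch with hypothetical function names, not the FORTRAN code. It takes S as the largest component divided by every node in the adjacency table; whether isolated nodes belong in that denominator is a choice the text leaves open. The last lines use N = 427, S = 0.99689 and <k> = 4, which are the T = 0.60 values as I read them from Table 1.

```python
import math
from collections import deque

def giant_component_fraction(adjacency):
    """Size of the largest connected component divided by the number of nodes."""
    seen = set()
    largest = 0
    for start in adjacency:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        size = 0
        while queue:                         # breadth first search of one component
            v = queue.popleft()
            size += 1
            for w in adjacency[v]:
                if w not in seen:
                    seen.add(w)
                    queue.append(w)
        largest = max(largest, size)
    return largest / len(adjacency)

def calculated_3_with_giant(n, s, mean_degree):
    """Eq.[5] with N replaced by N*S."""
    return math.log(n * s) / math.log(mean_degree)

# Hypothetical network: a chain 0-1-2-3 plus one isolated node 4.
example = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2], 4: []}
print(giant_component_fraction(example))     # 0.8

# Values read from the T = 0.60 column of Table 1 (my reading of the table).
print(calculated_3_with_giant(427, 0.99689, 4.0))
```

With those three table values the second result is about 4.3668, which agrees with the Calculated 3 entry listed for the random graph at T = 0.60.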
Figures 1 and 2 contain plots of average path length versus temperature for the gel and
random graphs.
[Figure 1 plot, titled "Path Length": Average Path Length versus Temperature. Series: Gel - Exp 1,
Random - Exp 1, Random - Calc 1, Random - Calc 2, Random - Calc 3.]
Figure 1 contains a graph of the average path length data versus temperature for the polymeric gel and E.R.
random graphs.
[Figure 2 plot, titled "Path Length": Average Path Length versus Temperature. Series: Gel - Exp 1,
Random - Exp 1, Random - Calc 1, Random - Calc 2, Random - Calc 3.]
Figure 2 contains a graph of the average path length data versus temperature for the polymeric gel and E.R.
random graphs. This is the same data as in Figure 1, just a closer view.
Tables 1 and 2 contain the average path length data (experimental and calculated) for the
random graph and the polymeric gel.
Table 1
E.R. RANDOM MATRIX PATH LENGTH
N = 2000

Columns: T = Temperature; Clusters = Cluster Count (approx); Links = links; Nodes = # of Nodes;
Ratio = Ratio - Giant Component to Total Cluster Count; Avg Z1, Avg Z2 = Average Z1, Average Z2.
A dash marks a value missing from the source.

T     Clusters  Links  Nodes     Ratio     Avg Z1    Avg Z2    <k>       <k>^2
2.00  1413      971    1053.1    0.65511   1.84598   2.5348    1.84598   3.40764
1.80  1335      966    1020      0.71167   1.89608   2.75686   1.89608   3.59511
1.50  1250      960    981.8     0.77297   1.95763   3.00387   1.95763   3.83231
1.20  1117      948    910.7     0.84781   2.08411   3.52454   2.08411   4.34352
1.00  977       934    833.4     0.90557   2.24382   4.27094   2.24382   5.03473
0.90  883       921    773       0.93105   2.38551   4.96455   2.38551   5.69066
0.80  762       904    691.2     0.9634    2.61863   6.19878   2.61863   6.85725
0.70  606       877    573.4     0.98884   3.06243   8.76317   3.06243   9.37851
0.60  427       836    418.5     0.99689   4         15.2793   4         16
0.50  261       790    260.5     -         6.07294   33.557    6.07294   36.8806
0.45  206       772    205.8     -         7.51215   47.1817   7.51215   56.4324
0.40  176       747    175.9     -         8.504832  56.30699  8.504832  72.33217
0.35  161       724    161       -         9.00621   59.6199   9.00621   81.1118
0.30  142       734    142       -         10.3521   69.2113   10.3521   107.166

Average Path Length
Exp 1, Exp 2 = Experimental 1, Experimental 2; Calc 1, Calc 2, Calc 3 = Calculated 1, Calculated 2, Calculated 3.

T     Exp 1     Exp 2     Calc 1    Calc 2    Calc 3
2.00  6.63186   6.63839   16.5118   20.607    11.1426
1.80  7.10304   7.11042   14.5079   17.6088   10.7169
1.50  7.71975   7.72812   13.0273   15.4842   10.2323
1.20  7.95787   7.96762   10.9471   12.6461   9.33257
1.00  7.7241    7.73413   9.13104   10.2862   8.39577
0.90  7.28176   7.29168   8.08003   8.9716    7.72001
0.80  6.72078   6.7307    6.90495   7.54056   6.85466
0.70  5.79176   5.80216   5.61081   6.01875   5.71444
0.60  4.49197   4.50281   4.25684   4.48258   4.3668
0.50  3.26783   3.28044   3.08397   3.19999   3.08482
0.45  2.83802   2.85189   2.70885   2.80211   2.64211
0.40  2.62216   2.63716   2.517848  2.602939  2.415398
0.35  2.53295   2.54878   2.44093   2.5256    2.31192
0.30  2.35146   2.36814   2.29601   2.37825   2.12042

Table 2
POLYMERIC GEL PATH LENGTH
N = 2000

Columns: T = Temperature; Clusters = Cluster Count (approx); Links = links; Nodes = # of Nodes;
Ratio = Ratio - Giant Component to Total Cluster Count; Avg Z1, Avg Z2 = Average Z1, Average Z2.

T     Clusters  Links  Nodes     Ratio     Avg Z1    Avg Z2    <k>       <k>^2
2.00  1413      971    1413.39   0.0345    1.36991   1.17664   1.37713   1.89648
1.80  1335      966    1335.94   0.0677    1.43859   1.47748   1.44645   2.09223
1.50  1250      960    1250.81   0.1652    1.52291   1.88046   1.53145   2.34535
1.20  1117      948    1117.64   0.46671   1.67481   2.70186   1.6843    2.83688
1.00  977       934    977.685   0.66513   1.86827   3.89262   1.87857   3.52903
0.90  883       921    883.087   0.74723   2.02433   4.94847   2.03531   4.14248
0.80  762       904    762.985   0.82145   2.26189   6.69323   2.27322   5.16751
0.70  609       877    609.691   0.88617   2.66008   9.75338   2.67195   7.13932
0.60  429       836    429.014   0.93782   3.36427   14.8364   3.37635   11.3998
0.55  338       815    338.657   0.95747   3.91926   17.9091   3.93092   15.4521
0.50  261       790    261.948   0.97467   4.62495   20.9416   4.63552   21.488
0.45  206       772    206.97    0.98735   5.35033   23.409    5.35828   28.7112
0.40  176       747    176.1839  0.995831  5.770649  24.95407  5.775521  33.35664
0.35  161       724    161.984   0.99874   5.96497   24.7829   5.96762   35.6125
0.30  142       734    144.111   0.99259   6.19191   25.5831   6.33488   40.1307

Average Path Length (experimental values)
Exp 1, Exp 2 = Experimental 1, Experimental 2.

T     Exp 1     Exp 2
2.00  0.02007   0.02009
1.80  0.07875   0.07883
1.50  0.47734   0.48037
1.20  2.95585   2.95965
1.00  4.43444   4.44104
0.90  4.77851   4.78668
0.80  4.8884    4.89813
0.70  4.76399   4.77602
0.60  4.45435   4.46983
0.55  4.26854   4.28665
0.50  4.08551   4.10622
0.45  3.91669   3.93957
0.40  3.774252  3.797657
0.35  3.73232   3.75663
0.30  3.47408   3.50048

3) Discussion
For the random graph, as seen in Table 1, the experimentally gathered path lengths (Eqs.[1]
and [2]) are in close agreement. Yet the differences among the calculated values increase
with temperature. I expected the path lengths calculated with Eqs.[3], [4], and [5] to be
more consistent with the experimental results, but Figure 1 shows that this is not the
case. Calculated 3 (Eq.[5]) seems to be the closest to the experimental results for a
random graph at all temperatures. However, the two conditions N >> z1 and z2 >> z1 are
not met at all temperatures. I feel that the second condition is not met for temperatures
greater than 0.6. Because of this, I would expect the value calculated with Eq.[5] to
start deviating from the experimental value at T = 0.6. As seen in Figure 2,
Random - Calc 3 (Eq.[5]) stays consistent with the experimental results up to
T = 0.8.
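The second condition can be checked directly from the tabulated averages. The Python lines below are a small illustration that divides Average Z2 by Average Z1 for the random graph at four temperatures, using the values as I read them from Table 1; how large the ratio must be to count as "much greater" remains a judgement call.

```python
# Average Z1 and Average Z2 for the E.R. random graph, as read from Table 1.
z1 = {0.80: 2.61863, 0.70: 3.06243, 0.60: 4.0, 0.50: 6.07294}
z2 = {0.80: 6.19878, 0.70: 8.76317, 0.60: 15.2793, 0.50: 33.557}

ratios = {t: z2[t] / z1[t] for t in z1}
for t in sorted(ratios, reverse=True):
    print(f"T = {t:.2f}: z2/z1 = {ratios[t]:.2f}")
```

The ratio rises steadily as the temperature drops, from roughly 2.4 at T = 0.80 to roughly 5.5 at T = 0.50, so the condition z2 >> z1 is better satisfied at the lower temperatures.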
Since Eqs.[4] and [5] are obtained by reducing Eq.[3] under special conditions, I assumed
that Eq.[3] should hold for the random graph at all temperatures, without those two
conditions. Yet this calculation gives only the second-best agreement with the
experimental results.
If I were to change the value of N by using only the number of connected nodes, this
would result in a lower calculated value in all three cases. However, the calculated
values would still deviate at higher temperatures.
To check if the FORTRAN code was functioning as desired, I have built two small
networks of 20 nodes. Twice, I manually drew the connections between nodes and
verified that the resulting shortest path lengths and distributions are correct.
In this work, if a starting node does not have a path to another node, that shortest path
length (zero) is not counted. Earlier you explained how to deal with these disconnections
when taking the inverse, but at this point I don't remember the details. Could
you refresh my memory on them?
