1)
Introduction
In calculating the average path length, we must find the shortest path from a source node
to all other nodes contained within the graph. Previously, we found that experimentally
calculating the path length of a graph can be time-consuming when an inefficient algorithm
is used. Since then, I have looked further into the necessary algorithms for solving
this problem.
2)
The algorithm I used to solve the single-source shortest path problem within the
polymeric gel was a breadth-first search. A breadth-first search begins at a starting node
and explores all of its neighboring nodes. Then, for each of those nearest nodes, it
explores their unexplored neighbors, and so on, until it finds the goal or, as here, until
no unexplored reachable nodes remain.
The algorithm can be designed to generate a distribution of path lengths between nodes.
This is done by tracking the following process within a given timestep: starting from a
given node, the algorithm counts how many nodes are immediate neighbors of
this starting node. The path length between these immediate neighbors and the starting
node is one. The same is then done for the second nearest neighbors, whose path length to
the starting node is two. This process continues until all
reachable neighbors are visited.
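The counting process described above can be sketched in a few lines. This is a minimal Python illustration, not the FORTRAN code used for the results; the function name `path_length_counts` and the dictionary-of-neighbors representation are assumptions made for the example.

```python
from collections import Counter, deque


def path_length_counts(adjacency, source):
    """Count how many nodes lie at each shortest-path distance from `source`.

    `adjacency` maps each node to an iterable of its neighbors (unweighted graph).
    Returns a Counter {path_length: number_of_nodes}; unreachable nodes are omitted.
    """
    distance = {source: 0}
    queue = deque([source])
    counts = Counter()
    while queue:
        node = queue.popleft()
        for neighbor in adjacency.get(node, ()):
            if neighbor not in distance:  # first visit in a BFS is the shortest path
                distance[neighbor] = distance[node] + 1
                counts[distance[neighbor]] += 1
                queue.append(neighbor)
    return counts


# Hypothetical 5-node example: chain 0-1-2-3 plus an isolated node 4 (a rattler).
example = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2], 4: []}
print(path_length_counts(example, 0))
```

Starting from node 0, one node is found at each of the path lengths one, two, and three, and the isolated node never appears in the counts.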
With the polymeric gel, it was necessary to repeat this algorithm N times (N being the
number of aggregates within a timestep), starting from each node within a given timestep. It
is well known that within our gel network, not all aggregates are necessarily connected.
Rattlers and a disconnected graph containing a giant component are two examples of this
situation. By repeating the algorithm from each aggregate and continually updating the
shorter path lengths for a given timestep, the shortest paths between all of the aggregates
are found and the discontinuous nature of the polymeric gel network is accounted
for.
From here a distribution is created by counting the number of each of the specific path
lengths within each timestep,
\[ P_i(k) = \frac{L_i}{\sum_i L_i}. \]
The average path length is then
\[ l = \sum_k P(k)\, k. \]   [1]
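As an illustration of Eq.[1], the sketch below repeats the breadth-first search from every node, pools the path lengths into a normalized distribution, and takes its mean. It assumes that each ordered pair of connected nodes is counted once and that unreachable pairs are simply left out; the names `bfs_distances` and `path_length_distribution` are hypothetical.

```python
from collections import Counter, deque


def bfs_distances(adjacency, source):
    """Shortest-path distance from `source` to every node reachable from it."""
    distance = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency[node]:
            if neighbor not in distance:
                distance[neighbor] = distance[node] + 1
                queue.append(neighbor)
    return distance


def path_length_distribution(adjacency):
    """Return (P, l): P[k] is the fraction of connected ordered pairs at distance k,
    and l = sum_k k * P[k]. Unreachable pairs and i == j are not counted."""
    counts = Counter()
    for source in adjacency:  # repeat the search from every node
        for d in bfs_distances(adjacency, source).values():
            if d > 0:
                counts[d] += 1
    total = sum(counts.values())
    if total == 0:
        return {}, 0.0
    P = {k: counts[k] / total for k in sorted(counts)}
    return P, sum(k * p for k, p in P.items())


# Hypothetical example: chain 0-1-2 plus one isolated node.
example = {0: [1], 1: [0, 2], 2: [1], 3: []}
P, l = path_length_distribution(example)
print(P, l)
```

In this small example four ordered pairs are at distance one and two are at distance two, so P(1) = 2/3, P(2) = 1/3, and l = 4/3.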
The second method of experimentally calculating the average path length is defined as
follows:
\[ l = \frac{1}{N(N-1)} \sum_{i,j} D_{i,j} \]   [2]
In an unweighted graph, $D_{i,j}$ is the shortest distance between node i and node j. This
definition assumes that $D_{i,j} = 0$ if node i cannot be reached from node j or if i = j. N in
this definition is the total number of nodes that have connections. In the following tables
of data, this value is labeled Experimental 2.
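A minimal sketch of Eq.[2] follows, under the conventions stated above: unreachable pairs and i = j contribute zero, and N counts only nodes with at least one connection. The function name `average_path_length_eq2` is an assumption for illustration.

```python
from collections import deque


def average_path_length_eq2(adjacency):
    """Eq.[2]: sum of D_ij over ordered pairs divided by N(N-1).

    D_ij is taken as 0 for i == j and for unreachable pairs; N counts only
    nodes with at least one connection.
    """
    connected = [node for node, nbrs in adjacency.items() if len(nbrs) > 0]
    n = len(connected)
    if n < 2:
        return 0.0
    total = 0
    for source in connected:
        distance = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for neighbor in adjacency[node]:
                if neighbor not in distance:
                    distance[neighbor] = distance[node] + 1
                    queue.append(neighbor)
        total += sum(distance.values())  # unreachable nodes add nothing (D_ij = 0)
    return total / (n * (n - 1))


# Hypothetical example: chain 0-1-2, a separate pair 3-4, and a rattler 5.
example = {0: [1], 1: [0, 2], 2: [1], 3: [4], 4: [3], 5: []}
print(average_path_length_eq2(example))
```

Here N = 5 and the distances sum to 10, giving l = 10/20 = 0.5. Because unreachable pairs enter the sum as zeros while still being counted in N(N-1), this definition can return a smaller value than Eq.[1] when the graph is disconnected.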
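For a random graph, the average path length can also be calculated from $z_1$ and $z_2$, the
average numbers of first and second nearest neighbors (M. Newman et al. / Phys. Rev. E 64
026118 (2001)):
\[ l = \frac{\ln[(N-1)(z_2 - z_1) + z_1^2] - \ln(z_1^2)}{\ln(z_2/z_1)} \]   [3]
This value will be labeled Calculated 1 in the following table of data (Table 1). In the special
circumstance where the following two conditions hold,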
\[ N \gg z_1, \qquad z_2 \gg z_1, \]
Eq.[3] reduces to
\[ l = \frac{\ln(N/z_1)}{\ln(z_2/z_1)} + 1. \]   [4]
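The reduction to Eq.[4] can be written out step by step. The sketch below assumes that Eq.[3] is the general expression in terms of $z_1$ and $z_2$ given by Newman et al. (Phys. Rev. E 64 026118 (2001)); if the Eq.[3] used here differs, the first line would change accordingly.

```latex
\begin{align*}
l &= \frac{\ln\!\left[(N-1)(z_2 - z_1) + z_1^2\right] - \ln z_1^2}{\ln(z_2/z_1)} \\
  &\approx \frac{\ln(N z_2) - \ln z_1^2}{\ln(z_2/z_1)}
     && \text{since } N \gg z_1 \text{ and } z_2 \gg z_1 \text{ give } (N-1)(z_2 - z_1) + z_1^2 \approx N z_2 \\
  &= \frac{\ln(N/z_1) + \ln(z_2/z_1)}{\ln(z_2/z_1)} \\
  &= \frac{\ln(N/z_1)}{\ln(z_2/z_1)} + 1 .
\end{align*}
```

The third line uses $N z_2 / z_1^2 = (N/z_1)(z_2/z_1)$.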
This value will be labeled Calculated 2. In the special case of an E.R. random graph,
for which $z_1 = \langle k \rangle$ and $z_2 = \langle k \rangle^2$, Eq.[4] reduces to the following:
\[ l = \frac{\ln(N)}{\ln \langle k \rangle} \]   [5]
(S. Boccaletti et al. / Physics Reports 424 (2006) 175-308). In the following tables of
data, this value will be labeled Calculated 3.
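For reference, the three calculated values can be evaluated with a few short functions. This is a sketch only: the function names are hypothetical, the form used for Calculated 1 is assumed to be the general $z_1$, $z_2$ expression of Newman et al., and the inputs in the example are made-up numbers rather than values from the tables.

```python
import math


def calculated_1(n, z1, z2):
    """Eq.[3], assumed here to be the general expression of Newman et al.
    Requires z2 > z1 > 0."""
    return (math.log((n - 1) * (z2 - z1) + z1 ** 2) - math.log(z1 ** 2)) / math.log(z2 / z1)


def calculated_2(n, z1, z2):
    """Eq.[4]: valid when N >> z1 and z2 >> z1."""
    return math.log(n / z1) / math.log(z2 / z1) + 1


def calculated_3(n, k_mean):
    """Eq.[5]: E.R. random graph with z1 = <k> and z2 = <k>^2. Requires <k> > 1."""
    return math.log(n) / math.log(k_mean)


# Made-up inputs for illustration only (not taken from Table 1 or Table 2).
n, k_mean = 1000, 4.0
print(calculated_1(n, k_mean, k_mean ** 2))
print(calculated_2(n, k_mean, k_mean ** 2))
print(calculated_3(n, k_mean))
```

With $z_1 = \langle k \rangle$ and $z_2 = \langle k \rangle^2$ the last two functions return the same number, which is the reduction of Eq.[4] to Eq.[5]. All three logarithmic ratios break down when $z_2 \le z_1$ or $\langle k \rangle \le 1$, so inputs in that range need separate handling.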
There are a couple of considerations to take into account. In the creation of the E.R.
random networks we started with the same number of nodes and links as the gel at each
given temperature. Importantly, when we randomly choose to make connections between
N nodes, not all of the N nodes will be selected. There will be some nodes that do not
have connections to others. This results in a network with k links, but the total number of
connected nodes is less than the number of desired nodes. This fact is important when
comparing calculated values to experimental results.
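The effect described above, where a fixed number of randomly placed links leaves some nodes without any connection, can be reproduced with a small sketch. It assumes that links are distinct pairs of different nodes chosen uniformly at random; whether the original construction allowed repeated links is not stated, and the name `random_graph` is hypothetical. The inputs 1413 and 971 are the first Cluster Count (approx) and links entries of Table 1.

```python
import random


def random_graph(num_nodes, num_links, seed=None):
    """Place `num_links` distinct links between uniformly chosen pairs of nodes."""
    if num_links > num_nodes * (num_nodes - 1) // 2:
        raise ValueError("more links requested than distinct node pairs")
    rng = random.Random(seed)
    edges = set()
    while len(edges) < num_links:
        i, j = rng.sample(range(num_nodes), 2)  # two different nodes
        edges.add((min(i, j), max(i, j)))       # store each undirected link once
    adjacency = {node: set() for node in range(num_nodes)}
    for i, j in edges:
        adjacency[i].add(j)
        adjacency[j].add(i)
    return adjacency


adjacency = random_graph(1413, 971, seed=0)
connected = sum(1 for neighbors in adjacency.values() if neighbors)
print(connected, "of", len(adjacency), "nodes have at least one link")
```

For N nodes and L links the expected number of connected nodes is approximately N(1 - e^{-2L/N}), which is about 1055 for these inputs, so a substantial fraction of the nodes is left without links at the highest temperature.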
I am assuming that based upon the definition of an E.R. random graph, according to S.
Boccaletti et al., the value for N must include those nodes that do not have connections.
A second consideration is that the created E.R. random graph might be
disconnected. The formula used to calculate average path lengths assumes that all nodes
are reachable from any randomly chosen starting node. As stated in M. Newman et al. /
Phys. Rev. E 64 026118 (2001), in general this will not be true and Eq.[4] is
meaningless. A better approximation to l may therefore be given by replacing N in
Eq.[4] by NS, where S is the fraction of the graph occupied by the giant component.
I therefore made this approximation: I averaged the largest component of the random
graph per timestep and included this factor in each of the three calculated values.
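A sketch of this correction is given below for a single graph. It finds the fraction S of nodes in the largest connected component and substitutes NS for N in Eq.[4]; the same substitution would apply to the other calculated values. The function names are hypothetical, and S is taken relative to all nodes in the adjacency map, which is an assumption about how the fraction was defined.

```python
import math
from collections import deque


def giant_component_fraction(adjacency):
    """Fraction of all nodes that belong to the largest connected component."""
    if not adjacency:
        return 0.0
    seen = set()
    largest = 0
    for start in adjacency:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        size = 0
        while queue:
            node = queue.popleft()
            size += 1
            for neighbor in adjacency[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        largest = max(largest, size)
    return largest / len(adjacency)


def calculated_2_corrected(n, z1, z2, s):
    """Eq.[4] with N replaced by N*S."""
    return math.log(n * s / z1) / math.log(z2 / z1) + 1


# Hypothetical example: components of size 3, 2, and 1, so S = 3/6.
example = {0: [1], 1: [0, 2], 2: [1], 3: [4], 4: [3], 5: []}
s = giant_component_fraction(example)
print(s, calculated_2_corrected(1000, 4.0, 16.0, s))
```

Since S is at most one, the substitution can only lower the calculated path length, and the change is small when the giant component spans nearly the whole graph.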
Figures 1 and 2 contain plots of average path length versus temperature for the gel and
random graphs.
[Figure 1: Path Length versus Temperature. Curves: Gel - Exp 1, Random - Exp 1, Random - Calc 1, Random - Calc 2, Random - Calc 3.]
Figure 1 contains a graph of the average path length data versus temperature for the polymeric gel and E.R.
random graphs.
[Figure 2: Path Length versus Temperature. Curves: Gel - Exp 1, Random - Exp 1, Random - Calc 1, Random - Calc 2, Random - Calc 3.]
Figure 2 contains a graph of the average path length data versus temperature for the polymeric gel and E.R.
random graphs. This is the same data as in Figure 1, shown in a closer view.
Tables 1 and 2 contain the average path length data (experimental and calculated) for the
random graph and the polymeric gel.
Table 1
Temperature
2.00 1.80 1.50 1.20 1.00 0.90 0.80 0.70 0.60 0.50 0.45 0.40 0.35 0.30
Cluster Count (approx)
1413 1335 1250 1117 977 883 762 606 427 261 206 176 161 142
links
971 966 960 948 934 921 904 877 836 790 772 747 724 734
# of Nodes
1053.1 1020 981.8 910.7 833.4 773 691.2 573.4 418.5 260.5 205.8 175.9 161 142
Average Z1
Average Z2
<k>
<k>^2
Experimental values
Experimental 1
Experimental 2
6.63839 7.11042 7.72812 7.96762 7.73413 7.29168
Calculated values
Calculated 1
16.5118 14.5079 13.0273 10.9471 9.13104 8.08003 6.90495 5.61081 4.25684 3.08397 2.70885 2.517848 2.44093 2.29601
Calculated 2
20.607 17.6088 15.4842 12.6461 10.2862
2.5256 2.37825
Calculated 3
11.1426 10.7169 10.2323 9.33257 8.39577 7.72001 6.85466 5.71444
Table 2
Temperature
2.00 1.80 1.50 1.20 1.00 0.90 0.80 0.70 0.60 0.55 0.50 0.45 0.40 0.35 0.30
Cluster Count (approx)
1413 1335 1250 1117 977 883 762 609 429 338 261 206 176 161 142
links
971 966 960 948 934 921 904 877 836 815 790 772 747 724 734
# of Nodes
1413.39 1335.94 1250.81 1117.64 977.685 883.087 762.985 609.691 429.014 338.657 261.948
0.0345 0.0677 0.1652 0.46671 0.66513 0.74723 0.82145 0.88617 0.93782 0.95747 0.97467 0.98735 0.995831 0.99874 0.99259
Average Z1
1.36991 1.43859 1.52291 1.67481 1.86827 2.02433 2.26189 2.66008 3.36427 3.91926 4.62495 5.35033 5.770649 5.96497 6.19191
Average Z2
1.17664 1.47748 1.88046 2.70186 3.89262 4.94847 6.69323 9.75338 14.8364 17.9091 20.9416 23.409 24.95407 24.7829 25.5831
<k>
1.37713 1.44645 1.53145 1.6843 1.87857 2.03531 2.27322 2.67195 3.37635 3.93092 4.63552 5.35828 5.775521 5.96762 6.33488
<k>^2
1.89648 2.09223 2.34535 2.83688 3.52903 4.14248 5.16751 7.13932 11.3998 15.4521 21.488 28.7112 33.35664 35.6125 40.1307
Experimental values
Experimental 1
Experimental 2
0.02009 0.07883 0.48037 2.95965 4.44104 4.78668 4.89813 4.77602 4.46983 4.28665 4.10622 3.93957 3.797657 3.75663 3.50048
3)
Discussion
For the random graph, as seen in Table 1, the experimentally gathered path lengths (Eq.[1]
and Eq.[2]) are in close agreement. Yet the differences in the calculated values increase with
temperature. It was expected that the path lengths calculated using Eq.[3],
Eq.[4], and Eq.[5] would be more consistent with the experimental results, but we can see in
Figure 1 that this is not the case. Calculated 3 (Eq.[5]) seems to be the closest to the
experimental results for a random graph at all temperatures. However, the two conditions
N >> z1 and z2 >> z1 are not met for all temperatures. I believe that the second condition
is not met for temperatures greater than 0.6. Because of this, I would expect the
calculated value (using Eq.[5]) to start deviating from the experimental results at T =
0.6. As seen in Figure 2, Random - Calc 3 (Eq.[5]) remains consistent with the
experimental results up to T = 0.8.
Since Eq.[4] and Eq.[5] have been stated as the result of reducing Eq.[3] while imposing
special conditions, I have assumed that Eq.[3] should hold true for the random graph at
all temperatures and without these two special conditions. Yet this calculation is only
the second closest to the experimental results.
If I were to change the value of N by using only the number of connected nodes, this
would result in a lower calculated value in all three cases. However, the calculated
values would still deviate at higher temperatures.
To check whether the FORTRAN code was functioning as desired, I built two small
networks of 20 nodes. For each, I manually drew the connections between nodes and
verified that the resulting shortest path lengths and distributions are correct.
In this work, if a starting node does not have a path to another node, its shortest path
length (zero) is not counted. Earlier you explained how to deal with these disconnections
when taking the inverse, but at this point I don't remember the details. Could
you refresh my memory on those details?