
School of Information

University of Michigan
SI 614
Random graphs & power law networks
preferential attachment
Lecture 7
Instructor: Lada Adamic

Outline
Erdos-Renyi random graphs
BA model
scale free networks in Pajek
modifications of preferential attachment
other processes that lead to power law networks
randomizing networks but preserving network properties
assortative mixing

Simplest random network
Erdos-Renyi model: randomly draw E edges between N nodes
Conserves only the average number of neighbors (connectivity) of a node
$\langle k \rangle = 2E/N \approx pN$ (p = probability that a given pair of nodes is linked)
No hubs! Narrow distribution of connectivities
Poisson distribution
Real world networks are often power law though...
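The Erdos-Renyi construction can be sketched in a few lines of Python. This is a minimal illustration rather than any particular tool's implementation; the function names and the choice to forbid self-loops and repeated edges are assumptions made here.

```python
import random

def erdos_renyi_edges(n, e, seed=None):
    """Draw e distinct undirected edges uniformly at random among n nodes."""
    if e > n * (n - 1) // 2:
        raise ValueError("more edges requested than distinct node pairs")
    rng = random.Random(seed)
    edges = set()
    while len(edges) < e:
        u, v = rng.sample(range(n), 2)        # two distinct nodes, so no self-loops
        edges.add((min(u, v), max(u, v)))     # store each pair once
    return sorted(edges)

def mean_degree(n, edges):
    """Average connectivity <k> = 2E/N."""
    return 2 * len(edges) / n

er_edges = erdos_renyi_edges(n=1000, e=2500, seed=1)
print(mean_degree(1000, er_edges))  # 5.0, since 2 * 2500 / 1000 = 5
```

Only the edge count is fixed here, so individual degrees fluctuate narrowly around the mean, which is the "no hubs" behavior described on the slide.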
Sexual networks

Great variation in
contact numbers

Yule model

Basic BA-model
Very simple algorithm to implement
start with an initial set of $m_0$ fully connected nodes
e.g. $m_0 = 3$




now add new vertices one by one, each one with exactly m edges
each new edge connects to an existing vertex in proportion to the number of edges that vertex already has (preferential attachment)
easiest if you keep track of edge endpoints in one large array and select an element from this array at random
the probability of selecting any one vertex will be proportional to the number of times it appears in the array, which corresponds to its degree

[Figure: triangle on vertices 1, 2, 3]
Example array of edge endpoints: 1 1 2 2 2 3 3 4 5 6 6 7 8 ...
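The endpoint-array trick can be sketched as follows. This is an illustrative sketch, not the lecture's own code: the function name is hypothetical, vertices are numbered from 0 rather than 1, and each new vertex is assumed to pick m distinct targets (so no multiple edges).

```python
import random
from collections import Counter

def barabasi_albert(n, m, m0=3, seed=None):
    """Grow a BA graph to n vertices; returns (edges, endpoints)."""
    if not (2 <= m0 <= n and 1 <= m <= m0):
        raise ValueError("need 2 <= m0 <= n and 1 <= m <= m0")
    rng = random.Random(seed)
    # initial core: m0 fully connected vertices
    edges = [(i, j) for i in range(m0) for j in range(i + 1, m0)]
    # every edge contributes both of its endpoints to the array
    endpoints = [v for edge in edges for v in edge]
    for new in range(m0, n):
        targets = set()
        while len(targets) < m:
            # a uniform draw from the array picks a vertex
            # with probability proportional to its degree
            targets.add(rng.choice(endpoints))
        for t in sorted(targets):
            edges.append((t, new))
            endpoints.extend((t, new))
    return edges, endpoints

ba_edges, ba_endpoints = barabasi_albert(n=1000, m=2, seed=42)
ba_degree = Counter(ba_endpoints)   # degree = number of appearances in the array
print(max(ba_degree.values()), min(ba_degree.values()))
```

Drawing targets before appending the new vertex's endpoints keeps the new vertex from linking to itself.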
Generating BA graphs, cont'd
To start, each vertex has an equal number of edges (2), so the probability of choosing any vertex is 1/3

We add a new vertex, and it will have m edges; here take m = 2
draw 2 random elements from the array; suppose they are 2 and 3

Now the probabilities of selecting 1, 2, 3, or 4 are 1/5, 3/10, 3/10, 1/5

Add a new vertex, draw a vertex for it to connect to from the array, etc.

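[Figure: the graph and its endpoint array at each step]
initial triangle (vertices 1, 2, 3): 1 1 2 2 3 3
after adding vertex 4 (linked to 2 and 3): 1 1 2 2 2 3 3 3 4 4
after adding vertex 5 (linked to 3 and 4): 1 1 2 2 2 3 3 3 3 4 4 4 5 5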
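The selection probabilities in this walkthrough can be checked by counting appearances in the endpoint array. The helper name below is an assumption for illustration; the arrays and the choice of vertices 2 and 3 come from the example itself.

```python
from collections import Counter
from fractions import Fraction

def selection_probabilities(endpoints):
    """Probability of drawing each vertex = appearances / array length."""
    counts = Counter(endpoints)
    total = len(endpoints)
    return {v: Fraction(c, total) for v, c in sorted(counts.items())}

walk = [1, 1, 2, 2, 3, 3]                  # triangle on vertices 1, 2, 3
start_probs = selection_probabilities(walk)   # every vertex: 1/3

for target in (2, 3):                      # vertex 4 arrives with m = 2 edges
    walk.extend((target, 4))

step_probs = selection_probabilities(walk)
for vertex, prob in step_probs.items():
    print(vertex, prob)                    # 1/5, 3/10, 3/10, 1/5 for vertices 1..4
```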
Properties of the BA graph
The distribution is scale free with exponent $\alpha = 3$
$P(k) = 2m^2 / k^3$
The graph is connected
Every new vertex is born with a link or several links (depending
on whether m = 1 or m > 1)
It then connects to an older vertex, which itself connected to
another vertex when it was introduced
And we started from a connected core
The older are richer
Nodes accumulate links as time goes on, which gives older
nodes an advantage since newer nodes are going to attach
preferentially and older nodes have a higher degree to tempt
them with than some new kid on the block
Time evolution of the connectivity of a vertex in the BA model
[Figure: degree over time for a vertex introduced at t = 5 and a vertex introduced at t = 95]
Younger vertex does not stand a chance:
at t = 95 the older vertex has ~20 edges, and the younger vertex is starting out with 5
at t ~ 10,000 the older vertex has ~200 edges and the younger vertex has ~50
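These numbers are roughly what the standard mean-field result for the BA model predicts, $k_i(t) = m \sqrt{t / t_i}$ for a vertex born at time $t_i$. The formula is not stated on the slide, and m = 5 is an assumption read off from "starting out with 5"; the sketch below just evaluates it.

```python
import math

def ba_expected_degree(m, t_born, t):
    """Mean-field BA degree at time t of a vertex born at t_born: m * sqrt(t / t_born)."""
    if t < t_born:
        raise ValueError("vertex does not exist yet")
    return m * math.sqrt(t / t_born)

m = 5  # assumed: edges per new vertex
for t in (95, 10_000):
    older = ba_expected_degree(m, 5, t)
    younger = ba_expected_degree(m, 95, t)
    print(t, round(older, 1), round(younger, 1))
```

Under these assumptions the older vertex has about 22 edges at t = 95 and about 224 at t = 10,000, while the younger one goes from 5 to about 51. The ratio between the two stays fixed at sqrt(95/5), so the younger vertex never catches up.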
Generating scale free networks with Pajek
Two general options
Scale free
D.M. Pennock et al. (2002) Winners don't take all, PNAS 99(8), 5207-5211.
Pajek command: Net > Random Network > Scale Free
Differs from the BA model primarily in that:
new vertices are not automatically assigned edges
probability of attaching is partially independent of degree
Extended model
Albert R., Barabasi A.L.: Topology of evolving networks: local
events and universality http://xxx.lanl.gov/abs/cond-mat/0005085
Pajek command: Net > Random Network > Extended Model
Differs from the simple BA model in that:
edges are added between existing nodes, not only the newcomer
edges are rewired between existing nodes





Scale free network option in Pajek
Network starts with $m_0$ vertices, which link to each other with probability $p_0$ (as in an Erdos-Renyi random graph)
At each time step t, one vertex and m edges are added
to the network
Instead of attaching one end point of each edge to the
newly introduced vertex, choose each end point
according to the probability:
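$$\Pr(v) = \alpha \frac{\mathrm{indeg}(v)}{|E|} + \beta \frac{\mathrm{outdeg}(v)}{|E|} + \gamma \frac{1}{|V|}$$
$\mathrm{indeg}(v)/|E|$: fraction of edges in the graph that end at v
$\mathrm{outdeg}(v)/|E|$: fraction of edges in the graph that start at v
$1/|V|$: the credit v gets just for being one of the vertices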
Scale free network generation in Pajek, cont'd
Observations:
$\alpha + \beta + \gamma = 1$, so can vary the relative importance of indegree, outdegree, and independent probability
in an undirected network $\alpha = \beta$, since indegree and outdegree are the same
Not all vertices will be connected, since they are not born with an edge
The larger $\gamma$ is, the less scale-free the degree distribution
edges are added at random, without regard to degree
Original BA paper showed that in that case the degree distribution is $P(k) \sim \exp(-\beta k)$ (with $\beta$ here a decay constant, not the outdegree weight), i.e. an exponential distribution

$$\Pr(v) = \alpha \frac{\mathrm{indeg}(v)}{|E|} + \beta \frac{\mathrm{outdeg}(v)}{|E|} + \gamma \frac{1}{|V|}$$
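To make the three weights concrete, the attachment probability can be evaluated on a toy directed graph. This is a sketch of the formula only, not Pajek's code; the function name, the dictionary-based inputs, and the example graph are assumptions for illustration.

```python
def attachment_probabilities(in_deg, out_deg, alpha, beta, gamma):
    """Pr(v) = alpha*indeg(v)/|E| + beta*outdeg(v)/|E| + gamma/|V| for every vertex v."""
    if abs(alpha + beta + gamma - 1.0) > 1e-9:
        raise ValueError("alpha + beta + gamma must equal 1")
    n_vertices = len(in_deg)
    n_edges = sum(in_deg.values())        # every edge ends at exactly one vertex
    if n_edges == 0:
        raise ValueError("need at least one edge")
    return {
        v: alpha * in_deg[v] / n_edges
           + beta * out_deg[v] / n_edges
           + gamma / n_vertices
        for v in in_deg
    }

# toy graph with edges a->b, a->c, b->c, plus an edgeless vertex d
in_deg = {"a": 0, "b": 1, "c": 2, "d": 0}
out_deg = {"a": 2, "b": 1, "c": 0, "d": 0}

mixed = attachment_probabilities(in_deg, out_deg, alpha=0.6, beta=0.2, gamma=0.2)
no_credit = attachment_probabilities(in_deg, out_deg, alpha=0.5, beta=0.5, gamma=0.0)
print(mixed["d"], no_credit["d"])   # d is reachable only when gamma > 0
```

With gamma = 0 the edgeless vertex d has probability exactly zero, so it can never acquire an edge.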
Pennock model
Example: It is reasonable to assume that some webpages will be
linked to in part because of what they are rather than the number of
links they already have
fits to various subsets of web data,
and web pages in general

Scale free in Pajek
For the network you can specify
undirected, directed, or acyclic
an adding > free option?

# of vertices
# of lines
average degree of vertices

Initial Erdos-Renyi graph (these are the first few vertices present)
# of vertices (use something small, a couple of vertices)
probability p of connecting: type 0.9999 to have them fully connected, or anything between 0 and 1; it doesn't matter much
$\alpha$: this is between 0 and 0.5 for an undirected graph
the higher $\alpha$, the more scale-free your distribution will be
but watch out: if you set $\alpha = 0.5$, then $\beta = 0.5$ and $\gamma = 0$, and your new, edgeless vertices will never get new connections; only the original Erdos-Renyi component will be connected


in theory you can leave either the # of vertices or # of lines unconstrained, but leaving the # of lines unconstrained (enter in 0) works for me
Extended BA model (undirected network)
start with $m_0$ isolated nodes
at each timestep perform one of the following operations:
w/ prob. p add m ($m \le m_0$) new links
for each link
select the "from" vertex at random
select the "to" vertex in proportion to its degree (+1 so that isolated vertices have a chance of getting links)





w/ prob. q, where $0 < q < 1 - p$, rewire m links
select node i at random and one of i's links
rewire the endpoint of i's link to another node j, randomly chosen with probability $\Pi(k_j)$
$$\Pi(k_i) = \frac{k_i + 1}{\sum_j (k_j + 1)}$$
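As a small worked example of this selection probability, each node's weight is its degree plus one, normalized over all nodes. The function name and the sample degrees are assumptions for illustration.

```python
from fractions import Fraction

def shifted_preferential_probabilities(degrees):
    """Pi(k_i) = (k_i + 1) / sum_j (k_j + 1) for a list of node degrees."""
    total = sum(k + 1 for k in degrees)
    return [Fraction(k + 1, total) for k in degrees]

pi = shifted_preferential_probabilities([0, 1, 3])
print([str(x) for x in pi])   # ['1/7', '2/7', '4/7']
```

The isolated node (degree 0) still gets probability 1/7, which is the purpose of the +1.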
Extended BA model, cont'd
w/ prob. 1 - p - q
add a new node with m links
connect endpoints of the m links to vertices in proportion to their degree ($\Pi(k_j)$)

In the p=q=0 limit, reduces to the simple BA model

In the high q (q -> 1) limit, extended model produces a
network with an exponential tail because growth is very
slow (only rewiring is occurring)
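The three operations of the extended model can be put together in a short simulation. This is a rough sketch under stated assumptions, not the paper's or Pajek's implementation: the function name is hypothetical, multiple lines between the same pair of nodes are allowed, the run stops once n nodes exist, and the incident-link search is linear for brevity.

```python
import random

def extended_ba(n, m0, m, p, q, seed=None):
    """Grow an undirected multigraph to n nodes; returns (edges, degree)."""
    if not (2 <= m0 <= n and 1 <= m <= m0):
        raise ValueError("need 2 <= m0 <= n and 1 <= m <= m0")
    if p < 0 or q < 0 or p + q >= 1:
        raise ValueError("need p, q >= 0 and p + q < 1 so that nodes get added")
    rng = random.Random(seed)
    degree = [0] * m0            # start with m0 isolated nodes
    edges = []                   # each edge is a mutable [u, v] pair

    def pick_preferential(exclude=None):
        # Pi(k_j) is proportional to k_j + 1
        nodes = [v for v in range(len(degree)) if v != exclude]
        return rng.choices(nodes, weights=[degree[v] + 1 for v in nodes])[0]

    def add_edge(u, v):
        edges.append([u, v])
        degree[u] += 1
        degree[v] += 1

    while len(degree) < n:
        r = rng.random()
        if r < p:                                  # add m links between existing nodes
            for _ in range(m):
                u = rng.randrange(len(degree))     # "from" vertex: uniform
                add_edge(u, pick_preferential(exclude=u))
        elif r < p + q:                            # rewire m links
            for _ in range(m):
                i = rng.randrange(len(degree))
                incident = [e for e in edges if i in e]
                if not incident:
                    continue                       # node i has no link to rewire
                e = rng.choice(incident)
                far = 1 if e[0] == i else 0        # position of the other endpoint
                j = pick_preferential(exclude=i)
                degree[e[far]] -= 1
                e[far] = j                         # move the far endpoint to j
                degree[j] += 1
        else:                                      # add a new node with m links
            targets = [pick_preferential() for _ in range(m)]
            degree.append(0)
            new = len(degree) - 1
            for t in targets:
                add_edge(new, t)
    return edges, degree

ext_edges, ext_degree = extended_ba(n=200, m0=3, m=2, p=0.3, q=0.2, seed=1)
print(len(ext_edges), max(ext_degree))
```

Setting p = q = 0 leaves only the node-adding branch, which corresponds to the simple BA limit mentioned above (with the +1 shift in the attachment weights).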



parameter space of the extended BA model
In the high p (p > 0.5) limit, have a scale free distribution,
because adding new edges preferentially
saturation effect for small k (degree)
because edges keep being added, but vertices are not being added
that quickly, eventually even the low degree vertices get a few more
edges
power-law exponent varies between 2 and ∞, depending on parameters

Extended BA model in Pajek
Net > Random Network > Extended Model
Specify
n = # of vertices
$m_0$ = # of initial, disconnected nodes
$m \le m_0$, number of edges to add/rewire at a time
p = probability to add new lines
q = probability to rewire edges, $0 \le q \le 1 - p$
can ask for network without multiple lines
How can we randomize a network while
preserving the degree distribution?
Stub reconnection algorithm (M. E. J. Newman et al., 2001; also known in mathematical literature since 1960s)
Break every edge into two edge stubs: A-B becomes A- and -B
Randomly reconnect stubs
Problems:
Leads to multiple edges
Cannot be modified to preserve additional topological
properties
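Stub reconnection fits in a few lines: list every edge endpoint as a stub, shuffle, and pair the stubs off. The function name and example graph are assumptions for illustration.

```python
import random
from collections import Counter

def stub_reconnect(edges, seed=None):
    """Randomize an undirected graph by breaking edges into stubs and re-pairing them."""
    rng = random.Random(seed)
    stubs = [v for edge in edges for v in edge]   # two stubs per edge
    rng.shuffle(stubs)
    # pair consecutive stubs; self-loops and multiple edges can appear
    return list(zip(stubs[0::2], stubs[1::2]))

original = [(0, 1), (0, 2), (0, 3), (1, 2), (3, 4)]
shuffled = stub_reconnect(original, seed=7)

degree_before = Counter(v for e in original for v in e)
degree_after = Counter(v for e in shuffled for v in e)
print(degree_before == degree_after)   # True: every node keeps its stubs
```

Nothing in the pairing step prevents a stub from meeting another stub of the same node or repeating an existing pair, which is the "multiple edges" problem listed above.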
Local rewiring algorithm
Randomly select and rewire two edges (Maslov, Sneppen, 2002, also
known in mathematical literature since 1960s)
Repeat many times
Preserves both the number of upstream and downstream
neighbors of each node
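A minimal sketch of the local rewiring step for a directed graph: pick two edges A->B and C->D and swap their targets to get A->D and C->B. The function name, the rejection of swaps that would create self-loops or duplicate edges, and the attempt cap are assumptions added here; the input is assumed to be a simple directed graph.

```python
import random
from collections import Counter

def local_rewire(edges, n_swaps, seed=None):
    """Degree-preserving randomization by repeated pairwise target swaps."""
    rng = random.Random(seed)
    edges = list(edges)
    present = set(edges)
    done, attempts = 0, 0
    while done < n_swaps and attempts < 100 * n_swaps:
        attempts += 1
        i, j = rng.sample(range(len(edges)), 2)   # two different edges
        (a, b), (c, d) = edges[i], edges[j]
        new1, new2 = (a, d), (c, b)               # swap the two targets
        if a == d or c == b:
            continue                              # would create a self-loop
        if new1 in present or new2 in present:
            continue                              # would create a multiple edge
        present -= {edges[i], edges[j]}
        present |= {new1, new2}
        edges[i], edges[j] = new1, new2
        done += 1
    return edges

ring = [(k, (k + 1) % 10) for k in range(10)] + [(k, (k + 3) % 10) for k in range(10)]
rewired = local_rewire(ring, n_swaps=20, seed=0)

out_same = Counter(u for u, _ in ring) == Counter(u for u, _ in rewired)
in_same = Counter(v for _, v in ring) == Counter(v for _, v in rewired)
print(out_same, in_same)   # True True
```

Each swap keeps every source and every target in place, so both the number of downstream and upstream neighbors of each node is unchanged.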
Conserving additional low-level topological
properties
In addition to $k_i$ one may also conserve:
The exact numbers of loops or other motifs
The size and numbers of components: in the Internet, all nodes have to be connected to each other
Metropolis algorithm: two edges are rewired based on
$$E = \frac{(N_{\mathrm{actual}} - N_{\mathrm{desired}})^2}{N_{\mathrm{desired}}}$$
If $\Delta E \le 0$, the rewiring step is always accepted
If $\Delta E > 0$, the rewiring step is accepted with $p = \exp(-\Delta E / T)$
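The acceptance rule can be sketched as below. The function names and the injectable `rng` argument are assumptions for illustration; how N_actual is counted (loops, motifs, components) is left to the caller.

```python
import math
import random

def energy(n_actual, n_desired):
    """E = (N_actual - N_desired)^2 / N_desired."""
    return (n_actual - n_desired) ** 2 / n_desired

def accept_rewiring(e_old, e_new, temperature, rng=random):
    """Metropolis rule: always accept if energy does not rise, else with exp(-dE/T)."""
    delta = e_new - e_old
    if delta <= 0:
        return True
    return rng.random() < math.exp(-delta / temperature)

# a swap that moves the motif count from 14 to 12 when 10 are desired
e_before = energy(14, 10)   # 1.6
e_after = energy(12, 10)    # 0.4
print(accept_rewiring(e_before, e_after, temperature=0.5))   # True: energy decreased
```

A lower temperature T makes uphill moves rarer, so the rewired network is held closer to the desired count.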
Assortativity
Social networks are assortative:
the gregarious people associate with other gregarious people
the loners associate with other loners
The Internet is disassortative
[Figure panels: Assortative (hubs connect to hubs); Random; Disassortative (hubs are in the periphery)]
Correlation profile of a network
Detects preferences in linking of nodes to each other
based on their connectivity
Measure $N(k_0, k_1)$, the number of edges between nodes with connectivities $k_0$ and $k_1$
Compare it to $N_r(k_0, k_1)$, the same property in a properly randomized network
Very noise-tolerant with respect to both false positives
and negatives
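Counting N(k0, k1) is a single pass over the edge list once degrees are known. The function names below are assumptions for illustration, and taking the ratio N / N_r is just one simple way to make the comparison; the randomized counts would come from a degree-preserving randomization of the same network.

```python
from collections import Counter

def degree_pair_counts(edges):
    """N(k0, k1): number of edges joining nodes of degree k0 and k1 (k0 <= k1)."""
    degree = Counter(v for e in edges for v in e)
    counts = Counter()
    for u, v in edges:
        k0, k1 = sorted((degree[u], degree[v]))
        counts[(k0, k1)] += 1
    return counts

def profile_ratio(actual, randomized):
    """Ratio N / N_r for every degree pair seen in both networks."""
    return {pair: actual[pair] / randomized[pair]
            for pair in actual if randomized.get(pair)}

path = [(0, 1), (1, 2), (2, 3)]            # degrees: 1, 2, 2, 1
profile = degree_pair_counts(path)
print(dict(profile))                        # {(1, 2): 2, (2, 2): 1}
```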
Correlation profiles give complex networks unique identities
[Figure: 2D histograms of the correlation profile for the Internet and for protein interactions; slide by Sergei Maslov]
Correlation profiles, cont'd
Pastor-Satorras and Vespignani: 2D plot
[Figure: average degree of the node's neighbors vs. degree of node]
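The quantity in this plot can be computed by averaging, over all nodes of a given degree, the mean degree of their neighbors. The function name and the star-graph example are assumptions for illustration.

```python
from collections import defaultdict

def average_neighbor_degree_by_degree(edges):
    """For each degree k, the mean over degree-k nodes of their neighbors' average degree."""
    neighbors = defaultdict(list)
    for u, v in edges:
        neighbors[u].append(v)
        neighbors[v].append(u)
    degree = {v: len(nb) for v, nb in neighbors.items()}
    per_degree = defaultdict(list)
    for v, nb in neighbors.items():
        per_degree[degree[v]].append(sum(degree[w] for w in nb) / degree[v])
    return {k: sum(vals) / len(vals) for k, vals in sorted(per_degree.items())}

star = [(0, 1), (0, 2), (0, 3), (0, 4)]    # hub 0 with four leaves
knn = average_neighbor_degree_by_degree(star)
print(knn)   # {1: 4.0, 4: 1.0}
```

A curve that falls with degree, as in this star, indicates disassortative mixing; a rising curve indicates assortative mixing.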
Correlation profiles, cont'd
Newman: single number
Internet degree correlation coefficient: -0.189
The Pearson correlation coefficient of the degrees of the nodes on each side of an edge
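A minimal sketch of this coefficient: list the degrees found at the two ends of every edge (counting each edge in both directions, so the measure is symmetric) and take their Pearson correlation. The function name and the two toy graphs are assumptions for illustration.

```python
from collections import Counter

def degree_assortativity(edges):
    """Pearson correlation of the degrees at the two ends of each undirected edge."""
    degree = Counter(v for e in edges for v in e)
    xs, ys = [], []
    for u, v in edges:
        xs += [degree[u], degree[v]]     # each edge is counted in both directions
        ys += [degree[v], degree[u]]
    n = len(xs)
    mean = sum(xs) / n                   # xs and ys have the same mean by symmetry
    cov = sum((x - mean) * (y - mean) for x, y in zip(xs, ys)) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    if var == 0:
        raise ValueError("undefined: all edge endpoints have the same degree")
    return cov / var

star_graph = [(0, 1), (0, 2), (0, 3), (0, 4)]          # hub linked only to leaves
triangle_plus_pair = [(0, 1), (1, 2), (0, 2), (3, 4)]  # like links only to like

r_star = degree_assortativity(star_graph)
r_like = degree_assortativity(triangle_plus_pair)
print(r_star, r_like)   # -1.0 for the star, 1.0 for the second graph
```

Values below zero, like the Internet's -0.189, mean high-degree nodes tend to attach to low-degree ones.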

Other examples of assortative mixing
Assortativity is not limited to degree-degree correlations
other attributes
social networks: race, income, gender, age
food webs: herbivores, carnivores
internet: high level connectivity providers, ISPs, consumers

Tendency of like individuals to associate: homophily
more about this later
