where $n_{ij}$ counts the number of directed edges from node $\theta_i$ to node $\theta'_j$. Similarly, the bipartite graph is represented by an atomic measure

\[
Z = \sum_{i=1}^{\infty} \sum_{j=1}^{\infty} z_{ij}\, \delta_{(\theta_i, \theta'_j)}.
\]

Our bipartite graph formulation introduces two independent CRMs, $W \sim \mathrm{CRM}(\rho, \lambda)$ and $W' \sim \mathrm{CRM}(\rho', \lambda')$, whose jumps correspond to sociability parameters for nodes in sets $V$ and $V'$ respectively. The generative model for the bipartite graph mimics that of the non-bipartite graph:
Sparse Graphs 11
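\[
\begin{aligned}
W &= \sum_{i=1}^{\infty} w_i\, \delta_{\theta_i}, & W &\sim \mathrm{CRM}(\rho, \lambda),\\
W' &= \sum_{j=1}^{\infty} w'_j\, \delta_{\theta'_j}, & W' &\sim \mathrm{CRM}(\rho', \lambda'),\\
D &= \sum_{i=1}^{\infty} \sum_{j=1}^{\infty} n_{ij}\, \delta_{(\theta_i, \theta'_j)}, & D \mid W, W' &\sim \mathrm{PP}(W \times W'),\\
Z &= \sum_{i=1}^{\infty} \sum_{j=1}^{\infty} \min(n_{ij}, 1)\, \delta_{(\theta_i, \theta'_j)}.
\end{aligned}
\tag{14}
\]
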
Model (14) has been proposed by Caron (2012) in a slightly different formulation. In this paper, we recast this model within our general framework, enabling new theoretical and practical insights.

3.5. Interpretation of $\theta_i$
We can think of the positive, continuous valued node index $\theta_i$ as representing the time at which a potential node enters the network and has the opportunity to link with other existing nodes $\theta_j < \theta_i$. We use the terminology potential node here to clarify that this node need not form any observed connections with other nodes existing before time $\theta_i$. We emphasize that an observed link between $\theta_i$ and some other node $\theta_k > \theta_i$ will eventually occur almost surely as time progresses. This could represent, for example, signing on to a social networking service before your friends do and only forming a link once they join. On the basis of our CRM specification, we have almost surely an infinite number of potential nodes as time goes to $\infty$. For infinite activity CRMs, we have almost surely an infinite set of potential nodes even at any finite time.
In Section 5, we examine properties of the network process across time, and we describe methods for simulating networks at any finite time. There, our focus is on the observed link process from this set of potential nodes. For example, sparsity is examined with respect to the set of nodes with degree at least 1, not with respect to the set of potential nodes. Since we need not think of $\theta_i$ as a time index, but rather just a general construct of our formulation, we also generically refer to $\theta_i$ as the node location in the remainder of the paper.

4. Related work

There has been extensive work over recent years on flexible Bayesian non-parametric models for networks, allowing complex latent structures of unknown dimension to be uncovered from real world networks (Kemp et al., 2006; Miller et al., 2009; Lloyd et al., 2012; Palla et al., 2012; Herlau et al., 2014). However, as mentioned in the unifying overview of Orbanz and Roy (2015), these methods all fit in the Aldous-Hoover framework and as such produce dense graphs.
Norros and Reittu (2006) proposed a conditionally Poissonian multigraph process with similarities to our multigraph process. In their formulation, each node has a given sociability parameter and the number of edges between two nodes $i$ and $j$ is drawn from a Poisson distribution with rate the product of the sociability parameters, normalized by the sum of the sociability parameters of all the nodes. The normalization makes this model similar to models based on rescaling of the graphon and, as such, does not define a projective model, as explained in Section 1. See van der Hofstad (2014) for a review of this model and Britton et al. (2006) for a similar model.
As pointed out by Jacobs and Clauset (2014) in their discussion of an earlier version of this paper, another related model is the degree-corrected random-graph model (Karrer and
12 F. Caron and E. B. Fox
Newman, 2011), where edges of the multigraph are drawn from a Poisson distribution whose rate is the product of node-specific sociability parameters and a parameter tuning the interaction between the latent communities to which these nodes belong. When the sociability parameters are assumed to be IID from some distribution, this model yields an exchangeable adjacency matrix and thus a dense graph.
Additionally, there are similarities to be drawn with the extensive literature on latent space modelling (e.g. Hoff et al. (2002), Penrose (2003) and Hoff (2009)). In such models, nodes are embedded in a low dimensional, continuous latent space and the probability of an edge is determined by a distance or similarity metric of the node-specific latent factors. In our case, the continuous node index $\theta_i$ is of no importance in forming edge probabilities. It would, however, be possible to extend our approach to time- or location-dependent connections by considering inhomogeneous CRMs.
Finally, as we shall detail in Section 5.5, our model admits a construction with connections to the configuration model (Bollobás, 1980; Newman, 2010), which is a popular model for generating simple graphs with a given degree sequence.
The connections with this broad set of past work place our proposed network model within the context of existing literature. Importantly, however, to the best of our knowledge this work represents the first fully generative and projective approach to sparse graph modelling (see Section 5), and with a notion of exchangeability that is essential for devising our scalable statistical estimation procedure, as shown in Section 7.

5. General properties and simulation
We provide general properties of our network model depending on the properties of the Lévy measure $\rho$.

5.1. Exchangeability under the Kallenberg framework
Proposition 1 (joint exchangeability of undirected graph measure). For any CRM $W \sim \mathrm{CRM}(\rho, \lambda)$, the point process $Z$ defined by equation (12), or equivalently by equation (6), is jointly exchangeable.

The proof is given in Appendix B. In the adjacency matrix representation, we think of exchangeability as invariance to node orderings. Here, we have invariance to the time of arrival of the nodes, thinking of $\theta_i$ as a time index.
We now reformulate our network process in the Kallenberg representation (5). Because of exchangeability, we know that such a representation exists. What we show here is that our CRM-based formulation has an analytic and interpretable representation. In particular, the CRM $W$ can be constructed from a two-dimensional unit rate Poisson process on $\mathbb{R}_+^2$ by using the inverse Lévy method (Khintchine, 1937; Ferguson and Klass, 1972). Let $(\theta_i, \vartheta_i)$ be a unit rate Poisson process on $\mathbb{R}_+^2$. Let $\bar\rho(x)$ be the tail Lévy intensity

\[
\bar\rho(x) = \int_x^{\infty} \rho(dw). \tag{15}
\]

Then the CRM $W = \sum_i w_i \delta_{\theta_i}$ with Lévy measure $\rho(dw)\,d\theta$ can be constructed from the bidimensional point process by taking $w_i = \bar\rho^{-1}(\vartheta_i)$. Note that the inverse Lévy intensity $\bar\rho^{-1}$ is a monotone function. It follows that our undirected graph model can be formulated under representation (5) by selecting any $\alpha_0$, $\beta_0 = \gamma_0 = 0$, $g = g' = 0$, $h = h' = l = l' = 0$ and

\[
f(\alpha_0, \vartheta_i, \vartheta_j, \zeta_{\{i,j\}}) =
\begin{cases}
1 & \zeta_{\{i,j\}} \le M(\vartheta_i, \vartheta_j),\\
0 & \text{otherwise}
\end{cases}
\tag{16}
\]

where $M : \mathbb{R}_+^2 \to [0, 1]$ is defined by

\[
M(\vartheta_i, \vartheta_j) =
\begin{cases}
1 - \exp\{-2\, \bar\rho^{-1}(\vartheta_i)\, \bar\rho^{-1}(\vartheta_j)\} & \text{if } \vartheta_i \ne \vartheta_j,\\
1 - \exp\{-\bar\rho^{-1}(\vartheta_i)^2\} & \text{if } \vartheta_i = \vartheta_j.
\end{cases}
\]

Fig. 5. Illustration of the model construction based on the Kallenberg representation: (a) a unit rate Poisson process $(\theta_i, \vartheta_i)$, $i \in \mathbb{N}$, on $[0, \alpha] \times \mathbb{R}_+$; (b) for each pair $\{i, j\} \in \mathbb{N}^2$, set $z_{ij} = z_{ji} = 1$ with probability $M(\vartheta_i, \vartheta_j)$ (here, $M$ is indicated by the blue shading (darker shading indicates higher value) for a stable process (GGP with $\tau = 0$); in this case there is an analytic expression for $\bar\rho^{-1}$ and therefore $M$)

In Section 6, we provide explicit forms for $\bar\rho$ depending on our choice of Lévy measure $\rho$. Expression (16) represents a direct analogue to that arising from the Aldous-Hoover framework. In particular, $M$ here is akin to the graphon of expression (47) in Appendix A.1, and thus allows us to connect our CRM-based formulation with the extensive literature on graphons. An illustration of the network construction from the Kallenberg representation, including the function $M$, is in Fig. 5. Note that, if we had started from the Kallenberg representation and selected an $f$ (or $M$) arbitrarily, we would probably not have obtained a network model with the normalized CRM interpretation that enables both interpretability and analysis of network properties.
For the bipartite graph, Kallenberg's representation theorem for separate exchangeability (Kallenberg (1990) and Kallenberg (2005), theorem 9.23) can likewise be applied.

5.2. Interactions between groups
For any disjoint set of nodes $A, B \subset \mathbb{R}_+$, $A \cap B = \emptyset$, the probability that there is at least one connection between a node in $A$ and a node in $B$ is given by

\[
\Pr\{Z(A \times B) > 0 \mid W\} = 1 - \exp\{-2\, W(A)\, W(B)\},
\]

i.e. the probability of a between-group edge depends on the sum of the sociabilities in each group, $W(A)$ and $W(B)$.
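
The formula above can be sketched numerically. This is only an illustration: the function name, the interval encoding of the groups and the toy jumps are assumptions of ours, and a finite list of (location, weight) pairs stands in for the CRM $W$.

```python
import math

def between_group_edge_prob(jumps, group_a, group_b):
    """Pr{Z(A x B) > 0 | W} = 1 - exp(-2 W(A) W(B)) for disjoint A and B.

    jumps: list of (theta, w) pairs, a finite stand-in for the CRM W.
    group_a, group_b: half-open intervals (low, high) of node locations.
    """
    (a_lo, a_hi), (b_lo, b_hi) = group_a, group_b
    if a_lo < b_hi and b_lo < a_hi:
        raise ValueError("groups must be disjoint")
    w_a = sum(w for theta, w in jumps if a_lo <= theta < a_hi)  # W(A)
    w_b = sum(w for theta, w in jumps if b_lo <= theta < b_hi)  # W(B)
    return 1.0 - math.exp(-2.0 * w_a * w_b)

# Hypothetical jumps: W(A) = 1.5 on [0, 1) and W(B) = 1.0 on [1, 2).
jumps = [(0.2, 0.5), (0.7, 1.0), (1.4, 0.25), (1.9, 0.75)]
p = between_group_edge_prob(jumps, (0.0, 1.0), (1.0, 2.0))
```

Only the two total masses enter the result, so splitting the same mass over more nodes within a group leaves the probability unchanged.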
5.3. Graph restrictions
Let us consider the restriction of our process to the square $[0, \alpha]^2$. For finite activity CRMs, there will be a finite number of potential nodes (jumps) in the interval $[0, \alpha]$. For infinite activity CRMs, we shall have an infinite number of potential nodes. We are interested in the properties of the process as $\alpha$ grows, where we can think of $\alpha$ as representing time and observing the process as new potential nodes and any resulting edges enter the network. We note that, in the limit of $\alpha \to \infty$, the number of edges approaches $\infty$ since $W(\mathbb{R}_+) = \infty$ almost surely.
Let $D_\alpha$ and $Z_\alpha$ be the restrictions of $D$ and $Z$ respectively to the square $[0, \alpha]^2$. Then, $(D_\alpha)_{\alpha \ge 0}$ and $(Z_\alpha)_{\alpha \ge 0}$ are measure-valued stochastic processes, indexed by $\alpha$. We also denote by $W_\alpha$ and $\lambda_\alpha$ the corresponding CRM and Lebesgue measure on $[0, \alpha]$. In what follows, our interests are in studying how the following quantities vary with $\alpha$:

(a) $N_\alpha$, the number of nodes with degree at least one in the network, and
(b) $N_\alpha^{(e)}$, the number of edges in the undirected network.

We refer to $N_\alpha$ as the number of observed nodes. In our construction, recall that $(N_\alpha)_{\alpha \ge 0}$ and $(N_\alpha^{(e)})_{\alpha \ge 0}$ are non-decreasing, integer-valued stochastic processes corresponding to the number of nodes with at least one connection in $Z_\alpha$ and the number of edges in $Z_\alpha$ respectively. Formally,

\[
N_\alpha = \mathrm{card}(\{\theta_i \in [0, \alpha] \mid Z(\{\theta_i\} \times [0, \alpha]) > 0\}), \tag{17}
\]

\[
N_\alpha^{(e)} = Z[\{(x, y) \in \mathbb{R}_+^2 \mid 0 \le x \le y \le \alpha\}]. \tag{18}
\]

The two processes have the same jump times, which correspond to the addition of one or more new nodes with at least one connection in the graph. An example of these processes is represented in Fig. 6. In later sections we use $Z_\alpha^* = Z_\alpha([0, \alpha]^2)$ to denote the total mass on $[0, \alpha]^2$, and similarly for $D_\alpha^*$ and $W_\alpha^*$.

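Equations (17) and (18) translate directly into a counting routine. The sketch below is illustrative only: the function name and the toy edge list are assumptions of ours, with each undirected edge given as a pair of node locations.

```python
def observed_counts(edges, alpha):
    """Return (N_alpha, N_alpha^(e)) for the restriction to [0, alpha]^2.

    edges: iterable of (theta_i, theta_j) location pairs; self-loops allowed.
    """
    # Keep edges with both endpoints in [0, alpha]; sorting each pair counts an
    # undirected edge once, matching the region 0 <= x <= y <= alpha in (18).
    kept = {tuple(sorted(e)) for e in edges if max(e) <= alpha}
    # Observed nodes are the locations with at least one kept edge, as in (17).
    nodes = {theta for pair in kept for theta in pair}
    return len(nodes), len(kept)

# Hypothetical edges; (1.2, 0.3) duplicates (0.3, 1.2) and (0.3, 0.3) is a self-loop.
edges = [(0.3, 1.2), (1.2, 0.3), (0.3, 0.3), (0.9, 2.5), (1.7, 2.5)]
counts = [observed_counts(edges, a) for a in (1.0, 2.0, 3.0)]
```

Evaluating the function on an increasing grid of $\alpha$ values traces out the two non-decreasing counting processes.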
5.4. Sparsity
In this section we state the sparsity properties of our graph model, which relate to the properties of the Lévy measure $\rho$. In particular, we are interested in the relative asymptotic behaviour of the number of edges $N_\alpha^{(e)}$ with respect to the number of observed nodes $N_\alpha$ as $\alpha \to \infty$. Henceforth, we consider $\int_0^\infty \rho(dw) > 0$, since the case of $\int_0^\infty \rho(dw) = 0$ trivially gives $N_\alpha^{(e)} = N_\alpha = 0$ almost surely.
In theorem 2 we characterize the sparsity of the graph with respect to the properties of its Lévy measure: graphs obtained from infinite activity CRMs are sparse, whereas graphs obtained from finite activity CRMs are dense. The rate of growth can be further specified when $\rho$ is a regularly varying Lévy measure (Feller, 1971; Karlin, 1967; Gnedin et al., 2006, 2007), as defined in Appendix A.2. We follow the notation of Janson (2011) for probability asymptotics (see Appendix C.1 for details).

Theorem 2. Consider a point process $Z$ representing an undirected graph. Let $N_\alpha^{(e)}$ be the number of edges and $N_\alpha$ be the number of observed nodes in the point process restriction $Z_\alpha$ (see equations (17) and (18)). Assume that the defining Lévy measure is such that $\int_0^\infty w\, \rho(dw) < \infty$. If the CRM $W$ is finite activity, i.e.

\[
\int_0^\infty \rho(dw) < \infty,
\]

then the number of edges scales quadratically with the number of observed nodes

\[
N_\alpha^{(e)} = \Theta(N_\alpha^2) \tag{19}
\]

almost surely as $\alpha \to \infty$, implying that the graph is dense.
If the CRM is infinite activity, i.e.

\[
\int_0^\infty \rho(dw) = \infty,
\]

then the number of edges scales subquadratically with the number of observed nodes

\[
N_\alpha^{(e)} = o(N_\alpha^2) \tag{20}
\]

almost surely as $\alpha \to \infty$, implying that the graph is sparse.
Furthermore, if the Lévy measure $\rho$ is regularly varying (see definition 1 in Appendix A.2), with exponent $\sigma \in (0, 1)$ and slowly varying function $\ell$ satisfying $\liminf_{t \to \infty} \ell(t) > 0$, then

\[
N_\alpha^{(e)} = O(N_\alpha^{2/(1+\sigma)}) \quad \text{almost surely as } \alpha \to \infty. \tag{21}
\]

Fig. 6. Example of point process $Z$ and above it the associated integer-valued stochastic processes for the number of observed nodes $(N_\alpha)_{\alpha \ge 0}$ and edges $(N_\alpha^{(e)})_{\alpha \ge 0}$

Theorem 2 is a direct consequence of two theorems that we state now and prove in Appendix C. The first theorem states that the number of edges grows quadratically with $\alpha$, whereas the second states that the number of nodes scales superlinearly with $\alpha$ for infinite activity CRMs, and linearly otherwise.

Theorem 3. Consider the point process $Z$. If $\int_0^\infty w\, \rho(dw) < \infty$, then the number of edges in $Z_\alpha$ grows quadratically with $\alpha$:

\[
N_\alpha^{(e)} = \Theta(\alpha^2) \tag{22}
\]

almost surely. Otherwise, $N_\alpha^{(e)} = \Omega(\alpha^2)$ almost surely.

Theorem 4. Consider the point process $Z$. Then

\[
N_\alpha =
\begin{cases}
\Theta(\alpha) & \text{if $W$ is a finite activity CRM,}\\
\omega(\alpha) & \text{if $W$ is an infinite activity CRM}
\end{cases}
\tag{23}
\]

almost surely as $\alpha \to \infty$. In words, the number of nodes with degree at least 1 in $Z_\alpha$ scales linearly with $\alpha$ for finite activity CRMs and superlinearly with $\alpha$ for infinite activity CRMs. Furthermore, for a regularly varying Lévy measure with slowly varying function $\ell$ such that $\liminf_{t \to \infty} \ell(t) > 0$, we have

\[
N_\alpha = \Omega(\alpha^{\sigma+1}) \quad \text{almost surely as } \alpha \to \infty. \tag{24}
\]

We finally give the expressions of the expectations for the number of edges and nodes in the model. The proof is given in Appendix C.4. (Equations (26) and (27) could alternatively be derived as particular cases of the results in Veitch and Roy (2015).)
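
Theorem 5. The expected number of edges $D_\alpha^*$ in the multigraph, edges $N_\alpha^{(e)}$ in the undirected graph and observed nodes $N_\alpha$ are given as follows:

\[
E[D_\alpha^*] = \alpha^2 \left\{ \int_0^\infty w\, \rho(dw) \right\}^2 + \alpha \int_0^\infty w^2 \rho(dw), \tag{25}
\]

\[
E[N_\alpha^{(e)}] = \frac{\alpha^2}{2} \int_0^\infty \psi(2w)\, \rho(dw) + \alpha \int_0^\infty \{1 - \exp(-w^2)\}\, \rho(dw), \tag{26}
\]

\[
E[N_\alpha] = \alpha \int_0^\infty [1 - \exp\{-w^2 - \alpha\, \psi(2w)\}]\, \rho(dw), \tag{27}
\]

where $\psi(t) = \int_0^\infty \{1 - \exp(-wt)\}\, \rho(dw)$ is the Laplace exponent. Additionally, if $\rho$ is a regularly varying Lévy measure with exponent $\sigma \in [0, 1)$ and slowly varying function $\ell$, and $\int_0^\infty w\, \rho(dw) < \infty$ then

\[
E[N_\alpha] \sim \alpha^{1+\sigma}\, \ell(\alpha)\, \Gamma(1 - \sigma) \left\{ 2 \int_0^\infty w\, \rho(dw) \right\}^{\sigma}. \tag{28}
\]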
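
As a quick check of equations (25)-(27), one can substitute the simplest Lévy measure considered in this paper, $\rho(dw) = \delta_{w_0}(dw)$ (the Poisson process case of Section 6.1). The worked example below is ours rather than the paper's; it reads equation (26) as $(\alpha^2/2)\int \psi(2w)\rho(dw) + \alpha\int\{1 - \exp(-w^2)\}\rho(dw)$ and uses $\psi(t) = 1 - \exp(-w_0 t)$, which is what the definition of the Laplace exponent gives for this measure:

\[
\begin{aligned}
E[D_\alpha^*] &= \alpha^2 w_0^2 + \alpha w_0^2,\\
E[N_\alpha^{(e)}] &= \frac{\alpha^2}{2}\{1 - \exp(-2 w_0^2)\} + \alpha\{1 - \exp(-w_0^2)\},\\
E[N_\alpha] &= \alpha\,[1 - \exp\{-w_0^2 - \alpha(1 - \exp(-2 w_0^2))\}].
\end{aligned}
\]

Both edge counts grow like $\alpha^2$ while $E[N_\alpha]$ grows like $\alpha$, in line with theorems 3 and 4 for a finite activity CRM.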

5.5. Simulation
5.5.1. Direct simulation of graph restrictions
By definition, the directed multigraph restriction $D_\alpha$ is drawn from a Poisson process with finite mean measure $W_\alpha \times W_\alpha$, where $W_\alpha \sim \mathrm{CRM}(\rho, \lambda_\alpha)$. Leveraging standard properties of the CRM and Poisson process, we can first simulate the total number of directed edges $D_\alpha^*$ based on the total mass $W_\alpha^*$:

\[
D_\alpha^* \mid W_\alpha^* \sim \mathrm{Poisson}(W_\alpha^{*2}). \tag{29}
\]

For $k = 1, \dots, D_\alpha^*$ a particular edge is drawn by sampling a pair of nodes

\[
U_{kj} \mid W_\alpha \overset{\mathrm{IID}}{\sim} \frac{W_\alpha}{W_\alpha^*}, \quad j = 1, 2, \tag{30}
\]

where $W_\alpha / W_\alpha^*$ is called a normalized CRM. We form directed edges $(U_{k1}, U_{k2})$, resulting in

\[
D_\alpha = \sum_{k=1}^{D_\alpha^*} \delta_{(U_{k1}, U_{k2})}. \tag{31}
\]

Because of the discreteness of $W_\alpha$, there will be ties between the $(U_{k1}, U_{k2})$, and the number of such ties corresponds to the multiplicity of that edge. In particular, a total of $2D_\alpha^*$ nodes $U_{kj}$ are drawn but result in some $N_\alpha \le 2D_\alpha^*$ distinct values. We overload the notation $N_\alpha$ here because this quantity also corresponds to the number of nodes with degree at least 1 in the resulting undirected network. Recall that the undirected network construction simply forms an undirected edge between a set of nodes if there is at least one directed edge between them. If we consider unordered pairs $\{U_{k1}, U_{k2}\}$, the number of such unique pairs takes a number $N_\alpha^{(e)} \le D_\alpha^*$ of distinct values, where $N_\alpha^{(e)}$ corresponds to the number of edges in the undirected network.
The construction above enables us to re-express our Cox process model in terms of normalized CRMs (Regazzini et al., 2003). This is very attractive both practically and theoretically. As we show in Section 6 for special cases of CRMs, one can use the results surrounding normalized CRMs to derive an exact simulation technique for our directed and undirected graphs.
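
The steps of equations (29)-(31) can be sketched as follows. This is a minimal illustration under a strong assumption: a short, fixed list of jumps stands in for a draw of $W_\alpha$ (so it only covers the finite activity situation), and all function names and toy values are ours.

```python
import random
from collections import Counter

def sample_poisson(mean, rng):
    """Poisson(mean) draw: count unit-rate exponential arrivals before `mean`."""
    count, t = 0, rng.expovariate(1.0)
    while t < mean:
        count += 1
        t += rng.expovariate(1.0)
    return count

def simulate_directed_multigraph(thetas, weights, rng):
    """Draw D_alpha given jumps (theta_i, w_i), following (29)-(31)."""
    total_mass = sum(weights)                        # W*_alpha
    n_edges = sample_poisson(total_mass ** 2, rng)   # D*_alpha, equation (29)
    # Equation (30): 2 D*_alpha endpoints drawn IID from the normalized weights.
    ends = rng.choices(thetas, weights=weights, k=2 * n_edges)
    # Equation (31): Counter values are the multiplicities n_ij of (U_k1, U_k2).
    return Counter(zip(ends[0::2], ends[1::2]))

def to_undirected(multigraph):
    """Unordered pairs joined by at least one directed edge."""
    return {tuple(sorted(pair)) for pair in multigraph}

rng = random.Random(0)
thetas, weights = [0.4, 1.1, 2.7], [0.5, 1.0, 1.5]   # hypothetical jumps
D = simulate_directed_multigraph(thetas, weights, rng)
Z = to_undirected(D)
```

The cost is driven by the number of directed edges rather than by the number of node pairs, which is the practical appeal of this representation.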

Remark 1. The construction above enables us to draw connections with the configuration model (Bollobás, 1980; Newman, 2010), which proceeds as follows. First, the degree $k_i$ of each node $i = 1, \dots, n$ is specified such that the sum of $k_i$ is an even number. Each node $i$ is given a total of $k_i$ stubs, or demi-edges. Then, we repeatedly choose pairs of stubs uniformly at random, without replacement, and connect the selected pairs to form an edge. The simple graph is obtained either by discarding the multiple edges and self-loops (an erased configuration model), or by repeating the above sampling until obtaining a simple graph. In our case, we have an infinite set of (potential) nodes and do not prespecify the node degrees. Furthermore, each node in the pair $(U_{k1}, U_{k2})$ is drawn from a normalized CRM rather than the pair being selected uniformly at random. However, at a high level, there is a similar flavour to our construction.

5.5.2. Urn-based simulation of graph restrictions
We now describe an urn formulation that allows us to obtain a finite dimensional generative process. Recall that, in practice, we cannot sample $W_\alpha \sim \mathrm{CRM}(\rho, \lambda_\alpha)$ if the CRM is infinite activity since there will be an infinite number of jumps.
Let $(U'_1, \dots, U'_{2D_\alpha^*}) = (U_{11}, U_{12}, \dots, U_{D_\alpha^* 1}, U_{D_\alpha^* 2})$. For some classes of Lévy measure $\rho$, it is possible to integrate out the normalized CRM $\mu_\alpha = W_\alpha / W_\alpha^*$ in expression (30) and derive the conditional distribution of $U'_{n+1}$ given $(W_\alpha^*, U'_1, \dots, U'_n)$. We first recall some background on random partitions. As $\mu_\alpha$ is discrete with probability 1, variables $U'_1, \dots, U'_n$ take $k \le n$ distinct values $\tilde\theta_j$, with multiplicities $1 \le m_j \le n$. The distribution on the underlying partition is usually defined in terms of an exchangeable partition probability function (EPPF) (Pitman, 1995) $\Pi_n^{(k)}(m_1, \dots, m_k \mid W_\alpha^*)$ which is symmetric in its arguments. The predictive distribution of $U'_{n+1}$ given $(W_\alpha^*, U'_1, \dots, U'_n)$ is then given in terms of the EPPF:
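
\[
U'_{n+1} \mid (W_\alpha^*, U'_1, \dots, U'_n) \sim
\frac{\Pi_{n+1}^{(k+1)}(m_1, \dots, m_k, 1 \mid W_\alpha^*)}{\Pi_n^{(k)}(m_1, \dots, m_k \mid W_\alpha^*)}\, \frac{1}{\alpha}\, \lambda_\alpha
+ \sum_{j=1}^{k} \frac{\Pi_{n+1}^{(k)}(m_1, \dots, m_j + 1, \dots, m_k \mid W_\alpha^*)}{\Pi_n^{(k)}(m_1, \dots, m_k \mid W_\alpha^*)}\, \delta_{\tilde\theta_j}. \tag{32}
\]
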
Using this urn representation, we can rewrite our generative process as

\[
\begin{aligned}
W_\alpha^* &\sim P_{W_\alpha^*},\\
D_\alpha^* \mid W_\alpha^* &\sim \mathrm{Poisson}(W_\alpha^{*2}),\\
(U_{kj})_{k=1,\dots,D_\alpha^*;\, j=1,2} \mid W_\alpha^* &\sim \text{urn process (32)},\\
D_\alpha &= \sum_{k=1}^{D_\alpha^*} \delta_{(U_{k1}, U_{k2})},
\end{aligned}
\tag{33}
\]

where $P_{W_\alpha^*}$ is the distribution of the CRM total mass $W_\alpha^*$. Representation (33) can be used to sample exactly from our graph model, assuming that we can sample from $P_{W_\alpha^*}$ and evaluate the EPPF. In Section 6 we show that this is indeed possible for specific CRMs of interest.

5.5.3. Approximate simulation of graph restrictions
If we cannot sample from $P_{W_\alpha^*}$ in expression (33) and evaluate the EPPF in expression (32), we resort to approximate simulation methods. In particular, we harness the directed multigraph representation and approximate the draw of $W_\alpha$. For our undirected graphs, we simply transform the (approximate) draw of a directed multigraph as described in Section 3.3.
One approach to approximate simulation of $W_\alpha$, which is possible for some Lévy measures $\rho$, is to resort to adaptive thinning (Lewis and Shedler, 1979; Ogata, 1981; Favaro and Teh, 2013). A related alternative approximate approach, but applicable to any Lévy measure $\rho$ satisfying condition (9), is the inverse Lévy method. This method first defines a threshold $\varepsilon$ and then samples the weights $\Omega = \{w_i \mid w_i > \varepsilon\}$ by using a Poisson measure on $[\varepsilon, \infty]$. One then simulates $D_\alpha$ using these truncated weights $\Omega$.
A naive application of this truncated method that considers sampling directed or undirected edges as in expression (12) or expression (6), respectively, can prove computationally problematic since a large number of possible edges must be considered (one Poisson or Bernoulli draw for each $(\theta_i, \theta_j)$ pair for the directed or undirected case). Instead, we can harness the Cox process representation and resulting sampling procedure of expressions (29)-(30) to sample first the total number of directed edges and then their specific instantiations. More specifically, to simulate approximately a point process on $[0, \alpha]^2$, we use the inverse Lévy method to sample

\[
\Pi_{\alpha,\varepsilon} = \{(w, \theta) \in \Pi,\ 0 < \theta \le \alpha,\ w > \varepsilon\}. \tag{34}
\]

Let $W_{\alpha,\varepsilon} = \sum_{i=1}^{K} w_i \delta_{\theta_i}$ be the associated truncated CRM and $W_{\alpha,\varepsilon}^* = W_{\alpha,\varepsilon}([0, \alpha])$ its total mass. We then sample $D_{\alpha,\varepsilon}^*$ and $U_{kj}$ as in expressions (29)-(30), and set $D_{\alpha,\varepsilon} = \sum_{k=1}^{D_{\alpha,\varepsilon}^*} \delta_{(U_{k1}, U_{k2})}$. The undirected graph measure $Z_{\alpha,\varepsilon}$ is set to the manipulation of $D_{\alpha,\varepsilon}$ as in expression (12).
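
The truncation step (34) can be sketched for one concrete case. The sketch assumes the stable process of Section 6.3 (GGP with $\tau = 0$), whose tail Lévy intensity is $x^{-\sigma} / \{\sigma \Gamma(1 - \sigma)\}$; the function names and parameter values are ours. It uses the standard Poisson process fact that the jumps above $\varepsilon$ on $[0, \alpha]$ number Poisson($\alpha \bar\rho(\varepsilon)$) and have IID weights with survival function $\bar\rho(w) / \bar\rho(\varepsilon)$.

```python
import math
import random

def stable_tail_intensity(x, sigma):
    """Tail Levy intensity of the stable process (GGP with tau = 0)."""
    return x ** (-sigma) / (sigma * math.gamma(1.0 - sigma))

def sample_truncated_stable(alpha, sigma, eps, rng):
    """Jumps (theta_i, w_i) of W on [0, alpha] whose weights exceed eps."""
    mean_jumps = alpha * stable_tail_intensity(eps, sigma)
    # Poisson(mean_jumps) draw: count unit-rate arrivals before mean_jumps.
    n_jumps, t = 0, rng.expovariate(1.0)
    while t < mean_jumps:
        n_jumps += 1
        t += rng.expovariate(1.0)
    jumps = []
    for _ in range(n_jumps):
        u = 1.0 - rng.random()            # uniform on (0, 1]
        w = eps * u ** (-1.0 / sigma)     # inverts the survival (w / eps)**(-sigma)
        jumps.append((rng.uniform(0.0, alpha), w))
    return jumps

rng = random.Random(1)
jumps = sample_truncated_stable(alpha=5.0, sigma=0.5, eps=0.01, rng=rng)
```

The returned jumps can then be fed to the edge-sampling step of expressions (29)-(30). Note that weights of the stable process are heavy tailed, so the total mass, and hence the number of directed edges, can be very large.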

6. Special cases

In this section, we examine the properties of various models and their link to classical random-graph models depending on the Lévy measure $\rho$. We show that, in the GGP case, the resulting graph can be either dense or sparse, with the sparsity tuned by a single hyperparameter. Furthermore, exact simulation is possible via expression (33). We focus on the undirected graph case, but similar results can be obtained for directed multigraphs and bipartite graphs.

6.1. Poisson process
Consider a Poisson process with fixed increments $w_0 > 0$:

\[
\rho(dw) = \delta_{w_0}(dw).
\]

This measure defines a finite activity CRM. Recalling the definition $\bar\rho(x) = \int_x^\infty \rho(dw)$, in this case, we have

\[
\bar\rho(x) =
\begin{cases}
1 & \text{if } x < w_0,\\
0 & \text{otherwise.}
\end{cases}
\]

Ignoring self-edges, the graph construction can be described as follows. To sample $W_\alpha \sim \mathrm{CRM}(\rho, \lambda_\alpha)$, we generate $n \sim \mathrm{Poisson}(\alpha)$ and then sample $\theta_i \sim \mathrm{Unif}([0, \alpha])$ for $i = 1, \dots, n$. We then sample edges according to expression (6): for $0 < i < j \le n$, set $z_{ij} = z_{ji} = 1$ with probability $1 - \exp(-2w_0^2)$ and 0 otherwise. The model is therefore equivalent to the Erdős-Rényi random-graph model $G(n, p)$ with $n \sim \mathrm{Poisson}(\alpha)$ and $p = 1 - \exp(-2w_0^2)$. Therefore, this choice of $\rho$ leads to a dense graph, as our theory suggests, where the number of edges grows quadratically with the number of nodes $n$.
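
The construction just described can be sketched in a few lines. The function name and parameter values below are illustrative assumptions; self-edges are ignored, as in the text.

```python
import math
import random
from itertools import combinations

def sample_poisson_crm_graph(alpha, w0, rng):
    """Erdos-Renyi-type graph with n ~ Poisson(alpha) and p = 1 - exp(-2 w0^2)."""
    # n ~ Poisson(alpha): count unit-rate exponential arrivals before alpha.
    n, t = 0, rng.expovariate(1.0)
    while t < alpha:
        n += 1
        t += rng.expovariate(1.0)
    thetas = [rng.uniform(0.0, alpha) for _ in range(n)]   # node locations
    p = 1.0 - math.exp(-2.0 * w0 ** 2)                     # common edge probability
    # One Bernoulli(p) draw per unordered pair i < j.
    edges = [(i, j) for i, j in combinations(range(n), 2) if rng.random() < p]
    return thetas, edges, p

rng = random.Random(0)
thetas, edges, p = sample_poisson_crm_graph(alpha=20.0, w0=0.5, rng=rng)
```

Because every pair is connected with the same probability, the expected number of edges is $p\,n(n-1)/2$, which is quadratic in $n$.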

6.2. Compound Poisson process
A compound Poisson process is a process where

\[
\rho(dw) = h(w)\, dw
\]

and $h : \mathbb{R}_+ \to \mathbb{R}_+$ is such that $\int_0^\infty h(w)\, dw = 1$ and defines a finite activity CRM. In this case, we have $\bar\rho(x) = 1 - H(x)$ where $H$ is the distribution function that is associated with $h$. Here, we arrive at a framework that is similar to the standard graphon. Leveraging the Kallenberg representation (16), we first sample $n \sim \mathrm{Poisson}(\alpha)$. Then, for $1 \le i < j \le n$ we set $z_{ij} = z_{ji} = 1$ with probability $M(U_i, U_j)$ where $U_i$ are uniform $[0, 1]$ variables and $M$ is defined by

\[
M(U_i, U_j) = 1 - \exp\{-2 H^{-1}(U_i)\, H^{-1}(U_j)\}.
\]

This representation is the same as with the Aldous-Hoover theorem, except that the number of nodes is random and follows a Poisson distribution. As such, the resulting random graph is either trivially empty or dense, again agreeing with our theory.

6.3. Generalized gamma process
The GGP (Hougaard, 1986; Aalen, 1992; Lee and Whitmore, 1993; Brix, 1999) is a flexible two-parameter CRM with interpretable parameters and remarkable conjugacy properties (James, 2002; Lijoi and Prünster, 2003; Lijoi et al., 2007; Caron et al., 2014). The process is also known as the Hougaard process (Hougaard, 1986) when $\lambda$ is the Lebesgue measure, as in this paper, but we shall use the more standard term GGP in the rest of this paper. The Lévy measure of the GGP is given by

\[
\rho(dw) = \frac{1}{\Gamma(1 - \sigma)}\, w^{-1-\sigma} \exp(-\tau w)\, dw, \tag{35}
\]

where the two parameters $(\sigma, \tau)$ satisfy

\[
(\sigma, \tau) \in (-\infty, 0] \times (0, +\infty) \quad \text{or} \quad (\sigma, \tau) \in (0, 1) \times [0, +\infty). \tag{36}
\]

The GGP has different properties if $\sigma \ge 0$ or $\sigma < 0$. When $\sigma < 0$, the GGP is a finite activity CRM (i.e. a compound Poisson process); more precisely, the number of jumps in $[0, \alpha]$ is finite with probability 1 and drawn from a Poisson distribution with rate $-(\alpha/\sigma)\tau^{\sigma}$ whereas the jumps $w_i$ are IID gamma$(-\sigma, \tau)$.
When $\sigma \ge 0$, the GGP has an infinite number of jumps over any interval $[s, t]$. It includes as special cases the gamma process ($\sigma = 0$, $\tau > 0$), the stable process ($\sigma \in (0, 1)$, $\tau = 0$) and the inverse Gaussian process ($\sigma = \frac{1}{2}$, $\tau > 0$).
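
For $\sigma < 0$ the description above gives a direct sampler for the jumps on $[0, \alpha]$. The sketch below is illustrative: the function name and parameter values are ours, and it assumes that gamma$(-\sigma, \tau)$ denotes shape $-\sigma$ and rate $\tau$, which is what normalizing the Lévy measure (35) yields.

```python
import random

def sample_finite_activity_ggp(alpha, sigma, tau, rng):
    """Jumps (theta_i, w_i) of a GGP on [0, alpha] when sigma < 0."""
    if not (sigma < 0 and tau > 0):
        raise ValueError("requires sigma < 0 and tau > 0")
    mean_jumps = -(alpha / sigma) * tau ** sigma   # Poisson mean of the jump count
    # Poisson(mean_jumps) draw: count unit-rate arrivals before mean_jumps.
    n_jumps, t = 0, rng.expovariate(1.0)
    while t < mean_jumps:
        n_jumps += 1
        t += rng.expovariate(1.0)
    # random.gammavariate takes (shape, scale); rate tau means scale 1 / tau.
    return [(rng.uniform(0.0, alpha), rng.gammavariate(-sigma, 1.0 / tau))
            for _ in range(n_jumps)]

rng = random.Random(2)
jumps = sample_finite_activity_ggp(alpha=10.0, sigma=-1.0, tau=2.0, rng=rng)
```

With these values the expected number of jumps is $-(10 / -1) \cdot 2^{-1} = 5$, and each weight is exponential with mean $1/2$.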
The tail Lévy intensity of the GGP is given by

\[
\bar\rho(x) = \int_x^\infty \frac{1}{\Gamma(1 - \sigma)}\, w^{-1-\sigma} \exp(-\tau w)\, dw =
\begin{cases}
\dfrac{\tau^{\sigma}\, \Gamma(-\sigma, \tau x)}{\Gamma(1 - \sigma)} & \text{if } \tau > 0,\\[2ex]
\dfrac{x^{-\sigma}}{\Gamma(1 - \sigma)\, \sigma} & \text{if } \tau = 0,
\end{cases}
\]

where $\Gamma(a, x)$ is the incomplete gamma function. Example realizations of the process for various values of $\sigma \ge 0$ are displayed in Fig. 7 alongside a realization of an Erdős-Rényi graph.

Fig. 7. Sample graphs: (a) Erdős-Rényi graph $G(n, p)$ with $n = 1000$ and $p = 0.05$, and GGP graphs GGP$(\alpha, \tau, \sigma)$ with $\alpha = 100$, $\tau = 2$ and (b) $\sigma = 0$, (c) $\sigma = 0.5$ and (d) $\sigma = 0.8$ (the size of a node is proportional to its degree; the graphs were generated with the software Gephi (Bastian et al., 2009))
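
For the stable case ($\tau = 0$) the tail intensity $x^{-\sigma} / \{\Gamma(1 - \sigma)\sigma\}$ inverts in closed form, which makes the function $M$ of Section 5.1 explicit. The sketch below is only an illustration; the function names are ours, and equality of the two arguments is used as a stand-in for the diagonal case $i = j$.

```python
import math

def stable_inverse_tail(y, sigma):
    """Inverse of rhobar(x) = x**(-sigma) / (sigma * Gamma(1 - sigma))."""
    return (y * sigma * math.gamma(1.0 - sigma)) ** (-1.0 / sigma)

def link_probability(v_i, v_j, sigma):
    """M(v_i, v_j) for the stable process, with w = rhobar^{-1}(v)."""
    w_i = stable_inverse_tail(v_i, sigma)
    w_j = stable_inverse_tail(v_j, sigma)
    if v_i == v_j:
        return 1.0 - math.exp(-w_i ** 2)        # diagonal (self-edge) case
    return 1.0 - math.exp(-2.0 * w_i * w_j)     # off-diagonal case

m_near = link_probability(1.0, 2.0, sigma=0.5)
m_far = link_probability(2.0, 3.0, sigma=0.5)
```

Since the inverse tail is decreasing, small values of $\vartheta$ correspond to large sociabilities and hence to larger link probabilities, consistent with the shading described for Fig. 5(b).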

6.3.1. Exact sampling via an urn approach
In the case $\sigma > 0$, $W_\alpha^*$ is an exponentially tilted stable random variable, for which exact samplers exist (Devroye, 2009). As shown by Pitman (2003) (see also Lijoi et al. (2008)), the EPPF conditional on the total mass $W_\alpha^* = t$ depends only on the parameter $\sigma$ (and not $\tau$ and $\alpha$) and is given by

\[
\Pi_n^{(k)}(m_1, \dots, m_k \mid t) = \frac{\sigma^k\, t^{-n}}{\Gamma(n - k\sigma)\, g_\sigma(t)} \int_0^t s^{n - k\sigma - 1}\, g_\sigma(t - s)\, ds\, \prod_{i=1}^{k} \frac{\Gamma(m_i - \sigma)}{\Gamma(1 - \sigma)}, \tag{37}
\]

where $g_\sigma$ is the probability density function of the positive stable distribution. Plugging the EPPF (37) into expression (32) yields the urn process for sampling in the GGP case. In particular, we can use the generative process (33) to sample exactly from the model.
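In the special case of the gamma process ($\sigma = 0$), $W_\alpha^*$ is a gamma$(\alpha, \tau)$ random variable and the resulting urn process is given by (Blackwell and MacQueen, 1973; Pitman, 1996):

\[
U'_{n+1} \mid (W_\alpha^*, U'_1, \dots, U'_n) \sim \frac{\alpha}{\alpha + n}\, \frac{\lambda_\alpha}{\alpha} + \sum_{j=1}^{k} \frac{m_j}{\alpha + n}\, \delta_{\tilde\theta_j}. \tag{38}
\]

When $\sigma < 0$, the GGP is a compound Poisson process and can thus be sampled exactly.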
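
The urn (38) for the gamma process can be sketched directly. The function name and parameter values are ours; the sketch relies on the fact that picking one of the $n$ previous draws uniformly at random selects the distinct value $\tilde\theta_j$ with probability $m_j / n$.

```python
import random

def gamma_process_urn(n_draws, alpha, rng):
    """Draw U'_1, ..., U'_n sequentially from the urn of equation (38)."""
    draws = []
    for n in range(n_draws):
        if rng.random() < alpha / (alpha + n):
            # New location, drawn from the normalized Lebesgue measure on [0, alpha].
            draws.append(rng.uniform(0.0, alpha))
        else:
            # Existing location theta_j, overall probability m_j / (alpha + n).
            draws.append(rng.choice(draws))
    return draws

rng = random.Random(3)
draws = gamma_process_urn(n_draws=50, alpha=2.0, rng=rng)
# Consecutive draws pair up into directed edges (U_k1, U_k2), as in Section 5.5.2.
edges = list(zip(draws[0::2], draws[1::2]))
```

In the full generative process (33) the number of draws would be $2D_\alpha^*$, with $D_\alpha^*$ sampled after the total mass; here it is fixed for illustration.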

6.3.2. Sparsity
Appealing to theorem 2, we use the following facts about the GGP to characterize the sparsity properties of this special case.

(a) For $\sigma < 0$, the CRM is finite activity with $\int_0^\infty w\, \rho(dw) < \infty$; thus theorem 2 implies that the graph is dense.
(b) When $\sigma \ge 0$ the CRM is infinite activity; moreover, for $\tau > 0$, $\int_0^\infty w\, \rho(dw) < \infty$, and thus theorem 2 implies that the graph is sparse.
(c) For $\sigma > 0$, the tail Lévy intensity has the asymptotic behaviour

\[
\bar\rho(x) \overset{x \downarrow 0}{\sim} \frac{1}{\sigma\, \Gamma(1 - \sigma)}\, x^{-\sigma}
\]

and, as such, is regularly varying with exponent $\sigma$ and constant slowly varying function.
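
We thus conclude that

\[
N_\alpha^{(e)} =
\begin{cases}
\Theta(N_\alpha^2) & \text{if } \sigma < 0,\\
o(N_\alpha^2) & \text{if } \sigma = 0,\ \tau > 0,\\
O(N_\alpha^{2/(1+\sigma)}) & \text{if } \sigma \in (0, 1),\ \tau > 0,
\end{cases}
\tag{39}
\]

almost surely as $\alpha \to \infty$, i.e. the GGP parameter $\sigma$ tunes the sparsity of the graph: the underlying graph is sparse if $\sigma \ge 0$ and dense otherwise.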

Remark 2. The proof technique of theorem 2 requires $\int_0^\infty w\, \rho(dw) < \infty$ and thus excludes the stable process ($\tau = 0$, $\sigma \in (0, 1)$), although we conjecture that the graph is also sparse in that case.
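Additionally, applying theorem 5, we obtain

\[
E[N_\alpha] \sim
\begin{cases}
-\dfrac{\alpha}{\sigma}\, \tau^{\sigma} & \text{if } \sigma < 0,\\[1.5ex]
\alpha \log(\alpha) & \text{if } \sigma = 0,\\[1.5ex]
\dfrac{\alpha^{1+\sigma}\, 2^{\sigma}\, \tau^{-\sigma(1-\sigma)}}{\sigma} & \text{if } \sigma > 0,\ \tau > 0.
\end{cases}
\]
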
6.3.3. Empirical analysis of graph properties
For the GGP-based formulation, we provide an empirical analysis of our network properties in Fig. 8 by simulating undirected graphs by using the approach that was described in Section 5.5 for various values of $\sigma$ and $\tau$. We compare with an Erdős-Rényi random graph, preferential attachment (Barabási and Albert, 1999) and the Bayesian non-parametric network model of Lloyd et al. (2012). The particular features that we explore are as follows.

(a) Degree distribution: Fig. 8(a) suggests empirically that the model can exhibit power law behaviour providing a heavy-tailed degree distribution. As shown in Fig. 8(b), the model can also handle an exponential cut-off in the tails of the degree distribution, which is an attractive property (Clauset et al., 2009; Olhede and Wolfe, 2012).
(b) Number of degree 1 nodes: Fig. 8(c) examines the fraction of degree 1 nodes versus the number of nodes.
(c) Sparsity: Fig. 8(d) plots number of edges versus number of nodes. The larger $\sigma$, the sparser the graph is. In particular, for the GGP random-graph model, we have network growth at a rate $O(n^a)$ for $1 < a < 2$ whereas the Erdős-Rényi (dense) graph grows as $\Theta(n^2)$.

Fig. 8. Examination of the GGP undirected network properties (averaging over graphs with various $\alpha$) in comparison with an Erdős-Rényi $G(n, p)$ model with $p = 0.05$, the preferential attachment model of Barabási and Albert (1999) and the non-parametric formulation of Lloyd et al. (2012): degree distribution on a log-log-scale for (a) various values of $\sigma$ ($\sigma = 0.2$; $\sigma = 0.5$; $\sigma = 0.8$) ($\tau = 10^{-2}$) and (b) various values of $\tau$ ($\tau = 10^{-1}$; $\tau = 1$; $\tau = 5$) ($\sigma = 0.5$) for the GGP; (c) number of nodes with degree 1 versus number of nodes on a log-log-scale ($\sigma = 0.2$; $\sigma = 0.5$; $\sigma = 0.8$) (note that the Lloyd method leads to dense graphs such that no node has only degree 1); (d) number of edges versus number of nodes ($\sigma = 0.2$; $\sigma = 0.5$; $\sigma = 0.8$) (here we note growth at a rate $o(n^2)$ for our GGP graph models, and $\Theta(n^2)$ for the Erdős-Rényi and Lloyd models (dense graphs))

6.3.4. Interpretation of hyperparameters
On the basis of the properties derived and illustrated empirically in this section, we see that our hyperparameters have the following interpretations.

(a) $\sigma$: from Figs 8(a) and (d), $\sigma$ relates to the slope of the degree distribution in its power law regime and the overall network sparsity. Increasing $\sigma$ leads to higher power law exponent and sparser networks.
(b) $\alpha$: from theorem 5, $\alpha$ provides an overall scale that affects the number of nodes and directed interactions, with larger $\alpha$ leading to larger networks.
(c) $\tau$: from Fig. 8(b), $\tau$ determines the exponential decay of the tails of the degree distribution, with small $\tau$ looking like pure power law. This is intuitive from the form of $\rho(dw)$ in equation (35), where we see that $\tau$ affects large weights more than small weights.

7. Posterior characterization and inference

In this section, we consider the posterior characterization and MCMC inference of parameters and hyperparameters in our statistical network models.
Assume that we have observed a set of undirected connections $(z_{ij})_{1 \le i,j \le N_\alpha}$ or directed connections $(n_{ij})_{1 \le i,j \le N_\alpha}$ where $N_\alpha$ is the observed number of nodes with at least one connection. Without loss of generality, we assume that the locations of these nodes $0 < \theta_1 < \dots < \theta_{N_\alpha} < \alpha$ are ordered, and we write $w_i = W(\{\theta_i\})$ as their associated sociability parameters. For simplicity, we are overloading notation here with the unordered nodes in $W = \sum_i w_i \delta_{\theta_i}$ of equation (7).
We aim to infer the sociability parameters $w_i$, $i = 1, \dots, N_\alpha$, for each of the observed nodes. We also aim to infer the sociability parameters of the nodes with no connections (the difference between the set of potential nodes and those with observed interactions). We refer to these as unobserved nodes. Under our framework, the number of such nodes is either finite but unknown or infinite. The observed connections, however, provide information about only the sum of their sociabilities, denoted $w_*$. The node locations $\theta_i$ of both observed and unobserved nodes are also not likelihood identifiable and are thus ignored. We additionally aim to estimate $\alpha$ and the hyperparameters of the Lévy intensity $\rho$ of the CRM; we write $\phi$ for the set of hyperparameters. We therefore aim to approximate the posterior $p(w_1, \dots, w_{N_\alpha}, w_*, \phi \mid (z_{ij})_{1 \le i,j \le N_\alpha})$ for an observed undirected graph and $p(w_1, \dots, w_{N_\alpha}, w_*, \phi \mid (n_{ij})_{1 \le i,j \le N_\alpha})$ for an observed directed graph. (Formally, this density is with respect to a product measure that has a Dirac mass at 0 for $w_*$, as detailed in Appendix F.)
45
46 7.1. Directed multigraph posterior
47 In theorem 6, we characterize the posterior in the directed multigraph case. This plays a key role
48 in the undirected case that is explored in Section 7.2 as well.
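Theorem 6. For $N_\alpha\ge 1$, let $\theta_1<\dots<\theta_{N_\alpha}$ be the set of support points of the measure $D_\alpha$ such that $D_\alpha=\sum_{1\le i,j\le N_\alpha}n_{ij}\,\delta_{(\theta_i,\theta_j)}$. Let $w_i=W_\alpha(\{\theta_i\})$ and $w_*=W_\alpha^*-\sum_{i=1}^{N_\alpha}w_i$. We have

$$P\{(w_i\in dw_i)_{1\le i\le N_\alpha},\ w_*\in dw_*\mid(n_{ij})_{1\le i,j\le N_\alpha},\ \phi\}\ \propto\ \exp\Big\{-\Big(\sum_{i=1}^{N_\alpha}w_i+w_*\Big)^2\Big\}\Big\{\prod_{i=1}^{N_\alpha}w_i^{m_i}\rho(dw_i)\Big\}\,G_\alpha^*(dw_*) \tag{40}$$

where $m_i=\sum_{j=1}^{N_\alpha}(n_{ij}+n_{ji})>0$, for $i=1,\dots,N_\alpha$, are the node degrees of the multigraph and $G_\alpha^*$ is the probability distribution of the total mass $W_\alpha^*=W_\alpha([0,\alpha])$.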
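
To make the structure of equation (40) concrete, the following sketch computes the node degrees $m_i$ from a matrix of directed edge counts and evaluates the logarithm of the factors of (40) that depend on the weights. It is only an illustration under stated assumptions: $\rho$ is taken to be a generalized gamma intensity $\Gamma(1-\sigma)^{-1}w^{-1-\sigma}e^{-\tau w}$, the factor $G_\alpha^*(dw_*)$ is left out because it has no closed form in general, and the names `node_degrees` and `log_posterior_kernel` are hypothetical.

```python
import math
import numpy as np


def node_degrees(n):
    """m_i = sum_j (n_ij + n_ji) for a square matrix of directed edge counts."""
    n = np.asarray(n)
    return n.sum(axis=1) + n.sum(axis=0)


def log_posterior_kernel(w, w_star, n, sigma, tau):
    """Log of the weight-dependent factors of equation (40), up to a constant.

    The factor G*_alpha(dw_*) is deliberately omitted; rho is the assumed
    GGP intensity w**(-1 - sigma) * exp(-tau * w) / Gamma(1 - sigma).
    """
    w = np.asarray(w, dtype=float)
    m = node_degrees(n)
    log_rho = -(1.0 + sigma) * np.log(w) - tau * w - math.lgamma(1.0 - sigma)
    return float(-(w.sum() + w_star) ** 2 + np.sum(m * np.log(w) + log_rho))


# Toy multigraph on three nodes: n[i, j] counts directed edges from i to j.
n = np.array([[0, 2, 0],
              [1, 0, 0],
              [0, 3, 0]])
m = node_degrees(n)
value = log_posterior_kernel([0.5, 1.0, 0.8], 0.3, n, sigma=0.5, tau=1.0)
print(m, value)
```

The term $-(\sum_i w_i+w_*)^2$ couples all weights, whereas the remaining terms factorize over nodes.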
$$\tilde N_\alpha=\sum_{n=1}^{\lfloor\alpha\rfloor}X_n.$$

See Fig. 14 for an illustration. Clearly, $\tilde N_\alpha$ is a lower bound for the number of nodes:

$$\tilde N_\alpha\le N_\alpha\quad\text{almost surely.} \tag{58}$$

Using the notation $S^{(1)}=\bigcup_{k=1}^{\infty}A_{2k-1}$ and $S^{(2)}=\bigcup_{k=1}^{\infty}A_{2k}$, let $W^{(1)}$ and $W^{(2)}$ be respectively the restriction of $W$ to the sets $S^{(1)}$ and $S^{(2)}$. As $S^{(1)}$ and $S^{(2)}$ are non-overlapping and $W$ is a CRM, $W^{(1)}$ and $W^{(2)}$ are independent. Integrating over $W^{(1)}$ and using the marking theorem for Poisson processes (see below for more details), we obtain for $n\ge 1$

$$X_n\mid W^{(2)}\ \overset{\text{ind}}{\sim}\ \text{Poisson}\big[\tfrac12\,\psi\{W(S_n^{(2)})\}\big]. \tag{59}$$

Lemma 1 thus implies that

$$\frac{\sum_{k=1}^{n}X_k}{\tfrac12\sum_{k=1}^{n}\psi\{W(S_k^{(2)})\}}\to 1\quad\text{almost surely.} \tag{60}$$

We have $\lambda(S_n^{(2)})=n/2$ and, using the law of large numbers,

$$\frac{W(S_n^{(2)})}{n/2}\to\int_0^\infty w\,\rho(dw)\quad\text{almost surely.} \tag{61}$$

Therefore $\psi\{W(S_n^{(2)})\}\to\infty$ almost surely. Its Cesàro mean also diverges and

$$\frac1n\sum_{k=1}^{n}\psi\{W(S_k^{(2)})\}\to\infty\quad\text{almost surely,} \tag{62}$$

which, together with result (60), implies that $(1/n)\sum_{k=1}^{n}X_k\to\infty$ almost surely. We conclude that $\tilde N_\alpha=$
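The Lévy exponent $\psi$ is a strictly increasing function with $\psi(0)=0$ and $\lim_{t\to\infty}\psi(t)=\infty$ and therefore admits a well-defined inverse, denoted $\psi^{-1}:[0,\infty)\to[0,\infty)$. Using the change of variable $u=\psi(2w)$, we obtain

$$\int_0^\infty 2\psi'(2w)\exp\{-w^2-\alpha\psi(2w)\}\,\bar\rho(w)\,dw=\int_0^\infty\exp[-\{\psi^{-1}(u)/2\}^2-\alpha u]\,\bar\rho\{\psi^{-1}(u)/2\}\,du.$$

Assume that $\int_0^\infty w\,\rho(dw)<\infty$. Now note that $\psi(t)\overset{t\to0}{\sim}t\int_0^\infty w\,\rho(dw)$ and therefore $\psi^{-1}(t)\overset{t\to0}{\sim}t/\int_0^\infty w\,\rho(dw)$. If $\rho$ is a regularly varying Lévy measure, then

$$\bar\rho(x)\overset{x\to0}{\sim}l(1/x)\,x^{-\sigma}$$

where $\sigma\in[0,1)$ and $l$ is a slowly varying function, and it therefore follows from lemma 3 in Appendix D and $\psi^{-1}(0)=0$ that

$$g(u):=\exp[-\{\psi^{-1}(u)/2\}^2]\,\bar\rho\{\psi^{-1}(u)/2\}\overset{u\to0}{\sim}l(1/u)\,u^{-\sigma}\Big\{2\int_0^\infty w\,\rho(dw)\Big\}^{\sigma}$$

where $g$ is a monotone decreasing function. Applying the Tauberian theorem of proposition 2 in Appendix D, we therefore have

$$\int_0^\infty\exp(-\alpha u)\,g(u)\,du\overset{\alpha\to\infty}{\sim}\alpha^{\sigma-1}\,l(\alpha)\,\Gamma(1-\sigma)\Big\{2\int_0^\infty w\,\rho(dw)\Big\}^{\sigma}.$$

Finally, combining the above asymptotics with equation (63), and noting that

$$\int_0^\infty w\exp\{-w^2-\alpha\psi(2w)\}\,\bar\rho(w)\,dw=o(1)$$

by dominated convergence, and $\lim_{\alpha\to\infty}l(\alpha)>0$ for $\sigma\in[0,1)$, we obtain

$$E[N_\alpha]\overset{\alpha\to\infty}{\sim}\alpha^{1+\sigma}\,l(\alpha)\,\Gamma(1-\sigma)\Big\{2\int_0^\infty w\,\rho(dw)\Big\}^{\sigma}.$$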

Appendix D: Technical lemmas

The following lemma is a corollary of theorem 3, page 239, in Feller (1971).
Lemma 1. Let $(X_n)_{n=1,2,\dots}$ be a sequence of mutually independent random variables with arbitrary distribution and such that $\text{var}(X_n)\le E[X_n]<\infty$. Let $S_n=\sum_{k=1}^{n}X_k$. If

$$\lim_{n\to\infty}E[S_n]=\infty$$

then

$$S_n/E[S_n]\to 1\quad\text{almost surely as }n\to\infty.$$

Proof. Assume for simplicity that $E[X_n]>0$ for all $n$ (otherwise, consider the subsequence of random variables with strictly positive mean). We have

$$\sum_{k=1}^{n}\frac{\text{var}(X_k)}{1+E[S_k]^2}\le\sum_{k=1}^{n}\frac{E[X_k]}{1+E[S_k]^2}\le\int_0^\infty\frac{1}{1+x^2}\,dx<\infty$$

by Riemann integration. The result then follows from theorem 3, page 239, in Feller (1971) with $b_n=E[S_n]$.
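
Lemma 1 can be illustrated by simulation. The sketch below uses independent Poisson variables, for which $\text{var}(X_k)=E[X_k]$, so the condition of the lemma holds; the choice of means $\log(1+k)$ and the sample size are arbitrary illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_terms = 20000

# Independent, non-identically distributed Poisson variables whose means
# grow slowly; E[S_n] diverges, as required by the lemma.
means = np.log1p(np.arange(1, n_terms + 1))
x = rng.poisson(means)

# Running ratio S_n / E[S_n], which the lemma says tends to 1 almost surely.
ratio = np.cumsum(x) / np.cumsum(means)
print(ratio[[9, 99, 999, n_terms - 1]])
```

A single simulated path is of course not a proof, but the running ratio should settle near 1 as $n$ grows.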

Lemma 2 (relating tail Lévy intensity and Laplace exponent). (Gnedin et al. (2007), propositions 17 and 19) Let $\rho$ be a Lévy measure, $\bar\rho(x)=\int_x^\infty\rho(dw)$ be the tail Lévy intensity and $\psi(t)=\int_0^\infty\{1-\exp(-wt)\}\rho(dw)$ its Laplace exponent. The following conditions are equivalent:

$$\bar\rho(x)\overset{x\to0}{\sim}l(1/x)\,x^{-\sigma}, \tag{64}$$

$$\psi(t)\overset{t\to\infty}{\sim}\Gamma(1-\sigma)\,t^{\sigma}\,l(t) \tag{65}$$

where $0\le\sigma<1$ and $l$ is a function slowly varying at $\infty$, i.e. satisfying $l(cy)/l(y)\to1$ as $y\to\infty$ for every $c>0$.
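
As a worked instance of lemma 2, consider a generalized gamma intensity with $\sigma\in(0,1)$ and $\tau\ge0$. The form of $\rho$ used below is an assumption about equation (35), which is not reproduced in this appendix; the closed forms are standard for that intensity.

```latex
\rho(dw) = \frac{1}{\Gamma(1-\sigma)}\, w^{-1-\sigma} e^{-\tau w}\, dw ,
\qquad
\bar\rho(x) = \int_x^\infty \rho(dw)
  \;\overset{x\to 0}{\sim}\; \frac{x^{-\sigma}}{\sigma\,\Gamma(1-\sigma)} ,
\quad\text{so (64) holds with } l \equiv \frac{1}{\sigma\,\Gamma(1-\sigma)} .

\psi(t) = \frac{(t+\tau)^{\sigma} - \tau^{\sigma}}{\sigma}
  \;\overset{t\to\infty}{\sim}\; \frac{t^{\sigma}}{\sigma}
  = \Gamma(1-\sigma)\, t^{\sigma}\, l(t) ,
\quad\text{which is (65) with the same constant } l .
```

Both conditions therefore hold with the same constant slowly varying function, as the lemma requires.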

Lemma 3 (Resnick (1987), chapter 0, proposition 0.8). If $U$ is a regularly varying function at 0 with exponent $\beta\in\mathbb{R}$, and $f$ is a positive function such that $f(t)\overset{t\to0}{\sim}tc$, for some constant $0<c<\infty$, then $U\{f(t)\}\overset{t\to0}{\sim}c^{\beta}U(t)$.

Proposition 2 (Tauberian theorem). (Feller (1971), chapter XIII, section 5, theorems 3 and 4). Let $U(dw)$ be a measure on $(0,\infty)$ with ultimately monotone density $u$, i.e. monotone in some interval $(x_0,\infty)$. Assume that

$$L(t)=\int_0^\infty\exp(-tw)\,u(w)\,dw$$

exists for $t>0$. If $l$ is slowly varying at $\infty$ and $0\le a<\infty$, then the following two relationships are equivalent:

$$L(t)\overset{t\to\infty}{\sim}t^{-a}\,l(t), \tag{66}$$

$$u(x)\overset{x\to0}{\sim}\frac{1}{\Gamma(a)}\,x^{a-1}\,l\Big(\frac1x\Big). \tag{67}$$

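
A simple exact instance of proposition 2, given here only as an illustration and not taken from the paper, is the gamma-type density with $a>0$ and $l\equiv1$:

```latex
u(x) = \frac{x^{a-1}}{\Gamma(a)}
\quad\Longrightarrow\quad
L(t) = \int_0^\infty e^{-tx}\,\frac{x^{a-1}}{\Gamma(a)}\,dx = t^{-a} ,
```

so relationships (66) and (67) hold with equality rather than only asymptotically.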
Appendix E: Proof of theorem 6
The proof of theorem 6 relies on results on posterior characterization with models involving normalized CRMs. We first state a corollary of lemma 5 by Pitman (2003) and theorem 8.1 by James (2002). Similar results appear in Prünster (2002), James (2005) and James et al. (2009). The corollary involves the introduction of a discrete random variable $R$, conditional on which the CRM has strictly positive mass.

Corollary 1. Let $W_\alpha$ be a (finite or infinite) CRM on $[0,\alpha]$ without fixed atoms nor deterministic component, with mean measure $\rho(dw)\,d\theta$. Denote $W_\alpha^*=W_\alpha([0,\alpha])$, with probability distribution $G_\alpha^*$. Let $R\in\{0,1,2,\dots\}$ be a discrete random variable such that, for $r\ge0$,

$$\zeta_r(t):=\Pr(R=r\mid W_\alpha^*=t)$$

with $\zeta_0(0)=1$. The condition $\Pr(R=0\mid W_\alpha^*=0)=1$ ensures that, conditionally on $R>0$, $W_\alpha^*>0$ almost surely, and the normalized CRM below is properly defined.
Conditionally on $R=r>0$, let $X_1,\dots,X_r\overset{\text{IID}}{\sim}W_\alpha/W_\alpha^*$. Let $\theta_1,\dots,\theta_k$, $k\le r$, be the unique values in $(X_1,\dots,X_r)$, in order of appearance, with multiplicities $1\le m_j\le r$, $w_i=W_\alpha(\{\theta_i\})$ the associated weights and $\Pi_r=\{A_1,\dots,A_k\}$ with $A_i=\{j\mid X_j=\theta_i\}$ be the associated random partition of $\{1,\dots,r\}$. Let $w_*=W_\alpha^*-\sum_{i=1}^{k}w_i$. For $r>0$, we have

$$\Pr[R=r,\ \Pi_R=\{A_1,\dots,A_k\},\ (w_i\in dw_i)_{i=1,\dots,k},\ w_*\in dw_*]$$
$U_r$ take $N_\alpha\le r$ distinct values $\theta_j$, with multiplicities $1\le m_j\le r$. Let $\Pi_r=\{A_1,\dots,A_{N_\alpha}\}$ be the associated partition of $\{1,\dots,r\}$. From corollary 1 we have, for $r\in\{2,4,6,\dots\}$,

$$\Pr[(D_\alpha^*=r/2,\ \Pi_{2D_\alpha^*}=\{A_1,\dots,A_{N_\alpha}\},\ (w_i\in dw_i)_{i=1,\dots,N_\alpha},\ w_*\in dw_*)]$$
which corresponds to the product of the proposals (74) and (75). The proposal for $w_*$ can be written as an exponential tilting of the probability density function $g^*_{\alpha,\sigma,\tau}(w_*)$:

$$g^*_{\tilde\alpha,\tilde\sigma,\tilde\tau+2\sum_i w_i+2w_*}(\tilde w_*)=\frac{\exp\{-2\tilde w_*(\sum_i w_i+w_*)\}\,g^*_{\tilde\alpha,\tilde\sigma,\tilde\tau}(\tilde w_*)}{\exp\{-\tilde\alpha\,\psi_{\tilde\sigma,\tilde\tau}(2\sum_i w_i+2w_*)\}},$$

which will allow the terms involving the intractable probability density function $g^*$ to cancel in the Metropolis–Hastings ratio (73). $g^*_{\alpha,\sigma,\tau}$ is either a gamma density ($\sigma=0$), a Poisson mixture of gamma densities ($\sigma<0$) or an exponentially tilted stable density ($\sigma>0$) for which efficient samplers exist (Devroye, 2009; Hofert, 2011).
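
In the gamma case ($\sigma=0$) the exponential tilting identity can be verified numerically. The sketch below assumes that, for $\sigma=0$, the total mass has a Gamma density with shape $\alpha$ and rate $\tau$, with Laplace exponent $\psi(t)=\log(1+t/\tau)$; the tilting constant `c` stands in for $2\sum_i w_i+2w_*$, and all numerical values are arbitrary.

```python
import math


def gamma_pdf(w, shape, rate):
    """Density of a Gamma(shape, rate) distribution at w > 0."""
    log_pdf = (shape * math.log(rate) + (shape - 1.0) * math.log(w)
               - rate * w - math.lgamma(shape))
    return math.exp(log_pdf)


def psi_gamma(t, tau):
    """Laplace exponent of the gamma process with intensity w**-1 * exp(-tau * w)."""
    return math.log1p(t / tau)


alpha, tau, c, w = 3.0, 1.5, 2.0, 0.7

# Tilted density: exp(-c w) g(w) divided by the Laplace transform exp(-alpha psi(c)).
tilted = math.exp(-c * w) * gamma_pdf(w, alpha, tau) / math.exp(-alpha * psi_gamma(c, tau))
# The same density written directly with the rate shifted from tau to tau + c.
direct = gamma_pdf(w, alpha, tau + c)
print(tilted, direct)
```

The two values agree, which is the property that lets the intractable density cancel in a Metropolis–Hastings ratio when the proposal is a tilted version of the prior density.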
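Under the improper priors (43), the acceptance probability reduces to having

$$r=\exp\Big\{-\Big(\sum_{i=1}^{N_\alpha}w_i+\tilde w_*\Big)^2+\Big(\sum_{i=1}^{N_\alpha}w_i+w_*\Big)^2\Big\}\exp\Big\{-(\tilde\tau-\tau+2w_*-2\tilde w_*)\sum_{i=1}^{N_\alpha}w_i\Big\}\Big(\prod_{i=1}^{N_\alpha}w_i\Big)^{-\tilde\sigma+\sigma}\Big[\frac{\Gamma(1-\sigma)\,\psi_{\sigma,\tau}(2\tilde w_*+2\sum_i w_i)}{\Gamma(1-\tilde\sigma)\,\psi_{\tilde\sigma,\tilde\tau}(2w_*+2\sum_i w_i)}\Big]^{N_\alpha}.$$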

F.3. Step 3: update of the latent variables $\bar n_{ij}$
Concerning the latent $\bar n_{ij}$, the conditional distribution is a truncated Poisson distribution (44) from which we can sample directly. An alternative strategy, which may be more efficient for a large number of edges, is to use a Metropolis–Hastings random-walk proposal.
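
Direct sampling from a truncated Poisson distribution can be sketched as follows. The code assumes truncation at zero (a count conditioned to be at least 1) and takes the Poisson rate as an input, since equation (44) and its rate are not reproduced in this appendix; the function name `sample_zero_truncated_poisson` is hypothetical.

```python
import numpy as np


def sample_zero_truncated_poisson(rate, rng):
    """Draw from Poisson(rate) conditioned on being at least 1.

    First-arrival construction: given at least one event of a unit-rate
    Poisson process on [0, rate], the first arrival time t follows an
    exponential law truncated to [0, rate), and the number of later
    events is Poisson(rate - t).
    """
    rate = np.asarray(rate, dtype=float)
    u = rng.uniform(size=rate.shape)
    # Inverse CDF of the truncated exponential; -expm1(-rate) = 1 - exp(-rate).
    t = -np.log1p(-u * (-np.expm1(-rate)))
    # Guard against tiny negative values caused by floating-point rounding.
    remaining = np.maximum(rate - t, 0.0)
    return 1 + rng.poisson(remaining)


rng = np.random.default_rng(1)
rates = np.full(200000, 0.5)
samples = sample_zero_truncated_poisson(rates, rng)
print(samples.min(), samples.mean())
```

The sampler is exact and vectorized over an array of rates, so all latent counts attached to observed edges can be refreshed in one call. The empirical mean can be compared with the theoretical value $\lambda/(1-e^{-\lambda})$.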

References
Aalen, O. (1992) Modelling heterogeneity in survival analysis by the compound Poisson distribution. Ann. Appl. Probab., 951–972.
Adamic, L. A. and Glance, N. (2005) The political blogosphere and the 2004 US election: divided they blog. In Proc. 3rd Int. Wrkshp Link Discovery, pp. 36–43. New York: Association for Computing Machinery.
Airoldi, E. M., Blei, D., Fienberg, S. E. and Xing, E. (2008) Mixed membership stochastic blockmodels. J. Mach. Learn. Res., 9, 1981–2014.
Airoldi, E. M., Costa, T. B. and Chan, S. H. (2013) Stochastic blockmodel approximation of a graphon: theory and consistent estimation. In Advances in Neural Information Processing Systems, vol. 26.
Aldous, D. (1985) Exchangeability and related topics. In Ecole d'Eté de Probabilités de Saint-Flour XIII–1983, pp. 1–198. Springer.
Aldous, D. (1997) Brownian excursions, critical random graphs and the multiplicative coalescent. Ann. Probab., 812–854.
Aldous, D. J. (1981) Representations for partially exchangeable arrays of random variables. J. Multiv. Anal., 11, 581–598.
Arcones, M. and Giné, E. (1992) On the bootstrap of U and V statistics. Ann. Statist., 655–674.
Barabási, A. L. and Albert, R. (1999) Emergence of scaling in random networks. Science, 286, 509–512.
Bastian, M., Heymann, S. and Jacomy, M. (2009) Gephi: an open source software for exploring and manipulating networks. ICWSM, 8, 361–362.
Berger, N., Borgs, C., Chayes, J. T. and Saberi, A. (2014) Asymptotic behavior and distributional limits of preferential attachment graphs. Ann. Probab., 42, 1–40.
Bertoin, J. (2006) Random Fragmentation and Coagulation Processes. Cambridge: Cambridge University Press.
Bickel, P. J. and Chen, A. (2009) A nonparametric view of network models and Newman–Girvan and other modularities. Proc. Natn. Acad. Sci., 106, 21068–21073.
Bickel, P. J., Chen, A. and Levina, E. (2011) The method of moments and degree distributions for network models. Ann. Statist., 39, 2280–2301.
Blackwell, D. and MacQueen, J. B. (1973) Ferguson distributions via Pólya urn schemes. Ann. Statist., 353–355.
Bollobás, B. (1980) A probabilistic proof of an asymptotic formula for the number of labelled regular graphs. Eur. J. Combin., 1, 311–316.
Bollobás, B. (2001) Random Graphs. Cambridge: Cambridge University Press.
Bollobás, B., Janson, S. and Riordan, O. (2007) The phase transition in inhomogeneous random graphs. Rand. Struct. Algs, 31, 3–122.
Bollobás, B. and Riordan, O. (2009) Metrics for sparse graphs. In Surveys in Combinatorics (eds S. Huczynska, J. Mitchell and C. Roney-Dougal), pp. 211–287. Cambridge: Cambridge University Press.
Borgs, C., Chayes, J. T., Cohn, H. and Holden, N. (2016) Sparse exchangeable graphs and their limits via graphon processes. Preprint arXiv:1601.07134.
Borgs, C., Chayes, J., Cohn, H. and Zhao, Y. (2014a) An Lp theory of sparse graph convergence I: Limits, sparse random graph models, and power law distributions. Preprint arXiv:1401.2906.
Borgs, C., Chayes, J., Cohn, H. and Zhao, Y. (2014b) An Lp theory of sparse graph convergence II: LD convergence, quotients, and right convergence. Preprint arXiv:1408.0744.
Borgs, C., Chayes, J. T. and Gamarnik, D. (2016) Convergent sequences of sparse graphs: a large deviations approach. Rand. Struct. Algs, to be published.
Borgs, C., Chayes, J. T. and Lovász, L. (2010) Moments of two-variable functions and the uniqueness of graph limits. Geometr. Functnl Anal., 19, 1597–1619.
Borgs, C., Chayes, J. T., Lovász, L., Sós, V. T. and Vesztergombi, K. (2008) Convergent sequences of dense graphs I: Subgraph frequencies, metric properties and testing. Adv. Math., 219, 1801–1851.
Britton, T., Deijfen, M. and Martin-Löf, A. (2006) Generating simple random graphs with prescribed degree distribution. J. Statist. Phys., 124, 1377–1397.
Brix, A. (1999) Generalized gamma measures and shot-noise Cox processes. Adv. Appl. Probab., 31, 929–953.
Brooks, S. P. and Gelman, A. (1998) General methods for monitoring convergence of iterative simulations. J. Computnl Graph. Statist., 7, 434–455.
Bu, D., Zhao, Y., Cai, L., Xue, H., Zhu, X., Lu, H., Zhang, J., Sun, S., Ling, L. and Zhang, N. (2003) Topological structure analysis of the protein–protein interaction network in budding yeast. Nucleic Acids Res., 31, 2443–2450.
Bühlmann, H. (1960) Austauschbare stochastische Variablen und ihre Grenzwertsätze. PhD Thesis. University of California at Berkeley, Berkeley.
Caron, F. (2012) Bayesian nonparametric models for bipartite graphs. In Advances in Neural Information Processing Systems, vol. 25.
Caron, F. and Fox, E. B. (2014) Bayesian nonparametric models of sparse and exchangeable random graphs. Preprint arXiv:1401.1137.
Caron, F., Teh, Y. W. and Murphy, T. B. (2014) Bayesian nonparametric Plackett-Luce models for the analysis of preferences for college degree programmes. Ann. Appl. Statist., 8, 1145–1181.
Chen, T., Fox, E. and Guestrin, C. (2014) Stochastic gradient Hamiltonian Monte Carlo. In Proc. Int. Conf. Machine Learning, pp. 1683–1691.
Choi, D. and Wolfe, P. J. (2014) Co-clustering separately exchangeable network data. Ann. Statist., 42, 29–63.
Clauset, A., Shalizi, C. R. and Newman, M. E. J. (2009) Power-law distributions in empirical data. SIAM Rev., 51, 661–703.
Colizza, V., Pastor-Satorras, R. and Vespignani, A. (2007) Reaction–diffusion processes and metapopulation models in heterogeneous networks. Nat. Phys., 3, 276–282.
Daley, D. J. and Vere-Jones, D. (2003) An Introduction to the Theory of Point Processes, vol. I, Elementary Theory and Methods, 2nd edn. New York: Springer.
Daley, D. J. and Vere-Jones, D. (2008) An Introduction to the Theory of Point Processes, vol. II, General Theory and Structure, 2nd edn. New York: Springer.
Devroye, L. (2009) Random variate generation for exponentially and polynomially tilted stable distributions. ACM Trans. Modlng Comput. Simuln, 19, 18.
Diaconis, P. and Janson, S. (2008) Graph limits and exchangeable random graphs. Rend. Mat. Applic. Ser., VII, 33–61.
Duane, S., Kennedy, A. D., Pendleton, B. J. and Roweth, D. (1987) Hybrid Monte Carlo. Phys. Lett. B, 195, 216–222.
Durrett, R. (2007) Random Graph Dynamics. New York: Cambridge University Press.
Favaro, S. and Teh, Y. (2013) MCMC for normalized random measure mixture models. Statist. Sci., 28, 335–359.
Feller, W. (1971) An Introduction to Probability Theory and its Applications, vol. II, 2nd edn. New York: Wiley.
Ferguson, T. and Klass, M. (1972) A representation of independent increment processes without Gaussian components. Ann. Math. Statist., 43, 1634–1643.
Fienberg, S. E. (2012) A brief history of statistical models for network analysis and open challenges. J. Computnl Graph. Statist., 21, 825–839.
de Finetti, B. (1931) Funzione caratteristica di un fenomeno aleatorio. Atti R. Acad. Nazn. Linc. Ser. 6, 4, 251–299.
Freedman, D. A. (1996) De Finetti's theorem in continuous time. Lect. Notes Monogr. Ser., 83–98.
Gelman, A., Carlin, J. B., Stern, H. S., Dunson, D., Vehtari, A. and Rubin, D. (2014) Bayesian Data Analysis. Boca Raton: Chapman and Hall–CRC.
Giné, E. and Zinn, J. (1992) Marcinkiewicz type laws of large numbers and convergence of moments for U-statistics. In 8th Proc. Int. Conf. Probability in Banach Spaces, pp. 273–291.
Gnedin, A., Hansen, B. and Pitman, J. (2007) Notes on the occupancy problem with infinitely many boxes: general asymptotics and power laws. Probab. Surv., 4, 146–171.
Gnedin, A., Pitman, J. and Yor, M. (2006) Asymptotic laws for compositions derived from transformed subordinators. Ann. Probab., 34, 468–492.
Goldenberg, A., Zheng, A., Fienberg, S. and Airoldi, E. (2010) A survey of statistical network models. Foundns Trends Mach. Learn., 2, 129–233.
Herlau, T., Schmidt, M. N. and Mørup, M. (2014) Infinite-degree-corrected stochastic block model. Phys. Rev. E, 90, 032819.
Herlau, T., Schmidt, M. N. and Mørup, M. (2016) Completely random measures for modelling block-structured sparse networks. In Advances in Neural Information Processing Systems, vol. 29.
Hofert, M. (2011) Sampling exponentially tilted stable distributions. ACM Trans. Modlng Comput. Simuln, 22, 3.
Hoff, P. D. (2009) Multiplicative latent factor models for description and prediction of social networks. Computnl Math. Organizn Theory, 15, 261–272.
Hoff, P. D., Raftery, A. E. and Handcock, M. S. (2002) Latent space approaches to social network analysis. J. Am. Statist. Ass., 97, 1090–1098.
van der Hofstad, R. (2014) Random Graphs and Complex Networks, vol. I, Technical Report. Eindhoven: Eindhoven University of Technology.
Hoover, D. N. (1979) Relations on probability spaces and arrays of random variables. Preprint. Institute for Advanced Study, Princeton.
Hougaard, P. (1986) Survival models for heterogeneous populations derived from stable distributions. Biometrika, 73, 387–396.
Jacobs, A. Z. and Clauset, A. (2014) A unified view of generative models for networks: models, methods, opportunities and challenges. Preprint arXiv:1411.4070.
James, L. F. (2002) Poisson process partition calculus with applications to exchangeable models and Bayesian nonparametrics. Preprint arXiv:math/0205093.
James, L. F. (2005) Bayesian Poisson process partition calculus with an application to Bayesian Lévy moving averages. Ann. Statist., 1771–1799.
James, L. F., Lijoi, A. and Prünster, I. (2009) Posterior analysis for normalized random measures with independent increments. Scand. J. Statist., 36, 76–97.
Janson, S. (2011) Probability asymptotics: notes on notation. Preprint arXiv:1108.3924.
Kallenberg, O. (1990) Exchangeable random measures in the plane. J. Theoret. Probab., 3, 81–136.
Kallenberg, O. (2005) Probabilistic Symmetries and Invariance Principles. New York: Springer.
Karlin, S. (1967) Central limit theorems for certain infinite urn schemes. J. Math. Mech., 17, 373–401.
Karrer, B. and Newman, M. E. (2011) Stochastic blockmodels and community structure in networks. Phys. Rev. E, 83, 016107.
Kemp, C., Tenenbaum, J. B., Griffiths, T. L., Yamada, T. and Ueda, N. (2006) Learning systems of concepts with an infinite relational model. In AAAI, vol. 21, p. 381.
Khintchine, A. (1937) Zur Theorie der unbeschränkt teilbaren Verteilungsgesetze. Mat. Sborn., 2, 79–119.
Kingman, J. F. C. (1967) Completely random measures. Pacif. J. Math., 21, 59–78.
Kingman, J. F. C. (1993) Poisson Processes, vol. 3. New York: Oxford University Press.
Lauritzen, S. (2008) Exchangeable Rasch matrices. Rend. Mat. Ser. VII, 28, 83–95.
Lee, M.-L. T. and Whitmore, G. A. (1993) Stochastic processes directed by randomized time. J. Appl. Probab., 302–314.
Lewis, P. A. and Shedler, G. S. (1979) Simulation of nonhomogeneous Poisson processes by thinning. Navl Res. Logist. Q., 26, 403–413.
Lijoi, A., Mena, R. H. and Prünster, I. (2007) Controlling the reinforcement in Bayesian non-parametric mixture models. J. R. Statist. Soc. B, 69, 715–740.
Lijoi, A. and Prünster, I. (2003) On a normalized random measure with independent increments relevant to Bayesian nonparametric inference. In Proc. 13th Eur. Young Statisticians Meet., pp. 123–134. Bernoulli Society.
Lijoi, A. and Prünster, I. (2010) Models beyond the Dirichlet process. In Bayesian Nonparametrics (eds N. L. Hjort, C. Holmes, P. Müller and S. G. Walker). Cambridge: Cambridge University Press.
Lijoi, A., Prünster, I. and Walker, S. G. (2008) Investigating nonparametric priors with Gibbs structure. Statist. Sin., 18, 1653.
Lloyd, J., Orbanz, P., Ghahramani, Z. and Roy, D. (2012) Random function priors for exchangeable arrays with applications to graphs and relational data. In Advances in Neural Information Processing Systems, vol. 25.
Lovász, L. (2013) Large Networks and Graph Limits, vol. 60. American Mathematical Society.
Lovász, L. and Szegedy, B. (2006) Limits of dense graph sequences. J. Combin. Theory B, 96, 933–957.
McAuley, J. and Leskovec, J. (2012) Learning to discover social circles in ego networks. In Advances in Neural Information Processing Systems, vol. 25, pp. 539–547.
Miller, K., Griffiths, T. and Jordan, M. (2009) Nonparametric latent feature models for link prediction. In Advances in Neural Information Processing Systems, vol. 22.
Neal, R. M. (2011) MCMC using Hamiltonian dynamics. In Handbook of Markov Chain Monte Carlo (eds S. Brooks, A. Gelman, G. Jones and X.-L. Meng), vol. 2. Boca Raton: Chapman and Hall–CRC.
Nešetřil, J. and Ossona de Mendez, P. (2012) Sparsity (Graphs, Structures, and Algorithms). Berlin: Springer.
Newman, M. E. J. (2001) The structure of scientific collaboration networks. Proc. Natn. Acad. Sci. USA, 98, 404–409.
Newman, M. E. J. (2003) The structure and function of complex networks. SIAM Rev., 167–256.
Newman, M. E. J. (2010) Networks: an Introduction. New York: Oxford University Press.
Newman, M. E. J., Strogatz, S. H. and Watts, D. J. (2001) Random graphs with arbitrary degree distributions and their applications. Phys. Rev. E, 64, 26118.
Norros, I. and Reittu, H. (2006) On a conditionally Poissonian graph process. Adv. Appl. Probab., 38, 59–75.
Nowicki, K. and Snijders, T. (2001) Estimation and prediction for stochastic blockstructures. J. Am. Statist. Ass., 96, 1077–1087.
Ogata, Y. (1981) On Lewis' simulation method for point processes. IEEE Trans. Inform. Theory, 27, 23–31.
Olhede, S. C. and Wolfe, P. J. (2012) Degree-based network models. Preprint arXiv:1211.6537. University College London, London.
Opsahl, T. and Panzarasa, P. (2009) Clustering in weighted networks. Socl Netwrks, 31, 155–163.
Orbanz, P. and Roy, D. M. (2015) Bayesian models of graphs, arrays and other exchangeable random structures. IEEE Trans. Pattn Anal. Mach. Intell., 37, 437–461.
Palla, K., Knowles, D. A. and Ghahramani, Z. (2012) An infinite latent attribute model for network data. In Proc. Int. Conf. Machine Learning.
Penrose, M. (2003) Random Geometric Graphs, vol. 5. New York: Oxford University Press.
Pitman, J. (1995) Exchangeable and partially exchangeable random partitions. Probab. Theory Reltd Flds, 102, 145–158.
Pitman, J. (1996) Some developments of the Blackwell-MacQueen urn scheme. Lect. Notes Monogr. Ser., 245–267.
Pitman, J. (2003) Poisson-Kingman partitions. Lect. Notes Monogr. Ser., 1–34.
Pitman, J. (2006) Combinatorial stochastic processes. In Ecole d'Eté de Probabilités de Saint-Flour XXXII–2002. New York: Springer.
Prünster, I. (2002) Random probability measures derived from increasing additive processes and their application to Bayesian statistics. PhD Thesis. University of Pavia, Pavia.
Regazzini, E., Lijoi, A. and Prünster, I. (2003) Distributional results for means of normalized random measures with independent increments. Ann. Statist., 31, 560–585.
Resnick, S. (1987) Extreme Values, Point Processes and Regular Variation. New York: Springer.
Rohe, K., Chatterjee, S. and Yu, B. (2011) Spectral clustering and the high-dimensional stochastic blockmodel. Ann. Statist., 39, 1878–1915.
Todeschini, A. and Caron, F. (2016) Exchangeable random measures for sparse and modular graphs with overlapping communities. Preprint arXiv:1602.02114.
Veitch, V. and Roy, D. M. (2015) The class of random graphs arising from exchangeable random measures. Preprint arXiv:1512.03099.
Watts, D. J. and Strogatz, S. H. (1998) Collective dynamics of small-world networks. Nature, 393, 440–442.
Wolfe, P. J. and Olhede, S. C. (2013) Nonparametric graphon estimation. Preprint arXiv:1309.5936. University College London, London.
Zhao, Y., Levina, E. and Zhu, J. (2012) Consistency of community detection in networks under degree-corrected stochastic block models. Ann. Statist., 40, 2266–2292.