
Econometric Analysis of Networks

Christian Brownlees

Universitat Pompeu Fabra, Barcelona GSE

Introduction

Network analysis has emerged prominently in many fields of science in recent years: economics, finance, computer science, social networks, ...

Network analysis is a powerful tool to represent and synthesize the interconnections of large multivariate systems.


In these slides...

These slides introduce network techniques for the analysis of large panels of economic and financial time series.

The focus is on recent advances from statistics and econometrics and on the empirical evidence emerging from network applications in economics and finance.


Network Analysis: What is the Fuss?

In statistics, network/graphical modeling has been around for quite some time.

However, interest in this field has been renewed by the development of estimation techniques that make it possible to work with high-dimensional applications.

In particular, Meinshausen and Bühlmann (2006) is probably one of the first contributions showing how to estimate high-dimensional network models using the LASSO, and it spurred renewed interest in the field.


Network Analysis: What is the Fuss?

In economics and finance, network models have become particularly important from both a theoretical and an empirical perspective over the last few years.

Influential research by Acemoglu et al. (2012) has introduced economic models in which aggregate fluctuations are determined by the most interconnected sectors.

The great financial crisis has played an important role in popularizing networks. In particular, one of the lessons of the crisis is that a high degree of interconnectedness among financial firms can make the whole financial system vulnerable.


Roadmap

Basic Concepts
Network techniques for the analysis of economic and financial panels
Networks for Static Data
  Partial Correlation Network
Networks for Dynamic Data
  Granger Network
  Connectedness Table
  NETS

Basic Concepts

What is a Network?
Mathematically, a network is a graph.

Roughly, a graph is a collection of vertices connected by lines.

There are multiple graph definitions: graphs can be defined in different ways depending on the purpose of the application.

The notation for graphs can be quite extensive. We are going to focus on a subset of notions useful for the scope of the course.


Graphs

A graph G is defined as a pair of vertices and edges

G = (V, E)

The vertices V are (any) set of elements
  In this set of slides we will have throughout V = {1, . . . , n}

The edges E connect vertices
  The set of edges is defined as E ⊆ V × V
  (i, j) ∈ E ⇐⇒ i and j are connected by an edge
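To fix ideas, here is a minimal sketch of this definition in Python (the vertex labels and edges below are illustrative, not taken from the slides):

V = {1, 2, 3, 4, 5}                     # vertex set
E = {(1, 2), (2, 3), (1, 4), (4, 5)}    # edge set, a subset of V x V

def connected(i, j, E):
    # For an undirected graph an edge may be stored in either orientation
    return (i, j) in E or (j, i) in E

print(connected(1, 2, E))  # True
print(connected(2, 4, E))  # False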


History of Graphs: Königsberg Bridge Problem


[Figure: map of Königsberg's seven bridges and its graph representation]

In mathematics, graphs turn out to be convenient for representing and analysing a number of problems. One of the early examples of graph theory applications is the Königsberg Bridge Problem. The problem consists of finding out whether there exists a path that crosses the 7 bridges exactly once and begins and finishes on the same vertex. In 1736, Leonhard Euler showed that no such path exists using graph theory.

Types of Graphs

There are many types of graph definitions.

In what follows we will focus on:


Undirected & Directed Graphs
Unweighted & Weighted Graphs


Undirected Graphs
If the edges do not have a directionality, the graph is undirected
(i.e. an edge from i to j is the same as an edge from j to i).
Example:

[Figure: undirected graph on vertices A, B, C, D, E]

V = {A, B, C, D, E}
E = {{A, B}, {B, C}, {A, D}, {D, E}}


Directed Graphs
If the edges have a directionality, the graph is directed
(i.e. an edge from i to j is different from an edge from j to i).
Example:

[Figure: directed graph on vertices A, B, C, D, E]

V = {A, B, C, D, E}
E = {(A, B), (B, C), (A, D), (D, E)}


Unweighted and Weighted Graphs

If the edges carry weights, the graph is weighted; otherwise the graph is unweighted.
Example: Weighted Graph

[Figure: weighted graph on vertices A, B, C, D, E with edge weights]


... and there are many more

There are many other graph types, for example:


Mixed Graph (containing both directed and undirected edges)
Coloured Graph (Four Color Map Theorem)
Multiple Layered Graph
...


Network Representation

Important matrices associated with G

Assume the graph is undirected, unweighted and defined over a set of n vertices.

The adjacency matrix AG:
  An n × n matrix with aij = [AG]ij = 1 if i and j are connected by an edge and zero otherwise.

The degree matrix DG:
  An n × n diagonal matrix with dii = [DG]ii = Σ_{j=1}^n aij on the diagonal.

The Laplacian LG:
  LG = DG − AG.


Network Representation

Adjacency Matrix

[Figure: graph on vertices A, B, C, D, E]

AG = [ 0 1 0 1 0
       1 0 1 0 0
       0 1 0 0 0
       1 0 0 0 1
       0 0 0 1 0 ]


Network Representation

Degree Matrix

[Figure: graph on vertices A, B, C, D, E]

DG = [ 2 0 0 0 0
       0 2 0 0 0
       0 0 1 0 0
       0 0 0 2 0
       0 0 0 0 1 ]


Network Representation

Laplacian Matrix

[Figure: graph on vertices A, B, C, D, E]

LG = [  2 −1  0 −1  0
       −1  2 −1  0  0
        0 −1  1  0  0
       −1  0  0  2 −1
        0  0  0 −1  1 ]


Network Representation: Remarks

We will not have time to dig into the properties of these matrices.

However, it is important to at least point out that several key network properties turn out to be elegantly embedded in these matrices.

For instance, it can be shown that the number of connected components of a graph is equal to the number of zero eigenvalues of its Laplacian.
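This property is easy to verify numerically. Below is a small NumPy sketch using the five-vertex example graph from the previous slides (the numerical check itself is mine, not part of the original deck):

import numpy as np

# Adjacency matrix of the example graph on vertices A, B, C, D, E
A = np.array([[0, 1, 0, 1, 0],
              [1, 0, 1, 0, 0],
              [0, 1, 0, 0, 0],
              [1, 0, 0, 0, 1],
              [0, 0, 0, 1, 0]])

D = np.diag(A.sum(axis=1))   # degree matrix
L = D - A                    # Laplacian

# Number of (numerically) zero eigenvalues = number of connected components
eigvals = np.linalg.eigvalsh(L)
print(np.sum(np.isclose(eigvals, 0)))  # 1: the example graph is connected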


Some Notions from Network Analysis

It turns out that in real world networks there are a large number
of patterns that are commonly encountered

Some of the most relevant ones are


Hubs
Power Law Structure
Community Structure


Hubs

In many real world networks there are typically vertices that are “more important” than others.

Importance in a network can be defined in different ways. It is common to measure the importance of a vertex on the basis of how central the vertex is. A natural measure of centrality is the number of connections a vertex has.

Vertices with a high centrality/large number of connections are typically called hubs.


Power Law Structure

Real world networks often exhibit a power law structure, that is, the empirical distribution of the degrees of the vertices in the network is power law distributed.

Recall that the power law distribution is a heavy tailed distribution. This implies that a power law network will have a non-negligible fraction of vertices with a large degree (i.e. hubs).

Networks with a power law structure exhibit small world effects / six degrees of separation.


Community Structure

Real world networks often exhibit a community structure or clustering, that is:
  Vertices are partitioned into communities/groups
  There is a higher frequency of edges within the same community than between different communities

An implication of communities is that the network has an approximate block structure.

Networks for Panels of Economic and Financial Time Series

Networks in Econ and Finance

Let yt denote a multivariate time series of interest

yt = (y1t, y2t, . . . , ynt)'        t = 1, ..., T

e.g. returns of a portfolio of assets, CDS spreads of a panel of sovereign bonds, GDP of a panel of countries, ...


Networks in Econ and Finance


1. We are concerned with introducing a network representation for the multivariate process yt where
   vertices represent variables and
   edges denote the presence of an appropriate measure of dependence between two variables

   [Figure: network with vertices y1t, ..., y5t]

2. Developing estimation techniques that allow us to detect the network from the data


Networks in Econ and Finance


Network analysis of large panels of time series has several highlights:

Dimensionality Reduction
Analysing and understanding the properties of a large dimensional system is challenging. The network representation can be used as a dimensionality reduction technique that can enhance interpretation.

Regularised Estimation
It turns out that (most) network estimation techniques boil down to the estimation of a large dimensional model subject to appropriate regularization constraints. In a large dimensional setting regularization can enhance efficiency.
(cf. Ledoit and Wolf, 2004)


Networks in Econ and Finance

Central issue: Which measure of dependence should we use?

There is no unique answer. It depends on the context of the application.

In practice, network definitions in econometrics/statistics differ in the dependence measure chosen to build up the network.


Network Definitions

Network classification by dependence type

linear, contemporaneous
  Partial Correlation Network (Meinshausen and Bühlmann, 2006; Peng et al., 2009)

linear, dynamic
  Granger Network (Billio et al., 2012), Connectedness Table (Diebold and Yilmaz, 2014), NETS (Barigozzi and Brownlees, 2016)

nonlinear, contemporaneous
  SKEPTIC (Liu et al., 2012), Tail Networks (Hautsch et al., 2012)

Partial Correlation Network

Partial Correlation Network: Definition


Historically, the first network definition proposed in the literature is the Partial Correlation Network, introduced by Dempster (1972).

Let yt be an iid multivariate series

yt ∼ D(0, Σ)

where D(µ, Σ) denotes a distribution with mean µ and covariance Σ.

The partial correlation network associated with the system is an undirected/unweighted graph where
1. the components of yt denote vertices
2. the presence of an edge between components i and j denotes that i and j are partially correlated given all other components


Partial Correlation Networks

Partial correlation measures (cross-sectional) linear conditional dependence between yit and yjt given all other variables:

ρij = Cor(yit, yjt | {ykt : k ≠ i, j}).

The partial correlation network is defined as

EPC = {{i, j} ∈ V × V : ρij ≠ 0}

Note that if the data are Gaussian, absence of partial correlation implies conditional independence.


Partial Correlation Networks: Properties


Partial correlation is related to linear regression:
For instance, consider the model

y1t = c + θ12 y2t + θ13 y3t + θ14 y4t + θ15 y5t + u1t

θ13 is different from 0 ⇔ 1 and 3 are partially correlated

The partial correlation of yit and yjt is equivalently defined as the linear correlation between the residuals of yit and yjt obtained from the regression of the two components on all the other variables in the system.

Partial correlation is related to correlation:
If there exists a partial correlation path between vertices i and j, then i and j are correlated (and vice versa).

Characterizing the Partial Correlation Network


There is a well known and interesting connection between the partial correlation network and the concentration matrix K = Σ⁻¹.

It is easy to see this through the regression representation of the variables in the system. Each variable i can be expressed as a linear combination of the other variables and an error term

yit = Σ_{j≠i} θij yjt + uit        ui ∼ N(0, σ²(−i))

or, in compact form,

yit = θi' xit + uit        ui ∼ N(0, σ²(−i))

with xit = y(−i)t and Cov(ui, yj) = 0 for each j ≠ i.

If θij is zero, then i and j have zero partial correlation.


Characterizing the Partial Correlation Network

It is easy to see through the formula for the inverse of a partitioned matrix that the regression parameters θij can be expressed as a function of the elements of K.

Let kij denote the (i, j) element of K. Then the relation between θij and σ²(−i) is given by

θij = − kij / kii = ρij √( kjj / kii )

and

σ²(−i) = 1 / kii


A Deeper Look...

The regression parameters can be expressed using K.

The relation between θij and σ²(−i) is given by

θij = − kij / kii = ρij √( kjj / kii )

and

σ²(−i) = 1 / kii

It is interesting and straightforward to show this result.


Proof: Step 1 of 3
Partition yt into two subvectors

yt = [ y1t ; y2t ]

Define

Σ = [ Σ11  Σ12 ; Σ21  Σ22 ]        K = Σ⁻¹ = [ Σ¹¹  Σ¹² ; Σ²¹  Σ²² ]

Recall this result for partitioned symmetric matrices.
Let C = Σ11 − Σ12 Σ22⁻¹ Σ21. Then

Σ⁻¹ = [ C⁻¹               −C⁻¹ Σ12 Σ22⁻¹
        −Σ22⁻¹ Σ21 C⁻¹     Σ22⁻¹ + Σ22⁻¹ Σ21 C⁻¹ Σ12 Σ22⁻¹ ]

Finally, let y1t = yit and y2t = y(−i)t = (y1, . . . , yi−1, yi+1, . . . , yN)'

Proof: Step 2 of 3
θ as a function of Σ:

0 = Cov(ui, y(−i)t)
  = Cov(yi − θ' y(−i)t , y(−i)t)
  = Cov(yi, y(−i)t) − θ' Cov(y(−i)t, y(−i)t)
  = Σ12 − θ' Σ22

θ = Σ22⁻¹ Σ21

σ²(−i) as a function of Σ:

Var(ui) = Cov(yi − θ' y(−i)t , yi − θ' y(−i)t)
        = Cov(yi, yi − θ' y(−i)t)
        = Cov(yi, yi) − θ' Cov(yi, y(−i)t)

σ²(−i) = Σ11 − Σ12 Σ22⁻¹ Σ21


Proof: Step 3 of 3
σ²(−i) as a function of K:

Σ¹¹ = (Σ11 − Σ12 Σ22⁻¹ Σ21)⁻¹
Σ¹¹ = (σ²(−i))⁻¹
σ²(−i) = 1 / kii

θ as a function of K:

Σ²¹ = −Σ22⁻¹ Σ21 C⁻¹
Σ²¹ = −θ Σ¹¹
θ = −Σ²¹ [Σ¹¹]⁻¹
θj = − kij / kii


Characterizing the Partial Correlation Network

An analogous formula can be obtained for the generic (i, j) partial correlation

ρij = − kij / √( kii kjj )

Implications:
  The partial correlation network is entirely characterized by K
  If kij is nonzero, then nodes i and j are connected by an edge
  We can reformulate the estimation of the partial correlation network as the estimation of a concentration matrix
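As a quick illustration, the following NumPy sketch maps a concentration matrix into partial correlations and an edge set (the covariance matrix below is a made-up example with a Markov structure, so the first and third variables are conditionally uncorrelated):

import numpy as np

Sigma = np.array([[1.00, 0.50, 0.25],
                  [0.50, 1.00, 0.50],
                  [0.25, 0.50, 1.00]])

K = np.linalg.inv(Sigma)  # concentration matrix

# Partial correlations: rho_ij = -k_ij / sqrt(k_ii * k_jj)
d = np.sqrt(np.diag(K))
rho = -K / np.outer(d, d)
np.fill_diagonal(rho, 1.0)

# Edges of the partial correlation network: nonzero rho_ij with i < j
n = len(rho)
edges = [(i, j) for i in range(n) for j in range(i + 1, n)
         if not np.isclose(rho[i, j], 0)]
print(edges)  # [(0, 1), (1, 2)]: no edge between the first and third variable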

Partial Correlation Network: Estimation

Sparse Estimation

The partial correlation network is entirely characterized by a matrix parameter K that is assumed to be sparse (i.e. it contains zero entries).

We need to introduce a sparse estimator of K: an estimator that simultaneously selects and estimates the nonzero entries of K.

As we will see, other network definitions lead to analogous problems.


Sparse Estimation
The workhorse of sparse estimation is the LASSO.

Consider the regression model

Yt = θ'Xt + et        et ∼ N(0, σ²)    t = 1, ..., T

with Xt ∈ R^P.

The (classic) LASSO estimator of this model is defined as

θ̂λL = arg min_θ  Σ_{t=1}^T (Yt − θ'Xt)² + λ Σ_{j=1}^P |θj| ,    λ ≥ 0

where λ is the LASSO tuning parameter.


Sparse Estimation
The (classic) LASSO estimator of this model is defined as

θ̂λL = arg min_θ  Σ_{t=1}^T (Yt − θ'Xt)² + λ Σ_{j=1}^P |θj| ,    λ ≥ 0

where λ is the LASSO tuning parameter.

Remarks:
LASSO is a shrinkage type estimator. When λ = 0 the estimator coincides with least squares. When λ is large the effect of the penalty is to shrink estimates towards zero (like Ridge regression).

A key feature of the LASSO is that this penalty delivers estimates in which some components are estimated as exact zeros. In other words, the LASSO estimator has the tendency of delivering sparse estimates when λ is large enough.

LASSO & Sparsity: Graphical Intuition


Consider the LASSO objective function when P = 2

Σ_{t=1}^T (Yt − θ1 X1t − θ2 X2t)² + λ|θ1| + λ|θ2|

The LASSO objective function can be thought of as the Lagrangian of the following constrained optimization problem

min_θ  Σ_{t=1}^T (Yt − θ1 X1t − θ2 X2t)²

subject to

|θ1| + |θ2| ≤ rλ


LASSO & Sparsity: Graphical Intuition

[Figure: graphical illustration of the LASSO constraint region and sparse corner solutions]

LASSO & Sparsity: Sketch of Proof


Consider

Yt = θXt + et        et ∼ N(0, σ²)

where Yt and Xt are de-meaned and Xt is scalar.
Let θ̂ be the least squares estimator. Then,

θ̂λL = arg min_θ { Σt (Yt − θXt)² + λ|θ| }
     = arg min_θ { Σt (Yt − θXt − θ̂Xt + θ̂Xt)² + λ|θ| }
     = arg min_θ { Σt Xt² (θ − θ̂)² + λ|θ| }
     = arg min_θ { Σt Xt² θ² − 2 Σt Xt² θ̂ θ + λ|θ| }

(the cross term involving the least squares residuals vanishes by the normal equations, and terms not involving θ are dropped).

Notice that if θ̂ > 0 then θ̂λL ≥ 0.



LASSO & Sparsity: Sketch of Proof


Differentiating the objective function for θ ≥ 0 and setting it to zero we get the FOC

0 = 2 Σt Xt² (θ − θ̂) + λ   ⇒   θ̂λL = θ̂ − λ / (2 Σt Xt²).

Notice that when the right-hand side is negative we truncate the solution to zero.

Thus, for θ̂ > 0 the solution is

θ̂λL = ( θ̂ − λ / (2 Σt Xt²) )₊

where (x)₊ means max(x, 0).

By carrying out analogous computations for the case θ̂ < 0 one gets that the LASSO solution is

θ̂λL = sign(θ̂) ( |θ̂| − λ / (2 Σt Xt²) )₊
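The closed-form solution can be checked numerically; a small Python sketch with simulated data (the brute-force grid comparison is mine):

import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=200)
Y = 0.3 * X + rng.normal(size=200)
lam = 20.0

theta_ls = (X @ Y) / (X @ X)  # least squares estimate

# Soft-thresholding formula for the univariate LASSO solution
theta_lasso = np.sign(theta_ls) * max(abs(theta_ls) - lam / (2 * (X @ X)), 0.0)

# Brute-force check: minimize the penalized objective on a grid
grid = np.linspace(-1.0, 1.0, 20001)
obj = ((Y[:, None] - grid[None, :] * X[:, None]) ** 2).sum(axis=0) + lam * np.abs(grid)
print(theta_lasso, grid[obj.argmin()])  # the two should (approximately) agree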


LASSO Estimator

Highlights of the LASSO:

1. Sparsity Detection. The effect of the absolute value penalty is to shrink some of the estimated θ coefficients to exact zeros. Under appropriate conditions, the LASSO can asymptotically detect the true nonzero parameters of the model.

2. High-Dimensionality. Under appropriate conditions, the LASSO estimator is well behaved even when the number of parameters P is much larger than the number of observations T.


LASSO Properties: Sketch of Main Assumptions

1. The total number of parameters is allowed to grow as a function of the number of observations.

2. The total number of nonzero parameters is allowed to grow as a function of the number of observations; however, it has to be small relative to the number of observations. Typically, o(√(T / log T)).

3. Nonzero coefficients are larger in absolute value than a signal threshold.

4. The correlation between variables Xit associated with zero and nonzero coefficients cannot be too large.


LASSO Properties

The LASSO literature is typically concerned with establishing two results. Let θ0 denote the true value of the parameter.

Estimation Consistency.

||θ̂λL − θ0|| →p 0

Selection Consistency.

P( sign(θ̂λL,i) = sign(θ0,i) ) → 1


LASSO Properties: Comments

It is fair to say that the LASSO assumptions are not innocent.

There are several variants of the LASSO which tackle the issues of the baseline version (for instance, the Adaptive LASSO).

However, in practice it is important to address to what extent the LASSO assumptions truly make sense in the context of a given application.


LASSO Computation

The LASSO estimator cannot be computed in closed form.

However, there are several optimization algorithms that can be used to compute the estimator.

One of the first and most commonly used algorithms proposed in the literature is the so-called shooting algorithm.


LASSO Computation
Shooting Algorithm

Initialize θ̂L with the least squares estimator.
For k in 1, ..., P, 1, ..., P, 1, ..., P until convergence:

1. Define

Y(−k)t = Yt − Σ_{j≠k} θ̂jL Xjt

2. Compute the LS estimate of the k-th coefficient

θ̂kLS = ( Σ_{t=1}^T Y(−k)t Xkt ) / ( Σ_{t=1}^T Xkt² )

3. Update the LASSO estimate of the k-th coefficient

θ̂kL = sign(θ̂kLS) ( |θ̂kLS| − λ / (2 Σ_{t=1}^T Xkt²) )₊
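A compact NumPy implementation of the algorithm as stated above (a sketch: the least squares initialization assumes T > P, and a fixed number of sweeps stands in for a proper convergence check):

import numpy as np

def soft_threshold(z, c):
    # sign(z) * max(|z| - c, 0)
    return np.sign(z) * np.maximum(np.abs(z) - c, 0.0)

def shooting_lasso(X, Y, lam, n_sweeps=100):
    # Coordinate-descent (shooting) LASSO for Y_t = theta' X_t + e_t
    T, P = X.shape
    theta = np.linalg.lstsq(X, Y, rcond=None)[0]  # LS initialization
    for _ in range(n_sweeps):
        for k in range(P):
            # Partial residual excluding the k-th regressor
            r = Y - X @ theta + X[:, k] * theta[k]
            xk2 = X[:, k] @ X[:, k]
            theta_ls_k = (r @ X[:, k]) / xk2       # LS estimate of k-th coefficient
            theta[k] = soft_threshold(theta_ls_k, lam / (2 * xk2))
    return theta

# Small demonstration with a sparse true coefficient vector
rng = np.random.default_rng(1)
X = rng.normal(size=(500, 10))
theta_true = np.zeros(10)
theta_true[:3] = [1.0, -0.5, 0.25]
Y = X @ theta_true + rng.normal(size=500)
print(np.round(shooting_lasso(X, Y, lam=50.0), 3))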

LASSO Estimation of the Partial Correlation Network

Can we use LASSO techniques to obtain a sparse estimator of K?

Oui!

There are two approaches to estimating K using the LASSO:

Regression Based
  Neighborhood Selection (Meinshausen and Bühlmann, 2006)
  SPACE (Peng et al., 2009)

Concentration Penalization Approach
  GLASSO (Yuan and Lin, 2007; Banerjee et al., 2008; Friedman et al., 2008)

All of these methods are implemented in several R packages, are straightforward to apply and work well in fairly high-dimensional settings.


Regression Approach
Regression approaches are based on the regression representation of the series in the panel

yit = Σ_{j≠i} θij yjt + uit ,    i = 1, . . . , n,

with Var(ui) = σ²(−i).

The regression coefficients and residual variance of the regression are related to the entries of the concentration matrix:

kii = 1 / σ²(−i)

and

kij = −θij kii


Regression Approach - Neighborhood Selection

Meinshausen and Bühlmann (2006) put forward a simple strategy to estimate the network.

The idea is to exploit the regression representation of the variables in the system to estimate the neighbours of each node.


Regression Approach - Neighborhood Selection

Neighborhood Selection

1. For each i = 1, ..., n, use LASSO regression to estimate the parameters of the regression

yit = Σ_{j≠i} θij yjt + uit

2. Then, kij is set to zero if θ̂ijL = 0 OR θ̂jiL = 0
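A sketch of this procedure with scikit-learn's Lasso (note that scikit-learn's alpha is a rescaled version of the λ above; the rule in step 2 is implemented in the last lines):

import numpy as np
from sklearn.linear_model import Lasso

def neighborhood_selection(Y, lam):
    # Y is a T x n data matrix; returns a boolean adjacency matrix
    T, n = Y.shape
    theta = np.zeros((n, n))
    for i in range(n):
        others = [j for j in range(n) if j != i]
        fit = Lasso(alpha=lam, fit_intercept=False).fit(Y[:, others], Y[:, i])
        theta[i, others] = fit.coef_
    # kij is set to zero if theta_ij = 0 OR theta_ji = 0,
    # i.e. the edge is kept only when both coefficients are nonzero
    adj = (theta != 0) & (theta.T != 0)
    np.fill_diagonal(adj, False)
    return adj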


Regression Approach - Neighborhood Selection

The procedure is simple but has some limitations:

It does not really estimate K but only its sparsity pattern.

Also, it does not fully exploit all the information in the system.


Regression Approach - SPACE

Peng et al. (2009) develop a smart algorithm to estimate a sparse K, building on Neighborhood Selection.

The idea is that if the diagonal elements of K are known, then it is possible to write an auxiliary linear regression model whose unknown parameters are the partial correlation coefficients.


Regression Approach - SPACE


SPACE

1. Given an estimate of kii, estimate ρij by minimizing

Σ_{i=1}^n Σ_{t=1}^T ( yit − Σ_{j≠i} ρij √( k̂jj / k̂ii ) yjt )² + λ Σ_{i=2}^n Σ_{j=1}^{i−1} |ρij|

2. Given an estimate of ρij, estimate kii as the inverse of the residual variance

3. If the algorithm has not converged, go back to 1

4. Estimate the nondiagonal entries of K as

k̂ij = −ρ̂ij √( k̂ii k̂jj )


Regression Approach - SPACE

SPACE allows one to simultaneously select and estimate the entries of K.

Note, however, that the procedure does not ensure that the estimate of K is positive definite. In practice, if K is sufficiently sparse, then the estimator will also be positive definite with high probability.


Concentration Penalization Approach - GLASSO

Rather than using the regression representation, Yuan and Lin (2007) suggest estimating a sparse K by directly penalizing the Gaussian log likelihood of the concentration matrix with a LASSO penalty.

Estimate K by optimizing

K̂ = arg min_{K ∈ Sⁿ} { tr(ΣK) − log det(K) + λ Σ_{i≠j} |kij| }

where Σ is the sample covariance estimator and Sⁿ is the set of n × n symmetric positive definite matrices.
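In practice this estimator is available off the shelf. A minimal sketch with scikit-learn's GraphicalLasso on simulated data (the penalty value is illustrative; with independent simulated series few or no edges should appear):

import numpy as np
from sklearn.covariance import GraphicalLasso

rng = np.random.default_rng(0)
Y = rng.multivariate_normal(np.zeros(5), np.eye(5), size=500)

gl = GraphicalLasso(alpha=0.1).fit(Y)
K_hat = gl.precision_  # sparse estimate of the concentration matrix

edges = [(i, j) for i in range(5) for j in range(i + 1, 5)
         if abs(K_hat[i, j]) > 1e-8]
print(edges)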


Concentration Penalization Approach - GLASSO

The approach is natural. Also, Ravikumar et al. (2011) provide general conditions for the consistency of the estimator.

However, optimization of the objective function is challenging. The original algorithm proposed by Yuan and Lin (2007) does not perform well in high dimensions.

Banerjee et al. (2008) and Friedman et al. (2008) show that the optimization can be recast as a sequence of simple LASSO regression problems, which makes the estimation procedure appealing for large problems.

Partial Correlation Network: Implementation Issues

Choosing λ
Fitting a network involves choosing a value of λ.

In practice, networks are estimated for different values of λ and information criteria like the AIC and BIC are used to choose an “optimal” λ from the data.

Many prefer using the BIC because it penalizes more heavily and makes the network more sparse.

For the SPACE algorithm, the BIC can be computed as

BIC(λ) = log(RSS(λ)) + #{(i, j) : i ≠ j, ρ̂ij ≠ 0} · log(T) / T
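Schematically, the selection of λ can be organized as a grid search (a sketch: fit_network is a hypothetical wrapper around any of the estimators above, assumed to return the residuals and the number of estimated edges):

import numpy as np

def bic_lambda_search(Y, lambdas, fit_network):
    # Pick lambda by BIC; fit_network(Y, lam) -> (residuals, n_edges)
    T = Y.shape[0]
    best = None
    for lam in lambdas:
        resid, n_edges = fit_network(Y, lam)
        bic = np.log(np.sum(resid ** 2)) + n_edges * np.log(T) / T
        if best is None or bic < best[0]:
            best = (bic, lam)
    return best[1]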


Factors & Networks

Is the sparsity assumption always satisfied in economic panels? No!

An important case in which sparsity is violated is when the components of the process are influenced by common explanatory variables and common factors.

In this case, the influence of the common factors has to be filtered out first, and network analysis can then be carried out on the factor residuals.

Generally speaking, network analysis can be viewed as the complement of factor analysis.

Partial Correlation Network: Empirical Illustration

Empirical Illustration
The focus is on estimating the network of idiosyncratic interconnections of stock returns. We consider a sample of daily log returns for 93 U.S. blue chips between 2000 and 2013.

CAPM one-factor type model:

rt = β rmt + εt        εt ∼ N(0, Σε)

Estimation:
1. Estimate ε̂t as least squares residuals
2. Estimate the partial correlation network of ε using SPACE

The tuning parameter λ is chosen using the BIC.


Idiosyncratic Risk Network


[Figure: estimated idiosyncratic risk network of the 93 U.S. blue chips; vertices labeled by ticker]

Return Network: Empirical Properties

Centrality. Financials, Energy and Technology are some of the most central sectors. In particular, AIG is a hub in the network.

Power Law. The degree distribution is heterogeneous and the most interconnected vertices have a large number of connections relative to the total.

Community Structure. Vertices that belong to the same industry are more likely to be linked.

Networks For Time Series

Limitations of Partial Correlations for Time Series

Defining the network on the basis of partial correlations is motivated by the analysis of serially uncorrelated Gaussian data.

However, this is not always satisfactory for economic and financial applications, where data typically exhibit serial dependence.

A number of proposals have been put forward in the literature to overcome these limitations.


Networks for Time Series

Proposals:
(Pairwise) Granger Network
  Billio, Getmansky, Lo and Pelizzon (2012)
Connectedness Table
  Diebold and Yilmaz (2014)
NETS
  Barigozzi and Brownlees (2016)


Networks for Time Series: Remarks

It is interesting to note that, although these definitions define connections in fairly different ways, they are all based on a Vector Autoregressive (VAR) representation of the data.

A limitation of defining connections on the basis of a VAR is that all these network definitions essentially focus on linear dependence.

Granger Networks

Billio, Getmansky, Lo and Pelizzon (2012) propose to construct Granger Causality Networks.

Granger Causality: In time series analysis, x is said to Granger cause y if past values of x help predict y above and beyond past information of y itself.


Granger Causality

Let x and y be two time series. We say that x does not Granger cause y if the MSE of a forecast for y based on the past of y and x is the same as the MSE of a forecast based on the past of y only:

MSE(E(yt+s | yt, yt−1, ...)) = MSE(E(yt+s | yt, yt−1, ..., xt, xt−1, ...))

for all s > 0.


Granger Causality: Remarks

The absence of Granger causality implies restrictions on the VAR representation of the data. These restrictions can be tested using standard methods to assess the evidence of no Granger causality.

Achtung! The term Granger causality is a bit ambiguous: Granger causality does not really measure causality.


Granger Networks

Billio et al. (2012) focus on the analysis of spillover effects among financial institutions during the financial crisis.

To this end, they consider monthly returns for a panel of financial companies divided into different industry groups.

Granger Networks: Definition

Granger Networks: Linear Model

Consider the model

yA t = βA ym t + γA yA t−1 + γAB yB t−1 + εA t        εA t ∼ N(0, σ²A)

where
  yA t is the return of firm A in period t
  yB t is the return of firm B in period t
  ym t is the return of the market in period t

If γAB is significantly different from zero, then yB Granger causes yA.

Granger Networks: Estimation

Estimation

Estimation of the pairwise Granger network is straightforward.

The model

yA t = βA ym t + γA yA t−1 + γAB yB t−1 + εA t        εA t ∼ N(0, σ²A)

can simply be estimated by least squares.


Construction of a Granger Causality Network


The Billio, Getmansky, Lo and Pelizzon (2012) procedure:

Granger Network - BGLP2012

Consider all of the possible pairs of companies in the panel.

For each pair in the panel:
1. Estimate the bivariate model (by least squares)
2. Run a Granger causality test between companies A and B,
   i.e. test the null hypothesis H0 : γAB = 0 at the 1% significance level
3. If company A Granger causes B, then add a directed edge from A to B
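A sketch of the pairwise procedure with statsmodels (illustrative choices of mine: an intercept is added, which the bivariate specification above omits, and a t-test on γAB stands in for a formal Granger causality F-test):

import numpy as np
import statsmodels.api as sm

def granger_edges(R, rm, alpha=0.01):
    # R: T x n matrix of firm returns, rm: market return of length T.
    # Adds the directed edge B -> A when lagged B helps predict A.
    T, n = R.shape
    edges = []
    for a in range(n):
        for b in range(n):
            if a == b:
                continue
            y = R[1:, a]
            X = sm.add_constant(np.column_stack([rm[1:], R[:-1, a], R[:-1, b]]))
            fit = sm.OLS(y, X).fit()
            if fit.pvalues[3] < alpha:  # p-value of the coefficient on lagged B
                edges.append((b, a))    # B Granger causes A
    return edges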

Granger Networks: Empirical Illustration

Granger Causality Networks: Empirical Application

This methodology is used to analyze monthly returns of a panel of financial institutions between 1994 and 2008.

Firms are classified into Hedge Funds, Banks, Broker-Dealers and Insurance companies.

Since there is evidence that networks change over time, the analysis is carried out over different time windows.

[Figures: estimated Granger causality networks over the rolling windows 2002-2004, 2002-2006 and 2006-2008]

Granger Causality Networks: Final Remarks

Billio et al. (2012) have the merit of being the first to introduce network analysis in the field. It is a great tool for exploratory analysis.

Some caveats:
  Networks change a lot over time!
  Spurious correlation problem
  Networks are very dense. Could it be that some common factors are missing?

Connectedness Table

In a number of papers, Diebold and Yilmaz propose a network definition based on classic variance decomposition analysis.

They are interested in answering the question: “What fraction of the H-step ahead prediction error variance of series i is due to shock j?”

The answer to this question is given by the variance decomposition, and we denote this fraction by dijH.

Connectedness Table: Definition

Variance Decomposition
Assume yt has an infinite MA representation

yt = µ + εt + Ψ1 εt−1 + Ψ2 εt−2 + ...        Var(ε) = Σε

The VMA representation is useful to obtain the error in forecasting H periods ahead

yt+H − ŷt+H|t = εt+H + Ψ1 εt+H−1 + Ψ2 εt+H−2 + ... + ΨH−1 εt+1

The MSE of the H-period-ahead forecast is

MSE(ŷt+H|t) = E(yt+H − ŷt+H|t)(yt+H − ŷt+H|t)'
            = Σε + Ψ1 Σε Ψ1' + Ψ2 Σε Ψ2' + ... + ΨH−1 Σε ΨH−1'


Variance Decomposition
Consider the orthogonalized representation of the shocks εt, that is

εt = A ut = a1 u1t + a2 u2t + ... + aN uNt

where aj is the j-th column of A and the ujt are uncorrelated.
(A can be obtained from the Cholesky decomposition)

This implies that

Σε = Var(εt)
   = E(εt εt') = E(A ut ut' A') = A E(ut ut') A'
   = a1 a1' Var(u1t) + a2 a2' Var(u2t) + ... + aN aN' Var(uNt)


Variance Decomposition

Substituting the expression of the variance Σε into the MSE of the H-step ahead prediction, one can decompose the total H-step ahead prediction error into the sum of N orthogonalized components: MSE(ŷt+H|t) is

Σ_{j=1}^N Var(ujt) [ aj aj' + Ψ1 aj aj' Ψ1' + Ψ2 aj aj' Ψ2' + ... + ΨH−1 aj aj' ΨH−1' ]

Thus,

Var(ujt) [ aj aj' + Ψ1 aj aj' Ψ1' + Ψ2 aj aj' Ψ2' + ... + ΨH−1 aj aj' ΨH−1' ]

measures the contribution of shock j.


Variance Decomposition

The H-step ahead proportion of the variance of the prediction error of i due to j is

dijH = Var(ujt) [ aj aj' + Ψ1 aj aj' Ψ1' + ... + ΨH−1 aj aj' ΨH−1' ]ii
       / Σ_{l=1}^N Var(ult) [ al al' + Ψ1 al al' Ψ1' + ... + ΨH−1 al al' ΨH−1' ]ii

As H increases, the MSE converges to the unconditional variance; thus when H is large we can interpret the variance decomposition as the portion of the unconditional variance explained by uj.
(Brownlees) 81/1
Connectedness Table

Connectedness Table

(Brownlees) 82/1
Connectedness Table

Orthogonalizing the Shocks

Variance decomposition is based on appropriately orthogonalizing the system shocks. However, this is not always possible to do, especially in large dimensional systems.

To this end, Diebold and Yilmaz suggest constructing variance decompositions based on the Generalized Variance Decomposition (GVD) proposed by Pesaran and Shin (1998).


GVD

The GVD connectedness table is constructed using

δijH = σjj⁻¹ Σ_{h=0}^H (ei' Ψh Σε ej)² / Σ_{h=0}^H (ei' Ψh Σε Ψh' ei)

and

dijH = δijH / Σ_{j=1}^N δijH

(note that we need to standardize the δ's appropriately, as the generalized variance decompositions do not sum to one since the errors are allowed to be correlated)
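A NumPy sketch of these two formulas (Psi is a list of VMA matrices Ψ0 = I, Ψ1, ..., ΨH and Sigma is the innovation covariance; the function name is mine):

import numpy as np

def gvd_table(Psi, Sigma):
    # Generalized variance decomposition shares d_ij
    N = Sigma.shape[0]
    delta = np.zeros((N, N))
    for i in range(N):
        denom = sum(Psi[h][i] @ Sigma @ Psi[h][i] for h in range(len(Psi)))
        for j in range(N):
            num = sum((Psi[h][i] @ Sigma[:, j]) ** 2 for h in range(len(Psi)))
            delta[i, j] = num / (Sigma[j, j] * denom)
    # Row-standardize so that each row sums to one
    return delta / delta.sum(axis=1, keepdims=True)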

Connectedness Table: Estimation

Estimation

It is straightforward to estimate the connectedness table. The VMA representation can be obtained by estimating a VAR by LS and then using the VAR companion form to obtain the infinite VMA representation.

Details can be found in the classic time series textbooks by Hamilton or Lütkepohl.
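For completeness, a sketch of the standard recursion (equivalent to iterating the companion form) that maps estimated VAR matrices A1, ..., Ap into the VMA matrices Ψh:

import numpy as np

def var_to_vma(A_list, H):
    # Psi_0 = I and Psi_h = sum_{k=1}^{min(h, p)} A_k Psi_{h-k}
    n = A_list[0].shape[0]
    Psi = [np.eye(n)]
    for h in range(1, H + 1):
        Psi.append(sum(A_list[k] @ Psi[h - 1 - k]
                       for k in range(min(h, len(A_list)))))
    return Psi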


Parameter Time Variation

Diebold and Yilmaz comment that the parameters of the VMA representation are likely to change slowly over time.

To this end, they suggest estimating the connectedness table using a rolling window to visualize how connectedness changes over time.

(Brownlees) 86/1
Connectedness Table

Connectedness Table:
Empirical Illustration
Connectedness Table

Empirical Illustration

Analysis of the connectedness of 13 large US financial institutions.

The focus is on interdependence in volatility, measured as realized volatility.

Analysis: full sample static analysis & 100-day rolling window

VAR(3) / H = 12 / GVD

Empirical Illustration: Unconditional Connectedness

[Table not recovered in extraction]

Empirical Illustration: Time-Varying Total Connectedness

[Figure not recovered in extraction]

Empirical Illustration: Time-Varying Net Connectedness

[Figure not recovered in extraction]

Large Dimensional Connectedness Tables

The baseline methodology of Diebold and Yilmaz was proposed for small dimensional VAR systems.

In practice, one may want to apply these tools to analyse large systems.

In more recent research, Diebold, Yilmaz and co-authors propose to estimate the VAR using the elastic net (a combination of LASSO and Ridge) and apply this to study large dimensional systems.

NETS

NETS (network estimation for time series) has been proposed in Barigozzi & Brownlees (2016).

The idea is to provide a generalization of the Partial Correlation Network for dependent data.

NETS: Definition

VAR Approximation

Approximate the yt process using a VAR

yt = Σ_{k=1}^p Ak yt−k + εt        εt ∼ wn(0, Σε)

A natural representation for such a process is the union of two networks:
1. a Granger network capturing the dynamic structure of the process, and
2. a contemporaneous network capturing contemporaneous dependence


NETS

1 Granger network
Directed network in which the set of edges EG is such that

(i, j) ∈ EG ⇔ i Granger causes j

2 contemporaneous network
undirected network in which the set of edges EC is such that

(i, j) ∈ EC ⇔ i and j are partially correlated given the past


NETS
The network can be characterized in terms of the autoregressive matrices Ak and the covariance matrix of the VAR innovations Σε

1. Granger network
   Directed network in which the set of edges EG is such that

   (i, j) ∈ EG ⇔ [Ak]ji ≠ 0 for at least one k

2. Contemporaneous network
   Undirected network in which the set of edges EC is such that

   (i, j) ∈ EC ⇔ [Σε⁻¹]ij ≠ 0

NETS: Estimation

Sparse Estimation

We work under the assumption that the VAR approximation is sparse, in the sense that the matrices Ak and Σε⁻¹ are assumed to be sparse.

We introduce a LASSO estimation algorithm which allows us to simultaneously estimate the parameters Ak and Σε⁻¹.

We work with an alternative parameterization of the model, θ = (a1', . . . , an', ρ')', where ai contains the stacked autoregressive coefficients of series i and ρ is the vector of partial correlations implied by Σε⁻¹.


NETS Steps

It can be shown that the loss function for the estimation of the model parameters can be written as

LT(θ) = Σ_{t=1}^T Σ_{i=1}^n ( yit − Σ_{j=1}^n ( aij − Σ_{k≠i} ρik √(ckk/cii) akj ) yjt−1 − Σ_{k≠i} ρik √(ckk/cii) ykt )²

where ckk denotes the k-th diagonal entry of Σε⁻¹.

In order to obtain sparse estimates, we optimize such an objective function subject to a LASSO penalty

LT(θ) + λ ( Σ_{i=1}^n Σ_{j=1}^n |aij| + Σ_{i=1}^n Σ_{k≠i} |ρik| )

It turns out that a variant of the standard shooting algorithm can be used to carry out this optimization via coordinate descent.

NETS: Empirical Application

Empirical Application

We are interested in estimating the network of interconnections of stock returns and stock return volatilities.

Daily log returns / log volatilities (high-low range) for 93 U.S. blue chips between 2000 and 2013.

The influence of common factors is netted out.


“Reading” the Network

It turns out that the networks share many of the characteristics of social networks.

1. Centrality. Financials, Energy and Technology are some of the most central sectors.

2. Community Structure. Vertices that are similar are linked (industry linkages).

3. Power Law Structure. Evidence of “Small World Effects”.


Volatility Network

[Figure: estimated Granger volatility network; vertices labeled by ticker]

Volatility Network
[Figure: estimated contemporaneous volatility network; vertices labeled by ticker]

Out–of–sample Validation

It is interesting to evaluate the network in terms of out-of-sample prediction.

To this end, we use the estimated Granger network to predict future volatility and compare the forecasts with those obtained using an array of alternative techniques.

[Table: out-of-sample forecast comparison results (not recovered in extraction)]
