of Computational Journalism
Columbia Journalism School
Week 9: Social Network Analysis
November 12, 2012
Network
A set of people
Types of connections
Social network analysis: only one type of connection between individuals (e.g. "friend")
Link analysis: multiple types of connections
friend
brother
employer
went to university with
sold a car to
Link analysis is much more relevant to journalism, because it allows representation of much more detail and context.
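One way to picture the difference between the two representations is a small sketch in Python. The people, the triples, and both helper names (`build_link_graph`, `collapse_to_social_network`) are made up for illustration and are not from the lecture; treating each link as directed from the first person to the second is also an assumption.

```python
from collections import defaultdict

# Hypothetical example data: (person, type of connection, person) triples.
links = [
    ("Alice", "friend", "Bob"),
    ("Alice", "employer", "Carol"),
    ("Bob", "sold a car to", "Carol"),
    ("Bob", "brother", "Dan"),
]

def build_link_graph(triples):
    """Link analysis view: keep the type of every connection."""
    graph = defaultdict(list)
    for source, relation, target in triples:
        graph[source].append((relation, target))
    return graph

def collapse_to_social_network(triples):
    """Social network analysis view: drop the types and keep one
    undirected edge per connected pair of people."""
    return {frozenset((source, target)) for source, _, target in triples}

link_graph = build_link_graph(links)
social_edges = collapse_to_social_network(links)
```

Collapsing to `social_edges` loses exactly the detail and context (who employed whom, who sold what to whom) that the typed version keeps.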
Betweenness centrality: fraction of shortest paths that pass through a node
Eigenvector centrality: how likely you are to end up at a node on a random walk (akin to PageRank)
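Both measures can be sketched in plain Python. This is a minimal illustration and not code from the lecture: the six-person graph (two triangles joined by one bridge) is invented, betweenness is computed with Brandes' algorithm as a raw count of shortest paths without normalizing to a fraction, and eigenvector centrality is taken in its usual sense as the leading eigenvector of the adjacency matrix, found by power iteration. PageRank is a related random-walk variant that is not implemented here.

```python
from collections import deque

# Hypothetical graph: triangle a-b-c and triangle d-e-f joined by the edge c-d.
adj = {
    "a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b", "d"],
    "d": ["c", "e", "f"], "e": ["d", "f"], "f": ["d", "e"],
}

def betweenness(adj):
    """Brandes' algorithm for an unweighted, undirected graph.
    Returns, per node, the number of shortest paths between other pairs
    of nodes that pass through it (fractional when several paths tie)."""
    score = {v: 0.0 for v in adj}
    for s in adj:
        # Breadth-first search from s, counting shortest paths (sigma).
        order, preds = [], {v: [] for v in adj}
        sigma = {v: 0 for v in adj}
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Walk back from the farthest nodes, accumulating path shares.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each undirected pair was counted once from each endpoint.
    return {v: b / 2 for v, b in score.items()}

def eigenvector_centrality(adj, iterations=200):
    """Power iteration on (A + I); scores are normalized to sum to 1."""
    x = {v: 1.0 for v in adj}
    for _ in range(iterations):
        # Adding the node's own score (the +I shift) avoids oscillation on
        # bipartite graphs and leaves the leading eigenvector unchanged.
        nxt = {v: x[v] + sum(x[w] for w in adj[v]) for v in adj}
        total = sum(nxt.values())
        x = {v: value / total for v, value in nxt.items()}
    return x

bc = betweenness(adj)
ec = eigenvector_centrality(adj)
```

In this toy graph the two bridge nodes `c` and `d` score highest on both measures, since every path between the triangles runs through them.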
Who is "important"?
What type of person do you want to identify in the network?
Often assumed we're after "influential." But sociology says "power" is a complicated thing and difficult to define and measure. Network analysis has mostly ignored this problem.
I know of no successful use of centrality metrics in journalism. Maybe you'll be the first.
Finding Communities
For our purposes, a community is "a group of people who think or act collectively."
In social network analysis, that translates into clusters in the graph.
Friends/followers
Modularity
$n$ = number of vertices
$k_i$ = degree of vertex $i$
$A_{ij}$ = 1 if there is an edge between $i,j$, 0 otherwise
$g_{ij}$ = 1 if $i,j$ are in the same group, 0 otherwise
There are $m = \frac{1}{2} \sum_i k_i$ total edges in the graph. If they go between random vertices then the expected number of edges between $i,j$ is $k_i k_j / 2m$
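A quick consistency check on the random-edge model, using only the definitions above (this derivation is added for illustration and was not on the slide): summing the expected edge count over all partners $j$ gives back the degree of vertex $i$, so the random model preserves every vertex's expected degree.

```latex
\sum_j \frac{k_i k_j}{2m}
  = \frac{k_i}{2m} \sum_j k_j
  = \frac{k_i}{2m} \cdot 2m
  = k_i ,
\qquad \text{since } m = \tfrac{1}{2} \sum_j k_j .
```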
Modularity
$Q = \sum_{ij} \left( A_{ij} - k_i k_j / 2m \right) g_{ij}$
If Q>0 then there are "excess" edges inside the groups (and fewer edges between them).
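To make the formula concrete, here is a minimal sketch that evaluates Q directly from an edge list and a group assignment. The function name `modularity`, the toy graph, and the group labels are assumptions for illustration. The sum runs over ordered pairs (i, j), including i = j, and uses the unnormalized form shown on the slide; some references add a normalizing constant, which is omitted here.

```python
def modularity(edges, group):
    """Q = sum over ordered pairs (i, j) of (A_ij - k_i * k_j / 2m) * g_ij."""
    nodes = sorted(group)
    neighbors = {v: set() for v in nodes}
    for i, j in edges:
        neighbors[i].add(j)
        neighbors[j].add(i)
    degree = {v: len(neighbors[v]) for v in nodes}
    two_m = sum(degree.values())  # every edge adds 1 to two degrees
    q = 0.0
    for i in nodes:
        for j in nodes:
            if group[i] == group[j]:  # g_ij = 1
                a_ij = 1 if j in neighbors[i] else 0
                q += a_ij - degree[i] * degree[j] / two_m
    return q

# Hypothetical graph: two triangles joined by the single edge c-d.
edges = [("a", "b"), ("a", "c"), ("b", "c"),
         ("c", "d"),
         ("d", "e"), ("d", "f"), ("e", "f")]

by_triangle = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
one_group = {v: 0 for v in by_triangle}

q_split = modularity(edges, by_triangle)
q_whole = modularity(edges, one_group)
```

Splitting along the bridge gives a positive Q (excess edges inside each triangle), while putting everyone in one group gives Q = 0, which is why "Q>0" is the test for a worthwhile division.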
Modularity algorithm
Look for a division of nodes into two groups that maximizes Q
Can find this through eigenvector technique
Possible that no division has Q>0, in which case the graph is a single community
If a division with Q>0 found, split
Recursively split sub-graphs
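The steps above can be sketched as follows. This is an illustration under stated assumptions and not the lecture's implementation: the eigenvector technique is replaced by a brute-force search over every two-way split, which is only feasible for very small graphs, and a sub-graph is split only when doing so raises Q for the whole graph. The names `best_split` and `find_communities` are hypothetical.

```python
from itertools import combinations

def modularity(neighbors, groups):
    """Unnormalized Q for a partition given as a list of sets of nodes."""
    two_m = sum(len(n) for n in neighbors.values())
    q = 0.0
    for members in groups:
        for i in members:
            for j in members:
                a_ij = 1 if j in neighbors[i] else 0
                q += a_ij - len(neighbors[i]) * len(neighbors[j]) / two_m
    return q

def best_split(neighbors, groups, index):
    """Try every two-way split of groups[index]. Return the partition with
    the largest gain in Q, or None if no split improves Q."""
    members = sorted(groups[index])
    others = groups[:index] + groups[index + 1:]
    base = modularity(neighbors, groups)
    best, best_gain = None, 1e-9  # small tolerance for rounding error
    first, rest = members[0], members[1:]
    # Keep `first` on one side so each split is tried exactly once, and
    # leave at least one node on the other side.
    for size in range(len(rest)):
        for side in combinations(rest, size):
            part_a = {first, *side}
            part_b = set(members) - part_a
            candidate = others + [part_a, part_b]
            gain = modularity(neighbors, candidate) - base
            if gain > best_gain:
                best, best_gain = candidate, gain
    return best

def find_communities(edges):
    """Recursively bisect the graph until no split increases Q."""
    neighbors = {}
    for i, j in edges:
        neighbors.setdefault(i, set()).add(j)
        neighbors.setdefault(j, set()).add(i)
    groups = [set(neighbors)]
    index = 0
    while index < len(groups):
        split = best_split(neighbors, groups, index)
        if split is None:
            index += 1      # this group is a single community
        else:
            groups = split  # the halves are appended and revisited later
    return groups

# Hypothetical graph: two triangles joined by the single edge c-d.
edges = [("a", "b"), ("a", "c"), ("b", "c"),
         ("c", "d"),
         ("d", "e"), ("d", "f"), ("e", "f")]
communities = find_communities(edges)
```

On this toy graph the search cuts the bridge between the triangles and then stops, because splitting either triangle further would lower Q. For graphs of realistic size the exhaustive search would have to be replaced by the eigenvector technique or another heuristic.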
SNA in journalism
ICIJ human tissue investigation
WSJ "Galleon's Web" insider trading story
SCMP "Who Runs Hong Kong"
Muckety
SNA in journalism
Visualization widely used
I am not aware of successful application of centrality metrics or automated community detection.
This may change as the graphs journalism examines get bigger...
Would it be possible to use community detection to find the "right" audience for a story?