
Frontiers of Computational Journalism
Columbia Journalism School

Week 9: Spooky Network Analysis
October 31, 2014



Network
A set of people and a set of connections between pairs of them

Types of connections
Social network analysis: only one type of connection between
individuals (e.g. "friend")

Link analysis: multiple types of connections
friend
brother
employer
went to university with
sold a car to
owns 31% of

Link analysis is much more relevant to journalism, because it
allows representation of much more detail and context.
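
In data terms, the difference can be sketched as follows. This is only an illustration: the (source, relation, target) triple layout, the function names, and every name in the sample data are assumptions, not something from the lecture.

```python
from collections import defaultdict

# Hypothetical link-analysis data: every edge carries a relationship type.
links = [
    ("Alice", "friend", "Bob"),
    ("Alice", "employer", "Acme Corp"),
    ("Bob", "sold a car to", "Carol"),
    ("Carol", "brother", "Dave"),
]

def by_relation(triples):
    """Group (source, target) pairs by relationship type (link analysis view)."""
    grouped = defaultdict(list)
    for source, relation, target in triples:
        grouped[relation].append((source, target))
    return dict(grouped)

def untyped(triples):
    """Collapse to a single connection type (social network analysis view)."""
    return {frozenset((source, target)) for source, _, target in triples}

typed = by_relation(links)
plain = untyped(links)
```

The untyped view keeps who is connected to whom but discards the detail and context that a story usually depends on.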

People Act in Groups
Family and friendships: I am most closely connected to a small set of
people, who are usually closely connected to each other.

Business: I am much more likely to do business with people I already
know.

Influence: I listen to people I know more than I listen to strangers.

Norms: what is right depends on what the people around me think.

People tend to marry, do business with, spend time with, etc. people
from similar backgrounds... and people who have social ties tend to be
similar.


Homophily
Homophily is the principle that contact between similar people
occurs at a higher rate than among dissimilar people. The
pervasive fact of homophily means that cultural, behavioral,
genetic, or material information that flows through networks
will tend to be localized. Homophily implies that distance in
terms of social characteristics translates into network distance,
the number of relationships through which a piece of
information must travel to connect two individuals.

- McPherson, Smith-Lovin, Cook
Birds of a Feather: Homophily in Social Networks



Structure Relates to Behavior
In a 1951 experiment, researchers had five people work together, only
allowed to communicate according to one of the patterns above. They
were each given a card with several symbols on it. The task was to
determine which symbol was in common between all of the cards. It was
repeated many times.

How did the groups organize themselves? Which patterns were fastest?

From H. Leavitt, Some effects of certain communication patterns on group performance,
Journal of Abnormal Psychology 46(1)

Correlation of different types of info
Suppose you have a record of phone numbers called, a database
of political campaign donations, and a list of government
appointees. Put them together, and you have this story:

WASHINGTON - Time and again, Texas Gov. Rick Perry picked up his office
phone in the months before he would announce his bid for the presidency.
He dialed wealthy friends who were his big fundraisers and state officials who
owed him for their jobs.

Perry also met with a Texas executive who would later co-found an
independent political committee that has promised to raise millions to
support Perry but is prohibited from coordinating its activities with the
governor.

- Jack Gillum, Perry called top donors from work phones, AP, 6 Dec 2011
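
Putting the three sources together amounts to a join on a shared key. The sketch below assumes a phone directory that maps numbers to names; that directory, the record layouts, and every value shown are hypothetical and invented for illustration.

```python
# Hypothetical records; all names, numbers, and amounts are invented.
calls = [("555-0101", "2011-05-02"), ("555-0102", "2011-05-03"),
         ("555-0199", "2011-05-04")]
phone_book = {"555-0101": "A. Donor", "555-0102": "B. Official",
              "555-0199": "C. Stranger"}
donations = {"A. Donor": 250000}            # name -> total donated
appointees = {"B. Official": "State Board"}  # name -> appointed position

def annotate_calls(calls, phone_book, donations, appointees):
    """Attach donation and appointment information to every call record."""
    rows = []
    for number, date in calls:
        name = phone_book.get(number)
        rows.append({
            "date": date,
            "name": name,
            "donated": donations.get(name, 0),
            "appointed_to": appointees.get(name),
        })
    return rows

rows = annotate_calls(calls, phone_book, donations, appointees)
# Calls worth a reporter's attention: the person is a donor or an appointee.
notable = [r["name"] for r in rows if r["donated"] or r["appointed_to"]]
```

In practice the hard part is the key: real records rarely share a clean identifier, so each match still needs checking by hand.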


Social Network Analysis in Journalism
Identify people or communities
Trace patterns of connections
Understand spread of information and behavior
Illustrate complex stories
Useful in all areas where CS intersects journalism! (Reporting,
communication, filtering, effect tracking)
Two major analysis methods
Look at a visualization
Apply algorithm
In both cases, the results are not interpretable without context!
Force-Directed Layout
Each edge is a "spring" with a fixed preferred length.
Plus global repulsive force that pushes all nodes apart.
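
The two forces can be sketched in a few lines. This is a minimal illustration, not a production layout engine: the inverse-square repulsion, the linear spring, the step cap, and all parameter values are assumptions chosen to keep the example stable.

```python
import math
import random

def force_layout(nodes, edges, rest_length=1.0, spring=0.1, repulsion=0.1,
                 max_step=0.1, steps=300, seed=0):
    """Return {node: (x, y)} after a fixed number of relaxation steps."""
    rng = random.Random(seed)
    pos = {n: [rng.uniform(-1, 1), rng.uniform(-1, 1)] for n in nodes}
    for _ in range(steps):
        force = {n: [0.0, 0.0] for n in nodes}
        # Global repulsion: every pair of nodes pushes apart (inverse-square).
        for i, a in enumerate(nodes):
            for b in nodes[i + 1:]:
                dx, dy = pos[a][0] - pos[b][0], pos[a][1] - pos[b][1]
                dist = math.hypot(dx, dy) or 1e-9
                push = repulsion / dist ** 2
                force[a][0] += push * dx / dist
                force[a][1] += push * dy / dist
                force[b][0] -= push * dx / dist
                force[b][1] -= push * dy / dist
        # Springs: each edge pulls or pushes its ends toward the rest length.
        for a, b in edges:
            dx, dy = pos[b][0] - pos[a][0], pos[b][1] - pos[a][1]
            dist = math.hypot(dx, dy) or 1e-9
            pull = spring * (dist - rest_length)
            force[a][0] += pull * dx / dist
            force[a][1] += pull * dy / dist
            force[b][0] -= pull * dx / dist
            force[b][1] -= pull * dy / dist
        # Move each node along its net force, capping the step for stability.
        for n in nodes:
            fx, fy = force[n]
            size = math.hypot(fx, fy)
            scale = min(1.0, max_step / size) if size > 0 else 0.0
            pos[n][0] += fx * scale
            pos[n][1] += fy * scale
    return {n: tuple(p) for n, p in pos.items()}

def distance(pos, a, b):
    return math.hypot(pos[a][0] - pos[b][0], pos[a][1] - pos[b][1])

layout = force_layout(["a", "b", "c"], [("a", "b"), ("b", "c")])
```

On this three-node path the connected pairs settle somewhat beyond the rest length, because repulsion stretches the springs, while the unconnected ends "a" and "c" are pushed farthest apart.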
From The Effect of Graph Layout on Inference from Social Network
Data, Blythe et al.
We asked respondents three questions about the same five focal
nodes in each sociogram:

1) how many subgroups were in the sociogram
2) how "prominent" was each player in the sociogram
3) how important a "bridging" role did each player occupy in the
sociogram
Centrality
Often identified with "influence" or "power." Often
important in journalism.

We can visualize the graph and use our eyes, or we can
compute centrality values algorithmically.

Degree centrality: number of edges
Models: cases where the number of connections is important.
Example: which celebrity can reach the most people at once?
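
A minimal sketch of degree centrality on an undirected edge list. The node names are made up for illustration.

```python
from collections import Counter

def degree_centrality(edges):
    """Count how many edges touch each node of an undirected graph."""
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    return degree

# Hypothetical follower network.
edges = [("celebrity", "fan1"), ("celebrity", "fan2"),
         ("celebrity", "fan3"), ("fan1", "fan2")]
degree = degree_centrality(edges)
most_connected = max(degree, key=degree.get)
```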
Closeness centrality: average distance to all other nodes
Models: cases where time taken to reach a node is important.
Example: who finds out about gossip first?
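
A minimal sketch using breadth-first search to measure distances. It assumes an unweighted, connected graph stored as an adjacency dictionary; the five-person chain is invented for illustration. Lower average distance means the node is more central.

```python
from collections import deque

def bfs_distances(adj, start):
    """Number of hops from start to every reachable node."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in adj[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist

def average_distance(adj, node):
    """Average hop count from node to all other nodes (assumes connectivity)."""
    dist = bfs_distances(adj, node)
    others = [d for n, d in dist.items() if n != node]
    return sum(others) / len(others)

# Hypothetical gossip chain: a - b - c - d - e
adj = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"],
       "d": ["c", "e"], "e": ["d"]}
avg = {n: average_distance(adj, n) for n in adj}
hears_first = min(avg, key=avg.get)
```

Library implementations usually report the reciprocal of this average, so that larger values mean more central.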
Betweenness centrality:
number of shortest paths that pass through node
Models: cases where control over transmission is important.
Example: who has the most power to make introductions?
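
A sketch of one standard way to compute this, Brandes' algorithm for unweighted graphs. When a pair of nodes has several equally short paths, each path counts fractionally, which is a refinement of the plain count described above. The example graph is invented.

```python
from collections import deque

def betweenness(adj):
    """Shortest-path betweenness for an undirected, unweighted graph."""
    score = {n: 0.0 for n in adj}
    for s in adj:
        # Breadth-first search from s, counting shortest paths (sigma)
        # and recording each node's predecessors on those paths.
        order = []
        preds = {n: [] for n in adj}
        sigma = {n: 0 for n in adj}
        dist = {n: -1 for n in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Walk back from the farthest nodes, accumulating path shares.
        delta = {n: 0.0 for n in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Every undirected pair was visited from both of its endpoints.
    return {n: value / 2 for n, value in score.items()}

# Hypothetical chain of acquaintances: a - b - c - d - e
adj = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"],
       "d": ["c", "e"], "e": ["d"]}
between = betweenness(adj)
```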
Eigenvector centrality:
how likely you are to end up at a node on a random walk
(same idea as PageRank)
Models: cases where importance of neighbors is important.
Example: the private adviser to the president
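
A minimal sketch using power iteration on the adjacency structure: each node's score is repeatedly replaced by the sum of its neighbours' scores. Adding the node's own score at each step is an assumption made here to keep the iteration from oscillating on bipartite graphs; PageRank additionally uses damping and per-link normalization, which this sketch omits. The graph is invented for illustration.

```python
def eigenvector_centrality(adj, iterations=100):
    """Approximate the leading eigenvector of the adjacency matrix."""
    score = {n: 1.0 for n in adj}
    for _ in range(iterations):
        # New score = own score + sum of neighbours' scores (iterates A + I,
        # which has the same eigenvectors as A).
        new = {n: score[n] + sum(score[nb] for nb in adj[n]) for n in adj}
        top = max(new.values())
        score = {n: value / top for n, value in new.items()}
    return score

# Hypothetical network: the adviser and the intern both have one connection.
adj = {
    "president": ["minister1", "minister2", "minister3", "adviser"],
    "minister1": ["president", "intern"],
    "minister2": ["president"],
    "minister3": ["president"],
    "adviser": ["president"],
    "intern": ["minister1"],
}
score = eigenvector_centrality(adj)
```

The adviser and the intern have the same degree, but the adviser scores higher because the single neighbour is the best-connected node.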
Journalism centrality:
how important is this person to this story?
Who is "important"?
What type of person do you want to identify in the network?

Often assumed we're after "influential." But sociology says
"power" is a complicated thing and difficult to define and
measure.

Network analysis has mostly ignored this problem. I know of no
successful use of centrality metrics in journalism - maybe you'll
be the first.
Finding Communities
No one definition of "community." Could mean a town,
or a club, or an industry network.

But for our purposes, a community is "a group of
people with pre-existing patterns of association."

In social network analysis, that translates into clusters
in the graph.
Friends/followers
Co-consumption - Network of political book sales, Orgnet.com
Communications network - Exploring Enron, Jeffrey Heer
Web link structure - Map of Iranian Blogosphere, Berkman Center
Individual time/location trails - CitySense, Sense Networks
Warning: no network is ever "complete."
Otherwise there would be 7 billion people in it
Mathematical definitions of "cluster"
You've already seen several! If you can compute
distance between any two items, you can cluster.

But in social networks, not everyone is connected to
everyone else...
Modularity
Are there more intra-group edges than we would
expect randomly?
Modularity
n = number of vertices
$k_i$ = degree of vertex i
$A_{ij}$ = 1 if edge between i,j, 0 otherwise
$g_{ij}$ = 1 if i,j in same group, 0 otherwise

There are $m = \frac{1}{2} \sum_i k_i$ total edges in the graph.
If they go between random vertices then
number of edges between i,j is $k_i k_j / 2m$
Modularity

$$Q = \sum_{ij} \left( A_{ij} - \frac{k_i k_j}{2m} \right) g_{ij}$$

If Q>0 then there are "excess" edges inside the
groups (and fewer edges between them.)
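
The modularity sum can be computed directly, which makes a useful check on small graphs. This sketch follows the unnormalized form used in these slides (many references divide the sum by 2m). The function name and the two-triangle example are assumptions for illustration.

```python
def modularity(edges, group_of):
    """Q = sum over ordered pairs (i, j) in the same group of
    A_ij - k_i * k_j / (2m), in the unnormalized form used here."""
    nodes = sorted(group_of)
    neighbours = {n: set() for n in nodes}
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)
    degree = {n: len(neighbours[n]) for n in nodes}
    two_m = sum(degree.values())  # 2m, because m = (1/2) * sum of degrees
    q = 0.0
    for i in nodes:
        for j in nodes:
            if group_of[i] == group_of[j]:               # g_ij = 1
                a_ij = 1 if j in neighbours[i] else 0    # A_ij
                q += a_ij - degree[i] * degree[j] / two_m
    return q

# Two triangles joined by a single edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
q_split = modularity(edges, {0: "x", 1: "x", 2: "x", 3: "y", 4: "y", 5: "y"})
q_single = modularity(edges, {n: "x" for n in range(6)})
```

Splitting along the bridge gives Q = 12 - 7 = 5, while putting every node in one group gives Q = 0, since the observed and expected edge counts then cancel exactly.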
Modularity algorithm
Look for a division of nodes into two groups that
maximizes Q
Can find this through eigenvector technique
Possible that no division has Q>0, in which case the
graph is a single community
If a division with Q>0 found, split
Recursively split sub-graphs
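
One common reading of the "eigenvector technique" is to split nodes by the sign of the leading eigenvector of the modularity matrix B, where B_ij = A_ij - k_i k_j / 2m. The sketch below assumes that reading and performs a single bisection with plain power iteration; the check that Q>0 and the recursive splitting are left out, and the function name and parameters are hypothetical.

```python
import random

def split_by_leading_eigenvector(nodes, edges, iterations=500, seed=0):
    """Bisect a graph by the signs of the modularity matrix's top eigenvector."""
    index = {n: i for i, n in enumerate(nodes)}
    size = len(nodes)
    A = [[0.0] * size for _ in range(size)]
    for a, b in edges:
        A[index[a]][index[b]] = A[index[b]][index[a]] = 1.0
    k = [sum(row) for row in A]
    two_m = sum(k)
    # Modularity matrix: observed edges minus edges expected at random.
    B = [[A[i][j] - k[i] * k[j] / two_m for j in range(size)]
         for i in range(size)]
    # Shift the spectrum so no eigenvalue is negative; power iteration
    # then converges to the eigenvector of B's largest eigenvalue.
    shift = max(sum(abs(x) for x in row) for row in B)
    rng = random.Random(seed)
    v = [rng.uniform(-1, 1) for _ in range(size)]
    for _ in range(iterations):
        w = [sum(B[i][j] * v[j] for j in range(size)) + shift * v[i]
             for i in range(size)]
        norm = max(abs(x) for x in w)
        v = [x / norm for x in w]
    group_a = {n for n in nodes if v[index[n]] > 0}
    return group_a, set(nodes) - group_a

# Two triangles joined by a single edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
part_a, part_b = split_by_leading_eigenvector(list(range(6)), edges)
```

Which side comes back as the first group depends on the arbitrary sign of the eigenvector, so only the partition itself is meaningful.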
The Hairball problem
Real social networks are big, with complex, overlapping
communities in the central component. Modularity and other
community detection algorithms give poor results.
k-core Decomposition
Find the nodes at the "center" of a network.

for k=1 to maximum node degree
    repeat
        remove all nodes with degree < k
    until all remaining nodes have degree >= k
    set "core number" of remaining nodes to k
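
The pseudocode above translates almost line for line into Python. This is a straightforward sketch rather than an optimized implementation; the adjacency-dictionary input format and the example graph are assumptions.

```python
def core_numbers(adj):
    """Return {node: core number} for an undirected graph given as
    {node: iterable of neighbours}."""
    remaining = {n: set(neighbours) for n, neighbours in adj.items()}
    core = {n: 0 for n in adj}
    k = 1
    while remaining:
        # repeat: remove all nodes with degree < k,
        # until all remaining nodes have degree >= k
        while True:
            low = [n for n, neighbours in remaining.items()
                   if len(neighbours) < k]
            if not low:
                break
            for n in low:
                for neighbour in remaining.pop(n):
                    if neighbour in remaining:
                        remaining[neighbour].discard(n)
        # set "core number" of remaining nodes to k
        for n in remaining:
            core[n] = k
        k += 1
    return core

# A triangle (a, b, c) with one extra node d hanging off a.
adj = {"a": ["b", "c", "d"], "b": ["a", "c"], "c": ["a", "b"], "d": ["a"]}
core = core_numbers(adj)
```

The loop stops once every node has been removed, which happens no later than k = maximum degree + 1, so it covers the same range as the pseudocode.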

k-core Decomposition
Carmi et al., A model of Internet topology using k-shell decomposition
Protest Dynamics on Twitter
González-Bailón et al., The Dynamics of Protest Recruitment through an
Online Network
k-core number vs. maximum cascade size. Color = sent at least one tweet
which reached this fraction of users (orange = reached all users)
Key insight: triangles not edges
Simmel's theory of sociology (early 20th C.) says
relationship between two people cannot be
understood without context.
Idea: count shared triangles
1. For each node A and each of A's friends B,
count the number of triangles involving A and
B (= number of shared friends of A and B).
2. Rank A's friends (each B) by number of shared
friends (number of C's for A,B) to create "top
friends" list for A.
3. Keep the edge between nodes A,D only if
there is some threshold percentage overlap in
their top friends lists.
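
The three steps above can be sketched as follows. This is a simplified illustration, not the published Simmelian backbone algorithm: the list length `top_k`, the threshold `min_overlap`, and the use of shared-over-combined list size as the overlap measure are all assumptions.

```python
def simmelian_backbone(adj, top_k=3, min_overlap=0.5):
    """Keep only edges whose endpoints have similar 'top friends' lists.
    adj maps each node to the set of its friends."""
    def shared(a, b):
        # Step 1: shared friends of a and b = triangles through edge (a, b).
        return len(adj[a] & adj[b])

    # Step 2: rank each node's friends by shared friends, keep the top k.
    top = {}
    for a in adj:
        ranked = sorted(adj[a], key=lambda b: (-shared(a, b), str(b)))
        top[a] = set(ranked[:top_k])

    # Step 3: keep an edge only if the two lists overlap enough.
    kept = set()
    for a in adj:
        for b in adj[a]:
            overlap = len(top[a] & top[b]) / len(top[a] | top[b])
            if overlap >= min_overlap:
                kept.add(frozenset((a, b)))
    return kept

def clique(members):
    return {a: {b for b in members if b != a} for a in members}

# Two tight groups of four, joined only by the edge d - e.
adj = clique("abcd")
adj.update(clique("efgh"))
adj["d"].add("e")
adj["e"].add("d")
backbone = simmelian_backbone(adj)
```

The bridge between the two groups has no shared friends and is dropped, while every edge inside a group survives.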
Simmelian Backbones
SNA in journalism
ICIJ Offshore Tax Haven leak
ICIJ human tissue investigation
Organized Crime and Corruption Reporting Project
WSJ Galleon's Web insider trading story
SCMP's Who Runs Hong Kong
Muckety.com
The other challenge was the data itself. How to separate the extraordinary from the routine
and find the public interest inside a maze of more than 37,000 offshore company holders? A
first step was to build as many lists as possible of public figures: Politburo members, military
commanders, mayors of large cities, billionaires listed in Forbes and Hurun's rankings of the
mega-wealthy and so-called princelings (relatives of the current leadership or former
Communist Party elders).

Through painstaking database work, a reporter in Spain cross-referenced the lists of notable
Chinese against the names of offshore clients listed within ICIJ's Offshore Leaks data. The
added difficulty was that in most cases, names in the offshore files were registered in
Romanized form, not Chinese characters. This made making exact matches extremely hard,
because Romanized spellings from Chinese characters tend to vary widely: Wang might be
spelled Wong, Zhang could be Cheung, and Ye might be spelled Yeh. Addresses and ID
numbers helped confirmed many identities but many others names were dropped because
the reporting team could not be 100 percent sure that the person was a correct match.

A picture slowly began to emerge: China's elites were aggressively using offshore havens to
hold assets, list companies in the world's stock exchanges, buy and sell real estate and
conduct their business away from Beijing's red tape and capital controls.

How We Did Offshore Leaks China, ICIJ
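
The matching problem described in the passage above can be illustrated with the standard library's `difflib`. This is only a sketch of generating candidate matches for a human to verify, not a description of ICIJ's actual process; the function names and the 0.7 threshold are assumptions.

```python
from difflib import SequenceMatcher

def similarity(a, b):
    """Similarity ratio between two strings, ignoring case (0.0 to 1.0)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def candidate_matches(notable, clients, threshold=0.7):
    """Pair up names that look alike; every pair still needs manual checking."""
    pairs = []
    for name in notable:
        for client in clients:
            score = similarity(name, client)
            if score >= threshold:
                pairs.append((name, client, round(score, 2)))
    return pairs

# The spelling variants mentioned in the passage above.
notable = ["Wang", "Zhang", "Ye"]
clients = ["Wong", "Cheung", "Yeh"]
pairs = candidate_matches(notable, clients)
```

Plain string similarity links Wang with Wong and Ye with Yeh but misses Zhang and Cheung entirely, which is one reason the reporters relied on addresses and ID numbers and dropped names they could not confirm.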
Analyzing the Data Behind Skin and Bone, ICIJ
Who Runs HK? The Fight over Stanley Ho's Fortune
South China Morning Post, 2010
SNA that could be used in Journalism
The Network of Global Corporate Control paper
Network of campaign finance contributions
(SuperPACs)
International financial system / HFT
"Revolving door" / regulatory capture
Political elite in any country
Find audience for story, akin to targeted marketing
...
Vitali, Glattfelder, Battiston, The Network of Global Corporate Control
SNA in journalism
Visualization widely used
Link analysis successful in investigative reporting
Most of the work required to do these types of
stories is traditional research, not SNA.
I am not aware of successful application of centrality
metrics or community detection algorithms.
This may change as the graphs journalism examines
get bigger...
Would it be possible to use community detection to
find the "right" audience for a story?
