GRAPH CONCEPTS
Name: Definition
Graph: A set of nodes & edges. Can be directed or undirected. A graph is connected if there is a path between any two of its vertices; otherwise it breaks into connected components.
Multigraph: A graph that allows loops and multiple edges
Weighted Graph: Graph with weighted edges
Labelled Graph: A graph where its nodes or edges have properties (attributes)
Distance between 2 nodes: Length of the shortest path between the 2 nodes. Example: Distance between A & D is 2
Simple Path
Length of a path
Diameter of graph: Maximum distance in graph
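The distance and diameter definitions can be checked with a breadth-first search. This is a minimal sketch: the edge list is a hypothetical example (the original figure is not available), chosen so that the distance between A and D is 2, and the function names are illustrative only. It assumes an undirected, connected graph.

```python
from collections import deque

# Hypothetical undirected graph (not the figure from the notes).
EDGES = [("A", "B"), ("B", "C"), ("B", "D"), ("C", "D"), ("D", "E")]

def build_adjacency(edges):
    """Turn an undirected edge list into a dict of neighbor sets."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def shortest_distances(adj, source):
    """Breadth-first search: number of edges on the shortest path to each node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in adj[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist

def diameter(adj):
    """Maximum distance over all pairs of nodes (graph assumed connected)."""
    return max(max(shortest_distances(adj, n).values()) for n in adj)

adj = build_adjacency(EDGES)
distance_a_d = shortest_distances(adj, "A")["D"]  # 2 (A-B-D)
graph_diameter = diameter(adj)                    # 3 (A-B-D-E)
```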
NETWORK CHARACTERISTICS
A Full network contains all entities and connections among them
Ego: Node in focus
Alter: Neighbor of Ego
Egocentric Network: An ego and its connections
Unimodal Network
Multimodal Network
2 types
Name
Edges of
2 types
Size of network
Density of network:
Directed network of size $n$: max no. of ties $= n(n-1)$
Undirected network of size $n$: max no. of ties $= \frac{n(n-1)}{2}$
Reachability
Degree Centrality
In-degree Centrality
Out-degree Centrality
Closeness Centrality $= \frac{1}{\text{sum of shortest distances}}$
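The tie counts used for density can be sketched in a few lines. The function names are illustrative, and taking density as actual ties divided by the maximum possible ties is the usual definition, assumed here rather than quoted from the notes.

```python
def max_ties(n, directed):
    """Maximum number of ties in a network of n nodes (no loops)."""
    return n * (n - 1) if directed else n * (n - 1) // 2

def density(n, ties, directed):
    """Actual ties divided by the maximum possible ties."""
    return ties / max_ties(n, directed)

directed_max = max_ties(5, directed=True)      # 20
undirected_max = max_ties(5, directed=False)   # 10
example_density = density(5, 6, directed=False)  # 0.6
```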
Using geodesic (shortest) distance, closeness centrality:
Node A = $\frac{1}{1+1+1+1} = 0.25$
Node B = $\frac{1}{1+2+2+2} = 0.14$
Node C = $\frac{1}{1+2+1+2} = 0.17$
Node D = $\frac{1}{1+2+1+1} = 0.2$
Node E = $\frac{1}{1+2+2+1} = 0.17$
Betweenness Centrality $= \frac{\text{Number of shortest paths passing through } v}{\text{Number of shortest paths}}$
Node   Betweenness   Eigenvector
A      0.5           0.162
B      1.5           0.241
C      0.0           0.194
D      0.5           0.162
E      1.5           0.241
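The closeness figures for nodes A to E can be reproduced in code. This is a sketch under an assumption: the original figure is missing, so the edge list below was inferred from the listed distance sums (for example, A is at distance 1 from every other node) and may not match the original drawing exactly.

```python
from collections import deque

# Edge list inferred from the distance sums in the notes (an assumption).
EDGES = [("A", "B"), ("A", "C"), ("A", "D"), ("A", "E"), ("C", "D"), ("D", "E")]

def build_adjacency(edges):
    """Turn an undirected edge list into a dict of neighbor sets."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def shortest_distances(adj, source):
    """Breadth-first search distances from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in adj[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist

def closeness(adj, node):
    """1 divided by the sum of shortest distances to all other nodes."""
    dist = shortest_distances(adj, node)
    return 1 / sum(d for other, d in dist.items() if other != node)

adj = build_adjacency(EDGES)
closeness_scores = {n: round(closeness(adj, n), 2) for n in sorted(adj)}
# {'A': 0.25, 'B': 0.14, 'C': 0.17, 'D': 0.2, 'E': 0.17}
```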
0
Eigenvector
Centrality
centrality
1+ 0.5
Large value: Weighs towards the wider network structure
Negative Beta: Ego's disadvantage
Metric
Cut Vertex
Bridge
Pivotal Node
Gatekeeper
Node
Gatekeeper
Pivotal
passes through X
A node V is a Local Gatekeeper if there are two neighbors of V, Y and Z, that are not connected by an edge
Gatekeeper/Pivotal Local Gatekeeper
Node A is a gatekeeper
Node D is a local gatekeeper, but not a gatekeeper
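The two gatekeeper checks can be sketched as follows. Two assumptions apply. First, the edge list is a hypothetical graph chosen to be consistent with the statements about nodes A and D, since the original figure is missing. Second, the gatekeeper definition is truncated in these notes, so the sketch uses the common textbook version: X is a gatekeeper if some pair of other nodes is connected only through X.

```python
from itertools import combinations

# Hypothetical graph consistent with the example statements (an assumption).
EDGES = [("A", "B"), ("A", "C"), ("A", "D"), ("A", "E"), ("C", "D"), ("D", "E")]

def build_adjacency(edges):
    """Turn an undirected edge list into a dict of neighbor sets."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def is_local_gatekeeper(adj, v):
    """True if v has two neighbors Y and Z that are not connected by an edge."""
    return any(z not in adj[y] for y, z in combinations(sorted(adj[v]), 2))

def reachable_without(adj, start, removed):
    """Nodes reachable from start when the node `removed` is deleted."""
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for neighbor in adj[node]:
            if neighbor != removed and neighbor not in seen:
                seen.add(neighbor)
                stack.append(neighbor)
    return seen

def is_gatekeeper(adj, v):
    """True if deleting v disconnects some pair of the remaining nodes.

    Assumes the graph is connected before v is removed.
    """
    others = [n for n in adj if n != v]
    if len(others) < 2:
        return False
    return len(reachable_without(adj, others[0], v)) < len(others)

adj = build_adjacency(EDGES)
a_is_gatekeeper = is_gatekeeper(adj, "A")              # True: B is reachable only via A
d_is_gatekeeper = is_gatekeeper(adj, "D")              # False: C and E can go through A
d_is_local_gatekeeper = is_local_gatekeeper(adj, "D")  # True: C and E share no edge
```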
Comparison
Generally, the 3 centrality types will be positively correlated. When they are not, it probably tells you something interesting about the network.
High Degree, Low Closeness: Embedded in a cluster that is far from the rest of the network
High Degree, Low Betweenness: Ego's connections are redundant; communication bypasses him/her. Alters connect to each other.
High Closeness, Low Betweenness: Probably multiple paths in the network; ego is near many people, but so are many others
SOCIAL GROUPS
Dyads (2 nodes)
Undirected: Tie or no tie (Yes/No)
Directed: No tie, 1-way (which way), 2-way
Reciprocity (from total mutual, total connected and total dyads):
- Ratio of reciprocated relationships to all dyads, OR
- Ratio of reciprocated relationships to all connected dyads
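Both reciprocity ratios can be computed from a dyad count. The sketch below uses a hypothetical set of directed ties on five nodes, chosen so that there are 2 mutual dyads, 6 connected dyads and 10 dyads in total. The function name and the data are illustrative.

```python
from itertools import combinations

# Hypothetical directed ties on 5 nodes (illustrative data only).
NODES = ["A", "B", "C", "D", "E"]
ARCS = [
    ("A", "B"), ("B", "A"),  # mutual
    ("C", "D"), ("D", "C"),  # mutual
    ("A", "C"), ("B", "D"), ("D", "E"), ("E", "A"),  # one-way
]

def dyad_counts(nodes, arcs):
    """Return (mutual, connected, total) dyad counts for a directed network."""
    arc_set = set(arcs)
    mutual = connected = 0
    for u, v in combinations(nodes, 2):
        forward = (u, v) in arc_set
        backward = (v, u) in arc_set
        if forward and backward:
            mutual += 1
        if forward or backward:
            connected += 1
    total = len(nodes) * (len(nodes) - 1) // 2
    return mutual, connected, total

mutual, connected, total = dyad_counts(NODES, ARCS)
reciprocity_all_dyads = mutual / total            # 2/10
reciprocity_connected_dyads = mutual / connected  # 2/6
```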
Cliques
Every member of a clique knows everybody else, i.e. density = 1
Undirected
Directed
Transitivity
x y z: Yes
x y z: Yes
Triadic
Closure
Clans
An N-clan is an N-clique where every pair h
distance
Triads (3 nodes)
Undirected: 0, 1, 2, or 3 ties
Directed: 16 possible r/s (See below)
Reciprocity example: Total connected = 6, Total dyads = 10; Reciprocity = 2/10 OR 2/6
x y z: No
Example
{ A, B, C } is a 2-clan
{ A, C, E } is not a 2-clan
Clustering
Clustering Coefficient $= \frac{\text{actual ties}}{\text{max ties}}$
Agglomerative
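The clustering coefficient ratio (actual ties over max ties) can be computed per node by counting ties among its neighbors. This is a sketch on a hypothetical undirected graph. Reading the ratio as a local, per-node measure and returning 0.0 for nodes with fewer than two neighbors are both assumptions.

```python
from itertools import combinations

# Hypothetical undirected graph (illustrative data only).
EDGES = [("A", "B"), ("A", "C"), ("A", "D"), ("A", "E"), ("C", "D"), ("D", "E")]

def build_adjacency(edges):
    """Turn an undirected edge list into a dict of neighbor sets."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def clustering_coefficient(adj, v):
    """Actual ties among v's neighbors divided by the maximum possible ties."""
    neighbors = sorted(adj[v])
    k = len(neighbors)
    if k < 2:
        return 0.0  # convention assumed here: undefined cases count as 0
    actual = sum(1 for y, z in combinations(neighbors, 2) if z in adj[y])
    return actual / (k * (k - 1) / 2)

adj = build_adjacency(EDGES)
cc_d = clustering_coefficient(adj, "D")  # neighbors A, C, E: 2 of 3 possible ties
cc_a = clustering_coefficient(adj, "A")  # neighbors B, C, D, E: 2 of 6 possible ties
```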
STRUCTURAL BALANCE
Triadic
people
friends
future.
if
if
AF=
1
6
AE=
2
5
from its
INFORMATION FLOW
1. Find any path from source to sink that has a positive flow capacity remaining. If no more such paths, exit
2. Determine the smallest flow capacity on any arc in the path (the bottleneck arc)
3.
Subtract
direction
Add
4.
$$C_D = \frac{\sum_{i=1}^{g}\left[C_D(n^*) - C_D(n_i)\right]}{(g-1)(g-2)}$$
$C_D = 1$
A cut is any set of directed arcs containing at least one arc in every path from the source to the sink. The cut value is the sum of the flow capacities in the source-to-sink direction of all the arcs.
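The $C_D$ expression appears to be Freeman's degree centralization, with $g$ the number of nodes and $n^*$ the node of highest degree. This reading is an assumption, since the surrounding labels were lost. Under it, a star network gives $C_D = 1$ and a ring gives $C_D = 0$, as the sketch below shows. The helper names and example networks are illustrative.

```python
def build_adjacency(edges):
    """Turn an undirected edge list into a dict of neighbor sets."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def degree_centralization(adj):
    """Sum of (highest degree - each degree) divided by (g-1)(g-2).

    Requires at least 3 nodes so that the denominator is non-zero.
    """
    g = len(adj)
    degrees = [len(neighbors) for neighbors in adj.values()]
    highest = max(degrees)
    return sum(highest - d for d in degrees) / ((g - 1) * (g - 2))

# Hypothetical example networks with 5 nodes each.
star = build_adjacency([("hub", "a"), ("hub", "b"), ("hub", "c"), ("hub", "d")])
ring = build_adjacency([("a", "b"), ("b", "c"), ("c", "d"), ("d", "e"), ("e", "a")])

star_centralization = degree_centralization(star)  # 1.0
ring_centralization = degree_centralization(ring)  # 0.0
```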
Node 2
1-2: 2
1-3: 3
1-4: 4(2)
1-5: 2(1)
1-6: 4(2)
2-3: 0
2-4: 3
2-5: 1
2-6: 3
3-4: 1
3-5: 1
3-6: 2
4-5: 0
4-6: 2
5-6: 3
Node 3
1-2: 2
1-3: 3
1-4: 4(1)
1-5: 2(1)
1-6: 4(2)
2-3: 0
2-4: 3
2-5: 1
2-6: 3
3-4: 1
3-5: 1
3-6: 2
4-5: 0
4-6: 2
5-6: 3
By the max-flow min-cut theorem, the cut value of the min cut is equal to the max flow.
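The augmenting-path procedure can be sketched in Python. Step 3 is truncated in these notes, so the update used here is the standard one and is an assumption about the lost text: subtract the bottleneck from each arc along the path and add it to the reverse arc. The network and its capacities are hypothetical, and paths are found by breadth-first search.

```python
from collections import deque

def max_flow(capacity, source, sink):
    """Repeatedly augment along source-to-sink paths with remaining capacity."""
    # Residual capacities, including zero-capacity reverse arcs.
    residual = {}
    for (u, v), c in capacity.items():
        residual.setdefault(u, {})
        residual.setdefault(v, {})
        residual[u][v] = residual[u].get(v, 0) + c
        residual[v].setdefault(u, 0)

    total = 0
    while True:
        # Step 1: find any path with positive remaining capacity (BFS).
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, c in residual[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return total  # no more paths: exit

        # Step 2: the bottleneck is the smallest remaining capacity on the path.
        path = []
        v = sink
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        bottleneck = min(residual[u][v] for u, v in path)

        # Step 3 (assumed standard update): subtract forward, add reverse.
        for u, v in path:
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
        total += bottleneck

# Hypothetical network: arc -> flow capacity.
CAPACITY = {("s", "a"): 3, ("s", "b"): 2, ("a", "b"): 1, ("a", "t"): 2, ("b", "t"): 3}
flow_value = max_flow(CAPACITY, "s", "t")  # 5

# The cut {s->a, s->b} has value 3 + 2 = 5, matching the max flow.
cut_value = CAPACITY[("s", "a")] + CAPACITY[("s", "b")]
```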
UCINET: Network > Cohesion > Max Flow
Flow Betweenness
Let
m jk
vertex
m jk
where
i ,
and
and
j<k .
Max-flow min-cut theorem: for any network having a single origin node
and a single destination node, the maximum possible flow from origin
to destination equals the minimum cut value for all cuts in the network
Information Cascade
Bookkeeping Algorithm:
Conditional Probability:
$P(A \mid B) = \frac{P(A)\,P(B \mid A)}{P(B)}$
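A worked instance of the conditional probability rule, using Bayes' rule in the form $P(A \mid B) = P(A)\,P(B \mid A) / P(B)$. The numbers are hypothetical and only illustrate the arithmetic: a prior of 1/2 for state A, and a private signal B that is seen with probability 2/3 when A holds and 1/3 otherwise.

```python
from fractions import Fraction

def posterior(prior_a, prob_b_given_a, prob_b):
    """Bayes' rule: P(A|B) = P(A) * P(B|A) / P(B)."""
    return prior_a * prob_b_given_a / prob_b

# Hypothetical inputs (not taken from the notes).
prior_a = Fraction(1, 2)
prob_b_given_a = Fraction(2, 3)
prob_b_given_not_a = Fraction(1, 3)

# Total probability of observing signal B.
prob_b = prior_a * prob_b_given_a + (1 - prior_a) * prob_b_given_not_a

posterior_a = posterior(prior_a, prob_b_given_a, prob_b)  # Fraction(2, 3)
```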
STUDY DESIGN
1. Basics: Measurements & Data
Variable: Characteristic or property
Scales: Nominal, Ordinal, Interval, Ratio
Nominal: Categorical; Qualitative, e.g. Male, Female; North, South, East, West
Ordinal: No concept of gap size: $a > b > c$
Interval: $+, -$
Ratio: $+, -, \times, \div$, e.g. dollars
Pivotal/Non-pivotal: Categorical
Survey Ratings: Ratio
Edge (Yes/No): Categorical
Weighted edge (e.g., 1-10): Ratio
2. Data collection
Asking Respondents
1)
2)
3)
4)
Experiments
Web
Access
Secondary Data
Web crawling
Blogs, forums, social media
1)
2)
3)
4)
operations
e.g. Celsius
Ratios can be compared
Can perform
operations
Decide what to study -> Choose relevant population -> Collect data -> Analyse -> Deduce Findings -> Report
What to study?
The Hypothesis
See Notes for examples
Variables
Identify variables, consider independent variables
e.g. Node properties, edge properties
Level of Detail
e.g. team email: sender, receiver, etc.
Sampling
Identify the population study is interested in
- Roles/positions (directors/politicians)
- Relationships (friends of )
- Events (participation/communication)
- Time
- Location
Complete Population (Census)
VS
Random (ego) + snowball (alters)
Refer to 2. Data Collection
Mixture of qualitative, descriptive statistics, and
statistical tests
Statistics, and compare with prior studies
Clear, meaningful and obvious graphs
Introduction, Literature Review, Objective (Hypothesis), Methodology, Analysis, Findings
NetDraw
Separate files with multiple matrices
Prepare Data
Produce matrix from attributes
Display Univariate Statistics
Compute Network Metrics
Test observed mean/density against a fixed value
Find p-value against a fixed value
p-value
Test of density (more than mean, takes variability into account) difference between 2 networks
Find p-value of 2 groups divided on node attributes
Correlation between 2 networks with same actors
0.0002
2.4089
0.0052<0.05
Find r
Test
p-value > 0.05
UCINET Output
Analysis Type: Regression (you have control over the independent variable)
Look at R-sq first to see if the model is a good fit. Then look at individual variables.
Analysis Type: T-test of 2 group means
The t-test is used to test whether there are differences between the means of two groups; in this case, whether the govt or non-govt groups have different out-degree centrality (col 1). Is one group bigger than the other?
Result: No difference across groups; all p-values are > 0.05
Analysis Type: ANOVA for 2 or more groups
Look at the F-statistic and significance. Significance is the same as that of the two-tailed test.
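The two-group comparison of means can be illustrated with a permutation test, a common choice for network data because node-level scores are not independent. This is a generic sketch, not UCINET's implementation. The scores, the group labels and the function name are hypothetical.

```python
import random
from statistics import mean

def permutation_test(group_a, group_b, n_permutations=5000, seed=0):
    """Two-tailed permutation test for a difference in group means.

    Returns (observed difference, p-value). The p-value is the share of
    random relabelings whose absolute difference in means is at least as
    large as the observed one.
    """
    rng = random.Random(seed)
    observed = mean(group_a) - mean(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # randomly reassign scores to the two groups
        diff = mean(pooled[:n_a]) - mean(pooled[n_a:])
        if abs(diff) >= abs(observed):
            extreme += 1
    return observed, extreme / n_permutations

# Hypothetical out-degree scores for two groups of nodes.
govt = [3, 5, 4, 6, 5, 4]
non_govt = [4, 5, 3, 6, 4, 5]
difference, p_value = permutation_test(govt, non_govt)
no_difference = p_value > 0.05  # same decision rule as in the notes
```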
Note: Refer to Notes
Triad Undirected
X1 X2 X3
X1: No. of mutual dyads
D: Down
U: Up
T: Transitive
C: Cyclic