of Computational Journalism
Columbia Journalism School
Week 5: Social Filtering
October 8, 2012
Classify Users
Classic machine learning problem. Classify each user as one of:
journalist/blogger
organization
ordinary individual
First, need to encode as a vector / select features...
Take K closest training points (in high dimensional feature space), choose majority label.
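The K-nearest-neighbor rule described here can be sketched in a few lines. This is a minimal illustration, not the classifier from the lecture: the two features (log follower count, fraction of tweets with links) and all training values are made-up assumptions, and the function name `knn_classify` is hypothetical.

```python
import math
from collections import Counter

def knn_classify(query, training, k=3):
    """Return the majority label among the k training points closest to query."""
    # training is a list of (feature_vector, label) pairs
    by_distance = sorted(training, key=lambda pair: math.dist(pair[0], query))
    nearest_labels = [label for _, label in by_distance[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]

# Hypothetical feature vectors: (log10 of follower count, fraction of tweets with links)
training = [
    ((5.2, 0.9), "organization"),
    ((5.0, 0.8), "organization"),
    ((4.1, 0.6), "journalist/blogger"),
    ((3.9, 0.5), "journalist/blogger"),
    ((2.0, 0.1), "ordinary individual"),
    ((2.3, 0.2), "ordinary individual"),
]

print(knn_classify((2.1, 0.15), training, k=3))
```

With real data the features would usually be rescaled first, since Euclidean distance lets the feature with the largest numeric range dominate.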
Classifier Accuracy
Eyewitness classifier
Goal is to find individual tweets that are eyewitness reports.
Started with LIWC (Linguistic Inquiry and Word Count) dictionary that classifies English words along 70 different dimensions, including emotion, cognition, time, health...
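A dictionary of this kind turns a tweet into one number per dimension. The sketch below assumes a toy dictionary with four categories; the word lists are invented for illustration and are not taken from LIWC, and `category_fractions` is a hypothetical helper name.

```python
import re
from collections import Counter

# Toy stand-in for a LIWC-style dictionary (hypothetical entries, not real LIWC data)
CATEGORY_WORDS = {
    "emotion": {"scared", "happy", "terrified"},
    "cognition": {"think", "know", "believe"},
    "time": {"now", "today", "just"},
    "health": {"hurt", "injured", "hospital"},
}

def category_fractions(text):
    """Return, for each category, the fraction of words in text that belong to it."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for word in words:
        for category, vocab in CATEGORY_WORDS.items():
            if word in vocab:
                counts[category] += 1
    total = len(words) or 1  # avoid dividing by zero on empty text
    return {category: counts[category] / total for category in CATEGORY_WORDS}

features = category_fractions("I just saw it, people are hurt and I am scared")
print(features)
```

The resulting dictionary of fractions can be flattened into a feature vector, one entry per dimension.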
Word Aspects
Other dimensions
Tweet contains URL to photo or video (used table of domain names, e.g. flickr.com = photo)
Posted from mobile device (from tweet metadata naming posting app)
Geocode user's stated location (this is painful and unreliable)
Distribution of friends' locations. (Friend = mutual following)
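The first two dimensions above are simple lookups and could be sketched roughly as follows. Everything here is an assumption except the flickr.com = photo mapping: the tweet is modeled as a plain dict with hypothetical `urls` and `source` keys, and the second domain entry and the mobile app hints are illustrative guesses.

```python
from urllib.parse import urlparse

# Domain table; only "flickr.com = photo" comes from the slide, the rest is hypothetical
MEDIA_DOMAINS = {"flickr.com": "photo", "youtube.com": "video"}

# Hypothetical substrings that suggest a mobile posting app
MOBILE_APP_HINTS = ("iphone", "android", "blackberry", "mobile")

def media_type(url):
    """Look up the URL's host in the domain table; return None if it is not listed."""
    host = (urlparse(url).hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    return MEDIA_DOMAINS.get(host)

def tweet_features(tweet):
    """Derive boolean features from a tweet dict with 'urls' and 'source' keys (assumed layout)."""
    media = {media_type(u) for u in tweet.get("urls", [])}
    source = tweet.get("source", "").lower()
    return {
        "has_photo": "photo" in media,
        "has_video": "video" in media,
        "from_mobile": any(hint in source for hint in MOBILE_APP_HINTS),
    }

example = {
    "urls": ["http://www.flickr.com/photos/someone/123"],
    "source": "Twitter for iPhone",
}
print(tweet_features(example))
```

Shortened links would need to be expanded before the domain lookup, which this sketch does not attempt.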
Unpopular features:
Entity extraction not helpful, no ability to filter by location and eyewitness status, focus on users instead of content
[Diagram: grid of x marks, labeled "User" and "filtering"]
It's a news network
Small number of high-degree hubs
Different network structure than e.g. Facebook. Different uses. Why?
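One way to look for high-degree hubs is to count followers per account from a list of follow relationships. The sketch below assumes the network is available as (follower, followed) pairs, which is a made-up representation; it also finds mutual-following pairs, matching the "Friend = mutual following" definition used earlier. The account names and function names are hypothetical.

```python
from collections import Counter

def follower_counts(follow_edges):
    """Count followers (in-degree) per account from (follower, followed) pairs."""
    return Counter(followed for _, followed in follow_edges)

def mutual_friends(follow_edges):
    """Return the set of account pairs that follow each other."""
    edges = set(follow_edges)
    return {frozenset((a, b)) for a, b in edges if a != b and (b, a) in edges}

# Hypothetical toy network: three accounts follow "news", and "a" and "b" follow each other
edges = [
    ("a", "news"), ("b", "news"), ("c", "news"),
    ("a", "b"), ("b", "a"),
    ("news", "a"),
]

counts = follower_counts(edges)
print(counts.most_common(1))   # the account with the most followers
print(mutual_friends(edges))
```

In a hub-dominated network most follow edges point at a few accounts and are not reciprocated, so the set of mutual pairs is small relative to the number of edges.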
Social Software
Basic assumption: structure of software influences how groups use it.
or: architecture influences behavior
Design problem...
What do we want the users to accomplish together?
How do we encourage this?
We can write the code, but the culture is to some degree beyond our prediction or control.