Nitin Dhar
Quentin Swain
Maggie Neuwald
Inspiration behind project
1st attempt:
• Topical relevance (it must be about the event being searched)
• Proximity to the event (eye-witness accounts)
• New information (rather than emotion)
• Depth and quality of information (either in the tweet or in
embedded links)
Methodology: Relevance
*We later also added the length of the tweet as a signal, although that may be a proxy for this signal,
since the search results must contain all keywords in the tweet.
Methodology: Generating queries
• Twitrank lets you enter data for a query, then uses a Ruby
library to pull tweets from the Twitter API based on your query
parameters, returning both the results and the associated metadata
(signals) for each result
• Twitter also limits the number of requests you can make to its
API within an hour; spreading the work across multiple Twitter API
accounts let us retrieve more results per hour than if one person
was storing the results
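The idea of spreading requests over several accounts can be sketched as follows. This is a minimal illustration in Python rather than the project's Ruby code, and it makes no real API calls: the `Account` and `AccountPool` classes, the account names, and the limit of 2 requests per hour are all hypothetical and chosen only to keep the example small.

```python
from dataclasses import dataclass


@dataclass
class Account:
    """One API account with its own hourly request budget (hypothetical)."""
    name: str
    limit_per_hour: int
    used: int = 0
    window_start: float = 0.0


class AccountPool:
    """Hands out whichever account still has requests left this hour."""

    def __init__(self, names, limit_per_hour):
        self.accounts = [Account(n, limit_per_hour) for n in names]

    def acquire(self, now):
        """Return the name of an account with budget left, or None."""
        for acct in self.accounts:
            # Start a fresh window once an hour has passed for this account.
            if now - acct.window_start >= 3600:
                acct.window_start = now
                acct.used = 0
            if acct.used < acct.limit_per_hour:
                acct.used += 1
                return acct.name
        return None  # every account is exhausted for the current hour


# Two made-up accounts, each allowed 2 requests per hour.
pool = AccountPool(["account_a", "account_b"], limit_per_hour=2)
granted = [pool.acquire(now=0.0) for _ in range(5)]
```

With two accounts the pool grants four requests in the first hour instead of two, and the fifth request is refused until a new hourly window opens.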
Twitrank Screenshot
System Architecture
Loading Data into Sofia-ml
Tuple Signal
1 Tweet length
2 % of keywords in tweet
3 Number of retweets
4 Location of user with respect to event
5 Time difference of tweet with respect to the event
6 Number of followers of user
7 Number of people the user follows
8 Status count of user
9 Favorites count of user
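As an illustration of how one of these signals might be derived, the sketch below computes signal 2, the percentage of query keywords that appear in a tweet. The slides do not say how the project computed it, so the details here are assumptions: matching is case-insensitive on whole words, and the result is truncated to an integer percentage. The function name `keyword_percentage` and the sample tweet text are made up for the example.

```python
import re


def keyword_percentage(tweet_text, query):
    """Percentage of query keywords found in the tweet (assumed definition)."""
    keywords = query.lower().split()
    if not keywords:
        return 0
    # Compare against the set of whole words in the tweet, ignoring case.
    words = set(re.findall(r"\w+", tweet_text.lower()))
    matched = sum(1 for keyword in keywords if keyword in words)
    # Integer division truncates, so 2 of 3 keywords gives 66.
    return 100 * matched // len(keywords)


# Made-up tweet and query: "stadium" and "fire" match, "evacuation" does not.
partial = keyword_percentage("Fire reported near the stadium", "stadium fire evacuation")
full = keyword_percentage("Fire reported near the stadium", "stadium fire")
```

The remaining signals (retweets, follower counts, status counts, and so on) come from the tweet and user metadata returned with each result, so they would only need to be copied into the tuple.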
Example Tuples
• 3 1:3 2:66 3:0 4:100 5:6 6:101 7:65 8:1189 9:37 // this one was given a relevance of 3
• 2 1:2 2:100 3:0 4:5 5:6 6:101 7:65 8:1189 9:37 // this one was given a relevance of 2
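The tuple layout above (a relevance label followed by `index:value` pairs for signals 1 to 9) can be written and read back with a few lines of code. This is a sketch under stated assumptions: `format_tuple` and `parse_tuple` are hypothetical helper names, all values are treated as integers as in the examples, and the trailing `//` note is treated as slide commentary and discarded when parsing.

```python
def format_tuple(relevance, signals):
    """Build one line: the relevance label, then index:value pairs in order."""
    parts = [str(relevance)]
    parts += [f"{index}:{signals[index]}" for index in sorted(signals)]
    return " ".join(parts)


def parse_tuple(line):
    """Split a line back into (relevance, {signal index: value})."""
    # Drop a trailing "// ..." annotation if one is present.
    line = line.split("//", 1)[0]
    fields = line.split()
    relevance = int(fields[0])
    signals = {}
    for field in fields[1:]:
        index, value = field.split(":")
        signals[int(index)] = int(value)
    return relevance, signals


# The first example tuple from the slide, including its annotation.
example = "3 1:3 2:66 3:0 4:100 5:6 6:101 7:65 8:1189 9:37 // this one was given a relevance of 3"
relevance, signals = parse_tuple(example)
round_trip = format_tuple(relevance, signals)
```

Parsing the first example yields relevance 3 and nine signals, and formatting them again reproduces the line without the annotation.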
Sofia Outputs
[Figure: two charts with a vertical axis from 0 to 16. Legend: Sofia-ml, Twitter, Ideal. Axis labels: Total; Group 1 to Group 5; 1, 2, 3.]
Implementation and Future Work