
Working at large multinational Internet companies without leaving Catalonia
Albert Bifet, September 28, 2013

@abifet #catosfera

@YahooLabs
Innovation driven by science.
We are responsible for great inventions, and our goal is nothing less than to invent the future of the Internet.


@YahooLabs


@Yahoo
600+ million users

9 billion advertisements served each day
#1 in page views and mail
#1 in sports, news, finance, and entertainment
95 million Flickr photos uploaded each month
368 million people visit the Yahoo! homepage monthly
4.5 billion page views daily
100 billion messages sent from 300 million Mail users monthly
81 billion messages sent from 112 million Y! Messenger users
70 billion minutes spent monthly on communications properties

#BigData
The 6 Vs of Big Data:
Volume, Variety, Velocity, Value, Variability, Veracity


#DataScience


Ricardo Baeza-Yates
Yahoo! Research VP, EMEA and LATAM
Program Chair 2013 IEEE International
Conference on Big Data

Francesco Bonchi
Web Mining Research Group Manager
Yahoo! Research Barcelona


#DataScience


SIGKDD Explorations
Volume 14, Number 2, December 2012
Published by the Association for Computing Machinery Special Interest Group on Knowledge Discovery and Data Mining

TABLE OF CONTENTS

Mining Big Data
Guest edited by Wei Fan and Albert Bifet

Mining Big Data: Current Status, and Forecast to the Future (p. 1)
Wei Fan, Albert Bifet

Scaling Big Data Mining Infrastructure: The Twitter Experience
Jimmy Lin, Dmitriy Ryaboy

Mining Heterogeneous Information Networks: A Structural Analysis Approach (p. 20)
Yizhou Sun, Jiawei Han

Big Graph Mining: Algorithms and Discoveries (p. 29)
U Kang, Christos Faloutsos

Mining Large Streams of User Data for Personalized Recommendations (p. 37)
Xavier Amatriain

Position Paper

Outlier Ensembles (p. 49)
Charu C. Aggarwal


#CodiObert (open source)


#Hadoop


#Mahout


#s4, #storm


#SAMOA


#SAMOA
Machine learning tools

Distributed
  Batch: Hadoop, Mahout
  Stream: S4, Storm, SAMOA

Non-distributed
  Batch: R, WEKA
  Stream: MOA


#SAMOA

Albert Bifet, Gianmarco De Francisci Morales, Nicolas Kourtellis, Matthieu Morel, Arinto Murdopo, and Antonio Severien


#WebMining


#Gràcies (Thank you)
