I. INTRODUCTION
II. RELATED WORK
III. PROPOSED WORK
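Equation (1): piecewise definition; surviving case labels: "for annotated", "otherwise".
Equation (2): surviving terms: $w_{u,v}$, $\sum_{(u,y) \in E} w_{u,y}$.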
IV.
Number of Vertices   Time (Partition2)   Time (Partition4)
100                  12.33               15.33
200                  12                  16.33
400                  13.33               15.33
500                  13.66               19.66
1000                 17.33               18.66
3000                 23.66               34.33
4000                 26.5                36.7
6000                 28.3                40.2
12000                44.5                53.33
20000                70.33               88
Number of Vertices   Time (HDFS)   Time (RDD Cached)
100                  17.5          12.5
200                  18            12
400                  18.2          14
500                  19            13.66
1000                 21.37         17.33
3000                 29.5          23.66
4000                 33            24.2
6000                 44            31.2
12000                67            46.66
20000                76            63
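
One way to read the HDFS versus cached-RDD timings is as a per-row ratio. The sketch below does only that arithmetic on the reported numbers. The variable names are illustrative, and the time units are an unknown here because the source does not state them for this table.

```python
# Timings copied from the HDFS vs. RDD-cached table (units not stated in the source).
vertices = [100, 200, 400, 500, 1000, 3000, 4000, 6000, 12000, 20000]
hdfs = [17.5, 18, 18.2, 19, 21.37, 29.5, 33, 44, 67, 76]
rdd_cached = [12.5, 12, 14, 13.66, 17.33, 23.66, 24.2, 31.2, 46.66, 63]

# A ratio above 1 means the cached-RDD run took less time than the HDFS run.
ratios = [h / c for h, c in zip(hdfs, rdd_cached)]

for n, r in zip(vertices, ratios):
    print(f"{n:>6} vertices: HDFS / cached = {r:.2f}")
```

On these numbers the ratio stays above 1 in every row, so the cached runs are consistently faster, though the ratio does not grow steadily with graph size.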
Time (#2) (s)   Time (#3) (s)   Time (#4) (s)
10.3            9.2             6.33
12.1            10.32           7.2
18.2            15.3            12.2
34.1            30.1            27.3
43.1            41.3            37.3
62.3            58.3            50.1
80              73              66
V.