Affiliations: Pervasive Technology Institute, SALSA HPC, School of Informatics and Computing
1. Introduction
The project can be summarized in the following four steps:
1. Implementing the sequential PageRank algorithm, followed by the parallel PageRank algorithm using the MPI programming interface.
2. Running the parallel PageRank code in two environments, Bare-Metal and Eucalyptus, and collecting performance information (i.e. timing data).
3. Building a resource monitoring system that monitors and visualizes resource utilization across a distributed set of nodes using message-broker middleware.
4. Writing PBS job scripts that set up an automated system for dynamic provisioning among clusters and report CPU and memory utilization while running MPI PageRank.
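The sequential algorithm of step 1 can be sketched as follows. This is a minimal illustration, not the project's actual code: the class name, the example graph, and the handling of dangling pages (pages with no out-links) are our assumptions.

```java
import java.util.Arrays;

/** Minimal sequential PageRank sketch (illustrative; not the project's code).
 *  links[i] lists the pages that page i links to. */
public class PageRankSketch {
    public static double[] compute(int[][] links, double damping, int iterations) {
        int n = links.length;
        double[] rank = new double[n];
        Arrays.fill(rank, 1.0 / n);                    // start from a uniform distribution
        for (int it = 0; it < iterations; it++) {
            double[] next = new double[n];
            Arrays.fill(next, (1.0 - damping) / n);    // random-jump term
            double danglingMass = 0.0;
            for (int i = 0; i < n; i++) {
                if (links[i].length == 0) {            // dangling page: no out-links
                    danglingMass += rank[i];
                    continue;
                }
                double share = rank[i] / links[i].length;
                for (int target : links[i]) {
                    next[target] += damping * share;   // distribute rank along out-links
                }
            }
            // Spread the rank of dangling pages evenly over all pages.
            for (int i = 0; i < n; i++) next[i] += damping * danglingMass / n;
            rank = next;
        }
        return rank;
    }

    public static void main(String[] args) {
        int[][] links = {{1, 2}, {2}, {0}};            // tiny 3-page example graph
        System.out.println(Arrays.toString(compute(links, 0.85, 100)));
    }
}
```

With damping factor 0.85 and 100 iterations this mirrors the parameters used in the experiments below; the ranks always sum to 1.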
Figure: PageRank
System Daemon: The System Daemon runs in the background. It uses the Sigar API to retrieve system resource utilization information, and a publisher client of the message-broker system to send this information to the consumer client. On startup, the application uses an object of class Calendar to read the current system time and waits until the start of a new second before sending the first message. Once the program has started successfully, a call to the user-defined function daemonize() detaches the program from standard output and standard error. Then, inside a loop that runs at one-second intervals, the getCpuPerc() and getMem() methods of class Sigar capture the current CPU and memory utilization. In the same loop, a Narada Broker publisher client, built with the ClientService, EventProducer and NBEvent classes provided by the API, packs this data together with the machine IP (supplied by the shell script) into a message and sends it.
System Monitor: The System Monitor application can be thought of as divided into two parts. The first part is a Narada Broker consumer client together with a linked-list data structure named MessageList that stores the list of IPs being monitored. The consumer client is built with the ClientService, EventConsumer and Profile classes provided by the broker API. The function onEvent(), called whenever a new message is received, looks up the originator IP address inside the message; if this IP matches an existing entry in the linked list, the utilization values for that node are updated, otherwise a new entry is created for the node and its utilization values are stored. The second part averages the received data and plots graphs using the JFreeChart API.
In a loop, the average CPU and memory utilization is calculated, and these values are pushed into the chart with the updatechart() method of an object of the user-defined class GUI, which extends the class ApplicationFrame. Next is the logic for dynamic removal of disconnected nodes: after a timeout, the monitor detects the absence of messages from a registered IP address and deletes it from the list of IPs being monitored.
Features:
- Synchronization of the daemon process running on multiple computing nodes.
- Simple interface: shell scripts to start, terminate and display the status of the daemon.
- Dynamic detection of the compute nodes being monitored.
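The daemon's one-second loop and its alignment to second boundaries (which is what keeps the daemons on different nodes synchronized) can be sketched as below. The Sigar calls and the Narada Broker publisher are replaced by stand-ins here, since this is an illustration rather than the project's code; sampleCpu() and publish() are hypothetical placeholders.

```java
/** Sketch of the daemon's sampling loop. The real daemon replaces sampleCpu()
 *  with Sigar's getCpuPerc()/getMem() and publish() with a Narada Broker
 *  publisher client; both stand-ins here are hypothetical. */
public class DaemonLoopSketch {

    /** Milliseconds remaining from 'nowMillis' until the next whole second. */
    public static long millisToNextSecond(long nowMillis) {
        return 1000 - (nowMillis % 1000);
    }

    public static void main(String[] args) throws InterruptedException {
        // Wait for the start of a new second, so daemons started at different
        // times on different nodes all publish at (roughly) the same instant.
        Thread.sleep(millisToNextSecond(System.currentTimeMillis()));
        for (int i = 0; i < 3; i++) {          // the real daemon loops forever
            double cpu = sampleCpu();          // stand-in for Sigar sampling
            publish(cpu);                      // stand-in for the broker publisher
            // Sleep until the next second boundary rather than a flat 1000 ms,
            // so sampling/publishing time does not accumulate as drift.
            Thread.sleep(millisToNextSecond(System.currentTimeMillis()));
        }
    }

    private static double sampleCpu() { return Math.random(); }
    private static void publish(double cpu) { System.out.println("cpu=" + cpu); }
}
```

Sleeping "until the next boundary" instead of a fixed interval is what makes the per-node timestamps line up, which in turn lets the monitor average samples from different nodes for the same second.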
Figure: User interactions with the dynamic provisioning system
Bare-Metal: As part of dynamic provisioning on the academic cloud, we implemented PBS job scripts for running the MPI PageRank and monitoring-system applications developed in the previous projects. Because batch jobs are used to request resources, a submitted job waits in the queue and executes once the desired resources become available. Apart from the PBS script start_bare_metal_script, we implemented additional shell scripts for running the monitor application as a daemon and for executing the PageRank program. The flow of execution in this environment is as follows:
Virtual Machine Environment: As in the previous environment, we use a PBS job script, named start_vm_script, to set up the environment and accomplish the desired task. Here too, the tasks are carried out by a number of shell scripts that we implemented: start_SystemDaemon, status_systemDaemon, stop_SystemDaemon and run_vm_mpi. The flow of execution in the virtual machine environment is as follows:
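As an illustration of the kind of PBS job script described above, a minimal sketch might look like the following. This is not the project's actual start_bare_metal_script or start_vm_script: the resource requests, walltime, and the binary name mpi_pagerank are illustrative assumptions; only the helper-script names are taken from the text.

```shell
#!/bin/sh
# Hypothetical PBS job script in the spirit of start_bare_metal_script /
# start_vm_script. Resource counts, walltime and the mpi_pagerank binary
# name are illustrative only.
#PBS -N mpi_pagerank
#PBS -l nodes=8:ppn=1
#PBS -l walltime=00:30:00

cd "$PBS_O_WORKDIR"

# Start the monitoring daemon on every node allocated to this job.
for node in $(sort -u "$PBS_NODEFILE"); do
    ssh "$node" "./start_SystemDaemon" &
done
wait

# Run the parallel PageRank across the allocated nodes.
mpirun -np 8 -machinefile "$PBS_NODEFILE" ./mpi_pagerank pagerank.input.1000.4

# Stop the daemons once the computation finishes.
for node in $(sort -u "$PBS_NODEFILE"); do
    ssh "$node" "./stop_SystemDaemon" &
done
wait
```

The script would be submitted with qsub; TORQUE then queues it until the requested nodes are free, which matches the wait-in-queue behavior described above.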
3. Experiments
3.1 Input and output files
3.1.1.1 Sequential PageRank Results
INPUT FILE: <pagerank.input.1000.4>
NUMBER OF ITERATIONS: 100
DAMPING FACTOR: 0.85
TIME TAKEN: 1.123 SECONDS
----------------------------------------
URL | PAGERANK VALUE
----------------------------------------
  4 | 0.1382042748741752
 34 | 0.123027739664415
  0 | 0.11257586324323682
 20 | 0.0773730840780489
146 | 0.0571400845693035
  2 | 0.04792632638502429
 12 | 0.020066424192489902
 14 | 0.017905732189139028
 16 | 0.013028812374389034
  6 | 0.012955807657490121
  0 | 0.1119975127
 20 | 0.0842259538
146 | 0.0668477042
  2 | 0.0489776400
 12 | 0.0219372778
 14 | 0.0173611889
 16 | 0.0127714565
 66 | 0.0120657004
----------------------------------------
TIMING INFORMATION
----------------------------------------
Computation time = 0.01000 Secs.
I/O time = 0.58000 Secs.
Eucalyptus: TYPE 1
Attribute                     | Value
Instance Class                | c1.medium
Number of Worker Nodes        | 8
Number of Processes           | 1 (on each node)
Size of Dataset (No. of URLs) | 1000, 10000, 20000, 30000, 40000, 50000, 75000, 100000, 1000000
Threshold                     | 0.0000000001
Number of Iterations          | 100
Eucalyptus: TYPE 2
Attribute                     | Value
Instance Class                | x1.large
Number of Worker Nodes        | 1
Number of Processes           | 8 (on each node)
Size of Dataset (No. of URLs) | 1000, 10000, 20000, 30000, 40000, 50000, 75000, 100000, 1000000, 2500000
Threshold                     | 0.0000000001
Number of Iterations          | 100
Findings, description and explanation of the results obtained for the PageRank performance analysis on the academic cloud:
A) For a fixed data size, the speed-up increases with the number of processes, provided the communication overhead is not very large. This is the basic premise of parallel processing. The results also show that the implemented PageRank algorithm does not achieve linear speed-up: as the number of processes increases, the speed increase is gradual rather than linear. In the Bare-Metal and Eucalyptus Type-2 environments the communication overhead was small compared to the computation gain, so the speed-up curve increases. In the Eucalyptus Type-1 environment, however, which involves VMs with a single processor each, the opposite was observed: a negative speed-up, i.e. performance deteriorates as the number of processes increases. This is likely due to the virtualization overhead of running a large number of VMs, and to the communication medium (Ethernet) used by this setup.
B) For a fixed number of processes, the speed-up increases with the dataset size.
The figures above plot the speed-up of parallel execution versus dataset size for all the environments in which our code was executed. With increasing dataset size the speed-up gradually increases, understandably so: at small sizes the performance gain from parallelism is outweighed by the communication overhead, but as the data size grows, the computation gain from parallel execution starts to dominate the loss due to that overhead. This behavior is seen in the Bare-Metal and Eucalyptus Type-2 environments but not in Eucalyptus Type-1, where the communication overhead is significantly higher in all executions; hence the speed-up obtained there is negative (decreasing).
C) For a fixed dataset size, efficiency decreases and communication overhead increases as the number of processes grows.
The three figures above show efficiency and overhead for a fixed dataset on the three execution environments. In the context of distributed systems, efficiency estimates how well the processes are utilized in solving the problem, while the overhead here is mainly communication overhead, i.e. the time spent on communication between processes. The results match what is expected in a normal scenario: as the number of processes increases, the overhead grows because more communication is required, while the per-process efficiency drops because the amount of computation per process decreases. An important observation is that in the Eucalyptus Type-1 execution the overhead is significantly high: the overhead observed on Eucalyptus Type-1 for 100K URLs with 8 processes is nine times that observed on Eucalyptus Type-2 for 250K URLs with the same number of processes. This supports the reasoning given earlier for the negative speed-up in the Eucalyptus Type-1 environment. Our understanding is that in the Eucalyptus Type-2 setting all processes run on the same node, so the communication time is much smaller than when they run on different nodes; moreover, the Eucalyptus VMs communicate over Ethernet, as opposed to the fiber interconnect used in Bare-Metal.
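The metrics used in this analysis can be made precise with the standard definitions below. This is a sketch using textbook formulas, not the exact computation from our analysis scripts; the timing values in the usage comments are illustrative, not measured data, and the overhead formula quantifies total parallel overhead (of which communication is the dominant part here).

```java
/** Standard parallel-performance metrics: speed-up, efficiency, overhead.
 *  tSeq = sequential run time, tPar = parallel run time, p = process count. */
public class MetricsSketch {

    /** Speed-up S = T_seq / T_par; S = p would be the ideal linear case. */
    public static double speedup(double tSeq, double tPar) {
        return tSeq / tPar;
    }

    /** Efficiency E = S / p: how well the p processes are utilized (1.0 = ideal). */
    public static double efficiency(double tSeq, double tPar, int p) {
        return speedup(tSeq, tPar) / p;
    }

    /** Total overhead T_o = p * T_par - T_seq: processor time spent beyond
     *  the useful sequential work (communication, idling, redundant work). */
    public static double overhead(double tSeq, double tPar, int p) {
        return p * tPar - tSeq;
    }

    public static void main(String[] args) {
        // Illustrative numbers: 8 s sequential, 2 s on 8 processes.
        System.out.println("speedup    = " + speedup(8.0, 2.0));      // 4.0
        System.out.println("efficiency = " + efficiency(8.0, 2.0, 8)); // 0.5
        System.out.println("overhead   = " + overhead(8.0, 2.0, 8));   // 8.0
    }
}
```

These definitions also explain the trends above: for fixed work, adding processes raises T_o and therefore lowers E, which is exactly what observation C) reports.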
4. Conclusions
Comparison of sequential and parallel PageRank algorithms: Comparing the sequential and parallel PageRank algorithms, we find that the time required by parallel PageRank is very small compared to sequential PageRank. This shows the benefit of distributed systems and parallel programming.
Performance analysis on the academic cloud: The results obtained for the PageRank performance analysis on the academic cloud have led to the conclusion that, except where overhead is high, the parallel PageRank code executes smoothly and delivers the expected performance. Moreover, the choice of communication technology among the nodes doing parallel computation is an important criterion and cannot be ignored.
Synchronization of the daemon process running on multiple computing nodes: Our system explicitly synchronizes the utilization messages sent from the multiple remote nodes being monitored, by ensuring that messages are sent at the first millisecond of every new second.
Virtualization cost: One notable, if expected, finding from the experiments was that starting the virtual machines and making them reachable is very costly, in the sense that it takes a large amount of time compared to the Bare-Metal environment.
5. Acknowledgements
We would like to thank many people for their support and patience throughout the semester. Special thanks to Professor Judy Qiu and both Associate Instructors, Ikhyun Park and Pairoj Rattadilok, for giving us the chance to take up such good work and for always helping us out. We would also like to thank the SALSA HPC group, the FutureGrid group and, of course, all our classmates for the good discussions and inputs on the Google group.
6. References
[1] Markov chain: http://en.wikipedia.org/wiki/Markov_chain
[2] Adjacency matrix: http://en.wikipedia.org/wiki/Adjacency_matrix
[3] PageRank: http://en.wikipedia.org/wiki/PageRank
[4] NaradaBrokering: http://www.naradabrokering.org/
[5] Sigar resource monitoring API: http://sourceforge.net/projects/sigar/
[6] JFreeChart: http://www.jfree.org/jfreechart/
[7] TORQUE Resource Manager: http://www.clusterresources.com/products/torque-resourcemanager.php
[8] KVM hypervisor: http://www.linux-kvm.org/page/Main_Page
[9] libvirt, the virtualization API: http://www.libvirt.org/
[10] Torque qsub: http://www.clusterresources.com/torquedocs21/commands/qsub.shtml#I
[11] Torque job submission: http://www.clusterresources.com/torquedocs/2.1jobsubmission.shtml
[12] Google group inputs