
Communication Systems Group, Prof. Dr.

Burkhard Stiller

Design and Development of a CloudSim Module to Model and Evaluate Multi-resource Dependencies

Patrick A. Taddei
Zurich, Switzerland
Student ID: 10-758-696

BACHELOR THESIS

Supervisor: Patrick Poullie, Dr. Thomas Bocek


Date of Submission: September 4, 2015

University of Zurich
Department of Informatics (IFI)
Communication Systems Group (CSG)
Binzmühlestrasse 14, CH-8050 Zurich, Switzerland
URL: http://www.csg.uzh.ch/
Abstract

The paradigm of cloud computing raises several important research questions, one of
the most prominent being the management of resources. The comparison of different
resource allocation policies is therefore an important subject. This thesis provides the
CloudSim RDA (Resource Dependency Aware) module, an extension to the CloudSim
cloud simulator, which makes it possible to compare allocation policies that take multiple
resources and customers into consideration. Much effort has been put into modelling the
resource behavior as realistically as possible; in particular, the Leontief dependencies between
the resources CPU time, network bandwidth, and disk I/O have been implemented. These
dependencies are confirmed by various experiments within this thesis. CloudSim RDA is
then used to compare the Standard policy (max-min fair share) with the GM
(greediness metric) policy, an algorithm developed at the CSG. In various
simulations, the GM policy is shown to achieve higher fairness values among the customers
than the Standard policy. It is observed that the resource allocations on the hosts differ
significantly in scenarios where customers have different resource utilization levels.

One of the important research questions in the field of the emerging cloud computing
paradigm is the management of physical resources. Physical resources can be divided
among different customers using various resource allocation policies. To show how
different allocation policies behave under different conditions, this thesis provides an
extension to the cloud simulator CloudSim. The simulator was extended in such a way
that it supports allocation policies based on multiple resources. In particular, resource
dependencies are also taken into account, such as the Leontief relationships between
CPU, network throughput, and disk I/O, which are additionally confirmed experimentally.
With the help of this new module (CloudSim RDA), two allocation policies are then
compared. The differences between the Standard policy (max-min fair share) and the GM
policy (greediness metric) developed at the CSG are shown with respect to fairness,
whereby the GM policy has proven particularly advantageous in situations where different
utilization patterns exist.

Acknowledgments

I would like to express my sincere gratitude to Patrick Poullie, for his guidance and
constant inspiration and valuable suggestions during the course of this thesis.

Further, I would like to thank Prof. Dr. Burkhard Stiller for his useful input regarding
the fairness evaluation and Dr. Thomas Bocek for enabling the experiments in the CSG
lab environment.

Moreover, the inputs from my peers Andreas Gruhler and Björn Hasselmann were also
beneficial.

Contents

Abstract i

Acknowledgments iii

1 Introduction 1

1.1 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2

1.2 The CloudSim project . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3

1.3 Simulators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3

2 Evaluating Physical Resource Dependencies 5

2.1 Dependency between CPU and memory . . . . . . . . . . . . . . . . . . . . 5

2.2 Dependency between CPU and network throughput . . . . . . . . . . . . . 7

2.3 Dependency between CPU and disk I/O . . . . . . . . . . . . . . . . . . . 8

2.4 Empirical dependency evaluation . . . . . . . . . . . . . . . . . . . . . . . 9

3 CloudSim 11

3.1 The CloudSim event model . . . . . . . . . . . . . . . . . . . . . . . . . . . 11

3.2 CloudSim scheduling policy model . . . . . . . . . . . . . . . . . . . . . . . 13

3.3 CloudSim physical resources model . . . . . . . . . . . . . . . . . . . . . . 14

3.4 Dependencies between PRs . . . . . . . . . . . . . . . . . . . . . . . . . . . 18


4 Resource Dependency Aware (RDA) Module 19

4.1 Functional Requirements . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19

4.1.1 Input data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19

4.1.2 Output data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20

4.1.3 Dependency modeling . . . . . . . . . . . . . . . . . . . . . . . . . 20

4.1.4 Resource allocation policies . . . . . . . . . . . . . . . . . . . . . . 21

4.2 Architecture and Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22

4.3 Implementation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27

4.3.1 RdaCloudlet . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27

4.3.2 RdaCloudletSchedulerDynamicWorkload . . . . . . . . . . . . . . . 31

4.3.3 RdaHost . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31

4.3.4 VmSchedulerMaxMinFairShare . . . . . . . . . . . . . . . . . . . . 32

4.3.5 MaxMinAlgorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . 33

4.3.6 RdaDatacenter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33

5 Resource Allocation Policy Evaluation 35

5.1 Simulating workloads . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35

5.1.1 Stochastic workload generation . . . . . . . . . . . . . . . . . . . . 36

5.1.2 Workload types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36

5.2 Key determinants for assessing the policies . . . . . . . . . . . . . . . . . . 37

5.2.1 Asset Fairness . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37

5.2.2 Dominant Resource Fairness (DRF) . . . . . . . . . . . . . . . . . . 38

5.2.3 Greediness Metric (GM) . . . . . . . . . . . . . . . . . . . . . . . . 38

5.2.4 Efficiency . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38

5.2.5 Duration of simulations . . . . . . . . . . . . . . . . . . . . . . . . . 39

5.3 Simulations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39

5.3.1 Basic simulation setup . . . . . . . . . . . . . . . . . . . . . . . . . 39

5.3.2 Simulation 1: 1 host, 3 customers, 3 VMs with WS workloads, variable RAM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40

5.3.3 Simulation 2: 1 host, 3 customers, 3 VMs with WS workloads . . . 43

5.3.4 Simulation 3: 2 hosts, 2 customers, 4 VMs with WS workloads . . . 44

5.3.5 Simulation 4: 3 hosts, 3 customers, 9 VMs with WS, CI, DB workloads 46

5.3.6 Simulation 5: 3 hosts, 3 customers, 9 VMs with WS & CI workloads 48

5.4 Findings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48

6 Summary and Conclusions 51

6.0.1 Related work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52

6.0.2 Future research . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52

6.0.3 Final remarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52

Abbreviations 57

Glossary 59

List of Figures 59

List of Tables 62

A Installation Guideline 65

A.1 Basic Installation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65

A.2 Setup Guideline for Eclipse . . . . . . . . . . . . . . . . . . . . . . . . . . . 65

B Experimentation Guideline 67

B.1 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67

B.2 Experiment Runner . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68

B.2.1 Configuration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68

B.2.2 Output . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69

C Stochastic Data Generation 71



D Simulations 75

D.1 Configuration details . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75

D.2 Reference on CD . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75

E Resource experiments 77

E.1 Experiment setup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77

E.2 Disk I/O . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78

E.3 Network bandwidth . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79

E.4 Memory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79

F Contents of the CD 81
Chapter 1

Introduction

Cloud computing, which attracts ever-growing interest and promises to turn computing
into the 5th utility, after water, electricity, gas, and telephony [10, p. 3], is closer to
this goal than one might think. It is currently at the stage where many enterprises are
considering adopting this technology or are already making heavy use of it.
For most computing paradigms, performance is one of the key factors. Nevertheless, when
evaluating a cloud system, there are also other important factors to take into account,
such as security, legal and compliance, data management, and interoperability issues [23, p. 17].
In the endeavor to obtain optimal cloud performance and the best Quality of Service
(QoS), research on the sharing of physical resources is an integral part.
It addresses namely the resources CPU, memory, network, and disk I/O. These physical
resources of a physical machine (PM) may be shared among multiple VMs (Virtual Machines)
of different customers.
After investigating the dependencies between these resources and evaluating the findings in
multiple experiments, this thesis suggests a new, more fine-grained design for implementing
the resource utilization of VMs and their attached workloads within the CloudSim project.
Furthermore, this thesis provides an implementation of this newly proposed design.

Greediness metric

One objective of this thesis is to assess the greediness metric. This algorithm was
developed at the Communication Systems Group (CSG) of the University of Zurich and
is intended to allow fair resource allocation, specifically in scenarios where scarcity
occurs. It focuses on clouds where VMs are not paid for but spawned from customers'
(resource) quotas. The greediness metric (GM) was originally developed to identify VMs
of customers who utilize resources disproportionately to their quota. This quantification of
greediness allows (i) assessing the overall fairness in the cloud and (ii) identifying VMs that
should release resources in favor of other VMs in case of scarcity. The GM calculates the
overall fairness for each customer in the cloud, with respect to the customer's quota and
the physical resources actually allocated to all VMs of that customer [20].


1.1 Motivation

An important challenge for providers of cloud computing services is the efficient management
of virtualized resource pools. Physical resources such as CPU cores, memory, disk
space, and network bandwidth must be sliced and shared among virtual machines that
are running heterogeneous workloads. An important objective is to maximize user utility
by optimizing application performance and at the same time reduce energy
consumption [10, p. 36]. Efficient resource utilization is in fact a major driving force in cloud
computing [23, p. 371].
There are two ways in which resources can be allocated to VMs: static provisioning, which
sets an upper limit for each resource, and dynamic resource allocation, which adapts to the
resources actually needed [23, p. 371]. Dynamic resource allocation is economically
more interesting, as it reduces the total physical resources needed. However, in both
cases the VMs on a host could experience scarcity of one or multiple resources. Either the
Virtual Machine Manager (VMM) can then migrate the affected VM to another host
with enough resources, or the VM can continue running on the same host with performance
degradation. The decision depends on the VMM.
However, when simulating such behaviors, the modeling of the actual workload of a VM is
crucial. The individual workload of a VM might have an impact on the scheduling
at the host level or even the data center level, when a VM is migrated. To get a better
understanding of such workloads, one must take into account the dependencies between the
four major physical resources: CPU, memory, network bandwidth, and storage I/O. These
dependencies are evaluated in Chapter 2.
Workloads, however, are very heterogeneous: any form of software one could think of
could run in a cloud. Good examples of cloud applications are web servers, file streaming
servers, or even multiplayer games that are hosted in data centers worldwide and run on
cloud computing technology. They place a high demand on the mentioned physical
resources, and each application uses them with its own characteristics.
A realistic simulation of these workloads is not only important for evaluating VM
scheduling policies; it also has an effect on the consumed energy and on a possible
resource billing model that might be applied in a simulator.
As long as every VM on a host gets the requested resources, VMs can perform to their full
extent. However, what happens if a VM gets fewer resources than it would actually
need at a particular time? Is only one resource scarce, or are multiple resources
scarce? Naturally, this slows down the current workload on the VM. But
what effect does this have on the individual resources? Since there are
dependencies between these resources (see Chapter 2), an appropriate model is needed
to simulate such workloads. Currently, fine-grained workload models of this kind are not
implemented in the CloudSim library or in other cloud simulators (such as GreenCloud
and ICanCloud), although this would be desirable.
When it comes to resource scarcity, the hypervisor has the authority to provision or withdraw
physical resources from the individual VMs hosted on the machine. Different
scheduling policies provision these resources in different ways. Obviously,
this should be done in a fair manner, which implies that all resources should be taken
into account to achieve a high degree of fairness. Currently, the CloudSim library has a
scheduling policy interface that decides only with regard to the CPU usage of a VM. This
motivates extending the policy interface to allow resource allocation based on
multiple resources.

1.2 The CloudSim project

The CloudSim project is a Java based cloud simulator, developed and maintained by
the Cloud Computing and Distributed Systems (CLOUDS) Laboratory of the University
of Melbourne. “The primary objective of this project is to provide a generalized and
extensible simulation framework that enables seamless modeling, simulation, and experimentation
of emerging Cloud computing infrastructures and application services. By
using CloudSim, researchers and industry-based developers can focus on specific system
design issues that they want to investigate” [2].
It has many functionalities, such as the modeling and simulation of large-scale data centers,
server hosts, and energy-aware computational resources, as well as support for user-defined
policies for the allocation of virtual machines to hosts [2].

1.3 Simulators

With the help of a simulator, it is possible to analyze different algorithms, applications,
and policies in a repeatable and controllable environment, before actually implementing
them in products [2]. For example, to analyze different provisioning policies, it is
worthwhile to simulate them with exactly the same workloads in order to compare them.
In a real environment, it may also be very costly to have a corresponding test
setup at one's disposal. Simulations can also run much faster than real time, so that a whole
month or year can be simulated in minutes.
Chapter 2

Evaluating Physical Resource Dependencies

When searching for research papers or literature in the area of resource dependencies, one
cannot find any major study solely dedicated to this subject. This shows that the subject
has not received much attention in computer science, and it is therefore worthwhile to take
a closer look at it. The goal was to get a clearer picture of resource dependencies at the
process level. The gained knowledge can then be integrated into the new CloudSim module.

2.1 Dependency between CPU and memory

To get a better understanding of the relationship between the CPU and memory, one must
understand how these two integral computer components work together.

The CPU executes each instruction in a series of small steps. Roughly speaking, the steps
are as follows [22, p. 58]:

1. Fetch the next instruction from memory into the instruction register.

2. Change the program counter to point to the following instruction.

3. Determine the type of instruction just fetched.

4. If the instruction uses a word in memory, determine where it is.

5. Fetch the word, if needed, into a CPU register.

6. Execute the instruction.

7. Go to step 1 to begin executing the following instruction.


Among many other operations, an instruction can read or write memory. The writing is
done in the actual execution step (step 6).

All modern CPUs are contained on a single chip. This makes their interaction with the
rest of the system well defined. Each CPU chip has a set of pins, through which all its
communication with the outside world must take place. Some pins output signals from
the CPU to the outside world; others accept signals from the outside world; some can do
both. By understanding the function of all the pins, we can learn how the CPU interacts
with the memory and I/O devices at the digital logic level.
The pins on a CPU chip can be divided into three types: address, data, and control.
These pins are connected to similar pins on the memory and I/O chips via a collection
of parallel wires called a bus. To fetch an instruction, the CPU first puts the memory
address of that instruction on its address pins.
Then it asserts one or more control lines to inform the memory that it wants, for example,
to read a word. The memory replies by putting the requested word on the CPU's data
pins and asserting a signal saying that it is done. When the CPU sees this signal, it accepts
the word and carries out the instruction. The number of data and address pins is an
important factor in the performance of the CPU [22, p. 186]. From an architectural view,
Figure 2.1 shows how the CPU and memory are connected through the bus controller.

Figure 2.1: Processing model

Of course, the exact technical characteristics (e.g., frequency rates) of the individual components
working together determine the overall performance of the computer as such.
Therefore, hardware manufacturers research how to make microcomputers run as
efficiently as possible.

However, when looking at dependencies at the process level, there are two other topics
to mention at this point.

The study by Tafa et al. [14] tested with two different benchmarks, Httperf and
MemAccess, and determined a slight increase in CPU utilization once the memory
starts to use the paging space. When the allocated memory was increased from 25 MB
to 425 MB (with 256 MB of physical memory), the CPU utilization increased from 44%
to 47%. Furthermore, the response time and the number of page faults increased enormously
when the memory used the paging space. However, they determined that there is no CPU
increase if the allocated memory stays within the physically available memory. They came
to the conclusion that there is no dependency between the memory of virtual machines and
the CPU consumed. This statement is not entirely correct, because a dependency between
CPU and RAM could be measured (see Appendix E.4). Furthermore, the model described
by Tanenbaum [22] implies that the two resources are dependent: if, for example, an
instruction that allocates/de-allocates memory is executed later because of CPU scarcity,
the effect on memory is also postponed. One can call this behavior progress dependent.

Another aspect of the CPU and memory dependency is the memory management
of different applications and programming languages. For example, in Java the
garbage collector uses some CPU time to periodically free memory [16, p. 22].
The exact behavior depends on the implementation of the garbage collector and its
configuration parameters. In the end, it also depends on the application itself, i.e., how often
it de-references objects that then become subject to garbage collection. In C programs,
where no GC is used, it is left to the programmer to allocate/de-allocate objects.
Hence, each application has its unique memory behavior.

2.2 Dependency between CPU and network throughput

An important physical resource that should certainly not be underestimated in cloud
deployments is the network load. Cloud applications often depend on connectivity,
and it may directly impact the overall perceived performance when working with
applications deployed in a cloud. High latency or insufficient bandwidth can
discourage a user from using an application in a cloud and lead to a switch to an on-premise
installation, where the network traffic is not as impaired as when using an internet site.
It is therefore important to get a clear picture of the dependencies between the CPU and the
network load of a virtual machine. During the experiments conducted at
the CSG, we came to the conclusion that there is a strong dependency between these two
physical resources. In several experiments with the most widely used web server, Apache 2
(38% market share, see the web server statistics [1]), a clear dependency between these two
resources became visible.

The experiments were conducted with different HTTP traffic loads. We could measure a
proportional CPU increase/decrease when increasing/decreasing the network load
(correlation: 0.99957).
The test setup was as follows: the test was performed with two physical hosts, connected
with a 1'000 Mbit/s network connection. One host ran a VM, and the other host
was used to generate HTTP requests. We tried five different loads, from about
550 requests/second, where the CPU usage was only 15%, up to about 3800 requests/second,
a level at which the CPU got exhausted. With each incremental step, the CPU usage
increased in the same proportion as the network load (see Figure 2.2).

Figure 2.2: CPU and network dependency

In the second experiment, we tried the same request loads, but additionally stressed the
CPU to 50% with a different load. This resulted in a cut-off as soon as the maximum CPU
capacity was reached.
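The correlation figures quoted here and in the following section are plain correlation coefficients over the sampled utilization series. A minimal sketch of such a computation (presumably Pearson's r; the helper below is illustrative and not part of the thesis code):

static double pearson(double[] x, double[] y) {
    // Pearson correlation between two equally long sample series,
    // e.g. CPU utilization and network load sampled once per second.
    int n = x.length;
    double meanX = 0, meanY = 0;
    for (int i = 0; i < n; i++) { meanX += x[i]; meanY += y[i]; }
    meanX /= n;
    meanY /= n;
    double cov = 0, varX = 0, varY = 0;
    for (int i = 0; i < n; i++) {
        double dx = x[i] - meanX, dy = y[i] - meanY;
        cov += dx * dy;
        varX += dx * dx;
        varY += dy * dy;
    }
    return cov / Math.sqrt(varX * varY);
}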

2.3 Dependency between CPU and disk I/O

Disk I/O, another resource that is subject of this study, showed behavior similar to what
was already observed in the network bandwidth experiments. One
experiment wrote a 4 GB file several times (10 times) under an unstressed
and a stressed condition. In the stressed condition, the CPU was stressed with a load of
50%. Figure 2.3 shows the experimental output, where 5 unstressed and 5 stressed test
procedures were conducted consecutively. In this experiment, the correlation between
CPU and disk I/O was 0.83.
In a second experiment that wrote just one file, in order to also test the CPU and disk read
correlation, a correlation of 0.94 could be measured. Disk write, when
measured on only one file, delivered a correlation of 0.91. Appendix E.2 contains
a detailed description of the commands used to perform the disk experiments.
The graphs of the second experiment are also plotted in the Appendix.

Figure 2.3: CPU and Disk write



Table 2.1: Sherpa data correlation


resources correlation
CPU / RAM -0.108
CPU / BW 0.877
CPU / DISK I/O 0.741
BW / DISK I/O 0.524
BW / RAM -0.052
DISK I/O / RAM -0.250

2.4 Empirical dependency evaluation

Having evaluated the resource dependencies with a synthetic approach, we could
measure high correlations between the resources CPU, network bandwidth, and disk I/O.
However, analyzing traces of different production servers enables a more realistic view of
actual services that are running in today's data centers.

Sherpa database

This dataset contains a series of traces of hardware resource usage of the PNUTS/Sherpa
database, a massively parallel and geographically distributed database system
for Yahoo!'s web applications [12]. It contains 3 workload traces, measured every minute
for about 30 minutes each. The result shows a very clear picture, as expected (see
Table 2.1): there is a high correlation between the throughput-measured resource types
(CPU, BW, disk I/O). It also shows that there is no relation between the space-measured
resource RAM and the other resources.
Chapter 3

CloudSim

When analyzing the CloudSim Java library, one finds a simulation facility and a model
of a cloud, as expected. The relevant components such as data center, host, VM, policies,
and workloads (called cloudlets in CloudSim terminology) are available at
the programmer's fingertips to easily model a desired environment. With the inherited
power model, it even offers an energy-consumption-aware model that allows researchers
in the area of energy efficiency to model desired scenarios. It also allows evaluating the
monetary impact of different consumption patterns.

CloudSim release 3 has certainly proven to be a viable candidate when choosing a cloud
simulation framework. Not to be overlooked, it also has a couple of extension modules, for
example to simulate Map/Reduce jobs or to provide an easy-to-use user interface with
report generation capabilities. The available modules are listed in detail on the CloudSim
website [2].

CloudSim can be classified as a constructive simulator, which can be used to
analyse concepts and predict possible outcomes under normal or even stressed conditions
[8]. Because the simulation does not run in real time, it can be seen as a calculation model
that generates the possible outcome for a given input. The advantage of this kind of
simulation is that it always produces the same output for a given input, assuming the
input does not have some kind of randomness integrated.

3.1 The CloudSim event model

The CloudSim package does not have any dependencies on other simulation frameworks.
It is therefore a self-contained simulation framework with all required elements.
The class CloudSim is the main simulation class. The whole event simulation is processed
within the same thread. It is based on a clock that is not determined by the actual time
of day and simply starts at zero. The events are executed in a procedural
way and not in a real-time fashion, as one might expect from a simulation framework.
The execution of the simulation therefore runs fast, yet the log that is created contains
the timestamps as if it had been executed in real time.


Figure 3.1: Class model

Figure 3.1 shows some important classes within CloudSim. For simplification, this
illustration depicts only selected methods/attributes that have relevance within
the event model.

Within the simulation there is the abstract SimEntity class. It is able to handle events
and send events to other entities. Subclasses of SimEntity are: Datacenter, DatacenterBroker,
CloudInformationService, and CloudSimShutdown.

The actual events are represented by the SimEvent class. An event contains the time
when it should be started, the source and destination entities (SimEntity class), the
event type, and some arbitrary data that can be transmitted with the event.

The CloudSim.runClockTick() method iterates through all entities (SimEntity classes)
within the simulation and executes the run() method on these objects. They then process
the events in the DeferredQueue that were sent to the entity (entDst equals the id of the
entity).

The FutureQueue contains events that are sent from one entity to another at some point in
the future. As soon as the simulator's clock reaches this point, the event is added to
the DeferredQueue, where it stays until it is processed by the entity objects.
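The interplay of clock, FutureQueue, and DeferredQueue follows the classic discrete-event pattern: the clock jumps from event timestamp to event timestamp instead of advancing in real time. The following is a minimal, self-contained sketch of that pattern, not CloudSim's actual code (all names are illustrative):

import java.util.PriorityQueue;

class TinySimulator {
    record Event(double time, Runnable action) {}

    // Events ordered by scheduled time, analogous to CloudSim's FutureQueue.
    private final PriorityQueue<Event> futureQueue =
            new PriorityQueue<>((a, b) -> Double.compare(a.time(), b.time()));
    private double clock = 0.0;

    void schedule(double delay, Runnable action) {
        futureQueue.add(new Event(clock + delay, action));
    }

    void run() {
        while (!futureQueue.isEmpty()) {
            Event e = futureQueue.poll();
            clock = e.time();   // jump the clock to the event's timestamp
            e.action().run();   // a handler may schedule further events
        }
    }
}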

Another central class in the simulation process is the CloudInformationService. The Cloud
Information Service (CIS) is an entity that provides cloud resource registration, indexing,
and discovery services. The Cloud has a list of hosts that announce their readiness to process
Cloudlets by registering themselves with the CloudInformationService. Other entities
such as the DatacenterBroker can contact this class for the resource discovery service, which
returns a list of registered resource IDs. In summary, it acts like a yellow pages service.
This class is created by CloudSim upon initialization of the simulation [3].

Figure 3.2: Policies

3.2 CloudSim scheduling policy model

The scheduling of VMs and workloads can be managed with different types of scheduling
policies. These policies have a great impact when simulating a cloud. One must be
familiar with the different types of policies to properly simulate the desired behavior.
To give a better understanding of the overall structure of the scheduling policies
modelled within CloudSim, Figure 3.2 depicts the different scheduling policies
on the datacenter, host, and VM layers. On all three layers, the policies are meant to make
decisions based on the requested resources and the resources available on the particular
object.

VmAllocationPolicy

A data center consists of many VMs that need to be provisioned to hosts. To solve
this challenge, the VmAllocationPolicy associates the VMs to the available hosts of the
datacenter. A direct implementation is the VmAllocationPolicySimple class which simply
finds the first host that has the requested capacity available. The capacity is determined
by the resources associated with the VM and host (see Table 3.1).
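Such a first-fit selection can be sketched as follows (a simplified stand-in for VmAllocationPolicySimple; the host and VM accessors are abbreviated assumptions, not the exact CloudSim API):

import java.util.List;

class FirstFitAllocation {
    record SimpleHost(int freePes, int freeRam, long freeBw, long freeStorage) {}
    record SimpleVm(int numberOfPes, int ram, long bw, long size) {}

    // Return the first host that still has enough free capacity for the
    // VM's declared requirements, or null if no host qualifies.
    static SimpleHost findHostForVm(List<SimpleHost> hosts, SimpleVm vm) {
        for (SimpleHost host : hosts) {
            if (host.freePes() >= vm.numberOfPes()
                    && host.freeRam() >= vm.ram()
                    && host.freeBw() >= vm.bw()
                    && host.freeStorage() >= vm.size()) {
                return host;
            }
        }
        return null; // no suitable host: the VM creation fails
    }
}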

VmScheduler

Once a VM has been sent to a host with enough resources, further scheduling is possible on
the host level itself. There are two general concepts available for scheduling VMs:
time-shared and space-shared. The time-shared concept means that CPU cores can be
shared among multiple VMs at the same time, whereas the space-shared concept only allows
dedicated CPU cores for each VM. The VmSchedulerTimeSharedOverSubscription allows
over-subscription; in other words, this scheduler still allows the allocation of VMs on the
host that require more CPU capacity than is actually available. Oversubscription naturally
results in performance degradation of the participating virtual machines.

Figure 3.3: Effects of different provisioning policies on task execution: (a) Space-shared
for VMs and tasks, (b) Space-shared for VMs and time-shared for tasks, (c) Time-shared
for VMs, space-shared for tasks, and (d) Time-shared for VMs and tasks [11].

CloudletScheduler

A cloudlet scheduler is responsible for managing the actual workload in a particular VM.
Several cloudlets may run simultaneously in a VM. Analogously to the VmScheduler,
the CloudletScheduler has two basic implementations: time-shared and space-shared.
The time-shared variant shares CPU cores, while the space-shared variant only shares the
VM but does not allow cloudlets to use the same CPU cores.

Combined, the time-shared and space-shared policies behave as shown in Figure 3.3.

Once the VmAllocationPolicy has found a host with the necessary resources (CPU cores,
memory, network bandwidth, and storage size, but not storage I/O), further scheduling within
the VmScheduler and CloudletScheduler is currently only done with regard to the CPU.
The new RDA module suggests a tighter integration of the other physical resources,
as described in Chapter 4.

3.3 CloudSim physical resources model

Physical resources are defined on different levels with different attributes. Table 3.1 lists
some important attributes when it comes to resource scheduling and allocation.

Table 3.1: Resource objects

Host:
peList - Number of processing cores.
ram - The amount of memory associated with the host.
bandwidth - The bandwidth that is reserved for the host.
storage - The hard disk size a host has.

VM:
numberOfPes - Number of processing elements (cores) required.
ram - The amount of memory required.
bandwidth - The bandwidth that is required.
size - Storage size in MB.
userID - The user identity of the owning user.

Cloudlet:
cloudletLength - Number of Million Instructions (MI) that are required for the job.
cloudletFileSize - Disk space needed when starting the job (MB).
cloudletOutputSize - Disk space needed when the job is finished (MB).
numberOfPes - Number of processing elements (cores) the job needs to execute.
userID - The user identity of the owning user.
vmID - The ID of the VM where this cloudlet is supposed to run.
utilizationModelCpu - The utilization model of the CPU.
utilizationModelRam - The utilization model of the RAM.
utilizationModelBw - The utilization model of the bandwidth.

CPU

The available CPU cores and their load are managed by the VmScheduler; please refer to
the scheduling policy section above.

It is also worth mentioning that there is a PeProvisioner class. However, this class seems
to be a relic of an earlier approach to managing the CPU cores. I would suggest that the
CloudSim project remove this unused class.

RAM

The RAM resource is managed by a RamProvisioner. It keeps track of the available
RAM and the allocated RAM of its host. The class RamProvisionerSimple is a simple
implementation of the abstract class RamProvisioner, which allocates the RAM of
a VM only when the requested RAM does not exceed the total available RAM of the host.
The method allocateRamForVm returns true only if the requested RAM for the VM was
successfully allocated; it returns false if there was not enough RAM available.

public boolean allocateRamForVm(Vm vm, int ram)

With the method getAvailableRam() the remaining amount of memory that is available
on the particular host can be determined.
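A minimal usage sketch of this provisioner API (the constructor argument, imports, and control flow are assumptions based on the description above, not verified against a specific CloudSim release):

import org.cloudbus.cloudsim.Vm;
import org.cloudbus.cloudsim.provisioners.RamProvisioner;
import org.cloudbus.cloudsim.provisioners.RamProvisionerSimple;

class RamProvisionerExample {
    // Try to give a VM 512 MB of RAM on a host with 2048 MB in total.
    static void allocateExample(Vm vm) {
        RamProvisioner ramProvisioner = new RamProvisionerSimple(2048);
        if (ramProvisioner.allocateRamForVm(vm, 512)) {
            // Allocation succeeded; the provisioner's bookkeeping is updated.
            System.out.println("Remaining RAM: " + ramProvisioner.getAvailableRam());
        } else {
            // Not enough free RAM; the VM keeps its previous allocation.
            System.out.println("RAM allocation failed");
        }
    }
}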

Network bandwidth

The available bandwidth of a host is modelled with a BwProvisioner. Similarly to the
RamProvisioner, it keeps track of the allocated bandwidth of all VMs on the host.

The class BwProvisionerSimple is a simple implementation of the abstract class BwProvisioner,
which allocates the bandwidth of a VM only when the requested bandwidth
does not exceed the total available bandwidth of the host. The method allocateBwForVm
returns true only if the requested bandwidth for the VM was successfully allocated; it
returns false if there was not enough bandwidth available.

public boolean allocateBwForVm(Vm vm, long bw)

Disk I/O

A host also contains a storage size attribute. This represents the size in MB the host has
available on the hard disk. It is mainly intended to be evaluated when allocating a VM
to a particular host and is managed in conjunction with the VM's storage attribute.
Reading from and writing to disks (storage I/O) are not represented by this attribute.

CloudSim offers a proper storage model only on the datacenter level, in the vein of a SAN.
Storage area networks are constructed on top of block-addressed storage units connected
through dedicated high-speed networks. This is an often-used form of distributed storage,
next to network-attached storage (NAS), which in contrast is implemented
by attaching file servers to a TCP/IP network [10, p. 222]. In CloudSim, one can specify
multiple storage devices with their individual capabilities on the datacenter level. To add
data to a storage, one must initiate a SimEvent that will be processed in the method
Datacenter.processDataAdd(). The network resource is not affected by such transfers,
consistent with its being modelled as a SAN.

Consumption patterns of VMs

A VM must be instantiated with a basic set of values as shown in Table 3.1. These
values are primarily used for the initial provisioning to a host. During the lifetime of a
VM the consumption is closely coupled to the CloudletScheduler associated with the VM.
Through the following methods, one can get the current requested resources from the VM.

public double getCurrentRequestedTotalMips()
public long getCurrentRequestedBw()
public int getCurrentRequestedRam()

The results of these methods are derived from the CloudletScheduler. The host may call
these methods to decide on a new allocation of the PRs. The host, or respectively the
different provisioners, can then allocate the resources to the VM by calling the setter
methods of these resources.

Consumption patterns of Cloudlets

Cloudlets are mainly defined by the number of instructions they need to process to
be finished. Moreover, there are different utilization models that define the consumption
patterns of the current workload.

public double getUtilizationOfCpu(final double time)
public double getUtilizationOfRam(final double time)
public double getUtilizationOfBw(final double time)

Currently, there are four implementations of the UtilizationModel interface:

UtilizationModelFull: Always uses all available resources.
UtilizationModelNull: Never uses any of the available resources.
UtilizationModelStochastic: Uses a random fraction of the resources at each time frame.
UtilizationModelPlanetLabInMemory: Uses an input file with one utilization value (an integer) per line.

The only method within the UtilizationModel interface returns the percentage of utilization
of the resource at the current time:

public double getUtilization(final double time)



To make this utilization model cloudlet-progress aware, the input must be progress
dependent. The methods return the required resources of an individual cloudlet at a given
time during the simulation. All cloudlets combined add up to the VM's total usage,
and all VMs on a particular host then add up to the total physical resources used on
that host.
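For illustration, a custom time-based model can be plugged in by implementing this interface. The sketch below (a hypothetical example, not part of CloudSim or the RDA module) ramps the utilization linearly from 0% to 100% over the first 60 simulated seconds; note that the returned value depends only on the simulation time, which is exactly the progress-unawareness discussed in Section 3.4:

import org.cloudbus.cloudsim.UtilizationModel;

public class UtilizationModelRampUp implements UtilizationModel {

    private static final double RAMP_SECONDS = 60.0;

    // Linear ramp-up from 0% to 100%, then full utilization.
    @Override
    public double getUtilization(double time) {
        if (time >= RAMP_SECONDS) {
            return 1.0; // 100% of the resource
        }
        return time / RAMP_SECONDS;
    }
}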

3.4 Dependencies between PRs

Currently, no dependencies between these resources are modelled. The CloudSim
model assumes that they are independent variables. This means, for example, that
the utilization model for bandwidth does not affect the utilization model of the CPU. The
different utilization models act independently of other resources and depend only on
the current simulation time. The currently available utilization model interface
only allows specifying the usage as a function of time; this implementation is not progress aware.
If, for example, the RAM had to be modelled dependent on a certain process step
in the cloudlet execution, the currently available utilization models would not support this.
If the CPU needed more processing time in a previous time frame because it got
degraded, the RAM usage would also have to be adapted to match the execution progress
of the cloudlet. In the current implementation, this dependency is ignored. To make the
behavior more realistic, the RDA module, which was elaborated within this thesis,
implements such dependencies.
Chapter 4

Resource Dependency Aware (RDA) Module

This newly developed extension to the CloudSim library enables a more realistic behavior
of the various physical resources. It may be used, for example, to assess different scheduling
policies on different levels. As confirmed in Chapter 2, there exist various dependencies
between physical resources. These observations should also be reflected when
running a workload within CloudSim. Therefore, the individual Cloudlets should adopt
a more realistic behavior, because they represent the actual processes that are running
within the simulation.

Mainly, two types of dependencies are implemented: Leontief dependencies
and progress-aware behavior. The Leontief production function is a microeconomic
concept that defines fixed relationships between the input factors and the output. If
there is not enough input from one resource, the overall output is also limited to this
level [6]. Progress-aware behavior means that the individual resources are also adapted
according to the progress of the workload. In case a resource becomes scarce and the
workload cannot be processed as fast as desired, all resources adapt to this behavior.

4.1 Functional Requirements

This section specifies the functional requirements. They give a basic
understanding of the functions that this module delivers.

4.1.1 Input data

The input is a comma-separated values (CSV) file that contains the workload of a Cloudlet.
Every line corresponds to an interval of one second. Therefore, the first line holds the
measured input at 0 seconds, the second line the measured input at 1 second, and so on.

Table 4.1 shows a sample input:


Table 4.1: Input data


CPU (MIPS) RAM (MB) NETWORK (MBit/s) DISK (MBits/s)
20 1 0 0
220 50 50 20
180 75 45 18

Figure 4.1: Linear resource progress

The measuring points are interpreted as linearly connected, as shown in Figure 4.1 for
the resource CPU. Therefore, the CPU usage increases continuously between 0 seconds and
1 second and then decreases linearly again.
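Between two measuring points, the requested value at an arbitrary time t is therefore obtained by simple linear interpolation. A minimal sketch (a hypothetical helper, not the module's actual code):

// Linearly interpolate a requested resource value between measuring points
// taken at whole seconds, i.e. the rows of the CSV input from Table 4.1.
static double interpolate(double[] samples, double t) {
    int i = (int) Math.floor(t); // index of the previous measuring point
    if (i >= samples.length - 1) {
        return samples[samples.length - 1];
    }
    double fraction = t - i;     // position within the one-second interval
    return samples[i] + fraction * (samples[i + 1] - samples[i]);
}

// Example with the CPU column of Table 4.1 (20, 220, 180 MIPS):
// interpolate(new double[] {20, 220, 180}, 0.5) returns 120.0.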

4.1.2 Output data

The output of each VM must contain the utilization levels of the four resources at a given
time. A CSV file must be created containing the time and the utilization values of
the four observed resources. Every change of a value is reflected in an additional row. In
the same fashion as the input values, these output values can also be interpreted as
linearly connected.

4.1.3 Dependency modeling

The overall goal is a progress-aware behavior of the workload. This means that
the given input data might get stretched because of resource scarcity. Furthermore,
one has to take into account that there are dependencies between the CPU processing
speed and the other physical resources. It is enough to consider this direct dependency
on the CPU to obtain an adequate result. The indirect dependencies, for example
between network and disk I/O, are implicitly covered when the dependencies of
all resources on the CPU are modelled.

Figure 4.2: Resource progress

CPU → RAM

When calculating the current memory usage of a Cloudlet, the actual progress of the
Cloudlet has to be considered. Figure 4.2 shows the normal progress of the CPU and memory
in the left graphic; in the right graphic, the CPU is capped at 200 MIPS.
This results in a 0.12 seconds longer execution time of the workload. As soon as the CPU
does not get its requested utilization level, the RAM no longer increases with the same
gradient as in the uncapped scenario. It is almost invisible, but there is a slight
kink at 0.4 seconds in the RAM curve.

CPU ↔ network bandwidth

Between the CPU and the network, the Leontief utility function comes into play.
A shortage of CPU leads to an equal percentage drop in the used network bandwidth,
and a shortage of network bandwidth likewise leads to an equal percentage
drop of the CPU.

CPU ↔ disk I/O

The Leontief utility function is also assumed between the CPU and the disk I/O. A shortage
of CPU leads to an equal percentage drop in the used disk I/O, and a shortage
of disk I/O likewise leads to an equal percentage drop of the CPU. A sketch of this rule
follows below.
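A sketch of how such an equal-percentage cut can be applied (a simplified stand-alone helper; in the module itself this logic lives in the RdaCloudletSchedulerDynamicWorkload, described in Section 4.2):

// Leontief rule: find the most constrained dependent resource and scale
// CPU, network bandwidth, and disk I/O down by the same percentage.
static double[] applyLeontiefCut(double[] requested, double[] available) {
    double minRatio = 1.0;
    for (int i = 0; i < requested.length; i++) {
        if (requested[i] > 0) {
            minRatio = Math.min(minRatio, available[i] / requested[i]);
        }
    }
    double[] granted = new double[requested.length];
    for (int i = 0; i < requested.length; i++) {
        granted[i] = requested[i] * minRatio; // equal percentage drop
    }
    return granted;
}

// Example: requested {300 MIPS, 50 MB/s, 20 MB/s} with only 150 MIPS free
// gives a ratio of 0.5, so bandwidth and disk I/O are halved as well.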

4.1.4 Resource allocation policies

Two allocation policies on the host level that support these multi-resource dependencies
should also be implemented within this module: a basic standard scheduling policy and
a scheduling policy that works with the greediness metric, which was already developed at
the CSG.

Standard scheduling policy

This policy works with the Max-Min Fair Share (MMFS) algorithm [17, p. 313]. This
principle is the basic way in which several processes on one host obtain their resources
when they have the same priority. It works as follows: the available capacity is split equally
among the VMs in a fair manner, where fair is considered an equal split of the available capacity.
Further, no VM gets a share larger than its demand, and the remaining VMs, which
request more resources, obtain an equal share of the remaining available capacity. This
principle applies to all dependent physical resources (CPU, network
bandwidth, and disk I/O). The scarcest of all resources is capped first; after that,
the remaining resources are capped by the same percentage. A sketch of the per-resource
computation follows below.
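The per-resource max-min split can be sketched as follows (a generic textbook implementation, not the VmSchedulerMaxMinFairShare code itself):

import java.util.Arrays;

// Max-min fair share for one resource: satisfy the smallest demands in
// full and split what remains equally among the larger ones.
static double[] maxMinFairShare(double[] demands, double capacity) {
    Integer[] order = new Integer[demands.length];
    for (int i = 0; i < order.length; i++) order[i] = i;
    Arrays.sort(order, (a, b) -> Double.compare(demands[a], demands[b]));

    double[] share = new double[demands.length];
    double remaining = capacity;
    int left = demands.length;
    for (int idx : order) {
        double fair = remaining / left;            // equal split of the rest
        share[idx] = Math.min(demands[idx], fair); // never exceed the demand
        remaining -= share[idx];
        left--;
    }
    return share;
}

// Example: demands {10, 40, 80} with capacity 100 yield shares {10, 40, 50}.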

Greediness scheduling policy

The greediness metric, which was developed to achieve overall fairness among multiple
customers within the same cloud [20], is also assessed within this study. The greediness
algorithm calculates the greediness of each customer according to the resource usage of
his/her VMs. This greediness value per customer can then be passed on to the data
center level, where the average greediness of each customer is calculated. Subsequent
resource allocations on the host level are then influenced by the average greediness of the
user within the cloud/data center. Allocating the resources in a fair manner means
allocating more resources to VMs of customers with a low greediness, which may in turn
increase their greediness [20].

An implementation of this resource allocation algorithm is provided in Python. This
Python script has to be integrated into a custom resource allocation scheduler within
the RDA module.
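Calling such a script from Java can be done by spawning a Python process and exchanging data over standard input/output. A minimal sketch (the script name, its argument, and the output format are placeholders; the actual wrapper is the VmSchedulerGreedinessAllocationAlgorithm described in Section 4.2):

import java.io.BufferedReader;
import java.io.InputStreamReader;

class PythonScriptRunner {
    // Invoke an external Python script and collect its output line by line.
    static String runGreedinessScript(String inputCsv) throws Exception {
        // "greediness.py" and the CSV argument are illustrative placeholders.
        Process process = new ProcessBuilder("python", "greediness.py", inputCsv)
                .redirectErrorStream(true)
                .start();

        StringBuilder output = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(process.getInputStream()))) {
            String line;
            while ((line = reader.readLine()) != null) {
                output.append(line).append('\n');
            }
        }
        process.waitFor();
        return output.toString();
    }
}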

4.2 Architecture and Design

The overall design goal of the RDA module was to make as much use of the CloudSim
classes as possible and to integrate the new functionalities tightly. In many situations
this leads to ordinary sub-classing of the existing framework. CloudSim also contains
a power-aware package. Within the RDA module, we decided to subclass the classes
from the power package, because this enhances their functionality, and the power-aware
measurement capabilities may be used in the future.

Some of the adaptations that had to be made in order to achieve a more fine-grained
scheduling also led to changes of numeric types. For example, some of the resource
provisioners had to be switched from integers to double values. This enables a more precise
calculation of the allocated resources at times when a simulation event does not fall exactly
on a full second, where the given input value would have no decimal places.

An important and widely used concept within CloudSim is the firing of events. Many
methods return a time at which they expect the next event to happen on their side.
This time is then propagated to the data center and from there added to the event
engine as a future event. Because the overall performance of the simulation strongly
depends on the number of events produced during the runtime, optimizing
event creation is important.

When processing a workload, changes in resource consumption can happen every
nanosecond. As the CPU speed is given in million instructions per second (MIPS), for
example 200 MIPS (200'000'000 instructions per second), there can be changes in resource
consumption every nanosecond (0.000000001 seconds). This interval has therefore been
chosen as the minimum interval between any two events in the simulation. Of course,
it would take a large amount of computation time if the simulation recalculated the
resource consumption for every nanosecond, although this would be the most precise way
to evaluate a certain workload, especially in times of scarcity of a certain resource. To
minimize the number of events and thereby optimize the performance of the simulator,
the individual workloads calculate the next expected change in resource consumption
and propagate this time to the data center. This paradigm offers the best performance
for this constructive simulator. Usually, the next change in resource consumption can
be predicted with simple mathematical calculations, which are explained in the
implementation, Chapter 4.3. However, if scarcity occurs on a host, the prediction of
future resource allocation no longer depends on the workload alone; wider aspects come
into play, for example the VM scheduling policy or even a policy on the data center level
that might make customer-dependent decisions. In such timeframes of over-demand, the
events must be fired more frequently to obtain a more accurate calculation that takes all
varying parameters into account. For this, it is up to the user of the simulator to choose
the degree of precision, as a static scheduling interval to be applied in times of scarcity.
Depending on the simulation scenario/requirements, the user may choose an interval
between one nanosecond and one whole second.

This design section is intended to be read alongside the API specification [4], where the
classes are described in greater detail.

RdaDatacenter

This datacenter implementation is specifically designed for usage within the RDA
module. It contains an optimized simulation event mechanism that allows a performance-optimized
simulation of the cloud within the RDA context.

UserAwareDatacenter

This datacenter has the additional capability to handle customer/user-aware scenarios.
Before updating the host processing, this datacenter requests from every host a priority
map with the priority of each user. The priorities are then added up and put into a
consolidated map that is given to each host when updating the currently processing
workloads. The corresponding UserAwareHost/UserAwareVmScheduler can then consider
these priorities of the users when reallocating resources for the respective users. This
mechanism is used, for example, by the Greediness Metric policy.

Figure 4.3: Data center and host classes

RdaHost

This RDA-specific host supports the resource dependency aware scheduling mechanisms
and, like its superclass, represents an individual host in the data center that may run VMs.

It uses the RDA-specific resource provisioners for a finer allocation that works with
decimal places (doubles). Because of this, some methods of the superclass Host had to be
overridden and adapted to work with the RDA provisioners.

The central method that processes the workloads on the host is:

public double updateVmsProcessing(double currentTime)

This method then calls the VM scheduler to reallocate the resources for all
VMs on the particular host. It supports scheduling for multiple resources: CPU,
RAM, bandwidth, and storage I/O.

UserAwareHost

This interface is intended for hosts that support the user-aware mechanisms. Its methods
allow the transfer of a map with user priorities.

public abstract Map<String, Float> getUserPriorities(double currentTime)

public abstract double updateVmsProcessing(double currentTime, Map<String, Float> priorities)

Figure 4.4: VM and VM scheduler classes

The RdaUserAwareHost class implements this interface.

RdaVmScheduler

This interface is intended to handle multiple-resource scheduling. It specifies just one
method:

public abstract void allocateResourcesForAllVms(double currentTime, List<Vm> vms)

This method replaces the method VmScheduler.allocatePesForVm(). Because the RDA
module supports multiple resources, the resources for all VMs have to be allocated in one
step, as they are all interdependent.

VmSchedulerMaxMinFairShare

This VM scheduler is responsible for allocating the resources of the VMs on the host;
the host delegates the VM scheduling to a defined VM scheduler. This implementation
works according to the Max-Min Fair Share (MMFS) algorithm [17, p. 313], described
earlier in the requirements section (Section 4.1).

RdaUserAwareVmScheduler

In addition to the RdaVmScheduler interface, this interface supports a user/customer-aware
prioritisation functionality. Besides the allocation method, it has the
method getUserPriorities(). This method is called prior to the resource allocation to
retrieve the priorities and propagate them to the datacenter. How the priorities are handled
is left to the allocation in the second method. As visible in the method declarations,
the priorities are defined as Float values. Whether a high value means a high priority or
the contrary is left to the implementor.

public abstract Map<String, Float> getUserPriorities(double currentTime, List<Vm> vms)
public abstract void allocateResourcesForAllVms(double currentTime, List<Vm> vms, Map<String, Float> userPriorities)

VmSchedulerGreedinessAllocationAlgorithm

The greediness scheduling algorithm was developed to achieve overall fairness among
multiple customers within the same cloud [20]; it therefore implements the
RdaUserAwareVmScheduler interface, which supports user/customer-aware scenarios.

This class solely wraps the Python script, in which the actual implementation
of the greediness metric can be found.

RdaVm

This is the basic representation of a VM running on a particular host. The RdaVm
particularly supports multiple resources. The central method updateVmProcessing() had to be
extended to support network bandwidth and disk I/O besides CPU speed. Moreover,
some methods also had to be adapted to support decimal places for RAM and bandwidth.

The two central classes where the workload processing actually takes place are the
RdaCloudlet and the RdaCloudletSchedulerDynamicWorkload, as shown in Figure 4.5.
These two classes form the heart of the whole RDA module. The progress-aware behavior
takes place within the RdaCloudlet class, and the Leontief dependencies between
the resources are implemented within the RdaCloudletSchedulerDynamicWorkload.

Figure 4.5: Cloudlet and cloudlet scheduler classes

RdaCloudlet

This Cloudlet works in a progress-aware fashion. This means that it requests computing
resources dependent on the workload progress so far. This behavior becomes visible
as soon as there is a time of scarcity and the Cloudlet did not get all requested resources.
This results in a downgraded processing speed for the time during which scarcity exists.
Naturally, this is also reflected in the overall processing time of the Cloudlet, which will
increase.

It must be instantiated with a CSV input file that contains the requested resources for
CPU, RAM, network bandwidth, and disk I/O. Example CSV file:

cpu,ram,network,storage
150,50,0,0
280,300,0,0

cpu in MIPS (million instructions per second)
ram in MB
network in MB/s
disk I/O in MB/s

Later in the implementation chapter, we take a closer look at the actual implementation.

RdaCloudletSchedulerDynamicWorkload

This cloudlet scheduler makes sure that the Leontief dependencies are taken into account.
This results in an equal percentage drop of the other resources as soon as one resource gets
scarce. It checks which resource is the scarcest and downgrades the other
resources according to this drop. The central method where the scheduling is processed
is, as in the RdaVm class, the method updateVmProcessing(). This method also had to
be extended to support bandwidth and disk I/O besides the CPU speed.

4.3 Implementation

This section discusses some of the more complex code parts and concepts that were
chosen in the actual implementation of the RDA module. Please refer to the
API documentation and the code itself to get a more comprehensive picture of the RDA
module.

4.3.1 RdaCloudlet

The concept for making the cloudlet progress aware is straightforward: one has to keep track
of the instructions a cloudlet has already processed. This is a concept that the superclass
Cloudlet is already based on; the class attribute instructionsFinishedSoFar keeps track of
this progress. The question that comes up is how this class derives the requested
utilization at a given time in the simulation. To understand this, we first have a look at
the input modelling and then elaborate further on how we get to the actual utilization values
for the different resources.

Table 4.2: Input array for requested resources

INSTRUCTIONS CPU (MIPS) RAM (MB) NETWORK (MBit/s) DISK (MBits/s)
0.0 200 50 55 1
205000000 210 40 22 2
415000000 210 41 23 18

Input modelling for requested resources

In the constructor, the CSV file input is taken and placed into a two-dimensional array (see Table 4.2). A new column is added at the beginning, which contains the accumulated instructions processed up to a certain utilization level.

The INSTRUCTIONS column contains the accumulated processed instructions between the current and previous bound. This is the number of instructions obtained when linearly connecting those two points; it can simply be determined by calculating the average value between the two points:

$Inst_i = Inst_{i-1} + \frac{cpu_{i-1} + cpu_i}{2} \cdot 1'000'000$

This array then allows retrieving the requested resource utilization from the exact processing position, which is maintained by the instructionsFinishedSoFar attribute. One could say that the time of the input values is, in a sense, replaced by the accumulated instructions column.
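A minimal sketch of this preprocessing step could look as follows (assuming the CSV rows are already parsed into a double[][]; names are illustrative, not the actual RdaCloudlet code):

// rows[i] = {cpu, ram, network, storage} per CSV line; builds the array of
// Table 4.2, where column 0 holds the accumulated instructions up to each bound
static double[][] addInstructionColumn(double[][] rows) {
    double[][] table = new double[rows.length][5];
    double instructions = 0.0;
    for (int i = 0; i < rows.length; i++) {
        if (i > 0) {
            // average MIPS between the two bounds, times 1'000'000 (MI -> instructions)
            instructions += (rows[i - 1][0] + rows[i][0]) / 2.0 * 1_000_000;
        }
        table[i][0] = instructions;
        System.arraycopy(rows[i], 0, table[i], 1, 4);
    }
    return table;
}

Applied to the example CSV above, the second row receives (150 + 280) / 2 · 1'000'000 = 215'000'000 accumulated instructions.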

Calculation of requested utilization

At every event in the simulation, the cloudlet must be able to calculate the requested/wanted utilization for the different resources. This is implemented in the following method:

private double getRequestedUtilization(final double timeSpan, double resourceGrad, int resourceIndex)

timeSpan: Time span since the last event in the simulation.
resourceGrad: The gradient of the resource to be evaluated.
resourceIndex: The resource index in the array.

Two further values are used in the calculation:

expectedTime: The time where the workload progress should be, according to the instructions already processed.
processedInstructions: The instructions that have already been executed.

Figure 4.6: Resource calculation

When calculating the requested utilization, we have to take the progress of the cloudlet into account, as mentioned before. To retrieve the proper result, several steps are necessary:

1. Retrieve the expected time in the instruction interval. The expected time is the time where we are supposed to be in the instruction interval, depending on the processed instructions. (An instruction interval represents one row in the array.)

pastSpeedCpu = first bound of the interval (150 MIPS)
gradCpu = gradient of the CPU (280 - 150 = 130)

$f(x) = gradCpu \cdot x + pastSpeedCpu$

$\int f(x)\,dx = \frac{gradCpu}{2} \cdot x^2 + pastSpeedCpu \cdot x + C$

We know the instructions already processed within the current instruction interval, therefore

$processedInstructions = \int f(x)\,dx + C$

Solving for x with C = 0, using the standard formula for quadratic equations, we get the expected time (x = expectedTime):

$discriminant = pastSpeedCpu^2 + 2 \cdot gradCpu \cdot \frac{processedInstructions}{1'000'000}$

$expectedTime = \frac{-pastSpeedCpu + \sqrt{discriminant}}{gradCpu}$

2. Calculate the requested utilization of the given resource (in this example RAM) at the expected time (cf. Figure 4.6):

$pastUtilizationResource = 50$

$resourceGrad = 300 - 50 = 250$

$pastRequestedUtilization = resourceGrad \cdot expectedTime + pastUtilizationResource$

3. Calculate the requested utilization of the resource at the new time (expectedTime + timeSpan):

$currentRequestedUtilization = resourceGrad \cdot timeSpan + pastRequestedUtilization$
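Put together, the three steps could be sketched as follows (variable names follow the text; this is an illustration, not the actual getRequestedUtilization() code):

// step 1: solve (gradCpu/2)*t^2 + pastSpeedCpu*t = processedInstructions/1'000'000
// for t, using the quadratic formula (assumes gradCpu != 0)
static double expectedTime(double pastSpeedCpu, double gradCpu, double processedInstructions) {
    double discriminant = pastSpeedCpu * pastSpeedCpu
            + 2.0 * gradCpu * processedInstructions / 1_000_000;
    return (-pastSpeedCpu + Math.sqrt(discriminant)) / gradCpu;
}

// steps 2 and 3: linearly extrapolate the resource from the expected time
// to the new time expectedTime + timeSpan
static double requestedUtilization(double pastUtilization, double resourceGrad,
        double expectedTime, double timeSpan) {
    double pastRequested = resourceGrad * expectedTime + pastUtilization; // step 2
    return resourceGrad * timeSpan + pastRequested;                       // step 3
}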

Calculation of estimated next consumption change

Another method within the RdaCloudlet class calculates the estimated time when the next consumption change of the cloudlet will occur. This method is used to schedule the next event in the simulation.

past = last CPU utilization
instructionsToProcess = how many instructions the cloudlet has to process until the next bound

$f(x) = grad \cdot x + past$

$\int f(x)\,dx = \frac{grad}{2} \cdot x^2 + past \cdot x + C$

We know instructionsToProcess, therefore

$instructionsToProcess = \int f(x)\,dx + C$

Solving for x with C = 0, using the standard formula for quadratic equations, we get the estimated time until the end of the interval (x = time):

$discriminant = past^2 + 2 \cdot grad \cdot instructionsToProcess$

$time = \frac{-past + \sqrt{discriminant}}{grad}$

4.3.2 RdaCloudletSchedulerDynamicWorkload

The cloudlet scheduler computes the actual processing of the cloudlet within the updateVmProcessing() method. This method is executed at every simulation event. Further, it returns the next estimated time at which the cloudlets handled by this scheduler expect a consumption change. This time is propagated up to the data center level, which produces a new future event at the requested time that will subsequently trigger this method again. When processing an event, the scheduler also guarantees that the Leontief dependencies between the resources are taken care of.

Within this scheduler, the processedInstructions from the last event to the current event are calculated as follows:

timeSpan = time span since the last event
pastCpu = the past processing speed
effectiveGradient = the gradient from the last utilization to the current processing speed

$f(x) = grad \cdot x + past$

$\int f(x)\,dx = \frac{grad}{2} \cdot x^2 + past \cdot x + C$

with C = 0,

$processedInstructions = \left(\frac{effectiveGradient}{2} \cdot timeSpan^2 + pastCpu \cdot timeSpan\right) \cdot 1'000'000$

The calculated processedInstructions are then added to the cloudlet's instructionsFinishedSoFar variable to keep track of the overall progress of the cloudlet.
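In code, this amounts to a single expression (a sketch using the variable names from above):

// integral of the linear CPU speed over timeSpan, converted from
// million instructions (MI) to instructions
static double processedInstructions(double pastCpu, double effectiveGradient, double timeSpan) {
    return (effectiveGradient / 2.0 * timeSpan * timeSpan + pastCpu * timeSpan) * 1_000_000;
}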

Leontief dependencies

How are the Leontief dependencies taken care off? If one of the dependent resource
(CPU, network bandwidth, disk I/O) doesn’t get the requested utilization amount, the
other resources, are getting downgraded with the same proportion.

Figure 4.7 shows a 10% downgrade of the cpu, leads to a 10% downgrade of the network
bandwidth and disk I/O.
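A sketch of this proportional downgrade (illustrative names, not the actual scheduler code):

// Leontief coupling: the cut on the scarce resource (here CPU) defines a common
// damping factor that is applied to the dependent throughput resources alike
static double[] applyLeontief(double allocatedCpu, double requestedCpu,
        double requestedBw, double requestedDiskIo) {
    double damping = Math.min(1.0, allocatedCpu / requestedCpu); // 0.9 for a 10% CPU cut
    return new double[] { requestedBw * damping, requestedDiskIo * damping };
}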

4.3.3 RdaHost

The RDA host's most central method is certainly the updateVmsProcessing() method; therefore, we have a closer look at it. It does three main things: first, it allocates the available resources to the different VMs on the host; secondly, it initiates the cloudlet processing on the individual VMs; and finally, it checks whether a scarcity could come up in the future. This is done within the checkForScarcity() method.

The checkForScarcity() method adds up the gradients of each resource over all VMs combined and checks whether the maximum capacity of the resource on the host would be exceeded within a time smaller than the next estimated event time. If so, it schedules the next event at exactly this time, or at a minimum scheduling interval in moments where scarcity occurs. This interval is called scarcitySchedulingInterval and can be found as an attribute on the host.
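The following sketch illustrates the underlying check (illustrative; the actual checkForScarcity() signature may differ):

// Estimates when the summed-up, linearly growing requests of all VMs hit the
// host capacity and returns the time until the next event to schedule.
static double nextEventTime(double currentUsage, double aggregateGradient, double capacity,
        double nextEstimatedEvent, double scarcitySchedulingInterval) {
    if (aggregateGradient <= 0) {
        return nextEstimatedEvent; // usage is not growing, no scarcity before the next event
    }
    double timeToScarcity = (capacity - currentUsage) / aggregateGradient;
    if (timeToScarcity < nextEstimatedEvent) {
        // schedule at the scarcity point, but not finer than the minimum interval
        return Math.max(timeToScarcity, scarcitySchedulingInterval);
    }
    return nextEstimatedEvent;
}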

4.3.4 VmSchedulerMaxMinFairShare

This scheduler uses the Max-Min Fair Share (MMFS) algorithm to allocate the resources to the different VMs. The MMFS algorithm is a good abstraction of how multiple processes on one host share physical resources. This behavior could be seen in the various experiments that we have conducted; please see, for example, the network experiment trace (Appendix E.3) and compare the stressed and unstressed conditions. This scheduler also takes the Leontief dependencies into account when allocating resources for the different users. Therefore, it is more sophisticated than just applying the Max-Min algorithm to each resource in isolation.

How it works:

1. Check which of the resources CPU, bandwidth and disk I/O is the most scarce.

2. Downgrade that resource for all VMs.

3. Downgrade all other resources of each VM according to the Leontief production function dependencies.

4. Repeat steps 1 to 3 for the other resources.

This procedure guarantees that the host resources are not overused; a sketch of the loop is given below.
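The following array-based sketch outlines this loop (illustrative; mmfs() is a compact array variant of the MaxMinAlgorithm shown in Section 4.3.5, not the actual scheduler code):

// demands[v][r]: requested amounts per VM (r: 0 = CPU, 1 = bandwidth, 2 = disk I/O)
static double[][] allocateLeontiefAware(double[][] demands, double[] capacity) {
    int vms = demands.length, res = capacity.length;
    double[][] alloc = new double[vms][];
    for (int v = 0; v < vms; v++) alloc[v] = demands[v].clone();
    boolean[] done = new boolean[res];
    for (int round = 0; round < res; round++) {
        // step 1: find the most scarce of the resources not yet processed
        int scarce = -1;
        double worstRatio = -1;
        for (int r = 0; r < res; r++) {
            if (done[r]) continue;
            double total = 0;
            for (double[] a : alloc) total += a[r];
            if (total / capacity[r] > worstRatio) {
                worstRatio = total / capacity[r];
                scarce = r;
            }
        }
        done[scarce] = true;
        // step 2: max-min fair share on the scarce resource
        double[] col = new double[vms];
        for (int v = 0; v < vms; v++) col[v] = alloc[v][scarce];
        double[] share = mmfs(col, capacity[scarce]);
        // step 3: Leontief - downgrade all resources of a VM by the same factor
        for (int v = 0; v < vms; v++) {
            double damping = col[v] == 0 ? 1.0 : share[v] / col[v];
            for (int r = 0; r < res; r++) alloc[v][r] *= damping;
        }
    }
    return alloc;
}

// compact array variant of the MMFS algorithm of Section 4.3.5
static double[] mmfs(double[] demand, double capacity) {
    double[] share = demand.clone();
    java.util.Set<Integer> open = new java.util.HashSet<Integer>();
    for (int i = 0; i < demand.length; i++) open.add(i);
    double remaining = capacity;
    boolean removed = true;
    while (removed && !open.isEmpty()) {
        removed = false;
        double fair = remaining / open.size();
        for (java.util.Iterator<Integer> it = open.iterator(); it.hasNext();) {
            int i = it.next();
            if (demand[i] <= fair) {
                remaining -= demand[i];
                it.remove();
                removed = true;
            }
        }
    }
    double leftoverShare = open.isEmpty() ? 0 : remaining / open.size();
    for (int i : open) share[i] = leftoverShare;
    return share;
}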

4.3.5 MaxMinAlgorithm

This class is an implementation of the Max-Min Fair Share (MMFS) algorithm [17]: the available capacity is split among the customers in a fair manner, so that no customer gets a share larger than its demand and the remaining customers obtain an equal share of the resource.

// Java version of the MMFS algorithm (requires java.util.HashMap, java.util.Iterator, java.util.Map)
// D: the resources requested by the customers (input); C: the capacity of the resource (input)
// returns A: the resources allocated to each customer
public Map<String, Double> maxMinFairShare(Map<String, Double> D, double C) {
    Map<String, Double> A = new HashMap<String, Double>();
    Map<String, Double> demands = new HashMap<String, Double>(D);
    double remainingCapacity = C;
    boolean userGotRemoved = true;

    // as long as a user's request is below the fair share, the user gets the requested amount
    while (userGotRemoved && !demands.isEmpty()) {
        userGotRemoved = false;
        double fairShare = remainingCapacity / demands.size();
        Iterator<Map.Entry<String, Double>> it = demands.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Double> entry = it.next();
            double requested = entry.getValue();
            if (requested <= fairShare) {
                A.put(entry.getKey(), requested);
                remainingCapacity -= requested;
                it.remove();
                userGotRemoved = true;
            }
        }
    }

    // splitting up the leftover equally between the remaining customers
    if (!demands.isEmpty()) {
        double fairShare = remainingCapacity / demands.size();
        for (String customer : demands.keySet()) {
            A.put(customer, fairShare);
        }
    }
    return A;
}

4.3.6 RdaDatacenter

The central method of the datacenter when it comes to the processing of the physical resources is updateCloudletProcessing(). Within this method, the call to processHosts() is made. This method differs slightly between the RdaUserAwareDatacenter and the RdaDatacenter, because in the user-aware implementation the handling of the priorities also takes place.

Otherwise, this datacenter is not much different from the PowerDatacenter, except that the RDA data center's event creation is optimized to work with very small time intervals (nanoseconds) while at the same time not sending too many events. This can be observed in the method addNextDatacenterEvent().
Chapter 5

Resource Allocation Policy Evaluation

This chapter intends to show an approach to how resource allocation policies can be evaluated with the help of the CloudSim RDA module. Several resource allocation algorithms exist, two of which have been implemented within the RDA module: the GM policy (greediness policy) and a standard resource allocation mechanism that makes use of the Max-Min Fair Share (MMFS) algorithm.

5.1 Simulating workloads

An important aspect when simulating a cloud computing infrastructure are the different workloads. Naturally, the primary goal of simulating workloads is to achieve a realistic behavior that reflects a real data center. This leads to the question whether trace data from real-world data centers is available. Unfortunately, publicly available resource data of this kind is almost non-existent, and only a limited number of trace logs is available. This is mostly due to the business and confidentiality concerns of users and providers in commercial clouds [18]. However, three data-sets can be mentioned at this place.
The first is from Google, available under the project name googleclusterdata [5]. It features traces from over 12,000 servers over the period of a month. It contains data from a huge number of tasks; however, only the usage traces of CPU and memory are available. The storage I/O that would be relevant within our research is only logged on a task summary level, and the network bandwidth is omitted entirely.
The second data-set is already used within the core CloudSim package. However, it consists only of CPU traces from the PlanetLab VMs [5], also collected in 2011, similarly to the googleclusterdata. It was used within CloudSim to evaluate power consumption mechanisms. The third and most promising resource data is from Yahoo Labs [9]. This data-set contains a series of traces of hardware resource usage of the PNUTS/Sherpa database, a massively parallel and geographically distributed database system for Yahoo!'s web applications [12]. The measurements include CPU utilization, memory


utilization, disk utilization and network traffic, hence all resources that are in the focus of this study. As good as this sounds, the traces are unfortunately only captured in a 1-minute interval instead of 1-second traces. However, it might still be possible to use them for our purposes by adding intermediate estimated data.

Because the publicly available data-sets do not provide a sufficient basis for our evaluation, there is no way around generating our own traces within the CSG experimental environment and compiling workloads that are as realistic as possible. To simulate the load of a basic web server, the test setup that was already used to determine dependencies (see Section 2.2) came in very handy. Instead of running a static workload, the JMeter test suite was configured to send requests in a random fashion.

The observed behavior has then been analyzed and used as a reference for creating stochastic workloads.

5.1.1 Stochastic workload generation

To obtain a random data generation tool that can easily be used when running cloud simulations, a new class, the StochasticDataGenerator, was implemented.

We have designed two basic data generation models: one with wave-like curves and another with pillar-like curves. These two types differ in the major periods of the samples. Both types have been created in close consideration of the observations from the many experiments that were conducted previously.

Appendix C shows some example workloads that have been generated using this method.

Many parameters can be used to individually adapt the workloads, such as the average consumption of a resource, different standard deviations of the random values, the dependency factor between the curves and the vertical stretch of the curves. Special attention has been given to RAM in both models: it has smaller changes than the other three resources and does not show a randomization between the smaller periodic changes. As a result, it shows a more linear behavior than the other resources.

5.1.2 Workload types

It is a fact that many types of workloads are processed within clouds. To have a foundation for our simulations, we came up with three basic types of VM workloads: CI (Computing Intensive), WS (Web Server) and DB (Database). Table 5.1 shows the different consumption levels in an intuitive way. The concrete average utilization levels may vary in the upcoming experiments.

Table 5.1: Workload consumption levels


CPU RAM BW Disk I/O
WS medium low medium low
CI high medium low low
DB low high low high

5.2 Key determinants for assessing the policies

There are basically two key determinants when assess allocation policies; The fairness of
computing power shared between users and the efficiency a certain policy achieves [21].
The key determinant within this study, fairness, takes into account the different consump-
tions of resources for the various customers. There exist many different kind of fairness
measures. In this study we chose three different measures. Asset fairness [13], DRF [13],
and GM (Greediness Metric) [20], to evaluate the allocation policies.
These three different fairness measures have then been scaled with the Jain’s fairness index
[15], which is the traditional function to quantify fairness of single-resource allocations.
This index measures the ”equality” of user allocation x. If all users are getting the same
amount, the fairness is 1. If the fairness decreases and only a few users are getting a share,
the index decreases to 0.
Jain's fairness index [15]:

$f(x) = \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n \cdot \sum_{i=1}^{n} x_i^2}, \quad x_i \geq 0$
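For reference, a direct implementation of the index (assuming at least one non-zero allocation):

// Jain's fairness index: (sum x)^2 / (n * sum x^2); yields 1 for a perfectly
// equal allocation and approaches 1/n when a single user receives everything
static double jainsIndex(double[] x) {
    double sum = 0, sumOfSquares = 0;
    for (double xi : x) {
        sum += xi;
        sumOfSquares += xi * xi;
    }
    return (sum * sum) / (x.length * sumOfSquares);
}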

5.2.1 Asset Fairness

The idea behind Asset Fairness is that equal shares of different resources are worth the same, i.e., that 1% of all CPUs is worth the same as 1% of the memory and 1% of the I/O bandwidth [13].
The indicator is calculated as follows:

$shareUser_i = \sum_{r=1}^{resourceCnt} \frac{res_r \cdot 100}{resourceCapacity_r}$

The gathered fairness values for all customers can then be supplied to the Jain's fairness index formula, where $x_i$ is the total asset share of a user, previously called $shareUser_i$. An intuitive approach would also be to calculate the absolute offset from the average of the total shares of all users:

$deviations = \sum_{u=1}^{userCnt} \left| avgShare - shareUser_u \right|$

This absolute indicator, which also measures the Asset Fairness, is likewise calculated by the CloudSim RDA module when performing a simulation.
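Both indicators translate directly into code; the following sketch illustrates them (the RDA module's actual implementation may differ):

// total asset share of one user: sum over all resources of its percentage share
static double assetShare(double[] usage, double[] capacity) {
    double share = 0;
    for (int r = 0; r < usage.length; r++) {
        share += usage[r] * 100.0 / capacity[r];
    }
    return share; // one such value per user is fed into the Jain's index
}

// absolute deviation indicator: sum of |avgShare - shareUser_u| over all users
static double deviations(double[] shares) {
    double avg = 0;
    for (double s : shares) avg += s;
    avg /= shares.length;
    double dev = 0;
    for (double s : shares) dev += Math.abs(avg - s);
    return dev;
}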

5.2.2 Dominant Resource Fairness (DRF)

DRF [13] is a well-known fairness measure, which measures a customer's resource allocation by its dominant resource. The dominant resource is the resource that has the highest usage by a certain customer in proportion to the capacity of the host.

As input value for the Jain's index, the share of a customer's dominant resource in proportion to the capacity of that resource is used.
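A sketch of this computation (illustrative):

// dominant share of one customer: the maximum usage-to-capacity ratio
// over all resources; one such value per customer feeds the Jain's index
static double dominantShare(double[] usage, double[] capacity) {
    double dominant = 0;
    for (int r = 0; r < usage.length; r++) {
        dominant = Math.max(dominant, usage[r] / capacity[r]);
    }
    return dominant;
}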

5.2.3 Greediness Metric (GM)

In the case of the GM, the greediness value is calculated for a particular customer. The greediness of a customer primarily increases if the customer uses more than the equal share of a resource, and decreases if a resource is not used up to the equal share. This is summed up over all resources of a user and normalized to one value. The detailed calculation of this metric can be found in [19].

Because the Jain's index does not allow negative values, the number of resources has to be added to the greediness value (4 in this study).
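As a small sketch, this shift before applying the Jain's index could look as follows:

// greediness values can be negative; shifting them by the number of resources
// (4 in this study) makes them valid inputs for the Jain's index
static double[] toJainInput(double[] greediness, int resourceCnt) {
    double[] shifted = new double[greediness.length];
    for (int i = 0; i < greediness.length; i++) {
        shifted[i] = greediness[i] + resourceCnt;
    }
    return shifted;
}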

5.2.4 Efficiency

Efficiency is another important factor when evaluating resource allocation policies. A high efficiency means that the available capacity in the data center is used to its full extent to process workloads.

A resource allocation policy should be efficient and at the same time fair; one can even speak of a trade-off between efficiency and fairness. For example, if a resource allocation policy gave every user 20% of the host's CPU capacity and there were only three customers, it would be enormously fair. However, it would not be efficient at all, because the host would have 40% unused capacity. Hence, if one customer received more than 20%, the fairness would decrease while the efficiency would increase.

Therefore, when evaluating the policies in our case, the key requirement is that the overall utilization of the simulated data center is always close to or at its capacity limit.

5.2.5 Duration of simulations

An indicator that can be observed in CloudSim is the time used to process a set of workloads. This time might vary between the policies. For example, if there are two workloads on a host, the overall time will increase if an absolute amount of CPU cycles is deducted from the workload with the smaller absolute CPU demand. If the same absolute amount of CPU cycles is taken from the higher-requesting workload instead, the overall processing time will be shorter, because it takes less time for both workloads.

In our experiments, the workloads also need the same amount of CPU instructions. However, small differences in the simulation time can occur, and especially at the end of the experiments some idle time can appear, because not all workloads finish at the same time. Therefore, this indicator is not necessarily related to efficiency: in real-world data center setups, the idle time created by the timing differences of the workloads can potentially be used by other customers.

Nevertheless, the times used by the individual customers to process their workloads are also recorded by the simulations and can be used to evaluate and interpret the different results of the allocation policies. For example, if the customers with higher utilization levels also need more time to process their workloads, this speaks for a fair result.

5.3 Simulations

This section describes the simulations that were conducted during this thesis. Further, it contains detailed descriptions and interpretations of the observed results, to get an in-depth understanding of the two policies that have been evaluated.

The simulations are listed in order of the complexity of their setup: the first two simulations contain one host and three customers, followed by a two-host scenario and two three-host scenarios.

Appendix D lists a reference to the configurations and further simulations.

5.3.1 Basic simulation setup

The basic host configuration has the following capacity limits:

CPU: 1000 MIPS
RAM: 2048 MB
Network bandwidth: 1'000 Mbit/s
Disk I/O: 4'000 Mbit/s

For certain experiments, the network bandwidth has been adapted to 100 Mbit/s to have a second throughput resource that is scarce.

The setups have also been chosen such that the RAM never exceeds its capacity, because paging is currently not supported by the CloudSim RDA module.

To get a statistically representative result, the simulations were performed three times over 15 minutes each. The resource allocation interval was set to 1/100 second.

The values for the different fairness measures have been captured every second. The longer the experiment runs, the more the resource requests under the two policies diverge. This is because the workloads are progress dependent, and the allocation at a certain point might impact the allocation in the future due to different request levels of a certain resource.

In the case of the GM policy, in the single-host scenarios, the greediness among the customers was updated at every calculation step (1/100 second). In the multiple-host scenarios, the update interval was set to 1 second. Further, the average-greediness method has been used instead of the sum of the greediness among the hosts.

5.3.2 Simulation 1: 1 host, 3 customers, 3 VMs with WS workloads, variable RAM

Figure 5.1: Simulation 1: Setup

Three customers c1, c2, c3 with three VMs, where c1 uses 35% more RAM than the equal share, c2 exactly the equal share, and c3 only 65% of the equal share. The equal share is 682 MB (1/3 of 2048 MB).

As can be seen in a snapshot taken during the simulation (Figures 5.3 and 5.4), the GM allocates the CPU time differently than the Standard policy. According to the different RAM workload values, the GM decides to supply c1 with less CPU time than the other two users. The CPU is shared as follows: user c1 gets 21% less CPU than the equal share, c2 gets 2% more than the equal share, and c3 gets 19% more than the equal share of the CPU time. One may ask why c2 also gets more CPU. This is simply because c3 does not always use its whole CPU time; this part is then given to c2, more or less as a gift. In the alternative cases described further down, c2 does not get more CPU than the equal share. This is because c3 no longer has a CPU surplus, as the CPU "shift" among the customers is not as extreme.

In contrast, the Standard policy allocates the CPU equally to the three customers; every customer gets an equal share (333 MIPS). This policy simply takes the scarce resource into account and applies the MMFS algorithm to this resource.

Figure 5.2: Fairness metrics during experiment (Simulation 1); panels: Asset fairness, DRF, GM

The Leontief dependencies are also visible in the snapshots: the network bandwidth and the disk I/O behave according to the scarce resource CPU. Because RAM does not have this dependency and is not scarce, it does not show the same effect.

Figure 5.3: Simulation 1: Snapshot, grouped



Figure 5.4: Simulation 1: Snapshot, stacked

Table 5.2 shows the fairness result values for this simulation.

Table 5.2: Simulation 1, Fairness results


                     Standard policy   GM policy   dev (%)
Asset fairness       0.9826            0.9981      +1.56
DRF fairness         0.9788            0.9895      +1.1
Greediness fairness  0.9950            0.9993      +0.43

In this simulation example, all indicators favor the GM policy. The average greediness value for c1 is 0.136, for c2 0.0238 and for c3 -0.160. Positive greediness values mean that a customer is greedy.

This simulation shows a behavior that can be observed in many of the experiments that have been performed: the GM policy "shifts" resources between the customers with the goal of equalizing their assets. This is nicely visible in Figure 5.4, where the resources are stacked; the piles under the GM policy are more equal than the three piles under the Standard policy. In this simulation, as in many other simulations, the asset fairness shows the highest difference between the two allocation policies (see Table 5.2).

Tables 5.3 and 5.4 show the same simulation under less intense RAM conditions: instead of c1 using 35% more RAM than the equal share, the result values for increases of only 20% and 10% are listed.

Table 5.3: CPU allocation


CPU VM1 CPU VM2 CPU VM3
+35% -21% +2% +19%
+20% -13% 0 +13%
+10% -6.5% 0 +6.5%

Figure 5.2 shows the variations of the Jain’s fairness values for this simulation over time
(only the first 200 seconds are depicted). One can observe that the GM policy shows

Table 5.4: Greediness values


C1 C2 C3
+35% 0.136 0.0238 -0.160
+20% 0.072 -0.001 -0.071
+10% 0.034 -0.001 -0.033

constantly a higher fairness. In the DRF comparison (second diagram), the Standard policy shows a straight line. This is because the dominant resource for customers c2 and c3 is CPU, and this resource is always shared the same way, as explained above. The GM, in contrast, has a variable DRF fairness curve, because the CPU shares of the customers c2 and c3 are slightly adapted over the simulation time.

5.3.3 Simulation 2: 1 host, 3 customers, 3 VMs with WS workloads

Figure 5.5: Simulation 2: Setup

In the previous scenario, one could see how the two policies differ. In this scenario, all three users use about the same level of RAM and CPU. Of course, the values are randomized with a standard deviation of 20.0 for MIPS and 10.0 for RAM (see Appendix C, Figure C.1) to imitate a realistic workload progress.

As shown in Table 5.5, the fairness values do not differ; the policies act exactly the same in this simulation. All three fairness indicators are also higher than in simulation 1. The GM is even 1.0 over the whole experiment, which indicates a high level of fairness. The other two fairness indicators are slightly below 1, because not all customers constantly use their equal share (1/3 of the CPU capacity). This indicates that the GM indicator is slightly less sensitive than the other two indicators.

However, this simulation is also worthwhile for verifying the actual implementation of the two policies, as this is one of the few situations where they are supposed to behave exactly the same (even down to every 1/100 second).

Table 5.5: Simulation 2, Fairness results


                     Standard policy   GM policy   dev (%)
Asset fairness       0.9997            0.9997      0.0
DRF fairness         0.9995            0.9995      0.0
Greediness fairness  1.0000            1.0000      0.0

5.3.4 Simulation 3: 2 hosts, 2 customers, 4 VMs with WS workloads

In this simulation, two customers c1 and c2 operate two VMs each. All VMs have the same configuration and are partitioned over two hosts (cf. Figure 5.6). The hosts' capacity is as defined in Section 5.3.1. The VMs host Web servers, which is why the critical resource is CPU. Furthermore, all VMs except VM4 have a high workload: all VMs try to exceed their CPU endowment, except for VM4, which utilizes on average only 50% of its endowment (250 MIPS).

Figure 5.6: Setup of Simulation 3

Figure 5.7 shows a snapshot of how the resources are allocated between c1 and c2 under both policies during the simulation. It is visible that c1 generally wants more CPU on both hosts, due to the higher utilization of its hosted Web servers. In this snapshot, which is quite representative for this particular simulation, the GM policy increases the CPU usage of c2 by allowing this customer a bigger share of the CPU on host 1. This is because c1 uses more CPU on host 2, where c2 does not request as much of his fair/equal share as he potentially could.

In fact, VM2 receives on average 7.25% more CPU than its equal share; accordingly, VM1 receives on average 7.25% less CPU. However, this percentage depends on how much VM4 utilizes: if it utilized more, the influence on VM2 would shrink, reaching 0 if VM4 also requested 50% of the total capacity on host 2.

In terms of greediness, the allocation on host 2 increases the greediness of c1 and decreases the greediness of c2. When resources on host 1 are allocated, the Standard policy allocates the CPU resource evenly between VM1 and VM2. In contrast, the GM policy takes the greediness of the customers into account when allocating the congested resource CPU to their VMs.

Figure 5.7: Snapshot (Simulation 3)

Figure 5.8: Fairness over experiment run time (Simulation 3); panels: Asset fairness, DRF, GM



Table 5.6: Avg. Workload consumption levels


CPU in MIPS RAM in MB BW in MBit/s Disk I/O in MBit/s
CI 584 497 58 24
WS 460 247 149 23
DB 232 1000 23 756

Naturally, this leads to a higher overall fairness across the two involved hosts. Figure 5.8 presents the fairness quantifications achieved over the course of the simulation. The GM policy achieves on average 1.1% higher fairness than the Standard policy according to asset fairness, 0.0924% higher fairness according to the GM metric, and on average 1.5% higher fairness according to DRF.

Moreover, this simulation shows how resource "trades" are performed by the GM policy even across different hosts. Because the Standard policy does not look beyond a single host, the GM policy has a clear advantage in terms of fairness across the whole data center.

5.3.5 Simulation 4: 3 hosts, 3 customers, 9 VMs with WS, CI, DB workloads

Three customers c1, c2, c3, who ran three VMs each, were simulated over 2 minutes. While all VMs had the same configuration, i.e., the same virtual resources, each customer executed different workloads on his VMs: c1 executed database operations, c2 solved computational tasks, and c3 hosted Web servers. The average resource consumption of these workloads can be found in Table 5.6. The table shows that c1's VMs (DB) primarily utilize RAM, while c2's (CI) and c3's (WS) VMs primarily utilize CPU. While c2 utilizes twice as much RAM as c3, c3 uses more network bandwidth. The VMs were hosted on three equal hosts, each of which hosted one VM of each customer. In multiple-host scenarios, an important variable is also the update interval of the greediness among the users. This parameter has been set to 1 second, the same as in the two-host simulation described above. Choosing a smaller interval would generally increase the fairness, though this is rather unrealistic in data centers with dozens of hosts.

Figure 5.9: Accumulated resources by customers (Snapshot)



Over three simulations of 15 minutes each, the following results have been achieved: the GM fairness is 0.07% higher than under the Standard policy when VM allocations are executed by the GM policy. The fairness also increased by 1.5% when quantified by the asset fairness metric. However, if the fairness is measured with the DRF measure, the GM policy appears 0.11% less fair than the Standard policy. This is caused by the dominant resource, CPU time, of the customers c2 and c3: in the case of the Standard policy, their shares are more equal. c1 has RAM as its dominant resource, which does not differ between the policies. However, the DRF does not take into account that c2 uses more RAM than c3, which is a limitation of the DRF. In terms of multi-resource fairness, the asset fairness measure has to be given more weight in this situation.

These differences in the overall fairness may appear small; however, when looking at the impact on the individual VMs, they become tangible.

Figure 5.10: Resource utilization by customers (Snapshot)

Figure 5.10 shows a snapshot, taken during the course of the experiment, of the resources allocated by the two policies. The columns show the accumulated normalized resource shares for each of the three customers. It is visible that the GM (right) allocates the resources more equally than the Standard policy (left). Figure 5.9 depicts a snapshot taken during the course of the experiment and makes visible how the GM policy allocates resources differently compared to the Standard policy; naturally, the snapshots were taken at the same moment of the simulations. The GM policy allocates less CPU time to c1 compared to the Standard policy, because c1 exceeds the fair share of RAM. c3 receives more of c1's CPU, because c3 utilizes less RAM than c2: c3 thereby covers more of c1's RAM over-consumption and in return receives more CPU. The amount of CPU taken from c1 is mainly allocated to c3, because c3 cedes more RAM in comparison to c2. In contrast, the Standard policy ignores the imbalance of RAM consumption, because it only considers the congested resource (CPU).

5.3.6 Simulation 5: 3 hosts, 3 customers, 9 VMs with WS & CI workloads

This is a simulation in which the Standard policy delivers better fairness results than the GM policy. The three customers, which share three hosts, have CPU as the dominant resource on all hosts. The slight fairness deviation is caused by the priority update interval, which was set to 1 second in the case of the GM policy. Because the customers' utilization changes slightly in between, the greediness/priority does not always reflect the customers' actual values. The Standard policy has an advantage here, because it ensures max-min fairness for the CPU at every resource allocation/calculation step in the simulation. Table 5.7 shows the fairness results for this particular scenario. In simulation 2, the priority update interval for the GM policy was chosen as 0.01 second, which equals the resource allocation/calculation interval used for the Standard policy; obviously, this resolves the slight deviation seen here in this scenario.

Table 5.7: Simulation 5, Fairness results


                     Standard policy   GM policy   dev (%)
Asset fairness       0.9039            0.9036      -0.0311
DRF fairness         0.9037            0.9034      -0.0312
Greediness fairness  0.9933            0.9933      -0.0020

5.4 Findings

The simulations elaborated within this chapter, and many other simulations that were conducted (see Appendix D) in order to find the differences between the GM policy and the Standard policy, have shown a clear benefit for the GM policy: as soon as customers have different utilization patterns and scarcity exists on at least one host, the GM policy generally achieves a higher degree of fairness among the customers.

The "shifting" of resources among customers has been observed in one-host and multiple-host scenarios. In one-host scenarios, the reason for this shifting was primarily the more static resource RAM. This is no surprise, because RAM, a non-throughput resource, is the only resource in our particular setup that does not have the Leontief dependencies. In multiple-host scenarios, the shifting may also occur because of a shared dominant resource, as shown with the CPU time in the Web server scenario.

The important properties defined in [13] for task allocation can also be used for evaluating resource allocation policies:

Sharing incentive: each customer should be better off sharing the cluster than exclusively using her own partition of the cluster.
Strategy-proofness: customers should not be able to benefit by lying about their resource demands.
Envy-freeness: a customer should not prefer the allocation of another customer.
Pareto efficiency: it should not be possible to increase the allocation of a user without decreasing the allocation of at least one other user.

The various simulations that have been performed clearly confirmed the above four properties for both policies. [13] also defines four less important properties. Of these, single resource fairness, which means that if there were only one resource, the policy would reduce to max-min fairness, is given by the Standard policy. For the GM policy, this property does not apply, because it takes customers into account, as shown for example in simulation 2. Even in one-host scenarios where customers may have multiple VMs, this criterion might not always apply; however, this would have to be confirmed with an experiment where customers have multiple VMs on the same host. Bottleneck fairness, which means that if there is one resource that is percent-wise demanded most by every customer, then the solution should reduce to max-min fairness for that resource, is given by both policies in a one-host scenario (with the exclusion of multiple VMs per customer). In a multiple-host scenario, the GM policy does not always fulfill this property. The two other minor properties, population monotonicity and resource monotonicity, are also given by both policies.

Apart from these properties, the GM policy has shown further properties that could be termed "resource balancing", which tries to take all resources into account, and "customer awareness", which tries to equalize the resource consumption of the customers across the data center.

When integrating the GM policy into CloudSim RDA, it also became apparent that just taking the sum of the greediness among all hosts might not be the best solution. When taking the sum, the impact of the greediness was quite drastic and less stable in some situations, and it was therefore chosen to take the average greediness for the simulations. This is certainly an area where further research would be appropriate to optimize this formula. Another variable that has been introduced is the starvation limit, which limits the degradation of the resources of a customer on a particular host in relation to its equal share. This is another property of multiple-host resource allocation policies that has to be considered.
Chapter 6

Summary and Conclusions

This bachelor thesis covered different aspects within the areas of resource allocation and cloud computing.

Initially, this study took a close look at the different physical resources and how they interdepend. The Leontief dependencies between the CPU time and the different throughput resources have been confirmed. It was also observed that RAM has to be considered separately, because it depends on the progressed instructions.

These findings have then been used to design and implement the CloudSim RDA module. This extension to the CloudSim library, with its unique properties, is the first implementation of its kind. It allows simulating resource scarcity in cloud-like data centers, specifically on the individual hosts. The effects of resource scarcity are basically shown in an adaptation of the resource consumption of the workloads and the subsequent stretch of the processing time. In the basic CloudSim library, one can only simulate scarcity of the CPU itself. The CloudSim RDA module enhances this capability with the resources network bandwidth and disk I/O.

The newly designed resource allocation interface allows allocating the resources for the individual customers in consideration of all four resources: CPU time, RAM, network bandwidth and disk I/O. This opens the field for evaluating resource allocation policies that take more resources than only the CPU utilization into account, such as the Standard policy, or a DRF or GM policy.

The RDA module also added the well-observed Leontief dependencies and the workload progress aware RAM behavior. Moreover, the priority feature on data center level, which allows exchanging a customer consumption indicator across hosts, is also a new feature.

All in all, the CloudSim RDA module delivers a new experimentation base for research in the area of multi-resource allocation policies within cloud computing environments. The simulations that have been conducted within this study have shown that the GM allocation policy generally delivers a fairer solution than the Standard policy.


The GM policy has also shown stable resource allocation patterns comparable to the Standard policy, except that it allocates the resources at a higher fairness level. However, its impact is not to be underestimated, because the resource degradation of high-consuming customers is quite massive in some situations. On the other hand, this benefits low-consuming customers within a cloud computing environment.

6.0.1 Related work

In the past, research in the area of resource allocation in cloud computing environments has primarily taken place on the task level, which means that it was mainly evaluated how to place the tasks of several customers in a fair manner on one host [13] or on multiple hosts [21]. This study went one step deeper and considered already allocated endowments. An endowment may be a VM with a certain flavour of resources (e.g., 1 CPU, 500 MB RAM), which is typical for online services such as Web servers. Among these endowments, the fairness level among customers may differ, because not all customers use their endowments to their full extent. Furthermore, the utilization levels may vary over the progress of a hosted VM. This raises the question of how to optimize the fairness among these progressions, specifically in situations where at least one resource is scarce on a particular host.

6.0.2 Future research

Generally, the field of multi-host resource allocation algorithms has not received much attention yet, and there is certainly potential for further studies within this area. One could, for example, come up with other algorithms on the basis of DRF or asset fairness. Maybe there is even a possibility to not just exchange one priority indicator of the hosts on data center level, but instead exchange all resource utilization values of the customers. It would be interesting to compare such algorithms with the existing ones.

Specifically, future research for the GM policy would be to further investigate a suitable function to aggregate the priorities of the customers' VMs on data center level. There might be a better way than just summing up or taking the average of the greediness values; it could be generalized with parameters such as the number of VMs, the number of VMs of a customer, or the total number of VMs and customers.
A further extension, which is currently also a limitation of the GM Python script, would be support for multiple VMs belonging to one customer on the same host; currently, they are treated as distinct customers. This would invite some further simulations.

6.0.3 Final remarks

The CloudSim RDA module made it possible to simulate the GM policy under controlled conditions and to directly compare it to the Standard policy. This controlled environment may be used to further test the GM policy with different parameters and to fine-tune the GM allocation policy for use on actual productive systems, for example on an OpenStack [7] cloud computing infrastructure.

This thesis gave the author insights into different scientific methodologies within the field of informatics: initially, the experiments and observations on real systems; afterwards, the design and development of a software module; and finally, the simulations and the evaluation of the results.

I personally hope that research in the field of fairness in cloud computing will progress, positively impact future cloud consumers, and help to make cloud computing even more attractive.
Bibliography

[1] August 2015 web server survey. http://news.netcraft.com/archives/2015/, Accessed August 22, 2015.

[2] The cloud computing and distributed systems. http://www.cloudbus.org/cloudsim/, Accessed April 17, 2015.

[3] CloudSim 3.0 API. http://www.cloudbus.org/cloudsim/doc/api/, Accessed April 17, 2015.

[4] CloudSim RDA 1.0 API. http://pattad.github.io/cloudsim-rda/apidocs/index.html, Accessed August 30, 2015.

[5] cluster-data. https://code.google.com/p/googleclusterdata/wiki/ClusterData2011_2, Accessed August 22, 2015.

[6] Leontief-Produktionsfunktion. http://de.wikipedia.org/wiki/Leontief-Produktionsfunktion, Accessed August 22, 2015.

[7] OpenStack open source cloud computing software. https://www.openstack.org/, Accessed August 24, 2015.

[8] Simulator types. http://acqnotes.com/acqnote/tasks/simulator-types, Accessed August 22, 2015.

[9] Webscope datasets. http://webscope.sandbox.yahoo.com/, Accessed August 22, 2015.

[10] R. Buyya, J. Broberg, and A.M. Goscinski. Cloud Computing: Principles and Paradigms. John Wiley & Sons, 2010.

[11] Rodrigo N. Calheiros, Rajiv Ranjan, Anton Beloglazov, César A. F. De Rose, and
Rajkumar Buyya. Cloudsim: a toolkit for modeling and simulation of cloud com-
puting environments and evaluation of resource provisioning algorithms. Software:
Practice and Experience, 41(1):23–50, 2011.

[12] Brian F. Cooper, Raghu Ramakrishnan, Utkarsh Srivastava, Adam Silberstein, Philip
Bohannon, Hans-Arno Jacobsen, Nick Puz, Daniel Weaver, and Ramana Yerneni.
Pnuts: Yahoo!’s hosted data serving platform. Proc. VLDB Endow., 1(2):1277–1288,
August 2008.


[13] Ali Ghodsi, Matei Zaharia, Benjamin Hindman, Andy Konwinski, Scott Shenker,
and Ion Stoica. Dominant resource fairness: Fair allocation of multiple resource
types. In Proceedings of the 8th USENIX Conference on Networked Systems Design
and Implementation, NSDI’11, pages 323–336, Berkeley, CA, USA, 2011. USENIX
Association.

[14] I. Tafa, E. Kajo, H. Paci, E. Zanaj, and A. Xhuvani. CPU and memory utilization by improving performance in network by live migration technology. 2011.

[15] Raj Jain, Dah-Ming Chiu, and W. Hawe. A quantitative measure of fairness
and discrimination for resource allocation in shared computer systems. CoRR,
cs.NI/9809099, 1998.

[16] Michael Kopp. Java Enterprise Performance. 2012.

[17] Ivan Marsic. Computer Networks, Performance and Quality of Service. 2013.

[18] I.S. Moreno, P. Garraghan, P. Townend, and Jie Xu. An approach for characterizing
workloads in google cloud to derive realistic resource utilization models. In Service
Oriented System Engineering (SOSE), 2013 IEEE 7th International Symposium on,
pages 49–60, March 2013.

[19] P. Poullie, P. Taddei, and B. Stiller. Cloud flat rates enabled via fair multi-resource consumption. 2015.

[20] P. Poullie, B. Kuster, and B. Stiller. Fair multiresource allocation in clouds. 2013.

[21] Kaiji Shen, Xiaoying Zheng, Yingwen Song, and Yanqin Bai. Fair multi-node multi-
resource allocation and task scheduling in datacenter. In Cloud Computing Congress
(APCloudCC), 2012 IEEE Asia Pacific, pages 59–63, Nov 2012.

[22] A.S. Tanenbaum and T. Austin. Structured Computer Organization. Pearson Education, 2012.

[23] L. Wang, R. Ranjan, J. Chen, et al. Cloud Computing: Methodology, Systems, and Applications. CRC Press, 2011.
Abbreviations

API Application Programming Interface

BW Bandwidth

CSV Comma Separated Values

CPU Central Processing Unit

DRF Dominant Resource Fairness

GB Giga Byte

GC Garbage Collector

GM Greediness Metric

HTTP Hypertext Transfer Protocol

I/O Input / Output

KB Kilo Byte

MI Million Instructions

MIPS Million Instructions Per Second

MB Mega Byte

MMFS Max-Min Fair Share

PM Physical Machine

PR Physical Resources

QoS Quality of Service


RAM Random Access Memory

VM Virtual Machine

VMM Virtual Machine Manager


Glossary

Host If a server provides the virtual hardware for one or multiple VMs, it is called a host
or host machine.

Hypervisor A hypervisor is a piece of computer software, firmware or hardware that


creates and runs virtual machines.

Leontief dependencies If one of the dependent resources does not get the requested utilization amount, the other resources are downgraded by the same proportion. This is named after Wassily Leontief, who introduced this economic production function.

Workload An amount of work that is expected to be done.

Virtual Machine A virtual machine (VM) is an emulation of a particular computer


system.

Virtual Machine Manager A software to manage VMs.

List of Figures

2.1 Processing model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6

2.2 CPU and network dependency . . . . . . . . . . . . . . . . . . . . . . . . . 8

2.3 CPU and Disk write . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8

3.1 Class model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12

3.2 Policies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13

3.3 Provisioning policies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14

4.1 Linear resource progress . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20

4.2 Resource progress . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21

4.3 Data center and host classes . . . . . . . . . . . . . . . . . . . . . . . . . . 24

4.4 VM and VM scheduler classes . . . . . . . . . . . . . . . . . . . . . . . . . 25

4.5 Cloudlet and cloudlet scheduler classes . . . . . . . . . . . . . . . . . . . . 26

4.6 Resource calculation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29

4.7 Leontief dependencies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32

5.1 Simulation 1: Setup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40

5.2 Fairness metrics during experiment (Simulation 1) . . . . . . . . . . . . . . 41

5.3 Simulation 1: Snapshot, grouped . . . . . . . . . . . . . . . . . . . . . . . 41

5.4 Simulation 1: Snapshot, stacked . . . . . . . . . . . . . . . . . . . . . . . . 42

5.5 Simulation 2: Setup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43

5.6 Setup of Simulation 3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44

5.7 Snapshot (Simulation 3) . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45


5.8 Fairness over experiment run time (Simulation 3) . . . . . . . . . . . . . . 45

5.9 Accumulated resources by customers (Snapshot) . . . . . . . . . . . . . . . 46

5.10 Resource utilization by customers (Snapshot) . . . . . . . . . . . . . . . . 47

C.1 Sample WS workload (waving) . . . . . . . . . . . . . . . . . . . . . . . . . 72

C.2 Sample CI workload (waving) . . . . . . . . . . . . . . . . . . . . . . . . . 72

C.3 Sample DB workload (waving) . . . . . . . . . . . . . . . . . . . . . . . . . 72

C.4 Sample WS workload (pile) . . . . . . . . . . . . . . . . . . . . . . . . . . 73

C.5 Sample CI workload (pile) . . . . . . . . . . . . . . . . . . . . . . . . . . . 73

C.6 Sample DB workload (pile) . . . . . . . . . . . . . . . . . . . . . . . . . . . 73

E.1 Disk read . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78

E.2 Disk write . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78

E.3 Network bandwidth test, unstressed and stressed at 50% . . . . . . . . . . 79

E.4 Apache bench, normal vs. stressed condition . . . . . . . . . . . . . . . . . 80


List of Tables

2.1 Sherpa data correlation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9

3.1 Resource objects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15

4.1 Input data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20

4.2 input array for requested resources . . . . . . . . . . . . . . . . . . . . . . 28

5.1 Workload consumption levels . . . . . . . . . . . . . . . . . . . . . . . . . 37

5.2 Simulation 1, Fairness results . . . . . . . . . . . . . . . . . . . . . . . . . 42

5.3 CPU allocation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42

5.4 Greediness values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43

5.5 Simulation 2, Fairness results . . . . . . . . . . . . . . . . . . . . . . . . . 44

5.6 Avg. Workload consumption levels . . . . . . . . . . . . . . . . . . . . . . 46

5.7 Simulation 5, Fairness results . . . . . . . . . . . . . . . . . . . . . . . . . 48

E.1 Experiment results (unstressed) . . . . . . . . . . . . . . . . . . . . . . . . 79

Appendix A

Installation Guideline

To use the CloudSim RDA module, it is highly recommended to install it within a Java development environment, such as the Eclipse IDE. There are some predefined example cloud setups in the package ch.uzh.ifi.csg.cloudsim.rda.examples. These examples should help everyone to get a gentle jumpstart into the RDA module.

A.1 Basic Installation


1. The latest version of the module is available at https://github.com/pattad/cloudsim-rda.git. A local clone has to be made from that Git repository.

2. The RDA module has a dependency on the CloudSim core project. One must make sure that a copy of the JAR file cloudsim-3.1-SNAPSHOT is in the classpath. This has to be done manually, because CloudSim is not hosted in a public Maven repository. The JAR file can be downloaded from http://pattad.github.io/cloudsim-rda/cloudsim-3.1-SNAPSHOT.jar.

3. Interesting, but not required, might also be the CloudSim core project itself. It can be retrieved by checking it out with SVN from https://code.google.com/p/cloudsim/source/checkout.

A.2 Setup Guideline for Eclipse

This setup may be used to evaluate the MMFS and the Greediness Metric. The greediness algorithm is Python based, which is why Python is also a requirement when testing the greediness algorithm.


Prerequisites

1. The latest Eclipse IDE distribution.

2. Python 3.x is required; the Python library NumPy must also be installed.

3. The Maven (m2e) Eclipse plugin is installed.

Installation steps

1. Switch to the Git perspective. In the Git Repositories view, click "Clone a Repository" and input https://github.com/pattad/cloudsim-rda.git as the URL.

2. After you have cloned the repository, you also need to import it into your workspace. This can be done by clicking File > Import > Maven > Existing Maven Projects.

3. To make it compile, also add the cloudsim-3.1-SNAPSHOT.jar as described in the basic installation section above.
Appendix B

Experimentation Guideline

Once the CloudSim RDA module is in the workspace of your favorite IDE, the best way to get started is to view the basic examples that are provided in the package *.cloudsim.rda.examples.

Moreover, there is the ExperimentRunner class, which can be used to start experiments with different parameters and an already ready-to-go setup to evaluate different resource allocation policies.

B.1 Examples

FairShareExample

This example uses the standard VM scheduler (class VmSchedulerMaxMinFairShare). It shows how a data center, hosts, VMs and Cloudlets are created and how they are interconnected. Please have a look at the code and try to run it; as these examples have a main class, they can simply be started as Java applications.

The FairShareExample defines two Cloudlets that are running on individual VMs on the same host. The workloads of the Cloudlets are defined in the resource folder (src/main/resources/): input1.csv and input2.csv are the workload inputs of the two Cloudlets.

Many output files will be created; please check Section B.2.2 for a detailed description of the files.

There are several variables that can be adapted:

schedulingInterval = 0.000000001
This interval defines the minimum time between the events in the simulation. To achieve a high precision, intervals down to a nanosecond can be simulated. Nevertheless, for a faster execution of the simulation, it can be set to a higher value, for example 0.01 for 1/100 second.


scarcitySchedulingInterval = 0.01
This interval is the maximum time between events if scarcity occurs on a host. It is
defined in seconds.

record = true/false
If set to true, a CSV output file is produced for each Cloudlet. The files are created in
the folder from which the Java run-time is started.
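Taken together, and assuming plain Java declarations as in the example sources (the exact
types and placement may differ), the three knobs look like this:

double schedulingInterval = 0.000000001;  // minimum time between simulation events (s)
double scarcitySchedulingInterval = 0.01; // maximum event interval under scarcity (s)
boolean record = true;                    // write one CSV trace file per Cloudlet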

B.2 Experiment Runner

The ExperimentRunner.java can be used to run an experiment across different resource
allocation policies. When started, it first creates workloads and then runs the same
workloads for all resource allocation policies. Currently, it runs the DRF policy first,
afterwards the MMFS policy (Standard policy) and at the end the GM policy.

B.2.1 Configuration

ExperimentRunner.java must be started with the following arguments:

<vmCnt> <hostCnt> <userCnt> <experimentCnt> <pythonPath> <configClass>
<experimentLength> <trace> <priorityUpdateInterval>

vmCnt The number of VMs.
hostCnt The number of hosts.
userCnt The number of users (must be ≤ vmCnt).
experimentCnt The number of experiments to run with this configuration.
pythonPath Contains the command and location used to access the Python files.
configClass Contains the configuration class to use. Please check the API description
of the class *.experiments.config.ExperimentConfig.
experimentLength The length in seconds the simulation should run. If scarcity occurs,
this time is naturally extended until all workloads are processed.
trace Set it to true or false.
priorityUpdateInterval The interval at which the priorities among the customers on data
center level are updated. In case of the Greediness Metric, the priority is the
greediness. In one-host scenarios it is recommended to keep it at 0.01 seconds; for
multi-host scenarios, 1 second is a suitable option.

Example configuration:

3 1 3 1 "python src/main/resources/python"
"ch.uzh.ifi.csg.cloudsim.rda.experiments.config.Config_17" 120 false 1
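Assuming the project has been compiled and the CloudSim JAR is available, a full
invocation could look as follows; the classpath details and the fully qualified name of
ExperimentRunner are assumptions derived from the package names used above:

java -cp target/classes:cloudsim-3.1-SNAPSHOT.jar \
    ch.uzh.ifi.csg.cloudsim.rda.experiments.ExperimentRunner \
    3 1 3 1 "python src/main/resources/python" \
    "ch.uzh.ifi.csg.cloudsim.rda.experiments.config.Config_17" 120 false 1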

B.2.2 Output

Results are written to the directory output/. For each experiment, a new unique sub folder
is created, named with the configuration and a time-stamp. An example structure is shown
below:

output/
--> Config_17_20150726033535775/
---------> workload_0.csv
---------> workload_1.csv
---------> workload_2.csv
---------> experimentParams.log
---------> drf/ (Dominant Resource Fairness)
---------> mmfs/ (Max-Min Fair Share)
---------> greediness/ (Greediness algorithm)
-------------------> resourceShare_cpu.csv
-------------------> resourceShare_bw.csv
-------------------> resourceShare_disk.csv
-------------------> workload_trace_0.csv
-------------------> workload_trace_1.csv
-------------------> workload_trace_2.csv
-------------------> fairness.csv
-------------------> jains.csv
-------------------> utilization.csv
-------------------> summary.log
-------------------> trace.log

The workload_X.csv files contain the different workloads that were randomly generated.
They are used for all resource allocation policies. The parameters with which the
experiment was started are logged in the experimentParams.log file.

The sub folders of the different algorithms each contain the individual output data that
can be used to evaluate the experiment and assess the different policies.

workload_trace_X.csv
These files contain the actual utilization of each workload as it is processed in a VM.
The traces are measured every second.
File syntax:
TIME,CPU,MEMORY,BANDWIDTH,DISK I/O,DELAY
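Such traces are plain CSV files and easy to post-process. A minimal sketch (not part of
the module) that computes the average CPU utilization of one trace file, assuming the
files contain no header line:

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;

public class TraceStats {
    public static void main(String[] args) throws IOException {
        double sum = 0;
        int rows = 0;
        // Column order as documented above: TIME,CPU,MEMORY,BANDWIDTH,DISK I/O,DELAY
        try (BufferedReader in = new BufferedReader(new FileReader(args[0]))) {
            String line;
            while ((line = in.readLine()) != null) {
                String[] cols = line.split(",");
                sum += Double.parseDouble(cols[1]); // CPU column
                rows++;
            }
        }
        System.out.println("average CPU: " + (sum / rows));
    }
}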

jains.csv
This file contains the different fairness metrics, measured every second.
File syntax:
Asset Fairness, DRF, GM
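The file name refers to Jain's fairness index, presumably computed over the per-user
values of each metric. For reference, Jain's index of allocations x1, ..., xn is
J = (Σxi)² / (n · Σxi²), with J = 1 for perfectly equal shares. A minimal sketch of this
computation (not the module's actual implementation):

// Jain's fairness index over per-user allocations.
public static double jainsIndex(double[] allocations) {
    double sum = 0.0, sumOfSquares = 0.0;
    for (double x : allocations) {
        sum += x;
        sumOfSquares += x * x;
    }
    if (sumOfSquares == 0.0) {
        return 1.0; // all-zero allocations count as trivially equal
    }
    return (sum * sum) / (allocations.length * sumOfSquares);
}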

utilization.csv

This file contains the resource utilization values for each customer, measured every
second.
File syntax:
User n...N {CPU,RAM,BW,DISK I/O}

resourceShare_<resource>.csv
File syntax:
User n...N {<requested resource>,<utilization of resource>,<priority>},
User n...N {<unfairness>}, <unfairness Total>, <accumulated Unfairness>,
VM n...N {<VM ID>, <User ID>, <requested resource>,<utilization of resource>,<unused>}

fairness.csv
This file contains the deviation from the fair share for each resource.
File syntax: User n...N {<fairness deviation CPU>,<fairness deviation BW>,
<fairness deviation Disk I/O>,<fairness total>},
All users total dev, <total deviation>

trace.log
The detailed CloudSim debug trace is printed to this file, but only if the "trace" option
is set to true.

summary.log
Contains the same output as the console output for the particular experiment.
Appendix C

Stochastic Data Generation

The following sample workloads were created with the class
ch.uzh.ifi.csg.cloudsim.rda.experiments.StochasticDataGenerator.

Input values of the waving model: Config_6 for DB & CI, Config_8 for WS.

Input values of the pile model: Config_17.
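The exact model parameters are defined in the respective Config classes. Purely as an
illustration of what a periodic ("waving") workload generator might look like (this is an
assumption, not necessarily the StochasticDataGenerator's actual model), consider:

import java.util.Random;

public class WavingSketch {
    // Generates a utilization trace oscillating around a base level.
    // All parameters and the noise model are illustrative assumptions.
    public static double[] generate(int seconds, double base, double amplitude,
            double period, long seed) {
        Random rnd = new Random(seed);
        double[] trace = new double[seconds];
        for (int t = 0; t < seconds; t++) {
            double wave = amplitude * Math.sin(2 * Math.PI * t / period);
            double noise = rnd.nextGaussian() * 0.05 * base;
            trace[t] = Math.max(0, base + wave + noise);
        }
        return trace;
    }
}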


Figure C.1: Sample WS workload (waving)

Figure C.2: Sample CI workload (waving)

Figure C.3: Sample DB workload (waving)



Figure C.4: Sample WS workload (pile)

Figure C.5: Sample CI workload (pile)

Figure C.6: Sample DB workload (pile)


Appendix D

Simulations

D.1 Configuration details

Metrics.py variables:

starve_design_parameter = 0.5
final_normalizer = 1.0

Formula for greediness on data center level: average

D.2 Reference on CD

A complete overview of all simulations discussed within this report, as well as further
simulations, is available on the accompanying CD in the file simulation_master.xlsx.

The simulations can be found on the CD under the following references:

Simulation 1
Config_35_20150824045730545

Simulation 2
Config_12_20150819032926459

Simulation 3
Config_34_20150814060539196

Simulation 4
Config_17_20150814074838824

Simulation 5
Config_13_20150818022910375

Appendix E

Resource experiments

E.1 Experiment setup

Test VM1

OS: Ubuntu 14.04.2 LTS (GNU/Linux 3.13.0-32-generic x86_64)
Web server: Apache/2.4.7 (Ubuntu, Mar 10 2015 13:05:59)
Network: 1 Gbit/s

VM Host N13

CPU: AMD Opteron(tm) Processor 6180 SE @ 2.5 GHz (24 CPUs, 2 sockets)
Memory: 64 GB
OS: Ubuntu 14.04.1 LTS (GNU/Linux 3.13.0-44-generic x86_64)

Test machine N04

Used to run JMeter for the network bandwidth tests.

CPU: Intel(R) Xeon(R) CPU E3113, 2 CPUs @ 3.00 GHz
Memory: 8 GB
OS: Ubuntu 14.04.2 LTS (GNU/Linux 3.13.0-48-generic x86_64)
Java version: "1.7.0_75", OpenJDK Runtime Environment (IcedTea 2.5.4)
(7u75-2.5.4-1~trusty1), OpenJDK 64-Bit Server VM (build 24.75-b04, mixed mode)
Network: 1 Gbit/s


E.2 Disk I/O

record command:
virt-top -1 -3 -d 0.5 --csv record.csv

write command:
dd if=/dev/zero of=largefile bs=1M count=4096

read command:
dd if=largefile of=/dev/zero bs=8k

Figure E.1: Disk read

Figure E.2: Disk write



E.3 Network bandwidth

Figure E.3: Network bandwidth test, unstressed and stressed at 50%

Table E.1: Experiment results (unstressed)

requests/second    CPU (%)    request count    duration (s)
547                15         10'000           18.3
1061               28         50'000           47.1
1833               50         50'000           27.3
2888               77         50'000           17.3
3818               100        50'000           13.1

E.4 Memory

This experiment shows the dependency between CPU and memory behavior. It was conducted
with the Apache benchmark tool (ab), which measures how many requests per second a system
can sustain. The test consists of 1,000,000 requests, with 100 requests being carried out
concurrently. It is observable that the USS (unique set size) increases more slowly when
the CPU is stressed at 50%.
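The corresponding Apache benchmark invocation would look like the following, with the
target URL being a placeholder for the test VM's address:

ab -n 1000000 -c 100 http://<test-vm>/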

Figure E.4: Apache bench, normal vs. stressed condition


Appendix F

Contents of the CD

Simulations.zip contains the output data of the different simulations. It is structured by
the references listed in Appendix D. Further, there is the file evaluation_master.xlsx,
which lists a summary of the various experiments conducted within this study, including
experiments that are not discussed in this report.

Apidocs.zip contains the API specification in HTML.

Cloudsim-rda.zip is the complete source of the CloudSim RDA project.

Install_win.zip & Install_unix.zip contain the necessary simulation setups that may be
useful to execute further simulations.

