
An Operating System for Multicore and Clouds: Mechanisms and Implementation


Supervisor: DR. AMJAD NUSEIR

Faculty of Computer & Information Technology, Computer Science Department, JUST

Introduction
Factored operating system (fos)
Benefits of a single System Image
Multicore and Cloud OS challenges
Architecture
Case Studies
Results and Implementation


The average computer user has an ever-increasing amount of computational power at their fingertips.

Users have progressed from using

The next decade will also bring single chip microprocessors containing hundreds or even thousands of computing cores

Contemporary OSes designed to run on a small number of reliable cores are not equipped to scale up to thousands of cores

Cloud computing and Infrastructure as a Service (IaaS) promise a vision of boundless computation that can be tailored to exactly meet a user's need, even as that need grows or shrinks rapidly.

Unfortunately, IaaS requires users to explicitly manage resources and machine boundaries.
The ease of using a cloud computer must match that of a current-day multiprocessor system.

Current IaaS systems present a fractured and non-uniform view of resources to the programmer

They use virtual machines as the provisioning unit but without a suitable abstraction layer,
which introduces complexity for the system user.

The user is responsible for:

Fractured communication paradigms: intra-machine (shared memory and pipes) and inter-machine (sockets) communication
Scheduling and load balancing
System administration
I/O devices
Fault tolerance

The solution could be a single system image OS that makes IaaS systems as easy to use as multiprocessor systems.

The factored operating system (fos) is designed to meet all of these needs.

fos is a single system image OS for multicore processors as well as cloud computers.
It tackles OS scalability and adaptability challenges by:
Factoring the OS into its component system services
Further factoring each system service into a collection of Internet-inspired servers which communicate via messaging




Factored operating system (fos)




Benefits of a single System Image

Ease of administration
Transparent sharing
Informed optimizations
Consistency
Fault tolerance




Multicore and Cloud OS challenges

Scalability
The current multicore revolution promises drastic changes in fundamental system architecture: the number of general-purpose schedulable processing elements is drastically increasing.

Cloud resources are virtually unlimited for a given user, restricted only by monetary constraints.

Scalability is therefore a first-order design constraint for future OSes in both single-machine and cloud systems.

Multicore and Cloud OS challenges

Variability of demand
Multicore OSes need to manage the number of live cores.

In contrast, single-core OSes only have to manage whether a single core is active or idle.
Cloud computing makes more resources available on demand than was ever conceivable in the past.
In both cases the demand is not static and resources are variable.

fos seeks to reduce heat production and power consumption while maintaining the throughput requirements imposed by the user.

Multicore and Cloud OS challenges

Faults
In multicore systems, as the hardware industry continuously decreases the size of transistors and increases their count on a single chip, the chance of faults is rising.
System software components must gracefully support dying cores and bit flips.
In cloud systems, performance interference from other cloud users and applications can potentially impact the quality of service provided to the application.
Programming for massive systems is likely to introduce software faults.

The lack of tools to debug and analyze large software systems makes software faults hard to understand and challenging to fix.

Multicore and Cloud OS challenges

Programming challenges

Developing cloud applications composed of several components deployed across many machines is a difficult task.
There is no uniform programming model for communicating within a single multicore machine and between machines.
Managing and load-balancing these systems is proving to be a daunting task as well.


Architecture


fos uses the following design principles:
Space multiplexing replaces time multiplexing
The OS is factored into function-specific services, each implemented as a parallel, distributed service
The OS adapts resource utilization to changing system needs
Faults are detected and handled by the OS


Fos Architecture


Fos Architecture
Microkernel
The fos microkernel executes on every core in the system.
fos uses a minimal microkernel OS design which provides:
A protected messaging layer
A name cache to accelerate message delivery
Basic time multiplexing of cores
An application programming interface (API)
All other OS functionality and applications execute in user space.

The microkernel API is designed to allow a process on one core to modify the memory and address space on another core if appropriate capabilities are held. This allows fos to move significant memory management and scheduling logic into user space.



fos provides a simple process-to-process messaging API to meet the OS's need for inter-process communication and synchronization.

Messaging can be implemented on top of shared memory or provided by hardware, allowing this mechanism to be used on a variety of architectures.
Sharing of data becomes much more explicit in the programming model.



fos allows conventional multithreaded applications with shared memory.

Operating system services are implemented strictly using messages.
fos messaging works intra-machine and across the cloud, using differing transport mechanisms to provide the same interface.



[Figure: message delivery. Labels: sending node, intended node, proxy servers, shared memory, first station, second station.]



Messaging
Each process has a number of mailboxes to which other processes may deliver messages, provided they have the credentials.

fos presents an API that allows the application to manipulate these mailboxes and their properties.
Processes within fos are also able to register a mailbox under a given name.
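
The mailbox idea can be illustrated with a small sketch. This is not the fos API: the class names, the string capability, and the name "/sys/fs" are all hypothetical, chosen only to show credential-checked delivery and registration of a mailbox under a name.

```python
from collections import deque


class Mailbox:
    """A message queue guarded by a capability token (hypothetical model)."""

    def __init__(self, capability):
        self._capability = capability
        self._queue = deque()

    def deliver(self, capability, message):
        # Only senders holding the right credential may deliver.
        if capability != self._capability:
            return False
        self._queue.append(message)
        return True

    def receive(self):
        # Return the oldest message, or None when the mailbox is empty.
        return self._queue.popleft() if self._queue else None


class NameRegistry:
    """Maps a name to the mailboxes registered under it."""

    def __init__(self):
        self._names = {}

    def register(self, name, mailbox):
        self._names.setdefault(name, []).append(mailbox)

    def lookup(self, name):
        return list(self._names.get(name, []))


registry = NameRegistry()
fs_box = Mailbox(capability="fs-secret")   # assumed credential format
registry.register("/sys/fs", fs_box)       # assumed name

target = registry.lookup("/sys/fs")[0]
rejected = target.deliver("wrong", "read block 7")
accepted = target.deliver("fs-secret", "read block 7")
```

A sender without the credential is refused, while a sender that holds it reaches the mailbox by name without knowing where the receiving process runs.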


Naming
Each service is divided into processes, or servers. Each process is:
Independent
Run on a different core
Capable of handling the given request
All of them have the same name.


Naming
The criteria for choosing the best member for a specific task:
1. The load of all servers and the latency
2. Fixed policies such as round robin and closest server
3. Custom policies such as a callback mechanism or a complex load balancer
4. Metadata such as message queue length
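
Two of the selection policies listed above can be sketched in a few lines. The `Fleet` class, the member identifiers, and the queue lengths are illustrative assumptions, not part of fos.

```python
import itertools


class Fleet:
    """Servers that share one name; values are message queue lengths."""

    def __init__(self, name, members):
        self.name = name
        self.members = members                     # member id -> queue length
        self._rr = itertools.cycle(sorted(members))

    def pick_round_robin(self):
        # Fixed policy: hand out members in a repeating order.
        return next(self._rr)

    def pick_least_loaded(self):
        # Metadata policy: prefer the member with the shortest queue.
        return min(sorted(self.members), key=lambda m: self.members[m])


fleet = Fleet("/sys/fs", {"core3": 5, "core7": 1, "core9": 4})
rr_order = [fleet.pick_round_robin() for _ in range(4)]
least = fleet.pick_least_loaded()
```

Because every member answers to the same name, the caller never sees which policy was applied; it only addresses "/sys/fs".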



The criteria for building the nameserver:

1. Low latency for the service
2. Maintaining a consistent and global view of the namespace
3. The ability to make continual updates to the name lookup

The importance of naming

Naming is an abstraction that gives the OS flexibility in load balancing and locality.
Much of the complexity in communication is abstracted behind the naming and messaging API.

OS services
Each service is composed of several servers (a fleet)
Servers in aggregate can accomplish a given task
Servers can be on the same station or on different stations (scalable)
Fleets communicate internally via messaging
A fleet is elastic (watchdog service and handshaking process)


OS services
Difficulties with fleets:
1. They are complicated to program
2. Each service may have different parallelization strategies
3. Each service may have different constraints

fos solution:
1. A cooperative multithreading programming model
2. Easy-to-use remote procedure call (RPC) and serialization facilities
3. Data structures for common patterns of data sharing

Fos server model
The model is implemented as a user-space threading library written in C.
The goals:
To abstract calls to independent, parallel servers so that they appear as local libraries
To mitigate the complexities of parallel programming

Servers are event-driven programs in which events are messages. Messages arrive on one of three inbound mailboxes:
The external (public) mailbox
The internal (fleet) mailbox
The response mailbox

Fos server model
The interface to the server is a simple function call.
The threading library abstracts away messaging.
This model doesn't eliminate all the complexities of parallel programming, because other code will execute on the server during an RPC.
When the cooperative scheduler runs:
Threads that are ready to run are scheduled
If no thread is ready, a new thread is spawned that waits on messages
If threads have been sleeping for too long, they are resumed with a timeout error status
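
The point that other code executes on the server during an RPC can be seen in a toy cooperative scheduler. The fos library is written in C; the sketch below uses Python generators purely as an analogy, and `handler`, `run`, and the fake uppercase "reply" are invented for illustration. Timeouts and spawning of waiting threads are left out.

```python
from collections import deque


def handler(name, log):
    """One cooperative thread: it yields whenever it issues an RPC."""
    log.append(f"{name}: start")
    reply = yield f"rpc-from-{name}"   # give up the core until the reply arrives
    log.append(f"{name}: got {reply}")


def run(threads):
    """Round-robin over threads, resuming each with the reply to its last RPC."""
    ready = deque((t, None) for t in threads)
    while ready:
        thread, value = ready.popleft()
        try:
            request = thread.send(value)
        except StopIteration:
            continue                    # the thread finished
        # Stand-in for a remote server: the reply is the request in upper case.
        ready.append((thread, request.upper()))


log = []
run([handler("A", log), handler("B", log)])
```

Thread B starts while thread A is still waiting for its reply, which is exactly the interleaving a server author has to keep in mind.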

Parallel data structures
The idea is to provide a common container interface, which abstracts several implementations that provide different properties of:
Consistency
Replication
Performance

In the back end, each server stores some of the data and communicates with the others for the rest. This relieves application developers from concerning themselves with the implementation of distributed data structures.
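
As a rough picture of such a container, the sketch below hides a set of per-server stores behind one map interface. `ShardedMap` and its CRC-based placement are assumptions made for the example; the slides do not say how fos partitions data, and replication and consistency options are not modelled.

```python
import zlib


class ShardedMap:
    """A map interface whose entries are spread across several servers."""

    def __init__(self, num_servers):
        # Each dict stands in for the private store of one server.
        self._shards = [dict() for _ in range(num_servers)]

    def _owner(self, key):
        # Deterministic placement: hash the key to pick the owning server.
        return zlib.crc32(key.encode()) % len(self._shards)

    def put(self, key, value):
        self._shards[self._owner(key)][key] = value

    def get(self, key, default=None):
        return self._shards[self._owner(key)].get(key, default)

    def shard_sizes(self):
        return [len(shard) for shard in self._shards]


names = ShardedMap(num_servers=4)
for i in range(20):
    names.put(f"/proc/{i}", i)
```

The caller only uses `put` and `get`; which server holds a given key stays an implementation detail of the container.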



Case Studies
Questions: How does fos work? How does fos solve challenges in the cloud?
Answer: the following examples of key components of fos:
1. File System
2. Spawning Servers
3. Elastic Fleet


Case Studies
File System
The fos file server is an example of interaction between the different servers in fos.
To diminish cache and performance interference, the following all execute on distinct cores and communicate via the messaging infrastructure (see figure 3: Anatomy of a file system):
Application client
fos file system server
Block device server

Case Studies: File System


Case Studies
File System steps (on the application client):
1. fos intercepts the application client's file system access
2. It bundles the request in a message to be sent via the messaging layer
3. fos queries the name cache and sends the message to the destination core


Case Studies
File System

Steps (on the file system server):

4. If the data requested by the application is cached, the server bundles it into a message and sends it back to the requesting application. Otherwise, it fetches the needed sectors from disk through the block device driver server (as in this example, see figure).

5. The sector request is bundled into block messages
6. The block device driver is looked up in the name cache, then the fos microkernel places the message in the incoming mailbox queue of the block device driver server

Case Studies
File System steps (on the block device driver server):
7-9. In response to the incoming message, the block device driver server processes the request enclosed in it and fetches the sectors from disk (as portrayed in the figure)
10. It encapsulates the fetched sectors in a message

11. The file system server is looked up in the name cache
12. The fos microkernel then places the message in the incoming mailbox queue of the file system server


Case Studies
File System steps:
In the file system server:
13. The file server processes the acquired sectors from the incoming mailbox queue and encapsulates the required data into messages
14. It sends them back to the client application

In the client application:
15. libfos receives the data at its incoming mailbox queue and processes it in order to provide the file system access requested by the client application
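
The request path walked through in these steps can be condensed into a small simulation. Everything here is a stand-in: direct method calls replace fos messages, a dict replaces the name cache, and the class names, message fields, and the name "/sys/fs" are hypothetical.

```python
class BlockDeviceServer:
    """Fetches sectors from a simulated disk."""

    def __init__(self, disk):
        self.disk = disk
        self.requests = 0

    def handle(self, message):
        self.requests += 1
        return {"sectors": self.disk[message["sector"]]}


class FileSystemServer:
    """Answers from its cache, or asks the block device server on a miss."""

    def __init__(self, block_server):
        self.block_server = block_server
        self.cache = {}

    def handle(self, message):
        sector = message["sector"]
        if sector not in self.cache:
            # Cache miss: bundle a block request and wait for the sectors.
            reply = self.block_server.handle({"op": "read", "sector": sector})
            self.cache[sector] = reply["sectors"]
        return {"data": self.cache[sector]}


def client_read(name_cache, sector):
    # Client side: look up the destination by name, then send the request.
    server = name_cache["/sys/fs"]
    return server.handle({"op": "read", "sector": sector})["data"]


block = BlockDeviceServer({0: b"boot", 1: b"data"})
name_cache = {"/sys/fs": FileSystemServer(block)}
first = client_read(name_cache, 1)    # goes to the block device server
second = client_read(name_cache, 1)   # served from the file server cache
```

The second read never reaches the block device server, which mirrors the cached branch of step 4.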

Case Studies
File System
Case 2: the file system server is not running on the local machine (i.e. the name cache could not locate it):
1. The message is forwarded to the proxy server, which has the name cache and location of all the remote servers; it then determines the appropriate destination machine for the message
2. It bundles the message into a network message
3. It sends it via the network stack to the designated machine

Although this adds an extra hop through the proxy server, it provides the system with transparency when accessing local or remote servers.
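
A minimal sketch of this fallback, under stated assumptions: `send`, `Proxy`, the machine label "vm-b", and the service names are hypothetical, and a plain function call stands in for the network stack.

```python
class Proxy:
    """Knows which remote machine hosts which named service."""

    def __init__(self, remote_machines):
        self.remote_machines = remote_machines   # machine -> {name: handler}

    def forward(self, name, payload):
        for machine, services in self.remote_machines.items():
            if name in services:
                # Bundle into a "network message" addressed to that machine.
                packet = {"dest": machine, "name": name, "payload": payload}
                return machine, services[name](packet["payload"])
        raise LookupError(f"no server registered as {name!r}")


def send(name, payload, name_cache, proxy):
    """Deliver locally when the name cache knows the name, else via the proxy."""
    handler = name_cache.get(name)
    if handler is not None:
        return "local", handler(payload)
    return proxy.forward(name, payload)


local_cache = {"/sys/net": lambda p: f"net:{p}"}
proxy = Proxy({"vm-b": {"/sys/fs": lambda p: f"fs:{p}"}})

local_result = send("/sys/net", "ping", local_cache, proxy)
remote_result = send("/sys/fs", "read", local_cache, proxy)
```

The caller uses the same `send` call in both cases; only the returned location tag reveals whether the extra hop was taken.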

Case Studies
File System
In case 2, instead of the fragmented view of resources typical of a cloud environment, a single system image is provided: uniform messaging and naming allow servers to be assigned to any machine in the system. This gives a uniform application programming model for using inter-machine and intra-machine resources in the cloud.


Case Studies
Spawning Server

A spawn call:
Bundles the spawn arguments into a message
Sends that message to the spawn server's incoming request mailbox

Where to deploy?
The spawn server interacts with the scheduler to determine the best machine and core for the new process to start on.

Case Studies
Spawning Server

If the local machine is best:

The spawn server sets up the address space for the new process and starts it

The spawn server then returns the PID to the requesting process

If a remote machine is best:

The spawn server forwards the spawn request to the spawn server on the remote machine, which then spawns the process
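
The local-versus-remote decision can be sketched as follows. The `SpawnServer` class, the `most_free_cores` scheduler, and the PID format are invented for illustration; the slides do not describe how the fos scheduler ranks machines.

```python
class SpawnServer:
    """One spawn server per machine; peers are the spawn servers elsewhere."""

    def __init__(self, machine, free_cores):
        self.machine = machine
        self.free_cores = free_cores
        self.peers = {}                 # machine name -> SpawnServer
        self.next_pid = 1

    def spawn(self, program, scheduler):
        # Ask the scheduler where the new process should start.
        target = scheduler(self, program)
        if target != self.machine:
            # Remote machine is best: forward to its spawn server.
            return self.peers[target].spawn_local(program)
        return self.spawn_local(program)

    def spawn_local(self, program):
        # Stand-in for setting up the address space and starting the process.
        self.free_cores -= 1
        pid = f"{self.machine}:{self.next_pid}"
        self.next_pid += 1
        return pid


def most_free_cores(server, program):
    """Hypothetical scheduler policy: pick the machine with the most idle cores."""
    candidates = [server] + list(server.peers.values())
    return max(candidates, key=lambda s: s.free_cores).machine


vm_a = SpawnServer("vm-a", free_cores=1)
vm_b = SpawnServer("vm-b", free_cores=4)
vm_a.peers["vm-b"] = vm_b

pid = vm_a.spawn("httpd", most_free_cores)
```

The request was issued on vm-a, but the returned PID shows that the process was started by the spawn server on vm-b.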


Case Studies:Spawning Server


Case Studies
Elastic Fleet

Watchdog process

It watches the load and, when needed, grows the fleet by:

1. Spawning a new member of the fleet
2. Initiating the handshaking process that allows the new server to join the fleet
3. During the handshaking process, existing members of the fleet are notified of the new member, and state is shared with the new fleet member

Case Studies
Elastic Fleet

Example (many servers on a single machine that are all requesting service look-ups from the nameserver):
The watchdog process notices that the queues are becoming full on each of the nameservers
It decides to spawn a new nameserver
It allows the scheduler to determine which core to put this nameserver on

Note that fos provides a programming model that is dynamically scalable to match demand.
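
The nameserver example can be turned into a small sketch of a watchdog step. The 80% threshold, the queue capacity, and the `NameServer` fields are assumptions; the slides do not give the actual trigger or the handshake protocol.

```python
class NameServer:
    """A fleet member with a request queue and a copy of the name table."""

    def __init__(self, ident, table=None):
        self.ident = ident
        self.table = dict(table or {})
        self.peers = set()
        self.queue = []


def grow_if_overloaded(fleet, capacity, threshold=0.8):
    """Spawn one new member when every queue is nearly full."""
    if not all(len(s.queue) >= threshold * capacity for s in fleet):
        return None
    # Spawn: the new member starts with state shared by an existing member.
    newcomer = NameServer(f"ns{len(fleet)}", table=fleet[0].table)
    # Handshake: existing members and the newcomer learn about each other.
    for member in fleet:
        member.peers.add(newcomer.ident)
        newcomer.peers.add(member.ident)
    fleet.append(newcomer)
    return newcomer


fleet = [NameServer("ns0", {"/sys/fs": "core3"})]
fleet[0].queue = list(range(9))          # 9 of 10 slots used
added = grow_if_overloaded(fleet, capacity=10)
again = grow_if_overloaded(fleet, capacity=10)
```

After the first call the fleet has two members with the same name table; the second call does nothing because the new member's queue is still empty.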

Case Studies
Elastic Fleet

The watchdog process also bases its decisions and policy on:

Resources available on a global scale
Load and location of existing servers
Monetary concerns

So fos can make a much more informed decision than solutions that simply look at the cloud application at the granularity of VMs.




Result and Implementation

Our experiment:
16 machines, each with two Intel Xeon X5460 processors (8 cores) running at 3.16GHz and 8GB of main memory, interconnected via two 1Gbps Ethernet ports


Result and Implementation

fos is competitive with Linux in these basic metrics as well as in the cloud:
1. System Calls
2. Ping Response
3. Process Creation
4. File System
5. Web Server
6. Single System Image Growth


Result and Implementation

Local system call


Result and Implementation

System Calls
Timing was gathered using the hardware time stamp counter.
From table 1: fos's system calls are currently slower than Linux's.
To improve messaging and system call performance, fos can use URPC (User-level Remote Procedure Call); initial results promise to reduce messaging and system call time.


Result and Implementation

Remote System Calls


Result and Implementation

Ping Response


Result and Implementation

Process Creation


Result and Implementation

File System


Result and Implementation

Web server


Result and Implementation

Single System Image Growth Test
Eucalyptus was used as the cloud manager
The fos cloud interface server used the EC2 REST API

Test Implementation
A fos VM was manually started
It started a second fos VM instance via Eucalyptus
The proxy servers on the two VMs then connected and shared state, providing a single system image by allowing fos native messages to occur over the network

Result and Implementation

Single System Image Growth

Test Result
The amount of time it took for the first VM to spawn and integrate the second VM was 72.45 seconds.

Time in the result spent outside of fos:

Response time of the Eucalyptus cloud controller
Time to set up the VM on a different machine with a 2GB disk file
Time for the second fos VM to receive an IP address via DHCP
Round-trip time of the TCP messages sent by the proxy servers when sharing state



Conclusion
Cloud computing and multicores have created new classes of platforms for application development.
fos provides a single system interface and a programming model that allows OS system services to scale with demand.
fos is scalable and adaptive because it places key mechanisms for multicore and cloud management in a unified operating system.


Any questions?


Thank You ...


Wentzlaff, David, Charles Gruenwald III, Nathan Beckmann, Kevin Modzelewski, Adam Belay, Lamia Youseff, Jason Miller, and Anant Agarwal. "An operating system for multicore and clouds: mechanisms and implementation." In Proceedings of the 1st ACM symposium on Cloud computing, pp. 3-14. ACM, 2010.