
Spring 2012 Master of Computer Application (MCA) Semester V

MC0085 Advanced Operating Systems (Distributed Systems) (Book ID: B 0967) Assignment Set 1 (40 Marks)

Book ID: B 0967. Each question carries 10 marks (6 x 10 = 60).

1. Describe the following: o Distributed Computing Systems o Distributed Computing System Models

A1) Distributed Computing Systems

Over the past two decades, advancements in microelectronic technology have resulted in the availability of fast, inexpensive processors, and advancements in communication technology have resulted in the availability of cost-effective and highly efficient computer networks. The advancements in these two technologies favour the use of interconnected, multiple processors in place of a single, high-speed processor. Computer architectures consisting of interconnected, multiple processors are basically of two types: tightly coupled systems and loosely coupled systems. In tightly coupled systems, there is a single system-wide primary memory (address space) that is shared by all the processors. If any processor writes, for example, the value 100 to the memory location x, any other processor subsequently reading from location x will get the value 100. Therefore, in these systems, any communication between the processors usually takes place through the shared memory.

In loosely coupled systems, the processors do not share memory, and each processor has its own local memory. If a processor writes the value 100 to the memory location x, this write operation will only change the contents of its local memory and will not affect the contents of the memory of any other processor. Hence, if another processor reads the memory location x, it will get whatever value was there before in that location of its own local memory. In these systems, all physical communication between the processors is done by passing messages across the network that interconnects the processors. Usually, tightly coupled systems are referred to as parallel processing systems, and loosely coupled systems are referred to as distributed computing systems, or simply distributed systems. In contrast to the tightly coupled systems, the processors of distributed computing systems can be located far from each other to cover a wider geographical area. Furthermore, in tightly coupled systems, the number of processors that can be usefully deployed is usually small and limited by the bandwidth of the shared memory. This is not the case with distributed computing systems that are more freely expandable and can have an almost unlimited number of processors.

(Figure: Tightly coupled multiprocessor systems)

Distributed Computing System Models

Distributed computing system models can be broadly classified into five categories: the minicomputer model, the workstation model, the workstation-server model, the processor-pool model, and the hybrid model.

Minicomputer Model: The minicomputer model (Fig. 1.3) is a simple extension of the centralized time-sharing system. A distributed computing system based on this model consists of a few minicomputers (they may be large supercomputers as well) interconnected by a communication network. Each minicomputer usually has multiple users simultaneously logged on to it. For this, several interactive terminals are connected to each minicomputer. Each user is logged on to one specific minicomputer, with remote access to other minicomputers. The network allows a user to access remote resources that are available on some machine other than the one on to which the user is currently logged. The minicomputer model may be used when resource sharing (such as sharing of information databases of different types, with each type of database located on a different machine) with remote users is desired. The early ARPAnet is an example of a distributed computing system based on the minicomputer model.

2. Describe the following with respect to Remote Procedure Calls: o The RPC Model o STUB Generation

A2) The RPC Model

The RPC mechanism is an extension of the normal procedure call mechanism. It enables a call to be made to a procedure that does not reside in the address space of the calling process. The called procedure may be on a remote machine or on the same machine. The caller and callee have separate address spaces, so the called procedure has no access to the caller's environment. The RPC model is used for transfer of control and data within a program in the following manner: 1. For making a procedure call, the caller places arguments to the procedure in some well-specified location.

2. Control is then transferred to the sequence of instructions that constitutes the body of the procedure. 3. The procedure body is executed in a newly created execution environment that includes copies of the arguments given in the calling instruction. 4. After the procedure's execution is over, control returns to the calling point, possibly returning a result. When a remote procedure call is made, the caller and the callee processes interact in the following manner: The caller (also known as the client process) sends a call (request) message to the callee (also known as the server process) and waits (blocks) for a reply message. The server executes the procedure and returns the result of the procedure execution to the client. After extracting the result of the procedure execution, the client resumes execution. In the above model, RPC calls are synchronous; however, an implementation may choose to make RPC calls asynchronous to allow parallelism. Also, for each request the server can create a thread to process the request, so that the server can receive other requests in the meantime.
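As a concrete illustration of this synchronous request/reply interaction, the following minimal sketch uses Python's standard xmlrpc modules (this example is not part of the original text; the port number and the add procedure are arbitrary choices):

# --- server side: exports a procedure and blocks waiting for requests ---
from xmlrpc.server import SimpleXMLRPCServer

def add(a, b):
    # body of the remote procedure; executed in the server's address space
    return a + b

server = SimpleXMLRPCServer(("localhost", 8000), allow_none=True)
server.register_function(add, "add")   # make the procedure callable remotely
server.serve_forever()                 # wait for call (request) messages

# --- client side (run as a separate process) ---
# import xmlrpc.client
# proxy = xmlrpc.client.ServerProxy("http://localhost:8000/")
# result = proxy.add(2, 3)   # caller blocks until the reply message arrives
# print(result)              # -> 5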

STUB Generation

The stubs can be generated in the following two ways: Manual Stub Generation: The RPC implementer provides a set of translation functions from which a user can construct his own stubs. It is simple to implement and can handle complex parameters. Automatic Stub Generation: This is the most commonly used technique for stub generation. It uses an Interface Definition Language (IDL) for defining the interface between the client and server. An interface definition is mainly a list of procedure names supported by the interface, together with the types of their arguments and results, which helps the client and server to perform compile-time type checking and generate appropriate calling sequences. An interface definition also contains information to indicate whether each argument is an input, an output, or both. This helps in avoiding unnecessary copying: only input arguments need to be copied from client to server, and only output arguments need to be copied from server to client. It also contains information about type definitions, enumerated types, and defined constants, so the clients do not have to store this information. A server program that implements the procedures in an interface is said to export the interface. A client program that calls the procedures is said to import the interface. When writing a distributed application, a programmer first writes the interface definition using the IDL, and then can write a server program that exports the interface and a client program that imports the interface. The interface definition is processed using an IDL compiler (the IDL compiler in Sun RPC is called rpcgen) to generate components that can be combined with both client and server programs, without making changes to the existing compilers. In particular, an IDL compiler generates a client stub procedure and a server stub procedure

for each procedure in the interface. It generates the appropriate marshaling and un-marshaling operations in each stub procedure. It also generates a header file that supports the data types in the interface definition, to be included in the source files of both client and server. The client stubs are compiled and linked with the client program, and the server stubs are compiled and linked with the server program.
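To make the role of the generated stubs concrete, the hand-written sketch below shows what a client stub and a server stub essentially do: marshal the call into a message, hand it to the transport, and un-marshal it on the other side. The "network" is replaced here by a direct function call so the example stays self-contained; in a real system an IDL compiler such as rpcgen would emit equivalent code, and the procedure and message format shown are invented for illustration.

import json

# ----- server side -----
def add(a, b):                      # the actual remote procedure
    return a + b

def server_stub(request_bytes):
    # un-marshal the request, dispatch to the procedure, marshal the reply
    request = json.loads(request_bytes)
    procedures = {"add": add}
    result = procedures[request["proc"]](*request["args"])
    return json.dumps({"result": result}).encode()

# ----- transport (stand-in for the message-passing layer) -----
def transport(request_bytes):
    return server_stub(request_bytes)

# ----- client side -----
def add_stub(a, b):
    # marshal procedure name and arguments into a request message
    request = json.dumps({"proc": "add", "args": [a, b]}).encode()
    reply = transport(request)          # looks like a local call to the caller
    return json.loads(reply)["result"]  # un-marshal the reply

print(add_stub(2, 3))   # prints 5; the caller never sees the messages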

3. Describe the following: o Distributed Shared Memory Systems (DSM) o DSM Design & Implementation issues

A3) Distributed Shared Memory Systems (DSM)

This is also called DSVM (Distributed Shared Virtual Memory). It is a loosely coupled distributed-memory system that implements a software layer on top of the message passing system to provide a shared memory abstraction for the programmers. The software layer can be implemented in the OS kernel or in runtime library routines with proper kernel support. It is an abstraction that integrates the local memory of different machines in a network environment into a single logical entity shared by cooperating processes executing on multiple sites. The shared memory exists only virtually.

DSM Systems: a comparison with message passing and tightly coupled multiprocessor systems. DSM provides a simpler abstraction than the message passing model. It relieves the programmer of the burden of explicitly using communication primitives in programs. In message passing systems, passing complex data structures between two different processes is difficult. Moreover, passing data structures containing pointers is generally expensive in the message passing model. Distributed shared memory takes advantage of the locality of reference exhibited by programs and improves efficiency. Distributed shared memory systems are cheaper to build than tightly coupled multiprocessor systems. The large physical memory available facilitates running programs requiring large memory efficiently. DSM can scale well when compared to tightly coupled multiprocessor systems. A message passing system allows processes to communicate with each other while being protected from one another by having private address spaces, whereas in DSM one process can cause another to fail by erroneously altering data.

When message passing is used between heterogeneous computers, marshaling of data takes care of differences in data representation; how can memory be shared between computers with different integer representations? DSM can be made persistent, i.e., processes communicating via DSM may execute with overlapping lifetimes; a process can leave information in an agreed location for another process. Processes communicating via message passing must execute at the same time. Which is better, message passing or distributed shared memory? Distributed shared memory appears to be a promising tool if it can be implemented efficiently.

Distributed Shared Memory Architecture

As shown in the above figure, the DSM provides a virtual address space shared among processes on loosely coupled processors. DSM is basically an abstraction that integrates the local memory of different machines in a network environment into a single logical entity shared by cooperating processes executing on multiple sites. The shared memory itself exists only virtually. The application programs can use it in the same way as traditional virtual memory, except that processes using it can run on different machines in parallel. Architectural Components: Each node in a distributed system consists of one or more CPUs and a memory unit. The nodes are connected by a communication network. A simple message-passing system allows processes on different nodes to exchange messages with each other. The DSM abstraction presents a single large shared memory space to the processors of all nodes. The shared memory of DSM exists only virtually. A memory map manager running at each node maps the local memory onto the shared virtual memory. To facilitate this mapping, the shared-memory space is partitioned into blocks. Data caching is used to reduce network latency. When a memory block accessed by a process is not resident in local memory, a block fault is generated and control goes to the OS.

The OS gets this block from the remote node, maps it into the application's address space, and restarts the faulting instruction. Thus data keeps migrating from one node to another, but no communication is visible to the user processes. Network traffic is highly reduced if applications show a high degree of locality of data accesses. Variations of this general approach are used in different implementations, depending on whether the DSM allows replication and/or migration of shared-memory blocks.
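A rough, purely illustrative sketch of this block-fault handling is shown below: each node's local memory is modelled as a dictionary of blocks, and a missing block is fetched from its current owner and migrated on a fault (the block identifiers, the shared owner directory, and the migration-only policy are simplifying assumptions, not part of the original text):

class DsmNode:
    def __init__(self, name, directory):
        self.name = name
        self.local = {}             # blocks currently resident in local memory
        self.directory = directory  # maps block id -> owning node (shared)

    def read(self, block_id):
        if block_id not in self.local:          # block fault
            owner = self.directory[block_id]
            self.local[block_id] = owner.local[block_id]  # fetch over "network"
            self.directory[block_id] = self      # block migrates to this node
            del owner.local[block_id]
        return self.local[block_id]             # faulting access is restarted

directory = {}
a, b = DsmNode("A", directory), DsmNode("B", directory)
b.local["blk0"] = "shared data"; directory["blk0"] = b
print(a.read("blk0"))   # fault at A, block migrates from B, prints "shared data"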

DSM Design and Implementation Issues

The important issues involved in the design and implementation of DSM systems are as follows: Granularity: It refers to the block size of the DSM system, i.e. the unit of sharing and the unit of data transfer across the network when a network block fault occurs. Possible units are a few words, a page, or a few pages. Structure of Shared Memory Space: The structure refers to the layout of the shared data in memory. It is dependent on the type of applications that the DSM system is intended to support. Memory Coherence and Access Synchronization: Coherence (consistency) refers to the memory coherence problem that deals with the consistency of shared data that lies in the main memory of two or more nodes. Synchronization refers to synchronization of concurrent access to shared data using synchronization primitives such as semaphores. Data Location and Access: A DSM system must implement mechanisms to locate data blocks in order to service the network data block faults and meet the requirements of the memory coherence semantics being used. Block Replacement Policy: If the local memory of a node is full, a cache miss at that node implies not only a fetch of the accessed data block from a remote node but also a replacement, i.e. a data block of the local memory must be replaced by the new data block. Therefore a block replacement policy is also necessary in the design of a DSM system. Thrashing: In a DSM system, data blocks migrate between nodes on demand. If two nodes compete for write access to a single data item, the corresponding data block may be transferred back and forth at such a high rate that no real work can get done. A DSM system must use a policy to avoid this situation (known as thrashing). Heterogeneity: DSM systems built for homogeneous systems need not address the heterogeneity issue. However, if the underlying system environment is heterogeneous, the DSM system must be

designed to take care of heterogeneity so that it functions properly with machines having different architectures.

4. Discuss the clock synchronization algorithms.

A4) Clock Synchronization Algorithms

Clock synchronization algorithms may be broadly classified as centralized and distributed.

Centralized Algorithms: In centralized clock synchronization algorithms one node has a real-time receiver. This node, called the time server node, has a clock time that is regarded as correct and used as the reference time. The goal of these algorithms is to keep the clocks of all other nodes synchronized with the clock time of the time server node. Depending on the role of the time server node, centralized clock synchronization algorithms are again of two types: passive time server and active time server. 1. Passive Time Server Centralized Algorithm: In this method each node periodically sends a message to the time server. When the time server receives the message, it quickly responds with a message (time = T), where T is the current time in the clock of the time server node. Assume that when the client node sends the time = ? message, its clock time is T0, and when it receives the time = T message, its clock time is T1. Since T0 and T1 are measured using the same clock, in the absence of any other information, the best estimate of the time required for the propagation of the message time = T from the time server node to the client's node is (T1-T0)/2. Therefore, when the reply is received at the client's node, its clock is readjusted to T + (T1-T0)/2. 2. Active Time Server Centralized Algorithm: In this approach, the time server periodically broadcasts its clock time (time = T). The other nodes receive the broadcast message and use the clock time in the message for correcting their own clocks. Each node has a priori knowledge of the approximate time (Ta) required for the propagation of the message time = T from the time server node to its own node. Therefore, when a broadcast message is received at a node, the node's clock is readjusted to the time T + Ta. A major drawback of this method is that it is not fault tolerant. If the broadcast message reaches a node too late due to some communication fault, the clock of that node will be readjusted to an incorrect value. Another disadvantage of this approach is that it requires a broadcast facility to be supported by the network. Another active time server algorithm that overcomes the drawbacks of the above algorithm is the Berkeley algorithm proposed by Gusella and Zatti for internal synchronization of the clocks of a group of computers running Berkeley UNIX. In this algorithm, the time server periodically sends a message (time = ?) to all the computers in the group. On receiving this message, each computer sends back its clock value to the time server. The time server has a priori knowledge of the approximate time required for the propagation of a message from each node to its own node. Based on this knowledge, it first readjusts the clock values of the reply messages. It then takes a fault-tolerant average of the clock values of all the computers (including its own). To take the fault-tolerant average, the time server

chooses a subset of all clock values that do not differ from one another by more than a specified amount, and the average is taken only for the clock values in this subset. This approach eliminates readings from unreliable clocks whose clock values could have a significant adverse effect if an ordinary average were taken. The calculated average is the current time to which all the clocks should be readjusted. The time server readjusts its own clock to this value. Instead of sending the calculated current time back to the other computers, the time server sends the amount by which each individual computer's clock requires adjustment. This can be a positive or negative value and is calculated based on the knowledge the time server has about the approximate time required for the propagation of a message from each node to its own node. Centralized clock synchronization algorithms suffer from two major drawbacks: 1. They are subject to single point of failure. If the time server node fails, the clock synchronization operation cannot be performed. This makes the system unreliable. Ideally, a distributed system should be more reliable than its individual nodes. If one goes down, the rest should continue to function correctly. 2. From a scalability point of view it is generally not acceptable to get all the time requests serviced by a single time server. In a large system, such a solution puts a heavy burden on that one process. Distributed algorithms overcome these drawbacks.
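The two adjustments described above can be written out directly. The first function applies the passive-time-server estimate T + (T1 - T0)/2; the second computes a Berkeley-style fault-tolerant average by discarding clock values that lie too far from the median, which is one simple way of selecting the mutually close subset mentioned in the text (the deviation bound and the sample values are made up for the example):

def passive_adjust(T, T0, T1):
    # T  : time in the reply from the time server
    # T0 : client clock when the request was sent
    # T1 : client clock when the reply was received
    return T + (T1 - T0) / 2.0      # best estimate of the current server time

def berkeley_average(clock_values, max_deviation):
    # keep only clock values that lie close to the median, then average them
    clock_values = sorted(clock_values)
    median = clock_values[len(clock_values) // 2]
    usable = [c for c in clock_values if abs(c - median) <= max_deviation]
    return sum(usable) / len(usable)

print(passive_adjust(T=100.000, T0=99.980, T1=100.020))   # -> 100.02
print(berkeley_average([10.00, 10.02, 9.98, 17.50], max_deviation=0.5))  # outlier 17.50 ignored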

5. Discuss the following with respect to Resource Management in Distributed Systems: o Load Balancing Approach o Load Sharing Approach

A5) Load-Balancing Approach The scheduling algorithms that use this approach are known as Load Balancing or Load-Leveling Algorithms. These algorithms are based on the intuition that for better resource utilization, it is desirable for the load in a distributed system to be balanced evenly. Thus a load balancing algorithm tries to balance the total system load by transparently transferring the workload from heavily loaded nodes to lightly loaded nodes in an attempt to ensure good overall performance relative to some specific metric of system performance. We can have the following categories of load balancing algorithms: 1. Static: Ignore the current state of the system. e.g. If a node is heavily loaded, it picks up a task randomly and transfers it to a random node. These algorithms are simpler to implement but performance may not be good.

2. Dynamic: Use the current state information for load balancing. There is an overhead involved in collecting state information periodically, but they perform better than static algorithms. 3. Deterministic: Algorithms in this class use the processor and process characteristics to allocate processes to nodes. 4. Probabilistic: Algorithms in this class use information regarding static attributes of the system such as number of nodes, processing capability, etc. 5. Centralized: System state information is collected by a single node. This node makes all scheduling decisions. 6. Distributed: The most desired approach. Each node is equally responsible for making scheduling decisions based on the local state and the state information received from other sites. 7. Cooperative: A distributed dynamic scheduling algorithm. In these algorithms, the distributed entities cooperate with each other to make scheduling decisions. Therefore they are more complex and involve larger overhead than non-cooperative ones. But the stability of a cooperative algorithm is better than that of a non-cooperative one. 8. Non-cooperative: A distributed dynamic scheduling algorithm. In these algorithms, individual entities act as autonomous entities and make scheduling decisions independently of the actions of other entities. Load Estimation Policy: This policy makes an effort to measure the load at a particular node in a distributed system according to the following criteria: the number of processes running at a node as a measure of the load at the node, or the CPU utilization as a measure of load. Neither of these fully captures the load at a node; other parameters such as the resource demands of the processes, the architecture and speed of the processor, the total remaining execution time of the processes, etc., should be taken into consideration as well. Process Transfer Policy: The strategy of load balancing algorithms is based on the idea of transferring some processes from the heavily loaded nodes to the lightly loaded nodes. To facilitate this, it is necessary to devise a policy to decide whether a node is lightly or heavily loaded. The threshold value of a node is the limiting value of its workload and is used to decide whether a node is lightly or heavily loaded. The threshold value of a node may be determined by any of the following methods: 1. Static Policy: Each node has a predefined threshold value. If the number of processes exceeds the predefined threshold value, a process is transferred. This can cause process thrashing under heavy load, thus causing instability. 2. Dynamic Policy: In this method, the threshold value is dynamically calculated. It is increased under heavy load and decreased under light load. Thus process thrashing does not occur.

3. High-low Policy: Each node has two threshold values, high and low. Thus, the state of a node can be overloaded, under-loaded, or normal depending on whether the number of processes is greater than high, less than low, or otherwise. Location Policies: Once a decision has been made through the transfer policy to transfer a process from a node, the next step is to select the destination node for that process's execution. This selection is made by the location policy of a scheduling algorithm. The main location policies proposed are as follows: 1. Threshold: A random node is polled to check its state and the task is transferred if it will not be overloaded; polling is continued until a suitable node is found or a threshold number of nodes have been polled. Experiments show that polling 3 to 5 nodes performs as well as polling a large number of nodes, such as 20. This also gives a substantial performance improvement over no load balancing at all. 2. Shortest: A predetermined number of nodes are polled and the node with the minimum load among these is picked for the task transfer; if that node is overloaded, the task is executed locally. 3. Bidding: In this method, each node acts as a manager (the one who tries to transfer a task) and a contractor (the one that is able to accept a new task). The manager broadcasts a request-for-bids to all the nodes. A contractor returns a bid (a quoted price based on processor capability, memory size, resource availability, etc.). The manager chooses the best bidder for transferring the task. Problems that could arise as a result of concurrent broadcasts by two or more managers need to be addressed. 4. Pairing: This approach tries to reduce the variance in load between pairs of nodes. In this approach, two nodes that differ greatly in load are paired with each other so they can exchange tasks. Each node asks a randomly picked node if it will pair with it. After a pairing is formed, one or more processes are transferred from the heavily loaded node to the lightly loaded node. State Information Exchange Policies: The dynamic policies require frequent exchange of state information among the nodes of the system. In fact, a dynamic load-balancing algorithm faces a transmission dilemma because of the two opposing impacts the transmission of a message has on the overall performance of the system. On one hand, transmission improves the ability of the algorithm to balance the load. On the other hand, it raises the expected queuing time of messages because of the increase in the utilization of the communication channel. Thus proper selection of the state information exchange policy is essential. The proposed load balancing algorithms use one of the following policies for the purpose: 1. Periodic Broadcast: Each node broadcasts its state information periodically, say every t time units. It does not scale well and causes heavy network traffic. It may result in fruitless messages. 2. Broadcast When State Changes: This avoids fruitless messages. A node broadcasts its state only when its state changes, for example, when the state changes from normal to low or normal to high. 3. On-Demand Exchange: Under this approach

a node broadcasts a state-information request when its state changes from the normal load region to the high or low load region. Upon receiving this request, other nodes send their current state information to the requesting node. If the requesting node includes its state information in the request, then only those nodes that can cooperate with the requesting node need to send a reply. 4. Exchange by Polling: In this approach the state information is exchanged with a polled node only. Polling stops after a predetermined number of polls or after a suitable partner is found, whichever happens first. Priority Assignment Policies: One of the following priority assignment rules may be used to assign priorities to local and remote processes (i.e. processes that have migrated from other nodes): i) Selfish: Local processes are given higher priority than remote processes. Studies show this approach yields the worst response time of the three policies. This approach penalizes processes that arrive at a busy node, because they will be transferred and hence will execute as low priority processes. It favors the processes that arrive at lightly loaded nodes. ii) Altruistic: Remote processes are given higher priority than local processes. Studies show this approach yields the best response time of the three approaches. Under this approach, remote processes incur lower delays than local processes. iii) Intermediate: If the local processes outnumber the remote processes, local processes get higher priority; otherwise, remote processes get higher priority. Studies show that the overall response time performance under this policy is much closer to that of the altruistic policy. Under this policy, local processes are treated better than the remote processes for a wide range of loads. Migration Limiting Policies: This policy is used to decide the total number of times a process should be allowed to migrate. Uncontrolled: A remote process is treated like a local process, so there is no limit on the number of times it can migrate. Controlled: Most systems use a controlled policy to overcome the instability problem. Migrating a partially executed process is expensive, so many systems limit the number of migrations to 1. For long running processes, however, it might be beneficial to migrate more than once.
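A compressed sketch of how the process transfer and location policies described above fit together is given below; it combines the high-low (double-threshold) transfer policy with the threshold location policy that polls a few randomly picked nodes (the threshold values, poll limit, and load figures are arbitrary choices for illustration):

import random

LOW, HIGH, POLL_LIMIT = 2, 6, 5

def node_state(load):
    # high-low transfer policy: classify a node by its number of processes
    if load > HIGH:
        return "overloaded"
    if load < LOW:
        return "underloaded"
    return "normal"

def pick_destination(loads, sender):
    # threshold location policy: poll up to POLL_LIMIT random nodes and
    # transfer to the first one that would not become overloaded
    candidates = [n for n in loads if n != sender]
    for n in random.sample(candidates, min(POLL_LIMIT, len(candidates))):
        if loads[n] + 1 <= HIGH:
            return n
    return sender           # no suitable node found: execute locally

loads = {"n1": 8, "n2": 1, "n3": 4, "n4": 7}
if node_state(loads["n1"]) == "overloaded":
    print("transfer one process from n1 to", pick_destination(loads, "n1"))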

Load Sharing Approach

Several researchers believe that load balancing, with its implication of attempting to equalize the workload on all the nodes of the system, is not an appropriate objective. This is because the overhead involved in gathering the state information to achieve this objective is normally very large, especially in distributed systems having a large number of nodes. In fact, for the proper utilization of the resources of a distributed system, it is not required to balance the load on all the nodes. It is necessary and sufficient to prevent nodes from being idle while some other nodes have more than two processes. This approach is called dynamic load sharing rather than dynamic load balancing. Issues in Load-Sharing Algorithms: The design of a load sharing algorithm requires that proper decisions be made regarding the load estimation policy, process transfer policy, state information exchange policy, priority assignment policy, and migration limiting policy. It is simpler to decide about most of these policies in the case of load sharing, because load sharing algorithms do not attempt to balance the average workload of all the nodes of the system. Rather, they only attempt to ensure that no node is idle when another node is heavily loaded. The priority assignment policies and the migration limiting policies for load-sharing algorithms are the same as those of load-balancing algorithms. 1. Load Estimation Policies: Here an attempt is made to ensure that no node is idle while processes wait for service at some other node. In general, the following two approaches are used for estimation: use the number of processes at a node as a measure of load, or use the CPU utilization as a measure of load. 2. Process Transfer Policies: Load sharing algorithms are interested in the busy or idle states only, and most of them employ the all-or-nothing strategy given below. All-or-Nothing Strategy: It uses a single-threshold policy. A node becomes a candidate to accept tasks from remote nodes only when it becomes idle. A node becomes a candidate for transferring a task as soon as it has more than one task. Under this approach, an idle node may not be able to immediately acquire a task, thus wasting processing power. To avoid this, the threshold value can be set to 2 instead of 1.

Location Policies: The location policy decides the sender node or the receiver node of a process that is to be moved within the system for load sharing. Depending on the type of node that takes the initiative to globally search for a suitable node for the process, the location policies are of the following types: 1. Sender-Initiated Policy: Under this policy, heavily loaded nodes search for lightly loaded nodes to which tasks may be transferred. The search can be done by sending a broadcast message or by probing randomly picked nodes.

An advantage of this approach is that the sender can transfer freshly arrived tasks, so no preemptive task transfers occur. A disadvantage of this approach is that it can cause system instability under high system load.

2. Receiver-Initiated Location Policy: Under this policy, lightly loaded nodes search for heavily loaded nodes from which tasks may be transferred.

The search for a sender can be done by sending a broadcast message or by probing randomly picked nodes. A disadvantage of this approach is that it may result in preemptive task transfers, because the sender may not have any freshly arrived tasks. The advantage is that it does not cause system instability: under high system loads a receiver will quickly find a sender, and under low system loads it is acceptable for processes to handle some additional control messages.

3. Symmetrically Initiated Location Policy: Under this approach, both senders and receivers search for receivers and senders, respectively.

State Information Exchange Policies: Since it is not necessary to equalize the load at all nodes under load sharing, state information is exchanged only when the state changes. The policies used are as follows:

1. Broadcast When State Changes: A node broadcasts a state information request message when it becomes under-loaded or overloaded. In the sender-initiated approach, a node broadcasts this message only when it is overloaded. In the receiver-initiated approach, a node broadcasts this message only when it is under-loaded.

2. Poll When State Changes: When a node's state changes, it randomly polls other nodes one by one and exchanges state information with the polled nodes. Polling stops when a suitable node is found or a threshold number of nodes have been polled. Under the sender-initiated policy, the sender polls to find a suitable receiver; under the receiver-initiated policy, the receiver polls to find a suitable sender.

The Above-Average Algorithm by Krueger and Finkel (a dynamic load balancing algorithm) tries to maintain the load at each node within an acceptable range of the system average. Its transfer policy is a threshold policy that uses two adaptive thresholds, the upper threshold and the lower threshold. A node with load lower than the lower threshold is considered a receiver, and a node with load higher than the upper threshold is considered a sender. A node's estimated average load is supposed to lie in the middle of the lower and upper thresholds.
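The sender/receiver classification used by this above-average style of algorithm can be sketched as follows, with the two thresholds placed symmetrically around the estimated system average (the margin of one process and the load values are arbitrary choices for illustration):

def classify(load, estimated_average, margin=1):
    lower = estimated_average - margin
    upper = estimated_average + margin
    if load > upper:
        return "sender"       # candidate for giving away a task
    if load < lower:
        return "receiver"     # candidate for accepting a task
    return "acceptable"       # load already within the acceptable range

loads = {"n1": 0, "n2": 5, "n3": 2, "n4": 1}
average = sum(loads.values()) / len(loads)        # 2.0 in this example
for node, load in loads.items():
    print(node, classify(load, average))
# n1 -> receiver, n2 -> sender, n3 and n4 -> acceptable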

6. Discuss the following with respect to File Systems: o Stateful Vs Stateless Servers o Caching

A6) Stateful Vs Stateless Servers

The file servers that implement a distributed file service can be stateless or stateful. Stateless file servers do not store any session state. This means that every client request is treated independently, and not as part of a new or existing session. Stateful servers, on the other hand, do store session state. They may, therefore, keep track of which clients have opened which files, current read and write pointers for files, which files have been locked by which clients, etc. The main advantage of stateless servers is that they can easily recover from failure. Because there is no state that must be restored, a failed server can simply restart after a crash and immediately provide services to clients as though nothing happened. Furthermore, if clients crash, the server is not stuck with abandoned opened or locked files. Another benefit is that the server implementation remains simple because it does not have to implement the state accounting associated with opening, closing, and locking of files. The main advantage of stateful servers, on the other hand, is that they can provide better performance for clients. Because clients do not have to provide full file information every time they perform an operation, the size of messages to and from the server can be significantly decreased. Likewise, the server can make use of knowledge of access patterns to perform read-ahead and do other optimisations. Stateful servers can also offer clients extra services such as file locking, and remember read and write positions.
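The difference is easiest to see in the shape of a read request: a stateless server expects every request to be self-contained, while a stateful server resolves a short handle against the session state it keeps. The sketch below contrasts the two styles (file contents, paths, and the handle scheme are invented for the example):

FILES = {"/etc/motd": b"welcome to the distributed file system\n"}

# --- stateless style: the client supplies everything on every call ---
def stateless_read(path, offset, count):
    return FILES[path][offset:offset + count]

# --- stateful style: the server remembers open files and read pointers ---
open_table = {}          # session state kept on the server

def sf_open(path):
    fd = len(open_table) + 1
    open_table[fd] = {"path": path, "pos": 0}
    return fd

def sf_read(fd, count):
    entry = open_table[fd]
    data = FILES[entry["path"]][entry["pos"]:entry["pos"] + count]
    entry["pos"] += len(data)       # the server advances the pointer
    return data

print(stateless_read("/etc/motd", 0, 7))   # b'welcome'
fd = sf_open("/etc/motd")
print(sf_read(fd, 7), sf_read(fd, 7))      # b'welcome' b' to the'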

Caching

Besides replication, caching is often used to improve the performance of a DFS. In a DFS, caching involves storing either a whole file or the results of file service operations. Caching can be performed at two locations: at the server and at the client. Server-side caching makes use of the file caching provided by the host operating system. This is transparent to the server and helps to improve the server's performance by reducing costly disk accesses. Client-side caching comes in two flavours: on-disk caching and in-memory caching. On-disk caching involves the creation of (temporary) files on the client's disk. These can either be complete files (as in the upload/download model) or they can contain

partial file state, attributes, etc. In-memory caching stores the results of requests in the client machine's memory. This can be process-local (in the client process), in the kernel, or in a separate dedicated caching process. The issue of cache consistency in a DFS has obvious parallels to the consistency issue in shared memory systems, but there are other tradeoffs (for example, disk access delays come into play, the granularity of sharing is different, sizes are different, etc.). Furthermore, because write-through caches are too expensive to be useful, the consistency of caches will be weakened. This makes implementing Unix semantics impossible. Approaches used in DFS caches include delayed writes, where writes are not propagated to the server immediately but in the background later on, and write-on-close, where the server receives updates only after the file is closed. Adding a delay to write-on-close has the benefit of avoiding superfluous writes if a file is deleted shortly after it has been closed.
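A minimal sketch of a client-side in-memory cache that follows the write-on-close policy described above might look like this (the "server" is just a local dictionary standing in for what would be a remote call in a real DFS):

server_store = {}     # stands in for the remote file server

class CachedFile:
    def __init__(self, name):
        self.name = name
        # fetch once; subsequent reads and writes hit only the local copy
        self.data = server_store.get(name, b"")

    def write(self, data):
        self.data += data          # modify the cached copy only

    def close(self):
        # write-on-close: the server sees the updates only now
        server_store[self.name] = self.data

f = CachedFile("notes.txt")
f.write(b"draft 1")
print(server_store.get("notes.txt"))   # None: server not yet updated
f.close()
print(server_store["notes.txt"])       # b'draft 1': propagated on close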

Spring 2012 Master of Computer Application (MCA) Semester V

MC0085 Advanced Operating Systems (Distributed Systems), 4 Credits (Book ID: B 0967). Assignment Set 2 (60 Marks). Each question carries 10 marks (6 x 10 = 60 marks).

1. Describe the following: o Synchronization o Buffering

A1) Synchronization

In computer science, synchronization refers to one of two distinct but related concepts: synchronization of processes, and synchronization of data. Process synchronization refers to the idea that multiple processes are to join up or handshake at a certain point, so as to reach an agreement or commit to a certain sequence of actions. Data synchronization refers to the idea of keeping multiple copies of a dataset in coherence with one another, or maintaining data integrity. Process synchronization primitives are commonly used to implement data synchronization.

Thread synchronization or serialization, strictly defined, is the application of particular mechanisms to ensure that two concurrently-executing threads or processes do not execute specific portions of a program at the same time. If one thread has begun to execute a serialized portion of the program, any other thread trying to execute this portion must wait until the first thread finishes. Synchronization is used to control access to state both in small-scale multiprocessing systems -- in multithreaded environments and multiprocessor computers -- and in distributed computers consisting of thousands of units -- in banking and database systems, in web servers, and so on.
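A standard illustration of the serialization described here is protecting a shared counter with a lock so that only one thread at a time executes the critical section; the sketch below uses plain Python threading and is not specific to any system mentioned in the text:

import threading

counter = 0
lock = threading.Lock()

def worker(increments):
    global counter
    for _ in range(increments):
        with lock:              # serialized portion: one thread at a time
            counter += 1

threads = [threading.Thread(target=worker, args=(100_000,)) for _ in range(4)]
for t in threads: t.start()
for t in threads: t.join()
print(counter)    # always 400000; without the lock, updates could be lost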

Buffering

Buffering is the process of storing data in a memory area called a buffer while data is being transferred between two devices or between a device and an application. Buffering is done for three reasons: a. To cope with the speed mismatch between the producer (or sender) and the consumer (or receiver) of a data stream. b. To adapt between devices having different data-transfer sizes. c. To support copy semantics for application I/O.

In computer storage, a disk buffer (often ambiguously called disk cache or cache buffer) is the embedded memory in a hard drive acting as a buffer between the computer and the physical hard disk platter that is used for storage. Modern hard disks come with 8 to 64 MiB of such memory. Where a buffer is large and the throughput of the disk is slow, the data can remain cached for too long, resulting in degraded performance compared to equivalent disks with smaller buffers. This degradation occurs because of longer latencies when flush commands are sent to a disk with a full buffer.

When several blocks need to be transferred from disk to main memory and all the block addresses are known, several buffers can be reserved in main memory to speed up the transfer. While one buffer is being read or written, the CPU can process data in the other buffer. This is possible because an independent disk I/O processor (controller) exists that, once started, can proceed to transfer a data block between memory and disk independent of and in parallel to CPU processing.
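The overlap of transfer and processing described above is the classic double-buffering scheme. The following sketch simulates it with two buffers that swap roles, a stand-in function playing the part of the disk controller that fills one buffer while the CPU processes the other (block numbers and contents are fabricated):

def read_block_from_disk(block_number):
    # stand-in for a DMA transfer started by the disk controller
    return f"<data of block {block_number}>"

def process(block):
    print("CPU processing", block)

def double_buffered_read(block_numbers):
    buffers = [None, None]
    buffers[0] = read_block_from_disk(block_numbers[0])   # prime the first buffer
    for i, blk in enumerate(block_numbers):
        current = buffers[i % 2]
        if i + 1 < len(block_numbers):
            # in a real system this transfer proceeds in parallel with process()
            buffers[(i + 1) % 2] = read_block_from_disk(block_numbers[i + 1])
        process(current)

double_buffered_read([10, 11, 12, 13])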

2. Describe the following: o Communication protocols for RPC o Client Server Binding
A2) Communication Protocols for RPCs

Different systems developed on the basis of remote procedure calls have different IPC requirements. Based on the needs of different systems, several communication protocols have been proposed for RPCs. A brief description of these protocols is given below:

i) The Request Protocol: Also known as the R protocol. It is useful for RPCs in which the called procedure has nothing to return and the client does not require confirmation for the procedure having

been executed. An RPC protocol that uses the R protocol is also called asynchronous RPC. For asynchronous RPC, the RPCRuntime does not take responsibility for retrying a request in case of communication failure. So, if an unreliable transport protocol such as UDP is used, request messages could be lost. Asynchronous RPCs with unreliable transport protocols are generally useful for implementing periodic updates. For example, a time server node in a distributed system may send synchronization messages every T seconds.

ii) Request/Reply Protocol (RR protocol): Its basic idea is to eliminate explicit acknowledgement messages.

A server's reply message is regarded as an acknowledgement of the client's request. A subsequent call message is regarded as an acknowledgement of the server's reply. The RR protocol does not possess failure-handling capabilities. A timeout-and-retry mechanism is normally used along with the RR protocol to take care of lost messages. If duplicate messages are not filtered, the RR protocol provides at-least-once semantics. Servers can support exactly-once semantics by keeping records of replies in a reply cache. An open question is how long the reply needs to be kept.

iii) The Request/Reply/Acknowledge-Reply Protocol (RRA): It is useful for the design of systems involving simple RPCs. The server needs to keep a copy of the reply only until it receives the acknowledgement for the reply from the client. Exactly-once semantics can be implemented easily using this protocol. In this protocol a server's reply message is regarded as an acknowledgement of the client's request message, and a subsequent call packet from a client is regarded as an acknowledgement of the server's reply to the previous call made by the client.
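The reply cache that gives exactly-once behaviour on top of the RR/RRA protocols can be sketched as follows: the server records the last reply per (client, request-id) pair and re-sends it when a duplicate request arrives instead of re-executing the procedure (the identifiers and the increment procedure are invented for the example):

reply_cache = {}      # (client_id, request_id) -> previously computed reply

def execute(procedure, args):
    return procedure(*args)

def handle_request(client_id, request_id, procedure, args):
    key = (client_id, request_id)
    if key in reply_cache:
        # duplicate (e.g. retransmitted after a lost reply): do not re-execute
        return reply_cache[key]
    reply = execute(procedure, args)
    reply_cache[key] = reply      # kept until acknowledged or timed out
    return reply

counter = {"value": 0}
def increment(n):
    counter["value"] += n
    return counter["value"]

print(handle_request("clientA", 1, increment, (5,)))   # executes: 5
print(handle_request("clientA", 1, increment, (5,)))   # duplicate: still 5
print(counter["value"])                                # 5, not 10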

Client-Server Binding

It is necessary for a client (a client stub) to know the location of the server before a remote procedure call can take place. The process by which a client becomes associated with a server so that calls can take place is known as binding. Client-server binding involves handling several issues: How does a client specify a server to which it wants to get bound?

How does the binding process locate the specified server? When is it proper to bind a client to server? Is it possible for a client to change a binding during execution? Can a client be simultaneously bound to multiple servers that provide the same service?
Server Naming: Birrell and Nelson's proposal

The specification by a client of a server with which it wants to communicate is primarily a naming issue. An interface name has two parts - a type and an instance. The type specifies the interface itself, and the instance specifies a server providing the services within that interface. For example, there may be an interface type file server, and there may be many instances of servers providing file service. The type part also generally has a version number field to distinguish between old and new versions of the interface (which may provide different sets of services). Interface names are created by users. The RPC package only dictates the means by which an importer uses the interface name to locate an exporter.

Server Locating:

The interface name of a server is its unique identifier. When the client specifies the interface name of a server for making a remote procedure call, the server must be located before the client's request message can be sent to it. This is primarily a locating issue and any locating mechanism can be used for this purpose. The most common methods used for locating are described below:

i) Broadcasting: A broadcast message is sent to locate the server. The first server responding to this message is used by the client. This works well for small networks. ii) Binding Agent: A binding agent is basically a name server used to bind a client to a server by providing information about the desired server. The binding agent maintains a binding table, which is a mapping of a server's interface name to its location. All servers register themselves with the binding agent as a part of their initialization process.

To register, the server gives the binder its identification information and a handle used to locate it (for example, an IP address). The server can deregister when it is no longer prepared to offer the service. The binding agent's location is known to all nodes. The binding agent interface has three primitives: register and deregister (used by servers) and lookup (used by clients). The time at which a client becomes bound to a server is called the binding time. If the client and server modules are programmed as if they were linked together, it is known as binding at compile time.

For example, a server's network address can be compiled into the client's code. This scheme is very inflexible because if the server moves, or the server is replicated, or the interface changes, all client programs need

to be recompiled. However, it is useful in an application whose configuration is expected to last for a fairly long time.
iii) Binding at Link Time: A server exports its service by registering with the binding agent as part of the initialization process.

A client then makes an import request to the binding agent before making a call. The binding agent binds the client and server by returning the server's handle. The server's handle is cached by the client to avoid contacting the binding agent for subsequent calls.
iv) Binding at Call Time: A client is bound to a server at the time when it calls the server for the first time during execution. v) Indirect Call Method: When a client calls the server for the first time, it passes the server's interface name and the arguments of the RPC call to the binding agent. The binding agent looks up the location of the target server and forwards the RPC message to it. When the target server returns the results to the binding agent, the binding agent returns the results along with the handle of the target server to the client. The client can subsequently call the target server directly.
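Looking back at the binding agent's three primitives (register, deregister, lookup), a toy binding agent can be sketched in a few lines; the interface names and the server handle shown are invented for the example:

class BindingAgent:
    def __init__(self):
        self.table = {}    # interface name (type, instance) -> server handle

    def register(self, interface, handle):
        self.table[interface] = handle        # called by a server at start-up

    def deregister(self, interface):
        self.table.pop(interface, None)       # server withdraws the service

    def lookup(self, interface):
        return self.table.get(interface)      # called by a client stub

agent = BindingAgent()
agent.register(("file_server", "instance_1"), "10.0.0.5:4000")
handle = agent.lookup(("file_server", "instance_1"))
print(handle)     # the client caches this handle for subsequent calls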

3. Discuss the following algorithms: o Centralized Server algorithm o Dynamic Distributed Server algorithm

A3) Centralized-Server Algorithm


A central server maintains a block table containing owner-node and copy-set information for each block. When a read/write fault for a block occurs at node N, the fault handler at node N sends a read/write request to the central server. Upon receiving the request, the central server does the following:

=> If it is a read request: it adds N to the copy-set field and sends the owner-node information to node N. Upon receiving this information, N sends a request for the block to the owner node, and upon receiving this request, the owner returns a copy of the block to N.

=> If it is a write request: it sends the copy-set and owner information of the block to node N and initializes the copy-set to {N}. Node N then sends a request for the block to the owner node and an invalidation message to all nodes in the copy-set. Upon receiving this request, the owner sends the block to node N.
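The bookkeeping performed by the central server can be summarized in the following sketch, which keeps only the owner and copy-set per block and returns the information the faulting node needs to complete the read or write (node and block names are invented; the actual block transfer and invalidation messages are left as comments):

class CentralServer:
    def __init__(self):
        # block id -> {"owner": node, "copyset": set of nodes holding copies}
        self.blocks = {}

    def read_fault(self, block, node):
        entry = self.blocks[block]
        entry["copyset"].add(node)            # node will now hold a copy
        return entry["owner"]                 # node asks the owner for the block

    def write_fault(self, block, node):
        entry = self.blocks[block]
        old_owner, old_copyset = entry["owner"], entry["copyset"]
        entry["owner"], entry["copyset"] = node, {node}
        # node must invalidate the old copies and fetch the block from old_owner
        return old_owner, old_copyset

server = CentralServer()
server.blocks["blk7"] = {"owner": "N1", "copyset": {"N1"}}
print(server.read_fault("blk7", "N2"))    # N2 fetches from N1
print(server.write_fault("blk7", "N3"))   # N3 invalidates {N1, N2}, becomes owner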

Dynamic Distributed Server Algorithm

Under this approach, there is no block manager. Each node maintains information about the probable owner of each block, and also the copy-set information for each block for which it is the owner. When a block fault occurs, the fault handler sends a request to the probable owner of the block. Upon receiving the request: => if the receiving node is not the owner, it forwards the request to the probable owner of the block according to its table. => if the receiving node is the owner, then: if the request is a read request, it adds the entry N to the copy-set field of the entry corresponding to the block and sends a copy of the block to node N; if the request is a write request, it sends the block and the copy-set information to node N and deletes the entry corresponding to the block from its block table. Node N, upon receiving the block, sends an invalidation request to all nodes in the copy-set and updates its block table to reflect the fact that it is now the owner of the block.
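The probable-owner chain can be sketched as follows: each node keeps a guess of the owner of each block, and a request is forwarded along these guesses until the real owner is reached. Only the read-fault path is shown to keep the sketch short, and the node and block names are invented:

class Node:
    def __init__(self, name):
        self.name = name
        self.prob_owner = {}   # block -> this node's guess of the owner
        self.owned = {}        # block -> {"data": ..., "copyset": set()}

    def handle_read(self, block, requester):
        if block in self.owned:                       # this node is the owner
            self.owned[block]["copyset"].add(requester.name)
            return self.owned[block]["data"]
        # not the owner: forward along the probable-owner chain
        return self.prob_owner[block].handle_read(block, requester)

a, b, c = Node("A"), Node("B"), Node("C")
c.owned["blk1"] = {"data": "value", "copyset": set()}
a.prob_owner["blk1"] = b        # A's (stale) guess
b.prob_owner["blk1"] = c        # B knows the real owner
print(a.handle_read("blk1", a)) # request travels A -> B -> C, prints "value"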

4. Discuss the Election algorithms.

A4) Election Algorithms


Several distributed algorithms require that there be a coordinator process in the entire system that performs some type of coordination activity needed for the smooth running of other processes in the system. Two examples of such coordinator processes encountered in this unit are the coordinator in the centralized algorithm for mutual exclusion and the central coordinator in the centralized deadlock algorithm. Since all other processes in the system have to interact with the coordinator, they all must unanimously agree on who the coordinator is. Furthermore, if the coordinator process fails due to the failure of the site on which it is located, a new coordinator process must be elected to take up the job of the failed coordinator. Election algorithms are meant for electing a coordinator process from among the

currently running processes in such a manner that at any instance of time there is a single coordinator for all processes in the system. Election algorithms are based on the following assumptions: 1. Each process in the system has a unique priority number. 2. Whenever an election is held, the process having the highest priority number among the currently active processes is elected as the coordinator. 3. On recovery, a failed process can take appropriate actions to rejoin the set of active processes. Therefore, whenever initiated, an election algorithm basically finds out which of the currently active processes has the highest priority number and then informs all the active processes of it.
(i) The Bully Algorithm

This algorithm was proposed by Garcia-Molina. In this algorithm it is assumed that every process knows the priority number of every other process in the system. The algorithm works as follows: When a process (say Pi) sends a request message to the coordinator and does not receive a reply within a fixed timeout period, it assumes that the coordinator has failed. It then initiates an election by sending an election message to every process with a higher priority number than itself. If Pi does not receive any response to its election message within a fixed timeout period, it assumes that among the currently active processes it has the highest priority number. Therefore it takes up the job of the coordinator and sends a message (call it the coordinator message) to all processes having lower priority numbers than itself, informing them that from now on it is the new coordinator. On the other hand, if Pi receives a response to its election message, this means that some other process having a higher priority number is alive. Therefore Pi does not take any further action and just waits to receive the final result (a coordinator message from the new coordinator) of the election it initiated. When a process (say Pj) receives an election message, it sends a response message to the sender informing it that it is alive and will take over the election activity, and Pj then holds an election if it is not already holding one. In this way, the election activity gradually moves on to the process that has the highest priority number among the currently active processes, which eventually wins the election and becomes the new coordinator.
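A compact simulation of the bully algorithm is given below. The "processes" are plain objects with a priority number and an alive flag, message passing is replaced by direct method calls, and the responders are collapsed into a single recursive call to the highest-priority responder, which yields the same final outcome (all identifiers are invented):

class Process:
    def __init__(self, priority, alive=True):
        self.priority = priority
        self.alive = alive

    def election(self, processes):
        # contact every process with a higher priority number
        higher = [p for p in processes if p.alive and p.priority > self.priority]
        if not higher:
            return self            # nobody higher answered: I am the coordinator
        # some higher-priority process is alive; it takes over the election
        return max(higher, key=lambda p: p.priority).election(processes)

procs = [Process(1), Process(2), Process(3), Process(4, alive=False)]
coordinator = procs[0].election(procs)   # P1 notices the old coordinator failed
print("new coordinator priority:", coordinator.priority)   # 3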

(ii) A Ring Algorithm

This algorithm assumes that all the processes in the system are organized in a logical ring. The ring is unidirectional in the sense that all the messages related to the election algorithm are always passed only in one direction (clockwise / anticlockwise). Every process in the system knows the structure of the ring, so that while trying to circulate a message over the ring, if the successor of the sender process is down, the sender can skip over the successor, or the one after that, until an active member is located. The algorithm works as follows:

When a process (say Pi) sends a request message to the current coordinator and does not receive a reply within a fixed timeout period, it assumes that the coordinator has crashed. Therefore it initiates an election by sending an election message to its successor (actually to the first successor that is currently active). This message contains the priority number of process Pi. On receiving the election message, the successor appends its own priority number to the message and passes it on to the next active member in the ring. This member appends its own priority number to the message and forwards it to its own successor. In this manner, the election message circulates over the ring from one active process to another and eventually returns to process Pi. Process Pi recognizes the message as its own election message by seeing that in the list of priority numbers held within the message the first priority number is its own priority number. Note that when process Pi receives its own election message, the message contains the list of priority numbers of all processes that are currently active. Therefore, of the processes in this list, it elects the process having the highest priority number as the new coordinator. It then circulates a coordinator message over the ring to inform all the other active processes who the new coordinator is. When the coordinator message comes back to process Pi after completing one round along the ring, it is removed by process Pi. At this point all the active processes know who the current coordinator is. When a process (say Pj) recovers after a failure, it creates an inquiry message and sends it to its successor. The message contains the identity of process Pj. If the successor is not the current coordinator, it simply forwards the inquiry message to its own successor. In this way, the inquiry message moves forward along the ring until it reaches the current coordinator. On receiving an inquiry message, the current coordinator sends a reply to process Pj informing it that it is the current coordinator. Notice that in this algorithm two or more processes may almost simultaneously discover that the coordinator has crashed, and then each one may circulate an election message over the ring. Although this results in a little waste of network bandwidth, it does not cause any problem because every process that initiated an election will receive the same list of active processes, and all of them will choose the same process as the new coordinator.
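The circulation of the election message over the ring can be simulated as follows: each active process appends its priority number and passes the message on, crashed successors are skipped, and the initiator picks the maximum when the message returns (the priority numbers are invented):

def ring_election(priorities, alive, initiator):
    n = len(priorities)
    message = [priorities[initiator]]      # initiator puts its number first
    pos = (initiator + 1) % n
    while pos != initiator:
        if alive[pos]:                     # skip crashed successors
            message.append(priorities[pos])
        pos = (pos + 1) % n
    # message is back at the initiator: the highest priority number wins
    coordinator = max(message)
    print("election message:", message, "-> new coordinator:", coordinator)
    return coordinator

# ring of five processes; the process with priority 9 (old coordinator) is down
ring_election(priorities=[3, 7, 9, 1, 5], alive=[True, True, False, True, True],
              initiator=3)
# election message: [1, 5, 3, 7] -> new coordinator: 7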

5. Discuss the implementation of threads in Distributed Systems.

A5) In a distributed system the processing of information is distributed over several computers rather than being limited to a single machine. Here, the implementation details of developing a powerful system out of various systems are completely hidden from the user. Recent developments in hardware technology have resulted in small offices using networked multiple-computer systems. We can consider a Local Area Network (LAN) as an example of a distributed system; a LAN is used to connect hundreds of heterogeneous computers. The Internet can also be considered an example of such a system. The advantages of distributed systems are as follows: It allows optimal usage of the available resources on a network. The combined computing power of many network nodes increases the performance.

It reduces the cost involved in maintenance. For example, by upgrading particular software on a single server, it is possible to upgrade the various clients connected to the server. In a distributed system, if a CPU in a multiprocessor system or a computer on the network crashes, it does not affect the rest of the system. Total computing power increases when the processing of information is distributed. Database applications that use a client-server model are a good example of applications that are inherently distributed. The main drawback of a distributed system is that it uses software completely different from that of a centralized system. To overcome this, various organizations have developed their own technologies for distributed computing.

Broker architecture

Now that you have an idea of a distributed architecture, let us now understand the working of the broker architecture, an architectural pattern for distributed computing.

The broker architectural pattern is one that can be utilized to provide a basic structure for a distributed application with decoupled components. The components here interact by remote service invocations. In this type of architecture we can see that each application is connected to a central component that behaves like a broker. No application communicates directly with another application; all communication happens through the broker. Communication tasks such as forwarding requests and transmitting results and exceptions are the responsibility of the broker component. Let us now learn this pattern using a context, problem, and solution as we did in our previous unit. We shall consider a context wherein there is an environment consisting of distributed systems. It could also consist of heterogeneous systems which are made up of independent and cooperating components. Let us now discuss the problem. Building a system as a set of decoupled, interoperating components is a very good idea. However, it is advisable to have some means of inter-process communication, and when components handle this by themselves, it results in dependencies and limitations. We need to consider the following points: The system should provide services to add, delete, exchange, activate, and locate components. The applications utilizing these services should not depend on system-specific details, in order to facilitate portability and interoperability. From the point of view of a developer, there should be no difference between developing software for centralized systems and developing software for distributed systems. The application using an object has to look only at the interface that is offered by the object. It need not know about the implementation details of the object, or even about its physical location.

Solution: introduce a broker component to achieve better decoupling of clients and servers. Servers register themselves with the broker, and clients and servers communicate only through it. Clients access servers by sending requests via the broker; locating the appropriate server, forwarding requests to it, and transmitting results and exceptions back to the clients are the tasks of the broker. The broker architecture therefore lets applications access distributed services by sending message calls to the appropriate object, rather than dealing with low-level inter-process communication. It also allows dynamic changes such as the addition, deletion, and relocation of objects. Because the pattern makes distribution transparent to the developer, it reduces the complexity of developing distributed systems. A minimal sketch of this interaction follows the component descriptions below.

The broker architecture has six components:
Servers: a server implements objects that expose their functionality through interfaces consisting of operations and attributes. These interfaces are made available using an Interface Definition Language (IDL) or a binary standard. There are two kinds of servers: those that offer common services to many application domains, and those that implement specific functionality for a single application domain or task.
Clients: applications that use the service of at least one server. Clients call remote services by forwarding requests to the broker and then receive responses or exceptions from the broker.
Broker: a messenger responsible for transmitting requests from clients to servers and for transmitting responses and exceptions from servers back to the clients.
Client-side proxies: a layer between the clients and the broker. It provides the transparency through which a remote object appears to the client as a local one, and it hides the implementation details from the clients.
Server-side proxies: a layer between the broker and the server. It is responsible for receiving requests, unpacking incoming messages, un-marshalling the parameters, and calling the appropriate service.
Bridges: optional components used for communication between brokers. They hide implementation details and are helpful in a heterogeneous network because they take care of communication between different types of networks.
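To make these roles concrete, here is a minimal, single-process Python sketch of the solution, assuming a very simplified broker: servers register themselves, clients send requests only to the broker, and the broker locates the server and returns the result or the exception. The class names (Broker, WeatherServer) are invented for illustration; real broker middleware involves networking, proxies, and marshalling that are omitted here.

    # Minimal, single-process sketch of the broker solution; class and method
    # names are invented for illustration, networking and proxies are omitted.
    class Broker:
        def __init__(self):
            self._servers = {}                    # service name -> registered server object

        def register(self, service_name, server):
            self._servers[service_name] = server  # the server registers itself with the broker
            return "ACK"                          # acknowledgement sent back to the server

        def forward_request(self, service_name, operation, *args):
            server = self._servers.get(service_name)       # locate the appropriate server
            if server is None:
                raise LookupError("no server registered for " + service_name)
            try:
                return getattr(server, operation)(*args)   # forward the request to the server
            except Exception as exc:
                return exc                        # transmit the exception back to the client

    class WeatherServer:
        def report(self, city):
            return "Sunny in " + city

    broker = Broker()
    broker.register("weather", WeatherServer())                 # server registration
    print(broker.forward_request("weather", "report", "Pune"))  # client request via the broker

In a real deployment the client would, of course, reach the broker through a client-side proxy over the network rather than calling it directly.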

Let us now study two scenarios that describe the operation of a local broker component.

Scenario 1: a server registers itself with the local broker component.
During the initialization phase of the system, the broker is started; it enters its event loop and waits for incoming requests.
The server application is started by a user or some other entity, executes its initialization code, and then registers itself with the broker.
The broker receives the incoming registration request, extracts the necessary information, stores it in one or more repositories, and sends an acknowledgement back.
Once the server receives the acknowledgement from the broker, it enters its main loop and waits for client requests.

Scenario 2: a client sends a request to the server (the marshalling steps are sketched in code after this scenario).
The client application is started and issues a request.
The client-side proxy marshals all parameters and related data into a message and forwards this message to the local broker.
The broker looks up the destination of the requested server in its repositories. Because the server is locally available, the broker forwards the message to the server-side proxy.
The server-side proxy unpacks the message, un-marshals the parameters, and invokes the appropriate service.
When execution of the service is complete, the server returns the result to the server-side proxy, which packages it into a message together with other necessary information and sends it to the broker.
The broker forwards the message to the client-side proxy, which receives the response, unpacks the result, and hands it to the client application.
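The marshalling performed by the proxies in the second scenario can be illustrated with the following hedged Python sketch. The message format (a small JSON document) and the class names are assumptions made for this example only, and the broker hop between the two proxies is omitted for brevity.

    # Illustrative marshalling sketch; the JSON message format and the class
    # names are assumptions made for this example.
    import json

    class ClientSideProxy:
        def __init__(self, send):
            self._send = send                    # callable that delivers the message (normally via the broker)

        def call(self, service, operation, **params):
            message = json.dumps({"service": service,
                                  "operation": operation,
                                  "params": params})         # condense the call into a message
            reply = self._send(message)
            return json.loads(reply)["result"]               # unpack the response for the client

    class ServerSideProxy:
        def __init__(self, server):
            self._server = server

        def handle(self, message):
            request = json.loads(message)                    # unpack the incoming message
            result = getattr(self._server, request["operation"])(**request["params"])
            return json.dumps({"result": result})            # package the result for the return trip

    class EchoServer:
        def echo(self, text):
            return text.upper()

    server_proxy = ServerSideProxy(EchoServer())
    client_proxy = ClientSideProxy(send=server_proxy.handle)        # broker hop omitted for brevity
    print(client_proxy.call("echo-service", "echo", text="hello"))  # prints HELLO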
Let us now discuss the steps to implement the broker architectural pattern:
Step 1: Define the object model, or reuse an existing one. Every object model must specify entities such as object names, requests, exceptions, values, and so on. The choice of object model affects all other parts of the system.
Step 2: Decide which kind of component interoperability the system should offer. Interoperability can be designed using a binary standard or an IDL (a rough, Python-based stand-in for such an interface definition is sketched after the WWW example below).
Step 3: Specify the Application Programming Interface (API) that the broker provides for client-server collaboration.
Step 4: Hide the implementation details from clients and servers by using proxy objects.
Step 5: Design the broker component in parallel with steps 3 and 4.
Step 6: Provide an IDL compiler for every programming language that is supported.

The broker architecture has many uses. One example is the World Wide Web (WWW), the largest broker system in the world: WWW servers are the service providers, and hypertext browsers such as Mosaic and Netscape act as brokers. Clients do not need to know the location of a server, because the broker locates it using a unique identifier.
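As a rough, Python-based stand-in for the interface definition mentioned in steps 1 to 4 (it is not a real IDL), the sketch below defines a service interface once and lets the server implement it; in a real system an IDL compiler would generate the proxies from such a definition. The AccountService example and its operations are invented for illustration.

    # Rough stand-in for an IDL-defined interface; AccountService and its
    # operations are invented for illustration.
    from abc import ABC, abstractmethod

    class AccountService(ABC):
        """Interface that would normally be written once in an IDL and
        compiled for each supported programming language (step 6)."""

        @abstractmethod
        def balance(self, account_id: str) -> float: ...

        @abstractmethod
        def deposit(self, account_id: str, amount: float) -> float: ...

    class AccountServiceImpl(AccountService):
        """Server-side implementation; its location stays hidden behind the broker."""

        def __init__(self):
            self._accounts = {}

        def balance(self, account_id):
            return self._accounts.get(account_id, 0.0)

        def deposit(self, account_id, amount):
            self._accounts[account_id] = self.balance(account_id) + amount
            return self._accounts[account_id]

    service = AccountServiceImpl()
    print(service.deposit("acc-1", 50.0))   # -> 50.0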

The broker pattern has several benefits. There is no functional impact on clients even if a server changes, as long as its interface remains the same. Since a broker system uses indirection layers such as APIs, proxies, and bridges to hide operating-system and network details from clients and servers, porting the system usually requires porting only the broker component; thus the pattern supports portability. Different broker systems can interoperate if they understand a common protocol for exchanging messages, so the pattern also supports interoperability. Existing services can be reused to build new client applications, which reduces the amount of new code that has to be written.

The broker pattern also has some liabilities. Applications that use a broker architecture are slower than applications whose component distribution is static and known. If a server or a broker fails during program execution, every application that uses that server or broker fails with it; reliability can be increased by replicating components. A hedged sketch of broker replication from the client's point of view is given below.
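The sketch below illustrates, under assumed names and interfaces, one way replication could mask a broker failure: the client-side proxy keeps a list of replica brokers (each assumed to expose a forward_request operation like the earlier sketch) and fails over to the next replica when a call fails.

    # Hedged sketch of client-side failover across replicated brokers; each
    # broker object is assumed to expose a forward_request operation as in
    # the earlier sketch. All names are illustrative.
    class BrokerUnavailable(Exception):
        """Raised when a broker replica cannot be reached."""

    class FailoverClientProxy:
        def __init__(self, brokers):
            self._brokers = list(brokers)        # replicated broker components

        def call(self, service, operation, *args):
            last_error = None
            for broker in self._brokers:         # try the replicas in order
                try:
                    return broker.forward_request(service, operation, *args)
                except (BrokerUnavailable, ConnectionError) as exc:
                    last_error = exc             # remember the failure, move on to the next replica
            raise BrokerUnavailable("all broker replicas failed") from last_error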

6. Discuss the following concepts with respect to Naming: o Desirable features of a good naming system o Name Caches

A6)
Desirable Features of a Good Naming System

A good naming system for a distributed system should have the following features:
i) Location transparency: The name of an object should not reveal any hint about the physical location of the object.
ii) Location independency: The name of an object should not need to change when the object's location changes. A location-independent naming system must therefore support a dynamic mapping scheme, so that an object at any node can be accessed without knowledge of its physical location, and an object at any node can issue an access request without knowledge of its own physical location. (A small sketch of such a dynamic mapping follows this list.)
iii) Scalability: The naming system should be able to handle the dynamically changing scale of a distributed system.
iv) Uniform naming convention: The system should use the same naming conventions for all types of objects.
v) Multiple user-defined names for the same object: The naming system should provide the flexibility to assign multiple user-defined names to the same object.
vi) Group names: The naming system should allow many different objects to be identified by the same name.
vii) Meaningful names: The naming system should support at least two levels of object identifiers, one convenient for human users and the other convenient for machines.
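As an illustration of the dynamic mapping required for location independency (item ii above), the following small Python sketch uses an assumed two-level mapping: a user-defined name maps to a location-independent object identifier, which in turn maps to the object's current location and is updated when the object migrates. All names, identifiers, and node addresses are invented for the example.

    # Simplified sketch of the two-level dynamic mapping; names, identifiers
    # and node addresses below are invented for the example.
    class NamingService:
        def __init__(self):
            self._name_to_id = {}        # user-defined name -> location-independent object id
            self._id_to_location = {}    # object id         -> current node address

        def bind(self, name, object_id, location):
            self._name_to_id[name] = object_id
            self._id_to_location[object_id] = location

        def migrate(self, object_id, new_location):
            # only the second-level mapping changes; the user-visible name is untouched
            self._id_to_location[object_id] = new_location

        def resolve(self, name):
            object_id = self._name_to_id[name]
            return self._id_to_location[object_id]

    ns = NamingService()
    ns.bind("reports-db", object_id="obj-42", location="nodeA:5000")
    ns.migrate("obj-42", "nodeB:5000")           # the object moves; its name does not change
    print(ns.resolve("reports-db"))              # -> nodeB:5000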

Name Caches
Caching can help increase the performance of name resolution operations for the following reasons:
i) High degree of locality of name lookups: due to locality of reference, a reasonably sized cache holding recently used naming information can significantly improve performance.
ii) Slow update of the name information database: naming data does not change fast (the read/write ratio of naming data is very high), so the cost of maintaining consistency of the cached data is low.
iii) On-use consistency of cached information is possible: name-cache consistency can be maintained by detecting and discarding stale cache entries when they are used.

The main issues related to name caches are discussed below.

Types of name caches:
Directory cache: all recently used directory pages that are brought to the client node during name resolution are cached for a while. The advantage is that when a directory is accessed, the contents of its pages are likely to be used again for operations such as ls or a lookup of "..". The disadvantage is that, to obtain one useful entry (the directory entry being resolved), an entire page of directory entries occupies a large area of the cache.
Prefix cache: used in the zone-based context distribution mechanisms discussed earlier.
Full-name cache: each entry consists of an object's full path name together with the identifier and location of its authoritative name server.

Approaches for name cache implementation:
A cache per process: a separate cache is maintained for each process. Since each cache is kept in the process's own address space, access is fast. However, every new process must build its name cache from scratch, so the hit ratio is low because of start-up misses (to reduce start-up misses, a process can inherit the name cache of its parent, as the V-System does), and naming information may be duplicated unnecessarily at a node.
A cache per node: all processes at a node share the same cache, which overcomes some of the problems of the per-process approach. However, the cache must reside in the operating-system area, so access can be slower.

Approaches for maintaining consistency of name caches:
1. Immediate invalidation: all related name cache entries are invalidated immediately after an update. This can be done in one of two ways: whenever naming data is updated, an invalidation message identifying the data to be invalidated is sent to all nodes so that each node can update its cache (this is expensive in large systems), or the invalidation message is sent only to the nodes that have cached the data.
2. On-use update: when a client uses stale cached data, the naming system informs it that the data is stale, so that the client can obtain the updated data. (An illustrative sketch of this approach follows below.)
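The on-use update approach can be illustrated with the following hedged Python sketch of a full-name cache. The NameServer interface and the StaleBindingError signal are assumptions made for this example; the point is that a stale entry is discovered only when it is used, and is then discarded and re-resolved.

    # Illustrative on-use consistency sketch; NameServer and StaleBindingError
    # are assumptions made for this example.
    class StaleBindingError(Exception):
        """Raised by the remote operation when the object is no longer at the cached location."""

    class NameServer:
        """Stand-in for the authoritative name server."""
        def __init__(self, table):
            self._table = dict(table)            # full path name -> current location

        def resolve(self, name):
            return self._table[name]

    class FullNameCache:
        def __init__(self, name_server):
            self._server = name_server
            self._cache = {}                     # full path name -> cached location

        def access(self, name, use_object):
            location = self._cache.get(name)
            if location is None:
                location = self._server.resolve(name)    # cache miss: ask the name server
                self._cache[name] = location
            try:
                return use_object(location)              # try the cached binding first
            except StaleBindingError:
                self._cache.pop(name, None)              # the entry proved stale on use: discard it
                location = self._server.resolve(name)    # re-resolve from the authoritative server
                self._cache[name] = location
                return use_object(location)              # retry with the fresh binding

Here use_object stands for the actual remote operation, for example contacting the file server at the cached address; the cache is corrected only when a stale binding is actually exercised.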
