
Brightway Computers Distributed Systems

List of Questions
Unit – I: Introduction to Distributed System
Essay Questions:

1. What is meant by distributed system and explain the advantages of distributed system?
2. Explain the system models and what are the types of system models in distributed system?
3. Explain the Designing issues in Distributed Operating System?
4. Discuss the example of distributed system?

Short Answer Question:

1. What is meant by Openness?


2. What is the difference between a shared-nothing parallel system and a distributed system?
3. Explain the disadvantages of distributed systems?
4. What is meant by Naming?
5. Why do we develop distributed systems and distributed coordination?
6. Write any four properties of distributed algorithms?

Unit – II: Message Passing and RPC

Essay Questions:

1. What is meant by message passing and explain in detail?


2. Explain the features of message passing systems?
3. Discuss in detail about Synchronization?
4. Define Buffering and explain it?
5. Explain the client – server architecture model of RPC?
6. Explain the Implementation Mechanism in RPC?
7. Explain the RPC Message?
8. Explain the Call Semantics in RPC?
9. Explain the Communication Protocols?
10. Discuss the Client – Server Binding?

Short Answer Questions:

1. What is explicit and implicit addressing?


2. What is Group Communication?
3. What is Atomic Multicast?
4. What are the components of RPC?
5. How to send a request and how to get the response?
6. Discuss the issues in IPC by Message Passing?
7. Discuss the basic concepts of RPC?

8. What are the Transparency issues of RPC?
9. Discuss the stub generation?
10. Discuss the server management?

Unit – III: Introduction to DSM

Essay Questions

1. What is meant by DSM and explain the design and implementation of DSM Systems?
2. Explain the Granularity?
3. Explain the Consistency Model?
4. Explain the Clock Synchronization?
5. Discuss the Event Ordering?
6. Explain the Mutual exclusion?
7. Define the Deadlock and explain the Deadlock?
8. Explain the Election Algorithm?

Short Answer Questions

1. What is meant by Wireless Elections?


2. Discuss the Read – Replication Algorithm?
3. What is the difference between message passing and DSM?
4. What is meant by Memory Consistency?
5. What is meant by Thrashing?
6. Discuss the Issues of DSM?
7. What are the Advantages of DSM?

Unit – IV: Tasks and Loading

Essay Questions:

1. Define the Task and explain the Task Assignment Approach?


2. Explain the Load Balancing Approach?
3. Explain the Load Sharing Approach?
4. Discuss the Migration?
5. Explain the Threads?

Short Answer Questions

1. What is meant by resource management?


2. Discuss the Issues of Designing of Load Balancing Algorithms?
3. What is meant by Processor and Process?
4. Discuss the Transparency in Client Side?
5. What is meant by file and explain the file models?
6. Explain the File Caching Schemes?
7. Explain the Atomic Transactions?
8. Explain the Authentication?

Short Answer Questions

1. Discuss the features of distributed file system?


2. Explain the functions of distributed file system?
3. Explain the File Accessing Models?
4. Explain the File Sharing Semantics?


UNIT –I
Introduction to Distributed Systems
1. What is meant by distributed system and explain the advantages of distributed system?
A distributed system is a collection of independent computers that appear to the users of the
system as a single system.
Examples:
1. Network of workstations
2. Distributed manufacturing system (e.g., automated assembly line)
3. Network of branch office computers.
(i) Advantages of Distributed Systems over Centralized Systems:

a. Economics: A collection of microprocessors offer a better price/ performance than


mainframes. Low price/ performance ratio: cost effective way to increase computing
power.
b. Speed: A distributed system may have more total computing power than a mainframe. Ex.
10,000 CPU chips, each running at 50 MIPS. Not possible to build 500,000 MIPS single
processor since it would require 0.002 nsec instruction cycle. Enhanced performance
through load distributing.
c. Inherent distribution: Some applications are inherently distributed. Example a
supermarket chain.
d. Reliability: If one machine crashes, the system as a whole can still survive. Higher
availability and improved reliability.
e. Incremental growth: Computing power can be added in small increments. Modular
expandability.
f. Another driving force: the existence of large numbers of personal computers and the need for
people to collaborate and share information.
(ii) Advantages of Distributed Systems over Independent PCs:
a. Data Sharing: allow many users to access a common database.
b. Resource Sharing: expensive peripherals like color printers.
c. Communication: enhance human-to-human communication, e.g., email, chat.

d. Flexibility: spread the workload over the available machines.

2. Explain the system models and what are the types of system models in distributed
systems?
Distributed System Models is as follows:
(1) Architectural Models.
(2) Interaction Models

(3) Fault Models.

(1) Architectural Models: An architectural model describes how responsibilities are distributed
between system components and how these components are placed.
i. Client-server model:

The system is structured as a set of processes called servers that offer services to the users called
clients.
a. The client-server model is usually based on a simple request/reply protocol, implemented
with send/receive primitives or using Remote Procedure Calls (RPC) or Remote Method
Invocation (RMI).
b. The client sends a request (invocation) message to the server asking for some service.
c. The server does the work and returns a result (e.g. the data requested) or an error code if
the work could not be performed.

Fig: Client-server model (client processes send request messages to server processes, which return result messages; circles denote processes/objects, boxes denote computers/nodes)

A server can itself request services from other servers; thus, in this new relation, the server itself
acts like a client.
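A minimal sketch of this request/reply pattern in Python, using TCP sockets (the address, port and echo-style service are illustrative assumptions, not part of the model itself):

# server sketch: accept a request, do the work, return a result
import socket

def serve_once():
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as srv:
        srv.bind(("localhost", 9000))    # assumed address for the example
        srv.listen(1)
        conn, _ = srv.accept()
        with conn:
            request = conn.recv(1024)            # client's request message
            conn.sendall(b"result:" + request)   # reply with the result

def call_server():
    # client sketch: send a request (invocation) and wait for the reply
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as cli:
        cli.connect(("localhost", 9000))
        cli.sendall(b"service request")
        return cli.recv(1024)                    # result from the server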
ii. Peer-to-peer:
All processes (objects) play similar role.
a. Processes (objects) interact without particular distinction between clients and servers.
b. The pattern of communication depends on the particular application.

c. A large number of data objects are shared; any individual computer holds only a small part
of the application database.
d. Processing and communication loads for access to objects are distributed across many
computers and access links.
e. This is the most general and flexible model.

Fig: Peer-to-peer model (peer processes communicate directly with one another)

f. Peer-to-peer tries to solve some of the problems of the client-server model.
g. It distributes shared resources widely, thereby sharing computing and communication loads.

Problems with peer-to-peer: high complexity due to the need to
 cleverly place individual objects,
 retrieve the objects,
 maintain a potentially large number of replicas.


(2) Interaction Model: Interaction models deal with time, i.e., with process execution,
message delivery, clock drifts, etc.

(i) Synchronous Distributed Systems:


Main features:
1. Lower and upper bounds on execution time of processes can be set.
2. Transmitted messages are received within a known bounded time.
3. Drift rates between local clocks have a known bound.
Important consequences:
1. In a synchronous distributed system there is a notion of global physical time (with a
known relative precision depending on the drift rate).
2. Only synchronous distributed systems have a predictable behavior in terms of timing.
Only such systems can be used for hard real-time applications.
3. In a synchronous distributed system it is possible and safe to use timeouts in order to
detect failures of a process or communication link.
It is difficult and costly to implement synchronous distributed systems.
(ii) Asynchronous distributed systems:
Many distributed systems (including those on the Internet) are asynchronous:
– No bound on process execution time (nothing can be assumed about speed, load and reliability of computers).
– No bound on message transmission delays (nothing can be assumed about speed, load and reliability of interconnections).
– No bounds on drift rates between local clocks.
Important consequences:
1. In an asynchronous distributed system there is no global physical time. Reasoning can
be only in terms of logical time (see lecture on time and state).
2. Asynchronous distributed systems are unpredictable in terms of timing.
3. No timeouts can be used.
Asynchronous systems are widely and successfully used in practice.
In practice timeouts are used with asynchronous systems for failure detection.

However, additional measures have to be applied in order to avoid duplicated messages, duplicated
execution of operations, etc.
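A hedged sketch of such timeout-based failure detection in Python (the timeout value, retry count and socket usage are assumptions chosen for illustration; note that the retransmission here is exactly what creates the duplicate messages mentioned above):

import socket

def call_with_timeout(sock, request, retries=3, timeout=2.0):
    # Timeouts are safe in synchronous systems; in asynchronous systems
    # they are only a heuristic for presuming failure.
    sock.settimeout(timeout)
    for _ in range(retries):
        sock.send(request)
        try:
            return sock.recv(1024)     # reply arrived within the bound
        except socket.timeout:
            continue                   # presume loss and retransmit
    raise RuntimeError("no reply: process or link presumed failed")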

(3) Fault Models:

(i) Failures can occur both in processes and communication channels. The reason can be
both software and hardware faults.
(ii) Fault models are needed in order to build systems with predictable behavior in case of
faults (systems which are fault tolerant).
(iii) Such a system will function according to the predictions, only as long as the real faults
behave as defined by the “Fault Model”.

3. Explain the Designing issues in Distributed Operating System?


(i) A DOS is a collection of heterogeneous computers connected via a network.
(ii) The functions of a conventional operating system are distributed throughout the
network.

(iii) To users of DOS, it is as if the computer has a single processor.


(a) Security: Many of the information resources that are made available and maintained in
distributed systems have a high intrinsic value to their users. Their security is therefore of
considerable importance. Security for information resources has three components: confidentiality,
integrity and availability.
(b) Scalability: Distributed systems operate effectively and efficiently at many different
scales, ranging from a small intranet to the internet. A system is described as scalable if it will
remain effective when there is a significant increase in the number of resources and the number of
users.
(c) Failure Handling: Computer systems sometimes fail. When faults occur in hardware or
software, programs may produce incorrect results or may stop before they have completed the
intended computation. Failures in a distributed system are partial – that is, some components fail
while others continue to function. Therefore the handling of failures is particularly difficult.
(d) Concurrency: Both services and applications provide resources that can be shared by
clients in a distributed system. There is therefore a possibility that several clients will attempt to
access a shared resource at the same time. Object that represents a shared resource in a distributed
system must be responsible for ensuring that it operates correctly in a concurrent environment.
This applies not only to servers but also to objects in applications. Therefore any programmer who
takes an implementation of an object that was not intended for use in a distributed system must do
whatever is necessary to make it safe in a concurrent environment.
(e) Transparency: Transparency can be achieved at two different levels. Easiest to do is to
hide the distribution from the users. The concept of transparency can be applied to several aspects
of a distributed system.
1. Location Transparency: The users cannot tell where resources are located.
2. Migration Transparency: Resources can move at will without changing their names.
3. Replication Transparency: The users cannot tell how many copies exist.
4. Concurrency Transparency: multiple users can share resources automatically.

5. Parallelism Transparency: Activities can happen in parallel without users knowing.

(f) Quality of Service: Once users are provided with the functionality that they require of
a service, such as the file service in a distributed system, we can go on to ask about the quality of
the service provided. The main nonfunctional properties of systems that affect the quality of the
service experienced by clients and users are reliability, security and performance. Adaptability to
meet changing system configurations and resource availability has been recognized as a further
important aspect of service quality.
(g) Reliability: One of the original goals of building distributed systems was to make them
more reliable than single processor systems. The idea is that if a machine goes down, some other
machine takes over the job. A highly reliable system must be highly available, but that is not
enough. Data entrusted to the system must not be lost or garbled in any way, and if files are stored
redundantly on multiple servers, all the copies must be kept consistent. In general, the more copies
that are kept, the better the availability, but the greater the chance that they will be inconsistent,
especially if updates are frequent.
(h) Performance: Always lurking in the background is the issue of performance. Building a
transparent, flexible, reliable distributed system is of little value if the result is slow; performance
matters just as much. In particular, when running a particular application on a distributed system,
it should not be appreciably worse than running the same application on a single processor.
Unfortunately, achieving this is easier said than done.

4. Discuss the example of Distributed System?


(a) The Internet: A net of networks giving global access to "everybody" (data, services, other actors; open ended).
1. Enormous size (open ended)
2. No single authority
3. Communication types: interrogation, announcement, stream; data, audio, video.
Examples: Distributed Systems
1. The Internet:
(a) Heterogeneous network of computers and applications.
(b) Implemented through internet protocol.

(b) Intranets: A portion of the Internet that is separately administered and has a boundary that
can be configured to enforce local security policies. An intranet is composed of several LANs
linked by backbone connections and is connected to the Internet via a router.

(c) Mobile and ubiquitous computing: Technological advances in device miniaturization and
wireless networking have led increasingly to the integration of small and portable computing
devices into distributed systems. These devices include laptop computers; handheld devices,
including mobile phones, smart phones, GPS-enabled devices, pagers and personal digital assistants
(PDAs); video cameras and digital cameras; wearable devices, such as smart watches with
functionality similar to a PDA; and devices embedded in appliances such as washing machines, hi-fi
systems, cars and refrigerators.
The portability of many of these devices, together with their ability to connect conveniently to
networks in different places, makes mobile computing possible. Mobile computing is the
performance of computing tasks while the user is on the move, or visiting places other than their
usual environment. In mobile computing, users who are away from their ‘home’ intranet (the
intranet at work, or their residence) are still provided with access to resources via the devices they
carry with them. They can continue to access the internet; they can continue to access resources in
their home intranet; and there is increasing provision for users to utilize resources such as printers
or even sales points that are conveniently nearby as they move around. The latter is also known as
location-aware or context-aware computing. Mobility introduces a number of challenges for
distributed systems, including the need to deal with variable connectivity and indeed
disconnection, and the need to maintain operation in the face of device mobility.
(i) Portable devices:
 Laptops.
 Handheld devices.
 Wearable devices.
 Devices embedded in appliances.
(ii) Mobile computing.
(iii) Location – aware computing.
(iv) Ubiquitous computing, pervasive computing.
(d) Mobile Ad Hoc Networks: Mobile nodes come and go and there is no fixed infrastructure.
1. Wireless data communication.
2. Multihop networking.
3. Long, nondeterministic communication delays.
Typical problems: (a) reliable multicast, (b) group management.
(e) Resource Sharing and the Web:
1. Hardware resource (reduce costs)
2. Data resources (shared usage of information)
3. Service resources
4. Search engines
5. Computer- supported cooperative working
6. Service vs. Server (node or Process)
Examples of Distributed Systems:
IT Services architecture of a Swiss Bank:
1. Service architecture consists of heterogeneous new and legacy components.
2. Hardware platforms range from mainframes to NTs.
3. Programming languages including assembler, Cobol, C, C++, Java….
4. Different types of middleware can be used to resolve distribution and heterogeneity.

SHORT ANSWER QUESTIONS

1. What is meant by Openness?


1) Openness is concerned with extensions and improvements of distributed systems.
2) Detailed interfaces of components need to be published.
3) New components have to be integrated with existing components.
4) Differences in data representation of interface types on different processors (of different
vendors) have to be resolved.

2. What is the difference between shared nothing Parallel System and Distributed System?
In a distributed system, databases are geographically separated, they are administered
separately and have slower interconnection (whereas in a shared-nothing parallel system the nodes
are co-located, centrally administered and linked by a fast interconnect).
In distributed systems, we differentiate between local and global transactions. A local
transaction is one that accesses data only at the site at which the transaction was initiated. A global
transaction is one which either accesses data at a site different from the one at which the
transaction was initiated or accesses data at several different sites.

3. Explain the disadvantages of distributed systems?


The added complexity required to ensure proper co-ordination among the sites is the major
disadvantage. This increased complexity takes various forms:

Software Development Cost: It is more difficult to implement a distributed database
system; thus it is more costly.
Greater Potential for Bugs: Since the sites that constitute the distributed database system
operate in parallel, it is harder to ensure the correctness of algorithms, especially their operation
during failures of part of the system and recovery from failures. The potential exists for extremely
subtle bugs.
Increased Processing Overhead: The exchange of information and additional computation
required to achieve intersite co-ordination are a form of overhead that does not arise in centralized
system.

4. What is meant by Naming?


A name is resolved when it is translated into a form that can be interpreted to reference the
resource or object, for example a communication identifier (IP address + port number).

5. Why do we develop distributed systems and distributed coordination?


Because of the availability of powerful yet cheap microprocessors (PCs, workstations) and
continuing advances in communication technology.
Distributed coordination is needed because communication between processes in a distributed
system can have unpredictable delays, processes can fail and messages may be lost.

6. Write any four properties of distributed algorithms?


1. The relevant information is scattered among multiple machines.
2. Processes make decisions based only on locally available information.
3. A single point of failure in the system should be avoided.
4. No common clock or other precise global time source exists.

UNIT – II
MESSAGE PASSING AND RPC
Essay Questions
1. What is meant by message passing and explain in detail?
Inter process communication (IPC) basically requires information sharing among two or more
processes. Two basic methods for information sharing are as follows:
(a) Original sharing, or shared-data approach;
(b) Copy sharing, or message-passing approach.
Two basic interprocess communication paradigms are the shared-data approach and the message-passing approach.

Fig: The two IPC paradigms: (a) the shared-data approach, in which processes P1 and P2 access a shared common memory area; (b) the message-passing approach, in which P1 and P2 exchange messages directly.
In the shared-data approach the information to be shared is placed in a common memory
area that is accessible to all processes involved in an IPC.
In the message-passing approach, the information to be shared is physically copied from the
sender process's address space to the address space of all the receiver processes, and this is done by
transmitting the data to be copied in the form of messages (a message is a block of information).
A Message-Passing System is a subsystem of distributed operating system that provides a
set of message-based IPC protocols and does so by shielding the details of complex network
protocols and multiple heterogeneous platforms from programmers. It enables processes to
communicate by exchanging messages and allows programs to be written by using simple
communication primitives, such as send and receive.
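A sketch of these send and receive primitives using Python's multiprocessing pipes, which exhibit exactly the copy-sharing behaviour described above (the message content is an arbitrary example):

from multiprocessing import Pipe, Process

def receiver(conn):
    message = conn.recv()            # receive primitive: the message is
    print("received:", message)      # copied into this process's space

if __name__ == "__main__":
    parent_end, child_end = Pipe()
    p = Process(target=receiver, args=(child_end,))
    p.start()
    parent_end.send({"type": "greeting", "data": "hello"})  # send primitive
    p.join()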

2. Explain the features of message passing systems?

(a) Simplicity: A message passing system should be simple and easy to use. It should be
possible to communicate with old and new applications, with different modules without the
need to worry about the system and network aspects.
(b) Uniform Semantics: In a distributed system, a message-passing system may be used for
the following two types of inter process communication:
i. Local communication, in which the communicating processes are on the same node;
ii. Remote communication, in which the communicating processes are on different nodes.

Semantics of remote communication should be as close as possible to those of local
communications. This is an important requirement for ensuring that the message passing is easy to
use.
(c) Efficiency: An IPC protocol of a message-passing system can be made efficient by reducing
the number of message exchanges, as far as practicable, during the communication process.
Some optimizations normally adopted for efficiency include the following:
i. Avoiding the costs of establishing and terminating connections between the same
pair of processes for each and every message exchange between them.
ii. Minimizing the costs of maintaining the connections.
iii. Piggybacking of acknowledgement of previous messages with the next message
during a connection between a sender and a receiver that involves several messages
exchanges.
(d) Correctness: Correctness is a feature related to IPC protocols for group communication.
Issues related to correctness are as follows:
i. Atomicity;
ii. Ordered delivery;

iii. Survivability.
Atomicity ensures that every message sent to a group of receivers will be delivered to either
all of them or none of them. Ordered delivery ensures that messages arrive to all receivers in an
order acceptable to the application. Survivability guarantees that messages will be correctly
delivered despite partial failures of processes, machines, or communication links.

3. Discuss in detail about Synchronization?


A central issue in the communication structure is the synchronization imposed on the
communicating processes by the communication primitives. The semantics used for
synchronization may be broadly classified as blocking and nonblocking types. A primitive is said to
have nonblocking semantics if its invocation does not block the execution of its invoker (the
control returns almost immediately to the invoker); otherwise a primitive is said to be of the
blocking type.
In case of a blocking send primitive after execution of the send statement, the sending
process is blocked until it receives an acknowledgement from the receiver that the message has
been received. On the other hand, for nonblocking send primitive, after execution of the send
statement, the sending process is allowed to proceed with its execution as soon as the message has
been copied to a buffer.
An important issue in a nonblocking receive primitive is how the receiving process knows
that the message has arrived in the message buffer. One of the following two methods is commonly
used for this purpose:
(a) Polling: In this method, a test primitive is provided to allow the receiver to check the
buffer status. The receiver uses this primitive to periodically poll the kernel to check if
the message is already available in the buffer.

(b) Interrupt: In this method, when the message has been filled in the buffer and is ready
for use by the receiver, a software interrupt is used to notify the receiving process.

Fig: Synchronous mode of communication, with both send and receive primitives having blocking-type semantics (the sender's execution is suspended from Send(message) until the receiver's acknowledgement arrives; the receiver's execution is suspended at Receive(message) until the message arrives)
A variant of the nonblocking receive primitive is the conditional receive primitive, which also
returns control to the invoking process almost immediately, either with a message or with an
indicator that no message is available.
When both the send and receive primitives of a communication between two processes use
blocking semantics, the communication is said to be synchronous, otherwise it is asynchronous.
The main drawback of synchronous communication is that it limits concurrency and is subject to
communication deadlocks.
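The blocking/nonblocking distinction and the polling method can be sketched with Python's standard queue module, with queue.Queue standing in for the kernel message buffer (an illustration of the semantics, not a real kernel interface):

import queue

buffer = queue.Queue()       # stands in for the kernel message buffer

def blocking_receive():
    return buffer.get(block=True)    # invoker suspended until a message arrives

def conditional_receive():
    # nonblocking variant: returns immediately, either with a message or
    # with an indicator (None) that no message is available
    try:
        return buffer.get(block=False)
    except queue.Empty:
        return None

buffer.put("message")        # nonblocking send: proceed once copied to buffer
print(conditional_receive()) # -> 'message'
print(conditional_receive()) # -> None (polling finds the buffer empty)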

4. Define Buffering and explain it?


In the standard message passing model, messages can be copied many times: from the user
buffer to the kernel buffer (the output buffer of a channel), from the kernel buffer of the sending
computer (process) to the kernel buffer in the receiving computer (the input buffer of a channel),
and finally from the kernel buffer of the receiving computer (process) to a user buffer.

(a) Null buffer (No Buffering): In this case there is no place to temporarily store the
message. Hence one of the following implementation strategies may be used:
i. The message remains in the sender process’s address space and the execution of
the send is delayed until the receiver executes the corresponding receive.
ii. The message is simply discarded and the time-out mechanism is used to resend
the message after a timeout period. The sender may have to try several times
before succeeding.
The types of buffering strategies used in interprocess communication are illustrated below.

Fig: Types of buffers, (a) no buffering: the message passes directly from the sending process to the receiving process

(b) Single-Message Buffer: In single-message buffer strategy, a buffer having a capacity to
store a single message is used on the receiver’s node. This strategy is usually used for
synchronous communication; an application module may have at most one message
outstanding at a time.

Fig: (b) single-message buffer at the receiver's node boundary

Unbounded-Capacity Buffer: In the asynchronous mode of communication, since a sender does
not wait for the receiver to be ready, there may be several pending messages that have not yet been
accepted by the receiver. Therefore, an unbounded-capacity message buffer that can store all
unreceived messages is needed to support asynchronous communication with the assurance that
all the messages sent to the receiver will be delivered.
(c) Finite-Bound Buffer: Unbounded capacity of a buffer is practically impossible.
Therefore, in practice, systems using asynchronous mode of communication use finite-
bound buffers, also known as multiple-message buffers. In this case the message is first
copied from the sending process’s memory into the receiving process’s mailbox and
then copied from the mailbox to the receiver’s memory when the receiver calls for the
message.

Fig: (c) finite-bound multiple-message buffer (mailbox/port) holding messages 1 to n at the receiver's node

When the buffer has finite bounds, a strategy is also needed for handling the problem of a
possible buffer overflow. The buffer overflow problem can be dealt with in one of the following two
ways:
Unsuccessful communication: In this method, message transfers simply fail, whenever
there is no more buffer space and an error is returned.
Flow-Controlled Communication: The second method is to use flow control, which means
that the sender is blocked until the receiver accepts some messages, thus creating space in the
buffer for new messages. This method introduces a synchronization between the sender and the
receiver and may result in unexpected deadlocks. Moreover, due to the synchronization imposed,
the asynchronous send does not operate in the truly asynchronous mode for all send commands.
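A bounded queue gives exactly this flow-controlled behaviour: when the multiple-message buffer is full, the sender blocks until the receiver drains some messages. A minimal sketch (the buffer capacity and message contents are arbitrary choices):

import queue
import threading
import time

mailbox = queue.Queue(maxsize=3)   # finite-bound (multiple-message) buffer

def receiver():
    while True:
        msg = mailbox.get()        # taking a message frees buffer space
        if msg is None:
            break                  # sentinel: stop the receiver
        time.sleep(0.1)            # a slow receiver forces flow control

t = threading.Thread(target=receiver)
t.start()
for i in range(10):
    mailbox.put(f"message {i}")    # blocks whenever the buffer is full;
                                   # put_nowait() would instead raise
                                   # queue.Full (unsuccessful communication)
mailbox.put(None)
t.join()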

5. Explain the client – server architecture model of RPC?


Remote Procedure Call (RPC) is a powerful technique for constructing distributed, client –
server based applications. It is based on extending the conventional local procedure calling, so that
the called procedure need not exist in the same address space as the calling procedure. The two
processes may be on the same system, or they may be on different systems with a network
connecting them.
When making a Remote Procedure Call:
1. The calling environment is suspended, procedure parameters are transferred across the
network to the environment where the procedure is to execute, and the procedure is executed
there.
2. When the procedure finishes and produces its results, its results are transferred back to
the calling environment, where execution resumes as if returning from a regular procedure call.
NOTE: RPC is especially well suited for client-server (e.g. query – response) interaction in
which the flow of control alternates between the caller and callee. Conceptually, the client and
server do not both execute at the same time. Instead, the thread of execution jumps from the caller
to the callee and then back again.
Working of RPC:
Fig: The client-server (remote procedure call) model. The caller (client process) sends a request message containing the remote procedure's parameters and waits for the reply; the callee (server process) receives the request, executes the procedure, sends back a reply message containing the result of the execution, and waits for the next request; the caller then resumes execution.
The following steps take place during a RPC:
1. A client invokes a client stub procedure, passing parameters in the usual way. The client stub
resides within the client’s own address space.
2. The Client stub marshalls (pack) the parameters into a message. Marshalling includes
converting the representation of the parameters into a standard format and copying each
parameter into the message.
3. The client stub passes the message to the transport layer, which sends it to the remote server
machine.
4. On the server, the transport layer passes the message to a server stub, which demarshalls
(unpack) the parameters and calls the desired server routine using the regular procedure call
mechanism.
5. When the server procedure completes, it returns to the server stub (e.g., via a normal procedure
call return), which marshalls the return values into a message. The server stub then hands the
message to the transport layer.
6. The transport layer sends the result message back to the client transport layer, which hands the
message back to the client stub.
7. The client stub demarshalls the return parameters and execution returns to the caller.
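Steps 2 and 4 (marshalling and demarshalling) can be sketched with Python's pickle as a stand-in for a real marshalling format; the procedure table and the "add" routine below are hypothetical examples, not part of any real RPC system:

import pickle

# Step 2: the client stub marshals the procedure name and parameters
def marshal_call(procedure, *params):
    return pickle.dumps({"proc": procedure, "params": params})

# Step 4: the server stub demarshals and makes a regular local call
def dispatch(message, procedures):
    call = pickle.loads(message)
    result = procedures[call["proc"]](*call["params"])
    return pickle.dumps(result)          # step 5: marshal the return value

procedures = {"add": lambda a, b: a + b}     # hypothetical server routine
msg = marshal_call("add", 2, 3)              # what the transport layer carries
print(pickle.loads(dispatch(msg, procedures)))   # -> 5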

6. Explain the Implementation Mechanism in RPC?


 RPC mechanism uses the concepts of stubs to achieve the goal of semantic transparency.
 Stubs provide a local procedure call abstraction by concealing the underlying RPC
mechanism.
 A separate stub procedure is associated with both the client and server processes.
 RPC communication package known as RPC Runtime is used on both the sides to hide
existence and functionalities of a network.
 Thus implementation of RPC involves the five elements of program.
(1) Client (2) Client Stub, (3) RPC Runtime (4) Server stub (5) Server.
The client, the client stub, and one instance of RPC Runtime execute on the client machine.
The server, the server stub, and one instance of RPC Run time execute on the server machine.
Remote services are accessed by the user by making ordinary LPCs (local procedure calls).
Fig: Implementation of RPC. On the client machine, a call passes from the client to the client stub (which packs it) and on to the RPC Runtime (which sends it); on the server machine, the RPC Runtime receives the packet and passes it to the server stub (which unpacks it) and then to the server. The result returns along the reverse path: server, server stub (pack), RPC Runtime, network, RPC Runtime, client stub (unpack), client.

(1) Client :
i. A Client is a user process which initiates a RPC.
ii. The client makes a normal call that will invoke a corresponding procedure in the
client stub.

(2) Client stub:


Client stub is responsible for the following two tasks:
i. On receipt of a call request from the client, it packs specifications of the target
procedure and arguments into a message and asks the local RPC Runtime to send it
to the server stub.
ii. On receipt of the result of procedure execution, it unpacks the result and passes it to
the client.

(3) RPC Runtime:


i. Transmission of messages between Client and the server machine across the
network is handled by RPC Runtime.
ii. It performs Retransmission, Acknowledgement, Routing and Encryption.
iii. The RPC Runtime on the server machine receives the message containing the result of
procedure execution from the server stub and sends it to the client machine, where the
RPC Runtime on the client machine receives it and passes it to the client stub.
iv. It also receives call request messages from the client machine and sends them to the
server stub.

(4) Server Stub:


Server stub is similar to client stub and is responsible for the following two tasks:
i. On receipt of a call request message from the local RPC Runtime, it unpacks and
makes a normal call to invoke the required procedure in the server.
ii. On receipt of the result of procedure execution from the server, it unpacks the result
into a message and then asks the local RPC Runtime to send it to the client stub.
(5) Server: When a call request is received from the server stub, the server executes the
required procedure and returns the result to the server stub.

7. Explain the RPC Messages?


Messages we pass from server to client and vice versa need to be in a format such that client
and server can both decode and encode data passed from each other. Popular message formats for
RPC are JSON and XML. Such communication is called JSON-RPC and XML-RPC for RPC that uses
JSON and XML respectively.
(a) JSON-RPC: In JSON-RPC all messages sent from server or client are valid JSON objects.
Client must send JSON object with following keys:

i. Method: Name of method/service


ii. Params: Array of arguments to be passed

iii. Id: Id is usually integer and makes it easier for client to know which request it
got response to, if RPC calls are done asynchronously.
Server may reply with JSON objected with following keys:
i. Result: Contains return value of method called. It’s null if error occurred.
ii. Error: If error occurred, this will indicate error code or error message, otherwise
it’s null.
iii. Id: The id of the request it is responding to.
Example:
Request:
{"method": "Arith.Multiply", "params": [{"A": 2, "B": 3}], "id": 1}
Response:
{"result": 6, "error": null, "id": 1}
JSON-RPC v2 adds support for batch queries and notifications (calls which don’t require response).
(b) XML-RPC: XML-RPC was created by a Microsoft employee in 1998. It evolved and
became SOAP. Its specifics are beyond the scope of these notes (see the XML-RPC
Wikipedia article). Basic XML-RPC is as simple as JSON-RPC. The JSON-RPC example
above looks like this in XML-RPC:
Request:
<?xml version="1.0"?>
<methodCall>
<methodName>Arith.Multiply</methodName>
<params>
<param>
<value><int>2</int></value>
</param>
<param>
<value><int>3</int></value>
</param>
</params>
</methodCall>
Response:
<?xml version="1.0"?>
<methodResponse>
<params>
<param>
<value><int>6</int></value>
</param>
</params>
</methodResponse>
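Python's standard library speaks this protocol directly, so the Arith.Multiply example can be reproduced in a few lines (the port number is an arbitrary choice for illustration):

from xmlrpc.server import SimpleXMLRPCServer
import xmlrpc.client

def run_server():
    server = SimpleXMLRPCServer(("localhost", 8000))
    server.register_function(lambda a, b: a * b, "Arith.Multiply")
    server.serve_forever()   # answers requests like the XML shown above

# In another process, the client proxy marshals the call into XML-RPC:
#   proxy = xmlrpc.client.ServerProxy("http://localhost:8000/")
#   proxy.Arith.Multiply(2, 3)   # -> 6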

8. Explain the Call Semantics in RPC?

In RPC the caller and callee processes can be situated on different nodes. The normal
functioning of an RPC may get disrupted due to one or more of the reasons mentioned below:
(i) Call message is lost or response message is lost.
(ii) The callee node crashes and is restarted.
In RPC system the call semantics determines how often the remote procedure may be
executed under fault conditions. The different types of RPC call semantics are as follows:
(a) May – Be Call Semantics:
1. This is the weakest semantics in which a timeout mechanism is used that prevents
the caller from waiting indefinitely for a response from the callee.
2. This means that the caller waits until a pre – determined timeout period and then
continues to execute.
3. Hence this semantics guarantees neither the receipt of the call message nor the
execution of the procedure. It is applicable where the response message is less
important, and for applications that operate within a local network where messages
are usually transmitted successfully.
(b) Last-One Call Semantics:
1. This call semantics uses the idea of retransmitting the call message based on
timeouts until the caller receives a response.
2. The calling, execution and result transmission keep repeating until the result of
procedure execution is received by the caller.
3. The results of the last executed call are used by the caller; hence it is known as
last-one semantics.
4. Last-one semantics can be easily achieved only when two nodes are involved in the
RPC, but it is tricky to implement for nested RPCs and for cases involving orphan calls.
(c) Last-of-Many Call Semantics:
1. This semantics neglects orphan calls, unlike last-one call semantics. An orphan call
is one whose caller has expired due to a node crash.
2. Unique call identifiers are used to identify each call and to neglect orphan calls.
3. When a call is repeated, it is assigned a new call identifier, and each response
message carries the corresponding call identifier.
4. A response is accepted only if the call identifier associated with it matches the
identifier of the most recent call; otherwise it is ignored.
(d) At – Least – Once Call Semantics:
1. This semantics guarantees that the call is executed one or more times but does not
specify which results are returned to the caller.
2. It can be implemented using timeout based retransmission without considering the
orphan calls.
(e) Exactly – Once Call Semantics:
1. This is the strongest and the most desirable call semantics. It eliminates the
possibility of a procedure being executed more than once, irrespective of the number
of retransmitted calls.

2. The implementation of exactly – once call semantics is based on the use of timeouts,
retransmission, call identifiers with the same identifier for repeated calls and a
reply cache associated with the callee.
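A hedged sketch of that reply cache (the call identifiers and cache layout are illustrative assumptions; a real implementation must also bound the cache and survive crashes):

reply_cache = {}                    # call identifier -> cached reply

def execute_exactly_once(call_id, procedure, args):
    # A retransmitted call carries the same identifier, so it receives
    # the cached reply instead of triggering a second execution.
    if call_id in reply_cache:
        return reply_cache[call_id]
    result = procedure(*args)
    reply_cache[call_id] = result
    return result

print(execute_exactly_once(41, pow, (2, 10)))   # executes -> 1024
print(execute_exactly_once(41, pow, (2, 10)))   # cached, not re-executed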

9. Explain the Communication Protocols?


Based on the needs of different systems, several communication protocols have been
proposed for use in RPC which are mentioned below:
(i) The Request Protocol:

Fig: The request (R) protocol: for each RPC, a single request message is sent from the client to the server, which executes the procedure; no reply or acknowledgement message is sent.

1. This protocol is also called as R (request) protocol.


2. It is used in RPC when the called procedure has nothing to return as a result of execution and
the client does not require confirmation that the procedure has been executed.
3. As no acknowledgement or reply message is involved, only single message is transmitted
from client to server.
4. The client proceeds after the request message is sent as there is no reply message.
5. This protocol provides May – be call semantics and does not need retransmission of request
message.
6. RPC that uses the R protocol is known as asynchronous RPC, which helps to improve the
combined performance of the client and server, because the client does not wait for a reply
and the server does not need to send a reply.
7. For an asynchronous RPC, the RPC Runtime does not retry a request in case of
communication failure. TCP is a better alternative than UDP, since no retransmission is
required and it is connection oriented.
8. Asynchronous RPC with an unreliable transport protocol is generally used for implementing
periodic update services; a distributed window system is one of its applications.

(ii) The request/Reply protocol:

Fig: The request/reply (RR) protocol: the server's reply to a request serves as an implicit acknowledgement of that request, and the client's next request serves as an implicit acknowledgement of the reply to its previous request.

This protocol is also known as RR (request/reply) protocol:


1. It is useful for designing systems which involve simple RPCs.
2. In a simple RPC all the arguments and result fit in a single packet buffer while the call
duration and intervals between calls are short.
3. This protocol is based on the idea of using implicit acknowledgement to eliminate explicit
acknowledgement messages.
4. In this protocol a server's reply is considered an acknowledgement of the client's request,
and a subsequent call from the client is considered an acknowledgement of the reply to the
client's previous call.
5. A timeout-and-retry technique is used with the RR protocol for failure handling.
Retransmission of the request message is done when there is no response.
6. The RR protocol together with the timeout technique provides at-least-once call semantics
only if duplicate requests are not filtered out.
7. Exactly-once semantics is supported by servers that use a reply cache to store replies.
(iii) The Request/Reply/Acknowledgement – Reply Protocol:
1. This protocol is also known as the RRA (request/reply/acknowledge-reply) protocol.
2. The RR protocol implements exactly-once semantics, which requires storing a lot of
information in the server cache and can lead to loss of replies that have not been delivered.
3. To overcome the limitations of the RR protocol, the RRA protocol is used.
4. In this protocol clients acknowledge the receipt of reply messages, and the server deletes
information from its cache only after it receives an acknowledgement from the client.
5. Sometimes the reply acknowledgement message may get lost; therefore the RRA protocol
requires unique, ordered message identifiers, which keep track of the acknowledgement
sequence.

Fig: The request/reply/acknowledge-reply (RRA) protocol: for each RPC, the client sends a request message, the server executes the procedure and sends a reply message, and the client then explicitly acknowledges the reply.

10. Discuss the Client – Server Binding?


The client stub must know the location of a server before an RPC can take place between them.
The process by which a client gets associated with a server is known as BINDING. Servers export
their operations to register their willingness to provide services, and clients import operations,
asking the RPC Runtime to locate a server and establish any state that may be needed at each end.
Issues for client – server binding process:
1. How does a client specify a server to which it wants to get bound?
2. How does the binding process locate the specified server?
3. When is it proper to bind a client to a server?
4. Is it possible for a client to change a binding during execution?
5. Can a client be simultaneously bound to multiple servers that provide the same service?
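A binding agent answering these questions can be sketched as a name-to-location registry that servers export to and clients import from (the service names and addresses below are illustrative assumptions):

registry = {}                      # service name -> (host, port)

def export_service(name, host, port):
    # server registers its willingness to provide the service
    registry[name] = (host, port)

def import_service(name):
    # client asks the binding agent to locate a server for the service
    try:
        return registry[name]
    except KeyError:
        raise LookupError(f"no server currently exports {name!r}")

export_service("date_service", "server-a.example.org", 5050)
print(import_service("date_service"))   # client binds to the returned address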

SHORT ANSWER QUESTIONS

1. What is explicit and implicit addressing?


Explicit addressing:
The process with which communication is desired is explicitly named as a parameter in the
communication primitive used.
Implicit addressing:

The process willing to communicate does not explicitly name a process for communication
(the sender names a server instead of a process). This type of process addressing is also known as
functional addressing.

2. What is Group Communication?


The most elementary form of message – based interaction is one – to – one communication
(also known as point – to – point, or unicast communication) in which a single – sender process
sends a message to a single-receiver process. For performance and ease of programming, several
highly parallel distributed applications require that a message-passing system also provide a group
communication facility. Depending on whether there are single or multiple senders and receivers,
group communication is classified as one-to-many (multicast), many-to-one, or many-to-many.

3. What is Atomic Multicast?


Atomic Multicast (reliable multicast) has an all – or – nothing property. That is, when a
message is sent to a group by atomic multicast, it is either received by all the surviving (correct)
processes that are members of the group or else it is not received by any of them.

4. What are the components of RPC?


Sun RPC has three components:
(1) rpcgen: a compiler that generates the client and server stubs from the definition of a remote
procedure interface. Use -C to generate ANSI C code.
(2) XDR: eXternal Data Representation, used to encode the data into a portable format. It
simplifies execution among different computer architectures.

(3) A runtime library: -lrpclib.
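XDR fixes a portable wire format (4-byte, big-endian units) so that machines with different native byte orders agree on the data. A sketch of the idea using Python's struct module as a stand-in for a real XDR library:

import struct

def xdr_encode_int(value):
    return struct.pack(">i", value)      # 4 bytes, big-endian, like XDR

def xdr_decode_int(data):
    return struct.unpack(">i", data)[0]

wire = xdr_encode_int(259)
print(wire.hex())               # 00000103, identical on every architecture
print(xdr_decode_int(wire))     # 259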

5. How to send a request and how to get the response?


First, use the library function clnt_create(server, DATE_PROG, DATE_VERS, "udp") to create a
CLIENT handle.
Second, call remote functions in almost the same way as if you are calling local functions.
The request and response will be automatically processed by client stub and server stub.

6. Discuss the issues in IPC by Message Passing?


A message is a block of information formatted by a sending process in such a manner that it
is meaningful to the receiving process. It consists of a fixed – length header and a variable – size
collection of typed data objects. The header usually consists of the following elements:

1. Address: It contains characters that uniquely identify the sending and receiving
processes in the network.

2. Sequence number: This is the message identifier (ID), which is very useful for
identifying lost messages and duplicates messages, in case of system failures.

3. Structural information: This element has two parts. The type part specifies
whether the data to be passed on to the receiver is included within the message or the
message only contains a pointer to the data, which is stored somewhere outside the
contiguous portion of the message. The second part of this element specifies the length
of the variable – size message data.

Fig: A typical message structure: a fixed-length header (sending and receiving process addresses, sequence number or message ID, and structural information giving the type and the number of bytes/elements) followed by a variable-size collection of typed data, or a pointer to the data.
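A fixed-length header of this kind can be sketched with Python's struct module, packing the fields in a fixed order so the receiver can parse them (the field sizes and layout below are arbitrary choices for illustration):

import struct

# assumed layout: sender address, receiver address, sequence number,
# type flag (data included vs. pointer), length of the variable part
HEADER = struct.Struct(">4s4sIBH")

def build_message(sender, receiver, seq, data_included, data):
    header = HEADER.pack(sender, receiver, seq,
                         1 if data_included else 0, len(data))
    return header + data            # variable-size part follows the header

msg = build_message(b"A001", b"B002", 7, True, b"payload bytes")
print(HEADER.unpack(msg[:HEADER.size]))   # parsed header fields
print(msg[HEADER.size:])                  # the typed data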

7. Discuss the basic concepts of RPC?
Remote Procedure Call (RPC) is a protocol that one program can use to request a service
from a program located on another computer on a network without having to understand the
network's details. A procedure call is also sometimes known as a function call or a subroutine call.
The IPC part of a distributed system can often be conveniently handled by the message-passing model.
1. It doesn't offer a uniform panacea for all the needs.
2. RPC emerged as a result of this.
3. It can be said to be a special case of the message-passing model.

A local procedure call and an RPC behave similarly; however, there are semantic differences
due to several properties of RPCs:

(a) Server/client relationship (binding): While a local procedure call depends on a static
relationship between the calling and the called procedure, the RPC paradigm requires a more
dynamic behaviour. As with a local procedure call, the RPC establishes this relationship through
binding between the calling procedure (client) and the called procedure (server). However, in the
RPC case a binding usually depends on a communications link between the client and server RPC
runtime systems. A client establishes a binding over a specific protocol sequence to a specific host
system and endpoint.
(b) No assumption of shared memory: Unlike a local procedure call, which commonly
uses the call – by – reference passing mechanism for input/output parameters, RPCs with
input/output parameters have copy – in, copy – out semantics due to the differing address spaces of
calling and called procedures.
(c) Independent failure: Beyond execution errors that arise from the procedure call itself,
an RPC introduces additional failure cases due to execution on physically separate machines.
Remoteness introduces issues such as remote system crash, communications links, naming and
binding issues, security problems and protocol incompatibilities.
(d) Security: Executing procedure calls across physical machine boundaries has additional
security implications. Client and server must establish a security context based on the underlying
security protocols and they require additional attributes for authorizing access.

8. What are the Transparency issues of RPC?


A transparent RPC is one in which the local and remote procedure calls are
indistinguishable.
Types of transparencies:
(i) Syntactic transparency: A remote procedure call should have exactly the same syntax as local
procedure call.
(ii) Semantic transparency: The semantics of a remote procedure call are identical to those of
a local procedure call.
Syntactic transparency is not an issue but semantic transparency is difficult.

Difference between remote procedure calls and local procedure calls:

1. Unlike local procedure calls, remote procedure calls involve:
(a) Disjoint address spaces
(b) Absence of shared memory
(c) The meaninglessness of call by reference, i.e., of using addresses in arguments and pointers.
2. RPC’s are more vulnerable to failure because of:
(a) Possibility of processor crashes or
(b) Communication problems of a network.

9. Discuss the stub generation?


Automatic stub generation: An Interface Definition Language (IDL) is used here to define
the interface between a client and the server.
Interface definition:
1. It is a list of procedure names supported by the interface together with the types of their
arguments and results.
2. It also plays a role in reducing data storage and controlling the amount of data transferred
over the network.
3. It has information about type definitions, enumerated types and defined constants.

Export the interface:


A server program that implements the procedures in the interface.

Import the interface:


1. A client program that calls procedures from an interface.
2. The interface definition is compiled by the IDL compiler.
3. IDL compiler generates components that can be combined with client and server
programs, without making any changes to the existing compilers;
4. Client stub and server stub procedures;
5. The appropriate marshaling and unmarshaling operations;

6. A header file that supports the data types.

10. Discuss the server management?


Servers can be implemented in two ways:
(a) Stateful servers, (b) Stateless servers.
(a) Stateful servers: A Stateful Server maintains client’s state information from one RPC to the
next.
(i) They provide an easier programming paradigm.
(ii) They are more efficient than stateless servers.
(b) Stateless Server: Every request from a client must be accompanied by all necessary
parameters to successfully carry out the desired operation.
(i) Stateless servers have a distinct advantage over stateful servers in the event of a failure.
(ii) The choice of using a stateless or a stateful server is purely application dependent.

Unit –III
Introduction to DSM
Essay Questions
1. What is meant by DSM and explain the design and implementation of DSM Systems?

1) The distributed shared memory (DSM) implements the shared memory model in
distributed systems, which have no physical shared memory.
2) The shared memory model provides a virtual address space shared between all nodes.
To overcome the high cost of communication in distributed systems, DSM systems move data to
the location of access. DSM is also known as DSVM (distributed shared virtual memory).
DSM provides a virtual address space shared among processes on loosely coupled processors.
DSM is basically an abstraction that integrates the local memory of different machines into a single
logical entity.
1) Shared by cooperating processes.
2) Each node of the system consists of one or more CPUs and a memory unit.
3) Nodes are connected by high speed communication network.
4) Simple message passing system for nodes to exchange information.
5) Main memory of individual nodes is used to cache pieces of shared memory space.
6) Memory mapping manager routine maps local memory to shared virtual memory.
7) Shared memory of DSM exist only virtually.
8) Shared memory space is partitioned into blocks.
9) Data caching is used in DSM system to reduce network latency.

10) The basic unit of caching is a memory block.


11) The missing block is migrated from the remote node to the client process's node, and
the operating system maps it into the application's address space.

12) Data blocks keep migrating from one node to another on demand, but no communication
is visible to the user processes.

13) If data is not available in local memory, a network block fault is generated.
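The fault-and-migrate behaviour can be sketched as a local block cache that fetches a missing block on access; remote_fetch below is a hypothetical stand-in for the real network transfer from the block's owner node:

BLOCK_SIZE = 1024
local_blocks = {}                  # this node's cache of shared blocks

def remote_fetch(block_no):
    # hypothetical: migrate the block from the remote node that holds it
    return bytearray(BLOCK_SIZE)

def read_shared(address):
    block_no, offset = divmod(address, BLOCK_SIZE)
    if block_no not in local_blocks:         # block fault: data not local
        local_blocks[block_no] = remote_fetch(block_no)
    return local_blocks[block_no][offset]    # now an ordinary local access

print(read_shared(2048 + 5))   # faults on block 2, then reads it locally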

2. Explain the Granularity?


The most visible parameter in the design of a DSM system is the block size.
Factors influencing block size selection: Sending a large packet of data is not much more
expensive than sending a small one.
Paging overhead: A process is likely to access a large region of its shared address space in a small
amount of time. Therefore the paging overhead is less for a large block size as compared to the
paging overhead for a small block size.
Directory size: The larger the block size, the smaller the directory. This ultimately results in
reduced directory management overhead for larger block sizes.
Thrashing: The problem of thrashing may occur when data items in the same data block are being
updated by multiple nodes at the same time. The problem may occur with any block size, but it is
more likely with larger block sizes.

False sharing:

Fig: False sharing: process P1 accesses data in one area of a data block while process P2 accesses data in another area of the same block.

False sharing occurs when two different processes access two unrelated variables that reside in the
same data block. The larger the block size, the higher the probability of false sharing. False sharing
of a block may lead to a thrashing problem.
Using page size as block size: The relative advantages and disadvantages of small and large block
sizes make it difficult for a DSM designer to decide on a proper block size. Using the page size as the
block size has the following advantages:
1. It allows the use of existing page-fault schemes to trigger a DSM page fault.
2. It allows access-right control to be integrated with memory management.
3. A page-sized block does not impose undue communication overhead at the time of a
network page fault.
4. Page size is a suitable data entity unit with respect to memory contention.

3. Explain the Consistency Model?


 A consistency model is a contract between a distributed data store and its processes: the
processes agree to obey certain rules, and in return the store promises to work correctly.
 A consistency model basically refers to the degree of consistency that should be maintained
for the shared memory data.
 If a system supports the stronger consistency model, then the weaker consistency model is
automatically supported but the converse is not true.

 The two types of consistency models are data-centric and client-centric consistency models.
1. Data – Centric Consistency Models:
A data store may be physically distributed across multiple machines. Each process that can
access data from the store is assumed to have a local or nearby copy available of the entire
store.
i. Strict Consistency Model:
1) Any read on a data item X returns the value corresponding to the result of the most
recent write on X.
2) This is the strongest form of memory coherence which has the most stringent
consistency requirement.

3) Strict consistency is the ideal model, but it is impossible to implement in a
distributed system, because it presupposes absolute global time or a global
agreement on the commitment of changes.
ii. Sequential Consistency:
1) Sequential consistency is an important data – centric consistency model which is
a slightly weaker consistency model than strict consistency.
2) A data store is said to be sequentially consistent if the result of any execution is
the same as if the (read and write) operations by all processes on the data store
were executed in some sequential order and the operations of each individual
process should appear in this sequence in a specified order.

Example: Assume three operations read (R1), write (W1), read (R2) performed in an order on a
memory address. Then (R1, W1, R2), (R1, R2, W1), (W1, R1, R2) (R2, W1, R1) are acceptable
provided all processes see the same ordering.
iii. Linearizability:
1) Linearizability is weaker than strict consistency, but stronger than sequential consistency.
2) A data store is said to be linearizable when each operation is timestamped and the result of any execution is the same as if the (read and write) operations by all processes on the data store were executed in some sequential order.

3) The operations of each individual process appear in this sequence in the order specified by its program.
4) If tsOP1(x) < tsOP2(y), then operation OP1(x) should precede OP2(y) in this sequence.
iv. Causal Consistency:
1) It is a weaker model than sequential consistency.
2) In causal consistency, all processes see only those memory reference operations that are potentially causally related in the same (correct) order.
3) Memory reference operations that are not causally related may be seen by different processes in different orders.
4) A memory reference operation is said to be causally related to another if it may have been influenced by that operation.
5) If a write (w2) is causally related to an earlier write (w1), the only acceptable order is (w1, w2).
v. FIFO Consistency:
1) It is weaker than causal consistency.
2) This model ensures that all write operations performed by a single process are
seen by all other processes in the order in which they were performed like a
single process in a pipeline.
3) This model is simple and easy to implement, and it performs well because writes from a single process behave as if moving through a pipeline.
4) It is implemented by sequencing the write operations performed at each node independently of the operations performed on other nodes.
Example: If (w11) and (w12) are write operations performed by P1 in that order, and (w21), (w22) by P2, a process P3 can see them as [(w11, w12), (w21, w22)] while P4 can view them as [(w21, w22), (w11, w12)].
vi. Weak consistency:
1) The basic idea behind the weak consistency model is enforcing consistency on a
group of memory reference operations rather than individual operations.
2) A Distributed Shared Memory system that supports the weak consistency model
uses a special variable called a synchronization variable which is used to
synchronize memory.

3) When a process accesses a synchronization variable, the entire memory is synchronized by making the changes made to the memory visible to all other processes.
vii. Release Consistency:
1) Release consistency model tells whether a process is entering or exiting from a
critical section so that the system performs either of the operations when a
synchronization variable is accessed by a process.

2) Two synchronization variables acquire and release are used instead of single
synchronization variable. Acquire is used when process enters critical section
and release is when it exits a critical section.

3) Release consistency can also be viewed as a synchronization mechanism based on barriers instead of critical sections.
viii. Entry Consistency:
1) In entry consistency every shared data item is associated with a synchronization
variable.
2) In order to access consistent data, each synchronization variable must be
explicitly acquired.

3) Release consistency affects all shared data but entry consistency affects only
those shared data associated with a synchronization variable.
2. Client – Centric Consistency Models:
1) Client-centric consistency models do not aim at providing a system-wide consistent view of a data store.
2) Instead, this class of models concentrates on consistency from the perspective of a single (possibly mobile) client.
3) Client-centric consistency models are generally used for applications that lack simultaneous updates, where most operations involve reading data.
i. Eventual Consistency:
1) In Systems that tolerate high degree of inconsistency, if no updates take place
for a long time all replicas will gradually and eventually become consistent. This
form of consistency is called eventual consistency.
2) Eventual consistency only requires that updates are guaranteed to propagate to all replicas eventually.
3) Eventual consistent data stores work fine as long as clients always access the
same replica.

4) Write conflicts are often relatively easy to solve when assuming that only a small
group of processes can perform updates. Eventual consistency is therefore often
cheap to implement.
ii. Monotonic Reads Consistency:

1) A data store is said to provide monotonic-read consistency if, once a process reads the value of a data item x, any successive read operation on x by that process will always return that same value or a more recent value.
2) Once a process has seen a value of x at time t, it will never see an older version of x at a later time.
Example: A user can read incoming mail while moving. Each time the user connects to a different e-
mail server that server fetches all the updates from the server that the user previously visited.
Monotonic Reads guarantees that the user sees all updates, no matter from which server the
automatic reading takes place.

iii. Monotonic Writes:
1) A data store is said to be monotonic write consistent if a write operation by a
process on a data item x is completed before any successive write operation on
X by the same process.
2) A write operation on a copy of data item x is performed only if that copy has
been brought up to date by means of any preceding write operations, which may
have taken place on other copies of x.
Example: Monotonic-write consistency guarantees that if an update is performed on a copy at server S, all preceding updates will be performed on that copy first. The resulting copy will then indeed be the most recent version and will include all updates that led to previous versions.

iv. Read Your Writes:
1) A data store is said to provide read-your-writes consistency if the effect of a write operation by a process on data item x will always be seen by a successive read operation on x by the same process.
2) A write operation is always completed before a successive read operation by the
same process no matter where that read operation takes place.
Example: Updating a Web page and guaranteeing that the Web browser shows the newest version
instead of its cached copy.
v. Writes Follow Reads:
1) A data store is said to provide writes – follows – reads consistency if a process
has write operation on a data item x following a previous read operation on x
then it is guaranteed to take place on the same or a more recent value of x that
was read.
2) Any successive write operation by a process on a data item x will be performed
on a copy of x that is up to date with the value most recently read by that
process.
Example: Suppose a user first reads an article A and then posts a response B. By requiring writes-follow-reads consistency, B will be written to any copy only after A has been written.
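These client-centric guarantees are usually enforced with a small amount of per-session state kept by the client. The following is a minimal illustrative sketch in Python (the Replica and Session classes are hypothetical names, not from any particular system) of how the monotonic-reads guarantee can be enforced: each session remembers the highest version it has read, and a replica may serve a read only if its copy is at least that recent.

class Replica:
    def __init__(self):
        self.version = 0
        self.value = None

    def write(self, value, version):
        # Apply an update propagated to this replica.
        if version > self.version:
            self.version, self.value = version, value

    def read(self, min_version):
        # Serve the read only if this copy is recent enough for the session.
        if self.version < min_version:
            raise RuntimeError("replica too stale for this session")
        return self.value, self.version

class Session:
    # Client-side state enforcing monotonic reads across replicas.
    def __init__(self):
        self.last_seen = 0

    def read(self, replica):
        value, version = replica.read(self.last_seen)
        self.last_seen = max(self.last_seen, version)
        return value

r1, r2 = Replica(), Replica()
r1.write("mail-1", version=1)   # the update reaches replica r1 first
session = Session()
print(session.read(r1))         # 'mail-1'; the session has now seen version 1
r2.write("mail-1", version=1)   # the update must reach r2 before it may serve us
print(session.read(r2))         # succeeds only because r2 is now up to date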

4. Explain the Clock Synchronization?


As in non-distributed systems, knowledge of when events occur is necessary. However, clock synchronization is often more difficult in distributed systems because there is no ideal time source and because distributed algorithms must sometimes be used. Distributed algorithms must overcome the scattering of information and local, rather than global, decision-making.

a) Physical Clocks: The difference in the rates at which two computer clocks advance is known as drift; the resulting difference between their readings is known as skew. Computer clock manufacturers specify a maximum drift rate in their products.

Computer clocks are among the least accurate modern timepieces. Inside every computer is a chip containing a quartz crystal oscillator used to keep time; such crystals are inexpensive to produce.

Average loss of accuracy: 0.86 seconds per day.

This skew is unacceptable for distributed systems. Several methods are now in use to attempt the
synchronization of physical clocks in distributed systems:

Physical Clocks – UTC: Coordinated Universal Time (UTC) is the international time standard. UTC is the current term for what was commonly referred to as Greenwich Mean Time (GMT). Zero hours UTC is midnight in Greenwich, England, which lies on the zero-longitude meridian. UTC is based on a 24-hour clock.


(i) Physical Clocks – Cristian's Algorithm:

Assuming there is one time server with UTC: each node in the distributed system periodically polls the time server. Let T0 be the time the request is sent and T1 the time the reply arrives; the current server time is estimated as Tserver + (T1 – T0)/2.
This process is repeated several times and an average is used.
The machine then attempts to adjust its time.
Disadvantages: The variable delay between the client and the time server must be accounted for, and the time server is a single point of failure.
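As a rough illustration, the polling step of Cristian's algorithm can be sketched as follows. This is a minimal Python sketch; ask_server_time is an assumed stand-in for the actual network request to the time server.

import time

def cristian_poll(ask_server_time):
    # ask_server_time() is assumed to perform the network round trip and
    # return the server's UTC time as a float.
    t0 = time.monotonic()
    server_time = ask_server_time()
    t1 = time.monotonic()
    # Estimated current server time = reported time + half the round trip.
    return server_time + (t1 - t0) / 2.0

def estimate_time(ask_server_time, polls=5):
    # Repeating the poll and averaging reduces the effect of delay variance.
    return sum(cristian_poll(ask_server_time) for _ in range(polls)) / polls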

(ii) Physical Clocks – Berkeley Algorithm:


One daemon, without UTC: periodically, the daemon polls all machines on the distributed system for their times.
The machines answer.
The daemon computes an average time and broadcasts it to the machines so they can adjust.
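The averaging step performed by the Berkeley daemon can be sketched as follows (a minimal Python sketch with the polling and message exchange abstracted away; the three readings below are an illustrative example):

def berkeley_adjustments(readings):
    # readings: {machine: local_time}; returns {machine: offset_to_apply}.
    average = sum(readings.values()) / len(readings)
    return {machine: average - t for machine, t in readings.items()}

# Three machines whose clocks read 3:00, 3:25 and 2:50 (minutes past midnight):
print(berkeley_adjustments({"A": 180, "B": 205, "C": 170}))
# {'A': 5.0, 'B': -20.0, 'C': 15.0} -- all three converge on 3:05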


(iii) Physical Clocks – Decentralized Averaging Algorithm:


Each machine on the distributed system has a daemon without UTC.
Periodically, at an agreed – upon fixed time, each machine broadcasts its local time.
Each machine calculates the correct time by averaging all results.

(iv) Physical Clocks – Network Time Protocol (NTP):

Enables clients across the Internet to be synchronized accurately to UTC. Overcomes large and
variable message delays.
Employs statistical techniques for filtering, based on past quality of servers and several other
measures.
Can survive lengthy losses of connectivity: Redundant servers. Redundant paths to servers.
Provides protection against malicious interference through authentication techniques.
Uses a hierarchy of servers located across the Internet. Primary servers are directly connected to a
UTC time source.

Fig: Hierarchy in NTP: primary (stratum 1) servers at the top are directly connected to a UTC source; servers become less accurate at each level down the hierarchy.

NTP has three modes:
Multicast Mode: Suitable for user workstations on a LAN. One or more servers periodically multicast the time to the other machines on the network.
Procedure Call Mode: Similar to Cristian's algorithm. Provides higher accuracy than Multicast Mode because delays are compensated for.
Symmetric Mode: Pairs of servers exchange pairs of timing messages that contain timestamps of recent message events. This is the most accurate, but also the most expensive, mode.
b) Logical Clocks: Often, it is not necessary for a computer to know the exact time, only relative
time. This is known as “logical time”.
Logical time is not based on timing but on the ordering of events. Logical clocks can only advance forward, never in reverse. Non-interacting processes need not share a logical clock.
Computers generally obtain logical time using interrupts to update a software clock. The more interrupts (the more frequently time is updated), the higher the overhead.
(i) Logical Clocks – Lamport’s Logical Clock Synchronization Algorithm:
The most common logical clock synchronization algorithm for distributed systems is
Lamport’s Algorithm. It is used in situations where ordering is important but global time is not
required.
It is based on the "happens – before" relation: Event A "happens – before" Event B when all processes involved in a distributed system agree that event A occurred first and B subsequently occurred.
This DOES NOT mean that Event A actually occurred before Event B in absolute clock time. A distributed system can use the "happens – before" relation when:
Events A and B are observed by the same process, or by multiple processes with the same global clock.
Event A is the sending of a message and Event B is the receipt of that message, since a message cannot be received before it is sent.
If two events do not communicate via messages, they are concurrent, because their order cannot be determined and it does not matter. Concurrent events can be ignored.

(ii) Logical Clocks – Lamport’s Logical Clock Synchronization Algorithm (cont.):


In the cases above, C(A) < C(B).
If A and B are concurrent, C(A) = C(B) is possible.
Two events can have the same clock value only if they occur on different systems, because every message transfer between two systems takes at least one clock tick.
In Lamport's Algorithm, logical clock values for events may be changed, but always by moving the clock forward; time values can never be decreased.
An additional refinement is often used: if Event A and Event B are concurrent, i.e., C(A) = C(B), some unique property of the processes associated with these events can be used to choose a winner. This establishes a total ordering of all events. The process ID is often used as the tiebreaker.

Lamport’s Algorithm can thus be used in distributed systems to ensure synchronization: A logical
clock is implemented in each node in the system.
Each node can determine the order in which events have occurred in that system’s own point of
view.
The logical clock of one node does not need to have any relation to real time or to any other node in
the system.

5. Discuss the Event Ordering?


Coordination of requests (especially in a fair way) requires events (requests) to be ordered.
Stand – alone systems:
1. Shared Clock / Memory.
2. Use a time – stamp to determine ordering.
Distributed Systems
1. No global clock.
2. Each clock runs at different speeds.
How do we order events running on physically separated systems?
Messages (the only mechanism for communicating between systems) can only be received after
they have been sent.
Event Ordering: Happened-Before Relation
1. If A and B are events in the same process, and A executed before B, then A → B.
2. If A is the sending of a message and B is the receipt of that message, then A → B.
3. If A → B and B → C, then A → C (transitivity).

Fig: Space–time diagram of two processes P (events p0–p4) and Q (events q0–q5) exchanging messages, used as an exercise to identify which events are ordered and which are concurrent.

Define a notion of event ordering such that:
1. If A → B, then A precedes B.
2. If A and B are concurrent events, then nothing can be said about the ordering of A and B.
Solution:
1. Each processor i maintains a logical clock LCi.
2. When an event occurs locally, increment LCi.
3. When processor X sends a message to Y, it also sends LCx in the message.
4. When Y receives this message: if LCy < (LCx + 1), then set LCy = LCx + 1.
Note: If the "time" of A precedes the "time" of B, it does not follow that A happened before B. Likewise, if A → B and C → B, we cannot conclude that A → C: two predecessors of the same event may be concurrent.
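The solution above translates almost directly into code. A minimal illustrative Python sketch of such a logical clock:

class LamportClock:
    def __init__(self):
        self.time = 0

    def local_event(self):
        self.time += 1          # rule 2: increment on every local event
        return self.time

    def send(self):
        self.time += 1          # sending is itself an event
        return self.time        # rule 3: the timestamp travels in the message

    def receive(self, msg_time):
        # Rule 4: if LCy < (LCx + 1) then LCy = LCx + 1.
        if self.time < msg_time + 1:
            self.time = msg_time + 1
        return self.time

x, y = LamportClock(), LamportClock()
ts = x.send()      # x's clock becomes 1
y.receive(ts)      # y's clock jumps to 2, so the receive follows the send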

6. Explain the Mutual exclusion?


Every node in the system keeps a request queue sorted by logical time stamp. Logical clocks are
used to impose total global order on all events.
Ordered message delivery between every pair of communicating sites is assumed: messages sent from Site Si arrive at Site Sj in the same order in which they were sent.

1. Site Si sends a request and places the request in its local request queue.
2. When Site Sj receives the request, it sends a time-stamped reply to Site Si and places the request in its local request queue.
3. Site Si gains the critical section for the requested data when its request is at the head of its queue and it has received a message from every other site with a timestamp larger than that of its request.
(i) Centralized Algorithm: The simplest and most straightforward way to achieve mutual exclusion in a distributed system is to simulate how it is done in a one-processor system: one process is elected as the coordinator.
When any process wants to enter a critical section, it sends a request message to the
coordinator stating which critical section it wants to access.
If no other process is currently in that critical section, the coordinator sends back a reply
granting permission. When the reply arrives, the requesting process enters the critical section. If
another process requests access to the same critical section, it is ignored or blocked until the first
process exits the critical section and sends a message to the coordinator stating that it has exited.
The Centralized Algorithm does have disadvantages: The coordinator is a single point of
failure. If processes are normally ignored when requesting a critical section that is in use, they
cannot distinguish between a dead coordinator and “permission denied”. In a large system, a single
coordinator can be a bottleneck.

Fig: (a) Process 1 asks the coordinator for permission to enter a critical section; permission is granted. (b) Process 2 then asks permission to enter the same critical section; the coordinator does not reply and queues the request. (c) When process 1 releases the critical section, the coordinator replies to process 2.

(ii) Distributed Algorithms: It is often unacceptable to have a single point of failure.


Therefore researchers continue to look for distributed mutual exclusion algorithms. The most well-
known is by Ricart and Agrawala: There must be a total ordering of all events in the system.
Lamport’s Algorithm can be used for this purpose.

When a process wants to enter a critical section, it builds a message containing the name of
the critical section, its process number and the current time. It then sends the message to all other
processes, as well as to itself.

When a process receives a request message, the action it takes depends on its state with respect to the critical section named in the message. There are three cases:
If the receiver is not in the critical section and does not want to enter it, it sends an OK message to the sender.
If the receiver is in the critical section, it does not reply; it instead queues the request.

If the receiver also wants to enter the same critical section, it compares the time stamp in
the incoming message with the time stamp in the message it has sent out. The lowest time stamp

23
Brightway Computers Distributed Systems
wins. If its own message has a lower time stamp it does not reply and queues the request from the
sending process.

When a process has received OK messages from all other processes, it enters the critical
section. Upon exiting the critical section, it sends OK messages to all processes in its queue and
deletes them all from the queue.
(iii) Token – Based Algorithms: Another approach is to create a logical or physical ring.
Each process knows the identity of the process succeeding it. When the ring is initialized,
process 0 is give a token. The token circulates around the ring in order, from process k to process
k+1.
When a process receives the token from its neighbor, it checks to see if it is attempting to
enter a critical section. If so, the process enters the critical section and does its work, keeping the
token the whole time.
After the process exits the critical section, it passes the token to the next process in the ring.
It is not permitted to enter a second critical section using the same token.
If a process is handed the token and is not interested in entering a critical section, it passes the token to the next process.

Fig: (a) An unordered group of processes on a network. (b) A logical ring constructed in software, around which the token circulates.

7. Define the Deadlock and explain the Deadlock?


A deadlock occurs when a set of processes in a system are blocked waiting for requests that can
never be satisfied.
Approaches:
1. Detection (& Recovery).
2. Prevention.

3. Avoidance – not practical in distributed setting.


Difficulties:
1. Resource allocation information is distributed.

2. Gathering information requires messages. Since messages have non – zero delays, it is
difficult to have an accurate and current view of resource allocation.

Suppose following information is available:
1. For each process, the resources it currently holds.

2. For each process, the request that it is waiting for. One can then check whether the current system state is deadlocked or not.
In single – processor systems, OS can maintain this information, and periodically execute deadlock
detection algorithm.
What to do if a deadlock is detected:
1. Kill a process involved in the deadlocked set
2. Inform the users, etc.

a) Wait For Graph (WFG):


Definition: A resource graph is a bipartite directed graph (N, E), where
1. N = P ∪ R,
2. P = {p1, ..., pn} is the set of processes and R = {r1, ..., rn} the set of resources,
3. (r1, ..., rn) is the available-unit vector,
4. an edge (pi, rj) is a request edge, and
5. an edge (ri, pj) is an allocation edge.


Definition: Wait For Graph (WFG) is a directed graph, where nodes are processes and a directed
edge from P -> Q represents that P is blocked waiting for Q to release a resource.
So, there is an edge from process P to process Q if P needs a resource currently held by Q.
(i) Deadlock Detection Algorithms:
Centralized deadlock detection can report a false deadlock, as illustrated by the following sequence of views:
1. The initial resource graph for machine 0.
2. The initial resource graph for machine 1.
3. The coordinator's view of the world.
4. The situation after the delayed message.


(ii) Wait – for Graph for Detection:
1. Assume only one instance of each resource.

2. Nodes are processes.

Recall Resource Allocation Graph: It had nodes for resources as well as processes (basically same
idea)
3. Edges represent waiting: If P is waiting to acquire a resource that is currently held by Q,
then there is an edge from P to Q.
4. A deadlock exists if and only if the global wait – for graph has a cycle
5. Each process maintains a local wait-for graph based on the information it has.
6. The global wait-for graph can be obtained as the union of the edges in all the local copies.
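Since a deadlock exists if and only if the global wait-for graph has a cycle, detection reduces to cycle detection. A minimal Python sketch (the dict-based graph representation is an assumption made for illustration):

def has_cycle(wfg):
    # wfg maps each process to the list of processes it is waiting for.
    visited, on_stack = set(), set()

    def dfs(p):
        visited.add(p)
        on_stack.add(p)
        for q in wfg.get(p, ()):
            if q in on_stack or (q not in visited and dfs(q)):
                return True
        on_stack.discard(p)
        return False

    return any(p not in visited and dfs(p) for p in list(wfg))

# P1 waits for P2, P2 for P3, P3 for P1 -> a cycle, hence deadlock.
print(has_cycle({"P1": ["P2"], "P2": ["P3"], "P3": ["P1"]}))   # True
print(has_cycle({"P1": ["P2"], "P2": ["P3"]}))                 # False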
(iii) Deadlock Prevention:
1. Hierarchical ordering of resources avoids cycles
2. Time – stamp ordering approach:

Prevent the circular waiting condition by preempting resources if necessary.

a) The basic idea is to assign a unique priority to each process and use these priorities
to decide whether process P should wait for process Q.
b) Let P wait for Q if P has a higher priority than Q; otherwise, P is rolled back.

c) This prevents deadlocks since for every edge (P,Q)in the wait – for graph, P has a
higher priority than Q. Thus a cycle cannot exist.

8. Explain the Election Algorithm?


Election Algorithms:
The coordinator election problem is to choose a process from among a group of processes
on different processors in a distributed system to act as the central coordinator.
An election algorithm is an algorithm for solving the coordinator election problem. By the
nature of the coordinator election problem any election algorithm must be a distributed algorithm.
(i) A group of processes on different machines need to choose a coordinator.
(ii) Peer to peer communication: every process can send messages to every other process.
(iii) Assume that processes have unique IDs, such that one is the highest.
(iv) Assume that the priority of process Pi is i.

(a) Bully Algorithm:


Background: Any process Pi sends a message to the current coordinator; if there is no response in T time units, Pi tries to elect itself as leader. The details follow:
Algorithm for process Pi that detected the lack of a coordinator:
1. Process Pi sends an "Election" message to every process with higher priority.
2. If no other process responds, process Pi starts the coordinator code running and sends a message to all processes with lower priorities saying "Elected Pi".
3. Else, Pi waits for T' time units to hear from the new coordinator; if there is no response, it starts from step (1) again.
Algorithm for other processes (also called Pi)
If Pi is not the coordinator, then Pi may receive either of these messages from Pj:
If Pj sends "Elected Pj" (this message is only received if i < j),
Pi updates its records to say that Pj is the coordinator.
Else, if Pj sends an "Election" message (i > j),
Pi sends a response to Pj saying it is alive, and
Pi starts an election of its own.
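The core decision of the bully algorithm can be sketched as follows. This is a minimal illustrative Python sketch: real implementations exchange election, alive, and coordinator messages with timeouts, which are collapsed here into an assumed alive predicate.

def bully_election(initiator, processes, alive):
    # processes: all process ids; alive(p): does p answer an election message?
    responders = [p for p in processes if p > initiator and alive(p)]
    if not responders:
        return initiator   # nobody higher answered: the initiator is elected
    # Otherwise the highest live responder takes over the election and,
    # having no live process above it, announces itself as coordinator.
    return bully_election(max(responders), processes, alive)

# Processes 1..7; process 7 has crashed; process 4 detects it and starts.
print(bully_election(4, range(1, 8), lambda p: p != 7))   # prints 6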

(b) Election in A Ring => Ring Algorithm:


Assume that processes form a ring: each process only sends messages to the next process in
the ring.
Active list: each process keeps an active list holding information on all other active processes.
Assumption: message continues around the ring even if a process along the way has crashed.
Background: Any process Pi sends a message to the current coordinator; if no response in T time
units, Pi initiates an election.
1. Initialize the active list to empty.
2. Send an "Elect(i)" message to the right and add i to the active list. If a process i receives an "Elect(j)" message:
(a) If this is the first election message it has sent or seen, it initializes its active list to [i, j] and sends "Elect(i)" followed by "Elect(j)".
(b) If i != j, it adds j to its active list and forwards the "Elect(j)" message.
(c) Otherwise (i = j), process i has the complete set of active processes in its active list; it chooses the highest process ID and sends an "Elected(x)" message to its neighbor.
If a process receives an "Elected(x)" message, it sets its coordinator to x.
Example:
Suppose that we have four processes arranged in a ring: P1 → P2 → P3 → P4 → P1 ...
P4 is the coordinator.
Suppose P1 and P4 crash, and P2 detects that coordinator P4 is not responding.
P2 sets its active list to [ ], sends an "Elect(2)" message to P3, and sets its active list to [2].
P3 receives "Elect(2)". This is the first election message P3 has seen, so it sets its active list to [2, 3] and sends "Elect(3)" followed by "Elect(2)" towards P4.
The messages pass the crashed P4 and P1 and then reach P2.
P2 receives "Elect(3)", adds 3 to its active list [2, 3], and forwards "Elect(3)" to P3.
P2 then receives its own "Elect(2)" message; its active list [2, 3] is now complete, so P2 chooses P3 as the highest process and sends an "Elected(P3)" message.
P3 likewise receives its own "Elect(3)" message, chooses P3 as the highest process in its list [2, 3], and sends an "Elected(P3)" message.
Byzantine Generals Problem:
Intuition: a general only wants to proceed with the plan of attack if it is sure all the others agree, yet it cannot trust the other generals.
If the generals cannot trust one another, they can never be sure whether they should attack.

SHORT ANSWER QUESTIONS

1. What is meant by Wireless Elections?


If more than one election is called (multiple source nodes), a node should participate in only
one.
1) Election messages are tagged with a process id.
2) If a node has chosen a parent but gets an election message from a higher numbered node, it
drops out of the current election and adopts the high numbered node as its parent.

2. Discuss the Read – Replication Algorithm?


Replicates data objects to multiple nodes. The DSM keeps track of the location of data objects. Multiple nodes can have read access, or one node write access (the multiple readers – one writer protocol). After a write, all copies are invalidated or updated.

DSM has to keep track of locations of all copies of data objects. Examples of
implementations:
1. IVY: Owner node of data object knows all nodes that have copies.
2. PLUS: Distributed linked – list tracks all nodes that have copies.
Advantage: The read – replication can lead to substantial performance improvements if the ratio of
reads to writes is large.

3. What is the difference between message passing and DSM?


DSM                                              Message Passing
Variables are shared directly.                   Variables have to be marshalled yourself.
Processes can cause errors to one another        Processes are protected from one another by
by altering data.                                having private address spaces.
Processes must execute at the same time.         Processes may execute with non-overlapping
                                                 lifetimes.
The cost of communication is invisible.          The cost of communication is obvious.

4. What is meant by Memory Consistency?


To use DSM, one must also implement a distributed synchronization service. This includes
the use of locks, semaphores and message passing.

In most implementations, data is read from local copies of the data, but updates to data must be propagated to the other copies of the data.
Memory consistency models determine when data updates are propagated and what level
of inconsistency is acceptable.

5. What is meant by Thrashing?
Thrashing occurs when network resources are exhausted, and more time is spent invalidating
data and sending updates than is used doing actual work. Based on system specifics, one should
choose write – update or write – invalidate to avoid thrashing.

6. Discuss the Issues of DSM?


(a) Granularity:
1. Granularity refers to the block size of DSM
2. The unit of sharing and the unit of data transfer across the network when a network
block fault occurs
3. Possible unit are a few word, a page or a few pages
(b) Structure of Shared memory:
1. Structure refers to the layout of the shared data in memory
2. Dependent on the type of applications that the DSM system is intended to support.
(c) Memory coherence and access synchronization:
In a DSM system that allows replication of shared data items, copies of a shared data item may simultaneously be available in the main memories of a number of nodes. The memory coherence problem deals with keeping a piece of shared data lying in the main memories of two or more nodes consistent.
(d) Data location and access: To share data in a DSM, it should be possible to locate and retrieve the data accessed by a user process.
(e) Replacement strategy: If the local memory of a node is full, a cache miss at that node implies not only a fetch of the accessed data block from a remote node but also a replacement: an existing data block must be replaced by the new data block.
(f) Thrashing: Data blocks migrate between nodes on demand. Therefore, if two nodes compete for write access to a single data item, the corresponding data block may be transferred back and forth at a high rate.
(g) Heterogeneity: A DSM system built for homogeneous systems need not address the heterogeneity issue.

7. What are the Advantages of DSM?


1. Data sharing is implicit, hiding data movement (as opposed to explicit 'Send/Receive' in the message-passing model).
2. Passing data structures containing pointers is easier (in the message-passing model, data moves between different address spaces).
3. Moving an entire block or object to the user takes advantage of locality of reference.
4. Less expensive to build than tightly coupled multiprocessor system: off – the – shelf
hardware, no expensive interface to shared physical memory.
5. Very large total physical memory for all nodes: Large programs can run more efficiently.
6. No serial access to common bus for shared physical memory like in multiprocessor
systems.
7. Programs written for shared memory multiprocessors can be run on DSM systems with
minimum changes.

UNIT – IV
TASKS AND LOADING
Essay Questions

1. Define the task and explain the Task Assignment Approach?


A process has already been split up into pieces called Tasks. This split occurs along natural
boundaries (such as a method), so that each task will have integrity in itself and data transfers
among the tasks are minimized.

The amount of computation required by each task and the speed of each CPU are known. The cost of
processing each task on every node is known. This is derived from assumption 2.

The IPC costs between every pair of tasks are known. The IPC cost is 0 for tasks assigned to the same node. It is usually estimated by an analysis of the static program: if two tasks communicate n times and the average time for each inter-task communication is t, then the IPC cost for the two tasks is n*t. Precedence relationships among the tasks are known. Reassignment of tasks is not possible.

(A) The goal is to assign the tasks of a process to the nodes of a distributed system in such a manner as to achieve the following goals:

I. Minimization of IPC costs


II. Quick turnaround time for the complete process
III. A high degree of parallelism
IV. Efficient utilization of system resources in general

These goals often conflict. For example, while minimizing IPC costs tends to assign all tasks of a process to a single node, efficient utilization of system resources tries to distribute the tasks evenly among the nodes. Similarly, although quick turnaround time and a high degree of parallelism encourage parallel execution of the tasks, the precedence relationships among the tasks limit their parallel execution.

Also note that with m tasks and q nodes there are q^m possible assignments of tasks to nodes. In practice, however, the actual number of possible assignments may be smaller than q^m due to the restriction that certain tasks cannot be assigned to certain nodes because of their specific requirements (e.g., they need a certain amount of memory or a certain data file).

(B) There are two nodes, {n1, n2}, and six tasks, {t1, t2, t3, t4, t5, t6}. There are two task assignment parameters: the task execution cost (xab, the cost of executing task a on node b) and the inter-task communication cost (cij, the inter-task communication cost between tasks i and j).

Inter-task communication cost                      Execution costs
      t1   t2   t3   t4   t5   t6                  Nodes    n1    n2
t1     0    6    4    0    0   12                  t1        5    10
t2     6    0    8   12    3    0                  t2        2     ∞
t3     4    8    0    0   11    0                  t3        4     4
t4     0   12    0    0    5    0                  t4        6     3
t5     0    3   11    5    0    0                  t5        5     2
t6    12    0    0    0    0    0                  t6        ∞     4

Task t6 cannot be executed on node n1 and task t2 cannot be executed on node n2, since the resources they need are not available on those nodes (their execution costs are shown as ∞).

(1) Serial assignment, where tasks t1, t2, t3 are assigned to node n1 and tasks t4, t5, t6 are
assigned to node n2:
Execution cost, x= x11 + x21 + x31 + x42 + x52 + x62 = 5+ 2 + 4 + 3 + 2 + 4 = 20
Communication cost, c = c14 + c15 + c16 + c24 +c25 + c26 + c34 + c35 + c36 = 0+0+12+12+3+0+0+11+0=38.
Hence total cost =58.

(2) Optimal assignment, where tasks t1, t2, t3, t4, t5 are assigned to node n1 and task t6 is
assigned to node n2.
Execution cost, x = x11 +x21 + x31 +x41 +x51 + x62 = 5+2+4+6+5+4=26
Communication cost, c =c16 + c26 + c36 + c46 + c56 = 12+0+0+0+0 = 12
Total cost =38

Optimal assignments are found by first creating a static assignment graph. In this graph, the weights of the edges joining pairs of task nodes represent inter-task communication costs. The weight on the edge joining a task node to node n1 represents the execution cost of that task on node n2, and vice versa (cutting an edge corresponds to assigning the task to the other node, and thus charges the appropriate execution cost). We then determine a minimum cutset in this graph.

A cutset is defined to be a set of edges such that when these edges are removed, the nodes of
the graph are partitioned into two disjoint subsets such that nodes in one subset are reachable from
n1 and the nodes in the other are reachable from n2. Each task node is reachable from either n1 or
n2. The weight of a cutset is the sum of the weights of the edges in the cutset. This sums up the
execution and communication costs for that assignment. An optimal assignment is found by finding
a minimum cutset.
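The cost computation above is easy to check mechanically. Below is a minimal Python sketch that recomputes the costs of the serial and optimal assignments from the tables (INF marks the forbidden placements):

INF = float("inf")
exec_cost = {                    # task -> (cost on n1, cost on n2)
    "t1": (5, 10), "t2": (2, INF), "t3": (4, 4),
    "t4": (6, 3),  "t5": (5, 2),   "t6": (INF, 4),
}
ipc = {("t1", "t2"): 6, ("t1", "t3"): 4, ("t1", "t6"): 12, ("t2", "t3"): 8,
       ("t2", "t4"): 12, ("t2", "t5"): 3, ("t3", "t5"): 11, ("t4", "t5"): 5}

def total_cost(assignment):      # assignment: task -> 0 (n1) or 1 (n2)
    x = sum(exec_cost[t][node] for t, node in assignment.items())
    # IPC is paid only by pairs of tasks placed on different nodes.
    c = sum(w for (a, b), w in ipc.items() if assignment[a] != assignment[b])
    return x + c

serial  = {"t1": 0, "t2": 0, "t3": 0, "t4": 1, "t5": 1, "t6": 1}
optimal = {"t1": 0, "t2": 0, "t3": 0, "t4": 0, "t5": 0, "t6": 1}
print(total_cost(serial), total_cost(optimal))   # 58 38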

2. Explain the Load Balancing Approach?
The processes are distributed among nodes to equalize the load among all nodes. The
scheduling algorithms that use this approach are known as Load Balancing or Load Leveling
Algorithms. These algorithms are based on the intuition that for better resource utilization, it is
desirable for the load in a distributed system to be balanced evenly. This a load balancing
algorithms tries to balance the total system load by transparently transferring the workload from
heavily loaded nodes to lightly loaded nodes in an attempt to ensure good overall performance
relative to some specific metric of system performance.

Categories of load balancing algorithms:


(i) Static: Ignore the current state of the system. E.g., if a node is heavily loaded, it picks a task at random and transfers it to a random node. These algorithms are simpler to implement, but performance may not be good.
(ii) Dynamic: Use the current state information for load balancing. There is an overhead
involved in collecting state information periodically; they perform better than static
algorithms.
(iii) Deterministic: Algorithms in this class use the processor and process characteristics to
allocate processes to nodes.
(iv) Probabilistic: Algorithms in this class use information regarding static attributes of the
system such as number of nodes, processing capability, etc.
(v) Centralized: System state information is collected by a single node. This node makes all
scheduling decisions.
(vi) Distributed: Most desired approach. Each node is equally responsible for making
scheduling decisions based on the local state and the state information received from
other sites.
(vii) Cooperative: A distributed dynamic scheduling algorithm. In these algorithms, the
distributed entities cooperate with each other to make scheduling decisions. Therefore
they are more complex and involve larger overhead than non – cooperative ones. But
the stability of a cooperative algorithm is better than of non-cooperative one.
(viii) Non – Cooperative: A distributed dynamic scheduling algorithm. In these algorithms,
individual entities act as autonomous entities and make scheduling decisions
independently of the action of other entities.
3. Explain the Load Sharing Approach?
Several researchers believe that load balancing, with its implication of attempting to
equalize workload on all the nodes of the system, is not an appropriate objective. This is because
the overhead involved in gathering the state information to achieve this objective is normally very
large, especially in distributed systems having a large number of nodes. In fact, for the proper
utilization of resources of a distributed system, it is not required to balance the load on all the
nodes. It is necessary and sufficient to prevent the nodes from being idle while some other nodes
have more than two processes. This rectification is called the Dynamic Load Sharing instead of
Dynamic Load Balancing..

The design of a load-sharing algorithm requires that proper decisions be made regarding
load estimation policy, process transfer policy, state information exchange policy, priority
assignment policy, and migration limiting policy. It is simpler to decide about most of these policies
in case of load sharing, because load sharing algorithms do not attempt to balance the average
workload of all the nodes of the system. Rather, they only attempt to ensure that no node is idle
while another node is heavily loaded. The priority assignment policies and the migration limiting policies for load-sharing algorithms are the same as those of load-balancing algorithms.

4. Discuss the Migration?


(1) Processes move from heavily loaded to lightly loaded nodes, and also to minimize communication costs.
(2) Code moves to the data, rather than data to the code.
(3) Late binding of a protocol: the client downloads the protocol implementation when it is needed.

The principle of dynamically configuring a client to communicate with a server: the client first fetches the necessary software and then invokes the server.

Fig: Dynamic Client Configuration: (1) the client fetches the service-specific client-side code from a code repository; (2) client and server communicate.

A process consists of three segments:


(1) Code segment: contains the actual code.
(2) Resource segment: contains references to external resources needed by the process.
E.g., files, printers, devices, other processes
(3) Execution segment: stores the current execution state of the process, consisting of private data, the stack, and the program counter.

5. Explain the Threads?
A minimal software processor in whose context a series of instructions can be executed.
Saving a thread context implies stopping the current execution and saving all the data needed to
continue the execution at a later stage.
Processor context: The minimal collection of values stored in the registers of a processor
used for the execution of a series of instructions (e.g., stack pointer, addressing registers, program
counter).
Thread context: The minimal collection of values stored in registers and memory, used for
the execution of a series of instructions (i.e., processor context, state).
Process context: The minimal collection of values stored in registers and memory, used for
the execution of a thread (i.e., thread context, but now also at least MMU register values).
Main Issue: Should an OS kernel provide threads or should they be implemented as part of a
user – level package?
User – space solution:
1. Nothing to do with the kernel. Can be very efficient.
2. But everything done by a thread affects the whole process. So what happens when a
thread blocks on a syscall?
3. Can we use multiple CPUs/cores?
Kernel solution:
The kernel implements threads; every thread operation requires a system call.
1. Operations that block a thread are no longer a problem: kernel schedules another.
2. External events are simple: the kernel (which catches all events) schedules the thread
associated with the event.
3. Less efficient.
4. Conclusion: Try to mix user – level and kernel – level threads into a single concept.

SHORT ANSWER QUESTIONS

1. What is meant by resource management?


A resource can be a logical, such as a shared file, or physical, such as a CPU (a node of the
distributed system). One of the functions of a distributed operating system is to assign processes to
the nodes (resources) of the distributed system such that the resource usage, response time,
network congestion, and scheduling overhead are optimized.
There are three techniques for scheduling processes of a distributed system:
(1) Task Assignment Approach, in which each process submitted by a user for processing is viewed
as a collection of related tasks and these tasks are scheduled to suitable nodes so as to improve
performance.
(2) Load – balancing approach, in which all the processes submitted by the users are distributed
among the nodes of the system so as to equalize the workload among the nodes.
(3) Load – sharing approach, which simply attempts to conserve the ability of the system to
perform work by assuring that no node is idle while processes wait for being processed.

The task assignment approach has limited applicability to practical situations because it works on the assumption that the characteristics (e.g., execution time, IPC costs, etc.) of all the processes to be scheduled are known in advance.

2. Discuss the Issues of Designing of Load Balancing Algorithms?


i. Load estimation policy: Determines how to estimate the workload of a node
ii. Process transfer policy: Determines whether to execute a process locally or remotely
iii. State information exchange policy: Determines how to exchange load information
among nodes.
iv. Location Policy: Determines to which node the transferable process should be sent
v. Priority assignment policy: Determines the priority of execution of local and remote
process
vi. Migration limiting policy: Determine the total number of times a process can migrate

3. What is meant by Processor and Process?


Processor: Provides a set of instructions along with the capability of automatically executing a
series of those instructions.
Process: A software processor in whose context one or more threads may be executed.
Executing a thread, means executing a series of instructions in the context of that thread.

4. Discuss the Transparency in Client Side?


1. Access transparency: client – side stubs for RPCs.
2. Location/migration transparency: Let client-side software keep track of the actual location.
3. Replication transparency: Multiple invocations handled by client – side stub.
4. Failure transparency: Can often be placed only at client.

Unit – V
Distributed File Systems
Essay Questions
1. What is meant by file and explain the file models?
Two main purposes of using files:
1. Permanent storage of information on a secondary storage media.
2. Sharing of information between applications.
File Models:
(a) Unstructured and Structured files: In the unstructured model, a file is an
unstructured sequence of bytes. The interpretation of the meaning and structure of the data stored
in the files is up to the application (e.g. UNIX and MS-DOS). Most modern operating systems use the
unstructured file model.
In structured files (rarely used now) a file appears to the file server as an ordered sequence
of records. Records of different files of the same file system can be of different sizes.
(b) Mutable and immutable files: Based on the modifiability criteria, files are of two types,
mutable and immutable. Most existing operating systems use the mutable file model. An update
performed on a file overwrites its old contents to produce the new contents.
In the immutable model, rather than updating the same file, a new version of the file is created each
time a change is made to the file contents and the old version is retained unchanged. The problems
in this model are increased use of disk space and increased disk activity.

2. Explain the File Caching Schemes?


Every distributed file system uses some form of caching. The reasons are:
1. Better performance, since repeated accesses to the same information are handled locally, avoiding additional network accesses and disk transfers. This is due to locality in file access patterns.
2. It contributes to the scalability and reliability of the distributed file system since data
can be remotely cached on the client node.
Key decisions to be made in file – caching scheme for distributed systems:
1. Cache location.
2. Modification Propagation.
3. Cache Validation.
1. Cache Location: This refers to the place where the cached data is stored. Assuming that the original location of a file is on its server's disk, there are three possible cache locations in a distributed file system:

i. Server's main memory:
1) In this case a cache hit costs one network access.
2) It does not contribute to the scalability and reliability of the distributed file system,
3) since every cache hit requires accessing the server.
Advantages:
1. Easy to implement.
2. Totally transparent to clients.
3. Easy to keep the original file and the cached data consistent.
ii. Client's disk:
In this case a cache hit costs one disk access. This is somewhat slower than having the cache in the server's main memory; having the cache in the server's main memory is also simpler.
Advantages:
1. Provides reliability against crashes since modification to cached data is lost in a crash if
the cache is kept in main memory.
2. Large storage capacity.
3. Contributes to scalability and reliability because on a cache hit the access request can be
serviced locally without the need to contact the server.
iii. Client's main memory:
Eliminates both the network access cost and the disk access cost. This technique is not preferred over a client's disk cache when a large cache size and increased reliability of cached data are desired.
Advantages:
1. Maximum performance gain.
2. Permits workstations to be diskless.
3. Contributes to reliability and scalability.
Modification Propagation:
When the cache is located on client nodes, a file's data may simultaneously be cached on multiple nodes. Caches can become inconsistent when the file data is changed by one of the clients and the corresponding data cached at other nodes is not changed or discarded.
There are two design issues involved:
1. When to propagate modifications made to a cached data to the corresponding file
server.
2. How to verify the validity of cached data.
The modification propagation scheme used has a critical effect on the systems performance and
reliability. Techniques used include:
(a) Write – through scheme: When a cache entry is modified, the new value is immediately
sent to the server for updating the master copy of the file.

Advantage:
1. High degree of reliability and suitability for UNIX-like semantics.
2. This is due to the fact that the risk of updated data getting lost in the event of a client
crash is very low since every modification is immediately propagated to the server
having the master copy.
Disadvantage:
1. This scheme is only suitable where the ratio of read – to –write accesses is fairly large. It
does not reduce network traffic for writes.
2. This is due to the fact that every write access has to wait until the data is written to the
master copy at the server. Hence data caching helps only read accesses, because the server is involved in all write accesses.
(b) To reduce network traffic for writes the delayed – write scheme is used. In this case, the
new data value is only written to the cache and all updated cache entries are sent to the
server at a later time. There are three commonly used delayed – write approaches:
(i) Write on ejection from cache:
Modified data in cache is sent to server only when the cache-replacement policy has
decided to eject it form clients cache. This can result in good performance but there can be a
reliability problem since some server data may be outdated for a long time.
(ii) Periodic write:
The cache is scanned periodically and any cached data that has been modified since the last scan is sent to the server.
(iii) Write on close:
Modification to cached data is sent to the server when the client closes the file. This
does not help much in reducing network traffic for those files that are open for very short
periods or are rarely modified.
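The difference between write-through and the delayed-write variants is essentially when modified entries reach the server. A minimal illustrative Python sketch (the server is modeled as a plain dictionary; all class names are hypothetical):

class WriteThroughCache:
    def __init__(self, server):
        self.server, self.cache = server, {}

    def write(self, key, value):
        self.cache[key] = value
        self.server[key] = value      # propagated to the master copy at once

class DelayedWriteCache:
    def __init__(self, server):
        self.server, self.cache, self.dirty = server, {}, set()

    def write(self, key, value):
        self.cache[key] = value       # only the local copy changes for now
        self.dirty.add(key)

    def flush(self):
        # The three delayed-write variants differ only in when this runs:
        # on ejection from the cache, periodically, or on file close.
        for key in self.dirty:
            self.server[key] = self.cache[key]
        self.dirty.clear()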
Cache Validation Schemes: The modification propagation policy only specifies when the master copy of a file on the server node is updated upon modification of a cache entry. It does not say anything about when the file data residing in the caches of other nodes is updated.
A file's data may simultaneously reside in the caches of multiple nodes. A client's cache entry becomes stale as soon as some other client modifies the data corresponding to that cache entry in the master copy of the file on the server.
It becomes necessary to verify if the data cached at a client node is consistent with
the master copy. If not, the cached data must be invalidated and the updated version of the
data must be fetched again from the server.
There are two approaches to verify the validity of cached data: The client – initiated
approach and the server – initiated approach.

Client – initiated approach: The client contacts the server and checks whether its locally cached
data is consistent with the master copy. Two approaches may be used:

1. Checking before every access: This defeats the purpose of caching, because the server needs to be contacted on every access.
2. Periodic checking: A check is initiated at every fixed interval of time.

Server – initiated approach:


A client informs the file server when opening a file, indicating whether the file is being opened for reading, writing, or both. The file server keeps a record of which client has which file open and in what mode.
So server monitors file usage modes being used by different clients and reacts whenever it
detects a potential for inconsistency. E.g. if a file is open for reading, other clients may be allowed to
open it for reading, but opening it for writing cannot be allowed. So also, a new client cannot open a
file in any mode if the file is open for writing.
When a client closes a file, it sends intimation to the server along with any modifications
made to the file. Then the sever updates its record of which client has which file open in which
mode.
When a new client makes a request to open an already open file and if the server finds that
the new open mode conflicts with the already open mode, the server can deny the request, queue
the request, or disable caching by asking all clients having the file open to remove that file from
their caches.

3. Explain the Atomic Transactions?


A transaction is a sequence of operations that performs a single logical function, separate from all other transactions.
Examples:
1. Withdrawing money from your account
2. Making an airline reservation
3. Making a credit – card purchase
A transaction that happens completely or not at all
 No partial results
Example:
1. Cash machine hands you cash and deducts amount from your account.
2. Airline confirms your reservation and.
a. Reduces number of free seats.
b. Charges your credit card.
c. (Sometimes) increases number of meals loaded onto flight.
Fundamental principles – A C I D
1. Atomicity – to outside world, transaction happens indivisibly
2. Consistency – transaction preserves system invariants
3. Isolated – transaction do not interfere with each other
4. Durable – once a transaction “commits”, the changes are permanent

Programming in a Transaction System:
1. Begin_transaction: Mark the start of a transaction.
2. End_transaction: Mark the end of a transaction and try to “commit”.
3. Abort_transaction: Terminate the transaction and restore old values.
4. Read: Read data from a file, table, etc., on behalf of the transaction.
5. Write: Write data to file, table, etc., on behalf of the transaction.
6. Nested Transactions: One or more transactions inside another transaction.
May individually commit, but may need to be undone.
Example:
1. Planning a trip involving three flights:
2. Reservation for each flight “commits” individually.
3. Must be undone if entire trip cannot commit.
Distributed transactions are atomic transactions that span multiple sites and/or systems, with the same semantics as atomic transactions on a single system:
 ACID
Failure modes:
1. Crash or other failure of one site or system
2. Network failure or partition
3. Byzantine failures
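A minimal illustrative Python sketch of the transaction primitives listed above, using a private workspace so that end_transaction installs all updates at once and abort_transaction discards them (a single-process sketch under simplifying assumptions; real systems must also handle concurrency and durability):

class Transaction:
    def __init__(self, store):
        self.store, self.workspace = store, {}

    def begin_transaction(self):
        self.workspace = {}

    def read(self, key):
        return self.workspace.get(key, self.store.get(key))

    def write(self, key, value):
        self.workspace[key] = value    # shadow copy, invisible to others

    def end_transaction(self):
        self.store.update(self.workspace)   # commit: install all updates
        self.workspace = {}

    def abort_transaction(self):
        self.workspace = {}            # discard updates: old values remain

account = {"balance": 100}
t = Transaction(account)
t.begin_transaction()
t.write("balance", t.read("balance") - 30)
t.abort_transaction()                  # the withdrawal never happened
print(account["balance"])              # 100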

4. Explain the Authentication?


Authentication is the process of determining whether someone or something is, in fact, who or
what it is declared to be.
Logically, authentication precedes authorization (although they may often seem to be
combined). The two terms are often used synonymously but they are two different processes.
Message authentication addresses the threat in which the user is not sure about the originator of a message. Message authentication can be provided using cryptographic techniques that use secret keys, as is done in the case of encryption.
Message Authentication Code (MAC): A MAC algorithm is a symmetric-key cryptographic technique for providing message authentication. To establish the MAC process, the sender and receiver share a symmetric key K.

Fig: MAC-based authentication: the sender runs the message and the shared key K through the MAC algorithm and sends the message together with the MAC; the receiver recomputes the MAC over the received message with the same key K and checks whether the two MAC values are equal.

Essentially, a MAC is an encrypted checksum generated on the underlying message that is sent along with the message to ensure message authentication.
The process of using a MAC for authentication is depicted in the illustration above. Let us now try to understand the entire process in detail:
1. The sender uses some publicly known MAC algorithm, inputs the message and the
secret key K and produces a MAC value.
2. Similar to hash, MAC function also compresses an arbitrary long input into a fixed length
output. The major difference between hash and MAC is that MAC uses secret key during
the compression.
3. The sender forwards the message along with the MAC. Here, we assume that the message is sent in the clear, as we are concerned with providing message-origin authentication, not confidentiality. If confidentiality is required, then the message needs encryption.
4. On receipt of the message and the MAC, the receiver feeds the received message and the shared secret key K into the MAC algorithm and re-computes the MAC value.
5. The receiver now checks equality of freshly computed MAC with the MAC received from
the sender. If they match, then the receiver accepts the message and assures himself
that the message has been sent by the intended sender.
6. If the computed MAC does not match the MAC sent by the sender, the receiver cannot determine whether it is the message that has been altered or the origin that has been falsified. As a bottom line, the receiver safely assumes that the message is not genuine.
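A minimal runnable example of this exchange using Python's standard hmac module, with HMAC-SHA256 playing the role of the publicly known MAC algorithm (the key and message here are illustrative):

import hmac, hashlib

key = b"shared-secret-K"          # the symmetric key K shared in advance
message = b"transfer 100 to account 42"

# Sender: compute the MAC and send (message, tag) in the clear.
tag = hmac.new(key, message, hashlib.sha256).digest()

# Receiver: recompute the MAC over the received message with the same key.
expected = hmac.new(key, message, hashlib.sha256).digest()
print(hmac.compare_digest(tag, expected))   # True -> accept the message
# Any change to the message (or a wrong key) makes the comparison fail.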
SHORT ANSWER QUESTIONS

1. Discuss the features of distributed file system?


(i) Transparency:
a. Structure transparency: Clients should not know the number or locations of file
servers and the storage devices.
Note: multiple file servers are provided for performance, scalability, and reliability.
b. Access transparency: Both local and remote files should be accessible in the same way. The file system should automatically locate an accessed file and transport it to the client's site.
c. Naming transparency: The name of the file should give no hint as to the location of
the file. The name of the file must not be changed when moving from one node to
another.
d. Replication transparency: If a file is replicated on multiple nodes, both the
existence of multiple copies and their locations should be hidden from the clients.
(ii) User mobility:
Automatically bring the user's environment (e.g. the user's home directory) to the node where the user logs in.

(iii) Performance:
Performance is measured as the average amount of time needed to satisfy client
requests. This time includes CPU time + time for accessing secondary storage + network
access time. It is desirable that the performance of a distributed file system be
comparable to that of a centralized file system.
(iv) Simplicity and ease of use:
The user interface to the file system should be simple, and the number of commands should be as small as possible.
(v) Scalability:
Growth of nodes and users should not seriously disrupt service.
(vi) High availability:
A distributed file system should continue to function in the face of partial failures such
as link failure, a node failure, or a storage device crash.
A highly reliable and scalable distributed file system should have multiple and
independent file servers controlling multiple and independent storage devices.
(vii) High reliability:
Probability of loss of stored data should be minimized. System should automatically
generate backup copies of critical files.
(viii) Data integrity:
Concurrent access requests from multiple users who are competing to access the file
must be properly synchronized by the use of some form of concurrency control
mechanism. Atomic transactions can also be provided.
(ix) Security:
Users should be confident of the privacy of their data.
(x) Heterogeneity:
There should be easy access to shared data on diverse platforms (e.g. Unix workstation,
Wintel platform etc).

2. Explain the functions of distributed file system?


A file system is a subsystem of the operating system that performs file management
activities such as organization, storing, retrieval, naming, sharing and protection of files.
A file system frees the programmer from concerns about the details of space allocation and
layout of the secondary storage device.
The design and implementation of a distributed file system is more complex than a
conventional file system due to the fact that the users and storage devices are physically dispersed.
In addition to the functions of the file system of a single processor system, the distributed
file system supports the following:

(a) Remote information sharing: Any node, irrespective of the physical location of the file, can access the file.
(b) User mobility: A user should be permitted to work on different nodes.
(c) Availability: For better fault-tolerance, files should be available for use even in the
event of temporary failure of one or more nodes of the system. Thus the system should maintain
multiple copies of the files, the existence of which should be transparent to the user.

3. Explain the File Accessing Models?


The file accessing model of a distributed file system depends on the method used for accessing remote files and on the unit of data access.
(a) Accessing remote files: A distributed file system may use one of the following models to service a client's file access request when the accessed file is remote:

(i) Remote service model: Processing of a client's request is performed at the server's node. The client's request for file access is delivered across the network as a message to the server, the server machine performs the access, and the result is sent back to the client. The number of messages sent and the overhead per message need to be minimized.

(ii) Data-caching model: This model attempts to reduce the network traffic of the previous model by caching the data obtained from the server node. This takes advantage of the locality found in file accesses. A replacement policy such as LRU is used to keep the cache size bounded (see the sketch after this list).

While this model reduces network traffic, it has to deal with the cache coherency problem during writes, because the local cached copy, the original file at the server node, and copies in any other caches all need to be updated.

(b) Diskless workstations: A distributed file system, with its transparent remote-file accessing capability, allows the use of diskless workstations in a system.
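
The data-caching model can be sketched as follows: a client-side cache bounded by an LRU replacement policy, where fetch_from_server() is a hypothetical stand-in for the remote service call.

from collections import OrderedDict

def fetch_from_server(path: str) -> bytes:
    # Hypothetical remote service call (e.g., an RPC to the file server).
    return ("<contents of %s>" % path).encode()

class ClientFileCache:
    # LRU-bounded client-side cache for the data-caching model.
    def __init__(self, capacity: int = 64):
        self.capacity = capacity
        self.cache = OrderedDict()

    def read(self, path: str) -> bytes:
        if path in self.cache:
            self.cache.move_to_end(path)     # mark as most recently used
            return self.cache[path]          # cache hit: no network traffic
        data = fetch_from_server(path)       # cache miss: go to the server
        self.cache[path] = data
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)   # evict the least recently used entry
        return data

Note that this sketch ignores the cache coherency problem on writes, which a real implementation must handle.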

4. Explain the File Sharing Semantics?


UNIX semantics is implemented in file systems for single-CPU systems because it is the most desirable semantics and because it is easy to serialize all read/write requests there. Implementing
UNIX semantics in a distributed file system is not easy. One may think that this can be achieved in a
distributed system by disallowing files to be cached at client nodes and allowing a shared file to be
managed by only one file server that processes all read and write requests for the file strictly in the
order in which it receives them. However, even with this approach, there is a possibility that, due to
network delays, client requests from different nodes may arrive and get processed at the server
node in an order different from the actual order in which the requests were made.

Also, having all file access requests processed by a single server and disallowing caching on client
nodes is not desirable in practice due to poor performance, poor scalability, and poor reliability of
the distributed file system.

23
Brightway Computers Distributed Systems
Hence distributed file systems implement a more relaxed semantics of file sharing. Applications that need to guarantee UNIX semantics should provide mechanisms (e.g. mutex locks) themselves, as sketched below, and not rely on the underlying sharing semantics provided by the file system.
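
A minimal single-node sketch of such an application-level mechanism follows; in a truly distributed setting the lock would have to come from a distributed lock service, which is assumed away here.

import threading

file_lock = threading.Lock()   # one lock per shared file (single-process sketch)

def append_record(path: str, record: bytes) -> None:
    # The application serializes access itself instead of relying on the
    # file system's relaxed sharing semantics.
    with file_lock:
        with open(path, "ab") as f:
            f.write(record + b"\n")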

5. Write the advantages of the delayed-write scheme?

(a) Write accesses complete more quickly because the new value is written only to the client cache. This results in a performance gain.
(b) Modified data may be deleted before it is time to send them to the server (e.g. temporary data). Since such modifications never need to be propagated to the server, this results in a major performance gain.
(c) Gathering of all file updates and sending them together to the server is more efficient
than sending each update separately.
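
A minimal sketch of the scheme described above is given below; flush_to_server() is a hypothetical bulk-update message to the file server.

def flush_to_server(batch: dict) -> None:
    # Hypothetical bulk-update RPC, shown as a stub.
    print("sending %d updates in one message" % len(batch))

class DelayedWriteCache:
    def __init__(self):
        self.dirty = {}             # modifications not yet sent to the server

    def write(self, path: str, data: bytes) -> None:
        self.dirty[path] = data     # completes quickly: cache only (advantage a)

    def delete(self, path: str) -> None:
        self.dirty.pop(path, None)  # temporary data may never reach the server (advantage b)

    def flush(self) -> None:
        batch, self.dirty = self.dirty, {}
        if batch:
            flush_to_server(batch)  # all gathered updates sent together (advantage c)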

6. Explain the Replication Transparency?


Replication of files should be transparent to the users so that multiple copies of a replicated
file appear as a single logical file to its users. This calls for the assignment of a single
identifier/name to all replicas of a file.

In addition, replication control should be transparent, i.e., the number and locations of replicas of a replicated file should be hidden from the user. Thus replication control must be handled automatically in a user-transparent manner.

7. What are the tools for atomic transactions?


(a) Begin_transaction:
Place a begin entry in log
(b) Write:
Write updated data to log
(c) Abort_transaction:
Place abort entry in log
(d) End_transaction (i.e., commit):
i. Place commit entry in log
ii. Copy logged data to files
iii. Place done entry in log
(e) Crash recovery – search the log:
i. If a begin entry is found, look for matching entries.
ii. If done, do nothing (all files have been updated).
iii. If abort, undo any permanent changes that the transaction may have made.
iv. If commit but not done, copy the updated blocks from the log to the files, then add a done entry.
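
The write-ahead-logging discipline above can be sketched as follows; the log format and file names are illustrative assumptions, and the crash-recovery scan itself is omitted.

import json

LOG_PATH = "txn.log"   # hypothetical log file on stable storage

def log_entry(entry: dict) -> None:
    # Append one record to the log before touching the real files.
    with open(LOG_PATH, "a") as log:
        log.write(json.dumps(entry) + "\n")
        log.flush()

def run_transaction(tid: int, updates: dict) -> None:
    log_entry({"tid": tid, "type": "begin"})              # begin entry
    for name, value in updates.items():
        log_entry({"tid": tid, "type": "write",
                   "file": name, "value": value})         # updated data to log
    log_entry({"tid": tid, "type": "commit"})             # commit entry
    for name, value in updates.items():
        with open(name, "w") as f:
            f.write(value)                                # copy logged data to files
    log_entry({"tid": tid, "type": "done"})               # done entry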

8. Explain the File Replication?


High availability is a desirable feature of a good distributed file system and file replication is the
primary mechanism for improving file availability.
A replicated file is a file that has multiple copies, with each copy stored on a separate file server.

Difference between Replication and Caching:
1. A replica of a file is associated with a server, whereas a cached copy is normally associated with
a client.
2. The existence of a cached copy is primarily dependent on the locality in file access patterns,
whereas the existence of a replica normally depends on availability and performance
requirements.
3. As compared to a cached copy, a replica is more persistent, widely known, secure, available,
complete and accurate.
4. A cached copy is contingent upon a replica. Only by periodic revalidation with respect to a
replica can a cached copy be useful.

Advantages of Replication:

1. Increased availability: Alternate copies of replicated data can be used when the primary copy is unavailable.
2. Increased reliability: Due to the presence of redundant data files in the system, recovery from catastrophic failures (e.g. a hard drive crash) becomes possible.
3. Improved response time: Replication enables data to be accessed either locally or from a node whose access time is lower than that of the primary copy.
4. Reduced network traffic: If a file's replica is available with a file server that resides on the client's node, the client's access request can be serviced locally, resulting in reduced network traffic.
5. Improved system throughput: Several clients' requests for access to a file can be serviced in parallel by different servers, resulting in improved system throughput.
6. Better scalability: Multiple file servers are available to service client requests due to file replication. This improves scalability.

9. Explain the Digital Signature?


Digital signatures allow us to verify the author and the date and time of a signature, and to authenticate the message contents. A digital signature thus includes the message authentication function along with additional capabilities. A digital signature should not only be tied to the signing user but also to the message.
The following points explain the entire process in detail:
1. Each person adopting this scheme has a public – private key pair.
2. Generally, the key pairs used for encryption/decryption and signing/verifying are
different. The private key used for signing is referred to as the signature key and the public
key as the verification key.
3. The signer feeds the data to the hash function and generates a hash of the data.


[Figure: Digital signature model – on the signer's side, the data is fed to a hashing function and the resulting hash is signed with the signer's private key by the signature algorithm; the signature is appended to the data. On the verifier's side, the verification algorithm is run on the signature with the signer's public key, the received data is hashed with the same hashing function, and the two results are compared for equality.]

4. Hash value and signature key are then fed to the signature algorithm which produces the
digital signature on given hash. Signature is appended to the data and then both are sent to
the verifier.
5. Verifier feeds the digital signature and the verification key into the verification algorithm.
The verification algorithm gives some value as output.
6. Verifier also runs same hash function on received data to generate hash value.
7. For verification, this hash value and output of verification algorithm are compared. Based
on the comparison result, verifier decides whether the digital signature is valid.
8. Since a digital signature is created with the signer's ‘private’ key, and no one else can have this key, the signer cannot repudiate having signed the data in the future.
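
For illustration, here is a minimal sketch of the sign/verify flow using the third-party cryptography package (RSA-PSS with SHA-256); the key size and padding parameters are illustrative assumptions, not part of the text above.

from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import rsa, padding

# Step 1: the signer has a public/private key pair.
private_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
public_key = private_key.public_key()

data = b"pay 500 to Alice"
pss = padding.PSS(mgf=padding.MGF1(hashes.SHA256()),
                  salt_length=padding.PSS.MAX_LENGTH)

# Steps 3-4: the library hashes the data and signs the hash with the signature key.
signature = private_key.sign(data, pss, hashes.SHA256())

# Steps 5-7: verification recomputes the hash and checks the signature;
# it raises InvalidSignature if the data or the signature was altered.
public_key.verify(signature, data, pss, hashes.SHA256())
print("signature valid")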

10. Discuss the Cryptography?


The word ‘cryptography’ was coined by combining two Greek words: ‘kryptos’, meaning hidden, and ‘graphein’, meaning writing.
Context of Cryptography: Cryptology, the study of cryptosystems, can be subdivided into two
branches
(a) Cryptography; (b) Cryptanalysis


a) Cryptography: Cryptography is the art and science of making a cryptosystem that is capable of
providing information security.
Cryptography deals with the actual securing of digital data. It refers to the design of mechanisms
based on mathematical algorithms that provide fundamental information security services. You can
think of cryptography as the establishment of a large toolkit containing different techniques in
security applications.
b) Cryptanalysis: The art and science of breaking cipher text is known as cryptanalysis. Cryptanalysis is the sister branch of cryptography, and the two co-exist. The cryptographic process results in cipher text for transmission or storage; cryptanalysis involves the study of cryptographic mechanisms with the intention of breaking them. Cryptanalysis is also used during the design of new cryptographic techniques to test their security strength.
Note: Cryptography concerns itself with the design of cryptosystems, while cryptanalysis studies the breaking of cryptosystems.

11. Explain the Access Control?


Access control is to limit the actions or operations that a legitimate user of a computer system can
perform. Access control constrains what a user can do directly, as well as what programs executing
on behalf of the users are allowed to do. In this way access control seeks to prevent activity that
could lead to a breach of security.

Access control relies on and coexists with other security services in a computer system. Access
control is concerned with limiting the activity of legitimate users.

It is enforced by a reference monitor which mediates every attempted access by a user (or program
executing on behalf of that user) to objects in the system. The reference monitor consults an
authorization database in order to determine if the user attempting to do an operation is actually
authorized to perform that operation. Authorizations in this database are administered and
maintained by a security administrator. The administrator sets these authorizations on the basis of
the security policy of the organization. Users may also be able to modify some portion of the
authorizations database, for instance, to set permissions for their personal files. Auditing monitors
and keeps a record of relevant activity in the system.
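
A toy sketch of a reference monitor consulting an authorization database is shown below; the in-memory ACL and audit list are illustrative stand-ins for the protected, administrator-maintained stores described above.

# Authorization database: (user, object) -> set of permitted operations.
AUTHORIZATIONS = {
    ("alice", "/payroll.db"): {"read"},
    ("bob", "/payroll.db"): {"read", "write"},
}

AUDIT_LOG = []

def reference_monitor(user: str, obj: str, operation: str) -> bool:
    # Mediate every attempted access and record it for auditing.
    allowed = operation in AUTHORIZATIONS.get((user, obj), set())
    AUDIT_LOG.append((user, obj, operation, allowed))
    return allowed

# Authentication of `user` is assumed to have succeeded beforehand.
assert reference_monitor("bob", "/payroll.db", "write")
assert not reference_monitor("alice", "/payroll.db", "write")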

It is important to make a clear distinction between authentication and access control. Correctly
establishing the identity of the user is the responsibility of the authentication service. Access
control assumes that the authentication of the user has been successfully verified prior to
enforcement of access control via a reference monitor.
