Вы находитесь на странице: 1из 34

Distributed Systems

Distributed Systems Overview
What is a Distributed
 A collection of independent computers that
appears to its users as a single coherent
 Features:
 No shared memory
 Each runs its own OS
 Heterogeneity
 Purpose
 To present a single system image
Distributed Systems
 To present a single-system image:
 Hide internal organization, communication
 Easily expandable
 Adding new computers is hidden from users
 Continuous availability
 Failures in one component can be covered by
other components
 Supported by middleware
Definition of a Distributed

A distributed system organized as middleware.

The middleware layer runs on all machines, and offers a uniform interface to the users
Role of Middleware (MW)
 In some early research systems: MW tried
to provide the illusion that a collection of
separate machines was a single computer.
 Today:
 clustering software allows independent
computers to work together closely
 MW also supports seamless access to remote
services, doesn’t try to look like a general-
purpose OS
Middleware Examples
 (Common Object Request Broker Architecture)
 (Distributed Component Object Management)
– being replaced by .net
 Sun’s ONC RPC (Remote Procedure Call)
 RMI (Remote Method Invocation)
 SOAP (Simple Object Access Protocol)
Middleware Examples
 All of the previous examples support
communication across a network
 They provide protocols that allow a
program running on one kind of computer,
using one kind of operating system, to call
a program running on another computer
with a different operating system
 The communicating programs must be
running the same middleware.
Distributed Systems Goals
 Resource Availability
 Concurrency
 Distribution Transparency
 Openness
 Scalability
 Reliability & Fault Tolerance
Goal 1: Resource
 Ability to use any hardware, software or
data anywhere in the system.
 Support user access to remote resources
(printers, data files, web pages, CPU
cycles) and the fair sharing of the
 Economics of sharing expensive resources
Goal 1: Resource
 Performance enhancement – due to
multiple processors; also due to ease of
collaboration and info exchange – access
to remote services
 Groupware: tools to support collaboration
 Resource sharing introduces security
Goal 2: Concurrency
 Processes are said to be concurrent if they
run at the same time
 Components in distributed systems are
executed in concurrent processes.
 Components access and update shared
resources (e.g. variables, databases,
device drivers).
Goal 2: Concurrency
 Integrity of the system may be violated if
concurrent updates are not coordinated.
 Lost updates
 Inconsistent analysis
Goal 3: Distribution
 Software hides some of the details of the
distribution of system resources.
 Makes the system more user friendly.
 A DS that appears to its users &
applications to be a single computer
system is said to be transparent.
 Users & apps should be able to access remote
resources in the same way they access local
 Transparency has several dimensions.
Types of Transparency
Degrees of Transparency
 Trade-off: transparency versus other
 Reduced performance: multiple attempts to
contact a remote server can slow down the
system – should you report failure and let user
cancel request?
 Convenience: direct the print request to my
local printer, not one on the next floor
 Too much emphasis on transparency may
prevent the user from understanding
system behavior.
Goal 4: Openness
 An open distributed system “offers
services according to standard rules that
describe the syntax and semantics of
those services.”
 In other words, the interfaces to the
system are clearly specified and freely
 Compare to network protocols
 Not proprietary
Goal 4: Openness
 Interface Definition/Description Languages
(IDL): used to describe the interfaces
between software components, usually in
a distributed system
 Definitions are language & machine
 Support communication between systems
using different OS/programming languages;
e.g. a C++ program running on Windows
communicates with a Java program running on
Open Systems Support
 Interoperability: the ability of two
different systems or applications to work
 A process that needs a service should be able
to talk to any process that provides the
 Multiple implementations of the same service
may be provided, as long as the interface is
Open Systems Support
 Portability: an application designed to run
on one distributed system can run on
another system which implements the
same interface.
 Extensibility: Easy to add new
components, features
Goal 5: Scalability
 Scalability: At least three components:
 Number of users and/or processes (size
 Maximum distance between nodes
(geographical scalability)
 Number of administrative domains
(administrative scalability)
 Most systems account only, to a certain
extent, for size scalability.
 Today, the challenge lies in geographical
and administrative scalability.
Size Scalability
 Scalability is negatively affected when the
system is based on:
 Centralized server: one for all users
 Centralized data: a single data base for all
 Centralized algorithms: one site collects all
information, processes it, distributes the
results to all sites.
 Complete knowledge: good
 Time and network traffic: bad
Geographical Scalability
 Early distributed systems ran on LANs,
relied on synchronous communication.
 May be too slow for wide-area networks
 Wide-area communication is unreliable, point-
 Unpredictable time delays may even affect
Geographical Scalability
 LAN communication is based on broadcast.
 Consider how this affects an attempt to locate
a particular kind of service
 Centralized components + wide-area
communication: waste of network
Administrative Scalability
 Different domains may have different
policies about resource usage,
management, security, etc.
 Trust often stops at administrative
 Requires protection from malicious attacks
Scaling Techniques
 Scalability affects performance more than
anything else.
 Three techniques to improve scalability:
 Hiding communication latencies
 Distribution
 Replication
Goal 6: Reliability & Fault
If machines go down, the system should
work with the reduced amount of resources.
 There should be a very small number of
critical resources;
 critical resources: resources which have to be
up in order the distributed system to work.
 Key pieces of hardware and software
(critical resources) should be replicated ⇒
if one of them fails another one takes up -
Fault Tolerance
 Fault-tolerance is a main issue related to
reliability: the system has to detect faults
and act in a reasonable way:
 mask the fault: continue to work with possibly
reduced performance but without loss of data/
 fail gracefully: react to the fault in a
predictable way and possibly stop functionality
for a short period, but without loss of
Distributed Systems:
 Multiprocessor - a single address space
among the processors
 Multicomputer – each machine has its own
private memory.
Multiprocessor vs.
Distributed Systems:
 Distributed Operating Systems
 Network Operating Systems
 Middleware
Distributed Systems:

DOS – Distributed Operating System

NOS – Network Operating System
Distributed Operating
Distributed Operating System:
 An operating system which manages a
collection of independent systems and
makes it appear to users as a single
Network Operating System
Network Operating System:
 Is a software that runs on a server and
enables the server to manage data, users,
groups, applications, security and other
networking applications
 The most popular NOSs are:
 Microsoft Windows Server 2003
 Unix
 Linux
Pitfalls of Distribution
 The network is reliable
 The network is secure
 The network is homogeneous
 The topology does not change
 Latency is zero
 Bandwidth is infinite
 Transport cost is zero
 There is one administrator