Introduction
- interprocess communication is at the heart of all distributed systems
- communication in distributed systems is based on message passing as offered by the underlying network, which is harder to use than shared memory
- modern distributed systems consist of thousands of processes scattered across an unreliable network such as the Internet
- unless the primitive communication facilities of the network are replaced by more advanced ones, development of large-scale distributed systems becomes extremely difficult
the receiving computer must perform the same steps, but in reverse order
- accept the data from the NIC
- remove the transmission information that was added by the transmitting computer
- reassemble the packets of data into the original message
the key elements of a protocol are syntax, semantics, and timing
- syntax: the structure or format of the data
- semantics: the meaning of each section of bits
- timing: when data should be sent and how fast it can be sent
functions of protocols
- each device must perform the same steps the same way so that the data will arrive and reassemble properly; if one device uses a protocol with different steps, the two devices will not be able to communicate with each other
Protocols in a layered architecture
- protocols that work together to provide a layer or layers of the model are known as a protocol stack or protocol suite, e.g., TCP/IP
- each layer handles a different part of the communications process and has its own protocol
Data Communication Standards
- standards are essential for interoperability
- data communication standards fall into two categories
  - de facto standards: those that have not been approved by an organized body; mostly set by manufacturers
  - de jure standards: those legislated by an officially recognized body such as ISO, ITU, ANSI, IEEE
Network (Reference) Models
Layers and Services
- within a single machine, each layer uses the services of the layer immediately below it and provides services for the layer immediately above it
- between machines, layer x on one machine communicates with layer x on another machine
Two important network models or architectures
- the ISO OSI (Open Systems Interconnection) Reference Model
- the TCP/IP Reference Model
a. The OSI Reference Model
- consists of 7 layers
- was never fully implemented as a protocol stack, but is a good theoretical model
- Open: it connects open systems, i.e., systems that are open for communication with other systems
b. The TCP/IP Reference Model
- TCP/IP: Transmission Control Protocol/Internet Protocol
- used by ARPANET and its successor, the Internet
- design goals
  - the ability to connect multiple networks (internetworking) in a seamless way
  - the network should be able to survive loss of subnet hardware, i.e., the connection must remain intact as long as the source and destination machines are functioning properly
  - a flexible architecture to accommodate the requirements of different applications, ranging from transferring files to real-time speech transmission
- these requirements led to the choice of a packet-switching network based on a connectionless internetwork layer
- has 4 (or 5, depending on how you see it) layers: Application, Transport, Internet (Internetwork), Host-to-network (some split it into Physical and Data Link)
Layers involved in various hosts (TCP/IP)
- when a message is sent from device A to device B, it may pass through many intermediate nodes
- the intermediate nodes usually involve the first three layers
Middleware Protocols
- middleware is an application that contains general-purpose protocols to provide services
- examples of middleware services
  - authentication and authorization services
  - distributed transactions (commit protocols; locking mechanisms) - see later in Chapter 8
  - middleware communication protocols (calling a procedure or invoking an object remotely, synchronizing streams for real-time data, multicast services) - see later in this Chapter
- hence an adapted reference model for networked communications is required
parameter passing in a local procedure call: the stack before the call to read
parameters can be call-by-value (fd and bytes), call-by-reference (buf), or, in some languages, call-by-copy/restore
Client and Server Stubs
- RPC aims to make a remote procedure call look the same as a local one; it should be transparent, i.e., the calling procedure should not know that the called procedure is executing on a different machine, or vice versa
- when a program is compiled, it is linked with a different version of the library procedure, called a client stub
- a server stub is the server-side equivalent of a client stub
Steps of a Remote Procedure Call
1. Client procedure calls client stub in the normal way
2. Client stub builds a message and calls the local OS (packing parameters into a message is called parameter marshaling)
3. Client's OS sends the message to the remote OS
4. Remote OS gives the message to the server stub
5. Server stub unpacks the parameters and calls the server
6. Server does the work and returns the result to the stub
7. Server stub packs it in a message and calls the local OS
8. Server's OS sends the message to the client's OS
9. Client's OS gives the message to the client stub
10. Stub unpacks the result and returns to client
hence, for the client, remote services are accessed by making ordinary (local) procedure calls, not by calling send and receive
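The ten steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names add_stub, server_stub, transport, and PROCEDURES are hypothetical, JSON is an arbitrary choice of message format, and the function transport stands in for both operating systems and the network by calling the server stub directly.

```python
import json

# Server side: the real procedure and a table the server stub dispatches on.
def add(i, j):
    return i + j

PROCEDURES = {"add": add}

def server_stub(request_bytes):
    """Steps 5-7: unpack the parameters, call the server, pack the result."""
    request = json.loads(request_bytes.decode("utf-8"))
    result = PROCEDURES[request["proc"]](*request["args"])
    return json.dumps({"result": result}).encode("utf-8")

def transport(request_bytes):
    """Stand-in for steps 3-4 and 8-9 (both operating systems and the network)."""
    return server_stub(request_bytes)

# Client side: the stub has the same signature as the local procedure would.
def add_stub(i, j):
    """Steps 2 and 10: marshal the parameters, send, unmarshal the reply."""
    request = json.dumps({"proc": "add", "args": [i, j]}).encode("utf-8")
    reply = transport(request)
    return json.loads(reply.decode("utf-8"))["result"]

if __name__ == "__main__":
    # Step 1: an ordinary call; the caller never uses send or receive.
    print(add_stub(4, 7))
```

The caller of add_stub cannot tell from the call itself whether add runs locally or elsewhere, which is the transparency the stubs are meant to provide.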
server machine vs server process; client machine vs client process
Parameter Passing
1. Passing Value Parameters
- e.g., consider a remote procedure add(i, j), where i and j are integer parameters
- the above discussion applies if the server and client machines are identical, but that is not the case in large distributed systems
- the machines may differ in data representation (e.g., IBM mainframes use EBCDIC whereas IBM PCs use ASCII)
- there are also differences in representing integers (1's complement or 2's complement) and floating-point numbers
- byte numbering may be different (from right to left on the Pentium, called little endian, and from left to right on the SPARC, called big endian)
- e.g., consider a procedure with two parameters, an integer and a four-character string, each one 32-bit word (5, JILL); the sender is Intel and the receiver is SPARC
original message on the Pentium (the numbers in boxes indicate the address of each byte)
the message after receipt on the SPARC; wrong integer (2^24 + 2^26 = 83,886,080), but correct string
the message after being inverted (correct integer but wrong string)
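The (5, JILL) example can be reproduced with Python's struct module. This is a sketch of the byte-order effect only; the variable names are illustrative, and the "<" and ">" format prefixes stand in for the little-endian sender and the big-endian receiver.

```python
import struct

# Sender (little endian): the integer 5 followed by the string JILL.
message = struct.pack("<i4s", 5, b"JILL")

# A big-endian receiver that reads the bytes exactly as they arrive.
wrong_int, text = struct.unpack(">i4s", message)

# Inverting the bytes of every 4-byte word repairs the integer,
# but it also reverses the string.
inverted = b"".join(message[k:k + 4][::-1] for k in range(0, len(message), 4))
right_int, wrong_text = struct.unpack(">i4s", inverted)

print(wrong_int, text)        # 83886080 b'JILL'
print(right_int, wrong_text)  # 5 b'LLIJ'
```

Neither a plain copy nor a blanket inversion works, because the receiver needs to know which words are integers and which are strings.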
2. Passing Reference Parameters
- assume the parameter is a pointer to an array
- copy the array into the message and send it to the server
- the server stub can then call the server with a pointer to this array
- the server then makes any changes to the array, and the array is sent back to the client stub, which copies it back to the client
- this is in effect call-by-copy/restore
- optimization of the method
  - one of the copy operations can be eliminated if the stub knows whether the parameter is input or output to the server
  - if it is an input to the server (e.g., in a call to write), it need not be copied back
  - if it is an output, it need not be sent over in the first place; only send the size
- the above procedure can handle pointers to simple arrays and structures, but it is difficult to generalize it to an arbitrary data structure
Parameter Specification and Stub Generation
- the caller and the callee need to use the same protocol (format of messages) and the same steps; with such rules the client and server stubs can assemble, communicate, and interpret messages correctly
- consider the following example; the procedure foobar has 3 parameters: a character, a floating-point number, and an array of 5 integers
- assume a word is 4 bytes
- one possibility is to transmit the character in the rightmost byte of a word, a float as a whole word, and an array as a group of words equal to the array length, preceded by a word giving the length
- this way both client stub and server stub can understand outgoing and incoming messages
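The message layout for foobar can be sketched with struct. The function names marshal_foobar and unmarshal_foobar are hypothetical, and big-endian byte order is an assumption made for the sketch; the slides fix only the word layout.

```python
import struct

def marshal_foobar(x, y, z):
    """Pack (char, float, int array) into 4-byte words, big endian."""
    # Word 0: three pad bytes, then the character in the rightmost byte.
    # Word 1: the float. Word 2: the array length.
    header = struct.pack(">3xcfI", x, y, len(z))
    # Remaining words: one per array element.
    return header + struct.pack(">%di" % len(z), *z)

def unmarshal_foobar(message):
    """Inverse of marshal_foobar, as the server stub would apply it."""
    x, y, n = struct.unpack_from(">3xcfI", message, 0)
    z = list(struct.unpack_from(">%di" % n, message, 12))
    return x, y, z

msg = marshal_foobar(b"a", 3.5, [1, 2, 3, 4, 5])
print(len(msg) // 4)           # 8 words in total
print(unmarshal_foobar(msg))   # (b'a', 3.5, [1, 2, 3, 4, 5])
```

Because both sides use the same format string, the stubs agree on where each parameter sits in the message.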
other issues that need the agreement of the client and the server
- how are simple data structures like integers (e.g., 2's complement), characters (e.g., 16-bit Unicode), Booleans, ... represented?
- endianness
- which transport protocol to use: the connection-oriented TCP or the unreliable connectionless UDP
Asynchronous RPC
- a shortcoming of the original model is that it is blocking, but in some cases the client has no need to block
- two cases
1. if there is no result to be returned
- e.g., inserting records in a database, ...
- the server immediately sends an ack promising that it will carry out the request
- the client can now proceed without blocking
a) the interaction between client and server in a traditional RPC
b) the interaction using asynchronous RPC
2. if the result can be collected later
- e.g., prefetching network addresses of a set of hosts, ...
- the server immediately sends an ack promising that it will carry out the request
- the client can now proceed without blocking
- the server later sends the result
- the above method combines two asynchronous RPCs and is sometimes called deferred synchronous RPC
- variants of asynchronous RPC let the client continue without waiting even for an ack; this is called one-way RPC
- problem: if reliability of communication is not guaranteed, the client cannot know whether its request will be processed
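The deferred synchronous pattern has a rough local analogy in Python futures. This is an analogy rather than a real RPC system: the thread pool plays the role of the remote server, and lookup_address and call_async are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor
import time

def lookup_address(host):
    """Hypothetical remote procedure; the sleep stands in for network delay."""
    time.sleep(0.05)
    return host + ":resolved"

executor = ThreadPoolExecutor(max_workers=2)  # plays the role of the server

def call_async(proc, *args):
    """Return at once with a future; the result is collected later."""
    return executor.submit(proc, *args)

pending = call_async(lookup_address, "hostA")  # the client is not blocked here
other_work = sum(range(1000))                  # the client keeps working
address = pending.result()                     # blocks only if the reply is late
print(other_work, address)
executor.shutdown()
```

A one-way RPC would correspond to never calling result(), in which case the client never learns whether the request was carried out.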
DCE (Distributed Computing Environment) RPC
- a middleware and an example RPC system developed by OSF (Open Software Foundation), now The Open Group; it is designed to execute as a layer of abstraction between existing OSs and distributed applications
- available as open source, and vendors integrate it into their systems (http://www.opengroup.org/dce/)
- it uses the client-server programming model, and communication is by means of RPCs
services
- distributed file service: a worldwide file system that provides a transparent way of accessing files
- directory service: to keep track of the location of all resources in the system (machines, printers, data, servers, ...); a process can ask for a resource without knowing its location
- security service: for protecting resources; access is only through authorization
- distributed time service: to keep clocks on different machines synchronized (clock synchronization is covered in Chapter 6)
Steps in writing a Client and a Server in DCE RPC
- the system consists of languages, libraries, daemons, utility programs, ... for writing clients and servers
- IDL (Interface Definition Language) is the interface language, the glue that holds everything together
- it allows procedure declarations (similar to function prototypes in C++)
- it contains type definitions, constant declarations, and the information needed to marshal parameters and unmarshal results
- it specifies only the syntax of the procedures, not what they do
uuidgen generates a prototype IDL file with a globally unique interface identifier (for uniqueness, the location and time of creation are embedded in it)
- the IDL file is edited (filling in the names of the remote procedures and their parameters) and the IDL compiler is called to generate 3 files: a header file, the client stub, and the server stub
- the application writer writes the client and server code, which are then compiled and linked together with the stubs
Binding a Client to a Server in DCE RPC
- binding means locating the correct server and setting up communication between client and server software
- for a client to call a server, the server must be registered with the daemon (1 & 2 in the following figure)
- the registration allows the client to locate the server and bind to it
- the DCE daemon maintains a table of (server, endpoint) pairs and the protocols the server uses
- the directory server maintains the locations of all resources in the system (machines, servers, data, ...)
two steps for locating the server
- locate the server's machine (3)
- locate the server process on that machine (with an endpoint or port) (4)
now the RPC can take place; the above lookup information can be stored for subsequent RPCs
Other RPC systems: Sun RPC and DCOM (Distributed Component Object Model, Microsoft's system for distributed computing)
4.3 Remote Object (Method) Invocation (RMI) (Chapter 10: Distributed Object-Based Systems; from page 443)
- resulted from object-based technology that has proven its value in developing nondistributed applications
- it is an extension of the RPC mechanism
- it enhances distribution transparency, because an object hides its internals from the outside world by means of a well-defined interface
Distributed Objects
- an object encapsulates data, called the state, and the operations on those data, called methods
- methods are made available through an interface
- the state of an object can be manipulated only by invoking methods
- this allows an interface to be placed on one machine while the object itself resides on another machine; such an organization is referred to as a distributed object
- if the state of an object is not distributed, but only the interfaces are, then such an object is referred to as a remote object
- the implementation of an object's interface is called a proxy (analogous to a client stub in RPC systems)
- it is loaded into the client's address space when a client binds to a distributed object
- tasks: a proxy marshals method invocations into messages and unmarshals reply messages to return the result of the method invocation to the client
- a server stub, called a skeleton, unmarshals messages and marshals replies
Object Servers
- an object server is a server to support distributed objects
- it does not provide a specific service; services are implemented by the objects that reside on the server
- the server provides only the means to invoke local objects based on remote client requests
Alternatives for Invoking Objects
- to invoke an object, the object server needs to know
  - which code to execute
  - on which data it should operate
  - whether it should start a separate thread
  - ...
different approaches exist
1. assume that all objects look alike and there is only one way to invoke an object, as in DCE; inflexible
2. let a server support different policies
- transient versus persistent objects
  - transient object: create it at the first request and destroy it when no clients are bound to it, or
  - persistent object: it exists even if it is not currently used
- separate or shared memory
  - put each object in a memory segment of its own, i.e., objects share neither code nor data; protection of segments is required, probably by the underlying OS, or
  - objects can at least share code
- threading
  - implement the server with a single thread of control, or
  - the server may have several threads, one for each of its objects
Object Adaptor
- activation policies: decisions on how to invoke an object
- object adaptor (wrapper): groups objects per policy; it is software that implements a specific activation policy
- an object adaptor has one or more objects under its control
Binding a Client to an Object
- a process must first bind to an object before invoking its methods, which results in a proxy being placed in the process's address space
- binding can be implicit (directly invoke methods using only a reference to an object) or explicit (by calling a special function)
(a)
    Distr_object* obj_ref;       // Declare a systemwide object reference
    obj_ref = ...;               // Initialize the reference to a distributed object
    obj_ref->do_something();     // Implicitly bind and invoke a method
(b)
    Distr_object obj_ref;        // Declare a systemwide object reference
    Local_object* obj_ptr;       // Declare a pointer to local objects
    obj_ref = ...;               // Initialize the reference to a distributed object
    obj_ptr = bind(obj_ref);     // Explicitly bind and obtain a pointer to the local proxy
    obj_ptr->do_something();     // Invoke a method on the local proxy
(a) an example with implicit binding using only global references
(b) an example with explicit binding using global and local references
an object reference could contain
- the network address of the machine where the object resides
- the endpoint of the server
- an identification of which object
- the protocol used
- ...
Parameter Passing
- there are two situations when invoking a method with an object reference as a parameter: the object can be local or remote to the client
- local object: a copy of the object is passed; this means the object is passed by value
- remote object: the reference of the object is copied and passed as a value parameter; this means the object is passed by reference
example object-based systems: DCE Remote Objects, Java RMI, Java Beans
Persistence and Synchronicity in Communication assume the communication system is organized as a computer network shown below
general organization of a communication system in which hosts are connected through a network
communication can be
- persistent or transient
- asynchronous or synchronous
persistent: a message that has been submitted for transmission is stored by the communication system for as long as it takes to deliver it to the receiver; e.g., e-mail delivery, snail mail delivery
transient: a message that has been submitted for transmission is stored by the communication system only as long as the sending and receiving applications are executing
asynchronous: a sender continues immediately after it has submitted its message for transmission
synchronous: the sender is blocked until its message is stored in a local buffer at the receiving host or delivered to the receiver
Message-Oriented Transient Communication
- many applications are built on top of the simple message-oriented model offered by the transport layer
- standardizing the interface of the transport layer by providing a set of primitives allows programmers to use messaging protocols; it also makes applications easier to port
1. Berkeley Sockets
- an example is the socket interface as used in Berkeley UNIX
- a socket is a communication endpoint to which an application can write data that are to be sent over the network, and from which incoming data can be read
socket primitives for TCP/IP
Primitive | Meaning | Executed by
Socket | Create a new communication endpoint | both
Bind | Attach a local address to a socket; e.g., IP address with a known port number | servers
Listen | Announce willingness to accept connections; non-blocking | servers
Accept | Block caller until a connection request arrives | servers
Connect | Actively attempt to establish a connection; the client is blocked until connection is set up | clients
Send | Send some data over the connection | both
Receive | Receive some data over the connection | both
Close | Release the connection | both
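The socket primitives map closely onto Python's standard socket module. The sketch below runs a one-shot server and a client in the same process over the loopback address; the upper-casing "service", the function name serve_once, and the use of port 0 (letting the OS pick a free port) are choices made for the example only.

```python
import socket
import threading

def serve_once(listener):
    """Server side: accept one connection, answer it, then close."""
    conn, _addr = listener.accept()        # Accept: block until a request arrives
    with conn:                             # Close on leaving the block
        data = conn.recv(1024)             # Receive
        conn.sendall(data.upper())         # Send

listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)  # Socket
listener.bind(("127.0.0.1", 0))                               # Bind
listener.listen(1)                                            # Listen
port = listener.getsockname()[1]
server = threading.Thread(target=serve_once, args=(listener,))
server.start()

client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)    # Socket
client.connect(("127.0.0.1", port))                           # Connect
client.sendall(b"hello")                                      # Send
reply = client.recv(1024)                                     # Receive
client.close()                                                # Close
server.join()
listener.close()
print(reply)
```

The server executes Socket, Bind, Listen, and Accept before any data flows, while the client needs only Socket and Connect.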
2. The Message-Passing Interface (MPI)
- sockets were designed to communicate across networks using general-purpose protocol stacks such as TCP/IP
- they were not designed for proprietary protocols developed for high-speed interconnection networks; of course, with proprietary protocols portability suffers
- MPI is designed for parallel applications and tailored for transient communication
- MPI assumes communication takes place within a known group of processes, where each group is assigned an identifier (groupID)
- each process within a group is also assigned an identifier (processID)
- a (groupID, processID) pair identifies the source or destination of a message, and is used instead of a transport-level address
Primitive | Meaning
MPI_Bsend | Append outgoing message to a local send buffer (for transient asynchronous communication)
MPI_Send | Send a message and wait until copied to local or remote buffer; semantics are implementation dependent
MPI_Ssend | Send a message and wait until receipt starts (for transient synchronous communication)
MPI_Sendrecv | Send a message and wait for reply; strongest form; similar to RPC (synchronous)
MPI_Isend | Pass reference to outgoing message (not copying the message), and continue (asynchronous)
MPI_Issend | Pass reference to outgoing message (not copying the message), and wait until receipt starts (synchronous)
MPI_Recv | Receive a message; block if there are none (synchronous)
MPI_Irecv | Check if there is an incoming message, but do not block (asynchronous)
Message-Oriented Persistent Communication
- there are message-oriented middleware services, called Message-Queuing Systems or Message-Oriented Middleware (MOM)
- they support persistent asynchronous communication
- they have intermediate-term storage capacity for messages, without requiring the sender or the receiver to be active during message transmission
- unlike with Berkeley sockets and MPI, message transfer may take minutes instead of seconds or milliseconds
Message-Queuing Model
- applications communicate by inserting messages in specific queues
- it permits loosely-coupled communication
- the sender may or may not be running; similarly the receiver may or may not be running, giving four possible combinations
(a) both are executing during the transmission of a message
(b) the sender is executing, but the receiver is not
(c) the receiver can read the message while the sender is not executing
(d) the system is storing and possibly transmitting even if both are not executing
Primitive | Meaning
Put | Append a message to a specified queue; called by the sender and nonblocking
Get | Remove the first (longest pending) message; block if queue is empty
Poll | Check a specified queue for messages, and remove the first; never block; nonblocking variant of Get
Notify | Install a handler (by the receiver) to be called when a message is put into the specified queue; usually a daemon
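The four queue primitives can be sketched with an in-process queue. The class MessageQueue is hypothetical and keeps messages only in memory; a real message-queuing system would store them persistently and deliver them across machines.

```python
import queue

class MessageQueue:
    """Toy in-process queue illustrating put, get, poll, and notify."""

    def __init__(self):
        self._messages = queue.Queue()
        self._handler = None

    def put(self, message):
        """Append a message; the sender is never blocked."""
        self._messages.put(message)
        if self._handler is not None:
            self._handler(self)

    def get(self):
        """Remove the longest-pending message; block while the queue is empty."""
        return self._messages.get()

    def poll(self):
        """Nonblocking variant of get; returns None if the queue is empty."""
        try:
            return self._messages.get_nowait()
        except queue.Empty:
            return None

    def notify(self, handler):
        """Install a handler that is called whenever a message is put."""
        self._handler = handler

q = MessageQueue()
q.put("first")
q.put("second")
seen = []
q.notify(lambda mq: seen.append(mq.poll()))
q.put("third")     # triggers the handler, which removes the oldest message
print(seen)        # ['first']
```

In this sketch the handler runs in the sender's thread; in the slides' model it would usually be a daemon on the receiving side.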
General Architecture of a Message-Queuing System
- messages can be put only into queues that are local to the sender (on the same machine or on a nearby machine on a LAN)
- such a queue is called the source queue
- messages can also be read only from local queues
- a message put into a local queue must contain the specification of the destination queue; hence a message-queuing system must maintain a mapping of queues to network locations, as in DNS
- queues are managed by queue managers
- they generally interact with the application that sends and receives messages
- some also serve as routers or relays, i.e., they forward incoming messages to other queue managers
- however, each queue manager needs a copy of the queue-to-location mapping, leading to network management problems for large-scale queuing systems
- the solution is to use a few routers that know about the network topology
- hence, only routers need to be updated when queues are added or removed
- this helps to build scalable message-queuing systems
Message Brokers
- how can applications understand the messages they receive?
- each receiver cannot be made to understand the message formats of new applications
- hence, in a message-queuing system, conversions are handled by message brokers
- a message broker converts incoming messages to a format that can be understood by the destination application, based on a set of rules
The Challenge
- new applications
  - multimedia will be pervasive in a few years (as graphics)
- continuous delivery
  - e.g., 30 frames/s (NTSC), 25 frames/s (PAL) for video
  - guaranteed Quality of Service
  - admission control
- storage and transmission
  - e.g., a 2-hour uncompressed HDTV (1920×1080) movie: 1.12 TB (1920×1080×3×25×60×60×2 bytes)
  - videos are extremely large, even after being compressed (actually, encoded)
- search
  - can we look at 100 videos to find the proper one?
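The 1.12 TB figure can be checked directly. The reading of the factors is an assumption: 3 is taken as bytes per pixel (24-bit colour), 25 as the PAL frame rate, and TB as decimal terabytes (10^12 bytes).

```python
width, height = 1920, 1080
bytes_per_pixel = 3         # assumption: 24-bit colour
frames_per_second = 25      # PAL rate
seconds = 2 * 60 * 60       # two hours

total_bytes = width * height * bytes_per_pixel * frames_per_second * seconds
print(total_bytes)                     # 1119744000000
print(round(total_bytes / 1e12, 2))    # 1.12
```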
Types of Media
- two types
  - discrete media: text, executable code, graphics, images; temporal relationships between data items are not fundamental to correctly interpret the data
  - continuous media: video, audio, animation; temporal relationships between data items are fundamental to correctly interpret the data
- a data stream is a sequence of data units and can be applied to discrete as well as continuous media; e.g., TCP provides byte-oriented discrete data streams
- stream-oriented communication provides facilities for the exchange of time-dependent information (continuous media) such as audio and video streams
timing in transmission modes
- asynchronous transmission mode: data items are transmitted one after the other, but with no timing constraints; e.g., text transfer
- synchronous transmission mode: a maximum end-to-end delay is defined for each data unit; data can be transmitted faster than the maximum delay, but not slower
- isochronous transmission mode: maximum and minimum end-to-end delays are defined; also called bounded delay jitter; applicable for distributed multimedia systems
a continuous data stream can be simple or complex
- simple stream: consists of a single sequence of data; e.g., mono audio, video only (only visual frames)
- complex stream: consists of several related simple streams, called substreams, that must be synchronized; e.g., stereo audio, video consisting of audio and video (may also contain subtitles, translation to other languages, ...)
- a stream can be considered as a virtual connection between a source and a sink
- the source or the sink could be a process or a device
- streaming means a user can listen (or watch) after the downloading has started
- we can stream stored data or live data (compression, or more precisely encoding, is required)
- the data stream can also be multicast to several receivers
- if devices and the underlying networks have different capabilities, the stream may be filtered, generally called adaptation (filtering? transcoding?)
Quality of Service (QoS)
- timing and other nonfunctional requirements are expressed as Quality of Service requirements
- QoS requirements describe what is needed from the underlying distributed system and network to ensure acceptable delivery; e.g., the viewing experience of a user
- for continuous data, the concerns are
  - timeliness: data must be delivered in time
    - initial delay: maximum delay until a session has been set up
    - maximum end-to-end delay
    - maximum delay variance or jitter
  - volume/bandwidth: the required throughput (bit rate) must be met
  - reliability: a given level of loss of data must not be exceeded
  - quality of perception: highly subjective
Enforcing QoS
- the underlying system offers a best-effort delivery service
- however, the Internet also provides mechanisms such as differentiated services, which categorize packets into classes; for example, there is an expedited class that tells the router to forward a packet with absolute priority
- in addition, a distributed system can help to improve QoS
- three methods: buffering, forward error correction, and interleaving frames
1. Buffering - Client Side
- buffer (store) flows on the receiving side (client machine) before delivery (playback)
- it smooths jitter (for audio and video on demand, since jitter is the main problem); it does not affect reliability or bandwidth, but it increases delay
- how long to buffer?
2. Forward Error Correction - Client Side
- packets may be lost
- retransmission is not applicable for time-dependent data
- the overhead for forward error correction may be high
3. Interleaving Frames - Server Side
- a single packet may contain multiple audio and video frames
- if such a packet is lost, there will be a large gap during playback
- hence, the idea is to distribute the effect of a packet loss over time
- but a larger buffer is required at the receiver/client
- for example, to play the first four frames, four packets need to be delivered and stored
The effect of packet loss in (a) non interleaved transmission and (b) interleaved transmission
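The difference between the two transmission orders can be illustrated with 16 frames and 4 frames per packet. The function names packetize and lost_frames are hypothetical, and the sketch assumes the number of frames is a multiple of the packet size.

```python
def packetize(frames, per_packet, interleave):
    """Group frames into packets, either consecutively or interleaved."""
    count = len(frames) // per_packet
    if interleave:
        # Packet k carries every count-th frame, starting at frame k.
        return [frames[k::count] for k in range(count)]
    return [frames[k * per_packet:(k + 1) * per_packet] for k in range(count)]

def lost_frames(packets, lost_index):
    """Frames that go missing when the packet at lost_index is dropped."""
    return sorted(packets[lost_index])

frames = list(range(1, 17))
plain = packetize(frames, 4, interleave=False)
mixed = packetize(frames, 4, interleave=True)
print(lost_frames(plain, 2))   # [9, 10, 11, 12]: one gap of four frames
print(lost_frames(mixed, 2))   # [3, 7, 11, 15]: four gaps of one frame each
```

With interleaving, frames 1 to 4 sit in four different packets, which is why the receiver must buffer four packets before it can play them.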
Stream Synchronization
- how to maintain temporal relations between streams
- examples: lip synchronization, or a slide show enhanced with audio
- two approaches
1. explicitly, by operating on the data units of simple streams; this is the responsibility of the application (not good for applications to do it)
2. through a multimedia middleware that offers a collection of interfaces for controlling audio and video streams as well as devices such as monitors, cameras, microphones, ...