External Data Representation (XDR) is a standard data serialization format, used for purposes such as computer network protocols. It allows data to be transferred between different kinds of computer systems. Converting from the local representation to XDR is called encoding; converting from XDR to the local representation is called decoding. XDR is implemented as a software library of functions that is portable between different operating systems and is also independent of the transport layer. XDR uses a base unit of 4 bytes, serialized in big-endian order; smaller data types still occupy four bytes each after encoding. Variable-length types such as string and opaque are padded to a total length divisible by four bytes. Floating-point numbers are represented in IEEE 754 format.
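The encoding rules above can be sketched in C. This is an illustrative sketch, not the Sun RPC library API: the helper names xdr_put_u32, xdr_pad_len, and xdr_put_opaque are invented for this example. A variable-length opaque is encoded as a big-endian 4-byte length, followed by the bytes, followed by zero padding up to the next 4-byte boundary.

```c
#include <stdint.h>
#include <string.h>

/* XDR serializes everything in 4-byte units, big-endian ("network order"). */
void xdr_put_u32(unsigned char *buf, uint32_t v) {
    buf[0] = (unsigned char)(v >> 24);
    buf[1] = (unsigned char)(v >> 16);
    buf[2] = (unsigned char)(v >> 8);
    buf[3] = (unsigned char)(v);
}

/* Round a byte count up so the total is divisible by 4. */
size_t xdr_pad_len(size_t n) {
    return (n + 3) & ~(size_t)3;
}

/* Encode an XDR opaque: 4-byte length, then the data, then zero padding.
 * Returns the number of bytes written into buf. */
size_t xdr_put_opaque(unsigned char *buf, const void *data, uint32_t len) {
    xdr_put_u32(buf, len);
    memcpy(buf + 4, data, len);
    size_t padded = xdr_pad_len(len);
    memset(buf + 4 + len, 0, padded - len);
    return 4 + padded;
}
```

So the 3-byte opaque "abc" occupies 8 bytes on the wire: a 4-byte length (0 0 0 3), the three data bytes, and one zero pad byte.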
XDR was developed in the mid-1980s at Sun Microsystems, and first widely published in 1987. [1] XDR became an IETF standard in 1995. The XDR data format is in use by many systems, including:
- Network File System (NFS)
- NDMP (Network Data Management Protocol)
- Open Network Computing Remote Procedure Call (ONC RPC)
- Legato NetWorker backup software (later sold by EMC)
XDR data types:
- boolean
- int (32-bit integer)
- unsigned int (unsigned 32-bit integer)
- hyper (64-bit integer)
- unsigned hyper (unsigned 64-bit integer)
- IEEE float
- IEEE double
- quadruple (new in RFC 1832)
- enumeration
- structure
- string

Remote Procedure Calls
Remote procedure call (RPC) is an inter-process communication technique that allows a computer program to cause a subroutine or procedure to execute in another address space (commonly on another computer on a shared network) without the programmer explicitly coding the details of the remote interaction. [1] That is, the programmer writes essentially the same code whether the subroutine is local to the executing program or remote. When the software in question uses object-oriented principles, RPC is called remote invocation or remote method invocation.

An RPC is initiated by the client, which sends a request message to a known remote server to execute a specified procedure with supplied parameters. The remote server sends a response to the client, and the application continues its process. While the server is processing the call, the client is blocked (it waits until the server has finished processing before resuming execution), unless the client sends an asynchronous request to the server, such as an XHTTP call. There are many variations and subtleties in various implementations, resulting in a variety of different (incompatible) RPC protocols.

An important difference between remote procedure calls and local calls is that remote calls can fail because of unpredictable network problems. Also, callers generally must deal with such failures without knowing whether the remote procedure was actually invoked. Idempotent procedures (those that have no additional effects if called more than once) are easily handled, but enough difficulties remain that code to call remote procedures is often confined to carefully written low-level subsystems.

Sequence of events during an RPC
1. The client calls the client stub. The call is a local procedure call, with parameters pushed onto the stack in the normal way.
2. The client stub packs the parameters into a message and makes a system call to send the message. Packing the parameters is called marshalling.
3. The client's local operating system sends the message from the client machine to the server machine.
4. The local operating system on the server machine passes the incoming packets to the server stub.
5. The server stub unpacks the parameters from the message. Unpacking the parameters is called unmarshalling.
6. Finally, the server stub calls the server procedure.

The reply traces the same steps in the reverse direction.
Standard contact mechanisms
To let different clients access servers, a number of standardized RPC systems have been created. Most of these use an interface description language (IDL) to let various platforms call the RPC. The IDL files can then be used to generate code to interface between the client and server.

import java.rmi.Naming;
import java.rmi.RemoteException;
import java.rmi.server.UnicastRemoteObject;
import java.rmi.registry.*;

public class RmiServer extends UnicastRemoteObject implements RmiServerIntf {
    public static final String MESSAGE = "Hello World";

    public RmiServer() throws RemoteException {
        super(0); // required to avoid the 'rmic' step, see below
    }

    public String getMessage() {
        return MESSAGE;
    }

    public static void main(String args[]) throws Exception {
        System.out.println("RMI server started");

        // Instantiate RmiServer
        RmiServer obj = new RmiServer();

        // Bind this object instance to the name "RmiServer"
        Naming.rebind("//localhost/RmiServer", obj);
        System.out.println("PeerServer bound in registry");
    }
}

The RmiServerIntf interface defines the interface that is used by the client and implemented by the server.
import java.rmi.Remote;
import java.rmi.RemoteException;

public interface RmiServerIntf extends Remote {
    public String getMessage() throws RemoteException;
}

The RmiClient class is the client, which gets a reference (a proxy) to the remote object living on the server and invokes its method to get a message. If the server object implemented java.io.Serializable instead of java.rmi.Remote, it would be serialized and passed to the client as a value. [2]
import java.rmi.Naming;

public class RmiClient {
    public static void main(String args[]) throws Exception {
        RmiServerIntf obj = (RmiServerIntf) Naming.lookup("//localhost/RmiServer");
        System.out.println(obj.getMessage());
    }
}
Socket Operations
There are a few logical operations that may be performed on a TCP/IP socket, regardless of whether the socket is synchronous or asynchronous. Each of the operations below is marked immediate (meaning it is completed immediately) or delayed (meaning it depends on the network for completion).

Constructing (immediate) - TCP/IP sockets use the InterNetwork (for IPv4) or InterNetworkV6 (for IPv6) AddressFamily, the Stream SocketType, and the Tcp ProtocolType. MSDN links: Socket

Binding (immediate) - A socket may be locally bound. This is normally done only on the server (listening) socket, and is how a server chooses the port it listens on. See Using Socket as a Server (Listening) Socket for details. MSDN links: Bind

Listening (immediate) - A bound socket notifies the OS that it is almost ready to receive connections by listening. In spite of the term listening, this operation only notifies the OS that the socket is about to accept connections; it does not actually begin accepting connections, though the OS may accept a connection on behalf of the socket. See Using Socket as a Server (Listening) Socket for details. MSDN links: Listen
Accepting (delayed) - A listening socket may accept an incoming connection. When an incoming connection is accepted, a new socket is created that is connected to the remote side; the listening socket continues listening. The new socket (which is connected) may be used for sending and receiving. See Using Socket as a Server (Listening) Socket for details. MSDN links: Accept, BeginAccept, EndAccept, AcceptAsync

Connecting (delayed) - A (client) socket may connect to a (server) socket. TCP has a three-way handshake to complete the connection, so this operation is not instantaneous. Once a socket is connected, it may be used for sending and receiving. See Using Socket as a Client Socket for details. MSDN links: Connect, BeginConnect, EndConnect, ConnectAsync

Reading (delayed) - Connected sockets may perform a read operation. Reading takes incoming bytes from the stream and copies them into a buffer. A 0-byte read indicates a graceful closure from the remote side. See Using Socket as a Connected Socket for details. MSDN links: Receive, BeginReceive, EndReceive, ReceiveAsync

Writing (delayed) - Connected sockets may perform a write operation. Writing places bytes in the outgoing stream. A successful write may complete before the remote OS acknowledges that the bytes were received. See Using Socket as a Connected Socket for details. MSDN links: Send, BeginSend, EndSend, SendAsync

Disconnecting (delayed) - TCP/IP has a four-way handshake to terminate a connection gracefully: each side shuts down its own outgoing stream and receives an acknowledgment from the other side. MSDN links: Disconnect, BeginDisconnect, EndDisconnect, DisconnectAsync

Shutting down (immediate) - Either the receiving stream or the sending stream may be clamped shut. For receives, this is only a local operation; the other end of the connection is not notified. For sends, the outgoing stream is shut down (the same way Disconnect does it), and this is acknowledged by the other side; however, there is no notification of this operation completing. MSDN links: Shutdown

Closing (immediate or delayed) - The actual socket resources are reclaimed when the socket is disposed (or closed). Normally this appears immediate but is actually delayed, performing a graceful disconnect in the background and then reclaiming the socket resources when the disconnect completes. Socket.LingerState may be set to change Close into a synchronous disconnect (delayed, but always synchronous) or an immediate shutdown (always immediate). MSDN links: Close, LingerState
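The operations above map closely onto the POSIX sockets API. As a minimal, self-contained sketch (not the .NET Socket class the MSDN links refer to; tcp_demo is a name invented for this example), the following runs a complete construct/bind/listen/connect/accept/write/read/shutdown/close cycle over loopback in a single process:

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Exercise every socket operation over loopback. Returns 0 on success. */
int tcp_demo(void) {
    /* Constructing + binding + listening (the server side).
     * Port 0 asks the OS to pick a free port for us. */
    int srv = socket(AF_INET, SOCK_STREAM, 0);
    struct sockaddr_in addr;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0;
    if (bind(srv, (struct sockaddr *)&addr, sizeof addr) < 0) return -1;
    if (listen(srv, 1) < 0) return -1;

    /* Recover the OS-chosen port so the client side can connect to it. */
    socklen_t len = sizeof addr;
    if (getsockname(srv, (struct sockaddr *)&addr, &len) < 0) return -1;

    /* Connecting (the client side): the kernel completes the three-way
     * handshake against the listen backlog even before accept() runs. */
    int cli = socket(AF_INET, SOCK_STREAM, 0);
    if (connect(cli, (struct sockaddr *)&addr, sizeof addr) < 0) return -1;

    /* Accepting: yields a NEW connected socket; srv keeps listening. */
    int conn = accept(srv, NULL, NULL);
    if (conn < 0) return -1;

    /* Writing and reading on the connected pair. */
    char buf[8] = {0};
    if (write(cli, "ping", 4) != 4) return -1;
    if (read(conn, buf, sizeof buf) != 4) return -1;

    /* Shutting down the outgoing stream, then closing everything. */
    shutdown(cli, SHUT_WR);
    close(conn);
    close(cli);
    close(srv);
    return strcmp(buf, "ping") == 0 ? 0 : -1;
}
```

Note how accept() creates a second socket for the conversation while the listening socket stays available for further clients, exactly as the Accepting entry above describes.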
Network socket
A network socket is an endpoint of an inter-process communication flow across a computer network. Today, most communication between computers is based on the Internet Protocol; therefore most network sockets are Internet sockets. A socket API is an application programming interface (API), usually provided by the operating system, that allows application programs to control and use network sockets. Internet socket APIs are usually based on the Berkeley sockets standard.

A socket address is the combination of an IP address and a port number, much like one end of a telephone connection is the combination of a phone number and a particular extension. Based on this address, Internet sockets deliver incoming data packets to the appropriate application process or thread.

An Internet socket is characterized by a unique combination of the following:
- Local socket address: the local IP address and port number.
- Remote socket address: only for established TCP sockets. As discussed in the client-server section below, this is necessary since a TCP server may serve several clients concurrently. The server creates one socket for each client, and these sockets share the same local socket address from the point of view of the TCP server.
- Protocol: a transport protocol (e.g., TCP, UDP, raw IP, or others). TCP port 53 and UDP port 53 are consequently different, distinct sockets.
Socket types
There are several Internet socket types available:
- Datagram sockets, also known as connectionless sockets, which use the User Datagram Protocol (UDP).
- Stream sockets, also known as connection-oriented sockets, which use the Transmission Control Protocol (TCP) or the Stream Control Transmission Protocol (SCTP).
- Raw sockets (or raw IP sockets), typically available in routers and other network equipment. Here the transport layer is bypassed, and the packet headers are made accessible to the application.

There are also non-Internet sockets, implemented over other transport protocols, such as Systems Network Architecture (SNA). [2] See also Unix domain sockets (UDS), used for internal inter-process communication.
A datagram socket is a type of connectionless network socket, which is the sending or receiving point for packet delivery services. [1] Each packet sent or received on a datagram socket is individually addressed and routed. Multiple packets sent from one machine to another may arrive in any order, and might not arrive at the receiving computer at all. UDP broadcast sends are always enabled on a datagram socket. In order to receive broadcast packets, a datagram socket should be bound to the wildcard address. Broadcast packets may also be received when a datagram socket is bound to a more specific address.
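A minimal POSIX C sketch of these properties (udp_demo is a name invented for this example): the receiver binds to the wildcard address on an OS-chosen port, and the sender needs no connection at all, because every sendto() carries the destination address inside the call.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Send one individually-addressed datagram over loopback.
 * Returns 0 on success. */
int udp_demo(void) {
    /* Receiver: bind to the wildcard address (INADDR_ANY), port 0
     * lets the OS pick the port. */
    int rx = socket(AF_INET, SOCK_DGRAM, 0);
    struct sockaddr_in addr;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);   /* wildcard address */
    addr.sin_port = 0;
    if (bind(rx, (struct sockaddr *)&addr, sizeof addr) < 0) return -1;

    /* Find out which port the OS chose, then target it via loopback. */
    socklen_t len = sizeof addr;
    if (getsockname(rx, (struct sockaddr *)&addr, &len) < 0) return -1;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);

    /* Sender: no bind, no connect; the packet itself is addressed. */
    int tx = socket(AF_INET, SOCK_DGRAM, 0);
    if (sendto(tx, "hello", 5, 0, (struct sockaddr *)&addr, sizeof addr) != 5)
        return -1;

    char buf[16] = {0};
    ssize_t n = recvfrom(rx, buf, sizeof buf, 0, NULL, NULL);
    close(tx);
    close(rx);
    return (n == 5 && strcmp(buf, "hello") == 0) ? 0 : -1;
}
```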
A stream socket is a type of Internet socket which provides a connection-oriented, sequenced, and unique flow of data without record boundaries, with well-defined mechanisms for creating and destroying connections and for detecting errors. This Internet socket type transmits data reliably, in order, and with out-of-band capabilities. Traditionally, stream sockets are implemented on top of TCP so that applications can run across any network using the TCP/IP protocols. SCTP can also be used for stream sockets.

A raw socket is an Internet socket that allows direct sending and receiving of Internet Protocol packets without any protocol-specific transport-layer formatting. With standard sockets, the payload to be transmitted is encapsulated according to the chosen transport-layer protocol (e.g., TCP or UDP). In contrast, raw sockets usually receive raw packets including the header. When transmitting packets, the automatic addition of a header may be a configurable option of the socket. Raw sockets are used in security-related applications such as nmap and packet sniffers. One possible use case for raw sockets is the implementation of new transport-layer protocols in user space. [1] Raw sockets are typically available in network equipment, and are used for routing protocols such as the Internet Group Management Protocol (IGMP) and Open Shortest Path First (OSPF), and in the Internet Control Message Protocol (ICMP, best known for the ping operation).

Socket.Close Method
The Close method closes the remote host connection and releases all managed and unmanaged resources associated with the Socket. Upon closing, the Connected property is set to false. For connection-oriented protocols, it is recommended that you call Shutdown before calling the Close method. This ensures that all data is sent and received on the connected socket before it is closed.
If you need to call Close without first calling Shutdown, you can ensure that data queued for outgoing transmission will be sent by setting the DontLinger socket option to false and specifying a non-zero time-out interval. Close will then block until this data is sent or until the specified time-out expires. If you set DontLinger to false and specify a zero time-out interval, Close releases the connection and automatically discards outgoing queued data.

aSocket.Shutdown(SocketShutdown.Both);
aSocket.Close();

The close() function shall deallocate the file descriptor indicated by fildes. To deallocate means to make the file descriptor available for return by subsequent calls to open() or other functions that allocate file descriptors. All outstanding record locks owned by the process on the file associated with the file descriptor shall be removed (that is, unlocked).

If close() is interrupted by a signal that is to be caught, it shall return -1 with errno set to [EINTR] and the state of fildes is unspecified. If an I/O error occurred while reading from or writing to the file system during close(), it may return -1 with errno set to [EIO]; if this error is returned, the state of fildes is unspecified.

When all file descriptors associated with a pipe or FIFO special file are closed, any data remaining in the pipe or FIFO shall be discarded. When all file descriptors associated with an open file description have been closed, the open file description shall be freed. If the link count of the file is 0, when all file descriptors associated with the file are closed, the space occupied by the file shall be freed and the file shall no longer be accessible.

[XSR] If a STREAMS-based fildes is closed and the calling process was previously registered to receive a SIGPOLL signal for events associated with that STREAM, the calling process shall be unregistered for events associated with the STREAM. The last close() for a STREAM shall cause the STREAM associated with fildes to be dismantled. If O_NONBLOCK is not set and there have been no signals posted for the STREAM, and if there is data on the module's write queue, close() shall wait for an unspecified time (for each module and driver) for any output to drain before dismantling the STREAM. The time delay can be changed via an I_SETCLTIME ioctl() request. If the O_NONBLOCK flag is set, or if there are any pending signals, close() shall not wait for output to drain, and shall dismantle the STREAM immediately.
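The shutdown-before-close recommendation above can be sketched in POSIX C. This is a self-contained illustration, not the .NET API: a connected socketpair stands in for a TCP connection, and graceful_close_demo is a name invented for this example. One side shuts down its sending stream; the other side keeps reading until the 0-byte read that signals end of stream, so no queued data is lost before close.

```c
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Graceful close: shut down the outgoing stream, let the peer drain
 * everything, and only then close. Returns 0 on success. */
int graceful_close_demo(void) {
    int fds[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, fds) < 0) return -1;

    /* Side A sends its last data, then shuts down its sending stream. */
    if (write(fds[0], "bye", 3) != 3) return -1;
    shutdown(fds[0], SHUT_WR);

    /* Side B keeps reading; a 0-byte read signals A's graceful shutdown,
     * guaranteeing all previously queued data has been received. */
    char buf[8];
    ssize_t n, total = 0;
    while ((n = read(fds[1], buf + total, sizeof buf - total)) > 0)
        total += n;

    close(fds[0]);
    close(fds[1]);
    return (n == 0 && total == 3 && memcmp(buf, "bye", 3) == 0) ? 0 : -1;
}
```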
Client-server model
Networks in which certain computers have special dedicated tasks, providing services to other computers in the network, are called client-server networks. The client-server model of computing is a distributed application structure that partitions tasks or workloads between the providers of a resource or service, called servers, and service requesters, called clients. [1] Often clients and servers communicate over a computer network on separate hardware, but both client and server may reside in the same system. A server host runs one or more server programs which share their resources with clients. A client does not share any of its resources, but requests a server's content or service function. Clients therefore initiate communication sessions with servers, which await incoming requests. Examples of computer applications that use the client-server model are email, network printing, and the World Wide Web.

The client-server characteristic describes the relationship of cooperating programs in an application. The server component provides a function or service to one or many clients, which initiate requests for such services. Servers are classified by the services they provide. For instance, a web server serves web pages and a file server serves computer files. A shared resource may be any of the server computer's software and electronic components, from programs and data to processors and storage devices. The sharing of the resources of a server constitutes a service.

Whether a computer is a client, a server, or both is determined by the nature of the application that requires the service functions. For example, a single computer can run web server and file server software at the same time to serve different data to clients making different kinds of requests. Client software can also communicate with server software within the same computer. [2] Communication between servers, such as to synchronize data, is sometimes called inter-server or server-to-server communication.

In general, a service is an abstraction of computer resources, and a client does not have to be concerned with how the server performs while fulfilling the request and delivering the response. The client only has to understand the response based on the well-known application protocol, i.e. the content and the formatting of the data for the requested service. Clients and servers exchange messages in a request-response messaging pattern: the client sends a request, and the server returns a response. This exchange of messages is an example of inter-process communication.
To communicate, the computers must have a common language, and they must follow rules so that both the client and the server know what to expect. The language and rules of communication are defined in a communications protocol. All client-server protocols operate in the application layer. The application-layer protocol defines the basic patterns of the dialogue. To formalize the data exchange even further, the server may implement an API (such as a web service). [3] The API is an abstraction layer for such resources as databases and custom software. By restricting communication to a specific content format, it facilitates parsing. By abstracting access, it facilitates cross-platform data exchange. [4]
A server may receive requests from many different clients in a very short period of time. Because the computer can perform a limited number of tasks at any moment, it relies on a scheduling system to prioritize incoming requests from clients in order to accommodate them all in turn. To prevent abuse and maximize uptime, the server's software limits how a client can use the server's resources. Even so, a server is not immune from abuse. A denial-of-service attack exploits a server's obligation to process requests by bombarding it with requests incessantly. This inhibits the server's ability to respond to legitimate requests.

I/O Multiplexing
I/O multiplexing means what it says: it allows the programmer to examine and block on multiple I/O streams (or other synchronizing events), being notified whenever any one of the streams is active so that it can process data on that stream. In the Unix world, this is done with select() or poll() (when using the C API for Unix). In the Microsoft Windows API world, it is done with WaitForMultipleObjects(). Other languages and environments have similar features.

The advantage of I/O multiplexing is that it allows blocking on multiple resources simultaneously, without needing to use polling (which wastes CPU cycles) or multithreading (which can be difficult to deal with, especially if threads are introduced into an otherwise sequential application only for the purpose of pending on multiple descriptors). Of course, CPU cycles get spent somewhere in any case, but there is a large difference between cycles spent polling (even in a test-then-sleep loop) and cycles spent doing productive work.

14.5. I/O Multiplexing
When we read from one descriptor and write to another, we can use blocking I/O in a loop, such as

while ((n = read(STDIN_FILENO, buf, BUFSIZ)) > 0)
    if (write(STDOUT_FILENO, buf, n) != n)
        err_sys("write error");
We see this form of blocking I/O over and over again. What if we have to read from two descriptors? In this case, we can't do a blocking read on either descriptor, as data may appear on one descriptor while we're blocked in a read on the other. A different technique is required to handle this case.
Let's look at the structure of the telnet(1) command. In this program, we read from the terminal (standard input) and write to a network connection, and we read from the network connection and write to the terminal (standard output). At the other end of the network connection, the telnetd daemon reads what we typed and presents it to a shell as if we were logged in to the remote machine. The telnetd daemon sends any output generated by the commands we type back to us through the telnet command, to be displayed on our terminal. Figure 14.20 shows a picture of this.

Figure 14.20. Overview of the telnet program
The telnet process has two inputs and two outputs. We can't do a blocking read on either of the inputs, as we never know which input will have data for us. One way to handle this particular problem is to divide the process in two pieces (using fork), with each half handling one direction of data. We show this in Figure 14.21. (The cu(1) command provided with System V's uucp communication package was structured like this.)

Figure 14.21. The telnet program using two processes
If we use two processes, we can let each process do a blocking read. But this leads to a problem when the operation terminates. If an end of file is received by the child (the network connection is disconnected by the telnetd daemon), then the child terminates, and the parent is notified by the SIGCHLD signal. But if the parent terminates (the user enters an end of file at the terminal), then the parent has to tell the child to stop. We can use a signal for this (SIGUSR1, for example), but it does complicate the program somewhat.
Instead of two processes, we could use two threads in a single process. This avoids the termination complexity, but requires that we deal with synchronization between the threads, which could add more complexity than it saves.

We could use nonblocking I/O in a single process by setting both descriptors nonblocking and issuing a read on the first descriptor. If data is present, we read it and process it. If there is no data to read, the call returns immediately. We then do the same thing with the second descriptor. After this, we wait for some amount of time (a few seconds, perhaps) and then try to read from the first descriptor again. This type of loop is called polling. The problem is that it wastes CPU time. Most of the time, there won't be data to read, so we waste time performing the read system calls. We also have to guess how long to wait each time around the loop. Although it works on any system that supports nonblocking I/O, polling should be avoided on a multitasking system.

Another technique is called asynchronous I/O. To do this, we tell the kernel to notify us with a signal when a descriptor is ready for I/O. There are two problems with this. First, not all systems support this feature (it is an optional facility in the Single UNIX Specification). System V provides the SIGPOLL signal for this technique, but this signal works only if the descriptor refers to a STREAMS device. BSD has a similar signal, SIGIO, but it has similar limitations: it works only on descriptors that refer to terminal devices or networks. The second problem with this technique is that there is only one of these signals per process (SIGPOLL or SIGIO). If we enable this signal for two descriptors (in the example we've been talking about, reading from two descriptors), the occurrence of the signal doesn't tell us which descriptor is ready. To determine which descriptor is ready, we still need to set each one nonblocking and try them in sequence.
We describe asynchronous I/O briefly in Section 14.6.

A better technique is to use I/O multiplexing. To do this, we build a list of the descriptors that we are interested in (usually more than one descriptor) and call a function that doesn't return until one of the descriptors is ready for I/O. On return from the function, we are told which descriptors are ready for I/O. Three functions (poll, pselect, and select) allow us to perform I/O multiplexing. Figure 14.22 summarizes which platforms support them. Note that select is defined by the base POSIX.1 standard, but poll is an XSI extension to the base.

Socket Options
The socket mechanism provides two socket-option interfaces for us to control the behavior of sockets. One interface is used to set an option, and another interface allows us to query the state of an option. We can get and set three kinds of options:
1. Generic options that work with all socket types
2. Options that are managed at the socket level, but depend on the underlying protocols for support
3. Protocol-specific options unique to each individual protocol
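The I/O-multiplexing technique described above (build a set of descriptors, block until one is ready, then service only the ready one) can be sketched with select() in POSIX C. Two pipes stand in for the two inputs of the telnet example; select_demo is a name invented for this sketch.

```c
#include <string.h>
#include <sys/select.h>
#include <unistd.h>

/* Block on two descriptors at once and report which one became readable.
 * Returns 0 or 1 (the index of the ready pipe), or -1 on error. */
int select_demo(void) {
    int p0[2], p1[2];
    if (pipe(p0) < 0 || pipe(p1) < 0) return -1;

    /* Make only the second stream ready. */
    if (write(p1[1], "x", 1) != 1) return -1;

    /* Build the set of descriptors we are interested in. */
    fd_set readfds;
    FD_ZERO(&readfds);
    FD_SET(p0[0], &readfds);
    FD_SET(p1[0], &readfds);
    int maxfd = (p0[0] > p1[0] ? p0[0] : p1[0]) + 1;

    /* Blocks until at least one descriptor is readable; on return,
     * the set contains only the descriptors that are ready. */
    if (select(maxfd, &readfds, NULL, NULL, NULL) < 0) return -1;

    int ready = FD_ISSET(p1[0], &readfds) ? 1 : 0;

    close(p0[0]); close(p0[1]); close(p1[0]); close(p1[1]);
    return ready;
}
```

Unlike polling, no CPU time is burned while waiting, and unlike the signal-driven approach, the return value tells us exactly which descriptor is ready.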
The Single UNIX Specification defines only the socket-layer options (the first two option types in the preceding list). We can set a socket option with the setsockopt function.

Socket Options (Winsock)
This section describes Winsock socket options for various editions of Windows operating systems. Use the getsockopt and setsockopt functions for getting and setting socket options. To enumerate protocols and discover supported properties for each installed protocol, use the WSAEnumProtocols function. Some socket options require more explanation than these tables can convey; such options contain links to additional pages.

- IPPROTO_IP: socket options applicable at the IPv4 level. For more information, see the IPPROTO_IP Socket Options.
- IPPROTO_IPV6: socket options applicable at the IPv6 level. For more information, see the IPPROTO_IPV6 Socket Options.
- IPPROTO_RM: socket options applicable at the reliable multicast level. For more information, see the IPPROTO_RM Socket Options.
- IPPROTO_TCP: socket options applicable at the TCP level. For more information, see the IPPROTO_TCP Socket Options.
- IPPROTO_UDP: socket options applicable at the UDP level. For more information, see the IPPROTO_UDP Socket Options.
- NSPROTO_IPX: socket options applicable at the IPX level. For more information, see the NSPROTO_IPX Socket Options.
- SOL_APPLETALK: socket options applicable at the AppleTalk level. For more information, see the SOL_APPLETALK Socket Options.
- SOL_IRLMP: socket options applicable at the Infrared Link Management Protocol level. For more information, see the SOL_IRLMP Socket Options.
- SOL_SOCKET: socket options applicable at the socket level. For more information, see the SOL_SOCKET Socket Options.
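The two option interfaces can be sketched against the POSIX/Berkeley API, which Winsock mirrors: setsockopt to set an option and getsockopt to query it back. The example uses the socket-level (SOL_SOCKET) option SO_REUSEADDR; reuseaddr_demo is a name invented for this sketch.

```c
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

/* Set SO_REUSEADDR, then read it back. Returns 1 if the option reads
 * back as enabled, 0 if not, -1 on error. */
int reuseaddr_demo(void) {
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) return -1;

    /* The "set" interface: socket-level option, so level is SOL_SOCKET. */
    int on = 1;
    if (setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &on, sizeof on) < 0)
        return -1;

    /* The "query" interface: same level and option name. */
    int value = 0;
    socklen_t len = sizeof value;
    if (getsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &value, &len) < 0)
        return -1;

    close(fd);
    return value != 0;
}
```

A protocol-level option would be set the same way with a different level argument, for example IPPROTO_TCP with the TCP_NODELAY option name.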
Grid computing
Grid computing is the collection of computer resources from multiple locations to reach a common goal. The grid can be thought of as a distributed system with non-interactive workloads that involve a large number of files. What distinguishes grid computing from conventional high-performance computing systems such as cluster computing is that grids tend to be more loosely coupled, heterogeneous, and geographically dispersed. [1] Although a single grid can be dedicated to a particular application, commonly a grid is used for a variety of purposes. Grids are often constructed with general-purpose grid middleware software libraries. Grid sizes vary considerably.

Grids are a form of distributed computing whereby a super virtual computer is composed of many networked, loosely coupled computers acting together to perform large tasks. For certain applications, distributed or grid computing can be seen as a special type of parallel computing that relies on complete computers (with onboard CPUs, storage, power supplies, network interfaces, etc.) connected to a network (private, public, or the Internet) by a conventional network interface, such as Ethernet. This is in contrast to the traditional notion of a supercomputer, which has many processors connected by a local high-speed computer bus.

One feature of distributed grids is that they can be formed from computing resources belonging to multiple individuals or organizations (known as multiple administrative domains). This can facilitate commercial transactions, as in utility computing, or make it easier to assemble volunteer computing networks. One disadvantage of this feature is that the computers which are actually performing the calculations might not be entirely trustworthy. The designers of the system must thus introduce measures to prevent malfunctions or malicious participants from producing false, misleading, or erroneous results, and from using the system as an attack vector.
This often involves assigning work randomly to different nodes (presumably with different owners) and checking that at least two different nodes report the same answer for a given work unit. Discrepancies would identify malfunctioning and malicious nodes. Due to the lack of central control over the hardware, there is no way to guarantee that nodes will not drop out of the network at random times. Some nodes (like laptops or dial-up Internet customers) may also be available for computation but not network communications for unpredictable periods. These variations can be accommodated by assigning large work units (thus reducing the need for continuous network connectivity) and reassigning work units when a given node fails to report its results in the expected time.
The impacts of trust and availability on performance and development difficulty can influence the choice of whether to deploy onto a dedicated cluster, to idle machines internal to the developing organization, or to an open external network of volunteers or contractors. In many cases, the participating nodes must trust the central system not to abuse the access being granted, by interfering with the operation of other programs, mangling stored information, transmitting private data, or creating new security holes. Other systems employ measures to reduce the amount of trust client nodes must place in the central system, such as placing applications in virtual machines.

Public systems, or those crossing administrative domains (including different departments in the same organization), often result in the need to run on heterogeneous systems using different operating systems and hardware architectures. With many languages, there is a trade-off between investment in software development and the number of platforms that can be supported (and thus the size of the resulting network). Cross-platform languages can reduce the need to make this trade-off, though potentially at the expense of high performance on any given node (due to run-time interpretation or lack of optimization for the particular platform). There are diverse scientific and commercial projects to harness a particular associated grid or to set up new grids. BOINC is a common one for various academic projects seeking public volunteers; more are listed at the end of the article.

In fact, the middleware can be seen as a layer between the hardware and the software. On top of the middleware, a number of technical areas have to be considered, and these may or may not be middleware-independent. Example areas include SLA management, trust and security, virtual organization management, license management, portals and data management.
These technical areas may be taken care of in a commercial solution, though the cutting edge of each area is often found within specific research projects examining the field. Although grid computing might evolve into a bigger scope in the future, even onto the World Wide Web, most of the current scope is at the organizational level. Before we go into the organizational scope, let us take a moment to learn about the bigger scope.
All the computers on the World Wide Web become like one pool, and computing tasks can crawl to available resources, whether CPU power, storage or memory, on this world-wide system of computing resources and get the task done. The Napster technology is a kind of grid computing, but using only the storage area.
As grid computing is in its infancy, the technology is getting a foothold in the bigger corporations. Until the details, open standards and standard operating procedures are worked out for WWW-wide grid computing, grid computing will take a strong foothold in corporations, as it saves a lot of money. You and I know, cost saving is THE MOST widely used concept in the current technology world.
So the scope would be at the corporate level before the technology matures and crawls out to a wider area. If you don't yet have a clear picture of grid computing, please check out the following URL to see an interactive grid demo:
Grid Cafe
Server Consolidation
The most common usage of grid computing would be server consolidation. Several servers can be combined into a grid to save costs, use resources effectively and get the maximum return from the computing infrastructure. As a rule of thumb, any server not utilized close to 80% of average CPU, memory and SAN capacity is a candidate for consolidation. Of course, you want to have your projected growth included in that 80% mark.
Most of the time, the projected growth comes with some buffer capacity, whether in CPU, memory, SAN or networking requirements. When you combine this spare capacity, you end up saving much more than what you would have spent on the servers individually.
Hypothetical example: assume ABC Corp has 10 servers with similar capacity, to keep it simple. We can go into more detailed and complicated issues later on.
newdata001.server.abc.com  45% cpu, 75% memory, 85% san
newdata002.server.abc.com  51% cpu, 65% memory, 67% san
newdata003.server.abc.com  64% cpu, 35% memory, 37% san
newdata004.server.abc.com  89% cpu, 85% memory, 74% san
newdata005.server.abc.com  96% cpu, 45% memory, 81% san
newdata006.server.abc.com  72% cpu, 35% memory, 65% san
newdata007.server.abc.com  30% cpu, 95% memory, 45% san
newdata008.server.abc.com  15% cpu, 15% memory, 87% san
newdata009.server.abc.com  20% cpu, 45% memory, 37% san
newdata010.server.abc.com  33% cpu, 52% memory, 71% san
If you put all the nodes into a grid, the average utilization would be around 51.5% CPU, 54.7% memory and 64.9% SAN. You could eventually release at least 30% of the resources to save cost. Isn't that an effective way to get ROI on your computing infrastructure? The savings are not only on CapEx but also on maintenance costs from data center space and site operations.
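The averages quoted for the hypothetical ABC Corp servers, and the 80% rule-of-thumb for consolidation candidates, can be checked with a short Python sketch (the hostnames and figures are the ones from the example table):

```python
# (cpu %, memory %, san %) per server, from the hypothetical example above.
servers = {
    "newdata001": (45, 75, 85), "newdata002": (51, 65, 67),
    "newdata003": (64, 35, 37), "newdata004": (89, 85, 74),
    "newdata005": (96, 45, 81), "newdata006": (72, 35, 65),
    "newdata007": (30, 95, 45), "newdata008": (15, 15, 87),
    "newdata009": (20, 45, 37), "newdata010": (33, 52, 71),
}

# Average utilization per resource across the whole grid.
for i, name in enumerate(("cpu", "memory", "san")):
    avg = sum(u[i] for u in servers.values()) / len(servers)
    print(f"average {name}: {avg:.1f}%")
# average cpu: 51.5%, average memory: 54.7%, average san: 64.9%

# Rule of thumb: a server below the 80% mark on every resource
# is a consolidation candidate.
THRESHOLD = 80
candidates = [n for n, u in servers.items()
              if all(v < THRESHOLD for v in u)]
```

With these figures, five of the ten servers come out as candidates; the ones pinned above 80% on any single resource (like newdata005's CPU) would need attention before being folded into the grid.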
Grids can be scaled as needed, rather than stacking up all the resources at once and struggling to use them effectively. The nodes can be partitioned using industry-standard tools such as VMware, Zones, domains and Solaris Containers, though each has its own pros and cons. In the future, entire systems might run on a single grid, with a backup grid at a disaster site.
Virtualization
If you have a powerful server using virtualization with VMware or some other software, it introduces a thin layer called a hypervisor right above the hardware, making the machine look like multiple servers. Or you can put a large number of servers into a grid and virtualize them any way you want. The wonderful thing about virtualization is that resources can be dynamically re-mapped to any virtual server, and each virtual server can potentially run its own OS release, database release, web server or app server release.
Let us assume you have a powerful E15K server. You could virtualize it with VMware into 4 servers, in which one server runs the database (Oracle, Sybase or DB2), one runs the web server (Apache), one runs the app server (Tomcat), and another is used for batch processing. Good deal, right?
1. If the database is overloaded, well, the batch server is free during the day, so move some resources to the database server dynamically. During the night, the batch server gets busy but web traffic is slow, so move some resources (CPU or memory) to the batch server.
2. If the database (Oracle, Sybase or DB2) needs to run on a certain OS release while Tomcat needs to run on a different one, it's all possible.
3. If a faulty CPU is detected, it affects only the node it has affinity to.
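The day/night rebalancing in point 1 can be sketched as a tiny scheduling policy. This is purely illustrative: the virtual server names, share counts and the 8:00–20:00 daytime window are assumptions, not anything a real hypervisor ships with.

```python
def rebalance(hour, shares):
    """Shift CPU shares between the database and batch virtual servers
    by time of day: batch is idle during the day, web traffic is slow
    at night (hypothetical policy from the example above)."""
    shares = dict(shares)  # leave the caller's allocation untouched
    if 8 <= hour < 20:
        # Daytime: lend batch capacity to the busy database.
        shares["database"] += 2
        shares["batch"] -= 2
    else:
        # Nighttime: give capacity back for batch runs.
        shares["database"] -= 2
        shares["batch"] += 2
    return shares

base = {"database": 6, "web": 3, "app": 3, "batch": 4}
print(rebalance(12, base))  # daytime allocation
print(rebalance(2, base))   # nighttime allocation
```

A real hypervisor exposes this kind of control through its own resource-management interfaces (shares, pools, caps); the point here is only that the allocation is a policy decision that can change at run time without touching the hardware.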
If the hardware fails, most components are hot-swappable, but not all faults are hot-fixable. So the other option, combining multiple machines into a grid and virtualizing them, gives you all the greatest benefits you could imagine! To learn more, click the link.
VMWare Intro
Containers
Sun Microsystems has a new partitioning and virtualization technology called LDoms for Sun CoolThreads servers. It allows you to run "software-isolated" applications on each virtual machine. When we get to actual design, we will go into the details of these virtualization techniques. For now, they are just within the scope of grid technology and what they offer for the server consolidation themes in the corporate world. If you want to learn more about Solaris Containers or LDoms, please click the following link to read the Sun Blueprint PDF document about Containers and Virtualization.
Grid architecture: Grid architecture is the way in which a grid has been designed. A grid's architecture is often described in terms of "layers", where each layer has a specific function. The higher layers are generally user-centric, whereas lower layers are more hardware-centric, focused on computers and networks. The lowest layer is the network, which connects grid resources.
Above the network layer lies the resource layer: actual grid resources, such as computers, storage systems, electronic data catalogues, sensors and telescopes that are connected to the network.
The middleware layer provides the tools that enable the various elements (servers, storage, networks, etc.) to participate in a grid. The middleware layer is sometimes called the "brains" behind a computing grid!
The highest layer of the structure is the application layer, which includes applications in science, engineering, business, finance and more, as well as portals and development toolkits to support the applications. This is the layer that grid users "see" and interact with. The application layer often includes the so-called serviceware, which performs general management functions like tracking who is providing grid resources and who is using them.
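The four layers described above can be summarized in a small sketch, highest layer first. The one-line role descriptions are paraphrases of the text, not a formal taxonomy:

```python
# Grid architecture layers, user-centric at the top, hardware-centric below.
grid_layers = [
    ("application", "science/engineering/business apps, portals, serviceware"),
    ("middleware",  "tools that let servers, storage and networks join the grid"),
    ("resource",    "computers, storage, data catalogues, sensors, telescopes"),
    ("network",     "connectivity between grid resources"),
]

def role_of(layer):
    """Look up the role of a named layer."""
    for name, role in grid_layers:
        if name == layer:
            return role
    raise KeyError(layer)

for name, role in grid_layers:
    print(f"{name:>12}: {role}")
```

Keeping the layers as an ordered list (rather than a plain dict) preserves the top-to-bottom ordering the description relies on.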