A Technical Seminar Report Submitted in partial fulfillment of the Requirements for the Award of Bachelor of Technology Degree in
INFORMATION TECHNOLOGY
By B.PRAVEEN KUMAR REDDY ROLL NO: 06S11A1225
CONTENTS
1. ABSTRACT
2. INTRODUCTION
3. DISTRIBUTED COMPUTING
4. PARALLEL COMPUTING
5. DISTRIBUTED COMPUTER SYSTEM METRICS
6. ARCHITECTURE
7. HOW DOES DISTRIBUTED COMPUTING WORK?
8. FORMS OF COMMUNICATION
9. SECURITY ENFORCEMENT
10. SECURITY AND STANDARD CHALLENGES
11. TYPES OF APPLICATIONS
12. ADVANTAGES
13. CONCLUSION
ABSTRACT
Distributed computing can be defined in many different ways. Various vendors have created and marketed distributed computing systems for years, and have developed numerous initiatives and architectures to permit distributed processing of data and objects across a network of connected systems. One flavor of distributed computing has received a lot of attention lately, and it is a primary focus of this report: an environment in which the idle CPU cycles and storage space of tens, hundreds, or thousands of networked systems can be harnessed to work together on a particularly processing-intensive problem. The growth of such processing models has been limited, however, by a lack of compelling applications and by bandwidth bottlenecks, combined with significant security, management, and standardization challenges.

Distributed computing offers researchers the potential of solving complex problems using many dispersed machines. The result is faster computation at potentially lower cost compared to the use of dedicated resources. The term "distributed computation" has been used to describe the use of distributed computing for the sake of raw computation rather than, say, remote file sharing, storage, or information retrieval. Distributed computing also often involves competition with other distributed systems. This competition may be for prestige, or it may be a means of enticing users to donate processing power to a specific project. This differs from cluster computing in that computers in a distributed computing environment are typically not exclusively running 'group' tasks, whereas clustered computers are usually much more tightly coupled. This difference makes distributed computing attractive because, when properly configured, it can use computational resources that would otherwise go unused. This paper examines what makes people turn to distributed computing.
INTRODUCTION
A process can be run faster by being divided into subtasks (threads) that run in parallel on two or more interconnected computers. The more computers there are, the more independent subtasks can run at the same time. Although cluster-based computing has made very high-speed processing possible on a small budget, there are computational problems with such extensive processing needs that there is no reasonable way to fund the project using dedicated machines. Grid computing hopes to overcome budget and infrastructure constraints by using the spare CPU time of thousands or even millions of networked computers. When these computers are not in use or are operating under capacity, they can help solve big problems in small pieces. Distributed computation offers researchers an opportunity to distribute the task of solving complex problems onto hundreds, and in many cases thousands, of Internet-connected machines. Although the network itself is distributed, the research and end-user participants form a loosely bound partnership, not unlike a team or community: people band together to create and connect the resources required to achieve a common goal. A fascinating aspect of this continues to be humanity's willingness to transcend cultural barriers.
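The divide-into-subtasks idea described above can be sketched with Python's standard multiprocessing module. The task here (summing squares over a split range) is a hypothetical stand-in for a real processing-intensive problem; the chunking and worker count are arbitrary choices for illustration.

```python
# Minimal sketch: split one task into independent subtasks that run
# in parallel on a pool of workers. In a real distributed system the
# workers would be separate networked machines, not local processes.
from multiprocessing import Pool

def sum_squares(bounds):
    # One subtask: sum the squares over a half-open range [lo, hi).
    lo, hi = bounds
    return sum(i * i for i in range(lo, hi))

def parallel_sum_squares(n, workers=4):
    # Split [0, n) into equal chunks, one per worker.
    step = n // workers
    chunks = [(w * step, n if w == workers - 1 else (w + 1) * step)
              for w in range(workers)]
    with Pool(workers) as pool:
        # Each chunk is processed concurrently; partial results are combined.
        return sum(pool.map(sum_squares, chunks))
```

The key property is that the subtasks share no state, so adding more workers (or machines) speeds up the computation without any coordination beyond the final combine step.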
DISTRIBUTED COMPUTING:
Distributed computing is becoming an ever more common methodology to solve highly complex computing problems that would traditionally be solved using a supercomputer. It is used to more quickly and/or efficiently process information using available resources. Using a distributed operating system a collection of computers can be interconnected through a network into a cluster. It is based on the concept that most CPUs are not fully utilized and can be used to run tasks sent to them. Distributed computing differs from cluster computing, as in a 'Beowulf Cluster', in that machines in a distributed network are not dedicated to the tasks sent to them.
resources at one location can be made available to users at other locations, and it is desirable that programs at those other locations be able to employ them.
4. Security: Distributed systems may give better security and reliability, because critical data and services can be replicated on several machines.
In general, it is possible to build systems which to a large extent achieve the advantages of both centralized and distributed designs.
Fig: Distributed computing
To run a distributed application in a distributed environment, several issues need to be addressed. To begin with, it must be possible to start processes on remote computers, and the necessary data for these processes must be provided before they can do any work. Some mechanism for synchronizing these processes, such as inter-process semaphores, should be available, so that they know when to access the data and when to produce results. Starting a program on another computer is not very hard using programs like 'telnet' or 'rsh'. Exchanging data and synchronizing, however, can be quite difficult and complicated. These problems can distract the programmer from the original project and can be the source of numerous bugs.

Linux already has mechanisms for processes on the same computer to exchange data and synchronize with each other. This is called Inter-Process Communication (IPC). One prominent example is System V IPC, first introduced in AT&T's System V UNIX.

Distributed computing first used machines connected in a finite physical network: PCs similar in both hardware and software. Most such networks are not big enough to solve massive computational problems; grid computing is the answer to this problem.
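The data-exchange and synchronization problem sketched above can be illustrated with Python's portable multiprocessing primitives, which play the same role as the System V IPC mechanisms mentioned in the text. The doubling "work" and the sentinel protocol are hypothetical choices for the example.

```python
# Minimal sketch of inter-process communication and synchronization:
# a worker process receives data through one queue and sends results
# back through another. The queues provide both the data exchange and
# the synchronization (get() blocks until data is available).
from multiprocessing import Process, Queue

def worker(tasks, results):
    # Block until data arrives, process it, and return the result.
    while True:
        item = tasks.get()
        if item is None:          # sentinel: no more work
            break
        results.put(item * 2)     # stand-in for real processing

def run_worker(data):
    tasks, results = Queue(), Queue()
    p = Process(target=worker, args=(tasks, results))
    p.start()
    for item in data:
        tasks.put(item)
    tasks.put(None)               # tell the worker to stop
    out = [results.get() for _ in data]
    p.join()
    return out
```

Because the queues hide the locking and buffering, the programmer never touches raw semaphores, which is exactly the kind of bug-prone detail the paragraph above warns about.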
DISTRIBUTED COMPUTER SYSTEM METRICS:
Communication bandwidth (bits per second).
Granularity: the relative size of the units of processing required. Distributed systems operate best with coarse-grained granularity, because communication is slow compared to processing speed in general.
Processor speed: MIPS, FLOPS.
Reliability: the ability to continue operating correctly for a given time.
Fault tolerance: resilience to partial system failure.
Security: policy to deal with threats to the communication or processing of data in a system.
Administrative/management domains: issues concerning the ownership and management of the system's resources.
ARCHITECTURE:
The simplest organization of distributed systems is the client-server model, which has only two types of machines:
1. A client machine containing only the programs implementing the user-interface level.
2. A server machine containing the rest, that is, the programs implementing the processing and data levels.
Two kinds of architectures are possible with a client-server organization:
1. Multi-tiered architectures
2. Modern architectures
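A minimal sketch of the client-server split described above, using only Python's standard socket library: the server implements the processing level (here, just upper-casing text, a hypothetical stand-in), and the client implements the user-interface level. The port number and retry loop are arbitrary illustration choices.

```python
# Minimal client-server sketch: one server, one client, one request.
import socket
import threading
import time

def serve_once(port):
    # Server: accept one connection, process the request, reply.
    with socket.socket() as srv:
        srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        srv.bind(("127.0.0.1", port))
        srv.listen(1)
        conn, _ = srv.accept()
        with conn:
            request = conn.recv(1024)
            conn.sendall(request.upper())   # the "processing level"

def client_request(port, text):
    # Client: connect (retrying until the server is listening),
    # send a request, and return the server's reply.
    for _ in range(100):
        try:
            sock = socket.create_connection(("127.0.0.1", port))
            break
        except ConnectionRefusedError:
            time.sleep(0.05)
    with sock:
        sock.sendall(text.encode())
        sock.shutdown(socket.SHUT_WR)       # signal end of request
        return sock.recv(1024).decode()

def demo(port=50007):
    # Run the server in a thread to simulate a second machine.
    t = threading.Thread(target=serve_once, args=(port,))
    t.start()
    reply = client_request(port, "hello server")
    t.join()
    return reply
```

In a real deployment the two halves would run on different machines, with the client machine carrying only the user-interface code.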
1. Multi-tiered Architectures:
2. Modern Architectures:
In modern architectures, it is often the distribution of the clients and the servers that counts, which we refer to as horizontal distribution. In this type of distribution, a client or a server may be physically split up into logically equivalent parts, with each part operating on its own share of the complete data set, thus balancing the load. Modern distributed systems are generally built by means of an additional layer of software on top of a network operating system. This layer, called middleware, is designed to hide the heterogeneity and distributed nature of the underlying collection of computers.
The request simply says, "Give me a work package".
The essence of persistent communication is that a message submitted for transmission is stored by the communication system for as long as it takes to deliver it. Message-oriented middleware models generally offer persistent asynchronous communication and are used where RPCs and RMIs are not appropriate. They are primarily used to assist the integration of collections of databases into large-scale information systems; other applications include e-mail and workflow.

A completely different form of communication is streaming, in which the issue is whether or not two successive messages have a temporal relationship. In continuous data streams, a maximum end-to-end delay is specified for each message. In addition, it is also required that messages be sent subject to a minimum end-to-end delay. Typical examples of such continuous data streams are video and audio streams. Though various modes of communication are available, the message-oriented model turns out to be the optimal form of communication.

The operating systems commonly used for distributed computing systems can be broadly classified into two types: network operating systems and distributed operating systems. Compared to a network operating system, a distributed operating system shows better transparency and fault-tolerance capability and presents the image of a virtual uniprocessor to its users.
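The persistent asynchronous style described above can be sketched as a tiny in-memory broker, a hypothetical stand-in for a real message-queuing middleware: the sender hands the message to the communication system and continues immediately, and the message is stored until the receiver asks for it. The mailbox names and message contents are invented for the example.

```python
# Minimal sketch of persistent asynchronous communication: the broker
# stores each submitted message until it is delivered, so sender and
# receiver need not be running at the same time.
import queue

class MessageBroker:
    """Stores submitted messages per named mailbox until delivery."""

    def __init__(self):
        self.mailboxes = {}

    def send(self, mailbox, message):
        # Asynchronous: returns immediately; the message is stored.
        self.mailboxes.setdefault(mailbox, queue.Queue()).put(message)

    def receive(self, mailbox):
        # The receiver may run long after send() happened; the
        # message persists in the mailbox until this call.
        return self.mailboxes[mailbox].get_nowait()
```

A real middleware product would add durable storage and network transport, but the decoupling in time between send and receive is the same.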
The main issues involved in the design of a distributed operating system are transparency, reliability, flexibility, performance, scalability, and heterogeneity.

SECURITY ENFORCEMENT - KEY DESIGN ISSUES
Security enforcement can be taken care of during the design of the distributed system itself. Three important design issues to be considered in this context are:
combine it with a public-key system. Current practice shows the use of public-key cryptography for distributing short-term shared secret keys.

2. The second issue in a secure distributed system is access control, or authorization. Authorization deals with protecting resources in such a way that only processes with the proper access rights can actually access those resources. Access control always takes place after a process has been authenticated.

3. The third and final issue in secure distributed systems concerns security management, especially key management and authorization management. Key management includes the distribution of cryptographic keys, for which certificates issued by trusted third parties play an important role. Important with respect to authorization management are attribute certificates and delegation.

Kerberos is a widely used security system based on shared secret keys. Special attention is often paid to the anonymity of a customer, as this distinguishes traditional cash-based systems from their electronic counterparts.
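The shared-secret-key idea underlying systems like Kerberos can be sketched with a simple challenge-response exchange using Python's standard hmac module. This is an illustrative simplification, not the Kerberos protocol: the key name and the out-of-band key distribution are assumptions of the example.

```python
# Minimal sketch of authentication with a shared secret key: the
# server challenges the client with a random nonce, and the client
# proves knowledge of the key by returning an HMAC of the nonce,
# without ever sending the key itself over the network.
import hashlib
import hmac
import secrets

SHARED_KEY = b"example-shared-secret"   # distributed out of band

def make_challenge():
    # Server side: a fresh random nonce prevents replay attacks.
    return secrets.token_bytes(16)

def client_response(key, challenge):
    # Client side: keyed hash of the challenge.
    return hmac.new(key, challenge, hashlib.sha256).digest()

def server_verify(key, challenge, response):
    # Server side: recompute and compare in constant time.
    expected = hmac.new(key, challenge, hashlib.sha256).digest()
    return hmac.compare_digest(expected, response)
```

In practice, as the text notes, public-key cryptography is used to distribute such short-term shared keys in the first place.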
Most of the current platforms offer high-level encryption such as Triple DES. The application packages that are sent to PCs are digitally signed to make sure a rogue application does not infiltrate a system. Avaki comes with its own PKI (public key infrastructure). Identical application packages are typically sent to multiple PCs and the results from each are compared; any set of results that differs from the rest becomes security-suspect. Even with encryption, data can still be snooped while the process is running in the client's memory, so most platforms create application data chunks so small that snooping them is unlikely to provide useful information.

Avaki claims that it integrates easily with different existing security infrastructures and can facilitate communications among them, but this is obviously a challenge for global distributed computing. Working out standards for communications among platforms is part of the typical chaos that occurs early in any relatively new technology. In the generalized peer-to-peer realm lies the Peer-to-Peer Working Group, started by Intel, which is looking to devise standards for communications among many different types of peer-to-peer platforms, including those used for edge services and collaboration. The Global Grid Forum is a collection of about 200 companies looking to devise grid computing standards. Then there are vendor-specific efforts such as Sun's open-source JXTA platform, which provides a collection of protocols and services that allows peers to advertise themselves to, and communicate with, each other securely. JXTA has a lot in common with JINI, but is not Java-specific (though the first version is Java-based). Intel recently released its own peer-to-peer middleware, the Intel Peer-to-Peer Accelerator Kit for Microsoft .NET, also designed for discovery and based on the Microsoft .NET platform.
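The result-comparison scheme described above (sending identical work packages to several PCs and flagging disagreeing results) can be sketched in a few lines; the PC identifiers and result values are hypothetical.

```python
# Minimal sketch of redundant execution with result comparison: the
# majority result is accepted, and any PC whose result disagrees with
# the majority is flagged as security-suspect.
from collections import Counter

def flag_suspect_results(results):
    """results: mapping of PC id -> returned result.

    Returns (majority_result, list of suspect PC ids).
    """
    majority, _ = Counter(results.values()).most_common(1)[0]
    suspects = [pc for pc, r in results.items() if r != majority]
    return majority, suspects
```

Real platforms must also decide how many redundant copies to send, trading wasted CPU cycles against confidence in the accepted result.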
TYPES OF APPLICATIONS:
The following scenarios are examples of types of application tasks that can be set up to take advantage of distributed computing.
A query search against a huge database that can be split across lots of desktops, with the submitted query running concurrently against each fragment on each desktop.
Exhaustive search techniques that require searching through a huge number of results to find solutions to a problem also make sense. Drug screening is a prime example.
Complex modeling and simulation techniques that increase the accuracy of results by increasing the number of random trials would also be appropriate, as trials could be run concurrently on many desktops and combined to achieve greater statistical accuracy.
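The trial-splitting idea above can be sketched with the classic Monte Carlo estimate of pi as a hypothetical stand-in for a real simulation: each "desktop" runs its own batch of random trials with its own seed, and the batches are combined for a more accurate estimate.

```python
# Minimal sketch: random trials split across workers and combined.
# Each worker counts how many random points in the unit square fall
# inside the quarter circle; the combined ratio estimates pi/4.
import random
from multiprocessing import Pool

def count_hits(args):
    trials, seed = args
    rng = random.Random(seed)     # per-desktop independent stream
    hits = 0
    for _ in range(trials):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            hits += 1
    return hits

def estimate_pi(trials_per_desktop, desktops=4):
    jobs = [(trials_per_desktop, seed) for seed in range(desktops)]
    with Pool(desktops) as pool:
        total_hits = sum(pool.map(count_hits, jobs))
    return 4.0 * total_hits / (trials_per_desktop * desktops)
```

Doubling the number of desktops doubles the trial count with no change to the per-desktop code, which is why this class of problem suits distributed computing so well.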
Complex financial modeling, weather forecasting, and geophysical exploration are on the radar screens of the vendors, as well as car crash and other complex simulations.
Many of today's vendors are aiming squarely at the life sciences market, which has a sudden need for massive computing power. Pharmaceutical firms have repositories of millions of different molecules and compounds, some of which may have characteristics that make them appropriate for inhibiting newly found proteins. The process of matching all these ligands to their appropriate targets is an ideal task for distributed computing, and the quicker it's done, the quicker and greater the benefits will be.
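Several of the application types above share one pattern: a large dataset is split into fragments, the same query runs concurrently against each fragment, and the partial results are combined. A minimal sketch, with a list of numbers as a hypothetical stand-in for the database and a simple predicate as the query:

```python
# Minimal sketch of the fragment-and-search pattern: the same query
# runs concurrently against each fragment of the dataset, and the
# partial results are concatenated in fragment order.
from concurrent.futures import ThreadPoolExecutor

def search_fragment(fragment, predicate):
    # One desktop's share of the work.
    return [x for x in fragment if predicate(x)]

def distributed_search(database, predicate, fragments=4):
    size = max(1, len(database) // fragments)
    chunks = [database[i:i + size] for i in range(0, len(database), size)]
    with ThreadPoolExecutor() as pool:
        partials = pool.map(lambda c: search_fragment(c, predicate), chunks)
    return [x for part in partials for x in part]
```

In the drug-screening example from the text, the fragments would be sets of candidate molecules and the predicate a (much more expensive) docking test against the target protein.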
ADVANTAGES
Distributed computing has been proposed for various reasons, ranging from organizational decentralization to more economical processing. Its advantages include:
a) Management of distributed data with different levels of transparency.
b) Distribution or network transparency.
c) Replication transparency.
d) Increased reliability and availability.
e) Increased performance.
CONCLUSION
Scalability is also a great advantage of distributed computing. Though they provide massive processing power, supercomputers are typically not very scalable once they are installed. A distributed computing installation, by contrast, scales almost without limit: simply add more systems to the environment. In a corporate distributed computing setting, systems might be added within or beyond the corporate firewall.

The inability to adequately address massive processing and data-volume issues has always hampered the potential of computer science. No matter how fast a CPU is or how high the data throughput rate, our imaginations come up with new applications that exceed the existing technology or budget. For today, however, the specific promise of distributed computing lies mostly in harnessing the system resources that lie within the firewall. It will take years before the systems on the Net share compute resources as effortlessly as they share information.