Unit 1
UNIT I INTRODUCTION
Distributed Computing
Definition
A distributed system consists of multiple
7/22/16
Introduction
A distributed system is one in which hardware
Introduction
[Figure: agents cooperating and distributing work across the Internet. Labels: Agent, Cooperation, Distribution, Internet, Subscription, Job Request, Resource Management, Large-scale Application]
Motivation
Inherently distributed applications
Performance/cost
Resource sharing
Flexibility and extensibility
Availability and fault tolerance
Scalability
Network connectivity is increasing.
Combination of cheap processors often more
History
1975-1985
Parallel computing was favored in the early years
Primarily vector-based at first
Gradually more thread-based parallelism was introduced
The first distributed computing programs were a pair of programs called Creeper and Reaper, written in the 1970s
Ethernet was invented in the 1970s
ARPANET e-mail was invented in the early 1970s and is probably the earliest example of a large-scale distributed application
History
1985-1995
Massively parallel architectures started rising, and the Message Passing Interface (MPI) and other libraries were developed
Bandwidth was a big problem
The first Internet-based distributed computing project was started in 1988 by the DEC Systems Research Center
Distributed.net, a project founded in 1997, is considered the first to use the Internet to distribute data for calculation and collect the results
History
1995-Today
Cluster/grid architecture increasingly dominant
Special node machines eschewed in favor of COTS (commercial off-the-shelf) technologies
Web-wide cluster software
Google takes this to the extreme (thousands of nodes per cluster)
SETI@Home started in May 1999 to analyze the radio signals collected by the Arecibo Radio Telescope in Puerto Rico
Goal
Making Resources Accessible
Data sharing and device sharing
Distribution Transparency
Access, location, migration, relocation,
Communication
Make human-to-human communication easier, e.g. electronic mail
Flexibility
Spread the work load over the available
Characteristics
Resource Sharing
Openness
Concurrency
Scalability
Fault Tolerance
Transparency
Architecture
Client-server
3-tier architecture
N-tier architecture
Loose coupling or tight coupling
Peer-to-peer
Space based
Application
Examples of commercial application :
Database Management System
Distributed computing using mobile agents
Local intranet
Internet (World Wide Web)
JAVA Remote Method Invocation (RMI)
Local Intranet
A portion of Internet that is separately administered &
Internet
The Internet is a global system of interconnected
Java RMI
Embedded in the Java language:
RMI Architecture
Charity
Advantages
Economics: Computers harnessed together give a better price/performance ratio than mainframes.
Speed: A distributed system may have more total computing power than a mainframe.
Inherent distribution of applications: Some applications are inherently distributed, e.g. an ATM banking application.
Reliability: If one machine crashes, the system as a whole can still survive, provided there are multiple server machines and multiple storage devices (redundancy).
Extensibility and incremental growth: The system can be scaled up gradually (in processing power and functionality) by adding more resources (both hardware and software), without disrupting the rest of the system.
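The reliability argument above can be made concrete with a small calculation. The sketch below is only illustrative: it assumes that server failures are independent and that the system survives as long as at least one replica is up, which real deployments only approximate. The function name and the 0.95 availability figure are assumptions chosen for the example.

```python
def system_availability(node_availability: float, replicas: int) -> float:
    """Probability that at least one of `replicas` nodes is up.

    Assumes independent failures (an idealization): the system is down
    only when every replica is down at the same time.
    """
    if not 0.0 <= node_availability <= 1.0:
        raise ValueError("node_availability must be between 0 and 1")
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    return 1.0 - (1.0 - node_availability) ** replicas


# Hypothetical figure: each server is up 95% of the time.
for n in (1, 2, 3):
    print(n, "replica(s):", round(system_availability(0.95, n), 6))
```

Under these assumptions, going from one server to two raises availability from 0.95 to roughly 0.9975, which is the quantitative side of the redundancy claim.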
Disadvantages
Complexity: Lack of experience in designing and implementing a distributed system, e.g. which platform (hardware and OS) to use, which language to use, etc.
Network problem: If the network underlying a distributed system
in this Figure:
Conclusion
The concept of distributed computing is the
Grid Computing
Grid computing is a form of distributed computing whereby a "super and virtual computer" is composed of a cluster of networked, loosely coupled computers, acting in concert to perform very large tasks.
Grid computing (Foster and Kesselman, 1999) is a growing technology that facilitates the execution of large-scale, resource-intensive applications on geographically distributed computing resources.
It facilitates flexible, secure, coordinated, large-scale resource sharing among dynamic collections of individuals, institutions, and resources.
It enables communities (virtual organizations) to share geographically distributed resources as they pursue common goals.
Grid Applications
Data and computationally intensive applications:
This technology has been applied to computationally intensive scientific, mathematical, and academic problems such as drug discovery, economic forecasting, and seismic analysis, and to back-office data processing in support of e-commerce.
A chemist may utilize hundreds of processors to screen thousands of compounds per hour.
Teams of engineers worldwide pool resources to analyze terabytes of structural data.
Meteorologists seek to visualize and analyze petabytes of climate data with enormous computational demands.
Resource sharing
Computers, storage, sensors, networks, etc.
Sharing is always conditional: issues of trust, policy, negotiation, payment, etc.
Coordinated problem solving
Distributed data analysis, computation, collaboration, etc.
Grid Topologies
Intragrid
Local grid within an organization
Trust based on personal contracts
Extragrid
Resources of a consortium of organizations connected through a (Virtual) Private Network
Trust based on business-to-business contracts
Intergrid
Global sharing of resources through the Internet
Trust based on certification
Computational Grid
A computational grid is a hardware and software
infrastructure that provides dependable, consistent,
pervasive, and inexpensive access to high-end
computational capabilities.
Source: The Grid: Blueprint for a New Computing Infrastructure, Kesselman & Foster
Example : Science Grid (US Department of Energy)
Data Grid
A data grid is a grid computing system that deals
Distributed Supercomputing
a single system.
High-Throughput Computing
Uses the grid to schedule large numbers
On-Demand Computing
Collaborative Computing
Data-Intensive Computing
Logistical Networking
Logistical networks focus on exposing
[Figure: Grid job flow. Labels: User, Grid application, Resource Broker, Information Service (details of Grid resources), Grid Resources, Computational jobs (4 jobs), Processed jobs, Computation result]
A user sends a computation- or data-intensive application to global Grids in order to speed up the execution of the application.
A Resource Broker distributes the jobs in an application to the Grid resources, based on the user's QoS requirements and details of available Grid resources, for further execution.
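The resource broker described above can be sketched in a few lines. This is a minimal illustration, not the algorithm of any particular grid broker: it assumes the user's QoS requirement is a single budget, that each resource advertises a MIPS rating and a price per second, and that the broker greedily picks the affordable resource that finishes each job earliest. All class and field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    name: str
    mips: float              # capability, million instructions per second
    cost_per_sec: float      # hypothetical price charged per second of use
    busy_until: float = 0.0  # time at which the resource becomes free


@dataclass
class Job:
    job_id: int
    length_mi: float         # job size in million instructions


def broker_schedule(jobs, resources, budget):
    """Greedy broker: for each job, choose the affordable resource that
    would finish it earliest, and charge its cost against the budget."""
    assignments = {}
    remaining = budget
    # Longest jobs first; ties keep their original order (sort is stable).
    for job in sorted(jobs, key=lambda j: j.length_mi, reverse=True):
        best = None
        for res in resources:
            runtime = job.length_mi / res.mips
            cost = runtime * res.cost_per_sec
            if cost > remaining:
                continue  # violates the user's budget constraint
            finish = res.busy_until + runtime
            if best is None or finish < best[0]:
                best = (finish, cost, res)
        if best is None:
            raise RuntimeError(f"job {job.job_id} does not fit the budget")
        finish, cost, res = best
        res.busy_until = finish
        remaining -= cost
        assignments[job.job_id] = res.name
    return assignments, remaining


# Four equal jobs, as in the figure, on two hypothetical resources.
demo_resources = [Resource("fast", 1000.0, 2.0), Resource("slow", 500.0, 0.5)]
demo_jobs = [Job(i, 1000.0) for i in range(1, 5)]
print(broker_schedule(demo_jobs, demo_resources, budget=10.0))
```

A real broker would also query an information service for current resource state and handle deadlines, failures, and rescheduling; the sketch only shows the matching step.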
Grid Middleware
Grids are typically managed by grid ware -
Middleware
Globus (University of Chicago)
Condor (University of Wisconsin): high-throughput computing
Legion (University of Virginia): virtual workspaces, collaborative computing
IBP, Internet Backplane Protocol (University of Tennessee): logistical networking
NetSolve: solving scientific problems in heterogeneous environments; high-throughput and data-intensive
Name | URL | Sponsor | Focus
EuroGrid, Grid Interoperability (GRIP) | eurogrid.org | European Union |
Fusion Collaboratory | fusiongrid.org | DOE Office of Science | Create a national computational collaboratory for fusion research
Globus Project | globus.org | DARPA, DOE, NSF, NASA, Microsoft |
GridLab | gridlab.org | European Union |
GridPP | gridpp.ac.uk | U.K. eScience |
Grid Research Integration Development & Support Center | grids-center.org | NSF |
Grid Architecture
Applications
Local OS
[Figure: layered Grid architecture (Application, Collective, Resource, Connectivity, Fabric) alongside the Internet protocol architecture (Application, Transport, Internet, Link)]
Application
Example:
Data Grid Architecture
App
Simulation tools
GridSim: job scheduling
SimGrid: single-client, multi-server scheduling
Bricks: scheduling
GangSim: Ganglia VO
OptorSim: Data Grid simulations
G3S (Grid Security Services Simulator): security services
Simulation tool
resources.
Resources can be modeled operating under
space- or time-shared mode.
Resource capability can be defined in the form of a MIPS (Million Instructions Per Second) benchmark rating.
Resources can be located in any time zone.
Weekends and holidays can be mapped depending on the resource's local time to model non-Grid (local) workload.
Resources can be booked for advance
reservation.
Applications with different parallel application models can be simulated.
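To illustrate the difference between the space-shared and time-shared modes listed above, here is a small standalone sketch. It does not use the API of any simulation toolkit; it assumes a single processing element with a given MIPS rating, all jobs arriving at time zero, first-come-first-served order for space sharing, and ideal equal sharing for time sharing. The function names are illustrative.

```python
def space_shared_finish_times(lengths_mi, mips):
    """Space-shared: each job gets the whole processor in arrival order,
    so finish times are cumulative run times."""
    t = 0.0
    finish = []
    for length in lengths_mi:
        t += length / mips
        finish.append(t)
    return finish


def time_shared_finish_times(lengths_mi, mips):
    """Time-shared (idealized): the n active jobs each receive mips / n,
    so shorter jobs leave first and the rest then speed up."""
    order = sorted(range(len(lengths_mi)), key=lambda i: lengths_mi[i])
    finish = [0.0] * len(lengths_mi)
    t = 0.0
    done = 0.0               # work already completed by every active job
    active = len(lengths_mi)
    for i in order:
        # Remaining work of the shortest active job, served at mips / active.
        t += (lengths_mi[i] - done) * active / mips
        done = lengths_mi[i]
        finish[i] = t
        active -= 1
    return finish


# Job lengths in million instructions on a hypothetical 100 MIPS resource.
jobs = [100.0, 200.0, 300.0]
print("space-shared:", space_shared_finish_times(jobs, 100.0))
print("time-shared: ", time_shared_finish_times(jobs, 100.0))
```

Both policies complete the total work at the same time; what changes is when each individual job finishes, which is why the allocation mode matters when evaluating a scheduler.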
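[Figure: simulation tool architecture. Labels: Resource configuration, User requirements, Grid scenario, Output; Resource entities, Information services, Job management, Resource allocation, Statistics; SMPs, Clusters, Load, Network, Reservation; Distributed resources: PCs, Workstations, SMPs, Clusters]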
What is a Service?
A Service is a reusable component.
A Service changes business data from one
state to another.
A Service is the only way data is accessed.
If you can describe a component in WSDL, it is
a Service.
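The points above can be illustrated with a short sketch: a service exposes an interface that callers depend on, it is the only path to the data, and each operation moves business data from one state to another. The order example, the status values, and all names below are assumptions made for illustration; a WSDL description would play the role that the `OrderService` interface plays here.

```python
from dataclasses import dataclass, replace
from typing import Protocol


@dataclass(frozen=True)
class Order:
    order_id: str
    status: str  # hypothetical states: "submitted" or "approved"


class OrderService(Protocol):
    """The contract callers see; it says nothing about storage."""

    def get(self, order_id: str) -> Order: ...

    def approve(self, order_id: str) -> Order: ...


class InMemoryOrderService:
    """One possible implementation; callers never touch _orders directly."""

    def __init__(self) -> None:
        self._orders: dict[str, Order] = {}

    def create(self, order_id: str) -> Order:
        order = Order(order_id, "submitted")
        self._orders[order_id] = order
        return order

    def get(self, order_id: str) -> Order:
        return self._orders[order_id]

    def approve(self, order_id: str) -> Order:
        order = self._orders[order_id]
        if order.status != "submitted":
            raise ValueError(f"cannot approve order in state {order.status!r}")
        updated = replace(order, status="approved")  # state transition
        self._orders[order_id] = updated
        return updated


def approve_all(service: OrderService, order_ids: list[str]) -> list[Order]:
    """Reusable client code written against the interface only."""
    return [service.approve(order_id) for order_id in order_ids]
```

Because `approve_all` depends only on the interface, the in-memory implementation could be swapped for a remote one without changing the caller.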
SOA
Information Systems
Systems Design
Computing & Communications
Information Technology
unknowable.
Testing an application (3 million lines) requires >10^15 tests.
Probability of correct data entry for a supply item is <65%.
There are >100 formats that identify a person in DoD.
Output per office worker: >30 e-messages/day.
[Figure: architecture levels separated by security barriers, ranging from "variety here" at the local level to "stability here" at the global level. Levels: Local, Application, Business, Process, Enterprise, Global. Labels in the upper levels: Applications, Business, Process, Service A, Service B, OSD]
Process level: Functional Process A, Functional Process B, Functional Process C, Functional Process D
Enterprise level: Corporate Policy, Corporate Standards, Reference Models, Data Management and Tools, Integrated Systems Configuration Data Base, Shared Computing and Telecommunications
Global level: Industry Standards, Commercial Off-the-Shelf Products and Services
[Figure: architecture levels ranging from short-term adaptability and technology simplicity to long-term stability and technology complexity. Levels: Personal, Local, Application, Business, Process, Enterprise, Global. Labels in the upper levels: Business A, Business B, Infrastructure Support]
Process level: Functional Process A, Functional Process B, Functional Process C, Functional Process D
Enterprise level: Corporate Policy, Corporate Standards, Reference Models, Data Management and Tools, Integrated Systems Configuration Data Base, Shared Computing and Telecommunications, Security and Survivability
Global level: Industry Standards, Commercial Off-the-Shelf Products and Services
Infrastructure Services (Enterprise Information)
Data Services
Security Services
Discovery Services
Semantic Services
definitions.
Data stewardship defines data custodians.
Zero defects at point of entry.
De-conflict data at the source, not at higher levels.
Data aggregations from source data, not from reports.
Data Concepts
Data Element Definition
Text associated with a unique data element within a data
the Enterprise.
Information is a strategic asset.
Data and applications cannot be coupled to each other.
Interfaces must be independent of implementation.
Data must be visible outside of the applications.
Semantics and syntax are defined by a community of interest.
Data must be understandable and trusted.
Organization of Security Services
Security Services
Transfer Services
Protection Services
Certification Services
Systems Assurance
Authentication Services
attacks.
Manage measures required to minimize the network's vulnerability.
Services
Identify and confirm a user's authorization to access the network.
Computing Facilities
Resource Planning
Control & Quality
Configuration Services
Financial Management
Computing Services
Provide Adaptable Hosting Environments
Global facilities for hosting to the edge.
Virtual environments for data centers.
information sharing.
device.
Organization of Communication Services
Communication Services
Spectrum Management
Interoperability Services
Connectivity Arrangements
Continuity of Services
Resource Management
Network Services Implementation
From point-to-point communications (push
communications) to network-centric
processes (pull communications).
Data posted to shared space for retrieval.
Network controls assure data
synchronization and access security.
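The shift from push to pull communication can be sketched as a shared space that producers post to and authorized consumers read from on demand. This is a simplified, single-process illustration under stated assumptions: a lock stands in for the synchronization controls, a per-topic reader list stands in for access security, and the class and method names are hypothetical.

```python
import threading


class SharedSpace:
    """Data is posted once; readers pull it when they need it."""

    def __init__(self, allowed_readers):
        # allowed_readers: topic -> set of reader names permitted to pull
        self._allowed = {t: set(r) for t, r in allowed_readers.items()}
        self._items = {}
        self._lock = threading.Lock()  # keeps posts and reads consistent

    def post(self, topic, data):
        """Producer side: publish data without knowing who will read it."""
        with self._lock:
            self._items.setdefault(topic, []).append(data)

    def retrieve(self, topic, reader):
        """Consumer side: pull everything posted so far, if authorized."""
        if reader not in self._allowed.get(topic, set()):
            raise PermissionError(f"{reader} may not read {topic}")
        with self._lock:
            return list(self._items.get(topic, []))


space = SharedSpace({"logistics": {"planner"}})
space.post("logistics", {"item": "fuel", "quantity": 40})
print(space.retrieve("logistics", "planner"))
```

In a point-to-point (push) design the sender would need to know every recipient; here the producer posts once and the access check decides who may retrieve the data.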
Communication Services
Provide Information Transport
Transport information, data and services
anywhere.
Ensures transport between end-user devices
and servers.
Expand the infrastructure for on-demand
capacity.
Component Repository
Code Binding Services
Maintenance Management
Portals
Experimental Services
Example of Development Tools
Business Process Execution Language (BPEL) Approach
- Externalized decision
- Modeled by business
- Maintained by policy
- Managed by IT
- Automatic logs and
Other presentations
http://www.slideshare.net/drgst/presentations
Thank You
Questions and Comments?