Richa Sinha
Electrical Power Grid Analogy
Electrical power grid:
Users (or electrical appliances) get access to electricity through wall sockets with no care or consideration for where or how the electricity is actually generated.
The power grid links together power plants of many different kinds.
The Grid:
Users (or client applications) gain access to computing resources (processors, storage, data, applications, and so on) as needed with little or no knowledge of where those resources are located or what the underlying technologies, hardware, operating system, and so on are.
"The Grid" links together computing resources (PCs, workstations, servers, storage elements) and provides the mechanism needed to access them.
Grid Computing Introduced
The idea of the grid was introduced by Ian Foster, Carl Kesselman and Steve
Tuecke in the 1990s
Subset of distributed computing
Internet=network of communication
Grid computing=network of computation
History
Early period, 1980-1990: concept of parallel computing using PVM, MPI, etc.
Dynamic linking of resources
Ensembling
For execution of large-scale, resource-intensive, distributed applications
Grid Computing Defined as
It is a method of harnessing the power of many computers in a
network to solve problems that require a large number of processing
cycles and involve huge amounts of data
Global Grid Forum (1990)
OGSA (Open Grid Services Architecture), formed by this forum,
defines the services of a grid
System Management and automation
Workload/Performance
Security
Availability/Service Management
Logical Resource management
Clustering Service
Connecting Issues
Physical Storage Management
Why Is Grid Computing Needed?
Core networking technology now accelerates at a much faster rate
than advances in microprocessor speeds
Exploiting underutilized resources
Parallel CPU capacity
Virtual resources and virtual organizations for collaboration
Access to additional resources
Resource Balancing
Reliability
Management
Types of Grid
Access Storage
Data Grid and Computational Grid
Data grid:
A data grid is a grid computing system that deals with data and the controlled sharing and management of large amounts of distributed data
Mainly used in broad areas of data mining
Main operations: access, transfer and modify massive datasets stored in distributed storage
Computational grid:
A computational grid is a hardware and software infrastructure that provides dependable, consistent, pervasive, and inexpensive access to high-end computational capabilities
Mainly used in all broad areas
Main operations: cluster of clusters, CPU scavenging, providing computation power, on-demand resources
Application of Grid
Life science: bioinformatics, genomics, neuroscience
Engineering: NASA, IPG
Data oriented: GEODISE aircraft engineering design grid,
DAME grid to manage data from aircraft engine sensors
Physical science: particle physics data grid, INFN (Italian
National Institute for Research in Nuclear and Subnuclear Physics)
Astronomical Data Grid
Grid Topologies
Intragrid
Local grid within an organization
Trust based on personal contracts
Extragrid
Resources of a consortium of organizations
connected through a (Virtual) Private Network
Trust based on Business to Business contracts
Intergrid
Global sharing of resources through the internet
Trust based on certification
Types of Resources
Computation
Storage
Communications
Software and licenses
Special equipment, capacities, architectures, and policies
A Job Scheduler on Grid
Overview: Grid Computing Environment
An integrated set of tools that extends the user's computing
environment in order to provide access to Grid services
Parts of GCE
Client
Portal
Services
Grid
Commodity
[Figure: Grid Computing Environment, showing an application, a database, a Grid Resource Broker and resources R1 to RN]
Components
Security (user-level and core aspects)
Secure communications (SSL)
Distributed security infrastructure
Managing user credentials when selecting appropriate resources
Data Management
Transferring data throughout the grid and to users
GridFTP
Deals with high-performance, security and reliability
Information Services
Information DB about resources
Availability, capabilities,
Resource Management
Resource discovery, inventories, provisioning, monitoring, fault isolation,
autonomic capabilities
Grid Application Needs
Application partitioning for parallel processing
Discovery and scheduling of tasks and workflow
Data communication when and where needed
Provisioning and distributing the application
Result management
Autonomic features such as self-configuration, self-optimization,
self-recovery and self-management
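The first and fifth needs in this list (partitioning for parallel processing, result management) can be illustrated with a minimal sketch. It assumes local threads stand in for grid nodes, and the names partition and run_partitioned are hypothetical helpers, not part of any grid toolkit.

```python
from concurrent.futures import ThreadPoolExecutor

def partition(data, parts):
    """Split data into at most `parts` contiguous chunks of near-equal size."""
    size, extra = divmod(len(data), parts)
    chunks, start = [], 0
    for i in range(parts):
        end = start + size + (1 if i < extra else 0)
        if end > start:
            chunks.append(data[start:end])
        start = end
    return chunks

def run_partitioned(data, task, parts=4):
    """Run `task` on every chunk concurrently; results keep chunk order."""
    chunks = partition(data, parts)
    # Assumption: worker threads play the role of remote grid resources.
    with ThreadPoolExecutor(max_workers=parts) as pool:
        return list(pool.map(task, chunks))

numbers = list(range(1, 101))
partials = run_partitioned(numbers, sum, parts=4)  # one partial sum per chunk
total = sum(partials)                              # result management: merge
print(partials, total)
```

On a real grid the chunks would be shipped to remote resources and the merge step would also have to cope with failed or late partial results.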
GRID Application & Usage Pattern (Resource
Management)
Schedulers
Resource Broker
Load Balancing
Distribution of Workload among resources in GCE
Partitioning of jobs, identifying the resources, queuing of the jobs
Grid Portals
Provide Uniform Access to GRID resources
Resource authentication, remote resource access, scheduling capabilities, monitoring status
information
Integrated Solutions
Combination of existing advanced middleware & application functionalities to provide more coherent &
high performance result across GCE
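The load balancing pattern described above (distributing workload among resources) could look roughly like the sketch below. It is one possible greedy policy, always handing the next job to the least-loaded resource. The job costs and resource names are invented for illustration.

```python
import heapq

def balance(jobs, resources):
    """Assign each (name, cost) job to the currently least-loaded resource."""
    heap = [(0, r) for r in resources]      # (current load, resource name)
    heapq.heapify(heap)
    assignment = {r: [] for r in resources}
    # Placing the largest jobs first tends to even out the final loads.
    for name, cost in sorted(jobs, key=lambda j: -j[1]):
        load, res = heapq.heappop(heap)
        assignment[res].append(name)
        heapq.heappush(heap, (load + cost, res))
    loads = {res: load for load, res in heap}
    return assignment, loads

# Hypothetical jobs with estimated costs (for example, CPU hours).
jobs = [("j1", 5), ("j2", 3), ("j3", 8), ("j4", 2), ("j5", 4)]
assignment, loads = balance(jobs, ["R1", "R2"])
print(assignment)
print(loads)
```

A production broker would also weigh resource speed, data locality and queue wait times. This sketch balances estimated cost only.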
Schedulers
Schedulers are used in
Monitoring Job
Second Generation
Third Generation
Organizations developing grid standards
Global Grid Forum (GGF)
OASIS (Organization for the Advancement of Structured
Information Standards)
W3C (World Wide Web Consortium) (Web services, XML,
Semantic Web)
IETF (Internet Engineering Task Force)
DMTF (Distributed Management Task Force)
Global Grid Forum (GGF) Goal
Create grid specifications, architecture documents, and best
practice guidelines
Handle intellectual property policies
Provide a forum for information exchange and collaboration
Improve collaboration among the people involved with grid
research, grid framework builders, grid deployment, and grid users
Create best practice guidelines from the experience of the
technologies
Global Grid Forum (GGF) areas
Application and programming environments
Architecture
Data
Information systems and performance
Peer-to-peer: Desktop grids
Scheduling and resource management
Security
Evolution of Grid Computing Technologies
Evolution Of Grid Computing(1990)
FAFNER (Factoring via Network-Enabled Recursion)
This was set up to factor RSA130 using the Number Field Sieve (NFS) with the computational
resources of web servers. NFS was implemented in such a way that it does not require any
communication between nodes, and is so efficient that it can run on a workstation
with 4 MB of memory. Contributors invoke CGI scripts on the server side.
I-WAY
This was a project to integrate high-performance computing devices with existing
high-bandwidth networks. It used ATM (Asynchronous Transfer Mode), supporting both TCP
and direct ATM.
It used I-POP servers acting as gateways to I-WAY. I-POP servers were UNIX
workstations with a standard software environment.
Each site participating in I-WAY ran an I-POP server. The I-POP server provided uniform
access, authentication, resource reservation, process creation and communication
functions.
Evolution of Grid Computing: Second
Generation(1998)
Requirements for the data and computation infrastructure
Distributed object systems
Grid resource brokers and schedulers
Grid portals
Integrated systems
Peer-to-Peer computing
Second Generation Core Technologies are
FAFNER was the forerunner of SETI@home and Distributed.NET
I-WAY was the forerunner of Globus and Legion
Globus
GSI3 = GSI + Alignment with WS-Security
MDS (Monitoring & Discovery Service) combines data discovery mechanism with
Lightweight Directory Access Protocol (LDAP)
Globus
A collaboration of Argonne National Laboratory's Mathematics and
Computer Science Division, the University of Southern California's
Information Sciences Institute, and the University of Chicago's
Distributed Systems Laboratory.
A project to develop the underlying technologies needed for the
construction of computational grids.
Focuses on execution environments for integrating widely distributed
computational platforms, data resources, displays,
special instruments and so forth.
The Globus Toolkit
The Globus Resource Allocation Manager (GRAM)
Creates, monitors, and manages services.
NMI, Grid Physics Network (GriPhyN), International Virtual Data Grid Laboratory (iVDGL),
TeraGrid, etc.
Condor
Original goal: high-throughput computing
Harvest wasted CPU power from other machines
Can also be used on a dedicated cluster
Condor-G: Condor interface to Globus resources
Condor
Provides many features of batch systems:
job queuing
scheduling policy
priority scheme
resource monitoring
resource management
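Two of the batch-system features listed above, job queuing and a priority scheme, can be sketched as follows. This is not Condor's implementation or API. JobQueue and its methods are hypothetical names, and the rule that lower numbers run first is an assumption.

```python
import heapq
import itertools

class JobQueue:
    """Minimal priority job queue: a lower number means higher priority."""

    def __init__(self):
        self._heap = []
        self._order = itertools.count()  # tie-breaker keeps FIFO order

    def submit(self, name, priority=10):
        heapq.heappush(self._heap, (priority, next(self._order), name))

    def next_job(self):
        """Return the next job name, or None if the queue is empty."""
        if not self._heap:
            return None
        return heapq.heappop(self._heap)[2]

    def __len__(self):
        return len(self._heap)

queue = JobQueue()
queue.submit("render", priority=5)
queue.submit("backup", priority=20)
queue.submit("simulate", priority=5)
order = [queue.next_job() for _ in range(len(queue))]
print(order)
```

Jobs with equal priority leave the queue in submission order, which is one simple scheduling policy among many.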
Scheduling Algorithm
Cost Optimization
Time Optimization
Cost-time optimization
Conservative time strategy
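The difference between cost optimization and time optimization can be shown with a simplified sketch. It assumes identical independent jobs and resources described by a (time per job, cost per job) pair. The functions and numbers are illustrative assumptions, not the published algorithms behind these strategy names.

```python
def cost_optimize(n_jobs, resources, deadline):
    """Fill the cheapest resources first, each up to what fits in the deadline."""
    plan, remaining = {}, n_jobs
    by_cost = sorted(resources.items(), key=lambda kv: kv[1][1])
    for name, (time_per_job, cost_per_job) in by_cost:
        take = min(int(deadline // time_per_job), remaining)
        if take:
            plan[name] = take
            remaining -= take
    if remaining:
        raise ValueError("deadline cannot be met with these resources")
    return plan

def time_optimize(n_jobs, resources, budget):
    """Give each job to the resource that finishes it earliest, within budget."""
    plan = {name: 0 for name in resources}
    spent = 0
    for _job in range(n_jobs):
        affordable = [n for n, (t, c) in resources.items() if spent + c <= budget]
        if not affordable:
            raise ValueError("budget cannot cover all jobs")
        best = min(affordable, key=lambda n: (plan[n] + 1) * resources[n][0])
        plan[best] += 1
        spent += resources[best][1]
    return plan

def total_cost(plan, resources):
    return sum(count * resources[name][1] for name, count in plan.items())

def makespan(plan, resources):
    return max(count * resources[name][0] for name, count in plan.items() if count)

# Hypothetical resources: name -> (time per job, cost per job).
resources = {"fast": (1, 5), "slow": (4, 1)}
cheap_plan = cost_optimize(6, resources, deadline=8)
quick_plan = time_optimize(6, resources, budget=30)
print(cheap_plan, total_cost(cheap_plan, resources), makespan(cheap_plan, resources))
print(quick_plan, total_cost(quick_plan, resources), makespan(quick_plan, resources))
```

With these numbers the cost-optimized plan is cheaper but finishes exactly at the deadline, while the time-optimized plan finishes sooner and spends more of the budget.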
NSF Middleware Initiative (NMI)
Grid Research Integration Deployment and Support
(GRIDS) Center
Globus toolkit, Condor-G, GSI-OpenSSH, Network Weather service, Grid
Packaging tools, GridConfig, MPICH-G2, MyProxy etc
CORBA
a standard tool in which a metalanguage interface is used to manage interoperability
among objects
CoG Kit
Commodity Grids
Legion
provides objects with a globally unique (and opaque) identifier
Authentication:
Prove your identity
Stops masquerading imposters
Examples:
Passport
Username and password
Privacy
Medical Record
Patient no: 3456
Integrity
Run myHome/rm -f *
Run myHome/whoami
Message Protection
Sending message securely
Integrity
Detect whether message has been tampered
Privacy
No one other than sender and receiver should be able to read message
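The integrity half of message protection can be sketched with a keyed hash (HMAC) from the Python standard library. The shared key and the messages are made-up examples, and this sketch does not provide privacy: the message itself is still sent in the clear.

```python
import hashlib
import hmac

def sign(message: bytes, key: bytes) -> str:
    """Compute an HMAC-SHA256 tag that travels along with the message."""
    return hmac.new(key, message, hashlib.sha256).hexdigest()

def verify(message: bytes, tag: str, key: bytes) -> bool:
    """Recompute the tag and compare in constant time."""
    return hmac.compare_digest(sign(message, key), tag)

# Assumption: sender and receiver already share this secret key.
shared_key = b"demo-key-known-to-sender-and-receiver"
original = b"Run myHome/whoami"
tag = sign(original, shared_key)

tampered = b"Run myHome/rm -f *"
print(verify(original, tag, shared_key))   # untouched message
print(verify(tampered, tag, shared_key))   # message altered in transit
```

Anyone who changes the message without knowing the key cannot produce a matching tag, so the receiver detects the tampering.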
Authorization establishes rights to do actions
What can a particular identity do?
Examples:
Are you allowed to read this file?
Are you allowed to run a job on this machine?
Unix read/write/execute permissions
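The Unix permission example can be made concrete with the standard stat module. The helper is_allowed is a hypothetical name. It only inspects the mode bits and ignores ownership lookup, ACLs and the superuser.

```python
import stat

PERMISSION_BITS = {
    "owner": {"read": stat.S_IRUSR, "write": stat.S_IWUSR, "execute": stat.S_IXUSR},
    "group": {"read": stat.S_IRGRP, "write": stat.S_IWGRP, "execute": stat.S_IXGRP},
    "other": {"read": stat.S_IROTH, "write": stat.S_IWOTH, "execute": stat.S_IXOTH},
}

def is_allowed(mode: int, who: str, action: str) -> bool:
    """Answer 'may this class of identity perform this action?' from mode bits."""
    return bool(mode & PERMISSION_BITS[who][action])

mode = 0o640  # example file mode
listing = stat.filemode(stat.S_IFREG | mode)  # same text that ls -l shows
print(listing)
print(is_allowed(mode, "owner", "write"))
print(is_allowed(mode, "group", "read"))
print(is_allowed(mode, "other", "read"))
```

Authentication decides who the caller is. A check like this one then decides what that identity may do.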
Log on once
Type password once
Decryption
Digital Certificate
Public Key Certificate
Certificate Authority
Cryptographic keys, the building blocks of cryptography, are collections of bits (for example, 0101001110)
The more bits a key has, the stronger the key
Public key cryptography uses a pair of keys: a public key and a private key
Encryption takes data and a key, feeds them into a
function and gets encrypted data out
Encrypted data is, in principle, unreadable unless decrypted
The encryption function can use a symmetric key or an asymmetric key
Decryption feeds encrypted data and a key into a
function and gets the original data back
Encryption and decryption functions are linked
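How encryption and decryption functions are linked can be seen in a toy symmetric-key sketch. It derives a keystream from the key with SHA-256 and XORs it with the data. This construction is for illustration only and is not a secure cipher. The function names and key values are assumptions.

```python
import hashlib

def keystream(key: bytes, length: int) -> bytes:
    """Derive `length` pseudo-random bytes from the key (toy construction)."""
    out, counter = b"", 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]

def encrypt(data: bytes, key: bytes) -> bytes:
    """XOR the data with the keystream derived from the key."""
    return bytes(d ^ k for d, k in zip(data, keystream(key, len(data))))

def decrypt(encrypted: bytes, key: bytes) -> bytes:
    """XOR is its own inverse, so decryption repeats the same operation."""
    return encrypt(encrypted, key)

key = b"shared secret key"
message = b"Patient no: 3456"
ciphertext = encrypt(message, key)
print(ciphertext.hex())
print(decrypt(ciphertext, key))
print(decrypt(ciphertext, b"wrong key") == message)
```

The same key is used in both directions, which is what makes this a symmetric scheme. An asymmetric scheme would encrypt with one key of a pair and decrypt with the other.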
Digital Signatures
[Figure: a digital certificate (Name, Issuer, Public Key, Validity, Signature) compared with a State of Illinois driver's license (John Doe, 755 E. Woodlawn, Urbana IL 61801, BD 08-06-65, Male, 60, 200lbs, GRN eyes, State of Illinois seal, valid till 01-02-2008)]
Certification Authorities (CAs) sign certificates
CAs are a small set of trusted entities
CA certificates must be distributed securely
[Figure: a certificate with the fields Name, Validity, Public Key and Issuer?]
Each CA has a Certificate Policy (CP)
The Certificate Policy states:
To whom the CA will issue certificates
How the CA identifies people to whom it will issue certificates
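What it means for a CA to sign a certificate can be sketched with textbook RSA on a deliberately tiny key. The numbers (n = 61 * 53 = 3233, e = 17, d = 2753) are a classroom-sized key that offers no security, and the certificate fields and function names are assumptions for illustration. Real certificates use X.509 encoding and far larger keys.

```python
import hashlib

# Toy CA key pair: public (N, E), private D. Far too small for real use.
N, E, D = 3233, 17, 2753

def digest(cert: dict) -> int:
    """Hash the certificate fields and reduce the hash to fit the toy modulus."""
    text = "|".join(f"{field}={cert[field]}" for field in sorted(cert))
    return int.from_bytes(hashlib.sha256(text.encode()).digest(), "big") % N

def ca_sign(cert: dict) -> int:
    """The CA signs with its private key."""
    return pow(digest(cert), D, N)

def verify(cert: dict, signature: int) -> bool:
    """Anyone holding the CA's public key can check the signature."""
    return pow(signature, E, N) == digest(cert)

cert = {
    "Name": "John Doe",
    "Issuer": "Example CA",          # hypothetical issuer
    "Public Key": "0101001110",
    "Validity": "01-02-2008",
}
signature = ca_sign(cert)
print(verify(cert, signature))                 # genuine signature
print(verify(cert, (signature + 1) % N))       # altered signature
```

Verification needs only the CA's public key, which is why CA certificates must be distributed securely: whoever controls that key controls what verifiers will accept.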
Standard terminology
Share past experience / best practices
Architecture
[Figure: Producer, Consumer and Directory Service, with arrows labeled 1) Event publication information, 2) Lookup, 3) Event producer and consumer information, 4) Query or Subscribe, 5) Event data; also labeled Event schema. Each arrow = API & wire protocol & data format, plus security!]
A Directory Service, which supports the publication and discovery of producers, consumers and monitoring data (events)
Producers, which are the sensors that produce performance data
Consumers, which access and use performance data
Consumer
Any program that receives monitoring data (events) from a producer can be
a consumer
Steps performed by Consumer
1. Locate events
2. Locate producers
3. Initiate a query
4. Initiate a subscription
5. Initiate an unsubscribe
6. Register
7. Accept query
8. Accept subscribe
9. Accept unsubscribe
A Directory Service
Provides information about producers or consumers that
accept requests
Main Function Supported
Authorize a search
Authorize a modification
Add
Update
Remove
Search
Producers
A software component that sends monitoring data (events) to a
consumer
Producers Steps
1. Locate event
2. Locate consumer
3. Register
4. Accept query
5. Accept subscribe
6. Accept unsubscribe
7. Initiate query
8. Initiate subscribe
9. Initiate unsubscribe
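The interplay of producer, consumer and directory service can be sketched in a few classes. This is an in-process illustration under stated assumptions: all class and method names are hypothetical, and only registration, lookup, query, subscribe and unsubscribe are covered. There is no wire protocol or security.

```python
class DirectoryService:
    """Keeps track of which producers publish which event type."""

    def __init__(self):
        self._producers = {}

    def register(self, producer, event_type):
        self._producers.setdefault(event_type, []).append(producer)

    def lookup(self, event_type):
        return list(self._producers.get(event_type, []))

class Producer:
    """Sends monitoring data (events) to consumers."""

    def __init__(self, name, directory, event_type):
        self.name, self.event_type = name, event_type
        self._latest, self._subscribers = None, []
        directory.register(self, event_type)      # register with the directory

    def accept_query(self):
        return self._latest

    def accept_subscribe(self, callback):
        self._subscribers.append(callback)

    def accept_unsubscribe(self, callback):
        self._subscribers.remove(callback)

    def publish(self, value):
        self._latest = (self.name, self.event_type, value)
        for callback in list(self._subscribers):
            callback(self._latest)

class Consumer:
    """Locates producers through the directory, then queries or subscribes."""

    def __init__(self, directory):
        self.directory, self.received = directory, []
        self._on_event = self.received.append     # one stable callback object

    def query(self, event_type):
        return [p.accept_query() for p in self.directory.lookup(event_type)]

    def subscribe(self, event_type):
        for p in self.directory.lookup(event_type):
            p.accept_subscribe(self._on_event)

    def unsubscribe(self, event_type):
        for p in self.directory.lookup(event_type):
            p.accept_unsubscribe(self._on_event)

directory = DirectoryService()
cpu = Producer("node1", directory, "cpu_load")
consumer = Consumer(directory)
cpu.publish(0.42)
snapshot = consumer.query("cpu_load")   # one-off query
consumer.subscribe("cpu_load")
cpu.publish(0.87)                       # delivered to the subscriber
consumer.unsubscribe("cpu_load")
cpu.publish(0.10)                       # no longer delivered
print(snapshot, consumer.received)
```

The directory only brokers the introduction. Event data flows directly from producer to consumer.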
Monitoring Criteria
Scalable wide-area monitoring
Resource monitoring
Cross-API monitoring
Homogeneous data presentation
Information searching
Run-time extensibility
Filtering/fusing of data
Open and standard protocols
Security
Software availability and dependencies
Grid Monitoring Systems
Autopilot
Control and Observation in Distributed Environments (CODE)
GridICE
Grid Portals Information Repository (GPIR)
GridRM
Hawkeye
Java Agents for Monitoring and Management (JAMM)
MapCenter
Monitoring and Discovery Service (MDS3)
Mercury
Network Weather Service
visPerf
Grid Scheduling
The user submits a job to the middleware
The middleware passes the job to different schedulers
The schedulers schedule these jobs on the Grid
Scheduling Paradigms
Centralized Scheduling
Distributed Scheduling
Direct Communication
Scheduling with Job Pool
Hierarchical scheduling
Centralized Scheduling
Distributed Scheduling
Scheduling stages: Resource Discovery, Resource Selection, Schedule Generation, Job Execution
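The four scheduling stages (resource discovery, resource selection, schedule generation, job execution) can be chained as in the sketch below. The resource attributes, the selection rule (lowest load wins) and all function names are assumptions made for illustration.

```python
def discover(resources, job):
    """Resource discovery: keep online resources with enough CPUs for the job."""
    return [r for r in resources if r["online"] and r["cpus"] >= job["cpus"]]

def select(candidates):
    """Resource selection: prefer the candidate with the lowest current load."""
    return min(candidates, key=lambda r: r["load"]) if candidates else None

def generate_schedule(jobs, resources):
    """Schedule generation: map every job to a resource, or to None."""
    schedule = []
    for job in jobs:
        chosen = select(discover(resources, job))
        if chosen is not None:
            chosen["load"] += 1  # account for the job just placed
        schedule.append((job["name"], chosen["name"] if chosen else None))
    return schedule

def execute(schedule):
    """Job execution: here it only reports where each placed job would run."""
    return [f"{job} -> {res}" for job, res in schedule if res is not None]

# Hypothetical resources and jobs.
resources = [
    {"name": "R1", "cpus": 4, "load": 2, "online": True},
    {"name": "R2", "cpus": 16, "load": 0, "online": True},
    {"name": "R3", "cpus": 32, "load": 0, "online": False},
]
jobs = [
    {"name": "small", "cpus": 2},
    {"name": "big", "cpus": 8},
    {"name": "huge", "cpus": 64},
]
schedule = generate_schedule(jobs, resources)
print(schedule)
print(execute(schedule))
```

The job named huge stays unscheduled because no online resource meets its CPU requirement. A real scheduler would queue it or report the failure to the user.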
Resource Discovery