
High Performance Cluster

Computing:
Architectures and Systems

Book Editor: Rajkumar Buyya


Slides: Hai Jin and Raj Buyya

Internet and Cluster Computing Center


Cluster Computing at a Glance
Chapter 1: by M. Baker and R. Buyya
 Introduction
 Scalable Parallel Computer Architecture
 Towards Low Cost Parallel Computing
 Windows of Opportunity
 A Cluster Computer and its Architecture
 Clusters Classifications
 Commodity Components for Clusters
 Network Service/Communications SW
 Middleware and Single System Image
 Resource Management and Scheduling
 Programming Environments and Tools
 Cluster Applications
 Representative Cluster Systems
 Cluster of SMPs (CLUMPS)
 Summary and Conclusions
http://www.buyya.com/cluster/
Resource Hungry Applications
 Solving grand challenge applications using
computer modeling, simulation and analysis

Application categories: Aerospace, Internet & E-commerce, Life Sciences, CAD/CAM, Digital Biology, Military Applications
How to Run Applications Faster?
 There are 3 ways to improve performance:
 Work Harder
 Work Smarter
 Get Help
 Computer Analogy
 Work Harder: use faster hardware
 Work Smarter: use optimized algorithms and techniques to solve computational tasks
 Get Help: use multiple computers to solve a particular task
Scalable (Parallel) Computer
Architectures
 Taxonomy
 based on how processors, memory & interconnect
are laid out and how resources are managed
 Massively Parallel Processors (MPP)
 Symmetric Multiprocessors (SMP)
 Cache-Coherent Non-Uniform Memory
Access (CC-NUMA)
 Clusters
 Distributed Systems – Grids/P2P
Scalable Parallel Computer
Architectures
 MPP
 A large parallel processing system with a shared-
nothing architecture
 Consists of several hundred nodes with a high-speed
interconnection network/switch
 Each node consists of a main memory & one or more
processors
 Runs a separate copy of the OS

 SMP
 2-64 processors today
 Shared-everything architecture
 All processors share all the global resources available
 Single copy of the OS runs on these systems
Scalable Parallel Computer
Architectures
 CC-NUMA
 a scalable multiprocessor system having a cache-coherent
nonuniform memory access architecture
 every processor has a global view of all of the memory
 Clusters
 a collection of workstations / PCs that are interconnected by a
high-speed network
 work as an integrated collection of resources
 have a single system image spanning all its nodes
 Distributed systems
 considered conventional networks of independent computers
 have multiple system images as each node runs its own OS
 the individual machines could be combinations of MPPs, SMPs,
clusters, & individual computers
Rise and Fall of Computer
Architectures
 Vector Computers (VC) - proprietary system:
 provided the breakthrough needed for the emergence of
computational science, but they were only a partial answer.
 Massively Parallel Processors (MPP) -proprietary
systems:
 high cost and a low performance/price ratio.
 Symmetric Multiprocessors (SMP):
 suffers from limited scalability
 Distributed Systems:
 difficult to use and hard to extract parallel performance.
 Clusters - gaining popularity:
 High Performance Computing - Commodity Supercomputing
 High Availability Computing - Mission Critical Applications
Top500 Computer Architectures
(clusters' share is growing)
The Dead Supercomputer Society
http://www.paralogos.com/DeadSuper/
 ACRI
 Alliant
 American Supercomputer
 Ametek
 Applied Dynamics
 Astronautics
 BBN
 CDC
 Convex (C4600)
 Cray Computer
 Cray Research (SGI? Tera)
 Culler-Harris
 Culler Scientific
 Cydrome
 Dana/Ardent/Stellar
 Elxsi
 ETA Systems
 Evans & Sutherland Computer Division
 Floating Point Systems
 Galaxy YH-1
 Goodyear Aerospace MPP
 Gould NPL
 Guiltech
 Intel Scientific Computers
 Intl. Parallel Machines
 KSR
 MasPar
 Meiko
 Myrias
 Saxpy
 Scientific Computer Systems (SCS)
 Soviet Supercomputers
 Suprenum
 Thinking Machines
Vendors: specialised ones (e.g., TMC) disappeared; new ones emerged
Computer Food Chain: Causing the demise of specialized systems

• Demise of mainframes, supercomputers, & MPPs

Towards Clusters

The promise of supercomputing to the average PC user?

Technology Trends...

 Performance of PC/workstation components has almost
reached the performance of those used in supercomputers…
 Microprocessors (50% to 100% per year)
 Networks (Gigabit SANs)
 Operating Systems (Linux,...)
 Programming environments (MPI,…)
 Applications (.edu, .com, .org, .net, .shop, .bank)
 The rate of performance improvement of commodity
systems is much more rapid than that of specialized
systems
Towards Commodity Cluster
Computing
 Since the early 1990s, there has been an increasing
trend to move away from expensive and
specialized proprietary parallel supercomputers
towards clusters of computers (PCs, workstations)
 From specialized traditional supercomputing
platforms to cheaper, general purpose systems
consisting of loosely coupled components built up
from single or multiprocessor PCs or workstations
 Linking together two or more computers to jointly solve
computational problems
History: Clustering of
Computers for Collective Computing

[Timeline: 1960, 1980s, 1990, 1995+, 2000+; from early clustering to PDA clusters]


What is a Cluster?
 A cluster is a type of parallel and distributed processing
system, which consists of a collection of interconnected stand-
alone computers cooperatively working together as a single,
integrated computing resource.
 A node
 a single or multiprocessor system with memory, I/O facilities,
& OS
 A cluster:
 generally 2 or more computers (nodes) connected together
in a single cabinet, or physically separated & connected via a
LAN
 appears as a single system to users and applications

 provides a cost-effective way to gain features and benefits


Cluster Architecture
[Figure: layered cluster architecture. Sequential and parallel applications run on top of a parallel programming environment; Cluster Middleware provides the Single System Image and Availability Infrastructure; each PC/Workstation node runs Communications Software over its Network Interface Hardware; nodes are joined by the Cluster Interconnection Network/Switch]


So What’s So Different about
Clusters?
 Commodity Parts?
 Communications Packaging?
 Incremental Scalability?
 Independent Failure?
 Intelligent Network Interfaces?
 Complete System on every node
 virtual memory
 scheduler
 files
 …
 Nodes can be used individually or jointly...
Windows of Opportunities
 Parallel Processing
 Use multiple processors to build MPP/DSM-like systems for
parallel computing
 Network RAM
 Use memory associated with each workstation as aggregate
DRAM cache
 Software RAID
 Redundant Array of Inexpensive/Independent Disks
 Use the arrays of workstation disks to provide cheap, highly
available and scalable file storage
 Possible to provide parallel I/O support to applications
 Multipath Communication
 Use multiple networks for parallel data transfer between
nodes
Cluster Design Issues

• Enhanced Performance (performance @ low cost)


• Enhanced Availability (failure management)
• Single System Image (look-and-feel of one system)
• Size Scalability (physical & application)
• Fast Communication (networks & protocols)
• Load Balancing (CPU, Net, Memory, Disk)
• Security and Encryption (clusters of clusters)
• Distributed Environment (Social issues)
• Manageability (admin. and control)
• Programmability (simple API if required)
• Applicability (cluster-aware and non-aware app.)
Scalability Vs. Single System
Image

[Figure: trade-off between scalability and single system image]
Common Cluster Modes

 High Performance (dedicated)
 High Throughput (idle cycle harvesting)
 High Availability (fail-over)
 A Unified System – HP and HA within the same cluster
High Performance Cluster
(dedicated mode)
High Throughput Cluster (Idle
Resource Harvesting)

Shared pool of computing resources (processors, memory, disks) linked by an interconnect

High throughput: guarantee at least one workstation to many individuals (when active)
High performance: deliver a large % of the collective resources to a few individuals at any one time
High Availability Clusters
HA and HP in the same Cluster

• Best of both worlds (the world is heading towards
this configuration)
Cluster Components
Prominent Components of
Cluster Computers (I)
 Multiple High Performance
Computers
 PCs
 Workstations

 SMPs (CLUMPS)

 Distributed HPC Systems leading to

Grid Computing
System CPUs
 Processors
 Intel x86-class Processors
 Pentium Pro and Pentium Xeon
 AMD x86, Cyrix x86, etc.
 Digital Alpha – phased out after Compaq (and later HP) acquired Digital
 Alpha 21364 processor integrates processing, memory
controller, network interface into a single chip
 IBM PowerPC
 Sun SPARC
 Scalable Processor Architecture
 SGI MIPS
 Microprocessor without Interlocked Pipeline
Stages
System Disk
 Disk and I/O
 Overall improvement in disk access time
has been less than 10% per year
 Amdahl’s law
 Speed-up obtained from faster processors is
limited by the slowest system component (see the
formula after this slide)
 Parallel I/O
 Carry out I/O operations in parallel,
supported by parallel file system based on
hardware or software RAID
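
A worked form of Amdahl's law (the formula itself is not on the slide): if a fraction $f$ of the execution time is sped up by a factor $s$, the overall speed-up is

$$ S = \frac{1}{(1 - f) + f/s} $$

For example, if disk I/O that cannot be accelerated accounts for 10% of the run time, even infinitely fast processors applied to the remaining 90% give at most $S = 1/0.1 = 10$.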
Commodity Components for
Clusters (II): Operating Systems
 Operating Systems
 2 fundamental services for users
 make the computer hardware easier to use
 create a virtual machine that differs markedly from the real
machine
 share hardware resources among users
 Processor - multitasking
 The new concept in OS services
 support multiple threads of control in a process itself
 parallelism within a process
 multithreading
 the POSIX threads interface is a standard programming environment
 Trend
 Modularity – MS Windows, IBM OS/2
 Microkernel – provides only essential OS services
 high level abstraction of OS portability
Prominent Components of
Cluster Computers
 State of the art Operating Systems
 Linux (MOSIX, Beowulf, and many more)
 Windows HPC (HPC2N – Umea University)
 SUN Solaris (Berkeley NOW, C-DAC PARAM)
 IBM AIX (IBM SP2)
 HP UX (Illinois - PANDA)
 Mach (Microkernel based OS) (CMU)
 Cluster Operating Systems (Solaris MC, SCO
Unixware, MOSIX (an academic project))
 OS gluing layers (Berkeley Glunix)
Operating Systems used in Top500 Powerful Computers

[Chart: operating systems used in the Top500; AIX among those shown]
Prominent Components of
Cluster Computers (III)
 High Performance Networks/Switches
 Ethernet (10Mbps)
 Fast Ethernet (100Mbps)
 Gigabit Ethernet (1Gbps)
 SCI (Scalable Coherent Interface; ~12µsec MPI latency)
 ATM (Asynchronous Transfer Mode)
 Myrinet (1.28Gbps)
 QsNet (Quadrics Supercomputing World; ~5µsec latency for MPI messages)
 Digital Memory Channel
 FDDI (Fiber Distributed Data Interface)
 InfiniBand
Prominent Components of
Cluster Computers (IV)
 Fast Communication Protocols and
Services (User Level
Communication):
 Active Messages (Berkeley)
 Fast Messages (Illinois)
 U-net (Cornell)
 XTP (Virginia)
 Virtual Interface Architecture (VIA)

Prominent Components of
Cluster Computers (V)
 Cluster Middleware
 Single System Image (SSI)
 System Availability (SA) Infrastructure
 Hardware
 DEC Memory Channel, DSM (Alewife, DASH), SMP Techniques
 Operating System Kernel/Gluing Layers
 Solaris MC, Unixware, GLUnix, MOSIX
 Applications and Subsystems
 Applications (system management and electronic forms)
 Runtime systems (software DSM, PFS etc.)
 Resource management and scheduling (RMS) software
 Oracle Grid Engine, Platform LSF (Load Sharing Facility), PBS (Portable
Batch System), Microsoft Windows Compute Cluster Server (CCS)
Advanced Network Services/
Communication SW
 Communication infrastructure supports protocols for
 Bulk-data transport
 Streaming data
 Group communications
 Communication services provide the cluster with important QoS
parameters
 Latency
 Bandwidth
 Reliability
 Fault-tolerance
 Network services are designed as a hierarchical stack of protocols
with a relatively low-level communication API, providing the means to
implement a wide range of communication methodologies
 RPC
 DSM
 Stream-based and message passing interface (e.g., MPI, PVM)
Prominent Components of
Cluster Computers (VI)
 Parallel Programming Environments and Tools
 Threads (PCs, SMPs, NOW..)
 POSIX Threads
 Java Threads
 MPI (Message Passing Interface)
 Linux, Windows, on many Supercomputers
 Parametric Programming
 Software DSMs (Shmem)
 Compilers
 C/C++/Java
 Parallel programming with C++ (MIT Press book)
 RAD (rapid application development) tools
 GUI based tools for PP modeling
 Debuggers
 Performance Analysis Tools
 Visualization Tools
Prominent Components of
Cluster Computers (VII)
 Applications
 Sequential
 Parallel / Distributed (cluster-aware apps)
 Grand Challenge applications
 Weather Forecasting
 Quantum Chemistry
 Molecular Biology Modeling
 Engineering Analysis (CAD/CAM)
 ……………….
 PDBs, web servers, data-mining
Key Operational Benefits of
Clustering
 High Performance
 Expandability and Scalability

 High Throughput

 High Availability
Clusters Classification (I)

 Application Target
 High Performance (HP) Clusters
 Grand Challenge Applications
 High Availability (HA) Clusters
 Mission Critical applications
Clusters Classification (II)

 Node Ownership
 Dedicated Clusters
 Non-dedicated clusters

 Adaptive parallel computing


 Communal multiprocessing
Clusters Classification (III)

 Node Hardware
 Clusters of PCs (CoPs)
 Piles of PCs (PoPs)
 Clusters of Workstations (COWs)
 Clusters of SMPs (CLUMPs)
Clusters Classification (IV)

 Node Operating System


 Linux Clusters (e.g., Beowulf)
 Solaris Clusters (e.g., Berkeley
NOW)
 AIX Clusters (e.g., IBM SP2)

 SCO/Compaq Clusters (Unixware)

 Digital VMS Clusters

 HP-UX clusters

 Windows HPC clusters


Clusters Classification (V)

 Node Configuration
 Homogeneous Clusters
 All nodes have similar architectures
and run the same OS
 Heterogeneous Clusters
 Nodes will have different architectures
and run different OSs
Clusters Classification (VI)
 Levels of Clustering
 Group Clusters (#nodes: 2-99)
 Nodes are connected by a SAN (System Area Network) such as Myrinet
 Departmental Clusters (#nodes: 10s to 100s)
 Organizational Clusters (#nodes: many 100s)
 National Metacomputers (WAN/Internet-
based)
 International Metacomputers (Internet-
based, #nodes: 1000s to many millions)
 Grid Computing
 Web-based Computing
 Peer-to-Peer Computing
Single System Image

[See SSI Slides of Next Lecture]
Cluster Programming
Levels of Parallelism
Code granularity, from coarse to fine (see the loop-level sketch after this slide):

 Large grain (task level): code item is a program (Task i-1, Task i, Task i+1), coordinated with PVM/MPI
 Medium grain (control level): code item is a function (func1(), func2(), func3()), run as threads
 Fine grain (data level): code item is a loop (a(0)=.., a(1)=.., a(2)=..; b(0)=.., b(1)=.., b(2)=..), parallelised by compilers
 Very fine grain (multiple issue): code item is an instruction (e.g., load, add x), exploited by hardware
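
As a concrete illustration of the fine-grain (data-level) row, here is a minimal sketch assuming an OpenMP-capable C compiler (OpenMP is listed later in this deck among the cluster-enabled threads options); the arrays a and b mirror the figure's loop:

/* Fine-grain (data-level) parallelism: independent loop iterations,
 * as in the a(0)=.., a(1)=.., a(2)=.. row of the figure, are split
 * across processors by the compiler/runtime.
 * Compile with: cc -fopenmp granularity.c */
#include <stdio.h>

#define N 1000000

int main(void) {
    static double a[N], b[N];

    #pragma omp parallel for   /* iterations distributed over threads */
    for (int i = 0; i < N; i++) {
        a[i] = 0.5 * i;      /* a(i) = .. */
        b[i] = a[i] + 1.0;   /* b(i) = .. */
    }

    printf("b[%d] = %f\n", N - 1, b[N - 1]);
    return 0;
}

Compiled without OpenMP support, the pragma is simply ignored and the loop runs sequentially, which is one reason compiler-driven data parallelism places so little burden on the programmer.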
Cluster Programming
Environments
 Shared Memory Based
 DSM (Distributed Shared Memory)
 Threads/OpenMP (enabled for clusters)
 Java threads (IBM cJVM)
 Aneka Threads
 Message Passing Based
 PVM (Parallel Virtual Machine)
 MPI (Message Passing Interface)
 Parametric Computations
 Nimrod-G, Gridbus, also in Aneka
 Automatic Parallelising Compilers
 Parallel Libraries & Computational Kernels
(e.g., NetSolve)
Programming Environments and
Tools (I)
 Threads (PCs, SMPs, NOW..)
 In multiprocessor systems
 Used to simultaneously utilize all the available

processors
 In uniprocessor systems
 Used to utilize the system resources effectively

 Multithreaded applications offer quicker response
to user input and run faster
 Potentially portable, as there exists an IEEE
standard for the POSIX threads interface (pthreads);
a minimal sketch follows this slide
 Extensively used in developing both application and
system software
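
A minimal sketch of the pthreads interface mentioned above (the worker function and thread count are invented for illustration):

/* Minimal POSIX threads (pthreads) example: spawn 4 workers and
 * join them. Compile with: cc threads.c -lpthread */
#include <pthread.h>
#include <stdio.h>

#define NUM_THREADS 4

/* Each worker receives its index through the void* argument. */
static void *worker(void *arg) {
    long id = (long)arg;
    printf("thread %ld running\n", id);
    return NULL;
}

int main(void) {
    pthread_t threads[NUM_THREADS];

    for (long i = 0; i < NUM_THREADS; i++)
        pthread_create(&threads[i], NULL, worker, (void *)i);

    for (int i = 0; i < NUM_THREADS; i++)
        pthread_join(threads[i], NULL);  /* wait for each worker */

    return 0;
}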
Programming Environments and
Tools (II)
 Message Passing Systems (MPI and PVM)
 Allow efficient parallel programs to be written for
distributed memory systems
 2 most popular high-level message-passing systems – PVM
& MPI
 PVM
 both an environment & a message-passing library
 MPI
 a message passing specification, designed to be standard
for distributed memory parallel computing using explicit
message passing
 attempt to establish a practical, portable, efficient, &
flexible standard for message passing
 generally, application developers prefer MPI, as it has become
the de facto standard for message passing (a minimal example follows this slide)
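
A minimal MPI sketch of such explicit message passing (illustrative, not from the original slides): rank 1 sends one integer to rank 0.

/* Minimal MPI example: rank 1 sends one integer to rank 0.
 * Compile: mpicc ping.c    Run: mpirun -np 2 ./a.out */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    int rank, value = 0;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 1) {
        value = 42;
        MPI_Send(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD);
    } else if (rank == 0) {
        MPI_Recv(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
        printf("rank 0 received %d\n", value);
    }

    MPI_Finalize();
    return 0;
}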
Programming Environments and
Tools (III)
 Distributed Shared Memory (DSM) Systems
 Message passing
 the most efficient and most widely used programming paradigm on
distributed memory systems
 complex & difficult to program
 Shared memory systems
 offer a simple and general programming model
 but suffer from poor scalability
 DSM on distributed memory system
 alternative cost-effective solution
 Software DSM
 Usually built as a separate layer on top of the comm interface
 Take full advantage of the application characteristics: virtual pages, objects, &
language types are units of sharing
 TreadMarks, Linda
 Hardware DSM
 Better performance, no burden on user & SW layers, fine granularity of sharing,
extensions of the cache coherence scheme, & increased HW complexity
 DASH, Merlin
Programming Environments and
Tools (IV)
 Parallel Debuggers and Profilers
 Debuggers
 Very limited
 HPDF (High Performance Debugging Forum) as Parallel Tools
Consortium project in 1996
 Developed a HPD version specification, which defines the
functionality, semantics, and syntax for a command-line
parallel debugger
 TotalView
 A commercial product from Dolphin Interconnect Solutions
 The only widely available GUI-based parallel debugger
that supports multiple HPC platforms
 Only used in homogeneous environments, where each process
of the parallel application being debugged must be running
under the same version of the OS
Functionality of Parallel
Debugger
 Managing multiple processes and multiple threads
within a process
 Displaying each process in its own window
 Displaying source code, stack trace, and stack
frame for one or more processes
 Diving into objects, subroutines, and functions
 Setting both source-level and machine-level
breakpoints
 Sharing breakpoints between groups of processes
 Defining watch and evaluation points
 Displaying arrays and their slices
 Manipulating code variables and constants
Programming Environments and
Tools (V)
 Performance Analysis Tools
 Help a programmer to understand the performance
characteristics of an application
 Analyze & locate parts of an application that exhibit poor
performance and create program bottlenecks
 Major components (see the instrumentation sketch after this slide)
 A means of inserting instrumentation calls to the performance
monitoring routines into the user’s applications
 A run-time performance library that consists of a set of monitoring
routines
 A set of tools for processing and displaying the performance data
 Issue with performance monitoring tools
 Intrusiveness of the tracing calls and their impact on the
application performance
 Instrumentation affects the performance characteristics of the
parallel application and thus provides a false view of its
performance behavior
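
A hedged sketch of the first component, inserting instrumentation calls by hand using MPI's standard wall-clock timer; the TIMED_REGION macro and compute_step function are invented for illustration (real tools such as AIMS or MPE automate the insertion and write trace files instead of printing):

/* Hand-inserted instrumentation around a code region, using the
 * standard MPI wall-clock timer MPI_Wtime(). */
#include <mpi.h>
#include <stdio.h>

/* Hypothetical region-timing macro, invented for illustration. */
#define TIMED_REGION(label, stmt)                              \
    do {                                                       \
        double t0_ = MPI_Wtime();                              \
        stmt;                                                  \
        printf("%s: %.6f s\n", (label), MPI_Wtime() - t0_);    \
    } while (0)

static void compute_step(void) {
    /* placeholder for real application work */
}

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    TIMED_REGION("compute_step", compute_step());
    MPI_Finalize();
    return 0;
}

Even this lightweight timer illustrates the intrusiveness issue above: the timing and printing themselves perturb the behavior being measured.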
Performance Analysis
and Visualization Tools
Tool    | Supports                                                 | URL
AIMS    | Instrumentation, monitoring library, analysis            | http://science.nas.nasa.gov/Software/AIMS
MPE     | Logging library and snapshot performance visualization   | http://www.mcs.anl.gov/mpi/mpich
Pablo   | Monitoring library and analysis                          | http://www-pablo.cs.uiuc.edu/Projects/Pablo/
Paradyn | Dynamic instrumentation running analysis                 | http://www.cs.wisc.edu/paradyn
SvPablo | Integrated instrumentor, monitoring library and analysis | http://www-pablo.cs.uiuc.edu/Projects/Pablo/
Vampir  | Monitoring library, performance visualization            | http://www.pallas.de/pages/vampir.htm
Dimemas | Performance prediction for message passing programs      | http://www.pallas.com/pages/dimemas.htm
Paraver | Program visualization and analysis                       | http://www.cepba.upc.es/paraver
Programming Environments and
Tools (VI)
 Cluster Administration Tools
 Berkeley NOW
 Gathers & stores data in a relational DB
 Uses Java applets to allow users to monitor a system
 SMILE (Scalable Multicomputer Implementation
using Low-cost Equipment)
 Called K-CAP
 Consists of compute nodes, a management node, & a client
that can control and monitor the cluster
 K-CAP uses a Java applet to connect to the management node
through a predefined URL address in the cluster
 PARMON
 A comprehensive environment for monitoring large clusters
 Uses client-server techniques to provide transparent access
to all nodes to be monitored
 parmon-server & parmon-client
Cluster Applications
Cluster Applications
 Numerous Scientific & engineering Apps.
 Business Applications:
 E-commerce Applications (Amazon, eBay);

 Database Applications (Oracle on clusters).

 Internet Applications:
 ASPs (Application Service Providers);

 Computing Portals;

 E-commerce and E-business.

 Mission Critical Applications:


 command control systems, banks, nuclear
reactor control, star-wars, and handling life
threatening situations.
Early Research Cluster Systems
Project      | Platform                           | Communications                | OS/Management                                          | Other
Beowulf      | PCs                                | Multiple Ethernet with TCP/IP | Linux + PBS                                            | MPI/PVM, Sockets and HPF
Berkeley NOW | Solaris-based PCs and workstations | Myrinet and Active Messages   | Solaris + GLUnix + xFS                                 | AM, PVM, MPI, HPF, Split-C
HPVM         | PCs                                | Myrinet with Fast Messages    | NT or Linux connection and global resource manager + LSF | Java-frontend, FM, Sockets, Global Arrays, SHMEM and MPI
Solaris MC   | Solaris-based PCs and workstations | Solaris-supported             | Solaris + Globalization layer                          | C++ and CORBA
Cluster of SMPs (CLUMPS)

 Clusters of multiprocessors (CLUMPS)


 To be the supercomputers of the future
 Multiple SMPs with several network
interfaces can be connected using high
performance networks
 2 advantages
 Benefit from high-performance, easy-to-use and
easy-to-program SMP systems with a small
number of CPUs
 Clusters can be set up with moderate effort,
resulting in easier administration and better
support for data locality inside a node
Many types of Clusters
 High Performance Clusters
 Linux Cluster; 1000 nodes; parallel programs; MPI
 Load-leveling Clusters
 Move processes around to borrow cycles (e.g., MOSIX)
 Web-Service Clusters
 load-level TCP connections; replicate data
 Storage Clusters
 GFS; parallel filesystems; same view of data from each node
 Database Clusters
 Oracle Parallel Server;
 High Availability Clusters
 ServiceGuard, Lifekeeper, Failsafe, heartbeat, failover
clusters
Summary: Cluster Advantage

 The price/performance ratio of clusters is low compared
with a dedicated parallel supercomputer.
 Incremental growth that often matches demand patterns.
 The provision of a multipurpose system
 Scientific, commercial, Internet applications
 Have become mainstream enterprise computing systems:
 In the Top500 list, over 50% (in 2003) and 80% (since 2008)
of the systems are clusters, and many of them are deployed
in industry.
 In the most recent list, almost all of them are clusters!
Backup
Key Characteristics of
Scalable Parallel Computers
