General Parallel File System

Presentation by:
Lokesh Pradhan

File System
A way to organize data that is expected to be retained after the program terminates, by providing procedures to store, retrieve, and update data, as well as to manage the available space on the device that contains it.

Types of File System



Disk file system
Optical discs: CD, DVD, Blu-ray
Tape file system: IBM's Linear Tape File System
Database file system
Transactional file system: TxF, Valor, Amino, TFFS
Flat file system: Amazon's S3
Cluster file system: distributed, shared, SAN, and parallel file systems



In the HPC world
Equally large applications
Large input data sets (e.g., astronomy data)
Parallel execution on large clusters
Use parallel file systems for scalable I/O
e.g., IBM's GPFS, Sun's Lustre, PanFS, and Parallel Virtual File System (PVFS)

General Parallel File System

Cluster: 512 nodes today, fast reliable communication
Shared disk: all data and metadata on disk accessible from any node through disk I/O interface (i.e., "any to any" connectivity)
Parallel: data and metadata flow from all of the nodes to all of the disks in parallel
RAS: reliability, availability, serviceability

History of GPFS
Shark video server
Video streaming from single RS/6000
Complete system, including file system, network driver, and control server
Large data blocks, admission control, deadline scheduling
Bell Atlantic video-on-demand trial (1993-94)
Tiger Shark multimedia file system
Multimedia file system for RS/6000 SP
Data striped across multiple disks, accessible from all nodes
Hong Kong and Tokyo video trials, Austin video server products
GPFS parallel file system
General-purpose file system for commercial and technical computing on RS/6000 SP, AIX, and Linux clusters

Recovery, online system management, byte-range locking, fast prefetch, parallel allocation, scalable directory, small-block random
Released as product version 1.1 in May 1998.

What is Parallel I/O?

Multiple processes (possibly on multiple nodes) participate in the I/O
Application level
File is stored on multiple disks on a parallel file system

[Figure: compute nodes and I/O server nodes]
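
A minimal sketch of the idea, under stated assumptions: several workers each write their own disjoint byte range of one shared file using positioned writes. Threads stand in for the processes on separate compute nodes, and an ordinary local file stands in for a file on a parallel file system. The names `CHUNK`, `write_chunk`, and `parallel_write` are hypothetical, chosen only for this illustration.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor

CHUNK = 4  # bytes written by each worker (tiny, for illustration only)

def write_chunk(path: str, rank: int) -> None:
    """Write this worker's disjoint byte range of the shared file."""
    data = str(rank).encode() * CHUNK
    fd = os.open(path, os.O_WRONLY)
    try:
        # Positioned write: no shared file offset, so workers do not interfere.
        os.pwrite(fd, data, rank * CHUNK)
    finally:
        os.close(fd)

def parallel_write(path: str, nworkers: int) -> bytes:
    """Have nworkers write concurrently, then return the file contents."""
    # Pre-size the file so every byte range exists before the writers start.
    with open(path, "wb") as f:
        f.truncate(nworkers * CHUNK)
    with ThreadPoolExecutor(max_workers=nworkers) as pool:
        # list() forces completion and re-raises any worker exception.
        list(pool.map(lambda r: write_chunk(path, r), range(nworkers)))
    with open(path, "rb") as f:
        return f.read()

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        print(parallel_write(os.path.join(d, "shared.dat"), 4))
```

Because each worker targets a different byte range, no coordination beyond the initial sizing is needed in this sketch; on a real parallel file system the ranges would also land on different disks.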


What does a parallel file system support?

A parallel file system must support
Parallel I/O
Consistent global name space across all nodes of the cluster, including maintaining a consistent view across all nodes for the same file
Programming model allowing programs to access file data
Distributed over multiple nodes
From multiple tasks running on multiple nodes
Physical distribution of data across disks and network entities eliminates bottlenecks both at the disk interface and the network, providing more effective I/O bandwidth

Why use General Parallel File System?

Native AIX File System
No file sharing - application can only access files
on its own node
Applications must do their own data partitioning
Distributed File System

Application nodes (DCE clients) share files on a server node
Switch is used as a fast LAN
Coarse-grained (file or segment level) parallelism
Server node: performance and capacity bottleneck
GPFS Parallel File System
GPFS file systems are striped across multiple
disks on multiple storage nodes
Independent GPFS instances run on each
application node
GPFS instances use storage nodes as "block
servers" - all instances can access all disks
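
To make the striping idea concrete, here is a small sketch that maps a file byte offset to a disk under a plain round-robin layout. The round-robin placement, the block size, and the function names `locate` and `disks_touched` are assumptions made for illustration; the slides do not specify the actual placement policy.

```python
from collections import Counter

def locate(offset: int, block_size: int, num_disks: int) -> tuple[int, int, int]:
    """Map a file byte offset to (disk index, block index on that disk, offset in block).

    Assumes file blocks are dealt out to disks round-robin.
    """
    block = offset // block_size
    return block % num_disks, block // num_disks, offset % block_size

def disks_touched(offset: int, length: int, block_size: int, num_disks: int) -> Counter:
    """Count how many blocks of the byte range [offset, offset + length) land on each disk."""
    first = offset // block_size
    last = (offset + length - 1) // block_size
    return Counter(b % num_disks for b in range(first, last + 1))

if __name__ == "__main__":
    BLOCK = 256 * 1024  # assumed block size
    print(locate(5 * BLOCK + 10, BLOCK, 4))
    print(disks_touched(0, 8 * BLOCK, BLOCK, 4))
```

A large sequential read spans many blocks, so under this layout it is served by all disks at once rather than by a single server node.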

Performance advantages with GPFS file system

Allowing multiple processes or applications on all nodes in the cluster simultaneous access to the same file using standard file system calls.
Increasing aggregate bandwidth of your file system by
spreading reads and writes across multiple disks.
Balancing the load evenly across all disks to maximize
their combined throughput. One disk is no more active
than another.

Performance advantages with GPFS file system (cont.)
Supporting very large file and file system sizes.
Allowing concurrent reads and writes from multiple nodes.
Allowing for distributed token (lock) management.
Distributing token management reduces system delays associated with a lockable object waiting to obtain a token.
Allowing for the specification of other networks for
GPFS daemon communication and for GPFS
administration command usage within your cluster.

GPFS Architecture Overview

Implications of Shared Disk Model
All data and metadata on globally accessible disks (VSD)
All access to permanent data through disk I/O
Distributed protocols, e.g., distributed locking,
coordinate disk access from multiple nodes
Fine-grained locking allows parallel access by
multiple clients
Logging and Shadowing restore consistency after
node failures
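
The fine-grained locking point can be illustrated with a toy byte-range token table for a single file. This is a simplified, centralized sketch and not the GPFS protocol: the class `ByteRangeTokens` and its methods are hypothetical, and real token management also involves distribution across nodes and revocation, which are omitted here.

```python
from dataclasses import dataclass, field

@dataclass
class ByteRangeTokens:
    """Toy token table for one file.

    Each entry is (start, end, node, exclusive), with `end` exclusive.
    """
    held: list = field(default_factory=list)

    def acquire(self, node: str, start: int, end: int, exclusive: bool) -> bool:
        """Grant a token unless another node holds a conflicting overlapping one."""
        for s, e, owner, excl in self.held:
            overlaps = start < e and s < end
            # Two shared (read) tokens never conflict; anything else does.
            if overlaps and owner != node and (exclusive or excl):
                return False
        self.held.append((start, end, node, exclusive))
        return True

    def release(self, node: str) -> None:
        """Drop every token held by the given node."""
        self.held = [t for t in self.held if t[2] != node]

if __name__ == "__main__":
    tokens = ByteRangeTokens()
    print(tokens.acquire("nodeA", 0, 100, exclusive=True))     # disjoint writers
    print(tokens.acquire("nodeB", 100, 200, exclusive=True))   # can proceed in parallel
    print(tokens.acquire("nodeC", 50, 150, exclusive=False))   # overlaps both writers
```

Writers on disjoint ranges of the same file never block each other here, which is the property that lets multiple clients access one file in parallel.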

GPFS Architecture Overview (cont.)

Implications of Large Scale
Support up to 4096 disks of up to 1 TB each (4 PB)
The largest system in production is 75 TB
Failure detection and recovery protocols to
handle node failures
Replication and/or RAID protect against disk /
storage node failure
On-line dynamic reconfiguration (add, delete,
replace disks and nodes; rebalance file system)

GPFS Architecture - Special Node Roles

Three types of nodes:
File system nodes
Manager nodes
Storage nodes

Disk Data Structures:

Large block size allows efficient use of disk bandwidth
Fragments reduce space overhead for small files
No designated "mirror", no fixed placement function:
Flexible replication (e.g., replicate only metadata, or only important files)
Dynamic reconfiguration: data can migrate block by block
Multi-level indirect blocks
Each disk address: list of pointers to replicas
Each pointer: disk id + sector no.
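
The disk address layout described above can be sketched as a pair of small record types. The field names and the `read_from` helper are hypothetical, introduced only to show how a list of replica pointers lets a read fall back to another copy; the on-disk encoding is not given in the slides.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class DiskPointer:
    """One replica location: disk id + sector number."""
    disk_id: int
    sector_no: int

@dataclass
class DiskAddress:
    """A disk address: a list of pointers to replicas of the same block."""
    replicas: list  # list of DiskPointer

    def read_from(self, failed_disks: set) -> DiskPointer:
        """Return the first replica that is not on a failed disk."""
        for pointer in self.replicas:
            if pointer.disk_id not in failed_disks:
                return pointer
        raise OSError("all replicas are on failed disks")

if __name__ == "__main__":
    addr = DiskAddress([DiskPointer(3, 4096), DiskPointer(7, 128)])
    print(addr.read_from(set()))  # first replica is usable
    print(addr.read_from({3}))    # disk 3 failed, fall back to disk 7
```

Since each address carries its own replica list, there is no designated mirror disk and replicas can be placed or moved block by block.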

Availability and Reliability

Eliminates single points of failure
Designed to transparently fail over token (lock) operations
Supports data replication to increase availability in the event of a storage media failure
Offers time-tested reliability and has been installed on thousands of nodes
Basis of many cloud storage offerings

GPFS's Achievements
Used on six of the ten most powerful supercomputers in the world, including the largest (ASCI White)
Installed at several hundred customer sites, on clusters ranging from a few nodes with less than a TB of disk, up to 512 nodes with 140 TB of disk in 2 file systems
20 filed patents
The ASC Purple supercomputer is composed of more than 12,000 processors and has 2 PB of total disk storage spanning more than 11,000 disks.

Efficient for managing data volumes
Provides world-class performance, scalability, and availability for your file data
Designed to optimize the use of storage
Provides a highly available platform for data-intensive applications
Delivers on real business needs by streamlining data workflows, improving services, reducing cost, and managing the