General Parallel File System

Presentation by:
Lokesh Pradhan

Introduction
File System
A way to organize data that is expected to be retained after the program terminates, by providing procedures to store, retrieve, and update data, as well as manage the available space on the device that contains it.

Types of File System


Types                        Examples
Disk file system             FAT, exFAT, NTFS
Optical discs                CD, DVD, Blu-ray
Tape file system             IBM's Linear Tape
Database file system         DB2
Transactional file system    TxF, Valor, Amino, TFFS
Flat file system             Amazon's S3
Cluster file system          NFS, CIFS, AFS, SMB, GFS, GPFS, Lustre, PAS
  Distributed file system
  Shared file system
  SAN file system
  Parallel file system

In the HPC world
Equally large applications
Large input data sets (e.g., astronomy data)
Parallel execution on large clusters
Use parallel file systems for scalable I/O
e.g., IBM's GPFS, Sun's Lustre, PanFS, and Parallel Virtual File System (PVFS)

General Parallel File System


Cluster: 512 nodes today, fast reliable communication
Shared disk: all data and metadata on disk accessible from any node through the disk I/O interface (i.e., "any to any" connectivity)
Parallel: data and metadata flow from all of the nodes to all of the disks in parallel
RAS: reliability, availability, serviceability

History of GPFS
Shark video server
Video streaming from single RS/6000
Complete system, included file system, network driver, control server
Large data blocks, admission control, deadline scheduling
Bell Atlantic video-on-demand trial (1993-94)
Tiger Shark multimedia file system
Multimedia file system for RS/6000 SP
Data striped across multiple disks, accessible from all nodes
Hong Kong and Tokyo video trials, Austin video server products
GPFS parallel file system
General-purpose file system for commercial and technical computing on RS/6000 SP, AIX, and Linux clusters
Recovery, online system management, byte-range locking, fast prefetch, parallel allocation, scalable directory, small-block random access
Released as a product 1.1 - 05/98.

What is Parallel I/O?


Multiple processes (possibly on multiple nodes) participate in the I/O
Application-level parallelism
File is stored on multiple disks on a parallel file system

[Figure: compute nodes, interconnect, I/O server nodes, disk]
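
The idea of several participants performing I/O on one file can be sketched on a single machine. This is a minimal illustration, not GPFS code: threads stand in for processes on separate nodes, an ordinary temporary file stands in for a file on a parallel file system, and the names `CHUNK`, `WORKERS`, `write_region`, `parallel_write`, and `demo` are hypothetical. It assumes a POSIX system, since `os.pwrite` is not available on Windows.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor

CHUNK = 4096   # bytes written by each worker (illustrative value)
WORKERS = 4    # number of concurrent writers (illustrative value)


def write_region(path, rank):
    """Write this worker's chunk at its own offset in the shared file."""
    data = bytes([ord("A") + rank]) * CHUNK
    fd = os.open(path, os.O_WRONLY)
    try:
        # pwrite takes an explicit offset, so workers do not share a file position.
        os.pwrite(fd, data, rank * CHUNK)
    finally:
        os.close(fd)


def parallel_write(path):
    """Pre-size the file, then let every worker fill a disjoint byte range."""
    with open(path, "wb") as f:
        f.truncate(CHUNK * WORKERS)
    with ThreadPoolExecutor(max_workers=WORKERS) as pool:
        # list() forces completion and re-raises any worker exception.
        list(pool.map(lambda rank: write_region(path, rank), range(WORKERS)))


def demo():
    """Run the writers against a temporary file and return its contents."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "shared.dat")
        parallel_write(path)
        with open(path, "rb") as f:
            return f.read()
```

Because each worker owns a disjoint byte range, no coordination between writers is needed in this sketch; overlapping ranges would require locking.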

What does a Parallel File System support?

A parallel file system must support:
Parallel I/O
A consistent global name space across all nodes of the cluster
Including a consistent view of the same file across all nodes
A programming model that allows programs to access file data
Distributed over multiple nodes
From multiple tasks running on multiple nodes
Physical distribution of data across disks and network entities, which eliminates bottlenecks at both the disk interface and the network and provides more effective bandwidth to the I/O resources

Why use general parallel file systems?
Native AIX File System
No file sharing: an application can only access files on its own node
Applications must do their own data partitioning
Distributed File System
Application nodes (DCE clients) share files on a server node
Switch is used as a fast LAN
Coarse-grained (file- or segment-level) parallelism
Server node is a performance and capacity bottleneck
GPFS Parallel File System
GPFS file systems are striped across multiple disks on multiple storage nodes
Independent GPFS instances run on each application node
GPFS instances use storage nodes as "block servers": all instances can access all disks

Performance advantages with GPFS file system
Allowing multiple processes or applications on all nodes in the cluster to access the same file simultaneously using standard file system calls.
Increasing the aggregate bandwidth of your file system by spreading reads and writes across multiple disks.
Balancing the load evenly across all disks to maximize their combined throughput. No disk is more active than another.
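
Spreading a file's blocks across disks can be sketched as a simple mapping from byte offset to disk. The slides do not specify a placement policy, so round-robin placement is an assumption made here for illustration; the function names and the `first_disk` parameter are hypothetical.

```python
from collections import Counter


def locate_block(file_offset, block_size, num_disks, first_disk=0):
    """Map a byte offset to (disk index, block index on that disk, offset in block).

    Assumes round-robin striping: consecutive file blocks go to consecutive disks.
    """
    block_no = file_offset // block_size
    disk = (first_disk + block_no) % num_disks
    block_on_disk = block_no // num_disks
    return disk, block_on_disk, file_offset % block_size


def blocks_per_disk(file_size, block_size, num_disks):
    """Count how many blocks of a file land on each disk under this mapping."""
    total_blocks = -(-file_size // block_size)  # ceiling division
    return Counter(
        locate_block(b * block_size, block_size, num_disks)[0]
        for b in range(total_blocks)
    )
```

Under this mapping the block counts on any two disks differ by at most one, which is one way to get the even load the slide describes. A large sequential read also touches every disk, so the per-disk bandwidths add up.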

Performance advantages with GPFS file system (cont.)
Supporting very large file and file system sizes.
Allowing concurrent reads and writes from multiple nodes.
Allowing for distributed token (lock) management. Distributing token management reduces system delays associated with a lockable object waiting to obtain a token.
Allowing for the specification of other networks for GPFS daemon communication and for GPFS administration command usage within your cluster.

GPFS Architecture Overview


Implications of Shared Disk Model
All data and metadata on globally accessible disks (VSD)
All access to permanent data through the disk I/O interface
Distributed protocols (e.g., distributed locking) coordinate disk access from multiple nodes
Fine-grained locking allows parallel access by multiple clients
Logging and shadowing restore consistency after node failures
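
Fine-grained locking can be illustrated with a toy byte-range token table. This is a sketch under stated assumptions and not the GPFS protocol: it runs in one process, it refuses a conflicting request instead of revoking tokens from their holders, and `Token`, `TokenManager`, and their methods are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class Token:
    node: str
    start: int  # first byte covered (inclusive)
    end: int    # end of range (exclusive)
    mode: str   # "r" for shared read, "w" for exclusive write


def overlaps(a_start, a_end, b_start, b_end):
    """True when two half-open byte ranges share at least one byte."""
    return a_start < b_end and b_start < a_end


@dataclass
class TokenManager:
    granted: list = field(default_factory=list)

    def conflicts(self, node, start, end, mode):
        """Tokens held by other nodes that overlap and involve a writer."""
        return [
            t for t in self.granted
            if t.node != node
            and overlaps(start, end, t.start, t.end)
            and (mode == "w" or t.mode == "w")
        ]

    def acquire(self, node, start, end, mode):
        """Grant a token if nothing conflicts; otherwise return None."""
        if self.conflicts(node, start, end, mode):
            return None
        token = Token(node, start, end, mode)
        self.granted.append(token)
        return token

    def release(self, token):
        self.granted.remove(token)
```

Two nodes writing to disjoint ranges of the same file both receive tokens and proceed in parallel; only overlapping requests where at least one is a write are held back.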

GPFS Architecture Overview (cont.)


Implications of Large Scale
Support up to 4096 disks of up to 1 TB each (4 petabytes)
The largest system in production is 75 TB
Failure detection and recovery protocols to handle node failures
Replication and/or RAID protect against disk / storage node failure
Online dynamic reconfiguration (add, delete, replace disks and nodes; rebalance file system)

GPFS Architecture - Special Node Roles


Three types of nodes:
File system nodes
Manager nodes
Storage nodes

Disk Data Structures:


Large block size allows efficient use of disk bandwidth
Fragments reduce space overhead for small files
No designated "mirror", no fixed placement function:
Flexible replication (e.g., replicate only metadata, or only important files)
Dynamic reconfiguration: data can migrate block-by-block
Multi-level indirect blocks
Each disk address: list of pointers to replicas
Each pointer: disk id + sector no.
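
The disk address layout described above can be written down as a small data structure. This is an illustrative sketch, not the on-disk format: the class names, field names, and the two helper methods are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ReplicaPointer:
    disk_id: int  # which disk holds this copy
    sector: int   # sector number on that disk


@dataclass
class DiskAddress:
    replicas: List[ReplicaPointer]  # one pointer per replica of the block

    def first_available(self, failed_disks):
        """Return a pointer on a disk that has not failed, or None."""
        for ptr in self.replicas:
            if ptr.disk_id not in failed_disks:
                return ptr
        return None

    def migrate(self, old_disk, new_ptr):
        """Repoint the replica stored on old_disk, leaving other replicas alone."""
        self.replicas = [
            new_ptr if p.disk_id == old_disk else p for p in self.replicas
        ]
```

Because every block carries its own list of pointers, replication can vary per block or per file, and a single block can be moved to another disk by rewriting one pointer. That is consistent with the slide's points about flexible replication and block-by-block migration.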

Availability and Reliability


Eliminates single points of failure
Designed to transparently fail over token (lock) operations
Supports data replication to increase availability in the event of a storage media failure
Offers time-tested reliability and has been installed on thousands of nodes across industries
Basis of many cloud storage offerings

GPFS's Achievements
Used on six of the ten most powerful supercomputers in the world, including the largest (ASCI White)
Installed at several hundred customer sites, on clusters ranging from a few nodes with less than a TB of disk, up to 512 nodes with 140 TB of disk in 2 file systems
20 filed patents
Used on the ASC Purple supercomputer, which is composed of more than 12,000 processors and has 2 PB of total disk storage spanning more than 11,000 disks

Conclusion
Efficient for managing data volumes
Provides world-class performance, scalability, and availability for your file data
Designed to optimize the use of storage
Provides a highly available platform for data-intensive applications
Meets real business needs by streamlining data workflows, improving services, reducing cost, and managing risk

References
"File System." Wikipedia, the Free Encyclopedia. Web. 20 Jan. 2012. <http://en.wikipedia.org/wiki/File_system>.
"IBM General Parallel File System for AIX: Administration and Programming Reference - Contents." IBM General Parallel File System for AIX. IBM. Web. 20 Jan. 2012. <https://support.iap.ac.cn/hpc/ibm/ibm/gpfs/am3admst02.html>.
"IBM General Parallel File System." Wikipedia, the Free Encyclopedia. Web. 20 Jan. 2012. <http://en.wikipedia.org/wiki/IBM_General_Parallel_File_System>.
Intelligent Storage Management with IBM General Parallel File System. Issue brief. IBM, July 2010. Web. 21 Jan. 2012. <http://www-03.ibm.com/systems/software/gpfs/>.
Mandler, Benny. Architectural and Design Issues in the General Parallel File System. IBM Haifa Research Lab, May 2005. Web. 21 Jan. 2012.
"NCSA Parallel File Systems." National Center for Supercomputing Applications at the University of Illinois. University of Illinois, 20 Mar. 2008. Web. 21 Jan. 2012. <http://www.ncsa.illinois.edu/UserInfo/Data/filesystems/>.
Parallel File System. Rep. Dell Inc., May 2005. Web. 21 Jan. 2012. <www.dell.com/powersolutions>.
Welch, Brent. "What Is a Cluster Filesystem?" Brent B Welch. Web. 21 Jan. 2012. <http://www.beedub.com/clusterfs.html>.

Questions?