
SBD-1266

Introduction to IBM Spectrum Scale and Its Use in Life Science
Sven Oehme, IBM Research
Konstantin Arnold, University of Basel

© 2016 IBM Corporation


#ibmedge
Spectrum Scale Architecture Highlights: Scalability

Spectrum Scale Architecture Highlights: HA/Reliability

Spectrum Scale Software Local Read Only Cache (LROC)
Many NAS workloads benefit from a large read cache:
- SPECsfs
- OpenStack, VMware and other virtualization
- Database

Augment the Spectrum Scale node DRAM cache with SSD/NVMe:
- Used to cache data, inodes and indirect blocks
- Cache consistency is ensured by standard Spectrum Scale tokens
- Assumes the SSD device is unreliable; data is protected by a checksum and verified on read
- Provides low-latency access to file system metadata and data

Implement with consumer flash for maximum cache/$:
- Enabled by FLEA's LSA (data is written sequentially to the device, to eliminate wear leveling)
- Reaches small-file performance leadership compared to other NAS devices
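The "unreliable device, checksum verified on read" idea can be illustrated with a small read-through cache. This is only a sketch under stated assumptions, not the Spectrum Scale implementation: the class name, the in-memory dict standing in for the SSD, the `corrupt` helper and the use of CRC32 as the checksum are all hypothetical choices made for the example.

```python
import zlib


class ChecksummedReadCache:
    """Read-only cache that treats its fast device as unreliable (illustrative only)."""

    def __init__(self, backing_read):
        self._backing_read = backing_read  # callable: key -> bytes (authoritative copy)
        self._device = {}                  # stands in for the SSD: key -> (payload, crc)
        self.backing_reads = 0             # how often we had to go to the backing store

    def read(self, key):
        entry = self._device.get(key)
        if entry is not None:
            payload, crc = entry
            if zlib.crc32(payload) == crc:
                return payload             # verified cache hit
            del self._device[key]          # checksum mismatch: discard the cached copy
        # Miss or corrupt entry: fetch the authoritative copy and re-cache it.
        payload = self._backing_read(key)
        self.backing_reads += 1
        self._device[key] = (payload, zlib.crc32(payload))
        return payload

    def corrupt(self, key):
        """Demo helper: flip one bit of a cached payload, as a failing SSD might."""
        payload, crc = self._device[key]
        self._device[key] = (bytes([payload[0] ^ 0x01]) + payload[1:], crc)


disk = {"inode:42": b"file metadata"}
cache = ChecksummedReadCache(disk.__getitem__)
cache.read("inode:42")      # miss: filled from the backing store
cache.read("inode:42")      # hit: served from the cache after verification
cache.corrupt("inode:42")   # simulate silent corruption on the cache device
print(cache.read("inode:42"), cache.backing_reads)
```

Because the cache only ever holds copies of data that also lives on the backing store, a failed verification costs one extra read rather than any data, which is what makes inexpensive flash acceptable for this role.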

LROC Example Speed Up
Two consumer-grade 200 GB SSDs cache a Spectrum Scale storage system built from forty-eight 300 GB 10K SAS disks.

Initially, with all data coming from the disk storage system, the client reads data from the SAS disks at ~3,000 IOPS.

As more data is cached in flash, client performance increases to 33,000 IOPS while the load on the disk subsystem drops by more than 95%.
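The numbers on this slide can be turned into a quick back-of-the-envelope calculation. The sketch below assumes that "load on the disk subsystem" is measured in IOPS (the slide does not say so) and treats 95% as a lower bound on the reduction; the variable names are illustrative.

```python
baseline_iops = 3_000        # from the slide: all reads served by the SAS disks
cached_iops = 33_000         # from the slide: after the flash cache has warmed up
disk_load_reduction = 0.95   # from the slide: "more than 95%", used as a lower bound

# Client-visible speedup once the cache is warm.
speedup = cached_iops / baseline_iops

# Assumption: disk load is counted in IOPS, so at most this many reads still hit disk.
disk_iops_after = baseline_iops * (1 - disk_load_reduction)

# Fraction of client reads that must then be served from flash (or DRAM).
implied_hit_ratio = 1 - disk_iops_after / cached_iops

print(f"client speedup:        {speedup:.1f}x")
print(f"disk IOPS after (max): {disk_iops_after:.0f}")
print(f"implied hit ratio:     {implied_hit_ratio:.4f}")
```

Under these assumptions the client sees an 11x increase in IOPS, and well over 99% of reads would have to be served from the cache rather than from the SAS disks.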

Spectrum Scale RAID features

ESS (Spectrum Scale RAID Building Blocks)
Elastic Storage Server (ESS) is a prepackaged solution based on Spectrum Scale RAID technology and commodity hardware components.

SSD/10k SAS models
- GS1, GS2, GS4, GS6
- 2 x high-volume servers
- 1/2/4/6 x JBOD disk enclosures

NL-SAS models
- GL2, GL4, GL6
- 2 x high-volume servers
- 2/4/6 x JBOD disk enclosures

ESS: various models

University of Basel, Switzerland
1460: first and only university in Switzerland until the 19th century
7 faculties: Humanities, Science, Medicine, Law, Business and Economics, Psychology, Theology
7600 undergraduate students
3700 postgraduate and doctoral students
1300 academic staff
358 Professors

Scientific Computing @ University of Basel
- HPC clusters specialized for large IO (bioinformatics) and high-speed interconnects (molecular simulations)
- Central systems administration
- Up-to-date scientific databases
- Up-to-date software stack
- Back-up service
- User training
- User support
- Developer support (code versioning, issue tracking, wiki, etc.)
- Dedicated 24/7 production server environment for web services (SWISS-MODEL, Ismara, Mirz, etc.)

[Diagram labels: training & HPC support; scientific software; compute clusters; 10'000 CPU cores; 3.5 PB storage]

Supporting research in Northwest Switzerland
Major funding
Hosting reference bioinformatics services
500 registered users
110 research groups
Acknowledged in 70 life-science publications in 2016

[Figures: from stellar astrophysics to brain genomics to structural biology to hosting reference services (SWISS-MODEL)]

Scientific Storage and Computing Infrastructure
Once upon a time

HPC Cluster

NFS Server

Scientific Storage and Computing Infrastructure
Cluster and storage grew bigger ...

HPC Cluster

NSD Server NSD Server NSD Server

Scientific Storage and Computing Infrastructure


[Diagram labels: Biomedical Research; Life Sciences Department; Physics Department; Chemistry Department; Psychology Department; Economy Department; Microscopy Facility; Genome Sequencing Facility; HPC Cluster; Spectrum Scale Data Hub Layer; SONAS; LTFS-EE; TSM-HSM; NSD Server (x3)]

Cluster Export Services
Highly available file and object export services
- export/share configuration is straightforward
- authentication against AD or LDAP

Important for planning:
- NFS and Apple OS X
- SMB1 not supported
- mixed workload and performance
- changes in authentication

[Diagram labels: Authentication; Active Directory; CIFS; NFS; Protocol Nodes; Spectrum Scale Data Hub Layer; NSD Server (x3)]

AFM for Data migration, Example: SONAS migration
Operational advantages:
- preparing and prefetching before switching clients
- migrate data while clients are working on the new share
- minimal downtime: 1 min (AFM) for a 30 TB share with 30M inodes vs. several months (using a transfer host with robocopy)

Technical advantages:
- data transfer: observed up to 1 TB/h per gateway host
- ACL: transferred together with data
- direct storage-to-storage migration, no transfer host or copy software needed (e.g. robocopy, rsync)

[Diagram labels: Home Cluster; Gateway Nodes; Spectrum Scale Data Hub Layer; SONAS; NSD Server (x3)]
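The observed transfer rate gives a rough way to estimate how long the background copy of a share takes. The helper below is a hypothetical planning aid, not part of AFM; it assumes throughput scales linearly with the number of gateway hosts, which the slide does not claim, and it uses the "up to 1 TB/h" figure, so the result is a best case.

```python
def estimate_transfer_hours(data_tb, gateways=1, tb_per_hour_per_gateway=1.0):
    """Best-case hours to copy `data_tb` terabytes.

    Assumes linear scaling across gateway hosts (an assumption for this sketch).
    """
    if data_tb < 0 or gateways < 1 or tb_per_hour_per_gateway <= 0:
        raise ValueError("data_tb must be >= 0, gateways >= 1, rate > 0")
    return data_tb / (gateways * tb_per_hour_per_gateway)


# The 30 TB share from the slide, copied through one or two gateway hosts.
for n in (1, 2):
    print(f"{n} gateway(s): {estimate_transfer_hours(30, gateways=n):.1f} h")
```

The copy runs while clients keep working, so this time is not downtime; the slide's 1 min figure refers only to the final switch-over.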

Example: Scientific web server
Protein sequences vs. protein structures

Example: Scientific web server
Protein annotation: humans vs. machines

Example: Scientific web server
Disaster recovery: AFM between two sites
- less work to develop data replication to DR site
Scientific pipeline speedup x8: big pagepool + LROC
- processing steps depend on bigger datasets, unchanged for 1 week
- update of datasets very simple, no data distribution required

[Diagram labels: Internet; AFM independent writer (replication not speed critical); pagepool=128GB; LROC: 1TB SSD; NSD Server (x3); 200km; HPC Cluster]

Information Lifecycle Management - HSM
Use of tape to lower cost of storage
Spectrum Archive EE (LTFS-EE):
- easy to manage, direct control of tape
- use of policies for fine-grained placement
- well suited for data export
- not a full-fledged backup system
Spectrum Protect for Space Management:
- integration with backup system
- requires TSM infrastructure

[Diagram labels: Clients (x2); Spectrum Scale Data Hub Layer; TSM Server; NSD Server (x2); Disk Pool; TS3500 (x2); Spectrum Archive EE; Spectrum Protect for Space Management]
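Policy-driven placement boils down to evaluating a rule against each file's attributes and picking a tier. Spectrum Scale expresses such rules in its own policy language; the Python below is only an illustration of the idea, and the `FileInfo` type, the pool names and the age and size thresholds are all hypothetical values chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileInfo:
    path: str
    size_bytes: int
    days_since_access: int


def choose_pool(f, min_age_days=90, min_size_bytes=1 << 20):
    """Return 'tape' for large, cold files and 'disk' otherwise.

    Thresholds are illustrative defaults, not values from the presentation.
    """
    if f.days_since_access >= min_age_days and f.size_bytes >= min_size_bytes:
        return "tape"
    return "disk"


files = [
    FileInfo("/data/run1/reads.bam", 40 << 30, 200),   # large and cold
    FileInfo("/data/run1/notes.txt", 2 << 10, 200),    # cold but tiny
    FileInfo("/data/run9/reads.bam", 40 << 30, 3),     # large but recently used
]
for f in files:
    print(f.path, "->", choose_pool(f))
```

Keeping small files on disk reflects a common concern with tape tiers, where recalling many tiny files is slow; whether that trade-off applies depends on the workload.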


Secure environment for biomedical research
Encryption
- encryption of data at rest and on network
- defined via policies
- possibility of fine-grained access groups
- encryption keys managed by key management software (IBM SKLM)
- integration with general research infrastructure
- suited for biomedical data and processing

[Diagram labels: Secure research environment; General research environment; SKLM; Clients; Login; HPC Cluster; NSD Server (x2)]

Summary
[Diagram labels: Secure research environment; SKLM; Encryption; Login; HPC Cluster; NSD Server; Biomedical Research; Life Sciences Department; Physics Department; Chemistry Department; Psychology Department; Economy Department; Microscopy Facility; Genome Sequencing Facility; HPC Cluster; LROC; CES: CIFS, NFS; Spectrum Scale Data Hub Layer; AFM; Remote Site; Remote Cluster; LTFS-EE; TSM-HSM; SONAS; NSD Server (x3); ILM, HSM]
Spectrum Scale User Group
The Spectrum Scale User Group is free
to join and open to all using, interested
in using or integrating Spectrum Scale.
Join the User Group activities to meet
your peers and get access to experts
from partners and IBM.
Next meetings:
- APAC: October 14, Melbourne
- Global at SC16: November 13, 1pm to 5pm, Salt Lake City
Web page: http://www.spectrumscale.org/
Presentations: http://www.spectrumscale.org/presentations/
Mailing list: http://www.spectrumscale.org/join/
Contact: http://www.spectrumscale.org/committee/
Meet Bob Oesterlin (US Co-Principal) at Edge2016: Robert.Oesterlin@nuance.com

Session: Futures of IBM Spectrum Scale
NDA & Customers ONLY

Who: IBM Spectrum Scale Offering Management

Carl Zetie, Ron Riffe

When: Tuesday, September 20, 2016

1pm to 2pm

Where: MGM Grand, Signature Tower 3

Meeting Room D

Contact (if any questions)

douglasof@us.ibm.com, cmukhya@us.ibm.com

Session: How to apply Flash benefits to big data analytics and unstructured data
NDA & Customers ONLY

Who: IBM Elastic Storage Server Offering Management

Alex Chen

When: Thursday, September 22, 2016

1:15pm to 2:15pm

Where: Grand Garden Arena, Lower Level, MGM, Studio 10

Contact (if any questions)

cmukhya@us.ibm.com, douglasof@us.ibm.co

Spectrum Scale Trial VM
Download the IBM Spectrum Scale Trial VM from:

http://www-03.ibm.com/systems/storage/spectrum/scale/trial.html

Spectrum Scale Edge Technical Sessions

Just search for Spectrum Scale in the IBM Events mobile app. There are 15+ sessions on various topics, including lab sessions.
Lab sessions:
Spectrum Scale Problem Determination Lab
Date: Sept 20th, 2:15 PM to 3:15 PM
Location: MGM Grand, Room 317, Lab Center F
Spectrum Scale Trial VM Lab
Date: Sept 20th, 3:45 PM to 4:45 PM
Location: MGM Grand, Room 317, Lab Center F
Booth on ESS, Spectrum Scale + TCT and DeepFlash
Thank You
