
Copyright IBM Corporation, 2014

IBM System Storage DS8870 Performance with High-Performance Flash Enclosure


June 2014

Kaisar Hossain
Paul Jennas
Joshua Martin
Sergio Reyes
David V Valverde
Rafael Velez
David Whitworth
Sonny E. Williams
Yan Xu

Document WP102454

Systems and Technology Group
2014, International Business Machines Corporation
IBM DS8870 Performance with High-Performance Flash Enclosure - June 6th 2014
Document: WP102454 Copyright IBM Corporation, 2014 Page 2 of 32


Notices, Disclaimer and Trademarks
Copyright 2014 by International Business Machines Corporation.

No part of this document may be reproduced or transmitted in any form without written
permission from IBM Corporation. Product data has been reviewed for accuracy as of the date
of initial publication. Product data is subject to change without notice. This information may
include technical inaccuracies or typographical errors. IBM may make improvements and/or
changes in the product(s) and/or program(s) at any time without notice. References in this
document to IBM products, programs, or services do not imply that IBM intends to make such
products, programs or services available in all countries in which IBM operates or does
business. THE INFORMATION PROVIDED IN THIS DOCUMENT IS DISTRIBUTED "AS IS"
WITHOUT ANY WARRANTY, EITHER EXPRESS OR IMPLIED. IBM EXPRESSLY DISCLAIMS
ANY WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE OR
NON-INFRINGEMENT.

IBM shall have no responsibility to update this information. IBM products are warranted
according to the terms and conditions of the agreements (e.g., IBM Customer Agreement,
Statement of Limited Warranty, International Program License Agreement, etc.) under which
they are provided. IBM is not responsible for the performance or interoperability of any non-IBM
products discussed herein. The performance data contained herein was obtained in a
controlled, isolated environment. Actual results that may be obtained in other operating
environments may vary significantly. While IBM has reviewed each item for accuracy in a
specific situation, there is no guarantee that the same or similar results will be obtained
elsewhere. Statements regarding IBM's future direction and intent are subject to change or
withdrawal without notice, and represent goals and objectives only. The provision of the
information contained herein is not intended to, and does not, grant any right or license under
any IBM patents or copyrights. Inquiries regarding patent or copyright licenses should be made,
in writing, to:


IBM Director of Licensing
IBM Corporation
North Castle Drive
Armonk, NY 10504-1785
U.S.A.


IBM, Enterprise Storage Server, ESCON, FICON, FlashCopy, System Storage, System z,
System p, z/OS, zEnterprise, Easy Tier, and DS8000 are trademarks of International Business
Machines Corporation in the United States, other countries, or both. Other company, product,
or service names may be trademarks or service marks of others.
Acknowledgements

The authors would like to thank the following colleagues for their comments and insight:
Nick Clayton - IBM Systems and Technology Group, Manchester, United Kingdom
Peter Kimmel - IBM Systems and Technology Group, Mainz, Germany
Loren (Yang SH) Liu - IBM Systems and Technology Group, Shanghai, China
Brian Sherman - IBM Storage Advanced Technical Skills, Markham, ON, Canada
A Note to the Reader

This White Paper assumes a familiarity with the general concepts of Enterprise Disk Storage
Systems. Readers unfamiliar with these topics should consult the References section at the
end of this paper.

The reference to DS8870 in the measurement results means DS8870 P7+, unless it is
specifically denoted otherwise.


Table of Contents

ACKNOWLEDGEMENTS
A NOTE TO THE READER
TABLE OF CONTENTS
1 EXECUTIVE SUMMARY
2 INTRODUCTION
3 BASE PERFORMANCE WITH HIGH-PERFORMANCE FLASH ENCLOSURE
    3.1 Single Array and Single HPFE Performance
    3.2 Full Configuration Performance
    3.3 Hybrid Configuration Performance
4 EASY TIER WITH HIGH-PERFORMANCE FLASH ENCLOSURE
    4.1 Easy Tier with HPFE in a Multi-tier Environment
        4.1.1 DB2 Brokerage Transactional Workload
        4.1.2 Online Transaction Processing workload
        4.1.3 Easy Tier and Workload Skew
    4.2 Intra-tier Auto Rebalance between SSDs and HPFE
5 COPY SERVICES WITH HIGH-PERFORMANCE FLASH ENCLOSURE
6 CONCLUSION
7 REFERENCES
APPENDICES
    Appendix A: Workload Characteristics
        Open Workloads
        System z workloads
    Appendix B: Hardware Configurations
        DS8870 Hardware Configurations
    Appendix C: Definitions and Methodologies


1 Executive Summary

On June 6, 2014, IBM introduced an evolutionary High-Performance Flash Enclosure (HPFE) to
the IBM System Storage DS8870's flash storage attachment portfolio. In addition, IBM made
generally available an enhanced All-Flash feature on the DS8870 with Release 7.3 Licensed
Internal Code (LIC), featuring the High-Performance Flash Enclosure (HPFE).

This paper describes the results of performance measurements conducted by the IBM
Enterprise Storage performance team in Tucson, Arizona, using the new High-Performance
Flash Enclosure attached to the enhanced IBM System Storage DS8870 with POWER7+
server technology. The main objective of this paper is to contrast the performance capacity of
the HPFE attached to the DS8870 with that of the DS8870 attached to 2.5-inch Solid State Drive
(SSD) technology, as well as with that of traditional Hard Disk Drive (HDD) technology. The
performance of both hybrid and homogeneous configurations is explored. Additionally, the
performance capacity of the All-Flash feature using the HPFE is contrasted with that of the
initial DS8870 All-Flash offering using SSD technology. In addition to base functionality,
performance comparisons are provided for the key advanced features and functions offered by
the DS8870, including:

- Easy Tier
- FlashCopy (local disk system copy)
- FlashCopy SE (space-efficient local disk system copy)

This paper primarily examines the performance capability of DS8870 with HPFE in Fixed Block
(FB) data formats.

The new High-Performance Flash Enclosure removes the device adapter limit associated with
the currently supported standard 2.5-inch SSDs. The High-Performance Flash Enclosure connects
directly to the high-bandwidth Peripheral Component Interconnect Express (PCIe) buses of the
two DS8870 POWER7+ processor complexes.

Each High-Performance Flash Enclosure is packaged in a 1U standard rack and contains:

- Two high-performance flash adapters, specially designed to exploit and optimize the
  performance capacities of flash-based storage
- Either 16 or 30 flash cards, in a dimension of 46 mm (1.8") each
- Two or four RAID-5 arrays

Laboratory measurements show that the POWER7+ enhanced DS8870 attached to a single
HPFE can achieve 250,000 (4 KB) read operations per second and deliver up to 3.4 GB/s of
bandwidth, equipping the DS8870 with HPFE to easily support both I/O-intensive and
bandwidth-intensive workloads.
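
To put the two headline figures on a common scale, the small-block I/O rate can be converted
to an equivalent data rate. The Python sketch below does this arithmetic. The helper name
iops_to_gbps is hypothetical, and the conversion assumes decimal units (1 KB = 1000 bytes),
which this paper does not specify.

```python
def iops_to_gbps(iops: float, block_kb: float, kb_bytes: int = 1000) -> float:
    """Convert an I/O rate at a fixed block size to a data rate in GB/s (decimal GB).

    kb_bytes is an assumption: 1000 for decimal KB, 1024 for binary KB.
    """
    return iops * block_kb * kb_bytes / 1e9


# 250,000 read operations per second at 4 KB each (figures quoted above).
small_block_rate = iops_to_gbps(250_000, 4)
print(f"{small_block_rate:.2f} GB/s")
```

Under these assumptions the 4 KB read rate corresponds to about 1 GB/s of data, well below the
3.4 GB/s bandwidth figure. This is consistent with the two numbers coming from different
workloads (small-block random versus sequential) rather than from a single measurement.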

Finally, the enhanced All-Flash feature includes 8 High-Performance Flash Enclosures
packaged in a new high-performance All-Flash single-frame DS8870 containing up to 16 host
adapters. This new DS8870 packaging feature allows clients to unleash the full potential of an
All-Flash, High-Performance Flash Enclosure based storage system by balancing the front-end
performance capacity with the back-end performance capacity in a much smaller package. As
is the case with the initial DS8870 All-Flash offering that was introduced on Release 7.2.1 LIC in
January of 2014, which employed standard 2.5-inch SSDs, the All-Flash systems with HPFE come
with the POWER7+ Flash accelerator feature on the POWER7+ 16-core (per CEC) model, which
can boost overall performance capacity by up to 5%.



2 Introduction

The IBM System Storage DS8870 series is designed to manage a broad scope of storage
workloads that exist in today's complex data centers, and to do so effectively and efficiently.
The proven success of this flagship IBM disk system is a direct consequence of its extraordinary
resiliency, scalability and performance, as well as its ability to address the demanding
requirements of the critical data at the heart of your data center.

A new-generation High-Performance Flash Enclosure enables the DS8870 storage system to
deliver an improved level of performance for your most time-sensitive, mission-critical
applications, while a highly resilient architecture and world-class business continuity
solutions make 24/7 access to critical enterprise applications a reality. Adding the unique
performance-optimizing integration between the DS8870 and IBM enterprise servers makes it
easy to see why this flagship system epitomizes Smarter Storage.

The rich design heritage of the DS8870 is preserved, as is evident from IBM's commitment to
flash technology with the general availability of the latest enhancement to the DS8870's
All-Flash system portfolio: the High-Performance Flash Enclosure.

The High-Performance Flash Enclosure, All-Flash DS8870 model delivers unprecedented
performance and capacity growth and is also a well-balanced general purpose storage system
that performs well with both bandwidth-intensive workloads and I/O-intensive workloads with
low I/O latency requirements. Compared to the performance of the previous DS8870 All-Flash
offering with 2.5-inch SSDs, the ultra-dense HPFE can provide up to 4x more I/O operations per
second (IOPS) in the same amount of capacity. Figure 1 depicts the DS8870 All-Flash model
with High-Performance Flash Enclosures. The DS8870 is available with several processor core
options. The measurement data in this paper primarily reflects the performance of the
POWER7+ 16-core (per CEC) model. However, measurements with both hybrid and All-Flash
systems with HPFE on the POWER7+ 8-core (per CEC) model are also included.





[Figure 1 callouts:]
- 50% reduction in footprint and 12% reduction in power
- All 8 I/O bays installed in base frame for up to 128 8 Gb FC ports
- Eight PCIe-attached High-Performance Flash Enclosures providing up to 73.6 TB usable
  capacity with 400 GB flash cards
- 8-core P7+ server with 256 GB cache or 16-core P7+ server with 512 GB or 1 TB cache

Figure 1: DS8870 All-Flash Model with High-Performance Flash Enclosures
3 Base Performance with High-Performance Flash
Enclosure

Sections 3.1 and 3.2 describe the results of various Open Systems performance measurements
and draw comparisons between the DS8870 with HPFE, SSDs and HDDs. Results from the
System z (CKD) environment are reported in Section 3.3. A detailed description of the
configuration for these measurements can be found in Appendix B: Hardware Configurations.

3.1 Single Array and Single HPFE Performance
The DS8870 with HPFEs provides better flash performance in 50% less space than existing
flash options. The measurements in Figure 2 compare HPFE and SSD throughput capabilities
for a single array. The HPFE provides higher throughput for both sequential and random I/O
operations.



Figure 2: DS8870 SSD vs. HPFE Single Array (RAID-5, 6+p) Throughput


[Figure 2 chart panels: "4KB Random" (K IO/s, Read and Write) and "Sequential" (GB/s, Read and Write); series: SSD and HPFE]

The HPFE's performance advantage over SSDs persists as the number of arrays scales from
one to four in a single HPFE. Note that a single adapter pair is used for the measurements in
Figure 3, which shows up to a 4x increase in random writes for HPFE versus SSDs.

Figure 3: DS8870 SSD vs. HPFE: Four Array (RAID-5, 6+p) in a single Device Adapter Pair or single HPFE

Although not shown here, similar HPFE performance advantages were attained on CKD as
seen in the Open system results in Figure 2 and Figure 3.

For latency sensitive applications, the HPFE is capable of sustaining low response times at
more demanding I/O rates than its SSD equivalent as seen in Figure 4. The response time in
Figure 4 and throughout the document is the end-to-end time from the application, unless
otherwise noted.


Figure 4: DS8870 SSD vs. HPFE: Single Array RAID-5 6+p




[Figure 3 chart panels: "4KB Random" (K IO/s, Read and Write) and "Sequential" (GB/s, Read and Write); series: SSD and HPFE]
[Figure 4 chart: "4KB Random Read"; x-axis: I/O Rate (KIOPS); y-axis: Response Time (ms); series: SSD and HPFE]

3.2 Full Configuration Performance

OLTP Performance
Online Transaction Processing (OLTP) benchmark workloads are designed to represent the
type of mixed I/O patterns seen in online applications. They are composed of a mixture of both
reads and writes with some cache hits and some cache misses. These workloads access data
primarily in a random fashion.

The DS8870 with HPFEs demonstrates significantly better response times than SSDs, yielding
up to 3.2x the IOPS at an equivalent latency. Figure 5 shows that the performance advantage of
HPFE over SSDs is sustained as the number of HPFEs scales, up to a fully configured
environment. The performance capacity of 1 HPFE is close to that of 4 DA pairs with 128 SSDs.
The workload used in Figure 5 is a Database for Open systems (DB Open) workload, which
represents a typical OLTP environment. This workload is also referred to as 70/30/50
because it is composed of 70% reads, 30% writes, and 50% read cache hits.
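
The 70/30/50 notation can be unpacked with a little arithmetic. The following Python sketch
splits a host I/O rate into its components. It assumes that the 50% hit ratio applies to reads
only; the function name split_workload and the example rate of 100,000 IOPS are illustrative
and are not part of the benchmark definition.

```python
def split_workload(host_iops, read_pct=70, write_pct=30, read_hit_pct=50):
    """Break a host I/O rate into reads, writes, read hits and read misses.

    Assumption: read_hit_pct is the share of reads (not of all I/Os)
    that are satisfied from cache.
    """
    reads = host_iops * read_pct / 100
    writes = host_iops * write_pct / 100
    read_hits = reads * read_hit_pct / 100
    read_misses = reads - read_hits  # these reads must go to the back-end drives
    return {
        "reads": reads,
        "writes": writes,
        "read_hits": read_hits,
        "read_misses": read_misses,
    }


mix = split_workload(100_000)
for name, rate in mix.items():
    print(f"{name}: {rate:,.0f} IOPS")
```

Under this reading, only about a third of the host I/Os are read misses that the flash or disk
back end must serve directly; the writes reach the back end as well, but on a schedule that
depends on cache destaging, which this sketch does not model.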

Figure 5: DS8870 (P7+ w/Flash Accelerator) SSD vs. HPFE: DB Open (70/30/50)


[Figure 5 chart: x-axis: K IOps; y-axis: Response Time (ms); series: 1 HPFE, 2 HPFEs, 4 HPFEs, 8 HPFEs, 1 DA Pair w/ 32 SSDs, 2 DA Pair w/ 64 SSDs, 4 DA Pair w/ 128 SSDs, 8 DA Pair w/ 256 SSDs]

The OLTP workload showcases the advantages of flash over HDDs. The performance gains
can be seen in Figure 6, where various HPFE configurations significantly outperform a DS8870
fully configured with 1,536 HDDs.

Figure 6: DS8870 HPFE (P7+ w/Flash Accelerator) vs. HDD (P7+): DB Open (70/30/50)

Sequential Performance

Although the true benefit of any flash technology is best reflected when conducting random I/O
operations, an All-Flash DS8870 fully configured with 8 HPFEs attains impressive sequential
bandwidth rates and reaches the maximum bandwidth a DS8870 offers. Figure 7 compares
HPFE and SSDs under a sequential I/O workload in full All-Flash configurations; both
configurations achieved the maximum bandwidth of the DS8870.

Figure 7: DS8870 HPFE vs. SSD Full Configuration Bandwidth



[Figure 6 chart: x-axis: K IOps; y-axis: Response Time (ms); series: 1 HPFE, 2 HPFEs, 4 HPFEs, 8 HPFEs, 1536x 15K HDDs RAID10]
[Figure 7 chart: "Sequential" (GB/s, Read and Write); series: 8 DA Pair/256 SSDs and 8 HPFEs]

3.3 Hybrid Configuration Performance

This section explores HPFE in a hybrid configuration with traditional HDDs in the System z
environment. Client experience has shown that some clients still prefer to manage data
placement, either by Workload Manager (WLM) constructs or manual placement of their
business critical data instead of employing Easy Tier. High-Performance Flash Enclosures, in
combination with HDDs in a hybrid configuration can:

- Provide dramatic performance improvement even with the addition of only one HPFE
- Provide remarkable performance as a function of workload growth over time and/or
  support new applications
- Provide a cost-effective way to improve or maintain system responsiveness as a function
  of workload growth

All of these benefits can be realized in a smaller footprint that can reduce Total Cost of
Ownership (TCO).

Figure 8 illustrates the performance potential of adding a single HPFE to a configuration with
384 HDDs. For this example, the DB z/OS workload was executed against the all-HDD
configuration to establish a base. The DB z/OS workload has volume skew, which provides
hot activity volumes that can be moved to the HPFE (see Appendix A: Workload
Characteristics for a complete description of the workload characteristics for DB z/OS). Once
the HPFE was added to the configuration, a portion of those hot activity volumes, comprising
40% or 55% of the I/O activity, was moved to volumes defined on the single HPFE. The
observed throughput and response time improvements were dramatic, as shown in Figure 8, with
greater improvements seen as more activity was allocated to the HPFE. The results
demonstrate that a hybrid solution with HPFE can deliver remarkable throughput at much lower
response times as well as support application growth over time.


Figure 8: DS8870 Hybrid Configuration: CKD DBz
[Figure 8 chart: "CKD DB z/OS (ZHPF)"; x-axis: K IOPS; y-axis: Response Time (ms); series: 384 HDDs; 384 HDDs + 1 HPFE, HPFE contains 40% of I/O activity; 384 HDDs + 1 HPFE, HPFE contains 55% of I/O activity]

Finally, Figure 9 illustrates the behavior of the response time component that benefits most
when employing flash technology, that is, Disconnect Time. Figure 9 shows the effect on
disconnect time for the same experiment in Figure 8. Disconnect time is significantly reduced
with the addition of one HPFE and is reduced further as the amount of activity to the HPFE
increases.

Figure 9 DS8870 Hybrid Configuration: CKD DBz Disconnect Time


[Figure 9 chart: "CKD DB z/OS (ZHPF)"; x-axis: K IOPS; y-axis: Disconnect Time (ms); series: 384 HDDs; 384 HDDs + 1 HPFE, HPFE contains 40% of I/O activity; 384 HDDs + 1 HPFE, HPFE contains 55% of I/O activity]

4 Easy Tier with High-Performance Flash Enclosure

Easy Tier was introduced in May 2010 and today supports three tiers (SSDs, Enterprise Disks,
and Near Line Disks) as well as Auto Rebalance within a homogeneous tier. The newly
introduced Flash Cards in the High-Performance Flash Enclosure are categorized as the same
tier (tier 0) as SSDs.

Easy Tier optimizes the system performance by automatically moving data to its appropriate tier
according to the I/O activity in a multi-tiered environment and balancing I/O load among arrays
in the same tier. With the help of Easy Tier, the user can readily enjoy the outstanding
performance provided by HPFEs when HPFEs are added to an existing DS8870 with HDDs
and/or SSDs.

In this section, Easy Tier performance with HPFE is studied in a multi-tier environment
and in a single-tier environment with SSDs.

Since the experiment was performed in a lab environment, some Easy Tier default settings were
changed to reduce the duration of the experiment: the Easy Tier short-term decision window
was decreased from the default to 1 hour (single tier) or 2 hours (multi-tier) and the migration
rate was set to the fastest allowed. The workload characteristics were stable over time so the
performance outcome would have been the same if the settings had been kept at the default
values.
4.1 Easy Tier with HPFE in a Multi-tier Environment

Easy Tier performance with HPFE was evaluated in a 2-tier environment of HPFE and 15K
RPM Enterprise Drives. The results were compared with a similar configuration of SSDs and
15K RPM drives. The Easy Tier experiments were conducted with two workloads: the DB2
Brokerage Transactional Workload and an OLTP workload.
4.1.1 DB2 Brokerage Transactional Workload

The DB2 Brokerage Transactional Workload was designed to simulate a class of applications
that facilitate and manage transaction-oriented business processes. These are commonly used
in a broad range of industry segments including finance, retail, and manufacturing. The
application is characterized by highly random disk operations that consist of 80% reads
and 20% writes, with an average transfer size of approximately 4 KB. Given its high random read
content, it was considered a suitable workload for evaluating Easy Tier performance.

The hardware used in these experiments consisted of 300 GB 15K RPM enterprise drives,
400 GB HPFE flash cards and 400 GB SSDs on a DS8870. The DB2 Brokerage Transactional
Workload ran on an IBM POWER7+ host server. More detailed configuration information is
available in Appendix B: Hardware Configurations.

The DB2 Brokerage Transactional Workload was configured to run with a 30-minute ramp-up
period plus additional time to achieve steady state. The workload required about 30
minutes to reach steady state without Easy Tier. After the workload was stable, Easy Tier was
activated. The peak I/O intensities were used to illustrate the Easy Tier performance behavior.

At the beginning of the experiments, the workload ran without Easy Tier. Once Easy Tier was
activated, hot data began to move into the high-performance storage tier (HPFE or SSD) and
the workload throughput increased. As shown in Figure 10, at steady state, the DB2 Brokerage
workload achieved a 3.5x performance improvement with HPFE and a 2.8x improvement with
SSDs compared to the initial application performance without Easy Tier. Easy Tier using HPFE
was able to provide a 20 percent performance improvement over the configuration using SSDs.



Figure 10 DB2 Overall Transaction Rate of a Brokerage Application with Easy Tier

There were also significant response time improvements seen in key trade activities (Table 1).
The overall response time was reduced by 71% with HPFE.


Transaction          15K RPM HDDs (ms)   15K RPM HDDs/HPFE, with Easy Tier (ms)   Reduction (%)
Market Analysis                   7.65                                     2.42             68%
Customer Position                18.31                                    13.71             25%
Lookup Trade                   2005.49                                   503.21             75%
Trade Order                      50.99                                    37.23             27%
Trade Status                     52.08                                    29.16             44%
Trade Update                   2794.97                                   805.38             71%
Trade Results                    63.39                                    45.53             28%
Overall RT                      166.91                                    48.20             71%
Table 1 Response Time for some of the key DB2 Brokerage Transactions
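
The reduction percentages in Table 1 follow directly from the two sets of response times. The
short Python sketch below reproduces them. The response times are copied from Table 1; the
variable and function names are illustrative only.

```python
# Response times in ms, copied from Table 1.
hdd_only = {
    "Market Analysis": 7.65,
    "Customer Position": 18.31,
    "Lookup Trade": 2005.49,
    "Trade Order": 50.99,
    "Trade Status": 52.08,
    "Trade Update": 2794.97,
    "Trade Results": 63.39,
    "Overall RT": 166.91,
}
hdd_hpfe_easy_tier = {
    "Market Analysis": 2.42,
    "Customer Position": 13.71,
    "Lookup Trade": 503.21,
    "Trade Order": 37.23,
    "Trade Status": 29.16,
    "Trade Update": 805.38,
    "Trade Results": 45.53,
    "Overall RT": 48.20,
}


def reduction_pct(before, after):
    """Percentage reduction in response time, rounded to a whole percent."""
    return round((before - after) / before * 100)


reductions = {
    name: reduction_pct(hdd_only[name], hdd_hpfe_easy_tier[name])
    for name in hdd_only
}
for name, pct in reductions.items():
    print(f"{name}: {pct}%")
```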

[Figure 10 chart: "DB2 Brokerage Transactional Workload"; y-axis: DB2 Overall Transaction Rate per Second; bars: 15K RPM HDDs; 15K RPM HDDs/SSDs, with Easy Tier; 15K RPM HDDs/HPFE, with Easy Tier; annotations: 2.8X, 3.5X, 1.2X]

Similar improvements were seen for the corresponding I/O throughput and response time in the
DS8870 (Figure 11).

Figure 11 DS8870 Average throughput and Volume Response Time

4.1.2 Online Transaction Processing workload

The OLTP workload used in the following experiments resembles the typical functions of OLTP
applications. It is characterized by predominantly random I/O operations that consist of 60
percent writes and 40 percent reads, with an average transfer size of approximately 8 KB. The
workload also has very skewed non-uniform access densities which are suitable for evaluation
of Easy Tier.

The following three configurations were used in the experiments:
1. A homogeneous configuration of 192 300 GB 15K RPM HDDs only
2. A two-tier configuration with a combination of 192 300 GB 15K RPM HDDs and 16
400 GB SSDs with Easy Tier enabled
3. A two-tier configuration with a combination of 192 300 GB 15K RPM HDDs and 16
400 GB HPFE Flash Cards with Easy Tier enabled

Easy Tier was designed to move storage capacity with high IOPS (hot data) from a slower
performing tier (HDDs) to a faster performing tier (SSDs or HPFE) and hence improve the
overall storage system performance. As shown in Figure 12, compared with the HDD-only
configuration, Easy Tier with the HDD/SSD configuration was able to provide a significant
decrease in response time and a 4x throughput improvement at the 3 ms response time. Given
the outstanding performance of HPFE, Easy Tier with the HDD/HPFE configuration was able to
further decrease response time and improve maximum throughput, with a 7x and 1.7x
improvement compared to the HDD-only and HDD/SSD configurations respectively at the 3 ms
response time.

[Figure 11 chart: "DS8870 - Average Throughput/Response Time"; axes: Throughput (IOPs) and Volume Response Time (ms); categories: 15K RPM HDDs; 15K RPM HDDs/SSDs, with EasyTier; 15K RPM HDDs/HPFE, with EasyTier; series: Throughput and Response time]

Figure 12 OLTP Workload: Single tier with HDDs and two-tier with HDDs and SSDs/HPFE with
Easy Tier


4.1.3 Easy Tier and Workload Skew

The I/O access density distribution, also referred to as the workload skew or distribution, is a
key attribute for estimating the benefit that Easy Tier can provide in a multi-tier environment.
With higher skew, I/O is concentrated in a smaller portion of the capacity, which allows greater
performance improvements in an SSD/HDD configuration because a smaller percentage of SSD
capacity is required to hold the hot data.

Figure 13 shows the workload skew at the back-end drive level for the DB2 Brokerage
Transactional Workload and the OLTP workload. The OLTP workload has a very high skew,
with more than 80% of its I/O operations concentrated in only 10% of its storage capacity, while
the DB2 Brokerage workload has a lower skew, which takes about 40% of storage capacity to
contain about 80% of its I/O operations.
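
The skew figures quoted above can be derived from per-extent I/O counts. The Python sketch
below shows one way to compute the capacity share needed to hold a given share of the I/O. It
assumes equal-sized extents, and the I/O counts are hypothetical values chosen to mimic the
two skew levels described; they are not measurements from this paper.

```python
def capacity_fraction_for_io(io_counts, io_percent=80):
    """Smallest fraction of (equal-sized) extents that holds io_percent of all I/O.

    Extents are ranked from hottest to coldest, as a skew curve would be.
    """
    ordered = sorted(io_counts, reverse=True)
    total = sum(ordered)
    running = 0
    for n, count in enumerate(ordered, start=1):
        running += count
        # Integer comparison avoids floating-point edge cases at the threshold.
        if running * 100 >= io_percent * total:
            return n / len(ordered)
    return 1.0


# Hypothetical I/O counts for ten equal-sized extents.
high_skew = [820, 40, 30, 25, 20, 20, 15, 10, 10, 10]
low_skew = [300, 200, 150, 150, 50, 50, 30, 30, 20, 20]

print(capacity_fraction_for_io(high_skew))  # share of capacity holding 80% of I/O
print(capacity_fraction_for_io(low_skew))
```

With these illustrative counts, the high-skew case needs 10% of the capacity to hold 80% of
the I/O and the low-skew case needs 40%, mirroring the OLTP and DB2 Brokerage
characterizations given above.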

[Figure 12 chart: "OLTP"; x-axis: I/O Rate (KIOPS); y-axis: Response Time (ms); series: 192 300GB/15K (RAID-5); 192 300GB/15K (RAID-5) + 16 400GB SSDs (RAID-5), with Easy Tier; 192 300GB/15K (RAID-5) + 16 400GB HPFE Flash Cards (RAID-5), with Easy Tier; annotations: 4X @3ms RT; 7X vs HDD only and 1.7X vs HDD/SSD @3ms RT]

Figure 13 Workload Distribution/Skew of DB2 Brokerage and OLTP workload
Given its higher skew and the higher performance capacity of HPFE, the OLTP workload was
able to attain a greater performance improvement with Easy Tier and HPFE than the DB2
Brokerage workload.
4.2 Intra-tier Auto Rebalance between SSDs and HPFE

Easy Tier's Intra-tier Auto Rebalance function automatically balances the I/O load among ranks
in the same extent pool within the same tier. The HPFE flash cards and SSDs are both
considered tier 0 drives, and Easy Tier balances I/O operations among HPFE and SSD
ranks according to their performance capacity when they are in the same extent pool.

In the following experiment, two HPFE ranks and two SSD ranks were in the same extent pool.
The same OLTP workload described in section 4.1.2 was used. At the start of the experiment,
the I/O load was distributed in a stable but skewed state across the ranks. Then the Auto
Rebalance function was enabled and the I/O load was allowed to reach a new stable state.

Figure 14 displays the IOPS on each individual rank observed in the configuration over the
duration of the experiment. The workload was stable at the start of the experiment, but clearly
skewed across the four ranks. After letting the workload run at that stable, skewed rate, the
Auto Rebalance function was enabled. As data was redistributed among the ranks, it was
moved from ranks with higher I/O load to ones with lower I/O load according to each rank's
performance capacity. At the new stable state, there were more IOPS on the HPFE ranks due to
their higher performance capacity. Throughout the experiment, the total host I/O rate remained
constant.

Figure 14 Effect of Auto Rebalance on IOPS distribution on individual ranks in the system
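
The end state described above, where each rank carries load in proportion to its performance
capacity while the total stays constant, can be illustrated with a small Python sketch. This is
not Easy Tier's actual algorithm, which this paper does not describe. The rank names, the
relative capacity weights (3 for HPFE ranks, 1 for SSD ranks) and the total I/O rate are all
hypothetical values chosen for illustration.

```python
def rebalance_targets(total_iops, capacity_weights):
    """Split a constant total I/O rate across ranks in proportion to capacity weight."""
    weight_sum = sum(capacity_weights.values())
    return {
        rank: total_iops * weight / weight_sum
        for rank, weight in capacity_weights.items()
    }


# Hypothetical relative performance capacities for two HPFE and two SSD ranks.
weights = {"hpfe_rank_0": 3, "hpfe_rank_1": 3, "ssd_rank_0": 1, "ssd_rank_1": 1}

targets = rebalance_targets(40_000, weights)
for rank, iops in targets.items():
    print(f"{rank}: {iops:,.0f} IOPS")
```

In this illustration the HPFE ranks end up with more IOPS than the SSD ranks, and the sum
across ranks equals the original total, matching the qualitative behavior reported for
Figure 14.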

The capability of the system was also measured both before Auto Rebalance was enabled and
after the system had stabilized at its new balanced state. The results of those
measurements are shown in Figure 15. As expected, the balanced system was capable of
greater throughput and equal or better response times.

Figure 15 OLTP workload throughput capability of the system before and after Auto Rebalance


[Figure 15 plot, OLTP. X-axis: I/O Rate (KIOPS), 0 to 100. Y-axis: Response Time (ms), 0.0 to 10.0.
Series: 16 400GB HPFE Flash Cards, 16 400GB SSDs, RAID-5, initially skewed; 16 400GB HPFE Flash
Cards, 16 400GB SSDs, RAID-5, after auto-rebalance. Chart annotation: > 40%.]
5 Copy Services with High-Performance Flash Enclosure

Advanced functions such as Copy Services are supported on High-Performance Flash
Enclosures. The following section demonstrates enhanced FlashCopy performance using
HPFEs.

The FlashCopy background copy rate (without I/O) for the DS8870 with HPFE showed near-linear
scaling from 1 DA Pair (1 HPFE) to 8 DA Pairs (8 HPFEs). As shown in Figure 16,
there is a 70% improvement compared to the DS8870 with HDDs.
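
As a worked illustration of the two metrics used above, the sketch below computes scaling efficiency relative to the 1 DA Pair rate and the percentage improvement of one rate over another. The copy rates in the example are hypothetical placeholders, not values read from Figure 16, and the function names are ours.

```python
def scaling_efficiency(rates):
    """Return rate(n) / (n * rate(1)) for each configuration size n.

    A value of 1.0 means perfectly linear scaling from the 1 DA Pair rate.
    """
    base = rates[1]
    return {n: rate / (n * base) for n, rate in rates.items()}


def improvement_pct(new, old):
    """Percentage improvement of `new` over `old`."""
    return (new - old) / old * 100.0


# Hypothetical background copy rates in GB/s, keyed by number of DA Pairs.
hypothetical_rates = {1: 1.5, 2: 3.0, 4: 5.7, 8: 10.8}

eff = scaling_efficiency(hypothetical_rates)
for n, e in eff.items():
    print(f"{n} DA Pair(s): {e:.0%} of linear")
```

Efficiencies close to 100% at every size are what "near-linear" means here; `improvement_pct` expresses a comparison such as HPFE versus HDD as a single percentage.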

Figure 16 FlashCopy Background Copy Rate (without I/O)

[Figure 16 chart, Background Copy. X-axis: 1 DA-Pair, 2 DA-Pairs, 4 DA-Pairs, 8 DA-Pairs.
Y-axis: GB/s, 0 to 12. Series: DS8870 P7+ w/HPFE; DS8870 P7 w/HDD.]
The following results show the performance of various I/O workloads with Standard and Space
Efficient FlashCopy (Track Space Efficient FlashCopy) with the No-copy option.

Figure 17 shows the performance of the Database Open (DBO) workload while sustaining 60% of
the maximum throughput for each of the configurations below. The DS8870 with HPFE provides
much higher throughput and lower response time with either Standard FlashCopy or Space
Efficient FlashCopy. The FlashCopy results for HDDs were limited by the drives themselves, while
those for the HPFE were limited by the DS8870 processors.

Figure 17 FlashCopy with No-copy option with DBO: at 60% of maximum throughput of each
configuration

[Figure 17 chart, DBO. X-axis: No FlashCopy, Standard FlashCopy, Space Efficient FlashCopy.
Y-axis: Throughput (KIOPs), 0 to 450. Series: DS8870 P7+ - 8 Bluehawks Throughput; DS8870 P7 -
8 DA Pairs w/HDDs Throughput; Response time. Response time labels, in extraction order:
0.3 ms, 3.5 ms, 3.4 ms, 15.7 ms, 13.2 ms, 17.3 ms.]
The performance of the Database Open (DBO) workload at an equivalent throughput (with no
FlashCopy) for each of the configurations is shown in Figure 18. The DS8870 with HPFE was
not processor limited at this throughput level. With the exceptional performance capability of
HPFE, the DS8870 with HPFE shows little or no impact on throughput or response
time with either Standard FlashCopy or Space Efficient FlashCopy.

Figure 18 FlashCopy with No-copy option with DBO: equivalent throughput of each configuration

[Figure 18 chart, DBO. X-axis: No FlashCopy, Standard FlashCopy, Space Efficient FlashCopy.
Y-axis: Throughput (KIOPs), 0 to 90. Series: DS8870 P7+ - 8 Bluehawks Throughput; DS8870 P7 -
8 DA Pairs w/HDDs Throughput; Response time. Response time labels, in extraction order:
0.3 ms, 0.3 ms, 0.3 ms, 3.5 ms, 13.2 ms, 17.33 ms.]
With a sequential write workload, the DS8870 with HPFE provides much higher throughput with
either Standard FlashCopy or Space Efficient FlashCopy.

Figure 19 FlashCopy with No-copy option with Sequential Write

[Figure 19 chart, 64 kB Sequential Write. X-axis: No FlashCopy, Standard FlashCopy, Space
Efficient FlashCopy. Y-axis: GB/s, 0.0 to 12.0. Series: DS8870 P7+ - 8 HPFEs; DS8870 P7 -
8 DA Pairs w/HDDs.]
6 Conclusion

The integration of the High-Performance Flash Enclosure with the latest generation of the
DS8870 provides dramatic improvements in system performance. This new offering delivers
the exceptional sub-millisecond response time to which administrators employing flash-based
DS8870 solutions have become accustomed while simultaneously making breakthrough gains
in overall throughput. This was a common theme in the diverse set of experiments detailed in
this paper, which demonstrate the HPFE integration with the DS8870 as the new leading-edge
enterprise flash technology. Lab experiments demonstrated that the HPFE, while maintaining
this outstanding low response time, is able to provide up to a 4x random I/O throughput
improvement over traditional SSDs.

The addition of the HPFE also provides tremendous benefits to many of the existing advanced
features offered by the DS8870 including Copy Services and Easy Tier. Furthermore, with Easy
Tier support for HPFE, these performance gains can be realized in existing DS8870
environments through a non-disruptive addition of the enclosures without any tuning required.

Overall, the improvements attained by the incorporation of the High-Performance Flash
Enclosures result in a smaller storage footprint needed to meet a given set of performance
requirements - in some cases as much as half. The smaller storage footprint in turn results in
substantial energy savings and a significantly lower total cost of ownership.

7 References


[1] La Frese, L., Hossain, K., Hyde, J., Lin, A., McNutt, B., Sansone, C., Sutton, L., Xu, Y.,
and Zhang, Y. IBM System Storage DS8700 Performance with Easy Tier. May 2010.

[2] Clayton, N., Hossain, K., La Frese, L., Martin, J., McNutt, B., and Xu, Y. IBM System
Storage DS8700 and DS8800 Performance with Easy Tier 2nd Generation. July 2011.

[3] Clayton, N., Hossain, K., La Frese, L., Martin, J., McNutt, B., and Xu, Y. IBM System
Storage DS8800 and DS8700 Performance with Easy Tier 3rd Generation. November 2011.

[4] Hossain, K., Jarvis, T. C., Martin, J., Valverde, D., Varela, W., Whitworth, D., Williams,
S., and Xu, Y. IBM System Storage DS8870 Performance Whitepaper. June 2014.

[5] Dufrasne, B., Brandenburg, J., Cook, J., Lepine, J., Manthorpe, S., and Sallam, M. IBM
DS8870 - High-Performance Flash Enclosure: IBM Redbooks Product Guide. May 2014.

Appendices
Appendix A: Workload Characteristics
Read Hit (RH): 100% Random read requests to cache. A "read hit" test issues read
requests repeatedly to a small group of blocks or records. The number of affected blocks
is small enough to ensure that the entire set can be retained in cache at the same time.
Hence, all requests in the read hit test are serviced out of cache. Read Hit tests
generally give the highest I/O rate for a storage system. These types of workloads are
for engineering purposes and are not typical of customer environments.
Read Miss (RM): 100% Random read requests to disk. A "read miss" test issues read
requests at random across a storage area much larger than the available cache size.
This test is designed in such a way that the probability of finding the requested data in
cache is nearly zero. Read Miss tests usually serve engineering purposes and are not
typical of customer environments.
Write Hit (WH): 100% Random write requests to cache. A "write hit" test issues write
requests repeatedly to a small group of blocks or records. The number of written blocks
is small enough to ensure that the entire set can be retained in cache at the same time.
It is possible that the controller may defer all destaging until after the completion of a
"write hit" test. This allows throughput on the front end to be isolated and benchmarked.
These types of workloads are for engineering purposes and are not typical of customer
environments.
Write Miss (WM): 100% Random write requests to disk. A "write miss" test issues write
requests at random across a storage area much larger than the available cache size.
This test is designed in such a way that the probability of writing a block a second time,
before that block has been destaged from cache, is almost zero. For this reason, the
number of destage operations is approximately equal to the number of writes.
Open Workloads
70/30/50: An open workload that is similar to typical OLTP applications. It is
characterized by 70% reads, 30% writes, a 50% read hit ratio, an approximate destage
rate of 17% of all I/O operations and a 4 KB block transfer size. It is also known as DB
Open or DBO.
50/50/50: An open workload that is similar to very write-intensive OLTP applications. It is
characterized by 50% reads, 50% writes, a 50% read hit ratio, an approximate destage
rate of 17% of all I/O operations and a 4 KB block transfer size.
Sequential: Open Sequential workloads provide for reading or writing data records in
sequential order, one after the other. They are either 100% reads or writes using 64 KB
block data transfers to disk, similar to data warehouse scan/load operations. 256 KB and
1 MB large transfer block sizes have also been used, similar to video imaging
operations.
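
The open workload parameters above imply a rough back-end load per host I/O, which the following sketch works out. It assumes that the read hit ratio applies to reads only, that every read miss costs one back-end read, and that each destage counts as one back-end operation; RAID write penalties are deliberately ignored. The class and field names are illustrative and do not come from any IBM tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OpenWorkload:
    name: str
    read_fraction: float     # fraction of host I/Os that are reads
    read_hit_ratio: float    # fraction of reads served from cache
    destage_fraction: float  # destages as a fraction of all host I/Os

    def backend_ops_per_host_io(self) -> float:
        """Read misses plus destages, per host I/O (RAID penalty ignored)."""
        read_misses = self.read_fraction * (1.0 - self.read_hit_ratio)
        return read_misses + self.destage_fraction


# Parameters taken from the workload definitions above.
DBO = OpenWorkload("70/30/50", 0.70, 0.50, 0.17)
WRITE_HEAVY = OpenWorkload("50/50/50", 0.50, 0.50, 0.17)

for workload in (DBO, WRITE_HEAVY):
    ratio = workload.backend_ops_per_host_io()
    print(f"{workload.name}: about {ratio:.2f} back-end ops per host I/O")
```

Multiplying the ratio by a host I/O rate gives a first-order estimate of the logical load that reaches the drives under these assumptions.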
System z workloads
DB z/OS: DB z/OS (formerly known as Cache Standard) is a System z workload that
simulates a typical OLTP environment on the mainframe. It is characterized by 75%
reads, 25% writes, a 4 KB block transfer size and skewed I/O rates to different volumes.
DB z/OS has a cache read hit ratio that varies with the configuration's cache-to-backstore
ratio, but a frequently used value is 72%. The destage rate is not constant, but
common values are between 14% and 17% of all I/Os.
Cache Hostile: This workload is characterized by 67% reads, 33% writes, skewed I/O
and a 4 KB block transfer size. It has a write destage rate of 50% and a destage rate of
18.3% of all I/Os. The cache read hit ratio is adjustable depending on testing
requirements and the cache/backstore ratio.
Cache Friendly: This workload is characterized by 83% reads, 17% writes, skewed I/O
and a 4 KB block transfer size. It has a write destage rate of 50% and a destage rate of
7.5% of all I/Os. The cache read hit ratio is adjustable depending on testing
requirements and the cache/backstore ratio, but generally uses a value of 83%.
Sequential: These workloads are similar to typical batch processing. 100% Read or
100% Write, with large sized transfers in a sequential access pattern to and from disk.
Appendix B: Hardware Configurations

DS8870 Hardware Configurations
Configuration for Open Systems Measurements
HDD Configuration
o DS8870 P7+ with 16 CPU cores and 512 GB cache
o 1536 x 146 GB 15K RPM HDDs, RAID-10 with 8 DA Pairs
o 16 Host Adapters
o 32 x 8 Gb FC connections
All Flash SSD configuration
o DS8870 P7+ Turbo with 16 CPU cores and 1 TB cache
o 256 x 400 GB SSDs with 8 DA Pairs, RAID-5
o 16 Host Adapters
o 32 x 8 Gb FC connections
All Flash HPFE configuration
o DS8870 P7+ Turbo with 16 CPU cores and 1 TB cache
o 8 HPFEs with 240 x 400 GB flash cards, RAID-5
o 16 Host Adapters
o 32 x 8 Gb FC connections
Note: the tests with 1 to 4 arrays used a subset of the hardware, as described in the sections
above.

All workloads used the following host configuration:
IBM Power 780 host (AIX 7.1.2) with 32 x 8 Gb Fibre Channel connections.

Configuration for System z Measurements
DS8870 P7+ with 16 CPU cores and 512 GB cache
384 x 146 GB 15K RPM drives, RAID-5 with 8 DA Pairs
1 HPFE with 30 x 400 GB flash cards, RAID-5
16 Host Adapters
32 x 8 Gb FC connections
Host workloads were run on a System z 2827 (EC12) with 32 x 8 Gb Fibre Channel connections.
Configuration for FlashCopy Measurements
HDD Configuration for background copy with no I/O:
o DS8870 P7 with 16 cores and 512 GB cache
o Source and target volumes were spread across 768 x 146 GB 15K RPM drives across 8
DAs with RAID-5.
HDD Configuration for FlashCopy with I/O:
o DS8870 P7 with 8 cores and 256 GB cache
o Source and target volumes were spread across 384 x 146 GB 15K RPM drives across 8
DAs with RAID-5.
o 16 Host Adapters
o 32 x 8 Gb FC connections

HPFE Configuration:
o DS8870 P7+ Turbo with 16 cores and 1 TB cache
o Source and target volumes were spread across 240 x 400 GB Flash Cards across 8
HPFEs with RAID-5.
o 16 Host Adapters
o 32 x 8 Gb FC connections

All workloads used the following host configuration:
IBM Power 780 host (AIX 7.1.2) with 32 x 8 Gb Fibre Channel connections.
Configuration for DB2 Brokerage Transactional Workload with Easy Tier
DS8870 Configuration
o 961 8-Core 256 GB Cache
o 144 x 300 GB 15K RPM drives across 3 DA pairs, configured as RAID-5
o SSDs: 16 x 400 GB SSDs on a separate DA pair, configured as RAID-5
o HPFE: 16 x 400 GB Flash Cards in one HPFE, configured as RAID-5
o 16 x 8 Gb HA ports across 8 HAs
DB2 Configuration
o DB2 9.7 FP1, 4 Instances, 4 x 2 TB DBs, 4 Buffer Pools at 54 GB each
o 8 x 1.5 TB volumes were allocated for database, temp files and data generation
(4 x 1.5 TB volumes were used)
o 4 x 100 GB volumes were allocated for log files
Server Configuration
o P770+ (AIX 7.1), 8 x Eight Core P7 (3 GHz)
o 1024 GB Cache
o 16 x 8 Gb FC Ports
Switch Configuration
o 2 x Brocade switches, 40 x 8 Gb ports each

Configuration for OLTP Workload with Easy Tier
DS8870 Configuration
o 961 8-Core 256 GB Cache
o 192 x 300 GB / 15K RPM drives across 4 DA pairs, configured as RAID-5
o SSDs: 16 x 400 GB SSDs on a separate DA pair, configured as RAID-5
o HPFE: 16 x 400 GB Flash Cards in one HPFE, configured as RAID-5
o 16 x 8 Gb HA ports across 8 HAs
Server Configuration
o P780 (AIX 7.1.2.0), 8 x Eight Core P7+ (4.4 GHz)
o 512 GB Cache
o 16 x 8 Gb FC Ports

Appendix C: Definitions and Methodologies

Open system: Sometimes referred to as distributed systems, often attached to an
AIX/UNIX host or Microsoft Windows server, and uses the Fixed Block data format.
System z: Attached to a z/OS host and uses the CKD data format.
SCSI: Small Computer System Interface. A set of standards for physically connecting
and transferring data between computers and peripheral devices.
IOPS: input/output operations per second.
RAID-5: A popular RAID implementation that optimizes cost effective performance while
emphasizing use of available capacity through data striping. RAID-5 provides fault
tolerance for one failed disk drive. This scheme uses XOR parity for redundancy. Data is
striped across all drives in the array and parity is distributed across all the drives.
RAID-10: Combines two schemes: RAID-0 (data striping) and RAID-1 (mirroring).
Volume data is striped across several drives and the first set of disk drives is mirrored to
an identical set. Since redundancy is achieved through mirroring, there is no parity in
RAID-10. RAID-10 optimizes high performance while maintaining fault tolerance for disk
drive failures. It can tolerate at least one, and in most cases, multiple disk failures.
FlashCopy: Uses normal volumes as target volumes for FlashCopy. These target
volumes are the same size as (or larger than) their corresponding source volumes.
FlashCopy SE (a.k.a. Space Efficient FlashCopy or SEFC): Uses volumes formatted for
SEFC as the target volumes for FlashCopy. These volumes, known as Space Efficient
volumes, have a virtual size equal to the source volume size. However, physical space is
not allocated for Space Efficient volumes when the volumes are created and the
FlashCopy initiated. Instead, space is allocated in a Repository when the first update is
made to original tracks on the source volumes and the tracks are copied to the SEFC
target volume. Writes to the SEFC target will also consume Repository space. Space
Efficient FlashCopy can be a cost-effective method for replicating data locally.
Response Time: It is the end-to-end time that an I/O takes at the application level.
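
The XOR parity scheme mentioned in the RAID-5 definition above can be illustrated with a short, generic sketch. It shows only the parity arithmetic: parity is the byte-wise XOR of the data blocks in a stripe, and any single missing block equals the XOR of the surviving blocks and the parity. It does not model the DS8870 implementation, parity rotation or stripe layout, and the block contents are arbitrary example data.

```python
from functools import reduce


def xor_blocks(blocks):
    """Return the byte-wise XOR of equally sized blocks."""
    return bytes(
        reduce(lambda a, b: a ^ b, column) for column in zip(*blocks)
    )


# Three data blocks of one stripe (arbitrary example contents).
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# Simulate losing the second block and rebuild it from the survivors.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)

print("rebuilt block matches original:", rebuilt == data[1])
```

Because XOR is its own inverse, the same function serves both to compute parity and to rebuild a lost block, which is why RAID-5 tolerates exactly one failed drive per array.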
