
Front cover

In-memory Computing
with SAP HANA on
IBM eX5 Systems
IBM Systems solution
for SAP HANA
SAP HANA overview
and use cases
Operational aspects for
SAP HANA appliances

Gereon Vey
Martin Bachmaier
Ilya Krutov

ibm.com/redbooks

International Technical Support Organization


In-memory Computing with SAP HANA on IBM eX5
Systems
August 2013

SG24-8086-01

Note: Before using this information and the product it supports, read the information in
"Notices" on page vii.

Second Edition (August 2013)


This edition applies to IBM Systems solution for SAP HANA, an appliance that is based on
IBM System eX5 servers and the SAP HANA offering.
Copyright International Business Machines Corporation 2013. All rights reserved.
Note to U.S. Government Users Restricted Rights -- Use, duplication or disclosure restricted by GSA ADP
Schedule Contract with IBM Corp.

Contents
Notices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . vii
Trademarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . viii
Preface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ix
Authors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . x
Now you can become a published author, too! . . . . . . . . . . . . . . . . . . . . . . . . . xi
Comments welcome. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xii
Stay connected to IBM Redbooks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xii
Chapter 1. History of in-memory computing at SAP . . . . . . . . . . . . . . . . . . 1
1.1 SAP Search and Classification (TREX). . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.2 SAP liveCache . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.3 SAP NetWeaver Business Warehouse Accelerator . . . . . . . . . . . . . . . . . . 3
1.3.1 SAP BusinessObjects Explorer Accelerated . . . . . . . . . . . . . . . . . . . . 5
1.3.2 SAP BusinessObjects Accelerator . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.4 SAP HANA . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
Chapter 2. Basic concepts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
2.1 Keeping data in-memory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
2.1.1 Using main memory as the data store . . . . . . . . . . . . . . . . . . . . . . . 10
2.1.2 Data persistence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
2.2 Minimizing data movement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
2.2.1 Compression . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
2.2.2 Columnar storage . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
2.2.3 Pushing application logic to the database . . . . . . . . . . . . . . . . . . . . . 19
2.3 Divide and conquer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
2.3.1 Parallelization on multi-core systems . . . . . . . . . . . . . . . . . . . . . . . . 20
2.3.2 Data partitioning and scale-out . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
Chapter 3. SAP HANA overview. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
3.1 SAP HANA overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
3.1.1 SAP HANA architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
3.1.2 SAP HANA appliance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
3.2 SAP HANA delivery model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
3.3 Sizing SAP HANA . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
3.3.1 Concept of T-shirt sizes for SAP HANA . . . . . . . . . . . . . . . . . . . . . . 26
3.3.2 Sizing approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
3.4 SAP HANA software licensing. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35


Chapter 4. Software components and replication methods . . . . . . . . . . . 37


4.1 SAP HANA software components . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
4.1.1 SAP HANA database . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
4.1.2 SAP HANA client. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
4.1.3 SAP HANA studio . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
4.1.4 SAP HANA studio repository. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
4.1.5 SAP HANA landscape management structure . . . . . . . . . . . . . . . . . 49
4.1.6 SAP host agent . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
4.1.7 Software Update Manager for SAP HANA . . . . . . . . . . . . . . . . . . . . 50
4.1.8 SAP HANA Unified Installer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
4.1.9 Solution Manager Diagnostics agent . . . . . . . . . . . . . . . . . . . . . . . . 53
4.1.10 SAP HANA On-Site Configuration Tool . . . . . . . . . . . . . . . . . . . . . 54
4.2 Data replication methods for SAP HANA . . . . . . . . . . . . . . . . . . . . . . . . . 54
4.2.1 Trigger-based replication with SAP Landscape Transformation . . . . 55
4.2.2 ETL-based replication with SAP BusinessObjects Data Services . . 57
4.2.3 Extractor-based replication with Direct Extractor Connection . . . . . . 57
4.2.4 Log-based replication with Sybase Replication Server . . . . . . . . . . . 58
4.2.5 Comparing the replication methods . . . . . . . . . . . . . . . . . . . . . . . . . 59
Chapter 5. SAP HANA use cases and integration scenarios . . . . . . . . . . 61
5.1 Basic use case scenarios . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
5.2 SAP HANA as a technology platform . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
5.2.1 SAP HANA data acquisition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
5.2.2 SAP HANA as a source for other applications . . . . . . . . . . . . . . . . . 67
5.3 SAP HANA for operational reporting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
5.4 SAP HANA as an accelerator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
5.5 SAP products running on SAP HANA . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
5.5.1 SAP NetWeaver Business Warehouse powered by SAP HANA . . . 75
5.5.2 Migrating SAP NetWeaver Business Warehouse to SAP HANA . . . 80
5.5.3 SAP Business Suite powered by SAP HANA . . . . . . . . . . . . . . . . . . 85
5.6 Programming techniques using SAP HANA . . . . . . . . . . . . . . . . . . . . . . . 87
Chapter 6. IBM Systems solution for SAP HANA . . . . . . . . . . . . . . . . . . . . 89
6.1 IBM eX5 Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
6.1.1 IBM System x3850 X5 and x3950 X5 . . . . . . . . . . . . . . . . . . . . . . . . 90
6.1.2 IBM System x3690 X5. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92
6.1.3 Intel Xeon processor E7 family . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
6.1.4 Memory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98
6.1.5 Flash technology storage . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
6.1.6 Integrated virtualization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104
6.2 IBM General Parallel File System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106
6.2.1 Common GPFS features . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107
6.2.2 GPFS extensions for shared-nothing architectures . . . . . . . . . . . . 108


6.3 Custom server models for SAP HANA . . . . . . . . . . . . . . . . . . . . . . . . . . 110


6.3.1 IBM System x workload-optimized models for SAP HANA . . . . . . . 110
6.3.2 SAP HANA T-shirt sizes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113
6.3.3 Scale-up . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115
6.4 Scale-out solution for SAP HANA . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116
6.4.1 Scale-out solution without high-availability capabilities . . . . . . . . . . 118
6.4.2 Scale-out solution with high-availability capabilities . . . . . . . . . . . . 120
6.4.3 Networking architecture for the scale-out solution . . . . . . . . . . . . . 125
6.4.4 Hardware and software additions required for scale-out. . . . . . . . . 128
6.5 Disaster recovery solutions for SAP HANA. . . . . . . . . . . . . . . . . . . . . . . 129
6.5.1 DR using synchronous SAP HANA System Replication . . . . . . . . . 131
6.5.2 DR using asynchronous SAP HANA System Replication . . . . . . . . 132
6.5.3 DR using GPFS based synchronous replication . . . . . . . . . . . . . . . 132
6.5.4 DR using backup and restore . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143
6.6 Business continuity for single-node SAP HANA installations . . . . . . . . . 145
6.7 SAP HANA on VMware vSphere. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 148
6.8 SAP HANA on IBM SmartCloud . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 152
6.9 IBM Systems solution with SAP Discovery system . . . . . . . . . . . . . . . . . 153
Chapter 7. SAP HANA operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 157
7.1 Installation services . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 158
7.2 IBM SAP HANA Operations Guide . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 158
7.3 Interoperability with other platforms . . . . . . . . . . . . . . . . . . . . . . . . . . . . 160
7.4 Backing up and restoring data for SAP HANA . . . . . . . . . . . . . . . . . . . . 160
7.4.1 Basic backup and recovery . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 161
7.4.2 File-based backup tool integration . . . . . . . . . . . . . . . . . . . . . . . . . 163
7.4.3 Backup tool integration with Backint for SAP HANA . . . . . . . . . . . . 163
7.4.4 IBM Tivoli Storage Manager for ERP 6.4 . . . . . . . . . . . . . . . . . . . . 165
7.4.5 Symantec NetBackup 7.5 for SAP HANA . . . . . . . . . . . . . . . . . . . . 166
7.5 Monitoring SAP HANA . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 166
7.6 Sharing an SAP HANA system . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 167
7.7 Installing additional agents . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 168
7.8 Software and firmware levels . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 169
7.9 Support process . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 170
7.9.1 IBM SAP integrated support . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 171
7.9.2 IBM SAP International Competence Center InfoService. . . . . . . . . 172
Chapter 8. Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 173
8.1 Benefits of in-memory computing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 174
8.2 SAP HANA: An innovative analytic appliance . . . . . . . . . . . . . . . . . . . . . 174
8.3 IBM Systems solution for SAP HANA . . . . . . . . . . . . . . . . . . . . . . . . . . . 175
8.3.1 Workload Optimized Solution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 176
8.3.2 Leading performance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 176


8.3.3 IBM GPFS enhancing performance, scalability, and reliability . . . . 177


8.3.4 Scalability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 177
8.3.5 Services to speed deployment . . . . . . . . . . . . . . . . . . . . . . . . . . . . 178
8.4 Going beyond infrastructure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 178
8.4.1 A trusted service partner . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 178
8.4.2 IBM and SAP team for long-term business innovation . . . . . . . . . . 182
Appendix A. Appendix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 185
A.1 GPFS license information . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 186
A.2 File-based backup with IBM TSM for ERP . . . . . . . . . . . . . . . . . . . . . . . 188
Abbreviations and acronyms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 195
Related publications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 197
IBM Redbooks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 197
Other publications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 197
Online resources . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 197
Help from IBM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 198


Notices
This information was developed for products and services offered in the U.S.A.
IBM may not offer the products, services, or features discussed in this document in other countries. Consult your
local IBM representative for information on the products and services currently available in your area. Any
reference to an IBM product, program, or service is not intended to state or imply that only that IBM product,
program, or service may be used. Any functionally equivalent product, program, or service that does not infringe
any IBM intellectual property right may be used instead. However, it is the user's responsibility to evaluate and
verify the operation of any non-IBM product, program, or service.
IBM may have patents or pending patent applications covering subject matter described in this document. The
furnishing of this document does not grant you any license to these patents. You can send license inquiries, in
writing, to:
IBM Director of Licensing, IBM Corporation, North Castle Drive, Armonk, NY 10504-1785 U.S.A.
The following paragraph does not apply to the United Kingdom or any other country where such
provisions are inconsistent with local law: INTERNATIONAL BUSINESS MACHINES CORPORATION
PROVIDES THIS PUBLICATION "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESS OR
IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF NON-INFRINGEMENT,
MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Some states do not allow disclaimer of
express or implied warranties in certain transactions, therefore, this statement may not apply to you.
This information could include technical inaccuracies or typographical errors. Changes are periodically made to the
information herein; these changes will be incorporated in new editions of the publication. IBM may make
improvements and/or changes in the product(s) and/or the program(s) described in this publication at any time
without notice.
Any references in this information to non-IBM websites are provided for convenience only and do not in any
manner serve as an endorsement of those websites. The materials at those websites are not part of the materials
for this IBM product and use of those websites is at your own risk.
IBM may use or distribute any of the information you supply in any way it believes appropriate without incurring any
obligation to you.
Any performance data contained herein was determined in a controlled environment. Therefore, the results
obtained in other operating environments may vary significantly. Some measurements may have been made on
development-level systems and there is no guarantee that these measurements will be the same on generally
available systems. Furthermore, some measurements may have been estimated through extrapolation. Actual
results may vary. Users of this document should verify the applicable data for their specific environment.
Information concerning non-IBM products was obtained from the suppliers of those products, their published
announcements or other publicly available sources. IBM has not tested those products and cannot confirm the
accuracy of performance, compatibility or any other claims related to non-IBM products. Questions on the
capabilities of non-IBM products should be addressed to the suppliers of those products.
This information contains examples of data and reports used in daily business operations. To illustrate them as
completely as possible, the examples include the names of individuals, companies, brands, and products. All of
these names are fictitious and any similarity to the names and addresses used by an actual business enterprise is
entirely coincidental.
COPYRIGHT LICENSE:
This information contains sample application programs in source language, which illustrate programming
techniques on various operating platforms. You may copy, modify, and distribute these sample programs in any
form without payment to IBM, for the purposes of developing, using, marketing or distributing application programs
conforming to the application programming interface for the operating platform for which the sample programs are
written. These examples have not been thoroughly tested under all conditions. IBM, therefore, cannot guarantee or
imply reliability, serviceability, or function of these programs. You may copy, modify, and distribute these sample
programs in any form without payment to IBM for the purposes of developing, using, marketing, or distributing
application programs conforming to IBM's application programming interfaces.


Trademarks
IBM, the IBM logo, and ibm.com are trademarks or registered trademarks of International Business
Machines Corporation in the United States, other countries, or both. These and other IBM trademarked
terms are marked on their first occurrence in this information with the appropriate symbol (® or ™),
indicating US registered or common law trademarks owned by IBM at the time this information was
published. Such trademarks may also be registered or common law trademarks in other countries. A current
list of IBM trademarks is available on the Web at http://www.ibm.com/legal/copytrade.shtml
The following terms are trademarks of the International Business Machines Corporation in the United States,
other countries, or both:
AIX
BladeCenter
DB2
Global Business Services
Global Technology Services
GPFS
IBM
IBM SmartCloud
Intelligent Cluster
Passport Advantage
POWER
PureFlex
RackSwitch
Redbooks
Redbooks (logo)
Redpaper
System Storage
System x
System z
Tivoli
z/OS

The following terms are trademarks of other companies:

Evolution, and Kenexa device are trademarks or registered trademarks of
Kenexa, an IBM Company.

Intel Xeon, Intel, Itanium, Intel logo, Intel Inside logo, and Intel Centrino logo
are trademarks or registered trademarks of Intel Corporation or its subsidiaries
in the United States and other countries.

Linux is a trademark of Linus Torvalds in the United States, other countries, or
both.

Microsoft, Windows, and the Windows logo are trademarks of Microsoft
Corporation in the United States, other countries, or both.

Java, and all Java-based trademarks and logos are trademarks or registered
trademarks of Oracle and/or its affiliates.

Other company, product, or service names may be trademarks or service marks of others.


Preface
The second edition of this IBM Redbooks publication describes in-memory
computing appliances from IBM and SAP that are based on IBM eX5 flagship
systems and SAP HANA. We cover the history and basic principles of in-memory
computing and describe the SAP HANA solution with its use cases and the
corresponding IBM eX5 hardware offerings.
We also describe the architecture and components of IBM Systems solution for
SAP HANA, with IBM General Parallel File System (GPFS) as a cornerstone.
The SAP HANA operational disciplines are explained in detail: Scalability
options, backup and restore, high availability and disaster recovery, as well as
virtualization possibilities for SAP HANA appliances.
The following topics are covered:

History and basic principles of in-memory computing
SAP HANA overview
Software components and replication methods
SAP HANA use cases and integration scenarios
The IBM Systems solution for SAP HANA
SAP HANA operations
Benefits of using the IBM infrastructure for SAP HANA

This book is intended for SAP administrators and technical solution architects. It
is also for IBM Business Partners and IBM employees who want to know more
about the SAP HANA offering and other available IBM solutions for SAP clients.


Authors
This book was produced by a team of specialists from around the world working
at the IBM International Technical Support Organization (ITSO), Raleigh Center.

Gereon Vey has been a member of the IBM System x Team at the IBM SAP
International Competence Center (ISICC) in Walldorf, Germany, since 2004. He
is the Global Subject Matter Expert for SAP appliances, such as SAP NetWeaver
BW Accelerator and SAP HANA, at the ISICC, and is part of the team
developing the IBM Systems solution for SAP HANA. His other activities include
maintaining sizing guidelines and capacity data for System x servers and
pre-sales support for IBM worldwide. He has worked in the IT industry since
1992. He graduated with a degree in Computer Science from the University of
Applied Sciences in Worms, Germany, in 1999.
Martin Bachmaier is an IT Versatilist in the IBM
hardware development lab in Boeblingen, Germany.
He currently is part of the team developing the IBM
Systems solution for SAP HANA. Martin has a deep
background in designing, implementing, and managing
scale-out data centers, HPC clusters, and cloud
environments, and has worked with GPFS for seven
years. He gives university lectures, and likes to push IT
limits. Martin is an IBM Certified Systems Expert. He
holds the CCNA, CCNA Security, and VMware
Certified Professional credentials and has authored
several books and papers.
Ilya Krutov is a Project Leader at the ITSO Center in
Raleigh and has been with IBM since 1998. Before
joining the ITSO, Ilya served in IBM as a Run Rate
Team Leader, Portfolio Manager, Brand Manager,
Technical Sales Specialist, and Certified Instructor. Ilya
has expertise in IBM System x, BladeCenter and
PureFlex System products, server operating
systems, and networking solutions. He has authored
over 130 books, papers, and product guides. He has a
Bachelor degree in Computer Engineering from the
Moscow Engineering and Physics Institute.


Thanks to the authors of the previous edition of this book:


Gereon Vey
Tomas Krojzl
Ilya Krutov
Thanks to the following people for their contributions to this project:
From the International Technical Support Organization, Raleigh Center:

Kevin Barnes
Tamikia Barrow
Mary Comianos
Shari Deiana
Cheryl Gera
Linda Robinson
David Watts

From IBM:

Irene Hopf
Dr. Oliver Rettig
Sasanka Vemuri
Tag Robertson
Thomas Prause
Volker Fischer

Now you can become a published author, too!


Here's an opportunity to spotlight your skills, grow your career, and become a
published author, all at the same time! Join an ITSO residency project and help
write a book in your area of expertise, while honing your experience using
leading-edge technologies. Your efforts will help to increase product acceptance
and customer satisfaction, as you expand your network of technical contacts and
relationships. Residencies run from two to six weeks in length, and you can
participate either in person or as a remote resident working from your home
base.
Find out more about the residency program, browse the residency index, and
apply online at:
ibm.com/redbooks/residencies.html


Comments welcome
Your comments are important to us!
We want our books to be as helpful as possible. Send us your comments about
this book or other IBM Redbooks publications in one of the following ways:
Use the online Contact us review Redbooks form found at:
ibm.com/redbooks
Send your comments in an email to:
redbooks@us.ibm.com
Mail your comments to:
IBM Corporation, International Technical Support Organization
Dept. HYTD Mail Station P099
2455 South Road
Poughkeepsie, NY 12601-5400

Stay connected to IBM Redbooks


Find us on Facebook:
http://www.facebook.com/IBMRedbooks
Follow us on Twitter:
http://twitter.com/ibmredbooks
Look for us on LinkedIn:
http://www.linkedin.com/groups?home=&gid=2130806
Explore new Redbooks publications, residencies, and workshops with the
IBM Redbooks weekly newsletter:
https://www.redbooks.ibm.com/Redbooks.nsf/subscribe?OpenForm
Stay current on recent Redbooks publications with RSS Feeds:
http://www.redbooks.ibm.com/rss.html


Chapter 1. History of in-memory computing at SAP
In-memory computing has a long history at SAP. This chapter provides a short
overview of the history of SAP in-memory computing. It describes the evolution
of SAP in-memory computing and gives an overview of SAP products involved in
this process:

1.1, "SAP Search and Classification (TREX)" on page 2
1.2, "SAP liveCache" on page 2
1.3, "SAP NetWeaver Business Warehouse Accelerator" on page 3
1.4, "SAP HANA" on page 6


1.1 SAP Search and Classification (TREX)


SAP first made in-memory computing available in product form with the
introduction of SAP Search and Classification, better known as Text Retrieval
and Information Extraction (TREX). TREX is a search engine for both structured
and unstructured data. It provides SAP applications with numerous services for
searching and classifying large collections of documents (unstructured data) and
for searching and aggregating business data (structured data).
TREX offers a flexible architecture that enables a distributed installation, which
can be modified to match various requirements. A minimal system consists of a
single host that provides all TREX functions. Starting with a single-host system,
you can extend TREX to be a distributed system and thus increase its capacity.
TREX stores its data, usually referred to as indexes, not in the way traditional
databases do, but merely as flat files in a file system. For a distributed system,
the file system must be a clustered or shared file system, which presents all files
to all nodes of the distributed system.
For performance reasons, TREX indexes are loaded into working memory.
Indexes for structured data are implemented compactly using data compression,
and the data can be aggregated in linear time to enable large volumes of data to
be processed entirely in memory.
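The combination of compact storage and linear-time aggregation can be
sketched as follows. This is an illustrative sketch only, using simple dictionary
encoding, a common compression technique for structured attributes; it is not
TREX's actual index format.

```python
# Illustrative sketch of dictionary encoding for a structured attribute
# (an assumption for illustration, not TREX's actual index format).
from collections import Counter

values = ["DE", "US", "DE", "FR", "DE", "US"]  # one attribute of an index

# Dictionary encoding: store each distinct value once, and keep only
# small integer IDs for the rows. The column compresses well when the
# number of distinct values is small relative to the number of rows.
dictionary = sorted(set(values))             # ['DE', 'FR', 'US']
ids = [dictionary.index(v) for v in values]  # [0, 2, 0, 1, 0, 2]

# Aggregation is a single linear pass over the in-memory IDs.
counts = Counter(dictionary[i] for i in ids)
print(counts["DE"])  # 3
```

With real data volumes the ID array would be packed into a few bits per entry,
but the principle is the same: one pass over memory-resident integers replaces
repeated disk access.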
Earlier TREX releases (TREX 7.0 and earlier) are supported on a variety of
platforms (such as IBM AIX, HP-UX, Solaris, Linux, and Windows). To
optimize the performance of the search and indexing functions provided by the
TREX engine, SAP decided to concentrate on the Intel platform to optimally
utilize the CPU architecture. Therefore, the newest version of TREX (Version
7.10) is only available on Windows and Linux 64-bit operating systems.
TREX as a search engine component is used as an integral part of various SAP
software offerings, such as SAP NetWeaver Enterprise Search. TREX as an SAP
NetWeaver stand-alone engine is a significant part of most search features in
SAP applications.

1.2 SAP liveCache


SAP liveCache technology can be characterized as a hybrid main-memory
database with intensive use of database procedures. It is based on MaxDB,
which is a relational database owned by SAP, introducing a combination of
in-memory data storage with special object-oriented database technologies
supporting the application logic. This hybrid database system can process
enormous volumes of information, such as planning data. It significantly


increases the speed of the algorithmically complex, data-intensive and


runtime-intensive functions of various SAP applications, especially within SAP
Supply Chain Management (SAP SCM) and SAP Advanced Planning and
Optimization (SAP APO). The SAP APO/liveCache architecture consists of these
major components:
ABAP code in SAP APO, which deals with SAP APO functionality
Application functions providing extended database functionality to manipulate
business objects
SAP liveCache's special SAP MaxDB implementation, providing a memory
resident database for fast data processing
From the view of the SAP APO application servers, the SAP liveCache database
appears as a second database connection. SAP liveCache provides a native
SQL interface, which also allows the application servers to trigger object-oriented
functions at the database level. These functions are provided by means of C++
code running on the SAP liveCache server with extremely fast access to the
objects in its memory. This functionality allows processing load to be passed
from the application server to the SAP liveCache server, rather than just
accessing database data. This functionality, referred to as the
COM-Modules or SAP liveCache Applications, supports the manipulation of
memory resident objects and datacubes and significantly increases the speed of
the algorithmically complex, data-intensive, and runtime-intensive functions.
SAP APO transfers performance-critical application logic to the SAP liveCache.
Data needed for these operations is sent to SAP liveCache and kept in-memory.
This ensures that the processing happens where the data is, to deliver the
highest possible performance. The object-oriented nature of the application
functions enables parallel processing so that modern multi-core architectures
be leveraged.
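The underlying principle, moving the computation to the data rather than the
data to the computation, can be sketched with a generic embedded database.
SQLite is used here purely as a stand-in; SAP liveCache implements this with
C++ procedures running inside its MaxDB-based server.

```python
# Sketch of "process where the data is", using SQLite as a generic
# stand-in (not the SAP liveCache API).
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE orders (item TEXT, qty INTEGER)")
con.executemany("INSERT INTO orders VALUES (?, ?)",
                [("bolt", 10), ("nut", 25), ("bolt", 5)])

# Anti-pattern: pull every row to the application server and aggregate there.
total_app_side = sum(qty for _, qty in
                     con.execute("SELECT item, qty FROM orders"))

# Pushed-down logic: the database aggregates in place over its own memory;
# only the result travels back to the application.
(total_db_side,) = con.execute("SELECT SUM(qty) FROM orders").fetchone()
print(total_app_side, total_db_side)  # 40 40
```

The SQL aggregate stands in for the far richer object-oriented COM-Module
functions of SAP liveCache, but the performance argument is identical: shipping
a result is cheaper than shipping the data it is computed from.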

1.3 SAP NetWeaver Business Warehouse Accelerator


The two primary drivers of the demand for business analytics solutions are
increasing data volumes and user populations. These drivers place new
performance requirements on existing analytic platforms. To address these
requirements, SAP introduced SAP NetWeaver Business Warehouse
Accelerator (SAP NetWeaver BW Accelerator) in 2006, deployed as an
integrated solution combining software and hardware to increase the
performance characteristics of SAP NetWeaver Business Warehouse
deployments. (Formerly named SAP NetWeaver Business Intelligence
Accelerator, the software was renamed by SAP in 2009; the solution functions
remain the same.)
The SAP NetWeaver BW Accelerator is based on TREX technology. SAP used
this existing technology and extended it with more functionality to efficiently
support the querying of massive amounts of data and to perform simple
operations on the data frequently used in a business analytics environment.
The software's engine decomposes table data vertically into columns that are
stored separately. This makes more efficient use of memory space than
row-based storage because the engine needs to load only the data for relevant
attributes or characteristics into memory. In general, this is a good idea for
analytics, where most users want to see only a selection of data. We discuss the
technology and advantages of column-based storage in Chapter 2, Basic
concepts on page 9, along with other basic in-memory computing principles
employed by SAP NetWeaver BW Accelerator.
SAP NetWeaver BW Accelerator is built for a special use case, speeding up
queries and reports in SAP NetWeaver BW. In a nutshell, after connecting the
SAP NetWeaver BW Accelerator to the BW system, InfoCubes can be marked to
be indexed in SAP NetWeaver BW Accelerator, and subsequently all
database-bound queries (or even parts of queries) that operate on the indexed
InfoCubes actually get executed in-memory by the SAP NetWeaver BW
Accelerator.
Because of this tight integration with SAP NetWeaver BW and the appliance-like
delivery model, SAP NetWeaver BW Accelerator requires minimal configuration
and setup. Intel helped develop this solution with SAP, so it is optimized for, and
only available on, Intel Xeon processor-based technology. SAP partners with
several hardware vendors to supply the infrastructure for the SAP NetWeaver
BW Accelerator software. Customers acquire the SAP software license from
SAP, and the hardware partner delivers a pre-configured and pre-installed
solution.
The IBM Systems solution for SAP NetWeaver BW Accelerator helps provide
customers with near real-time business intelligence for those companies that
need timely answers to vital business questions. It allows customers to perform
queries in seconds rather than tens of minutes and gives them better visibility
into their business.
IBM has significant competitive advantages with our IBM BladeCenter-based
implementation:
- Better density
- More reliable cooling
- Fibre storage switching
- Fully redundant enterprise-class chassis
- Systems management
SAP NetWeaver BW Accelerator plugs into existing SAP NetWeaver Business
Warehouse environments regardless of the server platform used in that
environment.
The IBM solution consists of these components:
- IBM BladeCenter chassis with HS23 blade servers with Intel Xeon processors, available in standard configurations scaling from 2 to 28 blades and custom configurations up to 140 blades
- IBM DS3524 with scalable disk
- SUSE Linux Enterprise Server as the operating system
- IBM General Parallel File System (GPFS)
- IBM Services, including Lab Services, IBM Global Business Services (GBS), IBM Global Technology Services (GTS) offerings, and IBM Intelligent Cluster enablement team services
This intelligent scalable design is based around the IBM General Parallel File
System, exclusive from IBM. GPFS is a highly scalable, high-performance
shared disk file system, powering many of the world's largest supercomputing
clusters. Its advanced high-availability and data replication features are a key
differentiator for the IBM offering. GPFS not only provides scalability, but also
offers exclusive levels of availability that are easy to implement with no manual
intervention or scripting.
IBM has shown linear scalability for the SAP NetWeaver BW Accelerator through
140 blades (see the WinterCorp white paper, Large-Scale Testing of the SAP
NetWeaver BW Accelerator on an IBM Platform, available at
ftp://public.dhe.ibm.com/common/ssi/rep_wh/n/SPW03004USEN/SPW03004USEN.PDF).
Unlike all other SAP NetWeaver BW Accelerator providers, the IBM solution
provides a seamless growth path for customers from two blades to 140 blades
with no significant changes in the hardware or software infrastructure.

1.3.1 SAP BusinessObjects Explorer Accelerated


To extend the functionality of SAP NetWeaver BW Accelerator, SAP created a
special version of the SAP BusinessObjects Explorer, which can connect directly
to the SAP NetWeaver BW Accelerator using its proprietary communication
protocol. SAP BusinessObjects Explorer, accelerated version, provides an
alternative front end to navigate through the data contained in SAP NetWeaver
BW Accelerator, with a much simpler, web-based user interface than the SAP
NetWeaver BW front ends can provide. This broadens the user base towards the
less experienced BI users.

1.3.2 SAP BusinessObjects Accelerator


SAP enabled the combination of SAP NetWeaver BW Accelerator and SAP
BusinessObjects Data Services to load data into SAP NetWeaver BW
Accelerator from virtually any data source, both SAP and non-SAP data sources.
In combination with BO Explorer as an independent front end, the addition of
SAP BusinessObjects Data Services created a solution that is independent of
SAP NetWeaver BW.
This combination of SAP NetWeaver BW Accelerator, SAP BusinessObjects
Explorer Accelerated, and SAP BusinessObjects Data Services is often referred
to as the SAP BusinessObjects Accelerator or SAP BusinessObjects Explorer
Accelerated Wave 2. Additional blades are added to the SAP NetWeaver BW
Accelerator configuration to support the BusinessObjects Explorer Accelerated
workload, enabling it to be delivered as part of the SAP NetWeaver BW
Accelerator solution.

1.4 SAP HANA


SAP HANA is the next logical step in SAP in-memory computing. By combining
earlier developed or acquired technologies, such as the SAP NetWeaver BW
Accelerator (including TREX technology), SAP MaxDB with its in-memory
capabilities originating in SAP liveCache, or P*Time (acquired by SAP in 2005),
with recent research results from the Hasso Plattner Institute for Software
Systems Engineering (HPI), SAP created an in-memory database appliance for
a wide range of applications.

Note: HPI was founded in 1998 by Hasso Plattner, one of the founders of SAP
AG, chairman of the board until 2003, and currently chairman of the
supervisory board of SAP AG.

Figure 1-1 shows the evolution of SAP in-memory computing.

Figure 1-1 Evolution of SAP in-memory computing (diagram not reproduced: it traces TREX through BWA 7.0 and BWA 7.20, BO Explorer through BOE accelerated, and BO Data Services into BOA, and shows MaxDB, P*Time, and liveCache as further roots, converging into SAP HANA 1.0)

Initially, SAP HANA was targeted at analytical workloads. During the
announcement of SAP HANA at SapphireNOW 2010, Hasso Plattner presented his
vision of SAP HANA becoming a database suitable as a base for SAP's entire
enterprise software portfolio. He confirmed this vision during his keynote at
SapphireNOW 2012 by highlighting how SAP HANA is on the path to becoming the
unified foundation for all types of enterprise workloads, not only online
analytical processing (OLAP), but also online transaction processing (OLTP)
and text.
Just as with SAP NetWeaver BW Accelerator, SAP decided to deploy SAP HANA
in an appliance-like delivery model. IBM was one of the first hardware partners to
work with SAP on an infrastructure solution for SAP HANA.
This IBM Redbooks publication focuses on SAP HANA and the IBM solution for
SAP HANA.


Chapter 2. Basic concepts
In-memory computing is a technology that allows the processing of massive
quantities of data in main memory to provide immediate results from analysis and
transaction. The data to be processed is ideally real-time data (that is, data that
is available for processing or analysis immediately after it is created).
To achieve the desired performance, in-memory computing follows these
basic concepts:
Keep data in main memory to speed up data access.
Minimize data movement by leveraging the columnar storage concept,
compression, and performing calculations at the database level.
Divide and conquer. Leverage the multi-core architecture of modern
processors and multi-processor servers, or even scale out into a distributed
landscape, to be able to grow beyond what can be supplied by a single server.
In this chapter, we describe those basic concepts with the help of a few
examples. We do not describe the full set of technologies employed with
in-memory databases, such as SAP HANA, but we do provide an overview of
how in-memory computing is different from traditional concepts.

Copyright IBM Corp. 2013. All rights reserved.

2.1 Keeping data in-memory


Today, a single enterprise class server can hold several terabytes of main
memory. At the same time, prices for server main memory dramatically dropped
over the last few decades. This increase in capacity and reduction in cost makes
it a viable approach to keep huge amounts of business data in memory. This
section discusses the benefits and challenges.

2.1.1 Using main memory as the data store


The most obvious reason to use main memory (RAM) as the data store for a
database is because accessing data in main memory is much faster than
accessing data on disk. Figure 2-1 compares the access times for data in several
locations.
Figure 2-1 Data access times of various storage types, relative to RAM (logarithmic scale; chart not reproduced: it compares volatile storage, that is, CPU register, CPU cache, and RAM, with non-volatile storage, that is, SSD/flash and hard disk)


The main memory is the fastest storage type that can hold a significant amount
of data. While CPU registers and CPU cache are faster to access, their usage is
limited to the actual processing of data. Data in main memory can be accessed
more than a hundred thousand times faster than data on a spinning hard disk,
and even flash technology storage is about a thousand times slower than main
memory. Main memory is connected directly to the processors through a
high-speed bus, whereas hard disks are connected through a chain of buses
(QPI, PCIe, SAN) and controllers (I/O hub, RAID controller or SAN adapter, and
storage controller).
Compared with keeping data on disk, keeping the data in main memory can
dramatically improve database performance just by the advantage in access
time.

2.1.2 Data persistence


Keeping data in main memory brings up the question of what will happen in case
of a loss of power.
In database technology, atomicity, consistency, isolation, and durability (ACID) is
a set of requirements that guarantees that database transactions are processed
reliably:
A transaction must be atomic. That is, if part of a transaction fails, the entire
transaction has to fail and leave the database state unchanged.
The consistency of a database must be preserved by the transactions that it
performs.
Isolation ensures that no transaction interferes with another transaction.
Durability means that after a transaction is committed, it will remain
committed.
Although the first three requirements are not affected by the in-memory concept,
durability is a requirement that cannot be met by storing data in main memory
alone. Main memory is volatile storage; that is, it loses its content when
electrical power is lost. To make data persistent, it must reside on non-volatile
storage, such as hard drives, solid-state drives (SSDs), or flash devices.
The storage used by a database to store data (in this case, main memory) is
divided into pages. When a transaction changes data, the corresponding pages
are marked and written to non-volatile storage in regular intervals. In addition, a
database log captures all changes made by transactions. Each committed
transaction generates a log entry that is written to non-volatile storage. This
ensures that all transactions are permanent. Figure 2-2 illustrates this using
the example of SAP HANA. SAP HANA stores changed pages in savepoints, which
are asynchronously written to persistent storage in regular intervals (by
default, every five minutes). The log is written synchronously: a transaction
does not return before the corresponding log entry is written to persistent
storage, to meet the durability requirement, as previously described.

Figure 2-2 Savepoints and logs in SAP HANA (timeline not reproduced: along a time axis, data savepoints are written to persistent storage at intervals, the log of committed transactions is written continuously, and a power failure is marked at an arbitrary point after the last savepoint)

After a power failure, the database can be restarted like a disk-based database.
The database pages are restored from the savepoints and then the database
logs are applied (rolled forward) to restore the changes that were not captured in
the savepoints. This ensures that the database can be restored in memory to
exactly the same state as before the power failure.
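The savepoint-plus-log mechanism can be sketched in a few lines of Python. This is a toy illustration, not SAP HANA's actual implementation: the MiniStore class, its file layout, and the log truncation at savepoint time are simplifications invented for this sketch.

```python
import os

class MiniStore:
    """Toy key-value store illustrating savepoints plus a redo log.

    Committed changes are appended to the log synchronously before the
    in-memory copy is updated; a savepoint dumps all pages (here, the whole
    dictionary). Recovery restores the savepoint, then rolls the log forward.
    """

    def __init__(self, savepoint="savepoint.txt", log="log.txt"):
        self.savepoint, self.log, self.data = savepoint, log, {}

    def commit(self, key, value):
        with open(self.log, "a") as f:        # synchronous log write: the
            f.write(f"{key}={value}\n")       # transaction returns only after
            f.flush()                         # the entry is on non-volatile
            os.fsync(f.fileno())              # storage
        self.data[key] = value                # then change the in-memory page

    def write_savepoint(self):
        with open(self.savepoint, "w") as f:  # asynchronous in a real database
            for k, v in self.data.items():
                f.write(f"{k}={v}\n")
        open(self.log, "w").close()           # entries are now in the savepoint

    def recover(self):
        self.data = {}                        # restore pages from the savepoint,
        for path in (self.savepoint, self.log):   # then roll the log forward
            if os.path.exists(path):
                for line in open(path):
                    k, v = line.rstrip("\n").split("=", 1)
                    self.data[k] = v

store = MiniStore()
store.commit("balance", "100")
store.write_savepoint()
store.commit("balance", "250")    # after the savepoint: in memory + log only

restarted = MiniStore()
restarted.recover()               # "power failure": savepoint + log replay
# restarted.data["balance"] == "250"
```

A real database keeps its log segments until a savepoint is safely on persistent storage; the immediate log truncation here only keeps the sketch short.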

2.2 Minimizing data movement


The second key to improving data processing performance is to minimize the
movement of data within the database and between the database and the
application. This section describes measures to achieve this.

2.2.1 Compression
Even though today's memory capacities allow keeping enormous amounts of
data in-memory, compressing the data in-memory is still desirable. The goal is to
compress data in a way that does not use up performance gained, while still
minimizing data movement from RAM to the processor.
By working with dictionaries to represent text as integer numbers, the
database can compress data significantly and thus reduce data movement, while
not imposing additional CPU load for decompression; it can even add to the
performance (see the example in Figure 2-5). Figure 2-3 illustrates this with
a simplified example.

The original table (left side of the figure) contains the text attributes in their original representation (the Quantity column is part of the table but is omitted here):

Row ID   Date/Time   Material       Customer Name
1        14:05       Radio          Dubois
2        14:11       Laptop         Di Dio
3        14:32       Stove          Miller
4        14:38       MP3 Player     Newman
5        14:48       Radio          Dubois
6        14:55       Refrigerator   Miller
7        15:01       Stove          Chevrier

The dictionaries (upper right) assign an integer value to each distinct attribute value:

Customers: 1 = Chevrier, 2 = Di Dio, 3 = Dubois, 4 = Miller, 5 = Newman
Material: 1 = MP3 Player, 2 = Radio, 3 = Refrigerator, 4 = Stove, 5 = Laptop

In the compressed table (lower right), the material and customer names are replaced by their dictionary values, and the date and time values are converted to integers (minutes since midnight: 845, 851, 872, 878, 888, 895, 901).

Figure 2-3 Illustration of dictionary compression

On the left side of Figure 2-3, the original table is shown containing text attributes
(that is, material and customer name) in their original representation. The text
attribute values are stored in a dictionary (upper right), assigning an integer value
to each distinct attribute value. In the table, the text is replaced by the
corresponding integer value, as defined in the dictionary. The date and time
attribute was also converted to an integer representation. Using dictionaries for
text attributes reduces the size of the table because each distinct attribute value
has only to be stored once, in the dictionary; therefore, each additional
occurrence in the table just needs to be referred to with the corresponding
integer value.
The compression factor achieved by this method is highly dependent on the data
being compressed. Attributes with few distinct values compress well, whereas
attributes with many distinct values do not benefit as much.
There are other, more effective compression methods that can be employed with
in-memory computing. To be useful, however, a method must strike the correct
balance between compression effectiveness (which fits more data into memory
and reduces data movement, that is, increases performance), the resources
needed for decompression, and data accessibility (that is, how much unrelated
data has to be decompressed to get to the data that you need). As discussed
here, dictionary compression combines good compression effectiveness with low
decompression resources and high data access flexibility.
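The dictionary technique from Figure 2-3 can be sketched in a few lines of Python. The function name and the 1-based coding are choices made for this illustration to mirror the chapter's example; they do not reflect an actual SAP HANA interface.

```python
def dictionary_encode(values):
    """Build a sorted dictionary of the distinct values and encode the
    column as 1-based integer codes, as in the example in Figure 2-3."""
    dictionary = sorted(set(values))          # each distinct value stored once
    code = {v: i for i, v in enumerate(dictionary, start=1)}
    return dictionary, [code[v] for v in values]

customer_names = ["Dubois", "Di Dio", "Miller", "Newman",
                  "Dubois", "Miller", "Chevrier"]
dictionary, encoded = dictionary_encode(customer_names)
# dictionary == ['Chevrier', 'Di Dio', 'Dubois', 'Miller', 'Newman']
# encoded    == [3, 2, 4, 5, 3, 4, 1]   (Miller is represented as 4)
```

Attributes with few distinct values compress well: the seven strings above shrink to seven small integers plus a five-entry dictionary.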

2.2.2 Columnar storage


Relational databases organize data in tables that contain the data records. The
difference between row-based and columnar (or column-based) storage is how
the table is stored:
Row-based storage stores a table in a sequence of rows.
Column-based storage stores a table in a sequence of columns.
Figure 2-4 illustrates the row-based and column-based models.

Figure 2-4 Row-based and column-based storage models (figure not reproduced: the row-based store keeps the values of each record together in one sequence, whereas the column-based store keeps the values of each column together in one sequence)


Both storage models have benefits and drawbacks, which are listed in Table 2-1.
Table 2-1 Benefits and drawbacks of row-based and column-based storage

Row-based storage benefits:
- Record data is stored together.
- Easy to insert/update.

Row-based storage drawbacks:
- All data must be read during selection, even if only a few columns are involved in the selection process.

Column-based storage benefits:
- Only affected columns have to be read during the selection process of a query.
- Efficient projections (a projection is a view on the table with a subset of columns).
- Any column can serve as an index.

Column-based storage drawbacks:
- After selection, selected rows must be reconstructed from columns.
- No easy insert/update.

The drawbacks of column-based storage are not as grave as they seem. In most
cases, not all attributes (that is, column values) of a row are needed for
processing, especially in analytic queries. Also, inserts or updates to the
data are less frequent in an analytical environment (an exception is bulk
loads, for example, when replicating data in the in-memory database, which can
be handled differently). SAP HANA implements both row-based and column-based
storage; however, its performance originates in the use of column-based
storage in memory. The following sections describe how column-based storage is
beneficial to query performance and how SAP HANA handles the drawbacks of
column-based storage.


Efficient query execution


To show the benefits of dictionary compression combined with columnar storage,
Figure 2-5 shows an example of how a query is executed. Figure 2-5 refers to the
table shown in Figure 2-3.

The query: get all records with Customer Name Miller and Material Refrigerator.

1. Dictionary lookup of the strings: the strings are compared only once. In the Customers dictionary (Chevrier, Di Dio, Dubois, Miller, Newman), Miller maps to 4; in the Material dictionary (MP3 Player, Radio, Refrigerator, Stove, Laptop), Refrigerator maps to 3.
2. Only those columns are read that are part of the query condition, and they are scanned with integer comparison operations, producing one bitmap per column:
   Customer: 0 0 1 0 0 1 0
   Material: 0 0 0 0 0 1 0
3. The bitmaps are combined with a bitwise AND:
   Result set: 0 0 0 0 0 1 0

The resulting records can be assembled from the column stores quickly because the positions are known (here, the 6th position in every column).

Figure 2-5 Example of a query executed on a table in columnar storage

The query asks to get all records with Miller as the customer name and
Refrigerator as the material.
First, the strings in the query condition are looked up in the dictionary. Miller is
represented as the number 4 in the customer name column. Refrigerator is
represented as the number 3 in the material column. Note that this lookup has to
be done only once. Subsequent comparisons with the values in the table are
based on integer comparisons, which are less resource intensive than string
comparisons.
In a second step, the columns are read that are part of the query condition (that
is, the Customer and Material columns). The other columns of the table are not
needed for the selection process. The columns are then scanned for values
matching the query condition. That is, in the Customer column all occurrences
of 4 are marked as selected, and in the Material column all occurrences of 3
are marked.
These selection marks can be represented as bitmaps, a data structure that
allows efficient Boolean operations. A bitwise AND combines the bitmaps of the
individual columns into a bitmap representing the selection of records
matching the entire query condition. In our example, record number 6 is the
only matching record. Depending on the columns selected for the result,
additional columns must now be read to compile the entire record to return.
But because the position within the column is known (record number 6), only
the parts of the columns that contain the data for this record have to be read.
This example shows how compression not only limits the amount of data that
must be read for the selection process, but also simplifies the selection
itself, while the columnar storage model further reduces the amount of data
needed. Although the example is simplified, it illustrates the benefits of
dictionary compression and columnar storage.
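The query from Figure 2-5 can be replayed in a short Python sketch. The column contents are taken from the example table in Figure 2-3; the variable names are ours.

```python
# Dictionary-encoded columns from the example (1-based codes:
# Miller -> 4 in Customers, Refrigerator -> 3 in Material).
customers = ["Chevrier", "Di Dio", "Dubois", "Miller", "Newman"]
materials = ["MP3 Player", "Radio", "Refrigerator", "Stove", "Laptop"]
customer_col = [3, 2, 4, 5, 3, 4, 1]   # Dubois, Di Dio, Miller, ...
material_col = [2, 5, 4, 1, 2, 3, 4]   # Radio, Laptop, Stove, ...

# Step 1: one dictionary lookup per query string.
cust_code = customers.index("Miller") + 1        # -> 4
mat_code = materials.index("Refrigerator") + 1   # -> 3

# Step 2: scan only the two columns in the query condition,
# using integer comparisons instead of string comparisons.
cust_bits = [int(c == cust_code) for c in customer_col]  # [0,0,1,0,0,1,0]
mat_bits = [int(m == mat_code) for m in material_col]    # [0,0,0,0,0,1,0]

# Step 3: combine the bitmaps with a bitwise AND.
result = [a & b for a, b in zip(cust_bits, mat_bits)]    # [0,0,0,0,0,1,0]
matching_rows = [i + 1 for i, hit in enumerate(result) if hit]  # [6]
```

The two query strings are compared against the dictionaries exactly once; every comparison against the table itself is an integer comparison.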

Delta-merge and bulk inserts

To overcome the drawback of inserts or updates having an impact on the
performance of the column-based storage, SAP plans to implement a lifecycle
management for database records. (See Efficient Transaction Processing in SAP
HANA Database - The End of a Column Store Myth by Sikka, Färber, Lehner, Cha,
Peh, and Bornhövd, available at http://dl.acm.org/citation.cfm?id=2213946.)

Figure 2-6 illustrates the lifecycle management for database records in the
column store. In the figure (not reproduced), updates, inserts, and deletes go
into the L1 Delta store, which is merged into the L2 Delta store; bulk inserts
go directly into the L2 Delta store, which in turn is merged into the Main
store. Reads see the unified table across all three stores.

Figure 2-6 Lifetime management of a data record in the SAP HANA column-store

There are three different types of storage for a table:

- L1 Delta storage is optimized for fast write operations. An update is performed by inserting a new entry into the delta storage. The data is stored in records, as in a traditional row-based approach. This ensures high performance for write, update, and delete operations on records stored in the L1 Delta storage.
- L2 Delta storage is an intermediate step. While organized in columns, its dictionary is not as optimized as in the main storage; new dictionary entries are appended to the end of the dictionary. This makes inserts easier, but has drawbacks with regard to search operations on the dictionary because it is not sorted.
- Main storage contains the compressed data, with a search-optimized dictionary, for fast reads.
All write operations on a table work on the L1 Delta storage. Bulk inserts
bypass the L1 Delta storage and write directly into the L2 Delta storage. Read
operations on a table always read from all storages for that table, merging
the result sets to provide a unified view of all data records in the table.
During its lifecycle, a record is moved from the L1 Delta storage to the L2
Delta storage and finally to the Main storage. The process of moving changes
to a table from one storage to the next is called Delta Merge, and it is an
asynchronous process. During the merge operations, the columnar table is still
available for read and write operations.
Moving records from L1 Delta storage to L2 Delta storage involves reorganizing
the record in a columnar fashion and compressing it, as illustrated in
Figure 2-3. If a value is not yet in the dictionary, a new entry is appended to the
dictionary. Appending to the dictionary is faster than inserting, but results in an
unsorted dictionary, which impacts the data retrieval performance.
Eventually, the data in the L2 Delta storage must be moved to the Main storage.
To accomplish that, the L2 Delta storage must be locked, and a new L2 Delta
storage must be opened to accept further additions. Then a new Main storage is
created from the old Main storage and the locked L2 Delta storage. This is a
resource-intensive task and has to be scheduled carefully.
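The three-stage design can be modeled with a toy Python class. The class and method names are invented for this sketch; real delta merges run asynchronously and rebuild compressed dictionaries, which is omitted here.

```python
class ColumnTable:
    """Toy unified table with L1 delta (row-wise), L2 delta, and main store.

    Single writes go to L1; bulk inserts bypass L1 and go to L2; reads
    merge all three stores into one unified view.
    """

    def __init__(self, columns):
        self.columns = columns
        self.l1 = []                          # row tuples, write-optimized
        self.l2 = {c: [] for c in columns}    # column-organized
        self.main = {c: [] for c in columns}  # column-organized, compressed

    def insert(self, row):                    # single write -> L1 delta
        self.l1.append(row)

    def bulk_insert(self, rows):              # bulk load -> L2 delta directly
        for row in rows:
            for col, val in zip(self.columns, row):
                self.l2[col].append(val)

    def merge_l1(self):                       # L1 -> L2: reorganize by column
        self.bulk_insert(self.l1)
        self.l1 = []

    def merge_l2(self):                       # L2 -> new main store
        for col in self.columns:
            self.main[col] = self.main[col] + self.l2[col]
            self.l2[col] = []

    def read(self, col):                      # unified view over all stores
        i = self.columns.index(col)
        return self.main[col] + self.l2[col] + [r[i] for r in self.l1]

table = ColumnTable(["id", "name"])
table.insert((1, "a"))                 # lands in L1
table.bulk_insert([(2, "b"), (3, "c")])  # lands in L2
table.merge_l1()
table.merge_l2()
# table.read("id") == [2, 3, 1]
```

Note how a read never has to wait for a merge: the unified view is simply the concatenation of the three stores.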

2.2.3 Pushing application logic to the database


Whereas the concepts described above speed up processing within the
database, there is still one factor that can significantly slow down the processing
of data. An application executing the application logic on the data has to get the
data from the database, process it, and possibly send it back to the database to
store the results. Sending data back and forth between the database and the
application usually involves communication over a network, which introduces
communication overhead and latency and is limited by the speed and throughput
of the network between the database and the application itself.
To eliminate this factor and increase overall performance, it is beneficial to
process the data where it is: in the database. If the database can perform
calculations and apply application logic, less data needs to be sent back to
the application, and the exchange of intermediate results between the database
and the application might even become unnecessary. This minimizes the amount
of data transferred, and the communication between database and application
contributes a less significant amount of time to the overall processing time.
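The effect of pushing logic down can be demonstrated with any SQL database. In this sketch, Python's built-in sqlite3 module stands in for the remote database, and a GROUP BY query plays the role of database-side application logic; with SAP HANA, SQL Script and the calculation engine serve this purpose.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (material TEXT, quantity INTEGER)")
conn.executemany("INSERT INTO sales VALUES (?, ?)",
                 [("Radio", 10), ("Stove", 4), ("Radio", 7)])

# Application-side logic: every row crosses the database boundary
# before the application aggregates it.
totals = {}
for material, quantity in conn.execute(
        "SELECT material, quantity FROM sales"):
    totals[material] = totals.get(material, 0) + quantity

# Pushed-down logic: the database aggregates, and only the
# (much smaller) result set is transferred.
pushed = dict(conn.execute(
    "SELECT material, SUM(quantity) FROM sales GROUP BY material"))

assert totals == pushed == {"Radio": 17, "Stove": 4}
```

Both variants compute the same totals, but the second one moves two result rows instead of the whole table, and the gap grows with the table size.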

2.3 Divide and conquer


The phrase divide and conquer (derived from the Latin saying divide et impera)
is typically used when a big problem is divided into a number of smaller,
easier-to-solve problems. Regarding performance, processing huge amounts of
data is a significant problem that can be solved by splitting it up into smaller
chunks of data, which can be processed in parallel.


2.3.1 Parallelization on multi-core systems


When chip manufacturers reached the physical limits of semiconductor-based
microelectronics with their single-core processor designs, they started to
increase processor performance by increasing the number of cores, or
processing units, within a single processor. This performance gain can be
leveraged only through parallel processing because the performance of a single
core remained unchanged.
The rows of a table in a relational database are independent of each other, which
allows parallel processing. For example, when scanning a database table for
attribute values matching a query condition, the table, or the set of attributes
(columns) relevant to the query condition, can be divided into subsets and
spread across the cores available to parallelize the processing of the query.
Compared with processing the query on a single core, this basically reduces the
time needed for processing by a factor equivalent to the number of cores working
on the query (for example, on a 10-core processor the time needed is one-tenth
of the time that a single core would need).
The same principle applies for multi-processor systems. A system with eight
10-core processors can be regarded as an 80-core system that can divide the
processing into 80 subsets processed in parallel.
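The divide-and-conquer scan can be sketched in Python. Note that this only illustrates the chunking pattern: Python threads do not run pure-Python work on multiple cores in parallel (because of the global interpreter lock), whereas a database engine written in a compiled language runs each chunk on its own core. The function names are ours.

```python
from concurrent.futures import ThreadPoolExecutor

def scan_chunk(chunk, predicate):
    """Scan one subset of a column; rows are independent of each other,
    so chunks can be processed without coordination."""
    return [value for value in chunk if predicate(value)]

def parallel_scan(column, predicate, workers=4):
    """Divide a column into one chunk per worker, scan the chunks
    concurrently, and merge the partial results."""
    if not column:
        return []
    size = -(-len(column) // workers)          # ceiling division
    chunks = [column[i:i + size] for i in range(0, len(column), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = pool.map(lambda c: scan_chunk(c, predicate), chunks)
    return [value for part in parts for value in part]

hits = parallel_scan(list(range(1000)), lambda v: v % 100 == 0)
# hits == [0, 100, 200, ..., 900]
```

With four workers and perfectly even chunks, each worker touches a quarter of the column; with 80 cores, each touches one eightieth.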

2.3.2 Data partitioning and scale-out


Even though servers available today can hold terabytes of data in memory and
provide up to eight processors per server with up to 10 cores per processor, the
amount of data to be stored in an in-memory database or the computing power
needed to process such quantities of data might exceed the capacity of a single
server. To accommodate the memory and computing power requirements that go
beyond the limits of a single server, data can be divided into subsets and placed
across a cluster of servers, forming a distributed database (scale-out approach).
The individual database tables can be placed on different servers within the
cluster, or tables bigger than what a single server can hold can be split into
several partitions, either horizontally (a group of rows per partition) or vertically (a
group of columns per partition) with each partition residing on a separate server
within the cluster.
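The two partitioning schemes can be sketched as follows. Round-robin placement is only one of several possible horizontal schemes (hash or range partitioning are common alternatives), and the function names are invented for this illustration.

```python
def horizontal_partitions(rows, num_servers):
    """Spread complete rows round-robin across servers; each partition
    holds a group of rows."""
    parts = [[] for _ in range(num_servers)]
    for i, row in enumerate(rows):
        parts[i % num_servers].append(row)
    return parts

def vertical_partitions(columns, groups):
    """Place groups of columns on different servers; 'columns' maps a
    column name to its values, 'groups' lists the column names per server."""
    return [{name: columns[name] for name in group} for group in groups]

table = [(1, "Radio", 10), (2, "Stove", 4), (3, "Laptop", 1), (4, "Radio", 7)]
by_rows = horizontal_partitions(table, 2)
# by_rows[0] == [(1, 'Radio', 10), (3, 'Laptop', 1)]

cols = {"id": [1, 2, 3, 4],
        "material": ["Radio", "Stove", "Laptop", "Radio"]}
by_cols = vertical_partitions(cols, [["id"], ["material"]])
# by_cols[1] == {'material': ['Radio', 'Stove', 'Laptop', 'Radio']}
```

A query can then be fanned out to the servers holding the relevant partitions, combining the scale-out approach with the parallel scan shown earlier.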


Chapter 3. SAP HANA overview


In this chapter, we describe the SAP HANA offering, its architecture and
components, use cases, delivery model, and sizing and licensing aspects.
This chapter contains the following sections:

- SAP HANA overview
- SAP HANA delivery model
- Sizing SAP HANA
- SAP HANA software licensing


3.1 SAP HANA overview


This section gives an overview of SAP HANA. When talking about SAP HANA,
these terms are used:
SAP HANA database
The SAP HANA database (also referred to as the SAP in-memory database)
is a hybrid in-memory database that combines row-based, column-based,
and object-based database technology, optimized to exploit the parallel
processing capabilities of current hardware. It is the heart of SAP offerings,
such as SAP HANA.
SAP HANA appliance (SAP HANA)
SAP HANA is a flexible, data source agnostic appliance that allows you to
analyze large volumes of data in real time without the need to materialize
aggregations. It is a combination of hardware and software, and it is delivered
as an optimized appliance in cooperation with SAP's hardware partners for
SAP HANA.
For the sake of simplicity, we use the terms SAP HANA, SAP in-memory
database, SAP HANA database, and SAP HANA appliance synonymously in this
paper. We cover only the SAP in-memory database as part of the SAP HANA
appliance. Where required, we ensure that the context makes it clear which part
is being discussed.


3.1.1 SAP HANA architecture


Figure 3-1 shows the high-level architecture of the SAP HANA appliance.
Section 4.1, SAP HANA software components, explains the most important
software components around the SAP HANA database.

Figure 3-1 SAP HANA architecture (figure not reproduced). The SAP HANA appliance contains the SAP HANA database together with the Software Update Manager, the SAP HANA studio repository, the SAP Host Agent, the LM structure, a JVM, and SAP CAR; the SAP HANA studio and the SAP HANA client connect to the database from the outside. Within the database, session management and request processing / execution control (SQL, MDX, SQL Script, and the calculation engine) sit on top of the relational engines (row store and column store), supported by the transaction manager, authorization manager, and metadata manager. The persistency layer, with page management and the logger, writes data volumes and log volumes to persistent storage.

SAP HANA database

At the heart of the SAP HANA database are its two relational database engines:

- The column-based store: Stores relational data in columns, optimized for holding tables with huge amounts of data, which can be aggregated in real time and used in analytical operations.
- The row-based store: Stores relational data in rows, as traditional database systems do. This row store is more optimized for row operations, such as frequent inserts and updates. It has a lower compression rate, and query performance is much lower compared with the column-based store.


The engine used to store data can be selected on a per-table basis at the time
a table is created, and an existing table can be converted from one type to
the other. Tables in the row store are loaded at startup time, whereas tables
in the column store can be loaded either at startup or on demand, during
normal operation of the SAP HANA database.
Both engines share a common persistency layer, which provides data
persistency consistent across both engines. There is page management and
logging, much like in traditional databases. Changes to in-memory database
pages are persisted through savepoints written to the data volumes on persistent
storage, which is usually hard drives. Every transaction committed in the SAP
HANA database is persisted by the logger of the persistency layer in a log entry
written to the log volumes on persistent storage. The log volumes use flash
technology storage for high I/O performance and low latency.
The relational engines can be accessed through various interfaces. The SAP
HANA database supports SQL (JDBC/ODBC), MDX (ODBO), and BICS (SQL
DBC). The calculation engine allows calculations to be performed in the
database without moving the data into the application layer. It also includes a
business functions library that can be called by applications to do business
calculations close to the data. The SAP HANA-specific SQL Script language is
an extension to SQL that can be used to push down data-intensive application
logic into the SAP HANA database.

3.1.2 SAP HANA appliance


The SAP HANA appliance consists of the SAP HANA database and adds
components needed to work with, administer, and operate the database. It
contains the repository files for the SAP HANA studio, which is an Eclipse-based
administration and data-modeling tool for SAP HANA, in addition to the SAP
HANA client, which is a set of libraries required for applications to be able to
connect to the SAP HANA database. Both the SAP HANA studio and the client
libraries are usually installed on a client PC or server.
The Software Update Manager (SUM) for SAP HANA is the framework that allows
the automatic download and installation of SAP HANA updates from SAP Service
Marketplace and other sources using a host agent. It also allows distribution of
the SAP HANA studio repository to the users.
The Lifecycle Management (LM) structure for SAP HANA is a description of the
current installation and is used, for example, by SUM to perform automatic
updates.
More details about the individual software components are in section 4.1, SAP HANA
software components on page 38.

24

In-memory Computing with SAP HANA on IBM eX5 Systems

3.2 SAP HANA delivery model


SAP decided to deploy SAP HANA as an integrated solution combining software
and hardware, frequently referred to as the SAP HANA appliance. As with SAP
NetWeaver BW Accelerator, SAP partners with several hardware vendors to
provide the infrastructure needed to run the SAP HANA software. IBM was
among the first hardware vendors to partner with SAP to provide an integrated
solution.
Infrastructure for SAP HANA must run through a certification process to ensure
that certain performance requirements are met. Only certified configurations are
supported by SAP and the respective hardware partner. These configurations
must adhere to certain requirements and restrictions to provide a common
platform across all hardware providers:
Only certain Intel Xeon processors can be used. For the currently available
Intel Xeon processor E7 family, the allowed processor models are E7-2870,
E7-4870, and E7-8870. The previous CPU generation was limited to the Intel
Xeon processor X7560.
All configurations must provide a certain main memory per core ratio, which is
defined by SAP to balance CPU processing power and the amount of data
being processed.
All configurations must meet minimum performance requirements for various
load profiles. SAP tests for these requirements as part of the certification
process.
The capacity of the storage devices used in the configurations must meet the
sizing rules (see 3.3, Sizing SAP HANA on page 25).
The networking capabilities of the configurations must include 10 Gb Ethernet
for the SAP HANA software.
By imposing these requirements, SAP can rely on the availability of certain
features and ensure a well-performing hardware platform for the SAP HANA
software, while giving the hardware partners enough room to develop an
infrastructure architecture for SAP HANA that adds differentiating features to
the solution. The benefits of the IBM solution are described in
Chapter 6, IBM Systems solution for SAP HANA on page 89.

3.3 Sizing SAP HANA


This section introduces the concept of T-shirt sizes for SAP HANA and gives a
short overview of how to size for an SAP HANA system.


3.3.1 Concept of T-shirt sizes for SAP HANA


SAP defined so-called T-shirt sizes for SAP HANA to both simplify the sizing and
to limit the number of hardware configurations to support, thus reducing
complexity. The SAP hardware partners provide configurations for SAP HANA
according to one or more of these T-shirt sizes. Table 3-1 lists the T-shirt sizes for
SAP HANA.
Table 3-1 SAP HANA T-shirt sizes

SAP T-shirt size    Compressed data in memory    Server main memory    Number of CPUs
XS                  64 GB                        128 GB                2
S and S+            128 GB                       256 GB                2
M and M+            256 GB                       512 GB                4
L                   512 GB                       1024 GB               8
The T-shirt sizes, S+ and M+, denote upgradable versions of the S and M sizes:
S+ delivers capacity equivalent to S, but the hardware is upgradable to an M
size.
M+ delivers capacity equivalent to M, but the hardware is upgradable to an L
size.
These T-shirt sizes are used when relevant growth of the data size is expected.
In addition to these standard T-shirt sizes, which apply to all use cases of SAP
HANA, there are configurations that are specific to and limited for use with SAP
Business Suite applications powered by SAP HANA. See Table 3-2.
Table 3-2 Additional T-shirt sizes for SAP Business Suite powered by SAP HANA

SAP T-shirt size    Compressed data in memory    Server main memory
XL                  512 GB                       1 TB
XXL                 1 TB                         2 TB
XXXL                2 TB                         4 TB
The workload for the SAP Business Suite applications has different
characteristics: it is less CPU bound and more memory intensive than the
standard SAP HANA workload. Therefore, the memory per core ratio is different
than for the standard T-shirt sizes. All workloads can be used on the T-shirt
sizes in Table 3-1 on page 26, including SAP Business Suite applications with
SAP HANA as the primary database. The T-shirt sizes in Table 3-2 on page 26
are specific to and limited for use with SAP Business Suite applications
powered by SAP HANA only.
Section 5.5.3, SAP Business Suite powered by SAP HANA on page 85 has
more information about which of the SAP Business Suite applications are
supported with SAP HANA as the primary database.
For more information about T-shirt size mappings to IBM Systems solution
building blocks, see section 6.3.2, SAP HANA T-shirt sizes on page 113.

3.3.2 Sizing approach


The sizing of SAP HANA depends on the scenario in which SAP HANA is used.
We discuss these scenarios here:
SAP HANA as a stand-alone database
SAP HANA as the database for an SAP NetWeaver BW system
SAP HANA as the database for SAP Business Suite applications
The sizing methodology for SAP HANA is described in detail in the following SAP
Notes and attached presentations:
Note 1514966 - SAP HANA 1.0: Sizing SAP In-Memory Database
Note 1637145 - SAP NetWeaver BW on HANA: Sizing SAP In-Memory
Database
The following sections provide a brief overview of sizing for SAP HANA.

SAP HANA as a stand-alone database


This section covers sizing of SAP HANA as a stand-alone database, which is
used, for example, in the technology platform, operational reporting, or
accelerator use case scenarios, as described in Chapter 5, SAP HANA use cases
and integration scenarios on page 61.
The sizing methodology for this scenario is described in detail in SAP Note
1514966 and the attached presentation.

Note: SAP Notes can be accessed at http://service.sap.com/notes. An SAP S-user ID is required.


Sizing the RAM needed


Sizing an SAP HANA system is mainly based on the amount of data to be loaded
into the SAP HANA database because this determines the amount of main
memory (or RAM) needed in an SAP HANA system. To size the RAM, perform
the following steps:
1. Determine the volume of data that is expected to be transferred to the SAP
HANA database. Typically, clients select only a subset of data from their ERP
or CRM databases, so this must be done at the table level.
The information required for this step can be acquired with database tools.
SAP Note 1514966 contains a script supporting this process for SAP
NetWeaver based systems, for example, on IBM DB2 LUW and Oracle. If data
comes from non-SAP NetWeaver systems, use manual SQL statements.
The sizing methodology is based on the uncompressed source data size, so if
compression is used in the source database, this must be taken into
account, too. The script adjusts the table sizes automatically only for the DB2
LUW database because there the information about the compression ratio is
available in the data dictionary.
For other database systems, the compression factor must be estimated. Real
compression factors can differ because compression depends on the
actual data.
If the source database is non-Unicode, multiply the volume of data by 1.5 to
account for the Unicode conversion (a 50% overhead).
The uncompressed total size of all the tables (without DB indexes) storing the
required information in the source database is denoted as A.
2. Although the compression ratio achieved by SAP HANA can vary depending
on the data distribution, a working assumption is that, in general, a
compression factor of 7 can be achieved:
B = ( A / 7 )
B is the amount of RAM required to store the data in the SAP HANA
database.
3. Use only 50% of the total RAM for the in-memory database. The other 50% is
needed for temporary objects (for example, intermediate results), the
operating system, and the application code:
C = B * 2
C is the total amount of RAM required.
Round the total amount of RAM up to the next T-shirt configuration size, as
described in 3.3.1, Concept of T-shirt sizes for SAP HANA on page 26, to get
the correct T-shirt size needed.
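As an illustration, these RAM sizing steps can be sketched in a few lines of Python. This is a rough sketch only; the function names are our own, and the T-shirt memory steps are taken from Table 3-1 on page 26:

```python
# T-shirt server main memory sizes (XS, S, M, L) from Table 3-1, in GB.
TSHIRT_RAM_GB = [128, 256, 512, 1024]

def size_standalone_ram(source_data_gb, unicode_overhead=0.0):
    """Return (B, C) in GB: RAM to hold the data, and total RAM required."""
    a = source_data_gb * (1.0 + unicode_overhead)  # uncompressed source size A
    b = a / 7.0   # step 2: assume a compression factor of 7
    c = b * 2.0   # step 3: only 50% of the RAM holds data, so double it
    return b, c

def next_tshirt_ram(total_ram_gb):
    """Round the total RAM requirement up to the next T-shirt size."""
    for ram in TSHIRT_RAM_GB:
        if total_ram_gb <= ram:
            return ram
    return None  # larger than an L system

b, c = size_standalone_ram(1400.0)  # 1.4 TB of uncompressed source tables
# B = 200 GB to hold the data, C = 400 GB in total,
# which rounds up to a 512 GB (size M) configuration
```

For example, 1.4 TB of uncompressed source data compresses to about 200 GB in memory, requires about 400 GB of total RAM, and therefore suggests a 512 GB (size M) configuration.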

28

In-memory Computing with SAP HANA on IBM eX5 Systems

Sizing the disks


The capacity of the disks is based on the total amount of RAM.
As described in 2.1.2, Data persistence on page 11, there are two types of
storage in SAP HANA:
Disk_persistence
The persistence layer writes snapshots of the database in SAP HANA, the
savepoints, to disk in regular intervals. These are usually written to an array of
SAS drives. The capacity for this storage is calculated based on the total
amount of RAM:
Disk_persistence = 4 * C
Note that backup data must not be stored permanently in this storage. After a
backup is finished, it needs to be moved to external storage media.
Disk_log
This storage contains the database logs, written to flash technology storage
devices, that is, SSDs or PCIe flash adapters. The capacity for this storage is
calculated based on the total amount of RAM:
Disk_log = 1 * C
The certified hardware configurations already take these rules into account, so
there is no need to perform this disk sizing. However, we still include it here for
your understanding.
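For illustration, the two storage rules can be expressed as a small helper. This is a sketch only; the function name and dictionary keys are our own:

```python
def size_disks_gb(total_ram_gb):
    """Derive the disk capacities from the total RAM requirement C."""
    return {
        "persistence": 4 * total_ram_gb,  # Disk_persistence = 4 * C (SAS array)
        "log": 1 * total_ram_gb,          # Disk_log = 1 * C (flash storage)
    }

# For a 512 GB (size M) configuration:
# 2048 GB of persistence storage and 512 GB of log storage
```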

Sizing the CPUs


CPU sizing can be performed if an unusually high number of concurrently
active users executing complex queries is expected. Use the T-shirt
configuration size that satisfies both the memory and the CPU requirements.
The CPU sizing is user-based. The SAP HANA system must support 300 SAPS
for each concurrently active user. The servers used for the IBM Systems Solution
for SAP HANA support about 60 to 65 concurrently active users per CPU,
depending on the server model.
SAP recommends that the CPU load not exceed 65%. Therefore, size the
servers to support no more than 40 - 42 concurrently active users per CPU for
a standard workload.
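As a sketch, the user-based CPU sizing can be computed as follows. The names and the example are our own; the value of 40 users per CPU reflects the 65% load recommendation applied to the 60 - 65 users per CPU figure:

```python
import math

SAPS_PER_ACTIVE_USER = 300  # capacity each concurrently active user requires
MAX_USERS_PER_CPU = 40      # 60-65 users per CPU, derated to 65% load

def saps_needed(concurrent_users):
    """Total SAPS the system must deliver for the given user count."""
    return concurrent_users * SAPS_PER_ACTIVE_USER

def cpus_needed(concurrent_users, users_per_cpu=MAX_USERS_PER_CPU):
    """CPUs required so that the load stays within the 65% recommendation."""
    return math.ceil(concurrent_users / users_per_cpu)

# 300 concurrently active users require 90,000 SAPS and 8 CPUs
```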

Note: The SSD building block, as described in 6.3, Custom server models for SAP HANA on page 110,
combines Disk_persistence and Disk_log on a single SSD array with sufficient capacity.


SAP HANA as the database for an SAP NetWeaver BW


This section covers sizing of SAP HANA as the database for an SAP NetWeaver
BW, as described in section 5.5.1, SAP NetWeaver Business Warehouse
powered by SAP HANA on page 75.
The sizing methodology for this scenario is described in detail in SAP Note
1637145 and attached presentations.

Sizing the RAM needed


Similar to the previous scenario, it is important to estimate the volume of
uncompressed data that will be stored in the SAP HANA database. The main
difference is that the SAP NetWeaver BW system uses column-based tables
only for the tables generated by BW. All other tables are stored as row-based
tables. The compression factor is different for each type of storage; therefore,
the calculation formula is slightly different.
To size the RAM, perform the following steps:
1. The amount of data that will be stored in the SAP HANA database can be
estimated using the scripts attached to SAP Note 1637145. They determine the
volume of row-based and column-based tables separately.
Because the size of certain system tables can grow over time, and because
the row store compression factors are not as high as for the column store, it is
recommended to clean up unnecessary data. In a cleansed SAP NetWeaver
BW system, the volume of row-based data is around 60 GB.
Just as in the previous case, only the size of the tables is relevant. All associated
indexes can be ignored.
If the data in the source system is compressed, the calculated volume
needs to be adjusted by an estimated compression factor for the given
database. Only for DB2 databases, which record the actual compression
rates in the data dictionary, does the script calculate the required corrections
automatically.
If the source system is a non-Unicode system, a Unicode conversion will
be part of the migration scenario. In this case, the volume of data needs to be
adjusted, assuming a 10% overhead, because the majority of the data is expected
to be numerical values.
Alternatively, an ABAP report can be used to estimate the table sizes. SAP
Note 1736976 has a report attached that calculates the sizes based on the
data present in an existing SAP NetWeaver BW system.


The uncompressed total size of all the column tables (without DB indexes) is
denoted as A_column. The uncompressed total size of all the row tables
(without DB indexes) is referred to as A_row.
2. The average compression factor is approximately 4 for column-based data
and around 1.5 for row-based data.
Additionally, an SAP NetWeaver BW system requires about 40 GB of RAM for
additional caches and about 10 GB of RAM for SAP HANA components:
B_column = ( A_column / 4 )
B_row = ( A_row / 1.5 )
B_other = 50
For a fully cleansed SAP NetWeaver BW system holding 60 GB of row store
data, we can therefore assume a requirement of about 40 GB of RAM for
row-based data:
B_row = 40
B is the amount of RAM required to store the data in the SAP HANA database
for a given type of data.
3. Additional RAM is required for objects that are populated with new data and
for queries. This requirement applies to column-based tables:
C = B_column * 2 + B_row + B_other
For fully cleansed BW systems, this formula can be simplified:
C = B_column * 2 + 90
C is the total amount of RAM required.
The total amount of RAM must be rounded up to the next T-shirt configuration
size, as described in 3.3.1, Concept of T-shirt sizes for SAP HANA on page 26,
to get the correct T-shirt size needed.
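The BW RAM formula can be sketched like this. The names are our own; the default of 60 GB of row-store data assumes a cleansed system, as described above:

```python
def size_bw_ram_gb(a_column_gb, a_row_gb=60.0):
    """Total RAM C for BW on SAP HANA from uncompressed table sizes."""
    b_column = a_column_gb / 4.0  # column-store compression factor ~4
    b_row = a_row_gb / 1.5        # row-store compression factor ~1.5
    b_other = 50.0                # ~40 GB BW caches + ~10 GB HANA components
    return b_column * 2 + b_row + b_other

# A cleansed system with 1 TB of uncompressed column data and 60 GB of
# row data: C = 250 * 2 + 40 + 50 = 590 GB, which rounds up to a
# 1024 GB (size L) configuration
```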

Sizing the disks


The capacity of the disks is based on the total amount of RAM and follows the
same rules as in the previous scenario. For more details, see Sizing the disks
on page 29.
Disk_persistence = 4 * C
Disk_log = 1 * C
As in the previous case, disk sizing is not required because certified hardware
configurations already take these rules into account.


Special considerations for scale-out systems


If the memory requirements exceed the capacity of a single-node appliance, a
scale-out configuration needs to be deployed.
In this case, it is important to understand how the data is distributed during the
import operation. For optimal performance, different types of workload must be
separated from each other.
The master node holds all row-based tables, which are mostly system tables,
and is responsible for SAP NetWeaver-related workloads. Also, additional SAP
HANA database components are hosted on the master node, such as a name
server or statistics server.
The additional slave nodes hold the master data and transactional data.
Transactional tables are partitioned and distributed across all slave
nodes to achieve optimal parallel processing.
This logic must be taken into account when planning a scale-out configuration
for an SAP NetWeaver BW system. For more information, review the following
SAP Notes and attached presentations:
Note 1637145 - SAP NetWeaver BW on SAP HANA: Sizing SAP In-Memory
Database
Note 1702409 - SAP HANA DB: Optimal number of scale out nodes for SAP
NetWeaver BW on SAP HANA
Note 1736976 - Sizing Report for BW on HANA

SAP HANA as the database for SAP Business Suite


This section covers sizing of SAP HANA as the database for an SAP Business
Suite application, as described in section 5.5.3, SAP Business Suite powered by
SAP HANA on page 85.
The sizing methodology for this scenario is described in detail in SAP Note
1793345. Because we are still in the early days of SAP Business Suite with SAP
HANA, only preliminary sizing rules are available. These might change in
the future, when further experience from productive deployments is available,
or due to future optimizations in SAP HANA.
For initial sizings, the SAP Quick Sizer (available online at
http://service.sap.com/quicksizer) can be used to determine an estimate of the
database size and CPU requirements, as input values for an SAP HANA sizing
as described in this section.


Sizing the RAM needed


Similar to the previous scenarios, it is important to estimate the volume of
uncompressed data that will be stored in the SAP HANA database. Having a
well-maintained database can reduce the RAM requirements. To size the RAM,
perform the following steps:
1. Determine the volume of data that is expected to be transferred to the SAP
HANA database. The information required for this step can be acquired with
database tools. SAP Note 1514966 contains a script supporting this process
for SAP NetWeaver based systems, for example, IBM DB2 LUW and Oracle.
The sizing methodology is based on the uncompressed source data size, so if
compression is used in the source database, this must be taken into
account, too. The script adjusts the table sizes automatically only for the DB2
LUW database because there the information about the compression ratio is
available in the data dictionary.
For other database systems, the compression factor must be estimated. Real
compression factors can differ because compression depends on the
actual data.
If the source database is non-Unicode, multiply the volume of data by 1.5 to
account for the Unicode conversion (a 50% overhead).
The uncompressed total size of all the tables (without DB indexes) storing the
required information in the source database is denoted as A.
2. The current recommendation from SAP is to halve the size of the
uncompressed data and apply a 20% safety buffer to the resulting number:
B = ( A / 2)
C = B * 1.2
B is the amount of RAM required to store the data in the SAP HANA
database. C extends this amount by the recommended safety buffer.
C is the total amount of RAM required.
Round the total amount of RAM up to the next T-shirt configuration size, as
described in 3.3.1, Concept of T-shirt sizes for SAP HANA on page 26, to get
the correct T-shirt size needed. Take the special T-shirt sizes for SAP Business
Suite with SAP HANA into account, in addition to the general SAP HANA T-shirt
sizes, because there is currently no support for the distribution of the data across
multiple systems, as outlined in Restrictions on page 87. With SAP Business
Suite, which is powered by SAP HANA, it is especially important to take possible
future growth into account, given these current scale-out restrictions.
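As a sketch, this preliminary rule reads as follows (the function name is our own):

```python
def size_suite_ram_gb(uncompressed_gb):
    """Preliminary Suite-on-HANA rule: halve the data, add a 20% buffer."""
    b = uncompressed_gb / 2.0  # B: RAM required to hold the data
    c = b * 1.2                # C: total RAM including the safety buffer
    return b, c

# 800 GB of uncompressed source data: B = 400 GB, C = 480 GB,
# which rounds up to a 512 GB (size M) configuration
```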


Sizing the disks


The capacity of the disks is based on the total amount of RAM and follows the
same rules as in the previous scenarios. For more details, see Sizing the disks
on page 29.
Disk_persistence = 4 * C
Disk_log = 1 * C
As in the previous cases, disk sizing is not required because certified hardware
configurations already take these rules into account.

Selecting a T-shirt size


According to the sizing results, select an SAP HANA T-shirt size that satisfies the
sizing requirements in terms of main memory, and possibly CPU capabilities. For
example, a sizing result of 400 GB for the main memory (C) suggests a T-shirt
size of M.
The sizing methodology previously described is valid at the time of writing this
publication and only for the use case scenarios previously mentioned. Other use
cases might require a different sizing methodology. Also, SAP HANA is constantly
being optimized, which might affect the sizing methodology. Consult the SAP
documentation regarding other use cases and up-to-date sizing information.
Note: The sizing approach described here is simplified and can only provide a
rough idea of the sizing process for the actual sizing for SAP HANA. Consult
the SAP sizing documentation for SAP HANA when performing an actual
sizing. It is also a best practice to involve SAP for a detailed sizing because
the result of the sizing does not only affect the hardware infrastructure, but it
also affects the SAP HANA licensing.
In addition to the sizing methodologies described in SAP Notes, SAP provides
sizing support for SAP HANA in the SAP Quick Sizer. The SAP Quick Sizer is an
online sizing tool that supports most of the SAP solutions available. For SAP
HANA, it supports sizing for:
Stand-alone SAP HANA system, implementing the sizing algorithms
described in SAP Note 1514966 (which we described above)
SAP HANA as the database for an SAP NetWeaver BW system,
implementing the sizing algorithms described in SAP Note 1637145
Special sizing support for the SAP HANA rapid-deployment solutions


The SAP Quick Sizer is accessible online (an SAP S-user ID is required) at the
following site:

http://service.sap.com/quicksizer

3.4 SAP HANA software licensing


As described in 3.2, SAP HANA delivery model on page 25, SAP HANA has an
appliance-like delivery model. However, although the hardware partners deliver
the infrastructure, including the operating system and middleware, the license for
the SAP HANA software must be obtained directly from SAP.
The SAP HANA software is available in these editions:
SAP HANA platform edition
This is the basic edition containing the software stack needed to use SAP
HANA as a database, including the SAP HANA database, SAP HANA Studio
for data modeling and administration, the SAP HANA clients, and software
infrastructure components. The software stack comes with the hardware
provided by the hardware partners; whereas, the license has to be obtained
from SAP.
SAP HANA enterprise edition
The SAP HANA enterprise edition extends the SAP HANA platform edition
with the software licenses needed for SAP Landscape Transformation
replication, ETL-based replication using SAP BusinessObjects Data Services,
and Extractor-based replication with Direct Extractor Connection.
SAP HANA extended enterprise edition
The SAP HANA extended enterprise edition extends the SAP HANA platform
edition with the software licenses needed for log-based replication with the
Sybase Replication Server. This edition was discontinued with the introduction
of SAP HANA 1.0 SPS05 in November 2012 (see section 2.1.3 of
http://help.sap.com/hana/hana_sps5_whatsnew_en.pdf).
Although the SAP HANA software comes packaged in these editions, the actual
SAP HANA software licensing depends on the use case. Chapter 5, SAP
HANA use cases and integration scenarios on page 61 discusses the different
use cases for SAP HANA. Discuss the licensing options for your particular use
case with SAP.


Chapter 3. SAP HANA overview

35

The SAP HANA licenses for most use cases are based on the amount of main
memory for SAP HANA. The smallest licensable memory size is 64 GB,
increasing in steps of 64 GB. The hardware might provide up to double the
licensed amount of main memory, as illustrated in Table 3-3.
Table 3-3 Licensable memory per T-shirt size

T-shirt size    Server main memory    Licensable memory (in steps of 64 GB)
XS              128 GB                64 - 128 GB
S               256 GB                128 - 256 GB
M               512 GB                256 - 512 GB
L               1024 GB (= 1 TB)      512 - 1024 GB

As shown in Table 3-3, the licensing model provides a matching T-shirt
size for any licensable memory size from 64 GB to 1024 GB.
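For illustration, rounding a memory requirement up to the next 64 GB licensing step can be sketched as follows (the function name is our own):

```python
def licensable_memory_gb(required_gb):
    """Round a RAM requirement up to the next 64 GB licensing step."""
    steps = max(1, -(-required_gb // 64))  # ceiling division, 64 GB minimum
    return steps * 64

# A 400 GB requirement licenses as 448 GB and runs on a 512 GB (size M) server
```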


Chapter 4. Software components and replication methods
This chapter explains the purpose of individual software components of the SAP
HANA solution and introduces available replication technologies.
The following sections are covered:
4.1, SAP HANA software components on page 38
4.2, Data replication methods for SAP HANA on page 54

Copyright IBM Corp. 2013. All rights reserved.

37

4.1 SAP HANA software components


The SAP HANA solution is composed of the following main software
components, which we describe in the following sections:

4.1.1, SAP HANA database on page 39


4.1.2, SAP HANA client on page 39
4.1.3, SAP HANA studio on page 40
4.1.4, SAP HANA studio repository on page 48
4.1.5, SAP HANA landscape management structure on page 49
4.1.6, SAP host agent on page 49
4.1.7, Software Update Manager for SAP HANA on page 50
4.1.8, SAP HANA Unified Installer on page 53

Figure 4-1 illustrates the possible locations of these components.

User workstation: SAP HANA studio (optional), SAP HANA client (optional)
Server: SAP HANA client
Server (log replication): Sybase Replication Agent (*1)
SAP HANA appliance: SAP HANA database (data modeling, row store, column store), SAP HANA studio (optional), SAP host agent, SAP HANA client, SAP HANA LM structure, SAP HANA studio repository, Software Update Manager
Other optional components: SMD Agent (optional), Sybase Replication Server (*1), Sybase EDCA (*1), SAP HANA Load Controller (*1)

(*1) Component is required only in case of replication using the Sybase Replication Server
EDCA = Enterprise Connect Data Access
SMD = Solution Manager Diagnostics

Figure 4-1 Distribution of software components related to SAP HANA

Components related to replication using the Sybase Replication Server are not
covered in this publication.


4.1.1 SAP HANA database


The SAP HANA database is the heart of the SAP HANA offering and the most
important software component running on the SAP HANA appliance.
SAP HANA is an in-memory database that combines row-based and
column-based database technology. All standard features available in other
relational databases are supported (for example, tables, views, indexes,
triggers, and an SQL interface).
On top of these standard functions, the SAP HANA database also offers
modeling capabilities that allow you to define in-memory transformations of
relational tables into analytic views. These views are not materialized; therefore,
all queries provide real-time results based on the content of the underlying
tables.
Another feature extending the capabilities of the SAP HANA database is the
SQLScript programming language, which allows you to capture transformations
that might not be easy to define using simple modeling.
The SAP HANA database can also be integrated with external applications, such
as an R software environment. Using these possibilities, customers can
extend their models with existing statistical and analytical functions
developed in the R programming language.
The internal structures of the SAP HANA database are explained in detail in
Chapter 3, SAP HANA overview on page 21.

4.1.2 SAP HANA client


The SAP HANA client is a set of libraries that are used by external applications to
connect to the SAP HANA database.
The following interfaces are available after installing the SAP HANA client
libraries:
SQLDBC
An SAP native database SDK that can be used to develop new custom
applications working with the SAP HANA database.
OLE DB for OLAP (ODBO) (available only on Windows)
ODBO is a Microsoft driven industry standard for multi-dimensional data
processing. The query language used in conjunction with ODBO is the
Multidimensional Expressions (MDX) language.

Chapter 4. Software components and replication methods

39

Open Database Connectivity (ODBC)
The ODBC interface is a standard for accessing database systems, which was
originally developed by Microsoft.
Java Database Connectivity (JDBC)
JDBC is a Java based interface for accessing database systems.
The SAP HANA client libraries are delivered in 32-bit and 64-bit editions. Always
use the edition that matches the architecture of the application that will use the
client: 32-bit applications cannot use 64-bit client libraries, and vice versa.
To access the SAP HANA database from Microsoft Excel you can also use a
special 32-bit edition of the SAP HANA client called SAP HANA client package
for Microsoft Excel.
The SAP HANA client is backward compatible, meaning that the revision of the
client must be the same as or higher than the revision of the SAP HANA database.
The SAP HANA client libraries must be installed on every machine where
connectivity to the SAP HANA database is required. This includes not only all
servers but also user workstations that are hosting applications that are directly
connecting to the SAP HANA database (for example, SAP BusinessObjects
Client Tools or Microsoft Excel).
It is important to keep in mind that whenever the SAP HANA database is updated
to a more recent revision, all clients associated with this database must also be
upgraded.
For more information about how to install the SAP HANA client, see the official
SAP guide SAP HANA Database - Client Installation Guide, which is available for
download at the following location:
http://help.sap.com/hana_appliance

4.1.3 SAP HANA studio


The SAP HANA studio is a graphical user interface that is required to work with
local or remote SAP HANA database installations. It is a multipurpose tool that
covers all of the main aspects of working with the SAP HANA database. Because
of that, the user interface is slightly different for each function.
Note that the SAP HANA studio is not dependent on the SAP HANA client.


The following main function areas are provided by the SAP HANA studio (each
function area is also illustrated by a corresponding figure of the user interface):
Database administration
The key functions are stopping and starting the SAP HANA databases, status
overview, monitoring, performance analysis, parameter configuration, tracing,
and log analysis.
Figure 4-2 shows the SAP HANA studio user interface for database
administration.

Figure 4-2 SAP HANA studio: Administration console (overview)


Security management
This provides tools that are required to create users, to define and assign
roles, and to grant database privileges.
Figure 4-3 shows an example of the user definition dialog.

Figure 4-3 SAP HANA studio: User definition dialog


Data management
Functions to create, change, or delete database objects (like tables, indexes,
views), and commands to manipulate data (for example: insert, update,
delete, bulk load).
Figure 4-4 shows an example of the table definition dialog.

Figure 4-4 SAP HANA studio: Table definition dialog


Modeling
This is the user interface to work with models (metadata descriptions of how
source data is transformed into the resulting views), including the possibility to
define new custom models and to adjust or delete existing models.
Figure 4-5 shows a simple analytic model.

Figure 4-5 SAP HANA studio: Modeling interface (analytic view)


Content management
Functions offering the possibility to organize models in packages, to define
delivery units for transport into a subsequent SAP HANA system, or to export
and import individual models or whole packages.
Content management functions are accessible from the main window in the
modeler perspective, as shown in Figure 4-6.

Figure 4-6 SAP HANA studio: Content functions on the main panel of modeler perspective


Replication management
Data replication into the SAP HANA database is controlled from the data
provisioning dialog in the SAP HANA studio, where new tables can be
scheduled for replication, and where replication for a particular table can be
suspended or interrupted.
Figure 4-7 shows an example of a data provisioning dialog.

Figure 4-7 SAP HANA studio: Data provisioning dialog


Software Lifecycle Management


The SAP HANA solution offers the possibility to automatically download and
install updates to SAP HANA software components. This function is
controlled from the Software Lifecycle Management dialog in the SAP HANA
studio. Figure 4-8 shows an example of such a dialog.

Figure 4-8 SAP HANA studio: Software lifecycle dialog

SAP HANA database queries are typically consumed indirectly, through front-end
components such as SAP BusinessObjects BI 4.0 clients. Therefore, the SAP
HANA studio is required only for administration and development, and is not
needed by end users.
The SAP HANA studio runs on the Eclipse platform; therefore, every user must
have Java Runtime Environment (JRE) 1.6 or 1.7 installed, with a matching
architecture (the 64-bit SAP HANA studio requires a 64-bit JRE).
Currently supported platforms are Windows 32-bit, Windows 64-bit, and Linux
64-bit.
Just like the SAP HANA client, the SAP HANA studio is backward compatible:
the revision level of the SAP HANA studio must be the same as, or higher than,
the revision level of the SAP HANA database.


However, based on practical experience, the best approach is to keep the SAP
HANA studio on the same revision level as the SAP HANA database whenever
possible. Installation and parallel use of multiple revisions of SAP HANA studio
on one workstation is possible. When using one SAP HANA studio instance for
multiple SAP HANA databases, the revision level of the SAP HANA studio must
be the same as, or higher than, the highest revision level of the SAP
HANA databases being connected to.
SAP HANA studio must be updated to a more recent version on all workstations
whenever the SAP HANA database is updated. This can be automated using
Software Update Manager (SUM) for SAP HANA. We provide more details about
this in 4.1.4, SAP HANA studio repository on page 48, and 4.1.7, Software
Update Manager for SAP HANA on page 50.
For more information about how to install the SAP HANA studio, see the official
SAP guide, SAP HANA Database - Studio Installation Guide, which is available
for download at the following location:
http://help.sap.com/hana_appliance

4.1.4 SAP HANA studio repository


Because SAP HANA studio is an Eclipse-based product, it can benefit from all
standard features offered by this platform. One of these features is the ability to
automatically update the product from a central repository on the SAP HANA
server.
The SAP HANA studio repository is initially installed by the SAP HANA Unified
Installer and must be manually updated at the same time that the SAP HANA
database is updated (more details about version compatibility are in section
4.1.3, SAP HANA studio on page 40). This repository can then be used by all
SAP HANA studio installations to automatically download and install new
versions of code.
Using this feature is probably the most reliable way to keep all installations of
SAP HANA studio in sync with the SAP HANA database. However, note that a
one-time configuration effort is required on each workstation (for more details, see
4.1.7, Software Update Manager for SAP HANA on page 50).
For more information about how to install the SAP HANA studio repository, see
the official SAP guide, SAP HANA Database - Studio Installation Guide, which is
available for download at the following location:
http://help.sap.com/hana_appliance


4.1.5 SAP HANA landscape management structure


The SAP HANA landscape management (LM) structure (lm_structure) is an XML
file that describes the software components installed on a server. The
information in this file includes:
- System ID (SID) of the SAP HANA system and the host name
- Stack description, including the edition (depending on the license schema)
- Information about the SAP HANA database, including the installation directory
- Information about the SAP HANA studio repository, including its location
- Information about the SAP HANA client, including its location
- In the case of the SAP HANA enterprise extended edition, information about
  the SAP HANA load controller (which is part of the Sybase Replication
  Server-based replication)
- Information about the host controller
The LM structure description also contains revisions of individual components
and therefore needs to be upgraded when the SAP HANA database is upgraded.
Information contained in this file is used by the System Landscape Directory
(SLD) data suppliers and by the Software Update Manager (SUM) for SAP
HANA.
More information about how to configure the SLD connection is provided in the
official SAP guide, SAP HANA Installation Guide with Unified Installer, which is
available for download at the following location:
http://help.sap.com/hana_appliance

4.1.6 SAP host agent


The SAP host agent is a standard part of every SAP installation. In an SAP
HANA environment, it is important in the following situations:
- Automatic update using SUM for SAP HANA (more information is in the
  documents SAP HANA Automated Update Guide and SAP HANA Installation
  Guide with Unified Installer)
- Replication using the Sybase Replication Server, where the host agent
  handles login authentication between source and target servers (explained in
  the document SAP HANA Installation and Configuration Guide - Log-Based
  Replication)


4.1.7 Software Update Manager for SAP HANA


The Software Update Manager (SUM) for SAP HANA is a tool that belongs to the
SAP Software Logistics (SL) Toolset. This tool offers two main functions:
- Automated update of the SAP HANA server components to the latest
  revision, downloaded from SAP Service Marketplace
- Enablement of automated updates of remote SAP HANA studio installations
  against the studio repository installed on the SAP HANA server
Both functions are discussed in the subsequent sections.

Automated update of SAP HANA server components


The SAP Software Update Manager is a separate software component that must
be started on the SAP HANA server. A good practice is to install this component
as a service.
Tip: The Software Update Manager can be configured as a Linux service by
running the following commands:

    export JAVA_HOME=/usr/sap/<SID>/SUM/jvm/jre
    /usr/sap/<SID>/SUM/daemon.sh install

The service can be started using the following command:

    /etc/init.d/sum_daemon start
The SAP Software Update Manager does not have a user interface. It is
controlled remotely from the SAP HANA Studio.


Figure 4-9 illustrates the interaction of SUM with other components.

Figure 4-9 Interaction of Software Update Manager (SUM) for SAP HANA with
other software components: the SAP Service Marketplace provides the
installation packages (stack.xml, IMCE_SERVER*.SAR, IMCE_CLIENT*.SAR,
IMC_STUDIO*.SAR, HANALDCTR*.SAR, SAPHOSTAGENT*.SAR, and SUMHANA*.SAR) to the
Software Update Manager on the SAP HANA appliance, which performs a
self-update and updates the SAP host agent, the SAP HANA client, the SAP HANA
database, the SAP HANA studio repository, and the SAP HANA Load Controller
(required only in case of replication using Sybase Replication Server); the
SAP HANA studio on the user workstation is served from the studio repository

The Software Update Manager can download support package stack information
and other required files directly from the SAP Service Marketplace (SMP).
If a direct connection from the server to the SAP Service Marketplace is not
available, the support package stack definition and installation packages must be
downloaded manually and then uploaded to the SAP HANA server. In this case,
the stack generator at the SAP Service Marketplace can be used to identify
required packages and to generate the stack.xml definition file (a link to the stack
generator is located in the download section, subsection Support packages in
the SAP HANA area).
The SUM update file (SUMHANA*.SAR archive) is not part of the stack definition
and needs to be downloaded separately.
The Software Update Manager will first perform a self-update as soon as the
Lifecycle Management perspective is opened in the SAP HANA studio.
After the update is started, all SAP HANA software components are updated to
their target revisions, as defined by the support package stack definition file. This
operation needs downtime; therefore, a maintenance window is required, and the
database must be backed up before this operation.


This scenario is preconfigured during installation using the Unified Installer (see
the document SAP HANA Installation Guide with Unified Installer - section SUM
for SAP HANA Default Configuration for more details). If both the SAP HANA
studio and the Software Update Manager for SAP HANA are running on SPS04,
no further steps are required.
Otherwise, a final configuration step must be performed on each remote
workstation where the SAP HANA studio is located: the server certificate must
be installed into the Java keystore.
For more information about installation, configuration, and troubleshooting of
SUM updates, see the following guides:
SAP HANA Installation Guide with Unified Installer
SAP HANA Automated Update Guide
The most common problem during configuration of automatic updates using
SUM is a host name mismatch between server installation (fully qualified host
name that was used during installation of SAP HANA using Unified Installer) and
the host name used in the SAP HANA studio. For more information, see the
troubleshooting section in SAP HANA Automated Update Guide.

Automated update of SAP HANA studio


The second function of the Software Update Manager for SAP HANA is to act as
an update server for remote SAP HANA Studio installations.
Figure 4-10 illustrates the interaction of SUM with other components for this
scenario.

Figure 4-10 Interaction of the SUM for SAP HANA with other software components
during the update of a remote SAP HANA studio: the SAP HANA studio on the user
workstation reads updated components from the SAP HANA studio repository on
the SAP HANA appliance through the Software Update Manager


If the Unified Installer was used to install SAP HANA software components, no
actions need to be performed on the server.
The only configuration step that is needed is to adjust the SAP HANA studio
preferences to enable updates and to define the location of the update server.

4.1.8 SAP HANA Unified Installer


The SAP HANA Unified Installer is a tool intended for use by SAP HANA
hardware partners. It installs all required software components on the SAP
HANA appliance according to SAP requirements and specifications.
Installation parameters, such as system ID, system number, and locations of
required directories, are provided through the configuration file.
The tool then automatically deploys all required software components in
predefined locations and performs all mandatory steps to configure the SUM for
SAP HANA.
See the SAP HANA Installation Guide with Unified Installer for more details.

4.1.9 Solution Manager Diagnostics agent


SAP HANA can be connected to an SAP Solution Manager 7.1, SP03 or higher.¹
The Solution Manager Diagnostics (SMD) provides a set of tools to monitor and
analyze SAP systems, including SAP HANA. It provides a centralized way to
trace problems in all systems connected to an SAP Solution Manager system.
The SMD agent is an optional component, which can be installed on the SAP
HANA appliance. It enables diagnostics of the SAP HANA appliance through
SAP Solution Manager. The SMD agent not only provides access to the
database logs, but also access to the file system and collects information about
the system's CPU and memory consumption through the SAP host agent.

¹ With monitor content update and additional SAP Notes, also for SP02


4.1.10 SAP HANA On-Site Configuration Tool


The SAP HANA On-Site Configuration Tool is a tool for performing additional
post-installation steps. With the SAP HANA On-Site Configuration Tool, you can
perform the following tasks:
- Rename the SAP HANA system, changing the System ID (SID)
- Add or remove nodes in a scale-out configuration
- Add or remove an additional SAP HANA system (instance) on a single
  SAP HANA appliance
- Configure the SAP HANA system to connect to the System Landscape
  Directory (SLD)
- Install or uninstall the SMD agent on the SAP HANA system
- Install Application Function Libraries (AFLs) on an SAP HANA system
- Install SAP liveCache Applications (SAP LCAs) on an SAP HANA system
A detailed description about how to use the SAP HANA On-Site Configuration
Tool is contained in the guide SAP HANA Installation Guide with SAP HANA
Unified Installer, which is available online at:
http://help.sap.com/hana_appliance

4.2 Data replication methods for SAP HANA


Data can be written to the SAP HANA database either directly by a source
application, or it can be replicated using replication technologies.
The following replication methods are available for use with the SAP HANA
database:
- Trigger-based replication
  This method is based on database triggers created in the source system to
  record all changes to monitored tables. These changes are then replicated to
  the SAP HANA database using the SAP Landscape Transformation system.
- ETL-based replication
  This method employs an Extract, Transform, and Load (ETL) process to
  extract data from the data source, transform it to meet the business or
  technical needs, and load it into the SAP HANA database. The SAP
  BusinessObjects Data Services application is used as part of this replication
  scenario.
- Extractor-based replication
  This approach uses the embedded SAP NetWeaver BW that is available on
  every SAP NetWeaver-based system to start an extraction process using
  available extractors, and then redirects the write operation to the SAP HANA
  database instead of the local Persistent Staging Area (PSA).
- Log-based replication
  This method is based on reading the transaction logs from the source
  database and reapplying them to the SAP HANA database.
Figure 4-11 illustrates these replication methods.

Figure 4-11 Available replication methods for SAP HANA: trigger-based and
ETL-based replication attach at the application layer of the source system
(SAP ERP), extractor-based replication uses the embedded BW, and log-based
replication reads the database log files; all four methods feed the SAP HANA
database

The following sections discuss these replication methods for SAP HANA in more
detail.

4.2.1 Trigger-based replication with SAP Landscape Transformation


SAP Landscape Transformation replication is based on tracking database
changes using database triggers. All modifications are stored in logging tables in
the source database, which ensures that every change is captured even when
the SAP Landscape Transformation system is not available.
The SAP Landscape Transformation system reads changes from source systems
and updates the SAP HANA database accordingly. The replication process can


be configured as real-time (continuous replication) or scheduled replication in
predefined intervals.
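The capture mechanism can be sketched generically. The following Python example (using SQLite purely for illustration; the table names and log format are invented, and this is not the actual SAP Landscape Transformation implementation) shows how a database trigger records every change in a logging table, which a replicator later reads, applies to the target, and clears:

```python
import sqlite3

# Source database with a monitored table and a logging table (illustration only).
src = sqlite3.connect(":memory:")
src.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL)")
src.execute("CREATE TABLE log_orders (op TEXT, id INTEGER, amount REAL)")
# The trigger captures every insert, even while the replicator is offline.
src.execute("""
    CREATE TRIGGER trg_orders AFTER INSERT ON orders
    BEGIN
        INSERT INTO log_orders VALUES ('I', NEW.id, NEW.amount);
    END
""")
src.execute("INSERT INTO orders VALUES (1, 99.90)")
src.execute("INSERT INTO orders VALUES (2, 10.50)")

# Target database, standing in for the SAP HANA database.
tgt = sqlite3.connect(":memory:")
tgt.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL)")

# The replicator reads the logging table, applies the changes, and clears it.
for op, oid, amount in src.execute("SELECT op, id, amount FROM log_orders"):
    if op == "I":
        tgt.execute("INSERT INTO orders VALUES (?, ?)", (oid, amount))
src.execute("DELETE FROM log_orders")

print(tgt.execute("SELECT COUNT(*) FROM orders").fetchone()[0])  # prints 2
```

Because the logging table persists in the source database, changes survive an outage of the replication system, which mirrors the behavior described above.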
The SAP Landscape Transformation operates on the application level; therefore,
the trigger-based replication method benefits from the database abstraction
provided by the SAP software stack, which makes it database independent. It
also has extended source system release coverage: supported releases range
from SAP R/3 4.6C up to the newest SAP Business Suite releases.
The SAP Landscape Transformation also supports direct replication from
database systems supported by the SAP NetWeaver platform. In this case, the
database must be connected to the SAP Landscape Transformation system
directly (as an additional database) and the SAP Landscape Transformation is
playing the role of the source system.
The replication process can be customized by creating ABAP routines and
configuring their execution during replication process. This feature allows the
SAP Landscape Transformation system to replicate additional calculated
columns and to scramble existing data or filter-replicated data based on defined
criteria.
The SAP Landscape Transformation replication leverages proven System
Landscape Optimization (SLO) technologies (such as Near Zero Downtime, Test
Data Migration Server (TDMS), and SAP Landscape Transformation) and can
handle both Unicode and non-Unicode source databases. The SAP Landscape
Transformation replication provides a flexible and reliable replication process,
fully integrates with SAP HANA Studio, and is simple and fast to set up.
The SAP Landscape Transformation Replication Server does not have to be a
separate SAP system. It can run on any SAP system with the SAP NetWeaver
7.02 ABAP stack (Kernel 7.20EXT). However, it is recommended to install the
SAP Landscape Transformation Replication Server on a separate system to
avoid a high replication load causing a performance impact on the base system.
The SAP Landscape Transformation Replication Server is the ideal solution for
all SAP HANA customers who need real-time (or scheduled) data replication
from SAP NetWeaver-based systems or databases supported by SAP
NetWeaver.


4.2.2 ETL-based replication with SAP BusinessObjects Data Services


An ETL-based replication for SAP HANA can be set up using SAP
BusinessObjects Data Services, which is a full-featured ETL tool that gives
customers maximum flexibility regarding the source database system:
Customers can specify and load the relevant business data in defined periods
of time from an SAP ERP system into the SAP HANA database.
SAP ERP application logic can be reused by reading extractors or utilizing SAP
function modules.
It offers options for the integration of third-party data providers and supports
replication from virtually any data source.
Data transfers are done in batch mode, which limits the real-time capabilities of
this replication method.
SAP BusinessObjects Data Services provides several kinds of data quality and
data transformation functionality. Due to the rich feature set available,
implementation time for the ETL-based replication is longer than for the other
replication methods. SAP BusinessObjects Data Services offers integration with
SAP HANA. SAP HANA is available as a predefined data target for the load
process.
The ETL-based replication server is the ideal solution for all SAP HANA
customers who need data replication from non-SAP data sources.
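The extract, transform, and load steps themselves can be sketched in a few lines (a generic Python illustration with invented sample data; it does not use the SAP BusinessObjects Data Services API):

```python
import sqlite3

# Extract: rows from an arbitrary source (here, a plain list standing in for
# any supported data source).
source_rows = [(" de ", 100.0), ("us", 250.0), ("DE ", 75.5)]

# Transform: cleanse the country code; this stage is where the data quality
# functions of a full ETL tool would run.
def transform(row):
    country, amount = row
    return (country.strip().upper(), amount)

# Load: bulk-insert the transformed batch into the target database.
tgt = sqlite3.connect(":memory:")
tgt.execute("CREATE TABLE sales (country TEXT, amount REAL)")
tgt.executemany("INSERT INTO sales VALUES (?, ?)", map(transform, source_rows))

for row in tgt.execute(
        "SELECT country, SUM(amount) FROM sales GROUP BY country ORDER BY country"):
    print(row)
```

The batch-oriented load mirrors the batch mode noted above; the transform stage is where a full ETL tool applies its transformation and data quality functionality.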

4.2.3 Extractor-based replication with Direct Extractor Connection


Extractor-based replication for SAP HANA is based on already existing
application logic available in every SAP NetWeaver system. The SAP NetWeaver
BW package that is a standard part of the SAP NetWeaver platform can be used
to run an extraction process and store the extracted data in the SAP HANA
database.
This functionality requires corrections and configuration changes to both the
SAP HANA database (import of a delivery unit and parameterization) and the
SAP NetWeaver BW system (implementing corrections using an SAP Note or
installing a support package, and parameterization). Corrections in the SAP
NetWeaver BW system ensure that extracted data is not stored in the local
Persistent Staging Area (PSA), but is diverted to the external SAP HANA
database.
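Conceptually, the correction changes only the write target of the extraction process. The following Python sketch (all names, such as run_extractor, are invented; this is not actual SAP NetWeaver BW code) illustrates diverting the extractor output from the local PSA to an external target:

```python
# Generic sketch: the extractor's output is normally written to the local
# Persistent Staging Area (PSA); the correction diverts it to an external target.
def run_extractor(records, target):
    """Apply the extractor's transformations and hand each record to 'target'."""
    for rec in records:
        target.append({"material": rec["matnr"].lstrip("0"), "qty": rec["menge"]})

psa = []          # local PSA, used in a standard SAP NetWeaver BW setup
external_db = []  # stands in for the external SAP HANA database

records = [{"matnr": "000123", "menge": 5}, {"matnr": "000777", "menge": 2}]
# After the correction, the same extractor writes to the external target
# instead of the PSA:
run_extractor(records, external_db)

print(len(psa), len(external_db))  # prints 0 2
```

The point of the sketch is that the extractor's transformation logic is reused unchanged; only the destination of the write operation differs.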
Use of native extractors instead of replication of underlying tables can bring
certain benefits. Extractors offer the same transformations that are used by SAP


NetWeaver BW systems. This can significantly decrease the complexity of
modeling tasks in the SAP HANA database.
This type of replication is not real-time, and only the features and
transformation capabilities provided by a given extractor can be used.
Replication using Direct Extractor Connection (DXC) can be realized in the
following basic scenarios:
Using the embedded SAP NetWeaver BW functionality in the source system
SAP NetWeaver BW functions in the source system are usually not used.
After implementation of the required corrections, the source system calls its
own extractors and pushes data into the external SAP HANA database.
The source system must be based on SAP NetWeaver 7.0 or higher. Because
the output of a given extractor is diverted into the SAP HANA database, this
extractor must not be in use by the embedded SAP NetWeaver BW
component for any other purpose.
Using an existing SAP NetWeaver BW to drive replication
An existing SAP NetWeaver BW can be used to extract data from the source
system and to write the result to the SAP HANA system.
The release of the SAP NetWeaver BW system that is used must be at least
SAP NetWeaver 7.0, and the given extractor must not be in use for this
particular source system.
Using a dedicated SAP NetWeaver BW to drive replication
The last option is to install a dedicated SAP NetWeaver system to extract data
from the source system and store the result in the SAP HANA database. This
option has minimal impact on existing functionality because no existing
system is changed in any way. However, a new system is required for this
purpose.
The current implementation of this replication technology allows for only one
database schema in the SAP HANA database. Using one system to control
replication of multiple source systems can lead to collisions, because all
source systems use the same database schema in the SAP HANA database.

4.2.4 Log-based replication with Sybase Replication Server


The log-based replication for SAP HANA is realized with the Sybase Replication
Server. It captures table changes from low-level database log files and
transforms them into SQL statements that are in turn executed on the SAP
HANA database. This is similar to what is known as log shipping between two
database instances.
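The principle can be illustrated with a generic sketch (Python with SQLite; the Sybase Replication Server reads low-level binary log files, not a simple list as shown here): captured log records are transformed into SQL statements and replayed on the target in their original order.

```python
import sqlite3

# A captured change log, standing in for low-level database log records.
log_entries = [
    ("INSERT", "accounts", (1, 500.0)),
    ("INSERT", "accounts", (2, 120.0)),
    ("UPDATE", "accounts", (1, 450.0)),
]

tgt = sqlite3.connect(":memory:")
tgt.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance REAL)")

# Replay: transform each log record into an SQL statement and execute it on
# the target, preserving the original order of changes.
for op, table, (key, value) in log_entries:
    if op == "INSERT":
        tgt.execute(f"INSERT INTO {table} VALUES (?, ?)", (key, value))
    elif op == "UPDATE":
        tgt.execute(f"UPDATE {table} SET balance = ? WHERE id = ?", (value, key))

print(tgt.execute("SELECT balance FROM accounts WHERE id = 1").fetchone()[0])
```

Because the replay works directly on low-level change records, there is no room for transformation logic, which is one reason for the limited conversion capabilities described below.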


Replication with the Sybase Replication Server is fast and consumes little
processing power due to its closeness to the database system. However, this
mode of operation makes this replication method highly database dependent,
and the source database system coverage is limited.² It also limits the conversion
capabilities; therefore, replication with the Sybase Replication Server only
supports Unicode source databases. The Sybase Replication Server cannot
convert between code pages, and because SAP HANA works with Unicode
encoding internally, the source database has to use Unicode encoding as well.
Also, certain table types used in SAP systems are unsupported.
To set up replication with the Sybase Replication Server, the definition and
content of tables chosen to be replicated must initially be copied from the source
database to the SAP HANA database. This initial load is done with the R3Load
program, which is also used for database imports and exports. Changes in tables
during initial copy operation are captured by the Sybase Replication Server;
therefore, no system downtime is required.
This replication method is only recommended for SAP customers who were
invited to use it during the ramp-up of SAP HANA 1.0. It was part of the SAP
HANA Enterprise Extended Edition, which was discontinued with the introduction
of SAP HANA 1.0 SPS05 in November 2012.³
SAP recommends instead using the trigger-based data replication with the SAP
Landscape Transformation Replication Server, which is described in 4.2.1.

4.2.5 Comparing the replication methods


Each of the described data replication methods for SAP HANA has its benefits
and weaknesses:
The trigger-based replication method with the SAP Landscape
Transformation system provides real-time replication while supporting a wide
range of source database systems. It can handle both unicode and
non-unicode databases and makes use of proven data migration technology.
It leverages the SAP application layer, which limits it to SAP source systems.
Compared to the log-based replication method, it offers a broader support of
source systems, while providing comparable real-time capabilities, and for
that reason it is recommended for replication from SAP source systems.
The ETL-based replication method is the most flexible of all, paying the price
for flexibility with only near real-time capabilities. With its variety of possible
data sources, advanced transformation, and data quality functionality, it is the
ideal choice for replication from non-SAP data sources.
² Only certain versions of IBM DB2 on AIX, Linux, and HP-UX are supported by this replication
method.
³ See http://help.sap.com/hana/hana_sps5_whatsnew_en.pdf, section 2.1.3.


The extractor-based replication method offers reuse of existing transformation
capabilities that are available in every SAP NetWeaver-based system. This
can significantly decrease the required implementation effort. However, this
type of replication is not real-time and is limited to the capabilities provided
by the
The log-based replication method with the Sybase Replication Server
provides the fastest replication from the source database to SAP HANA.
However, it is limited to unicode-encoded source databases, and it does not
support all table types used in SAP applications. It provides no transformation
functionality, and the source database system support is limited.
Figure 4-12 shows these replication methods in comparison.

Real-Time

Near Real-Time

Real-Time Capabilities

Preferred by SAP

SAP LT System

Direct Extractor
Connection

SAP Business Objects


Data Services

Unicode only
Very limited DB support
Data Conversion Capabilities

Sybase
Replication Server

Real Real-Time

Many DBs supported


Unicode and Non-Unicode
on Application Layer

SAP NetWeaver 7.0+


Re-use of extractors
Transformation
Any Datasource
Transformation
Data Cleansing

Figure 4-12 Comparison of the replication methods for SAP HANA

The replication method that you choose depends on the requirements. When
real-time replication is needed to provide benefit to the business, and the
replication source is an SAP system, then the trigger-based replication is the
best choice. Extractor-based replication might keep project cost down by reusing
existing transformations. ETL-based replication provides the most flexibility
regarding data source, data transformation, and data cleansing options, but does
not provide real-time replication.
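The selection logic described in this comparison can be condensed into a small sketch (a deliberate simplification for illustration; real projects weigh additional factors, such as source database support and table types):

```python
def suggest_replication_method(sap_source, real_time_needed, reuse_extractors=False):
    """Suggest an SAP HANA replication method for simple, common cases."""
    if not sap_source:
        # Non-SAP data sources: ETL offers the widest source coverage and
        # the richest transformation and data cleansing options.
        return "ETL-based (SAP BusinessObjects Data Services)"
    if real_time_needed:
        # SAP sources that need real-time replication.
        return "Trigger-based (SAP Landscape Transformation)"
    if reuse_extractors:
        # Reusing existing extractor transformations keeps project cost down.
        return "Extractor-based (Direct Extractor Connection)"
    # Scheduled replication from an SAP source also works trigger-based.
    return "Trigger-based (SAP Landscape Transformation)"

print(suggest_replication_method(sap_source=True, real_time_needed=True))
# prints Trigger-based (SAP Landscape Transformation)
```

The function encodes only the main decision drivers named above; the log-based method is omitted because SAP no longer recommends it for new projects.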


Chapter 5. SAP HANA use cases and integration scenarios
In this chapter, we outline the different ways that SAP HANA can be implemented
in existing client landscapes and highlight various aspects of such an integration.
Whenever possible, we mention real-world examples and related offerings.
This chapter is divided into several sections that are based on the role of SAP
HANA and the way it interacts with other software components:

- 5.1, "Basic use case scenarios" on page 62
- 5.2, "SAP HANA as a technology platform" on page 63
- 5.3, "SAP HANA for operational reporting" on page 68
- 5.4, "SAP HANA as an accelerator" on page 72
- 5.5, "SAP products running on SAP HANA" on page 75
- 5.6, "Programming techniques using SAP HANA" on page 87

© Copyright IBM Corp. 2013. All rights reserved.


5.1 Basic use case scenarios


The following classification of use cases was presented during the SAP TechEd
2011 event in session EIM205 Applications powered by SAP HANA. SAP defined
the following five use case scenarios:

- Technology platform
- Operational reporting
- Accelerator
- In-memory products
- Next generation applications

Figure 5-1 illustrates these use case scenarios.

Figure 5-1 Basic use case scenarios defined by SAP in session EIM205: the
technology platform scenario at the base, with operational reporting,
accelerator, in-memory products, and next generation applications built on
SAP HANA (data modeling, row store, and column store)

These five basic use case scenarios describe the elementary ways that SAP
HANA can be integrated. We cover each of these use case scenarios in a
dedicated section within this chapter.
SAP maintains a SAP HANA Use Case Repository with specific examples for
how SAP HANA can be integrated. This repository is online at the following web
address:
http://www.experiencesaphana.com/community/resources/use-cases


The use cases in this repository are divided into categories based on their
relevance to a specific industry sector. It is a good idea to review this repository
to find inspiration about how SAP HANA can be leveraged in various scenarios.

5.2 SAP HANA as a technology platform


SAP HANA can be used even in non-SAP environments. Clients can use
structured and unstructured data derived from non-SAP application systems
and take advantage of the power of SAP HANA. SAP HANA can be used to
accelerate existing functionality or to provide new functionality that was,
until now, not feasible.
Figure 5-2 presents SAP HANA as a technology platform.

Figure 5-2 SAP HANA as a technology platform: data from non-SAP or SAP data
sources flows into SAP HANA (data modeling, row store, and column store),
which serves a non-SAP application as well as SAP reporting and analytics

SAP HANA is not technologically dependent on other SAP products and can be
used independently as the only SAP component in the client's information
technology (IT) landscape. On the other hand, SAP HANA can be easily
integrated with other SAP products, such as SAP BusinessObjects BI platform
for reporting or SAP BusinessObjects Data Services for ETL replication, which
gives clients the possibility to use only the components that are needed.
There are many ways that SAP HANA can be integrated into a client landscape,
and it is not possible to describe all combinations. Software components around
the SAP HANA offering can be seen as building blocks, and every solution must
be assembled from the blocks that are needed in a particular situation.
This approach is extremely versatile, and the number of possible combinations
keeps growing because SAP constantly adds new components to its SAP
HANA-related portfolio.


IBM offers consulting services that help clients to choose the correct solution for
their business needs. For more information, see section 8.4.1, A trusted service
partner on page 178.

5.2.1 SAP HANA data acquisition


There are multiple ways that data can flow into SAP HANA. In this section, we
describe the various options that are available. Figure 5-3 gives an overview.

Figure 5-3 Examples of SAP HANA deployment options with regard to data acquisition
(current situation; replacing the existing database; data replication; dual database
approach)

The initial situation is schematically displayed in the upper-left corner of
Figure 5-3. In this example, a client-specific non-SAP application writes data to a
custom database that is slow and is not meeting client needs.
The other three examples in Figure 5-3 show how SAP HANA can be deployed in
such a scenario. They show that there is no single solution that is best for every
client, but that each situation must be considered independently.


In-memory Computing with SAP HANA on IBM eX5 Systems

Each of these three solutions has advantages and disadvantages, which we
highlight to show the aspects of a given solution that might need more detailed
consideration:
Replacing the existing database with SAP HANA
The advantage of this solution is that the overall architecture is not going to be
significantly changed. The solution will remain simple without the need to
include additional components. Customers might also save on license costs
for the original database.
A disadvantage to this solution is that the custom application must be
adjusted to work with the SAP HANA database. If ODBC or JDBC is used for
database access, this is not a big problem. Also, the whole setup must be
tested properly. Because the original database is being replaced, a certain
amount of downtime is inevitable.
Clients considering this approach must be familiar with the features and
characteristics of SAP HANA, especially when certain requirements must be
met by the database that is used (for example in case of special purpose
databases).
Populating SAP HANA with data replicated from the existing database
The second option is to integrate SAP HANA as a side-car database to the
primary database and to replicate required data using one of the available
replication techniques.
An advantage of this approach is that the original solution is not touched and
therefore no downtime is required. Also, only the required subset of data has
to be replicated from the source database, which might allow customers to
minimize acquisition costs because SAP HANA acquisition costs are directly
linked to the volume of stored data.
The need for implementing replication technology can be seen as the only
disadvantage of this solution. Because data is only delivered into SAP HANA
through replication, this component is a vital part of the whole solution.
Customers considering this approach must be familiar with various replication
technologies, including their advantages and disadvantages, as outlined in
section 4.2, Data replication methods for SAP HANA on page 54.
Clients must also be aware that replication might cause additional load on the
existing database because modified records must be extracted and then
transported to the SAP HANA database. This aspect is highly dependent on
the specific situation and can be addressed by choosing the proper replication
technology.


Adding SAP HANA as a second database in parallel to the existing one


This third option keeps the existing database in place while adding SAP
HANA as a secondary database. The custom application then stores data in
both the original database and in the SAP HANA database.
This option balances advantages and disadvantages of the previous two
options. A main prerequisite is the ability of the source application to work
with multiple databases and the ability to control where data is stored. This
can be easily achieved if the source application was developed by the client
and can be changed, or if the source application is going to be developed as
part of this solution. If this prerequisite cannot be met, this option is not viable.
An advantage of this approach is that no replication is required because data
is directly stored in SAP HANA as required. Customers can also decide to
store some of the records in both databases.
If data stored in the original database is not going to be changed and SAP
HANA data will be stored in both databases simultaneously, customers might
achieve only minimal disruption to the existing solution.
A main disadvantage is the prerequisite that the application must be able to
work with multiple databases and that it must be able to store data according
to the customer's expectations.
Customers considering this option must be aware of the capabilities provided
by the application delivering data into the existing database. Also, disaster
recovery plans must be carefully adjusted, especially when consistency
between both databases is seen as a critical requirement.
These examples must not be seen as an exhaustive list of integration options for
an SAP HANA implementation, but rather as a demonstration of how to develop
a solution that matches client needs.
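To make the dual database approach more concrete, the following sketch shows an application-side write path that stores every record in both databases. Python with sqlite3 stands in for both the custom database and SAP HANA purely for illustration; a real implementation would use the respective database drivers (for example, ODBC or JDBC connections), and all table and function names here are invented for the example.

```python
import sqlite3

# Stand-ins for the two databases; in a real landscape these would be
# connections to the original custom database and to SAP HANA.
original_db = sqlite3.connect(":memory:")
hana_db = sqlite3.connect(":memory:")

for db in (original_db, hana_db):
    db.execute("CREATE TABLE sales (id INTEGER PRIMARY KEY, amount REAL)")

def store_record(record_id, amount):
    """Write each new record to both databases, as in the dual database option."""
    for db in (original_db, hana_db):
        db.execute("INSERT INTO sales (id, amount) VALUES (?, ?)",
                   (record_id, amount))
        db.commit()

store_record(1, 250.0)
store_record(2, 990.5)

# Both stores now hold the same data; analytical queries can go to SAP HANA
# while the original database continues to serve the existing application.
counts = [db.execute("SELECT COUNT(*) FROM sales").fetchone()[0]
          for db in (original_db, hana_db)]
print(counts)  # [2, 2]
```

In a production setting, the write path would also have to decide what happens when one of the two writes fails, which is where the consistency considerations described for this option come into play.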
It is of course possible to populate the SAP HANA database with data coming
from multiple different sources, such as SAP or non-SAP applications, custom
databases, and so on.
These sources can feed data into SAP HANA independently, each using a
different approach or in a synchronized manner using the SAP BusinessObjects
Data Services, which can replicate data from several different sources
simultaneously.


5.2.2 SAP HANA as a source for other applications


The second part of integrating SAP HANA is to connect existing or new
applications to run queries against the SAP HANA database. Figure 5-4
illustrates an example of such an integration.

Figure 5-4 An example of SAP HANA as a source for other applications (the non-SAP
application queries SAP HANA alongside the custom database; SAP analytic tools and
SAP BusinessObjects reporting also consume SAP HANA)

The initial situation is schematically visualized in the left part of Figure 5-4. A
customer-specific application runs queries against a custom database; this
functionality must be preserved.
A potential solution is shown in the right part of Figure 5-4. The
customer-specific application runs its problematic queries against the SAP
HANA database. If the existing database is still part of the solution, specific
queries that do not need acceleration can still be executed against the original
database.
Specialized analytic tools, such as the SAP BusinessObjects Predictive Analysis,
can be used to run statistical analysis on data that is stored in the SAP HANA
database. This tool can run analysis directly inside the SAP HANA database,
which helps to avoid expensive transfers of massive volumes of data between the
application and the database. The result of this analysis can be stored in SAP
HANA, and the custom application can use these results for further processing,
for example, to facilitate decision making.
SAP HANA can be easily integrated with products from the SAP
BusinessObjects family. Therefore, these products can be part of the solution,
responsible for reporting, monitoring critical key performance indicators (KPIs)
using dashboards, or for data analysis.


These tools can also be used without SAP HANA; however, SAP HANA enables
them to process much larger volumes of data while still providing results in a
reasonable time.
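The benefit of running analysis inside the database, as described for SAP BusinessObjects Predictive Analysis, comes from avoiding the transfer of raw data to the application. The following sketch contrasts the two styles; sqlite3 stands in for SAP HANA purely for illustration, and the table name is invented:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE measurements (sensor TEXT, value REAL)")
conn.executemany("INSERT INTO measurements VALUES (?, ?)",
                 [("s1", v) for v in range(1000)])

# Application-side processing: all 1000 rows cross the database boundary
# before the application computes the average.
rows = conn.execute("SELECT value FROM measurements").fetchall()
avg_app_side = sum(v for (v,) in rows) / len(rows)

# In-database processing: only the single aggregated result is transferred.
(avg_in_db,) = conn.execute("SELECT AVG(value) FROM measurements").fetchone()

print(avg_app_side, avg_in_db)  # both 499.5
```

With an in-memory column store, the in-database variant is also where the performance advantage of SAP HANA materializes, because the aggregation runs close to the data.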

5.3 SAP HANA for operational reporting


Operational reporting plays an increasingly important role. In today's
economic environment, companies must understand how events in our globally
integrated world affect their business so that they can make proper
adjustments to counter the effects of those events. Therefore, the pressure to
minimize reporting delays keeps growing. Ideally, a real-time snapshot of the
current situation is available within seconds of the request.
At the same time, the amount of data that is being captured grows every year.
Additional information is collected and stored at more detailed levels. All of this
makes operational reporting more challenging because huge amounts of data
need to be processed quickly to produce the desired result.
SAP HANA is a perfect fit for this task. Required information can be replicated
from existing transactional systems into the SAP HANA database and then
processed significantly faster than directly on the source systems.
The following use case is often referred to as a data mart or side-car approach
because SAP HANA sits by the operational system and receives the operational
data (usually only an excerpt) from this system by means of replication.
In a typical SAP-based application landscape today, you will find a number of
systems, such as SAP ERP, SAP CRM, SAP SCM, and other, possibly non-SAP,
applications. All of these systems contain large amounts of operational data, which can be
used to improve business decision making using business intelligence
technology. Data that is used for business intelligence purposes can be gathered
either on a business unit level using data marts or on an enterprise level with an
enterprise data warehouse, such as the SAP NetWeaver Business Warehouse
(SAP NetWeaver BW). ETL processes feed the data from the operational
systems into the data marts and the enterprise data warehouse.


Figure 5-5 illustrates such a typical landscape.

Figure 5-5 Typical view of an SAP-based application landscape today (corporate BI on an
enterprise data warehouse with SAP NetWeaver BW Accelerator; local BI on data marts
fed by ETL from SAP ERP, CRM, SRM, SCM, and non-SAP applications)

With the huge amounts of data collected in an enterprise data warehouse,
response times of queries for reports or navigation through data can become an
issue, generating new performance requirements for such an environment.
To address these requirements, SAP introduced the SAP NetWeaver Business
Warehouse Accelerator, which is built for this use case by speeding up queries
and reports in the SAP NetWeaver BW by leveraging in-memory technology.
Although it is a perfect fit for an enterprise data warehouse holding huge
amounts of data, the combination of SAP NetWeaver BW and SAP NetWeaver
BW Accelerator is not always a viable solution for relatively small data marts.


With the introduction of SAP HANA 1.0, SAP provided an in-memory technology
aiming to support business intelligence at a business unit level. SAP HANA
combined with business intelligence tools, such as the SAP BusinessObjects
tools and data replication mechanisms feeding data from the operational system
into SAP HANA in real time, brought in-memory computing to the business unit
level. Figure 5-6 shows such a landscape with the local data marts replaced by
SAP HANA.

Figure 5-6 SAP vision after the introduction of SAP HANA 1.0 (the local data marts are
replaced by SAP HANA 1.0 instances synchronized with the operational systems; the
enterprise data warehouse remains unchanged)

Business intelligence functionality is provided by an SAP BusinessObjects BI
tool, such as the SAP BusinessObjects Explorer, communicating with the SAP
HANA database through the BI Consumer Services (BICS) interface.
This use case scenario is oriented mainly toward existing products from the SAP
Business Suite, where SAP HANA acts as a foundation for reporting on large
volumes of data.


Figure 5-7 illustrates the role of SAP HANA in an operational reporting use case
scenario.

Figure 5-7 SAP HANA for operational reporting (data is replicated from the RDBMS of
the SAP Business Suite into the SAP HANA row and column stores, with data modeling,
and consumed by SAP reporting and analytics)

Usually, the first step is the replication of data, typically originating from the
SAP Business Suite, into the SAP HANA database. However, some solution
packages are also built for non-SAP data sources.
Sometimes source systems need to be adjusted by implementing modifications
or by performing specific configuration changes.
Data is typically replicated using the SAP Landscape Transformation replication;
however, other options, such as replication using SAP BusinessObjects Data
Services or SAP HANA Direct Extractor Connection (DXC), are also possible.
The replication technology is usually chosen as part of the package design and
cannot be changed easily during implementation.
A list of tables to replicate (for SAP Landscape Transformation replication) or
transformation models (for replication using Data Services) are part of the
package.
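SAP Landscape Transformation replication is trigger-based: database triggers on the replicated tables record changed keys in logging tables, and a replication job transfers the corresponding rows to SAP HANA. The following strongly simplified sketch illustrates this change-capture pattern; sqlite3 stands in for both the source system and SAP HANA, and all table and trigger names are invented for the example:

```python
import sqlite3

# Stand-in for the source system; a trigger records every insert in a logging
# table, which is (in simplified form) how trigger-based replication captures
# deltas.
src = sqlite3.connect(":memory:")
src.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL);
    CREATE TABLE repl_log (order_id INTEGER);
    CREATE TRIGGER orders_ins AFTER INSERT ON orders
        BEGIN INSERT INTO repl_log VALUES (NEW.id); END;
""")

# Stand-in for the SAP HANA side-car database.
hana = sqlite3.connect(":memory:")
hana.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL)")

src.execute("INSERT INTO orders VALUES (1, 100.0)")
src.execute("INSERT INTO orders VALUES (2, 40.0)")

# Replication job: read the logged keys, copy the rows, clear the log.
changed = [r[0] for r in src.execute("SELECT order_id FROM repl_log")]
for oid in changed:
    row = src.execute("SELECT id, amount FROM orders WHERE id = ?",
                      (oid,)).fetchone()
    hana.execute("INSERT INTO orders VALUES (?, ?)", row)
src.execute("DELETE FROM repl_log")

replicated = hana.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
print(replicated)  # 2
```

The triggers are also why replication adds some load on the source database, as noted earlier: every change must be logged and later extracted.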
SAP HANA is loaded with models (views) that are either static (designed by SAP
and packaged) or automatically generated based on customized criteria. These
models describe the transformation of source data into the resulting column
views. These views are then consumed by SAP BusinessObjects BI 4.0 reports
or dashboards that are delivered either as final products or as pre-made
templates that can be finished as part of the implementation process.
Some solution packages are based on additional components (for example, SAP
BusinessObjects Event Insight). If required, additional content that is specific to
these components can also be part of the solution package.
Individual use cases, required software components, prerequisites, configuration
changes, including overall implementation processes, are properly documented
and attached as part of the delivery.


Solution packages can contain:


SAP BusinessObjects Data Services Content (data transformation models)
SAP HANA Content (exported models - attribute views, analytic views)
SAP BusinessObjects BI Content (prepared reports, dashboards)
Transports, ABAP reports (adjusted code to be implemented in source
system)
Content for other software components, such as SAP BusinessObjects Event
Insight, Sybase Unwired Platform, and so on.
Documentation
Packaged solutions such as these are delivered by SAP under the name
SAP Rapid Deployment Solutions (RDSs) for SAP HANA, or by other system
integrators, such as IBM.
Available offerings contain everything that customers need to implement the
requested function. Associated services, including implementation, can also be
part of delivery.
Although SAP HANA as a technology platform can be seen as an open field
where every client can build their own solution using available building blocks, the
SAP HANA for operational reporting scenarios are well prepared packaged
scenarios that can easily and quickly be deployed on existing landscapes.
A list of SAP RDS offerings is available at the following website:
http://www.sap.com/resources/solutions-rapid-deployment/solutions-by-business.epx
Alternatively, you can use the following quick link and then open
Technology → SAP HANA:
http://service.sap.com/solutionpackages

5.4 SAP HANA as an accelerator


SAP HANA in a side-car approach as an accelerator is similar to a side-car
approach for reporting purposes. The difference is that the consumer of the data
replicated to SAP HANA is not a business intelligence tool but the source system
itself. The source system can use the in-memory capabilities of the SAP HANA
database to run analytical queries on the replicated data. This helps applications
performing queries on huge amounts of data to run simulations, pattern
recognition, planning runs, and so on.


SAP HANA can also be used to accelerate existing processes in SAP Business
Suite systems, even for those systems that are not yet released to be directly
running on the SAP HANA database.
Some SAP systems process large numbers of records that need to be filtered or
aggregated based on specific criteria. The results are then used as inputs for all
dependent activities in a given system.
With very large data volumes, execution time can become unacceptable: such
workloads can easily run for several hours, causing unnecessary delays.
Currently, these tasks are typically processed overnight as batch jobs.
SAP HANA as an accelerator can help to significantly decrease this execution
time.
Figure 5-8 illustrates this use case scenario.

Figure 5-8 SAP HANA as an accelerator (data is replicated from the SAP Business Suite
RDBMS into SAP HANA, and the SAP Business Suite reads query results back from
SAP HANA)

The accelerated SAP system must meet specific prerequisites. Before this
solution can be implemented, installation of specific support packages or
implementation of SAP Notes might be required. This introduces the necessary
code changes in the source system.
The SAP HANA client must be installed on a given server, and the SAP kernel
must be adjusted to support direct connectivity to the SAP HANA database.
As a next step, replication of data from the source system is configured. Each
specific use case has a defined replication method and a list of tables that must
be replicated. The most common method is the SAP Landscape Transformation


replication. However, some solutions offer alternatives, for example, for the SAP
CO-PA Accelerator, replication can also be performed by an SAP CO-PA
Accelerator-specific ABAP report in source system.
The source system is configured to have direct connectivity into SAP HANA as
the secondary database. The required scenario is configured according to the
specifications and then activated. During activation, the source system
automatically deploys the required column views into SAP HANA and activates
new ABAP code that was installed in the source system as a solution
prerequisite. This new code runs the time-consuming queries against the SAP
HANA database, which leads to significantly shorter execution times.
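The resulting runtime behavior resembles a secondary-database connection: the application keeps its primary database but routes the expensive queries to SAP HANA when an accelerator is configured. A minimal sketch of this routing pattern follows; sqlite3 stands in for both databases, and the table and function names are invented for the example:

```python
import sqlite3

def run_query(sql, primary, secondary=None):
    """Send the query to the secondary (accelerator) database if one is
    configured; otherwise fall back to the primary database."""
    target = secondary if secondary is not None else primary
    return target.execute(sql).fetchall()

# Stand-ins for the primary RDBMS and the SAP HANA secondary database,
# both holding the same replicated data.
primary = sqlite3.connect(":memory:")
hana = sqlite3.connect(":memory:")
for db in (primary, hana):
    db.execute("CREATE TABLE copa (segment TEXT, revenue REAL)")
    db.executemany("INSERT INTO copa VALUES (?, ?)",
                   [("A", 10.0), ("A", 5.0), ("B", 2.5)])

sql = "SELECT segment, SUM(revenue) FROM copa GROUP BY segment ORDER BY segment"
without_accel = run_query(sql, primary)     # runs on the primary RDBMS
with_accel = run_query(sql, primary, hana)  # routed to the accelerator

print(without_accel == with_accel)  # True: same result, faster engine
```

The application logic stays unchanged; only the target of the data-intensive queries moves, which is why this scenario can be added with comparatively little disruption.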
Because SAP HANA is populated with valuable data, it is easy to extend the
accelerator use case by adding operational reporting functions. Additional
(usually optional) content is delivered for SAP HANA and for SAP
BusinessObjects BI 4.0 client tools, such as reports or dashboards.
SAP HANA as the accelerator and SAP HANA for operational reporting use case
scenarios can be nicely combined in a single package. Following is a list of SAP
RDSs implementing SAP HANA as an accelerator:
SAP Bank Analyzer Rapid-Deployment Solution for Financial Reporting with
SAP HANA (see SAP Note 1626729):
http://service.sap.com/rds-hana-finrep
SAP rapid-deployment solution for customer segmentation with SAP HANA
(see SAP Note 1637115):
http://service.sap.com/rds-cust-seg
SAP ERP rapid-deployment solution for profitability analysis with SAP HANA
(see SAP Note 1632506):
http://service.sap.com/rds-hana-copa
SAP ERP rapid-deployment solution for accelerated finance and controlling
with SAP HANA (see SAP Note 1656499):
http://service.sap.com/rds-hana-fin
SAP Global Trade Services rapid-deployment solution for sanctioned-party
list screening with SAP HANA (see SAP Note 1689708):
http://service.sap.com/rds-gts


5.5 SAP products running on SAP HANA


Another way that SAP HANA can be deployed is to use SAP HANA as the
primary database for selected products.
SAP NetWeaver Business Warehouse (BW) running on SAP HANA has been
generally available since April 2012. SAP ERP Central Component (SAP ECC)
running on SAP HANA was announced in early 2013, and other products from
the SAP Business Suite family are expected to follow.
One big advantage of running existing products with SAP HANA as the primary
database is the minimal disruption to the existing system. Almost all functions,
customizations, and, in the case of SAP NetWeaver BW, also client-specific
modeling, are preserved because the application logic written in ABAP is not
changed. From a technical perspective, the SAP HANA conversion is similar to
any other database migration.
Figure 5-9 illustrates SAP NetWeaver BW running on SAP HANA.

Figure 5-9 SAP products running on SAP HANA: SAP Business Warehouse (SAP
NetWeaver BW) and SAP ERP Central Component (SAP ECC), each with the traditional
RDBMS replaced by the SAP HANA row and column stores

5.5.1 SAP NetWeaver Business Warehouse powered by SAP HANA


SAP HANA can be used as the database for an SAP NetWeaver Business
Warehouse (SAP NetWeaver BW) installation. In this scenario, SAP HANA
replaces the traditional database server of an SAP NetWeaver BW installation.
The application servers stay the same.


The in-memory performance of SAP HANA dramatically improves query
performance and eliminates the need for manual optimizations by materialized
aggregates in SAP NetWeaver BW. Figure 5-10 shows SAP HANA as the
database for the SAP NetWeaver Business Warehouse.

Figure 5-10 SAP HANA as the database for SAP NetWeaver Business Warehouse (the
enterprise data warehouse runs on SAP HANA, with virtual data marts serving local BI)

In contrast to an SAP NetWeaver BW system accelerated by the in-memory
capabilities of SAP NetWeaver BW Accelerator, an SAP NetWeaver BW system
with SAP HANA as the database keeps all data in-memory. With SAP
NetWeaver BW Accelerator, the client chooses the data to be accelerated, which
is then copied to the SAP NetWeaver BW Accelerator. Here the traditional
database server (for example, IBM DB2 or Oracle) still acts as the primary
datastore.
SAP NetWeaver BW on SAP HANA is probably the most popular SAP HANA use
case, achieving significant performance improvements with relatively small
effort.
The underlying database is replaced by the SAP HANA database, which
significantly improves both data loading times and query execution times.
Because the application logic written in ABAP is not impacted by this change, all
investments in developing BW models are preserved. The transition to SAP
HANA is a transparent process that requires minimal effort to adjust existing
modeling.


In-memory optimized InfoCubes


InfoCubes in SAP NetWeaver BW running on a traditional database use the
so-called Enhanced Star Schema. This schema was designed to optimize
various performance aspects of working with multidimensional models on
existing database systems.
Figure 5-11 illustrates the Enhanced Star Schema in BW with an example.
Figure 5-11 Enhanced Star Schema in SAP NetWeaver Business Warehouse (the fact
table is linked through dimension tables and SID tables to master data tables for
characteristics such as company code and plant)

The core part of every InfoCube is the fact table. This table contains dimension
identifiers (IDs) and corresponding key figures (measures). This table is
surrounded by dimension tables that are linked to fact tables using the dimension
IDs.
Dimension tables are usually small tables that group logically connected
combinations of characteristics, usually representing master data. Logically


connected means that the characteristics are highly related to each other, for
example, company code and plant. Combining unrelated characteristics leads to
a large number of possible combinations, which can have a negative impact on
the performance.
Because master data records are located in separate tables outside of the
InfoCube, an additional table is required to connect these master data records to
dimensions. These additional tables contain a mapping of auto-generated
Surrogate IDs (SIDs) to the real master data.
This complex structure is required on classical databases; however, with SAP
HANA, this requirement is obsolete. SAP therefore introduced the SAP HANA
Optimized Star Schema, illustrated in Figure 5-12.

Figure 5-12 SAP HANA Optimized Star Schema in SAP NetWeaver BW system (the SID
columns of all dimensions except the Data Package dimension are stored directly in the
fact table)


The content of all dimensions (except for the Data Package dimension) is
incorporated into the fact table. This modification brings several advantages:
Simplified modeling
Poorly designed dimensions (wrong combinations of characteristics) cannot
affect performance anymore. Moving characteristics from one dimension to
another is not a physical operation anymore; instead, it is just a metadata
update.
Faster loading
Because dimension tables no longer exist, the overhead of identifying
existing combinations or creating new combinations in the dimension tables
is eliminated. Instead, the required SID values are directly inserted into the
fact table.
The SAP HANA Optimized Star Schema is automatically used for all newly
created InfoCubes on the SAP NetWeaver BW system running on the SAP
HANA database.
Existing InfoCubes are not automatically converted to this new schema during
the SAP HANA conversion of the SAP NetWeaver BW system. The conversion of
standard InfoCubes to in-memory optimized InfoCubes must be done manually
as a follow-up task after the database migration.
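The effect of the SAP HANA Optimized Star Schema can be illustrated with a small example: storing the SID columns directly in the fact table turns a join between the fact table and a dimension table into a single-table aggregation. The sketch below uses sqlite3 and invented table names purely for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")

conn.executescript("""
    -- Classic star schema: the fact table references a dimension table,
    -- which in turn carries the SID of the characteristic (here: plant).
    CREATE TABLE dim_plant (dimid INTEGER PRIMARY KEY, sid_plant INTEGER);
    CREATE TABLE fact_star (key_plant INTEGER, amount REAL);
    INSERT INTO dim_plant VALUES (1, 1000), (2, 2000);
    INSERT INTO fact_star VALUES (1, 10.0), (2, 5.0), (1, 7.5);

    -- SAP HANA Optimized Star Schema: the SID is stored in the fact table
    -- itself, so the dimension join disappears.
    CREATE TABLE fact_flat (sid_plant INTEGER, amount REAL);
    INSERT INTO fact_flat VALUES (1000, 10.0), (2000, 5.0), (1000, 7.5);
""")

star = conn.execute("""
    SELECT d.sid_plant, SUM(f.amount)
    FROM fact_star f JOIN dim_plant d ON f.key_plant = d.dimid
    GROUP BY d.sid_plant ORDER BY d.sid_plant
""").fetchall()

flat = conn.execute("""
    SELECT sid_plant, SUM(amount)
    FROM fact_flat GROUP BY sid_plant ORDER BY sid_plant
""").fetchall()

print(star == flat)  # True: same result, one join fewer
```

Both queries return the same result, but the second one needs no dimension table, which is also why loading becomes faster: no dimension records have to be looked up or created during inserts.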

SAP HANA acceleration areas


The SAP HANA database can bring significant performance benefits; however, it
is important to set the expectations correctly. SAP HANA can improve loading
and query times, but certain limits cannot be overcome.
Migration of SAP NetWeaver BW to run on SAP HANA will not improve extraction
processes because extraction happens in the source system. Therefore, it is
important to understand how much of the overall load time is taken by extraction
from the source system. This information is needed to properly estimate the
potential performance improvement for the load process.
Other parts of the load process are improved. The new star schema removes
unnecessary activities from the loading process.
Some of the calculations and application logic can be pushed to the SAP HANA
database. This ensures that data intensive activities are being done on the SAP
HANA database level instead of on the application level. This increases the
performance because the amount and volume of data exchanged between the
database and the application are significantly reduced.


SAP HANA can calculate all aggregations in real time. Therefore, aggregates are
no longer required, and roll-up activity that is related to aggregate updates is
obsolete. This also reduces overall execution time of update operations.
If SAP NetWeaver BW Accelerator was used, the update of its indexes is also no
longer needed. Because SAP HANA is based on technology similar to SAP
NetWeaver BW Accelerator, all queries are accelerated. Query performance with
SAP HANA is comparable to a situation in which all cubes are indexed by the
SAP NetWeaver BW Accelerator. In reality, query performance can be even
faster than with SAP NetWeaver BW Accelerator because additional features are
available for SAP NetWeaver BW running on SAP HANA, for example, the
possibility to remove an InfoCube and instead run reports against in-memory
optimized DataStore Objects (DSOs).

5.5.2 Migrating SAP NetWeaver Business Warehouse to SAP HANA


There are multiple ways that an existing SAP NetWeaver Business Warehouse
(BW) system can be moved to an SAP HANA database. It is important to
distinguish between building a proof of concept (POC) demo system and a
productive migration.
The available options are divided into two main groups:
SAP NetWeaver BW database migration on page 80
Transporting the content to the SAP NetWeaver BW system on page 84
These two groups represent the main driving ideas behind the move from a
traditional database to SAP HANA. Within each group, there are still many ways
that a project plan can be orchestrated.
In the following sections, we explain these two approaches in more detail.

SAP NetWeaver BW database migration


The following software levels are prerequisites for SAP NetWeaver BW running
on SAP HANA (see SAP Note 1600929 for the latest information):
SAP NetWeaver BW 7.30 SP5 or SAP NetWeaver BW 7.31 SP4 (as per SAP
Note 1600929, SP07 or higher must be imported for your SAP NetWeaver BW
installation (ABAP) before migration and after installation)
SAP HANA 1.0 SPS03 (the latest available revision is recommended)


It is important to be aware that not all SAP NetWeaver BW add-ons are
supported to run on the SAP HANA-based system. For the latest information,
see the following SAP Notes:
Note 1600929 - SAP NetWeaver BW powered by SAP HANA DB: Information
Note 1532805 - Add-On Compatibility of SAP NetWeaver 7.3
Note 1652738 - Add-on compatibility for SAP NetWeaver EHP 1 for NW 7.30
Unless your system already meets the minimal release requirements, the first
step before converting SAP NetWeaver BW is to upgrade the system to the latest
available release and to the latest available support package level.
A database upgrade might be required as part of the release upgrade or as a
prerequisite before database migration to SAP HANA. For a list of supported
databases, see SAP Note 1600929.
Table 5-1 lists the databases that were approved as source databases for the
migration to SAP HANA at the time of writing.
Table 5-1 Supported source databases for a migration to the SAP HANA database

Database            SAP NetWeaver BW 7.30    SAP NetWeaver BW 7.31
Oracle              11.2                     11.2
MaxDB               7.8                      7.9
MS SQL Server       2008                     2008
IBM DB2 LUW         9.7                      9.7
IBM DB2 for i       6.1, 7.1                 6.1, 7.1
IBM DB2 for z/OS    V9, V10                  V9, V10
Sybase ASE          n/a                      15.7

SAP HANA is currently not a supported database for any SAP NetWeaver Java
stack. Therefore, dual-stack installations (ABAP+Java) must be separated into
two individual stacks using the Dual-Stack Split Tool from SAP.
Because some existing installations are still non-Unicode installations, another
important prerequisite step might be a conversion of the database to Unicode
encoding. This Unicode conversion can be done as a separate step or as part of
the migration to the SAP HANA database.
All InfoCubes with data persistency in the SAP NetWeaver Business Warehouse
Accelerator are set as inactive during conversion, and their content in SAP
NetWeaver BW Accelerator is deleted. These InfoCubes must be reloaded
from the original primary persistence; therefore, the required steps must be
incorporated into the project plan.
A migration to the SAP HANA database follows the same process as any other
database migration. All activity in the SAP NetWeaver BW system is
suspended after all preparation activities are finished. A special report is
executed to generate database-specific statements for the target database,
which are used during the import. Next, the content of the SAP system is
exported to a platform-independent format and stored in files on disk.
These files are then transported to the primary application server of the SAP
NetWeaver BW system. The application part of SAP NetWeaver BW is not
allowed to run on the SAP HANA appliance. Therefore, a minimal installation
needs to have two servers:
SAP HANA appliance hosting the SAP HANA database
The SAP HANA appliance is delivered by IBM with the SAP HANA database
preinstalled. However, the database will be empty.
Primary application server hosting ABAP instance of SAP NetWeaver BW
There are minimal restrictions regarding the operating system of the primary
application server. See the Product Availability Matrix (PAM) for available
combinations (search for SAP NetWeaver 7.3 and download the overview
presentation):
http://service.sap.com/pam


At the time of writing this book, the operating systems shown in Table 5-2 were
available to host the ABAP part of the SAP NetWeaver BW system.

Table 5-2 Supported operating systems for primary application server

Operating system     Platform                         BW 7.30   BW 7.31
Windows Server 2008  x86_64 (64-bit) (including R2)   yes       yes
AIX 6.1, 7.1         Power (64-bit)                   yes       yes
HP-UX 11.31          IA64 (64-bit)                    yes       yes
Solaris 10           SPARC (64-bit)                   yes       yes
Solaris 10           x86_64 (64-bit)                  yes       yes
Linux SLES 11 SP1    x86_64 (64-bit)                  yes       yes
Linux RHEL 5         x86_64 (64-bit)                  yes       no
Linux RHEL 6         x86_64 (64-bit)                  yes       yes
IBM i 7.1            Power (64-bit)                   no        yes

The next step is the database import. It contains the installation of the SAP
NetWeaver BW on the primary application server and the import of data into the
SAP HANA database. The import occurs remotely from the primary application
server as part of the installation process.
Parallel export/import using a socket connection, as well as the File Transfer
Protocol (FTP) and Network File System (NFS) exchange modes, is not
supported. Currently, only the asynchronous file-based export/import method is
available.
After the mandatory post-migration activities, the conversion of InfoCubes and
DataStore objects to their in-memory optimized form must be initiated to take
full advantage of the benefits that the SAP HANA database can offer. This can be
done either manually for each object or as a mass operation using a special
report.
Clients must plan enough time to perform this conversion. This step can be time
consuming because the content of all InfoCubes must be copied into temporary
tables that have the new structure.
After all post activities are finished, the system is ready to be tested.


Transporting the content to the SAP NetWeaver BW system


Unlike with a database migration, this approach is based on performing
transports of activated objects (Business Content) from the existing SAP
NetWeaver BW system into a newly installed SAP NetWeaver BW system with
SAP HANA as a primary database.
The advantage of this approach is that content can be transported across
releases, as explained in the following SAP Notes:
Note 1090842 - Composite note: Transports across several releases
Note 454321 - Transports between Basis Release 6.* and 7.0
Note 1273566 - Transports between Basis Release 700/701 and 702/73*
Note 323323 - Transport of all activated objects of a system

The possibility to transport content across different releases can significantly
reduce the amount of effort that is required to build a proof of concept (POC)
system because most of the prerequisite activities, such as the release upgrade,
database upgrade, dual-stack split, and so on, are not needed.
After transporting the available objects (metadata definitions), their content must
also be transported from the source to the target system. The SAP NetWeaver
BW consultant must assess which available options are most suitable for this
purpose.
This approach is not recommended for the production migration itself, where a
conventional database migration is used instead. However, additional effort
invested in building a proof of concept (POC) system in the same way that the
production system will later be treated is a valuable test. This kind of test can
help customers create a realistic effort estimation for the project, estimate the
required runtimes, and plan in detail all actions that are required. All involved
project team members become familiar with the system and can solve and
document any specific problems.

Parallel approach to SAP HANA conversion


The suggested approach to convert an SAP NetWeaver BW system to use the
SAP HANA database is a parallel approach, meaning that the new SAP
NetWeaver BW system is created as a clone of the original system. The standard
homogeneous system copy method can be used for this purpose.
This clone is then reconfigured in a way that both the original and the cloned BW
systems are functional and both systems can extract data from the same sources.
Detailed instructions about how to perform this cloning operation are explained in
SAP Note 886102, scenario B2.


Here is some important information that is relevant to the cloned system. Refer to
the content in SAP Note 886102 to understand the full procedure that must be
applied on the target BW system. The SAP Note states:
Caution: This step deletes all transfer rules and PSA tables of these source
systems, and the data is lost. A message is generated stating that the source
system cannot be accessed (since you deleted the host of the RFC connection).
Choose Ignore.
It is important to understand the consequences of this action and to plan the
required steps to reconfigure the target BW system so that it can again read data
from the source systems.
Persistent Staging Area (PSA) tables can be regenerated by the replication of
DataSources from the source systems, and transfer rules can be transported
from the original BW system. However, the content of these PSA tables is lost
and needs to be reloaded from source systems.
This step might potentially cause problems where DataStore objects are used
and PSA tables contain the complete history of data.
An advantage of creating a cloned SAP NetWeaver BW system is that the
original system is not impacted and can still be used for productive tasks. The
cloned system can be tested and results compared with the original system
immediately after the clone is created and after every important project
milestone, such as a release upgrade or the conversion to SAP HANA itself.
Both systems are fully synchronized because both systems periodically extract
data from the same source systems. Therefore, after an entire project is finished,
and the new SAP NetWeaver BW system running on SAP HANA meets the
client's expectations, the new system can fully replace the original system.
A disadvantage of this approach is the additional load imposed on the source
systems, which is caused by both SAP NetWeaver BW systems performing
extraction from the same source system, and certain limitations mentioned in the
following SAP notes:
Note 775568 - Two and more BW systems against one OLTP system
Note 844222 - Two OR more BW systems against one OLTP system

5.5.3 SAP Business Suite powered by SAP HANA


SAP announced restricted availability of SAP Business Suite, which is powered
by SAP HANA, in January 2013 (see
http://www.news-sap.com/sap-business-suite-on-sap-hana-launch). After a
successful ramp-up program, SAP made this generally available during the
SAPPHIRE NOW conference, held in Orlando, FL, in May 2013.
SAP HANA can be used as the database for an SAP Business Suite installation.
In this scenario, SAP HANA replaces the traditional database server of an SAP
Business Suite installation. The application servers stay the same and can run
on any platform that supports the SAP HANA database client. As of May 2013,
the following applications of SAP Business Suite are supported by SAP HANA as
their primary database:
Enterprise resource planning (ERP)
Customer relationship management (CRM)
Supply chain management (SCM)
The Product Lifecycle Management (PLM) and SAP Supplier Relationship
Management (SRM) applications are not yet available with SAP HANA, but it is
planned to make these applications available to run with SAP HANA as well.
SAP Business Suite, which is powered by SAP HANA, does not induce any
functional changes. Configuration, customization, the ABAP Workbench,
connectivity, security, transports, and monitoring stay unchanged. For
modifications, the same upgrade requirements as with any other upgrade apply.
Customized code can stay unchanged, or can be adjusted to use additional
performance.
SAP Business Suite applications can benefit in various ways from leveraging the
in-memory technology of SAP HANA:
Running dialog processes instead of batch
Integration of unstructured data and machine-to-machine (M2M) data with
ERP processes
Integration of predictive analysis with ERP processes
Running operational reports in real time, directly on the source data
Removing the need for operational data stores
Eliminating the need for data replication or transfers to improve operational
report performance
SAP is enabling the existing functions of SAP ERP for SAP HANA, SAP CRM for
SAP HANA, and SAP SCM for SAP HANA to utilize the in-memory technology
with the following versions:
SAP enhancement package 6 for SAP ERP 6.0, version for SAP HANA
SAP enhancement package 2 for SAP CRM 7.0, version for SAP HANA
SAP enhancement package 2 for SAP SCM 7.0, version for SAP HANA


Restrictions
There are certain restrictions in effect regarding running SAP Business Suite
with SAP HANA (see SAP Note 1774566).
Currently, multi-node support for SAP Business Suite with SAP HANA is very
limited (see SAP Note 1825774 for up-to-date information about multi-node
support). SAP HANA multi-node configurations can serve different purposes:
Achieving high availability by the use of standby nodes
Scaling the main memory to accommodate larger databases (scale-out)
Scale-out scenarios with multiple worker nodes (as described in 6.4, Scale-out
solution for SAP HANA on page 116) are not yet supported for SAP Business
Suite with SAP HANA.
High-availability (HA) scenarios for SAP Business Suite with SAP HANA are
supported, but restricted to the simplest case of two servers, one being the
worker node and one acting as a standby node. In this case, the database is not
partitioned; the entire database is on a single node. This is why this
configuration is sometimes also referred to as a single-node HA configuration.
Due to these restrictions with regard to scalability, SAP decided to allow
configurations with a higher memory-per-core ratio, specifically for this use case.
Section 6.3, Custom server models for SAP HANA on page 110 describes
available configurations dedicated to SAP Business Suite, which is powered by
SAP HANA.

5.6 Programming techniques using SAP HANA

The last use case scenario is based on recent developments from SAP where
applications can be built directly against the SAP HANA database using all of its
features, such as the embedded application server (XS Engine) or stored
procedures, which allow logic to be processed directly inside the SAP HANA
database.
A new software component can be integrated with SAP HANA either directly, or
it can be built on top of the SAP NetWeaver stack, which can work with the SAP
HANA database using client libraries.
Because of its breadth and depth, this use case scenario is not described in
detail as part of this publication.


Chapter 6.

IBM Systems solution for


SAP HANA
This chapter discusses the IBM Systems solution for SAP HANA. We describe
the hardware and software components, scale-up and scale-out approaches,
workload-optimized models, interoperability with other platforms, and support
processes. We also highlight IBM Systems solution with SAP Discovery System.
The following topics are covered:

6.1, IBM eX5 Systems on page 90


6.2, IBM General Parallel File System on page 106
6.3, Custom server models for SAP HANA on page 110
6.4, Scale-out solution for SAP HANA on page 116
6.5, Disaster recovery solutions for SAP HANA on page 129
6.7, SAP HANA on VMware vSphere on page 148
6.8, SAP HANA on IBM SmartCloud on page 152
6.9, IBM Systems solution with SAP Discovery system on page 153

Copyright IBM Corp. 2013. All rights reserved.


6.1 IBM eX5 Systems


IBM decided to base its offering for SAP HANA on its high-performance,
scalable IBM eX5 family of servers. These servers represent the IBM high-end
Intel-based enterprise servers. IBM eX5 systems, all based on the eX5
Architecture, are the HX5 blade server, the x3690 X5, the x3850 X5, and the
x3950 X5. They have a common set of technical specifications and features:
The IBM System x3850 X5 is a 4U highly rack-optimized server. The x3850
X5 also forms the basis of the x3950 X5, the new flagship server of the IBM
x86 server family. These systems are designed for maximum utilization,
reliability, and performance for compute-intensive and memory-intensive
workloads, such as SAP HANA.
The IBM System x3690 X5 is a 2U rack-optimized server. This machine
brings the eX5 features and performance to the mid tier. It is an ideal match
for the smaller, two-CPU configurations for SAP HANA.
The IBM BladeCenter HX5 is a single wide (30 mm) blade server that follows
the same design as all previous IBM blades. The HX5 brings unprecedented
levels of capacity to high-density environments.
When compared with other machines in the System x portfolio, these systems
represent the upper end of the spectrum and are suited for the most demanding
x86 tasks.
For SAP HANA, the x3690 X5 and the x3950 X5 are used, which is why we
feature only these systems in this paper.
Note: For the latest information about the eX5 portfolio, see the IBM
Redpaper publication IBM eX5 Portfolio Overview: IBM System x3850 X5,
x3950 X5, x3690 X5, and BladeCenter HX5, REDP-4650, for further eX5
family members and capabilities. This paper is available at the following web
page:
http://www.redbooks.ibm.com/abstracts/redp4650.html

6.1.1 IBM System x3850 X5 and x3950 X5


The IBM System x3850 X5 (Figure 6-1 on page 91) offers improved performance
and enhanced features, including MAX5 memory expansion and
workload-optimized x3950 X5 models to maximize memory, minimize costs, and
simplify deployment.


Figure 6-1 IBM System x3850 X5 and x3950 X5

The x3850 X5 and the workload-optimized x3950 X5 are the logical successors
to the x3850 M2 and x3950 M2, featuring the IBM eX4 chipset. Compared with
previous generation servers, the x3850 X5 offers the following features:
High memory capacity
Up to 64 dual inline memory modules (DIMMs) standard and 96 DIMMs with
the MAX5 memory expansion per 4-socket server
Intel Xeon processor E7 family
Exceptional scalable performance with advanced reliability for your most
data-demanding applications
Extended SAS capacity with eight HDDs and 900 GB 2.5-inch SAS drives or
1.6 TB of hot-swappable Redundant Array of Independent Disks 5 (RAID 5)
with eXFlash technology
Standard dual-port Emulex 10 GB Virtual Fabric adapter
Ten-core, 8-core, and 6-core processor options with up to 2.4 GHz (10-core),
2.13 GHz (8-core), and 1.86 GHz (6-core) speeds with up to 30 MB L3 cache
Scalable to a two-node system with eight processor sockets and 128 dual
inline memory module (DIMM) sockets
Seven PCIe x8 high-performance I/O expansion slots to support hot-swap
capabilities
Optional embedded hypervisor
The x3850 X5 and x3950 X5 both scale to four processors and 2 terabytes (TB)
of RAM. With the MAX5 attached, the system can scale to four processors and
3 TB of RAM. Two x3850 X5 servers can be connected together for a single
system image with eight processors and 4 TB of RAM.
With their massive memory capacity and computing power, the IBM System
x3850 X5 and x3950 X5 rack-mount servers are the ideal platform for
high-memory demanding, high-workload applications, such as SAP HANA.

6.1.2 IBM System x3690 X5


The IBM System x3690 X5 (Figure 6-2) is a 2U rack-optimized server that brings
new features and performance to the mid tier.

Figure 6-2 IBM System x3690 X5

This machine is a two-socket, scalable system that offers up to four times the
memory capacity of current two-socket servers. It supports the following
specifications:
Up to 2 sockets for Intel Xeon E7 processors. Depending on the processor
model, processors have six, eight, or ten cores.
Scalable 32 - 64 DIMM sockets with the addition of a MAX5 memory
expansion unit.
Advanced networking capabilities with a Broadcom 5709 dual Gb Ethernet
controller standard in all models and an Emulex 10 Gb dual-port Ethernet
adapter standard on some models, optional on all others.
Up to 16 hot-swap 2.5-inch SAS HDDs, up to 16 TB of maximum internal
storage with RAID 0, 1, or 10 to maximize throughput and ease installation.
RAID 5 is optional. The system comes standard with one HDD backplane that
can hold four drives. A second and third backplane are optional for an
additional 12 drives.
New eXFlash high-input/output operations per second (IOPS) solid-state
storage technology.


Five PCIe 2.0 slots.


Integrated management module (IMM) for enhanced systems management
capabilities.
The x3690 X5 features the IBM eXFlash internal storage using solid state drives
to maximize the number of IOPS. All configurations for SAP HANA that are
based on x3690 X5 use eXFlash internal storage for high IOPS log storage or for
both data and log storage.
The x3690 X5 is an excellent choice for a memory-demanding and
performance-demanding business application, such as SAP HANA. It provides
maximum performance and memory in a dense 2U package.

6.1.3 Intel Xeon processor E7 family


The IBM eX5 portfolio of servers uses CPUs from the Intel Xeon processor E7 family
to maximize performance. These processors are the latest in a long line of
high-performance processors.
The Intel Xeon processor E7 family CPUs are the latest Intel scalable processors
and can be used to scale up to four or more processors. When used in the IBM
System x3850 X5 or x3950 X5, these servers can scale up to eight processors.
The Intel Xeon E7 processors have many features that are relevant for the SAP
HANA workload. We cover some of these features in the following sections. For
more in-depth information about the benefits of the Intel Xeon processor E7
family for SAP HANA, see the Intel white paper Analyzing Business as it
Happens, April 2011, available for download at the following website:
http://www.intel.com/content/dam/doc/white-paper/high-performance-computing-xeon-e7-analyze-business-as-it-happens-with-sap-hana-software-brief.pdf

Instruction set extensions


SAP HANA uses several instruction set extensions of the Intel Xeon E7
processors, such as the Streaming SIMD Extensions (SSE), which allow multiple
data items to be processed with one instruction. SAP HANA uses these
instructions to speed up compression and decompression of in-memory data
and to improve search performance.

Intel Hyper-Threading Technology


Intel Hyper-Threading Technology enables a single physical processor to
execute two separate code streams (threads) concurrently on a single processor
core. To the operating system, a processor core with Hyper-Threading appears
as two logical processors, each of which has its own architectural state.
Hyper-Threading Technology is designed to improve server performance by
exploiting the multi-threading capability of operating systems and server
applications. SAP HANA makes extensive use of Hyper-Threading to
parallelize processing.
For more information about Intel Hyper-Threading Technology, see the following
web page:
http://www.intel.com/technology/platform-technology/hyper-threading/
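As a practical illustration of the preceding description (this is a generic
Linux-only sketch, not part of the SAP HANA appliance tooling), the logical
processor count that the operating system reports is typically twice the
physical core count when Hyper-Threading is enabled. Both counts can be derived
from /proc/cpuinfo:

```python
import os

def cpu_topology():
    """Return (logical, physical) CPU counts on a Linux system.

    Logical CPUs are what the OS schedules threads on; with
    Hyper-Threading enabled, each physical core appears as two
    logical processors, so logical is typically 2 * physical.
    Returns None for the physical count if /proc/cpuinfo is unavailable
    or does not expose topology fields (for example, in some VMs).
    """
    logical = os.cpu_count()
    physical = None
    try:
        cores = set()
        phys_id = core_id = None
        with open("/proc/cpuinfo") as f:
            for line in f:
                if ":" in line:
                    key, value = (part.strip() for part in line.split(":", 1))
                    if key == "physical id":
                        phys_id = value
                    elif key == "core id":
                        core_id = value
                elif not line.strip():  # a blank line ends one processor entry
                    if phys_id is not None and core_id is not None:
                        cores.add((phys_id, core_id))
                    phys_id = core_id = None
        if phys_id is not None and core_id is not None:  # last entry
            cores.add((phys_id, core_id))
        if cores:
            physical = len(cores)
    except OSError:
        pass
    return logical, physical

logical, physical = cpu_topology()
print(f"logical CPUs: {logical}, physical cores: {physical}")
```

On a Hyper-Threading system the printed logical count is double the physical
count; on systems without Hyper-Threading, the two counts match.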

Intel Turbo Boost Technology 2.0


Intel Turbo Boost Technology dynamically turns off unused processor cores and
increases the clock speed of the cores in use. For example, with six cores active,
a 2.4 GHz 10-core processor can run the cores at 2.67 GHz. With only four cores
active, the same processor can run those cores at 2.8 GHz. When the cores are
needed again, they are dynamically turned back on and the processor frequency
is adjusted accordingly. When temperature, power, or current exceed
factory-configured limits and the processor is running higher than the base
operating frequency, the processor automatically reduces the core frequency to
reduce temperature, power, and current.
Turbo Boost Technology is available on a per-processor-model basis for the eX5
systems. For ACPI-aware operating systems, no changes are required to take
advantage of it. Turbo Boost Technology can be engaged with any number of
cores enabled and active, resulting in increased performance of both
multi-threaded and single-threaded workloads.
For more information about Intel Turbo Boost Technology, see the following web
page:
http://www.intel.com/technology/turboboost/


QuickPath Interconnect
Earlier versions of the Intel Xeon processor were connected by a parallel bus to a
core chipset, which functions as both a memory and I/O controller. The Intel
Xeon E7 processors implemented in IBM eX5 servers integrate a separate
memory controller into each processor. Processor-to-processor communication
is carried over shared-clock, coherent QuickPath Interconnect (QPI) links, and
I/O is transported over non-coherent QPI links through I/O hubs (Figure 6-3).

Figure 6-3 QuickPath Interconnect, as in the eX5 portfolio

In previous designs, the entire range of memory was accessible through the core
chipset by each processor, which is called a shared memory architecture. This
new design creates a nonuniform memory access (NUMA) system in which a
portion of the memory is directly connected to the processor where a given
thread is running, and the rest must be accessed over a QPI link through another
processor. Similarly, I/O can be local to a processor or remote through another
processor.
For more information about QPI, see the following web page:
http://www.intel.com/technology/quickpath/
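The local-versus-remote distinction that NUMA introduces is exposed by the
operating system. The following Python sketch (a Linux-only illustration that
assumes the standard sysfs layout under /sys/devices/system/node) lists each
NUMA node and the CPUs attached to it, which is the information NUMA-aware
software uses to keep threads close to their data:

```python
from pathlib import Path

def numa_nodes():
    """Map each NUMA node to its attached CPU list (Linux sysfs).

    On a NUMA system, memory on a thread's own node is "local";
    memory on any other node is "remote" and must be reached over an
    interconnect (QPI on the eX5 servers) through another processor.
    Returns an empty dict if sysfs is unavailable.
    """
    nodes = {}
    base = Path("/sys/devices/system/node")
    if not base.is_dir():
        return nodes
    for node_dir in sorted(base.glob("node[0-9]*")):
        cpulist = node_dir / "cpulist"
        if cpulist.exists():
            # cpulist holds a range string such as "0-9,20-29"
            nodes[node_dir.name] = cpulist.read_text().strip()
    return nodes

for node, cpus in numa_nodes().items():
    print(f"{node}: CPUs {cpus}")
```

A four-socket server reports four nodes, one per processor; a single-socket
system reports only node0.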


Reliability, availability, and serviceability


Most system errors are handled in hardware by the use of technologies, such as
error checking and correcting (ECC) memory. The E7 processors have additional
reliability, availability, and serviceability (RAS) features due to their architecture:
Cyclic redundancy checking (CRC) on the QPI links
The data on the QPI link is checked for errors.
QPI packet retry
If a data packet on the QPI link has errors or cannot be read, the receiving
processor can request that the sending processor retry sending the packet.
QPI clock failover
In the event of a clock failure on a coherent QPI link, the processor on the
other end of the link can take over providing the clock. This is not required on
the QPI links from processors to I/O hubs because these links are
asynchronous.
Scalable memory interconnect (SMI) packet retry
If a memory packet has errors or cannot be read, the processor can request
that the packet be resent from the memory buffer.
SMI retry
If there is an error on an SMI link, or a memory transfer fails, the command
can be retried.
SMI lane failover
When an SMI link exceeds the preset error threshold, it is disabled, and
memory transfers are routed through the other SMI link to the memory buffer.
All these features help prevent data from being corrupted or lost in memory. This
is especially important with an application, such as SAP HANA, because any
failure in the area of memory or inter-CPU communication leads to an outage of
the application or even of the complete system. With huge amounts of data
loaded into main memory, even a restart of only the application means
considerable time required to return to operation.

Machine Check Architecture


The Intel Xeon processor E7 family also features the machine check architecture
(MCA), which is a RAS feature that enables the handling of system errors that
otherwise require the operating system to be halted. For example, if a dead or
corrupt memory location is discovered, but it cannot be recovered at the memory
subsystem level, and provided that it is not in use by the system or an
application, an error can be logged but the operation of the server can continue.

If it is in use by a process, the application to which the process belongs can be
aborted or informed about the situation.
Implementation of the MCA requires hardware support, firmware support (such
as that found in the Unified Extensible Firmware Interface (UEFI)), and operating
system support. Microsoft, SUSE, Red Hat, and other operating system vendors
included support for the Intel MCA on the Intel Xeon processors in their latest
operating system versions.
SAP HANA is the first application that leverages the MCA to handle system
errors to prevent the application from being terminated in the event of a system
error.
Figure 6-4 shows how SAP HANA leverages the Machine Check Architecture.

Figure 6-4 Intel Machine Check Architecture (MCA) with SAP HANA

If a memory error is encountered that cannot be corrected by the hardware, the
processor sends an MCA recovery signal to the operating system. An operating
system supporting MCA, such as the SUSE Linux Enterprise Server used in the
SAP HANA appliance, then determines whether the affected memory page is in
use by an application. If it is unused, the operating system unmaps the memory
page and marks it as bad. If the page is in use by an application, traditionally the
OS has to halt that application or, in the worst case, stop all processing and halt
the system. With
SAP HANA being MCA-aware, the operating system can signal the error
situation to SAP HANA, giving it the chance to try to repair the effects of the
memory error.


Using the knowledge of its internal data structures, SAP HANA can decide what
course of action to take. If the corrupted memory space is occupied by one of the
SAP in-memory tables, SAP HANA reloads the associated tables. In addition, it
analyzes the failure and checks whether it affects other stored or committed data,
in which case it uses savepoints and database logs to reconstruct the committed
data in a new, unaffected memory location.
With the support of MCA, SAP HANA can take appropriate action at the level of
its own data structures to ensure a smooth return to normal operation and avoid
a time-consuming restart or loss of information.
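One generic building block of this kind of recovery is visible even to ordinary
Linux applications: the kernel delivers an uncorrectable memory error in a page
owned by a process as a SIGBUS signal (with si_code BUS_MCEERR_AR or
BUS_MCEERR_AO). The following sketch illustrates only the control flow; it is not
SAP HANA code, and a production MCA-aware application would use a C-level
sigaction handler to inspect the failing address:

```python
import signal

def on_memory_error(signum, frame):
    # An MCA-aware application would identify which of its data
    # structures occupies the failed page and rebuild it from a
    # durable copy (in SAP HANA's case: savepoints and database logs).
    print("SIGBUS received: starting recovery of affected data")

# Install the handler; without one, the default action on an
# uncorrectable memory error is to terminate the process.
signal.signal(signal.SIGBUS, on_memory_error)
```

Installing such a handler is what turns a fatal hardware event into an
application-level recovery opportunity, which is the behavior described above.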

I/O hubs
The connection to I/O devices (such as keyboard, mouse, and USB) and to I/O
adapters (such as hard disk drive controllers, Ethernet network interfaces, and
Fibre Channel host bus adapters) is handled by I/O hubs, which then connect to
the processors through QPI links. Figure 6-3 on page 95 shows the I/O Hub
connectivity. Connections to the I/O devices are fault tolerant because data can
be routed over either of the two QPI links to each I/O hub.
For optimal system performance in the four processor systems (with two I/O
hubs), balance high-throughput adapters across the I/O hubs. The configurations
used for SAP HANA contain several components that require high throughput
I/O:
Dual-port 10 Gb Ethernet adapters
ServeRAID controllers to connect the SAS drives
High IOPS PCIe adapters
To ensure optimal performance, the placement of these components in the PCIe
slots was optimized according to the I/O architecture outlined above.

6.1.4 Memory
For an in-memory appliance such as SAP HANA, a system's main memory, its
capacity, and its performance play an important role. The Intel Xeon processor
E7 family (Figure 6-5 on page 99) has a memory architecture that is well suited
to the requirements of such an appliance.
The E7 processors have two SMIs. Therefore, memory needs to be installed in
matched pairs. For better performance, or for systems connected together,
memory must be installed in sets of four. The memory used in the eX5 systems is
DDR3 SDRAM registered DIMMs. All of the memory runs at 1066 MHz or less,
depending on the processor.


Figure 6-5 Memory architecture with Intel Xeon processor E7 family

Memory DIMM placement


The eX5 servers support various ways to install memory DIMMs. It is important
to understand that because of the layout of the SMI links, memory buffers, and
memory channels, you must install the DIMMs in the correct locations to
maximize performance.
Figure 6-6 on page 100 shows eight possible memory configurations for the two
memory cards and 16 DIMMs connected to each processor socket in an x3850
X5. Similar configurations apply to the x3690 X5 and HX5. Each configuration
has a relative performance score. The following key information from this chart is
important:
The best performance is achieved by populating all memory DIMMs in the
server (configuration 1 in Figure 6-6 on page 100).
Populating only one memory card per socket can result in approximately a
50% performance degradation. (Compare configuration 1 with 5.)
Memory performance is better if you install DIMMs on all memory channels
than if you leave any memory channels empty. (Compare configuration 2 with
3.)
Two DIMMs per channel result in better performance than one DIMM per
channel. (Compare configuration 1 with 2, and compare 5 with 6.)


Figure 6-6 Relative memory performance based on DIMM placement (one
processor and two memory cards shown):

Configuration (per processor)                                 Relative performance
1: 2 memory controllers, 2 DIMMs per channel, 8 DIMMs per MC  1.0
2: 2 memory controllers, 1 DIMM per channel, 4 DIMMs per MC   0.94
3: 2 memory controllers, 2 DIMMs per channel, 4 DIMMs per MC  0.61
4: 2 memory controllers, 1 DIMM per channel, 2 DIMMs per MC   0.58
5: 1 memory controller, 2 DIMMs per channel, 8 DIMMs per MC   0.51
6: 1 memory controller, 1 DIMM per channel, 4 DIMMs per MC    0.47
7: 1 memory controller, 2 DIMMs per channel, 4 DIMMs per MC   0.31
8: 1 memory controller, 1 DIMM per channel, 2 DIMMs per MC    0.29

Nonuniform memory architecture


Nonuniform memory architecture (NUMA) is an important consideration when
configuring memory because a processor can access its own local memory
faster than non-local memory. The configurations used for SAP HANA do not use
all available DIMM sockets. For configurations like these, another principle to
consider when configuring memory is that of balance. A balanced configuration
has all of the memory cards configured with the same amount of memory. This
principle helps to keep remote memory access to a minimum.
A server with a NUMA architecture, such as the servers in the eX5 family, has
local and remote memory. For a given thread running in a processor core, local
memory refers to the DIMMs that are directly connected to that particular
processor. Remote memory refers to the DIMMs that are not connected to the
processor where the thread is currently running. Remote memory is attached to another
processor in the system and must be accessed through a QPI link (Figure 6-3 on
page 95). Accessing remote memory adds latency, and the more such latencies
add up in a server, the more performance can degrade. Starting with a memory
configuration where each CPU has the same local RAM capacity is a logical step
toward keeping remote memory accesses to a minimum.
In a NUMA system, each processor has fast, direct access to its own memory
modules, reducing the latency that arises due to bus-bandwidth contention. SAP
HANA is NUMA-aware, and thus benefits from this direct connection.
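On a Linux-based system, the NUMA topology can be inspected through the distance matrix that the kernel exposes under /sys/devices/system/node. The following Python sketch is a hypothetical helper, not part of any SAP or IBM software; the sample matrix is an assumed, illustrative one for a four-socket server.

```python
# Hypothetical helper (not part of any SAP or IBM software): interpret a
# NUMA distance matrix like the one Linux exposes in
# /sys/devices/system/node/node<N>/distance.
# The sample matrix is an assumed, illustrative one for a 4-socket server;
# by convention, 10 means local access and larger values mean remote.
SAMPLE_DISTANCES = [
    [10, 21, 21, 21],
    [21, 10, 21, 21],
    [21, 21, 10, 21],
    [21, 21, 21, 10],
]

def remote_penalty(distances):
    """Worst-case remote/local access cost ratio for a distance matrix."""
    local = min(row[i] for i, row in enumerate(distances))   # diagonal
    remote = max(max(row) for row in distances)
    return remote / local

# A ratio above 1.0 means remote access is more expensive, which is why
# a balanced, mostly local DIMM placement matters.
print(remote_penalty(SAMPLE_DISTANCES))  # 2.1
```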

Hemisphere mode
Hemisphere mode is an important performance optimization of the Intel Xeon
processor E7, 6500, and 7500 product families. Hemisphere mode is
automatically enabled by the system if the memory configuration allows it. This
mode interleaves memory requests between the two memory controllers within
each processor, enabling reduced latency and increased throughput. It also
allows the processor to optimize its internal buffers to
maximize memory throughput.
Hemisphere mode is enabled only when the memory configuration behind each
memory controller on a processor is identical. In addition, because eight DIMMs
per processor are required to use all memory channels, DIMMs must be installed
eight at a time per processor for optimized memory performance.
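The eligibility rule (an identical memory population behind both memory controllers of a processor) can be expressed as a simple check. This Python sketch is illustrative only; the function name and data layout are our own, and the actual decision is made by the system firmware.

```python
# Illustrative check only: the real decision is made by system firmware.
# Each argument lists the DIMM count on every channel behind one of the
# two memory controllers of a processor.
def hemisphere_mode_enabled(mc1_dimms_per_channel, mc2_dimms_per_channel):
    """Hemisphere mode requires identical population behind both
    memory controllers of a processor."""
    return mc1_dimms_per_channel == mc2_dimms_per_channel

# Eight DIMMs per processor, spread evenly (one per channel): eligible.
print(hemisphere_mode_enabled([1, 1, 1, 1], [1, 1, 1, 1]))  # True
# An unbalanced population disables hemisphere mode.
print(hemisphere_mode_enabled([2, 2, 0, 0], [1, 1, 1, 1]))  # False
```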

6.1.5 Flash technology storage


As discussed in 2.1.2, Data persistence on page 11, storage technology
providing high IOPS capabilities with low latency is a key component of the
infrastructure for SAP HANA. The IBM eX5 systems used for the IBM Systems
Solution for SAP HANA feature two kinds of flash technology storage devices:
eXFlash, as used in the IBM System x3690 X5-based configuration
High IOPS adapters, as used in the IBM System x3950 X5-based
configurations
The following sections provide more information about these options.


eXFlash
IBM eXFlash is the name given to the eight 1.8-inch solid-state drives (SSDs),
the backplanes, SSD hot swap carriers, and indicator lights that are available for
the x3850 X5/x3950 X5 and x3690 X5. Each eXFlash can be put in place of four
SAS or SATA disks. The eXFlash units connect to the same types of ServeRAID
disk controllers as the SAS/SATA disks. Figure 6-7 shows an eXFlash unit, with
the status light assembly on the left side.

Figure 6-7 IBM eXFlash unit

In addition to using less power than rotating magnetic media, the SSDs are more
reliable and can service many more I/O operations per second (IOPS). These
attributes make them suited to I/O-intensive applications, such as transaction
processing, logging, backup and recovery, and business intelligence. Built on
enterprise-grade Multi-level Cell (MLC) NAND flash memory, the SSD drives
used in eXFlash for SAP HANA deliver up to 60,000 read IOPS per single drive.
Combined into an eXFlash unit, these drives can potentially deliver up to 480,000
read IOPS and up to 4 GBps of sustained read throughput per eXFlash unit.
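As a quick sanity check, the per-unit figure quoted above follows directly from the per-drive IOPS rating (illustrative arithmetic only; both numbers are "up to" values):

```python
# Illustrative arithmetic only; both inputs are "up to" ratings.
DRIVES_PER_EXFLASH_UNIT = 8
READ_IOPS_PER_DRIVE = 60_000

# Aggregate read IOPS of a fully populated eXFlash unit.
unit_read_iops = DRIVES_PER_EXFLASH_UNIT * READ_IOPS_PER_DRIVE
print(unit_read_iops)  # 480000
```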
In addition to its superior performance, eXFlash offers superior uptime with three
times the reliability of mechanical disk drives. SSDs have no moving parts to fail.
Each drive has its own backup power circuitry, error correction, data protection,
and thermal monitoring circuitry. They use Enterprise Wear-Leveling to extend
their use even longer.
A single eXFlash unit accommodates up to eight hot-swap SSDs and can be
connected to up to two performance-optimized controllers. The x3690 X5-based
models for SAP HANA enable RAID protection for the SSD drives by using two
ServeRAID M5015 controllers with the ServeRAID M5000 Performance
Accelerator Key for the eXFlash units.

High IOPS adapter


The IBM High IOPS SSD PCIe adapters provide a new generation of
ultra-high-performance storage based on solid-state device technology for
System x and BladeCenter. These adapters are alternatives to disk drives and
are available in several sizes, from 160 GB to 1.2 TB. Designed for
high-performance servers and computing appliances, these adapters deliver
throughput of up to 900,000 IOPS, while providing the added benefits of lower
power, cooling, and management overhead and a smaller storage footprint.
Based on standard PCIe architecture coupled with silicon-based NAND
clustering storage technology, the High IOPS adapters are optimized for System
x rack-mount systems and can be deployed in blades through the PCIe
expansion units. They are available in storage capacities up to 2.4 TB.
These adapters use NAND flash memory as the basic building block of
solid-state storage and contain no moving parts. Thus, they are less sensitive to
issues associated with vibration, noise, and mechanical failure. They function as
a PCIe storage and controller device; after the appropriate drivers are loaded,
the host operating system sees them as block devices. Because these drivers must
be loaded first, the adapters cannot be used as bootable devices.
The IBM High IOPS PCIe adapters combine high IOPS performance with low
latency. As an example, with 512-byte block random reads, the IBM 1.2 TB
High IOPS MLC Mono adapter can deliver 143,000 IOPS, compared with
420 IOPS for a 15 K RPM 146 GB disk drive. The read access latency is about
68 microseconds, roughly a seventieth of the latency of a 15 K RPM 146 GB
disk drive (about 5 ms, or 5000 microseconds). The write access latency is even
less, at about 15 microseconds.
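The quoted latency advantage can be verified with simple arithmetic (illustrative only; the constants are the figures from the text):

```python
# Illustrative arithmetic using the latency figures quoted in the text.
FLASH_READ_LATENCY_US = 68    # IBM High IOPS adapter, read access
DISK_LATENCY_US = 5_000       # 15 K RPM disk drive, about 5 ms

# The adapter's read latency is roughly a seventieth of the disk's.
ratio = DISK_LATENCY_US / FLASH_READ_LATENCY_US
print(round(ratio))  # 74
```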
Reliability features include the use of Enterprise-grade MLC (eMLC), advanced
wear-leveling, ECC protection, and Adaptive Flashback redundancy for RAID-like
chip protection with self-healing capabilities, providing unparalleled reliability and
efficiency. Advanced bad-block management algorithms enable taking blocks out
of service when their failure rate becomes unacceptable. These reliability
features provide a predictable lifetime and up to 25 years of data retention.


The x3950 X5-based models of the IBM Systems solution for SAP HANA come
with IBM High IOPS adapters, either with 320 GB (7143-H1x), 640 GB
(7143-H2x, -H3x), or 1.2 TB storage capacity (7143-HAx, -HBx, -HCx).
Figure 6-8 shows the IBM 1.2 TB High IOPS MLC Mono adapter, which comes
with the x3950 based 2012 models (7143-HAx, -HBx, -HCx).

Figure 6-8 IBM 1.2 TB High IOPS MLC Mono adapter

6.1.6 Integrated virtualization


The VMware ESXi embedded hypervisor software is a virtualization platform that
allows multiple operating systems to run on a host system at the same time. Its
compact design allows it to be embedded in physical servers.
VMware ESXi can be managed through the vSphere Web Services application
programming interface (API) and the Common Information Model (CIM) API,
using tools such as vCenter, vSphere command-line interface (CLI), vSphere
PowerCLI, and CIM clients for agentless hardware monitoring. CIM providers can
be used to expose hardware monitoring data of the host system to VMware
ESXi.
VMware ESXi includes full VMware File System (VMFS) support across local
and directly attached storage, Fibre Channel and Internet Small Computer
System Interface (iSCSI) SAN, and network-attached storage (NAS). VMware
ESXi 5.1 supports up to 64 virtual CPUs (Virtual SMP) and up to 1 TB of main
memory per virtual machine (see
http://www.vmware.com/pdf/vsphere5/r51/vsphere-51-configuration-maximums.pdf).
The host system can have up to 160 logical CPUs (a logical CPU is equivalent to
a physical core or hyperthreading thread) and up to 2 TB of main memory.

IBM offers versions of VMware vSphere Hypervisor (ESXi) customized for select
IBM hardware to give you on-line platform management, including updating and
configuring firmware, platform diagnostics, and enhanced hardware alerts. All
models of IBM System x3690 X5, x3850 X5, and x3950 X5 support several USB
keys as options, as listed in Table 6-1.
Table 6-1 VMware ESXi memory keys

Part number   Feature code   Description
41Y8298       A2G0           IBM Blank USB Memory Key for VMware ESXi Downloads
41Y8300       A2VC           IBM USB Memory Key for VMware ESXi 5.0
41Y8307       A383           IBM USB Memory Key for VMware ESXi 5.0 Update 1
41Y8311       A2R3           IBM USB Memory Key for VMware ESXi 5.1

To enable the embedded hypervisor function on the x3690 X5, an internal USB
connector on the x8 low profile PCI riser card (Figure 6-9) is reserved to support
one USB flash drive.

Figure 6-9 Low profile x8 riser card with hypervisor flash USB connector

The x3850 X5 and x3950 X5 have two internal USB connectors available for the
embedded hypervisor USB key. The location of these USB connectors is
illustrated in Figure 6-10 on page 106.


Figure 6-10 Location of internal USB ports for embedded hypervisor on the x3850 X5
and x3950 X5

For more information about the USB keys, and to download the IBM customized
version of VMware ESXi, visit the following web page:
http://www.ibm.com/systems/x/os/vmware/esxi

6.2 IBM General Parallel File System


The IBM General Parallel File System (GPFS) is a key component of the IBM
Systems solution for SAP HANA. It is a high-performance shared-disk file
management solution that can provide faster, more reliable access to a common
set of file data. It enables a view of distributed data with a single global
namespace.


6.2.1 Common GPFS features


GPFS leverages its cluster architecture to provide quicker access to your file
data. File data is automatically spread across multiple storage devices, providing
optimal use of your available storage to deliver high performance.
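Conceptually, this wide striping can be sketched in a few lines of Python. This is an illustration of the idea only; GPFS's real block allocation is far more sophisticated, and all names here are our own.

```python
# Conceptual sketch only: GPFS's real block allocation is far more
# sophisticated, and all names here are our own.
def stripe_blocks(num_blocks, disks):
    """Assign file blocks to disks in round-robin order."""
    placement = {disk: [] for disk in disks}
    for block in range(num_blocks):
        placement[disks[block % len(disks)]].append(block)
    return placement

# An 8-block file striped across four disks: reading it back can drive
# all four disks in parallel, approaching their aggregate bandwidth.
layout = stripe_blocks(8, ["disk0", "disk1", "disk2", "disk3"])
print(layout["disk0"])  # [0, 4]
```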
GPFS is designed for high-performance parallel workloads. Data and metadata
flow from all the nodes to all the disks in parallel under control of a distributed
lock manager. It has a flexible cluster architecture that enables the design of a
data storage solution that not only meets current needs but that can also quickly
be adapted to new requirements or technologies. GPFS configurations include
direct-attached storage, network block input and output (I/O), or a combination of
the two, and multi-site operations with synchronous data mirroring.
GPFS can intelligently prefetch data into its buffer pool, issuing I/O requests in
parallel to as many disks as necessary to achieve the peak bandwidth of the
underlying storage-hardware infrastructure. GPFS recognizes multiple I/O
patterns, including sequential, reverse sequential, and various forms of striped
access patterns. In addition, for high-bandwidth environments, GPFS can read or
write large blocks of data in a single operation, minimizing the overhead of I/O
operations.
Expanding beyond a storage area network (SAN) or locally attached storage, a
single GPFS file system can be accessed by nodes using a TCP/IP or InfiniBand
connection. Using this block-based network data access, GPFS can outperform
network-based sharing technologies, such as Network File System (NFS) and
even local file systems such as the EXT3 journaling file system for Linux or
Journaled File System. Network block I/O (also called network shared disk
(NSD)) is a software layer that transparently forwards block I/O requests from a
GPFS client application node to an NSD server node to perform the disk I/O
operation and then passes the data back to the client. Using a network block I/O
configuration can be more cost-effective than a full-access SAN.
Storage pools enable you to transparently manage multiple tiers of storage
based on performance or reliability. You can use storage pools to transparently
provide the appropriate type of storage to multiple applications or different
portions of a single application within the same directory. For example, GPFS
can be configured to use low-latency disks for index operations and high-capacity
disks for data operations of a relational database. You can make these
configurations even if all database files are created in the same directory.
For optimal reliability, GPFS can be configured to eliminate single points of
failure. The file system can be configured to remain available automatically in the
event of a disk or server failure. A GPFS file system is designed to transparently
fail over token (lock) operations and other GPFS cluster services, which can be
distributed throughout the entire cluster to eliminate the need for dedicated
metadata servers. GPFS can be configured to automatically recover from node,
storage, and other infrastructure failures.
GPFS provides this functionality by supporting the following features:
Data replication to increase availability in the event of a storage media failure
Multiple paths to the data in the event of a communications or server failure
File system activity logging, enabling consistent fast recovery after system
failures
In addition, GPFS supports snapshots to provide a space-efficient image of a file
system at a specified time, which allows online backup and can help protect
against user error.

6.2.2 GPFS extensions for shared-nothing architectures


IBM added several features to GPFS that support the design of shared-nothing
architectures. This need is driven by today's trend toward scale-out
applications processing big data.
A single shared storage system is not necessarily the best approach when dozens,
hundreds, or even thousands of servers have to access the same set of data.
Shared storage can impose a single point of failure (unless designed in a fully
redundant way using storage mirroring). It can limit the peak bandwidth of the
cluster file system, and providing shared storage access to hundreds or
thousands of nodes is expensive.

GPFS File Placement Optimizer (GPFS FPO) is the name for a set of features to
support big data applications on shared-nothing architectures. In such
scenarios, hundreds or even thousands of commodity servers work on certain
problems. They do not have shared storage to hold the data; instead, the
internal disks of the nodes are used to store all data. Running a cluster file
system on top of a shared-nothing architecture requires new thinking.
The features introduced with GPFS-FPO include:
Write affinity: Provides control over the placement of new data. It can either
be written to the local node or wide striped across multiple nodes.
Locality awareness: The ability to determine on which node certain data chunks
reside. This allows jobs to be scheduled on the node holding the data, thus
avoiding costly transfers of data across the network.
Metablocks: Enable two block sizes within the same file system. MapReduce
workloads tend to have very small files (below 1 MB, for example, for index
files) and very large files (such as 128 MB, holding the actual data) in the
same file system. The concept of metablocks allows for an optimal usage of
the available physical blocks.
Pipelined replication: Makes the most effective use of the node interconnect
bandwidth. With pipelined replication, node A sends data to node B, which in
turn sends the data on to node C. In contrast, with star replication, node A
sends the data to both node B and node C itself. For bandwidth-intense
operations, or for servers with limited network bandwidth, the outgoing link of
node A can limit replication performance in the star schema. Choosing the
correct replication schema is especially important when running in a
shared-nothing architecture, because this almost always involves replicating
data over the network.
Fast recovery: An intelligent way to minimize recovery efforts after the
cluster is healthy again. After an error, GPFS keeps track of which updates the
failed disks have missed. In addition, the load to recover the data is
distributed across multiple nodes. GPFS also allows two different recovery
policies: after a disk has failed, data can either be rebuilt when the disk has
been replaced, or it can immediately be rebuilt using other nodes or disks to
hold the data.
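The difference between pipelined and star replication described above can be modeled with simple bandwidth arithmetic. The following Python sketch uses assumed numbers and ignores latency; it is not measured GPFS behavior.

```python
# Illustrative bandwidth model with assumed numbers; not measured GPFS
# behavior. link_gbps is the outgoing bandwidth of each node in GB/s.
def star_replication_time(data_gb, link_gbps, replicas=2):
    # Node A sends the data to every replica itself, so the copies
    # share node A's single outgoing link.
    return data_gb * replicas / link_gbps

def pipelined_replication_time(data_gb, link_gbps, replicas=2):
    # A sends to B, B forwards to C: each outgoing link carries the
    # data only once. Pipeline start-up latency is ignored here.
    return data_gb / link_gbps

print(star_replication_time(10, 1.25))       # 16.0 seconds
print(pipelined_replication_time(10, 1.25))  # 8.0 seconds
```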
GPFS offers time-tested reliability and has been installed on thousands of
nodes across industries, from weather research to multimedia, retail, financial
industry analytics, and web service providers. GPFS is also the basis of many
IBM cloud storage offerings.
The IBM Systems solution for SAP HANA benefits in several ways from the
features of GPFS:
GPFS provides a stable, industry-proven, cluster-capable file system for SAP
HANA.
GPFS transparently works with multiple replicas (that is, copies) of a single
file in order to protect from disk failures.
GPFS adds extra performance to the storage devices by striping data across
devices.
With the new FPO extensions, GPFS enables the IBM Systems solution for
SAP HANA to grow beyond the capabilities of a single system, into a
scale-out solution, without introducing the need for external storage.
GPFS adds high-availability and disaster recovery features to the solution.
This makes GPFS the ideal file system for the IBM Systems solution for SAP
HANA.


6.3 Custom server models for SAP HANA


Following the appliance-like delivery model for SAP HANA, IBM created several
custom server models for SAP HANA. These workload-optimized models are
designed to match and exceed the performance requirements and the functional
requirements as specified by SAP. With a small set of IBM System x
workload-optimized models for SAP HANA, all sizes of SAP HANA solutions can
be built, from the smallest to large installations.

6.3.1 IBM System x workload-optimized models for SAP HANA


In the first half of 2011, IBM announced a full range of IBM System x
workload-optimized models for SAP HANA, covering all SAP HANA T-shirt sizes
with the newest generation technology. Because there is no direct relationship
between the workload-optimized models and the SAP HANA T-shirt sizes, we
refer to these models as building blocks. In some cases, there are several
building blocks available for one T-shirt size. In others, two building blocks
have to be combined to build a specific T-shirt size. Table 6-2 shows all building blocks
announced in 2011 and their features.
Table 6-2 IBM System x workload-optimized models for SAP HANA, 2011 models

Building   Server        CPUs           Main memory    Log storage        Data storage     Preload
block      (MTM)
XS         x3690 X5      2x Intel Xeon  128 GB DDR3    8x 50 GB 1.8-inch  8x 300 GB 10 K   Yes
           (7147-H1xa)   E7-2870        (8x 16 GB)     MLC SSD            SAS HDD
S          x3690 X5      2x Intel Xeon  256 GB DDR3    8x 50 GB 1.8-inch  8x 300 GB 10 K   Yes
           (7147-H2x)    E7-2870        (16x 16 GB)    MLC SSD            SAS HDD
SSD        x3690 X5      2x Intel Xeon  256 GB DDR3    10x 200 GB 1.8-inch MLC SSD         Yes
           (7147-H3x)    E7-2870        (16x 16 GB)    (combined log and data)
S+         x3950 X5      2x Intel Xeon  256 GB DDR3    320 GB High        8x 600 GB 10 K   Yes
           (7143-H1x)    E7-8870        (16x 16 GB)    IOPS adapter       SAS HDD
M          x3950 X5      4x Intel Xeon  512 GB DDR3    640 GB High        8x 600 GB 10 K   Yes
           (7143-H2x)    E7-8870        (32x 16 GB)    IOPS adapter       SAS HDD
L Option   x3950 X5      4x Intel Xeon  512 GB DDR3    640 GB High        8x 600 GB 10 K   No
           (7143-H3x)    E7-8870        (32x 16 GB)    IOPS adapter       SAS HDD

a. x = Country-specific letter (for example, EMEA MTM is 7147-H1G, and the US MTM is 7147-H1U).
Contact your IBM representative for regional part numbers.


In addition to the models listed in Table 6-2 on page 110, there are models that
are specific to a geographic region:
Models 7147-H7x, -H8x, and -H9x are for Canada only and are the same
configurations as H1x, H2x, and H3x, respectively.
Models 7143-H4x and -H5x are for Canada only and are the same
configuration as H1x and H2x, respectively.
In October of 2012, IBM announced a new set of IBM System x
workload-optimized models for SAP HANA, updating some of the components
with newer generation versions. Table 6-3 shows all building blocks announced in
2012 and their features.
Table 6-3 IBM System x workload-optimized models for SAP HANA, 2012 models

Building    Server        CPUs           Main memory    Log storage        Data storage     Preload
block       (MTM)
XS          x3690 X5      2x Intel Xeon  128 GB DDR3    10x 200 GB 1.8-inch MLC SSD         Yes
            (7147-HAxa)   E7-2870        (8x 16 GB)     (combined log and data)
S           x3690 X5      2x Intel Xeon  256 GB DDR3    10x 200 GB 1.8-inch MLC SSD         Yes
            (7147-HBx)    E7-2870        (16x 16 GB)    (combined log and data)
S+          x3950 X5      2x Intel Xeon  256 GB DDR3    1.2 TB High        8x 900 GB 10 K   Yes
            (7143-HAx)    E7-8870        (16x 16 GB)    IOPS adapter       SAS HDD
M           x3950 X5      4x Intel Xeon  512 GB DDR3    1.2 TB High        8x 900 GB 10 K   Yes
            (7143-HBx)    E7-8870        (32x 16 GB)    IOPS adapter       SAS HDD
L Option    x3950 X5      4x Intel Xeon  512 GB DDR3    1.2 TB High        8x 900 GB 10 K   No
            (7143-HCx)    E7-8870        (32x 16 GB)    IOPS adapter       SAS HDD
XMb         x3950 X5      4x Intel Xeon  1 TB DDR3      1.2 TB High        8x 900 GB 10 K   Yes
            (7143-HDx)    E7-8870        (32x 32 GB)    IOPS adapter       SAS HDD
XL Optionb  x3950 X5      4x Intel Xeon  1 TB DDR3      1.2 TB High        8x 900 GB 10 K   No
            (7143-HEx)    E7-8870        (32x 32 GB)    IOPS adapter       SAS HDD

a. x = Country-specific letter (for example, EMEA MTM is 7147-HAG, and the US MTM is 7147-HAU).
Contact your IBM representative for regional part numbers.
b. Models are specific to and limited for use with SAP Business Suite powered by SAP HANA.

All models (except for 7143-H3x, 7143-HCx, and 7143-HEx) come with
preinstalled software comprising SUSE Linux Enterprise Server for SAP
Applications (SLES for SAP) 11 SP1, IBM GPFS, and the SAP HANA software
stack. Licenses and maintenance fees (for three years) for SLES for SAP and
GPFS are included. Section GPFS license information on page 186 gives an
overview of which type of GPFS license comes with a specific model, and the
number of processor value units (PVUs) included. The licenses for the SAP
software components have to be acquired separately from SAP.
The L-Option building blocks (7143-H3x or 7143-HCx) are intended as an
extension to an M building block (7143-H2x or 7143-HBx). When building an
L-Size SAP HANA system, one M building block has to be combined with an
L-Option building block, leveraging eX5 scalability. Both systems then act as one
single eight-socket, 1 TB server. Therefore, the L-Option building blocks do not
require preinstalled software. However, they come with the required additional
software licenses for GPFS and SLES for SAP.
Both the XM (7143-HDx) and XL-Option (7143-HEx) building blocks are specific
to and limited for use with SAP Business Suite powered by SAP HANA. They
have a different memory to core ratio than the regular models, which is only
suitable for this specific workload, as outlined in section 5.5.3, SAP Business
Suite powered by SAP HANA on page 85. The XL-Option building block is
intended as an extension to the XM building block. When combined leveraging
eX5 scalability, both systems act as one single eight-socket, 2 TB server.
Therefore, the XL-Option building block does not require preinstalled software. It
comes, however, with the required additional software licenses for GPFS and
SLES for SAP. For further scalability, this 2 TB configuration can be upgraded to
a 4 TB configuration, as described in Table 6-5 on page 115.
The building blocks are configured to match the SAP HANA sizing requirements.
The main memory sizes match the number of CPUs, to give the correct balance
between processing power and data volume. Also, the storage devices in the
systems provide the storage capacity required to match the amount of main
memory.
All systems come with storage for both the data volume and the log volume
(Figure 6-11 on page 113). Savepoints are stored on a RAID protected array of
10 K SAS hard drives, optimized for data throughput. The SAP HANA database
logs are stored on flash technology storage devices:
RAID-protected, hot swap eXFlash SSD drives on the models based on
IBM System x3690 X5
Flash-based High IOPS PCIe adapters for the models based on
IBM System x3950 X5
These flash technology storage devices are optimized for high IOPS
performance and low latency to provide the SAP HANA database with a log
storage that allows the highest possible performance. Because a transaction in
the SAP HANA database can return only after the corresponding log entry is
written to the log storage, high IOPS performance and low latency are key to
database performance.
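The effect of log latency on commit throughput can be illustrated with a back-of-the-envelope bound. The Python sketch below uses the latency figures quoted earlier in this chapter and ignores batching and parallelism, so it is an upper bound on strictly serial commits, not a SAP HANA measurement.

```python
# Back-of-the-envelope upper bound for strictly serial commits; uses the
# latency figures quoted earlier in this chapter, ignores batching and
# parallelism, and is not a SAP HANA measurement.
def max_serial_commits_per_second(log_write_latency_us):
    """A commit cannot complete faster than one synchronous log write."""
    return 1_000_000 / log_write_latency_us

print(round(max_serial_commits_per_second(15)))    # flash: 66667
print(round(max_serial_commits_per_second(5000)))  # 15 K disk: 200
```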


The building blocks based on the IBM System x3690 X5 (except for the older
7147-H1x and 7147-H2x), come with combined data and log storage on an array
of RAID-protected, hot-swap eXFlash SSD drives. Optimized for throughput, high
IOPS performance, and low latency, these building blocks give extra flexibility
when dealing with large amounts of log data, savepoint data, or backup data.

Figure 6-11 SAP HANA data persistency with the internal storage of the workload-optimized systems

6.3.2 SAP HANA T-shirt sizes


This section provides information about how the SAP HANA T-shirt sizes, as
described in 3.3.1, Concept of T-shirt sizes for SAP HANA on page 26, can be
realized using the IBM System x workload-optimized models for SAP HANA. (The
model numbers given might need to be replaced by a region-specific equivalent
by changing the x to a region-specific letter identifier. See 6.3.1, IBM System
x workload-optimized models for SAP HANA on page 110.)
For a T-shirt size XS 128 GB SAP HANA system, building block XS
(7147-H1x or 7147-HAx) is the correct choice. These x3690 X5-based
building blocks are the entry-level models of the line of IBM System x
workload-optimized systems for SAP HANA.
A T-shirt size S 256 GB can either be realized with the newer S building block
(7147-HBx) or the older SSD building block (7147-H3x) with combined data
and log storage on eXFlash SSD drives. The older S building block
(7147-H2x) is suitable too, equipped with separate storage for data (SAS
drives) and logs (SSD drives), but has limitations with regards to the scale-out
solution. All three are based on IBM System x3690 X5.
For a T-shirt size S 256 GB with upgradability to M (that is, a T-shirt size S+),
the S+ building block (7143-H1x or 7143-HAx) is the perfect choice. Unlike the
S and SSD building blocks, it is based on the IBM System x3950 X5 4-socket
system to ensure upgradability.
A T-shirt size M 512 GB can be realized with the M building block (7143-H2x
or 7143-HBx). Because it can be upgraded to a T-shirt size L using the
L-Option building block, it is also the perfect fit if a T-shirt size M+ is required.
For a T-shirt size L 1 TB, one M building block (7143-H2x or 7143-HBx) must
be combined with an L-Option building block (7143-H3x or 7143-HCx),
connected together to form a single server using eX5 scaling technology.
Table 6-4 gives an overview of the SAP HANA T-shirt sizes and their relation to
the IBM custom models for SAP HANA.
Table 6-4 SAP HANA T-shirt sizes and their relation to the IBM custom models

SAP T-shirt size    XS           S             S+           M and M+     L
Compressed data     64 GB        128 GB        128 GB       256 GB       512 GB
in memory
Server main         128 GB       256 GB        256 GB       512 GB       1024 GB
memory
Number of CPUs      2            2             2            4            8
Mapping to          7147-HAx or  7147-HBx or   7143-HAx or  7143-HBx or  Combine 7143-HBx
building blocksa    7147-H1x     7147-H3x or   7143-H1x     7143-H2x     or 7143-H2x with
                                 7147-H2x                                7143-HCx or 7143-H3x

a. For a region-specific equivalent, see 6.3.1, IBM System x workload-optimized models for SAP
HANA on page 110.

In addition to the standard SAP T-shirt sizes for SAP HANA reflected in
Table 6-4, there are configurations available for use with SAP Business Suite
powered by SAP HANA, outlined in Table 6-5 on page 115.


Table 6-5 Custom models for use with SAP Business Suite powered by SAP HANA

SAP T-shirt size    XM           XL                        XXL
Compressed data     512 GB       1 TB                      2 TB
in memory
Server main         1 TB         2 TB                      4 TB
memory
Number of CPUs      4            8                         8
Mapping to          7143-HDx     Combine 7143-HDx with     Combine 7143-HDx with
building blocksa                 7143-HEx, or combine      7143-HEx and additional
                                 7143-HBx with 7143-HCx    memory and storage, or
                                 and additional memory     combine 7143-HBx with
                                                           7143-HCx and upgraded
                                                           memory and additional
                                                           storage

a. For a region-specific equivalent, see 6.3.1, IBM System x workload-optimized models for SAP
HANA on page 110.

The following section, 6.3.3, Scale-up on page 115, provides details about how
to upgrade the custom models to the extended configurations described in
Table 6-5.

6.3.3 Scale-up
This section discusses upgradability, or scale-up, and shows how IBM custom
models for SAP HANA can be upgraded to accommodate the need to grow into
bigger T-shirt sizes.
To accommodate growth, the IBM Systems Solution for SAP HANA can be
scaled in these ways:
Scale-up approach: Increase the capabilities of a single system by adding more
components.
Scale-out approach: Increase the capabilities of the solution by using multiple
systems working together in a cluster.
We discuss the scale-out approach in 6.4, Scale-out solution for SAP HANA on
page 116.


The building blocks of the IBM Systems Solution for SAP HANA, as described
previously, were designed with extensibility in mind. The following upgrade
options exist:
- An XS building block can be upgraded to an S-size SAP HANA system by adding
  128 GB of main memory to the system.
- An S+ building block can be upgraded to an M-size SAP HANA system by adding
  two more CPUs and another 256 GB of main memory. For the 7143-H1x, another
  320 GB High IOPS adapter needs to be added to the system; the newer
  7143-HAx already includes the required flash capacity.
- An M building block (7143-H2x or 7143-HBx) can be extended with the L
  option (7143-H3x or 7143-HCx) to resemble an L-size SAP HANA system. The
  2011 models can be combined with the 2012 models; for example, the older
  7143-H2x can be extended with the new 7143-HCx.
With the options to upgrade S+ to M, and M to L, IBM can provide an unmatched
upgrade path from a T-shirt size S up to a T-shirt size L, without the need to
retire a single piece of hardware.
For use with SAP Business Suite powered by SAP HANA, clients can start with an
S+ configuration, then upgrade to M and L, and finally to XL, only by adding
new components. Further growth to XXL is possible, but requires exchanging the
memory DIMMs.
Clients starting at a 1 TB configuration for SAP Business Suite powered by
SAP HANA can upgrade from XM to XL and then to XXL without the need to retire
a single piece of hardware, quadrupling the memory capacity.
Of course, upgrading server hardware requires system downtime. However,
because GPFS can add storage capacity to an existing GPFS file system by just
adding devices, data residing on the system remains intact. We nevertheless
recommend that you back up the data before changing the system's
configuration.

6.4 Scale-out solution for SAP HANA


Up to now, we have discussed single-server solutions. Although the scale-up
approach gives flexibility to expand the capabilities of an SAP HANA
installation, there might be cases where the required data volumes exceed the
capabilities of a single server. To meet such requirements, the IBM Systems
Solution for SAP HANA supports a scale-out approach, that is, combining a
number of systems into a clustered solution that represents a single SAP HANA
instance. An SAP HANA system can span multiple servers, partitioning the data
to be able to hold and process larger amounts of data than a single server can
accommodate.


Most use cases support a scale-out solution. However, the SAP Business Suite
powered by SAP HANA use case, as outlined in section 5.5.3, SAP Business
Suite powered by SAP HANA on page 85, supports only single-server
configurations at the time of writing.
To illustrate the scale-out solution, the following figures show a schematic
depiction of such an installation. Figure 6-12 shows a single-node SAP HANA
system.

Figure 6-12 Single-node SAP HANA system

This single-node solution has these components:
- The SAP HANA software (SAP HANA database with index server and statistic
  server)
- The shared file system (GPFS) on the two types of storage:
  - The data storage (on SAS disks), referred to here as HDD, which holds
    the savepoints
  - The log storage (on SSD drives or PCIe flash devices), referred to here
    as Flash, which holds the database logs


This single node represents one SAP HANA database consisting of a single
database partition. Both the savepoints (data01) and the logs (log01) are
stored only once (that is, they are not replicated), denoted as being first
replica in Figure 6-12 on page 117.

6.4.1 Scale-out solution without high-availability capabilities


The first step towards a scale-out solution was to introduce a clustered solution
without failover or high-availability (HA) capabilities. IBM was the first hardware
partner to validate a scale-out solution for SAP HANA. SAP validated this
solution for clusters of up to four nodes, using S or M building blocks in a
homogeneous cluster (that is, no mixing of S and M building blocks).
This scale-out solution differs from a single-server solution in a number of
ways:
- The solution consists of a homogeneous cluster of building blocks, which
  are interconnected with two separate 10 Gb Ethernet networks (not shown in
  Figure 6-13 on page 119), one for the SAP HANA application and one for the
  GPFS file system communication.
- The SAP HANA database is split into partitions, forming a single instance
  of the SAP HANA database.
- Each node of the cluster holds its own savepoints and database logs on the
  local storage devices of the server.
- The GPFS file system spans all nodes of the cluster, making the data of
  each node available to all other nodes of the cluster.
Figure 6-13 on page 119 illustrates this solution, showing a 3-node
configuration as an example.


Note: In the previous edition of this book, this was called primary data. To
be in line with GPFS documentation, and to emphasize that there is no
difference between multiple replicas of a single file, we use first replica
(and second and third replica later on) instead.


Figure 6-13 A 3-node clustered solution without failover capabilities

To an outside application connecting to the SAP HANA database, this cluster
looks like a single instance of SAP HANA. The SAP HANA software distributes
the requests internally across the cluster to the individual worker nodes,
which process the data and exchange intermediate results; these results are
then combined and sent back to the requestor. Each node maintains its own set
of data, persisting it with savepoints and logging data changes to the
database log.
GPFS combines the storage devices of the individual nodes into one large file
system, ensuring that the SAP HANA software has access to all data regardless
of its location in the cluster, while keeping the savepoints and database logs
of an individual database partition on the appropriate storage device of the
node on which the partition is located. Although GPFS provides the SAP HANA
software with the functionality of a shared storage system, it ensures
maximum performance and minimum latency by using locally attached disks and
flash devices. In addition, because server-local storage devices are used,
the total capacity and performance of the storage within the cluster
automatically increase with the addition of nodes, maintaining the same
per-node performance characteristics regardless of the size of the cluster.
This kind of scalability is not achievable with external storage systems.
The absence of failover capabilities represents a major disadvantage of this
solution: with regard to availability, the cluster acts like a single-node
configuration. In case one node


becomes unavailable for any reason, the database partition on that node
becomes unavailable, and with it the entire SAP HANA database. Loss of the
storage of a node means data loss (as with a single-server solution), and the
data has to be recovered from a backup. For this reason, this scale-out solution
without failover capabilities is an intermediate solution that will go away after all
of the SAP hardware partners can provide a solution featuring high-availability
capabilities. The IBM version of such a solution is described in the next section.

6.4.2 Scale-out solution with high-availability capabilities


The scale-out solution for SAP HANA with high-availability capabilities
enhances the scale-out solution described in the previous section in two
major fields:
- Making the SAP HANA application highly available by introducing standby
  nodes, which can take over from a failed node within the cluster
- Making the data provided through GPFS highly available to the SAP HANA
  application, even in the event of the loss of one node, including its data
  on the local storage devices
SAP HANA allows the addition of nodes in the role of a standby node. These
nodes run the SAP HANA application, but do not hold any data or take an active
part in the processing. In case one of the active nodes fails, a standby node
takes over the role of the failed node, including the data (that is, the database
partition) of the failed node. This mechanism allows the clustered SAP HANA
database to continue operation.


Figure 6-14 illustrates a 4-node cluster with the fourth node being a standby
node.

Figure 6-14 A 4-node clustered solution with failover capabilities

To be able to take over the database partition from the failed node, the standby
node has to load the savepoints and database logs of the failed node to recover
the database partition and resume operation in place of the failed node. This is
possible because GPFS provides a global file system across the entire cluster,
giving each individual node access to all the data stored on the storage devices
managed by GPFS.
In case a node has an unrecoverable hardware error, the storage devices
holding the node's data might become unavailable or even destroyed. In
contrast to the solution without high-availability capabilities, here the
GPFS file system replicates the data of each node to the other nodes, to
prevent data loss in case one of the nodes goes down. Replication is done in
a striping fashion; that is, every node holds a piece of the data of all
other nodes. In the example illustrated in Figure 6-14, the contents of the
data storage (that is, the savepoints, here data01) and the log storage (that
is, the database logs, here log01) of node01 are replicated to node02,
node03, and node04, each holding a part of the data on the matching device
(that is, data on HDD, log on flash). The same is true for all nodes carrying
data, so that all information is available twice within the GPFS file system,
which makes it tolerant to the loss of a single node. The replication occurs
synchronously; that is, the write operation finishes only when the data is
both written locally and replicated. This ensures consistency of the data at
any

point in time. Although GPFS replication is done over the network and in a
synchronous fashion, this solution still exceeds the performance requirements
for validation by SAP.
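The striped second replica described above can be modeled with a short
sketch. This is a hypothetical illustration of the placement idea, not GPFS
code; the round-robin policy and all names are assumptions made for the
example:

```python
def place_replicas(node, blocks, cluster_size):
    """For each data block written by `node`, return (block, first, second)
    replica placement. The first replica stays on the local node; the second
    replica is striped round-robin across all other nodes, so every peer
    holds a piece of this node's data."""
    peers = [n for n in range(cluster_size) if n != node]
    placements = []
    for i, block in enumerate(blocks):
        placements.append((block, node, peers[i % len(peers)]))
    return placements

# node01 (index 0) writes four blocks in a 4-node cluster
layout = place_replicas(0, ["data01-b0", "data01-b1", "data01-b2", "log01-b0"], 4)
for block, first, second in layout:
    print(block, "first on node", first + 1, "second on node", second + 1)
```

Because the second replica is spread across all peers, losing any single node
leaves a complete copy of its data available on the surviving nodes.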
Using replication, GPFS provides the SAP HANA software with the functionality
and fault tolerance of a shared storage system while maintaining its
performance characteristics. Again, because server-local storage devices are
used, the total capacity and performance of the storage within the cluster
automatically increase with the addition of nodes, maintaining the same
per-node performance characteristics regardless of the size of the cluster.
This kind of scalability is not achievable with external storage systems.

Example of a node takeover


To further illustrate the capabilities of this solution, this section provides a node
takeover example. In this example, we have a 4-node setup, initially configured
as illustrated in Figure 6-14 on page 121, with three active nodes and one
standby node.
First, node03 experiences a problem and fails unrecoverably. The master node
(node01) recognizes this and directs the standby node, node04, to take over from
the failed node. Remember that the standby node is running the SAP HANA
application and is part of the cluster, but in an inactive role.
To re-create database partition 3 in memory to be able to take over the role of
node03 within the cluster, node04 reads the savepoints and database logs of
node03 from the GPFS file system, reconstructs the savepoint data in memory,
and reapplies the logs so that the partition data in memory is exactly like it was
before node03 failed. Node04 is in operation, and the database cluster has
recovered.


Figure 6-15 illustrates this scenario.


Figure 6-15 Standby node 4 takes over from failed node 3

The data that node04 read belonged to node03, which failed together with its
local storage devices. For that reason, GPFS had to deliver the data to
node04 over the network, from the second replica spread across the cluster.
When node04 subsequently writes savepoints and database logs during the
normal course of operations, these are written not over the network but to
the local drives, again with a second replica striped across the cluster.
After the cause of node03's failure is fixed, it can be reintegrated into the
cluster as the new standby system (Figure 6-16 on page 124).


Figure 6-16 Node 3 is reintegrated into the cluster as a standby node

This example illustrates how IBM combines two independently operating
high-availability measures (that is, the concept of standby nodes on the SAP
HANA application level and the reliability features of GPFS on the
infrastructure level), resulting in a highly available and scalable solution.
At the time of writing, clusters of up to 16 nodes using S building blocks
(7143-HBx only), SSD building blocks, or up to 56 nodes using M building
blocks or L configurations (M building block extended by the L option), are
validated by SAP. This means that the cluster has a total main memory of up
to 56 TB, or up to 28 TB of compressed data. Depending on the compression
factor, this accommodates up to 196 TB of source data.
Note: SAP validated this scale-out solution (with HA), which is documented in
the SAP product availability matrix, with up to 16 or 56 nodes in a cluster.
However, the building block approach of IBM makes the solution scalable
without any known limitation. For those clients who need a scale-out
configuration beyond the 56 TB offered today, IBM offers a joint validation at
the client site working closely with SAP.
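The capacity figures above follow from simple arithmetic: about half of the
main memory is usable for compressed data, and the stated 7:1 compression
factor maps compressed data to source data. A sketch of the calculation (the
half-of-memory rule and per-node memory size are taken from the text; the
function is illustrative only):

```python
def cluster_capacity_tb(nodes, memory_per_node_tb=1, compression_factor=7):
    """Rough sizing arithmetic for a scale-out cluster: total main memory,
    usable space for compressed data (about half of the memory), and the
    amount of uncompressed source data that fits at the given compression
    factor."""
    total_memory = nodes * memory_per_node_tb
    compressed_data = total_memory / 2
    source_data = compressed_data * compression_factor
    return total_memory, compressed_data, source_data

print(cluster_capacity_tb(56))   # 56-node cluster of 1 TB M building blocks
```

For the 56-node case this yields 56 TB of main memory, 28 TB of compressed
data, and 196 TB of source data, matching the figures quoted above.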


Note: Uncompressed source data, assuming a compression factor of 7:1.


6.4.3 Networking architecture for the scale-out solution


Networking plays an integral role in the scale-out solution. The standard
building blocks used for scale-out are interconnected by 10 Gb Ethernet in a
redundant fashion. There are two redundant 10 Gb Ethernet networks for the
communication within the solution:
- A fully redundant 10 Gb Ethernet network for cluster-internal
  communication of the SAP HANA software
- A fully redundant 10 Gb Ethernet network for cluster-internal
  communication of GPFS, including replication
These networks are internal to the scale-out solution and have no connection
to the client network. The networking switches for these networks are part of
the appliance and cannot be substituted with switch models other than the
validated ones.
Figure 6-17 illustrates the networking architecture for the scale-out solution and
shows the SAP HANA scale-out solution connected to an SAP NetWeaver BW
system as an example.

Figure 6-17 Networking architecture for the scale-out solution

All network connections within the scale-out solution are fully redundant. Both
the internal GPFS network and the internal SAP HANA network are connected to
two 10 Gb Ethernet switches, which are interconnected for full redundancy. The


switch model used here is the IBM System Networking RackSwitch G8264. It
delivers exceptional performance, being both lossless and low latency. With
1.2 Tbps throughput, the G8264 provides massive scalability and the low
latency that is ideal for latency-sensitive applications, such as SAP HANA.
The scale-out solution for SAP HANA makes intensive use of the advanced
capabilities of this switch, such as virtual link aggregation groups (vLAGs).
For smaller scale-out deployments, the smaller IBM System Networking
RackSwitch G8124 can be used instead of the G8264. Figure 6-18 shows both the
G8264 and G8124 switch models.

Figure 6-18 IBM System Networking RackSwitch G8264 (top) and G8124 (bottom)


To illustrate the network connectivity, Figure 6-19 shows the back of an M
building block (here: 7143-H2x) with the network interfaces available. The
letters denoting the interfaces correspond to the letters used in Figure 6-17
on page 125.

Figure 6-19 The back of an M building block with the network interfaces
available (port groups: GPFS, IMM, SAP HANA)

Each building block comes with one (2011 models) or two (2012 models)
dual-port 10 Gb Ethernet network interface cards (NICs). To provide enough
ports for a fully redundant network connection to the 10 Gb Ethernet
switches, an additional dual-port 10 Gb Ethernet NIC can be added to the
system (see also section 6.4.4, Hardware and software additions required for
scale-out on page 128).
An exception to this is an L configuration, where each of the two chassis
(the M building block and the L option) holds one or two dual-port 10 Gb
Ethernet NICs. Therefore, an L configuration does not need an additional
10 Gb Ethernet NIC for the internal networks, even for the 2011 models.
The six available 1 Gb Ethernet interfaces (a, b, e, f, g, and h) on the
system can be used to connect the systems to other networks or systems, for
example, for client access, application management, systems management, or
data management. The interface denoted with the letter i is used to connect
the integrated management module (IMM) of the server to the management
network.


6.4.4 Hardware and software additions required for scale-out


The scale-out solution for the IBM Systems Solution for SAP HANA builds upon
the same building blocks as used in a single-server installation. However,
additional hardware and software components are needed to complement the
basic building blocks when implementing a scale-out solution.
Depending on the building blocks used, additional GPFS licenses might be
needed for the scale-out solution. GPFS on x86 Single Server for Integrated
Offerings, V3 provides file system capabilities for single-node integrated
offerings. This kind of GPFS license does not cover use in multi-node
environments, such as the scale-out solution discussed here. To use building
blocks that come with the GPFS on x86 Single Server for Integrated Offerings
licenses for a scale-out solution, GPFS on x86 Server licenses have to be
obtained for these building blocks. Alternatively, GPFS File Placement
Optimizer licenses can be used together with GPFS on x86 Server licenses. In
a scale-out configuration, a minimum of three nodes have to use GPFS on x86
Server licenses, and the remaining nodes can use GPFS File Placement
Optimizer licenses. Other setups, such as the disaster recovery solution
described in section 6.5, Disaster recovery solutions for SAP HANA on
page 129, might require more nodes using GPFS on x86 Server licenses,
depending on the role of the nodes in the actual setup. Section GPFS license
information on page 186 gives an overview of the GPFS license types, which
type of license comes with a specific model, and the number of processor
value units (PVUs) needed.
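The licensing rule above (at least three GPFS on x86 Server licenses per
scale-out cluster, File Placement Optimizer licenses for the remaining nodes)
can be sketched as a small helper. This is an illustrative calculation only;
actual license requirements depend on the node roles and PVU counts listed in
GPFS license information on page 186:

```python
def gpfs_license_split(nodes, min_server=3):
    """Split a scale-out cluster into GPFS on x86 Server licenses and GPFS
    File Placement Optimizer (FPO) licenses, honoring the minimum of three
    Server licenses per cluster."""
    if nodes < min_server:
        raise ValueError("a scale-out cluster needs at least %d nodes" % min_server)
    server = min_server
    fpo = nodes - min_server
    return server, fpo

print(gpfs_license_split(8))   # 8-node cluster: 3 Server, 5 FPO licenses
```

Setups such as the disaster recovery solution in 6.5 may shift more nodes
into the Server-license group, so treat this as a lower bound.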
As described in section 6.4.3, Networking architecture for the scale-out
solution on page 125, additional 10 Gb Ethernet NICs must be added to the
building blocks in some configurations in order to provide redundant network
connectivity for the internal networks, and possibly also for the connection
to the client network, in case a 10 Gb Ethernet connection to other systems
(for example, a replication server or SAP application servers) is required.
Information about supported network interface cards for this purpose is
provided in the Quick Start Guide.
For a scale-out solution built upon the SSD-only building blocks based on the
x3690 X5, additional 200 GB 1.8-inch MLC SSD drives are required to
accommodate the additional storage capacity needed for GPFS replication. The
total number of SSD drives required is documented in the SAP Product
Availability Matrix (PAM) for SAP HANA (search for HANA), which is available
online at the following site:
http://service.sap.com/pam


6.5 Disaster recovery solutions for SAP HANA


When talking about disaster recovery, it is important to understand the
difference between disaster recovery (DR) and high availability (HA). High
availability covers a hardware failure (for example, one node becoming
unavailable due to a faulty CPU, memory DIMM, storage, or network failure) in
a scale-out configuration. This was covered in section 6.4.2, Scale-out
solution with high-availability capabilities on page 120.
DR covers the event in which multiple nodes in a scale-out configuration
fail, or a whole data center goes down due to a fire, flood, or other
disaster, and a secondary site needs to take over the SAP HANA system. The
ability to recover from a disaster, or to tolerate a disaster without major
impact, is sometimes also referred to as disaster tolerance (DT).
When running an SAP HANA side-car scenario (for example, SAP CO-PA
Accelerator, sales planning, or smart metering), the data is still available
in the source SAP Business Suite system. Planning or analytical tasks run
significantly slower without the SAP HANA system being available, but no data
is lost. More critical is the situation in which SAP HANA is the primary
database, such as when using Business Warehouse with SAP HANA as the
database. In this case, the productive data is solely available within the
SAP HANA database, and according to the business service level agreements,
protection against a failure is absolutely necessary.
A disaster recovery solution for SAP HANA can be based on two different
levels:
- On the application level:
  - By replicating all actions performed by processes on the primary site
    systems to their counterparts in the secondary site. Essentially, the
    secondary site systems execute the exact same instructions as the
    primary site systems, except for accepting user requests or queries.
    This feature is known as SAP HANA System Replication.
  - By shipping database logs from the primary site to the secondary site.
    At the time of writing, this feature is not supported by the SAP HANA
    database.
- On the infrastructure level:
  - By replicating the data written to disks by SAP HANA's persistency
    layer, either synchronously or asynchronously, allowing the SAP HANA
    database to be restarted and recovered on the secondary site in the
    event the primary site becomes unavailable.
  - By using backups replicated or otherwise shipped from the primary site
    to the secondary site, and used for a restore after a disaster.


Which kind of disaster recovery solution to implement depends on the recovery
time objective (RTO) and the recovery point objective (RPO). The RTO
describes how quickly the SAP HANA database has to be available again after a
disaster. The RPO describes the point in time to which data has to be
restored after a disaster, for example, how old the most recent backup is. An
RPO of zero means no data is lost in a disaster. Business-critical systems,
like Business Warehouse with SAP HANA as the database, usually must be
operated with an RPO of zero. Table 6-6 gives a summary of the possible DR
options with SAP HANA.
Table 6-6 Disaster recovery approaches for SAP HANA

Level            DR solution                  Recovery time     Recovery point
                                              objective (RTO)   objective (RPO)

Application      SAP HANA System              Minutes           Zero
level            Replication (synchronous)

                 SAP HANA System              Minutes           Seconds
                 Replication (asynchronous)

                 Log shippinga                n/a               n/a

Infrastructure   GPFS based synchronous       Minutes           Zero
level            replication

                 GPFS based asynchronous      n/a               n/a
                 replicationa

                 Backup - Restore             Usually hours     Hours to days

a. This feature is not supported at the time of this writing.

Note: When speaking about DR, the terms primary site, primary data center,
active site, and production site mean the same thing. Similarly, secondary
site, backup site, and DR site are also used interchangeably. The primary
site hosts your production SAP HANA instance during normal operation.
If a failover happens and production is running out of the backup site, we
clearly state so.


6.5.1 DR using synchronous SAP HANA System Replication


System Replication is a new feature introduced with SAP HANA SPS05. In an
environment using synchronous SAP HANA System Replication, the systems on
both sites need to be configured identically (N+N). Every HANA process
running on the primary site cluster nodes has a corresponding process on a
secondary site node and replicates its activity to it.
The only difference between the two sites is that one cannot connect to the
secondary site HANA installation and execute queries on that database.
Upon start of the secondary site HANA cluster, each process establishes a
connection to its primary site counterpart and requests the data located in
main memory. This is called a snapshot. Once the snapshot has been
transferred, the primary site system continuously sends the log to the
secondary site system running in live-replication mode. Because at the time
of this writing System Replication does not support replaying the logs
immediately as they are received, the secondary site system only acknowledges
and persists the logs. To avoid having to replay hours or days of transaction
logs upon a failure, System Replication transmits a new full data snapshot
from time to time. In addition, System Replication also sends over status
information, such as which tables are currently loaded into main memory.
System Replication can be set to one of two modes, distinguished by the level
of persistence on the secondary site system:
- Synchronous mode: Makes the primary site system wait until the change has
  been committed and persisted on the secondary site system.
- Synchronous in-memory mode: Makes the primary site system acknowledge the
  change after it has been committed in main memory on the secondary site
  system, but not yet persisted on disk.
In both modes, the DR overhead is defined by the transmission time from the
primary process to its corresponding secondary site process. When running
System Replication in synchronous mode, one has to add the time it takes to
persist the change on disk on top of the transmission delay.
In case the connection between the two data centers is lost, live-replication
stops. Then, after a (configurable) timer expires, the primary site system
resumes work without replication.
When the connection is restored, the secondary site system requests a delta
snapshot of the changes that have been made since the connection was lost.
Live-replication can then continue once this delta has been received on the
secondary site system.
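The difference between the two modes can be expressed as simple timing
arithmetic. The numbers below are invented for illustration and are not SAP
measurements:

```python
def commit_overhead_ms(transmit_ms, persist_ms, mode):
    """DR overhead added to a transaction commit under System Replication.
    'sync' waits until the change is persisted on the secondary site;
    'sync-mem' only waits for the in-memory commit on the secondary site."""
    if mode == "sync":
        return transmit_ms + persist_ms
    if mode == "sync-mem":
        return transmit_ms
    raise ValueError("unknown mode: " + mode)

# Illustrative: 0.8 ms site-to-site transmission, 0.5 ms remote disk persist
print(commit_overhead_ms(0.8, 0.5, "sync"))      # transmission plus persist
print(commit_overhead_ms(0.8, 0.5, "sync-mem"))  # transmission only
```

Synchronous in-memory mode trades a small persistence window on the secondary
site for a lower per-commit overhead.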


In a failover to the secondary site, manual intervention is required to
change the secondary site system from live-replication mode into active mode.
When all logs have been replayed, the system is ready to accept outside
database connections.
SAP HANA System Replication ensures an RPO of zero because every change on
the primary site system is also logged on the secondary site. When a failover
has been triggered, it takes a couple of minutes to replay the logs until the
backup system becomes active and accepts user connections. This means that
the RTO is within the range of minutes, depending on the system environment
and the workload of your HANA database.

6.5.2 DR using asynchronous SAP HANA System Replication


With the release of SAP HANA SPS06 at the end of June 2013, SAP added
asynchronous replication capabilities to SAP HANA System Replication.
Asynchronous replication of database logs allows for greater distances
between the DR sites, because a high latency does not prevent the production
workload from running at maximum performance, as it does with synchronous
replication.
Asynchronous SAP HANA System Replication requires an identical node
configuration on both sites (N+N). The transaction is committed as soon as
the replicated log is sent out to the DR site; the system does not wait for
an acknowledgement from the remote site. This puts the system at risk of
losing the data in flight when the primary site experiences a disaster. This
means the RPO is not zero, but within seconds.

6.5.3 DR using GPFS based synchronous replication


A different approach to being able to recover from a disaster is to implement
resilience on the infrastructure layer. To achieve this, the IBM Systems
solution for SAP HANA leverages DR capability that is built into the file
system. By using the GPFS replication feature, an additional data copy can be
stored in a secondary data center location. This approach is identical to the
way HA is realized in a single-site scale-out environment.
The following sections give an overview of a multi-site DR solution
leveraging GPFS replication. They explain the advantages of having a quorum
node in a separate location, and describe how the idling servers on the
secondary site can be used for hosting additional installations of SAP HANA.


Overview
For a disaster recovery setup, it is necessary to have identical scale-out
configurations on both the primary and the secondary site. In addition, there
can be a third site, which has the sole responsibility of acting as a quorum
site. The distance between the primary and secondary data centers has to be
within a certain range in order to keep network latency to a minimum,
allowing for synchronous replication with limited impact on the overall
application performance (also referred to as metro-mirror distance).
Application latency is the key indicator of how well a DR solution will
perform. The geographical distance between the data centers can be short;
however, the fibre cable between them may follow another route, because the
Internet service provider usually routes via one of its hubs. This leads to a
longer physical distance for the signal to travel, and thus a higher latency.
Another factor that must be taken into account is the network equipment
between the two demarcation points on each site. More routers and protocol
conversions along the line introduce a higher latency.
Attention: When talking about latency, make sure to specify the layer at which
you are measuring it. Network engineers usually talk about network latency,
whereas SAP prefers to use application latency.

Network latency refers to the low-level latency that network packets experience
when traveling over the network from site A to site B. Network latency does not
necessarily include the time that it takes for a network packet to be processed
on a server.

Application latency refers to the delay that an SAP HANA database
transaction experiences when operated in a DR environment. This value is
sometimes also known as end-to-end latency. It is the sum of all delays that
occur while the database request is in flight and includes, besides
network latency, packet extraction in the Linux TCP/IP stack, GPFS code
execution, and processing of the SAP HANA I/O code stack.
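The difference between the two can be made concrete with a back-of-the-envelope calculation. The sketch below is illustrative only: the 5 microseconds-per-kilometer propagation figure is a common rule of thumb for light in fiber, and the per-hop and server-side overheads are rough assumptions, not SAP-validated values.

```python
def fiber_rtt_ms(path_km, us_per_km=5.0):
    """Round-trip propagation delay over fiber.

    Light in fiber travels at roughly 200,000 km/s, about 5 microseconds
    per kilometer one way. path_km is the actual cable route, which can
    be much longer than the line-of-sight distance between the sites.
    """
    return 2 * path_km * us_per_km / 1000.0


def application_latency_ms(network_rtt_ms, hops, per_hop_us, stack_ms):
    """End-to-end delay of one synchronous write, as seen by SAP HANA.

    Adds router/protocol-conversion delay (per hop, both directions) and
    server-side processing (TCP/IP stack, GPFS, SAP HANA I/O stack) on
    top of the raw network round trip.
    """
    return network_rtt_ms + 2 * hops * per_hop_us / 1000.0 + stack_ms


# A 100 km cable route gives a 1 ms network round trip; with four routers
# (assumed 50 us each) and 0.6 ms of assumed stack overhead, the
# application sees about 2 ms per synchronous write.
print(fiber_rtt_ms(100))
print(application_latency_ms(fiber_rtt_ms(100), hops=4, per_hop_us=50,
                             stack_ms=0.6))
```

The example illustrates why the application latency is always higher than the network latency, and why every additional router or protocol conversion on the inter-site link matters.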
The major difference between a single-site solution (as described in 6.4.2, Scale-out
solution with high-availability capabilities on page 120) and a multi-site solution
is the placement of the replicas within GPFS. Whereas a single-site
configuration holds two replicas6 of each data block in one cluster, a multi-site
solution holds an additional third replica in the remote or secondary site. This
ensures that, when the primary site fails, a complete copy of the data is available
in the second site and operation can be resumed there. Two replicas on the
primary site are required to ensure HA within the primary data center.

6 In GPFS terminology, each data copy is referred to as a replica. This also applies to the primary
data, which is called the first replica, indicating that all data copies are equal.
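The replica placement can be summarized in a small sketch. This is a conceptual model only; in the real system the placement is expressed through GPFS file placement policies and failure groups, not through code like this.

```python
# Conceptual model of the DR placement policy: replicas 1 and 2 of every
# data block live on disks of the primary site, replica 3 on disks of
# the secondary site.
PLACEMENT = {1: "primary", 2: "primary", 3: "secondary"}


def surviving_replicas(failed_site):
    """Replicas of each data block still readable after a site failure."""
    return sum(1 for site in PLACEMENT.values() if site != failed_site)


print(surviving_replicas("secondary"))  # primary keeps both replicas: HA intact
print(surviving_replicas("primary"))    # one complete copy left to resume from
```

Losing the secondary site leaves both local replicas, so high availability within the primary data center is unaffected; losing the primary site still leaves one complete copy from which operation can be resumed.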


A two-site solution implements the concept of synchronous data replication at the
file system level between both sites, leveraging the replication capabilities of
GPFS. Synchronous data replication means that any write request issued by the
application is acknowledged to the application only after it has been successfully
written at both sites. In order to keep the application performance within
reasonable limits, the network latency (and therefore the distance) between the
sites has to be limited to metro-mirror distances. The maximum achievable
distance depends on the performance requirements of the SAP HANA system. In
general, online analytical processing (OLAP) workloads can tolerate higher
latencies than online transaction processing (OLTP) workloads. The network
latency is mainly dictated by the connection between the two SAP HANA
clusters. This inter-site link is typically provided by a third-party Internet service
provider (ISP).

Basic architecture
During normal operation, the SAP HANA instance on the primary site is active,
and the instance on the secondary site is not. The architecture on
each site is identical to a standard scale-out cluster with high availability as
described in section 6.4.2, Scale-out solution with high-availability capabilities
on page 120. It therefore includes standby servers for high availability. A
server failure is handled completely within one site and does not force a
site failover. Figure 6-20 illustrates this setup.

Figure 6-20 Basic setup of the disaster recovery solution using GPFS synchronous replication. Site A runs the active SAP HANA DB (partitions 1-3 plus a standby node) and holds the first and second replicas of the shared GPFS file system; Site B holds the inactive SAP HANA DB installation and receives the third replica through synchronous replication; Site C hosts the GPFS quorum node.

The connection between the two main sites, A and B, depends on the client's
network infrastructure. It is highly recommended to have a dual-link dark fibre
connection to allow for redundancy also in the network switch on each site. For
full redundancy, an additional link pair is required to fully mesh the four switches.
Figure 6-21 shows a connection with one link pair in between. It also shows that
only the GPFS network needs to span both sites. The SAP HANA internal
network is kept within each site because it does not need to communicate with the
other site.

Figure 6-21 Networking overview for SAP HANA GPFS-based DR solutions. Each site has two BNT8264 switches carrying the HANA internal VLAN and the GPFS VLAN; only the GPFS VLAN is extended across the inter-switch links (ISL) between the sites.

Within each site, the 10 Gb Ethernet network connections for both the internal
SAP HANA and the internal GPFS network are implemented in a redundant
layout.
Depending on where exactly the demarcation point is between the SAP HANA
installation, the client network, and the inter-site link, different architectures can
be realized. In the best case, there is a dedicated 10 Gb Ethernet
connection going out of each of the two SAP HANA switches on each site
towards the demarcation point. There is no requirement regarding the technology
with which the data centers are connected, as long as it can route IP traffic across the link. In
general, low-latency interconnect technologies are the preferred choice.
Depending on the client infrastructure, the inter-site link may be the weakest link,
and no full 10 Gb Ethernet can be carried across it. SAP has validated the IBM
solution using a 10 Gb interconnect, but depending on the workload, a HANA
cluster might not generate that much traffic, and a much smaller bandwidth
can be sufficient. This must be decided on an individual basis for each client. For
example, the initial database load might take hours using a 1 Gbit connection, but
only minutes when using a 10 Gbit network connection to the remote site. During
normal operation, latency is more critical than bandwidth for the overall
application performance.
As with a standard scale-out implementation, the disaster recovery configuration
relies on GPFS functionality to enable the synchronous data replication between
sites. A single-site solution holds two replicas (that is, copies) of each data
block. This is enhanced with a third replica in the dual-site disaster
recovery implementation. A stretched GPFS cluster is implemented
between the two sites. Figure 6-20 on page 134 illustrates that there is a
combined cluster on the GPFS level spanning both sites, whereas each site's SAP
HANA cluster is independent of the other. GPFS file placement policies ensure
that there are two replicas on the primary site and a third replica on the
secondary site. In case of a site failure, the file system can therefore stay active
with a complete data replica in the secondary site. The SAP HANA database can
then be made operational through a manual procedure based on the persistency
and log files available in the file system.
GPFS is a cluster file system. As such, it is exposed to the risk of a split-brain
situation. A split brain happens when the connection between the two data
centers is lost but both clusters can still communicate internally. Each side
then assumes that the nodes on the other site are down and that it is therefore safe to
continue writing to the file system. In the worst case, this can lead to inconsistent
data on the two sites.
To avoid this situation, GPFS requires a quorum of nodes to be able to
communicate. This is called a GPFS cluster quorum. Not every server that is
designated as a GPFS quorum node gets elected to act as one: GPFS chooses
an odd number of servers to act as quorum nodes, and the exact number depends
on the total number of servers within the GPFS cluster. For an SAP HANA DR
installation, the primary (active) site always gets one more quorum node assigned
than the backup site. This ensures that if the inter-site link goes down,
GPFS stays up on the primary site nodes.
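The majority rule, and why the primary site is given the extra quorum node, can be sketched as follows. This is a conceptual illustration, not GPFS code; the function name is hypothetical.

```python
def cluster_has_quorum(reachable, total_quorum_nodes):
    """GPFS stays up on a set of nodes only while a strict majority of
    the designated quorum nodes can communicate with each other."""
    return reachable > total_quorum_nodes // 2


# Asymmetric DR assignment: 3 quorum nodes on the primary site,
# 2 on the backup site, 5 in total. If the inter-site link fails:
print(cluster_has_quorum(3, 5))  # primary side sees 3 of 5 -> stays up
print(cluster_has_quorum(2, 5))  # backup side sees 2 of 5 -> goes down
```

Because only one side can ever hold a majority, a broken inter-site link can never result in both sites writing to the file system at the same time.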
In addition to the GPFS cluster quorum, each file system in the GPFS cluster
stores vital information in a small structure called the file system descriptor (shown
as FSdesc in the output of GPFS commands). This file system descriptor contains
information such as the file system version, the mount point, and a list of all disks that
make up the file system. When the file system is created, GPFS writes a copy of
the file system descriptor onto every disk. From then on, GPFS only updates
a subset of these copies upon changes to the file system.
Depending on the GPFS configuration, either three, five, or six copies are kept
up-to-date7. A majority of valid file system descriptors is required for the file system to
remain accessible8. If disks fail over time, GPFS updates another copy on different disks
to ensure that enough copies stay alive. Only if multiple disks fail at the same time, and
each was holding a valid copy of the file system descriptor, does the cluster lose
the file system descriptor quorum, and the file system is automatically
unmounted.

7 For environments with just one disk, GPFS uses only one file system descriptor. This scenario does
not apply to SAP HANA setups.
8 Even if all disks have failed, as long as there is at least one valid file system descriptor that is
accessible, you still have a chance to manually recover data from the file system.


Site failover
During normal operation, there is a running SAP HANA instance active on the
primary site. The secondary site has an installed SAP HANA instance that is
inactive. A failover to the remote SAP HANA installation is a manual procedure;
however, it is possible to automate the steps in a script. Depending on the reason
for the site failover, you can decide whether the secondary site becomes the new
production site or whether a failback happens after the error at the primary site
has been fixed.
To ensure the highest level of safety, during normal operation the GPFS file
system is not mounted on the secondary site. This ensures that there is neither read
nor write access to the file system. If desired, however, the file system can be
mounted, for example read-only, to allow for backup operations on the DR site.
A site failover is triggered by any event that brings the whole primary site down. Single
errors are handled within the site using the fully redundant local hardware (such
as a spare HA node and a second network interface).
Events that are handled within the primary site include:
- Single server outage
- Single switch outage
- Accidentally pulled network cable
- Local disk failure

Events that cause a failover to the DR site include:
- Power outage in the primary data center causing all nodes to be down
- Two servers going down at the same time (not necessarily due to the same
  problem)
The time for the failover procedure depends on how long it takes to start SAP
HANA on the backup site. The data is already in the data center, ready to be
operated from. Any switch from one site to the other involves a downtime of
SAP HANA operations, because the two independent instances on either site
must not run at the same time: they share the persistency and log files
on the file system.
GPFS provides the means to restore HA capabilities on the backup site. During
normal operation, only one replica is pushed to the backup site. However, clients
considering their data centers to be equal might choose not to gracefully
fail back to the primary data center after it has been repaired, but instead
continue to run production from the backup site. For this scenario, a GPFS
restripe is triggered that creates a second replica on the backup site out of the
one available replica. This procedure is called restoring HA capabilities. The
exact commands are documented in the Operations Guide for IBM Systems
solution for SAP HANA. The duration of this restriping depends on the amount of
data residing in the file system.
When the SAP HANA instance is started, the data is loaded into main memory.
The SAP HANA database is restored to the latest savepoint and the available
logs are recovered. This procedure can be automated; however, it is dependent
on the client environment. The commands in the Operations Guide provide a
template for such an automation.
Clients choosing to continue running production out of the former backup data
center can easily add the former primary site back after it has been restored. The
nodes are integrated back into the GPFS cluster, and a resynchronization of the
most recent version of the data occurs between the new primary data center and
the new secondary data center. One replica is held in the new secondary
site. The overall picture then looks exactly like before the failover, only with the data
centers having switched their designation.

Site failback
A site failback is defined as a graceful switch of the production SAP HANA
instance from the secondary data center back to the primary data center.
To understand the procedure for a failback, it is important to know the initial state
of the DR environment. There are two possibilities:
- You have two replicas local in the data center and one replica in the remote
  site's data center. Examples include:
  - A disaster has happened, and you have restored HA on the backup site;
    now you are ready to fail back production to the primary site again.
  - During normal operations, you want to switch production to the backup site
    for maintenance reasons.
- You have one replica local in the data center and two replicas in the
  remote site's data center. Example: A disaster has happened, and you are
  running production from the backup data center without having HA restored.
Environments with only one working replica need to restore HA first before being
able to gracefully fail back. This ensures the highest level of safety for your data
during the failback procedure.
When SAP HANA is running from a file system with two local replicas, the
failback procedure is identical to a controlled failover procedure: the data
center is assumed to be down, the remote site becomes the active site, HA is
restored (using GPFS restriping), and the second site is attached again with one
single replica of the data. SAP HANA can be started as soon as it is shut down
on the other site, but it will experience a performance impact while HA is being
restored, because GPFS restriping is an I/O-heavy operation.

DR environments with dedicated quorum node
The most reliable way to implement disaster recovery is with the use of a
dedicated quorum node in a third site. The sole purpose of the quorum node is to
decide which site is allowed to run the production instance after the link
connecting the primary and secondary data centers has been lost. This situation
is known as a split brain. The quorum node, placed at a third site, has a separate
connection to the primary and to the secondary site. Figure 6-22 depicts this
configuration.
The quorum node is configured to act as a GPFS quorum server without storing
any SAP HANA data. The only requirement is to have a small disk or partition
available that can hold a file system descriptor.

Figure 6-22 Outline of a three-site environment with a dedicated GPFS quorum node. The primary site (servers running production) and the secondary site (servers waiting to take over) each have a quorum weight of 2; the tertiary site hosts only the GPFS quorum node, with a weight of 1.

In a standard setup, two nodes from the primary data center act as quorum
nodes, two nodes from the secondary data center act as quorum nodes, plus the
additional quorum node at the third site. The number of quorum nodes is shown
as the weight of a data center in Figure 6-22.
In the unlikely event that two inter-site links get interrupted at the same time,
the majority of quorum nodes is still able to communicate over the one
remaining connection and can decide from which data center to run production.
In terms of weight, this means that in any of these situations a
minimum weight of three can be guaranteed to stay up.
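The weight arithmetic behind these scenarios can be sketched as follows. This is a conceptual model: the names, the graph traversal, and the helper function are illustrative, while the real decision is GPFS's node-majority rule (3 of 5 quorum nodes), which the site weights merely summarize.

```python
# Quorum nodes (weight) per site, as in Figure 6-22.
WEIGHT = {"primary": 2, "secondary": 2, "tertiary": 1}


def reachable_weight(site, up_links):
    """Total quorum weight the given site can still reach, following the
    surviving inter-site links transitively. A site keeps quorum while
    its reachable weight is at least 3 (a majority of 5)."""
    seen, todo = set(), [site]
    while todo:
        s = todo.pop()
        if s in seen:
            continue
        seen.add(s)
        todo.extend(t for t in WEIGHT
                    if t != s and frozenset({s, t}) in up_links)
    return sum(WEIGHT[s] for s in seen)


# Only the primary-tertiary link survives: primary reaches weight 3 and
# keeps quorum; the secondary site reaches only its own weight of 2.
up = {frozenset({"primary", "tertiary"})}
print(reachable_weight("primary", up), reachable_weight("secondary", up))

# The primary site is isolated: it reaches only weight 2 and stops,
# while secondary plus tertiary together reach weight 3.
up = {frozenset({"secondary", "tertiary"})}
print(reachable_weight("primary", up), reachable_weight("secondary", up))
```

The outcomes match the scenarios discussed below: whichever partition of sites can pool a weight of at least three continues to run.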


If the links between the primary and secondary and between the secondary and
tertiary data centers go down, SAP HANA keeps running out of the primary site
without any downtime. If the links between the primary and tertiary and between
the secondary and tertiary data centers go down, the primary and secondary data
centers can still communicate, and SAP HANA keeps running out of the primary
data center.
If the links between the primary and secondary and between the primary and tertiary data
centers go down, the production SAP HANA instance in the primary
data center is isolated and loses the GPFS quorum. As a safety measure, GPFS
prevents any writing to the file system on the primary site, and SAP HANA
stops. GPFS stays up and running on the secondary site because the quorum
node still has a connection to it. Depending on client requirements, HA needs to
be restored first before SAP HANA can be started and production continues to
run out of the secondary site's data center.
It is a valid use case to set up an already existing server to act as a GPFS
quorum node for the SAP HANA DR installation9. Keep in mind that GPFS needs
root access on this machine in order to run. Three GPFS server licenses
are required on the primary and on the secondary site for the first three
servers10. Additional servers need GPFS FPO licenses.
The main advantage of having a dedicated quorum node is that the file system is
always available during failover and failback without any manual intervention, as
long as at least one site plus the quorum node are able to communicate
with each other.

DR environments without dedicated quorum node
Environments that do not have a third site to host a dedicated quorum node can
still implement a GPFS-based disaster recovery solution for SAP HANA. The
difference between environments with and without a dedicated quorum node is
the procedure required upon a failover.
A dedicated quorum node at a third site allows GPFS to continuously stay up and
running if any site goes down. Without a dedicated quorum node, its functionality
must be put on the primary site to ensure that GPFS in the primary data center
continues to operate even if the inter-site link is interrupted.
Figure 6-23 on page 141 shows such an environment without a dedicated
quorum node. The weight symbolizes the quorum designation: there are three
servers with a quorum designation in the primary data center and two servers in
the backup data center.

9 Applicability of this statement must be verified for each installation by IBM.
10 The currently validated minimum number of servers on each site for DR is three. This is required
by SAP to be able to set up a scale-out environment with HA. Hence, the requirement for at least
three GPFS server licenses per site.

Figure 6-23 Outline of a two-site environment without a dedicated GPFS quorum node. The primary site (servers running production) has a quorum weight of 3; the secondary site (servers waiting to take over) has a weight of 2.

If a disaster happens at the primary site, the GPFS cluster loses its quorum
because the two quorum nodes at the backup site do not meet the minimum of
three. An additional procedure is required to relax the GPFS quorum on the surviving
secondary site before GPFS comes up again. The exact procedure is
documented in the Operations Guide for IBM Systems solution for SAP HANA.
After GPFS is running again, the procedure is identical to a failover with a
dedicated quorum node. It is optional to restore HA capabilities within the
secondary site. SAP HANA can already be started while this restore procedure is
still running, but a performance impact must be expected because restriping is an
I/O-intensive operation.
Keep in mind that you need three GPFS server licenses on the primary and on
the backup site, even though during normal operation only two of them are
required on the backup site. If a disaster happens and you need to fail over the
SAP HANA production instance to the backup data center, this becomes the
main SAP HANA cluster and thus requires three GPFS server licenses.
Additional servers on either site get GPFS FPO licenses11.

Backup site hosting non-production SAP HANA
In the environments described so far, all servers on the secondary site only
receive data over the network from the primary site and store it on local
disks. Apart from that, they are idle; no SAP HANA processes are
running on them.

11 The currently validated minimum number of servers on each site for DR is three. This is required
by SAP to be able to set up a scale-out environment with HA. Hence, the requirement for at least
three GPFS server licenses per site.


To use the idle compute power, SAP supports the hosting of non-production
SAP HANA instances on the backup site, for example, a quality assurance (QA)
or training environment. When a disaster happens, this non-production instance
must be shut down before the failover procedure of the production instance
can be initiated.
The non-production SAP HANA instances need additional space for their
persistency and log data. IBM uses the IBM System Storage EXP2524 to
extend the locally available disk storage space. The EXP2524 connects directly
via a SAS interface to one single server and provides up to 24 additional 2.5-inch
disks. You need one EXP2524 for each secondary site server. Figure 6-24 shows
the overall architecture in an example with four SAP HANA appliances per site.

Figure 6-24 Overview of running non-production SAP HANA instances on the idling backup site. Each site has four nodes (node1-node4 on the primary site, node5-node8 on the secondary site), each with internal HDD and flash storage. The production file system keeps its first and second replicas on the primary site and its third replica on the secondary site. A second, non-production file system with two replicas spans only the expansion unit drives (metadata and data) of the secondary site servers.

If a failover happens from the primary site to the secondary site, and you are
planning to keep running production from the secondary data center, you need
to be able to host the non-production instances in the primary data
center. To accommodate this additional storage space on the primary site
servers, you must connect EXP2524 expansion units to them as well.
If you plan to gracefully fail back your production instance from the backup site to
the primary site after it has been repaired, you do not need EXP2524
expansion units on the primary site servers. Keep in mind, however, that there might be
unforeseen outages that take a very long time to repair; you will not be able to
run your non-production instances during such an outage.
There is exactly one new file system that spans all expansion units of the backup
site servers. This new file system runs with a GPFS replication factor of two,
meaning that there are always two copies of each data block. The first replica is
stored local to the node writing the data. The second replica is stored in a striped,
round-robin fashion over the other nodes. This is identical to a scale-out HA
environment: one server can fail, and the data is still available on the other nodes'
expansion units. Figure 6-24 on page 142 shows this from node5's perspective. If
node5 writes data into the non-production file system, it stores the first replica on
local disks in its expansion unit. The second replica is striped across the expansion
unit drives of node6, node7, and node8 (symbolized as a long blue box in the
figure).
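The placement described above can be sketched as a tiny model. This is conceptual only; the actual behavior comes from GPFS (FPO-style write affinity for the first replica, striping for the second), and the node names and function are illustrative.

```python
# Secondary site nodes whose expansion units form the non-production
# file system.
NODES = ["node5", "node6", "node7", "node8"]


def place_block(writer, block_no):
    """First replica on the writer's own expansion unit; second replica
    striped round-robin across the other nodes' expansion units."""
    others = [n for n in NODES if n != writer]
    return writer, others[block_no % len(others)]


# node5 writing consecutive blocks: the second replica cycles over
# node6, node7, node8, so losing any single node leaves every block
# with at least one surviving copy.
for b in range(3):
    print(place_block("node5", b))
```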
Although IBM does not support a multi-SID configuration, it is a valid scenario to
run different SAP HANA instances on different servers. With a cluster of six
nodes on each site, you could, for instance, run QA on two nodes and
development on four nodes, or QA on three nodes and development on three
nodes. Keep in mind that all non-production instances must use the same file
system, which means, in this example, that QA and development must be
configured to use different directories.

Summary
The disaster recovery solution for the IBM Systems solution for SAP HANA uses
the advanced replication features of GPFS, creating a cross-site cluster that
ensures availability and consistency of data across two sites. It does not impose
the need for additional storage systems, but instead builds upon the scale-out
solution for SAP HANA. This simple architecture reduces the complexity in
maintaining such a solution while keeping the possibility of adding more nodes
over time if the database grows.

6.5.4 DR using backup and restore


Backup and restore is the most basic way of providing disaster recovery.
Depending on the required RPO, it can nevertheless be a viable approach.
The basic concept is to back up the data on the
primary site regularly (at least daily) to a defined staging area, which might be an
external disk on an NFS share or a directly attached SAN subsystem (it does not
need to be dedicated to SAP HANA). After the backup is done, it must be
transferred to the secondary site, for example, by a simple file transfer (which can be
automated) or by using the replication functionality of the storage system used to
hold the backup files.
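The achievable RPO of this approach follows directly from the backup schedule. The sketch below is a rough model under the assumption that a backup only becomes usable on the secondary site once it has been fully transferred; the parameter values are examples, not measurements.

```python
def worst_case_rpo_hours(backup_interval_h, backup_duration_h, transfer_h):
    """Worst-case data loss window for backup-based DR.

    The disaster strikes just before the newest backup has been fully
    transferred to the secondary site, so the last usable backup there
    is one full cycle (interval + backup run + transfer) old.
    """
    return backup_interval_h + backup_duration_h + transfer_h


# Daily backups taking 2 hours, with a 1-hour transfer to the DR site.
print(worst_case_rpo_hours(24, 2, 1))  # up to 27 hours of changes lost
```

This makes explicit why backup-based DR only suits workloads that can tolerate an RPO on the order of the backup interval, in contrast to the near-zero RPO of the replication-based solutions described earlier.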


As described in Restoring a backup on page 161, a backup can be restored
only to an identical SAP HANA system. Therefore, an SAP HANA system has to
exist on the secondary site that is identical to the one on the primary site, at
minimum regarding the number of nodes and the node memory size. During normal
operations, this system can run other non-production SAP HANA instances, for
example, quality assurance (QA), development (DEV), test, or other second-tier
systems. In case the primary site goes down, the system needs to be cleared
of these second-tier HANA systems (a fresh installation of the SAP HANA software
is recommended), and the backup can be restored. After configuring the
application systems to use the secondary site instead of the primary one,
operation can be resumed. The SAP HANA database recovers from the latest
backup in case of a disaster.
Figure 6-25 illustrates the concept of using backup and restore as a basic
disaster recovery solution.

Figure 6-25 Using backup and restore as a basic disaster recovery solution. The primary site runs a four-node scale-out SAP HANA DB (DB partitions 1-3 plus a standby node) on a shared GPFS file system with two replicas and writes its backup to a storage system; the backup set is mirrored or transferred to the secondary site, where an identical scale-out system restores it.

6.6 Business continuity for single-node SAP HANA installations

Clients running their SAP HANA instance on a single node can also implement
redundancy to protect against a node failure. IBM uses GPFS to provide
high-availability and disaster recovery solutions for single-node environments.
Table 6-7 gives an overview of the different solutions that IBM is offering. The first
decision is whether there is only one data center available to use, or two separate
ones. If it is just one data center, you can implement only 1+1 high availability.
If two data centers are available, you need to determine whether the standby node
should be ready for automatic takeover, or whether it should run a
non-production SAP HANA instance (for example, QA or DEV). This leads to
either a 1+1 stretched high-availability or a 1+1 disaster recovery solution.
All three HA and DR solutions are available with building blocks of size S
(2 sockets and 256 GB), M (4 sockets and 512 GB), and L (8 sockets and 1 TB).
All three solutions also require an additional server to act as a dedicated GPFS
quorum node. This quorum node acts as a weight to ensure that the primary SAP
HANA instance continues to work even if the link to the second node is lost.
In addition, single-node DR can also be implemented for all SAP Business Suite
powered by SAP HANA models running on the new XM (4 sockets and 1 TB),
XL (8 sockets and 2 TB), or XXL (8 sockets and 4 TB) options.
Table 6-7 Overview of IBM HA and DR solutions for single-node SAP HANA installations

                                1+1 HA            1+1 stretched HA     1+1 DR
  Server location               One data center   Two data centers     Two data centers
                                on one site       on two sites         on two sites
                                                  (metro distance)     (metro distance)
  RTO                           seconds           seconds              minutes
  RPO                           zero              zero                 zero
  Replication method            Synchronous       Synchronous          Synchronous
  Automatic failover            Yes               Yes                  No
  Standby server can host       No                No                   Yes
  non-production HANA
  Additional quorum server      Yes               Yes                  Yes
  required


1+1 High-Availability
This setup realizes HA for a single node within a single data center. To achieve
this, it typically uses two data center areas that are physically isolated from each
other: a fire in the first area must not affect the second area, and vice versa.
1+1 HA places one server into each area. One server runs in active mode,
hosting one SAP HANA worker, and the other server runs in HANA standby
mode. GPFS replication ensures that all persistency files on the active node are
copied to the second node. Under normal operation, at any given point in time
there is a synchronous data copy on each of the two servers, which means that an RPO
of zero is ensured. In a disaster, the standby node takes over and continues
operation using its local data copy. An additional GPFS quorum server in the
primary site ensures that, upon a network outage to the secondary node, GPFS
stays up on the primary node and SAP HANA is continuously available.
No manual intervention is required for the takeover because it is handled
internally by SAP HANA processes. The takeover usually happens within
seconds or minutes, depending on the environment and workload. Clients must
ensure that they have a network environment between the SAP HANA servers that is
validated by SAP (in terms of throughput and latency) so that continuous data
replication can be guaranteed. Latency must be kept to a minimum12.

1+1 stretched High-Availability
This solution extends 1+1 HA to metro distances. The 1+1 stretched HA setup
spans two data centers in two different sites, with one single node installed in each
site. As such, the network is different from what is available within a single data
center. Latency and throughput requirements for running SAP HANA over two
data centers are identical to those of the synchronous replication solution described in
section 6.5.3, DR using GPFS based synchronous replication on page 132.
In a 1+1 stretched HA setup, one SAP HANA worker runs on the server in
the active data center. GPFS ensures that all data written on the active node is
replicated over to the server in the second data center. This machine has SAP
HANA running in standby mode. During installation, an additional server must be
installed at the active site to act as a GPFS quorum server. By placing it in
the primary data center, it gives the active site a higher weight, so that GPFS
continues to operate even if the server hosting the SAP HANA standby process
is no longer reachable. The standby server can become unreachable, for
example, because of a server hardware failure or because of a broken link
between the two data centers.

12. Experience has shown that having a networking firewall between the two data center areas can introduce a huge delay that dramatically slows down SAP HANA execution speed.

In-memory Computing with SAP HANA on IBM eX5 Systems

1+1 Disaster Recovery


An environment running 1+1 DR spans two data centers in two distinct locations. There is a primary and a secondary node, and SAP HANA is active only on the primary node. Under normal conditions, SAP HANA is not running on the secondary node; that is, no SAP HANA processes are running on it. The secondary node is used only to store data that is replicated through GPFS from the primary node. In the event of a disaster, SAP HANA has to be started manually on the secondary node, because there is only one set of persistency and log files residing in the file system and only one active SAP HANA instance can work with it. The exact procedure is described in the IBM SAP HANA Operations Guide.
An additional server is required that acts as a GPFS quorum node. This server must be placed in the primary data center; if the primary node can no longer communicate with the secondary node (because of a hardware failure or a broken network link), the primary site server stays up and GPFS keeps running without experiencing any downtime. Without this quorum node, if the link between the two data centers is interrupted, GPFS cannot tell which side should continue to operate, so it unmounts the file system and stops operation to prevent the two sides from having inconsistent data. When GPFS is down, SAP HANA cannot continue to operate, because no data and logs are accessible. Administrative intervention is required in that case.
The advantage of having no SAP HANA processes running on the secondary
node (not even in standby mode) is that one can run a non-productive SAP
HANA instance (like DEV or test) during normal operation. Then, if there is a
disaster, this non-productive SAP HANA instance has to be shut down before a
takeover can happen from the primary system.
This additional SAP HANA instance needs its own storage space for persistency and logs. IBM uses the IBM System Storage EXP2524 unit to provide this additional space. The EXP2524 adds up to 24 2.5-inch hard disk drives that are directly connected to the server through a SAS interface. A second file system is created over those drives for the additional SAP HANA instance. This second file system is visible only on this node.
If a disaster occurs and the productive SAP HANA instance is switched to run on
the secondary server, the non-productive instance must be shut down first.
Clients can then choose to either switch the productive instance back to the primary server after it has been repaired, or let it continue to run on the backup node. To be able to continue running the non-production instance on the now-idle former primary server, that machine must also have an IBM System Storage EXP2524 attached. Non-production data residing on the EXP2524 must be manually copied to the other side before the non-production SAP HANA instance can be started.


6.7 SAP HANA on VMware vSphere


One way of consolidating multiple SAP HANA instances on one system13 is virtualization.

On November 15th, 2012, VMware and SAP announced14 support for deploying SAP HANA virtualized with VMware vSphere. SAP supports VMware vSphere 5.1 as of SAP HANA 1.0 SPS 05 (that is, revision 45 and higher), under the following conditions:

- SAP HANA is virtualized with VMware vSphere 5.1, on single-node hardware configurations that are validated for SAP HANA and provided by certified SAP HANA appliance vendors.
- The installation of the SAP HANA appliance software into the VMware guest must be done by certified SAP HANA appliance vendors or their partners. Cloning of such virtual machines can then be done by the clients as needed.
- Only non-production deployments are supported, such as test, development, sandbox, break-fix, and learning systems.
- Only scale-up single-server appliance configurations are supported. Scale-out configurations are not supported.

More information about SAP HANA virtualization with VMware vSphere is contained in SAP Note 1788665, which also has an FAQ document attached. This SAP Note is available for download at the following site:

http://service.sap.com/sap/support/notes/1788665

VMware vSphere for the IBM Systems solution for SAP HANA
The IBM Systems solution for SAP HANA supports virtualization with VMware
vSphere, using an embedded hypervisor, as described in section 6.1.6,
Integrated virtualization on page 104. The SAP HANA Platform Edition can be
installed inside of a VMware virtual machine starting with Service Pack Stack
(SPS) 05. The SAP HANA Virtualization FAQ states that SAP allows multiple
virtual machines to be installed using a concept of slots.
Each slot is a virtual machine created with 10 virtual CPUs (vCPUs) and 64 GB of memory. The standard rules for the operating system and the sizes for the SAP HANA data and log file systems must be followed. Because of the resources consumed by the VMware ESXi server itself, one slot is reserved by SAP definition, because no CPU/memory overcommitment is allowed in an SAP HANA virtual machine.
13. One SAP HANA system, as referred to in this section, can consist of one single server or multiple servers in a clustered configuration.
14. See http://www.sap.com/corporate-en/news.epx?PressID=19929 and http://blogs.vmware.com/apps/2012/11/sap-hana-on-vmware-vsphere.html


This defines a maximum of 15 slots available on the IBM System x3950 X5 Workload Optimized system for SAP HANA appliance, and a maximum of 3 slots on the IBM System x3690 X5 Workload Optimized system for SAP HANA appliance.
Figure 6-26 illustrates the maximum number of slots available per system.

Figure 6-26 SAP HANA possible slots per T-shirt size (x = reserved)
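The slot arithmetic can be written down directly: the usable slot count is the host capacity divided into 10-vCPU/64-GB units, minus the one slot reserved for the ESXi server. The host figures in the sketch below are illustrative assumptions, not official hardware specifications:

```python
def usable_slots(host_vcpus, host_mem_gb, vcpus_per_slot=10, mem_per_slot_gb=64):
    """Slots a host can offer: limited by both CPU threads and memory,
    with one slot reserved for the VMware ESXi server itself."""
    raw = min(host_vcpus // vcpus_per_slot, host_mem_gb // mem_per_slot_gb)
    return max(raw - 1, 0)

# Illustrative (assumed) host capacities:
print(usable_slots(160, 1024))  # a large x3950 X5 class host -> 15
print(usable_slots(40, 256))    # an x3690 X5 class host -> 3
```

The same helper shows why the reservation matters: a host with capacity for exactly one slot offers zero usable slots once the ESXi reservation is applied.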

Table 6-8 shows the configuration parameters for virtual machines of slot sizes 1 - 8. Sizes 7 and 8 are not supported because VMware's latest ESX server software restricts guests to a maximum of 64 vCPUs. You can install one or more virtual machines of slot sizes 1 - 6 on any IBM System x3950 X5 Workload Optimized solution for SAP HANA appliance.


Table 6-8 SAP HANA virtual machine sizes by IBM and SAP

SAP      SAP HANA      IBM    vCPUs    Virtual  Required      Total    Total
T-shirt  support pack  name   (HT on)  memory   no. of slots  HDD      SSD
XXS      SPS 05        VM1    10       64 GB    1             352 GB   64 GB
XS       SPS 05        VM2    20       128 GB   2             608 GB   128 GB
-        Manually      VM3    30       192 GB   3             864 GB   192 GB
-        SPS 05        VM4    40       256 GB   4             1120 GB  256 GB
-        Manually      VM5    50       320 GB   5             1376 GB  320 GB
-        Manually      VM6    60       384 GB   6             1632 GB  384 GB
-        n/a           VM7a   70       448 GB   7             1888 GB  448 GB
-        n/a           VM8a   80       512 GB   8             2144 GB  512 GB

a. This slot size is not possible due to limitations of the VMware ESXi 5 hypervisor.
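The sizes in Table 6-8 follow a linear pattern per slot. The sketch below reproduces the rows from the slot count; the per-slot increments and the fixed 96 GB HDD base are inferred from the table values, not an official SAP sizing formula:

```python
def vm_profile(slots):
    """(vCPUs, memory GB, total HDD GB, total SSD GB) for an n-slot HANA VM.
    Each slot contributes 10 vCPUs, 64 GB RAM, 256 GB HDD, and 64 GB SSD;
    the HDD figure carries an additional fixed 96 GB base."""
    return 10 * slots, 64 * slots, 96 + 256 * slots, 64 * slots

for n in range(1, 9):
    print("VM%d:" % n, vm_profile(n))
```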

VMware vSphere licensing


VMware vSphere 5 comes in different editions, which differ in their feature sets15. For the deployment of SAP HANA on VMware vSphere, the number of supported vCPUs per virtual machine (VM) is the most important difference:

- The Standard edition supports up to 8 vCPUs per VM, which is below the number of vCPUs of the smallest SAP HANA VM size. Therefore, this edition cannot be used.
- The Enterprise edition supports up to 32 vCPUs per VM, and can be used for SAP HANA VMs with up to 30 vCPUs and 192 GB of virtual memory. This edition can be a cost-saving choice when deploying only smaller SAP HANA VMs, for example, for training or test purposes.
- The Enterprise Plus edition supports up to 64 vCPUs per VM, and thus can be used for SAP HANA VMs with up to 60 vCPUs and 384 GB of virtual memory, which currently is the maximum supported VM size.

VMware vSphere is licensed on a per-processor basis. Each processor in a server must have a valid license installed to be able to run vSphere. The number of processors for each of the building blocks of the IBM Systems solution for SAP HANA is listed in the tables in section 6.3, "Custom server models for SAP HANA" on page 110.
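These two rules — the per-VM vCPU limit deciding the edition, and per-processor licensing deciding the quantity — can be captured in a small helper. This is a sketch of the rules as stated here, not VMware's licensing tooling:

```python
EDITION_VCPU_LIMITS = (("Standard", 8), ("Enterprise", 32), ("Enterprise Plus", 64))

def vsphere_edition(vcpus_per_vm):
    """Smallest vSphere 5 edition whose per-VM vCPU limit fits the VM."""
    for edition, limit in EDITION_VCPU_LIMITS:
        if vcpus_per_vm <= limit:
            return edition
    raise ValueError("no vSphere 5 edition supports more than 64 vCPUs per VM")

def licenses_needed(processors):
    """vSphere 5 is licensed per processor: one license per populated socket."""
    return processors

print(vsphere_edition(30))  # Enterprise
print(vsphere_edition(60))  # Enterprise Plus
print(licenses_needed(4))   # e.g., a four-processor M building block
```

Note that the smallest SAP HANA VM size (10 vCPUs) already exceeds the Standard edition's limit, which is why that edition never applies here.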

15. See http://www.vmware.com/files/pdf/vsphere_pricing.pdf for a complete overview of the available editions and their feature sets.

Table 6-9 on page 151 lists the order numbers for both the Enterprise and
Enterprise Plus editions of VMware vSphere 5.
Table 6-9 VMware vSphere 5 ordering P/Ns

Description                                        License / Subscription           VMware P/N                          IBM P/N
VMware vSphere 5 Enterprise for 1 processor        License and 1-year subscription  VS5-ENT-C / VS5-ENT-PSUB-C          4817SE3
VMware vSphere 5 Enterprise Plus for 1 processor   License and 1-year subscription  VS5-ENT-PL-C / VS5-ENT-PL-PSUB-C    4817SE5
VMware vSphere 5 Enterprise for 1 processor        License and 3-year subscription  VS5-ENT-C / VS5-ENT-3PSUB-C         4817TE3
VMware vSphere 5 Enterprise Plus for 1 processor   License and 3-year subscription  VS5-ENT-PL-C / VS5-ENT-PL-3PSUB-C   4817TE5
VMware vSphere 5 Enterprise for 1 processor        License and 5-year subscription  VS5-ENT-C / VS5-ENT-5PSUB-C         4817UE3
VMware vSphere 5 Enterprise Plus for 1 processor   License and 5-year subscription  VS5-ENT-PL-C / VS5-ENT-PL-5PSUB-C   4817UE5

As an example, if you want to deploy SAP HANA on an M building block (for example, 7143-HBx), you will need four licenses.

Sizing
The sizing is the same for virtualized and non-virtualized SAP HANA deployments. Although there is a small performance impact due to virtualization, the database size, and therefore the required memory size, is not affected.

Support
As with any other deployment type of SAP HANA, clients are asked to open an SAP support ticket, leveraging the integrated support model outlined in section 7.9.1, "IBM SAP integrated support" on page 171. Any non-SAP related issue is routed to VMware first, before it eventually is forwarded to the hardware partner. In certain, but rare, situations, SAP or its partners might require the workload to be reproduced on bare metal.


6.8 SAP HANA on IBM SmartCloud


IBM announced availability of integrated managed services for SAP HANA
Appliances in the Global SmartCloud for SAP Applications Offering. This
combination of SAP HANA and IBM SmartCloud for SAP Applications global
managed services will help clients reduce SAP infrastructure costs and
complexity of their SAP Business Suite and HANA landscape. These appliances
will also accelerate business analytics by leveraging the IBM global managed
cloud infrastructure with standardized processes and skilled SAP-certified staff.
This offering provides SAP platform as a service (PaaS) on IBM SmartCloud
Enterprise+ for all SAP Business Suite and SAP Business Objects products. IBM
is a certified Global SAP Partner for cloud services.
Initially, IBM will support two side-by-side HANA use cases where the data
resides in a traditional database of the SAP application, for example, SAP
Business Suite, but selected data is replicated into HANA to accelerate real-time
reporting and analytics. Only single node SAP HANA installations will be
supported without the high-availability feature.

Side-by-side reporting and analytics


- SAP HANA is used as a data mart to accelerate operational reporting.
- Data is replicated from the SAP application into the SAP HANA appliance, for example, using the SAP LT replication server.
- Data is modeled in SAP HANA via SAP HANA Studio. Preconfigured models and templates can be loaded to jump-start the project and deliver quick results.
- SAP BusinessObjects BI 4 can be used as the reporting solution on top of SAP HANA. Preconfigured dashboards and reports can be deployed.
- SAP Rapid Deployment Solutions (SAP RDSs), which provide predefined but customizable SAP HANA data models, are available (for example, ERP CO-PA and CRM).

Side-by-side accelerators

- SAP HANA serves as a secondary database for SAP Business Suite applications.
- Data is replicated from the SAP application into the in-memory SAP HANA appliance, for example, using the SAP LT replication server.
- The SAP Business Suite application is accelerated by retrieving the results of complex calculations on mass data directly from the in-memory database.
- The user interface for end users remains unchanged to ensure nondisruptive acceleration.
- Additional analytics based on the replicated data in SAP HANA can provide new insights for end users.
- The difference from the reporting scenario is that the consumer of the data replicated to SAP HANA is not a BI tool, but the source system itself.
- Also for this scenario, several SAP RDSs are provided by SAP.

6.9 IBM Systems solution with SAP Discovery system


The SAP Discovery system is a preconfigured hardware and software landscape
that can be used to test-drive SAP technologies. It is an evaluation tool that
provides an opportunity to realize the joint value of the SAP business process
platform and SAP BusinessObjects tools running on a single system. It provides
a complete, fully documented system with standard SAP software components
for developing and delivering service-based applications, including all the
interfaces, functionality, data, and guidance necessary to run a complete,
end-to-end business scenario.
The SAP Discovery system allows you to interact with SAP's most current technologies: Mobility (Sybase Unwired Platform, Afaria), SAP HANA, SAP CRM, SAP ERP EhP5, SAP NetWeaver 7.3, SAP BusinessObjects, and more, along with the robust IBM DB2 database. The SAP business process platform, which is part of the SAP Discovery system, helps organizations discover ways to accelerate business innovation and respond to changing business needs by designing reusable process components that use enterprise services. The SAP BusinessObjects portfolio of tools and applications on the SAP Discovery system is designed to help optimize information discovery and delivery, information management and query, reporting, and analysis. For business users, the SAP Discovery system helps bridge the gap between business and IT and serves as a platform for future upgrade planning and functional trial and gap analysis.
The SAP Discovery system includes sample business scenarios and
demonstrations that are preconfigured and ready to run. It is a preconfigured
environment with prepared demos and populated with best practices data. A list
of detailed components, exercises, and the SAP best practices configuration is
available at the following website:
http://www.sdn.sap.com/irj/sdn/discoverysystem
The IBM Systems solution with SAP Discovery system uses the IBM System
x3650 M4 server to provide a robust, compact, and cost-effective hardware
platform for the SAP Discovery system, using VMware ESXi software with
Microsoft Windows and SUSE Linux operating systems. IBM System x3650 M4 servers offer an energy-smart, affordable, and easy-to-use rack solution for data center environments looking to significantly lower operational and solution costs.
Figure 6-27 on page 154 shows the IBM Systems solution with SAP Discovery
system.

Figure 6-27 The IBM Systems solution with SAP Discovery system

With an embedded VMware hypervisor, the x3650 M4 provides a virtualized environment for the SAP software, consolidating a wealth of applications onto a single 2U server. The IBM Systems solution with SAP Discovery system is also configured with eight hard drives (including one recovery drive) to create a compact, integrated system.


The combination of the IBM Systems solution for SAP HANA and the
IBM Systems solution with SAP Discovery system is the ideal platform to explore,
develop, test, and demonstrate the capabilities of an SAP landscape including
SAP HANA. Figure 6-28 illustrates this configuration.

Figure 6-28 IBM Systems solution with SAP Discovery system combined with SAP HANA

Whether you plan to integrate new SAP products into your infrastructure or are
preparing for an upgrade, the IBM Systems solution with SAP Discovery system
can help you thoroughly evaluate SAP applications and validate their benefits.
You gain hands-on experience, the opportunity to develop a proof of concept,
and the perfect tool for training your personnel in advance of deploying a
production system. The combination of the IBM Systems solution with SAP Discovery system and one of the SAP HANA models based on IBM System x3690 X5 gives you a complete SAP environment, including SAP HANA, in a compact 4U package.
More information about the IBM Systems solution with SAP Discovery system is
available at the following website:
http://www.ibm.com/sap/discoverysystem


Chapter 7. SAP HANA operations


This chapter discusses the operational aspects of running an SAP HANA
system.
The following topics are covered:

- 7.1, "Installation services" on page 158
- 7.2, "IBM SAP HANA Operations Guide" on page 158
- 7.3, "Interoperability with other platforms" on page 160
- 7.4, "Backing up and restoring data for SAP HANA" on page 160
- 7.5, "Monitoring SAP HANA" on page 166
- 7.6, "Sharing an SAP HANA system" on page 167
- 7.7, "Installing additional agents" on page 168
- 7.8, "Software and firmware levels" on page 169
- 7.9, "Support process" on page 170

Copyright IBM Corp. 2013. All rights reserved.


7.1 Installation services


The IBM Systems solution for SAP HANA comes with the complete software
stack, including the operating system, General Parallel File System (GPFS), and
the SAP HANA software. Due to the nature of the software stack, and
dependencies on how the IBM Systems solution for SAP HANA is used at the
client location, the software stack cannot be preloaded completely at
manufacturing. Therefore, installation services are required. Installation services
for the IBM Systems solution for SAP HANA typically include:

- Performing an inventory and validation of the delivered system configuration
- Verifying and updating the hardware to the latest levels of basic input/output system (BIOS), firmware, device drivers, and OS patches, as required
- Verifying and configuring the Redundant Array of Independent Disks (RAID) configuration
- Finishing the software preinstallation according to the client environment
- Configuring and verifying network settings and operation
- Performing system validation
- Providing onsite skills transfer (when required) on the solution and best practices, and delivering post-installation documentation
To ensure the correct operation of the appliance, installation services for the
IBM Systems solution for SAP HANA have to be performed by specifically trained
personnel, available from IBM STG Lab Services, IBM Global Technology
Services, or IBM Business Partners, depending on the geography.

7.2 IBM SAP HANA Operations Guide


The IBM SAP HANA Operations Guide is an extensive guide describing the
operations of an IBM Systems solution for SAP HANA appliance. It covers the
following topics:
- Cluster operations
  - Actions to take after a server node failure, such as recovering the GPFS file system, removing the SAP HANA node from the cluster, and installing a replacement node
  - Recovering from a temporary node failure by bringing GPFS on that node back to a fully operational state and restarting SAP HANA on the node
  - Adding a cluster node by integrating it into the private networks of the appliance, and into the GPFS and SAP HANA clusters
  - Reinstalling the SAP HANA software on a node
- Disaster recovery cluster operations
  - Common operations deviating from the procedures of a normal installation, such as system shutdown and startup
  - Planned failover and failback procedures
  - Site failover procedures after a site failure, for various scenarios
  - How to deal with node failures, disk failures, and network failures
  - How to operate non-production instances running on the secondary site in a disaster recovery scenario
- Drive operations
  - Checking the drive configuration and the health of the drives
  - Replacing failed hard drives, solid-state drives (SSDs), and IBM High input/output operations per second (IOPS) devices, and reintegrating them into GPFS
  - Driver and firmware upgrades for the IBM High IOPS devices
- System health checks
  - How to obtain the GPFS cluster status and configuration, file system status and configuration, disk status and usage, quotas, SAP HANA application status, and network information from the switches
- Software updates
  - Checklists for updating the Linux kernel, the drivers for the IBM High IOPS devices, and GPFS, including instructions on how to do a rolling upgrade where applicable
- References to related documentation, pointing to important documentation from IBM, SAP, and SUSE

The Operations Guide is continuously optimized and extended, based on new developments and client feedback. The latest version of this document can be downloaded from SAP Note 1650046.

Chapter 7. SAP HANA operations

159

7.3 Interoperability with other platforms


To access the SAP HANA database from a system (SAP or non-SAP), the SAP
HANA database client has to be available for the platform the system is running
on. Platform availability of the SAP HANA database client is documented in the
product availability matrix (PAM) for SAP HANA, which is available online at the following site (search for "HANA"):

http://service.sap.com/pam

At the time of writing, the SAP HANA database client is available on all major platforms, including but not limited to:

- Microsoft Windows Server 2008 and Windows Server 2008 R2
- Microsoft Windows Vista and Windows 7 (both 32-bit and 64-bit)
- SUSE Linux Enterprise Server 11 on 32-bit and 64-bit x86 platforms, and IBM System z
- Red Hat Enterprise Linux 5 and 6 on 64-bit x86 platforms
- IBM AIX 5.2, 5.3, 6.1, and 7.1 on the IBM POWER platform
- IBM i V7R1 on the IBM POWER platform
- HP-UX 11.31 on Itanium
- Oracle Solaris 10 and 11 on x86 and SPARC

For up-to-date and detailed availability information, refer to the PAM.
If there is no SAP HANA database client available for a certain platform,
SAP HANA can still be used in a scenario with replication by using a dedicated
SAP Landscape Transformation server (for SAP Business Suite sources) or an
SAP BusinessObjects Data Services server running on a platform for which the
SAP HANA database client is available. This way, data can be replicated into
SAP HANA, which then can be used for reporting or analytic purposes, using a
front end supporting SAP HANA as a data source.

7.4 Backing up and restoring data for SAP HANA


Because SAP HANA usually plays a critical role in the overall landscape, it is
critical to back up the data in the SAP HANA database and be able to restore it.
This section gives a short overview about the basics of backup and recovery for
SAP HANA and the integration of SAP HANA and IBM Tivoli Storage Manager
for ERP.

160

In-memory Computing with SAP HANA on IBM eX5 Systems

7.4.1 Basic backup and recovery


It is technically impossible to simply save away the savepoints and the database logs in a consistent way, so such a copy is not a consistent backup that can be recovered from. Therefore, a simple file-based backup of the persistency layer of SAP HANA is not sufficient.

Backing up
A backup of the SAP HANA database has to be triggered through SAP HANA Studio or, alternatively, through the SAP HANA SQL interface. SAP HANA then creates a consistent backup, consisting of one file per SAP HANA service on each cluster node. SAP HANA always performs a full backup. Incremental backups are currently not supported by SAP HANA.
SAP HANA internally maintains transaction numbers, which are unique within a
database instance, also and especially in a scale-out configuration. To be able to
create a consistent backup across a scale-out configuration, SAP HANA
chooses a specific transaction number, and all nodes of the database instance
write their own backup files including all transactions up to this transaction
number.
The backup files are saved to a defined staging area that might be on the internal disks, an external disk on an NFS share1, or a directly attached SAN subsystem. In addition to the data backup files, the configuration files and backup catalog files have to be saved to enable a recovery. For point-in-time recovery, the log area also has to be backed up.

With the IBM Systems solution for SAP HANA, one of the 1 Gbit network interfaces of the server can be used for NFS connectivity; alternatively, an additional 10 Gbit network interface can be used (if a PCI slot is available). It is also supported to add a Fibre Channel host bus adapter (HBA) for SAN connectivity. The Quick Start Guide for the IBM Systems solution for SAP HANA lists the supported hardware additions that provide additional connectivity.

Restoring a backup
It might be necessary to recover the SAP HANA database from a backup in the
following situations:
The data area is damaged

If the data area is unusable, the SAP HANA database can be recovered up to the latest committed transaction, provided all the data changes after the last complete data backup are still available in the log backups and the log area. After the data and log backups have been restored, the SAP HANA database uses them, together with the log entries in the log area, to restore the data and replay the logs. It is also possible to recover the database using an older data backup and log backups, as long as all relevant log backups made after that data backup are available. For more information, see SAP Note 1705945 (Determining the files needed for a recovery).

1. SAP Note 1820529 lists network file systems that are unsuitable for backup and recovery.
The log area is damaged
If the log area is unusable, the only way to recover is to replay the log backups. In consequence, any transactions committed after the most recent log backup are lost, and all transactions that were open during the log backup are rolled back.

After restoring the data and log backups, the log entries from the log backups are automatically replayed to recover. It is also possible to recover the database to a specific point in time, as long as it is within the existing log backups.
The database needs to be reset to an earlier point in time because of a logical
error
To reset the database to a specific point in time, a data backup from before
the point in time to recover to and the subsequent log backups must be
restored. During recovery, the log area might be used as well, depending on
the point in time the database is reset to. All changes made after the recovery
time are (intentionally) lost.
You want to create a copy of the database
It can be desirable to create a copy of the database for various purposes,
such as creating a test system.
A database recovery is initiated from SAP HANA Studio.

A backup can be restored only to an identical SAP HANA system with regard to the number of nodes, node memory size, host names, and SID. However, changing the host names and the SID during recovery has been possible since SAP HANA 1.0 SPS 04.
When restoring a backup image from a single node configuration into a scale-out
configuration, SAP HANA does not repartition the data automatically. The correct
way to bring a backup of a single-node SAP HANA installation to a scale-out
solution is as follows:
1. Back up the data from the stand-alone node.
2. Install SAP HANA on the master node.

3. Restore the backup into the master node.


4. Install SAP HANA on the slave and standby nodes as appropriate, and add
these nodes to the SAP HANA cluster.
5. Repartition the data across all worker nodes.
More detailed information about the backup and recovery processes for the
SAP HANA database is provided in the SAP HANA Backup and Recovery Guide,
available online at the following site:
http://help.sap.com/hana_appliance

7.4.2 File-based backup tool integration


By using the mechanisms outlined in section 7.4.1, "Basic backup and recovery" on page 161, virtually any backup tool can be integrated with SAP HANA. Backups can be triggered programmatically using the SQL interface, and the resulting backup files, written locally, can then be moved to the backup storage by the backup tool. Backup scheduling can be done using scripts triggered by the standard Linux job scheduling capabilities or by other external schedulers. Because the Backint backup interface was introduced to SAP HANA only with SPS 05, file-based backup tool integration is the only option for pre-SPS 05 SAP HANA deployments.
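Such a script-driven backup can be sketched with SAP HANA's BACKUP DATA SQL statement. The helper below builds a timestamped backup prefix; the connection details (host, port, user, password) are placeholders, and the hdbcli Python driver that ships with the SAP HANA client is assumed to be installed:

```python
import datetime

def make_backup_sql(prefix, now=None):
    """Build a BACKUP DATA statement with a timestamped file prefix;
    SAP HANA writes one backup file per service using this prefix."""
    now = now or datetime.datetime.now()
    return "BACKUP DATA USING FILE ('%s_%s')" % (prefix,
                                                 now.strftime("%Y%m%d_%H%M%S"))

def run_backup(prefix="scheduled"):
    """Trigger the backup through the SAP HANA client's Python driver.
    Schedule this function with cron; all connection values are placeholders."""
    from hdbcli import dbapi
    conn = dbapi.connect(address="hana-host", port=30015,
                         user="BACKUP_OPERATOR", password="changeme")
    try:
        conn.cursor().execute(make_backup_sql(prefix))
    finally:
        conn.close()

print(make_backup_sql("scheduled"))
```

The resulting backup files can then be picked up from the staging area by the backup tool, as described above.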
Section A.2, File-based backup with IBM TSM for ERP on page 188 describes
such a file-based integration of IBM Tivoli Storage Manager for ERP V6.4 with
SAP HANA.

7.4.3 Backup tool integration with Backint for SAP HANA


Starting with SAP HANA 1.0 SPS05, SAP provides an application programming
interface (API), which can be used by manufacturers of third-party backup tools to
back up the data and redo logs of an SAP HANA system3. Leveraging this
Backint for SAP HANA API, a full integration with SAP HANA studio can be
achieved, allowing configuration and execution of backups using Backint for
SAP HANA.
With Backint, instead of writing the backup files to local disks, dedicated SAN disks, or network shares, SAP HANA creates data stream pipes. Pipes are a way to transfer data between two processes: one writes data into the pipe, and the other reads data out of it. This makes a backup using Backint a one-step backup: no intermediate backup data is written, unlike with a file-based backup tool integration.
3. See SAP Note 1730932 (Using backup tools with Backint) for more details.

Backing up through Backint


The third-party backup agent runs on the SAP HANA server and communicates
with the third-party backup server. SAP HANA communicates with the third-party
backup agent through the Backint interface. After the user initiates a backup
through SAP HANA Studio or the hdbsql command-line interface,
SAP HANA writes a set of text files describing the parameterization for this
backup, including version and name information, stream pipe locations, and the
backup policy to use. Then, SAP HANA creates the stream pipes announced in
these files. Each SAP HANA service (for example, the index server, name server,
statistics server, and XS engine) has its own stream pipe to write its own
backup data to. The third-party backup agents read the data streams from these
pipes and pass them on to the backup server. Currently, SAP HANA does not offer
backup compression; however, third-party backup agents and servers can compress
the backup data and further transform it, for example, by applying encryption.
Finally, SAP HANA transmits backup catalog information, before the third-party
backup agent writes a file reporting the result and administrative information
such as backup identifiers. This information is made available in SAP HANA
Studio.
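Besides SAP HANA Studio, such a backup can also be started from the hdbsql command-line interface. The following sketch assumes SPS05 or later; the instance number, user, and backup name are example values:

```shell
#!/bin/sh
# Sketch: start a Backint-based data backup from the command line.
# Instance number (00), user, and backup name are example values.
command -v hdbsql >/dev/null 2>&1 || {
  echo "hdbsql not found; run this on the SAP HANA server"
  exit 0
}
hdbsql -i 00 -u SYSTEM -p "$HANA_PASSWORD" \
  "BACKUP DATA USING BACKINT ('COMPLETE_DATA_BACKUP')"
```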

Restoring through Backint


As outlined in "Restoring a backup" in section 7.4.1, "Basic backup and
recovery" on page 161, a database restore might be necessary when the data
area or log area is damaged, to recover from a logical error, or to copy the
database. This can be achieved by using data and log backups performed
previously.
A restore operation can be initiated through SAP HANA Studio only. As a
first step, SAP HANA shuts down the database. SAP HANA then writes a
set of text files describing the parameterization for this restore, including a
list of backup identifiers and stream pipe locations. After receiving the backup catalog
information from the third-party backup tool, SAP HANA performs a series of
checks to ensure that the database can be recovered with the backup data
available. Then, SAP HANA establishes the communication with the third-party
backup agents by using stream pipes, and requests the backup data from the
backup server. The backup agents then stream the backup data received from
the backup server through the stream pipes to the SAP HANA services. As a
final step, the third-party backup agent writes a file reporting the result of the
operation for error-handling purposes. This information is made available in
SAP HANA Studio.

Backint certification
Backup tools using the Backint for SAP HANA interface are subject to
certification by SAP. The certification process is documented at
http://scn.sap.com/docs/DOC-34483. To determine which backup tools are
certified for Backint for SAP HANA, the Partner Information Center can be
searched by selecting the SAP-defined integration scenario "HANA-BRINT 1.1 -
HANA Backint Interface". The search function of the Partner Information Center
is available at http://www.sap.com/partners/directories/SearchSolution.epx.
The following sections give a short overview of the backup tools that are
certified for Backint for SAP HANA today.

7.4.4 IBM Tivoli Storage Manager for ERP 6.4


Starting with version 6.4.1, IBM Tivoli Storage Manager (TSM) for ERP integrates
with the Backint for SAP HANA API for simplified protection of SAP HANA
in-memory databases.
IBM Tivoli Storage Manager for ERP V6.4.1 simplifies and improves the
performance of backup and restore operations for SAP HANA in-memory
relational databases by eliminating an interim copy to disk. The former
two-step process for these operations is replaced by a one-step process that
uses the new Backint for SAP HANA API available in SAP HANA Support Package
Stack 05 (SPS 05). By using the new Backint interface, Tivoli Storage Manager
for ERP can now support any SAP HANA appliance that meets the requirements
defined by the SAP HANA level in use and supported by Tivoli Storage Manager
for ERP, including appliances running on competitive Intel-based hardware from
HP, Cisco, or Fujitsu.
The new extensions to Tivoli Storage Manager for ERP not only allow you to use
the most recent functionality of SAP HANA's in-memory database environments,
for example, enhanced performance and scalability in multi-node environments
and better consumability through tighter integration with SAP HANA Studio, but
also to apply familiar features of Tivoli Storage Manager for ERP to SAP HANA
backup and recovery:

- Management of all files per backup as a logical entity
- Running multiple parallel multiplexed sessions
- Use of multiple network paths
- Backup to multiple Tivoli Storage Manager servers, and so on
- Creation of multiple copies of the database redo logs as needed
- Compression
- Deduplication
In addition to the Backint integration, Tivoli Storage Manager for ERP 6.4
continues to feature a file-based integration with SAP HANA for pre-SPS05
SAP HANA deployments on IBM hardware. Section A.2, "File-based backup with
IBM TSM for ERP" on page 188 describes such a file-based integration of
IBM Tivoli Storage Manager for ERP V6.4 with SAP HANA.


7.4.5 Symantec NetBackup 7.5 for SAP HANA


Symantec NetBackup 7.5 was the first third-party backup tool to be certified
for the Backint for SAP HANA interface, in December of 2012 (see
http://www.saphana.com/community/blogs/blog/2012/12/19/backint-for-sap-hana-certification-available-now).
With a NetBackup agent version 7.5.0.6 or later installed on the SAP HANA
appliance (requires SAP HANA 1.0 SPS05, Revision 46 or higher), SAP HANA can
be integrated into an existing NetBackup deployment. This allows SAP HANA to
send streamed backup data to NetBackup, which uses an SAP policy to manage the
backed-up data sets, providing destination targets, retention, duplication,
and replication for disaster recovery purposes. This seamless integration
provides the following benefits:

- Native integration into SAP HANA Studio for ease of use
- Integrated media management and capacity management
- Deduplication for backup sets
- Compression
- Encryption on various levels, including network transmission and on tape
- Disaster recovery for backups with NetBackup Auto Image Replication (AIR)

Symantec NetBackup supports SAP HANA appliances from all vendors of
validated SAP HANA configurations, including IBM.

7.5 Monitoring SAP HANA


In a productive environment, administration and monitoring of an SAP HANA
appliance play an important role.
The SAP tool for the administration and monitoring of the SAP HANA appliance
is SAP HANA Studio. It allows you to monitor the overall system state:
- General system information (such as software versions)
- A warning section that shows the latest warnings generated by the statistics
  server; detailed information about these warnings is available as a tooltip
- Bar views that provide an overview of important system resources; the amount
  of available memory, CPUs, and storage space is displayed, in addition to the
  used amount of these resources
In a distributed landscape, the amount of available resources is aggregated
over all servers.
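The same resource figures that SAP HANA Studio displays can also be read from SAP HANA's monitoring views, for example M_HOST_RESOURCE_UTILIZATION. The following sketch queries the per-host memory figures with hdbsql; the instance number and user are example values:

```shell
#!/bin/sh
# Sketch: query per-host memory utilization from a monitoring view.
# Instance number (00) and user are example values.
command -v hdbsql >/dev/null 2>&1 || {
  echo "hdbsql not found; run this on the SAP HANA server"
  exit 0
}
hdbsql -i 00 -u SYSTEM -p "$HANA_PASSWORD" \
  "SELECT HOST,
          ROUND(USED_PHYSICAL_MEMORY / 1024 / 1024 / 1024, 1) AS USED_GB,
          ROUND(FREE_PHYSICAL_MEMORY / 1024 / 1024 / 1024, 1) AS FREE_GB
     FROM M_HOST_RESOURCE_UTILIZATION"
```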

Note: More information about the administration and monitoring of SAP HANA
is available in the SAP HANA administration guide, accessible online:
http://help.sap.com/hana_appliance

7.6 Sharing an SAP HANA system


SAP HANA is a high-performance appliance, which generally prohibits the use of
virtualization concepts. This can lead to many SAP HANA appliances in the
datacenter, for example, production, disaster recovery, quality assurance (QA),
test, and sandbox systems, possibly for multiple application scenarios,
regions, or lines of business. Therefore, the consolidation of SAP HANA
instances, at least for non-production systems, is desirable. Section 6.7,
"SAP HANA on VMware vSphere" on page 148 describes the new support for
SAP HANA running in VMware vSphere environments.
Another way of consolidating is to install more than one instance of SAP HANA
onto one system. (One SAP HANA system, as referred to in this section, can
consist of a single server or multiple servers in a clustered configuration.)
There are, however, major drawbacks to consolidating multiple SAP HANA
instances on one system. Because of this, it is generally not supported for
production systems. For non-production systems, the support status depends on
the scenario:
Multiple Components on One System (MCOS)
Having multiple SAP HANA instances on one system, also referred to
as Multiple Components on One System (MCOS), is not recommended
because it poses conflicts between different SAP HANA databases on a
single server, for example, common data and log volumes, possible
performance degradation, and interference of the systems with each
other. SAP and IBM support this under certain conditions (see SAP
Note 1681092); however, if issues arise, SAP or IBM might ask you,
as part of the troubleshooting process, to stop all but one of the
instances to see whether the issue persists.
Multiple Components on One Cluster (MCOC)
Running multiple SAP HANA instances on one scale-out cluster (for
the sake of similarity to the other abbreviations, we call this
Multiple Components on One Cluster (MCOC)) is supported as long as
each node of the cluster runs only one SAP HANA instance. A
development and a QA instance can run on one cluster, but with
dedicated nodes for each of the two SAP HANA instances; for example,
each of the nodes runs either the development instance or the QA
instance, but not both. Only the GPFS file system is shared
across the cluster.
Multiple Components in One Database (MCOD)
Having one SAP HANA instance containing multiple components,
schemas, or application scenarios, also referred to as Multiple
Components in One Database (MCOD), is supported. However, having
all data within a single database, which is also maintained as a
single database, can lead to limitations in operations, database
maintenance, backup and recovery, and so on. For example, bringing
down the SAP HANA database affects all of the scenarios; it is
impossible to bring it down for only one scenario. SAP Note 1661202
documents the implications.
Consider the following factors when consolidating SAP HANA instances on one
system:
- An instance filling up the log volume causes all other instances on the
  system to stop working properly. This can be addressed by monitoring the
  system closely.
- Installation of an additional instance might fail when other instances are
  already installed and active on the system. The installation procedure checks
  the available space on the storage and refuses to install when there is less
  free space than expected. This might also happen when trying to reinstall an
  already installed instance.
- Installing a new SAP HANA revision for one instance might affect other
  instances already installed on the system. For example, new library versions
  coming with the new installation might break the already installed instances.
- The performance of the SAP HANA system becomes unpredictable because the
  individual instances on the system share resources such as memory and CPU.
When asking for support for such a system, you might be asked to remove the
additional instances and to re-create the issue on a single-instance system.
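As a sketch of the close monitoring suggested above, a simple script can watch the fill level of a shared log volume and warn before all instances are affected. The mount point and threshold are example values, and the script falls back to /tmp so it can run anywhere for demonstration:

```shell
#!/bin/sh
# Sketch: warn when the shared log volume of a consolidated system fills up.
# The mount point and threshold are example values.
LOG_VOLUME=${LOG_VOLUME:-/hana/log}
THRESHOLD=${THRESHOLD:-90}                 # warn above 90% usage
[ -d "$LOG_VOLUME" ] || LOG_VOLUME=/tmp    # demonstration fallback
USAGE=$(df -P "$LOG_VOLUME" | awk 'NR==2 { sub("%", "", $5); print $5 }')
if [ "$USAGE" -ge "$THRESHOLD" ]; then
  echo "WARNING: $LOG_VOLUME is ${USAGE}% full"
else
  echo "OK: $LOG_VOLUME is ${USAGE}% full"
fi
```

A cron job running such a check every few minutes, feeding an existing alerting system, is one straightforward way to implement the monitoring.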

7.7 Installing additional agents


Many organizations have processes and supporting software in place to monitor,
back up, or otherwise interact with their servers. Because SAP HANA is
delivered in an appliance-like model, there are restrictions with regard to
additional software (for example, monitoring agents) being installed onto the
appliance. SAP permits the installation and operation of external software if
the prerequisites stated in SAP Note 1730928 are met.


Only the software installed by the hardware partner is recommended on the
SAP HANA appliance. For the IBM Systems solution for SAP HANA, IBM defined
three categories of agents:
Supported     IBM provides a solution covering the respective area; no
              validation by SAP is required.
Tolerated     Solutions provided by a third party that are allowed to be used
              on the IBM Workload Optimized Solution for SAP HANA. It is the
              client's responsibility to obtain support for such solutions.
              Such solutions are not validated by IBM and SAP. If issues with
              such solutions occur and cannot be resolved, the use of such
              solutions might be prohibited in the future.
Prohibited    Solutions that must not be used on the IBM Systems solution for
              SAP HANA. Using these solutions might compromise the
              performance, stability, or data integrity of SAP HANA.

Do not install additional software on the SAP HANA appliance that is classified
as prohibited for use on the appliance. For example, initial tests show that
some agents, such as virus scanners, can decrease performance or even corrupt
the SAP HANA database.
In general, all additionally installed software must be configured to not interfere
with the functionality or performance of the SAP HANA appliance. If any issue of
the SAP HANA appliance occurs, you might be asked by SAP to remove all
additional software and to reproduce the issue.
The list of agents that are supported, tolerated, or prohibited for use on the
SAP HANA appliance is published in the Quick Start Guide for the IBM Systems
solution for SAP HANA appliance, which is available at this website:
http://www-947.ibm.com/support/entry/myportal/docdisplay?lndocid=MIGR-5087035

7.8 Software and firmware levels


The IBM Systems solution for SAP HANA appliance contains several different
components that might at times need to be upgraded (or downgraded), depending
on different support organizations' recommendations. These components can be
split into four general categories:

Firmware
Operating system
Hardware drivers
Software


The IBM System x SAP HANA support team, after being informed, reserves the
right to perform basic system tests on these levels when they are deemed to
have a direct impact on the SAP HANA appliance. In general, IBM does not give
specific recommendations about which levels are allowed for the SAP HANA
appliance.
The IBM System x SAP HANA development team provides, at regular intervals,
new images for the SAP HANA appliance. Because these images have
dependencies regarding the hardware, operating system, and drivers, use the
latest image for maintenance and installation of SAP HANA systems. These
images can be obtained through IBM support. Part number information is
contained in the Quick Start Guide.
If firmware level recommendations for the IBM components of the SAP HANA
appliance that fix known code bugs are given by the individual IBM System x
support teams, it is the client's responsibility to upgrade or downgrade to the
recommended levels as instructed by IBM support.
If operating system recommendations for the SUSE Linux components of the
SAP HANA appliance that fix known code bugs are given by the SAP, SUSE, or IBM
support teams, it is the client's responsibility to upgrade or downgrade to the
recommended levels, as instructed by SAP through an explicit SAP Note or
allowed through an OSS Customer Message. SAP describes its operational concept,
including the updating of operating system components, in SAP Note 1599888 -
SAP HANA: Operational Concept. If the Linux kernel is updated, take extra care
to recompile the IBM High IOPS drivers and the IBM GPFS software as well, as
described in the IBM SAP HANA Operations Guide.
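As a hedged sketch of that recompilation step, the GPFS portability layer is typically rebuilt from its source tree after a kernel update. The path corresponds to a default GPFS installation; the authoritative procedure, including the High IOPS driver rebuild, is in the IBM SAP HANA Operations Guide:

```shell
#!/bin/sh
# Sketch: rebuild the GPFS portability layer after a kernel update.
# The path corresponds to a default GPFS installation; consult the
# IBM SAP HANA Operations Guide for the authoritative procedure.
if [ -d /usr/lpp/mmfs/src ]; then
  cd /usr/lpp/mmfs/src || exit 1
  make Autoconfig && make World && make InstallImages
else
  echo "GPFS source tree not found; run this on the appliance"
fi
```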
If an IBM High IOPS driver or IBM GPFS recommendation to update the software
is given by the individual IBM support teams (System x, Linux, GPFS) to fix
known code bugs, do not update these drivers without first asking the IBM
System x SAP HANA support team through an SAP OSS Customer Message.
If other hardware or software recommendations for IBM components of the
SAP HANA appliance that fix known code bugs are given by the individual IBM
support teams, it is the client's responsibility to upgrade or downgrade to the
recommended levels as instructed by IBM support.

7.9 Support process


The deployment of SAP HANA as an integrated solution, combining software and
hardware from both IBM and SAP, is also reflected in the support process for
the IBM Systems solution for SAP HANA.


All SAP HANA models offered by IBM include SUSE Linux Enterprise Server
(SLES) for SAP Applications with SUSE 3-year priority support and IBM GPFS
with 3-year support. The hardware comes with a 3-year limited warranty (for
information about the IBM Statement of Limited Warranty, see
http://www.ibm.com/servers/support/machine_warranties), including
customer-replaceable unit (CRU) and on-site support (IBM sends a technician
after attempting to diagnose and resolve the problem remotely).

7.9.1 IBM SAP integrated support


SAP integrates the support process with SUSE and IBM as part of the HANA
appliance solution-level support. If you encounter software problems on your
SAP HANA system, access the SAP Online Service System (SAP OSS) website:
https://service.sap.com
When you reach the website, create a service request ticket using a
subcomponent of BC-HAN or BC-DB-HDB as the problem component. IBM
support works closely with SAP and SUSE and is dedicated to supporting SAP
HANA software and hardware issues.
Send all questions and requests for support to SAP using its OSS messaging
system. A dedicated IBM representative is available at SAP to work on this
solution. Even if it is clearly a hardware problem, open an SAP OSS message to
get the best direct support for the IBM Systems solution for SAP HANA.
When opening an SAP support message for what is clearly a hardware problem, we
suggest using the text template provided in the Quick Start Guide. This
procedure expedites all hardware-related problems within the SAP support
organization. Otherwise, the SAP support teams will gladly help you with
questions regarding the SAP HANA appliance in general.
IBM provides a script to get an overview of the current system status and the
configuration of the running system. The script saphana-check-ibm.sh is
preinstalled in the directory /opt/ibm/saphana/bin. The most recent version can
be found in SAP Note 1661146.
Before you contact support, ensure that you have taken these steps to try to
solve the problem yourself:
- Check all cables to ensure that they are connected.
- Check the power switches to ensure that the system and any optional devices
  are turned on.



- Use the troubleshooting information in your system documentation, and use
  the diagnostic tools that come with your system. Information about diagnostic
  tools is available in the Problem Determination and Service Guide on the
  IBM Documentation CD that comes with your system.
- Go to the following IBM support website to check for technical information,
  hints, tips, and new device drivers, or to submit a request for information:
  http://www.ibm.com/supportportal
- For SAP HANA software-related issues, you can search the SAP OSS website for
  problem resolutions. The OSS website has a knowledge database of known issues
  and can be accessed at the following site:
  https://service.sap.com/notes
- The main SAP HANA information source is available at the following site:
  https://help.sap.com/hana_appliance
If you have a specific operating system question or issue, contact SUSE
regarding SUSE Linux Enterprise Server for SAP Applications. Go to the SUSE
website:
http://www.suse.com/products/prioritysupportsap
Media is available for download at the following site:
http://download.novell.com/index.jsp?search=Search&families=2658&keywords=SAP
Note: Registration is required before you can download software packages
from the SUSE website.

7.9.2 IBM SAP International Competence Center InfoService


The IBM SAP International Competence Center (ISICC) InfoService is the key
support function of the IBM and SAP alliance. It serves as a single point of entry
for all SAP-related questions for clients using IBM Systems and solutions with
SAP applications. As a managed question and answer service, it has access to a
worldwide network of experts on technology topics about IBM products in SAP
environments. You can contact the ISICC InfoService using email at:
infoservice@de.ibm.com
Note: The ISICC InfoService does not provide product support. If you need
product support for the IBM Systems solution for SAP HANA, refer to section
7.9.1, "IBM SAP integrated support" on page 171. If you need support for
other IBM products, consult the product documentation on how to get support.


Chapter 8. Summary

This chapter summarizes the benefits of in-memory computing and the
advantages of IBM infrastructure for running the SAP HANA solution. We discuss
the following topics:
- 8.1, "Benefits of in-memory computing" on page 174
- 8.2, "SAP HANA: An innovative analytic appliance" on page 174
- 8.3, "IBM Systems solution for SAP HANA" on page 175
- 8.4, "Going beyond infrastructure" on page 178

© Copyright IBM Corp. 2013. All rights reserved.

8.1 Benefits of in-memory computing


In today's data-driven culture, tools for business analysis are quickly evolving.
Organizations need new ways to take advantage of critical data dynamically to
not only accelerate decision making, but also to gain insights into key trends. The
ability to instantly explore, augment, and analyze all data in near real-time can
deliver the competitive edge that your organization needs to make better
decisions faster and to use favorable market conditions, customer trends, price
fluctuations, and other factors that directly influence the bottom line.
Made possible through recent technology advances that combine large, scalable
memory, multi-core processing, fast solid-state storage, and data management,
in-memory computing uses these technology innovations to establish a
continuous real-time link between insight, foresight, and action to deliver
significantly accelerated business performance.

8.2 SAP HANA: An innovative analytic appliance


To support today's information-critical business environment, SAP HANA gives
companies the ability to process huge amounts of data faster than ever before.
The appliance lets business users instantly access, model, and analyze all of a
company's transactional and analytical data from virtually any data source in
real time, in a single environment, without impacting existing applications or
systems.
The result is accelerated business intelligence (BI), reporting, and analysis
capabilities with direct access to the in-memory data models residing in SAP
in-memory database software. Advanced analytical workflows and planning
functionality directly access operational data from SAP ERP or other sources.
SAP HANA provides a high-speed data warehouse environment, with an SAP
in-memory database serving as a next-generation, in-memory acceleration
engine.
SAP HANA efficiently processes and analyzes massive amounts of data by
packaging SAP's use of in-memory technology, columnar database design, data
compression, and massively parallel processing together with essential tools
and functionality, such as data replication and analytic modeling.
Delivered as an optimized hardware appliance based on IBM eX5 enterprise
servers, the SAP HANA software includes:
- High-performance SAP in-memory database and a powerful data calculation
  engine
- Real-time replication service to access and replicate data from SAP ERP
- Data repository to persist views of business information
- Highly tuned integration with SAP BusinessObjects BI solutions for insight
  and analytics
- SQL and MDX interfaces for third-party application access
- Unified information-modeling design environment
- SAP BusinessObjects Data Services to provide access to virtually any SAP
  and non-SAP data source
To explore, model, and analyze data in real time without impacting existing
applications or systems, SAP HANA can be leveraged as a high-performance
side-by-side data mart to an existing data warehouse. It can also replace the
database server for SAP NetWeaver Business Warehouse, adding in-memory
acceleration features.
SAP Business Suite applications powered by SAP HANA bring together
transactions and analytics into a single in-memory platform. In this integrated
scenario, one SAP HANA instance is used as the primary database for SAP
Business Suite applications. By leveraging the power of SAP HANA,
organizations can immediately update plans, run simulations, and drive business
decisions based on detailed real-time data.
These components create an excellent environment for business analysis, letting
organizations merge large volumes of SAP transactional and analytical
information from across the enterprise, and instantly explore, augment, and
analyze it in near real time.

8.3 IBM Systems solution for SAP HANA


The IBM Systems solution for SAP HANA based on IBM eX5 enterprise servers
provides the performance and scalability to run SAP HANA, enabling clients to
drive near real-time business decisions and helping organizations stay
competitive. IBM eX5 enterprise servers provide a proven, scalable platform for
SAP HANA that enables better operational planning, simulation, and forecasting,
in addition to optimized storage, search, and ad hoc analysis of today's
information. SAP HANA running on powerful IBM eX5 enterprise servers
combines the speed and efficiency of in-memory processing with the ability of
IBM eX5 enterprise servers to analyze massive amounts of business data.
Based on scalable IBM eX5 technology included in IBM System x3690 X5 and
System x3950 X5 servers, SAP HANA running on eX5 enterprise servers offers
a solution that can help meet the need to analyze growing amounts of
transactional data, delivering significant gains in both performance and
scalability in a single, flexible appliance.

8.3.1 Workload Optimized Solution


IBM offers several Workload Optimized Solution models for SAP HANA. These
models, which are based on the 2-socket x3690 X5 and 4-socket x3950 X5, are
optimally designed and certified by SAP. They are delivered preconfigured with
key software components preinstalled to help speed delivery and deployment of
the solution. The x3690 X5-based configurations offer 128 - 256 GB of memory
and the choice of only solid-state disk or a combination of spinning disk and
solid-state disk. The x3950 X5-based configurations use the scalability of eX5
and offer the capability to pay as you grow, starting with a 2-processor, 256 GB
configuration and growing to an 8-processor, 1 TB configuration. The x3950
X5-based configurations integrate High IOPS SSD PCIe adapters. The 8-socket
configurations use a scalability kit that combines two x3950 X5 4-socket systems
to create a single 8-socket, 1 TB system. For SAP Business Suite applications
powered by SAP HANA, the x3950 X5 systems can scale up even further, to 4 TB
and beyond, in a single system.
IBM offers the appliance in a box, with no need for external storage. With the
x3690 X5-based SSD-only models, IBM has a unique offering with no spinning
hard drives, providing greater reliability and performance.

8.3.2 Leading performance


IBM eX5 enterprise servers offer extreme memory and performance scalability.
With improved hardware economics and new technology offerings, IBM is
helping SAP realize a real-time enterprise with in-memory business applications.
IBM eX5 enterprise servers deliver a long history of leading SAP benchmark
performance.
IBM eX5 enterprise servers come equipped with the Intel Xeon processor E7
series. These processors deliver performance that is ideal for your most
data-demanding SAP HANA workloads and offer improved scalability along with
increased memory and I/O capacity, which is critical for SAP HANA. Advanced
reliability and security features work to maintain data integrity, accelerate
encrypted transactions, and maximize the availability of SAP HANA applications.
In addition, Machine Check Architecture Recovery, a reliability, availability, and
serviceability (RAS) feature built into the Intel Xeon processor E7 series, enables
the hardware platform to generate machine check exceptions. In many cases,
these notifications enable the system to take corrective action that allows
SAP HANA to keep running when an outage would otherwise occur.


IBM eX5 features, such as eXFlash solid-state disk technology, can yield
significant performance improvements in storage access, helping deliver an
optimized system solution for SAP HANA. Standard features in the solution, such
as the High IOPS adapters for IBM System x, can also provide fast access to
storage.

8.3.3 IBM GPFS enhancing performance, scalability, and reliability


Explosions of data, transactions, and digitally aware devices are straining IT
infrastructure and operations, while storage costs and user expectations are
increasing. The IBM General Parallel File System (GPFS), with its
high-performance enterprise file management, can help you move beyond simply
adding storage to optimizing data management for SAP HANA.
High-performance enterprise file management using GPFS gives SAP HANA
applications these benefits:
- Performance to satisfy the most demanding SAP HANA applications
- Seamless capacity expansion to handle the explosive growth of SAP HANA
  information
- Scalability to enable support for the largest SAP HANA database
  requirements
- High reliability and availability to help eliminate production outages and
  provide disruption-free maintenance and capacity upgrades
- Disaster recovery capabilities with cross-site file system replication
Seamless capacity and performance scaling, along with the proven reliability
features and flexible architecture of GPFS, help your company foster innovation
by simplifying your environment and streamlining data workflows for increased
efficiency.

8.3.4 Scalability
IBM offers configurations that allow clients to start with a 2 CPU/256 GB RAM
model (S+), which can scale up to a 4 CPU/512 GB RAM model (M), and then to
an 8 CPU/1024 GB RAM configuration (L). With the option to upgrade S+ to M,
and M+ to L, IBM can provide an unmatched upgrade path from a T-shirt size S
up to a T-shirt size L without the need to retire a single piece of hardware.
For SAP Business Suite applications powered by SAP HANA, the x3950 X5 systems
can scale up even further, to 4 TB and beyond, in a single system.
If you have large database requirements, you can scale the workload-optimized
solutions to multi-server configurations. IBM and SAP have validated
configurations of up to sixteen nodes with high availability, each node holding


either 256 GB, 512 GB, or 1 TB of main memory. This scale-out support enables
databases as large as 56 TB, able to hold the equivalent of about 392 TB of
uncompressed data (assuming a 7:1 compression factor). Although the IBM
solution is certified for up to 16 nodes, its architecture is designed for
extreme scalability and can grow even beyond that. The IBM solution does not
require external storage for the stand-alone or the scale-out solution. The
solution is easy to grow by simply adding nodes to the network. There is no
need to reconfigure a storage area network for failover; that is all covered
by GPFS under the hood. IBM uses the same base building blocks from
stand-alone servers to scale-out, for single-site or multi-site deployments
with disaster recovery capabilities, providing investment protection for
clients who want to grow their SAP HANA solution beyond a single server.
IBM or IBM Business Partners can provide these scale-out configurations
preassembled in a rack, helping to speed installation and setup of the
SAP HANA appliance.

8.3.5 Services to speed deployment


To help speed deployment and simplify maintenance of your x3690 X5 and
x3950 X5 Workload Optimized Solution for SAP HANA, IBM Lab Services and
IBM Global Technology Services offer quick-start services to help set up and
configure the appliance, and health-check services to ensure that it continues
to run optimally. In addition, IBM offers skills and enablement services for
administration and management of IBM eX5 enterprise servers.

8.4 Going beyond infrastructure


Many clients require more than software and hardware products. IBM as a
globally integrated enterprise can provide clients real end-to-end offerings
ranging across hardware, software, infrastructure, and consulting services, all of
them provided by a single company and integrated together.

8.4.1 A trusted service partner


Clients need a partner to help them assess their current capabilities, identify
areas for improvement, and develop a strategy for moving forward. This is where
IBM Global Business Services provides immeasurable value with thousands of
SAP consultants in 80 countries organized by industry sectors.
IBM Global Business Services worked together with IBM research teams,
IBM Software Group, and IBM hardware teams to prepare an integrated offering
focused on the business analytics space and mobility. Among others, this
offering also covers all services around the SAP HANA appliance.
Through this offering, IBM can help you take full advantage of SAP HANA
running on IBM eX5 enterprise servers.

Defining the strategy for business analytics


An important step before implementing the SAP HANA solution is the formulation
of an overall strategy for how this new technology can be leveraged to deliver
business value and how it will be implemented in the existing client landscape.
Clients typically face the following challenges, where the IBM Global
Business Services offering can help:
- Mapping of existing pain points to available offerings
- Designing a client-specific use case where no existing offering is available
- Creation of a business case for implementing SAP HANA technology
- Understanding long-term technology trends and their influence on individual
  decisions
- Underestimating the importance of high availability, disaster recovery, and
  operational aspects of the SAP in-memory solution
- Avoiding delays caused by poor integration between hardware and
  implementation partners
- Alignment of already running projects to a newly developed in-memory
  strategy
IBM experts conduct a series of workshops with all important stakeholders
including decision makers, key functional and technical leads, and architects. As
a result of these workshops, an SAP HANA implementation roadmap is defined.
The implementation roadmap is based on the existing client landscape and on
defined functional, technical, and business-related needs and requirements. It
will reflect current analytic capabilities and current status of existing systems.
An SAP HANA implementation roadmap contains individual use cases about
how SAP HANA can be best integrated into the client landscape to deliver the
wanted functionality. Other technologies that can bring additional value are
identified and the required architectural changes are documented.


For certain situations, a proof of concept might be recommended to validate
that desired key performance indicators (KPIs) can be met, including:
- Data compression rates
- Data load performance
- Data replication rates
- Backup/restore speed
- Front-end performance

After the SAP HANA implementation roadmap is accepted by the client, IBM
expert teams work with the client to implement the roadmap.

Used implementation methods


Existing use case scenarios can be divided into two groups, based on how SAP
HANA is deployed:
SAP HANA as a stand-alone component
The technology platform, operational reporting, and accelerator use case
scenarios describe the SAP HANA database as a stand-alone component.
IBM Global Business Services offers services to implement the SAP HANA
database using a combination of the IBM Lean implementation approach, the
ASAP 7.2 Business Add-on for SAP HANA methodology, and agile development
methodologies that are important for this type of project.
These methodologies keep strict control over the following solution
components:
- Use case (overall approach of how SAP HANA will be implemented)
- Sources of data (source tables containing the required information)
- Data replication (replication methods for transferring data into SAP HANA)
- Data models (transformation of source data into the required format)
- Reporting (front-end components such as reports and dashboards)

This approach has the following implementation phases:

Project preparation
Project kick-off
Blueprint
Realization
Testing
Go-live preparation
Go-live

This methodology is focused on helping both IBM and the client keep the
defined and agreed scope under control, and on supporting issue classification
and resolution management. It also gives all involved stakeholders the
required visibility into the current progress of development and testing.


SAP HANA as the underlying database for SAP Business Suite products
IBM Global Business Services uses a facility called the IBM SAP HANA
migration factory, which is designed specifically for this purpose. Local
experts working directly with the clients cooperate with remote teams that
perform the required activities, based on a defined conversion methodology
agreed with SAP. This facility has the required number of trained experts
covering all key positions needed for a smooth transition from a traditional
database to SAP HANA.
The migration service that is related to the conversion of existing SAP
NetWeaver BW or SAP Business Suite systems to run on the SAP HANA
database has the following phases:
Initial assessment
Local teams perform an initial assessment of the existing systems, their
relations and technical status. Required steps are identified, and an
implementation roadmap is developed and presented to the client.
Conversion preparation
IBM remote teams perform all required preparations for the conversion.
BW experts clean the SAP systems to remove unnecessary objects. If
required, the system is cloned and upgraded to the required level.
Migration to SAP HANA database
In this phase, IBM remote teams perform the conversion to the SAP HANA
database including all related activities. SAP HANA-specific optimizations
are enabled. After successful testing, the system is released back for client
usage.

Extended technical services by IBM for SAP HANA


With the Total Support for SAP HANA offering (which might not be available in
all geographies), IBM provides a comprehensive support model that reflects the
specific support needs of an SAP HANA appliance. The base is a dedicated,
integrated problem management for the IBM hardware and software infrastructure
stack. Proactive services, such as health checks, help clients identify system
inconsistencies early and provide recommendations for SAP HANA appliances.
All services are provided by a highly skilled, centralized team that takes
full care of all client requests, whether for solving problems or for how-to
questions. Optionally, IBM can support clients in setting up the environment
or help them with different operating tasks on site. This service is designed
to maximize uptime and maintain consistency during the lifetime of the
appliance.


The Total Support for SAP HANA offering is composed of the following services:
- Optimized Reactive Services
  - Integrated hardware and software stack support for SAP HANA appliances
  - Provided by an IBM Technical Support team skilled in Linux, GPFS, and
    hardware
  - Escalation to worldwide IBM L2 and Labs
  - Escalation to SUSE Priority Support
  - SAP HANA application support by SAP
- Single point of contact
  - Team of single points of contact (SPOCs) acting as an interface for IBM,
    SAP, and SUSE clients
  - Skilled team knowledgeable about SAP HANA, with deep technical skills in
    Linux, GPFS, and hardware
  - Subject matter experts with support delivery and project management
    experience
- Proactive Services
  - Health check, provided by an SAP HANA specialist once a year:
    - Report including an explanation of findings
    - System assessment of the complete SAP HANA solution
    - Check of error logs and status of hardware and software components
  - Yearly capacity planning
  - Hot stand-by
In addition to the Total Support for SAP HANA offering, IBM Lab Services and
IBM Global Technology Services offer health-check services and managed
service offerings to ensure that the SAP HANA appliance continues to run
optimally. IBM also offers skills and enablement services for administration
and management of IBM eX5 enterprise servers.

8.4.2 IBM and SAP team for long-term business innovation


With a unique combination of expertise, experience, and proven methodologies,
and a history of shared innovation, IBM can help strengthen and optimize
your information infrastructure to support your SAP applications.


IBM and SAP have worked together for nearly 40 years to deliver innovation to
their shared clients. Since 2006, IBM has been the market leader for
implementing SAP's original in-memory appliance, the SAP NetWeaver Business
Warehouse (BW) Accelerator. Hundreds of SAP NetWeaver BW Accelerator
deployments have been completed successfully in multiple industries. These
SAP NetWeaver BW Accelerator appliances have been deployed on many of SAP's
largest business warehouse implementations, which are based on IBM hardware
and DB2, optimized for SAP.
IBM and SAP offer solutions that move business forward and anticipate
organizational change by strengthening your business analytics information
infrastructure for greater operational efficiency and offering a way to make
smarter decisions faster.


Appendix A. Appendix

This appendix provides information about the following topics:
- A.1, GPFS license information on page 186
- A.2, File-based backup with IBM TSM for ERP on page 188

Copyright IBM Corp. 2013. All rights reserved.


A.1 GPFS license information


The models of the IBM Systems solution for SAP HANA come with GPFS
licenses, including three years of Software Subscription and Support. Software
Subscription and Support contracts, including Subscription and Support
renewals, are managed through IBM Passport Advantage or Passport
Advantage Express.
There are currently four different types of GPFS licenses:
- The GPFS on x86 Single Server for Integrated Offerings license provides file
  system capabilities for single-node integrated offerings. This kind of GPFS
  license does not cover use in multi-node environments, such as the scale-out
  solution discussed here. To use building blocks that come with GPFS on x86
  Single Server for Integrated Offerings licenses in a scale-out solution,
  GPFS on x86 Server licenses or GPFS File Placement Optimizer licenses have
  to be obtained for these building blocks.
- The GPFS Server license permits the licensed node to perform GPFS management
  functions, such as cluster configuration manager, quorum node, manager node,
  and network shared disk (NSD) server. In addition, the GPFS Server license
  permits the licensed node to share GPFS data directly through any
  application, service, protocol, or method, such as NFS (Network File
  System), CIFS (Common Internet File System), FTP (File Transfer Protocol),
  or HTTP (Hypertext Transfer Protocol).
- The GPFS File Placement Optimizer license permits the licensed node to
  perform NSD server functions for sharing GPFS data with other nodes that
  have a GPFS File Placement Optimizer or GPFS Server license. This license
  cannot be used to share data with nodes that have a GPFS Client license or
  with non-GPFS nodes.
- The GPFS Client license permits the exchange of data between nodes that
  locally mount the same file system (for example, through shared storage).
  No other export of the data is permitted. The GPFS Client license cannot be
  used for nodes to share GPFS data directly through any application, service,
  protocol, or method, such as NFS, CIFS, FTP, or HTTP. For these functions, a
  GPFS Server license would be required. Because the architecture of the IBM
  Systems solution for SAP HANA does not include a shared storage system, this
  type of license cannot be used for the IBM solution.
Table A-1 on page 187 lists the types of GPFS licenses and the processor value
units (PVUs) included for each of the models.


Table A-1 GPFS licenses included in the custom models for SAP HANA

MTM       Type of GPFS license included                       PVUs included
7147-H1x  GPFS on x86 Server                                  1400
7147-H2x  GPFS on x86 Server                                  1400
7147-H3x  GPFS on x86 Server                                  1400
7147-H7x  GPFS on x86 Server                                  1400
7147-H8x  GPFS on x86 Server                                  1400
7147-H9x  GPFS on x86 Server                                  1400
7147-HAx  GPFS on x86 Single Server for Integrated Offerings  1400
7147-HBx  GPFS on x86 Single Server for Integrated Offerings  1400
7143-H1x  GPFS on x86 Server                                  1400
7143-H2x  GPFS on x86 Server                                  4000
7143-H3x  GPFS on x86 Server                                  5600
7143-H4x  GPFS on x86 Server                                  1400
7143-H5x  GPFS on x86 Server                                  4000
7143-HAx  GPFS on x86 Single Server for Integrated Offerings  4000
7143-HBx  GPFS on x86 Single Server for Integrated Offerings  4000
7143-HCx  GPFS on x86 Single Server for Integrated Offerings  5600

Licenses for IBM GPFS on x86 Single Server for Integrated Offerings, V3
cannot be ordered independently of the hardware with which they are included.
This type of license provides file system capabilities for single-node
integrated offerings. The model 7143-HAx includes 4000 PVUs of GPFS on x86
Single Server for Integrated Offerings, V3 licenses, so that an upgrade to the
7143-HBx model does not require additional licenses. The PVU rating of the
7143-HAx model to consider when purchasing other GPFS license types is
1400 PVUs.
Clients with highly available, multi-node clustered scale-out configurations must
purchase the GPFS on x86 Server and GPFS File Placement Optimizer product,
as described in 6.4.4, Hardware and software additions required for scale-out
on page 128.


A.2 File-based backup with IBM TSM for ERP


IBM Tivoli Storage Manager for ERP is a simple, scalable data protection solution
for SAP HANA and SAP ERP. Tivoli Storage Manager (TSM) for ERP V6.4
includes a one-step command that automates file-based SAP HANA backup and
TSM data protection.
TSM clients running SAP HANA appliances can back up their instances using
their existing TSM backup environment, even if the level of the SAP HANA code
does not allow use of the Backint interface for SAP HANA, and only a file-based
backup tool integration can be used. TSM for ERP Data Protection for SAP
HANA v6.4 provides such file-based backup and restore functionality for SAP
HANA.

Setting up Data Protection for SAP HANA


Data Protection for SAP HANA comes with a setup.sh command, which is a
configuration tool to prepare the TSM for ERP configuration file, create the SAP
HANA backup user, and set all necessary environment variables for the SAP
HANA administration user. The setup.sh command guides you through the
configuration process. Data Protection for SAP HANA stores a backup user and
its password in the SAP HANA keystore called hdbuserstore to enable
unattended operation of a backup.
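To illustrate what the setup script configures, the backup user entry in the SAP HANA secure user store can also be created and checked manually with the hdbuserstore tool. The key name, host, port, and credentials shown here are placeholder examples for illustration, not values mandated by setup.sh:

```shell
# Run as the SAP HANA administration user (<sid>adm).
# Store the backup user's credentials under a named key so that
# backups can run unattended, without an interactive password prompt.
# TSMBACKUP, localhost:30015, BACKUPUSER, and MySecretPw are examples.
hdbuserstore SET TSMBACKUP localhost:30015 BACKUPUSER MySecretPw

# Verify that the key exists (passwords are never displayed).
hdbuserstore LIST TSMBACKUP
```

These commands require a running SAP HANA installation, so they are shown here as a command sketch only.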

Backing up the SAP HANA database with TSM


SAP HANA writes its backups (logs and data) to files in preconfigured
directories. The Data Protection for SAP HANA command, backup.sh, reads the
configuration files to retrieve these directories (if a non-default
configuration is used). On backup execution, the files created in these
directories are moved to the running TSM instance and are afterward deleted
from these directories (except for the SAP HANA configuration files).


Figure A-1 illustrates this backup process: three nodes (node01 to node03),
each holding an SAP HANA DB partition with data and log volumes on the shared
GPFS file system, write backup files locally; backup.sh then moves these files
to the Tivoli Storage Manager server.

Figure A-1 Backup process with Data Protection for SAP HANA, using local storage for backup files

The backup process follows these steps:


1. The backup.sh command triggers a log or data backup of the SAP HANA
database.
2. The SAP HANA database performs a synchronized backup on all nodes.
3. The SAP HANA database writes a backup file on each node.
4. The backup.sh command collects the file names of the backup files.
5. The backup files are moved to TSM (and deleted on the nodes).


Instead of having the backup files of the individual nodes written to the local
storage of the nodes, an external storage system can be used to provide space
to store the backup files. All nodes need to be able to access this storage, for
example, using NFS. Figure A-2 illustrates this scenario.

Figure A-2 Backup process with Data Protection for SAP HANA, using external storage for backup files

Running log and data backups requires the Data Protection for SAP HANA
backup.sh command to be executed as the SAP HANA administration user
(<sid>adm).
The backup.sh command provides two basic functions:
1. Complete data backup (including SAP HANA instance and landscape
   configuration files)
2. Complete log backup, removing successfully saved redo log files from disk


The functions can be selected using command-line arguments, so that the backup
script can be scheduled with a given parameter:
backup.sh --data    Performs a complete data and configuration file backup
backup.sh --logs    Performs a complete log backup, followed by a LOG RECLAIM
By using this command, a backup of the SAP HANA database into TSM can be
fully automated.
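As a sketch of such automation, the two backup.sh functions could be scheduled from the <sid>adm user's crontab. The schedule and the script location shown here are illustrative assumptions, not values prescribed by Data Protection for SAP HANA:

```shell
# Example crontab entries for the <sid>adm user (edit with: crontab -e).
# Times, frequencies, and the script path are assumptions for illustration.

# Complete data and configuration file backup, daily at 01:00
0 1 * * * /opt/tivoli/tsm/tdp_hana/backup.sh --data

# Log backup with log reclaim, every four hours
0 */4 * * * /opt/tivoli/tsm/tdp_hana/backup.sh --logs
```

With entries such as these in place, both data and log backups run without operator intervention, relying on the hdbuserstore entry for authentication.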

Restoring the SAP HANA database from TSM


The SAP HANA database requires the backup files to be restored before a
recovery process can be started using SAP HANA Studio. For SAP HANA database
revisions 30 and higher, Data Protection for SAP HANA provides a restore.sh
command that automatically moves all required files back to their file system
locations, so that the user does not have to search for these files manually.
For earlier revisions of the SAP HANA database, this has to be done manually
using the TSM BACKUP-Filemanager. The SAP HANA database expects the backup
files to be restored to the same directory to which they were written during
backup. The recovery itself can then be triggered using SAP HANA Studio.


To restore data backups, including SAP HANA configuration files and logfile
backups, the TSM BACKUP-Filemanager is used. Figure A-3 shows a sample
panel of the BACKUP-Filemanager.
BACKUP-Filemanager V6.4.0.0, Copyright IBM 2001-2012
.------------------+---------------------------------------------------------------.
| Backup ID's
| Files stored under TSM___A0H7K1C4QI
|
|------------------+---------------------------------------------------------------|
| TSM___A0H7KM0XF4 | */hana/log_backup/log_backup_2_0_1083027170688_1083043933760 |
| TSM___A0H7KLYP3Z | */hana/log_backup/log_backup_2_0_1083043933760_1083060697664 |
| TSM___A0H7KHNLU6 | */hana/log_backup/log_backup_2_0_1083060697664_1083077461376 |
| TSM___A0H7KE6V19 | */hana/log_backup/log_backup_2_0_1083077461376_1083094223936 |
| TSM___A0H7K9KR7F | */hana/log_backup/log_backup_2_0_1083094223936_1083110986880 |
| TSM___A0H7K7L73W | */hana/log_backup/log_backup_2_0_1083110986880_1083127750848 |
| TSM___A0H7K720A4 | */hana/log_backup/log_backup_2_0_1083127750848_1083144513792 |
| TSM___A0H7K4BDXV | */hana/log_backup/log_backup_2_0_1083144513792_1083161277760 |
| TSM___A0H7K472YC | */hana/log_backup/log_backup_2_0_1083161277760_1083178040064 |
| TSM___A0H7K466HK | */hana/log_backup/log_backup_2_0_1083178040064_1083194806336 |
| TSM___A0H7K1C4QI | */hana/log_backup/log_backup_2_0_1083194806336_1083211570688 |
| TSM___A0H7JX1S77 | */hana/log_backup/log_backup_2_0_1083211570688_1083228345728 |
| TSM___A0H7JSRG2B | */hana/log_backup/log_backup_2_0_1083228345728_1083245109824 |
| TSM___A0H7JOH1ZP | */hana/log_backup/log_backup_2_0_1083245109824_1083261872960 |
| TSM___A0H7JK6ONC | */hana/log_backup/log_backup_2_0_1083261872960_1083278636608 |
| TSM___A0H7JJWUI8 | */hana/log_backup/log_backup_2_0_1083278636608_1083295400384 |
| TSM___A0H7JJU5YN | */hana/log_backup/log_backup_2_0_1083295400384_1083312166016 |
| TSM___A0H7JFWAV4 | */hana/log_backup/log_backup_2_0_1083312166016_1083328934016 |
| TSM___A0H7JBG625 | */hana/log_backup/log_backup_2_0_1083328934016_1083345705856 |
| TSM___A0H7JBAASN | */hana/log_backup/log_backup_2_0_1083345705856_1083362476352 |
| TSM___A0H7J7BLDK | */hana/log_backup/log_backup_2_0_1083362476352_1083379244416 |
| TSM___A0H7J5U8S7 | */hana/log_backup/log_backup_2_0_1083379244416_1083396008064 |
| TSM___A0H7J5T92O | */hana/log_backup/log_backup_2_0_1083396008064_1083412772928 |
| TSM___A0H7J4TWPG | */hana/log_backup/log_backup_2_0_1083412772928_1083429538688 |
|
| */hana/log_backup/log_backup_2_0_1083429538688_1083446303424 |
|
| */hana/log_backup/log_backup_2_0_1083446303424_1083463079488 |
|
| */hana/log_backup/log_backup_2_0_1083463079488_1083479846528 V
|------------------+---------------------------------------------------------------|
| 24 BID's
| 190 File(s) - 190 marked
|
`------------------+---------------------------------------------------------------'
TAB change windows  ENTER mark file  F2 Restore  F3 Mark all  F4 Unmark all
F5 reFresh  F6 fileInfo  F7 redireCt  F8 Delete  F10 eXit
Figure A-3 The BACKUP-Filemanager interface

The desired data and log backups can be selected and then restored to the
wanted location. If no directory is specified for the restore, the
BACKUP-Filemanager restores the backups to the original location from which
the backup was made.


After the backup files have been restored, the recovery process has to be started
using SAP HANA Studio. More information about this process and the various
options for a recovery is contained in the SAP HANA Backup and Recovery
Guide, available online at the following site:
http://help.sap.com/hana_appliance
After the recovery process completes successfully and the backup files are no
longer needed, they must be removed from the disk manually.


Abbreviations and acronyms

ABAP    Advanced Business Application Programming
ACID    Atomicity, Consistency, Isolation, Durability
APO     Advanced Planner and Optimizer
BI      Business Intelligence
BICS    BI Consumer Services
BM      bridge module
BW      Business Warehouse
CD      compact disc
CPU     central processing unit
CRC     cyclic redundancy checking
CRM     customer relationship management
CRU     customer-replaceable unit
DB      database
DEV     development
DIMM    dual inline memory module
DR      Disaster Recovery
DSOs    DataStore Objects
DXC     Direct Extractor Connection
ECC     ERP Central Component
ECC     error checking and correcting
ERP     enterprise resource planning
ETL     Extract, Transform, and Load
FTSS    Field Technical Sales Support
GB      gigabyte
GBS     Global Business Services
GPFS    General Parallel File System
GTS     Global Technology Services
HA      high availability
HDD     hard disk drive
HPI     Hasso Plattner Institute
I/O     input/output
IBM     International Business Machines
ID      identifier
IDs     identifiers
IMM     integrated management module
IOPS    I/O operations per second
ISICC   IBM SAP International Competence Center
ITSO    International Technical Support Organization
JDBC    Java Database Connectivity
JRE     Java Runtime Environment
KPIs    key performance indicators
LM      landscape management
LUW     logical unit of work
MB      megabyte
MCA     Machine Check Architecture
MCOD    Multiple Components in One Database
MCOS    Multiple Components on One System
MDX     Multidimensional Expressions
NOS     Notes object services
NSD     Network Shared Disk
NUMA    non-uniform memory access
ODBC    Open Database Connectivity
ODBO    OLE DB for OLAP
OLAP    online analytical processing
OLTP    online transaction processing
OS      operating system
OSS     Online Service System
PAM     Product Availability Matrix
PC      personal computer
PCI     Peripheral Component Interconnect
POC     proof of concept
PSA     Persistent Staging Area
PVU     processor value unit
PVUs    processor value units
QA      quality assurance
QPI     QuickPath Interconnect
RAID    Redundant Array of Independent Disks
RAM     random access memory
RAS     reliability, availability, and serviceability
RDS     Rapid Deployment Solution
RPM     revolutions per minute
RPO     Recovery Point Objective
RTO     Recovery Time Objective
SAN     storage area network
SAPS    SAP Application Benchmark Performance Standard
SAS     serial-attached SCSI
SATA    Serial ATA
SCM     supply chain management
SCM     software configuration management
SD      Sales and Distribution
SDRAM   synchronous dynamic random access memory
SLD     System Landscape Directory
SLES    SUSE Linux Enterprise Server
SLO     System Landscape Optimization
SMI     scalable memory interconnect
SQL     Structured Query Language
SSD     solid-state drive
SSDs    solid-state drives
STG     Systems and Technology Group
SUM     Software Update Manager
TB      terabyte
TCO     total cost of ownership
TCP/IP  Transmission Control Protocol/Internet Protocol
TDMS    Test Data Migration Server
TREX    Text Retrieval and Information Extraction
TSM     Tivoli Storage Manager
UEFI    Unified Extensible Firmware Interface

Related publications
The publications listed in this section are considered particularly suitable for a
more detailed discussion of the topics covered in this book.

IBM Redbooks
The following IBM Redbooks publications provide additional information about
the topic in this document. Note that some publications referenced in this list
might be available in softcopy only:
- The Benefits of Running SAP Solutions on IBM eX5 Systems, REDP-4234
- IBM eX5 Portfolio Overview: IBM System x3850 X5, x3950 X5, x3690 X5, and
  BladeCenter HX5, REDP-4650
- Implementing the IBM General Parallel File System (GPFS) in a Cross
  Platform Environment, SG24-7844
You can search for, view, download, or order these documents and other
Redbooks, Redpapers, Web Docs, drafts, and additional materials, at the
following website:
ibm.com/redbooks

Other publications
This publication is also relevant as a further information source:
Prof. Hasso Plattner, Dr. Alexander Zeier, In-Memory Data Management,
Springer, 2011

Online resources
These websites are also relevant as further information sources:
- IBM Systems Solution for SAP HANA
  http://www.ibm.com/systems/x/solutions/sap/hana
- IBM Systems and Services for SAP HANA
  http://www.ibm-sap.com/hana
- IBM and SAP: Business Warehouse Accelerator
  http://www.ibm-sap.com/bwa
- SAP In-Memory Computing - SAP Help Portal
  http://help.sap.com/hana

Help from IBM


- IBM Support and downloads
  ibm.com/support
- IBM Global Services
  ibm.com/services


Back cover

In-memory Computing with SAP HANA on IBM eX5 Systems

IBM Systems solution for SAP HANA
SAP HANA overview and use cases
Operational aspects for SAP HANA appliances

The second edition of this IBM Redbooks publication describes in-memory
computing appliances from IBM and SAP that are based on IBM eX5 flagship
systems and SAP HANA. We cover the history and basic principles of in-memory
computing and describe the SAP HANA solution with its use cases and the
corresponding IBM eX5 hardware offerings.

We also describe the architecture and components of the IBM Systems solution
for SAP HANA, with IBM General Parallel File System (GPFS) as a cornerstone.
The SAP HANA operational disciplines are explained in detail: scalability
options, backup and restore, high availability and disaster recovery, as well
as virtualization possibilities for SAP HANA appliances.

This book is intended for SAP administrators and technical solution
architects. It is also for IBM Business Partners and IBM employees who want to
know more about the SAP HANA offering and other available IBM solutions for
SAP clients.

IBM Redbooks are developed by the IBM International Technical Support
Organization. Experts from IBM, Customers and Partners from around the world
create timely technical information based on realistic scenarios. Specific
recommendations are provided to help you implement IT solutions more
effectively in your environment.

For more information: ibm.com/redbooks

SG24-8086-01
ISBN 0738438626