IBM i 7.2 and IBM POWER8


The most powerful, most flexible, and most scalable generation of
IBM Power Systems servers ever
Chris Francois (cfrancoi@us.ibm.com)
IBM i LIC developer
IBM

11 June 2014

IBM i 7.2 and IBM POWER8 have finally arrived. IBM POWER8 is the latest and the
most advanced processor at the heart of IBM Power Systems servers, and version 7.2 is
the latest release of the IBM i operating system that is optimized for POWER8. This article
describes some of the capabilities and features in IBM i 7.2 designed or optimized specifically
for POWER8, presented from the perspective of a veteran IBM i Licensed Internal Code (LIC)
developer.

Introduction
After 4 years in the making, IBM i 7.2 and POWER8 have been released at last. The official
announcement at ibm.com included descriptions of the advanced capabilities and benefits offered
by this latest generation of Power Systems with POWER8 technology. So, how will this article be
different and why would you want to read it? Well, maybe for the same reasons that I decided to
write it. Let me explain.
I've been with IBM for almost 20 years, most of it behind the scenes as a developer of IBM i LIC,
the foundational code of IBM i that implements the technology-independent machine interface
(TIMI) for the OS and applications. Yet, this is my first foray into writing for IBM developerWorks.
"So why now?" I'm glad you asked. IBM i 7.2 and POWER8 are the result of the inspirations and
innovations of thousands of talented and dedicated IBMers around the globe. With the general
availability of IBM i 7.2, the excitement and enthusiasm around the IBM development lab are
palpable. Like many of my colleagues, I derive satisfaction in knowing that these new Power
Systems servers with POWER8 technology and IBM i 7.2 can quietly serve millions of people
throughout the world, once again raising the bar for performance, reliability, and value. Sometimes,
silence is golden, but right now I would like to tell you about some capabilities and features that
have been part of my world for the last 4 years of development. I invite you to join me: Welcome to
the world of IBM i 7.2 and POWER8!
Copyright IBM Corporation 2014
developerWorks

ibm.com/developerWorks/

Operating system requirements


Both IBM i 7.2 and IBM i 7.1 Technology Refresh 8 are supported on the new Power Systems
servers with POWER8 technology. The operating system and most applications for IBM i are
built on a TIMI that isolates programs from differences in processor architectures, and allows
the system to automatically capitalize on many new IBM Power Architecture features without
changes to existing applications. The new IBM i 7.2 release continues the tradition, providing a
high degree of integration, security, and ease-of-use across multiple generations of IBM Power
Systems servers and processors, including the new POWER8 processor.

Multi-core and multi-thread support


Similar to previous generations of Power Systems servers, POWER8-processor based systems
are designed to scale-up to support workload growth requirements, and also to serve as workload
consolidation platforms. This is made possible through logical partitioning and virtualization with
hardware, hypervisor, and operating systems that are optimized for these dual roles. As this article
unfolds, terms such as virtual processor and processor compatibility mode will be used. So, if you
are unfamiliar with the IBM PowerVM processor virtualization concepts, you may want to take a
detour and review Processor Virtualization 101.
One dimension of system scalability is transaction processing capacity. In recent years, gains
in transaction processing capacity have, in large measure, come from growth in multithreading
and multiprocessing as opposed to single-thread performance. POWER8 breaks new ground
by providing significant increases in single-thread, core, and system performance. Servers that
are based on POWER8 processors offer up to 50% more commercial processing workload
(CPW) rating per core than similarly configured IBM POWER7 processor-based servers. (The
comparison is between an IBM Power 740 server with 16 POWER7 cores running at 4.2 GHz and an
IBM Power System S824 with 16 POWER8 cores running at 4.15 GHz; refer to IBM Power Systems,
Performance Capabilities Reference, IBM i Operating System 7.2, April 2014.) The CPW rating
provides a measure of online transaction processing (OLTP) workload performance for systems
that run IBM i.
Operating system limits
The default supported processor threading contexts and logical partition maximum processor limits
by processor compatibility mode and IBM i release are shown in Table 1. The published limits are
the defaults; depending on the IBM i release and configured processor compatibility mode, support
for additional processors may be available by contacting IBM Lab Services.
Table 1. IBM i maximum processor limits
Processor            Supported               Maximum processors
compatibility mode   threading contexts      IBM i 7.1 TR8   IBM i 7.2
POWER6               ST, SMT2                32              32
POWER6+              ST, SMT2                32              32
POWER7               ST, SMT2, SMT4          32              32
POWER8               ST, SMT2, SMT4, SMT8    32              48
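The limits in Table 1 can be combined with the widest threading context of each mode to estimate an upper bound on hardware threads per partition. The following is only a sketch: it assumes that "maximum processors" counts cores and that every core runs in the widest supported context, and the names `TABLE_1` and `max_hardware_threads` are made up for illustration.

```python
# Table 1 transcribed: compatibility mode -> (threads per processor in the
# widest supported context, {IBM i release: default maximum processors}).
TABLE_1 = {
    "POWER6": (2, {"7.1 TR8": 32, "7.2": 32}),
    "POWER6+": (2, {"7.1 TR8": 32, "7.2": 32}),
    "POWER7": (4, {"7.1 TR8": 32, "7.2": 32}),
    "POWER8": (8, {"7.1 TR8": 32, "7.2": 48}),
}


def max_hardware_threads(mode, release):
    """Upper bound on hardware threads, assuming processors are cores and
    all of them run in the widest context the mode supports."""
    threads_per_processor, limits = TABLE_1[mode]
    return limits[release] * threads_per_processor


if __name__ == "__main__":
    for mode in TABLE_1:
        for release in ("7.1 TR8", "7.2"):
            print(mode, release, max_hardware_threads(mode, release))
```

Under these assumptions, the default limit for IBM i 7.2 in POWER8 mode works out to 48 x 8 hardware threads.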

Scalable simultaneous multithreading (SMT) with intelligent threads


POWER8, similar to POWER7 before it, uses Intelligent Threads technology to maximize workload
performance regardless of the processor's threading context. For POWER8, the technology has
been enhanced to adapt more quickly and with greater efficiency to changes in the workload. If the
POWER8 processor is under-committed, meaning fewer hardware threads are dispatched than
are available, the core performance is roughly the same, independent of threading context. So,
for example, if one thread is dispatched, performance will be similar in single thread (ST), SMT2,
SMT4, and SMT8 contexts; if two threads are dispatched, performance will be similar in SMT2,
SMT4, and SMT8 contexts; if three or four threads are dispatched, performance will be similar in
SMT4 and SMT8 contexts. This is illustrated in Figure 1.
Figure 1. POWER8 SMT scaling
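The under-committed rule described above can be restated as a small lookup: for a given number of dispatched threads, every context with at least that many hardware threads per core is expected to perform about the same. This is a sketch of that rule only, not a performance model; the function name `equivalent_contexts` is hypothetical.

```python
# Hardware threads per core for each POWER8 threading context.
CONTEXT_THREADS = {"ST": 1, "SMT2": 2, "SMT4": 4, "SMT8": 8}


def equivalent_contexts(dispatched_threads):
    """Return the contexts that can hold all dispatched threads on one core.

    Per the article, core performance is roughly the same across these
    contexts while the core is under-committed.
    """
    if not 1 <= dispatched_threads <= 8:
        raise ValueError("a POWER8 core dispatches between 1 and 8 threads")
    return [name for name, threads in CONTEXT_THREADS.items()
            if threads >= dispatched_threads]


if __name__ == "__main__":
    for n in range(1, 9):
        print(n, equivalent_contexts(n))
```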

From a usability standpoint, intelligent threads means that manual system-level processor
threading context adjustments typically aren't necessary in order to maximize workload
performance. Highly multithreaded workloads can benefit from the additional throughput offered by
SMT8 technology, but moderately threaded, and even single-threaded workloads are still able to
achieve maximum performance automatically. It just works!
Flexible SMT controls
The processor threading context determines the number of usable threads per processor, and
impacts the processor utilization and accounting information reported by IBM i work management
and performance management tools. While the default processor threading context is suitable for
most commercial environments, IBM i offers manual controls that allow the system to be fine-tuned
to the specific characteristics of the workload. The processor multitasking mode and processor
maximum SMT level can be used to establish any processor threading context supported for the
initial program load (IPL). In general, as the processor threading context is reduced, single-thread
performance and determinism increase, but at the expense of the greater aggregate throughput
potentially attainable in a higher threading context. Flexible SMT technology allows the system to
be tailored to the specific needs of the business.
For POWER8, IBM i supports flexible SMT with fully dynamic system-level processor threading
controls. IBM i 7.2 and 7.1 TR8 offer on-the-fly switching among single-thread and simultaneous
multithreading contexts supported by the POWER8 processor. The processor threading contexts
available for the partition IPL are determined by the processor compatibility mode (PCM) partition
attribute as shown in Table 2. Note that the PCM is established during partition activation.
Table 2. IBM i supported and default processor threading contexts
Processor            Supported               Default thread context
compatibility mode   threading contexts      IBM i 7.1 TR8   IBM i 7.2
POWER6               ST, SMT2                SMT2            SMT2
POWER6+              ST, SMT2                SMT2            SMT2
POWER7               ST, SMT2, SMT4          SMT4            SMT4
POWER8               ST, SMT2, SMT4, SMT8    SMT4            SMT8

The default thread context is selected by the operating system, but it can be easily changed by
the system administrator. Given the intuitive behavior and high performance delivered by intelligent threads,
IBM i has historically used the maximum supported thread context as the default for a release
optimized for a new generation of system. That said, IBM i 7.1 TR8 continues to use SMT4 for the
default thread context for POWER7 and POWER8 processor compatibility modes. For IBM i 7.2,
the default thread context in POWER8 processor compatibility mode is SMT8.
The choice of SMT4 for POWER8 in IBM i 7.1 TR8 was made for the benefit of clients migrating
from an earlier generation Power Systems server. Many users upgrading to POWER8 will be
moving their workloads from a POWER7 server using SMT4, while continuing to use IBM i 7.1
for a period of time. The choice of SMT4 as the default thread context for POWER8 in IBM i 7.1
TR8 offers most of the benefits and performance advantages of POWER8, with the familiarity and
continuity of SMT4.
Processor multitasking mode
Switching to and from the single-thread context can be accomplished using the IBM i processor
multitasking mode system value, QPRCMLTTSK. For POWER8, QPRCMLTTSK changes are
effective immediately and persist across partition IPL.
Supported QPRCMLTTSK values are as follows:
0 - Processor multitasking is disabled. This value corresponds to the single-thread context.
1 - Processor multitasking is enabled. This value corresponds to SMT2 context if the
partition's processor compatibility mode is IBM POWER6 or IBM POWER6+. Otherwise,
the thread context is determined by the maximum SMT level control.
2 - Processor multitasking is system controlled. This is the default value, and the setting
recommended by IBM. For POWER8, the implementation is identical to '1', with processor
multitasking enabled.
Examples:
DSPSYSVAL SYSVAL(QPRCMLTTSK)
CHGSYSVAL SYSVAL(QPRCMLTTSK) VALUE('0') /* ST context */
CHGSYSVAL SYSVAL(QPRCMLTTSK) VALUE('1') /* SMTn context */
CHGSYSVAL SYSVAL(QPRCMLTTSK) VALUE('2') /* SMTn context */
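To make the "SMTn" in the comments above concrete, the sketch below maps a QPRCMLTTSK value to the resulting thread context. It assumes a POWER8 system on which no maximum SMT level has been configured, so the operating system default from Table 2 applies, and it treats '2' like '1' as the article states for POWER8. The names `TABLE_2_DEFAULTS` and `context_for_qprcmlttsk` are invented for illustration and are not part of any IBM i interface.

```python
# Default thread contexts transcribed from Table 2,
# keyed by (processor compatibility mode, IBM i release).
TABLE_2_DEFAULTS = {
    ("POWER6", "7.1 TR8"): "SMT2", ("POWER6", "7.2"): "SMT2",
    ("POWER6+", "7.1 TR8"): "SMT2", ("POWER6+", "7.2"): "SMT2",
    ("POWER7", "7.1 TR8"): "SMT4", ("POWER7", "7.2"): "SMT4",
    ("POWER8", "7.1 TR8"): "SMT4", ("POWER8", "7.2"): "SMT8",
}


def context_for_qprcmlttsk(value, mode, release):
    """Map a QPRCMLTTSK value to a thread context on a POWER8 system.

    Assumption: no maximum SMT level is set, so '1' and '2' both yield
    the operating system default for the mode and release.
    """
    if value == "0":
        return "ST"  # processor multitasking disabled
    if value in ("1", "2"):
        return TABLE_2_DEFAULTS[(mode, release)]
    raise ValueError("QPRCMLTTSK must be '0', '1', or '2'")


if __name__ == "__main__":
    print(context_for_qprcmlttsk("2", "POWER8", "7.2"))
    print(context_for_qprcmlttsk("1", "POWER8", "7.1 TR8"))
```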

Processor maximum SMT level


When processor multitasking is enabled, switching among the thread contexts available for the
partition IPL can be accomplished using the change processor multitasking information API,
QWCCHGPR. The QWCCHGPR API changes are effective immediately and persist across a
partition IPL.
The QWCCHGPR API takes a single parameter, the maximum number of secondary threads per
processor:
0 No maximum is selected. The system uses the default number of secondary threads as
determined by the operating system.
1-255 The system might use up to the number of secondary threads specified.
The QWCCHGPR API can be called from a command line.
Note that setting the maximum number of secondary threads does not establish the processor
threading context directly. The maximum value will be accepted regardless of the processor
threading contexts supported by the underlying hardware, and the operating system will apply the
configured maximum to the system. On a POWER8 processor-based system, if a maximum value
is specified by the QWCCHGPR API, the operating system tries to establish the maximum thread
context supported (as shown in Table 2), subject to the maximum specified by the QWCCHGPR
API. In other words, if the QWCCHGPR API sets the maximum number of secondary threads to
a value that is not supported by the hardware, the operating system sets the thread context to the
maximum supported by the hardware that meets the specified value.
Examples:
CALL PGM(QWCCHGPR) PARM(X'00000000') /* No maximum */
CALL PGM(QWCCHGPR) PARM(X'00000001') /* SMT2 context */
CALL PGM(QWCCHGPR) PARM(X'00000002') /* SMT2 context */
CALL PGM(QWCCHGPR) PARM(X'00000003') /* SMT4 context */
CALL PGM(QWCCHGPR) PARM(X'00000004') /* SMT4 context */
CALL PGM(QWCCHGPR) PARM(X'00000007') /* SMT8 context */
CALL PGM(QWCCHGPR) PARM(X'000000FF') /* SMT8 context */
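The mapping in the examples above follows one rule: a maximum of N secondary threads allows up to N + 1 threads per processor, and the system picks the widest supported context that fits. The sketch below reproduces that rule for illustration; it is not the operating system's implementation, and `resolve_context` and its parameters are hypothetical names. The default of SMT8 for the "no maximum" case assumes IBM i 7.2 in POWER8 mode.

```python
# Threads per processor for each context supported in a compatibility mode.
SUPPORTED_THREADS = {
    "POWER6": [1, 2],
    "POWER6+": [1, 2],
    "POWER7": [1, 2, 4],
    "POWER8": [1, 2, 4, 8],
}


def context_name(threads):
    """Name a context by its thread count (1 -> ST, n -> SMTn)."""
    return "ST" if threads == 1 else "SMT%d" % threads


def resolve_context(max_secondary, mode="POWER8", default_threads=8):
    """Pick the widest supported context within the configured maximum.

    max_secondary is the QWCCHGPR parameter: 0 means no maximum (use the
    operating system default), 1-255 caps the secondary threads.
    """
    if not 0 <= max_secondary <= 255:
        raise ValueError("maximum secondary threads must be 0-255")
    if max_secondary == 0:
        return context_name(default_threads)
    limit = max_secondary + 1  # one primary thread plus the secondaries
    return context_name(max(t for t in SUPPORTED_THREADS[mode] if t <= limit))


if __name__ == "__main__":
    for value in (0x00, 0x01, 0x02, 0x03, 0x04, 0x07, 0xFF):
        print("X'%08X' -> %s" % (value, resolve_context(value)))
```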

The maximum number of secondary threads can be obtained from the Retrieve Processor
Multitasking Information (QWCRTVPR) API. Note that the value returned is the maximum number
of secondary threads configured.

Additional POWER8 highlights


IBM i uses the TIMI to isolate programs from differences in processor architectures, and allows the
system to automatically capitalize on many new Power Architecture features without changes to
existing applications. In some cases, the IBM i operating system enables new features based on
the processor compatibility mode of the partition. We'll take a look at several POWER8 examples
in the following sections.
Live Partition Mobility
Live Partition Mobility (LPM) is a PowerVM feature that provides the ability to migrate an active or
inactive IBM i partition between Power Systems servers. IBM i support for LPM was introduced for
POWER7 processor-based servers in IBM i 7.1 TR4.
LPM is supported in IBM i 7.2 and 7.1 TR8 between POWER8 processor-based servers, and
also between POWER8 and POWER7 processor-based servers, but with a caveat. For migration
between POWER7 and POWER8 processor-based servers, the partition must be configured
for a processor compatibility mode that is supported by both servers, that is, POWER7,
POWER6, or POWER6+ mode. Note that while PowerVM supports LPM for POWER7 and
POWER8 processor-based servers in POWER6 and POWER6+ processor compatibility modes,
IBM i does not support LPM for POWER6 processor-based servers.
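The compatibility-mode caveat amounts to a set intersection: a partition can move only if its mode is one that both servers support. The sketch below illustrates just that check for POWER7 and POWER8 processor-based servers; the names `SERVER_MODES`, `common_modes`, and `can_migrate` are hypothetical, and real LPM validation covers many other requirements not modeled here.

```python
# Processor compatibility modes offered by each server generation,
# as described in the article (POWER6 servers are not supported for LPM).
SERVER_MODES = {
    "POWER7": ["POWER6", "POWER6+", "POWER7"],
    "POWER8": ["POWER6", "POWER6+", "POWER7", "POWER8"],
}


def common_modes(source, target):
    """Modes supported by both the source and the target server."""
    return [m for m in SERVER_MODES[source] if m in SERVER_MODES[target]]


def can_migrate(partition_mode, source, target):
    """True if the partition's mode is usable on both servers."""
    return partition_mode in common_modes(source, target)


if __name__ == "__main__":
    print(common_modes("POWER8", "POWER7"))
    print(can_migrate("POWER8", "POWER8", "POWER7"))
```

A partition running in POWER8 mode would therefore need to be reactivated in an older mode before it could move to a POWER7 processor-based server, because the mode is established during partition activation.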
Virtual time base and instruction count
POWER8 provides new hardware facilities for thread-level instruction count and virtual processor
timekeeping. Because these facilities are not available on POWER7, and the partition could find
itself running on a POWER7 processor-based server using Live Partition Mobility, the operating
system provides new Instruction Count (IC) and Virtual Time Base (VTB) data only for partitions
running in the POWER8 processor compatibility mode.
The IC and VTB facilities are relatively straightforward. The IC is the count of POWER instructions
run by a hardware thread, and the VTB is the elapsed time of virtual processor (core) dispatch
for a hardware thread. Both are privileged, that is, they are not directly accessible to application
programs, but accumulated forms of them are provided by the operating system. The accumulation
of IC and VTB for a software thread follows directly from the POWER8 thread IC and VTB
registers. The accumulation of processor non-idle IC and VTB is somewhat less direct, occurring
if and only if any of the processor's threads is not idle, that is, running a program. IBM i 7.2
accumulates IC and VTB for each software thread and process, and non-idle IC and non-idle VTB
for each processor, and partition-wide. Processor IC accumulation is also performed according to
some other categories, such as interrupt IC.
For the programmer, the IC and VTB accumulations are available from a variety of IBM i 7.2
machine interface instructions including:
MATRMD Hex 26, 28 Materialize resource management data
MATPRATR Hex 21, 23 Materialize process attributes
MATMATR Hex 220 Processor attributes
IBM i 7.2 performance management tools have been updated to incorporate IC and VTB
accumulations. For example, Collection Services includes the process/thread IC and VTB data
in the QAPMJOBMI file, and the sum-of-processor IC and VTB data in the QAPMSYSTEM file,
making longer-term historical analysis possible without adding significantly to data collection costs.
For more detailed analysis, Performance Explorer (PEX) includes thread IC and VTB in the base
event data, which can be traced down to a very short timescale.
On POWER8, IC and VTB accumulations provide valuable diagnostic insights into the individual
process/thread and overall system performance. They can be used alone or in combinations. For
example, the non-idle processor VTB and IC were designed to provide a 24x7 proxy of processor
cycle and instruction metrics that are frequently used for monitoring overall system health. On
earlier generations of IBM Power servers, this data was available only when a PEX data collection
was active.
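One way to combine the two accumulations is to take the difference between two samples and divide instructions by dispatch time, giving a rough work-rate figure. The sketch below assumes the samples have already been read from some source; the `Sample` fields and the unit "ticks" are placeholders, not the actual layout or units of MATRMD output or of the QAPMSYSTEM file.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    """One reading of accumulated non-idle counters (hypothetical fields)."""
    instruction_count: int   # accumulated IC
    virtual_time_base: int   # accumulated VTB, in assumed "ticks"


def instructions_per_tick(earlier, later):
    """Instructions completed per unit of dispatch time between two samples."""
    delta_ic = later.instruction_count - earlier.instruction_count
    delta_vtb = later.virtual_time_base - earlier.virtual_time_base
    if delta_ic < 0 or delta_vtb < 0:
        raise ValueError("accumulated counters must not decrease")
    if delta_vtb == 0:
        return 0.0  # nothing was dispatched during the interval
    return delta_ic / delta_vtb


if __name__ == "__main__":
    first = Sample(instruction_count=1_000, virtual_time_base=500)
    second = Sample(instruction_count=4_000, virtual_time_base=1_500)
    print(instructions_per_tick(first, second))
```

Tracking this ratio over time is one possible use of the always-on counters; a sustained drop could prompt a closer look with PEX.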
Vector Scalar eXtension and crypto acceleration
POWER8 features enhanced Vector Scalar eXtension (VSX) capabilities, including new
instructions to accelerate some frequently used cryptographic operations. VSX in Power Systems
provides support for vector and scalar binary floating point operations conforming to the Institute
of Electrical and Electronics Engineers Standard for Floating Point Arithmetic (IEEE-754). VSX
can be used to increase parallelism by providing single-instruction, multiple-data (SIMD) execution
functionality for floating point double-precision operations, greatly improving the performance
of some applications. IBM i Portable Application Solutions Environment (PASE) applications
running on IBM i 7.2 with POWER8 processors can now take advantage of VSX. For more
information about VSX usage by IBM i PASE, refer to the IBM Redbooks Tuning Techniques for
IBM Processors, including IBM POWER8.
IBM i 7.2 leverages the enhanced POWER8 vector processing capabilities to accelerate AES
cryptographic operations when operating in the POWER8 processor compatibility mode.
Cryptographic services APIs, SSL, VPN, Backup Recovery and Media Services (BRMS) tape
encryption, and SQL encryption functions automatically use POWER8 enhanced vector processing
capabilities to deliver significant increases in performance. Figure 2 exemplifies the gains resulting
from POWER8 cryptographic acceleration. The charts reveal relative encrypt and decrypt
throughput for Cipher Block Chaining (CBC) and Electronic Code Book (ECB) modes using an
internal version of IBM CryptoLite for C/C++ (CLiC) toolkit primitives. In each chart, the series
labeled "Vector" use the POWER8 vector-accelerated implementation, whereas the others do not.
As shown, performance gains are dependent on the modes and block sizes, and results do vary,
but POWER8 vector acceleration can deliver breakthrough levels of cryptographic performance for
some applications.

Figure 2. POWER8 AES crypto acceleration

Conclusion
In this article, we've taken a closer look at some of the capabilities and features that have been
part of my world for the last 4 years.
POWER8 servers are up to 50% faster than comparable POWER7 models for commercial
workloads.
Cryptographic functions on IBM i 7.2 on POWER8 are performed up to 15 times faster than
ever before.
IBM i 7.2 on POWER8 offers enhanced 24x7 workrate metrics for system health monitoring
and performance analysis.
IBM i 7.2 is highly scalable and configurable, with a flexible range of SMT options, but is
designed to deliver superior system and single-thread performance without the need for
customized tuning.
Welcome to the world of POWER8 and IBM i 7.2, the most powerful, most flexible, and most
scalable generation of IBM Power Systems servers ever.

References

IBM Power Systems Announcement
Performance Capabilities Reference
Intelligent Threads
IBM Knowledge Center (IBM i)
Under the Hood: POWER7 Logical Partitions
Live Partition Mobility
IBM Knowledge Center: QWCCHGPR API
IBM Knowledge Center: QPRCMLTTSK system value
IBM Knowledge Center: Performance data collectors
Performance Optimization and Tuning Techniques for IBM Processors, including IBM POWER8

About the author


Chris Francois
Chris Francois has been a commercial OS kernel programmer for nearly 25
years, and is currently a lead developer of synchronization primitives and process
scheduling components of IBM i LIC. Chris joined IBM Rochester in 1994 during the
migration of IBM AS/400 to the processor family ultimately to be known as IBM
POWER.
Copyright IBM Corporation 2014
(www.ibm.com/legal/copytrade.shtml)
Trademarks
(www.ibm.com/developerworks/ibm/trademarks/)
