Chandan Chopra
Power Systems Solution Architect, IBM Systems Lab Services
chandan.chopra@in.ibm.com
Agenda
Power E880
24x7 Warranty
E850
E870
E880
16 - 48 Cores
3.72 GHz (12c)
3.35 GHz (10c)
3.02 GHz (8c)
128 GB - 2 TB Memory*
7 - 51 PCI Adapters
* Statement of direction to 4 TB. Statements of direction represent plans only and are subject to change without notice.
PCIe Gen3
Gen1, Gen2, and Gen3 cards physically look the same and fit in the same slots, but:
Gen3 cards/slots have up to 2X more bandwidth than Gen2 cards/slots
Gen3 cards/slots have up to 4X more bandwidth than Gen1 cards/slots
More virtualization
More consolidation
More ports per adapter
[Chart: peak and sustained bandwidth for Gen1, Gen2, and Gen3]
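The 2X and 4X ratios follow from the per-lane signalling rates and line encodings of each PCIe generation. A minimal sketch, assuming the standard rates (2.5, 5, and 8 GT/s) and encodings (8b/10b for Gen1 and Gen2, 128b/130b for Gen3), none of which are stated on the slide; the function name is illustrative only:

```python
# Theoretical one-direction PCIe bandwidth per generation.
# Rates and encodings are the standard PCIe values (an assumption here,
# since the slide only gives the "up to 2X / 4X" ratios).
PCIE_GENERATIONS = {
    "Gen1": (2.5, 8 / 10),     # GT/s per lane, 8b/10b encoding efficiency
    "Gen2": (5.0, 8 / 10),     # GT/s per lane, 8b/10b encoding efficiency
    "Gen3": (8.0, 128 / 130),  # GT/s per lane, 128b/130b encoding efficiency
}


def peak_bandwidth_gbytes(generation: str, lanes: int = 16) -> float:
    """Return the theoretical peak bandwidth in GB/s for one direction."""
    gigatransfers, efficiency = PCIE_GENERATIONS[generation]
    return gigatransfers * efficiency * lanes / 8  # 8 bits per byte


if __name__ == "__main__":
    for gen in PCIE_GENERATIONS:
        print(f"{gen} x16: {peak_bandwidth_gbytes(gen):.2f} GB/s")
```

These are theoretical peaks; sustained throughput is lower and depends on the adapter and workload.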
Feat #EMX0
Front view
Rear view
Fan-out Module
6 PCIe Gen3 Slots
Attaches to 1 system node PCIe slot
Fan-out Module
6 PCIe Gen3 Slots
Attaches to 1 system node PCIe slot
4U drawer
[Diagram: VM 2, VM 3, VM 4, ... with up to 64* Virtual Functions]
E850
E870
E880
I/O Drawer
Software
AIX
AIX 6.1
AIX 7.1
AIX 7.1
AIX 6.1
IBM i
Red Hat
SUSE
Ubuntu
PowerVM
Front
Rear
* Applies to orders for AIX, Linux, and VIOS, IBM i is ordered as 1 set
Memory
How much do I need? Should I fill the memory card slots?
Memory access (Local, Near, and Far NUMA)
I/O
Processor Designs
                      | POWER6         | POWER7         | POWER7+        | POWER8
Technology            | 65nm           | 45nm           | 32nm           | 22nm
Size                  | 341 mm2        | 567 mm2        | 567 mm2        | 675 mm2
Transistors           | 790 M          | 1.2 B          | ~2.4 B         | ~5 B
Cores                 |                |                |                | 12
Frequencies           | 4+ GHz         | 3 - 4+ GHz     | 3 - 4+ GHz     | 3 - 4.35 GHz
L2 Cache              | 4MB / Core     | 256 KB / Core  | 256 KB / Core  | 512 KB / Core
L3 Cache              | 32MB           | 32MB           | 80MB           | 96MB
L4 Cache              |                |                |                | 128MB
Memory (DRAM Channel) | 8 DDR2         | 16 DDR2        | 16 DDR2        | 32 DDR3/4
I/O                   | Proprietary GX | Proprietary GX+ | Proprietary GX+ | Integrated PCIe
Architecture          | In Order       | Out of Order   | Out of Order   | Out of Order
Threads
8
16
Simultaneous Multithreading
Simultaneous Multithreading
[Chart: P7 SMT1 compared with P8 SMT1, SMT2, SMT4, and SMT8]
[Chart: POWER6 550 (5.0GHz) vs. POWER7 750 (3.3GHz), 8-core]
Socket-to-Socket Performance
1-chip POWER6 vs. POWER7
[Chart: POWER6 570 (5.0GHz) vs. POWER7 780 (3.86GHz), 1-socket]
rPerf and CPW
System | Cores    | Frequency | rPerf   | CPW
E870   | 32-core  | 4.02 GHz  | 674.5   | 359,000
E870   | 64-core  | 4.02 GHz  | 1,349.0 | 711,000
E870   | 40-core  | 4.19 GHz  | 856.0   | 460,000
E870   | 80-core  | 4.19 GHz  | 1,711.9 | 911,000
E880   | 32-core  | 4.35 GHz  | 716.0   | 381,000
E880   | 64-core  | 4.35 GHz  | 1,432.5 | 755,000
E880   | 128-core | 4.35 GHz  | 2,865   | 1,523,000
E880   | 48-core  | 4.02 GHz  | 976.4   | 518,000
E880   | 96-core  | 4.02 GHz  | 1,952.9 | 1,034,000
E880   | 192-core | 4.02 GHz  | 3,905.8 | 2,069,000
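As a rough cross-check on these ratings, per-core figures can be derived by dividing each rating by its core count. A sketch using the E870 and E880 values from this slide, reading the decimal values as rPerf and the whole-number values as CPW; the data layout and function name are illustrative only:

```python
# (system, cores, GHz, rPerf, CPW) as listed on the slide.
RATINGS = [
    ("E870", 32, 4.02, 674.5, 359_000),
    ("E870", 64, 4.02, 1349.0, 711_000),
    ("E870", 40, 4.19, 856.0, 460_000),
    ("E870", 80, 4.19, 1711.9, 911_000),
    ("E880", 32, 4.35, 716.0, 381_000),
    ("E880", 64, 4.35, 1432.5, 755_000),
    ("E880", 128, 4.35, 2865.0, 1_523_000),
    ("E880", 48, 4.02, 976.4, 518_000),
    ("E880", 96, 4.02, 1952.9, 1_034_000),
    ("E880", 192, 4.02, 3905.8, 2_069_000),
]


def per_core_ratings(ratings=RATINGS):
    """Return (system, cores, rPerf per core, CPW per core) tuples."""
    return [
        (system, cores, rperf / cores, cpw / cores)
        for system, cores, _ghz, rperf, cpw in ratings
    ]


if __name__ == "__main__":
    for system, cores, rperf_core, cpw_core in per_core_ratings():
        print(f"{system} {cores:>3}-core: "
              f"{rperf_core:6.2f} rPerf/core, {cpw_core:9.0f} CPW/core")
```

The 40-core E870 works out to 11,500 CPW per core, and the per-core CPW drops slightly as the core count grows at the same frequency.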
70,000 CPW divided by 7,500 per core = 9.33 cores
POWER8 (SMT8)
70,000 CPW divided by 11,500 per core = 6.08 cores
POWER8 (SMT4)
70,000 CPW divided by 9,200 per core = 7.6 cores
The POWER8 system might very well provide the CPW capacity. However, remember
response time vs. throughput: you might get the transactions, but at increased response times
and longer batch runtimes.
Use WLE (IBM Systems Workload Estimator) to size.
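The core-count arithmetic on this slide can be sketched as follows. The per-core CPW figures are the ones quoted on the slide; the dictionary labels and function name are illustrative, and the result covers throughput only, not response time:

```python
import math

# Per-core CPW figures quoted on the slide; treat them as rough planning
# numbers, not measured values for any specific configuration.
CPW_PER_CORE = {
    "first example (unlabeled on the slide)": 7_500,
    "POWER8 (SMT8)": 11_500,
    "POWER8 (SMT4)": 9_200,
}


def cores_needed(required_cpw: float, cpw_per_core: float) -> float:
    """Return the fractional number of cores that deliver required_cpw."""
    if cpw_per_core <= 0:
        raise ValueError("cpw_per_core must be positive")
    return required_cpw / cpw_per_core


if __name__ == "__main__":
    for label, per_core in CPW_PER_CORE.items():
        exact = cores_needed(70_000, per_core)
        # Round up to whole cores for a conservative estimate.
        print(f"{label}: {exact:.2f} cores, plan for {math.ceil(exact)}")
```

A calculation like this is only a starting point; it says nothing about per-thread speed, which is why the slide points to WLE for actual sizing.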
Best Practice #1
If speed (response time and batch run time) is the priority for the workload
then consider using higher frequency POWER8 Processors.
Consider appropriate rPerf and CPW for selecting a POWER8 system.
or better CPW and/or rPerf rating, but not when per thread performance (speed)
is critical
Start with about 3/4 of the POWER7 core count if speed is the requirement.
Consider using SMT4 (POWER7 mode) when speed is a major concern on
POWER8 systems.
Consider dedicated or dedicated donate for partitions that are business critical
Understand the number of cores worth of capacity and performance you need in
Multi-core: smaller die size, more transistors, more processor cores per chip, more
threads per core, more functions on chip
Use of SMP (Symmetric Multi-Processing) to scale across more cores
Multi-core, multi-node Power Systems: 870, 880, 770, 780, 795
NUMA (Non-Uniform Memory Access), a concept that is used to further drive up the
performance capacity of a system.
Should I use one system node? Would it be better to use two nodes?
Affinity
Affinity is a measurement of the proximity a thread has to a physical resource,
node
Cache Affinity: threads in different domains need to communicate with each other,
Think about your biggest partition's cores and memory: could it fit on a node?
POWER8 Cache
Access data: L1 cache, L2 cache, L3 cache, L4 cache, local memory, remote memory, distant memory
Cycles: 12, 28, 180, 320, 500, 800
environments
Process first runs to assess level of affinity by partition
User then selects partitions for system optimization
System and workloads continue to run during optimization process
System adjusts workload placement in background to optimize performance
without requiring additional interaction
Available at no additional charge for Power 770, 780, 795, 870 and 880
[Diagram: cores and their attached DIMMs]
Best Practice #4
Best Practice #5
Don't under-commit entitlement.
Every virtual processor has a preferred Node ID:
the set of cores close to where the memory resides.
Best Practice #6
Update Firmware to latest level
The hypervisor has had numerous performance
enhancements
[Diagram: placement of Partition X, Y, and Z processors and memory, with free LMBs]
All slots are x16 with buses direct from the Processor Modules and must be
used to install high-performance PCIe adapters
The adapter priority for these slots is the PCIe3 Optical Cable Adapter
(FC EJ07), then SAS adapters (FC EJ0M, EJ11), followed by any other high-performance low-profile adapter
Refer to the slot priority table for all supported adapters for optimal placement
https://www-01.ibm.com/support/knowledgecenter/9119-MHE/p8eab/p8eab_87x_88x_slot_details.htm
Verify whether the adapter is supported for your system. I/O placement can
be planned and validated using the System Planning Tool (SPT)
2x more drawers
PLUS
More flexibility
Notes:
With two system nodes, it is good practice (but not required) to attach the two fan-out
modules in one I/O drawer to different system nodes. Combined with placing
redundant PCIe adapters in different fan-out modules, this enhances system availability.
The PCIe I/O drawer can be in the same rack as the system nodes or in a different one. If large
numbers of I/O cables are attached to PCIe adapters, it's nice to have the I/O drawer in
a different rack for easier cable management.
System control unit not shown for visual simplicity
www.ibm.com/systems/support/tools/systemplanningtool/
SMT Guidance
Active Memory Mirroring Guidance
SRIOV Guidance
Power Saving Guidance
Enterprise Pools
mode, default)
VP folding for less loaded partitions
threads
SMT4 is the best of all worlds for now, but there are now more options to
exploit SMT
any changes
In the event of a memory failure on the
AMM Guidance
Hypervisor memory mirroring defaults to enabled. You need to be aware of this when sizing system
memory. Plan on AMM to take about 8% of each node's memory and 16% if hypervisor mirroring
Remember,
Hypervisor data that is mirrored:
Hardware Page Tables (HPTs) that are managed by the hypervisor on behalf of partitions to track
physical adapters.
I/O Adapter Enhanced Capacity is reserved memory
With hypervisor memory mirroring enabled, this gets doubled. Reserved memory can become excessively
high on POWER8 Enterprise systems
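To see what these percentages mean for memory sizing, here is a back-of-the-envelope sketch. It assumes one reading of the guidance: roughly 8% of each node's memory is reserved without hypervisor mirroring and roughly 16% with it. The 1024 GB node size and the function name are made-up examples, and actual reserved memory depends on the configuration:

```python
# Planning percentages taken from the guidance above (an interpretation,
# not an exact formula used by the hypervisor).
OVERHEAD_WITHOUT_MIRRORING = 0.08
OVERHEAD_WITH_MIRRORING = 0.16


def usable_memory_gb(node_memory_gb: float, nodes: int,
                     mirroring_enabled: bool = True) -> float:
    """Estimate memory left for partitions after the reserved share."""
    if mirroring_enabled:
        overhead = OVERHEAD_WITH_MIRRORING
    else:
        overhead = OVERHEAD_WITHOUT_MIRRORING
    return node_memory_gb * nodes * (1 - overhead)


if __name__ == "__main__":
    # Hypothetical two-node system with 1024 GB installed per node.
    for enabled in (True, False):
        usable = usable_memory_gb(1024, 2, mirroring_enabled=enabled)
        print(f"mirroring={enabled}: about {usable:.0f} GB usable of 2048 GB")
```

The default is set to mirroring enabled because the slide states that hypervisor memory mirroring defaults to enabled.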
SRIOV Guidance
Link Aggregation (LACP) will not function properly with
as Active connection
IV68444
Fix Level Recommendation Tool (FLRT)
https://www14.software.ibm.com/webapp/set2/flrt/home
Report
processor frequency
Dynamic Power Saver Mode
Processor frequency varies
Mobile activations can be used for systems within the same pool
One pool type for Power E880 & POWER7+ 780 & Power 795 systems
One pool type for Power E870 & POWER7+ 770 systems
Activations can be moved at any time by the user without contacting IBM
Done using HMC
System      | Configuration          | Static | Mobile | Dark
Sys A       | 64-core E880, 4.35 GHz | 10     | 40     | 14
Sys B       | 96-core 795, 3.7 GHz   | 30     | 40     | 26
Sys C       | 96-core 780, 3.7 GHz   | 16     | 20     | 60
Sys D       | 128-core 795, 4.0 GHz  | 40     | 60     | 28
Pool Totals |                        | 96     | 160    | 128
Monday 8:01 am
System      | Configuration          | Static | Mobile | Dark
Sys A       | 64-core E880, 4.35 GHz | 10     | 0      | 54
Sys B       | 96-core 795, 3.7 GHz   | 30     | 55     | 11
Sys C       | 96-core 780, 3.7 GHz   | 16     | 45     | 35
Sys D       | 128-core 795, 4.0 GHz  | 40     | 60     | 28
Pool Totals |                        | 96     | 160    | 128
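The two snapshots differ only in where the mobile activations sit: Sys A gives up its 40 mobile activations, 15 going to Sys B and 25 to Sys C, while the pool totals stay the same. A small model of that bookkeeping follows. The class and function names are hypothetical; this is not an HMC interface, and the HMC may apply rules not captured here:

```python
from dataclasses import dataclass


@dataclass
class System:
    name: str
    cores: int
    static: int
    mobile: int

    @property
    def dark(self) -> int:
        """Installed cores with no activation assigned."""
        return self.cores - self.static - self.mobile


def move_mobile(source: System, target: System, count: int) -> None:
    """Move mobile activations between two systems in the same pool."""
    if count < 0 or count > source.mobile:
        raise ValueError("source does not hold that many mobile activations")
    if count > target.dark:
        raise ValueError("target has too few dark cores to activate")
    source.mobile -= count
    target.mobile += count


def build_pool() -> dict:
    """Return the starting state shown in the first snapshot."""
    return {
        "A": System("Sys A", cores=64, static=10, mobile=40),
        "B": System("Sys B", cores=96, static=30, mobile=40),
        "C": System("Sys C", cores=96, static=16, mobile=20),
        "D": System("Sys D", cores=128, static=40, mobile=60),
    }


if __name__ == "__main__":
    pool = build_pool()
    move_mobile(pool["A"], pool["B"], 15)
    move_mobile(pool["A"], pool["C"], 25)
    for system in pool.values():
        print(f"{system.name}: {system.static} static, "
              f"{system.mobile} mobile, {system.dark} dark")
    print("pool mobile total:", sum(s.mobile for s in pool.values()))
```

Static activations never move in this model, which matches the slide: only the mobile and dark counts change between the two snapshots.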
ORDER
INSTALL
DOWNLOAD
USE
Summary (1 of 2)
Identify the Power Enterprise systems best suited to your needs
Perform sizing based on throughput and response time considerations
For response-time-critical workloads, higher-frequency POWER8 processors will give
more benefit
Understand SMT behavior on POWER8 systems, evaluate it, and apply it accordingly
For maximum memory bandwidth, populate all memory DIMM slots
For optimum cache and memory affinity, plan for partition placement in processor nodes
Additional drawers may help you get better performance. Plan for scalability and
performance
Apply latest firmware level and review minimum supported OS, VIOS and HMC levels for
Summary (2 of 2)
AMM can be leveraged for higher reliability on Enterprise systems. Disable IO adapter
PowerCare Service
Select one PowerCare service
option with each Power E870 or E880
A PowerCare Services engagement offer is included, at no additional charge, with the purchase
of each Power E870 or E880 system.
Power E870 engagement options include :
Thank You
References
Power systems best practices
http://www14.software.ibm.com/webapp/set2/sas/f/best/home.html
PCIe Slot priority table for all supported adapters for optimal placement
https://www-01.ibm.com/support/knowledgecenter/9119-MHE/p8eab/p8eab_87x_88x_slot_details.htm
References
AIX Performance website
https://www.ibm.com/developerworks/wikis/display/WikiPtype/Performance+Monitoring+Documentation
https://www.ibm.com/developerworks/community/wikis/home?lang=en#!/wiki/Power%20Systems/page/rperff