
AIX Virtual Users Group presentation July 29, 2010

AIX 5L/6 Performance Tuning Part I: Tactics for Monitoring Indications of Performance Issues
Earl Jew
IBM Field Technical Sales Specialist for Power Systems and Storage
IBM Regional Designated Specialist - Power/AIX Performance and Tuning
400 North Brand Blvd., Suite 700 c/o IBM, Glendale, CA, USA 91203
earlj@us.ibm.com (310)251-2907

© 2010 IBM Corporation



• Strategic Thoughts, Concepts, Considerations, and Tactics

• Monitoring AIX – Usage, Meaning and Interpretation
– Review component technology of the infrastructure, i.e. proper tuning-by-hardware
– Review implemented AIX constructs, i.e. “firm” near-static structures and settings
– Review historical/accumulated AIX events, usages, pendings, counts, blocks, etc.
– Monitor dynamic AIX command behaviors, e.g. ps, vmstat, mpstat, iostat, topas

• Recognizing Common Performance-degrading Scenarios
– High Load Average relative to count-of-LCPUs, i.e. “over-threadedness”
– vmstat:memory:avm near-to or greater-than lruable-gbRAM, i.e. over-committed
– Continuous low vmstat:memory:fre with persistent lrud (fr:sr) activity
– Continuous high ratio of vmstat:kthr:b relative to vmstat:kthr:r
– Poor ratio of pending I/O waits to iodones in vmstat -s output

• Tactical Check-list of What-to-Watch


Strategic Thoughts: Monitoring AIX 5L/6.1 LPARs

• Many AIX performance-degrading scenarios can be readily characterized by monitoring AIX dynamically (real-time) as well as cumulatively (i.e. vmstat -sv).
• By understanding and interpreting the output of mundane AIX commands better&deeper, areas of resource exhaustion, limitation and over-commitment can be distinguished from areas of resource under-utilization, surplus and over-allocation.

• This presentation focuses on the tactical, meaning your daily “keyboard awareness”.
• It explains the numbers presented by AIX commands (vmstat, mpstat, iostat, ps, etc.) and uses them to formulate the severity of performance issues, if any.

• Most cumulative indicators are counts-per-scale over days-uptime.
• Many dynamic indicators compare ranges&ratios of system resources.
• Scaled-definitions define blue/surplus, green/normal, yellow/warning, red/serious and Flashing-Red-with-Sirens/critical status-conditions.


Strategic Concepts: Monitoring AIX 5L/6.1 LPARs


• Can the capabilities&capacities (C&C) of the infrastructure manage the workload?
• Is there an appropriate balance of hardware capabilities&capacities for the workload?
• To answer, AIX must be monitored relative to its hardware resources&infrastructure.

• Note the size, scale, technology and implementation of the given LPAR
• Note the LPAR’s ratio-of-resources, i.e. CPU-to-RAM-to-SAN I/O for the workload
• Note the same for other same-frame “sibling” LPARs, if any

• Review the historical/accumulated count-of-events over days-uptime
• Determine points of exhaustion, limitation, and over-commitment
• Determine surplus resources: CPU cycles, RAM, SAN I/O throughput, etc.
• Review exhaustions and surpluses in light of workload expectations

• Monitor dynamic AIX behaviors to characterize the given workload
• For example: Is this a Think-Think or a Move-the-Data workload?

Strategic Considerations: Monitoring AIX 5L/6.1 LPARs

• Monitor dynamic AIX behaviors using a 1- or 2-second sampling interval (vs. >30 secs)
• Verify a stressful workload exists: “We can’t tune what is not being taxed”
• Discontinue active efforts when done: “If/when it runs fast enough, we’re tuned”
• Build with track-able discrete structures: “We can’t tune what can’t be tracked”
• Monitor spikes, peaks, bursts and burns: “We tune the intensities, not the sleepy-times”
• Establish dynamic baselines by monitoring real-time AIX behaviors by ranges&ratios
• Watch AIX behaviors with the goal of characterizing the workload (vmstat -Iwt 2)
System Configuration: lcpu=48 mem=174080MB

  kthr        memory                  page                     faults           cpu       time
-------- ---------------- ----------------------------- -------------------- ----------- --------
 r  b  p      avm     fre    fi    fo pi po    fr    sr    in      sy     cs us sy id wa hr mi se
23 12  0 16896791 6008393 10421  7588  0  0  6336  6693  5488 2166351 648185 34 27 27 12 13:23:04
 0  0  0 16918278 5986559  9317  2992  0  0  2207  2361  5551 1992963 630548 32 27 27 14 13:23:05
24 16  0 16908381 5996756  5554  2118  0  0  5702  6059  5366 2176668 645810 32 27 28 13 13:23:06
26 17  0 16902538 6002492  6608  2412  0  0  6606  7143  6174 2275536 680525 33 25 27 15 13:23:07
19 21  0 16897984 6007108  5991  1907  0  0  5948  6331  5623 3373138 700564 32 26 28 14 13:23:08
18 18  0 16897189 6007718  5659  1793  0  0  5432  5713  5285 2053519 665453 28 24 33 15 13:23:09
21 13  0 16898035 6006971  6530  1552  0  0  6600  6946  5853 1918286 714013 28 24 34 14 13:23:10
23 17  0 16896745 6008235  5987  2032  0  0  5937  6284  5531 2390617 774752 31 26 30 14 13:23:11
29 18  0 16898852 6006076  7040  3969  0  0  6868  7550  6050 2663508 811273 32 28 26 14 13:23:12
21 20  0 16900383 6004434  8590  7483  0  0  8284  8852  8094 2583303 805351 34 29 24 14 13:23:13
36 15  0 16896715 6010736 11738 16017  0  0  7487  8181  8730 2502543 790705 35 29 26 10 13:23:14
24 14  0 16900246 6004726 11607 10387  0  0  5444  5983  6291 2971775 820218 35 28 26 11 13:23:15
27 11  0 16898966 6006775 12471 13520  0  0  6065  6665  6831 1947291 863025 35 29 26 10 13:23:16
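
One way to turn samples like these into a workload characterization is to reduce them to the ranges&ratios this deck watches: run queue (r) per logical CPU, and blocked threads (b) relative to runnable threads (r). The following is only a minimal Python sketch, assuming the column order of the vmstat -Iwt output above; the function names are hypothetical, the second "sy" column is renamed "sys" to keep the keys unique, and the three sample rows are copied from the output.

```python
# Sketch: summarize vmstat -Iwt samples as ratios.
# Assumption: 19 whitespace-separated fields per data row, in the order shown above.
FIELDS = "r b p avm fre fi fo pi po fr sr in sy cs us sys id wa time".split()

def parse_vmstat_rows(text):
    """Return one dict per data row; header and separator lines are skipped."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != len(FIELDS) or not parts[0].isdigit():
            continue
        row = {k: int(v) for k, v in zip(FIELDS[:-1], parts[:-1])}
        row["time"] = parts[-1]
        rows.append(row)
    return rows

def summarize(rows, lcpu):
    """Average r and b, then express them as the two ratios of interest."""
    n = len(rows)
    avg_r = sum(r["r"] for r in rows) / n
    avg_b = sum(r["b"] for r in rows) / n
    return {
        "avg_r": avg_r,
        "avg_b": avg_b,
        "r_per_lcpu": avg_r / lcpu,
        "b_to_r": avg_b / avg_r if avg_r else float("inf"),
    }

SAMPLE = """\
23 12 0 16896791 6008393 10421 7588 0 0 6336 6693 5488 2166351 648185 34 27 27 12 13:23:04
24 16 0 16908381 5996756 5554 2118 0 0 5702 6059 5366 2176668 645810 32 27 28 13 13:23:06
26 17 0 16902538 6002492 6608 2412 0 0 6606 7143 6174 2275536 680525 33 25 27 15 13:23:07
"""

summary = summarize(parse_vmstat_rows(SAMPLE), lcpu=48)
for name, value in summary.items():
    print(f"{name:12s} {value:8.3f}")
```

In practice the text would come from a captured vmstat run rather than a pasted sample, and the thresholds that make a ratio "yellow" or "red" remain a judgment for the given workload.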

Strategic Tactics: Monitoring/Tactically-Tuning AIX 5L/6.1 LPARs


• Monitor AIX behaviors with the goal of characterizing the workload
• Use the workload characterization to guide AIX 5L/6.1 tactical-tuning efforts

Tuning Strategy example 1
• Determine points of exhaustion, limitation, and over-commitment
• Determine surplus resources: CPU cycles, RAM, SAN I/O throughput, etc.
• Devise tactics to relieve exhaustions by exploiting surplus resources

Tuning Strategy example 2
• Study the mechanics of AIX Virtual Memory Management (VMM)
• Understand the influence of vmo/ioo/no tuning parameters on AIX VMM dynamic behaviors
• Practice monitoring the behaviors of the AIX VMM mechanisms
• Recognize-and-Remedy the “bottlenecks” in AIX VMM resources

Tuning Strategy example 3
• Exercise&experiment with the various JFS2 mount-options as well as Going Raw
• Devise ways to characterize I/O patterns in routinely-active RDBMS “tablespaces”
• Match/place RDBMS “tablespaces” with the best JFS2 mount-options including Going Raw




Note the size, scale, technology and implementation of the given LPAR
Note the LPAR’s ratio-of-resources, i.e. CPU-to-RAM-to-SAN I/O

$ date ; uname -a ; id ; oslevel -s ; lparstat -i
Wed Sep 17 12:21:20 EDT 2008
AIX alamogordo 3 5 00XXXXXXXX00
uid=0(root) gid=0(system) groups=2(bin),3(sys),8(cron),10(audit),301(udmin),11(lp),401(multiud),303(cognos)
5300-07-03-0811
Node Name                         : alamogordo
Partition Name                    : alamogordo
Partition Number                  : 3
Type                              : Shared-SMT
Mode                              : Uncapped
Entitled Capacity                 : 15.20
Partition Group-ID                : 32771
Shared Pool ID                    : 0
Online Virtual CPUs               : 32
Maximum Virtual CPUs              : 32
Minimum Virtual CPUs              : 2
Online Memory                     : 126464 MB
Maximum Memory                    : 131072 MB
Minimum Memory                    : 2048 MB
Variable Capacity Weight          : 255
Minimum Capacity                  : 2.00
Maximum Capacity                  : 16.00
Capacity Increment                : 0.01
Maximum Physical CPUs in system   : 64
Active Physical CPUs in system    : 16
Active CPUs in Pool               : 16
Shared Physical CPUs in system    : -
Maximum Capacity of Pool          : -
Entitled Capacity of Pool         : -
Unallocated Capacity              : 0.00
Physical CPU Percentage           : 47.50%
Unallocated Weight                : 0
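
The ratio-of-resources noted above can be read straight off this output. The following is a minimal Python sketch, assuming the "Label : value" layout of lparstat -i shown here; the function name and the particular ratios chosen (virtual CPUs per entitled CPU, MB of RAM per entitled CPU) are illustrative choices, not something lparstat reports itself.

```python
# Sketch: derive resource ratios from `lparstat -i` text.
# Assumption: every line of interest has the form "Label : value".
def parse_lparstat(text):
    """Map each label to its value string."""
    info = {}
    for line in text.splitlines():
        if ":" not in line:
            continue
        label, value = line.split(":", 1)
        info[label.strip()] = value.strip()
    return info

# Four lines copied from the lparstat -i output above.
SAMPLE = """\
Entitled Capacity : 15.20
Online Virtual CPUs : 32
Online Memory : 126464 MB
Active Physical CPUs in system : 16
"""

info = parse_lparstat(SAMPLE)
entitled = float(info["Entitled Capacity"])
vcpus = int(info["Online Virtual CPUs"])
mem_mb = int(info["Online Memory"].split()[0])

vcpu_per_entitled = vcpus / entitled   # virtual CPUs per entitled physical CPU
mb_per_entitled = mem_mb / entitled    # MB of RAM per entitled physical CPU
print(f"{vcpu_per_entitled:.2f} virtual CPUs per entitled CPU")
print(f"{mb_per_entitled:.0f} MB RAM per entitled CPU")
```

Running the same calculation for each same-frame sibling LPAR gives a quick side-by-side view of how CPU and RAM are balanced across the frame.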


prtconf # note the component technology of the given LPAR


$ prtconf
System Model: IBM,9119-595
Machine Serial Number: XXXXXXX
Processor Type: PowerPC_POWER5
Number Of Processors: 32
Processor Clock Speed: 1902 MHz
CPU Type: 64-bit
Kernel Type: 64-bit
LPAR Info: 3 alamogordo
Memory Size: 126464 MB
Good Memory Size: 126464 MB
Platform Firmware level: Not Available
Firmware Version: IBM,SF240_320
Console Login: enable
Auto Restart: true
Full Core: false

Network Information
Host Name: alamogordo
IP Address: addr
Sub Netmask: mask
Gateway: gateway
Name Server: nameserv
Domain Name: domain

Paging Space Information
Total Paging Space: 6016MB
Percent Used: 11%

Volume Groups Information
==============================================================================
rootvg:
PV_NAME           PV STATE          TOTAL PPs   FREE PPs    FREE DISTRIBUTION
hdisk0            active            511         89          00..00..00..44..45
hdisk1            active            511         89          00..14..00..30..45
==============================================================================



lscfg # note the placement of components in the implementation of the LPAR


$ lscfg
INSTALLED RESOURCE LIST

The following resources are installed on the machine.


+/- = Added or deleted from Resource List.
* = Diagnostic support not available.

Model Architecture: chrp


Model Implementation: Multiple Processor, PCI bus

+ sys0 System Object


+ sysplanar0 System Planar
* pci11 U5791.001.99205Z4-P1 PCI Bus
* pci12 U5791.001.99205Z4-P1 PCI Bus
+ ent6 U5791.001.99205Z4-P1-C04-T1 10 Gigabit Ethernet-SR PCI-X 2.0 DDR Adap
* pci4 U5791.001.99205Z4-P1 PCI Bus
* pci7 U5791.001.99205Z4-P1 PCI Bus
+ fcs3 U5791.001.99205Z4-P1-C05-T1 FC Adapter
* fcnet3 U5791.001.99205Z4-P1-C05-T1 Fibre Channel Network Protocol Device
* fscsi4 U5791.001.99205Z4-P1-C05-T1 FC SCSI I/O Controller Protocol Device
* pci8 U5791.001.99205Z4-P1 PCI Bus
+ fcs4 U5791.001.99205Z4-P1-C06-T1 FC Adapter
* fcnet4 U5791.001.99205Z4-P1-C06-T1 Fibre Channel Network Protocol Device
* fscsi3 U5791.001.99205Z4-P1-C06-T1 FC SCSI I/O Controller Protocol Device
* hdisk15 U5791.001.99205Z4-P1-C06-T1-W50050768014044F1-LD000000000000 MPIO FC 2145
* hdisk16 U5791.001.99205Z4-P1-C06-T1-W50050768014044F1-LE000000000000 MPIO FC 2145
* hdisk17 U5791.001.99205Z4-P1-C06-T1-W50050768014044F1-LF000000000000 MPIO FC 2145
* pci10 U5791.001.99205Z4-P1 PCI Bus
* pci3 U5791.001.99205Z4-P2 PCI Bus
* pci6 U5791.001.99205Z4-P2 PCI Bus
+ fcs2 U5791.001.99205Z4-P2-C05-T1 FC Adapter
* fcnet2 U5791.001.99205Z4-P2-C05-T1 Fibre Channel Network Protocol Device
* fscsi2 U5791.001.99205Z4-P2-C05-T1 FC SCSI I/O Controller Protocol Device
* hdisk14 U5791.001.99205Z4-P2-C05-T1-W5005076801104417-LC000000000000 MPIO FC 2145
* pci2 U5791.001.99205Z4-P2 PCI Bus
* pci5 U5791.001.99205Z4-P2 PCI Bus
+ fcs1 U5791.001.99205Z4-P2-C02-T1 FC Adapter
* fcnet1 U5791.001.99205Z4-P2-C02-T1 Fibre Channel Network Protocol Device
* fscsi1 U5791.001.99205Z4-P2-C02-T1 FC SCSI I/O Controller Protocol Device
* hdisk2 U5791.001.99205Z4-P2-C02-T1-W50050768013044F1-L0 MPIO FC 2145
* hdisk3 U5791.001.99205Z4-P2-C02-T1-W50050768013044F1-L1000000000000 MPIO FC 2145
* hdisk4 U5791.001.99205Z4-P2-C02-T1-W50050768013044F1-L2000000000000 MPIO FC 2145
* hdisk5 U5791.001.99205Z4-P2-C02-T1-W50050768013044F1-L3000000000000 MPIO FC 2145
* hdisk6 U5791.001.99205Z4-P2-C02-T1-W50050768013044F1-L4000000000000 MPIO FC 2145
* hdisk7 U5791.001.99205Z4-P2-C02-T1-W50050768013044F1-L5000000000000 MPIO FC 2145

…

lsdev # note the count&capacity of the component technology of the LPAR


$ lsdev
L2cache0 Available L2 Cache
aio0 Available Asynchronous I/O (Legacy)
en0 Available 08-08 Standard Ethernet Network Interface
en1 Defined 0K-08 Standard Ethernet Network Interface
en2 Defined 0K-09 Standard Ethernet Network Interface
en3 Defined 0L-08 Standard Ethernet Network Interface
en4 Defined 0L-09 Standard Ethernet Network Interface
en5 Available Standard Ethernet Network Interface
en6 Available Standard Ethernet Network Interface
en7 Defined 0L-09 Standard Ethernet Network Interface
ent0 Available 08-08 10/100/1000 Base-TX PCI-X Adapter (14106902)
ent1 Available 0K-08 2-Port 10/100/1000 Base-TX PCI-X Adapter (14108902)
ent2 Available 0K-09 2-Port 10/100/1000 Base-TX PCI-X Adapter (14108902)
ent3 Available 0L-08 2-Port 10/100/1000 Base-TX PCI-X Adapter (14108902)
ent4 Available 0L-09 2-Port 10/100/1000 Base-TX PCI-X Adapter (14108902)
ent5 Available EtherChannel / IEEE 802.3ad Link Aggregation
ent6 Available EtherChannel / IEEE 802.3ad Link Aggregation
et0 Defined 08-08 IEEE 802.3 Ethernet Network Interface
et1 Defined 0K-08 IEEE 802.3 Ethernet Network Interface
et2 Defined 0K-09 IEEE 802.3 Ethernet Network Interface
et3 Defined 0L-08 IEEE 802.3 Ethernet Network Interface
et4 Defined 0L-09 IEEE 802.3 Ethernet Network Interface
et5 Defined IEEE 802.3 Ethernet Network Interface
et6 Defined IEEE 802.3 Ethernet Network Interface
et7 Defined 0L-09 IEEE 802.3 Ethernet Network Interface
fcnet0 Defined 05-08-01 Fibre Channel Network Protocol Device
fcnet1 Defined 07-08-01 Fibre Channel Network Protocol Device
fcnet2 Defined 0A-08-01 Fibre Channel Network Protocol Device
fcnet3 Defined 0D-08-01 Fibre Channel Network Protocol Device
fcnet4 Defined 0G-08-01 Fibre Channel Network Protocol Device
fcnet5 Defined 0H-08-01 Fibre Channel Network Protocol Device
fcnet6 Defined 0I-08-01 Fibre Channel Network Protocol Device
fcnet7 Defined 0J-08-01 Fibre Channel Network Protocol Device
fcs0 Available 05-08 FC Adapter
fcs1 Available 07-08 FC Adapter
fcs2 Available 0A-08 FC Adapter
fcs3 Available 0D-08 FC Adapter
fcs4 Available 0G-08 FC Adapter
fcs5 Available 0H-08 FC Adapter
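
To note the count of each component type, the lsdev listing can be tallied by device-name prefix and state. The following is a minimal Python sketch under the assumption that each line starts with a device name followed by Available or Defined, as in the output above; the function name is hypothetical and the sample lines are copied from the listing.

```python
# Sketch: tally lsdev lines by (device-name prefix, state).
import re
from collections import Counter

def count_devices(text):
    """Count devices per (prefix, state), e.g. ('fcs', 'Available')."""
    counts = Counter()
    for line in text.splitlines():
        parts = line.split()
        if len(parts) >= 2 and parts[1] in ("Available", "Defined"):
            prefix = re.sub(r"\d+$", "", parts[0])  # fcs0 -> fcs
            counts[(prefix, parts[1])] += 1
    return counts

SAMPLE = """\
fcs0 Available 05-08 FC Adapter
fcs1 Available 07-08 FC Adapter
fcnet0 Defined 05-08-01 Fibre Channel Network Protocol Device
en0 Available 08-08 Standard Ethernet Network Interface
en1 Defined 0K-08 Standard Ethernet Network Interface
ent0 Available 08-08 10/100/1000 Base-TX PCI-X Adapter (14106902)
"""

counts = count_devices(SAMPLE)
for (prefix, state), n in sorted(counts.items()):
    print(f"{prefix:8s} {state:10s} {n}")
```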




lsps ; mount # review the implemented construction of “firm” AIX structures

$ lsps -a
Page Space      Physical Volume   Volume Group       Size %Used Active  Auto  Type
paging01        hdisk80           pagingvg        32736MB     1    yes   yes    lv
paging01        hdisk81           pagingvg        32736MB     1    yes   yes    lv
paging00        hdisk78           pagingvg        32736MB     1    yes   yes    lv
paging00        hdisk79           pagingvg        32736MB     1    yes   yes    lv
hd6             hdisk0            rootvg            512MB     0     no   yes    lv
$ lsps -s
Total Paging Space   Percent Used
      130944MB               1%
$ mount
node mounted mounted over vfs date options
-------- --------------- --------------- ------ ------------ ---------------
/dev/hd4 / jfs Oct 20 22:06 rw,log=/dev/hd8
/dev/hd2 /usr jfs Oct 20 22:06 rw,log=/dev/hd8
/dev/hd9var /var jfs Oct 20 22:06 rw,log=/dev/hd8
/dev/hd3 /tmp jfs Oct 20 22:06 rw,log=/dev/hd8
/dev/hd1 /home jfs Oct 20 22:08 rw,log=/dev/hd8
/proc /proc procfs Oct 20 22:08 rw
/dev/hd10opt /opt jfs Oct 20 22:08 rw,log=/dev/hd8
/dev/lvsapcds /sapcds jfs2 Oct 20 22:08 rw,log=/dev/lv00
/dev/lvcnvbt /cnv jfs2 Oct 20 22:20 rw,log=/dev/lvjfs00
/dev/lvhrtmpbt /hrtmp jfs2 Oct 20 22:20 rw,log=/dev/lvjfs00
/dev/lvoraclebt /oracle jfs2 Oct 20 22:20 rw,log=/dev/lvjfs00
/dev/lvorapr1bt /oracle/PR1 jfs2 Oct 20 22:20 rw,log=/dev/lvjfs00
/dev/lvmirrlogAp /oracle/PR1/mirrlogA jfs2 Oct 20 22:20 rw,log=/dev/lvjfs20
/dev/lvmirrlogBp /oracle/PR1/mirrlogB jfs2 Oct 20 22:20 rw,log=/dev/lvjfs20
/dev/lvoriglogAp /oracle/PR1/origlogA jfs2 Oct 20 22:20 rw,log=/dev/lvjfs20
/dev/lvoriglogBp /oracle/PR1/origlogB jfs2 Oct 20 22:20 rw,log=/dev/lvjfs20
/dev/lvsaparchbt /oracle/PR1/saparch jfs2 Oct 20 22:20 rw,log=/dev/lvjfs00
/dev/lvsapdata1bt /oracle/PR1/sapdata1 jfs2 Oct 20 22:20 rw,log=/dev/lvjfs01
/dev/lvsapdata18bt /oracle/PR1/sapdata10 jfs2 Oct 20 22:20 rw,log=/dev/lvjfs10
/dev/lvsapdata11bt /oracle/PR1/sapdata11 jfs2 Oct 20 22:20 rw,log=/dev/lvjfs11
/dev/lvsapdata24bt /oracle/PR1/sapdata12 jfs2 Oct 20 22:20 rw,log=/dev/lvjfs12
/dev/lvsapdata2bt /oracle/PR1/sapdata2 jfs2 Oct 20 22:20 rw,log=/dev/lvjfs02
/dev/lvsapdata3bt /oracle/PR1/sapdata3 jfs2 Oct 20 22:20 rw,log=/dev/lvjfs03
/dev/lvsapdata14bt /oracle/PR1/sapdata4 jfs2 Oct 20 22:20 rw,log=/dev/lvjfs04
/dev/lvsapdata23bt /oracle/PR1/sapdata5 jfs2 Oct 20 22:20 rw,log=/dev/lvjfs05
/dev/lvsapdata16bt /oracle/PR1/sapdata6 jfs2 Oct 20 22:20 rw,log=/dev/lvjfs06
/dev/lvsapdata7bt /oracle/PR1/sapdata7 jfs2 Oct 20 22:20 rw,log=/dev/lvjfs07

…

df -k # review the implemented construction of “firm” AIX structures; observe count-of-inodes per GBs(used) of each application’s data filesystems

$ df -k
Filesystem         1024-blocks      Free %Used    Iused %Iused Mounted on
/dev/hd4                262144    129016   51%     3777     3% /
/dev/hd2               3932160    544280   87%    42721     5% /usr
/dev/hd9var            1048576    334980   69%     4293     2% /var
/dev/hd3               1048576    731832   31%      519     1% /tmp
/dev/hd1                262144     63632   76%     2622     5% /home
/proc                        -         -     -        -      - /proc
/dev/hd10opt            262144    213832   19%      849     2% /opt
/dev/lvsapcds          2097152    456840   79%     1246     2% /sapcds
/dev/lvcnvbt          20480000  16993664   18%      715     1% /cnv
/dev/lvhrtmpbt          524288    506984    4%       30     1% /hrtmp
/dev/lvoraclebt         524288    436808   17%     2938     3% /oracle
/dev/lvorapr1bt        8978432   3838252   58%    21476     3% /oracle/PR1
/dev/lvmirrlogAp       3080192   2567348   17%        6     1% /oracle/PR1/mirrlogA
/dev/lvmirrlogBp       3080192   2567348   17%        6     1% /oracle/PR1/mirrlogB
/dev/lvoriglogAp       3080192   2567348   17%        6     1% /oracle/PR1/origlogA
/dev/lvoriglogBp       3080192   2567348   17%        6     1% /oracle/PR1/origlogB
/dev/lvsaparchbt      14680064  14296480    3%     7176     1% /oracle/PR1/saparch
/dev/lvsapdata1bt    268173312  73734764   73%      116     1% /oracle/PR1/sapdata1
/dev/lvsapdata18bt   268173312  73751196   73%      108     1% /oracle/PR1/sapdata10
/dev/lvsapdata11bt   268173312  77027948   72%      108     1% /oracle/PR1/sapdata11
/dev/lvsapdata24bt   268173312  75455208   72%      108     1% /oracle/PR1/sapdata12
/dev/lvsapdata2bt    268173312  76225148   72%      110     1% /oracle/PR1/sapdata2
/dev/lvsapdata3bt    268173312  75569716   72%      110     1% /oracle/PR1/sapdata3
/dev/lvsapdata14bt   268173312  74930816   73%      108     1% /oracle/PR1/sapdata4
/dev/lvsapdata23bt   268173312  77814376   71%      108     1% /oracle/PR1/sapdata5
/dev/lvsapdata16bt   268173312  79387368   71%      108     1% /oracle/PR1/sapdata6
/dev/lvsapdata7bt    268173312  74013420   73%      108     1% /oracle/PR1/sapdata7
/dev/lvsapdata8bt    268173312  75192876   72%      108     1% /oracle/PR1/sapdata8
/dev/lvsapdata19bt   268173312  74668728   73%      108     1% /oracle/PR1/sapdata9
/dev/lvsapreorgbt     25165824  19272876   24%     1153     1% /oracle/PR1/sapreorg
/dev/lvostage          2097152   1957092    7%      794     1% /oracle/stage
/dev/lvsapmntbt        2097152   1447736   31%      357     1% /sapmnt/PR1


df -k # review the implemented construction of “firm” AIX structures; observe count-of-inodes per GBs(used) of each application’s data filesystems using Excel

Filesystem         1024-blocks     Free Iused Mounted on            1024-blocks Used avg blocks-per-inode avg bytes/inode
/dev/lvoriglogBp       3080192  2567348     6 /oracle/PR1/origlogB            512844              85474.0      87,525,376
/dev/lvsaparchbt      14680064 14296480  7176 /oracle/PR1/saparch             383584                 53.5          54,737
/dev/lvsapdata1bt    268173312 73734764   116 /oracle/PR1/sapdata1         194438548            1676194.4   1,716,423,044
/dev/lvsapdata18bt   268173312 73751196   108 /oracle/PR1/sapdata10        194422116            1800204.8   1,843,409,692
/dev/lvsapdata11bt   268173312 77027948   108 /oracle/PR1/sapdata11        191145364            1769864.5   1,812,341,229
/dev/lvsapdata24bt   268173312 75455208   108 /oracle/PR1/sapdata12        192718104            1784426.9   1,827,253,134
/dev/lvsapdata2bt    268173312 76225148   110 /oracle/PR1/sapdata2         191948164            1744983.3   1,786,862,909
/dev/lvsapdata3bt    268173312 75569716   110 /oracle/PR1/sapdata3         192603596            1750941.8   1,792,964,385
/dev/lvsapdata14bt   268173312 74930816   108 /oracle/PR1/sapdata4         193242496            1789282.4   1,832,225,147
/dev/lvsapdata23bt   268173312 77814376   108 /oracle/PR1/sapdata5         190358936            1762582.7   1,804,884,727
/dev/lvsapdata16bt   268173312 79387368   108 /oracle/PR1/sapdata6         188785944            1748018.0   1,789,970,432
/dev/lvsapdata7bt    268173312 74013420   108 /oracle/PR1/sapdata7         194159892            1797776.8   1,840,923,420
/dev/lvsapdata8bt    268173312 75192876   108 /oracle/PR1/sapdata8         192980436            1786855.9   1,829,740,430
/dev/lvsapdata19bt   268173312 74668728   108 /oracle/PR1/sapdata9         193504584            1791709.1   1,834,710,130
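
The derived spreadsheet columns follow from three df -k columns: used blocks = 1024-blocks minus Free, blocks-per-inode = used blocks divided by Iused, and bytes-per-inode = blocks-per-inode times 1024. As a minimal sketch, the same arithmetic in Python (the function name is hypothetical; the three rows are copied from the df -k output):

```python
# Sketch: average data size per inode from `df -k` columns (1024-byte blocks).
def bytes_per_inode(total_kb, free_kb, iused):
    """Return (used 1024-blocks, avg blocks per inode, avg bytes per inode)."""
    used_kb = total_kb - free_kb
    blocks_per_inode = used_kb / iused
    return used_kb, blocks_per_inode, blocks_per_inode * 1024

# (mount point, 1024-blocks, Free, Iused) taken from the df -k listing
ROWS = [
    ("/oracle/PR1/origlogB", 3080192, 2567348, 6),
    ("/oracle/PR1/saparch", 14680064, 14296480, 7176),
    ("/oracle/PR1/sapdata1", 268173312, 73734764, 116),
]

for name, total, free, iused in ROWS:
    used, per_inode, avg_bytes = bytes_per_inode(total, free, iused)
    print(f"{name:22s} {used:>10d} {per_inode:>12.1f} {avg_bytes:>16,.0f}")
```

A filesystem holding a few very large files (the sapdata filesystems) and one holding many small files (saparch) stand out immediately by this measure, which is one input when matching filesystems to mount options.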


ipcs -bm # review the implemented construction of “firm” AIX structures; computational memory includes allocated (vs authorized) shmemsegs

$ ipcs -bm | wc -l
164
$ ipcs -bm | grep -v sapsys
IPC status from /dev/mem as of Sun Mar 2 16:23:03 PST 2008
T ID KEY MODE OWNER GROUP SEGSZ
Shared Memory:
m 1048576 0xffffffff --rw-rw---- root system 8192
m 1048577 0xffffffff --rw-rw---- root system 4096
m 1048578 0x7800004f --rw-rw-rw- root system 16777216
m 3 0xffffffff --rw-rw---- root system 4096
m 4 0x670017bc --rw-r--r-- root system 12
m 5 0x680017bc --rw-r--r-- root system 106548
m 6 0x700017bc --rw------- root system 3152
m 7 0xffffffff --rw-rw---- root 219 4096
m 62914569 0x8aa6abe8 --rw-r----- orapr1 dba 12903030784
$ ipcs -bm | grep -c sapsys
152
$ ipcs -bm | grep sapsys
m 23068682 0xffffffff --rw-r----- pr1adm sapsys 1024
m 11 0x0382be84 --rw-rw-rw- pr1adm sapsys 4096
m 46137357 0xffffffff --rw------- pr1adm sapsys 2147483648
m 46137358 0xffffffff --rw------- pr1adm sapsys 2147483648
m 53477391 0xffffffff --rw------- pr1adm sapsys 2147483648
m 44040210 0xffffffff --rw------- pr1adm sapsys 2147483648
m 44040211 0xffffffff --rw------- pr1adm sapsys 2147483648
m 51380244 0xffffffff --rw------- pr1adm sapsys 2147483648
m 51380254 0xffffffff --rw------- pr1adm sapsys 2147483648
m 57671711 0x00002725 --rw-r----- pr1adm sapsys 562192
m 56623136 0x00002722 --rw-r----- pr1adm sapsys 656416
m 56623137 0xffffffff --rw------- pr1adm sapsys 2147483648
m 54525986 0xffffffff --rw------- pr1adm sapsys 2147483648
m 42991653 0xffffffff --rw------- pr1adm sapsys 2147483648
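
To see how much shared memory each group has defined, the SEGSZ column can be summed per GROUP. The following is a minimal Python sketch, assuming the seven-column row layout of ipcs -bm shown above; the function name is hypothetical and only five sample rows are included. Note the assumption stated on the slide: SEGSZ is the segment's size as defined, which is not necessarily the memory actually allocated.

```python
# Sketch: total shared-memory segment sizes per group from `ipcs -bm` rows.
# Assumption: data rows are "m ID KEY MODE OWNER GROUP SEGSZ".
from collections import defaultdict

def segsz_by_group(text):
    """Sum the SEGSZ column (bytes) for each GROUP."""
    totals = defaultdict(int)
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 7 and parts[0] == "m" and parts[6].isdigit():
            totals[parts[5]] += int(parts[6])
    return dict(totals)

SAMPLE = """\
T ID KEY MODE OWNER GROUP SEGSZ
m 1048578 0x7800004f --rw-rw-rw- root system 16777216
m 62914569 0x8aa6abe8 --rw-r----- orapr1 dba 12903030784
m 46137357 0xffffffff --rw------- pr1adm sapsys 2147483648
m 46137358 0xffffffff --rw------- pr1adm sapsys 2147483648
m 57671711 0x00002725 --rw-r----- pr1adm sapsys 562192
"""

totals = segsz_by_group(SAMPLE)
for group, size in sorted(totals.items()):
    print(f"{group:8s} {size / 2**30:8.2f} GiB")
```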


vmo; ioo # review the implemented construction of “firm” AIX structures


$ vmo -L ; ioo -L

NAME CUR DEF BOOT MIN MAX UNIT TYPE
DEPENDENCIES
--------------------------------------------------------------------------------
cpu_scale_memp 8 8 8 1 64 B
--------------------------------------------------------------------------------
data_stagger_interval 161 161 161 0 4K-1 4KB pages D
lgpg_regions
--------------------------------------------------------------------------------
defps 1 1 1 0 1 boolean D
--------------------------------------------------------------------------------
force_relalias_lite 0 0 0 0 1 boolean D
--------------------------------------------------------------------------------
framesets 2 2 2 1 10 B
--------------------------------------------------------------------------------
htabscale n/a -1 -1 -4 0 B
--------------------------------------------------------------------------------
kernel_heap_psize 4K 4K 4K 4K 16M bytes B
--------------------------------------------------------------------------------
kernel_psize 16M 0 0 0 16M bytes B
--------------------------------------------------------------------------------


NAME CUR DEF BOOT MIN MAX UNIT TYPE
DEPENDENCIES
--------------------------------------------------------------------------------
j2_atimeUpdateSymlink 0 0 0 0 1 boolean D
--------------------------------------------------------------------------------
j2_dynamicBufferPreallocation
16 16 16 0 256 16K slabs D
--------------------------------------------------------------------------------
j2_inodeCacheSize 400 400 400 1 1000 D
--------------------------------------------------------------------------------
j2_maxPageReadAhead 128 128 128 0 64K 4KB pages D
--------------------------------------------------------------------------------
j2_maxRandomWrite 0 0 0 0 64K 4KB pages D
--------------------------------------------------------------------------------
j2_maxUsableMaxTransfer 512 512 512 1 4K pages M
--------------------------------------------------------------------------------
j2_metadataCacheSize 400 400 400 1 1000 D
--------------------------------------------------------------------------------
…
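
When reviewing these listings, the tunables whose CUR value differs from DEF are usually the first thing to note. The following is a minimal Python sketch, assuming the simple one-line row layout (NAME CUR DEF BOOT MIN MAX [UNIT] TYPE) seen in the output above; rows that wrap onto a second line, such as j2_dynamicBufferPreallocation, are deliberately not handled, and the function name is hypothetical.

```python
# Sketch: list tunables whose current value differs from the default.
# Assumption: each tunable occupies one line with at least 7 fields.
def non_default_tunables(text):
    """Return (name, cur, default) for rows where CUR != DEF."""
    changed = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 7 or parts[0] == "NAME":
            continue  # separators, dependency lines, headers
        name, cur, default = parts[0], parts[1], parts[2]
        if cur != "n/a" and cur != default:
            changed.append((name, cur, default))
    return changed

SAMPLE = """\
NAME CUR DEF BOOT MIN MAX UNIT TYPE
--------------------------------------------------------------------------------
cpu_scale_memp 8 8 8 1 64 B
--------------------------------------------------------------------------------
htabscale n/a -1 -1 -4 0 B
--------------------------------------------------------------------------------
kernel_psize 16M 0 0 0 16M bytes B
--------------------------------------------------------------------------------
"""

changed = non_default_tunables(SAMPLE)
for name, cur, default in changed:
    print(f"{name}: current={cur} default={default}")
```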



uptime; vmstat -s # Review accumulated count-of-events over days-uptime


10:41AM up 476 days, 13:43, 3 users, load average: 2.83, 3.27, 3.21
2567604831 total address trans. faults
130578171 page ins
316967129 page outs
546 paging space page ins
1543 paging space page outs
0 total reclaims
1092529772 zero filled pages faults
452788 executable filled pages faults
179739818 pages examined by clock
74 revolutions of the clock hand
72866299 pages freed by the clock
27734778 backtracks
300 free frame waits
9 extend XPT waits
14914462 pending I/O waits
406381092 start I/Os
161907544 iodones
13393500872 cpu context switches
1220683619 device interrupts
4305860328 software interrupts
13540901246 decrementer interrupts
71006911 mpc-sent interrupts
71003781 mpc-receive interrupts
10548539 phantom interrupts
0 traps
35581373075 syscalls
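
Because these counters accumulate since boot, they read best as ratios and as counts per day of uptime. The following is a minimal Python sketch, assuming the "count description" line layout of vmstat -s shown above; the function name is hypothetical, the uptime is taken from the uptime line (476 days, 13:43), and only a few counters are included in the sample.

```python
# Sketch: ratios and per-day rates from cumulative `vmstat -s` counters.
import re

def parse_vmstat_s(text):
    """Map each counter description to its integer count."""
    counters = {}
    for line in text.splitlines():
        m = re.match(r"\s*(\d+)\s+(.+?)\s*$", line)
        if m:
            counters[m.group(2)] = int(m.group(1))
    return counters

SAMPLE = """\
300 free frame waits
14914462 pending I/O waits
406381092 start I/Os
161907544 iodones
1543 paging space page outs
"""
UPTIME_DAYS = 476 + (13 * 60 + 43) / (24 * 60)  # "up 476 days, 13:43"

c = parse_vmstat_s(SAMPLE)
wait_ratio = c["pending I/O waits"] / c["iodones"]
per_day = {name: count / UPTIME_DAYS for name, count in c.items()}

print(f"pending I/O waits per iodone: {wait_ratio:.3f}")
for name, rate in per_day.items():
    print(f"{name:26s} {rate:14.2f} per day")
```

Whether a given ratio is acceptable depends on the workload; the sketch only puts the raw counts on a comparable scale.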

vmstat -s # Writes to standard output the contents of the sum structure, which contains an absolute count of paging events since system initialization.

address translation faults
Incremented for each occurrence of an address translation page fault. I/O may or may not be required to
resolve the page fault. Storage protection page faults (lock misses) are not included in this count.
page ins
Incremented for each page read in by the virtual memory manager. The count is incremented for page ins
from page space and file space. Along with the page out statistic, this represents the total amount of
real I/O initiated by the virtual memory manager.
page outs
Incremented for each page written out by the virtual memory manager. The count is incremented for
page outs to page space and for page outs to file space. Along with the page in statistic, this represents
the total amount of real I/O initiated by the virtual memory manager.
paging space page ins
Incremented for VMM initiated page ins from paging space only.
paging space page outs
Incremented for VMM initiated page outs to paging space only.

pages examined by the clock
VMM uses a clock algorithm to implement a pseudo least recently used (LRU) page replacement scheme.
Pages are aged by being examined by the clock. This count is incremented for each page examined by the clock.
revolutions of the clock hand
Incremented for each VMM clock revolution (that is, after each complete scan of memory).
pages freed by the clock
Incremented for each page the clock algorithm selects to free from real memory.

[ Continued ]


vmstat -s # [continued] Writes to standard output the contents of the sum structure, which contains an absolute count of paging events since system initialization.

backtracks
Incremented for each page fault that occurs while resolving a previous page fault. (The new page fault must be resolved
first and then initial page faults can be backtracked.)
free frame waits
Incremented each time a process requests a page frame, the free list is empty, and the process is forced to wait while
the free list is replenished.
extend XPT waits
Incremented each time a process is made to wait by VMM due to a commit in progress for the segment being accessed.
pending I/O waits
Incremented each time a process is made to wait by VMM for a page-in I/O to complete.
start I/Os
Incremented for each read or write I/O request initiated by VMM.
iodones
Incremented at the completion of each VMM I/O request.
CPU context switches
Incremented for each processor context switch (dispatch of a new process).
device interrupts
Incremented on each hardware interrupt.
software interrupts
Incremented on each software interrupt. A software interrupt is a machine instruction similar to a hardware interrupt that
saves some state and branches to a service routine. System calls are implemented with software interrupt instructions
that branch to the system call handler routine.
decrementer interrupts
Incremented on each decrementer interrupt.


uptime; vmstat -v # Review accumulated count-of-events over days-uptime


10:41AM up 476 days, 13:43, 3 users, load average: 2.83, 3.27, 3.21
5095423 memory pages
4856881 lruable pages
1521046 free pages
2 memory pools
463852 pinned pages
80.0 maxpin percentage
5.0 minperm percentage
80.0 maxperm percentage
18.5 numperm percentage
901533 file pages
0.0 compressed percentage
0 compressed pages
18.2 numclient percentage
80.0 maxclient percentage
887137 client pages
0 remote pageouts scheduled
2910 pending disk I/Os blocked with no pbuf
0 paging space I/Os blocked with no psbuf
132276 filesystem I/Os blocked with no fsbuf
368343 client filesystem I/Os blocked with no fsbuf
2066915 external pager filesystem I/Os blocked with no fsbuf
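
Two quick readings of this output are the page counts converted to GB (for comparison against vmstat:memory:avm) and the "blocked with no ...buf" counters scaled per day of uptime. The following is a minimal Python sketch, assuming 4 KB pages and the "value description" line layout of vmstat -v shown above; the function name is hypothetical, the uptime is rounded from the uptime line, and only a subset of lines is included in the sample.

```python
# Sketch: convert `vmstat -v` page counts to GB and blocked-I/O counts to per-day rates.
PAGE_BYTES = 4096  # vmstat -v reports memory in 4 KB pages

def parse_vmstat_v(text):
    """Map each description to its numeric value."""
    stats = {}
    for line in text.splitlines():
        parts = line.split(None, 1)
        if len(parts) == 2:
            try:
                stats[parts[1].strip()] = float(parts[0])
            except ValueError:
                pass  # not a "value description" line
    return stats

SAMPLE = """\
5095423 memory pages
4856881 lruable pages
1521046 free pages
18.5 numperm percentage
2910 pending disk I/Os blocked with no pbuf
132276 filesystem I/Os blocked with no fsbuf
2066915 external pager filesystem I/Os blocked with no fsbuf
"""
UPTIME_DAYS = 476.57  # from "up 476 days, 13:43"

s = parse_vmstat_v(SAMPLE)
lruable_gb = s["lruable pages"] * PAGE_BYTES / 2**30
free_pct = 100 * s["free pages"] / s["memory pages"]
blocked_per_day = {k: v / UPTIME_DAYS for k, v in s.items() if "blocked" in k}

print(f"lruable memory: {lruable_gb:.2f} GB, free: {free_pct:.1f}% of memory pages")
for name, rate in blocked_per_day.items():
    print(f"{rate:10.1f} per day  {name}")
```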


vmstat -v # Writes to standard output various statistics maintained by the Virtual Memory Manager. The -v flag can be combined only with the -s flag.
memory pages
Size of real memory in number of 4 KB pages.
lruable pages
Number of 4 KB pages considered for replacement. This number excludes the pages used for VMM internal pages, and
the pages used for the pinned part of the kernel text.
free pages
Number of free 4 KB pages.
memory pools
Tuning parameter (managed using vmo) specifying the number of memory pools.
pinned pages
Number of pinned 4 KB pages.
maxpin percentage
Tuning parameter (managed using vmo) specifying the percentage of real memory which can be pinned.
minperm percentage
Tuning parameter (managed using vmo) in percentage of real memory. This specifies the point below which file pages
are protected from the re-page algorithm.
maxperm percentage
Tuning parameter (managed using vmo) in percentage of real memory. This specifies the point above which the page
stealing algorithm steals only file pages.
numperm percentage
Percentage of memory currently used by the file cache.

file pages
Number of 4 KB pages currently used by the file cache.

numclient percentage
Percentage of memory occupied by client pages.
maxclient percentage
Tuning parameter (managed using vmo) specifying the maximum percentage of memory which can be used for
client pages.
client pages
Number of client pages.

pending disk I/Os blocked with no pbuf
Number of pending disk I/O requests blocked because no pbuf was available. Pbufs are pinned memory buffers
used to hold I/O requests at the logical volume manager layer.
paging space I/Os blocked with no psbuf
Number of paging space I/O requests blocked because no psbuf was available. Psbufs are pinned memory buffers
used to hold I/O requests at the virtual memory manager layer.
filesystem I/Os blocked with no fsbuf
Number of filesystem I/O requests blocked because no fsbuf was available. Fsbufs are pinned memory buffers
used to hold I/O requests in the filesystem layer.
client filesystem I/Os blocked with no fsbuf
Number of client filesystem I/O requests blocked because no fsbuf was available. NFS (Network File System) and
VxFS (Veritas) are client filesystems. Fsbufs are pinned memory buffers used to hold I/O requests in the filesystem layer.
external pager filesystem I/Os blocked with no fsbuf
Number of external pager client filesystem I/O requests blocked because no fsbuf was available. JFS2 is an external
pager client filesystem. Fsbufs are pinned memory buffers used to hold I/O requests in the filesystem layer.


ps -ekf # cumulative since last boot; compare CPU-time of key processes


$ uptime ; ps -ekf | grep -v grep | egrep "syncd|lrud|nfsd|biod|wait"
10:58AM up 133 days, 13:45, 3 users, load average: 19.01, 19.32, 19.28
root 8196 0 0 Oct 20 - 2218:47 wait
root 16392 0 9 Oct 20 - 17178:29 lrud
root 53274 0 0 Oct 20 - 8461:10 wait
root 57372 0 0 Oct 20 - 1335:46 wait
root 61470 0 0 Oct 20 - 6073:45 wait
root 65568 0 0 Oct 20 - 1374:15 wait
root 69666 0 0 Oct 20 - 5560:02 wait
root 73764 0 0 Oct 20 - 1331:34 wait
root 77862 0 0 Oct 20 - 4979:33 wait
root 81960 0 0 Oct 20 - 1260:27 wait
root 86058 0 0 Oct 20 - 4369:40 wait
root 90156 0 0 Oct 20 - 965:50 wait
root 94254 0 0 Oct 20 - 3300:50 wait
root 98352 0 0 Oct 20 - 826:15 wait
root 102450 0 0 Oct 20 - 2657:03 wait
root 106548 0 0 Oct 20 - 681:51 wait
root 110646 0 0 Oct 20 - 2107:25 wait
root 114744 0 0 Oct 20 - 581:12 wait
root 118842 0 0 Oct 20 - 1709:27 wait
root 122940 0 0 Oct 20 - 385:38 wait
root 127038 0 0 Oct 20 - 1047:36 wait
root 131136 0 0 Oct 20 - 294:26 wait
root 135234 0 0 Oct 20 - 731:46 wait
root 139332 0 0 Oct 20 - 185:51 wait
root 143430 0 0 Oct 20 - 382:23 wait
root 266430 1 0 Oct 20 - 1008:28 /usr/sbin/syncd 60
root 1429696 0 0 Oct 20 - 0:25 kbiod
root 1540176 1437714 0 Oct 20 - 0:00 /usr/sbin/nfsd 3891
root 1646808 1 0 Oct 20 - 506:30 nfsd
root 1728526 1437714 0 Oct 20 - 0:00 /usr/sbin/biod 6
$ uptime ; ps -ekf | grep -v grep | egrep "syncd|lrud|nfsd|biod|wait" | grep -c wait
10:58AM up 133 days, 13:45, 3 users, load average: 19.02, 19.32, 19.28
24
$
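Comparing CPU time of key processes by eye gets tedious with 24 wait processes. A rough sketch for totalling the TIME column per command name follows; it assumes TIME is printed as minutes:seconds as in the listing above, and that the command name is the field right after TIME. The function name `cpu_seconds` is hypothetical.

```sh
# Hypothetical helper: total CPU seconds for one command name.
# Assumed usage on AIX: ps -ekf | cpu_seconds lrud
cpu_seconds() {
  awk -v name="$1" '
    {
      # STIME may be one or two fields, so search for the mm:ss field.
      for (i = 5; i < NF; i++) {
        if ($i ~ /^[0-9]+:[0-9][0-9]$/) {
          cmd = $(i + 1)
          sub(/^.*\//, "", cmd)        # strip any leading path
          if (cmd == name) {
            split($i, t, ":")
            total += t[1] * 60 + t[2]
          }
          break
        }
      }
    }
    END { printf "%d\n", total }
  '
}
```

For the lrud line above (17178:29) this gives 1030709 seconds, which can then be set against the 133 days of uptime.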

iostat -a # cumulative since last boot; create a complete tech-stack map, i.e. RAIDset/tech->LUN->[LVMvg:lv::JFS2mtpt w/options]->logical_content
System configuration: lcpu=24 drives=48 ent=1.20 paths=268 vdisks=0 tapes=38

tty: tin tout avg-cpu: % user % sys % idle % iowait physc % entc
0.1 35.5 65.7 31.4 1.0 1.9 0.2 17.7

Adapter: Kbps tps Kb_read Kb_wrtn


sisscsia0 2444.4 35.4 25807797 16239911633

Disks: % tm_act Kbps tps Kb_read Kb_wrtn


hdisk1 3.9 1220.1 17.6 789737 8118447840
hdisk0 4.2 1224.2 17.8 25018060 8121463793

Adapter: Kbps tps Kb_read Kb_wrtn


fcs1 26782.8 1440.5 149036558282 29184434960

Disks: % tm_act Kbps tps Kb_read Kb_wrtn


hdisk8 2.2 607.3 1.5 2164957533 1876147460
hdisk12 2.4 607.7 1.3 2065741964 1978282924
hdisk13 2.2 582.8 1.3 2002751079 1875515764
hdisk11 2.1 593.4 1.3 2073048903 1875758716
hdisk4 1.9 216.3 23.9 812230724 626802460
hdisk15 0.0 2.2 0.6 25584 14666516

Adapter: Kbps tps Kb_read Kb_wrtn
fcs3 26885.6 1444.5 149683495110 29221969816

Disks: % tm_act Kbps tps Kb_read Kb_wrtn


hdisk8 2.1 622.7 1.5 2176121655 1967240656
hdisk12 2.5 592.0 1.3 2071882880 1867155292
hdisk13 2.2 597.8 1.3 2009693373 1967933716
hdisk11 2.1 608.0 1.3 2077880225 1967693560
hdisk4 1.9 216.7 23.9 813889906 627809836
hdisk15 0.0 2.2 0.6 27670 14749344
…

iostat -a # cumulative since last boot; mapping & comparing hdisk stats can be useful in characterizing performance-related I/O patterns & trends
Disks: % tm_act Kbps tps Kb_read Kb_wrtn
hdisk8 2.2 607.3 1.5 2164957533 1876147460
hdisk12 2.4 607.7 1.3 2065741964 1978282924
hdisk13 2.2 582.8 1.3 2002751079 1875515764
hdisk11 2.1 593.4 1.3 2073048903 1875758716
hdisk4 1.9 216.3 23.9 812230724 626802460
hdisk15 0.0 2.2 0.6 25584 14666516
hdisk16 11.5 178.7 23.6 1169343088 19983468
hdisk14 0.0 1.3 0.0 8828331 0
hdisk10 6.3 548.9 7.8 3617545529 35278292
hdisk17 0.0 0.0 0.0 8560 0
hdisk18 0.0 3.0 0.1 9741142 10386688
hdisk7 0.4 53.3 7.6 272419695 82268236
hdisk5 0.6 59.4 6.5 225752039 169601848
hdisk6 2.3 624.3 1.5 2175672098 1978387280
hdisk9 2.3 613.6 1.4 2104140790 1978677528
hdisk20 8.1 228.3 35.1 1511885833 7496668
hdisk21 1.6 99.7 24.8 16230194 647254280
hdisk22 15.9 845.2 58.4 5592808968 31384956
hdisk23 3.5 364.2 60.4 1627955714 795383552
hdisk25 20.3 740.1 36.5 4725304221 199399144
hdisk27 20.1 1015.2 45.8 6675326923 80385252
hdisk26 41.3 2934.5 118.0 18806493859 720917972
hdisk29 19.2 949.4 55.7 6113262738 204348212
hdisk24 25.8 1867.8 59.4 12330198268 98946776
hdisk30 16.9 515.9 38.4 3271643603 161247332
hdisk32 5.1 555.4 34.2 888509245 2807296084
hdisk31 11.7 483.8 71.1 3111959749 107262760
hdisk33 47.2 2760.0 153.7 18308894936 56985180
hdisk36 2.4 597.9 1.3 2103249842 1875221640
hdisk35 2.8 616.8 1.4 2126412342 1977828244


iostat -D # cumulative since last boot; mapping & comparing hdisk stats is useful in characterizing performance-related I/O patterns & trends
$ iostat -D
System configuration: lcpu=24 drives=87 paths=172 vdisks=0

hdisk0 xfer: %tm_act bps tps bread bwrtn


0.8 18.7K 2.3 7.0K 11.7K
read: rps avgserv minserv maxserv timeouts fails
0.6 3.0 0.1 267.1 0 0
write: wps avgserv minserv maxserv timeouts fails
1.7 5.5 0.3 320.5 0 0
queue: avgtime mintime maxtime avgwqsz avgsqsz sqfull
8.8 0.0 291.3 0.0 0.0 6349911
hdisk1 xfer: %tm_act bps tps bread bwrtn
0.6 12.9K 1.7 1.2K 11.7K
read: rps avgserv minserv maxserv timeouts fails
0.0 4.8 0.1 301.8 0 0
write: wps avgserv minserv maxserv timeouts fails
1.7 5.4 0.4 281.1 0 0
queue: avgtime mintime maxtime avgwqsz avgsqsz sqfull
11.3 0.0 275.6 0.0 0.0 6102418

hdisk86 xfer: %tm_act bps tps bread bwrtn
10.2 789.3K 33.1 753.9K 35.4K
read: rps avgserv minserv maxserv timeouts fails
30.6 6.5 0.1 1.3S 0 0
write: wps avgserv minserv maxserv timeouts fails
2.5 2.5 0.2 912.0 0 0
queue: avgtime mintime maxtime avgwqsz avgsqsz sqfull
4.3 0.0 1.1S 0.0 0.0 73320194
hdisk87 xfer: %tm_act bps tps bread bwrtn
10.1 801.6K 33.7 764.2K 37.4K
read: rps avgserv minserv maxserv timeouts fails
31.2 6.3 0.1 1.2S 0 0
write: wps avgserv minserv maxserv timeouts fails
2.5 2.5 0.2 913.1 0 0
queue: avgtime mintime maxtime avgwqsz avgsqsz sqfull
4.3 0.0 1.2S 0.0 0.0 74160810


lvmo -a -v $VG # cumulative since last boot; illustrates per-LUN pbuf allocation and exhaustion of an LVM:volume-group's pooling of assigned-LUN pbufs

$ vmstat -v | grep pbuf


176559 pending disk I/Os blocked with no pbuf
$
$ for VG in `lsvg`
do
lvmo -a -v $VG
echo
done

vgname = rootvg
pv_pbuf_count = 512
total_vg_pbufs = 1024
max_vg_pbuf_count = 16384
pervg_blocked_io_count = 0
pv_min_pbuf = 512
global_blocked_io_count = 176559

vgname = oravg
pv_pbuf_count = 512
total_vg_pbufs = 8192
max_vg_pbuf_count = 16384
pervg_blocked_io_count = 176559
pv_min_pbuf = 512
global_blocked_io_count = 176559
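With many volume groups, the one responsible for the global blocked count can be picked out mechanically. A small sketch, assuming the `name = value` layout shown above; `blocked_vgs` is a hypothetical name:

```sh
# Hypothetical helper: list volume groups with a non-zero pervg_blocked_io_count.
# Assumed usage on AIX:
#   for VG in `lsvg`; do lvmo -a -v $VG; done | blocked_vgs
blocked_vgs() {
  awk '
    $1 == "vgname"                           { vg = $3 }
    $1 == "pervg_blocked_io_count" && $3 > 0 { print vg, $3 }
  '
}
```

Fed the output above, this prints only `oravg 176559`, which matches the global count and shows that all pbuf-blocked I/Os came from that volume group.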



netstat -ss # cumulative since last boot; displays only non-zero values (this is an undocumented command-option of netstat)

$ netstat -ss
ip:
343290037 total packets received
257309 fragments received
5 fragments dropped after timeout
128641 packets reassembled ok

337255449 packets for this host


5871563 packets for unknown/unsupported protocol
396 packets not forwardable
3083156934 packets sent from this host
71209 output datagrams fragmented
149137 fragments created
2 packets dropped due to the full socket receive buffer
2115340 dead gateway detection packets sent
icmp:
5884385 calls to icmp_error
Output histogram:
echo reply: 1792801
destination unreachable: 5884312
707 messages with bad code fields
1 bad checksum
Input histogram:
echo reply: 2118565
destination unreachable: 5871963
echo: 1792801
1792801 message responses generated
igmp:
6 messages received
6 membership reports received
6 membership reports received for groups to which we belong
4 membership reports sent
tcp:
2416410139 packets sent
2830792567 data packets (3879350818 bytes)
14816541 data packets (1123365316 bytes) retransmitted
614103879 ack-only packets (290291730 delayed)
19 URG only packets
56895 window probe packets
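One figure that can be derived from the tcp section above is the share of data packets that were retransmitted. The sketch below assumes the two "data packets" lines appear in the wording shown and uses only the first occurrence of each; `tcp_retrans_pct` is a hypothetical name.

```sh
# Hypothetical helper: retransmitted data packets as a percentage of data packets.
# Assumed usage on AIX: netstat -ss | tcp_retrans_pct
tcp_retrans_pct() {
  awk '
    /data packets .*retransmitted/ { if (retrans == "") retrans = $1; next }
    /data packets/                 { if (data == "")    data = $1 }
    END {
      if (data > 0) printf "%.2f\n", 100 * retrans / data
      else          print "n/a"
    }
  '
}
```

For the counters above (14816541 retransmitted of 2830792567 data packets) this comes to roughly 0.52 percent since boot.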



netstat -v # cumulative since last boot; verbose display of netstat statistics

$ netstat -v


-------------------------------------------------------------
ETHERNET STATISTICS (ent7) :
Device Type: EtherChannel
Hardware Address: 00:14:5e:48:c1:cc
Elapsed Time: 134 days 0 hours 21 minutes 27 seconds

Transmit Statistics: Receive Statistics:


-------------------- -------------------
Packets: 34718204002 Packets: 44862435158
Bytes: 48050223811333 Bytes: 9016251735773
Interrupts: 0 Interrupts: 27620561406
Transmit Errors: 0 Receive Errors: 0
Packets Dropped: 2 Packets Dropped: 0
Bad Packets: 0
Max Packets on S/W Transmit Queue: 227
S/W Transmit Queue Overflow: 0
Current S/W+H/W Transmit Queue Length: 2

Elapsed Time: 0 days 0 hours 0 minutes 0 seconds


Broadcast Packets: 14384 Broadcast Packets: 627555
Multicast Packets: 1 Multicast Packets: 3
No Carrier Sense: 0 CRC Errors: 0
DMA Underrun: 0 DMA Overrun: 0
Lost CTS Errors: 0 Alignment Errors: 0
Max Collision Errors: 0 No Resource Errors: 0
Late Collision Errors: 0 Receive Collision Errors: 0
Deferred: 0 Packet Too Short Errors: 0
SQE Test: 0 Packet Too Long Errors: 0
Timeout Errors: 0 Packets Discarded by Adapter: 0
Single Collision Count: 0 Receiver Start Count: 0
Multiple Collision Count: 0
Current HW Transmit Queue Length: 2

General Statistics:
-------------------
No mbuf Errors: 0
Adapter Reset Count: 0
Adapter Data Rate: 2000



nfsstat # cumulative since last boot; displays server and client NFS statistics
$ nfsstat

Server rpc:
Connection oriented
calls badcalls nullrecv badlen xdrcall dupchecks dupreqs
715102799 128 0 0 0 58378077 20
Connectionless
calls badcalls nullrecv badlen xdrcall dupchecks dupreqs
536 0 0 0 0 0 0

Server nfs:
calls badcalls public_v2 public_v3
715102602 27 0 0
Version 2: (256 calls)
null getattr setattr root lookup readlink read
256 100% 0 0% 0 0% 0 0% 0 0% 0 0% 0 0%
wrcache write create remove rename link symlink
0 0% 0 0% 0 0% 0 0% 0 0% 0 0% 0 0%
mkdir rmdir readdir statfs
0 0% 0 0% 0 0% 0 0%
Version 3: (715102346 calls)
null getattr setattr lookup access readlink read
208 0% 56755889 7% 1397024 0% 430178070 60% 4733070 0% 0 0% 55037597 7%
write create mkdir symlink mknod remove rmdir
43093505 6% 2409219 0% 7 0% 0 0% 0 0% 2329235 0% 14 0%
rename link readdir readdir+ fsstat fsinfo pathconf
110306 0% 63 0% 17771684 2% 9038704 1% 65544670 9% 206 0% 95 0%
commit
26702780 3%

Client rpc:
Connection oriented
calls badcalls badxids timeouts newcreds badverfs timers
272059 0 0 0 0 0 0
nomem cantconn interrupts
0 0 0
Connectionless
calls badcalls retrans badxids timeouts newcreds badverfs
79 0 0 0 0 0 0
timers nomem cantsend
0 0 0

…



topas<cr>d n c c # vastly improved instrumentation and functionality

Topas Monitor for host: sapdb1 EVENTS/QUEUES FILE/TTY


Sun Mar 2 11:14:49 2008 Interval: 2 Cswitch 5916 Readch 0.0G
Syscall 24150 Writech 1043.9K
CPU User% Kern% Wait% Idle% Physc Reads 5587 Rawin 0
cpu4 99.9 0.1 0.0 0.0 0.52 Writes 1213 Ttyout 686
cpu5 99.9 0.1 0.0 0.0 0.48 Forks 0 Igets 0
cpu9 99.8 0.2 0.0 0.0 0.72 Execs 0 Namei 46
cpu6 99.8 0.2 0.0 0.0 0.69 Runqueue 20.0 Dirblk 0
cpu11 99.8 0.2 0.0 0.0 0.69 Waitqueue 2.0
cpu15 99.8 0.2 0.0 0.0 0.76
cpu17 99.8 0.2 0.0 0.0 0.48 PAGING MEMORY
cpu20 99.6 0.4 0.0 0.0 0.51 Faults 854 Real,MB 73728
cpu13 99.5 0.3 0.1 0.1 0.62 Steals 1986 % Comp 59.2
cpu16 99.5 0.5 0.0 0.0 0.52 PgspIn 0 % Noncomp 41.4
cpu21 99.5 0.4 0.2 0.0 0.49 PgspOut 0 % Client 41.4
cpu22 99.1 0.4 0.1 0.4 0.59 PageIn 1369
cpu1 98.3 0.8 0.4 0.4 0.64 PageOut 167 PAGING SPACE
cpu23 97.0 0.7 0.9 1.5 0.40 Sios 1436 Size,MB 130944
cpu2 93.0 6.0 0.2 0.8 0.53 % Used 0.7
cpu18 93.0 4.9 0.9 1.2 0.50 NFS (calls/sec) % Free 99.2
cpu3 92.8 2.0 2.4 2.8 0.41 ServerV2 0
cpu7 92.4 2.3 2.5 2.8 0.31 ClientV2 0 Press:
cpu10 91.4 2.2 3.1 3.2 0.31 ServerV3 11 "h" for help
cpu0 90.7 3.4 3.2 2.8 0.32 ClientV3 0 "q" to quit
cpu19 90.3 5.2 2.2 2.3 0.40
cpu8 89.0 2.0 4.4 4.6 0.28
cpu12 87.8 7.7 2.0 2.5 0.38
cpu14 83.9 3.7 5.0 7.4 0.24

Name PID CPU% PgSp Owner


oracle 2424844 6.4 18.1 orapr1
oracle 1474750 6.3 8.9 orapr1
oracle 3256398 5.8 9.4 orapr1
topas 33759394 1.3 40.1 eyj1
oracle 4227258 5.7 11.1 orapr1
oracle 33087570 5.7 8.9 orapr1
oracle 2007246 5.7 8.9 orapr1
oracle 33861652 5.6 12.3 orapr1


ps -kelmo THREAD # demonstrates the reality of “threadedness”


USER PID PPID TID ST CP PRI SC WCHAN F TT BND COMMAND
root 1 0 - A 0 60 1 - 200003 - - /etc/init
- - - 4099 S 0 60 1 - 410410 - - -
oracle 98420 1 - A 0 60 2 - 240001 - - ora_diag_prod1
- - - 1769601 S 0 60 1 - 418400 - - -
- - - 1835177 S 0 60 1 - 18400 - - -
patrol 102592 1 - A 0 60 1 f100020005532200 40001 - - ds_listener -port=50005 -hosts= -period=00
- - - 614449 S 0 60 1 f100020005532200 400410 - - -
root 114914 331778 - A 0 60 11 * 240001 - - /usr/sbin/gsclvmd -r 30 -i 300 -t 50 -c 000256aa00
- - - 520247 S 0 60 1 f100070f10007f40 8c10400 - - -
- - - 880849 S 0 60 1 f100070f1000d740 8430410 - - -
- - - 950487 S 0 60 1 f100070f1000e840 8c10400 - - -
- - - 970981 S 0 60 1 - 418410 - - -
- - - 1704015 S 0 60 1 - 400410 - - -
- - - 1708105 S 0 60 1 - 400410 - - -
- - - 1712199 S 0 60 1 - 2400400 - - -
- - - 1740891 S 0 60 1 - 418410 - - -
- - - 2019533 S 0 60 1 f100070f1001ed40 8c10400 - - -
- - - 5501175 S 0 60 1 f100070f10053f40 8c10400 - - -
- - - 6156443 S 0 60 1 f100070f1005df40 8c10400 - - -
root 118892 1 - A 0 60 1 - 240001 - - /usr/sbin/portmap
- - - 438365 S 0 60 1 - 418400 - - -
root 123042 1 - A 0 39 7 * 340001 - - /usr/sbin/rsct/bin/rmcd -a IBM.LPCommands -r
- - - 581707 Z 0 60 1 - c00001 - - -
- - - 598055 Z 0 60 1 - c00001 - - -
- - - 610361 Z 0 60 1 - c00001 - - -
- - - 622649 S 0 60 1 f100070f10009840 8410400 - - -
- - - 655433 S 0 60 1 f100070f1000a040 8410400 - - -
- - - 663635 Z 0 60 1 - c00001 - - -
- - - 684137 S 0 39 1 - 418400 - - -
root 127114 3162294 - A 0 60 1 - 240401 - - /usr/sbin/srcd
- - - 450593 S 0 60 1 - 10400 - - -


ps -el; ps -elmo THREAD; ps -kl; ps -klmo THREAD # Shows thread counts of kernel vs. user processes for a sense of “threadedness”

$ ps -el | wc
857 11953 81517 # 857 user procs (one line of column header)
$ ps -elmo THREAD | wc
2106 28273 199962 # 2106 - 857 = 1249 threads(user)
$ ps -kl | wc
46 617 4099 # 46 kernel procs (one line of column header)
$ ps -klmo THREAD | wc
215 2795 17682 # 215 - 46 = 169 threads(kernel)

# 1249 + 169 = 1418 total threads
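The subtraction works because each `THREAD` listing prints one line per process plus one line per thread, and both listings carry the same single header line. A minimal sketch of the arithmetic, with a hypothetical function name and the line counts from above passed in by hand:

```sh
# Hypothetical helper: thread count = THREAD-listing lines minus process-listing lines.
# $1 = line count of "ps -el" (or "ps -kl")
# $2 = line count of "ps -elmo THREAD" (or "ps -klmo THREAD")
thread_count() {
  echo $(( $2 - $1 ))
}

user=$(thread_count 857 2106)     # user processes
kernel=$(thread_count 46 215)     # kernel processes
echo "user=$user kernel=$kernel total=$(( user + kernel ))"
```

On a live system the arguments would come from `wc -l` on each of the four `ps` commands.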


ps guww # descending %CPU, SZ:RSS, STIME:TIME, full-syntax; a useful hunter-seeker of “run-away” processes (read: CPU-time “hogs”)
USER PID %CPU %MEM SZ RSS TTY STAT STIME TIME COMMAND
udb02p 18284686 0.6 0.0 22072 15620 - A 09:26:53 52:05 db2agent (CICP75) 0
udb02p 19640470 0.3 0.0 11836 1800 - A 15:41:30 0:01 db2agent (instance) 0
udb02p 15945764 0.3 0.0 8296 6772 - A May 24 266:05 db2agent (CICP75) 0
udb02p 26599546 0.2 0.0 11948 4920 - A 14:57:38 1:49 db2agent (instance) 0
root 2494534 0.1 0.0 7404 7416 - A May 24 132:09 db2ckpwd 0
root 2023454 0.1 0.0 7408 7420 - A May 24 131:32 db2ckpwd 0
root 15098066 0.1 0.0 7408 7420 - A May 24 131:32 db2ckpwd 0
root 19169494 0.1 0.0 126912 128960 - A Apr 30 824:12 /usr/tivoli/tsm/client/ba/bin/dsmc sched -se=AIXDB
udb02p 18792502 0.1 0.0 18296 14980 - A May 26 29:31 db2agent (CICP75) 0
udb02p 11047078 0.1 0.0 16768 13772 - A May 26 27:20 db2agent (CICP75) 0
maestro 14848224 0.1 0.0 2504 2192 - A 14:02:48 1:35 /appl/maestro/maestro/bin/batchman -parm 32000
udb02p 17072266 0.1 0.0 20788 15040 - A 08:58:06 6:17 db2agent (CICP75) 0
udb02p 27947030 0.1 0.0 18564 15952 - A May 26 29:05 db2agent (CICP75) 0
udb02p 27500798 0.1 0.0 25264 17660 - A 13:10:28 2:14 db2agent (CICP75) 0
udb02p 18423896 0.1 0.0 17808 15900 - A May 25 46:34 db2agent (CICP75) 0
udb02p 27803760 0.1 0.0 8752 6524 - A May 26 26:16 db2agent (CICP75) 0
udb02p 10149890 0.1 0.0 19024 15708 - A 07:59:23 6:15 db2agent (CICP75) 0
udb02p 987158 0.1 0.0 17004 6072 - A 15:40:31 0:01 db2agent (CICP75) 0
udb02p 12279828 0.1 0.0 19468 15768 - A May 26 19:43 db2agent (CICP75) 0
udb02p 31576232 0.1 0.0 12556 9176 - A 08:19:57 5:32 db2agent (CICP75) 0
udb02p 26505352 0.1 0.0 28652 24120 - A 08:02:33 5:43 db2agent (CICP75) 0
root 8196 0.1 0.0 384 384 - A Mar 11 1355:41 wait
udb02p 17924100 0.0 0.0 14764 8888 - A 09:23:38 4:30 db2agent (CICP75) 0
udb02p 21852330 0.0 0.0 20252 15208 - A 10:56:48 3:22 db2agent (CICP75) 0
udb02p 25665600 0.0 0.0 21004 14808 - A 11:25:53 2:48 db2agent (CICP75) 0
udb02p 12111920 0.0 0.0 8772 6544 - A May 26 19:05 db2agent (CICP75) 0
udb02p 22696146 0.0 0.0 15376 10332 - A 10:59:24 2:58 db2agent (CICP75) 0
udb02p 18362514 0.0 0.0 14688 5612 - A 15:40:09 0:01 db2agent (idle) 0
udb02p 16011488 0.0 0.0 19588 15056 - A 08:17:00 4:29 db2agent (CICP75) 0
udb02p 5300390 0.0 0.0 14012 10248 - A 19:00:14 12:27 db2agent (CICP75) 0
udb02p 11256034 0.0 0.0 19636 14848 - A 07:53:48 4:39 db2agent (CICP75) 0
udb02p 17399988 0.0 0.0 21036 14520 - A 09:47:43 3:27 db2agent (CICP75) 0

ps gvww # ascending PID, SIZE:RSS, TIME:PGIN, %CPU, full-syntax


PGIN (v flag): the number of disk I/Os resulting from references by the process to pages not loaded in core

PID TTY STAT TIME PGIN SIZE RSS LIM TSIZ TRS %CPU %MEM COMMAND
0 - A 359:49 7 384 384 xx 0 0 0.0 0.0 swapper
1 - A 2:58 112 688 736 32768 32 48 0.0 0.0 /etc/init
8196 - A 1355:41 0 384 384 xx 0 0 0.1 0.0 wait
12294 - A 4:11 0 448 448 xx 0 0 0.0 0.0 sched
16392 - A 0:50 0 1216 1216 xx 0 0 0.0 0.0 lrud
20490 - A 0:00 0 448 448 xx 0 0 0.0 0.0 vmptacrt

192694 - A 231:54 25933 492 500 xx 3 8 0.0 0.0 /usr/sbin/syncd 60

1921222 - A 1:12 0 448 448 xx 0 0 0.0 0.0 aioserver
1945664 - A 1:27 0 448 448 xx 0 0 0.0 0.0 aioserver
1954040 - A 0:01 6502 14184 7704 32768 39 112 0.0 0.0 db2agent (CICP75) 0
1958100 - A 0:00 0 2708 1924 32768 39 112 0.0 0.0 db2agent (CICP75) 0
1962154 - A 0:00 0 2804 1700 32768 39 112 0.0 0.0 db2agent (CICP75) 0
1982616 - A 0:03 7630 21912 14920 32768 39 112 0.0 0.0 db2agent (CICP75) 0
1990804 - A 0:09 35673 20560 15488 32768 39 112 0.0 0.0 db2agent (CICP75) 0
1998936 - A 0:21 18151 11576 7656 32768 39 112 0.0 0.0 db2agent (CICP75) 0
2003126 - A 11:16 0 384 384 xx 0 0 0.0 0.0 wait
2007224 - A 0:07 20302 23108 15988 32768 39 112 0.0 0.0 db2agent (CICP75) 0
2023454 - A 131:32 37058 7372 7420 32768 39 112 0.1 0.0 db2ckpwd 0
2027608 - A 0:08 25717 13248 7792 32768 39 112 0.0 0.0 db2agent (CICP75) 0
2031778 - A 0:01 4392 17400 8296 32768 39 112 0.0 0.0 db2agent (CICP75) 0
2056344 - A 1:22 0 448 448 xx 0 0 0.0 0.0 aioserver
2072818 - A 0:01 789 11080 7736 32768 39 112 0.0 0.0 db2agent (CICP75) 0
2089120 - A 0:10 35352 19668 14596 32768 39 112 0.0 0.0 db2agent (CICP75) 0
2105366 - A 0:04 16649 20856 15016 32768 39 112 0.0 0.0 db2agent (CICP75) 0
2117646 - A 0:02 3150 14996 6340 32768 39 112 0.0 0.0 db2agent (CICP75) 0
2126002 - A 0:00 456 11380 4772 32768 39 112 0.0 0.0 db2agent (CICP75) 0
2129984 - A 0:15 16343 13100 7196 32768 39 112 0.0 0.0 db2agent (CICP75) 0
2150502 - A 0:00 6 3892 2916 32768 39 112 0.0 0.0 db2agent (CICP75) 0
2158720 - A 0:02 4948 15320 7304 32768 39 112 0.0 0.0 db2agent (CICP75) 0


investigate by digging deeper # Seek out and discern anomalies and other atypical events/situations; routinely monitor ps command output for unusual events

$ ps -ef | grep -v "Oct 20"


UID PID PPID C STIME TTY TIME CMD
orapr1 1474750 1 61 00:45:40 - 82:55 oraclePR1 (DESCRIPTION=(LOCAL=NO)(SDU=4096))
root 1515764 1 0 Feb 27 - 1:22 ./wrapper ./agent.conf wrapper.pidfile=../tmp/teagent.pid
orapr1 1556610 1 0 00:45:14 - 0:00 oraclePR1 (DESCRIPTION=(LOCAL=NO)(SDU=4096))
orapr1 1626200 1 0 00:45:40 - 0:00 oraclePR1 (DESCRIPTION=(LOCAL=NO)(SDU=4096))
orapr1 1765528 1 0 00:45:14 - 0:00 oraclePR1 (DESCRIPTION=(LOCAL=NO)(SDU=4096))
orapr1 1790174 1 0 00:45:40 - 0:00 oraclePR1 (DESCRIPTION=(LOCAL=NO)(SDU=4096))
pr1adm 1810644 3301460 0 00:44:18 - 1:17 dw.sapPR1_DVEBMGS00 pf=/usr/sap/PR1/SYS/profile/PR1_DVEBMGS00_sapdb1
orapr1 1818792 1 0 00:45:14 - 0:00 oraclePR1 (DESCRIPTION=(LOCAL=NO)(SDU=4096))
pr1adm 1835262 3301460 0 00:44:18 - 0:31 dw.sapPR1_DVEBMGS00 pf=/usr/sap/PR1/SYS/profile/PR1_DVEBMGS00_sapdb1
pr1adm 1867872 1 0 00:43:55 - 0:00 ora_reco_PR1
orapr1 1957988 1 59 00:45:40 - 83:56 oraclePR1 (DESCRIPTION=(LOCAL=NO)(SDU=4096))
orapr1 1970272 1 0 00:45:15 - 0:00 oraclePR1 (DESCRIPTION=(LOCAL=NO)(SDU=4096))
orapr1 1978526 1 0 00:45:21 - 0:00 oraclePR1 (DESCRIPTION=(LOCAL=NO)(SDU=4096))
orapr1 1986642 1 0 00:45:14 - 0:00 oraclePR1 (DESCRIPTION=(LOCAL=NO)(SDU=4096))
orapr1 1990892 1 0 00:45:14 - 0:00 oraclePR1 (DESCRIPTION=(LOCAL=NO)(SDU=4096))
orapr1 1994952 1 0 00:45:14 - 0:00 oraclePR1 (DESCRIPTION=(LOCAL=NO)(SDU=4096))
orapr1 1999052 1 0 00:45:40 - 0:00 oraclePR1 (DESCRIPTION=(LOCAL=NO)(SDU=4096))
orapr1 2007246 1 55 00:44:19 - 110:18 oraclePR1 (DESCRIPTION=(LOCAL=NO)(SDU=4096))
orapr1 2011256 1 0 00:45:15 - 0:00 oraclePR1 (DESCRIPTION=(LOCAL=NO)(SDU=4096))
orapr1 2015340 1 0 00:45:14 - 0:00 oraclePR1 (DESCRIPTION=(LOCAL=NO)(SDU=4096))
orapr1 2068642 1 0 00:45:14 - 0:00 oraclePR1 (DESCRIPTION=(LOCAL=NO)(SDU=4096))
orapr1 2158594 1 0 00:45:14 - 0:00 oraclePR1 (DESCRIPTION=(LOCAL=NO)(SDU=4096))
pr1adm 2162806 3301460 0 00:44:18 - 0:04 dw.sapPR1_DVEBMGS00 pf=/usr/sap/PR1/SYS/profile/PR1_DVEBMGS00_sapdb1
orapr1 2195510 1 0 00:45:40 - 0:00 oraclePR1 (DESCRIPTION=(LOCAL=NO)(SDU=4096))
orapr1 2203714 1 0 00:45:13 - 0:00 oraclePR1 (DESCRIPTION=(LOCAL=NO)(SDU=4096))
orapr1 2207996 1 0 00:45:14 - 0:00 oraclePR1 (DESCRIPTION=(LOCAL=NO)(SDU=4096))
orapr1 2220242 1 0 00:45:39 - 0:00 oraclePR1 (DESCRIPTION=(LOCAL=NO)(SDU=4096))
pr1adm 2232488 3301460 0 00:44:18 - 0:02 dw.sapPR1_DVEBMGS00 pf=/usr/sap/PR1/SYS/profile/PR1_DVEBMGS00_sapdb1


$ ps -ef | grep -c LOCAL=NO
507
$


vmstat -Iwt 2 # establish dynamic baselines of AIX behaviors

$ vmstat -Iwt 2
System configuration: lcpu=8 mem=15808MB

kthr memory page faults cpu time


----------- --------------------- ------------------------------------ ------------------ ----------- --------
r b p avm fre fi fo pi po fr sr in sy cs us sy id wa hr mi se
6 9 0 3622390 1865 1171 637 57 680 2228 5066 896 68533 7908 89 7 1 3 09:17:40
6 8 0 3627696 1764 847 3 57 808 3489 8207 812 48745 10969 73 6 6 15 09:17:42
2 13 0 3631998 2875 231 343 52 798 2903 6644 777 14471 2850 64 4 14 19 09:17:44
10 9 0 3636139 2053 994 880 129 527 2904 16706 1013 55414 7332 91 6 1 3 09:17:46
10 5 0 3619116 18481 1381 629 145 510 1228 19741 1098 53549 10147 91 7 1 2 09:17:48
11 2 0 3609866 21981 2570 893 244 0 0 0 1537 53911 13926 91 8 0 1 09:17:50
8 5 0 3610726 15307 2522 718 353 0 0 0 1522 49902 9863 89 7 1 3 09:17:52
6 3 0 3595648 28527 588 474 316 0 0 0 1006 100617 5395 84 6 4 5 09:17:54
10 5 0 3595806 25633 1101 564 273 0 0 0 1113 109611 7128 88 8 1 2 09:17:56
6 11 0 3601571 16601 1191 255 423 0 0 0 1216 140583 9472 87 8 1 4 09:17:58
8 12 0 3604703 11245 661 247 427 0 0 0 1041 118076 10307 89 8 0 3 09:18:00
6 14 0 3600579 12444 1035 293 424 0 0 0 1315 67304 18072 87 9 0 4 09:18:02
7 15 0 3600064 9638 1008 268 395 0 0 0 1292 66735 15921 82 9 1 9 09:18:04
6 12 0 3602133 4839 776 295 464 0 0 0 1050 103319 4426 80 6 2 12 09:18:06
5 13 0 3605240 2170 1025 266 307 279 1556 13844 1042 66953 3916 65 5 8 22 09:18:08
8 6 0 3606415 1945 1975 297 317 0 2752 37581 1440 70972 4870 87 7 1 5 09:18:10
10 4 0 3610938 2084 1366 164 234 0 3943 55882 1241 75037 8307 92 6 0 2 09:18:12
9 2 0 3594623 19789 1321 512 246 41 2271 19132 1343 70210 8794 90 7 1 2 09:18:15
8 4 0 3593551 18060 1188 1890 123 0 0 0 1491 58462 7443 76 8 6 10 09:18:17
5 8 0 3598228 9838 1499 5765 226 0 0 0 1502 39586 10136 73 11 6 10 09:18:19
kthr memory page faults cpu time
----------- --------------------- ------------------------------------ ------------------ ----------- --------
r b p avm fre fi fo pi po fr sr in sy cs us sy id wa hr mi se
7 2 0 3601032 4779 969 2464 103 0 0 0 1099 35643 12135 86 7 3 4 09:18:21
8 5 0 3604401 2167 1282 328 293 0 2070 10143 1168 54493 17465 89 10 0 1 09:18:23
10 5 0 3611569 2022 1012 38 282 9 4822 19771 1112 47936 11135 89 8 1 1 09:18:25
8 8 0 3614396 2000 741 7 339 56 2487 11217 1040 40025 8102 91 6 1 2 09:18:27
6 6 0 3617421 2014 404 8 267 303 2191 7130 733 98557 7510 77 5 7 11 09:18:29
8 8 0 3619413 2974 252 90 169 519 1899 5209 704 87714 3431 78 4 4 14 09:18:31
6 7 0 3620661 1965 488 201 293 255 922 2729 1094 39403 4851 89 5 2 4 09:18:33
7 7 0 3623343 2617 684 72 109 610 2468 8924 937 15606 16827 69 7 5 19 09:18:35
6 8 0 3624146 2831 443 7 272 332 1228 4517 739 15674 3277 82 4 4 10 09:18:37
6 7 0 3625514 1934 464 7 336 168 1048 3851 758 17909 4927 79 4 4 14 09:18:39
4 8 0 3608643 19107 618 29 393 115 1171 3803 889 28193 5547 73 5 4 18 09:18:41
6 8 0 3596033 29222 815 8 429 0 0 0 977 62182 10416 73 7 5 16 09:18:43
6 5 0 3598914 22745 1233 246 369 0 0 0 1116 77652 5335 74 7 7 12 09:18:45
5 3 0 3604955 12994 1151 372 375 0 0 0 1085 83391 6533 79 6 5 11 09:18:47
…

vmstat -I 2 # Best 6-in-1 monitor; no-load leave-it-up all-day VMM monitor

$ vmstat -I 2
System configuration: lcpu=24 mem=73728MB ent=12.00

kthr memory page faults cpu


-------- ----------- ------------------------ ------------ -----------------------
r b p avm fre fi fo pi po fr sr in sy cs us sy id wa pc ec
10 6 0 11507787 49543 3483 823 0 0 5306 5355 5662 45907 26258 33 4 53 10 4.71 39.3
10 7 0 11509796 47989 3824 594 0 0 4116 27771 6252 56592 45959 54 7 28 12 7.65 63.8
9 7 0 11510010 47770 3955 622 0 0 3977 9647 5907 56222 46833 48 7 31 15 6.87 57.2
8 6 0 11510560 50021 4164 2431 0 0 5564 40421 6607 51080 49691 41 7 39 13 6.08 50.7
6 6 0 11512741 46710 4886 1443 0 0 4443 4608 6110 42400 30394 36 5 42 17 5.28 44.0
5 7 0 11514081 48807 4675 227 0 0 6461 7028 4838 34521 11343 33 3 55 9 4.68 39.0
8 7 0 11515469 48531 5679 482 0 0 6445 6593 5686 44979 13230 37 3 48 12 4.99 41.5
9 7 0 11514065 49598 3858 1046 0 0 4128 4255 5807 51871 27521 44 6 32 19 6.27 52.3
8 7 0 11517672 49905 4848 632 0 0 7173 7221 5679 44566 47102 48 6 32 14 6.84 57.0
10 13 0 11520210 50148 4669 692 0 0 6313 6491 6341 47122 45622 52 5 28 15 7.22 60.2
7 9 0 11521192 48222 5087 814 0 0 5194 5790 6211 49553 44306 45 6 34 15 6.45 53.7
10 7 0 11521212 50922 3830 627 0 0 5330 5353 6248 48130 32364 47 4 37 12 6.42 53.5
4 7 0 11521503 49362 3475 573 0 0 3075 3102 5717 47907 13356 42 3 41 14 5.69 47.4
9 8 0 11523055 48731 3502 511 0 0 4143 4176 5884 44391 13427 46 2 41 11 6.01 50.1
10 8 0 11524140 50987 3483 761 0 0 5363 5683 5830 45416 15252 60 3 23 14 7.89 65.7
8 7 0 11524407 45661 3871 351 0 0 1488 1621 5378 34403 13034 54 2 29 15 7.14 59.5
6 7 0 11523652 50033 3325 355 0 0 5229 5448 5434 40780 14372 45 3 36 16 6.06 50.5
8 9 0 11525268 48536 4209 272 0 0 4102 4337 4599 36202 10449 44 4 35 18 6.05 50.4
9 9 0 11525476 48242 4322 521 0 0 4307 4634 5375 33863 13975 44 3 35 18 5.97 49.7
8 11 0 11526444 49830 4988 699 0 0 6351 6828 6743 53110 45620 46 6 32 16 6.63 55.3


vmstat -I 2 # Best 6-in-1 monitor; no-load leave-it-up all-day VMM monitor

 kthr
 Number of kernel threads in various queues averaged per second over the sampling
interval. The kthr columns are as follows:
– r
Average number of kernel threads that are runnable, which includes threads that are
running and threads that are waiting for the CPU. If this number is greater than the
number of CPUs, then there is at least one thread waiting for a CPU and the more
threads there are waiting for CPUs, the greater the likelihood of a performance impact.
– b
Average number of kernel threads in the VMM wait queue per second. This includes
threads that are waiting on filesystem I/O or threads that are blocking on a shared
resource, e.g. an inode lock.
– p
With vmstat -I, the number of threads waiting on I/O to raw devices per second. Threads
waiting on I/O to filesystems are not included here.
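A continuously high kthr:b relative to kthr:r is one of the scenarios named in the agenda. The sketch below counts how many samples in captured `vmstat -I` output have b greater than r; it assumes data rows start with a number and carry at least 17 columns, and `blocked_over_runnable` is a hypothetical name. What counts as "continuous" is left to the reader.

```sh
# Hypothetical helper: count vmstat -I samples where kthr:b exceeds kthr:r.
# Assumed usage on AIX: vmstat -I 2 30 | blocked_over_runnable
blocked_over_runnable() {
  awk '
    # Data rows begin with the numeric r column; header rows do not.
    $1 ~ /^[0-9]+$/ && NF >= 17 {
      samples++
      if ($2 + 0 > $1 + 0) flagged++
    }
    END { printf "%d of %d\n", flagged, samples }
  '
}
```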


vmstat -I 2 # Best 6-in-1 monitor; no-load leave-it-up all-day VMM monitor

• memory
• Provides information about real and virtual memory.
– avm
Active Virtual Memory: the number of active virtual memory
pages present at the time the vmstat sample was collected. It is the sum total of all
computational memory, including content paged out to paging space. The avm
statistic does not include file pages.

– fre
The average number of free memory pages. A page is a 4 KB area
of real memory. The system maintains a buffer of memory pages, called the free list, that
is readily available when the VMM needs space. The minimum number of pages
that the VMM keeps on the free list is set by the minfree parameter of the vmo
command.


vmstat -I 2 # Best 6-in-1 monitor; no-load leave-it-up all-day VMM monitor

• page [ fi and fo are only included with vmstat -I ]

• Information about page faults and paging activity, averaged over the interval and
given in units per second.
– fi
Number of pages paged in from persistent storage, i.e. pages
read in from JFS/JFS2 file systems on disk. This does not include page-ins
from paging space; these are filesystem reads.

– fo
Number of pages paged out to persistent storage, i.e. pages
written out to JFS/JFS2 file systems on disk. This does not include page-outs
to paging space; these are filesystem writes.


vmstat -I 2 # Best 6-in-1 monitor; no-load leave-it-up all-day VMM monitor

• page (continued)
• Information about page faults and paging activity, averaged over the interval and given in units
per second.
– pi
Number of pages paged in from paging space. Paging space is the part of
virtual memory that resides on disk. It is used as overflow when memory is overcommitted. Paging
space consists of logical volumes dedicated to storing working-set pages that have been stolen
from real memory. When a process references a stolen page, a page fault occurs, and the
page must be read into memory from paging space.

Because hardware, software and application configurations vary so widely, there is no absolute number
to look out for. This field is a key indicator of paging-space activity. If a page-in occurs,
there must have been a previous page-out for that page. In a memory-constrained
environment it is also likely that each page-in forces a different page to be stolen and, therefore, paged out.

– po
Number (rate) of pages paged out to paging space. Whenever a page of
working storage is stolen, it is written to paging space if it does not yet reside there or if it
was modified. If not referenced again, it remains on the paging device until the process terminates
or disclaims the space. Subsequent references to addresses within the paged-out pages
result in page faults, and the system pages them in individually. When a process
terminates normally, any paging space allocated to that process is freed. If the system is reading in a
significant number of persistent pages, you might see an increase in po without a corresponding
increase in pi. This does not necessarily indicate thrashing, but may warrant investigation into the
data-access patterns of the applications.

vmstat -I 2 # Best 6-in-1 monitor; no-load leave-it-up all-day VMM monitor

• page (continued)
• Information about page faults and paging activity, averaged over the interval and
given in units per second.
– fr
Number of pages freed per second by the page-replacement algorithm during
the interval. As the VMM page-replacement routine scans the Page Frame Table
(PFT), it uses criteria to select which pages to steal to replenish the free list of
available memory frames. The criteria cover both kinds of pages: working
(computational) and file (persistent) pages. A freed page does
not imply that any I/O took place. For example, a persistent storage (file) page
that has not been modified is not written back to disk. If no I/O is necessary,
freeing a page requires minimal system resources.

– sr
Number of pages examined per second by the page-replacement algorithm
during the interval. The page-replacement algorithm might have to scan many page
frames before it can steal enough to satisfy the page-replacement thresholds. The higher
the sr value relative to the fr value, the harder it is for the page-replacement algorithm
to find eligible pages to steal.
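
The sr-to-fr comparison described above can be turned into a single number per sample. The following is a minimal sketch, not part of the original deck: the function name is hypothetical, and the (fr, sr) pairs are copied from vmstat samples shown elsewhere in this presentation. No threshold is implied; the ratio is only meaningful against your own baseline.

```python
from typing import Optional


def scan_to_free_ratio(fr: int, sr: int) -> Optional[float]:
    """Pages examined per page freed; None when the page stealer was idle."""
    if fr == 0:
        # Scanning without freeing anything is the worst case; no scanning means idle.
        return float("inf") if sr > 0 else None
    return sr / fr


# (fr, sr) pairs copied from vmstat samples shown in this deck
samples = [(0, 0), (1025, 1031), (2559, 167438)]
ratios = [scan_to_free_ratio(fr, sr) for fr, sr in samples]

for (fr, sr), ratio in zip(samples, ratios):
    if ratio is None:
        note = "no page-replacement activity"
    else:
        note = f"{ratio:.1f} pages scanned per page freed"
    print(f"fr={fr:>6} sr={sr:>7}: {note}")
```

A ratio near 1 means almost every examined page could be stolen; a ratio in the tens means the page-replacement routine is working much harder for each freed frame.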


vmstat -I 2 # Best 6-in-1 monitor; no-load leave-it-up all-day VMM monitor

• faults
• Information about process control, such as trap and interrupt rate. The faults columns are as follows:
– in
Number of device interrupts per second observed in the interval.
– sy
Number of system calls per second observed in the interval. Resources are available to user
processes through well-defined system calls. These calls instruct the kernel to perform operations for
the calling process and exchange data between the kernel and the process. Because workloads and
applications vary widely, and different calls perform different functions, it is impossible to define how
many system calls per second are too many. Typically, though, when the sy column rises above 10000
calls per second on a uniprocessor, further investigation is called for (on an SMP system the number
is 10000 calls per second per processor). One cause could be "polling" subroutines such as the select()
subroutine. For this column, it is advisable to have a baseline measurement that gives a count for a
normal sy value.
– cs
Number of context switches per second observed in the interval. The physical CPU resource is
subdivided into logical time slices of 10 milliseconds each. Once a thread is scheduled for
execution, it runs until its time slice expires, until it is preempted, or until it voluntarily gives up
control of the CPU. When another thread is given control of the CPU, the context or working
environment of the previous thread must be saved and the context of the current thread must be
loaded. The operating system has a very efficient context-switching procedure, so each switch is
inexpensive in terms of resources. Any significant increase in context switches, such as when cs is a
lot higher than the disk I/O and network packet rate, should be cause for further investigation.
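
The per-processor rule of thumb for sy can be checked with simple arithmetic. This is a hedged sketch: the function names are hypothetical, and using the lcpu count from the vmstat "System configuration" line as the processor count is an assumption, not something the slide specifies. The sy values are illustrative samples from this deck.

```python
SY_PER_CPU_GUIDELINE = 10_000  # rule of thumb quoted on the slide, not a hard limit


def syscalls_per_cpu(sy: int, ncpu: int) -> float:
    """System calls per second divided across ncpu processors."""
    if ncpu < 1:
        raise ValueError("ncpu must be at least 1")
    return sy / ncpu


def exceeds_guideline(sy: int, ncpu: int) -> bool:
    """True when the per-processor rate is above the rule-of-thumb value."""
    return syscalls_per_cpu(sy, ncpu) > SY_PER_CPU_GUIDELINE


# sy=117648 on a 24-lcpu LPAR (assumption: lcpu stands in for "processor")
busy_smp = syscalls_per_cpu(117648, 24)
print(f"{busy_smp:.0f} syscalls/sec per lcpu, flagged: {exceeds_guideline(117648, 24)}")

# The same kind of check on a hypothetical uniprocessor
print(f"uniprocessor with sy=31414 flagged: {exceeds_guideline(31414, 1)}")
```

Either way, the slide's advice stands: compare against a baseline for the workload rather than relying on the fixed figure alone.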


vmstat -I 2 # Best 6-in-1 monitor; no-load leave-it-up all-day VMM monitor

• cpu
• Percentage breakdown of CPU time usage during the interval. The cpu columns are as follows:
– us
Percentage of CPU time spent in user mode. A UNIX® process can execute in either user mode
or system (kernel) mode. In user mode, a process executes within its application code and does not require
kernel resources to perform computations, manage memory, or set variables.
– sy
Percentage of time the CPU was executing a process in system mode. This includes CPU
resource consumed by kernel processes (kprocs) and others that need access to kernel resources. If a process needs
kernel resources, it must execute a system call and is thereby switched to system mode to make that resource
available. For example, reading or writing a file requires kernel resources to open the file, seek a specific location,
and read or write data, unless memory-mapped files are used.
– id
Percentage of time the CPU is idle, or waiting, without pending local disk I/O. If there
are no threads available for execution (the run queue is empty), the system dispatches a thread called wait, which is
also known as the idle kproc. On an SMP system, one wait thread per processor can be dispatched. The report
generated by the ps command (with the -k or -g 0 option) identifies this as kproc or wait. If the ps report shows a high
aggregate time for this thread, there were significant periods when no other thread was ready to run or
waiting to be executed on the CPU. The system was therefore mostly idle and waiting for new tasks.
– wa
Percentage of time the CPU was idle with pending I/O to local disks or NFS-mounted disks. If
there is at least one outstanding I/O to a disk when wait is running, the time is classified as waiting for I/O. Unless
the process is using asynchronous I/O, an I/O request to disk causes the calling process to block (or sleep)
until the request has completed. Once an I/O request for a process completes, the process is placed on the run queue. If
the I/Os were completing faster, more CPU time could be used.
A wa value over 25 percent could indicate that the disk subsystem is not balanced properly, or it might be the
result of a disk-intensive workload.


vmstat -Iwt 2 # check for memory over-commitment: MAX(avm)*4096 versus real memory;
load average ~ AVG(vmstat:kthr:r); uptime reports it over the last 1, 5 and 15 minutes
$ uptime ; vmstat -Iwt 2 20
10:51AM up 133 days, 13:43, 3 users, load average: 19.45, 19.53, 19.32

System configuration: lcpu=24 mem=73728MB ent=12.00

kthr memory page faults cpu time


----------- --------------------- ------------------------------------ ------------------ ----------------------- --------
r b p avm fre fi fo pi po fr sr in sy cs us sy id wa pc ec hr mi se
20 0 0 10957853 48491 66 47 0 0 0 0 363 117648 2422 96 2 2 0 12.00 100.0 10:51:26
20 0 0 10957452 48808 16 81 0 0 0 0 418 94030 4048 97 2 2 0 12.00 100.0 10:51:28
18 0 0 10957456 48524 111 136 0 0 0 0 486 6692 2338 98 0 2 0 12.00 100.0 10:51:30
18 0 0 10957463 48455 25 34 0 0 0 0 152 11637 1348 97 1 2 0 12.00 100.0 10:51:32
18 0 0 10957464 48432 6 14 0 0 0 0 77 3019 1141 98 0 2 0 12.00 100.0 10:51:34
19 1 0 10957470 48298 65 7 0 0 0 0 197 4164 1330 97 0 2 0 11.95 99.6 10:51:36
18 0 0 10957472 48296 0 5 0 0 0 0 39 2842 1028 97 0 3 0 11.86 98.8 10:51:38
19 0 0 10957479 48236 23 13 0 0 0 0 234 5335 1448 98 0 2 0 12.00 100.0 10:51:40
19 1 0 10957487 47686 271 5 0 0 0 0 402 13439 1806 97 1 2 0 12.00 100.0 10:51:42
19 0 0 10957489 47684 0 9 0 0 0 0 37 7145 997 97 1 2 0 12.00 100.0 10:51:44
20 0 0 10957481 47610 39 56 0 0 0 0 167 2837 1061 98 0 2 0 12.00 100.0 10:51:46
19 0 0 10957483 47548 31 1 0 0 0 0 85 3075 1065 98 0 2 0 12.00 100.0 10:51:48
18 0 0 10957481 47500 26 13 0 0 7 135 75 2921 1032 98 0 2 0 12.00 100.0 10:51:50
19 0 0 10957889 49033 53 6 0 0 1025 1031 129 88541 1871 96 2 2 0 11.98 99.8 10:51:52
19 0 0 10957888 48954 40 0 0 0 0 0 89 94550 1869 96 2 2 0 12.00 100.0 10:51:54
20 1 0 10957882 48926 17 6 0 0 0 0 74 123666 2068 96 2 2 0 12.00 100.0 10:51:56
19 0 0 10957880 48916 5 8 0 0 0 0 47 120104 1913 94 3 4 0 11.80 98.4 10:51:58
20 0 0 10957666 49062 34 1 0 0 0 0 80 117384 1849 96 2 2 0 12.00 100.0 10:52:00
18 0 0 10957883 48841 1 7 0 0 0 0 59 130003 1924 95 3 2 0 12.00 100.0 10:52:02
20 0 0 10957889 48779 28 6 0 0 0 0 143 126580 2284 96 3 2 0 12.00 100.0 10:52:04
$ bc
10957889 * 4096 # MAX(avm)*4096 relative-to mem=73728MB -> 73728*1024*1024
44883513344 # 44883513344 relative-to 77309411328
quit
$
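
The bc arithmetic above can be wrapped into a reusable check. This is a minimal sketch with hypothetical function names; it assumes the 4 KB page size stated earlier and that the mem= value in the vmstat header is in units of 1024*1024 bytes, as the slide's own calculation does.

```python
PAGE_SIZE = 4096  # bytes per page, the 4 KB unit used on the slides


def avm_bytes(avm_pages: int) -> int:
    """Convert an avm page count to bytes."""
    return avm_pages * PAGE_SIZE


def mem_bytes(mem_mb: int) -> int:
    """Convert the mem=...MB header value to bytes."""
    return mem_mb * 1024 * 1024


def avm_fraction_of_ram(max_avm_pages: int, mem_mb: int) -> float:
    """Peak computational memory as a fraction of installed real memory."""
    return avm_bytes(max_avm_pages) / mem_bytes(mem_mb)


# MAX(avm) and mem= taken from the vmstat -Iwt capture above
frac = avm_fraction_of_ram(10957889, 73728)
print(avm_bytes(10957889), "bytes of computational memory")
print(mem_bytes(73728), "bytes of real memory")
print(f"avm is {frac:.1%} of real memory")
```

Keeping both sides in bytes avoids mixing decimal and binary megabytes when the two figures are close.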


vmstat -Iwt 2 # r:b, avm:fre, fre:fi:fo, fr:sr, in:sy:cs, r:b:sy
(avm=3986734)*4096=16329662464 bytes (avm = 15573 MB, about 99% of mem=15744MB real RAM)
System configuration: lcpu=8 mem=15744MB

kthr memory page faults cpu time


----------- --------------------- ------------------------------------ ------------------ ----------- --------
r b p avm fre fi fo pi po fr sr in sy cs us sy id wa hr mi se
1 1 0 3986577 2652 1944 797 0 0 1536 12803 880 2377 4459 10 4 55 31 14:17:58
2 2 0 3986576 2553 1863 757 0 0 2557 37067 852 4053 4446 11 4 55 30 14:18:00
2 1 0 3986574 2206 1959 799 0 0 2559 37499 1009 2523 4559 10 6 53 31 14:18:02
0 3 0 3986573 2597 2044 843 0 0 3069 42804 912 2377 4553 11 4 55 30 14:18:04
1 2 0 3986571 2511 1870 754 0 0 2559 167438 804 2203 4247 10 4 56 30 14:18:06
0 2 0 3986571 2197 1944 787 0 0 2560 102054 814 2310 4063 10 4 56 30 14:18:08
0 2 0 3986570 2872 1960 792 0 0 3070 42557 889 4148 4532 11 4 54 30 14:18:10
1 2 0 3986569 3752 1876 764 0 0 3070 65622 933 2363 4834 10 5 53 32 14:18:12
1 2 0 3986568 3864 1787 730 0 0 2559 49907 880 2135 4617 9 4 53 33 14:18:14
1 1 0 3986567 2634 1915 767 0 0 2047 30676 785 2774 3948 10 4 55 31 14:18:16
0 3 0 3986567 2523 1890 759 0 0 2552 27693 877 2646 4443 10 4 55 32 14:18:18
1 2 0 3986573 2040 2008 810 0 0 2557 23419 928 5155 4671 12 4 54 30 14:18:20
1 2 0 3986572 1962 1878 761 0 0 2554 52663 905 2525 4795 10 4 56 29 14:18:22
2 2 0 3986587 2652 1960 798 3 0 3071 14081 1030 11377 7789 13 9 51 27 14:18:24
2 2 0 3986570 2363 1938 781 0 0 2558 30570 836 3004 5732 10 5 56 29 14:18:26
2 1 0 3986734 2056 1884 762 1 0 2557 32017 888 31414 6058 15 11 47 26 14:18:28
2 0 0 3986617 1933 1920 779 2 0 2558 15377 933 22108 5545 15 9 48 28 14:18:30
1 0 0 3986612 2463 2008 826 0 0 3069 25129 1192 2823 5935 11 9 52 28 14:18:32
1 2 0 3986586 3073 1988 810 0 0 3064 15116 816 2732 4430 10 4 56 30 14:18:34
0 1 0 3986587 3402 1719 685 0 0 2555 24262 799 3395 4429 9 4 58 29 14:18:36
kthr memory page faults cpu time

uptime;vmstat;mpstat # compare and match the meaning of both stats


$ uptime ; vmstat -Iwt 2 3 ; mpstat -w 2 1
07:06AM up 12 days, 10:09, 4 users, load average: 8.97, 9.55, 9.79

System configuration: lcpu=36 mem=79360MB

kthr memory page faults cpu time


----------- --------------------- ------------------------------------ ------------------ ----------- --------
r b p avm fre fi fo pi po fr sr in sy cs us sy id wa hr mi se
10 1 0 12542527 27282 877 551 0 0 0 0 1419 189441 10772 44 3 50 3 07:06:47
8 1 0 12545249 21863 1114 647 0 0 0 0 1477 178350 12383 48 2 47 3 07:06:49
12 1 0 12552354 22160 988 893 0 0 5250 70921 1438 286966 10384 57 6 34 3 07:06:51

System configuration: lcpu=36 mode=Capped

cpu min maj mpc int cs ics rq mig lpa sysc us sy wa id pc


0 0 2 0 378 962 710 0 55 99.6 28017 70.4 4.6 0.2 24.7 0.81
1 0 0 0 29 8 4 0 0 100.0 0 2.4 0.4 0.0 97.2 0.19
2 1025 5 0 561 1352 910 0 53 100.0 3662 83.8 2.5 0.6 13.1 0.88
3 0 0 0 16 0 0 0 0 - 0 0.0 0.2 0.0 99.8 0.12
4 1513 12 0 837 2132 1302 0 100 99.9 9266 36.7 6.8 2.2 54.3 0.65
5 0 0 0 41 0 0 0 0 - 0 0.0 0.2 0.0 99.8 0.36
6 1061 5 0 485 873 465 0 80 100.0 31318 63.2 5.4 0.6 30.8 0.77
7 0 0 0 29 0 0 0 0 - 0 0.0 0.2 0.0 99.8 0.24
8 694 4 0 801 1422 742 0 63 100.0 16970 67.1 4.6 0.7 27.6 0.79
9 0 0 0 28 0 0 0 0 - 0 0.0 0.2 0.0 99.8 0.21
10 184 135 0 299 1206 836 0 153 85.9 16218 43.1 6.0 21.3 29.6 0.67
11 10 45 0 131 46 46 0 0 100.0 48 2.3 1.4 11.2 85.1 0.33
12 264 37 0 488 1333 810 0 124 89.6 5712 57.1 5.4 4.1 33.4 0.74
13 0 0 0 32 0 0 0 0 - 0 0.0 0.2 0.0 99.8 0.24
14 348 47 0 206 356 227 0 79 82.0 36145 60.9 6.8 4.4 27.9 0.77
15 0 0 0 25 0 0 0 0 100.0 6 0.0 0.2 0.0 99.7 0.23
16 419 67 0 339 850 482 0 140 85.6 5772 55.7 4.5 8.8 31.1 0.73
17 0 0 0 36 0 0 0 0 - 0 0.0 0.3 0.0 99.7 0.27
18 123 55 0 592 1308 708 0 161 89.0 20956 59.3 7.6 7.2 25.9 0.63
19 13 48 0 295 437 238 0 18 96.5 2485 45.8 2.8 7.2 44.2 0.37

26 139 0 0 205 23 13 0 1 95.5 245 0.6 1.1 0.0 98.3 0.51
27 0 0 0 52 0 0 0 0 - 0 0.0 0.2 0.0 99.8 0.49
28 185 0 0 152 27 16 0 2 91.7 219 0.5 1.6 0.0 97.9 0.52
29 0 0 0 54 0 0 0 0 - 0 0.0 0.2 0.0 99.8 0.48
30 795 0 0 150 82 43 0 2 97.5 1075 6.2 2.1 0.7 91.1 0.55
31 0 0 0 51 0 0 0 0 - 0 0.0 0.2 0.0 99.8 0.45
32 0 0 0 152 8 4 0 0 100.0 16360 34.3 2.9 0.0 62.9 0.62
33 52 0 0 42 38 19 0 0 100.0 198 1.2 0.3 0.0 98.5 0.38
34 0 0 0 162 0 0 0 0 100.0 0 99.7 0.3 0.0 0.0 1.00
35 0 0 0 9 16 8 0 0 100.0 0 0.0 87.3 0.0 12.7 0.00
ALL 6851 575 0 7302 12664 7732 0 1031 94.2 195563 46.6 2.7 3.8 46.8 17.97
-------------------------------------------------------------------------------------------------------------
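
To compare the two views on this slide, the vmstat r column can be averaged and set against the uptime load average. A minimal sketch, assuming the three r samples and the first load-average figure shown above; three 2-second samples are far too few for a real comparison and are used here only to show the arithmetic.

```python
from statistics import mean

lcpu = 36                  # from the "System configuration" line
load_average = 8.97        # first figure reported by uptime
r_samples = [10, 8, 12]    # kthr:r from the three vmstat samples

avg_r = mean(r_samples)
load_per_lcpu = load_average / lcpu

print(f"average r over the samples: {avg_r}")
print(f"uptime load average:        {load_average}")
print(f"load average per lcpu:      {load_per_lcpu:.2f}")
```

Here the short-term average of r and the load average are in the same range, and both are well below the lcpu count.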

mpstat -w 2 # min:maj:int, (maj=0):int?, cs:ics, int:sysc:pc; monitor ratios to
learn what is typical/normal, so you can discern what’s atypical/abnormal
$ mpstat -w 2
System configuration: lcpu=12

cpu min maj mpc int cs ics rq mig lpa sysc us sy wa id pc


0 12 14 0 409 894 545 0 102 100.0 3900 41.3 10.4 6.1 42.2 0.70
1 0 0 0 179 0 0 0 0 - 0 0.0 1.4 0.0 98.6 0.30
2 15 17 0 528 1045 611 0 78 100.0 2086 26.8 11.5 13.1 48.6 0.66
3 1 0 0 211 0 0 0 0 - 0 0.0 1.6 0.0 98.4 0.34
4 42 38 0 488 1489 852 0 101 100.0 3024 30.3 12.5 17.7 39.6 0.69
5 0 0 0 214 0 0 0 0 - 0 0.0 1.8 0.0 98.2 0.31
6 21 17 0 643 1205 762 0 90 100.0 4000 33.9 12.8 8.1 45.2 0.70
7 0 0 0 194 0 0 0 0 - 0 0.0 1.6 0.0 98.4 0.30
8 0 0 0 409 5 5 0 0 100.0 119 0.0 2.1 0.0 97.9 0.55
9 0 0 0 287 0 0 0 0 - 0 0.0 1.6 0.0 98.4 0.45
10 0 0 0 303 20 10 0 0 100.0 0 0.0 1.6 0.0 98.4 0.55
11 0 0 0 288 0 0 0 0 - 0 0.0 1.6 0.0 98.4 0.45
ALL 91 86 0 4153 4658 2785 0 371 100.0 13129 15.1 6.3 5.1 73.4 6.00
------------------------------------------------------------------------------------------------------------------------------------
0 12 21 0 424 1146 693 0 99 100.0 5313 42.9 13.8 10.0 33.3 0.72
1 0 0 0 173 0 0 0 0 - 0 0.0 1.7 0.0 98.3 0.29
2 21 23 0 562 1479 843 0 91 100.0 2112 27.4 11.2 17.3 44.1 0.66
3 0 0 0 225 0 0 0 0 - 0 0.0 1.8 0.0 98.2 0.33
4 17 15 5 509 1117 668 0 106 100.0 4688 40.6 11.7 8.0 39.7 0.73
5 0 0 0 195 0 0 0 0 - 0 0.0 1.7 0.0 98.3 0.27
6 19 19 0 694 1111 671 0 93 100.0 2253 18.4 9.7 8.8 63.1 0.63
7 0 0 0 247 0 0 0 0 - 0 0.0 1.7 0.0 98.3 0.37
8 1 0 0 385 4 4 0 0 100.0 20 0.0 2.1 0.6 97.3 0.55
9 0 0 0 303 0 0 0 0 - 0 0.0 1.7 0.0 98.3 0.45
10 0 0 0 315 19 10 0 0 100.0 0 0.0 1.6 0.0 98.4 0.54
11 0 0 0 305 0 0 0 0 - 0 0.0 1.6 0.0 98.4 0.45
ALL 70 78 5 4337 4876 2889 0 389 100.0 14386 15.1 6.3 5.1 73.6 6.00
------------------------------------------------------------------------------------------------------------------------------------
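
One of the ratios named in the slide title, cs:ics, can be computed per logical CPU from captured output. This is a sketch under stated assumptions: the helper names are hypothetical, the column order is taken from the header shown above, and the three rows are copied from the first interval of this capture.

```python
HEADER = "cpu min maj mpc int cs ics rq mig lpa sysc us sy wa id pc".split()

SAMPLE = """
0 12 14 0 409 894 545 0 102 100.0 3900 41.3 10.4 6.1 42.2 0.70
1 0 0 0 179 0 0 0 0 - 0 0.0 1.4 0.0 98.6 0.30
ALL 91 86 0 4153 4658 2785 0 371 100.0 13129 15.1 6.3 5.1 73.4 6.00
"""


def parse_row(line):
    """Map one mpstat -w data line onto the column names."""
    return dict(zip(HEADER, line.split()))


def ics_share(row):
    """Involuntary context switches as a fraction of all context switches."""
    cs = int(row["cs"])
    ics = int(row["ics"])
    return ics / cs if cs else None  # None for an idle logical CPU


rows = [parse_row(line) for line in SAMPLE.strip().splitlines()]
shares = {row["cpu"]: ics_share(row) for row in rows}

for cpu, share in shares.items():
    text = "idle" if share is None else f"{share:.1%} involuntary"
    print(f"cpu {cpu:>3}: {text}")
```

Tracking this share over time is one way to build the sense of "typical" that the slide recommends.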

S0-S5 thread migrations # Only valid w/dedicated-CPU POWER5/5+ LPARs; not
as applicable for latency analysis with POWER6/POWER7 architectures

$ uptime ; mpstat -d 2
03:06PM up 133 days, 17:58, 2 users, load average: 5.87, 4.92, 4.66

System configuration: lcpu=24 ent=12.0

cpu cs ics bound rq push S3pull S3grd S0rd S1rd S2rd S3rd S4rd S5rd ilcs vlcs
0 1387 947 0 0 0 0 0 86.9 0.2 0.0 12.9 0.0 0.0 0 1317
1 4 2 0 0 0 0 0 100.0 0.0 0.0 0.0 0.0 0.0 0 1359
2 1165 752 0 0 0 0 0 85.2 0.2 0.0 14.6 0.0 0.0 0 883
3 0 0 0 0 0 0 0 100.0 0.0 0.0 0.0 0.0 0.0 0 576
4 868 576 0 0 0 0 0 83.5 0.3 0.0 16.3 0.0 0.0 0 801
5 3 2 0 0 0 0 0 100.0 0.0 0.0 0.0 0.0 0.0 0 882
6 1259 813 0 0 0 0 0 84.9 0.1 0.0 14.9 0.0 0.0 0 1091
7 2 2 0 0 0 0 0 100.0 0.0 0.0 0.0 0.0 0.0 0 1113
8 1207 809 0 0 0 0 0 85.5 0.0 0.0 14.5 0.0 0.0 0 1082
9 0 0 0 0 0 0 0 - - - - - - 0 1099
10 990 654 0 0 0 0 0 87.0 0.0 0.0 13.0 0.0 0.0 0 804
11 0 0 0 0 0 0 0 - - - - - - 0 802
12 1388 907 0 0 0 0 0 89.4 0.0 0.0 10.6 0.0 0.0 0 986
13 0 0 0 0 0 0 0 - - - - - - 0 990
14 1104 730 0 0 0 0 0 86.7 0.0 0.0 13.3 0.0 0.0 0 949
15 0 0 0 0 0 0 0 - - - - - - 0 960
16 659 411 0 0 0 0 0 88.8 0.0 0.0 11.2 0.0 0.0 0 524
17 0 0 0 0 0 0 0 - - - - - - 0 589
18 863 452 0 0 0 0 0 96.1 0.0 0.0 3.9 0.0 0.0 0 215
19 0 0 0 0 0 0 0 - - - - - - 0 316
20 1175 766 0 0 0 0 0 87.1 0.0 0.0 12.9 0.0 0.0 0 943
21 0 0 0 0 0 0 0 - - - - - - 0 951
22 1224 705 0 0 0 0 0 91.8 0.0 0.0 8.2 0.0 0.0 0 635
23 0 0 0 0 0 0 0 - - - - - - 0 639
ALL 13298 8528 0 0 0 0 0 87.7 0.1 0.0 12.2 0.0 0.0 0 20506
------------------------------------------------------------------------------------------------------

cpu cs ics bound rq push S3pull S3grd S0rd S1rd S2rd S3rd S4rd S5rd ilcs vlcs
0 1969 1277 0 0 0 0 0 83.2 0.0 0.0 16.8 0.0 0.0 0 1940
1 0 0 0 0 0 0 0 100.0 0.0 0.0 0.0 0.0 0.0 0 1968
2 1679 1068 0 0 0 0 0 85.7 0.0 0.0 14.3 0.0 0.0 0 1216
3 0 0 0 0 0 0 0 - - - - - - 0 1236


uptime; mpstat -dw 2 # cs:ics, S3pull, S0rd:S4rd, S0rd:S5rd; Only useful
for analysis on POWER5/5+ dedicated-CPU LPARs; less useful for POWER6/7
05:46PM up 4 days, 23:50, 14 users, load average: 1.33, 1.30, 1.01

System configuration: lcpu=12

cpu cs ics bound rq push S3pull S3grd S0rd S1rd S2rd S3rd S4rd S5rd
0 1232 816 0 0 0 0 0 84.8 0.0 6.9 8.3 0.0 0.0
1 0 0 0 0 0 0 0 - - - - - -
2 2080 1179 0 0 0 1 0 91.6 0.0 3.4 4.9 0.0 0.0
3 0 0 0 0 0 0 0 - - - - - -
4 1614 989 0 0 0 1 0 89.9 0.0 4.4 5.6 0.0 0.1
5 0 0 0 0 0 0 0 - - - - - -
6 2180 1317 0 0 0 1 0 91.9 0.0 3.5 4.6 0.0 0.1
7 0 0 0 0 0 0 0 100.0 0.0 0.0 0.0 0.0 0.0
8 159 110 0 0 0 2 0 98.4 0.0 0.0 0.0 1.6 0.0
9 0 0 0 0 0 0 0 - - - - - -
10 127 96 0 0 0 2 0 97.6 0.0 0.0 0.0 2.4 0.0
11 0 0 0 0 0 0 0 - - - - - -
ALL 7392 4507 0 0 0 7 0 90.5 0.0 4.0 5.3 0.1 0.0
------------------------------------------------------------------------------------------------------------------------------------
0 1815 1058 0 0 0 0 0 95.0 0.0 2.0 3.0 0.0 0.0
1 0 0 0 0 0 0 0 - - - - - -
2 1543 897 0 0 0 0 0 95.1 0.0 2.0 3.0 0.0 0.0
3 0 0 0 0 0 0 0 - - - - - -
4 1593 997 0 0 0 0 0 92.7 0.0 3.7 3.7 0.0 0.0
5 0 0 0 0 0 0 0 - - - - - -
6 2031 1193 0 0 0 0 0 94.8 0.0 2.7 2.5 0.0 0.0
7 0 0 0 0 0 0 0 - - - - - -
8 21 11 0 0 0 0 0 100.0 0.0 0.0 0.0 0.0 0.0
9 0 0 0 0 0 0 0 - - - - - -
10 4 4 0 0 0 0 0 - - - - - -
11 0 0 0 0 0 0 0 - - - - - -
ALL 7007 4160 0 0 0 0 0 94.5 0.0 2.6 3.0 0.0 0.0
------------------------------------------------------------------------------------------------------------------------------------

iostat -aT 2 # Create a complete technology-stack map of I/O;
i.e. RAIDset/tech->LUN-> [LVMvg:lv::JFS2mtpt w/options]->logical_content
$ iostat -aT 2 | grep -v "0.0 0.0 0.0 0 0"

System configuration: lcpu=24 drives=87 ent=12.00 paths=172 vdisks=0

tty: tin tout avg-cpu: % user % sys % idle % iowait physc % entc time
0.0 62.1 92.4 4.8 0.3 2.6 11.8 98.6 15:42:10

Adapter: Kbps tps Kb_read Kb_wrtn time


scsi0 2.0 0.5 4 0 15:42:10

Disks: % tm_act Kbps tps Kb_read Kb_wrtn time


hdisk0 0.0 2.0 0.5 4 0 15:42:10

Adapter: Kbps tps Kb_read Kb_wrtn time


scsi1 0.0 0.0 0 0 15:42:10

Disks: % tm_act Kbps tps Kb_read Kb_wrtn time

Adapter: Kbps tps Kb_read Kb_wrtn time


fcs0 150429.9 7234.0 298240 2369 15:42:10

Disks: % tm_act Kbps tps Kb_read Kb_wrtn time


hdisk86 14.0 228.2 16.0 264 192 15:42:10
hdisk87 11.0 88.1 11.0 176 0 15:42:10
hdisk88 9.5 64.1 8.0 128 0 15:42:10
hdisk89 20.0 2886.4 62.1 5760 8 15:42:10
hdisk90 20.5 9203.7 150.1 18392 0 15:42:10
hdisk91 34.0 7190.0 351.8 14368 0 15:42:10
hdisk92 59.5 6081.1 543.0 12152 0 15:42:10
hdisk93 32.5 384.3 42.0 768 0 15:42:10
hdisk94 12.0 260.2 21.0 304 216 15:42:10
hdisk95 8.5 64.1 8.0 128 0 15:42:10
hdisk96 39.0 7742.5 444.4 15280 192 15:42:10
hdisk97 33.5 400.3 45.0 800 0 15:42:10
hdisk98 12.0 260.2 19.5 328 192 15:42:10
hdisk99 10.5 68.1 8.5 136 0 15:42:10

…
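
With this many hdisks, it helps to rank them before mapping the busiest ones back to their LUNs and filesystems. A minimal sketch, assuming the seven-field disk line layout shown above (name, % tm_act, Kbps, tps, Kb_read, Kb_wrtn, time); the function name is hypothetical and the lines are a subset of this capture.

```python
SAMPLE = """
hdisk89 20.0 2886.4 62.1 5760 8 15:42:10
hdisk90 20.5 9203.7 150.1 18392 0 15:42:10
hdisk91 34.0 7190.0 351.8 14368 0 15:42:10
hdisk92 59.5 6081.1 543.0 12152 0 15:42:10
hdisk93 32.5 384.3 42.0 768 0 15:42:10
hdisk96 39.0 7742.5 444.4 15280 192 15:42:10
"""


def parse_disk_line(line):
    """Split one iostat -T disk line into named, typed fields."""
    name, tm_act, kbps, tps, kb_read, kb_wrtn, stamp = line.split()
    return {
        "disk": name,
        "tm_act": float(tm_act),
        "kbps": float(kbps),
        "tps": float(tps),
        "time": stamp,
    }


disks = [parse_disk_line(line) for line in SAMPLE.strip().splitlines()]
busiest = sorted(disks, key=lambda d: d["tm_act"], reverse=True)[:3]

for d in busiest:
    print(f'{d["disk"]}: {d["tm_act"]}% active, {d["kbps"]} Kbps, {d["tps"]} tps')
```

Sorting by Kbps or tps instead of tm_act gives a different ranking (hdisk90 moves the most data here), which is a useful contrast in itself.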

iostat -AQ 2 # Monitoring asynchronous I/O activity (typically via aioserver)
An RDBMS using AIO typically exhibits activity in concurrent bursts (2-second intervals)

aio: avgc avfc maxg maif maxr avg-cpu: % user % sys % idle % iowait
32 0 223 0 8192 21.1 7.8 46.5 24.5

Queue# Count Filesystems


149 2 /db2/AB7/sapdata1
150 3 /db2/AB7/sapdata2
151 3 /db2/AB7/sapdata3
153 3 /db2/AB7/sapdata4
154 2 /db2/AB7/sapdata5
156 2 /db2/AB7/sapdata7
158 3 /db2/AB7/sapdata9
159 3 /db2/AB7/sapdata10
160 6 /db2/AB7/sapdata11
161 5 /db2/AB7/sapdata12

aio: avgc avfc maxg maif maxr avg-cpu: % user % sys % idle % iowait
0 0 215 0 8192 18.8 7.7 49.1 24.4


• Strategic Thoughts, Concepts, Considerations, and Tactics

• Monitoring AIX – Usage, Meaning and Interpretation
– Review component technology of the infrastructure, i.e. proper tuning-by-hardware
– Review implemented AIX constructs, i.e. “firm” near-static structures and settings
– Review historical/accumulated AIX events, usages, pendings, counts, blocks, etc.
– Monitor dynamic AIX command behaviors, i.e. ps, vmstat, mpstat, iostat, topas, etc.

• Recognizing Common Performance-degrading Scenarios
– High Load Average relative to count-of-LCPUs, i.e. “over-threadedness”
– vmstat:memory:avm near-to or greater-than lruable-gbRAM, i.e. over-committed
– Continuous low vmstat:memory:fre with persistent lrud (fr:sr) activity
– Continuous high ratio of vmstat:kthr:b relative to vmstat:kthr:r
– Poor ratio of pending I/O waits to iodones in vmstat -s output

• Tactical Check-list of What-to-Watch


High Load Average relative to count-of-LCPUs, i.e. “over-threadedness”

System configuration: lcpu=128 mem=239871MB

kthr memory page faults cpu time


----------- --------------------- ------------------------------------ ------------------ ----------- --------
r b p avm fre fi fo pi po fr sr in sy cs us sy id wa hr mi se
38 0 0 36247940 1134306 10022 6471 0 0 0 0 11595 311282 416080 38 11 50 1 23:08:01
60 1 2 36349490 1030093 3903 1595 0 0 0 0 7458 358377 420907 46 10 43 1 23:08:02
0 0 0 36361251 1003440 7677 7990 1 0 0 0 10060 292767 409014 43 11 45 1 23:08:03
35 1 2 36368530 990454 6505 92 0 0 0 0 9184 264537 418087 39 10 50 1 23:08:04
27 1 0 36401856 953864 3833 41 0 0 0 0 6181 442110 332004 43 28 28 0 23:08:05
98 1 0 36443786 902689 1643 7964 0 0 0 0 6029 329798 316835 47 33 20 0 23:08:06
85 1 0 36529792 814869 2650 4 0 0 0 0 5875 246727 344842 53 22 24 0 23:08:07
54 1 0 36695785 649875 4592 29 0 0 10693 53771 7637 261305 342048 59 21 20 0 23:08:08
99 1 0 36710625 650934 4223 40 0 0 13889 77048 6807 244587 390610 45 24 30 0 23:08:09
45 1 0 36734792 620598 6352 35 0 0 0 0 8684 350507 435002 42 21 37 0 23:08:10
33 1 0 36764231 582083 9065 5 0 0 0 0 10111 251847 486571 40 21 38 0 23:08:11
36 2 2 36781445 555793 9115 40 0 0 0 0 9386 233633 432140 22 28 49 1 23:08:12
55 0 0 36857354 462204 10362 8014 0 0 0 0 11187 296126 426007 46 21 32 1 23:08:13
100 2 0 36997474 374643 13160 13 0 0 63312 337520 12250 344183 407138 65 16 18 1 23:08:14
79 1 0 36974495 375888 12967 14743 0 0 0 0 10337 263413 496578 50 25 25 0 23:08:15
94 3 0 37018259 324905 1589 992 0 0 44695 240220 6172 593465 347183 46 41 12 1 23:08:16
0 0 0 37026480 402687 1899 39740 0 0 14796 82454 6355 164643 325407 17 64 11 8 23:08:17
60 25 0 37127886 320753 9624 80065 1 0 77802 416638 11793 282172 473596 54 35 7 4 23:08:18
179 24 0 37229111 324464 5116 16020 0 1 124674 661349 9837 261490 494991 61 30 6 4 23:08:19
99 19 0 37212306 405774 18191 8117 0 42 89516 459518 11330 225756 425797 39 36 24 1 23:08:20
kthr memory page faults cpu time
----------- --------------------- ------------------------------------ ------------------ ----------- --------
r b p avm fre fi fo pi po fr sr in sy cs us sy id wa hr mi se
87 2 0 37205572 397386 15420 67 0 0 0 0 6062 98600 205265 28 60 12 0 23:08:21
110 0 0 37210411 381230 11371 1268 0 0 0 0 8002 174732 367078 33 36 30 1 23:08:22
52 1 0 37220337 359035 23466 128 0 0 11284 50173 14472 226184 443127 35 27 37 0 23:08:23
55 0 0 37234473 356191 15934 356 0 0 23547 96122 9191 193647 382616 34 32 34 0 23:08:24
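
The run-queue column on this slide can be compared directly against the lcpu count. This is a minimal sketch using the 24 kthr:r values from the capture above and lcpu=128 from its header; the variable names are hypothetical, and the comparison follows the rule given earlier for the r column (threads wait when r exceeds the number of CPUs).

```python
from statistics import mean

lcpu = 128  # from the "System configuration" line

# kthr:r for the 24 one-second samples shown above
r_samples = [38, 60, 0, 35, 27, 98, 85, 54, 99, 45, 33, 36,
             55, 100, 79, 94, 0, 60, 179, 99, 87, 110, 52, 55]

mean_r = mean(r_samples)
peak_r = max(r_samples)
over_lcpu = [r for r in r_samples if r > lcpu]

print(f"mean r = {mean_r:.1f} ({mean_r / lcpu:.0%} of lcpu)")
print(f"peak r = {peak_r} ({peak_r / lcpu:.2f}x lcpu)")
print(f"samples with r > lcpu: {len(over_lcpu)} of {len(r_samples)}")
```

The wide swing between samples (0 to 179 within a few seconds) is as much a part of the picture as the average.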


vmstat:memory:avm near-to or greater-than lruable-gbRAM;
memory over-committed: 3986734*4096 = 16329662464 bytes (15573 MB) vs mem=15744MB
System configuration: lcpu=8 mem=15744MB

kthr memory page faults cpu time


----------- --------------------- ------------------------------------ ------------------ ----------- --------
r b p avm fre fi fo pi po fr sr in sy cs us sy id wa hr mi se
1 1 0 3986577 2652 1944 797 0 0 1536 12803 880 2377 4459 10 4 55 31 14:17:58
2 2 0 3986576 2553 1863 757 0 0 2557 37067 852 4053 4446 11 4 55 30 14:18:00
2 1 0 3986574 2206 1959 799 0 0 2559 37499 1009 2523 4559 10 6 53 31 14:18:02
0 3 0 3986573 2597 2044 843 0 0 3069 42804 912 2377 4553 11 4 55 30 14:18:04
1 2 0 3986571 2511 1870 754 0 0 2559 167438 804 2203 4247 10 4 56 30 14:18:06
0 2 0 3986571 2197 1944 787 0 0 2560 102054 814 2310 4063 10 4 56 30 14:18:08
0 2 0 3986570 2872 1960 792 0 0 3070 42557 889 4148 4532 11 4 54 30 14:18:10
1 2 0 3986569 3752 1876 764 0 0 3070 65622 933 2363 4834 10 5 53 32 14:18:12
1 2 0 3986568 3864 1787 730 0 0 2559 49907 880 2135 4617 9 4 53 33 14:18:14
1 1 0 3986567 2634 1915 767 0 0 2047 30676 785 2774 3948 10 4 55 31 14:18:16
0 3 0 3986567 2523 1890 759 0 0 2552 27693 877 2646 4443 10 4 55 32 14:18:18
1 2 0 3986573 2040 2008 810 0 0 2557 23419 928 5155 4671 12 4 54 30 14:18:20
1 2 0 3986572 1962 1878 761 0 0 2554 52663 905 2525 4795 10 4 56 29 14:18:22
2 2 0 3986587 2652 1960 798 3 0 3071 14081 1030 11377 7789 13 9 51 27 14:18:24
2 2 0 3986570 2363 1938 781 0 0 2558 30570 836 3004 5732 10 5 56 29 14:18:26
2 1 0 3986734 2056 1884 762 1 0 2557 32017 888 31414 6058 15 11 47 26 14:18:28
2 0 0 3986617 1933 1920 779 2 0 2558 15377 933 22108 5545 15 9 48 28 14:18:30
1 0 0 3986612 2463 2008 826 0 0 3069 25129 1192 2823 5935 11 9 52 28 14:18:32
1 2 0 3986586 3073 1988 810 0 0 3064 15116 816 2732 4430 10 4 56 30 14:18:34
0 1 0 3986587 3402 1719 685 0 0 2555 24262 799 3395 4429 9 4 58 29 14:18:36
kthr memory page faults cpu time
----------- --------------------- ------------------------------------ ------------------ ----------- --------
r b p avm fre fi fo pi po fr sr in sy cs us sy id wa hr mi se
0 3 0 3986582 2347 1841 748 0 0 2047 14678 865 2352 4683 10 4 56 31 14:18:38
0 1 0 3986580 3068 1945 784 0 0 3070 24649 784 4741 4233 11 4 55 29 14:18:40
0 2 0 3986583 2797 1929 780 0 0 2559 16436 806 2466 4205 10 4 57 29 14:18:42


Continuous high ratio of vmstat:kthr:b relative to vmstat:kthr:r

System configuration: lcpu=8 mem=20480MB ent=3.90

kthr memory page faults cpu time


----------- --------------------- ------------------------------------ ------------------ ----------------------- --------
r b p avm fre fi fo pi po fr sr in sy cs us sy id wa pc ec hr mi se
1 11 0 2381462 12283 574 41 4 0 0 0 282 14044 3759 36 3 36 24 1.59 40.8 02:00:03
5 6 0 2383434 9229 537 21 15 0 0 0 214 9309 1221 65 8 22 5 2.94 75.4 02:00:05
3 5 0 2381734 10665 571 5 4 0 451 1303 442 8902 3345 83 6 5 6 3.53 90.4 02:00:07
3 10 0 2383409 7700 666 1 3 0 0 0 232 2606 1828 46 7 20 28 2.08 53.3 02:00:09
6 10 0 2383410 7743 611 0 0 0 661 1956 270 5229 3206 93 2 2 4 3.72 95.3 02:00:11
5 8 0 2384836 7559 544 35 1 45 1150 3962 475 9629 4267 48 10 22 19 2.33 59.7 02:00:13
4 7 0 2385533 8572 569 82 1 30 1437 3311 259 8388 1512 83 2 6 10 3.33 85.3 02:00:15
2 8 0 2387636 7716 411 90 5 7 967 2971 467 40650 2367 58 15 12 16 2.89 74.1 02:00:17
5 7 0 2390359 7622 542 20 19 11 1908 5953 360 18005 2195 77 4 9 10 3.19 81.8 02:00:19
3 11 0 2390756 7956 511 0 91 10 977 3899 178 3897 1655 59 2 17 22 2.40 61.5 02:00:21
3 9 0 2390761 7750 487 16 87 0 504 1368 471 3018 2893 45 2 33 19 1.89 48.5 02:00:23
6 7 0 2392294 7597 483 4 80 40 1227 3822 233 4682 2070 69 5 17 9 2.92 75.0 02:00:25
4 8 0 2392294 7837 413 30 53 0 571 2025 416 4990 5019 85 2 5 8 3.45 88.4 02:00:27
8 8 0 2392294 7704 409 0 0 0 385 983 184 2894 3480 53 1 25 20 2.16 55.3 02:00:29
11 8 0 2387726 19279 325 114 118 177 3971 13773 224 39964 2564 84 15 0 1 3.88 99.6 02:00:31
4 13 0 2402933 7698 390 656 189 118 3732 14638 436 39993 4822 53 24 7 16 3.06 78.6 02:00:33
5 11 0 2403390 8018 396 57 95 6 946 3780 263 23753 1784 93 3 1 3 3.77 96.6 02:00:35
3 11 0 2402851 7753 424 17 73 0 64 98 393 6538 2994 52 6 9 33 2.31 59.3 02:00:37
5 11 0 2404724 7710 354 10 49 0 1356 5981 239 4234 1928 70 6 16 9 2.96 75.8 02:00:39
3 11 0 2404716 8279 324 23 36 0 586 2056 436 6414 3849 76 2 8 13 3.12 80.0 02:00:41
kthr memory page faults cpu time
----------- --------------------- ------------------------------------ ------------------ ----------------------- --------
r b p avm fre fi fo pi po fr sr in sy cs us sy id wa pc ec hr mi se
5 13 0 2404718 8054 373 0 0 0 274 784 192 1603 2058 56 1 26 17 2.36 60.4 02:00:43
5 13 0 2404716 7965 311 0 0 0 273 583 237 2342 2394 79 1 7 13 3.16 81.0 02:00:45
2 12 0 2388769 26143 353 22 35 3 322 590 390 3550 3130 48 4 25 24 2.06 52.9 02:00:47
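
The b-to-r comparison named in the slide title can be put into numbers. The following is a minimal sketch, assuming the `vmstat -Iwt 2` layout shown above, where the first three columns are r, b and p. The function names are illustrative and not part of the original presentation; the five sample rows are copied from the top of the listing.

```python
def parse_kthr(lines):
    """Return (r, b) pairs from vmstat -I rows; header and separator lines are skipped."""
    pairs = []
    for line in lines:
        fields = line.split()
        # Data rows start with three integers: r, b, p.
        if len(fields) >= 3 and all(f.isdigit() for f in fields[:3]):
            pairs.append((int(fields[0]), int(fields[1])))
    return pairs


def blocked_to_runnable(pairs):
    """Ratio of summed kthr:b to summed kthr:r (None if r is zero throughout)."""
    total_r = sum(r for r, _ in pairs)
    total_b = sum(b for _, b in pairs)
    return total_b / total_r if total_r else None


# First five data rows of the listing above, plus its column header.
SAMPLE = """\
r b p avm fre fi fo pi po fr sr in sy cs us sy id wa pc ec hr mi se
1 11 0 2381462 12283 574 41 4 0 0 0 282 14044 3759 36 3 36 24 1.59 40.8 02:00:03
5 6 0 2383434 9229 537 21 15 0 0 0 214 9309 1221 65 8 22 5 2.94 75.4 02:00:05
3 5 0 2381734 10665 571 5 4 0 451 1303 442 8902 3345 83 6 5 6 3.53 90.4 02:00:07
3 10 0 2383409 7700 666 1 3 0 0 0 232 2606 1828 46 7 20 28 2.08 53.3 02:00:09
6 10 0 2383410 7743 611 0 0 0 661 1956 270 5229 3206 93 2 2 4 3.72 95.3 02:00:11
"""

pairs = parse_kthr(SAMPLE.splitlines())
ratio = blocked_to_runnable(pairs)
samples_b_over_r = sum(1 for r, b in pairs if b > r)
print(f"b:r = {ratio:.2f}, b > r in {samples_b_over_r} of {len(pairs)} samples")
# prints: b:r = 2.33, b > r in 5 of 5 samples
```

Summing over the interval rather than averaging per-sample ratios avoids dividing by zero on samples where r is 0.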


Poor ratio of pending I/O waits to iodones in vmstat -s output

02:18PM up 223 days, 15:57, 5 users, load average: 2.00, 2.54, 2.88
3955510096 total address trans. faults
663204206 page ins
2212345555 page outs
146844573 paging space page ins
136603972 paging space page outs
0 total reclaims
1876773643 zero filled pages faults
29655645 executable filled pages faults
3363959830 pages examined by clock
819733 revolutions of the clock hand
1565374921 pages freed by the clock
82564968 backtracks
5138 free frame waits
0 extend XPT waits
2353814111 pending I/O waits
2859262790 start I/Os
2376968599 iodones
50335935279 cpu context switches
16475063625 device interrupts
1149300248 software interrupts
8545158185 decrementer interrupts
88508563 mpc-sent interrupts
88508328 mpc-receive interrupts
1247188423 phantom interrupts
0 traps
223061804590 syscalls
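
The ratio in the slide title can be read straight from these counters. The following is a minimal sketch, assuming `vmstat -s` lines of the form "count label" as shown above; `parse_vmstat_s` is an illustrative name, and only the three I/O counters from the listing are reproduced. The counters accumulate from boot (223 days here), so the result is a long-run average; the difference between two snapshots gives the ratio for a specific interval.

```python
import re

# Uptime line and the three I/O counters, copied from the listing above.
VMSTAT_S = """\
02:18PM up 223 days, 15:57, 5 users, load average: 2.00, 2.54, 2.88
2353814111 pending I/O waits
2859262790 start I/Os
2376968599 iodones
"""


def parse_vmstat_s(text):
    """Map each counter label to its integer value; non-counter lines are ignored."""
    counters = {}
    for line in text.splitlines():
        m = re.match(r"\s*(\d+)\s+(.+?)\s*$", line)
        if m:
            counters[m.group(2)] = int(m.group(1))
    return counters


counters = parse_vmstat_s(VMSTAT_S)
ratio = counters["pending I/O waits"] / counters["iodones"]
print(f"pending I/O waits per iodone: {ratio:.3f}")
# prints: pending I/O waits per iodone: 0.990
```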




Tactical Check-list of What-to-Watch: check balance of component capacities/capabilities for workload, i.e. ensure proper tuning-by-hardware

date ; uname -a ; id ; oslevel -s ; lparstat -i


prtconf
lscfg
lsdev
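
A check-list like this is easier to compare over time if every run is captured with a timestamp. The following is a minimal sketch, assuming Python 3.7 or later is available on the LPAR or on an administration host; `snapshot`, `render` and `CHECKLIST` are hypothetical names, not part of the original presentation. The demonstration at the end runs only the first three commands because they exist on any UNIX-like system; the remaining entries are AIX commands taken from this slide.

```python
import subprocess
import time


def snapshot(commands, timeout=60):
    """Run each command through the shell; return [(command, stdout + stderr)]."""
    results = []
    for cmd in commands:
        try:
            proc = subprocess.run(
                cmd, shell=True, capture_output=True, text=True, timeout=timeout
            )
            output = proc.stdout + proc.stderr
        except subprocess.TimeoutExpired:
            output = f"(timed out after {timeout}s)\n"
        results.append((cmd, output))
    return results


def render(results, stamp=None):
    """Format the results as one text report with a heading per command."""
    stamp = stamp or time.strftime("%Y-%m-%d %H:%M:%S")
    parts = [f"# snapshot taken {stamp}"]
    for cmd, output in results:
        parts.append(f"\n### {cmd}\n{output.rstrip()}")
    return "\n".join(parts) + "\n"


# Commands from the slide above; all but the first three are AIX-specific.
CHECKLIST = ["date", "uname -a", "id", "oslevel -s", "lparstat -i",
             "prtconf", "lscfg", "lsdev"]

# Portable demonstration: run only the first three entries.
print(render(snapshot(CHECKLIST[:3])))
```

On an AIX LPAR, `snapshot(CHECKLIST)` would run the full list; redirecting the report to a dated file keeps one record per review.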


Tactical Check-list of What-to-Watch: Review implemented AIX constructs, i.e. “firm” near-static structures and settings

lsps -a
lsps -s
mount
df -k
ipcs -bm
vmo -L
ioo -L


Tactical Check-list of What-to-Watch: Review historical/accumulated AIX events, usages, delays, counts, exhaustions, etc.

uptime ; vmstat -s
uptime ; vmstat -v
uptime ; ps -ekf | egrep "syncd|lrud|nfsd|biod|wait"
iostat -a
iostat -D
vmstat -v | grep pbuf
for VG in `lsvg`; do lvmo -a -v $VG; echo; done
netstat -ss
netstat -v
nfsstat
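
Because these counters accumulate from boot, their growth between two readings is often more telling than their absolute values. The following is a minimal sketch, assuming the `vmstat -v` lines that report I/Os "blocked with no pbuf", "psbuf" or "fsbuf". The function names are illustrative, and the counts in `BEFORE` and `AFTER` are invented for the example; they are not measurements from the presentation.

```python
import re


def blocked_counters(vmstat_v_text):
    """Map each 'blocked with no ...' label in vmstat -v output to its count."""
    counters = {}
    for line in vmstat_v_text.splitlines():
        m = re.match(r"\s*(\d+)\s+(.*blocked with no \w+)\s*$", line)
        if m:
            counters[m.group(2)] = int(m.group(1))
    return counters


def growth(before, after):
    """Counters that increased between two snapshots, with the size of the increase."""
    return {
        label: after[label] - before[label]
        for label in after
        if label in before and after[label] > before[label]
    }


# HYPOTHETICAL counts, for illustration only.
BEFORE = """\
      0 pending disk I/Os blocked with no pbuf
      0 paging space I/Os blocked with no psbuf
   2228 filesystem I/Os blocked with no fsbuf
"""
AFTER = """\
     12 pending disk I/Os blocked with no pbuf
      0 paging space I/Os blocked with no psbuf
   2300 filesystem I/Os blocked with no fsbuf
"""

delta = growth(blocked_counters(BEFORE), blocked_counters(AFTER))
for label, increase in delta.items():
    print(f"+{increase} {label}")
```

Only the counters that grew are reported, so an unchanged line such as the psbuf count drops out of the result.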


Tactical Check-list of What-to-Watch: Monitor dynamic AIX command behaviors, i.e. ps, vmstat, mpstat, iostat, topas, etc.

topas <cr> d n c c
ps -kelmo THREAD
ps guww
ps gvww
vmstat -Iwt 2
mpstat -w 2
mpstat -dw 2
iostat -aT 2
iostat -AQ 2


Trademarks
The following are trademarks of the International Business Machines Corporation in the United States, other countries, or both.

Not all common law marks used by IBM are listed on this page. Failure of a mark to appear does not mean that IBM does not use the mark nor does it mean that the product is not
actively marketed or is not significant within its relevant market.
Those trademarks followed by ® are registered trademarks of IBM in the United States; all others are trademarks or common law marks of IBM in the United States.

For a complete list of IBM Trademarks, see www.ibm.com/legal/copytrade.shtml:

*, AS/400®, e business(logo)®, DBE, ESCO, eServer, FICON, IBM®, IBM (logo)®, iSeries®, MVS, OS/390®, pSeries®, RS/6000®, S/30, VM/ESA®, VSE/ESA,
WebSphere®, xSeries®, z/OS®, zSeries®, z/VM®, System i, System i5, System p, System p5, System x, System z, System z9®, BladeCenter®

The following are trademarks or registered trademarks of other companies.

Adobe, the Adobe logo, PostScript, and the PostScript logo are either registered trademarks or trademarks of Adobe Systems Incorporated in the United States, and/or other countries.
Cell Broadband Engine is a trademark of Sony Computer Entertainment, Inc. in the United States, other countries, or both and is used under license therefrom.
Java and all Java-based trademarks are trademarks of Sun Microsystems, Inc. in the United States, other countries, or both.
Microsoft, Windows, Windows NT, and the Windows logo are trademarks of Microsoft Corporation in the United States, other countries, or both.
Intel, Intel logo, Intel Inside, Intel Inside logo, Intel Centrino, Intel Centrino logo, Celeron, Intel Xeon, Intel SpeedStep, Itanium, and Pentium are trademarks or registered trademarks of Intel
Corporation or its subsidiaries in the United States and other countries.
UNIX is a registered trademark of The Open Group in the United States and other countries.
Linux is a registered trademark of Linus Torvalds in the United States, other countries, or both.
ITIL is a registered trademark, and a registered community trademark of the Office of Government Commerce, and is registered in the U.S. Patent and Trademark Office.
IT Infrastructure Library is a registered trademark of the Central Computer and Telecommunications Agency, which is now part of the Office of Government Commerce.
* All other products may be trademarks or registered trademarks of their respective companies.

Notes:
Performance is in Internal Throughput Rate (ITR) ratio based on measurements and projections using standard IBM benchmarks in a controlled environment. The actual throughput that any user will
experience will vary depending upon considerations such as the amount of multiprogramming in the user's job stream, the I/O configuration, the storage configuration, and the workload processed.
Therefore, no assurance can be given that an individual user will achieve throughput improvements equivalent to the performance ratios stated here.
IBM hardware products are manufactured from new parts, or new and serviceable used parts. Regardless, our warranty terms apply.
All customer examples cited or described in this presentation are presented as illustrations of the manner in which some customers have used IBM products and the results they may have achieved. Actual
environmental costs and performance characteristics will vary depending on individual customer configurations and conditions.
This publication was produced in the United States. IBM may not offer the products, services or features discussed in this document in other countries, and the information may be subject to change without
notice. Consult your local IBM business contact for information on the product or services available in your area.
All statements regarding IBM's future direction and intent are subject to change or withdrawal without notice, and represent goals and objectives only.
Information about non-IBM products is obtained from the manufacturers of those products or their published announcements. IBM has not tested those products and cannot confirm the performance,
compatibility, or any other claims related to non-IBM products. Questions on the capabilities of non-IBM products should be addressed to the suppliers of those products.
Prices subject to change without notice. Contact your IBM representative or Business Partner for the most current pricing in your geography.


Disclaimers
No part of this document may be reproduced or transmitted in any form without written permission from IBM
Corporation.
Product data has been reviewed for accuracy as of the date of initial publication. Product data is subject to change
without notice. This information could include technical inaccuracies or typographical errors. IBM may make
improvements and/or changes in the product(s) and/or program(s) at any time without notice. Any statements
regarding IBM's future direction and intent are subject to change or withdrawal without notice, and represent goals
and objectives only.
The performance data contained herein was obtained in a controlled, isolated environment. Actual results that may be
obtained in other operating environments may vary significantly. While IBM has reviewed each item for accuracy in
a specific situation, there is no guarantee that the same or similar results will be obtained elsewhere. Customer
experiences described herein are based upon information and opinions provided by the customer. The same
results may not be obtained by every user.
Reference in this document to IBM products, programs, or services does not imply that IBM intends to make such
products, programs or services available in all countries in which IBM operates or does business. Any reference to
an IBM Program Product in this document is not intended to state or imply that only that program product may be
used. Any functionally equivalent program, that does not infringe IBM's intellectual property rights, may be used
instead. It is the user's responsibility to evaluate and verify the operation on any non-IBM product, program or
service.
THE INFORMATION PROVIDED IN THIS DOCUMENT IS DISTRIBUTED "AS IS" WITHOUT ANY WARRANTY, EITHER
EXPRESS OR IMPLIED. IBM EXPRESSLY DISCLAIMS ANY WARRANTIES OF MERCHANTABILITY, FITNESS FOR A
PARTICULAR PURPOSE OR INFRINGEMENT. IBM shall have no responsibility to update this information. IBM
products are warranted according to the terms and conditions of the agreements (e.g. IBM Customer Agreement,
Statement of Limited Warranty, International Program License Agreement, etc.) under which they are provided. IBM
is not responsible for the performance or interoperability of any non-IBM products discussed herein.


Disclaimers Continued
Information concerning non-IBM products was obtained from the suppliers of those products, their published
announcements or other publicly available sources. IBM has not tested those products in connection with this
publication and cannot confirm the accuracy of performance, compatibility or any other claims related to non-IBM
products. Questions on the capabilities of non-IBM products should be addressed to the suppliers of those
products.
The providing of the information contained herein is not intended to, and does not, grant any right or license under
any IBM patents or copyrights. Inquiries regarding patent or copyright licenses should be made, in writing, to:
IBM Director of Licensing
IBM Corporation
North Castle Drive
Armonk, NY 10504-1785
USA

IBM customers are responsible for ensuring their own compliance with legal requirements. It is the customer's sole
responsibility to obtain advice of competent legal counsel as to the identification and interpretation of any relevant
laws and regulatory requirements that may affect the customer's business and any actions the customer may need
to take to comply with such laws.
IBM does not provide legal advice or represent or warrant that its services or products will ensure that the customer is
in compliance with any law.
The information contained in this documentation is provided for informational purposes only. While efforts were made
to verify the completeness and accuracy of the information provided, it is provided “as is” without warranty of any
kind, express or implied. IBM shall not be responsible for any damages arising out of the use of, or otherwise
related to, this documentation or any other documentation. Nothing contained in this documentation is intended to,
nor shall have the effect of, creating any warranties or representations from IBM (or its suppliers or licensors), or
altering the terms and conditions of the applicable license agreement governing the use of IBM software.


AIX 5L/6 Performance Tuning Part I: Tactics for Monitoring Indications of Performance Issues

Thank You
Earl Jew
IBM Field Technical Sales Specialist for Power Systems and Storage
IBM Regional Designated Specialist - Power/AIX Performance & Tuning
400 North Brand Blvd., Suite 700 c/o IBM, Glendale, CA, USA 91203
earlj@us.ibm.com (310)251-2907
