AIX POWERHA Discussion - 20130109

AIX POWERHA
sina@
Basics
clstrmgr
The Cluster Manager (clstrmgr) is a daemon that runs on each cluster node. It uses services provided by the RSCT subsystems to monitor the status of the nodes and their interfaces: it receives information from Topology Services and uses Group Services for inter-node communication. It invokes the appropriate scripts in response to node or network events (recovering from SW/HW failures, requests to bring a node online or offline, requests to move a resource group or bring it online or offline). It also maintains up-to-date information about the resource groups (status, location).
If clstrmgr hangs or is terminated, the default action taken by SRC is to issue halt -q, causing the system to crash. clstrmgr depends on RSCT: if topsvcs or grpsvcs fails to start, clstrmgr will not start either.
Basics
clinfo
Clinfo obtains updated cluster information from the Cluster Manager and makes information about the state of the cluster, nodes, networks and applications available. It is used by clstat, and it is optional on cluster nodes and clients.
# startsrc -s clinfoES <--starts clinfo
# stopsrc -s clinfoES <--stops clinfo
// the /usr/es/sbin/cluster/etc/rc.cluster script also starts everything
Basics
netmon.cf
You can create a netmon.cf configuration file with a list of additional network addresses. These addresses are used only by Topology Services, which sends ICMP ECHO requests to them to help determine an adapter's status. This is recommended in clusters with only a single network card on each node, because Topology Services cannot force traffic over the single adapter to confirm that it operates properly.
The file should be in the /usr/es/sbin/cluster directory on all nodes and contain 1 IP address per line. It should contain remote IP labels/addresses that are not in the cluster configuration and that can be accessed from PowerHA.
ICMP ECHO requests are sent to each IP address in the file. After sending the request to every address, netmon checks the inbound packet count before determining whether an adapter has failed.
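The layout rule above (one IP label/address per line) can be checked before the file is copied to the nodes. This is only a sketch: check_netmon_cf is a hypothetical helper, not a PowerHA command, and it assumes the simple one-entry-per-line format described here.

```shell
#!/bin/sh
# check_netmon_cf FILE
# Hypothetical helper (not part of PowerHA): verifies that every non-blank
# line of FILE holds exactly one entry (one IP label or address).
# Prints the offending lines and returns 1 if any line has more than one field.
check_netmon_cf() {
    awk '
        BEGIN   { bad = 0 }
        NF == 0 { next }    # ignore blank lines
        NF != 1 { printf "line %d: expected one entry, found %d\n", NR, NF; bad = 1 }
        END     { exit bad }
    ' "$1"
}
```

On a cluster node it could be called as: check_netmon_cf /usr/es/sbin/cluster/netmon.cf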
Basics
clhosts
This file contains IP address information that enables communication between monitoring daemons on clients and the PowerHA cluster nodes. The file resides on all PowerHA cluster servers and clients in the /usr/es/sbin/cluster/etc/ directory. When a monitoring daemon starts up (for example clinfoES on a client), it reads this file to learn which nodes are available for communication.
(When running the clstat utility from a client, clinfoES obtains its information from this file.)
Basics
ARP
ARP (Address Resolution Protocol) is the Internet protocol used to dynamically map Internet addresses to physical (hardware/MAC) addresses on local area networks. The /usr/es/sbin/cluster/etc/clinfo.rc script, which is called by the clinfo utility whenever a network or node event occurs, updates the system's ARP cache.
Basics
clinfo.rc
PowerHA can be configured to change the MAC address of a network interface by hardware address takeover (HWAT). In a switched environment, the network switch might not always be informed promptly of the new MAC address. The clinfo.rc script is used to flush the system's ARP cache so that it reflects changes to network IP addresses. (HWAT is only supported when using IPAT via replacement.)
On clients not running clinfoES, you might have to update the local ARP cache by pinging the client from the cluster node. To avoid this, add the IP of the client to the PING_CLIENT_LIST variable in the clinfo.rc script (/usr/es/sbin/cluster/etc/clinfo.rc). Through the PING_CLIENT_LIST entries, the ARP cache of clients (and other network devices) can be updated.
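As an illustration of the PING_CLIENT_LIST entry, the fragment below shows the kind of assignment that would be edited into clinfo.rc. It is a sketch only: the two addresses are hypothetical clients, the list is assumed to be space-separated, and the loop is a dry run that prints the clients instead of pinging them.

```shell
#!/bin/sh
# Kind of edit made in /usr/es/sbin/cluster/etc/clinfo.rc.
# The two addresses are hypothetical clients that do not run clinfoES.
PING_CLIENT_LIST="10.10.10.50 10.10.10.51"

# Dry run: list the clients that would be pinged after a cluster event.
for client in $PING_CLIENT_LIST; do
    echo "would ping $client"
done
```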
Basics
clcomd
All cluster communication goes through clcomd. It must be running before any cluster services can be started. The trusted IP addresses are stored in the /usr/es/sbin/cluster/etc/rhosts file (owner root.system, mode 0600). Nodes with a non-empty rhosts file (or a missing rhosts file) will refuse all HACMP-related communication with nodes not listed in their rhosts file. If an adapter is missing or there is a format error in the file, clcomd will not function and all connections will be denied. After the first synchronization the HACMP ODM classes are populated, so the rhosts file can be emptied.
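Since clcomd denies connections when the rhosts file is not in order, a quick permission check can help during troubleshooting. A minimal sketch, assuming the 0600 mode mentioned above; rhosts_mode_ok is a hypothetical helper name, and ownership is not checked here.

```shell
#!/bin/sh
# rhosts_mode_ok FILE
# Hypothetical helper: returns 0 if FILE is a regular file with mode 0600
# (shown by ls as -rw-------), otherwise prints the mode found and returns 1.
rhosts_mode_ok() {
    perms=$(ls -ld "$1" | awk '{ print $1 }')
    case "$perms" in
        -rw-------*) return 0 ;;
        *) echo "$1 has mode $perms, expected -rw-------"; return 1 ;;
    esac
}
```

On a cluster node: rhosts_mode_ok /usr/es/sbin/cluster/etc/rhosts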
clcomd is started via an /etc/inittab entry, which is created during PowerHA installation (clverify uses the clcomd subsystem). It uses port 6191, and it is the transport medium for PowerHA cluster verification, global ODM changes and remote command execution.
clcomd is managed by SRC (startsrc, stopsrc, refresh; refresh is useful to reread the /usr/es/sbin/cluster/etc/rhosts file), and its logs are in /var/hacmp/clcomd/clcomd.log
Basics
RSCT
RSCT is a software stack, a package of services ("client subsystems"), which is a prerequisite for HACMP and is packaged with AIX.
-Topology Services: generates heartbeats to monitor nodes, networks and network adapters, and diagnoses failures. When a node joins the cluster, Topology Services adds the adapter information to the machine list. This "topology" or "connectivity" information is then passed on to Group Services.
-Group Services: "client subsystems" (e.g. the event management subsystem, RMC subsystems...) form groups, each with a membership list. Group Services provides the reliable communication and protocols required for cluster operation. The main daemon is hagsd. Group Services coordinates and monitors state changes within the cluster (e.g. node join/leave) and then passes these state changes to the interested subscribers, for example the Cluster Manager or event management. Enhanced concurrent mode disks use RSCT Group Services to control locking.
-Resource Monitoring and Control (RMC): RMC notifies the Cluster Manager about events, so that it can respond to them through RSCT. The main daemon is rmcd. Process application monitoring uses RMC and therefore does not require any custom script. Dynamic Node Priority (DNP) is calculated using RMC.
-Event Management: matches information about the state of system resources and initiates the scripts needed to manage the cluster.
HACMP relies on Topology Services for heartbeats and Group Services for reliable messaging. These services are started before the cluster processes; if there are problems with these services, the cluster will not start.
Basics
SNMP
SNMP is a widely used protocol for network management. It is used for collecting information from, and configuring, network devices such as servers, printers, hubs, switches and routers.
When running cldump, it reports: "Obtaining information via SNMP from Node: aix11..."
Basics
clsmuxpd
clsmuxpd provides SNMP support. (This applies to versions before 5.3, as clinfo was improved later.)
clinfo is based on SNMP; it queries the clsmuxpd daemon for up-to-date cluster information and provides a simple display of it.
Basics
C-SPOC
C-SPOC helps manage the entire cluster from a single point, either in smitty hacmp or with the commands under /usr/es/sbin/cluster/cspoc. C-SPOC uses clcomd for HACMP communication between nodes, so the /etc/rhosts file is no longer used. If a C-SPOC function fails, the failure is logged in /tmp/cspoc.log on the node performing the operation.
cspoc.log <--contains the commands that were used
Basics
IPAT via IP replacement
The service IP replaces the existing address on the interface, so only one service IP can be configured on an interface at a time. The service IP should be on the same subnet as one of the boot IP addresses. Other interfaces on this node cannot be in the same subnet; they are called standby interfaces and are used if the boot interface fails. IPAT via IP replacement can save subnets, but requires extra hardware.
If the interface holding the service IP address fails, PowerHA moves the service IP address to another available interface on the same node and on the same network; in this case, the resource group is not affected. If there is no available interface on the same node, the resource group is moved together with the service IP to another node with an available interface on the same logical network.
Basics
IPAT via aliasing
The service IP is aliased (using the ifconfig command) onto the interface without removing the underlying boot IP address. This means more than one service IP label can coexist on one interface.
Each boot interface on a node must be on a different subnet. The service IP labels can be on one or more subnets, but they cannot be the same as any of the boot interface subnets.
Standby interfaces are not necessary, because all interfaces are labeled as boot interfaces.
By removing the need for one interface per service IP address that the node could host, IPAT through aliasing is more flexible and in some cases requires less hardware. IPAT through aliasing also reduces fallover time, as it is much faster to add an alias to an interface than to remove the base IP address and then apply the service IP address.
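The subnet rules for IPAT via aliasing can be verified with a little arithmetic before the addresses are entered into the cluster configuration. The sketch below is not a PowerHA tool: subnet_of, same_subnet and check_alias_layout are hypothetical helper names, and a single common netmask is assumed for all addresses.

```shell
#!/bin/sh
# subnet_of IP NETMASK -> prints the network address as a dotted quad.
# Runs in a subshell so the IFS change does not leak to the caller.
subnet_of() (
    IFS=.
    set -- $1 $2    # split both dotted quads into 8 positional parameters
    echo "$(( $1 & $5 )).$(( $2 & $6 )).$(( $3 & $7 )).$(( $4 & $8 ))"
)

# same_subnet IP1 IP2 NETMASK -> exit status 0 if both addresses share a subnet
same_subnet() {
    [ "$(subnet_of "$1" "$3")" = "$(subnet_of "$2" "$3")" ]
}

# check_alias_layout NETMASK SERVICE_IP BOOT_IP...
# Returns 0 when all boot addresses are on different subnets and the
# service address shares a subnet with none of them.
check_alias_layout() {
    mask=$1
    svc=$2
    shift 2
    seen=" "
    for boot in "$@"; do
        net=$(subnet_of "$boot" "$mask")
        case "$seen" in
            *" $net "*) echo "boot subnet $net is used twice"; return 1 ;;
        esac
        seen="$seen$net "
        if same_subnet "$svc" "$boot" "$mask"; then
            echo "service IP $svc shares subnet $net with boot IP $boot"
            return 1
        fi
    done
    return 0
}
```

For example, check_alias_layout 255.255.255.0 10.10.30.5 10.10.10.2 10.10.20.2 succeeds, while putting the service IP on 10.10.10.x would be reported as a conflict (all addresses here are made up for illustration).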
Basics
HACMP start-up
When PowerHA is installed, it creates this /etc/inittab entry:
hacmp:2:once:/usr/es/sbin/cluster/etc/rc.init >/dev/console 2>&1
// it starts clcomdES, clstrmgrES, snmpd, syslogd
If PowerHA is configured for IP Address Takeover:
harc:2:wait:/usr/es/sbin/cluster/etc/harc.net # HACMP for AIX network startup
When the "start at system restart" option is chosen in C-SPOC:
hacmp6000:2:wait:/usr/es/sbin/cluster/etc/rc.cluster -boot -A # Bring up Cluster
// do not use this option; manual control is better
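To see whether the automatic start entry has been added on a node, the inittab text can be filtered for it. A minimal sketch: autostart_enabled is a hypothetical helper name, and it only looks for the hacmp6000 entry quoted above.

```shell
#!/bin/sh
# autostart_enabled: reads inittab entries on stdin and returns 0 if the
# hacmp6000 entry that runs rc.cluster with -boot is present.
autostart_enabled() {
    grep -q '^hacmp6000:.*rc\.cluster.*-boot'
}

# On an AIX node (lsitab prints a single inittab entry):
#   lsitab hacmp6000 | autostart_enabled && echo "cluster starts at boot"
```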
Build-Configure
Install HA Software
-cluster.es
-cluster.es.cspoc
-cluster.license
-cluster.man.en_US.es
# reboot
Build-Configure
Network and /etc/hosts
Boot interfaces are those that share the service subnet. Standby interfaces are those that are not on the service subnet.
IPAT via IP REPLACEMENT: the service IP is in the same subnet as the boot IP
IPAT via IP ALIASING: all IPs are in different subnets
IP Aliasing in detail:
-All base IP addresses on a node must be on separate subnets (if heartbeat monitoring over IP aliases is not used).
-All service IP addresses must be on a separate subnet from any of the base subnets.
-The service IP addresses can all be in the same or different subnets.
-The subnet masks must all be the same.
IP Replacement in detail:
-Base (boot) and service IP addresses on the primary adapter must be on the same subnet.
-All base IP addresses on the secondary adapters must be on separate subnets (different from each other and from the primary adapter).
Build-Configure
Storage and FileSystem
-shared disk (on both nodes):
cfgmgr
chdev -l hdiskX -a pv=yes
-create enhanced concurrent vg:
lvlstmajor <--on both nodes
mkvg -C -y orabckvg -s 128 -n -V 50 hdiskpower50 // autovaryon should be turned off
-create lv, fs if needed
-on the other node:
importvg -V 50 -y orabckvg -n hdiskpower50
Build-Configure
Application Script
stop/start application scripts should be created
Build-Configure
Extended Topology
Extended Config -> Extended Topology:
-Config. HACMP Cluster
-Config. HACMP Node (set nodes and IPs)
Build-Configure
Discover
Extended Config -> Discover
The "/usr/es/sbin/cluster/etc/rhosts" file may be needed, filled in with the necessary IPs.
Build-Configure
Extended Topology
Extended Config -> Extended Topology
Configure HACMP Networks: give a name and set the netmask
-enable IP address takeover with Alias? --> yes: if Aliasing --> no: if Replacement
Configure HACMP Interface/Device: Add discovered -> comm interface
ALIASING: 1 network is configured and both the ...boot_1 and ...boot_2 addresses are used, because they are in different subnets
IP REPLACEMENT: 1 network is configured and the ...boot_1 addresses are used, because the ...boot_2 IPs are in a different subnet from the service IP
// Not necessary, but here we can do a verification and synchronization to see if everything is correct: Ext. Config -> Ext. Ver...
Build-Configure
RG and resources
Extended Config -> Ext. resource:
-Ext. Res. Group:
-startup policy:
if IP repl. with 1 NIC per network per node -> Online Using Distribution Policy
if IP repl. with more NICs on a network on the nodes -> anything
if IP aliasing -> anything
-Extended Resource:
-Appl. Server: (start, stop scripts)
-Service IP..: Configurable on Multiple Nodes -> which network -> F4 to choose
RG and resources are ready to be related together:
Extended Config -> Extended RG -> Change/show Resources for a RG: with F4 add:
-Service IP
-Appl Serv
-VG
Build-Configure
Sync and Verify
places to start in SMITTY:
-under Standard Config.: this always runs a full verification; all aspects of the cluster are checked before synchronization. It has no options, it is a press-and-go function.
-under Extended Config.: allows separation of verification and synchronization
Emulate: changes will be tested before trying to implement them
Actual: it implements the settings
Forced: ignores any errors, can be dangerous to a running cluster
Verify changes only: if only a few items changed, it allows much faster verification and synchronization
Always verify and synchronize the cluster from the node on which the change occurred to the other nodes in the cluster.
-Problem Determination Tools > HACMP Verification: verification can be run without causing a synchronization!
Error count: verification is successful if the error count is not exceeded.
The verify process will indicate warnings if the cluster is capable of running but:
-may have items that are not configured in resource groups
-the recommendations for configuring the cluster have not been followed
When clverify has problems, it is usually related to the ability to contact a node. The node-to-node communication is provided by clcomd.
(clcomd uses the /usr/es/sbin/cluster/etc/rhosts file for inter-node security; in earlier releases it used the 'r' commands.)
There are 2 log files for problem checking:
clverify: /var/hacmp/clverify/clverify.log (automatic cluster configuration monitoring runs every 24 hours, by default on the first node in alphabetical order)
clcomd: /var/hacmp/clcomd/clcomd.log
additional logs:
/var/hacmp/clverify/fail <--stores data from the most recent failed verification attempt
/var/hacmp/clverify/pass <--stores data from the most recent passed verification attempt
Command Help
ODM
odmget HACMPlogs <--shows where the log files are
odmget HACMPcluster <--shows cluster version
odmget HACMPnode <--shows info from nodes
// changing the location of the log files: C-SPOC > Log Viewing and Management
/etc/es/objrepos <--HACMP ODM files
Command Help
POWERHA LOGS
Command Help
RSCT LOGS
/var/ha/log <--RSCT logs are here
/var/ha/log/nim.topsvcs... <--the heartbeats are logged here (comm. is OK between the nodes)
Command Help
Command
clRGinfo <--shows the state of RGs (in earlier HACMP, clfindres was used)
clRGinfo -p <--(POL) shows the node that temporarily has the highest priority
clRGinfo -t <--shows the delayed timer information
clRGinfo -m <--shows the status of the application monitors of the cluster
resource groups state can be: online, offline, acquiring, releasing, error, unknown
cldump (or clstat -o) <--detailed info about the cluster (realtime, shows cluster status) (clstat requires a running clinfo)
cldisp <--detailed general info about the cluster (not realtime)
cltopinfo <--detailed information about the network of the cluster (this shows the data in DCD, not in ACD)
cltopinfo -i <--good overview, same as cllsif: this also lists cluster interfaces, it was used prior to HACMP 5.1
cltopinfo -m <--shows heartbeat statistics, missed heartbeats
clshowres <--detailed information about the resource group(s)
cllsserv <--shows which scripts will be run in case of a takeover
clrgdependency -t PARENT_CHILD -sl <--shows parent child dependencies of resource groups
clshowsrv -v <--shows status of the cluster daemons (very good overview!!!)
lssrc -ls clstrmgrES <--shows if cluster is STABLE or not, cluster version, Dynamic Node Priority (pgspace free, disk busy, cpu idle)
ST_STABLE: cluster services running with resources online
NOT_CONFIGURED: cluster is not configured or node is not synced
ST_INIT: cluster is configured but not active on this node
ST_JOINING: cluster node is joining the cluster
ST_VOTING: cluster nodes are voting to decide event execution
ST_RP_RUNNING: cluster is running a recovery program
RP_FAILED: recovery program event script has failed
ST_BARRIER: clstrmgr is in between events, waiting at the barrier
ST_CBARRIER: clstrmgr is exiting a recovery program
ST_UNSTABLE: cluster is unstable, usually due to an event error
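The state list above lends itself to a small lookup that turns the lssrc output into a readable message. This is a sketch under an assumption: that the output of lssrc -ls clstrmgrES contains a line of the form "Current state: ST_STABLE". The function names cluster_state and describe_state are hypothetical.

```shell
#!/bin/sh
# cluster_state: reads `lssrc -ls clstrmgrES` output on stdin and prints the
# state token found on the (assumed) "Current state:" line.
cluster_state() {
    sed -n 's/^Current state:[ ]*\([A-Z_]*\).*/\1/p' | head -n 1
}

# describe_state STATE -> one-line meaning, taken from the list above.
describe_state() {
    case "$1" in
        ST_STABLE)      echo "cluster services running with resources online" ;;
        NOT_CONFIGURED) echo "cluster is not configured or node is not synced" ;;
        ST_INIT)        echo "cluster is configured but not active on this node" ;;
        ST_JOINING)     echo "cluster node is joining the cluster" ;;
        ST_VOTING)      echo "cluster nodes are voting to decide event execution" ;;
        ST_RP_RUNNING)  echo "cluster is running a recovery program" ;;
        RP_FAILED)      echo "recovery program event script has failed" ;;
        ST_BARRIER)     echo "clstrmgr is in between events, waiting at the barrier" ;;
        ST_CBARRIER)    echo "clstrmgr is exiting a recovery program" ;;
        ST_UNSTABLE)    echo "cluster is unstable, usually due to an event error" ;;
        *)              echo "unknown state: $1" ;;
    esac
}

# On a cluster node:
#   describe_state "$(lssrc -ls clstrmgrES | cluster_state)"
```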
lssrc -g cluster <--lists the running cluster daemons
lssrc -ls topsvcs <--shows the status of individual diskhb devices, heartbeat intervals, failure cycle (missed heartbeats)
lssrc -ls grpsvcs <--gives info about connected clients, number of groups
lssrc -ls emsvcs <--shows the resource monitors known to the event management subsystem
lssrc -ls snmpd <--shows info about snmpd
halevel -s <--shows PowerHA level (from 6.1)
cl_ping <--pings all the adapters of the given list (e.g.: cl_ping -w 2 aix21 aix31 (-w: wait 2 seconds))
cldiag <--HACMP troubleshooting tool (e.g.: cldiag debug clstrmgr -l 5 <--shows clstrmgr heartbeat infos)
cldiag vgs -h nodeA nodeB <--this checks the shared vgs definitions on the given nodes for inconsistencies
/usr/es/sbin/cluster/utilities/get_local_nodename <--shows the name of this node within the HACMP
/usr/es/sbin/cluster/utilities/clexit.rc <--this script halts the node if the cluster manager daemon stopped incorrectly
Command Help
Remove HA Configuration
1. stop cluster on both nodes
2. remove the cluster configuration (smitty hacmp) on both nodes
3. remove cluster filesets (starting with cluster.*)
If you are planning to do a crash test, do it with halt -q or reboot -q. shutdown -Fr will not work, because it stops HACMP and the resource groups gracefully (rc.shutdown), so no takeover will occur.
Disk HeartBeat
OverView
Heartbeat disks should be used in enhanced concurrent mode. Enhanced concurrent mode disks use RSCT group services to control locking, thus freeing up a sector on the disk that can now be used for communication. This sector, which was formerly used for SSA Concurrent mode disks, is now used for writing heartbeat information.
Any disk that is part of an enhanced concurrent volume group can be used for a diskhb network, including those used for data storage. Also, the volume group that contains the disk used for a diskhb network does not have to be varied on. An enhanced concurrent volume group is not the same as a concurrent volume group (which is part of a concurrent resource group), rather, it refers to the mode of locking by using RSCT.
Disk HeartBeat
How to View
lspv | grep hb <--shows the actual state of heartbeat disks
cltopinfo -i | grep hb <--shows what has been saved into the configuration (we have to change it to show the actual state)
Disk HeartBeat
How to Config
1. Create diskhb network
Extended Configuration->Extended Topology->Configure HACMP Networks->Add a Network...
choose: diskhb
* Network Name [anything you want]
* Network Type diskhb
2. Add device
Extended Configuration->Extended Topology->Configure HACMP Comm. Interfaces/Dev.->Add ...
Add Pre-defined... Communication Devices
Choose your diskhb Network Name
* Device Name [aix41_diskhb2] <--choose a unique name
* Network Type diskhb
* Network Name net_diskhb_aix41_aix42
* Device Path [/dev/vpath4]
* Node Name [aix41]
// You will repeat this process for the other node and the other device. This will complete both devices for the diskhb network.
Disk HeartBeat
How to Test
DO NOT PERFORM THIS TEST WHILE HACMP IS RUNNING???
dhb_read -p devicename <--dump diskhb sector contents
dhb_read -p devicename -r <--receive data over diskhb network
dhb_read -p devicename -t <--transmit data over diskhb network
1. on one node set receiving: /usr/sbin/rsct/bin/dhb_read -p hdisk2 -r
2. on the other node set transmit: /usr/sbin/rsct/bin/dhb_read -p hdisk2 -t
dhb_read -p rvpath0 -r <--Note: the device name is the raw device, as designated by the "r" preceding the device name.
If everything is OK: Link operating normally
Disk HeartBeat
How to Monitor
root@aix41: / # lssrc -ls topsvcs
Subsystem         Group            PID     Status
 topsvcs          topsvcs          921638  active
Network Name   Indx Defd Mbrs St Adapter ID      Group ID
VLAN200_10_20_ [ 0]    2    2  S 10.10.10.2      10.10.10.2
VLAN200_10_20_ [ 0] en11         0x41c64107      0x41c64108
HB Interval = 1.000 secs. Sensitivity = 10 missed beats
Missed HBs: Total: 0 Current group: 0
...
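When several heartbeat networks are defined, the "Missed HBs" counters can be summed to get a quick health figure. A minimal sketch, assuming each network reports a line in the "Missed HBs: Total: N Current group: M" form shown above; missed_hb_total is a hypothetical helper name.

```shell
#!/bin/sh
# missed_hb_total: reads `lssrc -ls topsvcs` output on stdin and prints the
# sum of the "Total:" counters found on the "Missed HBs:" lines.
missed_hb_total() {
    awk '
        /Missed HBs:/ {
            for (i = 1; i < NF; i++)
                if ($i == "Total:") sum += $(i + 1)
        }
        END { print sum + 0 }
    '
}

# On a cluster node:
#   lssrc -ls topsvcs | missed_hb_total
```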
Thank You
SINA@