Cluster Setup

Contents 1. Introduction 2. Setup and Preparation o Storage setup o Network setup 3. Installation o Prerequisite filesets o HACMP Filesets 4.
Cluster Topology Configuration o Define the Cluster o Define the Cluster Nodes o Define Cluster Sites o Define a Cluster Network o Add a Communication Interface for Heartbeat o Add Persistent IP Addresses o Storage Configuration o Disk Heartbeat 5. Resource Group Configuration o Application Volume Groups o Application Server o Cluster Service Address o Define Resource Group(s) o Create LVs and Filesystems for Applications
A. Failover Test
B. C. D. E. Disk Heartbeat Check Useful Commands clstat and snmp Related Information
1. Introduction
This article describes how to setup a two-nodes-cluster with IBM's standard cluster solution for AIX. Although the name has changed to Power HA with Version 5.5 and to Power HA System Mirror with version 7 IBM's cluster solution is still widely known as HACMP. This article refers to version 5.5.
2. Setup and Preparation

Storage setup
The reason why we create a cluster is to make an application high available. Therefore we need storage from two independent sites (read it as storage from two different datacenters). In this article we have to sites: Datacenter1 and Datacenter2. Each filesystem will be mirrored over the two sites. All storage has to be visible on both nodes. In addition we need two (very small) LUNs for disk heartbeat. 512MB to 1G LUN size is sufficient.
Network setup
In our setup we have two nodes: barney and shakira. We need a boot address only used for cluster intercommunication, a service address, and a persistent address which is equal to the hostnames of our nodes. All cluster addresses have to be present in the /etc/hosts file on both nodes:
node1+node2# vi /etc/hosts #### HACMP # Boot address 172.18.1.4 172.18.1.6 10.111.111.70 10.111.111.4 10.111.111.6 ####
Don't use hyphens (-) and underscores (_) in IP labels here.
barneyboot shakiraboot haservice1 barney shakira
# Service/Cluster address # Node/Persistent address
3. Installation
Installation of Prerequisite Filesets
There are some filesets needed in order to get HACMP to work which are typically not part of a standard AIX installation. Check for
bos.net.nfs.server bos.clvm rsct.compat.basic.hacmp rsct.compat.clients.hacmp
If they are not installed you have to do it now:
node1+node2# smitty install_latest | bos.net.nfs ALL | | | | | > + 6.1.1.0 + 6.1.4.0 Network File System Server Network File System Server
| bos.clvm ALL | | | | | | > | | rsct.compat.basic ALL | | | + 2.5.4.0 RSCT Event Management Basic Function RSCT Event Management Basic Function (HACMP/ES RSCT Event Management Basic Function (PSSP Support) + 6.1.4.2 Enhanced Concurrent Logical Volume Manager + 6.1.4.0 Enhanced Concurrent Logical Volume Manager + 6.1.1.1 Enhanced Concurrent Logical Volume Manager
| > + 2.5.4.0 Support) | | | + 2.5.4.0
| rsct.compat.clients ALL | | | | > + 2.5.4.0 Support) | | | Installation of HACMP Filesets

Put the HACMP filesets and the update filesets somewhere where you can access them from both nodes and run inutoc. Then install the filesets on both cluster nodes:
+ 2.5.4.0
RSCT Event Management Client Function RSCT Event Management Client Function (HACMP/ES RSCT Event Management Client Function (PSSP Support)
+ 2.5.4.0
node1+node2# cd /path/to/bffs node1+node2# smitty install_latest
| > cluster.es.client ALL | | | | | | | | | | | | | | | | | | | | | | > cluster.es.server ALL | | | | | | | | | | | | | | | | | | | | | + 5.5.0.4 ES Cluster Simulator + 5.5.0.0 ES Cluster Simulator + 5.5.0.6 ES Server Utilities + 5.5.0.0 ES Server Utilities + 5.5.0.6 ES Server Events + 5.5.0.0 ES Server Events + 5.5.0.5 ES Server Diags + 5.5.0.0 ES Server Diags + 5.5.0.6 ES Base Server Runtime + 5.5.0.0 ES Base Server Runtime + 5.5.0.5 Web based Smit + 5.5.0.0 Web based Smit + 5.5.0.5 ES Communication Infrastructure + 5.5.0.0 ES Communication Infrastructure + 5.5.0.5 ES Client Utilities + 5.5.0.0 ES Client Utilities + 5.5.0.5 ES Client Runtime + 5.5.0.0 ES Client Runtime + 5.5.0.4 ES Client Libraries + 5.5.0.0 ES Client Libraries
| | | | | |
+ 5.5.0.0 + 5.5.0.3 + 5.5.0.0
ES Cluster Test Tool ES Cluster Test Tool ES Two-Node Configuration Assistant
| > cluster.es.cfs ALL | | | | | | > cluster.es.nfs ALL | | | | | | > cluster.es.cspoc ALL | | | | | | | | | | | | > cluster.license ALL | | | | > cluster.man.en_US.es ALL | | | | |
Note: In the above fileset list HACMP update filesets for SP6 are included. If you installed HACMP from a base CD it's strongly recommended to update HACMP with the latest fixes. Base versions of HACMP are not known to be excessively tested.
+ 5.5.0.0 + 5.5.0.4
ES Cluster File System Support ES Cluster File System Support
+ 5.5.0.0 + 5.5.0.1
ES NFS Support ES NFS Support
+ 5.5.0.0 + 5.5.0.6 + 5.5.0.0 + 5.5.0.5 + 5.5.0.0
ES CSPOC Commands ES CSPOC Commands ES CSPOC Runtime Commands ES CSPOC Runtime Commands ES CSPOC dsh
+ 5.5.0.0
HACMP Electronic License
+ 5.5.0.0 + 5.5.0.1
ES Man Pages - U.S. English ES Man Pages - U.S. English
node1+node2# cd /path/to/update node1+node2# smitty update_all

The nodes have to be rebooted now.
node1+node2# shutdown -Fr
4. Cluster Topology Configuration

Basically the cluster configuration has to be done on only one of our nodes. Only the initial definition and startup has to be done on both nodes. Please mind the command prompt in the below commands. It indicates whether something has to be done on one node or on both nodes.
Define the Cluster

The first step is to define a cluster. This means nothing more then just define the name of our cluster
node1+node2# smitty hacmp -> Extended Configuration -> Extended Topology Configuration -> Configure an HACMP Cluster -> Add/Change/Show an HACMP Cluster Add/Change/Show an HACMP Cluster Type or select values in entry fields. Press Enter AFTER making all desired changes. [Entry Fields] * Cluster Name NOTE: HACMP must be RESTARTED on all nodes in order for change to take effect
We follow the advice and restart all cluster related services:
[Cluster1]
node1+node2# stopsrc -g cluster 0513-044 The clstrmgrES Subsystem was requested to stop. node1+node2# stopsrc -s clcomdES 0513-044 The clcomdES Subsystem was requested to stop. node1+node2# startsrc -s clcomdES 0513-059 The clcomdES Subsystem has been started. Subsystem PID is 618753. node1+node2# startsrc -g cluster 0513-059 The clinfoES Subsystem has been started. Subsystem PID is 618534. 0513-059 The clstrmgrES Subsystem has been started. Subsystem PID is 577620.
Define the Cluster Nodes

All steps so far we did on both nodes. But from now on we only work on one of our nodes. We add the first node to our cluster:
node1# smitty hacmp -> Extended Configuration -> Extended Topology Configuration -> Configure HACMP Nodes -> Add a Node to the HACMP Cluster Add a Node to the HACMP Cluster Type or select values in entry fields. Press Enter AFTER making all desired changes. [Entry Fields] * Node Name + Communication Path to Node [barney] [barneyboot]
and now the second one:
Add a Node to the HACMP Cluster Type or select values in entry fields. Press Enter AFTER making all desired changes. [Entry Fields] * Node Name + Communication Path to Node [shakira] [shakiraboot]
Define Cluster Sites

We don't really use cluster sites in this example setup. But it makes sense to define cluster sites anyway. It gives you the possibility to label your storage. We will use these labels later when we create the application filesystems. First site:
node1# smitty hacmp -> Extended Configuration -> Extended Topology Configuration -> Configure HACMP Sites -> Add a Site Add a Site Type or select values in entry fields. Press Enter AFTER making all desired changes. [Entry Fields] * Site Name + * Site Nodes + * Dominance + * Backup Communications
Second site:
[Datacenter1] barney [Yes] [none]
Add a Site Type or select values in entry fields. Press Enter AFTER making all desired changes. [Entry Fields] * Site Name + * Site Nodes + * Dominance + * Backup Communications + [Datacenter2] shakira [No] [none]
The home node of our service shall be barney - that's why we set the Dominance to Yes for barney and to No for shakira.
Define a Cluster Network

Before we start with the network configuration we let HACMP try to discover the topology. Automatic discovery does not always work, but it's worth a try.
node1# smitty hacmp -> Extended Configuration -> Discover HACMP-related Information from Configured Nodes
The network topology is used by HACMP for the heartbeat. First we configure heartbeat over ethernet:
node1# smitty hacmp -> Extended Configuration -> Extended Topology Configuration -> Configure HACMP Networks -> Add a Network to the HACMP Cluster
+-------------------------------------------------------------------------+ | | | | | | | | | | | | | | | | | | | | | | | Move cursor to desired item and press Enter. | | | | | | | | | | | | | | | | | # Pre-defined IP-based Network Types XD_data XD_ip atm ether fddi hps ib token # Discovered Serial Device Types rs232 # Discovery last performed: (January 30 10:02) # Discovered IP-based Network Types ether Select a Network Type
| | | | | | | | | | | | | | | | | F1=Help | | F8=Image | | /=Find | +-------------------------------------------------------------------------+

If you trust the automatic discovery select ether under "Discovered IP-based Network Types" - if not select ether under "Pre-defined IP-based Network Types". The latter always work - so it might be the better choice. In the next screen put in the correct netmask and activate the use of IP aliases for IP takeover:
# Pre-defined Serial Device Types XD_rs232 diskhb rs232 tmscsi tmssa
F2=Refresh F10=Exit n=Find Next
F3=Cancel Enter=Do
Add an IP-Based Network to the HACMP Cluster Type or select values in entry fields. Press Enter AFTER making all desired changes. [Entry Fields] * Network Name * Network Type * Netmask + * Enable IP Address Takeover via IP Aliases + [net_ether_01] ether [255.255.255.0] [Yes]
IP Address Offset for Heartbeating over IP Aliases []
Add a Communication Interface for Heartbeat

Based on the network definition we define the boot addresses as communication interface:
node1# smitty hacmp -> Extended Configuration -> Extended Topology Configuration -> Configure HACMP Communication Interfaces/Devices -> Add Communication Interfaces/Devices -> Add Pre-defined Communication Interfaces and Devices -> Communication Interfaces
+-------------------------------------------------------------------------+ | | | | | Move cursor to desired item and press Enter. | | | | | | | | | | F1=Help | | F8=Image | | /=Find | +-------------------------------------------------------------------------+
Select net_ether_01 and fill the empty fields:
Select a Network Name
ALL net_ether_01
F3=Cancel Enter=Do
Type or select values in entry fields. Press Enter AFTER making all desired changes. [Entry Fields] * IP Label/Address + * Network Type * Network Name * Node Name + Network Interface
Do the same for the second node:
[barneyboot] ether net_ether_01 [barney] [en8]
Type or select values in entry fields. Press Enter AFTER making all desired changes. [Entry Fields] * IP Label/Address + * Network Type * Network Name * Node Name + Network Interface
The network topology is setup now - time to synchronize the cluster:
[shakiraboot] ether net_ether_01 [shakira] [en8]
node1# smitty hacmp -> Extended Configuration -> Extended Verification and Synchronization HACMP Verification and Synchronization Type or select values in entry fields. Press Enter AFTER making all desired changes. [Entry Fields]
* Verify, Synchronize or Both + * Automatically correct errors found during + verification? * Force synchronization if verification fails? + * Verify changes only? + * Logging +
[Both] [No]
[No] [No] [Standard]
Add Persistent IP Addresses

We want to have the IPs belonging to the hostnames of our two nodes to be persistent:
node1# smitty hacmp -> Extended Configuration -> Extended Topology Configuration -> Configure HACMP Persistent Node IP Labels/Addresses -> Add a Persistent Node IP Label Add a Persistent Node IP Label/Address Type or select values in entry fields. Press Enter AFTER making all desired changes. [Entry Fields] * Node Name * Network Name + * Node IP Label/Address + Prefix Length #
We do the same for shakira.
barney [net_ether_01] [barney] []
We miss a default route here. Since the persistent IP is defined within HACMP there is no default route defined in the ODM. However, after a reboot the system comes up with boot and persistent address. So we define a default route on both nodes:
node1+node2# chdev -l inet0 -a route=net,-hopcount,0,,0,10.111.111.1 -P

This sets 10.111.111.1 as the default gateway. We will activate the route later with the cluster start. In normal operation you don't have to touch the default route anymore.
Storage Configuration
First we set PVIDs on every LUN we want to use for HACMP and run cfgmgr on the other node.
node1# chdev -l hdisk1 -a pv=yes hdisk1 changed node1# chdev -l hdisk2 -a pv=yes hdisk2 changed : :
On node2 we have to remove the hdisks first an run cfgmgr again. Now we see the same PVIDs as on node1:
node2# rmdev -dl hdisk1 hdisk1 deleted : : node2# cfgmgr hdisk0 active hdisk1 hdisk2 hdisk3 hdisk4 hdisk5 hdisk6 hdisk7 hdisk8 00c722bc389f170f 00f6418384f345d0 00f6418384f34621 00f6418384f3466c 00f6418384f346b0 00f6418384f346f2 00f6418384f44fca 00f6418384f45015 00f6418384f45054 rootvg None None None None None None None None
hdisk9 hdisk10 hdisk11 hdisk12
00f6418384f4508f 00f6418384f450ca 00f6418384f34739 00f6418384f450ff
None None None None
and we run the automatic discovery again:
Now we connect the LUNs to our cluster sites. For every LUN do the following:
node1# smitty hacmp -> System Management (C-SPOC) -> HACMP Physical Volume Management -> Configure Disk/Site Locations for Cross-Site LVM Mirroring -> Add Disk/Site Definition for Cross-Site LVM Mirroring
+-------------------------------------------------------------------------+ | | | | | Move cursor to desired item and press Enter. | | | | | | | | | | F1=Help | | F8=Image | F10=Exit Enter=Do F2=Refresh F3=Cancel Datacenter2 Datacenter1 Site Names
| /=Find |
n=Find Next
+-------------------------------------------------------------------------+
Select site Datacenter1 and select all LUNs located there in the next screen:
Add Disk/Site Definition for Cross-Site LVM Mirroring Type or select values in entry fields. Press Enter AFTER making all desired changes. [Entry Fields] * Site Name * Disks PVID +
<F4> gives you a list of all LUNs configured for HACMP - select the ones for site Datacenter1:
Datacenter1
+-------------------------------------------------------------------------+ | | | | | | | | | | | Move cursor to desired item and press F7. | ONE OR MORE items can be selected. Disks PVID
| Press Enter AFTER making all selections. | | > 00f6418384f345d0 ( hdisk1 on all selected nodes ) | > 00f6418384f34621 ( hdisk2 on all selected nodes )
| > 00f6418384f3466c ( hdisk3 on all selected nodes ) | | > 00f6418384f346b0 ( hdisk4 on all selected nodes ) | | > 00f6418384f346f2 ( hdisk5 on all selected nodes ) | | > 00f6418384f34739 ( hdisk11 on all selected nodes ) | | | | | | | | | | | | | | | | F1=Help | | F7=Select | | Enter=Do | +-------------------------------------------------------------------------+
We repeat the procedure for the LUNs located in site Datacenter2.
00f6418384f44fca ( hdisk6 on all selected nodes ) 00f6418384f45015 ( hdisk7 on all selected nodes ) 00f6418384f45054 ( hdisk8 on all selected nodes ) 00f6418384f4508f ( hdisk9 on all selected nodes ) 00f6418384f450ca ( hdisk10 on all selected nodes ) 00f6418384f450ff ( hdisk12 on all selected nodes )
F2=Refresh F8=Image /=Find
F3=Cancel F10=Exit n=Find Next
Disk Heartbeat
Two of our LUNs are dedicated to disk heartbeat. Typically you use small LUN sizes here. If you're not sure which LUNs are the heartbeat LUNs check with " bootinfo -s hdisk<X> ". To protect the LUNs for disk heartbeat we create volume groups for them - a separate VG for each LUN:
node1# smitty hacmp
-> System Management (C-SPOC) -> HACMP Concurrent Logical Volume Management -> Concurrent Volume Groups -> Create a Concurrent Volume Group
+-------------------------------------------------------------------------+ | | | | | Move cursor to desired item and press F7. | | | | Press Enter AFTER making all selections. | | | | > barney | | > shakira | | | | F1=Help | | F7=Select | | Enter=Do | +-------------------------------------------------------------------------+
Select both nodes.
Node Names
ONE OR MORE items can be selected.
+-------------------------------------------------------------------------+ | | Physical Volume Names
| | | Move cursor to desired item and press F7. | | | | Press Enter AFTER making all selections. | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | F1=Help | | F7=Select | | Enter=Do | /=Find n=Find Next F8=Image F10=Exit F2=Refresh F3=Cancel 00f6418384f450ff ( hdisk12 on all selected nodes ) 00f6418384f450ca ( hdisk10 on all selected nodes ) 00f6418384f4508f ( hdisk9 on all selected nodes ) 00f6418384f45054 ( hdisk8 on all selected nodes ) 00f6418384f45015 ( hdisk7 on all selected nodes ) 00f6418384f44fca ( hdisk6 on all selected nodes ) 00f6418384f34739 ( hdisk11 on all selected nodes ) 00f6418384f346f2 ( hdisk5 on all selected nodes ) 00f6418384f346b0 ( hdisk4 on all selected nodes ) 00f6418384f3466c ( hdisk3 on all selected nodes ) 00f6418384f34621 ( hdisk2 on all selected nodes ) 00f6418384f345d0 ( hdisk1 on all selected nodes ) ONE OR MORE items can be selected.
+-------------------------------------------------------------------------+
We select the small LUN from Datacenter1 and fill the empty fields in the next screen:
Create a Concurrent Volume Group with Data Path Devices Type or select values in entry fields. Press Enter AFTER making all desired changes. [Entry Fields] Node Names PVID VOLUME GROUP name + # + + Physical partition SIZE in megabytes Volume group MAJOR NUMBER Enhanced Concurrent Mode Enable Cross-Site LVM Mirroring Verification barney, shakira 00f6418384f34739 [hacmp_hb1] 4 [38] true false
Warning: Changing the volume group major number may result in the command being unable to execute successfully on a node that does not have the major number currently available. before changing this setting.
The same procedure has to be done for the second disk heartbeat LUN. We call the second volume group " hacmp_hb2 ". Before we go on with the disk heartbeat configuration we let HACMP discover first...
Please check
for a commonly available major number on all nodes|
node1# smitty hacmp -> Extended Configuration
-> Discover HACMP-related Information from Configured Nodes

Now we are ready to configure the disk heartbeat:
node1# smitty hacmp -> Extended Topology Configuration -> Configure HACMP Communication Interfaces/Devices -> Add Communication Interfaces/Devices -> Add Discovered Communication Interface and Devices -> Communication Devices
+-------------------------------------------------------------------------+ | Add | | | Move cursor to desired item and press F7. Use arrow keys to scroll. | | | | Press Enter AFTER making all selections. | | | | | | | | | | > barney 00f6418384f34739 | barney 00f6418384f450ff | > shakira 00f6418384f34739 | shakira 00f6418384f450ff | | hdisk11 | hdisk12 | hdisk11 | hdisk12 | /dev/hdisk12 /dev/hdisk11 /dev/hdisk12 /dev/hdisk11 shakira tty0 /dev/tty0 barney tty0 /dev/tty0 # Node Device Device Path Pvid ONE OR MORE items can be selected. Select Point-to-Point Pair of Discovered Communication Devices to |
| F1=Help | | F7=Select | | Enter=Do |
+-------------------------------------------------------------------------+
We choose the first pair of disks and repeat the procedure for the second pair.
5. Resource Group Configuration

Before we actually define a resource group we prepare all the resources we need:
Application Volume Groups

The first resource we need is a high available application volume group:
node1# smitty hacmp -> System Management (C-SPOC) -> HACMP Logical Volume Management -> Shared Volume Groups -> Create a Shared Volume Group with Data Path Devices
+-------------------------------------------------------------------------+ | | | | | Move cursor to desired item and press F7. | | | | Press Enter AFTER making all selections. | | | ONE OR MORE items can be selected. Node Names
| > barney | | > shakira | | | | F1=Help | | F7=Select | | Enter=Do | +-------------------------------------------------------------------------+

Select both nodes as shown in the screen above and select the hdisks you need in the next screen. Choose one set of disks from Datacenter1 and one set of disks from Datacenter2. Unfortunately in this screen the location is not indicated. In this example we just select all available disks:
+-------------------------------------------------------------------------+ | | | | | | | | | | | | Move cursor to desired item and press F7. | ONE OR MORE items can be selected. Physical Volume Names
| Press Enter AFTER making all selections. | | > 00f6418384f345d0 ( hdisk1 on all selected nodes ) | > 00f6418384f34621 ( hdisk2 on all selected nodes ) | > 00f6418384f3466c ( hdisk3 on all selected nodes )
| > 00f6418384f346b0 ( hdisk4 on all selected nodes ) | | > 00f6418384f346f2 ( hdisk5 on all selected nodes ) | | > 00f6418384f44fca ( hdisk6 on all selected nodes ) | | > 00f6418384f45015 ( hdisk7 on all selected nodes ) | | > 00f6418384f45054 ( hdisk8 on all selected nodes ) | | > 00f6418384f4508f ( hdisk9 on all selected nodes ) | | > 00f6418384f450ca ( hdisk10 on all selected nodes ) | | | | F1=Help | | F7=Select | | Enter=Do | +-------------------------------------------------------------------------+
The next screen asks for the type of volume group. These days scalable VGs seem to be the best choice:
+-------------------------------------------------------------------------+ | | | | | Move cursor to desired item and press Enter. | | | | | | | Original Legacy Volume Group Type
| | | | | |
Big Scalable
| F1=Help | | F8=Image | | /=Find |
F3=Cancel Enter=Do
+-------------------------------------------------------------------------+
After we selected disks and VG type we choose a name for the volume group:
Create a Shared Volume Group with Data Path Devices Type or select values in entry fields. Press Enter AFTER making all desired changes. [Entry Fields] Node Names PVID 00f6> VOLUME GROUP name Physical partition SIZE in megabytes + Volume group MAJOR NUMBER # Enable Cross-Site LVM Mirroring Verification + Warning: Changing the volume group major number may result in the command being unable to execute successfully on a node that does not have the major number currently available. Please check true [42] barney, shakira 00f6418384f345d0 [appl01vg] 128
for a commonly available major number on all nodes before changing this setting.
After confirming with <ENTER> we are done with the VG and can go on with the
Application Server
For the application servers we first need application start and stop scripts. The scripts are usually provided by the application owners and should match at least two conditions:
it should be no problem to run these scripts multiple times in succession. particularly the stop script should be robust, i.e. it should really be able to stop the application. If HACMP cannot unmount filesystems a manual takeover (aka resource group move) will fail.
Once the scripts are in place we can configure the application server:
node1# smitty hacmp -> Extended Configuration -> Extended Resource Configuration -> HACMP Extended Resources Configuration -> Configure HACMP Applications -> Configure HACMP Application Servers -> Add an Application Server Add Application Server Type or select values in entry fields. Press Enter AFTER making all desired changes. [Entry Fields] * Server Name * Start Script [/etc/hacmp/start_srv01] * Stop Script [/etc/hacmp/stop_srv01] Application Monitor Name(s) +
In the above example the start/stop scripts are stored in a folder /etc/hacmp. But you can place them anywhere in the local filesystem tree. Don't place them on shared filesystems! Since the scripts are local we have to copy them over to the other node:
[app_srv01]
node1# scp -rp /etc/hacmp node2:/etc/
Cluster Service Address(es)

The cluster service address is the IP address that clients use to connect to the application. Therefore a service address moves with the resource group. You can define more than one service address per resource group. In this example we define only one service address. Remember: we already defined the service address in /etc/hosts with the initial network setup.
node1# smitty hacmp -> Extended Configuration -> Extended Resource Configuration -> HACMP Extended Resources Configuration -> Configure HACMP Service IP Labels/Addresses -> Add a Service IP Label/Address
+-------------------------------------------------------------------------+ | | | | | Move cursor to desired item and press Enter. | | | | | | | | | | F1=Help | | F8=Image | | /=Find | n=Find Next F10=Exit Enter=Do F2=Refresh F3=Cancel Bound to a Single Node Configurable on Multiple Nodes Select a Service IP Label/Address type
+-------------------------------------------------------------------------+
As the said before, the service address needs to move with the application - so we select "Configurable on Multiple Nodes" here.
Add a Service IP Label/Address configurable on Multiple Nodes (extended) Type or select values in entry fields. Press Enter AFTER making all desired changes. [Entry Fields] * IP Label/Address + * Network Name haservice1 net_ether_01
Alternate HW Address to accompany IP Label/Address []

Now we have all resources in place we finally can
Define Resource Group(s) node1# smitty hacmp -> Extended Configuration -> Extended Resource Configuration -> HACMP Extended Resource Group Configuration -> Add a Resource Group Add a Resource Group (extended) Type or select values in entry fields. Press Enter AFTER making all desired changes. [Entry Fields] * Resource Group Name [RG_01]
Inter-Site Management Policy + * Participating Nodes from Primary Site + Participating Nodes from Secondary Site + Startup Policy Node O> + Fallover Policy Prio> + + Fallback Policy
[ignore] [barney] [shakira]
Online On Home Fallover To Next Never Fallback
In this panel we initially define name of the resource group ( RG_01 here). The policy definitions on the bottom are typical to two-node clusters. But you could choose different values here. For HACMP insiders: The above setup is the classic cascading setup. Time again to let HACMP collect information:
Now we want to adjust some parameters of our resource group:
node1# smitty hacmp -> Extended Configuration -> Extended Resource Configuration -> HACMP Extended Resource Group Configuration -> Change/Show Resources and Attributes for a Resource Group
+-------------------------------------------------------------------------+ | | | | | Move cursor to desired item and press Enter. | Change/Show Resources and Attributes for a Resource Group
| | | | | | | F1=Help | | F8=Image | | /=Find | +-------------------------------------------------------------------------+ Group Change/Show All Resources and Attributes for a Custom Resource n=Find Next F10=Exit Enter=Do F2=Refresh F3=Cancel RG_01
Type or select values in entry fields. Press Enter AFTER making all desired changes. [TOP] Resource Group Name Inter-site Management Policy Participating Nodes from Primary Site Participating Nodes from Secondary Site Startup Policy Node O> Fallover Policy Prio> Fallback Policy Service IP Labels/Addresses Application Servers [Entry Fields] RG_01 ignore barney shakira Online On Home Fallover To Next Never Fallback [haservice1] [app_srv01]
+ +
Volume Groups +
[appl01vg]
Use forced varyon of volume groups, if necessary + Automatically Import Volume Groups + Filesystems (empty is ALL for VGs specified) Filesystems Consistency Check Filesystems Recovery Method Filesystems mounted before IP configured Filesystems/Directories to Export
true false
+ + + + + +
[] logredo sequential false []
Filesystems/Directories to NFS Mount Network For NFS Mount + [MORE...10]
[] []
In the above smit panel we assign our service address and the application server we just created ( Application Server) and set the varyon policy to forced. Finally we synchronize the cluster to the other node:
node1# smitty hacmp -> Extended Configuration -> Extended Verification and Synchronization HACMP Verification and Synchronization Type or select values in entry fields. Press Enter AFTER making all desired changes. [Entry Fields] * Verify, Synchronize or Both + * Automatically correct errors found during + [Both] [No]
verification? * Force synchronization if verification fails? + * Verify changes only? + * Logging + [No] [No] [Standard]
At this point the cluster is synchronized and in a consistent state. Both nodes have the same information about the cluster setup.
Create LVs and Filesystems for Applications

We want to use CSPOC to create the application filesystems. In order to make use of CSPOC we first start hacmp on both nodes:
node1+node2# smitty clstart Type or select values in entry fields. Press Enter AFTER making all desired changes. [Entry Fields] * Start now, on system restart or both + Start Cluster Services on these nodes + BROADCAST message at startup? + Startup Cluster Information Daemon? + Reacquire resources after forced down ? + Ignore verification errors? + Automatically correct errors found during + cluster start?
To activate the route we defined earlier ( Add Persistent IP Addresses) we issue the command
now [barney] true True false false Interactively
node1+node2# mkdev -l inet0

shakira on the other node.
Once the cluster is up we go on with creating LVs and filesystems. If you don't want to use inline jfs2 logs, first a log device has to be created (if you don't do this a log LV called loglv00 will be automatically created with the first filesystem). The procedure to create a log LV is the same as for a regular filesystem with two exceptions:
Use jfs2log as Logical volume TYPE Don't forget to format the jfs2log:
node1# logform /dev/applvg01_jfs2log logform: destroy /dev/rapplvg0_jfs2log (y)?y
Refer to the next section on how to create the LV applvg01_jfs2log and remember to set the right Logical volume TYPE. Now we are ready to create the application filesystems. The below example shows how to create one filesystem. Repeat the steps until all filesystems are setup. Remember to create a jfs2log for each volume group first (if you don't use inline logs).
node1# smitty hacmp -> System Management (C-SPOC) -> HACMP Logical Volume Management -> Shared Logical Volumes -> Add a Shared Logical Volume
+-------------------------------------------------------------------------+ | | | | Shared Volume Group Names
| Move cursor to desired item and press Enter. Use arrow keys to scroll. | | | | | | | #Resource Group RG_01 Volume Group appl01vg
| | | F1=Help | | F8=Image | | /=Find | +-------------------------------------------------------------------------+

Select one pair of disks from both sites - mark with <F7>:
F3=Cancel Enter=Do
+-------------------------------------------------------------------------+ | | | | | Move cursor to desired item and press F7. | | | | Press Enter AFTER making all selections. | | | | | | > barney hdisk1 | | | | | | | | | | > barney hdisk6 | Datacenter2 barney hdisk5 Datacenter1 barney hdisk4 Datacenter1 barney hdisk3 Datacenter1 barney hdisk2 Datacenter1 Datacenter1 Auto-select ONE OR MORE items can be selected. Physical Volume Names
| | | | | | | | | |
barney hdisk7 barney hdisk8 barney hdisk9 barney hdisk10
Datacenter2 Datacenter2 Datacenter2 Datacenter2
| F1=Help | | F7=Select | | |
F2=Refresh F8=Image
F3=Cancel F10=Exit
+-------------------------------------------------------------------------+
Warning: Don't use Auto-select here - although we assigned LUNs to sites it's not guaranteed that CSPOC selects LUNs from different sites!
Add a Shared Logical Volume Type or select values in entry fields. Press Enter AFTER making all desired changes. [Entry Fields] Resource Group Name VOLUME GROUP name Reference node * Number of LOGICAL PARTITIONS # PHYSICAL VOLUME names Logical volume NAME Logical volume TYPE + POSITION on physical volume + middle RG_01 appl01vg barney [80] hdisk1 [applv01] [jfs2] hdisk6
RANGE of physical volumes + MAXIMUM NUMBER of PHYSICAL VOLUMES # to use for allocation + Number of COPIES of each logical partition Mirror Write Consistency? + Allocate each logical partition copy + on a SEPARATE physical volume? RELOCATE the logical volume during reorganization? + Logical volume LABEL MAXIMUM NUMBER of LOGICAL PARTITIONS Enable BAD BLOCK relocation? + SCHEDULING POLICY for reading/writing + logical partition copies + + Enable WRITE VERIFY? Stripe Size?
minimum []
active strict
yes [] [512] no parallel
no [Not Striped]
On the just created LV we create a filesystem:
node1# smitty hacmp -> System Management (C-SPOC) -> HACMP Logical Volume Management -> Shared File Systems -> Enhanced Journaled File Systems -> Add an Enhanced Journaled File System on a Previously Defined Logical Volume
+-------------------------------------------------------------------------+
| | | |
Logical Volume Names
| Move cursor to desired item and press Enter. | | | | | | | | F1=Help | | F8=Image | | /=Find | +-------------------------------------------------------------------------+ Add an Enhanced Journaled File System on a Previously Defined Logical Volume Type or select values in entry fields. Press Enter AFTER making all desired changes. [Entry Fields] Node Names LOGICAL VOLUME name * MOUNT POINT PERMISSIONS + Mount OPTIONS + Block Size (bytes) + Inline Log? + Inline Log size (MBytes) # [] no 4096 [] barney,shakira applv01 [/appl01/fs01] read/write n=Find Next F10=Exit Enter=Do F2=Refresh F3=Cancel applv01 barney,shakira
Repeat the steps until all filesystems are setup. Our cluster is ready for use now.
Appendix
A. Failover Test
A cluster failover test is typically done in three or four phases:
1. Manual Failover
The manual failover is the most important test for a cluster configuration. This test can be invoked on one node by
node1# smitty clstop Stop Cluster Services Type or select values in entry fields. Press Enter AFTER making all desired changes. [Entry Fields] * Stop now, on system restart or both + + + Stop Cluster Services on these nodes BROADCAST cluster shutdown? now [barney] true Move Resource
* Select an Action on Resource Groups Groups +
When stopping the cluster on node 1 the first thing executed is the cluster stop script. It brings down the applications and unmounts all application filesystems. If your application stop script is not able to stop all application processes some filesystems can't be unmounted and the failover fails. When all resources are down on node 1 HACMP starts to bring up all resources on node 2. The application start script is the last thing hacmp does. Check that your application is working properly and that all clients can connect. If so the first phase of the failover test is completed.
2. Manual Failback
Switch the resources back to the home node. Again check if everything is fine.
3. Automatic Failover
This test simulates a hardware failure on the active node. The easiest way to simulate is to issue the command
node1# halt -q
on the active node. Check that everything will be brought up on node 2.
4. Partial Hardware Failure

Sometimes only a component fails. Maybe a network switch fails or a storage system becomes unavailable. Test these scenarios to make sure that HACMP is correctly setup - and only starts a failover if needed. These tests also check if your VGs are correctly mirrored over two sites.
B. Disk Heartbeat Check

This is an example on how to check the disk heartbeat. On the first node we set the heartbeat disk to receive mode
node1# /usr/sbin/rsct/bin/dhb_read -p /dev/hdisk11 DHB CLASSIC MODE First node byte offset: 61440 Second node byte offset: 62976 Handshaking byte offset: 65024 Test byte offset: 64512 Receive Mode: Waiting for response . . . Magic number = 0x87654321 Magic number = 0x87654321 Magic number = 0x87654321 Magic number = 0x87654321 Magic number = 0x87654321 Magic number = 0x87654321 Magic number = 0x87654321 Magic number = 0x87654321 Link operating normally
-r
and on the other node we set the same disk to transmit mode...
node2# /usr/sbin/rsct/bin/dhb_read -p /dev/hdisk11 -t DHB CLASSIC MODE First node byte offset: 61440 Second node byte offset: 62976 Handshaking byte offset: 65024 Test byte offset: 64512 Transmit Mode: Magic number = 0x87654321 Detected remote utility in receive mode. Magic number = 0x87654321 Magic number = 0x87654321 Link operating normally
The last line in the above output indicates that the disk heartbeat is working properly.
Waiting for response . . .
C. Useful Commands
This is only a brief and selective list of commands that might be useful when working with HACMP
Which node is owning a resource group?
# /usr/sbin/cluster/utilities/clRGinfo ----------------------------------------------------------------------------
Group Name
State
Node
--------------------------------------------------------------------------- RG_01 ONLINE OFFLINE barney shakira
Move a resource group to another node
# /usr/sbin/cluster/utilities/clRGmove -g <RG> -n <NODE> -m

Stop cluster service (on current node)
# smitty clstop
Start cluster service (on current node)
# smitty clstart
Overview cluster state
# /usr/sbin/cluster/clstat -a
Overview cluster state
# /usr/sbin/cluster/utilities/cllistlogs /var/hacmp/log/hacmp.out /var/hacmp/log/hacmp.out.1 /var/hacmp/log/hacmp.out.2
D. clstat and snmp

clstat and cldump rely on SNMP to be configured properly. If cldump fails with a message like this:
cldump: Waiting for the Cluster SMUX peer (clstrmgrES) to stabilize............. Unable to communicate with the Cluster SMUX Peer Daemon
then /etc/snmpdv3.conf has to be fixed by adding a line
VACM_VIEW defaultView
1.3.6.1.4.1.2.3.1.2.1.5
- included -
snmpd has to be restarted:
# stopsrc -s snmpd # startsrc -s snmpd
E. Related Information

IBM Redbook: PowerHA for AIX Cookbook Certification Study Guide: HACMP for AIX

Cluster Setup

Загружено:

Сведения о документе

Авторское право

Доступные форматы

Поделиться этим документом

Поделиться или встроить документ

Параметры публикации

Этот документ был вам полезен?

Это неприемлемый материал?

Авторское право:

Доступные форматы

Cluster Setup

Загружено:

Авторское право:

Доступные форматы

Contents 1. Introduction 2. Setup and Preparation o Storage setup o Network setup 3. Installation o Prerequisite filesets o HACMP Filesets 4.

2. Setup and Preparation

barneyboot shakiraboot haservice1 barney shakira

# Service/Cluster address # Node/Persistent address

bos.net.nfs.server bos.clvm rsct.compat.basic.hacmp rsct.compat.clients.hacmp

If they are not installed you have to do it now:

| > + 2.5.4.0 Support) | | | + 2.5.4.0

| rsct.compat.clients ALL | | | | > + 2.5.4.0 Support) | | | Installation of HACMP Filesets

node1+node2# cd /path/to/bffs node1+node2# smitty install_latest

+ 5.5.0.0 + 5.5.0.3 + 5.5.0.0

ES Cluster Test Tool ES Cluster Test Tool ES Two-Node Configuration Assistant

ES Cluster File System Support ES Cluster File System Support

ES NFS Support ES NFS Support

+ 5.5.0.0 + 5.5.0.6 + 5.5.0.0 + 5.5.0.5 + 5.5.0.0

HACMP Electronic License

ES Man Pages - U.S. English ES Man Pages - U.S. English

node1+node2# cd /path/to/update node1+node2# smitty update_all

node1+node2# shutdown -Fr

4. Cluster Topology Configuration

Define the Cluster

Define the Cluster Nodes

and now the second one:

Define Cluster Sites

[Datacenter1] barney [Yes] [none]

Define a Cluster Network

| | | | | | | | | | | | | | | | | F1=Help | | F8=Image | | /=Find | +-------------------------------------------------------------------------+

# Pre-defined Serial Device Types XD_rs232 diskhb rs232 tmscsi tmssa

F2=Refresh F10=Exit n=Find Next

IP Address Offset for Heartbeating over IP Aliases []

Add a Communication Interface for Heartbeat

Select a Network Name

F2=Refresh F10=Exit n=Find Next

[barneyboot] ether net_ether_01 [barney] [en8]

[shakiraboot] ether net_ether_01 [shakira] [en8]

[No] [No] [Standard]

Add Persistent IP Addresses

barney [net_ether_01] [barney] []

node1+node2# chdev -l inet0 -a route=net,-hopcount,0,,0,10.111.111.1 -P

hdisk9 hdisk10 hdisk11 hdisk12

00f6418384f4508f 00f6418384f450ca 00f6418384f34739 00f6418384f450ff

None None None None

and we run the automatic discovery again:

F2=Refresh F8=Image /=Find

F3=Cancel F10=Exit n=Find Next

node1# smitty hacmp

ONE OR MORE items can be selected.

F2=Refresh F8=Image /=Find

F3=Cancel F10=Exit n=Find Next

+-------------------------------------------------------------------------+ | | Physical Volume Names

for a commonly available major number on all nodes|

node1# smitty hacmp -> Extended Configuration

-> Discover HACMP-related Information from Configured Nodes

| F1=Help | | F7=Select | | Enter=Do |

F2=Refresh F8=Image /=Find

F3=Cancel F10=Exit n=Find Next

5. Resource Group Configuration

Application Volume Groups

| > barney | | > shakira | | | | F1=Help | | F7=Select | | Enter=Do | +-------------------------------------------------------------------------+

F2=Refresh F8=Image /=Find

F3=Cancel F10=Exit n=Find Next

F2=Refresh F8=Image /=Find

F3=Cancel F10=Exit n=Find Next

| F1=Help | | F8=Image | | /=Find |

F2=Refresh F10=Exit n=Find Next