Sun Enterprise Cluster Administration

ES-330

Student Guide

Sun Microsystems, Inc.


MS BRM01-209
500 Eldorado Boulevard
Broomfield, Colorado 80021
U.S.A.

Rev. A, September 1999


Copyright © 1999 Sun Microsystems, Inc., 901 San Antonio Road, Palo Alto, California 94303, U.S.A. All rights reserved.
This product or document is protected by copyright and distributed under licenses restricting its use, copying,
distribution, and decompilation. No part of this product or document may be reproduced in any form by any means
without prior written authorization of Sun and its licensors, if any.
Third-party software, including font technology, is copyrighted and licensed from Sun suppliers.
Parts of the product may be derived from Berkeley BSD systems, licensed from the University of California. UNIX is a
registered trademark in the U.S. and other countries, exclusively licensed through X/Open Company, Ltd.
Sun, Sun Microsystems, the Sun Logo, Sun Enterprise, Sun StorEdge Volume Manager, Solstice DiskSuite, Solaris
Operating Environment, Sun StorEdge A5000, Solstice SyMon, NFS, JumpStart, Sun VTS, OpenBoot, and AnswerBook
are trademarks or registered trademarks of Sun Microsystems, Inc. in the U.S. and other countries.
All SPARC trademarks are used under license and are trademarks or registered trademarks of SPARC International, Inc.
in the U.S. and other countries. Products bearing SPARC trademarks are based upon an architecture developed by Sun
Microsystems, Inc.
The OPEN LOOK and Sun Graphical User Interface was developed by Sun Microsystems, Inc. for its users and licensees.
Sun acknowledges the pioneering efforts of Xerox in researching and developing the concept of visual or graphical user
interfaces for the computer industry. Sun holds a non-exclusive license from Xerox to the Xerox Graphical User Interface,
which license also covers Sun’s licensees who implement OPEN LOOK GUIs and otherwise comply with Sun’s written
license agreements.
U.S. Government approval required when exporting the product.
RESTRICTED RIGHTS: Use, duplication, or disclosure by the U.S. Govt is subject to restrictions of FAR 52.227-14(g)
(2)(6/87) and FAR 52.227-19(6/87), or DFAR 252.227-7015 (b)(6/95) and DFAR 227.7202-3(a).
DOCUMENTATION IS PROVIDED "AS IS" AND ALL EXPRESS OR IMPLIED CONDITIONS, REPRESENTATIONS,
AND WARRANTIES, INCLUDING ANY IMPLIED WARRANTY OF MERCHANTABILITY, FITNESS FOR A
PARTICULAR PURPOSE OR NON-INFRINGEMENT, ARE DISCLAIMED, EXCEPT TO THE EXTENT THAT SUCH
DISCLAIMERS ARE HELD TO BE LEGALLY INVALID.

Please Recycle

Contents
About This Course.....................................................................................xix
Course Overview ............................................................................... xx
Course Map........................................................................................ xxi
Module-by-Module Overview ....................................................... xxii
Course Objectives........................................................................... xxvi
Skills Gained by Module.............................................................. xxvii
Guidelines for Module Pacing ................................................... xxviii
Topics Not Covered........................................................................ xxix
How Prepared Are You?................................................................. xxx
Introductions ................................................................................... xxxi
How to Use Course Materials ...................................................... xxxii
Course Icons and Typographical Conventions ....................... xxxiv
Icons ....................................................................................... xxxiv
Typographical Conventions .................................................xxxv
Sun Cluster Overview...............................................................................1-1
Objectives ........................................................................................... 1-1
Relevance............................................................................................ 1-2
Additional Resources ....................................................................... 1-3
Sun Cluster 2.2 New Features ......................................................... 1-4
Cluster Hardware Components...................................................... 1-6
Administration Workstation ...................................................1-7
Terminal Concentrator .............................................................1-7
Cluster Host Systems................................................................1-7
Redundant Private Networks .................................................1-8
Cluster Disk Storage .................................................................1-8
High Availability Features .............................................................. 1-9
High Availability Hardware Design ......................................1-9
Sun Cluster High Availability Software ..............................1-10
Software Redundant Array of Inexpensive Disks (RAID) Technology ......1-10
Controller-Based RAID Technology ....................................1-10
Year 2000 Compliance ............................................................1-10
High Availability Strategies .......................................................... 1-11

Redundant Servers..................................................................1-12
Redundant Data ......................................................................1-12
Redundant Public Networks .................................................1-12
Redundant Private Networks ...............................................1-12
Cluster Configurations................................................................... 1-13
Highly Available Data Service Configuration ....................1-13
Parallel Database Configuration...........................................1-14
Sun Cluster Application Support ................................................. 1-15
Highly Available Data Service Support...............................1-16
Parallel Database Support .....................................................1-16
Logical Hosts ................................................................................... 1-17
Logical Host Failure Process .................................................1-18
Cluster Configuration Databases..........................................1-18
Fault Monitoring ............................................................................. 1-19
Data Service Fault Monitoring ..............................................1-20
Cluster Fault Monitoring .......................................................1-20
Failure Recovery Summary ........................................................... 1-22
Exercise: Lab Equipment Familiarization.................................... 1-25
Preparation...............................................................................1-25
Tasks ........................................................................................1-25
Check Your Progress ...................................................................... 1-26
Think Beyond .................................................................................. 1-27
Terminal Concentrator..............................................................................2-1
Objectives ........................................................................................... 2-1
Relevance............................................................................................ 2-2
Additional Resources ....................................................................... 2-3
Cluster Administration Interface.................................................... 2-4
Major Elements..........................................................................2-6
Terminal Concentrator Overview .................................................. 2-7
Operating System Load............................................................2-9
Setup Port...................................................................................2-9
Terminal Concentrator Setup Programs................................2-9
Terminal Concentrator Setup ........................................................ 2-10
Connecting to Port 1 ...............................................................2-11
Enabling Setup Mode .............................................................2-11
Setting the Terminal Concentrator IP Address...................2-12
Setting the Terminal Concentrator Load Source ................2-12
Specifying the Operating System Image.................................2-13
Setting the Serial Port Variables............................................2-14
Terminal Concentrator Troubleshooting..................................... 2-15
Manually Connecting to a Node...........................................2-15
Using the telnet Command to Abort a Node...................2-16
Connecting to the Terminal Concentrator CLI ...................2-16
Using the Terminal Concentrator help Command ...........2-16
Identifying and Resetting a Locked Port .............................2-17

Exercise: Configuring the Terminal Concentrator ..................... 2-18
Preparation...............................................................................2-18
Tasks .........................................................................................2-18
Exercise Summary...................................................................2-28
Check Your Progress ...................................................................... 2-29
Think Beyond .................................................................................. 2-30
Administration Workstation Installation..............................................3-1
Objectives ........................................................................................... 3-1
Relevance............................................................................................ 3-2
Additional Resources ....................................................................... 3-3
Sun Enterprise Cluster Software Summary .................................. 3-4
Sun Cluster Software Installation ...........................................3-6
Administrative Workstation Software Packages..................3-6
scinstall Command Line Options......................................3-8
Sun Cluster Installation Program Startup ..................................... 3-9
Initial Installation Startup ......................................................3-10
Existing Installation Startup ..................................................3-11
Installation Mode ....................................................................3-12
Administration Workstation Environment................................. 3-13
New Search and Man Page Paths .........................................3-13
Host Name Resolution Changes...........................................3-14
Remote Login Control ............................................................3-14
Remote Display Enabling ......................................................3-15
Controlling rcp and rsh Access ...........................................3-15
Cluster Administration Tools Configuration.............................. 3-16
Cluster Administration Interface..........................................3-17
Administration Tool Configuration Files ............................3-18
Cluster Administration Tools........................................................ 3-19
The Cluster Control Panel .....................................................3-20
Cluster Console .......................................................................3-21
Cluster Administration Tools........................................................ 3-24
Cluster Help Tool....................................................................3-24
Exercise: Installing the Sun Cluster Client Software ................. 3-25
Preparation...............................................................................3-25
Tasks .........................................................................................3-26
Updating the Name Service ..................................................3-26
Installing OS Patches ..............................................................3-26
Running the scinstall Utility ............................................3-27
Configuring the Administration Workstation Environment .........3-28
Verifying the Administration Workstation Environment .............3-28
Configuring the /etc/clusters File ..................................3-29
Configuring the /etc/serialports File............................3-29
Starting the cconsole Tool ...................................................3-30

Configuring the Cluster Host Systems Environment ........3-31
Verifying the Cluster Host Systems Environment.............3-31
Exercise Summary...................................................................3-33
Check Your Progress ...................................................................... 3-34
Think Beyond .................................................................................. 3-35
Preinstallation Configuration..................................................................4-1
Objectives ........................................................................................... 4-1
Relevance............................................................................................ 4-2
Additional Resources ....................................................................... 4-3
Cluster Topologies ............................................................................ 4-4
Clustered Pairs Topology ........................................................4-5
Ring Topology ...........................................................................4-6
N+1 Topology............................................................................4-7
Shared-Nothing Topology .......................................................4-8
Scalable Topology .....................................................................4-9
Cluster Quorum Devices................................................................ 4-10
Disk Drive Quorum Device ...................................................4-11
Array Controller Quorum Device ........................................4-12
Quorum Device in a Ring Topology ....................................4-13
Quorum Device in a Scalable Topology ..............................4-14
Cluster Interconnect System Overview ....................................... 4-16
Interconnect Types..................................................................4-17
Interconnect Configurations..................................................4-17
Cluster Interconnect System Configuration................................ 4-18
Cluster Interconnect Addressing..........................................4-19
Point-to-Point Connections....................................................4-20
SCI High-Speed Switch Connection.....................................4-21
SCI Card Identification...........................................................4-22
SCI Card Self-Test Information.............................................4-22
SCI Card Scrubber Jumpers...................................................4-23
Ethernet Hub Connection ......................................................4-24
Ethernet Card Identification..................................................4-25
Public Network Management ....................................................... 4-26
PNM Configuration ................................................................4-28
Shared CCD Volume ...................................................................... 4-29
Shared CCD Volume Creation ..............................................4-31
Disabling a Shared CCD ........................................................4-31
Cluster Configuration Information .............................................. 4-32
Using prtdiag to Verify System Configuration ................4-33
Interpreting prtdiag Output................................................4-35
Identifying Storage Arrays ....................................................4-36
Storage Array Firmware Upgrades .............................................. 4-37
Array Firmware Patches ........................................................4-38
Exercise: Preinstallation Preparation ........................................... 4-39
Preparation...............................................................................4-39

Tasks .........................................................................................4-39
Cluster Topology.....................................................................4-40
Quorum Device Configuration .............................................4-40
Ethernet Cluster Interconnect Configuration .....................4-41
SCI Cluster Interconnect Configuration ..............................4-43
Node Locking Configuration ................................................4-46
Check Your Progress ...................................................................... 4-48
Think Beyond .................................................................................. 4-49
Cluster Host Software Installation .........................................................5-1
Objectives ........................................................................................... 5-1
Relevance............................................................................................ 5-2
Additional Resources ....................................................................... 5-3
Sun Cluster Server Software Overview ......................................... 5-4
Server Package Set Contents ...................................................5-6
Sun Cluster Licensing...............................................................5-8
Sun Cluster Installation Overview ................................................. 5-9
Sun Cluster Volume Managers ..................................................... 5-10
Volume Manager Choices......................................................5-11
Sun Cluster Host System Configuration ..................................... 5-12
Cluster Host System Questions ............................................5-13
SCI Interconnect Configuration ............................................5-14
Ethernet Interconnect Configuration ...................................5-15
Sun Cluster Public Network Configuration................................ 5-16
Sun Cluster Logical Host Configuration ..................................... 5-18
Data Protection Configuration...................................................... 5-20
Failure Fencing ........................................................................5-21
Node Locking ..........................................................................5-22
Quorum Device .......................................................................5-23
Application Configuration ............................................................ 5-26
Post-Installation Configuration..................................................... 5-28
Installation Verification..........................................................5-29
Correcting Minor Configuration Errors ..............................5-30
Software Directory Paths .......................................................5-31
SCI Interconnect Configuration ............................................5-32
Exercise: Installing the Sun Cluster Server Software................. 5-34
Preparation...............................................................................5-34
Tasks .........................................................................................5-35
Update the Name Service ......................................................5-35
Installing Solaris Operating System Patches.......................5-35
Storage Array Firmware Revision ........................................5-36
Installation Preparation..........................................................5-36
Server Software Installation ..................................................5-37
SCI Interconnect Configuration ............................................5-38
Cluster Reboot .........................................................................5-40
Configuration Verification.....................................................5-40

Testing Basic Cluster Operation ...........................................5-41
Check Your Progress ...................................................................... 5-42
Think Beyond .................................................................................. 5-43
System Operation.......................................................................................6-1
Objectives ........................................................................................... 6-1
Relevance............................................................................................ 6-2
Additional Resources ....................................................................... 6-3
Cluster Administration Tools.......................................................... 6-4
Basic Cluster Control (scadmin).............................................6-6
Cluster Control Panel ....................................................................... 6-8
Starting the Cluster Control Panel..........................................6-9
Adding New Applications to the Cluster Control Panel ................6-9
Console Tool Variations.........................................................6-10
The hastat Command................................................................... 6-11
General Cluster Status............................................................6-12
Logical Host Configuration ...................................................6-13
Private Network Status ..........................................................6-14
Public Network Status............................................................6-15
Data Service Status..................................................................6-16
Cluster Error Messages ..........................................................6-17
Sun Cluster Manager Overview ................................................... 6-18
Sun Cluster Manager Startup................................................6-19
Initial Sun Cluster Manager Display....................................6-20
Sun Cluster Manager Displays ..................................................... 6-21
SCM Cluster Configuration Display ....................................6-22
System Log Filter.....................................................................6-25
The SCM Help Display...........................................................6-26
Sun Cluster SNMP Agent .............................................................. 6-27
Cluster MIB Tables..................................................................6-28
SNMP Traps.............................................................................6-29
Configuring the Cluster SNMP Agent Port ........................6-30
Exercise: Using System Operations .............................................. 6-31
Preparation...............................................................................6-31
Tasks .........................................................................................6-32
Starting the Cluster Control Panel........................................6-32
Using the hastat Command ................................................6-32
Using the Sun Cluster Manager............................................6-33
Exercise Summary...................................................................6-35
Check Your Progress ...................................................................... 6-36
Think Beyond .................................................................................. 6-37
Volume Management Using CVM and SSVM ....................................7-1
Objectives ........................................................................................... 7-1
Relevance............................................................................................ 7-2
Additional Resources ....................................................................... 7-3

Disk Space Management.................................................................. 7-4
CVM and SSVM Disk Space Management............................7-5
Private Region Contents ..........................................................7-7
Public Region Usage .................................................................7-7
Private and Public Region Format..........................................7-8
Initialized Disk Types...............................................................7-8
CVM and SSVM Encapsulation ...................................................... 7-9
Preferred Boot Disk Configuration ......................................7-10
Prerequisites for Boot Disk Encapsulation..........................7-11
Primary and Mirror Configuration Differences .................7-11
The /etc/vfstab File ............................................................7-12
Boot PROM Changes ..............................................................7-12
Un-encapsulating the Boot Disk ...........................................7-13
CVM and SSVM Disk Grouping................................................... 7-14
Cluster Volume Manager Disk Groups ...............................7-15
Sun StorEdge Volume Manager Disk Groups ....................7-16
Volume Manager Status Commands ........................................... 7-17
Checking Disk Status..............................................................7-19
Saving Configuration Information .......................................7-19
Optimizing Recovery Times.......................................................... 7-20
Dirty Region Logging.............................................................7-21
The Veritas VxFS File System................................................7-21
CVM and SSVM Post-Installation ................................................ 7-22
Initializing the rootdg Disk Group......................................7-22
Matching the vxio Driver Major Numbers ........................7-23
StorEdge Volume Manager Dynamic Multi-Pathing ........7-24
Exercise: Configuring Volume Management.............................. 7-26
Preparation...............................................................................7-26
Tasks .........................................................................................7-27
Installing the CVM or SSVM Software ................................7-28
Disabling Dynamic Multipathing (DMP)............................7-29
Creating a Simple rootdg Slice..............................................7-30
Encapsulating the Boot Disk .................................................7-31
Selecting Demonstration Volume Disks ..............................7-32
Configuring the CVM/SSVM Demonstration Volumes ..............7-35
Verifying the CVM/SSVM Demonstration File Systems ............7-36
Verifying the Cluster ..............................................................7-37
Exercise Summary...................................................................7-38
Check Your Progress ...................................................................... 7-39
Think Beyond .................................................................................. 7-40
Volume Management Using SDS...........................................................8-1
Objectives ........................................................................................... 8-1
Relevance............................................................................................ 8-2

Additional Resources ....................................................................... 8-3
Disk Space Management.................................................................. 8-4
SDS Disk Space Management .................................................8-5
Solstice DiskSuite Initialization....................................................... 8-6
Replica Configuration Guidelines ..........................................8-7
SDS Disk Grouping........................................................................... 8-8
Dual-String Mediators.................................................................... 8-10
Shared Diskset Replica Placement........................................8-11
Metatrans Devices........................................................................... 8-12
Metatrans Device Structure ...................................................8-14
SDS Status ........................................................................................ 8-15
Checking Volume Status........................................................8-16
Checking Mediator Status......................................................8-16
Volume Manager Status................................................................. 8-17
Checking Replica Status.........................................................8-17
Volume Manager Status................................................................. 8-18
Recording SDS Configuration Information.........................8-18
SDS Post-Installation ...................................................................... 8-19
Configuring State Database Replicas ...................................8-19
Configuring the Disk ID (DID) Driver.................................8-20
Configuring Dual-String Mediators .....................................8-21
Exercise: Configuring Volume Management.............................. 8-22
Preparation...............................................................................8-22
Tasks .........................................................................................8-23
Installing the SDS Software ...................................................8-24
Configuring the SDS Disk ID Driver....................................8-25
Resolving DID Driver Major Number Conflicts.................8-26
Initializing the SDS State Databases.....................................8-28
SDS Volume Overview...........................................................8-29
Selecting SDS Demo Volume Disk Drives ............................8-31
Configuring the SDS Demonstration Volumes ..................8-32
Configuring Dual-String Mediators .....................................8-32
Verifying the SDS Demonstration File Systems .................8-34
Verifying the Cluster ..............................................................8-35
Exercise Summary...................................................................8-36
Check Your Progress ...................................................................... 8-37
Think Beyond .................................................................................. 8-38
Cluster Configuration Database .............................................................9-1
Objectives ........................................................................................... 9-1
Relevance............................................................................................ 9-2
Additional Resources ....................................................................... 9-3
Cluster Configuration Information ................................................ 9-4
The CDB Database ....................................................................9-5
The CCD Database....................................................................9-6
Cluster Database Consistency ......................................................... 9-8

Data Propagation ......................................................................9-8
The CCD Update Protocol .......................................................9-9
Database Consistency Checking ...........................................9-10
Database Majority ...................................................................9-10
Shared CCD Operation ..........................................................9-13
Creating a Shared CCD ..........................................................9-13
Disabling a Shared CCD ........................................................9-15
CCD Administration ...................................................................... 9-16
Verifying CCD Global Consistency........................................9-16
Checkpointing the CCD .........................................................9-17
Restoring the CCD From a Backup Copy............................9-17
Creating a Purified Copy of the CCD ..................................9-17
Disabling the CCD Quorum..................................................9-18
Recommended CCD Administration Tasks........................9-18
Common Mistakes ..................................................................9-18
Exercise: CCD Administration...................................................... 9-19
Preparation...............................................................................9-19
Tasks .........................................................................................9-19
Maintaining the CCD Database ............................................9-20
Exercise Summary...................................................................9-21
Check Your Progress ...................................................................... 9-22
Think Beyond .................................................................................. 9-23
Public Network Management................................................................10-1
Objectives ......................................................................................... 10-1
Relevance.......................................................................................... 10-2
Additional Resources ..................................................................... 10-3
Public Network Management ....................................................... 10-4
The Network Monitoring Process ................................................ 10-6
What Happens? .......................................................................10-7
How PNM Works ........................................................................... 10-8
PNM Support Issues...............................................................10-9
TEST Routine..........................................................................10-11
FAILOVER Routine .................................................................10-12
DETERMINE_NET_FAILURE Routine ....................................10-13
The pnmset Command................................................................. 10-14
Other PNM Commands ............................................................... 10-17
The pnmstat Command.......................................................10-17
The pnmptor Command.......................................................10-19
The pnmrtop Command.......................................................10-19
Exercise: Configuring the NAFO Groups ................................. 10-20
Preparation.............................................................................10-20
Tasks .......................................................................................10-20
Creating a NAFO Group......................................................10-21
Disabling the Interface Groups Feature.............................10-22
Exercise Summary.................................................................10-23

Check Your Progress .................................................................... 10-24
Think Beyond ................................................................................ 10-25
Logical Hosts.............................................................................................11-1
Objectives ......................................................................................... 11-1
Relevance.......................................................................................... 11-2
Additional Resources ..................................................................... 11-3
Logical Hosts ................................................................................... 11-4
Configuring a Logical Host ........................................................... 11-7
Using the scconf -L Command Option ................................11-8
Logical Host Variations................................................................ 11-10
Basic Logical Host .................................................................11-10
Cascading Failover................................................................11-11
Disabling Automatic Takeover ...........................................11-12
Multiple Disk Groups and Hostnames ..................................11-12
Administrative File System Overview....................................... 11-13
Administrative File System Components..........................11-14
Using the scconf -F Command Option ...........................11-15
Logical Host File Systems ............................................................ 11-17
Adding a New Logical Host File System...........................11-18
Sample Logical Host vfstab File .......................................11-18
Logical Host Control .................................................................... 11-19
Forced Logical Host Migration ...........................................11-19
Logical Host Maintenance Mode........................................11-20
Exercise: Preparing Logical Hosts .............................................. 11-21
Preparation.............................................................................11-21
Tasks .......................................................................................11-21
Preparing the Name Service................................................11-22
Activating the Cluster ..........................................................11-22
Logical Host Restrictions .....................................................11-23
Creating the Logical Hosts ..................................................11-24
Creating the CVM/SSVM Administrative File System ..............11-25
Creating the SDS Administrative File System ..........................11-25
Exercise Summary.................................................................11-28
Check Your Progress .................................................................... 11-29
Think Beyond ................................................................................ 11-30
The HA-NFS Data Service......................................................................12-1
Objectives ......................................................................................... 12-1
Relevance.......................................................................................... 12-2
Additional Resources ..................................................................... 12-3
HA-NFS Overview.......................................................................... 12-4
HA-NFS Support Issues .........................................................12-5
Start NFS Methods .......................................................................... 12-7
Stop NFS Methods .......................................................................... 12-9
HA-NFS Fault Monitoring........................................................... 12-10

HA-NFS Fault Monitoring Probes......................................12-10
Fault Probes ................................................................................... 12-12
Local Fault Probes......................................................................... 12-13
Remote Fault Probes..................................................................... 12-14
Giveaway and Takeaway Process .............................................. 12-15
Sanity Checking.....................................................................12-16
Processes Related to NFS Fault Monitoring.............................. 12-17
HA-NFS Support Files.................................................................. 12-18
Adding Mount Information to the vfstab File................12-19
Adding Share Information to the dfstab File..................12-19
Sample vfstab and dfstab Files .......................................12-20
Removing HA-NFS File Systems From a Logical Host ..............12-20
Using the hareg Command......................................................... 12-21
Registering a Data Service ...................................................12-21
Unregistering a Data Service...............................................12-24
Starting and Stopping a Data Service.................................12-25
File Locking Recovery .................................................................. 12-26
Exercise: Setting Up HA-NFS File Systems............................... 12-27
Preparation.............................................................................12-27
Tasks .......................................................................................12-27
Verifying the Environment..................................................12-28
Preparing the HA-NFS File Systems ..................................12-29
Registering HA-NFS Data Service......................................12-30
Verifying Access by NFS Clients ........................................12-31
Observing HA-NFS Failover Behavior ..............................12-32
Check Your Progress .................................................................... 12-33
Think Beyond ................................................................................ 12-34
System Recovery ......................................................................................13-1
Objectives ......................................................................................... 13-1
Relevance.......................................................................................... 13-2
Additional Resources ..................................................................... 13-3
Sun Cluster Reconfiguration Control........................................... 13-4
Cluster Membership Monitor................................................13-6
Switch Management Agent ...................................................13-6
Public Network Management ...............................................13-6
Failfast Driver (/dev/ff) .......................................................13-6
Data Service Fault Monitors ..................................................13-7
Disk Management Software ..................................................13-7
Database Management Software ..........................................13-7
Sun Cluster Failfast Driver ............................................................ 13-8
Failfast Messages...................................................................13-10
Sun Cluster Reconfiguration Sequence...................................... 13-11
Reconfiguration Triggering Events ....................................13-13
Independent Reconfiguration Processes ...........................13-13

Sun Cluster Reconfiguration Steps............................................. 13-14
Reconfiguration Process Priorities......................................13-16
Reconfiguration Step Summary..........................................13-17
Cluster Interconnect Failures ...................................................... 13-18
CIS Failure Description ........................................................13-18
CIS Failure Symptoms..........................................................13-19
Correcting Ethernet CIS Failures ........................................13-20
Correcting SCI Interconnect Failures .................................13-20
Two-Node Partitioned Cluster Failure ...................................... 13-21
CVM or SSVM Partitioned Cluster.....................................13-21
SDS Partitioned Cluster .......................................................13-22
Logical Host Reconfiguration ..................................................... 13-23
Sanity Checking.....................................................................13-24
Exercise: Failure Recovery ........................................................... 13-25
Preparation.............................................................................13-25
Tasks .......................................................................................13-25
Losing a Private Network Cable.........................................13-26
Partitioned Cluster (Split Brain) .........................................13-26
Public Network Failure (NAFO group).............................13-27
Logical Host Fault Monitor Giveaway ..............................13-27
Cluster Failfast.......................................................................13-28
Exercise Summary.................................................................13-29
Check Your Progress .................................................................... 13-30
Think Beyond ................................................................................ 13-31
Sun Cluster High Availability Data Service API ..............................14-1
Objectives ......................................................................................... 14-1
Relevance.......................................................................................... 14-2
Additional Resources ..................................................................... 14-3
Overview .......................................................................................... 14-4
Data Service Requirements............................................................ 14-6
Client-Server Data Service .....................................................14-6
Data Service Dependencies ...................................................14-6
No Dependence on Physical Hostname of Server..............14-7
Handles Multi-homed Hosts.................................................14-7
Handles Additional IP Addresses for Logical Hosts.........14-7
Data Service Methods..................................................................... 14-9
START Methods ......................................................................14-9
STOP Methods.......................................................................14-10
ABORT Methods ...................................................................14-10
NET Methods.........................................................................14-11
Fault Monitoring Methods ..................................................14-12
Giveaway and Takeaway............................................................. 14-14
Giveaway Scenario................................................................14-15
Takeaway Scenario ...............................................................14-16
Method Considerations................................................................ 14-17

START and STOP Method Examples......................................... 14-19
Example 1 ...............................................................................14-19
Example 2 ...............................................................................14-20
Data Service Dependencies ......................................................... 14-21
The haget Command Options............................................14-24
The hactl Command ................................................................... 14-26
The hactl Command Options............................................14-27
The halockrun Command .......................................................... 14-28
The hatimerun Command .......................................................... 14-29
The pmfadm Command................................................................. 14-30
What Is Different From HA 1.3? ................................................. 14-31
The hads C Library Routines ...................................................... 14-32
Exercise: Using the Sun Cluster Data Service API ................... 14-33
Preparation.............................................................................14-33
Tasks .......................................................................................14-33
Using the haget Command.................................................14-34
Check Your Progress .................................................................... 14-36
Think Beyond ................................................................................ 14-37
Highly Available DBMS ........................................................................15-1
Objectives ......................................................................................... 15-1
Relevance.......................................................................................... 15-2
Additional Resources ..................................................................... 15-3
Sun Cluster HA-DBMS Overview ................................................ 15-4
Database Binary Placement ...................................................15-5
Supported Database Versions ...............................................15-5
HA-DBMS Components.........................................................15-6
Multiple Data Services ...........................................................15-6
Typical HA-DBMS Configuration ................................................ 15-7
Configuring and Starting HA-DBMS........................................... 15-8
Stopping and Unconfiguring HA-DBMS .................................... 15-9
Removing a Logical Host.....................................................15-10
Removing a DBMS From a Logical Host...........................15-10
The HA-DBMS Start Methods..................................................... 15-11
The HA-DBMS Stop and Abort Methods.................................. 15-13
The HA-DBMS Stop Methods .............................................15-13
The HA-DBMS Abort Methods...........................................15-14
HA-DBMS Fault Monitoring ....................................................... 15-15
Local Fault Probe Operation ...............................................15-16
Remote Fault Probe Operation............................................15-16
HA-DBMS Action Files ........................................................15-17
HA-DBMS Failover Procedures..........................................15-19
Configuring HA-DBMS for High Availability ......................... 15-20
Multiple Data Services .........................................................15-20
Raw Partitions Versus File Systems ...................................15-21
Configuration Overview.............................................................. 15-22

General HA-DBMS Configuration Issues..........................15-22
User and Group Entries .......................................................15-23
Database Software Location ................................................15-23
Oracle Installation Preparation ................................................... 15-24
Sybase Installation Preparation .................................................. 15-26
Informix Installation Preparation ............................................... 15-28
Preparing the Logical Host.......................................................... 15-30
Preparing the Database Configuration Files .....................15-31
Enable Fault Monitoring Access .........................................15-31
Registering the HA-DBMS Data Service ...........................15-32
Adding Entries to the CCD..................................................15-32
Bring the HA-DBMS for Oracle Servers Into Service ......15-32
HA-DBMS Control........................................................................ 15-33
Setting HA-DBMS Monitoring Parameters.......................15-33
Starting and Stopping HA-DBMS Monitoring .................15-37
HA-DBMS Client Overview ........................................................ 15-38
Maintaining the List of Monitored Databases ..................15-39
HA-DBMS Recovery..................................................................... 15-40
Client Recovery .....................................................................15-40
HA-DBMS Recovery Time...................................................15-41
HA-DBMS Configuration Files ................................................... 15-42
HA-Oracle Configuration Files ...........................................15-43
HA-Sybase Configuration Files ..........................................15-44
HA-Informix Configuration Files .......................................15-45
Exercise: HA-DBMS Installation................................................. 15-46
Preparation.............................................................................15-46
Tasks .......................................................................................15-46
Exercise Summary.................................................................15-47
Check Your Progress .................................................................... 15-48
Think Beyond ................................................................................ 15-49
Cluster Configuration Forms..................................................................A-1
Cluster Name and Address Information...................................... A-2
Multi-Initiator SCSI Configuration ...................................................... B-1
Preparing for Multi-Initiator SCSI................................................. B-2
Background............................................................................... B-2
Changing All Adapters ................................................................... B-3
Changing the Initiator ID........................................................ B-3
Drive Firmware ........................................................................ B-3
The nvramrc Script .......................................................................... B-4
Changing an Individual Initiator ID for Multipacks .................. B-5
Sun Storage Array Overviews ................................................................C-1
Disk Storage Concepts..................................................................... C-2
Multi-host Access.....................................................................C-2
Host-Based RAID (Software RAID Technology).................C-5

Controller-Based RAID (Hardware RAID Technology) ....C-6
Redundant Dual Active Controller Driver...........................C-7
Dynamic Multi-Path Driver....................................................C-8
Hot Swapping...........................................................................C-9
SPARCstorage Array 100 .............................................................. C-10
SPARCstorage Array 100 Features ......................................C-10
SPARCstorage Array 100 Addressing ................................C-11
RSM Storage Array ........................................................................ C-12
RSM Storage Array Features ................................................C-12
RSM Storage Array Addressing...........................................C-13
SPARCstorage Array 214/219...................................................... C-14
SPARCstorage Array 214/219 Features .............................C-14
SPARCstorage Array 214 Addressing ................................C-15
Sun StorEdge A3000 (RSM Array 2000)...................................... C-16
StorEdge A3000 Features ......................................................C-16
StorEdge A3000 Addressing.................................................C-17
StorEdge A1000/D1000................................................................. C-19
Shared Features ......................................................................C-19
StorEdge A1000 Differences .................................................C-20
StorEdge A1000 Addressing.................................................C-20
StorEdge D1000 Differences .................................................C-21
StorEdge D1000 Addressing.................................................C-21
Sun StorEdge A3500 ...................................................................... C-22
StorEdge A3500 Features ......................................................C-22
StorEdge A3500 Addressing.................................................C-24
Sun StorEdge A5000 ...................................................................... C-25
A5000 Features .......................................................................C-25
StorEdge A5000 Addressing.................................................C-27
Sun StorEdge A7000 ...................................................................... C-29
Sun StorEdge A7000 Enclosure............................................C-29
StorEdge A7000 Functional Elements .................................C-31
StorEdge A7000 Addressing.................................................C-33
Combining SSVM and A7000 Devices ................................C-34
SPARCstorage MultiPack ............................................................. C-35
SPARCstorage MultiPack Features .....................................C-36
SPARCstorage MultiPack Addressing................................C-36
Storage Configuration ................................................................... C-37
Identifying Storage Devices..................................................C-37
Identifying Controller Configurations................................C-40
Oracle Parallel Server...............................................................................D-1
Oracle Overview .............................................................................. D-2
Oracle 7.x and Oracle 8.x Similarities ...................................D-3
Oracle 7.x and Oracle 8.x Differences ...................................D-5
Oracle Configuration Files.............................................................. D-7
The /etc/system File .............................................................D-8

The /etc/opt/SUNWcluster/conf/clustname.ora_cdb File ..........D-9
The init_ora File.................................................................D-10
Oracle Database Volume Access.................................................. D-11
Oracle Volume Types ............................................................D-13
CVM Volume Pathnames .....................................................D-13
Changing Permission or Ownership of Volumes .............D-13
DLM Reconfiguration.................................................................... D-14
DLM Reconfiguration Steps .................................................D-15
Volume Manager Reconfiguration with CVM .......................... D-16
Initial Volume Configuration...............................................D-18
Volume Reconfiguration With CVM...................................D-18
Oracle Parallel Server Specific Software..................................... D-19
The SUNWudlm Package Summary...................................D-19
Glossary ......................................................................................... Glossary-1
Acronyms Glossary................................................................... Acronyms-1

About This Course

Course Goal
This course provides students with the essential information and skills
to install and administer a Sun Enterprise™ Cluster system running
Sun Cluster 2.2 software.

Course Overview

This course provides students with the essential information and skills
to install and administer Sun Enterprise Cluster hardware running Sun
Cluster 2.2 software.

The most important tasks for the system administrator are Sun Cluster
software installation and configuration, hardware configuration,
system operations, and system recovery. During this course, these
topics are presented in the order in which a typical cluster installation
takes place.

Course Map

The following course map enables you to see what you have
accomplished and where you are going in reference to the course goal.

Module-by-Module Overview

This course contains the following modules:

● Module 1 – Sun Cluster Overview

This lecture-only module introduces all of the basic concepts


associated with Sun Enterprise Cluster systems.

Lab exercise – There is no lab for this module.

● Module 2 – Terminal Concentrator

This module introduces the critical administrative functions


supported by the Terminal Concentrator interface, and explains
basic Terminal Concentrator theory, and the installation and
configuration process.

Lab exercise – Configure the Terminal Concentrator cabling and


operating parameters for proper operation in the Sun Enterprise
Cluster environment.


● Module 3 – Administration Workstation

The lecture portion of this module presents an overview of Sun


Cluster (SC) administration workstation software files and the
general process used to install all cluster software.

Lab exercise – Install the SC software on the administration


workstation, configure the cluster administrative files, and start
one of the cluster administration tools, cconsole.

● Module 4 – Preinstallation Configuration

This module provides the information necessary to prepare a Sun


Enterprise Cluster (SEC) system for the Sun Cluster software
installation. It focuses on issues relevant to selecting and
configuring an appropriate cluster topology.

Lab exercise – Select and configure a target cluster topology. You


will verify that the configuration is ready to start the cluster host
software installation.

● Module 5 – Cluster Host Software Installation

The lecture presents an overview of the Sun Cluster host software


files and distribution.

Lab exercise – Install the Sun Cluster software on the cluster host
systems.

● Module 6 – System Operation

This module discusses the Sun Cluster Manager graphical


administration tool, along with the cluster administration
command-line features.

Lab exercise – Start and stop the cluster software using the
scadmin command and verify cluster status with the SCM
application and the hastat command.


● Module 7 – Volume Management with CVM and SSVM

This module reviews the basic space management techniques used


by the Cluster Volume Manager and the Sun StorEdge Volume
Manager™. The installation and initialization processes for CVM
and SSVM are presented along with post-installation issues.

Lab exercise – Install and initialize either CVM or SSVM. You will
use script files to create demonstration volumes.

● Module 8 – Volume Management with Solstice DiskSuite

This module reviews the basic space management techniques used


by the Solstice DiskSuite™ (SDS) volume manager. The
installation and initialization processes for SDS are presented
along with post-installation issues.

Lab exercise – Install and initialize SDS. You will use script files to
create demonstration volumes.

● Module 9 – Cluster Configuration Database

This module discusses the purpose, structure, and administration


of the cluster database (CDB) and the cluster configuration
database (CCD).

Lab exercise – Perform basic administration operations on the


CDB and CCD files. This includes verifying consistency between
cluster hosts, making backup copies, and checking for errors.

● Module 10 – Public Network Management

This module describes the operation, configuration, and


management of the Sun Cluster Public Network Management
(PNM) mechanism. The creation of network adapter failover
groups (NAFO) and their relationship to logical hosts is also
discussed.

Lab exercise – Configure a NAFO group on each cluster host and


disable the Solaris™ Operating System Interface Groups feature.


● Module 11 – Logical Hosts

This module discusses the purpose of Sun Cluster logical hosts


and their relationship to data services. The structure and creation
of logical hosts is presented along with some common variations.

Lab exercise – Configure and test two logical hosts.

● Module 12 – The HA-NFS Data Service

This module describes and demonstrates the configuration and


management of Sun Cluster HA-NFS™ file systems.

Lab exercise – Create, register, and test a demonstration HA-NFS


data service. Switch the data service between cluster hosts.

● Module 13 – System Recovery

This module summarizes the basic recovery process for a number


of cluster failure scenarios. It includes background information
and details about operator intervention.

Lab exercise – Create and recover from cluster interconnect


failures, partitioned cluster, public network interface failures, a
logical host failure, and a cluster node failfast failure.

● Module 14 – The Sun Cluster High Availability Data Service API

This module provides an overview of how to integrate


applications into the Sun Cluster High Availability framework. It
also describes key failover actions performed by the Sun Cluster
High Availability software.

Lab exercise – There is no lab for this module

● Module 15 – Highly Available DBMS

This module describes the configuration and operation of a highly


available database in the Sun Cluster environment.

Lab exercise – There is no lab for this module.

Course Objectives

Upon completion of this course, you should be able to:

● Describe major Sun Cluster components and functions

● Verify system cabling

● Configure the Terminal Concentrator for proper operation

● Install, remove, and update Sun Cluster software

● Troubleshoot software installation and configuration errors

● Configure environmental variables for correct Sun Cluster


operation

● Use the Sun Cluster administration tools

● Initialize one of the supported volume managers

● Describe the differences between the supported volume managers

● Prepare the Public Network Management failover environment

● Create and configure logical hosts

● Install and configure highly available data services

● Describe the Sun Cluster failure recovery mechanisms

● Identify and recover from selected Sun Cluster failures

Skills Gained by Module

The skills for Sun Enterprise Cluster Administration are shown in


column 1 of the following matrix. The black boxes indicate the main
coverage for a topic; the gray boxes indicate the topic is briefly
discussed.

[Skills Gained matrix – the rows list the skills below; the columns correspond to Modules 1 through 14. The box markers indicating main or brief coverage are not reproduced in this text version.]

Describe the major Sun Enterprise Cluster


components and functions

Verify disk storage cabling

Configure the Terminal Concentrator

Configure the cluster interconnect system

Install the Sun Cluster 2.2 software

Troubleshoot software installation and


configuration errors

Configure environmental variables for correct


SEC operation

Use SEC administration tools

Initialize either the Enterprise Volume


Manager, Cluster Volume Manager, or Solstice
DiskSuite

Describe the SEC recovery mechanisms

Configure the Sun Cluster 2.2 Highly


Available NFS data service

Create public network adapter backup groups with the public network management utility (PNM)

Identify and recover from selected SEC


failures

Guidelines for Module Pacing

The following table provides a rough estimate of pacing for this


course:

[The original table distributes the modules across Day 1 through Day 5; the day columns are not reproduced here. The AM/PM pacing for each module is as follows:]

About This Course – AM
Product Introduction – AM
Terminal Concentrator – AM/PM
Administration Workstation Installation – PM
Preinstallation Configuration – AM
Cluster Host Software Installation – AM/PM
System Operation – PM
Volume Management – AM
Cluster Configuration Data – PM
Public Network Management – PM
Logical Hosts – AM
HA-NFS Data Service – AM/PM
System Recovery – PM
Sun Cluster High Availability Data Service API – AM
HA-DBMS – AM/PM

Topics Not Covered

This course does not cover the topics shown on the above overhead.
Many of the topics listed on the overhead are covered in other courses
offered by Sun Educational Services:

● Database management – Covered in database vendor courses

● Network administration – Covered in SA-380: Solaris 2.x Network


Administration

● Solaris administration – Covered in SA-235: Solaris 2.X System


Administration I and SA-286: Solaris System Administration II

● Database performance and tuning – Covered in database vendor


courses

● Disk storage management – Covered in SO-352: Disk Management


With Solstice DiskSuite, SA-345: Volume Manager with SPARCstorage
Array, and SA-347: Volume Manager with StorEdge A5000

Refer to the Sun Educational Services catalog for specific information


and registration.

How Prepared Are You?

To be sure you are prepared to take this course, can you answer yes to
the questions shown on the above overhead?

● Virtual volume management administration is a central portion of


the Sun Enterprise Cluster functionality.

● Solaris Operating Environment system administration is an


integral part of Sun Enterprise Cluster administration. You cannot
separate the two.

● Resolving all Sun Enterprise Cluster issues requires more


hardware knowledge than in most other system applications.

● Sun Enterprise Cluster systems are frequently composed of


enterprise-class components. You must be used to dealing with
this type of high-end hardware.

Introductions

Now that you have been introduced to the course, introduce yourself
to each other and the instructor, addressing the items shown on the
above overhead.

How to Use Course Materials

To enable you to succeed in this course, these course materials employ


a learning model that is composed of the following components:

● Course map – Each module starts with an overview of the content


so you can see how the module fits into your overall course goal.

● Relevance – The “Relevance” section for each module provides


scenarios or questions that introduce you to the information
contained in the module and provoke you to think about how the
module content relates to cluster administration.

● Overhead image – Reduced overhead images for the course are


included in the course materials to help you easily follow where
the instructor is at any point in time. Overheads do not appear on
every page.

● Lecture – The instructor will present information specific to the


topic of the module. This information will help you learn the
knowledge and skills necessary to succeed with the exercises.


● Exercise – Lab exercises will give you the opportunity to practice


your skills and apply the concepts presented in the lecture.

● Check your progress – Module objectives are restated, sometimes


in question format, so that before moving on to the next module
you are sure that you can accomplish the objectives of the current
module.

● Think beyond – Thought-provoking questions are posed to help


you apply the content of the module or predict the content in the
next module.

Course Icons and Typographical Conventions

The following icons and typographical conventions are used in this


course to represent various training elements and alternative learning
resources.

Icons
Additional resources – Indicates additional reference materials are
available.

Discussion – Indicates a small-group or class discussion on the current


topic is recommended at this time.

Exercise objective – Indicates the objective for the lab exercises that
follow. The exercises are appropriate for the material being discussed.

Note – Additional important, reinforcing, interesting or special


information.

Caution – A potential hazard to data or machinery.

Warning – Anything that poses personal danger or irreversible
damage to data or the operating system.


Typographical Conventions
Courier is used for the names of commands, files, and directories, as
well as on-screen computer output. For example:

Use ls -al to list all files.


system% You have mail.

Courier bold is used for characters and numbers that you type. For
example:

system% su
Password:

Courier italic is used for variables and command-line


placeholders that are replaced with a real name or value. For example:

To delete a file, type rm filename.

Palatino italics is used for book titles, new words or terms, or words
that are emphasized. For example:

Read Chapter 6 in User’s Guide.


These are called class options.
You must be root to do this.

Sun Cluster Overview 1

Objectives
Upon completion of this module, you should be able to:

● List the hardware elements that comprise a basic Sun Enterprise


Cluster system

● List the hardware and software components that contribute to the


availability of a Sun Enterprise Cluster system

● List the types of redundancy that contribute to the availability of a


Sun Enterprise Cluster system.

● Identify the configuration differences between a high availability


cluster and a parallel database cluster

● Explain the purpose of logical host definitions in the Sun


Enterprise Cluster environment

● Describe the purpose of the cluster configuration databases

● Explain the purpose of each of the Sun Enterprise Cluster fault


monitoring mechanisms.

The main goal of this module is to introduce the basic concepts


associated with the Sun Enterprise Cluster environment.

Relevance

Discussion – The following questions are relevant to understanding


the content of this module:

1. What is a highly available data service?

2. What must be done to make a data service “highly available”?

3. What type of system support would a highly available data service


require?

4. How would you manage the group of resources required by a


highly available data service?

Additional Resources

Additional resources – The following references can provide


additional details on the topics discussed in this module:

● Sun Cluster 2.2 System Administration Guide, part number 805-4238

● Sun Cluster 2.2 Software Installation Guide, part number 805-4239

● Sun Cluster 2.2 Cluster Volume Manager Guide, part number


805-4240

● Sun Cluster 2.2 API Developers Guide, part number 805-4241

● Sun Cluster 2.2 Error Messages Manual, part number 805-4242

● Sun Cluster 2.2 Release Notes, part number 805-4243


Sun Cluster 2.2 New Features

The Sun Cluster 2.2 software release has the following new features:

● Sun Cluster 2.2 is now fully internationalized and Year 2000 (Y2K)
compliant

● Support for Solstice DiskSuite has been added

This provides an upgrade path for existing HA 1.3 customers.


There are several features and restrictions associated with Solstice
DiskSuite installations. They include:
● Solaris 7 Operating Environment compatibility
● A new Disk ID (DID) software package
● A new DID configuration command, scdidadm
● A special scadmin command option, reserve
● Shared CCD volume is not supported
● Quorum disk drives are not supported (or needed)


● Solaris 7 Operating Environment is now supported

Currently, the Solaris 7 Operating Environment can be used only


in conjunction with Solstice DiskSuite. The Cluster Volume
Manager and the Sun StorEdge Volume Manager products are not
yet compatible with the Solaris 7 Operating Environment.

● The installation procedures have been changed

You can now fully configure the cluster during the host software
installation process. This includes configuring public network
backup groups and logical hosts.

● Licensing is much simpler

Sun Cluster 2.2 requires no framework or HA data service licenses


to run. However, you need licenses for Sun Enterprise Volume
Manager (SEVM) if you use SEVM with any storage devices other
than SPARCstorage™ Arrays or StorEdge™ A5000s.
SPARCstorage Arrays and StorEdge A5000s include bundled
licenses for use with SEVM. Contact the Sun License Center for
any necessary SEVM licenses; see:
http://www.sun.com/licensing/ for more information.

You might need to obtain licenses for DBMS products and other
third-party products. Contact your third-party service provider for
third-party product licenses.

● There is a new cluster management tool, Sun Cluster Manager

A new Java™ technology-based cluster management tool replaces


the previous Cluster Monitor tool. The new tool can be run from
the cluster host systems as a standalone application, or can be
accessed from a remote browser after the appropriate HTTP server
software has been installed on the cluster host systems.


Cluster Hardware Components

The basic hardware components that are necessary for most cluster
configurations include:

● One administration workstation

● One Terminal Concentrator

● Two hosts (up to four)

● One or more public network interfaces per system (not shown)

● A redundant private network interface

● At least one source of shared, mirrored disk storage

Note – E10000-based clusters do not use the terminal concentrator for


host system access.


Administration Workstation
The administration workstation can be any Sun SPARC workstation,
providing it has adequate resources to support graphics and
compute-intensive applications, such as Sun VTS™ and Solstice SyMON™
software. You can run several different cluster administration
tools on the administration workstation.

Note – Typically, the cluster applications are available to users through


networks other than the one used by the administration workstation.

Terminal Concentrator
The Terminal Concentrator (TC) is specially modified for use in the
Sun Enterprise Environment. No substitutions are supported.

The TC provides direct translation from the network packet switching


environment to multiple serial port interfaces. Each of the serial port
outputs connects to a separate node in the cluster through serial port
A. Because the nodes do not have frame buffers, this is the only access
path when Solaris is not running.

Cluster Host Systems


A wide range of Sun hardware platforms are supported for use in the
clustered environment. Mixed platform clusters are supported, but the
systems should be of equivalent processor, memory, and I/O
capability.

You cannot mix SBus-based systems with PCI-based systems.
Equivalent SBus and PCI interface cards are not compatible.


Redundant Private Networks


All nodes in a cluster are linked by a private communication network.
This private interface is called the cluster interconnect system (CIS)
and is used for a variety of reasons including:

● Cluster status monitoring

● Cluster recovery synchronization

● Parallel database lock and query information

Cluster Disk Storage


Although a wide range of Sun storage products are available for use in
the Sun Enterprise Cluster Environment, they must all accept at least
dual-host connections. Some models support up to four host system
connections.


High Availability Features

The Sun Enterprise Cluster system is a general purpose cluster


architecture focused on providing reliability, availability, and
scalability. Part of the reliability and availability is inherent in the
hardware and software used in the Sun Enterprise Cluster.

High Availability Hardware Design


Many of the supported cluster hardware platforms have the following
features that contribute to maximum uptime:

● Hardware is interchangeable between models.

● Redundant system board power and cooling modules.

● The systems support automatic system reconfiguration; failed
components, such as the central processing unit (CPU), memory,
and input/output (I/O), can be disabled at reboot.

● Several disk storage options support hot swapping of disks.


Sun Cluster High Availability Software


The Sun Cluster software has monitoring and control mechanisms that
can initiate various levels of cluster reconfiguration to help maximize
application availability.

Software Redundant Array of Inexpensive Disks (RAID) Technology
The Sun StorEdge Volume Manager (SSVM) and the Cluster Volume
Manager (CVM) software provide RAID protection in the following
ways:

● Redundant mirrored volumes

● RAID-5 volumes (not for all applications)

The SDS software provides mirrored volume support. The SDS


product does not support RAID-5 volumes.

Controller-Based RAID Technology


Several supported disk storage devices use controller-based RAID
technology that is sometimes referred to as hardware RAID. This
includes the following storage arrays:

● Sun StorEdge A1000

● Sun StorEdge A3000/3500

Year 2000 Compliance


The Sun Cluster 2.2 software and the associated Solaris Operating
Environments are both Year 2000 compliant, which contributes to
long-term cluster reliability.


High Availability Strategies

To provide the high level of system availability required by many


Enterprise customers, the Sun Enterprise Cluster system uses the
following strategies:

● Redundant servers

● Redundant data

● Redundant public network access

● Redundant private communications


Redundant Servers
The Sun Enterprise Cluster system consists of one to four
interconnected systems that are referred to as cluster host systems and
also as nodes. The systems can be almost any of the Sun Enterprise
class of platforms. They use off-the-shelf non-proprietary hardware.

Note – You cannot mix systems that use PCI bus technology, such as
the E450, with SBus technology systems, such as the E3000. Many of
the interface cards, such as the storage array interfaces, are not
compatible when connected to the same storage unit.

Redundant Data
A Sun Enterprise Cluster system can use any one of several virtual
volume management packages to provide data redundancy. The use of
data mirroring provides a backup in the event of a disk drive or
storage array failure.
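As a simple illustration (volume manager installation and use are covered in Modules 7 and 8), a mirrored volume could be created with the SSVM or CVM vxassist utility as shown below. The disk group and volume names are hypothetical, and this sketch is not part of the standard installation flow. Solstice DiskSuite provides equivalent protection with mirrored metadevices.

# vxassist -g appdg make appvol 2g layout=mirror nmirror=2   # two-way mirror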

Redundant Public Networks


The Sun Enterprise Cluster system provides a proprietary Public
Network Management (PNM) feature that can transfer user I/O from a
failed network interface to a predefined backup interface.

Redundant Private Networks


The cluster interconnect system (CIS) consists of dual high-speed
private node-to-node communication links. Only one of the links is
used at a time. If the primary link fails, the cluster software
automatically switches to the backup link. This is transparent to all
cluster applications.


Cluster Configurations

The Sun Enterprise Cluster system provides a highly available


platform that is suitable for two general purposes:

● Highly available data services

● Parallel databases

Highly Available Data Service Configuration


The highly available data service (HADS) configuration is
characterized by independent applications that run on each node or
cluster host system. Each application accesses its own data.

If there is a node failure, the data service application can be configured


so that a designated backup node can take over the application that
was running on the failed node.


Parallel Database Configuration


The parallel database (PDB) configuration is characterized by multiple
nodes that access a single database. The PDB application is not a data
service. This is a less complex configuration than the HADS and when
a node fails, the database software resolves incomplete database
transactions automatically. The Sun Cluster software initiates a portion
of the database recovery process and performs minor recovery
coordination between cluster members.

Parallel database solutions, such as Oracle Parallel Server (OPS), can


require special modifications to support shared concurrent access to a
single image of the data that can be spread across multiple computer
systems and storage devices.

OPS uses a distributed lock management scheme to prevent


simultaneous data modification by two hosts. The lock ownership
information is transferred between cluster hosts across the cluster
interconnect system.

Note – There is also a special configuration for use with the Informix
Online XPS database. This is discussed in a later module.


Sun Cluster Application Support

The Sun Cluster software framework provides support for both highly
available data services and parallel databases.

Regardless of which application your cluster is running, the core Sun
Cluster control and monitoring software is identical.


Highly Available Data Service Support


The Sun Cluster software provides preconfigured components that
support the following highly available data services:

● Oracle, Informix, and Sybase databases

● NFS

● SAP

● Netscape Mail, News, HTTP, LDAP

● DNS

● Tivoli

● Lotus Notes

● HA-API for local applications

Parallel Database Support


The Sun Cluster software provides support for the following parallel
database application:

● Oracle Parallel Server (OPS)

● Informix XPS

Note – When the Sun Cluster software is installed on the cluster host
systems, you must specify which of the above products you intend to
run on your cluster.


Logical Hosts

A data service in the Sun Cluster environment must be able to migrate


to one or more backup systems if the primary system fails. This should
happen with as little disruption to the client as possible.

At the heart of any highly available data service is the concept of a


logical host. Logical host definitions are created by the system
administrator and are associated with a particular data service such as
highly available NFS (HA-NFS).

A logical host definition provides all of the necessary information for a
designated backup system to take over the data service(s) of a failed
node. This includes the following (a hypothetical configuration example
follows the list):

● The IP address/hostname that users use to access the application

● The disk group that contains the application-related data

● The application that must be started
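The following is a minimal sketch of how these three pieces of information are tied together when a logical host is created with the scconf command (logical hosts are covered in detail in Module 11). The cluster name, node names, network adapters, disk group, and logical hostname shown here are hypothetical, and the option syntax should be verified against the scconf man page for your software release.

admin-ws# scconf sc-cluster -L nfs-lhost \
-n node0,node1 -g nfs-dg -i hme0,hme0,nfs-lhost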


Logical Host Failure Process


There are several different fault monitoring processes that can detect
the failure of a logical host. Once the failure is detected, the actions
taken are as follows:

● The IP address associated with the logical host is brought up on


the designated backup system.

Note – There are also logical interface names associated with each
logical host.

● The disk group or diskset associated with a logical host migrates


to the designated backup system.

Note – Depending on the configured volume manager, the group of
disks is captured by the backup node using either the vxdg command
or the metaset command, as shown in the sketch following this list.

● The designated application is started on the backup node.
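In simplified form, the disk group migration amounts to the backup node importing or taking ownership of the shared devices, as the sketch below shows. The disk group and diskset names are hypothetical, and during an actual failover these commands are issued by the Sun Cluster framework, not typed by the administrator.

# vxdg import nfs-dg       # SSVM or CVM disk group import
# metaset -s nfs-set -t    # Solstice DiskSuite diskset takeover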

Cluster Configuration Databases


The logical host configuration information is stored in a global
configuration file named ccd.database (CCD). The CCD must be
consistent between all nodes. The CCD is a critical element that
enables each node to be aware of its potential role as a designated
backup system.

Note – There is another cluster database called the CDB that stores
basic cluster configuration information, such as node names. This
database is used during initial cluster startup. It does not require as
high a level of functionality as the CCD database.


Fault Monitoring

To ensure continued data availability, the Sun Enterprise Cluster


environment has several different fault monitoring schemes that can
detect a range of failures and initiate corrective actions.

The fault monitoring mechanisms fall into two general categories:

● Data service fault monitoring

● Cluster fault monitoring (daemons)


Data Service Fault Monitoring


Each Sun-supplied data service, such as the highly available network
file system (HA-NFS), automatically starts its own fault monitoring
processes. The data service fault monitors verify that the data service
is functioning correctly and providing its intended service.

The well-being of the data service is checked both locally and on the
designated backup node for the data service. Each data service always
has a local and a remote fault monitor associated with it.

The data service fault monitors primarily use the public network
interfaces to verify functionality, but can also use the private cluster
interconnect system (CIS) interfaces if there is a problem with the
public network interfaces.
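You can observe the overall status reported by these monitors with the hastat command that is introduced in Module 6. A minimal invocation is shown below as a sketch; the report format is not reproduced here and varies with the cluster configuration.

# hastat    # summarizes the status of the cluster and its components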

Cluster Fault Monitoring


There are several cluster fault monitoring mechanisms that verify the
overall well-being of the cluster nodes and some of their hardware.

Public Network Management

The public network management (PNM) daemon, pnmd, monitors the


functionality of designated public network interfaces and can
transparently switch to backup interfaces in the event of a failure.

Cluster Membership Monitor

The cluster membership monitor (CMM) daemon, clustd, runs on


each cluster member and is used to detect major system failures. The
clustd daemons communicate with one another across the private
high-speed network interfaces by sending regular heartbeat messages.
If the heartbeat from any node is not detected within a defined timeout
period, it is considered as having failed and a general cluster
reconfiguration is initiated by each of the functioning nodes.


Switch Management Agent

The switch management agent (SMA) daemon, smad, monitors the


functionality of the current private network interface. If a node detects
a failure on its private network interface, it switches to its backup
private network interface. All cluster members must then switch to
their backup interfaces.

Failfast Driver

Each node has a special driver, ff, that runs in memory. It monitors
critical cluster processes and daemons. If any of the monitored
processes or daemons hang or terminate, the ff driver performs its
failfast function and forces a panic on the node.
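As a quick sanity check on a running cluster member, you can confirm that these monitoring components are present using standard Solaris commands. This is only a sketch; the process and kernel module names shown are assumptions that can vary between releases, so verify them on your own systems.

# ps -ef | egrep 'clustd|pnmd|smad'   # cluster monitoring daemons
# modinfo | grep ff                   # failfast (ff) kernel driver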


Failure Recovery Summary

To achieve high availability, failed software or hardware components


must be automatically detected and corrective action initiated. Some
failures do not require a cluster reconfiguration.


Many of the components shown in Figure 1-1 have failure recovery


capabilities. Some failures are less transparent than others and can
result in a node crash. Although some of the failures do not disturb the
cluster operation, they reduce the level of redundancy and therefore,
increase the risk of data loss.

[Figure 1-1 Failure Recovery Summary – a diagram showing two cluster nodes (Node 0 and Node 1). Each node runs the RDBMS, the failfast driver (ff), and the disk management software; the CDB and CCD files are checked for consistency between the nodes; the CMM and SMA daemons exchange heartbeats across the redundant private networks; and each node attaches to the shared storage arrays through fiber-optic channels.]


The following describes the types of system failures and recovery


solutions.

● Individual disk drive failure

When virtual volumes are mirrored, you can replace a failed disk
drive without interrupting the database or data service access.

● Fibre Channel or array controller failure

A failure in any part of a fiber-optic interface or an array controller


is treated as a multiple disk failure. All I/O is sent to the mirrors
on another array. This is handled automatically by the disk
management software.

● Cluster interconnect failure

If a single private network failure is detected by the SMA, all


traffic is automatically routed through the remaining private
network. This is transparent to all programs that use the private
network.

● Node failure

If a node crashes or if both its private networks fail, the cluster


membership monitor (CMM) daemons on all node detect the
heartbeat loss and initiate a comprehensive reconfiguration.

● Critical process or daemon failure

If certain critical processes or daemons hang or fail, the failfast


driver, ff, forces a panic on the node. As a result, all other nodes
in the cluster go through a reconfiguration.

● Cluster configuration database file inconsistency

When a cluster reconfiguration takes place for any reason, a


consistency check is done among all nodes to ensure that all of the
CDB and CCD database files agree. If there is an inconsistency in
one or more of the cluster database files, CMM determines, by a
majority opinion, which nodes remain in clustered operation.

Exercise: Lab Equipment Familiarization

Exercise objective – None

Preparation

None

Tasks

None

Check Your Progress

Before continuing on to the next module, check that you are able to
accomplish or answer the following:

❑ List the hardware elements that comprise a basic Sun Enterprise


Cluster system

❑ List the hardware and software components that contribute to the


availability of a Sun Enterprise Cluster system

❑ List the types of redundancy that contribute to the availability of a


Sun Enterprise Cluster system.

❑ Identify the configuration differences between a high availability


cluster and a parallel database cluster

❑ Explain the purpose of logical host definitions in the Sun


Enterprise Cluster environment

❑ Describe the purpose of the cluster configuration databases

❑ Explain the purpose of each of the Sun Enterprise Cluster fault


monitoring mechanisms.

Think Beyond

What are some of the most common problems encountered during


cluster installation?

How does a cluster installation proceed? What do you need to do first?

Do you need to be a database expert to administer a Sun Enterprise


Cluster system?

Terminal Concentrator 2

Objectives

Upon completion of this module, you should be able to:

● Describe the Sun Enterprise Cluster administrative interface

● Explain the TC hardware configuration

● Verify the correct TC cabling

● Configure the TC IP address

● Configure the TC to self-load

● Verify the TC port settings

● Verify that the TC is functional

● Use the terminal concentrator help, who, and hangup commands

● Describe the purpose of the telnet send brk command

This module should help you understand the critical functions


provided by the Sun Enterprise Cluster central administration
hardware interface called the Terminal Concentrator. You should also
learn how to configure the TC for proper operation in a clustered
environment.

Relevance

Discussion – The following questions are relevant to your learning the


material presented in this module:

1. Why is this hardware covered so early in the course?

2. You are purchasing an Ultra Enterprise 10000-based cluster. Does


the information in this module apply?

Additional Resources

Additional resources – The following references can provide


additional details on the topics discussed in this module:

● Sun Cluster 2.2 System Administration Guide, part number 805-4238

● Sun Cluster 2.2 Software Installation Guide, part number 805-4239

● Sun Cluster 2.2 Release Notes, part number 805-4243


Cluster Administration Interface

The TC is a hardware interface, consisting of several components that


provide the only access path to the cluster host systems when these
systems are halted or before any operating system software is
installed.

The TC is typically the first component configured during a new


cluster installation. The administration workstation is set up next.


As shown in Figure 2-1, the cluster administration interface is a
combination of hardware and software that enables you to monitor and
control one or more clusters from a remote location.

[Figure 2-1 Cluster Administration Interface – a diagram showing the administration workstation running the administration tools, connected over the network to the Terminal Concentrator network interface. The TC serial ports 1 through 8 include the setup port (port 1, attached to a setup device); other serial ports are cabled to serial port A on Node 0 and Node 1.]


Major Elements
The relationship of the following elements is shown in Figure 2-1.

Administration Workstation

The administration workstation can be any Sun SPARC™ workstation,


providing it has adequate resources to support graphics and
compute-intensive applications, such as SunVTS and Solstice
SyMON software.

Administration Tools

There are several administration tools but only one of them, the
Cluster Console, is functional when the cluster host systems are at the
OpenBoot™ Programmable Read Only Memory (PROM) prompt.

Cluster Console is a tool that automatically links to each node in a


cluster through the TC. A text window is provided for each node. The
connection is functional even when the nodes are at the OK prompt
level. This is the only path available to boot the cluster nodes or
initially load the operating system software.

Terminal Concentrator (TC)

The TC provides translation between the local area network (LAN)


environment and the serial port interfaces on each node. The nodes do
not have display monitors so the serial ports are the only means of
accessing each node to run local commands.

Cluster Host Serial Port Connections

The cluster host systems do not have a display monitor or keyboard.


The system firmware senses this when power is turned on and directs
all system output through serial port A. This is a standard feature on
all Sun systems.


Terminal Concentrator Overview

The TC used in the Sun Enterprise Cluster systems has its own
internal OS and resident administration programs. The TC firmware is
specially modified for Sun Enterprise Cluster installation. Although
other terminal concentrators might seem similar, they should not be
used as a substitute.


As shown in Figure 2-2, the TC is a self-contained unit with its own


operating system. Part of its operating system is a series of
administrative programs.

[Figure 2-2 Terminal Concentrator Functional Diagram – the TC self-loads its operating system (OPER_52_ENET.SYS) from PROM into memory at power-on. It has a network interface and serial ports 1 through 8; the setup port (port 1) attaches to a setup device, and other ports are cabled to serial port A on Node 0 and Node 1.]

Caution – If the PROM-based operating system is older than version 52, it must be upgraded.


Operating System Load


You can set up the TC to load its operating system either internally
from the resident PROM, or externally from a server. In the cluster
application, it is always set to load internally. Placing the operating
system on an external server can actually decrease the reliability of the
terminal concentrator.

When power is first applied to the TC, it performs the following steps:

1. A PROM-based self-test is run and error codes are displayed.

2. A PROM-based OS is loaded into the TC resident memory.

Setup Port
Serial port 1 on the TC is a special purpose port that is used only
during initial setup. It is used primarily to set up the IP address and
load sequence of the TC. Port 1 access can be either from
a tip connection or from a locally connected terminal.

Terminal Concentrator Setup Programs


You must configure the TC non-volatile random-access memory
(NVRAM) with the appropriate IP address, boot path, and serial port
information. You use the following resident programs to specify this
information:

● Addr

● Seq

● Image

● Admin


Terminal Concentrator Setup

The TC must be configured for proper operation. Although the TC


setup menus seem simple, they can be confusing and it is easy to make
a mistake. You can use the default values for many of the prompts.


Connecting to Port 1
Before you can perform the initial TC setup, you must first make an
appropriate connection to its setup port. Figure 2-3 shows a tip
hardwire connection from the administration workstation.

[Figure 2-3 Setup Connection to Port 1 – a serial cable runs from the administration workstation to port 1 of the TC serial ports 1 through 8.]

Note – You can also connect an American Standard Code for Information
Interchange (ASCII) terminal to the setup port instead of using the
administration workstation.
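If you use the tip connection, the default Solaris hardwire entry in /etc/remote is normally sufficient, assuming the administration workstation serial port B is cabled to TC port 1; adjust the dv (device) field if you use a different serial port. The following is a sketch of the entry and of the connection command:

hardwire:\
        :dv=/dev/term/b:br#9600:el=^C^S^Q^U^D:ie=%$:oe=^D:

admin-ws# tip hardwire
connected

Type ~. (tilde, period) at the start of a line to end the tip session.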

Enabling Setup Mode


To enable Setup mode, press the TC Test button shown in Figure 2-4
until the TC power indicator begins to blink rapidly, then release the
Test button and press it again briefly.

[Figure 2-4 Terminal Concentrator Test Button – the TC front panel carries the STATUS indicators (POWER, UNIT, NET, ATTN, LOAD, ACTIVE, and ports 1 through 8); the power indicator and the Test button are on this panel.]

After you have enabled Setup mode, a monitor:: prompt should


appear on the setup device. Use the addr, seq, and image commands
to complete the configuration.


Setting the Terminal Concentrator IP Address


The following example shows how to use the addr program to set the
IP address of the TC. Usually this is set correctly when your cluster
arrives but you should always verify this.

monitor:: addr
Enter Internet address [192.9.22.98]::
129.150.182.100
Enter Subnet mask [255.255.255.0]::
Enter Preferred load host Internet address
[192.9.22.98]:: 129.150.182.100
Enter Broadcast address [0.0.0.0]:: 129.150.182.255
Enter Preferred dump address [192.9.22.98]::
129.150.182.100
Select type of IP packet encapsulation
(ieee802/ethernet) [<ethernet>]::
Type of IP packet encapsulation: <ethernet>

Load Broadcast Y/N [Y]::

Setting the Terminal Concentrator Load Source


The following example shows how to use the seq program to specify
the type of loading mechanism to be used.

monitor:: seq
Enter a list of 1 to 4 interfaces to attempt to use
for downloading code or upline dumping. Enter them in
the order they should be tried, separated by commas
or spaces. Possible interfaces are:

Ethernet: net
SELF: self

Enter interface sequence [self]::



The self response configures the TC to load its OS internally from the
PROM when you turn the power on. The PROM image is currently
called OPER_52_ENET.SYS.

Enabling the self-load feature negates other setup parameters that


refer to an external load host and dump host, but you must still define
them during the initial setup sequence.

Specify the Operating System Image


Even though the self-load mode of operation negates the use of an
external load and dump device, you should still verify the operating
system image name as shown by the following:

monitor:: image

Enter Image name [“oper_52_enet”]::


Enter TFTP Load Directory [“9.2.7/”]::
Enter TFTP Dump path/filename
[“dump.129.150.182.100”]::

monitor::

Note – Do not define a dump or load address that is on another


network because you will see additional questions about a gateway
address. If you make a mistake, you can press Control-c to abort the
setup and start again.


Setting the Serial Port Variables


The TC port settings must be correct for proper cluster operation. This
includes the type and mode port settings. Port 1 requires different
type and mode settings. You should verify the port settings before
installing the cluster host software. The following is an example of the
entire procedure:

admin-ws# telnet terminal_concentrator_name


Trying terminal concentrator IP address
Connected to sec-tc.
Escape character is '^]'.
Rotaries Defined:
cli -
Enter Annex port name or number: cli
Annex Command Line Interpreter * Copyright 1991
Xylogics, Inc.
annex: su
Password: type the password
annex# admin
Annex administration MICRO-XL-UX R7.0.1, 8 ports
admin: show port=1 type mode
Port 1:
type: hardwired mode: cli
admin:set port=1 type hardwired mode cli
admin:set port=2-8 type dial_in mode slave
admin: quit
annex# boot
bootfile: <CR>
warning: <CR>.

Note – This procedure is not performed through the special setup port
but through public network access.


Terminal Concentrator Troubleshooting

Occasionally it is useful to be able to manually manipulate the terminal
concentrator. The commands to do this are not well documented in the
cluster manuals.

Manually Connecting to a Node


If the cconsole tool is not using the terminal concentrator serial ports,
you can use the telnet command to connect to a specific serial port as
follows:
# telnet tc_name 5002

You can then log in to the node attached to port 5002. After you have
finished and logged out of the node, you must break the telnet
connection with the Control ] keyboard sequence and then type quit.
If you do not, the serial port will be locked against use by other
applications, such as the cconsole tool.
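A minimal sketch of such a session follows (tc_name is a placeholder for your
terminal concentrator name, and the node output is abbreviated):

# telnet tc_name 5002
Escape character is '^]'.
... log in to the node, work, then log out of the node ...
^]                          (type Control-] to reach the telnet prompt)
telnet> quit
#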


Using the telnet Command to Abort a Node


If you have to abort a cluster node, you can either use the telnet command
to connect to the node console port (as described in the previous section)
and type the Control-] keyboard sequence, or type the Control-] keyboard
sequence in a cconsole window. Once you have the telnet prompt, you can
abort the node with the following command:
telnet > send brk
ok

Note – You might have to repeat the command more than once.

Connecting to the Terminal Concentrator CLI


As shown below, you can use the telnet command to connect
directly to the TC, and then use the resident command line interpreter
to perform status and administration procedures.
# telnet IPaddress
Trying 129.146.241.135...
Connected to 129.146.241.135
Escape character is '^]'.

Enter Annex port name or number: cli

Annex Command Line Interpreter * Copyright 1991


Xylogics, Inc.
annex:

Using the Terminal Concentrator help Command


After you connect directly into a terminal concentrator, you can get
online help as shown below.
annex: help
annex: help hangup
annex: help stats


Identifying and Resetting a Locked Port


If a node crashes, it can leave a telnet session active that effectively
locks the port from further use. You can use the telnet command to
connect into the terminal concentrator, use the who command to
identify which port is locked, and then use the admin utility to reset
the locked port. The command sequence is as follows:
annex: who
annex: su
Password:
annex# admin
Annex administration MICRO-XL-UX R7.0.1, 8 ports
admin : reset 6
admin : quit
annex# hangup

Exercise: Configuring the Terminal Concentrator

Exercise objective – In this exercise you will:

▼ Verify the correct TC cabling

▼ Configure the TC IP address

▼ Configure the TC to self-load

▼ Verify the TC port settings

▼ Verify that the TC is functional

Preparation
Before starting this lab, you must obtain an IP address assignment for
your terminal concentrator. Record it below.

TC IP address: ____________________

Tasks
The following tasks are explained in this section:

● Verifying the network and host system cabling

● Connecting a local terminal

● Connecting tip hardwire

● Achieving setup mode

● Configuring the IP address

● Configuring the Terminal Concentrator to self-load

● Verifying the self-load process

● Verifying the TC port settings


TCs can have one or two Ethernet connections, depending on their age.
All TC generations have the same serial port connections.

Before you begin to configure the TC, you must verify that the
network and cluster host connections are correct. The port-to-node
connection configuration will be used when configuring the cluster
administration workstation in a later lab.

Verifying the Network and Host System Cabling

1. Inspect the rear of the TC and make sure it is connected to a public
network.

Figure 2-5 Terminal Concentrator Network Connection

2. Verify that the serial ports are properly connected to the cluster
host systems. Each output should go to serial port A on the
primary system board of each cluster host system.

Figure 2-6 Concentrator Serial Port Connections

Note – In three and four-node clusters, there are additional serial port
connections to the TC.


To set up the TC, you can either connect a “dumb” terminal to serial
port 1 or use the tip hardwire command from a shell on the
administration workstation.

If you are using a local terminal connection, continue with the next
section, “Connecting a Local Terminal.” If you are using the
administration workstation, proceed to the ‘‘Connecting Tip
Hardwire’’ section on page 2-21.

Connecting a Local Terminal

1. Connect the local terminal to serial port 1 on the back of the TC
using the cable supplied.

Figure 2-7 Concentrator Local Terminal Connection

Note – Do not use a cable length over 500 feet. Use a null modem
cable.

2. Verify that the local terminal operating parameters are set to 9600
baud, 7 data bits, no parity, and one stop bit.

3. Proceed to the ‘‘Achieving Setup Mode’’ section on page 2-22.


Perform the following procedure only if you are using the tip
connection method to configure the TC. If you have already connected
to the TC with a “dumb” terminal, skip this procedure.

Connecting Tip Hardwire

1. Connect serial port B on the administration workstation to serial
port 1 on the back of the TC using the cable supplied.

Figure 2-8 Concentrator to Tip Hardwire Connection

Note – Do not use a cable length over 500 feet. Use a null modem
cable.

2. Verify that the hardwire entry in the /etc/remote file matches the
serial port you are using:
hardwire:\
:dv=/dev/term/b:br#9600:el=^C^S^Q^U^D:ie=%$:oe=^D:

The baud rate must be 9600, and the serial port designator must match
the serial port you are using.

3. Open a shell window on the administration workstation and make the
tip connection by typing the following command:
# tip hardwire


Before the TC configuration can proceed, you must first place it in its
Setup mode of operation. Once in Setup mode, the TC accepts
configuration commands from a serial device connected to Port 1.

Achieving Setup Mode

1. To enable Setup mode, press and hold the TC Test button until the
TC power indicator begins to blink rapidly, then release the Test
button and press it again briefly.

Figure 2-9 Enabling Setup Mode on the Terminal Concentrator

2. After the TC completes its power-on self-test, you should see the
following prompt on the shell window or on the local terminal:
monitor::

Note – It can take a minute or more for the self-test process to complete.


In the next procedure, you will set up the TC’s network address with
the addr command. This address must not conflict with any other
network systems or devices.

Configuring the IP Address

Verify that the TC internet address and preferred load host address are
set to your assigned value.

1. To configure the Terminal Concentrator IP address using the addr
command, type addr at the monitor:: prompt.

monitor:: addr
Enter Internet address [192.9.22.98]:: 129.150.182.101
Enter Subnet mask [255.255.255.0]::
Enter Preferred load host Internet address [192.9.22.98]:: 129.150.182.101
***Warning: Local host and Internet host are the same***
Enter Broadcast address [0.0.0.0]:: 129.150.182.255
Enter Preferred dump address [192.9.22.98]:: 129.150.182.101
Select type of IP packet encapsulation (ieee802/ethernet) [<ethernet>]::
Type of IP packet encapsulation: <ethernet>

Load Broadcast Y/N [Y]:: Y

monitor::


When the TC is turned on, you must configure it to load a small operating
system. You can use the seq command to define the location of the operating
system and the image command to define its name.

Configuring the Terminal Concentrator to Self-Load

1. To configure the TC to load from itself instead of trying to load from a
network host, type the seq command at the monitor:: prompt.

monitor:: seq

Enter a list of 1 to 4 interfaces to attempt to use for downloading code
or upline dumping. Enter them in the order they should be tried, separated
by commas or spaces. Possible interfaces are:

Ethernet: net
SELF: self

Enter interface sequence [self]::

2. To configure the TC to load the correct operating system image, type the
image command at the monitor:: prompt.

monitor:: image
Enter Image name [“oper.52.enet”]::
Enter TFTP Load Directory [“9.2.7/”]::
Enter TFTP Dump path/filename [“dump.129.150.182.101”]::

3. If you used a direct terminal connection, disconnect it from the TC
when finished.

4. If you used the tip hardwire method, break the tip connection
by typing the ~. sequence in the shell window.


Before proceeding, you must verify that the TC can complete its self-
load process and that it will answer to its assigned IP address.

Verifying the Self-load Process

1. Turn off the TC power for at least 10 seconds and then turn it on
again.

2. Observe the light emitting diodes (LEDs) on the TC front panel.


After the TC completes its power-on self-test and load routine, the
front panel LEDs should look like the following:

Table 2-1 LED Front Panel Settings

Power    Unit     Net      Attn     Load     Active
(Green)  (Green)  (Green)  (Amber)  (Green)  (Green)

ON       ON       ON       OFF      OFF      Intermittent blinking

Note – It takes at least 1 minute for the process to complete. The Load
light extinguishes after the internal load sequence is complete.

Verifying the Terminal Concentrator Pathway

Complete the following steps on the administration workstation from a shell
or command tool window:

1. Test the network path to the TC using the following command:

# ping IPaddress

Note – Substitute the IP address of your TC for IPaddress.
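If the path is good, the ping command reports that the target is alive, as
in the following example (the address shown is only an illustration):

# ping 129.146.241.135
129.146.241.135 is alive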


You must set the TC port variable type to dial_in for each of the TC serial
ports that connect to cluster nodes (ports 2 through 8). If it is set to
hardwired, the cluster console might be unable to detect when a port is
already in use. There is also a related variable called imask_7bits that you
must set to Y.

Verifying the TC Port Settings

You can verify and, if necessary, modify the type, mode, and imask_7bits
port settings with the following procedure.

1. On the administration workstation, use the telnet command to connect to
the TC. Do not use a port number.
# telnet IPaddress
Trying 129.146.241.135...
Connected to 129.146.241.135
Escape character is '^]'.

2. Enable the command line interpreter, su to the root account, and start
the admin program.

Enter Annex port name or number: cli

Annex Command Line Interpreter * Copyright 1991 Xylogics, Inc.

annex: su
Password:
annex# admin
Annex administration MICRO-XL-UX R7.0.1, 8 ports

admin :

Note – By default, the superuser password is the TC IP address. This
includes the periods in the IP address.


3. Use the show command to examine the current setting of all ports.
admin : show port=1-8 type mode

4. Perform the following procedure to change the port settings and to end
the TC session.
admin:set port=1 type hardwired mode cli
admin:set port=2-8 type dial_in mode slave
admin: quit
annex# boot
bootfile: <CR>
warning: <CR>
Connection closed by foreign host.

Note – It takes at least 1 minute for the process to complete. The Load
light extinguishes after the internal load sequence is complete.

Terminal Concentrator Troubleshooting

1. On the administration workstation, use the telnet command to connect to
the TC. Do not use a port number.
# telnet IPaddress
Trying 129.146.241.135...
Connected to 129.146.241.135
Escape character is '^]'.

2. Enable the command line interpreter.

Enter Annex port name or number: cli

Annex Command Line Interpreter * Copyright 1991 Xylogics, Inc.

annex:

3. Practice using the help and who commands.

4. End the session with the hangup command.


Exercise Summary
Discussion – Take a few minutes to discuss what experiences, issues,
or discoveries you had during the lab exercises.

● Experiences

● Interpretations

● Conclusions

● Applications

Check Your Progress

Before continuing on to the next module, check that you are able to
accomplish or answer the following:

❑ Describe the Sun Enterprise Cluster administrative interface

❑ Explain the TC hardware configuration

❑ Verify the correct TC cabling

❑ Configure the TC IP address

❑ Configure the TC to self-load

❑ Verify the TC port settings

❑ Verify that the TC is functional

❑ Use the terminal concentrator help, who, and hangup commands

❑ Describe the purpose of the telnet send brk command

Think Beyond

Is there a significant danger if the TC port variables are not set correctly?

Is the Terminal Concentrator a single point of failure? What would happen if
it failed?

Administration Workstation Installation 3

Objectives

Upon completion of this module, you should be able to:

● Summarize the Sun Cluster administration workstation functions

● Use the scinstall script features

● Install the client software on the administration workstation

● Set up the administration workstation environment

● Configure the Sun Cluster administration tools

This module describes the installation process for the Sun Cluster
software on the Administration Workstation.

Relevance

Discussion – The following question is relevant to understanding this
module’s content:

1. How important is the administration workstation during the configuration
of the cluster host systems?

Additional Resources

Additional resources – The following references can provide additional
details on the topics discussed in this module:

● Sun Cluster 2.2 System Administration Guide, part number 805-4238

● Sun Cluster 2.2 Software Installation Guide, part number 805-4239


Sun Enterprise Cluster Software Summary

Sun Cluster software is installed on a Sun Enterprise Cluster hardware
platform. The Sun Cluster software is purchased separately from each of the
supported volume managers. The complete software collection consists of the
following CDs:

● Sun Cluster 2.2

● Cluster Volume Manager 2.2.1

● Sun StorEdge Volume Manager 2.6

● Solstice DiskSuite 4.2

Note – You can use the Solstice DiskSuite volume manager with the
Solaris 7 Operating Environment version of the Sun Cluster product.
You must use the other supported volume managers with the
Solaris 2.6 Operating Environment version of the Sun Cluster product.
The administration workstation can be configured with either version
of the Solaris Operating Environment independently of the cluster
host systems.


As shown in Figure 3-1, the Sun Cluster client software is installed on the
administration workstation and the Sun Cluster server software is installed
on each of the cluster host systems along with the appropriate volume
management software.

Figure 3-1 Sun Cluster Software Distribution


Sun Cluster Software Installation


The Sun Cluster CD-ROM contains a software administration program
called scinstall. The scinstall script is used to install all Sun
Cluster software.
Caution – There are many package dependencies. You should not try to
manually add packages unless instructed to do so by formally released
procedures.

Administrative Workstation Software Packages

The client Package Set

The client packages are installed only on the administration workstation and
include the following:

● SUNWscch – Sun Cluster common help files

● SUNWccp – Sun Cluster control panel

● SUNWscmgr – Sun Cluster manager

● SUNWccon – Sun Cluster console

● SUNWscsdb – Sun Cluster serialports/clusters database

● SUNWcsnmp – Sun Cluster Simple Network Management Protocol (SNMP) agent

Note – The Sun Cluster installation program package, SUNWscins, is also
installed.


Software Installation Program

You use the scinstall command to install the Sun Cluster CD-ROM
packages. Run in its normal interactive mode, scinstall prompts you
for all of the information that it requires to properly install the Sun
Cluster software.

For the administrative workstation, there is little information required,
although the scinstall command provides many options.


scinstall Command Line Options


The scinstall command recognizes the following options when
initially started:
scinstall [-a|-c|-s] [[-i|-u|-o|-V][-l]] [-h] [-d package_dir] [-A admin_file]

-a              Loads all the packages (client and server)

-c              Loads the client administration packages

-s              Loads the server packages

-i              Installs the selected packages (use with -[acs])

-u              Uninstalls the selected packages (use with -[acs])

-o              Uninstalls obsolete packages

-V              Verifies the installation

-l              Lists installed client and server packages

-d package_dir  Indicates where to find the packages

-A admin_file   Use the specified package administration file

-h              Prints a help message

Note – If no options are provided, the program prompts interactively. This
is the recommended method.
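For example, based on the option summary above, a hedged sketch of a
non-interactive client installation and follow-up checks might look like the
following (the package directory path is an assumption; substitute the
location of your Sun Cluster CD-ROM image):

# ./scinstall -c -i -d /cdrom/suncluster_sc_2_2
# ./scinstall -l
# ./scinstall -V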


Sun Cluster Installation Program Startup

When you run the scinstall script without any command-line startup options,
it manages several complex tasks using simple interactive prompts. It
assumes that the operating system (OS) is properly installed and configured.

When you start the scinstall script as shown below, the SUNWscins
package is installed before anything else occurs. This program
manages the installation.

# cd /cdrom/suncluster_sc_2_2/Sun_Cluster_2_2/Sol_2.6/Tools
# ./scinstall

Installing: SUNWscins

Installation of <SUNWscins> was successful.

Checking on installed package state .............

None of the Sun Cluster software has been installed


Initial Installation Startup


During the next phase of the scinstall program shown below, you
must define which Sun Cluster package set you wish to administer.

Choose one:

1) Upgrade              Upgrade to Sun Cluster 2.2 Server packages
2) Server               Install the Sun Cluster packages needed on a server
3) Client               Install the admin tools needed on an admin workstation
4) Server and Client    Install both Client and Server packages
5) Close                Exit this Menu
6) Quit                 Quit the Program

Enter the number of the package set [6]: 3

Normally, the Server and Client packages are not installed on the same
system.

Note – When you select the server option, a complex dialogue is displayed,
which requires detailed input about the intended cluster configuration. Do
not start the server installation until you are prepared to answer all
questions.

As shown in the following example, you must define a legitimate source for
the Sun Cluster software.

What is the path to the CD-ROM image [/cdrom/cdrom0]: /cdrom/suncluster_sc_2_2

You must define the full path name to the Sun_Cluster_2_2 directory.

Note – There are detailed upgrade procedures in the Sun Cluster 2.2
Software Installation Guide in chapter 4, Upgrading Sun Cluster Software.


Existing Installation Startup


If you run the scinstall program on a cluster host that already has
the current Sun Cluster software configured, you see a different
startup dialogue.

1) Install/Upgrade   Install or Upgrade Server Packages or Install
                     Client Packages.

2) Remove            Remove Server or Client Packages.

3) Change            Modify cluster or data service configuration

4) Verify            Verify installed package sets.

5) List              List installed package sets.

6) Quit              Quit this program.

7) Help              The help screen for this menu.

Please choose one of the menu items: [7]:

You can use option 3 (Change) to modify the cluster configuration if you
made too many mistakes during the initial installation. The change option
allows you to:

● Change information about the nodes in the cluster

● Add data service packages onto the system

● Remove data service packages from the system

● Remove the volume manager software

● Change the logical host configurations

● Reinitialize the NAFO group configurations


Installation Mode
Before you start the installation process, you must select the mode of
installation. A typical dialogue is shown in the following output.

Installing Client packages

Installing the following packages: SUNWscch SUNWccon SUNWccp SUNWcsnmp SUNWscsdb

>>>> Warning <<<<

The installation process will run several scripts as root. In addition, it
may install setUID programs. If you choose automatic mode, the installation
of the chosen packages will proceed without any user interaction. If you
wish to manually control the install process you must choose the manual
installation option.

Choices:
    manual      Interactively install each package
    automatic   Install the selected packages with no user interaction.

In addition, the following commands are supported:
    list        Show a list of the packages to be installed
    help        Show this command summary
    close       Return to previous menu
    quit        Quit the program

Install mode [manual automatic] [automatic]: automatic

In most situations, you should select the automatic mode of installation
because of complex package dependencies.


Administration Workstation Environment

New Search and Man Page Paths


Depending on which shell the system uses, add the following search
path and man path entries to the .profile or .cshrc files for user
root:

● /opt/SUNWcluster/bin

● /opt/SUNWcluster/man
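For example, a minimal Bourne shell .profile fragment for user root might
look like the following sketch (the other path components simply mirror the
values used in the lab exercise later in this module; keep whatever your
site already uses):

PATH=/sbin:/usr/sbin:/usr/bin:/opt/SUNWcluster/bin:/usr/openwin/bin:/usr/ucb:/etc:.
MANPATH=/usr/man:/opt/SUNWcluster/man:/usr/dt/man:/usr/openwin/man
export PATH MANPATH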


Host Name Resolution Changes


As shown in the following example, you can enter the IP addresses
and the hostnames of the TC and all cluster hosts in the administration
workstation’s /etc/inet/hosts file even if you are using a naming
service. This gives you access to the cluster host systems by name even
if your naming service fails.
127.0.0.1 localhost
129.146.53.57 adminws1 loghost
129.146.53.60 sc-tc
129.146.53.61 sc-node0
129.146.53.62 sc-node1

Note – To ensure that this works, specify files before any other name
service on the hosts line of /etc/nsswitch.conf.
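For example, the hosts entry in /etc/nsswitch.conf might read as follows
(the nis entry is only an assumption; your site might use a different naming
service):

hosts: files nis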

Remote Login Control


You can control user root logins on the administration workstation in
three ways. Edit the /etc/default/login file and modify the line
CONSOLE=/dev/console in one of the following ways:

# CONSOLE=/dev/console   Allows the user to log in remotely as root
                         from any other workstation

CONSOLE=/dev/console     Requires the user to log in as root only from
                         the workstation keyboard

CONSOLE=                 Requires the user to log in as another user and
                         then use the su command to change to root

Note – An advantage of the CONSOLE= form of login control is that a log is
kept of all su logins in the /var/adm/sulog file.


Remote Display Enabling


If the cluster host systems need to display window-based applications on the
administration workstation, edit the .xinitrc file and, just above the last
line (wait), add xhost hostname1 hostname2, as shown in the sketch that
follows.
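A sketch of the end of a typical .xinitrc follows (the node names are
examples only; preceding lines are omitted):

xhost sc-node0 sc-node1
wait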

Controlling rcp and rsh Access


Figure 3-2 illustrates how remote access is authenticated when you use the
rlogin, rcp, or rsh commands to move files between the cluster host systems
or to check status.

The flowchart covers the case of user JD on host A attempting to rlogin,
rcp, or rsh to host X. The user must exist in /etc/passwd on host X. For a
non-root user, access is allowed if host A is listed in /etc/hosts.equiv or
in $HOME/.rhosts; for the superuser, only /.rhosts is consulted. If neither
file grants access, rlogin falls back to a password prompt, while rcp and
rsh are denied.

Figure 3-2 Remote Authentication Flowchart
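As a hedged illustration of the files involved (the host and user names are
examples, not part of the figure), the following $HOME/.rhosts entry on
host X allows user jd to use rcp and rsh from host A without a password:

hostA jd

An /etc/hosts.equiv entry on host X that trusts non-root users from host A
contains just the host name:

hostA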


Cluster Administration Tools Configuration

All of the information needed by the cluster administration tools that run
on the administration workstation is configured in two files on that
workstation:

/etc/clusters

/etc/serialports

When the Sun Cluster client software is installed on the administration
workstation, two blank files are created. You must edit the files and supply
the necessary information.


Cluster Administration Interface


As shown in Figure 3-3, the cluster administration interface is
composed of both hardware and software elements.

Figure 3-3 Cluster Administration Components


Administration Tool Configuration Files


The following is a typical entry in the /etc/clusters file:

sc-cluster sc-node0 sc-node1

The single-line entry defines a cluster named sc-cluster that has two
nodes named sc-node0 and sc-node1.

Note – The cluster name is purely arbitrary, but it should agree with
the name you use when you install the server software on each of the
cluster host systems.

The following is a typical entry in the /etc/serialports file:

sc-node0 sc-tc 5002
sc-node1 sc-tc 5003

There is a line for each cluster host that describes the name of each
host, the name of the terminal concentrator, and the terminal
concentrator port to which each host is attached.

For the E10000, the /etc/serialports entries for each cluster domain
are configured with the domain name, the System Service Processor
(SSP) name, and (always) the number 23, which represents the telnet
port.

sc-10knode0 sc10k-ssp 23
sc-10knode1 sc10k-ssp 23

Note – When upgrading the cluster software, the /etc/serialports and
/etc/clusters files are overwritten. You should make a backup copy before
starting the upgrade.
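For example (the backup file names are arbitrary):

# cp /etc/clusters /etc/clusters.orig
# cp /etc/serialports /etc/serialports.orig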


Cluster Administration Tools

The cluster administration tools are used to manage a cluster. They provide
many useful features including:

● Centralized tool bar

● Command line interface to each cluster host

● Overall cluster status tool

The cluster administration tools are accessed by using the ccp program.

Note – The Cluster Manager tool is not discussed until a later module.
It is not operational until the cluster is fully configured.


The Cluster Control Panel


As shown in Figure 3-4, the Cluster Control Panel provides centralized
access to several Sun Cluster administration tools.

Figure 3-4 Cluster Control Panel

Cluster Control Panel Start-up

To start the Cluster Control Panel, type the following command:

# /opt/SUNWcluster/bin/ccp [clustername] &

Adding New Applications to the Cluster Control Panel

The Cluster Control Panel New Item display is available under the
Properties menu and allows you to add custom applications and icons
to the Cluster Control Panel display.


Cluster Console
The Cluster Console tool uses the TC to access the cluster host systems
through serial port interfaces. The advantage of this is that you can
connect to the cluster host systems even when they are halted. This is
useful when booting the systems and is essential during initial cluster
host configuration when loading the Solaris operating system.

As shown in Figure 3-5, the Cluster Console tool uses xterm windows
to connect to each of the cluster host systems.

Figure 3-5 Cluster Console Windows


Manually Starting the cconsole Tool

You can use cconsole manually to connect to individual host systems or to
the entire cluster. The cluster form starts windows for all of the cluster
hosts.

/opt/SUNWcluster/bin/cconsole node0
/opt/SUNWcluster/bin/cconsole eng-cluster
/opt/SUNWcluster/bin/cconsole node3

Cluster Console Host Windows

There is a host window for each node in the cluster. You can enter
commands in each host window separately.

The host windows all appear to be vt220 terminals. Set the TERM
environment variable to vt220 to use the arrow and other special
keys.
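For example, in a Bourne or Korn shell:

TERM=vt220; export TERM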

Cluster Console Common Window

The common window shown in Figure 3-6 lets you enter commands to
all host system windows at the same time. All of the windows are tied
together, so that when the Common Window is moved, the host
windows follow. The Options menu allows you to ungroup the
windows, move them into a new arrangement, and group them again.

Figure 3-6 Cluster Console Common Window


Cluster Console Window Variations

There are three variations of the cluster console tool that each use a
different method to access the cluster hosts. They all look and behave
the same way.

● Cluster Console (console mode)

Access to the host systems is made through the TC interface. The Solaris
operating system does not have to be running on the cluster host systems.

Only one connection at a time can be made through the TC to serial port A of
a cluster node. You cannot start a second instance of cconsole for the same
cluster.

For E10000 domains, a telnet connection is made to the ssp account on the
domain’s SSP, then a netcon session is established. The SSP name and ssp
account password are requested during the cluster node software installation
process.

● Cluster Console (rlogin mode)

Access to the host systems is made using the rlogin command, which uses the
public network. The Solaris OS must be running on the cluster host systems.

● Cluster Console (telnet mode)

Access to the host systems is made using the telnet command, which uses the
public network. The Solaris OS must be running on the cluster host systems.


Cluster Help Tool


The Cluster Help Tool provides comprehensive assistance in the use of
all Sun Cluster administration tools. Start the Cluster Help Tool by
selecting the Help Tool icon in the ccp.

Figure 3-7 shows a page from the Cluster Help Tool that describes the
Cluster Control Panel.

Figure 3-7 Cluster Help Tool

Exercise: Installing the Sun Cluster Client Software

Exercise objective – In this exercise you will:

▼ Install and configure the Sun Cluster client software on an
administration workstation

▼ Configure the Sun Cluster administration workstation environment for
correct Sun Cluster client software operation

▼ Start and use the basic features of the Cluster Control Panel,
the Cluster Console, and the Cluster Help tools.

Preparation
This lab assumes that the Solaris 2.6 operating system software has
already been installed on all of the systems.

1. Remove the ‘‘Cluster Name and Address Information’’ section on page A-2
and complete the first three entries.

Note – Ask your instructor for assistance with the information.


Tasks
The following tasks are explained in this section:

● Updating the name service

● Installing OS patches

● Running the scinstall utility

● Setting up the root user environment

● Configuring the /etc/clusters file

● Configuring the /etc/serialports file

● Starting the cconsole tool

Updating the Name Service


1. If necessary, edit the /etc/hosts file on the administrative
workstation and add the IP addresses and names of the Terminal
Concentrator and the host systems in your cluster.

Installing OS Patches
1. Install any recommended OS patches on the administrative
workstation before running scinstall.

2. Reboot the administrative workstation after installing the patches.


Running the scinstall Utility


1. Log in to your administration workstation as user root.

2. Move to the Sun_Cluster_2_2/Sol_2.6/Tools directory.

Note – Either load the Sun Cluster CD-ROM or move to the location
provided by your instructor.

3. Start the Sun Cluster installation script.


# ./scinstall

4. Select option 3 to install the client software.

5. If necessary, enter the path to the Sun Cluster software packages.

Note – The software needs to know only the portion of the path that is
needed to locate the Sun_Cluster_2_2 directory.

6. Select the automatic mode of installation.

7. After the SUNWscch, SUNWccon, SUNWccp, SUNWcsnmp, and SUNWscsdb packages
have been installed successfully, the scinstall program should display the
main menu again.

8. Use options 3 and 4 to verify and list the installed Sun Cluster
software.

9. Select option 5 to quit the scinstall program.


Configuring the Administration Workstation Environment


Caution – Check with your instructor before performing the steps in this
section. Your login files might already be properly configured by a
JumpStart operation. The steps in this section will destroy that setup.

1. Navigate to the training Scripts/ENV subdirectory.

2. Copy the admenv.sh file to /.profile on the administration workstation.

3. Exit the window system and then log out and in again as user
root.

Verifying the Administration Workstation Environment


1. Verify that the following search paths and variables are present:
PATH=/sbin:/usr/sbin:/usr/bin:/opt/SUNWcluster/bin:/usr/openwin/bin:/usr/ucb:/etc:.

MANPATH=/usr/man:/opt/SUNWcluster/man:/usr/dt/man:/usr/openwin/man

EDITOR=vi

2. Start the window system again on the administration workstation.


Configuring the /etc/clusters File


The /etc/clusters file has a single line entry for each cluster you
intend to monitor. The entries are in the form:

ClusterName host0name host1name host2name host3name

Sample /etc/clusters File

sec-cluster sec-node0 sec-node1 sec-node2

1. Edit the /etc/clusters file and add a line using the cluster and
node names assigned to your system.

Configuring the /etc/serialports File


The /etc/serialports file has an entry for each cluster host
describing the connection path. The entries are in the form:

hostname tcname tcport

Sample /etc/serialports File

sec-node0 sec-tc 5002
sec-node1 sec-tc 5003
sec-node2 sec-tc 5004

1. Edit the /etc/serialports file and add lines using the node and
TC names assigned to your system.

Note – When you upgrade the cluster software, the /etc/serialports and
/etc/clusters files are overwritten. You should make a backup copy of these
files before starting an upgrade.


Starting the cconsole Tool


This section provides a good functional verification of the Terminal
Concentrator in addition to the environment configuration.

1. Make sure power is on for the TC and all of the cluster hosts.

2. Start the cconsole application on the administration workstation.

# cconsole clustername &

3. Place the cursor in the cconsole Common window and press Return several
times. You should see a response on all of the cluster host windows. If not,
ask your instructor for assistance.

Note – The cconsole Common window is useful for simultaneously loading the
Sun Cluster software on all of the cluster host systems.

4. If the cluster host systems are not booted, boot them now.

ok boot

5. After all cluster host systems have completed their boot, log in as
user root.

6. Practice using the Common window Group Term Windows feature under the
Options menu. You can ungroup the cconsole windows, rearrange them, and then
group them together again.


Configuring the Cluster Host Systems Environment


Caution – Check with your instructor before performing the steps in this
section. Your login files might already be properly configured by a
JumpStart operation. The steps in this section will destroy that setup.

1. Navigate to the training Scripts/ENV subdirectory on each cluster host
system.

2. Copy the nodenv.sh file to /.profile on each cluster host.

3. Edit the .profile file on each cluster host system and set the
DISPLAY variable to the name of the administration workstation.
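For example, the line in /.profile might read as follows (the workstation
name is a placeholder):

DISPLAY=adminws1:0.0; export DISPLAY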

4. Log out and in again as user root on each cluster host system.

Verifying the Cluster Host Systems Environment


1. Verify that the following environmental variables are present on
each cluster host.

PATH=/usr/opt/SUNWmd/sbin:/opt/SUNWcluster/bin:/opt/SUNWsma/bin:/opt/SUNWsci/bin:/usr/bin:/usr/sbin:/sbin:/usr/ucb:/etc:.

MANPATH=/usr/man:/opt/SUNWcluster/man:/opt/SUNWsma/man:/opt/SUNWvxvm/man:/opt/SUNWvxva/man:/usr/opt/SUNWmd/man

TERM=vt220

EDITOR=vi

DISPLAY=adminworkstation:0.0

Note – If necessary, edit the .profile file on each cluster host and set
the DISPLAY variable to the name of the administration workstation.


Verifying the Cluster Host Systems Environment (Continued)


2. On each cluster host system, create a /.rhosts file that contains a
plus (+) sign.

3. Edit the /etc/default/login file on each cluster host system and comment
out the console login control line as follows:
#CONSOLE=/dev/console

4. On the administration workstation, type xhost + in the console window to
enable remote displays.


Exercise Summary
Discussion – Take a few minutes to discuss what experiences, issues,
or discoveries you had during the lab exercises.

● Experiences

● Interpretations

● Conclusions

● Applications

Check Your Progress

Before continuing on to the next module, check that you are able to
accomplish or answer the following:

❑ Summarize the Sun Cluster administration workstation functions

❑ Use the scinstall script features

❑ Install the client software on the administration workstation

❑ Set up the administration workstation environment

❑ Configure the Sun Cluster administration tools

Think Beyond

What is the advantage of the /etc/clusters and /etc/serialports files?

Why is the Cluster SNMP agent installed on the administrative workstation?

What is the impact on the cluster if the administrative workstation is not
available? What would you do for backup?

Preinstallation Configuration 4

Objectives

Upon completion of this module, you should be able to:

● Configure any supported cluster topology

● List the appropriate applications for each topology

● Configure the cluster interconnect system

● Explain the need for a simple quorum device

● Estimate the number of quorum devices needed for each cluster topology

● Describe the purpose of the public network monitor feature

● Describe the purpose of a mirrored CCD volume

● Explain the purpose of the terminal concentrator node locking port

This module provides the information necessary to prepare a Sun Enterprise
Cluster (SEC) system for the Sun Cluster software installation.

Relevance

Discussion – The following question is relevant to your learning the
material presented in this module:

1. Why is so much preinstallation planning required for an initial software
installation?

Additional Resources

Additional resources – The following references can provide additional
details on the topics discussed in this module:

● Sun Cluster 2.2 System Administration Guide, part number 805-4238

● Sun Cluster 2.2 Software Installation Guide, part number 805-4239

● Sun Cluster 2.2 Cluster Volume Manager Guide, part number 805-4240


Cluster Topologies

You can configure a Sun Enterprise Cluster system in several ways, called
topologies. A topology is determined by the types of disk storage devices
used in a cluster and how they are physically connected to the cluster host
systems.

To be a supported configuration, all disk storage devices must have
dual-ported system access at a minimum.

You can use most of the topologies in different application environments,
but some lend themselves better to particular applications.


Clustered Pairs Topology


The clustered pairs configuration, shown in Figure 4-1, is essentially
a pair of two-node clusters that function independently of one
another but can be managed as a single cluster.

This configuration is easy to administer but the Sun Cluster software treats
it as a single four-node cluster when a planned or unplanned reconfiguration
happens. This causes service disruption on all of the nodes during a cluster
reconfiguration.

Figure 4-1 Clustered Pairs Topology Configuration

Target Applications

The clustered pairs configuration is suitable for applications in which
there are two highly available data services that depend on one another. The
HA-SAP application can effectively use the clustered pairs topology.


Ring Topology
The Ring topology, shown in Figure 4-2, allows each node to assume
the workload of either of two neighbors, should that neighbor fail.

Figure 4-2 Ring Topology Configuration

Target Applications

The ring topology configuration is used by Highly Available Data Service
(HADS) installations. It allows a great deal of flexibility when selecting
the backup node for a particular data service.

Note – For clarity, mirror devices are not shown in the Figure 4-2
drawing.


N+1 Topology
The N+1 topology, shown in Figure 4-3, provides one system to act as
the backup for every other system in the cluster. All of the secondary
paths to the storage devices are connected to the redundant or N+1
system, which can be running a normal workload of its own.

When any cluster system fails, it always fails over to the N+1 system.
The N+1 system must be large enough to support the workload of any
of the other systems in the cluster.

Figure 4-3 N+1 Topology Configuration

Target Applications

The N+1 topology is used by Highly Available Data Service (HADS)
installations. In this configuration, the backup node can take over without
any performance degradation, and it is more cost effective because it does
not require dedicated data storage.


Shared-Nothing Topology
The shared-nothing configuration, shown in Figure 4-4, requires only a
single connection to each disk storage unit.

Figure 4-4 Shared-Nothing Topology Configuration

Target Applications

The shared-nothing configuration is used only by the Informix Online XPS
database. The following limitations apply:

● There are no failover capabilities in the event of a node failure


Scalable Topology
The scalable topology configuration features uniform access to storage
array data from all nodes in a cluster. This configuration, shown in
Figure 4-5, must use the A5000 storage arrays, which allow
simultaneous connections from up to four nodes.

Figure 4-5 Scalable Topology Configuration

Target Applications

The scalable topology can be used by all Sun Enterprise Cluster
applications, including Oracle Parallel Server, Informix Online XPS, and all
high availability databases and data services.


Cluster Quorum Devices

Quorum devices are designated storage devices that are used to prevent data
corruption in a cluster when a node suddenly crashes or the cluster
interconnect fails. A quorum device can be either a single array disk or an
array controller, but it must be a device that is physically connected to at
least two cluster hosts.

Using a quorum device is one of several cluster strategies to help ensure
that storage array data is not corrupted by a failed cluster host system.
Quorum devices must be configured during the Sun Cluster software
installation. Depending on your topology, you will be asked to define from
one to as many as four quorum devices.

Note – Quorum disk drives are not used in Solstice DiskSuite (SDS)
installations. SDS uses a different method to ensure data integrity.


Disk Drive Quorum Device


As shown in Figure 4-6, when there is a complete cluster interconnect
failure in a two-node cluster, each node assumes the other is in an
uncontrolled state and must be prevented from accessing data.

Both nodes try to perform a Small Computer Systems Interface (SCSI)
reservation of the designated quorum device. The first node to reserve the
quorum disk remains as a cluster member.

In a two-node cluster, the node that fails the race to reserve the quorum
device will abort the Sun Cluster software.

The quorum device information is stored in a cluster database file that is
configured when the Sun Cluster software is installed.

In the figure, each node records the quorum device in its cluster database
(Node 0 as c3t0d0 and Node 1 as c2t0d0), and both nodes attempt to reserve
the same quorum disk in the shared storage array.

Figure 4-6 Quorum Disk Drive

Note – The physical device paths can be different from each attached
host system, as shown in Figure 4-6.


Array Controller Quorum Device


As shown in Figure 4-7, you can also configure an array controller as
the quorum device during the Sun Cluster software installation.

The race for the quorum controller is the same as for the quorum disk
drive. The losing node must abort the Sun Cluster software.

The worldwide number (WWN) of the array controller is used during the SCSI
reservation instead of the disk drive physical path.

In the figure, each node records the quorum device in its cluster database
by the array controller WWN, and both nodes attempt to reserve the same
controller in the shared storage array.

Figure 4-7 Quorum Array Controller

Caution – If an array controller is used as the quorum device, the array
must not contain any disks that are private to one of the nodes, such as a
boot disk. When the controller is reserved by one node, the other node can
no longer access its private disk.


Quorum Device in a Ring Topology


Quorum devices are shared between nodes than can both master the
device. As shown in Figure 4-8, the potential resource masters are
more complicated in the ring topology.

Figure 4-8 Potential Masters in a Ring Topology

● Node 0 and Node 2 can master Resource 1

● Node 0 and Node 1 can master Resource 2

● Node 1 and Node 2 can master Resource 3

Note – During the Sun Cluster software installation on a ring topology, you
will be asked to select a quorum device for each pair of resource masters.


Quorum Device in a Scalable Topology


If your cluster is configured in a scalable topology that uses direct
attach storage arrays, such as the StorEdge A5000, you will be asked to
configure a single quorum disk during the Sun Cluster software
installation. This quorum disk is not used in the same manner as on
dual-ported storage arrays and is not strictly required.

With the appearance of three- and four-node clusters using direct
access storage devices, such as the A5000, all nodes have equal access
to the disk drives. The SCSI reservation feature cannot be applied to
multiple nodes. Only one node can exclusively reserve the storage
array, but as many as three surviving nodes need access to the disks.

The scalable topology instead uses the terminal concentrator in a
scheme called failure fencing to ensure that a suspect node is prevented
from corrupting data.

Failure Fencing and Node Locking

If your cluster has more than two nodes and uses direct attach storage
devices, such as the StorEdge A5000, you are asked to select a node
locking port during the Sun Cluster software installation. You are
asked for the port number of an unused terminal concentrator port.

Does the cluster have a disk storage device that is
connected to all nodes in the cluster [no]? yes

Which unused physical port on the Terminal
Concentrator is to be used for node locking: 6


As shown in Figure 4-9, the solution to the failure fencing problem is
to use the Terminal Concentrator to issue an abort sequence to the
failed node, which ensures it cannot damage disk-resident data.

[Figure 4-9 shows four nodes (Node 0 through Node 3) attached to
direct attached storage and, over the Ethernet, to the terminal
concentrator (TC). A surviving node holds the cluster lock on an
unused TC port and uses the TC to send an abort to Node 3, whose
cluster interconnect has failed.]

Figure 4-9 Node Locking Overview

The node that issues the abort is the one that first obtained a “cluster
lock” by locking an unused TC port using the telnet utility.

The cluster lock is obtained by the first node that joins the cluster. It
can be transferred to another functional node in the event of a public
network failure. The cluster lock is always owned by the node with the
lowest Node ID.
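
The lock itself is simply an open telnet connection to the unused TC
port. As a manual illustration only (using the TC name sc-tc and
locking port 6 from the examples in this course; an Annex-style TC
presents serial port 6 as TCP port 5006), you can see whether the port
is free by trying to connect to it yourself:

# telnet sc-tc 5006

If the connection is refused or never completes, the port is probably
already held by a cluster member. If the connection succeeds, exit the
telnet session immediately so that you do not interfere with the lock.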

Note – The node locking function is used to prevent an operator from
accidentally starting a second cluster on a node that has been
partitioned by a complete cluster interconnect failure.


Cluster Interconnect System Overview

All Sun Enterprise Cluster installations must have a cluster
interconnect system (CIS), a dedicated high-speed interconnect, to
enable constant communication. The nodes need to exchange status
and configuration information, as well as provide a fast, efficient path
for some types of application data.

Note – During the Sun Cluster software installation, the CIS is also
referred to as the private network interface.


Interconnect Types
You can use two types of interconnect systems:

● 100base-T Ethernet-based interconnect

● Scalable Coherent Interface (SCI) interconnect

The SCI interconnect provides 100 Mbytes/sec of bandwidth and the
low latency needed by the Oracle Parallel Server (OPS) and Informix
XPS parallel database applications.

Interconnect Configurations
Depending on the number of cluster host systems, you can use two
interconnect configurations:

● Point-to-point for two-node clusters

● Interconnect hubs or switches for clusters of more than two nodes

Note – You can implement both configurations with Ethernet or SCI
hardware.

Figure 4-10 demonstrates the basic interconnect configuration rules:
corresponding physical interfaces are connected to each other, and the
connections are redundant to avoid a cluster-wide single point of
failure.

[Figure 4-10 shows Node 0 and Node 1 with hme0 connected to hme0
and hme1 connected to hme1 over two separate links.]

Figure 4-10 Basic Interconnect Configuration

Note – Ethernet crossover cables are required for the Ethernet-based
point-to-point cluster interconnect.


Cluster Interconnect System Configuration

Before installing the Sun Cluster software on the cluster host systems,
the CIS must be cabled correctly. This involves identifying the primary
and backup interfaces, as well as making the appropriate connections.

When the cluster software starts on each node, the lowest numbered
physical interconnect interfaces are activated first and are considered
the primary interfaces. The primary interfaces should all be connected
to the same hub or switch.

Any node that violates this configuration cannot join the cluster
because it will attempt to communicate through the wrong hub or
switch.

Only one interface is used at a time. When CIS failure is detected on
any node in the cluster, all nodes switch to their backup interconnects.
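
After the cluster is running, you can confirm which interconnect each
node is currently using. A minimal sketch, assuming the
get_ci_status utility delivered in /opt/SUNWcluster/bin with Sun
Cluster 2.2 (the output format varies by release):

# /opt/SUNWcluster/bin/get_ci_status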


Cluster Interconnect Addressing


During the Sun Cluster software installation, fixed IP addresses are
assigned to each interconnect interface in the cluster. Each interconnect
pair on a node is also assigned a virtual IP address, which is used to
switch from one interconnect to the backup in the event of failure.

The physical and virtual addresses are shown in Figure 4-11. The
addresses are licensed to Sun Microsystems for its exclusive use.

Figure 4-11 (reconstructed as a table) lists the cluster interconnect
address assignments:

Node          Interface     Physical Address   Virtual Address
First node    hme0/scid0    204.152.65.1       204.152.65.33
              hme1/scid1    204.152.65.17
Second node   hme0/scid0    204.152.65.2       204.152.65.34
              hme1/scid1    204.152.65.18
Third node    hme0/scid0    204.152.65.3       204.152.65.35
              hme1/scid1    204.152.65.19
Fourth node   hme0/scid0    204.152.65.4       204.152.65.36
              hme1/scid1    204.152.65.20

Figure 4-11 Cluster Interconnect Address Assignments
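
As a quick check after the cluster software is running, the physical
private network addresses in Figure 4-11 should answer ping from the
peer nodes. For example, from the first node (addresses as assigned
above; the reply shown is the standard Solaris ping response):

# ping 204.152.65.2
204.152.65.2 is alive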


Point-to-Point Connections
Point-to-point connections are used for two-node clusters only and can
be Ethernet or SCI based. You must ensure that similar logical
interfaces are connected. As shown in Figure 4-12, the logical interface
hme0 on Node 0 must be physically connected to the logical interface
hme0 on Node 1. The same is true for the logical hme1 interfaces.

[Figure 4-12 shows Node 0 and Node 1 with the primary link
connecting hme0 (or scid0) to hme0 (or scid0) and the backup link
connecting hme1 (or scid1) to hme1 (or scid1).]

Figure 4-12 Cluster Interconnect High Availability Cabling

Caution – Any time a SCI component is moved or replaced, you must
run the /opt/SUNWsma/bin/sm_config script or your system will not
function reliably.

To avoid a single point of failure, there are always two separate
interfaces on each node for the CIS or private network. If any cluster
member loses communication through the cluster interconnect, all
cluster members switch to their backup interfaces to try to reestablish
communication with one another.

The same rules apply when using the SCI interfaces.


SCI High-Speed Switch Connection


If a three- or four-node cluster uses the high-speed SCI switch
configuration, shown in Figure 4-13, care must be taken when initially
connecting the cables. If it is not clear which node is connected to a
particular switch, mistakes can be made during problem resolution
attempts.

● Node 0/scid0 should be connected to port 0 of switch 0

● Node 0/scid1 should be connected to port 0 of switch 1

● The same scheme should be used for all other nodes in the cluster

[Figure 4-13 shows a four-node SCI switch configuration: each node's
scid0 interface connects to a port (0 through 3) on switch 0, and each
node's scid1 interface connects to the corresponding port on switch 1.]

Figure 4-13 High-speed SCI Switch Interconnect

Note – Trying to predict which SCI cards are configured as 0 and 1 is
difficult. The best indicator that you can use before you install the
software is to see what the reset command shows at the ok prompt.
The first card to show in the listing is scid0.

SCI Card Identification
To determine the version of the SCI cards, type reset at the ok
prompt. This causes the SCI cards to run an internal self-test.

0} ok reset
Resetting ...

DOLPHIN SBus-to-SCI (SBus2b) Adapter - 9029, Serial #6342
FCode 9029 $Revision: 2.14 $ - d9029_61 $Date: 1997/08/19 13:41:23

Executing SCI adapter selftest. Adapter OK.

DOLPHIN SBus-to-SCI (SBus2b) Adapter - 9029, Serial #6340
FCode 9029 $Revision: 2.3 $ - d9029_52 $Date: 1996/10/30 07:47:53

Executing SCI adapter selftest. Adapter OK.

Note – The SCI cards must be the current SBus2b version or they will
not function correctly.

SCI Card Self-Test Information


The SCI card self-test routines display several important pieces of
information that you should record for each system. You can
determine the following from the self-test:
● There are two cards in the system and they are basically functional
● Both cards are the SBus2b model
● The first card to appear is addressed as scid0
● The second card to appear is addressed as scid1
● The cards have the serial numbers 6342 and 6340

Note – There are serial number tags on the face of each SCI card.

SCI Card Scrubber Jumpers
As shown in Figure 4-14, each SCI interface card has a scrubber jumper
that enables the card to perform link maintenance functions. This
jumper needs to be set either on or off depending on your cluster
configuration.
[Figure 4-14 shows the location of the scrubber jumper on the SCI
card and its on and off positions.]

Figure 4-14 SCI Card Scrubber Jumper Location

Two-Node SCI Interconnect Jumper Settings

In a two-node point-to-point configuration, only one scrubber jumper
per link is set to on.

[Figure 4-15 shows a two-node point-to-point configuration in which,
on each link, the scrubber jumper is on at one end and off at the other
(on for Node 0 and off for Node 1 in this example).]

Figure 4-15 SCI Card Scrubber Jumper Configuration

Note – All SCI card scrubber jumpers must be set to on in three- and
four-node interconnect configurations.


Ethernet Hub Connection


If a three- or four-node cluster uses the Ethernet hub configuration
shown in Figure 4-16, the interfaces must be connected to the Ethernet
hubs as follows:

● All hme0 Ethernet interfaces must connect to Hub 0

● All hme1 Ethernet interfaces must connect to Hub 1

[Figure 4-16 shows four nodes (Node 0 through Node 3): every node's
hme0 interface is cabled to Hub 0 and every node's hme1 interface is
cabled to Hub 1.]

Figure 4-16 Ethernet Hub Interconnect

The Ethernet cluster interconnect interfaces must meet the following
requirements:

● They must be dedicated exclusively for use by the CIS software

● You cannot use the primary system network interface

● Do not place the primary and backup interfaces on quad Ethernet
cards

● Only 100base-T Ethernet cards and hubs are supported


Ethernet Card Identification


It is difficult to determine which physical address is assigned to a
specific Ethernet card. The OpenBoot PROM firmware creates a device
tree when the system is powered on and that tree is used by the Solaris
operating system to assign physical interface numbers.

The rules about hardware address assignment are complex and are
different for virtually every hardware platform. Few individuals have
sufficient hardware knowledge to accurately predict the physical
address for a given Ethernet interface.

Most people perform OpenBoot PROM testing on Ethernet interfaces
while attaching them to an active network. When the interface that is
being tested finally passes the firmware tests, you can make an
appropriate notation on a hardware diagram.
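
A minimal sketch of that approach, assuming an Ultra-class OpenBoot
PROM that provides the show-nets and watch-net-all words
(availability and exact output depend on the firmware revision): cable
one interface at a time to a live network, then at the ok prompt run:

ok show-nets
ok watch-net-all

The interface that reports incoming packets is the one currently
cabled, and its logical name can then be recorded on the hardware
diagram.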


Public Network Management

The Public Network Management (PNM) software creates and
manages designated groups of local network adapters. It is a Sun
Cluster package that provides IP address and adapter failover within
such a group, and it is designed for use in conjunction with the HA
data services. By itself, it has limited functionality.

The network adapter failover groups are commonly referred to as
NAFO groups. If a cluster host network adapter fails, its associated IP
address is transferred to a local backup adapter.

A NAFO group can consist of any number of network adapter
interfaces but usually contains only a few.

Note – This discussion of NAFO groups is intended only to eliminate
confusion during the Sun Cluster software installation on the cluster
host systems. It is not a complete treatment of NAFO groups.


As shown in Figure 4-17, the PNM daemon (pnmd) continuously
monitors designated network adapters on a single node. If a failure is
detected, pnmd uses information in the cluster configuration database
(ccd) and the pnmconfig file to initiate a failover to a healthy adapter
in the backup group.

[Figure 4-17 shows the PNM components: on each node, pnmd
monitors the primary and backup adapters of each NAFO group (for
example, nafo12 and nafo7) and, on a failure, uses ifconfig together
with the NAFO group configuration in /etc/pnmconfig and the IP
address information in the ccd to move the address to the backup
adapter.]

Figure 4-17 Public Network Management Components


PNM Configuration
You use the pnmset command to configure a network adapter backup
group. The following shows the process of creating two separate
backup groups. With pnmset, you can create all of the NAFO backup
groups at the same time, or create them one at a time.

# /opt/SUNWpnm/bin/pnmset

In the following you will be prompted to do
configuration for network adapter failover

do you want to continue...[y/n]: y

How many PNM backup groups on the host: 2

Enter backup group number: 7
Please enter all network adapters under nafo0
qe1 qe0

Enter backup group number: 12
Please enter all network adapters under nafo1
hme0

You can assign any number you wish to a NAFO group. However, you
cannot have more than 255 NAFO groups on a system. The groups are
given the name nafo followed by the number you furnish during the
configuration. Typical group names are nafo7 or nafo12.
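
Once the groups exist, their status can be checked from the node. A
minimal sketch, assuming the pnmstat utility delivered in the SUNWpnm
package with Sun Cluster 2.2 (the -l option and the columns shown
are illustrative and can differ in your release):

# /opt/SUNWpnm/bin/pnmstat -l
bkggrp  r_adp   status  fo_time  live_adp
nafo7   qe0     OK      NEVER    qe0
nafo12  hme0    OK      NEVER    hme0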

Note – During the Sun Cluster software installation, you are asked if
you want to configure public networks and NAFO groups. You do not
have to do this at this time. You can configure them after the
installation.


Shared CCD Volume

Each node in a cluster has an identical copy of the cluster
configuration database file called ccd.database (CCD). When a node
tries to join the cluster, its CCD must be consistent with that of the
current cluster members or it cannot join.

If there is a CCD inconsistency in a two-node cluster, there is no way
to establish a majority opinion about which CCD is correct.

Another problem is that when only one node is in the cluster, you
cannot modify the CCD database.

Note – The shared CCD volume is not supported in Solstice DiskSuite
cluster installations.


In a two-node cluster, you can create a third CCD database that is
resident on the storage arrays. It is in a private disk group that is
highly available. The disk group is imported to the assigned backup
node when necessary.

[Figure 4-18 shows Node 0 and Node 1, each with a local
ccd.database copy, attached through their I/O interfaces to shared
mass storage that holds the ccd primary and ccd mirror volumes.]

Figure 4-18 Array Resident ccd Database Configuration

In the case of a single node failure, there are two CCD files to compare
to ensure integrity.

Note – The shared CCD can be used only with a two-node cluster.


Shared CCD Volume Creation


You can create the shared CCD volume either by replying yes to its
creation during the scinstall process or by using the confccdssa
script after installation. The confccdssa script, found in the
/opt/SUNWcluster/bin directory, sets up a shared disk array-resident
copy of the CCD that is mirrored.

In either case, you are asked to dedicate two storage array drives for
the mirrored CCD volume. If you specified a shared CCD during
scinstall processing, you must run the confccdssa command after
the Sun Cluster software installation has completed.

To create a shared CCD:

1. Identify two entire drives on shared disk storage to contain the
CCD data. For reliability, they should be on two different storage
devices.

2. If you did not reply yes to the shared CCD question during
scinstall processing, run the following command on both nodes
in the cluster:
# scconf clustername -S ccdvol

3. Run confccdssa on only one node.

# confccdssa

The confccdssa program furnishes a list of disk drives available
for use; they do not yet belong to any disk group. After you select
a pair of drives, the program creates a special disk group and
volume for CCD use. The disks cannot be used for any other
purpose.
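
For example, using the cluster name and node names from the
installation examples in Module 5 (sc-cluster, phys-hahost1, and
phys-hahost2; substitute your own names), the post-installation
sequence would look like this sketch:

phys-hahost1# scconf sc-cluster -S ccdvol
phys-hahost2# scconf sc-cluster -S ccdvol
phys-hahost1# confccdssa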

Disabling a Shared CCD


To disable shared CCD operation, run:
# scconf clustername -S none


Cluster Configuration Information

Before you install the Sun Cluster software, you should record the
general system configuration information. This information can be
useful if there are any problems during the software installation. Some
of the utilities that you can use are:

● prtdiag

● finddevices

● luxadm


Using prtdiag to Verify System Configuration


The prtdiag command furnishes general information about your
system configuration, but detailed analysis of the output requires
considerable hardware knowledge. The following portion of the
prtdiag output shows some basic system configuration information.

# /usr/platform/sun4u/sbin/prtdiag
System Configuration: Sun Microsystems sun4u 8-slot Sun
Enterprise 4000/5000
System clock frequency: 84 MHz
Memory size: 512Mb

========================= CPUs =========================

                  Run   Ecache   CPU    CPU
Brd  CPU  Module  MHz     MB    Impl.  Mask
---  ---  ------  -----  ------  -----  ----
0 0 0 168 0.5 US-I 2.2
0 1 1 168 0.5 US-I 2.2
2 4 0 168 0.5 US-I 2.2
2 5 1 168 0.5 US-I 2.2

========================= Memory =========================

Intrlv. Intrlv.
Brd Bank MB Status Condition Speed Factor With
--- ----- ---- ------- ---------- ----- ------- -------
0 0 256 Active OK 60ns 2-way A
2 0 256 Active OK 60ns 2-way A

From the output you can determine that:

● The system is an 8-slot E4000/5000 system

● There are CPU/Memory boards in slots 0 and 2

● There is a total of 512 Mbytes of system memory


Using prtdiag to Verify System Configuration (Continued)


The following portion of the prtdiag output shows more detailed
information about the system’s interface board configuration.

========================= IO Cards =========================

Bus Freq
Brd Type MHz Slot Name Model
--- ---- ---- ---- --------------------- --------------
1 SBus 25 0 DOLPHIN,sci
1 SBus 25 1 qec/be (network) SUNW,270-2450
1 SBus 25 2 QLGC,isp/sd (block) QLGC,ISP1000U
1 SBus 25 3 SUNW,hme
1 SBus 25 3 SUNW,fas/sd (block)
1 SBus 25 13 SUNW,soc/SUNW,pln 501-2069

Detached Boards
===============
Slot State Type Info
---- -------- ----- -----------------------------
7 disabled disk Disk 0: Target: 14 Disk 1: Target: 15

From the output you can determine that:

● There is an I/O board in slot 1

● There are four option cards on the I/O board

▼ A SCI card in option slot 0

▼ A quad-Ethernet card in option slot 1

▼ An intelligent SCSI interface card in option slot 2

▼ An SOC optical module in option slot 13

● There is a disk board in slot 7


Interpreting prtdiag Output


Interpreting prtdiag command output in more detail requires
additional information about the internal structure of the I/O boards.
Figure 4-19 can help you understand the prtdiag output of several of
the Ultra-based systems.

[Figure 4-19 shows the I/O board layout: SBus 0 carries the SOC
module (slot 13) and SBus card slots 1 and 2, while SBus 1 carries SBus
card slot 0 and the FEPS (slot 3), which provides the 10/100 TPE and
fast/wide SCSI interfaces; the FCOM connections attach to the SOC
module.]

Figure 4-19 Ultra 3000/4000/5000/6000 I/O Board Configuration


Identifying Storage Arrays


The finddevices and luxadm commands are useful for identifying the
storage arrays attached to a cluster system.

The finddevices Script Output

The /opt/SUNWcluster/bin/finddevices script is a standard
feature of cluster software. It displays the controller number and the
12-digit array worldwide number for all storage arrays except the
A5000. It does not display non-array controller information.
# /opt/SUNWcluster/bin/finddevices
c2:00000078BF60
c3:00000078B12D
c4:00000078BF9E

The luxadm Utility Output

The luxadm command is a standard Solaris Operating System
command. It can display information about any supported storage
array. The probe option is used only for A5000 storage arrays.

# luxadm probe
Found
SENA Name:d Node WWN:5080020000011df0
Logical Path:/dev/es/ses0
Logical Path:/dev/es/ses1
SENA Name:a Node WWN:50800200000291d8
Logical Path:/dev/es/ses2
Logical Path:/dev/es/ses3


Storage Array Firmware Upgrades

Before you upgrade the firmware in any storage array, you should first
perform careful research. Upgrading the firmware revision of a storage
array is not a single-step process. The following two components are
upgraded when a storage array firmware patch is installed:

● The Solaris Operating Environment storage array driver software

● The storage array firmware revision

You must complete the process. If a new system driver is installed but
the related array firmware is not downloaded, you can create a
mismatch between the Solaris Operating Environment storage array
driver and the array firmware that makes the storage array
unavailable.

Recovering from this problem can result in a considerable amount of
cluster downtime.

If you are contemplating array firmware upgrades, it is a good idea to
get assistance from your authorized Sun field representative.


Array Firmware Patches


It is easy to underestimate the complexity of updating array firmware
levels. The following example demonstrates what might be involved if
you are considering updating the firmware on your Sun StorEdge
A5000 storage arrays.

● Different patches are required for different Solaris Operating
Environment versions

● Different patches are required for different types of host system
interface cards (SBus-based and PCI-based cards)

● Patches can be required for the storage array disk drives

Typical Sun StorEdge A5000 Patches

If your A5000 arrays are connected to SBus-based SOC+ cards and
your system is running the Solaris 2.6 operating system, you might
need to install all of the following patches:

● Patch 103346 to update CPU and I/O firmware but only on Sun
Enterprise I/O boards with onboard FCAL.

● Patch 105356 to update the Solaris 2.6 Operating Environment
/kernel/drv/ssd driver

● Patch 105357 to update the Solaris 2.6 Operating Environment
/kernel/drv/ses driver

● Patch 105375 to update the sf and socal drivers, the array
interface card firmware, and the array controller board firmware

If the version number of the firmware is too low, you must first
install the 105375-04 version of the patch to bring it up to revision
1.03.

● Patch 106129 to update the A5000 disk drive firmware

Note – All of the above patches must be installed in a certain order.
For more information and comprehensive instructions, see the
README notes for patch 105375.
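
Before applying any of these patches, check what is already installed
on each node. The standard Solaris showrev utility reports installed
patch revisions (patch 105375 is used here only as an example):

# showrev -p | grep 105375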

Exercise: Preinstallation Preparation

Exercise objective – In this exercise you will do the following:

● Select and configure a cluster topology

● Estimate the number of quorum devices needed

● Verify that the cluster interconnect is correctly cabled

● Select an appropriate terminal concentrator node locking port

Preparation
To begin this exercise, your cluster should be in the following state:
● You are connected to the cluster hosts through the
cconsole tool
● The cluster host systems have been booted and you are
logged into them as user root

Tasks
The following tasks are explained in this section:

● Cluster topology

● Quorum device configuration

● SCI or Ethernet cluster interconnect configuration

● Node locking configuration for direct attach storage arrays.


Cluster Topology
1. Record the desired topology configuration of your cluster.

Topology Configuration

                              Target Configuration
Number of Nodes               ____________________
Number of Storage Arrays      ____________________
Types of Storage Arrays       ____________________

2. Verify that the storage arrays in your cluster are connected in your
target topology. Recable the storage arrays if necessary.

Quorum Device Configuration


1. Record the estimated number of quorum devices you must
configure during the cluster host software installation.

Estimated number of quorum devices: ____________

Note – Please consult with your instructor if you are not sure about
your quorum device configuration.


Ethernet Cluster Interconnect Configuration

Point-to-Point Ethernet Interconnect

Skip this section if your cluster interconnect is not a point-to-point
Ethernet configuration.

1. Ask your instructor for assistance in determining the logical
names of your cluster interconnect interfaces.

2. Complete the form in Figure 4-20 if your cluster uses an Ethernet-
based point-to-point interconnect configuration.

Node 0:  First Ethernet interface:  __________
         Second Ethernet interface: __________

Node 1:  First Ethernet interface:  __________
         Second Ethernet interface: __________

Figure 4-20 Ethernet Interconnect Point-to-Point Form


Ethernet Cluster Interconnect Configuration

Hub-Based Ethernet Interconnect

Skip this section if your cluster interconnect is not an Ethernet
interconnect with hubs configuration.

1. Complete the form in Figure 4-21 if your cluster uses an Ethernet-
based cluster interconnect with hubs.

Hub 0 (all first Ethernet interfaces) and Hub 1 (all second Ethernet
interfaces):

Node 0:  First Ethernet interface:  __________
         Second Ethernet interface: __________

Node 1:  First Ethernet interface:  __________
         Second Ethernet interface: __________

Node 2:  First Ethernet interface:  __________
         Second Ethernet interface: __________

Figure 4-21 Ethernet Interconnect With Hubs Form

2. Verify that each Ethernet interconnect interface is connected to the
correct hub.

Note – If you have any doubt about the interconnect cabling, consult
with your instructor now. Do not continue this lab until you are
confident that your system is cabled correctly.


SCI Cluster Interconnect Configuration

Point-to-Point SCI Interconnect

Skip this section if your cluster interconnect is not a SCI point-to-point
configuration.

1. Using the cconsole common window, halt each of your cluster
host systems. Type the init 0 command to halt the systems.

2. Type a reset command at the ok prompt on all cluster hosts.

3. Use the scroll bar on each cconsole host window and review the
information from the first and second SCI card self-test of each
node in the cluster.

Note – The SCI card self-tests might repeat twice on some systems, so
be careful to start recording at the first self-test output.

4. Complete the form in Figure 4-22 if your cluster uses a SCI-based
point-to-point CIS configuration.

Node 0:  scid0 (first self-test) serial number:   __________
         scid1 (second self-test) serial number:  __________

Node 1:  scid0 (first self-test) serial number:   __________
         scid1 (second self-test) serial number:  __________

Figure 4-22 SCI Interconnect Point-to-Point Form


SCI Cluster Interconnect Configuration

SCI Interconnect with Switches

Skip this section if your cluster interconnect is not a SCI interconnect
with switches configuration.

1. Using the cconsole common window, halt each of your cluster
host systems. Type the init 0 command to halt the systems.

2. Type a reset command at the ok prompt on all cluster hosts.

3. Use the scroll bar on each cconsole host window and review the
information from the first and second SCI card self-test of each
node in the cluster.

Note – The SCI card self-tests might repeat twice on some systems, so
be careful to start recording at the first self-test output.

4. Complete the SCI interconnect configuration form in Figure 4-23.


SCI Cluster Interconnect Configuration

SCI Interconnect with Switches (Continued)

Switch 0 (scid0 interfaces) and Switch 1 (scid1 interfaces), ports 0-3:

Node 0:  scid0 (first self-test) serial number:   __________
         scid1 (second self-test) serial number:  __________

Node 1:  scid0 (first self-test) serial number:   __________
         scid1 (second self-test) serial number:  __________

Node 2:  scid0 (first self-test) serial number:   __________
         scid1 (second self-test) serial number:  __________

Figure 4-23 SCI Interconnect With Switches Form

5. Each SCI card has a serial number tag on its face. Verify that each
serial number is connected to the proper switch and port number.

Note – If you have any doubt about the SCI interconnect cabling,
please consult with your instructor now. Do not continue this lab until
you are confident your system is cabled correctly.


Node Locking Configuration


Although the TC might have been set up previous to this exercise, the
unused TC port that is used for node locking must be correctly
configured.

1. Verify that serial port 6 on the TC is not connected to a cluster host


system.

2. If serial port 6 is in use on the TC, record the port number of a TC


port that is not in use.

TC Locking Port: _______

Note – Do not use serial port 1. It is a special purpose port and does
not work for node locking.

3. Boot each of your cluster host systems if they are halted.

Exercise: Preinstallation Preparation

Exercise Summary

Take a few minutes to discuss what experiences, issues, or discoveries
you had during the lab exercises.

● Experiences

● Interpretations

● Conclusions

● Applications

Check Your Progress

Before continuing on to the next module, check that you are able to
accomplish or answer the following:

❑ Configure any supported cluster topology

❑ List the appropriate applications for each topology

❑ Configure the cluster interconnect system

❑ Explain the need for a simple quorum device

❑ Estimate the number of quorum devices needed for each cluster
topology

❑ Describe the purpose of the public network monitor feature

❑ Describe the purpose of a mirrored CCD volume

❑ Explain the purpose of the terminal concentrator node locking
port

Think Beyond

What additional preparation might be necessary before installing the
Sun Cluster host software?

Cluster Host Software Installation 5

Objectives

Upon completion of this module, you should be able to:

● Install the Sun Cluster host system software

● Correctly interpret configuration questions during Sun Cluster
software installation on the cluster host systems

● Perform post-installation configuration

This module reviews the process of installing and configuring the Sun
Cluster software on each of the cluster host systems. Background
information is furnished that will enable you to correctly interpret the
installation questions.

Relevance

Discussion – The following questions are relevant to understanding
this module's content:

1. What configuration issues might control how the Sun Cluster
software is installed?

2. What type of post-installation tasks might be necessary?

3. What other software might you need to finish the installation?

Additional Resources

Additional resources – The following references can provide
additional details on the topics discussed in this module:

● Sun Cluster 2.2 System Administration Guide, part number 805-4238

● Sun Cluster 2.2 Software Installation Guide, part number 805-4239


Sun Cluster Server Software Overview

The Sun Cluster server software packages include both the
general cluster framework software and support for a number
of optional configurations. The software types include:

● Sun Cluster server packages (basic framework)

● SCI interconnect system support

● Solstice DiskSuite support (driver, mediator)

● HA data services support

● HA databases support

● Oracle Parallel Server support (UNIX™ Distributed Lock
Manager (UDLM))

Before upgrading to a new Sun Cluster release, you must obtain
the most recent copies of the Sun Cluster software installation
manual and the Sun Cluster product release notes. The upgrade
process can be complex and must be performed according to
published procedures.


As shown in Figure 5-1, the Sun Cluster server software is installed on
each of the cluster host systems along with the appropriate volume
management software.
[Figure 5-1 shows the administration workstation (Solaris 2.6/7 and
the Sun Cluster client software) on the network with Node 0 and
Node 1. Each node runs Solaris 2.6/7, the Sun Cluster server software,
and volume management software on its own system hardware, with
a private boot disk and attached disk storage arrays.]

Figure 5-1 Cluster Software Distribution


Server Package Set Contents


The server package sets contain the following packages, with
descriptions as reported by the pkginfo command:

Sun Cluster Framework Packages

The following packages are loaded in all installations:

● SUNWscman – Sun Cluster Man Pages

● SUNWccd – Sun Cluster Configuration Database

● SUNWsccf – Sun Cluster Configuration Database

● SUNWcmm – Sun Cluster Membership Monitor

● SUNWff – Sun Cluster FailFast Device Driver

● SUNWmond – Sun Cluster Monitor - Server Daemon

● SUNWpnm – Sun Cluster Public Network Management

● SUNWsc – Sun Cluster Utilities

● SUNWsclb – Sun Cluster Libraries

Sun Cluster SCI Interconnect Support

The following packages are loaded only if your cluster has a SCI-based
cluster interconnect system (private network):

● SUNWsci – Sun Cluster SCI Driver

● SUNWscid – Sun Cluster SCI Data Link Provider Interface (DLPI)
Driver

● SUNWsma – Sun Cluster Switch Management


Server Package Set Contents

Solstice DiskSuite Support

The following packages provide Solstice DiskSuite (SDS) support and
are loaded only if you select SDS as your volume manager during the
Sun Cluster software installation:

● SUNWdid – Disk Identification (ID) Pseudo Device Driver

● SUNWmdm – Solstice DiskSuite (Mediator)

Highly Available Data Service Support

The following data service support packages are loaded only if you
select their related data service during the Sun Cluster software
installation:

● SUNWscds – Sun Cluster Highly Available Data Service Utility

● SUNWscdns – Sun Cluster Highly Available DNS

● SUNWschtt – Sun Cluster Highly Available Netscape Web Service

● SUNWsclts – Sun Cluster Highly Available Service For LOTUS

● SUNWscnew – Sun Cluster Netscape News Service

● SUNWscnsl – Sun Cluster Highly Available Netscape Directory
Server

● SUNWscnsm – Sun Cluster Netscape Mail Service

● SUNWscpro – Sun Cluster Internet Pro Common Files

● SUNWscsap – Sun Cluster Highly Available Service For SAP R3

● SUNWsctiv – Sun Cluster Highly Available Service for Tivoli


Server Package Set Contents

Highly Available Database Support

The following highly available database support packages are loaded
only if you select their related database during the Sun Cluster
software installation:

● SUNWscor – Sun Cluster Highly Available Oracle

● SUNWscsyb – Sun Cluster Highly Available Sybase

● SUNWscinf – Sun Cluster Highly Available Informix

Oracle Parallel Server Support

● SUNWudlm – Sun Cluster UNIX Distributed Lock Manager

Sun Cluster Licensing


Paper licenses for the Sun Cluster 2.2 framework are distributed for
each hardware platform on which Sun Cluster 2.2 runs.

Paper licenses also are distributed for each Sun Cluster data service,
one for each node.

No licenses are required for SDS or CVM. The Sun StorEdge Volume
Manager license is bundled with SPARCstorage Arrays and A5000
arrays. You need a license for SSVM if you use it in a MultiPack-only
environment.

The Sun Cluster 2.2 framework does not enforce these licenses, but the
paper licenses should be retained as proof of ownership when
technical support or other support services are needed.


Sun Cluster Installation Overview

The Sun Cluster installation script, scinstall, is used to install and
configure each host node for the services that it provides. The scinstall
script prompts you for various configuration information and then
installs the appropriate software components based on your responses.

Some of the packages are always installed. Others are installed based
on the responses that you provide to the prompts. (The prompts are
issued from the SUNWsccf package installation process.) If incorrect
answers are provided to the prompts, the system installs inappropriate
packages for the configuration.

The incorrect packages can prevent certain components from
operating, cause them to operate more slowly, or provide extra
services that are not wanted.

If you specify incorrect information, the Sun Cluster packages must be
completely removed from the system and the entire Sun Cluster
installation process re-run.


Sun Cluster Volume Managers

During the Sun Cluster server software installation, you must select
one of three available volume managers. Although you can use the
volume managers for more than one cluster application, a volume
manager is typically used for a specific application. The most common
uses for each supported volume manager are:

● Cluster Volume Manager for Oracle Parallel Server

● Sun StorEdge Volume Manager for all HA data services and
Informix XPS

● Solstice DiskSuite for customers migrating from the older HA 1.3
data services product


Volume Manager Choices


The first configuration decision made during the server software
installation is which volume manager you intend to use. This choice
does not install the volume manager software itself but, in some cases,
installs support software, as shown in the following example.

Volume Manager Selection

Please choose the Volume Manager that will be used
on this node:

1) Cluster Volume Manager (CVM)
2) Sun StorEdge Volume Manager (SSVM)
3) Solstice DiskSuite (SDS)

Choose the Volume Manager: 3

Installing Solstice DiskSuite support packages.
Installing "SUNWdid" ... done
Installing "SUNWmdm" ... done

---------WARNING---------
Solstice DiskSuite (SDS) will need to be installed
before the cluster can be started.

<<Press return to continue>>

The Cluster Volume Manager is selected only if it is intended for use
with the Oracle Parallel Server. Other supported data services and
applications can use either the Sun StorEdge Volume Manager or
Solstice DiskSuite.


Sun Cluster Host System Configuration

During the cluster host software installation, you are asked to furnish
the name of your cluster and the names of each cluster system.

The names supplied should agree with those used in the
/etc/clusters and /etc/serialports files on the administrative
workstation.

Note – The names do not have to agree; however, standardization
helps eliminate confusion when trying to administer a cluster.


Cluster Host System Questions


The following cluster installation questions seem simple, but if not
answered correctly, can cause a great deal of unnecessary effort later.

What is the name of the cluster? sc-cluster

How many potential nodes will sc-cluster have [4]? 3

How many of the initially configured nodes will be
active [3]? 3

You can specify up to four nodes. The active nodes are those that you
physically connect and include in the cluster now. The potential nodes
are the number of nodes to which you will expand your cluster in the
near future.

Do not specify more potential nodes than active nodes unless you will
be expanding your cluster in the near future.

Note – If the cluster has two active nodes and only two disk strings
and the volume manager is Solstice DiskSuite, you must configure
mediators. You should do so after configuring Solstice DiskSuite, but
before bringing up the cluster. See the Sun Cluster 2.2 System
Administration Guide for the procedure.
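
A hedged sketch of the mediator setup the note refers to, assuming a
Solstice DiskSuite diskset named hahost1 that is mastered by the two
nodes (the diskset name is illustrative; see the Sun Cluster 2.2 System
Administration Guide for the authoritative procedure):

# metaset -s hahost1 -a -m phys-hahost1 phys-hahost2
# medstat -s hahost1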


Sun Cluster Private Network Configuration

There are two types of private network systems used in the cluster,
and these must be configured during installation.

SCI Interconnect Configuration


The private network configuration is brief if you are using the SCI
interconnect system. The node names furnished during the installation
are checked by scinstall against the /etc/nodename files on each
cluster host.

What type of network interface will be used for this
configuration? (ether|SCI) [SCI]? SCI

What is the hostname of node 0 [node0]? phys-hahost1

What is the hostname of node 1 [node1]? phys-hahost2

Note – Additional SCI configuration is required after the Sun Cluster
software installation is complete. The SCI post-installation process is
discussed later in this module.


Ethernet Interconnect Configuration


If the interconnect on your cluster is Ethernet based, then you are
asked additional questions about the interconnect paths on each node.
The example shown is for the first node. A similar dialogue is repeated
for each node in the cluster.

What is the hostname of node 0 [node0]? phys-hahost1

What is phys-hahost1's first private network
interface [hme0]? hme0

What is phys-hahost1's second private network
interface [hme1]? hme1

You will now be prompted for Ethernet addresses of
the host. There is only one Ethernet address for each
host regardless of the number of interfaces a host
has. You can get this information in one of several
ways:

1. use the 'banner' command at the ok prompt,
2. use the 'ifconfig -a' command (need to be root),
3. use ping, arp and grep commands. ('ping exxon; arp
-a | grep exxon')

Ethernet addresses are given as six hexadecimal bytes
separated by colons.
ie, 01:23:45:67:89:ab

What is phys-hahost1's ethernet address?
01:23:45:67:89:ab

What is the hostname of node 1 [node1]?
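
Collecting the Ethernet addresses ahead of time saves a trip to the ok
prompt. One quick method on a running node is option 2 from the
prompt above, run as root (the address shown is illustrative):

# ifconfig -a | grep ether
        ether 8:0:20:a1:b2:c3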


Sun Cluster Public Network Configuration

The controllers and names of primary and secondary network
interfaces on each cluster host are collected during the Sun Cluster
host system software installation.

You can use part of this information during the Sun Cluster software
installation to configure network adapter failover groups for use by
the public network management software.

The NAFO groups are mandatory if you are going to use any of the
HA data services. However, you do not have to configure the NAFO
groups during the Sun Cluster software installation. It might be easier
to configure the NAFO groups after you complete the installation.

Note – The configuration and use of NAFO groups is discussed in
more detail in a later module.


The operator is prompted during installation for the names of any
primary and secondary public networks. You are also asked if you
want to establish a NAFO backup group.

What is the primary public network controller for
"phys-hahost1"? hme2
What is the primary public network controller for
"phys-hahost2"? hme2
Does the cluster serve any secondary public subnets
(yes/no) [no]? y

Please enter a unique name for each of these
additional subnets:

Subnet name (^D to finish): sc-cluster-net1
Subnet name (^D to finish): sc-cluster-net2
Subnet name (^D to finish): ^D

The list of secondary public subnets is:
sc-cluster-net1
sc-cluster-net2
Is this list correct (yes/no) [yes]?

For subnet "sc-cluster-net1" ...
What network controller is used for "phys-hahost1"?
qe0
What network controller is used for "phys-hahost2"?
qe0

For subnet "sc-cluster-net2" ...
What network controller is used for "phys-hahost1"?
qe1
What network controller is used for "phys-hahost2"?
qe1

Initialize NAFO on "phys-hahost1" with one ctlr per
group (yes/no) [yes]? y


Sun Cluster Logical Host Configuration

A logical host is a complex structure that associates a group of virtual
volumes with a data service and also defines the user access path
using a network interface that is part of a NAFO group.

Logical hosts do not have to be configured during the Sun Cluster
software installation. It might be easier to configure the logical hosts
after you complete the basic Sun Cluster installation.

If you do decide to configure logical hosts during the Sun Cluster
installation process, you must have configured appropriate NAFO
backup groups earlier in the installation.

Note – The configuration of logical hosts is a complex issue that is
addressed in a later module.


Logical hosts can be configured during the installation.

Will this cluster support any HA data services
(yes/no) [yes]? yes
Okay to set up the logical hosts for those HA
services now (yes/no) [yes]? yes
Enter the list of logical hosts you want to add:

Logical host (^D to finish): hahost1
Logical host (^D to finish): ^D

The list of logical hosts is: hahost1

Is this list correct (yes/no) [yes]? y

What is the name of the default master for "hahost1"?
phys-hahost1

Enter a list of other nodes capable of mastering
"hahost1":
Node name: phys-hahost2
Node name (^D to finish): ^D

The list that you entered is:
phys-hahost1
phys-hahost2

Is this list correct (yes/no) [yes]? y

Enable automatic failback for "hahost1" (yes/no)
[no]? y
What is the net name for "hahost1" on subnet
"sc-cluster-net1"? hahost1-net1
What is the net name for "hahost1" on subnet
"sc-cluster-net2"? hahost1-net2
Disk group name for logical host "hahost1" [hahost1]?
Is it okay to add logical host "hahost1" now (yes/no)

Data Protection Configuration

Preserving data integrity during cluster failures requires that you use
data protection techniques. If certain key components in a cluster fail,
potentially dangerous situations must be anticipated. Depending on
the configuration of your cluster, you are asked one or more questions
related to data protection during the Sun Cluster software installation.


Failure Fencing
Failure fencing is used in three- and four-node clusters to prevent a
failed node from attempting uncontrolled data access. This is done by
forcing a UNIX abort through the terminal concentrator. The following
shows the configuration process.

What type of architecture does phys-hahost1 have
(E10000|other) [other]? other

What is the name of the Terminal Concentrator
connected to the serial port of phys-hahost1
[NO_NAME]? sc-tc

Is 123.456.789.1 the correct IP address for this
Terminal Concentrator (yes | no) [yes]? yes

What is the password for root of the Terminal
Concentrator [?]
Please enter the password for root again [?]

This process is performed for each node in the cluster. When one of the
nodes in a cluster suddenly ceases to respond across the cluster
interconnect, it might be out of control and must be stopped
immediately.

In a smaller two-node cluster, you can use the SCSI reserve feature to
deny data access to the failed node. In a three-node or four-node
cluster, you cannot use SCSI reservation.

One of the surviving nodes in a larger cluster uses the telnet
command to communicate directly into the TC and send a UNIX abort
to the failed node.

Note – The domains on an E10000 do not have serial ports, so you
must implement failure fencing differently from other servers. The
E10000 mechanism requires access to the SSP, so you are prompted for
SSP information if the E10000 architecture is specified.


Node Locking
If your cluster has more than two nodes and uses direct attach storage
devices, such as the StorEdge™ A5000, you are asked to select a node
locking port. You are asked for the number of an unused terminal
concentrator port.

Does the cluster have a disk storage device that is
connected to all nodes in the cluster [no]? yes

Which unused physical port on the Terminal
Concentrator is to be used for node locking: 6

Figure 5-2 demonstrates the basic node locking feature. The first node
entering the cluster makes a telnet connection to the designated
locking port. This prevents another node from making a similar
connection. In the diagram, a node with a failed interconnect is no
longer a cluster member. An operator might mistakenly try to start
another cluster on the failed node. The startup fails because the
locking port is already reserved.

[Figure 5-2 shows Node 0 through Node 3 on direct attached storage
and on the Ethernet with the terminal concentrator (TC). The first
cluster member holds the lock on unused TC serial port 6 with the
command telnet tc_concentrator 5006.]

Figure 5-2 Node Locking Overview


Quorum Device
In a two-node cluster with both nodes attached to the same storage
array, if communication between the nodes is lost, you must prevent
one node from accessing the storage array in an uncontrolled manner.
Traditionally, a disk drive is assigned as a quorum device and both
nodes race to reserve it. The first node to reserve the quorum device
remains a cluster member. The losing node must abort clustered
operation.

The following example shows the quorum device selection for a
two-node cluster.

Getting device information for reachable nodes in the cluster.
This may take a few seconds to a few minutes...done
Select quorum device for the following nodes:
0 (phys-hahost1)
and
1 (phys-hahost2)

1) SSA:000000779A16
2) SSA:000000741430
3) DISK:c0t1d0s2:01799413
Quorum device: 1
...
SSA with WWN 000000779A16 has been chosen as the
quorum device.

Finished Quorum Selection

Quorum devices are shared between nodes that can both master the
device. As shown in Figure 5-3, the potential resource masters are
more complicated in a ring topology configuration.

Figure 5-3 Potential Masters in a Ring Topology (Node 0, Node 1, and
Node 2 each connect to two of the three storage resources)

● Node 0 and Node 2 can master Resource 1

● Node 0 and Node 1 can master Resource 2

● Node 1 and Node 2 can master Resource 3

Note – During the Sun Cluster software installation on a ring topology,
you are asked to select a quorum device for each pair of resource
masters.

If the cluster uses direct attach storage arrays, such as the Sun
StorEdge A5000, all cluster hosts can master a single storage resource,
so you must define only one quorum disk.


Partitioned Cluster Control


In a three-node or four-node cluster, simultaneous multiple cluster
failures can create two separate clusters. This is called a partitioned
cluster. Generally, potential partitioning is detected by a large and
sudden change in cluster membership. One of the potential partitions
must abort clustered operation. You can select either automatic
partition selection or operator intervention.
In case the cluster partitions into subsets, which
subset should stay up?
ask) the system will always ask the operator.
select) automatic selection of which subset
should stay up.

Please enter your choice (ask|select) [ask]: select


You have a choice of two policies:

lowest -- The subset containing the node with the
lowest node ID value automatically becomes the new
cluster. All other subsets must be manually aborted.

highest -- The subset containing the node with the
highest node ID value automatically becomes the new
cluster. All other subsets must be manually aborted.

Select the selection policy for handling partitions
(lowest|highest) [lowest]: highest

If you configure the cluster partitioning to ask, all nodes display
continuous abortpartition/continuepartition messages until the cluster
operator decides which subset of systems should continue.
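The operator answers by running scadmin on a node in each subset
that should survive or abort. The following is only a sketch of the
responses; the exact argument order can vary, so check the scadmin
man page on your systems before relying on it:

# scadmin continuepartition local_node_name cluster_name
# scadmin abortpartition local_node_name cluster_name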

Note – If you do not configure a quorum device in a two-node cluster,
the cluster always initiates this partitioned behavior instead of simply
racing for a quorum disk.


Application Configuration

The final stage of the Sun Cluster host software installation requires
you to identify the combination of data services you intend to support
on the cluster. Depending on the choices you make, various support
packages are installed.

Note – The HA-NFS data service support is installed by default in all
installations. There are no configuration questions asked about it
during the Sun Cluster host system installation.


You can select multiple data service support when installing the
cluster host software. Select item 12 when you have selected all the
data services you want.

==== Select Data Services Menu ====================

Please select which of the following data services
are to be installed onto this cluster. Select
singly, or in a space separated list.
Note: HA-NFS and Informix Parallel Server (XPS) are
installed automatically with the Server Framework.

You may de-select a data service by selecting it a
second time.

Select DONE when finished selecting the
configuration.

1) Sun Cluster HA for Oracle


2) Sun Cluster HA for Informix
3) Sun Cluster HA for Sybase
4) Sun Cluster HA for Netscape
5) Sun Cluster HA for Netscape LDAP
6) Sun Cluster HA for Lotus
7) Sun Cluster HA for Tivoli
8) Sun Cluster HA for SAP
9) Sun Cluster HA for DNS
10) Sun Cluster for Oracle Parallel Server

INSTALL 11) No Data Services


12) DONE

Choose a data service: 1 4 6 12

Note – You will not see the Oracle Parallel Server (OPS) option unless
you selected the Cluster Volume Manager earlier in the installation.


Post-Installation Configuration

Although post-installation can include a wide range of tasks, such as
installing a volume manager, this section focuses only on the
post-installation tasks that are most critical.

Note – If your cluster uses the SCI cluster interconnect, you must
complete its configuration before attempting to start or use the cluster.


Installation Verification
When you have completed the Sun Cluster software installation on the
cluster host systems, you should verify that the basic cluster
configuration information is present by using the scconf command.
The following example is typical of a newly configured cluster.

# scconf sc-cluster -p
/etc/opt/SUNWcluster/conf/sc-cluster.cdb
Checking node status...

Current Configuration for Cluster sc-cluster:

Hosts in cluster: phys-node0 phys-node1 phys-node2

Private Network Interfaces for
phys-node0: be0 be1
phys-node1: be0 be1
phys-node2: be0 be1

Quorum Device Information

Logical Host Timeout Values :
Step10 : 720
Step11 : 720
Logical Host : 180

Cluster TC/SSP Information
phys-node0 TC/SSP, port : 129.150.218.35, 2
phys-node1 TC/SSP, port : 129.150.218.35, 3
phys-node2 TC/SSP, port : 129.150.218.35, 4
sc-cluster Locking TC/SSP, port : 129.150.218.35, 6

Note – You should run the scconf command on each of the
configured cluster host systems to verify that their configuration
database files agree.


Correcting Minor Configuration Errors


When the Sun Cluster software is installed, some common mistakes
are:

● Using incorrect node names

● Using incorrect node Ethernet address

● Using incorrect CIS interface assignments

These simple mistakes can be resolved using the scconf command.
Some examples of post-installation corrections follow.

Correcting Node Names

To change one or more incorrect node names:


# scconf dbcluster -h dbms1 dbms2 dbms3 dbms4

Note – Even if only one node name is incorrect, you should enter the
names of the other nodes in the cluster.

Correcting Node Ethernet Addresses

To change an incorrect Ethernet address for a node:


# scconf dbcluster -N 1 80:40:33:ff:b0:10

Note – The nodes are specified by their number (0,1,2,3).

Correcting Private Interconnect Assignments

To change the Ethernet interconnect assignment for a node:


# scconf dbcluster -i dbms2 hme2 hme4


Software Directory Paths


After the Sun Cluster host system software is installed, you should set
new software directory paths.

General Search Paths and Man Paths

On all nodes, set your PATH to include:


● /sbin
● /usr/sbin
● /opt/SUNWcluster/bin
● /opt/SUNWpnm/bin

On all nodes set your MANPATH to include:


● /opt/SUNWcluster/man

Volume Manager Specific Paths

For SSVM and CVM, set your PATH to include:


● /opt/SUNWvxva/bin
● /etc/vx/bin

For SSVM and CVM, set your MANPATH to include:


● /opt/SUNWvxva/man
● /opt/SUNWvxvm/man

For Solstice DiskSuite, set your PATH to include:


● /usr/opt/SUNWmd/sbin

For Solstice DiskSuite, set your MANPATH to include:


● /usr/opt/SUNWmd/man
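The following is one way to set these variables in root's .profile on
each node. This sketch assumes an SSVM or CVM installation, so
substitute the Solstice DiskSuite entries if that is your assigned
volume manager:

PATH=$PATH:/sbin:/usr/sbin:/opt/SUNWcluster/bin:/opt/SUNWpnm/bin:/opt/SUNWvxva/bin:/etc/vx/bin
MANPATH=$MANPATH:/opt/SUNWcluster/man:/opt/SUNWvxva/man:/opt/SUNWvxvm/man
export PATH MANPATH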


SCI Interconnect Configuration


After you install the Sun Cluster host system software and before you
reboot the cluster host systems, you must perform an additional
configuration on the SCI interconnects as follows:

1. Add the necessary SCI and cluster information to the template file
on all nodes located in the /opt/SUNWsma/bin/Examples
directory.

2. Run the /opt/SUNWsma/bin/sm_config script on one node,
specifying the template file name.

SCI Template File for a Two-Node Cluster

The following example shows a point-to-point SCI template file.
Notice that the names of the future “potential” nodes have been
included and commented out. This is mandatory.

Cluster is configured as = SC

HOST 0 = sec-0
HOST 1 = sec-1
HOST 2 = _%sec-2
HOST 3 = _%sec-3

Number of Switches in cluster = 0


Number of Direct Links in cluster = 2
Number of Rings in cluster = 0

host 0 :: adp 0 is connected to = link 0 :: endpt 0


host 0 :: adp 1 is connected to = link 1 :: endpt 0

host 1 :: adp 0 is connected to = link 0 :: endpt 1


host 1 :: adp 1 is connected to = link 1 :: endpt 1

Network IP address for Link 0 = 204.152.65


Network IP address for Link 1 = 204.152.65

Netmask = f0


SCI Template File for a Three-Node Cluster

The following example shows the SCI template file differences
associated with using SCI switches in a three-node cluster.

Cluster is configured as = SC

HOST 0 = sec-0
HOST 1 = sec-1
HOST 2 = sec-2

Number of Switches in cluster = 2


Number of Direct Links in cluster = 0
Number of Rings in cluster = 0

host 0 :: adp 0 is connected to = switch 0 :: port 0


host 0 :: adp 1 is connected to = switch 1 :: port 0
host 1 :: adp 0 is connected to = switch 0 :: port 1
host 1 :: adp 1 is connected to = switch 1 :: port 1
host 2 :: adp 0 is connected to = switch 0 :: port 2
host 2 :: adp 1 is connected to = switch 1 :: port 2

Network IP address for Switch 0 = 204.152.65


Network IP address for Switch 1 = 204.152.65

Netmask = f0

Warning – You must run the sm_config script any time SCI
components have been moved or replaced, or cables have been
switched. It should be run only from one node.

Exercise: Installing the Sun Cluster Server Software

Exercise objective – In this exercise you will do the following:

▼ Install the Sun Cluster server software

▼ Complete the post-installation SCI interconnect configuration if
appropriate

▼ Configure environmental variables

Preparation
Obtain the following information from your instructor:

1. Ask your instructor which volume manager is to be installed on
your assigned cluster.

Volume manager: _______________


Tasks
The following tasks are explained in this section:

● Updating the name service

● Installing Solaris operating system patches

● Verifying storage array firmware revisions

● Recording the cluster host Ethernet addresses

● Installing the Sun Cluster server software

● Performing post installation SCI interconnect configuration

● Preparing the cluster host root environment

● Verifying basic cluster operation

Update the Name Service


1. Edit the /etc/hosts file on the administrative workstation and
all cluster nodes and add the IP addresses and hostnames of the
administrative workstation and cluster nodes.

2. If you are using NIS or NIS+, add the IP addresses and hostnames
to the name service.

Note – Your lab environment might already have all of the IP
addresses and host names entered in the /etc/hosts file.
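A minimal /etc/hosts fragment might look like the following; the
names and addresses shown here are placeholders for your assigned
systems:

129.150.218.10   adminws
129.150.218.11   phys-node0
129.150.218.12   phys-node1
129.150.218.13   phys-node2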

Installing Solaris Operating System Patches


1. Ask your instructor if any Solaris operating system patches should
be installed on the cluster host systems.

2. Reboot all server nodes after installing the patches.


Storage Array Firmware Revision


If you need to upgrade the SPARCstorage™ Array (SSA) or A5000
firmware, you must install the correct patch on all the cluster hosts. In
addition, you must download the firmware to all downlevel storage
arrays. Instructions are included in the README file for the patch.

If you are using Multipacks, you must ensure that the drive firmware
is at the proper level. See the Sun Cluster Release notes for more
information.

Installation Preparation
Caution – Remember to perform the installation in parallel on all
cluster host systems. Wait until all nodes have completed each step
before proceeding to the next installation step.

1. Record the Ethernet address of each of your cluster host systems
on the ‘‘Cluster Name and Address Information’’ section on page
A-2.

Note – Type the ifconfig -a command in the cconsole common
window.


Server Software Installation

Note – Do not configure a shared CCD volume during installation.

1. In the cconsole common window, log in to each of the cluster
host systems as user root.

2. Change to the Sun Cluster software location for the correct
operating system version (Sun_Cluster_2_2/Sol_2.6/Tools).

3. Start the scinstall script at the same time on all cluster hosts.

4. As the installation proceeds, make the following choices:

a. Install the Server option

b. Choose the automatic mode of installation

c. Choose your assigned volume manager.

d. Use the assigned cluster name

e. Make the number of potential and initially configured nodes
the same.

f. Select the private network interface that is appropriate for
your cluster (Ethernet or SCI).

g. Supply the Ethernet interconnect controller names, cluster host
names, and Ethernet addresses, as appropriate, for each of the
cluster host systems.

h. Answer yes for data service support.

i. Answer no to setting up the logical hosts.

j. Configure failure fencing, if you are prompted.

k. Configure the node locking port, if you are prompted.



l. Configure the quorum device(s) for your cluster if you are
prompted.

Note – Ask your instructor for help with the quorum devices if you
are unsure about how to respond to the installation questions.

m. If you are asked about cluster partitioning behavior, choose
the ask option.

n. Select either the HA Oracle or Oracle Parallel Server (OPS)
data service (OPS requires CVM to be installed).

5. After the installation is complete, practice using the scinstall
List and Verify options.

6. Quit the scinstall program.

7. Apply any Sun Cluster patches as directed by your instructor.
However, you must install either the Sun Cluster patch 107388 or
107538.

Note – Do not reboot your cluster hosts at this time.

SCI Interconnect Configuration


Skip this task if your cluster uses the Ethernet-based interconnect.

To complete the SCI interconnect installation, you must create a special
template file and run the sm_config program.

1. On all nodes, copy the appropriate SCI template file into the
/opt/SUNWsma/bin directory.
# cd /opt/SUNWsma/bin/Examples
# cp switch1.sc ..
# cd ..


Note – There are also sample SCI template files in the Scripts/SCI
training file directory. There are versions for two and three-node
clusters. You can use them as a reference.

2. On all nodes, edit the template files and change the host names
and number of hosts associated with the HOST variables to match
your cluster potential node configuration.
Warning – The sm_config script should be run only on one node if
all nodes in the cluster can communicate using a public network.

3. On Node 0, run the sm_config script.


# ./sm_config -f ./template_name

Although a lot of configuration information is output by this
command, the most important point is that there are no obvious
errors.

4. Verify the installation on all cluster host systems by checking the
contents of the /etc/sma.ip and /etc/sma.config files.
# more /etc/sma.config
0 sec-0 0 8 1 0 0 0
0 sec-0 1 c 17 1 0 1
1 sec-1 0 48 2 0 1 0
1 sec-1 1 4c 18 1 1 1
2 sec-2 0 88 3 0 2 0
2 sec-2 1 8c 19 1 2 1
# more /etc/sma.ip
Hostname = sec-0 scid0 IP address = 204.152.65.1
Hostname = sec-0 scid1 IP address = 204.152.65.17
Hostname = sec-1 scid0 IP address = 204.152.65.2
Hostname = sec-1 scid1 IP address = 204.152.65.18
Hostname = sec-2 scid0 IP address = 204.152.65.3
Hostname = sec-2 scid1 IP address = 204.152.65.19


Cluster Reboot
At this point, you might have installed some patches and configured a
SCI-based cluster interconnect. Regardless, you must reboot all of your
cluster host systems.

1. Reboot all of your cluster host systems.

2. Verify the cluster interconnect functionality using the
get_ci_status command.

Configuration Verification
1. Use the scconf command on all cluster host systems to verify that
the basic CDB database information is correct.
# scconf clustername -p

Note – A quorum disk is not configured in SDS installations.

2. If you identify any errors in the scconf -p command output,
discuss the correction process with your instructor.


Testing Basic Cluster Operation


As a checkpoint test, use the following procedure to verify the basic
cluster software operation.

Note – You are using commands that have not yet been presented in
the course. If you have any questions, please consult with your
instructor.

1. Log in to each of your cluster host systems as user root.

2. Start the cluster software only on Node 0. Substitute the name of
your node and cluster for node0_name and cluster_name,
respectively.
# scadmin startcluster node0_name cluster_name

Note – This can take 1 to 2 minutes. It must complete before
proceeding. It is complete when you see the finished message.

3. After the initial cluster startup has completed on Node 0, start the
Sun Cluster software on all remaining nodes.
# scadmin startnode

Warning – When joining multiple nodes to the cluster, you must enter
the scadmin startnode command on each node at exactly the same
time. If you cannot start them at exactly the same time, then join them
one at a time, letting each join complete before starting the next. If you
are not careful, you can cause CCD database corruption.

4. After all the nodes have joined the cluster, check the cluster status
by typing the get_node_status command on all nodes.

Note – The get_node_status command is located in the
/opt/SUNWcluster/bin directory.

Check Your Progress

Before continuing on to the next module, check that you are able to
accomplish or answer the following:

❑ Install the Sun Cluster host system software

❑ Correctly interpret configuration questions during Sun Cluster
software installation on the cluster host systems

❑ Perform post-installation configuration

Think Beyond

As you add additional nodes to the cluster, what might you need to do
on the existing nodes? Can you do this while the nodes are running?

What kinds of configuration changes need to be made simultaneously
on all nodes? How can you tell? What would happen if you did not
make them simultaneously?

What would happen if you did not specify any quorum devices or use
failure fencing?

System Operation 6

Objectives

Upon completion of this module, you should be able to:

● Use the cluster administration tools

● Use the Sun Cluster Manager (SCM) Graphical User
Interface (GUI)

● Use the get_node_status and hastat status commands

● List the Simple Network Management Protocol (SNMP) features

This module introduces cluster monitoring, operations software, and
new administrative features.

Relevance

Discussion – The following questions are relevant to your
understanding of the module’s content:

1. What needs to be monitored in the Sun Cluster environment?

2. How current does the information need to be?

3. How detailed does the information need to be?

4. What types of information are available?

Additional Resources

Additional resources – The following references can provide
additional details on the topics discussed in this module:

● Sun Cluster 2.2 System Administration Guide, part number 805-4238

● Sun Cluster 2.2 Software Installation Guide, part number 805-4239


Cluster Administration Tools

Several administration tools are used to monitor and operate a Sun
Enterprise cluster system. These tools run either locally on cluster host
nodes or from the administration workstation.

This module reviews and expands on some previously presented
concepts and material, but its main objective is to introduce new
administrative tools and commands.


As shown in Figure 6-1, the Cluster Control Panel is initiated on the
cluster administration workstation, but the remainder of the cluster
control and status tools are run locally on each of the host systems.

Figure 6-1 Cluster Administration Tools (the Cluster Control Panel
runs on the administration workstation; the Sun Cluster Manager and
commands such as scadmin and hastat run on the cluster nodes,
reached through the public network or the terminal concentrator
serial ports)


Basic Cluster Control (scadmin)


You use the scadmin command to perform cluster control operations.

Note – The scadmin command has a number of specialized options
that you cannot use until the cluster is fully configured. For that
reason, these options are not discussed until later in the course.


Command Options for scadmin

The following scadmin command options are related only to starting
or stopping the cluster.

scadmin startcluster local_node clustname
scadmin startnode [ clustname ]
scadmin stopnode [ clustname ]

Starting the First Node in a Cluster

You must join the first node to the cluster with the startcluster
option. The node name and cluster name must be furnished.

# scadmin startcluster node0 eng-cluster

Note – This command must complete successfully before you can join
other nodes to the cluster.

Adding and Removing Nodes in the Cluster

You can start and stop additional nodes simultaneously. The cluster
and node names are assumed from the CDB database.

# scadmin startnode
# scadmin stopnode

Warning – When starting or stopping additional nodes in the cluster,
they must either be started or stopped at exactly the same time, or each
node must be allowed to complete before starting or stopping the next
node. If you do not do this, you can corrupt the CCD.


Cluster Control Panel

The Cluster Control Panel is a convenient way to start some of the
commonly used cluster administration tools without having to
remember the program names and locations.

You can customize the Cluster Control Panel to control any desired
application.


Starting the Cluster Control Panel


To start the Cluster Control Panel, shown in Figure 6-2, type the
following command on the administration workstation:

# /opt/SUNWcluster/bin/ccp [clustname] &

Figure 6-2 Cluster Control Panel Display

Adding New Applications to the Cluster Control Panel


You can access the Cluster Control Panel New Item display, shown in
Figure 6-3, from the Properties menu. It allows you to add custom
applications and icons to the Cluster Control Panel display.

Figure 6-3 Cluster Control Panel New Item Menu


Console Tool Variations


Although the following three console tools look and behave the same,
the method used to access the cluster host systems is different for each.

● Cluster console (console mode)

You use the TC interface to access the host systems. The Solaris
Operating Environment does not have to be running on the cluster
host systems.

Only one connection at a time can be made through the TC.

● Cluster console (rlogin mode)

You use the rlogin command to access the host systems through
the public network. The Solaris Operating Environment must be
running on the cluster host systems.

● Cluster console (telnet mode)

You use the telnet command to access the host systems through
the public network. The Solaris operating system must be running
on the cluster host systems.

Note – The rlogin and telnet versions of the cluster console tool are
useful when there is a problem with the primary access path that uses
the TC.
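You can also start each console variant directly from the
administration workstation instead of from the Cluster Control Panel.
The following sketch assumes the tools are installed in
/opt/SUNWcluster/bin and uses a placeholder cluster name:

# /opt/SUNWcluster/bin/cconsole sc-cluster &
# /opt/SUNWcluster/bin/crlogin sc-cluster &
# /opt/SUNWcluster/bin/ctelnet sc-cluster &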


The hastat Command

The hastat command is run locally on each cluster host and provides
a large amount of information in a single output. It is presented in
sections here although you see all of the information at once.

You can run the hastat command on any node that is a cluster
member. It provides global information about all of the cluster
members.


General Cluster Status


The hastat command provides general cluster information about the
names of configured hosts, the current membership, and the uptime of
each cluster host.

# hastat
Getting Information from all the nodes ......

HIGH AVAILABILITY CONFIGURATION AND STATUS


-------------------------------------------

LIST OF NODES CONFIGURED IN <sc-cluster> CLUSTER


sc-node0 sc-node1 sc-node2

CURRENT MEMBERS OF THE CLUSTER


sc-node0 is a cluster member
sc-node1 is a cluster member
sc-node2 is a cluster member

CONFIGURATION STATE OF THE CLUSTER


Configuration State on sc-node0: Stable
Configuration State on sc-node1: Stable
Configuration State on sc-node2: Stable

UPTIME OF NODES IN THE CLUSTER

uptime of sc-node0:
12:47am up 1:38, 1 user,
load average: 0.14, 0.12, 0.10

uptime of sc-node1:
12:47am up 1:37, 1 user,
load average: 0.16, 0.12, 0.10

uptime of sc-node2:
12:50am up 1:38, 1 user,
load average: 0.16, 0.12, 0.10


Logical Host Configuration


The logical host portion of the hastat command output lists the
logical hosts that are mastered by each node and indicates which node
is the designated backup for each logical host.

LOGICAL HOSTS MASTERED BY THE CLUSTER MEMBERS

Logical Hosts Mastered on sc-node0:


sc-dbms

Loghost Hosts for which sc-node0 is Backup Node:


sc-nfs

Logical Hosts Mastered on sc-node1:


sc-nfs

Loghost Hosts for which sc-node1 is Backup Node:


sc-inetpro

Logical Hosts Mastered on sc-node2:


sc-inetpro

Loghost Hosts for which sc-node2 is Backup Node:


sc-dbms

LOGICAL HOSTS IN MAINTENANCE STATE

None


Private Network Status


The hastat command provides information about the private network
connections that are used in the cluster interconnect system.
STATUS OF PRIVATE NETS IN THE CLUSTER

Status of Interconnects on sc-node0:


interconnect0: selected
interconnect1: up
Status of private nets on sc-node0:
To sc-node0 - UP
To sc-node1 - UP
To sc-node2 - UP

Status of Interconnects on sc-node1:


interconnect0: selected
interconnect1: up
Status of private nets on sc-node1:
To sc-node0 - UP
To sc-node1 - UP
To sc-node2 - UP

Status of Interconnects on sc-node2:


interconnect0: selected
interconnect1: up
Status of private nets on sc-node2:
To sc-node0 - UP
To sc-node1 - UP
To sc-node2 - UP


Public Network Status


The hastat command provides configuration and status information
about the NAFO groups that are configured on each of the cluster
hosts.

STATUS OF PUBLIC NETS IN THE CLUSTER

Status of Public Network On sc-node0:

bkggrp r_adp status fo_time live_adp


nafo113 hme1 OK NEVER hme1

Status of Public Network On sc-node1:

bkggrp r_adp status fo_time live_adp


nafo113 hme1 OK NEVER hme1

Status of Public Network On sc-node2:

bkggrp r_adp status fo_time live_adp


nafo113 hme0 OK NEVER hme0


Data Service Status


The hastat command provides configuration and status information
about the data services that are configured on each of the cluster hosts.

STATUS OF SERVICES RUNNING ON LOGICAL HOSTS IN THE


CLUSTER

Status Of Data Services Running On sc-node0


Data Service HA-SYBASE:
No Status Method for Data Service dns

Data Service HA-NFS:


On Logical Host sc-dbms: Ok

Status Of Data Services Running On sc-node1


Data Service HA-SYBASE:
No Status Method for Data Service dns

Data Service HA-NFS:


On Logical Host sc-nfs: Ok

Status Of Data Services Running On sc-node2


Data Service HA-SYBASE:
No Status Method for Data Service dns

Data Service HA-NFS:


On Logical Host sc-inetpro: Ok


Cluster Error Messages


The hastat command gathers and displays the most recent error
messages from each of the cluster hosts.

RECENT ERROR MESSAGES FROM THE CLUSTER

Recent Error Messages on sc-node0

Feb 2 00:24:20 sc-node0 unix: sbusmem51 at sbus3:


SBus3 slot 0x3 offset 0x0
Feb 2 00:24:20 sc-node0 unix: sbusmem51 is
/sbus@7,0/sbusmem@3,0
Feb 2 00:36:31 sc-node0
ID[SUNWcluster.ha.hareg.2004]: Service dns is
registered

Recent Error Messages on sc-node1

Feb 2 00:24:22 sc-node1 unix: sbusmem45 at sbus2:


SBus2 slot 0xd offset 0x0
Feb 2 00:24:22 sc-node1 unix: sbusmem45 is
/sbus@6,0/sbusmem@d,0
Feb 2 00:24:22 sc-node1 unix: sbusmem48 at sbus3:
SBus3 slot 0x0 offset 0x0

Recent Error Messages on sc-node2

Feb 2 00:27:05 sc-node2 unix: sbusmem13 at sbus0:


SBus0 slot 0xd offset 0x0
Feb 2 00:27:05 sc-node2 unix: sbusmem13 is
/sbus@1f,0/sbusmem@d,0
Feb 2 00:27:05 sc-node2 unix: sbusmem14 at sbus0:
SBus0 slot 0xe offset 0x0


Sun Cluster Manager Overview

The Sun Cluster Manager (SCM) provides detailed cluster status
information. It is a Java application that you can run in the following
three ways:

● Directly as a standalone application

● From a local browser

● From a remote browser

Note – Remote browser operation requires that you configure the
cluster host systems as HTTP servers. You must configure the Apache
HTTP server software on each of the cluster hosts.


Sun Cluster Manager Startup


When the Sun Cluster software is started on a node, the
/opt/SUNWcluster/scmgr/lib/scmgr_start script runs
automatically. This script starts a Java process that functions as the
information server for the SCM graphical application.

Currently, you must install the SUNWscmgr package on each cluster
node along with the following patch:

● 107388 if you are running Solaris 2.6

● 107538 if you are running Solaris 7

Once you install the appropriate patch, you can start the SCM user
interface as follows:

# /opt/SUNWcluster/bin/scmgr local_host_name

Note – The user application is started on only one node in the cluster.
Running the application directly requires that you set the DISPLAY
variable to the administration workstation. The Sun Cluster software
must be running.
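For example, in a Bourne or Korn shell on the cluster host, you might
set the display to the administration workstation before starting the
application; the workstation name here is a placeholder:

# DISPLAY=adminws:0.0
# export DISPLAY
# /opt/SUNWcluster/bin/scmgr local_host_name &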


Initial Sun Cluster Manager Display


As shown in Figure 6-4, the initial SCM display shows an icon for the
entire cluster. You can either click on the icon or select the Properties
tab to display the cluster configuration.

Figure 6-4 Initial Sun Cluster Manager Display


Sun Cluster Manager Displays

The SCM user interface has several displays that can supply detailed
information about the physical cluster components and their
operational status.

The SCM user interface provides a great deal of centralized
information and can be a valuable time-saving tool for the
administrator.


SCM Cluster Configuration Display


The SCM Cluster Configuration display, shown in Figure 6-5 and
accessed from the Properties tab, provides a visual tree representation
of the cluster configuration. As you select each component in the
configuration tree, detailed information about the component is
displayed.

Figure 6-5 SCM Properties Display


Configuration Tree

You can expand the cluster configuration tree by selecting the node
points that contain a plus (+) sign. A node that contains a minus (-)
sign collapses when selected. The example in Figure 6-6 shows that the
node p-n0-b has not been expanded yet. You can expand the private
network for the node p-n1-b even further.

Figure 6-6 SCM Configuration Tree


SCM Cluster Events Viewer


The SCM Cluster Events viewer, shown in Figure 6-7 and accessed
from the Events tab, displays a list of significant events associated with
the item selected in the configuration tree. The events are not system
errors but are associated with state changes for the selected item.

Figure 6-7 SCM Events Viewer


System Log Filter


The Syslog function, shown in Figure 6-8 and accessed from the Syslog
tab, displays system log file entries that are related to the item selected
in the configuration tree. The entries are retrieved from the
/var/adm/messages files on the cluster hosts.

Figure 6-8 SCM Syslog Filter


The SCM Help Display


The SCM help display is started from the Help menu in the main SCM
window. A HotJava browser is then displayed, as shown in Figure 6-9.

Figure 6-9 SCM Help Display


Sun Cluster SNMP Agent

The Sun Cluster software provides full SNMP support as an
alternative to the GUI-based monitoring tools. The software collects
detailed configuration and status information. You can obtain full
cluster status from SNMP network management workstations.


SNMP Agent Overview


The sequence of events that occurs when the Management Information
Base (MIB) tables gather cluster information is as follows:

1. The Super Monitor agent smond connects to in.mond on all of the
requested cluster nodes.

2. smond passes the collected config and syslog information to the
snmpd daemon.

3. snmpd fills in the cluster MIB tables, which are now available to
clients through SNMP GET operations.

4. snmpd sends out enterprise-specific traps for critical cluster events,
as notified by smond’s syslog data.

Cluster MIB Tables


The following tables provide information about the clusters:

● clustersTable

● clusterNodesTable

● switchesTable

● portsTable

● lhostTable – Logical host characteristics

● dsTable – Configured data service characteristics

● dsinstTable – Data service instance information

Note – Each table provides several attributes that describe specific
information about major functional sections of the clusters.


SNMP Traps
The following SNMP traps are generated for critical cluster events:

Table 6-1 Sun Cluster SNMP Traps

Trap No. Trap Name

0 sc:stopped
1 sc:aborted
4 sc:excluded
11 vm:down
21 db:up
31 vm_on_node:slave
100 SOCKET_ERROR:node_out_of_system_resources
106 UNREACHABLE_ERROR:node’s_mond_unreachable:
network_problems??
110 SHUTDOWN_ERROR:node’s_mond_shutdown
200 Fatal:super_monitor_daemon(smond)_exited!!

Note – For more information about using and troubleshooting SNMP
features, see the Sun Cluster 2.2 System Administration Guide.


Configuring the Cluster SNMP Agent Port


By default, the cluster SNMP agent listens on User Datagram Protocol
(UDP) Port 161 for requests from the SNMP manager, for example,
SunNet Manager Console. You can change this port by using the -p
option to the snmpd and smond daemons.

You must configure both the snmpd and smond daemons on the same
port to function properly.
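For example, to move both daemons to an unused port such as 5080,
each would be started with the same -p value. The daemon paths and
the port number below are only illustrative; check the installed
startup scripts on your cluster for the actual locations and arguments:

# /opt/SUNWcluster/bin/snmpd -p 5080
# /opt/SUNWcluster/bin/smond -p 5080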
Caution – If you are installing the cluster SNMP agent on an SSP or an
administrative workstation running the Solaris 2.6 Operating
Environment or compatible versions, always configure the snmpd and
the smond programs on a port other than the default UDP port 161.

For example, with the SSP, the cluster SNMP agent interferes with the
SSP SNMP agent, which also uses UDP port 161. This interference
could result in the loss of RAS features of the Sun Enterprise 10000
server.

Exercise: Using System Operations

Exercise objective – In this exercise you will do the following:

● Start the Cluster Control Panel

● Start the Sun Cluster Manager software

● Use the SCM Cluster Configuration display

● Use the SCM Events Viewer display

● Use the SCM Syslog Filter display

● Use the SCM Help display

Preparation
1. You must verify that the SCM software is installed on each of your
cluster host systems:

# pkginfo SUNWscmgr

2. Verify that the appropriate SCM patch is installed on each of your
cluster host systems:

# showrev -p |grep 107388 (for the Solaris 2.6 Operating Environment)

# showrev -p |grep 107538 (for Solaris 7 systems)

Note – The scmgr program does not function without the appropriate
patch installed.

3. If necessary, install the appropriate SCM patch on each cluster
host. Be sure to reboot the cluster hosts after patch installation so
that the scmgr process will start.
Warning – You must stop the Sun Cluster software on all nodes before
rebooting the system.

4. All nodes should be joined in the cluster and the cconsole tool
should be running from the previous lab.


Tasks
The following tasks are explained in this section:

● Starting the Cluster Control Panel

● Using the hastat command

● Using the Sun Cluster Manager

Starting the Cluster Control Panel


1. Start the Cluster Control Panel (CCP) on the administration
workstation. Substitute your cluster name.

# ccp clustername &

2. Start the telnet mode of the cluster console tool by double-clicking
on the telnet mode icon.

3. After practicing using the telnet mode of cconsole, log out of all
the telnet cluster host windows and quit the telnet cluster console
tool.

4. Practice using the CCP cluster help tool. Pay particular attention to
the glossary feature.

5. Quit the Cluster Control Panel.

Using the hastat Command


1. Use the cconsole common window and type the hastat
command on each of the cluster hosts.
# hastat | more

2. Compare the output of the cluster hosts.

3. Type the get_node_status command and compare its output to
that of the hastat command. It is useful for a quick status check.


Using the Sun Cluster Manager


1. On each cluster host, verify that the DISPLAY variable is set to the
administration workstation.
# echo $DISPLAY

2. On the administration workstation, type the xhost command in
the console window and verify that access control is disabled.
# xhost
access control disabled, clients can connect from any
host

3. Log in to any cluster host that is in clustered operation.

4. Verify that the SCM manager software is running on all of the
cluster hosts.

# ps -ef |grep java

5. Start the SCM client software on one of the cluster hosts.

# hostname

# /opt/SUNWcluster/bin/scmgr hostname &

6. If you do not see the SCM display on your administration
workstation after a short time, ask your instructor for help.

7. Select the Cluster Configuration display and practice navigating
through the configuration tree.

8. Select the Cluster Events Viewer display and practice viewing the
events related to different items in the configuration tree.

9. Select the System Log display and practice viewing system log file
messages related to different items in the configuration tree.

Note – The Previous and Next buttons scroll through the messages one
buffer contents at a time.



10. Display the Help menu and select some of the help items.

11. Practice using the HotJava Help viewer for a while, then quit it.

12. Quit the SCM application.

13. Stop the SC software on all cluster hosts using the scadmin
stopnode command.


Exercise Summary
Discussion – Take a few minutes to discuss what experiences, issues,
or discoveries you had during the lab exercises.

● Experiences

● Interpretations

● Conclusions

● Applications

Check Your Progress

Before continuing on to the next module, check that you are able to
accomplish or answer the following:

❑ Use the Cluster administration tools

❑ Use the Sun Cluster Manager GUI

❑ Use the hastat status command

❑ List the Simple Network Management Protocol (SNMP) features

Think Beyond

What would it be like to administer a cluster with 16 nodes and 200
storage arrays?

What would it be like to administer a cluster with each cluster member
located in a different city?

In general, how does SNMP interact with the cluster environment?

Volume Management Using CVM
and SSVM 7

Objectives

Upon completion of this module, you should be able to:

● Explain the disk space management technique used by the Cluster
Volume Manager (CVM) and the Sun StorEdge Volume Manager
(SSVM)

● Describe the initialization process necessary for CVM and SSVM

● Describe how the CVM and SSVM products group disk drives

● List the basic status commands for CVM and SSVM

● Describe the basic software installation process for CVM and SSVM

● List post-installation issues relevant to CVM and SSVM

● Install and configure either CVM or SSVM

This module introduces some of the basic concepts common to the
Cluster Volume Manager and the Sun StorEdge Volume Manager.

Relevance

Discussion – The following questions are relevant to your learning the
material presented in this module:

1. Which volume manager features are the most important to
clustered systems?

2. What relationship do the volume managers have to normal cluster
operation?

3. Are there any volume manager feature restrictions when they are
used in the Sun Cluster environment?

Additional Resources

Additional resources – The following references can provide
additional details on the topics discussed in this module:

● Sun Cluster 2.2 System Administration Guide, part number 805-4238

● Sun Cluster 2.2 Software Installation Guide, part number 805-4239

● Sun Cluster 2.2 Cluster Volume Manager Guide, part number
805-4240


Disk Space Management

The Cluster Volume Manager (CVM) and Sun StorEdge Volume
Manager (SSVM) use the same method to manage disk storage space.

CVM and SSVM manage data in a non-partitioned environment. They
manage disk space by maintaining tables that associate a list of
contiguous disk blocks with a data volume structure. A single disk
drive can potentially be divided into hundreds of independent data
regions.


CVM and SSVM Disk Space Management


As shown in Figure 7-1, CVM and SSVM maintain detailed
configuration records that equate specific blocks on one or more disk
drives with virtual volume structures.

Figure 7-1 CVM and SSVM Space Management (contiguous block
ranges within a single slice, such as slice 4 of a physical disk drive,
are mapped to Volume 01, Volume 02, and Volume 03)

Both CVM and SSVM divide a disk into a single slice and then allocate
portions of the slice to volume structures.


CVM and SSVM Initialization

When a physical disk drive is initialized by CVM or SSVM, it is
divided into two sections called the private region and the public
region.

The private and public regions are used for different purposes.

● The private region is used for configuration information.

● The public region is used for data storage.

The private region is small. It is usually configured as slice 3 on the
disk and at most is a few cylinders in size.

The public region is the rest of the disk drive. It is usually configured
as slice 4 on the disk.

Note – You must specially initialize all disk drives that are used by
CVM or SSVM.


Private Region Contents


The size of the private region, by default, is 1024 sectors (512 Kbytes).
It can be enlarged if a large number of Volume Manager (VM) objects
are anticipated. If you anticipate having over 2000 VM objects, you
should increase the size of the private region (an example command is
shown after the following list). The private region contents are:

● Disk header

Two copies of the file that defines and maintains the host name or
cluster name, unique disk ID, disk geometry information, and disk
group association information.

● Table of contents

The disk header points to this linked list of blocks.

● Configuration database

This database contains persistent configuration information for all
of the disks in a disk group. It is usually referred to as the
configdb record.

● Disk group log

This log is composed of kernel-written records of certain types of
actions, such as transaction commits, plex detaches resulting from
I/O failures, dirty region log failures, first write to a volume, and
volume close. It is used after a crash or clean reboot to recover the
state of the disk group just before the crash or reboot.
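The following is a minimal sketch of initializing a disk with a larger
private region. The device name, the private region length, and the
vxdisksetup path are assumptions for illustration; verify them against
your installed volume manager before use:

# /usr/lib/vxvm/bin/vxdisksetup -i c1t0d0 privlen=2048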

Public Region Usage


Volumes are created from plexes. A plex is built from one or more
subdisks. Subdisks are areas of the public region that are mapped and
controlled by CVM and SSVM. The public region can be a single large
subdisk or many smaller subdisks. You can assemble subdisks from
many different physical disk drives into a plex that is then associated
with a volume name. Subdisks are not partitions.


Private and Public Region Format


The private and public region format of an initialized SSVM disk can
be verified with the prtvtoc command. As shown in the following
example, slice 2 is defined as the entire disk. Slice 3 has been assigned
tag 15 and is 2016 sectors in size. Slice 4 has been assigned tag 14 and
is the rest of the disk.

In this example, the private region is the first two cylinders on the
disk. The disk is a 1.05-Gbyte disk, and a single cylinder has only 1008
sectors (blocks), which does not meet the 1024-sector minimum size for
the private region. The cylinder size is calculated using the nhead=14
and nsect=72 values for the disk found in the /etc/format.dat file.

# prtvtoc /dev/rdsk/c2t4d0s2

                         First     Sector      Last
Partition   Tag   Flags  Sector    Count       Sector
    2        5     01         0    2052288     2052287
    3       15     01         0       2016        2015
    4       14     01      2016    2050272     2052287

Initialized Disk Types


By default, SSVM initializes disk drives with the type Sliced. There are
other possible variations. The three types of initialized disks are:

● Simple – Private and public regions are on the same partition.

● Sliced – Private and public regions are on different partitions
(default)

● nopriv – Does not have a private region

Note – You should not use the nopriv type. It is normally used
only for RAM disk storage on non-Sun systems.


CVM and SSVM Encapsulation

If you have existing data on the disk, you would not want to initialize
the disk, because this destroys any data. Instead, you can choose to
encapsulate the disk.

When you install the Sun StorEdge Volume Manager software on a
system, you can place your system boot disk under SSVM control
using the vxinstall program.

For Sun StorEdge Volume Manager to encapsulate the disk, there
should be at least 1024 sectors in an unused slice at the beginning or
end of the disk, and two free partitions.


Preferred Boot Disk Configuration


Although there are many possible boot disk variations, the preferred
boot disk configuration is shown in Figure 7-2.

Figure 7-2 Preferred Boot Disk Configuration (rootvol and its mirror
reside in the rootdg disk group on separate SCSI controllers c0 and c1;
the storage array on the SOC controller c2 holds a separate newdg
disk group)

The preferred configuration has the following features:

● The boot disk and mirror are on separate interfaces

● The boot disk and mirror are not in a storage array

● Only the boot disk and mirror are in the rootdg disk group


Prerequisites for Boot Disk Encapsulation


For the boot disk encapsulation process to succeed, the following
prerequisites must be met:

● The disk must have at least two unused slices

● The boot disk must not have any slices in use other than the
following:
● root
● swap
● var
● usr

The following additional prerequisite is desirable but not mandatory:

● There should be at least 1024 sectors of unused space on the disk. In practice, this is at least two full cylinders.

This space is needed for the private region. SSVM takes the space from the end of the swap partition if necessary. A quick way to check the current slice layout is shown below.
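
In the following sketch, c0t0d0 is only an example device name; substitute the disk you intend to encapsulate. Any slice number that does not appear in the output is unassigned and can be used, and the header of the prtvtoc output reports the sectors-per-cylinder value used to decide how many cylinders the private region needs.

# prtvtoc /dev/rdsk/c0t0d0s2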

Primary and Mirror Configuration Differences


When you encapsulate your system boot disk, the location of all data
remains unchanged even though the partition map is modified.

When you mirror the encapsulated boot disk, the location of the data
on the mirror is different from the original boot disk.

During encapsulation, a copy of the system boot disk partition map is


made so that the disk can be returned to a state that allows booting
directly from a slice.

The mirror of the boot disk cannot easily be returned to a sliced


configuration.

CVM and SSVM Encapsulation

The /etc/vfstab File


A backup copy of the /etc/vfstab file is made before the new boot
disk path names are configured. The following /etc/vfstab file is
typical for a boot disk with a single partition root file system.
#device device mount FS fsck mount mount
#to mount to fsck point type pass at boot options
#
fd - /dev/fd fd - no - -
/proc - /proc proc - no -
/dev/vx/dsk/swapvol - - swap - no -
/dev/vx/dsk/rootvol /dev/vx/rdsk/rootvol / ufs 1 no -
swap - /tmp tmpfs - yes -
#
#NOTE: volume rootvol (/) encapsulated partition c0t0d0s0
#NOTE: volume swapvol (swap) encapsulated partition c0t0d0s1

Boot PROM Changes


When the system boot disk is encapsulated, you can no longer boot
directly from a boot disk partition. The SSVM software creates two
new boot aliases for you so that you can boot from the primary system
boot disk, or if a failure occurs, you can boot from the surviving
mirror. You can examine the new boot aliases as follows:
# eeprom | grep devalias
devalias vx-rootdisk /sbus@1f,0/SUNW,fas@e,8800000/sd@1,0:a
devalias vx-rootmir /sbus@1f,0/SUNW,fas@e,8800000/sd@0,0:a

If your primary boot disk fails, you can boot from the surviving mirror
as follows:

ok boot vx-rootmir

CVM and SSVM Encapsulation

Un-encapsulating the Boot Disk


About the only time you might want to un-encapsulate the system boot disk is when you are removing the SSVM software. The vxunroot command is used to un-encapsulate the boot disk, but first you must make sure the following actions have been taken (a command sketch appears at the end of this section):

● All boot disk volumes have been unmirrored

● All non-root file systems, volumes, plexes, and subdisks have been removed

If you forget to prepare the boot disk, no harm is done; the vxunroot command performs a very thorough check before starting.

The vxunroot command performs the following functions:

● Checks for any unacceptable structures on the boot disk

● Returns the boot disk partition map to its original state

● Returns the /etc/system file to its original state

● Returns the /etc/vfstab file to its original state

● Returns the OpenBoot PROM device aliases to their original state

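The following is a minimal sketch of the sequence, assuming the boot disk mirror plexes are named rootvol-02 and swapvol-02; verify the actual plex names with the vxprint command before removing anything.

# vxprint -ht rootvol swapvol
# vxplex -o rm dis rootvol-02
# vxplex -o rm dis swapvol-02
# vxunroot

A reboot is then required to complete the conversion back to a sliced boot disk.
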

CVM and SSVM Disk Grouping

A disk group is an arbitrary collection of physical disks that allows a backup host to assume a workload. The disk groups are given unique names, and ownership is assigned either to a single cluster host system or to the cluster itself. Ownership depends on the intended database platform and application.

CVM and SSVM both use the term disk group to define a related collection of disk drives. The abbreviation “dg” is used frequently in related documentation.

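From the command line, you can see which disk groups a node currently has imported and examine their contents; the hanfs group shown here is only an example name.

# vxdg list
# vxprint -g hanfs
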
CVM and SSVM Disk Grouping

Cluster Volume Manager Disk Groups


You install the CVM version of the Veritas Volume Manager only when
using the Oracle Parallel Server database. As shown in Figure 7-3, the
CVM disk groups are owned by the cluster and have the name of the
cluster written in private regions on the physical disks. This means
that any node in the cluster can read or modify data in a disk group
volume.

[Figure: Node 0 and Node 1 both have access to the volumes in a disk group on a shared storage array; the disk group is owned by the cluster.]

Figure 7-3 CVM Disk Group Ownership

Note – To prevent simultaneous data access from two different cluster


host systems and possible corruption, the Oracle Parallel Server
database uses a software locking mechanism called distributed lock
management.

CVM and SSVM Disk Grouping

Sun StorEdge Volume Manager Disk Groups


SSVM can be used by all of the supported high availability databases and data services. As shown in Figure 7-4, SSVM disk groups are owned by an individual node, and the hostname of that node is written into the private regions of the physical disks.

Even though another node is physically connected to the same array, it cannot access data in the array that is part of a disk group it does not own. If a node fails, the disk group ownership can be transferred to another node that is physically connected to the array. This is the backup node scheme used by all of the supported high availability data services.

[Figure: Node 0 and Node 1 are both connected to a storage array containing a disk group and its volumes; the disk group is owned by Node 0.]

Figure 7-4 SSVM Disk Group Ownership

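During normal operation, the Sun Cluster framework transfers disk group ownership for you. The underlying mechanism can be sketched with the vxdg command as follows; the disk group name hanfs is an example.

On the node that currently owns the disk group:

# vxdg deport hanfs

On the backup node that is physically connected to the same array:

# vxdg import hanfs
# vxrecover -g hanfs -sb

The vxrecover -sb options start the volumes and perform any needed resynchronization in the background.
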

Volume Manager Status Commands

Although the graphical user interfaces for CVM, SSVM, and SDS
furnish useful visual status information, there are times when the
images might not update correctly or completely due to window
interlocks or system loads.

The most reliable and the quickest method of checking status is from
the command line. Command line status tools have the additional
benefits of being easy to use in script files, cron jobs, and remote
logins.

Volume Manager Status

Checking Volume Status

Using the vxprint Command

The vxprint command, used with both CVM and SSVM, is the easiest
way to check the status of all volume structures. The following sample
vxprint output shows the status of two plexes in a volume as bad.
One of the plexes is a log.

# vxprint

Disk group: sdg0

TY NAME      ASSOC     KSTATE    LENGTH   PLOFFS   STATE
dg sdg0      sdg0      -         -        -        -
dm disk0     c4t0d0s2  -         8368512  -        -
dm disk7     c5t0d0s2  -         8368512  -        -
v  vol0      fsgen     ENABLED   524288   -        ACTIVE
pl vol0-01   vol0      DISABLED  525141   -        IOFAIL
sd disk0-01  vol0-01   ENABLED   525141   0        -
pl vol0-02   vol0      ENABLED   525141   -        ACTIVE
sd disk7-01  vol0-02   ENABLED   525141   0        -
pl vol0-03   vol0      DISABLED  LOGONLY  -        IOFAIL
sd disk0-02  vol0-03   ENABLED   5        LOG      -

Note – You can use the vxprint -ht vol0 command to get a detailed
analysis of the volume. This gives you all the information you need,
including the physical path to the bad disk.

You can also use the vxprint command to create a backup


configuration file that is suitable for recreating the entire volume
structure. This is useful as a worst-case disaster recovery tool.

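A minimal sketch of this technique, assuming the disk group sdg0 from the previous example and an arbitrary file name, is:

# vxprint -g sdg0 -m > /var/tmp/sdg0.config

The -m option writes the records in a format that the vxmake command can read back. In a worst-case recovery, after the disks have been re-initialized and added back to the disk group, the structures can be rebuilt and started:

# vxmake -g sdg0 -d /var/tmp/sdg0.config
# vxvol -g sdg0 start vol0
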
Volume Manager Status

Checking Disk Status


When disk drives fail, the CVM or SSVM software can lose complete
contact with a disk and no longer display the physical path with the
vxprint -ht command. At those times, you must find the media
name of the failed disk from the vxprint command and then use the
vxdisk list command to associate the media name with the physical
device.

# vxdisk list

DEVICE TYPE DISK GROUP STATUS


c0t0d0s2 sliced - - error
c0t1d0s2 sliced disk02 rootdg online
- - disk01 rootdg failed was:c0t0d0s2

When a disk fails and becomes detached, the CVM or SSVM software
cannot currently find the disk but still knows the physical path. This is
the origin of the failed was status. This means the disk has failed
and the physical path was the value displayed.

Saving Configuration Information


The vxprint and vxdisk commands can also be used to save detailed
configuration information that is useful in disaster recovery situations.
The output of the following commands should be copied into a file
and stored on tape. You should also keep a printed copy of the files.
# vxprint -ht > filename
# vxdisk list > filename


Optimizing Recovery Times

In the Sun Cluster environment, data volumes are frequently mirrored to achieve a higher level of availability. If one of the cluster host systems fails while accessing a mirrored volume, the recovery process might involve several steps, including:

● Mirrors must be synchronized

● File systems must be checked

Mirror synchronization can take a long time and must be completed before file systems can be checked. If your cluster uses many large volumes, the complete volume recovery process can take hours.

You can expedite mirror synchronization by using the CVM/SSVM


dirty region logging feature. You can expedite file system recovery by
using the Veritas VxFS file system software.

Note – Although you can also expedite file system recovery by using
the Solaris 7 Operating Environment UFS logging feature, the current
versions of CVM and SSVM do not run in the Solaris 7 Operating
Environment.

Optimizing Recovery Times

Dirty Region Logging


A dirty region log (DRL) is a CVM or SSVM log file that tracks data
changes made to mirrored volumes. The DRL is used to speed
recovery time when a failed mirror needs to be synchronized with a
surviving mirror.

● Only those regions that have been modified need to be synchronized between mirrors.

● Improper placement of DRLs can negatively affect performance.

A volume is divided into regions, and a bitmap (where each bit corresponds to a volume region) is maintained in the DRL. When a write to a particular region occurs, the respective bit is set to on. When the system is restarted after a crash, this region bitmap is used to limit the amount of data copying that is required to recover plex consistency for the volume. The region changes are logged to special log subdisks linked with each of the plexes associated with the volume. Use of dirty region logging can greatly speed recovery of a volume.
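
The following sketch shows how a DRL can be included when a mirrored volume is created with the vxassist command, or added to an existing mirrored volume; the disk group and volume names are examples.

# vxassist -g hanfs make vol01 500m layout=mirror,log
# vxassist -g hanfs addlog vol02 logtype=drl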

The Veritas VxFS File System


The UNIX file system relies on full structural verification by the fsck
command to recover from a system failure. This means checking the
entire structure of a file system, verifying that it is intact, and
correcting any inconsistencies that are found. This can be time
consuming.

The VxFS file system provides recovery only seconds after a system failure by using a tracking feature called intent logging. Intent logging is a logging scheme that records pending changes to the file system structure. During system recovery from a failure, the intent log for each file system is scanned and pending operations are completed. The file system can then be mounted without a full structural check of the entire file system. When the failure is a disk hardware failure, the intent log might not be enough to recover, and in such cases a full fsck check must be performed. Often, however, when the failure is due to software rather than hardware, the system can be recovered in seconds.

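VxFS is separately licensed software. If it is installed, a VxFS file system can be created and mounted on a volume as sketched below; the volume and mount point names are examples.

# mkfs -F vxfs /dev/vx/rdsk/hanfs/hanfs.1
# mount -F vxfs /dev/vx/dsk/hanfs/hanfs.1 /hanfs1

After a crash, the intent log is replayed automatically when the file system is mounted, so a full fsck run is normally not needed.
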

CVM and SSVM Post-Installation

You must complete additional configuration tasks before your cluster is operational.

Initializing the rootdg Disk Group


Until a disk group named rootdg is created on a system, the CVM or
SSVM software cannot start.

There are three ways to satisfy this requirement:

● Create a dummy rootdg on a single small slice on any system disk

● Initialize any storage array disk and add it to the rootdg disk
group

● Encapsulate the system boot disk

Note – The dummy rootdg method is used in the lab exercise for this
module.

CVM and SSVM Post-Installation

Matching the vxio Driver Major Numbers


During software installation, device drivers are assigned a major
number in the /etc/name_to_major file. Unless these numbers are
the same on HA-NFS primary and backup host systems, the HA-NFS
users receive “Stale file handle” error messages after a HA-NFS logical
host migrates to a backup system. This effectively terminates the user
session and destroys the high availability feature.

It makes no difference what the major numbers are as long as they


agree on all of the host systems. All nodes associated with a HA-NFS
logical host must be checked as follows:
# grep vxio /etc/name_to_major
vxio 45

Make sure that the vxio major number is the same in all of the files and is not used by any other driver. Change one file so that they all match or, if that is not possible, assign a completely new, unused number in all of the files.

If your boot disk is not encapsulated, you can stop all activity on the nodes and edit the /etc/name_to_major files so they all agree. A quick way to compare the entries is sketched below.

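The node names in the following sketch are examples, and rsh access between the systems is assumed.

# for node in node0 node1
> do
> rsh $node grep vxio /etc/name_to_major
> done
vxio 45
vxio 45
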
Note – If your boot disk has been encapsulated, the process is


somewhat more complex. You should consult the Sun Cluster Software
Installation Guide for detailed instructions.

Warning – You must stop the Sun Cluster software before making
changes to the vxio driver major number.

CVM and SSVM Post-Installation

StorEdge Volume Manager Dynamic Multi-Pathing

Dynamic Multi-Path Driver Overview

The dynamic multi-path driver (DMP) is unique to the SSVM product. It is used only with fiber-optic interface storage arrays. As shown in Figure 7-5, the DMP driver can access the same storage array through more than one path. The DMP driver automatically configures multiple paths to the storage array if they exist. Depending on the storage array model, the paths are used either for load balancing or in a primary/backup mode of operation.
[Figure: a host system running the DMP driver reaches the same storage array through two fiber-optic paths, SOC card C1 to array controller C1 and SOC card C2 to array controller C2.]

Figure 7-5 Dynamic Multi-Path Driver

CVM and SSVM Post-Installation

StorEdge Volume Manager Dynamic Multi-Pathing

Disabling the Dynamic Multi-Path Feature

The DMP feature is not compatible with the cluster operation and it
must be permanently disabled. During SSVM software installation, the
DMP feature is automatically configured. DMP must be disabled and
completely removed. The procedure is as follows:

1. Remove the vxdmp driver from the /kernel/drv directory.


# rm /kernel/drv/vxdmp

2. Edit the /etc/system file and remove (comment out) the


following line:
forceload: drv/vxdmp

3. Remove the volume manager DMP files.


# rm -rf /dev/vx/dmp /dev/vx/rdmp

4. Symbolically link /dev/vx/dmp to /dev/dsk and /dev/vx/rdmp


to /dev/rdsk.
# ln -s /dev/dsk /dev/vx/dmp
# ln -s /dev/rdsk /dev/vx/rdmp

5. Perform a reconfiguration boot on the system.


# reboot -- -r

Caution – There are versions of patches 105463 and 106606 that partially re-enable DMP by installing /kernel/drv/vxdmp again. If that happens, a reboot will fail, and you will have to boot from CD-ROM to remove vxdmp again.

Exercise: Configuring Volume Management

Exercise objective – In this exercise you will do the following:

● Install your target volume manager (CVM or SSVM)

● Initialize your target volume manager (CVM or SSVM)

● Create demonstration volumes using script files.

Preparation
You must select the appropriate volume manager for installation
during this exercise. When you installed the Sun Cluster software you
were required to select a volume manager at that time. Unless you
install the same volume manager that you specified during the Sun
Cluster installation you might be missing critical support software.

Ask your instructor about the location of the software that will be
needed during this exercise. This includes software for CVM, SSVM,
and some script files.
Caution – There are two methods of initializing the volume manager software in this exercise. Ask your instructor now which method you should use.

Each of the cluster host boot disks should have a small unused slice. This is used for a dummy rootdg during the CVM/SSVM installation if you do not encapsulate the boot disk.

Exercise: Configuring Volume Management

Tasks
The following tasks are explained in this section:

● Installing the CVM or SSVM software

● Disabling Dynamic Multipathing

● Initializing the CVM or SSVM software

● Selecting disk drives for demonstration volumes

● Configuring the demonstration volumes

● Verifying cluster operation

Exercise: Configuring Volume Management

Installing the CVM or SSVM Software


1. Move to the location of the CVM or SSVM software.

2. To install the Sun StorEdge Volume Manager (SSVM) software,


verify that the following files are present:
# ls

SUNWasevm SUNWvmdev SUNWvmman SUNWvmsa SUNWvxva


SUNWvxvm

3. To install the Cluster Volume Manager (CVM) software, verify that


the following files are present:
# ls

SUNWvmdev SUNWvmman SUNWvxva SUNWvxvm

4. Run the pkgadd command on all cluster host systems to begin the
volume manager installation.
# pkgadd -d .

5. Select the all option unless you do not want to install the AnswerBook™ package SUNWasevm. You can enter a space-separated list of the package numbers you want to install.

6. Leave /opt as the default installation directory.

7. Do not install the Apache HTTPD package.

8. Reply yes to installing the StorEdge Volume Manager Server


software.

9. Reply yes to installing the SUNWvxva package.

10. Answer yes to installing conflicting files.

Exercise: Configuring Volume Management

Disabling Dynamic Multipathing (DMP)


1. Stop the Sun Cluster software on all of the cluster hosts.
# scadmin stopnode

2. Remove the vxdmp driver from the /kernel/drv directory.


# rm /kernel/drv/vxdmp

3. Edit the /etc/system file and remove (comment out) the


following line:
forceload: drv/vxdmp

4. Remove the volume manager DMP files.


# rm -rf /dev/vx/dmp /dev/vx/rdmp

5. Symbolically link /dev/vx/dmp to /dev/dsk and /dev/vx/rdmp


to /dev/rdsk.
# ln -s /dev/dsk /dev/vx/dmp
# ln -s /dev/rdsk /dev/vx/rdmp

6. Install the 106606 SSVM patch before proceeding.

7. Perform a reconfiguration boot on the system.


# reboot -- -r

Exercise: Configuring Volume Management

Creating a Simple rootdg Slice


Check with your instructor to see if you should use this procedure or
the following procedure in the ‘‘Encapsulating the Boot Disk’’ section
on page 7-31.

The initrootdg script is in the training Scripts/VM directory.


Warning – Do not perform this procedure unless you are absolutely
sure of the rootdg location on each of the nodes. This procedure will
destroy an active, formatted partition.

The following is an example of the initrootdg script:


# more initrootdg
vxconfigd -m disable
wait
vxdctl init
wait
vxdg init rootdg
wait
vxdctl add disk $1 type=simple
wait
vxdisk -f init $1 type=simple
wait
vxdg adddisk $1
wait
vxdctl enable
wait
rm /etc/vx/reconfig.d/state.d/install-db

1. Locate and run the initrootdg script on all nodes, specifying the
correct slice for each node’s local boot disk.
# initrootdg boot_disk_slice

Note – A typical slice would be c0t0d0s7.

2. Reboot all cluster nodes.

Exercise: Configuring Volume Management

Encapsulating the Boot Disk


Caution – Do not perform this section unless you have first checked with your instructor. The boot disks on your cluster host systems might not be properly configured for encapsulation.

1. On each cluster host, start the vxinstall utility.

2. Select installation option 2, Custom Installation.

3. Enter y (yes) at the Encapsulate Boot Disk prompt.

4. For all other disks and controllers, choose the Leave these
disks alone option.

5. Read the summary of your choices carefully at the end of the


interactive section. You can still quit without affecting any of the
system disks.

6. After the boot disk has been encapsulated, you must reboot each
cluster host system.

7. If your cluster host system has enough suitable disk drives, you
can also mirror your system boot disk.

Note – Get assistance from your instructor before attempting to mirror


the encapsulated boot disk. The vxdiskadm command is the easiest
method of creating the mirror. You can use option 6, Mirror volumes
on a disk.

Exercise: Configuring Volume Management

Selecting Demonstration Volume Disks


The makedg.vm training script is used to create four mirrored volumes in two disk groups; each disk group has two data disks in one array and two mirror disks in the other array. Figure 7-6 shows the relationship of the disk drives.

[Figure: two disk groups, hanfs (volumes hanfs.1 and hanfs.2) and hadbms (volumes hadbms.1 and hadbms.2); each volume has its primary disk in one array and its mirror disk in the other array, and Node 0 and Node 1 are both connected to both arrays through controllers c1 and c2.]

Figure 7-6 Demonstration Volume Structure

Exercise: Configuring Volume Management

Selecting Demonstration Volume Disks (Continued)


The makedg.vm script prompts you for the disk group name, the four
disks to be put into the disk group, and then creates the disk group
and two mirrored volumes in it.

The drives are specified in the form c0t4d0.

● Make sure that you run the volume creation script on an


appropriate node for each disk group.

The makedg.vm script is run once for each disk group that you need to
create.

Selecting Disks

Before you run the volume creation script, select and record the
physical path to eight disks that are suitable for creating mirrored
volumes.

Disk Group   Nodes        Volumes    Data Devices   Mirror Devices
hanfs        __________   hanfs.1    disk01         disk02
                          hanfs.2    disk03         disk04
hadbms       __________   hadbms.1   disk05         disk06
                          hadbms.2   disk07         disk08

Exercise: Configuring Volume Management

Selecting Demonstration Volume Disks (Continued)


The following is an example of the makedg.vm script output. The
hanfs disk group is created as a result.
# ./makedg.vm
what service would you like?
1. HA nfs
2. HA RDBMS
Enter choice (1|2) :
1
First Volume
Enter 2 disks: first data then mirror (Ex.c1t0d3 c2t0d4)
c2t1d0 c3t33d0
Second Volume
Enter 2 disks: first data then mirror (Ex.c1t2d3 c2t2d4)
c2t3d0 c3t35d0
Creating disk group hanfs
disk group hanfs built
Creating subdisks in disk group hanfs
Done with creating sd for hanfs
Creating plexes in disk group hanfs
Done with making plexes
Creating volumes in disk group hanfs
Done with creating volumes for hanfs
Enabling volumes in disk group
Done enabling volumes in group

Note – The script must be run a second time to create the hadbms disk
group.

Exercise: Configuring Volume Management

Configuring the CVM/SSVM Demonstration Volumes


Warning – The makedg.vm script can destroy existing data on the
storage arrays if you specify incorrect drives. If you specify your boot
drive, the script will destroy it.

The cluster does not need to be running during the creation of the disk
groups.
Caution – Run the script on only one cluster node. Do not run it through the cluster console on multiple nodes.
1. Log in as user root on the cluster node.

2. After verifying that your DISPLAY environment variable is set to


the Administration Workstation, start vxva on the cluster node
and watch the creation process.

3. Change to the training scripts directory on the proper cluster


node.

4. Run the makedg.vm script on Node 0 and create the hanfs disk
group.
Caution – You must run this script twice, once for each disk group that is needed (hanfs, hadbms).
5. Run the makedg.vm script again on Node 0 and create the hadbms
disk group.

6. Run newfs on each volume in each disk group you have created (example commands follow this list).

7. Examine the new volume structures using the vxprint command.


Verify that all volumes are in an enabled state.

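The following newfs commands are a sketch based on the disk group and volume names created by the script; adjust them if your names differ.

# newfs /dev/vx/rdsk/hanfs/hanfs.1
# newfs /dev/vx/rdsk/hanfs/hanfs.2
# newfs /dev/vx/rdsk/hadbms/hadbms.1
# newfs /dev/vx/rdsk/hadbms/hadbms.2
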
Exercise: Configuring Volume Management

Verifying the CVM/SSVM Demonstration File Systems


Before proceeding, you should verify that the file systems you created
are functional.

The file systems that you previously created are dependent on which
volume manager you are using. They are as follows for the
CVM/SSVM volume manager:

● The hanfs.1 and hanfs.2 volumes in disk group hanfs

● The hadbms.1 and hadbms.2 volumes in disk group hadbms

1. Create the file system mount points on each cluster host that will
be associated with the logical hosts.
# mkdir /hanfs1 /hanfs2
# mkdir /hadbms1 /hadbms2

2. Manually mount each file system on one of the nodes to ensure


that they are functional.
# mount /dev/vx/dsk/hanfs/hanfs.1 /hanfs1
# mount /dev/vx/dsk/hanfs/hanfs.2 /hanfs2
# mount /dev/vx/dsk/hadbms/hadbms.1 /hadbms1
# mount /dev/vx/dsk/hadbms/hadbms.2 /hadbms2

3. Verify the mounted file systems.


# ls /hanfs1 /hanfs2 /hadbms1 /hadbms2

Note – You should see a lost+found directory in each file system.

Exercise: Configuring Volume Management

Verifying the CVM/SSVM Demonstration File Systems


(Continued)
4. Create a test directory in the /hanfs1 file system for use later.
# mkdir /hanfs1/test_dir

5. Make sure the directory permissions are correct.


# cd /hanfs1
# chmod 777 test_dir
# cd /

6. Unmount all of the demonstration file systems.


# umount /hanfs1
# umount /hanfs2
# umount /hadbms1
# umount /hadbms2

Verifying the Cluster


1. Test the cluster by joining all nodes to the cluster.

Note – Remember to let the scadmin startcluster command finish


completely before joining any other nodes to the cluster.

Exercise: Configuring Volumes

Exercise Summary
Discussion – Take a few minutes to discuss what experiences, issues,
or discoveries you had during the lab exercises.

● Experiences

● Interpretations

● Conclusions

● Applications

Check Your Progress

Before continuing on to the next module, check that you are able to
accomplish or answer the following:

❑ Explain the disk space management technique used by the Cluster


Volume Manager (CVM) and the Sun StorEdge Volume Manager
(SSVM)

❑ Describe the initialization process necessary for CVM and SSVM

❑ Describe how the CVM and SSVM products group disk drives

❑ List the basic status commands for CVM and SSVM

❑ Describe the basic software installation process for CVM and SSVM

❑ List post-installation issues relevant to CVM and SSVM

❑ Install and configure either CVM or SSVM

Think Beyond

Where does Volume Manager recovery fit into the high availability
environment?

What planning issues are required for the Volume Manager in the high
availability environment?

Is the use of the Volume Manager required for high availability


functionality?

Volume Management Using SDS 8

Objectives

Upon completion of this module, you should be able to:

● Explain the disk space management technique used by Solstice


DiskSuite (SDS)

● Describe the initialization process necessary for SDS

● Describe how SDS groups disk drives

● List the basic SDS status commands

● Describe the basic SDS software installation process

● List the post-installation issues relevant to SDS

● Install and configure SDS

This module introduces some of the basic concepts of the Solstice


DiskSuite volume manager.

Relevance

Discussion – The following questions are relevant to your learning the


material presented in this module:

1. Which volume manager features are the most important to


clustered systems?

2. What relationship do the volume managers have to normal cluster


operation?

3. Are there any volume manager feature restrictions when they are
used in the Sun Cluster environment?

Additional Resources

Additional resources – The following references can provide


additional details on the topics discussed in this module:

● Sun Cluster 2.2 System Administration Guide, part number 805-4238

● Sun Cluster 2.2 Software Installation Guide, part number 805-4239

● Sun Cluster 2.2 Cluster Volume Manager Guide, part number


805-4240


Disk Space Management

The Solstice DiskSuite software manages disk space by associating


standard UNIX partitions with a data volume structure. A single disk
drive can be divided into only seven independent data regions, which
is the UNIX partition limit for each physical disk.

Disk Space Management

SDS Disk Space Management


As shown in Figure 8-1, SDS manages virtual volume structures by
equating standard UNIX disk partitions with virtual volume
structures.

[Figure: a physical disk drive whose standard UNIX slices (for example, slices 0, 3, 4, and 6) are mapped directly to SDS volumes such as d18, d6, and d12.]

Figure 8-1 SDS Space Management

Note – Slice 7 is reserved for state database storage on disks that are
used in a diskset. Disks are automatically partitioned when they are
first added to a diskset.


Solstice DiskSuite Initialization

Disk drives that are to be used by SDS do not need special


initialization. The standard UNIX partitions are used without any
modification.

SDS needs several small databases in which to store volume configuration information along with some error and status information. These are called state databases and are replicated on one or more disk drives. Another common term for the state databases is replicas.

By default, SDS requires a minimum of three copies of the state database.

The replicas are placed on standard unused partitions by a special command, metadb. The default size for each replica is 517 Kbytes (1034 disk blocks).

Solstice DiskSuite Initialization

Replica Configuration Guidelines


At least one replica is required to start the SDS software. A minimum
of three replicas is recommended. SDS 4.2 allows a maximum of 50
replicas.

The following guidelines are recommended:

● For one drive – Put all three replicas in one slice

● For two to four drives – Put two replicas on each drive

● For five or more drives – Put one replica on each drive

Use your own judgement to gauge how many replicas are required
(and how to best distribute the replicas) in your storage environment.

Note – You cannot store replicas on the root, swap, or /usr partitions,
or on partitions containing existing file systems or data.

If multiple controllers exist on the system, replicas should be


distributed as evenly as possible across all controllers. This provides
redundancy in case a controller fails and also helps balance the load. If
multiple disks exist on a controller, at least two of the disks on each
controller should store a replica.

Do not place more than one replica on a single disk unless that is the
only way to reach the minimum requirement of three replicas.


SDS Disk Grouping

A disk group is an arbitrary collection of physical disks that allows a backup host to assume a workload. The disk groups are given unique names, and ownership is assigned to a single cluster host system.

SDS uses the term diskset to define a related collection of disk drives.

A shared diskset is a grouping of two hosts and disk drives that are accessible by both hosts. Each host can have exclusive access to a shared diskset; the hosts cannot access the same diskset simultaneously.

Note – It is important to stress that the hosts do not “share” the disk
drives in a shared diskset. They can take turns having exclusive access
to a shared diskset, but they cannot concurrently access the drives in a
shared diskset.

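A minimal sketch of creating a shared diskset with the metaset command, using the host names from Figure 8-2, follows; the diskset name mars and the DID drive d2 are assumptions.

# metaset -s mars -a -h phys-mars phys-venus
# metaset -s mars -a /dev/did/rdsk/d2
# metaset -s mars

The first command registers the two hosts that can master the diskset, the second adds a dual-hosted drive (slice 7 of the drive is repartitioned automatically for the diskset replicas), and the third displays the current ownership and membership.
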
SDS Disk Grouping

Disksets facilitate moving disks between host systems, and are an


important piece in enabling high availability. Disksets also enable you
to group storage by department or application.

[Figure: hosts phys-mars and phys-venus, each with its own local disks, are both connected to the shared disksets named Mars and Venus.]

Figure 8-2 Shared Disksets

● A shared diskset is a grouping of two hosts and disk drives, which


are physically accessible by both hosts and have the same device
names on both hosts.

● Each host can have exclusive access to a shared diskset; they cannot
access the same diskset simultaneously.

● Each host must have a local diskset that is separate from the shared
diskset.

● There is one state database for each shared diskset and one state
database for the local diskset.


Dual-String Mediators

Solstice DiskSuite has two different kinds of state databases. Initially, a


local state database is replicated on each local boot disk. The local
replicas are private to each host system.

When a shared diskset is created, a different set of replicas is created that is unique to the diskset. Each shared diskset has its own set of replicas.

Dual-String Mediators

Shared Diskset Replica Placement


When a shared diskset is created on storage arrays, a different set of state database replicas is automatically created for the diskset. The illustration in Figure 8-3 shows two different shared disksets that each have their own set of replicas. The replicas for each diskset are automatically balanced across the storage arrays.

[Figure: Host A and Host B each run a mediator and keep local replicas on their boot disks; disksets A and B each have their own replicas (r) balanced across the two storage arrays.]

Figure 8-3 Diskset Replica Placement

Dual-String Mediation

With dual-string configurations (configurations with two disk strings, such as two SPARCstorage arrays or two SPARCstorage MultiPacks), it is possible that only one string is accessible at a given time. In this situation, it is impossible to guarantee a replica quorum.

To resolve the dual-string limitation, the concept of mediators was introduced. Essentially, additional mediator data is stored in the memory of the host systems and is used to establish a replica quorum when one of the storage arrays fails.


Metatrans Devices

After a system panic or power failure, UFS file systems are checked at
boot time with the fsck utility. The entire file system must be checked
and this can be time consuming.

Solstice DiskSuite offers a feature called UFS Logging, sometimes


referred to as “journaling.” UFS logging takes the (logically)
synchronous nature of updating a file system and makes it
asynchronous. Updates to a file system are not made directly to the
disk or partition containing the file system. Instead, user data is
written to the device containing the file system, but the file system
disk structures are not modified—they are logged instead. Updates to
the file system structure are applied to the file system when: the log is
detached and the file system is idle for 5 seconds; the log fills; or the
device is cleared.

Any changes made to the file system by unfinished system calls are
discarded, ensuring that the file system is in a consistent state. This
means that logging file systems do not have to be checked at boot
time, speeding the reboot process.

Metatrans Devices

As shown in Figure 8-4, a metatrans device is used to log a UFS file


system. It has two components: a master device and a logging device. The
master device contains the UFS file system, and has the same on-disk
file system format as a non-logged UFS system. The logging device
contains the log of file system transactions.

[Figure: the metatrans device /dev/md/setname/dsk/d10 is composed of a master device, /dev/md/diskset/dsk/d11, holding the UNIX file system data, and a logging device, /dev/md/diskset/dsk/d14, holding the UFS log.]

Figure 8-4 Solstice DiskSuite UFS Logging

Both the master and logging devices must be mirrored to prevent data
loss in the event of a device failure. Losing data in a log because of
device errors can leave a file system in an inconsistent state, and user
intervention might be required for repair.

Metatrans Devices

Metatrans Device Structure


A typical metatrans device structure is illustrated in Figure 8-5. All
components are mirrored.

[Figure: the metatrans device /dev/md/disksetA/dsk/d10 is built from a mirrored master device, /dev/md/disksetA/dsk/d11 (submirrors d12 and d13 on c1t0d0s0 and c2t0d1s0), and a mirrored logging device, /dev/md/disksetA/dsk/d14 (submirrors d15 and d16 on c2t1d0s6 and c1t0d1s6).]

Figure 8-5 Metatrans Device Structure

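The following is a sketch of metainit commands that could build the structure in Figure 8-5, using the metadevice numbers from the figure; the diskset name disksetA and the DID slices are examples only.

# metainit -s disksetA d12 1 1 /dev/did/rdsk/d10s0
# metainit -s disksetA d13 1 1 /dev/did/rdsk/d28s0
# metainit -s disksetA d11 -m d12
# metattach -s disksetA d11 d13
# metainit -s disksetA d15 1 1 /dev/did/rdsk/d10s6
# metainit -s disksetA d16 1 1 /dev/did/rdsk/d28s6
# metainit -s disksetA d14 -m d15
# metattach -s disksetA d14 d16
# metainit -s disksetA d10 -t d11 d14

The first eight commands build the mirrored master (d11) and logging (d14) devices; the last command combines them into the metatrans device d10.
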

SDS Status

Although the graphical user interface for SDS furnishes useful visual
status information, there are times when the images might not update
correctly or completely due to window interlocks or system loads.

The most reliable and the quickest method of checking status is from
the command line. Command line status tools have the additional
benefits of being easy to use in script files, cron jobs, and remote
logins.

Volume Manager Status

Checking Volume Status

Using metastat

The following metastat command output is for a mirrored metadevice, d0, and is used with the SDS volume manager.
# metastat d0
d0: Mirror
    Submirror 0: d80
      State: Okay
    Submirror 1: d70
      State: Resyncing
    Resync in progress: 15% done
    Pass: 1
    Read option: roundrobin (default)
    Write option: parallel (default)
    Size: 2006130 blocks

Note – You can also use the metastat command to create a backup
configuration file that is suitable for recreating the entire volume
structure. This is useful as a worst-case disaster recovery tool.

Checking Mediator Status


You use the medstat command to verify the status of mediators in a dual-string storage configuration.
# medstat -s boulder
Mediator Status Golden
dolphins Ok No
bills Ok No

Volume Manager Status

Checking Replica Status


The status of the state databases is important and you can verify it
using the metadb command, as shown in the following example. The
metadb command can also be used to initialize, add, and remove
replicas.
# metadb
flags first blk block count
a u 16 1034 /dev/dsk/c0t3d0s5
a u 16 1034 /dev/dsk/c0t2d0s0
a u 16 1034 /dev/dsk/c0t2d0s1
o - replica active prior to last configuration change
u - replica is up to date
l - locator for this replica was read successfully
c - replica's location was in /etc/opt/SUNWmd/mddb.cf
p - replica's location was patched in kernel
m - replica is master, this is replica selected as input
W - replica has device write errors
a - replica is active, commits are occurring
M - replica had problem with master blocks
D - replica had problem with data blocks
F - replica had format problems
S - replica is too small to hold current data base
R - replica had device read errors

The status flags for the replicas shown in the previous example
indicate that all of the replicas are active and up to date.

Volume Manager Status

Recording SDS Configuration Information


Diskset configuration information should be archived using the
metastat -p command option. The configuration information is
output in a format that can later be used to automatically rebuild your
diskset volumes.
# metastat -s denver -p
denver/d100 -m denver/d0 denver/d10 1
denver/d0 1 1 /dev/did/dsk/d10s0
denver/d10 1 1 /dev/did/dsk/d28s0
denver/d101 -m denver/d1 denver/d11 1
denver/d1 1 1 /dev/did/dsk/d10s1
denver/d11 1 1 /dev/did/dsk/d28s1
denver/d102 -m denver/d2 denver/d12 1
denver/d2 1 1 /dev/did/dsk/d10s3
denver/d12 1 1 /dev/did/dsk/d28s3

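One way to use this output, sketched under the assumption that the md.tab file for SDS 4.2 is kept in /etc/opt/SUNWmd, is to save it as an md.tab file and replay it later with the metainit command.

# metastat -s denver -p > /etc/opt/SUNWmd/md.tab
# metainit -s denver -a
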

SDS Post-Installation

Configuring State Database Replicas


Before you can perform any SDS configuration tasks, such as creating
disksets on the multihost disks or mirroring the root (/) file system,
you must create the metadevice state database replicas on the local
(private) disks on each cluster node. The local disks are separate from
the multihost disks. The state databases located on the local disks are
necessary for basic SDS operation.

A typical command to place three state database replicas on slice 7 of a


system boot disk is as follows:
# metadb -a -c 3 -f c0t0d0s7

SDS Post-Installation

Configuring the Disk ID (DID) Driver


All new installations running Solstice DiskSuite require a DID pseudo
driver to make use of disk IDs. DIDs enable metadevices to locate data
independent of the device name of the underlying disk. Configuration
changes or hardware updates are no longer a problem because the
data is located by DID and not the device name.

To create a mapping between a DID and a disk path, run the


scdidadm -r command from only one node and then check the
configuration using the scdidadm -L command.

The following example shows the creation and verification of DID


devices on a cluster with two dual-hosted StorEdge A5000 arrays.

# scdidadm -r
Configuring /devices and /dev; this may take a while.
# scdidadm -L
1 devsys1:/dev/rdsk/c0t0d0 /dev/did/rdsk/d1
2 devsys1:/dev/rdsk/c2t37d0 /dev/did/rdsk/d2
2 devsys2:/dev/rdsk/c3t37d0 /dev/did/rdsk/d2
3 devsys1:/dev/rdsk/c2t33d0 /dev/did/rdsk/d3
3 devsys2:/dev/rdsk/c3t33d0 /dev/did/rdsk/d3
4 devsys1:/dev/rdsk/c2t52d0 /dev/did/rdsk/d4
4 devsys2:/dev/rdsk/c3t52d0 /dev/did/rdsk/d4
5 devsys1:/dev/rdsk/c2t50d0 /dev/did/rdsk/d5
5 devsys2:/dev/rdsk/c3t50d0 /dev/did/rdsk/d5
6 devsys1:/dev/rdsk/c2t35d0 /dev/did/rdsk/d6
6 devsys2:/dev/rdsk/c3t35d0 /dev/did/rdsk/d6
7 devsys1:/dev/rdsk/c3t20d0 /dev/did/rdsk/d7
7 devsys2:/dev/rdsk/c2t20d0 /dev/did/rdsk/d7
8 devsys1:/dev/rdsk/c3t18d0 /dev/did/rdsk/d8
8 devsys2:/dev/rdsk/c2t18d0 /dev/did/rdsk/d8
9 devsys1:/dev/rdsk/c3t1d0 /dev/did/rdsk/d9
9 devsys2:/dev/rdsk/c2t1d0 /dev/did/rdsk/d9
10 devsys1:/dev/rdsk/c3t3d0 /dev/did/rdsk/d10
10 devsys2:/dev/rdsk/c2t3d0 /dev/did/rdsk/d10
11 devsys1:/dev/rdsk/c3t5d0 /dev/did/rdsk/d11
11 devsys2:/dev/rdsk/c2t5d0 /dev/did/rdsk/d11
12 devsys2:/dev/rdsk/c0t0d0 /dev/did/rdsk/d12
#

SDS Post-Installation

Configuring Dual-String Mediators


Although the shared-diskset replicas are automatically created and
balanced between storage arrays in a dual-string configuration, the
mediators must be configured manually.

1. Start the cluster software on all host systems.

2. Determine the hostname and private link address of the first


mediator host.
# hostname
capri
# ifconfig -a | grep 204.152.65
inet 204.152.65.1 netmask fffffff0
inet 204.152.65.33 netmask fffffff0
inet 204.152.65.17 netmask fffffff0

3. Determine the hostname and private link address of the second


mediator host.
# hostname
palermo
# ifconfig -a | grep 204.152.65
inet 204.152.65.2 netmask fffffff0
inet 204.152.65.34 netmask fffffff0
inet 204.152.65.18 netmask fffffff0

4. Use the hastat command to determine the current master of the


diskset you are configuring for mediators.

5. Configure the mediators using the metaset command on the host


that is currently mastering the diskset.
capri# metaset -s disksetA -a -m capri,204.152.65.33
capri# metaset -s disksetA -a -m palermo,204.152.65.34

6. Check the mediator status using the medstat command.


capri# medstat -s disksetA

Note – The private links must be assigned as mediator host aliases.

Exercise: Configuring Volume Management

Exercise objective – In this exercise you will do the following:

● Install the SDS volume manager

● Initialize the SDS volume manager

● Configure the DID driver

● Create demonstration volumes using script files.

● Create dual-string mediators if appropriate

Preparation
You must select the appropriate volume manager for installation
during this exercise. When you installed the Sun Cluster software you
were required to select a volume manager at that time. Unless you
install the same volume manager that you specified during the Sun
Cluster installation you might be missing critical support software.

Ask your instructor about the location of the software that will be
needed during this exercise. This includes software for SDS and some
script files.

Each of the cluster host boot disks must have a small unused slice that
can be used for a state database during the SDS installation.

Exercise: Configuring Volume Management

Tasks
The following tasks are explained in this section:

● Installing the SDS software

● Configuring the SDS disk ID driver

● Configuring the SDS state databases

● Demonstration volume overview

● Configuring the demonstration volumes

● Cluster verification

Exercise: Configuring Volume Management

Installing the SDS Software


1. Move to the location of the SDS software.

2. If you wish to install the SDS software you should see the
following files:
# ls

SUNWmd SUNWmdg SUNWmdn

Note – The SUNWdid and SUNWmdn packages were added during


the Sun Cluster software installation when you specified SDS as your
volume manager.

3. Run the pkgadd command on all cluster host systems to begin the
volume manager installation.
# pkgadd -d `pwd`
The following packages are available:
1 SUNWmd Solstice DiskSuite
(sparc) 4.2,REV=1998.02.09.12.47.28
2 SUNWmdg Solstice DiskSuite Tool
(sparc) 4.2,REV=1998.14.09.08.19.32
3 SUNWmdn Solstice DiskSuite Log Daemon
(sparc) 4.2,REV=1998.02.09.12.47.28

Select package(s) you wish to process (or 'all' to process
all packages). (default: all) [?,??,q]: 1 2

4. As shown in the previous step, install only the SUNWmd and


SUNWmdg packages.

5. Install the current version of the SDS 4.2 patch, 106627, on all
cluster host systems.

6. Stop the Sun Cluster software on all nodes.

7. Reboot all of the cluster hosts after you install the SDS patch.

Exercise: Configuring Volume Management

Configuring the SDS Disk ID Driver


To balance device mapping between nodes as explained during the
lecture, the Sun Cluster 2.2 software includes a new device driver and
scripts to set up and manage the Disk ID (DID) pseudo devices.

The scdidadm command is used to configure the Disk IDs.


Caution – Although you run the scdidadm command on only one of the cluster nodes, all nodes must be joined in the cluster.
1. Verify that all nodes are currently cluster members.
# get_node_status

2. Start the scdidadm script on Node 0.


# scdidadm -r

Note – You might see error messages about the /etc/name_to_major


number conflicts. Resolving the name_to_major number conflicts is
addressed later in this exercise.

If the scdidadm command cannot discover the private links, you must
run the command again and specify the name of the other nodes in the
Cluster.

# scdidadm -r -H hostname1,hostname2,....

Note – Do not include the name of the node on which the scdidadm
command is being run.

Exercise: Configuring Volume Management

Resolving DID Driver Major Number Conflicts


The scdidadm command checks the /etc/name_to_major files on all
nodes to verify that the DID driver major device numbers are
identical. If they are not, you see the following message:
The did entries in name_to_major must be the same on all
nodes.

1. Examine the /etc/name_to_major files on all nodes and make


sure that the DID driver major numbers are identical.
host1# cat /etc/name_to_major | grep did
did 158

host2# cat /etc/name_to_major | grep did
did 156

host3# cat /etc/name_to_major | grep did
did 156

If all DID driver major numbers are the same, skip to the ‘‘Initializing
the SDS State Databases’’ section on page 8-28.

2. Look at the highest DID driver major number in the


name_to_major files on all nodes and add 1 to it. Record the new
number below.

__________

3. Check the /etc/name_to_major files on all nodes and verify that


the new number is not already being used.

4. Edit the /etc/name_to_major file on all the nodes and change the
DID driver major number to the new value.
Warning – The new DID driver major number must not be in use by
another driver. Consult with your instructor if you are not absolutely
sure about this.

Exercise: Configuring Volume Management

Resolving DID Major Number Conflicts


Once you have modified the name_to_major files, you must remove
all the DID device structures that were created by the scdidadm
command.

5. On each of the nodes where the name_to_major file was changed,


run the following commands.
# scadmin stopnode
# rm -rf /devices/pseudo/did*
# rm -rf /dev/did
# rm -rf /etc/did.conf
# reboot -- -r

6. After the reboot operations have completed, start the Sun Cluster
software again on all cluster host systems.

7. On the node used to run the scdidadm command, run it again.


# scdidadm -r

This should resolve any major device number conflicts across the
nodes.

8. Use the following command to look at pseudo device names


across all nodes in the cluster
# scdidadm -L

1 devsys1:/dev/rdsk/c0t0d0 /dev/did/rdsk/d1
2 devsys1:/dev/rdsk/c2t37d0 /dev/did/rdsk/d2
2 devsys2:/dev/rdsk/c3t37d0 /dev/did/rdsk/d2
3 devsys1:/dev/rdsk/c2t33d0 /dev/did/rdsk/d3
3 devsys2:/dev/rdsk/c3t33d0 /dev/did/rdsk/d3
4 devsys1:/dev/rdsk/c2t52d0 /dev/did/rdsk/d4
4 devsys2:/dev/rdsk/c3t52d0 /dev/did/rdsk/d4
5 devsys1:/dev/rdsk/c2t50d0 /dev/did/rdsk/d5
5 devsys2:/dev/rdsk/c3t50d0 /dev/did/rdsk/d5
6 devsys2:/dev/rdsk/c0t0d0 /dev/did/rdsk/d6

Exercise: Configuring Volume Management

Initializing the SDS State Databases


Before you can use SDS to create disksets and volumes, the state
database must be initialized and one or more replicas created.

The system boot disk on each cluster host should be configured with a small unused partition. This should be slice 7.

1. On each node in the cluster, verify that the boot disk has a small
unused slice available for use. Use the format command to verify
the physical path to the unused slice. Record the paths of the
unused slice on each cluster host. A typical path is c0t0d0s7.

Node 0 Replica Slice: _______________

Node 1 Replica Slice: _______________

Node 2 Replica Slice: _______________

Warning – You must ensure that you are using the correct slice. A
mistake can corrupt the system boot disk. Check with your instructor.

2. On each node in the cluster, use the metadb command to create


three replicas on the boot disk slice.
# metadb -a -c 3 -f c0t0d0s7

3. Verify that the replicas are configured on each node.


# metadb
flags first blk block count
a u 16 1034 /dev/dsk/c0t0d0s7
a u 1050 1034 /dev/dsk/c0t0d0s7
a u 2084 1034 /dev/dsk/c0t0d0s7

Exercise: Configuring Volume Management

SDS Volume Overview


A script, makedg.sds, is used to create two disksets. Each diskset has
three mirrored volumes. The relationship of the primary and mirror
devices is shown in Figure 8-6.

[Figure: disksets hanfs and hadbms, each containing the mirrored volumes d100, d101, and d102; each volume has its primary in one array and its mirror in the other array, and Node 0 and Node 1 are connected to both arrays through controllers c1 and c2.]

Figure 8-6 Demonstration Volume Structure

The d100 volumes are 10-Mbytes and are used for a special
administration file system in a later lab exercise.

The d101 and d102 volumes are 250-Mbytes each.


The makedg.sds script is run twice, once for each diskset. It requires
the DID path to pairs of disks along with the physical path to each of
the disks. The script uses the physical path to partition the disk drives
in an arrangement suitable for this lab exercise. The partition map is
contained in a file called 9GB_vtoc. The disk partition structures must
meet the following minimum requirements:

Slice 0 10-Mbytes

Slice 1 250-Mbytes

Slice 2 The entire disk

Slice 3 250-Mbytes

Slice 7 ~ 3-Mbytes (2 cylinders)

The makedg.sds script file prompts for the name of the vtoc file.

The 9GB_vtoc file is used for the 9-Gbyte disk drives found in the
StorEdge A5000 arrays. There are also 1GB_vtoc and 2GB_vtoc files
for use with the older SPARCstorage array product.

If your arrays have different disk drives, you must manually
repartition a disk and then save its vtoc information in a file. You
can create a file as follows:

1. Repartition a disk drive to meet the space requirements outlined
above.

2. Save the vtoc information in a file as shown in the following
example:
# prtvtoc /dev/rdsk/c3t0d0s2 > filename

3. Furnish the new vtoc file name to the makedg.sds script.
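If you ever need to apply a saved vtoc file to another drive of the
same geometry by hand (the makedg.sds script normally does this for
you), the standard Solaris command is fmthard; the target device
below is only an example:

# fmthard -s filename /dev/rdsk/c3t1d0s2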

Selecting SDS Demo Volume Disk Drives


1. Use the scdidadm command on Node 0 to list all of the available
DID drives.
# scdidadm -l

2. Record the physical and DID paths of four disks that are used to
create the demonstration volumes. Remember to mirror across
arrays.

An entry might look like: d4 c2t50d0

Diskset     Volumes             Primary     Mirror

hanfs       d100, d101, d102    disk01      disk02

hadbms      d100, d101, d102    disk03      disk04

Note – You need to record only the last portion of the DID path. The
first part is the same for all DID devices: /dev/did/rdsk.

Caution – All disks used in each diskset must be the same geometry
(that is, they must all be the same type of disk).

Configuring the SDS Demonstration Volumes


Slice 7 on the disks is used by Solstice DiskSuite to store diskset state
databases.

Using the list obtained from the scdidadm -l command, create two
disksets using the makedg.sds script file in the Scripts/SDS
directory.

1. On Node 0, run the makedg.sds script file to create the diskset
called hanfs.

2. On Node 0, run the makedg.sds script file again to create the
diskset called hadbms.

3. Verify the status of the new disksets.


# metaset -s hanfs
# metastat -s hanfs
# metaset -s hadbms
# metastat -s hadbms

Configuring Dual-String Mediators


If your cluster is a dual-string configuration, you must configure
mediation for both of the disksets you have created.

1. Make sure the cluster software is running on the cluster hosts.

2. Determine the hostname and private link address of the first
mediator host (Node 0) using the hostname and ifconfig -a commands.
Record the results below.

Node 0 Hostname: _______________

Node 0 Private Link Address: _______________

Note – The private link address is either 204.152.65.33,
204.152.65.34, 204.152.65.35, or 204.152.65.36.
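The commands themselves are simple; for example (the node name and
the interfaces in the output will vary with your cluster):

# hostname
devsys1
# ifconfig -a

Look for the interface whose inet address matches one of the four
private link addresses listed in the note, and record that address.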


3. Record the hostname and private link address of the second
mediator host (Node 1).

Node 1 Hostname: _______________

Node 1 Private Link Address: _______________

4. Use the metaset command to determine the current master of the
disksets you are configuring for mediators.

Note – Both the hanfs and hadbms disksets should be mastered by
Node 0.

5. Configure the mediators using the metaset command on the host that
is currently mastering the diskset.
# metaset -s hanfs -a -m node0_name,204.152.65.33
# metaset -s hanfs -a -m node1_name,204.152.65.34
#
# metaset -s hadbms -a -m node0_name,204.152.65.33
# metaset -s hadbms -a -m node1_name,204.152.65.34

6. Check the mediator status using the medstat command.


# medstat -s hanfs
# medstat -s hadbms

Verifying the SDS Demonstration File Systems


Before proceeding, you should verify that the file systems you created
are functional.

The file systems that you previously created depend on which volume
manager you are using. For the SDS volume manager, they are as
follows:

● The d101 and d102 volumes in diskset hanfs.

● The d101 and d102 volumes in diskset hadbms.

1. Create the file system mount points on each cluster host that will
be associated with the logical hosts.
# mkdir /hanfs1 /hanfs2
# mkdir /hadbms1 /hadbms2

2. Manually mount each file system on one of the nodes to ensure that
they are functional.
# mount /dev/md/hanfs/dsk/d101 /hanfs1
# mount /dev/md/hanfs/dsk/d102 /hanfs2
# mount /dev/md/hadbms/dsk/d101 /hadbms1
# mount /dev/md/hadbms/dsk/d102 /hadbms2

3. Verify the mounted file systems.


# ls /hanfs1 /hanfs2 /hadbms1 /hadbms2

Note – You should see a lost+found directory in each file system.


4. Create a test directory in the /hanfs1 file system for use later.
# mkdir /hanfs1/test_dir

5. Make sure the directory permissions are correct.


# cd /hanfs1
# chmod 777 test_dir
# cd /

6. Unmount all of the demonstration file systems.


# umount /hanfs1
# umount /hanfs2
# umount /hadbms1
# umount /hadbms2

Verifying the Cluster


1. Test the cluster by joining all nodes to the cluster.

Note – Remember to let the scadmin startcluster command finish
completely before joining any other nodes to the cluster.
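If you need a reminder of the commands, the sequence is roughly as
follows; the node name node0 and cluster name sc-cluster are only
placeholders for your own values.

On the first node:
# scadmin startcluster node0 sc-cluster

After it completes, on each remaining node:
# scadmin startnode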


Exercise Summary
Discussion – Take a few minutes to discuss what experiences, issues,
or discoveries you had during the lab exercises.

● Experiences

● Interpretations

● Conclusions

● Applications

Check Your Progress

Before continuing on to the next module, check that you are able to
accomplish or answer the following:

❑ Explain the disk space management technique used by Solstice
DiskSuite (SDS)

❑ Describe the initialization process necessary for SDS

❑ Describe how SDS groups disk drives

❑ List the basic SDS status commands

❑ Describe the basic SDS software installation process

❑ List the post-installation issues relevant to SDS

❑ Install and configure SDS

Think Beyond

Where does SDS fit into the high availability environment?

What planning issues are required for SDS in the high availability
environment?

Is use of the SDS required for high availability functionality?

Cluster Configuration Database 9

Objectives

Upon completion of this module, you should be able to:

● Describe the Cluster Database (CDB) and its operation

● Describe the Cluster Configuration Database (CCD) and its operation

● List the advantages of a shared CCD volume

● Manage the contents of the cluster configuration files

This module describes how the Sun Cluster environment configuration
information is stored and managed.

Relevance

Discussion – The following questions are relevant to understanding
the content of this module:

1. What type of information about the cluster configuration do you
need to keep?

2. How do you update and share this information between nodes?

Additional Resources

Additional resources – The following references can provide
additional details on the topics discussed in this module:

● Sun Cluster 2.2 System Administration Guide, part number 805-4238

● Sun Cluster 2.2 Software Installation Guide, part number 805-4239


Cluster Configuration Information

Each cluster node maintains local copies of the databases. The cluster
configuration databases contain both local and cluster-wide
configuration information. Critical Sun Enterprise Cluster
configuration and status information is maintained in two cluster-wide
database files, clustname.cdb and ccd.database, which are located in
the /etc/opt/SUNWcluster/conf directory.

Warning – You should not manually modify either of these databases.
The slightest mistake could leave your cluster unusable. You should
regularly back up the directory where the databases reside.

The CDB Database


The CDB database contains general cluster configuration information
that is used during cluster reconfiguration including:

● The cluster name and all node names

● A cluster application software profile

● Pre-assigned interconnect addresses of all cluster hosts

● Scheduling, priority, and timeout information for cluster software

● Quorum device mappings

A CDB database template is loaded and configured during the Sun
Cluster software installation on the cluster hosts. This database is
seldom modified after the initial Sun Cluster software installation.

The CDB Format

The CDB database has a simple format for every line:

variable name: value

The following shows typical entries in the CDB database file.


cluster.cluster_name: currency
cluster.number.nodes : 2
cluster.number.nets: 2
cluster.node.0.hostname: dollar
cluster.node.1.hostname: penny
# cluster interconnect section
#
cluster.node.0.if.0: scid0
cluster.node.0.phost.0: 204.152.65.1
cluster.node.0.if.1: scid1
cluster.node.0.phost.1: 204.152.65.17
cluster.node.1.if.0: scid0
cluster.node.1.phost.0: 204.152.65.2
cluster.node.1.if.1: scid1
cluster.node.1.phost.1: 204.152.65.18

The CCD Database


The CCD database is a special purpose database that has multiple uses
depending on your cluster application. The CCD database contains:

● Logical host configuration information

● Data service status

● Data service instance management information

A CCD database template is loaded during the Sun Cluster software
installation on the cluster hosts. Data is added to it as changes are
made to disk groups, NAFO groups, or logical host configurations.

The CCD Database Format

The CCD database is a true database that uses many different formats.
As shown in the following example, each entry is associated with a
major key that has a unique associated format.
# Cluster disk group
CDG_fmt:prim_node:backup_node:dg
CDG:node0:node1:dg

# Logical IP address
LOGIP_fmt:nodelist:iflist:ipaddr:logif
LOGIP:node0,node1:hme0,hme0:129.146.237.136:2

Many different formats are used in the CCD database. Almost all of
them store information about the structure, control, and status of
logical hosts.

CCD Database Files

There are several different files associated with the CCD. All of them
are located in the /etc/opt/SUNWcluster/conf directory. Following is
a brief summary of the files.

● ccd.database.init

The init CCD file contains static configuration parameters used to
start the ccdd daemon. Only major configuration changes, such as
adding an additional node, affect the ccd.database.init file.

● ccd.database

This file contains the dynamic CCD database. When the SC software is
installed, the dynamic and init CCD files are both created using
default entry values. The ccd.database file is regularly modified by
changes in things such as PNM status, logical host switches, and
logical host state changes, such as when they are placed in
maintenance mode.

● ccd.database.pure

The ccdadm command can be used to verify the correctness of the
ccd.database file. Any error lines in the original file are removed
and a purified version of the original is created. The purified
version does not have the embedded checksum. It can be used to create
a new active ccd.database file with the ccdadm command.

● ccd.database.shadow

Whenever the ccd.database file is to be updated, a shadow copy is
made before the update process can continue. The shadow copy can be
used if something goes wrong during the update process.
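On a running node you can list the configuration directory to see
which of these files currently exist. The example below assumes a
cluster named sc-cluster; the exact set of files varies, because the
.pure and .shadow copies appear only after the corresponding
operations have been performed:

# ls /etc/opt/SUNWcluster/conf
ccd.database        ccd.database.init   sc-cluster.cdb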


Cluster Database Consistency

The consistency of the CDB and CCD databases between cluster hosts
is checked any time a node joins or leaves the cluster.

Data Propagation
The data in the CDB database is seldom modified after the initial Sun
Cluster software installation. The CDB database file is modified as
the result of using certain scadmin command options. Be careful: the
CDB changes that are made with some scadmin command options do not
propagate to the CDB files on all nodes.

The CCD data is modified whenever you perform administrative duties
such as creating a new logical host. When you make any modification to
the CCD database, the changes are automatically propagated to the CCD
databases on all current members of the cluster.

If a node is not currently in the cluster, it does not get the changes.
This can lead to consistency problems when the node tries to join.

The CCD Update Protocol


As shown in Figure 9-1, any CCD updates must be arbitrated by the
CCD master. The master is the cluster member with the lowest node
identifier. The update process is handled by the ccdd daemons that
run on each node.

[Figure 9-1 shows Node 0, Node 1, and Node 2 connected by the SCI
switch, each running a ccdd daemon with a local copy of the CCD. An
update request is sent to the ccdd on the CCD master node, which
freezes CCD activity on the other nodes and then propagates the change
to them.]

Figure 9-1 CCD Update Control

The CCD master node handles a change request as follows:

● The CCD master freezes all CCD activity on other nodes.

● The CCD master incorporates the changes into its local CCD.

● The CCD master propagates the CCD changes to all other nodes.

Note – Just prior to starting a ccd.database update, the ccdd daemon
makes a shadow copy of the ccd.database file called
ccd.database.shadow that can be used if anything goes wrong during
the update process.

Database Consistency Checking


During cluster reconfiguration, the consistency of both databases is
checked between the proposed cluster members. The majority opinion
about the CDB and CCD content takes precedence. A node that does
not agree with the majority opinion either cannot join the cluster or is
dropped from current membership.

Additionally, each node’s CCD database contains a locally generated
checksum that allows local detection of corruption. If a node detects
corruption this way, it excludes itself from the cluster.

Database Majority
During cluster reconfiguration, the consistency of the CDB and CCD
databases is checked between potential cluster members. If there is a
majority consensus, a node with a CCD inconsistency can
automatically get a new copy of the CCD downloaded from the
current CCD master node. This is normally possible only when there
are two nodes already joined in the cluster and a third node is trying
to join.

If there is only one node in a cluster and a second node tries to join, it
is not possible to have a majority opinion. Either node could have the
wrong information. The new potential member is not allowed to join if
there is any disagreement about CDB or CCD content.

Unless a special mirrored CCD volume has been configured, you cannot
perform command operations that modify the contents of the CCD
database unless two or more nodes are cluster members.

Note – There is no mechanism to automatically correct defective CDB
databases.


Shared CCD Volume

In a two-node cluster, if only one node is in the cluster, you cannot
modify the CCD database. Also, if the second node joins the cluster,
there is no way to establish a majority opinion about CCD integrity.

In a two-node cluster, you can create a third CCD database that is
resident on the storage arrays.

In the case of a single node failure, there are two CCD files to compare
to ensure integrity.

The shared CCD is not supported in Solstice DiskSuite installations.

Note – You can use the shared CCD only with a two-node cluster.


As shown in Figure 9-2, you can keep an additional copy of the CCD
on a mirrored disk volume that is shared (physically connected)
between two nodes.

[Figure 9-2 shows Node A and Node B, each with a local ccd.database
file, attached to shared mass storage containing the sc_dg disk group.
The disk group holds the mirrored ccdvol volume (primary plex on one
array, mirror on the other), which contains the shared copy of
ccd.database.]

Figure 9-2 Shared CCD Volume Configuration

If you are running a two-node cluster, the scinstall procedure asks
you if you want to have a shared CCD. Even if you reply no, you can
create a shared CCD at any time.

The advantage of a shared CCD is that you can update it with only
one node in the cluster. With an unshared CCD, both nodes must be in
the cluster to make any changes.

You can convert a CCD to and from a shared configuration after
installation if necessary.

Note – If you are upgrading from Sun Cluster 2.0, and you were using
a shared CCD, you must rerun the confccdssa program or the
upgraded cluster will not start properly.

Shared CCD Operation


The general functionality of a shared CCD volume is outlined below.

When Node A leaves the cluster:

● Node B imports the sc_dg disk group
● Node B copies its ccd.database to the shared CCD
● Node B removes its local ccd.database file

When Node A returns to the cluster:

● The shared CCD is copied to both ccd.database files
● The shared CCD is renamed ccd.shadow
● Node A remounts the ccdvol file system

Creating a Shared CCD


You can create the shared CCD volume either by replying yes to its
creation during the scinstall process and then using the
confccdssa script after installation, or you can use the scconf and
confccdssa commands together after the Sun Cluster installation.

If you did not reply yes to the shared CCD question during
scinstall processing, run the following command on both nodes in
the cluster:
# scconf clustername -S ccdvol
Checking node status...
Purified ccd file written to
/etc/opt/SUNWcluster/conf/ccd.database.init.pure
There were 0 errors found.

You must run the confccdssa command on only one node. The
confccdssa command searches for suitable disk drives on the
system and prompts you to select a pair as follows:
# confccdssa clustername
The disk group sc_dg does not exist.
Will continue with the sc_dg setup.

Please, select the disks you want to use from the


following list:

1) SSA:00000078C8A0
2) SSA:000000722F83
Device 1: 1
1) t0d0
2) t0d1
3) t0d2
Disk: 3

Disk c0t0d2s2 with serial id 00142458


in SSA 00000078C8A0 has been selected as device 1.

Select devices from list.

1) SSA:00000078C8A0
2) SSA:000000722F83
Device 2: 2
1) t0d2
Disk:
1) t0d2
Disk: 1

Disk c2t0d2s2 with serial id 01186928


in SSA 000000722F83 has been selected as device 2.

newfs: construct a new file system


/dev/vx/rdsk/sc_dg/ccdvol: (y/n)? y

Note – For clarity, several comment fields have been removed from the
confccdssa command output.

Disabling a Shared CCD


To disable shared CCD operation, run:
# scconf clustername -S none

This will modify the ccd.database.init file.

Note – This does not remove the disk group.


CCD Administration

You use the ccdadm command to perform a number of administrative
operations on the ccd.database files.

Caution – Do not manually modify the CCD database file. It has a
checksum and other consistency information written at the end of it.
Manual editing corrupts the database file.

Verifying CCD Global Consistency


Use the following command from any node to verify global CCD
consistency:
# ccdadm clustername -v

Checkpointing the CCD


The following command makes a backup copy of the CCD:
# ccdadm clustername -c filename

Note – Do not copy the ccd.database file using the cp command. If
there are any modifications in progress, using the cp command could
cause corruption. The ccdadm -c option freezes all modifications
while copying the file. This prevents corruption.

Restoring the CCD From a Backup Copy


The following command uses a CCD backup to restore the CCD:
# ccdadm clustername -r restore_filename

Caution – The restore operation can be global in nature. If possible,
ask for assistance from your field representative before performing
the ccdadm restore operation.

Creating a Purified Copy of the CCD


The following command can be used to check for syntax errors in the
ccd.database file. A file named ccd.database.pure is created.
# ccdadm clustername -p ccd.database

Any errors will be reported during the purify operation.

Disabling the CCD Quorum


If you must perform operations that modify the CCD with only one
node in the cluster, you can disable the mechanism that prevents CCD
modification with the following command:
# ccdadm clustername -q off|on

If you have a mirrored CCD volume, you do not need to disable the
CCD quorum mechanism.

Note – If there are any CCD processing errors, check the CCD log file
in /var/opt/SUNWcluster/ccd/ccd.log.

Recommended CCD Administration Tasks


Regular backups of the CCD should be made as follows:

● A daily cron job that uses the ccdadm -c command

● Use ccdadm -c manually before and after any CCD changes
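For example, a root crontab entry along the following lines takes the
daily copy. The 2:00 a.m. schedule, the backup directory, and the
cluster name sc-cluster are placeholders, and the path assumes the
default /opt/SUNWcluster/bin installation location:

0 2 * * * /opt/SUNWcluster/bin/ccdadm sc-cluster -c /var/cluster_backup/ccd.backup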

Common Mistakes
A common CCD-related error is to specify a shared CCD during the Sun
Cluster installation and not complete the post-installation creation
of the CCD mirrored volume. When you attempt to start the cluster,
you will see CCD freeze errors.

You must either complete the CCD mirrored volume creation or disable
the shared CCD feature using the scconf command as follows:
# scconf clustername -S none

Exercise: CCD Administration

Exercise objective – In this exercise you will do the following:

● Verify the global CCD consistency

● Checkpoint a ccd.database file

● Create a purified copy of the ccd.database file.

Preparation
There is no preparation for this exercise. Your cluster systems should
have been left in the appropriate state at the end of the previous
module exercise.

Tasks
The following task is explained in this section:

● Maintaining the CCD database

Maintaining the CCD Database


It is good practice to make backup copies of the CCD database before
making configuration changes, such as setting up one of the high
availability data services. This can be done with the ccdadm command.
You can also use ccdadm to check the node-to-node consistency of the
CCD database.

1. Verify that all nodes are cluster members with the
get_node_status command.

2. On Node 0, verify the global CCD consistency.


# ccdadm clustername -v

3. Make a backup copy of the CCD database on each of the cluster host
systems.
# cd /etc/opt/SUNWcluster/conf
# ccdadm clustername -c ./ccd.backup

Note – Do not copy the ccd.database file using the cp command.

4. On any of your cluster hosts, create a purified copy of the
ccd.database file.
# ccdadm clustername -p ./ccd.database

Exercise Summary
Discussion – Take a few minutes to discuss what experiences, issues,
or discoveries you had during the lab exercises.

● Experiences

● Interpretations

● Conclusions

● Applications

Check Your Progress

Before continuing on to the next module, check that you are able to
accomplish or answer the following:

❑ Describe the Cluster Database and its operation

❑ Describe the Cluster Configuration Database and its operation

❑ List the advantages of a shared CCD volume

❑ Manage the contents of the cluster configuration files

Think Beyond

When would you disable the CCD update quorum requirement?

What would it take to have information defined for all nodes, even if
the nodes are offline?

Public Network Management 10

Objectives

Upon completion of this module, you should be able to:

● Explain the need for Public Network Management (PNM)

● Describe how PNM works

● Configure a network adapter failover (NAFO) group

● Disable the Solaris Operating Environment Interface Groups feature

This module describes the operation, configuration, and management of
the Sun Cluster Public Network Management mechanism.

Relevance

Discussion – The following questions are relevant to understanding
this module’s content:

1. What happens if a fully functional cluster node loses its network
interface to a public network?

2. Are there alternatives to failing over all logical hosts from a cluster
if the public network interface is lost?

Additional Resources

Additional resources – The following references can provide
additional details on the topics discussed in this module:

● Sun Cluster 2.2 System Administration Guide, part number 805-4238

● Sun Cluster 2.2 Software Installation Guide, part number 805-4239


Public Network Management

The Public Network Management (PNM) software is a Sun Cluster package
that creates and manages designated groups of local network adapters,
providing IP address and adapter failover within each group. It is
designed for use in conjunction with the HA data services.

The network adapter failover groups are commonly referred to as NAFO
groups. If a cluster host network adapter fails, its associated IP
address is transferred to a local backup adapter.

Each NAFO backup group on a cluster host is given a unique name, such
as nafo12, during creation. A NAFO group can consist of any number of
network adapter interfaces but usually contains only a few.

The numbers assigned to each NAFO group can be any value as long
as the total number of NAFO groups does not exceed 256 on a node.

Note – The full discussion of using NAFO groups in conjunction with
logical hosts and data services is presented in another lecture.

As shown in Figure 10-1, the PNM daemon (pnmd) continuously
monitors designated network adapters on a single node. If a failure is
detected, pnmd uses information in the cluster configuration database
(ccd) and the pnmconfig file to initiate a failover to a healthy adapter
in the backup group.

[Figure 10-1 shows Node 0 and Node 1 attached to the public network.
Each node has a NAFO backup group (nafo12 on Node 0, nafo7 on Node 1)
made up of a primary adapter, hme0, and a backup adapter, hme1. The
pnmd daemon on each node monitors the primary adapter, exchanges
status with its peer over the cluster interconnect (CIS), and uses
the CCD and the /etc/pnmconfig file (NAFO group and IP address
configuration) to run ifconfig when a failover to the backup adapter
is required.]

Figure 10-1 Public Network Management Components

The PNM software functions at two levels. It is started at boot time but
has limited failure detection until the Sun Cluster software is started.
At boot time, the PNM software can detect a local interface failure and
switch to the designated backup interface(s) but does not perform
more complex testing until a data service associated with a NAFO
group is running.


The Network Monitoring Process

If a public network adapter indicates a lack of connectivity, PNM
determines whether the fault lies with the adapter or the public
network itself.

Clearly, if the fault lies with the network itself, there is no recovery
action that the server can take.

However, something can be done if the failure is in the host server’s
network adapter: a backup network adapter on the same host is
activated to take over from the failed adapter. This avoids having to
move the entire server workload to another server due to the loss of
a single network adapter.

What Happens?
When monitoring for network faults, the PNM software must
determine where the failure is before taking action. The fault could be
a general network failure and not a local adapter. PNM can use the
cluster interconnect system (CIS) to find out if other nodes are also
experiencing network access problems. If the problem is being seen by
other nodes (peers), then the problem is probably a general network
failure and there is no need to take action.

If the detected fault is determined to lie with the local adapter,
PNM notifies the network failover component to begin an adapter
failover, which is transparent to the highly available data services.

If a general network failure is identified, or if a remote adapter
has failed, PNM does not perform an unnecessary adapter failover.

Highly Available Data Services, such as HA-NFS, can query the physical
net status (using the HA-API framework) if they experience data
service loss to their clients. The HA-API framework then uses this
information to determine:

● Whether to migrate the data service

● Where to migrate the data service

The Sun Cluster Manager displays network adapter status, which system
administrators can use during problem diagnosis.


How PNM Works


The PNM daemon is based on a Remote Procedure Call (RPC) client-server
model. It is started at boot time by an /etc/rc3.d script and killed
in /etc/rc2.d.

PNM uses the CCD to store the distributed results of the adapter
monitoring tests on the various hosts. HA Data Services can query the
status of the remote adapters at any time using the HA-API framework.

The PNM software can work with backup network adapters on the
same subnet that have different Ethernet or FDDI Media Access
Control (MAC) addresses.

PNM Support Issues

Network Trunking

Network trunking is the ability to use multiple network adapters to
transfer data as one, providing higher bandwidth. These adapters are
logically grouped by special trunking software that provides this
capability.

Network trunking is not supported by the Sun Cluster software.

Interface Groups Feature

The Solaris 2.6 Operating Environment provides an Interface Groups
feature that resolves an old routing problem with multiple network
interfaces on the same network or subnet. The Interface Groups
feature is enabled by default.

The Interface Groups feature can cause logical host switchovers to fail
if it remains enabled. You must disable the Interface groups feature by
adding the following line to the /etc/system file on all cluster hosts:
set ip:ip_enable_group_ifs=0
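The /etc/system entry takes effect at the next boot. On a running
system you can check the current value, and temporarily change it,
through ndd, assuming your Solaris release exposes the parameter
there; a value of 0 means the feature is disabled:

# ndd /dev/ip ip_enable_group_ifs
1
# ndd -set /dev/ip ip_enable_group_ifs 0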

Supported Interface Types

PNM supports the following public network interface types:


● le
● qe
● be
● hme
● qfe
● FDDI bf and nf
● ATM (LAN emulation mode only)
● Token Ring


PNM Monitoring Routines

The three main PNM routines, TEST, DETERMINE_NET_FAILURE, and
FAILOVER, monitor the local NAFO interfaces for traffic and, if
necessary, ask other cluster members for information.

If the routines determine that an interface is bad, they switch to the
next backup interface. This progresses until there are no more backup
interfaces in the group. This usually initiates the failover of the
related logical host to another system.

TEST Routine
The TEST routine monitors for network traffic every 5 seconds. If there
has been none over the last 5 seconds, it solicits traffic by sending ping
commands. If after 5 more seconds there still has been no traffic, it
marks the NAFO group in DOUBT status. The TEST routine performs
the following steps as necessary:

1. Get the event count, old_count, on the adapter.

2. Sleep for 5 seconds, then get the event count, new_1_count, on the
adapter.

3. If new_1_count is not equal to old_count, set the backup group
status to OK and return the status to caller.

4. Solicit network traffic with a ping command on the subnet


● First try IP routers multicast (224.0.0.2)
● Then try IP hosts multicast (224.0.0.1)
● Finally try broadcast ping (255.255.255.255)

5. Sleep for 5 seconds.

6. Get the event count, new_2_count, on the adapter.

7. If new_2_count is not equal to old_count, set the backup group
status to OK and return the status to caller.

8. Set the backup group status to DOUBT.

9. Return the status to caller.
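When troubleshooting a suspect adapter by hand, you can approximate
the same checks with standard Solaris commands; a rough sketch, with
hme0 used only as an example interface:

# netstat -i -I hme0
# ping 224.0.0.2
# netstat -i -I hme0

If the packet counts do not increase even after the ping solicits
traffic from the subnet, the adapter (or its cabling) is suspect.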

FAILOVER Routine
The FAILOVER routine fails over to another adapter in the group, if
possible, when the failure has been determined to lie with the adapter
itself. If there are no working adapters, the entire group is marked
DOWN. This usually causes logical host failover.

The FAILOVER routine performs the following steps as necessary:

1. Call the DETERMINE_NET_FAILURE routine.

2. If the backup group status is NET_DOWN, return to caller.

3. For each adapter in the backup group:

a. Get the logical IPs configured on the failed adapter.

b. Configure the primary adapter as DOWN.

c. Configure the next adapter in the backup group as UP.

d. Configure the logical IPs on the new adapter.

e. If a test of the new adapter is OK, return to caller.

4. Set the backup group status to DOWN.

5. Return to caller.

DETERMINE_NET_FAILURE Routine
The DETERMINE_NET_FAILURE routine asks peer nodes (other cluster
nodes) if they have any information. It then determines if the local
adapter or something else has failed.

The DETERMINE_NET_FAILURE routine performs the following steps as
necessary:

1. Using the CCD, check the status of your peer nodes’ adapters.

2. If even one peer node adapter has a status of OK, return to the
FAILOVER routine.

3. If even one peer node claims a status of NET_DOWN, set the backup
group status to NET_DOWN and return to the FAILOVER routine.

4. If all peers have a status of either DOWN or DOUBT, set the backup
group status to NET_DOWN and return to the FAILOVER routine.


The pnmset Command

The pnmset command is used to configure a network adapter backup
group. The example that follows shows the process of creating two
separate backup groups. With pnmset, you can create all of the nafoX
backup groups at the same time, or create them one at a time.

The pnmset command can be run only by root.

Configuring Backup Groups


The following prompts are displayed by the pnmset program during
NAFO group configuration. They each require special attention as
follows:

● How many PNM backup groups on the host: 2

● Enter backup group number: 123

The group number is arbitrary. The total number of groups cannot
exceed 256. It is a good idea to make the backup group number unique
on each node.

If the group already exists, its configuration is overwritten with
new information.

● Please enter all network adapters under nafo0: qe1 qe0

The backup group should contain a minimum of two interfaces. If
reconfiguring an existing group, you can add additional interfaces.

The pnmset program checks the status of the adapters and then selects
a primary adapter. It then ensures that there is only one active adapter
for each backup group. The configuration information is then written
to the /etc/pnmconfig file. The pnmset program then signals the
pnmd daemon to reread the new configuration information and
monitor the adapters accordingly.

If the backup group testing fails, you might have to perform the
following steps on the primary interface in the group before running
pnmset again:

1. # ifconfig hme0

2. # ifconfig hme0 plumb

3. # ifconfig hme0 up

4. # ifconfig hme0 down


The following shows the complete transcript of the creation of a single
NAFO backup group.

# pnmset

In the following, you will be prompted to do


configuration for network adapter failover

do you want to continue ... [y/n]: y

How many NAFO backup groups on the host [1]: 1

Enter backup group number [0]: 113

Please enter all network adapters under nafo113


hme0 hme1 hme2

The following test will evaluate the correctness


of the customer NAFO configuration...

name duplication test passed

Check nafo113... < 20 seconds

hme1 is active

remote address = 192.9.10.222

nafo113 test passed

All backup interfaces that you specify are also activated and tested.
The remote address is anyone who responded to a ping.

Note – If you want to add an additional interface to a group later,
recreate the same group number again with additional interfaces. The
old group configuration information is overwritten.


Other PNM Commands

The pnmstat Command


The pnmstat command is used to verify backup group status. The
following is typical output after an adapter failover within a backup
group:
# pnmstat -c nafo0
OK
129
qe1

The output means:

▼ nafo0 backup group is OK.

▼ It has been 129 seconds since the last failover

▼ qe1 is the current active interface in nafo0


The following shows how to use the pnmstat command to check the
status of all local backup groups:
# pnmstat -l
bkggrp r_adp status fo_time live_adp
nafo0 le0 OK NEVER le0

The following shows how to use the pnmstat command to check the
status of a NAFO group on a remote host using the public network:
# pnmstat -h remote_host -c nafo1
OK
NEVER
le0

The output means:

▼ nafo1 backup group is OK

▼ There has been no failover

▼ le0 is the current active interface in nafo1

The following shows how to use the pnmstat command to check the
status of a NAFO group on a remote host using the private network:
# pnmstat -s -h remote_host -c nafo1
OK
Never
hme0

The pnmptor Command


The following shows how to use the pnmptor command to identify
which adapter is active in a given backup group:
# pnmptor nafo1
hme0

The pnmrtop Command


The following shows how to use the pnmrtop command to determine
which backup group contains a given adapter:
# pnmrtop qe0
nafo0

Exercise: Configuring the NAFO Groups

Exercise objective – In this exercise you will do the following:

● Create a NAFO group on each cluster host system

● Disable the Interface Groups feature on each cluster host system

Preparation
Ask your instructor for help with defining the NAFO groups that will
be used on your assigned cluster system.

You should create a single NAFO group on each cluster host that is
configured as follows:

● It consists of two interfaces of the same type

● It should be numbered 0 (nafo0)

Tasks
The following tasks are explained in this section:

● Creating a NAFO group

● Disabling the Interface Groups feature

Creating a NAFO Group


1. Determine the primary network adapter on each cluster host using
the ifconfig -a command. Record the results below.

Node 0 Interface: __________ (typically hme0)

Node 1 Interface: __________

Node 2 Interface: __________

Node 3 Interface: __________

2. Verify that no NAFO groups exist on each cluster host.


# pnmstat -l

3. Create a single NAFO group, numbered 0, on each cluster host using
the pnmset command.
# pnmset

Note – If at all possible, each group should consist of two
interfaces, one of which can be the primary node interface.

4. Verify that the status of each new NAFO group is OK on all nodes.
# pnmstat -l

Disabling the Interface Groups Feature
Perform the following steps on every cluster host system.

1. Disable the Interface Groups feature on each node by adding the
following entry to each of their /etc/system files:
set ip:ip_enable_group_ifs=0

2. Stop the Sun Cluster software on all nodes.

3. Install the 105786 patch.

4. Reboot all of your cluster hosts so that the /etc/system file
changes take effect.
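On each node the sequence looks roughly like the following; the patch
revision (-xx) and the patch location are placeholders for whatever
your instructor provides:

# scadmin stopnode
# patchadd /cdrom/patches/105786-xx
# init 6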

Exercise Summary
Discussion – Take a few minutes to discuss what experiences, issues,
or discoveries you had during the lab exercises.

● Experiences

● Interpretations

● Conclusions

● Applications

Check Your Progress

Before continuing on to the next module, check that you are able to
accomplish or answer the following:

❑ Explain the need for Public Network Management (PNM)

❑ Describe how PNM works

❑ Configure a NAFO group

❑ Disable the Solaris Operating Environment Interface Groups feature

Think Beyond

Are there other system components that would benefit from the
approach taken to network adapters by PNM?

What are the advantages and disadvantages of automatic adapter
failover? Manual adapter failover?

How will IP striping affect this model? Can you realize the dual goals
of higher throughput and high availability through PNM/NAFO for
the network connections?

Logical Hosts 11

Objectives

Upon completion of this module, you should be able to:

● Configure logical hosts

● Create the administrative file system for a logical host

● Switch logical hosts between physical nodes

This module describes how to configure logical hosts.

Relevance

Discussion – The following questions are relevant to understanding
the content of this module:

1. What is the purpose of a logical host?

2. What needs to be defined for a logical host?

3. What are the restrictions on a logical host?

Additional Resources

Additional resources – The following references can provide
additional details on the topics discussed in this module:

● Sun Cluster 2.2 System Administration Guide, part number 805-4238

Logical Hosts

To run a data service, a client system must be able to communicate
with the data service over the network. This is usually done using a
logical hostname that is converted to an IP address by a naming
service or locally in the /etc/hosts files (preferred). The server for
the data service must provide client access to data, both executables
and stored data.

A data service in the Sun Cluster HA environment must be able to
migrate to one or more backup systems if the primary system fails.
This should happen with as little disruption to the client as
possible.

A logical host in the Sun Cluster HA environment is a collection of
network definitions and disk storage. A logical host, consisting of
one or more IP addresses, assigned network adapters, and disk storage,
is configured as the unit of failover. One or more data services are
configured to run in a logical host, so that when the logical host
moves, the data service follows it.

The definition of a logical host also includes the list of physical hosts
on which it can run.

You should take the following information into account when designing
a logical host:

● You can have more than one logical host for each physical host,
and as many as three backup physical hosts for each logical host.

● You can have logical hosts that are primarily resident on a
particular physical host fail over to different backup hosts.
However, not all topologies support this.

● You can assign as many disk groups as you want to a particular
logical host.

● You can assign multiple data services to a particular logical host,
but remember that the logical host is the unit of failover.

One application or data service per logical host is the norm.

● Each network adapter assigned to a logical host must reside in a
PNM backup group.

● Logical hosts can share physical network adapters. This means that
several logical hosts can specify the same PNM group in their
definition. They must not share logical hostnames (IP addresses).

As shown in Figure 11-1, the Sun Cluster HA framework provides the
routines for the logical host and data services to be properly failed
over to a designated backup system and restarted. This is why it is
critical that the contents of the ccd.database files agree on all
cluster host systems.

[Figure 11-1 shows a client workstation on the network reaching the
data service through its logical hostname (for example, ping dshost
and mount dshost:/Vol-02). The logical host lhost2, defined in the
ccd.database on Node 0 and Node 1, groups the logical hostname and IP
address (dshost, 129.50.20.3), a NAFO group, and the disk group dg3
containing the Vol-02 volume, with Node 0 as primary and Node 1 as
backup. When Node 0 fails, the data service recovery routines on
Node 1 detect the failure, import the dg3 disk group, fsck and mount
Vol-02, configure the dshost IP address with ifconfig, and run any
other recovery routines.]

Figure 11-1 Logical Host Components


Configuring a Logical Host


The creation of a logical host is a three-step process:

1. Put all of the logical host’s network adapters into PNM NAFO
backup groups.

2. Use the scconf -L command to create the logical host.

3. Use the scconf -F command to create the administrative file
system.

Note – In SDS installations, you must create the administrative file
system manually. The scconf -F option does not function.

The logical host becomes active immediately after creation, which
means:

● Assigned disk groups are imported and mounted

● Network interfaces are activated

● IP addresses are configured up.

Note – The administrative file system is discussed in the
‘‘Administrative File System Overview’’ section on page 11-13.

Using the scconf -L Command Option
A logical host is configured using the scconf -L command. You can
run the command on only one node that is a running member of the
cluster. The following is the format of the command using the -L
option:

scconf clustername -L logical-host-name -n nodelist -g dglist -i logaddrinfo [-m]

where:
clustername Identifies the name of the cluster in
which the logical host is being
configured.
-L Precedes a logical host name
logical-host-name Identifies the name of the new logical
host
-n Precedes a node name list
nodelist Identifies the cluster nodes on which
this logical host can be run. The
logical host is run preferentially on the
first node specified, with the others
used as backups in order.
-g Precedes a disk group list.
dglist Identifies the disk groups that must
migrate with the logical host.
-i Precedes a list of network interfaces. It
can be specified multiple times if there
is more than one interface to be used
by this logical host on each node. The
network adapter names specified must
be contained in a NAFO group on the
corresponding node. The adapters do
not have to be the same type or name.


logaddrinfo Identifies the primary NAFO
interfaces on each node in the node
list. These interfaces are specified in
the same order as the nodes. The last
parameter is the logical hostname (or
IP address) assigned to this logical
host.

-m Disables automatic takeover if a logical host is
running on a backup node and the primary node
rejoins the cluster.

Note – Disabling automatic failback can help avoid unexpected
disruption of data services. You can manually switch the data service
back to its primary host system when you think it is better for the
data service users.

The scconf -L command activates the logical host on the node on which
it is run. The specified disk groups are imported and the network
interfaces are activated.

Deleting a Logical Host

To delete a logical host definition, type:

scconf clustername -L logical-host-name -r


Logical Host Variations

You can configure logical hosts with varying levels of complexity. It
is helpful to examine different logical host configurations.

Basic Logical Host


You create a basic logical host with the following command:
# scconf clustername -L lhost1 -n node0,node1 -g dg1 \
-i hme0,hme0,usersys1

In this configuration, a logical host named lhost1 is being configured
to run on primary system node0 and is taken over by the backup system
node1 if node0 fails. If node0 is later repaired and joins the cluster
again, the logical host lhost1 automatically switches back to node0.

The disk group dg1 is imported by node1 if node0 fails.

The primary NAFO interface for node0 is hme0. The primary NAFO
interface for node1 is hme0. These interfaces might be the first of only
several in their NAFO group. The names of the NAFO groups on each
node are not referenced in this command.

Clients reach the services offered by the lhost1 configuration by
referencing the host name usersys1. You can also use an IP address
instead of a logical hostname, but this will be difficult for users to
remember. The IP address and name of usersys1 might be recorded
locally in the /etc/hosts files or obtained from a network naming
service. For highest availability, you should resolve logical hostnames
locally.

When you define this logical host on one of the cluster host systems, its configuration is automatically propagated to the CCD on all other nodes currently joined in the cluster.

Caution – When performing operations that modify the contents of the ccd.database file, you should have all nodes joined to the cluster. Otherwise you will have ccd.database inconsistencies between nodes.

Cascading Failover
You create a cascading failover configuration with the following
command:
# scconf clustername -L lhost2 -n node0,node1,node2 \
-g dg2 -i hme0,hme0,hme0,usersys2

This is a variation that can contain three or four nodes instead of two
in the node list. If necessary, the logical host migrates through to each
node in the list and goes back to node0 again if the last node in the list
fails.


Disabling Automatic Takeover


You can disable the automatic takeover feature with the following
command:
# scconf clustername -L lhost3 -n nodea,nodeb -g dg1\
-i hme0,hme0,usersys3 -m

The -m option disables automatic takeover if a logical host is


running on a backup node and the primary node rejoins the
cluster.

Note – You cannot add the -m option later. You must delete the logical
host and then recreate it using the -m option.

Multiple Disk Group and Hostnames


You use the following command to associate more than one disk group
or metaset with a logical host:
# scconf clustername -L lhost4 -n nodea,nodeb \
-g dg1,dg2 \
-i hme0,hme0,usersys4a -i qe0,qe0,usersys4b

You can have multiple logical hostnames associated with a logical host. This can provide multiple paths for users to access the resources under control of the logical host. Each -i list provides an additional client interface path and logical hostname.


Administrative File System Overview

Each logical host must have a small administrative file system associated with it. After you create a logical host, you can create the administrative file system using the scconf -F option. The administrative file system is created on a disk group or diskset that is associated with the logical host.

Note – In SDS-based clusters, you must create the administrative file systems manually for each logical host.

The administrative file system stores some cluster configuration information as well as logical host data service information. It stores file lock recovery information for the HA-NFS data service. Approximately 5 Mbytes of space are required for this file system and its mirror.

The system administrator does not need to manage or modify the administrative file system. This file system should not be NFS-shared.

You can create the dgname-stat administrative volume by hand if you wish to control construction and placement of the disk space.
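If you choose to build the administrative volume by hand in a CVM or SSVM installation, a minimal sketch might look like the following. The disk group name dnsdg and the size are hypothetical examples only, and you must still run the scconf -F command afterward so the framework creates the mount point and vfstab.lhostname entries.

# vxassist -g dnsdg make dnsdg-stat 2m layout=mirror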


Administrative File System Components

CVM and SSVM Installations

The following shows the relationship between logical hosts and their
associated administrative file system in a CVM/SSVM installation.

Logical Host Name dnslhost

Logical hostname dnsip (129.55.30.10)

Disk group dnsdg

Volume name dnsdg-stat

Volume mount /dnslhost

Volume path /dev/vx/dsk/dnsdg/dnsdg-stat

Mount file vfstab.dnslhost

SDS Installations

The following shows the relationship between logical hosts and their
associated administrative file system in a SDS installation.

Logical Host Name dnslhost

Logical hostname dnsip (129.55.30.10)

Disk group dnsdg

Volume name d100

Volume mount /dnslhost

Volume path /dev/md/dsk/dnsdg/d100

Mount file vfstab.dnslhost


Creating the Administrative File System

Using the scconf -F Command Option


In CVM and SSVM installations, you configure the administrative file
system for a logical host using the scconf -F command. The
command must be run on every node on which the logical host will
run, the nodes must be in the cluster, and the logical host must be
active on one of the cluster nodes. The following is the command
format.

scconf clustername -F lhostname [dgname]

where:

clustername Indicates the name of the cluster in which the logical host is being configured.

lhostname Indicates the name of the logical host being configured.

dgname Specifies a disk group that will contain the administrative file system. If it is not specified, the first disk group specified in the scconf -L command that configured the logical host is used.

scconf Command Functions

The scconf -F option performs the following functions:

● Creates a 2-Mbyte mirrored UNIX file system (UFS) volume in either the named disk group or the first disk group specified when the logical host was created; the administrative volume is named diskgroup-stat

● Creates a mount point of the same name as the logical host and mounts the administrative file system on it (/lhostname)

● Creates a special vfstab file named vfstab.lhostname in the /etc/opt/SUNWcluster/conf/hanfs directory, which contains the mount information for the administrative volume

Note – If you need to control the placement of the administrative volume, you can pre-create the logical host administrative file system by using a volume manager relevant to your installation. You must still run the scconf -F command on all nodes.
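For example, on a two-node cluster you might run the following on each node that can master the logical host; the cluster name sc-cluster and logical host name lhost1 are hypothetical:

# scconf sc-cluster -F lhost1

You can then confirm the result on the node that currently masters lhost1 with the mount command and by examining /etc/opt/SUNWcluster/conf/hanfs/vfstab.lhost1.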

scconf Command Precautions

To avoid CCD inconsistencies, the following criteria should be met any time a logical host is created or modified:

● You must run the scconf -F command on all nodes in the cluster.

● One of the nodes must currently master the logical host.

● All nodes should be cluster members when you use the scconf command to create or modify logical hosts.

If the CCD is not consistent between all cluster hosts, logical host migration between hosts can fail, or you can experience general cluster reconfiguration failures.


Logical Host File Systems

When you create a logical host, one or more disk groups are associated
with the logical host and an administrative file system is created. The
mount information for the administrative file system is automatically
entered in the logical host-specific vfstab file in the
/etc/opt/SUNWcluster/conf/hanfs directory.

If there are additional file system volumes you want to have mounted
automatically, you must enter the information manually in the logical
host-specific vfstab file.

When the logical host fails over to a backup system, all file system
specific information must be available to the backup host.

Note – You must record the additional file system information in the
logical host-specific vfstab files on all cluster hosts that are
configured as backup systems for the logical host.


Adding a New Logical Host File System


To add a new logical host file system, complete the following steps.

1. Create the file system volume in a disk group or shared diskset that belongs to the logical host.

2. Initialize the new file system with the newfs command.

3. Create a file system mount point on each cluster node that is configured for the logical host.

4. Test-mount the file system and then unmount it.

5. Add the new file system mount entries to the vfstab.lhname files on the appropriate Sun Cluster nodes.

The new file system is automatically mounted and managed as part of the logical host during the next logical host reconfiguration.
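The following is a minimal sketch of these steps for a CVM/SSVM installation; the disk group dg1, volume vol3, size, and mount point /ha/fs3 are hypothetical names used only for illustration.

1. Create the volume:
# vxassist -g dg1 make vol3 500m layout=mirror

2. Initialize the file system:
# newfs /dev/vx/rdsk/dg1/vol3

3. Create the mount point on each configured node:
# mkdir /ha/fs3

4. Test-mount the file system and then unmount it:
# mount /dev/vx/dsk/dg1/vol3 /ha/fs3
# umount /ha/fs3

5. Add an entry such as the following to the vfstab.lhname file on each node that can master the logical host:
/dev/vx/dsk/dg1/vol3 /dev/vx/rdsk/dg1/vol3 /ha/fs3 ufs - no -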

Sample Logical Host vfstab File


The format of the vfstab.lhname file is identical to that of
/etc/vfstab file. All entries in this file must correspond to file
systems located on multi-host disks, and can specify either UFS or
VxFS file systems. The “mount at boot” and “fsck pass” fields are
ignored.

A typical entry might look like the following:


/dev/vx/dsk/dg/dg1-v1 /dev/vx/rdsk/dg/dg1-v1 /abc ufs -
no -

Caution – These files must be identical on all nodes that support the logical host.


Logical Host Control

Logical hosts migrate automatically between physical hosts when a data service in the logical host is determined to have failed and cannot be restarted, or when the physical host node has failed. The logical hosts fail over to a new physical host in the order specified when the logical host was created.

Forced Logical Host Migration

You can also switch logical hosts manually, using the haswitch or scadmin switch commands. The following is the syntax of these commands:
# haswitch new_phys_host logical_host

# scadmin switch clustername new_phys_host logical_host

With haswitch or scadmin switch, you can specify the physical host
to which the logical host(s) are to fail over.

Note – You can initiate the logical host switch from any node that is
currently a cluster member.
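For example, to move a logical host named lhost1 to the physical host phys-hostB in a cluster named sc-cluster (all hypothetical names), either command form can be used:

# haswitch phys-hostB lhost1
# scadmin switch sc-cluster phys-hostB lhost1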


Logical Host Maintenance Mode


Occasionally you might need to take a logical host down to perform
administration functions, such as backing up file system volumes. You
can place the logical host in maintenance mode using the command:
# scadmin switch clustername -m logical_host

The logical host is shut down and placed in maintenance mode until
the haswitch command is executed again for the logical host. Before
you take any action, all users must be informed that the related data
service will be unavailable. Also, you will probably have to shut down
the data service application.

You must perform the following additional steps if you want to back up logical host file systems.

1. Place the logical host in maintenance mode.

# scadmin switch clustername -m logical_host

2. Import the disk group or diskset.


# vxdg import diskgroup (metaset -s diskset -t)

3. Perform the volume backups.

4. Deport the disk group or diskset.


# vxdg deport diskgroup (metaset -r diskset)

5. Put the logical host back in service.


# scadmin switch clustername new_phys_host logical_host

A special variation of the switch option can be used to force a cluster reconfiguration without moving any logical hosts.

# scadmin switch clustername -r

This can be used to enable a shared CCD without stopping the cluster software. However, it will temporarily suspend all active data services.

Exercise: Preparing Logical Hosts
Exercise objective – In this exercise you will do the following:

● Prepare the name service for the logical hosts

● Create logical hosts

● Create logical host administrative file systems

Preparation
In this lab, you will create two logical hosts for use in later lab
exercises.

You will use the administrative file system volumes that were created
in a previous exercise.

Tasks
The following tasks are explained in this section:

● Preparing the name service

● Activating the cluster

● Creating the logical hosts

● Testing the logical hosts


Preparing the Name Service


You must assign logical IP addresses for each of your logical hosts.
You must enter these addresses and logical host names in the name
service so they are available to each node.

1. Using the IP addresses given to you by your instructor, create /etc/inet/hosts entries (or entries in the appropriate name service) for each of your new logical hosts on each cluster node.

IP Address:____________________ clustername-nfs

IP Address:____________________ clustername-dbms

These IP addresses are not active, and do not have interfaces assigned at this time.
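For reference, the completed /etc/inet/hosts entries might look like the following; the addresses and the cluster name sc-cluster are hypothetical examples only:

129.50.20.3   sc-cluster-nfs
129.50.20.4   sc-cluster-dbms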

Activating the Cluster


If your cluster is not already started, start it now.

1. On only one cluster node, type:


# scadmin startcluster phys_nodename clustername

2. Wait for the cluster to activate and reconfiguration to complete on


the first node.

3. On each other cluster node, type:


# scadmin startnode

Caution – You must type the scadmin startnode commands simultaneously using the cconsole common window. If you do not start the additional nodes at exactly the same time, the CCD data can become corrupted.


Logical Host Restrictions


Now that the network interfaces and disk groups are ready, you can
create the logical hosts.

You create two logical hosts:

● clustername-nfs

● clustername-dbms

Remember:

● At least one node in the cluster must be running to create a logical host. You do not need a CCD quorum.

● You must run the scconf command from a node joined to the
cluster.

● You should run the scconf -L command only once for each
logical host, on only one node.

● If you make a mistake, use scconf -L -r to delete the incorrect logical host definition and rerun the definition.

● Make sure that the nodes you assign to a logical host have
physical access to that logical host’s disk group.

● Logical host names and logical host IP host names are not always
the same.

● Make sure that you have identified the switchable disk drives for
use by the administrative file system for each logical host.


Creating the Logical Hosts


1. Record your target logical host configurations.

                               Logical Host            Logical Host
                               clustername-nfs         clustername-dbms

   Primary Node                _______________         _______________
   Backup Node                 _______________         _______________
   Disk group or diskset name  hanfs                   hadbms
   Primary node interface      _______________         _______________
   Backup node interface       _______________         _______________
   Logical hostname            clustername-nfs         clustername-dbms

2. Create the hanfs logical host on one of your nodes. Use the -m
option to prevent automatic switchback.

# scconf clustername -L clustername-nfs \
-n node1,node2 -g hanfs -i \
intf,intf,clustername-nfs -m

Caution – Do not use the cconsole common window.
3. Create the hadbms logical host on a different node so that each
logical host is mastered by a different node. Use the -m option to
prevent automatic switchback.

# scconf clustername -L clustername-dbms \
-n node2,node1 -g hadbms -i \
intf,intf,clustername-dbms -m


Creating the CVM/SSVM Administrative File System


Skip this section if your cluster is using the SDS volume manager.

1. If your cluster is running the CVM or SSVM volume managers, create the administrative file system for each logical host by running scconf -F for each logical host on each cluster node on which that logical host will run.
# scconf clustername -F clustername-nfs
# scconf clustername -F clustername-dbms

2. Verify that the mount information for each logical host administrative file system has been entered in the vfstab.clustername-nfs and vfstab.clustername-dbms files in the /etc/opt/SUNWcluster/conf/hanfs directory on each cluster host system.

Creating the SDS Administrative File System


Skip this section if your cluster is using either the CVM or SSVM
volume manager.

The hanfs and hadbms administrative file system metadevices for SDS installations should have been created in an earlier exercise. The metadevice paths for the hanfs and hadbms diskset administrative file systems should be:

/dev/md/hanfs/dsk/d100 (or /dev/md/hanfs/rdsk/d100)
/dev/md/hadbms/dsk/d100 (or /dev/md/hadbms/rdsk/d100)

1. On both nodes, create the /clustername-nfs and /clustername-dbms administrative file system mount points.
# mkdir /clustername-nfs /clustername-dbms



2. On both nodes, create the logical host-specific vfstab files.
# cd /etc/opt/SUNWcluster/conf
# mkdir ./hanfs
# cd hanfs
# touch vfstab.clustername-nfs
# touch vfstab.clustername-dbms

3. On both nodes, enter the hanfs administrative file system mount information into the vfstab.clustername-nfs files.
/dev/md/hanfs/dsk/d100 /dev/md/hanfs/rdsk/d100 /clustername-nfs ufs 1 no -

4. On both nodes, enter the hadbms administrative file system mount information into the vfstab.clustername-dbms file.
/dev/md/hadbms/dsk/d100 /dev/md/hadbms/rdsk/d100 /clustername-dbms ufs 1 no -


Testing the Logical Hosts

1. Use either the haswitch command or the scadmin switch command to move a logical host to a new physical host.

2. On the new physical host, verify that the administrative file system is mounted, using the mount command.

3. Put a logical host in maintenance mode.


# haswitch -m logical_host_name

4. Take the logical host out of maintenance mode.


# haswitch physical_node logical_host_name

5. Run the scadmin stopnode command on a cluster node with an active logical host. What happens?

6. Restart the node with scadmin startnode. What happens? Why?

7. Return all logical hosts to their “home” nodes using the haswitch
or scadmin switch commands.

Leave the cluster running for the next lab.


Exercise Summary
Discussion – Take a few minutes to discuss what experiences, issues,
or discoveries you had during the lab exercises.

● Experiences

● Interpretations

● Conclusions

● Applications

Check Your Progress

Before continuing on to the next module, check that you are able to
accomplish or answer the following:

❑ Configure logical hosts

❑ Create the administrative file system for a logical host

❑ Switch logical hosts between physical nodes

Think Beyond

If the concept of a logical host did not exist, what would that imply for
failover?

What complexities does having multiple backup hosts for a single


logical host add to the high availability environment?

The HA-NFS Data Service 12

Objectives
Upon completion of this module, you should be able to:

● Describe the function of the HA-NFS support files

● List the primary functions of the HA-NFS start and stop methods

● List the primary functions of the HA-NFS fault monitoring probes

● Configure HA-NFS in a Sun Cluster environment

● Add and remove HA-NFS file systems

● Switch a HA-NFS logical host between systems

This module describes and demonstrates the configuration and


management of Sun Cluster HA-NFS file systems.

Relevance

Discussion – The following questions are relevant to understanding the contents of this module:

1. What does the system need to know about a highly available NFS environment?

2. What configuration information does HA-NFS require?

3. Do clients have any recovery issues after an NFS logical host switchover?

Additional Resources

Additional resources – The following references can provide


additional details on the topics discussed in this module:

● Sun Cluster 2.2 System Administration Guide, part number 805-4238

● Online man pages for the scadmin and hareg commands


HA-NFS Overview

The HA-NFS environment is a simple set of modules that acts as an interface between the Sun Cluster high availability framework and the Solaris NFS environment. User applications continue to use NFS services as before, and there is no change to client administration.

Because HA-NFS is designed to work in an off-the-shelf NFS environment, it does not improve upon existing NFS services beyond adding high availability. It also does not create any additional problems for existing NFS services. Clients see no operational differences.

NFS versions 2 and 3 are supported by the HA-NFS data service. HA-NFS co-exists with all other Sun Cluster data services, including HA-DBMS and parallel database configurations, on the same cluster node.


HA-NFS Support Issues

PC Client Support

HA-NFS was designed to work with a heterogeneous network of NFS clients. Clients must implement the lock recovery protocol (that is, they must provide a lockd and statd daemon). Not all third-party NFS implementations support this feature. This is most often a problem with NFS client implementations on PCs.

Caution – The NFS data service is still highly available to a client that does not support the NFS lock recovery protocol, but in the event of failover or switchover, the server loses track of the application's locks and might grant another application instance access to the locked files. This could lead to data corruption.

PrestoServe Support

PrestoServe is not supported on HA-NFS servers. PrestoServe caches data on the server; if the server fails, PrestoServe waits until that server comes back up and then resends the data. If the data is now serviced by another server, this can lead to synchronization problems.

Local HA-NFS File Systems Access

You cannot locally access HA-NFS file systems from either HA server.
Local file locking interferes with the ability to run the kill command
and restart the lockd command. Between the kill command and the
system restart, a blocked local process can be granted the lock, which
prevents the client machine that owns that lock from reclaiming it.

Secure NFS and Kerberos

Secure NFS and the use of Kerberos with the NFS environment are not supported in Sun Cluster HA configurations. In particular, the secure and kerberos options to share_nfs(1M) are not supported.


HA-NFS Data Service

The HA-NFS data service is automatically available when the Sun


Cluster software is installed. No specific responses are required.

Three components compose the HA-NFS software:

● Start NFS methods

The Sun Cluster framework uses the START methods to start or restart data services on a cluster host system.

● Stop NFS methods

The STOP methods are used to cleanly shut down a data service for maintenance or before starting the data service on a designated backup system.

● NFS-oriented fault monitoring

The fault monitors constantly monitor the health of an active data service. They can force a logical host to migrate to a backup.

The methods are pre-configured routines that run automatically during a cluster reconfiguration or when a data service is manually stopped or switched between cluster hosts.


Start NFS Methods

The Start NFS methods run automatically during logical host reconfiguration (for example, during takeovers and switchovers). These methods do the following:

● Start or restart NFS-related daemons, as appropriate

● Force NFS daemons to go through a lock recovery protocol, just as if a server reboot had occurred

● Export shared file systems for the logical host


Before the Start NFS methods are run, the High Availability
framework takes ownership of the appropriate logical hosts’ disk
groups and mounts their file systems. After the Start NFS methods
complete, the High Availability framework begins listening for clients
of the logical host. NFS service is now available.

You should not start NFS manually. If NFS is started manually, the HA
framework does not “know” which file systems are mounted and
exported, and this could adversely affect the fault monitoring routines.

You should not have any NFS file systems outside of the HA-NFS
service. These NFS file systems do not failover and service is
interrupted when the various NFS-related daemons are stopped and
restarted by HA-NFS. If a file system is unknown to HA-NFS, it is not
highly available.

For example, assume a CD-ROM is mounted locally and shared. If the NFS daemons are stopped and restarted, NFS access to the CD-ROM is interrupted.

Caution – Starting NFS on Sun Clusters must be done by the High Availability framework. Do not manually start NFS daemons, mount directories, or add entries to startup scripts to perform these tasks.


Stop NFS Methods

The Stop NFS methods are executed automatically during logical host reconfigurations (for example, during switchovers).

The Stop NFS methods stop the appropriate NFS-related daemons and unshare the file systems.

NFS clients will begin to see NFS Server not responding messages after the NFS-related daemons are stopped.


HA-NFS Fault Monitoring

The HA-NFS fault monitoring routines are run automatically to assess


the health of the HA-NFS data service.

HA-NFS Fault Monitoring Probes


Each Sun Cluster HA server periodically probes the other servers. It
checks if the NFS-related daemons (nfsd, mountd, lockd, statd,
and rpcbind) are running and if a typical client is receiving service.
To check if a typical client is receiving service, it uses the public net
(not the private net) to access the peer server and attempts to mount a
file system and read and write a file.

When checking typical client services, it attempts to mount a file


system approximately every 5 minutes, and it attempts to read and
write a file approximately every minute. You cannot modify how often
these HA-NFS fault monitoring probes are run.



An HA-NFS server is considered "sick" if its NFS service does not respond within a time-out period (approximately 5–10 minutes). This is in contrast to the CMM, which detects the failure of a node in a few seconds. Failed hosts can be detected quickly, but the system is more cautious when determining NFS failures, because there can be temporary server overload conditions.

If the peer server appears sick, a takeover is considered. But first, the new host ensures that it is running without problems. To do this, the new host checks that its own NFS-related daemons are functional and that its own NFS-exported file systems can be mounted, read, and written. The HA framework checks the server's ability to communicate over the public network(s) and checks the name service for any problems.

If everything is verified, a takeover procedure is initiated.


Fault Probes

As with any Sun Cluster High Availability data service, fault probes
are used to determine whether the logical host is functioning correctly.

There are two kinds of probes:

● Local probes (LP)

● Remote probes (RP)

The fault probes run on both the master and the first backup node to
allow for both local and remote functionality checking.

If the system fails over to the first backup, then the remote fault probes
start running on the second backup, if one is configured. If there is not
a second backup, then there will not be a remote probe.

All probes act as NFS clients to the NFS data service, and perform
mount, read, write, and locking operations.


Local Fault Probes

Local probes test the functionality of the data service without


involving a network connection. This ensures that the service is
running, and helps differentiate between a service failure and a
network failure.

The local HA-NFS probes run on the current physical host of the HA-
NFS logical host. They are intended to ensure that the NFS file system
is operational.

There is one set of local probes per data service. The local probes do
the following:

● Use the logical host IP address to ensure that the NFS daemons are
running on the physical host

● Perform read, write, and locking operations to each shared file


system

If these tests fail, a message is written to the console and giveaway is


considered.


Remote Fault Probes

The remote probes help identify not only a service failure, but network failures as well. Coupled with the network adapter management process, this allows the probes to determine whether the data service, the local or remote network interface, the network itself, or the node has failed. Depending on the determined failure cause, the proper recovery action (network adapter switch, logical host failover, or no action) is taken.

The remote probes ensure that the HA-NFS file system is visible,
available, and operational from a remote node. The backup physical
system for the HA-NFS logical host runs the remote probes.

The remote probes do the following:

● Use the logical host IP address to ensure that the NFS daemons are
running on the physical host

● Mount all of the NFS file systems from the logical host

● Perform read, write, and locking operations to each file system

If these tests fail, takeaway is considered.


Giveaway and Takeaway Process

If either the local or the remote fault monitors detect certain failures,
they attempt to force a reconfiguration of the logical host. This is an
attempt to migrate the logical host to a healthy system.

If the fault is detected by the local fault monitor, it initiates a giveaway


process. This might end with the designated backup system taking
over the logical host.

If a fault is detected by the remote fault monitor, it initiates a takeaway


process. This might end the same as the giveaway process, with the
backup system taking over the logical host.


As shown in Figure 12-1, either the local or remote fault monitors for a
data service can initiate a logical host migration.

[Figure not reproduced: the diagram shows phys-hostA and phys-hostB on the public network; the local fault monitor checks the data service on its own node and can initiate a giveaway, while the remote fault monitor on the backup node checks the service across the network and can initiate a takeaway.]

Figure 12-1 Logical Host Giveaway and Takeaway

Sanity Checking
Before a physical host can become the new master of a logical host, it
must be certified as fully operational. This is called sanity checking
and is performed by the FM_CHECK methods.

If neither the local host nor the remote host is healthy, the logical host might shut down.


Processes Related to NFS Fault Monitoring

For each logical host, the local and remote fault monitoring processes, nfs_probe_loghost and nfs_mon, are running.

In addition, there is one local probe per physical node, nfs_probe_local_start, that watches the local NFS daemons.
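A quick, informal way to confirm that these monitors are running on a node is to look for the processes by name; this assumes the process names listed above appear unchanged in the process table:

# ps -ef | grep nfs_probe
# ps -ef | grep nfs_mon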

NFS Server Daemon Threads

If you do not specify enough nfsd server daemon threads in the /etc/rc3.d/S15nfs.server script, on the /usr/lib/nfs/nfsd line, you might have server throughput problems. Also, if there are insufficient nfsd server daemon threads, remote fault probes can fail and cause unnecessary failovers. The default value is 16 threads. Some performance and tuning books suggest that you adjust the threads as follows:

● Use 2 NFS threads for each active client process

● Use 16 to 32 NFS threads for each CPU (can be much higher)

● Use 16 NFS threads for each 10-Mbits of network capacity
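For example, the stock /usr/lib/nfs/nfsd line in the S15nfs.server script typically starts 16 threads; raising it might look like the following, where the value 64 is only an illustration and should be sized using the guidelines above:

/usr/lib/nfs/nfsd -a 64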


HA-NFS Support Files

In addition to the logical host-specific vfstab file, there is also a


logical host-specific dfstab file that contains NFS share commands.

Once you create the logical host, adding an HA-NFS file system is similar to adding any NFS file system: you add mount entries to the vfstab.lhname file and share entries to the dfstab.lhname file.

The mount and share entries are configured on the cluster hosts that
support the particular HA file system.

If HA-NFS is already running in the logical host, transition the logical host in and out of maintenance mode using the haswitch command. This automatically mounts and shares any new file systems.

Caution – If the file systems are mounted and shared manually, the local and remote fault monitoring processes are not started until you use the next haswitch command for that logical host.

When using the VxVA GUI, make sure you do not accidentally create
file systems with the automount option selected.
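For example, after adding new mount and share entries for a logical host lhost1 currently mastered by phys-hostA (hypothetical names), you could cycle the logical host through maintenance mode so the new file systems are mounted and shared:

# haswitch -m lhost1
# haswitch phys-hostA lhost1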


Adding Mount Information to the vfstab File


The vfstab file specifies the mounting of file systems. There must be
a vfstab.lhname file for each logical host, and the files must be the
same on all cluster nodes that might support that logical host. There
can be more than one disk group for each logical host.

All entries in this file must correspond to file systems located on multi-host disks, and can specify either UFS or VxFS file systems.

File systems can have any name and mount point, as long as they are
represented in the proper logical host vfstab file.

Adding Share Information to the dfstab File


The dfstab file specifies the sharing (exporting) of HA-NFS file
systems. There must be a dfstab.lhname file for each logical host
that is exporting HA-NFS file systems. These files must be the same on
all cluster nodes that master that logical host.

If you use the dfstab options to limit access to only certain NFS clients, you should also grant access to all physical hostnames of the servers. This enables these servers to act as clients, which is an important part of fault monitoring. For example:

share -F nfs -o rw=client1:client2:sc-node0:sc-node1 /hanfs/export/fs3

If possible, you should confine share options to just the rw or ro forms


that do not provide a list of clients or netgroups. This removes any
dependency on the name service.

Note – You must also register and start the HA-NFS data service,
which is discussed in the ‘‘Registering a Data Service’’ section on page
12-21 and in the ‘‘Starting and Stopping a Data Service’’ section on
page 12-25.


Sample vfstab and dfstab Files


Compare the vfstab.lhost1 and dfstab.lhost1 entries shown in the following:

# cat vfstab.lhost1
/dev/vx/dsk/dg1/dg1-stat /dev/vx/rdsk/dg1/dg1-stat /hanfs ufs 1 no -
/dev/vx/dsk/dg1/vol1 /dev/vx/rdsk/dg1/vol1 /ha/fs1 ufs - no -
/dev/vx/dsk/dg1/vol2 /dev/vx/rdsk/dg1/vol2 /hahttp/home ufs - no -
/dev/vx/dsk/dg3/volm /dev/vx/rdsk/dg1/vol4 /dbms/tbp1 ufs - no -
#
# cat dfstab.lhost1
share -F nfs -d "HA file system 1" /ha/fs1
share -F nfs -d "HTTP file system" /hahttp/home
share -F nfs -d "DBMS file system" /dbms/tbp1

The administrative file system (/hanfs) is not shared. The


administrative file system for a logical host is never shared.

Removing HA-NFS File Systems From a Logical Host


To remove a file system from HA-NFS control you must:

1. Manually unshare the file system that is being removed.

2. On all appropriate hosts, delete the related mount information from the /etc/opt/SUNWcluster/conf/hanfs/vfstab.lhname file for the logical host.

3. On all appropriate hosts, delete the related share information from the /etc/opt/SUNWcluster/conf/hanfs/dfstab.lhname file for the logical host.
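A minimal sketch for a file system mounted on /ha/fs1 in a logical host lhost1 (hypothetical names) might be:

# unshare /ha/fs1

Then remove the /ha/fs1 lines from the vfstab.lhost1 and dfstab.lhost1 files in /etc/opt/SUNWcluster/conf/hanfs on every node that can master lhost1.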


Using the hareg Command

You use the hareg command to register a data service, change its on/off state, and display information about registered data services. It can configure the data service for all existing logical hosts, or for just selected logical hosts (using -h).

Registering a Data Service


Before a data service, such as HA-NFS, can provide services, you must
register it. Data services are typically registered only once when the
service is initially configured.

You can register data services, only when the cluster is running and all
nodes are joined.


Registering the HA-NFS Data Service

You can run the hareg command on any node currently in the cluster
membership. It is run only once regardless of the number of HA-NFS
logical hosts in the cluster.

● To register the HA-NFS data service and associate one or more logical hosts with it:
# hareg -s -r nfs [-h logical_host]

Note – The -s option indicates that this is a Sun-supplied data service.

● To check the status of all data services:


# hareg
nfs off
● To associate a previously registered data service with a new logical
host:
# scconf clustername -s data-service-name logicalhost-name
● To obtain configuration information about a data service:
# hareg -q nfs


Registering a Custom Data Service

Configuring a custom data service is a complex issue that requires a


great deal of preparation. The following example shows how a custom
data service might be registered.
# hareg -r new_ha_svc \
-v 2.7 \
-m START=/var/new_ha_svc/my_start \
-t START=30 \
-m STOP=/var/new_ha_svc/my_stop \
-t STOP=30 \
-d NFS

The command options have the following meaning:

● The -s option is not used because this is not a standard Sun data service

● The -r option precedes the data service name

● The -v 2.7 option defines the data service version number

● The -m START option is the path to the start method

The supported method names are START, START_NET, STOP, STOP_NET, ABORT, ABORT_NET, FM_INIT, FM_START, FM_STOP, and FM_CHECK.

● The -t START option is the timeout value for the start method

The timeout value is the amount of time the method has to complete before it is terminated.

● The -d NFS option indicates the custom data service is dependent on the NFS data service


Unregistering a Data Service


To stop and unregister a data service, such as HA-NFS, you must
perform the following steps:

1. Stop the HA-NFS service on all nodes.


# hareg -n nfs

Note – Data services are cluster-wide; you cannot stop HA-NFS on just one node. You can effectively stop a data service for a logical host by placing the logical host in maintenance mode.

2. Unregister the HA-NFS data service.


# hareg -u nfs [-h ...]

3. If appropriate, remove the logical hosts themselves.


# scconf clustername -L lghost1 -r

Note – You must turn a data service off before removing a logical host
associated with it.


Starting and Stopping a Data Service


Each data service has an on/off state. The on/off state for a data
service is cluster-wide, and persists through reboots, takeovers, and
switchovers. This on/off state provides system administrators with a
mechanism for temporarily shutting off a data service. When a data
service is off, it is not providing services to clients.

Starting and Stopping the HA-NFS Data Service

To turn the HA-NFS data service on:


# hareg -y nfs
# hareg
nfs on

Note – Use the hareg -n option to turn a data service off.

You can use multiple -n and -y options to turn some services on and
other services off at the same time. However, the final on/off state
must satisfy any data service dependencies. For example, if a data
service depends on NFS, it is not legal to turn that data service on and
turn NFS off at the same time.
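For example, assuming a registered custom service named my_ha_svc that depends on NFS (a hypothetical name), the following combination is legal, while turning my_ha_svc on and NFS off in the same command would not be:

# hareg -y nfs -n my_ha_svc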

Global Data Service Control

The hareg command has options that globally start or stop all registered data services.

● To start all registered data services:


# hareg -Y
● To stop all registered data services:
# hareg -N


File Locking Recovery

When the HA-NFS logical host fails over to a different physical host,
the client sees no significant differences. The IP address it was
communicating with before is active, and all of the file systems are
available. A request could have timed out, and might need to be
restarted, but there should be no other changes to the client.

However, if the client had locked files on the server, the locks might
have been lost when the server failed. This would require an
immediate termination of all NFS services for the active clients,
because data integrity could no longer be guaranteed. This would
defeat the purpose of HA-NFS.

When the logical host fails over, the NFS statd and lockd processes on the new physical system are restarted. Before serving any data, they contact all of the clients and request information about the locks that they were holding.

This lock reestablishment process is a capability of the normal, non-HA operation of the NFS environment, and is not changed for HA-NFS.

Exercise: Setting Up HA-NFS File Systems

Exercise objective – In this exercise you will do the following:

● Register the HA-NFS data service

● Verify that the HA-NFS data service is registered and turned on

● Verify that the HA-NFS file systems are mounted and exported

● Verify that clients can access HA-NFS file systems

● Switch the HA-NFS data services from one server to another

Preparation
There is no preparation required for this exercise.

Tasks
The following tasks are explained in this section:

● Verifying the environment

● Preparing the HA-NFS file systems

● Registering the HA-NFS data service

● Verifying access by NFS clients

● Observing HA-NFS Failover Behavior


Verifying the Environment


In earlier exercises, you created a logical host for HA-NFS, hanfs,
using a disk group named hanfs. Confirm that this logical host is
available and ready to configure for HA-NFS.

1. Ensure that your cluster is active. If not, start it.

a. On only one cluster node, type:


# scadmin startcluster phys_nodename clustername

Note – Wait for the cluster to activate and reconfiguration to complete


on the first node.

b. Start each remaining node at exactly the same time.


# scadmin startnode

Caution – If you cannot start the remaining nodes at exactly the same time, then wait for each node to complete its reconfiguration before starting the next node.

2. Make sure that you have the hanfs switchable disk group that you
can assign for use by HA-NFS. The hanfs disk group was created
in an earlier lab.
# vxprint

Note – Use the metastat and metaset -s hanfs commands to verify the hanfs diskset and volume status for SDS installations.

3. Verify that the status of each new NAFO group is OK on all nodes.
# pnmstat -l


Preparing the HA-NFS File Systems


Now that the logical host is ready, configure it for use with HA-NFS.

The demo file system mount information is different for CVM/SSVM installations and SDS installations. Both examples are shown.

1. Add the /hanfs1 and /hanfs2 file system mount information to the vfstab.clustername-nfs file on all nodes on which the HA-NFS logical host will run.

CVM/SSVM Mount Information

/dev/vx/dsk/hanfs/hanfs.1 /dev/vx/rdsk/hanfs/hanfs.1 /hanfs1 ufs 1 no -


/dev/vx/dsk/hanfs/hanfs.2 /dev/vx/rdsk/hanfs/hanfs.2 /hanfs2 ufs 1 no -

SDS Mount Information

/dev/md/hanfs/dsk/d101 /dev/md/hanfs/rdsk/d101 /hanfs1 ufs 1 no -


/dev/md/hanfs/dsk/d102 /dev/md/hanfs/rdsk/d102 /hanfs2 ufs 1 no -

2. On both nodes, create the logical host-specific dfstab files.


# cd /etc/opt/SUNWcluster/conf/hanfs
# touch dfstab.clustername-nfs

3. On both nodes, add share commands to the HA-NFS-specific dfstab.clustername-nfs files for your two HA-NFS test file systems.
share -F nfs -o rw,anon=0 /hanfs1
share -F nfs -o rw,anon=0 /hanfs2

4. If any mountd and nfsd processes are running, stop them, or run
/etc/init.d/nfs.server stop.


Registering HA-NFS Data Service


Now you are ready to activate the HA-NFS data service.

Run these commands from only one node.

1. Use the hareg command to see which data services are currently
registered.
# hareg

2. Register HA-NFS by typing the following command on only one cluster node:
# hareg -s -r nfs -h clustername-nfs

3. Use the hareg command again to see that the HA-NFS data
service is now registered.
# hareg
nfs off

4. Turn on the HA-NFS service for the cluster.


# hareg -y nfs

5. Verify that the logical host’s HA-NFS file systems are now
mounted and shared using the mount and dfshares commands.

Note – You may have to switch the nfs logical host between nodes
before the nfs file systems are mounted and shared.


Verifying Access by NFS Clients


Verify that NFS clients can access the HA-NFS file systems.

1. On the administration workstation, verify that you can access the nfs logical host file system.
# ls /net/clustername-nfs/hanfs1
lost+found test_file

2. On the administration workstation, copy the Scripts/test.nfs file into the root directory.

3. On the administration workstation, edit the /test.nfs script and change the value of the clustername entry to match your logical hostname.
#!/bin/sh
cd /net/clustername-nfs/hanfs1
while true
do
    echo `date` > test_file
    cat test_file
    rm test_file
    sleep 1
done

When this script is running, it creates and writes to an NFS-mounted


file system. It also displays the time to standard output (stdout). This
script makes it easy to informally time how long the NFS data service
is interrupted during switchovers and takeovers.


Observing HA-NFS Failover Behavior


Now that the HA-NFS environment is working properly, test its high
availability operation.

1. On the administration workstation, start the test.nfs script.

2. Use the scadmin switch or haswitch command to transfer control of the NFS logical host from one HA server to the other.
# scadmin switch clustername dest-phys-host logical-host
# haswitch dest-phys-host logical-host

3. Observe the messages displayed by the test.nfs script.

4. How long was the HA-NFS data service interrupted during the
switchover from one physical host to another?
__________________________________________________

5. Use the mount and share commands on both nodes to verify


which file systems they are now mounting and exporting.
__________________________________________________
__________________________________________________
__________________________________________________

6. Use the ifconfig command on both nodes to observe the


multiple IP addresses (physical and logical) configured on the
same physical network interface.
# ifconfig -a

Check Your Progress

Before continuing on to the next module, check that you are able to
accomplish or answer the following:

❑ Describe the function of the HA-NFS support files

❑ List the primary functions of the HA-NFS start and stop methods

❑ List the primary functions of the HA-NFS fault monitoring probes

❑ Configure HA-NFS in a Sun Cluster environment

❑ Add and remove HA-NFS file systems

❑ Switch a HA-NFS logical host between systems

Think Beyond

Are there restrictions on the file systems HA-NFS can support?

What types of NFS operations (if any) might be more difficult in the
HA-NFS environment?

System Recovery 13

Objectives

Upon completion of this module, you will be able to:

● List the functions of Sun Cluster control software

● List the events that can trigger a cluster reconfiguration

● Explain the failfast concept

● Describe the general priorities during a cluster reconfiguration

● Describe the recovery process for selected cluster failures

● Recover from selected cluster failures

This module summarizes the basic recovery process for a number of


typical failure scenarios. It includes background information and
details about operator intervention.

Relevance

Discussion – The following questions are relevant to your learning the


material presented in this module:

1. How does the cluster recognize that there has been an error?

2. What types of error detection mechanisms are there?

3. How does the administrative workstation detect an error?

4. How do you recover from the common Sun Cluster HA system failures?

Additional Resources

Additional resources – The following references can provide


additional details on the topics discussed in this module:

● Sun Cluster 2.2 System Administration Guide, part number 805-4238

● Sun Cluster 2.2 Cluster Volume Manager Guide, part number 805-4240

● Sun Cluster 2.2 Error Messages Manual, part number 805-4242


Sun Cluster Reconfiguration Control

Reconfiguration in a cluster environment can happen at several different levels. Some reconfigurations are independent of the cluster framework software. For example, the disk management software monitors the state of virtual volumes and can detach mirrors if there is a hardware failure. This is managed independently of any other cluster software.

Other reconfigurations can range from a full reconfiguration, in which cluster membership is renegotiated, to a minor switchover to a backup network interface.


Many of the components shown in Figure 13-1 have failure recovery


capabilities. Some failures are less transparent than others and can
result in a node crash. Although some of the failures do not disturb the
cluster operation, they reduce the level of redundancy and therefore
increase the risk of data loss.

[Figure not reproduced: the diagram shows two cluster nodes (Node 0 and Node 1), each running the DBMS, SMA, CMM, ccdd, PNM, disk management, fault monitor, and failfast (FF) components, exchanging heartbeats and CCD updates over the private networks and connected through fiber-optic channels to the storage arrays.]

Figure 13-1 Cluster Reconfiguration Control


All of the following cluster software components have some level of


error detection and recovery capabilities.

Cluster Membership Monitor


The CMM daemon, clustd, detects the loss of the heartbeat from
a failed node and initiates a general cluster reconfiguration.

Switch Management Agent


The SMA detects the loss of the primary cluster interconnect and
switches to the backup interconnect path. This is a minor local
reconfiguration.

SMA provides support for Ethernet private networks and for SCI
private networks, as well as additional SCI switch management
functions.

Public Network Management


The pnmd process monitors the state of the cluster’s public
network interfaces and network. It can failover to a backup
adapter or confirm a network outage. It initiates only a minor local
reconfiguration if it is switching to a backup interface. It can also
trigger a larger logical host reconfiguration if general public
network problems are detected on a cluster host.

Failfast Driver (/dev/ff)


The Sun Cluster failfast driver monitors critical processes or
operations. If they do not respond or complete within defined
limits, the failfast driver forces a system panic.


Data Service Fault Monitors


The local and remote fault monitors for a data service can force the
data service to migrate to a new host system. This causes an
intermediate level cluster reconfiguration.

Disk Management Software


The CVM, SSVM, and SDS volume managers all monitor the state
of their virtual volume structures. If a disk drive failure is detected
for a mirrored or RAID5 volume, the disk management software
can take the failed object out of active use. This type of
reconfiguration is completely independent of any other cluster
software and is transparent to all other cluster software.

Database Management Software


Some databases, such as Oracle Parallel Server, have resident
recovery mechanisms that can automatically recover from an
unexpected cluster host system crash and continue. This is an
independent recovery feature. Most other databases do not have
recovery capabilities of this kind.


Sun Cluster Failfast Driver

The failfast mechanism is a software watchdog usually described as “a


time-out on a time-out.” Some failures are too critical to allow further
node operation. The affected node must be stopped immediately to
prevent database corruption. The failfast driver forces a UNIX panic.

After the panic, the system automatically reboots. The panic ensures
that the Sun Cluster software is aware that a problem occurred and
will initiate appropriate actions.


As shown in Figure 13-2, the failfast driver constantly monitors the


success of critical daemons or cluster operations. If the monitored
operation times out, the failfast driver forces a UNIX panic.

[Figure 13-2 Failfast Mechanism: on Node 0, the ff kernel driver monitors a critical daemon or critical operation. While the monitored item responds, nothing happens; if the failfast timeout expires, the driver forces a UNIX panic and the node reboots. The other nodes detect the loss of heartbeat through the CMM and run reconf_ener, performing steps that depend on the cluster configuration.]

After the panic, the system automatically tries to start again. However,
if there is UNIX file system damage that cannot be automatically
repaired by the fsck utility, you might have to run fsck manually to
repair the damage.
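For example, if the reboot stops because a file system could not be cleaned automatically, you might run fsck by hand and then continue the boot. The device name shown is only an illustration; use the device reported in the fsck failure message:

# fsck -y /dev/rdsk/c0t0d0s5
# reboot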


Failfast Messages
When the failfast driver forces the UNIX panic, a panic message is
displayed in the cconsole window of the system. As shown in the
following example, the panic message contains an error message that
might point to the source of the problem.
# panic[cpu3]/thread=0xf037c4a0: Failfast timeout - unit
“comm_timeout” Device closed while Armed
syncing file systems... [3] 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7
7 7 7 7 7 done
rebooting...

Record the failfast error message if you can; it soon scrolls off the
screen. The relevant portion of the error is displayed in quotation marks.


Sun Cluster Reconfiguration Sequence

Many events can trigger cluster reconfiguration. Regardless of how a


general reconfiguration is initiated, the general process is controlled by
a master script file named reconf_ener.

The reconf_ener script has many subroutines and depending on


how it was initiated, the reconfiguration can be minor or it can be a
major reconfiguration that results in the loss of one or more cluster
members.

The later stages of a full reconfiguration are reserved for logical host
reconfigurations.

The /opt/SUNWcluster/bin/reconf_ener script is run any time a


node reconfiguration is required.
Warning – Do not edit the reconf_ener script. Any change to the
script can cause unreliable operation or database corruption.


As shown in Figure 13-3, once the reconf_ener script file is initiated,


it can perform many different operations that are dependent on
current status information. The reconf_ener script can also initiate
disk management software recovery procedures.

[Figure 13-3 Reconfiguration Initiation: operator commands (scadmin startcluster, scadmin startnode, scadmin stopnode) and detected status changes (a failed private network, another node failing, another node joining the cluster) all invoke reconf_ener, which runs varied reconfiguration steps depending on the cluster configuration and application. Independent recovery runs alongside it: disk management monitors virtual volumes, disables failed structures, and resynchronizes volumes; UNIX performs file system recovery and reboots after a panic.]


Reconfiguration Triggering Events


Many cluster events can trigger a reconfiguration including:

● The operator using the scadmin command to start or stop a node.

● The CMM (clustd) detecting a failed node.

● The CMM (clustd) detecting that another node is joining the


cluster.

● SMA detecting a failed private network. This generates a minor


reconfiguration.

Independent Reconfiguration Processes


The following recovery processes are independent of the reconf_ener
script:

● CVM independently manages problems with virtual volumes.

● RDBMS user application failures are detected and handled


internally by the RDBMS software.

● Oracle data recovery using redo logs is handled by Oracle, but is


initiated indirectly by the DLM recovery process.

● PNM reconfiguration of NAFO groups

● UNIX file system recovery is performed automatically by the fsck


utility unless the errors are too severe.


Sun Cluster Reconfiguration Steps

When there is a change in cluster status, either because of a failed node


or operator intervention, the reconfiguration process proceeds in steps.
The steps are coordinated between all active nodes in the cluster and
all nodes must complete a given step before the reconfiguration can
proceed to the next step.


As shown in Figure 13-4, the cluster interconnect system (CIS) is used


for communication between nodes during a reconfiguration. It
provides a critical link that is used to verify step changes between the
cluster members.

[Figure 13-4 Cluster Reconfiguration Coordination: each cluster node runs its own copy of reconf_ener and moves through Step 1, Step 2, Step 3, Step 4, up to Step n in lock-step, using the CIS to synchronize with the other nodes before advancing to the next step.]


Reconfiguration Process Priorities


The reconfiguration steps are prioritized. The first steps during a
cluster reconfiguration resolve fundamental issues that are important
to general cluster operation. The later steps address the more
specialized cluster functionality. The steps proceed as follows:

1. The general reconfiguration steps are completed first. This


includes:
● Reserving quorum devices
● Arbitrating cluster membership
● Establishing cdb and ccd consistency
● Starting disk management recovery programs

2. If appropriate for the cluster application, database reconfiguration


steps are initiated next. This is important for Oracle Parallel Server
installations and includes:
● Distributed lock management recovery

3. The data service reconfiguration steps are completed in the final
stages (see the command sketch after this list) and include:
● Shutting down NFS daemons
● Exporting HA-NFS file systems
● Importing disk groups (to backup node)
● Using fsck to mount failed HA-NFS file systems
● Using ifconfig on logical IP addresses to backup node
● Restarting NFS daemons (initiating client lock recovery)
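For an HA-NFS logical host, the work done in these final steps is roughly equivalent to the following manual sequence. The disk group, volume, mount point, adapter, and address names are examples only, and the framework performs these actions automatically:

# vxdg import hanfs-dg
# fsck -y /dev/vx/rdsk/hanfs-dg/vol01
# mount /dev/vx/dsk/hanfs-dg/vol01 /hanfs/fs1
# ifconfig hme0:1 192.9.200.10 netmask + broadcast + up
# share -F nfs /hanfs/fs1
# /usr/lib/nfs/nfsd -a 16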


Reconfiguration Step Summary


As shown below, the cluster reconfiguration process varies depending
on the data service configuration. Several different reconfiguration
processes can take place in the same time period.

Steps  General                        CVM/SSVM                      Oracle DLM

 0     Begin
 1     Disk quorum or failure
       fencing resolved
 2     CIS issues resolved
 3     CCD issues resolved
 4
 5                                                                  Distributed lock
 6                                    Volume recovery initiated     recovery initiated
 7     PNM issues resolved            and recovery performed        and coordinated
 8                                    if necessary
 9
10     Logical host issues resolved
11
12     Finish


Cluster Interconnect Failures

CIS Failure Description


If the Ethernet or SCI interconnect fails on a node in the cluster, the
smad daemon on that node detects the failure and initiates a minor
reconfiguration that switches to the backup CIS interface.

The other nodes in the cluster are aware that CIS communications
have moved to the backup interface so they also switch to their backup
interfaces.

Note – A critical feature of a CIS failure is that switching to a backup


interface can be done quickly. If it takes too long, then the cluster
membership monitor daemon, clustd, times out and a major cluster
reconfiguration is started.


CIS Failure Symptoms


Error messages are different for each type of cluster interconnect
system.

An Ethernet-based interconnect failure produces continuously
repeating error messages, such as the following:
ID[SUNWcluster.sma.down.5010]: link between node 0 and
node 1 on net 0 is down
Aug 16 11:08:55 eng-node0 ID[SUNWcluster.reconf.5010]:
eng-cluster net 0 (be0) de-selected
Aug 16 11:08:56 eng-node0 ID[SUNWcluster.reconf.1030]:
eng-cluster net 1 (be1) selected
be0: Link Down - cable problem?

The following SCI interconnect failure messages cease after the backup
interface is operational.
NOTICE: ID[SUNWcluster.sma.smak.4001]: SCI Adapter 0:
Card not operational (1 2)
NOTICE: ID[SUNWcluster.sma.smak.4051]: SCI Adapter 0:
Link not operational (1 2)
Nov 15 17:55:08 sec-0 ID[SUNWcluster.sma.smad.5010]: sec-
cluster adapter 0 de-selected
Nov 15 17:55:08 sec-0 ID[SUNWcluster.sma.smad.1030]: sec-
cluster adapter 1 selected


Correcting Ethernet CIS Failures


The following actions are necessary to repair a failure in an Ethernet-
based cluster interconnect (a command sketch follows the list):

● You must determine if the problem is a cable or an interface card.

● You might have to take the node with the failed Ethernet interface
out of clustered operation while repairs are being made.

● After repairs are complete, you must bring the node into the
cluster again. No further action is required.
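For example, assuming the failed interface is on a node named phys-node1 (an example name), the repair sequence might be:

phys-node1# scadmin stopnode
(Replace the failed cable or interface card.)
phys-node1# scadmin startnode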

Correcting SCI Interconnect Failures


The following actions are necessary to repair a failure in an SCI-based
cluster interconnect (a command sketch follows the caution note):

● You must determine if the problem is a cable or an interface card.

● You might have to take the node with the failed SCI interface out
of clustered operation while repairs are being made.

● After repairs are completed, you must take all of the nodes out of
clustered operation and run the sm_config program on one of the
nodes followed by a reboot of all nodes.
Caution – If any SCI card or switch is moved or replaced, you must
run the /opt/SUNWsma/bin/sm_config script again to reprogram the
SCI card flash PROMs. The sm_config script will tell you which
cluster hosts must be rebooted.
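A sketch of the SCI repair sequence follows. The -f template-file usage shown for sm_config is an assumption; use the same invocation that was used when the interconnect was originally configured:

# scadmin stopnode
(Run stopnode on each node, then replace the failed SCI cable, card, or switch.)
# /opt/SUNWsma/bin/sm_config -f template_file
(Run sm_config on one node only.)
# init 6
(Reboot the nodes that sm_config identifies.)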


Two-Node Partitioned Cluster Failure

If a complete cluster interconnect failure occurs in a two-node cluster,


the clustd daemons on both nodes detect a heartbeat loss, and each
node initiates a reconfiguration.

The reconfiguration process is different depending on which disk


management software your cluster is using.

CVM or SSVM Partitioned Cluster


If there is a complete CIS failure in a two-node cluster that is running
either CVM or SSVM software, both nodes in the cluster race to
reserve the designated quorum disk drive.

The first node to reserve the quorum device remains in the cluster and
takes over any additional logical hosts for which it is the backup. The
other node aborts the Sun Cluster software.

Once the CIS problem is repaired, both nodes can resume normal
clustered operation.


SDS Partitioned Cluster


If there is a complete CIS failure in a two-node cluster that is running
the SDS software, there is no quorum device to arbitrate. When the
private networks are broken, the nodes cannot communicate with each
other, and each assumes it is alone in the cluster. The host that owns
the logical host or hosts completes its cluster reconfiguration normally
and takes no further action, because it already masters all of the logical
hosts. However, the other host also believes it is alone in the cluster,
and it also goes through a cluster reconfiguration.

During its reconfiguration, the backup host assumes it has to take


control of the logical host and in doing so takes ownership of the
disksets. This causes a panic on the node that currently has ownership
of the disksets.

If each host is the designated backup for the other, it is a race as to


which one panics first.


Logical Host Reconfiguration

Each logical host has both a local and a remote fault monitoring
program associated with it.

● The local fault monitor runs on the current logical host master.

● The remote fault monitor runs on the designated backup system


for the logical host.

If either the local or the remote fault monitors detect certain failures,
they attempt to force a reconfiguration of the logical host. This is an
attempt to migrate the logical host to a healthy system.


As shown in Figure 13-5, the local fault monitor runs on the logical
host master and verifies the health of the master host. The remote fault
monitor runs on the designated backup system for the logical host and
also verifies the correct operation of the logical host master.
[Figure 13-5 Logical Host Fault Monitoring: the local fault monitor on phys-hostA (the current master) and the remote fault monitor on phys-hostB both check the data service across the public network. The local monitor can initiate a giveaway and the remote monitor a takeaway.]

If the fault is detected by the local fault monitor, it initiates a giveaway


process with the hactl command. This might end with the designated
backup system taking over the logical host.

If a fault is detected by the remote fault monitor, it initiates a takeaway


process with the hactl command. This might end the same as the
giveaway process, with the backup system taking over the logical host.

Sanity Checking
Before a physical host can become the new master of a logical host, it
must be certified as fully operational. This is called sanity checking
and is performed by the FM_CHECK methods.

Exercise: Failure Recovery

Exercise objective – In this lab you will perform a recovery for the
following:

● Failed cluster interconnect

● Partitioned cluster

● A NAFO group interface failure

● A logical host fault monitor giveaway or takeaway

● A failfast

Preparation
You should start the Sun Cluster Manager application on one of the
cluster hosts and display it on the administration workstation. Use the
Sun Cluster Manager application to observe the effects of the failures
that are created in this lab.

Tasks
The following tasks are explained in this section:

● Recovering after losing a private network cable


● Recovering from a cluster partition

● Recovering after a public network failure

● Recovering after a logical host fault monitor giveaway

● Recovering from a cluster failfast


Losing a Private Network Cable


1. Disconnect the active private network cable or turn off the
associated SCI switch.

Note – Be careful with the fragile SCI cables if you move them.

● Predicted behavior:
__________________________________________________________
__________________________________________________________
__________________________________________________________
__________________________________________________________

● Observed behavior:
__________________________________________________________
__________________________________________________________
__________________________________________________________
__________________________________________________________

Partitioned Cluster (Split Brain)


Disconnect both private network cables from the same node, as close
to simultaneously as possible, or turn off both SCI switches.

Note – Be careful with the fragile SCI cables if you move them.

● Predicted behavior:
__________________________________________________________
__________________________________________________________
__________________________________________________________
__________________________________________________________

● Observed behavior:
__________________________________________________________
__________________________________________________________
__________________________________________________________
__________________________________________________________


Public Network Failure (NAFO group)


1. Disconnect an external (public) network cable.

● Predicted behavior:
__________________________________________________________
__________________________________________________________
__________________________________________________________
__________________________________________________________

● Observed behavior:
__________________________________________________________
__________________________________________________________
__________________________________________________________
__________________________________________________________

Logical Host Fault Monitor Giveaway


1. On one node, use the kill command to kill the nfsd daemon on
the physical host mastering the HA-NFS logical host.
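One way to create this failure on the node that currently masters the HA-NFS logical host is shown below; pid_of_nfsd stands for the process ID reported by ps:

# ps -ef | grep nfsd
# kill pid_of_nfsd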

● Predicted behavior:
__________________________________________________________
__________________________________________________________
__________________________________________________________
__________________________________________________________

● Observed behavior:
__________________________________________________________
__________________________________________________________
__________________________________________________________
__________________________________________________________


Cluster Failfast
1. Kill the clustd process on a node to create a cluster abort or kill
the ccdd daemon to create a failfast panic.

● Predicted behavior:
__________________________________________________________
__________________________________________________________
__________________________________________________________
__________________________________________________________

● Observed behavior:
__________________________________________________________
__________________________________________________________
__________________________________________________________
__________________________________________________________

Caution – Creating a failfast causes a UNIX panic. This can cause
permanent file system damage.


Exercise Summary
Discussion – Take a few minutes to discuss what experiences, issues,
or discoveries you had during the lab exercises.

● Experiences

● Interpretations

● Conclusions

● Applications

Check Your Progress

Before continuing on to the next module, check that you are able to
accomplish or answer the following:

❑ List the functions of Sun Cluster control software

❑ List the events that can trigger a cluster reconfiguration

❑ Explain the failfast concept

❑ Describe the general priorities during a cluster reconfiguration

❑ Describe the recovery process for selected cluster failures

❑ Recover from selected cluster failures

Think Beyond

What are the issues for split-brain failures with more than two nodes?

Is it safe to have two “subclusters” running in a nominal four-node


cluster?

What procedures should be documented for operations personnel?

Sun Cluster High Availability Data Service API 14

Objectives

Upon completion of this module, you should be able to:

● Describe the available data service methods

● Describe when each method is called

● Describe how to retrieve cluster status information

● Describe how to retrieve cluster configuration information

● Describe how the fault methods work and how to request failovers

This module demonstrates how to integrate your applications into the


Sun Cluster High Availability framework. It also describes key failover
actions performed by the Sun Cluster High Availability software.

Relevance

Discussion – The following questions are relevant to understanding


this module’s content:

1. What are the requirements to add a data service?

2. What information do you have to provide to Sun Cluster?

3. How do you interact with the cluster?

4. How do you retrieve cluster status and configuration information?

Additional Resources

Additional resources – The following references can provide


additional details on the topics discussed in this module:

● Sun Cluster 2.2 System Administration Guide, part number 805-4238

● Sun Cluster 2.2 API Developers Guide, part number 805-4241


Overview

Sun Cluster High Availability provides an API for making a data service
highly available. The API permits a client-server data service to be
layered on top of Sun Cluster High Availability.

Usually, the data service already exists and was developed in a


non-HA environment. The API was designed to permit an existing
data service to be easily added to the Sun Cluster HA environment.

The Sun Cluster HA Data Service API employs command–line utility


programs and a set of C library routines. For convenience, all C library
functionality is also available using the command–line utility
programs. This gives the programmer the option to code shell scripts
or to code in a compiled language.

Note – Custom written HA data services are not supported by


SunService unless they are written by the Sun Professional Services
organization.


When a data service first registers with Sun Cluster High Availability,
it registers a set of call-back programs or methods. Sun Cluster High
Availability makes call-backs to the data service’s methods when
certain key events in the Sun Cluster High Availability cluster occur.

Refer to the Sun Cluster 2.2 API Programmer’s Reference Guide for more
information.
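For example, the Sun-supplied HA-NFS data service is registered and turned on with the hareg command. The logical host name below is an example, and the exact option usage should be verified against the hareg(1M) man page and the data service installation procedure:

# hareg -s -r nfs -h hanfs-lhost
# hareg -y nfs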


Data Service Requirements

A data service must meet the requirements discussed in the
following sections to participate in the Sun Cluster High
Availability Data Service API.

Client-Server Data Service


Sun Cluster High Availability is designed for client-server
applications. Time-sharing models in which clients remotely log in
and run the application on the server have no inherent ability to
handle a crash of the server.

Data Service Dependencies


The data service process(es) must be relatively stateless in that they
write all updates to disk. When a physical host crashes and a new
physical host takes over, Sun Cluster High Availability calls a
method to perform any crash recovery of the on-disk data.


No Dependence on Physical Hostname of Server


If a data service needs to know the hostname of the server on which
it is running, it should be modified to use the logical hostname rather
than the physical hostname.

Handles Multi-homed Hosts


Multi-homed hosts are hosts that are on more than one public network.
Sun Cluster High Availability servers can appear on multiple
networks, and have multiple logical (and physical) hostname and IP
addresses. The data service must handle the possibility of multiple
logical hosts on more than one public network.

Handles Additional IP Addresses for Logical Hosts


Even in hosts that are non-multi-homed, a High Availability server
has multiple IP addresses: one for its physical host, and one
additional IP address for each logical host it currently masters. A
High Availability server dynamically acquires additional IP
addresses when it becomes master of a logical host, and dynamically
relinquishes IP addresses when it gives up mastery of a logical host.
The START and STOP methods provide hooks for Sun Cluster HA,
which inform a data service that the set of logical hosts has changed.


Reconfiguration Overview

Whenever a change in cluster state occurs, the Sun Cluster HA


software performs a cluster reconfiguration. A cluster state change can
be caused by a host crashing, or by the planned migration of a logical
host using the haswitch or scadmin switch commands.

The Sun Cluster reconfiguration process is a sequence of steps on all


physical hosts that are currently up and in the cluster. The steps
execute in lock-step, which means that all hosts complete one step
before any host goes on to the next step. After general reconfiguration
issues, such as the CDB and CCD consistency checks, are resolved, a
number of HA routines are performed that start, stop, or abort the
operation of a logical host, as necessary. These HA routines are called
methods.
Caution – If the methods are stored in a disk group on the multihost
disks, only the server that currently owns that disk group has access
to the methods. All servers in the high availability cluster must be
able to execute the START, STOP, and ABORT methods.


Data Service Methods

Whenever any change in cluster state occurs as part of the cluster


reconfiguration, Sun Cluster High Availability calls the data services’
method programs on each host in the cluster.

Note – You must have methods for starting and stopping a Sun
Cluster High Availability data service. ABORT methods are optional,
and can be omitted.

START Methods
After a physical host crashes, Sun Cluster HA moves the logical host,
which the physical host had been mastering, to a surviving host. Sun
Cluster HA uses the START methods to restart the data services on the
surviving hosts.


STOP Methods
The haswitch command moves a logical host from one physical host
to a backup through cluster reconfiguration. When the haswitch
command is executed, the STOP methods are used to cleanly shut
down the data service on the original physical host before starting the
data service on the backup. Similarly, the hastop command uses the
STOP methods to cleanly shut down Sun Cluster HA data services.

Note – The STOP method should perform a smooth shutdown but


does not wait for network clients to completely finish their work,
because that could introduce an unbounded delay.

ABORT Methods
The ABORT methods are called when the cluster on a particular node
aborts. All cluster activity on the node is stopped, but the physical
node continues to run. The ABORT methods must immediately halt
their data services.

If a fault probe detects that a high availability server is “sick,” it


causes a failover of data services from that sick HA server to the
“healthy” HA server. Before shutting down, the sick server attempts
to call the ABORT methods for all currently registered data services.

Sun Cluster HA monitors the health of the physical hosts, and can
decide to halt or reboot a physical host, if necessary. The ABORT
methods execute “last wishes” code before halting a HA server.

The ABORT and ABORT_NET methods are similar to the START and
STOP methods. ABORT_NET methods are called while the logical
host’s network addresses are still configured UP. ABORT methods are
called after the logical host’s network addresses are configured in the
DOWN state and are not available.



There is no guarantee that the ABORT or ABORT_NET methods are
called. The HA server might panic, and no methods can be called. Or
the HA server might be so sick that it cannot successfully execute the
ABORT methods.

You should use ABORT methods only to optimize performance. Data


services must function correctly, even if ABORT methods are not
called.

ABORT and ABORT_NET methods might be called while one of the


other four START/STOP methods is executing. The ABORT methods
might find that one of the START/STOP methods was interrupted
during its execution, and did not finish executing.

NET Methods

START_NET Method

For each registered data service whose ON/OFF state is ON, Sun
Cluster HA first calls the data service’s START method program.
When the START method is called, the logical host’s network
addresses are not available because they have not been configured UP
yet. Next, logical network addresses are configured UP and then the
START_NET method is called. When the START_NET methods are
called, the logical host’s network addresses are configured UP and
are available.

STOP_NET Method

For each registered data service, Sun Cluster HA calls the data
service’s STOP_NET method program. When the STOP_NET method
is called, the logical host’s network addresses are still configured UP.
Next, the logical host’s network addresses are configured DOWN, and
then the STOP methods are called.


NET Methods

NET Method Workload

The data service can split up the stopping of work between its
STOP_NET and STOP methods any way it chooses. By the time the
STOP method returns, all necessary work associated with stopping
the data service should be accomplished. In particular, the data
service must be sure to cease using any data on the logical host disk
groups, as ownership of these disk groups must be relinquished in
subsequent reconfiguration steps.

It is up to each individual data service to decide how to split the work


between the START and START_NET method programs. The data
service can make one of them non-operational and do all the work in
the other, or it can do some work in each method. All the work
necessary to start up the data service should have been accomplished
by the time START_NET returns control.

Fault Monitoring Methods


Sun Cluster HA software defines four methods for data services to use
for their own fault monitoring:

● FM_INIT

● FM_START

● FM_STOP

● FM_CHECK

The FM_INIT, FM_START and FM_STOP methods are called during


the appropriate points of the logical host reconfiguration sequence.

A data service can register any or all of these methods when it first
registers itself with the hareg command.


FM_INIT and FM_START

For each registered data service whose ON/OFF state is ON, Sun Cluster
HA calls the data service’s FM_INIT method. FM_INIT initializes fault
monitoring of a data service.

The FM_INIT and FM_START methods can be used by the data
service to start up its own data-service-specific fault
monitoring. This fault monitoring indicates whether the data service is
available and performing useful work for its clients.

FM_INIT and FM_START are called as two successive steps of the Sun
Cluster HA reconfiguration. The FM_INIT step completes on all hosts
in the cluster before any host executes the FM_START step. A data
service fault monitor can leverage this sequencing if it needs to
perform some initialization on all of the hosts before actually starting
the fault monitoring. For example, the data service might need to
create some dummy objects for the fault monitor to query or update or
both.

FM_STOP

The FM_STOP method stops fault monitoring of a data service.

FM_CHECK

The FM_CHECK method checks the health of a data service. It is not


called during HA cluster reconfiguration. It can be called by a local or
remote data service fault monitor that has detected a possible data
service fault condition.

It can be used to verify the health of either the current data service
master or a potential new data service master. Depending on the
results of the FM_CHECK methods, a potential logical host failover
can be stopped or can continue on.


Giveaway and Takeaway

A problem with a logical host can be detected by either
the local fault monitor or the remote fault monitor for the logical host.

If the fault is detected by the local fault monitor, it initiates a giveaway


process with the hactl command. This can result in the designated
backup system taking over the logical host.

If a fault is detected by the remote fault monitor, it initiates a takeaway


process with the hactl command. This also could result in the backup
system taking over the logical host.

Before a physical host can become the new master of a logical host, it
must be certified as fully operational by the FM_CHECK method.


Giveaway Scenario
This scenario assumes that the local data service’s fault monitor,
running on the physical host, phys-hostA, has detected a problem and
concluded that phys-hostA is not healthy.

1. The local data service fault monitor (running on phys-hostA)


requests that the logical host be given up using the following
command:
phys-hostA# hactl -g -s A -l mars

2. The potential new master for the logical host is phys-hostB.

3. The FM_CHECK methods for both data services are called on


phys-hostB.

▼ If all FM_CHECK methods exit zero, indicating they are


healthy, the reconfiguration (transferring logical host to
physical host, phys-hostB) continues.

▼ If any FM_CHECK method exits non-zero, indicating it is not


healthy on phys-hostB, the logical host is not transferred to
the physical host, phys-hostB.


Takeaway Scenario
This scenario assumes the data service’s remote fault monitor,
running on phys-hostB, concludes that phys-hostA is unhealthy,
and requests that the logical host be taken away by using the
following command:
phys-hostB# hactl -t -s A -l mars

The potential new master for logical host mars is phys-hostB.

The FM_CHECK method for the data service fault monitor is called on
phys-hostB but not on phys-hostA. The exit status of the FM_CHECK
methods is used in the same way as in the giveaway scenario described
above. If all FM_CHECK methods exit zero, the reconfiguration
continues. If any FM_CHECK method exits non-zero, the logical host
is not transferred.

Method Considerations

It is important to remember the following rules when designing any


HA method:

● Do not blindly start or stop a data service.

● Verify whether the data service has already been stopped or


started.

Sun Cluster HA might call START and START_NET methods


multiple times, with the same logical hosts being mastered, without
an intervening STOP or STOP_NET method. Sun Cluster HA might
call STOP and STOP_NET methods multiple times without an
intervening START or START_NET method. The same applies to the
fault methods.

For example, START methods should verify that their work has
already been accomplished (that is, the data service has been started)
before starting any processes. Similarly, STOP methods should verify
that the data service has already been stopped before issuing
commands to shut down the data service.

STOP and START Method Parameters

If the data service is in the ON state, the STOP or START method is


called with:

● A comma-separated list of logical hosts for which this physical


host is the master

● A comma-separated list of logical hosts for which this node is the


next backup node

● The amount of time the method can take in seconds

Note – If the data service is in the OFF state, the STOP method is called
with an empty string, a comma-separated list of all logical hosts, and
a timeout. The START methods are not called if the data service is in
the OFF state.
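The following Bourne shell fragment is a minimal, hypothetical START method sketch that shows how these arguments might be consumed. The application daemon name and path are invented for illustration and are not part of Sun Cluster:

#!/bin/sh
# $1 - comma-separated logical hosts this physical host now masters
# $2 - comma-separated logical hosts this node backs up next
# $3 - time, in seconds, this method is allowed to run
MASTERED=$1
TIMEOUT=$3

for LHOST in `echo $MASTERED | tr ',' ' '`; do
        # Methods can be called repeatedly; start only if not already running.
        if ps -ef | grep "myappd $LHOST" | grep -v grep > /dev/null; then
                continue
        fi
        /opt/myapp/bin/myappd $LHOST &
done
exit 0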


START and STOP Method Examples

Example 1
Assume there is a symmetrically configured two-node high availability
cluster. The two HA servers are named phys-venus and phys-mars,
and the logical hosts are named mars and venus. Also, assume that
phys-mars is mastering mars and phys-venus is mastering venus.

When the high availability cluster goes through the reconfiguration


process, the STOP_NET/ STOP methods are called first, and then the
START/START_NET methods are called.

The identical STOP/START methods will be called on both phys-mars


and phys-venus. However, different arguments are passed to these
methods on each server. The examples show how these argument
lists differ. The STOP and START methods depend on their
programmed logic to “do the right thing” based on the current cluster
configuration and the current state of the data service.


Example 2
This example assumes a failover occurred and phys-venus now
masters both venus and mars. Different arguments are passed to the
STOP and START methods on the HA servers.

Note – The timeout was specified when the data service was
registered. Sun Cluster runs each method in its own child process
group, and if the timeout is exceeded, it sends a SIGTERM signal to
the child process group.


Data Service Dependencies

The implementors of a data service must know the other data


services upon which their service depends.

Data service dependencies are specified when a data service is


initially registered with Sun Cluster HA by using the hareg
command.

The registered dependencies induce a partial order in which the data


service methods are called by Sun Cluster HA.

Applications that access a database have dependencies on that


database. Before starting the application, the database must be
running. Additionally, the application should be shut down before
the database is shut down.

● Data service A depends on data service B if, in order for A to


provide its service, B must be providing its service.

● You must supply data service dependencies to the hareg


command when you register a data service.


If data service A depends on data service B, then the START method


for data service B is called and completes before Sun Cluster HA calls
the START methods for data service A. The START_NET method for
data service B similarly is completed before Sun Cluster HA calls the
START_NET methods for data service A.

For stopping the data service, the dependencies are considered in


reverse order. The STOP_NET method of data service A is called and
completed before the STOP_NET method of data service B. Similarly,
the STOP method of data service A completes before the STOP method
of data service B is called.

The dependencies for the ABORT and ABORT_NET methods are the
same as for the STOP and STOP_NET methods.

In the absence of dependencies, Sun Cluster HA can call methods for


different data services in parallel.


The haget Command

You use the haget command to extract configuration and state


information about a SC HA configuration. Usually, the haget
command is called by data service methods.

The haget command allows you to program in a shell script, rather


than in C using the library functions in ha_get_calls.

You can call the haget command from any scripting language that
can execute commands and redirect output.

The haget command is designed to make multiple calls to get all the
needed information. Each call is specific enough that parsing the
output of the call should not be required.


The haget Command Options


The haget command outputs the extracted information on stdout.
The syntax for the command is:
haget [-S] [-a API_version] -f fieldname [-h hostname]
[-s dataservice]

The -f option takes a single argument that is one of a set of predefined


field names. Specify the -f option only once on the command line.

Some field names require a physical or logical host, which is specified
by the -h option, or the name of a particular data service, which is
specified by the -s option.

The following -f field names do not require a -h or -s switch:

all_logical_hosts Returns all of the logical hosts in this


SC HA configuration.

all_physical_hosts Returns the names of all physical


hosts in this SC HA configuration.

mastered Returns the names of all logical


hosts that this physical host
currently masters. Also a
not_mastered option.

syslog_facility Outputs the name of the syslog


facility that SC HA uses.

The following -f field name requires a -s switch:

service_is_on Outputs a line containing 1 if the


data service is on, and outputs a
line containing 0 if the data service
is off



The following -f field names require a -h switch:

names_on_subnets Returns the hostnames associated


with the named host’s subnetworks.

private_links Returns the private links associated


with the named host.

physical_hosts Returns the names of all physical


hosts that can serve the named
logical host.

pathprefix Returns the absolute path name of


where the named logical host’s
administrative file system directory
will get mounted.

vfstab_file Outputs the full path name of the


vfstab file for the named logical
host.

is_maint Queries whether the named logical


host is in maintenance mode.

master Returns the current master of the


named logical host

The following are examples of the haget command using the options
described previously:
# haget -f all_logical_hosts
venus
mars
# haget -f physical_hosts -h mars
phys-mars
phys-venus
# haget -f master -h mars
phys-venus
# haget -f service_is_on -s hasvc
1


The hactl Command

The hactl command provides control operations for use by Sun


Cluster HA fault-monitoring programs. The control operations include
requesting the movement of a logical host from one physical host to
another (possibly forcibly), requesting the movement of all logical
hosts that a physical host currently masters to other physical host(s),
and requesting an Sun Cluster HA cluster reconfiguration.

The hactl command applies several sanity checks before actually


carrying out the request. If any of these sanity checks fail, then the
hactl command does not carry out the request and it exits with no
side effects.


The hactl Command Options


The following examples demonstrate the most common usage of the
hactl command by fault-monitoring programs.

● hactl -s NFS -r

Reconfigures the logical hosts for the NFS environment (as called
by the NFS Fault Methods).

● hactl -k abcd -s nshttp -t -l lhost1

Takes over logical host lhost1 because one of its data service fault
monitors has detected that its current master is “sick.”

● hactl -s nsmail -g -p phy-host1

Gives away all logical hosts.

● hactl -s NFS -n -t -p phy-host1

Does sanity checks, but giveaway is not done yet.


The halockrun Command

The halockrun command provides a simple mechanism to serialize


command execution from a shell script. The halockrun command
runs a command while holding a file system lock on a specified file
using fcntl(2). If the file is already locked, the halockrun command
is delayed by the file system locking mechanism until the file becomes
free. Essentially, halockrun implements a mutex mechanism, using
the lock file as the mutex.

The syntax for halockrun is:


halockrun [-vsn] [-e exitcode] lockfile prog [args]

Where

lockfile Is the file serving as the serialization point

prog Is the program to be run

args Are the arguments passed to the program

See the halockrun man page for further information.
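For example, a fault monitor script might serialize updates to a shared state file by wrapping the update program with halockrun. The lock file and program paths are hypothetical:

# halockrun /var/opt/myapp/state.lock /opt/myapp/bin/update_state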


The hatimerun Command


The hatimerun command provides a simple mechanism to limit the
execution time of a command from a shell script. The hatimerun
command runs a command under a timer of duration timeoutsecs.
If the command does not complete before the timeout occurs, the
command is terminated with a SIGKILL signal (the default), or with
the signal specified in the -k argument. The command is run in its
own process group.

The syntax for the hatimerun command is:


hatimerun [-va] [-k signalname] [-e exitcode] -t
timeoutsecs prog args

Where

timeoutsecs Indicates how long the program has to finish

prog Identifies the program to be run

args Identifies the arguments to be run

The -a operand starts the program and allows it to finish


asynchronously. The hatimerun command itself finishes immediately.

See the hatimerun man page for further information.
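For example, a fault probe might be limited to 30 seconds so that a hung probe cannot stall the fault monitor. The probe program and its argument are hypothetical:

# hatimerun -t 30 /opt/myapp/bin/probe_service lhost1

If probe_service has not finished after 30 seconds, it is terminated with SIGKILL (or with the signal named by the -k option).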

The pmfadm Command

The pmfadm command allows you to start a process, and have the
process restarted if it fails. The process can be restarted a certain
number of times, or continually. Each process monitored by the
pmfadm command has a name tag, which is an identifier that describes
a process to be monitored. You can use the name tag to stop the
process if necessary.

The pmfadm command is used as follows:

● To start monitoring a process, use:


pmfadm -c nametag [-n retries] command [args]

● To stop a monitored process, use:


pmfadm -s nametag
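For example, a START method might launch an application daemon under pmfadm so that it is restarted automatically (up to five times here) if it dies. The name tag and daemon path are hypothetical:

# pmfadm -c myappd.mon -n 5 /opt/myapp/bin/myappd lhost1

A STOP method could later stop the monitored process with:

# pmfadm -s myappd.mon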

What Is Different From HA 1.3?

The haget command


You must be aware of the following haget command changes if you
are migrating to the Sun Cluster environment:

● The vfstab option prints a comma-separated list of vfstab file


contents of the logical host configuration.

● Sun Cluster HA does not maintain a configuration file. The exit


code of 3 in Solstice HA is thus not relevant.

The hactl command


You must be aware of the following hactl command changes if you
are migrating to the Sun Cluster environment:

● The hactl -p option does not abort the node after migrating all
the logical hosts.

● It uses PNM for network monitoring.

The hads C Library Routines

When you write methods as shell scripts, you can access the hads C
library functionality through the command-line utilities, such as haget
and hactl. If you write methods in C, you must include and link the
libraries directly:

● Synopsis
# cc [flag ...] -I /opt/SUNWcluster/include file \
-L /opt/SUNWcluster/lib -lhads -lintl -ldl \
[library ...]

#include <hads.h>
● Data structures

▼ ha_network_host_t

▼ ha_physical_host_t

▼ ha_logical_host_t

▼ ha_config_t

▼ ha_lhost_dyn_t

● Functions

▼ ha_open, ha_close

▼ ha_get_config, ha_getcurstate, ha_getmastered,


ha_getnotmastered, ha_getonoff, ha_getlogfacility

For more information, see the Sun Cluster 2.2 API Programmer’s
Reference Guide.

Exercise: Using the Sun Cluster Data Service API

Exercise objective – In this exercise you will:

● Use the haget command to gather cluster configuration and status


information

Preparation
There is no preparation for this exercise.

Tasks
The following task is explained in this section:

● Using the haget command


Using the haget Command


The haget command extracts information about the current state of a
Sun Cluster HA configuration. It would typically be used within a
START or STOP method that was written as a script for the C shell or
the Bourne shell.

Use the haget command to display the following information. Record


the results below.

1. The names of all logical hosts:


# haget -f all_logical_hosts

__________________________________________________

2. The names of all physical hosts:


# haget -f all_physical_hosts

__________________________________________________

3. The names of all logical hosts that this physical host currently
masters:
# haget -f mastered

__________________________________________________

4. The name of the physical host which is currently master of the


logical host clustername-nfs:
# haget -f master -h clustername-nfs

__________________________________________________



5. Whether or not the logical host clustername-nfs is in maintenance
mode:
# haget -f is_maint -h clustername-nfs

__________________________________________________

6. Whether or not the NFS data service is currently on:


# haget -f service_is_on -s nfs

__________________________________________________

Note – See the man page for the haget command for more
information on the various haget options.

Check Your Progress

Before continuing on to the next module, check that you are able to
accomplish or answer the following:

❑ Describe the available data service methods

❑ Describe when each method is called

❑ Describe how to retrieve cluster status information

❑ Describe how to retrieve cluster configuration information

❑ Describe how the fault methods work and how to request failovers

Think Beyond

Are there other methods that might be needed for some data services?
What would they be?

Are there ways to make a non-HA-compliant data service work with
HA?

How would you debug HA API problems while developing your data
service?

Highly Available DBMS 15

Objectives

Upon completion of this module, you will be able to:

● List the configuration issues for a highly available DBMS instance

● Describe the general installation and configuration process for an HA-DBMS data service

This module describes the operation and configuration of a DBMS in
the Sun Cluster High Availability environment.

Relevance

Discussion – The following questions are relevant to the material
presented in this module:

1. How is an HA-DBMS instance different from other High
Availability data services?

2. What unique things need to be done for an HA-DBMS instance?

Additional Resources

Additional resources – The following references can provide
additional details on the topics discussed in this module:

● Sun Cluster 2.2 System Administration Guide, part number 805-4238

● Sun Cluster 2.2 Software Installation Guide, part number 805-4239


Sun Cluster HA-DBMS Overview

The Sun Cluster HA-DBMS services for Oracle, Sybase, and Informix
databases are a simple set of modules that act as an interface between
the High Availability framework and off-the-shelf database software.
User applications continue to use the database services as before.

The Sun Cluster HA-DBMS data service components are designed to:

● Leverage existing database crash recovery algorithms

● Have minimal client-side impact

● Be easily installed on servers

No changes to the database engines are required. Applications run
against the DBMS normally. No change to database administration on
the clients is required.


Database Binary Placement


You can install the database binaries either on the local disks of the
physical hosts or on the multihost disks. Both locations have
advantages and disadvantages. Consider the following points when
selecting an install location.

Placing database binaries on the multihost disk eases administration,
since there is only one copy to administer, and it ensures that the
binaries remain available to the server during a cluster
reconfiguration. However, it sacrifices redundancy, and therefore
availability, in the case of some failures.

Alternatively, placing database binaries on the local disks of the
physical hosts increases redundancy, and therefore availability, in the
case of failure or accidental removal of one copy.

Supported Database Versions


The Sun Cluster HA-DBMS data service currently supports:

● Oracle Versions 7.3.3, 7.3.4, 8.0.4, and 8.0.5

● Sybase Version 11.5

● Informix Versions 7.23 and 7.30

Note – The supported versions can change without notice. You should
always have your Sun field representative check the current product
release notes for database revision support.


HA-DBMS Components
You can divide the Sun Cluster HA-DBMS support software into three
primary components:

● Database start methods

● Database stop methods

● Database-oriented fault monitoring

Each of these components is described in more detail on the
subsequent pages.

Multiple Data Services


If you are running multiple data services in your Sun Cluster High
Availability configuration (for example, HA-NFS and HA-DBMS for
Informix), you can set them up in any order. You will need separate
licenses for the HA-DBMS data service.
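For example, the following sketch registers and starts both HA-NFS and HA-DBMS for Oracle; the logical host names lhost-nfs and lhost-ora are hypothetical, and the two services could just as well be set up in the reverse order.

# hareg -s -r nfs -h lhost-nfs
# hareg -y nfs
# hareg -s -r oracle -h lhost-ora
# hareg -y oracle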


Typical HA-DBMS Configuration

The HA-DBMS runs entirely on one node. As is the case with any HA
data service, its data is accessible from at least one other node. Should
the primary node fail, the entire DBMS instance is restarted on the
second node, conceptually as if it were restarting on the first node.

A physical cluster node can support multiple instances (as separate
logical hosts) of a DBMS server. The instances can be configured to
fail over to different backup physical hosts if desired.

The administrator can manually switch an entire DBMS logical host
between physical hosts at any time, using the scadmin switch or
haswitch command.
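For example, either of the following commands could be used to move a hypothetical logical host lhost-ora to the physical host phys-node2 in a cluster named sc-cluster; all three names are invented here, and the exact argument order should be confirmed in the haswitch(1M) and scadmin(1M) man pages.

# haswitch phys-node2 lhost-ora
# scadmin switch sc-cluster phys-node2 lhost-ora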

Should a cluster node fail, the logical hosts running on that node are
automatically switched to the available configured backup node(s).


Configuring and Starting HA-DBMS

The general procedure for preparing an HA-DBMS instance is:

1. Configure the logical host.


# scconf clustername -L ...
# scconf clustername -F ...

2. Register the HA-DBMS service and associate it with the logical
hosts that will run instances of it.
# hareg -s -r oracle -h lhost1[, lhost2, ...]

3. Start the HA-DBMS service.


# hareg -y oracle

4. Register the HA-DBMS instances.


# haoracle insert ...

DBMS fault monitoring is started automatically when the HA-DBMS
data service is started.
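As a quick check after completing these steps (a sketch only; output formats vary by release), the following commands should show the oracle service registered and on, and the cluster and its logical hosts healthy:

# hareg
# haget -f service_is_on -s oracle
# hastat

Running hareg with no arguments lists the registered data services and whether each is on or off; haget reports whether the oracle service is on; hastat summarizes overall cluster status.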


Stopping and Unconfiguring HA-DBMS

To stop and unconfigure a DBMS data service:

1. Stop the HA-DBMS service. All instances of the HA-DBMS in the
cluster stop, and the DBMS fault monitors are stopped automatically.
Alternatively, put the logical host in maintenance mode.
# hareg -n sybase

2. Disconnect the service from the logical host.


# scconf clustname -s -r syblhost1

3. Unregister the HA-DBMS service.


# hareg -u sybase

4. Unconfigure the logical host if appropriate.


# scconf clustname -L syblhost1 -r

5. Remove the logical host vfstab.lhname and the administrative file
system if appropriate.
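As a final check (a sketch only; output formats vary by release), hareg run with no arguments should no longer list the sybase service, and hastat should no longer show the removed logical host:

# hareg
# hastat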


Removing a Logical Host


To remove a logical host and leave the HA-DBMS data service active
for other logical hosts in the cluster, do not unregister the data
service (hareg -u) as in the previous procedure.

To remove a logical host:

1. Stop the logical host’s DBMS instance(s), or put the logical host
into maintenance mode.

2. Stop the data service.


# hareg -n sybase

3. Disassociate the data service from the logical host.


# scconf clustname -s -r syblhost1

4. Unconfigure the logical host if appropriate.


# scconf clustname -L lhost1 -r

5. Remove the logical host vfstab.lhname file if appropriate.
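Note that step 2 turns the data service off for the entire cluster. As a hedged follow-up, not shown in the steps above, you would normally turn the service back on once the association has been removed, so that it continues to run for the remaining logical hosts:

# hareg -y sybase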

Note – A logical host can also be removed using the scinstall
program. When scinstall is run on an already configured cluster, it
offers an option to modify the cluster or data service configuration.

Removing a DBMS From a Logical Host


If you are removing only the DBMS instance from a logical host, you
need to stop only the instance and disassociate the data service from
the logical host.
# scconf clustname -s -r syblhost1

Remove any logical host vfstab.lhname if