Copyright 2011 Dell, Inc. All rights reserved. EqualLogic is a registered trademark of Dell, Inc. Dell is a trademark of Dell, Inc. All trademarks and registered trademarks mentioned herein are the property of their respective owners. Information in this document is subject to change without notice. Reproduction in any manner whatsoever without the written permission of Dell is strictly forbidden. October 2011 Part Number: 110-6053-EN-R5
Table of Contents
1 Overview of SAN HeadQuarters
  SAN HeadQuarters Features
  SAN HeadQuarters Environment
  SAN HeadQuarters Operation
2 Installing SAN HeadQuarters
  Prerequisites
  PS Series Group Requirements
  Requirements for Running the SAN HeadQuarters Server
  Requirements for Running the SAN HeadQuarters Client
  Log File Directory Requirements
  Required Installation Information
  Installing SAN HeadQuarters
  Installation Procedure
  Post-Installation Tasks
  Restarting the SAN HeadQuarters Server
  Uninstalling the SAN HeadQuarters Software
  Retaining Data and Log Files
  Uninstallation Procedure
  Upgrading SAN HeadQuarters
  Upgrading SAN HeadQuarters Software
  Obtaining SAN HeadQuarters Update Notifications
3 Getting Started with SAN HeadQuarters
  Starting the SAN HeadQuarters GUI
  Starting SAN HeadQuarters from Windows
  Starting the SAN HeadQuarters GUI Using a Command Line
  Navigating the GUI
  All Groups Summary Information
  Using 95th Percentile Reporting
  Displaying Data from Different Times
  Missing Data Points in Graphs
  Adding a Group for Monitoring
  Specifying Server Monitoring
  Adding a Server
  Changing the Default Server
  Removing a Server
  Upgrading the Client
  Monitoring Multiple Servers
  Verifying and Modifying Access to Log Files
4 Configuring the Group Monitoring Environment
  SAN HeadQuarters Settings
  Controlling GUI Appearance
  Displaying Installation Settings
  Changing the Log File Directory
  Controlling Client Startup Settings to Detect Firewall Rules
  Controlling Tooltips
  Controlling Chart Display
  Controlling Temperature Display Settings
  SAN HeadQuarters Service Configuration Settings
  Degraded Connection Status
  Managing Group Monitoring
  Adding a Group to the Monitoring List
  Removing a Group from the Monitor List
  Sorting the Monitoring List
  Hiding Groups in the GUI
  Increasing the Log File Size
  Modifying the SNMP Community Name for a Group
  Configuring E-Mail Notification for Group Alerts
  Stopping Group Monitoring
  Launching the Group Manager with Single Sign-On
  Single Sign-On Requirements
  Enabling Single Sign-On / Modifying Login Credentials
  Disabling Single Sign-On for a Group
  Deleting Login Credentials for a Group
  Adding and Managing Favorite Views
  Adding a View to the Favorites List
  Setting a View As the Home Page
  Removing a Favorite View
  Displaying Live Data
  Prerequisites for Establishing a Live View Session
  Running a Live View Session
  Analyzing RAID Policies
  Syslog Event Logging
  Configuring Syslog Event Logging for a Group
  Disabling Syslog Event Logging for a Group
  Changing the Syslog Configuration to Use Specific Interfaces
  Disabling the SAN HeadQuarters Syslog Server
  Handling Group Network Address Changes
  Diagnosing and Solving Monitoring Problems
  Additional Group Monitoring Concepts
  How Data is Compressed in Log Files
  How Group Performance Affects SNMP Polling
  Dependency on Software and Firmware Versions
5 Understanding SAN HeadQuarters GUI Data
  Information Provided by the GUI
  Performance and Capacity Terms
  Capacity and Replication Terms
  I/O Terms
  Network Terms
  Polling Status
  Reported Alerts
  Alert Priorities
  Displaying Alerts
  Exporting Alerts
  Copying Alerts to the Clipboard
  SAN HeadQuarters Performance Alerts
  Syslog Events
  Event Priorities
  Displaying Events
  Searching Events
  Exporting Events
  Copying Events to the Clipboard
  Audit Messages
  Displaying Audit Logs
  Searching Audits
  Exporting Audit Logs
  Copying Audit Logs to the Clipboard
6 Analyzing Group Data and Solving Problems
  Prerequisites for Analyzing Data
  Potential Sources of Performance Problems
  Understanding Application Storage Utilization
  Best Practices for Analyzing Data Over Time
  Identifying Hardware Problems
  Using the Experimental Analysis Windows
  Examples of Interpreting Performance Data
  Example 1: Adequately Performing Group with Excess Capability
  Example 2: Mainly Idle Group
  Example 3: Adequately Performing Group that Might Be Near Full Capability
  Example 4: Busy Group Likely Near Full Capability
  Example 5: Group With High Latencies Likely Near Full Capability
  Example 6: Group with Many Small Writes but Some Large Reads
  Best Practices for Solving Performance Problems
  Network Infrastructure Recommendations
  Server and Application Configuration Recommendations
  Group Configuration Recommendations
7 Preserving Group Data
  Group Data Reports
  Report Types
  Information Required for a Report
  Using the GUI to Create a Report
  Using a Command to Create a Report
  Group Diagnostics Report Data
  Group Data Archives
  Using the GUI to Create an Archive File
  Using a Command to Create an Archive File
  Opening an Archive File
  Group Data Exports
  Using the GUI to Export Group Data
  Using a Command to Export Group Data
A Technical Support and Customer Service
  Contacting Dell
1 Overview of SAN HeadQuarters
SAN HeadQuarters enables you to monitor multiple PS Series groups from a single graphical user interface (GUI). It gathers and formats performance data and other vital group information. Analyzing the data can help you improve performance and more effectively allocate group resources.
This manual is designed for administrators responsible for installing SAN HeadQuarters or using it to monitor PS Series groups. Administrators are not required to have extensive network or storage system experience. However, it is useful to understand:
- Basic networking concepts
- Current network environment
- Disk storage space requirements
- RAID configurations
- Disk storage management
Note: Detailed information about network configuration is beyond the scope of this manual.
- For a selected group, pool, or member, apply different RAID policies and analyze the performance benefits.
- See the number of Ethernet ports with active and inactive data center bridging (DCB), and the number of ports incompatible with DCB.
- View events, audits, and group alerts.
- Preserve group performance data for later analysis by creating archives.
- Create customized reports of group performance data.
- Export group performance data to a spreadsheet.
- Specify favorite views.
SAN HeadQuarters does not disrupt access to group storage or degrade performance on the hosts or groups.
HeadQuarters Server installation monitor a group. Do not have multiple computers monitoring the same group.
Log files: Contain the data that the SAN HeadQuarters Server collects from a group. The SAN HeadQuarters Server maintains one set of log files for each monitored group. Each set of log files can contain up to one year's worth of data. After a year, the SAN HeadQuarters Server overwrites the oldest data. You can put the log files on a network-accessible resource to share the data with computers running the SAN HeadQuarters Server.
SAN HeadQuarters Client: Provides a graphical user interface (GUI) for managing the SAN HeadQuarters environment and viewing data collected by one or several SAN HeadQuarters Servers. A SAN HeadQuarters Client accesses the log files maintained by the SAN HeadQuarters Server and formats the group data into tables and graphs. You can run the SAN HeadQuarters Client on multiple computers.
After you add a group, the SAN HeadQuarters Server issues regular SNMP requests to the group and collects configuration, status, and performance data. By default, the polling period (the time between consecutive polling operations) is two minutes.
Note: SNMP polling has no impact on group performance because serving SNMP requests is not a high priority for a group.
The SAN HeadQuarters Server stores the data in the group log files. Computers running the SAN HeadQuarters Client access the log files and display the data in the SAN HeadQuarters GUI. Figure 1 shows the general layout of a SAN HeadQuarters single-server environment.
Figure 1: Single-Server Environment
In Figure 1, a remote SAN HeadQuarters Client accesses the local SAN HeadQuarters Client/Server via the network. The SAN HeadQuarters Server issues a series of SNMP requests (polls) to each group for configuration, status, and performance information. When the first set of polls returns from a group, the server stores this baseline information in the group data and log files for future reference. It issues subsequent polls at regular intervals (by default, two minutes), averaging the data from these consecutive polling operations. (For more information on polling, see Polling Status on page 72.)
The SAN HeadQuarters Server also includes a syslog server to which a PS Series group can log events. Figure 2 shows an example of a SAN HeadQuarters multi-server environment.
Figure 2: Multi-Server Environment
In Figure 2, a remote SAN HeadQuarters Client accesses two SAN HeadQuarters Servers at different sites. Both SAN HeadQuarters Servers are monitoring groups on separate networks. The Client can access the data and log files that are being monitored by those servers, displaying the information in the Client GUI. The SAN HeadQuarters Server can monitor multiple groups. The server issues a series of polls to each group it monitors for configuration, status, and performance information.
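As a rough illustration of the polling model described above, the following Python sketch converts cumulative counter readings from consecutive polls into per-interval averages. The first poll acts only as a baseline. The counter name, the sample values, and the averaging formula are assumptions made for illustration; this manual does not describe how SAN HeadQuarters calculates its averages internally.

```python
from dataclasses import dataclass

# Default polling period stated in this manual (two minutes).
POLL_PERIOD_SECONDS = 120


@dataclass
class Poll:
    timestamp: float  # seconds since an arbitrary reference time
    bytes_read: int   # hypothetical cumulative counter returned by a poll


def average_rates(polls):
    """Return the average bytes/second between each pair of consecutive polls."""
    rates = []
    for prev, curr in zip(polls, polls[1:]):
        elapsed = curr.timestamp - prev.timestamp
        if elapsed <= 0:
            continue  # ignore duplicate or out-of-order samples
        rates.append((curr.bytes_read - prev.bytes_read) / elapsed)
    return rates


# Three polls taken one polling period apart; the first is the baseline.
polls = [
    Poll(0, 0),
    Poll(POLL_PERIOD_SECONDS, 1_200_000),
    Poll(2 * POLL_PERIOD_SECONDS, 1_800_000),
]
print(average_rates(polls))  # [10000.0, 5000.0]
```

A single poll produces no rate, which mirrors the idea that the first set of polls only establishes a baseline.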
Prerequisites
Before you begin the SAN HeadQuarters installation, perform these tasks for each set of groups that you want to monitor:
1. Identify the groups that you want to monitor. Make sure the groups meet the requirements described in PS Series Group Requirements on page 5.
2. Identify the computer that will run the SAN HeadQuarters Server and monitor the previously-identified groups. The SAN HeadQuarters Server stores group data in log files that must be accessible to the computers running the SAN HeadQuarters Client. See Requirements for Running the SAN HeadQuarters Server on page 6 and Log File Directory Requirements on page 7.
Note: Do not have multiple computers running the SAN HeadQuarters Server monitor the same group.
3. Identify the computers that will run the SAN HeadQuarters Client. The SAN HeadQuarters Client enables you to run the SAN HeadQuarters GUI, which obtains group data from the log files maintained by the SAN HeadQuarters Server and formats the data in graphs and tables.
Note: If you install the SAN HeadQuarters Server, the SAN HeadQuarters Client is also installed on the computer.
See Requirements for Running the SAN HeadQuarters Client on page 6.
4. Collect the information you need for the installation. See Required Installation Information on page 8.
PS Series Group Requirements
A PS Series group must meet the following requirements to be monitored by SAN HeadQuarters:
- Run PS Series Firmware Version 3.3 or higher. Visit the Customer Support site to download PS Series firmware: http://support.dell.com/equallogic
- Network connectivity must be established between the group and the computer running the SAN HeadQuarters Server.
- SNMP community name must be configured in the group. See the Group Administration Guide for information on configuring SNMP access to a group. Dell recommends that you use a dedicated SNMP community name for SAN HeadQuarters. You can specify up to five SNMP community names in a group.
Profile if it is not detected on the host system.
- At least 50 MB RAM for each monitored group.
- Directory for log files. The SAN HeadQuarters Server stores group data in the log files. The directory must meet the requirements described in Log File Directory Requirements on page 7.
- If you are using IPv4 network addresses exclusively, Microsoft .NET 2.0 or later. If you are using any IPv6 network addresses, Microsoft .NET 3.5 or later, with the latest service patch.
Note: The SAN HeadQuarters installation application installs Microsoft .NET 4.0 Client Profile if it is not detected on the host system.
- 100 MB RAM for each monitored group.
- Local directory for cached data (at least 30 MB of free space for each monitored group).
- Read access to the log file directory used by the computer running the SAN HeadQuarters Server. See Log File Directory Requirements on page 7.
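The per-group figures in the requirements above scale linearly with the number of monitored groups. The following Python sketch estimates the totals; the function name and the returned keys are illustrative choices, and the numbers are the minimums quoted in this manual (50 MB RAM per group for the Server, 100 MB RAM and 30 MB of cache space per group for the Client).

```python
# Per-group minimums quoted in the requirements (all values in MB).
SERVER_RAM_MB_PER_GROUP = 50
CLIENT_RAM_MB_PER_GROUP = 100
CLIENT_CACHE_MB_PER_GROUP = 30


def estimate_resources(group_count: int) -> dict:
    """Estimate minimum RAM and cache space for a number of monitored groups."""
    if group_count < 0:
        raise ValueError("group_count must not be negative")
    return {
        "server_ram_mb": group_count * SERVER_RAM_MB_PER_GROUP,
        "client_ram_mb": group_count * CLIENT_RAM_MB_PER_GROUP,
        "client_cache_mb": group_count * CLIENT_CACHE_MB_PER_GROUP,
    }


# Example: a site that monitors eight groups.
print(estimate_resources(8))
```

These are lower bounds only; the operating system and other applications on the same computer need memory and disk space of their own.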
You specify the log file size during the installation. The default data log file size is 5 MB, the minimum size is 2 MB, and the maximum size is 10 MB. If you use the computer running the SAN HeadQuarters Server as a syslog server to store event messages and audits, the default size of the event and audits log file is 5 MB, the minimum size is 2 MB, and the maximum size is 20 MB. You can later modify the syslog size. Once messages consume all the free space in the event and audits file, new messages overwrite the oldest messages. For more information on event messages, see Syslog Event Logging on page 60. For information on audit logs, see Audit Messages on page 82.
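The size limits above can be captured in a small lookup, which is handy when preparing installation values ahead of time. This Python sketch is illustrative only: the dictionary keys and the function name are assumptions, and the installer performs its own validation.

```python
# (minimum, default, maximum) sizes in MB, as stated in this manual.
LOG_LIMITS_MB = {
    "data": (2, 5, 10),    # group data log file
    "events": (2, 5, 20),  # event and audits log file
}


def resolve_log_size(kind: str, requested_mb=None) -> int:
    """Return the size to use, falling back to the default when none is given."""
    minimum, default, maximum = LOG_LIMITS_MB[kind]
    if requested_mb is None:
        return default
    if not minimum <= requested_mb <= maximum:
        raise ValueError(
            f"{kind} log size must be between {minimum} and {maximum} MB"
        )
    return requested_mb


print(resolve_log_size("data"))        # 5
print(resolve_log_size("events", 20))  # 20
```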
Note: Using a log file size that is larger than the default size (5 MB) enables you to store more precise data; however, it might have a slightly negative impact on response time. If you use a log file size that is smaller than the default size, data will be less precise, but response time might improve. See How Data is Compressed in Log Files on page 64.
The location of the directory can be:
- Local device on the server. If desired, you can set up this directory as a network share. Requirement: You must specify a directory (for example, C:\SANHQ\Log). The directory cannot be a root drive, such as C:\. By default, Windows hides the Program Data folder.
- Network device. For example, you can use a directory on a PS Series group volume or a network file share. Dell recommends that you specify the UNC name for a network file share (for example, \\monservice\log). If you specify a mapped network drive, SAN HeadQuarters will convert it to a UNC name. Because of the potential for network disruption, Dell recommends that you do not locate the log file directory on the same group that you are monitoring.
By default, the SAN HeadQuarters Server (EQLXPerf) runs as a local user with the name LocalService. If you are using a log file directory that is on a network file share, you must configure EQLXPerf to run as a domain user that has full access to the network file share. During the SAN HeadQuarters installation, you will be prompted for the domain user name and password. For example, to assign a domain user name to EQLXPerf:
a. In the Windows Control Panel, click Administrative Tools and then Services.
b. Right-click EqlxPerf in the list of services and select Properties.
c. Click the Log On tab.
d. Select This account, enter the domain user name and password for EQLXPerf, and then click Apply.
Optionally, you can assign a domain user name to EQLXPerf from the SAN HeadQuarters menu bar by selecting Settings, then Server Settings. After you modify the EQLXPerf login credentials, you must restart the SAN HeadQuarters Server, as described in Restarting the SAN HeadQuarters Server on page 12.
Each computer running the SAN HeadQuarters Client must have read access to the log file directory and any network resources being used. In addition, if you want to allow a SAN HeadQuarters Client computer to add groups to the monitoring list, configure e-mail notification, or change the SNMP community name for a group, the computer must have read-write access to the directory and any network resources. Use the standard Microsoft NTFS security mechanisms for the log file directory. Right-click the directory and select Properties. Then, click the Security tab and specify the access information.
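The directory rules above (a real directory rather than a root drive, with a UNC name preferred for network shares) can be checked before you run the installer. The following Python sketch uses the standard `pathlib` module to classify a candidate path. The function name and the category labels are assumptions for illustration, and the sketch inspects only the text of the path; it does not test whether the location exists or is writable.

```python
from pathlib import PureWindowsPath


def classify_log_directory(path_text: str) -> str:
    """Classify a candidate log directory by the form of its path."""
    path = PureWindowsPath(path_text)
    drive = path.drive
    if drive.startswith("\\\\"):
        # UNC name such as \\monservice\log (recommended for network shares).
        return "unc"
    if len(drive) == 2 and drive[1] == ":":
        if len(path.parts) <= 1:
            # Only the drive itself, which the manual does not allow.
            return "root-drive"
        # Local directory or mapped network drive; the path text alone
        # cannot tell these two cases apart.
        return "drive-letter"
    return "other"


print(classify_log_directory(r"C:\SANHQ\Log"))      # drive-letter
print(classify_log_directory("C:\\"))               # root-drive
print(classify_log_directory(r"\\monservice\log"))  # unc
```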
software. Copy the software to a location accessible to the computer on which you want to install SAN HeadQuarters.
- Log file directory and optional network share name (if installing the SAN HeadQuarters Server). See Log File Directory Requirements on page 7.
Note: If you were previously running SAN HeadQuarters and you chose to keep the log files when you removed the software, you must specify the same log file location when you reinstall SAN HeadQuarters.
- Network file share where the log files are located (if installing only the SAN HeadQuarters Client).
- Local cache directory (if installing only the SAN HeadQuarters Client). For each monitored group, you need 30 MB of free space.
This version of SAN HeadQuarters requires the Microsoft .NET 4.0 Client Profile or later. As a convenience, it is included in the installation application. Dell recommends that you use .NET 4.0 to update all SAN HeadQuarters Server and Client installations. Otherwise, the .NET 4.0 Client Profile will need to be installed prior to connecting a previous client to an upgraded server. The upgrade provided by the server does not contain the required Microsoft .NET 4.0 Client Profile update.
Installation Procedure
To install SAN HeadQuarters:
1. Double-click the SAN HeadQuarters executable file, SANHQSetup32And64.exe. If you had a previous version of SAN HeadQuarters installed, you can choose to update or repair the previous installation (see Upgrading SAN HeadQuarters Software on page 13) or uninstall it (see Uninstalling the SAN HeadQuarters Software on page 12). Otherwise, the welcome screen displays, followed by the End-User License Agreement.
2. When you install for the first time, you must specify the location for the SAN HeadQuarters installation files.
3. Choose either the full installation (SAN HeadQuarters Server and Client) or a Client-only installation. Continue with the steps below for either type of installation.
Full Installation: Server and Client
1. Specify the directory for the log files and optional network share name.
2. SANHQ has advanced features that operate through TCP/IP. Choose whether to enable TCP/IP communication. To enable:
a. Check the box and enter or accept the default Host Name/IP Address. The IP address must be a static IP address.
b. Accept the default TCP/IP port for SAN HeadQuarters, which you can change later.
Note: If the computer running the SAN HeadQuarters Server is behind a firewall, that firewall must not block the following access:
- Ping (ICMP) for IPv4 and, if configured, IPv6.
- TCP/IP communication to the default TCP/IP port for SAN HeadQuarters, specified during the installation.
Dell recommends that you set an individual rule for all SAN HeadQuarters executables for IPv4 and, if configured, IPv6. For detailed information about adding and configuring firewall rules, refer to the TechNet article, Configuring Firewall Rules, at the Microsoft Windows Server TechCenter: http://technet.microsoft.com/en-us/library/dd448559(WS.10).aspx
3. An installation dialog displays as the application is installed. When complete, click Finish. By default, "Launch SAN HeadQuarters" is selected. You will be asked to start the SAN HeadQuarters service (EQLXPerf) that will poll and record requested group monitor data. If you do not start this service, you can continue by using existing data logs but not gather new data. For more information, see Log File Directory Requirements on page 7.
Client-Only Installation
1. Specify the local cache directory where group performance data will be stored. Each monitored group requires 30 MB. Dell recommends using a directory on drive C: if you have sufficient space.
2. Specify the network file share where the log files are located, in the form:
<IPAddress of Server>\Monitor
3. An installation dialog displays as the application is installed. When complete, click Finish. After the installation completes, by default, the SAN HeadQuarters GUI is launched automatically (see Post-Installation Tasks on page 11).
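The firewall note in the full installation procedure requires that TCP/IP communication to the SAN HeadQuarters port is not blocked. From a Client computer, a quick reachability test can confirm this after installation. The following Python sketch uses only the standard `socket` module; the host name and port number in the usage comment are placeholders, because this manual does not state the default port number. A successful connection shows only that something accepts connections on that port, not that it is the SAN HeadQuarters Server.

```python
import socket


def tcp_port_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution errors.
        return False


# Usage (placeholder values; substitute the host name and the port that
# were specified during the Server installation):
#   tcp_port_reachable("sanhq-server.example.com", 12345)
```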
Post-Installation Tasks
After the installation completes, by default, the SAN HeadQuarters GUI starts. To manually start the GUI, see Starting the SAN HeadQuarters GUI on page 15. The following conditions apply:
- If the TCP/IP settings were enabled during the SAN HeadQuarters Server installation, you will be asked to provide user credentials before continuing. The Client-only installation asks for the credentials upon first use.
- If you are installing the SAN HeadQuarters Server for the first time, the Add Group wizard starts, enabling you to add groups to the monitoring list.
- If you were previously running SAN HeadQuarters and you are using the same log files, the SAN HeadQuarters Server will automatically locate the log files and resume monitoring the groups. You might see missing data points for the time period that SAN HeadQuarters was not installed.
See Getting Started with SAN HeadQuarters on page 15 for information about adding groups to the monitoring list and performing other post-installation tasks.
Preserve Data
SAN HeadQuarters provides several methods that you can use to preserve group performance data. For example, you can:
- Create reports from group data. See Group Data Reports on page 109.
- Archive group data. See Group Data Archives on page 116.
- Export group data. See Group Data Exports on page 121.
You can preserve data at the current time or use a command line to perform the task regularly.
Configure Single Sign-On
From the SAN HeadQuarters GUI, you can start the Group Manager GUI for a monitored group. By default, the Group Manager GUI is started in your default Web browser. You must then enter a valid group administration account and password to log in to the group. Single Sign-On functionality enables you to locally store a group administration account and password. If you are running SAN HeadQuarters from the same computer and the same domain user account, you can start the Group Manager GUI as a standalone application and log in to the group without entering login credentials. For more information about launching the Group Manager GUI with Single Sign-On, see Launching the Group Manager with Single Sign-On on page 52.
Set Up Notification Mechanisms
By monitoring group events, you can promptly correct problems. There are several methods for notification:
- Alerts that occur in a group. These appear in the Alerts table in the All Groups window (Starting the SAN HeadQuarters GUI on page 15) and in the Alerts panel at the bottom of each window. See Reported Alerts on page 72.
- Optionally, you can set up e-mail event notification and designate e-mail addresses to receive messages when an alert occurs in the group. See Configuring E-Mail Notification for Group Alerts on page 49.
- Optionally, you can configure a group to log events to the syslog server that is part of the SAN HeadQuarters Server. Events will appear in the Events panel at the bottom of each window and also in the Events window. See Syslog Event Logging on page 60.
- Optionally, you can set up audit logs to track administrator actions. See Displaying Audit Logs on page 82.
software, usually in \documents and settings\all users\Application Data\Equallogic\San HQ\Logs, and the user configuration file in Data\Dell\SANHQ\version#\user.config. You can restore these files when the reinstallation is complete and the settings will be restored.
Uninstallation Procedure
To uninstall SAN HeadQuarters:
1. Open the Windows Control Panel.
2. Click Add or Remove Programs. (On Windows Vista or Windows 7, choose Programs and Features.)
3. Select Dell EqualLogic SAN HeadQuarters and click Remove.
4. Choose whether to keep the log files. If you choose to keep the log files, when you reinstall SAN HeadQuarters on the same computer and specify the same log file location, the SAN HeadQuarters Server automatically locates the log files and starts monitoring the groups again.
version collects will not appear for dates prior to the time of the upgrade. Dell recommends that you back up or archive your data before doing an upgrade to SAN HeadQuarters. If you upgrade to Version 2.2 of SAN HeadQuarters, it will automatically archive all data from previous versions.
HeadQuarters, Host Integration Toolkits, and firmware along with relevant news articles. There are two ways in which you can access Updates Notifications:
- Automated news updates: You can configure a set amount of time before SAN HeadQuarters will check for more recent information when launched. If new updates are available, an icon will be displayed in the Updates Notification button in the lower right-hand corner of the SAN HeadQuarters Client. This option can be disabled from the Recent News window located under the Help menu.
- Manually: At any time you can select Check for Updates from the Help menu to gather the most recent news available.
You can also start SAN HeadQuarters from a command line. See Starting the SAN HeadQuarters GUI Using a Command Line on page 18.
When you start the GUI, the Servers and Groups window appears, showing all groups monitored by the server. The Servers and Groups window displays the status of each connected server and monitored group, shows any active alerts, and enables you to modify GUI settings. When you configure additional servers for monitoring, this view appears for each server. Figure 3 shows the Servers and Groups window with two monitored servers, each with multiple groups.
The numbered items correspond to the following callouts.
Callout 1 Menus and Toolbars
Provide access to tasks, such as adding a group, exporting group data, and launching the Group Manager GUI. For detailed information on menu options, see Navigating the GUI on page 20.
Callout 2 Servers and Groups Tree
Expandable list of servers and groups. Select a server to show all groups monitored by that server. Select a group name to display group data in graphs and tables. The icon next to each item in the tree hierarchy indicates the health condition of a group member:
- A green arrow indicates a server with its associated groups is available.
- A yellow diamond in the icon indicates the default server, which cannot be removed. For multiple servers, you can change the default server by right-clicking an available server and selecting Make This Default Server from the drop-down menu. (The Client always has one default server that is originally configured during installation.)
- A check mark in a green circle indicates there are no health conditions.
- A question mark (?) in a blue circle (and the group name in yellow) indicates that a Caution level condition exists.
- An exclamation mark (!) in a yellow triangle (and the group name in yellow) indicates that a Warning level condition exists.
- An x in a red circle (and the group name or IP address in red) indicates that an event requiring immediate attention exists or that the group cannot be reached by SAN HeadQuarters.
Callout 3 Servers and Groups Table
Lists the groups monitored by the server and shows the following information for each group:
- Group name
- Monitoring status (Connecting, Connected, Disconnected, or Failed to Connect)
- Amount of time elapsed since the group was last polled (Last Communication)
- PS Series firmware on the group members (displays mixed if members are running different versions)
- Group network address (group IP address, DNS name, or management address)
- Location and description (based on the group's Group Manager settings)
Callout 4 Alerts, Events, and Audit Logs Table
Lists any active alerts, events, or audit messages for the selected group. Alerts indicate when an alarm (typically a hardware problem) or a performance condition exists in a group. Events display syslog tracking of system operations. Audits are syslog events about administrator actions. For information, see Reported Alerts on page 72, Syslog Events on page 79, and Audit Messages on page 82.
Callout 5 Servers and Groups Button
Click the Servers and Groups button to display the All Groups window or to exit a Settings window.
Callout 6 Settings Button
Enables you to modify general settings (SAN HeadQuarters GUI appearance, installation settings, Client startup settings, tooltip behavior, chart display settings, and temperature display settings), e-mail notification settings, group settings (Single Sign-On functionality and SNMP community name), hidden group settings, favorites settings, and server settings. The Server Settings option appears only when you do a full installation of the SAN HeadQuarters Server and Client.
Callout 7 Help Button
Enables you to view SAN HeadQuarters documentation and support information. You can also obtain help in the GUI windows by placing the pointer on a graph legend or on the question mark next to a table title.
Callout 8 Update Notification Button
Click this button to access software and firmware notifications and other recent product news from EqualLogic. See Obtaining SAN HeadQuarters Update Notifications on page 13.
Note: Note that this button appears in the GUI only if there are new updates and if the
Table 1 describes the options that can be specified with this command to customize the view.
Note: The default behavior is to start with the normal view, which is the standard latest 8-hour format.
-NavigateToGroup:"<value>"
    Options for <value> include:
    Name of the group
    IP address of the group

-View:"<value>"
    Specifies the view type:
    "Summary"
    "Capacity"
    "Inbound Replica" or "Inbound"
    "Outbound Replica" or "Outbound"
    "Combined"
    "IO"
    "Events"
    "Hardware" or "Firmware"
    "Disks"
    "Experimental Analysis" or "Experimental" "Analysis"
    "iSCSI" or "Connections"
    "Network"
    "Ports"
    If not specified, the view will load in the Summary view.

-TimeRange:"<value>"
    Specifies the time to be applied to the display. Options for <value> include:
    "<Start-End>" or "<Start to End>"
    where Start and End are either:
    dd/mm/yyyy hh:MM:ss AM|PM
    dd/mm/yyyy 24-hour clock
    If not specified, the page will load in the standard latest 8-hour format.

-TimeLatest:"<value>"
    The value is:
    "X" or "X hours" for X hours
    "Y days" for Y days
    "Z months" for Z months
    If not specified, the page will load in the standard latest 8-hour format.

-TimeAll
    Loads all the data available for that group.
The following actions occur if there are errors when you enter this command:
Error                                       Resulting Display
Wrong group name or IP address specified    All groups summary window
Wrong page name specified                   Summary window for the specified group
Wrong time specified                        Time period is selected as default 8 hours
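When the GUI is started from a script, it can help to assemble the options programmatically. The following Python sketch builds option strings in the syntax listed in Table 1 and rejects view names that the table does not list. The function name and parameters are illustrative assumptions. The executable name is not repeated here, so the sketch produces only the option portion of the command line.

```python
# View names exactly as quoted in Table 1.
VIEWS = {
    "Summary", "Capacity", "Inbound Replica", "Inbound",
    "Outbound Replica", "Outbound", "Combined", "IO", "Events",
    "Hardware", "Firmware", "Disks", "Experimental Analysis",
    "Experimental", "Analysis", "iSCSI", "Connections", "Network", "Ports",
}


def build_view_options(group=None, view=None, latest=None, all_data=False):
    """Return a list of option strings in the syntax shown in Table 1."""
    options = []
    if group is not None:
        options.append(f'-NavigateToGroup:"{group}"')
    if view is not None:
        if view not in VIEWS:
            raise ValueError(f"unknown view: {view!r}")
        options.append(f'-View:"{view}"')
    if latest is not None:
        # For example "12", "12 hours", "3 days", or "2 months".
        options.append(f'-TimeLatest:"{latest}"')
    if all_data:
        options.append("-TimeAll")
    return options


opts = build_view_options(group="group1", view="Capacity", latest="3 days")
print(" ".join(opts))
# -NavigateToGroup:"group1" -View:"Capacity" -TimeLatest:"3 days"
```

Checking the view name up front avoids the silent fallback described in the error table above, where a wrong page name opens the Summary window for the group.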
The following numbered items correspond to the callouts in Figure 4.
Callout 1 Menu Bar Items and Tool Bar Buttons
Table 2 shows the menu bar items and where to find additional information.
Menu Item    Options / Information

SANHQ
    Add New Server: Adding a Server on page 33
    Add New Group: Adding a Group for Monitoring on page 32
    Create Archive: Group Data Archives on page 116
    Open Archive: Opening an Archive File on page 120
    Export All Group Data: Group Data Exports on page 121
    Exit

Group
    Launch Group Manager: Launching the Group Manager with Single Sign-On on page 52
    Export Group Data: Group Data Exports on page 121
    Stop Monitoring: Stopping Group Monitoring on page 51
    Start Monitoring: Adding a Group for Monitoring on page 32
    Increase Log File Size: Increasing the Log File Size on page 48
    Remove From Monitoring List: Removing a Group from the Monitor List on page 47
    Hide Group Information: Hiding Groups in the GUI on page 48

Settings
    General Settings: SAN HeadQuarters Settings on page 39
    E-mail Settings: Configuring E-Mail Notification for Group Alerts on page 49
    Group Settings: Configuring the Group Monitoring Environment on page 39
    Hidden Group Settings: Hiding Groups in the GUI on page 48
    Favorite Settings: Adding and Managing Favorite Views on page 54

Reports
    Configuration, Capacity, Thin Provisioned Volumes, Replication, Replication Across Groups, Performance, Host Connections, Hardware and Firmware, Top 10, Top 10 Across Groups, Alerts, and Group Diagnostics. See Report Types on page 109

Favorites
    Add to Favorites: Adding a View to the Favorites List on page 54
    Manage Favorites: Adding and Managing Favorite Views on page 54
    Make this Home Page: Setting a View As the Home Page on page 55
    Default Homepage

Help
    About, Navigation Help, User Guide, Release Notes, and Check for Updates. See Starting the SAN HeadQuarters GUI on page 15
Under the menu bar, the tool bar buttons perform the following actions:
Back and forward: Returns to a previously visited window or moves forward to a previously visited window. You can also click the arrow next to the forward button to display the navigation history.
Add New Server: Adds an additional server to monitor groups on that server. Similar to selecting SANHQ from the menu bar, then Add New Server.
Add Group: Adds a new group to monitor. Similar to selecting SANHQ from the menu bar, then Add New Group.
New Window: Produces a new window displaying the current view. When you exit out of the new window, the original window remains open.
Launch Group Manager: When you select a group from the Servers and Groups tree, the Launch Group Manager button appears, letting you start the Group Manager GUI for the group. Similar to selecting Group from the menu bar, then Launch Group Manager.
Create Archive: Saves group data to a compressed archive (.grpx) file. Similar to selecting SANHQ from the menu bar, then Create Archive.
Print: Prints the current window.
Callout 2 Servers and Groups View
A single SAN HeadQuarters Client can monitor multiple servers. The Servers and Groups tree shows all servers that have been configured and their associated groups. When there are multiple servers, one server is designated the default server, which cannot be removed; additional servers are secondary servers that can be added and removed.
Callout 3 Summaries for Capacity, Replication, Hardware/Firmware, and Volumes
All groups summary information is available at the top of the Servers and Groups tree. This view presents comprehensive information, across all groups, for capacity, replication, hardware/firmware, and volumes, allowing you to make quick comparisons and analyses of your infrastructure. Links on the summary pages take you to specific information for a group. For more information, see All Groups Summary Information on page 24.
Callout 4 Information Categories
Click the group name in the Servers and Groups tree to display a summary of capacity, status, I/O performance, and network performance for the group, pools, members, volumes, and volume collections. You can obtain more detailed information by expanding the group in the Servers and Groups tree and selecting one of the following categories:
Capacity: Displays group, pool, member, volume, and volume collection capacities. Includes the following subcategories:
  Inbound Replicas: Displays information for inbound replicas.
  Outbound Replicas: Displays group, pool, member, volume, and volume collection information for outbound replicas.
Combined Graphs: Displays basic graphs for group capacity, I/O, and network operations for the group, pools, members, volumes, and volume collections.
Events/Audit Logs: Displays group events and audit logs (only if the computer running the SAN HeadQuarters Server is configured as a syslog server for the group). See Syslog Events on page 79 and Audit Messages on page 82.
Hardware/Firmware: Displays information about the hardware and firmware installed in the group members. Includes the following subcategory:
  Disks: Displays information about member disk drives.
I/O: Displays information about the I/O rate, IOPS, latency, and I/O size for the group, pools, members, volumes, and volume collections. Includes the following subcategories:
  Live View Sessions: Displays group member or volume I/O data captured in brief session intervals at quick polling rates that you can save and analyze. See Displaying Live Data on page 56.
  Experimental Analysis: Displays the estimated maximum IOPS and the estimated workload percentage for the group, pools, and members, including the estimated maximum IOPS when a RAID set is degraded.
  RAID Evaluator: Displays the current RAID information for a group, pool, or member and lets you apply a different RAID policy to evaluate performance benefits.
iSCSI Connections: Displays information about iSCSI connections to iSCSI targets in the group, pools, members, volumes, and volume collections. Where the number of iSCSI connections appears in the GUI, this value includes connections to volumes and snapshots, in addition to connections for replication and Microsoft Service operations.
Network: Displays information about iSCSI connections, network throughput, network load, retransmitted TCP packets, and network traffic for the group, pools, and members. Includes the following subcategory:
  Ports
Callout 5 Favorites
Lists views saved as favorites. See Adding and Managing Favorite Views on page 54.
Callout 6 Window Title
You can identify the category and object for which the GUI is displaying data by examining the window title. In most cases, the title will include the group name, category, and object. For example:
Capacity of All Pools on Group - group06
Callout 7 Context Link Bar Objects
Within each information category, you can view group-wide data or data that is restricted to a specific group object. You can select:
Group: Displays group-wide data.
Pools: Displays data for all pools or individual pools.
Members: Displays data for all members or individual members.
Volumes: Displays data for all volumes or individual volumes. A drop-down menu lists the individual volumes, with icons indicating the type of volume: standard (fully provisioned), thin provisioned, template, thin clone, recovery, and offline.
Volume Collections: Displays data for all volume collections or individual volume collections. In some cases, the volume collections windows display data for the group. See the tooltips for details.
An object for which a category of information is not available will be disabled (shown in gray text), with the exception of the Group object, which is in gray text when selected.
Data in the summary views automatically refreshes every two minutes, or manually when you click the Refresh button. Summary views allow for column sorting and, in some cases, name searching. Within the summary views, links are provided to navigate to individual group, group member, or volume information.
Capacity Summary
The All Groups Capacity Summary page displays capacity information gathered from the most recent poll of each group. For each group, SAN HeadQuarters provides the group capacity, volume reserve (that is, space allocated to the volume), snapshot reserve, delegated space, and replication reserve. Total, in-use, and free space statistics are shown for each of these categories.
For group thin provisioned space, SAN HeadQuarters provides the following information:
Unreserved Space: Amount of unallocated space for thin provisioned volumes in the group. The reported size of a thin provisioned volume can be larger than the volume reserve.
Free Space: Percentage of free group space required to fulfill the maximum in-use space for all thin provisioned volumes.
Thin Provisioned Volumes: Number of thin provisioned volumes in the group. The reported size of a thin provisioned volume can be larger than the volume reserve.
Template Volumes: Number of thin provisioned volumes in the group that are template volumes.
Thin Clone Volumes: Number of thin provisioned volumes in the group that are thin clone volumes.
For all information categories, you can click the icon in the column heading to sort by size criteria. For additional definitions of capacity and replication terms, see Capacity and Replication Terms on page 68.
Replication Summary
The All Groups Replication Summary page displays volume outbound replication information gathered from the most recent poll of each group. Information presented in the Replication Summary includes the group/volume tree that, when expanded, displays individual volume outbound replication information.
Additional information presented includes:
Replication partner: A list of all groups currently configured as outbound replication partners.
Status: Current status of outbound replication. For a group, this status indicates the group's configured replication status. For a volume, this status indicates the volume's individual replication status. Status values can be enabled, disabled, running, and paused.
Active Replicas: Number of volumes actively replicating data to a partner site. An individual volume will indicate if it is actively replicating to a partner site.
Paused Replicas: Number of volumes currently paused from replicating data to a partner site. An individual volume will indicate if it is currently paused.
Waiting Replicas: Number of volumes currently waiting for outbound replication to begin. An individual volume will indicate if it is currently waiting for outbound replication to begin.
For more definitions of replication terms, see Capacity and Replication Terms on page 68.
Hardware/Firmware Summary
The All Groups Hardware/Firmware Summary page displays the hardware and firmware configuration for all groups monitored by the Server. The information is gathered from the most recent poll of each group. Information presented in the Hardware/Firmware Summary includes:
Group/Member: The name of the group for which the hardware and firmware information is displayed. Expand the group name to display individual members of the group.
Firmware: Full firmware revision number on the group or member.
Model, Service Tag, Enclosure Serial Number: Product information on the array for the group and members.
Controller Type and Controller Count: List of and total number of controller models for the group or member.
Member Status: Summary of the current member status for the group or the individual status of the member; either offline or online.
RAID Status and RAID Policy: The current RAID status and RAID policy type for the group or member.
Disks: Disks present on the group or member.
Uptime: Uptime of the controllers on the group or members.
Volume Capacity Summary
The All Groups Volume Capacity Summary page displays critical volume capacity, similar to the All Groups Capacity Summary, while providing a more detailed look at a group's individual volumes. The individual volume capacity information is gathered from the most recent poll of each group. You can expand the group to see individual volumes and then click the volume name to navigate to that volume's capacity page. A search field lets you search for a particular volume by name and type. For example, you could search for a thin provisioned volume with the name "Crv64" to verify that the volume has adequate free space. Information presented in the Volume Capacity Summary includes:
Group/Volume: Name of the group for which the volume capacity information is displayed. Expand the group/volume name to display the individual volumes for the group.
Volume Type: Template, for a template volume; Thin Clone, for a thin clone of a template volume; Thin, for a thin provisioned volume; or Standard, for a standard, fully provisioned volume.
Total Capacity: The volume size as seen by iSCSI initiators. For volumes that are not thin provisioned, the reported size is the same as the volume reserve. For thin provisioned volumes, the reported size can be greater than the volume reserve.
Allocated Space: The total, in-use, and free space allocated to the volume or group. For thin provisioned volumes, as volume reserve is consumed, more space is allocated to the volume up to the user-defined limit.
Snapshot Space: The total, in-use, and free snapshot space for the volume or group.
Replication Space: The total, in-use, and free replication space for the volume or group.
Replication Status: The current replication status for the volume or group; either enabled or disabled.
Group:
  Summary data of the selected group
  I/O data for the specific group
  Network data for the selected group
Pools:
  Summary data of all pools on the selected group
  I/O data for all pools on the selected group
  I/O data for a specific pool on the selected group
  Network data for all pools on the selected group
  Network data for a specific pool on the selected group
Members:
  Summary data of all members of the selected group
  I/O data for all members of the selected group
  I/O data for a specific member of the selected group
  Network data for all members of the selected group
  Network data for a specific member of the selected group
Volumes:
  Summary data of all volumes on the selected group
  I/O data for all volumes of the selected group
  I/O data for a specific volume of the selected group
Volume Collections:
  Summary data of all volume collections on the selected group
  I/O data for all volume collections on the selected group
  I/O data for a specific volume collection on the selected group
By default, SAN HeadQuarters sets the view to Standard. For example, the standard view for group I/O might resemble the example in Figure 6, where write I/O data spikes reach 30 KB and write IOPS spikes reach 1100.
Figure 6: Standard View for Group I/O
To set 95th percentile reporting, click the "95th%" button on any of the views indicated in Table 3. For example, when you set the view to 95th% in the example in Figure 6, SAN HeadQuarters filters out the data spikes; in this case, the maximum write I/O is under 12 KB and write IOPS is under 900 (see Figure 7). For more information, see Controlling Chart Display on page 42.
Figure 7: 95th Percentile View for Group I/O
Note: When switching from Standard view to 95th percentile reporting, both summary and specific data can show missing data if either the beginning or ending data points appearing in Standard view are the high point of the graph. Because 95th percentile reporting factors out the top 5% spikes, the polling period will appear shorter at either the beginning or ending of the graph.
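The effect of 95th percentile reporting can be illustrated with a short Python sketch. It assumes the nearest-rank definition of a percentile and simply drops samples above that value; the manual does not state which percentile method SAN HeadQuarters uses, so treat this as an approximation with hypothetical function names.

```python
def percentile_95(values):
    """Return the 95th percentile using the nearest-rank method."""
    if not values:
        raise ValueError("at least one sample is required")
    ordered = sorted(values)
    # ceil(0.95 * n) computed with integer arithmetic to avoid rounding error.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def filter_spikes(values):
    """Keep only the samples at or below the 95th percentile."""
    cutoff = percentile_95(values)
    return [v for v in values if v <= cutoff]
```

With 19 samples of 10 KB and a single 30 KB spike, the cutoff is 10 KB and the spike is removed, which is why the graph scale drops in the 95th percentile view.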
The numbered items correspond to the callouts in Figure 8.
Callout 1 Timeline
Displays a range of dates for which group data is available. Use the left and right arrows next to the timeline or the zoom links to change the dates in the timeline.
Callout 2 Time range selector
Shown as a gray rectangle on the timeline, controls the time range for the information that appears in the graphs. You can select the time range selector and stretch it along the timeline to set the time range, or you can use the zoom links. You can also select and move the selector along the timeline to change the dates of the time range. By default, when you first view the Group Summary window, the timeline shows the most recent 10 days and the time range selector is set to display data from the most recent eight-hour time period.
Callout 3 Zoom links
Enable you to quickly set the value of the time range selector and also control the range of dates in the timeline:
Click Show Latest to show data up to the most recent poll. If you move the selector to the far right of the timeline, Show Latest is automatically selected. If you move the selector to the left, Show Latest is automatically deselected.
Click 1hr, 8hr, 1d, 7d, or 30d to set the time range selector to 1 hour, 8 hours, 1 day, 7 days, or 30 days, respectively. The link for a specific time range appears only after SAN HeadQuarters has gathered data for this amount of time.
Click All to show all the dates for which data is available in the timeline.
Click Custom to specify a range of dates to show in the timeline.
Callout 4 Available data
Shows the beginning date and time and the end date and time for which data is available for the group. This represents the time period (up to the most recent year) for which SAN HeadQuarters has gathered group data.
Callout 5 Selected range
Shows the current setting for the time range. Graphs show data obtained during this time period.
Callout 6 Showing data captured
Shows the date and time for the data displayed in the tables and circle graphs. By default, the tables and circle graphs show data from the most recent time in the selected time range.
Callout 7 Data graphs
Show data from the selected time range. Place the cursor over a graph legend to obtain help on the data. To obtain a data point for a graph, the SAN HeadQuarters Server calculates the average of the data collected from two consecutive polling operations. Place the cursor over a point in time on a graph to display a tooltip with information about the data point, including the polling (or sampling) period used to obtain the data. In a graph, you can also:
Click a point in time to display data in tables and circle graphs from that time.
Use the pointer to select a time range in a graph. Tables and circle graphs will show data that is averaged over this time range.
Use the left and right arrow keys to move back and forth in a graph.
If your pointer has a wheel, use it to zoom into and zoom out of a graph.
If SAN HeadQuarters is unable to communicate with a group because of a significant event in the group, such as a control module restart or failover or a firmware upgrade operation, or if you have an unsupported hardware configuration, you might see missing data points in the graphs.
Callout 8 Data tables and circle graphs
By default, show data from the latest time in the selected time range. However:
If you click a point in time in a graph, they show data from that specific time.
If you select a time range in the graphs, they show data averaged over that time range.
Place the cursor over the question mark next to a table or over a circle graph legend to obtain help on the data.
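The way a graph data point is derived under Callout 7 (the average of the data from two consecutive polling operations) can be sketched in Python as follows. Representing a failed poll as None, and leaving a gap whenever either poll in a pair is missing, are assumptions made for illustration; the function name is hypothetical.

```python
def data_points(polls):
    """Average each pair of consecutive polls to produce graph data points.

    A poll that could not be collected is given as None (an assumption);
    any pair that includes a missing poll yields a missing data point.
    """
    points = []
    for previous, current in zip(polls, polls[1:]):
        if previous is None or current is None:
            points.append(None)  # appears as a gap in the graph
        else:
            points.append((previous + current) / 2)
    return points
```

For example, three polls of 100, 200, and 400 IOPS yield two data points, 150 and 300.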
correct the problem, right click on the server name in the Servers and Groups tree and select Change Login Credentials. The Add Group Wizard appears. You must provide the group IP address (or management network address) and the SNMP community name that is already configured in the group. See Adding a Group to the Monitoring List on page 45 for details. Once you add a group, SAN HeadQuarters starts to collect group configuration and performance data and then displays the data in the SAN HeadQuarters GUI (see Navigating the GUI on page 20).
Note: After you start the SAN HeadQuarters GUI or add a group, allow time for SAN HeadQuarters to gather performance information. Data might be delayed by 30 seconds or more, depending on the group workload.
After adding a group to the monitoring list, Dell recommends that you perform the following tasks:
Set up notification mechanisms so you are informed of significant events and conditions in a group. See Set Up Notification Mechanisms on page 12.
Preserve group data on a regular basis. See Preserve Data on page 11.
Configure Single Sign-On for easy Group Manager logins. See Configure Single Sign-On on page 11.
For information about the data that appears in the GUI and interpreting the data, see:
Understanding SAN HeadQuarters GUI Data on page 67 for information about the data that each window displays.
Analyzing Group Data and Solving Problems on page 85 for information on how to interpret performance data.
Adding a Server
To monitor groups on additional servers, add a new server from the SAN HeadQuarters GUI as follows:
1. Select the SANHQ menu item, then select Add New Server, or click Add New Server in the toolbar. Alternatively, right-click SANHQ Servers at the top of the Servers and Groups tree in the left panel, then select Add New Server. The Add New SAN HeadQuarters Server Wizard appears and guides you through providing all required information. Click Next.
Note: The servers you add can be in different physical locations, provided that you have network connectivity between the client and server.
2. Enter the server connection information required to initialize the connection to the server:
a. Enter the remote log directory path of the server to be monitored. You can specify either the server DNS name or IP address. Specify the share name when supplying an IP address.
b. Enter a server display name to reference the SAN HeadQuarters Server you are adding. You can choose any name or use the server IP address. (This field is automatically populated using the DNS name or IP address from Step a.)
c. Click Next.
3. When you add a server with TCP/IP enabled, you must enter your authentication credentials for access to the remote server. The supplied credentials generate a validation ID for subsequent logins, so the password does not need to be stored. If the credentials you enter cannot be validated, the connection to the server will be degraded. This might result in slower performance and disable advanced features such as Live View. Once you enter valid credentials, the server's communication interface is established. The default TCP/IP port set during installation is 8000, although you can configure the server to use a different port. Click Next.
Note: If the server port has been changed but the server has not been restarted, you will not be able to add the server until the server restarts and uses the changed port number.
4. The Add Server Connection Summary shows a summary of the information you entered for the new connection. If the information is correct, click Add Server. Otherwise, use the Back button to re-enter information. When you click Add Server, you are connected to the server you added. It might take a few minutes for the groups on the server to poll. If there is a problem connecting, a message appears indicating the problem (for example, invalid credentials).
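Before adding a server, it can help to confirm that the client computer can reach the server's TCP/IP port. The following Python sketch uses only the standard library; the function name and the example host name are hypothetical, and port 8000 is the installation default mentioned in Step 3. A successful TCP connection shows only that the port is reachable, not that credentials will validate.

```python
import socket


def port_reachable(host, port=8000, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the host and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


if __name__ == "__main__":
    # "sanhq-server.example.com" is a placeholder host name.
    print(port_reachable("sanhq-server.example.com"))
```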
Removing a Server
To remove a server from the SAN HeadQuarters GUI:
1. Right-click the server you want to remove from the Servers and Groups tree, then select Remove Server. You are asked to confirm removing the server.
2. Click Yes to remove the server from monitoring.
Note: You cannot remove the default server to SAN HeadQuarters; the Remove Server
Scenario 2: A newer version of the client exists on two servers. In this scenario, the selected server (fservice) is at a lower client version than the default server (vm2). Continuing to select that server will require a second upgrade, as indicated in the warning message on the bottom of the dialog box.
Scenario 3: In this scenario, SAN HeadQuarters detected a newer version of the client on two servers. The non-default server (fservice) is running the newest version and therefore is selected by default. However, SAN HeadQuarters determined that the log files are incompatible with the current version and added a Remove button to that line:
When you click Remove, the dialog box redisplays with the default server (vm1) selected. As before, the top message indicates that the current SAN HeadQuarters Client version (2.3) should be upgraded to the selected version (in this case, server vm1 at version 2.4).
When you upgrade to the newest version, SAN HeadQuarters presents a dialog box indicating that the files are being copied and informs you when the copy is complete.
2. In the GUI Appearance panel:
Set the theme for the GUI windows. You can use the theme that matches your operating system or you can select another theme available on the system.
Change the colors used in the GUI graphs. You can use a pre-defined color scheme or select your own colors.
Note: SAN HeadQuarters will automatically restart the SAN HeadQuarters Server after the log file location is changed.
To change the log file location:
1. Click Settings in the lower-left GUI panel. The General Settings window displays.
2. In the Installation Settings panel, click Change. The Change Log Files Directory - Welcome dialog box appears. Click Next to start the wizard.
3. In the Change Log Files Directory - New Location dialog box, specify:
New log file directory location.
Optionally, whether to delete the original log files and directory after the files are copied to the new location.
4. Click Copy to start the copy operation.
5. When the Change Log Files Directory - Restart GUI dialog box appears, click Restart GUI. If you chose to keep the original log file directory, you can manually delete the log files and directory.
Controlling Tooltips
From the General Settings window, you can determine how tooltips will appear in the GUI. By default, when you move the pointer over a point in time in a graph, tooltips appear that display data from the selected time. Similarly, when you move the pointer over a graph legend, tooltips appear with definitions of the data. To control tooltip behavior: 1. Click Settings in the lower-left GUI panel. The General Settings window displays.
2. In the Tooltip Settings panel, turn tooltips on and off by selecting and deselecting the choices under Tooltip Settings. 3. Click Apply to implement and save the changes.
The SAN HeadQuarters Service can be started either under the Local Service account (default) or with specific user account credentials. For the SAN HeadQuarters Service to use a network log directory, the service must be running as a Domain User. If you have administrative privileges, you can change the startup settings for the service in the Startup Settings pane. Dell recommends that you configure the service to run as a specific user and supply a username and password.
The SAN HeadQuarters Server is capable of using an advanced communication infrastructure. In the Connection Settings for Client pane, you can specify a host name (or IP address) and a port number that will be used by all clients for advanced communication. SAN HeadQuarters Clients that do not have the correct connection information will use a degraded connection format that retrieves only a subset of data from the server. For more information, see Degraded Connection Status on page 44.
If you are set up to provide e-mail notification of alerts for a group, you can change the default notification delay from 6 hours to 1, 2, 4, 8, or 12 hours, or 1, 2, or 7 days. Select the delay period from the drop-down menu in the Server E-Mail Notification Settings pane.
To correct this problem:
1. Right-click the server name in the Servers and Groups tree in the far-left panel and select Change Login Credential.
2. In the dialog box, resupply the credentials for access to the server. Note that these credentials must be resolved by the server to validate successfully.
Direct Connection is Unavailable
TCP/IP communication to the specified port cannot be established. Typically, this scenario results from a firewall on the server side not allowing communication. SAN HeadQuarters reports the degraded connection status as "Direct Connection Unavailable."
To correct this problem, add a firewall rule that allows the SAN HeadQuarters Server process (SanHQServer.exe) to communicate on the port specified during installation.
Server Restart is Pending
The SAN HeadQuarters Clients cannot communicate with the Server. In this scenario, the SAN HeadQuarters Client notifies you of a server settings change that requires a restart. SAN HeadQuarters reports the degraded connection status as "Pending Server Restart."
To correct this problem, verify that the SAN HeadQuarters Server process, eqlxperf, is restarted and listed in Windows services.
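For scripted troubleshooting notes, the two status strings quoted above can be mapped to their corrective actions. The following Python sketch is illustrative only; the dictionary and function names are hypothetical, and it covers only the status strings that appear in this section.

```python
# Corrective actions keyed by the degraded connection status that the GUI reports.
REMEDIES = {
    "Direct Connection Unavailable": (
        "Add a firewall rule that allows the SAN HeadQuarters Server process "
        "(SanHQServer.exe) to communicate on the port specified during installation."
    ),
    "Pending Server Restart": (
        "Verify that the SAN HeadQuarters Server process (eqlxperf) has "
        "restarted and is listed in Windows services."
    ),
}


def remedy_for(status):
    """Return the documented corrective action for a degraded connection status."""
    return REMEDIES.get(
        status, "Unrecognized status; see Degraded Connection Status."
    )
```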
To add a group to the monitoring list:
1. Click Add Group in the toolbar. The Add Group - Welcome dialog box appears. Click Next to start the wizard.
2. In the Add Group - Group Information dialog box, enter:
Group IP address: IP address for the group (or management address if a dedicated management network is configured in the group). Alternatively, you can enter the DNS name for the group.
SNMP community name: This name must already be configured in the group. You can obtain the SNMP community name by using the Group Manager GUI or CLI. For example, in the Group Manager GUI, click Group Configuration, then select SNMP.
Single Sign-On credentials: Optionally, select Enable and specify a group administration account name and password that SAN HeadQuarters stores for future use. Once you enter these credentials, you can run the GUI as a standalone application and log in to the group without entering an account name and password. You can also configure Single Sign-On at a later time. See Launching the Group Manager with Single Sign-On on page 52 for more information and requirements. Select Disable if you do not want to configure Single Sign-On at this time.
3. Click Next.
4. In the Add Group - E-Mail Notification dialog box, you can optionally set up e-mail notification of alerts. In the dialog box, you can select:
Do not send e-mail alerts.
The option to use the e-mail notification settings already configured in the group for e-mail notification of alerts.
The option to use the settings you specify in the dialog box for e-mail notification of alerts. Specify the following:
  SMTP server and optional port
  Text for subject field (optional)
  E-mail address to use in the From field
  One or more e-mail addresses that will receive notification (enter one address on each line, or use a comma-separated or semicolon-separated list of addresses)
See Configuring E-Mail Notification for Group Alerts on page 49 for more information.
5. Click Next.
6. In the Add Group - Log Files dialog box, enter:
Log file size. For each monitored group, SAN HeadQuarters maintains 13 log files. The default size for each log file is 5 MB, the minimum size is 2 MB, and the maximum size is 10 MB. Use the slider to change the default log file size. Using a log file size larger than the default (5 MB) enables you to store more precise data, but it might have a slightly negative impact on response time. If you use a log file size that is smaller than the default size, data will be less precise, but response time might improve. See How Data is Compressed in Log Files on page 64.
Event file size. Optionally, if you want to use the computer running the SAN HeadQuarters Server as a syslog server for the group, use the slider to change the default size of the event log file, which stores event messages from the group. The default size of the event log file is 5 MB, the minimum size is 5 MB, and the maximum size is 20 MB.
Once messages consume all the free space in the event file, new messages will overwrite the oldest messages. See Syslog Event Logging on page 60.
7. In the Add Group - Summary dialog box, click Add Group to add the group to the monitoring list. Click Back to make changes.
After you add a group, the SAN HeadQuarters Server starts to gather group data. Data will appear in the SAN HeadQuarters GUI graphs and tables as successive polling operations occur. This can take up to 10 minutes, depending on the group workload.
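The log file settings in Step 6 determine roughly how much disk space each monitored group can consume. The following Python sketch estimates that figure from the limits stated above (13 log files of 2 to 10 MB each, plus an optional event file of 5 to 20 MB). The function name is hypothetical, and the estimate assumes every file grows to its configured size and ignores any other files SAN HeadQuarters might keep.

```python
LOG_FILES_PER_GROUP = 13  # stated in Step 6


def group_log_footprint_mb(log_file_mb=5, event_file_mb=0):
    """Estimate the maximum log space for one group, in MB.

    event_file_mb=0 means the group does not use the server as a syslog server.
    """
    if not 2 <= log_file_mb <= 10:
        raise ValueError("log file size must be between 2 and 10 MB")
    if event_file_mb and not 5 <= event_file_mb <= 20:
        raise ValueError("event file size must be between 5 and 20 MB")
    return LOG_FILES_PER_GROUP * log_file_mb + event_file_mb
```

With the default 5 MB log file size and no event file, the estimate is 65 MB per group; at the maximum settings (10 MB log files and a 20 MB event file), it is 150 MB.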
HeadQuarters GUI must have read-write permission to the log file directory and the network file share, if applicable.
To remove a group from the list of monitored groups:
1. Select the group from the Servers and Groups tree in the far-left panel. The Group Summary window appears (Figure 4).
2. Optionally, to preserve the logged data, create an archive. See Group Data Archives on page 116.
3. Pull down the Group menu and select Remove From Monitoring List.
4. In the confirmation dialog box, select Remove the group from the list, deleting its log files. You are also given the choice to temporarily stop monitoring the group and keep the log files, as described in Stopping Group Monitoring on page 51.
4. To sort the groups by online status (either online or offline), click the indicator (up or down arrow).
To increase the log file size:
1. In the Servers and Groups tree, select a group, then from the Group menu, select Increase Log File Size. You can also right-click a specific group in the Servers and Groups tree, then select Increase Log File Size. The Increase Log File Size dialog box appears.
Alternatively, to change the log file size of several groups from one location, select a server or group from the list in the Servers and Groups tree. From the SAN HeadQuarters menu bar, select Settings, then Group Settings. The Group Settings window appears, showing all groups for that server.
a. In the list of group names, click the box to the left of the group for which you want to increase the log file size.
b. In the Log File Size column, select Change Size. The Increase Log File Size dialog box appears.
2. In the Increase Log File Size dialog box, use the slider to increase the log file size.
3. Click the checkbox to acknowledge that you understand that you cannot later decrease the log file size.
4. Click OK.
For related information about how data is compressed in log files, see How Data is Compressed in Log Files on page 64.
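Because the log file size can be increased but never decreased, a planning script might validate a proposed size before you commit to it. The following Python sketch is a minimal check; the function name is hypothetical, and it assumes that the 10 MB maximum stated for the Add Group wizard also applies when you increase the size later.

```python
def validate_log_size_increase(current_mb, new_mb, maximum_mb=10):
    """Return new_mb if it is a valid increase; otherwise raise ValueError."""
    # The size can only grow; it cannot be decreased later.
    if new_mb <= current_mb:
        raise ValueError("new size must be larger than the current size")
    # Assumption: the Add Group wizard maximum (10 MB) applies here as well.
    if new_mb > maximum_mb:
        raise ValueError(f"new size must not exceed {maximum_mb} MB")
    return new_mb
```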
HeadQuarters GUI must have read-write permission to the log file directory and the network file share, if applicable. See Log File Directory Requirements on page 7. To change the SNMP community name for a group: 1. Pull down the Settings menu and select Group Settings. The Group Settings window appears.
Note: If the dialog box is labeled read-only, you do not have the correct credentials for changing the SNMP community name.
2. Select the group's SNMP community name and modify the name.
3. Click Apply.
timeout), performance problems (such as a high load), and group alarms (for example, a failed control module). Optionally, you can configure e-mail notification of alerts when you add a group or at a later time. If you configure e-mail notification for a group, the computer running the SAN HeadQuarters Server sends e-mail messages to designated addresses when an alert occurs. This feature enables administrators to be promptly informed of potential problems in a monitored group. Each new alert generates an e-mail message. Subsequent alerts for the same problem are combined into a single e-mail message every six hours to limit redundant messages. The SAN HeadQuarters Server also can construct combined e-mail notifications for any alerts still active on a group. To limit redundant messages, you can adjust the time between these combined e-mail notifications from the default of 6 hours. For information, see SAN HeadQuarters Service Configuration Settings on page 42.
Note: If an issue that generates a SAN HeadQuarters alert is resolved within a polling period,
e-mail notification will not occur. To set up or modify e-mail notification for a monitored group, the computer running the SAN HeadQuarters GUI must have read-write permission to the log file directory and the network file share, if applicable. See Log File Directory Requirements on page 7. Dell recommends that you use the PS Series group e-mail notification feature as the primary notification method for group events. The SAN HeadQuarters alert notification system augments the group e-mail notification feature and should be considered a supplementary method of providing notification. To configure e-mail notification for a group: 1. From the Settings menu select E-Mail Settings. The E-Mail Settings window appears.
Note: If the window is labeled read-only, you do not have the correct credentials for configuring e-mail notification.
2. In the row with the group name, select the checkbox in the Enabled column.
3. In the E-Mail Settings panel, select Send E-Mail Alerts.
4. Select Use group e-mail settings if you want to populate the fields with the e-mail notification settings already set up in the group. Select Use these e-mail settings if you want to use the e-mail settings you enter in this window. Enter the following information:
IP address of the SMTP server that will deliver the e-mail. Alternately, you can enter the DNS name for the server.
Optionally, a port for the SMTP server. The default port is 25.
Optionally, text for the Subject field.
E-mail address for the From field.
E-mail addresses to receive notification. Specify only one address on each line.
5. Click Apply in the E-Mail Settings panel to apply the changes. 6. Dell recommends that you send a test message to ensure that the e-mail settings are correct. Click Test e-mail to send a test e-mail message.
7. To customize the type of alerts that result in notification, click the Notifications tab in the E-Mail Settings window. The E-Mail Settings Notification window appears. 8. Select the group in the top table, select the type of alerts for which you want notification, and then click Apply. 9. If desired, click the Informational, Caution, Warning, and Critical tabs to display a list of alerts for each alert type. 10. Select the alerts for which you want notification and then click Apply. To modify the existing e-mail notification configuration for a group, in the E-Mail Settings window, select the group in the top table and edit the data in the top table and click Apply, or edit the fields in the E-Mail Settings panel and click Apply. You can also change the alert selections in the E-Mail Settings Notification window. Click Apply when finished.
GUI must have read-write permission to the log file network file share and directory. See Log File Directory Requirements on page 7. To temporarily stop monitoring a group and then resume monitoring the group: 1. In the All Groups window (Figure 3), select the group in the Servers and Groups tree. The Group Summary window appears. 2. Pull down the Group menu and select Stop Monitoring. Group data will still appear in the GUI, but the monitoring status is disconnected.
Note: If Stop Monitoring is not active in the Group menu, you do not have the correct credentials to stop group monitoring.
3. To resume monitoring the group, pull down the Group menu and select Start Monitoring.
The computer running the SAN HeadQuarters GUI must be running Java 1.5.0 or a higher version. Dell recommends Java 1.6.0 Update 7.
6. Click Apply in the Group Settings window. To launch the Group Manager and log in to the group, click Launch in the Group Settings window or click Group Manager in the toolbar.
Note: If you enabled Single Sign-On and you are prompted for login credentials, be sure you
entered the correct account name and password. Also, make sure you are using the same domain user account and the same computer that you used to configure Single Sign-On.
When you want to re-enable Single Sign-On for the group: 1. In the All Groups window (Figure 3), pull down the Settings menu and select Group Settings. The Group Settings window appears. 2. Select the group in the table. 3. Select Enable Single Sign-On. 4. Click Apply.
Adding a View as a Favorite To add a favorite view from the toolbar: 1. Navigate to the specific view that you wish to set as a favorite. 2. From the toolbar, select Favorites and click Add to Favorites.
The current view is then added to the list of favorites in the pull-down menu available from the Favorites menu and under the Favorites node in the Group view. Adding a Node as a Favorite To add a node as a favorite from the Servers and Groups tree: 1. Select a specific node from the tree. 2. Right-click in the tree and select Add to Favorites. The selected node is then added as a favorite at group level. It is added to the list of favorites in the pull-down menu available from the Favorites menu in the toolbar and under the Favorites node in the Group view.
4. Click the arrow button. (The arrow button turns to a square, allowing you to stop collecting data.) SAN HeadQuarters begins collecting data and displays it at the chosen interval. Data continues to display until the session length is reached. A status message in the top of the window indicates how many polls were completed in the selected timeframe. 5. To establish a new Live View session, click the arrow button again. You can change target, session length, or polling interval each time. A dialog box first appears asking you if you want to save the previous session. You can set the default to always save, always discard, or always prompt (default). To view previously-saved sessions: a. Select the group member in the Servers and Groups tree in the left panel, then I/O, and then Live View Sessions. Live View sessions display in the main panel for the member. b. Select the saved session. The session name includes member or volume name, and date and time the session was run. The saved session displays in the Live View window, from which you can select a target and rerun Live View.
Note: If multiple clients initiate a live view on the same member or volume, the second client
will be unable to initiate a session until after the first session completes. Figure 12 shows a completed Live View session for a group member. The session length was one minute with a 5 second polling interval. The status message indicates 12 polls were completed in the session timeframe.
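The poll count in the Figure 12 status message follows from the session length and the polling interval. The following is a minimal sketch of that arithmetic in Python. The function name is hypothetical and is not part of SAN HeadQuarters, and the sketch assumes that polls are evenly spaced across the session.

```python
def expected_polls(session_seconds: int, interval_seconds: int) -> int:
    """Return how many polls fit in a Live View session (assumes evenly spaced polls)."""
    if session_seconds <= 0 or interval_seconds <= 0:
        raise ValueError("session length and polling interval must be positive")
    return session_seconds // interval_seconds


# One-minute session with a 5-second polling interval, as in Figure 12.
print(expected_polls(60, 5))  # 12
```

A status message that reports fewer polls than this estimate may indicate that some polls in the session did not complete.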
3. SAN HeadQuarters displays the current RAID policy in the upper left pane and performance data in the data graph. To evaluate a new RAID policy for a group, pool, or member, select a new RAID policy from the drop-down menu. The available RAID selections are based on the current RAID set, except when a Group or Pool has a mixed RAID set. SAN HeadQuarters automatically refreshes the view with evaluation data for the new RAID policy.
Notes: RAID 6 Accelerated is shown only for XVS array configurations and cannot be changed.
If a Group or Pool contains a mixture of XVS arrays and standard arrays, you might erroneously select an invalid configuration (that is, RAID 6 Accelerated evaluated on a Pool with non-XVS arrays). SAN HeadQuarters displays a notification label in the GUI that an incorrect selection was made.
Figure 13 shows the RAID evaluation for a group member, where the policy was changed from RAID 50 to RAID 10. Data shows the original estimated maximum IOPS and degraded maximum IOPS for read and write operations, plus the evaluation data based on the new RAID policy.
Figure 13: RAID Evaluation for Group Member
Before changing the syslog configuration, back up or make a copy of the SyslogConfig.xml file that is located in the log file directory. To change the syslog server configuration and enable only specific network interfaces for use as listening UDP sockets:
1. Edit the SyslogConfig.xml file that is located in the log file directory used by the SAN HeadQuarters Server. For example, the default SyslogConfig.xml file appears as follows:
<?xml version="1.0"?>
<SyslogConfig xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <Enable>true</Enable>
  <IPv4Interface>ANY</IPv4Interface>
  <IPv6Interface>ANY</IPv6Interface>
  <Port>514</Port>
</SyslogConfig>
2. In the <IPv4Interface> tag or the <IPv6Interface> tag, whichever applies to your network configuration, specify:
ANY: Specifies that any network interface can be used as a listening UDP socket. This is the default.
IP address: Specifies that only the network interface associated with the IP address can be used as a listening UDP socket.
Empty string: Disables the use of all network interfaces for a specific network protocol. For example, specify the following to disable use of IPv6:
<IPv6Interface></IPv6Interface>
The following requirement applies: You must use valid XML tags in the SyslogConfig.xml file. 3. Restart the SAN HeadQuarters Server, as described in Restarting the SANHeadQuarters Server on page 12. If there is a syntax error in the SyslogConfig.xml file or some other problem, SAN HeadQuarters logs the event to the Windows event log.
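Because a syntax error in SyslogConfig.xml is only reported in the Windows event log after the server restarts, it can help to check an edited file before restarting. The following is a minimal sketch in Python that uses the standard xml.etree.ElementTree module. The function name read_syslog_config is hypothetical and is not part of SAN HeadQuarters, and the sketch checks only that the XML is well formed and that the four elements from the default file are present.

```python
import xml.etree.ElementTree as ET

# Default configuration text as shown in this section.
DEFAULT_CONFIG = """<?xml version="1.0"?>
<SyslogConfig xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <Enable>true</Enable>
  <IPv4Interface>ANY</IPv4Interface>
  <IPv6Interface>ANY</IPv6Interface>
  <Port>514</Port>
</SyslogConfig>"""


def read_syslog_config(xml_text: str) -> dict:
    """Parse SyslogConfig.xml text; raises ET.ParseError if the XML is malformed."""
    root = ET.fromstring(xml_text)
    if root.tag != "SyslogConfig":
        raise ValueError("unexpected root element: " + root.tag)
    settings = {}
    for tag in ("Enable", "IPv4Interface", "IPv6Interface", "Port"):
        element = root.find(tag)
        if element is None:
            raise ValueError("missing element: " + tag)
        # An empty element such as <IPv6Interface></IPv6Interface> has text None.
        settings[tag] = (element.text or "").strip()
    return settings


print(read_syslog_config(DEFAULT_CONFIG))
```

An empty string returned for IPv4Interface or IPv6Interface corresponds to the "Empty string" setting, which disables listening for that protocol.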
To disable the SAN HeadQuarters syslog server: 1. Edit the SyslogConfig.xml file that is located in the log file directory used by the SAN HeadQuarters Server. For example:
<?xml version="1.0"?> <SyslogConfig xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema"> <Enable>true</Enable>
3. Restart the SAN HeadQuarters Server, as described in Restarting the SANHeadQuarters Server on page 12.
Note: If there is a syntax error in the SyslogConfig.xml file or some other problem, SAN
3. Add the group to the list of monitored groups, specifying the new network address. See Adding a Group to the Monitoring List on page 45.
Cannot connect to a group: Make sure an SNMP community name is configured in the group and the same name is configured in the SAN HeadQuarters GUI for that group. In addition, for each monitored group, the computer running the SAN HeadQuarters Server must have network access to all the configured network interfaces on all the group members, the group IP address, and the management address (if applicable). Use the ping command to determine if you can access all IP addresses.
Cannot add a group: Make sure the computer running the SAN HeadQuarters GUI has read and write access to the log file directory.
Cannot set up e-mail notification or change the SNMP community name: Make sure the computer running the SAN HeadQuarters GUI has read and write access to the log file directory.
precise data; however, a larger log file size might have a slightly negative impact on response
time. If you use a log file size that is smaller than the default size, data will be less precise but response time might improve. Table 4 shows how compression over time affects different types of performance data. For example, some data will be understated because intermittent idle time in typical workloads will decrease averages. See Performance and Capacity Terms on page 67 for a description of the terms used in Table 4. Table 4: Compression Impact on Performance Data
Latency Throughput IOPS I/O Size Read/Write Distribution
Yes Yes
Yes Yes
No Yes
No Yes
For related information about increasing log file size, see Increasing the Log File Size on page 48.
Note: When displaying group data, make sure you select the correct time range. By default, the GUI displays data from the most recent eight-hour time period. Use the Zoom links above the timeline to quickly set the value of the time range selector and also to control the range of dates seen in the timeline. For example, click Show Latest to show data up to the most recent time. See Navigating the GUI on page 20.
obtain information about the data, move the pointer over a graph legend or the question mark icon next to a table title. See the Group Administration Guide for detailed information about the status that appears in the SAN HeadQuarters GUI, in addition to information about PS Series group operation.
Local replication reserve: Space on the primary group reserved for storing volume changes during replication and the failback snapshot.
Overall group capacity: Available space. Capacity depends on a number of variables. For example, group capacity depends on the number of members, the number and size of the disks installed in the members, and each member's RAID policy. Make sure free space in a pool does not fall below the smallest of these values:
5% of pool capacity
100 GB * number of pool members
Otherwise, load balancing, member removal, and replication operations do not perform optimally, and performance for thin provisioned volumes degrades.
Replication partner: A list of all groups currently configured as an outbound replication partner.
Replica reserve: Portion of delegated space reserved for storing the replica set for a volume. Once replica reserve has been consumed, the oldest replicas are deleted to free space for new replicas. To retain more replicas, increase the replica reserve percentage.
Reported volume size: Volume size seen by iSCSI initiators.
Snapshot reserve: Space reserved for storing snapshots. Once snapshot reserve has been consumed, the oldest snapshots are deleted to free space for new snapshots. To retain more snapshots, increase the snapshot reserve percentage.
Thin provisioning statistics: Number of volumes that are thin provisioned, the amount of unreserved (unallocated) space for thin provisioned volumes, and the percentage of group space required to fulfill the maximum in-use space requirements for thin provisioned volumes.
Volume reserve: Space allocated to a volume. For thin-provisioned volumes, the volume reserve is based on usage patterns. As more data is written to the volume, more space is allocated to the volume, up to the user-defined limit.
Volume type: Type of volume: template, thin-provisioned, thin clone, or standard (a fully provisioned volume).
Space utilization terms are described as follows:
In use: Space that is currently storing data.
Free: Space that is not storing data or reserved for any purpose.
Reserved: Space that is reserved for some purpose (may include reserved space that is storing data and space that is not storing data).
Unused Space: Sum of free space and space that is reserved but not in use.
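The free pool space guideline given above (the smaller of 5% of pool capacity and 100 GB for each pool member) can be expressed as a short calculation. The following is a minimal sketch in Python. The function names are hypothetical and are not part of SAN HeadQuarters, and capacities are assumed to be expressed in GB.

```python
def min_recommended_free_gb(pool_capacity_gb: float, member_count: int) -> float:
    """Return the smaller of 5% of pool capacity and 100 GB per pool member."""
    five_percent = pool_capacity_gb * 5 / 100
    per_member = 100.0 * member_count
    return min(five_percent, per_member)


def free_space_is_low(free_gb: float, pool_capacity_gb: float, member_count: int) -> bool:
    """True when pool free space has fallen below the recommended minimum."""
    return free_gb < min_recommended_free_gb(pool_capacity_gb, member_count)


# A 10,000 GB pool with 3 members: 5% is 500 GB, 100 GB per member is 300 GB.
print(min_recommended_free_gb(10000, 3))  # 300.0
```

For small pools the 5% term is the smaller value, and for large pools with few members the per-member term is the smaller value.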
I/O Terms
In general, I/O data measures traffic between iSCSI initiators and iSCSI targets (volumes and snapshots) in the group. I/O data is provided for reads and writes (total I/O), only reads, and only writes. The data represents the average for the polling period. The SAN HeadQuarters GUI provides the following I/O statistics:
Average I/O rate: Average data transfer rate (also called I/O throughput). This is the average amount of data that is transferred each second. Usually, the I/O rate for reads and writes is not a significant indicator of performance. All storage systems have a maximum throughput capacity. Because most I/O is random and not sequential, storage systems rarely reach this threshold. If the threshold is reached, it indicates a sequential workload.
Note: As data is compressed over time, the I/O rate in the GUI becomes less precise.
Average IOPS: Average number of I/O operations processed each second. The GUI displays data for all Ethernet activity, including iSCSI traffic and SAN HeadQuarters SNMP polling.
Note: As data is compressed over time, the IOPS values shown in the GUI become less precise.
Average latency: Average time required to process and complete an I/O operation. Latency (also called delay) is the best gauge for measuring the storage load and is the principal method for determining if a group has reached its full capabilities. In SAN HeadQuarters, latency is measured from the time the group acknowledges an I/O request to the time the group completes the I/O operation. Latency is measured in milliseconds and is reported as an average for the I/O operations in a polling period. Latency that occurs in the server is not included in the SAN HeadQuarters data.
Note: Increasing the period of time over which latencies are averaged does not make the latency data in the GUI less precise. While some volatility is lost, idle time does not affect the average latency. Therefore, older latency data is still a good indicator of performance. See Identifying Performance Problems on page 1 for information about interpreting latency values.
Average I/O size: Average I/O operation size. The size of the I/O operations can help you obtain a better understanding of your applications and workload. For example, a workload that consists of many small, random I/O operations will have different performance characteristics than a workload with large, sequential I/O operations.
Note: Increasing the period of time over which I/O sizes are averaged does not make the data in the GUI less precise. While some volatility will be lost, idle time does not affect the average I/O size. Therefore, older I/O size data is still a good indicator of the workload.
Distribution of read and write IOPS: Percentage of IOPS that are reads and the percentage of IOPS that are writes in the group. The read and write percentages are not an indicator of performance. However, this information is important when sizing and configuring groups for specific workloads. For example, certain RAID configurations perform better for a certain read/write distribution. In general, RAID 50 and RAID 5 do not perform as well as RAID 10 for workloads consisting predominantly of random writes. If latencies for a specific member suggest a performance problem, and the member performs more than 70% write I/O operations, moving the random write workload to a pool with a RAID 10 member might solve the problem.
Note: Increasing the period of time over which the read/write distribution is averaged does not make the data in the GUI less precise. While some volatility is lost, idle time does not affect the average read/write distribution. In some cases, older read/write distribution data might give a more precise indication of the workload than newer data.
Estimated I/O load: Estimated load, relative to the theoretical maximum capability of the group, pool, or member. The estimated I/O load is based on latencies, IOPS, hardware, and the RAID configuration. The load value is an estimate. Use it only as a general indicator. See Identifying Performance Problems on page 1 for information about interpreting the estimated I/O load.
I/O Load Space Distribution: Shows the amount of space associated with three different levels of I/O load: low, medium, and high. The SSD Space, displayed only on groups with at least one member using tiered storage, indicates the amount of disk space available from solid state drives.
IOPS versus latency: Relationship between average latency and IOPS. A graph on the Combined Graphs windows plots I/O operations each second against the average I/O latency. Each SNMP poll is represented by a circle that shows the IOPS and latency at the time of the poll. This information can help you understand the relationship between IOPS and latency. For example, a high number of IOPS usually means a longer latency time.
Queue depth: Average number of outstanding I/O operations at the start of each incoming I/O operation. SAN HeadQuarters shows the queue depth for each disk drive (raw I/O), volumes (only iSCSI traffic), groups and pools. A queue depth of zero indicates there are no outstanding I/Os. Requirement: A group must be running PS Series Firmware Version 4.2 or a later version to display iSCSI queue depth.
Replication IOPS: The number of write operations processed each second for inbound replication.
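The read/write distribution described in the I/O terms above is a percentage split of read IOPS and write IOPS. The following is a minimal sketch in Python of that calculation, together with a check for the write-heavy case in which more than 70% of I/O operations are writes. The function names and the default threshold parameter are hypothetical and are not part of SAN HeadQuarters.

```python
def read_write_distribution(read_iops: float, write_iops: float) -> tuple:
    """Return (read percentage, write percentage) of total IOPS."""
    total = read_iops + write_iops
    if total == 0:
        # No I/O in the period, so there is no distribution to report.
        return (0.0, 0.0)
    return (100.0 * read_iops / total, 100.0 * write_iops / total)


def is_write_heavy(read_iops: float, write_iops: float, threshold_pct: float = 70.0) -> bool:
    """True when writes exceed the given percentage of all I/O operations."""
    _, write_pct = read_write_distribution(read_iops, write_iops)
    return write_pct > threshold_pct


print(read_write_distribution(300, 700))  # (30.0, 70.0)
print(is_write_heavy(200, 800))           # True
```

A write-heavy result is only a hint. As the text notes, it becomes relevant when latencies for the member also suggest a performance problem.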
Network Terms
In general, network data measures all network traffic, including iSCSI operations, GUI operations, SNMP requests, and replication operations. The SAN HeadQuarters GUI provides the following network statistics:
Active ports: Number of active member network interfaces.
Ethernet port last modified date: Date when the network interface configuration was changed or the member restarted.
iSCSI connections: Connections to iSCSI targets (volumes and snapshots).
Note: Where the number of iSCSI connections appears in the GUI, the value includes connections to volumes and snapshots, in addition to connections due to replication and Microsoft Service operations.
Link speed: Negotiated link speed for all the active network interfaces. Link speed is reported at the half-duplex data transmission rate. Double the rate to obtain the full-duplex rate.
Management network: Whether a dedicated management network is enabled in the group.
Network load: Percentage of the theoretical maximum network bandwidth that is being used for sending I/O or receiving I/O, whichever has the highest value. The theoretical maximum bandwidth is based on the negotiated link speed for all active network interfaces on the group members. The network is rarely a bottleneck in a SAN. Usually, network bandwidth is underutilized, especially with random I/O workloads.
Network rate: Throughput for all Ethernet traffic sent and received, including traffic from iSCSI initiators, mesh traffic, and SNMP requests.
Sent and Received traffic: Average per-second rate of network traffic sent and received.
TCP retransmissions: Retransmitted TCP segment packets. TCP retransmit rates are tracked on each member, but not on each network interface.
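The network load definition above can be illustrated with a short calculation: take the larger of the send and receive rates and divide it by the sum of the negotiated link speeds of the active interfaces. The following is a minimal sketch in Python. The function name is hypothetical and is not part of SAN HeadQuarters, and the sketch assumes that all rates and link speeds are given in bits per second, which the source does not specify.

```python
def network_load_percent(sent_bps: float, received_bps: float, link_speeds_bps: list) -> float:
    """Percentage of theoretical bandwidth used by the busier direction."""
    capacity = sum(link_speeds_bps)
    if capacity <= 0:
        raise ValueError("at least one active interface with a positive link speed is required")
    return 100.0 * max(sent_bps, received_bps) / capacity


# Two active 1 Gb/s interfaces, sending 300 Mb/s and receiving 500 Mb/s.
print(network_load_percent(300_000_000, 500_000_000, [1_000_000_000, 1_000_000_000]))  # 25.0
```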
Polling Status
The SAN HeadQuarters Server regularly uses SNMP to poll a group to obtain data. By default, the polling period (the time between consecutive polling operations) is two minutes. You can capture data in shorter polling intervals, as brief as three seconds, by using the Live View feature. See Displaying Live Data on page 56. If a group is busy processing I/O operations, it might drop SNMP requests from the SAN HeadQuarters Server. If the SAN HeadQuarters Server determines that a group is not responding to SNMP requests, it will poll the group less frequently until a poll succeeds. See How Group Performance Affects SNMP Polling on page 65. The SAN HeadQuarters GUI displays the status of the group polling activity in the upper-right corner of each window. Table 5 describes the polling status.
Table 5: Polling Status
Status | Description
Successful | SNMP poll was successful.
Increasing polling period | Group performance or a network problem prevented the group from responding to SNMP requests in a timely manner. SAN HeadQuarters doubles the time between consecutive polls (the default polling period is two minutes) until the group responds to SNMP requests. When the workload returns to normal or the network problems resolve, it will decrease the poll period by half at each interval until it returns to the default.
Failed | SAN HeadQuarters cannot contact the group. In this case, an alert describing the problem will be generated.
Member rebooted | A member rebooted. A successful SNMP poll is required to obtain up-to-date group information.
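The doubling and halving of the polling period described in Table 5 can be sketched as a small state update. The following Python sketch is illustrative only. The function name is hypothetical, and it assumes that there is no upper limit on the polling period, which the source does not state.

```python
DEFAULT_POLL_SECONDS = 120  # default polling period of two minutes


def next_poll_period(current_seconds: int, poll_succeeded: bool) -> int:
    """Double the period after a failed poll; halve it after a success, down to the default."""
    if not poll_succeeded:
        return current_seconds * 2
    return max(DEFAULT_POLL_SECONDS, current_seconds // 2)


# Two missed polls followed by three successful polls.
period = DEFAULT_POLL_SECONDS
history = []
for succeeded in (False, False, True, True, True):
    period = next_poll_period(period, succeeded)
    history.append(period)
print(history)  # [240, 480, 240, 120, 120]
```

Backing off in this way reduces SNMP load on a busy group, at the cost of coarser data while the group is under load.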
Reported Alerts
Alerts enable you to be quickly informed of problems so you can diagnose and correct them. SAN HeadQuarters displays two types of alerts:
Performance-related alerts detected by SAN HeadQuarters (for example, low free pool space or high latency): Some alerts have an increasing priority, as the condition increases in severity. Table 6 shows a list of SAN HeadQuarters alerts.
Hardware alarms detected by the group (for example, high temperature or a failed control module): Hardware alarms depend on the PS Series firmware version and also the member hardware. See the Group Administration Guide for a list of hardware alarms.
In some cases, a SAN HeadQuarters alert and a hardware alarm may be generated for the same group event. Optionally, you can configure e-mail notification for alerts. See Configuring E-Mail Notification for Group Alerts on page 49.
Alert Priorities
Alerts reported by SAN HeadQuarters have the following priorities:
Informational: Normal, operational events in the group that do not require any administrator action. Informational alerts are unique to SAN HeadQuarters.
Caution: Low-level conditions that, if not addressed, might lead to performance issues or undesired results. Caution alerts are unique to SAN HeadQuarters.
Warning: Conditions that will affect group operation unless addressed immediately. Warning alerts correspond to Warning events and Warning alarms in the group.
Critical: Serious problem that is currently affecting group operation. Critical alerts correspond to Error events and Critical alarms in the group.
Note: If an issue that generates an alert is resolved within a data polling period, e-mail notification will not occur.
Displaying Alerts
Alerts appear in the bottom panel of the All Groups window (only active alerts) or in the Alerts panel at the bottom of the GUI windows. To display Alerts:
1. Open the Alerts panel and click the Alerts tab (Figure 14). For each alert, the panel shows:
Alert priority. See Alert Priorities on page 73.
Date and time of the SNMP poll that detected the alert.
How long the alert has been active.
Status of the alert (whether the alert is Active or Cleared).
Alert description.
Note: SAN HeadQuarters displays alerts that occurred within the selected time period. To
display the latest alerts, select Show latest in the GUI window. The Alerts panel displays in the lower pane of the window. Figure 14: Alerts Panel
Exporting Alerts
You can export alerts to an .xls file. To export alerts: 1. Click the Export Alerts icon in the Alerts panel. 2. Enter a file name. 3. Click Save.
Table 6 shows a list of SAN HeadQuarters performance alerts. See the Group Administration Guide for a list of hardware alarms.
Connection failure Controller failover Controller information not in SNMP poll Controller not detected Disk information not in SNMP poll
Critical Caution
Caution
Warning
Caution
Critical Warning
Warning
Warning Warning
Firmware not recommended Warning Group firmware and Critical member disk incompatible ICMP ping to group failed Incomplete SNMP poll Critical Caution
Warning
A SAN HeadQuarters Server SNMP request to the group failed. A member's active control module failed, resulting in failover to the secondary control module. SNMP requests for control module information timed out, due to the group workload. Therefore, the number of control modules reported might not be accurate. A member control module has failed or is not installed. Some SNMP requests for member disk drive information timed out, due to the group workload. The number of reported drives might not be accurate. The DNS name for the group network address cannot be resolved by the server running the SAN HeadQuarters Server. Due to detected drive problems, a member is copying data to a spare. A member is missing a disk drive, based on the model number, the number of disk slots, and the standard configurations (14, 16, 24, or 48 disk drives). A disk has a status other than online or spare and requires administrator attention. A disk drive failed. A group member is running a supported firmware version that is not recommended in this case. A group's firmware is incompatible with the disk's firmware. The SAN HeadQuarters Server was unable to contact a group, so it cancelled the poll. SNMP request was not complete, making the poll unusable. A pool's free space is less than the recommended value. Dell recommends that free pool space does not fall below the following, whichever is smaller: 5% of the total pool space, or 100 GB multiplied by the number of pool members. Replication failed because the partner reached the maximum number of replicas and snapshots.
Alert | Priority | Description
Member added | | A member has been added to the group.
Member controller reboot | | A member's control module rebooted.
Member disk added | | A disk has been added to a member.
Member disk protocol mismatch | Critical | A member contains an unsupported combination of SATA-II and SAS disks.
Member disk removed | Caution | A disk has been removed from a member.
Member disk firmware out of date | Critical | A disk has been detected with out of date firmware.
Member firmware upgrade | Caution | A member has been upgraded with new firmware.
Member firmware upgrade reboot pending | Informational | The firmware on a control module has been upgraded.
Member health status | Warning, critical | A health condition exists, likely related to a hardware failure.
Member mixed firmware | Warning | A member has different PS Series firmware versions running on its control modules.
Member network port failure | Warning | A network port on a member failed.
Member network port load | Warning, critical | Send, receive, or both send and receive traffic for the network interface is approaching the caution (80% load), warning (90% load), or critical threshold (99% load).
Member network port unreachable | Warning | An SNMP request to a member's network port failed.
Member offline | Critical | A member is offline.
Member RAID status changed | Caution | A member's RAID status has changed.
Member removed | Caution | A member was removed from the group.
Member status | Caution | A member's status changed, such as from online to offline.
Member unconfigured RAID | Warning | A member's RAID policy is not set.
Monitor disk firmware out of date | Critical | A disk was detected with out-of-date firmware. Contact your Dell EqualLogic support provider.
Network down | Critical | The network connected to the server running the SAN HeadQuarters Server is down.
Port at reduced speed | Caution | A network interface, other than one dedicated to a management network, is connected to a network device with a speed of less than 1 GB.
RAID set verifying | Caution | A member's RAID set is verifying parity data.
Replica reserve resize failure | Warning | A replication operation failed because the volume's replica reserve cannot increase.
Replication authentication failure | Warning | The replication failed because the mutual authentication passwords on the group do not match the passwords on a partner.
Replication failure | Warning | A replication operation has failed.
Replication partner disallow downgrades | Warning | The replication failed because the secondary group does not have downgrades disallowed.
Replication partner not available | Warning | The replication failed because the partner could not be reached.
Snapshot reserve | Warning | The in-use snapshot reserve exceeds the warning level set in the Group Manager.
SNMP poll connection restored | Informational | A previously-failed SNMP poll connection is now successful.
TCP retransmit | Caution, warning, critical | A member's ratio of TCP retransmits to sent packets is too high, indicating a network problem. The threshold ranges from 1% (caution), to 5% (warning) to 10% (critical).
 | Warning, critical | A thin provisioned volume's in-use space exceeds the warning limit set in the Group Manager.
Volume RAID preference | Caution, warning | There is a problem with allocating volume space according to the desired RAID preference. For example: A volume cannot be moved to a member with the volume's preferred RAID level because there is insufficient member free space. The volume's preferred RAID level is oversubscribed.
Volume replication partner needs upgrade | Warning | Replication failed because the partner is not running the correct firmware and must be upgraded.
Volume replication partner paused | Caution | Replication of all volumes to a partner was paused.
Volume replication paused | Caution | Replication of a volume was paused from the primary group.
Volume replication remote paused | Caution | Replication of a volume was paused from the secondary group.
Volume replication space | Caution, warning | A volume's replication reserve space is insufficient: When a volume is actively borrowing free space for replication operations. If local replication reserve space falls below 20%. If the remote replication reserve space, as detected by the remote site, is invalid or low.
Unexpected exception | Critical | The SAN HeadQuarters Server has encountered an unexpected exception when handling SNMP data from a group.
78
Alert
Priority
Description
Unreachable network
Critical
Unsupported firmware
Critical
The computer running the SAN HeadQuarters Server cannot find the group network, due to routing problems. The SANHeadQuarters Server is unable to poll a group because the group firmware is not supported by SANHeadQuarters.
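The percentage thresholds quoted in the alert table (1%, 5%, and 10% for TCP retransmits; 80%, 90%, and 99% for network port load) can be illustrated with a short Python sketch. This is not how SAN HeadQuarters is implemented. The function names are hypothetical, and treating each threshold as an inclusive lower bound is an assumption, since the guide does not say whether the comparison is "greater than" or "greater than or equal to".

```python
from typing import List, Optional, Tuple

# Thresholds as quoted in the alert table: (lower bound in percent, priority),
# ordered from most to least severe so the first match wins.
TCP_RETRANSMIT_THRESHOLDS: List[Tuple[float, str]] = [
    (10.0, "critical"), (5.0, "warning"), (1.0, "caution"),
]
PORT_LOAD_THRESHOLDS: List[Tuple[float, str]] = [
    (99.0, "critical"), (90.0, "warning"), (80.0, "caution"),
]


def classify(percent: float, thresholds: List[Tuple[float, str]]) -> Optional[str]:
    """Return the most severe priority whose bound is reached, or None."""
    for bound, priority in thresholds:
        if percent >= bound:  # assumption: bounds are inclusive
            return priority
    return None


def tcp_retransmit_priority(retransmits: int, sent_packets: int) -> Optional[str]:
    """Classify the ratio of TCP retransmits to sent packets."""
    if sent_packets <= 0:
        return None  # no traffic, so no ratio to evaluate
    return classify(100.0 * retransmits / sent_packets, TCP_RETRANSMIT_THRESHOLDS)
```

For example, 6 retransmits out of 100 sent packets is a 6% ratio, which this sketch places in the warning band.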
Syslog Events
The SAN HeadQuarters Server includes a syslog server. If you use the Group Manager to configure a monitored PS Series group to log events to the SAN HeadQuarters Server syslog server, the SAN HeadQuarters GUI displays the events.
Note: SAN HeadQuarters must successfully poll a group before the group can log events to the syslog server.
Event Priorities
Table 7 lists event priorities in order of lowest (least severe) to highest (most severe) priority.
Table 7: Event Priorities
Priority Description
Informational message: Indicates an operational or transitional event that requires no action.
Potential problem: Can become an event with Error priority if administrator intervention does not occur.
Serious failure: Identify and correct the problem as soon as possible.
Catastrophic failure: Identify and correct the problem immediately.
Displaying Events
There are two ways to display events that a group logs to the SAN HeadQuarters Server's syslog server. From the SAN HeadQuarters GUI:
- Click the Events tab at the bottom of the GUI window to open the Events panel (Figure 15).
- Click Events/Audit Logs in the Servers and Groups tree in the left panel to display the Events and Audit Logs window (Figure 16). By default, the Show All button is selected. To show only event logs, select the Show Event Logs only button.
If there are no events to display, instructions are provided to verify that the group is properly configured to send events and audit logs to the syslog server on the SAN HeadQuarters server. Each event message includes the following information:
- Event priority (see Event Priorities on page 79).
- Date and time that the syslog server received the event from the group.
- Member on which the event occurred.
- Description of the event.
Click a column heading to sort according to the column data. Events that appear in the SAN HeadQuarters GUI can include events that occurred after the most recent poll or while the group was not responding to SNMP requests. SAN HeadQuarters displays events that occurred within the selected time period. To display the latest events, select Show latest in the GUI window.
Figure 15: Events Panel
Searching Events
You can search the event log for events that include a specific word, words, or text string. You can also use the Filter Editor for advanced search capabilities. To display events that include a specific word, words, or text string, in the Events panel (Figure 15) or the Events window (Figure 16):
1. Enter the text in the search field and click Search. Click Clear to return to the original event display.
2. For advanced search capabilities, click Filter Editor. The Filter Editor dialog box appears. The Filter Editor enables you to set up a complex search algorithm:
   - Click the first field (defaults to Message) to select what you want to search (message text, priority, member, or time detected).
   - Click the second field (defaults to Begins with) to select the search parameters. For example, you can specify that you want to match text or exclude text.
   - Click the <enter a value> field and specify the search string. You can select text in the Message column and copy it to the search field.
   - Click And to add additional search criteria.
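The way Filter Editor criteria combine (a field, a comparison, a value, with further criteria joined by And) can be pictured with the following Python sketch. It is only a conceptual model, not the product's code. The `Event` and `Criterion` classes, every operator name other than "begins with", and the case-insensitive matching are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Event:
    priority: str
    member: str
    message: str


# Hypothetical comparison operators. Only "Begins with" is named in the guide;
# "contains" and "does not contain" stand in for "match text or exclude text".
OPERATORS: Dict[str, Callable[[str, str], bool]] = {
    "begins with": lambda text, value: text.lower().startswith(value.lower()),
    "contains": lambda text, value: value.lower() in text.lower(),
    "does not contain": lambda text, value: value.lower() not in text.lower(),
}


@dataclass
class Criterion:
    field: str      # "message", "priority", or "member"
    operator: str   # a key of OPERATORS
    value: str

    def matches(self, event: Event) -> bool:
        return OPERATORS[self.operator](getattr(event, self.field), self.value)


def filter_events(events: List[Event], criteria: List[Criterion]) -> List[Event]:
    """Keep only events that satisfy every criterion (logical AND)."""
    return [e for e in events if all(c.matches(e) for c in criteria)]
```

Adding a second criterion narrows the result, which mirrors clicking And in the dialog box: every criterion must hold for an event to be displayed.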
Exporting Events
You can export the event log to an .xls file. In the Events panel or Events window:
1. Click the Export Event Log icon.
2. Enter a file name.
3. Click Save.
Audit Messages
Audit messages are syslog events about administrator actions. They provide a historical reference to actions such as logging in, logging out, creating a volume, setting up replication, and so on. The SAN HeadQuarters Server includes a syslog server. If you use the Group Manager to configure a monitored PS Series group to send audit logs to the SAN HeadQuarters Server syslog server, the SAN HeadQuarters GUI displays the information.
If there are no audit logs to display, instructions are provided to verify that the group is properly configured to send events and audit logs to the syslog server on the SAN HeadQuarters server. Each audit message includes the following information:
- Account to which the audit message pertains.
- Date and time that the syslog server received the audit message from the group. Click the column heading arrow to sort ascending or descending by date.
- A description of the event that occurred at the time the audit message was received. Click the small icon in the upper right corner of the Message column header to view the message details.
Events that appear in the SAN HeadQuarters GUI can include events that occurred after the most recent poll or while the group was not responding to SNMP requests. SAN HeadQuarters displays events that occurred within the selected time period. To display the latest events, select Show latest in the GUI window.
Figure 17: Audit Log Panel
Searching Audits
You can search the audit log for audit messages containing a specific word, words, or text string. You can also use the Filter Editor for advanced search capabilities. To display audits that include a specific word, words, or text string, in the Audit log panel (Figure 17) or the Audit window (Figure 18):
1. Enter the text in the search field and click Search. Click Clear to return to the original audit display.
2. For advanced search capabilities, click Filter Editor. The Filter Editor dialog box appears. The Filter Editor enables you to set up a complex search algorithm:
   - Click the first field (defaults to Message) to select what you want to search (message text, priority, member, or time detected).
   - Click the second field (defaults to Begins with) to select the search parameters. For example, you can specify that you want to match text or exclude text.
   - Click the <enter a value> field and specify the search string. You can select text in the Message column and copy it to the search field, if desired.
   - Click And to add additional search criteria.
For an example of how the Filter Editor works, see Searching Events on page 81.
Note: When displaying group data, make sure you select the correct time range. By default, GUI graphs display data from the most recent eight-hour time period and GUI tables display data from the most recent poll. Use the Zoom links above the timeline to quickly set the value of the time range selector and also to control the range of dates seen in the timeline. For example, click Show Latest to show data up to the most recent time. See Displaying Data from Different Times on page 29.
SAN HeadQuarters is a good tool for determining if a performance problem is the result of a hardware failure in a group. SAN HeadQuarters also provides information that can indicate a performance problem in the storage environment (for example, if the workload exceeds the capability of the group). However, SAN HeadQuarters tracks only a portion of the storage stack through which an I/O operation must pass, starting with the application I/O request and ending with the data retrieved from the group. Latencies reported by SAN HeadQuarters do not include latencies that occur in the server. To fully diagnose non-group problems, you must use additional tools.
Contact your PS Series support provider or your application support provider for more information about characterizing your application storage utilization.
the statistics in the SAN HeadQuarters GUI appear to indicate a problem, make sure it is not a temporary condition.
2. Monitor the GUI for hardware problems: Failed hardware is a common source of performance problems. See Identifying Hardware Problems on page 88. After you fix a hardware problem, allow time for SAN HeadQuarters to collect new data before analyzing the data. Performance data collected while a hardware failure exists can be regarded as abnormal.
3. Monitor the GUI for common indicators of performance problems: If you are sure there are no hardware failures, check for statistics that might indicate a performance problem. See Identifying Performance Problems on page 1. Be aware that the performance data is subjective and depends on the performance characteristics of your applications.
4. Continue to monitor the group regularly: If you have configured e-mail notification, the computer running the SAN HeadQuarters Server will generate a message when an alert related to a hardware failure or a performance problem occurs.
Figure 19 describes the process for analyzing SAN HeadQuarters data.
Figure 19: Analyzing SAN HeadQuarters Data
workload that may not resemble the actual group workload, the data should not be used as the sole measure of group performance. The Experimental Analysis window also provides run-time group performance data, so you can compare the estimates to actual data. Always consider latency when examining estimated performance data.
Displaying the Experimental Analysis Window
1. Select a group in the Servers and Groups tree in the left panel.
2. Click I/O, then Experimental Analysis.
Experimental Analysis Data
SAN HeadQuarters collects the following information from the group:
- Current hardware configuration (including RAID level, controller type, and disk type)
- Current distribution of reads and writes (that is, the percentage of IOPS that are reads and the percentage of IOPS that are writes)
SAN HeadQuarters then calculates the performance estimates, based on the previous data and a workload with the following IOPS characteristics:
- Small (8 KB)
- Random
The Experimental Analysis window (see Figure 20) provides the following group performance estimates:
- Estimated IOPS Workload: For the selected time range, the Estimated IOPS Workload graph shows the percentage of how much work (IOPS) the group is performing, based on the estimated maximum number of IOPS the group can perform (estimated maximum IOPS) and the actual number of IOPS performed by the group. The Estimated IOPS Workload table (left of the graph) shows how much work (IOPS) the group is performing, averaged over the time range. For example, if the estimate is 50%, the group is performing half the work SAN HeadQuarters estimates that the group can perform. This estimate is based on a workload consisting of small, random IOPS and the group hardware configuration and read/write distribution. The Estimated IOPS Workload Percentage is never more than 100%, even if the group is performing at more than 100% of the estimated maximum number of IOPS.
- Estimated Maximum IOPS: For the selected time range, the Estimated Maximum IOPS graph shows the estimated maximum number of IOPS the group can perform, based on a workload consisting of small, random IOPS and the group hardware configuration and read/write distribution. So that you can compare the estimated data with run-time data, the graph also shows the actual number of IOPS (reads and writes) performed by the group. Because the estimated maximum IOPS data is based on the group hardware configuration and read/write distribution, the estimated data will usually track the actual number of I/O operations in the group.
Note: The SAN HeadQuarters client may show an estimated IOPS value of zero and display a notification stating, "SAN HQ Error: Invalid configuration is detected. Estimated max IOPS cannot be calculated." This typically indicates that the version you are running predates a newer drive type in your PS Series array. Dell recommends that you upgrade to the latest release of SAN HeadQuarters. For general guidelines on upgrading SAN HeadQuarters software, see Upgrading SAN HeadQuarters on page 13.
SAN HeadQuarters calculates the estimated maximum group IOPS when there are no disk drive failures in the group (orange line in the graph) and also when at least one RAID set is in a degraded state (brown line in the graph). This information is useful for understanding the performance impact of a disk failure. The degraded estimate is based on a drive failure in a RAID set that would result in the greatest performance impact. The Estimated Maximum IOPS table (left of the graph) shows the estimated maximum number of IOPS (under non-failure and degraded RAID set conditions), averaged over the selected time range. The degraded estimate does not include the performance impact that might occur during RAID reconstruction (for example, when the array is reconstructing data from parity on a spare drive).
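The relationship between actual IOPS, estimated maximum IOPS, and the capped workload percentage can be written out as a small Python sketch. The function name is hypothetical, and raising an error for a non-positive estimate is a choice made for this sketch; the guide only states that the client shows a value of zero when the estimate cannot be calculated.

```python
def estimated_workload_percent(actual_iops: float, estimated_max_iops: float) -> float:
    """Actual IOPS as a percentage of the estimated maximum, capped at 100."""
    if estimated_max_iops <= 0:
        # Sketch-only behavior for an estimate that could not be calculated.
        raise ValueError("estimated maximum IOPS is not available")
    return min(100.0, 100.0 * actual_iops / estimated_max_iops)
```

Because of the cap, a group running at two or three times its estimated maximum still reports 100%, so the percentage alone cannot show how far past the estimate a workload is. Compare the actual and estimated IOPS lines in the graph for that.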
Examples of Interpreting Estimated Performance Data
- Estimated IOPS Workload Percentage is below 50% and no performance issues. If the run-time group data does not indicate a performance problem (that is, latencies are low, applications complete on time, and user response time is adequate), then you can assume that the group can handle an increase in workload without a performance degradation.
- Estimated IOPS Workload Percentage is more than 80% and no performance issues. If the run-time group data does not indicate a performance problem (that is, latencies are low, applications complete on time, and user response time is adequate), and your workload consists of mainly small, random I/O operations, you may be nearing the limit of the group. You may want to consider decreasing the load on the group or adding additional hardware or arrays.
- Estimated IOPS Workload Percentage is more than 80% and performance issues exist. If the run-time group data indicates a performance problem (for example, high latencies or high queue depth, applications do not complete on time, or response time is slow), you probably have reached the limit of the group. You should immediately consider decreasing the load on the group or adding additional hardware or arrays.
- Estimated IOPS Workload Percentage is less than 50% and performance issues exist. If the run-time group data indicates a performance problem (for example, high latencies or high queue depth, applications do not complete on time, or response time is slow), the group workload probably does not consist of mainly small, random I/O operations. The data might indicate one of the following:
   - You reached the limit of the group: You should immediately consider decreasing the load on the group or adding additional hardware or arrays.
   - One or more members have degraded RAID sets: Replace failed drives as soon as possible.
   - Network problems exist: Correct the network problems immediately.
   - A server has reached its maximum capabilities: Consider increasing the I/O capabilities of the server (for example, install additional network interfaces and configure multipathing).
   - Member hardware problems exist: Replace any failed hardware and ensure that you configure all the network interfaces on all group members.
As these examples show, estimated data must be used in conjunction with run-time group data to obtain an accurate and comprehensive understanding of group performance. For run-time group data examples, see Examples of Interpreting Performance Data on page 92.
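The four cases above amount to a two-way lookup on the workload percentage and on whether run-time data shows a performance problem. The following Python sketch restates them. The function name and return strings are illustrative only, and percentages from 50% to 80% return a neutral answer because the examples do not cover that range.

```python
def interpret_estimate(workload_percent: float, performance_issues: bool) -> str:
    """Map the guide's four example cases to a short suggestion."""
    if workload_percent > 80:
        if performance_issues:
            return "Probably at the group limit: reduce load or add hardware now."
        return "Possibly nearing the group limit: consider reducing load or adding hardware."
    if workload_percent < 50:
        if performance_issues:
            return ("Workload is probably not small and random: check group limit, "
                    "RAID sets, network, servers, and member hardware.")
        return "Group can probably handle an increased workload."
    # 50% to 80% is not described by the examples.
    return "Not covered by the examples: rely on run-time group data."
```

Whether performance issues exist is an input here, not something derived from the estimate. You judge it from latency, queue depth, and application behavior, which is why estimated data has to be read together with run-time data.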
Figure 22: Experimental Analysis Window for Adequately Performing Group with Excess Capability
Table 8: Performance Data for Adequately Performing Group with Excess Capability
Data | Description
Latency | Good (less than 15 ms)
IOPS | Good (performing approximately 50% of the estimated maximum IOPS)
I/O Size | Typical (approximately 50 KB for reads and 6 KB for writes)
I/O Rate | Low (approximately 5 to 15 MB/sec)
Overall Assessment | Adequate
Expansion Capability | Might be able to increase the workload by 25%.
Queue Depth | Typical range
Example 1 shows a group that is performing well and is within its capabilities. The latencies are all below 20 ms, which is desirable. The reported number of IOPS is about half the maximum IOPS (small, random) that SAN HeadQuarters estimates the group can easily perform. The I/O sizes are typical, so there are no special circumstances to consider (such as very large I/O sizes). The I/O rate (throughput) is low. The workload appears to be reasonably static (and thus predictable). This group probably could handle an increase in the I/O workload of at least 25% before performance problems might develop.
Latency | Good (less than 4 ms)
IOPS | Very low (approximately 15% of estimated maximum IOPS)
I/O Size | Typical (approximately 40 KB for reads and 8 KB for writes)
I/O Rate | Very low (average of 2.5 MB/sec)
Overall Assessment | Group is mainly idle
Expansion Capability | Can support an increase in workload
Queue Depth | Low
Example 2 shows a group that is mainly idle. The very low latency and low IOPS values indicate that this group can handle a larger workload. However, since the current group workload is so low, it is difficult to determine how large a workload increase the group can handle. Increase the workload gradually and evaluate the group performance after each increase.
Figure 26: Experimental Analysis Window for a Group that Might Be Near Full Capability
Table 10: Performance Data for a Group that Might Be Near Full Capability
Data | Description
Latency | Good (less than 20 ms)
IOPS | High (at least 100% of the estimated maximum IOPS)
I/O Size | Typical (approximately 40 KB for reads and 32 KB for writes)
I/O Rate | Low (average 30 to 40 MB/sec)
Overall Assessment | Adequate
Expansion Capability | Only increase the workload gradually and with caution
Queue Depth | Moderate
Example 3 shows some contradictory information. The latencies are low (less than 20 ms). However, the IOPS are at or above the maximum estimated IOPS (small, random) for the group. Because the I/O size is small (less than 64 KB), the workload might be sequential, instead of random. Alternatively, the group might be benefiting from a high level of control module cache hits.
Because the latencies are low, the group currently appears to be performing well. However, an increase in the workload may result in performance degradation.
Figure 28: Experimental Analysis Window for a Busy Group that is Likely Near Full Capability
Figure 29: Network Window for a Busy Group that is Likely Near Full Capability
Table 11: Performance Data for a Busy Group that is Likely Near Full Capability
Data | Description
Latency | Cautionary (above 20 ms, but less than 50 ms)
IOPS | High (approximately 100% of the estimated maximum IOPS)
I/O Size | Smaller than typical (approximately 12 KB for reads and 10 KB for writes)
I/O Rate | Low (less than 10 MB/sec)
Network Load | Low (less than 2%)
Overall Assessment | Busy
Expansion Capability | Increasing the workload will likely result in a performance degradation.
Example 4 shows that a group can have a high I/O load and a low network load at the same time. The network is rarely a bottleneck in a SAN. Usually, network bandwidth is underutilized, especially with random I/O workloads.
This group has a heavy load, consisting of highly random, small reads and writes. Yet, only a fraction of the network is being utilized. While this is an extreme example, the concept is true for most groups.
Figure 31: Experimental Analysis Window for a Group With High Latencies that is Likely Near Full Capability
Table 12: Performance Data for a Group With High Latencies that is Likely Near Full Capability
Data | Description
Latency | High (20 to 60 ms, sustained)
IOPS | Very high (above 100% of the estimated maximum IOPS)
I/O Size | Typical (approximately 62 KB for reads and 18 KB for writes)
I/O Rate | Low (average 45 MB/sec)
Overall Assessment | Very busy
Expansion Capability | Increasing the workload will likely result in a performance degradation.
Queue Depth | Moderate to high
Example 5 shows a busy group with no excess capacity for expansion. The high latencies, sustained over an eight-hour period, indicate that group performance is troublesome. While brief peaks of high latency are acceptable, high sustained latencies will generally have a negative impact on application performance. In addition, the number of IOPS in this example is two to three times the estimated maximum IOPS. This is likely because the workload is sequential, instead of random. Alternatively, the group may be benefiting from a high level of control module cache hits.
Example 6: Group with Many Small Writes but Some Large Reads
Figure 32 shows the I/O Group window for a group that is performing many small write operations but some large read operations. Table 13 describes the relevant data.
Figure 32: I/O Window for a Group with Many Small Writes but Some Large Reads
Table 13: Performance Data for a Group with Many Small Writes but Some Large Reads
Data | Description
Read/Write Distribution | Approximately 32% reads and 68% writes
Read I/O Size | Average 70 KB
Write I/O Size | Average 3 KB
Read IOPS | Average 170
Write IOPS | Average 350
Read I/O Rate | Average 12 MB/sec
Write I/O Rate | Average 1 MB/sec
Queue Depth | Moderate
Example 6 shows how the read/write distribution is important for understanding performance statistics. The read/write distribution is the percentage of read IOPS and the percentage of write IOPS, based on the overall number of IOPS. The data in Example 6 indicates a workload that consists mainly of write operations (68% writes). However, these are small writes (average 3 KB). Because the read operations are large (average 70 KB), compared to the writes, most of the I/O throughput is read data (average read I/O rate of 12 MB/sec).
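The figures in Table 13 can be checked with simple arithmetic: the distribution is each IOPS count divided by the total, and the I/O rate is IOPS multiplied by the average I/O size. The Python sketch below does this. The function name is hypothetical, and the conversion of 1 MB = 1000 KB is an assumption, since the guide does not state which convention it uses.

```python
def read_write_summary(read_iops: float, write_iops: float,
                       read_kb: float, write_kb: float) -> dict:
    """Derive read/write distribution (%) and approximate I/O rates (MB/sec)."""
    total_iops = read_iops + write_iops
    return {
        "read_pct": 100.0 * read_iops / total_iops,
        "write_pct": 100.0 * write_iops / total_iops,
        # Assumption: 1 MB = 1000 KB.
        "read_mb_per_sec": read_iops * read_kb / 1000.0,
        "write_mb_per_sec": write_iops * write_kb / 1000.0,
    }


# Averages from Table 13: 170 read IOPS at 70 KB, 350 write IOPS at 3 KB.
summary = read_write_summary(170, 350, 70, 3)
```

With the Table 13 averages, writes make up roughly two thirds of the operations (350 of 520), yet reads account for about 11.9 MB/sec against about 1.05 MB/sec for writes, which is consistent with the 12 MB/sec and 1 MB/sec in the table.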
6. Modify the group configuration: You might want to change the RAID policy for a member or reassign volumes to different pools. See Group Configuration Recommendations on page 107.
7. Add more SAN hardware: This best practice can include adding more members to the group, upgrading disks to a higher capacity or speed, or installing disks of a different type.
You can preserve data at the current time or use a command line to perform the task.
Report Types
You can create the following report types for the selected groups:
- Configuration report: Includes information on the overall group configuration.
- Capacity report: Includes information about group, member, and volume capacity and space usage.
- Thin provisioned volumes report: Includes information about thin provisioned volumes.
- Replication report: Includes information about inbound and outbound replication activity.
- Replication report across groups: Includes information about inbound and outbound replication activity across multiple selected groups.
- Performance report: Includes group, member, and network port I/O performance information.
- Host Connections report: Includes information about host session iSCSI connections to group targets.
- Hardware and Firmware report: Includes information about the hardware and firmware on the group members.
- Top 10 report: Includes the ten volumes with the largest size (capacity), highest IOPS, and highest number of iSCSI connections from initiators.
- Top 10 report across groups: Includes the ten volumes with the largest size (capacity), highest IOPS, and highest number of iSCSI connections from initiators across multiple selected groups.
- Alerts report: Contains a combined list of all alerts within a selected range for selected groups. The information in the Alerts report is similar to the Alerts pane on a Server's All Groups page and the Alerts tab on the Summary of Group page. For a given group, the Alerts report information includes the priority level of the alert (informational, caution, warning, or critical), time and date of the poll in which the alert was detected, the duration of the alert, the status of the alert (active or cleared), and the alert message text.
- Group diagnostics report: Analyzes all selected groups for possible performance and configuration issues (for example, group members with mismatched firmware or incompatible RAID policies). For information about the data analyzed by the group diagnostics report, see Group Diagnostics Report Data on page 115.
E-mail settings for distributing the report, including the SMTP server and port, e-mail addresses to receive the report, e-mail address for the From field, and the text for the Subject field.
Optionally, you can automatically e-mail the report once it is generated. In the Report Wizard E-Mail Settings dialog box, select Automatically e-mail report as an attachment after generation and enter the information in the fields. If you configured e-mail notification for alerts, the fields will already contain data. You can edit the fields, as needed.
3. After you click Generate Command Line in the Report Wizard Report Generation Ready dialog box, the Report Generator XML Viewer window appears (Figure 33), displaying the contents of the XML file. You can edit this file, if necessary, as described in Modifying an XML File for Creating Reports on page 113.
4. Once you are satisfied with the contents of the XML file, click Save in the Report Generator XML Viewer window and specify a file name for the XML file.
Figure 33: Report Generator XML Viewer
5. To generate the report, go to the SAN HeadQuarters installation directory and use the command format shown next. Be sure to specify the full path for the XML file.
SANHQClient.exe ReportSettingsFile="xml_file_name"
For example:
> SANHQClient.exe ReportSettingsFile="C:\SANReport_XML.xml"
Scheduling Report Creation
If you want to generate reports regularly, use Schedule Tasks in the Windows Control Panel to specify a schedule for running the following command:
SANHQ_install_directory\SANHQClient.exe -ReportSettingsFile=XML_file_name
The XML_file_name variable specifies the name of the XML file you created in Using a Command to Create a Report on page 111.
Modifying an XML File for Creating Reports
You can modify an XML file that is used to create group reports. You can modify:
- Report file name
- Report type
- Report style
- Date and time. The time in the XML file is specified in hours, unless you want the latest data (specify span = "0") or a time range. To specify a time range, specify the beginning and end dates. For example:
  begin = "7/15/2010 2:35:31 PM" end = "7/29/2010 2:35:31 PM"
- Groups to include in the report (identified by group name)
- E-mail information for the report
The following sample XML file creates a Configuration Report (in PDF format) from data gathered one day (24 hours) ago for one group:
<SanHQReport>
  <ReportFile> C:\Documents and Settings\sample_user\My Documents\SanHQ_Report.pdf </ReportFile>
  <Title> SAN HeadQuarters Configuration Report </Title>
  <Type> Configuration Report </Type>
  <Style> Default </Style>
  <DataType> Point </DataType>
  <GrabNearestPointWithin> 36000000000 </GrabNearestPointWithin>
The following sample XML file creates a Capacity Report (in HTML format) from data averaged over a specific two-week time period for two groups:
<SanHQReport>
  <ReportFile> C:\sample_user\My Documents\Capacity_Report.html </ReportFile>
  <Title> SAN HeadQuarters Capacity Report </Title>
  <Type> Capacity Report </Type>
  <Style> Default </Style>
  <DataType> Point </DataType>
  <GrabNearestPointWithin> 36000000000 </GrabNearestPointWithin>
  <Time begin = "7/15/2010 2:35:31 PM" end = "7/29/2010 2:35:31 PM" />
  <Group> thing </Group>
  <Group> rental17 </Group>
</SanHQReport>
The following sample XML file creates a Top 10 Report (in PDF format) from the latest data for a group and then e-mails the report:
<SanHQReport>
  <ReportFile> C:\sample_user\My Documents\TOP_10_Report.pdf </ReportFile>
  <Title> SAN HeadQuarters TOP 10 Report </Title>
  <Type> Top 10 Report </Type>
  <Style> Default </Style>
  <DataType> Point </DataType>
  <GrabNearestPointWithin> 36000000000 </GrabNearestPointWithin>
  <Time span = "0" />
  <Group> MandaGrp </Group>
  <AutoEmailReport>
    <SmtpServer> 10.20.30.40 </SmtpServer>
    <Sender> grouo@company.com </Sender>
    <SmtpPort> 25 </SmtpPort>
    <SubjectLine> SAN HeadQuarters Top 10 Report </SubjectLine>
    <SendToAddress> me@company.com </SendToAddress>
    <SendToAddress> you@group.com </SendToAddress>
  </AutoEmailReport>
</SanHQReport>
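If you maintain several report settings files, you could generate them with a script instead of editing each one by hand. The following Python sketch builds a file with the same element names as the samples above. It is an illustration under stated assumptions, not a documented interface: the function names are hypothetical, the fixed values for Style, DataType, and GrabNearestPointWithin are copied from the samples without interpreting them, and it is assumed (not confirmed by this guide) that SAN HeadQuarters accepts values without the surrounding spaces shown in the samples.

```python
import xml.etree.ElementTree as ET
from datetime import datetime
from typing import List, Optional


def format_timestamp(dt: datetime) -> str:
    """Format a datetime like the samples, for example 7/15/2010 2:35:31 PM."""
    hour12 = dt.hour % 12 or 12
    meridiem = "AM" if dt.hour < 12 else "PM"
    return f"{dt.month}/{dt.day}/{dt.year} {hour12}:{dt.minute:02d}:{dt.second:02d} {meridiem}"


def build_report_settings(report_file: str, title: str, report_type: str,
                          groups: List[str],
                          begin: Optional[datetime] = None,
                          end: Optional[datetime] = None) -> str:
    """Return report settings XML using element names from the samples."""
    root = ET.Element("SanHQReport")
    ET.SubElement(root, "ReportFile").text = report_file
    ET.SubElement(root, "Title").text = title
    ET.SubElement(root, "Type").text = report_type
    # The next three values are copied verbatim from the sample files.
    ET.SubElement(root, "Style").text = "Default"
    ET.SubElement(root, "DataType").text = "Point"
    ET.SubElement(root, "GrabNearestPointWithin").text = "36000000000"
    time = ET.SubElement(root, "Time")
    if begin is not None and end is not None:
        time.set("begin", format_timestamp(begin))
        time.set("end", format_timestamp(end))
    else:
        time.set("span", "0")  # latest data, as in the Top 10 sample
    for name in groups:
        ET.SubElement(root, "Group").text = name
    return ET.tostring(root, encoding="unicode")
```

Calling `build_report_settings` with the Capacity Report values from the second sample (two groups and the 7/15/2010 to 7/29/2010 range) produces equivalent content. Open the result in the Report Generator XML Viewer, or compare it with a wizard-generated file, before scheduling it.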
Member analysis
- Members in the same pool with mismatched firmware.
- Members in the same pool with mismatched RAID policies.
- Member not online.
- Member with free space approaching low threshold (90%).
Disk analysis
- Member disk is a non-approved disk or a bad disk (offline, failed, or missing status).
- Disk models are mismatched.
- Disk RPMs are mismatched.
- No spare drives.
- Drives known to fail.
- Failed drives.
- Disks known to be bad or firmware loaded on disks is known to be bad.
- Disk firmware and Group firmware are incompatible.
- Disk protocols are mismatched; occurs when an unapproved combination of SATA and SAS drives is found on a member.
Pool analysis
- Pools with high I/O load: Informs you to distribute the load to other pools.
- Pools with number of connections approaching 90% maximum (Firmware Version 4.2 and higher).
- Pools with members offline: Displays the number of members offline.
- Pool's Total, Free, and In-Use capacity within 90% threshold of the collection.
- Pool delegated space at 80% threshold.
Port analysis
- 10 GB Ethernet port operating below maximum link speed of 10 GB.
- 1 GB Ethernet port operating below maximum link speed of 1 GB.
- Port with an admin status of "not up" (i.e., a user-disabled port or a never-enabled port).
- Port with an admin status of "up" (i.e., a user-enabled port) but operationally down (i.e., a disconnected port or unresponsive port).
It can be beneficial to regularly archive group data. SAN HeadQuarters maintains group performance data in log files for up to one year. As data ages, the SAN HeadQuarters Server compresses the data in the log files, which can make some older data less precise than newer data. You can periodically archive data to retain more precise data. See How Data is Compressed in Log Files on page 64. You also might need to archive data if requested by your PS Series support provider. There are two ways to archive group data:
- SAN HeadQuarters GUI: Using the GUI is the easiest method of archiving data. See Using the GUI to Create an Archive File on page 117.
- Command line: Using a command to archive data enables you to schedule archive operations and regularly capture group data. See Using a Command to Create an Archive File on page 117.
The procedure for archiving group data by using a command requires using the SAN HeadQuarters GUI to create an XML file that contains the groups whose data you want to archive. You then run the SAN HeadQuarters executable, specifying the XML file as a parameter. To use a command to archive group data: 1. Pull down the SAN HQ menu and select Create Archive to initiate the Archive Groups wizard. 2. On the first dialog of the wizard, click Next. The Group Selection dialog appears. 3. Select the service (if multiple servers are configured) and group or groups whose data you want to save. You can only archive groups from one service at a time. Click Select all groups to archive data for all the monitored groups. Click Next. The Archive Generation Type dialog appears. 4. Select the Generate Command line Archive Settings file option. 5. Enter a file name for the .grpx archive file. The default name is SANHQ_Archive.grpx. 6. Optionally, check the Trace Files Only box. By selecting this option, the resulting archive will be a compressed archive containing only debug trace files without any group log data. These compressed debug trace files are designed for easy transmission of important diagnostic information. 7. Click Next. The Summary of archive dialog appears. 8. Click Generate Command Line. The Create Archive XML View dialog box appears (Figure 34), displaying the XML file.
9. Examine the XML file. If necessary, you can edit the file, as described in Modifying an XML File for Archiving Data on page 120.
10. Click Save if the file is acceptable and enter the path and file name for the XML file. Click Cancel to cancel the operation.
11. Go to the SAN HeadQuarters installation directory and use the command format shown next. Be sure to specify the full path for the XML file.
SANHQClient.exe ArchiveSettingsFile="xml_file_name"
For example:
> SANHQClient.exe ArchiveSettingsFile="C:\SAN_Archive_XML.xml"
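As a rough sketch, the command line above could be assembled in a script before it is run or handed to a scheduler. The function name archive_command and the installation directory C:\SANHQ are hypothetical. Only SANHQClient.exe and the ArchiveSettingsFile parameter come from this guide. The check mirrors the instruction to give the full path of the XML file.

```python
from pathlib import PureWindowsPath


def archive_command(install_dir, xml_file):
    """Return the archive command line as one string (illustrative helper)."""
    xml_path = PureWindowsPath(xml_file)
    # The guide asks for the full path of the XML file, so reject relative paths.
    if not xml_path.is_absolute():
        raise ValueError(f"XML file path must be a full path: {xml_file}")
    # install_dir is an assumption; use the real SAN HeadQuarters directory.
    exe = PureWindowsPath(install_dir) / "SANHQClient.exe"
    return f'"{exe}" ArchiveSettingsFile="{xml_path}"'


if __name__ == "__main__":
    print(archive_command(r"C:\SANHQ", r"C:\SAN_Archive_XML.xml"))
```

On a Windows host, the returned string can be passed to subprocess.run to start the archive operation. PureWindowsPath is used so that the path check behaves the same way on any platform where the script is prepared.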
To open an archive file, see Opening an Archive File on page 120.

Scheduling Archived Data

If you want to archive data regularly, use Scheduled Tasks in the Windows Control Panel to specify a schedule for running the following command:
SANHQ_install_directory\SANHQClient.exe ArchiveSettingsFile="xml_file_name"
The xml_file_name variable specifies the name of the XML file you created in Using a Command to Create an Archive File on page 117.

Modifying an XML File for Archiving Data

You can modify an XML file that is used to archive group data. For example, you can change the groups (identified by IP address) or the .grpx file name. The following sample XML file archives data gathered for two groups:
<SanHQArchive>
  <Path>C:\Documents and Settings\sample_user\SANHQ_Archive.grpx</Path>
  <Group>10.124.9.144</Group>
  <Group>10.127.14.200</Group>
</SanHQArchive>
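The kind of modification described above can also be scripted. The following is a minimal sketch, assuming the archive settings file contains only the SanHQArchive, Path, and Group elements shown in the sample. The function name update_archive_settings is hypothetical and is not part of SAN HeadQuarters.

```python
import ipaddress
import xml.etree.ElementTree as ET


def update_archive_settings(xml_text, grpx_path=None, groups=None):
    """Return archive settings XML with the .grpx path and/or groups replaced."""
    root = ET.fromstring(xml_text)
    if root.tag != "SanHQArchive":
        raise ValueError("not an archive settings file")
    if grpx_path is not None:
        path = root.find("Path")
        if path is None:
            path = ET.SubElement(root, "Path")
        path.text = grpx_path
    if groups is not None:
        # Groups are identified by IP address; reject malformed addresses
        # before changing anything (ip_address raises ValueError).
        for address in groups:
            ipaddress.ip_address(address)
        for old in root.findall("Group"):
            root.remove(old)
        for address in groups:
            ET.SubElement(root, "Group").text = address
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    sample = (
        "<SanHQArchive><Path>C:\\SANHQ_Archive.grpx</Path>"
        "<Group>10.124.9.144</Group></SanHQArchive>"
    )
    print(update_archive_settings(sample, groups=["10.124.9.144", "10.127.14.200"]))
```

Parsing the file instead of editing it as text keeps the result well-formed, which matters because the file is later passed to SANHQClient.exe unattended.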
You cannot export group data, create an archive, add a group, or change the SAN HeadQuarters settings for a group through an archive.
4. Click Next. The Time Range Selection dialog appears.
5. Select the time range for the data that you want to export, or select Custom to enter a range of dates. Unless you select Custom, the time range will end with the most recent time. For example, if you select Latest 7 days, the exported data will be from the most recent 7-day time period.
6. Click Next. The Data Selection dialog appears.
7. Select the type of data to include. You can include summaries of group, pool, and member data or information about volumes, volume collections, network interfaces, disks, inbound or outbound replication, and replication partners.
8. Click Next. The Export Generation Type dialog appears.
9. Make sure that the Generate Now option is selected.
10. Specify the path and identifier for the .csv file. The actual file name is generated automatically, based on the file name you enter, the selected group names, and the time range.
11. Click Next. The Summary of Export dialog appears.
12. Click Export Now to export the data. Click Cancel to cancel the operation.
4. Click Next. The Time Range Selection dialog appears.
5. Select the time range for the data that you want to export, or select Custom to enter a range of dates. Unless you select Custom, the time range will end with the most recent time. For example, if you select Latest 7 days, the exported data will be from the most recent 7-day time period.
6. Click Next. The Data Selection dialog appears.
7. Select the type of data to include. You can include summaries of group, pool, and member data or information about volumes, volume collections, network interfaces, disks, inbound or outbound replication, and replication partners.
8. Click Next. The Export Generation Type dialog appears.
9. Select the Generate Command line Export Settings file option.
10. Specify the path and identifier for the .csv file. The actual file name is generated automatically, based on the file name you enter, the selected group names, and the time range.
11. Click Next. The Summary of Export dialog appears.
12. Click Generate Command Line. The Export Group XML Editor dialog box appears (Figure 35), displaying the XML file.

Figure 35: Export Group XML Editor
13. Examine the XML file. If necessary, you can edit the file, as described in Modifying an XML File for Exporting Data on page 124.
14. Click Save if the file is acceptable and enter the path and file name for the XML file. Click Cancel to cancel the operation.
15. Go to the SAN HeadQuarters installation directory and use the command format shown next. Be sure to specify the full path for the XML file.
SANHQClient.exe ExportSettingsFile="xml_file_name"
For example:
> SANHQClient.exe ExportSettingsFile="C:\SAN_Export_XML.xml"
Scheduling Exported Data

If you want to export data regularly, use Scheduled Tasks in the Windows Control Panel to specify a schedule for running the following command:
SANHQ_install_directory\SANHQClient.exe ExportSettingsFile="xml_file_name"
The xml_file_name variable specifies the name of the XML file you created in Using a Command to Export Group Data on page 122.

Modifying an XML File for Exporting Data

You can modify an XML file that is used to export group data. For example, you can change:

Name and directory for the .csv file
Type of data to include
Time range

The time range in the XML file is specified in hours, unless you want the latest data (specify span = "0"). If you want a specific date range instead, specify the beginning and end dates. For example:
begin = "7/15/2009 2:35:31 PM" end = "7/29/2009 2:35:31 PM"
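If the begin and end values are produced by a script, the layout in the example above (month/day/year followed by a 12-hour time and AM or PM) can be generated as in this sketch. The helper name format_sanhq_time is hypothetical, and the layout is inferred only from the example. Whether SAN HeadQuarters accepts other date layouts is not stated here.

```python
from datetime import datetime


def format_sanhq_time(moment):
    """Format a datetime like the example above, e.g. 7/15/2009 2:35:31 PM."""
    hour12 = moment.hour % 12 or 12  # hours 0 and 12 are both shown as 12
    meridiem = "AM" if moment.hour < 12 else "PM"
    return (
        f"{moment.month}/{moment.day}/{moment.year} "
        f"{hour12}:{moment.minute:02d}:{moment.second:02d} {meridiem}"
    )


if __name__ == "__main__":
    print(format_sanhq_time(datetime(2009, 7, 15, 14, 35, 31)))
```

The values are assembled by hand because strftime zero-pads the month, day, and hour, which would not match the example.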
The following sample XML file exports all types of data gathered over the most recent 24-hour time period for two groups:
<SanHQExport>
  <Settings Path="C:\SANHQ_Export.csv">
    <Show Group="True" Pools="True" Members="True" Volumes="True" HostedReplicas="True" ReplicaSites="True" Disks="True" Ports="True" VolumeCollections="True" OutboundReplicas="True" />
    <Time Span="24" />
  </Settings>
  <Group>10.127.137.110</Group>
</SanHQExport>
The following sample XML file exports all types of data, except for replication, gathered over a specific date range for two groups:
<SanHQExport>
  <Settings Path="C:\SANHQ_Export_Range.csv">
    <Show Group="True" Pools="True" Members="True" Volumes="True" HostedReplicas="False" ReplicaSites="False" Disks="True" Ports="True" VolumeCollections="True" OutboundReplicas="False" />
    <Time Begin="1/01/2009 1:00:00PM" End="1/12/2010 4:08:13PM" />
  </Settings>
  <Group>10.117.127.120</Group>
  <Group>10.117.141.140</Group>
</SanHQExport>
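An export settings file with this layout could also be generated rather than edited by hand. The following is a minimal sketch, assuming the element layout of the samples above: a Settings element with a Path attribute that holds Show and Time, followed by one Group element per group. The function name build_export_settings is hypothetical. The caller supplies the Show flags, so the sketch does not assume a complete list of attribute names.

```python
import xml.etree.ElementTree as ET


def build_export_settings(csv_path, groups, show, span_hours=None, begin=None, end=None):
    """Return export settings XML text modeled on the samples above."""
    wants_range = begin is not None or end is not None
    if wants_range == (span_hours is not None):
        raise ValueError("give either span_hours or begin and end, not both or neither")
    if wants_range and (begin is None or end is None):
        raise ValueError("a date range needs both begin and end")

    root = ET.Element("SanHQExport")
    settings = ET.SubElement(root, "Settings", {"Path": csv_path})
    # The samples write the flags as the strings True and False.
    ET.SubElement(settings, "Show", {name: str(bool(on)) for name, on in show.items()})
    if wants_range:
        ET.SubElement(settings, "Time", {"Begin": begin, "End": end})
    else:
        ET.SubElement(settings, "Time", {"Span": str(span_hours)})
    for address in groups:
        ET.SubElement(root, "Group").text = address
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(build_export_settings(
        r"C:\SANHQ_Export.csv",
        ["10.127.137.110"],
        {"Group": True, "Pools": True, "Members": True, "Volumes": True},
        span_hours=24,
    ))
```

Requiring exactly one of span_hours or a begin and end pair follows the description above, where the time range is given either in hours or as beginning and end dates.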
Contacting Dell
Dell provides several online and telephone-based support and service options. Availability varies by country and product, and some services might not be available in your area. For customers in the United States, call 800-945-3355.
Note: If you do not have access to an Internet connection, contact information is printed on your invoice, packing slip, bill, or Dell product catalog.

Use the following procedure to contact Dell for sales, technical support, or customer service issues:

1. Visit support.dell.com or the Dell support URL specified in information provided with the Dell product.
2. Select your locale. Use the locale menu or click on the link that specifies your country or region.
3. Select the required service. Click the Contact Us link, or select the Dell support service from the list of services provided.
4. Choose your preferred method of contacting Dell support, such as e-mail or telephone.
Online Services
You can learn about Dell products and services using the following procedure:

1. Visit www.dell.com (or the URL specified in any Dell product information).
2. Use the locale menu or click on the link that specifies your country or region.
Related Information
For detailed information about PS Series arrays, groups, volumes, array software, and host software, visit the Documentation page at the customer support site.