
Understanding HP EVA leveling

Technical white paper

Table of contents
  Introduction
  Leveling definition
  HP EVA storage allocation
  Leveling methodology
  Leveling triggers
  Should the customer worry about leveling?
  How to reduce leveling impact
  Conclusion
  Glossary and definitions

Introduction
This white paper discusses what occurs within the HP Enterprise Virtual Array (EVA) before, during, and after a leveling event. We start with a description of what leveling is and its purpose, which requires some discussion of the internals of the EVA's virtualization engine. Next, we describe the events that can lead to leveling. Finally, we discuss how customers can minimize the impact of leveling. The behavior described in this document applies only to firmware versions 9.x to 11.x of the HP EVA.

Leveling definition
Leveling describes the process used by the EVA to distribute physical storage among the disks in an array. The basic purpose of leveling is to distribute the physical allocation of storage evenly for the collection of logical disks created by a user. Thus, a logical disk's usage on a given physical volume is proportional to that physical volume's contribution to the total amount of physical storage available for allocation to the logical disk. For example, if a given physical volume contributes 10 percent of the total storage, then 10 percent of each logical disk will be allocated on that volume. This methodology attempts to improve I/O performance and maximize spindle utilization by distributing the physical storage for a logical disk across as many spindles as possible. In addition, leveling maximizes the amount of storage presented by a disk group by redistributing data in a way that includes new disks that have been added to the disk group. Therefore, an array can have physical drives of varying sizes and logical disks of various redundancies (with up to four RAID types) in a single disk group with maximum capacity utilization and I/O performance.
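As a rough illustration of the proportionality rule above, the following Python sketch computes how much of a logical disk would land on each physical volume. The function name, volume names, and capacities are hypothetical, and the sketch ignores the fact that the array allocates storage in fixed-size segments.

```python
from fractions import Fraction

def proportional_shares(volume_capacities_gb, logical_disk_gb):
    """Split a logical disk across volumes in proportion to each volume's capacity."""
    total = sum(volume_capacities_gb.values())
    return {
        name: logical_disk_gb * Fraction(cap, total)
        for name, cap in volume_capacities_gb.items()
    }

# Hypothetical disk group: one 100 GB volume and three 300 GB volumes (1000 GB total).
volumes = {"disk1": 100, "disk2": 300, "disk3": 300, "disk4": 300}
shares = proportional_shares(volumes, logical_disk_gb=200)
for name, gb in shares.items():
    print(f"{name}: {float(gb):.1f} GB")
```

Here disk1 contributes 10 percent of the total capacity, so it receives 10 percent of the 200 GB logical disk.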

HP EVA storage allocation


To understand leveling, it is necessary to understand some basic storage space allocation concepts:

• Each physical volume (disk) is segmented into 2 MB units for the HP 4400/6400/8400 EVA series, and 8 MB units for the HP P6000 EVA series. Each 2 MB/8 MB unit is called a physical segment (PSEG).

• Physical volumes are grouped either by an array controller or management agent or both to form a Logical Disk Allocation Domain (LDAD). An LDAD is also known as a disk group in the management agent. Each logical disk created by the user is associated with a particular LDAD, and the physical storage of a logical disk is chosen only from the set of physical volumes in that LDAD.

• Each LDAD can be further subdivided by the array controller into redundant storage sets (RSS) to improve the fault tolerance characteristics of the LDAD. An RSS may be a subset of the volumes that make up an LDAD, depending on the size of the LDAD. In other words, an LDAD may contain one or more RSSs. The optimal size of an RSS is eight physical volumes.

• A user logical disk appears as a linear array of blocks of a given size. The array divides this space into addressable units known as redundant stores, or RStores. An RStore represents a contiguous 8 MB (32 MB for the P6000 EVA series) of user-addressable space within a logical disk created by the user. An RStore has redundancy characteristics, which dictate the amount of physical storage that must be allocated to yield 8 MB (32 MB for the P6000 EVA series) of user space. The EVA supports four redundancy levels: Vraid0, Vraid1, Vraid5, and Vraid6. The redundancy level determines both the number of physical volumes required to form a given RStore and the total physical storage required to yield the 8 MB (32 MB for the P6000 EVA series) of user space. Each RStore is allocated from a given set of PSEGs on a given set of volumes in a given RSS within the target LDAD.

The operation of moving disks from one RSS to another is referred to as an RSS migration. This operation is typically the result of a split or merge. If an RSS drops below six disks, it is typically merged with other disks to keep the RSS at six or more disks and fewer than 12 disks. If an RSS expands to 12 disks, it is split into two six-disk RSSs.
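The unit sizes and the RSS sizing rule described above can be sketched in a few lines of Python. This is only an illustration: the family labels, function names, and return strings are hypothetical, the sketch assumes the whole volume is available for PSEGs (ignoring any space the array reserves for itself), and the real split and merge decisions are made by the controller firmware.

```python
MB = 1024 * 1024
GB = 1024 * MB

# Unit sizes quoted in the text, keyed by a hypothetical family label.
UNIT_SIZES = {
    "EVA4400/6400/8400": {"pseg": 2 * MB, "rstore": 8 * MB},
    "P6000": {"pseg": 8 * MB, "rstore": 32 * MB},
}

def pseg_count(volume_bytes, family):
    """Whole PSEGs that fit on one physical volume (assumes no reserved space)."""
    return volume_bytes // UNIT_SIZES[family]["pseg"]

def rstore_count(logical_disk_bytes, family):
    """RStores needed to cover a logical disk, rounded up to a whole RStore."""
    size = UNIT_SIZES[family]["rstore"]
    return -(-logical_disk_bytes // size)  # ceiling division

def rss_action(disk_count):
    """Simplified reading of the RSS sizing rule: merge below 6 disks, split at 12."""
    if disk_count < 6:
        return "merge"
    if disk_count >= 12:
        return "split"
    return "ok"

print(pseg_count(100 * GB, "P6000"))
print(rstore_count(100 * GB, "EVA4400/6400/8400"))
print(rss_action(12))
```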

Leveling methodology
Leveling attempts to ensure that the physical storage used for a given logical disk is allocated proportionally across the RSSs in the LDAD containing the logical disk, as well as proportionally across the volumes within each RSS. In other words, if a given RSS contains 15 percent of the storage in the LDAD, 15 percent of the logical disk will be allocated from that RSS and, if a given volume in the RSS contains 10 percent of the storage within the RSS, 10 percent of the logical disk space allocated in that RSS will be allocated on that volume. Differing RAID levels will have different calculations for physical storage. For example, in Vraid1, as a pair of drives is necessary for full redundancy, the smaller of the two drives is used to compute the available physical storage.
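The two-level proportionality described above (first across RSSs, then across the volumes within each RSS) can be sketched as follows. The layout, names, and capacities are hypothetical, and the sketch works in gigabytes rather than in the segment units the array actually uses. The second function reflects only the Vraid1 statement in the text, that the smaller drive of a pair determines the available physical storage.

```python
def level_targets(rss_layout, logical_disk_gb):
    """Target GB per volume: proportional across RSSs, then across volumes in each RSS."""
    ldad_total = sum(sum(vols.values()) for vols in rss_layout.values())
    targets = {}
    for rss, vols in rss_layout.items():
        rss_total = sum(vols.values())
        # Share of the logical disk that belongs in this RSS.
        rss_share = logical_disk_gb * rss_total / ldad_total
        for vol, cap in vols.items():
            # Share of the RSS allocation that belongs on this volume.
            targets[(rss, vol)] = rss_share * cap / rss_total
    return targets

def vraid1_pair_capacity(drive_a_gb, drive_b_gb):
    """For a mirrored pair, the smaller drive limits the available physical storage."""
    return min(drive_a_gb, drive_b_gb)

# Hypothetical LDAD with two RSSs holding 600 GB and 1400 GB respectively.
layout = {
    "rss1": {"d1": 300, "d2": 300},
    "rss2": {"d3": 600, "d4": 800},
}
targets = level_targets(layout, logical_disk_gb=100)
for key, gb in targets.items():
    print(key, round(gb, 1))
```

In this example rss1 holds 30 percent of the LDAD capacity, so 30 GB of the 100 GB logical disk is targeted at rss1 and split evenly between its two equal volumes.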

Leveling triggers
A leveling event is triggered at the following times:

First group:
• When an RSS migration request is made (RSS migration includes marry operations)
• When a pending RSS migration request is sent to the level wait queue
• When an RSS migration completes
• When the number of normal members in an LDAD transitions above four
• Following a successful merge (joining of two RSSs) operation
• Following any regen (data recovery through redundancy) or when a failed drive is replaced (successful or not)

Second group:
• Following creation of an empty container
• Following a snapshot creation
• When a mirror clone is fractured or detached
• When a mirror clone resync is complete
• When a snapclone completes
• Following a capacity change (logical disk create, delete, shrink, expand)

Last group:
• 60 seconds after restart or master transition (controller reset, or mastership moving from one controller to the other)

The first group occurs in response to a configuration change due to either drive failure or drive appearance; these may be due to hardware errors or user action. The second group occurs in response to an action by the user, and the last group occurs as a result of restart or master controller failover.
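For readers who want to reason about these triggers in scripts or monitoring notes, the grouping above can be captured as a small lookup table. This is a sketch only: the event names and the enum are hypothetical labels invented for illustration and do not correspond to identifiers in the EVA firmware or management software.

```python
from enum import Enum

class TriggerGroup(Enum):
    CONFIGURATION_CHANGE = "drive failure or drive appearance"
    USER_ACTION = "action by the user"
    RESTART = "restart or master controller failover"

# Hypothetical labels for the triggers listed in the text, mapped to their group.
LEVELING_TRIGGERS = {
    "rss_migration_requested": TriggerGroup.CONFIGURATION_CHANGE,
    "rss_migration_queued": TriggerGroup.CONFIGURATION_CHANGE,
    "rss_migration_completed": TriggerGroup.CONFIGURATION_CHANGE,
    "normal_members_above_four": TriggerGroup.CONFIGURATION_CHANGE,
    "rss_merge_completed": TriggerGroup.CONFIGURATION_CHANGE,
    "regen_or_drive_replaced": TriggerGroup.CONFIGURATION_CHANGE,
    "empty_container_created": TriggerGroup.USER_ACTION,
    "snapshot_created": TriggerGroup.USER_ACTION,
    "mirror_clone_fractured_or_detached": TriggerGroup.USER_ACTION,
    "mirror_clone_resync_completed": TriggerGroup.USER_ACTION,
    "snapclone_completed": TriggerGroup.USER_ACTION,
    "capacity_changed": TriggerGroup.USER_ACTION,
    "restart_or_master_transition": TriggerGroup.RESTART,
}

# The only delay stated in the text applies to the restart trigger.
RESTART_DELAY_SECONDS = 60

def triggers_in(group):
    """Return the trigger labels that belong to one group."""
    return [name for name, g in LEVELING_TRIGGERS.items() if g is group]

print(triggers_in(TriggerGroup.RESTART))
```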

Should the customer worry about leveling?


Leveling is a consequence of both normal and abnormal events that occur in the life of a cell. Most of the events that trigger the leveling process are not within the customer's control. The user should never need to worry about leveling. It may, however, be useful to be aware of it, as there are side effects such as temporary loss of capacity [1] while leveling is in progress. Users can also run into problems when the amount of free capacity is so low that leveling cannot complete because there is no space to move the data to. However, by following the suggested guidelines around capacity and always ensuring at least a single protection level (level 1), these problems can be eliminated. The one event that is within the customer's control is when disks are inserted. In this case, HP suggests the following best practices for increasing capacity of the array:

• To reduce false error indications, insert multiple disks carefully and slowly, pausing between disks. This precaution allows the initial bus interruption from the insertion and the disk power-on communication with the controller to occur without the potential interruption from other disks. In addition, this process sequences leveling so that it does not start until all the new disks are ready.

• Although the array supports replacing existing smaller disks with larger disks, this process is time consuming and disruptive and can result in a nonoptimum configuration. Do this only if the option to build new disk groups and move existing data to the new disks is unavailable.

Best practices to improve availability when adding disks to an array:

• Set the add disk option to manual
• Add disks one at a time, waiting a minimum of 60 seconds between disks
• Distribute disks vertically and as evenly as possible to all the shelves
• Unless otherwise indicated, add new disks to existing disk groups using the HP Storage System Scripting Utility add multiple disks command
• Add disks in groups of eight
• For growing existing applications, if the operating system supports virtual disk growth, increase virtual disk size; otherwise, use a software volume manager to add new virtual disks to applications

In addition, users tracking leveling progress may see the percentage go up and down and even start back at zero. This occurs due to events that may suspend or resume leveling, as well as events that trigger leveling as mentioned above. For example:

• If the array encounters an event that requires it to suspend leveling, then when leveling resumes it starts again from the beginning of the logical disk it was working on when it was suspended. However, leveling does not go back and level logical disks that it had completed before being suspended.

• If leveling is running and a leveling trigger event occurs (see above), leveling starts over. For instance, if a user adds 40 new drives to an array and begins creating logical disks as capacity becomes available, the array starts leveling the LDAD from the beginning each time.
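The progress behavior described above can be mimicked with a toy model. This is a sketch under stated assumptions: the class, its method names, the notion of "work units" per logical disk, and the way the percentage is computed are all hypothetical and chosen only to show why the reported figure can fall or return to zero.

```python
class LevelingProgress:
    """Toy model of leveling progress across the logical disks in one LDAD."""

    def __init__(self, units_per_disk):
        self.units_per_disk = list(units_per_disk)  # work units per logical disk
        self.disk_index = 0   # logical disk currently being leveled
        self.units_done = 0   # units finished within the current logical disk

    def work(self, units):
        """Advance leveling by a number of work units."""
        while units > 0 and self.disk_index < len(self.units_per_disk):
            remaining = self.units_per_disk[self.disk_index] - self.units_done
            step = min(units, remaining)
            self.units_done += step
            units -= step
            if self.units_done == self.units_per_disk[self.disk_index]:
                self.disk_index += 1
                self.units_done = 0

    def suspend_and_resume(self):
        """Restart the current logical disk; completed disks stay completed."""
        self.units_done = 0

    def trigger_event(self):
        """A new leveling trigger restarts leveling of the whole LDAD."""
        self.disk_index = 0
        self.units_done = 0

    def percent(self):
        total = sum(self.units_per_disk)
        done = sum(self.units_per_disk[: self.disk_index]) + self.units_done
        return 100 * done // total if total else 100

progress = LevelingProgress([10, 10, 20])
progress.work(15)
print(progress.percent())   # first disk done, second half done
progress.suspend_and_resume()
print(progress.percent())   # second disk starts over, first stays done
progress.trigger_event()
print(progress.percent())   # everything starts over
```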

How to reduce leveling impact


In general, to reduce the impact of leveling, the customer should always run the latest version of EVA firmware, set a minimum protection level of single on all disk groups, and ensure used physical capacity is at or below 90 percent. Under these circumstances the impact of leveling on the array is minimal, except in the case of a multiple disk failure, or when multiple drives are simultaneously removed from or added to an existing disk group that has been written to near its full capacity. Note that if there are no logical disks in the existing LDAD, no leveling takes place. Finally, leveling is a background task and is designed to minimize impact on the customer's workload.
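The capacity and protection-level guidelines above lend themselves to a simple check. The sketch below is illustrative only: the function name, parameters, and messages are hypothetical, the input figures would have to come from the management software by whatever means is available, and the firmware-version guideline is not modeled.

```python
def leveling_guideline_issues(used_gb, total_gb, protection_level,
                              max_used_fraction=0.90):
    """Return a list of guideline violations for one disk group (empty if none).

    protection_level uses the numbering from the glossary: 0, 1 (single), or 2.
    """
    if total_gb <= 0:
        raise ValueError("total_gb must be positive")
    issues = []
    if used_gb / total_gb > max_used_fraction:
        issues.append("used physical capacity is above 90 percent")
    if protection_level < 1:
        issues.append("protection level is below single")
    return issues

# Hypothetical disk groups.
print(leveling_guideline_issues(used_gb=850, total_gb=1000, protection_level=1))
print(leveling_guideline_issues(used_gb=950, total_gb=1000, protection_level=0))
```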

[1] Some processes, such as leveling, consume some additional capacity until the process completes and efficiently spreads data across the disk group.

Conclusion
The HP EVA uses leveling to distribute physical storage evenly among the disks in an array. Leveling is a background process that runs in response to normal and abnormal events that may occur in the array, such as adding or removing drives, and creating or resizing logical disks. Furthermore, the user should never need to worry about leveling when following the best practices mentioned in this document. The impact of leveling on the array will be minimal and leveling can achieve its goal of even distribution across the drives.

Glossary and definitions


Debug flag: The array provides some additional information through a serial port or logs if this flag is set.
Full capacity: Nearly all of the usable capacity on the array has become exhausted.
Level wait queue: The queue used to hold leveling requests until the array can service them.
Maintenance command: The management appliance provides the ability to invoke special commands to the array that are used for special maintenance purposes and gathering additional information.
Marry operation: The process of marrying a disk to another disk in order to form a RAID1 pair.
Multiple disk failure: More than one disk failure at one time.
Multiple disk removals: More than one disk is removed from the array at a time.
Physical volume: A physical volume is equivalent to a single Fibre Channel disk drive.
Protection level: The EVA has three protection levels: 0, 1, and 2. The number represents the guaranteed number of disks that can be reconstructed after disk failure; this is accomplished by setting aside (reserving) enough capacity for the reconstruction to complete.
Regen: Regen refers to data recovery through redundancy; this is the process of recovering data for RAID1 by reading the data from the mirror copy, or for RAID5 by using parity to recompute the missing data.
Resync: This process occurs after a controller resets as it boots and synchronizes with the other controller.
RSS migration: Movement of disks from one RSS to another.

Distribute your physical volumes evenly for greater efficiency, visit www.hp.com/go/storage.

Copyright 2012 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice. The only warranties for HP products and services are set forth in the express warranty statements accompanying such products and services. Nothing herein should be construed as constituting an additional warranty. HP shall not be liable for technical or editorial errors or omissions contained herein. 4AA3-8964ENW, Created January 2012
