Module 2: Virtual Provisioning Concepts and Planning
Copyright 2012 EMC Corporation. All rights reserved.

Slide 1
Upon completion of this module, you should be able to:
- Provide an overview of Virtual Provisioning
- List the benefits of Virtual Provisioning
- Describe the components of a Virtual Provisioning environment
- Explain the considerations behind Thin Pool creation and expansion
- List the tools to monitor Thin Pool storage space

Slide 2
One of the biggest challenges for storage administrators is balancing the storage space required by various applications in their data centers. Administrators typically allocate storage space based on anticipated storage growth. They do this to reduce the management overhead and application downtime required to add new storage later on. This generally results in the over-provisioning of storage capacity, which leads to higher costs, increased power, cooling, and floor space requirements, and lower capacity utilization.

These challenges are addressed by Virtual Provisioning. Virtual Provisioning is the ability to present a logical unit (Thin LUN) to a compute system with more capacity than is physically allocated to the LUN on the storage array. Physical storage is allocated to the application on demand from a shared pool of physical capacity. This provides more efficient utilization of storage by reducing the amount of allocated but unused physical storage.
Slide 3
The example shown on this slide compares Virtual Provisioning with traditional storage provisioning and demonstrates the benefit of better capacity utilization.

Let us assume that three LUNs are created and presented to one or more compute systems using traditional provisioning methods. The total usable capacity of the storage system is 2 TB. The size of LUN 1 is 500 GB, of which 100 GB contains data and 400 GB is allocated but unused. The size of LUN 2 is 550 GB, of which 50 GB contains data and 500 GB is allocated but unused. The size of LUN 3 is 800 GB, of which 200 GB contains data and 600 GB is allocated but unused. In total, the storage system contains 350 GB of actual data, 1.5 TB of allocated but unused capacity, and only 150 GB of capacity available for other applications.

Now, let us assume that a new application is installed in the data center and requires 400 GB of storage capacity. The storage system has only 150 GB of available capacity, so it is not possible to provide 400 GB to the new application even though 1.5 TB of the allocated capacity is unused. This shows the underutilization of storage in a traditional storage provisioning environment.

If we consider the same 2 TB storage system with Virtual Provisioning, the differences are quite dramatic. Although the system administrator creates LUNs of the same size, there is no allocated but unused capacity. In total, the storage system with Virtual Provisioning has 350 GB of actual data and 1.65 TB of capacity available for other applications, versus only 150 GB available with traditional storage provisioning.
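The arithmetic in this example can be checked with a short sketch. The function and variable names below are illustrative only and are not part of any EMC tool; capacities are in GB, and 1 TB is treated as 1000 GB, as the example does.

```python
# Capacities in GB; 1 TB is treated as 1000 GB, matching the slide example.
TOTAL_GB = 2000

# (provisioned size, data actually written) per LUN, taken from the example.
LUNS = {"LUN 1": (500, 100), "LUN 2": (550, 50), "LUN 3": (800, 200)}


def traditional_free(total_gb, luns):
    """Free capacity when every LUN reserves its full provisioned size."""
    return total_gb - sum(size for size, _ in luns.values())


def virtual_free(total_gb, luns):
    """Free capacity when only written data consumes physical storage."""
    return total_gb - sum(used for _, used in luns.values())


def allocated_unused(luns):
    """Capacity reserved by traditional LUNs that holds no data."""
    return sum(size - used for size, used in luns.values())


NEW_APP_GB = 400
print(allocated_unused(LUNS))                          # 1500
print(traditional_free(TOTAL_GB, LUNS))                # 150
print(virtual_free(TOTAL_GB, LUNS))                    # 1650
print(traditional_free(TOTAL_GB, LUNS) >= NEW_APP_GB)  # False
print(virtual_free(TOTAL_GB, LUNS) >= NEW_APP_GB)      # True
```

The last two lines show why the new 400 GB application fits only in the Virtual Provisioning case.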
Slide 4
One of the issues with traditional storage provisioning is the amount of allocated but unused storage. When a LUN is allocated to a host, the host has exclusive access to that LUN. Any unused space on the LUN cannot be accessed by any other host, so it becomes unused storage. With Symmetrix Virtual Provisioning, storage is allocated from a shared Thin Pool that multiple hosts and applications can access. This means the utilization of storage capacity can be optimized, and there will be less unused storage.

Another advantage is the ease and speed of provisioning, specifically re-provisioning. While the initial provisioning process is the same for Thin Devices as it is for regular Thick Devices, adding capacity is simply a matter of adding Data Devices to the Thin Pool. When a host is presented with a larger device than is initially required, re-provisioning operations are not required at the host level.

Thus, Symmetrix Virtual Provisioning can simplify storage management. Storage provisioning can be done independently of the physical storage, and the re-provisioning steps required to support capacity growth can be reduced.

Slide 5
An added benefit of Virtual Provisioning is improved performance for certain workloads. There is the overhead of the initial allocation; however, because the Thin Pool uses wide striping, there is a performance benefit for subsequent access.

With Synchronous SRDF, Enginuity allows one outstanding write per Thin Device per path. With concatenated metadevices, this could sometimes cause a performance problem by limiting the concurrency of writes.
This limit will not affect striped metadevices in the same way because of the small size of the metavolume stripe (1 cylinder or 1920 blocks).
Slide 6
Symmetrix Virtual Provisioning uses a type of Symmetrix device called a Thin Device. It is called a Thin Device because no actual data is saved on the device. The Thin Device is presented to a host like any Symmetrix volume, and a host application can write data to it. Thin Devices can be presented to hosts before they are bound, but they will appear as not ready to the host.

Data does not get written to the Thin Device itself. It is written to a collection of Symmetrix devices called Data Devices. The Data Devices are grouped together into a storage pool called a Thin Pool. A Thin Device must be bound to a Thin Pool before it can be used. The Thin Pool provides the shared storage, and its capacity can be shared among multiple hosts and applications.

The action of binding a Thin Device to a Thin Pool makes the device ready and associates storage with the Thin Device. It also allocates one extent (twelve 64 KB tracks, or 768 KB) from the pool.

Slide 7
Thin Devices are visible to the host. They use a small amount of cache. The EMC sales tool Direct Express helps to calculate the amount of cache needed when designing a Thin Device configuration.

Thin Devices are operating system agnostic and are mapped and masked in the same manner as a regular Symmetrix device. Thin Devices can be mapped and masked before being bound to a Thin Pool containing Data Devices. Once mapped and masked, the device will appear in the inquiry output.

When the device is bound to a Thin Pool, an initial allocation of twelve 64 KB tracks (768 KB) is made unless a larger amount is pre-allocated. More space is allocated as needed by the host application. Thin Devices (TDEVs) can be replicated using EMC Symmetrix local and remote replication products.
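The bind behavior described above can be illustrated with a toy model. This is only a sketch: the `ThinPool` and `ThinDevice` classes, their methods, and the device names are hypothetical and do not correspond to Solutions Enabler or Enginuity interfaces. The track size, extent size, and "not ready until bound" rule are taken from the notes.

```python
from dataclasses import dataclass
from typing import Optional

TRACK_KB = 64            # track size stated in the notes
TRACKS_PER_EXTENT = 12   # one extent is twelve tracks
EXTENT_KB = TRACK_KB * TRACKS_PER_EXTENT  # 768 KB


@dataclass
class ThinPool:
    """Hypothetical pool that only tracks its free extents."""
    name: str
    free_extents: int

    def allocate(self, count: int) -> None:
        if count > self.free_extents:
            raise RuntimeError(f"pool {self.name} is full")
        self.free_extents -= count


@dataclass
class ThinDevice:
    """Hypothetical Thin Device: not ready until it is bound to a pool."""
    name: str
    pool: Optional[ThinPool] = None
    allocated_extents: int = 0

    @property
    def ready(self) -> bool:
        return self.pool is not None

    def bind(self, pool: ThinPool, preallocate_kb: int = 0) -> None:
        if self.pool is not None:
            raise ValueError(f"{self.name} is already bound to {self.pool.name}")
        # Binding allocates one extent unless a larger amount is pre-allocated.
        extents = max(1, -(-preallocate_kb // EXTENT_KB))  # ceiling division
        pool.allocate(extents)
        self.pool = pool
        self.allocated_extents = extents


pool = ThinPool("pool_a", free_extents=1000)
tdev = ThinDevice("tdev_01")
print(tdev.ready)                                      # False
tdev.bind(pool)
print(tdev.ready, tdev.allocated_extents * EXTENT_KB)  # True 768
```

A device in this model can be created (and, by analogy, mapped and masked) before binding, but it only reports ready once `bind` has taken its first extent from the pool.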
Slide 8
Data Devices are similar to SAVE devices: they are not visible to the host and must be contained in a pool before they can be used. Thin Pools can only contain devices of the same emulation and protection type. Data Devices of different sizes are permitted in a pool, but as a best practice the Data Devices in a pool should be the same size. They should be created on drives with the same size and speed, should be spread widely across the back-end, and should be as large as is practically possible.

Slide 9
Symmetrix supports three types of save pools: TimeFinder/Snap, SRDF DSE, and Virtual Provisioning Thin Pools. A Thin Pool contains the aggregate space available to a set of Thin Devices. A pool can contain zero or more Data Devices and can be created in a separate operation or at the time that the Data Devices are created. When a device is added to a pool, it can be enabled for allocation or disabled for future use.

Unlike Snap Save Pools, there is no default pool for Thin Devices. Therefore, at least one user-defined Thin Pool must be created. Application performance requirements must be considered, because all TDEVs associated with a Thin Pool share the same pool resources and can compete for access to the same set of spindles. It may therefore be appropriate to create multiple pools to isolate Thin Devices by performance and availability requirements.

Pools are set up with a specific protection type. This is set when the first Data Device is added. If the first device added to a pool is RAID1, all additional devices must be RAID1.

The Array Controls Guide states that a maximum of 510 pools of all types combined can exist in the Symmetrix. However, the white paper Best Practices for Fast, Simple Capacity Allocation with EMC Symmetrix Virtual Provisioning (P/N 300-006-718 Rev A12) states the number as 512, which, according to engineering, is the correct number. The most efficient use of pool resources is achieved by using a minimal number of pools so that capacity can be shared.
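The pool membership rules above (emulation and protection fixed by the first Data Device, devices added either enabled or disabled) can be sketched as follows. The classes, field names, and sample values such as "FBA" and the device names are illustrative assumptions, not an EMC API.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class DataDevice:
    """Hypothetical Data Device record."""
    name: str
    size_gb: int
    emulation: str    # illustrative value, e.g. "FBA"
    protection: str   # illustrative value, e.g. "RAID1"


@dataclass
class ThinPool:
    """Hypothetical Thin Pool that enforces one emulation and protection type."""
    name: str
    emulation: Optional[str] = None
    protection: Optional[str] = None
    enabled: List[DataDevice] = field(default_factory=list)
    disabled: List[DataDevice] = field(default_factory=list)

    def add(self, dev: DataDevice, enable: bool = True) -> None:
        if self.emulation is None:
            # The first device added fixes the pool's emulation and protection.
            self.emulation, self.protection = dev.emulation, dev.protection
        elif (dev.emulation, dev.protection) != (self.emulation, self.protection):
            raise ValueError(
                f"{dev.name} does not match pool type "
                f"{self.emulation}/{self.protection}"
            )
        (self.enabled if enable else self.disabled).append(dev)

    @property
    def enabled_capacity_gb(self) -> int:
        """Only enabled devices contribute capacity for allocation."""
        return sum(d.size_gb for d in self.enabled)


pool = ThinPool("pool_a")
pool.add(DataDevice("dd_01", 100, "FBA", "RAID1"))
pool.add(DataDevice("dd_02", 100, "FBA", "RAID1"), enable=False)
print(pool.protection, pool.enabled_capacity_gb)  # RAID1 100
```

Adding a RAID5 device to this pool would raise `ValueError`, mirroring the rule that all additional devices must match the first one.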
Slide 10
Over-subscription occurs when the total capacity of all TDEVs bound to a Thin Pool is greater than the aggregate capacity of all Data Devices in the pool. This is normal practice, but it requires the storage administrator to monitor the pool to prevent a pool-full condition. When the pool is full and a host attempts a write that requires a new extent to be allocated, a write error is returned to the host. Different hosts react differently to this error. The Host Connectivity Guide for each operating system describes the behavior of the specific operating system, volume manager, and file system. These guides are available on Powerlink.

When additional capacity is needed in the pool, Data Devices can be dynamically assigned to the pool. It is recommended practice to add multiple devices to the pool at the same time to ensure that the allocations to a Thin Device are spread across multiple Data Devices. It is also advisable to initiate a pool rebalancing operation.

By default, there is no limit to over-subscription. To reduce the risk, a limit can be set on a pool that caps the total capacity of the Thin Devices bound to the pool as a percentage of the pool capacity.

Slide 11
As a host performs write operations, Enginuity sequentially allocates storage to the Thin Device from the Data Devices in the associated pool. The unit of allocation is an extent, which is 12 tracks or 768 KB. The extents are striped across all members of the pool. This wide striping can potentially provide increased performance. The allocation is sequential regardless of whether the host writes are sequential or random.

Slide 12
The slide shows an example of extent allocation to Thin Devices. The initial bind causes one extent to be allocated (LBA 0-1535). A subsequent write to LBA 872, which is part of the previously allocated extent, requires no allocation. A new 8 KB write to LBA 72588 requires the allocation of a new extent (LBA 72192-73727). Another 8 KB write to LBA 73724 requires no allocation for the first 2 KB, because this falls in the already allocated extent (LBA 72192-73727), but the remaining 6 KB requires the allocation of another extent (LBA 73728-75263).
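The extent boundaries in this example follow from simple integer arithmetic, sketched below. The sketch assumes 512-byte blocks (implied by one 768 KB extent covering LBA 0-1535) and assumes an 8 KB size for the write to LBA 872, which the notes do not state. The function names are illustrative.

```python
BLOCK_BYTES = 512                         # assumed block size (768 KB = 1536 blocks)
EXTENT_BLOCKS = 12 * 64 * 1024 // BLOCK_BYTES  # 12 tracks x 64 KB = 1536 blocks


def extent_range(index):
    """First and last LBA covered by the extent with the given index."""
    start = index * EXTENT_BLOCKS
    return (start, start + EXTENT_BLOCKS - 1)


def write(allocated, lba, size_kb):
    """Record a host write; return the LBA ranges of newly allocated extents."""
    blocks = size_kb * 1024 // BLOCK_BYTES
    first = lba // EXTENT_BLOCKS
    last = (lba + blocks - 1) // EXTENT_BLOCKS
    new_ranges = []
    for index in range(first, last + 1):
        if index not in allocated:
            allocated.add(index)
            new_ranges.append(extent_range(index))
    return new_ranges


allocated = {0}                      # the initial bind allocates LBA 0-1535
print(write(allocated, 872, 8))      # []
print(write(allocated, 72588, 8))    # [(72192, 73727)]
print(write(allocated, 73724, 8))    # [(73728, 75263)]
```

The last write spans 16 blocks starting at LBA 73724: the first 4 blocks (2 KB) fall in the extent that is already allocated, and the remaining 12 blocks (6 KB) trigger the new allocation.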
Slide 13
A Thin Device is seen by the host like any other device. Normally, an application will only read storage that has been written. However, if the host reads a block that has never been written, the Symmetrix returns data blocks that contain all zeros.

Beginning with Symmetrix VMAX and Enginuity 5874, a special indicator known as never written by host (NWBH) is maintained in the metadata stored in cache for every track on every drive in the array. When this indicator is set for a track, it signifies that a host has never written any data to the track. If Enginuity receives a read request for a track with the NWBH indicator set, it can skip the read I/O to the disk and simply return all zeros to the host. If the track is part of a device participating in a local or remote replication session, the NWBH indicator will be set on the corresponding track of the target device.

Slide 14
When planning a configuration using Thin Devices, the first step is to determine how many separate Thin Pools are needed and the required composition of each pool. Typically, this involves conceptually organizing disk storage into separate classes, with further subdivision as needed to allow the back-end resources (drives and DAs) used by the pools to be isolated from one another. Depending on the mix of applications to be placed on Thin Devices, it will often be necessary to create multiple Thin Pools, but generally the most efficient use of resources is achieved by using a minimal number of pools.

Typically, a Thin Pool should be designed for use by a given application, or set of related applications, aligned with a given business group. The applications sharing a Thin Pool will compete for back-end resources, including Thin Pool storage capacity.
If this is not acceptable, applications should not share the same pool. The devices comprising a Thin Pool will have the same performance and protection properties, so the applications sharing a Thin Pool should have the same performance and protection requirements.

Slide 15
Thin Devices and Data Devices can be configured without being bound to a pool at the time of creation. However, a Thin Device must be bound to a Thin Pool before it can be used. Binding and unbinding Thin Devices can be performed using symconfigure commands or SMC. SymmWin can also be used to bind or unbind Thin Devices to and from a Thin Pool. However, Solutions Enabler (SE) and SMC are the recommended methods.

A Thin Device can only be bound to one Thin Pool at a time. The Thin Device writes its data to the Thin Pool to which it is bound. If a Thin Device is not bound to a pool, it will be set to user not ready. Once it is bound to a pool, it is in the ready state. With Solutions Enabler version 7.2 and Enginuity version 5875, the data of a Thin Device can be spread across multiple pools, though the device itself will be bound to one pool. The symcfg and symdev commands display all of the pools in which a Thin Device has allocated tracks.

Thin Devices can be deleted once they are unbound from the Thin Pool. When Thin Devices are unbound, the extents that have been allocated to them from the Thin Pool are freed, causing all data from the Thin Device to be discarded.

Slide 16
All of these features require Solutions Enabler 7.2 and Enginuity 5875.

Symconfigure allows you to rebind a Thin Device to a new pool without moving its data or losing any data. Rebind simply changes the Thin Device's current binding to a new pool.
Rebind will not move any existing data to the new pool, but all new allocations to the Thin Device will go to the new pool after the rebind action completes.

The Virtual LUN VP migration feature allows Thin Devices to be moved between pools. In other words, a user can non-disruptively relocate all of a Thin Device's allocated extents from one Thin Pool to a target Thin Pool. The source Thin Devices can be specified in a file, device group, or storage group. VLUN migration also allows Thin Device tracks that are spread over many pools to be gathered and relocated to one pool.

The approach taken with FAST is to automate the process of identifying which regions of storage should reside on a given drive technology, and to automatically and non-disruptively move storage between tiers to optimize storage resource usage accordingly. FAST VP operates on Virtual Provisioning Thin Devices. Data movements can be performed at the sub-LUN level, and a single Thin Device may have extents allocated across multiple Thin Pools within the array.

Slide 17
One or more Data Devices can be removed from a pool as long as there are enough unallocated extents on the remaining devices to hold the data from the device being drained. Before removing a Data Device from a pool, the device must be disabled. Disabling a Data Device changes its state from Enabled to Draining.

The concept of device draining is common to all pool types (data, DSE, and snap pools). When a device enters a draining state, allocated extents on that device are copied to the remaining enabled devices in the pool. This allows devices to be removed from a pool and re-purposed without having to delete any sessions associated with the device.

If all Thin Devices that have allocated extents on a Data Device are unbound from the pool, all the Data Devices may be disabled and removed from a Thin Pool.
Because data on the Thin Device will be striped across all enabled devices in the pool, in practice all Thin Devices in a pool will need to be unbound before all Data Devices in the pool can be disabled and removed from the pool.

Slide 18
When an over-subscribed Thin Pool begins to run out of space, Data Devices should be added to the pool before the pool completely fills up. Once a pool is full, an application that performs a write to a previously unmapped portion of a Thin Device will encounter an out-of-space error.

Data Devices can be added to a Thin Pool non-disruptively. The set of Data Devices to be added to an existing pool must have the same protection type and emulation as the devices already in the target Thin Pool.

Typically, the DA and drive configuration underlying a set of Data Devices to be added to a pool should be sized to have enough available performance capacity to handle the workload that will be delivered to any regions of Thin Devices that get allocated after the pool addition. This is because once the pool addition is made, and any remaining capacity in the pre-existing pool becomes completely exhausted, a point will be reached where all storage allocations made from the pool will be satisfied using just Data Devices from the pool addition. In most cases, it is inadvisable to add a set of Data Devices that is spread over a small number of drives to a Thin Pool that is running out of space.

Slide 19
Automated pool rebalancing allows the user to rebalance workloads automatically and non-disruptively in order to extend Thin Pool capacity in small increments as needed. Users can run a command against a Thin Pool that will rebalance the allocated extents across all enabled Data Devices in the pool.
This allows a small number of Data Devices to be used to expand a pool without compromising wide striping.

The rebalancing variance is the target device utilization variance for the rebalancing algorithm. The rebalancing algorithm attempts to level the distribution of data in a pool so that the percentage utilization of any device in the pool is within the target variance of the percentage utilization of any other device in the pool. The maximum rebalance scan device range is the maximum number of devices in a pool on which the rebalancing algorithm will concurrently operate.

To reduce capacity requirements after migrating from standard volumes to thin volumes, it is possible to reclaim extents containing all zeros. Reclamation involves the de-allocation of Thin Device extents that contain all zeros. Running the space reclamation command spawns a DA background task that examines the allocated extents on a specified Thin Device. For each allocated extent, all 12 tracks are brought into cache and scanned to see if they contain all-zero data. If the entire extent satisfies the conditions for space reclamation, the extent is de-allocated and added back into the pool, making it available for a new extent allocation operation.

There are two reclamation options. The first, known as unwritten, refers to space that has not been written to by a host. An example of this is pre-allocated space on a Thin Device. The second, known as reclaim, reclaims unwritten space as well as space that was zero-filled. Zeros can be written by actions such as Open Replicator pushes or pulls to a Thin Device.

Slide 20
Pools can be renamed. The new pool name must adhere to the same naming restrictions used when creating a pool. Only one pool can be operated on in a session (or command file).
You cannot create and rename a pool in the same session.

A pre-allocated amount of Thin Pool space can be allocated and assigned to a Thin Device. Only the Thin Device assigned to the region can write to it. Applications that pre-allocate or pre-write all capacity of their Thin Devices may be suitable for pre-allocation. Pre-allocation also avoids the slight, temporary overhead on the first write to a new extent. Extents that are persistently pre-allocated are not reclaimed by a standard reclaim operation.

Slide 21
The same metadevice commands (form meta, dissolve meta, convert meta) can be used to configure a thin metadevice. When creating a thin metadevice, all members must be Thin Devices. Mixing Thin and Thick Devices is not supported.

Since Thin Devices are striped across the back-end, there is usually no need to use striped metadevices with Virtual Provisioning. However, there may be certain situations where better performance can be achieved using striped metas.

With Synchronous SRDF, Enginuity allows one outstanding write per Thin Device per path. With concatenated metadevices, this could cause a performance problem by limiting the concurrency of writes. This limit will not affect striped metadevices in the same way because of the small size of the metavolume stripe (1 cylinder or 1920 blocks).

Symmetrix Enginuity has a logical volume write pending limit to prevent one volume from monopolizing writable cache. Because each meta member gets a small percentage of cache, a striped meta is likely to offer more writable cache to the meta volume.
Slide 22
Benchmarks have shown that Thin Device performance can be better than that of regular devices. However, real-world application performance is highly dependent on the applications that are sharing the Thin Pool. In addition to potential contention for access to shared Data Devices, response time and throughput can be impacted by the overhead incurred the first time a write is performed on an unallocated region of a Thin Device. This overhead disappears once the working set of a Thin Device has been written. Care must be taken when migrating an application whose back-end layout has already been tuned to isolate workloads, or where an application has stringent response time or throughput requirements.
Slide 23
Virtual Provisioning simplifies, but does not entirely eliminate, the need for proper planning. When implementing Virtual Provisioning, it is important that realistic utilization objectives are set. Generally, organizations should target 60 percent to 80 percent capacity utilization per pool. A buffer should be provided for unexpected growth or for a runaway application that consumes more physical capacity than was originally planned for. The Thin Pool should have free space at least equal to the capacity of the largest unallocated Thin Device.

Thin Pool storage can be managed with SMC, Solutions Enabler, and the event daemon. The event daemon offers the most flexible way of monitoring free space in a Thin Pool.

Slide 24
Given the nature of Virtual Provisioning, there is always the danger that storage space in a Thin Pool is exhausted due to over-subscription. This condition should be avoided at all costs through proper monitoring and preventive action. However, should it occur, this is what happens: Once space in the pool is exhausted, writes that require a new extent allocation on the Thin Devices fail. The reaction to the failed write will depend on the application and the operating system. The Host Connectivity Guides for all operating systems are available on EMC Powerlink and describe the behavior when the "No space on Device" condition is encountered. The data in the Thin Pool is still available for reads.

To free up pool space, you can add Data Devices to an existing pool or migrate selected Thin Devices to another pool. Running zero space reclamation on the Thin Pool will free up space if Thin Devices have allocated but unused space.
If there are bound Thin Devices whose data is no longer needed, those devices can be made not ready and unbound from the pool, thus freeing up pool space.

Slide 25
Key points covered in this module:
- Overview of Virtual Provisioning
- Benefits of Virtual Provisioning
- Components of a Virtual Provisioning environment
- Suitable and unsuitable environments for Virtual Provisioning
- Considerations for Thin Pool creation and expansion
- Tools to monitor Thin Pool storage space

Slide 26
1. See slide 6
2. See slide 5
3. See slide 7
4. See slide 9
5. See slides 9, 18
6. See slide 10
7. See slide 15
8. See slide 20
9. See slides 24, 25