What a VM administrator needs to know to avoid performance problems caused by insufficient storage capacity and plan for environment growth
Table of Contents
Introduction
Insufficient Storage Capacity Causes VM Performance Problems
How VM Data Flows to Shared Storage: From Host to SAN and Back
The Hardware and Design Decisions that Determine Storage Capacity
Storage Space
Throughput
Inter-connectivity/Fabric
Host Bus Adaptor
Storage Controller
Spindles
How to Forecast Storage Capacity Requirements
Virtualization will Vastly Increase Storage Volume
Virtual Environments Employ Numerous Host to Datastore Connection Permutations
Virtualization Will Cause Unpredictable Throughput Concentrations
Additional Storage Abstraction Layers are Used with Virtualization
Virtualization Introduces Environment Dynamism and Automation
Conclusion
Introduction
Many VM performance issues stem from bottlenecks in a data center's storage throughput capacity. These issues originate from inadequate planning for a virtual environment's storage needs. They can be avoided with visibility into the storage infrastructure and a planning process that expands storage resources in line with the application usage growth the environment will face. Even for data centers that employed Storage Area Networks (SAN) for their physical servers before shifting to a virtual infrastructure, virtualization adds complexity that must be factored into storage planning decisions, documentation procedures, and issue troubleshooting. This whitepaper presents nine considerations that VM administrators must know about storage to avoid VM performance problems, forecast storage capacity needs, and understand how virtualization makes data center storage more complex.
How VM Data Flows to Shared Storage: From Host to SAN and Back
Shared storage requires an infrastructure with several different hardware components. Importantly, data can move through the entire infrastructure only as fast as the lowest-capacity component, the weakest link, allows. For example, if an organization has deployed high capacity spindles (the actual disks that are being written to) but uses a much lower capacity fabric for interconnectivity with the ESX or Hyper-V hosts, the advantage of having the higher capacity spindles will be negated. Consequently, the entire storage system will only be able to handle the amount of data that can fit through the fabric. Due to the systemic nature of storage, a holistic view of an organization's application performance needs is required to define storage and storage access capacity requirements.
Figure 1 illustrates how data flows from a host to a disk. As a command leaves the ESX or Hyper-V host for the datastore, it travels to the Host Bus Adaptor, which acts as a travel agent for the command and tells it where it needs to go within the SAN and how to get there. After passing through the Host Bus Adaptor, the command crosses the inter-connectivity layer, or fabric, to arrive at the SAN. There it reaches the Storage Controller, which points it to the correct spindle. At the physical spindle, the command executes its instructions and is given a response to deliver back to the ESX or Hyper-V host. The response then travels through the same infrastructure levels in reverse, back to the host. A bottleneck or issue at any level in this flow will cause performance problems, which will be detected as increased latency.
2010 VKernel Corporation. All rights reserved
[Figure 1: ESX/Hyper-V Host to Inter-connectivity/Fabric to Storage Controller to Spindle]
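The weakest-link behavior of the host-to-disk path can be sketched in a few lines of Python. This is a minimal illustration only: the component names follow the text, but the MB/s figures and the function name `effective_throughput` are hypothetical and are not taken from any real hardware.

```python
# Hypothetical throughput ceilings in MB/s for each stage of the data path.
# The numbers are illustrative assumptions, not vendor specifications.
path_capacity_mb_s = {
    "host_bus_adaptor": 800,
    "fabric": 400,
    "storage_controller": 1600,
    "spindles": 1200,
}


def effective_throughput(capacities):
    """Return (bottleneck component, throughput) for a serial data path.

    Data crosses every component in turn, so the path as a whole can
    move no more than its lowest-capacity component allows.
    """
    bottleneck = min(capacities, key=capacities.get)
    return bottleneck, capacities[bottleneck]


component, limit = effective_throughput(path_capacity_mb_s)
print(f"Bottleneck: {component} at {limit} MB/s")
```

With these assumed figures the fabric caps the whole path, so the extra capacity of the spindles and the storage controller goes unused.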
Figure 2 The Connections in Hardware Decision Points that Determine Storage Capacity
[Diagram labels: Storage Capacity; Storage Space; Throughput; Inter-connectivity/Fabric; Storage Hardware; Overhead; Spindles; Storage Controller]
Storage Space
Storage space refers to the actual amount of disk space needed for a virtual environment to continue running. Importantly, if a VM needs storage space and none is available, the VM or an application on the VM will cease functioning. Virtualized infrastructure storage needs can be much larger than the storage that was necessary when the same applications ran on physical servers, because VMs will likely require more storage space than just their allocations. Each VM may have an associated snapshot for quick maintenance purposes. Additional storage will also be needed to host the templates used to quickly provision VMs, and there may be use cases where multiple VM images of the same VMDK or VHD file must exist. In addition, because VMs can move around with vMotion, the storage resource as a whole is always fluid and dynamic, and a buffer must be left so that the environment has enough slack to rebalance itself when necessary.
Notably, decisions on data redundancy and the RAID scheme employed will impact the amount of storage space needed. Depending on the RAID scheme, up to double the amount of physical storage space will be needed for the actual data being stored. To add further complexity, different parts of the SAN may be configured at different RAID levels based on the criticality and performance needs of the information being stored and accessed. Accordingly, the decisions made at the spindle level directly impact the amount of storage space that is needed.
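The space calculation described above can be sketched as follows. This is a rough estimate under stated assumptions: the function names, the input figures and the slack fraction are hypothetical, and the raw-to-usable ratios are the standard ones for mirrored and parity RAID (2x for RAID 1/10, n/(n-1) for RAID 5, n/(n-2) for RAID 6), not values given in this whitepaper.

```python
def raid_raw_multiplier(level, disks_per_group):
    """Raw capacity required per unit of usable capacity."""
    if level == "0":
        return 1.0                      # striping only, no redundancy
    if level in ("1", "10"):
        return 2.0                      # every block is mirrored
    if level == "5":
        if disks_per_group < 3:
            raise ValueError("RAID 5 needs at least 3 disks")
        return disks_per_group / (disks_per_group - 1)   # one parity disk
    if level == "6":
        if disks_per_group < 4:
            raise ValueError("RAID 6 needs at least 4 disks")
        return disks_per_group / (disks_per_group - 2)   # two parity disks
    raise ValueError(f"unsupported RAID level: {level}")


def raw_space_needed_gb(allocated_gb, snapshot_gb, template_gb,
                        slack_fraction, level, disks_per_group):
    """Estimate physical space: VM data plus slack, scaled by RAID overhead."""
    usable = (allocated_gb + snapshot_gb + template_gb) * (1 + slack_fraction)
    return usable * raid_raw_multiplier(level, disks_per_group)


# Hypothetical environment: 1000 GB allocated, 200 GB of snapshots,
# 50 GB of templates, 20% slack for rebalancing.
for level in ("10", "5"):
    needed = raw_space_needed_gb(1000, 200, 50, 0.20, level, 5)
    print(f"RAID {level}: about {needed:.0f} GB of raw disk")
```

The mirrored case shows the "up to double" figure from the text, while a five-disk RAID 5 group needs only a quarter more raw space than usable space.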
Throughput
Disk throughput is the resource in which capacity bottlenecks most commonly arise. These issues can be difficult to pinpoint and are often noticed first as a high latency value. Ensuring that the hardware in the storage solution fits the expected data transaction load is therefore critical.
Inter-connectivity/Fabric
The connection between the host and the datastore is critical in determining the amount of throughput that will be available to the SAN. Several options exist for both connection technology and file standards, with price varying almost directly with the available bandwidth, and thus with the throughput capacity of the hardware. Although some high capacity solutions such as Fibre Channel may be very expensive, storage throughput has such a high impact on VM performance that these solutions may be necessary to maintain a high level of service, depending on the types of applications running in the environment.
Host Bus Adaptor
The host bus adaptor is the piece of hardware that directs the command from the host to the disk and then catches the return message. Generally, this hardware component is specified by the host hardware vendor.
Storage Controller
The storage controller is the piece of hardware that receives commands from the fabric at the SAN and interfaces with the spindles. Depending on the RAID standard used, the controller may need to perform additional calculations for complex RAID performance or data redundancy implementations. If such RAID standards are used, additional time, measured in milliseconds, will be needed every time a command is sent to disk or returned. This could impact performance, especially if other areas of the storage infrastructure become constricted.
Spindles
The spindle is the actual disk that stores the data being written and accessed by the virtual environment. Tremendous variability exists in the technical capabilities of spindles, with price directly mapping to performance and amount of storage space. Additionally, a fabric must exist between the storage controller and the spindles which, like the fabric that connects the SAN to the hosts, can vary greatly in throughput capacity and price. As mentioned previously, if the throughput capacity at the spindle level does not match the other components in the storage infrastructure stack (most critically, the fabric connecting the host to the SAN), some of the capacity that a disk has will not be accessible, or vice versa.
The RAID standards to be employed must be decided at the spindle level, and they directly affect the aggregate performance of all disk reads and writes. RAID standards also drive the total amount of storage needed, because more intensive RAID standards for data redundancy or performance carry higher storage overhead, which translates into greater amounts of storage space. Additionally, any level of data deduplication (i.e. redundancies shared by several data objects, such as when an often-used operating system within a VM is housed in one place) will make storage usage more efficient.
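The aggregate effect of the RAID choice on reads and writes can be approximated with a back-of-the-envelope calculation. The sketch below uses commonly cited rule-of-thumb write penalties (the number of back-end disk operations per front-end write). These penalty values, the function name `backend_iops` and the workload figures are assumptions added for illustration and do not come from this whitepaper; real controllers with caching may behave differently.

```python
# Rule-of-thumb back-end operations generated by one front-end write.
# These are commonly quoted planning values, assumed here for illustration.
WRITE_PENALTY = {"0": 1, "10": 2, "5": 4, "6": 6}


def backend_iops(frontend_iops, write_fraction, raid_level):
    """Estimate the disk operations per second the spindles must sustain.

    Reads map one-to-one to the back end, while each write is multiplied
    by the RAID write penalty.
    """
    if not 0.0 <= write_fraction <= 1.0:
        raise ValueError("write_fraction must be between 0 and 1")
    reads = frontend_iops * (1 - write_fraction)
    writes = frontend_iops * write_fraction
    return reads + writes * WRITE_PENALTY[raid_level]


# Hypothetical workload: 1000 IOPS from the hosts, half of them writes.
for level in ("10", "5", "6"):
    print(f"RAID {level}: {backend_iops(1000, 0.5, level):.0f} back-end IOPS")
```

Under these assumptions the same host workload demands noticeably more spindle throughput as the redundancy scheme becomes more intensive, which is the trade-off the section describes.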
A storage administrator will then be able to make appropriate decisions on how to increase storage and disk I/O capacity to serve the expansion of the environment.
Figure 3 Physical Host to LUN Connection
In the virtualized world, a host can map to a large number of datastores based on the needs of the VMs within the host. Because an issue can occur within any one of these host to datastore connections, the number of areas that must be monitored increases drastically. As Figure 4 shows, the sheer number of permutations can add significant complexity if, for example, a latency issue needs to be investigated to find the root cause. Some environments may also choose to replicate the existing connections for redundancy, causing the total number of connections to grow even further.
[Figure 4: Virtualized Hosts, Datastores]
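How quickly the number of host to datastore connections grows can be seen by enumerating them. The sketch below assumes the worst case in which every host maps to every datastore; the host and datastore names, the `redundant_copies` parameter and the function name are hypothetical.

```python
from itertools import product


def connection_paths(hosts, datastores, redundant_copies=1):
    """List every host to datastore path, repeated once per redundant copy.

    Assumes a full mesh (each host mapped to each datastore), which is an
    upper bound on the paths that may need to be monitored.
    """
    return [
        (host, datastore, copy)
        for host, datastore in product(hosts, datastores)
        for copy in range(redundant_copies)
    ]


hosts = ["host1", "host2", "host3"]
datastores = ["ds1", "ds2", "ds3", "ds4"]

single = connection_paths(hosts, datastores)
mirrored = connection_paths(hosts, datastores, redundant_copies=2)
print(len(single), "paths without redundancy")
print(len(mirrored), "paths with every connection duplicated")
```

Even this small assumed environment yields a dozen paths to inspect when tracing a latency issue, and duplicating connections for redundancy doubles that count.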
environment must have enough capacity to handle not only regular operating needs, but also peaks in usage. With this multiple host to datastore connection structure, issues can occur quickly with little to no warning and can be hard to track down as the entire virtual environment keeps on shifting.
Conclusion
Having visibility into the capacity for storage space and disk throughput is critical to maintaining a high performing environment. Many capacity bottlenecks occur within the disk throughput resource area, cause massive performance problems, and can be hard to track down. Because of the dynamic nature of shared storage, slack must also be built into storage resource calculations to allow for the additional capacity needed at peak times or when self-balancing actions such as vMotion occur. As environments are always growing, storage administrators need visibility into application and application usage growth to adequately plan and purchase the hardware necessary to accommodate that growth. The infrastructure to operate shared storage is complex and has many moving parts, and its total bandwidth is only as robust as its weakest link. Without adequate storage planning, virtualized environments run the risk of running out of disk throughput capacity and facing VM performance problems or stunted growth.