In the never-ending quest for better-performing storage, flash in denser form factors has seen high adoption among users. In fact, for modern data centers that rely on server virtualization (/solutions/server-virtualization/), flash storage has proven to be revolutionary in terms of performance gains.
Modern data reduction techniques such as inline deduplication and inline compression, along with server virtualization APIs such as VAAI, help flash-based storage (/products/) deliver faster, more reliable, and feature-rich storage on commodity hardware: servers and SSDs.
How do you troubleshoot performance issues in such a solution? How do you know whether the issue is storage performance, SAN/NAS (network) performance, or the server/hypervisor?
Performance Parameters
Datasheets for flash storage arrays usually specify the following per controller: IOPS, throughput (MB/s), and latency (ms). These are the three performance parameters that adorn the Performance Dashboard on every flash array.
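Although the datasheets rarely spell it out, these three numbers are tied together: throughput is IOPS multiplied by block size. A minimal Python sketch of that relationship (units and example values here are illustrative, not from any datasheet):

```python
# Effective bandwidth needed to sustain a given IOPS target at a given block size.
def required_bandwidth_mibs(iops: int, block_size_kib: int) -> float:
    """Throughput in MiB/s = IOPS x block size (KiB) / 1024."""
    return iops * block_size_kib / 1024

# Example: 100,000 IOPS of 8 KiB blocks needs roughly 781 MiB/s.
print(required_bandwidth_mibs(100_000, 8))   # 781.25
print(required_bandwidth_mibs(100_000, 4))   # 390.625
```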
In this blog, we will not dive into SSD technologies such as TLC or 3D NAND. For performance issues with flash storage, we will focus on the aspects that make flash perform better than common legacy storage systems.
SSD technology continues to improve: SSDs are getting faster, denser, and cheaper every year. At the same time, the storage software that runs in the controller for performance and storage efficiency, such as dedupe and compression, is getting increasingly “smart.” As the technology matures to accommodate the reduced wear-life characteristics of newer SSDs, this “smart” storage software makes sure an SSD lasts five or more years.
Writes carry a penalty in MLC flash: pages cannot be overwritten in place, so an entire block must be erased before any of its pages can be rewritten. Storage software takes care of this by “collecting multiple writes” and writing them out in an “aligned burst.” Additionally, storage software features such as inline deduplication and compression “write less” to flash for duplicate blocks and compressible data.
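To make the mechanics concrete, here is a toy Python sketch of write coalescing with inline dedupe. It illustrates the idea only, not any vendor's implementation; the segment size and fingerprinting scheme are invented for the example.

```python
import hashlib

SEGMENT_SIZE = 8  # toy value: blocks collected per aligned segment write

class WriteCoalescer:
    """Collects multiple writes into one aligned burst, skipping duplicates."""

    def __init__(self):
        self.pending = []   # blocks waiting to be flushed
        self.seen = set()   # fingerprints of blocks already stored on flash

    def write(self, block: bytes):
        fp = hashlib.sha256(block).digest()
        if fp in self.seen:          # duplicate: only a metadata reference is needed
            return
        self.seen.add(fp)
        self.pending.append(block)
        if len(self.pending) == SEGMENT_SIZE:
            self.flush()

    def flush(self):
        if self.pending:
            # One aligned, full-segment write instead of many small in-place rewrites.
            print(f"flushing {len(self.pending)} blocks as one aligned burst")
            self.pending.clear()

coalescer = WriteCoalescer()
for i in range(20):
    coalescer.write(bytes([i % 5]) * 4096)   # mostly duplicate 4 KiB blocks
coalescer.flush()                            # only 5 unique blocks hit flash
```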
Segment Cleaning/Garbage Collection
Arrays with inline dedupe and compression enabled typically run a “log-structured file system,” and the RAID subsystem is always scrubbing the filesystem to identify “stale/dirty blocks” and determine whether any segments can be “cleaned” and made available for new writes.
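As a rough illustration of what segment cleaning does, here is a toy greedy cleaner in Python. The data structures and policy are invented for the example; real arrays weigh wear, data temperature, and current IO load as well.

```python
# Toy greedy segment cleaner: clean the segments with the fewest live blocks
# first (cheapest to relocate), returning the freed segments to the pool.
def clean_segments(segments, free_target):
    """segments: list of (segment_id, live_blocks, stale_blocks)."""
    freed = []
    for seg_id, live, stale in sorted(segments, key=lambda s: s[1]):
        if len(freed) >= free_target:
            break
        # "Cleaning" = relocate the live blocks, then erase the whole segment.
        print(f"segment {seg_id}: relocate {live} live, reclaim {live + stale} blocks")
        freed.append(seg_id)
    return freed

pool = [("A", 2, 30), ("B", 28, 4), ("C", 0, 32), ("D", 10, 22)]
print(clean_segments(pool, free_target=2))   # picks C, then A
```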
Performance issues can be identified and troubleshot in three areas, covered in the tables below.
1) Performance Issues Within the Array
Protocol IO: Typically, when running small-block IO (4K, 8K), every block read or written takes CPU cycles at the protocol layer. If the CPU cores allocated to protocol are running at 100% capacity, the array cannot ingest IO any faster. As a result, latency increases, which means lower read and write performance. This status typically means you need more CPU: you are running the array controllers beyond their capacity to provide sub-millisecond latency for the IO sent to them.
Dedupe and compression computation: This is usually not an area of concern, as most arrays are designed to work well when the dataset size can be reduced; there’s less work to do when writing to flash.
Segment Cleaning/GC: As the log-structured filesystem (a.k.a. the pool) runs more than 70% full, there is significant fragmentation on the filesystem, and the number of free segments for landing writes is reduced. Segment cleaning (GC) has to do more work to scrub dirty blocks in multiple segments and determine which segments need to be cleaned and returned to the free pool. This scrubbing typically affects write performance at higher pool capacity and also drives up CPU usage on the controllers. Therefore, write latencies can be expected to be higher than normal above 70% pool capacity.
Degraded RAID group: If there is an SSD failure in the array, the array could have reduced performance until the failed SSD is replaced and the rebuild is complete.
Unaligned write IO from application: Some SSD-based arrays have “variable block log-structured file systems” that allow them to ingest blocks of any size and append them to an aligned RAID-stripe write. Arrays that don’t support this feature have LUNs created with a fixed block size, usually selected at LUN creation time. In this case, if the application does not issue aligned writes to the LUNs, the amount of IO the array handles doubles for each partial write (see the sketch after this table), which in turn “over-runs the CPU” allocated for protocol IO, causing higher latency. This phenomenon may also take an additional toll on SSD wear-leveling and may draw more capacity from the over-provisioned reserve. Modern operating systems, such as Windows Server 2012 R2 and VMware vSphere, take care of partition alignment; but some older operating systems, such as Windows Server 2003, and some Linux filesystems do not align unless created with certain flags.
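The sketch below illustrates the unaligned-write penalty from the last row. It assumes a hypothetical LUN with a fixed 32 KiB block size and an application issuing 8 KiB writes; a write that straddles a block boundary touches two LUN blocks, and every partially written block implies a read-modify-write.

```python
LUN_BLOCK = 32 * 1024   # hypothetical fixed block size chosen at LUN creation
APP_WRITE = 8 * 1024    # application IO size

def lun_blocks_touched(offset: int) -> int:
    """Number of LUN blocks one application write at `offset` touches."""
    first = offset // LUN_BLOCK
    last = (offset + APP_WRITE - 1) // LUN_BLOCK
    return last - first + 1

print(lun_blocks_touched(0))                 # aligned: 1 block touched
print(lun_blocks_touched(LUN_BLOCK - 4096))  # straddles a boundary: 2 blocks
```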
2) Performance Issues Between the Array and the Server
The plumbing between the servers and the array is typically 10 GbE Ethernet or 8/16 Gbps Fibre Channel. Having enough bandwidth/throughput* between the server HBA/CNA and the array target adapters is essential to achieving optimal performance. For example, with 2 HBAs or 2 NICs per controller, the following is achievable:

2x FC8G links on one controller: 1,740 MB/s (870 MB/s per link), i.e., 13,920 Mbps, or roughly 1.69 GB/s.

*The formula for calculating effective bandwidth required is given in a previous section of this blog.

Tuning your SAN means having enough buffer-to-buffer credits for FC ports, fewer than 3-4 hops, and single-initiator/multi-target zoning. For Ethernet networks, it means proper MTU settings/jumbo frames enabled, PortFast disabled, static routes, etc., for a lossless Ethernet network carrying iSCSI or NFS traffic. Network troubleshooting is a chapter of its own and beyond the scope of this blog. Always remember that network troubleshooting is a very important part of performance monitoring and tuning.
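As a sanity check on the numbers above, here is the arithmetic behind that row, assuming ~870 MB/s of usable payload per 8 Gbps FC link:

```python
per_link_mbs = 870   # usable MB/s per 8 Gbps FC link (assumed, after encoding overhead)
links = 2

total_mbs = per_link_mbs * links   # 1,740 MB/s per controller
total_mbps = total_mbs * 8         # 13,920 Mbps on the wire
total_gbs = total_mbs / 1024       # ~1.70 GB/s (the table's 1.69 is the same value truncated)

print(total_mbs, total_mbps, f"{total_gbs:.2f}")   # 1740 13920 1.70
```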
3) Performance Issues at the Server/Hypervisor
Server: 4+ socket servers have PCIe buses attached to a set of sockets, with DRAM that has affinity to those sockets. It is recommended that you do NOT “round robin” across ports that have affinity to different sides of the NUMA bus, as this floods the CPU queue with interrupts from the adapter and causes IO contention. In other words, place all your adapters in PCIe bus(es) that have affinity to a set of sockets on the same side of the NUMA bus. On Linux, let irqbalance handle interrupts. Hyper-V and vSphere take care of this automatically when VMs are created.
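On Linux, you can check which NUMA node an adapter's PCIe slot hangs off via sysfs before deciding how to group ports. A small sketch; the interface names are examples:

```python
from pathlib import Path

def nic_numa_node(dev: str) -> int:
    """NUMA node of `dev`'s PCIe slot; -1 if the platform doesn't report one."""
    return int(Path(f"/sys/class/net/{dev}/device/numa_node").read_text())

for dev in ("eth0", "eth1"):          # example interface names
    try:
        print(dev, "-> NUMA node", nic_numa_node(dev))
    except FileNotFoundError:
        print(dev, "-> not present on this host")
```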
OS: Multi-pathing: please follow the best practices provided by the array vendor on multi-pathing and other tunables (e.g., udev rules for Linux). Windows provides perfmon for performance analysis; Linux provides tools like iostat, vmstat, and sar to monitor performance.
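For example, a quick Linux check that flags devices whose iostat wait times exceed the sub-millisecond expectation discussed earlier. Column names vary across sysstat versions (“await” vs. “r_await”/“w_await”), so treat this as a starting point rather than a finished tool:

```python
import subprocess

out = subprocess.run(["iostat", "-dx"], capture_output=True, text=True).stdout
lines = [l.split() for l in out.splitlines() if l.strip()]
header = next(l for l in lines if l[0].startswith("Device"))
await_cols = [i for i, col in enumerate(header) if "await" in col]

for row in lines[lines.index(header) + 1:]:
    if any(float(row[i]) > 1.0 for i in await_cols):   # > 1 ms average wait
        print("high latency on", row[0], [row[i] for i in await_cols])
```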
Hypervisor tuning: There are infodocs available from VMware on HBA tunables. Additionally, the Tegile vSphere plugin will tune the Tegile array automatically for you. esxtop, a utility on vSphere, is an exceptional tool that helps analyze latency at the hypervisor level and also helps analyze VAAI statistics to determine whether VAAI is doing its job.
In-guest tuning: There are certain guidelines available for tuning the guest OS for lower CPU use, for example, using PVSCSI instead of LSI Logic SCSI controllers in a VMware virtual machine. Linux provides udev rules, and filesystems can be created at specific offsets to produce aligned IO. The same tools available in the operating system are also available inside the guest, such as perfmon for Windows, and iostat, vmstat, and sar for Linux, to monitor guest performance.
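For instance, here is a quick way to verify partition alignment on Linux from sysfs. The sketch assumes 512-byte sectors and the common 1 MiB alignment target that modern installers pick; the disk and partition names are examples:

```python
from pathlib import Path

SECTOR = 512            # assumed logical sector size
ALIGN = 1024 * 1024     # 1 MiB alignment target

def partition_aligned(disk: str, part: str) -> bool:
    """True if the partition's start offset is a multiple of 1 MiB."""
    start = int(Path(f"/sys/block/{disk}/{part}/start").read_text())
    return (start * SECTOR) % ALIGN == 0

print(partition_aligned("sda", "sda1"))   # e.g., start sector 2048 -> True
```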
If you want to know what Tegile flash storage can do for performance in your data center, we
invite you to request a demo (http://pages.tegile.com/Request-A-Demo.html) today.