Storage Virtualization
Team 3
Jennifer Brola-Richards Mohib Fanek Kathy Larson Donovan Miles Vishu Reddy Fran Trees
Presentation Outline
Storage Virtualization
What is storage virtualization and why storage virtualization?
Case Study
Research Topics in Storage Virtualization
What are potential topics of research and dissertation?
Storage virtualization is the next frontier in storage advances; it aims to provide a layer of abstraction that reduces complexity. The Storage Networking Industry Association (SNIA) defines storage virtualization as:
1. The act of abstracting, hiding, or isolating the internal functions of a storage (sub)system or service from applications, host computers, or general network resources, for the purpose of enabling application- and network-independent management of storage or data.
2. The application of virtualization to storage services or devices for the purpose of aggregating functions or devices, hiding complexity, or adding new capabilities to lower-level storage resources.
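The aggregation described in the second definition can be illustrated with a minimal sketch. It assumes the simplest possible scheme, concatenating devices into one virtual block address space; the class name, device names, and sizes are hypothetical and do not come from SNIA or any product.

```python
from bisect import bisect_right
from itertools import accumulate


class VirtualVolume:
    """Presents several physical devices as one contiguous block address space."""

    def __init__(self, device_sizes):
        # device_sizes: hypothetical mapping of device name -> size in blocks
        self.names = list(device_sizes)
        sizes = [device_sizes[name] for name in self.names]
        # ends[i] is the first virtual block number past device i
        self.ends = list(accumulate(sizes))

    @property
    def total_blocks(self):
        return self.ends[-1] if self.ends else 0

    def resolve(self, virtual_block):
        """Translate a virtual block number to (device name, physical block)."""
        if not 0 <= virtual_block < self.total_blocks:
            raise IndexError("virtual block out of range")
        index = bisect_right(self.ends, virtual_block)
        start = self.ends[index - 1] if index > 0 else 0
        return self.names[index], virtual_block - start


# Three devices of different sizes appear to the host as one 350-block volume.
volume = VirtualVolume({"disk_a": 100, "disk_b": 50, "disk_c": 200})
print(volume.total_blocks)
print(volume.resolve(120))
```

The host only ever sees virtual block numbers; which device holds a block is hidden behind `resolve`, which is the "hiding complexity" part of the definition.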
Storage virtualization aims to provide a layer of abstraction to manage storage and reduce complexity:
Provide continuous availability despite exponential growth (e.g., Facebook: over 55 billion page views a month, 41 million active users¹).
Effectively group and manage heterogeneous storage devices and servers (e.g., an estimated 450,000 Google servers²).
Allocate and manage storage in accordance with the Quality of Service (QoS) associated with the data (e.g., Gartner estimates that the average data center doubles its storage every 18 to 24 months).
Mergers and acquisitions (e.g., Microsoft and Yahoo!).
(1)
Client-side storage innovations: a variety of storage device innovations that are smaller, higher capacity, and cheaper have helped end users cope with increasing storage requirements.
Server-side storage innovations: a combination of storage device, storage interface, and storage software innovations has helped enterprises cope with the exponential growth of data storage requirements. Storage devices have evolved from tapes to hard drives to RAID hard drives, increasing capacity and resiliency.
Storage interfaces have evolved from SCSI to iSCSI, Fibre Channel (FCP), and InfiniBand to interconnect devices and transport data faster.
[Figure labels: SCSI, iSCSI, FCP, InfiniBand]
Storage access: file-level access takes center stage along with conventional block-level access.
Block-level access: block addresses are used to read and write data [Read/Write, Block #] on the storage media.
[Figure: Sample conventional block allocation map]
File-level access: files are accessed through "semantic" instructions (for example, Open and Close). Data inside a file is accessed by byte range within the file (for example, the first 10 bytes of a file). GFS (Google File System) is an example of a large-scale distributed file system.
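The difference between the two access styles can be sketched as follows. This is an illustration only: an ordinary temporary file stands in for a block device, the 512-byte block size is an assumption, and the helper names `read_block` and `write_block` are hypothetical.

```python
import os
import tempfile

BLOCK_SIZE = 512  # assumed block size, for illustration only


def read_block(device, block_number):
    """Block-level read: the caller names a block address, not a file."""
    device.seek(block_number * BLOCK_SIZE)
    return device.read(BLOCK_SIZE)


def write_block(device, block_number, data):
    """Block-level write: always exactly one whole block."""
    if len(data) != BLOCK_SIZE:
        raise ValueError("block-level writes must be exactly one block")
    device.seek(block_number * BLOCK_SIZE)
    device.write(data)


with tempfile.TemporaryDirectory() as tmp:
    # Block-level access: [Write, Block 2] followed by [Read, Block 2].
    image_path = os.path.join(tmp, "disk.img")
    with open(image_path, "wb") as image:
        image.write(bytes(4 * BLOCK_SIZE))  # a four-block stand-in "device"
    with open(image_path, "r+b") as device:
        write_block(device, 2, b"A" * BLOCK_SIZE)
        block = read_block(device, 2)

    # File-level access: Open, read a byte range, Close.
    notes_path = os.path.join(tmp, "notes.txt")
    with open(notes_path, "wb") as notes:
        notes.write(b"storage virtualization")
    with open(notes_path, "rb") as notes:
        first_ten = notes.read(10)  # the first 10 bytes of the file

print(len(block), first_ten)
```

In the block-level half the caller must know where data lives; in the file-level half the file system keeps that mapping and the caller works only with names and byte ranges.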
Metadata is data about data. In the context of storage, metadata may describe an individual datum or content item, or a collection of data including multiple content items. Examples include file size, who created the file, attributes such as read-only, free-block bitmaps, and control data.
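As a small illustration of file metadata, the sketch below reads a few of the attributes listed above through Python's standard `os.stat` call. The file name and the keys of the `metadata` dictionary are hypothetical, and the owner field is only meaningful on POSIX systems.

```python
import os
import stat
import tempfile
import time

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "report.txt")
    with open(path, "w") as report:
        report.write("hello")
    os.chmod(path, stat.S_IRUSR)  # mark the file read-only for its owner

    info = os.stat(path)  # returns metadata only; file contents are not read
    metadata = {
        "size_bytes": info.st_size,
        "owner_uid": info.st_uid,
        "read_only": not (info.st_mode & stat.S_IWUSR),
        "modified": time.ctime(info.st_mtime),
    }

    # Restore write permission so the temporary directory can be cleaned up.
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)

print(metadata)
```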
Storage software: from simple backup and restore to advanced storage networks and storage management software functions.
(A) Simple Direct Attached Storage (DAS)
Architecture: Decentralized | Centralized
Transport protocol: Layer over TCP/IP | SCSI/FC and SCSI/IP
Efficiency: Less | More
Sharing and access control: Good | Poor
Typical applications: Web | Database
Typical clients: Workstations | Database servers
[Figure: Device Virtualization]
Source: The Storage Networking Tutorials, SNIAVIRT, page 20, http://www.snia.org/education/tutorials/
* Host, also known as server
** Device = aggregation of host and network (metadata)
1. Storage Device Level Virtualization
2. Host Level Virtualization
3. File Level Virtualization
4. Block Virtualization
5. Device Virtualization
6. Network Virtualization
Historical: mainframe. Recent development example: VMware.
Historical: RAID level, SCSI interface. Recent development example: Fibre Channel.
Sub-Technique
Sub-Technique
Symmetrical (also known as in-band) and asymmetrical (also known as out-of-band) virtualization are emerging as key areas of abstraction and virtualization.
In-band: metadata or Storage Volume Controllers (SVCs) are placed in the path of data flow.
Out-of-band: metadata or Storage Volume Controllers are placed outside the path of data flow.
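The two placements can be contrasted with a minimal in-memory sketch. All class names, the mapping table, and the counters are hypothetical and exist only to show where the data travels; real controllers also handle access control, caching, and failure handling, which are omitted here.

```python
class StoragePool:
    """Stand-in for the physical storage devices (hypothetical, in-memory)."""

    def __init__(self):
        self.blocks = {}

    def write(self, device, block, data):
        self.blocks[(device, block)] = data

    def read(self, device, block):
        return self.blocks[(device, block)]


class MetadataController:
    """Holds the metadata: virtual block -> (device, physical block)."""

    def __init__(self, mapping):
        self.mapping = dict(mapping)
        self.lookups = 0

    def locate(self, virtual_block):
        self.lookups += 1
        return self.mapping[virtual_block]


class InBandController(MetadataController):
    """In-band: every byte of host I/O passes through the controller."""

    def __init__(self, mapping, pool):
        super().__init__(mapping)
        self.pool = pool
        self.bytes_forwarded = 0

    def write(self, virtual_block, data):
        device, block = self.locate(virtual_block)
        self.bytes_forwarded += len(data)
        self.pool.write(device, block, data)

    def read(self, virtual_block):
        device, block = self.locate(virtual_block)
        data = self.pool.read(device, block)
        self.bytes_forwarded += len(data)
        return data


class OutOfBandHost:
    """Out-of-band: the host asks the controller where data lives,
    then performs the I/O against the pool directly."""

    def __init__(self, controller, pool):
        self.controller = controller
        self.pool = pool

    def write(self, virtual_block, data):
        device, block = self.controller.locate(virtual_block)  # control path
        self.pool.write(device, block, data)                   # data path

    def read(self, virtual_block):
        device, block = self.controller.locate(virtual_block)
        return self.pool.read(device, block)


mapping = {0: ("array_1", 7), 1: ("array_2", 3)}

in_band = InBandController(mapping, StoragePool())
in_band.write(0, b"payload")
in_band_result = in_band.read(0)

oob_controller = MetadataController(mapping)
oob_pool = StoragePool()
host = OutOfBandHost(oob_controller, oob_pool)
host.write(1, b"payload")
oob_result = host.read(1)

print(in_band.bytes_forwarded, oob_controller.lookups)
```

In the in-band case the controller's `bytes_forwarded` counter grows with every transfer, while the out-of-band controller only answers lookups and never touches the payload.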
In-Band Virtualization
1. Metadata or Storage Volume Controllers (SVCs) are placed in-band, that is, in the path of data flow.
2. The SVC controls who can access the storage device, how storage can be accessed, how storage is allocated, etc.
3. The storage pool sends metadata to the SVC.
[Figure: Virtualization Engine. Network diagram of the primary site showing SAN Fabric A and SAN Fabric B, a director, Type 1 storage and Type 2 SAN storage, network appliances, a DWDM link, Ethernet, a management VLAN (QA/DEV: storage, library, director; PROD: blades and blade fabric), and a VPN comm-link for remote support. Environments: PROD, DEV, QA, SIT. Applications: App1, App2. D. Miles, 06/09/07]
Case Study
The Study
1. Shows that commingling of data and metadata on a single logical device means that there is no way to achieve different service level objectives for data and metadata in the same file system without moving file-system-specific knowledge into the logical disk layers.
2. Shows that the standard assumptions underlying the organization of data and metadata in file systems are no longer valid in a virtualized storage environment and hence fail to realize the full benefits of storage virtualization.
3. Proposes a different file system organization of data and metadata designed to exploit the power of virtualized storage.
Service level requirements within a single file system
Organization A: needs no encryption.
Organization B: needs encryption. Stores medical records. Security requirements for file data are extremely high. Performs a nightly indexing operation on file systems. All directory information and file access times must be read to determine the changed state of data. Business requirement that all file data be encrypted at rest. File metadata has no security requirement.
In the Unix Fast File System (FFS), a logical disk is divided into collections of blocks called cylinder groups, each of which stores both file data blocks and file metadata.
Results
Clean logical separation between data and metadata.
Allows file system features to use virtualization features and achieve different SLOs.
Redesign changes (code change)
Packing the relocated cylinder group headers in the first few metadata cylinder groups ensures that each header is located at a fixed, predictable offset from the front of the block device.
A user-configurable block address before which no data is stored and after which no metadata is stored.
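The idea of a configurable boundary address can be sketched as below. This is not the code from the study; the class, its method names, and the sizes are hypothetical, and a simple bump allocator stands in for real block allocation. It only shows why a boundary lets a lower layer apply a data-only policy from the block address alone.

```python
class SplitLayout:
    """Metadata lives below a boundary block address, file data at or above it."""

    def __init__(self, total_blocks, boundary):
        if not 0 < boundary < total_blocks:
            raise ValueError("boundary must fall inside the device")
        self.total_blocks = total_blocks
        self.boundary = boundary      # hypothetical user-configurable address
        self.next_metadata = 0
        self.next_data = boundary

    def allocate(self, kind):
        """Hand out the next free block in the region that matches `kind`."""
        if kind == "metadata":
            if self.next_metadata >= self.boundary:
                raise RuntimeError("metadata region is full")
            block = self.next_metadata
            self.next_metadata += 1
        elif kind == "data":
            if self.next_data >= self.total_blocks:
                raise RuntimeError("data region is full")
            block = self.next_data
            self.next_data += 1
        else:
            raise ValueError("kind must be 'metadata' or 'data'")
        return block

    def needs_encryption(self, block):
        """Data-only encryption, decided without any file system knowledge."""
        return block >= self.boundary


layout = SplitLayout(total_blocks=1000, boundary=100)
inode_block = layout.allocate("metadata")
file_block = layout.allocate("data")
print(inode_block, layout.needs_encryption(inode_block))
print(file_block, layout.needs_encryption(file_block))
```

Because the policy depends only on the block address, a virtualization layer beneath the file system could apply it per address range, which is the kind of differing service level objective the case study aims for.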
Gains of 5-7% on the new file system layout.
Gains of 31-44% for the file lookup and file delete benchmarks, which result in little or no file data I/O; here the advantage of data-only encryption becomes obvious.
Future Work
Differing SLOs for granular metadata.
Completely separate fixed and dynamic metadata.
Separate file data from user-defined file attribute data.
What are potential topics of research and dissertation?
Sample Research Topics in Storage Virtualization
Bayesian analysis for resource management
Annotated References
1. Faibish, S., Fridella, S., Bixby, P., and Gupta, U., "Storage Virtualization using a Block-device File System," ACM SIGOPS Operating Systems Review, Volume 42, Issue 1, January 2008. Publisher: ACM.
2. The Storage Networking Tutorials, SNIAVIRT, http://www.snia.org/education/tutorials/
3. http://en.wikipedia.org/wiki/Metadata
4. http://www.redbooks.ibm.com/redbooks/pdfs/sg246210.pdf
5. Nealan, L., php|works, Atlanta, September 13, 2007, http://sizzo.org/wp/wp-content/uploads/2007/09/facebook_performance_caching.pdf
6. http://en.wikipedia.org/wiki/Google_platform