
Block level sharing of storage in microservers using Lightpeak technology

Introduction
- Microservers are essentially servers that pack many processors and a large amount of storage into a single chassis.
- Sharing of data in microservers is conventionally done using iSCSI (Internet Small Computer System Interface).
- iSCSI is a protocol that defines a set of rules for transferring block data over IP networks.
- It uses conventional network devices and cabling to implement data sharing. The fact that it relies on existing infrastructure is also its biggest limitation.
- Light Peak provides multi-protocol, high-performance communication (10 Gbps bandwidth) at extremely low cost and is therefore an appealing alternative.
- Light Peak, commercialized as Thunderbolt, is a technology developed by Intel that combines PCI Express and DisplayPort into a high-speed serial bus offering transfer speeds of up to 10 Gbps.
- Unlike iSCSI, which is used with Storage Area Networks, LBLK (Lightpeak Block Transport) is specifically designed for sharing storage amongst microservers within a single microserver chassis.

- Our design leverages Light Peak technology as a system fabric to provide multi-protocol, high-performance communication at extremely low cost.

Dept. of Computer Science & Engineering, ASIET

iSCSI
SCSI (Small Computer System Interface) is a set of standards for physically connecting and transferring data between computers and storage devices. iSCSI (Internet SCSI) takes the SCSI network to a larger scale, i.e. over IP networks, and is used for linking data storage facilities across the globe. It essentially allows clients (called initiators) to send SCSI commands to SCSI storage devices on remote servers. The cabling used is standard cabling, no special infrastructure is needed, and the transport protocol is TCP. In essence, iSCSI takes a high-performance local storage bus and emulates it over a WAN. However, there are two major disadvantages to using iSCSI:
1. Performance can degrade if iSCSI is not operated on a dedicated network. Coupled with the fact that iSCSI is designed to run on existing network infrastructure, this means it rarely achieves its full potential.
2. A computer connected to an iSCSI network cannot boot using files on a storage array.
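As a toy illustration of emulating a local storage bus over ordinary TCP, the sketch below serves a block READ request over a socket. This is not real iSCSI: the 9-byte request header and single-request server are invented for illustration, and only the opcode 0x28 (SCSI READ(10)) comes from the actual SCSI command set.

```python
import socket
import struct
import threading

BLOCK_SIZE = 512  # a common SCSI logical-block size

def recv_exact(conn, n):
    """TCP is a byte stream, so loop until exactly n bytes arrive."""
    buf = b""
    while len(buf) < n:
        chunk = conn.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed early")
        buf += chunk
    return buf

def serve_one_request(listener, storage):
    """Toy target: answer a single READ(lba, count) request over TCP."""
    conn, _ = listener.accept()
    with conn:
        # Invented 9-byte header: opcode (1 B), LBA (4 B), count (4 B)
        op, lba, count = struct.unpack("!BII", recv_exact(conn, 9))
        if op == 0x28:  # SCSI READ(10) opcode
            start = lba * BLOCK_SIZE
            conn.sendall(storage[start:start + count * BLOCK_SIZE])

# Target side: 8 blocks of backing storage behind a TCP listener
storage = b"A" * BLOCK_SIZE + b"B" * 7 * BLOCK_SIZE
listener = socket.socket()
listener.bind(("127.0.0.1", 0))
listener.listen(1)
t = threading.Thread(target=serve_one_request, args=(listener, storage))
t.start()

# Initiator side: the "local storage bus" command travels over plain
# TCP on existing network infrastructure, just as in iSCSI.
cli = socket.create_connection(listener.getsockname())
cli.sendall(struct.pack("!BII", 0x28, 0, 1))  # READ 1 block at LBA 0
block = recv_exact(cli, BLOCK_SIZE)
cli.close()
t.join()
print(len(block), block[:4])  # 512 b'AAAA'
```

The sketch also shows why performance depends on the network: every block crosses the shared TCP/IP stack rather than a dedicated storage bus.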

Concepts
There are a few basic concepts one needs to be familiar with when dealing with iSCSI:

(i) Initiator: The initiator issues SCSI commands over an IP network; it functions as the client. There are two ways to implement an initiator:

a) Software initiator: Code is used to implement SCSI commands. It uses the network card and the network stack to emulate SCSI devices by speaking the iSCSI protocol.


b) Hardware initiator: Dedicated hardware is used to establish and maintain the connection. This method is more efficient since it offloads iSCSI and TCP processing and Ethernet interrupts from the host. The main component in this method is the Host Bus Adapter (HBA).

(ii) Target: A storage resource located on an iSCSI server, often a dedicated network-connected hard disk. A target may be accessed by multiple clients at the same time.

(iii) Storage Array: The targets reside in the storage array, which provides distinct iSCSI targets for numerous clients.
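The three roles above can be captured in a minimal object model. All names and sizes here are illustrative, not drawn from any real iSCSI implementation:

```python
class Target:
    """(ii) Target: a storage resource on the iSCSI server."""
    def __init__(self, size):
        self.blocks = bytearray(size)

    def read(self, offset, length):
        return bytes(self.blocks[offset:offset + length])

    def write(self, offset, data):
        self.blocks[offset:offset + len(data)] = data

class StorageArray:
    """(iii) Storage array: houses the targets and provides a distinct
    target for each of its numerous clients."""
    def __init__(self):
        self.targets = {}

    def provision(self, client, size):
        self.targets[client] = Target(size)
        return self.targets[client]

class SoftwareInitiator:
    """(i)(a) Software initiator: client-side code issuing commands."""
    def __init__(self, target):
        self.target = target

    def write(self, offset, data):
        self.target.write(offset, data)

    def read(self, offset, length):
        return self.target.read(offset, length)

array = StorageArray()
node_a = SoftwareInitiator(array.provision("node-a", 4096))
node_b = SoftwareInitiator(array.provision("node-b", 4096))
node_a.write(0, b"hello")
# Each client sees only its own distinct target:
print(node_a.read(0, 5))  # b'hello'
print(node_b.read(0, 5))  # b'\x00\x00\x00\x00\x00'
```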


Security
The security provided by iSCSI is very basic. It offers clear-text protection, meaning that it prevents sensitive data such as passwords from travelling over the wire as readable text, and unrelated initiators cannot access a data resource. However, since iSCSI shares its network infrastructure with other networks, a simple cabling mistake can remove the barrier between an iSCSI network and a regular network. As noted earlier, iSCSI uses the same network hardware and cabling as other networks, so the question arises of how an iSCSI network is separated from a regular one. The separation exists only at a logical level: it is achieved using Virtual LANs (VLANs), which partition a physical network so that distinct broadcast domains are created using a switch or router.
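The logical separation can be illustrated with a toy switch model. The VLAN IDs 10 and 20 are arbitrary example values:

```python
# Frames carry a VLAN tag; a switch port belonging to one VLAN only
# delivers frames with a matching tag, so iSCSI and ordinary traffic
# form distinct broadcast domains on the same physical hardware.
OFFICE_VLAN, ISCSI_VLAN = 10, 20  # arbitrary example VLAN IDs

def switch_deliver(frames, port_vlan):
    """Return only the frames a port in `port_vlan` may see."""
    return [f for f in frames if f["vlan"] == port_vlan]

frames = [
    {"vlan": ISCSI_VLAN, "payload": "SCSI READ"},
    {"vlan": OFFICE_VLAN, "payload": "HTTP GET"},
]
print(switch_deliver(frames, ISCSI_VLAN))  # only the iSCSI frame
# A "cabling mistake" is equivalent to placing a port in the wrong
# VLAN: the logical barrier vanishes with no physical change at all.
```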


LBLK
Thunderbolt technology is at the heart of LBLK. It combines PCI Express and DisplayPort into a serial data interface that can be carried over longer and less costly cables. The Thunderbolt cable contains a pair of optical fibers used for upstream and downstream traffic. The cables can carry any form of I/O and provide full-duplex links. Unlike bus-based I/O architectures, each Thunderbolt port on a computer can provide the full bandwidth of the link in both directions. The technology functions using the standard drivers present in most modern operating systems, and all the features necessary to implement Thunderbolt are integrated into a single controller chip.


Thunderbolt Controller

The controller connects to the PCI Express and DisplayPort interfaces and combines them into a single serial interface. The resulting high-speed bus is accessible to the processor.


The entire left-hand side (in the figure) is the initiator side. The arrangement resembles a client-server relationship: the initiator acts as the client, and the server side, shown on the right-hand side of the figure, is called the target. The network layer is responsible for resolving the network address and establishing a connection. The file system on the initiator side may be either FAT32 or NTFS. The block I/O layer retrieves data stored in terms of blocks rather than segmented linear storage.

In iSCSI the target side is the main bottleneck, since each request must pass through the block I/O layer and the request queue. The request queue is an outdated feature: it was used to schedule requests with the limited processing capabilities of the systems of the time in mind, but the advent of newer, faster processors has obviated the need for the scheduler. In fact, retaining the scheduler wastes entire processing cycles. Using Light Peak infrastructure, the whole system can be redesigned to better suit the needs of today's systems, giving direct access to the block I/O layer and, in effect, to the required data. The command set used is also much simpler: the block I/O layer issues a read BIO and then proceeds directly to the data blocks, skipping the I/O scheduler and request queue.
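The difference between the two target-side paths can be sketched schematically. This is a model, not kernel code; the dict-based "BIO" and block store are invented for illustration:

```python
def iscsi_target_read(bio, request_queue, blocks):
    """iSCSI-style path: every BIO is queued, the I/O scheduler
    dispatches it, and only then are the data blocks reached."""
    request_queue.append(bio)        # enqueue for the scheduler
    queued = request_queue.pop(0)    # scheduler dispatch (extra hop)
    return blocks[queued["lba"]]

def lblk_target_read(bio, blocks):
    """LBLK path: the block I/O layer issues a read BIO and goes
    straight to the data blocks, skipping scheduler and queue."""
    return blocks[bio["lba"]]

blocks = {7: b"payload"}             # block store indexed by LBA
same = iscsi_target_read({"lba": 7}, [], blocks) == \
       lblk_target_read({"lba": 7}, blocks)
print(same)  # identical data; LBLK just omits the queue/dispatch hops
```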


The Communication Layer


Unlike iSCSI, which employs the TCP/IP stack as its communication layer, we develop a lightweight communication protocol that provides in-order and reliable delivery. The base features of this protocol are:

1. Credit-based flow control. Credits in our design represent the number of available buffers on the sink/receiver side. This definition leads to a relatively simple design and implementation compared to defining credits as a number of bytes.
2. Reliable delivery via data acknowledgements and retransmissions.
3. Simple data streaming (limited only by transmission credits).

By using a credit-based flow-control algorithm to regulate packet transmissions, the communication layer ensures that the receiver always has sufficient buffering (credits). Credit-based flow control allows for smaller buffer sizes and also reduces design and implementation complexity. During data streaming, each data header includes metadata about the data in the packet payload, including a sequence number. The sequence number is used by both source and sink to ensure in-order delivery of data. Data responses are acknowledgements transmitted to the source for data packets received; credit information is piggybacked along with the acknowledgement status in the data packet.
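The protocol's core loop (credits as free receiver buffers, sequence-numbered packets, credits piggybacked on acknowledgements) can be sketched as follows. This is a single-threaded simulation under the simplifying assumption that the sink drains each buffer immediately; class and field names are illustrative:

```python
class Sink:
    """Receiver side: credits = number of free receive buffers."""
    def __init__(self, buffers=4):
        self.free = buffers
        self.expect = 0                   # next in-order sequence number

    def receive(self, seq, payload):
        assert self.free > 0, "source overran its credits"
        assert seq == self.expect, "out-of-order packet"
        self.expect += 1
        # Simplification: the upper layer drains the buffer immediately,
        # so the credit is returned right away. The ACK piggybacks the
        # current credit count back to the source.
        return {"ack": seq, "credits": self.free}

class Source:
    """Sender side: transmits only while it holds credits."""
    def __init__(self, sink):
        self.sink = sink
        self.credits = sink.free
        self.seq = 0

    def send(self, payload):
        if self.credits == 0:
            return False                  # stall until credits come back
        self.credits -= 1
        ack = self.sink.receive(self.seq, payload)
        assert ack["ack"] == self.seq     # reliable, acknowledged delivery
        self.seq += 1
        self.credits = ack["credits"]     # refresh from piggybacked info
        return True

src = Source(Sink(buffers=4))
sent = [src.send(b"data-%d" % i) for i in range(10)]
print(all(sent))  # every packet accepted, in order, within credit limits
```

Counting whole buffers instead of bytes keeps both endpoints simple: the source needs only a single integer to know whether it may transmit.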



LBLK vs iSCSI


Benchmarking
We implemented LBLK and deployed it in our cluster environment, where four Super Micro X7SPA Atom servers are used as initiators and one Super Micro Xeon server (X8DTH-6F) is deployed as the I/O node. Each node has a PCI Express-based Light Peak card with four 10 Gbps ports installed. Each Atom node connects directly to one port on the Light Peak card in the I/O node, so there is 10 Gbps of bandwidth in both directions between each initiator and the I/O node. In the I/O node, eight high-performance 250 GB Intel 510 SSDs are connected to an on-board LSI SAS 2008 controller in a RAID 0 configuration. For the iSCSI experiments, STGT with a file I/O backend, a popular target server configuration, is used in the I/O node, and Open-iSCSI is adopted as the initiator on the Atom nodes. The Atom nodes perform computation and dispatch I/O requests to the I/O node. Each Atom node is assigned four partitions (boot, root file system, swap and data) by the I/O node. Both the micro-benchmark SysBench and the macro-benchmark Postmark were run to understand storage-sharing performance and to measure the corresponding CPU utilization. In order to perform a fair comparison of LBLK and iSCSI, we also deployed iSCSI over the Light Peak fabric. We ran each experiment several times and report average values. In addition to the performance comparison, we investigated the performance fairness achieved by LBLK and iSCSI.


Sequential Read Performance

Since CPU utilization on the target side is very low, we present the initiators' CPU utilization in this section. The figure shows the performance of sequential reads over various block I/O sizes. As shown, LBLK achieves better sequential read performance at all block I/O sizes. Besides this performance advantage, LBLK also exhibits lower CPU utilization than iSCSI when the block I/O size is less than 64 KB. With a 128 KB block I/O size, LBLK achieves almost 3x higher performance while burning only 2x the CPU cycles of iSCSI.
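To make the trade-off at 128 KB explicit: three times the throughput for twice the CPU still means more useful work per cycle than iSCSI delivers:

```python
# Relative figures at the 128 KB block I/O size (iSCSI = 1.0 baseline):
lblk_relative_throughput = 3.0   # ~3x higher sequential read performance
lblk_relative_cpu = 2.0          # ~2x the CPU cycles of iSCSI

# Throughput delivered per CPU cycle, relative to iSCSI:
efficiency_gain = lblk_relative_throughput / lblk_relative_cpu
print(efficiency_gain)  # 1.5
```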


Sequential Write Performance

In terms of sequential write performance, LBLK shows over 1.5x the throughput of iSCSI. LBLK also shows considerable improvement over iSCSI in terms of CPU utilization.



Performance fairness

We test the performance fairness of LBLK and iSCSI by running Postmark on the four initiators simultaneously. Each initiator's performance is presented in the figure. It is apparent that LBLK achieves much better fairness than iSCSI. This is primarily because STGT uses best-effort scheduling to serve I/O requests from the initiators: the target polls sockets and processes requests immediately once they are available, without considering load balance among initiators, so bursty traffic from one initiator adversely affects the others. In contrast, LBLK processes requests from initiators in a round-robin manner and achieves better fairness.
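The fairness difference can be reproduced with a toy scheduler model; the queue depths and the service budget are arbitrary example values:

```python
from collections import deque
from itertools import cycle

def best_effort(queues, budget):
    """STGT-style: poll initiators in a fixed order and drain whoever
    has requests pending, with no regard for load balance."""
    served = {name: 0 for name in queues}
    for name, q in queues.items():
        while q and budget > 0:
            q.popleft()
            served[name] += 1
            budget -= 1
    return served

def round_robin(queues, budget):
    """LBLK-style: serve at most one request per initiator per pass."""
    served = {name: 0 for name in queues}
    ring = cycle(list(queues))
    while budget > 0 and any(queues.values()):
        name = next(ring)
        if queues[name]:
            queues[name].popleft()
            served[name] += 1
            budget -= 1
    return served

def workload():  # n1 is bursty, the other initiators are lightly loaded
    return {"n1": deque(range(8)), "n2": deque(range(2)),
            "n3": deque(range(2)), "n4": deque(range(2))}

print(best_effort(workload(), budget=8))   # n1 monopolizes the target
print(round_robin(workload(), budget=8))   # service spread evenly
```

Under best-effort scheduling the bursty initiator consumes the whole budget; round robin gives every initiator the same share per pass.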


Benefits of LBLK
- Extremely high bandwidth of up to 10 Gbps.
- Direct access to the block I/O layer, bypassing the SCSI subsystem and I/O scheduler and reducing the overhead involved.
- Up to 2x increase in performance over comparable iSCSI systems with 30% less CPU utilization.
- Better fairness than iSCSI at a considerably reduced cost.
- A much simpler command set.
- Independent infrastructure, as opposed to iSCSI, which makes use of the existing network to establish connections. iSCSI is separated from the normal network only at a logical level, not physically, which can cause problems if another network overlaps with the iSCSI network; LBLK poses no such hassles.
- Light Peak can connect multiple devices to one Light Peak cable, because Intel designed Light Peak to carry multiple protocols simultaneously.
- Since Light Peak extends the PCI Express bus, the main expansion bus in current systems, it allows very low-level access to the system. This means that a system designed using Light Peak can be more secure and harder to bypass.


Applications
In addition to offering high performance and good fairness, LBLK opens extensive opportunities for microserver platform research. Based on LBLK, we plan to conduct the following research:

Intelligent Data Sharing: Unlike traditional scale-up designs, Big Data scale-out frameworks such as Hadoop run on massive clusters and typically cause extensive data communication, or shuffling, among nodes. Unfortunately, even with shared storage architectures, Big Data frameworks are unaware of the sharing and still incur significant data-movement penalties. An intelligent data-sharing scheme that allows nodes to effectively share data stores could avoid unnecessary data movement and stack operations, thereby boosting the performance of Big Data frameworks on microserver platforms.

Resource Provisioning: In microservers with a large number of processor nodes, many jobs may run simultaneously. Each job may exhibit different characteristics and require different storage performance in terms of read/write bandwidth, read/write IOPS, latency, and disk capacity. With shared storage in microserver platforms, the heuristics for provisioning resources to processor nodes and jobs become extremely important; without adequate resource provisioning, jobs may heavily interfere with each other, sacrificing overall system performance. Two approaches can be explored: static and dynamic resource provisioning. In static resource provisioning, a rigid resource-management policy is enforced statically in the I/O node. Alternatively, dynamic resource provisioning detects resource requirements at the job level and assigns appropriate resources to each job.

New devices could also be designed around Light Peak technology to take full advantage of the advances it offers. The technology may well replace USB 3.0 as the industry standard for high-performance interconnects.
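The two provisioning approaches can be contrasted with a minimal sketch; the job names and IOPS figures are invented for illustration:

```python
def static_provision(jobs, total_iops):
    """Static: a rigid policy enforced in the I/O node -- every job
    gets the same fixed slice, regardless of what it actually needs."""
    share = total_iops // len(jobs)
    return {job: share for job in jobs}

def dynamic_provision(demands, total_iops):
    """Dynamic: detect per-job requirements and assign each job a
    share proportional to its measured demand."""
    total_demand = sum(demands.values())
    return {job: total_iops * d // total_demand
            for job, d in demands.items()}

# Hypothetical per-job IOPS demands detected at the job level:
demands = {"hadoop-sort": 6000, "web-log-scan": 3000, "backup": 1000}
print(static_provision(demands, total_iops=100_000))
print(dynamic_provision(demands, total_iops=100_000))
```

The static policy over-serves the light jobs and starves the heavy one; the dynamic policy matches shares to demand at the cost of having to measure it.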


Conclusion and Future work


Microserver designs are beginning to disaggregate processors and memory from I/O devices and thus require an efficient storage-sharing mechanism. Motivated by this, we propose a highly efficient block-level mechanism for sharing fabric-attached storage. Our design leverages Light Peak technology to provide a high-performance, low-cost solution. Unlike conventional SAN-based sharing schemes, we redesigned the entire storage subsystem to better match the architectural needs of microservers. LBLK was implemented and deployed in our cluster. A performance comparison of LBLK and iSCSI demonstrates that LBLK obtains higher performance than iSCSI while consuming less CPU. Besides offering a performance advantage, LBLK also achieves better fairness among initiators than iSCSI. In addition to these performance and fairness benefits, LBLK can pave the way for further microserver platform research. There is scope for future work on storage resource provisioning and data sharing in microserver platforms. It is also possible to deploy LBLK on tens to hundreds of Atom nodes and conduct experiments that will help us better understand the system requirements of Big Data applications like Hadoop and SPECweb on microserver platforms.


References
S. Addagatla, M. Shaw, S. Sinha et al., "Direct network prototype leveraging Light Peak technology."
Intel Corporation, "Light Peak Technology," http://www.intel.com/go/lightpeak/index.html
J. Katcher, "PostMark: A New File System Benchmark," NetApp Technical Report TR-3022.
SuperMicro I/O node, http://www.supermicro.com/products/motherboard/QPI/5500/X8DTH-6F.cfm
