Technical Report
OSU-CISRC-08/05-TR53
Benefits of High Speed Interconnects to Cluster File Systems:
A Case Study with Lustre ∗
Abstract

Cluster file systems and Storage Area Networks (SAN) make use of network IO to achieve higher IO bandwidth. Effective integration of networking mechanisms is important to their performance. In this paper, we perform an evaluation of a popular cluster file system, Lustre, over two of the leading high speed cluster interconnects: InfiniBand and Quadrics. Our evaluation is performed with both sequential IO and parallel IO benchmarks in order to explore the capacity of Lustre under different communication characteristics. Experimental results show that direct implementations of Lustre over both interconnects can improve its performance compared to an IP emulation over InfiniBand (IPoIB). The performance of Lustre over Quadrics is comparable to that of Lustre over InfiniBand on the platforms we have. The latest InfiniBand products provide higher capacity and embrace the latest technologies, e.g., taking advantage of PCI-Express and DDR. Our results show that over a Lustre file system with two Object Storage Servers (OSS's), InfiniBand with PCI-Express technology can improve Lustre write performance by 24%. Furthermore, our experiments suggest that Lustre meta-data operations do not scale with the increasing number of OSS's.

∗ This research is supported in part by DOE grant #DE-FC02-01ER25506; NSF grants #CNS-0403342 and #CNS-0509452; grants from Intel, Mellanox, Sun Microsystems, Cisco Systems, and Linux Networks; and equipment donations from Intel, Mellanox, AMD, Apple, IBM, Microway, PathScale, SilverStorm and Sun Microsystems.

1. Introduction

While CPU clock cycles and memory bus speeds are reaching the level of sub-nanoseconds and 10 Gbytes/sec, disk access time and bandwidth still linger around microseconds and 400 Mbytes/sec, respectively. Since systems with ever-increasing speed are being deployed at the scale of thousands of nodes, IO speed needs to keep pace with the demands of high performance computing applications. High speed interconnects are typically utilized in high performance computing due to the very low latency and very high bandwidth they can achieve. Recently, high-end interconnect technologies have been exploited in storage and file systems to bridge the gap between CPU/memory speed and IO speed, speeding up IO accesses and scaling the aggregated bandwidth. This trend makes it very promising to build scalable petabyte storage area networks from commodity clusters. Such cluster-based storage and file systems can combine the advantages of the high speed network accesses of the interconnects and the large storage capacity available from the commodity nodes. Many commercial [13, 15, 8, 7] and research [20, 11, 2] projects have been developed to provide parallel file systems for IO accesses using such architectures.

Currently, several high speed interconnects can provide very high bandwidth at the level of 10 Gbps or even higher, including InfiniBand [14], Quadrics [27, 4], Myrinet [5], and 10 Gigabit Ethernet [12]. The minimum achievable latency between two nodes has already gone below 2 µs. It is important not only to the system designers and practitioners that build these systems, but also to the hardware vendors that produce these interconnects, to find out how effectively the interconnects' communication characteristics can be integrated into the file system.

A popular cluster file system, Lustre [7], has recently been designed and developed to address the needs of next generation systems using low cost Linux clusters with commodity computing nodes and high speed interconnects. Currently, it is available over several different interconnects, including Gigabit Ethernet, Quadrics [27] and InfiniBand [14] (in a prototype form). In order to gain more understanding of high speed interconnects and their performance impact on storage and file systems, we evaluate the performance of Lustre over different interconnects and study the following questions:

1. Which file system operations of Lustre can benefit more from the low latency and high performance of high speed interconnects?

2. Can the latest IO-bus technologies, such as PCI-Express [23], help the performance of Lustre over high speed interconnects?

3. What aspects of Lustre remain to be investigated even with high speed interconnects and the latest IO-bus technologies?

Our experimental results indicate that Lustre file IO operations benefit more from the high speed communication provided by both InfiniBand and Quadrics than from an IP emulation over InfiniBand (IPoIB). In addition, the read and write bandwidth of an MPI-Tile-IO benchmark over Lustre over Quadrics is about 13% higher than the same over InfiniBand on the platforms we have. However, the latest InfiniBand products integrate well with the latest technologies, e.g., taking advantage of PCI-Express and DDR (Double Data Rate). Our results show that over a Lustre file system with two Object Storage Servers (OSS's), InfiniBand with PCI-Express technology can improve Lustre write performance by 24%. Our results also suggest that Lustre meta-data operations scale rather poorly with the increasing number of OSS's, though they benefit slightly from the high performance of the latest interconnect technologies.

The rest of the paper is presented as follows. In the next section, we provide an overview of Lustre. Section 3 provides an overview of the two interconnects: InfiniBand and Quadrics. Section 4 describes our experimental testbed. Sections 5, 6 and 7 provide performance results from our evaluation. Section 8 gives a brief review of related work. Section 9 concludes the paper.
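At their core, the sequential IO benchmarks used in evaluations of this kind time fixed-size write requests against a file on the mounted file system and report the achieved bandwidth. As a rough illustration only (this is not code from the paper; the file path, block size, and transfer size are arbitrary choices), such a measurement can be sketched as:

```python
import os
import time

def sequential_write_bandwidth(path, block_size=64 * 1024, total_mb=8):
    """Write `total_mb` MB sequentially in `block_size`-byte requests
    and return the achieved bandwidth in MB/s. On a real Lustre setup,
    `path` would live under the Lustre mount point."""
    total_bytes = total_mb * 1024 * 1024
    buf = b"a" * block_size
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC)
    start = time.perf_counter()
    written = 0
    while written < total_bytes:
        written += os.write(fd, buf)
    os.fsync(fd)  # include flush time, as disk-bound benchmarks usually do
    elapsed = time.perf_counter() - start
    os.close(fd)
    return total_mb / elapsed

if __name__ == "__main__":
    bw = sequential_write_bandwidth("/tmp/seqio.dat")
    os.remove("/tmp/seqio.dat")
    print(f"sequential write: {bw:.1f} MB/s")
```

A parallel IO benchmark such as MPI-Tile-IO additionally coordinates many clients writing disjoint tiles of a shared file through MPI-IO; the single-client sketch above isolates only the sequential component.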
4. Experimental Testbed
6.1. Parallel Management Operations

[Fig. 5. Performance of File Create and Stat Operations — x-axis: Number of Servers (1 to 6)]

context of high performance parallel applications. Our work compares the performance benefits of InfiniBand and Quadrics to the IO subsystem, using a cluster file system, Lustre [7]. The communication capabilities of these two interconnects are exposed to Lustre through their kernel-level networking APIs. This work is complementary to previous work on the evaluation of different interconnect technologies.

[Figure: (a) Read — Bandwidth (MB/s) versus Number of OSS (1 to 2), comparing PCI-X and PCI-Express]

9. Conclusions