
HP-UX VxFS mount options for Oracle Database environments
Technical white paper

Table of contents
Executive summary
  Intended audience
Using HP OnlineJFS
Creating filesystems
  Archive and Redo log filesystems
  Oracle tablespace filesystems
  Oracle binaries filesystems
Oracle 11.2.0.2 and filesystemio_options=setall
Impact of convosync=direct on sequential access
  Sequential I/O penalty with readv
  Read-ahead
  Oracle 10g and 11g
Summary
For more information

Executive summary
This white paper provides detailed guidelines for specifying mount options for HP-UX VxFS filesystems
in an Oracle Database environment so as to optimize performance. It also covers the use of the
concurrent I/O option with single-instance databases to provide near-raw performance.
HP OnlineJFS, an add-on HP-UX software product, can significantly optimize the interaction between
HP-UX and Oracle by supporting direct I/O communications that bypass HP-UX buffer cache.
The paper provides information on addressing the sequential I/O performance penalty that may be
associated with certain system calls and describes VxFS read-ahead, one of the benefits of using
buffer cache.
The following table summarizes the general recommendations for filesystem block sizes and mount
options for Oracle filesystems.
Table 1. Summary of Recommendations

Filesystem     Access               Block size   Mount options
Redo Logs      Direct               1 KB [1]     delaylog,mincache=direct,convosync=direct
               Concurrent I/O       1 KB         delaylog,cio
Archive Logs   Direct               1 KB [1]     delaylog,mincache=direct,convosync=direct [2][3]
               Concurrent I/O       1 KB         delaylog,cio
Tablespaces    Cached [4]           8 KB         delaylog,nodatainlog
               Direct [5]           8 KB [7]     delaylog,mincache=direct,convosync=direct
               Concurrent I/O [6]   8 KB [7]     delaylog,cio
Binaries       Cached               Any          delaylog,nodatainlog

Recommendations provided herein apply to most HP-UX 11.0, 11i v1 (11.11), 11i v2 (11.23), and
11i v3 (11.31) environments.
Note that it is fairly straightforward to change the block size after the standard tools have created
the filesystem for the first time, provided the database has not yet used the storage. Use the mkfs
command to re-create the filesystem with the required block size, as shown in the following syntax:
mkfs -F vxfs -o bsize=<n> /dev/vg01/rlvol1
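
As an illustration of the syntax above, the following sketch prints the mkfs command lines for the 1 KB and 8 KB block sizes recommended in Table 1. The helper name and the logical volume names are hypothetical. Because mkfs destroys any existing data on the volume, the helper only prints the command rather than running it. The bsize value is given in bytes.

```shell
# Dry-run helper: prints (does not run) the mkfs command for a given
# VxFS block size. Function and logical volume names are hypothetical.
vxfs_mkfs_cmd() {
    bsize_kb=$1    # filesystem block size in KB (1, 2, 4, or 8)
    rawdev=$2      # raw (character) logical volume device
    printf 'mkfs -F vxfs -o bsize=%d %s\n' "$((bsize_kb * 1024))" "$rawdev"
}

vxfs_mkfs_cmd 1 /dev/vg01/rlvol1   # redo or archive logs
vxfs_mkfs_cmd 8 /dev/vg01/rlvol2   # tablespaces
```

Review the printed command against the target volume before running it by hand.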

Intended audience
This white paper is intended for HP-UX administrators who are familiar with configuring filesystems on
HP-UX.

[1] Filesystem block size can be any size if VxFS 5.0 is used on 11.31, but 1 KB will always work.
[2] For VxFS 3.5, install VxFS patch PHKL_32355 or later on 11.11, or PHKL_34179 or later on 11.23.
[3] For VxFS 3.3, mount the filesystem for buffered I/O using the delaylog,nodatainlog mount options and tune discovered_direct_iosz.
[4] Use cached I/O if Oracle benefits from VxFS read-ahead or db_file_multiblock_read_count is 16 or less.
[5] Use direct I/O to avoid the overhead of the buffer/file cache, or if the Oracle block size is less than 8 KB.
[6] Use concurrent I/O to avoid JFS inode lock contention. Concurrent I/O is available with the OnlineJFS license and VxFS 5.0.1 or higher.
[7] Any filesystem block size can be used, but 8 KB allows flexibility to move to cached I/O if needed.

Using HP OnlineJFS
To optimize VxFS performance, you should use OnlineJFS, an add-on HP-UX software product that
can significantly increase the availability of VxFS filesystems. In addition to its dynamic online
management capabilities, OnlineJFS provides a direct I/O mode that can optimize interaction
between HP-UX and Oracle Database.
With direct I/O, requests that exceed a certain size (by default, 256 KB or more) are performed
directly, bypassing the HP-UX buffer cache (11.23 and earlier) or the Unified File Cache (11.31).
Such requests are typically initiated by operations (such as backup or copy) that only read the data
once; thus, there is no value to caching this data in the buffer/file cache where it might otherwise
flush out more useful information.
However, mixing buffered I/O and direct I/O on the same files can cause significant performance
issues. Also, direct I/O is effective only when specific alignment requirements are met. For these and
other reasons, the Oracle filesystems must be created with specific filesystem block sizes
and mounted with the appropriate options to optimize filesystem performance.
Changes were made to the licensed features of OnlineJFS with VxFS 5.0.1. Direct I/O is now
available with the base product (no license needed), and concurrent I/O is now available with
the OnlineJFS product. For more information on concurrent I/O performance improvements, see the
white paper Performance improvements using Concurrent I/O on HP-UX 11i v3 with OnlineJFS 5.0.1
and the HP-UX 11i Logical Volume Manager.

Creating filesystems
This section provides guidelines for creating filesystems for the following:
Redo logs
Archive logs
Oracle tablespaces
Oracle binaries
Table 2 outlines the key mount options discussed in this document.

Table 2. Mount options

Option

Description

delaylog

The delaylog option allows the filesystem to delay the writing of non-critical
filesystem structural information to the VxFS intent log. This option improves
filesystem performance by allowing some system calls to return before the
non-critical data are written to the intent log.
This option does not impact most Oracle file operations, such as reads and
writes, but it can have some impact when creating, deleting, renaming, and
extending files. For that reason, delaylog is generally recommended.

nodatainlog

Since Oracle always uses synchronous writes, the nodatainlog option can be
used to prevent VxFS from writing data to the intent log as well as the file. Note
that the datainlog/nodatainlog options only impact buffered or cached I/O. They
do not impact direct I/O.

convosync=direct

This option converts buffered or cached synchronous I/O requests to direct I/O.
By bypassing the buffer/file cache, the overhead of copying data between the
Oracle System Global Area (SGA) and cache can be eliminated. Using direct
I/O for Oracle will bypass the VxFS read-ahead code, and can impact sequential
access.
In Oracle Database 8.x and later, the convosync=direct mount option also causes
unnecessary physical I/Os during sequential read operations if
db_file_multiblock_read_count is less than 32, or is set to a non-default value on
Oracle 10g or 11g.
For more information, see Impact of convosync=direct on sequential access.

mincache=direct

This option converts normal asynchronous I/O requests through the buffer/file
cache to direct I/O. Since Oracle uses the O_DSYNC option for synchronous
I/O, using this option would typically not impact Oracle.
However, if the convosync=direct option is used, HP recommends using
mincache=direct to accommodate I/Os from non-Oracle applications (such as
backup utilities) as well as operating system commands like cp and gzip. This
avoids a mix of direct and buffered I/O, which can degrade performance.

cio

This option also converts normal asynchronous I/O requests through the
buffer/file cache to direct I/O. This option also allows for concurrent read and
write operations by converting exclusive lock requests by writes into shared locks,
relying on the Oracle database code to provide the lock synchronization.
Using the cio mount option implies the use of mincache=direct and
convosync=direct.

Archive and Redo log filesystems


The writes to the Oracle Redo logs and Oracle Archive logs will perform best if the logs are written
with direct I/O. For efficient direct I/O, the filesystems used for the redo logs and archive logs should
be created with a 1 KB filesystem block size with newfs/mkfs and mounted with the following mount
options:
delaylog,convosync=direct,mincache=direct
With VxFS 5.0 on 11.31, the filesystems no longer need a filesystem block size of 1 KB for
efficient direct I/O.
Note that if the system has a license for concurrent I/O, which is included with the OnlineJFS license
on VxFS 5.0.1 and later, then the redo and archive logs should be mounted with concurrent I/O:
delaylog,cio
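
The choice between the two option sets above can be sketched as a small dry-run helper that prints the corresponding mount command. The function name, device, and mount point are hypothetical, and the "yes"/"no" licensing flag is an assumption made for illustration; the helper does not query the system for a license.

```shell
# Dry-run helper: prints the mount command for a redo or archive log
# filesystem. Function, device, and mount point names are hypothetical.
log_mount_cmd() {
    cio_licensed=$1   # "yes" if concurrent I/O is licensed (VxFS 5.0.1 or later)
    blockdev=$2       # block logical volume device
    mountpoint=$3
    if [ "$cio_licensed" = "yes" ]; then
        opts="delaylog,cio"
    else
        opts="delaylog,convosync=direct,mincache=direct"
    fi
    printf 'mount -F vxfs -o %s %s %s\n' "$opts" "$blockdev" "$mountpoint"
}

log_mount_cmd no  /dev/vg01/lvol1 /oraredo
log_mount_cmd yes /dev/vg01/lvol1 /oraredo
```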

For VxFS 3.5 on 11.11 or 11.23, be sure to install the latest VxFS 3.5 patches to address
performance issues when accessing sparse files, as Oracle creates the archive logs as sparse files.
The fixes for accessing sparse files for direct I/O were incorporated into the following patches:

PHKL_32355 11.11 VxFS 3.5 patch

PHKL_34179 11.23 VxFS 3.5 patch

For VxFS 3.3 the archive log filesystems must use the following mount options and use buffered I/O
for optimal performance:
delaylog,nodatainlog
Modify the /etc/vx/tunefstab file, changing the discovered_direct_iosz parameter for the archive
filesystem to 2097152, which enhances performance when writing archive files. This parameter will
apply to cached tablespace filesystems as well.
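
As a sketch of the tunefstab change described above, the following helper prints an entry that sets discovered_direct_iosz to 2097152 bytes (2 MB). The function and device names are hypothetical, and the entry layout (block device followed by parameter=value) is assumed from the usual tunefstab convention; check the vxtunefs manual page for your VxFS version before relying on it.

```shell
# Prints a tunefstab-style entry for an archive log filesystem on VxFS 3.3.
# Function and device names are hypothetical; 2 * 1024 * 1024 = 2097152 bytes.
archive_tunefstab_entry() {
    blockdev=$1
    iosz=$((2 * 1024 * 1024))
    printf '%s discovered_direct_iosz=%d\n' "$blockdev" "$iosz"
}

archive_tunefstab_entry /dev/vg01/lvol3
```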

Oracle tablespace filesystems


For Oracle tablespace filesystems, either direct I/O, cached I/O, or concurrent I/O can be used.
The advantage of cached I/O is that Oracle will sometimes benefit from the VxFS read-ahead when
doing sequential or near sequential I/O.
The advantage of direct I/O is that the overhead of copying the data to or from the buffer/file cache
is eliminated. With 11.31, the maintenance of the file cache structures and additional memory
utilization can also cause some overhead. Direct I/O can also eliminate unnecessary read-ahead that is
triggered by the VxFS read-ahead algorithm but never consumed by Oracle.
Concurrent I/O has all the advantages of direct I/O, but also eliminates JFS inode lock contention
and provides the best overall performance configuration. With concurrent I/O, the read and write
operations are not serialized. The application is responsible for coordinating the read and write
activities and ensuring that they target non-overlapping blocks of the same file. This coordination is
provided by the Oracle database software.
HP offers the following suggestions for creating Oracle tablespace filesystems:
Using HP-UX buffer/file cache (cached I/O)
Create filesystems with an 8 KB block size; use the following mount options:
delaylog,nodatainlog
While using the buffer/file cache may provide some benefit through VxFS read-ahead, it may be
better to bypass buffer cache in environments that are very I/O-intensive.
Bypassing HP-UX buffer/file cache (Direct I/O)
Direct I/O can be more efficient than using cached I/O and provides some benefit in I/O-intensive
environments. Note, however, that it is critical to set the Oracle db_file_multiblock_read_count to the
optimal value associated with the Oracle release (see Sequential I/O penalty with readv for
additional details).
For direct I/O on Oracle tablespaces, the filesystem block size can be any value. However, creating
the filesystem with an 8 KB filesystem block size allows for flexibility if cached I/O is desired later.
Use the following mount options:
delaylog,convosync=direct,mincache=direct

Oracle block sizes of 8 KB or more are standard practice and are the preferred block size for most
environments. However, it should be noted that when using an Oracle block size of 4 KB or less,
direct I/O provides added benefits and is significantly more efficient than cached I/O.
Using Concurrent I/O
Concurrent I/O also bypasses the buffer/file cache and provides the added benefit of eliminating any
JFS inode lock contention. Concurrent I/O can be enabled with the following mount options:
delaylog,cio
Beginning with VxFS 5.0.1 on HP-UX 11i v3, the concurrent I/O feature is available with the
OnlineJFS license. Prior to VxFS 5.0.1, concurrent I/O was available with the HP Serviceguard
Storage Management Suite bundle. When available, concurrent I/O is recommended over direct I/O
for Oracle tablespace filesystems for better performance.
Oracle 10g and 11g environments
To use direct I/O in an Oracle 10g or 11g environment, allow the value of the Oracle
db_file_multiblock_read_count parameter to remain at default, which results in 1 MB reads that have
no impact on Oracle optimizer logic.

Oracle binaries filesystems


Oracle binaries should never be mounted for direct I/O as data blocks are often accessed multiple
times. Using cached I/O on the Oracle binaries will eliminate unnecessary I/O. Use the following
mount options for cached I/O only:
delaylog,nodatainlog

Oracle 11.2.0.2 and filesystemio_options=setall


Beginning with Oracle version 11.2.0.2, the Oracle parameter filesystemio_options=setall has been
enhanced to detect the filesystem capability and use VX_SETCACHE ioctl() calls to set the
appropriate caching options. For example, Oracle will set the caching options for the database files,
the redo logs, and the archive logs for concurrent I/O provided the system is licensed for concurrent
I/O. If concurrent I/O is not licensed, then Oracle will set the caching options for the database files,
the redo logs, and the archive logs for direct I/O if the system is licensed for direct I/O. Otherwise,
cached I/O is used. So if Oracle 11.2.0.2 is used with filesystemio_options=setall, the mount
options can be set to the following:
delaylog,nodatainlog
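
The fallback order described above can be modeled with a short sketch. This is an illustration of the decision order only, not Oracle code; the function name and the "yes"/"no" licensing flags are hypothetical.

```shell
# Illustration only: models the fallback order described in this section.
# This is not Oracle code; the function name and flags are hypothetical.
setall_io_mode() {
    cio_licensed=$1   # "yes" if concurrent I/O is licensed
    dio_licensed=$2   # "yes" if direct I/O is licensed
    if [ "$cio_licensed" = "yes" ]; then
        echo "concurrent"
    elif [ "$dio_licensed" = "yes" ]; then
        echo "direct"
    else
        echo "cached"
    fi
}

setall_io_mode yes yes
setall_io_mode no  yes
setall_io_mode no  no
```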

Impact of convosync=direct on sequential access


This section outlines the impact of setting convosync=direct on Oracle sequential access (table scans).

Sequential I/O penalty with readv


The readv() system call is similar to the read() system call, but is passed a set of I/O vectors which
are used to transfer the data. Each vector is associated with a different Oracle data block.
When using direct I/O, the readv() system call will result in a separate physical I/O for each I/O
vector, even if the I/O vectors are set up for reading sequential data blocks. The physical I/O for each
I/O vector is performed in sequence, meaning the first physical I/O for a vector must complete before
the next physical I/O for the next vector is initiated. With the db_file_multiblock_read_count value set
to 8 and an 8 KB Oracle block size, the result would be eight physical I/Os of 8 KB each,
performed one I/O at a time, rather than a single 64 KB physical I/O.
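
The arithmetic in the example above can be reproduced with a short sketch. It assumes direct I/O, one physical I/O per readv() vector, and a single physical I/O for a contiguous read(), as described in this section; the function name is hypothetical.

```shell
# Arithmetic sketch of the example above: physical I/Os issued under
# direct I/O for one multi-block read. The function name is hypothetical.
physical_ios() {
    syscall=$1        # "readv" or "read"
    mbrc=$2           # db_file_multiblock_read_count
    if [ "$syscall" = "readv" ]; then
        echo "$mbrc"  # one physical I/O per vector, issued one at a time
    else
        echo 1        # one contiguous transfer
    fi
}

mbrc=8
block_kb=8
echo "readv: $(physical_ios readv "$mbrc") I/Os of ${block_kb} KB each"
echo "read:  $(physical_ios read "$mbrc") I/O of $((mbrc * block_kb)) KB"
```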
Note
If the filesystem was mounted with mincache=direct,convosync=direct, VxFS
would perform a separate physical I/O for each vector specified by
readv().

Oracle 8.x uses the readv system call for full table scans or scattered reads.
Oracle 9.x uses the readv system call for full table scans if the db_file_multiblock_read_count is 16
or less. If the db_file_multiblock_read_count is greater than 16, then Oracle will use a single read
system call with a larger transfer size resulting in more efficient direct I/O. However, increasing the
db_file_multiblock_read_count may alter the behavior of the Oracle optimizer.
On Oracle 10g and 11g, the db_file_multiblock_read_count should be allowed to default. This will
provide for 1 MB multi-block reads without affecting the Oracle optimizer.
However, if the convosync=direct mount option is not used, readv passes requests through the HP-UX
buffer cache, allowing VxFS to coalesce the vectors into fewer physical I/Os and issue those
physical I/Os in parallel.

Read-ahead
An added benefit of using buffer cache is the ability of VxFS to read ahead. After recognizing two
adjacent reads, such as a table scan, VxFS can initiate a read-ahead (256 KB by default), further
enhancing table scan performance.
Note that, in certain circumstances, a read-ahead may be wasteful. VxFS interprets two adjacent
single-block reads as a sequential I/O pattern and, as a result, generates 256 KB of read-ahead;
however, since this pattern is often random, the read-ahead data may not be used.
VxFS can modify the amount of read-ahead based on the number of stripes (columns) and stripe size.
Since this may result in extremely large read-aheads, you should tune the read-ahead size to its
default value, which leads to balanced performance in most Oracle environments.
Read-ahead size
To check read-ahead size, use the vxtunefs command on a mount point. Perform the following
calculation:
Read-ahead size = read_pref_io * read_nstream * 4
Values recommended for balanced performance in most Oracle environments are:
read_pref_io = 65536
read_nstream = 1
The values above are the default values when using LVM. When using VxVM, the default values
reflect the VxVM striping attributes. If the VxVM volume is striped across a large number of disks with
a large stripe size, the read-ahead size will be too large. Be sure to use vxtunefs to check the values
and tune them as mentioned above. While they can be changed interactively through vxtunefs, you
should enter these values in /etc/vx/tunefstab to make them persistent after a reboot.
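
The read-ahead calculation above can be checked with a short sketch. The function name is hypothetical, and the second call uses made-up values for a widely striped VxVM volume purely to show how quickly the read-ahead size grows.

```shell
# Read-ahead size = read_pref_io * read_nstream * 4 (all values in bytes).
# The function name is hypothetical.
readahead_bytes() {
    read_pref_io=$1
    read_nstream=$2
    echo $((read_pref_io * read_nstream * 4))
}

recommended=$(readahead_bytes 65536 1)
# Made-up example: 1 MB preferred I/O size across 8 stripe columns
striped=$(readahead_bytes 1048576 8)

echo "recommended: $((recommended / 1024)) KB"
echo "striped:     $((striped / 1024)) KB"
```

With the recommended values the result is 256 KB, matching the default read-ahead mentioned earlier. To inspect a mounted filesystem, run vxtunefs against its mount point (for example, a hypothetical /oradata); a value could then be changed interactively with a command along the lines of vxtunefs -o read_pref_io=65536 /oradata.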
Higher Oracle multi-block read count values
With any version of Oracle Database, if db_file_multiblock_read_count is set to a value higher than
16, Oracle reverts to the read system call. Thus, with these higher multi-block read count values, the
use of the convosync=direct mount option does not result in the sequential I/O penalty inherent in
readv; conversely, however, the benefits of VxFS read-ahead cannot be achieved.
Higher multi-block read count values may be appropriate in some large data warehouse
environments.

Oracle 10g and 11g


Oracle 10g and 11g use the read system call for multi-block reads of 1 MB when the value of
db_file_multiblock_read_count is allowed to default. Setting the multi-block read count to a higher
value may impact the Oracle optimizer.

Summary
This white paper provides detailed guidelines for specifying mount options for HP-UX VxFS filesystems
in an Oracle Database environment so as to optimize performance. HP recommends upgrading to the
latest versions of HP-UX and VxFS to take advantage of performance improvements. However,
because many users may not be running the most current versions of HP-UX, VxFS, and Oracle
Database, recommendations are also provided for configuring VxFS mount options on older
configurations. OnlineJFS can significantly optimize the interaction between HP-UX and
Oracle by supporting direct I/O communications that bypass HP-UX buffer cache. With VxFS 5.0.1 or
higher, the concurrent I/O feature is available with the OnlineJFS license and provides the added
benefit of eliminating JFS inode lock contention.

For more information


Veritas File System 5.0.1 Administrator's Guide
http://h20000.www2.hp.com/bc/docs/support/SupportManual/c02220689/c02220689.pdf

Veritas File System 5.0.1 Release Notes, HP-UX 11i v3
http://bizsupport1.austin.hp.com/bc/docs/support/SupportManual/c02627959/c02627959.pdf

Supported File and File System Sizes for HFS and JFS
http://bizsupport1.austin.hp.com/bc/docs/support/SupportManual/c01915880/c01915880.pdf

HP-UX VxFS tuning and performance
http://h20000.www2.hp.com/bc/docs/support/SupportManual/c01919408/c01919408.pdf

Performance improvements using Concurrent I/O on HP-UX 11i v3 with OnlineJFS 5.0.1 and the HP-UX 11i Logical Volume Manager
http://h20195.www2.hp.com/V2/GetDocument.aspx?docname=4AA1-5719ENW&cc=us&lc=en

To know how you can make informed decisions when choosing an I/O subsystem configuration,
please visit: http://h71028.www7.hp.com/enterprise/w1/en/os/hpux11i-fsvm-learn-more.html


Copyright 2008 - 2011 Hewlett-Packard Development Company, L.P. The information contained herein is
subject to change without notice. The only warranties for HP products and services are set forth in the express
warranty statements accompanying such products and services. Nothing herein should be construed as
constituting an additional warranty. HP shall not be liable for technical or editorial errors or omissions
contained herein.
Oracle is a registered trademark of Oracle Corporation and/or its affiliates.
4AA1-9839ENW, Created May 2008; Updated March 2011, Rev. 3