Common NFS Error Messages and Troubleshooting Tips

The Network File System (NFS) was developed by Sun Microsystems and is the de
facto standard for file sharing among UN*X-type systems. Netapp has taken that
file-sharing technology and wrapped it all up into a storage appliance.

NFS (in versions 2 and 3) is a stateless protocol, meaning that the file server
stores no per-client information and there are no NFS connections per se. For
example, NFS has no operation to open a file, since this would require the
server to store state information such as whether a file is open, its file
descriptor(s), the next byte to read, etc. Instead, NFS supports a lookup
procedure which converts a filename into a file handle. This file handle is a
unique identifier, usually based on an inode or disk block address. Most
operating systems provide system calls to open files and read from them
sequentially, so the client's operating system must maintain the required state
information and translate system calls into stateless NFS operations. None of
this under-the-cover stuff is seen by the end user.

Despite being a stateless protocol, NFS environments are capable of throwing
some error messages. Most of them are common enough and simple enough to deal
with. Below are the most common ones, their symptoms and how to deal with them.
Problem: Stale NFS File Handle (NFS Error 70)
A stale NFS file handle error message can be, and usually is, caused by the
following events:

A certain file or directory that is on the NFS server is opened by the NFS client
That specific file or directory is deleted either on that server or on another system that has access to the same share
Then that file or directory is accessed on the client
A file handle usually becomes stale when a file or directory referenced by the
file handle on the client is removed by another host while your client is still
holding an active reference to that object. A typical example is when the
current directory of a process running on your client is deleted on the server,
either by a process running on the server or by another NFS client sharing the
same share. It can also occur if the directory is modified on the NFS server
but the directory's modification time is not updated.
Most Unix clients define error 70 in an NFS header file, where it looks like:
#define NFSERR_STALE 70

If the share is missing, the best and most obvious solution is to remount the
directory from the NFS client. A remotely sane option is to mount the NFS
directory with the noac option (which disables attribute caching), but that's a
lame workaround and not really recommended because of performance issues. It
could help in the short term, but it is definitely not a long-term fix. Stale
NFS file handles are almost always an NFS client application or user problem,
and not an NFS server problem.
How do I fix this problem? What to look for:
Check client vfstab or fstab or automounter maps for correctness
Check connectivity to the NFS server(s)
Confirm that the correct mountpoints are there
Confirm valid share permissions and access rights with showmount -e filername from the client
Re-run the exportfs command from the filer
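The first check in the list above can be partly scripted. The sketch below lists the NFS entries found in an fstab-style file so they can be compared by eye against what the filer exports. It assumes the Linux six-field /etc/fstab layout (source, mount point, type, options, ...); a Solaris /etc/vfstab uses different columns and would need different field numbers. The function name nfs_fstab_entries is made up for this illustration.

```shell
# List NFS entries from the fstab-style file named in $1.
# Assumption: Linux six-field layout. Prints "mountpoint source options",
# one line per NFS entry, skipping comment lines.
nfs_fstab_entries() {
    awk '$1 !~ /^#/ && ($3 == "nfs" || $3 == "nfs4") { print $2, $1, $4 }' "$1"
}

# Example usage on a client (uncomment to run):
# nfs_fstab_entries /etc/fstab
```

Each printed source can then be checked against the output of showmount -e filername.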

NFS server not responding


Nothing makes the hair on the back of your neck go up faster than seeing this
in your log files or console. Monitoring servers are ringing bells, and stuff is
going to hell in a handbasket. This will usually end in an outage, but not
always a disaster. Because of the statelessness of the NFS protocol, sometimes
NFS clients can recover gracefully.
The error manifests itself with something similar to "NFS Server (your server
name) not responding" on your console and likely in your NFS client's log file
too. It's quite possible to have the NFS clients hang if the mount points are
accessed. If you are running a large web server farm with lots of I/O to those
shares, then you are in trouble.

NFS server not responding is actually a great error message, in that it isn't
at all ambiguous. Basically, the NFS client can no longer communicate with the
NFS server: either the NFS server has gone down and is unreachable via IP, or
the RPC services have vanished completely from the NFS server. Or even worse,
the network cable is unplugged or missing, or someone powered off a switch.
Some common-sense tips to resolve this problem are:
How do I fix this problem? What to look for:
Ping the Netapp filer by IP and DNS name or short name.
On the Netapp filer, try to ping the NFS client(s).
Confirm that the IP configuration on the NFS client and server is correct
Confirm that the correct NFS version is enabled
Check that all the nfs options on the Netapp filer are correct
Check the NFS license on the Netapp filer. Licenses don't usually disappear, but ...
nfs mount: mount: /dba_scripts: Permission denied
You are trying to mount a share on /dba_scripts and you keep getting a
permission denied error. Essentially, your NFS client doesn't have access to the
NFS share being offered up from the Netapp filer. This can be caused by a
variety of reasons, but it mostly has to do with either IP address or name
conflicts. I have seen old versions of AIX that don't negotiate NFS versions
very well and need the NFS version hard coded as a mount option.
How do I fix this problem? What to look for:
Check showmount -e filername from the NFS client
Test the NFS share by trying to mount the share on a different mountpoint
Check exportfs share permissions on the Netapp filer
Confirm the IP / hostname of the NFS client and that it matches the names / IP addresses used in the exports file on the Netapp filer
If your NFS client is an older version of Unix, like an old AIX system, sometimes you have to hard code the NFS version, as mounting without it will generate a permission denied message or a vmount error. The older systems don't seem to be able to negotiate a compatible NFS protocol version.
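The showmount check above can be sketched as a small filter that reads showmount -e output and reports whether a given client name is listed for a given export. This is only an illustration: the function name export_allows, the export path /vol/dba_scripts and the host names are hypothetical, and the match is a literal string comparison, so netgroups, wildcards and subnet entries are not understood.

```shell
# Read `showmount -e` output on stdin.
# Succeed (exit 0) if export path $1 lists client $2 literally,
# or is marked as open to everyone; fail (exit 1) otherwise.
export_allows() {
    awk -v path="$1" -v client="$2" '
        $1 == path {
            if ($2 == "(everyone)" || $2 == "*") found = 1
            n = split($2, hosts, ",")
            for (i = 1; i <= n; i++) if (hosts[i] == client) found = 1
        }
        END { exit (found ? 0 : 1) }
    '
}

# Example usage on a client (uncomment to run; names are placeholders):
# showmount -e filername | export_allows /vol/dba_scripts dbclient1 \
#     && echo "client is listed" || echo "client is NOT listed"
```

If the client is not listed under the name or address it actually presents to the filer, that mismatch is a likely source of the permission denied message.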
Network Performance is Slow
You are experiencing slow or poor NFS read and/or write performance: it takes a
long time to read or write. NFS performance is closely related to RPC
performance. Since RPC is a request-reply protocol, each request has to wait
for a full round trip, so it exhibits very poor performance over wide area
networks. NFS performs best on fast LANs.
How do I fix this problem? What to look for:
Check sysstat 1 for nfs ops/sec vs. kbs/sec
Look at the parameters on the network interface card (NIC) with ifconfig -a
Look at the network statistics with ifstat -a and netstat -m
Confirm network links are correctly configured, i.e. duplex settings and speeds. Look for collisions and retransmits.
Confirm routing tables, if used, are valid, on both the Netapp filer and the NFS client.
Check throughput with the sio_ntap tool
Check rsize and wsize
Consider configuring jumbo frames (the entire path must support jumbo frames)
Use the Netapp perfstat tool. Perfstat is run from a management host and not
directly on the filer. You need to enable RSH access from your Netapp filer to
your management server (including the username of whoever you will be running
it as). This can be done via FilerView or by following the guide available at
https://now.netapp.com/Knowledgebase/solutionarea.asp?id=kb787. Perfstat can be
downloaded from http://now.netapp.com/NOW/download/tools/perfstat. That page
also shows some of the syntax for using the tool.
NFS read / write settings
Check rsize and wsize values from the NFS client:
# nfsstat -m
/homes/bob from mynetapp:/homes/bob
Flags: rw,relatime,vers=3,rsize=65536,wsize=65536,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=124.34.56.11,mountvers=3,mountport=32854,mountproto=udp,local_lock=none,addr=124.34.56.11
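When there are many mounts, picking the values out of the Flags line by eye gets tedious. As a rough sketch, the filter below pulls one named option out of nfsstat -m output. The function name nfs_mount_opt is invented for this example, and it assumes the comma-separated key=value layout shown above; other nfsstat implementations may format the line differently.

```shell
# Read `nfsstat -m` output on stdin and print the value of option $1
# (for example rsize or wsize), one line per mount that has it.
nfs_mount_opt() {
    # Put every comma- or whitespace-separated token on its own line,
    # then print the value of the token whose key matches exactly.
    tr ', \t' '\n\n\n' | awk -F= -v key="$1" '$1 == key { print $2 }'
}

# Example usage on a client (uncomment to run):
# nfsstat -m | nfs_mount_opt rsize
# nfsstat -m | nfs_mount_opt wsize
```

The key is matched exactly, so asking for addr does not also return mountaddr.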
Older UNIX systems, particularly HP-UX 9, 10 and even some 11.x systems,
depending on how the server is configured, had issues with the rsize and wsize
settings. This led to dismal I/O performance. The rsize and wsize settings had
to be tested using different sizes on the client side to see which combination
was the best performer. These can be set as mount options on the client side. I
remember older HPs were happiest with a 32K read/write size. This is rare
nowadays.
RPC not responding: "RPC: Unable to receive" or "RPC: Timed out"
As I said above, NFS is an RPC-based protocol, and it needs working RPC
communication in order to function.
How do I fix this problem? What to look for:
First ping the Netapp filer from the client, then the other way around. Look for the time it takes to respond. If it is abnormally slow, investigate further.
From the NFS client, run rpcinfo -p with the Netapp filer name and confirm you can contact the RPC server on the Netapp. As in:
# rpcinfo -p netappfiler
You should see all of the RPC services listed with their ports.
Check the NFS client mountpoint
Check showmount -e netappname from the NFS client and make sure the shares are still exported.
Confirm that the name of the directory being exported from the Netapp server is valid
Confirm that the exportfs configuration is valid on the Netapp filer.
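The rpcinfo step in the list above can be sketched as a filter that reports which of the services NFS depends on are missing from the rpcinfo -p listing. The function name missing_rpc_services is invented here, and the service names it looks for (portmapper, mountd, nfs) are an assumption based on the names a typical Linux client prints in the last column; some systems show rpcbind instead of portmapper, in which case the list would need adjusting.

```shell
# Read `rpcinfo -p` output on stdin and print each expected service
# that is not registered. Prints nothing when all are present.
# Assumption: the service name is the fifth column of each line.
missing_rpc_services() {
    awk '
        { seen[$5] = 1 }
        END {
            n = split("portmapper mountd nfs", want, " ")
            for (i = 1; i <= n; i++)
                if (!(want[i] in seen)) print want[i]
        }
    '
}

# Example usage on a client (uncomment to run):
# rpcinfo -p netappfiler | missing_rpc_services
```

Empty output means all three services answered; anything printed points at an RPC service that has gone away on the filer.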
No Space Left On Disk: "No space left on disk" error

As the error message clearly states, somewhere, somehow the storage has been
used up. On basic Unix filesystems, you can also get this error message if you
have run out of inodes. On the Netapp it is more often the case that you have
completely exhausted the available storage space.
How do I fix this problem?
Do a df on the filesystem on the NFS client
Do a df on the filesystem on the root node for the filer.
Is the volume full?
Is the Qtree full?
Look for possible snapshot overruns
Check to see if quotas are enabled and if there are any reports for exceeded quotas
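The df checks above can be sketched as a filter over df -P output that prints only the filesystems at or above a chosen usage percentage. The function name full_filesystems and the 95 percent threshold are arbitrary choices for this example, and it assumes the POSIX df -P column layout with no spaces in the filesystem or mount point names.

```shell
# Read `df -P` output on stdin and print "mountpoint capacity" for every
# filesystem whose used capacity is at or above $1 percent.
full_filesystems() {
    awk -v limit="$1" '
        NR > 1 {
            pct = $5
            sub(/%/, "", pct)            # strip the percent sign
            if (pct + 0 >= limit + 0) print $6, $5
        }
    '
}

# Example usage on a client (uncomment to run):
# df -P | full_filesystems 95
# df -i /homes/bob    # inode usage, where the client's df supports -i
```

A volume that shows free space here but still returns the error is a hint to look at inodes, qtree limits, snapshots or quotas instead.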
====================
Stale file handles occur when a file or directory was held open by an NFS
client and then was either removed, renamed, or replaced. For example, a file
gets removed and a new file is created using the same inode, or the file was
renamed and the inode changed.
The only way to get the ESTALE to go away is to force the client process to
negotiate new handles: either open the files again or restart the processes.
You can try to umount and remount the filesystem on top of itself, or
kill/restart any processes that have open file handles. If you prefer not to
reboot the machine, you may create a new mount point on the client for the
mount that has the stale NFS file handle.
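Before remounting or restarting anything, it helps to confirm which paths are actually stale. A minimal sketch, assuming the client's ls reports the condition with the usual C-locale text "Stale file handle" or "Stale NFS file handle"; the function name is_stale is made up for this example, and clients that word the error differently would need another pattern.

```shell
# Succeed (exit 0) if accessing path $1 reports a stale file handle.
# Assumption: the error text from ls contains ": Stale file handle"
# or ": Stale NFS file handle" when run in the C locale.
is_stale() {
    # Capture only stderr; stdout is discarded.
    err=$(LC_ALL=C ls -d "$1" 2>&1 >/dev/null) || true
    case "$err" in
        *": Stale file handle"*|*": Stale NFS file handle"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example usage on a client (uncomment to run):
# for dir in /homes/bob /dba_scripts; do
#     is_stale "$dir" && echo "$dir is stale"
# done
```

Paths reported as stale are the ones whose processes need to reopen their files or be restarted, or whose filesystem needs to be remounted.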