[Leaplist] Any suggestions on resolving "Stale NFS file handle" errors?
Richard F. Ostrow Jr.
rich at warfaresdl.com
Tue Dec 9 09:40:27 EST 2008
<quote who="Damien McKenna">
> At work we have a cluster of Linux (ubuntu) web servers which uses NFS
> to mount files off a SAN.
>
> The mount options are:
> proto=tcp,vers=3,rw,hard,intr,bg,rsize=32768,wsize=32768
>
> The problem is that we're having a large number of "Stale NFS file
> handle" errors, as in several thousand per day. Does anyone have
> experience resolving this problem? Is NFS just doomed to these
> problems? Thanks.
I've found that Linux in general tends to have a lot of trouble doing NFS
with other operating systems. I've got a FreeBSD RAID server that I've
tried to tie multiple machines into (all Linux, plus one Vista machine over
NFS)... and the only machine that works well is the Vista machine (aside
from some permissions issues, its performance and stability are much better
than with the Linux machines).
This is odd, as the Vista NFS support is rather seriously lacking in any
form of documentation (all I've found is a few scraps of info online,
nothing in the internal "help" structure).
The Linux machines cannot handle large amounts of I/O... they end up
hanging (hard) with no log message or anything else I've been able to
trace back. This is awkward because one of them is my MythTV backend,
running diskless. I've managed to make it work because the thing boots fine
(as long as I bypass a lot of NFS necessities like nfs.lockd and nfs.statd,
neither of which works properly at boot time, which means that the whole fs
lacks file locking support) and stores its recordings locally (3 TB
software RAID-0). I boot it diskless so it can take advantage of the
massive redundancy of my central file server, even though it has its own
local disks. The OS is fault-tolerant... and I can live with it if I lose
all my recordings.
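
One way to see whether a given directory really has working file locking is
to try taking a lock in it. The following is only a minimal Python sketch,
not something from my setup: the function name supports_locking is made up
for illustration, and it assumes that a missing lock manager shows up as an
error from the lock call. That is not guaranteed. For example, on a mount
using the nolock option the lock is handled locally on the client, so the
probe succeeds even though nothing is locked on the server.

```python
import fcntl
import os
import tempfile


def supports_locking(directory):
    """Return True if an advisory lock can be taken on a file in `directory`.

    Hypothetical helper: it creates a scratch file, tries a non-blocking
    exclusive lock on it, and removes the file again.
    """
    fd, path = tempfile.mkstemp(dir=directory, prefix=".locktest-")
    try:
        try:
            # On NFS this request normally goes through the lock manager.
            fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except OSError:
            # Assumption: any error here (ENOLCK is a common one) means
            # locking is not usable in this directory.
            return False
        fcntl.lockf(fd, fcntl.LOCK_UN)
        return True
    finally:
        os.close(fd)
        os.unlink(path)


if __name__ == "__main__":
    for d in ("/var/run", "/var/lock", "/tmp", "/var/tmp"):
        if os.path.isdir(d):
            print(d, supports_locking(d))
```

Running it as root on the diskless box before and after a change should show
which of those directories can actually hold lock files.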
For my diskless machines (I've had more than one in the past, but now I'm
down to just one), I've semi-overcome this issue by putting all locations
that may hold lock files on a tmpfs partition (which does support file
locking), which includes things like /var/run, /var/lock, /tmp, and
/var/tmp. I've also had to write some scripts that create some directories
and change some permissions at boot-up, as MySQL keeps its own folder in
/var/run that it can't auto-create.
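
Roughly, the tmpfs workaround comes down to a few fstab entries plus a small
boot-time helper. The sketch below is in Python and is illustrative only: the
four paths are the ones named above, but the modes, the nosuid/nodev options,
and the helper names fstab_lines and ensure_dir are assumptions, not a copy
of my actual scripts.

```python
import os

# Paths from the post; the modes are assumed, adjust to taste.
TMPFS_MOUNTS = {
    "/var/run": "0755",
    "/var/lock": "1777",
    "/tmp": "1777",
    "/var/tmp": "1777",
}


def fstab_lines(mounts=TMPFS_MOUNTS):
    """Build one fstab entry per path, each on its own tmpfs."""
    return [
        f"tmpfs {path} tmpfs mode={mode},nosuid,nodev 0 0"
        for path, mode in mounts.items()
    ]


def ensure_dir(path, mode=0o755, uid=None, gid=None):
    """Recreate a directory that vanished with the tmpfs and fix its perms."""
    os.makedirs(path, exist_ok=True)
    os.chmod(path, mode)
    if uid is not None and gid is not None:
        # Changing ownership needs root, which a boot script normally has.
        os.chown(path, uid, gid)


if __name__ == "__main__":
    for line in fstab_lines():
        print(line)
```

A boot script could then call ensure_dir() for whatever MySQL expects under
/var/run, passing the uid and gid of the MySQL user. The exact directory name
depends on the distribution, so it is left out here.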
In general, I'm getting tired of the Linux implementation of NFS, and
especially the way the common init scripts handle NFS (like mounting
the root file system R/W before starting nfs.statd and nfs.lockd, which
makes booting with locking support impossible).
--
Life without passion is death in disguise