
Re: [Testbed-admins] kernel memory leaks in FreeBSD NFS (patched)



On Thu, Dec 03, 2009 at 02:02:05PM -0800, Mike Ryan wrote:
> 
> An NFS client which tries to delete an inexistent file will trigger the
> bug. This can result from rebooting the ops node while experiments are
> running.

I have a little clarification here:

The bug isn't exercised by *any* attempt to delete a nonexistent file.
The problems we saw were caused by trying to delete a file from a
directory to which the client had a stale NFS filehandle.  As Mike says,
a likely cause of this would be a server reboot while one or more
clients were in a recursive rm.

Parenthetically, it also looks like a couple of other corner cases are
not well handled - things like attempting to delete the root of an
NFS-mounted hierarchy.

Importantly, an ordinary remove request that fails with ENOENT (no such
file or directory) is properly handled.

The situation is somewhat insidious in that if an experimental node 
issues an NFS remove for a file in a stale directory, the file server
never sends a response to it. So the experimental nodes keep sending the
requests.  These requests are coming from the kernel - the NFS RPC
retransmission code is the resender; the rm process may have been
killed.   The file server keeps slowly leaking memory.  The situation
persists across reboots or panics of the file server, only stopping when
all the experimental nodes resending the request are rebooted.

One can spot the problem by looking at a tcpdump and seeing NFS remove
requests to the server at regular intervals that are completely ignored
rather than being responded to.  (That's in addition to the other
diagnoses Mike suggested).
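
A rough sketch of that check, assuming the capture has already been
decoded into simple records (the record layout and the function name
below are made up for illustration; they are not tcpdump output or part
of any tool mentioned here).  It relies on RPC retransmissions reusing
the XID of the original call, so a REMOVE call whose XID repeats from
the same client and never sees a reply is a candidate:

```python
from collections import defaultdict


def unanswered_removes(records, min_calls=3):
    """Return REMOVE calls that were sent repeatedly but never answered.

    records: iterable of (timestamp, client, xid, kind, proc) tuples, a
    hypothetical format where kind is "call" or "reply" and proc is the
    NFS procedure name for calls (it is ignored for replies).
    """
    calls = defaultdict(list)  # (client, xid) -> timestamps of REMOVE calls
    replied = set()            # (client, xid) pairs that got any reply

    for ts, client, xid, kind, proc in records:
        key = (client, xid)
        if kind == "reply":
            replied.add(key)
        elif kind == "call" and proc == "REMOVE":
            calls[key].append(ts)

    # Keep only calls seen at least min_calls times with no reply at all.
    return {
        key: sorted(times)
        for key, times in calls.items()
        if key not in replied and len(times) >= min_calls
    }


if __name__ == "__main__":
    sample = [
        (0.0, "node1", 0x10, "call", "REMOVE"),
        (1.0, "node1", 0x10, "call", "REMOVE"),
        (3.0, "node1", 0x10, "call", "REMOVE"),
        (0.5, "node2", 0x20, "call", "REMOVE"),
        (0.6, "node2", 0x20, "reply", None),
    ]
    for (client, xid), times in unanswered_removes(sample).items():
        print(f"{client} xid={xid:#x}: {len(times)} calls, no reply")
```

The min_calls threshold is arbitrary; it only keeps a single call whose
reply fell outside the capture window from being flagged.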

Fortunately, the bug doesn't get exercised a lot, but, on the other
hand, once it's affecting your testbed, it will continue to do so until
the nodes tickling it are rebooted.

-- 
Ted Faber
http://www.isi.edu/~faber           PGP: http://www.isi.edu/~faber/pubkeys.asc
Unexpected attachment on this mail? See http://www.isi.edu/~faber/FAQ.html#SIG
