NFSv4 poops itself

From: Jeff Garzik
Date: Fri Jul 27 2007 - 08:53:19 EST


Background:

Server: x86-64 dual core Intel, kernel 2.6.23-rc1 (my home fileserver)
Exporting NFS/NFSv4 mounts. Client count: 1 Uptime: 4 days

Client: x86-64 dual core Intel, kernel 2.6.23-rc1 (my main workstation)
NFS mount setup:
pretzel:/ on /g type nfs4 (rw,noatime,proto=tcp,addr=10.10.10.1)
Uptime: 4 days

Home directory mounted via NFSv4.

Problem:

My workstation has been happily talking to my file server for several days without incident. An hour ago, my numeric keypad stopping working (unrelated problem... USB or X bug?). The solution to the keypad problem is usually to log out of X and log back in, or worse case, reboot.

So, I log out, and log back in. At first, a few shell windows open and successfully initialize themselves (read bash profile over NFS, etc.) Then, as more shell windows open, things start hanging. I can easily switch to console and ssh to the fileserver, so it is clear this is an NFS hang.

No adverse messages at all on the client.

On the server, I see NFSv4 spamming dmesg with hundreds of thousands of messages:

Jul 27 08:20:53 pretzel kernel: NFSD: preprocess_seqid_op: old stateid!
Jul 27 08:21:24 pretzel last message repeated 167966 times
Jul 27 08:21:55 pretzel last message repeated 173628 times
Jul 27 08:21:55 pretzel kernel: NFSD: preprocess_seqid_op: old stateid!
Jul 27 08:22:26 pretzel last message repeated 171286 times
Jul 27 08:23:27 pretzel last message repeated 344461 times
Jul 27 08:23:30 pretzel last message repeated 18656 times

I rebooted the client, the problem disappeared, and everything is happy again... but clearly NFSv4 shat itself. And now I am worried this will happen again.

In all my quite-heavy use of NFSv4 I've never seen this behavior before, so I would call this a regression.

I always run vanilla linux-2.6.git self-built kernels on both client and server.

Jeff


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/