Re: NFS regression? Odd delays and lockups accessing an NFS export.

From: Tom Tucker
Date: Tue Sep 16 2008 - 07:38:56 EST


Ian Campbell wrote:
(dropping the e1000 guys, it seems unnecessary to keep spamming them
with this issue when it's unlikely to be anything to do with them. I've
left their list on for now so they know. I'd suggest dropping it from
any replies.)

On Sat, 2008-09-13 at 09:57 +0100, Ian Campbell wrote:
On Fri, 2008-09-12 at 18:15 -0500, Tom Tucker wrote:
Ian, sadly yes. There's a thread stuck holding the BUSY bit, or a thread failed to clear the bit properly
(maybe on an error path). Data continues to arrive, but the transport never gets put back on the queue
because its BUSY bit is set. In other words, this is a different error from the one we've been chasing.
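
To make the failure mode above concrete, here is a minimal single-threaded sketch of it. Every name in it (struct transport, XPT_BUSY, transport_enqueue, transport_handle, the fail/buggy flags) is hypothetical and chosen for illustration; it is not the sunrpc code, and real kernel code would set and clear the bit atomically rather than with plain loads and stores.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model, not the real sunrpc structures or flag values. */
#define XPT_BUSY 0x1u

struct transport {
    unsigned flags;   /* XPT_BUSY is set while a thread owns the transport */
    int queued;       /* how many times the transport reached the queue */
    int pending;      /* data arrivals that have not been handled yet */
};

/* Data arrived: put the transport on the queue unless it is already BUSY. */
static bool transport_enqueue(struct transport *t)
{
    assert(t != NULL);
    t->pending++;
    if (t->flags & XPT_BUSY)
        return false;             /* current owner is expected to release it */
    t->flags |= XPT_BUSY;
    t->queued++;
    return true;
}

/* A worker handles the transport and must clear BUSY on every exit path. */
static void transport_handle(struct transport *t, bool fail, bool buggy)
{
    assert(t != NULL);
    if (fail) {
        if (!buggy)
            t->flags &= ~XPT_BUSY;  /* correct error path releases ownership */
        return;                     /* buggy error path leaves BUSY set */
    }
    t->pending = 0;
    t->flags &= ~XPT_BUSY;
}
```

In this model, once transport_handle() returns through the buggy error path, every later transport_enqueue() call returns false: pending keeps growing while queued stays put, which matches the "data arrives but nothing services it" symptom.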

If I sent you a patch, could you rebuild the kernel and give it a whirl? Also, can you give me a
kernel.org relative commit-id or tag for the kernel that you're using?
I sure could. I'm using the Debian kernel which is currently at 2.6.26.3
(pkg version 2.6.26-4) although I have an update to 2.6.26.4 (via pkg
2.6.26-5) pending.

If I'm going to build my own I'll start with current git
(a551b98d5f6fce5897d497abd8bfb262efb33d2a) and repro there before trying
your patch.

FYI I've repro'd with commit a551b98d5f6fce5897d497abd8bfb262efb33d2a
    Merge: d1c6d2e... 50bed2e...
    Author: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
    Date: Thu Sep 11 11:50:15 2008 -0700

        Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block

        * 'for-linus' of git://git.kernel.dk/linux-2.6-block:
          sg: disable interrupts inside sg_copy_buffer

which was the latest git a few days back.

I'm going to start bisecting between v2.6.25 and v2.6.26. There are 173
commits in fs/nfs* and net/sunrpc in that interval, so with a day per test I
should have something next week...
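
As a rough check on that estimate: halving 173 candidate commits needs about eight tests in the worst case, so roughly eight days at one test per day. The helper below is only an illustrative sketch of that arithmetic (bisect_steps is a made-up name, and the actual number of steps git bisect takes depends on the shape of the history).

```c
#include <assert.h>

/*
 * Worst-case number of tests needed to isolate one bad commit among
 * n candidates by repeated halving: the smallest k with 2^k >= n.
 */
static unsigned bisect_steps(unsigned n)
{
    unsigned k = 0;
    unsigned long span = 1;

    assert(n > 0);
    while (span < n) {
        span *= 2;   /* each test doubles the range we can distinguish */
        k++;
    }
    return k;
}
```

Calling bisect_steps(173) gives 8, since 2^7 = 128 is less than 173 and 2^8 = 256 is not.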

Ian:

I'm assuming you'll do this in advance of any patch from me? I was simply going to add printk calls to the various shutdown paths to see if we could get some finer-grained debug output, since the generic transport debug output was too verbose.

Let me know and I'll help if I can,
Tom

Ian.
