Re: [PATCH] [Bug 16494] NFS client over TCP hangs due to packet loss
From: Andy Chittenden
Date: Tue Aug 03 2010 - 06:25:47 EST
On 2010-08-03 10:11, Andrew Morton wrote:
> (cc linux-nfs)
>
> On Tue, 03 Aug 2010 01:21:44 -0700 (PDT) David Miller <davem@xxxxxxxxxxxxx> wrote:
>> From: "Andy Chittenden" <andyc.bluearc@xxxxxxxxx>
>> Date: Tue, 3 Aug 2010 09:14:31 +0100
>>
>>> I don't know whether this patch is the correct fix or not but it enables the
>>> NFS client to recover.
>>> Kernel version: 2.6.34.1 and 2.6.32.
>>> Fixes <https://bugzilla.kernel.org/show_bug.cgi?id=16494>. It clears down
>>> any previous shutdown attempts so that reconnects on a socket that's been
>>> shut down leave the socket in a usable state (otherwise tcp_sendmsg() returns
>>> -EPIPE).
>>
>> If the SunRPC code wants to close a TCP socket and then use it again,
>> it should disconnect by doing a connect() with sa_family == AF_UNSPEC.
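
For anyone following along, the disconnect-then-reuse sequence can be tried
from userspace. The following is only a minimal sketch, assuming Linux TCP
semantics on loopback; disconnect_socket() and reconnect_demo() are
hypothetical names made up for this example and are not part of the SunRPC
code or of the patch. It shuts a connected socket down, checks that send()
then fails with EPIPE, dissolves the association with an AF_UNSPEC connect(),
and connects the same socket again.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: dissolve the association on a connected socket by
 * calling connect() with sa_family == AF_UNSPEC. */
static int disconnect_socket(int fd)
{
	struct sockaddr sa;

	assert(fd >= 0);
	memset(&sa, 0, sizeof(sa));
	sa.sa_family = AF_UNSPEC;
	return connect(fd, &sa, sizeof(sa));
}

/* Returns 0 if every step behaved as described, otherwise a negative
 * number identifying the first step that did not. */
static int reconnect_demo(void)
{
	struct sockaddr_in addr;
	socklen_t len = sizeof(addr);
	int rc = 0;
	int lfd = socket(AF_INET, SOCK_STREAM, 0);
	int cfd = socket(AF_INET, SOCK_STREAM, 0);

	if (lfd < 0 || cfd < 0) { rc = -1; goto out; }

	/* Listener on an ephemeral loopback port; connections are never
	 * accepted, they just sit in the listen backlog. */
	memset(&addr, 0, sizeof(addr));
	addr.sin_family = AF_INET;
	addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	if (bind(lfd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
	    listen(lfd, 8) < 0 ||
	    getsockname(lfd, (struct sockaddr *)&addr, &len) < 0) {
		rc = -2; goto out;
	}

	if (connect(cfd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
		rc = -3; goto out;
	}

	/* After a shutdown, sending is expected to fail with EPIPE. */
	shutdown(cfd, SHUT_RDWR);
	if (send(cfd, "x", 1, MSG_NOSIGNAL) != -1 || errno != EPIPE) {
		rc = -4; goto out;
	}

	/* Disconnect, then reuse the same socket for a new connection. */
	if (disconnect_socket(cfd) < 0) { rc = -5; goto out; }
	if (connect(cfd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
		rc = -6; goto out;
	}
	if (send(cfd, "x", 1, MSG_NOSIGNAL) != 1) { rc = -7; goto out; }
out:
	if (cfd >= 0)
		close(cfd);
	if (lfd >= 0)
		close(lfd);
	return rc;
}
```

Calling reconnect_demo() from a small main() and checking for a zero return
is enough to exercise it. This only illustrates the userspace view of the
AF_UNSPEC disconnect; it says nothing about what the in-kernel SunRPC
transport does on the paths discussed here.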
There is code to do that in the SunRPC code in xs_abort_connection() but
that's conditionally called from xs_tcp_reuse_connection():

static void xs_tcp_reuse_connection(struct rpc_xprt *xprt,
		struct sock_xprt *transport)
{
	unsigned int state = transport->inet->sk_state;

	if (state == TCP_CLOSE && transport->sock->state == SS_UNCONNECTED)
		return;
	if ((1 << state) & (TCPF_ESTABLISHED|TCPF_SYN_SENT))
		return;
	xs_abort_connection(xprt, transport);
}

That has changed since 2.6.26, which unconditionally did the connect()
with sa_family == AF_UNSPEC. FWIW, we cannot reproduce this problem with
2.6.26.
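
To make the conditions in xs_tcp_reuse_connection() easier to see, here is a
userspace sketch of the same decision as a pure function. It is only an
illustration: demo_needs_abort() and the DEMO_* constants are hypothetical
stand-ins I have defined locally, with values assumed to mirror the kernel's
TCP state and socket state numbering. It returns true exactly when the
function quoted above would go on to call xs_abort_connection().

```c
#include <assert.h>
#include <stdbool.h>

/* Local stand-ins, assumed to mirror the kernel's TCP state numbers. */
enum {
	DEMO_TCP_ESTABLISHED = 1,
	DEMO_TCP_SYN_SENT    = 2,
	DEMO_TCP_FIN_WAIT1   = 4,
	DEMO_TCP_CLOSE       = 7,
};

/* Bit flags derived from the state numbers, as the TCPF_* flags are. */
enum {
	DEMO_TCPF_ESTABLISHED = 1 << DEMO_TCP_ESTABLISHED,
	DEMO_TCPF_SYN_SENT    = 1 << DEMO_TCP_SYN_SENT,
};

/* Local stand-ins, assumed to mirror the kernel's socket_state values. */
enum {
	DEMO_SS_UNCONNECTED = 1,
	DEMO_SS_CONNECTED   = 3,
};

/* True when the quoted function would reach xs_abort_connection(),
 * false when it would return early and skip the disconnect. */
static bool demo_needs_abort(unsigned int state, int sock_state)
{
	if (state == DEMO_TCP_CLOSE && sock_state == DEMO_SS_UNCONNECTED)
		return false;
	if ((1u << state) & (DEMO_TCPF_ESTABLISHED | DEMO_TCPF_SYN_SENT))
		return false;
	return true;
}

/* A few spot checks of the predicate. */
static void demo_self_check(void)
{
	assert(!demo_needs_abort(DEMO_TCP_CLOSE, DEMO_SS_UNCONNECTED));
	assert(!demo_needs_abort(DEMO_TCP_ESTABLISHED, DEMO_SS_CONNECTED));
	assert(!demo_needs_abort(DEMO_TCP_SYN_SENT, DEMO_SS_CONNECTED));
	assert(demo_needs_abort(DEMO_TCP_FIN_WAIT1, DEMO_SS_CONNECTED));
}
```

The three early-return cases are the ones in which the AF_UNSPEC connect() is
skipped, unlike in 2.6.26.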