This is definitely a bug in the NFS client. Let me explain what's happening
and why the packet is evil:
The client sees a timeout (or some other problem, e.g. garbage
reply). While it correctly figures that it should retransmit the packet,
it does not rebuild the packet but resends the current network buffer
which, for some reason, already contains an RPC *reply* (you can see that
from the second long word - a call has a 0 there, while replies have a 1).
The first long word is the transmission id (XID).
It appears there's a race condition somewhere in the RPC socket handling
where the client doesn't notice that some packet has already been received
when it starts waiting for the reply.
Can you please apply the enclosed patch and check what happens?
May I also request that people put `NFS' somewhere in the subject
when they report a problem to linux-kernel that they suspect is
nfs-related? Quite frequently, I scan the lists only by subject, so
threads like this one usually escape me.
Cheers
Olaf
-- Olaf Kirch | --- o --- Nous sommes du soleil we love when we play okir@monad.swb.de | / | \ sol.dhoop.naytheet.ah kin.ir.samse.qurax ------------------------------------------------------------------ --- rpcsock.c.orig Wed Jul 9 11:32:30 1997 +++ rpcsock.c Wed Jul 9 12:04:21 1997 @@ -60,19 +60,27 @@ /* - * Insert new request into wait list. We make sure list is sorted by - * increasing timeout value. + * Insert new request into wait list. + * If there's already a request in the first position, this is the receiver + * sitting in rpc_select. Insert new request after the receiver. */ static inline void rpc_insque(struct rpc_sock *rsock, struct rpc_wait *slot) { - struct rpc_wait *next = rsock->pending; + struct rpc_wait *next, *prev; - slot->w_next = next; - slot->w_prev = NULL; - if (next) - next->w_prev = slot; - rsock->pending = slot; + if ((prev = rsock->pending) != NULL) { + next = prev->w_next; + slot->w_next = next; + slot->w_prev = prev; + prev->w_next = slot; + if (next) + next->w_prev = slot; + } else { + slot->w_next = NULL; + slot->w_prev = NULL; + rsock->pending = slot; + } slot->w_queued = 1; dprintk("RPC: inserted %p into queue\n", slot); @@ -412,8 +420,6 @@ while (rsock->pending != slot) { if (!slot->w_gotit) interruptible_sleep_on(&slot->w_wait); - if (slot->w_gotit) - return slot->w_result; /* quite important */ if (current->signal & ~current->blocked) return -ERESTARTSYS; if (rsock->shutdown) @@ -421,6 +427,9 @@ if (current->timeout == 0) return -ETIMEDOUT; } + + if (slot->w_gotit) + return slot->w_result; /* quite important */ /* Wait for data to arrive */ if ((result = rpc_select(rsock)) < 0) { ------------------------------------------------------------------