Re: [PATCH] NFSv4: Use exponential backoff delay for NFS4_ERRDELAY

From: Chuck Lever
Date: Thu Apr 25 2013 - 10:52:00 EST

On Apr 25, 2013, at 9:49 AM, bfields@xxxxxxxxxxxx wrote:

> On Thu, Apr 25, 2013 at 01:30:58PM +0000, Myklebust, Trond wrote:
>> On Thu, 2013-04-25 at 09:29 -0400, bfields@xxxxxxxxxxxx wrote:
>>> My position is that we simply have no idea what order of magnitude even
>>> delay should be. And that in such a situation exponential backoff such
>>> as implemented in the synchronous case seems the reasonable default as
>>> it guarantees at worst doubling the delay while still bounding the
>>> long-term average frequency of retries.
>> So we start with a 15 second delay, and then go to 60 seconds?
> I agree that a server should normally be doing the wait on its own if
> the wait would be on the order of an rpc round trip.
> So I'd be inclined to start with a delay that was an order of magnitude
> or two more than a round trip.
> And I'd expect NFS isn't common on networks with 1-second latencies.
> So the 1/10 second we're using in the synchronous case sounds closer to
> the right ballpark to me.

The RPC layer already keeps RPC round trip statistics, so the client doesn't have to guess with a "one size fits all" number.

I'm all for keeping client recovery time short. But after following this argument, I think 10x RTT is crazy short. Aggressive retransmits can lead to data corruption, and RTT on a fast server is going to be on the order of a millisecond. And what about RDMA, where RTT is about 20 usecs?

A better answer might be to start at one second, then exponentially back off up to a cap that is the minimum of 0.25x the lease time and 0.25x the RPC retransmit timeout.
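
As a rough sketch of that suggestion (not the actual patch, and not existing kernel code), the delay calculation could look something like the following. Every name here (DELAY_START_MS, delay_cap_ms, next_delay_ms) is hypothetical, and times are in milliseconds purely for illustration.

```c
/* Hypothetical sketch: none of these names exist in the kernel. */

#define DELAY_START_MS 1000UL	/* assumed initial delay: one second */

static unsigned long min_ul(unsigned long a, unsigned long b)
{
	return a < b ? a : b;
}

/*
 * Upper bound on the delay: the smaller of a quarter of the lease
 * time and a quarter of the RPC retransmit timeout.
 */
static unsigned long delay_cap_ms(unsigned long lease_ms,
				  unsigned long rpc_timeo_ms)
{
	return min_ul(lease_ms / 4, rpc_timeo_ms / 4);
}

/*
 * Return the next delay given the current one. Pass cur_ms == 0 for
 * the first NFS4ERR_DELAY; after that the delay doubles until it
 * reaches the cap. If the cap is below one second, the first delay
 * is clamped to the cap as well.
 */
static unsigned long next_delay_ms(unsigned long cur_ms,
				   unsigned long cap_ms)
{
	if (cur_ms == 0)
		return min_ul(DELAY_START_MS, cap_ms);
	if (cur_ms > cap_ms / 2)	/* doubling would exceed the cap */
		return cap_ms;
	return cur_ms * 2;
}
```

With a 90-second lease and a 60-second RPC retransmit timeout (both just example values), the cap works out to 15 seconds, and successive delays would be 1, 2, 4, 8, 15, 15, ... seconds.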

Chuck Lever
