Re: RFC: MTU for serving NFS on Infiniband

From: Eric Dumazet
Date: Wed Aug 25 2010 - 01:55:41 EST


On Tuesday, 24 August 2010 at 15:39 -0700, Stephen Hemminger wrote:

> If the NFS server is smart enough to generate:
> Header (skb) + one or more pages in fragment list
> then IP fragmentation could do fragmentation by allocating
> new headers skb (small) and assigning the same pages to
> multiple skb's using page ref count.
>
> It obviously isn't working that way.
>

It is, but ip_append_data() allocates a huge head if the MTU is huge.

NFS is trying to build paged skbs, to avoid order-X allocations (X > 0).

> The whole problem is moot because NFS over UDP has known data corruption
> issues in the face of packet loss. The sequence number of the IP fragment
> can easily wrap around causing old data to be grouped with new data and
> the UDP checksum is so weak that the resulting UDP packet will be consumed by the NFS
> client and passed to the user application as a corrupted disk block.
>
> DON'T USE NFS OVER UDP!
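
To give a rough sense of the wraparound described above: the IPv4
Identification field is 16 bits, so the ID space repeats after 65536
datagrams. The sketch below is back-of-the-envelope arithmetic only. The
function name, the 8192-byte datagram size and the link speeds are
illustrative assumptions, and it assumes the sender saturates the link with
one shared ID counter.

```python
# Rough estimate of how quickly the 16-bit IPv4 ID space wraps.
# Assumptions (not from the thread): fixed datagram size, a sender that
# saturates the link, and one ID consumed per datagram.

IP_ID_SPACE = 1 << 16  # the IPv4 Identification field is 16 bits wide


def ip_id_wrap_seconds(datagram_bytes, link_bits_per_sec):
    """Seconds needed to send IP_ID_SPACE datagrams at full link rate."""
    total_bits = IP_ID_SPACE * datagram_bytes * 8
    return total_bits / link_bits_per_sec


if __name__ == "__main__":
    for name, rate in (("1 Gbit/s", 1e9), ("10 Gbit/s", 10e9)):
        secs = ip_id_wrap_seconds(8192, rate)
        print(f"{name}: IP ID wraps after about {secs:.2f} s")
```

With these assumed numbers the ID space wraps in a few seconds at 1 Gbit/s
and in under a second at 10 Gbit/s, which is well inside the default Linux
fragment reassembly timeout (net.ipv4.ipfrag_time, 30 seconds).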

But Marc's point is to use a big MTU, so that no IP fragmentation is
needed.

All UDP applications using MSG_MORE will hit order-2 allocations if
the MTU is 9000, for example...
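
As a minimal sketch of where "order-2" comes from: an order-N allocation is
2^N physically contiguous pages, so a linear buffer sized for the MTU needs
the smallest order that covers it. The code assumes 4096-byte pages and
ignores the extra header and skb bookkeeping space, which only push the size
up. The helper name alloc_order and the list of MTUs are illustrative, not
kernel API.

```python
# Smallest page-allocation order that can hold a linear buffer of a
# given size. Assumption: 4 KiB pages; real skb heads carry additional
# overhead on top of the MTU, which this sketch ignores.

PAGE_SIZE = 4096


def alloc_order(size, page_size=PAGE_SIZE):
    """Return the smallest order with page_size * 2**order >= size."""
    if size < 1:
        raise ValueError("size must be positive")
    pages = -(-size // page_size)      # ceiling division
    return (pages - 1).bit_length()    # round page count up to a power of two


if __name__ == "__main__":
    for mtu in (1500, 9000, 65520):    # illustrative MTU values
        print(f"MTU {mtu}: order-{alloc_order(mtu)} allocation")
```

Under these assumptions an MTU of 1500 fits in a single page (order 0),
while 9000 bytes needs three pages and is therefore rounded up to four
(order 2). An MTU of 65520, as seen with IPoIB connected mode, would need
order 4.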
