Re: [openib-general] Re: TSO and IPoIB performance degradation

From: Roland Dreier
Date: Tue Mar 07 2006 - 20:15:04 EST


David> How limited are the IPoIB devices, TX descriptor wise?

David> One side effect of the TSO changes is that one extra
David> descriptor will be used for outgoing packets. This is
David> because we have to put the headers as well as the user
David> data, into page based buffers now.

We have essentially no limit on TX descriptors. However I think
there's some confusion about TSO: IPoIB does _not_ do TSO -- generic
InfiniBand hardware does not have any TSO capability. In the future
we might be able to implement TSO for certain hardware that does have
support, but even that requires some firmware help from the from the
HCA vendors, etc. So right now the IPoIB driver does not do TSO.

The reason TSO comes up is that reverting the patch described below
helps (or helped at some point at least) IPoIB throughput quite a bit.
Clearly this was a bug fix so we can't revert it in general but I
think what Michael Tsirkin was suggesting at the beginning of this
thread is to do what the last paragraph of the changelog says -- find
some way to re-enable the trick.

diff-tree 3143241... (from e16fa6b...)
Author: David S. Miller <davem@xxxxxxxxxxxxx>
Date: Mon May 23 12:03:06 2005 -0700

[TCP]: Fix stretch ACK performance killer when doing ucopy.

When we are doing ucopy, we try to defer the ACK generation to
cleanup_rbuf(). This works most of the time very well, but if the
ucopy prequeue is large, this ACKing behavior kills performance.

With TSO, it is possible to fill the prequeue so large that by the
time the ACK is sent and gets back to the sender, most of the window
has emptied of data and performance suffers significantly.

This behavior does help in some cases, so we should think about
re-enabling this trick in the future, using some kind of limit in
order to avoid the bug case.

- R.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/