Re: [PATCH v2] tcp: splice as many packets as possible at once

From: Jarek Poplawski
Date: Fri Feb 06 2009 - 04:11:06 EST

Next message: KOSAKI Motohiro: "Re: [linux-next][PATCH] revert headers_check fix: ia64, fpu.h"
Previous message: Bert Wesarg: "[PATCH urcu] Use pthread_equal() for pthread_t's equality test"
In reply to: Herbert Xu: "Re: [PATCH v2] tcp: splice as many packets as possible at once"
Next in thread: David Miller: "Re: [PATCH v2] tcp: splice as many packets as possible at once"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On Thu, Feb 05, 2009 at 11:52:58PM -0800, David Miller wrote:
> From: Jarek Poplawski <jarkao2@xxxxxxxxx>
> Date: Tue, 3 Feb 2009 09:41:08 +0000
>
> > Yes, this looks reasonable. On the other hand, I think it would be
> > nice to get some opinions of slab folks (incl. Evgeniy) on the expected
> > efficiency of such a solution. (It seems releasing with put_page() will
> > always have some cost with delayed reusing and/or waste of space.)
>
> I think we can't avoid using carved up pages for skb->data in the end.
> The whole kernel wants to speak in pages and be able to grab and
> release them in one way and one way only (get_page() and put_page()).
>
> What do you think is more likely? Us teaching the whole entire kernel
> how to hold onto SKB linear data buffers, or the networking fixing
> itself to operate on pages for its header metadata? :-)

This idea looks very reasonable, except I wonder why nobody else has
needed this kind of mm interface. Another question: it seems many
existing mechanisms, like fast searching, defragmentation etc., could
be reused.

> What we'll end up with is likely a hybrid scheme. High speed devices
> will receive into pages. And also the skb->data area will be page
> backed and held using get_page()/put_page() references.
>
> It is not even worth optimizing for skb->data holding the entire
> packet, that's not the case that matters.
>
> These skb->data areas will thus be 128 bytes plus the skb_shinfo
> structure blob. They also will be recycled often, rather than held
> onto for long periods of time.

Looks fine, except: you mentioned dumb NICs, which would need this
page space on receive anyway. BTW, don't they need it on transmit
as well?

> In fact we can optimize that even further in many ways, for example by
> dropping the skb->data backed memory once the skb is queued to the
> socket receive buffer. That will make skb->data buffer lifetimes
> minuscule even under heavy receive load.
>
> In that kind of situation, doing even the most stupidest page slicing
> algorithm, similar to what we do now with sk->sk_sndmsg_page, is
> more than adequate and things like NTA (purely to solve this problem)
> is overengineering.

Hmm... I don't get it. It seems these slabs do a lot of advanced work,
and still some people like Evgeniy or Nick thought it's not enough,
and even found it worth their time to rework this.

There is also a question of memory accounting: do you think admins
won't care if we give away, say, 25% additionally?
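
To make the trade-off concrete, here is a minimal userspace sketch of
the kind of naive page slicing discussed above: buffers are carved
sequentially out of one page, every buffer holds a reference on that
page, and the page is freed only when the last reference is dropped.
This is not kernel code. All names (sketch_page, sketch_slicer,
sketch_slice, sketch_get_page, sketch_put_page) are hypothetical, the
4096-byte page size is an assumption, and alignment is ignored. It only
illustrates where the delayed reuse and the wasted tail space come from.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define SKETCH_PAGE_SIZE 4096u  /* assumed page size, for illustration only */

/* Hypothetical stand-in for a page: one reference count per page. */
struct sketch_page {
    unsigned int refcount;               /* holders, including the slicer */
    unsigned char data[SKETCH_PAGE_SIZE];
};

/* Hypothetical slicer state: the page being carved and the next free byte. */
struct sketch_slicer {
    struct sketch_page *page;            /* NULL until the first allocation */
    size_t offset;
};

static unsigned int sketch_pages_live;   /* pages allocated and not yet freed */

static void sketch_get_page(struct sketch_page *p)
{
    p->refcount++;
}

static void sketch_put_page(struct sketch_page *p)
{
    assert(p->refcount > 0);
    if (--p->refcount == 0) {
        /* Only the last put releases the whole page. */
        free(p);
        sketch_pages_live--;
    }
}

/*
 * Carve len bytes out of the current page. On success, *owner is the page
 * the caller must later release with sketch_put_page().
 */
static void *sketch_slice(struct sketch_slicer *s, size_t len,
                          struct sketch_page **owner)
{
    void *buf;

    if (len == 0 || len > SKETCH_PAGE_SIZE)
        return NULL;

    /* Not enough room left: drop the slicer's reference, tail is wasted. */
    if (s->page && s->offset + len > SKETCH_PAGE_SIZE) {
        sketch_put_page(s->page);
        s->page = NULL;
    }

    if (!s->page) {
        s->page = malloc(sizeof(*s->page));
        if (!s->page)
            return NULL;
        s->page->refcount = 1;           /* the slicer's own reference */
        s->offset = 0;
        sketch_pages_live++;
    }

    buf = s->page->data + s->offset;
    s->offset += len;
    sketch_get_page(s->page);            /* one reference per carved buffer */
    *owner = s->page;
    return buf;
}
```

With requests of 3000, 1000 and 500 bytes, the first two share a page
and the third forces a new one, leaving 96 bytes of the first page
unused. That first page cannot be reused until both of its buffers have
been released, no matter how long the other one has been free.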

Jarek P.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
