Re: [PATCH v2] tcp: splice as many packets as possible at once
From: Jarek Poplawski
Date: Fri Feb 06 2009 - 04:11:06 EST

On Thu, Feb 05, 2009 at 11:52:58PM -0800, David Miller wrote:
> From: Jarek Poplawski <jarkao2@xxxxxxxxx>
> Date: Tue, 3 Feb 2009 09:41:08 +0000
>
> > Yes, this looks reasonable. On the other hand, I think it would be
> > nice to get some opinions of slab folks (incl. Evgeniy) on the expected
> > efficiency of such a solution. (It seems releasing with put_page() will
> > always have some cost with delayed reusing and/or waste of space.)
>
> I think we can't avoid using carved up pages for skb->data in the end.
> The whole kernel wants to speak in pages and be able to grab and
> release them in one way and one way only (get_page() and put_page()).
>
> What do you think is more likely? Us teaching the whole entire kernel
> how to hold onto SKB linear data buffers, or the networking fixing
> itself to operate on pages for its header metadata? :-)

This idea looks very reasonable, except I wonder why nobody else has
needed this kind of mm interface. Also, it seems that many mechanisms,
like fast searching, defragmentation, etc., could be reused.

> What we'll end up with is likely a hybrid scheme. High speed devices
> will receive into pages. And also the skb->data area will be page
> backed and held using get_page()/put_page() references.
>
> It is not even worth optimizing for skb->data holding the entire
> packet, that's not the case that matters.
>
> These skb->data areas will thus be 128 bytes plus the skb_shinfo
> structure blob. They also will be recycled often, rather than held
> onto for long periods of time.

Looks fine, except that you mentioned dumb NICs, which would need this
page space on receive anyway. BTW, don't they need it on transmit
as well?

> In fact we can optimize that even further in many ways, for example by
> dropping the skb->data backed memory once the skb is queued to the
> socket receive buffer. That will make skb->data buffer lifetimes
> minuscule even under heavy receive load.
>
> In that kind of situation, doing even the stupidest page slicing
> algorithm, similar to what we do now with sk->sk_sndmsg_page, is
> more than adequate and things like NTA (purely to solve this problem)
> are overengineering.

Hmm... I don't get it. It seems these slabs do a lot of advanced work,
and still some people like Evgeniy or Nick thought it was not enough,
and even found it worth their time to rework this.

There is also a question of memory accounting: do you think admins
don't care if we give away, say, an additional 25%?

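For illustration only, here is a minimal userspace sketch of the kind of
simple page slicing discussed above, loosely modelled on the idea of a
current page plus an offset (as with sk->sk_sndmsg_page). It is not
kernel code: struct sim_page, struct page_slicer, slice_alloc() and the
4096-byte SIM_PAGE_SIZE are all hypothetical names and assumptions. It
only tries to show the two costs in question: the tail of a page that is
wasted when a request does not fit, and a page that cannot be reused
until the last slice has been released.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define SIM_PAGE_SIZE 4096u	/* assumed page size, for illustration */

/* Hypothetical stand-in for struct page: a buffer plus a refcount. */
struct sim_page {
	unsigned int refcount;
	unsigned char data[SIM_PAGE_SIZE];
};

/* Hypothetical slicer state: current page, next free offset, waste. */
struct page_slicer {
	struct sim_page *page;	/* the slicer holds one reference on it */
	unsigned int off;	/* first unused byte in the current page */
	unsigned int wasted;	/* tail bytes abandoned so far */
};

static unsigned int sim_pages_live;	/* pages allocated and not yet freed */

static struct sim_page *sim_page_alloc(void)
{
	struct sim_page *p = malloc(sizeof(*p));

	if (!p)
		return NULL;
	p->refcount = 1;
	sim_pages_live++;
	return p;
}

/* Userspace analogues of get_page()/put_page(). */
static void sim_get_page(struct sim_page *p)
{
	p->refcount++;
}

static void sim_put_page(struct sim_page *p)
{
	assert(p->refcount > 0);
	if (--p->refcount == 0) {
		sim_pages_live--;
		free(p);
	}
}

/*
 * Carve len bytes out of the current page. If they do not fit, the
 * remaining tail is counted as waste and a fresh page is started.
 * On success the caller owns one reference on *pagep and must drop
 * it with sim_put_page().
 */
static void *slice_alloc(struct page_slicer *s, unsigned int len,
			 struct sim_page **pagep)
{
	void *buf;

	if (len == 0 || len > SIM_PAGE_SIZE)
		return NULL;

	if (s->page && SIM_PAGE_SIZE - s->off < len) {
		s->wasted += SIM_PAGE_SIZE - s->off;
		sim_put_page(s->page);	/* drop the slicer's own reference */
		s->page = NULL;
	}
	if (!s->page) {
		s->page = sim_page_alloc();
		if (!s->page)
			return NULL;
		s->off = 0;
	}

	sim_get_page(s->page);		/* reference handed to the caller */
	*pagep = s->page;
	buf = s->page->data + s->off;
	s->off += len;
	return buf;
}
```

With a zero-initialized struct page_slicer, requests of 3000 and 1000
bytes share one page; a following 200-byte request does not fit into
the 96 bytes left, so those 96 bytes are counted as waste and a second
page is started. The first page is only freed once both of its slices
have been released.
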
Jarek P.