Re: [RFC][PATCH 0/4] VM deadlock prevention -v4
From: Daniel Phillips
Date: Sun Aug 13 2006 - 20:40:43 EST
Peter Zijlstra wrote:
On Sat, 2006-08-12 at 20:16 +0200, Indan Zupancic wrote:
What was missing or wrong in the old approach? Can't you use the new
approach, but use alloc_pages() instead of SROG?
Sorry if I bug you so, but I'm also trying to increase my knowledge here. ;-)
I'm almost sorry I threw that code out...
Good instinct :-)
Lemme see what I can do to explain; what I need/want is:
- single allocation group per packet - that is, when I free a packet
and all its associated object I get my memory back.
First, try to recast all your objects as pages, as Evgeniy Polyakov
suggested. Then if there is some place where that just doesn't work
(please point it out) put a mempool there and tweak the high level
reservation setup accordingly.
- not waste too much space managing the various objects
If we waste a little space only where the network would have otherwise
dropped a packet, that is still a big win. We just need to be sure the
normal path does not become more wasteful.
skb operations want to allocate various sk_buffs for the same data
clones. Also, it wants to be able to break the COW or realloc the data.
The trivial approach would be one page (or higher alloc page) per
object, and that will work quite well, except that it'll waste a _lot_
High order allocations are just way too undependable without active
defragmentation, which isn't even on the horizon at the moment. We
just need to treat any network hardware that can't scatter/gather into
single pages as too broken to use for network block io.
As for sk_buff cow break, we need to look at which network paths do it
(netfilter obviously, probably others) and decide whether we just want
to declare that the feature breaks network block IO, or fix the feature
so it plays well with reserve accounting.
So I tried manual packing (parts of that you have seen in previous
attempts). This gets hard when you want to do unlimited clones and COW
breaks. To do either you need to go link several pages.
You need to avoid the temptation to fix the entire world on the first
attempt. Sure, there will be people who call for gobs of overengineering
right from the start, but simple has always worked better for Linux than
lathering on layers of complexity just to support some feature that may
arguably be broken by design. For example, swapping through a firewall.
Changing from per-interface to a global block IO reserve was a great
simplification, we need more of those.
Looking forward to -v5 ;-)
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/