Re: [PATCH 00/10] foundations for reserve-based allocation

From: Daniel Phillips
Date: Mon Aug 06 2007 - 20:10:23 EST


Hi Matt,

On Monday 06 August 2007 13:23, Matt Mackall wrote:
> On Mon, Aug 06, 2007 at 12:29:22PM +0200, Peter Zijlstra wrote:
> > In the interrest of getting swap over network working and posting
> > in smaller series, here is the first series.
> >
> > This series lays the foundations needed to do reserve based
> > allocation. Traditionally we have used mempools (and others like
> > radix_tree_preload) to handle the problem.
> >
> > However this does not fit the network stack. It is built around
> > variable sized allocations using kmalloc().
> >
> > This calls for a different approach.
>
> One wonders if containers are a possible solution. I can already
> solve this problem with virtualization: have one VM manage all the
> network I/O and export the device as a simpler virtual block device
> to other VMs. Provided this VM isn't doing any "real" work and is
> sized appropriately, it won't get wedged. Since the other VMs now
> submit I/O through the simpler block interface, they can avoid
> getting wedged with the standard mempool approach.

The patch set posted today amounts to hardly any new bytes of code at
all, and the other necessary bits coming down the pipe are not much
heavier. Compare to dedicating a whole VM to supressing just one of
the symptoms. Much better just to cure the disease and save your VM
for a more noble purpose.

> If we can run nbd and friends inside their own container that can
> give similar isolation, we might not need to add this other
> complexity.

This patch set fixes a severe problem that is so far unfixed in spite of
a number of attempts, and does it in a simple way that translates into
a small amount of code. If it comes across as complex, that is a
matter of presentation and tweaking. Subtle, yes, but not complex.

> Just food for thought. I haven't looked closely enough at the
> containers implementations yet to determine whether this is possible
> or if the overhead in performance or complexity is acceptable.

Immensely more overhead than Peter's patches.

Regards,

Daniel
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/