Re: [RFC][PATCH 0/9] Network receive deadlock prevention for NBD
From: Peter Zijlstra
Date: Sat Aug 12 2006 - 05:18:12 EST
On Sat, 2006-08-12 at 12:47 +0400, Evgeniy Polyakov wrote:
> On Fri, Aug 11, 2006 at 11:42:50PM -0400, Rik van Riel (riel@xxxxxxxxxx) wrote:
> > >Dropping these non-essential packets makes sure the reserve memory
> > >doesn't get stuck in some random blocked user-space process, hence
> > >you can make progress.
> > In short:
> > - every incoming packet needs to be received at the packet level
> > - when memory is low, we only deliver data to memory critical sockets
> > - packets to other sockets get dropped, so the memory can be reused
> > for receiving other packets, including the packets needed for the
> > memory critical sockets to make progress
> > Forwarding packets while in low memory mode should not be a problem
> > at all, since forwarded packets get freed quickly.
> > The memory pool for receiving packets does not need much accounting
> > of any kind, since every packet will end up coming from that pool
> > when normal allocations start failing. Maybe Evgeniy's allocator
> > can do something smarter internally, and mark skbuffs as MEMALLOC
> > when the number of available skbuffs is getting low?
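Rik's three rules can be sketched as one receive-path decision. This is
only an illustration: the struct and function names below are made up and
are not kernel APIs, and "the packet's memory came from the reserve" is
assumed to be the signal that memory is low.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins, not the kernel's struct sock / struct sk_buff. */
struct sock_sketch {
    bool memalloc;              /* socket is needed to free memory (NBD, swap) */
};

struct skb_sketch {
    bool from_reserve;          /* assumed marker: allocated from the reserve */
    struct sock_sketch *dest;   /* NULL when the packet is only forwarded */
};

enum rx_verdict { RX_DELIVER, RX_FORWARD, RX_DROP };

/* Every packet is received; only the delivery step is filtered. */
static enum rx_verdict rx_decide(const struct skb_sketch *skb)
{
    if (skb->dest == NULL)
        return RX_FORWARD;      /* forwarded packets are freed quickly */
    if (!skb->from_reserve)
        return RX_DELIVER;      /* normal memory: no restriction */
    /* Low memory: only memory critical sockets may keep the packet. */
    return skb->dest->memalloc ? RX_DELIVER : RX_DROP;
}
```

A dropped packet gives its memory straight back, so the next packet for a
memory critical socket can still be received.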
> As you described above, memory for each packet must be allocated (either
> from SLAB or from reserve), so network needs special allocator in OOM
> condition, and that allocator should be separated from SLAB's one which
> got OOM, so my purpose is just to use that different allocator (with
> additional features) for networking always. Since every piece of
> networking is limited (socket queues, socket numbers, hardware queues,
> hardware wire speeds and so on) there is always a maximum amount of
> memory it can consume and can never exceed, so if network allocator will
> get that amount of memory at the beginning, it will never meet OOM,
> so it will _always_ work and thus can allow to make slow progress for
> OOM-capable things like block devices and swap issues.
> There are no special reserve and no need to switch to/from it and
> no possibility to have OOM by design.
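To make the sizing claim concrete, here is a rough sketch of the bound such
a pre-sized pool would need. The limits and their names are assumptions
picked for illustration, not values taken from the network stack, and a
real bound would have more terms.

```c
#include <stddef.h>

/* All fields are hypothetical limits chosen for illustration. */
struct net_limits {
    size_t max_sockets;        /* maximum number of open sockets */
    size_t sock_queue_bytes;   /* per-socket receive queue limit */
    size_t hw_queue_bytes;     /* total bytes the hardware queues can hold */
};

/*
 * Worst-case bytes the network allocator would have to set aside up front.
 * Overflow is not checked in this sketch.
 */
static size_t net_pool_bound(const struct net_limits *l)
{
    return l->max_sockets * l->sock_queue_bytes + l->hw_queue_bytes;
}
```

With 1000 sockets, 64 KiB per socket queue and 1 MiB of hardware queues this
already comes to roughly 63.5 MiB, and the first term grows linearly with the
socket count.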
I'm not sure the network stack is bounded as you say; for instance,
imagine receiving a lot of packets for blocked user-space processes:
they will just accumulate in the network stack and go nowhere. In that
case memory usage is very much unbounded.
Even if blocked sockets only accepted a limited number of packets,
it would then become a function of the number of open sockets, which is
In any scheme you need to bound the amount of memory, and in low memory
situations it is very useful to return memory as soon as possible.
Hence this approach: it uses the global memory reserve and a page-based
allocator that does indeed return memory instantly (yes, the one I have
shown so far is ghastly; I've since written a nice one).
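A minimal user-space sketch of what "page based, returns memory instantly"
means here. This is not the allocator from the patches; all names are made
up, and malloc()/free() stand in for taking whole pages from, and giving
them back to, the reserve.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define SKETCH_PAGE_SIZE 4096u

struct reserve {
    size_t pages_free;          /* pages still available in the reserve */
};

struct pkt_buf {
    struct reserve *owner;
    size_t npages;
    unsigned char *data;
};

/* Round a byte length up to whole pages. */
static size_t pages_for(size_t len)
{
    return (len + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE;
}

/* Each packet gets its own page(s); nothing is shared between packets. */
static bool pkt_alloc(struct reserve *r, size_t len, struct pkt_buf *out)
{
    size_t n = pages_for(len);
    unsigned char *p;

    if (n == 0 || n > r->pages_free)
        return false;           /* reserve exhausted: caller must drop */
    p = malloc(n * SKETCH_PAGE_SIZE);
    if (p == NULL)
        return false;
    r->pages_free -= n;
    out->owner = r;
    out->npages = n;
    out->data = p;
    return true;
}

/* Freeing a packet makes its pages available again immediately. */
static void pkt_free(struct pkt_buf *b)
{
    free(b->data);
    b->owner->pages_free += b->npages;
    b->data = NULL;
    b->npages = 0;
}
```

Because no page is shared between packets, dropping one packet is enough to
make room for the next; a slab-style cache may keep a partially used slab
around until every object in it has been freed.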