Re: [RFC][PATCH 2/9] deadlock prevention core

From: Peter Zijlstra
Date: Wed Aug 09 2006 - 10:02:44 EST

On Wed, 2006-08-09 at 15:48 +0200, Indan Zupancic wrote:
> On Wed, August 9, 2006 14:54, Peter Zijlstra said:
> > On Wed, 2006-08-09 at 14:02 +0200, Indan Zupancic wrote:
> >> That avoids lots of checks and should guarantee that the
> >> accounting is correct, except in the case when the IFF_MEMALLOC flag is
> >> cleared and the counter is set to zero manually. Can't that be avoided and
> >> just let it decrease to zero naturally?
> >
> > That would put the atomic op on the free path unconditionally, I think
> > davem gets nightmares from that.
> I confused SOCK_MEMALLOC with sk_buff::memalloc, sorry. What I meant was
> to unconditionally decrement the reserved usage only when memalloc is true
> on the free path. That way all skbs that increased the reserve also decrease
> it, and the counter should never go below zero.

OK, so far so good, except we loose the notion of getting memory back
regular skbs.

> Also as far as I can see it should be possible to replace all atomic
> "if (unlikely(dev_reserve_used(skb->dev)))" checks witha check if
> memalloc is set. That should make davem happy, as there aren't any
> atomic instructions left in hot paths.

dev_reserve_used() uses atomic_read() which isn't actually a LOCK'ed
instruction, so that should not matter.

> If IFF_MEMALLOC is set new skbs set memalloc and increase the reserve.

Not quite, if IFF_MEMALLOC is set new skbs _could_ get memalloc set. We
fall back to alloc_pages() if the regular path fails to alloc. If the
is backed by a page (as opposed to kmem_cache fluff) sk_buff::memalloc
is set.

> When the skb is being destroyed it doesn't matter if IFF_MEMALLOC is set
> or not, only if that skb used reserves and thus only the memalloc flag
> needs to be checked. This means that changing the IFF_MEMALLOC doesn't
> affect in-flight skbs but only newly created ones, and there's no need to
> update in-flight skbs whenever the flag is changed as all should go well.

Your reasoning is sound, except for these two point above...

<snip code>

> It seems that here SOCK_MEMALLOC is only set on the first socket.
> Shouldn't it be set on all sockets instead?

Ouch, where was my brain... Thanks!

Also, I've been thinking (more pain), should I not up the reserve for

To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at
Please read the FAQ at