Re: [RFC][PATCH 2/9] deadlock prevention core

From: Peter Zijlstra
Date: Wed Aug 09 2006 - 08:56:22 EST


On Wed, 2006-08-09 at 14:02 +0200, Indan Zupancic wrote:
> On Wed, August 9, 2006 2:25, Daniel Phillips said:
> > .... We want the atomic op that some people call
> > "monus": decrement unless zero.
>
> Currently atomic_inc_not_zero(), atomic_add_unless() and atomic_cmpxchg()
> exist, so making an atomic_dec_not_zero() should be easy.

atomic_add_unless() - will nicely do, thanks.

> I'm not sure, to me it looks like dev_unreserve_skb() is always called
> without really knowing if it is justified or not, or else there wouldn't
> be a chance that the counter became negative. So avoiding the negative
> reserve usage seems like papering over bad accounting.

It was indeed called too often it seems, once when deciding to drop the
skb
and again then actually freeing the skb.

> The assumption made seems to be that if there's reserve used, then it must
> be us using it, and it's unreserved. So it appears that either that
> assumption is wrong, and we can unreserve for others while we never
> reserved for ourselves, or it is correct, in which case it probably makes
> more sense to check for the IFF_MEMALLOC flag.

Changed it to only dec_not_zero on free for IFF_MEMALLOC devices.
I'm thinking of making kfree_skbmem -> skb_release_data return whether
they
released the actual data and also depend on that.

> All in all it seems like a per skb flag which tells us if this skb was the
> one reserving anything is missing.

struct sk_buff::memalloc

However the idea is that freeing non memalloc skbs also returns memory
(albeit
to the slab and not the free page list).

> Or rx_reserve_used must be updated for
> all in flight skbs whenever the IFF_MEMALLOC flag changes, so that we can
> be sure that the accounting works correctly.

Yes

> Oh wait, isn't that what the
> memalloc flag is for? So shouldn't it be sufficient to check only with
> sk_is_memalloc()?

See previous comment.

> That avoids lots of checks and should guarantee that the
> accounting is correct, except in the case when the IFF_MEMALLOC flag is
> cleared and the counter is set to zero manually. Can't that be avoided and
> just let it decrease to zero naturally?

That would put the atomic op on the free path unconditionally, I think
davem
gets nightmares from that.

Thanks

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/