Re: [PATCH 04/21] Generic percpu refcounting

From: Tejun Heo
Date: Tue May 28 2013 - 21:11:48 EST


Yo,

On Tue, May 28, 2013 at 04:47:28PM -0700, Kent Overstreet wrote:
> > It'd be great if that is explained clearly in more intuitive way. The
> > only actual explanation above is "modular arithmatic is commutative"
> > which is a very compact way to put it and I really think it deserves
> > an easier explanation.
>
> I'm not sure I know of any good way of explaining it intuitively, but
> here's this at least...
>
> * (More precisely: because moduler arithmatic is commutative the sum of all the
> * pcpu_count vars will be equal to what it would have been if all the gets and
> * puts were done to a single integer, even if some of the percpu integers
> * overflow or underflow).

Yeah, that's much better.

> And we can't do more puts than there have been gets - because the sum
> can't be negative. So the most puts() we can do at any given time is the
> real count, or sum of the percpu ref and atomic_t.
>
> Therefore, the amount the atomic_t can go negative is bounded by the
> maximum value of the refcount.

Ah, okay, I thought you were collecting the percpu counters directly
into the global counter. You're staging it into a temp counter and
then adding it into the global counter after the summing is complete.
Yeap, that should be fine then. It'd be worthwhile to document the
importance of not adding it directly to the global counter.

> > I probably should have made it clearer. Sorry about that. tryget()
> > is fine. I was curious about count() as it's always a bit dangerous a
> > query interface which is racy and can return something unexpected like
> > false zero or underflowed refcnt.
>
> Yeah, it is, it was intended just for the module code where it's only
> used for the value lsmod shows.

Let's document so then and limit the range returned. We require the
refcnt to be alive and it'd be a good way to both protect from and
deter creative usages.

> > Let's just have percpu_ref_kill(ref, release) which puts the base ref
> > and invokes release whenever it's done.
>
> Release has to be stored in struct percpu_ref() so it can be invoked
> after a call_rcu() (percpu_ref_kill -> call_rcu() ->
> percpu_ref_kill_rcu() -> percpu_ref_put()) so I'm passing it to
> percpu_ref_init(), but yeah.

Yeah, I'm a bit torn about where to put the release function. For me,
as we have an API which is dedicated to killing a refcnt, it does make
sense to put it there but it's really in the realm of bikeshedding so
choose whatever you wanna choose.

Thanks!

--
tejun
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/