Re: [PATCH 8/8] shmem,percpu_counter: add _limited_add(fbc, limit, amount)

From: Dennis Zhou
Date: Fri Oct 06 2023 - 18:59:39 EST


Hello,

On Thu, Oct 05, 2023 at 10:42:17PM -0700, Hugh Dickins wrote:
> On Thu, 5 Oct 2023, Chen, Tim C wrote:
>
> > >--- a/lib/percpu_counter.c
> > >+++ b/lib/percpu_counter.c
> > >@@ -278,6 +278,59 @@ int __percpu_counter_compare(struct
> > >percpu_counter *fbc, s64 rhs, s32 batch) }
> > >EXPORT_SYMBOL(__percpu_counter_compare);
> > >
> > >+/*
> > >+ * Compare counter, and add amount if the total is within limit.
> > >+ * Return true if amount was added, false if it would exceed limit.
> > >+ */
> > >+bool __percpu_counter_limited_add(struct percpu_counter *fbc,
> > >+ s64 limit, s64 amount, s32 batch) {
> > >+ s64 count;
> > >+ s64 unknown;
> > >+ unsigned long flags;
> > >+ bool good;
> > >+
> > >+ if (amount > limit)
> > >+ return false;
> > >+
> > >+ local_irq_save(flags);
> > >+ unknown = batch * num_online_cpus();
> > >+ count = __this_cpu_read(*fbc->counters);
> > >+
> > >+ /* Skip taking the lock when safe */
> > >+ if (abs(count + amount) <= batch &&
> > >+ fbc->count + unknown <= limit) {
> > >+ this_cpu_add(*fbc->counters, amount);
> > >+ local_irq_restore(flags);
> > >+ return true;
> > >+ }
> > >+
> > >+ raw_spin_lock(&fbc->lock);
> > >+ count = fbc->count + amount;
> > >+
> >
> > Perhaps we can fast path the case where for sure
> > we will exceed limit?
> >
> > if (fbc->count + amount - unknown > limit)
> > return false;
>
> Thanks, that sounds reasonable: I'll try to add something like that -
> but haven't thought about it carefully enough yet (too easy for me
> to overlook some negative case which messes everything up).
>
> Hugh
>

Sorry for the late chime in. I'm traveling right now.

I haven't been super happy lately with percpu_counter as it has had a
few corner cases such as the cpu_dying_mask fiasco which I thought we
fixed with a series from tglx [1]. If not I can resurrect it and pull
it.

I feel like percpu_counter is deviating from its original intended
usecase which, from my perspective, was a thin wrapper around a percpu
variable. At this point we seem to be bolting onto percpu_counter
instead of giving it a clear focus for what it's supposed to do well.
I think I understand the use case, and ultimately it's kind of the
duality where I think it was xfs is using percpu_counters where it must
be > 0 for the value to make sense and there was a race condition with
cpu dying [2].

At this point, I think it's probably better to wholy think about the
lower bound and upper bound problem of percpu_counter wrt the # of
online cpus.

Thanks,
Dennis

[1] https://lore.kernel.org/lkml/20230414162755.281993820@xxxxxxxxxxxxx/
[2] https://lore.kernel.org/lkml/20230406015629.1804722-1-yebin@xxxxxxxxxxxxxxx/