Re: [GIT PULL] ucount fix for v5.14-rc

From: Linus Torvalds
Date: Sat Aug 07 2021 - 04:23:34 EST


On Fri, Aug 6, 2021 at 10:03 PM Hillf Danton <hdanton@xxxxxxxx> wrote:
>
> Then the current atomic_add_negative() in consideration over the "risk"
> of count overflow in real workloads can be replaced with the not_zero
> version.

What? No.

The atomic_add_negative() has absolutely nothing to do with not_zero.

The "negative" comes not at all from the count ever being zero, and as
I explained, that isn't even an issue here.

The "negative" is from a large _positive_ count growing so much that
the sign bit gets set. It's basically a "31-bit overflow" thing.

So:

- not_zero makes no sense for get_ucounts(), because it can't be
zero, because we hold a reference to it

- atomic_add_negative() is about not letting the counts become too
large, and when they do, we undo the reference (ie the pattern is
"increment ref - but if it then overflows into bit #31, decrement it
again")

and the two have *NOTHING* to do with each other. So your statement
about replacing one with the other makes no sense.
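
To make the "increment, then undo on overflow into bit #31" pattern
concrete, here is a minimal userspace sketch using C11 atomics. It is
not the kernel code: sketch_add_negative() and sketch_get_ref() are
hypothetical stand-ins for atomic_add_negative() and the get_ucounts()
pattern described above, and a plain atomic_int is assumed in place of
the kernel's atomic_t.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Hypothetical stand-in for the kernel's atomic_add_negative():
 * atomically add i to *v and report whether the result is negative.
 * C11 atomic_fetch_add() wraps silently on signed overflow.
 */
static bool sketch_add_negative(int i, atomic_int *v)
{
	int old = atomic_fetch_add(v, i);

	/* Recompute the new value in unsigned arithmetic to avoid UB. */
	return (int)((unsigned int)old + (unsigned int)i) < 0;
}

/*
 * Take a reference on a count the caller already holds a reference to.
 * If the increment sets the sign bit (bit #31), undo it and fail.
 */
static bool sketch_get_ref(atomic_int *count)
{
	if (sketch_add_negative(1, count)) {
		atomic_fetch_sub(count, 1);	/* back out the increment */
		return false;
	}
	return true;
}

/* Small self-check: normal get succeeds, get at INT_MAX is refused. */
static void sketch_demo(void)
{
	atomic_int count;

	atomic_init(&count, 1);
	assert(sketch_get_ref(&count));
	assert(atomic_load(&count) == 2);

	atomic_store(&count, INT_MAX);
	assert(!sketch_get_ref(&count));
	assert(atomic_load(&count) == INT_MAX);
}
```

Note that nothing in this path ever tests for zero: the caller holds a
reference, so the only failure case is the count being too large.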

I was trying to explain that in _other_ situations, the
"atomic_inc_not_zero()" kind of pattern is used as a way to allow the
find-vs-last-drop race to be done without locking, but that's not what
the ucounts code does.
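
For contrast, the "atomic_inc_not_zero()" kind of pattern looks roughly
like the sketch below. Again this is a userspace illustration with C11
atomics, not kernel code: sketch_inc_not_zero() and sketch_demo() are
hypothetical names, and the compare-exchange loop is just one plausible
way to express "increment unless the count has already hit zero".

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Hypothetical stand-in for the atomic_inc_not_zero() pattern:
 * take a reference only if the count has not already dropped to zero.
 * A lockless "find" can use this to lose cleanly against the last drop.
 */
static bool sketch_inc_not_zero(atomic_int *v)
{
	int old = atomic_load(v);

	do {
		if (old == 0)
			return false;	/* last reference already dropped */
		/* On failure, old is reloaded with the current value. */
	} while (!atomic_compare_exchange_weak(v, &old,
			(int)((unsigned int)old + 1u)));

	return true;
}

/* Small self-check: a dead object stays dead, a live one gains a ref. */
static void sketch_demo(void)
{
	atomic_int refs;

	atomic_init(&refs, 0);
	assert(!sketch_inc_not_zero(&refs));
	assert(atomic_load(&refs) == 0);

	atomic_store(&refs, 1);
	assert(sketch_inc_not_zero(&refs));
	assert(atomic_load(&refs) == 2);
}
```

The test here is at the zero end of the range, which is the opposite
end from what the "negative" check cares about.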

ucounts uses the ucounts_lock, and that one is entirely immaterial for
the atomic_add_negative() case, because the "negative" test is
literally about the value being as far away from zero as is _possible_
(and at that point, the lock is most definitely not needed - it's
needed only for the cases where the refcount goes to zero, and to make
sure that a "find" cannot race with that).

Linus