Re: [PATCH] mm/rmap: Convert from atomic_t to refcount_t on anon_vma->refcount

From: Kees Cook
Date: Fri Aug 20 2021 - 03:34:04 EST


On Fri, Aug 20, 2021 at 08:43:40AM +0200, Peter Zijlstra wrote:
> On Thu, Aug 19, 2021 at 12:09:37PM -0700, Linus Torvalds wrote:
> > On Thu, Aug 19, 2021 at 8:21 AM Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> > >
> > > If we can skip the OF... we can do something like this:
> >
> > Honestly, I think a lot of the refcount code is questionable. It was
> > absolutely written with no care for performance AT ALL.
>
> That's a bit unfair I feel. Will's last rewrite of the stuff was
> specifically to address performance issues.

Well, to address performance issues with the "full" version. The default
x86-specific code was already as fast as atomic_t. Will got it to nearly
match while making it catch all conditions, not just the exploitable
ones. (i.e. it didn't bother trying to catch underflow; there's no way
to mitigate it).

Will's version gave us three properties: correctness (it catches all the
pathological conditions), speed (it was very nearly the same speed as
regular atomic_t), and arch-agnosticism, which expanded this protection
to things beyond just x86 and arm64.

> > But see above: maybe just make this a separate "careful atomic_t",
> > with the option to panic-on-overflow. So then we could get rid of
> > refcount_warn_saturate() enmtirely above, and instead just have a
> > (compile-time option) BUG() case, with the non-careful version just
> > being our existing atomic_dec_and_test.

This is nearly what we had before. But refcount_t should always saturate
on overflow -- that's specifically the mitigation needed to defang the
traditional atomic_t overflow exploits (of which we had several a year
before refcount_t and now we've seen zero since).

> We used to have that option; the argument was made that everybody cares
> about security and as long as this doesn't show up on benchmarks we
> good.
>
> Also, I don't think most people want the overflow to go BUG, WARN is
> mostly the right thing and only the super paranoid use panic-on-warn or
> something.

Saturating on overflow stops exploitability. WARNing is informational.
BUG kills the system for no good reason: the saturation is the defense
against attack, and the WARN is the "oh, I found a bug" details needed
to fix it.

I prefer the arch-agnostic, fully checked, very fast version of this
(i.e. what we have right now). :P I appreciate it's larger, but in my
opinion size isn't as important as correctness and speed. If it's just
as fast as a small version but has greater coverage, that seems worth
the size.

--
Kees Cook