Re: [RFC] Disable lockref on arm64

From: Will Deacon
Date: Fri Jun 14 2019 - 06:43:39 EST


Hi Ard,

On Fri, Jun 14, 2019 at 12:24:54PM +0200, Ard Biesheuvel wrote:
> On Fri, 14 Jun 2019 at 11:58, Will Deacon <will.deacon@xxxxxxx> wrote:
> > On Fri, Jun 14, 2019 at 07:09:26AM +0000, Jayachandran Chandrasekharan Nair wrote:
> > > x86 added an arch-specific fast refcount implementation - and the commit
> > > specifically notes that it is faster than cmpxchg-based code [1].
> > >
> > > There seems to be an ongoing effort to move over more and more subsystems
> > > from atomic_t to refcount_t (e.g. [2]), specifically because refcount_t on
> > > x86 is fast enough and you get some error checking that atomic_t does not
> > > have.
> >
> > Correct, but there are also some cases that are only caught by
> > REFCOUNT_FULL.
> >
> Yes, but do note that my arm64 implementation catches
> increment-from-zero as well.

Ok, so it's just the silly racy cases that are problematic?

> > > Do you think Ard's patch needs changes before it can be considered? I
> > > can take a look at that.
> >
> > I would like to see how it performs if we keep the checking inline, yes.
> > I suspect Ard could spin this in short order.
>
> Moving the post checks before the stores you mean? That shouldn't be
> too difficult, I suppose, but it will certainly cost performance.

That's what I'd like to assess, since the major complaint seems to be the
use of cmpxchg() as opposed to inline branching.
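
To make the two styles concrete, here is a minimal userspace sketch using C11
atomics. It is an illustration only: it is neither the kernel's refcount_t nor
Ard's patch, and the names sketch_ref, sketch_inc_cmpxchg, sketch_inc_postcheck
and SKETCH_SATURATED are hypothetical. The first function checks before it
stores by retrying a compare-and-swap; the second does an unconditional atomic
add and branches on the observed value afterwards, so a bad value may be
briefly visible to other CPUs before it is saturated.

```c
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a reference counter (not the kernel type). */
typedef struct {
    atomic_int refs;
} sketch_ref;

/* Assumed saturation value: negative, far away from both 0 and INT_MIN. */
#define SKETCH_SATURATED (INT_MIN / 2)

/*
 * Style 1: check first, then publish with a compare-and-swap loop.
 * A rejected increment never stores anything.
 */
static bool sketch_inc_cmpxchg(sketch_ref *r)
{
    int old = atomic_load_explicit(&r->refs, memory_order_relaxed);

    do {
        /* Refuse increment-from-zero, saturated and about-to-overflow values. */
        if (old <= 0 || old == INT_MAX)
            return false;
        /* On failure, 'old' is refreshed with the current value and we retry. */
    } while (!atomic_compare_exchange_weak_explicit(&r->refs, &old, old + 1,
                                                    memory_order_relaxed,
                                                    memory_order_relaxed));
    return true;
}

/*
 * Style 2: unconditional atomic add, then an inline branch on the value
 * that was observed. The add has already happened when the check fails,
 * so the counter is forced to the saturated value afterwards.
 */
static bool sketch_inc_postcheck(sketch_ref *r)
{
    int old = atomic_fetch_add_explicit(&r->refs, 1, memory_order_relaxed);

    if (old <= 0 || old == INT_MAX) {
        atomic_store_explicit(&r->refs, SKETCH_SATURATED,
                              memory_order_relaxed);
        return false;
    }
    return true;
}
```

The window between the add and the saturating store in the second function is
the kind of racy case that only the check-before-store variant closes.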

> > > > Whatever we do, I prefer to keep REFCOUNT_FULL the default option for arm64,
> > > > so if we can't keep the semantics when we remove the cmpxchg, you'll need to
> > > > opt into this at config time.
> > >
> > > Only arm64 and arm select REFCOUNT_FULL in the default config. So please
> > > reconsider this! This is going to slow down arm64 vs. other archs and it
> > > will become worse when more code adopts refcount_t.
> >
> > Maybe, but faced with the choice between your micro-benchmark results and
> > security-by-default for people using the arm64 Linux kernel, I really think
> > that's a no-brainer. I'm well aware that not everybody agrees with me on
> > that.
>
> I think the question whether the benchmark is valid is justified, but
> otoh, we are obsessed with hackbench which is not that representative
> of a real workload either. It would be better to discuss these changes
> in the context of known real-world use cases where refcounts are a
> true bottleneck.

I wasn't calling into question the validity of the benchmark (I really have
no clue about that), but rather that you can't have your cake and eat it.
Faced with the choice, I'd err on the security side because it's far easier
to explain to somebody that the default is full mitigation at a cost than it
is to explain why a partial mitigation is acceptable (and in the end it's
often subjective because people have different thresholds).

> Also, I'd like to have Kees's view on the gap between REFCOUNT_FULL
> and the fast version on arm64. I'm not convinced the cases we are not
> covering are such a big deal.

Fair enough, but if the conclusion is that it's not a big deal then we
should just remove REFCOUNT_FULL altogether, because it's the choice that
is the problem here.

Will