Re: READ_ONCE() + STACKPROTECTOR_STRONG == :/ (was Re: [GIT PULL] Please pull powerpc/linux.git powerpc-5.5-2 tag (topic/kasan-bitops))
From: Will Deacon
Date: Mon Dec 16 2019 - 05:28:16 EST
On Fri, Dec 13, 2019 at 02:17:08PM +0100, Arnd Bergmann wrote:
> On Thu, Dec 12, 2019 at 9:50 PM Linus Torvalds
> <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
> > On Thu, Dec 12, 2019 at 11:34 AM Will Deacon <will@xxxxxxxxxx> wrote:
> > > The root of my concern in all of this, and what started me looking at it in
> > > the first place, is the interaction with 'typeof()'. Inheriting 'volatile'
> > > for a pointer means that local variables in macros declared using typeof()
> > > suddenly start generating *hideous* code, particularly when pointless stack
> > > spills get stackprotector all excited.
> >
> > Yeah, removing volatile can be a bit annoying.
> >
> > For the particular case of the bitops, though, it's not an issue.
> > Since you know the type there, you can just cast it.
> >
> > And if we had the rule that READ_ONCE() was an arithmetic type, you could do
> >
> > typeof(0+(*p)) __var;
> >
> > since you might as well get the integer promotion anyway (on the
> > non-volatile result).
> >
> > But that doesn't work with structures or unions, of course.
> >
> > I'm not entirely sure we have READ_ONCE() with a struct. I do know we
> > have it with 64-bit entities on 32-bit machines, but that's ok with
> > the "0+" trick.
>
> I'll have my randconfig builder look for instances, so far I found one,
> see below. My feeling is that it would be better to enforce at least
> the size being a 1/2/4/8, to avoid cases where someone thinks
> the access is atomic, but it falls back on a memcpy.
I've been using something similar built on compiletime_assert_atomic_type()
and I spotted another instance in the xdp code (xskq_validate_desc()) which
tries to READ_ONCE() on a 128-bit descriptor, although a /very/ quick read
of the code suggests that this probably can't be concurrently modified if
the ring indexes are synchronised properly.
However, enabling this for 32-bit ARM is total carnage; as Linus mentioned,
a whole bunch of code appears to be relying on atomic 64-bit access of
READ_ONCE(); the perf ring buffer, io_uring, the scheduler, pm_runtime,
cpuidle, ... :(
Unfortunately, at least some of these *do* look like bugs, but I can't see
how we can fix them, not least because the first two are user ABI afaict. It
may also be that in practice we get 2x32-bit stores, and that works out fine
when storing a 32-bit virtual address. I'm not sure what (if anything) the
compiler guarantees in these cases.
Will