Re: [RFC PATCH 0/8] Rework READ_ONCE() to improve codegen

From: Christian Borntraeger
Date: Mon Jan 13 2020 - 07:40:28 EST

On 10.01.20 17:56, Will Deacon wrote:
> Hi all,
>
> This is a follow-up RFC to the discussions we had on the mailing list at
> the end of last year:
>
> https://lore.kernel.org/lkml/875zimp0ay.fsf@xxxxxxxxxxxxxxxxxx
>
> Unfortunately, we didn't get a "silver bullet" solution out of that
> long thread, but I've tried to piece together some of the bits and
> pieces we discussed and I've ended up with this series, which does at
> least solve the pressing problem with the bitops for arm64.
>
> The rough summary of the series is:
>
> * Drop the GCC 4.8 workarounds, so that READ_ONCE() is a
> straightforward dereference of a cast-to-volatile pointer.
>
> * Require that the access is either 1, 2, 4 or 8 bytes in size
> (even 32-bit architectures tend to use 8-byte accesses here).
>
> * Introduce __READ_ONCE() for tearing operations with no size
> restriction.
>
> * Drop pointer qualifiers from scalar types, so that volatile scalars
> don't generate horrible stack-spilling mess. This is pretty ugly,
> but it's also mechanical and wrapped up in a macro.
>
> * Convert acquire/release accessors to perform the same qualifier
> stripping.
>
> I gave up trying to prevent READ_ONCE() on aggregates because it is
> pervasive, particularly within the mm/ layer on things like pmd_t.
> Thankfully, these don't tend to be volatile.
>
> I have more patches in this area because I'm trying to move all the
> read_barrier_depends() magic into arch/alpha/, but I'm holding off until
> we agree on this part first.
>
> Cheers,
>
> Will
>
> Cc: Michael Ellerman <mpe@xxxxxxxxxxxxxx>
> Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
> Cc: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
> Cc: Segher Boessenkool <segher@xxxxxxxxxxxxxxxxxxx>
> Cc: Christian Borntraeger <borntraeger@xxxxxxxxxx>
> Cc: Luc Van Oostenryck <luc.vanoostenryck@xxxxxxxxx>
> Cc: Arnd Bergmann <arnd@xxxxxxxx>

Looks sane on s390. I also checked that the problematic sequence in
arch/s390/kvm/gaccess.c is not miscompiled (the binary code for the
ipte_lock function is almost the same; only the addresses differ,
because the function now starts at a different address).

The kernel seems to get slightly larger though, mostly due to
different inlining decisions, it seems.
Total: Before=14133361, After=14135643, chg +0.02%
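
For anyone following along, the first three bullets of the quoted summary
can be sketched roughly as below. This is only a userspace illustration,
assuming GCC or clang (statement expressions, __typeof__, C11
_Static_assert); it is not the code from the series. The SKETCH_* macro
names, struct wide and the two helper functions are made up for this
example.

```c
#include <stdint.h>

/*
 * Hypothetical sketch, not the kernel's definition.
 * Unchecked variant: a plain dereference of a cast-to-volatile pointer,
 * with no size restriction, so the access may tear.
 */
#define SKETCH_READ_ONCE_ANYSIZE(x) \
	(*(const volatile __typeof__(x) *)&(x))

/*
 * Checked variant: the same access, but the build fails unless the
 * object is 1, 2, 4 or 8 bytes in size.
 */
#define SKETCH_READ_ONCE(x)                                           \
({                                                                    \
	_Static_assert(sizeof(x) == 1 || sizeof(x) == 2 ||            \
		       sizeof(x) == 4 || sizeof(x) == 8,              \
		       "sketch: object must be 1, 2, 4 or 8 bytes");  \
	SKETCH_READ_ONCE_ANYSIZE(x);                                  \
})

/* 16 bytes: rejected by the checked macro, accepted by the unchecked one. */
struct wide {
	uint64_t lo;
	uint64_t hi;
};

static uint32_t flag = 7;
static struct wide pair = { 1, 2 };

/* Scalar read through the size-checked macro. */
uint32_t sketch_read_flag(void)
{
	return SKETCH_READ_ONCE(flag);
}

/* Aggregate read through the unchecked macro. */
struct wide sketch_read_pair(void)
{
	return SKETCH_READ_ONCE_ANYSIZE(pair);
}
```

Using SKETCH_READ_ONCE(pair) instead would trip the static assertion at
build time, which is the kind of size restriction the second bullet
describes. The qualifier stripping from the fourth bullet is not shown
here.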