Re: [PATCH 4/4] x86/percpu: Use C for percpu read/write accessors

From: Uros Bizjak
Date: Tue Oct 10 2023 - 02:37:44 EST


On Sun, Oct 8, 2023 at 8:00 PM Linus Torvalds
<torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>
> On Wed, 4 Oct 2023 at 07:51, Uros Bizjak <ubizjak@xxxxxxxxx> wrote:
> >
> > The percpu code mostly uses inline assembly. Using segment qualifiers
> > allows C code to be used instead, which enables the compiler to perform
> > various optimizations (e.g. propagation of memory arguments). Convert
> > percpu read and write accessors to C code, so the memory argument can
> > be propagated to the instruction that uses this argument.
>
> So apparently this causes boot failures.
>
> It might be worth testing a version where this:
>
> > +#define raw_cpu_read_1(pcp) __raw_cpu_read(, pcp)
> > +#define raw_cpu_read_2(pcp) __raw_cpu_read(, pcp)
> > +#define raw_cpu_read_4(pcp) __raw_cpu_read(, pcp)
> > +#define raw_cpu_write_1(pcp, val) __raw_cpu_write(, pcp, val)
> > +#define raw_cpu_write_2(pcp, val) __raw_cpu_write(, pcp, val)
> > +#define raw_cpu_write_4(pcp, val) __raw_cpu_write(, pcp, val)
>
> and this
>
> > +#ifdef CONFIG_X86_64
> > +#define raw_cpu_read_8(pcp) __raw_cpu_read(, pcp)
> > +#define raw_cpu_write_8(pcp, val) __raw_cpu_write(, pcp, val)
>
> was all using 'volatile' in the qualifier argument and see if that
> makes the boot failure go away.
>
> Because while the old code wasn't "asm volatile", even just a *plain*
> asm() is certainly a lot more serialized than a normal access.
>
> For example, the asm() version of raw_cpu_write() used "+m" for the
> destination modifier, which means that if you did multiple percpu
> writes to the same variable, gcc would output multiple asm calls,
> because it would see the subsequent ones as reading the old value
> (even if they don't *actually* do so).
>
> That's admittedly really just because it uses a common macro for
> raw_cpu_write() and the updates (like the percpu_add() code), so the
> fact that it uses "+m" instead of "=m" is just a random odd artifact
> of the inline asm version, but maybe we have code that ends up working
> just by accident.

FYI: While the emitted asm code is correct, the program flow depends
on an uninitialized value. The compiler is free to remove the whole insn
stream in this case. Admittedly, we have asm here, so the compiler is
a bit more forgiving, but it is a slippery slope nevertheless.
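
To make the "+m" vs. "=m" point concrete, here is a minimal userspace
sketch. It is not the kernel's percpu macros: store_rw(), store_wo() and
the demo functions are made-up names, and it assumes GCC or Clang (on
non-x86 targets it falls back to a plain C store).

```c
/* Hypothetical helpers for illustration only; not the kernel macros. */

static inline void store_rw(int *p, int val)
{
#if defined(__x86_64__) || defined(__i386__)
	/*
	 * "+m": *p is both an input and an output of the asm.  The
	 * compiler has to assume the asm reads the previous value, so
	 * it needs that value to be in memory before the asm runs.
	 */
	__asm__("movl %1, %0" : "+m" (*p) : "ri" (val));
#else
	*p = val;
#endif
}

static inline void store_wo(int *p, int val)
{
#if defined(__x86_64__) || defined(__i386__)
	/*
	 * "=m": *p is an output only.  The old value is not an input,
	 * so an earlier store whose result nobody reads may be treated
	 * as dead by the compiler.
	 */
	__asm__("movl %1, %0" : "=m" (*p) : "ri" (val));
#else
	*p = val;
#endif
}

/* Two back-to-back stores through the read-write variant. */
int demo_rw(void)
{
	int x = 0;

	store_rw(&x, 1);
	store_rw(&x, 2);	/* nominally reads the 1 stored above */
	return x;
}

/* Two back-to-back stores through the write-only variant. */
int demo_wo(void)
{
	int x = 0;

	store_wo(&x, 1);	/* result never read: removable */
	store_wo(&x, 2);
	return x;
}

/*
 * The case from this mail: "+m" on a variable that has never been
 * written.  The asm only stores, but to the compiler it is an input
 * too, so the return value formally depends on an uninitialized value.
 */
int demo_uninit(void)
{
	int x;

	store_rw(&x, 3);
	return x;
}
```

Both demo_rw() and demo_wo() return 2; the difference is only visible
in the generated code (e.g. with "gcc -O2 -S"), where the write-only
variant gives the compiler the freedom to drop the first store.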

Uros.