Re: [PATCH v2 -tip] x86/percpu: Use C for arch_raw_cpu_ptr()

From: Linus Torvalds
Date: Thu Oct 12 2023 - 13:10:36 EST


On Thu, 12 Oct 2023 at 09:55, Uros Bizjak <ubizjak@xxxxxxxxx> wrote:
>
> An example:

Oh, I'm convinced.

The fix seems to be a simple one-liner, ie just

- asm(__pcpu_op2_##size(op, __percpu_arg(P[var]), "%[val]") \
+ asm(__pcpu_op2_##size(op, __percpu_arg(a[var]), "%[val]") \

and it turns out that we have other places where I think we could use that '%a',

For example, we have things like this:

asm ("lea sme_cmdline_arg(%%rip), %0"
: "=r" (cmdline_arg)
: "p" (sme_cmdline_arg));

and I think the only reason we do that ridiculous asm is that the code
in question really does want that (%rip) encoding. It sounds like this
could just do

asm ("lea %a1, %0"
: "=r" (cmdline_arg)
: "p" (sme_cmdline_arg));

instead. Once again, I claim ignorance of the operand modifiers as the
reason for these kinds of things.

But coming back to the stable op thing, I do wonder if there is some
way we could avoid the unnecessary reload.

I don't hate Nadav's patch, so that part is fine, but I'd like to
understand what it is that makes gcc think it needs to reload. We have
other cases (like the ALTERNATIVE() uses) where we *have* to use
inline asm, so it would be good to know...

Is it just that "p" (in the constraint, not "P" in the modifier) ends
up always being seen as a memory access, even when we only use the
address?

That part has never really been something we've been entirely clear
on. We *are* passing in just the address, so the hope in *that* place
is that it's only an address dependency, not a memory one.

(Of course, we use 'p' constraints in other places, and may
expect/assume that those have memory dependency. Again - this has
never been entirely clear)

Linus