Re: [PATCH v2 -tip] x86/percpu: Use C for arch_raw_cpu_ptr()

From: Linus Torvalds
Date: Tue Oct 10 2023 - 14:43:13 EST


On Tue, 10 Oct 2023 at 11:25, Nadav Amit <namit@xxxxxxxxxx> wrote:
>
> As a minor note the proposed assembly version seems to be missing
> __FORCE_ORDER as an input argument to prevent reordering past preempt_enable
> and preempt_disable. But that’s really not the main point.

Hmm. No, it probably *is* the main point - see my reply to Uros: the
CSE on the inline asm itself is what gets rid of the duplication.

And yes, we currently rely on that asm CSE for doing 'current' and not
reloading the value all the time.

So yes, we'd like to have a barrier for not moving it across the
preemption barriers, and __FORCE_ORDER would seem to be a good way to
do that.
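
To make that concrete, here is a rough user-space sketch of the idea,
not the actual patch. All names in it (order_dummy, get_base, the
two callers) are made up for illustration, the asm body is empty so it
builds on any architecture with GCC or clang, and the dummy operand is
an ordinary static variable, which is an assumption of this sketch
rather than how the kernel spells __FORCE_ORDER. The asm is not
volatile, so the compiler may still CSE two identical instances, but
the "m" input makes it look like a memory read, so it should not be
merged or moved across a "memory" clobber such as the one in the
preemption barriers.

```c
/* Hypothetical dummy object, used only as an asm input operand. */
static unsigned int order_dummy;

/* Compiler-only barrier, the same shape as the kernel's barrier(). */
#define barrier() __asm__ __volatile__("" ::: "memory")

/*
 * Stand-in for "read the per-CPU base". The template is empty and the
 * output is tied to the input register, so the result is just 'seed'.
 * Not volatile: identical instances may be CSE'd by the compiler.
 * The "m" input is the assumed ordering hook: the asm appears to read
 * memory, so it depends on anything that clobbers memory.
 */
static inline unsigned long get_base(unsigned long seed)
{
	unsigned long val;

	__asm__("" : "=r" (val) : "0" (seed), "m" (order_dummy));
	return val;
}

/* No barrier in between: the two asm statements may be merged. */
unsigned long twice_no_barrier(unsigned long seed)
{
	return get_base(seed) + get_base(seed);
}

/* Barrier in between: the second asm has to stay after the barrier. */
unsigned long twice_with_barrier(unsigned long seed)
{
	unsigned long a, b;

	a = get_base(seed);
	barrier();
	b = get_base(seed);
	return a + b;
}
```

Both functions return the same value; the difference only shows up in
the generated code, so compare the output of "gcc -O2 -S" for the two
if you want to see whether the asm got duplicated.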

I really suspect that 'rdgsbase' is better than a memory load in
practice, but I have no numbers to back that up, apart from the
general argument that it's not a slow instruction, and core CPU is
generally better than memory.

Linus