Re: [PATCH v2 -tip] x86/percpu: Use C for arch_raw_cpu_ptr()
From: Uros Bizjak
Date: Wed Oct 18 2023 - 15:33:36 EST
On Wed, Oct 18, 2023 at 8:26 PM Uros Bizjak <ubizjak@xxxxxxxxx> wrote:
>
> On Wed, Oct 18, 2023 at 8:16 PM Linus Torvalds
> <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
> >
> > On Wed, 18 Oct 2023 at 11:08, Uros Bizjak <ubizjak@xxxxxxxxx> wrote:
> > >
> > > But loads from non-const memory work like the above.
> >
> > Yes, I'm certainly ok with the move to use plain loads from __seg_gs
> > for the percpu accesses. If they didn't honor the memory clobber, we
> > could never use it at all.
> >
> > I was just saying that the 'const' alias trick isn't useful for
> > anything else than 'current', because everything else needs to at
> > least honor our existing barriers.
>
> FYI, smp_processor_id() is implemented as:
>
> #define __smp_processor_id() __this_cpu_read(pcpu_hot.cpu_number)
>
> where __this_* forces volatile access which disables CSE.
>
> *If* the variable is really stable, then it should use __raw_cpu_read.
> Both, __raw_* and __this_* were recently (tip/percpu branch)
> implemented for SEG_SUPPORT as:
This patch works for me:
--cut here--
diff --git a/arch/x86/include/asm/smp.h b/arch/x86/include/asm/smp.h
index 4fab2ed454f3..6eda4748bf64 100644
--- a/arch/x86/include/asm/smp.h
+++ b/arch/x86/include/asm/smp.h
@@ -141,8 +141,7 @@ __visible void smp_call_function_single_interrupt(struct pt_regs *r);
  * This function is needed by all SMP systems. It must _always_ be valid
  * from the initial startup.
  */
-#define raw_smp_processor_id() this_cpu_read(pcpu_hot.cpu_number)
-#define __smp_processor_id() __this_cpu_read(pcpu_hot.cpu_number)
+#define raw_smp_processor_id() raw_cpu_read(pcpu_hot.cpu_number)
 
 #ifdef CONFIG_X86_32
 extern int safe_smp_processor_id(void);
--cut here--
But it removes merely 10 reads out of 3219.
BTW: I also don't understand the comment from include/linux/smp.h:
/*
 * Allow the architecture to differentiate between a stable and unstable read.
 * For example, x86 uses an IRQ-safe asm-volatile read for the unstable but a
 * regular asm read for the stable.
 */
#ifndef __smp_processor_id
#define __smp_processor_id(x) raw_smp_processor_id(x)
#endif
All reads up to word size on x86 are atomic, and therefore IRQ safe.
asm-volatile has nothing to do with IRQ safety; it only prevents the
compiler from CSE-ing the asm and from scheduling (moving) it around
too much.
Uros.