Re: [PATCH 4/4] x86/cpu: Use SERIALIZE in sync_core() when available

From: hpa
Date: Mon Jul 27 2020 - 02:00:54 EST


On July 26, 2020 10:55:15 PM PDT, hpa@xxxxxxxxx wrote:
>On July 26, 2020 9:31:32 PM PDT, Ricardo Neri
><ricardo.neri-calderon@xxxxxxxxxxxxxxx> wrote:
>>The SERIALIZE instruction gives software a way to force the processor
>>to complete all modifications to flags, registers and memory from
>>previous instructions and drain all buffered writes to memory before
>>the next instruction is fetched and executed. Thus, it serves the
>>purpose of sync_core(). Use it when available.
>>
>>Use boot_cpu_has() and not static_cpu_has(); the most critical paths
>>(returning to user mode and exiting interrupt and NMI handlers) do not
>>reach sync_core().
>>
>>Cc: Andy Lutomirski <luto@xxxxxxxxxx>
>>Cc: Cathy Zhang <cathy.zhang@xxxxxxxxx>
>>Cc: Dave Hansen <dave.hansen@xxxxxxxxxxxxxxx>
>>Cc: Fenghua Yu <fenghua.yu@xxxxxxxxx>
>>Cc: "H. Peter Anvin" <hpa@xxxxxxxxx>
>>Cc: Kyung Min Park <kyung.min.park@xxxxxxxxx>
>>Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
>>Cc: "Ravi V. Shankar" <ravi.v.shankar@xxxxxxxxx>
>>Cc: Sean Christopherson <sean.j.christopherson@xxxxxxxxx>
>>Cc: linux-edac@xxxxxxxxxxxxxxx
>>Cc: linux-kernel@xxxxxxxxxxxxxxx
>>Reviewed-by: Tony Luck <tony.luck@xxxxxxxxx>
>>Suggested-by: Andy Lutomirski <luto@xxxxxxxxxx>
>>Signed-off-by: Ricardo Neri <ricardo.neri-calderon@xxxxxxxxxxxxxxx>
>>---
>> arch/x86/include/asm/special_insns.h | 5 +++++
>> arch/x86/include/asm/sync_core.h | 10 +++++++++-
>> 2 files changed, 14 insertions(+), 1 deletion(-)
>>
>>diff --git a/arch/x86/include/asm/special_insns.h
>>b/arch/x86/include/asm/special_insns.h
>>index 59a3e13204c3..0a2a60bba282 100644
>>--- a/arch/x86/include/asm/special_insns.h
>>+++ b/arch/x86/include/asm/special_insns.h
>>@@ -234,6 +234,11 @@ static inline void clwb(volatile void *__p)
>>
>> #define nop() asm volatile ("nop")
>>
>>+static inline void serialize(void)
>>+{
>>+ asm volatile(".byte 0xf, 0x1, 0xe8" ::: "memory");
>>+}
>>+
>> #endif /* __KERNEL__ */
>>
>> #endif /* _ASM_X86_SPECIAL_INSNS_H */
>>diff --git a/arch/x86/include/asm/sync_core.h
>>b/arch/x86/include/asm/sync_core.h
>>index fdb5b356e59b..bf132c09d61b 100644
>>--- a/arch/x86/include/asm/sync_core.h
>>+++ b/arch/x86/include/asm/sync_core.h
>>@@ -5,6 +5,7 @@
>> #include <linux/preempt.h>
>> #include <asm/processor.h>
>> #include <asm/cpufeature.h>
>>+#include <asm/special_insns.h>
>>
>> #ifdef CONFIG_X86_32
>> static inline void iret_to_self(void)
>>@@ -54,7 +55,8 @@ static inline void iret_to_self(void)
>> static inline void sync_core(void)
>> {
>> /*
>>- * There are quite a few ways to do this. IRET-to-self is nice
>>+ * Hardware can do this for us if SERIALIZE is available. Otherwise,
>>+ * there are quite a few ways to do this. IRET-to-self is nice
>> * because it works on every CPU, at any CPL (so it's compatible
>> * with paravirtualization), and it never exits to a hypervisor.
>> * The only down sides are that it's a bit slow (it seems to be
>>@@ -75,6 +77,12 @@ static inline void sync_core(void)
>> * Like all of Linux's memory ordering operations, this is a
>> * compiler barrier as well.
>> */
>>+
>>+ if (boot_cpu_has(X86_FEATURE_SERIALIZE)) {
>>+ serialize();
>>+ return;
>>+ }
>>+
>> iret_to_self();
>> }
>>
>
>Any reason to not make sync_core() an inline with alternatives?
>
>For a really overengineered solution, one which might perform
>unnecessarily poorly on existing hardware:
>
>asm volatile("1: .byte 0xf, 0x1, 0xe8; 2:"
> _ASM_EXTABLE(1b,2b));

(and : : : "memory" of course.)
--
Sent from my Android device with K-9 Mail. Please excuse my brevity.