Re: [PATCH v10 03/38] x86/msr: Add the WRMSRNS instruction support

From: andrew . cooper3
Date: Thu Sep 14 2023 - 10:05:48 EST


On 14/09/2023 5:47 am, Xin Li wrote:
> Add an always inline API __wrmsrns() to embed the WRMSRNS instruction
> into the code.
>
> Tested-by: Shan Kang <shan.kang@xxxxxxxxx>
> Signed-off-by: Xin Li <xin3.li@xxxxxxxxx>
> ---
> arch/x86/include/asm/msr.h | 18 ++++++++++++++++++
> 1 file changed, 18 insertions(+)
>
> diff --git a/arch/x86/include/asm/msr.h b/arch/x86/include/asm/msr.h
> index 65ec1965cd28..c284ff9ebe67 100644
> --- a/arch/x86/include/asm/msr.h
> +++ b/arch/x86/include/asm/msr.h
> @@ -97,6 +97,19 @@ static __always_inline void __wrmsr(unsigned int msr, u32 low, u32 high)
> : : "c" (msr), "a"(low), "d" (high) : "memory");
> }
>
> +/*
> + * WRMSRNS behaves exactly like WRMSR with the only difference being
> + * that it is not a serializing instruction by default.
> + */
> +static __always_inline void __wrmsrns(u32 msr, u32 low, u32 high)
> +{
> + /* Instruction opcode for WRMSRNS; supported in binutils >= 2.40. */
> + asm volatile("1: .byte 0x0f,0x01,0xc6\n"
> + "2:\n"
> + _ASM_EXTABLE_TYPE(1b, 2b, EX_TYPE_WRMSR)
> + : : "c" (msr), "a"(low), "d" (high));
> +}
> +
> #define native_rdmsr(msr, val1, val2) \
> do { \
> u64 __val = __rdmsr((msr)); \
> @@ -297,6 +310,11 @@ do { \
>
> #endif /* !CONFIG_PARAVIRT_XXL */
>
> +static __always_inline void wrmsrns(u32 msr, u64 val)
> +{
> + __wrmsrns(msr, val, val >> 32);
> +}

This API works in terms of this series where every WRMSRNS is hidden
behind a FRED check, but it's an awkward interface to use anywhere else
in the kernel.

I fully understand that you expect all FRED capable systems to have
WRMSRNS, but it is not a hard requirement and you will end up with
simpler (and therefore better) logic by deleting the dependency.

As a "normal" user of the WRMSR APIs, the programmer only cares about:

1) wrmsr() -> needs to be serialising
2) wrmsr_ns() -> safe to be non-serialising


In Xen, I added something of the form:

/* Non-serialising WRMSR, when available.  Falls back to a serialising
WRMSR. */
static inline void wrmsr_ns(uint32_t msr, uint32_t lo, uint32_t hi)
{
    /*
     * WRMSR is 2 bytes.  WRMSRNS is 3 bytes.  Pad WRMSR with a redundant CS
     * prefix to avoid a trailing NOP.
     */
    alternative_input(".byte 0x2e; wrmsr",
                      ".byte 0x0f,0x01,0xc6", X86_FEATURE_WRMSRNS,
                      "c" (msr), "a" (lo), "d" (hi));
}

and despite what Juergen said, I'm going to recommend that you do wire
this through the paravirt infrastructure, for the benefit of regular
users having a nice API, not because XenPV is expecting to do something
wildly different here.


I'd actually go as far as suggesting that you break patches 1-3 into
different series and e.g. update the regular context switch path to use
the WRMSRNS-falling-back-to-WRMSR helpers, just to get started.

~Andrew