RE: [PATCH v6 2/5] x86/asm: Add enqcmds() to support ENQCMDS instruction

From: David Laight
Date: Thu Sep 24 2020 - 18:09:27 EST


From: Dave Jiang
> Sent: 24 September 2020 19:01
>
> Currently, the MOVDIR64B instruction is used to atomically
> submit 64-byte work descriptors to devices. Although it can
> encounter errors like device queue full, command not accepted,
> device not ready, etc when writing to a device MMIO, MOVDIR64B
> can not report back on errors from the device itself. This
> means that MOVDIR64B users need to separately interact with a
> device to see if a descriptor was successfully queued, which
> slows down device interactions.
>
> ENQCMD and ENQCMDS also atomically submit 64-byte work
> descriptors to devices. But, they *can* report back errors
> directly from the device, such as if the device was busy,
> or device not enabled or does not support the command. This
> immediate feedback from the submission instruction itself
> reduces the number of interactions with the device and can
> greatly increase efficiency.
>
> ENQCMD can be used at any privilege level, but can effectively
> only submit work on behalf of the current process. ENQCMDS is a
> ring0-only instruction and can explicitly specify a process
> context instead of being tied to the current process or needing
> to reprogram the IA32_PASID MSR.
>
> Use ENQCMDS for work submission within the kernel because a
> Process Address Space ID (PASID) is set up to translate the kernel
> virtual address space. This PASID is provided to ENQCMDS from
> the descriptor structure submitted to the device and not retrieved
> from IA32_PASID MSR, which is set up for the current user address space.
>
> See Intel Software Developer’s Manual for more information on the
> instructions.
>
> Signed-off-by: Dave Jiang <dave.jiang@xxxxxxxxx>
> Reviewed-by: Tony Luck <tony.luck@xxxxxxxxx>
> ---
> arch/x86/include/asm/special_insns.h | 34 ++++++++++++++++++++++++++++
> 1 file changed, 34 insertions(+)
>
> diff --git a/arch/x86/include/asm/special_insns.h b/arch/x86/include/asm/special_insns.h
> index 2258c7d6e281..b4d2ce300c94 100644
> --- a/arch/x86/include/asm/special_insns.h
> +++ b/arch/x86/include/asm/special_insns.h
> @@ -256,6 +256,40 @@ static inline void movdir64b(void *dst, const void *src)
> : "m" (*__src), "a" (__dst), "d" (__src));
> }
>
> +/**
> + * enqcmds - copy a 512 bits data unit to single MMIO location
> + * @dst: destination, in MMIO space (must be 512-bit aligned)
> + * @src: source
> + *
> + * The ENQCMDS instruction allows software to write a 512 bits command to
> + * a 512 bits aligned special MMIO region that supports the instruction.
> + * A return status is loaded into the ZF flag in the RFLAGS register.
> + * ZF = 0 equates to success, and ZF = 1 indicates retry or error.
> + *
> + * The enqcmds() function uses the ENQCMDS instruction to submit data from
> + * kernel space to MMIO space, in a unit of 512 bits. Order of data access
> + * is not guaranteed, nor is a memory barrier performed afterwards. The
> + * function returns 0 on success and -EAGAIN on failure.
> + *
> + * Warning: Do not use this helper unless your driver has checked that the CPU
> + * instruction is supported on the platform and the device accepts ENQCMDS.
> + */
> +static inline int enqcmds(void __iomem *dst, const void *src)
> +{
> + int zf;
> +
> + /* ENQCMDS [rdx], rax */
> + asm volatile(".byte 0xf3, 0x0f, 0x38, 0xf8, 0x02, 0x66, 0x90"
> + CC_SET(z)
> + : CC_OUT(z) (zf)
> + : "a" (dst), "d" (src));
> + /* Submission failure is indicated via EFLAGS.ZF=1 */
> + if (zf)
> + return -EAGAIN;
> +
> + return 0;
> +}
> +


Doesn't this need an "m" input constraint for the source buffer?
Otherwise, if it is a local on-stack buffer, the compiler is free
to optimise away the instructions that write to it.
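
For illustration, a minimal user-space sketch of the constraint pattern,
modelled on the "m" (*__src) operand in the movdir64b() context quoted
above. It is not the enqcmds() helper itself: a plain 8-byte load stands
in for ENQCMDS so the sketch can run without the instruction, and
struct desc64, read_first_qword() and demo() are made-up names.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical 64-byte descriptor type, for illustration only. */
struct desc64 {
	uint8_t bytes[64];
};

/*
 * Stand-in for an ENQCMDS-style helper: the asm reads memory through a
 * pointer held in a register.  The "m" (*s) operand is not referenced in
 * the template; it only tells the compiler that the asm reads all 64
 * bytes, so earlier stores to that buffer must be kept.
 */
static inline uint64_t read_first_qword(const void *src)
{
	const struct desc64 *s = src;
	uint64_t val;

#if defined(__x86_64__)
	__asm__ __volatile__("movq (%1), %0"
			     : "=r" (val)
			     : "r" (s), "m" (*s));
#else
	/* Fallback so the sketch still builds on other architectures. */
	memcpy(&val, s, sizeof(val));
#endif
	return val;
}

/* The on-stack case: the stores to d must not be optimised away. */
static uint64_t demo(void)
{
	struct desc64 d;

	assert(sizeof(d) == 64);
	memset(&d, 0, sizeof(d));
	d.bytes[0] = 0x2a;
	return read_first_qword(&d);
}
```

Without the "m" operand the compiler only sees an address passed in a
register, and may treat the memset() and the store to d.bytes[0] as dead.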

The missing output memory constraint is less of a problem,
since the driver needs to be using barriers of its own anyway.

David
