Re: [PATCH 6/7] x86/traps: Fix up invalid PASID

From: Thomas Gleixner
Date: Sun Apr 26 2020 - 11:25:22 EST


Fenghua Yu <fenghua.yu@xxxxxxxxx> writes:
> A #GP fault is generated when ENQCMD instruction is executed without
> a valid PASID value programmed in.

Programmed in what?

> The #GP fault handler will initialize the current thread's PASID MSR.
>
> The following heuristic is used to avoid decoding the user instructions
> to determine the precise reason for the #GP fault:
> 1) If the mm for the process has not been allocated a PASID, this #GP
> cannot be fixed.
> 2) If the PASID MSR is already initialized, then the #GP was for some
> other reason
> 3) Try initializing the PASID MSR and returning. If the #GP was from
> an ENQCMD this will fix it. If not, the #GP fault will be repeated
> and we will hit case "2".
>
> Suggested-by: Thomas Gleixner <tglx@xxxxxxxxxxxxx>

Just for the record I also suggested to have a proper errorcode in the
#GP for ENQCMD and I surely did not suggest to avoid decoding the user
instructions.

> void __free_pasid(struct mm_struct *mm);
> +bool __fixup_pasid_exception(void);
>
> #endif /* _ASM_X86_IOMMU_H */
> diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
> index 6ef00eb6fbb9..369b5ba94635 100644
> --- a/arch/x86/kernel/traps.c
> +++ b/arch/x86/kernel/traps.c
> @@ -56,6 +56,7 @@
> #include <asm/umip.h>
> #include <asm/insn.h>
> #include <asm/insn-eval.h>
> +#include <asm/iommu.h>
>
> #ifdef CONFIG_X86_64
> #include <asm/x86_init.h>
> @@ -488,6 +489,16 @@ static enum kernel_gp_hint get_kernel_gp_address(struct pt_regs *regs,
> return GP_CANONICAL;
> }
>
> +static bool fixup_pasid_exception(void)
> +{
> + if (!IS_ENABLED(CONFIG_INTEL_IOMMU_SVM))
> + return false;
> + if (!static_cpu_has(X86_FEATURE_ENQCMD))
> + return false;
> +
> + return __fixup_pasid_exception();
> +}
> +
> #define GPFSTR "general protection fault"
>
> dotraplinkage void do_general_protection(struct pt_regs *regs, long error_code)
> @@ -499,6 +510,12 @@ dotraplinkage void do_general_protection(struct pt_regs *regs, long error_code)
> int ret;
>
> RCU_LOCKDEP_WARN(!rcu_is_watching(), "entry code didn't wake RCU");
> +
> + if (user_mode(regs) && fixup_pasid_exception()) {
> + cond_local_irq_enable(regs);

The point of this conditional irq enable _AFTER_ calling into the fixup
function is? Also what's the reason for keeping interrupts disabled
while calling into that function? Comments exist for a reason.

> + return;
> + }
> +
> cond_local_irq_enable(regs);
>
> if (static_cpu_has(X86_FEATURE_UMIP)) {
> diff --git a/drivers/iommu/intel-svm.c b/drivers/iommu/intel-svm.c
> index da718a49e91e..5ed39a022adb 100644
> --- a/drivers/iommu/intel-svm.c
> +++ b/drivers/iommu/intel-svm.c
> @@ -759,3 +759,40 @@ void __free_pasid(struct mm_struct *mm)
> */
> ioasid_free(pasid);
> }
> +
> +/*
> + * Fix up the PASID MSR if possible.
> + *
> + * But if the #GP was due to another reason, a second #GP might be triggered
> + * to force proper behavior.
> + */
> +bool __fixup_pasid_exception(void)
> +{
> + struct mm_struct *mm;
> + bool ret = true;
> + u64 pasid_msr;
> + int pasid;
> +
> + mm = get_task_mm(current);

Why do you need a reference to current->mm ?

> + /* This #GP was triggered from user mode. So mm cannot be NULL. */
> + pasid = mm->context.pasid;
> + /* Ensure this process has been bound to a PASID. */
> + if (!pasid) {
> + ret = false;
> + goto out;
> + }
> +
> + /* Check to see if the PASID MSR has already been set for this task. */
> + rdmsrl(MSR_IA32_PASID, pasid_msr);
> + if (pasid_msr & MSR_IA32_PASID_VALID) {
> + ret = false;
> + goto out;
> + }
> +
> + /* Fix up the MSR. */
> + wrmsrl(MSR_IA32_PASID, pasid | MSR_IA32_PASID_VALID);
> +out:
> + mmput(mm);

Thanks,

tglx