Re: [PATCH v2 1/5] KVM: rseq: Update rseq when processing NOTIFY_RESUME on xfer to KVM guest

From: Mathieu Desnoyers
Date: Mon Aug 23 2021 - 11:01:31 EST


----- On Aug 20, 2021, at 6:49 PM, Sean Christopherson seanjc@xxxxxxxxxx wrote:

> Invoke rseq's NOTIFY_RESUME handler when processing the flag prior to
> transferring to a KVM guest, which is roughly equivalent to an exit to
> userspace and processes many of the same pending actions. While the task
> cannot be in an rseq critical section as the KVM path is reachable only
> via ioctl(KVM_RUN), the side effects that apply to rseq outside of a
> critical section still apply, e.g. the current CPU needs to be updated if
> the task is migrated.
>
> Clearing TIF_NOTIFY_RESUME without informing rseq can lead to segfaults
> and other badness in userspace VMMs that use rseq in combination with KVM,
> e.g. due to the CPU ID being stale after task migration.

Acked-by: Mathieu Desnoyers <mathieu.desnoyers@xxxxxxxxxxxx>

>
> Fixes: 72c3c0fe54a3 ("x86/kvm: Use generic xfer to guest work function")
> Reported-by: Peter Foley <pefoley@xxxxxxxxxx>
> Bisected-by: Doug Evans <dje@xxxxxxxxxx>
> Cc: Shakeel Butt <shakeelb@xxxxxxxxxx>
> Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> Cc: stable@xxxxxxxxxxxxxxx
> Signed-off-by: Sean Christopherson <seanjc@xxxxxxxxxx>
> ---
> kernel/entry/kvm.c | 4 +++-
> kernel/rseq.c | 14 +++++++++++---
> 2 files changed, 14 insertions(+), 4 deletions(-)
>
> diff --git a/kernel/entry/kvm.c b/kernel/entry/kvm.c
> index 49972ee99aff..049fd06b4c3d 100644
> --- a/kernel/entry/kvm.c
> +++ b/kernel/entry/kvm.c
> @@ -19,8 +19,10 @@ static int xfer_to_guest_mode_work(struct kvm_vcpu *vcpu, unsigned long ti_work)
>  		if (ti_work & _TIF_NEED_RESCHED)
>  			schedule();
>
> -		if (ti_work & _TIF_NOTIFY_RESUME)
> +		if (ti_work & _TIF_NOTIFY_RESUME) {
>  			tracehook_notify_resume(NULL);
> +			rseq_handle_notify_resume(NULL, NULL);
> +		}
>
>  		ret = arch_xfer_to_guest_mode_handle_work(vcpu, ti_work);
>  		if (ret)
> diff --git a/kernel/rseq.c b/kernel/rseq.c
> index 35f7bd0fced0..6d45ac3dae7f 100644
> --- a/kernel/rseq.c
> +++ b/kernel/rseq.c
> @@ -282,9 +282,17 @@ void __rseq_handle_notify_resume(struct ksignal *ksig, struct pt_regs *regs)
>
>  	if (unlikely(t->flags & PF_EXITING))
>  		return;
> -	ret = rseq_ip_fixup(regs);
> -	if (unlikely(ret < 0))
> -		goto error;
> +
> +	/*
> +	 * regs is NULL if and only if the caller is in a syscall path. Skip
> +	 * fixup and leave rseq_cs as is so that rseq_syscall() will detect and
> +	 * kill a misbehaving userspace on debug kernels.
> +	 */
> +	if (regs) {
> +		ret = rseq_ip_fixup(regs);
> +		if (unlikely(ret < 0))
> +			goto error;
> +	}
>  	if (unlikely(rseq_update_cpu_id(t)))
>  		goto error;
>  	return;
> --
> 2.33.0.rc2.250.ged5fa647cd-goog
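
For readers following along, here is a rough userspace model of the behaviour
the patch changes. It is only a sketch, not kernel code: every model_* name is
a hypothetical stand-in invented for illustration, and the "CPU ID" is a plain
integer rather than the real rseq area. It assumes that migration sets the
notify-resume flag, and that the transfer-to-guest path clears the flag with or
without calling into rseq.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct pt_regs; not a kernel type. */
struct model_regs {
	uintptr_t ip;
};

/* Hypothetical stand-in for the per-task state relevant to rseq. */
struct model_task {
	uint32_t user_cpu_id;	/* value userspace would read from its rseq area */
	uint32_t running_cpu;	/* CPU the task is actually running on */
	bool notify_resume;	/* models TIF_NOTIFY_RESUME */
	int ip_fixups;		/* number of times the IP fixup step ran */
};

/* Models a migration: the task moves and is flagged for notify-resume work. */
static void model_migrate(struct model_task *t, uint32_t cpu)
{
	t->running_cpu = cpu;
	t->notify_resume = true;
}

/*
 * Models the patched handler: the IP fixup is skipped when regs is NULL,
 * but the CPU ID visible to userspace is refreshed either way.
 */
static void model_rseq_notify_resume(struct model_task *t,
				     const struct model_regs *regs)
{
	if (regs)
		t->ip_fixups++;
	t->user_cpu_id = t->running_cpu;
}

/*
 * Models the transfer-to-guest work step. The flag is always consumed;
 * call_rseq selects between the pre-patch (false) and post-patch (true)
 * behaviour.
 */
static void model_xfer_to_guest_work(struct model_task *t, bool call_rseq)
{
	if (!t->notify_resume)
		return;
	t->notify_resume = false;
	if (call_rseq)
		model_rseq_notify_resume(t, NULL);
}
```

In this model, migrating and then running model_xfer_to_guest_work() with
call_rseq == false leaves user_cpu_id stale while the flag is already cleared,
which is the failure mode described in the changelog. With call_rseq == true
the CPU ID is refreshed and no IP fixup is attempted.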

--
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com