Re: [PATCH v9 12/22] KVM: VMX: Virtualize FRED event_data

From: Sean Christopherson

Date: Wed Mar 04 2026 - 11:47:41 EST


On Thu, Jan 29, 2026, Xin Li wrote:
> > On Jan 29, 2026, at 9:21 AM, H. Peter Anvin <hpa@xxxxxxxxx> wrote:
> >
> >> Just to confirm, you are referring to requeueing an original event
> >> via vmx_complete_interrupts(), right?
> >>
> >> Regardless of whether FRED or IDT is in use, the event payload is delivered
> >> into the appropriate guest state and then invalidated in
> >> kvm_deliver_exception_payload():
> >>
> >> 1) CR2 for #PF
> >>
> >> 2) DR6 for #DB
> >>
> >> 3) guest_fpu.xfd_err for #NM (in handle_nm_fault_irqoff())
> >>
> >> We should be able to recover the FRED event data from there.
> >>
> >> Alternatively, we could drop the original event and allow the hardware to
> >> regenerate it upon resuming the guest. However, this breaks #DB delivery,
> >> as debug exceptions sometimes are triggered post-instruction.
> >>
> >> Sean, does it make sense to recover the FRED event data from guest CPU state?

No? As Peter points out, the payload is tied to the exception and shouldn't
change.

> > I think some bits in DR6 are "sticky", and so unless the guest has
> > explicitly cleared DR6 the event data isn't necessarily derivable from DR6.
> > However, the FRED event data for #DB is directly based on the data already
> > reported by VTx (for exactly the same reason – knowing what the *currently
> > taken* trap represents.)
>
> Yeah, it's important to keep in mind that DR6 bits are 'sticky'.
>
> Regarding vmx_complete_interrupts(), when a VM migration occurs immediately
> following a VM exit with a valid original event saved in the VMCS, we can
> safely assume the guest DR6 state remains consistent with the original event
> data because there is no chance for guest OS to modify DR6.

There's a different problem though. If there's a re-injected exception at the
time of save/restore, the destination vCPU won't see a valid payload and thus
won't set the appropriate FRED VMCS fields.

We _could_ extend KVM's uAPI to save/restore event_data, but ugh. Rather than
add event_data, what if we reuse payload, and then simply skip updating register
state on re-injection? E.g.

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 967b58a8ab9d..b79d545d69c7 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -1941,6 +1941,9 @@ void vmx_inject_exception(struct kvm_vcpu *vcpu)
 	u32 intr_info = ex->vector | INTR_INFO_VALID_MASK;
 	struct vcpu_vmx *vmx = to_vmx(vcpu);
 
+	if (ex->has_payload)
+		<do fred>;
+
 	kvm_deliver_exception_payload(vcpu, ex);
 
 	if (ex->has_error_code) {
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index db3f393192d9..485eec337203 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -773,6 +773,9 @@ void kvm_deliver_exception_payload(struct kvm_vcpu *vcpu,
 	if (!ex->has_payload)
 		return;
 
+	if (ex->injected)
+		goto clear_payload;
+
 	switch (ex->vector) {
 	case DB_VECTOR:
 		/*
@@ -814,6 +817,7 @@ void kvm_deliver_exception_payload(struct kvm_vcpu *vcpu,
 		break;
 	}
 
+clear_payload:
 	ex->has_payload = false;
 	ex->payload = 0;
 }
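
To make the intended flow concrete, here is a standalone userspace toy model of
the idea. It is not kernel code: every toy_* name is hypothetical, the
fred_event_data field merely stands in for whatever "<do fred>" ends up writing,
and the DR6 handling is heavily simplified (the real #DB path does more than OR
in the payload). It only illustrates that a re-injected exception can still
supply the FRED event data from its payload while leaving CR2/DR6 untouched.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DB_VECTOR 1
#define PF_VECTOR 14

/* Hypothetical stand-in for KVM's queued exception. */
struct toy_exception {
	uint8_t vector;
	bool injected;		/* true when the event is being re-injected */
	bool has_payload;
	uint64_t payload;
};

/* Hypothetical stand-in for the relevant vCPU state. */
struct toy_vcpu {
	uint64_t cr2;
	uint64_t dr6;
	uint64_t fred_event_data;	/* assumed placeholder for the FRED VMCS field */
};

void toy_deliver_payload(struct toy_vcpu *vcpu, struct toy_exception *ex)
{
	if (!ex->has_payload)
		return;

	/* Registers were already updated on first delivery; don't touch them. */
	if (ex->injected)
		goto clear_payload;

	switch (ex->vector) {
	case DB_VECTOR:
		vcpu->dr6 |= ex->payload;	/* simplified: bits are sticky */
		break;
	case PF_VECTOR:
		vcpu->cr2 = ex->payload;
		break;
	}

clear_payload:
	ex->has_payload = false;
	ex->payload = 0;
}

void toy_inject_exception(struct toy_vcpu *vcpu, struct toy_exception *ex)
{
	/* Consume the payload as event data before it is cleared. */
	if (ex->has_payload)
		vcpu->fred_event_data = ex->payload;

	toy_deliver_payload(vcpu, ex);
}

/* Re-injected #DB: event data comes from the payload, stale DR6 is untouched. */
int toy_selftest(void)
{
	struct toy_vcpu vcpu = { .dr6 = 0x1 };	/* B0 left over from an earlier #DB */
	struct toy_exception ex = {
		.vector = DB_VECTOR,
		.injected = true,
		.has_payload = true,
		.payload = 0x4000,		/* BS, i.e. single-step */
	};

	toy_inject_exception(&vcpu, &ex);

	assert(vcpu.fred_event_data == 0x4000);
	assert(vcpu.dr6 == 0x1);
	assert(!ex.has_payload);
	return 0;
}
```

In this model the payload travels with the exception, so nothing has to be
reconstructed from DR6, whose sticky bits may describe an earlier trap rather
than the one being re-injected.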