Re: [PATCH 09/21] KVM: TDX: Retry seamcall when TDX_OPERAND_BUSY with operand SEPT

From: Edgecombe, Rick P
Date: Mon Oct 14 2024 - 13:37:04 EST

Next message: Rob Herring: "Re: [PATCH v12 1/3] dt-bindings: display: mediatek: Add OF graph support for board path"
Previous message: syzbot: "Re: [syzbot] [btrfs?] KASAN: slab-use-after-free Read in add_delayed_ref"
In reply to: Huang, Kai: "Re: [PATCH 09/21] KVM: TDX: Retry seamcall when TDX_OPERAND_BUSY with operand SEPT"
Next in thread: Huang, Kai: "Re: [PATCH 09/21] KVM: TDX: Retry seamcall when TDX_OPERAND_BUSY with operand SEPT"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On Mon, 2024-10-14 at 10:54 +0000, Huang, Kai wrote:
> On Thu, 2024-10-10 at 21:53 +0000, Edgecombe, Rick P wrote:
> > On Thu, 2024-10-10 at 10:33 -0700, Sean Christopherson wrote:
> > > >
> > > > 1st: "fault->is_private != kvm_mem_is_private(kvm, fault->gfn)" is found.
> > > > 2nd-6th: try_cmpxchg64() fails on each level SPTEs (5 levels in total)
> >
> > Isn't there a more general scenario:
> >
> > vcpu0                              vcpu1
> > 1. Freezes PTE
> > 2. External op to do the SEAMCALL
> > 3.                                 Faults same PTE, hits frozen PTE
> > 4.                                 Retries N times, triggers zero-step
> > 5. Finally finishes external op
> >
> > Am I missing something?
>
> I must be missing something. I thought KVM is going to
>

"Is going to", as in "will be changed to"? Or "does today"?

> retry internally for
> step 4 (retries N times) because it sees the frozen PTE, but will never go back
> to guest after the fault is resolved? How can step 4 triggers zero-step?

Steps 3-4 are saying that vcpu1 will go back to the guest and fault again.
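
To make the scenario concrete, here is a rough user-space model of steps 1-5.
It is only a sketch: simulate_frozen_pte(), struct sim_result, the "tick"
notion and the threshold parameter are all made up for illustration, and the
threshold is an assumed input, not the TDX module's actual zero-step limit.
None of this is KVM or TDX module code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model types; not taken from KVM or the TDX module. */
struct sim_result {
	int vcpu1_faults;       /* faults vcpu1 took on the frozen PTE */
	bool zero_step_tripped; /* fault count reached the assumed threshold */
};

/*
 * external_op_ticks:   how long vcpu0 keeps the PTE frozen while its
 *                      external op (the SEAMCALL) is in flight (steps 1-2).
 * zero_step_threshold: assumed number of no-progress faults that counts
 *                      as zero-step; the real value is not modeled here.
 *
 * Each tick, vcpu1 enters the guest, faults on the same PTE, sees it
 * frozen, and goes back to the guest without making progress.
 */
static struct sim_result simulate_frozen_pte(int external_op_ticks,
					     int zero_step_threshold)
{
	struct sim_result r = { 0, false };

	assert(zero_step_threshold > 0);

	for (int tick = 0; tick < external_op_ticks; tick++) {
		r.vcpu1_faults++;                           /* step 3 */
		if (r.vcpu1_faults >= zero_step_threshold)  /* step 4 */
			r.zero_step_tripped = true;
	}
	/* Step 5: vcpu0 finishes the external op and unfreezes the PTE. */
	return r;
}
```

In this model, whether the threshold is reached depends only on how long the
external op takes relative to the fault rate, which is the concern in the
scenario above.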

As far as what KVM will do in the future, I think it is still open. I've not had
the chance to think about this for more than 30 min at a time, but the plan to
handle OPERAND_BUSY by taking an expensive path to break any contention (i.e.
kick+lock + whatever TDX module changes we come up with) seems to be the
leading idea.

Retrying N times is too hacky. Retrying internally forever might be awkward to
implement: because of the signal_pending() check, you would have to handle
exiting to userspace and coming back to an EPT violation the next time the vcpu
tries to enter.
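
A minimal sketch of why "retry forever" gets awkward, again as a user-space
model rather than kernel code. struct vcpu_model, try_op(),
model_signal_pending() and retry_forever() are hypothetical names; the only
thing borrowed from the real code is the idea of checking for a pending signal
inside the loop.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

enum op_status { OP_OK = 0, OP_BUSY = 1 };

/* Hypothetical stand-in for vcpu state; not a KVM structure. */
struct vcpu_model {
	int busy_left;         /* attempts that will still return OP_BUSY */
	int signal_at_attempt; /* attempt count at which a signal is pending, -1 = never */
	int attempts;          /* total attempts made so far */
};

/* Stand-in for the contended operation (e.g. a busy SEAMCALL). */
static enum op_status try_op(struct vcpu_model *v)
{
	v->attempts++;
	if (v->busy_left > 0) {
		v->busy_left--;
		return OP_BUSY;
	}
	return OP_OK;
}

/* Stand-in for the signal_pending() check. */
static bool model_signal_pending(const struct vcpu_model *v)
{
	return v->signal_at_attempt >= 0 &&
	       v->attempts >= v->signal_at_attempt;
}

/*
 * Retry until the op succeeds, but bail out with -EINTR when a signal is
 * pending. On -EINTR the fault is still unresolved, so the caller has to
 * exit to userspace and hit the same fault again on the next entry.
 */
static int retry_forever(struct vcpu_model *v)
{
	for (;;) {
		if (model_signal_pending(v))
			return -EINTR;
		if (try_op(v) == OP_OK)
			return 0;
	}
}
```

The -EINTR return is the awkward part: the loop cannot really be "forever",
and every caller needs a story for resuming the unresolved fault after the
round trip through userspace.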
