Re: [PATCH 09/21] KVM: TDX: Retry seamcall when TDX_OPERAND_BUSY with operand SEPT

From: Edgecombe, Rick P
Date: Mon Oct 14 2024 - 13:37:04 EST


On Mon, 2024-10-14 at 10:54 +0000, Huang, Kai wrote:
> On Thu, 2024-10-10 at 21:53 +0000, Edgecombe, Rick P wrote:
> > On Thu, 2024-10-10 at 10:33 -0700, Sean Christopherson wrote:
> > > >
> > > > 1st: "fault->is_private != kvm_mem_is_private(kvm, fault->gfn)" is found.
> > > > 2nd-6th: try_cmpxchg64() fails on each level SPTEs (5 levels in total)
> >
> > Isn't there a more general scenario:
> >
> > vcpu0                              vcpu1
> > 1. Freezes PTE
> > 2. External op to do the SEAMCALL
> > 3.                                 Faults same PTE, hits frozen PTE
> > 4.                                 Retries N times, triggers zero-step
> > 5. Finally finishes external op
> >
> > Am I missing something?
>
> I must be missing something.  I thought KVM is going to 
>

"Is going to", as in "will be changed to"? Or "does today"?

> retry internally for
> step 4 (retries N times) because it sees the frozen PTE, but will never go back
> to guest after the fault is resolved?  How can step 4 triggers zero-step?

Steps 3-4 are saying it will go back to the guest and fault again.


As far as what KVM will do in the future, I think it is still open. I've not had
the chance to think about this for more than 30 min at a time, but the plan to
handle OPERAND_BUSY by taking an expensive path to break any contention (i.e.
kick+lock + whatever TDX module changes we come up with) seems to be the
leading idea.

Retrying N times is too hacky. Retrying internally forever might be awkward
to implement. Because of the signal_pending() check, you would have to handle
exiting to userspace and going back to an EPT violation next time the vcpu
tries to enter.
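
To make the awkward part concrete, here is a minimal userspace model of the
"retry internally forever" loop. It is only a sketch under assumptions:
struct model, pte_is_frozen(), signal_pending_stub() and
fault_retry_forever() are hypothetical names made up for illustration, not
existing KVM code, and the "frozen PTE" is just a counter.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model state; these are not real KVM structures. */
struct model {
	int frozen_for;	/* PTE reads as frozen for this many more checks */
	int signal_at;	/* retry count at which a signal arrives, -1 = never */
	int polls;	/* number of retries done so far */
};

/* Stand-in for the kernel's signal_pending(current) check. */
static bool signal_pending_stub(const struct model *m)
{
	return m->signal_at >= 0 && m->polls >= m->signal_at;
}

/* Stand-in for "the other vcpu still holds the PTE frozen". */
static bool pte_is_frozen(struct model *m)
{
	if (m->frozen_for > 0) {
		m->frozen_for--;
		return true;
	}
	return false;
}

/*
 * Retry inside the fault handler until the PTE is no longer frozen.
 * Returns 0 once the fault could be resolved, or -EINTR when a signal
 * forces an exit to userspace. In the -EINTR case the fault is still
 * unresolved, so the vcpu would take the same EPT violation on its
 * next entry.
 */
int fault_retry_forever(struct model *m)
{
	assert(m != NULL);

	for (;;) {
		if (signal_pending_stub(m))
			return -EINTR;
		if (!pte_is_frozen(m))
			return 0;
		m->polls++;
	}
}
```

With frozen_for = 3 and no signal, the loop returns 0 after three retries.
With a signal arriving at the second retry, it returns -EINTR while the PTE
is still frozen, which is the case that would need the extra handling.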