Re: [PATCH 09/21] KVM: TDX: Retry seamcall when TDX_OPERAND_BUSY with operand SEPT
From: Edgecombe, Rick P
Date: Fri Sep 13 2024 - 15:19:26 EST
On Fri, 2024-09-13 at 10:23 -0700, Sean Christopherson wrote:
> > TL;DR:
> > - tdh_mem_track() can contend with tdh_vp_enter().
> > - tdh_vp_enter() contends with tdh_mem*() when 0-stepping is suspected.
>
> The zero-step logic seems to be the most problematic. E.g. if KVM is trying
> to
> install a page on behalf of two vCPUs, and KVM resumes the guest if it
> encounters
> a FROZEN_SPTE when building the non-leaf SPTEs, then one of the vCPUs could
> trigger the zero-step mitigation if the vCPU that "wins" and gets delayed for
> whatever reason.
Can you explain more about what the concern is here? That the zero-step
mitigation activation will be a drag on the TD because of extra contention with
the TDH.MEM calls?
>
> Since FROZEN_SPTE is essentially bit-spinlock with a reaaaaaly slow slow-path,
> what if instead of resuming the guest if a page fault hits FROZEN_SPTE, KVM
> retries
> the fault "locally", i.e. _without_ redoing tdh_vp_enter() to see if the vCPU
> still
> hits the fault?
It seems like an optimization. To me, I would normally want to know how much it
helped before adding it. But if you think it's an obvious win I'll defer.
>
> For non-TDX, resuming the guest and letting the vCPU retry the instruction is
> desirable because in many cases, the winning task will install a valid mapping
> before KVM can re-run the vCPU, i.e. the fault will be fixed before the
> instruction is re-executed. In the happy case, that provides optimal
> performance
> as KVM doesn't introduce any extra delay/latency.
>
> But for TDX, the math is different as the cost of a re-hitting a fault is
> much,
> much higher, especially in light of the zero-step issues.
>
> E.g. if the TDP MMU returns a unique error code for the frozen case, and
> kvm_mmu_page_fault() is modified to return the raw return code instead of '1',
> then the TDX EPT violation path can safely retry locally, similar to the do-
> while
> loop in kvm_tdp_map_page().
>
> The only part I don't like about this idea is having two "retry" return
> values,
> which creates the potential for bugs due to checking one but not the other.
>
> Hmm, that could be avoided by passing a bool pointer as an out-param to
> communicate
> to the TDX S-EPT fault handler that the SPTE is frozen. I think I like that
> option better even though the out-param is a bit gross, because it makes it
> more
> obvious that the "frozen_spte" is a special case that doesn't need attention
> for
> most paths.