Re: [PATCH 09/21] KVM: TDX: Retry seamcall when TDX_OPERAND_BUSY with operand SEPT

From: Sean Christopherson
Date: Mon Sep 09 2024 - 17:23:51 EST


On Mon, Sep 09, 2024, Sean Christopherson wrote:
> On Mon, Sep 09, 2024, Rick P Edgecombe wrote:
> > On Mon, 2024-09-09 at 17:25 +0200, Paolo Bonzini wrote:
> > > On 9/4/24 05:07, Rick Edgecombe wrote:
> > > > +static inline u64 tdx_seamcall_sept(u64 op, struct tdx_module_args *in)
> > > > +{
> > > > +#define SEAMCALL_RETRY_MAX     16
> > >
> > > How is the 16 determined?  Also, is the lock per-VM or global?
> >
> > The lock being considered here is per-TD, but TDX_OPERAND_BUSY in general can be
> > for other locks. I'm not sure where the 16 came from, maybe Yuan or Isaku can
> > share the history. In any case, there seems to be some problems with this patch
> > or justification.
> >
> > Regarding the zero-step mitigation, the TDX Module has a mitigation for an
> > attack where a malicious VMM causes repeated private EPT violations for the same
> > GPA. When this happens TDH.VP.ENTER will fail to enter the guest. Regardless of
> > zero-step detection, these SEPT related SEAMCALLs will exit with the checked
> > error code if they contend the mentioned lock. If there was some other (non-
> > zero-step related) contention for this lock and KVM tries to re-enter the TD too
> > many times without resolving an EPT violation, it might inadvertently trigger
> > the zero-step mitigation. I *think* this patch is trying to say not to worry
> > about this case, and do a simple retry loop instead to handle the contention.
> >
> > But why 16 retries would be sufficient, I can't find a reason for. Getting this
> > required retry logic right is important because some failures
> > (TDH.MEM.RANGE.BLOCK) can lead to KVM_BUG_ON()s.
>
> I (somewhat indirectly) raised this as an issue in v11, and at a (very quick)
> glance, nothing has changed to alleviate my concerns.

Gah, went out of my way to find the thread and then forgot to post the link:

https://lore.kernel.org/all/Y8m34OEVBfL7Q4Ns@xxxxxxxxxx

> In general, I am _very_ opposed to blindly retrying an SEPT SEAMCALL, ever. For
> its operations, I'm pretty sure the only sane approach is for KVM to ensure there
> will be no contention. And if the TDX module's single-step protection spuriously
> kicks in, KVM exits to userspace. If the TDX module can't/doesn't/won't communicate
> that it's mitigating single-step, e.g. so that KVM can forward the information
> to userspace, then that's a TDX module problem to solve.
>
> > Per the docs, in general the VMM is supposed to retry SEAMCALLs that return
> > TDX_OPERAND_BUSY.
>
> IMO, that's terrible advice. SGX has similar behavior, where the xucode "module"
> signals #GP if there's a conflict. #GP is obviously far, far worse as it lacks
> the precision that would help software understand exactly what went wrong, but I
> think one of the better decisions we made with the SGX driver was to have a
> "zero tolerance" policy where the driver would _never_ retry due to a potential
> resource conflict, i.e. that any conflict in the module would be treated as a
> kernel bug.
>
> > I think we need to revisit the general question of which
> > SEAMCALLs we should be retrying and how many times/how long. The other
> > consideration is that KVM already has per-VM locking, that would prevent
> > contention for some of the locks. So depending on internal details KVM may not
> > need to do any retries in some cases.
>
> Yes, and if KVM can't avoid conflict/retry, then before we go any further, I want
> to know exactly why that is the case.