Re: [PATCH v2 00/24] TDX MMU Part 2

From: Paolo Bonzini
Date: Tue Dec 24 2024 - 09:33:31 EST


On 11/12/24 08:33, Yan Zhao wrote:
Hi,

Here is v2 of the TDX “MMU part 2” series.
As discussed earlier, non-nit feedback from v1[0] has been applied.
- Among them, patch "KVM: TDX: MTRR: implement get_mt_mask() for TDX" was
dropped. Self-snoop was not made a dependency for enabling TDX, since
the base code's kvm_mmu_may_ignore_guest_pat() does not check for the
self-snoop feature. So, strictly speaking, the current code would
incorrectly zap the mirrored root if non-coherent DMA devices were
hot-plugged.

I also noticed and fixed a few minor issues without internal discussion
(noted in each patch's version log).

It’s now ready to hand off to Paolo/kvm-coco-queue.


One remaining item that requires further discussion is "How to handle
the TDX module lock contention (i.e. SEAMCALL retry replacements)".
The basis for future discussions includes:
(1) TDH.MEM.TRACK can contend with TDH.VP.ENTER on the TD epoch lock.
(2) TDH.VP.ENTER contends with TDH.MEM* on the S-EPT tree lock when
0-stepping mitigation is triggered.
- The zero-step mitigation counter is kept per-vCPU. Mitigation triggers
when the TDX module finds that EPT violations were caused by the same
RIP as in the last TDH.VP.ENTER 6 consecutive times.
The threshold value 6 is explained as
"There can be at most 2 mapping faults on instruction fetch
(x86 macro-instructions length is at most 15 bytes) when the
instruction crosses page boundary; then there can be at most 2
mapping faults for each memory operand, when the operand crosses
page boundary. For most of x86 macro-instructions, there are up to 2
memory operands and each one of them is small, which brings us to
maximum 2+2*2 = 6 legal mapping faults."
- If the EPT violations received by KVM are caused by
TDG.MEM.PAGE.ACCEPT, they will not trigger 0-stepping mitigation.
Since a TD is required to call TDG.MEM.PAGE.ACCEPT before accessing
private memory when configured with pending_ve_disable=Y, 0-stepping
mitigation is not expected to occur in such a TD.
(3) TDG.MEM.PAGE.ACCEPT can contend with SEAMCALLs TDH.MEM*.
(Actually, TDG.MEM.PAGE.ATTR.RD or TDG.MEM.PAGE.ATTR.WR can also
contend with SEAMCALLs TDH.MEM*. Although we don't need to consider
these two TDCALLs when enabling basic TDX, they are allowed by the
TDX module, and we can't control whether a TD invokes a TDCALL or
not).
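
To make the counting in (2) concrete, here is a rough userspace model of
the per-vCPU counter and of the 2+2*2 arithmetic quoted above. It is only
a sketch: struct vcpu_model, note_ept_violation() and
max_legal_mapping_faults() are hypothetical names, not TDX module or KVM
code, and it assumes the mitigation fires on the 6th consecutive fault at
the same RIP.

```c
#include <stdbool.h>
#include <stdint.h>

/* Threshold quoted in the text: 2 fetch faults + 2 operands * 2 faults. */
#define ZERO_STEP_THRESHOLD 6

/* Hypothetical per-vCPU state; not the TDX module's real layout. */
struct vcpu_model {
	uint64_t last_rip;            /* RIP of the previous EPT violation */
	unsigned int same_rip_faults; /* consecutive faults at last_rip */
};

/* Worst-case number of legal mapping faults for one instruction. */
unsigned int max_legal_mapping_faults(unsigned int fetch_faults,
				      unsigned int mem_operands,
				      unsigned int faults_per_operand)
{
	return fetch_faults + mem_operands * faults_per_operand;
}

/*
 * Record one EPT violation at @rip. Returns true once the same RIP has
 * faulted ZERO_STEP_THRESHOLD times in a row (assumed trigger point).
 */
bool note_ept_violation(struct vcpu_model *v, uint64_t rip)
{
	if (v->same_rip_faults && v->last_rip == rip) {
		v->same_rip_faults++;
	} else {
		/* A different RIP restarts the count. */
		v->last_rip = rip;
		v->same_rip_faults = 1;
	}
	return v->same_rip_faults >= ZERO_STEP_THRESHOLD;
}
```

With this model, max_legal_mapping_faults(2, 2, 2) gives the 6 from the
quoted explanation, and a fault at a different RIP resets the count.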

The patch "KVM: TDX: Retry seamcall when TDX_OPERAND_BUSY with operand SEPT"
is still in place in this series (at the tail), but we should drop it once
we settle on the real solution.
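
For readers who have not looked at that patch, the general shape of a
"retry while busy" wrapper is roughly the following. This is a standalone
sketch, not the code from the patch: the status enum, the callback type,
the retry bound and all model_* names are made up for illustration, and
the real code may retry differently or not bound the loop this way.

```c
#include <stddef.h>

/* Hypothetical status codes standing in for real SEAMCALL return values. */
enum model_status {
	MODEL_OK = 0,
	MODEL_OPERAND_BUSY = 1,
};

/* Hypothetical callback that issues one SEAMCALL-like operation. */
typedef enum model_status (*model_seamcall_fn)(void *ctx);

/* Arbitrary bound chosen for this sketch. */
#define MODEL_MAX_RETRIES 16

/*
 * Call @fn until it stops reporting "busy" or the retry bound is hit.
 * The number of calls made is stored in @attempts if it is non-NULL.
 */
enum model_status model_seamcall_retry(model_seamcall_fn fn, void *ctx,
				       unsigned int *attempts)
{
	enum model_status st;
	unsigned int n = 0;

	do {
		st = fn(ctx);
		n++;
	} while (st == MODEL_OPERAND_BUSY && n < MODEL_MAX_RETRIES);

	if (attempts)
		*attempts = n;
	return st;
}

/* Test stub: reports busy for the first *ctx calls, then succeeds. */
enum model_status model_busy_n_times(void *ctx)
{
	int *left = ctx;

	if (*left > 0) {
		(*left)--;
		return MODEL_OPERAND_BUSY;
	}
	return MODEL_OK;
}
```

A bounded loop like this only hides the contention listed in (1)-(3); it
does not remove it, which is why it is a placeholder.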


This series has 5 commits intended to collect Acks from x86 maintainers.
These commits introduce and export SEAMCALL wrappers to allow KVM to manage
the S-EPT (the EPT that maps private memory and is protected by the TDX
module):

x86/virt/tdx: Add SEAMCALL wrapper tdh_mem_sept_add() to add SEPT
pages
x86/virt/tdx: Add SEAMCALL wrappers to add TD private pages
x86/virt/tdx: Add SEAMCALL wrappers to manage TDX TLB tracking
x86/virt/tdx: Add SEAMCALL wrappers to remove a TD private page
x86/virt/tdx: Add SEAMCALL wrappers for TD measurement of initial
contents

Apart from the possible changes to the SEAMCALL wrappers, this is in good shape.

Thanks,

Paolo