Re: [PATCH Part2 v5 37/45] KVM: SVM: Add support to handle MSR based Page State Change VMGEXIT

From: Sean Christopherson
Date: Wed Oct 13 2021 - 13:04:14 EST


On Tue, Oct 12, 2021, Sean Christopherson wrote:
> If we are unable to root cause and fix the bug, I think a viable workaround would
> be to clear the hardware present bit in unrelated SPTEs, but keep the SPTEs
> themselves. The idea is mostly the same as the ZAPPED_PRIVATE concept from the
> initial TDX RFC. MMU notifier invalidations, memslot removal, RMP restoration,
> etc. would all continue to work since the SPTEs are still there, and KVM's page
> fault handler could audit any "blocked" SPTE when it's refaulted (I'm pretty
> sure it'd be impossible for the PFN to change, since any PFN change would
> require a memslot update or mmu_notifier invalidation).
>
> The downside to that approach is that it would require walking all SPTEs to do a
> memslot deletion, i.e. we'd lose the "fast zap" behavior. If that's a performance
> issue, the behavior could be opt-in (but not for SNP/TDX).
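
To make the "blocked" SPTE idea a bit more concrete, here is a rough userspace
model of it. This is only a sketch: the bit layout, the SPTE_SW_BLOCKED bit and
all helper names are made up for illustration and are not KVM's real SPTE
format or API. The point is just that blocking clears the hardware present bit
while keeping the PFN, so a refault can audit the PFN before restoring the
mapping.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical layout for this sketch only, not KVM's actual SPTE encoding. */
#define SPTE_HW_PRESENT  (1ull << 0)
#define SPTE_SW_BLOCKED  (1ull << 62)	/* assumed software-available bit */
#define SPTE_PFN_SHIFT   12
#define SPTE_PFN_MASK    0x000ffffffffff000ull

/* Build a present SPTE pointing at @pfn. */
static inline uint64_t spte_make(uint64_t pfn)
{
	return ((pfn << SPTE_PFN_SHIFT) & SPTE_PFN_MASK) | SPTE_HW_PRESENT;
}

static inline uint64_t spte_pfn(uint64_t spte)
{
	return (spte & SPTE_PFN_MASK) >> SPTE_PFN_SHIFT;
}

/* Clear the hardware present bit but keep the rest of the SPTE, incl. the PFN. */
static inline uint64_t spte_block(uint64_t spte)
{
	if (!(spte & SPTE_HW_PRESENT))
		return spte;
	return (spte & ~SPTE_HW_PRESENT) | SPTE_SW_BLOCKED;
}

static inline bool spte_is_blocked(uint64_t spte)
{
	return (spte & SPTE_SW_BLOCKED) && !(spte & SPTE_HW_PRESENT);
}

/*
 * Refault path: audit the blocked SPTE against the PFN the fault resolved to,
 * and only then make it present again.  Returns false, leaving the SPTE
 * untouched, if the SPTE isn't blocked or the PFN doesn't match.
 */
static bool spte_unblock(uint64_t *sptep, uint64_t expected_pfn)
{
	if (!spte_is_blocked(*sptep))
		return false;
	if (spte_pfn(*sptep) != expected_pfn)
		return false;
	*sptep = (*sptep & ~SPTE_SW_BLOCKED) | SPTE_HW_PRESENT;
	return true;
}
```

A PFN mismatch in spte_unblock() would be the "audit failed" case, i.e. a KVM
bug per the reasoning above, and the caller would have to decide how to react.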

Another option, if we introduce private memslots, is to preserve private memslots
on unrelated deletions. The argument is that (a) private memslots are a new
feature so there's no prior uABI to break, and (b) if not zapping private memslot
SPTEs in response to the guest remapping a BAR somehow breaks GPU pass-through,
then the bug is all but guaranteed to be somewhere besides KVM's memslot logic.
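
Very roughly, the zap policy would look something like the below. Again this is
a userspace model and not actual KVM code: struct memslot_model, its fields and
zap_on_memslot_delete() are hypothetical names that exist only for this sketch.
Deleting a slot zaps that slot and all shared slots (today's behavior), but
skips private slots other than the one being deleted.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a memslot; not KVM's struct kvm_memory_slot. */
struct memslot_model {
	int id;
	bool is_private;	/* assumed flag for the new private memslot type */
	bool sptes_valid;	/* true while the slot still has live SPTEs */
};

/*
 * Model of the zap done when slot @deleted_id is removed.  Returns the number
 * of slots whose SPTEs were zapped.
 */
static int zap_on_memslot_delete(struct memslot_model *slots, size_t nr_slots,
				 int deleted_id)
{
	int zapped = 0;
	size_t i;

	for (i = 0; i < nr_slots; i++) {
		struct memslot_model *slot = &slots[i];

		/* Preserve private memslots on unrelated deletions. */
		if (slot->is_private && slot->id != deleted_id)
			continue;

		if (slot->sptes_valid) {
			slot->sptes_valid = false;
			zapped++;
		}
	}
	return zapped;
}
```

E.g. with one private slot and two shared slots, deleting a shared slot zaps
both shared slots and leaves the private slot's SPTEs in place; the private
slot's SPTEs only go away when that slot itself is deleted.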