Re: [PATCH v2 07/10] KVM: x86: Always populate L1 GPA for KVM_MAP_MEMORY

From: Paolo Bonzini
Date: Wed Apr 17 2024 - 06:29:01 EST


On Wed, Apr 17, 2024 at 1:00 AM Sean Christopherson <seanjc@xxxxxxxxxx> wrote:
> > > > Hmm... For the non-TDX use cases this is just an optimization, right? For TDX
> > > > there shouldn't be an issue. If so, maybe this last one is not so horrible.
> >
> > It doesn't even have to be ABI that it gives an error. As you say,
> > this ioctl can just be advisory only for !confidential machines. Even
> > if it were implemented, the shadow MMU can drop roots at any moment
>
> Sure, but there's a difference between KVM _potentially_ dropping roots and
> guaranteed failure because userspace is trying to do something that's unsupported.
> But I think this is a non-issue, because it should really just be as simple as:
>
> if (!mmu->pre_map_memory)
> return -EOPNOTSUPP;
>
> Hmm, or probably this to avoid adding an MMU hook for a single MMU flavor:
>
> if (!tdp_mmu_enabled || !mmu->root_role.direct)
> return -EOPNOTSUPP;
>
> > and/or kill the mapping via the shrinker.
>
> Ugh, we really need to kill that code.

Ok, so let's add a KVM_CHECK_EXTENSION so that people can check if
it's supported.

> > That said, I can't fully shake the feeling that this ioctl should be
> > an error for !TDX and that TDX_INIT_MEM_REGION wasn't that bad. The
> > implementation was ugly but the API was fine.
>
> Hmm, but IMO the implementation was ugly in no small part because of the contraints
> put on KVM by the API. Mapping S-EPT *and* doing TDH.MEM.PAGE.ADD in the same
> ioctl() forced KVM to operate on vcpu0, and necessitated shoving temporary data
> into a per-VM structure in order to get the source contents into TDH.MEM.PAGE.ADD.

That's because it was trying to do two things with a single loop. It's
not needed - and in fact KVM_CAP_MEMORY_MAPPING forces userspace to do
it in two passes.

> And stating the obvious, TDX_INIT_MEM_REGION also doesn't allow pre-mapping memory,
> which is generally useful, and can be especially beneficial for confidential VMs
> (and TDX in particular) due to the added cost of a page fault VM-Exit.
>
> I'm not dead set on this generic ioctl(), but unless it ends up being a train wreck
> for userspace, I think it will allow for cleaner and more reusable code in KVM.

Yes, this ioctl() can stay. Forcing it before adding memory to TDX is
ugly, but it's not a blocker. I'll look at it closely and see how far
it is from being committable to kvm-coco-queue.

Paolo