Re: [PATCH v6 4/8] KVM: Extend the memslot to support fd-based private memory

From: Sean Christopherson
Date: Fri Jun 10 2022 - 12:14:35 EST


On Mon, May 30, 2022, Chao Peng wrote:
> On Mon, May 23, 2022 at 03:22:32PM +0000, Sean Christopherson wrote:
> > Actually, if the semantics are that userspace declares memory as private, then we
> > can reuse KVM_MEMORY_ENCRYPT_REG_REGION and KVM_MEMORY_ENCRYPT_UNREG_REGION. It'd
> > be a little gross because we'd need to slightly redefine the semantics for TDX, SNP,
> > and software-protected VM types, e.g. the ioctls() currently require a pre-existing
> > memslot. But I think it'd work...
>
> These existing ioctls look good for TDX and probably SNP as well. For
> software-protected VM types, they may not be enough. Maybe as a first
> step we can reuse them for all hardware-based solutions and invent a new
> interface when a software-protected solution is actually supported.
>
> There is a semantic difference for fd-based private memory. The two
> ioctls() above currently take a userspace address (hva), while for fd-based
> memory it should be fd+offset, and it's probably better to use gpa in this
> case. We would then need to change the existing semantics and break
> backward compatibility.

My thought was to keep the existing semantics for VMs with type==0, i.e. SEV and
SEV-ES VMs. It's a bit gross, but the pinning behavior is a dead end for SNP and
TDX, so it effectively needs to be deprecated anyways. I'm definitely not opposed
to a new ioctl if Paolo or others think this is too awful, but burning an ioctl
for this seems wasteful.

Then generic KVM can do something like:

	case KVM_MEMORY_ENCRYPT_REG_REGION:
	case KVM_MEMORY_ENCRYPT_UNREG_REGION: {
		struct kvm_enc_region region;

		if (!kvm_arch_vm_supports_private_memslots(kvm))
			goto arch_vm_ioctl;

		r = -EFAULT;
		if (copy_from_user(&region, argp, sizeof(region)))
			goto out;

		r = kvm_set_encrypted_region(ioctl, &region);
		break;
	}
	default:
arch_vm_ioctl:
		r = kvm_arch_vm_ioctl(filp, ioctl, arg);


where common KVM provides

	__weak bool kvm_arch_vm_supports_private_memslots(struct kvm *kvm)
	{
		return false;
	}

and x86 overrides that to

	bool kvm_arch_vm_supports_private_memslots(struct kvm *kvm)
	{
		/* I can't remember what we decided on calling type '0' VMs. */
		return !!kvm->vm_type;
	}

and if someone ever wants to enable private memslots for SEV/SEV-ES guests we can
always add a capability or even a new VM type.

pKVM on arm can then obviously implement kvm_arch_vm_supports_private_memslots()
to grab whatever identifies a pKVM VM.
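
For anyone who wants to poke at the routing outside the kernel, here is a rough
userspace sketch of the dispatch described above. Everything prefixed with
mock_/MOCK_/PATH_ is a hypothetical stand-in invented for illustration, not a
real KVM type or symbol, and it only models which handler would see the ioctl,
assuming the "vm_type != 0 means private memslots are supported" rule.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct kvm; only the VM type matters here. */
struct mock_kvm {
	unsigned long vm_type;	/* 0 == legacy VM, e.g. SEV/SEV-ES */
};

/* Hypothetical stand-ins for the ioctl numbers. */
enum mock_ioctl {
	MOCK_ENCRYPT_REG_REGION,
	MOCK_ENCRYPT_UNREG_REGION,
	MOCK_OTHER_IOCTL,
};

/* Which handler ends up servicing the ioctl. */
enum mock_path {
	PATH_GENERIC,	/* common code, i.e. the new private-memory semantics */
	PATH_ARCH,	/* arch handler, i.e. the existing semantics */
};

/* Mirrors the proposed x86 override: any non-zero VM type opts in. */
static bool mock_vm_supports_private_memslots(const struct mock_kvm *kvm)
{
	return !!kvm->vm_type;
}

/* Mirrors the switch in the generic VM ioctl handler. */
static enum mock_path mock_vm_ioctl_path(const struct mock_kvm *kvm,
					 enum mock_ioctl ioctl)
{
	assert(kvm != NULL);

	switch (ioctl) {
	case MOCK_ENCRYPT_REG_REGION:
	case MOCK_ENCRYPT_UNREG_REGION:
		if (mock_vm_supports_private_memslots(kvm))
			return PATH_GENERIC;
		return PATH_ARCH;	/* the "goto arch_vm_ioctl" case */
	default:
		return PATH_ARCH;
	}
}
```

With this, a type-0 VM always lands in the arch path, so SEV and SEV-ES keep
their current pinning behavior, while any non-zero VM type gets the generic
handling for the two region ioctls only.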