Re: [PATCH 00/12] SEV Live Migration Patchset.

From: Ashish Kalra
Date: Thu Feb 13 2020 - 18:09:23 EST


On Wed, Feb 12, 2020 at 09:43:41PM -0800, Andy Lutomirski wrote:
> On Wed, Feb 12, 2020 at 5:14 PM Ashish Kalra <Ashish.Kalra@xxxxxxx> wrote:
> >
> > From: Ashish Kalra <ashish.kalra@xxxxxxx>
> >
> > This patchset adds support for SEV Live Migration on KVM/QEMU.
>
> I skimmed this all and I don't see any description of how this all works.
>
> Does any of this address the mess in svm_register_enc_region()? Right
> now, when QEMU (or a QEMU alternative) wants to allocate some memory
> to be used for guest encrypted pages, it mmap()s some memory and the
> kernel does get_user_pages_fast() on it. The pages are kept pinned
> for the lifetime of the mapping. This is not at all okay. Let's see:
>
> - The memory is pinned and it doesn't play well with the Linux memory
> management code. You just wrote a big patch set to migrate the pages
> to a whole different machine, but we apparently can't even migrate
> them to a different NUMA node or even just a different address. And
> good luck swapping it out.
>
> - The memory is still mapped in the QEMU process, and that mapping is
> incoherent with actual guest access to the memory. It's nice that KVM
> clflushes it so that, in principle, everything might actually work,
> but this is gross. We should not be exposing incoherent mappings to
> userspace.
>
> Perhaps all this fancy infrastructure you're writing for migration and
> all this new API surface could also teach the kernel how to migrate
> pages from a guest *to the same guest* so we don't need to pin pages
> forever. And perhaps you could put some thought into how to improve
> the API so that it doesn't involve nonsensical incoherent mappings.

As a different key is used to encrypt memory in each VM, the hypervisor
can't simply copy the ciphertext from one VM to another to migrate
the VM. Therefore, the AMD SEV Key Management API provides a new set
of functions which the hypervisor can use to package a guest page for
migration while maintaining the confidentiality provided by AMD SEV.

A new page encryption bitmap is created in the kernel to keep track of
the encrypted/decrypted state of the guest's pages. This bitmap is
updated through a new hypercall interface provided to the guest kernel
and firmware.

The KVM_GET_PAGE_ENC_BITMAP ioctl can be used to get the guest page
encryption bitmap. The bitmap can be used to check whether a given
guest page is private or shared.
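
To make the bitmap semantics concrete, here is a rough userspace sketch.
It is not the code from the patches. It assumes one bit per guest page
frame, with 1 meaning encrypted (private) and 0 meaning shared; the
all-zero initial state is just what calloc() gives and says nothing
about the real default. The names enc_bitmap, enc_bitmap_set_range()
and enc_bitmap_is_private() are made up for illustration.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical tracking structure: one bit per guest page frame (gfn). */
struct enc_bitmap {
	unsigned long *bits;
	size_t npages;
};

int enc_bitmap_init(struct enc_bitmap *bm, size_t npages)
{
	size_t nwords = (npages + BITS_PER_WORD - 1) / BITS_PER_WORD;

	bm->bits = calloc(nwords ? nwords : 1, sizeof(unsigned long));
	if (!bm->bits)
		return -1;
	bm->npages = npages;
	return 0;
}

void enc_bitmap_free(struct enc_bitmap *bm)
{
	free(bm->bits);
	bm->bits = NULL;
	bm->npages = 0;
}

/*
 * Roughly what a guest notification would trigger: mark the pages in
 * [gfn, gfn + count) as encrypted or decrypted. Returns -1 if the range
 * falls outside the tracked pages.
 */
int enc_bitmap_set_range(struct enc_bitmap *bm, size_t gfn, size_t count,
			 bool encrypted)
{
	if (gfn > bm->npages || count > bm->npages - gfn)
		return -1;

	for (size_t i = gfn; i < gfn + count; i++) {
		unsigned long mask = 1UL << (i % BITS_PER_WORD);

		if (encrypted)
			bm->bits[i / BITS_PER_WORD] |= mask;
		else
			bm->bits[i / BITS_PER_WORD] &= ~mask;
	}
	return 0;
}

/* What a consumer of the bitmap would ask: is this page private? */
bool enc_bitmap_is_private(const struct enc_bitmap *bm, size_t gfn)
{
	if (gfn >= bm->npages)
		return false;
	return (bm->bits[gfn / BITS_PER_WORD] >> (gfn % BITS_PER_WORD)) & 1UL;
}
```

A migration loop could walk the guest's page frames and use a lookup
like enc_bitmap_is_private() to decide how each page is handled.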

During the migration flow, SEND_START is called on the source hypervisor
to create an outgoing encryption context. The SEV guest policy dictates
whether the certificate passed through the migrate-set-parameters command
will be validated. SEND_UPDATE_DATA is then called to encrypt the guest
private pages. After migration has completed, SEND_FINISH is called to
destroy the encryption context and make the VM non-runnable, which
protects it against cloning.
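
The ordering on the source side can be modelled as a small state
machine. This is only a sketch of the sequence described above, not the
KVM or firmware interface: struct src_vm and the sketch_send_*()
helpers are hypothetical, no encryption is performed, and returning 1
for shared pages (to say "not handled here") is an assumption of the
sketch.

```c
#include <stdbool.h>
#include <stddef.h>

enum send_state { SEND_IDLE, SEND_ACTIVE, SEND_DONE };

/* Hypothetical model of the source VM during migration. */
struct src_vm {
	enum send_state state;
	bool runnable;
	size_t pages_sent;
};

void src_vm_init(struct src_vm *vm)
{
	vm->state = SEND_IDLE;
	vm->runnable = true;
	vm->pages_sent = 0;
}

/* Models SEND_START: create the outgoing encryption context. */
int sketch_send_start(struct src_vm *vm)
{
	if (vm->state != SEND_IDLE)
		return -1;
	vm->state = SEND_ACTIVE;
	return 0;
}

/*
 * Models SEND_UPDATE_DATA: only valid while a context exists and only
 * meaningful for private pages. Returns 1 for a shared page, meaning
 * this path did not handle it (assumption of this sketch).
 */
int sketch_send_update_data(struct src_vm *vm, bool page_is_private)
{
	if (vm->state != SEND_ACTIVE)
		return -1;
	if (!page_is_private)
		return 1;
	vm->pages_sent++;
	return 0;
}

/*
 * Models SEND_FINISH: destroy the context and leave the source VM
 * non-runnable so it cannot continue alongside the migrated copy.
 */
int sketch_send_finish(struct src_vm *vm)
{
	if (vm->state != SEND_ACTIVE)
		return -1;
	vm->state = SEND_DONE;
	vm->runnable = false;
	return 0;
}
```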

On the target machine, RECEIVE_START is called first to create an
incoming encryption context. RECEIVE_UPDATE_DATA is then called to copy
each received encrypted page into guest memory. After migration has
completed, RECEIVE_FINISH is called to make the VM runnable.
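
The target side mirrors this. Again a sketch under stated assumptions
rather than the real interface: struct dst_vm, the tiny fixed-size
guest memory and the sketch_receive_*() helpers are hypothetical, and
the page copy is a plain memcpy() standing in for whatever the firmware
actually does with the incoming data.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096
#define SKETCH_NPAGES 4

enum recv_state { RECV_IDLE, RECV_ACTIVE, RECV_DONE };

/* Hypothetical model of the target VM with a tiny guest memory. */
struct dst_vm {
	enum recv_state state;
	bool runnable;
	unsigned char mem[SKETCH_NPAGES][SKETCH_PAGE_SIZE];
};

void dst_vm_init(struct dst_vm *vm)
{
	memset(vm, 0, sizeof(*vm));
	vm->state = RECV_IDLE;
	vm->runnable = false;	/* not runnable until the flow finishes */
}

/* Models RECEIVE_START: create the incoming encryption context. */
int sketch_receive_start(struct dst_vm *vm)
{
	if (vm->state != RECV_IDLE)
		return -1;
	vm->state = RECV_ACTIVE;
	return 0;
}

/* Models RECEIVE_UPDATE_DATA: place one received page at guest frame gfn. */
int sketch_receive_update_data(struct dst_vm *vm, size_t gfn,
			       const unsigned char *page)
{
	if (vm->state != RECV_ACTIVE || gfn >= SKETCH_NPAGES)
		return -1;
	memcpy(vm->mem[gfn], page, SKETCH_PAGE_SIZE);
	return 0;
}

/* Models RECEIVE_FINISH: only now does the VM become runnable. */
int sketch_receive_finish(struct dst_vm *vm)
{
	if (vm->state != RECV_ACTIVE)
		return -1;
	vm->state = RECV_DONE;
	vm->runnable = true;
	return 0;
}
```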

Thanks,
Ashish