Re: [PATCH 00/12] SEV Live Migration Patchset.
From: Ashish Kalra
Date: Mon Feb 17 2020 - 14:50:24 EST
On Fri, Feb 14, 2020 at 10:58:46AM -0800, Andy Lutomirski wrote:
> On Thu, Feb 13, 2020 at 3:09 PM Ashish Kalra <ashish.kalra@xxxxxxx> wrote:
> >
> > On Wed, Feb 12, 2020 at 09:43:41PM -0800, Andy Lutomirski wrote:
> > > On Wed, Feb 12, 2020 at 5:14 PM Ashish Kalra <Ashish.Kalra@xxxxxxx> wrote:
> > > >
> > > > From: Ashish Kalra <ashish.kalra@xxxxxxx>
> > > >
> > > > This patchset adds support for SEV Live Migration on KVM/QEMU.
> > >
> > > I skimmed this all and I don't see any description of how this all works.
> > >
> > > Does any of this address the mess in svm_register_enc_region()? Right
> > > now, when QEMU (or a QEMU alternative) wants to allocate some memory
> > > to be used for guest encrypted pages, it mmap()s some memory and the
> > > kernel does get_user_pages_fast() on it. The pages are kept pinned
> > > for the lifetime of the mapping. This is not at all okay. Let's see:
> > >
> > > - The memory is pinned and it doesn't play well with the Linux memory
> > > management code. You just wrote a big patch set to migrate the pages
> > > to a whole different machines, but we apparently can't even migrate
> > > them to a different NUMA node or even just a different address. And
> > > good luck swapping it out.
> > >
> > > - The memory is still mapped in the QEMU process, and that mapping is
> > > incoherent with actual guest access to the memory. It's nice that KVM
> > > clflushes it so that, in principle, everything might actually work,
> > > but this is gross. We should not be exposing incoherent mappings to
> > > userspace.
> > >
> > > Perhaps all this fancy infrastructure you're writing for migration and
> > > all this new API surface could also teach the kernel how to migrate
> > > pages from a guest *to the same guest* so we don't need to pin pages
> > > forever. And perhaps you could put some thought into how to improve
> > > the API so that it doesn't involve nonsensical incoherent mappings.
> >
> > As a different key is used to encrypt memory in each VM, the hypervisor
> > can't simply copy the ciphertext from one VM to another to migrate
> > the VM. Therefore, the AMD SEV Key Management API provides a new set
> > of functions which the hypervisor can use to package a guest page for
> > migration, while maintaining the confidentiality provided by AMD SEV.
> >
> > There is a new page encryption bitmap created in the kernel which
> > keeps track of the encrypted/decrypted state of the guest's pages, and
> > this bitmap is updated through a new hypercall interface provided to the
> > guest kernel and firmware.
> >
> > The KVM_GET_PAGE_ENC_BITMAP ioctl can be used to retrieve the guest page
> > encryption bitmap, which can then be consulted to determine whether a
> > given guest page is private or shared.
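To make the bitmap semantics concrete, here is a minimal userspace-side sketch (not the kernel implementation) of interpreting such a bitmap. The one-bit-per-guest-page-frame layout and the convention that a set bit means private/encrypted are assumptions for illustration; the actual ioctl's structure layout may differ:

```python
# Sketch: interpreting a guest page encryption bitmap as it might be
# returned by an ioctl like KVM_GET_PAGE_ENC_BITMAP.
# ASSUMPTION: one bit per guest page frame (gfn), bit set = private
# (encrypted), bit clear = shared (plaintext).

PAGE_SHIFT = 12  # 4 KiB guest pages

def gpa_is_private(bitmap: bytes, gpa: int, start_gfn: int = 0) -> bool:
    """Return True if the guest physical address lies in a private page."""
    gfn = gpa >> PAGE_SHIFT
    bit = gfn - start_gfn
    if bit < 0 or bit >= len(bitmap) * 8:
        raise IndexError("gfn outside the fetched bitmap window")
    return bool(bitmap[bit // 8] & (1 << (bit % 8)))

# Example: pages 0 and 2 private, page 1 shared -> bitmap byte 0b101
bm = bytes([0b101])
```

During migration, only the pages this predicate reports as private would go through the SEV firmware re-encryption path; shared pages can be sent as ordinary plaintext.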
> >
> > During the migration flow, SEND_START is called on the source hypervisor
> > to create an outgoing encryption context. The SEV guest policy dictates whether
> > the certificate passed through the migrate-set-parameters command will be
> > validated. SEND_UPDATE_DATA is called to encrypt the guest private pages.
> > After migration is completed, SEND_FINISH is called to destroy the encryption
> > context and make the VM non-runnable to protect it against cloning.
> >
> > On the target machine, RECEIVE_START is called first to create an
> > incoming encryption context. The RECEIVE_UPDATE_DATA is called to copy
> > the received encrypted page into guest memory. After migration has
> > completed, RECEIVE_FINISH is called to make the VM runnable.
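The command ordering described above can be modeled as a small sketch. The method names mirror the SEV firmware commands from the text, but everything else is a stand-in: the real commands are issued by KVM to the PSP firmware, and the byte-flip "transport encryption" below is a stub, not the actual transport-key re-encryption:

```python
# Illustrative model of the SEV live-migration command sequencing.
# Stubs only; the real SEND_*/RECEIVE_* commands run in PSP firmware.

class SevSource:
    def __init__(self):
        self.context = None
        self.runnable = True

    def send_start(self):
        # Create the outgoing encryption context; guest policy decides
        # whether the target's certificate is validated.
        self.context = "outgoing"

    def send_update_data(self, private_page: bytes) -> bytes:
        # Re-encrypt a private guest page for the wire (stub transform).
        assert self.context == "outgoing"
        return bytes(b ^ 0xFF for b in private_page)

    def send_finish(self):
        # Destroy the context; the source VM becomes non-runnable,
        # protecting it against cloning.
        self.context = None
        self.runnable = False

class SevTarget:
    def __init__(self):
        self.context = None
        self.runnable = False
        self.memory = []

    def receive_start(self):
        # Create the incoming encryption context.
        self.context = "incoming"

    def receive_update_data(self, wire_page: bytes):
        # Copy the received page into guest memory (stub transform back).
        assert self.context == "incoming"
        self.memory.append(bytes(b ^ 0xFF for b in wire_page))

    def receive_finish(self):
        # Tear down the context and make the target VM runnable.
        self.context = None
        self.runnable = True
```

A full run pairs the two sides: send_start/receive_start, one send_update_data/receive_update_data per private page, then send_finish/receive_finish, after which only the target may run.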
> >
>
> Thanks! This belongs somewhere in the patch set.
>
> You still haven't answered my questions about the existing coherency
> issues and whether the same infrastructure can be used to migrate
> guest pages within the same machine.
Page Migration and Live Migration are separate features. One of my
colleagues is currently working on making page migration possible and on
removing the SEV guest pinning requirement.
>
> Also, you're making guest-side and host-side changes. What ensures
> that you don't try to migrate a guest that doesn't support the
> hypercall for encryption state tracking?
This is a good question, and it is still open. There
are two possibilities here: the guest does not have any unencrypted pages
(e.g. when booting a 32-bit guest) and so never makes the hypercall, or
the guest does not have support for the newer hypercall at all.
In the first case, all the guest pages are assumed to be
encrypted and live migration happens as such.
For the second case, we have been discussing this internally,
and one option is to extend the KVM capabilities/feature bits so that
userspace can check for this.
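The conservative fallback described above could look something like the sketch below. The `guest_supports_hypercall` flag stands in for a KVM capability/feature-bit check that does not exist yet; it is purely hypothetical:

```python
# Sketch of the fallback: if the guest never updated the encryption
# bitmap (no hypercall support, or no shared pages at all), the
# hypervisor conservatively treats every page as private/encrypted.
# HYPOTHETICAL: guest_supports_hypercall models a not-yet-defined
# KVM capability/feature-bit check.

def page_is_private(bitmap_updates: dict, gfn: int,
                    guest_supports_hypercall: bool) -> bool:
    if not guest_supports_hypercall or not bitmap_updates:
        return True                       # assume everything is encrypted
    return bitmap_updates.get(gfn, True)  # unreported pages stay private
```

Treating unknown pages as encrypted is the safe default: migrating a shared page through the encrypted path wastes firmware work, but migrating a private page as plaintext would corrupt the guest.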
Thanks,
Ashish