Re: [PATCH 10/12] mm: x86: Invoke hypercall when page encryption status is changed

From: Steve Rutherford
Date: Wed Feb 19 2020 - 22:29:41 EST

On Wed, Feb 19, 2020 at 6:12 PM Andy Lutomirski <luto@xxxxxxxxxxxxxx> wrote:
> > On Feb 19, 2020, at 5:58 PM, Steve Rutherford <srutherford@xxxxxxxxxx> wrote:
> >
> > On Wed, Feb 12, 2020 at 5:18 PM Ashish Kalra <Ashish.Kalra@xxxxxxx> wrote:
> >>
> >> From: Brijesh Singh <brijesh.singh@xxxxxxx>
> >>
> >> Invoke a hypercall when a memory region is changed from encrypted ->
> >> decrypted and vice versa. Hypervisor need to know the page encryption
> >> status during the guest migration.
> >
> > One messy aspect, which I think is fine in practice, is that this
> > presumes that pages are either treated as encrypted or decrypted. If
> > also done on SEV, the in-place re-encryption supported by SME would
> > break SEV migration. Linux doesn't do this now on SEV, and I don't
> > have an intuition for why Linux might want this, but we will need to
> > ensure it is never done in order to ensure that migration works down
> > the line. I don't believe the AMD manual promises this will work
> > anyway.
> >
> > Something feels a bit wasteful about having all future kernels
> > universally announce c-bit status when SEV is enabled, even if KVM
> > isn't listening, since it may be too old (or just not want to know).
> > Might be worth eliding the hypercalls if you get ENOSYS back? There
> > might be a better way of passing paravirt config metadata across than
> > just trying and seeing if the hypercall succeeds, but I'm not super
> > familiar with it.
>
> I actually think this should be a hard requirement to merge this. The
> host needs to tell the guest that it supports this particular
> migration strategy and the guest needs to tell the host that it is
> using it. And the guest needs a way to tell the host that it's *not*
> using it right now due to kexec, for example.
>
> I'm still uneasy about a guest being migrated in the window where the
> hypercall tracking and the page encryption bit don't match. I guess
> maybe corruption in this window doesn't matter?
It does matter, since you don't want to accidentally clear the dirty
bit when you are migrating the page from the wrong perspective.
Treating pages with dirty c-bits as dirty pages should solve this
problem. It's probably reasonable to expect userspace to handle this?
Downside is that you would then need ~3 copies of the c-bit tracking
buffer: one as the kernel version, one as the "old" usermode version,
and one as the "current" usermode version (which is smaller, since you
can fetch smaller sections than the full buffer). The kernel could
probably directly twiddle the dirty tracking bits and avoid the extra
userspace version, but this doesn't seem required.
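
To make the "treat pages with dirty c-bits as dirty pages" idea
concrete, here is a rough userspace sketch. It assumes one bit per
guest page in each bitmap; the function name and the bitmap layout are
made up for illustration and are not taken from the patch series or
from any existing KVM interface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: fold c-bit changes into the dirty bitmap.
 *
 * dirty    - dirty-page bitmap, one bit per guest page (updated in place)
 * cbit_old - c-bit bitmap from the previous migration pass
 * cbit_cur - c-bit bitmap fetched for the current pass
 * nwords   - number of 64-bit words in each bitmap
 *
 * Returns the number of pages that were not already dirty and became
 * dirty only because their c-bit flipped.
 */
size_t merge_cbit_changes(uint64_t *dirty, const uint64_t *cbit_old,
                          const uint64_t *cbit_cur, size_t nwords)
{
    size_t flipped = 0;

    assert(dirty && cbit_old && cbit_cur);

    for (size_t i = 0; i < nwords; i++) {
        /* A set bit means the page's encryption status changed. */
        uint64_t changed = cbit_old[i] ^ cbit_cur[i];
        /* Pages that would otherwise have been skipped this pass. */
        uint64_t newly = changed & ~dirty[i];

        dirty[i] |= changed;

        /* Count set bits without relying on compiler builtins. */
        while (newly) {
            newly &= newly - 1;
            flipped++;
        }
    }
    return flipped;
}
```

After each pass userspace would copy cbit_cur into cbit_old, which is
where the "old" and "current" usermode copies come from.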

That said, this does balloon the c-bit tracking overhead. At one bit
per 4KB page, tracking 100GB of guest pages requires roughly 3MB per
instance of these buffers, which isn't that bad but also isn't free
(assuming my back of the envelope math is right).
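
For anyone who wants to check the arithmetic, a small sketch of the
calculation follows. It assumes one tracking bit per guest page and a
4KB page size; the helper name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: bytes needed for a one-bit-per-page bitmap
 * covering guest_bytes of guest memory with the given page size.
 */
uint64_t cbit_bitmap_bytes(uint64_t guest_bytes, uint64_t page_size)
{
    assert(page_size != 0);

    /* Round up so a partial trailing page still gets a bit. */
    uint64_t pages = (guest_bytes + page_size - 1) / page_size;

    /* Round up to whole bytes. */
    return (pages + 7) / 8;
}
```

With guest_bytes = 100 * 2^30 and page_size = 4096 this comes to
26,214,400 pages and 3,276,800 bytes, i.e. a bit over 3MB per buffer.
Three full-size copies would be just under 10MB, and that is an upper
bound since the "current" usermode copy can be smaller.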