Re: [RFC] KVM: mm: fd-based approach for supporting KVM guest private memory

From: Andy Lutomirski
Date: Wed Sep 01 2021 - 14:24:40 EST

On Wed, Sep 1, 2021, at 9:18 AM, James Bottomley wrote:
> On Wed, 2021-09-01 at 08:54 -0700, Andy Lutomirski wrote:
> [...]
> > If you want to swap a page on TDX, you can't. Sorry, go directly to
> > jail, do not collect $200.
>
> Actually, even on SEV-ES you can't either. You can read the encrypted
> page and write it out if you want, but unless you swap it back to the
> exact same physical memory location, the encryption key won't work.
> Since we don't guarantee this for swap, I think swap won't actually
> work for any confidential computing environment.
>
> > So I think there are literally zero code paths that currently call
> > try_to_unmap() that will actually work like that on TDX. If we run
> > out of memory on a TDX host, we can kill the guest completely and
> > reclaim all of its memory (which probably also involves killing QEMU
> > or whatever other user program is in charge), but that's really our
> > only option.
>
> I think our only option for swap is guest co-operation. We're going to
> have to inflate a balloon or something in the guest and have the guest
> driver do some type of bounce of the page, where it becomes an
> unencrypted page in the guest (so the host can read it without the
> physical address keying of the encryption getting in the way) but
> actually encrypted with a swap transfer key known only to the guest. I
> assume we can use the page acceptance infrastructure currently being
> discussed elsewhere to do swap back in as well ... the host provides
> the guest with the encrypted swap page and the guest has to decrypt it
> and place it in encrypted guest memory.

I think the TD module could, without hardware changes, fairly efficiently re-encrypt guest pages for swap. The TD module has access to the full CPU crypto capabilities, so this wouldn't be slow. It would just require convincing the TD module team to implement such a feature.
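
To make the re-encryption idea concrete, here is a toy model in Python. It is only a sketch under loud assumptions: the XOR keystream derived from SHA-256 is a stand-in for the real address-tweaked memory encryption, there is no integrity or replay protection (which any real design would need), and the names mem_crypt, swap_out and swap_in are hypothetical, not any TDX or SEV interface. It shows why a page copied out as raw ciphertext cannot be brought back at a different physical address, and how re-encrypting under a separate swap key that is not tied to the physical address avoids that. Whoever holds both keys would do the re-encryption: the TD module in the scheme above, or the guest in the cooperative scheme James describes.

```python
import hashlib

PAGE_SIZE = 4096


def _keystream(key: bytes, tweak: bytes, length: int) -> bytes:
    """Toy keystream: SHA-256 over key, tweak and a block counter."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + tweak + counter.to_bytes(8, "little")).digest()
        counter += 1
    return bytes(out[:length])


def _xor(key: bytes, tweak: bytes, data: bytes) -> bytes:
    """XOR data with the keystream; applying it twice restores the input."""
    stream = _keystream(key, tweak, len(data))
    return bytes(a ^ b for a, b in zip(data, stream))


def mem_crypt(mem_key: bytes, phys_addr: int, page: bytes) -> bytes:
    """Stand-in for memory encryption tweaked by physical address.

    Symmetric in this toy model: the same call encrypts and decrypts.
    """
    return _xor(mem_key, b"mem" + phys_addr.to_bytes(8, "little"), page)


def swap_out(mem_key: bytes, swap_key: bytes, phys_addr: int,
             slot: int, ram_ciphertext: bytes) -> bytes:
    """Re-encrypt a page under the swap key, tweaked by swap slot only."""
    plaintext = mem_crypt(mem_key, phys_addr, ram_ciphertext)
    return _xor(swap_key, b"swap" + slot.to_bytes(8, "little"), plaintext)


def swap_in(mem_key: bytes, swap_key: bytes, new_phys_addr: int,
            slot: int, blob: bytes) -> bytes:
    """Decrypt a swapped blob and re-encrypt it for its new physical address."""
    plaintext = _xor(swap_key, b"swap" + slot.to_bytes(8, "little"), blob)
    return mem_crypt(mem_key, new_phys_addr, plaintext)


if __name__ == "__main__":
    mem_key, swap_key = b"m" * 32, b"s" * 32
    page = bytes(range(256)) * (PAGE_SIZE // 256)

    ram = mem_crypt(mem_key, 0x1000, page)

    # Naive swap: raw ciphertext restored at a different physical address
    # does not decrypt back to the original page.
    assert mem_crypt(mem_key, 0x2000, ram) != page

    # Re-encrypting swap: the page survives the move to a new address.
    blob = swap_out(mem_key, swap_key, 0x1000, slot=7, ram_ciphertext=ram)
    ram2 = swap_in(mem_key, swap_key, 0x2000, slot=7, blob=blob)
    assert mem_crypt(mem_key, 0x2000, ram2) == page
```

The host only ever handles blob, so it can store it wherever it likes and pick any free page on swap-in.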

It would be very, very nice for whatever swap support ends up existing to work without guest cooperation -- even if the guest is cooperative, we don't want to end up waiting for the guest to help when the host is under memory pressure.