Re: [RFC v2] KVM: x86: Support KVM VMs sharing SEV context

From: James Bottomley
Date: Thu Apr 08 2021 - 21:18:40 EST


On Thu, 2021-04-08 at 17:41 -0700, Steve Rutherford wrote:
> On Thu, Apr 8, 2021 at 2:15 PM James Bottomley <jejb@xxxxxxxxxxxxx>
> wrote:
> > On Thu, 2021-04-08 at 12:48 -0700, Steve Rutherford wrote:
> > > On Thu, Apr 8, 2021 at 10:43 AM James Bottomley <
> > > jejb@xxxxxxxxxxxxx>
> > > wrote:
> > > > On Fri, 2021-04-02 at 16:20 +0200, Paolo Bonzini wrote:
[...]
> > > > > However, it would be nice to collaborate on the low-level
> > > > > (SEC/PEI) firmware patches to detect whether a CPU is part of
> > > > > the primary VM or the mirror. If Google has any OVMF patches
> > > > > already done for that, it would be great to combine it with
> > > > > IBM's SEV migration code and merge it into upstream OVMF.
> > > >
> > > > We've reached the stage with our prototyping where not having
> > > > the OVMF support is blocking us from working on QEMU. If we're
> > > > going to have to reinvent the wheel in OVMF because Google is
> > > > unwilling to publish the patches, can you at least give some
> > > > hints about how you did it?
> > > >
> > > > Thanks,
> > > >
> > > > James
> > >
> > > Hey James,
> > > It's not strictly necessary to modify OVMF to make SEV VMs live
> > > migrate. If we were to modify OVMF, we would contribute those
> > > changes
> > > upstream.
> >
> > Well, no, we already published an OVMF RFC to this list that does
> > migration. However, the mirror approach requires a different boot
> > mechanism for the extra vCPU in the mirror. I assume you're doing
> > this bootstrap through OVMF so the hypervisor can interrogate it to
> > get the correct entry point? That's the code we're asking to see
> > because that's what replaces our use of the MP service in the RFC.
> >
> > James
>
> Hey James,
> The intention would be to have a separate, stand-alone firmware-like
> binary run by the mirror. Since the VMM is in control of where it
> places that binary in the guest physical address space and the
> initial configuration of the vCPUs, it can point the vCPUs at an
> entry point contained within that binary, rather than at the standard
> x86 reset vector.

If you want to share ASIDs you have to share the firmware that the
running VM has been attested to. Once the VM moves from LAUNCH to
RUNNING, the PSP won't allow the VMM to inject any more firmware or do
any more attestations. What you mirror after this point can thus only
contain what has already been measured or what the guest added. This
is why we think there has to be a new entry path into the VM for the
mirror vCPU.

So, assuming you plan to inject two pieces of firmware at start of
day (OVMF and this separate binary) and attest to both, you can do
that, but then you have two problems:

1. Preventing OVMF from trampling all over your separate binary while
it's booting
2. Launching the vCPU up into this separate binary in a way it can
execute (needs stack and heap)

I think you can likely solve 1. by making the separate binary look like
a ROM, but then you have the problem of where you steal the RAM you
need for a heap and stack, and it still brings us back to how to launch
the vCPU, which was the original question.

With ES we can set the registers at launch, so a vCPU that's never
launched can still be pre-programmed with the separate binary entry
point, but solving the stack and heap looks like it requires
co-operation from OVMF.

That's why we were thinking the easiest straight-line approach is to
have a runtime DXE which has a declared initialization routine that
allocates memory for the stack and a heap, and a separate declared
entry point for the vCPU which picks up the already allocated and
mapped stack and heap.
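
To make the split concrete, here is a minimal sketch of the two declared
routines in plain C. It is only an illustration under assumptions: every
name and size below is hypothetical and not taken from any OVMF patch,
and malloc() stands in for whatever runtime allocator the DXE would
really use. A real mirror entry point would also need a small assembly
stub to switch onto the stack before calling into C.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical sizes; the thread does not specify any. */
#define MIRROR_STACK_SIZE (16 * 1024)
#define MIRROR_HEAP_SIZE  (64 * 1024)

/* Hypothetical hand-off block shared by the two declared routines. */
struct mirror_ctx {
    uint8_t *stack_base;  /* lowest address of the stack region */
    uint8_t *stack_top;   /* initial stack pointer (stack grows down) */
    uint8_t *heap_base;   /* start of the heap region */
    size_t   heap_size;   /* heap length in bytes */
    int      ready;       /* set once both regions are allocated */
};

static struct mirror_ctx g_mirror_ctx;

/*
 * Declared initialization routine: runs once on an ordinary boot vCPU
 * while the firmware allocator is still available. malloc() is a
 * stand-in for that allocator (assumption).
 */
int mirror_init(void)
{
    struct mirror_ctx *ctx = &g_mirror_ctx;
    uint8_t *stack, *heap;

    if (ctx->ready)
        return 0;                       /* already initialized */

    stack = malloc(MIRROR_STACK_SIZE);
    heap = malloc(MIRROR_HEAP_SIZE);
    if (stack == NULL || heap == NULL) {
        free(stack);
        free(heap);
        return -1;
    }

    ctx->stack_base = stack;
    ctx->stack_top = stack + MIRROR_STACK_SIZE;
    ctx->heap_base = heap;
    ctx->heap_size = MIRROR_HEAP_SIZE;
    ctx->ready = 1;
    return 0;
}

/*
 * Declared entry point for the mirror vCPU: allocates nothing, it only
 * picks up the regions that mirror_init() already set aside.
 */
int mirror_entry(const struct mirror_ctx **out)
{
    assert(out != NULL);

    if (!g_mirror_ctx.ready)
        return -1;                      /* init never ran; cannot start */

    *out = &g_mirror_ctx;
    return 0;
}
```

The point of the split is that mirror_entry() never needs an allocator,
so the mirror vCPU can start with nothing but a pre-programmed entry
point and the memory the first routine reserved.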

James