Re: [RFC PATCH 25/35] KVM: x86: Update __get_sregs() / __set_sregs() to support SEV-ES

From: Tom Lendacky
Date: Tue Sep 15 2020 - 16:41:39 EST


On 9/15/20 11:33 AM, Sean Christopherson wrote:
> On Tue, Sep 15, 2020 at 09:19:46AM -0500, Tom Lendacky wrote:
>> On 9/14/20 4:37 PM, Sean Christopherson wrote:
>>> On Mon, Sep 14, 2020 at 03:15:39PM -0500, Tom Lendacky wrote:
>>>> From: Tom Lendacky <thomas.lendacky@xxxxxxx>
>>>>
>>>> Since many of the registers used by an SEV-ES guest are encrypted and
>>>> cannot be read or written, adjust __get_sregs() / __set_sregs() to only
>>>> get or set the registers being tracked (efer, cr0, cr4 and cr8) once the
>>>> VMSA is encrypted.
>>>
>>> Is there an actual use case for writing said registers after the VMSA is
>>> encrypted? Assuming there's a separate "debug mode" and live migration has
>>> special logic, can KVM simply reject the ioctl() if guest state is protected?
>>
>> Yeah, I originally had it that way but one of the folks looking at live
>> migration for SEV-ES thought it would be easier given the way Qemu does
>> things. But I think it's easy enough to batch the tracking registers into
>> the VMSA state that is being transferred during live migration. Let me
>> check that out and likely the SET ioctl() could just skip all the regs.
>
> Hmm, that would be ideal. How are the tracked registers validated when they're
> loaded at the destination? It seems odd/dangerous that KVM would have full
> control over efer/cr0/cr4/cr8. I.e. why is KVM even responsible for migrating
> that information, e.g. as opposed to migrating an opaque blob that contains
> encrypted versions of those registers?
>

KVM doesn't have control of them. They are part of the guest's encrypted
state and that is what the guest uses. KVM can't alter the value that the
guest is using for them once the VMSA is encrypted. However, KVM makes
some decisions based on the values it thinks it knows. For example, early
on I remember the async PF support failing because the CR0 that KVM
thought the guest had didn't have the PE bit set, even though the guest
was in protected mode. So KVM didn't include the error code in the
exception it injected (is_protmode() was false) and things failed. Without
syncing these values after live migration, things also fail (probably for
the same reason). So the idea is to just keep KVM apprised of the values
that the guest has.
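
To make the failure mode concrete, here is a rough sketch of the kind of
decision involved. It is a simplified model, not the actual KVM code: the
struct and the model_*() helper names are hypothetical, and only the CR0.PE
bit position (bit 0) is taken from the architecture.

```c
#include <stdbool.h>
#include <stdint.h>

#define X86_CR0_PE (1UL << 0)	/* CR0.PE: protected mode enable */

/* Hypothetical stand-in for the register values KVM tracks for a vCPU. */
struct tracked_sregs {
	uint64_t cr0;
};

/* Mirrors the idea of is_protmode(): consult the tracked CR0 only. */
static bool model_is_protmode(const struct tracked_sregs *s)
{
	return (s->cr0 & X86_CR0_PE) != 0;
}

/*
 * Decide whether an injected exception carries an error code. If the
 * tracked CR0 is stale (PE clear), the error code is dropped even though
 * the guest is really running in protected mode.
 */
static bool model_inject_has_error_code(const struct tracked_sregs *s,
					bool wants_error_code)
{
	return wants_error_code && model_is_protmode(s);
}
```

With a stale tracked CR0 of 0 the helper reports no error code for a fault
that should have one; once the tracked value has PE set, it reports one.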

Thanks,
Tom