Re: [PATCH] KVM: SVM: use vmsave/vmload for saving/restoring additional host state

From: Andy Lutomirski
Date: Thu Dec 10 2020 - 18:50:48 EST


> On Dec 10, 2020, at 1:27 PM, Michael Roth <michael.roth@xxxxxxx> wrote:
>
> Quoting Andy Lutomirski (2020-12-10 13:23:19)
>>> On Thu, Dec 10, 2020 at 9:52 AM Michael Roth <michael.roth@xxxxxxx> wrote:
>>> MSR_STAR, MSR_LSTAR, MSR_CSTAR,
>>> MSR_SYSCALL_MASK, MSR_KERNEL_GS_BASE,
>>> MSR_IA32_SYSENTER_CS,
>>> MSR_IA32_SYSENTER_ESP,
>>> MSR_IA32_SYSENTER_EIP,
>>> MSR_FS_BASE, MSR_GS_BASE
>>
>> Can you get rid of all the old FS/GS manipulation at the same time?
>>
>>> + for (i = 0; i < NR_HOST_SAVE_USER_MSRS; i++) {
>>> + rdmsrl(host_save_user_msrs[i], svm->host_user_msrs[i]);
>>> + }
>>> +
>>> + asm volatile(__ex("vmsave")
>>> + : : "a" (page_to_pfn(sd->save_area) << PAGE_SHIFT)
>>> + : "memory");
>>> + /*
>>> + * Host FS/GS segment registers might be restored soon after
>>> + * vmexit, prior to vmload of host save area. Even though this
>>> + * state is now saved in the host's save area, we cannot use
>>> + * per-cpu accesses until these registers are restored, so we
>>> + * store a copy in the VCPU struct to make sure they are
>>> + * accessible.
>>> + */
>>> #ifdef CONFIG_X86_64
>>> - rdmsrl(MSR_GS_BASE, to_svm(vcpu)->host.gs_base);
>>> + svm->host.gs_base = hostsa->gs.base;
>>> #endif
>>
>> For example, this comment makes no sense to me. Just let VMLOAD
>> restore FS/GS and be done with it. Don't copy those gs_base and
>> gs.base fields -- just delete them please. (Or are they needed for
>> nested virt for some reason? If so, please document that.)
>
> Hi Andy,
>
> The main issue is that we restore FS/GS immediately after a vmexit since
> we need them soon-after for things like per-cpu accesses, but the rest
> of the host state only needs to be restored if we're exiting all the way
> out to userspace. That's also why we store a copy of the values, since
> the host can't access the per-cpu save_area beforehand.

>
> In theory I think we probably could use vmload to restore this state
> immediately after vmexit as you're suggesting, but then we will end up
> taking a performance hit for cases where the vmexit can be handled within
> the kernel, which might leave us worse-off than the pre-patch behavior
> for those cases (handling an MMIO for a virtqueue notification when
> vhost_net is enabled, for instance)

Please benchmark this. WRMSR to MSR_GS_BASE is serializing and may
well be slower than VMLOAD.

>
>>
>>> - savesegment(fs, svm->host.fs);
>>> - savesegment(gs, svm->host.gs);
>>> - svm->host.ldt = kvm_read_ldt();
>>> -
>>> - for (i = 0; i < NR_HOST_SAVE_USER_MSRS; i++)
>>> - rdmsrl(host_save_user_msrs[i], svm->host_user_msrs[i]);
>>> + svm->host.fs = hostsa->fs.selector;
>>> + svm->host.gs = hostsa->gs.selector;
>>
>> This too. Why is the host code thinking about selectors at all?
>>
>>> - kvm_load_ldt(svm->host.ldt);
>>
>> I have a patch that deletes this, too. Don't worry about the conflict
>> -- I'll sort it out.
>>
>>> @@ -120,7 +115,6 @@ struct vcpu_svm {
>>> struct {
>>> u16 fs;
>>> u16 gs;
>>> - u16 ldt;
>>> u64 gs_base;
>>> } host;
>>
>> Shouldn't you be about to delete fs, gs, and gs_base too?
>
> For the reasons above it seems like they'd need to be there in some form,
> though we could maybe replace them with a pointer to the per-cpu save_area
> so that they can be accessed directly before GS segment is restored.

I’m confused. Why would you be accessing them before VMLOAD? These
are host values.

I think there are two reasonable ways to do this:

1. VMLOAD before STGI. This is obviously correct, and it's quite simple.

2. Save cpu_kernelmode_gs_base(cpu) before VM entry, and restore that
value to MSR_GS_BASE using code like this (or its asm equivalent)
before STGI:

if (static_cpu_has(X86_FEATURE_FSGSBASE))
wrgsbase(base);
else
wrmsr...

and then VMLOAD in the vcpu_put() path.

I can't think of any reason to use loadsegment(), load_gs_index(), or
savesegment() at all, nor can I think of any reason to touch
MSR_KERNEL_GS_BASE or MSR_FS_BASE.

--Andy