Re: [PATCH v2] Fix undefined operation VMXOFF during reboot and crash
From: Andy Lutomirski
Date: Mon Jun 29 2020 - 17:23:02 EST
> On Jun 29, 2020, at 1:54 PM, David P. Reed <dpreed@xxxxxxxxxxxx> wrote:
>
> ïSimple question for those on the To: and CC: list here. Should I abandon any hope of this patch being accepted? It's been a long time.
>
> The non-response after I acknowledged that this was discovered by when working on a personal, non-commercial research project - which is "out-of-tree" (apparently dirty words on LKML) has me thinking my contribution is unwanted. That's fine, I suppose. I can maintain this patch out-of-tree as well.
> I did incorporate all the helpful suggestions I received in this second patch, and given some encouragement, will happily submit a revised v3 if there is any likelihood of acceptance. I'm wary of doing more radical changes (like combining emergency and normal paths).
>
Sorry about being slow and less actively encouraging than we should be. We absolutely welcome personal contributions. The actual problem is that everyone is worked and weâre all slow. Also, you may be hitting a corner case in the process: is this a KVM patch or an x86 patch?
>
> On Thursday, June 25, 2020 10:59am, "David P. Reed" <dpreed@xxxxxxxxxxxx> said:
>
>> Correction to my comment below.
>> On Thursday, June 25, 2020 10:45am, "David P. Reed" <dpreed@xxxxxxxxxxxx> said:
>>
>>> [Sorry: this is resent because my mailer included HTML, rejected by LKML]
>>> Thanks for the response, Sean ... I had thought everyone was too busy to follow
>>> up
>>> from the first version.
>>>
>>> I confess I'm not sure why this should be broken up into a patch series, given
>>> that it is so very small and is all aimed at the same category of bug.
>>>
>>> And the "emergency" path pre-existed, I didn't want to propose removing it, since
>>> I assumed it was there for a reason. I didn't want to include my own judgement as
>>> to whether there should only be one path. (I'm pretty sure I didn't find a VMXOFF
>>> in KVM separately from the instance in this include file, but I will check).
>> Just checked. Yes, the kvm code's handling of VMXOFF is separate, and though it
>> uses exception masking, seems to do other things, perhaps related to nested KVM,
>> but I haven't studied the deep logic of KVM nesting.
>>
>>>
>>> A question: if I make it a series, I have to test each patch doesn't break
>>> something individually, in order to handle the case where one patch is accepted
>>> and the others are not. Do I need to test each individual patch thoroughly as an
>>> independent patch against all those cases?
>>> I know the combination don't break anything and fixes the issues I've discovered
>>> by testing all combinations (and I've done some thorough testing of panics,
>>> oopses
>>> crashes, kexec, ... under all combinations of CR4.VMXE enablement and crash
>>> source
>>> to verify the fix fixes the problem's manifestations and to verify that it
>>> doesn't
>>> break any of the working paths.
>>>
>>> That said, I'm willing to do a v3 "series" based on these suggestions if it will
>>> smooth its acceptance. If it's not going to get accepted after doing that, my
>>> motivation is flagging.
>>> On Thursday, June 25, 2020 2:06am, "Sean Christopherson"
>>> <sean.j.christopherson@xxxxxxxxx> said:
>>>
>>>
>>>
>>>>> On Thu, Jun 11, 2020 at 03:48:18PM -0400, David P. Reed wrote:
>>>>>> -/** Disable VMX on the current CPU
>>>>>> +/* Disable VMX on the current CPU
>>>>>> *
>>>>>> - * vmxoff causes a undefined-opcode exception if vmxon was not run
>>>>>> - * on the CPU previously. Only call this function if you know VMX
>>>>>> - * is enabled.
>>>>>> + * vmxoff causes an undefined-opcode exception if vmxon was not run
>>>>>> + * on the CPU previously. Only call this function directly if you know VMX
>>>>>> + * is enabled *and* CPU is in VMX root operation.
>>>>>> */
>>>>>> static inline void cpu_vmxoff(void)
>>>>>> {
>>>>>> - asm volatile ("vmxoff");
>>>>>> + asm volatile ("vmxoff" ::: "cc", "memory"); /* clears all flags on success
>>>>> */
>>>>>> cr4_clear_bits(X86_CR4_VMXE);
>>>>>> }
>>>>>>
>>>>>> @@ -47,17 +47,35 @@ static inline int cpu_vmx_enabled(void)
>>>>>> return __read_cr4() & X86_CR4_VMXE;
>>>>>> }
>>>>>>
>>>>>> -/** Disable VMX if it is enabled on the current CPU
>>>>>> - *
>>>>>> - * You shouldn't call this if cpu_has_vmx() returns 0.
>>>>>> +/*
>>>>>> + * Safely disable VMX root operation if active
>>>>>> + * Note that if CPU is not in VMX root operation this
>>>>>> + * VMXOFF will fault an undefined operation fault,
>>>>>> + * so use the exception masking facility to handle that RARE
>>>>>> + * case.
>>>>>> + * You shouldn't call this directly if cpu_has_vmx() returns 0
>>>>>> + */
>>>>>> +static inline void cpu_vmxoff_safe(void)
>>>>>> +{
>>>>>> + asm volatile("1:vmxoff\n\t" /* clears all flags on success */
>>>>>
>>>>> Eh, I wouldn't bother with the comment, there are a million other caveats
>>>>> with VMXOFF that are far more interesting.
>>>>>
>>>>>> + "2:\n\t"
>>>>>> + _ASM_EXTABLE(1b, 2b)
>>>>>> + ::: "cc", "memory");
>>>>>
>>>>> Adding the memory and flags clobber should be a separate patch.
>>>>>
>>>>>> + cr4_clear_bits(X86_CR4_VMXE);
>>>>>> +}
>>>>>
>>>>>
>>>>> I don't see any value in safe/unsafe variants. The only in-kernel user of
>>>>> VMXOFF outside of the emergency flows is KVM, which has its own VMXOFF
>>>>> helper, i.e. all users of cpu_vmxoff() want the "safe" variant. Just add
>>>>> the exception fixup to cpu_vmxoff() and call it good.
>>>>>
>>>>>> +
>>>>>> +/*
>>>>>> + * Force disable VMX if it is enabled on the current CPU,
>>>>>> + * when it is unknown whether CPU is in VMX operation.
>>>>>> */
>>>>>> static inline void __cpu_emergency_vmxoff(void)
>>>>>> {
>>>>>> - if (cpu_vmx_enabled())
>>>>>> - cpu_vmxoff();
>>>>>> + if (!cpu_vmx_enabled())
>>>>>> + return;
>>>>>> + cpu_vmxoff_safe();
>>>>>
>>>>> Unnecessary churn.
>>>>>
>>>>>> }
>>>>>>
>>>>>> -/** Disable VMX if it is supported and enabled on the current CPU
>>>>>> +/* Force disable VMX if it is supported on current CPU
>>>>>> */
>>>>>> static inline void cpu_emergency_vmxoff(void)
>>>>>> {
>>>>>> diff --git a/arch/x86/kernel/reboot.c b/arch/x86/kernel/reboot.c
>>>>>> index e040ba6be27b..b0e6b106a67e 100644
>>>>>> --- a/arch/x86/kernel/reboot.c
>>>>>> +++ b/arch/x86/kernel/reboot.c
>>>>>> @@ -540,21 +540,14 @@ static void emergency_vmx_disable_all(void)
>>>>>> *
>>>>>> * For safety, we will avoid running the nmi_shootdown_cpus()
>>>>>> * stuff unnecessarily, but we don't have a way to check
>>>>>> - * if other CPUs have VMX enabled. So we will call it only if the
>>>>>> - * CPU we are running on has VMX enabled.
>>>>>> - *
>>>>>> - * We will miss cases where VMX is not enabled on all CPUs. This
>>>>>> - * shouldn't do much harm because KVM always enable VMX on all
>>>>>> - * CPUs anyway. But we can miss it on the small window where KVM
>>>>>> - * is still enabling VMX.
>>>>>> + * if other CPUs have VMX enabled.
>>>>>> */
>>>>>> - if (cpu_has_vmx() && cpu_vmx_enabled()) {
>>>>>> + if (cpu_has_vmx()) {
>>>>>> /* Disable VMX on this CPU. */
>>>>>> - cpu_vmxoff();
>>>>>> + cpu_emergency_vmxoff();
>>>>>
>>>>> This also needs to be in a separate patch. And it should use
>>>>> __cpu_emergency_vmxoff() instead of cpu_emergency_vmxoff().
>>>>>
>>>>>>
>>>>>> /* Halt and disable VMX on the other CPUs */
>>>>>> nmi_shootdown_cpus(vmxoff_nmi);
>>>>>> -
>>>>>> }
>>>>>> }
>>>>>>
>>>>>> --
>>>>>> 2.26.2
>>>>>>
>>>>>
>>>
>>>
>>
>>
>>
>
>