Re: [PATCH v2] x86/kvm: Disable KVM_ASYNC_PF_SEND_ALWAYS

From: Andy Lutomirski
Date: Mon Apr 06 2020 - 16:42:34 EST



> On Apr 6, 2020, at 1:32 PM, Andy Lutomirski <luto@xxxxxxxxxxxxxx> wrote:
>
>> On Apr 6, 2020, at 1:25 PM, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
>>
>> On Mon, Apr 06, 2020 at 03:09:51PM -0400, Vivek Goyal wrote:
>>>> On Mon, Mar 09, 2020 at 09:22:15PM +0100, Peter Zijlstra wrote:
>>>>> On Mon, Mar 09, 2020 at 08:05:18PM +0100, Thomas Gleixner wrote:
>>>>>> Andy Lutomirski <luto@xxxxxxxxxx> writes:
>>>>>
>>>>>>> I'm okay with the save/restore dance, I guess. It's just yet more
>>>>>>> entry crud to deal with architecture nastiness, except that this
>>>>>>> nastiness is 100% software and isn't Intel/AMD's fault.
>>>>>>
>>>>>> And we can do it in C and don't have to fiddle with it in the ASM
>>>>>> maze.
>>>>>
>>>>> Right; I'd still love to kill KVM_ASYNC_PF_SEND_ALWAYS though, even if
>>>>> we do the save/restore in do_nmi(). That is some wild brain melt. Also,
>>>>> AFAIK none of the distros are actually shipping a PREEMPT=y kernel
>>>>> anyway, so killing it shouldn't matter much.
>>>
>>> It will be nice if we can retain KVM_ASYNC_PF_SEND_ALWAYS. I have another
>>> use case outside CONFIG_PREEMPT.
>>>
>>> I am trying to extend async pf interface to also report page fault errors
>>> to the guest.
>>
>> Then please start over and design a sane ParaVirt Fault interface. The
>> current one is utter crap.
>
> Agreed. Don't extend the current mechanism. Replace it.
>
> I would be happy to review a replacement. I'm not really excited to review an extension of the current mess. The current thing is barely, if at all, correct.

I read your patch. It cannot possibly be correct. You need to decide what happens if you get a memory failure when guest interrupts are off. If this happens, you can't send #PF, but you also can't just swallow the error. The existing APF code is so messy that it's not at all obvious what your code ends up doing, but I'm pretty sure it doesn't do anything sensible, especially since the ABI doesn't have a sensible option.

I think you should inject MCE and coordinate with Tony Luck to make it sane. And, in the special case that the new improved async PF mechanism is enabled *and* interrupts are on, you can skip the MCE and instead inject a new improved APF.
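
A minimal sketch of the policy described above, only to make the two cases explicit. Every name here (fault_delivery, guest_state, pick_delivery) is hypothetical and does not exist in KVM; the "new improved APF" is an assumed future interface, not the current one.

```c
#include <stdbool.h>

/* Hypothetical names throughout; nothing here is an existing KVM API. */
enum fault_delivery {
    DELIVER_NEW_APF,  /* assumed redesigned async PF notification */
    DELIVER_MCE,      /* machine check: the default for memory failure */
};

struct guest_state {
    bool new_apf_enabled;  /* guest opted in to the assumed new mechanism */
    bool interrupts_on;    /* guest interrupts are enabled right now */
};

/*
 * Choose how to report a memory failure to the guest.
 * The new APF is used only when it is enabled *and* interrupts are on.
 * Every other case, including interrupts off, gets an MCE, so the
 * error is never swallowed and never sent as a plain #PF.
 */
static enum fault_delivery pick_delivery(const struct guest_state *g)
{
    if (g->new_apf_enabled && g->interrupts_on)
        return DELIVER_NEW_APF;
    return DELIVER_MCE;
}
```

The point of the shape is that MCE is the fallthrough, so the interrupts-off case has a defined answer without any extra branch.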

But, as it stands, I will NAK any guest code that tries to make #PF handle memory failure. Sorry, it's just too messy to actually analyze all the cases.