Re: [PATCH v2 02/10] KVM: x86: extend struct kvm_vcpu_pv_apf_data with token info

From: Vivek Goyal
Date: Wed Jun 03 2020 - 15:35:28 EST


On Thu, May 28, 2020 at 10:42:38AM +0200, Vitaly Kuznetsov wrote:
> Vivek Goyal <vgoyal@xxxxxxxxxx> writes:
>
> > On Mon, May 25, 2020 at 04:41:17PM +0200, Vitaly Kuznetsov wrote:
> >>
> >
> > [..]
> >> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> >> index 0a6b35353fc7..c195f63c1086 100644
> >> --- a/arch/x86/include/asm/kvm_host.h
> >> +++ b/arch/x86/include/asm/kvm_host.h
> >> @@ -767,7 +767,7 @@ struct kvm_vcpu_arch {
> >> u64 msr_val;
> >> u32 id;
> >> bool send_user_only;
> >> - u32 host_apf_reason;
> >> + u32 host_apf_flags;
> >
> > Hi Vitaly,
> >
> > What is host_apf_reason used for? It looks like it is somehow used in
> > the context of nested guests. I hope by now you have been able to
> > figure it out.
> >
> > Is it somehow the case that the L2 guest takes a page fault exit
> > and then L0 injects this event into L1 using an exception? I have been
> > trying to read this code but can't wrap my head around it.
> >
> > I am still concerned about the case of nested KVM. We have discussed
> > the apf mechanism but never touched the nested part of it. Given that
> > we are touching code in the nested KVM part, I want to make sure it is
> > not broken in the new design.
> >
>
> Sorry I missed this.
>
> I think we've touched nested topic a bit already:
> https://lore.kernel.org/kvm/87lfluwfi0.fsf@xxxxxxxxxxxxxxxxxxxx/
>
> But let me try to explain the whole thing and maybe someone will point
> out what I'm missing.

Hi Vitaly,

Sorry, I got busy with some other things and have only now got back to
this. Thanks for the explanation; I think I understand it to some extent
now.

Vivek

>
> The problem being solved: an L2 guest is running and it hits a page
> which is not present *in L0*; instead of pausing the *L1* vCPU completely,
> we want to let L1 know about the problem so it can run something else
> (e.g. another guest or just another application).
>
> What's different between this and the 'normal' APF case? When the L2
> guest is running, the (physical) CPU is in 'guest' mode, so we can't
> inject #PF there. Actually, we can, but L2 may get confused: we're not
> even sure the fault is L2's fault, that L2 supports APF, and so on. We
> want to make L1 deal with the issue.
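>
> To make the decision on the L0 side concrete, here is a rough,
> self-contained model of it (the struct and helper names below are made
> up for illustration, this is not the actual KVM code):
>
> #include <stdbool.h>
>
> struct vcpu_model {
>         bool apf_enabled;               /* L1 has set up async PF */
>         bool in_guest_mode;             /* the vCPU is currently running L2 */
>         bool delivery_as_pf_vmexit;     /* KVM_ASYNC_PF_DELIVERY_AS_PF_VMEXIT */
> };
>
> /* Can L0 report 'page not present' instead of stalling the vCPU? */
> static bool can_notify_async(const struct vcpu_model *v)
> {
>         if (!v->apf_enabled)
>                 return false;   /* L1 never enabled APF */
>         if (v->in_guest_mode && !v->delivery_as_pf_vmexit)
>                 return false;   /* L1 can't deal with a synthetic #PF VMEXIT */
>         return true;            /* inject #PF (or #PF VMEXIT for nested) */
> }
>
> If this returns false, L0 has no choice but to pause the vCPU until the
> page is resident, which is exactly what we want to avoid.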
>
> How does it work then? We inject #PF and L1 sees it as a #PF VMEXIT. L1
> needs to know about APF (thus KVM_ASYNC_PF_DELIVERY_AS_PF_VMEXIT), but
> the handling is exactly the same as do_pagefault(): L1's
> kvm_handle_page_fault() checks the APF area (shared between L0 and L1) and
> either pauses a task or resumes a previously paused one. This can be an
> L2 guest or something else.
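>
> In pseudo-C, the common dispatch looks roughly like this. It is a
> simplified sketch, not the kernel code: apf_wait()/apf_wake() stand in
> for kvm_async_pf_task_wait()/kvm_async_pf_task_wake(), and note that
> before this series the lower 32 bits of the faulting address double as
> the token:
>
> enum apf_reason {
>         APF_PAGE_NOT_PRESENT = 1,       /* KVM_PV_REASON_PAGE_NOT_PRESENT */
>         APF_PAGE_READY       = 2,       /* KVM_PV_REASON_PAGE_READY */
> };
>
> struct apf_shared_area {                /* per-vCPU area shared with the host */
>         unsigned int reason;            /* becomes 'flags' with this series */
>         unsigned int enabled;
> };
>
> void apf_wait(unsigned int token);      /* park the current task */
> void apf_wake(unsigned int token);      /* resume a previously parked task */
> void handle_regular_fault(unsigned long address);
>
> void handle_pf(struct apf_shared_area *apf, unsigned long address)
> {
>         unsigned int reason = apf->reason;
>
>         apf->reason = 0;                /* read-and-reset, as the kernel does */
>         switch (reason) {
>         case APF_PAGE_NOT_PRESENT:
>                 apf_wait((unsigned int)address);
>                 break;
>         case APF_PAGE_READY:
>                 apf_wake((unsigned int)address);
>                 break;
>         default:
>                 handle_regular_fault(address);  /* a real page fault */
>                 break;
>         }
> }
>
> The only difference for L1-as-hypervisor is that it reaches this
> dispatch from a #PF VMEXIT instead of from its own #PF handler.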
>
> What is 'host_apf_reason'? It is a copy of the 'reason' field from 'struct
> kvm_vcpu_pv_apf_data' which we read upon #PF VMEXIT. A non-zero value
> indicates that the #PF VMEXIT is synthetic.
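>
> For reference, before this series the shared area in
> arch/x86/include/uapi/asm/kvm_para.h looks like this (the patch we are
> discussing renames 'reason' to 'flags' and adds token information; see
> the patch itself for the exact new layout):
>
> struct kvm_vcpu_pv_apf_data {
>         __u32 reason;
>         __u8 pad[60];
>         __u32 enabled;
> };
>
> Roughly speaking, on a #PF VMEXIT from L2, L1's KVM reads 'reason' from
> its own per-CPU copy of this structure (which L0 wrote) and stashes the
> value in vcpu->arch.apf.host_apf_reason (host_apf_flags after the
> rename) before calling kvm_handle_page_fault().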
>
> How does it work with this patchset: the 'page not present' case remains
> the same; the 'page ready' case now goes through an interrupt, so it may
> not get handled immediately. External interrupts will be handled by L0 in
> host mode (when L2 is not running). For the 'page ready' case the L1
> hypervisor doesn't need any special handling, the kvm_async_pf_intr() irq
> handler will work correctly.
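>
> Roughly what that irq handler has to do (again just a sketch reusing the
> made-up helpers from above; the exact layout of the extended shared area
> and the acknowledgement mechanism, e.g. a write to an ACK MSR, are
> whatever this series defines):
>
> struct apf_shared_area_v2 {     /* kvm_vcpu_pv_apf_data with this series */
>         unsigned int flags;     /* was 'reason' */
>         unsigned int token;     /* new: identifies the completed async PF */
>         unsigned int enabled;
> };
>
> void ack_page_ready_to_host(void);      /* tell the host the token was consumed */
>
> void page_ready_irq(struct apf_shared_area_v2 *apf)
> {
>         unsigned int token = apf->token;
>
>         apf->token = 0;
>         apf_wake(token);                /* resume the parked task */
>         ack_page_ready_to_host();
> }
>
> apf_wake() here is the same wake primitive as in the #PF case; the only
> new bit is that the token now comes from the shared area rather than
> from the faulting address.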
>
> I've smoke tested this with VMX and nothing immediately blew up.
>
> --
> Vitaly
>