Re: [RFC PATCH] KVM: x86: inhibit APICv upon detecting direct APIC access from L2
From: Ake Koomsin
Date: Wed Aug 09 2023 - 04:42:54 EST
On Tue, 8 Aug 2023 16:48:19 -0700
Sean Christopherson <seanjc@xxxxxxxxxx> wrote:
> > The idea from step 6 to step 10 is to start BitVisor first, and
> > start Linux on top of it. You can adjust the step as you like. Feel
> > free to ask me anything regarding reproducing the problem with
> > BitVisor if the given steps are not sufficient.
>
> Thank you for the detailed repro steps! However, it's likely going
> to be O(weeks) before anyone is able to look at this in detail given
> the extensive repro steps. If you have bandwidth, it's probably worth
> trying to reproduce the problem in a KVM selftest (or a
> KVM-Unit-Test), e.g. create a nested VM, send an IPI from L2, and see
> if it gets routed correctly. This is purely a suggestion to try and
> get a faster fix, it's by no means necessary.
>
> Actually, typing that out raises a question (or two). What APICv
> VMCS control settings does BitVisor use? E.g. is BitVisor enabling
> APICv for its VM (L2)? If so, what values for the APIC access page
> and vAPIC page are shoved into BitVisor's VMCS?
BitVisor does not set up APICv at all. It does not set up an APIC
access page either, and it does not try to emulate the APIC. It only
monitors for the APIC INIT event through the EPT_VIOLATION mechanism
during its AP bringup, and stops monitoring after that. As I mentioned
in the previous mail, when BitVisor runs on real hardware, it lets the
guest control the real APIC directly.
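To make the kind of check involved concrete, here is a rough sketch of
classifying a trapped write to the APIC page. It is only an
illustration and not BitVisor's actual code: the enum and function
names are made up. It assumes xAPIC mode, where the low half of the ICR
sits at offset 0x300 of the APIC page, bits 10:8 hold the delivery mode
(101b for INIT, 110b for Start-up), bit 14 is the level and bit 15 is
the trigger mode.

```c
#include <stdint.h>

/* xAPIC ICR layout (low 32 bits), per the Intel SDM. */
#define APIC_ICR_LOW       0x300u
#define ICR_DELIVERY_MASK  0x700u      /* bits 10:8 */
#define ICR_DELIVERY_INIT  0x500u      /* 101b */
#define ICR_DELIVERY_SIPI  0x600u      /* 110b */
#define ICR_LEVEL_ASSERT   (1u << 14)
#define ICR_TRIGGER_LEVEL  (1u << 15)

/* Hypothetical result type, for illustration only. */
enum apic_write_kind {
    APIC_WRITE_OTHER,
    APIC_WRITE_INIT,
    APIC_WRITE_INIT_DEASSERT,
    APIC_WRITE_SIPI,
};

/* Classify a 32-bit write at 'offset' within the APIC page. */
static enum apic_write_kind classify_apic_write(uint32_t offset,
                                                uint32_t value)
{
    if (offset != APIC_ICR_LOW)
        return APIC_WRITE_OTHER;

    switch (value & ICR_DELIVERY_MASK) {
    case ICR_DELIVERY_INIT:
        /* Level-triggered with level = 0 is an INIT level de-assert. */
        if ((value & ICR_TRIGGER_LEVEL) && !(value & ICR_LEVEL_ASSERT))
            return APIC_WRITE_INIT_DEASSERT;
        return APIC_WRITE_INIT;
    case ICR_DELIVERY_SIPI:
        return APIC_WRITE_SIPI;
    default:
        return APIC_WRITE_OTHER;
    }
}
```

With this sketch, an ICR-low value of 0xC500 is classified as INIT,
0x8500 as an INIT de-assert, and 0x0608 as a SIPI with vector 0x08.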
As it is a micro hypervisor, it runs only one guest OS. Its main focus
is on device access monitoring/manipulation depending on the
configuration. It tries to avoid anything to do with interrupts as
much as possible.
In the meantime, I will try to dig deeper into KVM internals. Thank you
very much for suggesting a KVM-Unit-Test.
> > The problem does not happen when enable_apicv=N. Note that SMP
> > bringup with enable_apicv=N can fail. This is another problem. We
> > don't have to worry about this for now. Linux seems to have no
> > delay between INIT DEASSERT and SIPI during its SMP bringup. This
> > can easily leave INIT and SIPI pending together, resulting in a
> > lost signal.
> >
> > I admit that my knowledge of KVM and APICv is very limited. I may
> > misunderstand the problem. If you don't mind, could you point me
> > to the code paths I should pay attention to? I would love to learn
> > how to find the actual cause of the problem.
>
> KVM *should* emulate the APIC MMIO access from L2. The call stack
> should reach apic_mmio_write(), and assuming it's an ICR write, KVM
> should send an IPI.
When enable_apicv=N, interrupts work properly. This is why I wrote this
RFC patch.
Regarding the SMP bringup failure: when the L2 Linux guest runs on top
of L1 BitVisor, it does not rely on any KVM-specific features. In this
case, it seems to me that the vCPUs may not be able to change their
state to wait-for-SIPI in time once INIT is issued (possibly due to
scheduling?). This does not happen when BitVisor runs on real hardware.
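The suspected race can be illustrated with a toy model. This is a
guess expressed as code, not how KVM actually tracks pending events:
all names are hypothetical. The only architectural fact it relies on is
that a SIPI is accepted only by a processor that is in the
wait-for-SIPI state. The assumption is that INIT is merely recorded by
the sender and takes effect later, when the target vCPU gets to run.

```c
#include <stdbool.h>

/* Toy vCPU states, hypothetical names. */
enum toy_state { TOY_RUNNING, TOY_WAIT_FOR_SIPI };

struct toy_vcpu {
    enum toy_state state;
    bool init_pending;   /* INIT recorded but not yet processed */
    int sipi_vector;     /* last accepted SIPI vector, -1 if none */
};

/* Sender side: INIT is only recorded, the target handles it later. */
static void toy_send_init(struct toy_vcpu *v)
{
    v->init_pending = true;
}

/* Target side: runs whenever the vCPU thread gets scheduled. */
static void toy_process_events(struct toy_vcpu *v)
{
    if (v->init_pending) {
        v->init_pending = false;
        v->state = TOY_WAIT_FOR_SIPI;
    }
}

/*
 * SIPI is accepted only in wait-for-SIPI, otherwise it is dropped.
 * Returns true if the SIPI was accepted.
 */
static bool toy_send_sipi(struct toy_vcpu *v, int vector)
{
    if (v->state != TOY_WAIT_FOR_SIPI)
        return false;
    v->sipi_vector = vector;
    v->state = TOY_RUNNING;
    return true;
}
```

In this model, calling toy_send_init() and then toy_send_sipi() with no
toy_process_events() in between drops the SIPI, which mirrors a sender
that issues SIPI right after INIT without any delay. If the target runs
toy_process_events() first, the same SIPI is accepted.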
Once you have time to try BitVisor, please let me know if you can
reproduce the problem with the default configuration. Running with
-smp 8 or more on a machine with many cores might make the problem
easier to reproduce. I tested on an i5-13600K.