Re: [PATCH v2 0/3] KVM: x86: KVM_MEM_PCI_HOLE memory
From: Michael S. Tsirkin
Date: Mon Sep 07 2020 - 06:55:45 EST
On Thu, Sep 03, 2020 at 11:12:12PM -0700, Sean Christopherson wrote:
> On Wed, Sep 02, 2020 at 10:59:20AM +0200, Vitaly Kuznetsov wrote:
> > Peter Xu <peterx@xxxxxxxxxx> writes:
> > > My whole point was more about trying to understand the underlying problem.
> > > Providing a fast path for reading PCI holes seems reasonable as is;
> > > I'm just confused about why there are so many reads on the PCI holes
> > > in the first place. Another important question is how this series will
> > > ultimately help the microvm use case. I'm not sure I get the whole point
> > > of it, but... if microvm is the major use case, it would be good to
> > > provide some quick numbers for it if possible.
> > >
> > > For example, IIUC microvm uses qboot (as a better alternative to seabios) for
> > > fast boot, and qboot has:
> > >
> > > https://github.com/bonzini/qboot/blob/master/pci.c#L20
> > >
> > > I'm kind of curious whether qboot will still be used when this series is used
> > > with microvm VMs, since those accesses are still PIO based.
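(Aside: the "PIO based" part above is the legacy 0xCF8/0xCFC config
mechanism. A minimal sketch of such an access, not qboot's actual code,
with outl/inl standing in for the usual x86 port I/O helpers; note that
every read costs two port I/O exits no matter how cheap PCI hole MMIO
reads become:)

/*
 * Sketch only, not qboot's code: a legacy type-1 PCI config read via
 * the 0xCF8 address / 0xCFC data ports.
 */
#include <stdint.h>

static inline void outl(uint16_t port, uint32_t val)
{
        asm volatile("outl %0, %1" : : "a"(val), "Nd"(port));
}

static inline uint32_t inl(uint16_t port)
{
        uint32_t val;

        asm volatile("inl %1, %0" : "=a"(val) : "Nd"(port));
        return val;
}

static uint32_t pci_cfg_read32(uint8_t bus, uint8_t dev, uint8_t fn, uint8_t off)
{
        /* Bit 31 = enable; bus/dev/fn/register select the config dword. */
        outl(0xCF8, 0x80000000u |
                    ((uint32_t)bus << 16) | ((uint32_t)dev << 11) |
                    ((uint32_t)fn << 8) | (off & 0xFC));
        return inl(0xCFC);
}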
> >
> > I'm afraid there is no 'grand plan' for everything at this moment :-(
> > For traditional VMs 0.04 sec per boot is negligible and definitely not
> > worth adding a feature for; memory requirements are also very
> > different. When it comes to microvm-style usage, things change.
> > The '8193' PCI hole accesses I mention in the PATCH0 blurb come from
> > Linux alone, as I was doing a direct kernel boot; we can't do better
> > than that (if PCI is in the game, of course). Firmware (qboot,
> > seabios, ...) can only add more. I *think* the plan is to eventually
> > switch them all to MMCFG by default, at least for KVM guests, but we
> > need something to put into the advertisement.
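(To illustrate why MMCFG helps: with an ECAM window each config read is
a single MMIO access instead of a pair of port I/O exits. Sketch only,
with an assumed example base address; a real guest takes the base from
the ACPI MCFG table:)

/*
 * Sketch only: an ECAM/MMCONFIG config read. ECAM_BASE is an example
 * value chosen for illustration, not something QEMU guarantees.
 */
#include <stdint.h>

#define ECAM_BASE 0xB0000000UL  /* example value, illustration only */

static uint32_t mmcfg_read32(uint8_t bus, uint8_t dev, uint8_t fn, uint16_t off)
{
        volatile uint32_t *reg = (volatile uint32_t *)(ECAM_BASE +
                        ((uintptr_t)bus << 20) + ((uintptr_t)dev << 15) +
                        ((uintptr_t)fn << 12) + (off & 0xFFC));

        return *reg;
}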
>
> I see a similar count of PCI hole reads (~8k) with a -kernel boot w/ OVMF.
> All but 60 of those are from pcibios_fixup_peer_bridges(), and all are from
> the kernel. My understanding is that pcibios_fixup_peer_bridges() is useful
> if and only if there are multiple root buses. And AFAICT, when running under
> QEMU, the only way for there to be multiple root buses is if an explicit
> bridge is created ("pxb" or "pxb-pcie"). Based on the cover letter for
> those[*], the main reason for creating such a bridge is to handle pinned
> CPUs on a NUMA system with pass-through devices. That use case seems highly
> unlikely to cross paths with micro VMs, i.e. micro VMs will only ever have a
> single bus.
My position is that it's not all black and white; workloads do not
cleanly partition into those that care about boot speed and those
that don't. So IMHO we care about boot speed with PCIe even if
microvm does not use it at the moment.
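For reference, the peer-bridge scan Sean mentions boils down to
something like the loop below. This is a rough sketch, not the actual
arch/x86 code; pci_cfg_read32() is whatever config accessor the kernel
ends up using and LASTBUS stands in for pcibios_last_bus. Probing 255
extra buses times 32 slots for a vendor ID is what gets you to roughly
8k reads when no peer bus exists.

/*
 * Illustration only, not the real pcibios_fixup_peer_bridges(): walk
 * every bus after bus 0 and probe each slot for a valid vendor ID.
 */
#include <stdbool.h>
#include <stdint.h>

#define LASTBUS 255     /* worst case when nothing limits the scan */

uint32_t pci_cfg_read32(uint8_t bus, uint8_t dev, uint8_t fn, uint8_t off);

static bool bus_has_devices(uint8_t bus)
{
        for (uint8_t dev = 0; dev < 32; dev++) {
                uint32_t id = pci_cfg_read32(bus, dev, 0, 0x00);

                /* Anything but the usual "no device" patterns counts. */
                if (id != 0xFFFFFFFFu && id != 0x00000000u)
                        return true;
        }
        return false;
}

static void scan_for_peer_buses(void)
{
        /* 255 extra buses x 32 slots ~= 8k probes when nothing is there. */
        for (int bus = 1; bus <= LASTBUS; bus++)
                bus_has_devices(bus);
}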
> Unless I'm mistaken, microvm doesn't even support PCI, does it?
>
> If all of the above is true, this can be handled by adding "pci=lastbus=0"
> as a guest kernel param to override its scanning of buses. And couldn't
> that be done by QEMU's microvm_fix_kernel_cmdline() to make it transparent
> to the end user?
>
> [*] https://www.redhat.com/archives/libvir-list/2016-March/msg01213.html
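Appending that parameter from the machine code would be a tiny change.
A rough sketch of the idea, not the actual microvm_fix_kernel_cmdline()
code; the helper name below is hypothetical and only g_strdup_printf()
is a real GLib call:

/*
 * Hypothetical sketch, not QEMU code: tack "pci=lastbus=0" onto the
 * guest kernel command line so the guest never probes for peer root
 * buses. Caller owns (and g_free()s) the returned string.
 */
#include <glib.h>

static char *append_lastbus0(const char *cmdline)
{
        /* pci=lastbus=0 caps the legacy bus scan at bus 0. */
        return g_strdup_printf("%s pci=lastbus=0", cmdline);
}

Whether that belongs in microvm_fix_kernel_cmdline() or somewhere more
generic is a separate question, but it would keep the override
transparent to the end user, as Sean suggests.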