Re: [PATCH RFC 0/5] KVM: x86: KVM_MEM_ALLONES memory
From: Peter Xu
Date: Fri May 15 2020 - 07:15:49 EST
On Thu, May 14, 2020 at 06:03:20PM -0700, Andy Lutomirski wrote:
> On Thu, May 14, 2020 at 3:56 PM Sean Christopherson
> <sean.j.christopherson@xxxxxxxxx> wrote:
> >
> > On Thu, May 14, 2020 at 06:05:16PM -0400, Peter Xu wrote:
> > > On Thu, May 14, 2020 at 08:05:35PM +0200, Vitaly Kuznetsov wrote:
> > > > The idea of the patchset was suggested by Michael S. Tsirkin.
> > > >
> > > > PCIe config space can (depending on the configuration) be quite big but
> > > > usually is sparsely populated. Guest may scan it by accessing individual
> > > > device's page which, when device is missing, is supposed to have 'pci
> > > > holes' semantics: reads return '0xff' and writes get discarded. Currently,
> > > > userspace has to allocate real memory for these holes and fill them with
> > > > '0xff'. Moreover, different VMs usually require different memory.
> > > >
> > > > The idea behind the feature introduced by this patch is: let's have a
> > > > single read-only page filled with '0xff' in KVM and map it to all such
> > > > PCI holes in all VMs. This will free userspace of obligation to allocate
> > > > real memory and also allow us to speed up access to these holes as we
> > > > can aggressively map the whole slot upon first fault.
> > > >
> > > > RFC. I've only tested the feature with the selftest (PATCH5) on Intel/AMD
> > > > with and without EPT/NPT. I haven't tested memslot modifications yet.
> > > >
> > > > Patches are against kvm/next.
> > >
> > > Hi, Vitaly,
> > >
> > > Could this be done in userspace with existing techniques?
> > >
> > > E.g., shm_open() with a handle and fill one 0xff page, then remap it to
> > > anywhere needed in QEMU?
> >
> > Mapping that 4k page over and over is going to get expensive, e.g. each
> > duplicate will need a VMA and a memslot, plus any PTE overhead. If the
> > total sum of the holes is >2mb it'll even overflow the number of allowed
> > memslots.
>
> How about a tiny character device driver /dev/ones?
Yeah, this looks very clean.
Or I also like Sean's idea about using the slow path - I think the answer could
depend on a better understanding of the problem to solve (PCI scan for small VM
boots), to first justify that the fast path is required. E.g., could we even
work around that inefficient reading of 0xff's for our use case? After all, what
the BIOS really needs is not those 0xff's, but some other facts.
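For reference, a rough sketch of the userspace approach asked about earlier in
the thread (shm_open() one 0xff page, then map it wherever it is needed). The
helper names ones_page_create() and ones_region_map() and the shm object name
are made up for illustration, this is not what QEMU does today, and error
handling is minimal. On older glibc it needs -lrt for shm_open().

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Create a shared-memory object holding a single page of 0xff.
 * Returns an fd, or -1 on error. The object name is hypothetical.
 */
int ones_page_create(size_t page_size)
{
    char name[64];
    void *p;
    int fd;

    snprintf(name, sizeof(name), "/allones-sketch-%ld", (long)getpid());
    fd = shm_open(name, O_CREAT | O_EXCL | O_RDWR, 0600);
    if (fd < 0)
        return -1;
    shm_unlink(name);   /* the fd keeps the object alive */

    if (ftruncate(fd, (off_t)page_size) < 0) {
        close(fd);
        return -1;
    }

    p = mmap(NULL, page_size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) {
        close(fd);
        return -1;
    }
    memset(p, 0xff, page_size);
    munmap(p, page_size);
    return fd;
}

/*
 * Back npages of contiguous virtual address space with that one page,
 * read-only. Returns the base address, or NULL on error.
 */
void *ones_region_map(int fd, size_t page_size, size_t npages)
{
    size_t len = npages * page_size;
    unsigned char *base;
    size_t i;

    assert(page_size > 0 && npages > 0);

    /* Reserve the range first so the MAP_FIXED calls below are safe. */
    base = mmap(NULL, len, PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED)
        return NULL;

    for (i = 0; i < npages; i++) {
        /* Every slot maps file offset 0, i.e. the same physical page. */
        if (mmap(base + i * page_size, page_size, PROT_READ,
                 MAP_SHARED | MAP_FIXED, fd, 0) == MAP_FAILED) {
            munmap(base, len);
            return NULL;
        }
    }
    return base;
}
```

Since every mapping uses file offset 0, adjacent mappings cannot be merged, so
each 4k page ends up as its own VMA. That is the per-page cost Sean mentioned.
Registering the region with KVM is not shown here.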
Thanks!
--
Peter Xu