Re: [RFC PATCH 0/4] mm: Add PG_zero support

From: Alex Williamson
Date: Mon Apr 13 2020 - 10:49:33 EST


On Sun, 12 Apr 2020 18:43:07 -0700
Dave Hansen <dave.hansen@xxxxxxxxx> wrote:

> On 4/12/20 2:07 AM, liliangleo wrote:
> > Zeroing out page contents usually happens when allocating pages.
> > This is a time-consuming operation that makes pin and mlock
> > operations very slow, especially for a large batch of memory.
> >
> > This patch introduces a new feature to zero out pages before page
> > allocation, which can help speed up page allocation.
>
> I think the bar for getting something like this merged is going to be
> pretty high. We have a long history of zeroing close to page use for
> cache warmth reasons. Starting up big VMs which won't soon touch the
> memory they are allocating is basically the most pathological case
> against our approach since they don't *care* about cache warmth.
>
> I'm also not sure it's something we _want_ to optimize for.
>
> VFIO's unconditional page pinning is the real problem here IMNHO. They
> don't *really* need to pin the memory. We just don't have good
> paravirtualized IOMMU support or want to pay the runtime cost for
> pin/unpin operations. You *could* totally have speedy VM startup if
> only the pages being accessed or having DMA performed to them were
> allocated. But, the hacks that are in place mean that everything must
> be pinned.

Maybe in an SEV or Secure Boot environment we can assume the VM guest
OS uses the IOMMU exclusively for DMA, but otherwise the IOMMU is
optional (at least for x86; other archs do require IOMMU support
afaik). Therefore, how would we know which pages to pin when there are
only limited configs where we might be able to lean on the vIOMMU to
this extent? Thanks,

Alex