Re: [RFC][Patch v9 2/6] KVM: Enables the kernel to isolate guest free pages
From: Alexander Duyck
Date: Fri Mar 08 2019 - 13:06:28 EST
On Thu, Mar 7, 2019 at 6:32 PM Michael S. Tsirkin <mst@xxxxxxxxxx> wrote:
>
> On Thu, Mar 07, 2019 at 02:35:53PM -0800, Alexander Duyck wrote:
> > The only other thing I still want to try and see if I can do is to add
> > a jiffies value to the page private data in the case of the buddy
> > pages.
>
> Actually there's one extra thing I think we should do, and that is make
> sure we do not leave less than X% off the free memory at a time.
> This way chances of triggering an OOM are lower.
If nothing else we could probably look at doing a watermark of some
sort so we have to have X amount of memory free but not hinted before
we will start providing the hints. It would just be a matter of
tracking how much memory we have hinted on versus the amount of memory
that has been pulled from that pool. It is another reason why we
probably want a bit in the buddy pages somewhere to indicate if a page
has been hinted or not as we can then use that to determine if we have
to account for it in the statistics.
> > With that we could track the age of the page so it becomes
> > easier to only target pages that are truly going cold rather than
> > trying to grab pages that were added to the freelist recently.
>
> I like that but I have a vague memory of discussing this with Rik van
> Riel and him saying it's actually better to take away recently used
> ones. Can't see why would that be but maybe I remember wrong. Rik - am I
> just confused?
It is probably to cut down on the need for disk writes in the case of
swap. If that is the case it ends up being a trade off.
The sooner we hint the less likely it is that we will need to write a
given page to disk. However the sooner we hint, the more likely it is
we will need to trigger a page fault and pull back in a zero page to
populate the last page we were working on. The sweet spot will be that
period of time that is somewhere in between so we don't trigger
unnecessary page faults and we don't need to perform additional swap
reads/writes.