Re: [RFC PATCH 00/16] 1GB THP support on x86_64

From: Matthew Wilcox
Date: Wed Sep 09 2020 - 08:49:49 EST

On Wed, Sep 09, 2020 at 09:11:17AM -0300, Jason Gunthorpe wrote:
> On Tue, Sep 08, 2020 at 03:27:58PM +0100, Matthew Wilcox wrote:
> > I could also see there being an app which benefits from 1GB for
> > one mapping and prefers 2GB for a different mapping, so I think the
> > per-mapping madvise flag is best.
> I wonder if apps really care about the specific page size?
> Particularly from a portability view?

No, they don't. They just want to run as fast as possible ;-)

> The general app desire seems to be the need for 'efficient' memory (eg
> because it is highly accessed) and I suspect comes with a desire to
> populate the pages too.

The problem with a MAP_GOES_FASTER flag is that everybody sets it.
Any flag name needs to convey its drawbacks as well as its advantages.
Maybe MAP_EXTREMELY_COARSE_WORKINGSET would do that -- the VM will work
in terms of 1GB pages for this mapping, so any swap-out is going to take
out an entire 1GB at once.

But here's the thing ... we already allow

So if we're not doing THP, what's the point of this thread?
My understanding of THP is "Application doesn't need to change, kernel
makes a decision about what page size is best based on entire system
state and process's behaviour".

An madvise flag is a different beast; that's just letting the kernel
know what the app thinks its behaviour will be. The kernel can pay
as much (or as little) attention to that hint as it sees fit. And of
course, it can change over time (either by kernel release as we change
the algorithms, or simple from one minute to the next as more or less
memory comes available).