Re: [RFC PATCH 00/26] mm: reliable huge page allocator
From: Mel Gorman
Date: Fri Apr 21 2023 - 12:12:34 EST
On Wed, Apr 19, 2023 at 05:11:45AM +0100, Matthew Wilcox wrote:
> On Tue, Apr 18, 2023 at 03:12:47PM -0400, Johannes Weiner wrote:
> > This series proposes to make THP allocations reliable by enforcing
> > pageblock hygiene, and aligning the allocator, reclaim and compaction
> > on the pageblock as the base unit for managing free memory. All orders
> > up to and including the pageblock are made first-class requests that
> > (outside of OOM situations) are expected to succeed without
> > exceptional investment by the allocating thread.
> >
> > A neutral pageblock type is introduced, MIGRATE_FREE. The first
> > allocation to be placed into such a block claims it exclusively for
> > the allocation's migratetype. Fallbacks from a different type are no
> > longer allowed, and the block is "kept open" for more allocations of
> > the same type to ensure tight grouping. A pageblock becomes neutral
> > again only once all its pages have been freed.
>
> YES! This is exactly what I've been thinking is the right solution
> for some time. Thank you for doing it.
>
It was considered once upon a time and comes up every so often as variants
of a "sticky" pageblock pageblock bit that prevents mixing. The risks was
ending up in a context where memory within a suitable pageblock cannot
be freed and all of the available MOVABLE pageblocks have at least one
pinned page that cannot migrate from the allocating context. It can also
potentially hit a case where the majority of memory is UNMOVABLE pageblocks,
each of which has a single pagetable page that cannot be freed without an
OOM kill. Variants of issues like this would manifestas an OOM kill with
plenty of memory free bug or excessive CPu usage on reclaim or compaction.
It doesn't kill the idea of the series at all but it puts a lot of emphasis
in splitting the series by low-risk and high-risk. Maybe to the extent where
the absolute protection against mixing can be broken in OOM situations,
kernel command line or sysctl.
--
Mel Gorman
SUSE Labs