Re: [Lsf-pc] [LSF/MM/BPF TOPIC][RFC PATCH v4 00/27] Private Memory Nodes (w/ Compressed RAM)

From: Vlastimil Babka (SUSE)

Date: Mon Jun 15 2026 - 11:29:02 EST


On 6/15/26 17:18, David Hildenbrand (Arm) wrote:
> On 6/15/26 16:38, Vlastimil Babka (SUSE) wrote:
>> On 6/12/26 17:29, Gregory Price wrote:
>>> On Wed, Jun 10, 2026 at 04:12:52PM -0400, Gregory Price wrote:
>>>> ... snip ...
>>>>
>>>> I will still probably send the next RFC version tomorrow or friday,
>>>> as I want to get some eyes on the __GFP_PRIVATE-less pattern.
>>>>
>>>> Also, I made a new `anondax` driver which enables userland testing
>>>> of this functionality without any specialty hardware.
>>>>
>>>
>>> (apologies for the length of this email: this will all be covered in
>>> the coming cover letter, but I just wanted to share a bit of a preview)
>>>
>>> ===
>>>
>>> Just another small update - I am planning to post the RFC today once i
>>> get some mild cleanup done. It will be based on the dax atomic hotplug
>>>
>>> https://lore.kernel.org/linux-mm/20260605211911.2160954-1-gourry@xxxxxxxxxx/
>>>
>>> But a couple specific details regarding the memalloc pieces that i've
>>> learned the past couple of days playing with it.
>>>
>>> 1) memalloc_folio is required to ensure non-folio allocations don't land
>>> on the private node, even if it happens within a memalloc_private
>>> context. Since memalloc_folio may be useful in contexts outside of
>>> private nodes, I kept this as a separate flag.
>>>
>>> If we think there will *never* be additional users of memalloc_folio,
>>> then we could fold _folio into _private to save the flag for now and
>>> add it back when we actually need it.
>>>
>>> 2) memalloc_private is needed to unlock private nodes, but in the
>>> original NOFALLBACK-only design, you also needed __GFP_THISNODE.
>>>
>>> This is *highly* restrictive. I found when playing with mbind that
>>> MPOL_BIND + __GFP_THISNODE generates a WARN (valid WARN, it normally
>>> implies a bug).
>>>
>>> That leads me to #3
>>
>> I think the memalloc approach is dangerous due to unexpected nesting. There
>> might be nested page allocations in page allocation itself (due to some
>> debugging option). But also interrupts do not change what "current" points
>> to. Suddenly those could start requesting folios and/or private nodes and be
>> surprised, I'm afraid.
>
> Yeah, we'd need some way to distinguish the main allocation from these other
> (nested) allocations.

That goes against the very principle of scopes. And I don't see how, except
via a ... flag to the main allocation :D

>>
>> The memalloc scopes only work well when they restrict the context wrt
>> reclaim, and allocations in IRQ have to be already restricted heavily
>> (atomic) so further memalloc restrictions don't do anything in practice. But
>> to make them change other aspects of the allocations like this won't work.
>
> I was assuming that memalloc_pin_save() would already violate that, but really
> it only restricts where movable allocations land, and that doesn't matter for
> other kernel allocations.

Hm yeah, it's suboptimal, as it can turn a movable allocation unmovable. But
it shouldn't cause outright bugs.

> Do you see any other way to make something like an allocation context work, and
> avoid introducing more GFP flags?

Yeah, the idea of augmenting gfp flags with alloc_flags that are no longer
strictly internal to the page allocator seems like a way to achieve what we
need.
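
To make the nesting concern concrete, here is a rough userspace sketch, not
kernel code. Every name in it (struct task, scope_save(), alloc_scoped(),
alloc_flagged(), SCOPE_PRIVATE, ALLOC_PRIVATE) is made up for illustration and
does not correspond to any existing or proposed kernel interface. It only
models the difference between a policy read from task state and a policy
passed as an argument to the main allocation.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
#define SCOPE_PRIVATE 0x1u /* stand-in for a memalloc_private-style scope bit */
#define ALLOC_PRIVATE 0x1u /* stand-in for an explicit per-call alloc flag */

struct task {
	unsigned int scope_flags;
};

static struct task init_task;
static struct task *current_task = &init_task;

/* Counts allocations that were steered to the modelled "private node". */
static int private_node_hits;

/* Enter a scope: set the bit on the current task, return the old flags. */
static unsigned int scope_save(unsigned int bit)
{
	unsigned int old = current_task->scope_flags;

	current_task->scope_flags |= bit;
	return old;
}

/* Leave a scope: put back the flags returned by scope_save(). */
static void scope_restore(unsigned int old)
{
	current_task->scope_flags = old;
}

/*
 * Scope-based variant: the policy is read from task state, so a nested
 * allocation (modelled here as one made by the allocator itself, e.g. for
 * debugging metadata) sees the same scope as the main allocation.
 */
static void alloc_scoped(bool is_main)
{
	if (current_task->scope_flags & SCOPE_PRIVATE)
		private_node_hits++;
	if (is_main)
		alloc_scoped(false);
}

/*
 * Flag-based variant: the policy travels with the call, so the nested
 * allocation can pass 0 and is not affected by the caller's request.
 */
static void alloc_flagged(unsigned int alloc_flags, bool is_main)
{
	if (alloc_flags & ALLOC_PRIVATE)
		private_node_hits++;
	if (is_main)
		alloc_flagged(0, false);
}

/* One main allocation inside a private scope. */
static int run_scoped(void)
{
	unsigned int old;

	private_node_hits = 0;
	old = scope_save(SCOPE_PRIVATE);
	alloc_scoped(true);
	scope_restore(old);
	return private_node_hits;
}

/* One main allocation with the flag passed explicitly. */
static int run_flagged(void)
{
	private_node_hits = 0;
	alloc_flagged(ALLOC_PRIVATE, true);
	return private_node_hits;
}
```

In this model run_scoped() counts two private-node hits (the main allocation
plus the nested one), while run_flagged() counts one. An interrupt that
allocates while the scope is active would behave like the nested call in the
scoped variant, since "current" is unchanged.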