Re: [PATCH mm-unstable v2] mm/page_alloc: keep track of free highatomic
From: Vlastimil Babka
Date: Mon Oct 28 2024 - 07:13:40 EST
On 10/28/24 1:24 AM, Yu Zhao wrote:
> On Sun, Oct 27, 2024 at 3:05 PM Vlastimil Babka <vbabka@xxxxxxx> wrote:
>>
>> On 10/27/24 21:51, Yu Zhao wrote:
>>> On Sun, Oct 27, 2024 at 2:36 PM Vlastimil Babka <vbabka@xxxxxxx> wrote:
>>>>
>>>> On 10/27/24 21:17, Yu Zhao wrote:
>>>>> On Sun, Oct 27, 2024 at 1:53 PM Vlastimil Babka <vbabka@xxxxxxx> wrote:
>>>>>>
>>>>
>>>> For example:
>>>>
>>>> - a page is on pcplist in MIGRATE_MOVABLE list
>>>> - we reserve its pageblock as highatomic, which does nothing to the page on
>>>> the pcplist
>>>> - page above is flushed from pcplist to zone freelist, but it remembers it
>>>> was MIGRATE_MOVABLE, merges with another buddy/buddies from the
>>>> now-highatomic list, the resulting order-X page ends up on the movable
>>>> freelist despite being in highatomic pageblock. The counter of free
>>>> highatomic is now wrong wrt the freelist reality
>>>
>>> This is the part I don't follow: how is it wrong w.r.t. the freelist
>>> reality? The new nr_free_highatomic should reflect how many pages are
>>> exactly on free_list[MIGRATE_HIGHATOMIC], because it's updated
>>> accordingly.
>>
>> You'd have to try implementing your change in the kernel without that
>> migratetype hygiene series, and see how it would either not work, or you'd
>> end up implementing the series as part of that.
>
> A proper backport would need to track the MT of the free_list a page
> is deleted from, and I see no reason why in such a proper backport
> "the counter could drift easily" or "the counter of free highatomic is
> now wrong wrt the freelist reality". So I assume you actually mean
> this patch can't be backported cleanly? (Which I do agree.)
Yes you're right. But since we don't plan to backport it beyond 6.12,
sorry for sidetracking the discussion unnecessarily. More importantly,
is it possible to change the implementation as I suggested? [1] Hooking
to __del_page_from_free_list() and __add_to_free_list() means extra work
in every loop iteration in expand() and __free_one_page(). The
migratetype hygiene should ensure it's not necessary to intercept every
freelist add/move and hooking to account_freepages() should be
sufficient and in line with the intended design.
Thanks,
Vlastimil
[1]
https://lore.kernel.org/all/37a28ef7-e477-40b0-a8e4-3d74b747e323@xxxxxxx/