Re: [PATCH] mm: Introduce free_folio_and_swap_cache() to replace free_page_and_swap_cache()
From: Zi Yan
Date: Thu Apr 10 2025 - 14:56:03 EST
On 10 Apr 2025, at 14:25, Matthew Wilcox wrote:
> On Thu, Apr 10, 2025 at 02:16:09PM -0400, Zi Yan wrote:
>>> @@ -49,7 +49,7 @@ static inline bool __tlb_remove_page_size(struct mmu_gather *tlb,
>>> {
>>> VM_WARN_ON_ONCE(delay_rmap);
>>>
>>> - free_page_and_swap_cache(page);
>>> + free_folio_and_swap_cache(page_folio(page));
>>> return false;
>>> }
>>
>> __tlb_remove_page_size() is ruining the fun of the conversion. But it will be
>> converted to use folios eventually.
>
> Well, hm, I'm not sure. I haven't looked into this in detail.
> We have a __tlb_remove_folio_pages() which removes N pages but they must
> all be within the same folio:
>
> VM_WARN_ON_ONCE(page_folio(page) != page_folio(page + nr_pages - 1));
>
> but would we be better off just passing in the folio which contains the
> page and always flush all pages in the folio? It'd certainly simplify
> the "encoded pages" stuff since we'd no longer need to pass (page,
> length) tuples. But then, what happens if the folio is split between
> being added to the batch and the flush actually happening?
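(For reference, the "encoded pages" handling is roughly the following,
paraphrasing __tlb_remove_folio_pages_size() in mm/mmu_gather.c from
memory; the exact names and layout may differ by kernel version:)

int flags = delay_rmap ? ENCODED_PAGE_BIT_DELAY_RMAP : 0;

if (nr_pages == 1) {
	batch->encoded_pages[batch->nr++] = encode_page(page, flags);
} else {
	/* nr_pages travels in the next batch slot */
	flags |= ENCODED_PAGE_BIT_NR_PAGES_NEXT;
	batch->encoded_pages[batch->nr++] = encode_page(page, flags);
	batch->encoded_pages[batch->nr++] = encode_nr_pages(nr_pages);
}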
Apparently I did not read enough context before making my earlier
comment. __tlb_remove_page_size() is called by tlb_remove_page_size()
to check whether a TLB flush is needed; that path is used when zapping
PMDs and PUDs. __tlb_remove_folio_pages() does the same check when
zapping PTEs, covering both single-page folios and multiple pages
within a folio.
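Roughly, the two entry points look like this (paraphrased from
mm/mmu_gather.c; the exact signatures may vary between kernel
versions):

/* one page of any size -- the PMD/PUD zap path via tlb_remove_page_size() */
bool __tlb_remove_page_size(struct mmu_gather *tlb, struct page *page,
			    bool delay_rmap, int page_size);

/* nr_pages base pages, all within the same folio -- the PTE zap path */
bool __tlb_remove_folio_pages(struct mmu_gather *tlb, struct page *page,
			      unsigned int nr_pages, bool delay_rmap);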
On x86, __tlb_remove_page_size() and __tlb_remove_folio_pages() share
the same backend, __tlb_remove_folio_pages_size(), but on s390 they are
implemented separately.
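On the generic side the two are just thin wrappers over that backend,
along these lines (a simplified sketch of mm/mmu_gather.c; the batch
bookkeeping shown above lives inside __tlb_remove_folio_pages_size()):

bool __tlb_remove_page_size(struct mmu_gather *tlb, struct page *page,
			    bool delay_rmap, int page_size)
{
	/* a single page of the given size */
	return __tlb_remove_folio_pages_size(tlb, page, 1, delay_rmap,
					     page_size);
}

bool __tlb_remove_folio_pages(struct mmu_gather *tlb, struct page *page,
			      unsigned int nr_pages, bool delay_rmap)
{
	/* nr_pages consecutive base pages of the same folio */
	return __tlb_remove_folio_pages_size(tlb, page, nr_pages,
					     delay_rmap, PAGE_SIZE);
}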
Like you said, if a folio is split between being added to the batch and
being flushed, a flush-the-whole-folio function would miss part of the
original folio. That could be avoided by holding an extra pin on the
folio until the flush, but that hardly seems worth it. We will probably
have to live with the per-page flush.
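To make the race concrete, a purely hypothetical sketch of a folio-only
batch entry (struct mmu_gather_folio and flush_batched_folio() are made
up; free_folio_and_swap_cache() is the helper this patch adds):

/* hypothetical: the batch records only the folio */
struct mmu_gather_folio {
	struct folio *folio;
};

static void flush_batched_folio(struct mmu_gather_folio *ent)
{
	/*
	 * The folio describes itself only at flush time.  If it was split
	 * after the entry was queued, it now covers just the first piece
	 * of the original folio, so the split-off pages would be silently
	 * skipped here.
	 */
	free_folio_and_swap_cache(ent->folio);
}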
Best Regards,
Yan, Zi