Re: [RFC PATCH] mm/huge_memory: do not add dropped split tail folios to LRU

From: Andrew Morton

Date: Wed Jun 10 2026 - 16:30:10 EST

On Wed, 10 Jun 2026 20:05:35 +0800 "zhaoyang.huang" <zhaoyang.huang@xxxxxxxxxx> wrote:

> From: Zhaoyang Huang <zhaoyang.huang@xxxxxxxxxx>
>
> The kernel panics are keeping to be reported especially when the f2fs
> partition get almost full. By investigation, we find that the reason is
> one f2fs page got freed to buddy without being deleted from LRU and the
> root cause is the race happened in [2] which is enrolled by this commit.
> We solve this issue by reverting a f2fs commit 9609dd704725 ("f2fs: remove
> non-uptodate folio from the page cache in move_data_block").
>
> There are 3 race processes in this scenario, please find below for their
> main activities. However, by further investigation over the code, I
> think there is a common race window for the truncated folios between
> split_folio_to_order and folio_isolate_lru, where the folios lost the
> refcount on page cache and remains the transient one of the split
> caller, under which the folio could enter free path and compete with the
> isolation process. This commit would like to suggest to have the folios
> beyond EOF stay out of LRU.
>
> Truncate:
> The changed code in move_data_block() lets the GC path evict the tail-end
> folio from the page cache through folio_end_dropbehind(). Once
> folio_unmap_invalidate() removes the folio from mapping->i_pages, the
> page-cache references for all pages in the folio are dropped. The folio
> is then kept alive only by temporary external references, which allows a
> later split to operate on a folio whose subpages are no longer protected
> by page-cache references.
>
> Split:
> After the page-cache references are gone, split_folio_to_order() can
> split the big folio into individual pages and put the resulting subpages
> back on the LRU. For tail pages beyond EOF, split removes them from the
> page cache and drops their page-cache references. A tail page can then
> remain on the LRU with PG_lru set while holding only the split caller's
> temporary reference. When free_folio_and_swap_cache() drops that final
> reference, the page enters the final folio_put() release path.
>
> Isolate:
> In parallel, folio_isolate_lru() can observe the same tail page with a
> non-zero refcount and PG_lru set. It clears PG_lru before taking its own
> reference. If this races with the final folio_put() from the split path,
> __folio_put() sees PG_lru already cleared and skips lruvec_del_folio().
> The page is then freed back to the allocator while its lru links are
> still present in the LRU list. A later LRU operation on a neighboring
> page detects the stale link and reports list corruption.

Thanks. Sashiko AI review might have found some problems with folio
flags:

https://sashiko.dev/#/patchset/20260610120535.2370844-1-zhaoyang.huang@xxxxxxxxxx