Re: [PATCH v3] mm/shmem, swap: fix race of truncate and swap entry split

From: Kairui Song

Date: Wed Jan 28 2026 - 12:05:51 EST

On Wed, Jan 28, 2026 at 05:02:34AM +0800, Chris Mason wrote:
>
> > diff --git a/mm/shmem.c b/mm/shmem.c
> > --- a/mm/shmem.c
> > +++ b/mm/shmem.c
>
> [ ... ]
>
> > @@ -1191,12 +1203,23 @@
> > folio = fbatch.folios[i];
> >
> > if (xa_is_value(folio)) {
> > + int order;
> > long swaps_freed;
> >
> > if (unfalloc)
> > continue;
> > - swaps_freed = shmem_free_swap(mapping, indices[i], folio);
> > + swaps_freed = shmem_free_swap(mapping, indices[i],
> > + end - 1, folio);
> > if (!swaps_freed) {
> > + /*
> > + * If found a large swap entry cross the end border,
> > + * skip it as the truncate_inode_partial_folio above
> > + * should have at least zerod its content once.
> > + */
> > + order = shmem_confirm_swap(mapping, indices[i],
> > + radix_to_swp_entry(folio));
> > + if (order > 0 && indices[i] + (1 << order) > end)
> > + continue;
> > /* Swap was replaced by page: retry */
> > index = indices[i];
> > break;
>
> Could this code infinite loop if a large swap entry crosses the START
> boundary instead of the END boundary?

Thanks for the review! It's really helpful :)

> When shmem_free_swap() returns 0, it could be either because:
> (a) The entry was replaced by a page, or
> (b) The entry failed the boundary check (base < index ||
> base + nr_pages - 1 > end)
>
> The boundary check here only handles case (b) for entries crossing the
> END boundary:
>
> if (order > 0 && indices[i] + (1 << order) > end)
> continue;
>
> But what happens if the entry crosses the START boundary? If
> find_get_entries() returns a large swap entry at indices[i] where
> the entry's base (calculated as indices[i] & ~((1 << order) - 1)) is
> less than the truncation start point, then shmem_free_swap() will
> return 0 due to the "base < index" check. The code will then call
> shmem_confirm_swap(), get the order, check if it crosses the END
> boundary (which it doesn't), and retry with the same index:
>
> index = indices[i];
> break;
>
> The next iteration will find the same entry again at the same index,
> leading to an infinite loop. For example:
>
> - Truncating range [18, 30]
> - Large swap entry at [16, 23] (order 3, 8 pages)
> - indices[i] = 18
> - shmem_free_swap() sees base=16 < index=18, returns 0
> - Check: 18 + 8 > 30 is false (26 <= 30)
> - Retries with index=18
> - Loop repeats indefinitely

I think this is a valid issue. And it's worse than that, during the `while (index < end)` loop a new large entry can land anywhere in the range, if one interaction's starting `index` points to the middle of any large entry, an infinite loop will occur: indices[0] are always equal to the `index` iteration value of that moments, shmem_free_swap will fail because the swap entry's index doesn't match indices[0], and so the `index = indices[i]; break;` keep it loop forever.

The chance seems very low though.

> Should the boundary check also handle the START case, perhaps:
>
> if (order > 0) {
> pgoff_t base = indices[i] & ~((1UL << order) - 1);
> if (base + (1 << order) - 1 > end || base < start)
> continue;
> }

This still doesn't cover the case when a new large entry somehow lands in the range during the loop.

> where 'start' is preserved from before the loop?

How about following patch: