Re: [PATCH] mm/shmem, swap: fix race of truncate and swap entry split
From: Baolin Wang
Date: Tue Jan 13 2026 - 20:46:14 EST
On 1/13/26 6:10 PM, Kairui Song wrote:
On Tue, Jan 13, 2026 at 3:16 PM Baolin Wang
<baolin.wang@xxxxxxxxxxxxxxxxx> wrote:
Hi Kairui,
Sorry for late reply.
No problem, I was also quite busy with other works :)
Yes, so I just mentioned your swapoff case.
Actually, the real question is how to handle the case where a large swap
entry happens to cross the 'end' when calling shmem_truncate_range(). If
the shmem mapping stores a folio, we would split that large folio by
truncate_inode_partial_folio(). If the shmem mapping stores a large swap
entry, then as you noted, the truncation range can indeed exceed the 'end'.
But with your change, that large swap entry would not be truncated, and
I’m not sure whether that might cause other issues. Perhaps the best
approach is to first split the large swap entry and only truncate the
swap entries within the 'end' boundary like the
truncate_inode_partial_folio() does.
Right... I was thinking that the shmem_undo_range iterates the undo
range twice IIUC, in the second try it will retry if shmem_free_swap
returns 0:
swaps_freed = shmem_free_swap(mapping, indices[i], end - indices[i], folio);
if (!swaps_freed) {
/* Swap was replaced by page: retry */
index = indices[i];
break;
}
So I thought shmem_free_swap returning 0 is good enough. Which is not,
it may cause the second loop to retry forever.
After further investigation, I think your original fix seems to be the
right direction, as the second loop’s find_lock_entries() will filter
out large swap entries crossing the 'end' boundary. Sorry for noise.
See the code in find_lock_entries() (Thanks to Hugh:))
} else {
nr = 1 << xas_get_order(&xas);
base = xas.xa_index & ~(nr - 1);
/* Omit order>0 value which begins before the start */
if (base < *start)
continue;
/* Omit order>0 value which extends beyond the end */
if (base + nr - 1 > end)
break;
}
Then the shmem_get_partial_folio() will swap-in the large swap entry and
split the large folio which crosses the 'end' boundary.
Right, thanks for the info.
But what about find_get_entries under whole_folios? Even though a
large entry is splitted before that, a new large entry that crosses
`end` could appear after that and before find_get_entries, and return
by find_get_entries.
Yes, another corner case:(
I think we could just skip large entries that cross `end` in the
second loop, since if the entry exists before truncate, it must have
been split. We can ignore newly appeared entries.
Sounds reasonable to me. Just as we don’t discard the entire folio when a large folio split fails by updating the 'end':
if (!truncate_inode_partial_folio(folio, lstart, lend))
end = folio->index;
If that's OK I can send two patches, one to ignore the large entries
in the second loop, one to fix shmem_free_swap following your
suggestion in this reply.
Please do. Thanks.