Re: [PATCH] mm/huge_memory: fix a folio_split() race condition with folio_try_get()
From: Lorenzo Stoakes
Date: Tue Mar 03 2026 - 05:02:33 EST
On Mon, Mar 02, 2026 at 11:30:39AM -0500, Zi Yan wrote:
> On 2 Mar 2026, at 8:30, Lorenzo Stoakes wrote:
>
> > On Fri, Feb 27, 2026 at 08:06:14PM -0500, Zi Yan wrote:
> >> During a pagecache folio split, the values in the related xarray should not
> >> be changed from the original folio at xarray split time until all
> >> after-split folios are well formed and stored in the xarray. Current use
> >> of xas_try_split() in __split_unmapped_folio() lets some after-split folios
> >> show up at wrong indices in the xarray. When these misplaced after-split
> >> folios are unfrozen, before correct folios are stored via __xa_store(), and
> >> grabbed by folio_try_get(), they are returned to userspace at wrong file
> >> indices, causing data corruption.
> >>
> >> Fix it by using the original folio in xas_try_split() calls, so that
> >> folio_try_get() can get the right after-split folios after the original
> >> folio is unfrozen.
> >>
> >> Uniform split, split_huge_page*(), is not affected, since it uses
> >> xas_split_alloc() and xas_split() only once and stores the original folio
> >> in the xarray.
> >>
> >> The Fixes tag below points to the commit that introduced the code, but
> >> folio_split() is used in a later commit 7460b470a131f ("mm/truncate: use
> >> folio_split() in truncate operation").
> >>
> >> Fixes: 00527733d0dc8 ("mm/huge_memory: add two new (not yet used) functions for folio_split()")
> >> Reported-by: Bas van Dijk <bas@xxxxxxxxxxx>
> >> Closes: https://lore.kernel.org/all/CAKNNEtw5_kZomhkugedKMPOG-sxs5Q5OLumWJdiWXv+C9Yct0w@xxxxxxxxxxxxxx/
> >> Signed-off-by: Zi Yan <ziy@xxxxxxxxxx>
> >> Cc: <stable@xxxxxxxxxxxxxxx>
> >> ---
> >> mm/huge_memory.c | 9 ++++++++-
> >> 1 file changed, 8 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> >> index 56db54fa48181..e4ed0404e8b55 100644
> >> --- a/mm/huge_memory.c
> >> +++ b/mm/huge_memory.c
> >> @@ -3647,6 +3647,7 @@ static int __split_unmapped_folio(struct folio *folio, int new_order,
> >> const bool is_anon = folio_test_anon(folio);
> >> int old_order = folio_order(folio);
> >> int start_order = split_type == SPLIT_TYPE_UNIFORM ? new_order : old_order - 1;
> >> + struct folio *origin_folio = folio;
> >
> > NIT: 'origin' folio is a bit ambiguous, maybe old_folio, since it is of order old_order?
>
> OK, will rename it.
Thanks
>
> >
> >> int split_order;
> >>
> >> /*
> >> @@ -3672,7 +3673,13 @@ static int __split_unmapped_folio(struct folio *folio, int new_order,
> >> xas_split(xas, folio, old_order);
> >
> > Aside, but this 'if (foo) bar(); else { ... }' pattern is horrible, think it's
> > justifiable to put both in {}... :)
>
> I can fix it along with this. It should not cause much trouble during backport.
Thanks!
>
> >
> >> else {
> >> xas_set_order(xas, folio->index, split_order);
> >> - xas_try_split(xas, folio, old_order);
> >> + /*
> >> + * use the original folio, so that a parallel
> >> + * folio_try_get() waits on it until xarray is
> >> + * updated with after-split folios and
> >> + * the original one is unfrozen.
> >> + */
> >> + xas_try_split(xas, origin_folio, old_order);
> >
> > Hmm, but won't we have already split the original folio by now? So is
> > origin_folio/old_folio a pointer to what was the original folio but now is
> > that but with weird tail page setup? :) like:
> >
> > |------------------------|
> > | f |
> > |------------------------|
> > ^old_folio ^ split_at
> >
> > |-----------|------------|
> > | f | f2 |
> > |-----------|------------|
> > ^old_folio
> >
> > |-----------|-----|------|
> > | f | f3 | f4 |
> > |-----------|-----|------|
> > ^old_folio
>
> This should be:
>
> |-----------|-----|------|
> | f | f2 | f3 |
> |-----------|-----|------|
> ^old_folio
>
> after split, the head page of f2 does not change,
> so f2 becomes f2,f3, where f3 is the tail page
> in the middle.
Right, I mean from the perspective of looking at f we'd only see f + some weird
stuff in tail pages, until order is updated?
>
> >
> > etc.
> >
> > So the xarray would contain:
> >
> > |-----------|-----|------|
> > | f | f | f |
> > |-----------|-----|------|
>
> This is the expected xarray state.
>
> >
> > Wouldn't it after this?
> >
> > Oh I guess before it'd contain:
> >
> > |-----------|-----|------|
> > | f | f4 | f4 |
> > |-----------|-----|------|
> >
> > Right?
>
> You got the gist of it. The reality (see the fix above) is
>
> |-----------|-----|------|
> | f | f2 | f3 |
> |-----------|-----|------|
>
> But when another split comes at f3, the xarray becomes
>
> |-----------|-----|---|---|
> | f | f2 |f3 | f3|
> |-----------|-----|---|---|
>
> due to how xas_try_split() works. Yeah, feel free to
> blame me, since when I wrote xas_try_split(), I did
> not get into all the details. I am planning to
> change xas_try_split() so that the xarray will become
>
> |-----------|-----|---|---|
> | f | f2 |f3 | f4|
> |-----------|-----|---|---|
Ah ok I see :)
>
>
> >
> >
> > You're saying you'll later put the correct xas entries in post-split. Where does
> > that happen?
>
> After __split_unmapped_folio(), when __xa_store() is performed.
Thanks!
>
> >
> > And why was it a problem when these new folios were unfrozen?
> >
> > (Since the folio is a pointer to an offset in the vmemmap)
> >
> > I guess if you update that later in the xas, it's ok, and everything waits on
> > the right thing so this is probably fine, and the f4 f4 above is probably not
> > fine...
> >
> > I'm guessing the original folio is kept frozen during the operation?
>
> Right. f is kept frozen until the entire xarray is updated. But if the xarray
> is like (before the fix)
>
> |-----------|-----|---|---|
> | f | f2 |f3 | f3|
> |-----------|-----|---|---|
>
> the code after __split_unmapped_folio()
> 1. unfreezes f2, __xa_store(f2)
> 2. unfreezes f3, __xa_store(f3)
> 3. unfreezes f4, __xa_store(f4), which overwrites the second f3 to f4,
>
> and a parallel folio_try_get() that looks at the second f3 at step 2
> sees that f3 is unfrozen, then gives f3 to the user when it should have
> given f4. It only happens when the split is in the second half of the
> old folio.
Nasty...!
Great, thanks for having the patience to explain it to me :)
>
> >
> > Anyway, please help with my confusion, I'm not so familiar with this code :)
> >
>
> Let me know if you have any more questions.
Perfect, appreciated :) I think we're good.
>
> >
> >> if (xas_error(xas))
> >> return xas_error(xas);
> >> }
> >> --
> >> 2.51.0
> >>
> >
> > Thanks, Lorenzo
>
>
> Best Regards,
> Yan, Zi
Cheers, Lorenzo