On Sun, Mar 23, 2025 at 01:49:07PM +0100, David Hildenbrand wrote:
c) In -next, there is now the option to use folio lock +
folio_maybe_mapped_shared() == false. But it doesn't tell you how many
VMAs a large folio is mapped into.
In the following case:
[ folio ]
[ VMA#1 ] [ VMA#2 ]
c) would not tell you if you are fine modifying the folio when moving VMA#2.
Right, I feel like prior checks made should assert this is not the case,
however? But mapcount check should be a last ditch assurance?
Something nice might be hiding in c) that might be able to handle a single
folio being covered by multiple vmas.
I was thinking about the following:
[ folio0 ]
[ VMA#0 ]
Then we do a partial (old-school) mremap()
[ folio0 ] [ folio0 ]
[ VMA#1 ] [ VMA#2 ]
To then extend VMA#1 and fault in pages
[ folio0 ][ folio1 ] [ folio0 ]
[ VMA#1 ] [ VMA#2 ]
If that is possible (did not try! Maybe something prevents us from
extending VMA#1), mremap(MREMAP_RELOCATE_ANON) of VMA#1 / VMA#2 cannot work.
We'd have to detect that scenario (partial mremap). You might be doing that
with the anon-vma magic, something different might be: Assume we flag large
folios if they were partially mremapped in any process.
Do we have spare folio flags? :)) I always lose track of the situation with this
and Matthew's levels of tolerance for it :P
Then (with folio lock only)
1) folio_maybe_mapped_shared() == false: mapped into single process
2) folio_maybe_partially_mremaped() == false: not scattered in virtual
address space
It would be sufficient to check if the folio fully falls into the mremap()
range to decide if we can adjust the folio index etc.
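
To make the proposed check concrete, here is a rough user-space model of it.
This is only a sketch: struct folio_model, MODEL_PAGE_SIZE and both helpers are
made-up names for illustration, not kernel API. The two bool fields merely
stand in for the results of folio_maybe_mapped_shared() and the suggested
folio_maybe_partially_mremaped(), and the folio lock is assumed to be held.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_PAGE_SIZE 4096UL	/* assumed base page size for the model */

/* Hypothetical stand-in for the bits of a folio the check cares about. */
struct folio_model {
	unsigned long vaddr;		/* address the folio's first page is mapped at */
	unsigned long nr_pages;		/* number of base pages in the folio */
	bool maybe_mapped_shared;	/* models folio_maybe_mapped_shared() */
	bool maybe_partially_mremapped;	/* models the proposed "scattered" flag */
};

/* True if the whole folio lies inside [start, start + len). */
static bool folio_within_range(const struct folio_model *f,
			       unsigned long start, unsigned long len)
{
	unsigned long folio_end = f->vaddr + f->nr_pages * MODEL_PAGE_SIZE;

	return f->vaddr >= start && folio_end <= start + len;
}

/* Models "can we adjust the folio index etc." for a moved range. */
static bool can_relocate_folio(const struct folio_model *f,
			       unsigned long start, unsigned long len)
{
	assert(f != NULL);

	if (f->maybe_mapped_shared)		/* 1) mapped by more than one process */
		return false;
	if (f->maybe_partially_mremapped)	/* 2) scattered in virtual address space */
		return false;
	return folio_within_range(f, start, len);
}
```

With an exclusive, non-scattered order-9 folio mapped at 2 MiB, moving the
whole 2 MiB range passes the check, while moving only the first 1 MiB of it
does not.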
We *might* be able to use that in the COW-reuse path for large folios to
perform a folio_move_anon_rmap(), which we currently only perform for small
folios / PMD-mapped folios (single mapping). Not sure yet if actually
multiple VMAs are involved.
Interesting... this is the wp_can_reuse_anon_folio() stuff? I'll have a look
into that!
I'm more concerned about partial cases though, e.g.:
mremap this
<----------->
[ folio0 ]
[ VMA#0 ]
I mean, I'm leaning more towards just breaking up the folio, especially if we
consider a case like a biiig range:
mremap this
<--------------------------------------------------->
[ folio0 ][ folio1 ][ folio2 ][ folio3 ][ folio4 ][ folio5 ] (say order-9 each)
[ VMA#0 ]
Then at this point, refusing to do the whole thing seems maybe a bad idea, so
splitting the folios at the edges (folio0 and folio5) might be sensible.
I guess a user is saying 'please, I really care about merging' so might well be
willing to tolerate losing some of the huge page benefits, at least at the edges
here.
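
As a rough illustration of "only the edges need splitting", the sketch below
models folios and the mremap range as page-index intervals and counts the
folios that overlap the range without being fully contained in it. Everything
here (struct range_model, folio_needs_split(), count_splits()) is a
hypothetical user-space model, not kernel code, and it says nothing about
whether a split would actually succeed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Half-open interval [start, end), in units of base pages. */
struct range_model {
	unsigned long start;
	unsigned long end;
};

/* A folio needs splitting if the move covers only part of it. */
static bool folio_needs_split(struct range_model folio, struct range_model move)
{
	bool overlaps = folio.start < move.end && move.start < folio.end;
	bool contained = folio.start >= move.start && folio.end <= move.end;

	return overlaps && !contained;
}

/* Count how many folios straddle a boundary of the moved range. */
static size_t count_splits(const struct range_model *folios, size_t nr,
			   struct range_model move)
{
	size_t i, splits = 0;

	for (i = 0; i < nr; i++)
		if (folio_needs_split(folios[i], move))
			splits++;
	return splits;
}
```

For six back-to-back order-9 folios (512 pages each) and a moved range that
starts halfway into the first and ends halfway into the last, only those two
edge folios are counted; the four in the middle are left alone.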