On 20/01/2025 12:54, David Hildenbrand wrote:
I think the one problem that emerged during review of Dev's series, which we don't
have a proper solution to yet, is the issue of "creep", where regions can be
collapsed to progressively higher orders through iterative scans. At each
collapse, the required thresholds (e.g. max_ptes_none) are met, and the collapse
effectively adds more non-none ptes so the next scan will then collapse to even
higher order. Does your solution suffer from this (theoretical/edge case) issue?
If not, how did you solve it?
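
To make the creep scenario concrete, here is a minimal toy model in Python. It is a sketch under assumptions, not the code from either series: it assumes the threshold for a given order is max_ptes_none shifted right by (PMD order - order), it tracks only which PTEs are populated (not folio orders), and the names scaled_max_ptes_none, scan_once and simulate are made up for illustration.

```python
PMD_ORDER = 9
PMD_NR_PAGES = 1 << PMD_ORDER  # 512 PTEs per PMD with 4K pages


def scaled_max_ptes_none(max_ptes_none, order):
    # Assumed scaling rule: shrink the PMD-level tunable to the block size.
    return max_ptes_none >> (PMD_ORDER - order)


def scan_once(present, max_ptes_none):
    """Collapse the first block that qualifies, trying the highest order first.

    Returns the order that was collapsed, or None if nothing qualified.
    """
    for order in range(PMD_ORDER, 0, -1):
        size = 1 << order
        limit = scaled_max_ptes_none(max_ptes_none, order)
        for start in range(0, PMD_NR_PAGES, size):
            none = size - sum(present[start:start + size])
            # Need something to fill in, at least one populated PTE,
            # and no more none PTEs than the scaled threshold allows.
            if 0 < none < size and none <= limit:
                for i in range(start, start + size):
                    present[i] = True  # collapse populates the whole block
                return order
    return None


def simulate(max_ptes_none, max_scans=16):
    """Start with a single populated PTE and scan repeatedly."""
    present = [False] * PMD_NR_PAGES
    present[0] = True
    orders = []
    for _ in range(max_scans):
        order = scan_once(present, max_ptes_none)
        if order is None:
            break
        orders.append(order)
    return orders, sum(present)


if __name__ == "__main__":
    for value in (0, 255, 256, 511):
        print(value, simulate(value))
```

In this model, max_ptes_none = 256 turns one populated page into a fully populated PMD over nine scans, one order at a time, while 255 never collapses anything. The values 0 and 511 do not creep: 0 never fills in none PTEs, and 511 goes straight to PMD order in a single scan.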
Yes, sadly it suffers from the same issue. Bringing max_ptes_none much
lower as a default would "help".
Can we just keep it simple and only support max_ptes_none = 511 ("pagefault
behavior" -- PMD_NR_PAGES - 1) or max_ptes_none = 0 ("deferred behavior") and
document that the other weird configurations will make mTHP skip, because "weird
and unexpected"? :)
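
A rough Python sketch of what that gating could look like, purely to illustrate the proposal. The function names are hypothetical, and the per-order limit for the "pagefault" case (any block with at least one populated PTE qualifies) is my reading of the suggestion rather than something stated in this thread.

```python
PMD_ORDER = 9
PMD_NR_PAGES = 1 << PMD_ORDER  # 512 PTEs per PMD with 4K pages


def mthp_collapse_mode(max_ptes_none):
    """Map the tunable to one of the two supported behaviors, or None."""
    if max_ptes_none == PMD_NR_PAGES - 1:
        return "pagefault"
    if max_ptes_none == 0:
        return "deferred"
    return None  # any other value: mTHP collapse is skipped


def allowed_none_ptes(max_ptes_none, order):
    """Number of none PTEs tolerated in a block of the given order.

    Returns None when mTHP collapse should be skipped entirely.
    """
    mode = mthp_collapse_mode(max_ptes_none)
    if mode is None:
        return None
    if mode == "deferred":
        return 0
    # Assumed "pagefault" semantics: everything but one PTE may be none.
    return (1 << order) - 1


if __name__ == "__main__":
    for value in (0, 64, 511):
        print(value, mthp_collapse_mode(value), allowed_none_ptes(value, 4))
```

With only these two settings there is no intermediate threshold for repeated scans to climb through, which is the point of the simplification.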
That sounds like a great simplification in principle! We would need to consider
the swap and shared tunables too though. Perhaps we can pull a similar trick
with those?