Re: [REGRESSION] mm/mprotect: 2x+ slowdown for >=400KiB regions since PTE batching (cac1db8c3aad)
From: Pedro Falcato
Date: Tue Feb 17 2026 - 13:08:13 EST
On Tue, Feb 17, 2026 at 12:43:38PM -0500, Luke Yang wrote:
> On Mon, Feb 16, 2026 at 03:42:08PM +0530, Dev Jain wrote:
> >
> > On 13/02/26 10:56 pm, David Hildenbrand (Arm) wrote:
> > > On 2/13/26 18:16, Suren Baghdasaryan wrote:
> > >> On Fri, Feb 13, 2026 at 4:24 PM Pedro Falcato <pfalcato@xxxxxxx> wrote:
> > >>>
> > >>> On Fri, Feb 13, 2026 at 04:47:29PM +0100, David Hildenbrand (Arm) wrote:
> > >>>>
> > >>>> Hi!
> > >>>>
> > >>>>
> > >>>> Micro-benchmark results are nice. But what is the real world impact?
> > >>>> IOW, why should we care?
> > >>>
> > >>> Well, mprotect is widely used in thread spawning, code JITting,
> > >>> and even process startup. And we don't want to pay for a feature we can't
> > >>> even use (on x86).
> > >>
> > >> I agree. When I straced Android's zygote a while ago, mprotect() came
> > >> up #30 in the list of most frequently used syscalls and one of the
> > >> most used mm-related syscalls due to its use during process creation.
> > >> However, I don't know how often it's used on VMAs of size >=400KiB.
> > >
> > > See my point? :) If this is apparently so widespread then finding a real
> > > reproducer is likely not a problem. Otherwise it's just speculation.
> > >
> > > It would also be interesting to know whether the reproducer ran with any
> > > sort of mTHP enabled or not.
> >
> > Yes. Luke, can you experiment with the following microbenchmark:
> >
> > https://pastebin.com/3hNtYirT
> >
> > and see if there is an optimization for pte-mapped 2M folios, before and
> > after the commit?
> >
> > (set transparent_hugepage/enabled=always, hugepages-2048kB/enabled=always)
>
Since you're testing stuff, could you please test the changes in:
https://github.com/heatd/linux/tree/mprotect-opt ?
Not posting them yet since merge window, etc. Plus I think there's some
further optimization work we can pull off.
With the benchmark in https://gist.github.com/heatd/25eb2edb601719d22bfb514bcf06a132
(compiled with g++ -O2 file.cpp -lbenchmark, needs google/benchmark) I've measured
about an 18% speedup with the patches applied versus the original.
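
For a quick sanity check without google/benchmark, a minimal sketch of this
kind of measurement could look like the following. It is NOT the benchmark
from the gist: the function name time_mprotect_ns, the choice of toggling
between PROT_READ and PROT_READ|PROT_WRITE, and the plain steady_clock timing
loop are all assumptions made for illustration, and the numbers will be
noisier than what google/benchmark reports.

```cpp
// Hypothetical sketch, not the benchmark from the gist referenced above.
#include <sys/mman.h>

#include <chrono>
#include <cstddef>
#include <cstring>

// Times `iters` pairs of mprotect() calls (PROT_READ, then
// PROT_READ|PROT_WRITE) over an anonymous private mapping of `len` bytes.
// Returns the average nanoseconds per mprotect() call, or -1.0 on failure.
double time_mprotect_ns(std::size_t len, int iters) {
    if (len == 0 || iters <= 0) return -1.0;

    void* p = mmap(nullptr, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) return -1.0;

    // Touch every page so the PTEs are populated; otherwise mprotect()
    // has almost no page-table work to do.
    std::memset(p, 1, len);

    bool ok = true;
    auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < iters && ok; ++i) {
        ok = mprotect(p, len, PROT_READ) == 0 &&
             mprotect(p, len, PROT_READ | PROT_WRITE) == 0;
    }
    auto stop = std::chrono::steady_clock::now();

    munmap(p, len);
    if (!ok) return -1.0;

    std::chrono::duration<double, std::nano> elapsed = stop - start;
    return elapsed.count() / (2.0 * iters);
}
```

Calling e.g. time_mprotect_ns(400 * 1024, 10000) on kernels with and without
the patches gives a rough per-call figure for the region size from the
subject line; pinning the process to one CPU and repeating the run a few
times helps with the noise.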
--
Pedro