Re: [REGRESSION] mm/mprotect: 2x+ slowdown for >=400KiB regions since PTE batching (cac1db8c3aad)

From: David Hildenbrand (Arm)

Date: Fri Feb 13 2026 - 12:26:57 EST


On 2/13/26 18:16, Suren Baghdasaryan wrote:
> On Fri, Feb 13, 2026 at 4:24 PM Pedro Falcato <pfalcato@xxxxxxx> wrote:
>>
>> On Fri, Feb 13, 2026 at 04:47:29PM +0100, David Hildenbrand (Arm) wrote:
>>>
>>> Hi!
>>>
>>> Micro-benchmark results are nice. But what is the real world impact? IOW, why
>>> should we care?
>>
>> Well, mprotect is widely used in thread spawning, code JITting,
>> and even process startup. And we don't want to pay for a feature we can't
>> even use (on x86).
>
> I agree. When I straced Android's zygote a while ago, mprotect() came
> up #30 in the list of most frequently used syscalls and one of the
> most used mm-related syscalls due to its use during process creation.
> However, I don't know how often it's used on VMAs of size >=400KiB.

See my point? :) If this is apparently so widespread then finding a real reproducer is likely not a problem. Otherwise it's just speculation.

It would also be interesting to know whether the reproducer ran with any sort of mTHP enabled or not.



>> In any case, I think I see the problem. Namely, that we now need to call
>> vm_normal_folio() for every single PTE (this seems similar to the mremap
>> problem caught in 0b5be138ce00f421bd7cc5a226061bd62c4ab850). I'll try to
>> draft up a patch over the weekend if I can.

I think we discussed that extensively during review and fixups of the commit in question. You might want to dig through that, because I could have sworn we already discussed how to optimize this.

When going from none -> writable, we always did a vm_normal_folio() lookup for anonymous folios. For the other direction, we did not.
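
The two directions mentioned above can be timed separately from user space. What follows is only a minimal sketch, assuming Linux and a private anonymous mapping that has been fully faulted in; time_mprotect_toggle() is a hypothetical helper written for illustration, not the reproducer from the original report. Each measurement includes the cost of the clock_gettime() calls, so it is only meaningful as a relative comparison.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <time.h>

/* Nanoseconds elapsed between two CLOCK_MONOTONIC timestamps. */
static double elapsed_ns(const struct timespec *a, const struct timespec *b)
{
    return (double)(b->tv_sec - a->tv_sec) * 1e9 +
           (double)(b->tv_nsec - a->tv_nsec);
}

/*
 * Hypothetical helper (not from the thread): map `len` bytes of anonymous
 * memory, fault every page in, then flip the protection `iters` times in
 * each direction. The average cost per mprotect() call is stored in
 * *to_rw_ns (PROT_NONE -> read/write) and *to_none_ns (read/write ->
 * PROT_NONE). Returns 0 on success, -1 on failure.
 */
int time_mprotect_toggle(size_t len, int iters,
                         double *to_rw_ns, double *to_none_ns)
{
    struct timespec t0, t1;
    double rw = 0.0, none = 0.0;
    int ret = -1;

    if (len == 0 || iters <= 0 || !to_rw_ns || !to_none_ns)
        return -1;

    char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    /* Write to every byte so that all PTEs in the range are populated. */
    memset(p, 1, len);

    for (int i = 0; i < iters; i++) {
        /* Direction 1: read/write -> PROT_NONE. */
        clock_gettime(CLOCK_MONOTONIC, &t0);
        if (mprotect(p, len, PROT_NONE) != 0)
            goto out;
        clock_gettime(CLOCK_MONOTONIC, &t1);
        none += elapsed_ns(&t0, &t1);

        /* Direction 2: PROT_NONE -> read/write. */
        clock_gettime(CLOCK_MONOTONIC, &t0);
        if (mprotect(p, len, PROT_READ | PROT_WRITE) != 0)
            goto out;
        clock_gettime(CLOCK_MONOTONIC, &t1);
        rw += elapsed_ns(&t0, &t1);
    }

    *to_rw_ns = rw / iters;
    *to_none_ns = none / iters;
    ret = 0;
out:
    munmap(p, len);
    return ret;
}
```

Calling time_mprotect_toggle(400 * 1024, 100000, &rw, &none) on kernels with and without cac1db8c3aad, and with mTHP enabled or disabled, might indicate which direction (if any) carries the extra cost.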

--
Cheers,

David