Re: [PATCH v2 0/2] mm/mprotect: micro-optimization work

From: Pedro Falcato

Date: Wed Apr 01 2026 - 10:29:06 EST


On Wed, Apr 01, 2026 at 10:25:40AM +0200, David Hildenbrand (Arm) wrote:
> On 3/30/26 22:06, Andrew Morton wrote:
> > On Mon, 30 Mar 2026 15:55:51 -0400 Luke Yang <luyang@xxxxxxxxxx> wrote:
> >
> >> Thanks for working on this. I just wanted to share that we've created a
> >> test kernel with your patches and tested on the following CPUs:
> >>
> >> --- aarch64 ---
> >> Ampere Altra
> >> Ampere Altra Max
> >>
> >> --- x86_64 ---
> >> AMD EPYC 7713
> >> AMD EPYC 7351
> >> AMD EPYC 7542
> >> AMD EPYC 7573X
> >> AMD EPYC 7702
> >> AMD EPYC 9754
> >> Intel Xeon Gold 6126
> >> Intel Xeon Gold 6330
> >> Intel Xeon Gold 6530
> >> Intel Xeon Platinum 8351N
> >> Intel Core i7-6820HQ
> >>
> >> --- ppc64le ---
> >> IBM Power 10
> >>
> >> On average, we see improvements ranging from a minimum of 5% to a
> >> maximum of 55%, with most improvements showing around a 25% speed up in
> >> the libmicro/mprot_tw4m micro benchmark.
> >
> > Thanks, that's nice. I've added some of the above into the changelog
> > and I took the liberty of adding your Tested-by: to both patches.
> >
> > fyi, regarding [2/2]: it's unclear to me whether the discussion with
> > David will result in any alterations. If there's something I need to
> > do, it always helps to lmk ;)
>
> I think we want to get a better understanding of which exact __always_inline
> is really helpful in patch #2, and where to apply the nr_ptes==1 forced
> optimization.
>
> I updated the microbenchmark I use for fork+unmap etc. to measure
> mprotect as well:
>
> https://gitlab.com/davidhildenbrand/scratchspace/-/raw/main/pte-mapped-folio-benchmarks.c?ref_type=heads
>
> Running some simple tests with order-0 on 1 GiB of memory:
>
>
> Upstream Linus:
>
> $ ./pte-mapped-folio-benchmarks 0 write-protect 5
> 0.005779
> ...
> $ ./pte-mapped-folio-benchmarks 0 write-unprotect 5
> 0.009113
> ...
>
>
> With Pedro's patch #2:
> $ ./pte-mapped-folio-benchmarks 0 write-protect 5
> 0.003941
> ...
> $ ./pte-mapped-folio-benchmarks 0 write-unprotect 5
> 0.006163
> ...
>
>
> With the patch below:
>
> $ ./pte-mapped-folio-benchmarks 0 write-protect 5
> 0.003364
>
> $ ./pte-mapped-folio-benchmarks 0 write-unprotect 5
> 0.005729

Hmm. Thanks for the testing. Interesting. I'll give it a shot. I'll have
results and/or a possible v3 by tomorrow, if need be.

Apologies for the slight delay here! :)

--
Pedro