Re: [PATCH v2 0/5] mm/mprotect: avoid unnecessary TLB flushes

From: Nadav Amit
Date: Fri Oct 22 2021 - 17:58:47 EST




> On Oct 21, 2021, at 8:04 PM, Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> wrote:
>
> On Thu, 21 Oct 2021 05:21:07 -0700 Nadav Amit <nadav.amit@xxxxxxxxx> wrote:
>
>> This patch-set is intended to remove unnecessary TLB flushes. It is
>> based on feedback from v1 and several bugs I found in v1 myself.
>>
>> Basically, there are 3 optimizations in this patch-set:
>> 1. Avoiding TLB flushes on change_huge_pmd() that are only needed to
>> prevent the A/D bits from changing.
>> 2. Use TLB batching infrastructure to batch flushes across VMAs and
>> do better/fewer flushes.
>> 3. Avoid TLB flushes on permission demotion.
>>
>> Andrea asked for the aforementioned (2) to come after (3), but this
>> is not simple (specifically since change_prot_numa() needs the number
>> of pages affected).
>
> [1/5] appears to be a significant fix which should probably be
> backported into -stable kernels. If you agree with this then I suggest
> it be prepared as a standalone patch, separate from the other four
> patches. With a cc:stable.


There is no functional bug in the kernel. The Knights Landing bug
was eventually worked around by changing the swap entry structure so
that the access/dirty bits do not overlap with the swap entry data.

>
> And the remaining patches are a performance optimization. Has any
> attempt been made to quantify the benefits?

I included some data before [1]. In general, the cost that is saved
is that of a TLB flush/shootdown.

I will modify my benchmark to test huge pages (which were not
included in the previous patch-set) and send results later. I will
also try nodejs, since it exec-protects/unprotects pages, to see
whether the benefit is significant enough to show in its benchmarks.
Nodejs crashed before (hence the 3rd patch added here).
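
For illustration only, a microbenchmark of this kind could look roughly
like the sketch below. This is not the benchmark from [1]: the helper
name time_mprotect_toggles() and its parameters are hypothetical, and
it assumes a Linux system with anonymous mmap(). It only times repeated
mprotect() permission changes on a populated mapping from one thread,
so it does not exercise remote shootdowns; it measures the whole
mprotect() path, of which the flush is one part.

```c
#define _GNU_SOURCE
#include <stdint.h>
#include <sys/mman.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of the patch-set): map npages anonymous
 * pages, touch them, then toggle the mapping between read-only and
 * read-write iters times. Returns the elapsed time in nanoseconds, or
 * -1 on error.
 */
int64_t time_mprotect_toggles(size_t npages, int iters)
{
	long ps = sysconf(_SC_PAGESIZE);
	struct timespec start, end;
	int64_t ret = -1;
	size_t len, off;
	char *buf;
	int i;

	if (ps <= 0 || npages == 0 || iters < 0)
		return -1;

	len = npages * (size_t)ps;
	buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (buf == MAP_FAILED)
		return -1;

	/* Touch every page so that the page-table entries are populated. */
	for (off = 0; off < len; off += (size_t)ps)
		buf[off] = 1;

	if (clock_gettime(CLOCK_MONOTONIC, &start))
		goto out;

	for (i = 0; i < iters; i++) {
		/* Remove write permission, then restore it. */
		if (mprotect(buf, len, PROT_READ))
			goto out;
		if (mprotect(buf, len, PROT_READ | PROT_WRITE))
			goto out;
	}

	if (clock_gettime(CLOCK_MONOTONIC, &end))
		goto out;

	ret = (int64_t)(end.tv_sec - start.tv_sec) * 1000000000LL +
	      (int64_t)(end.tv_nsec - start.tv_nsec);
out:
	munmap(buf, len);
	return ret;
}
```

Calling, for example, time_mprotect_toggles(512, 10000) on kernels with
and without the series and comparing the results would give a rough
per-call cost. A multi-threaded variant would be needed to see the
cost of cross-CPU shootdowns.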

[ The motivation behind the patches is to later introduce a userfaultfd
writeprotectv interface. For my use-case, which is under development,
this proved to improve performance considerably. ]



[1] https://lore.kernel.org/linux-mm/DA49DBBB-FFEE-4ACC-BB6C-364D07533C5E@xxxxxxxxxx/