Re: [PATCH v4 1/3] mm: use targeted IPIs for TLB sync with lockless page table walkers

From: Dave Hansen

Date: Mon Feb 02 2026 - 11:20:54 EST


On 2/2/26 04:14, Lance Yang wrote:
>>> Note that the tracking adds ~3% latency to GUP-fast, as measured on a
>>> 64-core system.
>>
>> What architecture, and is that acceptable?
>
> x86-64.
>
> I ran ./gup_bench which spawns 60 threads, each doing 500k GUP-fast
> operations (pinning 8 pages per call) via the gup_test ioctl.
>
> Results for pin pages:
> - Before: avg 1.489s (10 runs)
> - After:  avg 1.533s (10 runs)
>
> Given we avoid broadcast IPIs on large systems, I think this is a
> reasonable trade-off 🙂

I thought the big databases were really sensitive to GUP-fast latency.
They like big systems, too. Won't they howl when this finally hits their
testing?
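
For scale, here is a rough sketch of what the quoted numbers work out to.
The per-call figure assumes the 60 threads run concurrently and that the
reported average is wall-clock time for the whole run. Neither is stated
above, so treat it as an estimate only:

```python
# Relative GUP-fast overhead from the quoted gup_bench averages.
before_s = 1.489  # avg of 10 runs, before the patch
after_s = 1.533   # avg of 10 runs, after the patch

overhead_pct = (after_s - before_s) / before_s * 100

# Assumption (not stated in the thread): every thread issues its 500k
# calls concurrently, and the reported time is wall-clock for the run.
calls_per_thread = 500_000
extra_ns_per_call = (after_s - before_s) / calls_per_thread * 1e9

print(f"overhead: {overhead_pct:.2f}%")
print(f"extra per call (8 pages pinned): {extra_ns_per_call:.0f} ns")
```

Under those assumptions the added cost is paid on every GUP-fast call,
whether or not a page table is being freed at the time.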

Also, two of the "write" side callers here are:

* collapse_huge_page() (khugepaged)
* tlb_remove_table() (in an "-ENOMEM" path)

Those are quite slow paths, right? Shouldn't the design here favor
keeping GUP-fast as fast as possible and put the cost on those paths
instead?