Re: [PATCH v4 0/3] targeted TLB sync IPIs for lockless page table walkers

From: Peter Zijlstra

Date: Mon Feb 02 2026 - 04:54:47 EST


On Mon, Feb 02, 2026 at 03:45:54PM +0800, Lance Yang wrote:
> When freeing or unsharing page tables we send an IPI to synchronize with
> concurrent lockless page table walkers (e.g. GUP-fast). Today we broadcast
> that IPI to all CPUs, which is costly on large machines and hurts RT
> workloads[1].
>
> This series makes those IPIs targeted. We track which CPUs are currently
> doing a lockless page table walk for a given mm (per-CPU
> active_lockless_pt_walk_mm). When we need to sync, we only IPI those CPUs.
> GUP-fast and perf_get_page_size() set/clear the tracker around their walk;
> tlb_remove_table_sync_mm() uses it and replaces the previous broadcast in
> the free/unshare paths.

I'm confused. This only happens when !PT_RECLAIM, because with PT_RECLAIM,
__tlb_remove_table_one() actually uses RCU.

So why are you making things more expensive for no reason?