Re: [PATCH RESEND v3 1/2] mm/tlb: skip redundant IPI when TLB flush already synchronized
From: David Hildenbrand (Red Hat)
Date: Fri Jan 09 2026 - 10:44:32 EST
On 1/9/26 16:30, Lance Yang wrote:
> On 2026/1/9 22:13, David Hildenbrand (Red Hat) wrote:
>> What could work is tracking "tlb_table_flush_sent_ipi" really when we
>> are flushing the TLB for removed/unshared tables, and maybe resetting
>> it ... I don't know when, off the top of my head.
>> Not sure what's the best way forward here :(
>> v2 was simpler IMHO.
>>> The main concern Dave raised was that with PV hypercalls or when
>>> INVLPGB is available, we can't tell from a static check whether IPIs
>>> were actually sent.
>> Why can't we set the boolean at runtime when initializing the pv_ops
>> structure, when we are sure that it is allowed?
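
To make that suggestion concrete, here is a completely untested
sketch; tlb_flush_uses_ipi and tlb_flush_init_ipi_mode() are made-up
names, and the INVLPGB handling is only approximate:

/*
 * Made-up name: decide once at boot whether the TLB flush path
 * actually synchronizes via IPIs, instead of guessing from a
 * compile-time check.
 */
static bool tlb_flush_uses_ipi __ro_after_init;

void __init tlb_flush_init_ipi_mode(void)
{
        /*
         * Only the native flush path synchronizes via IPIs; PV
         * hypercalls and INVLPGB broadcasts generally do not. The
         * INVLPGB + no-global-ASID case still sends IPIs, so it
         * would need extra care here.
         */
        if (pv_ops.mmu.flush_tlb_multi == native_flush_tlb_multi &&
            !cpu_feature_enabled(X86_FEATURE_INVLPGB))
                tlb_flush_uses_ipi = true;
}

The table-freeing path could then skip its explicit IPI whenever
tlb_flush_uses_ipi is set.
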
> Yes, thanks, that sounds like a reasonable trade-off :)
> As you mentioned:
> "this lifetime stuff in core-mm ends up getting more complicated than
> v2 without a clear benefit".
> I totally agree that v3 is too complicated :(
> But Dave's concern about v2 was that we can't accurately tell whether
> IPIs were actually sent in PV environments or with INVLPGB, so a
> static check misses optimization opportunities. The
> INVLPGB+no_global_asid case also sends IPIs during the TLB flush.
> Anyway, yeah, I'd rather start with a simple approach, even if it's
> not perfect. We can always improve it later ;)
> Any ideas on how to move forward?
I'd hope Dave can comment :)
In general, I saw the whole thing as a two-step process:
1) Avoid IPIs completely when the TLB flush already sent them. We can
achieve that through v2 or v3, one way or the other; I don't
particularly care as long as it is clean and simple.
2) For other configs/archs, send IPIs only to CPUs that are actually in
GUP-fast etc. That would resolve some RT headaches with broadcast IPIs.
Regarding 2), it obviously only applies to setups where 1) does not,
like x86 with INVLPGB, or arm64.
I once had the idea of letting CPUs that enter/exit GUP-fast (and
similar) indicate in a global cpumask (or per-CPU variables) that
they are in that context. Then, we can just collect these CPUs and
limit the IPIs to them (usually not a lot of them ...).
The trick here is not to slow down GUP-fast too much. One person who
played with that (Yair, in an RT context) was not able to reduce the
overhead sufficiently.
I guess the options are:
a) Per-MM CPU mask we have to update atomically when entering/leaving
GUP-fast
b) Global mask we have to update atomically when entering/leaving GUP-fast
c) Per-CPU variable we have to update when entering/leaving GUP-fast
(see the sketch below). Interrupts are disabled, so we don't have to
worry about reschedule etc.
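
A rough, untested sketch of c), with all names made up for
illustration:

/* Set while a CPU is inside GUP-fast (or a similar lockless walker). */
static DEFINE_PER_CPU(bool, in_gup_fast);

/*
 * Both helpers run with interrupts disabled, so there is no
 * migration to worry about.
 */
static inline void gup_fast_enter(void)
{
        this_cpu_write(in_gup_fast, true);
        /* Order the flag against the page table walk; pairs with the
         * sender side. */
        smp_mb();
}

static inline void gup_fast_exit(void)
{
        smp_mb();
        this_cpu_write(in_gup_fast, false);
}

/* Sender side: collect only the CPUs that are currently in GUP-fast. */
static void build_gup_fast_mask(struct cpumask *mask)
{
        int cpu;

        cpumask_clear(mask);
        for_each_online_cpu(cpu)
                if (per_cpu(in_gup_fast, cpu))
                        cpumask_set_cpu(cpu, mask);
}

The sender would then use on_each_cpu_mask() on the collected mask
instead of a broadcast. The smp_mb() on entry is what I'd expect to
hurt; presumably that's the overhead that could not be reduced
sufficiently.
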
Maybe someone reading along has other thoughts.
--
Cheers
David