Re: [PATCH v2] arm64: tlbflush: Reset active_cpu on ASID rollover

From: Sayali Kulkarni

Date: Wed Jun 24 2026 - 18:24:36 EST

Thank you for catching this. After going through the patch again, I agree that the race is real. Resetting active_cpu to ACTIVE_CPU_NONE on rollover can leave it in an inconsistent state while stale entries are still live. Note that the stale entries left over from the rollover aren’t themselves the issue: those are handled by the existing reserved_asids/tlb_flush_pending mechanism. The problem is specifically active_cpu’s own record becoming inconsistent.

For v3, I am exploring handling active_cpu together with the generation so that the reset can’t open a NONE window: do the update in the slow path, after the local flush, and set active_cpu to the current CPU rather than NONE. Does that direction seem reasonable, or is there a cleaner way to avoid the under-claim window? I will follow up with a patch once that is worked out.

Thanks,
Sayali

On Thu, 18 Jun 2026, Linu Cherian wrote:

Hi,

On Fri, Jun 12, 2026 at 04:21:06PM -0700, Sayali Kulkarni wrote:

From: Sayali Kulkarni <sskulkarni@xxxxxxxxxxxxxxxxxxx>

Hi Catalin,

Thank you for the review. I’ve addressed your feedback in v2:

- Moved `WRITE_ONCE(mm->context.active_cpu, ACTIVE_CPU_NONE)` from `check_and_switch_context()` to `new_context()` after the `set_asid` label. At this point, a brand new ASID has been allocated that no CPU has ever used, so the reset is safe even for multi-threaded processes where other CPUs may still be running with the old ASID via `reserved_asids`.
- Updated the commit message to correct the safety reasoning: `flush_context()` only sets `tlb_flush_pending`; it does not issue a global TLB flush.

Thanks,
Sayali

Once active_cpu flips to ACTIVE_CPU_MULTIPLE it never resets, even if
the process settles back to one CPU. Reset it to ACTIVE_CPU_NONE in
new_context() after a new ASID is allocated at the set_asid label.

At this point a brand new ASID has been assigned that no CPU has ever
used, so ACTIVE_CPU_NONE accurately reflects reality. Any other threads
of the same process continue running with the old ASID via
reserved_asids and are unaffected.

This gives processes a fresh chance at the local-only flush fast path
after each ASID generation rollover.

Signed-off-by: Sayali Kulkarni <sskulkarni@xxxxxxxxxxxxxxxxxxx> (Ampere)
---
arch/arm64/mm/context.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/arch/arm64/mm/context.c b/arch/arm64/mm/context.c
index f34ed78393e0..46c7fd07b9bf 100644
--- a/arch/arm64/mm/context.c
+++ b/arch/arm64/mm/context.c
@@ -209,6 +209,7 @@ static u64 new_context(struct mm_struct *mm)
 set_asid:
 	__set_bit(asid, asid_map);
 	cur_idx = asid;
+	WRITE_ONCE(mm->context.active_cpu, ACTIVE_CPU_NONE);

Can the above store race with the store to active_cpu in another thread
that updates it to ACTIVE_CPU_MULTIPLE?

Let's say we have two threads, both initially running on CPU 0:

Thread 1: Runs on CPU 0

Encounters a rollover, updates mm->context.active_cpu to ACTIVE_CPU_NONE and
updates mm->context.id to new asid.

Thread 2: Scheduled to run on CPU 1 for the first time

Observes the updated mm->context.id that belongs to the current
generation (after the rollover), and hence proceeds to switch_mm_fastpath
and ends up updating active_cpu to ACTIVE_CPU_MULTIPLE.

If Thread 1 and Thread 2 race, can active_cpu get corrupted?

The reason this could be possible is that the writes to active_cpu and
mm->context.id can get reordered, so do we need to enforce ordering for
correctness?

Do you see this as a valid scenario?

--
Thanks,
Linu Cherian.