[PATCH] x86,mm: also remove local CPU from mm_cpumask if stale
From: Rik van Riel
Date: Thu Dec 05 2024 - 10:48:00 EST
On Thu, 5 Dec 2024 16:43:24 +0800
kernel test robot <oliver.sang@xxxxxxxxx> wrote:
> besides the performance report
> "[tip:x86/mm] [x86/mm/tlb] 209954cbc7: will-it-scale.per_thread_ops 13.2% regression"
> in
> https://lore.kernel.org/all/202411282207.6bd28eae-lkp@xxxxxxxxx/
>
Anxiously awaiting the bot to get around to v3 or v4 of that patch,
on the extra-large 2 socket system ;)
> we now also observed a WARNING from another test. the issue doesn't always
> happen, so we run it more to make sure the parent keep clean.
Thank you for spotting this corner case, too!
The warning appears to be fairly harmless, and luckily also easy
to fix.
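
To illustrate the idea behind the patch below, here is a minimal userspace
sketch. It is not kernel code: toy_mm, toy_cpu and toy_flush_tlb_func are
hypothetical names, the cpumask is a plain 64-bit word, and the single
generation compare is an assumed simplification of what flush_tlb_func
checks. The point it tries to show is that a CPU's TLB generation is only
meaningful relative to the mm it has loaded, so a CPU that is not running
the target mm should just drop out of that mm's cpumask and return, whether
it is the originating CPU or a remote one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-ins; hypothetical names, not the kernel's structures. */
struct toy_mm {
	uint64_t cpumask;	/* bit N set: CPU N may hold TLB entries for this mm */
	uint64_t tlb_gen;	/* bumped whenever this mm's page tables change */
};

struct toy_cpu {
	int id;
	struct toy_mm *loaded_mm;	/* mm this CPU is currently running */
	uint64_t local_tlb_gen;		/* generation flushed so far, for loaded_mm only */
};

/*
 * Returns true if the CPU caught up with target's generation, false if the
 * CPU was merely a stale bit in target's cpumask and got cleared.
 */
static bool toy_flush_tlb_func(struct toy_cpu *cpu, struct toy_mm *target)
{
	/* Applied to the local CPU as well as remote ones, as in the patch. */
	if (target != cpu->loaded_mm) {
		target->cpumask &= ~(UINT64_C(1) << cpu->id);
		return false;
	}

	/* Safe here: both generations belong to the same mm. */
	assert(cpu->local_tlb_gen <= target->tlb_gen);
	cpu->local_tlb_gen = target->tlb_gen;
	return true;
}
```

In this sketch, a CPU with id 0 that is running some other mm at generation 5
and is asked to flush an mm with cpumask 0x5 and tlb_gen 3 returns false,
leaves that cpumask at 0x4, and does not touch its own generation. Without
the early return, the compare would be between generations of two unrelated
mms, which is the kind of mismatch the WARN_ON_ONCE complains about.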
---8<---
From 5b5d1d548fbe07b415ba9e80a2f60deed5aead62 Mon Sep 17 00:00:00 2001
From: Rik van Riel <riel@xxxxxxxxxxx>
Date: Thu, 5 Dec 2024 10:20:28 -0500
Subject: [PATCH 2/2] x86,mm: also remove local CPU from mm_cpumask if stale

The code in flush_tlb_func that removes a remote CPU from the
cpumask if it is no longer running the target mm is also needed
on the originating CPU of a TLB flush, now that CPUs are no
longer cleared from the mm_cpumask at context switch time.

Flushing the TLB when we are not running the target mm is
harmless, because the CPU's tlb_gen only gets updated to
match the mm_tlb_gen, but it does hit this warning:

        WARN_ON_ONCE(local_tlb_gen > mm_tlb_gen);

[ 210.343902][ T4668] WARNING: CPU: 38 PID: 4668 at arch/x86/mm/tlb.c:815 flush_tlb_func (arch/x86/mm/tlb.c:815)

Removing both local and remote CPUs from the mm_cpumask
when doing a flush for a not currently loaded mm avoids
that warning.

Signed-off-by: Rik van Riel <riel@xxxxxxxxxxx>
Reported-by: kernel test robot <oliver.sang@xxxxxxxxx>
Closes: https://lore.kernel.org/oe-lkp/202412051551.690e9656-lkp@xxxxxxxxx
---
 arch/x86/mm/tlb.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index 0507a6773a37..458a5d5be594 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -756,13 +756,13 @@ static void flush_tlb_func(void *info)
 	if (!local) {
 		inc_irq_stat(irq_tlb_count);
 		count_vm_tlb_event(NR_TLB_REMOTE_FLUSH_RECEIVED);
+	}
 
-		/* Can only happen on remote CPUs */
-		if (f->mm && f->mm != loaded_mm) {
-			cpumask_clear_cpu(raw_smp_processor_id(), mm_cpumask(f->mm));
-			trace_tlb_flush(TLB_REMOTE_WRONG_CPU, 0);
-			return;
-		}
+	/* The CPU was left in the mm_cpumask of the target mm. Clear it. */
+	if (f->mm && f->mm != loaded_mm) {
+		cpumask_clear_cpu(raw_smp_processor_id(), mm_cpumask(f->mm));
+		trace_tlb_flush(TLB_REMOTE_WRONG_CPU, 0);
+		return;
 	}
 
 	if (unlikely(loaded_mm == &init_mm))
--
2.47.0