Re: [linus:master] [x86] 4817f70c25: stress-ng.mmapaddr.ops_per_sec 63.0% regression
From: Rik van Riel
Date: Wed Jan 29 2025 - 11:56:15 EST
On Wed, 29 Jan 2025 08:36:12 -0800
"Paul E. McKenney" <paulmck@xxxxxxxxxx> wrote:
> On Wed, Jan 29, 2025 at 11:14:29AM -0500, Rik van Riel wrote:
> > Paul, does this look like it could do the trick,
> > or do we need something else to make RCU freeing
> > happy again?
>
> I don't claim to fully understand the issue, but this would prevent
> any RCU grace periods starting subsequently from completing. It would
> not prevent RCU callbacks from being invoked for RCU grace periods that
> started earlier.
>
> So it won't prevent RCU callbacks from being invoked.
That makes things clear! I guess we need a different approach.
Qi, does the patch below resolve the regression for you?
---8<---
From 5de4fa686fca15678a7e0a186852f921166854a3 Mon Sep 17 00:00:00 2001
From: Rik van Riel <riel@xxxxxxxxxxx>
Date: Wed, 29 Jan 2025 10:51:51 -0500
Subject: [PATCH 2/2] mm,rcu: prevent RCU callbacks from running with pcp lock held

Enabling MMU_GATHER_RCU_TABLE_FREE can create contention on the
zone->lock. This turns out to be because in some configurations
RCU callbacks are called when IRQs are re-enabled inside
rmqueue_bulk, while the CPU is still holding the per-cpu pages lock.

That results in the RCU callbacks being unable to grab the
PCP lock, and taking the slow path with the zone->lock for
each item freed.

Speed things up by blocking RCU callbacks while holding the
PCP lock.

Signed-off-by: Rik van Riel <riel@xxxxxxxxxxx>
Suggested-by: Paul McKenney <paulmck@xxxxxxxxxx>
Reported-by: Qi Zheng <zhengqi.arch@xxxxxxxxxxxxx>
---
mm/page_alloc.c | 10 +++++++---
1 file changed, 7 insertions(+), 3 deletions(-)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 6e469c7ef9a4..73e334f403fd 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -94,11 +94,15 @@ static DEFINE_MUTEX(pcp_batch_high_lock);
#if defined(CONFIG_SMP) || defined(CONFIG_PREEMPT_RT)
/*
- * On SMP, spin_trylock is sufficient protection.
+ * On SMP, spin_trylock is sufficient protection against recursion.
* On PREEMPT_RT, spin_trylock is equivalent on both SMP and UP.
+ *
+ * Block softirq execution to prevent RCU frees from running in softirq
+ * context while this CPU holds the PCP lock, which could result in a whole
+ * bunch of frees contending on the zone->lock.
*/
-#define pcp_trylock_prepare(flags) do { } while (0)
-#define pcp_trylock_finish(flag) do { } while (0)
+#define pcp_trylock_prepare(flags) local_bh_disable()
+#define pcp_trylock_finish(flag) local_bh_enable()
#else
/* UP spin_trylock always succeeds so disable IRQs to prevent re-entrancy. */
--
2.47.1
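
For readers following the reasoning in the commit message, here is a rough
single-threaded userspace model of the effect. It is only a sketch: struct
model, free_one_page(), run_pending_callbacks() and simulate() are
hypothetical names made up for illustration, not kernel APIs, and the booleans
merely stand in for the PCP spinlock and the softirq-disabled state. The
point it tries to show is that callbacks which run while the PCP lock is held
all fall back to the slow path, while callbacks deferred until after the
unlock take the fast path.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
struct model {
	bool pcp_locked;   /* stands in for the per-cpu pages spinlock */
	bool bh_disabled;  /* stands in for the state set by local_bh_disable() */
	int pending;       /* queued "RCU callbacks", each freeing one page */
	int fast_frees;    /* frees that found the PCP lock available */
	int slow_frees;    /* frees that fell back to the zone->lock path */
};

/* One free: the trylock fails if this CPU already holds the PCP lock. */
static void free_one_page(struct model *m)
{
	if (m->pcp_locked)
		m->slow_frees++;  /* trylock failed: zone->lock slow path */
	else
		m->fast_frees++;  /* trylock succeeded: per-cpu fast path */
}

/* A point where pending softirq work may run, unless it is blocked. */
static void run_pending_callbacks(struct model *m)
{
	if (m->bh_disabled)
		return;
	while (m->pending > 0) {
		m->pending--;
		free_one_page(m);
	}
}

/* Walk through one allocation that holds the PCP lock across an IRQ window. */
static void simulate(struct model *m, bool block_bh, int callbacks)
{
	assert(callbacks >= 0);
	*m = (struct model){ .pending = callbacks };

	m->bh_disabled = block_bh;  /* models pcp_trylock_prepare() */
	m->pcp_locked = true;       /* PCP lock taken */
	run_pending_callbacks(m);   /* IRQs re-enabled inside rmqueue_bulk */
	m->pcp_locked = false;      /* PCP lock released */
	m->bh_disabled = false;     /* models pcp_trylock_finish() */
	run_pending_callbacks(m);   /* deferred callbacks run after the unlock */
}
```

In this model, simulate(&m, false, 8) ends with all 8 frees counted in
slow_frees, and simulate(&m, true, 8) ends with all 8 counted in fast_frees.
It says nothing about real timing or about how contended the zone->lock gets
on actual hardware.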