Re: [BUG almost bisected] Splat in dequeue_rt_stack() and build error

From: Peter Zijlstra
Date: Tue Oct 08 2024 - 07:12:31 EST


On Sun, Oct 06, 2024 at 01:44:53PM -0700, Paul E. McKenney wrote:

> With your patch, I got 24 failures out of 100 TREE03 runs of 18 hours
> each. The failures were different, though, mostly involving boost
> failures in which RCU priority boosting didn't actually result in the
> low-priority readers getting boosted.

Somehow I feel this is progress, albeit very minor :/

> There were also a number of "sched: DL replenish lagged too much"
> messages, but it looks like this was a symptom of the ftrace dump.
>
> Given that this now involves priority boosting, I am trying 400*TREE03
> with each guest OS restricted to four CPUs to see if that makes things
> happen more quickly, and will let you know how this goes.
>
> Any other debug I should apply?

The sched_pi_setprio tracepoint perhaps?

I've read all the RCU_BOOST and rtmutex code (once again), and I've been
running pi_stress with --sched id=low,policy=other to ensure the code
paths in question are taken. But so far so very nothing :/

(Noting that both RCU_BOOST and PI futexes use the same rt_mutex / PI API)

You know RCU_BOOST better than me.. then again, it is utterly weird this
is apparently affected. I've gotta ask: does a kernel with my patch
applied, and kernel/sched/features.h additionally flipped to
SCHED_FEAT(DELAY_DEQUEUE, false), function as expected?


One very minor thing I noticed while reading the code; do with it as you
think best...
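
The reshuffle in the diff below is only meant to tidy the NULL checks, not
to change which task list gets picked. A throwaway userspace sketch of that
claim follows; struct node, pick_old(), pick_new() and count_mismatches()
are hypothetical stand-ins invented for illustration (the real fields live
in struct rcu_node and are list pointers), and a NULL return here models
the "unlock and return 0" path:

```c
#include <stddef.h>

/* Hypothetical stand-in for the two struct rcu_node fields the diff reads. */
struct node {
	void *exp_tasks;
	void *boost_tasks;
};

/* Old flow: bail out early when both are NULL, then prefer exp_tasks. */
static void *pick_old(const struct node *rnp)
{
	if (rnp->exp_tasks == NULL && rnp->boost_tasks == NULL)
		return NULL;		/* models: unlock; return 0 */

	if (rnp->exp_tasks != NULL)
		return rnp->exp_tasks;
	return rnp->boost_tasks;
}

/* New flow: pick first, and treat a NULL result as the bail-out case. */
static void *pick_new(const struct node *rnp)
{
	void *tb = rnp->exp_tasks;

	if (!tb)
		tb = rnp->boost_tasks;
	return tb;			/* NULL models: unlock; return 0 */
}

/* Walk all four NULL/non-NULL combinations and count disagreements. */
int count_mismatches(void)
{
	static int a, b;		/* only their addresses are used */
	void *exp_vals[] = { NULL, &a };
	void *boost_vals[] = { NULL, &b };
	int bad = 0;

	for (int i = 0; i < 2; i++) {
		for (int j = 0; j < 2; j++) {
			struct node n = { exp_vals[i], boost_vals[j] };

			if (pick_old(&n) != pick_new(&n))
				bad++;
		}
	}
	return bad;
}
```

Calling count_mismatches() from a trivial main() should report no
disagreement between the two flows; this says nothing about locking, only
about the selection logic.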

diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
index 1c7cbd145d5e..95061119653d 100644
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -1071,10 +1071,6 @@ static int rcu_boost(struct rcu_node *rnp)
 	 * Recheck under the lock: all tasks in need of boosting
 	 * might exit their RCU read-side critical sections on their own.
 	 */
-	if (rnp->exp_tasks == NULL && rnp->boost_tasks == NULL) {
-		raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
-		return 0;
-	}
 
 	/*
 	 * Preferentially boost tasks blocking expedited grace periods.
@@ -1082,10 +1078,13 @@ static int rcu_boost(struct rcu_node *rnp)
 	 * expedited grace period must boost all blocked tasks, including
 	 * those blocking the pre-existing normal grace period.
 	 */
-	if (rnp->exp_tasks != NULL)
-		tb = rnp->exp_tasks;
-	else
+	tb = rnp->exp_tasks;
+	if (!tb)
 		tb = rnp->boost_tasks;
+	if (!tb) {
+		raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
+		return 0;
+	}
 
 	/*
 	 * We boost task t by manufacturing an rt_mutex that appears to