Re: scheduler problems in -next (was: Re: [PATCH 6.4 000/227] 6.4.7-rc1 review)

From: Paul E. McKenney
Date: Tue Aug 01 2023 - 15:11:11 EST


On Tue, Aug 01, 2023 at 10:32:45AM -0700, Guenter Roeck wrote:
> On 7/31/23 14:15, Peter Zijlstra wrote:
> > On Mon, Jul 31, 2023 at 09:34:29AM -0700, Guenter Roeck wrote:
> > > > Ha!, I was poking around the same thing. My hack below seems to (so far,
> > > > <20 boots) help things.
> > > >
> > >
> > > So, dumb question:
> > > How comes this bisects to "sched/fair: Remove sched_feat(START_DEBIT)" ?
> >
> > That commit changes the timings of things; dumb luck otherwise.
>
> Kind of scary. So I only experienced the problem because the START_DEBIT patch
> happened to be queued roughly at the same time, and it might otherwise have
> found its way unnoticed into the upstream kernel. That makes me wonder if this
> or other similar patches may uncover similar problems elsewhere in the kernel
> (i.e., either hide new or existing race conditions or expose existing ones).
>
> This in turn makes me wonder if it would be possible to define a test which
> would uncover such problems without the START_DEBIT patch. Any idea ?

Thank you all for tracking this down!

One way is to put a schedule_timeout_idle(100) right before the call to
rcu_tasks_one_gp() from synchronize_rcu_tasks_generic(). That is quite
specific to this particular issue, but it does have the virtue of making
it actually happen in my testing.

There have been a few academic projects that inject delays at points
chosen by various heuristics plus some randomness. But this would be
a bit of a challenge to those because each kernel only passes through
this window once at boot time.

Please see below for my preferred fix. Does this work for you guys?

Back to figuring out why recent kernels occasionally blow up all
rcutorture guest OSes...

Thanx, Paul

------------------------------------------------------------------------

diff --git a/kernel/rcu/tasks.h b/kernel/rcu/tasks.h
index 7294be62727b..2d5b8385c357 100644
--- a/kernel/rcu/tasks.h
+++ b/kernel/rcu/tasks.h
@@ -570,10 +570,12 @@ static void rcu_tasks_one_gp(struct rcu_tasks *rtp, bool midboot)
 	if (unlikely(midboot)) {
 		needgpcb = 0x2;
 	} else {
+		mutex_unlock(&rtp->tasks_gp_mutex);
 		set_tasks_gp_state(rtp, RTGS_WAIT_CBS);
 		rcuwait_wait_event(&rtp->cbs_wait,
 				   (needgpcb = rcu_tasks_need_gpcb(rtp)),
 				   TASK_IDLE);
+		mutex_lock(&rtp->tasks_gp_mutex);
 	}
 
 	if (needgpcb & 0x2) {
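
------------------------------------------------------------------------

For anyone who wants to poke at the locking pattern outside the kernel,
here is a minimal user-space sketch using pthreads. It is only a model,
not kernel code: every name in it (struct model, gp_kthread(),
midboot_can_proceed()) is hypothetical, and it assumes that the problem
is a second caller finding the mutex still held while the grace-period
kthread sleeps waiting for callbacks. With drop_mutex false it mimics
the code before the diff above, and with drop_mutex true it mimics the
code after.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical user-space stand-ins; none of these are kernel APIs. */
struct model {
	pthread_mutex_t gp_mutex;	/* plays the role of rtp->tasks_gp_mutex */
	pthread_mutex_t cb_lock;	/* protects have_cb and parked */
	pthread_cond_t cb_cond;		/* plays the role of rtp->cbs_wait */
	bool have_cb;			/* a callback has been posted */
	bool parked;			/* the kthread has reached its wait */
	bool drop_mutex;		/* true: release gp_mutex across the wait */
};

/* Models the non-midboot path: take gp_mutex, then wait for callbacks. */
static void *gp_kthread(void *arg)
{
	struct model *m = arg;

	pthread_mutex_lock(&m->gp_mutex);
	if (m->drop_mutex)
		pthread_mutex_unlock(&m->gp_mutex);

	pthread_mutex_lock(&m->cb_lock);
	m->parked = true;
	pthread_cond_broadcast(&m->cb_cond);
	while (!m->have_cb)
		pthread_cond_wait(&m->cb_cond, &m->cb_lock);
	pthread_mutex_unlock(&m->cb_lock);

	if (m->drop_mutex)
		pthread_mutex_lock(&m->gp_mutex);
	/* Grace-period processing would go here. */
	pthread_mutex_unlock(&m->gp_mutex);
	return NULL;
}

/*
 * Models a mid-boot caller arriving while the kthread is parked.
 * Returns true if gp_mutex could be acquired at that point.
 */
bool midboot_can_proceed(bool drop_mutex)
{
	struct model m = { .drop_mutex = drop_mutex };
	pthread_t tid;
	bool ok;
	int rc;

	pthread_mutex_init(&m.gp_mutex, NULL);
	pthread_mutex_init(&m.cb_lock, NULL);
	pthread_cond_init(&m.cb_cond, NULL);
	rc = pthread_create(&tid, NULL, gp_kthread, &m);
	assert(rc == 0);	/* the sketch has no error recovery */
	(void)rc;

	/* Wait until the kthread is asleep waiting for callbacks. */
	pthread_mutex_lock(&m.cb_lock);
	while (!m.parked)
		pthread_cond_wait(&m.cb_cond, &m.cb_lock);
	pthread_mutex_unlock(&m.cb_lock);

	/* trylock instead of lock, so the model reports rather than hangs. */
	ok = pthread_mutex_trylock(&m.gp_mutex) == 0;
	if (ok)
		pthread_mutex_unlock(&m.gp_mutex);

	/* Post a "callback" so that the kthread can finish. */
	pthread_mutex_lock(&m.cb_lock);
	m.have_cb = true;
	pthread_cond_broadcast(&m.cb_cond);
	pthread_mutex_unlock(&m.cb_lock);
	pthread_join(tid, NULL);

	pthread_cond_destroy(&m.cb_cond);
	pthread_mutex_destroy(&m.cb_lock);
	pthread_mutex_destroy(&m.gp_mutex);
	return ok;
}
```

Build with -pthread. Calling midboot_can_proceed(false) reports that the
caller would have blocked, and midboot_can_proceed(true) reports that it
gets the mutex. The parked flag stands in for the injected delay: it
forces the caller to arrive only after the kthread has gone to sleep, so
the model does not depend on timing.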