Re: [PATCH] sched: Use RCU-sched in core-scheduling balancing logic

From: Paul E. McKenney
Date: Tue Mar 24 2020 - 09:30:19 EST


On Tue, Mar 24, 2020 at 11:01:27AM +0800, Li, Aubrey wrote:
> On 2020/3/23 23:21, Joel Fernandes wrote:
> > On Mon, Mar 23, 2020 at 02:58:18PM +0800, Li, Aubrey wrote:
> >> On 2020/3/14 8:30, Paul E. McKenney wrote:
> >>> On Fri, Mar 13, 2020 at 07:29:18PM -0400, Joel Fernandes (Google) wrote:
> >>>> rcu_read_unlock() can incur an infrequent deadlock in
> >>>> sched_core_balance(). Fix this by using the RCU-sched flavor instead.
> >>>>
> >>>> This fixes the following spinlock recursion observed when testing the
> >>>> core scheduling patches on PREEMPT=y kernel on ChromeOS:
> >>>>
> >>>> [ 14.998590] watchdog: BUG: soft lockup - CPU#0 stuck for 11s! [kworker/0:10:965]
> >>>>
> >>>
> >>> The original could indeed deadlock, and this would avoid that deadlock.
> >>> (The commit to solve this deadlock is sadly not yet in mainline.)
> >>>
> >>> Acked-by: Paul E. McKenney <paulmck@xxxxxxxxxx>
> >>
> >> I saw this in dmesg with this patch, is it expected?
> >>
> >> [ 117.000905] =============================
> >> [ 117.000907] WARNING: suspicious RCU usage
> >> [ 117.000911] 5.5.7+ #160 Not tainted
> >> [ 117.000913] -----------------------------
> >> [ 117.000916] kernel/sched/core.c:4747 suspicious rcu_dereference_check() usage!
> >> [ 117.000918]
> >> other info that might help us debug this:
> >
> > Sigh, this is because for_each_domain() expects rcu_read_lock(). From an RCU
> > PoV, the code is correct (warning doesn't cause any issue).
> >
> > To silence warning, we could replace the rcu_read_lock_sched() in my patch with:
> > preempt_disable();
> > rcu_read_lock();
> >
> > and replace the unlock with:
> >
> > rcu_read_unlock();
> > preempt_enable();
> >
> > That should both take care of both the warning and the scheduler-related
> > deadlock. Thoughts?
> >
>
> How about this?
>
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index a01df3e..7ff694e 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -4743,7 +4743,6 @@ static void sched_core_balance(struct rq *rq)
> int cpu = cpu_of(rq);
>
> rcu_read_lock();
> - raw_spin_unlock_irq(rq_lockp(rq));
> for_each_domain(cpu, sd) {
> if (!(sd->flags & SD_LOAD_BALANCE))
> break;
> @@ -4754,7 +4753,6 @@ static void sched_core_balance(struct rq *rq)
> if (steal_cookie_task(cpu, sd))
> break;
> }
> - raw_spin_lock_irq(rq_lockp(rq));
> rcu_read_unlock();
> }

As an alternative, I am backporting the -rcu commit 2b5e19e20fc2 ("rcu:
Make rcu_read_unlock_special() safe for rq/pi locks") to v5.6-rc6 and
testing it. The other support for this is already in mainline. I just
now started testing it, and will let you know how it goes.

If that works for you, and if the bug you are looking to fix is also
in v5.5 or earlier, please let me know so that we can work out how to
deal with the older releases.

Thanx, Paul