Re: lockdep splat in v5.11-rc1 involving console_sem and rq locks

From: Paul E. McKenney
Date: Wed Jan 06 2021 - 09:47:20 EST


On Wed, Jan 06, 2021 at 10:52:14AM +0100, Peter Zijlstra wrote:
> On Tue, Jan 05, 2021 at 02:01:15PM -0800, Paul E. McKenney wrote:
> > Hello!
> >
> > The RUDE01 rcutorture scenario (and less often, the TASKS01 scenario)
> > results in occasional lockdep splats on v5.11-rc1 on x86. This failure
> > is probabilistic, sometimes happening as much as 30% of the time, but
> > sometimes happening quite a bit less frequently. (And yes, this did
> > result in a false bisection. Why do you ask?) The problem seems to
> > happen more frequently shortly after boot, so for fastest reproduction
> > run lots of 10-minute RUDE01 runs, which did eventually result in a
> > good bisection. (Yes, I did hammer the last good commit for a while.)
> >
> > The first bad commit is 1cf12e08bc4d ("sched/hotplug: Consolidate task
> > migration on CPU unplug"). An example splat is shown below.
> >
> > Thoughts?
>
> The splat is because you hit a WARN; we're working on that.

Huh. The WARN does not always generate the lockdep complaint. But
fair enough.

> https://lkml.kernel.org/r/20201226025117.2770-1-jiangshanlai@xxxxxxxxx

Thomas pointed me at this one a couple of weeks ago. Here is an
additional fix for rcutorture: f67e04bb0695 ("torture: Break affinity
of kthreads last running on outgoing CPU"). I am still getting WARNs
and lockdep splats with both applied.

What would break if I made the code dump out a few entries in the
runqueue if the warning triggered?

Thanx, Paul