Re: [stable-rc 5.9] sched: core.c:7270 Illegal context switch in RCU-bh read-side critical section!
From: Thomas Gleixner
Date: Wed Dec 16 2020 - 10:22:26 EST
On Wed, Dec 16 2020 at 15:55, Naresh Kamboju wrote:
> On Tue, 15 Dec 2020 at 23:52, Jakub Kicinski <kuba@xxxxxxxxxx> wrote:
>> > Or you could place checks for being in a BH-disable further up in
>> > the code. Or build with CONFIG_DEBUG_INFO=y to allow more precise
>> > interpretation of this stack trace.
>
> I will try to reproduce this warning with DEBUG_INFO=y enabled kernel and
> get back to you with a better crash log.
>
>>
>> My money would be on the option that whatever run on this workqueue
>> before forgot to re-enable BH, but we already have a check for that...
>> Naresh, do you have the full log? Is there nothing like "BUG: workqueue
>> leaked lock" above the splat?
No, because it's in the middle of the work. The workqueue bug triggers
when the work has finished.
So cleanup_net() does:

    ....
    synchronize_rcu();   <- might sleep. So up to here it should be fine.

    list_for_each_entry_continue_reverse(ops, &pernet_list, list)
        ops_exit_list(ops, &net_exit_list);

ops_exit_list() is called for each ops which then either invokes
ops->exit() or ops->exit_batch().
So one of those callbacks fails to re-enable BH. Adding a check for
!local_bh_disabled() after each invocation of ops->exit() and
ops->exit_batch() should identify the buggy callback.
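
To illustrate the idea, here is a rough userspace model of such a check.
It is only a sketch, not kernel code: the mock_* helpers, struct
mock_pernet_ops, call_and_check() and find_leaky_ops() are all made-up
names, the BH state is modelled as a plain counter, and the per-net loop
around ops->exit() is left out. In the kernel the equivalent check would
presumably go into ops_exit_list(), right after each callback returns.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Userspace stand-in for the BH-disable nesting depth (assumption). */
static int bh_disable_depth;

static void mock_local_bh_disable(void) { bh_disable_depth++; }
static void mock_local_bh_enable(void)  { bh_disable_depth--; }

/* Hypothetical, heavily simplified stand-in for struct pernet_operations. */
struct mock_pernet_ops {
	const char *name;
	void (*exit)(void);
	void (*exit_batch)(void);
};

/* Well-behaved callback: BH state is balanced on return. */
static void good_exit(void)
{
	mock_local_bh_disable();
	mock_local_bh_enable();
}

/* Buggy callback: returns with BH still disabled. */
static void leaky_exit_batch(void)
{
	mock_local_bh_disable();
}

/*
 * Invoke one callback and compare the BH depth before and after.
 * Returns 1 if the callback left the depth changed, 0 otherwise.
 */
static int call_and_check(const char *ops_name, const char *cb_name,
			  void (*cb)(void))
{
	int before = bh_disable_depth;

	cb();
	if (bh_disable_depth != before) {
		fprintf(stderr, "%s->%s() returned with BH depth %d (was %d)\n",
			ops_name, cb_name, bh_disable_depth, before);
		/* Restore the depth so the model stays consistent afterwards. */
		bh_disable_depth = before;
		return 1;
	}
	return 0;
}

/*
 * Walk the ops array in reverse, loosely mirroring the reverse list walk,
 * and check after every callback. Returns the index of the first
 * offending entry, or -1 if every callback was balanced.
 */
static int find_leaky_ops(const struct mock_pernet_ops *ops, size_t n)
{
	assert(ops != NULL || n == 0);

	for (size_t i = n; i-- > 0; ) {
		if (ops[i].exit &&
		    call_and_check(ops[i].name, "exit", ops[i].exit))
			return (int)i;
		if (ops[i].exit_batch &&
		    call_and_check(ops[i].name, "exit_batch", ops[i].exit_batch))
			return (int)i;
	}
	return -1;
}
```

Checking after every single callback, rather than once at the end of the
walk, is what lets the report name the offender.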
Thanks,
tglx