Re: kernel BUG at kernel/sched/core.c:3490!

From: Qian Cai
Date: Mon Jan 07 2019 - 09:36:13 EST




On 1/7/19 8:52 AM, Peter Zijlstra wrote:
> On Tue, Jan 01, 2019 at 12:44:35AM -0500, Qian Cai wrote:
>> Running some mmap() workloads to push the system into a low-memory situation
>> with swapping and OOM triggers this BUG():
>>
>> void __noreturn do_task_dead(void)
>> {
>> 	/* Causes final put_task_struct in finish_task_switch(): */
>> 	set_special_state(TASK_DEAD);
>>
>> 	/* Tell freezer to ignore us: */
>> 	current->flags |= PF_NOFREEZE;
>>
>> 	__schedule(false);
>> 	BUG();
>>
>> 	/* Avoid "noreturn function does return" - but don't continue if BUG() is a NOP: */
>> 	for (;;)
>> 		cpu_relax();
>> }
>
> This would mean that we somehow lose the TASK_DEAD state before hitting
> schedule(), but that is something that should be avoided by
> set_special_state(), which is supposed to serialize against concurrent
> wake-ups.
>
> Also see commit: b5bf9a90bbeb ("sched/core: Introduce set_special_state()")
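
For reference, here is a rough user-space model of the serialization described
above, as I understand it. It is only a sketch, not the kernel code: the names
model_task, model_wake_up(), model_set_special_state() and run_model() are made
up for illustration, a pthread mutex stands in for pi_lock, and the state
values are model constants. The point is that the waker tests and rewrites the
state under the same lock that the special-state store takes, so a wake-up
cannot overwrite TASK_DEAD once it has been set.

```c
#include <pthread.h>

/* Model values only; the real definitions live in include/linux/sched.h. */
enum { TASK_RUNNING = 0x00, TASK_INTERRUPTIBLE = 0x01, TASK_DEAD = 0x80 };

/* Hypothetical stand-in for task_struct: one lock, one state word. */
struct model_task {
	pthread_mutex_t pi_lock;
	int state;
};

/* Waker side: test the state and flip it to RUNNING in one critical section. */
static int model_wake_up(struct model_task *t, int mask)
{
	int woken = 0;

	pthread_mutex_lock(&t->pi_lock);
	if (t->state & mask) {
		t->state = TASK_RUNNING;
		woken = 1;
	}
	pthread_mutex_unlock(&t->pi_lock);
	return woken;
}

/* Sleeper side: the special-state store takes the same lock as the waker. */
static void model_set_special_state(struct model_task *t, int state)
{
	pthread_mutex_lock(&t->pi_lock);
	t->state = state;
	pthread_mutex_unlock(&t->pi_lock);
}

/* Keeps issuing wake-ups that only match TASK_INTERRUPTIBLE. */
static void *waker(void *arg)
{
	struct model_task *t = arg;

	for (int i = 0; i < 100000; i++)
		model_wake_up(t, TASK_INTERRUPTIBLE);
	return NULL;
}

/* Races the waker against the DEAD store and returns the final state. */
int run_model(void)
{
	struct model_task t = { .state = TASK_INTERRUPTIBLE };
	pthread_t thr;

	pthread_mutex_init(&t.pi_lock, NULL);
	pthread_create(&thr, NULL, waker, &t);
	model_set_special_state(&t, TASK_DEAD);
	pthread_join(thr, NULL);
	pthread_mutex_destroy(&t.pi_lock);
	return t.state;
}
```

Calling run_model() from a small main() (built with -pthread) should always
return TASK_DEAD in this model: a wake-up that runs before the store gets
overwritten by it, and one that runs after it no longer matches the mask.
Hitting the BUG() would correspond to this invariant being broken somewhere.
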
>
> How readily does this reproduce?

Running LTP oom01 [1] has triggered it at least once in every five attempts so
far on v4.20+. I have not tried it much on v5.0-rc1 yet.

[1]
https://github.com/linux-test-project/ltp/blob/master/testcases/kernel/mem/oom/oom01.c