Re: linux-next: Tree for Nov 5

From: Marco Elver
Date: Tue Nov 10 2020 - 08:54:58 EST


On Tue, 10 Nov 2020 at 10:36, Dmitry Vyukov <dvyukov@xxxxxxxxxx> wrote:
[...]
> > > On Tue, Nov 10, 2020 at 8:50 AM Anders Roxell <anders.roxell@xxxxxxxxxx> wrote:
[...]
> > > > When building an arm64 allmodconfig and booting up that in qemu I see
> > > >
> > > > [10011.092394][ T28] task:kworker/0:2 state:D stack:26896 pid: 1840 ppid: 2 flags:0x00000428
> > > > [10022.368093][ T28] Workqueue: events toggle_allocation_gate
> > > > [10024.827549][ T28] Call trace:
> > > > [10027.152494][ T28] __switch_to+0x1cc/0x1e0
> > > > [10031.378073][ T28] __schedule+0x730/0x800
> > > > [10032.164468][ T28] schedule+0xd8/0x160
> > > > [10033.886807][ T28] toggle_allocation_gate+0x16c/0x220
> > > > [10038.477987][ T28] process_one_work+0x5c0/0x980
> > > > [10039.900075][ T28] worker_thread+0x428/0x720
> > > > [10042.782911][ T28] kthread+0x23c/0x260
> > > > [10043.171725][ T28] ret_from_fork+0x10/0x18
> > > > [10046.227741][ T28] INFO: lockdep is turned off.
> > > > [10047.732220][ T28] Kernel panic - not syncing: hung_task: blocked tasks
> > > > [10047.741785][ T28] CPU: 0 PID: 28 Comm: khungtaskd Tainted: G W 5.10.0-rc2-next-20201105-00006-g7af110e4d8ed #1
> > > > [10047.755348][ T28] Hardware name: linux,dummy-virt (DT)
> > > > [10047.763476][ T28] Call trace:
> > > > [10047.769802][ T28] dump_backtrace+0x0/0x420
> > > > [10047.777104][ T28] show_stack+0x38/0xa0
> > > > [10047.784177][ T28] dump_stack+0x1d4/0x278
> > > > [10047.791362][ T28] panic+0x304/0x5d8
> > > > [10047.798202][ T28] check_hung_uninterruptible_tasks+0x5e4/0x640
> > > > [10047.807056][ T28] watchdog+0x138/0x160
> > > > [10047.814140][ T28] kthread+0x23c/0x260
> > > > [10047.821130][ T28] ret_from_fork+0x10/0x18
> > > > [10047.829181][ T28] Kernel Offset: disabled
> > > > [10047.836274][ T28] CPU features: 0x0240002,20002004
> > > > [10047.844070][ T28] Memory Limit: none
> > > > [10047.853599][ T28] ---[ end Kernel panic - not syncing: hung_task: blocked tasks ]---
> > > >
> > > > if I build with KFENCE=n it boots up eventually, here's my .config file [2].
> > > >
> > > > Any idea what may happen?
> > > >
> > > > it happens on next-20201109 also, but it takes longer until we get the
> > > > "Call trace:".
> > > >
> > > > Cheers,
> > > > Anders
> > > > [1] http://ix.io/2Ddv
> > > > [2] https://people.linaro.org/~anders.roxell/allmodconfig-next-20201105.config
[...]
> > Oh, I forgot to say that this is the full boot log with the kernel
> > panic: http://ix.io/2Ddv
>
> Thanks!
> The last messages before the hang are:
>
> [ 1367.791522][ T1] Running tests on all trace events:
> [ 1367.815307][ T1] Testing all events:
>
> I can imagine tracing somehow interferes with kfence.

The reason is simply that this config is very slow on qemu (enabling
lockdep helped), and the test that is running doesn't result in any
allocations for an extended time. Because of that, our wait_event()
just stalls, as there are no allocations coming in, and the hung task
detector eventually fires. My guess is that this scenario is unique to
early boot, where we are not yet running user space, combined with a
selftest that results in no allocations for some time.

Please give this a spin:
https://lkml.kernel.org/r/20201110135320.3309507-1-elver@xxxxxxxxxx

Thanks,
-- Marco