Re: WARNING: at kernel/workqueue.c:1473 __queue_work+0x3b8/0x3d0

From: Corentin Labbe
Date: Wed Oct 07 2020 - 15:41:25 EST


On Mon, Oct 05, 2020 at 01:09:10PM -0400, Daniel Jordan wrote:
> On Thu, Oct 01, 2020 at 07:50:22PM +0200, Corentin Labbe wrote:
> > On Tue, Mar 03, 2020 at 04:30:17PM -0500, Daniel Jordan wrote:
> > > Barring other ideas, Corentin, would you be willing to boot with
> > >
> > > trace_event=initcall:*,module:* trace_options=stacktrace
> > >
> > > and
> > >
> > > diff --git a/kernel/module.c b/kernel/module.c
> > > index 33569a01d6e1..393be6979a27 100644
> > > --- a/kernel/module.c
> > > +++ b/kernel/module.c
> > > @@ -3604,8 +3604,11 @@ static noinline int do_init_module(struct module *mod)
> > >  	 * be cleaned up needs to sync with the queued work - ie
> > >  	 * rcu_barrier()
> > >  	 */
> > > -	if (llist_add(&freeinit->node, &init_free_list))
> > > +	if (llist_add(&freeinit->node, &init_free_list)) {
> > > +		pr_warn("%s: schedule_work for mod=%s\n", __func__, mod->name);
> > > +		dump_stack();
> > >  		schedule_work(&init_free_wq);
> > > +	}
> > >
> > >  	mutex_unlock(&module_mutex);
> > >  	wake_up_all(&module_wq);
> > >
> > > but not my earlier fix and share the dmesg and ftrace output to see if the
> > > theory holds?
> > >
> > > Also, could you attach your config? Curious now what your crypto options look
> > > like after fiddling with some of them today while trying and failing to see
> > > this on x86.
> > >
> > > thanks,
> > > Daniel
> >
> > Hello
> >
> > Sorry for the very delayed answer.
> >
> > I fail to reproduce it on x86 (qemu and real hw) and arm.
> > It seems to only happen on arm64.
>
> Thanks for the config and dmesg, but there's no ftrace. I see it's not
> configured in your kernel, so could you boot with my earlier debug patch plus
> this one and the kernel argument initcall_debug instead?
>
> I'm trying to see whether it really is a request module call from the crypto
> tests that's triggering this warning. Preeetty likely that's what's happening,
> but want to be sure since I can't reproduce this. Then I can post the fix.
>

I have added CONFIG_FTRACE=y and your second patch.
The boot log can be seen at http://kernel.montjoie.ovh/108789.log

But it seems the latest dump_stack() addition floods the log a bit.
I have started to read the ftrace documentation, but if you have a quick
pointer on what to do in /sys/kernel/debug/tracing, it would be helpful.