Re: [PATCH v2] srcu: Fix flush srcu structure's->sup work warning in cleanup_srcu_struct()

From: Paul E. McKenney
Date: Fri Mar 24 2023 - 10:04:31 EST


On Fri, Mar 24, 2023 at 03:53:08AM +0000, Zhang, Qiang1 wrote:
> > Cc: my personal email qiang.zhang1211@xxxxxxxxx
> >
> > > Unloading the rcutorture module triggers the following warning and call stack:
> > >
> > > insmod rcutorture.ko
> > > rmmod rcutorture.ko
> > >
> > > [ 209.437327] WARNING: CPU: 0 PID: 508 at kernel/workqueue.c:3167 __flush_work+0x50a/0x540
> > > [ 209.437346] Modules linked in: rcutorture(-) torture [last unloaded: rcutorture]
> > > [ 209.437382] CPU: 0 PID: 508 Comm: rmmod Tainted: G W 6.3.0-rc1-yocto-standard+
> > > [ 209.437406] RIP: 0010:__flush_work+0x50a/0x540
> > > .....
> > > [ 209.437758] flush_delayed_work+0x36/0x90
> > > [ 209.437776] cleanup_srcu_struct+0x68/0x2e0
> > > [ 209.437817] srcu_module_notify+0x71/0x140
> > > [ 209.437854] blocking_notifier_call_chain+0x9d/0xd0
> > > [ 209.437880] __x64_sys_delete_module+0x223/0x2e0
> > > [ 209.438046] do_syscall_64+0x43/0x90
> > > [ 209.438062] entry_SYSCALL_64_after_hwframe+0x72/0xdc
> > >
> > > flush_delayed_work()
> > > ->__flush_work()
> > >   ->if (WARN_ON(!work->func))
> > >         return false;
> > >
> > > For srcu_struct objects defined with DEFINE_SRCU() or
> > > DEFINE_STATIC_SRCU() in code that is built and loaded as a module,
> > > srcu_module_coming() is invoked at module load. It allocates and
> > > initializes the srcu_struct's ->sda, but does not fully initialize
> > > ->srcu_sup, so at this point ->srcu_sup->work.work.func is still NULL.
> > > If init_srcu_struct_fields() has not been invoked by the time the
> > > module is unloaded, srcu_module_going() ends up in __flush_work(),
> > > which finds work->func NULL and raises the warning above.
> > >
> > > This commit adds a check of ->srcu_sup->srcu_gp_seq_needed in
> > > srcu_module_going() to determine whether check_init_srcu_struct() has
> > > been invoked to initialize the srcu_struct. If it has not, there is no
> > > pending or running work, so there is nothing to flush, and only
> > > free_percpu() needs to be invoked to release the srcu_struct's ->sda.
> > >
> > > Co-developed-by: Paul E. McKenney <paulmck@xxxxxxxxxx>
> > >
> > >Thank you for the testing, bug-finding, and problem-solving!
> > >
> > >In theory, you would need a Signed-off-by here from me as well, but
> > >in practice bisectability means that this must be folded into this:
> > >
> > >e7c778489040 ("srcu: Use static init for statically allocated in-module srcu_struct")
> > >
> > >This will of course be with attribution.
> > >
> > > Signed-off-by: Zqiang <qiang1.zhang@xxxxxxxxx>
> > >
> > >But this is still a bit more complex than needed. How about something
> > >like this?
> >
> > Agree, from a logical point of view, this is more rigorous😊.
> >
> >And I finally got around to doing some modprobe/rmmod testing myself,
> >and it passes eleven cycles.
> >
> >May I add your Tested-by to the series?
>
> Of course I am glad to.

Thank you, and I will apply this on the next rebase.

Thanx, Paul

> Thanks
> Zqiang
>
> >
> > Thanx, Paul
> >
> > Thanks
> > Zqiang
> >
> > >
> > > Thanx, Paul
> > >
> > >------------------------------------------------------------------------
> > >
> > >/* Initialize any global-scope srcu_struct structures used by this module. */
> > >static int srcu_module_coming(struct module *mod)
> > >{
> > >	int i;
> > >	struct srcu_struct *ssp;
> > >	struct srcu_struct **sspp = mod->srcu_struct_ptrs;
> > >
> > >	for (i = 0; i < mod->num_srcu_structs; i++) {
> > >		ssp = *(sspp++);
> > >		ssp->sda = alloc_percpu(struct srcu_data);
> > >		if (WARN_ON_ONCE(!ssp->sda))
> > >			return -ENOMEM;
> > >	}
> > >	return 0;
> > >}
> > >
> > >/* Clean up any global-scope srcu_struct structures used by this module. */
> > >static void srcu_module_going(struct module *mod)
> > >{
> > >	int i;
> > >	struct srcu_struct *ssp;
> > >	struct srcu_struct **sspp = mod->srcu_struct_ptrs;
> > >
> > >	for (i = 0; i < mod->num_srcu_structs; i++) {
> > >		ssp = *(sspp++);
> > >		if (!rcu_seq_state(smp_load_acquire(&ssp->srcu_sup->srcu_gp_seq_needed)) &&
> > >		    !WARN_ON_ONCE(!ssp->srcu_sup->sda_is_static))
> > >			cleanup_srcu_struct(ssp);
> > >		free_percpu(ssp->sda);
> > >	}
> > >}
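
For readers following the thread, the guard in the quoted
srcu_module_going() can be modeled in userspace roughly as below. This is
only a sketch, not kernel code. Every model_* name is hypothetical, malloc()
stands in for alloc_percpu(), and two things are assumed rather than taken
from this thread: that static initialization leaves srcu_gp_seq_needed at an
all-ones sentinel until run-time initialization resets it to zero, and that
rcu_seq_state() returns the low-order two bits of the sequence number.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel structures; not the real layouts. */
struct model_work {
	void (*func)(struct model_work *work);	/* NULL until run-time init */
};

struct model_srcu {
	unsigned long gp_seq_needed;	/* models ->srcu_sup->srcu_gp_seq_needed */
	struct model_work work;		/* models ->srcu_sup->work.work */
	void *sda;			/* models ->sda */
};

/* Assumption: the state lives in the low-order two bits. */
#define MODEL_SEQ_STATE_MASK 0x3UL

static int model_warnings;

static void model_work_fn(struct model_work *work)
{
	(void)work;
}

/* Models the WARN_ON(!work->func) check in __flush_work(). */
static bool model_flush_work(struct model_work *work)
{
	if (!work->func) {
		model_warnings++;
		return false;
	}
	return true;
}

/* Module load: only ->sda is allocated, the work item stays uninitialized. */
static void model_module_coming(struct model_srcu *ssp)
{
	ssp->gp_seq_needed = -1UL;	/* assumed static-init sentinel */
	ssp->work.func = NULL;
	ssp->sda = malloc(64);		/* stand-in for alloc_percpu() */
}

/* First real use: run-time init sets up the work item and clears the sentinel. */
static void model_first_use(struct model_srcu *ssp)
{
	ssp->work.func = model_work_fn;
	ssp->gp_seq_needed = 0;
}

/* Module unload: with the guard, flush only if run-time init has happened. */
static void model_module_going(struct model_srcu *ssp, bool guard)
{
	if (!guard || !(ssp->gp_seq_needed & MODEL_SEQ_STATE_MASK))
		model_flush_work(&ssp->work);
	free(ssp->sda);
	ssp->sda = NULL;
}

/* Load, optionally use, then unload; returns the number of warnings raised. */
static int model_load_unload(bool use_before_unload, bool guard)
{
	struct model_srcu s;

	model_warnings = 0;
	model_module_coming(&s);
	if (use_before_unload)
		model_first_use(&s);
	model_module_going(&s, guard);
	return model_warnings;
}
```

In this model, calling model_load_unload(false, false) corresponds to the
insmod/rmmod sequence in the report, where the flush is reached with a NULL
work function. Passing guard as true corresponds to the check in the quoted
srcu_module_going(), which skips the flush and still frees ->sda.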