Re: [PATCH 05/16] rcu: De-offloading CB kthread

From: Boqun Feng
Date: Wed Nov 04 2020 - 09:43:10 EST


On Wed, Nov 04, 2020 at 03:31:35PM +0100, Frederic Weisbecker wrote:
[...]
> >
> > > +	rcu_segcblist_offload(cblist, false);
> > > +	raw_spin_unlock_rcu_node(rnp);
> > > +
> > > +	if (rdp->nocb_cb_sleep) {
> > > +		rdp->nocb_cb_sleep = false;
> > > +		wake_cb = true;
> > > +	}
> > > +	rcu_nocb_unlock_irqrestore(rdp, flags);
> > > +
> > > +	if (wake_cb)
> > > +		swake_up_one(&rdp->nocb_cb_wq);
> > > +
> > > +	swait_event_exclusive(rdp->nocb_state_wq,
> > > +			      !rcu_segcblist_test_flags(cblist, SEGCBLIST_KTHREAD_CB));
> > > +
> > > +	return 0;
> > > +}
> > > +
> > > +static long rcu_nocb_rdp_deoffload(void *arg)
> > > +{
> > > +	struct rcu_data *rdp = arg;
> > > +
> > > +	WARN_ON_ONCE(rdp->cpu != raw_smp_processor_id());
> >
> > I think this warning can actually trigger, if I understand correctly
> > how workqueues work. Consider that the corresponding CPU gets offlined
> > right after rcu_nocb_cpu_deoffload() queues the work: the workqueue of
> > that CPU becomes unbound and, IIUC, workqueues don't migrate work during
> > CPU offlining, which means the worker can be scheduled on other CPUs
> > and the work gets executed on another CPU. Am I missing something here?
>
> We are holding cpus_read_lock() in rcu_nocb_cpu_offload(), which should
> prevent that.
>

But what if the work doesn't get executed until after we call
cpus_read_unlock(), and someone offlines that CPU in the meantime?

Regards,
Boqun

> Thanks!