Re: [PATCH 05/16] rcu: De-offloading CB kthread

From: Frederic Weisbecker
Date: Wed Nov 04 2020 - 09:45:28 EST


On Wed, Nov 04, 2020 at 10:42:09PM +0800, Boqun Feng wrote:
> On Wed, Nov 04, 2020 at 03:31:35PM +0100, Frederic Weisbecker wrote:
> [...]
> > >
> > > > +	rcu_segcblist_offload(cblist, false);
> > > > +	raw_spin_unlock_rcu_node(rnp);
> > > > +
> > > > +	if (rdp->nocb_cb_sleep) {
> > > > +		rdp->nocb_cb_sleep = false;
> > > > +		wake_cb = true;
> > > > +	}
> > > > +	rcu_nocb_unlock_irqrestore(rdp, flags);
> > > > +
> > > > +	if (wake_cb)
> > > > +		swake_up_one(&rdp->nocb_cb_wq);
> > > > +
> > > > +	swait_event_exclusive(rdp->nocb_state_wq,
> > > > +			      !rcu_segcblist_test_flags(cblist, SEGCBLIST_KTHREAD_CB));
> > > > +
> > > > +	return 0;
> > > > +}
> > > > +
> > > > +static long rcu_nocb_rdp_deoffload(void *arg)
> > > > +{
> > > > +	struct rcu_data *rdp = arg;
> > > > +
> > > > +	WARN_ON_ONCE(rdp->cpu != raw_smp_processor_id());
> > >
> > > I think this warning can actually happen, if I understand how workqueue
> > > works correctly. Consider that the corresponding cpu gets offlined right
> > > after the rcu_nocb_cpu_deoffload(), and the workqueue of that cpu
> > > becomes unbound, and IIUC, workqueues don't do migration during
> > > cpu-offlining, which means the worker can be scheduled to other CPUs,
> > > and the work gets executed on another cpu. Am I missing something here?
> >
> > We are holding cpus_read_lock() in rcu_nocb_cpu_offload(), which should
> > prevent that.
> >
>
> But what if the work doesn't get executed until we cpus_read_unlock()
> and someone offlines that CPU?

work_on_cpu() waits for completion before returning, so the work has
already run by the time we reach cpus_read_unlock().

> Regards,
> Boqun
>
> > Thanks!