Re: "Dying CPU not properly vacated" splat
From: Paul E. McKenney
Date: Wed Jun 22 2022 - 15:59:06 EST
On Tue, Apr 26, 2022 at 09:24:45AM -0700, Paul E. McKenney wrote:
> On Tue, Apr 26, 2022 at 03:48:06PM +0100, Valentin Schneider wrote:
> > On 25/04/22 17:03, Paul E. McKenney wrote:
> > > On Mon, Apr 25, 2022 at 10:59:44PM +0100, Valentin Schneider wrote:
> > >> On 25/04/22 10:33, Paul E. McKenney wrote:
> > >> >
> > >> > So what did rcu_torture_reader() do wrong here? ;-)
> > >> >
> > >>
> > >> So on teardown, CPUHP_AP_SCHED_WAIT_EMPTY->sched_cpu_wait_empty() waits for
> > >> the rq to be empty. Tasks must *not* be enqueued onto that CPU after that
> > >> step has been run - if there are per-CPU tasks bound to that CPU, they must
> > >> be unbound in their respective hotplug callback.
> > >>
> > >> For instance for workqueue.c, we have workqueue_offline_cpu() as a hotplug
> > >> callback which invokes unbind_workers(cpu), the interesting bit being:
> > >>
> > >> 	for_each_pool_worker(worker, pool) {
> > >> 		kthread_set_per_cpu(worker->task, -1);
> > >> 		WARN_ON_ONCE(set_cpus_allowed_ptr(worker->task, cpu_possible_mask) < 0);
> > >> 	}
> > >>
> > >> The rcu_torture_reader() kthreads aren't bound to any particular CPU, are
> > >> they? I can't find any code that would indicate they are - and in that
> > >> case it means we have a problem with is_cpu_allowed() or related code.
> > >
> > > I did not intend that the rcu_torture_reader() kthreads be bound, and
> > > I am not seeing anything that binds them.
> > >
> > > Thoughts? (Other than that validating any alleged fix will be quite
> > > "interesting".)
> >
> > IIUC the bogus scenario is that is_cpu_allowed() lets one of those kthreads
> > be enqueued on the outgoing CPU *after* CPUHP_AP_SCHED_WAIT_EMPTY.teardown()
> > has been run, and hilarity ensues.
> >
> > The cpu_dying() condition should prevent a regular kthread from getting
> > enqueued there; most of the details have been evicted from my brain, but I
> > recall we got the ordering conditions right...
> >
> > The only other "obvious" thing here is migrate_disable(), which lets the
> > enqueue happen, but then balance_push()->select_fallback_rq() should punt
> > the task away at the next context switch.
> >
> > I need to rediscover those paths; I don't see any obvious clue right now.
>
> Thank you for looking into this!
>
> The only thought that came to me was to record that is_cpu_allowed()
> returned true due to migration being disabled, and then use that in later
> traces, printk()s, or whatever.
>
> My own favorite root-cause hypothesis was invalidated by the fact that
> is_cpu_allowed() returns cpu_online(cpu) rather than just true. ;-)
And I hit this on two of fifteen TREE03 runs on a merge of -rcu with
yesterday's linus/master. It might be a fluke, but I thought I should
at least report it. This is the first time I have seen it since my last
email in this thread.
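
In case it helps anyone else staring at this, the enqueue-time check under
discussion is roughly the following. This is a from-memory sketch of
kernel/sched/core.c:is_cpu_allowed(), simplified and quite possibly out of
date, so please consult the actual tree before drawing conclusions from it:

	static inline bool is_cpu_allowed(struct task_struct *p, int cpu)
	{
		/* Not in the task's affinity mask?  Then no. */
		if (!cpumask_test_cpu(cpu, p->cpus_ptr))
			return false;

		/*
		 * A migrate_disable() region must be allowed to finish where
		 * it is, even on a dying CPU, hence cpu_online() rather than
		 * cpu_active() here.  balance_push() is then expected to
		 * evict the task at the next context switch.
		 */
		if (is_migration_disabled(p))
			return cpu_online(cpu);

		/* Userspace tasks need the CPU to be fully active. */
		if (!(p->flags & PF_KTHREAD))
			return cpu_active(cpu);

		/* Per-CPU kthreads stay until the CPU is fully offline. */
		if (kthread_is_per_cpu(p))
			return cpu_online(cpu);

		/*
		 * Unbound kthreads such as rcu_torture_reader() are supposed
		 * to be turned away once the CPU is marked dying...
		 */
		if (cpu_dying(cpu))
			return false;

		/* ...but are otherwise allowed while the CPU comes or goes. */
		return cpu_online(cpu);
	}

If that sketch is anywhere near right, either a reader slips in despite the
cpu_dying() check, or it goes in with migration disabled and then somehow
fails to get pushed away afterwards, which is why recording the
migration-disabled case still looks like the most promising thing to trace.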
Thanx, Paul