Re: BUG: workqueue lockup - SRCU schedules work on not-online CPUs during size transition

From: Paul E. McKenney

Date: Thu Apr 30 2026 - 12:10:31 EST


On Thu, Apr 30, 2026 at 12:38:16PM +0530, Shrikanth Hegde wrote:
> Hi Paul.
>
> On 4/29/26 11:31 PM, Paul E. McKenney wrote:

[ . . . ]

Sorry, missed one...

> > ------------------------------------------------------------------------
> >
> > commit f8d5aaaf90f8294890802ce8dccbafd9850ac5f9
> > Author: Paul E. McKenney <paulmck@xxxxxxxxxx>
> > Date: Thu Apr 9 11:16:02 2026 -0700
> >
> > srcu: Don't queue workqueue handlers to never-online CPUs
> >
> > While an srcu_struct structure is in the midst of switching from CPU-0
> > to all-CPUs state, it can attempt to invoke callbacks for CPUs that
> > have never been online. Worse yet, it can attempt to invoke callbacks
> > for CPUs that never will be online due to not being present in the
>
> for CPUs that never will be online due to being present in the cpu_possible_mask?

Exactly.

Just because a CPU is in cpu_possible_mask doesn't mean that it will
ever actually come online. For example, for single-threaded performance
reasons, a given system might choose to bring online only one CPU from
each hyperthreaded core. In that case, the other CPU in each hyperthreaded
core could be in the cpu_possible_mask, but would never come online.
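
To make the possible-versus-online distinction concrete, here is a
userspace sketch of the redirect described in the commit log. This is
not the Tree SRCU code: the cpumask_sketch_t type, the SKETCH_* constants,
and the sketch_*() functions are all hypothetical names invented for
illustration, and the one-bit-per-CPU 64-bit mask is an assumption made
to keep the example small.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace model: one bit per CPU, at most 64 CPUs. */
typedef uint64_t cpumask_sketch_t;

#define SKETCH_BOOT_CPU 0u
#define SKETCH_MAX_CPUS 64u

/* Return true if @cpu's bit is set in @mask. */
static bool sketch_cpu_isset(cpumask_sketch_t mask, unsigned int cpu)
{
	assert(cpu < SKETCH_MAX_CPUS);
	return (mask >> cpu) & 1u;
}

/*
 * Choose where to queue work on behalf of @cpu.  A CPU that is merely
 * possible, but that has never been online, is redirected to the boot
 * CPU (assumed here to be CPU 0).
 */
static unsigned int sketch_pick_work_cpu(cpumask_sketch_t ever_online,
					 unsigned int cpu)
{
	return sketch_cpu_isset(ever_online, cpu) ? cpu : SKETCH_BOOT_CPU;
}

/* Count the CPUs that are possible but have never come online. */
static unsigned int sketch_count_never_online(cpumask_sketch_t possible,
					      cpumask_sketch_t ever_online)
{
	cpumask_sketch_t m = possible & ~ever_online;
	unsigned int n = 0;

	while (m) {
		m &= m - 1;	/* Clear the lowest set bit. */
		n++;
	}
	return n;
}
```

For example, with eight possible CPUs (mask 0xff) and only one thread
per two-thread core brought online (mask 0x55), four CPUs are possible
but never online, and work on behalf of CPU 3 would be redirected to
CPU 0 while work on behalf of CPU 4 would stay on CPU 4.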

Thanx, Paul

> > cpu_possible_mask. This can cause hangs on s390, which is not set up to
> > deal with workqueue handlers being scheduled on such CPUs. This commit
> > therefore causes Tree SRCU to refrain from queueing workqueue handlers
> > on CPUs that have not yet (and might never) come online.
> > Because callbacks are not invoked on CPUs that have not been
> > online, it is an error to invoke call_srcu(), synchronize_srcu(), or
> > synchronize_srcu_expedited() on a CPU that is not yet fully online.
> > However, it turns out to be less code to redirect the callbacks
> > from too-early invocations of call_srcu() than to warn about such
> > invocations. This commit therefore also redirects callbacks queued on
> > not-yet-fully-online CPUs to the boot CPU.
> > Reported-by: Vasily Gorbik <gor@xxxxxxxxxxxxx>
> > Signed-off-by: Paul E. McKenney <paulmck@xxxxxxxxxx>
> > Tested-by: Vasily Gorbik <gor@xxxxxxxxxxxxx>
> > Cc: Tejun Heo <tj@xxxxxxxxxx>