Re: [PATCH v2] workqueue: Fix false positive stall reports

From: Song Liu

Date: Tue Mar 24 2026 - 14:25:55 EST


On Tue, Mar 24, 2026 at 3:01 AM Petr Mladek <pmladek@xxxxxxxx> wrote:
[...]
> This explains why taking the lock is needed.
>
> > > > + Since
> > > > + * __queue_work() is a much hotter path than the timer
> > > > + * function, we handle false positive here by reading
> > > > + * last_progress_ts again with pool->lock held.
>
> But this is confusing. It says that __queue_work() is a much hotter path
> but it already takes pool->lock. The sentence makes a feeling that
> the watchdog patch is less hot. Then it is weird why the watchdog
> path ignores the lock by default.

This comment is primarily about the cost of making the read fully
lockless. To do that, we would need to add a memory barrier in
__queue_work(), which would slow down the hot path. __queue_work()
does take pool->lock, but in most cases it only takes the lock of the
local pool, which is cheaper than taking every pool->lock from the
watchdog timer.
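
To illustrate the trade-off, here is a rough userspace sketch of the
pattern, not the code from the patch. Everything in it is an assumption
made for illustration: struct pool, pool_note_progress() and
pool_stalled() are hypothetical names, a pthread mutex stands in for
pool->lock, and plain integers stand in for jiffies. The writer adds
nothing beyond the lock it already holds, and the checker only pays for
the lock when the lockless read looks like a stall.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a worker pool; not the kernel structure. */
struct pool {
	pthread_mutex_t lock;                   /* stands in for pool->lock */
	_Atomic unsigned long last_progress_ts; /* updated with lock held */
};

/* Wraparound-safe "now is later than ts + thresh" comparison. */
static bool past_deadline(unsigned long now, unsigned long ts,
			  unsigned long thresh)
{
	return (long)(now - (ts + thresh)) > 0;
}

/*
 * Hot path (the __queue_work() role): the lock is taken anyway, so the
 * timestamp update needs no additional memory barrier.
 */
static void pool_note_progress(struct pool *p, unsigned long now)
{
	pthread_mutex_lock(&p->lock);
	atomic_store_explicit(&p->last_progress_ts, now,
			      memory_order_relaxed);
	pthread_mutex_unlock(&p->lock);
}

/*
 * Cold path (the watchdog role): read without the lock first. Only if
 * that possibly stale value looks like a stall, read again with the
 * lock held before reporting.
 */
static bool pool_stalled(struct pool *p, unsigned long now,
			 unsigned long thresh)
{
	unsigned long ts;

	ts = atomic_load_explicit(&p->last_progress_ts,
				  memory_order_relaxed);
	if (!past_deadline(now, ts, thresh))
		return false;	/* common case: no lock taken */

	pthread_mutex_lock(&p->lock);
	ts = atomic_load_explicit(&p->last_progress_ts,
				  memory_order_relaxed);
	pthread_mutex_unlock(&p->lock);

	return past_deadline(now, ts, thresh);
}
```

In this sketch the checker never takes the lock while the pool is
making progress, so the extra cost lands only on the rare suspected
stall rather than on every queueing operation.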

That said, I am open to other suggestions to get rid of this false
positive.

Thanks,
Song