Re: [PATCH v2] workqueue: Fix false positive stall reports
From: Petr Mladek
Date: Wed Mar 25 2026 - 09:49:27 EST
On Tue 2026-03-24 11:22:11, Song Liu wrote:
> On Tue, Mar 24, 2026 at 3:01 AM Petr Mladek <pmladek@xxxxxxxx> wrote:
> [...]
> > This explains why taking the lock is needed.
> >
> > > > > + Since
> > > > > + * __queue_work() is a much hotter path than the timer
> > > > > + * function, we handle false positive here by reading
> > > > > + * last_progress_ts again with pool->lock held.
> >
> > But this is confusing. It says that __queue_work() is a much hotter path
> > but it already takes pool->lock. The sentence makes a feeling that
> > the watchdog patch is less hot. Then it is weird why the watchdog
> > path ignores the lock by default.
>
> This comment primarily concerns the cost of making the read lockless.
> To do that, we need to add a memory barrier in __queue_work(),
> which will slow down the hot path. __queue_work() does take
> pool->lock, but in most cases, __queue_work() only takes the lock
> in local pool, which is faster than taking all the pool->locks from the
> watchdog timer.
>
> That said, I am open to other suggestions to get rid of this false
> positive.
See below a patch which tries to improve the comment.
Honestly, I am not sure if it is better than the original. It probably
better answers the questions which I had. But it might open another
questions. I feel that I have lost a detached view because I already
know the details. Also it is hard to mention everything and keep it
"short".
Feel free to take it, ignore it, or squash it into the original
commit.