Re: [PATCH v2] workqueue: Fix false positive stall reports
From: Petr Mladek
Date: Wed Mar 25 2026 - 10:21:35 EST
On Wed 2026-03-25 14:19:27, Petr Mladek wrote:
> On Tue 2026-03-24 11:22:11, Song Liu wrote:
> > On Tue, Mar 24, 2026 at 3:01 AM Petr Mladek <pmladek@xxxxxxxx> wrote:
> > [...]
> > > This explains why taking the lock is needed.
> > >
> > > > > > + Since
> > > > > > + * __queue_work() is a much hotter path than the timer
> > > > > > + * function, we handle false positive here by reading
> > > > > > + * last_progress_ts again with pool->lock held.
> > >
> > > But this is confusing. It says that __queue_work() is a much hotter path
> > > but it already takes pool->lock. The sentence makes a feeling that
> > > the watchdog patch is less hot. Then it is weird why the watchdog
> > > path ignores the lock by default.
> >
> > This comment primarily concerns the cost of making the read lockless.
> > To do that, we need to add a memory barrier in __queue_work(),
> > which will slow down the hot path. __queue_work() does take
> > pool->lock, but in most cases, __queue_work() only takes the lock
> > in local pool, which is faster than taking all the pool->locks from the
> > watchdog timer.
> >
> > That said, I am open to other suggestions to get rid of this false
> > positive.
>
> See below a patch which tries to improve the comment.
>
> Honestly, I am not sure if it is better than the original. It probably
> better answers the questions which I had. But it might open another
> questions. I feel that I have lost a detached view because I already
> know the details. Also it is hard to mention everything and keep it
> "short".
>
> Feel free to take it, ignore it, or squash it into the original
> commit.
Grrr, I have sent a wrong version of the patch. I forgot to amend
last changes and refresh it. It is actually quite different.
Anyway, this is what I wanted to send: