Re: workqueue lockup debug

From: John Garry
Date: Mon Nov 11 2024 - 08:22:46 EST


On 08/11/2024 08:54, Peter Zijlstra wrote:
On Fri, Nov 08, 2024 at 09:57:38AM +1100, Dave Chinner wrote:
On Thu, Nov 07, 2024 at 01:39:39PM +0100, Thorsten Leemhuis wrote:
On 24.10.24 17:49, John Garry wrote:
Hi workqueue and scheduler maintainers,

As reported in
https://lore.kernel.org/linux-fsdevel/df9db1ce-17d9-49f1-ab6d-7ed9a4f1f9c0@xxxxxxxxxx/T/#m506b9edb1340cdddd87c6d14d20222ca8d7e8796,
I am experiencing a
workqueue lockup for v6.12-rcX.

John, was this resolved in between? This and the other thread[1] look
stalled, but I might be missing something. Asking, because I have this
on my list of tracked regressions and wonder if this is something that
better should be solved one way or another before 6.12.

[1]
https://lore.kernel.org/lkml/63d6ceeb-a22f-4dee-bc9d-8687ce4c7355@xxxxxxxxxx/

I'm still seeing the scheduler bug in -rc6.

But that WARN you reported earlier isn't there anymore. So what exactly
are you seeing now?

My problem reported in https://lore.kernel.org/lkml/63d6ceeb-a22f-4dee-bc9d-8687ce4c7355@xxxxxxxxxx/ seems to be fixed after rc5. rc6 and rc7 look ok. But I will test more.

Any idea what that fix could be?