Re: commit 894d1b3db41c leads to frequent hangs when booting

From: Bert Karwatzki
Date: Thu Dec 12 2024 - 07:03:21 EST


Am Donnerstag, dem 12.12.2024 um 11:31 +0100 schrieb Bert Karwatzki:
> Am Mittwoch, dem 11.12.2024 um 23:40 -0800 schrieb John Stultz:
> > On Wed, Dec 11, 2024 at 6:41 PM John Stultz <jstultz@xxxxxxxxxx> wrote:
> > >
> > > On Wed, Dec 11, 2024 at 4:17 PM Bert Karwatzki <spasswolf@xxxxxx> wrote:
> > > > Am Mittwoch, dem 11.12.2024 um 15:14 -0800 schrieb John Stultz:
> > > > > On Wed, Dec 11, 2024 at 2:46 PM Bert Karwatzki <spasswolf@xxxxxx> wrote:
> > > > > >
> > > > > > Am Mittwoch, dem 11.12.2024 um 22:35 +0100 schrieb Bert Karwatzki:
> > > > > > > I have confirmed that I that linux-next-20241210 is fixed by the same revert
> > > > > > > as v6.13-rc2 (ten boots without incident is the criterion for a good commit)
> > > > > > >
> > > > > > >
> > > > > > > Bert Karwatzki
> > > > > >
> > > > > > Also this bug only occurs with CONFIG_PREEMPT_RT=y, I've just checked v6.13-rc2
> > > > > > without the revert and the following preempt settings and got 10 clean boots:
> > > > > >
> > > > >
> > > > > Hrm. That matches the case where the fix here helped:
> > > > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=82f9cc094975240885c93effbca7f4603f5de1bf
> > > > >
> > > > >
> > > > > I'm still working on getting a minimal delta from your config booting
> > > > > in my environment, but in the meantime, I'd be curious if the
> > > > > following reduced revert helps?
> > > > > https://github.com/johnstultz-work/linux-dev/commit/60c60f85670fb1f4708adbe55e15ab918d96f9f0
> > > > >
> > > > > Basically just trying to clarify if the problem is moving the wakeup
> > > > > to the wake_q or if some other interaction (maybe with the
> > > > > preempt_disables) could be causing the issue.
> > > > > (I'm assuming you have 82f9cc094975 applied when you test with that change)
> > > > >
> > > >
> > > > I tested linux-next-20241210 (which includes 82f9cc094975) with your patch
> > > >
> > > > diff --git a/kernel/locking/ww_mutex.h b/kernel/locking/ww_mutex.h
> > > > index 37f025a096c9..724d4c871cf6 100644
> > > > --- a/kernel/locking/ww_mutex.h
> > > > +++ b/kernel/locking/ww_mutex.h
> > > > @@ -284,7 +284,7 @@ __ww_mutex_die(struct MUTEX *lock, struct MUTEX_WAITER
> > > > *waiter,
> > > > #ifndef WW_RT
> > > > debug_mutex_wake_waiter(lock, waiter);
> > > > #endif
> > > > - wake_q_add(wake_q, waiter->task);
> > > > + wake_up_process(waiter->task);
> > > > }
> > > >
> > > > return true;
> > > > @@ -332,7 +332,7 @@ static bool __ww_mutex_wound(struct MUTEX *lock,
> > > > * wakeup pending to re-read the wounded state.
> > > > */
> > > > if (owner != current)
> > > > - wake_q_add(wake_q, owner);
> > > > + wake_up_process(owner);
> > > >
> > > > return true;
> > > > }
> > > >
> > > > and this fixes the issue for me.
> > >
> > > Ok, thanks for validating that! Hrm. Ok, I suspect I'm missing a path
> > > in rtmutex.c that calls into __schedule() before we pop up to the
> > > point where we actually wake the wake_q.
> > >
> > > I may have found one or two spots that look likely, and I believe I've
> > > also managed to reproduce your problem using the test-ww_mutex driver.
> > >
> > > So I'm working up a proper patch and will share for testing here soon.
> >
> > Ok, could you try the patch here:
> > https://github.com/johnstultz-work/linux-dev/commit/3c902b92c88122cd034937e8d40930bac254a7c5
> >
> > I've repushed to the test branch I shared earlier, so its against
> > 6.13-rc2 + fix that has landed upstream + test fix above ( + another
> > patch that makes stressing the ww_mutexes easier):
> > https://github.com/johnstultz-work/linux-dev/commits/debug/894d1b3db41c-hang-bert/
> >
> > Let me know if you still have issues with this.
> > thanks
> > -john
>
> I cherry-picked your two patches on top of linux-next-20241210:
>
> 2029cb31eae9 (HEAD -> master) MAYBEFIX: Make sure we wake anything on the wake_q
> when we release the lock->wait_lock
> b525e4779098 test-ww_mutex: Allow test to be run (and re-run) from userland
> 1b2ab8149928 (tag: next-20241210) Add linux-next specific files for 20241210
>
> This fixes the issue for me.
>
> Bert Karwatzki
>

Just tested half the fix (only the task_blocks_on_rt_mutex() part on top of
linux-next-20241212 where the bug is also present) and found it to be sufficient
to fix the issue for me:

diff --git a/kernel/locking/rtmutex.c b/kernel/locking/rtmutex.c
index e858de203eb6..ddbb7423fb69 100644
--- a/kernel/locking/rtmutex.c
+++ b/kernel/locking/rtmutex.c
@@ -1292,7 +1292,13 @@ static int __sched task_blocks_on_rt_mutex(struct
rt_mutex_base *lock,
*/
get_task_struct(owner);

+ preempt_disable();
raw_spin_unlock_irq(&lock->wait_lock);
+ /* wake up any tasks on the wake_q before calling
rt_mutex_adjust_prio_chain */
+ wake_up_q(wake_q);
+ wake_q_init(wake_q);
+ preempt_enable();
+

res = rt_mutex_adjust_prio_chain(owner, chwalk, lock,
next_lock, waiter, task);



Bert Karwatzki