Re: [PATCH 3/4] test-ww_mutex: Handle transient -EDEADLK in test_cycle_work


From: John Stultz

Date: Thu Jun 18 2026 - 15:06:23 EST

On Wed, Jun 17, 2026 at 5:13 AM Håkon Bugge <haakon.bugge@xxxxxxxxxx> wrote:
>
> There is a timing issue in test_cycle_work(): acquiring *a_mutex*
> after a deadlock has been detected on *b_mutex* may not succeed
> immediately. This may lead to false negatives, which show up in the
> log as:
>
> cyclic deadlock not resolved, ret[77/93] = -35
>
> We fix that by re-trying until the lock is acquired.
>

I definitely have seen this error in testing previously.

But is this fix right? On -EDEADLK, I thought the ww_mutex protocol
requires the task to drop all of the locks it holds and then retry
acquiring all of them.
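
For illustration, here is a minimal userspace sketch of the back-off
pattern as I understand it from Documentation/locking/ww-mutex-design.rst.
It is a single-threaded toy that only shows the control flow: struct
toy_lock, toy_lock(), toy_lock_slow(), toy_unlock() and lock_both() are
hypothetical stand-ins, not kernel APIs, and the -EDEADLK results are
scripted rather than produced by real contention.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a ww_mutex (assumption, not the kernel type). */
struct toy_lock {
	bool held;		/* true while "we" hold the lock */
	int deadlocks_left;	/* scripted number of -EDEADLK results to return */
};

/* Stand-in for ww_mutex_lock(): may report -EDEADLK instead of acquiring. */
static int toy_lock(struct toy_lock *l)
{
	if (l->deadlocks_left > 0) {
		l->deadlocks_left--;
		return -EDEADLK;
	}
	l->held = true;
	return 0;
}

/*
 * Stand-in for ww_mutex_lock_slow(): only called while holding no other
 * lock. The real call sleeps until the lock is free; the toy just succeeds.
 */
static void toy_lock_slow(struct toy_lock *l)
{
	l->held = true;
}

static void toy_unlock(struct toy_lock *l)
{
	l->held = false;
}

/*
 * Acquire both locks. On -EDEADLK, release everything held so far, take
 * the contended lock through the slow path, then retry the remaining one.
 */
static int lock_both(struct toy_lock *locks[2], int *backoffs)
{
	struct toy_lock *contended = NULL;
	int err;

retry:
	if (contended)
		toy_lock_slow(contended);

	for (int i = 0; i < 2; i++) {
		if (locks[i] == contended)
			continue;	/* already held via the slow path */

		err = toy_lock(locks[i]);
		if (err == -EDEADLK) {
			/* Back off: drop every lock we currently hold. */
			for (int j = 0; j < 2; j++)
				if (locks[j]->held)
					toy_unlock(locks[j]);
			contended = locks[i];
			(*backoffs)++;
			goto retry;
		}
		if (err)
			return err;
	}
	return 0;
}
```

The point of the sketch is that every -EDEADLK leads back to a state
where nothing is held before the slow-path acquire, which is different
from spinning on one lock while keeping the other.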

> Fixes: d1b42b800e5d ("locking/ww_mutex: Add kselftests for resolving ww_mutex cyclic deadlocks")
> Fixes: e4a02ed2aaf4 ("locking/ww_mutex: Fix runtime warning in the WW mutex selftest")
> Signed-off-by: Håkon Bugge <haakon.bugge@xxxxxxxxxx>
> ---
> kernel/locking/test-ww_mutex.c | 4 +++-
> 1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/locking/test-ww_mutex.c b/kernel/locking/test-ww_mutex.c
> index 5a4c92801bdfb..6b29a7a8f5fba 100644
> --- a/kernel/locking/test-ww_mutex.c
> +++ b/kernel/locking/test-ww_mutex.c
> @@ -306,7 +306,9 @@ static void test_cycle_work(struct work_struct *work)
>  		err = 0;
>  		ww_mutex_unlock(&cycle->a_mutex);
>  		ww_mutex_lock_slow(cycle->b_mutex, &ctx);
> -		erra = ww_mutex_lock(&cycle->a_mutex, &ctx);
> +		do {
> +			erra = ww_mutex_lock(&cycle->a_mutex, &ctx);
> +		} while (erra == -EDEADLK);
>  	}

I don't have a clear example in mind, but just retrying the same lock
while still holding b_mutex (especially in a loop with no timeout or
retry limit) seems like it could open up other problems here.

thanks
-john