Re: [REGRESSION] sched/core merge 1c3b68f0d55b: futex_waitv lost wakeup hangs RE Engine games and PID 1 init

From: Mikhail Gavrilov

Date: Wed Apr 22 2026 - 06:48:38 EST


On Tue, Apr 21, 2026 at 7:41 PM Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
>
> Specifically, see this thread:
>
> https://lore.kernel.org/r/95651a71-1adf-45ba-83eb-5744bc6d4a52@xxxxxxx

You're right — it is to blame, and I was about to follow up to say so.

After sending the first mail I kept re-verifying markers with the
stricter RE2+RE9 workload and found that 33c66eb5e984 (parent 1 of
1c3b68f0, which I had marked "good") actually reproduces the hang too.
A third bisect run with the stricter workload converges unambiguously
on:

25500ba7e77c ("locking/mutex: Remove the list_head from struct mutex")

Parent b9bdd4b68404 ("locking/semaphore: Remove the list_head from struct
semaphore"): good. 25500ba7e77c: bad.
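
To illustrate why a stricter workload changes the markers: an
intermittent hang can survive a single run and get marked good by
mistake. A minimal sketch of a stricter rule, assuming the repro can be
wrapped in one command that exits 0 on success. The name `verdict`, the
attempt count and the time limit are hypothetical, and this is not the
procedure used for the bisect above. It relies on coreutils `timeout`.

```shell
# Hypothetical helper: print "good" only if the repro command succeeds
# on every attempt; print "bad" on the first failure or timeout.
#   $1 = number of attempts, $2 = time limit in seconds, rest = command
verdict() {
    attempts=$1
    limit=$2
    shift 2
    i=0
    while [ "$i" -lt "$attempts" ]; do
        # coreutils timeout exits with 124 if the command hangs past
        # the limit; any non-zero status is treated as a reproduction.
        timeout "$limit" "$@" >/dev/null 2>&1
        status=$?
        if [ "$status" -ne 0 ]; then
            echo bad
            return 0
        fi
        i=$((i + 1))
    done
    echo good
}
```

For example, `verdict 5 300 ./repro.sh` (with a hypothetical repro.sh)
would print "bad" as soon as one of five attempts fails or runs past
300 seconds.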

So the "semantic conflict in sched/core merge" framing in my original
mail was wrong; the regression is in the mutex wait-list rework, which
is exactly where you pointed. The ww_mutex breakage fits the symptoms
very well — RE Engine games go through DXVK -> RADV -> amdgpu, which
leans on ww_mutex via drm_exec for reservation objects, and the
userspace-visible effect is the render thread waiting forever on
futex_waitv for a submit that never completes. init startup similarly
exercises DRM reservation locks early, which explains the PID 1 variant.

Third bisect log attached. Full revert of 25500ba7e77c on current
master conflicts with later commits in the same series (rwsem/rtmutex
list_head removal, spinlock context analysis), so I couldn't test a
single-commit revert in isolation, but the bisect is clean.

Thanks for the pointer to the thread at
https://lore.kernel.org/r/95651a71-1adf-45ba-83eb-5744bc6d4a52@xxxxxxx
I'll read it, and I'm happy to test any candidate fix against the
RE2/RE9 repro.

Apologies for the bogus first-bad in the original mail.

Note for regzbot: the earlier #regzbot introduced: 1c3b68f0d55b in this
thread was based on a bisect that converged on the wrong commit due to
a false-negative "good" marker. Re-bisecting with a stricter workload
identified the actual first-bad as 25500ba7e77c. Updating below.

#regzbot introduced: 25500ba7e77ce9d3d9b5a1929d41a2ee2e23f6fe
#regzbot title: RE Engine games and PID 1 init hang in futex_waitv after mutex list_head removal

--
Thanks,
Mikhail

Third bisect log (attachment):
git bisect start
# status: waiting for both good and bad commits
# good: [028ef9c96e96197026887c0f092424679298aae8] Linux 7.0
git bisect good 028ef9c96e96197026887c0f092424679298aae8
# bad: [1f5ffc672165ff851063a5fd044b727ab2517ae3] Fix mismerge of the arm64 / timer-core interrupt handling changes
git bisect bad 1f5ffc672165ff851063a5fd044b727ab2517ae3
# skip: [ee60c510fb3468ec6fab98419218c4e7b37e2ca3] Merge tag 'nolibc-20260412-for-7.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/nolibc/linux-nolibc
git bisect skip ee60c510fb3468ec6fab98419218c4e7b37e2ca3
# good: [1f9017d19db38ad2cb9bedb5b078f6f4f60afa94] wifi: mt76: mt7996: fix queue pause after scan due to wrong channel switch reason
git bisect good 1f9017d19db38ad2cb9bedb5b078f6f4f60afa94
# bad: [33c66eb5e9844429911bf5478c96c60f9f8af9d0] Merge tag 'perf-core-2026-04-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
git bisect bad 33c66eb5e9844429911bf5478c96c60f9f8af9d0
# good: [cea4a90faf9e5d15aee1fd01883bc81ad7640260] Merge tag 'seccomp-v7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
git bisect good cea4a90faf9e5d15aee1fd01883bc81ad7640260
# good: [d60bc140158342716e13ff0f8aa65642f43ba053] Merge tag 'pwrseq-updates-for-v7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux
git bisect good d60bc140158342716e13ff0f8aa65642f43ba053
# good: [db23954eeaf23464669043ddbb38a64f7b301ebd] Merge tag 'irq-core-2026-04-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
git bisect good db23954eeaf23464669043ddbb38a64f7b301ebd
# good: [c1fe867b5bf9c57ab7856486d342720e2b205eed] Merge tag 'timers-core-2026-04-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
git bisect good c1fe867b5bf9c57ab7856486d342720e2b205eed
# good: [e80d033851b3bc94c3d254ac66660ddd0a49d72c] Merge tag 'smp-core-2026-04-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
git bisect good e80d033851b3bc94c3d254ac66660ddd0a49d72c
# bad: [7393febcb1b2082c0484952729cbebfe4dc508d5] Merge tag 'locking-core-2026-04-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
git bisect bad 7393febcb1b2082c0484952729cbebfe4dc508d5
# good: [1ea4b473504b6dc6a0d21c298519aff2d52433c9] locking/rwsem: Remove the list_head from struct rw_semaphore
git bisect good 1ea4b473504b6dc6a0d21c298519aff2d52433c9
# bad: [68bcd8b6e0b10d902f7fc8bf3f08f335f5d1640e] locking/rwsem: Fix logic error in rwsem_del_waiter()
git bisect bad 68bcd8b6e0b10d902f7fc8bf3f08f335f5d1640e
# bad: [90bb681dcdf7e69c90b56a18f06c0389a0810b92] locking/rtmutex: Add context analysis
git bisect bad 90bb681dcdf7e69c90b56a18f06c0389a0810b92
# bad: [07574b8ebaac7927e2355b4f343b03b50e04494c] compiler-context-analysys: Add __cond_releases()
git bisect bad 07574b8ebaac7927e2355b4f343b03b50e04494c
# bad: [25500ba7e77ce9d3d9b5a1929d41a2ee2e23f6fe] locking/mutex: Remove the list_head from struct mutex
git bisect bad 25500ba7e77ce9d3d9b5a1929d41a2ee2e23f6fe
# good: [b9bdd4b6840454ef87f61b6506c9635c57a81650] locking/semaphore: Remove the list_head from struct semaphore
git bisect good b9bdd4b6840454ef87f61b6506c9635c57a81650
# first bad commit: [25500ba7e77ce9d3d9b5a1929d41a2ee2e23f6fe] locking/mutex: Remove the list_head from struct mutex
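
The first-bad line at the end of a log like this can be pulled out
mechanically, for instance to build the regzbot command. A minimal
sketch, assuming the log was saved to a file. The name `first_bad` and
the file name in the usage comment are hypothetical.

```shell
# Print the full 40-character hash from the "# first bad commit: [...]"
# line of a saved git bisect log (path given as $1).
# Prints nothing if the log has no such line.
first_bad() {
    sed -n 's/^# first bad commit: \[\([0-9a-f]\{40\}\)\].*/\1/p' "$1"
}

# Hypothetical usage, with the log saved as bisect.log:
#   printf '#regzbot introduced: %s\n' "$(first_bad bisect.log)"
```

Matching only the full 40-character hash keeps abbreviated hashes in
other comment lines from being picked up by accident.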