On 4/11/22 4:07 PM, Waiman Long wrote:
Hi
On 4/11/22 17:03, john.p.donnelly@xxxxxxxxxx wrote:
I am looking forward to your testing results tomorrow.
I have reached out to Waiman and he suggested this for our next test pass:
1ee326196c6658 locking/rwsem: Always try to wake waiters in out_nolock path
Does this commit help to avoid the lockup problem?
Commit 1ee326196c6658 fixes a potential missed wakeup problem when a reader that is first in the wait queue is interrupted out without acquiring the lock. It is actually not a fix for commit d257cc8cb8d5. However, commit d257cc8cb8d5 changes the out_nolock path behavior of writers by leaving the handoff bit set when the wait queue isn't empty. That likely makes the missed wakeup problem easier to reproduce.
Cheers,
Longman
Hi,
We are testing now.
ETA for fio soak test completion is ~15hr from now.
I wanted to share the stack traces for future reference + occurrences.
Our 24hr fio soak test with:
1ee326196c6658 locking/rwsem: Always try to wake waiters in out_nolock path
applied to 5.15.30 passed.
I suggest you tag 1ee326196c6658 with:
Cc: stable
Fixes: d257cc8cb8d5 ("locking/rwsem: Make handoff bit handling more consistent")
I'll leave the details of how to do that up to the core maintainers ;-)