On Fri, Mar 18, 2022 at 12:16:09PM -0400, Waiman Long wrote:
> In an analysis of a recent vmcore, a reader-owned rwsem was found with
> 385 readers but no writer in the wait queue. That is kind of unusual
> but it may be caused by some race conditions that we have not fully
> understood yet. In such a case, all the readers in the wait queue should
> join the other reader-owners and acquire the read lock.
>
> In rwsem_down_write_slowpath(), an incoming writer will try to wake
> up the front readers under such circumstances. That is not the case for
> rwsem_down_read_slowpath(); modify the code to do this. This includes the
> original supported case where the wait queue is empty and the incoming
> reader is going to wake up itself.
>
> With CONFIG_LOCK_EVENT_COUNTS enabled, the newly added rwsem_rlock_rwake
> event counter had 13 hits right after the bootup of a 2-socket system. So
> the condition that a reader-owned rwsem has readers at the front of
> the wait queue does happen pretty frequently. This patch will help to
> speed things up in such cases.

Urgh.. this so reads like a band-aid.

Anyway; it appears to me the out_nolock case of down_read doesn't
feature a wakeup; can we create a scenario with that?
Anyway, I think I much prefer you sitting down writing the rules for
queueing and wakeup and then verifying them against the code rather than
adding extra wakeups just because.
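
To make "write the rules down, then verify" a bit more concrete, here is
a toy user-space sketch of what one such rule could look like as an
executable check. It is not kernel code and not the actual rwsem
implementation: struct toy_rwsem, toy_wake_front_readers() and
toy_invariant_holds() are hypothetical names, the queue is a plain array,
and the wakeup stops at the first writer purely as a simplifying
assumption.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_WAITERS 16

/* Toy model only: all names here are hypothetical, not the kernel's. */
enum toy_waiter_type { TOY_WAITING_FOR_READ, TOY_WAITING_FOR_WRITE };

struct toy_rwsem {
	int reader_owners;	/* readers currently holding the lock */
	int writer_owned;	/* non-zero if a writer holds the lock */
	enum toy_waiter_type queue[TOY_MAX_WAITERS]; /* index 0 is the front */
	size_t nr_waiters;
};

/*
 * Candidate rule: if no writer owns the lock, every reader at the front
 * of the queue (up to the first queued writer) joins the reader-owners.
 * Returns the number of waiters woken.
 */
size_t toy_wake_front_readers(struct toy_rwsem *sem)
{
	size_t woken = 0, i;

	assert(sem->nr_waiters <= TOY_MAX_WAITERS);
	if (sem->writer_owned)
		return 0;

	/* Count the consecutive readers at the front of the queue. */
	while (woken < sem->nr_waiters &&
	       sem->queue[woken] == TOY_WAITING_FOR_READ)
		woken++;

	/* Dequeue them by shifting the remaining waiters to the front. */
	for (i = woken; i < sem->nr_waiters; i++)
		sem->queue[i - woken] = sem->queue[i];
	sem->nr_waiters -= woken;
	sem->reader_owners += (int)woken;
	return woken;
}

/*
 * Candidate invariant: a reader-owned lock never has a reader sitting
 * at the front of the wait queue. Returns 1 if it holds, 0 otherwise.
 */
int toy_invariant_holds(const struct toy_rwsem *sem)
{
	if (sem->writer_owned || sem->reader_owners == 0 ||
	    sem->nr_waiters == 0)
		return 1;
	return sem->queue[0] != TOY_WAITING_FOR_READ;
}
```

The idea would be to check something like toy_invariant_holds() at every
point where a waiter is queued or an owner changes; a path where it can
fail is then either a missing wakeup or a rule that was written down
wrong.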