On 9/26/19 5:34 AM, Peter Zijlstra wrote:You are obviously right, there is a huge race condition.
On Fri, Sep 20, 2019 at 11:54:02AM -0400, Waiman Long wrote:
While looking at a customr bug report about potential missed wakeup in
the system V semaphore code, I spot a potential problem. The fact that
semaphore waiter stays in TASK_RUNNING state while checking queue status
may lead to missed wakeup if a spurious wakeup happens in the right
moment as try_to_wake_up() will do nothing if the task state isn't right.
To eliminate this possibility, the task state is now reset to
TASK_INTERRUPTIBLE immediately after wakeup before checking the queue
status. This should eliminate the race condition on the interaction
between the queue status and the task state and fix the potential missed
wakeup problem.
Bah, this code always makes my head hurt.
Yes, AFAICT the pattern it uses has been broken since 0a2b9d4c7967,
since that removed doing the actual wakeup from under the sem_lock(),
which is what it relies on.
After having a second look at the code again, I probably misread the
code the first time around. In the sleeping path, there is a check of
queue.status and setting of task state both under the sem lock in the
sleeping path. So as long as setting of queue status is under lock, they
should synchronize properly.
It looks like queue status setting is under lock, but I can't use
lockdep to confirm that as the locking can be done by either the array
lock or in one of the spinlocks in the array. Are you aware of a way of
doing that?
Anyway, I do think we need to add some comment to clarify the situation
to avoid future confusion.