Re: [PATCH 03/11] locking, rwsem: introduce basis for down_write_killable

From: Waiman Long
Date: Thu May 12 2016 - 15:42:50 EST


On 05/12/2016 08:19 AM, Michal Hocko wrote:
> On Thu 12-05-16 14:12:04, Peter Zijlstra wrote:
>> On Wed, May 11, 2016 at 08:03:46PM +0200, Michal Hocko wrote:
>>> I still cannot say I would understand why the pending
>>> RWSEM_WAITING_BIAS matters but I would probably need to look at the code
>>> again with a clean head, __rwsem_wake is quite tricky...
>> Ah, you're asking why an unconditional __rwsem_wake(ANY) isn't enough?
>>
>> Because; if at that point there's nobody waiting, we're left with an
>> empty list and WAITER_BIAS set. This in turn will make all fast paths
>> fail.
>>
>> Look at rwsem_down_read_failed() for instance; if we enter that we'll
>> unconditionally queue ourself, with nobody left to come wake us.
> This is still not clear to me because rwsem_down_read_failed will call
> __rwsem_do_wake if the count is RWSEM_WAITING_BIAS so we shouldn't go to
> sleep and get the lock. So you are right that we would force everybody
> to the slow path which is not great but shouldn't cause incorrect
> behavior. I guess I must be missing something obvious here...
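
To make the "all fast paths fail" point concrete, here is a rough user-space sketch of the count arithmetic. The constants are modeled on the 64-bit rwsem-xadd layout as I understand it, and read_fastpath_ok() and write_fastpath_ok() are made-up helper names for illustration, not kernel functions. It only models the comparison the fast paths make, not the atomics.

```c
/* Constants modeled on the 64-bit rwsem-xadd count layout (assumption). */
#define RWSEM_UNLOCKED_VALUE    0LL
#define RWSEM_ACTIVE_BIAS       1LL
#define RWSEM_ACTIVE_MASK       0xffffffffLL
#define RWSEM_WAITING_BIAS      (-RWSEM_ACTIVE_MASK - 1)
#define RWSEM_ACTIVE_READ_BIAS  RWSEM_ACTIVE_BIAS
#define RWSEM_ACTIVE_WRITE_BIAS (RWSEM_WAITING_BIAS + RWSEM_ACTIVE_BIAS)

/*
 * Hypothetical helper: the reader fast path adds the read bias and
 * succeeds only if the resulting count is positive.
 */
static int read_fastpath_ok(long long count)
{
	return count + RWSEM_ACTIVE_READ_BIAS > 0;
}

/*
 * Hypothetical helper: the writer fast path adds the write bias and
 * succeeds only if the result is exactly the write bias, i.e. the
 * semaphore was completely unlocked with no waiters recorded.
 */
static int write_fastpath_ok(long long count)
{
	return count + RWSEM_ACTIVE_WRITE_BIAS == RWSEM_ACTIVE_WRITE_BIAS;
}
```

With count == RWSEM_WAITING_BIAS and an empty wait list, both helpers return 0, so every new locker is pushed into the slow path even though nobody holds the lock.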

Because of writer lock stealing, a count of RWSEM_WAITING_BIAS doesn't guarantee that the reader can get the lock, even if it is the first one in the queue. Calling __rwsem_do_wake() takes care of all the locking and queue checking work. Yes, I agree it is a bit odd that a task may end up waking itself. Maybe we can add code like:

--- a/kernel/locking/rwsem-xadd.c
+++ b/kernel/locking/rwsem-xadd.c
@@ -202,7 +202,8 @@ __rwsem_do_wake(struct rw_semaphore *sem, enum rwsem_wake_type wake_type)
 		 */
 		smp_mb();
 		waiter->task = NULL;
-		wake_up_process(tsk);
+		if (tsk != current)
+			wake_up_process(tsk);
 		put_task_struct(tsk);
 	} while (--loop);
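
The effect of that check can be tried outside the kernel with a small user-space model. This is only a sketch under stated assumptions: struct task, struct waiter and wake_waiters() are invented stand-ins, the "wake-up" is just a counter, and there are no barriers or reference counts. It shows that the caller's own waiter entry is still cleared (which, as I understand the reader slow path, is how a waiter learns it was granted the lock) while the redundant self wake-up is skipped.

```c
#include <stddef.h>

/* Stand-in for task_struct: only counts how often it was "woken". */
struct task {
	int woken;
};

/* Stand-in for rwsem_waiter: a queue entry pointing at its task. */
struct waiter {
	struct task *task;
};

/*
 * Hypothetical model of the wake-up loop. 'cur' plays the role of
 * 'current'. Returns the number of wake-ups actually issued.
 */
static int wake_waiters(struct waiter *queue, size_t n, const struct task *cur)
{
	int wakeups = 0;

	for (size_t i = 0; i < n; i++) {
		struct task *tsk = queue[i].task;

		/* Clearing the pointer marks this waiter as granted. */
		queue[i].task = NULL;

		/* A running task does not need to wake itself. */
		if (tsk != cur) {
			tsk->woken++;
			wakeups++;
		}
	}
	return wakeups;
}
```

For a queue holding the caller's own task plus one other task, wake_waiters() clears both entries but issues a single wake-up, for the other task only.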

Cheers,
Longman