RE: [PATCH v2] futex: lower the lock contention on the HB lock during wake up
From: Zhu Jefferry
Date: Mon Sep 14 2015 - 22:00:50 EST
Hi
Just in the list, I see the patch "[PATCH v2] futex: lower the lock contention on the HB lock during wake up" at http://www.gossamer-threads.com/lists/linux/kernel/2199938?search_string=futex;#2199938.
But I see another patch with same name, different content here,
23b7776290b10297fe2cae0fb5f166a4f2c68121(http://code.metager.de/source/xref/linux/stable/kernel/futex.c?r=23b7776290b10297fe2cae0fb5f166a4f2c68121) 23-Jun-2015 Linus Torvalds
futex: Lower the lock contention on the HB lock during wake up wake_futex_pi() wakes the task before releasing the hash bucket lock (HB).
The first thing the woken up task usually does is to acquire the lock which requires the HB lock. On SMP Systems this leads to blocking
on the HB lock which is released by the owner shortly after. This patch rearranges the unlock path by first releasing the HB lock and
then waking up the task.
Could you please help to give a little bit more explanation on this, why they have same name with different modify in the futex.c? I'm a newbie in the community.
Actually, I encounter a customer issue which is related to the glibc code "pthread_mutex_lock", which is using the futex service in kernel, without the patches above.
After lots of customer discussing, ( I could not reproduce the failure in my office), I seriously suspect there might be some particular corner cases in the futex code.
In the unlock flow, the user space code (pthread_mutex_unlock) will check FUTEX_WAITERS flag first, then wakeup the waiters in the kernel list. But in the lock flow, the kernel code (futex) will set FUTEX_WAITERS in first too, then try to get the waiter from the list. They are following same sequence, flag first, entry in list secondly. But there might be some timing problem in SMP system, if the query (unlock flow) is executing just before the list adding action (lock flow).
It might cause the mutex is never really released, and other threads will infinite waiting. Could you please help to take a look at it?
===========================================================================================================================
CPU 0 (trhead 0) CPU 1 (thread 1)
mutex_lock
val = *futex;
sys_futex(LOCK_PI, futex, val);
return to user space
after acquire the lock mutex_lock
val = *futex;
sys_futex(LOCK_PI, futex, val);
lock(hash_bucket(futex));
set FUTEX_WAITERS flag
unlock(hash_bucket(futex)) and retry due to page fault
mutex_unlock in user space
check FUTEX_WAITERS flag
sys_futex(UNLOCK_PI, futex, val);
lock(hash_bucket(futex)); <--.
.--------- waiting for the lock of (hash_bucket(futex)) to do list adding
try to get the waiter in waitling <--.
list, but it's empty |
|
set new_owner to itself |
instead of expecting waiter |
|
|
unlock(hash_bucket(futex)); |
| lock(hash_bucket(futex));
.-------- add itself to the waiting list
unlock(hash_bucket(futex));
waiting forever since there is nobody will release the PI
the futex is owned by itself
forever in userspace. Because
the __owner in user space has
been cleared and mutex_unlock
will fail forever before it
call kernel.
Thanks,
Jeff
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/