RE: [PATCH v2] futex: lower the lock contention on the HB lock during wake up

From: Zhu Jefferry
Date: Mon Sep 14 2015 - 22:00:50 EST

Next message: Zhenzhong Duan: "Re: [PATCH] xen: fix the check of e_pfn in xen_find_pfn_range"
Previous message: Tejun Heo: "Re: [RFC][PATCH 0/5] Fixes for abs() usage on 64bit values"
Next in thread: Thomas Gleixner: "RE: [PATCH v2] futex: lower the lock contention on the HB lock during wake up"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

Hi

On the list I see the patch "[PATCH v2] futex: lower the lock contention on the HB lock during wake up" at http://www.gossamer-threads.com/lists/linux/kernel/2199938?search_string=futex;#2199938.

But I also see another patch with the same name and different content here:

23b7776290b10297fe2cae0fb5f166a4f2c68121 (http://code.metager.de/source/xref/linux/stable/kernel/futex.c?r=23b7776290b10297fe2cae0fb5f166a4f2c68121) 23-Jun-2015 Linus Torvalds

futex: Lower the lock contention on the HB lock during wake up

wake_futex_pi() wakes the task before releasing the hash bucket lock (HB). The first thing the woken up task usually does is to acquire the lock which requires the HB lock. On SMP systems this leads to blocking on the HB lock which is released by the owner shortly after. This patch rearranges the unlock path by first releasing the HB lock and then waking up the task.
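
As a rough illustration of the reordering that commit message describes, here is a toy single-threaded model in C. It is only a sketch: the names (toy_hb, toy_woken_task_runs and so on) are hypothetical and none of this is kernel code. It assumes the woken task runs immediately on another CPU and that its first action is to take the HB lock, as the commit message says is usual.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: hb_locked stands in for the hash bucket (HB) lock. */
struct toy_hb {
    bool hb_locked;       /* true while the unlocker still holds the HB lock */
    int  blocked_wakeups; /* times the woken task found the HB lock held */
};

/* The woken task's first action: try to take the HB lock. */
static void toy_woken_task_runs(struct toy_hb *hb)
{
    if (hb->hb_locked)
        hb->blocked_wakeups++;
}

/* Order before the patch: wake the task, then release the HB lock. */
static int toy_wake_then_unlock(void)
{
    struct toy_hb hb = { .hb_locked = true, .blocked_wakeups = 0 };

    toy_woken_task_runs(&hb);   /* woken while the lock is still held */
    hb.hb_locked = false;
    return hb.blocked_wakeups;
}

/* Order described in the commit message: release first, then wake. */
static int toy_unlock_then_wake(void)
{
    struct toy_hb hb = { .hb_locked = true, .blocked_wakeups = 0 };

    hb.hb_locked = false;
    assert(!hb.hb_locked);      /* the lock is dropped before the wake-up */
    toy_woken_task_runs(&hb);
    return hb.blocked_wakeups;
}
```

In this model the first ordering records one blocked wake-up and the second records none, which is the contention the patch title refers to.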

Could you please explain why they have the same name but modify futex.c differently? I'm a newbie in the community.

Actually, I have encountered a customer issue related to the glibc function pthread_mutex_lock, which uses the futex service in the kernel. The kernel in question does not have the patches above.

After a lot of discussion with the customer (I could not reproduce the failure in my office), I seriously suspect there may be a particular corner case in the futex code.

In the unlock flow, the user space code (pthread_mutex_unlock) checks the FUTEX_WAITERS flag first, then wakes up the waiters in the kernel list. In the lock flow, the kernel code (futex) also sets FUTEX_WAITERS first, then adds the waiter to the list. Both follow the same sequence: flag first, list entry second. But there might be a timing problem on SMP systems if the query (unlock flow) executes just before the list add (lock flow).

This might mean the mutex is never really released, and other threads wait forever. Could you please take a look at it?

===========================================================================================================================
CPU 0 (thread 0)                                CPU 1 (thread 1)

mutex_lock
val = *futex;
sys_futex(LOCK_PI, futex, val);

return to user space
after acquiring the lock                         mutex_lock
                                                  val = *futex;
                                                  sys_futex(LOCK_PI, futex, val);
                                                    lock(hash_bucket(futex));
                                                    set FUTEX_WAITERS flag
                                                    unlock(hash_bucket(futex)) and retry due to page fault

mutex_unlock in user space
check FUTEX_WAITERS flag
sys_futex(UNLOCK_PI, futex, val);
   lock(hash_bucket(futex));        <--.
                                       .--------- waiting for the lock of (hash_bucket(futex)) to do list adding

   try to get the waiter in waiting  <--.
   list, but it's empty                 |
                                        |
   set new_owner to itself              |
   instead of expecting waiter          |
                                        |
                                        |
   unlock(hash_bucket(futex));          |
                                        |           lock(hash_bucket(futex));
                                        .-------- add itself to the waiting list
                                                    unlock(hash_bucket(futex));
                                                    waiting forever since nobody will release the PI futex
   the futex is owned by itself
   forever in user space, because
   the __owner in user space has
   been cleared and mutex_unlock
   will fail forever before it
   calls the kernel.
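
For illustration only, the interleaving in the diagram can be replayed as a small single-threaded C model. All names here (toy_futex, toy_lock_set_flag, TOY_WAITERS and so on) are hypothetical, and nothing in it is kernel or glibc code. It only replays the ordering suspected above; it does not show that the kernel actually behaves this way.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_WAITERS    0x80000000u /* stands in for FUTEX_WAITERS */
#define TOY_OWNER_TID  1u          /* thread 0 in the diagram */
#define TOY_WAITER_TID 2u          /* thread 1 in the diagram */

struct toy_futex {
    unsigned int word; /* user space futex word: owner TID plus flag bit */
    int queued;        /* waiters currently on the hash bucket list */
    int woken;         /* waiters that were handed the lock */
};

/* Lock flow, step 1: mark the futex as contended. */
static void toy_lock_set_flag(struct toy_futex *f)
{
    f->word |= TOY_WAITERS;
}

/* Lock flow, step 2 (after the retry): join the waiting list. */
static void toy_lock_enqueue(struct toy_futex *f)
{
    f->queued++;
}

/* Unlock flow as described in the mail (an assumption, not kernel code). */
static void toy_unlock(struct toy_futex *f)
{
    assert((f->word & ~TOY_WAITERS) == TOY_OWNER_TID);

    if (!(f->word & TOY_WAITERS)) {
        f->word = 0;            /* uncontended: released in user space */
        return;
    }
    if (f->queued == 0)
        return;                 /* empty list: owner stays the caller */

    f->queued--;                /* hand the lock to the queued waiter */
    f->woken++;
    f->word = TOY_WAITER_TID;
}

/* Returns how many waiters are left on the list after the unlock. */
static int toy_stranded_waiters(bool unlock_between_steps)
{
    struct toy_futex f = { .word = TOY_OWNER_TID, .queued = 0, .woken = 0 };

    toy_lock_set_flag(&f);
    if (unlock_between_steps) {
        toy_unlock(&f);         /* unlock sees the flag but an empty list */
        toy_lock_enqueue(&f);   /* waiter queues itself too late */
    } else {
        toy_lock_enqueue(&f);
        toy_unlock(&f);
    }
    return f.queued;
}
```

With the unlock placed between the two lock steps, the model leaves one waiter queued with nobody left to wake it. With both lock steps completed first, no waiter is left behind.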

Thanks,
Jeff
