RE: [PATCH v2] futex: lower the lock contention on the HB lock during wake up

From: Thomas Gleixner
Date: Wed Sep 16 2015 - 04:07:26 EST


On Wed, 16 Sep 2015, Zhu Jefferry wrote:

> Thanks for your detailed guidance and explanations. Please see my questions inline.

Please trim the reply to the relevant sections. It's annoying if I
have to search your replies inside of useless quoted text.

> > -----Original Message-----
> > From: Thomas Gleixner [mailto:tglx@xxxxxxxxxxxxx]
> > The flow is:
> >
> > sys_futex(LOCK_PI, futex, ...)
> >
> > retry:
> > 	lock(hb(futex));
> > 	ret = set_waiter_bit(futex);
> > 	if (ret == -EFAULT) {
> > 		unlock(hb(futex));
> > 		handle_fault();
> > 		goto retry;
> > 	}
> >
> > 	list_add();
> > 	unlock(hb(futex));
> > 	schedule();
> >
> > So when set_waiter_bit() succeeds, then the hash bucket lock is held and
> > blocks the waker. So it's guaranteed that the waker will see the waiter
> > on the list.
> >
> > If set_waiter_bit() faults, then the waiter bit is not set and therefore
> > there is nothing to wake. So the waker will not enter the kernel because
> > the futex is uncontended.
> >

> I assume set_waiter_bit in your pseudocode maps to the real function
> "futex_lock_pi_atomic". It's possible for futex_lock_pi_atomic to
> successfully set the FUTEX_WAITERS bit but still return with a page
> fault, for example when lookup_pi_state() fails.

No. It's not. lookup_pi_state() cannot return EFAULT. The only
function which can fault inside of lock_pi_update_atomic() is the
actual cmpxchg. Though lock_pi_update_atomic() can successfully set
the waiter bit and then return with some other failure code (ESRCH,
EAGAIN, ...). But that does not matter at all.

Any failure return will end up in a retry. And if the waker managed to
release the futex before the retry takes place then the waiter will
see that and take the futex.
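
To make that argument concrete, here is a minimal user space sketch of
the retry behaviour. It is a single-threaded model under stated
assumptions, not the kernel code: struct model, set_waiter_bit_model(),
lock_pi_model() and the fault-injection fields are hypothetical names
invented for illustration. Only the FUTEX_WAITERS and FUTEX_TID_MASK
values match the real futex ABI. The hash bucket lock is only marked in
comments, since nothing runs concurrently here.

```c
#include <errno.h>
#include <stdint.h>

#define FUTEX_WAITERS	0x80000000u	/* same value as the futex ABI */
#define FUTEX_TID_MASK	0x3fffffffu	/* same value as the futex ABI */

/* Hypothetical result codes of the model */
enum { QUEUED = 0, ACQUIRED = 1 };

/* Hypothetical model state, not a kernel structure */
struct model {
	uint32_t futex;		/* futex word: owner TID plus flag bits */
	int fail_once;		/* errno value to inject once, 0 for none */
	int release_on_fail;	/* owner releases while we handle the failure */
	int retries;		/* number of retries taken */
};

/* Stand-in for the atomic "set waiter bit or take the futex" step */
static int set_waiter_bit_model(struct model *m, uint32_t tid)
{
	int err = m->fail_once;

	m->fail_once = 0;

	/* A faulting cmpxchg leaves the futex word untouched */
	if (err == EFAULT)
		return -EFAULT;

	/* No owner: the caller takes the futex */
	if ((m->futex & FUTEX_TID_MASK) == 0) {
		m->futex = tid;
		return ACQUIRED;
	}

	/* Owned: set the waiter bit, which may still be followed by a failure */
	m->futex |= FUTEX_WAITERS;
	return err ? -err : QUEUED;
}

static int lock_pi_model(struct model *m, uint32_t tid)
{
	for (;;) {
		/* lock(hb(futex)) would be taken here */
		int ret = set_waiter_bit_model(m, tid);

		if (ret >= 0)
			return ret;	/* QUEUED: list_add(), unlock, schedule */

		/* unlock(hb(futex)), handle the failure, then retry */
		m->retries++;
		if (m->release_on_fail)
			m->futex = 0;	/* owner released the futex meanwhile */
	}
}
```

Starting with futex = 42, fail_once = EAGAIN and release_on_fail = 1,
lock_pi_model(&m, 7) sets the waiter bit, fails, retries once and then
takes the futex. With fail_once = EFAULT and no release, the first
attempt leaves the futex word untouched and the retry queues the waiter
with the waiter bit set.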

As I said before:

> > Random speculation is not helping here.

You still fail to provide the relevant information I asked for. If you
cannot provide that information, we can't help.

Thanks,

tglx