Re: Tasks stuck in futex code (in 3.14-rc6)
From: Davidlohr Bueso
Date: Thu Mar 20 2014 - 13:18:48 EST
On Thu, 2014-03-20 at 09:41 -0700, Linus Torvalds wrote:
> On Wed, Mar 19, 2014 at 10:56 PM, Davidlohr Bueso <davidlohr@xxxxxx> wrote:
> >
> > This problem suggests that we missed a wakeup for a task that was adding
> > itself to the queue in a wait path. And the only place that can happen
> > is with the hb spinlock check for any pending waiters.
>
> Ok, so thinking about hb_waiters_pending():
>
> - spin_is_locked() test
> - read barrier
> - plist_head_empty() test
>
> seems to be broken after all.
>
> The race is against futex_wait() that does
>
> - futex_wait_setup():
> - queue_lock()
> - get_futex_value_locked()
> - futex_wait_queue_me()
> - queue_me()
> - plist_add()
>
> right?
Yep.
>
> It strikes me that the "spin_is_locked()" test has no barriers wrt the
> writing of the new futex value on the wake path. And the read barrier
> obviously does nothing wrt the write either. Or am I missing
> something? So the write that actually released the futex might be
> almost arbitrarily delayed on the waking side. So the waiting side may
> not see the new value, even though the waker assumes it does due to
> the ordering of it doing the write first.
Aha, that would certainly violate the ordering guarantees. I feared
_something_ like that when we originally discussed your suggestion as
opposed to the atomics one, but didn't have any case for it either.
> So maybe we need a memory barrier in hb_waiters_pending() just to make
> sure that the new value is written.
>
> But at that point, I suspect that Davidlohrs original patch that just
> had explicit waiting counts is just as well. The whole point with the
> head empty test was to emulate that "do we have waiters" without
> having to actually add the atomics, but a memory barrier is really no
> worse.
>
> The attached is a TOTALLY UNTESTED interdiff that adds back Davidlohrs
> atomic count. It may be terminally broken, I literally didn't test it
> at all.
Comparing with the patch I sent earlier this morning, looks equivalent,
and fwiw, passes my initial qemu bootup, which is the first way of
detecting anything stupid going on.
So, Srikar, please try this patch out, as opposed to mine, you don't
have to first revert the commit in question.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/