Re: bisected: futex regression >= 3.14 - was - Slowdown due to threads bouncing between HT cores

From: Thomas Gleixner
Date: Fri Oct 24 2014 - 11:25:33 EST


On Wed, 8 Oct 2014, Mike Galbraith wrote:
> On Wed, 2014-10-08 at 13:04 -0400, Linus Torvalds wrote:
> > On Wed, Oct 8, 2014 at 11:37 AM, Mike Galbraith
> > <umgwanakikbuti@xxxxxxxxx> wrote:
> > >
> > > 11d4616bd07f38d496bd489ed8fad1dc4d928823 is the first bad commit
> > > commit 11d4616bd07f38d496bd489ed8fad1dc4d928823
> > > Author: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
> > > Date: Thu Mar 20 22:11:17 2014 -0700
> > >
> > > futex: revert back to the explicit waiter counting code
> >
> > While that revert might make things a tiny bit slower (I hated doing
> > it, but the clever approach sadly didn't work on powerpc and depended
> > on x86 locking semantics), I seriously doubt it's really relevant.
> > It's more likely that the *real* problem itself is very
> > timing-dependent, and the subtle synchronization changes here then
> > expose it or hide it, rather than really fixing anything.
> >
> > So like Thomas, I would suspect a race condition in the futex use, and
> > then the exact futex implementation details are just exposing it
> > incidentally.
>
> Whew, good, futex.c is hard. Heads up chess guys <punt>.

I wonder whether the barrier fix that went into 3.17 late in the cycle
also fixes this issue.

Thanks,

tglx