Re: [patch 2/2] x86 amd fix cmpxchg read acquire barrier

From: Arkadiusz Miskiewicz
Date: Thu Apr 23 2009 - 09:41:33 EST


On Thursday 23 of April 2009, Mathieu Desnoyers wrote:
> * Ingo Molnar (mingo@xxxxxxx) wrote:
> > * Mathieu Desnoyers <mathieu.desnoyers@xxxxxxxxxx> wrote:
> > > " // Opteron Rev E has a bug in which on very rare occasions a locked
> > > // instruction doesn't act as a read-acquire barrier if followed by a
> > > // non-locked read-modify-write instruction. Rev F has this bug in
> > > // pre-release versions, but not in versions released to customers,
> > > // so we test only for Rev E, which is family 15, model 32..63
> > > // inclusive.
> >
> > Dunno. The fix looks a bit intrusive (emits a NOP even on good
> > CPUs). Also, the text above says "not in versions released to
> > customers".
> >
> > So unless there's an official erratum or reports in the field (not
> > from early prototype systems shipped to developers) i'd not rush to
> > apply it, just yet.
>
> Actually, Opteron Rev E has this bug in the field (family 15, model
> 32..64). Rev F only had the bug in pre-releases.
>
> But yes, it's bad that it drags so many code additions to something as
> critical as cmpxchg. I start to think it might be better to just
> disallow bringing up more than one CPU on these machines.

That would probably be even worse than what we have now. This bug doesn't
manifest in a noticeable way very often here (I have a few such machines,
mostly 2 x dual core; mysql dies once every few months), and losing 3 of 4
cores (or 1 CPU of 2, depending on what you mean) doesn't sound like fun.
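
For reference, the family/model test from the quoted comment could be sketched
roughly as below. This is only an illustration, not the code from the patch:
struct cpu_sig, decode_cpuid_eax() and is_opteron_rev_e() are names made up
here, and the caller is assumed to pass in the EAX value returned by CPUID
leaf 1 and to have checked the vendor string itself. It uses the 32..63 range
from the quoted comment.

```c
#include <stdbool.h>
#include <stdint.h>

/* Display family/model decoded from CPUID leaf 1 EAX (hypothetical type). */
struct cpu_sig {
	unsigned family;
	unsigned model;
};

/*
 * Decode the CPUID leaf 1 signature.
 * Bits 7:4 = base model, 11:8 = base family,
 * 19:16 = extended model, 27:20 = extended family.
 * The extended fields only apply when the base family is 0xf.
 */
static struct cpu_sig decode_cpuid_eax(uint32_t eax)
{
	unsigned base_model  = (eax >> 4) & 0xf;
	unsigned base_family = (eax >> 8) & 0xf;
	unsigned ext_model   = (eax >> 16) & 0xf;
	unsigned ext_family  = (eax >> 20) & 0xff;
	struct cpu_sig sig;

	sig.family = base_family;
	sig.model = base_model;
	if (base_family == 0xf) {
		sig.family += ext_family;
		sig.model |= ext_model << 4;
	}
	return sig;
}

/*
 * True for AMD family 15, model 32..63 inclusive, i.e. the Rev E range
 * named in the quoted comment. vendor_is_amd is an assumption: the caller
 * has already compared the CPUID vendor string against "AuthenticAMD".
 */
static bool is_opteron_rev_e(bool vendor_is_amd, uint32_t cpuid_eax)
{
	struct cpu_sig sig = decode_cpuid_eax(cpuid_eax);

	return vendor_is_amd && sig.family == 15 &&
	       sig.model >= 32 && sig.model <= 63;
}
```

A check like this would only need to run once at boot, so the workaround
could in principle be limited to the affected machines instead of costing
something on every CPU.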

> Mathieu


--
Arkadiusz Miśkiewicz PLD/Linux Team
arekm / maven.pl http://ftp.pld-linux.org/
