Re: [PATCH 0/24] make atomic_read() behave consistently across allarchitectures

From: Nick Piggin
Date: Fri Aug 17 2007 - 04:39:17 EST


Satyam Sharma wrote:

On Fri, 17 Aug 2007, Herbert Xu wrote:


On Fri, Aug 17, 2007 at 01:43:27PM +1000, Paul Mackerras wrote:

BTW, the sort of missing barriers that triggered this thread
aren't that subtle. It'll result in a simple lock-up if the
loop condition holds upon entry. At which point it's fairly
straightforward to find the culprit.


Not necessarily. A barrier-less buggy code such as below:

atomic_set(&v, 0);

... /* some initial code */

while (atomic_read(&v))
;

... /* code that MUST NOT be executed unless v becomes non-zero */

(where v->counter is has no volatile access semantics)

could be generated by the compiler to simply *elid* or *do away* with
the loop itself, thereby making the:

"/* code that MUST NOT be executed unless v becomes non-zero */"

to be executed even when v is zero! That is subtle indeed, and causes
no hard lockups.

Then I presume you mean

while (!atomic_read(&v))
;

Which is just the same old infinite loop bug solved with cpu_relax().
These are pretty trivial to audit and fix, and also to debug, I would
think.


Granted, the above IS buggy code. But, the stated objective is to avoid
heisenbugs.

Anyway, why are you making up code snippets that are buggy in other
ways in order to support this assertion being made that lots of kernel
code supposedly depends on volatile semantics. Just reference the
actual code.


And we have driver / subsystem maintainers such as Stefan
coming up and admitting that often a lot of code that's written to use
atomic_read() does assume the read will not be elided by the compiler.

So these are broken on i386 and x86-64?

Are they definitely safe on SMP and weakly ordered machines with
just a simple compiler barrier there? Because I would not be
surprised if there are a lot of developers who don't really know
what to assume when it comes to memory ordering issues.

This is not a dig at driver writers: we still have memory ordering
problems in the VM too (and probably most of the subtle bugs in
lockless VM code are memory ordering ones). Let's not make up a
false sense of security and hope that sprinkling volatile around
will allow people to write bug-free lockless code. If a writer
can't be bothered reading API documentation and learning the Linux
memory model, they can still be productive writing safely locked
code.

--
SUSE Labs, Novell Inc.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/