Re: [PATCH] [2.5] asm-generic/atomic.h and changes to arm, parisc, mips, m68k, sh, cris to use it

From: H. Peter Anvin (hpa@zytor.com)
Date: Thu Aug 08 2002 - 22:52:01 EST


Followup to: <1028851871.28883.126.camel@irongate.swansea.linux.org.uk>
By author: Alan Cox <alan@lxorguk.ukuu.org.uk>
In newsgroup: linux.dev.kernel
>
> On Thu, 2002-08-08 at 23:40, Luca Barbieri wrote:
> > > The compiler can cache the value in a register
> > It shouldn't since it is volatile and the machine has instructions with
> > memory operands.
>
> I'm curious what part of C99 guarantees that it must generate
>
> add 1 to memory
>
> not
>
> load memory
> add 1
> store memory
>
> It certainly guarantees not to cache it for use next statement, but does
> it actually persuade the compiler to use direct operations on memory ?
>
> I'm not a C99 language lawyer but genuinely curious
>

C99 basically doesn't guarantee *anything* about "volatile", except
that it works as a keyword. It really can't, since the C standard
doesn't have a model for either hardware I/O or multithreaded
execution, effectively the two cases for which "volatile" matters; the
exact wording in the standard is за6.7.3.6:

    An object that has volatile-qualified type may be modified in ways
    unknown to the implementation or have other unknown side
    effects. Therefore any expression referring to such an object
    shall be evaluated strictly according to the rules of the abstract
    machine, as described in 5.1.2.3. Furthermore, at every sequence
    point the value last stored in the object shall agree with that
    prescribed by the abstract machine, except as modified by the
    unknown factors mentioned previously.114) What constitutes an
    access to an object that has volatile-qualified type is
    implementation-defined.

114) A volatile declaration may be used to describe an object
     corresponding to a memory-mapped input/output port or an object
     accessed by an asynchronously interrupting function. Actions on
     objects so declared shall not be optimized out by an
     implementation or reordered except as permitted by the rules for
     evaluating expressions.

It therefore becomes a quality of implementation issue, assuming there
are instructions that do that. Even so, it's iffy... should it
generate "lock incl" on x86, for example?

*In practice*, what can be safely assumed for a volatile object is
that:

a) A store to the volatile location will correspond to exactly one
   "store" hardware operations (note that multiple stores to the same
   object between sequence points are illegal in C);

b) If there is exactly one read reference to a volatile object between
   sequence points, it will correspond to exactly one "load" operation
   in hardware. If there is are n read references to the same
   volatile objects between adjacent sequence points, it will
   correspond to some number m "load" operations such that 1 <= m <= n.

c) For either of these, if the volatile object is too large or has the
   wrong alignment to handle atomically in hardware, it will in effect
   be treated like some number of smaller atomic objects.

        -hpa

-- 
<hpa@transmeta.com> at work, <hpa@zytor.com> in private!
"Unix gives you enough rope to shoot yourself in the foot."
http://www.zytor.com/~hpa/puzzle.txt	<amsp@zytor.com>
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Thu Aug 15 2002 - 22:00:18 EST