Satyam Sharma wrote:
Consider this (the above two functions exist only for clear_bit(),
the atomic variant, as you already know), the _only_ memory reference
we care about is that of the address of the passed bit-string:
(1) The compiler must not optimize / elid it (i.e. we need to disallow
compiler optimization for that reference) -- but we've already taken
care of that with the __asm__ __volatile__ and the constraints on
the memory "addr" operand there, and,
(2) For the i386, it also includes an implicit memory (CPU) barrier
already.
So I /think/ it makes sense to let the compiler optimize _other_ memory
references across the call to clear_bit(). There's a difference. I think
we'd be safe even if we do this, because the synchronization in callers
must be based upon the _passed bit-string_, otherwise _they_ are the
ones who're buggy.
[ However, elsewhere Jeremy Fitzhardinge mentioned the case of
some callers, for instance, doing a memset() on an alias of
the same bit-string. But again, I think that is dodgy/buggy/
extremely border-line usage on the caller's side itself ...
*unless* the caller is doing that inside a higher-level lock
anyway, in which case he wouldn't be needing to use the
locked variants either ... ]
You miss my point. If you have:
memset(&my_bitmask, 0, sizeof(my_bitmask));
set_bit(my_bitmask, 44);
Then unless the set_bit has a constraint argument which covers the whole
of the (multiword) bitmask, the compiler may see fit to interleave the
memset writes with the set_bit in bad ways. In other words, plain "+m"
(*(long *)ptr) won't cut it. You'd need "+m" (my_bitmask), I think.