Re: [RFC patch 08/18] cnt32_to_63 should use smp_rmb()

From: Mathieu Desnoyers
Date: Sun Nov 09 2008 - 11:24:54 EST


* David Howells (dhowells@xxxxxxxxxx) wrote:
> Steven Rostedt <rostedt@xxxxxxxxxxx> wrote:
>
> > > Note that that does not guarantee that the two reads will be done in the
> > > order you want. The compiler barrier _only_ affects the compiler. It
> > > does not stop the CPU from doing the reads in any order it wants. You
> > > need something stronger than smp_rmb() if you need the reads to be so
> > > ordered.
> >
> > For reading hardware devices that can indeed be correct. But for normal
> > memory access on a uniprocessor, if the CPU were to reorder the reads that
> > would affect the actual algorithm then that CPU is broken.
> >
> > read a
> > <--- interrupt - should see read a here before read b is done.
> > read b
>
> Life isn't that simple. Go and read the section labelled "The things cpus get
> up to" in Documentation/memory-barriers.txt.
>
> The two reads we're talking about are independent of each other. Independent
> reads and writes can be reordered and merged at will by the CPU, subject to
> restrictions imposed by barriers, cacheability attributes, MMIO attributes and
> suchlike.
>
> You can get read b happening before read a, but in such a case both
> instructions will be in the CPU's execution pipeline. When an interrupt
> occurs, the CPU will presumably finish clearing what's in its pipeline before
> going and servicing the interrupt handler.
>
> If a CPU is strictly ordered with respect to reads, do you actually need read
> barriers?
>
> The fact that a pair of reads might be part of an algorithm that is critically
> dependent on the ordering of those reads isn't something the CPU cares about.
> It doesn't know there's an algorithm there.
>
> > Now the fact that one of the reads is a hardware clock, then this
> > statement might not be too strong. But the fact that it is a clock, and
> > not some memory mapped device register, I still think smp_rmb is
> > sufficient.
>
> To quote again from memory-barriers.txt, section "CPU memory barriers":
>
> Mandatory barriers should not be used to control SMP effects, since
> mandatory barriers unnecessarily impose overhead on UP systems. They
> may, however, be used to control MMIO effects on accesses through
> relaxed memory I/O windows. These are required even on non-SMP
> systems

<emphasis>
> as they affect the order in which memory operations appear to a device
</emphasis>

In this particular case, we don't care about the order of memory
operations as seen by the device, given that we only read the MMIO time
source atomically. So, considering what you said above about the CPU
flushing all pending operations in the pipeline before proceeding to
service an interrupt, a simple barrier() should be enough to make the
two operations appear in the correct order wrt local interrupts. I
therefore don't think a full rmb() is required to ensure correct read
order on UP because, again, in this case we don't need to order
accesses as seen by the device.
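
To make the UP argument concrete, here is a rough userspace sketch of
the two reads with only a compiler barrier between them. It is not the
actual cnt32_to_63() macro: fake_hw_counter, read_hw_counter(), cnt_hi
and cnt32_to_63_sketch() are names made up for illustration, the
barrier() definition assumes a GCC-style compiler, and the final mask
is added here only to get a clean 63-bit value.

```c
#include <stdint.h>

/* Compiler-only barrier (GCC-style); it emits no CPU instruction. */
#define barrier() __asm__ __volatile__("" ::: "memory")

/* Hypothetical stand-in for the 32-bit hardware time source. */
static volatile uint32_t fake_hw_counter;

/* Software-maintained high word; its top bit mirrors the counter's top bit. */
static volatile uint32_t cnt_hi;

/* Hypothetical accessor: a single, atomic 32-bit read of the time source. */
static uint32_t read_hw_counter(void)
{
	return fake_hw_counter;
}

/* Extend the 32-bit counter to 63 bits (sketch, UP ordering only). */
static uint64_t cnt32_to_63_sketch(void)
{
	uint32_t hi, lo;

	hi = cnt_hi;            /* read a: high word from memory */
	barrier();              /* compiler must not move read b above read a */
	lo = read_hw_counter(); /* read b: low word from the time source */

	/* Top bits differ: the counter crossed a half-period since the last update. */
	if ((int32_t)(hi ^ lo) < 0) {
		hi = (hi ^ 0x80000000u) + (hi >> 31);
		cnt_hi = hi;
	}

	/* Bit 63 only duplicates bit 31 of the counter, so mask it off. */
	return (((uint64_t)hi << 32) | lo) & 0x7fffffffffffffffULL;
}
```

With the counter at 0x10, then 0x80000001, then wrapped around to 0x5,
successive calls return 0x10, 0x80000001 and 0x100000005. Whether
barrier() alone is enough on SMP is a separate question; the sketch
only covers ordering wrt local interrupts.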

Mathieu

> by prohibiting both the compiler and the CPU from reordering
> them.
>
> Section "Accessing devices":
>
> (2) If the accessor functions are used to refer to an I/O memory window with
> relaxed memory access properties, then _mandatory_ memory barriers are
> required to enforce ordering.
>
> David

--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68