Non atomic unaligned writes

From: Mathieu Desnoyers
Date: Sun Aug 19 2007 - 21:00:40 EST


* Andi Kleen (ak@xxxxxxx) wrote:
> Normally there are not that many NMIs or MCEs at boot, but it would
> be still good to avoid the very rare crash by auditing the code first
> [better than try to debug it on some production system later]
>
> > > - smp lock patching only ever changes a single byte (lock prefix) of
> > > a single instruction
> > > - kprobes only ever change a single byte
> > >
> > > For the immediate value patching it also cannot happen because
> > > you'll never modify multiple instructions and all immediate values
> > > can be changed atomically.
> > >
> >
> > Are misaligned/cross-cache-line updates atomic?
>
> In theory yes, in practice there can be errata of course. There tend
> to be a couple with self modifying code, especially cross modifying
> (from another CPU) -- but you don't do that.
>
> -Andi

I must disagree with Andi on this point. According to the paragraph
quoted below, misaligned/cross-cache-line updates are not guaranteed to
be atomic. This is why I align the mov instruction in such a way that
the immediate value within it is itself aligned.
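
To illustrate the alignment constraint, here is a minimal sketch (not the
actual immediate values patch) that computes how many padding bytes would
have to precede an instruction so that its immediate operand starts on a
naturally aligned address. The name imm_pad_bytes and its parameters are
hypothetical. It assumes the immediate size is a power of two and that the
caller knows how many opcode bytes precede the immediate (1 byte for a
"mov $imm32, %eax" encoded as B8 id).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, for illustration only.
 *
 * insn_addr:  address where the instruction would start without padding
 * opcode_len: number of bytes (prefixes + opcode) preceding the immediate
 * imm_size:   size of the immediate in bytes; must be a power of two
 *
 * Returns the number of padding bytes to emit before the instruction so
 * that the immediate itself begins on an imm_size-aligned address.
 */
static size_t imm_pad_bytes(uintptr_t insn_addr, size_t opcode_len,
                            size_t imm_size)
{
    uintptr_t imm_addr;
    size_t misalign;

    assert(imm_size != 0 && (imm_size & (imm_size - 1)) == 0);

    imm_addr = insn_addr + opcode_len;
    misalign = (size_t)(imm_addr & (imm_size - 1));

    /* Already aligned: no padding. Otherwise pad up to the next boundary. */
    return misalign ? imm_size - misalign : 0;
}
```

For example, an instruction that would start at 0x1000 with a one-byte
opcode and a 4-byte immediate needs 3 bytes of padding, which places the
immediate at 0x1004.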

Intel System Programming Guide

7.1.1 Guaranteed Atomic Operations

The Intel386™, Intel486™, Pentium®, and P6 family processors guarantee
that the following basic memory operations will always be carried out
atomically:
• Reading or writing a byte.
• Reading or writing a word aligned on a 16-bit boundary.
• Reading or writing a doubleword aligned on a 32-bit boundary.
The P6 family processors guarantee that the following additional memory
operations will always be carried out atomically:
• Reading or writing a quadword aligned on a 64-bit boundary. (This
  operation is also guaranteed on the Pentium processor.)
• 16-bit accesses to uncached memory locations that fit within a 32-bit
  data bus.
• 16-, 32-, and 64-bit accesses to cached memory that fit within a
  32-Byte cache line.

Accesses to cacheable memory that are split across bus widths, cache
lines, and page boundaries are not guaranteed to be atomic by the
Intel486™, Pentium®, or P6 family processors. The P6 family processors
provide bus control signals that permit external memory subsystems to
make split accesses atomic; however, nonaligned data accesses will
seriously impact the performance of the processor and should be avoided
where possible.
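
The rules quoted above for cached memory on the P6 family can be
summarized as a small predicate. This is only a sketch of the documented
guarantees as quoted here: the function names and the P6_CACHE_LINE
constant are my own, and it deliberately ignores uncached memory, errata
and anything specific to instruction fetch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Cache line size named in the quoted text for the P6 family. */
#define P6_CACHE_LINE 32u

/* True if addr is a multiple of size (size must be a power of two). */
static bool is_naturally_aligned(uintptr_t addr, size_t size)
{
    return (addr & (size - 1)) == 0;
}

/* True if [addr, addr + size) lies entirely within one cache line. */
static bool fits_in_cache_line(uintptr_t addr, size_t size, size_t line)
{
    return (addr / line) == ((addr + size - 1) / line);
}

/*
 * Hypothetical predicate: is a cached access of `size` bytes at `addr`
 * covered by the atomicity guarantees quoted above for the P6 family?
 * Only 8-, 16-, 32- and 64-bit accesses are considered.
 */
static bool p6_cached_access_guaranteed_atomic(uintptr_t addr, size_t size)
{
    if (size != 1 && size != 2 && size != 4 && size != 8)
        return false;

    /* Naturally aligned accesses are always on the guaranteed list. */
    if (is_naturally_aligned(addr, size))
        return true;

    /* Misaligned, but still contained in a single 32-byte line. */
    return fits_in_cache_line(addr, size, P6_CACHE_LINE);
}
```

With this model, a 4-byte access at 0x1001 is covered because it stays
inside one cache line, while a 4-byte access at 0x101e is not, since it
straddles the line boundary at 0x1020.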

Mathieu


--
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68