I asked this question to Tony Luck before. If I remember right,
his answer was:
CPU guarantees outstanding writes to be flushed when a register write
instruction is executed and an additional barrier instruction is not
needed.

That would be great. It still doesn't explain the barriers in the
dma sync routines. Those have been there since the following commit
in the history tree: