Re: Allow data races on some read/write operations

From: Robin Murphy
Date: Wed Mar 05 2025 - 09:43:32 EST

Next message: Ziyang Huang: "Re: [PATCH 1/1] wifi: ath11k: pcic: make memory access more readable"
Previous message: Vincent Mailhol: "Re: [PATCH v4 3/8] bits: introduce fixed-type genmasks"
In reply to: Ralf Jung: "Re: Allow data races on some read/write operations"
Next in thread: Andreas Hindborg: "Re: Allow data races on some read/write operations"

On 05/03/2025 1:27 pm, Ralf Jung wrote:

Hi,

On 05.03.25 14:23, Alice Ryhl wrote:

On Wed, Mar 5, 2025 at 2:10 PM Ralf Jung <post@xxxxxxxx> wrote:

Hi,

On 05.03.25 04:24, Boqun Feng wrote:

On Tue, Mar 04, 2025 at 12:18:28PM -0800, comex wrote:

On Mar 4, 2025, at 11:03 AM, Ralf Jung <post@xxxxxxxx> wrote:

However, these optimizations should rarely trigger misbehavior in
practice, so I wouldn’t be surprised if Linux had some code that
expected memcpy to act volatile…

Also in this particular case we are discussing [1], it's a memcpy (from
or to) a DMA buffer, which means the device can also read or write the
memory, therefore the content of the memory may be altered outside the
program (the kernel), so we cannot use copy_nonoverlapping() I believe.

[1]: https://lore.kernel.org/rust-for-linux/87bjuil15w.fsf@xxxxxxxxxx/

Is there actually a potential for races (with reads by hardware, not other
threads) on the memcpy'd memory? Or is this the pattern where you copy some data
somewhere and then set a flag in an MMIO register to indicate that the data is
ready and the device can start reading it? In the latter case, the actual data
copy does not race with anything, so it can be a regular non-atomic non-volatile
memcpy. The flag write *should* be a release write, and release volatile writes
do not exist, so that is a problem, but it's a separate problem from volatile
memcpy. One can use a release fence followed by a relaxed write instead.
Volatile writes do not currently act like relaxed writes, but you need that
anyway for WRITE_ONCE to make sense so it seems fine to rely on that here as well.
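The "copy, then ring the doorbell" pattern described above can be sketched in userspace Rust. This is only an illustration under stated assumptions: `FakeDevice`, `buf`, `doorbell` and `publish` are hypothetical names, and a relaxed atomic store stands in for the volatile MMIO flag write, which a real driver would issue through the kernel's MMIO accessors instead.

```rust
use std::sync::atomic::{fence, AtomicU32, Ordering};

/// Hypothetical stand-in for a DMA buffer plus a "data ready" doorbell register.
pub struct FakeDevice {
    pub buf: [u8; 16],
    pub doorbell: AtomicU32,
}

impl FakeDevice {
    pub fn new() -> Self {
        FakeDevice {
            buf: [0; 16],
            doorbell: AtomicU32::new(0),
        }
    }

    /// Copy `data` into the buffer, then signal readiness.
    /// Returns the number of bytes actually copied.
    pub fn publish(&mut self, data: &[u8]) -> usize {
        let n = data.len().min(self.buf.len());

        // Plain non-atomic, non-volatile copy: in this pattern nothing else
        // touches `buf` until the doorbell has been written.
        self.buf[..n].copy_from_slice(&data[..n]);

        // Release fence: orders the copy above before the flag write below.
        fence(Ordering::Release);

        // Relaxed store as a stand-in (assumption) for the volatile MMIO write.
        self.doorbell.store(1, Ordering::Relaxed);
        n
    }
}
```

The point of the sketch is the ordering of the three steps, not the types: the data copy itself needs no special treatment as long as the flag write is the only operation that can be observed concurrently.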

Rust should have atomic volatile accesses, and various ideas have been proposed
over the years, but sadly nobody has shown up to try and push this through.

If the memcpy itself can indeed race, you need an atomic volatile memcpy --
which neither C nor Rust have, though there are proposals for atomic memcpy (and
arguably, there should be a way to interact with a device using non-volatile
atomics... but anyway in the LKMM, atomics are modeled with volatile, so things
are even more entangled than usual ;).

For some kinds of hardware, we might not want to trust the hardware.
I.e., there is no race under normal operation, but the hardware could
have a bug or be malicious and we might not want that to result in UB.
This is pretty similar to syscalls that take a pointer into userspace
memory and read it - userspace shouldn't modify that memory during the
syscall, but it can and if it does, that should be well-defined.
(Though in the case of userspace, the copy happens in asm since it
also needs to deal with virtual memory and so on.)

Wow you are really doing your best to combine all the hard problems at the
same time. ;)
Sharing memory with untrusted parties is another tricky issue, and even
leaving aside all the theoretical trouble, practically speaking you'll want
to exclusively use atomic accesses to interact with such memory. So doing
this properly requires atomic memcpy. I don't know what that is blocked on,
but it is good to know that it would help the kernel.

If you don't trust the device then I wouldn't think it actually matters what
happens at this level - the higher-level driver is already going to have to
carefully check and sanitise whatever data it reads back from the buffer
before consuming it, at which point reading a torn value due to a race would
be essentially indistinguishable from the device having gone wrong and simply
written that nonsense value itself.
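As a rough illustration of the "check and sanitise" step, here is a minimal Rust sketch. The `Descriptor` layout and the `validated_len` helper are hypothetical, not taken from any real driver. The check operates on a private snapshot of the descriptor, so it gives the same answer whether an out-of-range value came from a torn read or from a misbehaving device.

```rust
/// Hypothetical descriptor a device writes back into a shared buffer.
#[derive(Clone, Copy, Debug)]
#[repr(C)]
pub struct Descriptor {
    pub len: u32,
    pub status: u32,
}

/// Validate the device-supplied length against the payload capacity.
/// `snapshot` is a private copy; every later use must go through this copy
/// and never re-read the shared memory.
pub fn validated_len(snapshot: Descriptor, capacity: usize) -> Option<usize> {
    // Reject values that do not even fit in usize on this target.
    let len = usize::try_from(snapshot.len).ok()?;
    // Reject anything that would run past the payload area.
    if len <= capacity {
        Some(len)
    } else {
        None
    }
}
```

A caller would treat `None` as a device error and drop the descriptor rather than trying to recover a plausible length from it.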

I think the more significant case is when polling for the device to write
back some kind of status word, where in C code the driver would use
READ_ONCE() to ensure a single-copy-atomic read of the same size the device
is going to write - sticking a regular memcpy() into the middle of that can't
necessarily be trusted to work correctly (even if it may appear to 99% of the
time).
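For comparison, the status-word polling case might look roughly like this in Rust. `read_once_u32` and `poll_status` are hypothetical helpers, and the sketch leans on the assumption, discussed earlier in the thread and not formally guaranteed by Rust, that an aligned volatile `u32` read is performed as a single access of that size.

```rust
use std::ptr;

/// Rough analogue of READ_ONCE() for an aligned u32 (assumption: the
/// volatile read is emitted as one 32-bit load).
///
/// # Safety
/// `p` must be non-null, aligned, and valid for reads.
pub unsafe fn read_once_u32(p: *const u32) -> u32 {
    unsafe { ptr::read_volatile(p) }
}

/// Poll `status` until it equals `done`, giving up after `max_spins` reads.
/// Returns the number of reads that saw a different value before `done`.
///
/// # Safety
/// Same requirements as `read_once_u32`, for the whole polling loop.
pub unsafe fn poll_status(status: *const u32, done: u32, max_spins: usize) -> Option<usize> {
    for spins in 0..max_spins {
        // Exactly one read of the status word per iteration; a byte-wise
        // copy here could observe a half-written value.
        if unsafe { read_once_u32(status) } == done {
            return Some(spins);
        }
        std::hint::spin_loop();
    }
    None
}
```

The status word is read on its own, at the size the device writes it, and only the rest of the buffer is a candidate for a bulk copy.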

Thanks,
Robin.
