Re: Allow data races on some read/write operations

From: Ralf Jung
Date: Wed Mar 05 2025 - 16:53:56 EST


Hi all,

On 05.03.25 22:26, Andreas Hindborg wrote:
"Ralf Jung" <post@xxxxxxxx> writes:

Hi all,

For some kinds of hardware, we might not want to trust the hardware.
I.e., there is no race under normal operation, but the hardware could
have a bug or be malicious and we might not want that to result in UB.
This is pretty similar to syscalls that take a pointer into userspace
memory and read it - userspace shouldn't modify that memory during the
syscall, but it can and if it does, that should be well-defined.
(Though in the case of userspace, the copy happens in asm since it
also needs to deal with virtual memory and so on.)

Wow you are really doing your best to combine all the hard problems at the same
time. ;)
Sharing memory with untrusted parties is another tricky issue, and even leaving
aside all the theoretical trouble, practically speaking you'll want to
exclusively use atomic accesses to interact with such memory. So doing this
properly requires atomic memcpy. I don't know what that is blocked on, but it is
good to know that it would help the kernel.

I am sort of baffled by this, since the C kernel has no such thing and
has worked fine for a few years. Is it a property of Rust that causes us
to need atomic memcpy, or is what the C kernel is doing potentially dangerous?

It's the same in C: a memcpy is a non-atomic access. If something else
concurrently mutates the memory you are copying from, or something else
concurrently reads/writes the memory you are copying to, that is UB.
This is not specific to memcpy; it's the same for regular pointer loads/stores.
That's why you need READ_ONCE and WRITE_ONCE to specifically indicate to the
compiler that these are special accesses that need to be treated differently.
Something similar is needed for memcpy.
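
To illustrate the difference a marked access makes, here is a minimal sketch in Rust. It uses relaxed atomics from the standard library as the closest analogue to READ_ONCE/WRITE_ONCE; that mapping is an assumption made for illustration, not a statement about how the kernel implements those macros. The function name is hypothetical.

```rust
use std::sync::atomic::{AtomicU32, Ordering};
use std::thread;

/// Hypothetical example: one thread stores while another loads the same
/// location without any synchronization between them. Because both
/// accesses are atomic, this is a race but not a *data race*, so there
/// is no UB. The load returns either the old or the new value.
fn load_during_concurrent_store() -> u32 {
    let flag = AtomicU32::new(0);
    thread::scope(|s| {
        // Writer: marked (atomic) store.
        s.spawn(|| flag.store(1, Ordering::Relaxed));
        // Reader: marked (atomic) load, possibly concurrent with the store.
        flag.load(Ordering::Relaxed)
    })
}

fn main() {
    let seen = load_during_concurrent_store();
    // Which value is observed depends on scheduling.
    println!("observed {seen}");
}
```

If `flag` were a plain `u32` accessed through raw pointers from both threads, the same access pattern would be a data race and therefore UB.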

I'm not a compiler engineer, so I might be wrong about this, but if I
do a C `memcpy` from place A to place B where A is experiencing racy
writes, and I don't interpret the data at place B after the copy
operation, the rest of my C program is fine and will work as expected.

The program has UB in that case. A program that has UB may work as expected today, but that changes nothing about it having UB.
The C standard is abundantly clear here:
"The execution of a program contains a data race if it contains two conflicting actions in different threads, at least one of which is not atomic, and neither happens before the other. Any such data race results in undefined behavior."
(C23, §5.1.2.4)

You are describing a hypothetical language that treats data races in a different way. Is such a language *possible*? Definitely. For the specific case you describe here, one "just" has to declare read-write races to be not UB, but to return "poison data" on the read side (poison data is a bit like uninitialized memory or padding), which the memcpy would then store on the target side. Any future interpretation of the target memory would be UB ("poison data" is not the same as "random data"). Such a model has actually been studied [1], though not a lot, and not as a proposal for a semantics of a user-facing language. (Rather, that was a proposal for an internal compiler IR.) The extra complications incurred by this choice are significant -- there is no free lunch here.

[1]: https://sf.snu.ac.kr/publications/promising-ir-full.pdf

However, C is not that language, and neither is Rust. Defining a concurrency memory model is extremely non-trivial (there are literally hundreds of papers proposing different models, and there are still some unsolved problems). The route the C++ model took was to strictly rule out all data races, and since they were the first to actually undertake the effort of defining a model at this level of rigor (for a language not willing to pay the cost that would be incurred by the Java concurrency memory model), that has been the standard ever since. There are a lot of subtle trade-offs here, and I am far from an expert on the exact consequences each different choice would have. I just want to caution against the obvious reaction of "why don't they just". :)


I may even later copy the data at place B to place C where C might have
concurrent reads and/or writes, and the kernel will not experience UB
because of this. The data may be garbage, but that is fine. I am not
interpreting the data, or making control flow decisions based on it. I
am just moving the data.

My understanding is: In Rust, this program would be illegal and might
experience UB in unpredictable ways, not limited to just the data that
is being moved.

That is correct. C and Rust behave the same here.

One option I have explored is just calling C memcpy directly, but
because of LTO, that is no different than doing the operation in Rust.

I don't think I need atomic memcpy, I just need my program not to
explode if I move some data to or from a place that is experiencing
concurrent writes without synchronization. Not in general, but for some
special cases where I promise not to look at the data outside of moving
it.

I'm afraid I do not know of a language, other than assembly, that can provide this.

Atomic memcpy, however, should be able to cover your use-case, so it seems like a reasonable solution to me? Marking things as atomic is literally how you tell the compiler "don't blow up if there are concurrent accesses".
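
As a rough sketch of the idea, the following Rust code copies out of shared memory using one relaxed atomic load per byte. The helper name `atomic_copy_from` and the choice to model the shared region as a slice of `AtomicU8` are assumptions for illustration only; an actual atomic memcpy would presumably copy in larger chunks and is not what is shown here.

```rust
use std::sync::atomic::{AtomicU8, Ordering};

/// Hypothetical helper: copy bytes out of memory that other parties may
/// be writing concurrently. Every read is a relaxed atomic load, so a
/// concurrent atomic write is not a data race. The result may mix old
/// and new bytes (tearing), but the program has no UB.
fn atomic_copy_from(src: &[AtomicU8], dst: &mut [u8]) {
    assert_eq!(src.len(), dst.len(), "source and destination must match");
    for (d, s) in dst.iter_mut().zip(src) {
        *d = s.load(Ordering::Relaxed);
    }
}

fn main() {
    // Stand-in for a shared buffer; in real code this would be memory
    // that hardware or another thread can also access.
    let shared: Vec<AtomicU8> = (0u8..8).map(AtomicU8::new).collect();
    let mut local = [0u8; 8];
    atomic_copy_from(&shared, &mut local);
    println!("{local:?}");
}
```

The copied bytes can then be moved around freely as ordinary data; only the accesses to the shared region itself need to be atomic.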

Kind regards,
Ralf