Re: [PATCH bpf-next] bpf, docs: add LOAD_AQCUIRE and STORE_RELEASE instructions

From: Alexis Lothoré

Date: Wed May 20 2026 - 11:56:38 EST


On Wed May 20, 2026 at 5:18 PM CEST, bot+bpf-ci wrote:
>> diff --git a/Documentation/bpf/standardization/instruction-set.rst b/Documentation/bpf/standardization/instruction-set.rst
>> --- a/Documentation/bpf/standardization/instruction-set.rst
>> +++ b/Documentation/bpf/standardization/instruction-set.rst
>> @@ -695,22 +695,24 @@
>> *(u64 *)(dst + offset) += src
>>
>> In addition to the simple atomic operations, there also is a modifier and
>> -two complex atomic operations:
>> +four complex atomic operations:
>>
>> .. table:: Complex atomic operations
>>
>> =========== ================ ===========================
>> imm value description
>> =========== ================ ===========================
>> - FETCH 0x01 modifier: return old value
>> - XCHG 0xe0 | FETCH atomic exchange
>> - CMPXCHG 0xf0 | FETCH atomic compare and exchange
>> + FETCH 0x0001 modifier: return old value
>> + XCHG 0x00e0 | FETCH atomic exchange
>> + CMPXCHG 0x00f0 | FETCH atomic compare and exchange
>> + LOAD_ACQ 0x0100 atomic load with barrier
>> + STORE_REL 0x0110 atomic store with barrier
>> =========== ================ ===========================
>>
>> The ``FETCH`` modifier is optional for simple atomic operations, and
>> -always set for the complex atomic operations. If the ``FETCH`` flag
>> -is set, then the operation also overwrites ``src`` with the value that
>> -was in memory before it was modified.
>> +always set for the ``XCHG`` and ``CMPXCHG`` complex atomic operations. If
>> +the ``FETCH`` flag is set, then the operation also overwrites ``src`` with
>> +the value that was in memory before it was modified.
>>
>> The ``XCHG`` operation atomically exchanges ``src`` with the value
>> addressed by ``dst + offset``.
>
> [ ... ]
>
>> @@ -721,6 +723,11 @@
>> value that was at ``dst + offset`` before the operation is zero-extended
>> and loaded back to ``R0``.
>>
>> +The ``LOAD_ACQ`` and ``STORE_REL`` operations implement lighter LOAD and
>> +STORE memory barriers than full barriers. The corresponding accesses must
>> +be aligned, but are allowed for any access size (8-bit up to 64-bit
>> +operations).
>
> Does the LOAD_ACQ documentation need to clarify its non-standard register
> semantics? BPF atomic operations are encoded as STX instructions, which
> conventionally use dst as the memory base pointer and src as the value
> register. However, LOAD_ACQ reverses this pattern by using src as the
> memory base pointer and dst as the destination register for the fetched
> value, acting like an LDX instruction.

True, I can specify that both are anyway supported by a STX instruction.

> Additionally, should the documentation clarify that the loaded value is
> zero-extended into the 64-bit destination register for 8, 16, and 32-bit
> accesses? Omitting these details might lead to incorrect implementations
> based on standard STX rules.

I guess I can add this detail, yes.

> This contradiction creates specification ambiguity: the documentation
> states that LOAD_ACQ and STORE_REL "are allowed for any access size
> (8-bit up to 64-bit operations)", but earlier in the document (line 671)
> explicitly states "8-bit and 16-bit wide atomic operations are not
> supported."
>
> The instruction encoding section (lines 667-671) only defines two size
> encodings for ATOMIC mode:
> - {ATOMIC, W, STX} for 32-bit operations
> - {ATOMIC, DW, STX} for 64-bit operations
>
> No encoding is shown for B (byte/8-bit) or H (half-word/16-bit) with
> ATOMIC mode.
>
> If LOAD_ACQ/STORE_REL truly support 8-bit and 16-bit sizes, shouldn't the
> encoding section document how B and H size modifiers work with ATOMIC
> mode, and line 671 clarify the exception?

This point, and the corresponding mentions to the "atomic32 conformance
group" and "atomic64 conformance group", made me realize that the kernel
doc seems to be in sync with the eBPF ISA RFC
(https://www.rfc-editor.org/rfc/rfc9669.html). It makes me wonder if
it's really ok to add those LOAD_ACQUIRE/STORE_RELEASE mentions in the
kernel doc only ?

> Alternatively, if LOAD_ACQ/STORE_REL only support 32-bit and 64-bit sizes
> like other atomic operations, should line 728 be corrected to state
> "32-bit and 64-bit operations" instead of "8-bit up to 64-bit operations"?
>
> For a standardized instruction set specification used by multiple
> implementations (verifier, JIT compilers, potentially other BPF runtimes),
> internal contradictions can lead to implementation divergence and
> interoperability issues.
>
>> +
>> 64-bit immediate instructions
>> -----------------------------
>>
>
>
> ---
> AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
> See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md
>
> CI run summary: https://github.com/kernel-patches/bpf/actions/runs/26170628668




--
Alexis Lothoré, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com