Re: [PATCH bpf-next] bpf, docs: add LOAD_ACQUIRE and STORE_RELEASE instructions
From: Alexei Starovoitov
Date: Wed May 20 2026 - 13:04:18 EST
On Wed, May 20, 2026 at 5:46 PM Alexis Lothoré
<alexis.lothore@xxxxxxxxxxx> wrote:
>
> On Wed May 20, 2026 at 5:18 PM CEST, bot+bpf-ci wrote:
> >> diff --git a/Documentation/bpf/standardization/instruction-set.rst b/Documentation/bpf/standardization/instruction-set.rst
> >> --- a/Documentation/bpf/standardization/instruction-set.rst
> >> +++ b/Documentation/bpf/standardization/instruction-set.rst
> >> @@ -695,22 +695,24 @@
> >> *(u64 *)(dst + offset) += src
> >>
> >> In addition to the simple atomic operations, there also is a modifier and
> >> -two complex atomic operations:
> >> +four complex atomic operations:
> >>
> >> .. table:: Complex atomic operations
> >>
> >> =========== ================ ===========================
> >> imm value description
> >> =========== ================ ===========================
> >> - FETCH 0x01 modifier: return old value
> >> - XCHG 0xe0 | FETCH atomic exchange
> >> - CMPXCHG 0xf0 | FETCH atomic compare and exchange
> >> + FETCH 0x0001 modifier: return old value
> >> + XCHG 0x00e0 | FETCH atomic exchange
> >> + CMPXCHG 0x00f0 | FETCH atomic compare and exchange
> >> + LOAD_ACQ 0x0100 atomic load with barrier
> >> + STORE_REL 0x0110 atomic store with barrier
> >> =========== ================ ===========================
> >>
> >> The ``FETCH`` modifier is optional for simple atomic operations, and
> >> -always set for the complex atomic operations. If the ``FETCH`` flag
> >> -is set, then the operation also overwrites ``src`` with the value that
> >> -was in memory before it was modified.
> >> +always set for the ``XCHG`` and ``CMPXCHG`` complex atomic operations. If
> >> +the ``FETCH`` flag is set, then the operation also overwrites ``src`` with
> >> +the value that was in memory before it was modified.
> >>
> >> The ``XCHG`` operation atomically exchanges ``src`` with the value
> >> addressed by ``dst + offset``.
> >
> > [ ... ]
> >
> >> @@ -721,6 +723,11 @@
> >> value that was at ``dst + offset`` before the operation is zero-extended
> >> and loaded back to ``R0``.
> >>
> >> +The ``LOAD_ACQ`` and ``STORE_REL`` operations implement lighter LOAD and
> >> +STORE memory barriers than full barriers. The corresponding accesses must
> >> +be aligned, but are allowed for any access size (8-bit up to 64-bit
> >> +operations).
> >
> > Does the LOAD_ACQ documentation need to clarify its non-standard register
> > semantics? BPF atomic operations are encoded as STX instructions, which
> > conventionally use dst as the memory base pointer and src as the value
> > register. However, LOAD_ACQ reverses this pattern by using src as the
> > memory base pointer and dst as the destination register for the fetched
> > value, acting like an LDX instruction.
>
> True, I can specify that both are nonetheless encoded as STX instructions.
>
> > Additionally, should the documentation clarify that the loaded value is
> > zero-extended into the 64-bit destination register for 8, 16, and 32-bit
> > accesses? Omitting these details might lead to incorrect implementations
> > based on standard STX rules.
>
> I guess I can add this detail, yes.
>
> > This contradiction creates specification ambiguity: the documentation
> > states that LOAD_ACQ and STORE_REL "are allowed for any access size
> > (8-bit up to 64-bit operations)", but earlier in the document (line 671)
> > explicitly states "8-bit and 16-bit wide atomic operations are not
> > supported."
> >
> > The instruction encoding section (lines 667-671) only defines two size
> > encodings for ATOMIC mode:
> > - {ATOMIC, W, STX} for 32-bit operations
> > - {ATOMIC, DW, STX} for 64-bit operations
> >
> > No encoding is shown for B (byte/8-bit) or H (half-word/16-bit) with
> > ATOMIC mode.
> >
> > If LOAD_ACQ/STORE_REL truly support 8-bit and 16-bit sizes, shouldn't the
> > encoding section document how B and H size modifiers work with ATOMIC
> > mode, and line 671 clarify the exception?
>
> This point, and the corresponding mentions of the "atomic32 conformance
> group" and "atomic64 conformance group", made me realize that the kernel
> doc seems to be in sync with the eBPF ISA RFC
> (https://www.rfc-editor.org/rfc/rfc9669.html). It makes me wonder if
> it's really ok to add those LOAD_ACQUIRE/STORE_RELEASE mentions in the
> kernel doc only?
It's ok. It already diverged a bit. Eventually we will do an RFC update.
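
For reference, the semantics discussed in the review comments above can be
sketched in plain C. This is only an illustration, not the kernel's
interpreter or JIT code: the emu_* helpers are hypothetical names, the imm
values are copied from the table in the quoted patch, and the
__atomic_load_n()/__atomic_store_n() builtins assume GCC or Clang. It
assumes, as the review says, that LOAD_ACQ uses src as the memory base and
zero-extends into dst, while STORE_REL keeps the usual STX roles.

```c
#include <stdint.h>

/* imm values copied from the table in the quoted patch. */
enum {
	BPF_FETCH     = 0x0001,			/* modifier: return old value */
	BPF_XCHG      = 0x00e0 | BPF_FETCH,	/* FETCH always set */
	BPF_CMPXCHG   = 0x00f0 | BPF_FETCH,	/* FETCH always set */
	BPF_LOAD_ACQ  = 0x0100,			/* FETCH bit not set */
	BPF_STORE_REL = 0x0110,			/* FETCH bit not set */
};

/* Hypothetical helper: accept only 1/2/4/8-byte, naturally aligned accesses. */
static int emu_check(const void *addr, unsigned int size)
{
	if (size != 1 && size != 2 && size != 4 && size != 8)
		return -1;
	return ((uintptr_t)addr % size) ? -1 : 0;
}

/*
 * LOAD_ACQ (assumed semantics): src is the memory base, dst receives the
 * loaded value, zero-extended to 64 bits for 8/16/32-bit accesses.
 */
static int emu_load_acq(uint64_t *dst, void *src, int16_t off,
			unsigned int size)
{
	char *addr = (char *)src + off;

	if (emu_check(addr, size))
		return -1;
	switch (size) {
	case 1: *dst = __atomic_load_n((uint8_t *)addr, __ATOMIC_ACQUIRE); break;
	case 2: *dst = __atomic_load_n((uint16_t *)addr, __ATOMIC_ACQUIRE); break;
	case 4: *dst = __atomic_load_n((uint32_t *)addr, __ATOMIC_ACQUIRE); break;
	default: *dst = __atomic_load_n((uint64_t *)addr, __ATOMIC_ACQUIRE); break;
	}
	return 0;
}

/*
 * STORE_REL (assumed semantics): usual STX roles, dst is the memory base
 * and the low `size` bytes of src are stored.
 */
static int emu_store_rel(void *dst, int16_t off, uint64_t src,
			 unsigned int size)
{
	char *addr = (char *)dst + off;

	if (emu_check(addr, size))
		return -1;
	switch (size) {
	case 1: __atomic_store_n((uint8_t *)addr, (uint8_t)src, __ATOMIC_RELEASE); break;
	case 2: __atomic_store_n((uint16_t *)addr, (uint16_t)src, __ATOMIC_RELEASE); break;
	case 4: __atomic_store_n((uint32_t *)addr, (uint32_t)src, __ATOMIC_RELEASE); break;
	default: __atomic_store_n((uint64_t *)addr, src, __ATOMIC_RELEASE); break;
	}
	return 0;
}
```

With these helpers, a 32-bit emu_load_acq() into a register that previously
held all ones leaves the upper 32 bits cleared, which is the zero-extension
detail the review asks the documentation to spell out.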