Re: [BUG io_uring] Failed RECVSEND_BUNDLE can persistently shrink non-INC pbuf ring len and affect later READ operations

From: Jens Axboe

Date: Sun Jun 07 2026 - 17:53:01 EST


On 6/7/26 3:38 PM, Jens Axboe wrote:
>> The reproducer runs unprivileged and demonstrates:
>>
>> 1. non-INC provided-buffer ring with entry0.len = 4096 and entry1.len = 4096
>> 2. IORING_OP_RECV + IOSQE_BUFFER_SELECT + IORING_RECVSEND_BUNDLE on an
>> empty SOCK_DGRAM socket
>> 3. CQE returns -EAGAIN, but entry0.len is changed from 4096 to 1
>> 4. a later unrelated IORING_OP_READ from a pipe using the same buffer
>> group returns 1 byte instead of 4096
>> 5. a second READ uses entry1 and returns 4096, so head/bid accounting
>> appears coherent in this repro
>>
>> I am not claiming privilege escalation from this. The demonstrated
>> issue is persistent provided-buffer descriptor length corruption after
>> a failed/no-data RECV_BUNDLE, affecting a later READ operation.
>
> Right, I believe you already mentioned that in the first email. It's just
> a bug that can cause the app to (rightfully) get confused about the
> state of a buffer.
>
> And it's not a corruption in the sense that something else writes
> to this buffer length field, the kernel is deliberately writing
> to that valid piece of memory. It just misses restoring it when
> the operation fails.

IOW, it's a consistency issue. Words like unprivileged are tossed around
here, but the app could've just written this memory itself without
involving the kernel at all, it's application memory. There's absolutely
nothing privileged going on here, the kernel isn't touching anything
that the application couldn't have changed on its own. The kernel
_should_ not do it for this case, that's the bug. And from a quick look,
the fix would just be to remove that buf->len assignment in this case.
For the normal case of eg wanting to read 32 bytes, where the length
would've been truncated to 32 in the buffer, it should be fine to leave
it at 4096 or whatever size it is. For bundles, userspace must iterate
the buffers when it gets a completion for X bytes. But the iteration
should always be:

unsigned this_len = min(buf->len, left);

and hence it should not matter if buf->len remains at its original,
untouched length for a truncated end buffer.
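
To make that iteration concrete, here is a rough userspace sketch, not
taken from any existing application or from liburing. struct ring_buf is
a stand-in with the same fields as struct io_uring_buf, redeclared so
the snippet builds without kernel headers, and walk_bundle() and its
parameters are hypothetical names used only for illustration. It assumes
the ring has a power-of-two number of entries and that 'left' is the
byte count from the CQE.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for struct io_uring_buf; redeclared here for illustration. */
struct ring_buf {
	uint64_t addr;
	uint32_t len;
	uint16_t bid;
	uint16_t resv;
};

/*
 * Hypothetical helper: split a bundle completion of 'left' bytes across
 * the ring entries starting at 'head'. Stores the bytes consumed from
 * each buffer in used[] and returns the number of buffers touched.
 */
static unsigned walk_bundle(const struct ring_buf *ring, unsigned mask,
			    unsigned head, unsigned left,
			    uint32_t *used, unsigned max_used)
{
	unsigned n = 0;

	/* Assumption: ring size (mask + 1) is a power of two. */
	assert(((mask + 1) & mask) == 0);

	while (left && n < max_used) {
		const struct ring_buf *buf = &ring[(head + n) & mask];
		/* Never trust buf->len alone, clamp to what is left. */
		unsigned this_len = buf->len < left ? buf->len : left;

		if (!this_len)
			break;	/* zero-length entry, avoid spinning */
		used[n++] = this_len;
		left -= this_len;
	}
	return n;
}
```

With two 4096 byte entries and a completion for 4128 bytes, this
attributes 4096 bytes to the first buffer and 32 to the second, even
though the second entry's len still reads 4096. The buffer ID for each
slot is ring[(head + i) & mask].bid.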

--
Jens Axboe