Re: [PATCH] io_uring/net: don't fail linked ops when done_io > 0
From: Hannes Furmans
Date: Fri Feb 27 2026 - 11:25:40 EST
Hi Stefan,
On 27.02.26 at 14:59, Stefan Metzmacher wrote:
> That's by design, if a MSG_WAITALL calls fails it means
> not call data the caller expected arrived or were sent.
> When there's a LINK after that the linked operation likely
> relies on all expected data being processed! Otherwise
> the message stream can get out of sync and causes corruption.
You're right: a short MSG_WAITALL read should sever the IO_LINK
chain. The v1 patch was wrong to guard req_set_fail() on done_io > 0.
> Let's assume I want to send a message header with
> IO_SEND linked with a IO_SPLICE to send the payload.
>
> If IO_SEND returns short the situation needs to be
> recovered by the caller instead of letting the
> IO_SPLICE give more data to the socket.
Agreed, the linked operation expects the complete data.
> So the current behavior is exactly what MSG_WAITALL
> gives you. If you don't want that why are you using it
> at all?
The actual bug is narrower. I traced the root cause with kTLS.
When IORING_OP_RECV is used with MSG_WAITALL on a kTLS socket,
the recv completes successfully (ret >= min_ret, full requested
amount received). But kTLS calls put_cmsg(SOL_TLS,
TLS_GET_RECORD_TYPE) for the first record of every recvmsg call
(tls_sw.c:1843). Since io_recv sets up the msghdr with
msg_control=NULL and msg_controllen=0, put_cmsg has nowhere to
store the cmsg and sets MSG_CTRUNC in msg_flags.
Then io_recv hits the else-if branch:
	} else if ((flags & MSG_WAITALL) &&
		   (msg_flags & (MSG_TRUNC | MSG_CTRUNC))) {
		req_set_fail(req);
	}
This sets REQ_F_FAIL on a fully successful recv. The CQE shows
the full byte count, but the linked write gets -ECANCELED.
I confirmed this with ftrace: the recv completes with
result=67108864 (exactly the 64 MiB requested), then
io_uring_fail_link fires immediately afterwards from an io-wq worker.
I also confirmed with a plain recvmsg debug tool that kTLS
returns msg_flags=0x88 (MSG_EOR | MSG_CTRUNC) on every call.
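The MSG_CTRUNC behaviour can be reproduced without kTLS. The sketch
below is not the debug tool mentioned above. It assumes Linux and uses
an AF_UNIX socketpair with SCM_RIGHTS purely as a stand-in source of
ancillary data; recv_flags_without_cmsg_buffer() is a hypothetical
helper name.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>

/*
 * Stand-in for the kTLS case (assumption: AF_UNIX + SCM_RIGHTS is used
 * only because it also generates ancillary data on receive).
 * Returns the msg_flags reported by recvmsg() when the receiver supplies
 * no control buffer, or -1 on error.
 */
int recv_flags_without_cmsg_buffer(void)
{
	int sv[2];
	char byte = 'x';
	char inbuf[1];
	union {
		char buf[CMSG_SPACE(sizeof(int))];
		struct cmsghdr align;	/* forces correct alignment */
	} ctrl;
	struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
	struct msghdr out = { 0 };
	struct msghdr in = { 0 };
	struct cmsghdr *cmsg;
	int flags = -1;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
		return -1;

	/* Send one data byte plus one file descriptor as ancillary data. */
	memset(&ctrl, 0, sizeof(ctrl));
	out.msg_iov = &iov;
	out.msg_iovlen = 1;
	out.msg_control = ctrl.buf;
	out.msg_controllen = sizeof(ctrl.buf);
	cmsg = CMSG_FIRSTHDR(&out);
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	cmsg->cmsg_len = CMSG_LEN(sizeof(int));
	memcpy(CMSG_DATA(cmsg), &sv[0], sizeof(int));

	if (sendmsg(sv[0], &out, 0) == 1) {
		/* Receive like io_recv does: msg_control=NULL, msg_controllen=0. */
		iov.iov_base = inbuf;
		iov.iov_len = sizeof(inbuf);
		in.msg_iov = &iov;
		in.msg_iovlen = 1;
		if (recvmsg(sv[1], &in, 0) == 1)
			flags = in.msg_flags;
	}

	close(sv[0]);
	close(sv[1]);
	return flags;
}
```

On Linux the returned flags should include MSG_CTRUNC even though the
data byte itself was received in full, which is the same situation
io_recv ends up in on a kTLS socket.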
Your commit 0031275d119e says "For IORING_OP_RECVMSG we also
check for the MSG_TRUNC and MSG_CTRUNC flags" but the code
applies the check to IORING_OP_RECV as well. MSG_CTRUNC is
meaningful for IORING_OP_RECVMSG (user provides a cmsg buffer).
It's meaningless for IORING_OP_RECV which never has a cmsg
buffer.
I'll send a v2 that removes only MSG_CTRUNC from the io_recv
check.
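For illustration, the intended behaviour can be modelled in user space
roughly as below. This is only a sketch, not the v2 patch:
recv_should_fail_link() and its has_cmsg_buffer parameter are
hypothetical names, and it covers only the truncation-flag branch, not
the short-read (ret < min_ret) handling.

```c
#include <stdbool.h>
#include <sys/socket.h>

/*
 * Hypothetical user-space model of the truncation check, not kernel code.
 * has_cmsg_buffer is true for the RECVMSG-style case (user may supply a
 * cmsg buffer) and false for the RECV-style case (never has one).
 */
bool recv_should_fail_link(unsigned int flags, unsigned int msg_flags,
			   bool has_cmsg_buffer)
{
	unsigned int mask = MSG_TRUNC;

	/* MSG_CTRUNC only carries information if a cmsg buffer could exist. */
	if (has_cmsg_buffer)
		mask |= MSG_CTRUNC;

	return (flags & MSG_WAITALL) && (msg_flags & mask);
}
```

With msg_flags=0x88 (MSG_EOR | MSG_CTRUNC) as seen on kTLS, this model
fails the link for the RECVMSG-style case but not for the RECV-style
case, while MSG_TRUNC still fails the link in both.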
Thanks,
Hannes