Re: [PATCH net v3] net: tls: use sync AEAD for sk_msg BPF sockets

From: John Fastabend

Date: Wed May 27 2026 - 15:16:24 EST

On Wed, May 27, 2026 at 01:09:44PM +0800, Jiayuan Chen wrote:

On 5/27/26 7:11 AM, Jakub Kicinski wrote:

On Tue, 26 May 2026 14:44:24 +0800 Jiayuan Chen wrote:

If async_capable is set to 1, the zerocopy path in tls_sw_sendmsg() is
skipped. Unfortunately, ktls with bpf_msg_pop_data() does not work
correctly under this copy path.

tls_clone_plaintext_msg() aliases msg_pl onto msg_en's plaintext area
(in-place encryption).

BPF runs bpf_msg_pop_data(msg, 0, 2). This shifts msg_pl's SG entry
forward by 2 bytes. The two SGs now point to the same page at different
offsets: the physical memory overlaps, but the start addresses differ.

Ugh, do you mean that the memcopy path is broken? There are other
conditions under which we may fall into it than just !async_capable :(
Small send with MSG_MORE is probably the easiest?

So we need to fix that one way or the other.

Yes, the memcopy path is broken, but only when combined with sockmap's pop helper.

msg_pl and msg_en share the underlying page:

       msg_pl             msg_pl end
       ^                  ^
|------|------------------|-------|
| hdr  |    plaintext     |  tag  |
|------|------------------|-------|
^                                 ^
|                                 |
msg_en                            msg_en end

Before encryption, sge->offset += prot->prepend_size is applied
to msg_en so that the encryption's dst and src point to the same
block of memory.

But once pop has run — i.e. msg_pl's start advances — the encryption's dst and src
are no longer the same.
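
A minimal userspace sketch of the offset arithmetic described above. The
struct and the toy_* helpers are hypothetical stand-ins, not the kernel's
scatterlist or sk_msg code; they only model how skipping the header lines
msg_en up with msg_pl, and how a pop at the front of msg_pl breaks that.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one scatterlist entry; not the kernel struct. */
struct toy_sge {
	const void *page;	/* backing page */
	size_t offset;		/* start of data within the page */
	size_t length;		/* bytes of data */
};

/* Models the "sge->offset += prot->prepend_size" step applied to msg_en. */
static void toy_skip_header(struct toy_sge *en, size_t prepend_size)
{
	assert(prepend_size <= en->length);
	en->offset += prepend_size;
	en->length -= prepend_size;
}

/* Models popping `len` bytes from the front of the plaintext entry. */
static void toy_pop_front(struct toy_sge *pl, size_t len)
{
	assert(len <= pl->length);
	pl->offset += len;
	pl->length -= len;
}

/* In-place only if src and dst start at the same byte of the same page. */
static bool toy_same_start(const struct toy_sge *src,
			   const struct toy_sge *dst)
{
	return src->page == dst->page && src->offset == dst->offset;
}
```

With an illustrative 5-byte header, toy_same_start() holds after
toy_skip_header() and stops holding after toy_pop_front(&pl, 2): the two
entries then sit on the same page, 2 bytes apart.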

crypto_ctr_crypt():
When dst and src have the same address, crypto saves the encryption result into a
temporary buffer and then writes it back to dst.

When dst and src have different addresses, the crypto module treats them as two
separate buffers and no longer uses in-place mode.
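
A toy userspace model of why a partial overlap is harmful. This is not the
kernel's CTR code: the keystream is a made-up pattern, and the sketch
assumes an out-of-place routine that writes the keystream straight into
dst before XORing src on top, which is one plausible way for dst to
clobber src bytes that have not been read yet.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_BLOCK 16

/* Hypothetical keystream: NOT a real cipher, just a fixed pattern. */
static void toy_keystream(uint8_t out[TOY_BLOCK], uint32_t ctr)
{
	for (size_t i = 0; i < TOY_BLOCK; i++)
		out[i] = (uint8_t)(0xA5u ^ (ctr * 31u + i * 7u));
}

/* In-place: keystream goes to a temporary buffer, then is XORed in. */
static void toy_ctr_inplace(uint8_t *buf, size_t nblocks)
{
	uint8_t ks[TOY_BLOCK];

	for (size_t b = 0; b < nblocks; b++) {
		toy_keystream(ks, (uint32_t)b);
		for (size_t i = 0; i < TOY_BLOCK; i++)
			buf[b * TOY_BLOCK + i] ^= ks[i];
	}
}

/*
 * Out-of-place: assumes dst and src do not overlap, so the keystream is
 * written directly into dst and src is XORed on top of it.
 */
static void toy_ctr_separate(uint8_t *dst, const uint8_t *src,
			     size_t nblocks)
{
	for (size_t b = 0; b < nblocks; b++) {
		toy_keystream(dst + b * TOY_BLOCK, (uint32_t)b);
		for (size_t i = 0; i < TOY_BLOCK; i++)
			dst[b * TOY_BLOCK + i] ^= src[b * TOY_BLOCK + i];
	}
}
```

Calling toy_ctr_separate(page, page + 2, n) on a shared buffer gives a
different result from toy_ctr_inplace() on the same plaintext, while the
two agree when dst and src are genuinely separate.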

it's complicated to process pop/push + head/mid/tail...

For our use case (not deployed yet, but deployed in the non-kTLS case)
all we do is observe data and possibly drop the skb if it has
malicious HTTP headers, for example.

All this push/pop/... in the middle of the kTLS stack is painful.

One option: we start rejecting these helpers? That would resolve most of
the pain, I suspect. The original thought was that we do have use cases
now for a userspace proxy where we insert headers.

I think selecting a sync provider via mask = CRYPTO_ALG_ASYNC is
sufficient to remove the -EINPROGRESS return path.
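
A small userspace sketch of how a type/mask pair can select only
synchronous implementations. The toy_* names are hypothetical; the flag
value is assumed to mirror the kernel's CRYPTO_ALG_ASYNC bit, and the
comparison is a simplified model of the lookup, not the crypto API
itself.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed to mirror the kernel's CRYPTO_ALG_ASYNC flag bit. */
#define TOY_ALG_ASYNC 0x00000080u

/*
 * An implementation matches when its flags agree with `type` on every
 * bit that is set in `mask`; bits outside the mask are ignored.
 */
static bool toy_alg_matches(uint32_t alg_flags, uint32_t type,
			    uint32_t mask)
{
	return ((alg_flags ^ type) & mask) == 0;
}
```

With type = 0 and mask = TOY_ALG_ASYNC, only implementations with the
async bit clear match. In the kernel this corresponds to the common
crypto_alloc_aead(name, 0, CRYPTO_ALG_ASYNC) idiom.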

May be time to remove skmsg from ktls? (disable by default first,
re-enable via a new ktls module_param?)

Yes, we asked John F off-list to get his attention and I think there's
only a vague plan to start using kTLS + sockmap, no current user
(sorry if I misread / misremembered).

I'm not against a cleaner solution here.

Another idea: We just add a simple sockops BPF hook with the sk_buff?
No updating sg lists, manipulating data packet sizes and so on.

That would solve the vast majority of any future use case if we have
a user that really started running kTLS and wanted the security stack
to keep working. Even openssl usage of kTLS has really ground to a
halt after it was initially added as far as I can tell.

Something like this is already on the list for the recv side of tcp.

[PATCH v3 bpf-next 10/11] bpf: tcp: Add SOCK_OPS rcvlowat hook

module params aren't a great API. If we want to deprecate it let's just
remove the integration in net-next. You have my vote.