Re: [PATCH net v1] tls: fix hung task in tx_work_handler by using non-blocking sends

From: Jiayuan Chen

Date: Sun Mar 01 2026 - 01:52:15 EST

March 1, 2026 at 01:15, "Jakub Kicinski" <kuba@xxxxxxxxxx> wrote:

>
> On Fri, 27 Feb 2026 14:32:31 +0800 Jiayuan Chen wrote:
>
> >
> > tx_work_handler calls tls_tx_records with flags=-1, which preserves
> > each record's original tx_flags but results in tcp_sendmsg_locked
> > using an infinite send timeout. When the peer is unresponsive and the
> > send buffer is full, tcp_sendmsg_locked blocks indefinitely in
> > sk_stream_wait_memory. This causes tls_sk_proto_close to hang in
> > cancel_delayed_work_sync waiting for tx_work_handler to finish,
> > leading to a hung task:
> >
> > INFO: task ...: blocked for more than ... seconds.
> > Call Trace:
> > cancel_delayed_work_sync
> > tls_sw_cancel_work_tx
> > tls_sk_proto_close
> >
> > A workqueue handler should never block indefinitely. Fix this by
> > introducing __tls_tx_records() with an extra_flags parameter that
> > gets OR'd into each record's tx_flags. tx_work_handler uses this to
> > pass MSG_DONTWAIT so tcp_sendmsg_locked returns -EAGAIN immediately
> > when the send buffer is full, without overwriting the original
> > per-record flags (MSG_MORE, MSG_NOSIGNAL, etc.). On -EAGAIN, the
> > existing reschedule mechanism retries after a short delay.
> >
> > Also consolidate the two identical reschedule paths (lock contention
> > and -EAGAIN) into one.
> >
> It's not that simple. The default semantics for TCP sockets is that
> queuing data and then calling close() is a legitimate thing to do
> and the data should be sent cleanly, followed by a normal FIN in such
> case.
>
> Maybe we should explore trying to make sure we have enough wmem before
> we start creating records. Get rid of the entire workqueue mess?

Regarding the wmem pre-check: the async crypto path is not triggered by
wmem shortage. It is triggered when the crypto operation itself is
asynchronous (e.g. cryptd fallback when SIMD is unavailable). At the
time tls_do_encryption() returns -EINPROGRESS, wmem may be perfectly
fine. The problem occurs later, when tls_encrypt_done() fires and
tx_work_handler tries to push the completed records; by that point
the send buffer may have filled up. Since these are two different
points in time, pre-checking wmem at record creation wouldn't help.
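
To make the timing argument concrete, here is a rough userspace model,
not kernel code. Every name in it (model_sock, model_wmem_available,
model_push_record, model_async_timeline) is hypothetical and exists only
for illustration. It assumes a fixed-size send buffer that never drains
because the peer is unresponsive:

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a socket's send buffer accounting. */
struct model_sock {
	size_t sndbuf;	/* send buffer capacity in bytes */
	size_t queued;	/* bytes queued and not yet acked by the peer */
};

/* Would a record of len bytes fit right now? */
static bool model_wmem_available(const struct model_sock *sk, size_t len)
{
	return sk->queued + len <= sk->sndbuf;
}

/* Non-blocking push: 0 on success, -EAGAIN when the buffer is full. */
static int model_push_record(struct model_sock *sk, size_t len)
{
	if (!model_wmem_available(sk, len))
		return -EAGAIN;
	sk->queued += len;
	return 0;
}

/*
 * T0: the record is created and the pre-check runs; encryption is
 *     then handed to an async crypto backend.
 * T0..T1: other data is queued while the peer acks nothing.
 * T1: the crypto completion fires and the worker pushes the record.
 *
 * Returns the result of the push at T1; *precheck_ok reports what a
 * wmem check at T0 would have said.
 */
static int model_async_timeline(struct model_sock *sk, size_t rec_len,
				size_t other_len, bool *precheck_ok)
{
	*precheck_ok = model_wmem_available(sk, rec_len);
	(void)model_push_record(sk, other_len);
	return model_push_record(sk, rec_len);
}
```

With a 4096-byte buffer, a 1024-byte record and 4096 bytes queued in
between, the pre-check at T0 passes and the push at T1 still gets
-EAGAIN.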

> Regarding your patch I think all callers passing -1 as flags are on
> the close path, you could have just added | DONTWAIT if the flags
> are -1.

Regarding adding MSG_DONTWAIT unconditionally when flags == -1:
tls_sw_release_resources_tx() also calls tls_tx_records(sk, -1).
That's in the close path where we actually want to block and flush
remaining records to honour the "close() should send data cleanly"
semantics you mentioned. Making that non-blocking would cause data
loss. So we do need to distinguish between the two callers, which
is why I introduced __tls_tx_records() with the extra_flags parameter.
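
As a sketch of the flag handling only (userspace, not the patch itself),
the idea is: keep the existing "flags == -1 means use the record's own
tx_flags" rule, then OR in whatever the caller passes as extra_flags.
model_rec and model_effective_tx_flags are hypothetical names used here
for illustration:

```c
#include <sys/socket.h>

/* Hypothetical stand-in for the per-record flags kept in a TLS record. */
struct model_rec {
	int tx_flags;
};

/*
 * flags == -1 selects the record's original tx_flags; any other value
 * overrides them. extra_flags is OR'd in afterwards, so the worker can
 * add MSG_DONTWAIT without losing MSG_MORE, MSG_NOSIGNAL, etc.
 */
static int model_effective_tx_flags(const struct model_rec *rec,
				    int flags, int extra_flags)
{
	int tx_flags = (flags == -1) ? rec->tx_flags : flags;

	return tx_flags | extra_flags;
}
```

The worker would call this with (-1, MSG_DONTWAIT) and the close path
with (-1, 0), so only the worker becomes non-blocking.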

Thanks,

> pw-bot: cr
>