Re: [PATCH] net: Do not break out of sk_stream_wait_memory() with TIF_NOTIFY_SIGNAL
From: Paolo Abeni
Date: Tue Mar 19 2024 - 08:30:48 EST

On Mon, 2024-03-18 at 13:03 +0100, Sascha Hauer wrote:
> Apologies, I have sent the wrong mail. Here is the mail I really wanted
> to send, with answers to some of the questions Paolo raised the last
> time I sent it.
>
> -----------------------------------8<------------------------------
>
> From 566bb198546423c024cdebc50d0aade7ed638a40 Mon Sep 17 00:00:00 2001
> From: Sascha Hauer <s.hauer@xxxxxxxxxxxxxx>
> Date: Mon, 23 Oct 2023 14:13:46 +0200
> Subject: [PATCH v2] net: Do not break out of sk_stream_wait_memory() with TIF_NOTIFY_SIGNAL
>
> It can happen that a socket sends the remaining data at close() time.
> With io_uring and KTLS it can happen that sk_stream_wait_memory() bails
> out with -512 (-ERESTARTSYS) because TIF_NOTIFY_SIGNAL is set for the
> current task. This flag has been set in io_req_normal_work_add() by
> calling task_work_add().
>
> It seems signal_pending() is too broad, so this patch replaces it with
> task_sigpending(), thus ignoring the TIF_NOTIFY_SIGNAL flag.
>
> A discussion of this issue can be found at
> https://lore.kernel.org/20231010141932.GD3114228@xxxxxxxxxxxxxx
>
> Suggested-by: Jens Axboe <axboe@xxxxxxxxx>
> Fixes: 12db8b690010c ("entry: Add support for TIF_NOTIFY_SIGNAL")
> Link: https://lore.kernel.org/r/20231023121346.4098160-1-s.hauer@xxxxxxxxxxxxxx
> Signed-off-by: Sascha Hauer <s.hauer@xxxxxxxxxxxxxx>
> ---
>
> Changes since v1:
> - only replace signal_pending() with task_sigpending() where we need it,
> in sk_stream_wait_memory()
>
> I'd like to pick up the discussion on this patch as it is still needed for our
> use case. Paolo Abeni raised some concerns about this patch for which I didn't have
> good answers. I am referencing them here again with an attempt to answer them.
> Jens, maybe you also have a few words here.
>
> Paolo raised some concerns in
> https://lore.kernel.org/all/e1e15554bfa5cfc8048d6074eedbc83c4d912c98.camel@xxxxxxxxxx/:
>
> > To be more explicit: why will this not cause user-space driven
> > connect() to miss relevant events?
>
> Note I dropped the hunk in sk_stream_wait_connect() and
> sk_stream_wait_close() in this version.
> Userspace driven signals are still caught by task_sigpending(), which
> tests for TIF_SIGPENDING. signal_pending() will additionally check for
> TIF_NOTIFY_SIGNAL, which is exclusively used by task_work_add() to add
> work to a task.

It looks like even e.g. livepatch would set TIF_NOTIFY_SIGNAL, and
ignoring it could break livepatch for any code waiting e.g. in
tcp_sendmsg()?!?

This change looks scary to me.
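
To make the concern concrete, here is a rough user-space model of the
two checks. It is not kernel code: the MODEL_* flag names and
model_wait_interrupted() are made up for illustration, and only the
relationship between the two predicates is meant to match what
signal_pending() and task_sigpending() do.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * User-space model only. The MODEL_* names are hypothetical stand-ins
 * for the kernel's TIF_SIGPENDING and TIF_NOTIFY_SIGNAL thread flags.
 */
enum {
	MODEL_TIF_SIGPENDING    = 1u << 0,	/* a real signal is queued */
	MODEL_TIF_NOTIFY_SIGNAL = 1u << 1,	/* someone wants the task to leave its wait loop */
};

/* Models task_sigpending(): looks at the signal flag only. */
static inline bool model_task_sigpending(unsigned int flags)
{
	return (flags & MODEL_TIF_SIGPENDING) != 0;
}

/* Models signal_pending(): the notify flag also counts as "pending". */
static inline bool model_signal_pending(unsigned int flags)
{
	if (flags & MODEL_TIF_NOTIFY_SIGNAL)
		return true;
	return model_task_sigpending(flags);
}

/*
 * Hypothetical helper: would a wait loop bail out early?
 * narrow == false models the current check, narrow == true the
 * check proposed in the patch.
 */
static inline bool model_wait_interrupted(unsigned int flags, bool narrow)
{
	return narrow ? model_task_sigpending(flags)
		      : model_signal_pending(flags);
}

/* A task that only has the notify flag set, e.g. by livepatch. */
static inline void model_demo(void)
{
	unsigned int flags = MODEL_TIF_NOTIFY_SIGNAL;

	assert(model_wait_interrupted(flags, false));	/* current: leaves the wait */
	assert(!model_wait_interrupted(flags, true));	/* patched: keeps waiting */
}
```

With only the notify flag set, the patched check keeps the task in the
wait loop, which is exactly the case I am worried about for anything
other than io_uring that relies on that flag.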

I think what Pavel is suggesting is to refactor the KTLS code to ensure
all the writes are completed before releasing the last socket
reference.

I would second such a suggestion.

If really nothing else works, and this change is the only option, try
to obtain an ack from kernel/signal.c maintainers.

Thanks,

Paolo