Re: [RFC PATCH 1/8] signal: Dequeue SIGKILL even if SIGNAL_GROUP_EXIT/group_exec_task is set

From: Eric W. Biederman
Date: Thu May 18 2023 - 14:31:47 EST


Oleg Nesterov <oleg@xxxxxxxxxx> writes:

> On 05/18, Mike Christie wrote:
>>
>> On 5/18/23 11:25 AM, Oleg Nesterov wrote:
>> > I too do not understand the 1st change in this patch ...
>> >
>> > On 05/18, Mike Christie wrote:
>> >>
>> >> In the other patches we do:
>> >>
>> >> if (get_signal(ksig))
>> >> start_exit_cleanup_by_stopping_newIO()
>> >> flush running IO()
>> >> exit()
>> >>
>> >> But to do the flush running IO() part of this I need to wait for it so
>> >> that's why I wanted to be able to dequeue the SIGKILL and clear the
>> >> TIF_SIGPENDING bit.
>> >
>> > But get_signal() will do what you need, dequeue SIGKILL and clear SIGPENDING ?
>> >
>> > if ((signal->flags & SIGNAL_GROUP_EXIT) ||
>> > signal->group_exec_task) {
>> > clear_siginfo(&ksig->info);
>> > ksig->info.si_signo = signr = SIGKILL;
>> > sigdelset(&current->pending.signal, SIGKILL);
>> >
>> > this "dequeues" SIGKILL,
>
> OOPS. this doesn't remove SIGKILL from current->signal->shared_pending

Neither does calling get_signal the first time.
But the second time get_signal is called it should work.

Leaving SIGKILL in current->signal->shared_pending when it has already
been short circuit delivered appears to be an out and out bug.

>> >
>> > trace_signal_deliver(SIGKILL, SEND_SIG_NOINFO,
>> > &sighand->action[SIGKILL - 1]);
>> > recalc_sigpending();
>> >
>> > this clears TIF_SIGPENDING.
>
> No, I was wrong, recalc_sigpending() won't clear TIF_SIGPENDING if
> SIGKILL is in signal->shared_pending

That feels wrong as well.

>> I see what you guys meant. TIF_SIGPENDING isn't getting cleared.
>> I'll dig into why.
>
> See above, sorry for confusion.
>
>
>
> And again, there is another problem with SIGSTOP. To simplify, suppose
> a PF_IO_WORKER thread does something like
>
> while (signal_pending(current))
> get_signal(...);
>
> this will loop forever if (SIGNAL_GROUP_EXIT || group_exec_task) and
> SIGSTOP is pending.

I think we want to do something like the untested diff below.

That the PF_IO_WORKER test allows get_signal to be called
after get_signal returns a fatal aka SIGKILL seems wrong.
That doesn't happen in the io_uring case, and certainly nowhere
else.

The change to complete_signal appears obviously correct although
a pending siginfo still needs to be handled.

The change to recalc_siginfo also appears mostly right, but I am not
certain that the !freezing test is in the proper place. Nor am I
certain it won't have other surprise effects.

Still the big issue seems to be the way get_signal is connected into
these threads so that it keeps getting called. Calling get_signal after
a fatal signal has been returned happens nowhere else and even if we fix
it today it is likely to lead to bugs in the future because whoever is
testing and updating the code is unlikely they have a vhost test case
the care about.

diff --git a/kernel/signal.c b/kernel/signal.c
index 8f6330f0e9ca..4d54718cad36 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -181,7 +181,9 @@ void recalc_sigpending_and_wake(struct task_struct *t)

void recalc_sigpending(void)
{
- if (!recalc_sigpending_tsk(current) && !freezing(current))
+ if ((!recalc_sigpending_tsk(current) && !freezing(current)) ||
+ ((current->signal->flags & SIGNAL_GROUP_EXIT) &&
+ !__fatal_signal_pending(current)))
clear_thread_flag(TIF_SIGPENDING);

}
@@ -1043,6 +1045,13 @@ static void complete_signal(int sig, struct task_struct *p, enum pid_type type)
* This signal will be fatal to the whole group.
*/
if (!sig_kernel_coredump(sig)) {
+ /*
+ * The signal is being short circuit delivered
+ * don't it pending.
+ */
+ if (type != PIDTYPE_PID) {
+ sigdelset(&t->signal->shared_pending, sig);
+
/*
* Start a group exit and wake everybody up.
* This way we don't have other threads



Eric