Re: [PATCH AUTOSEL 5.11 43/44] signal: don't allow STOP on PF_IO_WORKER threads

From: Stefan Metzmacher
Date: Thu Mar 25 2021 - 09:57:23 EST


Am 25.03.21 um 14:38 schrieb Jens Axboe:
> On 3/25/21 6:11 AM, Stefan Metzmacher wrote:
>>
>> Am 25.03.21 um 13:04 schrieb Eric W. Biederman:
>>> Stefan Metzmacher <metze@xxxxxxxxx> writes:
>>>
>>>> Am 25.03.21 um 12:24 schrieb Sasha Levin:
>>>>> From: "Eric W. Biederman" <ebiederm@xxxxxxxxxxxx>
>>>>>
>>>>> [ Upstream commit 4db4b1a0d1779dc159f7b87feb97030ec0b12597 ]
>>>>>
>>>>> Just like we don't allow normal signals to IO threads, don't deliver a
>>>>> STOP to a task that has PF_IO_WORKER set. The IO threads don't take
>>>>> signals in general, and have no means of flushing out a stop either.
>>>>>
>>>>> Longer term, we may want to look into allowing stop of these threads,
>>>>> as it relates to eg process freezing. For now, this prevents a spin
>>>>> issue if a SIGSTOP is delivered to the parent task.
>>>>>
>>>>> Reported-by: Stefan Metzmacher <metze@xxxxxxxxx>
>>>>> Signed-off-by: Jens Axboe <axboe@xxxxxxxxx>
>>>>> Signed-off-by: "Eric W. Biederman" <ebiederm@xxxxxxxxxxxx>
>>>>> Signed-off-by: Sasha Levin <sashal@xxxxxxxxxx>
>>>>> ---
>>>>> kernel/signal.c | 3 ++-
>>>>> 1 file changed, 2 insertions(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/kernel/signal.c b/kernel/signal.c
>>>>> index 55526b941011..00a3840f6037 100644
>>>>> --- a/kernel/signal.c
>>>>> +++ b/kernel/signal.c
>>>>> @@ -288,7 +288,8 @@ bool task_set_jobctl_pending(struct task_struct *task, unsigned long mask)
>>>>> JOBCTL_STOP_SIGMASK | JOBCTL_TRAPPING));
>>>>> BUG_ON((mask & JOBCTL_TRAPPING) && !(mask & JOBCTL_PENDING_MASK));
>>>>>
>>>>> - if (unlikely(fatal_signal_pending(task) || (task->flags & PF_EXITING)))
>>>>> + if (unlikely(fatal_signal_pending(task) ||
>>>>> + (task->flags & (PF_EXITING | PF_IO_WORKER))))
>>>>> return false;
>>>>>
>>>>> if (mask & JOBCTL_STOP_SIGMASK)
>>>>>
>>>>
>>>> Again, why is this proposed for 5.11 and 5.10 already?
>>>
>>> Has the bit about the io worker kthreads been backported?
>>> If so this isn't horrible. If not this is nonsense.
>
> No not yet - my plan is to do that, but not until we're 100% satisfied
> with it.

Do you understand why the patches where autoselected for 5.11 and 5.10?

>> I don't know, I hope not...
>>
>> But I just tested v5.12-rc4 and attaching to
>> an application with iothreads with gdb is still not possible,
>> it still loops forever trying to attach to the iothreads.
>
> I do see the looping, gdb apparently doesn't give up when it gets
> -EPERM trying to attach to the threads. Which isn't really a kernel
> thing, but:

Maybe we need to remove the iothreads from /proc/pid/tasks/

>> And I tested 'kill -9 $pidofiothread', and it feezed the whole
>> machine...
>
> that sounds very strange, I haven't seen anything like that running
> the exact same scenario.
>
>> So there's still work to do in order to get 5.12 stable.
>>
>> I'm short on time currently, but I hope to send more details soon.
>
> Thanks! I'll play with it this morning and see if I can provoke
> something odd related to STOP/attach.

Thanks!

Somehow I have the impression that your same_thread_group_account patch
may fix a lot of things...

metze