Re: [PATCH 0/2] Don't show PF_IO_WORKER in /proc/<pid>/task/

From: Jens Axboe
Date: Thu Mar 25 2021 - 17:51:15 EST


On 3/25/21 2:43 PM, Eric W. Biederman wrote:
> Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> writes:
>
>> On Thu, Mar 25, 2021 at 12:42 PM Linus Torvalds
>> <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>>>
>>> On Thu, Mar 25, 2021 at 12:38 PM Linus Torvalds
>>> <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>>>>
>>>> I don't know what the gdb logic is, but maybe there's some other
>>>> option that makes gdb not react to them?
>>>
>>> .. maybe we could have a different name for them under the task/
>>> subdirectory, for example (not just the pid)? Although that probably
>>> messes up 'ps' too..
>>
>> Actually, maybe the right model is to simply make all the io threads
>> take signals, and get rid of all the special cases.
>>
>> Sure, the signals will never be delivered to user space, but if we
>>
>> - just made the thread loop do "get_signal()" when there are pending signals
>>
>> - allowed ptrace_attach on them
>>
>> they'd look pretty much like regular threads that just never do the
>> user-space part of signal handling.
>>
>> The whole "signals are very special for IO threads" thing has caused
>> so many problems, that maybe the solution is simply to _not_ make them
>> special?
>
> The special case in check_kill_permission is certainly unnecessary.
> Having the signal blocked is enough to prevent signal_pending() from
> being true.
>
>
> The most straight forward thing I can see is to allow ptrace_attach and
> to modify ptrace_check_attach to always return -ESRCH for io workers
> unless ignore_state is set causing none of the other ptrace operations
> to work.
>
> That is what a long running in-kernel thread would do today so
> user-space aka gdb may actually cope with it.
>
>
> We might be able to support if io workers start supporting SIGSTOP but I
> am not at all certain.

See the patch I just sent out. It's mostly a POC and not fully sanitized
yet. But I did try returning -ESRCH from ptrace_check_attach() if it's an
IO thread and ignore_state isn't set:

	if (!ignore_state && child->flags & PF_IO_WORKER)
		return -ESRCH;

and that causes gdb to abort at that thread. For the same test case
as in the previous email, you get:

Attaching to process 358
[New LWP 359]
[New LWP 360]
[New LWP 361]
Couldn't get CS register: No such process.
(gdb) 0x00007ffa58537125 in ?? ()

(gdb) bt
#0 0x00007ffa58537125 in ?? ()
#1 0x0000000000000000 in ?? ()
(gdb) info threads
  Id   Target Id             Frame
* 1    LWP 358 "io_uring"    0x00007ffa58537125 in ?? ()
  2    LWP 359 "iou-mgr-358" Couldn't get registers: No such process.
(gdb) q
A debugging session is active.

Inferior 1 [process 358] will be detached.

Quit anyway? (y or n) y
Couldn't write debug register: No such process.

where 360 is a regular pthread-created thread, and 361 is another
iou-mgr-x task. While gdb behaves better in this case, it still stops at
the first IO thread, which prevents you from inspecting thread 3, and
that would be a totally valid thing to do.
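
For anyone who wants to poke at the logic outside the kernel, here is a
minimal user-space sketch of just that early return. It is not kernel
code: struct model_task, model_check_attach() and model_selfcheck() are
made-up names, the flag value is only there for illustration, and a return
of 0 stands in for "carry on with the rest of ptrace_check_attach()".

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Illustrative bit only; the kernel's definition lives in include/linux/sched.h. */
#define PF_IO_WORKER 0x00000010u

/* Hypothetical stand-in for the single task_struct field the check reads. */
struct model_task {
	unsigned int flags;
};

/*
 * Models only the early return tried above: an IO worker reports -ESRCH
 * unless the caller passed ignore_state. Returning 0 here means "not
 * rejected by this check", nothing more.
 */
int model_check_attach(const struct model_task *child, bool ignore_state)
{
	if (!ignore_state && (child->flags & PF_IO_WORKER))
		return -ESRCH;
	return 0;
}

/* Self-check covering the four combinations of flag and ignore_state. */
void model_selfcheck(void)
{
	const struct model_task worker = { .flags = PF_IO_WORKER };
	const struct model_task normal = { .flags = 0 };

	assert(model_check_attach(&worker, false) == -ESRCH);
	assert(model_check_attach(&worker, true) == 0);
	assert(model_check_attach(&normal, false) == 0);
	assert(model_check_attach(&normal, true) == 0);
}
```

As far as I can tell, ptrace only passes ignore_state for PTRACE_KILL and
PTRACE_INTERRUPT, so with this check any register read on an IO worker
comes back as -ESRCH, which would match the "No such process" messages in
the gdb output above.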

--
Jens Axboe