Re: re. Spurious wakeup on a newly created kthread

From: Eric W. Biederman
Date: Sat Jun 25 2022 - 19:41:34 EST


"Eric W. Biederman" <ebiederm@xxxxxxxxxxxx> writes:

> Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> writes:
>
>> On Sat, Jun 25, 2022 at 11:25 AM Linus Torvalds
>> <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>>>
>>> And that's not at all what the kthread code wants. It wants to set
>>> affinity masks, it wants to create a name for the thread, it wants to
>>> do all those other things.
>>>
>>> That code really wants to just do copy_process().
>>
>> Honestly, I think kernel/kthread.c should be almost rewritten from scratch.
>>
>> I do not understand why it does all those odd keventd games at all,
>> and why kthread_create_info exists in the first place.
>
> I presume you mean kthreadd games?
>
>> Why does kthread_create() not just create the thread directly itself,
>> and instead does that odd queue it onto a work function?
>>
>> Some of that goes back to before the git history, and very little of
>> it seems to make any sense. It's as if the code is meant to be able to
>> run from interrupt context, but that can't be it: it's literally doing
>> a GFP_KERNEL kmalloc, it's doing spin-locks without irq safety etc.
>>
>> So why is it calling kthreadd_task() to create the thread? Purely for
>> some crazy odd "make that the parent" reason?
>>
>> I dunno. The code is odd, unexplained, looks buggy, and most fo the
>> reasons are probably entirely historical.
>
> I can explain why kthreadd exists and why it creates the threads.
>
> Very long ago in the context of random userspace processes people would
> use kernel_thread to create threads and a helper function that I think
> was called something like kernel_daemonize to scrub the userspace bits
> off.
>
> It was an unending sources of problems as the scrub was never complete
> nor correct.
>
> So with the introduction of kthreadd the kernel threads were moved
> out of the userspace process tree, and userspace stopped being able to
> influence the kernel threads.
>
> AKA instead of doing the equivalent of a suid exec the code started
> going the equivalent sshing into the local box.
>
> We *need* to preserve that kind of separation.
>
> I want to say that all that is required is that copy_process copies
> from kthreadd. Unfortunately that means that it needs to be kthreadd
> doing the work, as copy_process does always copies from current. It
> would take quite a bit of work to untangle that mess.
>
> It does appear possible to write a parallel function to copy_process
> that is used only for creating kernel threads, and can streamline itself
> because it knows it is creating kernel threads.
>
> Short of that the code needs to keep routing through kthreadd.
>
> Using create_io_thread or a dedicated wrapper around copy_process
> certainly looks like it could simplify some of kthread creation.

Hmm. Looking at kthread() I completely agree that kernel_thread() has
the wrong set of semantics and we really could benefit from never waking
the fledgling kernel thread in the first place.

Eric