Re: [PATCH 06/10] exit: Implement kthread_exit

From: Eric W. Biederman
Date: Mon Jan 10 2022 - 10:01:04 EST


David Laight <David.Laight@xxxxxxxxxx> writes:

> From: Eric W. Biederman
>> Sent: 08 January 2022 18:36
>>
>> Al Viro <viro@xxxxxxxxxxxxxxxxxx> writes:
>>
>> > IMO the right way to handle that would be
>> > 1) turn these two do_exit() into do_exit(0), to reduce
>> > confusion
>> > 2) deal with all do_exit() in kthread payloads. Your
>> > name for the primitive is fine, IMO.
>> > 3) make that primitive pass the return value by way of
>> > a field in struct kthread, adjusting kthread_stop() accordingly
>> > and passing 0 to do_exit() in kthread_exit() itself.
>> >
>> > (2) is not as trivial as you seem to hope, though. Your patches
>> > in drivers/staging/rt*/ had papered over the problem in there,
>> > but hadn't really solved it.
>> >
>> > thread_exit() should've been shot, all right, but it really ought
>> > to have been complete_and_exit() there. The thing is, complete()
>> > + return does *not* guarantee that driver won't get unloaded before
>> > the thread terminates. Possibly freeing its .code and leaving
>> > a thread to resume running in there as soon as it regains CPU.
>> >
>> > The point of complete_and_exit() is that it's noreturn *and* in
>> > core kernel. So it can be safely used in a modular kthread,
>> > if paired with wait_for_completion() in or before module_exit.
>> > complete() + do_exit() (or complete + return as you've gotten
>> > there) doesn't give such guarantees at all.
>>
>>
>> I think we are mostly in agreement here.
>>
>> There are kernel threads started by modules that do:
>> complete(...);
>> return 0;
>>
>> That should be at a minimum calling complete_and_exit. Possibly should
>> be restructured to use kthread_stop().
>
> There is also module_put_and_exit(0);
> Which must have an implied THIS_MODULE.

Later in the patch series I change:

	module_put_and_exit -> module_put_and_kthread_exit
	complete_and_exit -> complete_and_kthread_exit

The problem that I understand Al was seeing is where people should
have been using complete_and_exit and were not.
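
A minimal sketch of the pattern being discussed, for illustration only.
The names example_thread, example_done, example_init and example_exit
are hypothetical, and it assumes the complete_and_exit() interface as it
exists before the rename above:

	#include <linux/module.h>
	#include <linux/kthread.h>
	#include <linux/completion.h>
	#include <linux/err.h>

	/* Hypothetical completion signalled when the thread is done. */
	static DECLARE_COMPLETION(example_done);

	static int example_thread(void *unused)
	{
		/* ... thread payload ... */

		/*
		 * Not "complete(&example_done); return 0;": once
		 * complete() has run, module_exit can finish and the
		 * module text can be freed while this thread still has
		 * to execute the return.  complete_and_exit() is
		 * noreturn and lives in the core kernel, so nothing in
		 * the module runs after the completion is signalled.
		 */
		complete_and_exit(&example_done, 0);
	}

	static int __init example_init(void)
	{
		struct task_struct *t;

		t = kthread_run(example_thread, NULL, "example");
		return PTR_ERR_OR_ZERO(t);
	}

	static void __exit example_exit(void)
	{
		/* Pairs with complete_and_exit() in the thread. */
		wait_for_completion(&example_done);
	}

	module_init(example_init);
	module_exit(example_exit);
	MODULE_LICENSE("GPL");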

Eric