Re: [PATCH 3/3] fork, vhost: Use CLONE_THREAD to fix freezer/ps regression

From: Eric W. Biederman
Date: Mon May 29 2023 - 22:49:10 EST

michael.christie@xxxxxxxxxx writes:

> On 5/29/23 2:35 PM, Mike Christie wrote:
>>> Hmm... If we you CLONE_THREAD the exiting vhost_worker() will auto-reap itself,
>> Oh wait, are you saying that when we get auto-reaped then we would do the last
>> fput and call the file_operations->release function right? We actually set
>> task_struct->files = NULL for the vhost_task task_struct, so I think we call
>> release a little sooner than you think.
>> vhost_task_create() sets kernel_clone_args->no_files, so the vhost_task task_struc
>> that gets created works like kthreads where it doesn't do a CLONE_FILES and it
>> doesn't do a dup_fd.
>> So when we do de_thread() -> zap_other_threads(), that will kill all the threads
>> in the group right? So when they exit, it will call our release function since
>> we don't have refcount on ourself.
> Just to make sure I'm on the same page now.
> In the past thread when were discussing the patch below and you guys were saying
> that it doesn't really ignore SIGKILL because we will hit the
> SIGNAL_GROUP_EXIT/group_exec_task checks and the parent is going to exit, it was
> on purpose.
> Instead of a signal to tell me when do exit, I was using the parent exiting and doing
> the last fput on the vhost device's file which calls our release function. That then
> allowed the vhost code to use the vhost_task to handle the outstanding IOs and then
> do vhost_task_should_stop to tell the vhost_task to exit when the oustanding IO
> had completed.
> On the vhost side of things it's really nice, because all the shutdown paths work
> the same and we don't need rcu/locking in the main IO path to handle the vhost_task
> exiting while we are using it.

The code below does nothing for exec.

You really need to call get_signal to handle SIGSTOP/freeze/process exit.

A variant on my coredump proposal looks like it can handle exec as well.
It isn't pretty but it looks good enough for right now.

If you could test the patch I posted up thread I think that is something
imperfect that is good enough for now.