Re: [PATCH 3/3] fork, vhost: Use CLONE_THREAD to fix freezer/ps regression

From: Mike Christie
Date: Tue May 30 2023 - 23:31:06 EST


On 5/29/23 9:38 PM, Eric W. Biederman wrote:
> diff --git a/kernel/vhost_task.c b/kernel/vhost_task.c
> index b7cbd66f889e..f3ce0fa6edd7 100644
> --- a/kernel/vhost_task.c
> +++ b/kernel/vhost_task.c

...

> static int vhost_task_fn(void *data)
> {
> struct vhost_task *vtsk = data;
> - int ret;
> + bool dead = false;
> +
> + for (;;) {
> + bool did_work;
> +
> + /* mb paired w/ kthread_stop */
> + set_current_state(TASK_INTERRUPTIBLE);
> +
> + if (test_bit(VHOST_TASK_FLAGS_STOP, &vtsk->flags)) {
> + __set_current_state(TASK_RUNNING);
> + break;
> + }
> +
> + if (!dead && signal_pending(current)) {
> + struct ksignal ksig;
> + /*
> + * Calling get_signal will block in SIGSTOP,
> + * or clear fatal_signal_pending, but remember
> + * what was set.
> + *
> + * This thread won't actually exit until all
> + * of the file descriptors are closed, and
> + * the release function is called.
> + */
> + dead = get_signal(&ksig);

Hey Eric, the patch works well. Thanks! There was just one small issue.

get_signal() does try_to_freeze() -> ... __might_sleep() which wants the
state to be TASK_RUNNING.

We just need the patch below on top of yours which I think also cleans up
some of the state setting weirdness with the code like where vhost.c calls
__set_current_state(TASK_RUNNING) for each work. It looks like that was
not needed for any reason like a work->fn changing the state and the old
code could have done:

node = llist_del_all(&worker->work_list);
if (!node) {
schedule();
continue;
} else {
__set_current_state(TASK_RUNNING);
}

So I think we can do the following on top of your patch:

diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index 221d1b6c1be5..074273020849 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -346,7 +346,6 @@ static bool vhost_worker(void *data)
smp_wmb();
llist_for_each_entry_safe(work, work_next, node, node) {
clear_bit(VHOST_WORK_QUEUED, &work->flags);
- __set_current_state(TASK_RUNNING);
kcov_remote_start_common(worker->kcov_handle);
work->fn(work);
kcov_remote_stop();
diff --git a/kernel/vhost_task.c b/kernel/vhost_task.c
index f3ce0fa6edd7..fead2ed29561 100644
--- a/kernel/vhost_task.c
+++ b/kernel/vhost_task.c
@@ -29,12 +29,8 @@ static int vhost_task_fn(void *data)
bool did_work;

/* mb paired w/ kthread_stop */
- set_current_state(TASK_INTERRUPTIBLE);
-
- if (test_bit(VHOST_TASK_FLAGS_STOP, &vtsk->flags)) {
- __set_current_state(TASK_RUNNING);
+ if (test_bit(VHOST_TASK_FLAGS_STOP, &vtsk->flags))
break;
- }

if (!dead && signal_pending(current)) {
struct ksignal ksig;
@@ -53,8 +49,10 @@ static int vhost_task_fn(void *data)
}

did_work = vtsk->fn(vtsk->data);
- if (!did_work)
+ if (!did_work) {
+ set_current_state(TASK_INTERRUPTIBLE);
schedule();
+ }
}

complete(&vtsk->exited);