Re: [PATCH] kunit: Fix race condition in try-catch completion
From: Rae Moar
Date: Fri Apr 12 2024 - 17:32:44 EST
On Thu, Apr 11, 2024 at 10:59 PM David Gow <davidgow@xxxxxxxxxx> wrote:
>
> KUnit's try-catch infrastructure now uses vfork_done, which is always
> set to a valid completion when a kthread is created, but which is set to
> NULL once the thread terminates. This creates a race condition, where
> the kthread exits before we can wait on it.
>
> Keep a copy of vfork_done, which is taken before we wake_up_process()
> and so valid, and wait on that instead.
>
> Fixes: 4de2a8e4cca4 ("kunit: Handle test faults")
> Reported-by: Linux Kernel Functional Testing <lkft@xxxxxxxxxx>
> Closes: https://lore.kernel.org/lkml/20240410102710.35911-1-naresh.kamboju@xxxxxxxxxx/
> Tested-by: Linux Kernel Functional Testing <lkft@xxxxxxxxxx>
> Acked-by: Mickaël Salaün <mic@xxxxxxxxxxx>
> Signed-off-by: David Gow <davidgow@xxxxxxxxxx>
Hello,
This fix looks good to me. I have tested it and besides the fortify
test error discussed in the previous patch series I am happy.
Thanks!
-Rae
Reviewed-by: Rae Moar <rmoar@xxxxxxxxxx>
> ---
> lib/kunit/try-catch.c | 10 +++++++---
> 1 file changed, 7 insertions(+), 3 deletions(-)
>
> diff --git a/lib/kunit/try-catch.c b/lib/kunit/try-catch.c
> index fa687278ccc9..6bbe0025b079 100644
> --- a/lib/kunit/try-catch.c
> +++ b/lib/kunit/try-catch.c
> @@ -63,6 +63,7 @@ void kunit_try_catch_run(struct kunit_try_catch *try_catch, void *context)
> {
> struct kunit *test = try_catch->test;
> struct task_struct *task_struct;
> + struct completion *task_done;
> int exit_code, time_remaining;
>
> try_catch->context = context;
> @@ -75,13 +76,16 @@ void kunit_try_catch_run(struct kunit_try_catch *try_catch, void *context)
> return;
> }
> get_task_struct(task_struct);
> - wake_up_process(task_struct);
> /*
> * As for a vfork(2), task_struct->vfork_done (pointing to the
> * underlying kthread->exited) can be used to wait for the end of a
> - * kernel thread.
> + * kernel thread. It is set to NULL when the thread exits, so we
> + * keep a copy here.
> */
> - time_remaining = wait_for_completion_timeout(task_struct->vfork_done,
> + task_done = task_struct->vfork_done;
> + wake_up_process(task_struct);
> +
> + time_remaining = wait_for_completion_timeout(task_done,
> kunit_test_timeout());
> if (time_remaining == 0) {
> try_catch->try_result = -ETIMEDOUT;
> --
> 2.44.0.683.g7961c838ac-goog
>