Re: [PATCHv2 1/1] lib/tests: test_ratelimit: fix stress test thread lifecycle and leak

From: Petr Mladek

Date: Thu May 21 2026 - 09:26:23 EST


On Tue 2026-05-19 09:37:45, Jia He wrote:
> The stress test checks the kthread pointer in the child thread, but the
> thread was created with kthread_run(). Since kthread_run() wakes the
> thread before returning, the child can run before sktp[i].tp is assigned,
> triggering the WARN_ON_ONCE() check on an arm64 N2 server with 128 cores.

True.

> Use kthread_create() instead, assign sktp[i].tp after successful
> creation, and then wake the thread explicitly. This guarantees that the
> child observes an initialized task pointer.

IMHO, this does not guarantee that the task pointer has been assigned.
AFAIK, there might be spurious wakeups, for example, by the
suspend/resume or livepatching code.

I have another idea, see below.

> Also add a common cleanup path for thread creation failures. If creating
> one of the later threads fails, stop all threads that were already
> started and free the allocated array instead of leaving orphan kthreads
> and leaked memory behind.
>
> Finally, replace the module-static doneflag with kthread_should_stop().
> With the doneflag, child threads may exit before the parent calls
> kthread_stop(), so the task lifetime is no longer guaranteed when the
> parent later tries to stop them. Using kthread_should_stop() keeps each
> child alive until kthread_stop() synchronously terminates it.

This part looks good to me.

> Signed-off-by: Jia He <justin.he@xxxxxxx>
> ---
> v2: replace the module-static doneflag with kthread_should_stop() to avoid
> UAF (sashiko-bot)
> lib/tests/test_ratelimit.c | 37 ++++++++++++++++++++++++++-----------
> 1 file changed, 26 insertions(+), 11 deletions(-)
>
> diff --git a/lib/tests/test_ratelimit.c b/lib/tests/test_ratelimit.c
> index 33cea5f3d28b..d95c49e914c7 100644
> --- a/lib/tests/test_ratelimit.c
> +++ b/lib/tests/test_ratelimit.c
> @@ -68,7 +68,6 @@ static void test_ratelimit_smoke(struct kunit *test)
> static struct ratelimit_state stressrl = RATELIMIT_STATE_INIT_FLAGS("stressrl", HZ / 10, 3,
> RATELIMIT_MSG_ON_RELEASE);
>
> -static int doneflag;
> static const int stress_duration = 2 * HZ;
>
> struct stress_kthread {
> @@ -86,7 +85,7 @@ static int test_ratelimit_stress_child(void *arg)
> set_user_nice(current, MAX_NICE);
> WARN_ON_ONCE(!sktp->tp);

It seem that "sktp->tp" is not really used in this function.
I am not sure why the WARN_ON_ONCE() is there.

It would simply remove this check and keep the original kthread_run().

>
> - while (!READ_ONCE(doneflag)) {
> + while (!kthread_should_stop()) {
> sktp->nattempts++;
> if (___ratelimit(&stressrl, __func__))
> sktp->nunlimited++;

Best Regards,
Petr