Re: [PATCH] sched/core: Mitigate race cpus_share_cache()/update_top_cache_domain()
From: Shinichiro Kawasaki
Date: Mon Nov 15 2021 - 05:46:20 EST
On Nov 04, 2021 / 17:51, Vincent Donnefort wrote:
> Nothing protects the access to the per_cpu variable sd_llc_id. When testing
> the same CPU (i.e. this_cpu == that_cpu), a race condition exists with
> update_top_cache_domain(). One scenario being:
>
> CPU1 CPU2
> ==================================================================
>
> per_cpu(sd_llc_id, CPUX) => 0
> partition_sched_domains_locked()
> detach_destroy_domains()
> cpus_share_cache(CPUX, CPUX) update_top_cache_domain(CPUX)
> per_cpu(sd_llc_id, CPUX) => 0
> per_cpu(sd_llc_id, CPUX) = CPUX
> per_cpu(sd_llc_id, CPUX) => CPUX
> return false
>
> ttwu_queue_cond() wouldn't catch smp_processor_id() == cpu and the result
> is a warning triggered from ttwu_queue_wakelist().
>
> Avoid a such race in cpus_share_cache() by always returning true when
> this_cpu == that_cpu.
>
> Fixes: 518cd6234178 ("sched: Only queue remote wakeups when crossing cache boundaries")
> Reported-by: Jing-Ting Wu <jing-ting.wu@xxxxxxxxxxxx>
> Signed-off-by: Vincent Donnefort <vincent.donnefort@xxxxxxx>
> Reviewed-by: Valentin Schneider <valentin.schneider@xxxxxxx>
>
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index f2611b9cf503..f5ca15cdcff4 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -3726,6 +3726,9 @@ void wake_up_if_idle(int cpu)
>
> bool cpus_share_cache(int this_cpu, int that_cpu)
> {
> + if (this_cpu == that_cpu)
> + return true;
> +
> return per_cpu(sd_llc_id, this_cpu) == per_cpu(sd_llc_id, that_cpu);
> }
>
> --
> 2.25.1
>
Oh, this is the exactly same fix as I posted before [1]. It is a little bit sad
that my post did not get reviewed. Anyway, good to see the issue fixed. Thanks.
[1] https://lore.kernel.org/all/20211029005618.773579-1-shinichiro.kawasaki@xxxxxxx/
--
Best Regards,
Shin'ichiro Kawasaki