Re: [PATCH] sched/core: Test online status in available_idle_cpu()

From: Sven Schnelle
Date: Wed May 08 2024 - 03:31:31 EST


Valentin Schneider <vschneid@xxxxxxxxxx> writes:

> On 29/04/24 07:54, Sven Schnelle wrote:
>> The current implementation of available_idle_cpu() doesn't test
>> whether a possible cpu is offline. On s390 this dereferences a
>> NULL pointer in arch_vcpu_is_preempted() because lowcore is not
>> allocated for offline cpus. On x86, tracing also shows calls to
>> available_idle_cpu() after a cpu is disabled, but it looks like
>> this isn't causing any (obvious) issue. Nevertheless, add a check
>> and return early if the cpu isn't online.
>>
>> Signed-off-by: Sven Schnelle <svens@xxxxxxxxxxxxx>
>
>
> So most of the uses of that function are in wakeup task placement.
> o find_idlest_cpu() works on the sched_domain spans, so shouldn't deal with
> offline CPUs.
> o select_idle_sibling() may issue an available_idle_cpu(prev) with an
> offline previous, which would trigger your issue.
>
> Currently, even if select_idle_sibling() picks an offline CPU, this will
> get corrected by select_fallback_rq() at the end of
> select_task_rq(). However, it would make sense to realize @prev isn't a
> suitable pick before making it to the fallback machinery, in which case
> your patch makes sense beyond just fixing s390.
>
> Reviewed-by: Valentin Schneider <vschneid@xxxxxxxxxx>

Thanks for the review! Ingo/Peter, gentle ping, are you planning to take
this patch?
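
For anyone following the thread without the patch at hand, here is a
minimal userspace sketch of the idea from the commit message: test
the online status first, so that the preemption check never touches
per-cpu data that doesn't exist for an offline cpu. This is not the
kernel code and not the patch itself. Every model_* name, the cpu
table and the placement of the check are assumptions made up for
illustration; only the order "online, then idle, then preempted"
mirrors what is discussed above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_NR_CPUS 4	/* hypothetical number of possible cpus */

/* Stand-in for the per-cpu area that s390 only allocates for online cpus. */
struct model_lowcore {
	bool preempted;
};

struct model_cpu {
	bool online;
	bool idle;
	struct model_lowcore *lowcore;	/* NULL while the cpu is offline */
};

static struct model_lowcore model_lc[MODEL_NR_CPUS] = {
	{ .preempted = false },
	{ .preempted = true },
	{ .preempted = false },	/* unused: cpu 2 is offline */
	{ .preempted = false },
};

static struct model_cpu model_cpus[MODEL_NR_CPUS] = {
	{ .online = true,  .idle = true,  .lowcore = &model_lc[0] },
	{ .online = true,  .idle = true,  .lowcore = &model_lc[1] },
	{ .online = false, .idle = true,  .lowcore = NULL },
	{ .online = true,  .idle = false, .lowcore = &model_lc[3] },
};

/* Models the arch hook: dereferences the per-cpu area unconditionally. */
static bool model_vcpu_is_preempted(int cpu)
{
	return model_cpus[cpu].lowcore->preempted;
}

/*
 * Models available_idle_cpu() with the early return for offline cpus.
 * Without the first test, cpu 2 would reach model_vcpu_is_preempted()
 * and dereference a NULL pointer.
 */
static int model_available_idle_cpu(int cpu)
{
	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);

	if (!model_cpus[cpu].online)
		return 0;

	if (!model_cpus[cpu].idle)
		return 0;

	if (model_vcpu_is_preempted(cpu))
		return 0;

	return 1;
}
```

In this model only cpu 0 is reported as available: cpu 1 is idle but
preempted, cpu 2 is offline and gets rejected before its missing
lowcore is looked at, and cpu 3 is busy.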