Re: [PATCH] sched/core: Test online status in available_idle_cpu()
From: Sven Schnelle
Date: Thu May 16 2024 - 12:38:33 EST
Sven Schnelle <svens@xxxxxxxxxxxxx> writes:
> Valentin Schneider <vschneid@xxxxxxxxxx> writes:
>
>> On 29/04/24 07:54, Sven Schnelle wrote:
>>> The current implementation of available_idle_cpu() doesn't test
>>> whether a possible cpu is offline. On s390 this dereferences a
>>> NULL pointer in arch_vcpu_is_preempted() because lowcore is not
>>> allocated for offline cpus. On x86, tracing also shows calls to
>>> available_idle_cpu() after a cpu is disabled, but it looks like
>>> this isn't causing any (obvious) issue. Nevertheless, add a check
>>> and return early if the cpu isn't online.
>>>
>>> Signed-off-by: Sven Schnelle <svens@xxxxxxxxxxxxx>
>>
>>
>> So most of the uses of that function are in wakeup task placement.
>> o find_idlest_cpu() works on the sched_domain spans, so shouldn't deal
>>   with offline CPUs.
>> o select_idle_sibling() may issue an available_idle_cpu(prev) with an
>>   offline previous, which would trigger your issue.
>>
>> Currently, even if select_idle_sibling() picks an offline CPU, this will
>> get corrected by select_fallback_rq() at the end of select_task_rq().
>> However, it would make sense to realize @prev isn't a suitable pick
>> before making it to the fallback machinery, in which case your patch
>> makes sense beyond just fixing s390.
>>
>> Reviewed-by: Valentin Schneider <vschneid@xxxxxxxxxx>
>
> Thanks for the review! Ingo/Peter, gentle ping, are you planning to take
> this patch?
Ping?
Thanks,
Sven
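
For anyone skimming the thread, here is a rough userspace sketch of the
ordering the patch description asks for: test the online status before
touching any per-CPU state that might not be allocated. This is not the
kernel code. struct cpu_model, the lowcore_preempted pointer and all of
the model_*() helpers are hypothetical stand-ins for cpu_online(),
idle_cpu() and arch_vcpu_is_preempted(); the exact placement of the
check in the real patch may differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_CPUS 4

/* Hypothetical per-CPU state; the kernel keeps this in rq and arch data. */
struct cpu_model {
	bool online;
	bool idle;
	/* NULL while the CPU is offline, mimicking an unallocated lowcore. */
	const bool *lowcore_preempted;
};

struct cpu_model cpus[NR_CPUS];

/* Stand-in for arch_vcpu_is_preempted(): dereferences per-CPU data. */
bool model_vcpu_is_preempted(int cpu)
{
	return *cpus[cpu].lowcore_preempted;
}

/* Stand-in for available_idle_cpu() with the online test done first. */
bool model_available_idle_cpu(int cpu)
{
	if (!cpus[cpu].online)
		return false;	/* never reaches the NULL pointer below */
	if (!cpus[cpu].idle)
		return false;
	if (model_vcpu_is_preempted(cpu))
		return false;
	return true;
}

/* Small self-check: CPU 0 is online and idle, CPU 1 is offline. */
void model_selfcheck(void)
{
	static const bool not_preempted = false;

	cpus[0] = (struct cpu_model){ true, true, &not_preempted };
	cpus[1] = (struct cpu_model){ false, true, NULL };

	assert(model_available_idle_cpu(0));
	assert(!model_available_idle_cpu(1));
}
```

Calling model_selfcheck() from a main() exercises both cases. Without
the first check in model_available_idle_cpu(), the CPU 1 case would
dereference NULL, which is the s390 failure described above in
miniature.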