Re: [PATCH 8/9] sched/fair: select idle cpu from idle cpumask for task wakeup

From: Aubrey Li
Date: Fri Sep 17 2021 - 05:12:50 EST


On 9/17/21 12:15 PM, Barry Song wrote:
>> @@ -4965,6 +4965,7 @@ void scheduler_tick(void)
>>
>> #ifdef CONFIG_SMP
>> rq->idle_balance = idle_cpu(cpu);
>> + update_idle_cpumask(cpu, rq->idle_balance);
>> trigger_load_balance(rq);
>> #endif
>> }
>
> This might be a stupid question, but something bothering Yicong and me is:
> why don't we call update_idle_cpumask() when the idle task exits and
> switches to a normal task?

I implemented it that way and we discussed it before (RFC v1?). Updating a
cpumask at every idle entry/exit is more expensive than we expected, even
though it's per LLC domain; Vincent saw a significant regression, IIRC. You
can also take a look at nohz.idle_cpus_mask as a reference.

> For example, a cpu may have exited idle before the tick comes, but we are
> only able to update the mask in the tick. This makes idle_cpus_span
> inaccurate, so we will scan cpus which aren't actually idle in
> select_idle_sibling. Is it because of the huge update overhead?
>

Yes, we'll have false positives, but we won't miss any true positives. So
things won't be worse than the current way.
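
To make the false-positive point concrete, here is a minimal userspace
sketch, not the kernel code. It assumes the bit is set eagerly on idle
entry (the quoted hunk only shows the tick-side update, so that part is an
assumption), left alone on idle exit, and resynced at the tick. All names
here (idle_mask, cpu_is_idle, enter_idle, exit_idle, tick, select_idle_cpu)
are hypothetical stand-ins for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_CPUS 8			/* hypothetical LLC size */

static bool cpu_is_idle[NR_CPUS];	/* ground truth per CPU */
static uint32_t idle_mask;		/* shared hint, may be stale */

/* Idle entry: set the bit right away so an idle CPU is never missed. */
static void enter_idle(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	cpu_is_idle[cpu] = true;
	idle_mask |= UINT32_C(1) << cpu;
}

/* Idle exit: deliberately no write to the shared mask (the cheap path). */
static void exit_idle(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	cpu_is_idle[cpu] = false;
}

/* Tick: resync this CPU's bit with its real state, dropping a stale bit. */
static void tick(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	if (cpu_is_idle[cpu])
		idle_mask |= UINT32_C(1) << cpu;
	else
		idle_mask &= ~(UINT32_C(1) << cpu);
}

/*
 * Wakeup: only look at CPUs in the hint mask, but re-check each one.
 * Returns an idle CPU or -1; *scanned counts the candidates visited.
 */
static int select_idle_cpu(int *scanned)
{
	*scanned = 0;
	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		if (!(idle_mask & (UINT32_C(1) << cpu)))
			continue;
		(*scanned)++;
		if (cpu_is_idle[cpu])
			return cpu;
	}
	return -1;
}
```

In this model, a CPU that left idle before its tick costs one wasted
candidate check (a false positive), while a CPU that is really idle is
always in the mask. The scan never visits more CPUs than a full walk of the
domain would.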

Thanks,
-Aubrey