Re: [RFC PATCH 2/2] sched/fair: skip the cache hot CPU in select_idle_cpu()

From: Mathieu Desnoyers
Date: Tue Sep 12 2023 - 11:17:17 EST


On 9/12/23 10:14, Chen Yu wrote:
On 2023-09-12 at 10:06:27 -0400, Mathieu Desnoyers wrote:
[...]

One more tweak: given that more than one task can update the cache_hot_timeout forward
one after another, and given that some tasks have larger burst_sleep_avg values than
others, I suspect we want to keep the forward movement monotonic with something like:

if (sched_feat(SIS_CACHE) && task_sleep && !rq->nr_running && p->se.burst_sleep_avg &&
rq->cache_hot_timeout < now + p->se.burst_sleep_avg)
rq->cache_hot_timeout = now + p->se.burst_sleep_avg;


Yeah, Aaron has mentioned this too:
https://lore.kernel.org/lkml/ZP7SYu+gxlc%2FYjHu@chenyu5-mobl2/
May I know the benefit of keeping forward movement monotonic?
I thought that, should we only honor the latest dequeued task's burst_sleep_avg?
Because we don't know whether the old deuqued task's cache has been scribbled by the latest
dequeued task or not, does it still make sense to wake up the old dequeued task on its
previous CPU?

Here is my reasoning:

If a second task is scheduled after the first dequeued task (a
task with large burst_sleep_avg) is dequeued, that second task (with
small burst_sleep_avg) would need to entirely scribble the other task's
cache lines within the time given by sysctl_sched_migration_cost, which
I suspect is typically not very large. So I doubt that the second task
can entirely kick out the first task cache lines within that time frame,
and therefore that second task should not move the cache_hot_timeout
value backwards.

But perhaps I'm missing something ?

Thanks,

Mathieu


--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com