Re: [PATCH v3] sched: Consolidate cpufreq updates
From: Dietmar Eggemann
Date: Wed May 15 2024 - 06:00:57 EST
On 14/05/2024 00:09, Qais Yousef wrote:
> On 05/13/24 14:43, Dietmar Eggemann wrote:
>> On 12/05/2024 21:00, Qais Yousef wrote:
>>
>> [...]
>>
>>> @@ -4682,7 +4659,7 @@ static void attach_entity_load_avg(struct cfs_rq *cfs_rq, struct sched_entity *s
>>>
>>> add_tg_cfs_propagate(cfs_rq, se->avg.load_sum);
>>>
>>> - cfs_rq_util_change(cfs_rq, 0);
>>> + cpufreq_update_util(rq_of(cfs_rq), 0);
>>
>> Isn't this slightly different now?
>>
>> before:
>>
>> if (&rq->cfs == cfs_rq) {
>> cpufreq_update_util(rq, ....)
>> }
>>
>> now:
>>
>> cpufreq_update_util(rq_of(cfs_rq), ...)
>>
>> You should get way more updates from attach/detach now.
>
> Yes, well spotted!
>
> Looking at the path more closely, I can see this is called from
> enqueue_task_fair() path when a task migrates to new CPU. And when
> attach_task_cfs_rq() which is called when we switch_to_fair(), which I already
> cover in the policy change for the RUNNING task, or when
> task_change_group_fair() which is what I originally understood Vincent was
> referring to. I moved the update to this function after the detach/attach
> operations with better guards to avoid unnecessary update.

Yeah, all !root cfs_rq attach or detach wouldn't change anything since
the util_avg wouldn't have propagated to the root cfs_rq yet. So
sugov_get_util() wouldn't see a difference.
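
To make the before/after difference concrete, here is a minimal userspace sketch in C. The structs are heavily simplified stand-ins for struct rq and struct cfs_rq, cpufreq_update_util() is a stub that only counts calls, and nr_updates, attach_old(), attach_new() and count_updates() are hypothetical names made up for the illustration. It is not the kernel code, only a model of the guard.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins (assumption: only the fields needed here). */
struct rq;

struct cfs_rq {
	struct rq *rq;            /* owning runqueue, what rq_of() returns */
};

struct rq {
	struct cfs_rq cfs;        /* root cfs_rq embedded in the runqueue */
	unsigned int nr_updates;  /* hypothetical counter, not a kernel field */
};

static struct rq *rq_of(struct cfs_rq *cfs_rq)
{
	assert(cfs_rq != NULL);
	return cfs_rq->rq;
}

/* Stub: the real function calls into the governor; this one only counts. */
static void cpufreq_update_util(struct rq *rq, unsigned int flags)
{
	(void)flags;
	rq->nr_updates++;
}

/* "before": only the root cfs_rq triggers an update. */
static void attach_old(struct cfs_rq *cfs_rq)
{
	struct rq *rq = rq_of(cfs_rq);

	if (&rq->cfs == cfs_rq)
		cpufreq_update_util(rq, 0);
}

/* "now" (as in the quoted hunk): every cfs_rq triggers an update. */
static void attach_new(struct cfs_rq *cfs_rq)
{
	cpufreq_update_util(rq_of(cfs_rq), 0);
}

/* One attach on the root cfs_rq plus nr_group attaches on a group cfs_rq. */
static unsigned int count_updates(void (*attach)(struct cfs_rq *),
				  unsigned int nr_group)
{
	struct rq rq = { .nr_updates = 0 };
	struct cfs_rq group = { .rq = &rq };
	unsigned int i;

	rq.cfs.rq = &rq;
	attach(&rq.cfs);
	for (i = 0; i < nr_group; i++)
		attach(&group);

	return rq.nr_updates;
}
```

With one root attach and three group attaches, count_updates(attach_old, 3) yields a single update while count_updates(attach_new, 3) yields four, which is the extra traffic discussed above.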

Yes, enqueue_entity() sets DO_ATTACH unconditionally.
And dequeue_entity() sets DO_DETACH for a migrating (!wakeup migrating)
task.

For a wakeup migrating task we have remove_entity_load_avg() but this
can't remove util_avg from the cfs_rq. This is deferred to
update_cfs_rq_load_avg() in update_load_avg() or __update_blocked_fair().
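
The deferral for a wakeup migrating task can be sketched in the same style. The model below assumes a single pending accumulator standing in for the cfs_rq "removed" bookkeeping; the real code also tracks load and runnable contributions and takes a lock. All model_* names and struct cfs_rq_model are hypothetical, chosen only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

struct sched_avg_model {
	unsigned long util_avg;
};

/* Simplified cfs_rq: current util_avg plus a pending "removed" amount. */
struct cfs_rq_model {
	struct sched_avg_model avg;
	unsigned long removed_util_avg;  /* queued, not yet subtracted */
	unsigned int removed_nr;         /* number of queued removals */
};

/* Models the remove step: only records the amount, avg is left untouched. */
static void model_remove_entity_load_avg(struct cfs_rq_model *cfs_rq,
					 const struct sched_avg_model *se_avg)
{
	assert(cfs_rq != NULL && se_avg != NULL);
	cfs_rq->removed_util_avg += se_avg->util_avg;
	cfs_rq->removed_nr++;
}

/*
 * Models the later update step: applies the queued removals.
 * Returns 1 if util_avg was changed, 0 otherwise.
 */
static int model_update_cfs_rq_load_avg(struct cfs_rq_model *cfs_rq)
{
	unsigned long removed;

	if (!cfs_rq->removed_nr)
		return 0;

	removed = cfs_rq->removed_util_avg;
	cfs_rq->removed_util_avg = 0;
	cfs_rq->removed_nr = 0;

	/* Clamp at zero rather than letting the unsigned value wrap. */
	if (cfs_rq->avg.util_avg > removed)
		cfs_rq->avg.util_avg -= removed;
	else
		cfs_rq->avg.util_avg = 0;

	return 1;
}
```

In this model util_avg on the cfs_rq is unchanged right after the remove call and only drops on the next update call, so an update request issued in between would still see the old utilization.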

And switched_{to,from}_fair() (check_class_changed()) and
task_change_group_fair() are the other 2 users of
{attach,detach}_entity_load_avg(). (plus online_fair_sched_group() for
attach).

> I understood this will lead to big change and better apply immediately vs
> wait for the next context switch. But I'll ask the question again, can we drop
> this and defer to context switch?

Hard to say really, probably we can. All benchmarks with score numbers
will create plenty of context switches so you won't see a diff. And for
lighter testcases you would have to study the differences in trace
files and reason about the implications of potentially kicking CPUfreq a
little bit later.

[...]