Re: [PATCH v7] sched: Consolidate cpufreq updates

From: Rafael J. Wysocki
Date: Thu Sep 12 2024 - 07:34:10 EST


On Wed, Sep 11, 2024 at 10:34 PM Christian Loehle
<christian.loehle@xxxxxxx> wrote:
>
> On 7/28/24 19:45, Qais Yousef wrote:
> > Improve the interaction with cpufreq governors by making the
> > cpufreq_update_util() calls more intentional.
> >
> > At the moment we send them when load is updated for CFS, bandwidth for
> > DL and at enqueue/dequeue for RT. But this can lead to too many updates
> > sent in a short period of time and potentially be ignored at a critical
> > moment due to the rate_limit_us in schedutil.
> >
> > For example, simultaneous task enqueue on the CPU where 2nd task is
> > bigger and requires higher freq. The trigger to cpufreq_update_util() by
> > the first task will lead to dropping the 2nd request until tick. Or
> > another CPU in the same policy triggers a freq update shortly after.
> >
> > Updates at enqueue for RT are not strictly required. Though they do help
> > to reduce the delay for switching the frequency and the potential
> > observation of lower frequency during this delay. But current logic
> > doesn't intentionally (at least to my understanding) try to speed up the
> > request.
> >
> > To help reduce the amount of cpufreq updates and make them more
> > purposeful, consolidate them into these locations:
> >
> > 1. context_switch()
> > 2. task_tick_fair()
> > 3. sched_balance_update_blocked_averages()
> > 4. on sched_setscheduler() syscall that changes policy or uclamp values
> > 5. on check_preempt_wakeup_fair() if wakeup preemption failed
> > 6. on __add_running_bw() to guarantee DL bandwidth requirements.
> >
>
> Actually now reading that code again reminded me, there is another
> iowait boost change for intel_pstate.
> intel_pstate has either intel_pstate_update_util() or
> intel_pstate_update_util_hwp().
> Both have
>         if (smp_processor_id() != cpu->cpu)
>                 return;
> Now, since we move that update from enqueue to context_switch(), that
> condition will always be false.
> I don't think that was deliberate but rather to simplify intel_pstate
> synchronization, although !mcq device IO won't be boosted which you
> could argue is good.
> Just wanted to mention that; it doesn't have to be bad, but it is surely a
> behavior change.

This particular change shouldn't be problematic.
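
For reference, the check under discussion can be modeled with a minimal userspace sketch (not kernel code). All model_* names are hypothetical, and running_cpu merely stands in for smp_processor_id(). The sketch assumes that an enqueue-time update can be issued from a CPU other than the one being updated (e.g. a remote wakeup), while an update issued from context_switch() always runs on the CPU being updated.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the driver's per-CPU data; only the field
 * used by the quoted check is modeled. */
struct model_cpudata {
	int cpu;	/* CPU this data belongs to */
};

/*
 * Models the early return quoted above:
 *
 *	if (smp_processor_id() != cpu->cpu)
 *		return;
 *
 * Returns true when the update would be processed, false when skipped.
 */
static bool model_update_processed(const struct model_cpudata *cpu,
				   int running_cpu)
{
	if (running_cpu != cpu->cpu)
		return false;	/* update for a remote CPU: skipped */
	return true;
}

/* Assumed enqueue-time path: waker_cpu issues the update for target_cpu. */
static bool model_enqueue_update(int waker_cpu, int target_cpu)
{
	struct model_cpudata data = { .cpu = target_cpu };

	return model_update_processed(&data, waker_cpu);
}

/* Assumed context_switch()-time path: the update is always local. */
static bool model_context_switch_update(int switching_cpu)
{
	struct model_cpudata data = { .cpu = switching_cpu };

	return model_update_processed(&data, switching_cpu);
}

/* Sanity checks for the three cases discussed in the thread. */
static void model_selfcheck(void)
{
	assert(!model_enqueue_update(0, 1));	/* remote enqueue: skipped */
	assert(model_enqueue_update(1, 1));	/* local enqueue: processed */
	assert(model_context_switch_update(1));	/* always processed */
}
```

In this model the "!=" condition can only be true on the enqueue path, which matches the observation that it is always false once the update is issued from context_switch().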