Re: Default governor regardless of cpuidle driver

From: Joao Martins
Date: Fri Aug 30 2019 - 07:09:25 EST


On 8/29/19 10:51 PM, Daniel Lezcano wrote:
> On 29/08/2019 23:12, Joao Martins wrote:
>
> [ ... ]
>
>>>> Say you wanted to have a kvm specific config, you would still see the same
>>>> problem if you happen to compile intel_idle together with haltpoll
>>>> driver+governor.
>>>
>>> Can a guest work with an intel_idle driver?
>>>
>> Yes.
>>
>> If you use Qemu you would add '-overcommit cpu-pm=on' to try it out. ofc,
>> assuming you're on a relatively recent Qemu (v3.0+) and a fairly recent kernel
>> version as host (v4.17+).
>
> Ok, thanks for the clarification.
>
>>>> Creating two separate configs here, with and without haltpoll
>>>> for VMs doesn't sound effective for distros.
>>>
>>> Agree
>>>
>>>> Perhaps decreasing the rating of
>>>> haltpoll governor, but while a short term fix it wouldn't give much sensible
>>>> defaults without the one-off runtime switch.
>
> The rating has little meaning because each governor fits a specific
> situation (server, desktop, etc...) and it would probably make sense to
> remove it and add a default governor in the config file like the cpufreq.
>
ICYM, I had attached a patch in the first message of this thread [0], right below
the scissors mark. It's not based on a config file, but (IIUC) it does the same
thing you're describing at runtime, allowing a driver to state a 'preferred'
governor to switch to at idle registration -- let me know if you think that
looks like a sensible approach. Note that the intent of that patch is to leave
all defaults as they were before the haltpoll governor was introduced; once the
user modloads/uses cpuidle-haltpoll, this governor then gets switched on.

[0] https://lore.kernel.org/kvm/c8cf8dcc-76a3-3e15-f514-2cb9df1bbbdc@xxxxxxxxxx/
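
To illustrate the idea (this is not the patch itself), here is a rough
userspace model of "rating picks the default, a driver may name a preferred
governor at registration". All struct and function names are made up for
the example and are not the kernel's cpuidle API; the ratings used with it
are illustrative too.

```c
/* Userspace model only: every name here is hypothetical. */
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct model_governor {
	const char *name;
	int rating;
};

struct model_driver {
	const char *name;
	const char *preferred_governor;	/* NULL means "no preference" */
};

#define MODEL_MAX_GOVERNORS 8

static const struct model_governor *model_governors[MODEL_MAX_GOVERNORS];
static size_t model_nr_governors;
static const struct model_governor *model_curr_governor;

static const struct model_governor *model_find_governor(const char *name)
{
	for (size_t i = 0; i < model_nr_governors; i++)
		if (strcmp(model_governors[i]->name, name) == 0)
			return model_governors[i];
	return NULL;
}

/* Governor registration: the highest rating becomes the default. */
static int model_register_governor(const struct model_governor *gov)
{
	assert(gov && gov->name);

	if (model_nr_governors == MODEL_MAX_GOVERNORS ||
	    model_find_governor(gov->name))
		return -1;

	model_governors[model_nr_governors++] = gov;
	if (!model_curr_governor || gov->rating > model_curr_governor->rating)
		model_curr_governor = gov;
	return 0;
}

/* Driver registration: switch only if the driver names a known governor. */
static void model_register_driver(const struct model_driver *drv)
{
	const struct model_governor *gov;

	assert(drv);
	if (!drv->preferred_governor)
		return;

	gov = model_find_governor(drv->preferred_governor);
	if (gov)
		model_curr_governor = gov;
}
```

In this model, a driver without a preference leaves the rating-based default
untouched, while a driver that names "haltpoll" switches to it even when that
governor has a lower rating than the current one.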

I would think a config-based preference on a governor would be good *if* one
could actually switch idle governors at runtime like you can with cpufreq -- in
case userspace wants something other than the default. Right now you can't
do that unless you toggle 'cpuidle_sysfs_switch', or pick one at boot with
'cpuidle.governor='.
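
For reference, a minimal sketch of what the runtime switch looks like from
userspace when the kernel was booted with 'cpuidle_sysfs_switch'. The helper
name set_idle_governor() is made up for the example; the sysfs path is where I
would expect the writable 'current_governor' attribute to be, so treat it as
an assumption and check it on your system.

```c
/* Sketch only: set_idle_governor() is a hypothetical helper. The path is a
 * parameter so the function can also be tried against a scratch file. */
#include <stdio.h>

#define CPUIDLE_CURRENT_GOVERNOR \
	"/sys/devices/system/cpu/cpuidle/current_governor"

/* Write a governor name to the given file; 0 on success, -1 on error. */
static int set_idle_governor(const char *path, const char *name)
{
	FILE *f = fopen(path, "w");
	int ret;

	if (!f)
		return -1;

	ret = (fputs(name, f) == EOF) ? -1 : 0;
	if (fclose(f) != 0)
		ret = -1;
	return ret;
}
```

Calling set_idle_governor(CPUIDLE_CURRENT_GOVERNOR, "menu") as root would be
the runtime equivalent of booting with 'cpuidle.governor=menu'; without
'cpuidle_sysfs_switch' the open is expected to fail.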

> May be I missed the point from some previous discussion but IMHO the
> problem you are facing is coming from the design: there is no need to
> create a halt governor but move the code inside the cpuidle-halt driver
> instead and ignore the state asked by the governor and return the state
> the driver entered.
>
Marcelo's original patch series (the first 3 revisions, to be exact) actually had
everything in the idle driver, but in later revisions (v4+) Rafael asked him
to split the logic into a governor and unify it with the poll state [1].

[1]
https://lore.kernel.org/kvm/CAJZ5v0gPbSXB3r71XaT-4Q7LsiFO_UVymBwOmU8J1W5+COk_1g@xxxxxxxxxxxxxx/

Joao