Re: [PATCH v4 1/3] x86: Handle idle=nomwait cmdline properly for x86_idle

From: Wyes Karny
Date: Mon Jun 06 2022 - 05:13:44 EST


Hello Rui,

On 6/5/2022 6:02 PM, Zhang Rui wrote:
> On Thu, 2022-06-02 at 21:11 +0530, Wyes Karny wrote:
>>>
>>
>> Hi Rui,
>>
>> On 5/25/2022 1:36 PM, Zhang Rui wrote:
>>> On Mon, 2022-05-23 at 22:25 +0530, Wyes Karny wrote:
>>>> When the kernel is booted with idle=nomwait, do not use MWAIT as the
>>>> default idle state.
>>>>
>>>> If the user boots the kernel with idle=nomwait, it is a clear
>>>> direction to not use mwait as the default idle state.
>>>> However, the current code does not take this into consideration
>>>> while selecting the default idle state on x86.
>>>>
>>>> This patch fixes it by checking for the idle=nomwait boot option in
>>>> prefer_mwait_c1_over_halt().
>>>>
>>>> Also update the documentation around idle=nomwait appropriately.
>>>>
>>>> Signed-off-by: Wyes Karny <wyes.karny@xxxxxxx>
>>>> ---
>>>> Changes in v4:
>>>> - Update documentation around idle=nomwait
>>>> - Rename patch subject
>>>>
>>>> Documentation/admin-guide/pm/cpuidle.rst | 15 +++++++++------
>>>> arch/x86/kernel/process.c | 6 +++++-
>>>> 2 files changed, 14 insertions(+), 7 deletions(-)
>>>>
>>>> diff --git a/Documentation/admin-guide/pm/cpuidle.rst b/Documentation/admin-guide/pm/cpuidle.rst
>>>> index aec2cd2aaea7..19754beb5a4e 100644
>>>> --- a/Documentation/admin-guide/pm/cpuidle.rst
>>>> +++ b/Documentation/admin-guide/pm/cpuidle.rst
>>>> @@ -612,8 +612,8 @@ the ``menu`` governor to be used on the systems that use the ``ladder`` governor
>>>> by default this way, for example.
>>>>
>>>> The other kernel command line parameters controlling CPU idle time management
>>>> -described below are only relevant for the *x86* architecture and some of
>>>> -them affect Intel processors only.
>>>> +described below are only relevant for the *x86* architecture and references
>>>> +to ``intel_idle`` affect Intel processors only.
>>>>
>>>> The *x86* architecture support code recognizes three kernel command line
>>>> options related to CPU idle time management: ``idle=poll``, ``idle=halt``,
>>>> @@ -635,10 +635,13 @@ idle, so it very well may hurt single-thread computations performance as well as
>>>> energy-efficiency. Thus using it for performance reasons may not be a good idea
>>>> at all.]
>>>>
>>>> -The ``idle=nomwait`` option disables the ``intel_idle`` driver and causes
>>>> -``acpi_idle`` to be used (as long as all of the information needed by it is
>>>> -there in the system's ACPI tables), but it is not allowed to use the
>>>> -``MWAIT`` instruction of the CPUs to ask the hardware to enter idle states.
>>>> +The ``idle=nomwait`` option prevents the use of ``MWAIT`` instruction of
>>>> +the CPU to enter idle states. When this option is used, the ``acpi_idle``
>>>> +driver will use the ``HLT`` instruction instead of ``MWAIT``. On systems
>>>> +running Intel processors, this option disables the ``intel_idle`` driver
>>>> +and forces the use of the ``acpi_idle`` driver instead. Note that in either
>>>> +case, ``acpi_idle`` driver will function only if all the information needed
>>>> +by it is in the system's ACPI tables.
>>>>
>>>> In addition to the architecture-level kernel command line options affecting CPU
>>>> idle time management, there are parameters affecting individual ``CPUIdle``
>>>> diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
>>>> index b370767f5b19..4e0178b066c5 100644
>>>> --- a/arch/x86/kernel/process.c
>>>> +++ b/arch/x86/kernel/process.c
>>>> @@ -824,6 +824,10 @@ static void amd_e400_idle(void)
>>>> */
>>>> static int prefer_mwait_c1_over_halt(const struct cpuinfo_x86 *c)
>>>> {
>>>> + /* User has disallowed the use of MWAIT. Fallback to HALT */
>>>> + if (boot_option_idle_override == IDLE_NOMWAIT)
>>>> + return 0;
>>>> +
>>>> if (c->x86_vendor != X86_VENDOR_INTEL)
>>>> return 0;
>>>>
>>>> @@ -932,7 +936,7 @@ static int __init idle_setup(char *str)
>>>> } else if (!strcmp(str, "nomwait")) {
>>>> /*
>>>> * If the boot option of "idle=nomwait" is added,
>>>> - * it means that mwait will be disabled for CPU C2/C3
>>>> + * it means that mwait will be disabled for CPU C1/C2/C3
>>>> * states. In such case it won't touch the variable
>>>> * of boot_option_idle_override.
>>>
>>> The code didn't change boot_option_idle_override when this comment was
>>> introduced, but that has changed since commit d18960494f65 ("ACPI,
>>> intel_idle: Cleanup idle= internal variables").
>>
>> Could you please clarify a bit more why the commit you mentioned is
>> related to this patch?
>>
>
> The comment "In such case it won't touch the variable of
> boot_option_idle_override." has been wrong for some time; it is not
> related to this patch. But given that this patch "Also update the
> documentation around idle=nomwait appropriately", my suggestion is
> to fix it at the same time, by deleting the last sentence.

Sure, will do. Thanks!
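
For anyone following the thread, the behaviour being discussed can be
modelled roughly as below. This is only a user-space sketch, not the
kernel code: struct cpu_model, enum cpu_vendor, the has_usable_mwait
flag and the model_* function names are made up for illustration, and
the remaining checks in prefer_mwait_c1_over_halt() are collapsed into
that single flag. The enum only mirrors the names used in the patch;
its values are illustrative. It shows that "idle=nomwait" does set
boot_option_idle_override (which is why the last sentence of the
comment is stale), and that the new early return then prevents MWAIT
from being picked regardless of vendor.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Stand-ins mirroring the names in the patch; values are illustrative. */
enum idle_boot_override { IDLE_NO_OVERRIDE = 0, IDLE_HALT, IDLE_NOMWAIT, IDLE_POLL };

/* Hypothetical, simplified replacement for struct cpuinfo_x86. */
enum cpu_vendor { VENDOR_INTEL, VENDOR_OTHER };

struct cpu_model {
	enum cpu_vendor vendor;
	bool has_usable_mwait;	/* collapses the remaining kernel checks */
};

static enum idle_boot_override boot_option_idle_override = IDLE_NO_OVERRIDE;

/* Model of the idle= parser: "nomwait" does record an override. */
static int model_idle_setup(const char *str)
{
	if (!str)
		return -1;

	if (!strcmp(str, "poll"))
		boot_option_idle_override = IDLE_POLL;
	else if (!strcmp(str, "halt"))
		boot_option_idle_override = IDLE_HALT;
	else if (!strcmp(str, "nomwait"))
		boot_option_idle_override = IDLE_NOMWAIT;
	else
		return -1;

	return 0;
}

/* Model of the selection helper with the early return added by the patch. */
static int model_prefer_mwait_c1_over_halt(const struct cpu_model *c)
{
	assert(c != NULL);

	/* User has disallowed the use of MWAIT: fall back to HALT. */
	if (boot_option_idle_override == IDLE_NOMWAIT)
		return 0;

	if (c->vendor != VENDOR_INTEL)
		return 0;

	if (!c->has_usable_mwait)
		return 0;

	return 1;
}
```

With no override, an Intel CPU with usable MWAIT returns 1 from the
model; after model_idle_setup("nomwait") the same CPU returns 0.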

>
> thanks,
> rui
>