Re: [PATCH v4 1/3] x86: Handle idle=nomwait cmdline properly for x86_idle
From: Zhang Rui
Date: Sun Jun 05 2022 - 08:33:06 EST
On Thu, 2022-06-02 at 21:11 +0530, Wyes Karny wrote:
> Hi Rui,
>
> On 5/25/2022 1:36 PM, Zhang Rui wrote:
> > On Mon, 2022-05-23 at 22:25 +0530, Wyes Karny wrote:
> > > When kernel is booted with idle=nomwait do not use MWAIT as the
> > > default idle state.
> > >
> > > If the user boots the kernel with idle=nomwait, it is a clear
> > > direction to not use mwait as the default idle state.
> > > However, the current code does not take this into consideration
> > > while selecting the default idle state on x86.
> > >
> > > > This patch fixes it by checking for the idle=nomwait boot option in
> > > > prefer_mwait_c1_over_halt().
> > >
> > > Also update the documentation around idle=nomwait appropriately.
> > >
> > > Signed-off-by: Wyes Karny <wyes.karny@xxxxxxx>
> > > ---
> > > Changes in v4:
> > > - Update documentation around idle=nomwait
> > > - Rename patch subject
> > >
> > > Documentation/admin-guide/pm/cpuidle.rst | 15 +++++++++------
> > > arch/x86/kernel/process.c | 6 +++++-
> > > 2 files changed, 14 insertions(+), 7 deletions(-)
> > >
> > > > diff --git a/Documentation/admin-guide/pm/cpuidle.rst b/Documentation/admin-guide/pm/cpuidle.rst
> > > index aec2cd2aaea7..19754beb5a4e 100644
> > > --- a/Documentation/admin-guide/pm/cpuidle.rst
> > > +++ b/Documentation/admin-guide/pm/cpuidle.rst
> > > @@ -612,8 +612,8 @@ the ``menu`` governor to be used on the
> > > systems
> > > that use the ``ladder`` governor
> > > by default this way, for example.
> > >
> > > The other kernel command line parameters controlling CPU idle
> > > time
> > > management
> > > -described below are only relevant for the *x86* architecture and
> > > some of
> > > -them affect Intel processors only.
> > > +described below are only relevant for the *x86* architecture and
> > > references
> > > +to ``intel_idle`` affect Intel processors only.
> > >
> > > The *x86* architecture support code recognizes three kernel
> > > command
> > > line
> > > options related to CPU idle time management: ``idle=poll``,
> > > ``idle=halt``,
> > > @@ -635,10 +635,13 @@ idle, so it very well may hurt single-
> > > thread
> > > computations performance as well as
> > > energy-efficiency. Thus using it for performance reasons may
> > > not be
> > > a good idea
> > > at all.]
> > >
> > > > -The ``idle=nomwait`` option disables the ``intel_idle`` driver and causes
> > > > -``acpi_idle`` to be used (as long as all of the information needed by it is
> > > > -there in the system's ACPI tables), but it is not allowed to use the
> > > > -``MWAIT`` instruction of the CPUs to ask the hardware to enter idle states.
> > > > +The ``idle=nomwait`` option prevents the use of ``MWAIT`` instruction of
> > > > +the CPU to enter idle states. When this option is used, the ``acpi_idle``
> > > > +driver will use the ``HLT`` instruction instead of ``MWAIT``. On systems
> > > > +running Intel processors, this option disables the ``intel_idle`` driver
> > > > +and forces the use of the ``acpi_idle`` driver instead. Note that in either
> > > > +case, ``acpi_idle`` driver will function only if all the information needed
> > > > +by it is in the system's ACPI tables.
> > > >
> > > > In addition to the architecture-level kernel command line options affecting CPU
> > > > idle time management, there are parameters affecting individual ``CPUIdle``
> > > > diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
> > > index b370767f5b19..4e0178b066c5 100644
> > > --- a/arch/x86/kernel/process.c
> > > +++ b/arch/x86/kernel/process.c
> > > @@ -824,6 +824,10 @@ static void amd_e400_idle(void)
> > > */
> > > > static int prefer_mwait_c1_over_halt(const struct cpuinfo_x86 *c)
> > > {
> > > + /* User has disallowed the use of MWAIT. Fallback to HALT */
> > > + if (boot_option_idle_override == IDLE_NOMWAIT)
> > > + return 0;
> > > +
> > > if (c->x86_vendor != X86_VENDOR_INTEL)
> > > return 0;
> > >
> > > @@ -932,7 +936,7 @@ static int __init idle_setup(char *str)
> > > } else if (!strcmp(str, "nomwait")) {
> > > /*
> > > * If the boot option of "idle=nomwait" is added,
> > > - * it means that mwait will be disabled for CPU C2/C3
> > > > + * it means that mwait will be disabled for CPU C1/C2/C3
> > > * states. In such case it won't touch the variable
> > > * of boot_option_idle_override.
> >
> > the code didn't change boot_option_idle_override when it was
> > introduced, but this has changed since commit d18960494f65 ("ACPI,
> > intel_idle: Cleanup idle= internal variables")
>
> Could you please clarify a bit more why the commit you mentioned is
> related to this patch?
>
The comment "In such case it won't touch the variable of
boot_option_idle_override." has been wrong for some time, and that is
not related to this patch. But since this patch is meant to "Also update
the documentation around idle=nomwait appropriately", my suggestion is
to fix the comment as well, by deleting its last sentence.
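
To make the point concrete, here is a minimal userspace sketch, not the
kernel source. It assumes the "nomwait" branch of idle_setup() now assigns
IDLE_NOMWAIT to boot_option_idle_override, which is what the new check in
prefer_mwait_c1_over_halt() relies on. model_idle_setup() and
model_prefer_mwait_c1_over_halt() are hypothetical stand-ins written for
illustration, the vendor check is reduced to a flag, and the comment in the
"nomwait" branch shows roughly how it could read with the last sentence
deleted.

```c
/* Userspace model for illustration only; this is not kernel code. */
#include <assert.h>
#include <string.h>

/* Same names as the kernel's enum idle_boot_override, redefined locally. */
enum idle_boot_override {
	IDLE_NO_OVERRIDE = 0,
	IDLE_HALT,
	IDLE_NOMWAIT,
	IDLE_POLL,
};

static enum idle_boot_override boot_option_idle_override = IDLE_NO_OVERRIDE;

/*
 * Hypothetical stand-in for idle_setup().
 * Returns 0 for a recognized option and -1 otherwise.
 */
static int model_idle_setup(const char *str)
{
	assert(str != NULL);

	if (!strcmp(str, "poll")) {
		boot_option_idle_override = IDLE_POLL;
	} else if (!strcmp(str, "halt")) {
		boot_option_idle_override = IDLE_HALT;
	} else if (!strcmp(str, "nomwait")) {
		/*
		 * If the boot option of "idle=nomwait" is added,
		 * it means that mwait will be disabled for CPU C1/C2/C3
		 * states.
		 */
		boot_option_idle_override = IDLE_NOMWAIT;
	} else {
		return -1;
	}
	return 0;
}

/*
 * Hypothetical stand-in for prefer_mwait_c1_over_halt(). The real
 * function performs further CPU feature checks, which are omitted here.
 */
static int model_prefer_mwait_c1_over_halt(int vendor_is_intel)
{
	/* User has disallowed the use of MWAIT. */
	if (boot_option_idle_override == IDLE_NOMWAIT)
		return 0;

	if (!vendor_is_intel)
		return 0;

	return 1;
}
```

In this model, calling model_idle_setup("nomwait") does touch
boot_option_idle_override, so the sentence claiming otherwise cannot stay
in the comment.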
thanks,
rui