Re: [PATCH 2/2] intel_idle: Introduce 'states_off' module parameter

From: Rafael J. Wysocki
Date: Fri Jan 31 2020 - 06:23:38 EST


On Fri, Jan 31, 2020 at 12:07 PM David Laight <David.Laight@xxxxxxxxxx> wrote:
>
> From: Rafael J. Wysocki
> > Sent: 30 January 2020 14:47
> >
> > In certain system configurations it may not be desirable to use some
> > C-states assumed to be available by intel_idle and the driver needs
> > to be prevented from using them even before the cpuidle sysfs
> > interface becomes accessible to user space. Currently, the only way
> > to achieve that is by setting the 'max_cstate' module parameter to a
> > value lower than the index of the shallowest of the C-states in
> > question, but that may be overly intrusive, because it effectively
> > makes all of the idle states deeper than the 'max_cstate' one go
> > away (and the C-state to avoid may be in the middle of the range
> > normally regarded as available).
> >
> > To allow that limitation to be overcome, introduce a new module
> > parameter called 'states_off' to represent a list of idle states to
> > be disabled by default in the form of a bitmask and update the
> > documentation to cover it.
>
> The problem I see is that there are (at least) 3 different ways of
> referring to the C-States:

So the mask is not referring to the C-states in the first place.

> 1) The state names, C1, C1E, C3, C7 etc.
> I'm not sure these are visible outside intel_idle.c.

Yes, they are, in sysfs.

> 2) The maximum allowed latency in us.
> 3) The index into the cpu-dependant tables in intel_idle.c.
>
> Boot parameters that set 3 are completely hopeless for normal
> users. The C-state names might be - but they aren't documented.
>
> Unless you know exactly which cpu table is being used the
> only constraint a user can request is the latency.

So this mask refers to the idle states numbering in sysfs, as stated
in the documentation update. That covers state0 which is not a
C-state too.

> (I've had the misfortune to read intel_idle.c in the last week.
> Almost impenetrable TLA ridden uncommented code.)

I have some patches to improve that, will post them after this is settled.

> ...
> > + * The positions of the bits that are set in the two's complement representation
> > + * of this value are the indices of the idle states to be disabled by default
> > + * (as reflected by the names of the corresponding idle state directories in
> > + * sysfs, "state0", "state1" ... "state<i>" ..., where <i> is the index of the
> > + * given state).
>
> What has 'two's complement' got to do with anything?

Well, it is the representation in which bits are used. Kind of as
opposed to decimal or hex digits. But I can replace that phrase with
"bits that are set in this number" easily enough.

> ...
> > +The value of the ``states_off`` module parameter (0 by default) represents a
> > +list of idle states to be disabled by default in the form of a bitmask. Namely,
> > +the positions of the bits that are set in the two's complement representation of
> > +that value are the indices of idle states to be disabled by default (as
> > +reflected by the names of the corresponding idle state directories in ``sysfs``,
> > +:file:`state0`, :file:`state1` ... :file:`state<i>` ..., where ``<i>`` is the
> > +index of the given idle state; see :ref:`idle-states-representation` in
> > +:doc:`cpuidle`). For example, if ``states_off`` is equal to 3, the driver will
> > +disable idle states 0 and 1 by default, and if it is equal to 8, idle state 3
> > +will be disabled by default and so on (bit positions beyond the maximum idle
> > +state index are ignored). The idle states disabled this way can be enabled (on
> > +a per-CPU basis) from user space via ``sysfs``.
>
> A few line breaks would make that easier to read.

Fair enough.

Thanks!