Re: [PATCH v4 0/8] support for bitmap (and hence CPU) list "N" abbreviation

From: Paul E. McKenney
Date: Wed Feb 10 2021 - 19:23:59 EST


On Wed, Feb 10, 2021 at 03:50:07PM -0800, Yury Norov wrote:
> On Wed, Feb 10, 2021 at 9:57 AM Paul E. McKenney <paulmck@xxxxxxxxxx> wrote:
> >
> > On Wed, Feb 10, 2021 at 06:26:54PM +0200, Andy Shevchenko wrote:
> > > On Tue, Feb 09, 2021 at 05:58:59PM -0500, Paul Gortmaker wrote:
> > > > The basic objective here was to add support for "nohz_full=8-N" and/or
> > > > "rcu_nocbs=4-N" -- essentially introduce "N" as a portable reference
> > > > to the last core, evaluated at boot for anything using a CPU list.
> > >
> > > I thought we kinda agreed that N is confusing and L is better.
> > > N to me is equal to 32 on 32 core system as *number of cores / CPUs*. While L
> > > sounds better as *last available CPU number*.
> >
> > The advantage of "N" is that people will automatically recognize it as
> > "last thing" or "number of things" because "N" has long been used in
> > both senses. In contrast, someone seeing "0-L" for the first time is
> > likely to go "What???".
> >
> > Besides, why would someone interpret "N" as "number of CPUs" when doing
> > that almost always gets you an invalid CPU number?
> >
> > Thanx, Paul
>
> I have no strong opinion about a letter, but I like Andy's idea to make it
> case-insensitive.
>
> There is another comment from the previous iteration not addressed so far.
>
> This idea of the N notation is to make the bitmap list interface more robust
> when we share the configs between different machines. What we have now
> is definitely a good thing, but not completely portable except for cases
> 'N', '0-N' and 'N-N'.
>
> For example, if one user adds rcu_nocbs='4-N', and it works perfectly fine for
> him, another user with NR_CPUS == 2 will fail to boot with such a config.
>
> This is not a problem of course in case of absolute values because nobody
> guaranteed robustness. But this N feature would be barely useful in practice,
> except for 'N', '0-N' and 'N-N' as I mentioned before, because there's always
> a chance to end up with a broken config.
>
> We can improve on robustness a lot if we take care of this case. For me,
> the more reliable interface would look like this:
> 1. chunks without N work as before.
> 2. if 'a-N' is passed where a>=N, we drop chunk and print warning message
> 3. if 'a-N' is passed where a>=N together with a control key, we set last bit
> and print warning.
>
> For example, on 2-core CPU:
> "4-2" --> error
> "4-4" --> error
> "4-N" --> drop and warn
> "X, 4-N" --> set last bit and warn
>
> Any comments?

We really don't know the user's intent, and we cannot have complete
portability without knowing the user's intent. For example, "4-N" might mean
"all but the first four CPUs", in which case an error is appropriate
because "4-N" makes no more sense on a 2-CPU system than does "4-1".
I could see a potential desire for some notation for "the last two CPUs",
but let's please have a real need for such a thing before overengineering
this patch series any further.
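
To make the "error is appropriate" case concrete, here is a minimal
user-space sketch in Python. It is not the kernel's parser; the function
name parse_cpu_list and its nr_cpus argument are hypothetical, and it only
models plain "a", "a-b" and "N" chunks, with "N" taken to be the last CPU
number (nr_cpus - 1).

```python
def parse_cpu_list(spec, nr_cpus):
    """Expand a CPU list such as "0,4-N" into a set of CPU numbers.

    Illustrative only: "N" stands for the last CPU (nr_cpus - 1), and
    any chunk whose start exceeds its end, or whose end exceeds the
    last CPU, is rejected rather than silently adjusted.
    """
    if nr_cpus < 1:
        raise ValueError("nr_cpus must be at least 1")
    last = nr_cpus - 1
    cpus = set()
    for chunk in spec.split(","):
        chunk = chunk.strip()
        # "4-N" -> ("4", "-", "N"); a bare "4" -> ("4", "", "")
        lo_s, sep, hi_s = chunk.partition("-")
        lo = last if lo_s == "N" else int(lo_s)
        if sep:
            hi = last if hi_s == "N" else int(hi_s)
        else:
            hi = lo
        if lo > hi or hi > last:
            raise ValueError(
                "invalid chunk %r with %d CPU(s)" % (chunk, nr_cpus))
        cpus.update(range(lo, hi + 1))
    return cpus
```

With these assumptions, parse_cpu_list("4-N", 8) gives CPUs 4 through 7,
while parse_cpu_list("4-N", 2) raises ValueError for the same reason
"4-1" would: the chunk starts past the last CPU.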

To get the level of portability you seem to be looking for, we need some
higher-level automation that knows how many CPUs there are and what
the intent is. That automation can then generate the cpumasks for a
given system. But for more typical situations, what Paul has now will
work fine.
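
As a rough illustration of what such automation could look like, here is
a small Python sketch. Everything in it is an assumption made for the
example: the helper name isolated_cpu_list is hypothetical, and the
"intent" is taken to be "keep the first few CPUs for housekeeping and
isolate the rest", which is only one of many possible intents.

```python
def isolated_cpu_list(nr_cpus, housekeeping):
    """Return a CPU list string for the CPUs left after housekeeping.

    Hypothetical helper: the first `housekeeping` CPUs are kept for
    general use and the remainder are returned as a range such as
    "4-7". Returns None when the system is too small to isolate
    anything, so the caller can omit the boot parameter entirely.
    """
    if nr_cpus < 1 or housekeeping < 1:
        raise ValueError("nr_cpus and housekeeping must be at least 1")
    first = housekeeping
    last = nr_cpus - 1
    if first > last:
        return None          # nothing left to isolate on this system
    if first == last:
        return str(first)    # a single CPU, no range needed
    return "%d-%d" % (first, last)
```

A deployment script could then emit something like "nohz_full=4-7" on an
8-CPU system and emit nothing at all on a 2-CPU system, because it knows
both the CPU count and the intent, which the kernel's list parser cannot.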

Paul Gortmaker's patch series is doing something useful. We should
not let potential future desires prevent us from taking a very useful
step forward.

Thanx, Paul