Re: [PATCH v4 0/8] support for bitmap (and hence CPU) list "N" abbreviation
From: Paul Gortmaker
Date: Sun Feb 21 2021 - 03:05:04 EST
[Re: [PATCH v4 0/8] support for bitmap (and hence CPU) list "N" abbreviation] On 10/02/2021 (Wed 15:50) Yury Norov wrote:
> On Wed, Feb 10, 2021 at 9:57 AM Paul E. McKenney <paulmck@xxxxxxxxxx> wrote:
> >
> > On Wed, Feb 10, 2021 at 06:26:54PM +0200, Andy Shevchenko wrote:
> > > On Tue, Feb 09, 2021 at 05:58:59PM -0500, Paul Gortmaker wrote:
> > > > The basic objective here was to add support for "nohz_full=8-N" and/or
> > > > "rcu_nocbs=4-N" -- essentially introduce "N" as a portable reference
> > > > to the last core, evaluated at boot for anything using a CPU list.
> > >
> > > I thought we kinda agreed that N is confusing and L is better.
> > > N to me is equal to 32 on a 32-core system as *number of cores / CPUs*. While L
> > > sounds better as *last available CPU number*.
> >
> > The advantage of "N" is that people will automatically recognize it as
> > "last thing" or "number of things" because "N" has long been used in
> > both senses. In contrast, someone seeing "0-L" for the first time is
> > likely to go "What???".
> >
> > Besides, why would someone interpret "N" as "number of CPUs" when doing
> > that almost always gets you an invalid CPU number?
> >
> > Thanx, Paul
>
> I have no strong opinion about a letter, but I like Andy's idea to make it
> case-insensitive.
It is trivial to add later if someone can prove a genuine need for it,
but it is impossible to remove later if we add it now for no reason.
>
> There is another comment from the previous iteration not addressed so far.
Actually, no - it was addressed in detail already:
https://lore.kernel.org/lkml/20210127091238.GH23530@xxxxxxxxxxxxx/
> This idea of the N notation is to make the bitmap list interface more robust
> when we share the configs between different machines. What we have now
> is definitely a good thing, but not completely portable except for cases
> 'N', '0-N' and 'N-N'.
>
> For example, if one user adds rcu_nocbs= '4-N', and it works perfectly fine for
> him, another user with NR_CPUS == 2 will fail to boot with such a config.
Firstly there is no "fail to boot" from "rcu_nocbs=<invalid>" -- that
just doesn't happen. In any case, as you can see, I added in v4 the
documentation (as you requested) for this case - in several places.
And I explained in the thread above why any attempt to do some kind of
mapping policy was doomed to just add confusion and end up doing the
wrong thing. And the discussion ended with that.
So I'm not clear why it was brought up again here as if I just ignored
your "broken config" concerns and never addressed them.
In any case as others have indicated, it serves no immediate purpose to
over-think this and start adding corner case reactions to use cases that
simply don't exist and probably never will.
Thanks,
Paul.
--
>
> This is not a problem of course in case of absolute values because nobody
> guaranteed robustness. But this N feature would be barely useful in practice,
> except for 'N', '0-N' and 'N-N' as I mentioned before, because there's always
> a chance to end up with a broken config.
>
> We can improve on robustness a lot if we take care of this case. For me,
> the more reliable interface would look like this:
> 1. chunks without N work as before.
> 2. if 'a-N' is passed where a>=N, we drop chunk and print warning message
> 3. if 'a-N' is passed where a>=N together with a control key, we set last bit
> and print warning.
>
> For example, on 2-core CPU:
> "4-2" --> error
> "4-4" --> error
> "4-N" --> drop and warn
> "X, 4-N" --> set last bit and warn
>
> Any comments?
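
For readers following the quoted proposal, here is a minimal userspace
sketch of the three rules and the 2-core examples above. It is not the
kernel's parser and not code from this patch series. The function name
expand_cpulist and the clamp_to_last flag (standing in for the
unspecified "control key") are hypothetical, and reading "a>=N" as
"start is beyond the last CPU number" is an assumption.

```python
def expand_cpulist(spec, ncpus, clamp_to_last=False):
    """Expand a CPU list such as "0,4-N" for a machine with `ncpus` CPUs.

    Returns (sorted CPU numbers, warnings). Raises ValueError for the
    cases the quoted proposal labels "error".
    `clamp_to_last` is a hypothetical stand-in for the "control key".
    """
    last = ncpus - 1  # "N" is the last CPU number, not the CPU count
    cpus = set()
    warnings = []
    for chunk in spec.split(","):
        chunk = chunk.strip()
        start_s, sep, end_s = chunk.partition("-")
        if not sep:
            end_s = start_s  # a single value such as "3" or "N"
        start = last if start_s == "N" else int(start_s)
        end = last if end_s == "N" else int(end_s)

        # Rules 2 and 3: "a-N" whose start lies beyond the last CPU.
        if end_s == "N" and start > last:
            if clamp_to_last:
                cpus.add(last)
                warnings.append(f"{chunk}: start beyond CPU {last}, set last bit")
            else:
                warnings.append(f"{chunk}: start beyond CPU {last}, dropped")
            continue

        # Rule 1: chunks without a trailing "N" are validated as plain ranges.
        if start > end or end > last:
            raise ValueError(f"{chunk}: invalid range for {ncpus} CPUs")
        cpus.update(range(start, end + 1))
    return sorted(cpus), warnings
```

With ncpus=2, "4-2" and "4-4" raise ValueError, "4-N" yields an empty
list plus one warning, and "0,4-N" with clamp_to_last=True yields CPUs
0 and 1 plus one warning.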