Re: [GIT RFC PULL rcu/urgent] Prevent Kconfig from asking pointless questions

From: Ingo Molnar
Date: Sat Apr 18 2015 - 10:33:05 EST



* Paul E. McKenney <paulmck@xxxxxxxxxxxxxxxxxx> wrote:

> On Sat, Apr 18, 2015 at 03:03:41PM +0200, Ingo Molnar wrote:
> >
> > * Paul E. McKenney <paulmck@xxxxxxxxxxxxxxxxxx> wrote:
> >
> > > Hello, Ingo,
> > >
> > > This series contains a single change that fixes Kconfig asking pointless
> > > questions (https://lkml.org/lkml/2015/4/14/616). This is an RFC pull
> > > because there has not yet been a -next build for April 16th. If you
> > > would prefer to wait until after -next has pulled this, please let me
> > > know and I will redo this pull request after that has happened.
> > >
> > > In the meantime, this change is available in the git repository at:
> > >
> > > git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git for-mingo
> > >
> > > for you to fetch changes up to 8d7dc9283f399e1fda4e48a1c453f689326d9396:
> > >
> > > rcu: Control grace-period delays directly from value (2015-04-14 19:33:59 -0700)
> > >
> > > ----------------------------------------------------------------
> > > Paul E. McKenney (1):
> > > rcu: Control grace-period delays directly from value
> > >
> > > kernel/rcu/tree.c | 16 +++++++++-------
> > > lib/Kconfig.debug | 1 +
> > > 2 files changed, 10 insertions(+), 7 deletions(-)
> >
> > Pulled, thanks a lot Paul!
> >
> > Note, while this fixes Linus's immediate complaint that arose from the
> > new option, I still think we need to do more fixes in this area.
>
> Good point!
>
> > To demonstrate the current situation I tried the following experiment,
> > I did a 'make defconfig' on an x86 box and then took the .config and
> > deleted all 'RCU Subsystem' options not marked as debugging.
> >
> > Then I did a 'make oldconfig' to see what kinds of questions a user is
> > facing when trying to configure RCU:
> >
> > *
> > * Restart config...
> > *
> > *
> > * RCU Subsystem
> > *
> > RCU Implementation
> > > 1. Tree-based hierarchical RCU (TREE_RCU) (NEW)
> > choice[1]: 1
>
> Hmmm... Given that there is no choice, I agree that it is a bit silly
> to ask...

To clarify: this doesn't actually ask - it gets skipped by the kconfig
tool. All the rest is an interactive prompt.

> > Task_based RCU implementation using voluntary context switch (TASKS_RCU) [N/y/?] (NEW)
>
> Agreed, this one should be driven directly off of CONFIG_RCU_TORTURE_TEST
> and the tracing use case.

Yeah.

> > Consider userspace as in RCU extended quiescent state (RCU_USER_QS) [N/y/?] (NEW)
>
> This should be driven directly off of CONFIG_NO_HZ_FULL, unless
> Frederic knows something I don't.

Yes.

> > Tree-based hierarchical RCU fanout value (RCU_FANOUT) [64] (NEW)
>
> Hmmm... I could drop/obscure this one in favor of a boot parameter.

Well, what I think might be even better is to make it scale based on
CONFIG_NR_CPUS. Distros already actively manage the 'maximum number of
CPUs we support', so relying on that value makes sense.

So if someone sets CONFIG_NR_CPUS to 1024, it gets scaled accordingly.
If CONFIG_NR_CPUS is set to 2, it gets scaled to a minimal config.
Note that this would exercise and test the affected codepaths better
as well, as we'd get different size setups.
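
To make the scaling idea concrete, here is a rough sketch in C. It is
not kernel code: scaled_fanout() and tree_levels() are hypothetical
names, and the doubling heuristic is an assumption made up purely for
illustration, not how the RCU tree geometry is actually computed.

```c
/*
 * Hypothetical helper: derive a fanout from the configured CPU limit.
 * Start at the minimal value and double until a two-level tree with
 * that fanout could cover nr_cpus, or until max_fanout is reached.
 */
static int scaled_fanout(int nr_cpus, int max_fanout)
{
	int fanout = 2;	/* minimal config for tiny systems */

	while (fanout < max_fanout && fanout * fanout < nr_cpus)
		fanout *= 2;
	return fanout;
}

/*
 * Hypothetical helper: number of tree levels needed to cover nr_cpus,
 * given the leaf-level fanout and the fanout of the interior levels.
 */
static int tree_levels(int nr_cpus, int leaf_fanout, int fanout)
{
	int levels = 1;
	long capacity = leaf_fanout;	/* CPUs covered by one leaf node */

	while (capacity < nr_cpus) {
		capacity *= fanout;	/* add one more interior level */
		levels++;
	}
	return levels;
}
```

With these made-up rules, scaled_fanout(2, 64) stays at the minimal
value of 2, scaled_fanout(1024, 64) gives 32, and
tree_levels(4096, 16, 64) gives 3, so different NR_CPUS values would
indeed build differently shaped trees.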

As for the boot option to override it: what would be the use case for
that?

> > Tree-based hierarchical RCU leaf-level fanout value (RCU_FANOUT_LEAF) [16] (NEW)
>
> Ditto -- though large configurations really do set this to 64 in
> combination with the skew_tick boot parameter. Maybe we need to
> drive these off of some large-system parameter, like CONFIG_MAX_SMP.

Or rather CONFIG_NR_CPUS. CONFIG_MAX_SMP is really a debugging thing,
to configure the system to the silliest high settings that don't
outright crash - but it doesn't make much sense otherwise.

> > Disable tree-based hierarchical RCU auto-balancing (RCU_FANOUT_EXACT) [N/y/?] (NEW)
>
> I should just make this a boot parameter. Absolutely no reason for
> it to be a Kconfig parameter.

Again I'd size this to NR_CPUS - and for the boot parameter, I'd think
about actual use cases.

> > Accelerate last non-dyntick-idle CPU's grace periods (RCU_FAST_NO_HZ) [N/y/?] (NEW)
>
> On this one, I have no idea. Its purpose is energy efficiency, but
> it does have some downsides, for example, increasing idle entry/exit
> latency. I am a bit nervous about having it be a boot parameter
> because that would leave an extra compare-branch in the path. This
> one will require some thought.

Keeping this one configurable, with a good default and a good
explanation makes sense. There's a lot of

> > Real-time priority to use for RCU worker threads (RCU_KTHREAD_PRIO) [0] (NEW)
>
> Indeed, Linus complained about this one. ;-)

:-) Yes, it's an essentially unanswerable question.

> This Kconfig parameter is a stopgap, and needs a real solution.
> People with crazy-heavy workloads involving realtime cannot live
> without it, but that means that most people don't have to care. I
> have had solving this on my list, and this clearly increases its
> priority.

So what value do they use, prio 99? 98? It might be better to offer
this option as a binary choice, and set a given priority. If -rt
people complain then they might help us in solving it properly.

> > Offload RCU callback processing from boot-selected CPUs (RCU_NOCB_CPU) [N/y/?] (NEW)
>
> Hmmm... Maybe a boot parameter, but I thought that there was some
> reason that this was problematic. I will have to take another look.
>
> Anyway, this one is important to non-NO_HZ_FULL real-time workloads.
> In a -rt kernel, making CONFIG_PREEMPT_RT (or whatever it is these
> days) drive this one makes a lot of sense.

Ok.

>
> > #
> > # configuration written to .config
> > #
> >
> > Only TREE_RCU is available on defconfig, so all the other options
> > marked with '(NEW)' were offered as an interactive prompt.
> >
> > I don't think that any of the 8 interactive options (!) here are
> > particularly useful to even advanced users who configure kernels, and
> > I don't think they should be offered under non-expert settings.
>
> Would it make sense to have a CONFIG_RCU_EXPERT setting to hide the
> remaining settings? That would reduce the common-case number of
> questions to one, which would be a quick and safe improvement.
> Especially when combined with the changes I called out above.

Yes, that's absolutely sensible - although I'd also do the
CONFIG_NR_CPUS-based auto-scaling if it's not set, to make sure
distros don't end up tuning this (inevitably imperfectly) in ways
that won't flow back upstream.

That's the other main problem with widely tunable, numeric settings,
beyond their user hostility: if they are wrong and get corrected in a
distro, the corrections don't flow back upstream, so they are dead-end
mechanisms as far as code quality and good defaults are concerned.

Thanks,

Ingo