Re: [GIT PULL] isolation: 1Hz residual tick offloading v3

From: Frederic Weisbecker
Date: Tue Jan 16 2018 - 17:51:36 EST


On Tue, Jan 16, 2018 at 11:52:11AM -0500, Luiz Capitulino wrote:
> On Tue, 16 Jan 2018 16:41:00 +0100
> Frederic Weisbecker <frederic@xxxxxxxxxx> wrote:
> > So isolcpus= is now the place where we control the isolation features
> > and nohz is one of them.
>
> That's the part I'm not very sure about. We've been advising users to
> move away from isolcpus= when possible, but this very wanted nohz_offload
> feature will force everyone back to using isolcpus= again.

Note that "isolcpus=nohz" only implies nohz. You need to add "domain" to
get the behaviour that you've been advising users against. We are simply
reusing an abandoned kernel parameter to control the isolation features
that were previously disorganized and hidden behind nohz.
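
To illustrate the flag semantics, here is a rough Python model of how an
"isolcpus=" value with a leading flag list reads. It is only a sketch:
parse_isolcpus() and KNOWN_FLAGS are made-up names, the CPU list "1-7" is
arbitrary, and it assumes that a value given without any flag falls back
to "domain". The real parsing is done by the kernel in C.

```python
# Illustrative model only, not kernel code.
KNOWN_FLAGS = {"nohz", "domain"}


def parse_isolcpus(value):
    """Split an isolcpus= value into (set of flags, cpu list string)."""
    parts = value.split(",")
    flags = set()
    # Leading known words are flags; whatever follows is the CPU list.
    while parts and parts[0] in KNOWN_FLAGS:
        flags.add(parts.pop(0))
    if not flags:
        # Assumed default: no flag means the legacy "domain" behaviour.
        flags.add("domain")
    return flags, ",".join(parts)


nohz_only = parse_isolcpus("nohz,1-7")         # nohz, no domain isolation
both = parse_isolcpus("nohz,domain,1-7")       # nohz plus domain isolation
legacy = parse_isolcpus("1-7")                 # legacy form, domain only
```

In this model only the second and third forms carry "domain", which is the
behaviour users have been advised against.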

>
> I have the impression this series is trying to solve two problems:
>
> 1. How (and where) we control the various isolation features in the
> kernel

No, that was already done in the previous merge window. We now have a
dedicated isolation subsystem (kernel/sched/isolation.c) and an interface
to control all the isolation features that were abusively implied by nohz.
The initial plan was to introduce "cpu_isolation=", but it looked too much
like "isolcpus=". So in fact, why not use "isolcpus=" and give it a second
life? And here we are.

In the end the goal is to propagate what is passed to "isolcpus=" to cpusets.


>
> 2. Where we add the control for the tick offload feature
>
> I think item 1 is too complex to solve right now. IMHO, this series
> should focus on item 2. And regarding item 2, I think we have two
> choices to make:
>
> 1. Make tick offload a first class citizen by making it default to
> nohz_full=. If there are regressions, we handle them

That's a possible way to go.

>
> 2. Add a new option to nohz_full=, like nohz_full=tick_offload
>
> As an avid user of nohz_full I'm dying to see option 1 happening,
> but I'm not totally sure what the consequences can be.

The "nohz_full=" parameter was badly designed, as it implies much more
than just full dynticks. So I'm not really looking forward to expanding
it.

> Another idea is to add CONFIG_NOHZ_TICK_OFFLOAD as an experimental
> feature.

I fear it's way too distro-unfriendly. Distros will want to have it built
in as a capability without necessarily running it, just like they do with
CONFIG_NO_HZ_FULL.

>
> > The complaint about isolcpus is the immutable result. I'm thinking about
> > making it modifiable to cpuset but I only see two possible solutions:
> >
> > - Make the root cpuset modifiable
> > - Create a directory called "isolcpus" visible on the first cpuset mount
> > and move all processes there.
>
> So, if we move the control of the tick offload to nohz_full= itself,
> we can completely ditch any isolcpus= change in this series.
>
> I think this should give you a great relief :)

Not at all :)

What would be a great relief to me is to finally propagate isolcpus= to
cpusets, so that we can continue to expand it without a second thought.