Re: Future of NOHZ full/isolation development (was Re: [NOHZ] Remove scheduler_tick_max_deferment)

From: Paul E. McKenney
Date: Tue Nov 11 2014 - 12:40:29 EST


On Tue, Nov 11, 2014 at 06:15:28PM +0100, Frederic Weisbecker wrote:
> On Mon, Nov 10, 2014 at 12:26:51PM -0600, Christoph Lameter wrote:
> > >
> > > Would it make sense for unlimited max deferment to be available as
> > > a boot parameter? That would allow people who want tick-free execution
> > > more than accurate stats to get that easily, while keeping stats accurate
> > > for everyone else.
> >
> > Subject: Make the maximum tick deferral for CONFIG_NO_HZ configurable
> >
> > Add a way to configure this interval at boot and via
> > /proc/sys/vm/max_defer_tick
> >
> > Signed-off-by: Christoph Lameter <cl@xxxxxxxxx>
>
> Sorry but that's not solving the problem. All it does is to allow the user
> to tune bugs.
>
> Kevin Hilman proposed something similar using debugfs and I declined it as
> well. Integrating a hack like this is a good way to make sure that nobody
> will ever fix the real underlying issue.

I guess I should have remembered that before suggesting this to Christoph,
my apologies to all!

> BTW, that's a good opportunity for me to generalize this case to the full
> dynticks development general issue. I got a lot of help from people to improve
> the kernel's isolation and full dynticks: Paul has spent a lot of time to improve
> RCU, you improved vmstat, full dynticks got ported to other archs, people
> like Viresh fixed some timers internals, Gilad fixed IPIs, Peterz reviewed a
> lot, etc...
>
> But now we reached a step where there are mostly core issues remaining that
> require some infrastructure change investments, some extensions or a bit of rethinking.
> We know we reach that step when people who want the features are stuck sending
> workarounds.
> Nothing like big rewrites is needed really, actually just a bunch of pretty
> self contained issues. And by self-contained I mean that each of these individual
> problems can be worked out separately as they are unrelated enough altogether. Here is
> a summarized list:
>
> * Unbound workqueues affinity (to housekeeper)
> * Unbound timers affinity (to housekeeper)
> * 1 Hz residual scheduler tick offlining to housekeeper
> * Fix some scheduler accounting that doesn't even work with 1 Hz: cpu load
> accounting, rt_scale, load balancing, etc...
> * Lighten the syscall path and get rid of cputime accounting + RCU hooks
> for people who want isolation + fast syscalls and faults.

I thought that the RCU hooks were well and truly down in the noise.
Or is that not the case without cputime accounting to hide behind?

Thanx, Paul

> * Work on non-affinable workqueues
> * Work on non-affinable timers
> * ...
>
> If I'm going to work alone on all that, this is going to take several years,
> honestly.
>
> But we know what to do and how. So all we need is (at least one) more full time
> core developer to get these things done in a reasonable amount of time.
>
> Thanks.
>
