Re: Requirements to control kernel isolation/nohz_full at runtime

From: Frederic Weisbecker
Date: Wed Sep 09 2020 - 22:32:31 EST


On Fri, Sep 04, 2020 at 01:47:40PM -0700, Paul E. McKenney wrote:
> On Tue, Sep 01, 2020 at 12:46:41PM +0200, Frederic Weisbecker wrote:
> > Hi,
> >
> > I'm currently working on making nohz_full/nohz_idle runtime togglable
> > and some other people seem to be interested as well. So I've dumped
> > a few thoughts about some pre-requirements to achieve that for those
> > interested.
> >
> > As you can see, there is a bit of hard work in the way. I'm iterating
> > that in https://pad.kernel.org/p/isolation, feel free to edit:
> >
> >
> > == RCU nocb ==
> >
> > Currently controllable with "rcu_nocbs=" boot parameter and/or through nohz_full=/isolcpus=nohz.
> > We need to make it togglable at runtime. Currently handling that:
> > v1: https://lwn.net/Articles/820544/
> > v2: coming soon
>
> Looking forward to seeing it!

So many ordering riddles I had to put on paper. But I'm getting close to
something RFC-postable now.

>
> > == TIF_NOHZ ==
> >
> > Need to get rid of that in order not to trigger syscall slowpath on CPUs that don't want nohz_full.
> > Also we don't want to iterate all threads and clear the flag when the last nohz_full CPU exits nohz_full
> > mode. Prefer static keys to call context tracking on archs. x86 does that well.
>
> Would it help if RCU was able to, on a per-CPU basis, distinguish between
> nohz_full userspace execution on the one hand and idle-loop execution
> on the other? Or do you have some other trick in mind?

No, it's more about context tracking. Initially it used TIF_NOHZ to force
the syscall slow path and call into context tracking on kernel entry and exit.

The problem is that it forces all CPUs, including housekeepers, into that
syscall slow path. So we would rather make the context tracking call
conditional on a per-CPU basis than on a per-task basis. Static keys are
good for that, and that's what x86 does.
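
To make the difference concrete, here is a rough userspace sketch of the two
schemes. It is only a model: a plain bool stands in for the static key, an
array stands in for per-CPU state, and every name in it (context_tracking_key,
cpu_tracking_active, set_cpu_nohz_full, the syscall_entry_* helpers) is
made up for illustration and is not the kernel's actual code.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4

/* Hypothetical stand-in for a static key: one global on/off switch. */
static bool context_tracking_key;
/* Hypothetical stand-in for per-CPU state: does this CPU want tracking? */
static bool cpu_tracking_active[NR_CPUS];
/* Counts how often each CPU took the (modelled) syscall slow path. */
static int slowpath_hits[NR_CPUS];

/* Toggle nohz_full on one CPU and keep the global switch on only while
 * at least one CPU still needs context tracking. */
static void set_cpu_nohz_full(int cpu, bool on)
{
	bool any = false;

	assert(cpu >= 0 && cpu < NR_CPUS);
	cpu_tracking_active[cpu] = on;
	for (int i = 0; i < NR_CPUS; i++)
		any |= cpu_tracking_active[i];
	context_tracking_key = any;
}

/* Per-task scheme: a task carrying the flag takes the slow path on
 * whatever CPU it runs on, housekeepers included. */
static void syscall_entry_per_task(int cpu, bool task_has_tif_nohz)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	if (task_has_tif_nohz)
		slowpath_hits[cpu]++;
}

/* Per-CPU scheme: the global switch is checked first, then the per-CPU
 * state, so housekeeping CPUs never enter the slow path. */
static void syscall_entry_per_cpu(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	if (context_tracking_key && cpu_tracking_active[cpu])
		slowpath_hits[cpu]++;
}
```

In this model, turning nohz_full off on the last CPU flips the global switch
back off without touching any task, which is the property we want to get
from dropping TIF_NOHZ.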

So RCU can't help much, I fear (but hey, first time I can say that! ;-)

Thanks.