Re: [PATCH RFC tip/core/rcu] Parallelize and economize NOCB kthread wakeups

From: Peter Zijlstra
Date: Tue Jul 08 2014 - 05:06:20 EST


On Wed, Jul 02, 2014 at 10:55:01AM -0700, Paul E. McKenney wrote:
> On Wed, Jul 02, 2014 at 07:26:00PM +0200, Peter Zijlstra wrote:
> > On Wed, Jul 02, 2014 at 10:08:38AM -0700, Paul E. McKenney wrote:
> > > As were others, not that long ago. Today is the first hint that I got
> > > that you feel otherwise. But it does look like the softirq approach to
> > > callback processing needs to stick around for awhile longer. Nice to
> > > hear that softirq is now "sane and normal" again, I guess. ;-)
> >
> > Nah, softirqs are still totally annoying :-)
>
> Name me one thing that isn't annoying. ;-)
>
> > So I've lost detail again, but it seems to me that on all CPUs that are
> > actually getting ticks, waking tasks to process the RCU state is
> > entirely over doing it. Might as well keep processing their RCU state
> > from the tick as was previously done.
>
> And that is in fact the approach taken by my patch. For which I just
> kicked off testing, so expect an update later today. (And that -is-
> optimistic! A pessimistic viewpoint would hold that the patch would
> turn out to be so broken that it would take -weeks- to get a fix!)

Right, but as you told Mike, it's not really dynamic; of course we can
work on that.

That said, I'm somewhat confused about the whole nocb thing. The way I
see it, there are two things that need doing:

1) push the state machine
2) run callbacks

It seems to me the nocb threads do both, and somehow some of this is
getting conflated, because afaik RCU only uses softirqs for (2), since
(1) is fully done from the tick -- well, it used to be, before all this.

Now, IIRC RCU callbacks are not guaranteed to run on whatever CPU
they're queued on, so we can 'easily' splice the actual callback list
into some other CPU's callback list. Which leaves only (1) to actually
'do'.
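
To make the splice concrete, here is a minimal user-space sketch,
assuming a singly linked callback list with a tail pointer. The names
(struct cb, struct cb_list, cb_splice, count_cb and so on) are
hypothetical stand-ins, not the kernel's actual structures or API, and
the sketch ignores locking and the requirement that a grace period has
elapsed before a callback may be invoked.

```c
#include <stddef.h>

/* Hypothetical stand-in for a queued callback (not the kernel's rcu_head). */
struct cb {
	struct cb *next;
	void (*func)(struct cb *);
};

/* Hypothetical per-CPU callback queue: head plus pointer to the last ->next. */
struct cb_list {
	struct cb *head;
	struct cb **tail;
	long len;
};

static void cb_list_init(struct cb_list *l)
{
	l->head = NULL;
	l->tail = &l->head;
	l->len = 0;
}

/* Append one callback to the end of the queue. */
static void cb_enqueue(struct cb_list *l, struct cb *c,
		       void (*func)(struct cb *))
{
	c->next = NULL;
	c->func = func;
	*l->tail = c;
	l->tail = &c->next;
	l->len++;
}

/* Move every callback from src to the end of dst in O(1); src ends up empty. */
static void cb_splice(struct cb_list *dst, struct cb_list *src)
{
	if (!src->head)
		return;
	*dst->tail = src->head;
	dst->tail = src->tail;
	dst->len += src->len;
	cb_list_init(src);
}

/* Detach the whole list, invoke each callback, return how many ran. */
static long cb_invoke_all(struct cb_list *l)
{
	struct cb *c = l->head;
	long n = 0;

	cb_list_init(l);
	while (c) {
		struct cb *next = c->next;	/* func may free c */

		c->func(c);
		c = next;
		n++;
	}
	return n;
}

/* Example callback that only counts invocations. */
static int invoked;

static void count_cb(struct cb *c)
{
	(void)c;
	invoked++;
}
```

The point of the tail pointer is that the splice costs the same no
matter how many callbacks the source CPU has queued, so handing the
list to another CPU is cheap; the remaining cost is all in (1).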

Yet the whole thing is named after the 'no-callback' part, even though
the most important part is pushing the state machine remotely.

Now I can see we probably don't want to push remote CPUs' RCU state
from IRQ context, but we could, I think, drive the state machine
remotely. And we want to avoid overloading one CPU with the work of all
the others, which I think is still a fundamental issue with the whole
nohz_full thing: it reverts to the _one_ timekeeper CPU, and on big
enough systems that'll be a problem.
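
As a rough illustration of the overload concern, the sketch below
spreads the remote CPUs round-robin over several driver CPUs and
reports the worst-case load per driver. This is only one possible
assignment, not what any patch in this thread does; driver_for and
spread_load are hypothetical names.

```c
#include <stddef.h>

/* Hypothetical: driver CPU index chosen for the i-th remote CPU. */
static int driver_for(int i, int n_drivers)
{
	return i % n_drivers;
}

/*
 * Fill load[0..n_drivers-1] with the number of remote CPUs each driver
 * handles and return the largest entry.
 */
static int spread_load(int n_remote, int n_drivers, int *load)
{
	int i, max = 0;

	for (i = 0; i < n_drivers; i++)
		load[i] = 0;
	for (i = 0; i < n_remote; i++) {
		int d = driver_for(i, n_drivers);

		if (++load[d] > max)
			max = load[d];
	}
	return max;
}
```

With a single driver the maximum equals the number of remote CPUs,
which is the one-timekeeper situation; each extra driver divides that
worst case roughly evenly.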

