Re: [RFC][PATCH 12/13] stop_machine: Remove lglock

From: Peter Zijlstra
Date: Wed Jul 01 2015 - 12:16:58 EST


On Wed, Jul 01, 2015 at 08:56:55AM -0700, Paul E. McKenney wrote:
> On Wed, Jul 01, 2015 at 01:56:42PM +0200, Peter Zijlstra wrote:
> Odd that four of your eight rcuos kthreads show higher consumption than
> the others; I would expect three of eight. Are you by chance running
> an eight-core system with hyperthreading disabled in hardware, via boot
> parameter, or via explicit offline? The real question I have is "is
> nr_cpu_ids equal to 16 rather than to 8?"

It should not, but I'd have to instrument to be sure. It's a regular
4-core + HT part.

model name : Intel(R) Core(TM) i7-2600K CPU @ 3.40GHz
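
(If it comes to instrumenting: a one-liner like the sketch below, dropped
into an early initcall, would answer the nr_cpu_ids question. The initcall
function name is made up for illustration.)

	/* Minimal sketch: print nr_cpu_ids at boot to see whether this
	 * box reports 8 or 16 possible CPUs. */
	#include <linux/init.h>
	#include <linux/kernel.h>
	#include <linux/cpumask.h>	/* nr_cpu_ids */

	static int __init dump_nr_cpu_ids(void)
	{
		pr_info("nr_cpu_ids = %d\n", nr_cpu_ids);
		return 0;
	}
	early_initcall(dump_nr_cpu_ids);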

> Also, do you have nohz_full set?

Nope..

> Just wondering why callback offloading
> is enabled. (If you want it enabled, fine, but from what I can see your
> workload isn't being helped by it and it does have higher overhead.)

I think this is a distro .config; every time I strip the desktop kernel
I end up needing a driver I hadn't built. Clearly I've not really paid
attention to the RCU options.
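
(For reference, on a kernel of that vintage the offloading would come from
Kconfig options along these lines, which a distro config may well enable
by default:)

	CONFIG_RCU_NOCB_CPU=y
	CONFIG_RCU_NOCB_CPU_ALL=y	# offload callbacks from all CPUs to rcuos kthreads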

> Even if you don't want offloading and do disable it, it would be good to
> reduce the penalty. Is there something I can do to reduce the overhead
> of waking several kthreads? Right now, I just do a series of wake_up()
> calls, one for each leader rcuos kthread.
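
(For context, the series of wakeups described above is essentially the
loop below; the iterator and field names are invented for illustration,
the real code lives in the rcu_nocb machinery.)

	/* Sketch: one wake_up() per leader rcuos kthread, i.e. one
	 * scheduler round trip per leader at the end of a grace period. */
	for_each_leader_rcuos(rdp)		/* hypothetical iterator */
		wake_up(&rdp->nocb_wq);		/* hypothetical wait queue */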
>
> Oh, are you running v3.10 or some such? If so, there are some more
> recent RCU changes that can help with this. They are called out here:

Not that old, but not something recent either. I'll upgrade and see if
it goes away. I really detest rebooting the desktop, but it needs to
happen every so often.

> > Yah, if only we could account it back to whoever caused it :/
>
> It could be done, but would require increasing the size of rcu_head.
> And would require costly fine-grained timing of callback execution.
> Not something for production systems, I would guess.

Nope :/ I know.
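
(A back-of-the-envelope sketch of what that accounting would need; the
extra fields are hypothetical and would roughly double the size of every
callback, on top of the cost of timing each invocation:)

	/* Hypothetical debug-only wrapper around rcu_head: record who
	 * queued the callback and how long it ran, so the cycles could
	 * be charged back to the originator. */
	struct rcu_head_dbg {
		struct rcu_head head;		/* the real two-pointer rcu_head */
		struct task_struct *owner;	/* who called call_rcu() */
		u64 exec_ns;			/* fine-grained callback runtime */
	};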

> > What I was talking about was the interaction between the force
> > quiescent state and the poking that detects that a QS had indeed been
> > started.
>
> It gets worse.
>
> Suppose that a grace period is already in progress. You cannot leverage
> its use of the combining tree because some of the CPUs might have already
> indicated a quiescent state, which means that the current grace period
> won't necessarily wait for all of the CPUs that the concurrent expedited
> grace period needs to wait on. So you need to kick the current grace
> period, wait for it to complete, wait for the next one to start (with
> all the fun and exciting issues called out earlier), do the expedited
> grace period, then wait for completion.

Ah yes. You do find the fun cases :-)
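
(Spelling out the ordering Paul describes; every function name below is
invented purely to show the sequence:)

	/* Hypothetical sketch: expediting while a GP is already running.
	 * The in-flight GP may have QS reports from some CPUs already,
	 * so it cannot be reused -- wait it out, then run our own. */
	kick_current_grace_period();	/* force the in-flight GP along */
	wait_for_current_gp_completion();
	wait_for_next_gp_to_start();	/* with all the races noted earlier */
	run_expedited_grace_period();
	wait_for_expedited_gp_completion();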

> > If you wake it unconditionally, even if there's nothing to do, then yes
> > that'd be a waste of cycles.
>
> Heh! You are already complaining about rcu_sched consuming 0.7%
> of your system, and rightfully so. Increasing this overhead still
> further therefore cannot be considered a good thing unless there is some
> overwhelming benefit. And I am not seeing that benefit. Perhaps due
> to a failure of imagination, but until someone enlightens me, I have to
> throttle the wakeups -- or, perhaps better, omit the wakeups entirely.
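
(The "throttle or omit" alternative amounts to a check along these lines;
the predicate is made up:)

	/* Sketch: wake the leader only when it actually has callbacks
	 * queued, instead of unconditionally at every grace-period end. */
	if (leader_has_pending_callbacks(rdp))	/* hypothetical predicate */
		wake_up(&rdp->nocb_wq);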
>
> Actually, I am not convinced that I should push any of the patches that
> leverage expedited grace periods to help out normal grace periods.

It would seem a shame not to... I've not yet had time to form a coherent
reply to that thread, though.