Re: RCU vs NOHZ
From: Joel Fernandes
Date: Thu Sep 15 2022 - 11:37:45 EST
On Thu, Sep 15, 2022 at 9:39 AM Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
>
> Hi,
>
> After watching Joel's talk about RCU and idle ticks I was wondering
> about why RCU doesn't have NOHZ hooks -- that is regular NOHZ, not the
> NOHZ_FULL stuff.
Glad the talk stirred up a discussion!
> These deep idle states are only feasible during NOHZ idle, and the NOHZ
> path is already relatively expensive (which is offset by then mostly
> staying idle for a long while).
>
> Specifically my thinking was that when a CPU goes NOHZ it can splice
> it's callback list onto a global list (cmpxchg), and then the
> jiffy-updater CPU can look at and consume this global list (xchg).
>
> Before you say... but globals suck (they do), NOHZ already has a fair
> amount of global state, and as said before, it's offset by the CPU then
> staying idle for a fair while. If there is heavy contention on the NOHZ
> data, the idle governor is doing a bad job by selecting deep idle states
> whilst we're not actually idle for long.
>
> The above would remove the reason for RCU to inhibit NOHZ.
>
>
> Additionally; when the very last CPU goes idle (I think we know this
> somewhere, but I can't reaily remember where) we can insta-advance the
> QS machinery and run the callbacks before going (NOHZ) idle.
>
>
> Is there a reason this couldn't work? To me this seems like a much
> simpler solution than the whole rcu-cb thing.
You mean the “whole rcu-nocb” thing? The nocb feature does not just
eliminate the need to keep idle ticks ON, that’s just one of the
reasons to use nocb. The other reasons is nocb makes the cb invoke in
a thread and not softirq of the queuing cpu, this makes the energy
aware scheduler decide where to place those threads and it seems to do
a really good job of saving power (Vlad Rezki who works on android
checked and I CC’d).
Maybe your suggestion is more about how to keep the idle tick off on
systems without using no-cb (note that nocb has additional overhead
due to thread wake ups and locking)? If we do the global thing you
mentioned, then that means we won’t get the per cpu cache locality
benefits on those systems that want the cb to execute on same cpu (cb
operating on a hot cache line) since your cb gets invoked on the
jiffie updating cpu per your design.
Outside of RCU, I do remember back in my android days that even with
nocb, we can see idle ticks present quite a bit (this is likely
because of the idle governor in android, but I did not dig much
deeper). I need to go and investigate that again..
Thanks,
- Joel