Re: RCU vs NOHZ

From: Joel Fernandes
Date: Thu Sep 15 2022 - 23:40:48 EST


On 9/15/2022 6:30 PM, Peter Zijlstra wrote:
>>>>> These deep idle states are only feasible during NOHZ idle, and the NOHZ
>>>>> path is already relatively expensive (which is offset by then mostly
>>>>> staying idle for a long while).
>>>>>
>>>>> Specifically my thinking was that when a CPU goes NOHZ it can splice
>>>>> it's callback list onto a global list (cmpxchg), and then the
>>>>> jiffy-updater CPU can look at and consume this global list (xchg).
>>>>>
>>>>> Before you say... but globals suck (they do), NOHZ already has a fair
>>>>> amount of global state, and as said before, it's offset by the CPU then
>>>>> staying idle for a fair while. If there is heavy contention on the NOHZ
>>>>> data, the idle governor is doing a bad job by selecting deep idle states
>>>>> whilst we're not actually idle for long.
>>>>>
>>>>> The above would remove the reason for RCU to inhibit NOHZ.
>>>>>
>>>>>
>>>>> Additionally; when the very last CPU goes idle (I think we know this
>>>>> somewhere, but I can't reaily remember where) we can insta-advance the
>>>>> QS machinery and run the callbacks before going (NOHZ) idle.
>>>>>
>>>>>
>>>>> Is there a reason this couldn't work? To me this seems like a much
>>>>> simpler solution than the whole rcu-cb thing.
>>>> To restate Joel's reply a bit...
>>>>
>>>> Maybe.
>>>>
>>>> Except that we need rcu_nocbs anyway for low latency and HPC applications.
>>>> Given that we have it, and given that it totally eliminates RCU-induced
>>>> idle ticks, how would it help to add cmpxchg-based global offloading?
>>> Because that nocb stuff isn't default enabled?
>> Last I checked, both RHEL and Fedora were built with CONFIG_RCU_NOCB_CPU=y.
>> And I checked Fedora just now.
>>
>> Or am I missing your point?
> I might be missing the point; but why did Joel have a talk if it's all
> default on?

It was not default on until recently for Intel ChromeOS devices. Also, my talk
is much more than just idle-ticks/NOCB. I am talking about delaying grace
periods by long amounts of times using various techniques, with data to show
improvements, and new tool rcutop to show the problems :-) The discussion of
ticks which disturb idle was more for background information for the audience
(Sorry if I was not clear about that).

thanks,

- Joel