Re: RCU vs NOHZ

From: Peter Zijlstra
Date: Fri Sep 16 2022 - 05:20:27 EST


On Fri, Sep 16, 2022 at 12:58:17AM -0700, Paul E. McKenney wrote:

> To the best of my knowledge at this point in time, agreed. Who knows
> what someone will come up with next week? But for people running certain
> types of real-time and HPC workloads, context tracking really does handle
> both idle and userspace transitions.

Sure, but idle != nohz. Nohz is where we disable the tick, and currently
RCU can inhibit this -- rcu_needs_cpu().

AFAICT there really isn't an RCU hook for this, neither through context
tracking nor through anything else.

> It wasn't enabled for ChromeOS.
>
> When fully enabled, it gave them the energy-efficiency advantages Joel
> described. And then Joel described some additional call_rcu_lazy()
> changes that provided even better energy efficiency. Though I believe
> that the application should also be changed to avoid incessantly opening
> and closing that file while the device is idle, as this would remove
> -all- RCU work when nearly idle. But some of the other call_rcu_lazy()
> use cases would likely remain.

So I'm thinking the scheme I outlined gets you most, if not all, of what
lazy would get you, without having to add the lazy thing. A CPU is never
refused deep idle when it passes off the callbacks.

The NOHZ thing is a nice hook for 'this-cpu-wants-to-go-idle-long-term',
and for doing our utmost to move work away from that CPU. You *want* to
break affinity at this point.

If you hate on the global, push the callbacks to a per-rcu_node offload
list until the whole node is idle, then push them up to the next rcu_node
level, and so on until you reach the top.

Then, when the top rcu_node is fully idle, you can instantly progress the
QS state, run the callbacks, and go idle.
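
To make the hand-off concrete, here is a rough userspace sketch of the
per-node offload idea. It is a toy model only: every toy_* name is
hypothetical and none of this is kernel code or an existing RCU API.
It assumes a single thread (no locking), and it stands in for "progress
the QS state" by simply invoking the callbacks once the root is idle.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy types; these do not mirror the kernel's structures. */
struct toy_cb {
	struct toy_cb *next;
	void (*func)(struct toy_cb *cb);
};

struct toy_cblist {
	struct toy_cb *head;
	struct toy_cb **tail;	/* points at the last ->next, or at head */
	size_t len;
};

struct toy_node {
	struct toy_node *parent;	/* NULL at the top of the tree */
	unsigned int nr_children;	/* CPUs (leaf) or child nodes */
	unsigned int nr_idle;		/* children that are long-term idle */
	struct toy_cblist offload;	/* callbacks passed off by children */
};

struct toy_cpu {
	struct toy_node *leaf;
	struct toy_cblist cbs;
	bool nohz_idle;
};

static void toy_cblist_init(struct toy_cblist *l)
{
	l->head = NULL;
	l->tail = &l->head;
	l->len = 0;
}

static void toy_cblist_enqueue(struct toy_cblist *l, struct toy_cb *cb)
{
	cb->next = NULL;
	*l->tail = cb;
	l->tail = &cb->next;
	l->len++;
}

/* Move everything from src to the end of dst, leaving src empty. */
static void toy_cblist_splice(struct toy_cblist *dst, struct toy_cblist *src)
{
	if (!src->head)
		return;
	*dst->tail = src->head;
	dst->tail = src->tail;
	dst->len += src->len;
	toy_cblist_init(src);
}

void toy_node_init(struct toy_node *n, struct toy_node *parent,
		   unsigned int nr_children)
{
	n->parent = parent;
	n->nr_children = nr_children;
	n->nr_idle = 0;
	toy_cblist_init(&n->offload);
}

void toy_cpu_init(struct toy_cpu *cpu, struct toy_node *leaf)
{
	cpu->leaf = leaf;
	cpu->nohz_idle = false;
	toy_cblist_init(&cpu->cbs);
}

/* Queue a callback on the local CPU, as a call_rcu()-like step would. */
void toy_call(struct toy_cpu *cpu, struct toy_cb *cb,
	      void (*func)(struct toy_cb *cb))
{
	cb->func = func;
	toy_cblist_enqueue(&cpu->cbs, cb);
}

/*
 * The CPU stops its tick: pass its callbacks to the leaf node, then push
 * them one level up each time a whole node turns out to be idle.
 * Returns the number of callbacks invoked, which is nonzero only if this
 * CPU was the last one keeping the top node from being fully idle.
 */
size_t toy_cpu_enter_nohz(struct toy_cpu *cpu)
{
	struct toy_node *node = cpu->leaf;
	struct toy_cb *cb, *next;
	size_t ran = 0;

	cpu->nohz_idle = true;
	toy_cblist_splice(&node->offload, &cpu->cbs);

	for (;;) {
		if (++node->nr_idle < node->nr_children)
			return 0;	/* someone under this node is busy */
		if (!node->parent)
			break;		/* the top node is fully idle */
		toy_cblist_splice(&node->parent->offload, &node->offload);
		node = node->parent;
	}

	/* Nothing can be in a read-side section here, so run everything. */
	cb = node->offload.head;
	toy_cblist_init(&node->offload);
	for (; cb; cb = next) {
		next = cb->next;
		cb->func(cb);
		ran++;
	}
	return ran;
}

/* The CPU restarts its tick; already passed-off callbacks stay put. */
void toy_cpu_exit_nohz(struct toy_cpu *cpu)
{
	struct toy_node *node;

	cpu->nohz_idle = false;
	for (node = cpu->leaf; node; node = node->parent) {
		bool was_full = node->nr_idle == node->nr_children;

		node->nr_idle--;
		if (!was_full)
			break;	/* ancestors never counted this node */
	}
}

/* Example callback for experiments: counts how many times it ran. */
unsigned int toy_invoked;

void toy_count_cb(struct toy_cb *cb)
{
	(void)cb;
	toy_invoked++;
}
```

With two leaf nodes of two CPUs each under one root, the first three
CPUs to enter NOHZ return 0 and only move callbacks up the tree; the
fourth finds the root fully idle and runs everything that was queued.
What the sketch leaves out is the part where a still-busy CPU picks up
the offloaded callbacks after a normal grace period.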