Re: RCU vs NOHZ

From: Peter Zijlstra
Date: Sat Sep 17 2022 - 09:36:04 EST


On Fri, Sep 16, 2022 at 02:11:10PM -0400, Joel Fernandes wrote:
> Hi Peter,
>
> On Fri, Sep 16, 2022 at 5:20 AM Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> [...]
> > > It wasn't enabled for ChromeOS.
> > >
> > > When fully enabled, it gave them the energy-efficiency advantages Joel
> > > described. And then Joel described some additional call_rcu_lazy()
> > > changes that provided even better energy efficiency. Though I believe
> > > that the application should also be changed to avoid incessantly opening
> > > and closing that file while the device is idle, as this would remove
> > > -all- RCU work when nearly idle. But some of the other call_rcu_lazy()
> > > use cases would likely remain.
> >
> > So I'm thinking the scheme I outlined gets you most if not all of what
> > lazy would get you without having to add the lazy thing. A CPU is never
> > refused deep idle when it passes off the callbacks.
> >
> > The NOHZ thing is a nice hook for 'this-cpu-wants-to-go-idle-long-term'
> > and do our utmost bestest to move work away from it. You *want* to break
> > affinity at this point.
> >
> > If you hate on the global, push it to a per rcu_node offload list until
> > the whole node is idle and then push it up the next rcu_node level until
> > you reach the top.
> >
> > Then when the top rcu_node is full idle; you can insta progress the QS
> > state and run the callbacks and go idle.
>
> In my opinion the speed brakes have to be applied before the GP and
> other threads are even awakened. The issue Android and ChromeOS
> observe is that even a single CB queued every few jiffies can cause
> work that can be otherwise delayed / batched, to be scheduled in. I am
> not sure if your suggestions above address that. Does it?

Scheduled how? Is this callbacks doing queue_work() or something?

Anyway; the thinking is that by passing off the callbacks on NOHZ, the
idle CPUs stay idle. By running the callbacks before going full idle,
all work is done and you can stay idle longer.

> Try this experiment on your ADL system (for fun). Boot to the login
> screen on any distro,

All my dev boxes are headless :-) I don't thinkt he ADL even has X or
wayland installed.

> and before logging in, run turbostat over ssh
> and observe PC8 percent residencies. Now increase
> jiffies_till_first_fqs boot parameter value to 64 or so and try again.
> You may be surprised how much PC8 percent increases by delaying RCU
> and batching callbacks (via jiffies boot option) Admittedly this is
> more amplified on ADL because of package-C-states, firmware and what
> not, and isn’t as much a problem on Android; but still gives a nice
> power improvement there.

I can try; but as of now turbostat doesn't seem to work on that thing at
all. I think localyesconfig might've stripped a required bit. I'll poke
at it later.