Re: [PATCH v10 18/20] timers: Implement the hierarchical pull model

From: Frederic Weisbecker
Date: Tue Feb 06 2024 - 05:29:30 EST


On Tue, Feb 06, 2024 at 11:06:19AM +0100, Anna-Maria Behnsen wrote:
> Frederic Weisbecker <frederic@xxxxxxxxxx> writes:
>
> > Le Mon, Feb 05, 2024 at 02:29:34PM +0100, Anna-Maria Behnsen a écrit :
> >> Anna-Maria Behnsen <anna-maria@xxxxxxxxxxxxx> writes:
> >>
> >> > Frederic Weisbecker <frederic@xxxxxxxxxx> writes:
> >> >
> >> >> Le Thu, Feb 01, 2024 at 05:15:37PM +0100, Anna-Maria Behnsen a écrit :
> >> >>> Frederic Weisbecker <frederic@xxxxxxxxxx> writes:
> >> >>>
> >> >>> > Le Mon, Jan 15, 2024 at 03:37:41PM +0100, Anna-Maria Behnsen a écrit :
> >> >>> > Heh, I was about to say that it's impossible that timer_base_is_idle()
> >> >>> > at this stage but actually if we run in nohz_full...
> >> >>> >
> >> >>> > It happens so that nohz_full is deactivated until rcutree_online_cpu()
> >> >>> > which calls tick_dep_clear() but it's a pure coincidence that might
> >> >>> > disappear one day. So yes, let's keep it that way.
> >> >>>
> >> >>> I instrumented the code (with NOHZ FULL and NOHZ_IDLE) to make sure the
> >> >>> timer migration hierarchy state 'idle' is in sync with the timer base
> >> >>> 'idle'. And this was one part where it was possible that it runs out of
> >> >>> sync as I remember correctly. But if I understood you correctly, this
> >> >>> shouldn't happen at the moment?
> >> >>
> >> >> Well, it's not supposed to :-)
> >> >
> >> > Hmm, let me double check this and run the tests on the instrumented
> >> > version...
> >>
> >> I added a prinkt() to verify what I think I remember. I was able to see
> >> the prints. So it seems, that the coincidence that nohz_full is
> >> deactivated until rcutree_online_cpu() already disappeared.
> >
> > Nice, then I guess it can become a WARN_ON.
>
> Either I misunderstood something, or wasn't able to explain what I
> wanted to say.
>
> I understood, that nohz full is disabled (by coincidence) until
> rcutree_online_cpu() which comes after the timer migration CPU hotplug
> AP. This means, that the check whether timer base is idle or not,
> shouldn't be required in tmigr_cpu_online() to keep cpu idle or mark it
> active in the hierarchy. But we could keep it in case coincidence
> disappears. No?
>
> So I added a printk() when timer base is idle in tmigr_cpu_online(). And
> I was able to see the prints. This means, nohz full is _not_ disabled
> when executing tmigr_cpu_online(), or am I wrong?
>
> So when I replace the printk() with a WARN_ON() it will definitely
> trigger. So I'm not sure if this is what you want to have :)

Yes, silly me, I thought the tick dependency was set on all hotplug
operations but it's only cpu down. So this piece doesn't need to change
AFAICT.