Re: [RFC 3/3] tick-sched: Replace jiffie readout with idle_entrytime

From: Thomas Gleixner
Date: Tue Nov 12 2024 - 09:32:47 EST


On Fri, Nov 08 2024 at 17:48, Joel Fernandes wrote:
> This solves the issue where jiffies can be stale and inaccurate.

Which issue?

> Putting some prints, I see that basemono can be quite stale:
> tick_nohz_next_event: basemono=18692000000 basemono_from_idle_entrytime=18695000000

What is your definition of stale? 3ms on a system with HZ < 1000 is
completely correct and within the margin of the next tick, no?
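
For concreteness, the arithmetic can be sketched as below. This is a
minimal user-space illustration, not kernel code: tick_nsec() and
within_current_tick() are hypothetical helpers, and HZ=250 is an
assumption. The report does not state the configured HZ; 250 is merely
consistent with the printed basemono being a multiple of 4ms.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/* Hypothetical helper: length of one tick in nanoseconds for a given HZ. */
static uint64_t tick_nsec(unsigned int hz)
{
	assert(hz > 0);
	return NSEC_PER_SEC / hz;
}

/*
 * Hypothetical helper: returns 1 if 'now' lies inside the tick period
 * that starts at 'basemono', 0 otherwise.
 */
static int within_current_tick(uint64_t basemono, uint64_t now,
			       unsigned int hz)
{
	return now >= basemono && now - basemono < tick_nsec(hz);
}
```

Under that assumption the two values from the print differ by 3ms, which
is inside a 4ms tick period. The same difference would span several
ticks only with HZ=1000.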

> Since we have 'now' in ts->idle_entrytime, we can just use that. It is
> more accurate, cleaner, reduces lines of code and reduces any lock
> contention with the seq locks.

What's more accurate, and what is the actual problem you are trying to
solve? This handwaving about cleaner, fewer lines of code and contention
on a nonexistent lock is just not helpful.

> I was also concerned about issue where jiffies is not updated for a long
> time, and then we receive a non-tick interrupt in the future. Relying on
> stale jiffies value and using that as base can be inaccurate to determine
> whether next event occurs within next tick. Fix that.

I'm failing to decode this word salad.

> XXX: Need to fix issue in idle accounting which does 'jiffies -
> idle_entrytime'. If idle_entrytime is more current than jiffies, it
> could cause negative values. I could replace jiffies with idle_exittime
> in this computation potentially to fix that.

So you "fix" some yet to be correctly described issue by breaking stuff?

> static ktime_t tick_nohz_next_event(struct tick_sched *ts, int cpu)
> {
> - u64 basemono, next_tick, delta, expires, delta_hr, next_hr_wo;
> + u64 basemono, next_tick, delta, expires, delta_hr, next_hr_wo, boot_ticks;
> unsigned long basejiff;
> int tick_cpu;
>
> - basemono = get_jiffies_update(&basejiff);
> + boot_ticks = DIV_ROUND_DOWN_ULL(ts->idle_entrytime, TICK_NSEC);

Again, this div/mult is more expensive than the sequence count on 32-bit.
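
To illustrate the two approaches being compared, here is a rough
user-space sketch using C11 atomics. It is an approximation under stated
assumptions, not the kernel's implementation: struct tick_state,
publish_update(), read_update() and ns_to_ticks() are hypothetical names,
a single writer is assumed, and nothing here measures cost. The read side
is a retry loop over a sequence counter; the alternative is a 64-bit
division, which on 32-bit machines without a native 64-bit divide
typically ends up in a helper routine.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for the jiffies / last update time pair. */
struct tick_state {
	atomic_uint seq;                 /* odd while an update is in flight */
	_Atomic uint64_t last_update_ns; /* time of the last tick update */
	_Atomic unsigned long ticks;     /* tick counter at that time */
};

/* Writer side (single writer assumed): bump seq around the update. */
static void publish_update(struct tick_state *s, uint64_t now_ns,
			   unsigned long ticks)
{
	unsigned int seq = atomic_load_explicit(&s->seq, memory_order_relaxed);

	atomic_store_explicit(&s->seq, seq + 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&s->last_update_ns, now_ns, memory_order_relaxed);
	atomic_store_explicit(&s->ticks, ticks, memory_order_relaxed);
	atomic_store_explicit(&s->seq, seq + 2, memory_order_release);
}

/* Reader side: retry until both values come from the same update. */
static uint64_t read_update(struct tick_state *s, unsigned long *basej)
{
	unsigned int seq;
	uint64_t basem;

	do {
		seq = atomic_load_explicit(&s->seq, memory_order_acquire);
		*basej = atomic_load_explicit(&s->ticks, memory_order_relaxed);
		basem = atomic_load_explicit(&s->last_update_ns,
					     memory_order_relaxed);
		atomic_thread_fence(memory_order_acquire);
	} while ((seq & 1) ||
		 seq != atomic_load_explicit(&s->seq, memory_order_relaxed));

	return basem;
}

/* The alternative: derive a tick count from a timestamp by division. */
static uint64_t ns_to_ticks(uint64_t ns, uint64_t tick_ns)
{
	assert(tick_ns != 0);
	return ns / tick_ns;
}
```

In the uncontended case the reader does a handful of loads and one
compare, while ns_to_ticks() always pays for a full 64-bit divide.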

> -/*
> - * Read jiffies and the time when jiffies were updated last
> - */
> -u64 get_jiffies_update(unsigned long *basej)

How does this even compile? This function is global for a reason.

Thanks,

tglx