Re: [patch 2/2] timekeeping: Always check for negative motion

From: Guenter Roeck
Date: Thu Nov 28 2024 - 12:13:27 EST


On 11/28/24 07:57, Guenter Roeck wrote:
On 11/28/24 06:51, Thomas Gleixner wrote:
On Wed, Nov 27 2024 at 15:02, Guenter Roeck wrote:
On 11/27/24 14:08, John Stultz wrote:
An example log is at [1]. It says

clocksource: npcm7xx-timer1: mask: 0xffffff max_cycles: 0xffffff, max_idle_ns: 597268854 ns

That's a 24-bit counter. So negative motion happens when the readouts are
more than (1 << 23) apart. AFAICT the counter runs at about 14 MHz, but
I'd like to have that confirmed.
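
To make the wrap arithmetic concrete, here is a minimal userspace sketch
(not kernel code; the masked_delta()/looks_negative() helpers are made up
for illustration, only the 0xffffff mask comes from the log above) of how
a masked delta on a 24-bit counter reads as negative motion once the two
readouts are more than (1 << 23) apart:

#include <stdint.h>
#include <stdio.h>

#define CS_MASK 0xffffffULL		/* 24-bit counter, as in the log above */

/* Masked difference between two counter readouts. */
static uint64_t masked_delta(uint64_t now, uint64_t last)
{
	return (now - last) & CS_MASK;
}

/* A delta of half the counter width or more reads as negative motion. */
static int looks_negative(uint64_t delta)
{
	return delta >= (1ULL << 23);
}

int main(void)
{
	/* Counter wrapped a little: delta is 0x20, still forward motion. */
	printf("%d\n", looks_negative(masked_delta(0x000010, 0xfffff0)));	/* 0 */
	/* Readouts more than (1 << 23) apart: flagged as negative motion. */
	printf("%d\n", looks_negative(masked_delta(0x900000, 0x000000)));	/* 1 */
	return 0;
}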

clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 19112604462750000 ns
...
clocksource: Switched to clocksource npcm7xx-timer1

I don't know where exactly it stalls; sometime after handover to userspace.
I'll be happy to do some more debugging, but you'll need to let me know what
to look for.

On that platform max_idle_ns should correspond to 50% of the counter
width. So if both CPUs go deep idle for max_idle_ns, then the next timer
interrupt doing the timekeeping advancement sees a delta of > (1 << 23)
and timekeeping stalls.
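
For what it's worth, a quick back-of-the-envelope check (plain userspace C,
assuming the still unconfirmed ~14 MHz rate from above) lands close to the
max_idle_ns value in the boot log, which fits the 50%-of-counter-width theory:

#include <stdio.h>

int main(void)
{
	const double freq_hz = 14.0e6;		/* assumed, not yet confirmed */
	const double half_counter = 1 << 23;	/* 50% of the 24-bit counter */

	/* Time needed to accumulate half the counter width. */
	printf("%.0f ns\n", half_counter / freq_hz * 1e9);

	/* Prints ~599186286 ns, close to max_idle_ns: 597268854 ns. */
	return 0;
}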

If my assumption is correct, then the below should fix it.


While that didn't work, the following code does.

Guenter

---
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index 0ca85ff4fbb4..bd88c04ae357 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -2190,7 +2190,7 @@ static u64 logarithmic_accumulation(struct timekeeper *tk, u64 offset,
        /* Accumulate one shifted interval */
        offset -= interval;
        tk->tkr_mono.cycle_last += interval;
-       tk->tkr_raw.cycle_last  += interval;
+       tk->tkr_raw.cycle_last  = (tk->tkr_raw.cycle_last + interval) & tk->tkr_mono.mask;
^^^^^^^ ^^^^^^^^

No idea what I was testing earlier, but that obviously doesn't work either, and masking
both tkr_raw.cycle_last and tk->tkr_mono.cycle_last also doesn't work. Sorry for the noise.

Guenter