Re: UML time-travel warning from __run_timers

From: Johannes Berg
Date: Mon Apr 04 2022 - 03:05:48 EST


On Sun, 2022-04-03 at 21:51 +0200, Thomas Gleixner wrote:
>
> > There was no timer. If there's ever a timer on this base (BASE_DEF) then
> > this doesn't happen.
>
> You said:
>
> > > > init_timer_cpu(0) base 0 clk=0xffff8ad0, next_expiry=0x13fff8acf
> > > > init_timer_cpu(0) base 1 clk=0xffff8ad0, next_expiry=0x13fff8acf
>
> which confused me. It's actually initialized to:
>
> base->clk + NEXT_TIMER_MAX_DELTA
>
> but that's fine and it is overwritten by every timer which is inserted
> to expire before that. So that's not an issue as the prandom timer is
> firing and rearmed.

No, as I said before, there's never any timer with base 1 (BASE_DEF) in
the config we have. The prandom timer is not TIMER_DEFERRABLE (it
probably could be, but it's not now). There's no deferrable timer at
all. Once there is at least one, the warning goes away.

> That would not happen if next_expiry would stay at 0x13fff8acf. The
> first one in your trace expires at 5339070200, i.e. 0x13e3bbef8, which
> is way before that.

But that timer is not deferrable, so it's queued on the other timer
wheel (BASE_STD) and doesn't affect the "base 1" next_expiry above.

> Can you please apply the debug patch below and run with the same
> parameters as before?
>
> Thanks,
>
> tglx
> ---
> Hint: I tried to figure out how to use that time travel muck, but did
> not get to the point where I bothered to try myself. Might be
> either my incompetence or lack of documentation. Clearly the bug
> report lacks any hint how to reproduce that problem.

Well, the original bug report did have all the information, I gave the
link to it before:

https://lore.kernel.org/r/20220330110156.GA9250@xxxxxxxx

With that kernel config and command line, you can reproduce it easily.
All you need to know is to use "make ARCH=um" with that .config file :)


> + trace_printk("RUN: now=%lu clk=%lu next_expiry=%lu recalc=%d\n",
> + jiffies, base->clk, base->next_expiry,
> + base->next_expiry_recalc);

IMHO all of this extra debug is a waste of time since you're not
differentiating the two bases anywhere. You'll just get confused (as
above) since timers do happen on BASE_STD, just not on BASE_DEF.

johannes