[PATCH] nohz: Fix spurious warning when hrtimer and clockevent get out of sync

From: Frederic Weisbecker
Date: Thu Jun 08 2017 - 15:07:17 EST


On Wed, Jun 07, 2017 at 09:36:45PM +0000, Levin, Alexander (Sasha Levin) wrote:
> On Wed, Jun 07, 2017 at 04:14:03PM +0200, Frederic Weisbecker wrote:
> > On Wed, Jun 07, 2017 at 04:17:41AM +0000, Levin, Alexander (Sasha Levin) wrote:
> > > > Thanks Sasha!
> > > >
> > > > I couldn't reproduce it, that config boots fine on my kvm.
> > > > Would you have the time to dump some traces for me?
> > > >
> > > > I'd need you to add this boot option: trace_event=hrtimer_cancel,hrtimer_start,hrtimer_expire_entry
> > > > And boot your kernel with the below patch. This will dump the timer traces to your console.
> > > > I would be very interested in the resulting console dump file.
> > >
> > > Attached. Let me know if you need anything else.
> >
> > Great! So now I can deduce that the problem doesn't come from the nohz code as
> > ts->next_tick matches the hrtimer deadline. But the dev->next_event from the
> > clockevent seems to be out of line.
> >
> > Sorry to bother you again, but I've been chasing this bug for several weeks now and
> > you're one of the rare people who can reproduce it. So I may need some more
> > tracing details.
>
> I take payment in beers ;)

Duly noted ;-)

>
> But really, not a problem.
>
> > Here is another version of the debugging patch (not a delta). I added more trace_printk
> > calls, namely at the places where we set dev->next_event. Can you please apply the below
> > and do the dump again?
>
> Attached.

Awesome, these traces have been very helpful! So now I think I get what's going on.
Can you please test the following fix?

Thanks a lot!

---