Re: [PATCH v7 2/2] clocksource: add J-Core timer/clocksource driver

From: Thomas Gleixner
Date: Sat Oct 08 2016 - 13:06:17 EST


On Sat, 8 Oct 2016, Rich Felker wrote:
> On Sat, Oct 08, 2016 at 01:32:06PM +0200, Thomas Gleixner wrote:
> > CPU spins and waits for an interrupt to happen
> >
> >
> > <idle>-0 [000] d... 150.841530: rcu_dyntick: End 0 1
> >
> > Dropping out of the spin about the time we expect the PIT interrupt to
> > fire. And an interrupt is the reason why we drop out because the 'need
> > resched' flag is not set and we end up in:
>
> OK, I missed that.
>
> > <idle>-0 [000] d.s. 150.841724: tick_irq_enter: update jiffies via irq
> >
> > which is called from irq_enter()
> >
> > <idle>-0 [000] d.s. 150.841918: tick_do_update_jiffies64: finished do_timer(1)
> > <idle>-0 [000] d.s. 150.842348: tick_do_update_jiffies64: finished updating jiffies
> >
> >
> > So here we would expect:
> >
> > irq_handler_entry: irq=16 name=jcore_pit
> > ...
> > irq_handler_exit .....
> >
> >
> > or any other interrupt being invoked. But we see this here:
>
> According to the trace the 'irq' is soft ('s'). I couldn't find the
> code path from the idle loop to softirq but so maybe this is a bug. Is
> it possible that in some cases the arch irq entry does not get
> identified as a hard irq or traced and then the subsequent tick code
> thinks it's running in a softirq and behaves differently? I could add
> more tracing around irq entry.

No.

irq_enter()
if (is_idle_task(current) && !in_interrupt()) {
local_bh_disable();
tick_irq_enter();
local_bh_enable();
}
__irq_enter()
preempt_count_add(HARDIRQ_OFFSET);

So the 's' comes from local_bh_disable() and the code does not behave
special because of that. It's just always that way. Look at all the other
instances of tick_irq_enter() which do not show that issue.

> > <idle>-0 [000] d... 150.842603: __tick_nohz_idle_enter: can stop idle tick
> >
> > And that's just wrong.
>
> Can you explain?

Because you drop out the idle spin due to an interrupt, but no interrupt is
handled according to the trace. You just go back to sleep and the trace is
full of this behaviour.

> > Both CPUs have the same interrupt number for their per cpu PIT interrupt.
> >
> > IIRC you said earlier when the percpu interrupt discussion happened, that
> > your per cpu PITs have distinct interrupt numbers, but that does not seem
> > the case here.
>
> No, I said the opposite. They use the same irq number but they're each
> wired to their own cpu, and there's no way for them to fire on the
> wrong cpu.

Your patch does:

> + err = request_irq(pit_irq, jcore_timer_interrupt,
> + IRQF_TIMER | IRQF_PERCPU,
> + "jcore_pit", jcore_pit_percpu);

which is the wrong thing to do. You need request_percpu_irq() here, but of
course with the irq chip you are using, which has every callback as a noop,
it does not matter at all. So that's not an issue and not the reason for
this wreckage.

But what you need to figure out is why this happens:

100.00x hrtimer_start(expires = 100.01)
100.00x idle spin
100.01x tick_irq_enter()
100.01x tick_stop()

i.e. why do you drop out of your idle spin, take the low level interrupt
entry path which calls irq_enter() -> tick_irq_enter() and then you do not
handle any interrupt at all and drop back into the idle loop. That's the
real question and that's a problem in your architecture low level interrupt
entry code or some hardware related issue. But certainly not a bug in the
core tick code.

You can come up with another boat load of weird theories about bugs in
that code, but I won't even have a look _before_ you can explain why the
above sequence happens and the PIT timer interrupt is not handled.

Thanks,

tglx