Re: next-20130627 breaks i.MX6 sabre sd UART console

From: Thomas Gleixner
Date: Mon Jul 01 2013 - 17:24:53 EST


On Mon, 1 Jul 2013, Stephen Boyd wrote:
> On 07/01/13 13:14, Thomas Gleixner wrote:
> > The issue is very subtle. What happens is:
> >
> > CPU0                                        CPU1
> >
> > Switch to oneshot mode
> >
> >  Copy the bits from tick_broadcast_mask to
> >  tick_broadcast_oneshot_mask. We need to do
> >  that so the other cpus reach the timer irq
> >  and the softirq which switches them to
> >  oneshot.
> >
> >  Kick the broadcast device into oneshot.
> >
> >                                             Timer interrupt fires
> >
> >                                             irq_enter sees the cpu in
> >                                             tick_broadcast_oneshot_mask and
> >                                             sets the device to oneshot mode
> >
> >                                             handle_periodic:
> >                                               Sees oneshot mode and adds
> >                                               period to
> >                                               dev->next_event(KTIME_MAX)
> >
>
> Yep. It is also racing with the timer interrupt so having more than two
> CPUs must help widen the window (which is why we see it on the higher
> numbered CPUs).

The race above is about the timer interrupt. You mean the broadcast
one which is still enabled due to the dummy -> functional transition
issue, right? That helps a lot to make this more visible, because we
double the number of events.
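
The last step of the quoted sequence (the periodic handler adding one
period to a next_event that still holds KTIME_MAX) can be illustrated
with a small user-space sketch. This is not the kernel code: the names
SKETCH_KTIME_MAX and sketch_add_period() are hypothetical, and the
expiry is modelled as a plain signed 64-bit nanosecond value, which is
an assumption made only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for KTIME_MAX: largest signed 64-bit value. */
#define SKETCH_KTIME_MAX INT64_MAX

/*
 * Add one tick period to an expiry time, the way a periodic handler
 * would do for a device running in oneshot mode.
 *
 * Returns true when the sum cannot be represented. *result then holds
 * the two's complement wrapped value (as produced by gcc and clang),
 * which is negative, i.e. it no longer describes a future expiry.
 */
static bool sketch_add_period(int64_t next_event, int64_t period_ns,
                              int64_t *result)
{
    bool overflow = period_ns > 0 && next_event > INT64_MAX - period_ns;

    /* Do the addition unsigned to avoid signed overflow, then convert. */
    *result = (int64_t)((uint64_t)next_event + (uint64_t)period_ns);
    return overflow;
}

/* Usage example: a 4 ms period (HZ=250 is assumed here). */
static void sketch_demo(void)
{
    int64_t next;

    /* Normal case: the device has a real expiry time programmed. */
    assert(!sketch_add_period(1000, 4000000, &next));
    assert(next == 4001000);

    /* Race case: next_event is still KTIME_MAX when the period is added. */
    assert(sketch_add_period(SKETCH_KTIME_MAX, 4000000, &next));
    assert(next < 0);
}
```

The sketch only shows why KTIME_MAX plus a period cannot yield a usable
expiry; what the real handler then programs into the device depends on
the kernel's ktime implementation for the given configuration.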

> > + * because the CPU is running and therefor not
>
> s/therefor/therefore/

Duh. That one haunts me forever.

/me goes off to split the patch into two separate fixes, add proper
changelogs and wait for Vincent's confirmation.

I really wish that x86 had been the only architecture which
made use of that broadcast nonsense. Though the ARM folks went there
and created the same mess as x86 but raised to the power of N, where

N = Number of odd ARM chips designed by morons who thought that
copying the already publicly documented idiocy of x86 is a
brilliant idea.

Thanks,

tglx

