Re: [PATCH] m68k: coldfire: Prevent spurious interrupts when masking IMR

From: Geert Uytterhoeven
Date: Wed Feb 05 2025 - 03:14:46 EST


Hi Jean-Michel,

On Wed, 5 Feb 2025 at 08:07, Jean-Michel Hautbois
<jeanmichel.hautbois@xxxxxxxxxx> wrote:
> On 04/02/2025 20:27, Geert Uytterhoeven wrote:
> > On Tue, 4 Feb 2025 at 19:38, Jean-Michel Hautbois
> > <jeanmichel.hautbois@xxxxxxxxxx> wrote:
> >> The ColdFire interrupt controller can generate spurious interrupts if an
> >> interrupt source is masked in the IMR while the CPU interrupt priority
> >> mask (SR[I]) is set lower than the interrupt level.
> >>
> >> The reference manual states:
> >>
> >> To avoid this situation for interrupts sources with levels 1-6, first
> >> write a higher level interrupt mask to the status register, before
> >> setting the mask in the IMR or the module’s interrupt mask register.
> >> After the mask is set, return the interrupt mask in the status register
> >> to its previous value.
> >>
> >> It can be tested like this:
> >> - Prepare an iperf3 server on the coldfire target (iperf3 -s -D)
> >> - Start a high priority cyclictest:
> >> cyclictest --secaligned -m -p 99 -i 2500 -q
> >> - Start iperf3 -c $COLDFIRE_IP -t 0
> >>
> >> After a few seconds the dmesg may display:
> >> [ 84.784301] irq 24, desc: dbc502da, depth: 1, count: 0, unhandled: 0
> >> [ 84.784455] ->handle_irq(): 0ba0aca3, handle_bad_irq+0x0/0x1e0
> >> [ 84.784610] ->irq_data.chip(): c6779d4f, 0x41652544
> >> [ 84.784719] ->action(): 00000000
> >> [ 84.784770] unexpected IRQ trap at vector 18
> >>
> >> With this patch, I never saw it in a few hours of testing.
> >>
> >> Signed-off-by: Jean-Michel Hautbois <jeanmichel.hautbois@xxxxxxxxxx>
> >
> > Thanks for your patch!
> >
> >> --- a/arch/m68k/coldfire/intc-simr.c
> >> +++ b/arch/m68k/coldfire/intc-simr.c
> >> @@ -58,6 +58,14 @@ static inline unsigned int irq2ebit(unsigned int irq)
> >>
> >> #endif
> >>
> >> +static inline void intc_irq_setlevel(unsigned long level)
> >> +{
> >> + asm volatile ("move.w %0,%%sr"
> >> + : /* no outputs */
> >> + : "d" (0x2000 | ((level) << 8))
> >> + : "memory");
> >> +}
> >> +
> >> /*
> >> * There maybe one, two or three interrupt control units, each has 64
> >> * interrupts. If there is no second or third unit then MCFINTC1_* or
> >> @@ -67,13 +75,17 @@ static inline unsigned int irq2ebit(unsigned int irq)
> >> static void intc_irq_mask(struct irq_data *d)
> >> {
> >> unsigned int irq = d->irq - MCFINT_VECBASE;
> >> + unsigned long flags = arch_local_save_flags();
> >>
> >> + intc_irq_setlevel(7);
> >
> > Can't all of the above just be replaced by
> >
> > unsigned long flags = arch_local_irq_save();
>
> The only change is the Supervisor bit in SR which is not changed in
> arch_local_irq_disable() while it is forced to 1 in my function (setting
> it to 0x2700 AFAICT).
>
> But I can confirm I couldn't see the issue with this code, while using
> the existing arch_local_irq_save() it still appears (less frequently
> than without it at all, but still).
>
> Any suggestion :-) ?

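There are other differences: your version overwrites the whole SR, so
it also clears all other bits, incl. the condition codes and the
master/interrupt state, whereas arch_local_irq_save() only raises the
interrupt mask and leaves the rest of SR alone.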
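
As a rough illustration, here is a user-space model of the two ways of
raising the mask. It is not kernel code: the function names and the
sample SR value are made up, and it assumes that the ColdFire
arch_local_irq_disable() ORs 0x0700 into SR, while intc_irq_setlevel()
from the patch writes a fresh value.

```c
#include <stdint.h>

/* Status register fields (ColdFire programming model). */
#define SR_CCR	0x001f	/* condition codes X, N, Z, V, C */
#define SR_IPL	0x0700	/* interrupt priority mask, SR[I] */
#define SR_M	0x1000	/* master/interrupt state */
#define SR_S	0x2000	/* supervisor */
#define SR_T	0x8000	/* trace */

/* Model of intc_irq_setlevel(): the old SR contents are discarded. */
static uint16_t sr_after_setlevel(uint16_t old_sr, unsigned long level)
{
	(void)old_sr;
	return (uint16_t)(SR_S | (level << 8));
}

/* Model of arch_local_irq_disable(): only the mask field is raised. */
static uint16_t sr_after_irq_disable(uint16_t old_sr)
{
	return old_sr | SR_IPL;
}

/* Bits that end up different between the two approaches. */
static uint16_t sr_difference(uint16_t old_sr)
{
	return sr_after_setlevel(old_sr, 7) ^ sr_after_irq_disable(old_sr);
}
```

For a hypothetical starting value of 0x3404 (S, M, level 4, Z set), the
first model yields 0x2700 and the second 0x3704, so the difference is
0x1004, i.e. the M bit and one condition code.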

Can you save the flags above in a global, and print it in the
unexpected IRQ handler, to see which other bits are set when
it happens?
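
Something along these lines might do for the debugging, sketched here
as plain C so it can be tried outside the kernel. The names
dbg_last_mask_sr, dbg_record_sr() and dbg_format_sr() are hypothetical;
in the kernel the value would come from arch_local_save_flags() in
intc_irq_mask(), and the formatted string would go through pr_info()
or similar in the unexpected IRQ path.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical debug global: SR as seen on entry to intc_irq_mask(). */
static volatile uint16_t dbg_last_mask_sr;

/* Call with the saved flags before the interrupt level is raised. */
static void dbg_record_sr(uint16_t sr)
{
	dbg_last_mask_sr = sr;
}

/* Break SR down into its fields so the stray bits are easy to spot. */
static int dbg_format_sr(uint16_t sr, char *buf, size_t len)
{
	unsigned int v = sr;

	return snprintf(buf, len, "SR=%04x T=%u S=%u M=%u I=%u CCR=%02x",
			v,
			(v >> 15) & 1,	/* trace */
			(v >> 13) & 1,	/* supervisor */
			(v >> 12) & 1,	/* master/interrupt state */
			(v >> 8) & 7,	/* interrupt priority mask */
			v & 0x1f);	/* condition codes */
}
```

With a made-up recorded value of 0x3404, the formatted line reads
"SR=3404 T=0 S=1 M=1 I=4 CCR=04".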

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@xxxxxxxxxxxxxx

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds