Re: [PATCH] firewire: ohci: mask bus reset interrupts between ISR and bottom half

From: Takashi Sakamoto
Date: Mon Apr 01 2024 - 08:18:12 EST


Hi Adam,

On Mon, Mar 25, 2024 at 09:58:28AM +0900, Takashi Sakamoto wrote:
> Hi,
>
> On Wed, Mar 20, 2024 at 02:15:19AM -0700, Adam Goldman wrote:
> > In the FireWire OHCI interrupt handler, if a bus reset interrupt has
> > occurred, mask bus reset interrupts until bus_reset_work has serviced and
> > cleared the interrupt.
> >
> > Normally, we always leave bus reset interrupts masked. We infer the bus
> > reset from the self-ID interrupt that happens shortly thereafter. A
> > scenario where we unmask bus reset interrupts was introduced in 2008 in
> > a007bb857e0b26f5d8b73c2ff90782d9c0972620: If
> > OHCI_PARAM_DEBUG_BUSRESETS (8) is set in the debug parameter bitmask, we
> > will unmask bus reset interrupts so we can log them.
> >
> > irq_handler logs the bus reset interrupt. However, we can't clear the bus
> > reset event flag in irq_handler, because we won't service the event until
> > later. irq_handler exits with the event flag still set. If the
> > corresponding interrupt is still unmasked, the first bus reset will
> > usually freeze the system due to irq_handler being called again each
> > time it exits. This freeze can be reproduced by loading firewire_ohci
> > with "modprobe firewire_ohci debug=-1" (to enable all debugging output).
> > Apparently there are also some cases where bus_reset_work will get called
> > soon enough to clear the event, and operation will continue normally.
> >
> > This freeze was first reported a few months after a007bb85 was committed,
> > but until now it was never fixed. The debug level could safely be set
> > to -1 through sysfs after the module was loaded, but this would be
> > ineffectual in logging bus reset interrupts since they were only
> > unmasked during initialization.
> >
> > irq_handler will now leave the event flag set but mask bus reset
> > interrupts, so irq_handler won't be called again and there will be no
> > freeze. If OHCI_PARAM_DEBUG_BUSRESETS is enabled, bus_reset_work will
> > unmask the interrupt after servicing the event, so future interrupts
> > will be caught as desired.
> >
> > As a side effect to this change, OHCI_PARAM_DEBUG_BUSRESETS can now be
> > enabled through sysfs in addition to during initial module loading.
> > However, when enabled through sysfs, logging of bus reset interrupts will
> > be effective only starting with the second bus reset, after
> > bus_reset_work has executed.
> >
> > Signed-off-by: Adam Goldman <adamg@xxxxxxxxx>
> > ---
> >
> > --- linux-6.8-rc1.orig/drivers/firewire/ohci.c 2024-01-21 14:11:32.000000000 -0800
> > +++ linux-6.8-rc1/drivers/firewire/ohci.c 2024-03-12 01:15:10.000000000 -0700
> > @@ -2060,6 +2060,8 @@ static void bus_reset_work(struct work_s
> >
> > ohci->generation = generation;
> > reg_write(ohci, OHCI1394_IntEventClear, OHCI1394_busReset);
> > + if (param_debug & OHCI_PARAM_DEBUG_BUSRESETS)
> > + reg_write(ohci, OHCI1394_IntMaskSet, OHCI1394_busReset);
> >
> > if (ohci->quirks & QUIRK_RESET_PACKET)
> > ohci->request_generation = generation;
> > @@ -2125,12 +2127,14 @@ static irqreturn_t irq_handler(int irq,
> > return IRQ_NONE;
> >
> > /*
> > - * busReset and postedWriteErr must not be cleared yet
> > + * busReset and postedWriteErr events must not be cleared yet
> > * (OHCI 1.1 clauses 7.2.3.2 and 13.2.8.1)
> > */
> > reg_write(ohci, OHCI1394_IntEventClear,
> > event & ~(OHCI1394_busReset | OHCI1394_postedWriteErr));
> > log_irqs(ohci, event);
> > + if (event & OHCI1394_busReset)
> > + reg_write(ohci, OHCI1394_IntMaskClear, OHCI1394_busReset);
> >
> > if (event & OHCI1394_selfIDComplete)
> > queue_work(selfid_workqueue, &ohci->bus_reset_work);
>
> Thanks for the patch. I pushed topic branch[1] for it, since I'm
> considering about whether to send it to stable and longterm releases.
>
> I had realized that the debug=8 for firewire-ohci module provides tons
> of logs triggering by the irq handler, since the irq for bus reset is
> not unmasked, so I rely on selfID events when debugging bus-reset. I have
> few objections to the change.
>
> My concern is how much invasive it is. The unmasking is kept until
> bus_reset_work() is executed in the workqueue. When considering about
> the delay of workqueue (since it is a kind of schedulable task), many
> irq events for bus reset is potentially skipped from the logging. Of
> course, it is the aim of change.
>
> Let me take more time to evaluate the change, but I'm willing to send it
> to upstream until -rc3 or -rc4, at least, if receiving no objections
> from the others.
>
> [1] https://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394.git/log/?h=v6.9-firewire-mask-bus-reset-event-during-handling

As long as I tested, the patch is enough good to suppress the spurious
IRQ event. I'll send it to mainline in this weekend.

I sent an additional patch[1] to handle the bus-reset event at the first
time. I'd like you to review and test it as well, especially under your
environment in which 1394:1995 and 1394a phys exist.

[1] https://lore.kernel.org/lkml/20240401121200.220013-1-o-takashi@xxxxxxxxxxxxx/


Thanks

Takashi Sakamoto