Re: Help with IRQ-MSI-IRQ bridges

From: LuÃs Matallui
Date: Fri Apr 03 2020 - 17:46:38 EST


Hey Marc,

Just wanted to close this thread and thank you for the help.
Using that handle_fasteoi_ack_irq() saved my life.

I was just having trouble unrelated to this, but some poorly documented SoC.

Thanks again!
Luis

On Fri, 3 Apr 2020 at 07:26, LuÃs Matallui <matallui@xxxxxxxxx> wrote:
>
> On Fri, 3 Apr 2020 at 06:03, Marc Zyngier <maz@xxxxxxxxxx> wrote:
> >
> > On Fri, 3 Apr 2020 05:13:35 -0600
> > LuÃs Matallui <matallui@xxxxxxxxx> wrote:
> >
> > > Hi Marc,
> > >
> > > On Fri, 3 Apr 2020 at 04:18, Marc Zyngier <maz@xxxxxxxxxx> wrote:
> > > >
> > > > Hi Luis,
> > > >
> > > > On 2020-04-03 02:35, LuÃs Matallui wrote:
> > > > > Hi,
> > > > >
> > > > > I've got this SoC which uses IRQ-MSI and MSI-IRQ bridges in order to
> > > > > get interrupts from devices external to the ARM subsystem.
> > > > > I already got some pointers from Maz and have been able to create the
> > > > > drivers with the stacked domains and can now see the mappings working
> > > > > fine across domains.
> > > > >
> > > > > Maz pointed me to the Marvell mvebu-gicp (for my MSI controller, a.k.a
> > > > > MSI-IRQ bridge) and to mvebu-icu for the MSI client (IRQ-MSI bridge).
> > > > >
> > > > > I now have the interrupts working, but it seems like I'm missing a
> > > > > bunch of them. And therefore my device doesn't work properly.
> > > > > The main difference between my HW and Marvell's is that my IRQs are
> > > > > not level-triggered and the MSIs don't support the two messages for
> > > > > level-triggered interrupts.
> > > >
> > > > Which is probably a very good thing, as long as all your devices
> > > > generate
> > > > only edge-triggered interrupts.
> > > >
> > > > >
> > > > > To illustrate my system:
> > > > >
> > > > > DEV --line--> IRQ-MSI Bridge (MSIC) --msi--> MSI-IRQ Bridge (GICP)
> > > > > --line--> GICv2
> > > > >
> > > > > For MSIC, all I can do is configure the address and data for the MSI,
> > > > > and I believe on every rising edge of the Device IRQ, an MSI is sent.
> > > > > For GICP, all I have is a doorbell and a way to enable/disable it, and
> > > > > whenever the doorbell is enabled and has a value != 0, the IRQ line to
> > > > > GICv2 gets asserted.
> > > > >
> > > > > The first thing I noticed is that when I get an interrupt, the IRQ
> > > > > flow goes like:
> > > > >
> > > > > handle_irq();
> > > > > irq_eoi();
> > > > >
> > > > > So, I guess my first question here is, how can I guarantee that I
> > > > > don't get another MSI whilst in handle_irq()?
> > > >
> > > > At the GIC level, once the interrupt is Ack'd, anything that is signed
> > > > after this ack is a separate interrupt. It will be made pending and will
> > > > fire once the GIC driver EOIs the first one.
> > >
> > > The thing here is, there is no Ack, or at least my irqchips are not getting
> > > the irq_ack() callback, which is where I was expecting to clear the doorbell.
> >
> > Not getting an irq_ack() here is expected, as the GIC uses the fast_eoi
> > flow, which doesn't use irq_ack() at all. If you really want irq_ack()
> > to be called, you'll need to change that flow for the specified
> > interrupts (handle_fasteoi_ack_irq is probably of interest).
> >
>
> I had tried level and trigger flow handlers and was always getting weird hangs,
> so never even attempted that, but with that handler I do get the Ack calls now.
>
> I'm still missing the interrupts, so I'm starting to believe I'm just
> doing something
> else wrong.
>
> > >
> > > >
> > > > > If I do, then I will clear the doorbell on irq_eoi() (because that's
> > > > > my only choice) and will lose the queued IRQs.
> > > >
> > > > Why do you need to do anything at the doorbell level? This is just a
> > > > write,
> > > > so there should be nothing to clear. If you do need to clear anything,
> > > > then your MSI-IRQ bridge isn't stateless as it should, and you'll need
> > > > to
> > > > give much more details about the HW. Do you have a pointer to the TRM
> > > > for your HW?
> > >
> > > The hardware is really simple. On the MSI controller (GICP) side, each
> > > interrupt only has 3 registers: 1 status, 1 mask and 1 clear. When an
> > > MSI lands a write on the status register, it asserts the interrupt line.
> > > The interrupt stays asserted until we clear the status (using the clear
> > > register). The mask register is just to enable the interrupt basically.
> > > The MSI data is really irrelevant, as long as it's non-zero we always
> > > obtain the same result.
> >
> > Does it mean it asserts a level each time it gets an edge, and you need
> > to clear the MSI to allow another one? If so, that's a bit silly. it
> > would have made a lot more sense to leave it flowing to the GIC where
> > all the logic is already present.
>
> Yes! I totally agree this is silly, but I'm pretty sure that's how it
> works. I will
> try to reach out to the SoC team to double check, but from their documentation,
> they say the interrupt line will stay asserted as long as the doorbell status
> value is different than 0, so until we manually clear it, it stays asserted.
>
> > >
> > > On the MSI client side, we only configure the MSI address and data for
> > > a certain device interrupt line, and for each rising edge, an MSI gets issued.
> > >
> > > >
> > > > > It also seems that I'm missing IRQs in the beginning after probing the
> > > > > device, and before it was working for me when I was setting up all
> > > > > these registers manually and simply using GICv2 as my only interrupt
> > > > > controller.
> > > >
> > > > Well, setting all of this in firmware is always the preferred option
> > > > if you don't expect things to change dynamically.
> > >
> > > Well, the solution I have now works perfectly for the configuration, because
> > > the MSIC gets configured by msi_compose_msg -> msi_write_msg at IRQ
> > > allocation time and never gets touched again.
> > >
> > > Then when the IRQ gets activated, the GICP is unmasking the interrupt but
> > > enabling the doorbell (setting the mask register).
> > >
> > > The only thing I really need is to intercept every MSI before the handler so
> > > I can Ack it by clearing the doorbell status register.
> >
> > See above.
> >
> > > > > I do see the unmask() ops being called for all my stacked irqchips, so
> > > > > I don't understand how I'm missing so many interrupts.
> > > >
> > > > unmask is just a static configuration to enable the interrupt. There
> > > > shouldn't
> > > > be that many calls to that later on unless an endpoint driver
> > > > disables/enables
> > > > interrupts by hand.
> > > >
> > > > Please give us a bit more details to understand the context, as there is
> > > > only
> > > > so much I can do with so little HW information.
> > > >
> > > > Thanks,
> > > >
> > > > M.
> > > > --
> > > > Jazz is not dead. It just smells funny...
> > >
> > > Let me know if that is good enough information. There's really not much on
> > > the HW side.
> >
> > If I'm correct above, I'd say there is a bit too much there! ;-)
> >
> > M.
> > --
> > Jazz is not dead. It just smells funny...
>
> Thanks again Marc!
> Let me try to reach out to the SoC guys and try to figure out what
> else I'm doing
> wrong. I will give you an update once I figure it out.
>
> -- Luis