Re: Warning seen when removing a module using irqdomain framework

From: pheragu
Date: Thu Jul 25 2019 - 17:41:51 EST

On 2019-07-23 23:51, Marc Zyngier wrote:
On Tue, 23 Jul 2019 14:52:34 -0700
pheragu@xxxxxxxxxxxxxx wrote:

Hi Prakruthi,


I have been working on a interrupt controller driver that uses tree
based mapping for its domain (irq_domain_add_tree(..)).
If I understand correctly, the clients get a mapping when they call
platform_get_irq(..). However, after these clients are removed
(rmmod), when I try to remove the interrupt controller driver where
it calls irq_domain_remove(..), I hit this warning from
kernel/kernel/irq/irqdomain.c:: irq_domain_remove(..)
WARNING: CPU: 0 PID: 238 at /kernel/kernel/irq/irqdomain.c:246 irq_domain_remove+0x84/0x98

Also, I see that the requested IRQs by the clients are still present
(in /proc/interrupts) even after they had been removed. Hence, I just
wanted to know how to handle this warning. Should the client clean up
by calling irq_dispose_mapping(..) or is it the responsibility of the
interrupt controller driver to dispose the mappings one by one?

In general, building interrupt controller drivers as a module is a
pretty difficult thing to do in a safe manner. As you found out, this
relies on the irq_domain being "emptied" before it can be freed. There
are some other gotchas in the rest of the IRQ stack as well.

Doing that is hard. One of the reasons is that the OF subsystem will
happily allocate all the interrupts it can even if there is no driver
having requested them (see of_platform_populate). This means that you
cannot track whether a client driver is using one of the interrupt your
irqchip is in charge of. You can apply some heuristics, but they are in
general all wrong.

Fixing the OF subsystem is possible, but will break a lot of platforms
that will have to be identified and fixed one by one. Another
possibility would be to refcount irqdescs, and make sure the irqdomain
directly holds pointers to them. Doable, but may create overhead.

To sum it up, don't build your irqchip driver as a module if you can
avoid it. If you can't, you'll have to be very careful about how the
mapping is established (make sure it is not created by
of_platform_populate), and use irq_dispose_mapping in the client

As per your suggestion I tried making this driver a statically compiled one.
I tried various approaches with this -

1. Using arch_inticall(..) - When I used this call, I saw that once the
clients were removed, I don't see the IRQs requested by them (in /proc/interrupts).

2. Using module_init(..) (statically compiled) - When I used this call, I saw that
even after the clients were removed, I do see their requested IRQs in /proc/interrupts.
This behavior in #2 is the same as the one I saw when I compiled my driver as a module
and used arch_initcall(..).

Is there any reason why this is seen only with arch_initcall(..) used statically?

Prakruthi Deepak