Re: [PATCH v2 10/19] irqdomain: Introduce irq_domain_alloc() and irq_domain_publish()

From: Herve Codina
Date: Thu Jun 06 2024 - 11:53:20 EST


Hi Thomas,

On Wed, 05 Jun 2024 15:02:46 +0200
Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:

> On Mon, May 27 2024 at 18:14, Herve Codina wrote:
> > The irq_domain_add_*() family functions create an irq_domain and also
> > publish this newly created to domain. Once an irq_domain is published,
> > consumers can request IRQ in order to use them.
> >
> > Some interrupt controller drivers have to perform some more operations
> > with the created irq_domain in order to have it ready to be used.
> > For instance:
> > - Allocate generic irq chips with irq_alloc_domain_generic_chips()
> > - Retrieve the generic irq chips with irq_get_domain_generic_chip()
> > - Initialize retrieved chips: set register base address and offsets,
> > set several hooks such as irq_mask, irq_unmask, ...
> >
> > To avoid a window where the domain is published but not yet ready to be
>
> I can see the point, but why is this suddenly a problem? There are tons
> of interrupt chip drivers which have exactly that pattern.
>

I think the issue was not triggered because these interrupt chip drivers
are usually built in, so they are probed in the linear sequence done at
boot time. Consumers and suppliers are probed sequentially, without any
parallel execution issues.

In the LAN966x PCI device driver use case, the drivers were built as
modules. Module loading and the driver .probe() calls for the IRQ supplier
and the IRQ consumers are done in parallel. This reveals the race condition.
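
To make the window concrete, here is a minimal userspace sketch of the two
orderings. It is not kernel code: struct toy_domain, the single-slot
registry and all the toy_*() helpers are hypothetical stand-ins I made up
for illustration, and the parallel consumer is simulated by a lookup done
at the point where it could run.

```c
#include <assert.h>   /* for the checks in the usage note */
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an irq domain, not struct irq_domain. */
struct toy_domain {
	bool chips_ready;	/* set once the generic chips are set up */
};

/* Single-slot registry standing in for the list of published domains. */
static struct toy_domain *registry;

static void toy_publish(struct toy_domain *d)
{
	registry = d;
}

static void toy_setup_chips(struct toy_domain *d)
{
	d->chips_ready = true;
}

/* What a consumer probing at that point in time would find. */
static struct toy_domain *toy_lookup(void)
{
	return registry;
}

/* Publish first, set up later: the ordering that exposes the window. */
static bool publish_then_setup_exposes_window(void)
{
	struct toy_domain d = { 0 };
	bool exposed;

	registry = NULL;
	toy_publish(&d);
	/* A consumer probing here finds the domain with no chips set up. */
	exposed = toy_lookup() != NULL && !toy_lookup()->chips_ready;
	toy_setup_chips(&d);
	registry = NULL;
	return exposed;
}

/* Allocate, set up, then publish: consumers never see an unready domain. */
static bool setup_then_publish_exposes_window(void)
{
	struct toy_domain d = { 0 };
	bool exposed;

	registry = NULL;
	/* A consumer probing here finds nothing and has to retry later. */
	exposed = toy_lookup() != NULL;
	toy_setup_chips(&d);
	exposed = exposed || toy_lookup() != NULL;
	toy_publish(&d);
	/* From here on, any lookup returns a fully set up domain. */
	exposed = exposed || !toy_lookup()->chips_ready;
	registry = NULL;
	return exposed;
}
```

With these helpers, assert(publish_then_setup_exposes_window()) and
assert(!setup_then_publish_exposes_window()) both hold. In the second
ordering the consumer either finds no domain at all or finds a ready one.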

> Also why is all of this burried in a series which aims to add a network
> driver and touches the world and some more. If you had sent the two irq
> domain patches seperately w/o spamming 100 people on CC then this would
> have been solved long ago. That's documented clearly, no?

Yes, the main idea of the series, as mentioned in the cover letter, is to
give the big picture of the LAN966x PCI device use case so that the
maintainers of all the impacted subsystems and drivers are aware of the
global use case: DT overlay on top of PCI device.
Of course, the plan is to split this series into smaller ones once the
parts have been discussed in the context of the DT overlay on top of PCI
use case and have reached some kind of maturity, at least on the way to
implement a solution.

Thomas, would you prefer to have all the IRQ related patches extracted
right now from this big picture series?

>
> > void irq_domain_free_fwnode(struct fwnode_handle *fwnode);
> > +struct irq_domain *irq_domain_alloc(struct fwnode_handle *fwnode, unsigned int size,
> > + irq_hw_number_t hwirq_max, int direct_max,
> > + const struct irq_domain_ops *ops,
> > + void *host_data);
> > +
> > +static inline struct irq_domain *irq_domain_alloc_linear(struct fwnode_handle *fwnode,
> > + unsigned int size,
> > + const struct irq_domain_ops *ops,
> > + void *host_data)
> > +{
> > + return irq_domain_alloc(fwnode, size, size, 0, ops, host_data);
> > +}
>
> So this creates exactly one wrapper, which means we'll grow another ton
> of wrappers if that becomes popular for whatever reason. We have already
> too many of variants for creating domains.
>
> But what's worse is that this does not work for hierarchical domains and
> is just an ad hoc scratch my itch solution.
>
> Also looking at the irq chip drivers which use generic interrupt
> chips. There are 24 instances of irq_alloc_domain_generic_chips() and
> most of this code is just boilerplate.
>
> So what we really want is a proper solution to get rid of this mess
> instead of creating interfaces which just proliferate and extend it.
>
> Something like the uncompiled below allows to convert all the
> boilerplate into a template based setup/remove.
>
> I just converted a random driver over to it and the result is pretty
> neutral in terms of lines, but the amount of code to get wrong is
> significantly smaller. I'm sure that more complex drivers will benefit
> even more and your problem should be completely solved by that.
>
> The below is just an initial sketch which allows further consolidation
> in the irqdomain space. You get the idea.

Got it, thanks a lot for the idea, the sketch and the way to use it in
drivers. I will rework my patches in that way.

Thanks,
Hervé