Re: [PATCH v2 0/2] Attach DT nodes to existing PCI devices
From: Herve Codina
Date: Tue Mar 19 2024 - 10:42:01 EST
Hi Rob,
On Thu, 14 Dec 2023 15:31:22 +0100
Herve Codina <herve.codina@xxxxxxxxxxx> wrote:
> > >
> > > But you don't. The logic to find the interrupt parent is walk up the
> > > parent nodes until you find 'interrupt-parent' or
> > > '#interrupt-controller' (and interrupt-map always has
> > > #interrupt-controller). So your overlay just needs 'interrupts = <1>'
> > > for INTA and it should all just work.
> >
> > Yes, I tried some stuffs in that way...
> > >
> > > That of course implies that we need interrupt properties in all the
> > > bridges which I was hoping to avoid. In the ACPI case, for DT
> > > interrupt parsing to work, we're going to need to end up in an
> > > 'interrupt-controller' node somewhere. I think the options are either
> >
> > ... and I went up to that point.
> > Further more with that way, we need to update the addr value retrieved
> > from the device requesting the interrupt.
> > https://elixir.bootlin.com/linux/latest/source/drivers/of/irq.c#L343
> > Indeed, with the current 'interrupt-map' at bridges level, a addr value
> > update is needed at the PCI device level if the interrupt is requested
> > from some PCI device children.
> > This is were my (not so good) interrupt-ranges property could play a
> > role.
> >
> > > we walk interrupt-map properties up to the host bridge which then
> > > points to something or the PCI device is the interrupt controller. I
> > > think the PCI device is the right place. How the downstream interrupts
> >
> > Agree, the PCI device seems to be the best candidate.
> >
> > > are routed to PCI interrupts are defined by the device. That would
> > > work the same way for both DT and ACPI. If you are concerned about
> > > implementing in each driver needing this, some library functions can
> > > mitigate that.
> > >
> > > I'm trying to play around with the IRQ domains and get this to work,
> > > but not having any luck yet.
> >
>
> Got some piece of code creating an interrupt controller at PCI device level.
> To have it working, '#interrupt-cell = <1>' and 'interrupt-controller'
> properties need to be set in the PCI device DT node.
>
> I can set them when the PCI device DT node is created (add them in
> of_pci_add_properties()) but as this interrupt controller is specific to the
> PCI device driver (it needs to be created after the pci_enable_device() call
> and will probably be device specific with MSI), it would make sense to have
> these properties set by the PCI device driver itself or in the overlay it
> applies.
>
> But these properties creation done by the device driver itself (or the
> overlay) lead to memory leaks.
> Indeed, we cannot add properties to an existing node without memory
> leaks. When a property is removed, the property is not freed but moved
> to the node->deadprops list (of_remove_property()).
> The properties present in the list will be free once the node itself is
> removed.
> In our use-case, the node (PCI device node) is not removed when the driver
> is removed and probe again (rmmod/insmod).
>
> Do you have any opinion about the '#interrupt-cell' and
> 'interrupt-controller' properties creation:
>
> - Created by of_pci_add_properties():
> No mem leak but done outside the specific device driver itself and done for
> all PCI devices.
> Maybe the driver will not create the interrupt controller, maybe it will
> prefer an other value for '#interrupt-cell', maybe it will handle MSI and
> will need to set other properties, ... All of these are device specific.
>
> - Created by the device driver itself (or the overlay it applies):
> The mem leak is present. Any idea about a way to fix that or at least having
> a fix for some properties ?
> I was thinking about avoiding to export properties (of_find_property()) and
> so avoid references properties or introducing refcount on properties but
> these modifications touch a lot of drivers and subsystems and are subject
> to errors.
> That's said, checking a property presence based on its name can be done without
> exporting the property, as well as getting a single value. Iterating over array
> values need some more complicated code.
>
I revive this quite old topic.
The primary goal of this series was to avoid a struct device duplication due
to the insertion of DT nodes related to PCI devices.
The series added the sysfs of_node symlink once the device is visible from
sysfs.
You proposed to create the DT node earlier, DECLARE_PCI_FIXUP_EARLY() instead
of DECLARE_PCI_FIXUP_FINAL() in order to have it set before the device_add()
call.
This raised some new issues because the DT node creation needs some information
set by the PCI core. DECLARE_PCI_FIXUP_HEADER() was the new candidate.
Issues were still present.
The 'ranges' property is needed and information needed for its computation
are set by the PCI core after the device_add() call.
At this point the discussion continued also on interrupts with the idea to
add the 'interrupt-controller' property in the PCI device node in order to
bypass all the interrupt mapping done in DT nodes.
Indeed, in order to have a working DT mapping, an 'interrupt-parent' phandle
is needed at some points and will be problematic with ACPI.
On my side I've got a piece of code to consider the PCI device as an interrup
controller. This work with the 'interrupt-controller' property.
The question raised:
Which part of code set the interrupt-controller property ?
- DT node creation:
Common to all PCI devices even if the interrupt are not handled by the PCI
device driver. Also '#interrupt-cells' is really device specific.
- Added by the PCI device driver itself
Seems the good place but we enter in overlay memleak issues
What is your opinion related to the best candidate for the
'interrupt-controller' and '#interrupt-cells' property creation ?
Back to the of_node link addition to sysfs.
Is it really an issue to add it on an already present device ?
If so, is it acceptable (not tested on my side) to create the of_node link
to an empty node before the device_add() call and then fill this node later
when needed information are set by the PCI core ?
With your answers, I hope to move forward on these topics.
Best regards,
Hervé
--
Hervé Codina, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com