Re: [PATCH v3 09/12] of: property: Simplify of_link_to_phandle()
From: Geert Uytterhoeven
Date: Wed Feb 08 2023 - 02:57:21 EST
Hi Saravana,
On Wed, Feb 8, 2023 at 8:32 AM Saravana Kannan <saravanak@xxxxxxxxxx> wrote:
> On Tue, Feb 7, 2023 at 6:08 PM Saravana Kannan <saravanak@xxxxxxxxxx> wrote:
> > On Tue, Feb 7, 2023 at 12:57 PM Geert Uytterhoeven <geert@xxxxxxxxxxxxxx> wrote:
> > > On Tue, Feb 7, 2023 at 2:42 AM Saravana Kannan <saravanak@xxxxxxxxxx> wrote:
> > > > The driver core now:
> > > > - Has the parent device of a supplier pick up the consumers if the
> > > > supplier never has a device created for it.
> > > > - Ignores a supplier if the supplier has no parent device and will never
> > > > be probed by a driver
> > > >
> > > > And already prevents creating a device link with the consumer as a
> > > > supplier of a parent.
> > > >
> > > > So, we no longer need to find the "compatible" node of the supplier or
> > > > do any other checks in of_link_to_phandle(). We simply need to make sure
> > > > that the supplier is available in DT.
> > > >
> > > > Signed-off-by: Saravana Kannan <saravanak@xxxxxxxxxx>
> > >
> > > Thanks for your patch!
> > >
> > > This patch introduces a regression when dynamically loading DT overlays.
> > > Unfortunately this happens when using the out-of-tree OF configfs,
> > > which is not supported upstream. Still, there may be (obscure)
> > > in-tree users.
> > >
> > > When loading a DT overlay[1] to enable an SPI controller, and
> > > instantiate a connected SPI EEPROM:
[...]
> > > The SPI controller and the SPI EEPROM are no longer instantiated.
> > Sigh... I spent way too long trying to figure out if I caused a memory
> > leak. I should have scrolled down further! Doesn't look like that part
> > is related to anything I did.
> >
> > There are some flags set to avoid re-parsing fwnodes multiple times.
> > My guess is that the issue you are seeing has to do with how many of
> > the in memory structs are reused vs not when an overlay is
> > applied/removed and some of these flags might not be getting cleared
> > and this is having a bigger impact with this patch (because the fwnode
> > links are no longer anchored on "compatible" nodes).
> >
> > With/without this patch (let's keep the series) can you look at how
> > the following things change between each step you do above (add,
> > remove, retry):
> > 1) List of directories under /sys/class/devlink
> > 2) Enable the debug logs inside __fwnode_link_add(),
> > __fwnode_link_del(), device_link_add()
> >
> > My guess is that the final solution would entail clearing
> > FWNODE_FLAG_LINKS_ADDED for some fwnodes.
>
> You replied just as I was about to hit send. So sending this anyway...
>
> Ok, I took a closer look and I think it's a bit of a mess. The fact
> that it even worked for you without this patch is a bit of a
> coincidence.
>
> Let's just take platform devices that are created by
> driver/of/platform.c as an example.
>
> The main problem is that when you add/remove properties to a DT node
> of an existing platform device, nothing is really done about it at the
> device level. We don't even unbind and rebind the driver so the driver
> could make use of the new properties. We don't remove and add back the
> device so whoever might use the new property will use it. And if you
> are adding a new node, it'll only trigger any platform device level
> impact if it's a new node of a "simple-bus" (or similar bus) device.
>
> Problem 1:
> So if you add a new child node to an existing probed device that adds
> its children explicitly (as in, the parent is not a "simple-bus" like
> device), nothing will happen. The newly added child device node will
> get converted into a platform device, not will the parent device
> notice it. So in your case of adding msiof0_pins, it's just that when
> the consumer gets the pins, the driver doesn't get involved much and
> it's the pinctrl framework that reads the DT and figures it out.
>
> With this patch, the fwnode links point to the actual resource and the
> actual parent device inherits them if they don't get converted to a
> struct device. But since we are adding this msiof0_pins after the
> parent device has probed, the fwnode link isn't inherited by the
> parent pinctrl device.
>
> Problem 2:
> So if you add a property to an already bound device, nothing is done
> by the driver. In your overlay example, if you move the status="okay"
> line to be the first property you change in the msiof0 spi device,
> you'll probably see that fw_devlink is no longer the one blocking the
> probe. This is because the platform device will get added as soon as
> the status flips from disabled to enabled and at that point fw_devlink
> will think it has no suppliers and won't do any probe deferring. And
> then as the new properties get added nothing will happen at the device
> or fw_devlink level. If the msiof0's spi driver fails immediately with
> NOT -EPROBE_DEFER when platform device is added because it couldn't
> find any pinctrl property, then msiof0 will never probe (unless you
> remove and add the driver). If it had failed with -EPROBE_DEFER, then
> it might probe again if something else triggers a deferred probe
> attempt. Clearly, things working/not working based on the order of
> properties in DT is not a good implementation.
>
> Problem 3:
> What if you enable a previously disabled supplier. There's no way to
> handle that from a fw_devlink level without re-parsing the entire
> device tree because existing devices might be consumers now.
>
> Anyway, long story short, it's sorta worked due to coincidence and
> it's quite messy to get it to work correctly.
Several subsystems register notifiers to be informed of the events
above. E.g. drivers/spi/spi.c:
if (IS_ENABLED(CONFIG_OF_DYNAMIC))
WARN_ON(of_reconfig_notifier_register(&spi_of_notifier));
if (IS_ENABLED(CONFIG_ACPI))
WARN_ON(acpi_reconfig_notifier_register(&spi_acpi_notifier));
So my issue might be triggered using ACPI, too.
> Another way to get this to work is to:
> 1) unload pinctrl driver, unload spi driver.
> 2) apply overlay
> 3) reload pinctrl driver, reload spi driver.
>
> This is assuming unloading those 2 drivers doesn't crash your system.
Unloading the pinctrl driver is not an option.
Gr{oetje,eeting}s,
Geert
--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@xxxxxxxxxxxxxx
In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds