Re: fw_devlink=on breaks probing devices when of_platform_populate() is used

From: Olof Johansson
Date: Mon Sep 19 2022 - 19:31:31 EST


On Sun, Aug 28, 2022 at 04:39:52PM +0200, Rafa?? Mi??ecki wrote:
> On 30.07.2022 09:36, Rafa?? Mi??ecki wrote:
> > On 16.07.2022 22:50, Rafa?? Mi??ecki wrote:
> > > I added of_platform_populate() calls in mtd subsystem in the commit
> > > bcdf0315a61a2 ("mtd: call of_platform_populate() for MTD partitions"):
> > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=bcdf0315a61a29eb753a607d3a85a4032de72d94
> > >
> > > I recently backported that commit in OpenWrt to kernels 5.10 and 5.15.
> > > We started receiving reports that probing Ethernet devices stopped
> > > working in kernel 5.15. I bisected it down to the kernel 5.13 change:
> > >
> > > commit ea718c699055c8566eb64432388a04974c43b2ea (refs/bisect/bad)
> > > Author: Saravana Kannan <saravanak@xxxxxxxxxx>
> > > Date:???? Tue Mar 2 13:11:32 2021 -0800
> > >
> > > ???????? Revert "Revert "driver core: Set fw_devlink=on by default""
> > >
> > > ???????? This reverts commit 3e4c982f1ce75faf5314477b8da296d2d00919df.
> > >
> > > ???????? Since all reported issues due to fw_devlink=on should be addressed by
> > > ???????? this series, revert the revert. fw_devlink=on Take II.
> > >
> > > ???????? Signed-off-by: Saravana Kannan <saravanak@xxxxxxxxxx>
> > > ???????? Link: https://lore.kernel.org/r/20210302211133.2244281-4-saravanak@xxxxxxxxxx
> > > ???????? Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>
> > >
> > > For me with above commit kernel just never calls bcm4908_enet_probe().
> > > Reverting it from the top of 5.13.19 and 5.15.50 fixes it. I believe the
> > > same issue happens with other drivers.
> > >
> > > Critical detail is that in DT Ethernet block node references NVMEM cell
> > > of MTD partition (see below).
> > >
> > > Could you help me dealing with this issue, please? Can you see something
> > > obvious breaking fw_devlink=on + of_platform_populate() case? Can I
> > > provide some extra information to help fixing it?
> >
> > Any ideas about this problem / solution?
>
> I didn't get any reponse for this bug for 6 weeks now. Is that OK if I
> send a revert patch then?

I'm pretty sure this is the same root cause as I had for PCIe with a reference
to iommu with fw_devlink.strict=1:

https://lore.kernel.org/lkml/CAOesGMgpJQjMvo6m7on+27F8REiHaVSRL6HBjiRPVDM9Jscnrg@xxxxxxxxxxxxxx/

There's the dependency on the nvmem-cells propery, so the driver core doesn't
call probe until it's fulfilled. Meanwhile, the platform_driver() code
unregisters the driver if it (thinks) it as called probe and there are no
devices linked to it -- since it's not a needed driver. Thus, probe will never
be called. That code is in drivers/base/platform.c:__platform_driver_probe().

I don't know what the proper fix is here, this seems like a design issue with
the fw_devlink code.


-Olof