Re: [PATCH v1 0/2] Make fw_devlink=on more forgiving

From: Saravana Kannan
Date: Tue Feb 02 2021 - 03:22:10 EST


On Mon, Feb 1, 2021 at 11:55 PM Geert Uytterhoeven <geert@xxxxxxxxxxxxxx> wrote:
>
> Hi Saravana,
>
> On Tue, Feb 2, 2021 at 4:01 AM Saravana Kannan <saravanak@xxxxxxxxxx> wrote:
> > On Mon, Feb 1, 2021 at 2:40 AM Geert Uytterhoeven <geert@xxxxxxxxxxxxxx> wrote:
> > > On Sat, Jan 30, 2021 at 5:09 AM Saravana Kannan <saravanak@xxxxxxxxxx> wrote:
> > > > On Fri, Jan 29, 2021 at 8:03 PM Saravana Kannan <saravanak@xxxxxxxxxx> wrote:
> > > > > This patch series solves two general issues with fw_devlink=on
> > > > >
> > > > > Patch 1/2 addresses the issue of firmware nodes that look like they'll
> > > > > have struct devices created for them, but will never actually have
> > > > > struct devices added for them. For example, DT nodes with a compatible
> > > > > property that don't have devices added for them.
> > > > >
> > > > > Patch 2/2 addresses (for static kernels) the issue of optional suppliers
> > > > > that'll never have a driver registered for them. So, if the device could
> > > > > have probed with fw_devlink=permissive with a static kernel, this patch
> > > > > should allow those devices to probe with a fw_devlink=on. This doesn't
> > > > > solve it for the case where modules are enabled because there's no way
> > > > > to tell if a driver will never be registered or it's just about to be
> > > > > registered. I have some other ideas for that, but it'll have to come
> > > > > later, after thinking about it a bit.
> > > > >
> > > > > These two patches might remove the need for several other patches that
> > > > > went in as fixes for commit e590474768f1 ("driver core: Set
> > > > > fw_devlink=on by default"), but I think all those fixes are good
> > > > > changes. So I think we should leave those in.
> > > > >
> > > > > Marek, Geert,
> > > > >
> > > > > Can you try this series on a static kernel with your OF_POPULATED
> > > > > changes reverted? I just want to make sure these patches can identify
> > > > > and fix those cases.
> > > > >
> > > > > Tudor,
> > > > >
> > > > > You should still make the clock driver fix (because it's a bug), but I
> > > > > think this series will fix your issue too (even without the clock driver
> > > > > fix). Can you please give this a shot?
> > > >
> > > > Marek, Geert, Tudor,
> > > >
> > > > Forgot to say that this will probably fix your issues only in a static
> > > > kernel. So please try this with a static kernel. If you can also try
> > > > and confirm that this does not fix the issue for a modular kernel,
> > > > that'd be good too.
> > >
> > > Thanks for your series!
> > >
> > > For the modular case, this series has no impact, as expected (i.e. fails
> > > to boot, no I/O devices probed).
> > > With modules disabled, both r8a7791/koelsch and r8a77951/salvator-xs
> > > seem to boot fine, except for one issue on koelsch:
> >
> > Thanks a lot for testing the series!
> >
> > Regarding the koelsch issue, do you not see it with your OF_POPULATED
> > fix for the rcar-sysc driver, but only see it if you revert that fix
> > and use this series?
>
> I've just rechecked, and with fw_devlink=on, and my OF_POPULATED
> fix for rcar-sysc, i2c-demux-pinctrl works, both with modules enabled
> and disabled.

Thanks Geert! My guess is that with your OF_POPULATED changes the
"i2c-parents" of i2c-demux-pinctrl don't get probe deferred and
therefore i2c-demux-pinctrl probes after them and everything goes
well.

I guess that goes to show this series can't be the magic bullet even
with patch 2/2 -- especially for top level DT nodes that never have
devices created.

The other odd thing I noticed is that i2c-demux-pinctrl seems to
return -ENODEV when I think it should return -EPROBE_DEFER. In
i2c_demux_activate_master():

	ret = of_changeset_apply(&priv->chan[new_chan].chgset);
	if (ret)
		goto err;

	adap = of_find_i2c_adapter_by_node(priv->chan[new_chan].parent_np);
	if (!adap) {
		ret = -ENODEV;
		goto err_with_revert;
	}

If I understand the code correctly, it's assuming the selected parent
will probe successfully as soon as its status=ok change is applied.
That is not guaranteed, for many reasons (driver not registered, async
probing, stuff like fw_devlink, etc).

Thanks,
Saravana