Re: [PATCH] firmware: QCOM_SCM: Allow qcom_scm driver to be loadable as a permenent module

From: Bjorn Andersson
Date: Mon Jul 19 2021 - 16:10:56 EST


On Mon 19 Jul 14:00 CDT 2021, John Stultz wrote:

> On Fri, Jul 16, 2021 at 10:01 PM Bjorn Andersson
> <bjorn.andersson@xxxxxxxxxx> wrote:
> > On Tue 06 Jul 23:53 CDT 2021, John Stultz wrote:
> > > Allow the qcom_scm driver to be loadable as a permenent module.
> > >
> > > This still uses the "depends on QCOM_SCM || !QCOM_SCM" bit to
> > > ensure that drivers that call into the qcom_scm driver are
> > > also built as modules. While not ideal in some cases its the
> > > only safe way I can find to avoid build errors without having
> > > those drivers select QCOM_SCM and have to force it on (as
> > > QCOM_SCM=n can be valid for those drivers).
> > >
> > > Reviving this now that Saravana's fw_devlink defaults to on,
> > > which should avoid loading troubles seen before.
> > >
> >
> > Are you (in this last paragraph) saying that all those who have been
> > burnt by fw_devlink during the last months and therefor run with it
> > disabled will have a less fun experience once this is merged?
> >
>
> I guess potentially. So way back when this was originally submitted,
> some folks had trouble booting if it was set as a module due to it
> loading due to the deferred_probe_timeout expiring.
> My attempts to change the default timeout value to be larger ran into
> trouble, but Saravana's fw_devlink does manage to resolve things
> properly for this case.
>

Unfortunately I see really weird things coming out of that, e.g. display
on my db845c is waiting for the USB hub on PCIe to load its firmware,
which typically times out after 60 seconds.

I've stared at it quite a bit and I don't understand how they are
related.

> But if folks are having issues w/ fw_devlink, and have it disabled,
> and set QCOM_SCM=m they could still trip over the issue with the
> timeout firing before it is loaded (especially if they are loading
> modules from late mounted storage rather than ramdisk).
>

I guess we'll have to force QCOM_SCM=y in the defconfig and hope people
don't make it =m.

> > (I'm picking this up, but I don't fancy the idea that some people are
> > turning the boot process into a lottery)
>
> Me neither, and I definitely think the deferred_probe_timeout logic is
> way too fragile, which is why I'm eager for fw_devlink as it's a much
> less racy approach to handling module loading dependencies.

Right, deferred_probe_timeout is the main issue here. Without it we
might get some weird probe deferral runs, but either some driver is
missing or it settles eventually.

With deferred_probe_timeout it's rather common for me to see things
end up probe out of order (even more now with fw_devlink finding cyclic
dependencies) and deferred_probe_timeout just breaking things.

> So if you
> want to hold on this, while any remaining fw_devlink issues get
> sorted, that's fine. But I'd also not cast too much ire at
> fw_devlink, as the global probe timeout approach for handling optional
> links isn't great, and we need a better solution.
>

There's no end to the possible and valid ways you can setup your
defconfig and run into the probe deferral issues, so I see no point in
holding this one back any longer. I just hope that one day it will be
possible to boot the upstream kernel in a reliable fashion.

Thanks,
Bjorn