Re: [RFC][PATCH] driver core: Extend returning EPROBE_DEFER for two minutes after late_initcall

From: John Stultz
Date: Thu Feb 13 2020 - 21:42:07 EST


On Thu, Feb 13, 2020 at 5:51 PM Rob Herring <robh@xxxxxxxxxx> wrote:
> On Thu, Feb 13, 2020 at 6:44 PM John Stultz <john.stultz@xxxxxxxxxx> wrote:
> > Due to commit e01afc3250255 ("PM / Domains: Stop deferring probe
> > at the end of initcall"), along with commit 25b4e70dcce9
> > ("driver core: allow stopping deferred probe after init") after
> > late_initcall, drivers will stop getting EPROBE_DEFER, and
> > instead see an error causing the driver to fail to load.
> >
> > That change causes trouble when trying to use many clk drivers
> > as modules, as the clk modules may not load until much later
> > after init has started. If a dependent driver loads and gets an
> > error instead of EPROBE_DEFER, it won't try to reload later when
> > the dependency is met, and will thus fail to load.
> >
> > Instead of reverting that patch, this patch tries to extend the
> > time that EPROBE_DEFER is returned by two minutes, to (hopefully)
> > ensure that everything has had a chance to load.
>
> I think regulators already have some delay like this. We should use the
> same timeouts.

Sounds good. My memory was a bit foggy from the time I initially
brought this up, and I looked briefly before sending this out and
didn't find the regulator change you had mentioned. If you have a
specific pointer (or patch name or something) let me know, otherwise
I'll dig around later tonight/tomorrow.

> We also have the 'deferred_probe_timeout' cmdline option. It's deemed
> a debug option currently, but we could change that and change the
> default.

I looked at that code, but couldn't really make heads or tails of it.
The code checks initcalls_done and returns before it ever looks at
deferred_probe_timeout, so I was guessing deferred_probe_timeout was
addressing a somewhat more subtle issue than what I was going for. If
it's really the same functionality, I'm happy to try to rework it.
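
To make the ordering concrete, here is a rough user-space sketch of
the check order described above, next to the two-minute extension the
patch is going for. It is not the kernel code itself: the function
names, GRACE_SECS, the "0 means expired" reading of
deferred_probe_timeout, and the -ETIMEDOUT/-ENODEV return values are
all assumptions made for illustration.

```c
#include <errno.h>
#include <stdbool.h>

/* The kernel defines EPROBE_DEFER internally; userspace errno.h does not. */
#ifndef EPROBE_DEFER
#define EPROBE_DEFER 517
#endif

#define GRACE_SECS 120	/* the "two minutes" from the patch description */

static bool initcalls_done;		/* set once late_initcall has run */
static int deferred_probe_timeout = -1;	/* assumption: 0 means "expired" */

/* Ordering as described: initcalls_done decides before the timeout is read. */
static int check_state_current(void)
{
	if (!initcalls_done)
		return -EPROBE_DEFER;
	if (deferred_probe_timeout == 0)
		return -ETIMEDOUT;	/* assumed error code */
	return -ENODEV;			/* assumed error code */
}

/* Hypothetical variant: keep deferring for GRACE_SECS after late_initcall. */
static int check_state_extended(long secs_since_late_initcall)
{
	if (!initcalls_done || secs_since_late_initcall < GRACE_SECS)
		return -EPROBE_DEFER;
	return check_state_current();
}
```

In this sketch a dependent driver probing 30 seconds after
late_initcall still gets -EPROBE_DEFER from check_state_extended() and
is retried later, whereas check_state_current() already hands it a
hard error at that point.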

> > Specifically, on db845c, this change allows us to set
> > SDM_GPUCC_845, QCOM_CLK_RPMH and COMMON_CLK_QCOM as modules and
> > get a working system, whereas without it the display will fail
> > to load.
> >
> > Cc: Alexander Graf <agraf@xxxxxxx>
> > Cc: Rob Herring <robh@xxxxxxxxxx>
> > Cc: "Rafael J. Wysocki" <rjw@xxxxxxxxxxxxx>
> > Cc: Kevin Hilman <khilman@xxxxxxxxxx>
> > Cc: Ulf Hansson <ulf.hansson@xxxxxxxxxx>
> > Cc: Pavel Machek <pavel@xxxxxx>
> > Cc: Len Brown <len.brown@xxxxxxxxx>
> > Cc: Todd Kjos <tkjos@xxxxxxxxxx>
> > Cc: Bjorn Andersson <bjorn.andersson@xxxxxxxxxx>
> > Cc: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>
> > Cc: linux-pm@xxxxxxxxxxxxxxx
> > Fixes: e01afc3250255 ("PM / Domains: Stop deferring probe at the end of initcall")
> > Fixes: 25b4e70dcce9 ("driver core: allow stopping deferred probe after init")
>
> We can debate the design, but those work as designed. So Fixes?
>

Well, clk module loading would have worked, and then stopped working
with e01afc3250255, so it is a regression of sorts. And really the
tags are mostly for making sure patches get applied to trees that
backported these commits (it's not my intention to shame a patch
as broken). :)

thanks
-john