Re: [PATCH v3 1/2] driver core: Rework logic in __driver_deferred_probe_check_state to allow EPROBE_DEFER to be returned for longer

From: John Stultz
Date: Tue Feb 18 2020 - 18:21:51 EST


On Tue, Feb 18, 2020 at 2:51 PM Rob Herring <robh@xxxxxxxxxx> wrote:
>
> On Tue, Feb 18, 2020 at 4:07 PM John Stultz <john.stultz@xxxxxxxxxx> wrote:
> >
> > Due to commit e01afc3250255 ("PM / Domains: Stop deferring probe
> > at the end of initcall"), along with commit 25b4e70dcce9
> > ("driver core: allow stopping deferred probe after init") after
> > late_initcall, drivers will stop getting EPROBE_DEFER, and
> > instead see an error causing the driver to fail to load.
> >
> > That change causes trouble when trying to use many clk drivers
> > as modules, as the clk modules may not load until much later
> > after init has started. If a dependent driver loads and gets an
> > error instead of EPROBE_DEFER, it won't try to reload later when
> > the dependency is met, and will thus fail to load.
> >
> > This patch reworks some of the logic in
> > __driver_deferred_probe_check_state() so that if the
> > deferred_probe_timeout value is set, we will return EPROBE_DEFER
> > until that timeout expires, which may be after initcalls_done
> > is set to true.
> >
> > Specifically, on db845c, this change (when combined with booting
> > using deferred_probe_timeout=30) allows us to set SDM_GPUCC_845,
> > QCOM_CLK_RPMH and COMMON_CLK_QCOM as modules and get a working
> > system, where as without it the display will fail to load.
>
> I would change the default for deferred_probe_timeout to 30 and then
> regulator code can rely on that.

That is exactly what I do in the following patch! Let me know if you
have further suggestions for it.

> Curious, why 30 sec is fine now when
> you originally had 2 min? I'd just pick what you think is best. I
> doubt Mark had any extensive experiments to come up with 30sec.

I had two minutes initially as, due to other delays I still have to
chase, booting all the way to UI on the db845c can sometimes take
almost a minute. So it was just a rough safety estimate. But in v2, I
tested with the 30 second time used by the regulator code, and that
seemed to work ok.

As long as 30 seconds is working well, I'm ok with it for now (and it
can be overrided via boot arg). Though from past experience with
enterprise distros running on servers with tons of disks (which might
take many minutes to initialize), I'd suspect its likely some
environments may need much longer.

thanks
-john