Re: [PATCH v3 1/2] driver core: Rework logic in __driver_deferred_probe_check_state to allow EPROBE_DEFER to be returned for longer

From: John Stultz
Date: Tue Feb 18 2020 - 18:58:44 EST


On Tue, Feb 18, 2020 at 3:53 PM Rob Herring <robh@xxxxxxxxxx> wrote:
>
> On Tue, Feb 18, 2020 at 5:21 PM John Stultz <john.stultz@xxxxxxxxxx> wrote:
> >
> > On Tue, Feb 18, 2020 at 2:51 PM Rob Herring <robh@xxxxxxxxxx> wrote:
> > >
> > > On Tue, Feb 18, 2020 at 4:07 PM John Stultz <john.stultz@xxxxxxxxxx> wrote:
> > > >
> > > > Due to commit e01afc3250255 ("PM / Domains: Stop deferring probe
> > > > at the end of initcall"), along with commit 25b4e70dcce9
> > > > ("driver core: allow stopping deferred probe after init") after
> > > > late_initcall, drivers will stop getting EPROBE_DEFER, and
> > > > instead see an error causing the driver to fail to load.
> > > >
> > > > That change causes trouble when trying to use many clk drivers
> > > > as modules, as the clk modules may not load until much later
> > > > after init has started. If a dependent driver loads and gets an
> > > > error instead of EPROBE_DEFER, it won't try to reload later when
> > > > the dependency is met, and will thus fail to load.
> > > >
> > > > This patch reworks some of the logic in
> > > > __driver_deferred_probe_check_state() so that if the
> > > > deferred_probe_timeout value is set, we will return EPROBE_DEFER
> > > > until that timeout expires, which may be after initcalls_done
> > > > is set to true.
> > > >
> > > > Specifically, on db845c, this change (when combined with booting
> > > > using deferred_probe_timeout=30) allows us to set SDM_GPUCC_845,
> > > > QCOM_CLK_RPMH and COMMON_CLK_QCOM as modules and get a working
> > > > system, where as without it the display will fail to load.
> > >
> > > I would change the default for deferred_probe_timeout to 30 and then
> > > regulator code can rely on that.
> >
> > That is exactly what I do in the following patch! Let me know if you
> > have further suggestions for it.
>
> Indeed.
>
> Between the above comment and this comment in patch 2:
> /* preserve 30 second interval if deferred_probe_timeout=-1 */
>
> ...I was confused.
>

Sorry. I added that out of an abundance of caution to avoid breaking
things if someone booted specifying deferred_probe_timeout=-1 as a
boot argument, since that would cause the regulator timeout to likely
never expire.

> > > Curious, why 30 sec is fine now when
> > > you originally had 2 min? I'd just pick what you think is best. I
> > > doubt Mark had any extensive experiments to come up with 30sec.
> >
> > I had two minutes initially as, due to other delays I still have to
> > chase, booting all the way to UI on the db845c can sometimes take
> > almost a minute. So it was just a rough safety estimate. But in v2, I
> > tested with the 30 second time used by the regulator code, and that
> > seemed to work ok.
> >
> > As long as 30 seconds is working well, I'm ok with it for now (and it
> > can be overrided via boot arg). Though from past experience with
> > enterprise distros running on servers with tons of disks (which might
> > take many minutes to initialize), I'd suspect its likely some
> > environments may need much longer.
>
> Those systems aren't going to have a pinctrl or clock driver sitting
> in an encrypted RAID partition either. :)

Fair enough. Not today.. but it's always only a matter of time between
"that's ridiculous!" and "oh, we need that!" :)

thanks
-john