Re: [RFC][PATCH] driver core: Extend returning EPROBE_DEFER for two minutes after late_initcall
From: Bjorn Andersson
Date: Thu Feb 13 2020 - 21:19:30 EST
On Thu 13 Feb 16:44 PST 2020, John Stultz wrote:
> Due to commit e01afc3250255 ("PM / Domains: Stop deferring probe
> at the end of initcall"), along with commit 25b4e70dcce9
> ("driver core: allow stopping deferred probe after init") after
> late_initcall, drivers will stop getting EPROBE_DEFER, and
> instead see an error causing the driver to fail to load.
>
> That change causes trouble when trying to use many clk drivers
> as modules, as the clk modules may not load until much later
> after init has started. If a dependent driver loads and gets an
> error instead of EPROBE_DEFER, it won't try to reload later when
> the dependency is met, and will thus fail to load.
>
> Instead of reverting that patch, this patch tries to extend the
> time that EPROBE_DEFER is returned by two minutes, to (hopefully)
> ensure that everything has had a chance to load.
>
> Specifically, on db845c, this change allows us to set
> SDM_GPUCC_845, QCOM_CLK_RPMH and COMMON_CLK_QCOM as modules and
> get a working system, whereas without it the display will fail
> to load.

The purpose of 25b4e70dcce9 ("driver core: allow stopping deferred probe
after init") is to ensure that when the kernel boots with a DeviceTree
blob that references a resource (power-domain in this case) that either
hasn't been compiled in, or simply doesn't exist yet, it should continue
to boot - under the assumption that these resources probably aren't
needed to provide a functional system.

I don't think your patch maintains this behavior, because when userspace
kicks in and loads kernel modules during the first two minutes they will
all end up in the probe deferral list. Past two minutes, any event that
registers a new driver (i.e. manual intervention) will kick off a new
wave of probing, which will now continue as expected, ignoring any
power-domains that are yet to be probed (either because they don't exist
or they are further down the probe deferral list).

You can improve the situation somewhat by calling
driver_deferred_probe_trigger() in your
deferred_initcall_done_work_func(), to remove the need for human
intervention. But the outcome will still depend on the order in
deferred_probe_active_list.
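
To make the ordering concern concrete, here is a minimal userspace sketch. It is not kernel code: every model_* name is hypothetical, and the rule "once the window closes, a missing supplier is ignored and the probe proceeds" is a simplification of the behavior described above. It models a consumer and a provider that were both deferred during the window, then closes the window with or without a re-trigger.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only. All model_* names are made up for illustration. */
#define MODEL_EPROBE_DEFER 517

struct model_dev {
	const struct model_dev *supplier; /* NULL when nothing is needed */
	bool bound;                       /* probe succeeded */
	bool deferred;                    /* sitting on the deferral list */
	bool has_supplier;                /* bound with its supplier available */
};

struct model_result {
	bool consumer_bound;
	bool consumer_has_supplier;
};

static bool initcalls_done;

/* Defer while the window is open; afterwards ignore a missing supplier. */
static int model_probe(struct model_dev *dev)
{
	bool ready = dev->supplier == NULL || dev->supplier->bound;

	if (!ready && !initcalls_done) {
		dev->deferred = true;
		return -MODEL_EPROBE_DEFER;
	}
	dev->deferred = false;
	dev->bound = true;
	dev->has_supplier = dev->supplier != NULL && dev->supplier->bound;
	return 0;
}

/* Stand-in for driver_deferred_probe_trigger(): retry in list order. */
static void model_trigger(struct model_dev *devs, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (devs[i].deferred)
			model_probe(&devs[i]);
}

/* Stand-in for the delayed work function, optionally re-triggering. */
static void model_initcall_done_work(struct model_dev *devs, size_t n,
				     bool retrigger)
{
	initcalls_done = true;
	if (retrigger)
		model_trigger(devs, n);
}

/* Defer a consumer and its provider, then close the window. */
static struct model_result model_run(bool consumer_first, bool retrigger)
{
	struct model_dev devs[3] = { 0 };  /* devs[2] never shows up */
	size_t c = consumer_first ? 0 : 1; /* consumer's list position */
	size_t p = consumer_first ? 1 : 0; /* provider's list position */
	struct model_result res;

	initcalls_done = false;
	devs[p].supplier = &devs[2];       /* provider waits on an absent resource */
	devs[c].supplier = &devs[p];       /* consumer waits on the provider */

	model_probe(&devs[0]);             /* both land on the deferral list */
	model_probe(&devs[1]);
	model_initcall_done_work(devs, 2, retrigger);

	res.consumer_bound = devs[c].bound;
	res.consumer_has_supplier = devs[c].has_supplier;
	return res;
}
```

In this model, model_run(true, false) leaves the consumer unbound until something else triggers a probe, model_run(true, true) binds the consumer without its power-domain because it sits ahead of the provider in the list, and model_run(false, true) binds it with the power-domain available.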

That said, your patch does resolve an important problem for me!

We have a number of drivers providing power-domains that are registered
subject to timing in interaction with co-processors. So with a
sufficiently small kernel (e.g. heavy use of kernel modules) it's likely
that these are registered past late_initcall.

Your extension of this to two minutes past late_initcall will for sure
be sufficient to avoid this issue.

Regards,
Bjorn

>
> Cc: Alexander Graf <agraf@xxxxxxx>
> Cc: Rob Herring <robh@xxxxxxxxxx>
> Cc: "Rafael J. Wysocki" <rjw@xxxxxxxxxxxxx>
> Cc: Kevin Hilman <khilman@xxxxxxxxxx>
> Cc: Ulf Hansson <ulf.hansson@xxxxxxxxxx>
> Cc: Pavel Machek <pavel@xxxxxx>
> Cc: Len Brown <len.brown@xxxxxxxxx>
> Cc: Todd Kjos <tkjos@xxxxxxxxxx>
> Cc: Bjorn Andersson <bjorn.andersson@xxxxxxxxxx>
> Cc: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>
> Cc: linux-pm@xxxxxxxxxxxxxxx
> Fixes: e01afc3250255 ("PM / Domains: Stop deferring probe at the end of initcall")
> Fixes: 25b4e70dcce9 ("driver core: allow stopping deferred probe after init")
> Signed-off-by: John Stultz <john.stultz@xxxxxxxxxx>
> ---
> drivers/base/dd.c | 8 +++++++-
> 1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/base/dd.c b/drivers/base/dd.c
> index b25bcab2a26b..35ebae8b65be 100644
> --- a/drivers/base/dd.c
> +++ b/drivers/base/dd.c
> @@ -311,6 +311,12 @@ static void deferred_probe_timeout_work_func(struct work_struct *work)
>  }
>  static DECLARE_DELAYED_WORK(deferred_probe_timeout_work, deferred_probe_timeout_work_func);
>
> +static void deferred_initcall_done_work_func(struct work_struct *work)
> +{
> +	initcalls_done = true;
> +}
> +static DECLARE_DELAYED_WORK(deferred_initcall_done_work, deferred_initcall_done_work_func);
> +
>  /**
>   * deferred_probe_initcall() - Enable probing of deferred devices
>   *
> @@ -327,7 +333,7 @@ static int deferred_probe_initcall(void)
>  	driver_deferred_probe_trigger();
>  	/* Sort as many dependencies as possible before exiting initcalls */
>  	flush_work(&deferred_probe_work);
> -	initcalls_done = true;
> +	schedule_delayed_work(&deferred_initcall_done_work, 120 * HZ);
>
>  	/*
>  	 * Trigger deferred probe again, this time we won't defer anything
> --
> 2.17.1
>