Re: [PATCH v1] driver core: Fix suspend/resume order issue with deferred probe
From: Rafael J. Wysocki
Date: Tue Jun 30 2020 - 12:11:57 EST
On Tue, Jun 30, 2020 at 5:39 PM Greg Kroah-Hartman
<gregkh@xxxxxxxxxxxxxxxxxxx> wrote:
>
> On Tue, Jun 30, 2020 at 03:50:58PM +0200, Rafael J. Wysocki wrote:
> > On Fri, Jun 26, 2020 at 10:53 PM Geert Uytterhoeven
> > <geert@xxxxxxxxxxxxxx> wrote:
> > >
> > > Hi Saravana,
> > >
> > > On Fri, Jun 26, 2020 at 10:34 PM Saravana Kannan <saravanak@xxxxxxxxxx> wrote:
> > > > On Fri, Jun 26, 2020 at 4:27 AM Rafael J. Wysocki <rafael@xxxxxxxxxx> wrote:
> > > > > On Thu, Jun 25, 2020 at 7:52 PM Saravana Kannan <saravanak@xxxxxxxxxx> wrote:
> > > > > > On Thu, Jun 25, 2020 at 10:47 AM Rafael J. Wysocki <rafael@xxxxxxxxxx> wrote:
> > > > > > > Note that deferred probing gets in the way here and so the problem is
> > > > > > > related to it.
> > > > > >
> > > > > > I mean, we officially support deferred probing. Shouldn't we fix it so
> > > > > > that it doesn't break suspend/resume?
> > > > >
> > > > > Yes, we should fix deferred probing.
> > >
> > > Please take into account that breakage is an actual regression.
> > >
> > > > > > Also, it's pretty easy to have
> > > > > > cases where one module probes multiple device instances and loading it
> > > > > > in one order would break dpm_list order for one device and loading it
> > > > > > in another order would break it for another device. And there would be
> > > > > > no "proper" order to load modules (because module order != device
> > > > > > order).
> > > > >
> > > > > I'm not saying that the current code is perfect. I'm saying that the
> > > > > fix as proposed adds too much cost for everybody who may not care IMO.
> > > >
> > > > Ok, how about I don't do this reordering until we see the first
> > > > deferred probe request? Will that work for you? In that case, systems
> > > > with no deferred probing will not incur any reordering cost. Or if
> > > > reordering starts only towards the end, all the previous probes won't
> > > > incur reordering cost.
> > >
> > > That first deferred probe request is more or less as of the first probe,
> > > since commit 93d2e4322aa74c1a ("of: platform: Batch fwnode parsing when
> > > adding all top level devices"), at least on DT systems.
> >
> > The deferred probe reordering of devices to the end of dpm_list
> > started in 2012, so it is nothing new, and it demonstrably works for
> > devices where the dependencies are known to the driver core.
> >
> > That said, in the cases when the dependencies are known to the driver
> > core, it is also unnecessary to reorder dpm_list in
> > deferred_probe_work_func(), because the right ordering of it is going
> > to be determined elsewhere.
> >
> > Also commit 494fd7b7ad10 ("PM / core: fix deferred probe breaking
> > suspend resume order") is not the source of the problem here, because
> > the problem would have still been there without it, due to the
> > device_pm_move_last() that was there before, so the Fixes: tag
> > pointing to that commit is misleading.
> >
> > Now, because 716a7a259690 ("driver core: fw_devlink: Add support for
> > batching fwnode parsing") is an optimization and the regression is
> > present because of it AFAICS, the best way to address it at that point
> > would be to revert commit 716a7a259690 for 5.8 and maybe do the
> > optimization more carefully.
> >
> > Greg, what do you think?
>
> I've been ignoreing this and letting you all sort it out :)
>
> But if you think that patch should be reverted, I'll not object and will
> be glad to to it if this solves the issue.
Well, if Geert can confirm that reverting commit 716a7a259690 makes
the problem go away, IMO this would be the most reasonable thing to do
at this stage of the cycle without risking that more regressions will
be introduced.
Geert?