Re: [PATCH] component: Move host device to end of device lists on binding

From: Rafael J. Wysocki
Date: Tue May 11 2021 - 13:20:19 EST


On Tue, May 11, 2021 at 7:00 PM Stephen Boyd <swboyd@xxxxxxxxxxxx> wrote:
>
> Quoting Rafael J. Wysocki (2021-05-11 03:52:06)
> > On Mon, May 10, 2021 at 9:08 PM Stephen Boyd <swboyd@xxxxxxxxxxxx> wrote:
> >
> > [cut]
> >
> > >
> > > >
> > > > > I will try it, but then I wonder about things like system wide
> > > > > suspend/resume too. The drm encoder chain would need to reimplement the
> > > > > logic for system wide suspend/resume so that any PM ops attached to the
> > > > > msm device run in the correct order. Right now the bridge PM ops will
> > > > > run, the i2c bus PM ops will run, and then the msm PM ops will run.
> > > > > After this change, the msm PM ops will run, the bridge PM ops will run,
> > > > > and then the i2c bus PM ops will run. It feels like that could be a
> > > > > problem if we're suspending the DSI encoder while the bridge is still
> > > > > active.
> > > >
> > > > Yup suspend/resume has the exact same problem as shutdown.
> > >
> > > I think suspend/resume has the exact opposite problem. At least I think
> > > the correct order is to suspend the bridge, then the encoder, i.e. DSI,
> > > like is happening today. It looks like drm_atomic_helper_shutdown()
> > > operates from the top down when we want bottom up? I admit I have no
> > > idea what is supposed to happen here.
> >
> > Why would the system-wide suspend ordering be different from the
> > shutdown ordering?
>
> I don't really know. I'm mostly noting that today the order of suspend
> is to suspend the bridge device first and then the aggregate device. If
> the suspend of the aggregate device is traversing the devices like
> drm_atomic_helper_shutdown() then it would operate on the bridge device
> after it has been suspended, like is happening during shutdown. But it
> looks like that isn't happening. At least for the msm driver we're
> suspending the aggregate device after the bridge, and there are some
> weird usages of prepare and complete in there (see msm_pm_prepare() and
> msm_pm_complete) which makes me think that it's all working around this
> component code.

Well, it looks like the "prepare" phase is being used sort of against
the rules here ("prepare" is not supposed to change the hardware
configuration, or at least that is not its role) in order to work
around an ordering issue. The same issue is exposed in shutdown,
which doesn't have a "prepare" phase.

> The prepare phase is going to suspend the display pipeline, and then the
> bridge device will run its suspend hooks, and then the aggregate driver
> will run its suspend hooks. If we had a proper device for the aggregate
> device instead of the bind/unbind component hooks we could clean this
> up.

I'm not sufficiently familiar with the component code to add anything
constructive here, but generally speaking it looks like the "natural"
dpm_list ordering does not match the order in which the devices in
question should be suspended (or shut down for that matter), so indeed
it is necessary to reorder dpm_list one way or another.
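
To make the ordering concrete, here is a toy user-space model, not
kernel code. It assumes that suspend walks the list from the tail
back to the head, and that the registration order is msm, i2c bus,
bridge (inferred from the suspend order Stephen describes above). The
device names and both helper functions are made up for illustration.

```python
# Toy model only: a plain Python list stands in for dpm_list.
# Assumption: suspend visits devices from the tail of the list to the head.

def suspend_order(dpm_list):
    """Return the order in which devices would be suspended."""
    return list(reversed(dpm_list))

def move_to_tail(dpm_list, dev):
    """Return a copy of the list with dev moved to the end (hypothetical helper)."""
    return [d for d in dpm_list if d != dev] + [dev]

# Assumed registration order, chosen to match the suspend order quoted above.
today = ["msm", "i2c-bus", "bridge"]
patched = move_to_tail(today, "msm")

print("today:  ", suspend_order(today))
print("patched:", suspend_order(patched))
```

Under those assumptions, moving the msm device to the tail makes it
the first one to be suspended, which is the change in ordering being
discussed.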

Please also note that it generally may not be sufficient to reorder
dpm_list if the devices are suspended and resumed asynchronously
during system-wide transitions, because in that case the callbacks of
different devices are only started in the dpm_list order, but they may
be completed in a different order (and of course they may run in
parallel with each other).
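
As a rough illustration of that point, the sketch below models async
callbacks as starting one tick apart in list order and then running
in parallel. It is a simplified model, not how the PM core is
implemented, and the durations are invented numbers.

```python
# Toy model only: callbacks start in list order, one tick apart, and then
# run in parallel, so each one finishes at (start tick + its duration).

def completion_order(start_order, duration):
    """Return the devices sorted by the time their callback finishes."""
    finish = {dev: tick + duration[dev] for tick, dev in enumerate(start_order)}
    return sorted(start_order, key=lambda dev: finish[dev])

start_order = ["msm", "bridge", "i2c-bus"]
duration = {"msm": 5, "bridge": 1, "i2c-bus": 1}  # made-up durations

print("started:  ", start_order)
print("completed:", completion_order(start_order, duration))
```

With these numbers the msm callback starts first but finishes last,
so the list order alone does not say which device is done first.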

Shutdown is simpler, because it runs the callbacks synchronously for
all devices IIRC.