Re: [PATCH 4/7] drm/msm/dp: fix aux-bus EP lifetime
From: Johan Hovold
Date: Tue Sep 13 2022 - 03:18:47 EST

On Tue, Sep 13, 2022 at 07:35:15AM +0100, Doug Anderson wrote:
> Hi,
>
> On Mon, Sep 12, 2022 at 7:10 PM Dmitry Baryshkov
> <dmitry.baryshkov@xxxxxxxxxx> wrote:
> >
> > On 12/09/2022 18:40, Johan Hovold wrote:
> > > Device-managed resources allocated post component bind must be tied to
> > > the lifetime of the aggregate DRM device or they will not necessarily be
> > > released when binding of the aggregate device is deferred.
> > >
> > > > This can lead to resource leaks or failure to bind the aggregate device
> > > when binding is later retried and a second attempt to allocate the
> > > resources is made.
> > >
> > > For the DP aux-bus, an attempt to populate the bus a second time will
> > > simply fail ("DP AUX EP device already populated").
> > >
> > > Fix this by amending the DP aux interface and tying the lifetime of the
> > > EP device to the DRM device rather than DP controller platform device.
> >
> > Doug, could you please take a look?
> >
> > For me this is another reminder/pressure point that we should populate
> > the AUX BUS from the probe(), before binding the components together.
>
> Aside from the kernel robot complaints, I'm not necessarily convinced.
> I think we know that the AUX DP stuff in MSM-DP is fragile right now
> and Qualcomm has promised to clean it up. This really feels like a
> band-aid and is really a sign that we're populating the AUX DP bus in
> the wrong place in Qualcomm's code. As you said, if we moved this to
> probe(), which is the plan in the promised cleanup, then it wouldn't
> be a problem.

Right, but that appears to be non-trivial judging from the discussions
you had back when the offending patch was merged:

https://lore.kernel.org/lkml/CAD=FV=X+QvjwoT2zGP82KW4kD0oMUY6ZgCizSikNX_Uj8dNDqA@xxxxxxxxxxxxxx/t/#u

> As far as I know Qualcomm has queued this cleanup behind their current
> PSR work (though it's never been clear why both can't be worked on at
> the same time) and the PSR work was stalled because they couldn't
> figure out what caused the glitching I reported. It's still on my nag
> list that I bring up with them every week...
>
> In any case, if a band-aid is urgent, maybe you could just call
> of_dp_aux_populate_bus() directly in Qualcomm code and you could add
> your own devm_add_action_or_reset() on the DRM device.

Yeah, that's probably better. I apparently missed a bunch of users of
devm_of_dp_aux_populate_ep_devices() after searching for
devm_of_dp_aux_populate_bus() instead. Judging from a quick glance all
of these populate the bus at probe, so Qualcomm indeed appears to be
the odd bird here.

But the bug is real and in mainline, and it needs to be fixed, so rolling
a custom devm action indeed seems to be the right thing to do here (if
only to have a smaller fix).
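
To illustrate the lifetime problem, here is a rough userspace model of
it, not driver code. Everything prefixed with model_ is hypothetical
and only mimics the behaviour discussed above: a per-device list of
release actions, and a populate call that fails when the EP device is
already populated. The -1 return value is a stand-in for the real
error code.

```c
/* Userspace model only; none of the model_* names are kernel API. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_MAX_ACTIONS 8

typedef void (*model_action_fn)(void *data);

/* A device with a list of managed release actions (devres stand-in). */
struct model_dev {
	model_action_fn fn[MODEL_MAX_ACTIONS];
	void *data[MODEL_MAX_ACTIONS];
	size_t nactions;
};

/* The aux bus: at most one EP device may be populated at a time. */
struct model_aux {
	bool ep_populated;
};

/* Register a release action on @dev. */
static void model_add_action(struct model_dev *dev, model_action_fn fn,
			     void *data)
{
	assert(dev->nactions < MODEL_MAX_ACTIONS);
	dev->fn[dev->nactions] = fn;
	dev->data[dev->nactions] = data;
	dev->nactions++;
}

/* Run and drop all release actions of @dev, newest first. */
static void model_release_all(struct model_dev *dev)
{
	while (dev->nactions > 0) {
		dev->nactions--;
		dev->fn[dev->nactions](dev->data[dev->nactions]);
	}
}

/* Fails if the EP is already there ("already populated"). */
static int model_populate_bus(struct model_aux *aux)
{
	if (aux->ep_populated)
		return -1;
	aux->ep_populated = true;
	return 0;
}

/* Release action: remove the EP device again. */
static void model_depopulate_bus(void *data)
{
	struct model_aux *aux = data;

	aux->ep_populated = false;
}

/*
 * Populate the bus during bind, let the aggregate bind be deferred,
 * then retry. Returns the result of the second populate attempt.
 */
int model_retry_after_defer(bool tie_to_drm_dev)
{
	struct model_dev dp_pdev = { 0 }, drm_dev = { 0 };
	struct model_aux aux = { 0 };
	struct model_dev *owner = tie_to_drm_dev ? &drm_dev : &dp_pdev;

	if (model_populate_bus(&aux))
		return -1;
	model_add_action(owner, model_depopulate_bus, &aux);

	/*
	 * Deferred aggregate bind: only the DRM device's resources are
	 * released; the DP controller platform device stays bound.
	 */
	model_release_all(&drm_dev);

	return model_populate_bus(&aux);
}
```

In this model, model_retry_after_defer(false) fails on the retry
because the release action is still pending on the platform device,
while model_retry_after_defer(true) succeeds because the action ran
when the DRM device's resources were released.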

Johan