Re: [PATCH v7 00/11] drm/bridge: add devm_drm_bridge_alloc() with bridge refcount

From: Maxime Ripard
Date: Wed Mar 19 2025 - 12:18:26 EST


On Mon, Mar 17, 2025 at 03:56:07PM +0100, Luca Ceresoli wrote:
> Hello Maxime,
>
> On Fri, 14 Mar 2025 19:21:01 +0100
> Maxime Ripard <mripard@xxxxxxxxxx> wrote:
>
> > Hi,
> >
> > On Fri, Mar 14, 2025 at 11:31:13AM +0100, Luca Ceresoli wrote:
> > > This series improves the way DRM bridges are allocated and
> > > initialized and makes them reference-counted. The goal of reference
> > > counting is to avoid use-after-free by drivers which got a pointer
> > > to a bridge and keep it stored and used even after the bridge has
> > > been deallocated.
> > >
> > > The overall goal is supporting Linux devices with a DRM pipeline
> > > whose final components can be hot-plugged and hot-unplugged,
> > > including one or more bridges. For more details see the big picture
> > > [0].
> > >
> > > DRM bridge drivers will have to be adapted to the new API, which is
> > > pretty simple for most cases. Refcounting will have to be adopted
> > > on the two sides: all functions returning a bridge pointer and all
> > > code obtaining such a pointer. This series has just an overview of
> > > some of those conversions, because for now the main goal is to
> > > agree on the API.
> > >
> > > Series layout:
> > >
> > > 1. Add the new API and refcounting:
> > >
> > > drm/bridge: add devm_drm_bridge_alloc()
> > > drm/bridge: add support for refcounting
> > >
> > > 2. get/put the reference in basic operations in the bridge core:
> > >
> > > drm/bridge: get/put the bridge reference in
> > > drm_bridge_add/remove() drm/bridge: get/put the bridge reference in
> > > drm_bridge_attach/detach()
> > >
> > > 3. as an example of changes for bridge consumers, get a reference
> > > for the bridge returned by drm_bridge_chain_get_first_bridge(),
> > > have it put by all callers (all users will be covered later on
> > > separately):
> > >
> > > drm/bridge: add a cleanup action for scope-based
> > > drm_bridge_put() invocation drm/bridge: get the bridge returned by
> > > drm_bridge_chain_get_first_bridge() drm/mxsfb: put the bridge
> > > returned by drm_bridge_chain_get_first_bridge() drm/atomic-helper:
> > > put the bridge returned by drm_bridge_chain_get_first_bridge()
> > > drm/probe-helper: put the bridge returned by
> > > drm_bridge_chain_get_first_bridge()
> > >
> > > 4. convert a few bridge drivers (bridge providers) to the new API:
> > >
> > > drm/bridge: ti-sn65dsi83: use dynamic lifetime management
> > > drm/bridge: samsung-dsim: use dynamic lifetime management
> > >
> > > This work was formerly a part of my v6 DRM bridge hotplug
> > > series[0], now split as a standalone series with many improvements,
> > > hence the "v7" version number.
> >
> > Except for one patch where I had comments, I think the series is in
> > excellent shape. We're still missing a couple of things to close this
> > topic though:
> >
> > - Converting the other bridge iterators/accessors to take / put the
> > references
>
> I sent a couple in this series as you had asked, to show how conversion
> looks like. But I have a large part of this conversion partially done
> already, and it is the largest part of the refcounting work in terms of
> touched files due to the large number of drivers using the iterators
> and accessors. Here are the functions to convert:
>
> A) drm_bridge_chain_get_first_bridge
> B) drm_bridge_get_prev_bridge
> C) drm_bridge_get_next_bridge
> D) drm_for_each_bridge_in_chain
> E) drm_bridge_connector_init
> F) of_drm_find_bridge
>
> A) is present in this series as an example but I don't think it should
> be applied until all bridge drivers are converted to
> drm_bridge_alloc(). Otherwise for not-yet-converted bridge drivers we'd
> have drm_bridge_get/put() operating on an uninitialized kref, and
> __drm_bridge_free() called on non-refcounted bridges, so I think we'd
> see fireworks.
>
> In the previous iteration I used drm_bridge_is_refcounted() in
> drm_bridge_get/put() to allow a progressive migration, but if we want
> to convert everything in a single run we need to first convert all
> bridges to drm_bridge_alloc() and then convert all accessors.
>
> The same reasoning applies to patches 3-4 which add get/put to
> drm_bridge_add/remove() and _attach/detach().

Agreed.

> B) to E) are ready in my work branch, about 20 patches in total.
> Indeed item E) is a special case but it is handled there too.
>
> Item F) is the beast, because of the reverse call graph of
> of_drm_find_bridge() which includes drm_of_find_panel_or_bridge() and
> then *_of_get_bridge(), each having a few dozen callers, and leading
> to the panel_bridge topic. I have converted maybe half of the users of
> accessors in F), it's 35 patches but it's the easy half and I still need
> to tackle to hardest ones.

One thing to keep in mind is that, while it's not correct, if the bridge
has been allocated with devm_drm_bridge_alloc, it's not worse either. If
you're not getting a reference to your pointer, the point is buggy,
sure, but it's just as buggy as before, and in the same situations.

So we can make that gradually if it's more convenient.

One way to solve it would be that, for example, of_drm_find_bridge is
oddly named according to our convention and it would make more sense to
be called drm_of_find_bridge().

So maybe we can just create drm_of_find_bridge() that takes a reference,
make of_drm_find_bridge() deprecated in favour of drm_of_find_bridge(),
add a TODO, and call it a day. People will gradually switch to the new
API over time.

> > - Mass converting the drivers to devm_drm_bridge_alloc
>
> Again I sent a couple in this series as you had asked, to show how
> conversion looks like for the typical bridge driver. There are ~70
> drivers to convert in total and I think most will be easy as the two
> examples presented here.
>
> I think this should be merged entirely before merging any accessor
> changes, as explained above.

Agreed.

> > - Documenting somewhere (possibly in drm_bridge_init?) that it
> > really shouldn't be used anymore
>
> I'm afraid there is no drm_bridge_init(), bridge drivers just do
> [devm_]kzalloc and set fields explicitly. So I don't think there is a
> place to document this.

Oh, right.

Then, drm_bridge_add() would be a good candidate too to mention that
bridges must be allocated using devm_drm_bridge_alloc().

> However I used to have a documentation patch until v6 [0], and I think
> it should be revived and resent at some point, after removing the
> "legacy mode" as we are converting all drivers at once. BTW I also have
> a kunittest patch that should be revived. Do you still prefer me to
> resend these two patches as a separate series, waiting after the API in
> this series is applied?

Both options work for me.

> Overall, I think this could be the path forward, let me know if
> youthink it should be done differently:
>
> A. have patches 1 and 2 of this series applied
> (why not, even patches 10-11)

I had some comments on patch 2, but it's ok for me on principle.

> B. after (A), send series to convert all bridge drivers to new API
> (includes patches 10-11 of this series if not applied already)
> C. after (A), send documentation and kunittest patches
> D. after (B), add get/put to drm_bridge_add/remove() + attach/detech()
> (patches 3-4 in this series)
> E. after (B), send series to convert accessors (including patches 5-9
> in this series which convert drm_bridge_chain_get_first_bridge()
> and its users)

Sounds like a plan.

Maxime

Attachment: signature.asc
Description: PGP signature