Re: [PATCH] drm/bridge: Fix refcount shown via debugfs for encoder_bridges_show()

From: Luca Ceresoli

Date: Fri Mar 13 2026 - 13:35:01 EST

On Fri Mar 13, 2026 at 11:22 AM CET, Liu Ying wrote:
> On Fri, Mar 13, 2026 at 10:57:04AM +0100, Luca Ceresoli wrote:
>> On Fri Mar 13, 2026 at 9:33 AM CET, Liu Ying wrote:
>>> On Thu, Mar 12, 2026 at 06:30:22PM +0100, Luca Ceresoli wrote:
>>>> Hello Liu, Maxime,
>>>>
>>>> On Thu Mar 12, 2026 at 7:05 AM CET, Liu Ying wrote:
>>>>> A typical bridge refcount value is 3 after a bridge chain is formed:
>>>>> - devm_drm_bridge_alloc() initializes the refcount value to be 1.
>>>>> - drm_bridge_add() gets an additional reference hence 2.
>>>>> - drm_bridge_attach() gets the third reference hence 3.
>>>>>
>>>>> This typical refcount value aligns with allbridges_show()'s behaviour.
>>>>> However, since encoder_bridges_show() uses
>>>>> drm_for_each_bridge_in_chain_scoped() to automatically get/put the
>>>>> bridge reference while iterating, an extra reference is held on the
>>>>> current bridge, so the typical refcount value is wrongly shown as 4
>>>>> to users via debugfs. Fix this by caching the refcount value returned from
>>>>> kref_read() while iterating and explicitly decreasing the cached
>>>>> refcount value by 1 before showing it to users.
>>>>
>>>> Good point, indeed the refcount shown by
>>>> <debugfs>/dri/<card>/encoder-0/bridges is one higher than the one
>>>> shown in <debugfs>/dri/bridges. I understand it's puzzling from a debugfs
>>>> user point of view.
>>>>
>>>> As you noticed, this is because the _scoped loop holds an extra ref on the
>>>> current bridge.
>>>>
>>>> For other reasons I proposed a mutex for stronger protection around the
>>>> bridge chain [v2]. With the mutex the extra ref is redundant, so in [v2]
>>>> the extra ref is removed, thus making your patch unneeded. However Maxime
>>>> asked to keep the extra ref, and so my latest iteration [v4] still has the
>>>> extra ref.
>>>>
>>>> That series is still on the mailing list, so there is still time to
>>>> rediscuss it.
>>>>
>>>> @Maxime: based on the issue Liu is trying to work around, do you think it
>>>> would make sense to go back to the initial approach for that series?
>>>> I.e. drm_for_each_bridge_in_chain_scoped() grabs the chain lock, which is a
>>>> superset of the per-bridge refcount, and thus the refcount can be dropped?
>>>> This would remove the debugfs issue, slightly simplify
>>>> drm_for_each_bridge_in_chain_scoped(), and introduce no new issues AFAIK.
>>>
>>> Just my take on the chain lock approach - I agree with Maxime's comment on
>>> [v2] that keeping the get/put is better than using the chain lock to
>>> ensure the refcount is correct. The chain lock could be added later on if needed.
>>
>> Well, no, adding the chain mutex is necessary(*), otherwise Thread A could
>> iterate over the chain while thread B is adding/removing bridges to/from
>> the chain.
>>
>> And the chain mutex is a superset of the per-bridge refcount, so when
>> adding the mutex the refcount inside drm_for_each_bridge_in_chain_scoped()
>> becomes useless (and slightly harmful, as it makes the refcount shown in
>> debugfs inconsistent, as you noticed).
>
> For better code readability, I think keeping the get/put is fine even if
> you add a lock

The [v4] code with the removal of the extra refcount would not be more
complex. It would be a bit less code (no need for the DEFINE_FREE and
__free()). Maybe it'd need an extra comment to clarify when the
drm_bridge_put() is called.
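
To illustrate where the extra reference comes from, here is a minimal
userspace sketch of a scoped iterator built on the compiler's cleanup
attribute, which is the mechanism behind DEFINE_FREE/__free(). It is not
the kernel code: every toy_* name is hypothetical, the refcount is a plain
int rather than a kref, and it needs GCC or Clang.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for struct drm_bridge (assumption, not the kernel struct). */
struct toy_bridge {
	int refcount;
	struct toy_bridge *next;
};

static struct toy_bridge *toy_get(struct toy_bridge *b)
{
	if (b)
		b->refcount++;
	return b;
}

static void toy_put(struct toy_bridge *b)
{
	if (b)
		b->refcount--;
}

/* Runs automatically when the loop variable goes out of scope. */
static void toy_put_cleanup(struct toy_bridge **bp)
{
	toy_put(*bp);
}

/* Take a reference on the next bridge, then drop the one on the current. */
static struct toy_bridge *toy_next(struct toy_bridge *cur)
{
	struct toy_bridge *next = toy_get(cur->next);

	toy_put(cur);
	return next;
}

/* Scoped loop: the current bridge is referenced for each iteration. */
#define toy_for_each_scoped(first, b)					\
	for (struct toy_bridge *b					\
	     __attribute__((cleanup(toy_put_cleanup))) = toy_get(first);	\
	     b; b = toy_next(b))

/* Refcount of the first bridge as observed from inside the loop body. */
static int toy_refcount_seen_in_loop(struct toy_bridge *first)
{
	toy_for_each_scoped(first, b)
		return b->refcount;	/* read first, then cleanup drops the ref */
	return -1;
}

/* Walk the whole chain; every temporary reference is dropped on the way. */
static int toy_chain_length(struct toy_bridge *first)
{
	int n = 0;

	toy_for_each_scoped(first, b)
		n++;
	return n;
}
```

With two chained toy bridges whose refcount is 3, the value read inside the
loop is 4, and it is back to 3 once the loop variable goes out of scope.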

[v4] https://lore.kernel.org/all/20260113-drm-bridge-alloc-encoder-chain-mutex-v4-4-60f3135adc45@xxxxxxxxxxx/

> (maybe RCU list is better than mutex, since the chain is
> read often). That follows the idea that you mentioned in [1]: "every
> pointer to a drm_bridge stored somewhere is a reference to a bridge".

That's true. However while it's an important pointer hygiene rule for
device drivers, for core code it's OK to deviate when there is a reason.

> Plus, there seems to be no performance issue with the get/put, as discussed in [v2].

I confirm performance is surely not an issue here.

All that said, I'm OK with either option:

* no ref taken when the mutex is added
* ref taken when the mutex is added (as v4) + your patch to fix debugfs
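
As a rough userspace sketch of the two options (hypothetical toy_* names, a
boolean flag standing in for the chain mutex, a plain int standing in for
the kref), both end up showing the same value to the debugfs reader:

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins (assumptions): not the kernel structures. */
struct toy_bridge {
	int refcount;
};

struct toy_chain {
	bool locked;			/* models the chain mutex */
	struct toy_bridge *bridges;
};

static void toy_chain_lock(struct toy_chain *c)
{
	assert(!c->locked);
	c->locked = true;
}

static void toy_chain_unlock(struct toy_chain *c)
{
	assert(c->locked);
	c->locked = false;
}

/* Option 1: the lock alone protects the walk, no reference is taken. */
static int toy_show_lock_only(struct toy_chain *c, int i)
{
	int shown;

	toy_chain_lock(c);
	shown = c->bridges[i].refcount;
	toy_chain_unlock(c);
	return shown;
}

/* Option 2: lock plus a temporary reference, hidden from the reader. */
static int toy_show_lock_and_ref(struct toy_chain *c, int i)
{
	int shown;

	toy_chain_lock(c);
	c->bridges[i].refcount++;		/* iterator's own reference */
	shown = c->bridges[i].refcount - 1;	/* cached value minus that ref */
	c->bridges[i].refcount--;		/* iterator drops its reference */
	toy_chain_unlock(c);
	return shown;
}
```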

Luca

--
Luca Ceresoli, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com