Re: [PATCH RESEND v3 4/4] drm/connector: warn when cleaning up a refcounted connector
From: Luca Ceresoli
Date: Mon Dec 30 2024 - 10:17:10 EST
Hello,
On Wed, 25 Dec 2024 23:15:53 +0800
kernel test robot <oliver.sang@xxxxxxxxx> wrote:
> the WARN added in this commit is hit in our tests, below just FYI.
>
> kernel test robot noticed "WARNING:at_drivers/gpu/drm/drm_connector.c:#drm_connector_cleanup[drm]" on:
...
> [ 75.546607][ T377] i915 0000:00:02.0: [drm] [ENCODER:98:DDI A/PHY A] failed to retrieve link info, disabling eDP
> [ 75.557310][ T377] ------------[ cut here ]------------
> [ 75.562737][ T377] WARNING: CPU: 9 PID: 377 at drivers/gpu/drm/drm_connector.c:631 drm_connector_cleanup+0x591/0x5c0 [drm]
...
> [ 75.820489][ T377] Call Trace:
> [ 75.823699][ T377] <TASK>
> [ 75.826566][ T377] ? drm_connector_cleanup+0x591/0x5c0 [drm]
> [ 75.832581][ T377] ? __warn+0x9d/0x140
> [ 75.836581][ T377] ? drm_connector_cleanup+0x591/0x5c0 [drm]
> [ 75.842597][ T377] ? report_bug+0x1a6/0x1d0
> [ 75.847028][ T377] ? handle_bug+0x53/0xa0
> [ 75.851288][ T377] ? exc_invalid_op+0x13/0x40
> [ 75.855903][ T377] ? asm_exc_invalid_op+0x16/0x20
> [ 75.860873][ T377] ? drm_mode_object_read_refcount+0x35/0x40 [drm]
> [ 75.867428][ T377] ? drm_connector_cleanup+0x591/0x5c0 [drm]
> [ 75.873436][ T377] ? drm_connector_cleanup+0x34/0x5c0 [drm]
> [ 75.879362][ T377] intel_dp_init_connector+0x773/0x920 [i915]
OK, so we have this warning because the error path of
intel_dp_init_connector() does the following (stripped code):
    intel_dp_init_connector()
    {
        ...
        drm_connector_init_with_ddc(); // sets refcount to 1
        ...
        if (<some error condition>) {
            goto fail;
        }
        ...
    fail:
        drm_connector_cleanup(connector); // refcount == 1, warning triggers here
    }
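To make the false positive concrete, here is a small userspace model of
the sequence above. Everything in it (struct mock_connector, mock_init(),
mock_cleanup(), mock_probe(), the warnings counter) is hypothetical and
only mimics the refcount check; it is not the DRM code and not the exact
condition used in the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a connector; only the refcount matters here. */
struct mock_connector {
	int refcount;
	int warnings;	/* how many times the emulated WARN fired */
};

/* Mimics init: the object starts with one reference. */
static void mock_init(struct mock_connector *c)
{
	c->refcount = 1;
	c->warnings = 0;
}

/* Mimics cleanup with the new check: complain if references remain. */
static void mock_cleanup(struct mock_connector *c)
{
	assert(c->refcount >= 0);	/* a negative count would be a bug in this model */
	if (c->refcount != 0)
		c->warnings++;
}

/* Mimics an init function whose error path calls cleanup directly. */
static int mock_probe(struct mock_connector *c, bool fail)
{
	mock_init(c);
	if (fail)
		goto fail;
	return 0;
fail:
	mock_cleanup(c);	/* refcount is still 1 here, so the check fires */
	return -1;
}
```

In this model, calling mock_probe(&c, true) leaves c.warnings at 1 even
though nothing else ever held a pointer to the object, which is the
situation the robot reported.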
My patch is based on the assumption that a connector is always freed by
drm_connector_put() when refcount goes to 0.
The code here (and in other drivers) is an exception that makes sense
specifically for error handling during probe or initialization, and
only when the connector pointer has not been taken by other parts of
the code. However, it means the warning check unavoidably generates
false positives.
At first sight, false positives can be removed by replacing:
-drm_connector_cleanup(connector);
+drm_connector_put(connector);
and letting drm_connector_put() end up calling the .destroy func.
However, that requires ensuring, by code inspection at least, that
each .destroy is idempotent and does not "destroy too much". A quick
search turned up about 30 instances.
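As a rough illustration of what "does not destroy too much" would mean
for the _put() variant, here is another userspace sketch. All names
(struct demo_connector, demo_destroy(), demo_put(), demo_probe(),
late_resource) are made up for the example and do not correspond to any
driver; the point is only that the destroy callback has to cope with an
object whose initialization stopped halfway.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical connector model; not the DRM structures. */
struct demo_connector {
	int refcount;
	int warnings;		/* emulated WARN hits */
	int destroy_calls;
	void *late_resource;	/* allocated late in init; NULL if init failed early */
};

/* Emulated cleanup with the refcount check. */
static void demo_cleanup(struct demo_connector *c)
{
	if (c->refcount != 0)
		c->warnings++;
}

/* Emulated .destroy: must be safe on a half-initialized object. */
static void demo_destroy(struct demo_connector *c)
{
	c->destroy_calls++;
	free(c->late_resource);	/* free(NULL) is a no-op */
	c->late_resource = NULL;
	demo_cleanup(c);
}

/* Emulated put: the last reference triggers destroy. */
static void demo_put(struct demo_connector *c)
{
	assert(c->refcount > 0);
	if (--c->refcount == 0)
		demo_destroy(c);
}

/* Emulated init whose error path drops the reference instead of cleaning up. */
static int demo_probe(struct demo_connector *c, int fail_early)
{
	*c = (struct demo_connector){ .refcount = 1 };
	if (fail_early)
		goto fail;
	c->late_resource = malloc(16);
	if (!c->late_resource)
		goto fail;
	return 0;
fail:
	demo_put(c);	/* refcount drops to 0 before cleanup runs: no warning */
	return -1;
}
```

Unlike a real .destroy, the sketch does not free the connector itself,
so the counters can still be inspected afterwards. What each of the
existing callbacks does with partially set up state is exactly what
would need checking case by case.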
So, I think there are three options:
1. this patch is useful (and it helped me find an actual bug in my
code) so we want it, but first we need to remove the false positives
2. we want to support _cleanup() instead of _put() in the error path,
so let's remove this patch
3. let's keep the patch and ignore the warning: future patches adding
_cleanup() in the error path could get a kernel test robot notice
like this, so it would be good to use _put() in future drivers
Thoughts about this?
Luca
--
Luca Ceresoli, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com