Re: [PATCH] driver core: Don't log intentional skip of device link creation as error
From: Nícolas F. R. A. Prado
Date: Wed Oct 02 2024 - 16:57:33 EST
On Fri, Aug 09, 2024 at 12:13:25PM -0400, Nícolas F. R. A. Prado wrote:
> On Mon, Jul 29, 2024 at 05:08:48PM -0700, Saravana Kannan wrote:
> > On Mon, Jul 29, 2024 at 2:25 PM Nícolas F. R. A. Prado
> > <nfraprado@xxxxxxxxxxxxx> wrote:
> > >
> > > On Tue, Jun 25, 2024 at 09:56:07AM -0400, Nícolas F. R. A. Prado wrote:
> > > > On Mon, Jun 24, 2024 at 04:53:30PM -0700, Saravana Kannan wrote:
> > > > > On Mon, Jun 24, 2024 at 8:21 AM Nícolas F. R. A. Prado
> > > > > <nfraprado@xxxxxxxxxxxxx> wrote:
> > > > > >
> > > > > > Commit ac66c5bbb437 ("driver core: Allow only unprobed consumers for
> > > > > > SYNC_STATE_ONLY device links") introduced an early return in
> > > > > > device_link_add() to prevent useless links from being created. However
> > > > > > the calling function fw_devlink_create_devlink() unconditionally prints
> > > > > > an error if device_link_add() didn't create a link, even in this case
> > > > > > where it is intentionally skipping the link creation.
> > > > > >
> > > > > > Add a check to detect if the link wasn't created intentionally and in
> > > > > > that case don't log an error.
> > > > >
> > > > > Your point is somewhat valid, and I might Ack this. But this really
> > > > > shouldn't be happening a lot. Can you give more context on how you are
> > > > > hitting this?
> > > >
> > > > Of course. I'm seeing this on the mt8195-cherry-tomato-r2 platform.
> > > >
> > > > The following error is printed during boot:
> > > >
> > > > mediatek-drm-dp 1c500000.edp-tx: Failed to create device link (0x180) with backlight-lcd0
> > > >
> > > > It doesn't happen with the upstream defconfig, but with the following config
> > > > change it does:
> > > >
> > > > -CONFIG_PWM_MTK_DISP=m
> > > > +CONFIG_PWM_MTK_DISP=y
> > > >
> > > > That probably changes the order in which the MTK DP and the backlight drivers
> > > > probe, resulting in the error.
> > > >
> > > > One peculiarity that comes to mind is that the DP driver calls
> > > > devm_of_dp_aux_populate_bus() to run a callback once the panel has finished
> > > > probing. I'm not sure if this could have something to do with the error.
> > > >
> > > > Full log at https://lava.collabora.dev/scheduler/job/14573149
> > >
> > > Hi Saravana,
> > >
> > > With the given context for where this issue is happening, what do you think
> > > about this patch?
> >
> > Ah sorry, missed your earlier email.
> >
> > Couple of points:
> > 1. It looks like the link in question is "SYNC_STATE_ONLY" because
> > it's part of a cycle. Correct me if I'm wrong. You might want to use
> > the new "post-init-providers" property to help fw_devlink break the
> > cycle and enforce the right dependency between the edp-tx and your
> > backlight. And then this error should go away and your device ordering
> > is enforced a bit better.
>
> I don't see any cycle there. edp-tx points to backlight, but backlight doesn't
> point back (from mt8195-cherry.dtsi):
>
> &edp_tx {
> ...
> aux-bus {
> panel {
> compatible = "edp-panel";
> power-supply = <&pp3300_disp_x>;
> backlight = <&backlight_lcd0>;
>
>
> backlight_lcd0: backlight-lcd0 {
> compatible = "pwm-backlight";
> brightness-levels = <0 1023>;
> default-brightness-level = <576>;
> enable-gpios = <&pio 82 GPIO_ACTIVE_HIGH>;
> num-interpolated-steps = <1023>;
> pwms = <&disp_pwm0 0 500000>;
> power-supply = <&ppvar_sys>;
> };
>
> And DL_FLAG_CYCLE is not set in the flags in the error log: 0x180. (Let me know
> if there's something else that I should be looking at to detect a cycle)
Hi Saravana,
Here are some debug logs to help contextualize the issue:
[ 0.198518] device: 'backlight-lcd0': device_add
[ 0.198655] platform 1c500000.edp-tx: Linked as a sync state only consumer to backlight-lcd0
[ 34.971653] platform backlight-lcd0: error -EPROBE_DEFER: supplier 1100e000.pwm not ready
[ 35.115480] mediatek-drm-dp 1c500000.edp-tx: driver: 'mediatek-drm-dp': driver_bound: bound to device
[ 35.160115] mediatek-drm-dp 1c500000.edp-tx: Dropping the link to backlight-lcd0
[ 53.910433] pwm-backlight backlight-lcd0: driver: 'pwm-backlight': driver_bound: bound to device
[ 53.919213] mediatek-drm-dp 1c500000.edp-tx: Failed to create device link (0x180) with backlight-lcd0
So a SYNC_STATE_ONLY device link is created from backlight-lcd0 to edp-tx. When
the edp-tx probes, the link is dropped, since it is SYNC_STATE_ONLY. When the
backlight-lcd0 probes a new devlink is attempted to the consumer edp-tx and
fails, since it is useless, printing the warning.
You mentioned a cycle before. The only cycle I see is between the edp-tx and the
panel, but doesn't involve the backlight:
[ 0.198104] ----- cycle: start -----
[ 0.198105] /soc/edp-tx@1c500000/aux-bus/panel: cycle: depends on /soc/edp-tx@1c500000
[ 0.198112] ----- cycle: start -----
[ 0.198113] /soc/edp-tx@1c500000/aux-bus/panel: cycle: child of /soc/edp-tx@1c500000
[ 0.198119] /soc/edp-tx@1c500000: cycle: depends on /soc/edp-tx@1c500000/aux-bus/panel
[ 0.198125] ----- cycle: end -----
[ 0.198126] platform 1c500000.edp-tx: Fixed dependency cycle(s) with /soc/edp-tx@1c500000/aux-bus/panel
Just in case I tried using post-init-providers:
diff --git a/arch/arm64/boot/dts/mediatek/mt8195-cherry.dtsi b/arch/arm64/boot/dts/mediatek/mt8195-cherry.dtsi
index 75d56b2d5a3d..19df138ef043 100644
--- a/arch/arm64/boot/dts/mediatek/mt8195-cherry.dtsi
+++ b/arch/arm64/boot/dts/mediatek/mt8195-cherry.dtsi
@@ -322,6 +322,7 @@ &edp_tx {
pinctrl-names = "default";
pinctrl-0 = <&edptx_pins_default>;
+ post-init-providers = <&panel>;
ports {
#address-cells = <1>;
@@ -344,7 +345,7 @@ edp_out: endpoint {
};
aux-bus {
- panel {
+ panel: panel {
compatible = "edp-panel";
power-supply = <&pp3300_disp_x>;
backlight = <&backlight_lcd0>;
It broke the cycle, as I no longer see it in the logs, but the failed device
link warning is still there as expected.
It seems to me that the issue comes form the device link being SYNC_STATE_ONLY
in the first place. I see that comes from the 'else' path in
if (con->fwnode == link->consumer)
flags = fw_devlink_get_flags(link->flags);
else
flags = FW_DEVLINK_FLAGS_PERMISSIVE;
and the value on each side of the comparison is:
con->fwnode: /soc/edp-tx@1c500000
link->consumer: /soc/edp-tx@1c500000/aux-bus/panel
Could you help me understand what (if anything) is wrong here?
(I also noticed that as per the DT the consumer for backlight-lcd0 should be the
panel, but the devlink has it instead as the edp-tx, I'm guessing that's another
symptom of the same issue)
Thanks,
Nícolas
>
> Thanks,
> Nícolas
>
> >
> > 2. If we take this patch, it might be better to make a "useless device
> > links" helper function and use the same on in device_link_add() and in
> > the "if" condition that's used to decide if a device link should be
> > created. The con == sup is another "useless device link" check. But
> > I'm not really sure if we should mask this error as it help with stuff
> > like (1). con == sup is a bit more clear cut example of a "useless
> > device link" in the context of fw_devlink.
> >
> > -Saravana