Re: [PATCH] drm/client: Send hotplug event after registering a client

From: Thomas Zimmermann
Date: Mon Jul 10 2023 - 06:00:50 EST


Hi

Am 10.07.23 um 11:52 schrieb Javier Martinez Canillas:
Thomas Zimmermann <tzimmermann@xxxxxxx> writes:

Hello Thomas,

Generate a hotplug event after registering a client to allow the
client to configure its display. Remove the hotplug calls from the
existing clients for fbdev emulation. This change fixes a concurrency
bug between registering a client and receiving events from the DRM
core. The bug is present in the fbdev emulation of all drivers.

The fbdev emulation currently generates a hotplug event before
registering the client to the device. For each new output, the DRM
core sends an additional hotplug event to each registered client.

If the DRM core detects first output between sending the artificial
hotplug and registering the device, the output's hotplug event gets
lost. If this is the first output, the fbdev console display remains
dark. This has been observed with amdgpu and fbdev-generic.

Fix this by adding hotplug generation directly to the client's
register helper drm_client_register(). Registering the client and
receiving events are serialized by struct drm_device.clientlist_mutex.
So an output is either configured by the initial hotplug event, or
the client has already been registered.

The bug was originally added in commit 6e3f17ee73f7 ("drm/fb-helper:
generic: Call drm_client_add() after setup is done"), in which adding
a client and receiving a hotplug event switched order. It was hidden,
as most hardware and drivers have at least on static output configured.
Other drivers didn't use the internal DRM client or still had struct
drm_mode_config_funcs.output_poll_changed set. That callback handled
hotplug events as well. After not setting the callback in amdgpu in
commit 0e3172bac3f4 ("drm/amdgpu: Don't set struct
drm_driver.output_poll_changed"), amdgpu did not show a framebuffer
console if output events got lost. The bug got copy-pasted from
fbdev-generic into the other fbdev emulation.

Reported-by: Moritz Duge <MoritzDuge@xxxxxxxxxxxx>
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/2649

Aren't you missing a Fixes: for 0e3172bac3f4 too? Since that's the commit
that unmasked the bug for amdgpu, IMO that is the most important to list.

Well, OK.


Fixes: 6e3f17ee73f7 ("drm/fb-helper: generic: Call drm_client_add() after setup is done")
Fixes: 8ab59da26bc0 ("drm/fb-helper: Move generic fbdev emulation into separate source file")
Fixes: b79fe9abd58b ("drm/fbdev-dma: Implement fbdev emulation for GEM DMA helpers")
Fixes: 63c381552f69 ("drm/armada: Implement fbdev emulation as in-kernel client")
Fixes: 49953b70e7d3 ("drm/exynos: Implement fbdev emulation as in-kernel client")
Fixes: 8f1aaccb04b7 ("drm/gma500: Implement client-based fbdev emulation")
Fixes: 940b869c2f2f ("drm/msm: Implement fbdev emulation as in-kernel client")
Fixes: 9e69bcd88e45 ("drm/omapdrm: Implement fbdev emulation as in-kernel client")
Fixes: e317a69fe891 ("drm/radeon: Implement client-based fbdev emulation")
Fixes: 71ec16f45ef8 ("drm/tegra: Implement fbdev emulation as in-kernel client")
Signed-off-by: Thomas Zimmermann <tzimmermann@xxxxxxx>
Tested-by: Moritz Duge <MoritzDuge@xxxxxxxxxxxx>
Tested-by: Torsten Krah <krah.tm@xxxxxxxxx>
Tested-by: Paul Schyska <pschyska@xxxxxxxxx>
Cc: Daniel Vetter <daniel.vetter@xxxxxxxx>
Cc: David Airlie <airlied@xxxxxxxxx>
Cc: Noralf Trønnes <noralf@xxxxxxxxxxx>
Cc: Maarten Lankhorst <maarten.lankhorst@xxxxxxxxxxxxxxx>
Cc: Maxime Ripard <mripard@xxxxxxxxxx>
Cc: Javier Martinez Canillas <javierm@xxxxxxxxxx>
Cc: Russell King <linux@xxxxxxxxxxxxxxx>
Cc: Inki Dae <inki.dae@xxxxxxxxxxx>
Cc: Seung-Woo Kim <sw0312.kim@xxxxxxxxxxx>
Cc: Kyungmin Park <kyungmin.park@xxxxxxxxxxx>
Cc: Krzysztof Kozlowski <krzysztof.kozlowski@xxxxxxxxxx>
Cc: Patrik Jakobsson <patrik.r.jakobsson@xxxxxxxxx>
Cc: Rob Clark <robdclark@xxxxxxxxx>
Cc: Abhinav Kumar <quic_abhinavk@xxxxxxxxxxx>
Cc: Dmitry Baryshkov <dmitry.baryshkov@xxxxxxxxxx>
Cc: Tomi Valkeinen <tomi.valkeinen@xxxxxxxxxxxxxxxx>
Cc: Alex Deucher <alexander.deucher@xxxxxxx>
Cc: "Christian König" <christian.koenig@xxxxxxx>
Cc: "Pan, Xinhui" <Xinhui.Pan@xxxxxxx>
Cc: Thierry Reding <thierry.reding@xxxxxxxxx>
Cc: Mikko Perttunen <mperttunen@xxxxxxxxxx>
Cc: dri-devel@xxxxxxxxxxxxxxxxxxxxx
Cc: linux-kernel@xxxxxxxxxxxxxxx
Cc: linux-arm-kernel@xxxxxxxxxxxxxxxxxxx
Cc: linux-samsung-soc@xxxxxxxxxxxxxxx
Cc: linux-arm-msm@xxxxxxxxxxxxxxx
Cc: freedreno@xxxxxxxxxxxxxxxxxxxxx
Cc: amd-gfx@xxxxxxxxxxxxxxxxxxxxx
Cc: linux-tegra@xxxxxxxxxxxxxxx
Cc: dri-devel@xxxxxxxxxxxxxxxxxxxxx
Cc: <stable@xxxxxxxxxxxxxxx> # v5.2+

While it's true that the but was introduced by commit 6e3f17ee73f7 and that
landed in v5.2, I wonder if this patch could even be applied to such olders
Linux versions. Probably in practice it would be at most backported to
v6.2, which is the release that exposed the bug for the amdgpu driver.

No idea. The fix looks simple enough, but a lot has changed in the surrounding code.

Best regards
Thomas


Your explanation makes sense to me and the patch looks good.

Reviewed-by: Javier Martinez Canillas <javierm@xxxxxxxxxx>


--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Frankenstrasse 146, 90461 Nuernberg, Germany
GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman
HRB 36809 (AG Nuernberg)

Attachment: OpenPGP_signature
Description: OpenPGP digital signature