Re: [PATCH v2 0/5] Add support for GE SUNH hot-pluggable connector (was: "drm: add support for hot-pluggable bridges")

From: Daniel Vetter
Date: Thu May 16 2024 - 09:22:15 EST


Apologies for missing v1 ...

On Fri, May 10, 2024 at 09:10:36AM +0200, Luca Ceresoli wrote:
> DRM hotplug bridge driver
> =========================
>
> DRM natively supports pipelines whose display can be removed, but all the
> components preceding it (all the display controller and any bridges) are
> assumed to be fixed and cannot be plugged, removed or modified at runtime.
>
> This series adds support for DRM pipelines having a removable part after
> the encoder, thus also allowing bridges to be removed and reconnected at
> runtime, possibly with different components.
>
> This picture summarizes the DRM structure implemented by this series:
>
> .------------------------.
> | DISPLAY CONTROLLER |
> | .---------. .------. |
> | | ENCODER |<--| CRTC | |
> | '---------' '------' |
> '------|-----------------'
> |
> |DSI HOTPLUG
> V CONNECTOR
> .---------. .--. .-. .---------. .-------.
> | 0 to N | | _| _| | | 1 to N | | |
> | BRIDGES |--DSI-->||_ |_ |--DSI-->| BRIDGES |--LVDS-->| PANEL |
> | | | | | | | | | |
> '---------' '--' '-' '---------' '-------'
>
> [--- fixed components --] [----------- removable add-on -----------]
>
> Fixed components include:
>
> * all components up to the DRM encoder, usually part of the SoC
> * optionally some bridges, in the SoC and/or as external chips
>
> Components on the removable add-on include:
>
> * one or more bridges
> * a fixed connector (not one natively supporting hotplug such as HDMI)
> * the panel

So I think at a high level this design approach makes sense, but the
implementation needs some serious thought. One big thing upfront though,
we need to have a clear plan for the overlay hotunload issues, otherwise
trying to make drm bridges hotpluggable makes no sense to me. Hotunload is
very, very tricky, full of lifetime issues, and those need to be sorted
out first or we're just trying to build a castle on quicksand.

For bridges itself I don't think the current locking works. You're trying
to really cleverly hide it all behind a normal-looking bridge driver, but
there's many things beyond that which will blow up if bridges just
disappear. Most importantly the bridge states part of an atomic update.

Now in drm we have drm_connector as the only hotunpluggable thing, and it
took years to sort out all the issues. I think we should either model the
bridge hotunplug locking after that, or just outright reuse the connector
locking and lifetime rules. I much prefer the latter personally.

Anyway the big issues:

- We need to refcount the hotpluggable bridges, because software (like
atomic state updates) might hang onto pointers for longer than the
bridge physically exists. Assuming that you can all tear it down
synchronously will not work.

If we reuse connector locking/lifetime then we could put the
hotpluggable part of the bridge chain into the drm_connector, since that
already has refcounting as needed. It would mean that finding the next
bridge in the chain becomes a lot more tricky though. With that model
we'd create a new connector every time the bridge is hotplugged, which I
think is also the cleaner model (because you might plug in a hdmi
connector after a panel, so things like the connector type change).

- No notifiers please. The create a locking mess with inversions, and
especially for hotunplug they create the illusion that you can
synchronously keep up to date with hardware state. That's not possible.
Fundamentally all bridge drivers which might be hotunplugged need to be
able to cope with the hardware disappearing any momemnt.

Most likely changes/fixes we need to make overlay hotunload work will
impact how exactly this works all ...

Also note that the entire dance around correctly stopping userspace from
doing modesets on, see all the relevant changes in
update_connector_routing(). Relying on hotplugging connectors will sort
out a lot of these issues in a consistent way.

- Related to this: You're not allowed to shut down hardware behind the
user's back with drm_atomic_helper_shutdown. We've tried that approach
with dp mst, it really pisses off userspace when a page_flip that it
expected to work doesn't work.

- There's also the design aspect that in atomic, only atomic_check is
allowed to fail, atomic_commit must succeed, even when the hardware is
gone. Using connectors and their refcounting should help with that.

- Somewhat aside, but I noticed that the bridge->atomic_reset is in
drm_bridge_attach, and that's kinda the wrong place. It should be in
drm_mode_config_reset, like all the other ->atomic_reset hooks. That
would make it a lot clearer that we need to figure out who/when
->atomic_reset should be called for hotplugged bridges, maybe as part of
connector registration when the entire bridge and it's new connector is
assembled?

- Finally this very much means we need to rethink who/how the connector
for a bridge is created. The new design is that the main driver creates
this connector, once the entire bridge exists. But with hotplugging this
gets a lot more complicated, so we might want to extract a pile of that
encoder related code from drivers (same way dp mst helpers take care of
connector creation too, it's just too much of a mess otherwise).

The current bridge chaining infrastructure requires a lot of hand-rolled
code in each bridge driver and the encoder, so that might be a good
thing anyway.

- Finally I think the entire bridge hotplug infrastructure should be
irrespective of the underlying bus. Which means for the mipi dsi case we
might also want to look into what's missing to make mipi dsi
hotunpluggable, at least for the case where it's a proper driver. I
think we should ignore the old bridge model where driver's stitched it
all toghether using the component framework, in my opinion that approach
should be deprecated.

- Finally I think we should have a lot of safety checks, like only bridges
which declare themselve to be hotunplug safe should be allowed as a part
of the hotpluggable bridge chain part. All others must still be attached
before the entire driver is registered with drm_dev_register.

Or that we only allow bridges with the NO_CONNECTOR flag for
drm_bridge_attach.

There's probably a pile more fundamental issues I've missed, but this
should get a good discussion started.
-Sima
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch