Re: [PATCH] thunderbolt: Defer DP tunnel teardown until display driver is ready

From: Mika Westerberg

Date: Wed May 27 2026 - 03:14:55 EST


Hi,

On Wed, May 27, 2026 at 02:41:21PM +0800, ChunAn Wu wrote:
> When the Thunderbolt driver loads early (e.g., from initramfs)
> and discovers a BIOS-established DisplayPort tunnel, it starts
> asynchronous DPRX polling which checks if the GPU driver has
> read DPCD from the connected monitor within a 12-second timeout
> (TB_DPRX_TIMEOUT).
>
> On systems with Full Disk Encryption (FDE/LUKS), the GPU driver
> (i915, xe, amdgpu, etc.) resides on the encrypted root filesystem
> and cannot load until the user enters the passphrase. This creates
> a driver load ordering issue where the DPRX timeout fires before
> the GPU driver has had a chance to initialize, causing the
> Thunderbolt driver to permanently tear down the DP tunnel and
> remove the DP IN adapter from available resources. Recovery
> requires a physical re-plug of the dock.
>
> Fix this by deferring the DP tunnel teardown when no PCI display
> driver has bound yet. Register a PCI bus notifier that watches
> for display class (PCI_BASE_CLASS_DISPLAY) driver bind events.
> When the DPRX timeout fires:
>
> - If no display driver is bound: tear down the tunnel but keep
> the DP IN adapter in the available resources list, allowing
> a retry.
> - If a display driver is already bound: proceed with the
> existing behavior of permanently removing the DP IN resource.
>
> When a display driver eventually binds, the notifier triggers a
> DP tunnel retry via a scheduled work item, re-establishing the
> connection.
>
> This approach requires no changes to GPU drivers and handles all
> GPU vendors (Intel, AMD, NVIDIA) through the generic PCI base
> class check (0x03xx covers VGA, XGA, 3D, and other display
> controllers). It also handles the FDE case gracefully since the
> defer and retry can span an unbounded passphrase wait.
>
> Tested on Dell Pro Max 14 MC14250 with Dell SD25TB5 Thunderbolt
> 5 Dock and LUKS full disk encryption. Simulated a 58-second
> delay between TB and GPU driver loading -- display came up
> successfully after display driver bound.
>
> Signed-off-by: ChunAn Wu <an.wu@xxxxxxxxxxxxx>
> ---
> drivers/thunderbolt/tb.c | 96 ++++++++++++++++++++++++++++++++++++----
> 1 file changed, 88 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/thunderbolt/tb.c b/drivers/thunderbolt/tb.c
> index 95d84612e06e..48e0b540fbec 100644
> --- a/drivers/thunderbolt/tb.c
> +++ b/drivers/thunderbolt/tb.c
> @@ -62,6 +62,9 @@ MODULE_PARM_DESC(asym_threshold,
> * @remove_work: Work used to remove any unplugged routers after
> * runtime resume
> * @groups: Bandwidth groups used in this domain.
> + * @pci_nb: PCI bus notifier to detect when a display driver binds
> + * @display_bound: Set when a PCI display driver has bound
> + * @display_retry_work: Work to retry DP tunneling after display driver binds
> */
> struct tb_cm {
> struct list_head tunnel_list;
> @@ -69,6 +72,9 @@ struct tb_cm {
> bool hotplug_active;
> struct delayed_work remove_work;
> struct tb_bandwidth_group groups[MAX_GROUPS];
> + struct notifier_block pci_nb;
> + bool display_bound;
> + struct work_struct display_retry_work;
> };
>
> static inline struct tb *tcm_to_tb(struct tb_cm *tcm)
> @@ -1914,6 +1920,58 @@ static struct tb_port *tb_find_dp_out(struct tb *tb, struct tb_port *in)
> return NULL;
> }
>
> +static void tb_tunnel_dp(struct tb *tb);
> +
> +/*
> + * Check if any PCI display class (0x03xx) device has a driver bound.
> + * Used to decide whether to defer DPRX polling at boot.
> + */
> +static bool tb_is_display_driver_bound(void)
> +{
> + struct pci_dev *pdev = NULL;
> +
> + while ((pdev = pci_get_base_class(PCI_BASE_CLASS_DISPLAY, pdev))) {

There is no way we are going to call PCI functions from the core of the CM.
We are actually going in the opposite direction, so that non-PCIe hosts
can be supported.

Why not put the TB driver on the encrypted volume as well, if the graphics
driver is there? Or make the graphics drivers part of the initramfs?
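
For illustration only, the second option could look roughly like this on
a dracut-based system. This is a sketch under assumptions: the thread does
not say which initramfs generator the affected machines use, the file name
is hypothetical, and the module list is taken from the drivers named in
the patch description and should be trimmed to the GPU actually present.

```shell
# /etc/dracut.conf.d/gpu.conf  (hypothetical file name)
# Assumption: dracut builds the initramfs. Pull the GPU drivers in so they
# can bind before the LUKS passphrase prompt, while the DP tunnel is still up.
add_drivers+=" i915 xe amdgpu "
```

The initramfs would then need to be regenerated (for example with
`dracut --force`) for the change to take effect.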