Re: [PATCH 0/1] thunderbolt: Fix blank external display after HRR on USB4 v2
From: Mika Westerberg
Date: Thu Apr 30 2026 - 06:15:10 EST
Hi,
On Thu, Apr 30, 2026 at 03:31:42PM +0800, Chia-Lin Kao (AceLan) wrote:
> Hi,
>
> On Dell XPS 14 (Panther Lake) with a WD22TB4 Thunderbolt dock and BenQ
> PD2725U external display, the display goes permanently blank on ~50% of
> boots. The only way to recover is a full reboot — re-plugging the
> monitor or dock does not help.
>
> The root cause is a race between the USB4 v2 Host Router Reset (HRR)
> and the graphics driver initialization:
>
> 1. nhi_probe() performs HRR at ~t=1s, destroying BIOS-established
> DP tunnels.
> 2. The Thunderbolt driver re-discovers the dock via hotplug at ~t=4s
> and attempts to re-create the DP tunnel.
> 3. DPRX negotiation fails because the graphics driver (xe) is not yet
> ready — the 12-second timeout expires at ~t=18s.
> 4. tb_dp_tunnel_active() permanently removes the DP IN adapter from
> available resources on the first failure, so the display never
> recovers.
>
> The fix adds a retry mechanism: on DPRX negotiation failure, the driver
> retries up to 3 times with a 5-second delay, giving the graphics driver
> time to come up.
>
> Tested with 13 boot cycles on the affected machine:
> - 6 boots hit the HRR + DPRX race: all recovered via retry, display
> came online after 3 retry attempts (~58s).
> - 5 clean boots (no HRR): DP tunnel established immediately.
> - 2 boots with HRR where DPRX succeeded on first try.
> - 0 teardowns: the retry mechanism was never exhausted.
>
> Full dmesg log - https://people.canonical.com/~acelan/bugs/dp-retry-on-hrr/
I'm looking at that, but the first thing that stands out is this:

[ 1.051684] thunderbolt: loading out-of-tree module taints kernel.

That tells me the driver may carry modifications that are not in
mainline.

The second thing is that the log is missing "thunderbolt.dyndbg=+p", which
would show what is going on there. I suggest adding that pretty much always.

Yes, this can happen. The idea behind the 12 s timeout was that it accounts
for the time it may take to boot up (as well as the polling that i915
does if it is runtime suspended). I would say that whatever is delaying the
boot should be investigated first, because that is not a good user
experience.

Aside from that, does it work if you add "thunderbolt.dprx_timeout=-1"? If
really needed we can increase that a bit, but I'm not too enthusiastic
about adding code for retrying this because we already have this timeout
that we can adjust as needed (we can make the default higher).
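
For reference, a minimal sketch of passing both options on the kernel
command line, assuming a GRUB-based distribution (the file location and
the "..." placeholder for your existing options are assumptions; adjust
for your bootloader):

    # /etc/default/grub (assumed location)
    GRUB_CMDLINE_LINUX_DEFAULT="... thunderbolt.dyndbg=+p thunderbolt.dprx_timeout=-1"

After regenerating the bootloader configuration and rebooting, the running
command line can be checked with "cat /proc/cmdline".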