Re: UBSAN: shift-out-of-bounds in drivers/gpu/drm/display/drm_dp_mst_topology.c:4575:36
From: Imre Deak
Date: Fri Dec 12 2025 - 14:14:53 EST
Hi,
On Fri, Dec 12, 2025 at 07:55:12PM +0100, Zdenek Kabelac wrote:
> Hi
>
> I've noticed this in my message log while suspending my P1 machine
> (and unplugging it from my docking station).
>
> ----
> kernel: UBSAN: shift-out-of-bounds in
> drivers/gpu/drm/display/drm_dp_mst_topology.c:4575:36
> kernel: shift exponent -1 is negative
> kernel: CPU: 12 UID: 0 PID: 2972 Comm: Xorg Not tainted
> 6.18.0-0.rc7.58.fc44.x86_64 #1 PREEMPT(lazy)
> kernel: Hardware name: LENOVO 21FFS02H22/21FFS02H22, BIOS N3VET59W (1.59 )
> 05/13/2025
> kernel: Call Trace:
> kernel: <TASK>
> kernel: dump_stack_lvl+0x5d/0x80
> kernel: ubsan_epilogue+0x5/0x2b
> kernel: __ubsan_handle_shift_out_of_bounds.cold+0xd7/0x1ab
> kernel: drm_dp_atomic_release_time_slots.cold+0x1c/0x90 [drm_display_helper]
> kernel: drm_atomic_helper_check_modeset+0x2a9/0x690
> pipewire-media-session[3089]: ms.mod.default-profile: device
> 'alsa_card.pci-0000_c4_00.1': can't restore profile: No such device
> kernel: amdgpu_dm_atomic_check+0x64/0x1570 [amdgpu]
> kernel: drm_atomic_check_only+0x180/0x3e0
> kernel: ? __pfx_drm_mode_setcrtc+0x10/0x10
> kernel: drm_atomic_commit+0x71/0xe0
> kernel: ? __pfx___drm_printfn_info+0x10/0x10
> kernel: drm_atomic_helper_set_config+0x7a/0xd0
> kernel: drm_mode_setcrtc+0x37a/0x900
> kernel: ? __wait_for_common+0x162/0x190
> kernel: ? __pfx_drm_mode_setcrtc+0x10/0x10
> kernel: drm_ioctl_kernel+0xae/0x100
> kernel: drm_ioctl+0x2a8/0x550
> kernel: ? __pfx_drm_mode_setcrtc+0x10/0x10
> kernel: amdgpu_drm_ioctl+0x4a/0x90 [amdgpu]
> kernel: __x64_sys_ioctl+0x97/0xe0
> kernel: do_syscall_64+0x7e/0x7f0
> kernel: ? __check_object_size.part.0+0x34/0xc0
> kernel: ? drm_ioctl+0x2dd/0x550
> kernel: ? __pfx_drm_mode_setcrtc+0x10/0x10
> kernel: ? ktime_get_mono_fast_ns+0x35/0xe0
> kernel: ? amdgpu_drm_ioctl+0x7b/0x90 [amdgpu]
> kernel: ? syscall_exit_work+0x143/0x1b0
> kernel: ? do_syscall_64+0xb6/0x7f0
> kernel: ? do_syscall_64+0xb6/0x7f0
> kernel: ? sched_clock+0x10/0x30
> kernel: ? sched_clock_cpu+0xb/0x30
> kernel: ? irqtime_account_irq+0x3c/0xc0
> kernel: ? irqentry_exit_to_user_mode+0x2c/0x1c0
> kernel: entry_SYSCALL_64_after_hwframe+0x76/0x7e
> kernel: RIP: 0033:0x7f1280c2444d
> kernel: Code: 04 25 28 00 00 00 48 89 45 c8 31 c0 48 8d 45 10 c7 45 b0 10 00
> 00 00 48 89 45 b8 48 8d 45 d0 48 89 45 c0 b8 10 00 00 00 0f 05 <89> c2 3d 00
> f0 ff ff 77 1a 48 8b 45 c8 64 48 2b 04 25 28 00 00 00
> kernel: RSP: 002b:00007ffc601c14d0 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
> kernel: RAX: ffffffffffffffda RBX: 000000001bc87f60 RCX: 00007f1280c2444d
> kernel: RDX: 00007ffc601c1560 RSI: 00000000c06864a2 RDI: 000000000000000d
> kernel: RBP: 00007ffc601c1520 R08: 0000000000000000 R09: 0000000000000000
> kernel: R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffc601c1560
> kernel: R13: 00000000c06864a2 R14: 000000000000000d R15: 0000000000000000
> kernel: </TASK>
> kernel: ---[ end trace ]---
> boltd[1177]: probing: timeout, done: [2636501] (2000000)
> kernel: Lockdown: systemd-logind: hibernation is restricted; see man
> kernel_lockdown.7
> kernel: Lockdown: systemd-logind: hibernation is restricted; see man
> kernel_lockdown.7
> systemd-logind[1183]: The system will suspend now!
> -----
>
> I'd have guess it didn't liked maybe unplug of cable - however the problem
> was later on after resume - where the machine looked 'very slow' - first
> I've suspected the latest version of Firefox got somehow slow - but after
> clean reboot everything started to work fine & fast again.
>
> But so far I don't see anything else suspicious in my system log to give
> some hints why later on my Xorg session was running so slowly.
>
> My laptop: ThinkPad P16v Gen 1, 64G RAM, 16 AMD Cores
The WARN itself should be removed by
https://lore.kernel.org/all/20251119094650.799135-1-suraj.kandpal@xxxxxxxxx
However, that fix has no effect other than removing the
WARN message/stack trace, so the slow-down you describe must have a
different root cause than the -1 shift value above. A wild
guess is that it's a timeout while trying to access the sink (which is
inaccessible because it was unplugged), with the corresponding MST
timeout being rather long (up to 4 sec per MST programming step).
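
For illustration only: "shift exponent -1 is negative" is the kind of
report you get when a bit is computed from a 1-based id that happens to
be 0. The sketch below is a hypothetical user-space helper
(clear_slot_bit() and its arguments are made up for this example); it is
neither the code at drm_dp_mst_topology.c:4575 nor the patch linked
above. It only shows why guarding the id makes the report go away
without changing anything else.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper, not kernel code: clear the bit that belongs to a
 * 1-based id in a 32-bit mask. An id of 0 is assumed to mean "nothing
 * allocated". Shifting by (0 - 1) is undefined behaviour in C, which is
 * what UBSAN flags, so ids outside 1..32 are rejected up front.
 */
static bool clear_slot_bit(uint32_t *mask, int id)
{
	if (id < 1 || id > 32)
		return false;	/* invalid id: leave the mask untouched */

	*mask &= ~(UINT32_C(1) << (id - 1));
	return true;
}
```

With id == 0 the helper returns false and the mask is unchanged, so the
guard silences the report but cannot explain a slow-down by itself.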
I'd suggest opening an AMD DRM driver ticket to get a better idea,
attaching there a dmesg log taken after booting with debug enabled (at
least adding the drm.debug=0x15e kernel param) and reproducing the
problem.
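
In case it helps to see what drm.debug=0x15e selects, here is a small
user-space sketch that decodes the value. The bit names follow my
reading of enum drm_debug_category in include/drm/drm_print.h; treat
the table as an assumption and check it against your tree.
decode_drm_debug() is a name made up for this example.

```c
#include <stddef.h>
#include <stdio.h>

struct dbg_bit {
	unsigned int bit;
	const char *name;
};

/* Assumed mapping of drm.debug bits to categories (verify in drm_print.h). */
static const struct dbg_bit drm_dbg_bits[] = {
	{ 0x001, "CORE" },   { 0x002, "DRIVER" }, { 0x004, "KMS" },
	{ 0x008, "PRIME" },  { 0x010, "ATOMIC" }, { 0x020, "VBL" },
	{ 0x040, "STATE" },  { 0x080, "LEASE" },  { 0x100, "DP" },
	{ 0x200, "DRMRES" },
};

/*
 * Write the space-separated names of the categories set in val into buf.
 * Returns the number of names written; stops early if buf is too small.
 */
static int decode_drm_debug(unsigned int val, char *buf, size_t len)
{
	size_t used = 0;
	int count = 0;

	if (len == 0)
		return 0;
	buf[0] = '\0';

	for (size_t i = 0; i < sizeof(drm_dbg_bits) / sizeof(drm_dbg_bits[0]); i++) {
		if (!(val & drm_dbg_bits[i].bit))
			continue;

		int n = snprintf(buf + used, len - used, "%s%s",
				 count ? " " : "", drm_dbg_bits[i].name);
		if (n < 0 || (size_t)n >= len - used)
			break;	/* no room for the full name: stop here */

		used += (size_t)n;
		count++;
	}
	return count;
}
```

Under that assumed table, decode_drm_debug(0x15e, buf, sizeof(buf))
fills buf with "DRIVER KMS PRIME ATOMIC STATE DP", i.e. everything
relevant to modesetting and DP/MST without the noisier CORE and VBL
messages.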
> Regards
>
> Zdenek
>