Regression with display port monitor under xorg after undock/dock cycle

From: Manfred Benesch
Date: Mon Jul 15 2019 - 07:09:18 EST


Hello everybody,

After upgrading from kernel 5.0 to 5.2, I ran into a reproducible
regression with a DisplayPort monitor on a Quadro M1000M. I get the
following kernel messages:

[  161.070503] nouveau 0000:01:00.0: 126.016 Gb/s available PCIe
bandwidth, limited by 8 GT/s x16 link at 0000:00:01.0 (capable of
992.439 Gb/s with 16 GT/s x63 link)
[  162.449210] WARNING: CPU: 5 PID: 1497 at
drivers/gpu/drm/drm_dp_mst_topology.c:3209
drm_dp_atomic_release_vcpi_slots+0x43/0xa0 [drm_kms_helper]
[  162.449211] Modules linked in: bnep cpufreq_conservative
cpufreq_userspace cpufreq_powersave msr binfmt_misc nls_ascii nls_cp437
vfat fat joydev arc4 uvcvideo btusb btrtl btbcm videobuf2_vmalloc
btintel mei_hdcp videobuf2_memops videobuf2_v4l2 videobuf2_common
bluetooth videodev media ecdh_generic ecc intel_rapl iwlmvm
x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel mac80211 kvm
irqbypass snd_hda_codec_realtek intel_cstate snd_hda_codec_generic
iwlwifi intel_uncore deflate intel_rapl_perf efi_pstore snd_hda_intel
pcspkr snd_hda_codec efivars snd_hwdep snd_hda_core snd_pcm rmi_smbus
iTCO_wdt rmi_core snd_timer iTCO_vendor_support cfg80211 sg mei_me
nvidiafb mei vgastate fb_ddc thinkpad_acpi intel_pch_thermal nvram snd
tpm_crb soundcore tpm_tis rfkill ac battery tpm_tis_core tpm rng_core
pcc_cpufreq loop sunrpc ecryptfs efivarfs ip_tables x_tables autofs4
btrfs algif_skcipher af_alg mmc_block hid_generic usbhid hid lrw fuse
fan dm_raid raid456 async_raid6_recov async_memcpy async_pq
[  162.449241]  async_xor async_tx xor raid6_pq libcrc32c md_mod
dm_snapshot dm_bufio dm_crypt dm_mirror dm_region_hash dm_log dm_mod
crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel nouveau
i915 rtsx_pci_sdmmc mmc_core mxm_wmi ttm i2c_algo_bit xhci_pci xhci_hcd
aesni_intel drm_kms_helper e1000e aes_x86_64 psmouse crypto_simd ptp
serio_raw cryptd glue_helper pps_core rtsx_pci drm usbcore i2c_i801
mfd_core sd_mod evdev thermal wmi video button
[  162.449259] CPU: 5 PID: 1497 Comm: Xorg Tainted: G                T
5.2.0-thinkpad #1
[  162.449261] Hardware name: LENOVO 20EQS3V200/20EQS3V200, BIOS
N1EET82W (1.55 ) 12/18/2018
[  162.449268] RIP: 0010:drm_dp_atomic_release_vcpi_slots+0x43/0xa0
[drm_kms_helper]
[  162.449270] Code: 50 08 48 8d 70 08 48 8d 5a f0 48 39 d6 74 1b 48 3b
6a f0 75 08 eb 2f 48 39 69 f0 74 29 48 8b 4b 10 48 8d 59 f0 48 39 ce 75
ed <0f> 0b 48 c7 c7 00 4e 5c c0 48 89 c2 48 89 ee e8 09 76 db ff b8 ea
[  162.449271] RSP: 0018:ffffa68d435cfa68 EFLAGS: 00010246
[  162.449272] RAX: ffff97bdba93c1a0 RBX: ffff97bdba93c198 RCX:
ffff97bdba93c1a0
[  162.449273] RDX: ffff97bdba93c1a8 RSI: ffff97bdba93c1a8 RDI:
0000000000000010
[  162.449274] RBP: ffff97bdcae43000 R08: ffff97bdcb905600 R09:
ffff97bdcbfc7200
[  162.449275] R10: ffffa68d435cfa20 R11: ffff97bdcbfc7200 R12:
0000000000000004
[  162.449276] R13: ffff97bdb4025a80 R14: 0000000000000003 R15:
ffff97bdb7f92010
[  162.449277] FS:  00007f57bcf45200(0000) GS:ffff97bdcf540000(0000)
knlGS:0000000000000000
[  162.449278] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  162.449280] CR2: 000055f5b0838a90 CR3: 0000000475bf6002 CR4:
00000000003606e0
[  162.449281] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
0000000000000000
[  162.449281] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7:
0000000000000400
[  162.449282] Call Trace:
[  162.449293]  drm_atomic_helper_check_modeset+0x391/0xa80 [drm_kms_helper]
[  162.449301]  drm_atomic_helper_check+0x14/0x90 [drm_kms_helper]
[  162.449352]  nv50_disp_atomic_check+0x83/0x1d0 [nouveau]
[  162.449371]  drm_atomic_check_only+0x5b3/0x870 [drm]
[  162.449387]  drm_atomic_commit+0x13/0x50 [drm]
[  162.449395]  drm_atomic_helper_set_config+0x77/0x80 [drm_kms_helper]
[  162.449411]  drm_mode_setcrtc+0x548/0x740 [drm]
[  162.449425]  ? drm_mode_getcrtc+0x180/0x180 [drm]
[  162.449435]  drm_ioctl_kernel+0xbb/0x100 [drm]
[  162.449446]  drm_ioctl+0x2e2/0x380 [drm]
[  162.449457]  ? drm_mode_getcrtc+0x180/0x180 [drm]
[  162.449508]  nouveau_drm_ioctl+0x68/0xc0 [nouveau]
[  162.449513]  do_vfs_ioctl+0xb0/0x690
[  162.449517]  ksys_ioctl+0x70/0x80
[  162.449520]  __x64_sys_ioctl+0x16/0x20
[  162.449522]  do_syscall_64+0x55/0x120
[  162.449525]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  162.449527] RIP: 0033:0x7f57ba419017
[  162.449529] Code: 00 00 00 48 8b 05 81 7e 2b 00 64 c7 00 26 00 00 00
48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f
05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 51 7e 2b 00 f7 d8 64 89 01 48
[  162.449530] RSP: 002b:00007fff12e4fcf8 EFLAGS: 00003246 ORIG_RAX:
0000000000000010
[  162.449532] RAX: ffffffffffffffda RBX: 0000000000000014 RCX:
00007f57ba419017
[  162.449533] RDX: 00007fff12e4fd30 RSI: 00000000c06864a2 RDI:
0000000000000014
[  162.449534] RBP: 00007fff12e4fd30 R08: 0000000000000000 R09:
000055f5b115cce0
[  162.449535] R10: 00007fff12e4fed0 R11: 0000000000003246 R12:
00000000c06864a2
[  162.449536] R13: 0000000000000014 R14: 000055f5b0863000 R15:
000055f5b078d230
[  162.449538] ---[ end trace 11d212b7f8d04b60 ]---
[  162.449562] [drm:drm_dp_atomic_release_vcpi_slots [drm_kms_helper]]
*ERROR* no VCPI for [MST PORT:0000000020cedb5a] found in mst state
00000000bab63e56

I can easily trigger the bug with the following steps:

1. Boot into the Xorg desktop.

2. Extend the screen to the external monitor (e.g. with "xrandr --output
DP-1-1-1 --auto --right-of eDP1").

3. Undock the laptop.

4. Redock the laptop. This produces the kernel warning shown above.

Once the warning has appeared, there is no way to get the external
monitor working again short of a full reboot.
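
To confirm that a redock hit this failure, the kernel log can be searched
for the error text quoted above. This is only a minimal sketch: the
function name has_vcpi_error is made up for illustration, and it assumes
the current user is allowed to read dmesg.

```sh
#!/bin/sh
# Sketch: succeed if the kernel log on stdin contains the MST error
# from this report. has_vcpi_error is a hypothetical helper name.
has_vcpi_error() {
    # -F matches the text literally, -q only sets the exit status.
    grep -qF 'no VCPI for [MST PORT:'
}
```

It could be used as: dmesg | has_vcpi_error && echo "MST error present"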

I tested some older kernel versions, and the bug was introduced in
5.1-rc1.

The hardware is a Lenovo P50 laptop with a workstation dock and a Dell
U2412M connected to the DisplayPort output of the docking station. The
external screen is connected to a Quadro M1000M graphics card driven by
the nouveau driver.

A possible workaround for the moment is to disable the external monitor
with "xrandr --output DP-1-1-1 --off" before undocking.
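
The workaround could be wrapped in a small helper to call before
undocking. This is a sketch under assumptions: the function name
disable_external and the OUTPUT and DRY_RUN variables are invented here,
and DP-1-1-1 is simply the connector name on this machine.

```sh
#!/bin/sh
# Connector to switch off; defaults to the name used in this report.
OUTPUT="${OUTPUT:-DP-1-1-1}"

# Hypothetical helper: turn the external output off before undocking.
# With DRY_RUN=1 it only prints the command instead of running it.
disable_external() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "xrandr --output $OUTPUT --off"
    else
        xrandr --output "$OUTPUT" --off
    fi
}
```

After sourcing the file, run disable_external before removing the laptop
from the dock; set OUTPUT first if the connector has a different name.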

If you need any further information, let me know.

Best Regards

Manfred Benesch