Re: Xorg SEGV in Xen PV dom0 after updating from 5.16.18 to 5.17.5

From: Juergen Gross
Date: Wed May 04 2022 - 02:48:21 EST


On 04.05.22 07:46, Thorsten Leemhuis wrote:
Hi, this is your Linux kernel regression tracker. Sending this just to
CC the developers of the culprit mentioned below (bdd8b6c98239cad
("drm/i915: replace X86_FEATURE_PAT with pat_enabled()")) and the
maintainers for the subsystem.

While at it, a quick note: I wonder if this is a problem similar to one
that recently turned up with amdgpu and is fixed by this commit:
https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?id=78b12008f20

No, this is different.

I posted a patch yesterday which should fix the issue:

https://lore.kernel.org/lkml/20220503132207.17234-3-jgross@xxxxxxxx/T/#m75efc68c96d8f7160229b5f3147242221ce0c28c


Juergen


Ciao, Thorsten

On 04.05.22 02:37, Marek Marczykowski-Górecki wrote:

After updating from 5.16.18 to 5.17.5 in Xen PV dom0, my Xorg started
crashing when displaying any window mapped from a guest (domU) system.
This is 100% reproducible.
The system is Qubes OS, and it uses a trick that maps window content
from other guests using Xen grant tables, wrapped as "shared memory"
from Xorg's point of view (so, the memory that Xorg mmaps is not just from
another process, but from another VM). That's the ShmPutImage you can
see on the stack trace below.

Stack trace of thread 12858:
#0 0x00007f80029e17d5 raise (libc.so.6 + 0x3c7d5)
#1 0x00007f80029ca895 abort (libc.so.6 + 0x25895)
#2 0x00005b3469ace0e0 OsAbort (Xorg + 0x1c60e0)
#3 0x00005b3469ad3959 AbortServer (Xorg + 0x1cb959)
#4 0x00005b3469ad46aa FatalError (Xorg + 0x1cc6aa)
#5 0x00005b3469acb450 OsSigHandler (Xorg + 0x1c3450)
#6 0x00007f8002b85a90 __restore_rt (libpthread.so.0 + 0x14a90)
#7 0x00007f8002b0a2a1 __memmove_avx_unaligned_erms (libc.so.6 + 0x1652a1)
#8 0x00007f80015dfcc9 linear_to_xtiled_faster (iris_dri.so + 0xc91cc9)
#9 0x00007f80015e3477 _isl_memcpy_linear_to_tiled (iris_dri.so + 0xc95477)
#10 0x00007f8001468440 iris_texture_subdata (iris_dri.so + 0xb1a440)
#11 0x00007f8000a76107 st_TexSubImage (iris_dri.so + 0x128107)
#12 0x00007f8000be9a47 texture_sub_image (iris_dri.so + 0x29ba47)
#13 0x00007f8000becd0c texsubimage_err (iris_dri.so + 0x29ed0c)
#14 0x00007f8000bf2939 _mesa_TexSubImage2D (iris_dri.so + 0x2a4939)
#15 0x00007f800213831f glamor_upload_boxes (libglamoregl.so + 0x1e31f)
#16 0x00007f800213856f glamor_upload_region (libglamoregl.so + 0x1e56f)
#17 0x00007f800212aea6 glamor_put_image (libglamoregl.so + 0x10ea6)
#18 0x00005b3469a4d79c damagePutImage (Xorg + 0x14579c)
#19 0x00005b3469a00a7e ProcShmPutImage (Xorg + 0xf8a7e)
#20 0x00005b3469965a2b Dispatch (Xorg + 0x5da2b)
#21 0x00005b3469969b04 dix_main (Xorg + 0x61b04)
#22 0x00007f80029cc082 __libc_start_main (libc.so.6 + 0x27082)
#23 0x00005b3469952e6e _start (Xorg + 0x4ae6e)
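
Each frame above has the form "#N address symbol (module + offset)", so
the load base of a module is simply address minus offset. A minimal
sketch, assuming exactly that line layout (the regular expression and
the function name module_bases are illustrative, not part of any tool),
using a few frames copied from the trace:

```python
import re
from collections import defaultdict

# A few frames copied verbatim from the stack trace above.
FRAMES = """\
#0 0x00007f80029e17d5 raise (libc.so.6 + 0x3c7d5)
#7 0x00007f8002b0a2a1 __memmove_avx_unaligned_erms (libc.so.6 + 0x1652a1)
#8 0x00007f80015dfcc9 linear_to_xtiled_faster (iris_dri.so + 0xc91cc9)
#19 0x00005b3469a00a7e ProcShmPutImage (Xorg + 0xf8a7e)
"""

# Assumed layout: "#N 0xADDR symbol (module + 0xOFFSET)".
FRAME_RE = re.compile(
    r"#(\d+)\s+(0x[0-9a-f]+)\s+(\S+)\s+\((\S+) \+ (0x[0-9a-f]+)\)"
)


def module_bases(text):
    """Map each module name to the sorted load bases implied by its frames."""
    bases = defaultdict(set)
    for match in FRAME_RE.finditer(text):
        address = int(match.group(2), 16)
        offset = int(match.group(5), 16)
        bases[match.group(4)].add(address - offset)
    return {module: sorted(values) for module, values in bases.items()}


bases = module_bases(FRAMES)
for module, values in bases.items():
    print(module, [hex(value) for value in values])
```

Both libc.so.6 frames should yield the same base; more than one base for
a single module would suggest the frames were parsed incorrectly.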

Disassembly of the surrounding code:

0x00007596ae8c82fb <+123>: ja 0x7596ae8c8338 <__memmove_avx_unaligned_erms+184>
0x00007596ae8c82fd <+125>: jb 0x7596ae8c8304 <__memmove_avx_unaligned_erms+132>
0x00007596ae8c82ff <+127>: movzbl (%rsi),%ecx
0x00007596ae8c8302 <+130>: mov %cl,(%rdi)
0x00007596ae8c8304 <+132>: retq
0x00007596ae8c8305 <+133>: vmovdqu (%rsi),%xmm0
0x00007596ae8c8309 <+137>: vmovdqu -0x10(%rsi,%rdx,1),%xmm1
=> 0x00007596ae8c830f <+143>: vmovdqu %xmm0,(%rdi)
0x00007596ae8c8313 <+147>: vmovdqu %xmm1,-0x10(%rdi,%rdx,1)
0x00007596ae8c8319 <+153>: retq


I don't see any related kernel or Xen messages at this time. Xorg's SEGV
handler also prints:

(EE) Segmentation fault at address 0x3c010

Git bisect says it's bdd8b6c98239cad ("drm/i915: replace X86_FEATURE_PAT
with pat_enabled()"), and indeed with this commit reverted on top of
5.17.5 everything works fine.

I guess this part of dom0's boot dmesg may be relevant:

[ 0.000949] x86/PAT: MTRRs disabled, skipping PAT initialization too.
[ 0.000953] x86/PAT: Configuration [0-7]: WB WT UC- UC WC WP UC UC

Originally reported at
https://github.com/QubesOS/qubes-issues/issues/7479

#regzbot introduced bdd8b6c98239cad
#regzbot monitor: https://github.com/QubesOS/qubes-issues/issues/7479

