Intel gpu memory corruption
From: Baltazár Radics
Date: Fri Aug 12 2022 - 17:24:19 EST
Hello!
My laptop (ThinkPad T460) seems to have a memory corruption issue that
only occurs when the gpu is in use (it has `Intel Corporation Skylake
GT2 [HD Graphics 520] (rev 07)` as reported by lspci).
I haven't been able to reproduce the corruption with standard memory
testing utilities like Lenovo's built-in hardware diagnostic tool,
memtest86+, or even the user-space program memtester when it's the only
thing running.
However, running memtester alongside vkmark, for example, reproduces it
quite consistently. The failing address is always the same within a
given instance of memtester, and looking into /proc/[pid]/pagemap
revealed that it seemingly always maps to the same physical address.
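
A minimal Python sketch of such a pagemap lookup, assuming the
documented pagemap layout (one 64-bit entry per virtual page, bits 0-54
holding the page frame number, bit 63 the "present" flag) and a reader
with root privileges, since recent kernels report the PFN as zero to
unprivileged readers. The function names are illustrative only.

```python
import mmap
import struct

PAGE_SIZE = mmap.PAGESIZE
PFN_MASK = (1 << 55) - 1   # bits 0-54: page frame number (when present)
PRESENT_BIT = 1 << 63      # bit 63: page is present in RAM


def decode_pagemap_entry(entry, vaddr, page_size=PAGE_SIZE):
    """Return the physical address behind vaddr, or None if unavailable."""
    if not entry & PRESENT_BIT:
        return None  # not mapped, or swapped out
    pfn = entry & PFN_MASK
    if pfn == 0:
        return None  # PFN hidden (reader most likely lacks privileges)
    return pfn * page_size + (vaddr % page_size)


def virt_to_phys(pid, vaddr):
    """Look up vaddr of process pid in /proc/<pid>/pagemap."""
    with open(f"/proc/{pid}/pagemap", "rb") as f:
        # One 8-byte entry per virtual page, indexed by page number.
        f.seek((vaddr // PAGE_SIZE) * 8)
        data = f.read(8)
    if len(data) != 8:
        return None
    (entry,) = struct.unpack("=Q", data)  # native byte order
    return decode_pagemap_entry(entry, vaddr)
```

Calling `virt_to_phys(pid, addr)` for the failing address in each
memtester run and comparing the results is one way to check whether the
same physical page is involved every time.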
With this information, I think I managed to stop it from happening by
appending `memmap=4K$0x1F9D7C000` to my kernel command line to keep that
page from being allocated. Since then I haven't been able to catch
it with memtester, but I did have a crash that somewhat resembled the
ones I had earlier. Many processes segfaulted and I had some `Bad swap
file entry` errors in my dmesg.
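
As a small illustration of the arithmetic behind the memmap argument
above: the `memmap=nn$ss` form marks a region of size nn starting at ss
as reserved, so the faulty physical address is rounded down to its page
boundary and one page is reserved. The sketch below assumes a 4 KiB page
size, and the helper name is hypothetical.

```python
def memmap_reserve_arg(phys_addr, page_size=4096):
    """Build a memmap= argument reserving the page containing phys_addr."""
    # Round down to the start of the page (page_size is a power of two).
    start = phys_addr & ~(page_size - 1)
    return f"memmap={page_size // 1024}K$0x{start:X}"


# Any address inside the bad page yields the same argument.
print(memmap_reserve_arg(0x1F9D7C123))
```

Depending on the boot loader, the `$` may need to be escaped in its
configuration file so that it reaches the kernel intact.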
I haven't been able to do testing on other OSes yet, but since none of
the regular memtests have found any issues, I'm fairly certain this is
not a hardware issue with my ram. Could still be a hardware issue with
the gpu itself, but for now I'm guessing this is a gpu driver bug.
Is there anything else I can test to confirm that this is i915's fault,
and if so, anything I can do to help track down the bug?
Thanks!