Memory corruption kernel issue (potentially exploitable), request for help
From: Oliver Freyermuth
Date: Fri May 26 2017 - 07:30:53 EST
Dear Kernel hackers,
I have a machine with a self-built, non-tainted kernel, which exhibits memory corruption as soon as I execute
while true; do cat /proc/self/net/dev > /dev/null; done
as a normal user.
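For anyone trying to reproduce this, a bounded variant of the loop may be easier to handle than an endless one. The following is only a sketch: repro_reads and its two arguments are names made up for illustration, and the dmesg check assumes the user is allowed to read the kernel log.

```shell
#!/bin/sh
# Hypothetical helper: read FILE COUNT times, discarding the output.
# Usage: repro_reads FILE COUNT
repro_reads() {
    file=$1
    count=$2
    i=0
    while [ "$i" -lt "$count" ]; do
        # Stop early if the file cannot be read at all.
        cat "$file" > /dev/null || return 1
        i=$((i + 1))
    done
    echo "completed $i reads of $file"
}

# Example run (the count is arbitrary), followed by a look at the kernel log:
#   repro_reads /proc/self/net/dev 100000
#   dmesg | grep 'Corrupted low memory'
```

A fixed count makes it easier to compare runs across boots or kernel versions than interrupting an infinite loop by hand.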
I am running 4.11.3 (almost vanilla, with only the Gentoo patches applied) on mostly standard hardware (Intel CPU + GPU).
I can also reproduce with 4.9 on that machine.
The RAM has already been replaced. Due to a BIOS bug, the machine needs "iommu=soft" as a kernel parameter, but nothing special otherwise.
The corruption appears in two ways:
Often via:
Corrupted low memory at ffff88000000b000 (b000 phys) = 0016e109
Almost every time visible via:
memtester 15G
(the machine has 16 GB of RAM).
Checking the output of memtester, the corrupted values it reports match the numbers shown in:
/proc/self/net/dev
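To make that comparison less manual, the counters can be listed one per line and searched for the values memtester reports. This is a rough sketch, not a tested diagnostic: netdev_counters is a hypothetical helper name, and it assumes the usual layout of two header lines followed by one "name: counters..." line per interface.

```shell
#!/bin/sh
# Hypothetical helper: list each counter in a /proc/net/dev-style file as
# "interface column value", skipping the two header lines.
netdev_counters() {
    awk 'NR > 2 {
        sub(/:/, ": ")          # make sure the interface name is its own field
        name = $1
        sub(/:$/, "", name)     # drop the trailing colon from the name
        for (i = 2; i <= NF; i++)
            print name, i - 1, $i   # values are printed as text, unconverted
    }' "$1"
}

# Example: search for a decimal value taken from the memtester output.
#   netdev_counters /proc/self/net/dev | grep -w "$value"
```

The values are deliberately passed through as strings, so large 64-bit counters are not altered by awk's number handling.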
After each boot, the memory page where the corruption appears seems to change slightly, but it is usually in the region around physical address 0x94F6000.
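In case page frame numbers are easier to work with than raw addresses, the conversion is a simple shift. The sketch below assumes 4 KiB pages, and phys_to_pfn is a name made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the page frame number for a physical address,
# assuming 4 KiB pages (a shift of 12 bits).
phys_to_pfn() {
    echo $(( $1 >> 12 ))
}

phys_to_pfn 0x94F6000   # prints 38134 (0x94F6)
phys_to_pfn 0xb000      # prints 11, the page from the low-memory warning
```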
I have attached my kernel config, gzipped.
I would be very grateful for any advice on how to debug this further - it does not really look like a hardware issue to me anymore,
but if it could be, please enlighten me.
Please include me in replies, as I am not subscribed to the list.
In case it is relevant, my network controller is:
02:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 06)
Thanks and all the best,
Oliver Freyermuth
Attachment: kernconfig.gz (application/gzip)