Memory corruption kernel issue (potentially exploitable), request for help

From: Oliver Freyermuth
Date: Fri May 26 2017 - 07:30:53 EST


Dear Kernel hackers,

I have a machine with a self-built, non-tainted kernel, which exhibits memory corruption as soon as I execute
while true; do cat /proc/self/net/dev > /dev/null; done
as a normal user.

I am running 4.11.3 (almost vanilla, with only the Gentoo patches applied) on mostly standard hardware (Intel CPU + GPU).
I can also reproduce with 4.9 on that machine.
The RAM has already been exchanged. Due to a BIOS bug, the machine needs "iommu=soft" as a kernel parameter, but nothing else special.

The corruption appears in two ways:
Often via:
Corrupted low memory at ffff88000000b000 (b000 phys) = 0016e109
Almost every time visible via:
memtester 15G
(machine has 16 G).

Looking at the output of memtester, I see that the values it finds match the numbers in:
/proc/self/net/dev
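
To make that comparison less error-prone, the counters can be printed in hex next to their decimal form. The following is only a sketch: counters_to_hex is a made-up helper name, and it assumes the usual layout of /proc/self/net/dev (two header lines, then one line per interface with whitespace-separated counters).

```shell
#!/bin/sh
# counters_to_hex is a hypothetical helper, not an existing tool.
# It prints every counter of a /proc/net/dev style file as
# "<interface> <decimal> <hex>", one counter per line.
counters_to_hex() {
    # Skip the two header lines, then split each line into the
    # interface name ("eth0:") and the remaining counters.
    tail -n +3 "${1:-/proc/self/net/dev}" | while read -r iface rest; do
        for value in $rest; do
            # ${iface%:} strips the trailing colon from the name.
            printf '%s %s 0x%x\n' "${iface%:}" "$value" "$value"
        done
    done
}
```

Calling counters_to_hex without an argument reads /proc/self/net/dev; the output can then be searched with grep -i for the significant hex digits of a value reported by memtester.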

The memory page where the corruption appears seems to change slightly after each boot; it is usually in the region around 0x94F6000 (physical address).
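
For reference, that physical address can be turned into a page frame number by dropping the low 12 bits. This is a small sketch that assumes 4 KiB pages (the normal page size on x86-64); phys_to_pfn is a hypothetical helper name.

```shell
#!/bin/sh
# phys_to_pfn is a hypothetical helper: it converts a physical address
# to a page frame number, assuming 4 KiB pages (shift by 12 bits).
phys_to_pfn() {
    printf '%d\n' "$(( $1 >> 12 ))"
}

phys_to_pfn 0x94F6000   # prints 38134 (0x94F6)
```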

I have attached my kernel config, gzipped.

I would be very grateful for any advice on how to debug this further - it does not really look like a hardware issue to me anymore,
but if it could be, please enlighten me.

Please include me in replies, as I am not subscribed to the list.

In case it is relevant, my network controller is:
02:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 06)

Thanks and all the best,
Oliver Freyermuth

Attachment: kernconfig.gz
Description: application/gzip