On Mon, Feb 03, 2025 at 01:01:45PM +0000, Robin Murphy wrote:
On 2025-02-03 9:05 am, Corentin Labbe wrote:
Hello
I have a Supermicro server which is flooded with kernel messages like:
ahci 0000:e6:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0055 address=0xa14a4000 flags=0x0070]
The server works perfectly anyway.
It happens with the official Ubuntu kernel vmlinuz-6.8.0-51-generic.
I also tried a custom 6.12.6, same problem.
I tried updating the BIOS, no change.
I tried iommu=soft, no change.
I don't know what to do next.
Regards
IOMMU group 83 e6:00.0 SATA controller [0106]: Marvell Technology Group Ltd. 88SE9230 PCIe 2.0 x2 4-port SATA 6 Gb/s RAID Controller [1b4b:9230] (rev 11)
Wow, a Marvell SATA controller doing something other than the usual
phantom function quirk, that's a nice change :D
I'd guess that firmware has left it running for something like legacy
IDE emulation (if that's still a thing?) or its own soft-RAID driver,
but neglected to declare an IVMD entry to describe the reserved memory
region(s) it's using for that. A smoking gun would be if 0xa14a4000
matches some firmware-reserved PA in the system memory map. In that
case, if you're lucky you might have some firmware/BIOS option to
disable fancy behaviour and leave it in plain AHCI mode. Otherwise,
booting with "iommu.passthrough=1" (or the even bigger hammer of
"amd_iommu=off") should at least allow you to ignore the issue.
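The "smoking gun" check suggested above (does a faulting address fall inside a firmware-reserved range of the system memory map?) could be sketched roughly as follows. This is only an illustration, assuming the memory map is taken from /proc/iomem in its usual "start-end : name" text format with two spaces of indentation per nesting level; the function name regions_containing is hypothetical and not part of any existing tool.

```python
import re

# One /proc/iomem line looks like "  a1400000-a14fffff : Reserved";
# nested resources are indented further (assumed: two spaces per level).
IOMEM_LINE = re.compile(r"^(\s*)([0-9a-f]+)-([0-9a-f]+) : (.*)$")


def regions_containing(iomem_text, addr):
    """Return (depth, start, end, name) for every region that contains addr.

    regions_containing is a hypothetical helper for illustration only.
    """
    hits = []
    for line in iomem_text.splitlines():
        m = IOMEM_LINE.match(line)
        if not m:
            continue  # skip anything that is not a resource line
        indent, start, end, name = m.groups()
        start, end = int(start, 16), int(end, 16)
        if start <= addr <= end:  # /proc/iomem ranges are inclusive
            hits.append((len(indent) // 2, start, end, name.strip()))
    return hits
```

To try it on the first reported fault, read /proc/iomem as root (unprivileged reads show every address as zero) and call regions_containing(text, 0xa14a4000). A hit inside a range named "Reserved" would support the firmware theory; no hit, or a hit in ordinary "System RAM", would not.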
Hello
Thanks for your help
There was no AHCI option in the BIOS (apart from hotplug enable).
Adding iommu.passthrough=1 made those messages disappear.
Unfortunately, my example was not representative; the address is mostly random:
dmesg |grep IO_PAGE_FAULT | grep -o 'address=0x[0-9a-f]*' | sort | uniq -c | wc -l
9297
dmesg |grep IO_PAGE_FAULT | grep -o 'address=0x[0-9a-f]*' | sort | uniq -c | head
2 address=0x1101f000
2 address=0x1101f004
3 address=0x1102f000
1 address=0x1102f004
2 address=0x1102f008
2 address=0x1102f010
2 address=0x11043000
2 address=0x11043004
1 address=0x11047000
1 address=0x11047004
dmesg |grep IO_PAGE_FAULT | grep -o 'address=0x[0-9a-f]*' | sort | uniq -c | tail
2 address=0xfffffffffe751004
2 address=0xfffffffffe7e6000
2 address=0xfffffffffe7e6004
4 address=0xfffffffffe823000
3 address=0xfffffffffe823004
2 address=0xfffffffffe830000
2 address=0xfffffffffe830004
3 address=0xfffffffffe833000
1 address=0xfffffffffe833004
1 address=0xfffffffffe833008
But the domain and flags are always the same.
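For anyone who wants to repeat this breakdown in one pass, here is a minimal sketch that counts faults per (domain, flags) pair and collects the distinct addresses and the distinct 4 KiB pages they fall in. It assumes the log lines have the same "IO_PAGE_FAULT domain=... address=... flags=..." layout as the message quoted earlier; summarize_faults is a hypothetical name.

```python
import re
from collections import Counter

# Matches the bracketed part of an AMD-Vi fault line as quoted in this thread.
FAULT = re.compile(
    r"IO_PAGE_FAULT domain=(0x[0-9a-f]+) address=(0x[0-9a-f]+) flags=(0x[0-9a-f]+)"
)


def summarize_faults(log_text):
    """Summarize IO_PAGE_FAULT events found in log_text.

    Returns (pairs, addresses, pages):
      pairs     - Counter of faults per (domain, flags) pair
      addresses - set of distinct faulting addresses
      pages     - set of distinct 4 KiB page numbers (address >> 12)
    """
    pairs = Counter()
    addresses = set()
    pages = set()
    for domain, address, flags in FAULT.findall(log_text):
        pairs[(domain, flags)] += 1
        addr = int(address, 16)
        addresses.add(addr)
        pages.add(addr >> 12)
    return pairs, addresses, pages
```

One way to use it is to feed it the text of dmesg (for example via sys.stdin.read() in a small wrapper script). Comparing len(addresses) with len(pages) shows how many of the faults are only small offsets within the same page, as in the 0x...000 / 0x...004 / 0x...008 entries listed above.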
Full dmesg (without IOMMU messages) https://uk01.z.antigena.com/l/VspdfbZQLwA2gZviRaGoPfE2bAxamMd9VFWOj4n78OuhpCoBo5HcXgWgXfTVvyxW1R3W9GTx4RbHm1MGyqBINkuTrnW31h9eTfLTUvXfcYh-IaTwmSc5kZo_-iU9-qQLbKsIjA9LNxyfbAA2AKGOSws6K4vuOrR6i-DL5DiQW1gHCrhhBMgE0Y7RK2m9
The server is doing qemu GPU passthrough via VFIO.
I believe (I still need to re-verify) that the messages start whether or not qemu is running.