Re: iommu: flood of ahci 0000:e6:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0055 address=0xa14a4000 flags=0x0070]
From: Corentin Labbe
Date: Wed Feb 05 2025 - 08:36:48 EST
On Mon, Feb 03, 2025 at 01:01:45PM +0000, Robin Murphy wrote:
> On 2025-02-03 9:05 am, Corentin Labbe wrote:
> > Hello
> >
> > I have a Supermicro server which is flooded with this kernel message:
> > ahci 0000:e6:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0055 address=0xa14a4000 flags=0x0070]
> >
> > The server works perfectly anyway.
> > It happens with official ubuntu kernel vmlinuz-6.8.0-51-generic.
> > I tried also a custom 6.12.6, same problem.
> >
> > I tried to update bios, no change.
> > I tried iommu=soft, no change.
> >
> > I don't know what to do next.
> >
> > Regards
> >
>
> > IOMMU group 83 e6:00.0 SATA controller [0106]: Marvell Technology Group Ltd. 88SE9230 PCIe 2.0 x2 4-port SATA 6 Gb/s RAID Controller [1b4b:9230] (rev 11)
>
> Wow, a Marvell SATA controller doing something other than the usual
> phantom function quirk, that's a nice change :D
>
> I'd guess that firmware has left it running for something like legacy
> IDE emulation (if that's still a thing?) or its own soft-RAID driver,
> but neglected to declare an IVMD entry to describe the reserved memory
> region(s) it's using for that. A smoking gun would be if 0xa14a4000
> matches some firmware-reserved PA in the system memory map. In that
> case, if you're lucky you might have some firmware/BIOS option to
> disable fancy behaviour and leave it in plain AHCI mode. Otherwise,
> booting with "iommu.passthrough=1" (or the even bigger hammer of
> "amd_iommu=off") should at least allow you to ignore the issue.
>
Hello
Thanks for your help
There is no AHCI option in the BIOS (apart from hotplug enable).
Adding iommu.passthrough=1 makes those messages disappear.
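For the reserved-memory check, looking up a fault address in the system memory map could be sketched as below. This is only a sketch: the helper name iomem_lookup is made up for illustration, it expects /proc/iomem-style "start-end : name" lines on stdin, and /proc/iomem has to be read as root to show real addresses.

```shell
# Hypothetical helper (not an existing tool): print every memory-map entry
# whose range contains the given address.
# stdin: lines in /proc/iomem format, "start-end : name" (hex, no 0x prefix)
# $1:    address to look up, e.g. 0xa14a4000
iomem_lookup() {
    addr=$(( $1 ))
    # Default IFS strips the indentation used for nested entries;
    # "name" receives the rest of the line.
    while read -r range sep name; do
        start=$(( 0x${range%-*} ))
        end=$(( 0x${range#*-} ))
        if [ "$addr" -ge "$start" ] && [ "$addr" -le "$end" ]; then
            printf '%s : %s\n' "$range" "$name"
        fi
    done
}

# Usage (as root): iomem_lookup 0xa14a4000 < /proc/iomem
```

Nested entries are printed too, so a hit inside a "Reserved" region shows both the parent range and any child range.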
Unfortunately, my example was misleading: the address is mostly random.
dmesg |grep IO_PAGE_FAULT | grep -o 'address=0x[0-9a-f]*' | sort | uniq -c | wc -l
9297
dmesg |grep IO_PAGE_FAULT | grep -o 'address=0x[0-9a-f]*' | sort | uniq -c | head
2 address=0x1101f000
2 address=0x1101f004
3 address=0x1102f000
1 address=0x1102f004
2 address=0x1102f008
2 address=0x1102f010
2 address=0x11043000
2 address=0x11043004
1 address=0x11047000
1 address=0x11047004
dmesg |grep IO_PAGE_FAULT | grep -o 'address=0x[0-9a-f]*' | sort | uniq -c | tail
2 address=0xfffffffffe751004
2 address=0xfffffffffe7e6000
2 address=0xfffffffffe7e6004
4 address=0xfffffffffe823000
3 address=0xfffffffffe823004
2 address=0xfffffffffe830000
2 address=0xfffffffffe830004
3 address=0xfffffffffe833000
1 address=0xfffffffffe833004
1 address=0xfffffffffe833008
But the domain and flags are always the same.
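In the samples above, the low 12 bits of every address are small (0x000 to 0x010), so the offset within the 4 KiB page may be less random than the page itself. A rough way to histogram those offsets over the whole log could look like this. The function name page_offsets is made up for illustration, and the sketch assumes every logged address has at least three hex digits.

```shell
# Hypothetical helper (not an existing tool): count IO_PAGE_FAULT events per
# offset within a 4 KiB page, i.e. per value of the last three hex digits.
# stdin: kernel log lines, e.g. from dmesg
page_offsets() {
    grep IO_PAGE_FAULT |
        grep -o 'address=0x[0-9a-f]*' |
        # Keep only the last three hex digits; done textually so that
        # 64-bit addresses need no shell arithmetic.
        sed 's/.*\(...\)$/offset=0x\1/' |
        sort | uniq -c
}

# Usage: dmesg | page_offsets
```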
Full dmesg (without IOMMU messages) https://kernel.montjoie.ovh/dmesg.0
The server is doing qemu GPU passthrough via VFIO.
I believe (though I need to re-verify) that the messages start whether or not qemu is running.
Thanks
Regards