Re: RFC: IOMMU/AMD: Error Handling

From: Don Dutile
Date: Tue Apr 30 2013 - 11:09:15 EST


On 04/30/2013 10:56 AM, Suravee Suthikulanit wrote:
On 4/29/2013 4:42 PM, Don Dutile wrote:
On 04/29/2013 04:34 PM, Duran, Leo wrote:
I'm wondering if resetting the IOMMU at init-time (once) would clear any BIOS induced noise.
Leo

Well, depends what you mean by 'reset'....
(a) setting it up for OS use is effectively a reset, but it doesn't quiesce a device
that is still doing DMA reads of a (BIOS-set-up) queue; that is when the noisy messages begin.
(b) disable the IOMMU, and then the DMA just occurs... which is bad for writes, potentially.

A similar issue is being reported & worked on for kdump, where devices are still
doing DMA while the system is trying to 'reset' to the kexec'd kernel and
take a crash dump.

Solution: stop devices from doing dma... but some you _want_ enabled throughout...
like keyboard & mouse via usb controller, so you get to pick os from
grub... not so for kexec...

so, again, for isolation faults.... let the hw do its job -- isolate
and throttle/silence the fault messages on a per-device, time-duration heuristic
so the system can get through boot-up where enough OS is init'd (drivers started)
to stop the temporary noise.
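
To make the per-device, time-duration heuristic concrete, here is a rough user-space sketch, not actual AMD IOMMU driver code. Every name in it (fault_should_log, throttle_tbl, MAX_TRACKED_DEVS, QUIET_WINDOW_MS) is hypothetical, and the one-second window and 16-entry table are arbitrary assumptions. The idea is only that the first fault from a device is logged, later faults from the same device inside the window are counted but not printed, and the count is handed back with the next message that does get printed.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_TRACKED_DEVS 16    /* hypothetical table size */
#define QUIET_WINDOW_MS  1000  /* hypothetical: one message per device per window */

struct fault_throttle {
    uint16_t devid;        /* requester ID of the faulting device */
    uint64_t last_log_ms;  /* time the last message was emitted */
    uint32_t suppressed;   /* faults swallowed since that message */
    bool     in_use;
};

static struct fault_throttle throttle_tbl[MAX_TRACKED_DEVS];

/*
 * Returns true if a fault from devid at time now_ms should be logged.
 * When it returns true, *suppressed_out holds the number of faults
 * from that device that were silenced since its previous message.
 */
static bool fault_should_log(uint16_t devid, uint64_t now_ms,
                             uint32_t *suppressed_out)
{
    struct fault_throttle *slot = NULL;
    struct fault_throttle *free_slot = NULL;
    size_t i;

    for (i = 0; i < MAX_TRACKED_DEVS; i++) {
        if (throttle_tbl[i].in_use && throttle_tbl[i].devid == devid) {
            slot = &throttle_tbl[i];
            break;
        }
        if (!throttle_tbl[i].in_use && !free_slot)
            free_slot = &throttle_tbl[i];
    }

    if (!slot) {
        *suppressed_out = 0;
        if (!free_slot)
            return true;   /* table full: fail open, log unthrottled */
        free_slot->devid = devid;
        free_slot->last_log_ms = now_ms;
        free_slot->suppressed = 0;
        free_slot->in_use = true;
        return true;       /* first fault from this device: always log */
    }

    if (now_ms - slot->last_log_ms < QUIET_WINDOW_MS) {
        slot->suppressed++;    /* inside the quiet window: count, don't print */
        return false;
    }

    *suppressed_out = slot->suppressed;
    slot->suppressed = 0;
    slot->last_log_ms = now_ms;
    return true;
}
```

Each device gets its own window, so one noisy device does not hide the first fault from another. A full table falls back to logging everything, on the assumption that too many messages are better than lost ones.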
This sounds more like an issue with the order in which things are initialized in the system.
If so, could we separate out the code which enables IOMMU error logging/handling and
delay it until we are certain that the system is stable?

So, you are proposing we not enable fault events when IOMMU is initially configured;
use the IOMMU through boot/driver-config, hoping all is well, and if not, continue blindly,
and then enable IOMMU faults post/late-init ?
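
For concreteness, the proposal as restated above could be sketched roughly like this. It is only an illustration in plain C, not driver code: struct iommu_fault_state, fault_event() and enable_fault_reporting() are made-up names, and it assumes translation stays enabled the whole time so that only the reporting is deferred. Counting the early faults instead of dropping them is an added assumption, meant to show one way of not continuing completely blindly.

```c
#include <stdbool.h>
#include <stdint.h>

struct iommu_fault_state {
    bool     reporting_enabled; /* false from IOMMU setup until late init */
    uint32_t early_faults;      /* faults seen while reporting was off */
};

/* Called for each fault event; returns true if it should be reported now. */
static bool fault_event(struct iommu_fault_state *st)
{
    if (!st->reporting_enabled) {
        st->early_faults++;     /* only the message is deferred */
        return false;
    }
    return true;
}

/*
 * Called once from a (hypothetical) late-init hook, after drivers have
 * started. Returns how many faults went unreported before that point.
 */
static uint32_t enable_fault_reporting(struct iommu_fault_state *st)
{
    uint32_t missed = st->early_faults;

    st->reporting_enabled = true;
    st->early_faults = 0;
    return missed;
}
```

A non-zero return value at late init would at least say that something faulted during boot, even though the per-fault details are gone by then.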

Suravee

