Re: RFC: IOMMU/AMD: Error Handling

From: Suravee Suthikulanit
Date: Tue Apr 30 2013 - 10:56:33 EST


On 4/29/2013 4:42 PM, Don Dutile wrote:
On 04/29/2013 04:34 PM, Duran, Leo wrote:
I'm wondering if resetting the IOMMU at init-time (once) would clear any BIOS induced noise.
Leo

Well, depends what you mean by 'reset'....
(a) setting it up for OS use is effectively a reset, but doesn't quiesce a device
doing dma reads of a (bios-setup) queue. then the noisy messages begin
(b) disable the iommu, and then the dma just occurs... and bad for writes, potentially.

Similar issue is being reported & worked for kdump, where device are still
doing DMA while the system is trying to 'reset' to the kexec'd kernel, and
take a crash dump.

Solution: stop devices from doing dma... but some you _want_ enabled throughout...
like keyboard & mouse via usb controller, so you get to pick os from
grub... not so for kexec...

so, again, for isolation faults.... let the hw do its job -- isolate
and throttle/silence the fault messages on a per-device, time-duration heuristic
so the system can get through boot-up where enough OS is init'd (drivers started)
to stop the temporary noise.
This sounds more like issue with the order of how things are initialized in the system.
If so, could we separate the code which enabling of IOMMU error logging/handling and
delay it until we are certain that systems are stable?

Suravee

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/