Re: [PATCH] amd iommu: force flush of iommu prior during shutdown

From: Joerg Roedel
Date: Thu Apr 01 2010 - 10:29:20 EST

Hi Neil,

first some general words about the problem you discovered: The problem
is not caused by in-flight DMA. The problem is that the IOMMU hardware
has cached the old DTE entry for the device (including the old
page-table root pointer) and is using it still when the kdump kernel has
booted. We had this problem once and fixed it by flushing a DTE in the
IOMMU before it is used for the first time. This seems to be broken
now. Which kernel have you seen this on?

I am back in office next tuesday and will look into this problem too.

On Wed, Mar 31, 2010 at 04:27:45PM -0400, Neil Horman wrote:
> So I'm officially rescinding this patch.

Yeah, the right solution to this problem is to find out why every DTE is
not longer flushed before first use.

> It apparently just covered up the problem, rather than solved it
> outright. This is going to take some more thought on my part. I read
> the code a bit closer, and the amd iommu on boot up currently marks
> all its entries as valid and having a valid translation (because if
> they're marked as invalid they're passed through untranslated which
> strikes me as dangerous, since it means a dma address treated as a bus
> address could lead to memory corruption. The saving grace is that
> they are marked as non-readable and non-writeable, so any device doing
> a dma after the reinit should get logged (which it does), and then
> target aborted (which should effectively squash the translation)

Right. The default for all devices is to forbid DMA.

> I'm starting to wonder if:
> 1) some dmas are so long lived they start aliasing new dmas that get mapped in
> the kdump kernel leading to various erroneous behavior

At least not in this case. Even when this is true the DMA would target
memory of the crashed kernel and not the kdump area. This is not even
memory corruption because the device will write to memory the driver has
allocated for it.

> 2) a slew of target aborts to some hardware results in them being in an
> inconsistent state

Thats indeed true. I have seen that with ixgbe cards for example. They
seem to be really confused after an target abort.

> I'm going to try marking the dev table on shutdown such that all devices have no
> read/write permissions to see if that changes the situation. It should I think
> give me a pointer as to weather (1) or (2) is the more likely problem.

Probably not. You still need to flush the old entries out of the IOMMU.



To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at
Please read the FAQ at