Re: [PATCH] amd/iommu: flush old domains in kdump kernel

From: Joerg Roedel
Date: Fri Sep 06 2019 - 04:37:20 EST


On Thu, Sep 05, 2019 at 12:09:48PM -0500, Stuart Hayes wrote:
> When devices are attached to the amd_iommu in a kdump kernel, the old device
> table entries (DTEs), which were copied from the crashed kernel, will be
> overwritten with a new domain number. When the new DTE is written, the IOMMU
> is told to flush the DTE from its internal cache--but it is not told to flush
> the translation cache entries for the old domain number.
>
> Without this patch, AMD systems using the tg3 network driver fail when kdump
> tries to save the vmcore to a network system, showing network timeouts and
> (sometimes) IOMMU errors in the kernel log.
>
> This patch will flush IOMMU translation cache entries for the old domain when
> a DTE gets overwritten with a new domain number.
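
The behaviour described in the quoted text can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not the code from the patch and not the amd_iommu driver's API: the toy_* names, the table sizes and the struct layouts are all hypothetical, and the real driver issues IOMMU invalidation commands rather than walking an array. It shows only the ordering: write the new DTE, then drop cached translations still tagged with the old domain number.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_DEVICES   4   /* toy device table size (hypothetical) */
#define IOTLB_ENTRIES 8   /* toy translation cache size (hypothetical) */

/* Simplified device table entry: only the fields this sketch needs. */
struct toy_dte {
    bool     valid;
    uint16_t domid;
};

/* Simplified cached translation, tagged with the domain it belongs to. */
struct toy_iotlb_entry {
    bool     valid;
    uint16_t domid;
    uint64_t iova;
};

static struct toy_dte dev_table[NUM_DEVICES];
static struct toy_iotlb_entry iotlb[IOTLB_ENTRIES];

/* Drop every cached translation tagged with the given domain id. */
static void toy_flush_domain(uint16_t domid)
{
    for (size_t i = 0; i < IOTLB_ENTRIES; i++) {
        if (iotlb[i].valid && iotlb[i].domid == domid)
            iotlb[i].valid = false;
    }
}

/* Count cached translations still tagged with the given domain id. */
static size_t toy_cached_for_domain(uint16_t domid)
{
    size_t n = 0;

    for (size_t i = 0; i < IOTLB_ENTRIES; i++) {
        if (iotlb[i].valid && iotlb[i].domid == domid)
            n++;
    }
    return n;
}

/*
 * Overwrite a DTE with a new domain id. If the old entry was valid and
 * pointed at a different domain (as with DTEs copied from a crashed
 * kernel), also flush the translations cached for that old domain.
 */
static void toy_set_dte(size_t devid, uint16_t new_domid)
{
    assert(devid < NUM_DEVICES);

    struct toy_dte old = dev_table[devid];

    dev_table[devid].valid = true;
    dev_table[devid].domid = new_domid;

    /* Real hardware would also be told to flush the cached DTE here. */
    if (old.valid && old.domid != new_domid)
        toy_flush_domain(old.domid);
}
```

In this model, a device whose inherited DTE names domain 7 and is then reattached with toy_set_dte() to domain 3 leaves no cached entries tagged with domain 7, while entries belonging to unrelated domains are untouched.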

Hmm, this seems to point to an interesting implementation detail of the
AMD IOMMUs. In theory, when the DTE is flushed, there shouldn't be any
device transactions looked up with the old domain id anymore, and thus
no faults should happen.

Anyway, applied the patch, thanks.


Regards,

Joerg