Re: [PATCH] iommu/amd: Fix iommu remap panic while amd_iommu is set to disable

From: Joerg Roedel
Date: Tue Mar 16 2021 - 11:01:28 EST


On Tue, Mar 16, 2021 at 09:36:02PM +0800, Huang Rui wrote:
> Thanks for the comments. Could you please elaborate this?
>
> Do you mean while amd_iommu=off, we won't prepare the IVRS, and even
> needn't get all ACPI talbes. Because they are never be used and the next
> state will always goes into IOMMU_CMDLINE_DISABLED, am I right?

The first problem was that amd_iommu_irq_remap is never set back to
false when irq-remapping initialization fails in amd_iommu_prepare().

But there are other problems, like that even when the IOMMU is set to
disabled on the command line with amd_iommu=off, the code still sets up
all IOMMUs and registers IRQ domains for them.

Later the code checks wheter the IOMMU should stay disabled and tears
everything down, except for the IRQ domains, which stay in the global
list.

The APIs do not really support tearing down IRQ domains well, so its not
so easy to add this to the tear-down path. Now that the IRQ domains stay
in the list, the ACPI code will come along later and calls the
->select() call-back for every IRQ domain, which gets execution to
irq_remapping_select(), depite IOMMU being disabled and
amd_iommu_rlookup_table already de-allocated. But since
amd_iommu_irq_remap is still true the NULL pointer is dereferenced,
causing the crash.

When the IRQ domains would not be around, this would also not happen. So
my patches also change the initializtion to not do all the setup work
when amd_iommu=off was passed on the command line.

Regards,

Joerg