Re: [PATCH] Revert "iommu/amd: Treat per-device exclusion ranges as r/w unity-mapped regions"

From: Baoquan He
Date: Tue Sep 22 2020 - 22:33:02 EST


Forgot to CC Jerry; adding him now.

On 09/23/20 at 10:26am, Baoquan He wrote:
> A regression in kdump kernel boot was reported on an HPE system.
> Bisection points to commit 387caf0b759ac43 ("iommu/amd: Treat per-device
> exclusion ranges as r/w unity-mapped regions") as the culprit. Reverting
> it fixes the failure.
>
> With the commit applied, the kdump kernel always prints the error message
> below, and as a result the AMD IOMMU cannot function normally during kdump
> kernel bootup.
>
> ~~~~~~~~~
> AMD-Vi: [Firmware Bug]: IVRS invalid checksum
>
> Why commit 387caf0b759ac43 causes this has not been determined yet.
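
For anyone reading along, the message above is printed when the IVRS table
fails the standard ACPI checksum rule: all bytes of the table, including the
checksum byte in the header, must sum to zero modulo 256. Below is a minimal
userspace sketch of that rule, not the kernel code itself. The function names
are made up for illustration; the checksum byte offset (9) follows the common
ACPI table header layout.

```c
#include <stddef.h>
#include <stdint.h>

/* Offset of the checksum byte in the common ACPI table header. */
#define ACPI_CSUM_OFFSET 9

/*
 * Hypothetical helper: returns 1 when all bytes of the table sum to
 * zero modulo 256, which is what a valid ACPI table must satisfy.
 */
static int acpi_table_checksum_ok(const uint8_t *table, size_t length)
{
	uint8_t sum = 0;

	for (size_t i = 0; i < length; i++)
		sum = (uint8_t)(sum + table[i]);

	return sum == 0;
}

/*
 * Hypothetical helper: recompute the checksum byte so that the whole
 * table sums to zero again. Assumes length > ACPI_CSUM_OFFSET.
 */
static void acpi_table_fix_checksum(uint8_t *table, size_t length)
{
	uint8_t sum = 0;

	table[ACPI_CSUM_OFFSET] = 0;
	for (size_t i = 0; i < length; i++)
		sum = (uint8_t)(sum + table[i]);

	table[ACPI_CSUM_OFFSET] = (uint8_t)(0u - sum);
}
```

Under this rule, any change to a byte of the in-memory table that is not
followed by a checksum update makes the sum nonzero, so a later reader of the
same table would reject it.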

Hi Joerg, Adrian

We have only one machine that can reproduce the issue; it's a gen10-01
from HPE. If any logs or other info are needed, please let me know and I
can attach them here.

Thanks
Baoquan

>
> The commit log links to a discussion thread. In that thread, Adrian said
> the fix is for a system whose BIOS is already broken, and Joerg suggested
> two options. In the end, option 2) was taken. Maybe option 1) would have
> been the right approach?
>
> 1) Bail out and disable the IOMMU as the BIOS screwed up
> 2) Treat per-device exclusion ranges just as r/w unity-mapped
> regions.
>
> https://lists.linuxfoundation.org/pipermail/iommu/2019-November/040117.html
> Signed-off-by: Baoquan He <bhe@xxxxxxxxxx>
> ---
> drivers/iommu/amd/init.c | 21 +++++++++++++--------
> 1 file changed, 13 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
> index 9aa1eae26634..bbe7ceae5949 100644
> --- a/drivers/iommu/amd/init.c
> +++ b/drivers/iommu/amd/init.c
> @@ -1109,17 +1109,22 @@ static int __init add_early_maps(void)
>   */
>  static void __init set_device_exclusion_range(u16 devid, struct ivmd_header *m)
>  {
> +	struct amd_iommu *iommu = amd_iommu_rlookup_table[devid];
> +
>  	if (!(m->flags & IVMD_FLAG_EXCL_RANGE))
>  		return;
>
> -	/*
> -	 * Treat per-device exclusion ranges as r/w unity-mapped regions
> -	 * since some buggy BIOSes might lead to the overwritten exclusion
> -	 * range (exclusion_start and exclusion_length members). This
> -	 * happens when there are multiple exclusion ranges (IVMD entries)
> -	 * defined in ACPI table.
> -	 */
> -	m->flags = (IVMD_FLAG_IW | IVMD_FLAG_IR | IVMD_FLAG_UNITY_MAP);
> +	if (iommu) {
> +		/*
> +		 * We only can configure exclusion ranges per IOMMU, not
> +		 * per device. But we can enable the exclusion range per
> +		 * device. This is done here
> +		 */
> +		set_dev_entry_bit(devid, DEV_ENTRY_EX);
> +		iommu->exclusion_start = m->range_start;
> +		iommu->exclusion_length = m->range_length;
> +	}
> +
>  }
>
>  /*
> --
> 2.17.2
>
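
A note on the per-IOMMU limitation mentioned in the restored comment: since
there is only one exclusion_start/exclusion_length pair per IOMMU, each IVMD
exclusion entry handled for that IOMMU replaces the previous one. The sketch
below models just that behavior in userspace. The struct and function names
are hypothetical, simplified stand-ins and are not the kernel definitions.

```c
#include <stdint.h>

/* Hypothetical stand-in for the range fields of an IVMD entry. */
struct ivmd_entry_model {
	uint64_t range_start;
	uint64_t range_length;
};

/* Hypothetical stand-in for the single exclusion-range slot of an IOMMU. */
struct iommu_model {
	uint64_t exclusion_start;
	uint64_t exclusion_length;
};

/*
 * Mirrors the two assignments in the restored code: the IOMMU has one
 * slot, so every call overwrites whatever an earlier entry stored.
 */
static void model_set_exclusion_range(struct iommu_model *iommu,
				      const struct ivmd_entry_model *m)
{
	iommu->exclusion_start = m->range_start;
	iommu->exclusion_length = m->range_length;
}
```

With two entries applied in sequence, only the second range remains in the
model, which is the overwrite that the removed comment attributes to tables
carrying multiple exclusion ranges.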