[PATCH 0/5] iommu/amd: fixes for suspend/resume

From: Maxim Levitsky
Date: Tue Nov 23 2021 - 11:10:51 EST


As I sadly found out, a s3 cycle makes the AMD's iommu stop sending interrupts
until the system is rebooted.

I only noticed it now because otherwise the IOMMU works, and these interrupts
are only used for errors and for GA log which I tend not to use by
making my VMs do mwait/pause/etc in guest (cpu-pm=on).

There are two issues here that prevent interrupts from being generated after
s3 cycle:

1. GA log base address was not restored after resume, and was all zeroed
after resume (by BIOS or such).

In theory if BIOS writes some junk to it, that can even cause a memory corruption.
Patch 2 fixes that.

2. INTX (aka x2apic mode) settings were not restored after resume.
That mode is used regardless if the host uses/supports x2apic, but rather when
the IOMMU supports it, and mine does.
Patches 3-4 fix that.

Note that there is still one slight (userspace) bug remaining:
During suspend all but the boot CPU are offlined and then after resume
are onlined again.

The offlining moves all non-affinity managed interrupts to CPU0, and
later when all other CPUs are onlined, there is nothing in the kernel
to spread back the interrupts over the cores.

The userspace 'irqbalance' daemon does fix this but it seems to ignore
the IOMMU interrupts in INTX mode since they are not attached to any
PCI device, and thus they remain on CPU0 after a s3 cycle,
which is suboptimal when the system has multiple IOMMUs
(mine has 4 of them).

Setting the IRQ affinity manually via /proc/irq/ does work.

This was tested on my 3970X with both INTX and regular MSI mode (later was enabled
by patching out INTX detection), by running a guest with AVIC enabled and with
a PCI assigned device (network card), and observing interrupts from
IOMMU while guest is mostly idle.

This was also tested on my AMD laptop with 4650U (which has the same issue)
(I tested only INTX mode)

Patch 1 is a small refactoring to remove an unused struct field.

Best regards,
Maxim Levitsky

Maxim Levitsky (5):
iommu/amd: restore GA log/tail pointer on host resume
iommu/amd: x2apic mode: re-enable after resume
iommu/amd: x2apic mode: setup the INTX registers on mask/unmask
iommu/amd: x2apic mode: mask/unmask interrupts on suspend/resume
iommu/amd: remove useless irq affinity notifier

drivers/iommu/amd/amd_iommu_types.h | 2 -
drivers/iommu/amd/init.c | 107 +++++++++++++++-------------
2 files changed, 58 insertions(+), 51 deletions(-)

--
2.26.3