Re: [PATCH] genirq/msi: Shutdown managed interrupts with unsatisfiable affinities

From: John Garry
Date: Wed Mar 09 2022 - 05:21:10 EST



On 07/03/2022 19:06, Marc Zyngier wrote:
When booting with maxcpus=<small number>, interrupt controllers
such as the GICv3 ITS may not be able to satisfy the affinity of
some managed interrupts, as some of the HW resources are simply
not available.

In order to deal with this, do not try to activate such an interrupt
if there is no online CPU capable of handling it. Instead, place
it in shutdown state. Once a capable CPU shows up, it will be
activated.

Reported-by: John Garry <john.garry@xxxxxxxxxx>
Reported-by: David Decotigny <ddecotig@xxxxxxxxxx>
Signed-off-by: Marc Zyngier <maz@xxxxxxxxxx>

Tested-by: John Garry <john.garry@xxxxxxxxxx>

---

JFYI, I could not recreate the crash reported in the original thread with "nohz_full=5-127 isolcpus=nohz,domain,managed_irq,5-127 maxcpus=1". For reference, here is what I set via the command line:

estuary:/$ dmesg | grep -i hz
[ 0.000000] Kernel command line: BOOT_IMAGE=/john/Image rdinit=/init console=ttyS0,115200 no_console_suspend nvme.use_threaded_interrupts=0 iommu.strict=0 acpi=force earlycon=pl011,mmio32,0x602b0000 nohz_full=5-127 isolcpus=nohz,domain,managed_irq,5-127 maxcpus=1
[ 0.000000] NO_HZ: Full dynticks CPUs: 5-127.
[ 0.000000] arch_timer: cp15 timer(s) running at 100.00MHz (phys).
[ 0.000000] sched_clock: 57 bits at 100MHz, resolution 10ns, wraps every 4398046511100ns
[ 15.314258] sbsa-gwdt sbsa-gwdt.0: Initialized with 10s timeout @ 100000000 Hz, action=0

And for the kernel build:
$ more .config | grep NO_HZ
CONFIG_NO_HZ_COMMON=y
# CONFIG_NO_HZ_IDLE is not set
CONFIG_NO_HZ_FULL=y
# CONFIG_NO_HZ is not set
$

Thanks,
John
kernel/irq/msi.c | 12 ++++++++++++
1 file changed, 12 insertions(+)

diff --git a/kernel/irq/msi.c b/kernel/irq/msi.c
index 2bdfce5edafd..aa84ce84c2ec 100644
--- a/kernel/irq/msi.c
+++ b/kernel/irq/msi.c
@@ -818,6 +818,18 @@ static int msi_init_virq(struct irq_domain *domain, int virq, unsigned int vflag
 		irqd_clr_can_reserve(irqd);
 		if (vflags & VIRQ_NOMASK_QUIRK)
 			irqd_set_msi_nomask_quirk(irqd);
+
+		/*
+		 * If the interrupt is managed but no CPU is available
+		 * to service it, shut it down until better times.
+		 */
+		if ((vflags & VIRQ_ACTIVATE) &&
+		    irqd_affinity_is_managed(irqd) &&
+		    !cpumask_intersects(irq_data_get_affinity_mask(irqd),
+					cpu_online_mask)) {
+			irqd_set_managed_shutdown(irqd);
+			return 0;
+		}
 	}
 	if (!(vflags & VIRQ_ACTIVATE))
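
For readers who want to reason about the check outside the kernel, here is a minimal userspace sketch of the decision the hunk above makes. It is not kernel code: the type cpumask_model_t, struct irq_model, and the functions masks_intersect(), irq_model_init() and irq_model_cpu_online() are all hypothetical names invented for this illustration, and a CPU mask is assumed to fit in 64 bits. The "CPU comes online" half is only a model of the behaviour the commit message describes ("Once a capable CPU shows up, it will be activated"); it does not claim to mirror how the kernel hotplug path implements it.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model: one bit per CPU, at most 64 CPUs. */
typedef uint64_t cpumask_model_t;

/* Hypothetical stand-in for the per-interrupt state touched by the patch. */
struct irq_model {
	cpumask_model_t affinity;  /* CPUs allowed to service the interrupt */
	bool managed;              /* affinity is managed by the kernel */
	bool managed_shutdown;     /* parked until a capable CPU is online */
	bool activated;            /* interrupt has been activated */
};

/* True if the two masks share at least one CPU. */
static bool masks_intersect(cpumask_model_t a, cpumask_model_t b)
{
	return (a & b) != 0;
}

/*
 * Models the new check: a managed interrupt whose affinity mask has
 * no online CPU is parked in the shutdown state instead of activated.
 */
static void irq_model_init(struct irq_model *irq, cpumask_model_t online)
{
	if (irq->managed && !masks_intersect(irq->affinity, online)) {
		irq->managed_shutdown = true;
		irq->activated = false;
		return;
	}
	irq->managed_shutdown = false;
	irq->activated = true;
}

/*
 * Models a CPU coming online: a parked interrupt is activated once
 * its affinity mask intersects the updated online mask.
 */
static void irq_model_cpu_online(struct irq_model *irq, cpumask_model_t online)
{
	if (irq->managed_shutdown && masks_intersect(irq->affinity, online)) {
		irq->managed_shutdown = false;
		irq->activated = true;
	}
}
```

With only CPU 0 online (as with maxcpus=1) and an interrupt whose affinity covers CPUs 5-6, irq_model_init() leaves the interrupt parked; calling irq_model_cpu_online() after CPU 5 is added to the online mask activates it.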