Re: [BUG 4.15-rc7] IRQ matrix management errors

From: Thomas Gleixner
Date: Tue Jan 16 2018 - 06:20:27 EST


On Tue, 16 Jan 2018, Thomas Gleixner wrote:
> On Tue, 16 Jan 2018, Keith Busch wrote:
>
> > This is all way over my head, but the part that obviously shows
> > something's gone wrong:
> >
> > kworker/u674:3-1421 [028] d... 335.307051: irq_matrix_reserve_managed: bit=56 cpu=0 online=1 avl=86 alloc=116 managed=3 online_maps=112 global_avl=22084, global_rsvd=157, total_alloc=570
> > kworker/u674:3-1421 [028] d... 335.307053: irq_matrix_remove_managed: bit=56 cpu=0 online=1 avl=87 alloc=116 managed=2 online_maps=112 global_avl=22085, global_rsvd=157, total_alloc=570
> > kworker/u674:3-1421 [028] .... 335.307054: vector_reserve_managed: irq=45 ret=-28
> > kworker/u674:3-1421 [028] .... 335.307054: vector_setup: irq=45 is_legacy=0 ret=-28
> > kworker/u674:3-1421 [028] d... 335.307055: vector_teardown: irq=45 is_managed=1 has_reserved=0
> >
> > Which leads me to x86_vector_alloc_irqs goto error:
> >
> > error:
> > 	x86_vector_free_irqs(domain, virq, i + 1);
> >
> > The last parameter looks weird. It's the nr_irqs, and since we failed and
> > bailed, I would think we'd need to subtract 1 rather than add 1. Adding
> > 1 would doubly remove the failed one, and remove the next one that
> > was never set up, right?
>
> Right. That's fishy. Let me stare at it.

What we want is s/i + 1/i/

That's correct because x86_vector_free_irqs() does:

	for (i = 0; i < nr; i++)
		....

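So if we fail at the first irq, then the loop will do nothing. Failing on
the second will free the first ....

As the failed irq is then no longer covered by that loop, its apic chip
data has to be freed right at the point of failure.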
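
The off-by-one can be illustrated with a small userspace model. This is only a sketch, not kernel code: slot_set_up, free_slots(), alloc_slots() and the 'extra' parameter are hypothetical names made up for the illustration. free_slots(nr) has the same loop shape as x86_vector_free_irqs(), and extra selects between the old count (i + 1) and the fixed one (i).

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_SLOTS 8

/* Hypothetical stand-in for per-irq state, not the kernel's structures. */
static bool slot_set_up[MAX_SLOTS];
/* Counts teardowns of slots which never completed setup. */
static int bogus_teardowns;

/* Same loop shape as the free function: tears down slots [0, nr). */
static void free_slots(int nr)
{
	assert(nr >= 0 && nr <= MAX_SLOTS);
	for (int i = 0; i < nr; i++) {
		if (!slot_set_up[i])
			bogus_teardowns++;
		slot_set_up[i] = false;
	}
}

/* Number of slots which are still marked as set up. */
static int slots_still_set_up(void)
{
	int n = 0;

	for (int i = 0; i < MAX_SLOTS; i++)
		n += slot_set_up[i] ? 1 : 0;
	return n;
}

/*
 * Set up nr slots and fail at index fail_at (-1: never fail).
 * The error path tears down i + extra slots: extra = 1 models the
 * old code, extra = 0 the fixed code.
 * Returns the number of bogus teardowns (0 on success).
 */
static int alloc_slots(int nr, int fail_at, int extra)
{
	int i;

	assert(nr >= 0 && nr <= MAX_SLOTS);
	bogus_teardowns = 0;
	for (i = 0; i < MAX_SLOTS; i++)
		slot_set_up[i] = false;

	for (i = 0; i < nr; i++) {
		if (i == fail_at)
			goto error;	/* slot i was never set up */
		slot_set_up[i] = true;
	}
	return 0;

error:
	free_slots(i + extra);
	return bogus_teardowns;
}
```

In this model alloc_slots(4, 2, 1) tears down slot 2 although it never completed setup, while alloc_slots(4, 2, 0) tears down exactly slots 0 and 1. A failure at index 0 with extra = 0 makes the loop a no-op.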

Fix below.

Thanks,

tglx

8<----------------------
diff --git a/arch/x86/kernel/apic/vector.c b/arch/x86/kernel/apic/vector.c
index f8b03bb8e725..3cc471beb50b 100644
--- a/arch/x86/kernel/apic/vector.c
+++ b/arch/x86/kernel/apic/vector.c
@@ -542,14 +542,17 @@ static int x86_vector_alloc_irqs(struct irq_domain *domain, unsigned int virq,

 		err = assign_irq_vector_policy(irqd, info);
 		trace_vector_setup(virq + i, false, err);
-		if (err)
+		if (err) {
+			irqd->chip_data = NULL;
+			free_apic_chip_data(apicd);
 			goto error;
+		}
 	}
 
 	return 0;
 
 error:
-	x86_vector_free_irqs(domain, virq, i + 1);
+	x86_vector_free_irqs(domain, virq, i);
 	return err;
 }