Re: Linux 4.9.6 ( Restore IO-APIC irq_chip retrigger callback , breaks my box )

From: Thomas Gleixner
Date: Tue Feb 07 2017 - 17:04:12 EST


On Mon, 6 Feb 2017, Linus Torvalds wrote:
> That said, it also strikes me that the implicated
> irq_chip_retrigger_hierarchy() function looks really very suspicious
> indeed.
>
> Most of the other users don't seem to traverse the parent all the way
> until they find something. They just do the operation in the parent,
> and if the parent needs it, it might then do it in _its_ parent and so
> on.

The whole point of the hierarchy is that we have decoupled the stacked
chips so the ioapic does not know whether it is connected to the irq
remapping unit or to the vector domain directly.

> So I'm wondering if that for-loop triggers a stack overflow on your
> setup somehow, just because that irq_retrigger() call is now truly
> recursive, and hasn't been turned into tail-calls.

It would only be recursive if some level down the hierarchy would use the
same callback.

The ioapic is always on top of its hierarchy and its either connected to
the vector domain directly, which is the last level in the hierarchy and
implements the real retrigger callback or to the irq remapping unit which
does not have a retrigger callback. So it's not a recursion problem AFAICT,
but lets try and just use the apic callback directly as we did before the
whole hierarchy rework. That's wrong for other reasons, but that does not
matter in that particular case. Patch below.

We have the same situation with the MSI interrupt domains which all use
irq_chip_retrigger_hierarchy() function as their retrigger callback, which
does not seem to have the same effect on Gabriels machine.

I have the feeling that this commit unearthes some other subtle wreckage in
the interrupt machinery which gets not triggered otherwise.

Thanks,

tglx

8<-------------------
diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c
index 52f352b063fd..3b6e5f3f099d 100644
--- a/arch/x86/kernel/apic/io_apic.c
+++ b/arch/x86/kernel/apic/io_apic.c
@@ -1867,6 +1867,8 @@ static int ioapic_set_affinity(struct irq_data *irq_data,
return ret;
}

+extern int apic_retrigger_irq(struct irq_data *irq_data);
+
static struct irq_chip ioapic_chip __read_mostly = {
.name = "IO-APIC",
.irq_startup = startup_ioapic_irq,
@@ -1875,7 +1877,7 @@ static struct irq_chip ioapic_chip __read_mostly = {
.irq_ack = irq_chip_ack_parent,
.irq_eoi = ioapic_ack_level,
.irq_set_affinity = ioapic_set_affinity,
- .irq_retrigger = irq_chip_retrigger_hierarchy,
+ .irq_retrigger = apic_retrigger_irq,
.flags = IRQCHIP_SKIP_SET_WAKE,
};

@@ -1887,7 +1889,7 @@ static struct irq_chip ioapic_ir_chip __read_mostly = {
.irq_ack = irq_chip_ack_parent,
.irq_eoi = ioapic_ir_ack_level,
.irq_set_affinity = ioapic_set_affinity,
- .irq_retrigger = irq_chip_retrigger_hierarchy,
+ .irq_retrigger = apic_retrigger_irq,
.flags = IRQCHIP_SKIP_SET_WAKE,
};

diff --git a/arch/x86/kernel/apic/vector.c b/arch/x86/kernel/apic/vector.c
index 5d30c5e42bb1..ce9b93e19266 100644
--- a/arch/x86/kernel/apic/vector.c
+++ b/arch/x86/kernel/apic/vector.c
@@ -496,7 +496,7 @@ void setup_vector_irq(int cpu)
__setup_vector_irq(cpu);
}

-static int apic_retrigger_irq(struct irq_data *irq_data)
+int apic_retrigger_irq(struct irq_data *irq_data)
{
struct apic_chip_data *data = apic_chip_data(irq_data);
unsigned long flags;