Re: [PATCH v8 0/4] arm: KGDB NMI/FIQ support

From: Harro Haan
Date: Wed Jul 16 2014 - 13:15:21 EST


On 15 July 2014 19:08, Daniel Thompson <daniel.thompson@xxxxxxxxxx> wrote:
> On 15/07/14 16:59, Harro Haan wrote:
>> On 15 July 2014 16:52, Daniel Thompson <daniel.thompson@xxxxxxxxxx> wrote:
>>> On 15/07/14 14:04, Harro Haan wrote:
>>>>> Makes sense.
>>>>>
>>>>> Avoiding this problem on GICv2 is easy (thanks to the aliased intack
>>>>> register) but on iMX.6 we have only a GICv1.
>>>>>
>>>>>
>>>>>> I can reduce the number of occurrences (not prevent it) by adding the
>>>>>> following hack to irq-gic.c
>>>>>> @@ -297,10 +309,12 @@ static asmlinkage void __exception_irq_entry
>>>>>> gic_handle_irq(struct pt_regs *regs
>>>>>> u32 irqstat, irqnr;
>>>>>> struct gic_chip_data *gic = &gic_data[0];
>>>>>> void __iomem *cpu_base = gic_data_cpu_base(gic);
>>>>>>
>>>>>> do {
>>>>>> + while(readl_relaxed(gic_data_dist_base(gic) + GIC_DIST_PENDING_SET)
>>>>>> & (1 << 30))
>>>>>> + printk(KERN_ERR "TEMP: gic_handle_irq: wait for FIQ exception\n");
>>>>>> irqstat = readl_relaxed(cpu_base + GIC_CPU_INTACK);
>>>>>> irqnr = irqstat & ~0x1c00;
>>>>>
>>>>> I've made a more complete attempt to fix this. Could you test the
>>>>> following? (and be prepared to fuzz the line numbers)
>>>>
>>>> Thanks Daniel, I have tested it, see the comments below.
>>>>
>>>>>
>>>>> diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c
>>>>> index 73ae896..309bf2c 100644
>>>>> --- a/drivers/irqchip/irq-gic.c
>>>>> +++ b/drivers/irqchip/irq-gic.c
>>>>> @@ -303,6 +303,28 @@ static int gic_set_wake(struct irq_data *d,
>>>>> unsigned int on)
>>>>> #define gic_set_wake NULL
>>>>> #endif
>>>>>
>>>>> +/* Check for group 0 interrupt spuriously acked as a normal IRQ. This
>>>>> + * workaround will only work for level triggered interrupts (and in
>>>>> + * its current form is actively harmful on systems that don't support
>>>>> + * FIQ).
>>>>> + */
>>>>> +static u32 gic_handle_spurious_group_0(struct gic_chip_data *gic, u32
>>>>> irqstat)
>>>>> +{
>>>>> + u32 irqnr = irqstat & GICC_IAR_INT_ID_MASK;
>>>>> +
>>>>> + if (!gic_data_fiq_enable(gic) || irqnr >= 1021)
>>>>> + return irqnr;
>>>>> +
>>>>> + if (readl_relaxed(gic_data_dist_base(gic) + GIC_DIST_IGROUP +
>>>>> + (irqnr / 32 * 4)) &
>>>>> + BIT(irqnr % 32))
>>>>> + return irqnr;
>>>>> +
>>>>> + /* this interrupt was spurious (needs to be handled as FIQ) */
>>>>> + writel_relaxed(irqstat, gic_data_cpu_base(gic) + GIC_CPU_EOI);
>>>>
>>>> This will NOT work, because of the note I mentioned above:
>>>> "The FIQ exception will not occur anymore after gic_handle_irq
>>>> read/acknowledges the FIQ by reading the GIC_CPU_INTACK register"
>>>> So with this code you will say End Of Interrupt at the GIC level,
>>>> without actually handling the interrupt, so you are missing an
>>>> interrupt.
>>>> I did the following test to confirm the missing interrupt:
>>>> I have changed the periodic timer interrupt by an one-shot timer
>>>> interrupt. The one-shot timer interrupt is programmed by the FIQ
>>>> handler for the next FIQ interrupt. As expected: When the problem
>>>> occurs, no more FIQ interrupts are generated.
>>>
>>> Can you confirm whether your timer interrupts are configured level or
>>> edge triggered? Or whether EOIing the GIC causes them to be cleared by
>>> some other means.
>>
>> From page 48 of DDI0407I_cortex_a9_mpcore_r4p1_trm.pdf:
>> Watchdog timers, PPI(3)
>> Each Cortex-A9 processor has its own watchdog timers that can generate
>> interrupts, using ID30.
>>
>> From page 56:
>> PPI[0], [2],and[3]:b11
>> interrupt is rising-edge sensitive.
>
> Thanks. This is clear.
>
>
>>> If we can't get level triggering to work then we have to:
>>>
>>> 1. Write code to jump correctly into FIQ mode.
>>>
>>> 2. Modify the gic's ack_fiq() callback to automatically avoid reading
>>> intack when the workaround is deployed.
>>>
>>> The above is why I wanted to see if we can make do with level triggering
>>> (and automatic re-triggering for interrupts such as SGIs that are
>>> cleared by EOI).
>>
>> But the re-triggering introduces extra latencies and a lot of use
>> cases of FIQ's try to avoid that.
>
> I'm not really clear why retriggering should be significantly more
> expensive than the dance required to fake up entry the FIQ handler.
>
> On the other hand retriggering allows us to avoid hacks in the FIQ
> handler to stop it acknowledging the interrupt. Given hacks like that
> won't be needed on A7/A15 this seems like a good outcome.
>
> Anyhow I've put together a new version of my earlier patch that I think
> will retrigger all interrupts except SGIs (I'll look at SGIs and
> compatibility with non-Freescale parts only if this improved approach
> works).
>
> Reported-by: Harro Haan <hrhaan@xxxxxxxxx>
> Signed-off-by: Daniel Thompson <daniel.thompson@xxxxxxxxxx>
> ---
> drivers/irqchip/irq-gic.c | 31 ++++++++++++++++++++++++++++++-
> 1 file changed, 30 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c
> index 73ae896..88f92e6 100644
> --- a/drivers/irqchip/irq-gic.c
> +++ b/drivers/irqchip/irq-gic.c
> @@ -303,6 +303,33 @@ static int gic_set_wake(struct irq_data *d,
> unsigned int on)
> #define gic_set_wake NULL
> #endif
>
> +/* This is a software emulation of the Aliased Interrupt Acknowledge
> Register
> + * (GIC_AIAR) found in GICv2+.
> + */
> +static u32 gic_handle_spurious_group_0(struct gic_chip_data *gic, u32
> irqstat)
> +{
> + u32 irqnr = irqstat & GICC_IAR_INT_ID_MASK;
> + void __iomem *dist_base = gic_data_dist_base(gic);
> + u32 offset, mask;
> +
> + if (!gic_data_fiq_enable(gic) || irqnr >= 1021)
> + return irqnr;
> +
> + offset = irqnr / 32 * 4;
> + mask = 1 << (irqnr % 32);
> + if (readl_relaxed(dist_base + GIC_DIST_IGROUP + offset) & mask)
> + return irqnr;
> +
> + /* this interrupt must be taken as a FIQ so put it back into the
> + * pending state and end our own servicing of it.
> + */
> + writel_relaxed(mask, dist_base + GIC_DIST_PENDING_SET + offset);
> + readl_relaxed(dist_base + GIC_DIST_PENDING_SET + offset);
> + writel_relaxed(irqstat, gic_data_cpu_base(gic) + GIC_CPU_EOI);
> +
> + return 1023;
> +}
> +
> static void __exception_irq_entry gic_handle_irq(struct pt_regs *regs)
> {
> u32 irqstat, irqnr;
> @@ -310,8 +337,10 @@ static void __exception_irq_entry
> gic_handle_irq(struct pt_regs *regs)
> void __iomem *cpu_base = gic_data_cpu_base(gic);
>
> do {
> + local_fiq_disable();
> irqstat = readl_relaxed(cpu_base + GIC_CPU_INTACK);
> - irqnr = irqstat & GICC_IAR_INT_ID_MASK;
> + irqnr = gic_handle_spurious_group_0(gic, irqstat);
> + local_fiq_enable();
>
> if (likely(irqnr > 15 && irqnr < 1021)) {
> irqnr = irq_find_mapping(gic->domain, irqnr);
> --
> 1.9.3
>
>
>

I just tested the above code. This approach also works as expected for
edge sensitive interrupts.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/