Re: [PATCH] irqchip/gic: Convert to handle_strict_flow_irq()
From: Marc Zyngier
Date: Sun Aug 15 2021 - 03:01:28 EST
On Sun, 15 Aug 2021 00:36:39 +0100,
Guenter Roeck <linux@xxxxxxxxxxxx> wrote:
>
> On 8/14/21 3:26 PM, Valentin Schneider wrote:
> > Hi,
> >
> > On 14/08/21 12:47, Guenter Roeck wrote:
> >> On Tue, Jun 29, 2021 at 01:50:09PM +0100, Valentin Schneider wrote:
> >>> Now that the proper infrastructure is in place, convert the irq-gic chip to
> >>> use handle_strict_flow_irq() along with IRQCHIP_AUTOMASKS_FLOW.
> >>>
> >>> For EOImode=1, the Priority Drop is moved from gic_handle_irq() into
> >>> chip->irq_ack(). This effectively pushes the EOI write down into
> >>> ->handle_irq(), but doesn't change its ordering wrt the irqaction
> >>> handling.
> >>>
> >>> The EOImode=1 irqchip also gains IRQCHIP_EOI_THREADED, which allows the
> >>> ->irq_eoi() call to be deferred to the tail of ONESHOT IRQ threads. This
> >>> means a threaded ONESHOT IRQ can now be handled entirely without a single
> >>> chip->irq_mask() call.
> >>>
> >>> EOImode=0 handling remains unchanged.
> >>>
> >>> Signed-off-by: Valentin Schneider <valentin.schneider@xxxxxxx>
> >>> Signed-off-by: Marc Zyngier <maz@xxxxxxxxxx>
> >>> Link: https://lore.kernel.org/r/20210629125010.458872-13-valentin.schneider@xxxxxxx
> >>> ---
> >>
> >> This patch results in a variety of crashes in -next.
> >>
> >
> > Thanks for testing & reporting.
> >
> >> Example:
> >>
> >> [ 0.025729] Unable to handle kernel NULL pointer dereference at virtual address 00000000
> >> [ 0.025811] pgd = (ptrval)
> >> [ 0.026029] [00000000] *pgd=00000000
> >> [ 0.027301] Internal error: Oops: 80000005 [#1] SMP ARM
> >> [ 0.027650] Modules linked in:
> >> [ 0.027986] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.14.0-rc5-next-20210813 #1
> >> [ 0.028162] Hardware name: Freescale i.MX6 Ultralite (Device Tree)
> >> [ 0.028311] PC is at 0x0
> >> [ 0.028485] LR is at handle_strict_flow_irq+0xc8/0x2b8
> >> [ 0.028538] pc : [<00000000>] lr : [<c01a59c4>] psr: 400001d3
> >> [ 0.028566] sp : c1701e60 ip : c17097cc fp : 00000050
> >> [ 0.028594] r10: c1700000 r9 : 00000057 r8 : c2012400
> >> [ 0.028641] r7 : 00000000 r6 : c1711268 r5 : c2083c6c r4 : c2083c00
> >> [ 0.028674] r3 : 00000000 r2 : 00000000 r1 : 0a728000 r0 : c2083c18
> >> [ 0.028766] Flags: nZcv IRQs off FIQs off Mode SVC_32 ISA ARM Segment none
> >> [ 0.028821] Control: 10c5387d Table: 8000406a DAC: 00000051
> >> [ 0.028881] Register r0 information: slab kmalloc-512 start c2083c00 pointer offset 24 size 512
> >> [ 0.029482] Register r1 information: non-paged memory
> >> [ 0.029649] Register r2 information: NULL pointer
> >> [ 0.029672] Register r3 information: NULL pointer
> >> [ 0.029692] Register r4 information: slab kmalloc-512 start c2083c00 pointer offset 0 size 512
> >> [ 0.029752] Register r5 information: slab kmalloc-512 start c2083c00 pointer offset 108 size 512
> >> [ 0.029809] Register r6 information: non-slab/vmalloc memory
> >> [ 0.029876] Register r7 information: NULL pointer
> >> [ 0.029897] Register r8 information: slab kmalloc-1k start c2012400 pointer offset 0 size 1024
> >> [ 0.029958] Register r9 information: non-paged memory
> >> [ 0.029980] Register r10 information: non-slab/vmalloc memory
> >> [ 0.030002] Register r11 information: non-paged memory
> >> [ 0.030023] Register r12 information: non-slab/vmalloc memory
> >> [ 0.030064] Process swapper/0 (pid: 0, stack limit = 0x(ptrval))
> >> [ 0.030167] Stack: (0xc1701e60 to 0xc1702000)
> >> [ 0.030562] 1e60: c168fcd8 c1700000 c17097cc c019fa2c c1709a44 d080c00c c168fcfc c1700000
> >> [ 0.030754] 1e80: c17097cc c1701eb8 c1700000 c076fe04 c1700000 600000d3 c0824188 80000053
> >> [ 0.030924] 1ea0: ffffffff c1701eec 00000000 c1700000 c1700000 c0100b70 c2013400 c20a0000
> >> [ 0.031096] 1ec0: 00000050 00000000 c20a00a0 c07c0ad4 00000910 c2013400 00000000 c20a0000
> >> [ 0.031267] 1ee0: c1700000 00000050 00000000 c1701f08 c0824188 c0824188 80000053 ffffffff
> >> [ 0.031439] 1f00: 00000051 00000000 00000000 c168f31c c17097cc c2013400 c2013708 3123ccba
> >> [ 0.031614] 1f20: 00000001 c2013400 00000960 c20a0000 c1422958 c1709380 c16804d8 c1646020
> >> [ 0.031790] 1f40: c14d1eec c082626c c1f6f4e0 c2013400 00000000 c1422958 c1709380 c16461d4
> >> [ 0.031961] 1f60: 00000000 00000000 00000000 c0f7d3a4 c186fdcc c1700000 c17093cc c16804d8
> >> [ 0.032155] 1f80: c18f8e20 c160fe80 c1903000 c1663a80 00000012 c1700000 c1709380 00000000
> >> [ 0.032327] 1fa0: c170caf0 c1600e50 ffffffff ffffffff 00000000 c1600588 00000000 c1700000
> >> [ 0.032500] 1fc0: 00000000 c1663a80 3126c2ba 00000000 00000000 c1600330 00000051 10c0387d
> >> [ 0.032671] 1fe0: ffffffff 8833b000 410fc075 10c5387d 00000000 00000000 00000000 00000000
> >> [ 0.032889] [<c01a59c4>] (handle_strict_flow_irq) from [<c019fa2c>] (handle_domain_irq+0x64/0xa4)
> >> [ 0.033027] [<c019fa2c>] (handle_domain_irq) from [<c076fe04>] (gic_handle_irq+0x84/0xbc)
> >> [ 0.033076] [<c076fe04>] (gic_handle_irq) from [<c0100b70>] (__irq_svc+0x70/0x98)
> >> [ 0.033140] Exception stack(0xc1701eb8 to 0xc1701f00)
> >> [ 0.033204] 1ea0: c2013400 c20a0000
> >> [ 0.033377] 1ec0: 00000050 00000000 c20a00a0 c07c0ad4 00000910 c2013400 00000000 c20a0000
> >> [ 0.033551] 1ee0: c1700000 00000050 00000000 c1701f08 c0824188 c0824188 80000053 ffffffff
> >> [ 0.033596] [<c0100b70>] (__irq_svc) from [<c0824188>] (do_update_region+0x108/0x19c)
> >> [ 0.033652] [<c0824188>] (do_update_region) from [<c082626c>] (csi_J+0xf0/0x290)
> >> [ 0.033685] [<c082626c>] (csi_J) from [<c16461d4>] (con_init+0x1b4/0x248)
> >> [ 0.033726] [<c16461d4>] (con_init) from [<c160fe80>] (console_init+0x21c/0x330)
> >> [ 0.033760] [<c160fe80>] (console_init) from [<c1600e50>] (start_kernel+0x508/0x6d4)
> >> [ 0.033797] [<c1600e50>] (start_kernel) from [<00000000>] (0x0)
> >> [ 0.034079] Code: bad PC value
> >> [ 0.034726] ---[ end trace c1324fc4facef313 ]---
> >> [ 0.035004] Kernel panic - not syncing: Fatal exception in interrupt
> >>
> >> https://kerneltests.org/builders/qemu-arm-v7-next/builds/66 provides
> >> a more detailed log.
> >>
> >> Reverting this patch fixes the problem.
> >>
> >
> > Passing the stacktrace through scripts/decode_stracktrace.sh would help by
> > giving line numbers - that said I'm pretty sure this is caused by the
> > flow-handler now issuing a chip->irq_ack(), and this has an irqchip that
> > doesn't have one.
> >
> > The hardware name points me at arch/arm/mach-imx/mach-imx6ul.c, but I can't
> > figure out an actual devicetree/irqchip layout from this.
> >
> > Ah, with some more grepping I did find arch/arm/mach-imx/gpc.c, and bingo,
> > no .irq_ack().
> >
> > Now, the above makes me feel like this is the start of a wild goose chase
> > for irqchips in a similar situation. I'm tempted to apply a dumb
> >
> > .irq_ack = irq_chip_ack_parent
> >
> > to all irqchips without one specified. Let's see how bad this gets, does
> > the below help?
> >
>
> As a follow-up, Coccinelle tells me that there are more than 300
> declarations of struct irq_chip variables without irq_ack pointer in
> the kernel. Does this problem affect all of them or just a subset ?
> If it doesn't affect all of them, how does one know when irq_ack is
> mandatory and when not ?
One needs to understand what flow is being used. For irqchip stacks
using fast-eoi, there was no need for an irq_ack callback until we
switched to Valentin's "strict" flow.
The problem is that although irqchip *look* like they can be composed
at will, the composition only makes sense if they all agree on the
flow that is being used.
So we can either identify the irqchips that need to be updated (not an
easy task, but not impossible: just look for anything that feeds into
a GIC through a hierarchy and not a mux). Or we can make the flow
iterate over the irqchip layers.
M.
--
Without deviation from the norm, progress is not possible.