RE: [PATCH 1/1] arm64: kexec: no need to do irq_chip->irq_mask if it already masked

From: Jason Liu
Date: Thu Aug 06 2020 - 06:08:02 EST




> -----Original Message-----
> From: Marc Zyngier <maz@xxxxxxxxxx>
> Sent: Wednesday, August 5, 2020 4:18 PM
> To: Jason Liu <jason.hui.liu@xxxxxxx>
> Cc: Sudeep Holla <sudeep.holla@xxxxxxx>; catalin.marinas@xxxxxxx;
> will@xxxxxxxxxx; sashal@xxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx;
> linux-arm-kernel@xxxxxxxxxxxxxxxxxxx
> Subject: Re: [PATCH 1/1] arm64: kexec: no need to do irq_chip->irq_mask if it
> already masked
>
> On 2020-08-05 07:31, Jason Liu wrote:
> >> -----Original Message-----
> >> From: Sudeep Holla <sudeep.holla@xxxxxxx>
> >> Sent: Tuesday, August 4, 2020 7:39 PM
> >> To: Marc Zyngier <maz@xxxxxxxxxx>
> >> Cc: Jason Liu <jason.hui.liu@xxxxxxx>; catalin.marinas@xxxxxxx;
> >> will@xxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; Sudeep Holla
> >> <sudeep.holla@xxxxxxx>; linux-arm-kernel@xxxxxxxxxxxxxxxxxxx
> >> Subject: Re: [PATCH 1/1] arm64: kexec: no need to do
> >> irq_chip->irq_mask if it already masked
> >>
> >> On Tue, Aug 04, 2020 at 11:58:47AM +0100, Marc Zyngier wrote:
> >> > On 2020-08-04 09:56, Jason Liu wrote:
> >> > > No need to do the irq_chip->irq_mask() if it already masked.
> >> > > BTW, unconditionally do the irq_chip->irq_mask() will also bring
> >> > > issues when the irq_chip in the runtime PM suspend. Accessing
> >> > > registers of the irq_chip will bring in the exceptions. For
> >> > > example on the
> >> i.MX:
> >> > >
> >> > > root@imx8qmmek:~# echo c > /proc/sysrq-trigger [ 177.796182] sysrq:
> >> > > Trigger a crash [ 177.799596] Kernel panic - not syncing: sysrq
> >> > > triggered crash [ 177.875616] SMP: stopping secondary CPUs [
> >> > > 177.891936] Internal error: synchronous external abort: 96000210
> >> > > [#1] PREEMPT SMP [ 177.899429] Modules linked in: crct10dif_ce
> >> > > mxc_jpeg_encdec [ 177.905018] CPU: 1 PID: 944 Comm: sh Kdump:
> >> > > loaded Not tainted [ 177.913457] Hardware name: Freescale
> >> > > i.MX8QM MEK (DT) [ 177.918517] pstate: a0000085 (NzCv daIf -PAN
> >> > > -UAO) [ 177.923318] pc : imx_irqsteer_irq_mask+0x50/0x80
> [ 177.927944] lr :
> >> > > imx_irqsteer_irq_mask+0x38/0x80 [ 177.932561] sp :
> >> > > ffff800011fe3a50 [ 177.935880] x29: ffff800011fe3a50 x28:
> >> > > ffff0008f7708e00 [ 177.941196] x27: 0000000000000000 x26:
> >> > > 0000000000000000 [ 177.946513] x25: ffff800011a30c80 x24:
> >> > > 0000000000000000 [ 177.951830] x23: ffff800011fe3af8 x22:
> >> > > ffff0008f24469d4 [ 177.957147] x21: ffff0008f2446880 x20:
> >> > > ffff0008f25f5658 [ 177.962463] x19: ffff800012611004 x18:
> >> > > 0000000000000001 [ 177.967780] x17: 0000000000000000 x16:
> >> > > 0000000000000000 [ 177.973097] x15: ffff0008f7709270 x14:
> >> > > 0000000060000085 [ 177.978414] x13: ffff800010177570 x12:
> >> > > ffff800011fe3ab0 [ 177.983730] x11: ffff80001017749c x10:
> >> > > 0000000000000040 [ 177.989047] x9 : ffff8000119f1c80 x8 :
> >> > > ffff8000119f1c78 [ 177.994364] x7 : ffff0008f46bedf8 x6 :
> >> > > 0000000000000000 [ 177.999681] x5 : ffff0008f46beda0 x4 :
> >> > > 0000000000000000 [ 178.004997] x3 : ffff0008f24469d4 x2 :
> >> > > ffff800012611000 [ 178.010314] x1 : 0000000000000080 x0 :
> >> > > 0000000000000080 [ 178.015630] Call trace:
> >> > > [ 178.018077] imx_irqsteer_irq_mask+0x50/0x80 [ 178.022352]
> >> > > machine_crash_shutdown+0xa8/0x100 [ 178.026802]
> >> > > __crash_kexec+0x6c/0x118 [ 178.030464] panic+0x19c/0x324 [
> >> > > 178.033524] sysrq_handle_reboot+0x0/0x20 [ 178.037537]
> >> > > __handle_sysrq+0x88/0x180 [ 178.041290]
> >> > > write_sysrq_trigger+0x8c/0xb0 [ 178.045389]
> >> > > proc_reg_write+0x78/0xb0 [ 178.049055] __vfs_write+0x18/0x40 [
> >> > > 178.052461] vfs_write+0xdc/0x1c8 [ 178.055779]
> >> > > ksys_write+0x68/0xf0 [ 178.059098] __arm64_sys_write+0x18/0x20
> >> > > [ 178.063027] el0_svc_common.constprop.0+0x68/0x160
> >> > > [ 178.067821] el0_svc_handler+0x20/0x80 [ 178.071573]
> >> > > el0_svc+0x8/0xc [ 178.074463] Code: 93407e73 91001273 aa0003e1
> >> > > 8b130053 (b9400260) [ 178.080567] ---[ end trace
> >> > > 652333f6c6d6b05d
> >> > > ]---
> >> > >
> >> > > Signed-off-by: Jason Liu <jason.hui.liu@xxxxxxx>
> >> > > Cc: <stable@xxxxxxxxxxxxxxx>
> >> > > Cc: Catalin Marinas <catalin.marinas@xxxxxxx>
> >> > > Cc: Will Deacon <will@xxxxxxxxxx>
> >> > > Cc: Sasha Levin <sashal@xxxxxxxxxx>
> >> > > ---
> >> > > arch/arm64/kernel/machine_kexec.c | 2 +-
> >> > > 1 file changed, 1 insertion(+), 1 deletion(-)
> >> > >
> >> > > diff --git a/arch/arm64/kernel/machine_kexec.c
> >> > > b/arch/arm64/kernel/machine_kexec.c
> >> > > index a0b144cfaea7..8ab263c733bf 100644
> >> > > --- a/arch/arm64/kernel/machine_kexec.c
> >> > > +++ b/arch/arm64/kernel/machine_kexec.c
> >> > > @@ -236,7 +236,7 @@ static void
> machine_kexec_mask_interrupts(void)
> >> > > chip->irq_eoi)
> >> > > chip->irq_eoi(&desc->irq_data);
> >> > >
> >> > > - if (chip->irq_mask)
> >> > > + if (chip->irq_mask && !irqd_irq_masked(&desc->irq_data))
> >> > > chip->irq_mask(&desc->irq_data);
> >> > >
> >> > > if (chip->irq_disable && !irqd_irq_disabled(&desc->irq_data))
> >> >
> >> > This is pretty dodgy. irq_mask() should be an idempotent action
> >> > (masking twice must not be harmful).
> >> >
> >>
> >> That was my understanding too, but was not totally against adding it
> >> here.
> >
> > Yes, masking twice at least a time of waste and really no need to do
> > it. If you look at the common API mask_irq There did avoid the
> > unnecessary twice or multiple mask. Keep in mind that there are many
> > irqs, so it will waste time to do the things which is not necessary.
> > So, from this point, IMO, this patch is fine.
>
> Let's be serious. You are doing a *kexec*. Rebooting the entire system.
> Another 10 or 1000 accesses are completely invisible here.
>
> >
> > void mask_irq(struct irq_desc *desc)
> > {
> > if (irqd_irq_masked(&desc->irq_data))
> > return;
> >
> > if (desc->irq_data.chip->irq_mask) {
> > desc->irq_data.chip->irq_mask(&desc->irq_data);
> > irq_state_set_masked(desc);
> > }
> > }
> >
> >>
> >> > Even more, it really isn't obvious to me how this can work at all,
> >> > as even if the interrupt isn't masked, the irqsteer could well be
> >> > suspended.
> >> >
> >>
> >> Indeed, the runtime PM ops in that driver looks dodgy. Any calls to
> >> mask_irq from drivers or anywhere with irqchip suspended with just
> >> blows up the system.
> >
> > If you look at the chip->irq_mask implementation on different
> > platforms, almost all with directly access the register of the irqchip
> > including irqsteer. There are fine due to driver will use the common
> > mask_irq API.
> >
> >>
> >> > So as is, this change is just papering over a much deeper issue in
> >> > your driver.
> >> >
> >>
> >> Thanks for confirming
> >
> > No, this patch is not papering over a much deeper issue in the driver.
> > This is just to make things better for the ARM64 kexec.
>
> Yes, I'm sure it is... However:
>
> request_irq()
> <goes into suspend, panic somewhere after having turned the irqchip clock off>
> if (chip->irq_mask && !irqd_irq_masked(&desc->irq_data))
> <explodes, as the interrupt isn't masked>
>
> This is because the PM in the irqsteer driver is completely busted:
> request_irq() should get a reference on the driver to prevent it from being
> suspended. Since you don't implement it correctly, this doesn't happen and
> your "improvement" doesn't help at all.

The request_irq will get a reference to prevent the irqchip from being suspended due to it call
irq_chip_pm_get(). I am pretty sure we have implemented correctly and that is also the common Linux code.
In order to save power and let the irqchip enter into runtime SUSPEND mode, the driver will call free_irq()
When it was not used(idle). Then free_irq() will decrease the reference and let the irqchip enter suspend state.

So, when the irqchip entered suspend, which means there is no user for the irqchip and all the irqs were DISABLED + MASKED.
Due to the runtimePM support for the irqchip, when kexec runs, it will sometimes meet the situation that the irqchip is suspend due to
no users for it. So from either the performance(time cost) or coding logic, the machine_kexec_mask_interrupts() should not do
double mask for the irqs which already masked.

>
> M.
> --
> Jazz is not dead. It just smells funny...