Re: [PATCH 0/3] warn and suppress irqflood
From: Pingfan Liu
Date: Sun Oct 25 2020 - 07:13:06 EST
On Thu, Oct 22, 2020 at 4:37 PM Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
>
> On Thu, Oct 22 2020 at 13:56, Pingfan Liu wrote:
> > I hit an irqflood bug on a powerpc platform and, two years ago, on an x86 platform.
> > When the bug happens, the kernel is totally occupied by irqs. Currently, there
> > may be nothing or just a soft lockup warning shown on the console. It is better
> > to warn users with irq flood info.
> >
> > In the kdump case, the kernel can move on by suppressing the irq flood.
>
> You're curing the symptom not the cause and the cure is just magic and
> can't work reliably.
Yeah, it is magic. But at least it is better to printk something and
alert users about what is happening. With the current code, the console
may show nothing when the system hangs.
>
> Where is that irq flood originated from and why is none of the
> mechanisms we have in place to shut it up working?
The bug originates from the tpm_i2c_nuvoton driver, which calls the i2c
bus driver (i2c-opal.c). The bug is triggered after
i2c_opal_send_request().
But things are complicated by a firmware layer, Skiboot, which hides
the details of manipulating the hardware from Linux.
I guess its software logic cannot enter a sane state when the kernel crashes.
Cc'ing the Skiboot and ppc64 communities to see whether anyone has an idea about it.
Thanks,
Pingfan