Re: [PATCH] panic: Ensure preemption is disabled during panic()

From: Feng Tang
Date: Fri Oct 04 2019 - 09:49:28 EST


On Fri, Oct 04, 2019 at 01:15:21PM +0200, Petr Mladek wrote:
> On Fri 2019-10-04 11:49:48, Will Deacon wrote:
> > On Fri, Oct 04, 2019 at 10:29:17AM +0100, Russell King - ARM Linux admin wrote:
> > > On Fri, Oct 04, 2019 at 11:11:42AM +0200, Petr Mladek wrote:
> > > > On Thu 2019-10-03 21:56:34, Will Deacon wrote:
> > > > > I've deliberately left the irq part alone, since I think
> > > > > having magic sysrq work via the keyboard interrupt is desirable from the
> > > > > panic loop.
> > > >
> > > > I agree that we should keep sysrq working.
> > > >
> > > > One pity is that led_panic_blink() in
> > > > drivers/leds/trigger/ledtrig-panic.c uses workqueues:
> > > >
> > > > + led_panic_blink()
> > > > + led_trigger_event()
> > > > + led_set_brightness()
> > > > + schedule_work()
> > > >
> > > > It means that it depends on the scheduler. I guess that it
> > > > does not work in many panic situations. But this patch
> > > > will always block it.
> > > >
> > > > I agree that it is strange that userspace still works at
> > > > this stage. But does it cause any real problems?
> > >
> > > Yes, there are watchdog drivers that continue to pat their watchdog
> > > after the kernel has panic'd. It makes watchdogs useless (which is
> > > exactly how this problem was discovered.)
> >
> > Indeed, and I think the LED blinking is already unreliable if the
> > brightness operation needs to sleep. For example, if the kernel isn't
> > preemptible or the work gets queued up on a different CPU which is
> > sitting in panic_smp_self_stop().
>
> To make it clear: I do not want to block this patch. I just wanted
> to point out the problem. I am not sure how important the blinking is
> these days. Well, I could imagine that it might be useful on some
> embedded devices.

When reviewing c39ea0b9dd24 ("panic: avoid the extra noise dmesg"),
there was a similar discussion about what is expected of the kernel
when a panic happens. The earlier version of that patch simply kept
the local irq disabled, which may break sysrq and the panic blink
code, so we turned to handling it together with printk.

>
> Another question is how many people want to end up with dead system
> these days. The watchdogs are likely used in data centers. I guess
> that automatic reboot in panic() is a better choice there.
>
> Anyway, it might make sense to remove the panic blinking code when
> it will not have a chance to work.

I was also wondering whether the panic blinking code really still
works on any platform.

Thanks,
Feng

>
> Best Regards,
> Petr