Re: [PATCH] ANDROID: staging: add userpanic-dev driver

From: Woody Lin
Date: Wed Sep 01 2021 - 04:56:21 EST


On Fri, Aug 27, 2021 at 3:14 PM Greg Kroah-Hartman
<gregkh@xxxxxxxxxxxxxxxxxxx> wrote:
>
> On Fri, Aug 27, 2021 at 11:51:03AM +0800, Woody Lin wrote:
> > On Thu, Aug 26, 2021 at 6:54 PM Greg Kroah-Hartman
> > <gregkh@xxxxxxxxxxxxxxxxxxx> wrote:
> > >
> > > On Thu, Aug 26, 2021 at 06:23:53PM +0800, Woody Lin wrote:
> > > > On Thu, Aug 26, 2021 at 5:48 PM Greg Kroah-Hartman
> > > > <gregkh@xxxxxxxxxxxxxxxxxxx> wrote:
> > > > >
> > > > > On Thu, Aug 26, 2021 at 05:28:54PM +0800, Woody Lin wrote:
> > > > > > Add char device driver 'userpanic-dev' that exposes an interface to
> > > > > > userspace processes to request a system panic with customized panic
> > > > > > message.
> > > > > >
> > > > > > Signed-off-by: Woody Lin <woodylin@xxxxxxxxxx>
> > > > > > ---
> > > > > > drivers/staging/android/Kconfig | 12 +++
> > > > > > drivers/staging/android/Makefile | 1 +
> > > > > > drivers/staging/android/userpanic-dev.c | 110 ++++++++++++++++++++++++
> > > > >
> > > > > Why is this in staging? What is wrong with it that it can not just go
> > > > > into the real part of the kernel? A TODO file is needed explaining what
> > > > > needs to be done here in order for it to be accepted.
> > > >
> > > > Got it. No more TODO for this driver and I will move it to drivers/android/.
> > > >
> > > > >
> > > > > But why is this really needed at all? Why would userspace want to panic
> > > > > the kernel in yet-another-way?
> > > >
> > > > The idea is to panic the kernel with a panic message specified by the userspace
> > > > process requesting the panic. Without this, the panic handler can only collect
> > > > the panic message "sysrq triggered crash" for a panic triggered by user processes.
> > > > Using this driver, user processes can put an informative description
> > > > (process name, reason, etc.) into the panic message.
> > >
> > > What custom userspace tool is going to use this new user/kernel api and
> > > again, why is it needed? Who needs to panic the kernel with a custom
> > > message and where is that used?
> >
> > It's for Android's services. Currently there are usages like these:
> >
> > * init requests panic in InitFatalReboot (abort handler).
> > https://android.googlesource.com/platform/system/core/+/master/init/reboot_utils.cpp#170
> > android::base::WriteStringToFile("c", PROC_SYSRQ);
> >
> > * llkd requests panic to recover kernel live-lock.
> > https://android.googlesource.com/platform/system/core/+/master/llkd/libllkd.cpp#564
> > android::base::WriteStringToFd("c", sysrqTriggerFd);
> >
> > * Watchdog requests panic to recover timeout-loop.
> > https://android.googlesource.com/platform/frameworks/base/+/master/services/core/java/com/android/server/Watchdog.java#847
> > doSysRq('c');
> >
> > So to improve the panic message from "sysrq triggered crash" to a more
> > informative one (e.g.: "Watchdog break timeout-loop", "llkd panic
> > live-lock"), we'd like to add this driver to expose a dedicated
> > interface for userspace to panic the kernel with a custom message. Later
> > the panic handler implemented per platform can collect the message and
> > use it to build the crash report. A crash report with a more readable
> > title (compared to the generic "sysrq triggered crash") will be easier
> > to categorize, triage, etc.
>
> But you can do that today from userspace, just write to the kernel log
> before doing the sysrq call. That way your tools can pick up what you
> need later on, no kernel changes should be needed at all.
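
For reference, the approach suggested above can be sketched in userspace
roughly as follows. This is only a minimal sketch under stated assumptions:
the function names (write_all, log_reason_then_crash) and the "tag: reason"
line format are hypothetical and do not come from the patch or from Android.
The paths are passed in as parameters so the helper can be tried against
ordinary files; on a real system they would be /dev/kmsg and
/proc/sysrq-trigger.

```c
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Write a whole string to an existing file or device node with one write().
 * Returns 0 on success, -1 on error or short write. */
static int write_all(const char *path, const char *s)
{
    int fd = open(path, O_WRONLY);
    if (fd < 0)
        return -1;

    size_t len = strlen(s);
    ssize_t n = write(fd, s, len);
    close(fd);
    return (n == (ssize_t)len) ? 0 : -1;
}

/*
 * Record the reason in the kernel log, then request the crash.
 * The "<0>" prefix asks /dev/kmsg to log at the highest priority level.
 * The tag/reason format is an assumption made for this sketch.
 */
int log_reason_then_crash(const char *kmsg_path, const char *sysrq_path,
                          const char *tag, const char *reason)
{
    char line[256];
    int n = snprintf(line, sizeof(line), "<0>%s: %s\n", tag, reason);

    /* Refuse to log a truncated reason. */
    if (n < 0 || (size_t)n >= sizeof(line))
        return -1;

    /* One write() to /dev/kmsg becomes one log record. */
    if (write_all(kmsg_path, line) != 0)
        return -1;

    /* Same request that init, llkd and Watchdog issue today. */
    return write_all(sysrq_path, "c");
}
```

A caller such as llkd might invoke it as
log_reason_then_crash("/dev/kmsg", "/proc/sysrq-trigger", "llkd", "panic live-lock").
Note that this only places the reason in the log just before the crash; the
panic message itself is still "sysrq triggered crash".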

Thanks for the idea. I actually need it in the panic message because on our
platforms the panic message is saved in a dedicated buffer that can be read
by a crash handler (not running at the same execution level as Linux), which
also builds the crash reports. Parsing the log buffer would be considerably
more complex for that handler than reading from the dedicated buffer. But I
understand this may not be a good enough reason to introduce an interface
like this into the kernel. If so, I will start by building it as a vendor
module and give up on covering the early stage of "init" for now.

Thanks also for the suggestions on the patch. I will revise it accordingly
when submitting it to the internal kernel modules repo.

>
> > And the reason to submit this upstream, instead of making it a vendor
> > module, is that we'd like to enable it for the early stage of "init", where
> > none of the kernel modules have been loaded.
>
> Helps if it would actually build :(
>
> thanks,
>
> greg k-h
