Re: [RFC PATCH v4 9/9] printk: use a new ringbuffer implementation

From: Steven Rostedt
Date: Thu Aug 08 2019 - 20:48:48 EST

On Thu, 8 Aug 2019 17:21:09 -0700
Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:

> But laptops don't have reset buttons. They have "press the power
> button for ten seconds, power turns off. Press it again, and power
> comes on" reset sequences.

I've never tried, but are you saying that even with the "10 second
hold" the laptop's DRAM may still have old data that is accessible?

> They are nasty to debug when they happen on a developer machine (I
> should know, I've definitely had them), but when they happen in the
> wild they are basically "user just rebooted the machine". End of
> story, and no stats or anything like that.

Would a best-effort one-page buffer work? Really, with a hard hang we
usually only care about the last thing that was printed (we would also
need an option to stop printing after the first WARN_ON is hit, so as
not to lose the initial bug).

That way you could have a buffer that is written to constantly but is
only the size of one or two pages. It can have a variable in it that gets
reset on shutdown. If the system hangs, the next boot could check whether
that page was shut down cleanly (or never initialized). If it was not, it
can copy the page or pages into a buffer that can be read from debugfs.
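
To make the idea concrete, here is a rough userspace sketch of the
clean-shutdown flag and the boot-time check. Everything in it is
hypothetical: the crashlog_* names, the magic value, the header layout
and the 4096-byte page size are made up for illustration, and it assumes
the page lives in memory that survives the reset. A real kernel version
would also need to worry about cache flushing and concurrent writers,
which this ignores.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* All names and values below are illustrative, not existing kernel API. */
#define CRASHLOG_PAGE_SIZE 4096u
#define CRASHLOG_MAGIC     0x43524c47u  /* arbitrary "page was initialized" marker */

enum { CRASHLOG_CLEAN = 0, CRASHLOG_RUNNING = 1 };

struct crashlog_hdr {
	uint32_t magic;    /* distinguishes an initialized page from garbage */
	uint32_t state;    /* set to CRASHLOG_CLEAN only on orderly shutdown */
	uint32_t head;     /* next write offset into data[] */
	uint32_t wrapped;  /* nonzero once data[] has wrapped around */
};

#define CRASHLOG_DATA_SIZE (CRASHLOG_PAGE_SIZE - sizeof(struct crashlog_hdr))

struct crashlog_page {
	struct crashlog_hdr hdr;
	char data[CRASHLOG_DATA_SIZE];
};

_Static_assert(sizeof(struct crashlog_page) == CRASHLOG_PAGE_SIZE,
	       "log must fit exactly one page");

/*
 * Boot-time check. If the page was initialized but never marked clean,
 * copy the newest bytes (oldest first) into out and return how many were
 * copied. Returns 0 for a clean or never-initialized page. Either way the
 * page is re-armed for the current boot.
 */
size_t crashlog_boot(struct crashlog_page *pg, char *out, size_t out_len)
{
	size_t n = 0;

	assert(pg != NULL && out != NULL);

	if (pg->hdr.magic == CRASHLOG_MAGIC &&
	    pg->hdr.state == CRASHLOG_RUNNING &&
	    pg->hdr.head < CRASHLOG_DATA_SIZE) {
		size_t start = pg->hdr.wrapped ? pg->hdr.head : 0;
		size_t total = pg->hdr.wrapped ? CRASHLOG_DATA_SIZE : pg->hdr.head;
		/* If out is too small, drop the oldest bytes and keep the tail. */
		size_t skip = total > out_len ? total - out_len : 0;

		for (; n < total - skip; n++)
			out[n] = pg->data[(start + skip + n) % CRASHLOG_DATA_SIZE];
	}

	memset(pg, 0, sizeof(*pg));
	pg->hdr.magic = CRASHLOG_MAGIC;
	pg->hdr.state = CRASHLOG_RUNNING;
	return n;
}

/* Append bytes, overwriting the oldest data once the page is full. */
void crashlog_write(struct crashlog_page *pg, const char *msg, size_t len)
{
	assert(pg != NULL && msg != NULL);

	for (size_t i = 0; i < len; i++) {
		pg->data[pg->hdr.head] = msg[i];
		if (++pg->hdr.head == CRASHLOG_DATA_SIZE) {
			pg->hdr.head = 0;
			pg->hdr.wrapped = 1;
		}
	}
}

/* Orderly shutdown: mark the page so the next boot ignores its contents. */
void crashlog_shutdown(struct crashlog_page *pg)
{
	assert(pg != NULL);
	pg->hdr.state = CRASHLOG_CLEAN;
}
```

In this sketch a hang simply means crashlog_shutdown() never runs, so the
next crashlog_boot() sees the "running" state and hands back whatever was
written last. The recovered bytes are what would then be exposed through
debugfs.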

A user space tool could read this page and if it detects that it
contains data from a crash, notify the user and say "Can you send this
to linux-kernel@xxxxxxxxxxxxxxx?" Even better if it tells the user the
subject and content of the email that should be sent.

-- Steve