Re: printk meeting at LPC

From: John Ogness
Date: Fri Sep 13 2019 - 09:27:20 EST


On 2019-09-09, Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
> printk meeting at LPC Meeting Room - SAFIRA on Tuesday Sept 10. from
> 2PM to 3PM.

The meeting was very effective in letting us come to decisions on the
direction to take. Thanks for the outstanding attendance! It certainly
saved hundreds of hours of reading/writing emails!

The slides[0] from my printk talk served as a _rough_ basis for the
discussion. Here is a summary of the decisions:

1. As a new ringbuffer, the lockless state-based proof of concept
posted[1] by Petr Mladek will be used. Since it has far fewer memory
barriers in the code, it will be simpler to review. I posted[2] a patch
to hack my RFCv4 into a fully functional version of Petr's PoC. So we
know it will work. With this, printk() can be called from any context
and the message will be put directly into the ringbuffer.

2. A kernel thread will be created for each registered console, each
responsible for being the sole printer to its respective
console. With this, console printing is _fully_ decoupled from printk()
callers.
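
To illustrate the decoupling in item 2, here is a rough userspace
sketch (not kernel code). All names (sketch_log, sketch_printer,
printer_run_once, count_emit) are made up for illustration, the log is
a toy array rather than the real ringbuffer, and the thread itself is
left out: printer_run_once() stands for one iteration of a console's
printing loop. The point is only that each console keeps its own
sequence cursor and catches up on its own.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LOG_SLOTS 8

/* Toy log: printk() callers only ever append here. */
struct sketch_log {
	const char *text[LOG_SLOTS];
	uint64_t next_seq;	/* sequence number of the next record written */
};

/* Per-console printer state, owned exclusively by that console's thread. */
struct sketch_printer {
	uint64_t seq;		/* next record this console has to print */
	void (*emit)(const char *text);
};

/* Example backend that counts calls instead of touching hardware. */
static unsigned int emitted;

static void count_emit(const char *text)
{
	(void)text;
	emitted++;
}

/* What a printk() caller does in this sketch: store and return. */
static void log_store(struct sketch_log *log, const char *text)
{
	log->text[log->next_seq % LOG_SLOTS] = text;
	log->next_seq++;
}

/*
 * One iteration of a console's printing loop: print everything this
 * console has not printed yet. Returns the number of records printed.
 */
static unsigned int printer_run_once(struct sketch_printer *p,
				     const struct sketch_log *log)
{
	unsigned int printed = 0;

	assert(p->emit != NULL);

	/* Records that were already overwritten are lost for this console. */
	if (log->next_seq - p->seq > LOG_SLOTS)
		p->seq = log->next_seq - LOG_SLOTS;

	while (p->seq < log->next_seq) {
		p->emit(log->text[p->seq % LOG_SLOTS]);
		p->seq++;
		printed++;
	}
	return printed;
}
```

A slow console only delays itself: its cursor lags behind, while the
callers of log_store() and the other consoles are unaffected.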

3. Rather than defining emergency _messages_, we define an emergency
_state_ where the kernel wants to flush the messages immediately before
dying. Unlike oops_in_progress, this state will not be visible to
anything outside of the printk infrastructure.

4. When in emergency state, the kernel will use a new console callback
write_atomic() to flush the messages in whatever context the CPU is in
at that moment. Only consoles that implement the NMI-safe write_atomic()
will be able to flush in this state.
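
A minimal sketch of the selection rule in item 4, again in plain
userspace C. struct sketch_console is a hypothetical, heavily
simplified stand-in for the kernel's struct console, and
emergency_flush() and count_bytes() are invented names. Only the
write_atomic() callback itself comes from the decision above. The
actual signature may well look different.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for the kernel's struct console. */
struct sketch_console {
	const char *name;
	/* Normal path: called only from the console's own printing thread. */
	void (*write)(struct sketch_console *con, const char *s, size_t len);
	/* Emergency path: must be safe in any context, NMI included. May be NULL. */
	void (*write_atomic)(struct sketch_console *con, const char *s, size_t len);
	size_t bytes_written;	/* bookkeeping for this sketch only */
};

/* Example backend that counts bytes instead of touching hardware. */
static void count_bytes(struct sketch_console *con, const char *s, size_t len)
{
	(void)s;
	con->bytes_written += len;
}

/*
 * Flush one message to every console that can print atomically.
 * Returns the number of consoles that received the message.
 */
static int emergency_flush(struct sketch_console *cons, size_t ncons,
			   const char *msg)
{
	size_t len;
	int flushed = 0;

	assert(msg != NULL);
	len = strlen(msg);

	for (size_t i = 0; i < ncons; i++) {
		if (!cons[i].write_atomic)
			continue;	/* no NMI-safe writer: skipped in emergency state */
		cons[i].write_atomic(&cons[i], msg, len);
		flushed++;
	}
	return flushed;
}
```

A console without write_atomic() is not an error here. It simply does
not take part in the emergency flush.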

5. LOG_CONT message pieces will be stored as individual records in the
ringbuffer. They will be "assembled" by the ringbuffer reader (in
kernel) before being copied to userspace or printed on the
console. Since each record in the ringbuffer has its own sequence
number, this has the effect for userspace that sequence numbers will
appear to be skipped. For example, if there are LOG_CONT pieces with
sequence numbers 4, 5, 6, the fully assembled message will appear only
as sequence number 6 (and will have the timestamp from the first piece).
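
The assembly rule in item 5 can be sketched as follows. This is a
userspace illustration under assumptions: struct sketch_record,
struct sketch_msg, assemble() and the "more" flag are invented here
and say nothing about how the reader will actually mark or detect
continuation pieces. The sketch only encodes the two stated rules:
sequence number from the last piece, timestamp from the first.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record as seen by the reader; not the kernel's layout. */
struct sketch_record {
	uint64_t seq;		/* per-record sequence number */
	uint64_t ts_nsec;	/* timestamp of this piece */
	bool more;		/* true: another piece of this message follows */
	const char *text;
};

struct sketch_msg {
	uint64_t seq;		/* taken from the last piece */
	uint64_t ts_nsec;	/* taken from the first piece */
	char text[128];
};

/*
 * Assemble one message starting at recs[0]. Returns the number of
 * records consumed, or 0 if the final piece is not available yet.
 * Text that does not fit into out->text is truncated.
 */
static size_t assemble(const struct sketch_record *recs, size_t n,
		       struct sketch_msg *out)
{
	size_t used = 0;

	assert(out != NULL);
	out->text[0] = '\0';

	for (size_t i = 0; i < n; i++) {
		size_t room = sizeof(out->text) - used - 1;
		size_t len = strlen(recs[i].text);

		if (len > room)
			len = room;
		memcpy(out->text + used, recs[i].text, len);
		used += len;
		out->text[used] = '\0';

		if (i == 0)
			out->ts_nsec = recs[i].ts_nsec;
		if (!recs[i].more) {
			out->seq = recs[i].seq;
			return i + 1;
		}
	}
	return 0;	/* final piece not yet in the ringbuffer */
}
```

With pieces 4, 5 and 6 as input, userspace would see one message with
sequence number 6, and the following message would be number 7.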

6. A new may-sleep function pr_flush() will be made available to wait
for all previously printk'd messages to be output on all consoles before
proceeding. For example:

    pr_cont("Running test ABC... ");
    pr_flush();

    do_test();

    pr_cont("PASSED\n");
    pr_flush();

7. The ringbuffer raw data (log_buf) will be simplified to consist only
of alignment-padded strings separated by a single unsigned long. All
record meta-data (timestamp, loglevel, caller_id, etc.) will move into
the record descriptors, which are located in an extra array. The
appropriate crash tools will need to be adjusted for this. (FYI: The
unsigned long in the string data is the descriptor ID.)
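
For the crash tool authors, a rough picture of the split described in
item 7. This is a sketch under assumptions, not the final layout: the
struct and function names are invented, the descriptor fields are
only the ones named above plus an offset and length, and wrapping at
the end of the buffer is ignored completely.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_ALIGN sizeof(unsigned long)

/* Hypothetical descriptor: all meta-data lives here, not in log_buf. */
struct sketch_desc {
	uint64_t ts_nsec;
	uint32_t caller_id;
	uint8_t level;
	size_t text_off;	/* offset of the data block in log_buf */
	size_t text_len;	/* unpadded length of the string */
};

/* Round len up to the next multiple of SKETCH_ALIGN. */
static size_t pad_len(size_t len)
{
	return (len + SKETCH_ALIGN - 1) & ~(SKETCH_ALIGN - 1);
}

/*
 * Append one data block: [unsigned long id][text, zero-padded].
 * Only the text location is recorded in @desc; the caller fills in the
 * remaining meta-data. Returns the offset of the next free byte, or 0
 * if the block does not fit.
 */
static size_t store(char *log_buf, size_t size, size_t off,
		    unsigned long id, const char *text,
		    struct sketch_desc *desc)
{
	size_t len = strlen(text);
	size_t total = sizeof(id) + pad_len(len);

	assert(desc != NULL);
	if (off > size || total > size - off)
		return 0;

	memcpy(log_buf + off, &id, sizeof(id));
	memset(log_buf + off + sizeof(id), 0, pad_len(len));
	memcpy(log_buf + off + sizeof(id), text, len);

	desc->text_off = off;
	desc->text_len = len;
	return off + total;
}

/* Read back the descriptor ID stored in front of a string. */
static unsigned long block_id(const char *log_buf, size_t off)
{
	unsigned long id;

	memcpy(&id, log_buf + off, sizeof(id));
	return id;
}
```

A tool walking log_buf alone would see only IDs and strings. Timestamp,
loglevel and caller_id have to be looked up in the descriptor array.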

8. A CPU-reentrant spinlock (the so-called cpu-lock) will be used to
synchronize/stop the kthreads during emergency state.
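
For readers unfamiliar with the term in item 8: a CPU-reentrant
spinlock is a lock owned by a CPU rather than by a task, so that a
nested context on the same CPU (e.g. an NMI) can take it again without
deadlocking. One possible shape, sketched in userspace with C11
atomics standing in for the kernel's atomic operations. The names
(sketch_cpulock, cpulock_*) are invented, the CPU id is passed in
explicitly as an assumption, and interrupt and preemption handling is
left out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define NO_OWNER (-1)

struct sketch_cpulock {
	atomic_int owner;	/* id of the CPU holding the lock, or NO_OWNER */
};

static void cpulock_init(struct sketch_cpulock *l)
{
	atomic_init(&l->owner, NO_OWNER);
}

/*
 * One attempt to take the lock for @cpu. On success *prev holds the
 * previous owner (NO_OWNER, or @cpu for a nested acquisition) and must
 * be handed back to cpulock_unlock().
 */
static bool cpulock_trylock(struct sketch_cpulock *l, int cpu, int *prev)
{
	*prev = atomic_load_explicit(&l->owner, memory_order_relaxed);
	if (*prev == cpu)
		return true;	/* nested acquisition on the owning CPU */
	if (*prev != NO_OWNER)
		return false;	/* held by another CPU */
	return atomic_compare_exchange_strong_explicit(&l->owner, prev, cpu,
			memory_order_acquire, memory_order_relaxed);
}

static void cpulock_lock(struct sketch_cpulock *l, int cpu, int *prev)
{
	while (!cpulock_trylock(l, cpu, prev))
		;	/* spin until the other CPU releases the lock */
}

/* Restore the previous owner: a no-op when nested, a release otherwise. */
static void cpulock_unlock(struct sketch_cpulock *l, int cpu, int prev)
{
	assert(atomic_load_explicit(&l->owner, memory_order_relaxed) == cpu);
	atomic_store_explicit(&l->owner, prev, memory_order_release);
}
```

Saving the previous owner on the caller's side avoids a separate
nesting counter: only the outermost unlock writes NO_OWNER back.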

9. Support for printk dictionaries will be discontinued. I will look
into who is using this and why. If printk dictionaries are important for
you, speak up now!

(There was also some talk about possibly discontinuing kdb, but that is
not directly related to printk. I'm mentioning it here in case anyone
wants to pursue that.)

If I missed (or misunderstood) anything, please let me know!

John Ogness

[0] https://www.linuxplumbersconf.org/event/4/contributions/290/attachments/276/463/lpc2019_jogness_printk.pdf
[1] https://lkml.kernel.org/r/20190704103321.10022-1-pmladek@xxxxxxxx
[2] https://lkml.kernel.org/r/87lfvwcssu.fsf@xxxxxxxxxxxxx