Re: [RFC][PATCHv3 0/6] printk: use printk_safe to handle printk() recursive calls

From: Petr Mladek
Date: Wed Oct 19 2016 - 10:35:09 EST

On Tue 2016-10-18 19:07:54, Peter Zijlstra wrote:
> On Wed, Oct 19, 2016 at 12:40:39AM +0900, Sergey Senozhatsky wrote:
> > Hello,
> >
> > RFC
> >
> > This patch set extends a lock-less NMI per-cpu buffers idea to
> > handle recursive printk() calls. The basic mechanism is pretty much the
> > same -- at the beginning of a deadlock-prone section we switch to lock-less
> > printk callback, and return back to a default printk implementation at the
> > end; the messages are getting flushed to a logbuf buffer from a safer
> > context.
> So I think you're not taking this far enough. You've also missed an
> entire class of deadlocks.
> The first is that you still keep the logbuf. Having this global
> serialized thing is a source of fail. It would be much better to only
> keep per cpu stuff. _OR_ at the very least make the logbuf itself
> lockfree. So generate the printk entry local (so we know its size) then
> atomically reserve the logbuf entry and copy it over.

This is close to what the lockless ring_buffer does. I mean
kernel/trace/ring_buffer.c that is used by trace_printk. It has
per-CPU buffers, does reservations, ...

Sadly the ring_buffer code is very tricky. Only a single reader is
allowed at a time. And locating the information in the crash dump
is a task for prisoners.

Note that it still does _not_ solve problems with the console output.
I am not sure if going this way would be a real win.

Sigh, a genius idea would help.

> The entire class of deadlocks you've missed is that console->write() is
> a piece of crap too ;-) Even the bog standard 8250 serial console driver
> can do wakeups.

I wonder if all the hard problems are actually related to the console

There are problems with the single logbuffer but these should get
eliminated by the NMI/safe temporary per-CPU buffers.

All might be easier if we always offload the console handling
into a kthread or so and trigger it via the minimalist
irq_work. It would kill huge bunch of possible deadlocks.
It will even allow to get rid of printk_deferred() and
the uncertainty where it is needed.

The penalty would be "slightly" delayed console output. But is
it a real problem? It should not be a big deal when everything works.
We could always try hard when panicking. And there always
might be a fallback with that direct early_console().

Best Regards,