Re: [PATCH v10 1/2] printk: Make printk() completely async

From: Petr Mladek
Date: Tue Aug 30 2016 - 05:29:26 EST

On Fri 2016-08-26 10:56:41, Sergey Senozhatsky wrote:
> On (08/25/16 23:10), Petr Mladek wrote:
> [..]
> > I was so taken by the idea of temporary forcing a lockless and
> > "trivial" printk implementation that I missed one thing.
> >
> > Your patch uses the alternative printk() variant around logbuf_lock.
> > But this is not the problem with wake_up_process(). printk_deferred()
> > takes logbuf_lock without problems.
> you didn't miss anything, I think I wasn't too descriptive and that caused
> some confusion. this patch is not a replacement of wake_up_process() patch
> posted earlier in the loop, but an addition to it. not only every WARN/BUG
> issued from wake_up_process() will do no good, but every lock we take is
> potentially dangerous as well. In the simplest case because of $LOCK-debug.c
> files in kernel/locking (spin_lock in our case); in the worst case --
> because of WARNs issued by log_store() and friends (there may be custom
> modifications) or by violations of spinlock atomicity requirements.
> For example,
>     vprintk_emit()
>         local_irq_save()
>         raw_spin_lock()
>         text_len = vscnprintf(text, sizeof(textbuf), fmt, args)
>         {
>             vsnprintf()
>             {
>                 if (WARN_ON_ONCE(size > INT_MAX))
>                     return 0;
>             }
>         }
>         ...
> this is a rather unlikely event, sure, there must be some sort of
> memory corruption or something else, but the thing is -- if it will
> happen, printk() will not be willing to help.
> wake_up_process() change, posted earlier, is using a deferred version of
> WARN macro, but we definitely can (and we better do) switch to lockless
> alternative printk() in both cases and don't bother with new macros.
> replacing all of the existing ones with 'safe' deferred versions is
> a difficult task, but keeping track of newly introduced ones is even
> harder (if possible at all).

I see. It makes some sense, and I would like to be on the safe side.
I am just afraid that yet another per-CPU buffer adds quite some
complexity to the code. It also scatters the messages even more, so
it will be harder to get them from a crash dump or to flush them to
the console when the system goes down.
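
To make the trade-off concrete, here is a minimal user-space sketch of
the recursion case from the quoted call chain and of a per-CPU fallback
buffer. It is not the code from the patch. All names (sketch_printk(),
sketch_flush(), safe_buf, lock_owner) are hypothetical, the "lock" is
just an owner field in a single-threaded simulation, and a nested WARN
is modelled by a flag.

```c
#include <stdio.h>
#include <string.h>

#define NR_CPUS      2
#define SAFE_BUF_LEN 128

/* Hypothetical stand-ins, not kernel symbols. */
static int  lock_owner = -1;                 /* CPU "holding logbuf_lock", -1 if free */
static char logbuf[256];                     /* stands in for the main log buffer */
static char safe_buf[NR_CPUS][SAFE_BUF_LEN]; /* per-CPU fallback buffers */

/* Append msg to a NUL-terminated fixed-size buffer, truncating if full. */
static void buf_append(char *buf, size_t size, const char *msg)
{
	size_t used = strlen(buf);

	if (used < size - 1)
		snprintf(buf + used, size - used, "%s", msg);
}

/*
 * Simplified printk(): if this CPU already "holds" the lock, we were
 * re-entered (e.g. by a WARN inside the locked section). A real
 * spinlock would deadlock here, so divert to the per-CPU buffer.
 */
static void sketch_printk(int cpu, const char *msg, int nested_warn)
{
	if (lock_owner == cpu) {
		buf_append(safe_buf[cpu], SAFE_BUF_LEN, msg);
		return;
	}

	lock_owner = cpu;                        /* "raw_spin_lock()" */
	if (nested_warn)                         /* models a WARN fired under the lock */
		sketch_printk(cpu, "WARN: nested\n", 0);
	buf_append(logbuf, sizeof(logbuf), msg);
	lock_owner = -1;                         /* "raw_spin_unlock()" */
}

/* Later, outside the locked section: merge per-CPU buffers into logbuf. */
static void sketch_flush(void)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		if (safe_buf[cpu][0] == '\0')
			continue;
		buf_append(logbuf, sizeof(logbuf), safe_buf[cpu]);
		safe_buf[cpu][0] = '\0';
	}
}
```

In this sketch, calling sketch_printk(0, "hello\n", 1) leaves the nested
warning only in safe_buf[0] until sketch_flush() runs, and it then lands
in logbuf after the message that triggered it. That is the scattering I
am worried about: until the flush happens, a crash dump has to look in
more than one buffer, and the ordering is not preserved.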

It took a few years to get the solution for NMIs merged, even though
it fixed real-life deadlocks for many people and customers. I am
afraid that it is not realistic to get similarly complex code merged
to fix rather theoretical problems.

Sigh, I waited a few days with this comment. I do not want to sound
like a broken record. I had hoped that someone else would voice
an opinion.

Best Regards,