Re: [RFC][PATCHv3 0/6] printk: use printk_safe to handle printk() recursive calls

From: Sergey Senozhatsky
Date: Tue Oct 18 2016 - 21:54:16 EST

On (10/18/16 19:07), Peter Zijlstra wrote:
> > This patch set extends a lock-less NMI per-cpu buffers idea to
> > handle recursive printk() calls. The basic mechanism is pretty much the
> > same -- at the beginning of a deadlock-prone section we switch to lock-less
> > printk callback, and return back to a default printk implementation at the
> > end; the messages are getting flushed to a logbuf buffer from a safer
> > context.
> So I think you're not taking this far enough. You've also missed an
> entire class of deadlocks.

yes. this patch set addresses only printk() recursion. printk deadlocks
in general are much more complicated and basically we don't even know
how many locks we are about to take every time we call printk() (there
may be serial console locks, etc we never know in advance).

> The first is that you still keep the logbuf. Having this global
> serialized thing is a source of fail. It would be much better to only
> keep per cpu stuff. _OR_ at the very least make the logbuf itself
> lockfree. So generate the printk entry local (so we know its size) then
> atomically reserve the logbuf entry and copy it over.

logbuf spin_lock is not the only problematic lock. we _at least_
have console sem spin_lock and rq spin_lock:

> The entire class of deadlocks you've missed is that console->write() is
> a piece of crap too ;-) Even the bog standard 8250 serial console driver
> can do wakeups.

yes. well, I didn't exactly "miss" it. I just don't have any working
solution for those locks yet. so things like

spin_lock_irqsave(&port->lock) // lock the output port
!! WARN() or pr_err() or printk()
/* console_trylock() */
spin_lock_irqsave(&port->lock) // already locked

are out of this patch set's scope. another class of deadlocks is when we
call printk under one of the locks that printk() can take later in order
to print the message.

so in this particular patch set I make "printk() -> $FOO -> printk()" less
damaging. which is just a first step.

we are also looking at things like might_printk()

and even extending the lockdep

not to miss out a DEFERRED_WARN patch set...
//hm, I can't find it online

Subject: [RFC 0/5] printk: Implement WARN_*DEFERRED()
Message-Id: <1474992135-14777-1-git-send-email-pmladek@xxxxxxxx>

> See for example:

interesting idea.

> I've entirely given up on printk(), I'll post the 3 patches I carry to
> make things work for me.

yes, printk is surprisingly fragile and, some would _probably_
even say, broken.

a) we have numerous scenarios when it can deadlock
b) we have numerous scenarios when it can soft-lockup/hard-lockup the system
c) we don't have a bullet proof panic printk. zap_locks() resets
logbuf and console_sem, but there may be serial console locks/etc.