Re: [RFC PATCH v4 0/9] printk: new ringbuffer implementation

From: Peter Zijlstra
Date: Fri Sep 06 2019 - 10:01:53 EST


On Fri, Sep 06, 2019 at 02:42:11PM +0200, Petr Mladek wrote:

> I wish it was that simple. It is possible that I see it too
> complicated. But this comes to my mind:
>
> 1. The simple printk_buffer_store(buf, n) is not NMI safe. For this,
> we might need the reserve-store approach.

Of course it is, and sure it has a reserve+commit internally. I'm sure I
posted an implenentation of something like this at some point.

It is lockless (wait-free in fact, which is stronger) and supports
multi-readers. I'm sure I posted something like that before, and ISTR
John has something like that around somewhere too.

The only thing I'm omitting is doing vscnprintf() twice, first to
determine the length, and then into the reservation. Partly because I
think that is silly and 256 chars should be plenty for everyone, partly
because that avoids having vscnprintf() inside the cpu_lock() and partly
because it is simpler to not do that.

> 2. The simple approach works only with lockless consoles. We need
> something else for the rest at least for NMI. Simle offloading
> to a kthread has been blocked for years. People wanted the
> trylock-and-flush-immediately approach.

Have an irq_work to wake up a kthread that will print to shit consoles.
Seriously.. the trylock and flush stuff is horrific crap. You guys been
piling on the hack for years now, surely you're tired of that gunk?

(and if you _reallllly_ care, build a flush function that 'works'
mostly and waits for the kthread of choice to finish printing to the
'imporant' shit console).

> 3. console_lock works in tty as a big kernel lock. I do not know
> much details. But people familiar with the code said that
> it was a disaster. I assume that tty is still rather
> important console. I am not sure how it would fit into the
> simple approach.

The kernel thread in charge of printing doesn't care.

> 4. The console handling has got non-synchronous (console_trylock)
> quite early (ver 2.4.10, year 2001). The reason was to do not
> serialize CPUs by the speed of the console.
>
> Serialized output could remove many troubles. The logic in
> console_unlock() is really crazy. It might be acceptable
> for debugging. But is it acceptable on production systems?

The kernel thread doesn't care. If you care about independent consoles,
have a kernel thread per console. That way a fast console can print fast
while a slow console will print slow and everybody is happy.

> 5. John planed to use the cpu_lock in the lockless consoles.
> I wonder if it was only in the console->write() callback
> or if it would spread the lock more widely.

Right, I'm saying that since you need it anyway, lift it up one layer.
It makes everything simpler. More simpler is more better.

> 6. One huge nightmare is panic() and code called from there.
> It is a maze of hacks, including arch-specific code, to
> prevent deadlocks and get the messages out.
>
> Any lock might be blocked on any CPU at the moment. Or it
> it might become blocked when CPUs are stopped by NMI.
>
> Fully lock-less log buffer might save us some headache.
> I am not sure whether a single lock shared between printk()
> writers and console drivers will make the situation easier
> or more complicated.

So panic is a non issue for the lockless console.

It only matters if you care to get something out of the crap consoles.
So print everything to the lockless buffer and lockless consoles, then
try and force as much as you can out of the crap consoles.

If you die, tought luck, at least the lockless consoles and kdump image
have the whole message.

> 7. People would complain when continuous lines become less
> reliable. It might be most visible when mixing backtraces
> from all CPUs. Simple sorting by prefix will not make
> it readable. The historic way was to synchronize CPUs
> by a spin lock. But then the cpu_lock() could cause
> deadlock.

Why? I'm running with that thing on, I've never seen a deadlock ever
because of it. In fact, i've gotten output that is plain impossible with
the current junk.

The cpu-lock is inside the all-backtrace spinlock, not outside. And as I
said yesterday, only the lockless console has any wait-loops while
holding the cpu-lock. It _will_ make progress.

> I would be really happy when we could ignore some of the problems
> or find an easy solution. I just want to make sure that we take
> into account all the known aspects.
>
> I am sure that we could do better than we do now. I do not want
> to block any improvements. I am just a bit lost in the many
> black corners.

I hope the above helps. Also note that Linus' memory buffer is a
lockless console.