Re: 4.14 backport request for dbdda842fe96f: "printk: Add console owner and waiter logic to load balance console writes"

From: Petr Mladek
Date: Thu Oct 04 2018 - 03:49:40 EST


On Wed 2018-10-03 13:37:04, Steven Rostedt wrote:
> On Wed, 3 Oct 2018 10:16:08 -0700
> Daniel Wang <wonderfly@xxxxxxxxxx> wrote:
>
> > On Wed, Oct 3, 2018 at 2:14 AM Petr Mladek <pmladek@xxxxxxxx> wrote:
> > >
> > > On Tue 2018-10-02 21:23:27, Steven Rostedt wrote:
> > > > I don't see the big deal of backporting this. The biggest complaints
> > > > about backports are from fixes that were added to late -rc releases
> > > > where the fixes didn't get much testing. This commit was added in 4.16,
> > > > and hasn't had any issues due to the design. Although a fix has been
> > > > added:
> > > >
> > > > c14376de3a1 ("printk: Wake klogd when passing console_lock owner")
> > >
> > > As I said, I am fine with backporting the console_lock owner stuff
> > > into the stable release.
> > >
> > > I just wonder (like Sergey) what the real problem is. The console_lock
> > > owner handshake is not fully reliable. It might be good enough
>
> I'm not sure what you mean by 'not fully reliable'

I mean that it is not guaranteed that the very first printk() takes over
the console. That happens only when the other printk() calls
console_trylock_spinning() while the current console owner is executing
the code between:

console_lock_spinning_enable();
console_lock_spinning_disable_and_check();
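
To illustrate that window, here is a toy single-threaded model. It is
only a sketch, not the kernel code: the model_* names and the struct are
made up for illustration, and the real functions use a spinlock and
busy-waiting that are left out here.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model; everything prefixed model_ is hypothetical. */
struct model_console {
	bool locked;		/* the console lock is held by someone */
	bool owner_active;	/* owner is between enable and disable_and_check */
	bool waiter;		/* another printk() caller is spinning */
};

/* Owner enters the window in which a handover is possible. */
static void model_spinning_enable(struct model_console *c)
{
	assert(c->locked);	/* only the lock holder may open the window */
	c->owner_active = true;
}

/* Owner leaves the window; returns true if the lock went to a waiter. */
static bool model_spinning_disable_and_check(struct model_console *c)
{
	bool handed_over = c->waiter;

	c->owner_active = false;
	c->waiter = false;	/* releases the waiter, which now owns the lock */
	return handed_over;
}

/*
 * Another printk() caller asks for the console.
 * Returns true if it took the free lock or registered as the waiter.
 */
static bool model_trylock_spinning(struct model_console *c)
{
	if (!c->locked) {
		c->locked = true;	/* plain trylock succeeded */
		return true;
	}
	if (c->owner_active && !c->waiter) {
		c->waiter = true;	/* would spin until the owner hands over */
		return true;
	}
	return false;		/* owner is outside the window: no takeover */
}
```

In this model a caller that arrives while owner_active is false gets
nothing, which is the "not fully reliable" part: the takeover depends
on when the other printk() happens to run.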


> > > Just to be sure. Daniel, could you please send a log with
> > > the console_lock owner stuff backported? There we would see
> > > who called the panic() and why it rebooted early.
> >
> > Sure. Here is one. It's a bit long but complete. I attached another log
> > snippet below it which is what I got when `softlockup_panic` was turned
> > off. The log was from the IRQ task that was flushing the printk buffer. I
> > will be taking a closer look at it too, but here it is in case you find it helpful.
>
> Just so I understand correctly. Does the panic hit with and without the
> suggested backport patch? The only difference is that you get the full
> output with the patch and limited output without it?

Sigh, the other mail suggests that there was a real deadlock. It means
that the console owner logic might help but it would not prevent
the deadlock completely.

Best Regards,
Petr