Re: [RFC][PATCH 6/7] printk: use alternative printk buffers

From: Petr Mladek
Date: Mon Oct 10 2016 - 07:18:02 EST


On Mon 2016-10-10 13:09:57, Sergey Senozhatsky wrote:
> On (10/06/16 13:32), Petr Mladek wrote:
> > On Thu 2016-10-06 13:22:48, Sergey Senozhatsky wrote:
> > > On (10/05/16 11:50), Petr Mladek wrote:
> > > [..]
> > > > > well, it solves a number of problems that the existing implementation
> > > > > cannot handle.
> > > >
> > > > Please, provide a summary. I wonder if these are real life problems.
> > >
> > > 1) some patches/reports from Byungchul Park
> > > 2) a report from Viresh Kumar.
> > > 4) sleeping function called from inside logbuf lock
> > > 5) ARM specific
> > > 6) logbuf_lock corruption
> >
> > It is great that you have such a list in hand. It might help
> > to push this solution.
> >
> > I actually have one more reason for this approach:
> >
> > It seems that we will need to keep printk_deferred()/WARN_*DEFERRED().
> > We do not know about a better solution for the deadlocks caused
> > by scheduler/timekeeping/console_drivers locks.
>
> yes, seems so.
>
> > The pain is that the list of affected locations is hard to maintain.
> > It would definitely help if such problems are reported by lockdep
> > in advance. But lockdep is disabled because it creates the deadlock
> > on its own.
>
> right. another issue is that those potentially recursive printk/WARN_ON
> calls may be coming from error-handling branches, not all of which are
> easily reachable for automated solutions. so in order to find out there
> is a problem we must hit it [in some cases].

yes

> it may look like lockdep *probably* can report the issues via 'safe' printk,
> but that's a notably huge behavior breakage -- if lockdep report comes from
> an about-to-deadlock irq handler, then we won't see anything from that CPU
> unless there is a panic/nmi panic.
>
> so it probably has to be semi-automatic/semi-manual:
> - add might_printk() that would acquire/release console sem; or
> logbuf_lock (which is probably even better)
> - find all functions that do printk/WARN in kernel/time and kernel/sched
> - add might_printk() to those functions (just like might_sleep())
> - run the kernel
> - ...
> - profit

I like the idea with might_printk(). I hope that it will be acceptable
for the scheduler/timekeeping people.
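
To make the idea concrete, here is a rough userspace toy, not kernel code.
It assumes a tiny lockdep-like order tracker; all toy_*() helpers, the two
lock classes, and the "printk path that reaches the scheduler" ordering are
made up for illustration. Only the might_printk() name and its
"acquire/release logbuf_lock" behavior come from the proposal above.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical lock classes; stand-ins, not the real kernel locks. */
enum { LOGBUF_LOCK, RQ_LOCK, NR_LOCKS };

static const char *lock_name[NR_LOCKS] = { "logbuf_lock", "rq->lock" };
static bool held[NR_LOCKS];
/* order_seen[a][b] is set once b has been taken while a was held. */
static bool order_seen[NR_LOCKS][NR_LOCKS];
static int inversions;

/* Toy acquire: record the lock order and flag an already seen reverse order. */
static void toy_lock(int l)
{
	for (int h = 0; h < NR_LOCKS; h++) {
		if (!held[h] || h == l)
			continue;
		order_seen[h][l] = true;
		if (order_seen[l][h]) {
			inversions++;
			fprintf(stderr, "possible deadlock: %s <-> %s\n",
				lock_name[h], lock_name[l]);
		}
	}
	held[l] = true;
}

static void toy_unlock(int l)
{
	held[l] = false;
}

/*
 * The proposed annotation: take and drop the printk lock so that the
 * dependency is recorded even when no message is actually printed.
 */
static void might_printk(void)
{
	toy_lock(LOGBUF_LOCK);
	toy_unlock(LOGBUF_LOCK);
}

/* Assumed ordering: some printk path ends up taking a scheduler lock. */
static void toy_path_printk_then_sched(void)
{
	toy_lock(LOGBUF_LOCK);
	toy_lock(RQ_LOCK);
	toy_unlock(RQ_LOCK);
	toy_unlock(LOGBUF_LOCK);
}

/* A scheduler function whose WARN sits in a rarely taken error branch. */
static void toy_sched_function(bool error)
{
	toy_lock(RQ_LOCK);
	might_printk();		/* records rq->lock -> logbuf_lock up front */
	if (error)
		fprintf(stderr, "WARN: something went wrong\n");
	toy_unlock(RQ_LOCK);
}
```

Calling toy_path_printk_then_sched() and then toy_sched_function(false)
reports the inversion even though the error branch is never taken, which
is the point of annotating the function instead of waiting for the WARN
to trigger.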

JFYI, I could work on the printk-context handling in lockdep.
I am currently working on lockdep support in NMI and am getting
kind of familiar with that code.

Best Regards,
Petr