Re: + printk-do-cond_resched-between-lines-while-outputting-to-consoles.patch added to -mm tree

From: Jan Kara
Date: Thu Dec 03 2015 - 04:57:51 EST


On Thu 03-12-15 11:39:33, Sergey Senozhatsky wrote:
> On (12/03/15 10:11), Sergey Senozhatsky wrote:
> > On (12/02/15 15:57), akpm@xxxxxxxxxxxxxxxxxxxx wrote:
> > [..]
> > > @console_may_schedule tracks whether console_sem was acquired through lock
> > > or trylock. If the former, we're inside a sleepable context and
> > > console_conditional_schedule() performs cond_resched(). This allows
> > > console drivers which use console_lock for synchronization to yield while
> > > performing time-consuming operations such as scrolling.
> > >
> > > However, the actual console outputting is performed while holding irq-safe
> > > logbuf_lock, so console_unlock() clears @console_may_schedule before
> > > starting outputting lines. Also, only a few drivers call
> > > console_conditional_schedule() to begin with. This means that when a lot
> > > of lines need to be output by console_unlock(), for example on a console
> > > registration, the task doing console_unlock() may not yield for a long
> > > time on a non-preemptible kernel.
> > >
> > > If this happens with a slow console devices, for example a serial console,
> > > the outputting task may occupy the cpu for a very long time. Long enough
> > > to trigger softlockup and/or RCU stall warnings, which in turn pile more
> > > messages, sometimes enough to trigger the next cycle of warnings
> > > incapacitating the system.
> > >
> > > Fix it by making console_unlock() insert cond_resched() between lines if
> > > @console_may_schedule.
> >
> > CPU2 still can cause lots of troubles. consider
> >
> > CPU0 CPU1 CPU2
> > printk
> > ... printk_deferred
> > printk wake_up_klogd
> > wake_up_klogd_work_func
> > console_trylock
> > console_unlock
> >
> > printk_deferred() may be issued by scheduler, for example.
>
> IOW, may be we can start limiting the number of bytes printed in console_unlock()
> from irq contexts. Which is quite ugly, yes. We basically don't know how much time
> we spend in call_console_drivers(); some of the consoles can do 'internal' spin_lock
> loops in ->write() handlers, etc. So something like this (below) probably will not
> really help, but still it's not always OK to do `while (1)' loop in console_unlock()
> for irqs.

What we really want is pushing the printing into async context (unless
forced by debug option or oops in progress). Because what you do here fixes
only a small fraction of the problem space. I have patches which fix more
of it (https://lkml.org/lkml/2015/10/26/16) but they are still not enough
because on large machines e.g. udev times out because printing messages
about inserted hardware over serial console just takes too long.

Honza
>
> ---
>
> kernel/printk/printk.c | 6 +++++-
> 1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
> index 9da39e7..221a230 100644
> --- a/kernel/printk/printk.c
> +++ b/kernel/printk/printk.c
> @@ -2235,6 +2235,7 @@ void console_unlock(void)
> unsigned long flags;
> bool wake_klogd = false;
> bool do_cond_resched, retry;
> + int printed, irq_count = irq_count();
>
> if (console_suspended) {
> up_console_sem();
> @@ -2257,6 +2258,7 @@ void console_unlock(void)
> /* flush buffered message fragment immediately to console */
> console_cont_flush(text, sizeof(text));
> again:
> + printed = 0;
> for (;;) {
> struct printk_log *msg;
> size_t ext_len = 0;
> @@ -2326,6 +2328,8 @@ skip:
>
> if (do_cond_resched)
> cond_resched();
> + if (irq_count && printed > LOG_LINE_MAX)
> + break;
> }
> console_locked = 0;
>
> @@ -2344,7 +2348,7 @@ skip:
> * flush, no worries.
> */
> raw_spin_lock(&logbuf_lock);
> - retry = console_seq != log_next_seq;
> + retry = (console_seq != log_next_seq) && !!irq_count;
> raw_spin_unlock_irqrestore(&logbuf_lock, flags);
>
> if (retry && console_trylock())
>
>
--
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/