Re: [PATCH] printk: do cond_resched() between lines while outputting to consoles

From: Jan Kara
Date: Wed Nov 25 2015 - 04:06:47 EST


On Tue 24-11-15 16:31:25, Tejun Heo wrote:
> @console_may_schedule tracks whether console_sem was acquired through
> lock or trylock. If former, we're inside a sleepable context and
> console_conditional_schedule() performs cond_resched(). This allows
> console drivers which use console_lock for synchronization to yield
> while performing time-consuming operations such as scrolling.
>
> However, the actual console outputting is performed while holding
> irq-safe logbuf_lock, so console_unlock() clears @console_may_schedule
> before starting outputting lines. Also, only a few drivers call
> console_conditional_schedule() to begin with. This means that when a
> lot of lines need to be output by console_unlock(), for example on a
> console registration, the task doing console_unlock() may not yield
> for a long time on a non-preemptible kernel.

So did you particularly have an issue during console registration? Because
at least our customers mostly have issues during heavy use of ordinary
printk (e.g. during boot or when hardware gets probed) and your change
doesn't affect that case. That being said if you really hit a case where
your patch helps, then I have no problem with it (you can add my Acked-by).

At Kernel Summit I spoke with Linus and Andrew regarding printk softlockups
and we ended up with a decision that we decouple queueing into kernel
ringbuffer from the actual printing into console which would happen from
kthread / workqueue. Then the lockups would be solved by printing to
console happening from schedulable context and printk() as such being
independent from console speed. We only have to have some special cases
there for crashes so that messages get printed synchronously in that case.

Honza

> If this happens with a slow console devices, for example a serial
> console, the outputting task may occupy the cpu for a very long time.
> Long enough to trigger softlockup and/or RCU stall warnings, which in
> turn pile more messages, sometimes enough to trigger the next cycle of
> warnings incapacitating the system.
>
> Fix it by making console_unlock() insert cond_resched() beween lines
> if @console_may_schedule.
>
> Signed-off-by: Tejun Heo <tj@xxxxxxxxxx>
> Reported-by: Calvin Owens <calvinowens@xxxxxx>
> Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
> Cc: Jan Kara <jack@xxxxxxx>
> Cc: Dave Jones <davej@xxxxxxxxxxxxxxxxx>
> Cc: Kyle McMartin <kyle@xxxxxxxxxx>
> Cc: stable@xxxxxxxxxxxxxxx
> ---
> kernel/printk/printk.c | 16 +++++++++++++++-
> 1 file changed, 15 insertions(+), 1 deletion(-)
>
> --- a/kernel/printk/printk.c
> +++ b/kernel/printk/printk.c
> @@ -2233,13 +2233,24 @@ void console_unlock(void)
> static u64 seen_seq;
> unsigned long flags;
> bool wake_klogd = false;
> - bool retry;
> + bool do_cond_resched, retry;
>
> if (console_suspended) {
> up_console_sem();
> return;
> }
>
> + /*
> + * Console drivers are called under logbuf_lock, so
> + * @console_may_schedule should be cleared before; however, we may
> + * end up dumping a lot of lines, for example, if called from
> + * console registration path, and should invoke cond_resched()
> + * between lines if allowable. Not doing so can cause a very long
> + * scheduling stall on a slow console leading to RCU stall and
> + * softlockup warnings which exacerbate the issue with more
> + * messages practically incapacitating the system.
> + */
> + do_cond_resched = console_may_schedule;
> console_may_schedule = 0;
>
> /* flush buffered message fragment immediately to console */
> @@ -2311,6 +2322,9 @@ skip:
> call_console_drivers(level, ext_text, ext_len, text, len);
> start_critical_timings();
> local_irq_restore(flags);
> +
> + if (do_cond_resched)
> + cond_resched();
> }
> console_locked = 0;
>
--
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/