Re: [PATCH 1/7] printk: Hand over printing to console if printing too long

From: Sergey Senozhatsky
Date: Thu Dec 10 2015 - 23:26:43 EST


On (12/10/15 23:52), Sergey Senozhatsky wrote:
> > Currently, console_unlock() prints messages from kernel printk buffer to
> > console while the buffer is non-empty. When serial console is attached,
> > printing is slow and thus other CPUs in the system have plenty of time
> > to append new messages to the buffer while one CPU is printing. Thus the
> > CPU can spend unbounded amount of time doing printing in console_unlock().
> > This is especially serious problem if the printk() calling
> > console_unlock() was called with interrupts disabled.
> >
> > In practice users have observed a CPU can spend tens of seconds printing
> > in console_unlock() (usually during boot when hundreds of SCSI devices
> > are discovered) resulting in RCU stalls (CPU doing printing doesn't
> > reach quiescent state for a long time), softlockup reports (IPIs for the
> > printing CPU don't get served and thus other CPUs are spinning waiting
> > for the printing CPU to process IPIs), and eventually a machine death
> > (as messages from stalls and lockups append to printk buffer faster than
> > we are able to print). So these machines are unable to boot with serial
> > console attached. Also during artificial stress testing SATA disk
> > disappears from the system because its interrupts aren't served for too
> > long.
> >
> > This patch implements a mechanism where after printing specified number
> > of characters (tunable as a kernel parameter printk.offload_chars), CPU
> > doing printing asks for help by waking up one of dedicated kthreads. As
> > soon as the printing CPU notices kthread got scheduled and is spinning
> > on print_lock dedicated for that purpose, it drops console_sem,
> > print_lock, and exits console_unlock(). Kthread then takes over printing
> > instead. This way no CPU should spend printing too long even if there
> > is heavy printk traffic.
> >
> > Signed-off-by: Jan Kara <jack@xxxxxxx>


> static bool cpu_stop_printing(int printed_chars)
> {
> /* Oops? Print everything now to maximize chances user will see it */
> if (oops_in_progress)
> return false;
> if (!printk_offload_chars || printed_chars < printk_offload_chars)
> return false;
> /*
> * Make sure we load fresh value of printing_tasks_spinning. Matches
> * the barrier in printing_task()
> */
> smp_rmb();
> if (atomic_read(&printing_tasks_spinning))
> return true;
> wake_up(&print_queue);
>
- return false;
+ return true;
> }


we just woke up a task we will offload printing to. no need to do another round
of call_console_drivers() from current, possibly overrunning printk_offload_chars
by strlen of next msg, which can be close to max text length in the worst case,
while woken up print task will burn cpu cycles spinning on the print_lock.

-ss
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/