Re: [RFC][PATCH] printk: do not flush printk_safe from irq_work

From: Petr Mladek
Date: Sat Jan 27 2018 - 06:29:07 EST


On Wed 2018-01-24 18:37:23, Sergey Senozhatsky wrote:
> Tejun Heo reported that net_console can cause recursive printk()-s
> from call_console_drivers() (under OOM/etc.) and this can lockup that
> CPU, because CPU calls call_console_drivers() to print pending logbufs,
> but recursive printk() issued by one of console drivers adds several
> new messages to the logbuf via recursive printk().

> Note, this is not a printk_safe problem. In fact printk_safe helps
> us to address the issue here and to rate-limit recursive printk()-s.
>
> Schematically, what we have at the moment is:
>
> ==================================================
> printk()
> preempt_disable()
> console_unlock()
> {
> for (;;) {
> local_irq_save()
> printk_safe_enter()
>
> call_console_drivers()
> printk()
> vprintk_safe()
> irq_work_queue()
>
> printk_safe_exit()
> local_irq_restore()
>
> << irq work >>
> printk_safe_flush()
> printk()
> }
> }
> preempt_enable()
> ==================================================
>
> So CPU triggers printk() recursion in call_console_drivers(),
> immediately after local_irq_restore() it executes printk_safe
> flush irq_work, which adds even more pending messages to the
> logbuf which it has to print. CPU hits printk() recursion again
> while it prints pending messages, which results in even more
> pending messages flushed from printk_safe via irq_work. And so
> on.
>
> The idea is to delay printk_safe flush until CPU is in preemptible
> context, so we won't lockup it up.
>
> The new behaviour is:
>
> ==================================================
> printk()
> preempt_disable()
> console_unlock()
> {
> for (;;) {
> local_irq_save()
> printk_safe_enter()
>
> call_console_drivers()
> printk()
> vprintk_safe()
> queue_work_on(smp_processor_id())
>
> printk_safe_exit()
> local_irq_restore()
> }
> }
> preempt_enable()
>
> << work queue >>
> printk_safe_flush()
> printk()
> ==================================================
>
> Since console_unlock() is called under preemption_disabled() now
> we cannot schedule flush work as long as CPU is printing messages.
> We only can flush printk_safe messages after CPU returns from
> console_unlock() and enables preemption.
>
> This might look like with the delayed flush we increase the possibility
> of lost printk_safe messages (printk_safe per-CPU buffers are pretty
> limited in size). But prinkt_safe was never meant to be used for huge
> amounts of data which can generate e.g. debugging of serial consoles,
> etc. Its goal is to make recursive printk()-s less deadlock prone.
> With this patch we add one more thing: avoid CPU lockup which might
> be caused by printk_safe_flush().

I have two concerns about this approach:

First, this delays showing eventually valuable information until
the preemption is enabled. It might never happen if the system
is in big troubles. In each case, it might be much longer delay
than it was before.

Second, it makes printk() dependent on another non-trivial subsystem.
I mean workqueues.

If I got it right, we want to limit the number of recursively added
messages that might be printed() from inside console_unlock(),
especially from console_drivers().


The following, a bit ugly, solution has came to my mind. We could
think about it like extending the printk_context. It counts
printks called in this context and does nothing when we reach
the limit. The difference is that the context is task-specific
instead of CPU-specific.

I used only one compiler barrier which looks strange. Well, it
is late Friday here and it is just a POC. I wonder what you
think about this approach.


Just a compile tested code:

diff --git a/kernel/printk/internal.h b/kernel/printk/internal.h
index 2a7d04049af4..747098310179 100644
--- a/kernel/printk/internal.h
+++ b/kernel/printk/internal.h
@@ -23,6 +23,9 @@
#define PRINTK_NMI_CONTEXT_MASK 0x80000000

extern raw_spinlock_t logbuf_lock;
+extern struct task_struct *console_unlock_owner;
+extern int console_recursion_count;
+extern int console_recursion_limit;

__printf(1, 0) int vprintk_default(const char *fmt, va_list args);
__printf(1, 0) int vprintk_deferred(const char *fmt, va_list args);
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index db4b9b8929eb..4508bb579c5b 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -81,6 +81,9 @@ EXPORT_SYMBOL(oops_in_progress);
static DEFINE_SEMAPHORE(console_sem);
struct console *console_drivers;
EXPORT_SYMBOL_GPL(console_drivers);
+struct task_struct *console_unlock_owner;
+int console_recursion_count;
+int console_recursion_limit = 100;

#ifdef CONFIG_LOCKDEP
static struct lockdep_map console_lock_dep_map = {
@@ -2293,6 +2296,7 @@ void console_unlock(void)
unsigned long flags;
bool wake_klogd = false;
bool do_cond_resched, retry;
+ int recursion_count = 0;

if (console_suspended) {
up_console_sem();
@@ -2328,6 +2332,20 @@ void console_unlock(void)
return;
}

+ /* Start section where we prevent recursion */
+ console_unlock_owner = current;
+ /*
+ * Counter is preserved even when console_lock is released,
+ * re-taken and we end up here again.
+ */
+ console_recursion_count = recursion_count;
+ /*
+ * The above values are meaningful only for the current task.
+ * Make only sure that this is set before calling anything that
+ * might call printk().
+ */
+ barrier();
+
for (;;) {
struct printk_log *msg;
size_t ext_len = 0;
@@ -2414,6 +2432,8 @@ void console_unlock(void)

raw_spin_unlock(&logbuf_lock);

+ recursion_count = console_recursion_count;
+ console_unlock_owner = NULL;
up_console_sem();

/*
diff --git a/kernel/printk/printk_safe.c b/kernel/printk/printk_safe.c
index 3e3c2004bb23..46209b4fd960 100644
--- a/kernel/printk/printk_safe.c
+++ b/kernel/printk/printk_safe.c
@@ -364,6 +364,14 @@ __printf(1, 0) int vprintk_func(const char *fmt, va_list args)
if (this_cpu_read(printk_context) & PRINTK_NMI_CONTEXT_MASK)
return vprintk_nmi(fmt, args);

+ /*
+ * No barrier is needed. The recursion counter is increased only by
+ * the task owning the console lock.
+ */
+ if (console_unlock_owner == current &&
+ console_recursion_count++ > console_recursion_limit)
+ return 0;
+
/* Use extra buffer to prevent a recursion deadlock in safe mode. */
if (this_cpu_read(printk_context) & PRINTK_SAFE_CONTEXT_MASK)
return vprintk_safe(fmt, args);



PS: I answered this mail because the original discussion[1] has
already been way too long and covered many other issues. But feel
free to ask me to move it there.

[1] https://lkml.kernel.org/r/20180117121251.7283a56e@xxxxxxxxxxxxxxxxxx