[PATCH 02/10] printk: Try harder to get logbuf_lock on NMI

From: Petr Mladek
Date: Mon May 25 2015 - 08:48:57 EST


If logbuf_lock is not available immediately, it does not necessarily
mean that there is a deadlock. The lock may simply be held by another
CPU for a short time. We should try harder and wait a bit.

On the other hand, we must not forget that we are in NMI and the timeout
has to be rather small. It must not cause dangerous stalls.

I even got a full system freeze when the timeout was 10ms and I printed
backtraces from all CPUs. In that case, all CPUs were blocked for
too long.

Signed-off-by: Petr Mladek <pmladek@xxxxxxx>
---
kernel/printk/printk.c | 38 +++++++++++++++++++++++++++++++++++---
1 file changed, 35 insertions(+), 3 deletions(-)
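
To put the freeze mentioned in the changelog into rough numbers: every
printk() call made from NMI can spin for up to the full timeout before it
gives up, so a backtrace that needs many calls multiplies the delay. The
helper below is only an illustrative userspace sketch; the function name
and the line count used afterwards are assumptions, not measurements.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Upper bound on the time one CPU can spend spinning in NMI when every
 * printk() call of a backtrace fails to get the lock: one full timeout
 * per call. Hypothetical helper, not part of the patch.
 */
static uint64_t worst_case_stall_ns(uint64_t printk_calls, uint64_t timeout_ns)
{
	/* guard against overflow of the multiplication */
	assert(timeout_ns == 0 || printk_calls <= UINT64_MAX / timeout_ns);

	return printk_calls * timeout_ns;
}
```

Assuming a backtrace of 40 lines, a 10ms timeout allows up to 400ms of
spinning per CPU, while the 100us timeout bounds it at 4ms.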

diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 94fcf6f0b542..e6c00d6ee8dc 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -231,6 +231,8 @@ static DEFINE_RAW_SPINLOCK(logbuf_lock);

#ifdef CONFIG_PRINTK
DECLARE_WAIT_QUEUE_HEAD(log_wait);
+/* cpu currently holding logbuf_lock */
+static unsigned int logbuf_cpu = UINT_MAX;
/* the next printk record to read by syslog(READ) or /proc/kmsg */
static u64 syslog_seq;
static u32 syslog_idx;
@@ -1610,6 +1612,38 @@ static size_t cont_print_text(char *text, size_t size)
return textlen;
}

+/*
+ * This value defines the maximum delay that we spend waiting for logbuf_lock
+ * in NMI context. 100us looks like a good compromise. Note that, for example,
+ * syslog_print_all() might hold the lock for quite some time. On the other
+ * hand, waiting 10ms caused a system freeze when many backtraces were printed
+ * in NMI.
+ */
+#define TRY_LOCKBUF_LOCK_MAX_DELAY_NS 100000UL
+
+/* We must be careful in NMI when we managed to preempt a running printk */
+static int try_logbuf_lock_in_nmi(void)
+{
+	u64 start_time, current_time;
+	int this_cpu = smp_processor_id();
+
+	/* no way if we are already locked on this CPU */
+	if (logbuf_cpu == this_cpu)
+		return 0;
+
+	/* try hard to get the lock but do not wait forever */
+	start_time = cpu_clock(this_cpu);
+	current_time = start_time;
+	while (current_time - start_time < TRY_LOCKBUF_LOCK_MAX_DELAY_NS) {
+		if (raw_spin_trylock(&logbuf_lock))
+			return 1;
+		cpu_relax();
+		current_time = cpu_clock(this_cpu);
+	}
+
+	return 0;
+}
+
asmlinkage int vprintk_emit(int facility, int level,
const char *dict, size_t dictlen,
const char *fmt, va_list args)
@@ -1624,8 +1658,6 @@ asmlinkage int vprintk_emit(int facility, int level,
 	int this_cpu;
 	int printed_len = 0;
 	bool in_sched = false;
-	/* cpu currently holding logbuf_lock in this function */
-	static unsigned int logbuf_cpu = UINT_MAX;

if (level == LOGLEVEL_SCHED) {
level = LOGLEVEL_DEFAULT;
@@ -1672,7 +1704,7 @@ asmlinkage int vprintk_emit(int facility, int level,
*/
 	if (!in_nmi()) {
 		raw_spin_lock(&logbuf_lock);
-	} else if (!raw_spin_trylock(&logbuf_lock)) {
+	} else if (!try_logbuf_lock_in_nmi()) {
 		atomic_inc(&nmi_message_lost);
 		lockdep_on();
 		local_irq_restore(flags);
--
1.8.5.6
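
For readers who want to experiment with the idea outside the kernel, the
bounded trylock above can be approximated in userspace. This is only a
sketch under stated assumptions: a C11 atomic_flag stands in for
logbuf_lock, clock_gettime() stands in for cpu_clock(), and the names
try_lock_bounded(), unlock_bounded(), lock_owner and MAX_DELAY_NS are
hypothetical. It is not the kernel implementation.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <time.h>

#define MAX_DELAY_NS 100000ULL	/* 100us, the budget chosen in the patch */
#define NO_OWNER (-1)

static atomic_flag lock = ATOMIC_FLAG_INIT;	/* stand-in for logbuf_lock */
static atomic_int lock_owner = NO_OWNER;	/* stand-in for logbuf_cpu */

static uint64_t now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
}

/* Returns 1 with the lock held, 0 if the caller should give up. */
static int try_lock_bounded(int self)
{
	uint64_t start;

	/* The holder is the context we interrupted; waiting cannot help. */
	if (atomic_load(&lock_owner) == self)
		return 0;

	/* Retry the trylock until the time budget is used up. */
	start = now_ns();
	do {
		if (!atomic_flag_test_and_set_explicit(&lock,
						       memory_order_acquire)) {
			atomic_store(&lock_owner, self);
			return 1;
		}
	} while (now_ns() - start < MAX_DELAY_NS);

	return 0;
}

static void unlock_bounded(void)
{
	/* unlocking a lock that nobody holds would be a caller bug */
	assert(atomic_load(&lock_owner) != NO_OWNER);

	atomic_store(&lock_owner, NO_OWNER);
	atomic_flag_clear_explicit(&lock, memory_order_release);
}
```

The owner check comes first for the same reason as in the patch: if the
lock is held by the context that was interrupted, no amount of waiting
will release it, so the caller has to give up immediately.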
