Re: [RFC][PATCH v6 1/2] printk: Make printk() completely async

From: Sergey Senozhatsky
Date: Tue Mar 22 2016 - 21:23:34 EST


On (03/22/16 17:36), Petr Mladek wrote:
> > - /* cpu currently holding logbuf_lock in this function */
> > - static unsigned int logbuf_cpu = UINT_MAX;
> > + bool in_panic = console_loglevel == CONSOLE_LOGLEVEL_MOTORMOUTH;
>
> I am just looking at the printk in NMI patchset and I will need to
> deal with the panic state as well. I am not sure if this detection
> is secure.
>
> This console level is set also by kdb_show_stack()
> and kdb_dumpregs(). I am not sure how this kdb stuff works
> and if it affects normal kernel but...
>
> Anyway, it seems that many locations detects the panic situation
> via the variable oops_in_progress. It has another advantage
> that it can be easily checked and we would not need any extra
> variable here.

oops_in_progress is not my favorite global. and we can't rely on it
in async printk.

in panic() we have

	console_verbose();
	bust_spinlocks(1);		<< sets to one

	pr_emerg("Kernel panic - not syncing: %s\n", buf);
	smp_send_stop();

	bust_spinlocks(0);		<< sets it back to zero

	console_flush_on_panic();

there are two issues here.
- first, panic_cpu no longer sees oops_in_progress set once bust_spinlocks(0)
  returns. thus every printk issued from panic_cpu after that point can go
  via async printk.

- second, smp_send_stop() does not guarantee that all of the CPUs have
  received the STOP IPI by the time it returns. on some platforms (ARM, for
  instance) smp_send_stop()

:	if (!cpumask_empty(&mask))
:		smp_cross_call(&mask, IPI_CPU_STOP);
:
:	/* Wait up to one second for other CPUs to stop */
:	timeout = USEC_PER_SEC;
:	while (num_online_cpus() > 1 && timeout--)
:		udelay(1);
:
:	if (num_online_cpus() > 1)
:		pr_warn("SMP: failed to stop secondary CPUs\n");
:
:	return;

waits for up to one second and returns to panic_cpu, and panic_cpu then sets
oops_in_progress back to zero. meanwhile STOP IPIs can still be arriving at
the remaining CPUs. on some platforms (ARM, for instance) the STOP IPI
handler is

:		raw_spin_lock(&stop_lock);
:		pr_crit("CPU%u: stopping\n", cpu);
:		dump_stack();
:		raw_spin_unlock(&stop_lock);
:	}
:
:	set_cpu_online(cpu, false);
:
:	local_fiq_disable();
:	local_irq_disable();
:
:	while (1)
:		cpu_relax();


so the dump_stack() calls on those CPUs can in theory happen while
oops_in_progress is zero and, thus, printk will try to print them via
printk_kthread, which is not something we really want to do in panic().
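
to make the window concrete, here is a minimal user-space sketch of the
ordering above. it is not kernel code: all model_* names are made up for
illustration, model_bust_spinlocks() keeps only the oops_in_progress
counting, and the sync/async decision is assumed to depend on
oops_in_progress alone.

```c
#include <stdbool.h>

/* user-space stand-in for the kernel global; illustration only */
static int oops_in_progress;

/* hypothetical model: keeps only the counter side effect of bust_spinlocks() */
void model_bust_spinlocks(int yes)
{
	if (yes)
		++oops_in_progress;
	else
		--oops_in_progress;
}

/* assumed printk policy: print synchronously only while oops_in_progress is set */
bool model_printk_is_sync(void)
{
	return oops_in_progress != 0;
}

/*
 * replays the panic() ordering and reports how a dump_stack() from a
 * late STOP IPI handler would be printed.
 */
bool model_panic_late_dump_is_sync(void)
{
	model_bust_spinlocks(1);	/* oops_in_progress becomes non-zero */
	/* pr_emerg() would be printed synchronously here */
	/* smp_send_stop() gives up waiting and returns */
	model_bust_spinlocks(0);	/* oops_in_progress is zero again */

	/* a remote CPU handles its STOP IPI only now and calls dump_stack() */
	return model_printk_is_sync();
}
```

in this model model_panic_late_dump_is_sync() returns false, i.e. the late
dump_stack() output would be handed to the printing kthread.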

so I wanted to have in printk some panic indication that once set never
gets cleared. my proposal was

void console_panic(void)
{
	printk_sync = false;
}

or similar.
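
a rough user-space sketch of the "set once, never cleared" idea, under
stated assumptions: panic_seen is a hypothetical flag standing in for
whatever console_panic() would really set, the model_* helpers are made up
for illustration, and cross-CPU visibility of the flag is not modelled at
all.

```c
#include <stdbool.h>

/* user-space stand-ins; illustration only */
static int oops_in_progress;
static bool panic_seen;		/* hypothetical sticky flag, never cleared */

/* hypothetical model: keeps only the counter side effect of bust_spinlocks() */
void model_bust_spinlocks(int yes)
{
	if (yes)
		++oops_in_progress;
	else
		--oops_in_progress;
}

/* model of the proposed console_panic(): a one-way switch */
void model_console_panic(void)
{
	panic_seen = true;
}

/* assumed printk policy: the sticky flag forces sync printing on its own */
bool model_printk_is_sync(void)
{
	return panic_seen || oops_in_progress != 0;
}
```

with this, calling model_console_panic() first and then
model_bust_spinlocks(1) followed by model_bust_spinlocks(0) still leaves
model_printk_is_sync() returning true, so a late dump_stack() would not be
deferred to the kthread.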

-ss