Re: [PATCH printk v1 03/10] kgdb: delay roundup if holding printk cpulock

From: Petr Mladek
Date: Wed Aug 04 2021 - 08:31:41 EST


On Tue 2021-08-03 17:36:32, John Ogness wrote:
> On 2021-08-03, Daniel Thompson <daniel.thompson@xxxxxxxxxx> wrote:
> > On Tue, Aug 03, 2021 at 03:18:54PM +0206, John Ogness wrote:
> >> kgdb makes use of its own cpulock (@dbg_master_lock, @kgdb_active)
> >> during cpu roundup. This will conflict with the printk cpulock.
> >
> >> diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
> >> index 3d0c933937b4..1b546e117f10 100644
> >> --- a/kernel/printk/printk.c
> >> +++ b/kernel/printk/printk.c
> >> @@ -214,6 +215,7 @@ int devkmsg_sysctl_set_loglvl(struct ctl_table *table, int write,
> >> #ifdef CONFIG_SMP
> >> static atomic_t printk_cpulock_owner = ATOMIC_INIT(-1);
> >> static atomic_t printk_cpulock_nested = ATOMIC_INIT(0);
> >> +static unsigned int kgdb_cpu = -1;
> >
> > Is this the flag to provoke retriggering? It appears to be a write-only
> > variable (at least in this patch). How is it consumed?
>
> Critical catch! Thank you. I am quite unhappy to see these hunks were
> accidentally dropped when generating this series.
>
> @@ -3673,6 +3675,9 @@ EXPORT_SYMBOL(__printk_cpu_trylock);
> */
> void __printk_cpu_unlock(void)
> {
> + bool trigger_kgdb = false;
> + unsigned int cpu;
> +
> if (atomic_read(&printk_cpulock_nested)) {
> atomic_dec(&printk_cpulock_nested);
> return;
> @@ -3683,6 +3688,12 @@ void __printk_cpu_unlock(void)
> * LMM(__printk_cpu_unlock:A)
> */
>
> + cpu = smp_processor_id();
> + if (kgdb_cpu == cpu) {
> + trigger_kgdb = true;
> + kgdb_cpu = -1;
> + }

Just in case this approach ends up being used:

This code looks racy. kgdb_roundup_delay() seems to be called in NMI
context. An NMI might arrive at this point and set kgdb_cpu after
it has already been checked, so the request would not be noticed
by this unlock.

I am afraid that it won't be easy to make this safe using a single
global variable. A solution might be a per-CPU variable set
by kgdb_roundup_delay() when it owns printk_cpu_lock.
__printk_cpu_unlock() would call kgdb_roundup_cpu(cpu) when
the variable is set.
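
To make the per-CPU idea concrete, here is a rough single-threaded
userspace model, not kernel code. The array kgdb_delayed_roundup[]
stands in for a real per-CPU variable, the lock is reduced to a plain
owner field, kgdb_roundup_cpu() is a counting stub, and NR_CPUS,
NO_OWNER, roundup_calls[] and the printk_cpu_lock()/printk_cpu_unlock()
helpers are made up for illustration. Checking the flag only after the
owner has been cleared is an assumption of this sketch; a real version
would also need the atomics and memory barriers of the actual cpulock.

```c
#include <stdbool.h>

#define NR_CPUS  4      /* hypothetical CPU count for this model */
#define NO_OWNER (-1)

/* Models printk_cpulock_owner; the real one is an atomic_t. */
static int printk_cpulock_owner = NO_OWNER;

/* Stands in for a per-CPU flag: one slot per CPU, never shared. */
static bool kgdb_delayed_roundup[NR_CPUS];

/* Counts how often the stubbed roundup was invoked for each CPU. */
static int roundup_calls[NR_CPUS];

/* Stub for the real kgdb_roundup_cpu(); it only records the call. */
static void kgdb_roundup_cpu(int cpu)
{
	roundup_calls[cpu]++;
}

/*
 * Modelled NMI-side check running on @cpu. If this CPU owns the
 * printk cpulock, remember the request in this CPU's own slot and
 * tell the caller to delay the roundup.
 */
static bool kgdb_roundup_delay(int cpu)
{
	if (printk_cpulock_owner != cpu)
		return false;

	kgdb_delayed_roundup[cpu] = true;
	return true;
}

/* Hypothetical helper: take the modelled lock on @cpu. */
static void printk_cpu_lock(int cpu)
{
	printk_cpulock_owner = cpu;
}

/*
 * Hypothetical helper: release the modelled lock on @cpu, then
 * replay a roundup that was delayed while the lock was held.
 */
static void printk_cpu_unlock(int cpu)
{
	printk_cpulock_owner = NO_OWNER;

	/* Only this CPU ever touches its own slot. */
	if (kgdb_delayed_roundup[cpu]) {
		kgdb_delayed_roundup[cpu] = false;
		kgdb_roundup_cpu(cpu);
	}
}
```

Because each CPU reads and writes only its own slot, a request recorded
for one CPU cannot be overwritten by another CPU, which is the problem
with a single global kgdb_cpu.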

Nit: The name "kgdb_cpu" is too generic. It is not clear what is
so special about this CPU. I would call the per-CPU variable
"kgdb_delayed_roundup" or so.


Best Regards,
Petr

> /*
> * Guarantee loads and stores from this CPU when it was the
> * lock owner are visible to the next lock owner. This pairs
> @@ -3703,6 +3714,21 @@ void __printk_cpu_unlock(void)
> */
> atomic_set_release(&printk_cpulock_owner,
> -1); /* LMM(__printk_cpu_unlock:B) */
> +
> + if (trigger_kgdb) {
> + pr_warn("re-triggering kgdb roundup for CPU#%d\n", cpu);
> + kgdb_roundup_cpu(cpu);
> + }
> }