Re: [PATCH printk v4 27/27] lockdep: Mark emergency sections in lockdep splats

From: Petr Mladek
Date: Tue Apr 16 2024 - 05:51:53 EST


On Wed 2024-04-03 00:17:29, John Ogness wrote:
> Mark emergency sections wherever multiple lines of
> lock debugging output are generated. In an emergency
> section the CPU will not perform console output for the
> printk() calls. Instead, a flushing of the console
> output is triggered when exiting the emergency section.
> This allows the full message block to be stored as
> quickly as possible in the ringbuffer.
>
> Signed-off-by: John Ogness <john.ogness@xxxxxxxxxxxxx>
> ---
> kernel/locking/lockdep.c | 91 ++++++++++++++++++++++++++++++++++++++--
> 1 file changed, 88 insertions(+), 3 deletions(-)
>
> diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
> index 151bd3de5936..80cfbe7b340e 100644
> --- a/kernel/locking/lockdep.c
> +++ b/kernel/locking/lockdep.c
> @@ -5150,6 +5211,7 @@ static int __lock_acquire(struct lockdep_map *lock, unsigned int subclass,
> #endif
> if (unlikely(curr->lockdep_depth >= MAX_LOCK_DEPTH)) {
> debug_locks_off();
> + nbcon_cpu_emergency_enter();
> print_lockdep_off("BUG: MAX_LOCK_DEPTH too low!");
> printk(KERN_DEBUG "depth: %i max: %lu!\n",
> curr->lockdep_depth, MAX_LOCK_DEPTH);
> @@ -5157,6 +5219,7 @@ static int __lock_acquire(struct lockdep_map *lock, unsigned int subclass,
> lockdep_print_held_locks(current);
> debug_show_all_locks();

My concern is that the dump of all locks would not fit into the
buffer. And it might even trigger softlockup. See below.


> dump_stack();
> + nbcon_cpu_emergency_exit();
>
> return 0;
> }
> @@ -6609,6 +6688,7 @@ void debug_show_all_locks(void)
> pr_warn("INFO: lockdep is turned off.\n");
> return;
> }
> + nbcon_cpu_emergency_enter();
> pr_warn("\nShowing all locks held in the system:\n");
>
> rcu_read_lock();

The code dumping the locks looks like:

for_each_process_thread(g, p) {
if (!p->lockdep_depth)
continue;
lockdep_print_held_locks(p);
touch_nmi_watchdog();
touch_all_softlockup_watchdogs();
}

I see two problems here:

1. I belive that the watchdogs are touched for a reason. And they will
be useless when we print everything in a single emergency context
and emit all messages at the end.

2. The default config LOG_BUF_SHIFT is 17. So the default kernel
buffer size is 128kB. The number of messages is bound by
the number of processes. I am afraid that all messages might
not fit into the buffer.


I see two solutions:

1. Take the emergency context only around single dump:

nbcon_cpu_emergency_enter();
lockdep_print_held_locks(p);
nbcon_cpu_emergency_exit();


2. Explicitely flush the printk buffer here. Something like:

lockdep_print_held_locks(p);
nbcon_cpu_emergency_flush();
touch_nmi_watchdog();
touch_all_softlockup_watchdogs();


, where nbcon_cpu_emergency_flush() would do something like:

/**
* nbcon_cpu_emergency_flush - Explicitly flush consoles in
* the middle of emergency context.
*
* Both nbcon and legacy consoles are flushed.
*
* It should be used only when there are too many messages printed
* in emergency context, for example, printing backtraces of all
* CPUs or processes. It is typically needed when the watchdogs
* have to be touched as well.
*
void nbcon_cpu_emergency_flush()
{
/* The explicit flush is needed only in the emergency context. */
if (!nbcon_get_cpu_emergency_nesting())
return;

nbcon_atomic_flush_pending();

if (printing_via_unlock && !in_nmi()) {
if (console_trylock())
console_unlock();
}
}


I like the 2nd solution. The rule is more or less clear. The
explicit flush is needed when the the code needed to touch watchdogs.

Maybe, we should go even further and call the flush directly
from the touch_*_*watchdog() API. It has effect only in emergency
context.

Best Regards,
Petr