Re: [PATCH v2] kdb: Make kdb_printf robust to run in NMI context

From: Daniel Thompson
Date: Fri May 22 2020 - 12:35:14 EST


On Fri, May 22, 2020 at 08:03:47PM +0530, Sumit Garg wrote:
> While rounding up CPUs via NMIs, its possible that a rounded up CPU
> maybe holding a console port lock leading to kgdb master CPU stuck in
> a deadlock during invocation of console write operations. So in order
> to avoid such a deadlock, invoke bust_spinlocks() prior to invocation
> of console handlers.
>
> Also, add a check for console port to be enabled prior to invocation of
> corresponding handler.

Perhaps this should have been two patches.

In fact, to be honest, I'd suggest combining all the patches to improve
kdb console handling (including a fixed version of the RFC) into a
single patch set.


> Suggested-by: Petr Mladek <pmladek@xxxxxxxx>
> Suggested-by: Sergey Senozhatsky <sergey.senozhatsky@xxxxxxxxx>
> Signed-off-by: Sumit Garg <sumit.garg@xxxxxxxxxx>
> ---
>
> Changes in v2:
> - Use oops_in_progress directly instead of bust_spinlocks().
>
> kernel/debug/kdb/kdb_io.c | 8 ++++++++
> 1 file changed, 8 insertions(+)
>
> diff --git a/kernel/debug/kdb/kdb_io.c b/kernel/debug/kdb/kdb_io.c
> index 924bc92..3a5a068 100644
> --- a/kernel/debug/kdb/kdb_io.c
> +++ b/kernel/debug/kdb/kdb_io.c
> @@ -699,7 +699,11 @@ int vkdb_printf(enum kdb_msgsrc src, const char *fmt, va_list ap)
> }
> }
> for_each_console(c) {
> + if (!(c->flags & CON_ENABLED))
> + continue;
> + ++oops_in_progress;

Given the subtly of what is going on I think we need some comments in
the code on what we are doing and why.


> c->write(c, cp, retlen - (cp - kdb_buffer));
> + --oops_in_progress;
> touch_nmi_watchdog();
> }
> }
> @@ -761,7 +765,11 @@ int vkdb_printf(enum kdb_msgsrc src, const char *fmt, va_list ap)
> }
> }
> for_each_console(c) {
> + if (!(c->flags & CON_ENABLED))
> + continue;
> + ++oops_in_progress;
> c->write(c, moreprompt, strlen(moreprompt));
> + --oops_in_progress;
> touch_nmi_watchdog();

As with the other patches maybe the first patch in the set should be
factoring out the common code before making changes to it.


Daniel.