Re: [PATCH] kgdb: Fix broken handling of printk() in NMI context

From: Petr Mladek
Date: Thu May 14 2020 - 04:43:26 EST


On Wed 2020-05-13 19:04:48, Sumit Garg wrote:
> On Tue, 12 May 2020 at 19:55, Daniel Thompson
> <daniel.thompson@xxxxxxxxxx> wrote:
> >
> > On Tue, May 12, 2020 at 02:18:34PM +0530, Sumit Garg wrote:
> > > Since commit 42a0bb3f7138 ("printk/nmi: generic solution for safe printk
> > > in NMI"), kgdb entry in NMI context defaults to use safe NMI printk()
> >
> > I didn't see the author on Cc: nor any of the folks whose hands it
> > passed through. It would definitely be good to involve them in this
> > discussion.
> >
>
> Thanks for updating the Cc: list.
>
> >
> > > which involves CPU specific buffers and deferred printk() until exit from
> > > NMI context.
> > >
> > > But kgdb being a stop-the-world debugger, we don't want to defer printk()
> > > especially backtrace on corresponding CPUs. So instead switch to normal
> > > printk() mode in kgdb_cpu_enter() if entry is in NMI context.
> >
> > So, firstly I should *definitely* take a mea cupla for not shouting
> > about this at the time (I was on Cc:... twice). Only thing I can say
> > confidently is that the test suite didn't yell about this and so I
> > didn't look at this as closely as I should have done (and that it
> > didn't yell is mostly because I'm still building out the test suite
> > coverage).
> >
> > Anyhow...
> >
> > This feels a little like we are smearing the printk() interception logic
> > across the kernel in ways that make things hard to read. If we accepted
> > this patch we then have, the new NMI interception logic, the old kdb
> > interception logic and some hacks in the kgdb trap handler to defang the
> > NMI interception logic and force the kdb logic to kick in.
> >
> > Wouldn't it be better to migrate kdb interception logic up a couple of
> > levels so that it continues to function even when we are in nmi printk
> > mode. That way *all* the printk() interception code would end up in
> > one place.
> >
>
> Yes it would be better to have all printk() interception code at one
> place. Let me see if I can come up with an integrated logic.

It might be enough to move the kdb_check from vprintk_default()
to vprintk_func().

I have never used kdb. I did not know that it was able to stop
kernel in any context.

Would this work? It is only compile tested!