Re: [PATCH] printk/kdb: Redirect printk messages into kdb in any context

From: Sergey Senozhatsky
Date: Fri May 15 2020 - 12:24:18 EST


On (20/05/15 14:48), Daniel Thompson wrote:
> On Fri, May 15, 2020 at 07:33:08PM +0900, Sergey Senozhatsky wrote:
> > On (20/05/15 10:50), Petr Mladek wrote:

[..]

> > Is this guaranteed that we never execute this path from NMI?
>
> Absolutely not.
>
> The execution context for kdb is pretty much unique...

OK, that was what I expected.

> we are running a debug mode with all CPUs parked in a holding loop with
> interrupts disabled. One CPU is at an unknown exception state and the
> others are either handling an IRQ or NMI depending on architecture[1].

Can a CPU be parked while holding the console driver port lock?

Hmm, a side note - this also means that we cannot handle it from
polling console drivers and need to switch to direct console writes.

> However there are a number of factors that IMHO weigh in favour of
> allowing kdb to intercept here.
>
> 1. kgdb/kdb are designed to work from NMI, modulo the bugs that are
> undoubtedly present.
>
> 2. A synchronous breakpoint (including an implicit breakpoint-on-oops)
> from any code that executes with irqs disabled will exhibit most of
> the same problems as an NMI but without waking up all the NMI logic.
>
> 3. kdb_trap_printk is only set for *very* narrow time intervals by the
> debug master (the single CPU in the system that is *not* in a
> holding loop). Thus in all cases the system has already successfully
> executed kdb_printf() several times before we ever call the printk()
> interception code.
>
> Or put another way, even if we did tickle a bug speculated about in
> #1, it won't be the call to printk() that triggers it; we'd never
> get that far!

OK. I would appreciate a more detailed commit message:
- what we fix, and what risks we take. Just for the record.

+ a small nit: looking at the for_each_console() loop -- not all consoles
can be invoked at any time and not all consoles are enabled at any time.
You _probably_ want to do what printk does in the call_console_drivers()
loop. printk also had problems with console callbacks being placed in
sections that get discarded, but that's way too niche.
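
To illustrate the kind of filtering meant here, below is a rough userspace
sketch of the per-console checks, written from memory of what the
call_console_drivers() loop does: skip consoles without CON_ENABLED, skip
consoles without a ->write() callback, and require CON_ANYTIME when the
current CPU is not online. The struct layout, the flag bit values and the
helper names (console_usable(), emit_to_consoles(), count_write()) are
stand-ins invented for this example, not kernel code, and the real loop also
deals with exclusive_console and CON_EXTENDED, which are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in flag bits for this sketch; the real ones are in <linux/console.h>. */
#define CON_ENABLED (1u << 0)
#define CON_ANYTIME (1u << 1)

/* Cut-down, hypothetical version of struct console. */
struct console {
	const char *name;
	unsigned int flags;
	void (*write)(struct console *con, const char *s, size_t len);
	size_t written;		/* bytes accepted so far, demo bookkeeping only */
	struct console *next;
};

/* The three checks: enabled, has a write callback, allowed on this CPU. */
static bool console_usable(const struct console *con, bool cpu_is_online)
{
	if (!(con->flags & CON_ENABLED))
		return false;
	if (!con->write)
		return false;
	if (!cpu_is_online && !(con->flags & CON_ANYTIME))
		return false;
	return true;
}

/* Demo write callback: only counts the bytes it was handed. */
static void count_write(struct console *con, const char *s, size_t len)
{
	(void)s;
	con->written += len;
}

/* Walk the list and write only to usable consoles; returns how many were used. */
static int emit_to_consoles(struct console *head, const char *s, size_t len,
			    bool cpu_is_online)
{
	int used = 0;

	assert(s != NULL);
	for (struct console *con = head; con; con = con->next) {
		if (!console_usable(con, cpu_is_online))
			continue;
		con->write(con, s, len);
		used++;
	}
	return used;
}
```

With something like this in the kdb path, a disabled console or one without
->write() is simply skipped instead of being called blindly.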

-ss