Re: [RFC] kdb: Switch kdb_printf to use safer console poll APIs

From: Daniel Thompson
Date: Tue May 26 2020 - 07:10:58 EST


On Tue, May 26, 2020 at 01:16:17PM +0530, Sumit Garg wrote:
> On Fri, 22 May 2020 at 21:33, Daniel Thompson
> <daniel.thompson@xxxxxxxxxx> wrote:
> >
> > On Fri, May 22, 2020 at 08:04:31PM +0530, Sumit Garg wrote:
> > > In kgdb NMI context, polling driver APIs are more safer to use instead
> > > of console APIs since the polling drivers know they will execute from
> > > all sorts of crazy places. And for the most common use cases this would
> > > also result in no console handler ever being called. So switch to use
> > > polling driver APIs in case a particular console supports polling mode.
> >
> > This comment seems rather half hearted, not least because it doesn't
> > explain what the current problem is nor why using the polling API is
> > safer.
> >
>
> TBH, some sentences in the above comment were borrowed from your
> suggestion here [1]. But I agree that it doesn't portray the complete
> picture. So how about:
>
> ====
> In kgdb NMI context, calling console handlers isn't safe due to locks
> used in those handlers which could lead to a deadlock. Although, using
> oops_in_progress increases the chance to bypass locks in most console
> handlers but it might not be sufficient enough in case a console uses
> more locks (VT/TTY is good example).
>
> So instead switch to use lockless polling driver APIs in case a
> particular console supports polling mode which is common for most kdb
> use-cases and would result in no console handler ever being called.
> ====

Better, although the later paragraph still seems rather vague to me.
Compare to:

Currently when a driver provides both polling I/O and a console then kdb
will output using the console. We can increase robustness by using the
currently active polling I/O driver (which should be lockless) instead
of the corresponding console. For several common cases (e.g. an
embedded system with a single serial port that is used both for console
output and debugger I/O) this will result in no console handler being
used.


> [1] https://lkml.org/lkml/2020/5/20/356
>
> > Compare the above against the advice in
> > https://www.kernel.org/doc/html/latest/process/submitting-patches.html#describe-your-changes
> > and I think it comes up short. Perhaps also consider Ingo Molnar's much
> > more concise suggestion on describing changes:
> >
> > : Please use the customary changelog style we use in the kernel:
> > : " Current code does (A), this has a problem when (B).
> > : We can improve this doing (C), because (D)."
> > -- http://lkml.iu.edu/hypermail//linux/kernel/1311.1/01157.html
>
> Thanks for the pointers.
>
> >
> >
> > > Suggested-by: Daniel Thompson <daniel.thompson@xxxxxxxxxx>
> > > Signed-off-by: Sumit Garg <sumit.garg@xxxxxxxxxx>
> > > ---
> > > kernel/debug/kdb/kdb_io.c | 39 +++++++++++++++++++++++++++++++++------
> > > 1 file changed, 33 insertions(+), 6 deletions(-)
> > >
> > > diff --git a/kernel/debug/kdb/kdb_io.c b/kernel/debug/kdb/kdb_io.c
> > > index 3a5a068..8e0d581 100644
> > > --- a/kernel/debug/kdb/kdb_io.c
> > > +++ b/kernel/debug/kdb/kdb_io.c
> > > @@ -24,6 +24,7 @@
> > > #include <linux/kgdb.h>
> > > #include <linux/kdb.h>
> > > #include <linux/kallsyms.h>
> > > +#include <linux/tty_driver.h>
> > > #include "kdb_private.h"
> > >
> > > #define CMD_BUFLEN 256
> > > @@ -699,11 +700,24 @@ int vkdb_printf(enum kdb_msgsrc src, const char *fmt, va_list ap)
> > > }
> > > }
> > > for_each_console(c) {
> > > + int line;
> > > + struct tty_driver *p;
> > > +
> > > if (!(c->flags & CON_ENABLED))
> > > continue;
> > > - ++oops_in_progress;
> > > - c->write(c, cp, retlen - (cp - kdb_buffer));
> > > - --oops_in_progress;
> > > + p = c->device ? c->device(c, &line) : NULL;
> > > + if (p && p->ops && p->ops->poll_put_char) {
> >
> > What prevents this logic from matching an active console that hasn't
> > been selected as the polling driver?
>
> Yes you are correct and it could lead to invoking poll_put_char()
> without poll_init(). And we couldn't invoke poll_init() here as that
> still comes with locks and could sleep. So one way to overcome this
> would be to pass selected polling driver via dbg_io_ops and use
> polling APIs only if the underlying console driver matches that
> polling driver.

Agree.

Note that this is all I ever expected to look at when I commented about
before.


> > > + len = retlen - (cp - kdb_buffer);
> > > + cp2 = cp;
> > > + while (len--) {
> > > + p->ops->poll_put_char(p, line, *cp2);
> > > + cp2++;
> > > + }
> >
> > Assuming it is possible to identify the console that matches the
> > currently selected polling driver can't we just drop the
> > is_console test and get rid of this branch entirely.
>
> Have a look at my suggested approach above.
>
> >
> > The only reason for the is_console test is to avoid issuing messages
> > twice so if we are able to suppress the c->write() for the same UART
> > then is_console check becomes pointless and can go.
>
> I did consider removing is_console check but it looks like it's not
> only limited to polling drivers but also used at other places (see [1]
> [2]) as well.
>
> [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/usb/early/ehci-dbgp.c#n1061
> [2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/tty/serial/kgdb_nmi.c#n48

IIUC you mean that the logic to match devices only works for tty drivers
and there examples are not tty drivers.

This could probably be solved but no need to get too tied in knots. It's
fine to keep the is_console check for now.

However rather than replicate the polled I/O write code a third and
fourth time lets get the I/O logic pulled out into proper functions.



Daniel.