Re: [RFT v4] tty/sysrq: Make sysrq handler NMI aware

From: Sumit Garg
Date: Wed Mar 09 2022 - 03:19:39 EST


On Tue, 8 Mar 2022 at 21:16, Daniel Thompson <daniel.thompson@xxxxxxxxxx> wrote:
>
> On Tue, Mar 08, 2022 at 08:13:43PM +0530, Sumit Garg wrote:
> > Hi Daniel,
> >
> > On Mon, 7 Mar 2022 at 19:53, Daniel Thompson <daniel.thompson@xxxxxxxxxx> wrote:
> > >
> > > On Mon, Mar 07, 2022 at 04:33:28PM +0530, Sumit Garg wrote:
> > > > Allow a magic sysrq to be triggered from an NMI context. This is done
> > > > via marking some sysrq actions as NMI safe. Safe actions will be allowed
> > > > to run from NMI context whilst that cannot run from an NMI will be queued
> > > > as irq_work for later processing.
> > > >
> > > > <snip>
> > > >
> > > > @@ -566,12 +573,46 @@ static void __sysrq_put_key_op(int key, const struct sysrq_key_op *op_p)
> > > > sysrq_key_table[i] = op_p;
> > > > }
> > > >
> > > > +static atomic_t sysrq_key = ATOMIC_INIT(-1);
> > > > +
> > > > +static void sysrq_do_irq_work(struct irq_work *work)
> > > > +{
> > > > + const struct sysrq_key_op *op_p;
> > > > + int orig_suppress_printk;
> > > > + int key = atomic_read(&sysrq_key);
> > > > +
> > > > + orig_suppress_printk = suppress_printk;
> > > > + suppress_printk = 0;
> > > > +
> > > > + rcu_sysrq_start();
> > > > + rcu_read_lock();
> > > > +
> > > > + op_p = __sysrq_get_key_op(key);
> > > > + if (op_p)
> > > > + op_p->handler(key);
> > > > +
> > > > + rcu_read_unlock();
> > > > + rcu_sysrq_end();
> > > > +
> > > > + suppress_printk = orig_suppress_printk;
> > > > + atomic_set(&sysrq_key, -1);
> > > > +}
> > > > +
> > > > +static DEFINE_IRQ_WORK(sysrq_irq_work, sysrq_do_irq_work);
> > > > +
> > > > void __handle_sysrq(int key, bool check_mask)
> > > > {
> > > > const struct sysrq_key_op *op_p;
> > > > int orig_log_level;
> > > > int orig_suppress_printk;
> > > > int i;
> > > > + bool irq_work = false;
> > > > +
> > > > + /* Skip sysrq handling if one already in progress */
> > > > + if (atomic_cmpxchg(&sysrq_key, -1, key) != -1) {
> > > > + pr_warn("Skip sysrq key: %i as one already in progress\n", key);
> > > > + return;
> > > > + }
> > >
> > > Doesn't this logic needlessly jam sysrq handling if the irq_work cannot
> > > be undertaken?
> > >
> >
> > Here this is done purposefully to ensure synchronisation of three
> > contexts while handling sysrq:
> > 1. Thread context
> > 2. IRQ context
> > 3. NMI context
>
> Why is it necessary to provide such synchronization?
>
> Also, if there really is an existing bug in the way thread and irq
> contexts interact (e.g. something we can tickle without NMI being
> involved) then that should probably be tackled in a separate patch
> and with an explanation of the synchronization problem.
>
>
> > > A console user could unwittingly attempt an !nmi_safe SysRq action on
> > > a damaged system that cannot service interrupts. Logic that prevents
> > > things like backtrace, ftrace dump, kgdb or reboot is actively harmful
> > > to that user's capability to figure out why their original sysrq doesn't
> > > work.
> >
> > I see your point.
> >
> > >
> > > I think the logic to prohibht multiple deferred sysrqs should only
> > > be present on code paths where we are actually going to defer the sysrq.
> > >
> >
> > It's not only there to prohibit multiple deferred sysrq (as that alone
> > could be handled by irq_work_queue()) but rather to avoid parallelism
> > scenarios that Doug mentioned on prior versions.
>
> I'm afraid I'm still a little lost here. I've only done a quick review
> but sysrq's entry/exit protocols look like they are intended to handle
> stacked contexts. This shouldn't be all that suprising since, even
> without your changes, a sysrq triggered by an irq will interrupt
> a sysrq triggered using /proc/sysrq-trigger .
>

Yeah you are right. I see problems with how globals like
"suppress_printk" and "console_loglevel" are modified and restored
within __handle_sysrq(). A concurrent sysrq may easily lead to
incorrect value restoration as an example below:

Thread 1
Thread 2
orig_suppress_printk = suppress_printk; # here value is 1
suppress_printk = 0;

orig_suppress_printk = suppress_printk; #
here value is 0
suppress_printk = orig_suppress_printk; # here value is 1

suppress_printk = 0;

suppress_printk = orig_suppress_printk;
#incorrect value restored as 0

Greg,

Do let me know if I am missing something otherwise I will create a
separate patch for this issue.

-Sumit

>
> > How about the following add-on change to allow passthrough for broken
> > irq_work systems?
>
> My question ultimately boils down to whether the existing logic
> is necessary, not whether we can make it even more complex!
>
> So before thinking too much about this change I think it would be
> useful to have a clear example of the circumstances that you think
> it will not be safe to run an NMI-safe sysrq from an NMI.
>
>
> Daniel.
>
>
> > diff --git a/drivers/tty/sysrq.c b/drivers/tty/sysrq.c
> > index 005c9f9e0004..0a91d3ccf862 100644
> > --- a/drivers/tty/sysrq.c
> > +++ b/drivers/tty/sysrq.c
> > @@ -608,6 +608,15 @@ void __handle_sysrq(int key, bool check_mask)
> > int i;
> > bool irq_work = false;
> >
> > + /*
> > + * Handle a case if irq_work cannot be undertaken on a damaged
> > + * system stuck in hard lockup and cannot service interrupts.
> > + * In such cases we shouldn't atleast block NMI safe handlers
> > + * that doesn't depend on irq_work.
> > + */
> > + if (irq_work_is_pending(&sysrq_irq_work))
> > + atomic_set(&sysrq_key, -1);
> > +
> > /* Skip sysrq handling if one already in progress */
> > if (atomic_cmpxchg(&sysrq_key, -1, key) != -1) {
> > pr_warn("Skip sysrq key: %i as one already in progress\n", key);
> >
> > -Sumit
> >
> > >
> > > Daniel.