Re: RFC: SysRq nice-all-RT-tasks is broken

From: Steven Rostedt
Date: Wed Mar 08 2017 - 12:40:25 EST

Added Peter

Update: Laurent noticed that sysrq 'n' (nice-all-RT-tasks) calls
__sched_setscheduler() from interrupt context. At the start of that
function, there's a BUG_ON(in_interrupt()). The reason for that was
the rt mutex pi code taking wait_lock, which was not irq safe. Now it
is, but that's not good enough.

On Wed, 8 Mar 2017 18:03:55 +0100
Laurent Dufour <ldufour@xxxxxxxxxxxxxxxxxx> wrote:

> On 08/03/2017 17:57, Steven Rostedt wrote:
> > On Wed, 8 Mar 2017 11:51:14 -0500
> > Steven Rostedt <rostedt@xxxxxxxxxxx> wrote:
> >
> >
> >> Hmm, that commit was added in 2.6.18, and you're right, a lot has
> >> changed since then. Have you tried removing it and running it under
> >> lockdep, and see if it triggers any warnings?
> >
> > I did a little digging, and it appears that it's the rt mutex wait lock
> > that the comment was referring to. Today that spin lock is irq safe. I
> > believe it's safe to remove the BUG_ON(). Want me to send a patch?
> Sure, go ahead ;)

Actually, it's still not safe :-/

I just noticed this in the call path:


As well as other raw_spin_unlock_irq()s, all of which would enable
interrupts regardless of the previous state.

One solution is to change all those to irqsave() but that seems to be a
big step for something that is rarely done (how many years has it been
since 2.6.18?).
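
To make the _irq vs. _irqsave difference concrete, here is a minimal
userspace model. It is only a sketch, not kernel code: a single bool
stands in for the CPU's interrupt-enable state, and every model_*() name
is made up for illustration. It assumes the caller enters with interrupts
already disabled.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only: one flag stands in for the CPU interrupt-enable bit. */
static bool irqs_enabled = true;

/* Stand-in for raw_spin_lock_irq(): disables interrupts unconditionally. */
static void model_lock_irq(void)
{
	irqs_enabled = false;
}

/* Stand-in for raw_spin_unlock_irq(): enables interrupts unconditionally. */
static void model_unlock_irq(void)
{
	irqs_enabled = true;
}

/* Stand-in for raw_spin_lock_irqsave(): remembers the state, then disables. */
static bool model_lock_irqsave(void)
{
	bool flags = irqs_enabled;

	irqs_enabled = false;
	return flags;
}

/* Stand-in for raw_spin_unlock_irqrestore(): puts the saved state back. */
static void model_unlock_irqrestore(bool flags)
{
	irqs_enabled = flags;
}

/* Caller enters with interrupts off; returns the state after the _irq pair. */
static bool irqs_after_irq_variant(void)
{
	irqs_enabled = false;		/* assumed state on entry */
	model_lock_irq();
	model_unlock_irq();
	return irqs_enabled;
}

/* Same entry state; returns the state after the _irqsave/_irqrestore pair. */
static bool irqs_after_irqsave_variant(void)
{
	bool flags;

	irqs_enabled = false;		/* assumed state on entry */
	flags = model_lock_irqsave();
	model_unlock_irqrestore(flags);
	return irqs_enabled;
}
```

In this model the _irq pair leaves interrupts enabled even though the
caller had them disabled, while the irqsave pair hands back the state it
found.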

I wonder if we should just have a special flag sent by that sysrq
trigger. Since it is causing all tasks to go "nice" there's no need to
do the pi chain walk in __sched_setscheduler().
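
Roughly what I have in mind, again as a userspace sketch and not a patch.
The skip_pi argument and all model_*() names are hypothetical; the real
__sched_setscheduler() signature is not reproduced here.

```c
#include <assert.h>
#include <stdbool.h>

/* Counts how often the (modelled) pi chain walk ran. */
static int pi_chain_walks;

/* Stand-in for the pi chain walk, the part that is unsafe from irq context. */
static void model_pi_chain_walk(void)
{
	pi_chain_walks++;
}

/*
 * Stand-in for the scheduler change. skip_pi is the hypothetical
 * "special flag" the sysrq trigger would pass.
 */
static int model_setscheduler(int *task_prio, int new_prio, bool skip_pi)
{
	*task_prio = new_prio;

	if (!skip_pi)
		model_pi_chain_walk();

	return 0;
}

/* Normal caller: the pi chain walk runs. Returns the number of walks. */
static int walks_for_normal_call(void)
{
	int prio = 50;

	pi_chain_walks = 0;
	model_setscheduler(&prio, 0, false);
	return pi_chain_walks;
}

/* sysrq 'n' caller: passes the flag, so no walk. Returns the number of walks. */
static int walks_for_sysrq_call(void)
{
	int prio = 50;

	pi_chain_walks = 0;
	model_setscheduler(&prio, 0, true);
	return pi_chain_walks;
}
```

The point is only that the sysrq path would never reach the pi code, so
the irq-enabling unlocks in that path would not matter for it.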

-- Steve