Re: [RFC PATCH RT 4/4] rcutorture: Avoid problematic critical section nesting

From: Scott Wood
Date: Thu Jun 27 2019 - 16:16:25 EST


On Thu, 2019-06-27 at 11:00 -0700, Paul E. McKenney wrote:
> On Wed, Jun 26, 2019 at 11:49:16AM -0500, Scott Wood wrote:
> > On Wed, 2019-06-26 at 11:08 -0400, Steven Rostedt wrote:
> > > On Fri, 21 Jun 2019 16:59:55 -0700
> > > "Paul E. McKenney" <paulmck@xxxxxxxxxxxxx> wrote:
> > >
> > > > I have no objection to the outlawing of a number of these sequences in
> > > > mainline, but am rather pointing out that until they really are outlawed
> > > > and eliminated, rcutorture must continue to test them in mainline.
> > > > Of course, an rcutorture running in -rt should avoid testing things that
> > > > break -rt, including these sequences.
> > >
> > > We should update lockdep to complain about these sequences. That would
> > > "outlaw" them in mainline. That is, after we clean up all the current
> > > sequences in the code. And we also need to get Linus's approval of this
> > > as I believe he was against enforcing this in the past.
> >
> > Was the opposition to prohibiting some specific sequence? It's only certain
> > misnesting scenarios that are problematic. The rcu_read_lock/
> > local_irq_disable restriction can be dropped with the IPI-to-self added in
> > Paul's tree. Are there any known instances of the other two (besides
> > rcutorture)?
>
> Given the failure scenario Sebastian Siewior reported today, there
> apparently are some, at least when running threaded interrupt handlers.

That's the RCU misnesting, which it looks like we can allow with the
IPI-to-self; I was asking about the other two. I suppose if we really need
to, we could work around preempt_disable()/local_irq_disable()/
preempt_enable()/local_irq_enable() by having preempt_enable() do an
IPI-to-self if need_resched is set and IRQs are disabled. The RT
local_bh_disable() atomic/non-atomic misnesting would be more difficult, but
I don't think it's impossible. I've got lazy migrate disable working; it
started as an attempt to deal with misnesting but turned out to give a huge
speedup as well, and I will send it as soon as I take care of a loose end in
the deadline scheduler. It's possible that something similar could be done
with the softirq lock, and given that I saw a slowdown when that lock was
introduced, it may also be worth doing just for performance.
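
To make the preempt_enable() idea above concrete, here is a minimal sketch
as a toy userspace model. It is not kernel code and not a proposed patch:
struct cpu_model, every model_*() helper, and run_misnested_sequence() are
hypothetical names invented for illustration, and the "IPI-to-self" is
reduced to a pending flag that is serviced when the model re-enables IRQs.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy single-CPU state; an assumption for illustration, not kernel data. */
struct cpu_model {
	int preempt_count;
	bool irqs_disabled;
	bool need_resched;
	bool self_ipi_pending;	/* stands in for a queued IPI-to-self */
	int reschedules;	/* how many times the model "scheduled" */
};

static void model_schedule(struct cpu_model *c)
{
	c->need_resched = false;
	c->reschedules++;
}

static void model_preempt_disable(struct cpu_model *c)
{
	c->preempt_count++;
}

static void model_local_irq_disable(struct cpu_model *c)
{
	c->irqs_disabled = true;
}

/* The workaround: defer via IPI-to-self when IRQs are still disabled. */
static void model_preempt_enable(struct cpu_model *c)
{
	assert(c->preempt_count > 0);
	if (--c->preempt_count > 0 || !c->need_resched)
		return;
	if (c->irqs_disabled)
		c->self_ipi_pending = true;	/* cannot schedule here */
	else
		model_schedule(c);
}

/* Re-enabling IRQs lets the pending self-IPI run and reschedule. */
static void model_local_irq_enable(struct cpu_model *c)
{
	c->irqs_disabled = false;
	if (c->self_ipi_pending) {
		c->self_ipi_pending = false;
		if (c->preempt_count == 0 && c->need_resched)
			model_schedule(c);
	}
}

/*
 * Runs the misnested sequence with a wakeup arriving inside it.
 * Stores the reschedule count seen before IRQs are re-enabled in
 * *before_irq_enable and returns the final count.
 */
int run_misnested_sequence(int *before_irq_enable)
{
	struct cpu_model c = { 0 };

	model_preempt_disable(&c);
	model_local_irq_disable(&c);
	c.need_resched = true;		/* wakeup while in the section */
	model_preempt_enable(&c);	/* IRQs off: only queues the IPI */
	*before_irq_enable = c.reschedules;
	model_local_irq_enable(&c);	/* IPI fires, reschedule happens */
	return c.reschedules;
}
```

In this model the reschedule request is not lost across the misnested
sequence; it is only delayed until the final local_irq_enable(). Whether
that delay is acceptable on RT is exactly the open question.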

BTW, it's not clear to me whether the failure Sebastian saw was due to the
bare irq disabled version, which was what I was talking about prohibiting
(he didn't show the context that was interrupted). The version where
preempt is disabled (with or without irqs being disabled inside the preempt
disabled region) definitely happens and is what I was trying to address with
patch 3/4.

-Scott