Re: [RFC PATCH RT 4/4] rcutorture: Avoid problematic critical section nesting

From: Scott Wood
Date: Thu Jun 27 2019 - 18:46:35 EST


On Thu, 2019-06-27 at 13:50 -0700, Paul E. McKenney wrote:
> On Thu, Jun 27, 2019 at 03:16:09PM -0500, Scott Wood wrote:
> > On Thu, 2019-06-27 at 11:00 -0700, Paul E. McKenney wrote:
> > > On Wed, Jun 26, 2019 at 11:49:16AM -0500, Scott Wood wrote:
> > > > >
> > > > > On Fri, 21 Jun 2019 16:59:55 -0700
> > > > > "Paul E. McKenney" <paulmck@xxxxxxxxxxxxx> wrote:
> > > > >
> > > > > > I have no objection to the outlawing of a number of these
> > > > > > sequences
> > > > > > in
> > > > > > mainline, but am rather pointing out that until they really are
> > > > > > outlawed
> > > > > > and eliminated, rcutorture must continue to test them in
> > > > > > mainline.
> > > > > > Of course, an rcutorture running in -rt should avoid testing
> > > > > > things
> > > > > > that
> > > > > > break -rt, including these sequences.
> > > > >
> > > > > sequences in the code. And we also need to get Linus's approval of
> > > > > this
> > > > > as I believe he was against enforcing this in the past.
> > > >
> > > > Was the opposition to prohibiting some specific sequence? It's only
> > > > certain
> > > > misnesting scenarios that are problematic. The rcu_read_lock/
> > > > local_irq_disable restriction can be dropped with the IPI-to-self
> > > > added
> > > > in
> > > > Paul's tree. Are there any known instances of the other two
> > > > (besides
> > > > rcutorture)?
>
> If by IPI-to-self you mean the IRQ work trick, that isn't implemented
> across all architectures yet, is it?

Right... smp_send_reschedule() has wider coverage, but even then there's
some hardware that just can't do it reasonably (e.g. pre-APIC x86). So I
guess the options are:

1. Accept that such hardware might experience delayed grace period
completion in certain configurations,
2. Have such hardware check for need_resched in local_irq_enable() (not nice
if sharing a kernel build with hardware that doesn't need it), or
3. Forbid the sequence (enforced by debug checks). Again, this would only
prohibit rcu_read_lock()/local_irq_disable()/rcu_read_unlock()/
local_irq_enable() *without* preempt disabling around the IRQ-disabled
region.

> Why not simply make rcutorture cyheck whether it is running in a
> PREEMPT_RT_FULL environment and avoid the PREEMPT_RT_FULL-unfriendly
> testing only in that case?
>
> And should we later get to a place where the PREEMPT_RT_FULL-unfriendly
> scenarios are prohibited across all kernel configurations, then the module
> parameter can be removed. Again, until we know (as opposed to suspect)
> that these scenarios really don't happen, mainline rcutorture must
> continue testing them.

Yes, I already acknowledged that debug checks detecting the sequences should
come before the test removal (including this patch as an RFC at this point
was mainly meant as a demonstration of what's needed to get rcutorture to
pass), but it'd be nice to have some idea of whether there would be
opposition to the concept before coding up the checks. I'd rather not
continue the state of "these sequences can blow up on RT and we don't know
if they exist or not" any longer than necessary. Plus, only one of the
sequences is exclusively an RT issue (though it's the one with the worst
consequences).

-Scott