Re: [PATCH RFC v2 tip/core/rcu 03/22] rcutorture: Add flag to produce non-busy-wait task stalls

From: Paul E. McKenney
Date: Thu Mar 19 2020 - 11:22:57 EST


On Thu, Mar 19, 2020 at 09:39:47PM +0800, Hillf Danton wrote:
>
> On Thu, 19 Mar 2020 05:38:12 -0700 "Paul E. McKenney" wrote:
> >
> > On Thu, Mar 19, 2020 at 06:46:14PM +0800, Hillf Danton wrote:
> > >
> > > On Wed, 18 Mar 2020 17:10:41 -0700
> > > > static int rcu_torture_stall(void *args)
> > > > {
> > > > + int idx;
> > > > unsigned long stop_at;
> > > >
> > > > VERBOSE_TOROUT_STRING("rcu_torture_stall task started");
> > > > @@ -1610,21 +1612,22 @@ static int rcu_torture_stall(void *args)
> > > > if (!kthread_should_stop()) {
> > > > stop_at = ktime_get_seconds() + stall_cpu;
> > > > /* RCU CPU stall is expected behavior in following code. */
> > > > - rcu_read_lock();
> > > > + idx = cur_ops->readlock();
> > > > if (stall_cpu_irqsoff)
> > > > local_irq_disable();
> > > > - else
> > > > + else if (!stall_cpu_block)
> > > > preempt_disable();
> > > > pr_alert("rcu_torture_stall start on CPU %d.\n",
> > > > - smp_processor_id());
> > > > + raw_smp_processor_id());
> > > > while (ULONG_CMP_LT((unsigned long)ktime_get_seconds(),
> > > > stop_at))
> > > > - continue; /* Induce RCU CPU stall warning. */
> > > > + if (stall_cpu_block)
> > > > + schedule_timeout_uninterruptible(HZ);
> > >
> > > Why is the scheduled-in task so special that it will be running on
> > > the current CPU with irq disabled?
> >
> > You lost me on this one.
>
> Quite likely :)
>
> > IRQs are not at all disabled.
> >
> > > > if (stall_cpu_irqsoff)
> > > > local_irq_enable();
>
> Local IRQs get enabled here depending on stall_cpu_irqsoff.
>
> What I was asking is the scheduling case like
>
> local_irq_disable();
> schedule_timeout(HZ);
> local_irq_enable();
>
> Is it likely going to be ruled out in this patch?

If an rcutorture run specified both the rcutorture.stall_cpu_irqsoff and
the rcutorture.stall_cpu_block module parameters, then yes, exactly the
sequence you call out should occur. Can't say that I have tried this,
though. Nor would I expect to have ever done so without your suggesting
that I do.

But why not try it on current -rcu?

tools/testing/selftests/rcutorture/bin/kvm.sh --cpus 12 --duration 3 --configs "TRACE01" --bootargs "rcutorture.stall_cpu=25 rcutorture.stall_cpu_holdoff=30 rcutorture.stall_cpu_block=1 rcupdate.rcu_task_stall_timeout=10000 rcutorture.stall_cpu_irqsoff"

This tells rcutorture to use all 12 hardware threads, to run the kernel for
three minutes, to run only the TRACE01 rcutorture scenario, and to test
RCU CPU stall warnings:

rcutorture.stall_cpu=25: Stall the CPU for 25 seconds.

rcutorture.stall_cpu_holdoff=30: Wait 30 seconds after boot to start stalling.

rcutorture.stall_cpu_block=1: Do the schedule_timeout_uninterruptible()
while stalling.

rcupdate.rcu_task_stall_timeout=10000: Set the stall-warning timeout
to 10,000 jiffies, or ten seconds.

rcutorture.stall_cpu_irqsoff: This tells rcutorture to execute the
local_irq_disable() that you called out above.

And this results in a couple of stall warning messages, as expected
given that you get two ten-second intervals in a 25-second interval.

No other complaints, though. And of course what happens is that
__schedule() enables interrupts on first call (as it must do) and they
remain enabled past that point. Then local_irq_enable() redundantly
enables them.

> Or is it anything by design?

So no, not by design. I don't see any reason to change it. After all,
if you are running rcutorture and also asking rcutorture to make CPU
stall warnings, you have to expect a bit of noise from the kernel.
Testing that noise is after all the whole point. ;-)

Thanx, Paul

> > > > - else
> > > > + else if (!stall_cpu_block)
> > > > preempt_enable();
> > > > - rcu_read_unlock();
> > > > + cur_ops->readunlock(idx);
> > > > pr_alert("rcu_torture_stall end.\n");
> > > > }
> > > > torture_shutdown_absorb("rcu_torture_stall");
> > > > --
> > > > 2.9.5
>