Re: [PATCH] rcu: Make jiffies_till_sched_qs writable

From: Byungchul Park
Date: Thu Jul 18 2019 - 21:11:18 EST


On Thu, Jul 18, 2019 at 08:52:52PM -0400, Joel Fernandes wrote:
> On Thu, Jul 18, 2019 at 8:40 PM Byungchul Park <byungchul.park@xxxxxxx> wrote:
> [snip]
> > > - There is a bug in the CPU stopper machinery itself preventing it
> > > from scheduling the stopper on Y. Even though Y is not holding up the
> > > grace period.
> >
> > Or any thread on Y is busy with preemption/irq disabled preventing the
> > stopper from being scheduled on Y.
> >
> > Or something is stuck in ttwu() to wake up the stopper on Y due to any
> > scheduler locks such as pi_lock or rq->lock or something.
> >
> > I think what you mentioned can happen easily.
> >
> > Basically we would need information about preemption/irq disabled
> > sections on Y and scheduler's current activity on every cpu at that time.
>
> I think all that's needed is an NMI backtrace on all CPUs. On ARM we
> don't have NMI solutions and only IPI or interrupt based backtrace
> works which should at least catch the preempt disable and softirq
> disable cases.
>
> But yeah I don't see why just the stacks of those CPUs that are
> blocking the CPU X would not suffice for the trivial cases where a
> piece of misbehaving code disabled interrupts / preemption and
> prevented the stopper thread from executing.

Right. So that makes it more interesting, though! :-)

> Maybe once the test case is ready (no rush!), it will be clearer
> what can help.

Yes. I'm really happy to help get the things about RCU that I love
fixed or improved. And with the test case or a real issue, I believe
I can do more helpful work. Looking forward to it, too (no rush!).

Thanks,
Byungchul