Re: [RFC 2/2] rcu: Remove ->dynticks_nmi_nesting from struct rcu_dynticks
From: Andy Lutomirski
Date: Fri Jun 22 2018 - 10:19:31 EST
On Fri, Jun 22, 2018 at 6:26 AM Paul E. McKenney
<paulmck@xxxxxxxxxxxxxxxxxx> wrote:
>
> On Thu, Jun 21, 2018 at 10:56:59PM -0700, Joel Fernandes wrote:
> > Hi Paul,
> >
> > On Wed, Jun 20, 2018 at 09:49:02AM -0700, Paul E. McKenney wrote:
> > > On Thu, Jun 21, 2018 at 01:05:22AM +0900, Byungchul Park wrote:
> > > > On Wed, Jun 20, 2018 at 11:58 PM, Paul E. McKenney
> > > > <paulmck@xxxxxxxxxxxxxxxxxx> wrote:
> > > > > On Wed, Jun 20, 2018 at 05:47:20PM +0900, Byungchul Park wrote:
> > > > >> Hello folks,
> > > > >>
> > > > >> I'm careful in saying that ->dynticks_nmi_nesting can be removed but I
> > > > >> think it's possible since the only thing we are interested in with
> > > > >> regard to ->dynticks_nesting or ->dynticks_nmi_nesting is whether rcu is
> > > > >> idle or not.
> > > > >
> > > > > Please keep in mind that NMIs cannot be masked, which means that the
> > > > > rcu_nmi_enter() and rcu_nmi_exit() pair can be invoked at any point in
> > > > > the process, between any consecutive pair of instructions. The saving
> > >
> > > And yes, I should have looked at this patch more closely before replying.
> > > But please see below.
> > >
> > > > I believe I understand what NMI is and why you introduced
> > > > ->dynticks_nmi_nesting. Or am I missing something?
> > >
> > > Perhaps the fact that there are architectures that can enter interrupt
> > > handlers and never leave them when the CPU is non-idle. One example of
> > > this is the usermode upcalls in the comment that you removed.
> >
> > I spent some time tonight and last night trying to understand this concept of
> > never leaving an interrupt. I hope you don't mind me asking this dumb
> > question... perhaps I will learn something: could you let me know how it is
> > possible that an interrupt never exits?
> >
> > Typically an interrupt never exiting sounds like a hard lockup. This is how
> > the hardlockup detector works: since regular interrupts in Linux can't nest, the
> > hardlockup detector checks if hrtimer interrupts are being handled and if
> > not, then it throws a splat, panics the kernel, etc. So I am a bit troubled by
> > this interrupt never exiting concept.
> >
> > Further since an interrupt is an atomic context, it cannot sleep or schedule
> > into usermode so how are these upcalls handled from the interrupt?
>
> It has been some years since I traced the code flow, but what happened
> back then is that it switches itself from being an interrupt handler to not
> being one, without actually returning from the interrupt. This can only happen when
> interrupting a non-idle process, thankfully, and RCU's dyntick-idle code
> relies on this restriction. If I remember correctly, the code ends up
> executing in the context of the interrupted process, but it has been some
> years, so please apply appropriate skepticism.
...
>
> I have never seen NMIs be unpaired or improperly nested. However,
> given that rcu_irq_enter() invokes rcu_nmi_enter() and rcu_irq_exit()
> invokes rcu_nmi_exit(), it is definitely the case that rcu_nmi_enter()
> and rcu_nmi_exit() need to deal with unpaired and improperly nested
> invocations.

This is very strange. There are certainly cases in x86 where an
interrupt-ish code path can become less interrupt-ish without
returning (killing a task that overflows a kernel stack is an
example), but the RCU calls should still nest correctly. Do you know
the history of this requirement?
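
For concreteness, here is a minimal user-space sketch of how a nesting counter can tolerate an enter call that never gets its matching exit, as long as the unpaired entry only happens while non-idle. It is a toy model under stated assumptions, not the kernel's code: every toy_* name is hypothetical, the single "watching" flag stands in for the real ->dynticks counter, and the trick shown (bias the count while non-idle, reset it on the way into idle) is only one possible way to meet the requirement discussed above.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical toy model, not kernel code. */

/* Large bias held in the counter while the CPU is non-idle. */
#define TOY_IRQ_NONIDLE ((LONG_MAX / 2) + 1)

struct toy_dynticks {
	long nmi_nesting;	/* irq/NMI nesting depth, biased when non-idle */
	bool watching;		/* true while RCU is watching this CPU */
};

/* Going idle: forcibly forget any unbalanced count left over from
 * interrupt-ish paths that never returned while non-idle. */
static void toy_idle_enter(struct toy_dynticks *d)
{
	d->nmi_nesting = 0;
	d->watching = false;
}

/* Leaving idle: restore the non-idle bias. */
static void toy_idle_exit(struct toy_dynticks *d)
{
	d->watching = true;
	d->nmi_nesting = TOY_IRQ_NONIDLE;
}

/* Entry from irq or NMI: add 1 if this entry is the one that wakes RCU
 * up from idle, 2 otherwise, so a count of exactly 1 marks the
 * outermost handler that interrupted idle. */
static void toy_nmi_enter(struct toy_dynticks *d)
{
	long incby = 2;

	if (!d->watching) {
		d->watching = true;
		incby = 1;
	}
	d->nmi_nesting += incby;
}

/* Exit from irq or NMI: only the outermost exit back to idle stops
 * RCU from watching. */
static void toy_nmi_exit(struct toy_dynticks *d)
{
	if (d->nmi_nesting != 1) {
		d->nmi_nesting -= 2;
		return;
	}
	d->nmi_nesting = 0;
	d->watching = false;
}
```

In this model an unpaired toy_nmi_enter() from a non-idle context just leaves the biased count slightly too high, and the next toy_idle_enter() discards the error, so nesting from idle still works correctly afterwards.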