Re: [GIT pull] locking/urgent for v5.10-rc6
From: Peter Zijlstra
Date: Tue Dec 01 2020 - 06:08:12 EST
On Tue, Dec 01, 2020 at 09:07:34AM +0100, Peter Zijlstra wrote:
> On Mon, Nov 30, 2020 at 08:31:32PM +0100, Christian Borntraeger wrote:
> > On 30.11.20 19:04, Linus Torvalds wrote:
> > > On Mon, Nov 30, 2020 at 5:03 AM Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> > >>
> > >>> But but but...
> > >>>
> > >>> do_idle() # IRQs on
> > >>> local_irq_disable(); # IRQs off
> > >>> default_idle_call() # IRQs off
> > >> lockdep_hardirqs_on(); # IRQs off, but lockdep thinks they're on
> > >>> arch_cpu_idle() # IRQs off
> > >>> enabled_wait() # IRQs off
> > >>> raw_local_save() # still off
> > >>> psw_idle() # very much off
> > >>> ext_int_handler # get an interrupt ?!?!
> > >> rcu_irq_enter() # lockdep thinks IRQs are on <- FAIL
> > >>
> > >> I can't much read s390 assembler, but ext_int_handler() has a
> > >> TRACE_IRQS_OFF, which would be sufficient to re-align the lockdep state
> > >> with the actual state, but there's some condition before it, what's that
> > >> test and is that right?
> > >
> > > I think that "psw_idle()" enables interrupts, exactly like x86 does.
>
> (like ye olde x86, modern x86 idles with interrupts disabled)
>
> > Yes, by definition. Otherwise it would be a software error state.
> > The interesting part is the lpswe instruction at the end (load PSW)
> > which loads the full PSW, which contains interrupt enablement, wait bit,
> > condition code, paging enablement, machine check enablement, the address
> > and others. The idle psw is enabled for interrupts and has the wait bit
> > set. If the wait bit is set and interrupts are off this is called "disabled
> > wait" and is used for panic, shutdown etc.
>
> OK, but at that point, hardware interrupt state is on, lockdep thinks
> it's on. And we take an interrupt, just like any old regular interrupt
> enabled region.
>
> But then the exception handler (ext_int_handler), which I'm assuming is
> run by the hardware with hardware interrupts disabled again, should be
> calling into lockdep to tell it interrupts were disabled. IOW that
> TRACE_IRQS_OFF bit in there.
>
> But that doesn't seem to be working right. Why? Because afaict this is
> then the exact normal flow of things, but it's only going sideways
> during this idle thing.
>
> What's going 'funny' ?

So after having talked to Sven a bit, the thing that is happening is
that this is the one place where we take interrupts with RCU being
disabled. Normally RCU is watching and all is well, except during idle.
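
For illustration only, here is a user-space toy model of the trace quoted
above. It is not kernel code and does not reproduce the s390 entry path:
struct cpu_state, the model_*() helpers and the tell_lockdep parameter are all
hypothetical names made up for this sketch. It only tracks the hardware IRQ
state, what lockdep believes, and whether RCU is watching, to show where the
two views of the IRQ state can drift apart.

```c
#include <stdbool.h>

/* Toy model, NOT kernel code; every name here is hypothetical. */
struct cpu_state {
	bool hw_irqs_on;	/* what the hardware (PSW) says */
	bool lockdep_irqs_on;	/* what lockdep believes */
	bool rcu_watching;	/* false while the CPU is idle from RCU's view */
};

/* True when lockdep's view matches the hardware state. */
static bool states_agree(const struct cpu_state *s)
{
	return s->hw_irqs_on == s->lockdep_irqs_on;
}

/* Models do_idle() up to and including lockdep_hardirqs_on(). */
static void model_idle_entry(struct cpu_state *s)
{
	s->hw_irqs_on = false;		/* local_irq_disable() */
	s->lockdep_irqs_on = false;
	s->rcu_watching = false;	/* RCU stops watching for idle */
	s->lockdep_irqs_on = true;	/* lockdep_hardirqs_on(), ahead of hw */
}

/* Models psw_idle(): the idle PSW is enabled for interrupts. */
static void model_psw_idle(struct cpu_state *s)
{
	s->hw_irqs_on = true;
}

/*
 * Models interrupt entry: hardware turns IRQs off; the handler informs
 * lockdep only when tell_lockdep is set (the TRACE_IRQS_OFF step).
 */
static void model_irq_entry(struct cpu_state *s, bool tell_lockdep)
{
	s->hw_irqs_on = false;
	if (tell_lockdep)
		s->lockdep_irqs_on = false;
}

/*
 * Models the check hit in the trace: RCU starts watching again, and the
 * call succeeds only if lockdep believes IRQs are off.
 */
static bool model_rcu_irq_enter(struct cpu_state *s)
{
	s->rcu_watching = true;
	return !s->lockdep_irqs_on;
}
```

Walking a state through model_idle_entry(), model_psw_idle() and then
model_irq_entry() without the lockdep update makes model_rcu_irq_enter()
fail, which mirrors the "<- FAIL" line in the trace; with the update it
passes.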