Re: [GIT pull] locking/urgent for v5.10-rc6

From: Heiko Carstens
Date: Wed Dec 02 2020 - 02:55:32 EST


> > But but but...
> >
> > do_idle() # IRQs on
> > local_irq_disable(); # IRQs off
> > defaul_idle_call() # IRQs off
> lockdep_hardirqs_on(); # IRQs off, but lockdep things they're on
> > arch_cpu_idle() # IRQs off
> > enabled_wait() # IRQs off
> > raw_local_save() # still off
> > psw_idle() # very much off
> > ext_int_handler # get an interrupt ?!?!
> rcu_irq_enter() # lockdep thinks IRQs are on <- FAIL
>
> I can't much read s390 assembler, but ext_int_handler() has a
> TRACE_IRQS_OFF, which would be sufficient to re-align the lockdep state
> with the actual state, but there's some condition before it, what's that
> test and is that right?

After digging a bit into our asm code: no, it is not right, and only
for psw_idle() it is wrong.

What happens with the current code:

- default_idle_call() calls lockdep_hardirqs_on() before calling into
arch_cpu_idle()

- our arch_cpu_idle() calls psw_idle() which enables irqs. the irq
handler will call/use the SWITCH_ASYNC macro which clears the
interrupt enabled bits in the old program status word (_only_ for
psw_idle)

- this again causes the interrupt handler to _not_ call TRACE_IRQS_OFF
and therefore lockdep thinks interrupts are enabled within the
interrupt handler

So I guess my patch which I sent yesterday evening should fix all that
mess - plus an explicit trace_hardirqs_off() call in our udelay
implementation is required now.

I'll send a proper patch later.