Re: [PATCH] padata: fix lockdep warning in padata serialization

From: Daniel Jordan
Date: Tue Sep 20 2022 - 10:14:12 EST


Hi Steffen,

On Tue, Sep 20, 2022 at 07:54:43AM +0200, Steffen Klassert wrote:
> On Mon, Sep 19, 2022 at 09:47:11PM -0400, Daniel Jordan wrote:
> > On Tue, Sep 20, 2022 at 08:39:08AM +0800, eadavis@xxxxxxxx wrote:
> > > From: Edward Adam Davis <eadavis@xxxxxxxx>
> > >
> > > On Mon, 19 Sep 2022 11:12:48 -0400, Daniel Jordan wrote:
> > > > Hi Edward,
> > > >
> > > > On Mon, Sep 19, 2022 at 09:05:55AM +0800, eadavis@xxxxxxxx wrote:
> > > > > From: Edward Adam Davis <eadavis@xxxxxxxx>
> > > > >
> > > > > Parallelized object serialization uses spin_unlock for unlocking a spin lock
> > > > > that was previously locked with spin_lock.
> > > >
> > > > There's nothing unusual about that, though?
> > > >
> > > > > This caused the following lockdep warning about an inconsistent lock
> > > > > state:
> > > > >
> > > > > inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-W} usage.
> > > >
> > > > Neither HARDIRQ-ON-W nor IN-HARDIRQ-W appear in the syzbot report, did
> > > > you mean SOFTIRQ-ON-W and IN-SOFTIRQ-W?
> > > Yes, I meant to say: inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
> > > >
> > > > > We must use spin_lock_irqsave, because it is possible to trigger tipc
> > > > > from an irq handler.
> > > >
> > > > A softirq handler, not a hardirq handler. I'd suggest using
> > > > spin_lock_bh() instead of _irqsave in your patch.
> > > I think _irqsave is better than _bh: it saves the irq context, which _bh
> > > does not, and the tipc call trace contains SOFTIRQ-ON-W and IN-SOFTIRQ-W.
> >
> > _irqsave saving the context is about handling nested hardirq disables.
> > It's not needed here since we don't need to care about disabling
> > hardirq.
> >
> > _bh is for disabling softirq, a different context from hardirq. We want
> > _bh here since the deadlock happens when a CPU takes the lock in both
> > task and softirq context. padata uses _bh lock variants because it can
> > be called in softirq context but not hardirq. Let's be consistent and
> > do it in this case too.
>
> padata_do_serial is called with BHs off, so using spin_lock_bh should not
> fix anything here. I guess the problem is that we call padata_find_next
> after we enabled the BHs in padata_reorder.

Yeah, padata_do_serial can be called with BHs off, like in the tipc
stack, but there are also cases where BHs can be on, like lockdep said
here:

{SOFTIRQ-ON-W} state was registered at:
...
padata_do_serial+0x21e/0x4b0 kernel/padata.c:392
...

Line 392 is in padata_do_serial itself, not padata_reorder or
padata_find_next.
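
To make the two states concrete, here is a rough userspace sketch of the
bookkeeping involved. It is a toy model, not kernel code and not how lockdep
is actually implemented: every toy_* name is made up for illustration, and
the only thing it models is "was this lock ever taken with softirqs enabled"
versus "was it ever taken from softirq context".

```c
#include <stdbool.h>

/* Toy stand-in for the softirq usage bits of one lock class (hypothetical). */
struct toy_lock_class {
	bool softirq_on_w;	/* taken at least once with softirqs enabled */
	bool in_softirq_w;	/* taken at least once from softirq context */
};

/* Toy stand-in for the context of the caller taking the lock (hypothetical). */
struct toy_ctx {
	bool in_softirq;	/* running inside a softirq handler */
	bool bh_disabled;	/* caller already has BHs off */
};

/*
 * Record one acquisition. use_bh_variant models spin_lock_bh(), which turns
 * BHs off before taking the lock; false models plain spin_lock().
 * Returns false once both usage bits are set, i.e. the usage is inconsistent.
 */
static bool toy_acquire(struct toy_lock_class *cls, const struct toy_ctx *ctx,
			bool use_bh_variant)
{
	bool bh_off = ctx->in_softirq || ctx->bh_disabled || use_bh_variant;

	if (ctx->in_softirq)
		cls->in_softirq_w = true;
	else if (!bh_off)
		cls->softirq_on_w = true;

	return !(cls->softirq_on_w && cls->in_softirq_w);
}

/* Take the lock once from task context with BHs on, then once from softirq. */
static bool toy_scenario(bool use_bh_variant)
{
	struct toy_lock_class cls = { false, false };
	struct toy_ctx task = { .in_softirq = false, .bh_disabled = false };
	struct toy_ctx softirq = { .in_softirq = true, .bh_disabled = true };
	bool ok;

	ok = toy_acquire(&cls, &task, use_bh_variant);
	ok = toy_acquire(&cls, &softirq, use_bh_variant) && ok;
	return ok;
}
```

In this model toy_scenario(false) ends up inconsistent, which corresponds to
the {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} report, while toy_scenario(true) stays
consistent because the task-context acquisition never happens with BHs on. A
caller that already has BHs off, as in the tipc stack, never sets the
SOFTIRQ-ON-W bit with either variant.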