Re: [PATCH] padata: fix lockdep warning in padata serialization
From: Daniel Jordan
Date: Thu Sep 22 2022 - 22:08:09 EST

On Thu, Sep 22, 2022 at 12:55:37PM +0200, Steffen Klassert wrote:
> On Wed, Sep 21, 2022 at 02:51:38PM -0400, Daniel Jordan wrote:
> > On Wed, Sep 21, 2022 at 09:36:16AM +0200, Steffen Klassert wrote:
> > > On Tue, Sep 20, 2022 at 10:10:57AM -0400, Daniel Jordan wrote:
> > > > Yeah, padata_do_serial can be called with BHs off, like in the tipc
> > > > stack, but there are also cases where BHs can be on, like lockdep said
> > > > here:
> > >
> > > padata_do_serial was designed to run with BHs off, it is a bug if it
> > > runs with BHs on. But I don't see a case where this can happen. The
> > > only user of padata_do_serial is pcrypt in its serialization callbacks
> > > (pcrypt_aead_enc, pcrypt_aead_dec) and the async crypto callback
> > > pcrypt_aead_done. pcrypt_aead_enc and pcrypt_aead_dec are issued via
> > > the padata_serial_worker with the padata->serial call. BHs are
> > > off here. The crypto callback also runs with BHs off.
> > >
> > > What do I miss here?
> >
> > Ugh.. this newer, buggy part of padata_do_parallel:
> >
> > /* Maximum works limit exceeded, run in the current task. */
> > padata->parallel(padata);
>
> Oh well...
>
> > This skips the usual path in padata_parallel_worker, which disables BHs.
> > They should be left off in the above case too.
> >
> > What about this?
> >
> > ---8<---
> >
> > Subject: [PATCH] padata: always leave BHs disabled when running ->parallel()
> >
> > A deadlock can happen when an overloaded system runs ->parallel() in the
> > context of the current task:
> >
> >   padata_do_parallel
> >     ->parallel()
> >       pcrypt_aead_enc/dec
> >         padata_do_serial
> >           spin_lock(&reorder->lock) // BHs still enabled
> >             <interrupt>
> >               ...
> >                 __do_softirq
> >                   ...
> >                     padata_do_serial
> >                       spin_lock(&reorder->lock)
> >
> > It's a bug for BHs to be on in _do_serial as Steffen points out, so
> > ensure they're off in the "current task" case like they are in
> > padata_parallel_worker to avoid this situation.
> >
> > Reported-by: syzbot+bc05445bc14148d51915@xxxxxxxxxxxxxxxxxxxxxxxxx
> > Fixes: 4611ce224688 ("padata: allocate work structures for parallel jobs from a pool")
> > Signed-off-by: Daniel Jordan <daniel.m.jordan@xxxxxxxxxx>
>
> Yes, that makes sense.
>
> Acked-by: Steffen Klassert <steffen.klassert@xxxxxxxxxxx>

Thanks.

> But we also should look at the call to padata_find_next where BHs are
> on. padata_find_next takes the same lock as padata_do_serial, so this
> might be a candidate for a deadlock too.

Yeah, that seems broken; it's now on my list of things to fix. It's
probably worth staring at the rest of the locking for a bit too.
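
For anyone following along, here is a minimal userspace sketch of why a
lock that is also taken from softirq context has to be taken with BHs off
in task context. It is a toy model only, not kernel code and not the
patch: every name below (toy_cpu, toy_lock, toy_serial_section, and so on)
is hypothetical, and the "spinlock" is just a flag that records a deadlock
instead of spinning. The same reasoning applies to any task-context path
that takes reorder->lock with BHs on.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy single-CPU model. None of these names are kernel APIs. */
struct toy_cpu {
	int  bh_disable_depth; /* > 0: softirqs may not run on this CPU */
	bool reorder_locked;   /* stands in for a non-recursive spinlock */
	bool softirq_pending;  /* a softirq was raised while BHs were off */
	bool deadlocked;       /* lock requested while already held here */
};

static void toy_serial_section(struct toy_cpu *c, bool interrupt_inside);

/* Returns false where a real spinlock would spin forever. */
static bool toy_lock(struct toy_cpu *c)
{
	if (c->reorder_locked) {
		c->deadlocked = true;
		return false;
	}
	c->reorder_locked = true;
	return true;
}

static void toy_unlock(struct toy_cpu *c)
{
	assert(c->reorder_locked);
	c->reorder_locked = false;
}

/* On interrupt exit, the softirq runs right away only if BHs are on. */
static void toy_interrupt(struct toy_cpu *c)
{
	if (c->bh_disable_depth == 0)
		toy_serial_section(c, false);
	else
		c->softirq_pending = true;
}

/* Critical section shared by the task path and the softirq path. */
static void toy_serial_section(struct toy_cpu *c, bool interrupt_inside)
{
	if (!toy_lock(c))
		return;
	if (interrupt_inside)
		toy_interrupt(c);
	toy_unlock(c);
}

/* Models the "run in the current task" case; returns true on deadlock. */
static bool toy_run_in_current_task(bool disable_bh)
{
	struct toy_cpu c = {0};

	if (disable_bh)
		c.bh_disable_depth++;

	toy_serial_section(&c, true);

	if (disable_bh) {
		c.bh_disable_depth--;
		/* The deferred softirq runs once BHs are back on. */
		if (c.softirq_pending) {
			c.softirq_pending = false;
			toy_serial_section(&c, false);
		}
	}
	return c.deadlocked;
}
```

In this model, toy_run_in_current_task(false) reports a deadlock because
the softirq re-enters the locked section on the same CPU, while
toy_run_in_current_task(true) defers the softirq until the lock has been
dropped.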