Re: [PATCH] nbd: fix false lockdep deadlock warning
From: Ming Lei
Date: Tue Jul 01 2025 - 22:33:30 EST
On Wed, Jul 02, 2025 at 09:12:09AM +0800, Yu Kuai wrote:
> Hi,
>
> On 2025/07/01 21:28, Nilay Shroff wrote:
> >
> >
> > On 6/28/25 6:18 AM, Yu Kuai wrote:
> > > Hi,
> > >
> > > On 2025/06/27 19:04, Ming Lei wrote:
> > > > I guess the patch in the following link may be simpler; both take a
> > > > similar approach:
> > > >
> > > > https://lore.kernel.org/linux-block/aFjbavzLAFO0Q7n1@fedora/
> > >
> > > I think the above approach has concurrency problems if nbd_start_device
> > > runs concurrently with another nbd_start_device:
> > >
> > > t1:
> > >   nbd_start_device
> > >   lock
> > >   num_connections = 1
> > >   unlock
> > > t2:
> > >   nbd_add_socket
> > >   lock
> > >   config->num_connections++
> > >   unlock
> > > t3:
> > >   nbd_start_device
> > >   lock
> > >   num_connections = 2
> > >   unlock
> > >   blk_mq_update_nr_hw_queues
> > >
> > > t1 (continued):
> > >   blk_mq_update_nr_hw_queues
> > >   //nr_hw_queues updated to 1 before failure
> > >   return -EINVAL
> > >
> >
> > In the above case, yes, I see that t1 would return -EINVAL (as
> > config->num_connections doesn't match num_connections),
> > but then t3 would succeed in updating nr_hw_queues (as both
> > config->num_connections and num_connections are set to 2 this
> > time), wouldn't it? If yes, then the above patch (from Ming)
> > seems good.
>
> Hmm, I'm confused. If you agree with the concurrent sequence, then
> t3 updates nr_hw_queues to 2 first and returns success; later t1 updates
> nr_hw_queues back to 1 and returns failure.

It should be easy to avoid the failure by simply retrying.

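A rough user-space sketch of what such a retry could look like, under stated
assumptions. It is not the code from the linked patch: fake_config,
fake_add_socket, fake_update_nr_hw_queues, fake_start_device, the race hook
and max_retries are all hypothetical names that only model the sequence
quoted above (snapshot num_connections under the lock, update the queue count
outside the lock, re-check, and retry on mismatch).

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for the nbd config; only the fields discussed here. */
struct fake_config {
	pthread_mutex_t lock;
	int num_connections;
	int nr_hw_queues;
};

/* Models nbd_add_socket from the timeline: bump the count under the lock. */
static void fake_add_socket(struct fake_config *c)
{
	pthread_mutex_lock(&c->lock);
	c->num_connections++;
	pthread_mutex_unlock(&c->lock);
}

/* Models the queue update, which is done without holding the lock. */
static void fake_update_nr_hw_queues(struct fake_config *c, int n)
{
	c->nr_hw_queues = n;
}

/*
 * Snapshot the connection count under the lock, update the queue count
 * outside the lock, then re-check. On a mismatch, retry with a fresh
 * snapshot instead of failing right away. The race hook (assumption, for
 * illustration only) runs once, between the first snapshot and the update.
 */
static int fake_start_device(struct fake_config *c,
			     void (*race)(struct fake_config *),
			     int max_retries)
{
	assert(max_retries >= 0);

	for (int attempt = 0; attempt <= max_retries; attempt++) {
		int snapshot, current;

		pthread_mutex_lock(&c->lock);
		snapshot = c->num_connections;
		pthread_mutex_unlock(&c->lock);

		if (race && attempt == 0)
			race(c);

		fake_update_nr_hw_queues(c, snapshot);

		pthread_mutex_lock(&c->lock);
		current = c->num_connections;
		pthread_mutex_unlock(&c->lock);

		if (current == snapshot)
			return 0;
	}
	return -EINVAL;
}
```

In this model, the first pass writes the stale value 1, notices the mismatch,
and the second pass writes 2, so the caller sees success rather than -EINVAL
as long as the count stops changing within max_retries attempts.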
Thanks,
Ming