Re: [RFC PATCH v2 2/2] blk-mq: Lockout tagset iter when freeing rqs

From: Ming Lei
Date: Tue Dec 22 2020 - 08:26:15 EST


On Tue, Dec 22, 2020 at 11:22:19AM +0000, John Garry wrote:
> Resend without ppvk@xxxxxxxxxxxxxx, which bounces for me
>
> On 22/12/2020 02:13, Bart Van Assche wrote:
> > On 12/21/20 10:47 AM, John Garry wrote:
> >> Yes, I agree, and I'm not sure what I wrote to give that impression.
> >>
> >> About "root partition", above, I'm just saying that / is mounted on a
> >> sda partition:
> >>
> >> root@ubuntu:/home/john# mount | grep sda
> >> /dev/sda2 on / type ext4 (rw,relatime,errors=remount-ro,stripe=32)
> >> /dev/sda1 on /boot/efi type vfat (rw,relatime,fmask=0077,dmask=0077,codepage=437,iocharset=iso8859-1,shortname=mixed,errors=remount-ro)
> > Hi John,
> >
>
> Hi Bart, Ming,
>
> > Thanks for the clarification. I want to take back my suggestion about
> > adding rcu_read_lock() / rcu_read_unlock() in blk_mq_tagset_busy_iter()
> > since it is not allowed to sleep inside an RCU read-side critical
> > section, since blk_mq_tagset_busy_iter() is used in request timeout
> > handling and since there may be blk_mq_ops.timeout implementations that
> > sleep.
>
> Yes, that's why I was going with atomic, rather than some synchronization
> primitive which may sleep.
>
> >
> > Ming's suggestion to serialize blk_mq_tagset_busy_iter() and
> > blk_mq_free_rqs() looks interesting to me.
> >
>
> So then we could have something like this:
>
> ---8<---
>
> @@ -435,9 +444,13 @@ void blk_mq_queue_tag_busy_iter(struct request_queue *q, busy_iter_fn *fn,
>  		if (!blk_mq_hw_queue_mapped(hctx))
>  			continue;
>
> +		while (!atomic_inc_not_zero(&tags->iter_usage_counter));
> +
>  		if (tags->nr_reserved_tags)
>  			bt_for_each(hctx, tags->breserved_tags, fn, priv, true);
>  		bt_for_each(hctx, tags->bitmap_tags, fn, priv, false);
>
> +		atomic_dec(&tags->iter_usage_counter);
>  	}

Then it is just a spin_lock variant, so you may have to consider
lock validation.

For example, scsi_host_busy() is called from scsi_log_completion() <- scsi_softirq_done(),
which may run in irq context, so a deadlock can be triggered if the irq
fires while requests are being freed.
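
To make the concern concrete, here is a rough userspace model of the counter
protocol, written with C11 atomics rather than kernel primitives. It is only a
sketch: struct tags_model and all helper names are hypothetical, and the
freeing side (dropping the counter from 1 to 0 and restoring it afterwards) is
an assumption, since that half of the patch is not quoted above. The helpers
are "try" variants so the states can be observed without hanging.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the proposed tags->iter_usage_counter. */
struct tags_model {
	/* 0: frozen by the freeing path; 1: idle; >1: iterators active. */
	atomic_int iter_usage_counter;
};

/* Userspace equivalent of the kernel's atomic_inc_not_zero(). */
static bool inc_not_zero(atomic_int *v)
{
	int old = atomic_load(v);

	while (old != 0) {
		/* On failure, 'old' is reloaded with the current value. */
		if (atomic_compare_exchange_weak(v, &old, old + 1))
			return true;
	}
	return false;
}

/* Iterator side: one attempt, where the patch spins until it succeeds. */
static bool iter_try_enter(struct tags_model *t)
{
	return inc_not_zero(&t->iter_usage_counter);
}

static void iter_exit(struct tags_model *t)
{
	atomic_fetch_sub(&t->iter_usage_counter, 1);
}

/* Freeing side (assumed): succeeds only when no iterator is active. */
static bool free_try_begin(struct tags_model *t)
{
	int expected = 1;

	return atomic_compare_exchange_strong(&t->iter_usage_counter,
					      &expected, 0);
}

static void free_end(struct tags_model *t)
{
	/* Only the path that froze the counter may unfreeze it. */
	assert(atomic_load(&t->iter_usage_counter) == 0);
	atomic_store(&t->iter_usage_counter, 1);
}
```

Between free_try_begin() and free_end() the counter is 0, so iter_try_enter()
fails. If the iterator is the spinning version from the patch and it runs from
an irq that interrupted the freeing path on the same CPU, it spins forever,
because free_end() cannot run until the irq handler returns. That is the usual
reason such locks are taken with irqs disabled.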

thanks,
Ming