Re: [PATCH v2 04/10] block: revert back to synchronous request_queue removal
From: Luis Chamberlain
Date: Mon Apr 20 2020 - 14:59:48 EST
On Sun, Apr 19, 2020 at 03:23:31PM -0700, Bart Van Assche wrote:
> On 4/19/20 12:45 PM, Luis Chamberlain wrote:
> > +/**
> > + * blk_put_queue - decrement the request_queue refcount
> > + *
> > + * @q: the request_queue structure to decrement the refcount for
> > + *
>
> How about following the example from Documentation/doc-guide/kernel-doc.rst
> and not leaving a blank line above the function argument documentation?
Sure.
> > + * Decrements the refcount to the request_queue kobject, when this reaches
> ^^
> of?
> > + * 0 we'll have blk_release_queue() called. You should avoid calling
> > + * this function in atomic context but if you really have to ensure you
> > + * first refcount the block device with bdgrab() / bdput() so that the
> > + * last decrement happens in blk_cleanup_queue().
> > + */
>
> Is calling bdgrab() and bdput() an option from a context in which it is not
> guaranteed that the block device is open?
If the block device is not open, nope. For that blk_get_queue() can
be used, and is used by the block layer. This raises the question:
Do we have *drivers* which require access to the request_queue from
atomic context when the block device is not open?
> Does every context that calls blk_put_queue() also call blk_cleanup_queue()?
Nope.
> How about avoiding confusion by changing the last sentence of that comment
> into something like the following: "The last reference must not be dropped
> from atomic context. If it is necessary to call blk_put_queue() from atomic
> context, make sure that that call does not decrease the request queue
> refcount to zero."
This would be fine, if not for the fact that it also seems worth asking
ourselves whether we even need blk_get_queue() / blk_put_queue() exported
for drivers.
I haven't yet finished my review of this, but adding the above
comment further cements the idea that it is possible. Granted, I think
it's fine as is, since that is our current use case and best practice.
Removing the export for blk_get_queue() / blk_put_queue() should entail
reviewing each driver caller and ensuring that the export is not needed.
That is not done yet, and should be considered a separate effort.
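To make the rule in the proposed wording concrete, here is a minimal
userspace sketch of the pattern: take an extra reference from process
context so that a put from atomic context can never be the one that drops
the refcount to zero. Everything in it (struct fake_queue, fake_get(),
fake_put(), fake_release(), in_atomic_ctx) is a hypothetical stand-in, not
kernel API, and the plain int counter is not thread-safe.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a request_queue; not the kernel structure. */
struct fake_queue {
	int refcount;	/* models the kobject refcount (not atomic here) */
	bool released;	/* set once the release path has run */
};

/* Hypothetical flag that models "we are currently in atomic context". */
static bool in_atomic_ctx;

/* Models blk_release_queue(): it may sleep, so it must not run atomically. */
static void fake_release(struct fake_queue *q)
{
	assert(!in_atomic_ctx);
	q->released = true;
}

/* Models blk_get_queue(): take one reference. */
static void fake_get(struct fake_queue *q)
{
	q->refcount++;
}

/* Models blk_put_queue(): the final put triggers the release path. */
static void fake_put(struct fake_queue *q)
{
	if (--q->refcount == 0)
		fake_release(q);
}

/*
 * Returns true if the queue was released, and only from the final put
 * issued outside the simulated atomic section.
 */
static bool demo_put_from_atomic(void)
{
	struct fake_queue q = { .refcount = 1, .released = false };

	fake_get(&q);		/* process context: extra reference, 1 -> 2 */

	in_atomic_ctx = true;
	fake_put(&q);		/* atomic context: 2 -> 1, no release */
	in_atomic_ctx = false;

	if (q.released)
		return false;

	fake_put(&q);		/* process context: 1 -> 0, release runs */
	return q.released;
}
```

The assert in fake_release() plays the role of a might_sleep()-style check:
if the put inside the simulated atomic section were the last one, the
sketch would trip it.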
> > /**
> > * blk_cleanup_queue - shutdown a request queue
> > + *
> > * @q: request queue to shutdown
> > *
>
> How about following the example from Documentation/doc-guide/kernel-doc.rst
> and not leaving a blank line above the function argument documentation?
Will do.
> > * Mark @q DYING, drain all pending requests, mark @q DEAD, destroy and
> > * put it. All future requests will be failed immediately with -ENODEV.
> > + *
> > + * You should not call this function in atomic context. If you need to
> > + * refcount a request_queue in atomic context, instead refcount the
> > + * block device with bdgrab() / bdput().
>
> Surrounding blk_cleanup_queue() with bdgrab() / bdput() does not help. This
> blk_cleanup_queue() must not be called from atomic context.
I'll just remove that.
>
> > /**
> > - * __blk_release_queue - release a request queue
> > - * @work: pointer to the release_work member of the request queue to be released
> > + * blk_release_queue - release a request queue
> > + *
> > + * This function is called as part of the process when a block device is being
> > + * unregistered. Releasing a request queue starts with blk_cleanup_queue(),
> > + * which set the appropriate flags and then calls blk_put_queue() as the last
> > + * step. blk_put_queue() decrements the reference counter of the request queue
> > + * and once the reference counter reaches zero, this function is called to
> > + * release all allocated resources of the request queue.
> > *
> > - * Description:
> > - * This function is called when a block device is being unregistered. The
> > - * process of releasing a request queue starts with blk_cleanup_queue, which
> > - * set the appropriate flags and then calls blk_put_queue, that decrements
> > - * the reference counter of the request queue. Once the reference counter
> > - * of the request queue reaches zero, blk_release_queue is called to release
> > - * all allocated resources of the request queue.
> > + * This function can sleep, and so we must ensure that the very last
> > + * blk_put_queue() is never called from atomic context.
> > + *
> > + * @kobj: pointer to a kobject, who's container is a request_queue
> > */
>
> Please follow the style used elsewhere in the kernel and move function
> argument documentation just below the line with the function name.
Sure, thanks for the review.
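For reference, this is roughly the comment layout that
Documentation/doc-guide/kernel-doc.rst describes: argument lines directly
below the line with the function name, then a blank comment line, then the
longer description. The struct and function here (example_queue,
example_put_queue()) are made-up stand-ins so the sketch compiles outside
the kernel; they are not the block layer code from this patch.

```c
#include <assert.h>

/* Hypothetical stand-in type so the sketch builds in userspace. */
struct example_queue {
	int refcount;
};

/**
 * example_put_queue() - decrement the refcount of an example_queue
 * @q: the example_queue to drop a reference on
 *
 * The argument documentation sits directly below the function name line,
 * with no blank comment line in between. The longer description follows
 * after one blank comment line.
 *
 * Return: the refcount after the decrement.
 */
static int example_put_queue(struct example_queue *q)
{
	/* Guard against dropping a reference that was never taken. */
	assert(q->refcount > 0);
	return --q->refcount;
}
```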
Luis