Re: [PATCH v6 1/1] block/blk-mq: use atomic_t for quiesce_depth to avoid lock contention on RT

From: Bart Van Assche

Date: Thu May 07 2026 - 06:43:46 EST


On 5/7/26 9:45 AM, Sebastian Andrzej Siewior wrote:
> On 2026-05-06 11:43:32 [+0200], Bart Van Assche wrote:
> > On 5/6/26 9:47 AM, Sebastian Andrzej Siewior wrote:
> > > On 2026-05-06 09:14:33 [+0200], Bart Van Assche wrote:
> > > > If the atomic_inc() in blk_mq_quiesce_queue_nowait() is protected by
> > > > hctx->queue->queue_lock then the above code doesn't have to be modified.
> > >
> > > But wouldn't the atomic_inc + barrier avoid the need to have the lock?
> > > Isn't this a normal pattern? If the lock is kept, we could use
> > > non-atomic ops here then. But this avoids having the lock.
> >
> > I strongly prefer a spinlock + non-atomic variables rather than using an
> > atomic variable and barriers because algorithms that use a spinlock are
> > easier to verify.
>
> Hmmm. If we keep the lock, then there is no need for the atomic and we
> keep int counter. Then we are where we are right now with the lock
> synchronizing everything.
> Isn't this also improving the performance for the !RT case or is it
> simply not that visible here?

Agreed that not obtaining the queue_lock from blk_mq_run_hw_queue() is an
interesting improvement. But I'm not sure the new smp_mb__after_atomic()
and smp_rmb() calls are needed. Block layer calls of
blk_mq_quiesce_queue_nowait() are followed by a
blk_mq_wait_quiesce_done() call. The latter calls either
synchronize_srcu() or synchronize_rcu(). Either is sufficient to
guarantee global visibility of the change of the queue state to
"quiesced".

This patch removes a spin_lock() call from
blk_mq_quiesce_queue_nowait(). That spin_lock() call guarantees that
other CPUs will observe the "quiesced" state after the store operations
that precede the blk_mq_quiesce_queue_nowait() call. I don't think that
any block layer code depends on this but I noticed that this change has
not been mentioned in the patch description. A similar comment applies
to the blk_mq_unquiesce_queue() changes: the ordering guarantees
provided by the removed spin_lock() call have not been preserved. There
is probably code in the block layer that depends on the "unquiesced"
state only being observed after prior stores performed by the same CPU
core.
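
To make the ordering concern concrete, here is a minimal userspace
sketch using C11 atomics. It is not kernel code and not what the patch
does: struct demo_queue and the demo_*() helpers are hypothetical names
invented for this illustration. It shows one way a lockless depth
counter could keep "stores made before the state change are visible to
whoever observes the state change", by pairing release ordering on the
update with acquire ordering on the read. Whether the block layer needs
exactly this guarantee is the open question above.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a request queue; not the kernel's struct. */
struct demo_queue {
	int config;               /* plain data written before a state change */
	atomic_int quiesce_depth; /* > 0 means "quiesced" */
};

/*
 * Increment the depth with release ordering: plain stores made by this
 * thread before the call are visible to any thread that later observes
 * the new depth through an acquire load.
 */
static void demo_quiesce(struct demo_queue *q)
{
	atomic_fetch_add_explicit(&q->quiesce_depth, 1, memory_order_release);
}

/*
 * Decrement the depth with release ordering. Returns true if this call
 * took the depth from 1 to 0, i.e. the queue left the quiesced state.
 */
static bool demo_unquiesce(struct demo_queue *q)
{
	int old = atomic_fetch_sub_explicit(&q->quiesce_depth, 1,
					    memory_order_release);

	assert(old > 0); /* unbalanced unquiesce would be a caller bug */
	return old == 1;
}

/* Acquire load: pairs with the release updates above. */
static bool demo_quiesced(struct demo_queue *q)
{
	return atomic_load_explicit(&q->quiesce_depth,
				    memory_order_acquire) > 0;
}
```

A reader that sees demo_quiesced() return false after the final
demo_unquiesce() is then also guaranteed to see the earlier write to
q->config. With relaxed ordering on the counter that guarantee would be
lost, which is the kind of difference I would like to see discussed in
the patch description.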

Thanks,

Bart.