Re: [PATCH 5/9] blk-mq: don't set data->ctx and data->hctx in blk_mq_alloc_request_hctx
From: Ming Lei
Date: Tue May 19 2020 - 21:18:43 EST
On Tue, May 19, 2020 at 05:30:00PM +0200, Christoph Hellwig wrote:
> On Tue, May 19, 2020 at 09:54:20AM +0800, Ming Lei wrote:
> > As Thomas clarified, workqueues no longer have this issue, and only other
> > per-CPU kthreads can run until the CPU clears the online bit.
> >
> > So the question is whether IO can be submitted from such a kernel context?
>
> What other per-CPU kthreads even exist?
I don't know, which is why I am raising the question with a wider audience.
>
> > > INACTIVE is set to the hctx, and it is set by the last CPU to be
> > > offlined that is mapped to the hctx. once the bit is set the barrier
> > > ensured it is seen everywhere before we start waiting for the requests
> > > to finish. What is missing?:
> >
> > Memory barriers should always be used in pairs, and you should have mentioned
> > that the barrier implied by test_and_set_bit_lock() in sbitmap_get() pairs
> > with smp_mb__after_atomic() in blk_mq_hctx_notify_offline().
>
> Documentation/core-api/atomic_ops.rst makes it pretty clear that the
> special smp_mb__before_atomic and smp_mb__after_atomic barriers are only
> used around the set_bit/clear_bit/change_bit operations, and not on the
> test_bit side. That is also how they are used in all the callsites I
> checked.
I don't care whether the barrier is smp_mb__after_atomic() or smp_mb(), because
it is added in the slow path.
What I tried to express is that every use of an SMP memory barrier should be
commented clearly, especially its pairing; see the "SMP BARRIER PAIRING" section
of Documentation/memory-barriers.txt.
So please add a comment around the newly added smp_mb__after_atomic(),
something like:
/*
 * The following smp_mb__after_atomic() pairs with the smp_mb() implied by
 * test_and_set_bit_lock() in sbitmap_get(), so that setting the tag bit and
 * checking INACTIVE in blk_mq_get_tag() are ordered, and likewise setting
 * INACTIVE and checking the tag bit in blk_mq_hctx_notify_offline().
 */
>
> > Then setting tag bit and checking INACTIVE in blk_mq_get_tag() can be ordered,
> > same with setting INACTIVE and checking tag bit in blk_mq_hctx_notify_offline().
>
> But yes, even if not that would take care of it.
The operations are ordered in this way; that is exactly the purpose of the added
memory barrier.
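
To illustrate the pairing, here is a minimal userspace sketch, not the blk-mq
code. It assumes C11 atomics and pthreads as stand-ins for the kernel
primitives: atomic_thread_fence(memory_order_seq_cst) plays the role of both
the barrier implied on the sbitmap_get() side and smp_mb__after_atomic() on the
offline side. The names tag_bit, inactive, alloc_side(), offline_side() and
count_missed_pairs() are hypothetical. With a full barrier on both sides, at
least one side should observe the other's store, so the allocator either sees
INACTIVE or the offline path sees the tag bit.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical stand-ins for one tag bit and the hctx INACTIVE flag. */
static atomic_int tag_bit;
static atomic_int inactive;

/* Results of each side's check, read only after both threads are joined. */
static int alloc_saw_inactive;
static int offline_saw_tag;

/* Allocation side (modelled on blk_mq_get_tag()): set tag, then check INACTIVE. */
static void *alloc_side(void *arg)
{
	(void)arg;
	atomic_store_explicit(&tag_bit, 1, memory_order_relaxed);
	/* Assumption: stands in for the barrier implied on the sbitmap_get() side. */
	atomic_thread_fence(memory_order_seq_cst);
	alloc_saw_inactive = atomic_load_explicit(&inactive, memory_order_relaxed);
	return NULL;
}

/* Offline side (modelled on blk_mq_hctx_notify_offline()): set INACTIVE, then check tag. */
static void *offline_side(void *arg)
{
	(void)arg;
	atomic_store_explicit(&inactive, 1, memory_order_relaxed);
	/* Assumption: stands in for smp_mb__after_atomic(). */
	atomic_thread_fence(memory_order_seq_cst);
	offline_saw_tag = atomic_load_explicit(&tag_bit, memory_order_relaxed);
	return NULL;
}

/* Returns the number of trials in which both sides missed the other's store. */
int count_missed_pairs(int trials)
{
	int missed = 0;

	for (int i = 0; i < trials; i++) {
		pthread_t a, b;
		int rc;

		atomic_store(&tag_bit, 0);
		atomic_store(&inactive, 0);

		rc = pthread_create(&a, NULL, alloc_side, NULL);
		assert(rc == 0);
		rc = pthread_create(&b, NULL, offline_side, NULL);
		assert(rc == 0);
		pthread_join(a, NULL);
		pthread_join(b, NULL);

		if (!alloc_saw_inactive && !offline_saw_tag)
			missed++;
	}
	return missed;
}
```

Calling count_missed_pairs() from a small main() and building with -pthread
should report zero missed pairs. If either fence is removed, both sides are
allowed to miss each other's store, which is the case the comment on the
barrier needs to document.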
thanks,
Ming