Re: [PATCH v5 0/3] Handle update hardware queues and queue freeze more carefully

From: James Smart
Date: Fri Aug 20 2021 - 11:27:54 EST

Next message: Alex Deucher: "Re: [PATCH] drm/amd/pm: And destination bounds checking to struct copy"
Previous message: Jiri Kosina: "Re: [PATCH v2 55/63] HID: roccat: Use struct_group() to zero kone_mouse_event"
In reply to: Daniel Wagner: "Re: [PATCH v5 0/3] Handle update hardware queues and queue freeze more carefully"
Next in thread: Christoph Hellwig: "Re: [PATCH v5 0/3] Handle update hardware queues and queue freeze more carefully"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On 8/20/2021 4:55 AM, Daniel Wagner wrote:

On Fri, Aug 20, 2021 at 10:48:32AM +0200, Daniel Wagner wrote:

Then we try to do the same thing again which fails, thus we never
make progress.

So clearly we need to update number of queues at one point. What would
be the right thing to do here? As I understood we need to be careful
with frozen requests. Can we abort them (is this even possible in this
state?) and requeue them before we update the queue numbers?

After starring a bit longer at the reset path, I think there is no
pending request in any queue. nvme_fc_delete_association() calls
__nvme_fc_abort_outstanding_ios() which makes sure all queues are
drained (usage counter is 0). Also it clears the NVME_FC_Q_LIVE bit,
which prevents further request added to queues.

yes, as long as we haven't attempted to create the io queues via nvme_fc_connect_io_queues(), nothing should be successful queueing and running down the hctx to start the io. nvme_fc_connect_io_queues() will use the queue for the Connect cmd, which is probably what generated the prior -16389 error.

Which says:"nvme-fc: Update hardware queues before using them" should be good to use.

I start wonder why we have to do the nvme_start_freeze() in the first
place and why we want to wait for the freeze. 88e837ed0f1f ("nvme-fc:
wait for queues to freeze before calling update_hr_hw_queues") doesn't
really tell why we need wait for the freeze.

I think that is probably going to be true as well - no need to freeze/unfreeze around this path. This was also a rather late add (last oct), so we had been running without the freezes for a long time, granted few devices change their queue counts.

I'll have to see if I can find what prompted the change. At first blush, I'm fine reverting it.

Given we know the usage counter of the queues is 0, I think we are
safe to move the blk_mq_update_nr_hw_queues() before the start queue
code. Also note nvme_fc_create_hw_io_queues() calls
blk_mq_freeze_queue() but it wont block as we are sure there is no
pending request.

Agree.

-- james

_______________________________________________
Linux-nvme mailing list
Linux-nvme@xxxxxxxxxxxxxxxxxxx
http://lists.infradead.org/mailman/listinfo/linux-nvme

Next message: Alex Deucher: "Re: [PATCH] drm/amd/pm: And destination bounds checking to struct copy"
Previous message: Jiri Kosina: "Re: [PATCH v2 55/63] HID: roccat: Use struct_group() to zero kone_mouse_event"
In reply to: Daniel Wagner: "Re: [PATCH v5 0/3] Handle update hardware queues and queue freeze more carefully"
Next in thread: Christoph Hellwig: "Re: [PATCH v5 0/3] Handle update hardware queues and queue freeze more carefully"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]