Re: [PATCH] nvme-pci: Make sure to ring doorbell when last request is short-circuited
From: Mohamed Khalfella
Date: Tue Sep 20 2022 - 12:06:23 EST
On 2022-09-19 11:35:09 -0600, Keith Busch wrote:
> > Fixes: d4060d2be1132 ("nvme-pci: fix controller reset hang when racing with nvme_timeout")
>
> I revisited that commit, and it doesn't sound correct. Specifically this part:
>
> 5) reset_work() continues to setup_io_queues() as it observes no error
> in init_identify(). However, the admin queue has already been
> quiesced in dev_disable(). Thus, any following commands would be
> blocked forever in blk_execute_rq().
>
> When a timeout occurs in the CONNECTING state, the timeout handler unquiesces
> the queue specifically to flush out any blocked requests. Is that commit really
> necessary? I'd rather just revert it to save the extra per-IO checks if not.
I cannot say with certainty whether d4060d2be1132 needs to be reverted.
I will need to carefully inspect the reset code path and run more experiments. Even if
that commit gets reverted, we still need to add `nvme_commit_rqs` to `nvme_mq_admin_ops`.
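
To illustrate why the callback matters, here is a minimal user-space sketch of
the doorbell batching idea. It is not driver code: every `mock_*` name is
hypothetical, the dispatcher is a heavy simplification of what blk-mq does, and
`has_commit_rqs` only stands in for whether the ops table provides a
`.commit_rqs` callback.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model of a submission queue; not kernel code. */
struct mock_queue {
    unsigned sq_tail;         /* advanced for every command placed in the queue */
    unsigned last_sq_tail;    /* tail value most recently written to the doorbell */
    unsigned doorbell_writes; /* number of doorbell writes performed */
    bool has_commit_rqs;      /* assumption: models an ops table with .commit_rqs set */
};

/* Write the doorbell only if commands were queued since the previous write. */
static void mock_write_doorbell(struct mock_queue *q)
{
    if (q->sq_tail != q->last_sq_tail) {
        q->last_sq_tail = q->sq_tail;
        q->doorbell_writes++;
    }
}

/*
 * Models a queue_rq()-style callback. A request that fails the driver's
 * early checks is short-circuited: it returns before `last` is looked at,
 * so the doorbell is not written for commands queued earlier in the batch.
 */
static bool mock_queue_rq(struct mock_queue *q, bool ready, bool last)
{
    if (!ready)
        return false;
    q->sq_tail++;
    if (last)
        mock_write_doorbell(q);
    return true;
}

/*
 * Simplified dispatcher: submits n requests and marks only the final one
 * as `last`. If a request is rejected, it stops and, when a commit
 * callback is available, uses it to flush what was already queued.
 * Returns the number of queued commands the device was never told about.
 */
static unsigned mock_dispatch(struct mock_queue *q, const bool *ready, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        bool last = (i == n - 1);

        if (!mock_queue_rq(q, ready[i], last)) {
            if (q->has_commit_rqs)
                mock_write_doorbell(q);
            break;
        }
    }
    return q->sq_tail - q->last_sq_tail;
}
```

In this model, a batch whose final request is rejected leaves the earlier
commands stranded unless the commit callback is present. That is the situation
the admin queue would be in without `nvme_commit_rqs`.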