Re: [PATCH] nvme: unquiesce the queue before cleaup it

From: jianchao.wang
Date: Sun Apr 22 2018 - 10:26:02 EST


Hi Max

No, I only tested it on PCIe one.
And sorry for that I didn't state that.

Thanks
Jianchao

On 04/22/2018 10:18 PM, Max Gurtovoy wrote:
> Hi Jianchao,
> Since this patch is in the core, have you tested it using some fabrics drives too ? RDMA/FC ?
>
> thanks,
> Max.
>
> On 4/22/2018 4:32 PM, jianchao.wang wrote:
>> Hi keith
>>
>> Would you please take a look at this patch.
>>
>> This issue could be reproduced easily with a driver bind/unbind loop,
>> a reset loop and a IO loop at the same time.
>>
>> Thanks
>> Jianchao
>>
>> On 04/19/2018 04:29 PM, Jianchao Wang wrote:
>>> There is race between nvme_remove and nvme_reset_work that can
>>> lead to io hang.
>>>
>>> nvme_removeÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ nvme_reset_work
>>> -> change state to DELETING
>>> ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ -> fail to change state to LIVE
>>> ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ -> nvme_remove_dead_ctrl
>>> ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ -> nvme_dev_disable
>>> ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ -> quiesce request_queue
>>> ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ -> queue remove_work
>>> -> cancel_work_sync reset_work
>>> -> nvme_remove_namespaces
>>> ÂÂ -> splice ctrl->namespaces
>>> ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ nvme_remove_dead_ctrl_work
>>> ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ -> nvme_kill_queues
>>> ÂÂ -> nvme_ns_removeÂÂÂÂÂÂÂÂÂÂÂÂÂÂ do nothing
>>> ÂÂÂÂ -> blk_cleanup_queue
>>> ÂÂÂÂÂÂ -> blk_freeze_queue
>>> Finally, the request_queue is quiesced state when wait freeze,
>>> we will get io hang here.
>>>
>>> To fix it, unquiesce the request_queue directly before nvme_ns_remove.
>>> We have spliced the ctrl->namespaces, so nobody could access them
>>> and quiesce the queue any more.
>>>
>>> Signed-off-by: Jianchao Wang <jianchao.w.wang@xxxxxxxxxx>
>>> ---
>>> Â drivers/nvme/host/core.c | 9 ++++++++-
>>> Â 1 file changed, 8 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
>>> index 9df4f71..0e95082 100644
>>> --- a/drivers/nvme/host/core.c
>>> +++ b/drivers/nvme/host/core.c
>>> @@ -3249,8 +3249,15 @@ void nvme_remove_namespaces(struct nvme_ctrl *ctrl)
>>> ÂÂÂÂÂ list_splice_init(&ctrl->namespaces, &ns_list);
>>> ÂÂÂÂÂ up_write(&ctrl->namespaces_rwsem);
>>> Â -ÂÂÂ list_for_each_entry_safe(ns, next, &ns_list, list)
>>> +ÂÂÂ /*
>>> +ÂÂÂÂ * After splice the namespaces list from the ctrl->namespaces,
>>> +ÂÂÂÂ * nobody could get them anymore, let's unquiesce the request_queue
>>> +ÂÂÂÂ * forcibly to avoid io hang.
>>> +ÂÂÂÂ */
>>> +ÂÂÂ list_for_each_entry_safe(ns, next, &ns_list, list) {
>>> +ÂÂÂÂÂÂÂ blk_mq_unquiesce_queue(ns->queue);
>>> ÂÂÂÂÂÂÂÂÂ nvme_ns_remove(ns);
>>> +ÂÂÂ }
>>> Â }
>>> Â EXPORT_SYMBOL_GPL(nvme_remove_namespaces);
>>> Â
>>
>> _______________________________________________
>> Linux-nvme mailing list
>> Linux-nvme@xxxxxxxxxxxxxxxxxxx
>> https://urldefense.proofpoint.com/v2/url?u=http-3A__lists.infradead.org_mailman_listinfo_linux-2Dnvme&d=DwICAg&c=RoP1YumCXCgaWHvlZYR8PZh8Bv7qIrMUB65eapI_JnE&r=7WdAxUBeiTUTCy8v-7zXyr4qk7sx26ATvfo6QSTvZyQ&m=eQ9q70WFDS-d0s-KndBw8MOJvcBM6wuuKUNklqTC3h8&s=oBasfz9JoJw4yQF4EaWcNfKChZ1HMCkfHVZqyjvYVHQ&e=
>>
>
> _______________________________________________
> Linux-nvme mailing list
> Linux-nvme@xxxxxxxxxxxxxxxxxxx
> https://urldefense.proofpoint.com/v2/url?u=http-3A__lists.infradead.org_mailman_listinfo_linux-2Dnvme&d=DwICAg&c=RoP1YumCXCgaWHvlZYR8PZh8Bv7qIrMUB65eapI_JnE&r=7WdAxUBeiTUTCy8v-7zXyr4qk7sx26ATvfo6QSTvZyQ&m=eQ9q70WFDS-d0s-KndBw8MOJvcBM6wuuKUNklqTC3h8&s=oBasfz9JoJw4yQF4EaWcNfKChZ1HMCkfHVZqyjvYVHQ&e=
>