On 2/1/21 3:16 AM, Chao Leng wrote:For example assume the list has three path, and all path is not NVME_ANA_OPTIMIZED:
On 2021/1/29 17:20, Hannes Reinecke wrote:
On 1/29/21 9:46 AM, Chao Leng wrote:No, in the scenario, ns should not be NULL.
So you mean we'll need something like this?
On 2021/1/29 16:33, Hannes Reinecke wrote:
On 1/29/21 8:45 AM, Chao Leng wrote:Two path just a sample example.
On 2021/1/29 15:06, Hannes Reinecke wrote:
On 1/29/21 4:07 AM, Chao Leng wrote:If there is just one path(the "old") and the "old" is deleted,
On 2021/1/29 9:42, Sagi Grimberg wrote:
ok, I see.
You can't see exactly where it dies but I followed the assembly toSo there is other bug cause nvme_next_ns abormal.
nvme_round_robin_path(). Maybe it's not the initial nvme_next_ns(head,
old) which returns NULL but nvme_next_ns() is returning NULL eventually
(list_next_or_null_rcu()).
I review the code about head->list and head->current_path, I find 2 bugs
may cause the bug:
First, I already send the patch. see:
https://lore.kernel.org/linux-nvme/20210128033351.22116-1-lengchao@xxxxxxxxxx/
Second, in nvme_ns_remove, list_del_rcu is before
nvme_mpath_clear_current_path. This may cause "old" is deleted from the
"head", but still use "old". I'm not sure there's any other
consideration here, I will check it and try to fix it.
The reason why we first remove from head->list and only then clear
current_path is because the other way around there is no way
to guarantee that that the ns won't be assigned as current_path
again (because it is in head->list).
The list will be like this:
nvme_ns_remove fences continue of deletion of the ns by synchronizing
the srcu such that for sure the current_path clearance is visible.
head->next = ns1;
ns1->next = head;
old->next = ns1;
Where does 'old' pointing to?
This may cause infinite loop in nvme_round_robin_path.
for (ns = nvme_next_ns(head, old);
ns != old;
ns = nvme_next_ns(head, ns))
The ns will always be ns1, and then infinite loop.
No. nvme_next_ns() will return NULL.
nvme_next_ns() will return NULL.
The list like this:
head->next = head;
old->next = head;
If there is two or more path and the "old" is deleted,
"for" will be infinite loop. because nvme_next_ns() will return
the path which in the list except the "old", check condition will
be true for ever.
But that will be caught by the statement above:
if (list_is_singular(&head->list))
no?
If there is just two path, will enter it, may cause no path but there is
actually one path. It is falsely assumed that the "old" must be not deleted.
If there is more than two path, will cause infinite loop.
diff --git a/drivers/nvme/host/multipath.c b/drivers/nvme/host/multipath.c
index 71696819c228..8ffccaf9c19a 100644
--- a/drivers/nvme/host/multipath.c
+++ b/drivers/nvme/host/multipath.c
@@ -202,10 +202,12 @@ static struct nvme_ns *__nvme_find_path(struct nvme_ns_head *head, int node)
static struct nvme_ns *nvme_next_ns(struct nvme_ns_head *head,
struct nvme_ns *ns)
{
- ns = list_next_or_null_rcu(&head->list, &ns->siblings, struct nvme_ns,
- siblings);
- if (ns)
- return ns;
+ if (ns) {
+ ns = list_next_or_null_rcu(&head->list, &ns->siblings,
+ struct nvme_ns, siblings);
+ if (ns)
+ return ns;
+ }
Why not? 'ns == NULL' is precisely the corner-case this is trying to fix...
May be we can do like this:Urgh. Please, no. That is well impossible to debug.
diff --git a/drivers/nvme/host/multipath.c b/drivers/nvme/host/multipath.c
index 282b7a4ea9a9..b895011a2cbd 100644
--- a/drivers/nvme/host/multipath.c
+++ b/drivers/nvme/host/multipath.c
@@ -199,30 +199,24 @@ static struct nvme_ns *__nvme_find_path(struct nvme_ns_head *head, int node)
return found;
}
-static struct nvme_ns *nvme_next_ns(struct nvme_ns_head *head,
- struct nvme_ns *ns)
-{
- ns = list_next_or_null_rcu(&head->list, &ns->siblings, struct nvme_ns,
- siblings);
- if (ns)
- return ns;
- return list_first_or_null_rcu(&head->list, struct nvme_ns, siblings);
-}
+#define nvme_next_ns_condition(head, current, condition) \
+({ \
+ struct nvme_ns *__ptr = list_next_or_null_rcu(&(head)->list, \
+ &(current)->siblings, struct nvme_ns, siblings); \
+ __ptr ? __ptr : (condition) ? (condition) = false, \
+ list_first_or_null_rcu(&(head)->list, struct nvme_ns, \
+ siblings) : NULL; \
+})
Can you please open-code it to demonstrate where the difference to the current (and my fixed) versions is?
I'm still not clear where the problem is once we applied both patches.
Cheers,
Hannes