Re: [PATCH v5] nvme: multipath: Implemented new iopolicy "queue-depth"

From: Christoph Hellwig
Date: Thu May 23 2024 - 02:53:11 EST


> + /*
> + * queue-depth iopolicy does not need to reference ->current_path
> + * but round-robin needs the last path used to advance to the
> + * next one, and numa will continue to use the last path unless
> + * it is or has become not optimized
> + */

Can we please turn this into a full sentence? I.e.:

/*
* The queue-depth iopolicy does not need to reference ->current_path,
* but the round-robin iopolicy needs the last path used to advance to
* the next one, and numa will continue to use the last path unless
* it is or has become non-optimized.
*/

?

> + if (iopolicy == NVME_IOPOLICY_QD)
> + return nvme_queue_depth_path(head);
> +
> + node = numa_node_id();
> ns = srcu_dereference(head->current_path[node], &head->srcu);
> if (unlikely(!ns))
> return __nvme_find_path(head, node);
>
> - if (READ_ONCE(head->subsys->iopolicy) == NVME_IOPOLICY_RR)
> + if (iopolicy == NVME_IOPOLICY_RR)
> return nvme_round_robin_path(head, node, ns);
> +
> if (unlikely(!nvme_path_is_optimized(ns)))
> return __nvme_find_path(head, node);
> return ns;

Also, this is growing into the kind of spaghetti code that is on the fast
path to becoming unmaintainable. I'd much rather see the
srcu_dereference + __nvme_find_path duplicated, with a switch over the
iopolicies and a separate helper for each of them here, than the various
ifs at different levels.
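
To illustrate the shape I mean, here is a rough userspace model, not kernel
code. Every model_* name is a hypothetical stand-in, the SRCU and per-node
handling are left out, and the selection logic is simplified. The point is
only that each policy gets its own helper, each helper that needs the cached
path does its own lookup and fallback, and the caller is a plain switch:

```c
#include <assert.h>
#include <stddef.h>

/* Userspace model only: every model_* name is a hypothetical stand-in. */
enum model_iopolicy {
	MODEL_IOPOLICY_NUMA,
	MODEL_IOPOLICY_RR,
	MODEL_IOPOLICY_QD,
};

#define MODEL_NR_PATHS	3

struct model_ns {
	int optimized;	/* stand-in for nvme_path_is_optimized() */
	int nr_active;	/* stand-in for the in-flight request count */
};

struct model_head {
	enum model_iopolicy iopolicy;
	struct model_ns *current_path;	/* one NUMA node only, for brevity */
	struct model_ns *paths[MODEL_NR_PATHS];
};

/* Stand-in for __nvme_find_path(): first optimized path, then cache it. */
static struct model_ns *model_find_path_slow(struct model_head *head)
{
	for (size_t i = 0; i < MODEL_NR_PATHS; i++) {
		if (head->paths[i] && head->paths[i]->optimized) {
			head->current_path = head->paths[i];
			return head->paths[i];
		}
	}
	return NULL;
}

/* numa: keep using the cached path unless it is missing or non-optimized. */
static struct model_ns *model_numa_path(struct model_head *head)
{
	struct model_ns *ns = head->current_path;

	if (!ns || !ns->optimized)
		return model_find_path_slow(head);
	return ns;
}

/* round-robin: advance from the cached path to the next optimized one. */
static struct model_ns *model_round_robin_path(struct model_head *head)
{
	struct model_ns *ns = head->current_path;
	size_t i, start = 0;

	if (!ns)
		return model_find_path_slow(head);

	for (i = 0; i < MODEL_NR_PATHS; i++)
		if (head->paths[i] == ns)
			start = i + 1;

	for (i = 0; i < MODEL_NR_PATHS; i++) {
		struct model_ns *next = head->paths[(start + i) % MODEL_NR_PATHS];

		if (next && next->optimized) {
			head->current_path = next;
			return next;
		}
	}
	return NULL;
}

/* queue-depth: never looks at the cached path at all. */
static struct model_ns *model_queue_depth_path(struct model_head *head)
{
	struct model_ns *best = NULL;

	for (size_t i = 0; i < MODEL_NR_PATHS; i++) {
		struct model_ns *ns = head->paths[i];

		if (ns && ns->optimized &&
		    (!best || ns->nr_active < best->nr_active))
			best = ns;
	}
	return best;
}

/* The caller is reduced to a single switch over the policies. */
static struct model_ns *model_find_path(struct model_head *head)
{
	assert(head);

	switch (head->iopolicy) {
	case MODEL_IOPOLICY_QD:
		return model_queue_depth_path(head);
	case MODEL_IOPOLICY_RR:
		return model_round_robin_path(head);
	default:
		return model_numa_path(head);
	}
}
```

The cached-path lookup and the fallback are repeated in the numa and
round-robin helpers, but each policy can then be read on its own.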

> +static void nvme_subsys_iopolicy_update(struct nvme_subsystem *subsys, int iopolicy)

Overly long line here.

> +{
> + struct nvme_ctrl *ctrl;
> + int old_iopolicy = READ_ONCE(subsys->iopolicy);
> +
> + if (old_iopolicy == iopolicy)
> + return;
> +
> + WRITE_ONCE(subsys->iopolicy, iopolicy);
> +
> + /* iopolicy changes reset the counters and clear the mpath by design */
> + mutex_lock(&nvme_subsystems_lock);
> + list_for_each_entry(ctrl, &subsys->ctrls, subsys_entry) {
> + atomic_set(&ctrl->nr_active, 0);
> + nvme_mpath_clear_ctrl_paths(ctrl);
> + }
> + mutex_unlock(&nvme_subsystems_lock);

You probably want to take the lock over the iopolicy assignment to
serialize it. And why do we need the atomic_set here?
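
For the locking, I mean roughly the following, again as a userspace model
with hypothetical model_* names and a pthread mutex standing in for
nvme_subsystems_lock. The compare, the assignment and the path clearing all
happen under the same lock, so two concurrent updates cannot interleave. The
counter reset is left out here only because it is still an open question; in
the kernel the store would presumably stay a WRITE_ONCE() for the lockless
readers:

```c
#include <assert.h>
#include <pthread.h>

#define MODEL_NR_CTRLS	2

/* Userspace model only: every model_* name is a hypothetical stand-in. */
struct model_ctrl {
	int paths_cleared;	/* counts nvme_mpath_clear_ctrl_paths() stand-in calls */
};

struct model_subsys {
	int iopolicy;
	struct model_ctrl ctrls[MODEL_NR_CTRLS];
};

static pthread_mutex_t model_subsystems_lock = PTHREAD_MUTEX_INITIALIZER;

/* Returns 1 if the policy was changed, 0 if it was already set. */
static int model_iopolicy_update(struct model_subsys *subsys, int iopolicy)
{
	int changed = 0;

	assert(subsys);

	pthread_mutex_lock(&model_subsystems_lock);
	if (subsys->iopolicy != iopolicy) {
		/* Assignment and path clearing are serialized by one lock. */
		subsys->iopolicy = iopolicy;
		for (int i = 0; i < MODEL_NR_CTRLS; i++)
			subsys->ctrls[i].paths_cleared++;
		changed = 1;
	}
	pthread_mutex_unlock(&model_subsystems_lock);

	return changed;
}
```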

> +
> + pr_notice("%s: changed from %s to %s for subsysnqn %s\n", __func__,
> + nvme_iopolicy_names[old_iopolicy], nvme_iopolicy_names[iopolicy],

Please avoid the overly long line here as well.

> NVME_REQ_CANCELLED = (1 << 0),
> NVME_REQ_USERCMD = (1 << 1),
> NVME_MPATH_IO_STATS = (1 << 2),
> + NVME_MPATH_CNT_ACTIVE = (1 << 3),

This does not match the indentation above.