Re: [PATCH] nvme/pci: Use host managed power state for suspend

From: Christoph Hellwig
Date: Sat May 11 2019 - 03:25:15 EST


A couple nitpicks, mostly leftover from the previous iteration
(I didn't see replies to those comments from you, despite seeing
a reply to my mail, assuming it didn't get lost):

> +int nvme_set_power(struct nvme_ctrl *ctrl, unsigned ps)
> +{
> + return nvme_set_features(ctrl, NVME_FEAT_POWER_MGMT, ps, NULL, 0, NULL);
> +}
> +EXPORT_SYMBOL_GPL(nvme_set_power);
> +
> +int nvme_get_power(struct nvme_ctrl *ctrl, u32 *result)
> +{
> + struct nvme_command c;
> + union nvme_result res;
> + int ret;
> +
> + if (!result)
> + return -EINVAL;
> +
> + memset(&c, 0, sizeof(c));
> + c.features.opcode = nvme_admin_get_features;
> + c.features.fid = cpu_to_le32(NVME_FEAT_POWER_MGMT);
> +
> + ret = __nvme_submit_sync_cmd(ctrl->admin_q, &c, &res,
> + NULL, 0, 0, NVME_QID_ANY, 0, 0, false);
> + if (ret >= 0)
> + *result = le32_to_cpu(res.u32);
> + return ret;
> +}
> +EXPORT_SYMBOL_GPL(nvme_get_power);

At this point I'd rather see those in the PCIe driver. While the
power state feature is generic in the spec I don't see it actually
being used anytime anywhere else any time soon.

But maybe we can add a nvme_get_features helper ala nvme_set_features
in the core to avoid a little boilerplate code for the future?

> + ret = nvme_set_power(&dev->ctrl, dev->ctrl.npss);
> + if (ret < 0)
> + return ret;

I can't find any wording in the spec that guarantees the highest
numerical power state is the deepest. But maybe I'm just missing
something as such an ordering would be really helpful?

> static int nvme_suspend(struct device *dev)
> {
> struct pci_dev *pdev = to_pci_dev(dev);
> struct nvme_dev *ndev = pci_get_drvdata(pdev);
>
> + /*
> + * Try to use nvme if the device supports host managed power settings
> + * and platform firmware is not involved.
> + */

This just comments that what, but I think we need a why here as the
what is fairly obvious..