Re: [RFC PATCH v8 05/10] cxl/memscrub: Add CXL device patrol scrub control feature

From: fan
Date: Fri Apr 26 2024 - 19:57:02 EST


On Sat, Apr 20, 2024 at 12:47:14AM +0800, shiju.jose@xxxxxxxxxx wrote:
> From: Shiju Jose <shiju.jose@xxxxxxxxxx>
>
> CXL spec 3.1 section 8.2.9.9.11.1 describes the device patrol scrub control
> feature. The device patrol scrub proactively locates and makes corrections
> to errors in regular cycle.
>
> Allow specifying the number of hours within which the patrol scrub must be
> completed, subject to minimum and maximum limits reported by the device.
> Also allow disabling scrub allowing trade-off error rates against
> performance.
>
> Register with scrub subsystem to provide scrub control attributes to the
> user.
>
> Co-developed-by: Jonathan Cameron <Jonathan.Cameron@xxxxxxxxxx>
> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@xxxxxxxxxx>
> Signed-off-by: Shiju Jose <shiju.jose@xxxxxxxxxx>
> ---
> Documentation/scrub/scrub-configure.rst | 52 ++++
> drivers/cxl/Kconfig | 19 ++
> drivers/cxl/core/Makefile | 1 +
> drivers/cxl/core/memscrub.c | 314 ++++++++++++++++++++++++
> drivers/cxl/cxlmem.h | 8 +
> drivers/cxl/mem.c | 6 +
> 6 files changed, 400 insertions(+)
> create mode 100644 Documentation/scrub/scrub-configure.rst
> create mode 100644 drivers/cxl/core/memscrub.c
>
> diff --git a/Documentation/scrub/scrub-configure.rst b/Documentation/scrub/scrub-configure.rst
> new file mode 100644
> index 000000000000..2275366b60d3
> --- /dev/null
> +++ b/Documentation/scrub/scrub-configure.rst
> @@ -0,0 +1,52 @@
> +.. SPDX-License-Identifier: GPL-2.0
> +
> +================
> +Scrub subsystem
> +================
> +
> +Copyright (c) 2024 HiSilicon Limited.
> +
> +:Author: Shiju Jose <shiju.jose@xxxxxxxxxx>
> +:License: The GNU Free Documentation License, Version 1.2
> + (dual licensed under the GPL v2)
> +:Original Reviewers:
> +
> +- Written for: 6.9
> +- Updated for:
> +
> +Introduction
> +------------
> +The scrub subsystem provides interface for controlling attributes
> +of memory scrubbers in the system. The scrub device drivers
> +in the system register with the scrub subsystem.The scrub subsystem
> +driver exposes the scrub controls to the user in the sysfs.
> +
> +The File System
> +---------------
> +
> +The control attributes of the registered scrubbers could be
> +accessed in the /sys/class/ras/rasX/scrub/
> +
> +sysfs
> +-----
> +
> +Sysfs files are documented in
> +`Documentation/ABI/testing/sysfs-class-scrub-configure`.
> +
> +Example
> +-------
> +
> +The usage takes the form shown in this example::
> +
> +1. CXL patrol scrubber
> + # cat /sys/class/ras/ras0/scrub/rate_available
> + # 0x1-0xff
> + # echo 30 > /sys/class/ras/ras0/scrub/rate
> + # cat /sys/class/ras/ras0/scrub/rate
> + # 0x1e
> + # echo 1 > /sys/class/ras/ras0/scrub/enable_background
> + # cat /sys/class/ras/ras0/scrub/enable_background
> + # 1
> + # echo 0 > /sys/class/ras/ras0/scrub/enable_background
> + # cat /sys/class/ras/ras0/scrub/enable_background
> + # 0
> diff --git a/drivers/cxl/Kconfig b/drivers/cxl/Kconfig
> index 5f3c9c5529b9..3621b9f27e80 100644
> --- a/drivers/cxl/Kconfig
> +++ b/drivers/cxl/Kconfig
> @@ -144,4 +144,23 @@ config CXL_REGION_INVALIDATION_TEST
> If unsure, or if this kernel is meant for production environments,
> say N.
>
> +config CXL_SCRUB
> + bool "CXL: Memory scrub feature"
> + depends on CXL_PCI
> + depends on CXL_MEM
> + depends on SCRUB
> + help
> + The CXL memory scrub control is an optional feature allows host to
> + control the scrub configurations of CXL Type 3 devices, which
> + supports patrol scrubbing.
> +
> + Registers with the scrub subsystem to provide control attributes
> + of CXL memory device scrubber to the user.
> + Provides interface functions to support configuring the CXL memory
> + device patrol scrubber.
> +
> + Say 'y/n' to enable/disable control of memory scrub parameters for
> + CXL.mem devices. See section 8.2.9.9.11.1 of CXL 3.1 specification
> + for detailed description of CXL memory patrol scrub control feature.
> +
> endif
> diff --git a/drivers/cxl/core/Makefile b/drivers/cxl/core/Makefile
> index 9259bcc6773c..e0fc814c3983 100644
> --- a/drivers/cxl/core/Makefile
> +++ b/drivers/cxl/core/Makefile
> @@ -16,3 +16,4 @@ cxl_core-y += pmu.o
> cxl_core-y += cdat.o
> cxl_core-$(CONFIG_TRACING) += trace.o
> cxl_core-$(CONFIG_CXL_REGION) += region.o
> +cxl_core-$(CONFIG_CXL_SCRUB) += memscrub.o
> diff --git a/drivers/cxl/core/memscrub.c b/drivers/cxl/core/memscrub.c
> new file mode 100644
> index 000000000000..a50f6e384394
> --- /dev/null
> +++ b/drivers/cxl/core/memscrub.c
> @@ -0,0 +1,314 @@
> +// SPDX-License-Identifier: GPL-2.0-or-later
> +/*
> + * CXL memory scrub driver.
> + *
> + * Copyright (c) 2024 HiSilicon Limited.
> + *
> + * - Provides functions to configure patrol scrub feature of the
> + * CXL memory devices.
> + * - Registers with the scrub subsystem driver to expose the sysfs attributes
> + * to the user for configuring the CXL memory patrol scrub feature.
> + */
> +
> +#define pr_fmt(fmt) "CXL_MEM_SCRUB: " fmt
> +
> +#include <cxlmem.h>
> +#include <linux/cleanup.h>
> +#include <linux/limits.h>
> +#include <linux/memory_scrub.h>
> +
> +static int cxl_mem_get_supported_feature_entry(struct cxl_memdev *cxlmd, const uuid_t *feat_uuid,
> + struct cxl_mbox_supp_feat_entry *feat_entry_out)
> +{
> + struct cxl_mbox_supp_feat_entry *feat_entry;
> + struct cxl_dev_state *cxlds = cxlmd->cxlds;
> + struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlds);
> + int feat_index, feats_out_size;
> + int nentries, count;
> + int ret;
> +
> + feat_index = 0;
> + feats_out_size = sizeof(struct cxl_mbox_get_supp_feats_out) +
> + sizeof(struct cxl_mbox_supp_feat_entry);
> + struct cxl_mbox_get_supp_feats_out *feats_out __free(kfree) =
> + kmalloc(feats_out_size, GFP_KERNEL);
> + if (!feats_out)
> + return -ENOMEM;
> +
> + while (true) {
> + memset(feats_out, 0, feats_out_size);
> + ret = cxl_get_supported_features(mds, feats_out_size,
> + feat_index, feats_out);
> + if (ret)
> + return ret;
> +
> + nentries = feats_out->nr_entries;
> + if (!nentries)
> + return -EOPNOTSUPP;
> +
> + /* Check CXL memdev supports the feature */
> + feat_entry = feats_out->feat_entries;
> + for (count = 0; count < nentries; count++, feat_entry++) {
> + if (uuid_equal(&feat_entry->uuid, feat_uuid)) {
> + memcpy(feat_entry_out, feat_entry,
> + sizeof(*feat_entry_out));
> + return 0;
> + }
> + }
> + feat_index += nentries;
> + }
> +}
> +
> +/* CXL memory patrol scrub control definitions */
> +#define CXL_MEMDEV_PS_GET_FEAT_VERSION 0x01
> +#define CXL_MEMDEV_PS_SET_FEAT_VERSION 0x01
> +
> +static const uuid_t cxl_patrol_scrub_uuid =
> + UUID_INIT(0x96dad7d6, 0xfde8, 0x482b, 0xa7, 0x33, 0x75, 0x77, 0x4e, \
> + 0x06, 0xdb, 0x8a);
> +
> +/* CXL memory patrol scrub control functions */
> +struct cxl_patrol_scrub_context {
> + struct device *dev;
> + u16 get_feat_size;
> + u16 set_feat_size;
> + bool scrub_cycle_changeable;
> +};
> +
> +/**
> + * struct cxl_memdev_ps_params - CXL memory patrol scrub parameter data structure.
> + * @enable: [IN & OUT] enable(1)/disable(0) patrol scrub.
> + * @scrub_cycle_changeable: [OUT] scrub cycle attribute of patrol scrub is changeable.
> + * @rate: [IN] Requested patrol scrub cycle in hours.
> + * [OUT] Current patrol scrub cycle in hours.
> + * @min_rate:[OUT] minimum patrol scrub cycle, in hours, supported.
> + */
> +struct cxl_memdev_ps_params {
> + bool enable;
> + bool scrub_cycle_changeable;
> + u16 rate;
> + u16 min_rate;
> +};
> +
> +enum cxl_scrub_param {
> + cxl_ps_param_enable,
> + cxl_ps_param_rate,
> +};
> +
> +#define CXL_MEMDEV_PS_SCRUB_CYCLE_CHANGE_CAP_MASK BIT(0)
> +#define CXL_MEMDEV_PS_SCRUB_CYCLE_REALTIME_REPORT_CAP_MASK BIT(1)
> +#define CXL_MEMDEV_PS_CUR_SCRUB_CYCLE_MASK GENMASK(7, 0)
> +#define CXL_MEMDEV_PS_MIN_SCRUB_CYCLE_MASK GENMASK(15, 8)
> +#define CXL_MEMDEV_PS_FLAG_ENABLED_MASK BIT(0)
> +
> +struct cxl_memdev_ps_rd_attrs {
> + u8 scrub_cycle_cap;
> + __le16 scrub_cycle;
> + u8 scrub_flags;
> +} __packed;
> +
> +struct cxl_memdev_ps_wr_attrs {
> + u8 scrub_cycle_hr;
> + u8 scrub_flags;
> +} __packed;
> +

Most of this patch uses "rate" for the scrub cycle in hours, but here the
field is named scrub_cycle_hr. I am not sure "rate" is the right term for
a value that is really a period; "interval" or "cycle" seems more
straightforward to me, and it would be good to use one term consistently.
Others may see it differently.
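
To make the naming point concrete, here is a rough user-space sketch of how
the parameter struct could read if "cycle" were used throughout. The names
ps_params_sketch, scrub_cycle_hrs, min_scrub_cycle_hrs and ps_check_cycle
are hypothetical and do not exist in the patch; the upper bound mirrors the
U8_MAX limit used in cxl_patrol_scrub_read_rate_avail().

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical rename: "cycle" instead of "rate", with the unit in the name. */
struct ps_params_sketch {
	bool enable;
	bool scrub_cycle_changeable;
	uint16_t scrub_cycle_hrs;	/* called "rate" in the patch */
	uint16_t min_scrub_cycle_hrs;	/* called "min_rate" in the patch */
};

/*
 * Return 0 if the requested cycle (in hours) lies between the device
 * minimum and the 8-bit field maximum, -1 otherwise.
 */
static int ps_check_cycle(const struct ps_params_sketch *cur, uint16_t req_hrs)
{
	if (req_hrs < cur->min_scrub_cycle_hrs || req_hrs > UINT8_MAX)
		return -1;

	return 0;
}
```

With the unit in the field name, a check like the one above reads as a
bound on a period in hours rather than on a frequency.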

> +static int cxl_mem_ps_get_attrs(struct device *dev,
> + struct cxl_memdev_ps_params *params)
> +{
> + struct cxl_memdev *cxlmd = to_cxl_memdev(dev);
> + struct cxl_dev_state *cxlds = cxlmd->cxlds;
> + struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlds);
> + size_t rd_data_size = sizeof(struct cxl_memdev_ps_rd_attrs);
> + size_t data_size;
> +
> + if (!mds)
> + return -EFAULT;
> +
> + struct cxl_memdev_ps_rd_attrs *rd_attrs __free(kfree) =
> + kmalloc(rd_data_size, GFP_KERNEL);
> + if (!rd_attrs)
> + return -ENOMEM;
> +
> + data_size = cxl_get_feature(mds, cxl_patrol_scrub_uuid, rd_attrs,
> + rd_data_size, rd_data_size,
> + CXL_GET_FEAT_SEL_CURRENT_VALUE);
> + if (!data_size)
> + return -EIO;
> +
> + params->scrub_cycle_changeable = FIELD_GET(CXL_MEMDEV_PS_SCRUB_CYCLE_CHANGE_CAP_MASK,
> + rd_attrs->scrub_cycle_cap);
> + params->enable = FIELD_GET(CXL_MEMDEV_PS_FLAG_ENABLED_MASK,
> + rd_attrs->scrub_flags);
> + params->rate = FIELD_GET(CXL_MEMDEV_PS_CUR_SCRUB_CYCLE_MASK,
> + rd_attrs->scrub_cycle);
> + params->min_rate = FIELD_GET(CXL_MEMDEV_PS_MIN_SCRUB_CYCLE_MASK,
> + rd_attrs->scrub_cycle);
> +
> + return 0;
> +}
> +
> +static int cxl_mem_ps_set_attrs(struct device *dev, struct cxl_memdev_ps_params *params,
> + enum cxl_scrub_param param_type)
> +{
> + struct cxl_memdev_ps_wr_attrs wr_attrs;
> + struct cxl_memdev *cxlmd = to_cxl_memdev(dev);
> + struct cxl_dev_state *cxlds = cxlmd->cxlds;
> + struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlds);
> + struct cxl_memdev_ps_params rd_params;
> + int ret;
> +
> + ret = cxl_mem_ps_get_attrs(dev, &rd_params);
> + if (ret) {
> + dev_err(dev, "Get cxlmemdev patrol scrub params failed ret=%d\n",
> + ret);
> + return ret;
> + }
> +
> + switch (param_type) {
> + case cxl_ps_param_enable:
> + wr_attrs.scrub_flags = FIELD_PREP(CXL_MEMDEV_PS_FLAG_ENABLED_MASK,
> + params->enable);
> + wr_attrs.scrub_cycle_hr = FIELD_PREP(CXL_MEMDEV_PS_CUR_SCRUB_CYCLE_MASK,
> + rd_params.rate);
> + break;
> + case cxl_ps_param_rate:
> + if (params->rate < rd_params.min_rate) {
> + dev_err(dev, "Invalid CXL patrol scrub cycle(%d) to set\n",
> + params->rate);
> + dev_err(dev, "Minimum supported CXL patrol scrub cycle in hour %d\n",
> + params->min_rate);
> + return -EINVAL;
> + }
> + wr_attrs.scrub_cycle_hr = FIELD_PREP(CXL_MEMDEV_PS_CUR_SCRUB_CYCLE_MASK,
> + params->rate);
> + wr_attrs.scrub_flags = FIELD_PREP(CXL_MEMDEV_PS_FLAG_ENABLED_MASK,
> + rd_params.enable);
> + break;
> + }
> +
> + ret = cxl_set_feature(mds, cxl_patrol_scrub_uuid, CXL_MEMDEV_PS_SET_FEAT_VERSION,
> + &wr_attrs, sizeof(wr_attrs),
> + CXL_SET_FEAT_FLAG_DATA_SAVED_ACROSS_RESET);
> + if (ret)
> + dev_err(dev, "CXL patrol scrub set feature failed ret=%d\n",
> + ret);
> +
> + return ret;
> +}
> +
> +static int cxl_patrol_scrub_get_enabled_bg(struct device *dev, bool *enabled)
> +{
> + struct cxl_memdev_ps_params params;
> + int ret;
> +
> + ret = cxl_mem_ps_get_attrs(dev->parent, &params);
> + if (ret)
> + return ret;
> +
> + *enabled = params.enable;
> +
> + return 0;
> +}
> +
> +static int cxl_patrol_scrub_set_enabled_bg(struct device *dev, bool enable)
> +{
> + struct cxl_memdev_ps_params params = {
> + .enable = enable,
> + };
> +
> + return cxl_mem_ps_set_attrs(dev->parent, &params, cxl_ps_param_enable);
> +}
> +
> +static int cxl_patrol_scrub_get_name(struct device *dev, char *name)
> +{
> + struct cxl_memdev *cxlmd = to_cxl_memdev(dev->parent);
> +
> + return sysfs_emit(name, "%s_%s\n", "cxl_patrol_scrub",
> + dev_name(&cxlmd->dev));
> +}
> +
> +static int cxl_patrol_scrub_write_rate(struct device *dev, u64 rate)
> +{
> + struct cxl_memdev_ps_params params = {
> + .rate = rate,
> + };
> +
> + return cxl_mem_ps_set_attrs(dev->parent, &params, cxl_ps_param_rate);
> +}
> +
> +static int cxl_patrol_scrub_read_rate(struct device *dev, u64 *rate)
> +{
> + struct cxl_memdev_ps_params params;
> + int ret;
> +
> + ret = cxl_mem_ps_get_attrs(dev->parent, &params);
> + if (ret)
> + return ret;
> +
> + *rate = params.rate;
> +
> + return 0;
> +}
> +
> +static int cxl_patrol_scrub_read_rate_avail(struct device *dev, u64 *min, u64 *max)
> +{
> + struct cxl_memdev_ps_params params;
> + int ret;
> +
> + ret = cxl_mem_ps_get_attrs(dev->parent, &params);
> + if (ret)
> + return ret;
> + *min = params.min_rate;
> + *max = U8_MAX; /* Max set by register size */
> +
> + return 0;
> +}
> +
> +static const struct scrub_ops cxl_ps_scrub_ops = {
> + .get_enabled_bg = cxl_patrol_scrub_get_enabled_bg,
> + .set_enabled_bg = cxl_patrol_scrub_set_enabled_bg,
> + .get_name = cxl_patrol_scrub_get_name,
> + .rate_read = cxl_patrol_scrub_read_rate,
> + .rate_write = cxl_patrol_scrub_write_rate,
> + .rate_avail_range = cxl_patrol_scrub_read_rate_avail,
> +};
> +
> +int cxl_mem_patrol_scrub_init(struct cxl_memdev *cxlmd)
> +{
> + struct cxl_patrol_scrub_context *cxl_ps_ctx;
> + struct cxl_mbox_supp_feat_entry feat_entry;
> + struct cxl_memdev_ps_params params;
> + struct device *cxl_scrub_dev;
> + int ret;
> +
> + ret = cxl_mem_get_supported_feature_entry(cxlmd, &cxl_patrol_scrub_uuid,
> + &feat_entry);
> + if (ret < 0)
> + return ret;
> +
> + if (!(feat_entry.attr_flags & CXL_FEAT_ENTRY_FLAG_CHANGABLE))
> + return -EOPNOTSUPP;
> +
> + ret = cxl_mem_ps_get_attrs(&cxlmd->dev, &params);
> + if (ret)
> + return dev_err_probe(&cxlmd->dev, ret,
> + "Get CXL patrol scrub params failed\n");
> +
> + cxl_ps_ctx = devm_kzalloc(&cxlmd->dev, sizeof(*cxl_ps_ctx), GFP_KERNEL);
> + if (!cxl_ps_ctx)
> + return -ENOMEM;
> +
> + *cxl_ps_ctx = (struct cxl_patrol_scrub_context) {
> + .get_feat_size = feat_entry.get_size,
> + .set_feat_size = feat_entry.set_size,
> + .scrub_cycle_changeable = params.scrub_cycle_changeable,
> + };
> +
> + cxl_scrub_dev = devm_scrub_device_register(&cxlmd->dev, cxl_ps_ctx,
> + &cxl_ps_scrub_ops);
> + if (IS_ERR(cxl_scrub_dev))
> + return PTR_ERR(cxl_scrub_dev);
> +
> + return 0;
> +}
> +EXPORT_SYMBOL_NS_GPL(cxl_mem_patrol_scrub_init, CXL);
> diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> index 1c50a3e2eced..f95e39febd73 100644
> --- a/drivers/cxl/cxlmem.h
> +++ b/drivers/cxl/cxlmem.h
> @@ -956,6 +956,14 @@ int cxl_trigger_poison_list(struct cxl_memdev *cxlmd);
> int cxl_inject_poison(struct cxl_memdev *cxlmd, u64 dpa);
> int cxl_clear_poison(struct cxl_memdev *cxlmd, u64 dpa);
>
> +/* cxl memory scrub functions */
> +#ifdef CONFIG_CXL_SCRUB
> +int cxl_mem_patrol_scrub_init(struct cxl_memdev *cxlmd);
> +#else
> +static inline int cxl_mem_patrol_scrub_init(struct cxl_memdev *cxlmd)
> +{ return 0; }
> +#endif
> +
> #ifdef CONFIG_CXL_SUSPEND
> void cxl_mem_active_inc(void);
> void cxl_mem_active_dec(void);
> diff --git a/drivers/cxl/mem.c b/drivers/cxl/mem.c
> index 0c79d9ce877c..399e43463626 100644
> --- a/drivers/cxl/mem.c
> +++ b/drivers/cxl/mem.c
> @@ -117,6 +117,12 @@ static int cxl_mem_probe(struct device *dev)
> if (!cxlds->media_ready)
> return -EBUSY;
>
> + rc = cxl_mem_patrol_scrub_init(cxlmd);
> + if (rc) {
> + dev_dbg(&cxlmd->dev, "CXL patrol scrub init failed\n");
> + return rc;
> + }

If the device does not support the patrol scrub feature, the function
above returns -EOPNOTSUPP and cxl_mem_probe() fails. Since the feature is
optional, should we just log a warning in that case and let the probe
continue?
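
Something along these lines is what I have in mind. This is only a
user-space mock to show the control flow: scrub_init_stub() stands in for
cxl_mem_patrol_scrub_init(), probe_sketch() stands in for the relevant part
of cxl_mem_probe(), and both names are made up for this example. In the
driver the message would go through dev_dbg() or dev_warn() instead of
fprintf().

```c
#include <errno.h>
#include <stdio.h>

/* Stand-in for cxl_mem_patrol_scrub_init(); 'rc' fakes its return value. */
static int scrub_init_stub(int rc)
{
	return rc;
}

/* Mock of the probe step: tolerate a missing optional feature only. */
static int probe_sketch(int scrub_rc)
{
	int rc = scrub_init_stub(scrub_rc);

	if (rc == -EOPNOTSUPP) {
		/* Feature is optional: report it and keep probing. */
		fprintf(stderr, "patrol scrub not supported, skipping\n");
	} else if (rc) {
		/* Any other error is still treated as a probe failure. */
		return rc;
	}

	return 0;
}
```

Only -EOPNOTSUPP is tolerated here; errors such as -ENOMEM or -EIO would
still fail the probe, which I think is the behaviour we want.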

Fan
> +
> /*
> * Someone is trying to reattach this device after it lost its port
> * connection (an endpoint port previously registered by this memdev was
> --
> 2.34.1
>